
Astro Private Cloud (APC) provides centralized logging through Vector and Elasticsearch. Task logs, platform logs, and audit logs are collected by Vector and indexed in Elasticsearch for searchability and troubleshooting.

Architecture

Vector runs as a DaemonSet collecting logs from all pods. Logs are shipped to Elasticsearch for storage and indexing. For log visualization, you can connect your own tools (Kibana, Grafana, OpenSearch Dashboards, etc.) to query Elasticsearch.

Accessing logs

Airflow UI

Task logs are accessible directly in the Airflow webserver UI:
  1. Navigate to the DAG.
  2. Click a task instance.
  3. Click Log.
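Task logs can also be retrieved programmatically through the Airflow REST API's task-instance log endpoint. The snippet below only builds the request URL; the base URL, DAG run ID, and try number are placeholder assumptions for your deployment:

```python
# Build the Airflow stable REST API URL for a task instance's log.
# The base URL, dag_run_id, and try_number are placeholder assumptions.
from urllib.parse import quote

def task_log_url(base: str, dag_id: str, dag_run_id: str,
                 task_id: str, try_number: int) -> str:
    """Return the /logs endpoint URL for one task try."""
    parts = [quote(p, safe="") for p in (dag_id, dag_run_id, task_id)]
    return (f"{base}/api/v1/dags/{parts[0]}/dagRuns/{parts[1]}"
            f"/taskInstances/{parts[2]}/logs/{try_number}")

url = task_log_url("https://deployments.example.com/airflow",
                   "my_dag", "manual__2026-02-01T00:00:00", "my_task", 1)
print(url)
```

Send the resulting URL with your usual API credentials (for example, via curl with a bearer token).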

Elasticsearch API

Query logs directly via Elasticsearch:
# Search for errors in the last hour
curl -X GET "https://elasticsearch.<base-domain>/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "match": { "log_level": "ERROR" } },
          { "range": { "@timestamp": { "gte": "now-1h" } } }
        ]
      }
    }
  }'

BYO visualization

APC does not include a log visualization UI. Connect your preferred tool to Elasticsearch:
  • Kibana: Deploy separately and point to the Elasticsearch endpoint.
  • Grafana: Use the Elasticsearch data source.
  • OpenSearch Dashboards: Use Elasticsearch API compatibility.
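As a sketch of the Grafana option, a provisioned Elasticsearch data source could look like the following; the URL and index pattern are assumptions to adjust for your environment:

```yaml
# Grafana data source provisioning file (e.g., provisioning/datasources/apc-logs.yaml).
# The url and index pattern below are environment-specific assumptions.
apiVersion: 1
datasources:
  - name: APC Logs
    type: elasticsearch
    access: proxy
    url: https://elasticsearch.<base-domain>
    jsonData:
      index: "vector.*"        # assumed index pattern; match your APC indexes
      timeField: "@timestamp"
```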

Vector configuration

Vector is the log collection agent in APC 1.0.

Enable Vector

tags:
  logging: true

global:
  vectorEnabled: true

vector:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "1000m"
      memory: "1Gi"

Custom log parsing

Add custom transforms to parse Airflow log formats:
vector:
  customConfig: |
    [transforms.parse_airflow]
    type = "remap"
    inputs = ["kubernetes_logs"]
    source = '''
    .parsed = parse_regex!(.message, r'^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$')
    '''
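You can sanity-check that pattern locally before shipping it. The Python snippet below applies the same regex to a sample Airflow log line (the sample line itself is illustrative):

```python
# Verify the Airflow log-parsing regex from the Vector transform above.
# The sample log line is illustrative; real lines come from your task logs.
import re

PATTERN = re.compile(
    r"^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$"
)

line = "[2026-02-01T12:00:00+00:00] {taskinstance.py:1138} INFO - Starting attempt 1 of 1"
match = PATTERN.match(line)
assert match is not None
print(match.group("level"), "-", match.group("message"))
```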

Logging sidecar

APC supports either DaemonSet or sidecar logging on a data plane cluster, but not both simultaneously. To use sidecar logging, you must first disable the Vector DaemonSet, then enable the sidecar:
global:
  vectorEnabled: false
  loggingSidecar:
    enabled: true
    name: sidecar-log-consumer
    repository: quay.io/astronomer/ap-vector
    tag: 0.52.0
    resources:
      requests:
        cpu: "100m"
        memory: "386Mi"

Elasticsearch configuration

Enable Elasticsearch

tags:
  logging: true

elasticsearch:
  common:
    persistence:
      enabled: true

  client:
    replicas: 2
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"

  data:
    replicas: 3
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
    persistence:
      size: "100Gi"

  master:
    replicas: 3
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"

Index lifecycle management

Configure log retention:
elasticsearch:
  indexLifecycleManagement:
    enabled: true
    policies:
      - name: airflow-logs
        phases:
          hot:
            actions:
              rollover:
                max_size: 50gb
                max_age: 7d
          delete:
            min_age: 30d
            actions:
              delete: {}
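For reference, the values above correspond roughly to this native Elasticsearch ILM policy (field names follow the ILM API; whether the chart renders exactly this document is an assumption):

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```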

External logging

Forward to external Elasticsearch

Send logs to your own Elasticsearch cluster:
global:
  customLogging:
    enabled: true
    scheme: https
    host: "elasticsearch.example.com"
    port: "9200"
    secret: "es-credentials"
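The secret named above must exist in the cluster before logs can be forwarded. A minimal sketch follows; the key names are assumptions, so confirm the exact keys your APC version expects:

```yaml
# Kubernetes Secret referenced by global.customLogging.secret.
# The key names below are assumptions; confirm them in the APC reference.
apiVersion: v1
kind: Secret
metadata:
  name: es-credentials
  namespace: astronomer
type: Opaque
stringData:
  username: elastic
  password: <your-password>
```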

Forward to S3

Archive logs to object storage:
vector:
  sinks:
    s3:
      type: "aws_s3"
      inputs: ["kubernetes_logs"]
      bucket: "my-logs-bucket"
      region: "us-west-2"
      compression: "gzip"
      encoding:
        codec: "json"

Forward to external systems

Configure Vector to send to any destination:
vector:
  sinks:
    # Splunk
    splunk:
      type: "splunk_hec"
      inputs: ["kubernetes_logs"]
      endpoint: "https://splunk.example.com:8088"
      token: "${SPLUNK_TOKEN}"

    # Datadog
    datadog:
      type: "datadog_logs"
      inputs: ["kubernetes_logs"]
      default_api_key: "${DATADOG_API_KEY}"

    # Generic HTTP
    http:
      type: "http"
      inputs: ["kubernetes_logs"]
      uri: "https://logs.example.com/v1/logs"
      encoding:
        codec: "json"

Deployment log settings

Task log retention

Configure the log groomer sidecar to manage disk usage:
# In deployment values
scheduler:
  logGroomerSidecar:
    enabled: true
    retentionDays: 15
    frequencyMinutes: 15

workers:
  logGroomerSidecar:
    enabled: true
    retentionDays: 15

Log level configuration

Set Airflow log levels through deployment environment variables:
env:
  - name: AIRFLOW__LOGGING__LOGGING_LEVEL
    value: "INFO"
  - name: AIRFLOW__LOGGING__FAB_LOGGING_LEVEL
    value: "WARNING"

Querying logs

Elasticsearch query examples

Find task failures:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "log_level": "ERROR" } },
        { "match": { "kubernetes.labels.component": "worker" } }
      ]
    }
  }
}
Search for a specific DAG:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "dag_id": "my_dag" } },
        { "match": { "task_id": "my_task" } }
      ]
    }
  }
}
Filter by time range:
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2026-02-01T00:00:00",
        "lt": "2026-02-02T00:00:00"
      }
    }
  }
}

Common log fields

Field                          Description
-----                          -----------
kubernetes.namespace_name      Deployment namespace
kubernetes.labels.component    Component (scheduler, worker, etc.)
kubernetes.pod_name            Pod name
dag_id                         DAG identifier
task_id                        Task identifier
log_level                      DEBUG, INFO, WARNING, ERROR
@timestamp                     Log timestamp

Troubleshooting

Logs aren’t appearing

  1. Check Vector is running:
    kubectl get pods -n astronomer -l app=vector
    
  2. Check Elasticsearch health:
    kubectl exec -n astronomer elasticsearch-0 -- \
      curl -s localhost:9200/_cluster/health
    
  3. Verify Vector logs:
    kubectl logs -n astronomer -l app=vector --tail=100
    

High disk usage

  1. Enable index lifecycle management.
  2. Reduce retention period.
  3. Increase Elasticsearch storage.
  4. Forward logs to external storage (S3).

Slow queries

  1. Add index patterns for common searches.
  2. Increase Elasticsearch resources.
  3. Reduce log verbosity.

Security

Access control

Elasticsearch access is restricted to platform components. For external access, configure authentication:
elasticsearch:
  auth:
    enabled: true
    secretName: "es-credentials"

Log redaction

Redact sensitive data before indexing:
vector:
  transforms:
    redact:
      type: "remap"
      inputs: ["kubernetes_logs"]
      source: |
        .message = replace(.message, r'password=\S+', "password=***")
        .message = replace(.message, r'api_key=\S+', "api_key=***")
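As with the parsing transform, the redaction patterns can be checked locally. This Python snippet applies equivalent substitutions to a sample message (the sample is illustrative):

```python
# Verify the redaction patterns from the Vector transform above.
# The sample message is illustrative.
import re

def redact(message: str) -> str:
    """Mask credential-like tokens before a log line is indexed."""
    message = re.sub(r"password=\S+", "password=***", message)
    message = re.sub(r"api_key=\S+", "api_key=***", message)
    return message

print(redact("connecting with password=s3cret and api_key=abc123"))
```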

Best practices

  • Set appropriate retention based on compliance requirements.
  • Use log levels wisely; avoid DEBUG in production.
  • Enable log groomer to prevent disk exhaustion on Airflow pods.
  • Forward logs externally for long-term retention and compliance.
  • Monitor Elasticsearch health and disk usage.
  • Use your preferred visualization tool; deploy Kibana, Grafana, or other tools separately.