
Astro Private Cloud (APC) provides centralized logging through Vector and Elasticsearch. Task logs, platform logs, and audit logs are collected by Vector and indexed in Elasticsearch for searchability and troubleshooting.

Architecture

Vector runs as a DaemonSet collecting logs from all pods. Logs are shipped to Elasticsearch for storage and indexing. For log visualization, you can connect your own tools (Kibana, Grafana, OpenSearch Dashboards, etc.) to query Elasticsearch.

Accessing logs

Airflow UI

Task logs are accessible directly in the Airflow webserver UI:
  1. Navigate to the DAG.
  2. Click a task instance.
  3. Click Log.
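Task logs can also be retrieved programmatically through the Airflow REST API's task-instance log endpoint. The snippet below only builds the request URL; the base URL, DAG run ID, and try number are placeholder assumptions for your deployment:

```python
# Build the Airflow stable REST API URL for a task instance's log.
# The base URL, dag_run_id, and try_number are placeholder assumptions.
from urllib.parse import quote

def task_log_url(base: str, dag_id: str, dag_run_id: str,
                 task_id: str, try_number: int) -> str:
    """Return the /logs endpoint URL for one task try."""
    parts = [quote(p, safe="") for p in (dag_id, dag_run_id, task_id)]
    return (f"{base}/api/v1/dags/{parts[0]}/dagRuns/{parts[1]}"
            f"/taskInstances/{parts[2]}/logs/{try_number}")

url = task_log_url("https://deployments.example.com/airflow",
                   "my_dag", "manual__2026-02-01T00:00:00", "my_task", 1)
print(url)
```

Send the resulting URL with your usual API credentials (for example, via curl with a bearer token).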

Elasticsearch API

Query logs directly via Elasticsearch:
# Search for errors in the last hour
curl -X GET "https://elasticsearch.<base-domain>/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "match": { "log_level": "ERROR" } },
          { "range": { "@timestamp": { "gte": "now-1h" } } }
        ]
      }
    }
  }'

BYO visualization

APC does not include a log visualization UI. Connect your preferred tool to Elasticsearch:
  • Kibana: Deploy separately and point to the Elasticsearch endpoint.
  • Grafana: Use the Elasticsearch data source.
  • OpenSearch Dashboards: Use Elasticsearch API compatibility.
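As a sketch of the Grafana option, a provisioned Elasticsearch data source could look like the following; the URL and index pattern are assumptions to adjust for your environment:

```yaml
# Grafana data source provisioning file (e.g., provisioning/datasources/apc-logs.yaml).
# The url and index pattern below are environment-specific assumptions.
apiVersion: 1
datasources:
  - name: APC Logs
    type: elasticsearch
    access: proxy
    url: https://elasticsearch.<base-domain>
    jsonData:
      index: "vector.*"        # assumed index pattern; match your APC indexes
      timeField: "@timestamp"
```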

Vector configuration

Vector is the log collection agent in APC 1.0.

Enable Vector

tags:
  logging: true

global:
  vectorEnabled: true

vector:
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "1000m"
      memory: "1Gi"

Custom log parsing

Add custom transforms to parse Airflow log formats:
vector:
  customConfig: |
    [transforms.parse_airflow]
    type = "remap"
    inputs = ["kubernetes_logs"]
    source = '''
    .parsed = parse_regex!(.message, r'^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$')
    '''
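You can sanity-check that pattern locally before shipping it. The Python snippet below applies the same regex to a sample Airflow log line (the sample line itself is illustrative):

```python
# Verify the Airflow log-parsing regex from the Vector transform above.
# The sample log line is illustrative; real lines come from your task logs.
import re

PATTERN = re.compile(
    r"^\[(?P<timestamp>.+)\] \{(?P<logger>.+)\} (?P<level>\w+) - (?P<message>.+)$"
)

line = "[2026-02-01T12:00:00+00:00] {taskinstance.py:1138} INFO - Starting attempt 1 of 1"
match = PATTERN.match(line)
assert match is not None
print(match.group("level"), "-", match.group("message"))
```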

Logging sidecar

APC supports either DaemonSet or sidecar logging on a data plane cluster, but not both simultaneously. To use sidecar logging, you must first disable the Vector DaemonSet, then enable the sidecar:
global:
  vectorEnabled: false
  loggingSidecar:
    enabled: true
    name: sidecar-log-consumer
    repository: quay.io/astronomer/ap-vector
    tag: 0.52.0
    resources:
      requests:
        cpu: "100m"
        memory: "386Mi"

Elasticsearch configuration

Enable Elasticsearch

tags:
  logging: true

elasticsearch:
  common:
    persistence:
      enabled: true

  client:
    replicas: 2
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"

  data:
    replicas: 3
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
    persistence:
      size: "100Gi"

  master:
    replicas: 3
    heapMemory: "2g"
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"

Index lifecycle management

Configure log retention:
elasticsearch:
  indexLifecycleManagement:
    enabled: true
    policies:
      - name: airflow-logs
        phases:
          hot:
            actions:
              rollover:
                max_size: 50gb
                max_age: 7d
          delete:
            min_age: 30d
            actions:
              delete: {}
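For reference, the values above correspond roughly to this native Elasticsearch ILM policy (field names follow the ILM API; whether the chart renders exactly this document is an assumption):

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```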

External logging

Forward to external Elasticsearch

Send logs to your own Elasticsearch cluster:
global:
  customLogging:
    enabled: true
    scheme: https
    host: "elasticsearch.example.com"
    port: "9200"
    secret: "es-credentials"
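The secret named above must exist in the cluster before logs can be forwarded. A minimal sketch follows; the key names are assumptions, so confirm the exact keys your APC version expects:

```yaml
# Kubernetes Secret referenced by global.customLogging.secret.
# The key names below are assumptions; confirm them in the APC reference.
apiVersion: v1
kind: Secret
metadata:
  name: es-credentials
  namespace: astronomer
type: Opaque
stringData:
  username: elastic
  password: <your-password>
```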

Forward to S3

Archive logs to object storage:
vector:
  sinks:
    s3:
      type: "aws_s3"
      inputs: ["kubernetes_logs"]
      bucket: "my-logs-bucket"
      region: "us-west-2"
      compression: "gzip"
      encoding:
        codec: "json"

Forward to external systems

Configure Vector to send to any destination:
vector:
  sinks:
    # Splunk
    splunk:
      type: "splunk_hec"
      inputs: ["kubernetes_logs"]
      endpoint: "https://splunk.example.com:8088"
      token: "${SPLUNK_TOKEN}"

    # Datadog
    datadog:
      type: "datadog_logs"
      inputs: ["kubernetes_logs"]
      default_api_key: "${DATADOG_API_KEY}"

    # Generic HTTP
    http:
      type: "http"
      inputs: ["kubernetes_logs"]
      uri: "https://logs.example.com/v1/logs"
      encoding:
        codec: "json"

Deployment log settings

Task log retention

Configure the log groomer sidecar to manage disk usage:
# In deployment values
scheduler:
  logGroomerSidecar:
    enabled: true
    retentionDays: 15
    frequencyMinutes: 15

workers:
  logGroomerSidecar:
    enabled: true
    retentionDays: 15

Log level configuration

Set Airflow log levels through deployment environment variables:
env:
  - name: AIRFLOW__LOGGING__LOGGING_LEVEL
    value: "INFO"
  - name: AIRFLOW__LOGGING__FAB_LOGGING_LEVEL
    value: "WARNING"

Querying logs

Elasticsearch query examples

Find task failures:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "log_level": "ERROR" } },
        { "match": { "kubernetes.labels.component": "worker" } }
      ]
    }
  }
}
Search for a specific DAG:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "dag_id": "my_dag" } },
        { "match": { "task_id": "my_task" } }
      ]
    }
  }
}
Filter by time range:
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2026-02-01T00:00:00",
        "lt": "2026-02-02T00:00:00"
      }
    }
  }
}

Common log fields

Field                          Description
-----                          -----------
kubernetes.namespace_name      Deployment namespace
kubernetes.labels.component    Component (scheduler, worker, etc.)
kubernetes.pod_name            Pod name
dag_id                         DAG identifier
task_id                        Task identifier
log_level                      DEBUG, INFO, WARNING, ERROR
@timestamp                     Log timestamp

Troubleshooting

Logs aren’t appearing

  1. Check Vector is running:
    kubectl get pods -n astronomer -l app=vector
    
  2. Check Elasticsearch health:
    kubectl exec -n astronomer elasticsearch-0 -- \
      curl -s localhost:9200/_cluster/health
    
  3. Verify Vector logs:
    kubectl logs -n astronomer -l app=vector --tail=100
    

High disk usage

  1. Enable index lifecycle management.
  2. Reduce retention period.
  3. Increase Elasticsearch storage.
  4. Forward logs to external storage (S3).

Slow queries

  1. Add index patterns for common searches.
  2. Increase Elasticsearch resources.
  3. Reduce log verbosity.

Security

Access control

Elasticsearch access is restricted to platform components. For external access, configure authentication:
elasticsearch:
  auth:
    enabled: true
    secretName: "es-credentials"

Log redaction

Redact sensitive data before indexing:
vector:
  transforms:
    redact:
      type: "remap"
      inputs: ["kubernetes_logs"]
      source: |
        .message = replace(.message, r'password=\S+', "password=***")
        .message = replace(.message, r'api_key=\S+', "api_key=***")
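As with the parsing transform, the redaction patterns can be checked locally. This Python snippet applies equivalent substitutions to a sample message (the sample is illustrative):

```python
# Verify the redaction patterns from the Vector transform above.
# The sample message is illustrative.
import re

def redact(message: str) -> str:
    """Mask credential-like tokens before a log line is indexed."""
    message = re.sub(r"password=\S+", "password=***", message)
    message = re.sub(r"api_key=\S+", "api_key=***", message)
    return message

print(redact("connecting with password=s3cret and api_key=abc123"))
```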

Best practices

  • Set appropriate retention based on compliance requirements.
  • Use log levels wisely; avoid DEBUG in production.
  • Enable log groomer to prevent disk exhaustion on Airflow pods.
  • Forward logs externally for long-term retention and compliance.
  • Monitor Elasticsearch health and disk usage.
  • Use your preferred visualization tool; deploy Kibana, Grafana, or other tools separately.