Show Remote Execution Agent task logs in Airflow UI

You can display task logs in the Airflow UI by exporting logs to object storage and configuring the Astro API Server to retrieve them. Start by enabling log display after task completion, then optionally extend the setup to stream logs in real time as tasks run. This guide explains configuring post-task log display and expanding that configuration to support real-time log streaming.

Displaying task logs after task completion

Set up log uploading so logs are visible in the Airflow UI after task completion. This requires:

Remote Execution Agent configuration (values.yaml)
Astro UI Deployment configuration
Workload identities: write access for the Remote Execution Agent, read access for the Astro API Server

AWS
GCP
Azure

The Astro Orchestration Plane provides secure private connectivity with a pre-configured S3 Gateway Endpoint.

Configure the following environment variables in the Helm chart’s values.yaml, and replace the path for the AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER value with your information:

commonEnv:
  - name: AIRFLOW__LOGGING__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
    value: "astro_aws_logging"
  - name: AIRFLOW_CONN_ASTRO_AWS_LOGGING
    value: "s3://"
  - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
    value: "s3://<bucket>/<deployment-id>"
  - name: AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS
    value: "astronomer.runtime.logging.logging_config"
  - name: ASTRONOMER_ENVIRONMENT
    value: "cloud"

Mounting credentials manuallyIf you do not use workload identity and instead want to manually mount a credential, you must also add the following environment variable defining the location of a token file to your Remote Agent’s values.yaml file. You can customize the file path, /tmp/logging-token, to the name of your logging token file.

  - name: ASTRO_LOGGING_AWS_WEB_IDENTITY_TOKEN_FILE
    value: "/tmp/logging-token"

Run helm upgrade to apply the change to your Agents.
In the Astro UI, navigate to your Deployment and click the Details tab. Click Edit in the Advanced section to access your logging configurations.
Select Bucket Storage in the Task Logs field and fill in the Bucket URL as s3://<bucket>/<deployment-id>. Or, use the path that you configured for AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER in your Remote Agent’s Helm chart’s values.yaml.
In the Workload Identity for Bucket Storage section, select Customer Managed Identity and follow the instructions to set up your Customer Managed Identity so that the identity you create has read access to the specified bucket and path. The Customer Managed Identity must have s3:GetObject and s3:ListBucket permissions on the S3 bucket. Additionally, ensure that no ACLs on the bucket restrict those actions.

Default Identity is not currently supported for Task Logs Bucket Storage on AWS. You must use Customer Managed Identity.

(Optional) If your log bucket is in a different region from your Astro Deployment, you need to define the AWS region in the AIRFLOW__ASTRONOMER_PROVIDERS_LOGGING__AWS_REGION environment variable for Astronomer-managed components. In the Astro UI, navigate to your Deployment and click the Environment tab. Click Environment Variables, then click (+) Environment Variable to add the following environment variables to your Deployment:

AIRFLOW__ASTRONOMER_PROVIDERS_LOGGING__AWS_REGION <The region in which the S3 bucket is configured>

The Astro Orchestration Plane provides secure private connectivity with a pre-configured Private Service Connect endpoint to GCP Cloud Storage.

Configure the following environment variables in the Helm chart’s values.yaml:

commonEnv:
  - name: AIRFLOW__LOGGING__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
    value: "astro_gcs_logging"
  - name: AIRFLOW_CONN_ASTRO_GCS_LOGGING
    value: "gcs://"
  - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
    value: "gs://<bucket>/<deployment-id>"
  - name: AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS
    value: "astronomer.runtime.logging.logging_config"
  - name: ASTRONOMER_ENVIRONMENT
    value: "cloud"

The path for theAIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDERvalue is configurable. This is only an example format.

Do a helm upgrade to apply the change to your Agents.
In the Astro UI, navigate to your Deployment and click the Details tab. Click Edit in the Advanced section.
In the Task Logs field, select Bucket Storage and fill in the Bucket URL as gs://<bucket>/<deployment-id>. Or, use the path that you configured for AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER in your Remote Agent’s Helm chart’s values.yaml.
In the Workload Identity for Bucket Storage section, select Customer Managed Identity and follow the instructions to set up your Customer Managed Identity so that the identity you create has read access to the specified bucket and path.

On GCP, you can authorize your Deployment to cloud resources using workload identity with the alternative methods describedhere.

You must configure a connection from your Astro Orchestration Plane to your storage account for the Astro API server to read task logs. You can use either a private connection or a public connection.

Private Connectivity: Contact Astronomer support to establish a Private Link connection.
Public Connectivity: Enable Enabled from all networks in the networking settings of your storage account.

For security, Astronomer recommends you assign the Storage Blob Data Reader role to the workload identity used by the API server and apply the following settings for your storage account:

Secure transfer required: Enabled
Allow Blob anonymous access: Disabled
Minimum TLS version: TLS 1.2 or higher

Authenticate to your storage account with an Azure Managed Identity, or the alternative method that uses a Storage Account Access Key.

Azure Managed Identity (Recommended)

If you write task logs using an Azure Managed Identity, configure the following environment variables in the Helm chart’s values.yaml:

 commonEnv:
  - name: AIRFLOW__LOGGING__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
    value: "astro_azure_logs_override"
  - name: AIRFLOW_CONN_ASTRO_AZURE_LOGS_OVERRIDE
    value: "wasb://<storage-account-name>@<storage-account-name>.blob.core.windows.net"
  - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
    value: "wasb-<storage-account-name>"
  - name: AIRFLOW__AZURE_REMOTE_LOGGING__REMOTE_WASB_LOG_CONTAINER
    value: "<storage-account-container-name>"
  - name: AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS
    value: "astronomer.runtime.logging.logging_config"
  - name: ASTRONOMER_ENVIRONMENT
    value: "cloud"

labels:
  azure.workload.identity/use: "true"

annotations:
  azure.workload.identity/client-id: "<managed-identity-client-id>"

Storage Account Access Key (Alternative)

If writing task logs using an Storage Account Access Key, it is not necessary to configure the labels and annotations in your Helm chart’s values.yaml. It is required to configure the AIRFLOW_CONN_ASTRO_AZURE_LOGS_OVERRIDE environmental variable with the format value: wasb://<storage-account-name>:<access-key>@<storage-account-name>.blob.core.windows.net.

 commonEnv:
  - name: AIRFLOW__LOGGING__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
    value: "astro_azure_logs_override"
  - name: AIRFLOW_CONN_ASTRO_AZURE_LOGS_OVERRIDE
    value: "wasb://<storage-account-name>:<access-key>@<storage-account-name>.blob.core.windows.net"
  - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
    value: "wasb-<storage-account-name>"
  - name: AIRFLOW__AZURE_REMOTE_LOGGING__REMOTE_WASB_LOG_CONTAINER
    value: "<storage-account-container-name>"
  - name: AIRFLOW__LOGGING__LOGGING_CONFIG_CLASS
    value: "astronomer.runtime.logging.logging_config"
  - name: ASTRONOMER_ENVIRONMENT
    value: "cloud"

Run helm upgrade to apply the change to your Agents.
In the Astro UI, navigate to your Deployment and click the Details tab. Click Edit in the Advanced section.
Select Bucket Storage in the Task Logs field and add the Bucket URL as wasb://<container-name>@<storage-account-name>.blob.core.windows.net/wasb-<storage-account-name>.
In the Workload Identity for Bucket Storage section, select Customer Managed Identity and follow the instructions to set up your Customer Managed Identity so that the identity you create has read access to the specified bucket and path.
In the Astro UI, navigate to your Deployment and click the Environment tab. Click Edit Variables, then click Add Variable to add the following environment variables to your Deployment:

ASTRO_LOGGING_AZURE_CLIENT_ID: <Client ID for the managed identity that you want the API server to use to fetch task logs>
ASTRO_LOGGING_AZURE_TENANT_ID: <Tenant ID for the managed identity that you want the API server to use to fetch task logs>

Displaying task logs during task execution

Once you have post-completion log visibility, you can enable real-time log display. Remote Execution prevents the Airflow API server from reading logs directly from workers until they reach object storage. Use Vector, included in the Remote Execution Agent Helm chart, to upload partial logs while tasks are running.

Prerequisites

Before you configure Vector, ensure that your Remote Execution Deployment is already set up to upload task logs to object storage after task completion.

Enable Vector sidecar

Use Vector to watch for log file changes and upload updates to object storage during task execution. In your Helm values.yaml:

Set loggingSidecar.enabled to true:

loggingSidecar:
  enabled: true

Configure loggingSidecar.volumeMounts:

loggingSidecar:
  volumeMounts:
    - name: task-logs
      mountPath: /var/log/airflow/task_logs
      readOnly: true
    - name: vector-data
      mountPath: /var/lib/vector

Configure log upload for your cloud provider

AWS S3
GCP Cloud Storage
Azure Blob Storage

Configure loggingSidecar.config:

loggingSidecar:
  config: |
    # Vector configuration for Astronomer Remote Execution agents for uploading Airflow task logs to AWS S3

    sources:
      airflow_task_logs:
        type: file
        include:
          - /var/log/airflow/task_logs/**/*.log
        read_from: beginning

    transforms:
      strip_path_prefix:
        type: remap
        inputs: [airflow_task_logs]
        source: |
          .log_path, err = replace(.file, "/var/log/airflow/task_logs/", "")
          if err != null {
            abort
          }

    sinks:
      s3:
        # For more Vector AWS S3 configuration options, see https://vector.dev/docs/reference/configuration/sinks/aws_s3
        type: aws_s3
        inputs: [strip_path_prefix]
        bucket: # set bucket name e.g. airflow_logs
        region: # set AWS region e.g. us-east-2
        compression: none
        encoding:
          codec: text
        key_prefix: '{{ "{{" }} log_path {{ "}}" }}.'
        filename_time_format: "%Y-%m-%dT%H-%M-%S"
        filename_append_uuid: false
        batch:
          max_bytes: 1000000  # configure based on your storage costs and log frequency requirements - see Caveats section below
          timeout_secs: 10  # configure based on your storage costs and log frequency requirements - see Caveats section below

AWS authentication with VectorAbove Vector config assumes a managed identity is set up for authentication, as described in Displaying task logs after task completion.If you require a different way to authenticate with AWS, such as static keys, see https://vector.dev/docs/reference/configuration/sinks/aws_s3/#auth for all available options.

Configure loggingSidecar.config:

loggingSidecar:
  config: |
    # Vector configuration for Astronomer Remote Execution agents for uploading Airflow task logs to GCS

    sources:
      airflow_task_logs:
        type: file
        include:
          - /var/log/airflow/task_logs/**/*.log
        read_from: beginning

    transforms:
      strip_path_prefix:
        type: remap
        inputs: [airflow_task_logs]
        source: |
          .log_path, err = replace(.file, "/var/log/airflow/task_logs/", "")
          if err != null {
            abort
          }

    sinks:
      gcs:
        # For more Vector GCP Cloud Storage configuration options, see https://vector.dev/docs/reference/configuration/sinks/gcp_cloud_storage/
        type: gcp_cloud_storage
        inputs: [strip_path_prefix]
        bucket: # set to GCS bucket name
        compression: none
        encoding:
          codec: text
        key_prefix: '{{ "{{" }} log_path {{ "}}" }}.'  # If you place logs at a specific prefix, prepend that prefix here like 'logs/{{ "{{" }} log_path {{ "}}" }}.'
        filename_time_format: "%Y-%m-%dT%H-%M-%S"
        filename_append_uuid: false
        batch:
          max_bytes: 1000000  # configure based on your storage costs and log frequency requirements - see Caveats section below
          timeout_secs: 10  # configure based on your storage costs and log frequency requirements - see Caveats section below

GCP authentication with VectorAbove Vector config assumes Workload Identity is set up for authentication, as described in Displaying task logs after task completion.If you require a different way to authenticate with GCP, such as service account keys, see https://vector.dev/docs/reference/configuration/sinks/gcp_cloud_storage/#auth for all available options.

Configure loggingSidecar.env to provide your Azure Storage connection string as a Kubernetes secret:

loggingSidecar:
  env:
    - name: AZURE_STORAGE_CONNECTION_STRING
      valueFrom:
        secretKeyRef:
          name: azure-blob-logs
          key: AZURE_STORAGE_CONNECTION_STRING

Configure loggingSidecar.config:

loggingSidecar:
  config: |
    # Vector configuration for Astronomer Remote Execution agents for uploading Airflow task logs to Azure Blob Storage

    sources:
      airflow_task_logs:
        type: file
        include:
          - /var/log/airflow/task_logs/**/*.log
        read_from: beginning

    transforms:
      strip_path_prefix:
        type: remap
        inputs: [airflow_task_logs]
        source: |
          .log_path, err = replace(.file, "/var/log/airflow/task_logs/", "")
          if err != null {
            abort
          }

    sinks:
      azure_blob:
        # For more Vector Azure Blob Storage configuration options, see https://vector.dev/docs/reference/configuration/sinks/azure_blob/
        type: azure_blob
        inputs: [strip_path_prefix]
        container_name: # set to Azure Blob Storage container name
        connection_string: "${AZURE_STORAGE_CONNECTION_STRING}"
        compression: none
        encoding:
          codec: text
        blob_prefix: '{{ "{{" }} log_path {{ "}}" }}.'  # If you configured a specific log path prefix, prepend that prefix here like 'wasb-<storage-account-name>/{{ "{{" }} log_path {{ "}}" }}.'
        blob_time_format: "%Y-%m-%dT%H-%M-%S"
        blob_append_uuid: false
        batch:
          max_bytes: 1000000  # configure based on your storage costs and log frequency requirements - see Caveats section below
          timeout_secs: 10  # configure based on your storage costs and log frequency requirements - see Caveats section below

Vector does not currently support Azure Managed Identity for the Azure Blob sink. You must authenticate using a Storage Account connection string. See the Vector Azure Blob sink documentation for all available authentication options.

Developing Vector Remap Language (VRL)Vector expressions are written in Vector Remap Language (VRL). If you want to edit an expression in the Vector config, this online VRL playground is a useful debugging tool.

Debugging VectorIf you’re having issues uploading logs, you can enable debug logging for the Vector sidecar by adding this to the sink configuration (so you’ll have 2 sinks, e.g. an s3 sink, and a debug sink):

debug:
  type: console
  inputs: [strip_path_prefix]
  encoding:
    codec: json

With this second sink, Vector will display debug logs on the console, accessible with kubectl logs [worker pod] -c vector-logging-sidecar.

Configure workers[*].volumes:

volumes:
  - name: task-logs
    emptyDir: {}
  - name: vector-data
    emptyDir: {}

Configure workers[*].volumeMounts:

volumeMounts:
  - name: task-logs
    mountPath: /usr/local/airflow/logs

Configure triggerer.volumes:

volumes:
  - name: task-logs
    emptyDir: {}
  - name: vector-data
    emptyDir: {}

Configure triggerer.volumeMounts:

volumeMounts:
  - name: task-logs
    mountPath: /usr/local/airflow/logs

Set AIRFLOW__LOGGING__DELETE_LOCAL_LOGS in commonEnv:

commonEnv:
  - name: AIRFLOW__LOGGING__DELETE_LOCAL_LOGS
    value: "True"

Log upload process

Partial logs are uploaded and displayed as follows:

Airflow worker or triggerer writes local task log files, as set by AIRFLOW__LOGGING__LOG_FILENAME_TEMPLATE.
Vector watches /var/log/airflow/task_logs/**/*.log and uploads log changes in chunks while the task runs.
Vector appends a timestamp to the file name before uploading each chunk.
Airflow scans object storage for log chunks when displaying the UI log view.
The UI displays all log content to the user.

Version compatibilityUsing Vector to upload logs assumes Airflow’s logging format is compatible. Significant changes to Airflow logging may require reconfiguration.

Caveats

Duplicate log storage

After task completion, Airflow uploads the complete log to object storage and deletes the local copy. This causes duplication:

Partial logs from Vector
Complete log from Airflow

The Airflow API server deduplicates log lines by timestamp and message. Only storage usage is affected; logs are displayed once.

Small file problem

High-frequency, small log file uploads can create many small objects. This may increase storage costs, load on object storage, or trigger rate limits. Adjust file size and upload frequency in your Vector config to balance performance and cost.

AWS bills object retrieval at a 128KB minimum on certain storage classes (source).
A large number of small objects means more object requests (PUTs, GETs, LISTs) and more load on metadata/indexing; this can result in rate limits or latency issues.

Ensure a proper balance between filesize/timeout and log upload frequency in your Vector config.

Overview

Get Started

Develop

Manage Configuration

Run Workloads

Deploy And Automate

Documentation

Remote Execution

Secrets Backend

CI/CD Templates

Observe And Alert

Administration

Infrastructure

Reference And Support

Best Practices

Airflow 3

Show Remote Execution Agent task logs in Airflow UI

Displaying task logs after task completion

Displaying task logs during task execution

Prerequisites

Enable Vector sidecar

Configure log upload for your cloud provider

Log upload process

Caveats

Duplicate log storage

Small file problem

Overview

Get Started

Develop

Manage Configuration

Run Workloads

Deploy And Automate

Documentation

Remote Execution

Secrets Backend

CI/CD Templates

Observe And Alert

Administration

Infrastructure

Reference And Support

Best Practices

Airflow 3

Documentation Index

​Displaying task logs after task completion

​Displaying task logs during task execution

​Prerequisites

​Enable Vector sidecar

​Configure log upload for your cloud provider

​Log upload process

​Caveats

​Duplicate log storage

​Small file problem

Displaying task logs after task completion

Displaying task logs during task execution

Prerequisites

Enable Vector sidecar

Configure log upload for your cloud provider

Log upload process

Caveats

Duplicate log storage

Small file problem