
Remote Execution Agents require configuration to access your DAG code. This guide covers configuring DAG bundles, which are collections of DAG files and supporting code introduced in Airflow 3.
This feature requires Airflow 3.x Deployments. Configuring multiple DAG bundles in a single Deployment is only supported in Remote Execution mode.

DAG bundle types

Choose between two types of DAG bundles:
  • GitDagBundle: DAGs stored in a Git repository (recommended for production)
  • LocalDagBundle: DAGs stored in the container image or a persistent volume (default)

When to use each bundle type

Use GitDagBundle when:
  • Running production deployments
  • Tracking DAG versions with full rerun capabilities
  • Storing DAGs in a version control system
  • Managing multiple teams or DAG repositories
Use LocalDagBundle when:
  • Running development or testing environments
  • Building DAGs into container images
  • Using existing PVC-based DAG management
  • Preferring simpler configuration
See GitDagBundle compared to LocalDagBundle for functional differences.

Configure GitDagBundle

GitDagBundle fetches DAGs from Git repositories and provides automatic versioning capabilities.
GitDagBundle is recommended for production Remote Execution deployments.

Supported authentication methods

GitDagBundle supports the following authentication methods:
  • Access tokens (personal access tokens, OAuth tokens)
  • SSH keys
  • SSH agent
Choose the method that aligns with your security requirements and infrastructure.

Configure public repository

For public repositories, no authentication configuration is required. Configure only the repository URL and tracking reference:
dagBundleConfigList: '[{"name": "public-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/your-org/public-dags", "tracking_ref": "main", "subdir": "dags"}}]'
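Because dagBundleConfigList is a JSON string embedded in YAML, a malformed value only surfaces at runtime. A quick local sanity check can catch this before a Helm upgrade; the helper below is a hypothetical sketch, not part of any Astronomer or Airflow tooling:

```python
import json

def validate_bundle_config(raw: str) -> list[dict]:
    """Parse a dagBundleConfigList string and check required fields."""
    bundles = json.loads(raw)  # raises ValueError on malformed JSON
    for bundle in bundles:
        # Every bundle entry needs a name, a classpath, and kwargs.
        for key in ("name", "classpath", "kwargs"):
            if key not in bundle:
                raise ValueError(f"bundle missing required key: {key}")
        # GitDagBundle additionally needs a repository URL.
        if bundle["classpath"].endswith("GitDagBundle"):
            if "repo_url" not in bundle["kwargs"]:
                raise ValueError(f"{bundle['name']}: GitDagBundle requires repo_url")
    return bundles

raw = ('[{"name": "public-dags", '
       '"classpath": "airflow.providers.git.bundles.git.GitDagBundle", '
       '"kwargs": {"repo_url": "https://github.com/your-org/public-dags", '
       '"tracking_ref": "main", "subdir": "dags"}}]')
bundles = validate_bundle_config(raw)
print(bundles[0]["name"])  # public-dags
```

Run this against the exact string from your values.yaml before applying it.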

Configure private repository

For private repositories, configure both the DAG bundle and an Airflow connection for authentication.
1. Create Git connection

Add an Airflow connection environment variable in values.yaml. The connection name suffix must match the git_conn_id value in your DAG bundle configuration.
commonEnv:
  - name: AIRFLOW_CONN_GIT_REPO
    value: >-
      {
        "conn_type": "git",
        "login": "<git-username>",
        "password": "<personal-access-token>",
        "host": "github.com",
        "schema": "https",
        "extra": {
          "repo": "<your-org>/<private-repo>",
          "branch": "main"
        }
      }
The connection name AIRFLOW_CONN_GIT_REPO creates a connection with ID git_repo. This ID must match the git_conn_id value in your DAG bundle configuration.

For production environments, store connection credentials in a secrets backend instead of values.yaml. See Use a secrets backend for Git credentials for an example using Azure Key Vault.

2. Configure DAG bundle

Configure the DAG bundle with a matching git_conn_id:
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"tracking_ref": "main", "subdir": "dags", "repo_url": "https://github.com/<your-org>/<private-repo>.git", "git_conn_id": "git_repo"}}]'
Note that git_conn_id: "git_repo" matches the connection ID from the AIRFLOW_CONN_GIT_REPO environment variable.
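The mapping follows Airflow's AIRFLOW_CONN_<ID> naming convention: the suffix after the prefix is the connection ID, conventionally written in lowercase. The hypothetical helper below illustrates the rule:

```python
def conn_id_from_env(var_name: str) -> str:
    """Derive the Airflow connection ID from an AIRFLOW_CONN_* variable name.

    Airflow treats the suffix after AIRFLOW_CONN_ as the connection ID;
    connection IDs are conventionally lowercase.
    """
    prefix = "AIRFLOW_CONN_"
    if not var_name.startswith(prefix):
        raise ValueError(f"not a connection variable: {var_name}")
    return var_name[len(prefix):].lower()

print(conn_id_from_env("AIRFLOW_CONN_GIT_REPO"))  # git_repo
```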
3. Update Helm release

Apply the configuration:
helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml

Configure refresh interval

Control how frequently agents check for repository updates using the refresh_interval parameter:
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/your-org/private-dags", "tracking_ref": "main", "subdir": "dags", "git_conn_id": "git_repo", "refresh_interval": 300}}]'
The default refresh interval is 300 seconds. Reducing this value across many bundles may increase the risk of hitting Git provider rate limits.
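To gauge rate-limit exposure before lowering the interval, estimate the resulting fetch volume. This is back-of-envelope arithmetic, not an Astronomer tool:

```python
def polls_per_hour(num_bundles: int, refresh_interval_s: int) -> float:
    """Estimate Git fetches per hour across all bundles on one agent."""
    return num_bundles * 3600 / refresh_interval_s

# 10 bundles at the default 300-second interval:
print(polls_per_hour(10, 300))  # 120.0
# Dropping the interval to 60 seconds quintuples the load:
print(polls_per_hour(10, 60))   # 600.0
```

Compare the result against your Git provider's published API rate limits before committing to a shorter interval.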

Use a secrets backend for Git credentials

For production environments, use a secrets backend to store Git connection credentials instead of hardcoding them in values.yaml. The following example shows how to configure Azure Key Vault with workload identity authentication on Azure AKS.
1. Configure Azure Key Vault as secrets backend

Add the secrets backend configuration to your values.yaml:
secretBackend: airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend

commonEnv:
  - name: AIRFLOW__SECRETS__BACKEND_KWARGS
    value: '{"connections_prefix": "airflow-connection", "variables_prefix": "airflow-variable", "vault_url": "<your-vault-url>", "workload_identity_tenant_id": "<your-tenant-id>", "managed_identity_client_id": "<your-managed-identity-client-id>"}'
This configuration uses Azure workload identity for authentication, which is the recommended approach for Azure AKS environments. For other authentication methods, see Azure Key Vault secrets backend.

2. Store Git connections in Azure Key Vault

Create secrets in Azure Key Vault for each Git connection. The secret name must follow the pattern <connections_prefix>-<connection-id>. For example, to create a connection with ID git-repo1-conn:
  • In Azure Key Vault, create a secret named airflow-connection-git-repo1-conn.
  • Set the secret value to a JSON connection string:

{
  "conn_type": "git",
  "login": "<git-username>",
  "password": "<personal-access-token>",
  "host": "github.com",
  "schema": "https",
  "extra": {
    "repo": "<your-org>/<your-repo>",
    "branch": "main"
  }
}
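The <connections_prefix>-<connection-id> naming rule can be expressed as a one-line helper (a hypothetical sketch for building and checking secret names, not part of the Azure provider):

```python
def key_vault_secret_name(connections_prefix: str, conn_id: str) -> str:
    """Build the Azure Key Vault secret name the backend will look up."""
    return f"{connections_prefix}-{conn_id}"

# With the connections_prefix from the backend kwargs above:
print(key_vault_secret_name("airflow-connection", "git-repo1-conn"))
# airflow-connection-git-repo1-conn
```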
    
Repeat this process for each Git repository connection you need.

3. Configure DAG bundles with connection references

Configure your DAG bundles to reference the connections stored in Azure Key Vault:
dagBundleConfigList: '[
  {
    "name": "dags-folder",
    "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
    "kwargs": {
      "repo_url": "https://github.com/<your-org>/<repo-1>.git",
      "tracking_ref": "main",
      "subdir": "dags",
      "refresh_interval": 120,
      "git_conn_id": "git-repo1-conn"
    }
  },
  {
    "name": "dags-folder2",
    "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
    "kwargs": {
      "repo_url": "https://github.com/<your-org>/<repo-2>.git",
      "tracking_ref": "main",
      "subdir": "dags",
      "refresh_interval": 120,
      "git_conn_id": "git-repo2-conn"
    }
  }
]'
    
The git_conn_id values must match the connection IDs you created in Azure Key Vault (without the airflow-connection- prefix).

4. Update Helm release

Apply the configuration:
helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml

Configure LocalDagBundle

LocalDagBundle reads DAGs from the local filesystem. This is the default DAG bundle type.

DAG storage options

Choose one of two methods to provide DAGs to agents:

Option 1: Include DAGs in the container image. Build a custom agent image that includes your DAG files and copies them into the /dags folder during the image build.

Option 2: Mount a Persistent Volume Claim. Create a PVC containing your DAGs and mount it into all agent components (Dag Processor, Worker, and Triggerer) at the same path.

Configure DAG path

LocalDagBundle looks for DAGs in /dags by default. Specify a different path using the path parameter:
dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
    

Configure with container image

1. Build custom image

Create a Dockerfile extending the base agent image:
FROM images.astronomer.cloud/baseimages/astro-remote-execution-agent:3.1-3-python-3.12-astro-agent-1.2.0

# Copy DAGs into the image
COPY dags/ /dags/

# Install additional dependencies if needed
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
    
2. Update values file

Reference your custom image in values.yaml:
workers:
  - name: default-worker
    image: your-registry.example.com/custom-agent:1.0.0

dagProcessor:
  image: your-registry.example.com/custom-agent:1.0.0

triggerer:
  image: your-registry.example.com/custom-agent:1.0.0

dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/dags"}}]'
    
3. Update Helm release

Apply the configuration:

helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml

Configure with Persistent Volume Claim

1. Create PVC

Create a PersistentVolumeClaim in your Kubernetes namespace:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dags-pvc
  namespace: <your-namespace>
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: <your-storage-class>
    
Apply the PVC:

kubectl apply -f pvc.yaml
    
2. Configure volume mounts

Update values.yaml to mount the PVC into all components:
workers:
  - name: default-worker
    volumes:
      - name: dags-volume
        persistentVolumeClaim:
          claimName: dags-pvc
    volumeMounts:
      - name: dags-volume
        mountPath: /opt/airflow/dags
        readOnly: true

dagProcessor:
  volumes:
    - name: dags-volume
      persistentVolumeClaim:
        claimName: dags-pvc
  volumeMounts:
    - name: dags-volume
      mountPath: /opt/airflow/dags
      readOnly: true

triggerer:
  volumes:
    - name: dags-volume
      persistentVolumeClaim:
        claimName: dags-pvc
  volumeMounts:
    - name: dags-volume
      mountPath: /opt/airflow/dags
      readOnly: true

dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
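All three components must mount the DAG volume at the same path, and that path must match the bundle's path kwarg. A small consistency check over the parsed values can catch a mismatch before deployment; this is a hypothetical sketch that assumes values.yaml is already loaded into a Python dict (for example with PyYAML):

```python
def mounts_consistent(values: dict, bundle_path: str) -> bool:
    """Check that every component mounts the DAG volume at bundle_path."""
    components = [values["dagProcessor"], values["triggerer"], *values["workers"]]
    for comp in components:
        paths = [m["mountPath"] for m in comp.get("volumeMounts", [])]
        if bundle_path not in paths:
            return False
    return True

# A values.yaml fragment mirroring the configuration above:
values = {
    "workers": [{"name": "default-worker",
                 "volumeMounts": [{"name": "dags-volume", "mountPath": "/opt/airflow/dags"}]}],
    "dagProcessor": {"volumeMounts": [{"name": "dags-volume", "mountPath": "/opt/airflow/dags"}]},
    "triggerer": {"volumeMounts": [{"name": "dags-volume", "mountPath": "/opt/airflow/dags"}]},
}
print(mounts_consistent(values, "/opt/airflow/dags"))  # True
```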
    
3. Update Helm release

Apply the configuration:

helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml
    

GitDagBundle compared to LocalDagBundle

Both bundle types support DAG versioning in the Airflow UI, but GitDagBundle provides additional capabilities:

| Scenario | LocalDagBundle | GitDagBundle |
| --- | --- | --- |
| Viewing previous DAG runs | Displays the DAG as it existed at run time | Displays the DAG as it existed at run time |
| Creating new DAG runs | Uses current DAG code | Uses current DAG code |
| Rerunning a whole previous DAG run | Uses current DAG code | Uses the DAG version from the original run |
| Rerunning individual tasks | Uses the latest version for rerun tasks | Uses task code from the original run |
| Code changes during a DAG run | Uses current DAG code at task start time | Completes using the bundle version from run start |
| Running backfills | Uses current DAG code | Uses the latest bundle version |
| Version creation | Every structural DAG change creates a new version | Every committed structural change creates a new version |

DAG versioning

Airflow 3 automatically tracks DAG versions when you use DAG bundles. Each DAG run is associated with a specific DAG version visible in the Airflow UI. Key behaviors:
• New versions are created for structural changes (tasks, dependencies, schedules)
• The scheduler uses the latest DAG version to create new runs
• You can view the code for any previous DAG version in the UI
• GitDagBundle allows rerunning tasks with their original code version
See Airflow DAG versioning for detailed information about versioning behavior.
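The "structural change" rule can be illustrated with a toy fingerprint over tasks and dependencies. This is only an illustration of the concept, not Airflow's actual versioning implementation:

```python
import hashlib

def structure_fingerprint(tasks: list[str], deps: list[tuple[str, str]]) -> str:
    """Toy fingerprint that changes whenever tasks or dependencies change."""
    payload = ",".join(sorted(tasks)) + "|" + ",".join(sorted(f"{a}->{b}" for a, b in deps))
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

v1 = structure_fingerprint(["extract", "load"], [("extract", "load")])
# Adding a task changes the structure, so a new version would be created:
v2 = structure_fingerprint(["extract", "transform", "load"],
                           [("extract", "transform"), ("transform", "load")])
print(v1 != v2)  # True
```

Changes that leave the structure untouched (for example, editing a task's Python callable body) would not change such a fingerprint, which is why they do not create new DAG versions.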

Next steps

After configuring DAG sources: