Remote Execution Agents require configuration to access your DAG code. This guide covers configuring DAG bundles, which are collections of DAG files and supporting code introduced in Airflow 3.
This feature requires Airflow 3.x Deployments. Configuring multiple DAG bundles in a single Deployment is only supported in Remote Execution mode.
DAG bundle types
Choose between two types of DAG bundles:
- GitDagBundle: DAGs stored in a Git repository (recommended for production)
- LocalDagBundle: DAGs stored in the container image or a persistent volume (default)
When to use each bundle type
Use GitDagBundle when:
- Running production deployments
- Tracking DAG versions with full rerun capabilities
- Storing DAGs in version control systems
- Managing multiple teams or DAG repositories
Use LocalDagBundle when:
- Running development or testing environments
- Building DAGs into container images
- Using existing PVC-based DAG management
- Preferring simpler configuration
Configure GitDagBundle
GitDagBundle fetches DAGs from Git repositories and provides automatic versioning capabilities.
Supported authentication methods
GitDagBundle supports the following authentication methods:
- Access tokens (personal access tokens, OAuth tokens)
- SSH keys
- SSH agent
Configure public repository
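A minimal dagBundleConfigList sketch for a public repository in values.yaml; the bundle name, repository URL, and subdir below are placeholders, and no git_conn_id is needed:

```yaml
# values.yaml -- public repository, no authentication (sketch; names are placeholders)
dagBundleConfigList: '[{"name": "public-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/<your-org>/<public-repo>.git", "tracking_ref": "main", "subdir": "dags"}}]'
```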
For public repositories, no authentication configuration is required; configure only the repository URL and tracking reference.
Configure private repository
For private repositories, configure both the DAG bundle and an Airflow connection for authentication. Add an Airflow connection environment variable in values.yaml. The connection name suffix must match the git_conn_id value in your DAG bundle configuration.
commonEnv:
- name: AIRFLOW_CONN_GIT_REPO
value: >-
{
"conn_type": "git",
"login": "<git-username>",
"password": "<personal-access-token>",
"host": "github.com",
"schema": "https",
"extra": {
"repo": "<your-org>/<private-repo>",
"branch": "main"
}
}
The connection name AIRFLOW_CONN_GIT_REPO creates a connection with ID git_repo. This ID must match the git_conn_id value in your DAG bundle configuration. For production environments, store connection credentials in a secrets backend instead of values.yaml. See Use a secrets backend for Git credentials for an example using Azure Key Vault.
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"tracking_ref": "main", "subdir": "dags", "repo_url": "https://github.com/<your-org>/<private-repo>.git", "git_conn_id": "git_repo"}}]'
Note that git_conn_id: "git_repo" matches the connection ID from the AIRFLOW_CONN_GIT_REPO environment variable.
Configure refresh interval
Control how frequently agents check for repository updates using the refresh_interval parameter.
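For example, a sketch that checks the repository for updates every two minutes; the bundle name, repository URL, and connection ID are placeholders, and refresh_interval is given in seconds:

```yaml
# values.yaml -- poll the Git repository every 120 seconds
# (sketch; bundle name, repo URL, and git_conn_id are placeholders)
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/<your-org>/<private-repo>.git", "tracking_ref": "main", "subdir": "dags", "refresh_interval": 120, "git_conn_id": "git_repo"}}]'
```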
Use a secrets backend for Git credentials
For production environments, use a secrets backend to store Git connection credentials instead of hardcoding them in values.yaml. The following example shows how to configure Azure Key Vault with workload identity authentication on Azure AKS.
secretBackend: airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend
commonEnv:
- name: AIRFLOW__SECRETS__BACKEND_KWARGS
value: '{"connections_prefix": "airflow-connection", "variables_prefix": "airflow-variable", "vault_url": "<your-vault-url>", "workload_identity_tenant_id": "<your-tenant-id>", "managed_identity_client_id": "<your-managed-identity-client-id>"}'
This configuration uses Azure workload identity for authentication, which is the recommended approach for Azure AKS environments. For other authentication methods, see Azure Key Vault secrets backend.
Create secrets in Azure Key Vault for each Git connection. The secret name must follow the pattern <connections_prefix>-<connection-id>. For example, a connection with ID git-repo1-conn requires a secret named airflow-connection-git-repo1-conn with the following value:
{
"conn_type": "git",
"login": "<git-username>",
"password": "<personal-access-token>",
"host": "github.com",
"schema": "https",
"extra": {
"repo": "<your-org>/<your-repo>",
"branch": "main"
}
}
dagBundleConfigList: '[
{
"name": "dags-folder",
"classpath": "airflow.providers.git.bundles.git.GitDagBundle",
"kwargs": {
"repo_url": "https://github.com/<your-org>/<repo-1>.git",
"tracking_ref": "main",
"subdir": "dags",
"refresh_interval": 120,
"git_conn_id": "git-repo1-conn"
}
},
{
"name": "dags-folder2",
"classpath": "airflow.providers.git.bundles.git.GitDagBundle",
"kwargs": {
"repo_url": "https://github.com/<your-org>/<repo-2>.git",
"tracking_ref": "main",
"subdir": "dags",
"refresh_interval": 120,
"git_conn_id": "git-repo2-conn"
}
}
]'
The git_conn_id values must match the connection IDs you created in Azure Key Vault (without the airflow-connection- prefix).
Configure LocalDagBundle
LocalDagBundle reads DAGs from the local filesystem. This is the default DAG bundle type.
DAG storage options
Choose one of two methods to provide DAGs to agents:
Option 1: Include DAGs in the container image
Build a custom agent image that includes your DAG files. Copy DAGs into the /dags folder during the image build.
Option 2: Mount Persistent Volume Claim
Create a PVC containing your dags and mount it into all agent components (Dag Processor, Worker, and Triggerer) at the same path.
Configure DAG path
LocalDagBundle looks for DAGs in /dags by default. Specify a different path using the path parameter.
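For example, a sketch that reads DAGs from a custom mount path rather than the default; the path value must match wherever your image or volume actually places the DAG files:

```yaml
# values.yaml -- read DAGs from a custom path instead of the default /dags
dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
```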
Configure with container image
FROM images.astronomer.cloud/baseimages/astro-remote-execution-agent:3.1-3-python-3.12-astro-agent-1.2.0
# Copy dags into the image
COPY dags/ /dags/
# Install additional dependencies if needed
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
workers:
- name: default-worker
image: your-registry.example.com/custom-agent:1.0.0
dagProcessor:
image: your-registry.example.com/custom-agent:1.0.0
triggerer:
image: your-registry.example.com/custom-agent:1.0.0
dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/dags"}}]'
Configure with Persistent Volume Claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: dags-pvc
namespace: <your-namespace>
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 20Gi
storageClassName: <your-storage-class>
workers:
- name: default-worker
volumes:
- name: dags-volume
persistentVolumeClaim:
claimName: dags-pvc
volumeMounts:
- name: dags-volume
mountPath: /opt/airflow/dags
readOnly: true
dagProcessor:
volumes:
- name: dags-volume
persistentVolumeClaim:
claimName: dags-pvc
volumeMounts:
- name: dags-volume
mountPath: /opt/airflow/dags
readOnly: true
triggerer:
volumes:
- name: dags-volume
persistentVolumeClaim:
claimName: dags-pvc
volumeMounts:
- name: dags-volume
mountPath: /opt/airflow/dags
readOnly: true
dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
GitDagBundle compared to LocalDagBundle
Both bundle types support DAG versioning in the Airflow UI, but GitDagBundle provides additional capabilities:
| Scenario | LocalDagBundle | GitDagBundle |
|---|---|---|
| Viewing previous DAG runs | Displays DAG as it existed at run time | Displays DAG as it existed at run time |
| Creating new DAG runs | Uses current DAG code | Uses current DAG code |
| Rerunning whole previous DAG run | Uses current DAG code | Uses DAG version from original run time |
| Rerunning individual tasks | Uses latest version for rerun tasks | Uses task code from original run time |
| Code changes during DAG run | Uses current DAG code at task start time | Completes using bundle version from run start |
| Running backfills | Uses current DAG code | Uses latest bundle version |
| Version creation | Every structural DAG change creates new version | Every committed structural change creates new version |
DAG versioning
Airflow 3 automatically tracks DAG versions when you use DAG bundles. Each DAG run is associated with a specific DAG version visible in the Airflow UI. Key behaviors:
- New versions are created for structural changes (tasks, dependencies, schedules)
- The scheduler uses the latest DAG version to create new runs
- You can view code for any previous DAG version in the UI
- GitDagBundle allows rerunning tasks with their original code version
Next steps
After configuring DAG sources:
- Deploy Remote Execution project - Build and deploy your Airflow project
- Configure logging - Set up task log collection
- Configure OpenLineage - Enable data lineage tracking