Airflow 3: This feature is only available for Airflow 3.x Deployments.
OpenLineage enables you to access data lineage and provenance across your Airflow workflows for your Remote Execution Agent. Features like Observe and Astro Alerts require that you enable OpenLineage for your data pipelines.
When you create your Remote Execution Agent, Astro automatically generates a Helm values.yaml file with OpenLineage configurations pre-filled. To finish setting up OpenLineage, you need to configure an access credential for it. You can use an Astro Deployment API token as your OpenLineage API key. There are three methods you can use to add your API key to your Helm values:
- Configure the API key as plain text: This stores your API key in your values.yaml file as plain text. It is the simplest but least secure option, and is appropriate for development or testing environments.
- Use a pre-created Kubernetes secret: This stores your API key separately from your values.yaml file, which is more secure than plain text and relies on standard Kubernetes features.
- Inject your API key with a secrets manager: This approach uses an init container to inject the API key into the Remote Execution Agent component Pods. The example uses the HashiCorp Vault Agent, but you can use your own secrets manager. This option provides enhanced security with the potential for secret rotation.
Prerequisites
Setup
Step 1: Retrieve your OpenLineage Namespace and URL
- In the Astro UI, go to the Deployment page and choose Agent. Then click Register Remote Agent.
- Click Download values.yaml file.
The downloaded Helm values file comes with key OpenLineage settings pre-filled, including the OpenLineage namespace and URL, so you only need to configure the OpenLineage API key.
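Before you continue, it can help to confirm that the downloaded file contains the pre-filled values. The following is a minimal sketch, not an Astro tool. It writes a hypothetical sample values.yaml to a temporary directory and reads the url and namespace lines with sed. The get_value helper and the sample file are assumptions for illustration; point the helper at your own downloaded file instead. The helper returns the first uncommented line with that key anywhere in the file, so treat it as a quick sanity check rather than a YAML parser.

```shell
#!/bin/sh
# Hypothetical sample standing in for the values.yaml downloaded from the Astro UI.
workdir="$(mktemp -d)"
cat > "$workdir/values.yaml" <<'EOF'
openLineage:
  enabled: true
  url: "https://your-openlineage-endpoint.example.com"
  namespace: "your-astro-deployment-namespace"
EOF

# Print the value of the first uncommented "key: value" line, with quotes removed.
# $1 = key name, $2 = path to the values file.
get_value() {
  sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" "$2" | tr -d '"' | head -n 1
}

ol_url="$(get_value url "$workdir/values.yaml")"
ol_namespace="$(get_value namespace "$workdir/values.yaml")"

# Report each value, or a marker when the key is absent.
echo "OpenLineage URL:       ${ol_url:-<missing>}"
echo "OpenLineage namespace: ${ol_namespace:-<missing>}"
```

If either line prints <missing>, download the file again from the Astro UI before you configure the API key.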
Step 2: Configure your OpenLineage API key
The plain text method stores an Astro Deployment API token in your values.yaml file, where it serves as your OpenLineage API key. The Remote Execution Agent Helm chart uses it to create a Kubernetes secret named openlineage-api-key-secret. The API key is base64-encoded in the Kubernetes secret. All Remote Execution Agent components, such as the Worker, Dag processor, and Triggerer, use this API key to authenticate with the OpenLineage endpoint.
- Add the following OpenLineage configuration to your values.yaml file:
openLineage:
  # Enable OpenLineage integration
  enabled: true
  # Set your OpenLineage API key directly in the values file
  apiKey: "insert-your-openlineage-api-key-here"
  # Do NOT set apiKeySecret when using apiKey
  # apiKeySecret: ~
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
- Apply the chart using the values.yaml file with the following command:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
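Base64 is an encoding, not encryption, so the key stored in the Kubernetes secret can be recovered by anyone who can read the secret. The sketch below illustrates this with the placeholder key from the configuration above. It uses only the standard base64 utility and does not contact a cluster. The variable names are illustrative.

```shell
#!/bin/sh
# Placeholder value taken from the example configuration; not a real key.
api_key="insert-your-openlineage-api-key-here"

# Encode the key the way a Kubernetes secret stores its data values.
encoded="$(printf '%s' "$api_key" | base64)"

# Decoding needs no password or key material.
decoded="$(printf '%s' "$encoded" | base64 --decode)"

echo "stored form:  $encoded"
echo "decoded form: $decoded"
```

Because the decoded form matches the original, restrict read access to the secret and to the values.yaml file in the same way you would for the plain text key.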
When you use a pre-created Kubernetes secret, the Remote Execution Agent Helm chart doesn’t create a new secret for the OpenLineage API key. Instead, it configures all Agent components, such as the Worker, Dag processor, and Triggerer, to use the existing secret to authenticate with the OpenLineage endpoint. The secret must have a key named api-key that contains the OpenLineage API key, and your values.yaml file references the secret by name. You can use an Astro Deployment API token as your OpenLineage API key.
- Create a Kubernetes secret containing your OpenLineage API key:
kubectl create secret generic openlineage-api-key-secret \
  --from-literal=api-key=your-openlineage-api-key-here \
  --namespace <your-namespace>
- Configure your values.yaml file so that OpenLineage uses your pre-created secret:
openLineage:
  # Enable OpenLineage integration
  enabled: true
  # Do NOT set apiKey when using apiKeySecret
  # apiKey: ~
  # Reference the pre-created secret you created in the previous step
  apiKeySecret: "openlineage-api-key-secret"
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
- Apply the chart using the values.yaml file with the following command:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
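The configuration comments say not to set apiKey and apiKeySecret together. A small pre-install check can catch that mistake before you run helm install. The following is a sketch under stated assumptions: check_openlineage_auth is a hypothetical helper, the two sample files are made up for the demonstration, and the check uses a simple line match, so an uncommented "apiKey: ~" line also counts as set.

```shell
#!/bin/sh
# Succeeds when apiKey and apiKeySecret are not both set on uncommented lines.
# $1 = path to a values file.
check_openlineage_auth() {
  count=0
  if grep -Eq '^[[:space:]]*apiKey:' "$1"; then count=$((count + 1)); fi
  if grep -Eq '^[[:space:]]*apiKeySecret:' "$1"; then count=$((count + 1)); fi
  [ "$count" -le 1 ]
}

workdir="$(mktemp -d)"

# Hypothetical sample: only apiKeySecret is set, apiKey stays commented out.
cat > "$workdir/good.yaml" <<'EOF'
openLineage:
  enabled: true
  # apiKey: ~
  apiKeySecret: "openlineage-api-key-secret"
EOF

# Hypothetical sample: both settings are present.
cat > "$workdir/bad.yaml" <<'EOF'
openLineage:
  enabled: true
  apiKey: "insert-your-openlineage-api-key-here"
  apiKeySecret: "openlineage-api-key-secret"
EOF

for f in good.yaml bad.yaml; do
  if check_openlineage_auth "$workdir/$f"; then
    echo "$f: ok"
  else
    echo "$f: apiKey and apiKeySecret are both set"
  fi
done
```

The check also passes when neither setting is present, which matches the secrets manager method, where both stay commented out.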
You can also use a secrets manager to securely store your API key. The following procedure uses the HashiCorp Vault Agent. The Vault Agent init container runs before the main Remote Execution Agent container. The Vault Agent authenticates with Vault, retrieves the OpenLineage API key, and writes it to a file in the shared volume. The Remote Execution Agent container then reads the key from that file through the OPENLINEAGE_API_KEY environment variable. Use your Astro Deployment API token as your OpenLineage API key.
- Configure OpenLineage and add init containers for the dagProcessor, workers, and triggerer components:
openLineage:
  # Enable OpenLineage integration
  enabled: true
  # Don't set apiKey or apiKeySecret
  # apiKey: ~
  # apiKeySecret: ~
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
# Configure each component to use the Vault init container
dagProcessor:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
triggerer:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
workers:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
- Create a ConfigMap for your Vault Agent:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: vault-agent-config
  namespace: <your-namespace>
data:
  agent.hcl: |
    auto_auth {
      method "kubernetes" {
        mount_path = "auth/kubernetes"
        config = {
          role = "astro-agent"
        }
      }
    }
    template {
      destination = "/vault/secrets/openlineage-api-key"
      contents = "{{ with secret \"secret/data/openlineage/api-key\" }}{{ .Data.data.key }}{{ end }}"
    }
EOF
- Apply the chart with the values.yaml file:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
Read more about secrets backends on Astro.
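The Vault Agent template in the ConfigMap reads the field named key from the path secret/data/openlineage/api-key. Assuming that secret is a KV version 2 secrets engine mount (the data/ segment suggests this, but the procedure does not state it), the vault kv command addresses the same secret without the data/ segment. The sketch below derives that CLI path with plain shell string handling and prints a vault kv put command that you could adapt. It does not run Vault, and the variable names are illustrative.

```shell
#!/bin/sh
# API path exactly as it appears in the Vault Agent template.
api_path="secret/data/openlineage/api-key"

# Assumption: KV version 2 layout, that is <mount>/data/<secret path>.
mount="${api_path%%/*}"        # everything before the first slash
rest="${api_path#*/data/}"     # everything after the data/ segment
cli_path="$mount/$rest"

# Print the command instead of running it; replace the placeholder first.
echo "vault kv put $cli_path key=<your-openlineage-api-key>"
```

The field name key on the command line matches the .Data.data.key lookup in the template, so the rendered file contains only the API key value.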
Step 3: Set OpenLineage environment variables on the Orchestration Plane
- In the Astro UI, open your Deployment.
- Navigate to the Environment Variables tab.
- Add the following environment variable:
OPENLINEAGE_DISABLED=False
- Apply changes.
Setting this variable ensures that all required OpenLineage events, including task and DAG run events, are collected from the scheduler, workers, Dag processor, and triggerer components. This provides complete lineage in Observe and Astro Alerts.