Astro uses the OpenLineage Airflow Provider (pre-installed on Astro Runtime) to extract lineage metadata from Airflow. Astro also automatically configures OpenLineage to send metadata from your Deployments to Astro for Observe and alerts, so lineage data is captured and delivered without any additional setup. You can also forward lineage events to additional backends if needed. For information about data lineage concepts, OpenLineage, and how it works with Airflow, see Integrate OpenLineage and Airflow.

In the Astro UI, the Observe section lets you view the Data Products and Assets (dags, tasks, and datasets) that your Organization has created. When you view a specific Data Product, you can click Open Graph to render the lineage metadata generated by your dags as a dynamic graph. For more information on using the lineage graph, see Leveraging Data Products.
Recommended OpenLineage versions
OpenLineage client
The `openlineage-python` package is responsible for sending lineage metadata from Airflow to Astro.
It can be safely upgraded at any time, independent of Airflow and OpenLineage provider versions, to take advantage of the latest fixes, performance improvements, and features.
Use OpenLineage client version 1.38 or later for maximum compatibility and access to the latest features on Astro. Astronomer recommends always using the latest available release.
Add the following to your requirements.txt to upgrade:
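A minimal entry, assuming you only want to enforce the 1.38 minimum stated above rather than pin an exact release:

```text
openlineage-python>=1.38
```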
OpenLineage provider
The `apache-airflow-providers-openlineage` package serves as the Airflow integration layer for OpenLineage, extracting metadata from tasks and DAGs and passing it to the OpenLineage client.
Keep the provider at the latest available version supported by your Airflow version to ensure accurate and complete lineage capture.
Astronomer recommends always using the latest available release of the OpenLineage provider.
Add the following to your requirements.txt to upgrade:
Default Astro configuration
Astro automatically configures a set of environment variables for OpenLineage setup. These variables are grouped by their function.
OpenLineage transport variables
Astro sets the following transport-related variables for compatibility with all supported OpenLineage client versions:
Legacy http transport
These three legacy transport variables are used by older OpenLineage clients, and Astro supplies them for backward compatibility. If you use the recommended OpenLineage client version or newer, these variables are ignored in favor of the composite transport configuration below.

| Variable | Value / Example | Purpose |
|---|---|---|
| OPENLINEAGE_URL | https://o11y.astronomer.io | Astro Observe ingestion URL. |
| OPENLINEAGE_API_KEY | <deployment_api_key> | Authentication key for your Deployment. |
| OPENLINEAGE_ENDPOINT | api/v1/lineage?ASTRO_ORGANIZATION_ID=... | Astro Observe ingestion endpoint with deployment identifiers. |
New composite transport
For newer OpenLineage clients, Astro configures a composite transport with explicit sub-transports. This structure lets you append other backends by defining additional sub-transports, as described in Sending lineage to additional backends. Make sure to upgrade to the recommended OpenLineage client version.

| Variable | Value / Example | Description |
|---|---|---|
| OPENLINEAGE__TRANSPORT__TYPE | composite | Specifies use of the composite transport, enabling multiple sub-transports. |
| OPENLINEAGE__TRANSPORT__SORT_TRANSPORTS | true | Ensures sub-transport execution is sorted by priority. |
| OPENLINEAGE__TRANSPORT__TRANSPORTS__DEFAULT_HTTP | {} | Internally used to avoid duplicate event sending. Safe to ignore warnings on older client versions. |
| OPENLINEAGE__TRANSPORT__TRANSPORTS__ASTRO__* | (multiple variables) | Set of variables configuring the transport for Astro Observe delivery. |
The Astro transport is itself a composite of two HTTP sub-transports:
- Primary HTTP transport: sends events to the public Astro Observe ingestion URL.
- Backup HTTP transport: sends events to a local ingestion URL within the cluster, invoked only if the primary transport fails.

For OpenLineage client version 1.38 or higher, the backup transport is only called if the primary transport fails. For older client versions, the backup transport may be called unnecessarily even when the primary transport succeeds.
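The primary and backup behavior can be pictured with a small conceptual model. This is only a sketch of the "stop after the first successful delivery" idea, not the OpenLineage client's actual implementation; `emit_with_fallback` and `unreachable` are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Event = dict
Sender = Callable[[Event], None]


def emit_with_fallback(
    event: Event, transports: Sequence[Tuple[str, Sender]]
) -> Optional[str]:
    """Try each (name, sender) pair in order and stop at the first success.

    Returns the name of the transport that delivered the event, or None if
    every transport raised an exception.
    """
    for name, send in transports:
        try:
            send(event)
        except Exception:
            # This transport failed, so fall through to the next one.
            continue
        return name
    return None


def unreachable(event: Event) -> None:
    # Hypothetical stand-in for a primary endpoint that is down.
    raise ConnectionError("primary endpoint unreachable")


received_by_backup = []
delivered_by = emit_with_fallback(
    {"eventType": "START"},
    [("astro_primary", unreachable), ("astro_backup", received_by_backup.append)],
)
print(delivered_by)
```

In this model the backup sender runs only because the primary raised. When the primary succeeds, the loop returns before the backup is ever called.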
If you use automated forwarding of OpenLineage configuration to external jobs, such as Spark or dbt, from your Airflow DAGs, be aware that Astro’s backup transport uses an internal URL accessible only within the Airflow cluster. When forwarding OpenLineage configuration to jobs outside of Astro or your Airflow deployment, ensure you only use globally accessible endpoints such as https://o11y.astronomer.io/. The backup transport may not be reachable from outside the cluster and should be omitted from the forwarded transport configuration.
OpenLineage non-transport variables
| Variable | Description |
|---|---|
| OPENLINEAGE_NAMESPACE | Unique job namespace for your Deployment. |
| OPENLINEAGE__FACETS__ENVIRONMENT_VARIABLES | Includes ASTRO_ORGANIZATION_ID, ASTRO_WORKSPACE_ID, ASTRO_DEPLOYMENT_ID. If you specify any environment variable names here, Astro also appends ASTRO_ORGANIZATION_ID, ASTRO_WORKSPACE_ID, and ASTRO_DEPLOYMENT_ID to your list. |
| AIRFLOW__OPENLINEAGE__EXECUTION_TIMEOUT | Maximum seconds to wait for the OpenLineage listener to perform lineage metadata extraction. |
| AIRFLOW__CORE__TASK_SUCCESS_OVERTIME | Ensures Airflow allows OpenLineage enough time to finish metadata extraction. |
If you define any of these environment variables manually, Astro preserves your values and does not overwrite them.
Sending lineage to additional backends
To send OpenLineage events to an external OpenLineage backend alongside Astro Observe, define an environment variable with the additional transport configuration. Keep the following in mind:
- Astro’s default transport remains active.
- A transport’s `priority` attribute determines the order in which transports are executed. Astro’s transport priority is set to 1.
- Do not set or overwrite any other transport-related variables, because doing so may prevent some transports from sending events properly.
- Do not overwrite `OPENLINEAGE_NAMESPACE` or `AIRFLOW__OPENLINEAGE__NAMESPACE`.
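As a rough illustration of the points above, the following sketch builds the name and JSON value of one extra sub-transport variable. It assumes the `OPENLINEAGE__TRANSPORT__TRANSPORTS__<NAME>` pattern from the composite transport table and the OpenLineage HTTP transport keys `type`, `url`, and `auth`. The helper name, the backend name `my_backend`, the URL, and the chosen priority are all hypothetical placeholders.

```python
import json
from typing import Tuple

PREFIX = "OPENLINEAGE__TRANSPORT__TRANSPORTS__"
# Sub-transport names that Astro manages itself, per the tables above.
RESERVED_PREFIXES = ("ASTRO", "DEFAULT_HTTP")


def additional_transport_env(
    name: str, url: str, api_key: str, priority: int
) -> Tuple[str, str]:
    """Return (variable name, JSON value) for one extra HTTP sub-transport."""
    key = name.upper()
    # A double underscore would be read as a nested configuration key.
    if "__" in key or not key.replace("_", "").isalnum():
        raise ValueError(f"invalid sub-transport name: {name!r}")
    if key.startswith(RESERVED_PREFIXES):
        raise ValueError(f"{name!r} collides with an Astro-managed transport")
    config = {
        "type": "http",
        "url": url,
        "auth": {"type": "api_key", "apiKey": api_key},
        "priority": priority,
    }
    return PREFIX + key, json.dumps(config)


var_name, value = additional_transport_env(
    name="my_backend",                  # hypothetical backend name
    url="https://lineage.example.com",  # hypothetical backend URL
    api_key="<your_api_key>",
    priority=0,                         # choose relative to Astro's priority of 1
)
print(f"{var_name}={value}")
```

Rejecting names that start with `ASTRO` or `DEFAULT_HTTP` is one way to avoid accidentally overwriting the variables Astro sets for its own transport.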
Example: Send data to Atlan
Custom namespace for Atlan: Astro does not allow overwriting `OPENLINEAGE_NAMESPACE` or `AIRFLOW__OPENLINEAGE__NAMESPACE`. If Atlan requires a different namespace, use a TransformTransport to set a custom namespace only for Atlan events.
- Update the placeholder values and add the following environment variable to your Deployment:
- Click Update Environment Variables to save your changes.
- Verify that Atlan receives lineage events alongside Astro Observe.
Opting out of Astro OpenLineage delivery
Stop sending data to Astronomer but send to your own backend
To override Astronomer’s default OpenLineage configuration and send events only to your backend, provide a full OpenLineage transport configuration using one of the following environment variables. These take precedence over Astro defaults:
- `AIRFLOW__OPENLINEAGE__TRANSPORT`
- `AIRFLOW__OPENLINEAGE__CONFIG_PATH` (YAML config path)

The transport configuration must include a `type` key to be valid. Once configured, the Deployment does not send OpenLineage events to Astronomer.
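For `AIRFLOW__OPENLINEAGE__TRANSPORT`, the value is a JSON object. The sketch below serializes such an object and rejects one that lacks a `type` key. The helper name and the backend URL are hypothetical; the `http` keys shown (`url`, `endpoint`) follow the standard OpenLineage HTTP transport options.

```python
import json


def override_transport_value(config: dict) -> str:
    """Serialize a full transport configuration, rejecting one without `type`."""
    transport_type = config.get("type")
    if not isinstance(transport_type, str) or not transport_type:
        raise ValueError("transport configuration must include a non-empty 'type' key")
    return json.dumps(config)


value = override_transport_value(
    {
        "type": "http",
        "url": "https://lineage.example.com",  # hypothetical backend URL
        "endpoint": "api/v1/lineage",
    }
)
print(f"AIRFLOW__OPENLINEAGE__TRANSPORT={value}")
```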
Disable OpenLineage completely
By default, OpenLineage is enabled for all Astro Deployments. To disable all OpenLineage emission from your Deployment, set the following environment variable:
To re-enable OpenLineage, set the variable to false.
Expected log messages and warnings
You may encounter the following messages in your Airflow or OpenLineage logs. These log outputs are expected and safe to ignore in the context of Astro’s managed OpenLineage configuration:
- “DEFAULT_HTTP already found in environment variables, skipping aliasing OPENLINEAGE_URL”: Astro pre-sets the modern composite transport variables and suppresses legacy duplicate event sending. Upgrade to the latest OpenLineage client to suppress this message.
- DeprecationWarning for “'api_key' option is deprecated, please use 'apiKey'”: Astro populates both versions of the API key to maximize compatibility. Upgrade to the latest OpenLineage client to suppress this message.
- “Stopping OpenLineage CompositeTransport emission after the first successful delivery because continue_on_success=False. Transport that emitted the event: <HttpTransport(name=astro_primary OR astro_backup …)>”: The Astro composite transport stopped because it successfully delivered the event to the Astro Observe backend. Any user-configured transports still run normally. Upgrade to the latest OpenLineage client to suppress this message.
Known limitations
- Astro accepts OpenLineage events up to a maximum size of 5 MB. Events exceeding this limit result in an `HTTP 413 Content Too Large` error.
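To get a rough sense of whether an event approaches this limit, you can measure its serialized size. The sketch below is only an approximation: it assumes "5 MB" means 5 * 1024 * 1024 bytes and that the event is sent as UTF-8 JSON, and the real client's serialization may differ slightly. The function names and the sample event are hypothetical.

```python
import json

# Assumption: "5 MB" is read as 5 * 1024 * 1024 bytes.
MAX_EVENT_BYTES = 5 * 1024 * 1024


def event_size_bytes(event: dict) -> int:
    """Approximate the payload size of an event serialized as UTF-8 JSON."""
    return len(json.dumps(event).encode("utf-8"))


def fits_size_limit(event: dict, limit: int = MAX_EVENT_BYTES) -> bool:
    """Return True if the serialized event is within the size limit."""
    return event_size_bytes(event) <= limit


small_event = {
    "eventType": "COMPLETE",
    "job": {"namespace": "example", "name": "dag.task"},
}
print(event_size_bytes(small_event), fits_size_limit(small_event))
```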