Info This page has not yet been updated for Airflow 3. The concepts shown are relevant, but some code may need to be updated. If you run any examples, take care to update import statements and watch for any other breaking changes.
BigQuery is Google’s fully managed and serverless data warehouse. Integrating BigQuery with Airflow lets you execute BigQuery jobs from a DAG. There are multiple ways to connect Airflow and BigQuery, all of which require a GCP Service Account:
  • Use the contents of a service account key file directly in an Airflow connection.
  • Copy the service account key file to your Airflow project.
  • Store the contents of a service account key file in a secrets backend.
  • Use a Kubernetes service account to integrate Airflow and BigQuery. This is possible only if you run Airflow on Astro or Google Kubernetes Engine (GKE).
Using a Kubernetes service account is the most secure method because it doesn’t require storing a secret in Airflow’s metadata database, on disk, or in a secrets backend. The next most secure connection method is to store the contents of your service account key file in a secrets backend.
Tip If you’re an Astro user, Astronomer recommends using workload identity to authorize your Deployments to BigQuery. This eliminates the need to specify secrets in your Airflow connections or copy a credentials file to your Astro project. See Authorize Deployments to your cloud.

Prerequisites

Get connection details

A connection from Airflow to Google BigQuery requires the following information:
  • Service account name
  • Service account key file
  • Google Cloud Project ID
Complete one of the following sets of steps to retrieve these values:
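Once you have a service account key file, the service account name and project ID can be read directly from it. The sketch below assumes the standard JSON key file layout that Google Cloud generates (fields such as `client_email` and `project_id`). The function name `read_connection_details` and all sample values are hypothetical placeholders, not part of Airflow or of any Google library.

```python
import json

# Fields expected in a Google Cloud service account key file (JSON format).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def read_connection_details(key_file_contents: str) -> dict:
    """Extract the service account name and project ID from key file contents."""
    key = json.loads(key_file_contents)
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"expected a service account key, got type {key['type']!r}")
    return {
        "service_account": key["client_email"],
        "project_id": key["project_id"],
    }


# Placeholder key contents for illustration only; a real key file holds a real private key.
example_key = json.dumps(
    {
        "type": "service_account",
        "project_id": "my-project",
        "private_key": "-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n",
        "client_email": "airflow-bq@my-project.iam.gserviceaccount.com",
    }
)

details = read_connection_details(example_key)
print(details["service_account"])
print(details["project_id"])
```

Checking the file this way before pasting it into a connection can catch a truncated or wrong-type key early.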

Create your connection

Info Astro users can also create connections using the Astro Environment Manager, which stores connections in an Astro-managed secrets backend. These connections can be shared across multiple deployed and local Airflow environments. See Create Airflow connections in the Astro UI.
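Outside the UI, Airflow can also read a connection from an environment variable named `AIRFLOW_CONN_<CONN_ID>`, with the value given as JSON. The following is a minimal sketch of building such a value for the key file methods described on this page. It assumes Airflow 2.3 or later (for JSON-format connection variables) and a Google provider version that accepts the unprefixed extra fields `project`, `key_path`, and `keyfile_dict`; the helper `build_connection_env`, the project ID, and the file path are hypothetical.

```python
import json
from typing import Optional, Tuple


def build_connection_env(
    conn_id: str,
    project_id: str,
    key_path: Optional[str] = None,
    keyfile_contents: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (variable name, JSON value) describing a Google Cloud connection."""
    if key_path and keyfile_contents:
        raise ValueError("set either key_path or keyfile_contents, not both")

    extra = {"project": project_id}
    if key_path:
        # Key file copied into the Airflow project.
        extra["key_path"] = key_path
    elif keyfile_contents:
        # Contents of the key file, passed as a JSON string.
        extra["keyfile_dict"] = keyfile_contents
    # With neither set, no key is stored in the connection at all.

    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    value = json.dumps({"conn_type": "google_cloud_platform", "extra": extra})
    return name, value


name, value = build_connection_env(
    "google_cloud_default",
    "my-project",  # placeholder project ID
    key_path="/usr/local/airflow/include/key.json",  # placeholder path
)
print(f"{name}='{value}'")
```

Leaving both key arguments unset produces a connection that carries only the project ID, which is the shape a workload identity setup would use.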

How it works

Airflow uses the python-bigquery library to connect to GCP BigQuery through the BigQueryHook. If you don’t define specific key credentials in the connection, the Google client libraries fall back to Application Default Credentials (ADC). This means that when you use Workload Identity to connect to BigQuery, Airflow relies on ADC to authenticate.
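The fallback described above can be pictured as a simple decision: credentials defined in the connection win, and ADC applies only when none are defined. The sketch below is a simplified illustration of that order, not the provider's actual implementation. The function `describe_credential_source` is hypothetical, the extra field names `keyfile_dict` and `key_path` are assumed to match your Google provider version, and the only ADC detail modeled is the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

```python
from typing import Mapping


def describe_credential_source(extra: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Report which credentials would be used for a given connection and environment."""
    # 1. Key credentials stored in the Airflow connection take priority.
    if extra.get("keyfile_dict"):
        return "connection: keyfile_dict"
    if extra.get("key_path"):
        return "connection: key_path"
    # 2. Otherwise ADC applies: first an explicitly named credentials file...
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "ADC: file named by GOOGLE_APPLICATION_CREDENTIALS"
    # 3. ...then whatever identity the runtime provides, such as Workload Identity.
    return "ADC: ambient credentials (local gcloud login or attached service account)"


# A connection with a key path uses that key.
with_key = describe_credential_source({"key_path": "/keys/key.json"}, {})
# A connection with only a project ID falls through to ADC.
without_key = describe_credential_source({"project": "my-project"}, {})
print(with_key)
print(without_key)
```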

See also