Astronomer Software supports Kerberos authentication for Airflow deployment databases, allowing you to connect to Kerberized PostgreSQL databases. This feature is available starting in Astronomer Software 0.37.7.
Overview
Kerberos is an authentication protocol that uses tickets to allow secure authentication in network environments. In enterprise environments with strict security requirements, databases are often configured to use Kerberos authentication instead of traditional username and password authentication. With Kerberos database support in Astronomer Software, you can:
- Connect Airflow deployments to Kerberized PostgreSQL databases
- Maintain compliance with enterprise security policies that require Kerberos authentication
- Use existing Kerberos infrastructure for database authentication
How it works
Astronomer Software uses PgBouncer as a proxy between Airflow components and the Kerberized database. When you enable Kerberos for a deployment:
- You provide labels and environment variables for PgBouncer Pods via the Houston API when creating or updating a deployment.
- Your Kerberos credential injection mechanism (such as a mutation webhook) uses these labels to inject Kerberos credentials (keytabs or credential refresh sidecars) into the PgBouncer Pods.
- PgBouncer authenticates to the PostgreSQL database using GSSAPI (Kerberos protocol).
- Airflow components connect to PgBouncer using standard authentication, and PgBouncer proxies the connection to the Kerberized database.
Architecture
The following diagram shows the architecture for Kerberos database authentication in Astronomer Software 0.37.7:
Prerequisites
Before configuring Kerberos authentication, ensure you have:
- Astronomer Software 0.37.7 or later
- A Kerberized PostgreSQL database (PostgreSQL 18 has known issues)
- Kerberos infrastructure:
  - Active Directory or MIT Kerberos KDC (Key Distribution Center)
  - Network connectivity between your Kubernetes cluster and the KDC
- A mechanism to inject Kerberos credentials into PgBouncer Pods (see Kerberos credential injection)
- A Kerberos user principal created in your Active Directory
- Houston API access to create deployments
Responsibility model
Kerberos database authentication in Astronomer Software follows a shared responsibility model:
Astronomer responsibilities
- Providing PgBouncer images with Kerberos (GSSAPI) support
- Supporting labels and environment variables on PgBouncer Pods via the Houston API
- Maintaining deployment stability during updates
Customer responsibilities
- Setting up and managing Kerberos infrastructure (Active Directory, KDC, etc.)
- Creating and managing Kerberos user principals and keytabs
- Implementing a mechanism to inject Kerberos credentials into PgBouncer Pods (such as a mutation webhook)
- Configuring appropriate labels and environment variables via the Houston API to trigger credential injection
- Creating Kerberos users in the PostgreSQL database with appropriate permissions
- Pre-creating Airflow databases for deployments
- Managing Kerberos ticket lifecycle (renewal, rotation)
Scope and limitations
The following are supported in Astronomer Software 0.37.7 with Kerberos Authentication:
- Executor: Kubernetes Executor only
- Deployment Type: Image-based deployments only
- Database: PostgreSQL with Kerberos authentication
Kerberos credential injection
You are responsible for implementing a mechanism to inject Kerberos credentials into PgBouncer Pods. One common approach is using a Kubernetes mutation webhook.
Mutation webhook approach (recommended)
A mutation webhook can automatically inject Kerberos credentials when PgBouncer Pods are created. The webhook typically:
- Watches for Pod creation requests with specific labels that you configure via the Houston API
- Injects a sidecar container that manages Kerberos ticket renewal
- Mounts Kerberos configuration files (krb5.conf) and keytabs as volumes
- Configures environment variables for Kerberos authentication
When you create or update a deployment, set labels in the pgbouncerConfig section that trigger your webhook to inject the necessary credentials.
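As a rough illustration of the webhook behavior described above, the following Python sketch builds the AdmissionReview response that a Kubernetes mutating admission webhook could return for a labeled PgBouncer Pod. It is not Astronomer's implementation. The Secret name `pgbouncer-keytab`, the ConfigMap name `krb5-conf`, and the mount paths are hypothetical, and the sketch only mounts files; it does not add a ticket-renewal sidecar or serve HTTPS.

```python
import base64
import json

# Hypothetical values: replace with whatever your own webhook and cluster use.
INJECT_LABEL_KEY = "krb-inject"
INJECT_LABEL_VALUE = "enabled"
KEYTAB_SECRET = "pgbouncer-keytab"  # assumed Secret that holds the keytab
KRB5_CONFIGMAP = "krb5-conf"        # assumed ConfigMap with a "krb5.conf" key


def build_patch(pod: dict) -> list:
    """Return JSON Patch operations that mount krb5.conf and a keytab."""
    labels = (pod.get("metadata") or {}).get("labels") or {}
    if labels.get(INJECT_LABEL_KEY) != INJECT_LABEL_VALUE:
        return []  # Pod did not opt in, so leave it untouched

    volumes = [
        {"name": "krb5-conf", "configMap": {"name": KRB5_CONFIGMAP}},
        {"name": "keytab", "secret": {"secretName": KEYTAB_SECRET}},
    ]
    mounts = [
        {"name": "krb5-conf", "mountPath": "/etc/krb5.conf", "subPath": "krb5.conf"},
        {"name": "keytab", "mountPath": "/etc/krb5/keytabs", "readOnly": True},
    ]

    spec = pod["spec"]
    ops = []
    # A path ending in "/-" appends to an existing array; otherwise create the array.
    if spec.get("volumes"):
        ops += [{"op": "add", "path": "/spec/volumes/-", "value": v} for v in volumes]
    else:
        ops.append({"op": "add", "path": "/spec/volumes", "value": volumes})
    # This sketch patches only the first container in the Pod.
    if spec["containers"][0].get("volumeMounts"):
        ops += [
            {"op": "add", "path": "/spec/containers/0/volumeMounts/-", "value": m}
            for m in mounts
        ]
    else:
        ops.append(
            {"op": "add", "path": "/spec/containers/0/volumeMounts", "value": mounts}
        )
    return ops


def admission_response(review: dict) -> dict:
    """Wrap the patch in an admission.k8s.io/v1 AdmissionReview response."""
    request = review["request"]
    ops = build_patch(request["object"])
    response = {"uid": request["uid"], "allowed": True}
    if ops:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(ops).encode()).decode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Pods without the label are admitted unchanged, which keeps the webhook from affecting other workloads in the cluster.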
Alternative approaches
Other methods for credential injection include:
- Init containers that fetch credentials from a secret management system
- Direct volume mounts of Kerberos keytabs from Kubernetes secrets
- Service mesh sidecars
Step 1: Enable manual connection strings
Before creating Kerberos-enabled deployments, you must enable manual connection strings in your Astronomer configuration.
- Open your values.yaml file.
- Add the following configuration:
- Push the configuration change. See Apply a platform configuration change.
Step 2: Create the Kerberos database user
You must create a Kerberos user in your PostgreSQL database with the appropriate permissions.
- Connect to your PostgreSQL database using a superuser account.
- Create the Kerberos user. The username must be in the format <username>@<REALM>:
- Grant the necessary permissions:
Replace astro_user with your Kerberos username and APC.ASTRONOMER.IO with your Kerberos realm.
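To illustrate the <username>@<REALM> naming rule, the following Python sketch validates a principal and prints a matching `CREATE ROLE` statement. These are not the statements from the original guide: the upper-case realm check reflects Kerberos convention rather than a documented requirement, and grants are left out because the privileges you need depend on your database setup.

```python
import re

# Assumed shape: a user name, "@", then an upper-case realm such as APC.ASTRONOMER.IO.
PRINCIPAL_RE = re.compile(r"^[^@\s/]+@[A-Z0-9][A-Z0-9.-]*$")


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_role_sql(username: str, realm: str) -> str:
    """Build a CREATE ROLE statement for a Kerberos principal."""
    principal = f"{username}@{realm}"
    if not PRINCIPAL_RE.match(principal):
        raise ValueError(f"not in <username>@<REALM> form: {principal!r}")
    # The role name contains "@" and ".", so it must be a quoted identifier.
    return f"CREATE ROLE {quote_ident(principal)} WITH LOGIN;"


print(create_role_sql("astro_user", "APC.ASTRONOMER.IO"))
# CREATE ROLE "astro_user@APC.ASTRONOMER.IO" WITH LOGIN;
```

Quoting matters here: without the double quotes, PostgreSQL rejects the `@` and `.` characters in the role name.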
Step 3: (Optional) Configure PgBouncer in the control plane
If you need to use a Kerberized database for the control plane (Houston’s database), you must configure PgBouncer in the astronomer namespace.
Create PgBouncer configuration
- Create a pgbouncer.ini file with the following contents. Replace the placeholders with your actual values:
- Generate a password hash for the PgBouncer users.txt file:
- Create a users.txt file with the password hashes:
The password authentication in users.txt is used for Houston to connect to PgBouncer. PgBouncer then authenticates to the PostgreSQL database using Kerberos (GSSAPI).
- Create the Kubernetes secret:
Update Astronomer configuration
- Open your values.yaml file.
- Add the following PgBouncer configuration:
- Update the Astronomer bootstrap secret to point to the PgBouncer service:
Use the password that you hashed for the users.txt file above.
- Upgrade the Astronomer installation. You must perform the upgrade in two steps:
Step 1: First, upgrade with the --no-hooks flag. This installs PgBouncer in the control plane without running database migration jobs:
<version> is the Astronomer Software version you want to upgrade to (e.g., 0.37.7).
Step 2: After the upgrade completes and PgBouncer is running, run the upgrade again without the --no-hooks flag. This runs the Airflow database migration jobs:
<version> is the Astronomer Software version you want to upgrade to (e.g., 0.37.7).
The two-step upgrade process is necessary because:
- The first upgrade installs PgBouncer, which is required for database connectivity when using a Kerberized database.
- The second upgrade runs database migration hooks that depend on PgBouncer being available.
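The two-pass upgrade can be sketched as a pair of Helm command lines that differ only in the `--no-hooks` flag. The Python helper below just assembles and prints them. The release name `astronomer`, the chart reference `astronomer/astronomer`, and the `values.yaml` path are assumptions for illustration, so substitute the values from your own installation.

```python
import shlex


def upgrade_commands(
    version: str,
    release: str = "astronomer",           # assumed Helm release name
    chart: str = "astronomer/astronomer",  # assumed chart reference
    namespace: str = "astronomer",
    values_file: str = "values.yaml",
) -> list:
    """Return the two helm upgrade invocations, in the order to run them."""
    base = [
        "helm", "upgrade", release, chart,
        "--namespace", namespace,
        "--version", version,
        "-f", values_file,
    ]
    # Pass 1 skips hooks so PgBouncer is installed before any migration job runs.
    # Pass 2 repeats the upgrade with hooks enabled.
    return [base + ["--no-hooks"], base]


for command in upgrade_commands("0.37.7"):
    print(shlex.join(command))
```

Wait until the PgBouncer Pod from the first pass is running before you run the second command.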
Step 4: Create an Airflow deployment database
Before creating a Kerberos-enabled deployment, you must manually create the Airflow database. Follow these steps to create the database:
- Create a PostgreSQL client Pod for database operations:
- Apply the Pod:
- Exec into the Pod and connect to the database:
- Create the Airflow database:
Step 5: Create a Kerberos-enabled deployment
Kerberos-enabled deployments must be created using the Houston API. They cannot be created from the Astronomer UI.
Use the upsertDeployment mutation
- Compose your mutation payload. The following example shows the required fields for a Kerberos-enabled deployment:
Important configuration fields
- kerberosEnabled: Must be set to true. This enables Houston to perform Kerberos-specific validation.
- skipAirflowDatabaseProvisioning: Must be set to true because you manually create the Airflow database.
- metadataConnectionJson and resultBackendConnectionJson:
  - user: Must be in the format <username>@<REALM>
  - pass: Can be any value (e.g., "no-pass") since PgBouncer uses Kerberos for database authentication
  - host: The hostname of your Kerberized PostgreSQL database
  - db: The database name you created (e.g., mydeployment_airflow)
- pgbouncerConfig:
  - labels: Custom labels for the PgBouncer Pod. Use these labels to trigger your Kerberos credential injection mechanism (e.g., "krb-inject": "enabled" or "component": "pgbouncer").
  - env: Environment variables for the PgBouncer Pod. Your credential injection mechanism can use these to configure Kerberos authentication (e.g., KERBEROS_USER, KERBEROS_PASSWORD).
  - extraIniMetadata and extraIniResultBackend: Must specify the Kerberos user.
  - extraIni: Must include Kerberos/GSS settings:
    - server_gssauth_negotiate = allow
    - server_krb_spn = postgres/<airflow-db-host>@<kerberos_realm>
  - sslmode: Set to prefer for RDS or other TLS-enabled databases
- Execute the mutation using the Houston API.
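The field rules above can be collected into a small Python helper that assembles the Kerberos-related variables for the mutation. This is only a sketch: the field names come from the list above, but the nesting, the value types (for example, whether extraIni is a single string), and any other fields that upsertDeployment requires are assumptions to check against your Houston schema. extraIniMetadata, extraIniResultBackend, env, and sslmode are omitted because their exact formats are not shown here.

```python
import json


def kerberos_connection(username: str, realm: str, host: str, db: str) -> dict:
    """Connection details for metadataConnectionJson / resultBackendConnectionJson."""
    return {
        "user": f"{username}@{realm}",  # must be <username>@<REALM>
        "pass": "no-pass",              # placeholder; PgBouncer authenticates with Kerberos
        "host": host,
        "db": db,
    }


def pgbouncer_extra_ini(host: str, realm: str) -> str:
    """The Kerberos/GSS settings listed for extraIni, one per line (assumed format)."""
    return "\n".join([
        "server_gssauth_negotiate = allow",
        f"server_krb_spn = postgres/{host}@{realm}",
    ])


def deployment_variables(username, realm, host, db, labels):
    """Assemble only the Kerberos-related fields; a real payload needs more."""
    connection = kerberos_connection(username, realm, host, db)
    return {
        "kerberosEnabled": True,
        "skipAirflowDatabaseProvisioning": True,
        "metadataConnectionJson": connection,
        "resultBackendConnectionJson": dict(connection),
        "pgbouncerConfig": {
            "labels": labels,
            "extraIni": pgbouncer_extra_ini(host, realm),
        },
    }


variables = deployment_variables(
    "astro_user", "APC.ASTRONOMER.IO",
    "db.example.internal",  # hypothetical database hostname
    "mydeployment_airflow",
    {"krb-inject": "enabled"},
)
print(json.dumps(variables, indent=2))
```

Building the service principal name from the same host and realm values that go into the connection details helps keep the two from drifting apart.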
Step 6: Verify Kerberos authentication
After creating your deployment, verify that Kerberos authentication is working correctly.
Verify deployment creation
- List the Pods in the deployment namespace:
- Verify that all Airflow Pods are running:
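If you prefer to script this check, one option is to save the output of `kubectl get pods -n <namespace> -o json` and scan it for Pods that are not running. The Python sketch below assumes that JSON has already been loaded into a dictionary, and the sample Pod names are made up for illustration.

```python
def pods_not_running(pod_list: dict) -> list:
    """Return the names of Pods whose phase is neither Running nor Succeeded."""
    problems = []
    for pod in pod_list.get("items", []):
        phase = pod.get("status", {}).get("phase")
        # Succeeded is accepted so that completed one-off Pods are not flagged.
        # Note that Running does not guarantee every container is ready.
        if phase not in ("Running", "Succeeded"):
            problems.append(pod["metadata"]["name"])
    return problems


# Hypothetical sample in the shape of `kubectl get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "scheduler-0"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "pgbouncer-0"}, "status": {"phase": "Pending"}},
    ]
}
print(pods_not_running(sample))
# ['pgbouncer-0']
```

A PgBouncer Pod that stays in Pending or keeps restarting is often the first sign that credential injection did not happen.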
Verify database connectivity
- Check the PgBouncer logs for successful connections:
- Verify that the Airflow UI loads successfully:
  a. Log in to the Astronomer UI.
  b. Navigate to your deployment.
  c. Click Airflow UI.
Troubleshooting
PgBouncer Pod fails to start
If the PgBouncer Pod fails to start, check the following:
- Verify that your Kerberos credential injection mechanism is configured correctly.
- Check the Pod events for error messages:
- Verify that the labels in pgbouncerConfig match what your credential injection mechanism expects.
Kerberos authentication failures
If authentication fails, verify:
- The Kerberos user exists in your Active Directory and PostgreSQL database.
- The server_krb_spn in the PgBouncer configuration matches your database hostname and realm.
- Network connectivity between the Kubernetes cluster and the KDC.
- Your Kerberos credential injection mechanism is working correctly.
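The service principal name check in particular is easy to automate. As a sketch, the Python function below parses the extra PgBouncer settings as plain `key = value` lines and reports any mismatch against the expected postgres/<host>@<realm> form. It assumes you have the settings available as text, and the hostname in the example is hypothetical.

```python
def parse_settings(ini_text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and ';' or '#' comments."""
    settings = {}
    for line in ini_text.splitlines():
        line = line.strip()
        if not line or line.startswith((";", "#")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def spn_problems(ini_text: str, db_host: str, realm: str) -> list:
    """Return human-readable problems with the Kerberos/GSS settings."""
    settings = parse_settings(ini_text)
    expected = f"postgres/{db_host}@{realm}"
    problems = []
    if settings.get("server_gssauth_negotiate") != "allow":
        problems.append("server_gssauth_negotiate is not set to allow")
    actual = settings.get("server_krb_spn")
    if actual != expected:
        problems.append(f"server_krb_spn is {actual!r}, expected {expected!r}")
    return problems


extra_ini = """
server_gssauth_negotiate = allow
server_krb_spn = postgres/db.example.internal@APC.ASTRONOMER.IO
"""
print(spn_problems(extra_ini, "db.example.internal", "APC.ASTRONOMER.IO"))
# []
```

An empty list means the two settings match the expected values; it does not prove that the KDC actually issues a ticket for that principal.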
Airflow UI fails to load
If the Airflow UI fails to load:
- Check the webserver and scheduler logs for database connection errors.
- Verify that the database was created with the correct owner.
- Ensure the metadataConnectionJson and resultBackendConnectionJson are correctly configured.