If some of your tasks require specific resources, such as a GPU, you might want to run them in a different cluster than your Airflow instance. In setups where both clusters are used by the same AWS, Azure, or GCP account, you can manage separate clusters with roles and permissions.
To launch Pods in external clusters from a local Airflow environment, your Airflow environment must have valid credentials that authorize it to launch Pods in the external cluster. For managed Kubernetes services from public cloud providers, authentication is federated through the native IAM service. To grant the Astro role permissions to launch Pods on your cluster, you can either include static credentials in your Airflow environment or use workload identity to authorize the Astro role to your cluster.
Prerequisites
- Network connectivity between your Airflow execution environment and the external Kubernetes cluster:
  - Hosted execution mode: A network connection between your Astro Deployment and the external cluster.
  - Remote execution mode: Network connectivity between the environment where your Remote Execution Agent runs and the external cluster. You are responsible for managing this connectivity. A direct network connection between Astro and the external cluster is not required.
Setup
1. Create an IAM role for your external cluster and attach the following permission policies: AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly.
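If you manage IAM from the command line, you can attach these policies with the AWS CLI. The role name below is a placeholder:

```bash
# attach the required managed policies; replace <your-role-name> with the role from this step
aws iam attach-role-policy --role-name <your-role-name> \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
aws iam attach-role-policy --role-name <your-role-name> \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
aws iam attach-role-policy --role-name <your-role-name> \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
```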
2. Retrieve the KubeConfig file to remotely connect to your new cluster. On AWS, you can run the following command, substituting your own cluster name and region, to retrieve it:
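```bash
# writes the cluster's configuration to a local file named my_kubeconfig.yaml
aws eks update-kubeconfig \
    --name <cluster-name> \
    --region <region> \
    --kubeconfig my_kubeconfig.yaml
```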
This command creates a KubeConfig file called my_kubeconfig.yaml.
3. In the KubeConfig file, replace <your assume role arn> with the IAM Role ARN from Step 1.
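The placeholder appears in the exec section of the generated file. The following is a rough sketch of that section; the exact layout varies by AWS CLI version:

```yaml
users:
  - name: <cluster-arn>
    user:
      exec:
        apiVersion: client.authentication.k8s.io/v1beta1
        command: aws
        args:
          - eks
          - get-token
          - --cluster-name
          - <cluster-name>
          - --role-arn
          - <your assume role arn>  # replace with the IAM Role ARN from Step 1
```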
4. Create a Kubernetes cluster connection. Astronomer recommends creating a Kubernetes cluster connection because it's more secure than adding an unencrypted kubeconfig file directly to your Astro project. First, convert the kubeconfig configuration you retrieved from your cluster from YAML to JSON format. Then, create a new Kubernetes connection and paste in the converted kubeconfig configuration. You can now specify this connection in the configuration of any KubernetesPodOperator task that needs to access your external cluster.
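One way to do the conversion is a short Python snippet; this is a minimal sketch that assumes the PyYAML package is installed (yq or any other YAML-to-JSON converter works equally well):

```python
import json

import yaml  # requires the PyYAML package

# load the kubeconfig retrieved from the cluster and print it as JSON
with open("my_kubeconfig.yaml") as f:
    kubeconfig = yaml.safe_load(f)

print(json.dumps(kubeconfig, indent=2))
```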
5. Update your Dockerfile to install the AWS CLI, which the KubeConfig file calls to authenticate to your EKS cluster:
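A minimal sketch of the Dockerfile addition, assuming an x86_64 base image:

```dockerfile
USER root
# download and install AWS CLI v2; requires the unzip package (added in the next step)
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
    unzip awscliv2.zip && \
    ./aws/install
USER astro
```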
6. Add the unzip package to your packages.txt file to make the unzip command available in your Docker container:
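packages.txt takes one OS-level package name per line:

```text
unzip
```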
7. In your KubernetesPodOperator task configuration, ensure that you set cluster_context and namespace for your remote cluster. In the following example, the task launches a Pod in an external cluster based on the configuration defined in the k8s connection.
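A sketch of such a task, assuming the connection above is saved with the ID k8s and using placeholder values for the cluster context and namespace (the import path shown is the one used by recent versions of the cncf.kubernetes provider):

```python
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

run_on_external_cluster = KubernetesPodOperator(
    task_id="run_on_external_cluster",
    kubernetes_conn_id="k8s",                  # the connection created earlier
    cluster_context="<your-cluster-context>",  # context name from your kubeconfig
    namespace="<your-namespace>",
    name="example-pod",
    image="ubuntu",
    cmds=["bash", "-cx"],
    arguments=["echo hello"],
    in_cluster=False,  # Airflow itself runs outside the target cluster
    get_logs=True,
)
```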
Example dag
The following dag uses several classes from the Amazon provider package to dynamically spin up and delete Pods for each task in a newly created node group. If your remote Kubernetes cluster already has a node group available, you only need to define your task in the KubernetesPodOperator itself. The example dag contains five consecutive tasks:
- Create a node group according to the user's specifications (in this example, one that uses GPU resources).
- Use a sensor to check that the cluster is running correctly.
- Use the KubernetesPodOperator to run any valid Docker image in a Pod on the newly created node group on the remote cluster. The example dag uses the standard Ubuntu image to print "hello" to the console using a bash command.
- Delete the node group.
- Verify that the node group has been deleted.
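A minimal sketch of this dag, with placeholder values for the cluster name, subnets, IAM role, and connection details; exact operator arguments can vary with your Amazon provider package version:

```python
from pendulum import datetime

from airflow import DAG
from airflow.providers.amazon.aws.hooks.eks import NodegroupStates
from airflow.providers.amazon.aws.operators.eks import (
    EksCreateNodegroupOperator,
    EksDeleteNodegroupOperator,
)
from airflow.providers.amazon.aws.sensors.eks import EksNodegroupStateSensor
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

CLUSTER_NAME = "<your-eks-cluster>"
NODEGROUP_NAME = "gpu-nodegroup"

with DAG(
    dag_id="kpo_in_external_cluster",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
):
    # 1. Create a node group according to the user's specifications
    create_nodegroup = EksCreateNodegroupOperator(
        task_id="create_nodegroup",
        cluster_name=CLUSTER_NAME,
        nodegroup_name=NODEGROUP_NAME,
        nodegroup_subnets=["<subnet-1>", "<subnet-2>"],
        nodegroup_role_arn="<your assume role arn>",
        create_nodegroup_kwargs={"instanceTypes": ["g4dn.xlarge"]},  # GPU instances
    )

    # 2. Use a sensor to check that the node group is running correctly
    nodegroup_active = EksNodegroupStateSensor(
        task_id="nodegroup_active",
        cluster_name=CLUSTER_NAME,
        nodegroup_name=NODEGROUP_NAME,
        target_state=NodegroupStates.ACTIVE,
    )

    # 3. Run a Pod on the remote cluster using the k8s connection
    run_pod = KubernetesPodOperator(
        task_id="run_pod",
        kubernetes_conn_id="k8s",
        cluster_context="<your-cluster-context>",
        namespace="<your-namespace>",
        name="example-pod",
        image="ubuntu",
        cmds=["bash", "-cx"],
        arguments=["echo hello"],
        in_cluster=False,
        get_logs=True,
    )

    # 4. Delete the node group
    delete_nodegroup = EksDeleteNodegroupOperator(
        task_id="delete_nodegroup",
        cluster_name=CLUSTER_NAME,
        nodegroup_name=NODEGROUP_NAME,
    )

    # 5. Verify that the node group has been deleted
    nodegroup_deleted = EksNodegroupStateSensor(
        task_id="nodegroup_deleted",
        cluster_name=CLUSTER_NAME,
        nodegroup_name=NODEGROUP_NAME,
        target_state=NodegroupStates.NONEXISTENT,
    )

    create_nodegroup >> nodegroup_active >> run_pod >> delete_nodegroup >> nodegroup_deleted
```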