
Astro Private Cloud (APC) runs on Kubernetes and requires careful resource planning for both control plane and data plane components. This guide covers resource configuration for all platform components.

Architecture overview

APC uses a control plane/data plane architecture:
  • Control Plane: Houston API, Astro UI, Registry, Config Syncer.
  • Data Plane: Commander, Registry, Airflow deployments.
  • Unified Mode: All components in a single cluster.
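To see which components are actually running in a given cluster, you can list the platform pods. The `astronomer` namespace is the one used by the monitoring commands elsewhere in this guide; substitute your own if you installed into a different namespace:

```shell
# List all APC platform pods and their status
kubectl get pods -n astronomer
```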

Control plane components

Houston API

Houston is the GraphQL API that manages platform operations.
```yaml
houston:
  replicas: 2
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"
```

Scaling recommendations

| Platform Size | Replicas | CPU Request | Memory Request |
|---------------|----------|-------------|----------------|
| Small (< 10 deployments) | 2 | 250m | 512Mi |
| Medium (10-50 deployments) | 2 | 500m | 1Gi |
| Large (50+ deployments) | 3 | 1000m | 2Gi |
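Settings like those in the table are typically applied through a Helm values file. A minimal sketch, assuming the common release and chart names (`astronomer` / `astronomer/astronomer`) — yours may differ:

```shell
# Apply updated resource settings from a local values file
helm upgrade astronomer astronomer/astronomer \
  -f values.yaml -n astronomer
```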

Houston worker

Background job processor for asynchronous operations. The worker uses the same houston.resources as the Houston API.
```yaml
houston:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"
  worker:
    replicas: 2
```

Astro UI

Web interface for platform management.
```yaml
astroUI:
  replicas: 2
  resources:
    requests:
      cpu: "100m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
```

Data plane components

Commander

Manages Airflow deployment provisioning and Kubernetes operations.
```yaml
commander:
  replicas: 2
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"
```

Registry

Docker image registry for Airflow deployments.
```yaml
registry:
  replicas: 1
  resources:
    requests:
      cpu: "100m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
  persistence:
    enabled: true
    size: 100Gi
```

Storage backend options

  • Local PersistentVolume (default)
  • Google Cloud Storage (GCS)
  • Azure Blob Storage
  • Amazon S3
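For an object-storage backend, the registry follows the upstream Docker Distribution storage configuration. A hedged sketch for S3 — the region and bucket are placeholders, and you should verify the exact Helm value paths for your APC version before using this:

```yaml
registry:
  persistence:
    enabled: false   # object storage replaces the local PersistentVolume
  storage:
    s3:
      region: us-east-1            # placeholder region
      bucket: my-registry-bucket   # placeholder bucket name
```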

Ingress and networking

NGINX ingress controller

```yaml
nginx:
  replicas: 2
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"
  serviceType: LoadBalancer
```

Database

PostgreSQL

Houston metadata database.
```yaml
postgresql:
  resources:
    requests:
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "1000m"
      memory: "1Gi"
  persistence:
    enabled: true
    size: 8Gi
```
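After enabling persistence, you can confirm that the claim was provisioned and bound:

```shell
# Verify the PostgreSQL PVC exists and is Bound
kubectl get pvc -n astronomer
```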

Production configuration

```yaml
postgresql:
  replication:
    enabled: true
    slaveReplicas: 2
    synchronousCommit: "on"
```

Resource sizing examples

Development environment

```yaml
houston:
  replicas: 1
  resources:
    requests:
      cpu: "100m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

astroUI:
  replicas: 1
  resources:
    requests:
      cpu: "50m"
      memory: "128Mi"

commander:
  replicas: 1
  resources:
    requests:
      cpu: "100m"
      memory: "256Mi"

nginx:
  replicas: 1
  resources:
    requests:
      cpu: "100m"
      memory: "256Mi"
```
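A quick way to size a development node pool is to total the CPU requests above and add the 20-30% headroom recommended in the best practices at the end of this guide. A small sketch, using 25% as the midpoint:

```shell
# Total the development CPU requests and add 25% headroom
total_mcpu=$((100 + 50 + 100 + 100))      # houston + astroUI + commander + nginx
with_headroom=$((total_mcpu * 125 / 100)) # mid-range of the 20-30% guideline
echo "${total_mcpu}m requested; plan for ${with_headroom}m"
```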

Production environment

```yaml
houston:
  replicas: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "2Gi"
    limits:
      cpu: "2000m"
      memory: "4Gi"

astroUI:
  replicas: 2
  resources:
    requests:
      cpu: "250m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "1Gi"

commander:
  replicas: 2
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"

nginx:
  replicas: 3
  resources:
    requests:
      cpu: "1000m"
      memory: "2Gi"

postgresql:
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
  persistence:
    size: 50Gi
```

High availability configuration

```yaml
houston:
  replicas: 3
  podDisruptionBudget:
    enabled: true
    maxUnavailable: 1

astroUI:
  replicas: 3
  podDisruptionBudget:
    enabled: true
    maxUnavailable: 1

commander:
  replicas: 3
  podDisruptionBudget:
    enabled: true
    maxUnavailable: 1
```
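Once applied, you can verify that the budgets were created and see how many disruptions each component currently allows:

```shell
# Confirm PodDisruptionBudgets exist for each platform component
kubectl get pdb -n astronomer
```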

Monitor resource usage

```shell
# View pod resource usage
kubectl top pods -n astronomer

# View node resource usage
kubectl top nodes
```
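For per-container rather than per-pod numbers — useful when a pod runs sidecars alongside the main process — `kubectl top pods` also accepts a `--containers` flag:

```shell
# Break usage down by container within each pod
kubectl top pods -n astronomer --containers
```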

Troubleshooting

Out of memory (OOMKilled)

Symptom: Pods restart with OOMKilled status. Solution: Increase memory limits:
```yaml
houston:
  resources:
    limits:
      memory: "4Gi"
```
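Before raising limits, you can confirm that the container was actually killed for exceeding its memory limit rather than crashing for another reason. The pod name below is a placeholder:

```shell
# Prints "OOMKilled" if the last restart was due to memory pressure
kubectl get pod <houston-pod-name> -n astronomer \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```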

CPU throttling

Symptom: Slow response times, high latency. Solution: Increase CPU limits or add replicas:
```yaml
houston:
  replicas: 3
  resources:
    limits:
      cpu: "2000m"
```

Pending pods

Symptom: Pods stuck in Pending state. Solution:
  1. Check node resources: kubectl describe nodes.
  2. Reduce resource requests or add nodes.
  3. Check for taints/tolerations mismatches.
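Scheduling failures from any of the causes above are also recorded as Kubernetes events, which you can filter directly (`reason` is a standard event field, not APC-specific):

```shell
# List recent scheduling failures in the platform namespace
kubectl get events -n astronomer --field-selector reason=FailedScheduling
```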

Best practices

  • Set both requests and limits for predictable scheduling.
  • Use Pod Disruption Budgets for high availability.
  • Monitor resource usage before scaling.
  • Size based on workload, not just component count.
  • Plan for growth with 20-30% headroom.
  • Use separate node pools for platform components.
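The separate node pool recommendation can be implemented with standard Kubernetes taints; platform pods then need a matching toleration. The node name and taint key below are placeholders, not APC conventions:

```shell
# Reserve a node for platform components; only pods with a
# matching toleration will be scheduled onto it
kubectl taint nodes <platform-node> platform=true:NoSchedule
```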