building-with-multi-cloud

Deploy Kubernetes workloads to real cloud providers. Use when provisioning managed Kubernetes (DOKS, AKS, GKE, EKS, Civo) or self-managed clusters (Hetzner + K3s). Covers CLI tools, cluster creation, LoadBalancers, DNS, TLS, and cost optimization.

Best use case

building-with-multi-cloud is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using building-with-multi-cloud should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/building-with-multi-cloud/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/building-with-multi-cloud/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/building-with-multi-cloud/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How building-with-multi-cloud Compares

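| Feature / Agent | building-with-multi-cloud | Standard Approach |
|-----------------|---------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |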

SKILL.md Source

# Multi-Cloud Kubernetes Deployment

## Persona

You are a Cloud Platform Engineer with production experience deploying Kubernetes across DigitalOcean, Azure, GCP, AWS, Civo, and Hetzner. You understand the key insight: **only cluster provisioning differs between clouds—everything else (kubectl, Helm, Dapr, Ingress, cert-manager) is identical**. You help teams choose the right provider for their budget and needs, from $5/month learning labs to enterprise-grade production clusters.

## When to Use This Skill

Activate when the user mentions:
- DigitalOcean DOKS, doctl, managed Kubernetes
- Azure AKS, az aks, Azure Kubernetes
- Google GKE, gcloud container clusters
- AWS EKS, eksctl, Amazon Kubernetes
- Civo Kubernetes, civo CLI
- Hetzner Cloud, hetzner-k3s, K3s, self-managed
- Cloud Kubernetes pricing, cost comparison
- Production deployment, real cloud, beyond Docker Desktop
- LoadBalancer service, external IP, cloud DNS
- Multi-cloud, cloud-agnostic deployment

## Core Insight: The Universal Pattern

```
┌─────────────────────────────────────────────────────────────────┐
│              ONLY THIS DIFFERS BETWEEN CLOUDS                   │
├─────────────────────────────────────────────────────────────────┤
│  Cloud CLI → Create Cluster → Get Kubeconfig                   │
│                                                                 │
│  doctl kubernetes cluster create ...                            │
│  az aks create ... && az aks get-credentials ...               │
│  gcloud container clusters create ... && get-credentials ...   │
│  eksctl create cluster ...                                      │
│  civo kubernetes create ...                                     │
│  hetzner-k3s create --config cluster.yaml                      │
└─────────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────────┐
│              EVERYTHING BELOW IS IDENTICAL                      │
├─────────────────────────────────────────────────────────────────┤
│  kubectl apply -f ...           (same on all clouds)           │
│  helm upgrade --install ...     (same on all clouds)           │
│  dapr init -k                   (same on all clouds)           │
│  traefik/ingress-nginx         (same on all clouds)            │
│  cert-manager + Let's Encrypt  (same on all clouds)            │
│  Secrets, ConfigMaps           (same on all clouds)            │
└─────────────────────────────────────────────────────────────────┘
```

## Decision Logic: Choosing a Provider

### Quick Decision Matrix

| Scenario | Recommended Provider | Why |
|----------|---------------------|-----|
| **Learning/practice (~$5/mo)** | Hetzner + K3s | Cheapest real cloud, full K8s compatibility |
| **Startup MVP ($24+/mo)** | DigitalOcean DOKS | Simple, fast, free control plane, $200 credit |
| **Cost-conscious production** | Civo | $5/node, 90-second clusters, K3s-based |
| **Enterprise/existing Azure** | Azure AKS | Free control plane, Azure integration |
| **Enterprise/existing AWS** | AWS EKS | Best AWS integration, extensive ecosystem |
| **Enterprise/existing GCP** | Google GKE | Best autoscaling, GCP integration |
| **Budget enterprise** | Hetzner + K3s | Self-managed but production-capable |
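
The matrix can also be expressed as a lookup, which is handy when a script needs to pick a default. This is only a sketch: the function name `recommend_provider` and the scenario keys are hypothetical labels for the rows above, not part of any CLI.

```bash
# recommend_provider SCENARIO -> provider name from the decision matrix.
# Scenario keys are illustrative shorthand for the table rows.
recommend_provider() {
  case "$1" in
    learning)          echo "Hetzner + K3s" ;;
    startup-mvp)       echo "DigitalOcean DOKS" ;;
    cost-production)   echo "Civo" ;;
    azure)             echo "Azure AKS" ;;
    aws)               echo "AWS EKS" ;;
    gcp)               echo "Google GKE" ;;
    budget-enterprise) echo "Hetzner + K3s" ;;
    *) echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}

recommend_provider startup-mvp   # prints: DigitalOcean DOKS
```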

### Cost Comparison (December 2025)

| Provider | Control Plane | 3-Node Cluster (min viable) | Notes |
|----------|--------------|------------------------------|-------|
| **Hetzner + K3s** | $0 (self-managed) | ~€16/mo (~$18) | Cheapest, requires management |
| **Civo** | Free | ~$15/mo (3x $5 nodes) | K3s-based, fast provisioning |
| **DigitalOcean DOKS** | Free | ~$36/mo (3x $12 nodes) | $200 free credit for new users |
| **Azure AKS** | Free | ~$45/mo | $200 free credit available |
| **Google GKE** | Free (Autopilot) | ~$50/mo | $300 free credit available |
| **AWS EKS** | $0.10/hr (~$73/mo) | ~$150/mo | Control plane NOT free |
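
The monthly figures in the table follow from simple arithmetic: node count times node price, plus any control-plane fee. The sketch below reproduces three of them. It assumes 730 hours per month (8760 / 12); the helper names are hypothetical, and the prices are the ones listed above rather than values fetched from any provider.

```bash
HOURS_PER_MONTH=730   # assumption: average hours in a month (8760 / 12)

# control_plane_monthly HOURLY_RATE -> monthly control-plane fee
control_plane_monthly() {
  awk -v rate="$1" -v hours="$HOURS_PER_MONTH" \
    'BEGIN { printf "%.2f\n", rate * hours }'
}

# cluster_monthly NODE_COUNT PRICE_PER_NODE [CONTROL_PLANE_MONTHLY] -> total
cluster_monthly() {
  awk -v n="$1" -v price="$2" -v cp="${3:-0}" \
    'BEGIN { printf "%.2f\n", n * price + cp }'
}

control_plane_monthly 0.10   # EKS control plane at $0.10/hr -> 73.00
cluster_monthly 3 12         # DOKS: 3 x $12 nodes           -> 36.00
cluster_monthly 3 5          # Civo: 3 x $5 nodes            -> 15.00
```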

### Managed vs Self-Managed

```
Need production SLA + minimal ops overhead?
├── Yes → Managed (DOKS, AKS, GKE, EKS, Civo)
│         You manage: workloads, helm charts, secrets
│         Provider manages: control plane, upgrades, HA
└── No → Self-managed (Hetzner + K3s, bare metal)
         You manage: EVERYTHING
         Benefit: Maximum cost savings, full control
```

## Provider CLI Commands

### DigitalOcean DOKS (doctl)

```bash
# Install doctl
brew install doctl  # macOS
# or: snap install doctl  # Linux

# Authenticate
doctl auth init  # Paste API token

# Create cluster (list valid versions with: doctl kubernetes options versions)
doctl kubernetes cluster create task-api-cluster \
  --region nyc1 \
  --version 1.31.4-do.0 \
  --size s-2vcpu-4gb \
  --count 3 \
  --wait

# Get kubeconfig (automatically saved)
doctl kubernetes cluster kubeconfig save task-api-cluster

# Verify
kubectl get nodes
```

**Key Options:**
- `--size`: Node size (list valid sizes with `doctl kubernetes options sizes`)
- `--count`: Number of nodes
- `--auto-upgrade`: Enable auto-upgrades
- `--ha`: Enable HA control plane ($40/mo extra)

### Azure AKS (az aks)

```bash
# Install Azure CLI
brew install azure-cli  # macOS
# or: curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash  # Linux

# Authenticate
az login

# Create resource group
az group create --name task-api-rg --location eastus

# Create cluster
az aks create \
  --resource-group task-api-rg \
  --name task-api-cluster \
  --node-count 3 \
  --node-vm-size Standard_B2s \
  --generate-ssh-keys

# Get kubeconfig
az aks get-credentials --resource-group task-api-rg --name task-api-cluster

# Verify
kubectl get nodes
```

**Key Options:**
- `--node-vm-size`: VM size (Standard_B2s is one of the cheapest general-purpose sizes)
- `--enable-managed-identity`: Use managed identity (recommended)
- `--zones 1 2 3`: Multi-AZ deployment

### Google GKE (gcloud)

```bash
# Install gcloud CLI
brew install google-cloud-sdk  # macOS

# Authenticate
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# Create cluster (Autopilot - recommended)
gcloud container clusters create-auto task-api-cluster \
  --location us-central1

# OR Standard cluster
gcloud container clusters create task-api-cluster \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type e2-small

# Get kubeconfig (for the zonal Standard cluster above, pass --location us-central1-a)
gcloud container clusters get-credentials task-api-cluster \
  --location us-central1

# Verify
kubectl get nodes
```

**Key Options:**
- `create-auto`: Autopilot mode, billed per pod resource request rather than per node
- `--spot`: Use spot instances (up to 91% cheaper)

### AWS EKS (eksctl)

```bash
# Install eksctl
brew tap weaveworks/tap
brew install weaveworks/tap/eksctl

# Create cluster (15-20 minutes)
eksctl create cluster \
  --name task-api-cluster \
  --region us-east-1 \
  --node-type t3.medium \
  --nodes 3

# kubeconfig is automatically configured

# Verify
kubectl get nodes
```

**Key Options:**
- `--enable-auto-mode`: EKS Auto Mode (newer, simpler)
- `--spot`: Use spot instances
- `--managed`: Use managed node groups

### Civo (civo)

```bash
# Install Civo CLI
brew install civo  # macOS
# or: curl -sL https://civo.com/get | sh  # Linux

# Authenticate
civo apikey save MY_KEY <your-api-key>
civo apikey current MY_KEY
civo region current NYC1

# Create cluster (90 seconds!)
civo kubernetes create task-api-cluster \
  --size g4s.kube.medium \
  --nodes 3 \
  --wait \
  --merge \
  --switch

# kubeconfig automatically merged and context switched

# Verify
kubectl get nodes
```

**Key Options:**
- `--applications`: Install marketplace apps (e.g., traefik2-nodeport)
- `--cluster-type talos`: Use Talos instead of K3s
- `--cni-plugin cilium`: Use Cilium CNI

### Hetzner + K3s (hetzner-k3s)

```bash
# Install hetzner-k3s
brew install vitobotta/tap/hetzner_k3s  # macOS
# or download binary from GitHub releases

# Create config file: cluster.yaml
cat > cluster.yaml << 'EOF'
hetzner_token: <your-hetzner-api-token>
cluster_name: task-api-cluster
kubeconfig_path: "./kubeconfig"
k3s_version: v1.31.4+k3s1

networking:
  ssh:
    port: 22
    use_agent: false
    public_key_path: "~/.ssh/id_ed25519.pub"
    private_key_path: "~/.ssh/id_ed25519"
    allowed_networks:
      ssh:
        - 0.0.0.0/0
      api:
        - 0.0.0.0/0

masters_pool:
  instance_type: cx22  # 2 vCPU, 4GB RAM
  instance_count: 1    # 1 for learning, 3 for HA
  location: fsn1

worker_node_pools:
  - name: workers
    instance_type: cx22
    instance_count: 2
    location: fsn1
EOF

# Create cluster (2-3 minutes)
hetzner-k3s create --config cluster.yaml

# Use the kubeconfig
export KUBECONFIG=./kubeconfig
kubectl get nodes
```

**Key Options:**
- `instance_type`: cx22 (€5.39/mo), cx32 (€10.59/mo), cx42 (€21.29/mo)
- `autoscaling.enabled: true`: Enable cluster autoscaler
- `networking.cni.cilium.enabled: true`: Use Cilium instead of Flannel
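
The sample `cluster.yaml` above leaves `allowed_networks` open to `0.0.0.0/0` for both SSH and the API, which is convenient for a lab but not for production. A small pre-flight check can catch that before `hetzner-k3s create` runs. This is a sketch: `check_allowed_networks` is a hypothetical helper name, and it only does a plain text match rather than parsing the YAML.

```bash
# check_allowed_networks CONFIG_FILE
# Prints each line that still allows 0.0.0.0/0 and returns 1 if any exist.
check_allowed_networks() {
  local config="$1"
  if [ ! -r "$config" ]; then
    echo "ERROR: cannot read $config" >&2
    return 2
  fi
  if grep -n -F '0.0.0.0/0' "$config"; then
    echo "WARNING: $config allows access from any address" >&2
    return 1
  fi
  echo "OK: no open CIDR ranges in $config"
}

# Usage (only create the cluster when the check passes):
#   check_allowed_networks cluster.yaml && hetzner-k3s create --config cluster.yaml
```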

## After Provisioning: Universal Commands

Once you have a kubeconfig, **these commands work on ALL providers**:
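
Before installing anything, it may be worth confirming that the kubeconfig points at the new cluster and that every node has reported Ready. A minimal sketch, assuming `kubectl` is on the PATH; the function name `verify_cluster` and the 300-second default are illustrative choices.

```bash
# verify_cluster [TIMEOUT] -> waits for all nodes to become Ready, then lists them
verify_cluster() {
  local timeout="${1:-300s}"
  kubectl cluster-info &&
    kubectl wait --for=condition=Ready nodes --all --timeout="$timeout" &&
    kubectl get nodes -o wide
}

# Usage:
#   verify_cluster        # default 300s timeout
#   verify_cluster 60s    # shorter timeout for small clusters
```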

### Install Dapr
```bash
dapr init -k
kubectl get pods -n dapr-system
```

### Install Ingress Controller (Traefik)
```bash
helm repo add traefik https://helm.traefik.io/traefik
helm install traefik traefik/traefik \
  --namespace traefik \
  --create-namespace
```

### Install cert-manager
```bash
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
```

### Create Let's Encrypt Issuer
```yaml
# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          ingressClassName: traefik
```
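
Applying the issuer and waiting for it to report Ready gives early feedback when the ACME account registration fails. The sketch below assumes the manifest was saved as `cluster-issuer.yaml`, as the comment in the YAML suggests; `apply_cluster_issuer` is a hypothetical helper name and the 60-second timeout is arbitrary.

```bash
# apply_cluster_issuer [MANIFEST] [ISSUER_NAME]
# Applies the manifest, then blocks until the ClusterIssuer is Ready.
apply_cluster_issuer() {
  local manifest="${1:-cluster-issuer.yaml}"
  local issuer="${2:-letsencrypt-prod}"
  kubectl apply -f "$manifest" &&
    kubectl wait --for=condition=Ready "clusterissuer/$issuer" --timeout=60s
}

# Usage:
#   apply_cluster_issuer
#   apply_cluster_issuer staging-issuer.yaml letsencrypt-staging
```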

### Deploy Task API
```bash
helm upgrade --install task-api ./task-api-chart \
  --set image.repository=ghcr.io/your-org/task-api \
  --set image.tag=v1.0.0 \
  --set ingress.enabled=true \
  --set ingress.host=tasks.yourdomain.com \
  --set ingress.tls.enabled=true
```

## Cloud-Specific Considerations

### LoadBalancer Service Behavior

| Provider | LoadBalancer Creation | Cost |
|----------|----------------------|------|
| **DOKS** | Auto-provisions DO Load Balancer | $12/mo |
| **AKS** | Auto-provisions Azure LB | ~$18/mo |
| **GKE** | Auto-provisions GCP LB | ~$18/mo |
| **EKS** | Auto-provisions AWS ELB | ~$18/mo |
| **Civo** | Auto-provisions Civo LB | Included |
| **Hetzner K3s** | Provisioned by the Hetzner Cloud Controller Manager (installed by hetzner-k3s) | Hetzner Load Balancer pricing applies |

### Hetzner LoadBalancer via Cloud Controller Manager

Plain K3s on Hetzner has no cloud integration of its own, but the hetzner-k3s tool installs the Hetzner Cloud Controller Manager (CCM), which provisions a Hetzner Load Balancer for Services of type `LoadBalancer`. Annotations on the Service set the load balancer's name and location:

```yaml
# LoadBalancer Service handled by the Hetzner CCM
# (selector omitted for brevity; a real Service needs one)
apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    load-balancer.hetzner.cloud/name: "task-api-lb"
    load-balancer.hetzner.cloud/location: "fsn1"
spec:
  type: LoadBalancer  # hetzner-k3s includes CCM that handles this
  ports:
  - name: web
    port: 80
    targetPort: 8000
```

### DNS Configuration

| Provider | DNS Option | How to Configure |
|----------|------------|-----------------|
| **DOKS** | DigitalOcean DNS or external | Control panel → Networking → Domains |
| **AKS** | Azure DNS or external | DNS Zones service |
| **GKE** | Cloud DNS or external | Network services → Cloud DNS |
| **EKS** | Route53 or external | Route53 hosted zones |
| **Civo** | Civo DNS or external | Networking → DNS |
| **Hetzner** | Hetzner DNS or external (Cloudflare, etc.) | Hetzner DNS console or external DNS provider |
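
Whichever DNS service is used, the record has to point at the address the LoadBalancer Service received. Some providers publish an IP and others a hostname (AWS load balancers do), so the record type differs. The sketch below reads the address with `kubectl` and prints the record to create. The helper names are hypothetical, and the IP check is a simplification that assumes IPv4.

```bash
# lb_address SERVICE NAMESPACE -> external IP, or hostname if no IP is published
lb_address() {
  local svc="$1" ns="$2" addr
  addr=$(kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$addr" ]; then
    addr=$(kubectl get svc "$svc" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  fi
  [ -n "$addr" ] && echo "$addr"
}

# dns_record_for HOST ADDRESS -> "HOST A ip" or "HOST CNAME hostname"
dns_record_for() {
  local host="$1" addr="$2"
  case "$addr" in
    *[!0-9.]*) echo "$host CNAME $addr" ;;  # anything beyond digits and dots: hostname
    *)         echo "$host A $addr" ;;      # digits and dots only: treat as IPv4
  esac
}

# Usage:
#   dns_record_for tasks.yourdomain.com "$(lb_address traefik traefik)"
```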

### Image Pull from GHCR

```bash
# Create image pull secret (same on all clouds)
kubectl create secret docker-registry ghcr-secret \
  --docker-server=ghcr.io \
  --docker-username=YOUR_GITHUB_USERNAME \
  --docker-password=YOUR_GITHUB_PAT
```

```yaml
# Reference in the Deployment's pod template (spec.template.spec)
spec:
  imagePullSecrets:
  - name: ghcr-secret
```
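
As an alternative to editing every Deployment, the secret can be attached to a namespace's default ServiceAccount so that pods using that account pick it up. A sketch, assuming the secret already exists in the same namespace; `attach_pull_secret` is a hypothetical helper name.

```bash
# attach_pull_secret [NAMESPACE] [SECRET_NAME]
# Patches the namespace's default ServiceAccount to use the pull secret.
attach_pull_secret() {
  local ns="${1:-default}" secret="${2:-ghcr-secret}"
  kubectl patch serviceaccount default -n "$ns" \
    -p "{\"imagePullSecrets\": [{\"name\": \"$secret\"}]}"
}

# Usage:
#   attach_pull_secret              # default namespace, ghcr-secret
#   attach_pull_secret production   # another namespace
```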

## Cost Optimization Strategies

### 1. Right-Size Nodes
```
Development: 2 vCPU, 4GB RAM (cheapest viable)
Production: 4 vCPU, 8GB RAM (balanced)
High-traffic: 8 vCPU, 16GB RAM (performance)
```

### 2. Use Node Autoscaling
All managed providers support autoscaling. Set min=1, max=10 to scale with demand.
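
For an existing cluster, the autoscaler is usually enabled on a node pool. The functions below print the commands for DOKS, AKS, and GKE instead of running them, so the output can be reviewed first. The function names are hypothetical and pool names vary per cluster, so they are passed in as arguments.

```bash
# Each function prints a command; pipe the output to `sh` to execute it.

doks_autoscale() {  # CLUSTER POOL MIN MAX
  echo "doctl kubernetes cluster node-pool update $1 $2 --auto-scale --min-nodes $3 --max-nodes $4"
}

aks_autoscale() {   # RESOURCE_GROUP CLUSTER MIN MAX
  echo "az aks update --resource-group $1 --name $2 --enable-cluster-autoscaler --min-count $3 --max-count $4"
}

gke_autoscale() {   # CLUSTER POOL ZONE MIN MAX
  echo "gcloud container clusters update $1 --node-pool $2 --zone $3 --enable-autoscaling --min-nodes $4 --max-nodes $5"
}

# Usage:
#   aks_autoscale task-api-rg task-api-cluster 1 10        # review
#   aks_autoscale task-api-rg task-api-cluster 1 10 | sh   # apply
```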

### 3. Schedule Non-Production Downtime
```bash
# Scale to zero at night
kubectl scale deployment task-api --replicas=0

# Or use KEDA for event-driven scaling
```
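
To apply the same idea to a whole environment, every Deployment in a namespace can be scaled at once. This is a rough sketch: `scale_namespace` is a hypothetical helper, and it sets one replica count for all Deployments, so original per-Deployment counts are not restored when scaling back up.

```bash
# scale_namespace NAMESPACE REPLICAS -> scales every Deployment in the namespace
scale_namespace() {
  kubectl scale deployment --all --replicas="$2" -n "$1"
}

# Usage, for example from cron on a machine with kubeconfig access:
#   scale_namespace dev 0   # evening: stop everything in dev
#   scale_namespace dev 1   # morning: bring one replica of each back
```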

### 4. Consider Spot/Preemptible Nodes
- AWS: 60-90% savings with Spot
- GCP: 60-91% savings with Spot VMs (formerly Preemptible)
- Azure: Similar savings with Spot
- **Not available**: DOKS, Civo, Hetzner

### 5. Teardown When Not in Use
```bash
# DOKS
doctl kubernetes cluster delete task-api-cluster -f

# AKS
az aks delete --resource-group task-api-rg --name task-api-cluster --yes

# GKE (pass the zone, e.g. us-central1-a, for a zonal Standard cluster)
gcloud container clusters delete task-api-cluster --location us-central1 -q

# EKS
eksctl delete cluster --name task-api-cluster

# Civo
civo kubernetes remove task-api-cluster -y

# Hetzner K3s
hetzner-k3s delete --config cluster.yaml
```
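
Load balancers created through Services are separate billable resources, and depending on the provider and the delete flags used they may outlive the cluster. Listing them before teardown makes it easier to check the cloud console afterwards. A sketch, with `list_loadbalancer_services` as a hypothetical helper name:

```bash
# list_loadbalancer_services -> one "namespace/name" line per LoadBalancer Service
list_loadbalancer_services() {
  kubectl get svc --all-namespaces \
    -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}/{.metadata.name}{"\n"}{end}'
}

# Usage (run before any of the delete commands above):
#   list_loadbalancer_services
```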

## Safety & Guardrails

### NEVER
- Leave clusters running without monitoring costs
- Deploy without resource requests/limits
- Expose cluster API to 0.0.0.0/0 in production
- Store cloud credentials in code or git
- Skip TLS for production traffic
- Use the same cluster for production and development

### ALWAYS
- Set budget alerts in cloud console
- Use namespaces to separate non-production environments (dev, staging) within a cluster
- Enable cluster autoscaling with sensible limits
- Configure node auto-upgrade (managed providers)
- Use private node pools when available
- Backup etcd (for self-managed clusters)
- Document teardown commands for every cluster
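
Two of these guardrails, namespaces per environment and resource requests/limits, can be applied with plain `kubectl`. The sketch below uses hypothetical helper names, and the CPU and memory values are placeholder assumptions to be tuned per workload.

```bash
# ensure_namespace NAME -> creates the namespace only if it does not exist yet
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}

# set_default_resources NAMESPACE DEPLOYMENT
# Sets requests and limits on all containers of the Deployment.
# The values below are illustrative, not recommendations.
set_default_resources() {
  kubectl set resources deployment "$2" -n "$1" \
    --requests=cpu=100m,memory=128Mi \
    --limits=cpu=500m,memory=512Mi
}

# Usage:
#   ensure_namespace staging
#   set_default_resources staging task-api
```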

### Cost Alerts Setup

- **DigitalOcean**: Settings → Billing → Alerts
- **Azure**: Cost Management → Budgets
- **GCP**: Billing → Budgets & alerts
- **AWS**: Billing → Budgets
- **Civo**: Dashboard → Account → Billing alerts
- **Hetzner**: Project → Usage → Limits (manual monitoring)

## TaskManager Production Deployment

Complete example deploying Task API to DigitalOcean DOKS:

```bash
# 1. Create cluster
doctl kubernetes cluster create task-prod \
  --region nyc1 \
  --size s-2vcpu-4gb \
  --count 3 \
  --wait

# 2. Save kubeconfig
doctl kubernetes cluster kubeconfig save task-prod

# 3. Install Dapr
dapr init -k
kubectl wait --for=condition=available --timeout=120s \
  deployment/dapr-operator -n dapr-system

# 4. Install Traefik
helm repo add traefik https://helm.traefik.io/traefik
helm install traefik traefik/traefik \
  -n traefik --create-namespace

# 5. Get LoadBalancer IP
kubectl get svc traefik -n traefik -w
# Wait for EXTERNAL-IP

# 6. Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
kubectl wait --for=condition=available --timeout=120s \
  deployment/cert-manager -n cert-manager

# 7. Create ClusterIssuer for Let's Encrypt
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          ingressClassName: traefik
EOF

# 8. Deploy Task API with Helm
helm upgrade --install task-api ./charts/task-api \
  --set image.repository=ghcr.io/your-org/task-api \
  --set image.tag=v1.0.0 \
  --set ingress.enabled=true \
  --set ingress.host=tasks.yourdomain.com \
  --set ingress.tls.enabled=true \
  --set ingress.annotations."cert-manager\.io/cluster-issuer"=letsencrypt-prod

# 9. Verify deployment
kubectl get pods
kubectl get ingress

# 10. Test HTTPS endpoint
curl https://tasks.yourdomain.com/health
```

## References

For detailed patterns, see:
- `references/digitalocean-doks.md` - DOKS-specific patterns
- `references/hetzner-k3s.md` - Hetzner + K3s setup
- `references/cloud-comparison.md` - Full provider comparison
- `references/cost-optimization.md` - Cost engineering patterns
