kubernetes-orchestration

Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.

16 stars

Best use case

kubernetes-orchestration is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.

Teams using kubernetes-orchestration should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/kubernetes-orchestration/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/kubernetes-orchestration/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/kubernetes-orchestration/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How kubernetes-orchestration Compares

Feature / Agentkubernetes-orchestrationStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Kubernetes Orchestration

Comprehensive guide for deploying, managing, and scaling applications on Kubernetes.

## When to Use

- Deploying applications to Kubernetes
- Writing and reviewing Kubernetes manifests
- Creating Helm charts
- Configuring ingress and networking
- Setting up autoscaling
- Troubleshooting pod and cluster issues

## Core Concepts

### Resource Hierarchy

```
Cluster
├── Namespaces
│   ├── Deployments → ReplicaSets → Pods → Containers
│   ├── StatefulSets → Pods
│   ├── DaemonSets → Pods
│   ├── Services (ClusterIP, NodePort, LoadBalancer)
│   ├── Ingress
│   ├── ConfigMaps
│   ├── Secrets
│   ├── PersistentVolumeClaims
│   └── ServiceAccounts
└── Cluster-wide
    ├── Nodes
    ├── PersistentVolumes
    ├── StorageClasses
    ├── ClusterRoles
    └── CustomResourceDefinitions
```

### Namespace Organization

```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: myapp-production
  labels:
    name: myapp-production
    environment: production
    team: platform
```

**Namespace Strategy:**
```
namespaces/
├── production
├── staging
├── development
├── monitoring       # Prometheus, Grafana
├── logging          # ELK, Loki
├── ingress-nginx    # Ingress controllers
└── cert-manager     # TLS certificates
```

### Deployments

**Production-Ready Deployment:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
  labels:
    app: myapp
    version: v1.0.0
spec:
  replicas: 3
  revisionHistoryLimit: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
    spec:
      serviceAccountName: myapp
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000

      containers:
        - name: myapp
          image: registry.example.com/myapp:v1.0.0
          imagePullPolicy: IfNotPresent

          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: metrics
              containerPort: 9090
              protocol: TCP

          env:
            - name: NODE_ENV
              value: "production"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: database-url

          envFrom:
            - configMapRef:
                name: myapp-config

          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi

          livenessProbe:
            httpGet:
              path: /health/live
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3

          readinessProbe:
            httpGet:
              path: /health/ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3

          startupProbe:
            httpGet:
              path: /health/live
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30

          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL

          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: config
              mountPath: /app/config
              readOnly: true

      volumes:
        - name: tmp
          emptyDir: {}
        - name: config
          configMap:
            name: myapp-config

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: myapp
                topologyKey: kubernetes.io/hostname

      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: myapp

      terminationGracePeriodSeconds: 30
```

### Services

**ClusterIP (Internal):**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
    - name: http
      port: 80
      targetPort: http
      protocol: TCP
```

**LoadBalancer (External):**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-lb
  namespace: production
  annotations:
    # AWS NLB
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
```

**Headless Service (StatefulSets):**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: myapp
  ports:
    - port: 80
```

### Ingress

**Nginx Ingress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
  namespace: production
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - myapp.example.com
        - api.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-frontend
                port:
                  number: 80
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-api
                port:
                  number: 80
```

### ConfigMaps & Secrets

**ConfigMap:**
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: production
data:
  # Simple key-value
  LOG_LEVEL: "info"
  API_TIMEOUT: "30s"

  # File content
  config.yaml: |
    server:
      port: 8080
      host: 0.0.0.0
    features:
      cache: true
      debug: false
```

**Secrets:**
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
  namespace: production
type: Opaque
data:
  # Base64 encoded
  database-url: cG9zdGdyZXM6Ly91c2VyOnBhc3NAZGIvYXBw
  api-key: c2VjcmV0LWFwaS1rZXk=
stringData:
  # Plain text (will be encoded)
  another-secret: "plain-text-value"
```

**External Secrets (with External Secrets Operator):**
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: myapp-secrets
    creationPolicy: Owner
  data:
    - secretKey: database-url
      remoteRef:
        key: prod/myapp/database
        property: url
```

### StatefulSets

**Database StatefulSet:**
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: production
spec:
  serviceName: postgresql-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgresql
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
        - name: postgresql
          image: postgres:16-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgresql-secrets
                  key: password
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 1Gi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 10Gi
```

### Autoscaling

**Horizontal Pod Autoscaler:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: 1000
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```

**Vertical Pod Autoscaler:**
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 2Gi
```

### RBAC

**ServiceAccount:**
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp
  namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: myapp-role
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: myapp-rolebinding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: myapp
    namespace: production
roleRef:
  kind: Role
  name: myapp-role
  apiGroup: rbac.authorization.k8s.io
```

### Network Policies

**Restrict Ingress/Egress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-network-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
        - podSelector:
            matchLabels:
              app: myapp-frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgresql
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
```

### Helm Charts

**Chart Structure:**
```
myapp-chart/
├── Chart.yaml
├── values.yaml
├── values-production.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── configmap.yaml
│   ├── secrets.yaml
│   ├── hpa.yaml
│   └── NOTES.txt
└── charts/           # Dependencies
```

**Chart.yaml:**
```yaml
apiVersion: v2
name: myapp
description: My Application Helm Chart
type: application
version: 1.0.0
appVersion: "1.0.0"

dependencies:
  - name: postgresql
    version: 12.x.x
    repository: https://charts.bitnami.com/bitnami
    condition: postgresql.enabled
```

**values.yaml:**
```yaml
replicaCount: 3

image:
  repository: registry.example.com/myapp
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  port: 80

ingress:
  enabled: true
  className: nginx
  hosts:
    - host: myapp.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: myapp-tls
      hosts:
        - myapp.example.com

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70

postgresql:
  enabled: true
  auth:
    database: myapp
```

**Template Example:**
```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "myapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "myapp.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: 8080
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

### Commands Reference

```bash
# Cluster info
kubectl cluster-info
kubectl get nodes
kubectl top nodes

# Namespace operations
kubectl get namespaces
kubectl create namespace myapp
kubectl config set-context --current --namespace=myapp

# Workloads
kubectl get pods -o wide
kubectl get deployments
kubectl describe pod <pod-name>
kubectl logs <pod-name> -f
kubectl logs <pod-name> -c <container-name>
kubectl exec -it <pod-name> -- /bin/sh

# Apply manifests
kubectl apply -f manifest.yaml
kubectl apply -f ./manifests/
kubectl apply -k ./kustomize/  # Kustomize

# Scaling
kubectl scale deployment myapp --replicas=5
kubectl autoscale deployment myapp --min=3 --max=10 --cpu-percent=70

# Rolling updates
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp
kubectl rollout history deployment/myapp
kubectl rollout undo deployment/myapp
kubectl rollout undo deployment/myapp --to-revision=2

# Debug
kubectl describe pod <pod-name>
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl run debug --rm -it --image=busybox -- /bin/sh
kubectl port-forward pod/<pod-name> 8080:8080
kubectl cp <pod-name>:/path/to/file ./local-file

# Helm
helm install myapp ./myapp-chart
helm upgrade myapp ./myapp-chart -f values-prod.yaml
helm rollback myapp 1
helm list
helm uninstall myapp
```

### Troubleshooting

| Issue | Debug Command |
|-------|---------------|
| Pod stuck Pending | `kubectl describe pod <name>` - check events |
| Pod CrashLoopBackOff | `kubectl logs <pod> --previous` |
| Service not reachable | `kubectl get endpoints <service>` |
| Ingress not working | Check ingress controller logs |
| Node pressure | `kubectl describe node <name>` |
| OOMKilled | Increase memory limits |

**Common Pod States:**
```
Pending      → Check node resources, PVC binding
ContainerCreating → Image pull, volume mount
Running      → Healthy
Succeeded    → Job completed
Failed       → Check logs
Unknown      → Node communication issue
```

### Checklist

Before deploying:
- [ ] Resource requests/limits defined
- [ ] Liveness and readiness probes configured
- [ ] Security context (non-root, read-only fs)
- [ ] ConfigMaps/Secrets externalized
- [ ] Network policies in place
- [ ] HPA configured for scaling
- [ ] Pod disruption budgets set
- [ ] Affinity/anti-affinity rules
- [ ] Image tags pinned (not :latest)

## Integration

Works with:
- `/devops` - Deployment pipelines
- `/docker` - Container images for K8s
- `gitops` skill - ArgoCD/Flux deployments
- `/aws`, `/gcp`, `/azure` - Managed K8s services
- `/security` - Cluster security review

Related Skills

opentofu-kubernetes-explorer

16
from diegosouzapw/awesome-omni-skill

Explore and manage Kubernetes clusters and resources using OpenTofu/Terraform

learn-kubernetes-space-station-intermediate

16
from diegosouzapw/awesome-omni-skill

Interactive narrative learning session that teaches Kubernetes through a Space Station adventure at intermediate level. Use this session when you want to learn Kubernetes through immersive story-driven chapters, hands-on exercises, and tasks grounded in real, up-to-date documentation.

kubernetes-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Debug Kubernetes pods, services, networking, and scaling issues. Use this skill when troubleshooting K8s deployments, investigating pod failures, or diagnosing cluster problems.

kubernetes-ops

16
from diegosouzapw/awesome-omni-skill

Kubernetes cluster operations: kubectl commands, manifest generation, Helm charts, RBAC, debugging, and deployment strategies.

kubernetes-operators

16
from diegosouzapw/awesome-omni-skill

Kubernetes infrastructure patterns including operators, Helm, GitOps, and component provisioning.

kubernetes-deployment

16
from diegosouzapw/awesome-omni-skill

Deploy, manage, and scale applications on Kubernetes clusters using manifests, Helm charts, and autoscaling configurations.

kubernetes-deployer

16
from diegosouzapw/awesome-omni-skill

Package and deploy applications to Kubernetes with Dockerfiles, Helm charts, and local Minikube deployment. Use when containerizing applications, creating Kubernetes manifests, setting up Helm charts, deploying to Minikube, or preparing cloud-ready configurations. Focuses on local-first deployment with stateless services.

kubernetes-architect

16
from diegosouzapw/awesome-omni-skill

Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration.

Kind Local Kubernetes

16
from diegosouzapw/awesome-omni-skill

This skill should be used when the user asks to "setup Kind", "local Kubernetes", "Kind cluster", "multi-node cluster", "Kubernetes development", "k8s local environment", or works with local Kubernetes clusters using Kind.

featbit-deployment-kubernetes

16
from diegosouzapw/awesome-omni-skill

Deploys FeatBit to Kubernetes using Helm Charts. Use when user mentions "Kubernetes", "Helm", "K8s", "kubectl", works with values.yaml files, asks about "cloud deployment", "AKS", "EKS", "GKE", "ingress", or needs production-grade container orchestration setup.

FastAPI Kubernetes Deployment

16
from diegosouzapw/awesome-omni-skill

This skill should be used when the user asks to "deploy FastAPI to Kubernetes", "create Dockerfile", "build Docker image", "write Helm chart", "configure K8s deployment", "add health checks", "scale FastAPI", or mentions Docker, Kubernetes, K8s, containers, Helm, or deployment. Provides containerization and orchestration patterns.

docker-vigil-orchestration

16
from diegosouzapw/awesome-omni-skill

Docker Compose orchestration for Vigil Guard v2.0.0 microservices (11 services). Use when deploying services, managing containers, troubleshooting Docker network issues, working with vigil-net, configuring docker-compose.yml, handling service dependencies, or working with 3-branch detection services (heuristics, semantic, prompt-guard).