kubernetes-orchestration
Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.
Best use case
kubernetes-orchestration is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.
Teams using kubernetes-orchestration should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/kubernetes-orchestration/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How kubernetes-orchestration Compares
| Feature / Agent | kubernetes-orchestration | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Kubernetes Orchestration
Comprehensive guide for deploying, managing, and scaling applications on Kubernetes.
## When to Use
- Deploying applications to Kubernetes
- Writing and reviewing Kubernetes manifests
- Creating Helm charts
- Configuring ingress and networking
- Setting up autoscaling
- Troubleshooting pod and cluster issues
## Core Concepts
### Resource Hierarchy
```
Cluster
├── Namespaces
│ ├── Deployments → ReplicaSets → Pods → Containers
│ ├── StatefulSets → Pods
│ ├── DaemonSets → Pods
│ ├── Services (ClusterIP, NodePort, LoadBalancer)
│ ├── Ingress
│ ├── ConfigMaps
│ ├── Secrets
│ ├── PersistentVolumeClaims
│ └── ServiceAccounts
└── Cluster-wide
├── Nodes
├── PersistentVolumes
├── StorageClasses
├── ClusterRoles
└── CustomResourceDefinitions
```
### Namespace Organization
```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: myapp-production
labels:
name: myapp-production
environment: production
team: platform
```
**Namespace Strategy:**
```
namespaces/
├── production
├── staging
├── development
├── monitoring # Prometheus, Grafana
├── logging # ELK, Loki
├── ingress-nginx # Ingress controllers
└── cert-manager # TLS certificates
```
### Deployments
**Production-Ready Deployment:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
namespace: production
labels:
app: myapp
version: v1.0.0
spec:
replicas: 3
revisionHistoryLimit: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
version: v1.0.0
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
spec:
serviceAccountName: myapp
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
containers:
- name: myapp
image: registry.example.com/myapp:v1.0.0
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
env:
- name: NODE_ENV
value: "production"
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: myapp-secrets
key: database-url
envFrom:
- configMapRef:
name: myapp-config
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
startupProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 10
periodSeconds: 5
failureThreshold: 30
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
volumeMounts:
- name: tmp
mountPath: /tmp
- name: config
mountPath: /app/config
readOnly: true
volumes:
- name: tmp
emptyDir: {}
- name: config
configMap:
name: myapp-config
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: myapp
topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: myapp
terminationGracePeriodSeconds: 30
```
### Services
**ClusterIP (Internal):**
```yaml
apiVersion: v1
kind: Service
metadata:
name: myapp
namespace: production
spec:
type: ClusterIP
selector:
app: myapp
ports:
- name: http
port: 80
targetPort: http
protocol: TCP
```
**LoadBalancer (External):**
```yaml
apiVersion: v1
kind: Service
metadata:
name: myapp-lb
namespace: production
annotations:
# AWS NLB
service.beta.kubernetes.io/aws-load-balancer-type: nlb
service.beta.kubernetes.io/aws-load-balancer-internal: "false"
spec:
type: LoadBalancer
selector:
app: myapp
ports:
- name: http
port: 80
targetPort: http
- name: https
port: 443
targetPort: https
```
**Headless Service (StatefulSets):**
```yaml
apiVersion: v1
kind: Service
metadata:
name: myapp-headless
spec:
type: ClusterIP
clusterIP: None
selector:
app: myapp
ports:
- port: 80
```
### Ingress
**Nginx Ingress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: myapp-ingress
namespace: production
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
tls:
- hosts:
- myapp.example.com
- api.example.com
secretName: myapp-tls
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-frontend
port:
number: 80
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-api
port:
number: 80
```
### ConfigMaps & Secrets
**ConfigMap:**
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config
namespace: production
data:
# Simple key-value
LOG_LEVEL: "info"
API_TIMEOUT: "30s"
# File content
config.yaml: |
server:
port: 8080
host: 0.0.0.0
features:
cache: true
debug: false
```
**Secrets:**
```yaml
apiVersion: v1
kind: Secret
metadata:
name: myapp-secrets
namespace: production
type: Opaque
data:
# Base64 encoded
database-url: cG9zdGdyZXM6Ly91c2VyOnBhc3NAZGIvYXBw
api-key: c2VjcmV0LWFwaS1rZXk=
stringData:
# Plain text (will be encoded)
another-secret: "plain-text-value"
```
**External Secrets (with External Secrets Operator):**
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: myapp-secrets
spec:
refreshInterval: 1h
secretStoreRef:
kind: ClusterSecretStore
name: aws-secrets-manager
target:
name: myapp-secrets
creationPolicy: Owner
data:
- secretKey: database-url
remoteRef:
key: prod/myapp/database
property: url
```
### StatefulSets
**Database StatefulSet:**
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgresql
namespace: production
spec:
serviceName: postgresql-headless
replicas: 3
selector:
matchLabels:
app: postgresql
template:
metadata:
labels:
app: postgresql
spec:
containers:
- name: postgresql
image: postgres:16-alpine
ports:
- containerPort: 5432
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgresql-secrets
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 1000m
memory: 1Gi
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 10Gi
```
### Autoscaling
**Horizontal Pod Autoscaler:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 1000
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
```
**Vertical Pod Autoscaler:**
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: myapp-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: myapp
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: 2
memory: 2Gi
```
### RBAC
**ServiceAccount:**
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: myapp
namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: myapp-role
namespace: production
rules:
- apiGroups: [""]
resources: ["configmaps", "secrets"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: myapp-rolebinding
namespace: production
subjects:
- kind: ServiceAccount
name: myapp
namespace: production
roleRef:
kind: Role
name: myapp-role
apiGroup: rbac.authorization.k8s.io
```
### Network Policies
**Restrict Ingress/Egress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: myapp-network-policy
namespace: production
spec:
podSelector:
matchLabels:
app: myapp
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: ingress-nginx
- podSelector:
matchLabels:
app: myapp-frontend
ports:
- protocol: TCP
port: 8080
egress:
- to:
- podSelector:
matchLabels:
app: postgresql
ports:
- protocol: TCP
port: 5432
- to:
- namespaceSelector: {}
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
```
### Helm Charts
**Chart Structure:**
```
myapp-chart/
├── Chart.yaml
├── values.yaml
├── values-production.yaml
├── templates/
│ ├── _helpers.tpl
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── configmap.yaml
│ ├── secrets.yaml
│ ├── hpa.yaml
│ └── NOTES.txt
└── charts/ # Dependencies
```
**Chart.yaml:**
```yaml
apiVersion: v2
name: myapp
description: My Application Helm Chart
type: application
version: 1.0.0
appVersion: "1.0.0"
dependencies:
- name: postgresql
version: 12.x.x
repository: https://charts.bitnami.com/bitnami
condition: postgresql.enabled
```
**values.yaml:**
```yaml
replicaCount: 3
image:
repository: registry.example.com/myapp
tag: latest
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
ingress:
enabled: true
className: nginx
hosts:
- host: myapp.example.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: myapp-tls
hosts:
- myapp.example.com
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 20
targetCPUUtilizationPercentage: 70
postgresql:
enabled: true
auth:
database: myapp
```
**Template Example:**
```yaml
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "myapp.fullname" . }}
labels:
{{- include "myapp.labels" . | nindent 4 }}
spec:
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }}
{{- end }}
selector:
matchLabels:
{{- include "myapp.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "myapp.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
ports:
- containerPort: 8080
resources:
{{- toYaml .Values.resources | nindent 12 }}
```
### Commands Reference
```bash
# Cluster info
kubectl cluster-info
kubectl get nodes
kubectl top nodes
# Namespace operations
kubectl get namespaces
kubectl create namespace myapp
kubectl config set-context --current --namespace=myapp
# Workloads
kubectl get pods -o wide
kubectl get deployments
kubectl describe pod <pod-name>
kubectl logs <pod-name> -f
kubectl logs <pod-name> -c <container-name>
kubectl exec -it <pod-name> -- /bin/sh
# Apply manifests
kubectl apply -f manifest.yaml
kubectl apply -f ./manifests/
kubectl apply -k ./kustomize/ # Kustomize
# Scaling
kubectl scale deployment myapp --replicas=5
kubectl autoscale deployment myapp --min=3 --max=10 --cpu-percent=70
# Rolling updates
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp
kubectl rollout history deployment/myapp
kubectl rollout undo deployment/myapp
kubectl rollout undo deployment/myapp --to-revision=2
# Debug
kubectl describe pod <pod-name>
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl run debug --rm -it --image=busybox -- /bin/sh
kubectl port-forward pod/<pod-name> 8080:8080
kubectl cp <pod-name>:/path/to/file ./local-file
# Helm
helm install myapp ./myapp-chart
helm upgrade myapp ./myapp-chart -f values-prod.yaml
helm rollback myapp 1
helm list
helm uninstall myapp
```
### Troubleshooting
| Issue | Debug Command |
|-------|---------------|
| Pod stuck Pending | `kubectl describe pod <name>` - check events |
| Pod CrashLoopBackOff | `kubectl logs <pod> --previous` |
| Service not reachable | `kubectl get endpoints <service>` |
| Ingress not working | Check ingress controller logs |
| Node pressure | `kubectl describe node <name>` |
| OOMKilled | Increase memory limits |
**Common Pod States:**
```
Pending → Check node resources, PVC binding
ContainerCreating → Image pull, volume mount
Running → Healthy
Succeeded → Job completed
Failed → Check logs
Unknown → Node communication issue
```
### Checklist
Before deploying:
- [ ] Resource requests/limits defined
- [ ] Liveness and readiness probes configured
- [ ] Security context (non-root, read-only fs)
- [ ] ConfigMaps/Secrets externalized
- [ ] Network policies in place
- [ ] HPA configured for scaling
- [ ] Pod disruption budgets set
- [ ] Affinity/anti-affinity rules
- [ ] Image tags pinned (not :latest)
## Integration
Works with:
- `/devops` - Deployment pipelines
- `/docker` - Container images for K8s
- `gitops` skill - ArgoCD/Flux deployments
- `/aws`, `/gcp`, `/azure` - Managed K8s services
- `/security` - Cluster security reviewRelated Skills
opentofu-kubernetes-explorer
Explore and manage Kubernetes clusters and resources using OpenTofu/Terraform
learn-kubernetes-space-station-intermediate
Interactive narrative learning session that teaches Kubernetes through a Space Station adventure at intermediate level. Use this session when you want to learn Kubernetes through immersive story-driven chapters, hands-on exercises, and tasks grounded in real, up-to-date documentation.
kubernetes-troubleshooting
Debug Kubernetes pods, services, networking, and scaling issues. Use this skill when troubleshooting K8s deployments, investigating pod failures, or diagnosing cluster problems.
kubernetes-ops
Kubernetes cluster operations: kubectl commands, manifest generation, Helm charts, RBAC, debugging, and deployment strategies.
kubernetes-operators
Kubernetes infrastructure patterns including operators, Helm, GitOps, and component provisioning.
kubernetes-deployment
Deploy, manage, and scale applications on Kubernetes clusters using manifests, Helm charts, and autoscaling configurations.
kubernetes-deployer
Package and deploy applications to Kubernetes with Dockerfiles, Helm charts, and local Minikube deployment. Use when containerizing applications, creating Kubernetes manifests, setting up Helm charts, deploying to Minikube, or preparing cloud-ready configurations. Focuses on local-first deployment with stateless services.
kubernetes-architect
Expert Kubernetes architect specializing in cloud-native infrastructure, advanced GitOps workflows (ArgoCD/Flux), and enterprise container orchestration.
Kind Local Kubernetes
This skill should be used when the user asks to "setup Kind", "local Kubernetes", "Kind cluster", "multi-node cluster", "Kubernetes development", "k8s local environment", or works with local Kubernetes clusters using Kind.
featbit-deployment-kubernetes
Deploys FeatBit to Kubernetes using Helm Charts. Use when user mentions "Kubernetes", "Helm", "K8s", "kubectl", works with values.yaml files, asks about "cloud deployment", "AKS", "EKS", "GKE", "ingress", or needs production-grade container orchestration setup.
FastAPI Kubernetes Deployment
This skill should be used when the user asks to "deploy FastAPI to Kubernetes", "create Dockerfile", "build Docker image", "write Helm chart", "configure K8s deployment", "add health checks", "scale FastAPI", or mentions Docker, Kubernetes, K8s, containers, Helm, or deployment. Provides containerization and orchestration patterns.
docker-vigil-orchestration
Docker Compose orchestration for Vigil Guard v2.0.0 microservices (11 services). Use when deploying services, managing containers, troubleshooting Docker network issues, working with vigil-net, configuring docker-compose.yml, handling service dependencies, or working with 3-branch detection services (heuristics, semantic, prompt-guard).