securing-kubernetes-on-cloud

This skill covers hardening managed Kubernetes clusters on EKS, AKS, and GKE by implementing Pod Security Standards, network policies, workload identity, RBAC scoping, image admission controls, and runtime security monitoring. It addresses cloud-specific security features including IRSA for EKS, Workload Identity for GKE, and Managed Identities for AKS.

4,032 stars

Best use case

securing-kubernetes-on-cloud is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using securing-kubernetes-on-cloud should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/securing-kubernetes-on-cloud/SKILL.md --create-dirs "https://raw.githubusercontent.com/mukul975/Anthropic-Cybersecurity-Skills/main/skills/securing-kubernetes-on-cloud/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/securing-kubernetes-on-cloud/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How securing-kubernetes-on-cloud Compares

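| Feature / Agent | securing-kubernetes-on-cloud | Standard Approach |
|-----------------|------------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |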

Frequently Asked Questions

What does this skill do?

This skill covers hardening managed Kubernetes clusters on EKS, AKS, and GKE by implementing Pod Security Standards, network policies, workload identity, RBAC scoping, image admission controls, and runtime security monitoring. It addresses cloud-specific security features including IRSA for EKS, Workload Identity for GKE, and Managed Identities for AKS.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Securing Kubernetes on Cloud

## When to Use

- When deploying new managed Kubernetes clusters in production with security requirements
- When hardening existing EKS, AKS, or GKE clusters after a security audit or pentest finding
- When implementing workload identity to eliminate static cloud credentials in pods
- When enforcing Pod Security Standards across namespaces to reduce the risk of container escapes
- When integrating runtime security monitoring for detecting container-level threats

**Do not use** for non-Kubernetes container deployments like ECS Fargate or Azure Container Instances, for application-level security within containers (see securing-serverless-functions), or for CI/CD pipeline security (see implementing-cloud-devsecops).

## Prerequisites

- Managed Kubernetes cluster provisioned on EKS, AKS, or GKE with admin access
- kubectl configured with cluster admin credentials
- Familiarity with Kubernetes RBAC, namespaces, and security contexts
- Container network interface plugin supporting network policies (Calico, Cilium)

## Workflow

### Step 1: Enforce Pod Security Standards

Apply Pod Security Admission labels at the namespace level to enforce the Restricted profile in production namespaces. Pod Security Policies were removed in Kubernetes v1.25 and replaced with Pod Security Admission.

```yaml
# Production namespace with restricted Pod Security Standard
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
# Staging namespace with baseline enforcement
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

```yaml
# Pod spec compliant with restricted profile
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
  namespace: production
spec:
  automountServiceAccountToken: false
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: company/app:v2.1@sha256:abc123...
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      resources:
        limits:
          cpu: "500m"
          memory: "256Mi"
        requests:
          cpu: "100m"
          memory: "128Mi"
```

### Step 2: Configure Cloud-Native Workload Identity

Eliminate static cloud credentials in pods by binding Kubernetes service accounts to cloud IAM roles.

```bash
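# EKS: IAM Roles for Service Accounts (IRSA)
# IRSA requires an IAM OIDC provider associated with the cluster
eksctl utils associate-iam-oidc-provider \
  --cluster production-cluster \
  --approve
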
eksctl create iamserviceaccount \
  --cluster production-cluster \
  --namespace production \
  --name web-app-sa \
  --attach-policy-arn arn:aws:iam::123456789012:policy/WebAppS3ReadOnly \
  --approve

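# GKE: Workload Identity
gcloud iam service-accounts create web-app-sa \
  --project=my-gcp-project

# Grant the Google service account the permissions the workload needs
gcloud projects add-iam-policy-binding my-gcp-project \
  --role roles/storage.objectViewer \
  --member "serviceAccount:web-app-sa@my-gcp-project.iam.gserviceaccount.com"

# Allow the Kubernetes service account to impersonate the Google service account
gcloud iam service-accounts add-iam-policy-binding \
  web-app-sa@my-gcp-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-gcp-project.svc.id.goog[production/web-app-sa]"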

kubectl annotate serviceaccount web-app-sa \
  --namespace production \
  iam.gke.io/gcp-service-account=web-app-sa@my-gcp-project.iam.gserviceaccount.com

# AKS: Azure AD Workload Identity
az identity create --name web-app-identity --resource-group production-rg
az identity federated-credential create \
  --name web-app-federation \
  --identity-name web-app-identity \
  --resource-group production-rg \
  --issuer "$(az aks show -n production-cluster -g production-rg --query oidcIssuerProfile.issuerUrl -o tsv)" \
  --subject system:serviceaccount:production:web-app-sa
```

### Step 3: Implement Network Policies

Deploy network policies to restrict pod-to-pod communication following the principle of least privilege. By default, Kubernetes allows all pods to communicate with each other.

```yaml
# Default deny all ingress and egress in production namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow web-app to receive traffic from ingress controller only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-web
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label set automatically by Kubernetes on every namespace
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
---
# Allow web-app egress to the database and to cluster DNS only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web-app
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        # DNS falls back to TCP for large responses
        - protocol: TCP
          port: 53
```
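
Because the default-deny policy also blocks ingress to every pod in the namespace, the egress rule on `web-app` is not enough on its own: the database pods need a matching ingress rule. A minimal sketch, assuming the database pods run in the `production` namespace with the label `app: postgres` (the policy name is hypothetical):

```yaml
# Allow postgres to accept connections from web-app only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-ingress-from-web   # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web-app
      ports:
        - protocol: TCP
          port: 5432
```

With default-deny in place, each allowed flow generally needs both sides covered: an egress rule on the client and an ingress rule on the server.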

### Step 4: Configure RBAC with Least Privilege

Scope Kubernetes RBAC roles to specific namespaces and resources. Avoid ClusterRoleBindings for non-administrative users.

```yaml
# Developer role scoped to specific namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-role
  namespace: staging
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "services", "configmaps"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # Secrets are intentionally not listed: RBAC is allow-only, so any
  # resource or verb missing from these rules is denied
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: staging
subjects:
  - kind: Group
    name: developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role
  apiGroup: rbac.authorization.k8s.io
```
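
When a workload does need a secret, access can be narrowed to a single named object rather than all secrets in the namespace. The sketch below assumes a secret called `web-app-db-credentials` and a role called `web-app-db-secret-reader`. Both names are hypothetical, and the service account is the `web-app-sa` account from Step 2.

```yaml
# Role granting read access to one named secret only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: web-app-db-secret-reader          # hypothetical
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["web-app-db-credentials"]   # hypothetical secret name
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: web-app-db-secret-reader
  namespace: production
subjects:
  - kind: ServiceAccount
    name: web-app-sa
    namespace: production
roleRef:
  kind: Role
  name: web-app-db-secret-reader
  apiGroup: rbac.authorization.k8s.io
```

The effective permissions can be checked with `kubectl auth can-i get secret/web-app-db-credentials -n production --as=system:serviceaccount:production:web-app-sa`. Only `get` is granted here because `list` and `watch` are not meaningfully restricted by `resourceNames`.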

### Step 5: Deploy Image Admission Controls

Use admission controllers to enforce that only signed images from trusted registries are deployed. Implement OPA/Gatekeeper or Kyverno for policy enforcement.

```yaml
# Kyverno policy: require images from approved registries
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
    - name: validate-registries
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must come from approved registries"
        pattern:
          spec:
            containers:
              - image: "123456789012.dkr.ecr.us-east-1.amazonaws.com/* | gcr.io/my-gcp-project/*"
---
# Kyverno policy: require image digest (no mutable tags)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-image-digest
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-digest
      match:
        any:
          - resources:
              kinds: ["Pod"]
      validate:
        message: "Images must use digest references, not tags"
        pattern:
          spec:
            containers:
              - image: "*@sha256:*"
```
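
The two policies above restrict registries and require digests, but neither checks signatures. Kyverno's `verifyImages` rule type can cover that part. The following is a sketch only, assuming images are signed with a Cosign key pair. The policy name is hypothetical, and the public key is a placeholder to be replaced with your own.

```yaml
# Kyverno policy: require a valid Cosign signature on images from the private registry
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures   # hypothetical name
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds: ["Pod"]
      verifyImages:
        - imageReferences:
            - "123456789012.dkr.ecr.us-east-1.amazonaws.com/*"
          attestors:
            - entries:
                - keys:
                    # Placeholder: paste the Cosign public key used to sign images
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <your-cosign-public-key>
                      -----END PUBLIC KEY-----
```

Signature verification calls out to the registry at admission time, so it is worth rolling this out with `validationFailureAction: Audit` first and checking the Kyverno documentation for the field names supported by your installed version.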

### Step 6: Enable Runtime Security Monitoring

Deploy runtime security tools to detect anomalous behavior inside containers including process execution, file system modifications, and network connections.

```bash
# Deploy Falco for runtime threat detection
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm install falco falcosecurity/falco \
  --namespace falco-system --create-namespace \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="https://hooks.slack.com/services/xxx"

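# Run kube-bench for CIS Kubernetes Benchmark assessment
# (EKS job shown; the kube-bench repository also provides job-aks.yaml and job-gke.yaml)
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job-eks.yaml
# With a label selector, kubectl logs prints only the last 10 lines unless --tail is set
kubectl logs -l app=kube-bench --tail=-1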
```
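
Falco's default ruleset may not flag a specific workload threat, so custom rules are often added through the chart's `customRules` value. The sketch below is an illustrative Helm values file. The file names, rule name, and list of process names are assumptions, and the `spawned_process` and `container` macros are expected to come from Falco's default rules.

```yaml
# values-falco-custom.yaml (hypothetical file name)
customRules:
  cryptominer-rules.yaml: |-
    - rule: Cryptominer Process Started in Container
      desc: Detect well-known miner binaries or stratum URLs launched inside a container
      # spawned_process and container are macros from Falco's default ruleset
      condition: >
        spawned_process and container
        and (proc.name in (xmrig, minerd) or proc.cmdline contains "stratum+tcp")
      output: >
        Possible cryptominer started
        (process=%proc.name cmdline=%proc.cmdline
        pod=%k8s.pod.name ns=%k8s.ns.name image=%container.image.repository)
      priority: CRITICAL
      tags: [container, cryptomining]
```

It could be applied with `helm upgrade falco falcosecurity/falco --namespace falco-system --reuse-values -f values-falco-custom.yaml`. Matching on process names is easy to evade by renaming the binary, so treat a rule like this as one signal alongside network-based detection.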

## Key Concepts

| Term | Definition |
|------|------------|
| Pod Security Standards | Three profiles (Privileged, Baseline, Restricted) enforced via Pod Security Admission that control pod security context capabilities |
| Workload Identity | Cloud-native mechanism binding Kubernetes service accounts to cloud IAM roles for credential-free cloud API access (IRSA, GKE WI, AKS MI) |
| Network Policy | Kubernetes resource defining allowed ingress and egress traffic flows between pods, enforced by the CNI plugin |
| Admission Controller | Kubernetes plugin that intercepts API requests before persistence to validate or mutate resources against security policies |
| RBAC | Role-Based Access Control in Kubernetes, defining what actions (verbs) identities can perform on which resources in which namespaces |
| Seccomp Profile | Linux kernel feature restricting the system calls a container process can make, reducing the kernel attack surface |
| Service Mesh | Infrastructure layer (Istio, Linkerd) providing mutual TLS, traffic policies, and observability for service-to-service communication |

## Tools & Systems

- **Falco**: Open-source runtime security engine detecting anomalous behavior in containers using kernel-level system call monitoring
- **Kyverno**: Kubernetes-native policy engine for admission control, mutation, and generation of resources based on security policies
- **kube-bench**: CIS Kubernetes Benchmark assessment tool checking cluster configuration against security best practices
- **Trivy**: Vulnerability scanner for container images, file systems, and Kubernetes resources with SBOM generation
- **Calico/Cilium**: CNI plugins providing network policy enforcement and advanced network security features including eBPF-based monitoring

## Common Scenarios

### Scenario: Cryptominer Deployed via Compromised Container Image

**Context**: GuardDuty Extended Threat Detection generates an AttackSequence:EKS/CompromisedCluster finding. A developer pulled a public Docker image containing an embedded XMRig cryptominer that executes at container startup.

**Approach**:
1. Isolate the affected pod by applying a deny-all network policy targeting its labels
2. Capture the container image digest and scan it with Trivy to identify the embedded binary
3. Review Kubernetes audit logs to identify who deployed the compromised image and when
4. Deploy Kyverno ClusterPolicy requiring images from approved private registries only
5. Enable image digest pinning to prevent tag mutation attacks
6. Deploy Falco with rules detecting crypto mining process signatures (/usr/bin/xmrig, stratum+tcp connections)

**Pitfalls**: Deleting the pod before capturing the image digest and audit logs destroys forensic evidence. Blocking only the specific image tag allows the attacker to re-push with a different tag.
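
The isolation step in the approach above can be sketched as a deny-all policy keyed on a dedicated label, so that it affects only pods explicitly marked for quarantine. The label key `quarantine` and the policy name are hypothetical.

```yaml
# Deny all ingress and egress for pods labeled quarantine=true
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-deny-all   # hypothetical name
  namespace: production
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
    - Ingress
    - Egress
```

The policy takes effect once the pod is labeled, for example with `kubectl label pod <pod-name> -n production quarantine=true`. Network policies are additive, so any existing allow rule that still selects the pod keeps applying. Removing the labels those rules match (for example `app`) closes that gap, and also detaches the pod from its ReplicaSet and Service while leaving it running for forensics.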

## Output Format

```
Kubernetes Security Assessment Report
=======================================
Cluster: production-cluster (EKS 1.29)
Provider: AWS (us-east-1)
Assessment Date: 2025-02-23
Tool: kube-bench v0.8.0 + manual review

CIS KUBERNETES BENCHMARK RESULTS:
  Total Controls: 124
  Passed: 98 (79%)
  Failed: 18 (15%)
  Warnings: 8 (6%)

CRITICAL FINDINGS:
  [K8S-001] 3 namespaces lack Pod Security Standards enforcement
    Namespaces: monitoring, logging, default
    Remediation: Apply restricted PSA labels

  [K8S-002] Default service account tokens auto-mounted in 12 deployments
    Risk: Credential theft if container is compromised
    Remediation: Set automountServiceAccountToken: false

  [K8S-003] No network policies in production namespace
    Risk: Unrestricted lateral movement between all pods
    Remediation: Deploy default-deny policy with explicit allow rules

HIGH FINDINGS:
  [K8S-004] 5 pods running as root with privileged security context
  [K8S-005] Images deployed using mutable tags (:latest) in 8 deployments
  [K8S-006] RBAC ClusterRoleBinding grants cluster-admin to developers group
```

Related Skills

securing-serverless-functions

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill covers security hardening for serverless compute platforms including AWS Lambda, Azure Functions, and Google Cloud Functions. It addresses least privilege IAM roles, dependency vulnerability scanning, secrets management integration, input validation, function URL authentication, and runtime monitoring to protect against injection attacks, credential theft, and supply chain compromises.

securing-remote-access-to-ot-environment

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill covers implementing secure remote access to OT/ICS environments for operators, engineers, and vendors while preventing unauthorized access that could compromise industrial operations. It addresses jump server architecture, multi-factor authentication, session recording, privileged access management, vendor remote access controls, and compliance with IEC 62443 and NERC CIP-005 remote access requirements.

securing-historian-server-in-ot-environment

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill covers hardening and securing process historian servers (OSIsoft PI, Honeywell PHD, GE Proficy, AVEVA Historian) in OT environments. It addresses network placement across Purdue levels, access control for historian interfaces, data replication through DMZ using data diodes or PI-to-PI connectors, SQL injection prevention in historian queries, and integrity protection of process data used for safety analysis, regulatory reporting, and process optimization.

securing-helm-chart-deployments

4032
from mukul975/Anthropic-Cybersecurity-Skills

Secure Helm chart deployments by validating chart integrity, scanning templates for misconfigurations, and enforcing security contexts in Kubernetes releases.

securing-github-actions-workflows

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill covers hardening GitHub Actions workflows against supply chain attacks, credential theft, and privilege escalation. It addresses pinning actions to SHA digests, minimizing GITHUB_TOKEN permissions, protecting secrets from exfiltration, preventing script injection in workflow expressions, and implementing required reviewers for workflow changes.

securing-container-registry-with-harbor

4032
from mukul975/Anthropic-Cybersecurity-Skills

Harbor is an open-source container registry that provides security features including vulnerability scanning (integrated Trivy), image signing (Notary/Cosign), RBAC, content trust policies, replicatio

securing-container-registry-images

4032
from mukul975/Anthropic-Cybersecurity-Skills

Securing container registry images by implementing vulnerability scanning with Trivy and Grype, enforcing image signing with Cosign and Sigstore, configuring registry access controls, and building CI/CD pipelines that prevent deploying unscanned or unsigned images.

securing-azure-with-microsoft-defender

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill instructs security practitioners on deploying Microsoft Defender for Cloud as a cloud-native application protection platform for Azure, multi-cloud, and hybrid environments. It covers enabling Defender plans for servers, containers, storage, and databases, configuring security recommendations, managing Secure Score, and integrating with the unified Defender portal for centralized threat management.

securing-aws-lambda-execution-roles

4032
from mukul975/Anthropic-Cybersecurity-Skills

Securing AWS Lambda execution roles by implementing least-privilege IAM policies, applying permission boundaries, restricting resource-based policies, using IAM Access Analyzer to validate permissions, and enforcing role scoping through SCPs.

securing-aws-iam-permissions

4032
from mukul975/Anthropic-Cybersecurity-Skills

This skill guides practitioners through hardening AWS Identity and Access Management configurations to enforce least privilege access across cloud accounts. It covers IAM policy scoping, permission boundaries, Access Analyzer integration, and credential rotation strategies to reduce the blast radius of compromised identities.

securing-api-gateway-with-aws-waf

4032
from mukul975/Anthropic-Cybersecurity-Skills

Securing API Gateway endpoints with AWS WAF by configuring managed rule groups for OWASP Top 10 protection, creating custom rate limiting rules, implementing bot control, setting up IP reputation filtering, and monitoring WAF metrics for security effectiveness.

scanning-kubernetes-manifests-with-kubesec

4032
from mukul975/Anthropic-Cybersecurity-Skills

Perform security risk analysis on Kubernetes resource manifests using Kubesec to identify misconfigurations, privilege escalation risks, and deviations from security best practices.