k8s-cost-optimizer
Analyzes Kubernetes cluster resource allocation versus actual usage to find waste and generate right-sizing recommendations. Use when someone asks about Kubernetes costs, overprovisioned pods, resource requests/limits tuning, cluster efficiency, or cloud bill reduction for K8s workloads. Trigger words: k8s costs, pod resources, right-size, overprovisioned, resource waste, cluster optimization, CPU/memory requests.
Best use case
k8s-cost-optimizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using k8s-cost-optimizer should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/k8s-cost-optimizer/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How k8s-cost-optimizer Compares
| Feature / Agent | k8s-cost-optimizer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It analyzes Kubernetes cluster resource allocation versus actual usage to find waste and generate right-sizing recommendations.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Kubernetes Cost Optimizer
## Overview
This skill audits Kubernetes clusters for resource inefficiency by comparing requested CPU/memory against actual usage from metrics-server. It identifies zombie deployments, overprovisioned workloads, and generates kustomize-compatible patches for right-sizing with safety buffers.
## Instructions
### Step 1: Verify Cluster Access and Metrics Availability
Run `kubectl cluster-info` and `kubectl top nodes` to confirm connectivity and that metrics-server is running. If metrics-server is unavailable, inform the user and suggest installing it first.
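The check above can be sketched as a small preflight function. This is a minimal illustration, not part of the skill itself: the function name `check_prerequisites`, the status strings, and the injectable `run` callable are all hypothetical, and the 30-second timeout is an arbitrary assumption.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> bool:
    """Return True when the command exits with status 0."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # kubectl is missing from PATH, or the API server did not answer in time.
        return False
    return result.returncode == 0


def check_prerequisites(run: Callable[[Sequence[str]], bool] = _run) -> str:
    """Return 'ok', 'no-cluster-access', or 'metrics-unavailable' (hypothetical labels)."""
    if not run(["kubectl", "cluster-info"]):
        return "no-cluster-access"
    # `kubectl top nodes` fails when the metrics API is not being served.
    if not run(["kubectl", "top", "nodes"]):
        return "metrics-unavailable"
    return "ok"
```

Passing a fake `run` callable makes the function testable without a cluster; by default it shells out to `kubectl`.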
### Step 2: Collect Resource Data
For each namespace (or user-specified namespaces):
```bash
# Get resource requests/limits for all pods
kubectl get pods --all-namespaces -o json | jq '[.items[] | {
  namespace: .metadata.namespace,
  pod: .metadata.name,
  containers: [.spec.containers[] | {
    name: .name,
    cpu_request: .resources.requests.cpu,
    cpu_limit: .resources.limits.cpu,
    mem_request: .resources.requests.memory,
    mem_limit: .resources.limits.memory
  }]
}]'

# Get actual usage
kubectl top pods --all-namespaces --no-headers
```
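Both commands return Kubernetes quantity strings such as `500m` or `1Gi`, which need converting to plain numbers before requests and usage can be compared. The sketch below shows one way to do that. The function names are hypothetical, and it deliberately handles only the common suffixes (it ignores exponent notation and the `P`/`E` scales).

```python
from typing import Optional

# Binary (Ki, Mi, ...) and decimal (k, M, ...) multipliers for memory quantities.
_MEM_UNITS = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3, "Ti": 1024 ** 4,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3, "T": 1000 ** 4,
}


def parse_cpu_millicores(quantity: Optional[str]) -> Optional[float]:
    """Convert a CPU quantity such as '500m' or '2' to millicores; None stays None."""
    if quantity is None:
        return None
    if quantity.endswith("n"):  # nanocores
        return float(quantity[:-1]) / 1_000_000
    if quantity.endswith("u"):  # microcores
        return float(quantity[:-1]) / 1_000
    if quantity.endswith("m"):  # millicores
        return float(quantity[:-1])
    return float(quantity) * 1000  # whole cores


def parse_memory_bytes(quantity: Optional[str]) -> Optional[float]:
    """Convert a memory quantity such as '340Mi' or '1Gi' to bytes; None stays None."""
    if quantity is None:
        return None
    # Check two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_MEM_UNITS, key=len, reverse=True):
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * _MEM_UNITS[suffix]
    return float(quantity)  # plain bytes
```

Containers with no request or limit set come back from `jq` as `null`, which is why both parsers pass `None` through rather than raising.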
### Step 3: Calculate Efficiency Ratios
For each deployment, compute:
- **Efficiency** = actual_usage / requested × 100
- **Waste** = 100 - efficiency
- **Monthly cost estimate** = replicas × ((requested_cpu × node_cost_per_cpu) + (requested_mem × node_cost_per_gi)), with CPU in vCPUs, memory in GiB, and both prices expressed per month
Use these reference prices if the user doesn't specify:
- On-demand: ~$0.0425/vCPU-hour, ~$0.0057/GiB-hour (AWS us-east-1 m5 family)
- Convert to monthly: multiply hourly by 730
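The formulas above can be sketched as follows, using the reference prices as defaults. The function and constant names are hypothetical, and multiplying by the replica count is an assumption based on requests being set per pod.

```python
from typing import Optional

HOURS_PER_MONTH = 730
CPU_PRICE_PER_VCPU_HOUR = 0.0425  # reference on-demand price from the text
MEM_PRICE_PER_GIB_HOUR = 0.0057   # reference on-demand price from the text


def efficiency_pct(used: float, requested: float) -> Optional[float]:
    """Efficiency = actual_usage / requested x 100; None when nothing is requested."""
    if requested <= 0:
        return None
    return used / requested * 100


def waste_pct(used: float, requested: float) -> Optional[float]:
    """Waste = 100 - efficiency."""
    efficiency = efficiency_pct(used, requested)
    return None if efficiency is None else 100 - efficiency


def monthly_cost(cpu_vcpus: float, mem_gib: float, replicas: int = 1) -> float:
    """Estimated monthly cost of the requested resources across all replicas."""
    hourly = cpu_vcpus * CPU_PRICE_PER_VCPU_HOUR + mem_gib * MEM_PRICE_PER_GIB_HOUR
    return hourly * HOURS_PER_MONTH * replicas
```

For instance, three replicas that each request 0.5 vCPU and 1 GiB come to roughly $59 per month at these reference prices, so `monthly_cost(0.5, 1.0, replicas=3)` returns about `59.02`.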
### Step 4: Identify Zombie Workloads
Flag deployments where:
- CPU usage is < 1m (1 millicore) for 7+ days
- Pod restart count is 0 and replicas are running
- Namespace matches staging/dev/preview patterns
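As a rough sketch, the criteria above can be expressed as a single predicate. Several things here are assumptions rather than part of the skill: that all three conditions must hold together, that usage arrives as one peak CPU sample per day, and the `WorkloadStats` type and the namespace regex are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Sequence

# Assumed pattern: 'staging', 'dev', or 'preview' as a whole hyphen-separated segment.
NON_PROD_NAMESPACE = re.compile(r"(^|-)(staging|dev|preview)(-|$)")


@dataclass
class WorkloadStats:
    namespace: str
    name: str
    daily_peak_cpu_millicores: Sequence[float]  # one sample per day, oldest first
    restart_count: int
    ready_replicas: int


def is_zombie(w: WorkloadStats, min_idle_days: int = 7, idle_below_m: float = 1.0) -> bool:
    """True when the workload has been idle, stable, and running in a non-prod namespace."""
    recent = w.daily_peak_cpu_millicores[-min_idle_days:]
    # Require a full window of history; a short history is not evidence of idleness.
    idle = len(recent) >= min_idle_days and all(s < idle_below_m for s in recent)
    stable_and_running = w.restart_count == 0 and w.ready_replicas > 0
    non_prod = NON_PROD_NAMESPACE.search(w.namespace) is not None
    return idle and stable_and_running and non_prod
```

Requiring a full window of samples keeps a freshly deployed workload from being flagged just because it has not received traffic yet.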
### Step 5: Generate Right-Sizing Recommendations
For each overprovisioned workload:
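1. Take the p99 usage over available history (ideally 14 days). metrics-server exposes only current usage, so history has to come from repeated sampling or a monitoring store such as Prometheus
2. Add a configurable headroom buffer (default: 30%)
3. Set requests = p99 × (1 + buffer), limits = requests × 1.5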
4. Output as kustomize patches or plain YAML diffs
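The steps above can be sketched for CPU as follows. This is an illustration under stated assumptions: the percentile uses the nearest-rank method, values are rounded up to whole millicores, the headroom is applied multiplicatively, and the function names and patch layout (a strategic-merge patch for a Deployment) are hypothetical choices rather than the skill's prescribed output.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of the usage samples."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend(samples: Sequence[float], headroom: float = 0.30,
              limit_ratio: float = 1.5) -> Dict[str, int]:
    """requests = p99 x (1 + headroom), limits = requests x limit_ratio, rounded up."""
    p99 = percentile(samples, 99)
    request = math.ceil(p99 * (1 + headroom))
    limit = math.ceil(request * limit_ratio)
    return {"request": request, "limit": limit}


def cpu_patch(namespace: str, deployment: str, container: str,
              cpu: Dict[str, int]) -> str:
    """Render a strategic-merge patch that sets CPU requests/limits in millicores."""
    return "\n".join([
        "apiVersion: apps/v1",
        "kind: Deployment",
        "metadata:",
        f"  name: {deployment}",
        f"  namespace: {namespace}",
        "spec:",
        "  template:",
        "    spec:",
        "      containers:",
        f"        - name: {container}",
        "          resources:",
        "            requests:",
        f"              cpu: {cpu['request']}m",
        "            limits:",
        f"              cpu: {cpu['limit']}m",
    ])
```

With 100 samples ranging from 1m to 100m, the p99 is 99m, so `recommend` returns a 129m request and a 194m limit. The same shape works for memory if the samples and the patch field are switched to Mi.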
### Step 6: Produce Summary Report
Format a table showing:
- Top waste offenders sorted by estimated monthly savings
- Total cluster waste percentage
- Projected annual savings
- Risk assessment (low/medium/high) for each recommendation
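A minimal sketch of the report assembly, covering only the ordering and the savings totals. The `Recommendation` type and column layout are hypothetical, and the risk label is assumed to be assigned upstream, since the text does not define how risk is scored.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Recommendation:
    workload: str
    monthly_savings: float  # estimated USD per month
    risk: str               # "low", "medium", or "high", assigned upstream


def build_report(recs: Sequence[Recommendation]) -> str:
    """Render recommendations sorted by savings, followed by monthly and annual totals."""
    ordered = sorted(recs, key=lambda r: r.monthly_savings, reverse=True)
    total = sum(r.monthly_savings for r in ordered)
    lines: List[str] = [f"{'Workload':<20}{'Savings/mo':>12}  Risk"]
    for r in ordered:
        savings = f"${r.monthly_savings:,.0f}"
        lines.append(f"{r.workload:<20}{savings:>12}  {r.risk}")
    lines.append(f"Total: ${total:,.0f}/month (${total * 12:,.0f}/year)")
    return "\n".join(lines)
```

Sorting in descending order puts the largest saving at the top of the table, and the annual figure is simply twelve times the monthly total.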
## Examples
### Example 1: Namespace Audit
**User prompt:**
```
Audit the api-services namespace for resource waste
```
**Agent output:**
```
Resource Audit: api-services namespace
───────────────────────────────────────
Deployment        Replicas  CPU Req→Used  Mem Req→Used  Efficiency  Est. Waste
user-auth         3         500m→120m     1Gi→340Mi     26%         $85/mo
order-processor   2         1000m→650m    2Gi→1.4Gi     68%         $31/mo
email-worker      2         250m→15m      512Mi→80Mi    8%          $42/mo

Recommendations:
1. email-worker: Reduce to 1 replica, set requests to cpu:50m/mem:150Mi → saves $42/mo
2. user-auth: Set requests to cpu:180m/mem:500Mi → saves $68/mo
3. order-processor: Requests are reasonable, minor trim to cpu:850m → saves $12/mo

Total potential savings: $122/month ($1,464/year)
```
### Example 2: Zombie Detection
**User prompt:**
```
Find unused deployments across all namespaces
```
**Agent output:**
```
Zombie Workloads Detected
─────────────────────────
Namespace: staging
  ✗ feature-auth-v2      2 replicas  CPU: 0m  Last deploy: 2025-09-14  → $28/mo wasted
  ✗ hotfix-payment-flow  1 replica   CPU: 0m  Last deploy: 2025-10-02  → $14/mo wasted
  ✗ demo-dashboard       3 replicas  CPU: 0m  Last deploy: 2025-07-28  → $42/mo wasted
Namespace: dev
  ✗ test-migration       1 replica   CPU: 0m  Last deploy: 2025-11-18  → $14/mo wasted

Suggested cleanup:
  kubectl delete deployment feature-auth-v2 hotfix-payment-flow demo-dashboard -n staging
  kubectl delete deployment test-migration -n dev

Total zombie cost: $98/month
```
## Guidelines
- **Never auto-apply changes**: always present recommendations for human review
- **Safety buffer is critical**: the default 30% headroom reduces the risk of OOMKills and CPU throttling after right-sizing
- **Prioritize by savings**: show the biggest wins first so users focus effort where it matters
- **Account for traffic patterns**: warn if usage data covers less than 7 days or misses peak periods
- **Consider HPA**: if a deployment has a HorizontalPodAutoscaler, note that right-sizing requests changes its scaling thresholds, because utilization targets are measured as a percentage of the request
- **Staging vs production**: be more aggressive with staging recommendations, more conservative with production
- **Cost estimates are approximate**: note the instance type assumptions and suggest the user verify with their actual pricing
Related Skills
sql-optimizer
Analyze and optimize SQL queries for performance. Use when a user asks to optimize a query, speed up a slow query, analyze a query plan, add indexes, fix N+1 queries, reduce query time, tune database performance, or rewrite SQL for efficiency. Supports PostgreSQL, MySQL, and SQLite.
llm-cost-optimizer
Track, analyze, and reduce LLM API costs — model routing, prompt caching, semantic caching, and budget alerts. Use when someone asks to "reduce AI costs", "track LLM spending", "optimize API costs", "set up model routing", "cache LLM responses", "compare model costs", "set budget limits for AI", or "my OpenAI bill is too high". Covers cost tracking per feature/user, smart model routing (expensive model for hard tasks, cheap for easy), semantic caching, prompt compression, and budget alerting.
gcp-waf-cost-optimization
Apply the Google Cloud Well-Architected Framework's Cost Optimization pillar to evaluate workloads and recommend FinOps practices — billing exports, budgets, rightsizing via Active Assist, Spot VMs, Committed Use Discounts, storage lifecycle policies, and serverless adoption. Use for cloud cost reviews, monthly billing analysis, and architecture cost decisions.
alert-optimizer
Restructure and optimize alert rules for monitoring platforms (Sentry, PagerDuty, Datadog, OpsGenie). Use when someone asks to "reduce alert noise", "fix alert fatigue", "create alert rules", "set up escalation policies", "tune alerting thresholds", or "create on-call runbooks". Generates platform-specific alert configurations and tiered escalation policies.