k8s-debug-pods
Debug Kurtosis pods on Kubernetes. Diagnose why pods are Pending, CrashLoopBackOff, ImagePullBackOff, or Evicted. Check node taints, tolerations, resource pressure, and pod events. Use when kurtosis engine start fails or pods aren't coming online.
Best use case
k8s-debug-pods is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "k8s-debug-pods" skill to help with this workflow task. Context: Debug Kurtosis pods on Kubernetes. Diagnose why pods are Pending, CrashLoopBackOff, ImagePullBackOff, or Evicted. Check node taints, tolerations, resource pressure, and pod events. Use when kurtosis engine start fails or pods aren't coming online.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/k8s-debug-pods/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How k8s-debug-pods Compares
| Feature / Agent | k8s-debug-pods | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Debug Kurtosis pods on Kubernetes. Diagnose why pods are Pending, CrashLoopBackOff, ImagePullBackOff, or Evicted. Check node taints, tolerations, resource pressure, and pod events. Use when kurtosis engine start fails or pods aren't coming online.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# K8s Debug Pods
Diagnose and fix issues with Kurtosis pods on Kubernetes.
## Quick triage
```bash
# See all kurtosis-related pods across namespaces
kubectl get pods -A | grep kurtosis
# Check for problem pods (not Running)
kubectl get pods -A | grep kurtosis | grep -v Running
# Get events for a specific pod
kubectl describe pod <POD_NAME> -n <NAMESPACE> | tail -30
```
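Filtering with `grep -v Running` also lists pods that finished normally (`Completed`). A minimal sketch of a stricter filter follows; the helper name `kurtosis_problem_pods` is hypothetical, and it assumes the default `kubectl get pods -A` column order (NAMESPACE, NAME, READY, STATUS, ...).

```bash
# Print "NAMESPACE/NAME: STATUS" for Kurtosis pods that are neither
# Running nor Completed. Reads `kubectl get pods -A` output on stdin.
# Assumption: STATUS is the fourth whitespace-separated column.
kurtosis_problem_pods() {
  awk '/kurtosis/ && $4 != "Running" && $4 != "Completed" {
    print $1 "/" $2 ": " $4
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -A | kurtosis_problem_pods
```

Matching on the STATUS column rather than the whole line also avoids hiding a failing pod whose name happens to contain "Running".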
## Common pod states and fixes
### Pending — Unschedulable
The pod can't be scheduled because of node taints, resource pressure, or affinity rules.
```bash
# Check node taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Check node conditions (DiskPressure, MemoryPressure, etc.) with their True/False status
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[*]}{.type}={.status}{" "}{end}{"\n"}{end}'
```
**Fix**: Add tolerations to the kurtosis config at `~/Library/Application Support/kurtosis/kurtosis-config.yml` (the macOS location; `kurtosis config path` prints the location on other systems), or fix the node condition.
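The scheduler records why a pod is unschedulable in a `FailedScheduling` event. The sketch below maps such a message onto the three causes named above. The function name `classify_scheduling_failure` is hypothetical, and the substrings it matches are assumptions based on common scheduler messages, whose exact wording varies between Kubernetes versions.

```bash
# Classify a FailedScheduling message into taint, resource, or affinity causes.
# A single message can list several causes, so every match is printed.
# Assumption: the substrings below appear in the scheduler's message text.
classify_scheduling_failure() {
  local msg=$1 matched=0
  if [[ $msg == *taint* ]]; then
    echo "taint: add a toleration or remove the taint"
    matched=1
  fi
  if [[ $msg == *Insufficient* ]]; then
    echo "resources: free node capacity or lower the pod's requests"
    matched=1
  fi
  if [[ $msg == *affinity* ]]; then
    echo "affinity: compare node labels with the pod's affinity rules"
    matched=1
  fi
  if (( matched == 0 )); then
    echo "unknown: read the full event message"
  fi
}

# Usage (requires cluster access):
#   msg=$(kubectl get events -n <NAMESPACE> \
#     --field-selector involvedObject.name=<POD_NAME>,reason=FailedScheduling \
#     -o jsonpath='{.items[*].message}')
#   classify_scheduling_failure "$msg"
```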
### ImagePullBackOff
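The kubelet can't pull the image. Most often the image tag doesn't exist on the registry; the registry may also be unreachable or require pull credentials the cluster doesn't have.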
```bash
# Check which image is failing
kubectl describe pod <POD_NAME> -n <NAMESPACE> | grep -A5 "Image:"
# Verify image exists on Docker Hub
docker manifest inspect <IMAGE>:<TAG>
```
**Fix**: Push the correct image tag, or fix the image reference in the code.
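When a pod has several containers, the two checks above can be chained. This is a minimal sketch with hypothetical helper names (`pod_images`, `check_images`); it assumes `docker` is installed and already logged in to any private registry involved.

```bash
# Extract the unique image references from `kubectl describe pod` output on stdin.
# Matching the first field exactly skips the "Image ID:" lines.
pod_images() {
  awk '$1 == "Image:" { print $2 }' | sort -u
}

# For each image reference on stdin, report whether its manifest can be fetched.
# "unresolved" covers a missing tag as well as an auth or network failure.
check_images() {
  local image
  while read -r image; do
    if docker manifest inspect "$image" > /dev/null 2>&1 < /dev/null; then
      echo "ok $image"
    else
      echo "unresolved $image"
    fi
  done
}

# Usage (requires cluster access and docker):
#   kubectl describe pod <POD_NAME> -n <NAMESPACE> | pod_images | check_images
```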
### CrashLoopBackOff
The container starts but crashes immediately.
```bash
# Check container logs
kubectl logs <POD_NAME> -n <NAMESPACE>
kubectl logs <POD_NAME> -n <NAMESPACE> --previous
```
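If the logs are empty, the exit code of the last terminated container often narrows the cause. The sketch below interprets common codes; the function name `explain_exit_code` is hypothetical, and the hints are general conventions (exit codes above 128 mean 128 plus a signal number), not Kurtosis-specific behavior.

```bash
# Translate a container exit code into a short hint.
explain_exit_code() {
  local code=$1
  case "$code" in
    0)
      echo "exit 0: the process finished; check that the command is meant to keep running" ;;
    126|127)
      echo "exit $code: the command could not be run; check the image entrypoint" ;;
    137)
      echo "exit 137: killed by SIGKILL (signal 9); check for OOMKilled and memory limits" ;;
    *)
      if (( code > 128 )); then
        echo "exit $code: killed by signal $(( code - 128 ))"
      else
        echo "exit $code: the application reported an error; read the --previous logs"
      fi ;;
  esac
}

# Usage (requires cluster access):
#   code=$(kubectl get pod <POD_NAME> -n <NAMESPACE> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
#   explain_exit_code "$code"
```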
### Evicted
The node evicted the pod due to resource pressure.
```bash
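# Check which nodes have pressure (lists the conditions that are currently True; a healthy node shows only Ready)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[?(@.status=="True")]}{.type}{" "}{end}{"\n"}{end}'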
# Clean up evicted pods
kubectl get pods -A | grep Evicted | awk '{print $2 " -n " $1}' | xargs -L1 kubectl delete pod
```
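The cleanup command above deletes evicted pods in every namespace, including non-Kurtosis ones. A narrower sketch that lists only evicted Kurtosis pods for review before deletion is shown below; the helper name `evicted_kurtosis_pods` is hypothetical and it assumes the default `kubectl get pods -A` column order.

```bash
# Print "NAMESPACE NAME" for evicted Kurtosis pods.
# Reads `kubectl get pods -A` output on stdin; STATUS is assumed to be column 4.
evicted_kurtosis_pods() {
  awk '/kurtosis/ && $4 == "Evicted" { print $1, $2 }'
}

# Usage (requires cluster access): review the list, then delete.
#   kubectl get pods -A | evicted_kurtosis_pods
#   kubectl get pods -A | evicted_kurtosis_pods |
#     while read -r ns pod; do kubectl delete pod "$pod" -n "$ns"; done
```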
## Kurtosis-specific pod types
| Pod pattern | Component | Image source |
|-------------|-----------|-------------|
| `kurtosis-engine-*` | Engine server | `engine/server/Dockerfile` |
| `kurtosis-api` (in `kt-*` namespaces) | API Container (APIC) | `core/server/Dockerfile` |
| `kurtosis-logs-collector-*` | Fluentbit DaemonSet | Pulled from registry |
| `kurtosis-logs-aggregator-*` | Vector deployment | Pulled from registry |
| `remove-dir-pod-*` | Fluentbit cleanup pods | busybox |
| `files-artifact-expander` (init container) | Files artifacts | `core/files_artifacts_expander/Dockerfile` |
## Engine start failures
If `kurtosis engine start` fails:
1. Check if old kurtosis namespaces exist: `kubectl get ns | grep kurtosis`
2. Delete them: `kubectl get ns | grep kurtosis | awk '{print $1}' | xargs -r kubectl delete ns`
3. Retry engine start
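The three steps above can be sketched as one function. The names `kurtosis_namespaces` and `reset_and_start_engine` are hypothetical; the sketch assumes `kubectl` and `kurtosis` are on the PATH and that every namespace with "kurtosis" in its name is safe to delete.

```bash
# Print namespaces whose NAME column contains "kurtosis".
# Reads `kubectl get ns` output on stdin.
kurtosis_namespaces() {
  awk '$1 ~ /kurtosis/ { print $1 }'
}

# Delete leftover Kurtosis namespaces one by one, then retry the engine start.
reset_and_start_engine() {
  local ns
  kubectl get ns | kurtosis_namespaces | while read -r ns; do
    kubectl delete ns "$ns" < /dev/null
  done
  kurtosis engine start
}

# Usage (requires cluster access):
#   reset_and_start_engine
```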
## Logs collector issues
The logs collector is a DaemonSet that runs on every node. If some nodes are unhealthy:
```bash
# Check DaemonSet status
kubectl get ds -A | grep kurtosis
# See which pods are not running
kubectl get pods -A | grep logs-collector | grep -v Running
```
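To see at a glance whether a Kurtosis DaemonSet is only partially ready, the DESIRED and READY counts can be compared. This is a minimal sketch; the helper name `degraded_daemonsets` is hypothetical and it assumes the default `kubectl get ds -A` column order (NAMESPACE, NAME, DESIRED, CURRENT, READY, ...).

```bash
# Print Kurtosis DaemonSets whose READY count is below DESIRED.
# Reads `kubectl get ds -A` output on stdin.
# Assumption: DESIRED is column 3 and READY is column 5.
degraded_daemonsets() {
  awk '/kurtosis/ && ($5 + 0) < ($3 + 0) {
    print $1 "/" $2 ": " $5 "/" $3 " ready"
  }'
}

# Usage (requires cluster access):
#   kubectl get ds -A | degraded_daemonsets
```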
Nodes with DiskPressure or other taints may not schedule collector pods. This is expected, and the engine should start with a warning about partially degraded collection.
Related Skills
docker-debug
Debug Kurtosis running on local Docker. Inspect engine, API container, and service logs. Diagnose container crashes, port conflicts, and networking issues. Use when kurtosis commands fail or services aren't reachable on Docker.
starlark-dev
Develop and debug Kurtosis Starlark packages. Create packages from scratch, understand the plan-based execution model, use print() debugging, handle future references, and test packages locally. Use when writing or troubleshooting .star files.
service-manage
Manage services in Kurtosis enclaves. Add, inspect, stop, start, remove, update services. View logs, shell into containers, and execute commands. Use when you need to interact with running services.
run-package
Run Starlark scripts and packages with kurtosis run. Covers all flags including dry-run, args-file, parallel execution, image download modes, verbosity levels, and production mode. Use when executing Kurtosis packages locally or from GitHub.
portal
Manage Kurtosis Portal for remote context access. Start, stop, and check status of the Portal daemon that enables communication with remote Kurtosis servers. Use when working with remote Kurtosis contexts.
port-forward
View and manage port mappings for Kurtosis services. Check which local ports map to service ports and troubleshoot connectivity. Use when services aren't reachable or you need to find the right port.
lint
Lint and format Kurtosis Starlark files. Check syntax, validate docstrings, and auto-format .star files. Use when writing or reviewing Starlark packages to ensure code quality.
k8s-dev-deploy
Build, push, and deploy Kurtosis dev images to a Kubernetes cluster without creating a release. Rebuilds engine, core, and files-artifacts-expander as multi-arch Docker images with a unique tag, pushes to the logged-in user's Docker Hub, and restarts the engine. Use when testing local code changes on a k8s cluster.
k8s-clean-cluster
Force-clean all Kurtosis resources from a Kubernetes cluster when kurtosis clean hangs or fails. Removes all kurtosis namespaces, pods, daemonsets, cluster roles, and cluster role bindings. Use when kurtosis clean -a hangs or leaves behind orphaned resources.
import-compose
Import Docker Compose files into Kurtosis. Convert docker-compose.yml to Starlark packages or run them directly. Use when migrating existing Docker Compose workflows to Kurtosis.
grafloki
Start Grafana and Loki for centralized log collection from Kurtosis enclaves. View aggregated service logs in a Grafana dashboard. Use when you need a UI for browsing logs across multiple services or want persistent log storage.
gateway
Start and manage the Kurtosis gateway for Kubernetes. The gateway forwards local ports to the Kurtosis engine and services running in a k8s cluster. Required when using Kurtosis with Kubernetes. Use when kurtosis engine status shows nothing on k8s or services aren't reachable.