debug-diagnose

Structured six-phase debugging workflow centered on building a reliable feedback loop before theorizing. Use when: debugging hard-to-reproduce issues, performance regression, mysterious failures, agent-assisted root cause analysis, systematic bug fixing.

by theneoai

Best use case

debug-diagnose is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using debug-diagnose should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/debug-diagnose/SKILL.md --create-dirs "https://raw.githubusercontent.com/theneoai/awesome-skills/main/skills/workflow/engineering/debug-diagnose/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/debug-diagnose/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How debug-diagnose Compares

| Feature / Agent | debug-diagnose | Standard Approach |
|-----------------|----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

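What does this skill do?

It walks an AI agent through a structured six-phase debugging workflow: build a reliable feedback loop, reproduce the failure, hypothesize, instrument, fix with a regression test, then clean up and write a post-mortem.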

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Debug & Diagnose

## One-Liner

Build a fast, deterministic, agent-runnable pass/fail signal first — the bug is 90% fixed once you have that.

---

## § 1 · Core Philosophy

**The critical insight:** If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause. Without it, you are guessing.

Most debugging fails not because the engineer lacks knowledge, but because they skip directly to hypotheses without first establishing a reproducible signal. Build the loop, then let the loop drive everything else.

---

## § 2 · The Six Phases

### Phase 1 — Build a Feedback Loop

Create the fastest reproducible test signal you can. Options in order of preference:

1. **Unit test** — isolates the exact failing condition
2. **Integration test** — exercises the real subsystem path
3. **CLI script** — passes known-bad input, asserts known-bad output
4. **REPL session** — interactive exploration to narrow the search space
5. **Fuzz input** — for non-deterministic or data-dependent failures

**Key constraints:**
- Must be runnable by the agent without human interaction
- Must produce pass/fail, not just "looks wrong"
- Must be deterministic (same input → same result)
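A minimal sketch of a loop that meets these three constraints. The function `parse_duration` and its planted bug are hypothetical stand-ins for the code under investigation; only the shape of the loop (known-bad input, one assertion, exit code 0 or 1) is the point.

```python
def parse_duration(text: str) -> int:
    """Hypothetical buggy function: '1h30m' should give 5400 seconds."""
    seconds = 0
    if "h" in text:
        hours, text = text.split("h", 1)
        seconds += int(hours) * 3600
        return seconds  # planted bug: returns before the minutes are parsed
    if text.endswith("m"):
        seconds += int(text[:-1]) * 60
    return seconds


def run_loop() -> int:
    """Return 0 (pass) or 1 (fail) so the result can serve as a process exit code."""
    known_bad_input = "1h30m"
    expected = 5400
    actual = parse_duration(known_bad_input)
    if actual == expected:
        print("PASS")
        return 0
    print(f"FAIL: parse_duration({known_bad_input!r}) = {actual}, expected {expected}")
    return 1


exit_code = run_loop()
# In a standalone script, finish with: raise SystemExit(exit_code)
```

Because the verdict is an exit code rather than something to eyeball, an agent can run the script repeatedly and branch on the result without a human reading the output.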

Be aggressive and creative. Refuse to give up. A slow loop is better than no loop; a noisy loop can be narrowed.
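One way to narrow a noisy loop, sketched under the assumption that the nondeterminism comes from a seedable random source. The function `flaky_dedupe` is a made-up example of an order-dependent bug; the technique is to search seeds until one fails, then pin that seed so every later run is deterministic.

```python
import random
from typing import List, Optional


def flaky_dedupe(items: List[int], rng: random.Random) -> List[int]:
    """Hypothetical intermittent bug: correctness depends on shuffle order."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    out: List[int] = []
    for value in shuffled:
        # planted bug: only adjacent duplicates are removed
        if not out or out[-1] != value:
            out.append(value)
    return out


def find_failing_seed(max_seed: int = 1000) -> Optional[int]:
    """Return the first seed for which the output is wrong, or None."""
    for seed in range(max_seed):
        result = flaky_dedupe([1, 2, 1, 2], random.Random(seed))
        if len(result) != 2:  # a correct dedupe leaves exactly two values
            return seed
    return None


failing_seed = find_failing_seed()
print(f"pin this seed in the feedback loop: {failing_seed}")
```

Once a failing seed is known, the loop can construct `random.Random(failing_seed)` directly and the intermittent failure becomes a same-input, same-result check.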

**Gate:** You have a script/test that reproducibly demonstrates the failure.

### Phase 2 — Reproduce

Run the loop and confirm it demonstrates **exactly** the reported failure — not a nearby issue, not a similar symptom.

If the loop shows a different failure than reported:
- You have found a second bug (note it, don't pursue it now)
- Continue narrowing until the loop matches the report precisely

**Gate:** Loop output matches the failure description word-for-word.
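The gate can be checked mechanically. The sketch below assumes the bug report quotes an exception as `Type: message`; `REPORTED_FAILURE` and `load_profile` are hypothetical placeholders for the real report text and code path.

```python
REPORTED_FAILURE = "KeyError: 'user_id'"  # hypothetical text copied from the bug report


def load_profile(payload: dict) -> str:
    """Hypothetical code path named in the report."""
    return payload["user_id"].lower()


def observe_failure(payload: dict) -> str:
    """Run the loop once and describe the failure as 'ExceptionType: message'."""
    try:
        load_profile(payload)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return "no failure"


def matches_report(payload: dict) -> bool:
    """True only when the observed failure is exactly the reported one."""
    return observe_failure(payload) == REPORTED_FAILURE


print(matches_report({}))                 # missing key: the reported failure
print(matches_report({"user_id": None}))  # a nearby AttributeError, not the report
```

The second input also crashes `load_profile`, but with a different exception. Under this workflow that is a separate bug to note, not the one to chase.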

### Phase 3 — Hypothesize

Before touching any code, generate **3–5 ranked hypotheses**. Each must be:

- **Falsifiable** — a specific, testable prediction
- **Ranked** — order by likelihood given what you know
- **Independent** — don't let one hypothesis assume another is true

Write them down. Do not skip this step even if one hypothesis feels obvious — the obvious hypothesis is frequently wrong.
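Writing hypotheses down can be as light as a small record type. This is an illustrative sketch, not part of the skill itself: the `Hypothesis` fields and the stale-cache entries are invented for the example, and the likelihood values are rough judgment calls.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    claim: str                # what you think is wrong
    prediction: str           # what the feedback loop will show if the claim is true
    likelihood: float         # rough estimate in [0, 1], not a measurement
    status: str = "untested"  # becomes "confirmed" or "falsified" in Phase 4


def rank(hypotheses: List[Hypothesis]) -> List[Hypothesis]:
    """Validate the list, then return it ordered most likely first."""
    if not 3 <= len(hypotheses) <= 5:
        raise ValueError("write down 3 to 5 hypotheses")
    for h in hypotheses:
        if not h.prediction.strip():
            raise ValueError(f"not falsifiable, no prediction: {h.claim!r}")
    return sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)


# Illustrative entries for a made-up stale-cache bug.
candidates = rank([
    Hypothesis("Invalidation event is dropped", "No invalidate call logged on write", 0.2),
    Hypothesis("Cache key omits the tenant id", "Two tenants get the same cached row", 0.5),
    Hypothesis("TTL is parsed as minutes, not seconds", "Entry survives past 60 s", 0.3),
])
for position, h in enumerate(candidates, start=1):
    print(f"{position}. {h.claim} -> expect: {h.prediction}")
```

Requiring a non-empty `prediction` is a cheap way to enforce falsifiability: a hypothesis that cannot say what the loop will show is not yet testable.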

### Phase 4 — Instrument

Test hypotheses in ranked order. For each:

1. Add **targeted** instrumentation at the relevant boundary (not blanket logging)
2. Use a debugger or tagged log output, not `print`-everywhere
3. Run the feedback loop
4. Does the output confirm or falsify the hypothesis?

Stop when one hypothesis is confirmed. Remove instrumentation from all falsified hypotheses immediately to keep signal clean.

**Anti-pattern:** Adding instrumentation for all hypotheses at once. You lose the ability to read the signal.
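A sketch of targeted, tagged instrumentation using Python's standard `logging` module. The tag `H1`, the logger name `debug_probe`, and the `apply_discount` function are assumptions made for the example; the idea is one probe, at one boundary, for one hypothesis.

```python
import logging


def make_probe(tag: str) -> logging.Logger:
    """Create a logger whose every line is prefixed with the hypothesis tag."""
    logger = logging.getLogger(f"debug_probe.{tag}")
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(f"[{tag}] %(message)s"))
        logger.addHandler(handler)
    logger.propagate = False  # keep probe output out of the application's own logs
    return logger


probe_h1 = make_probe("H1")


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function at the boundary under suspicion."""
    # H1: integer division truncates the discount here
    discount = price_cents * percent // 100
    probe_h1.debug("price=%d percent=%d discount=%d", price_cents, percent, discount)
    return price_cents - discount


apply_discount(999, 15)
```

Tagging each line with the hypothesis it belongs to makes the output easy to filter while the probe is live, and easy to find and delete once the hypothesis is falsified.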

### Phase 5 — Fix + Regression Test

1. Write a test **at the appropriate architectural seam** that fails because of the bug — before writing the fix
2. Implement the minimal fix
3. Confirm: feedback loop is now green, new regression test is green, existing tests still green
4. If the fix required touching more than one module, consider whether the modules should be decoupled
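The test-first ordering in steps 1 to 3 can be sketched with `unittest`. Both `parse_duration_buggy` and `parse_duration_fixed` are hypothetical; in a real codebase there is one function, and the same test is run before and after the change.

```python
import io
import unittest


def parse_duration_buggy(text: str) -> int:
    """Hypothetical pre-fix code: drops the minutes when hours are present."""
    if "h" in text:
        return int(text.split("h", 1)[0]) * 3600
    return int(text[:-1]) * 60 if text.endswith("m") else 0


def parse_duration_fixed(text: str) -> int:
    """Minimal fix: keep parsing after the hours component."""
    seconds = 0
    if "h" in text:
        hours, text = text.split("h", 1)
        seconds += int(hours) * 3600
    if text.endswith("m"):
        seconds += int(text[:-1]) * 60
    return seconds


def regression_passes(impl) -> bool:
    """Run the regression test against one implementation; True means green."""
    class DurationRegression(unittest.TestCase):
        def test_hours_and_minutes_are_both_counted(self):
            self.assertEqual(impl("1h30m"), 5400)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DurationRegression)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()


print("before fix:", regression_passes(parse_duration_buggy))
print("after fix: ", regression_passes(parse_duration_fixed))
```

Seeing the test fail against the unfixed code first is what shows that it exercises the bug; a test that was only ever green proves little.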

### Phase 6 — Cleanup + Post-Mortem

1. Remove all debug instrumentation
2. Verify the feedback loop test is committed (it is now a permanent regression guard)
3. Document findings:
   - Root cause in one sentence
   - Why the bug was not caught earlier
   - What architectural change (if any) would prevent the class of bug
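Step 1 is easy to verify mechanically if the instrumentation was tagged. The sketch below assumes a hypothetical convention in which probes carry tags such as `[H1]` or use a logger named `debug_probe`; adjust the pattern to whatever convention was actually used.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Tuple

# Hypothetical tag convention: "[H1]", "[H2]", ... or a logger called "debug_probe".
DEBUG_TAG = re.compile(r"\[H\d+\]|debug_probe")


def find_leftover_instrumentation(root: Path) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line text) for every tagged line under root."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if DEBUG_TAG.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


# Self-contained demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "billing.py"
    sample.write_text(
        'log.debug("[H1] discount=%d", d)\ntotal = price - d\n',
        encoding="utf-8",
    )
    leftovers = find_leftover_instrumentation(Path(tmp))

for file_name, lineno, text in leftovers:
    print(f"{file_name}:{lineno}: {text}")
```

An empty result is a reasonable precondition for committing; wiring the check into the commit is left as a project-specific choice.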

---

## § 3 · Quick Reference

```
Phase 1: BUILD THE LOOP (deterministic, agent-runnable, pass/fail)
Phase 2: REPRODUCE (confirm the loop shows exactly the reported failure)
Phase 3: HYPOTHESIZE (3-5 ranked, falsifiable hypotheses — written down)
Phase 4: INSTRUMENT (targeted, one hypothesis at a time)
Phase 5: FIX + REGRESSION TEST (test first, then fix)
Phase 6: CLEANUP + POST-MORTEM (remove debug code, document root cause)
```

---

## § 4 · When to Use This Skill

**Use when:**
- A bug is hard to reproduce or intermittent
- A performance regression appeared without obvious cause
- The same bug keeps recurring
- An agent is stuck in a "try random things" loop
- You need to present root cause analysis to stakeholders

**Do NOT use when:**
- The bug is a typo or trivially obvious from the error message
- You need to design a new feature (use `to-prd`)
- The codebase is unfamiliar — run `zoom-out` first

---

## § 5 · Relationship to Other Skills

| Skill | When to reach for it |
|-------|---------------------|
| `zoom-out` | Unfamiliar codebase — map it before Phase 1 |
| `tdd-workflow` | Once the bug is fixed — retrofit the regression test into the test suite |
| `architecture-review` | Post-mortem reveals a structural issue |

Related Skills

write-skill

from theneoai/awesome-skills

Meta-skill for creating high-quality SKILL.md files. Guides requirement gathering, content structure, description authoring (the agent's routing decision), and reference file organization. Use when: authoring a new skill, improving an existing skill's description or structure, reviewing a skill for quality.

caveman

from theneoai/awesome-skills

Ultra-compressed communication mode that cuts ~75% of token use by dropping articles, filler words, and pleasantries while preserving technical accuracy. Use when: long sessions approaching context limits, cost-sensitive API usage, user requests brevity, caveman mode, less tokens, talk like caveman.

zoom-out

from theneoai/awesome-skills

Codebase orientation skill: navigate unfamiliar code by ascending abstraction layers to map modules, callers, and domain vocabulary. Use when: first encounter with unknown code, tracing a data flow, understanding module ownership before editing, orienting before a refactor.

to-prd

from theneoai/awesome-skills

Converts conversation context into a structured Product Requirements Document (PRD) and publishes it to the project issue tracker. Do NOT interview the user — synthesize what is already known. Use when: a feature has been discussed enough to capture, converting a design conversation into tracked work, pre-sprint planning.

tdd-workflow

from theneoai/awesome-skills

Test-driven development workflow using vertical slices (tracer bullets). Enforces behavior-first testing through public interfaces. Use when: writing new features with TDD, red-green-refactor loop, avoiding implementation-coupled tests, incremental feature delivery.

issue-triage

from theneoai/awesome-skills

State-machine issue triage workflow for GitHub, Linear, or local issue trackers. Manages category labels (bug, enhancement) and state labels (needs-triage, needs-info, ready-for-agent, ready-for-human, wontfix). Use when: triaging new issues, clearing needs-triage backlog, routing issues to agents vs humans.

architecture-review

from theneoai/awesome-skills

Codebase architecture review using module depth analysis. Surfaces shallow modules, tight coupling, and locality violations. Proposes deepening opportunities. Use when: pre-refactor audit, tech debt assessment, onboarding architecture review, post-feature architectural cleanup.

vault-secrets-expert

from theneoai/awesome-skills

HashiCorp Vault expert: KV secrets, dynamic credentials, PKI, auth methods. Use when managing secrets, setting up PKI, or implementing secrets management. Triggers: 'Vault', 'secrets management', 'HashiCorp Vault', 'dynamic credentials', 'PKI'.

nmap-expert

from theneoai/awesome-skills

Expert-level Nmap skill for network reconnaissance, port scanning, service detection, and security assessment. Triggers: 'Nmap', '网络扫描', '端口扫描', 'NSE脚本'. Works with: Claude Code, Codex, OpenCode, Cursor, Cline, OpenClaw, Kimi.

metasploit-expert

from theneoai/awesome-skills

Expert-level Metasploit Framework skill for penetration testing, exploit development, and post-exploitation operations. Triggers: 'Metasploit', '渗透测试', '红队', '漏洞利用'. Works with: Claude Code, Codex, OpenCode, Cursor, Cline, OpenClaw, Kimi.

gerrit-permission-manager

from theneoai/awesome-skills

Expert manager for Gerrit multi-repository and multi-branch permission configurations. Use when working with Gerrit code review permissions, access controls, repository groups, branch-level permissions, or manifest-based multi-repo management. Use when: gerrit, permissions, code-review, access-control, devops.

container-security-expert

from theneoai/awesome-skills

Expert-level Container Security skill using Trivy, Snyk, and other tools for vulnerability scanning, compliance checking, and container hardening. Triggers: '容器安全', '漏洞扫描', 'Trivy', 'Docker安全', 'K8s安全'.