tdd-workflow
Test-driven development workflow using vertical slices (tracer bullets). Enforces behavior-first testing through public interfaces. Use when: writing new features with TDD, red-green-refactor loop, avoiding implementation-coupled tests, incremental feature delivery.
Best use case
tdd-workflow is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using tdd-workflow should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/tdd-workflow/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How tdd-workflow Compares
| Feature / Agent | tdd-workflow | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Test-driven development workflow using vertical slices (tracer bullets). Enforces behavior-first testing through public interfaces. Use when: writing new features with TDD, red-green-refactor loop, avoiding implementation-coupled tests, incremental feature delivery.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# TDD Workflow
## One-Liner
Drive feature development with tests that describe behavior — one tracer bullet at a time, never bulk-first.
---
## § 1 · Core Philosophy
Tests verify **behavior through public interfaces**, not implementation details. Code can be rewritten entirely; well-written tests should survive that rewrite.
**The critical anti-pattern: horizontal slicing.**
Writing all tests first, then all implementation, produces tests written against imagined behavior — they fail on contact with reality. Don't do it.
**The correct pattern: vertical slices (tracer bullets).**
Write one test → implement the minimum to pass it → repeat. Each cycle uses real learning from the previous one.
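To make the difference between a behavior test and an implementation-coupled test concrete, here is a minimal sketch. The `Cart` class and its methods are hypothetical names invented for illustration, and the tests assume pytest-style plain `assert` functions.

```python
class Cart:
    """Hypothetical shopping cart, used only for illustration."""

    def __init__(self) -> None:
        # Internal detail: this dict could later become a list or a database row.
        self._items = {}

    def add(self, sku: str, unit_price: int, quantity: int = 1) -> None:
        # Accumulate quantity per SKU; prices are integer cents.
        _, current_qty = self._items.get(sku, (unit_price, 0))
        self._items[sku] = (unit_price, current_qty + quantity)

    def total(self) -> int:
        return sum(price * qty for price, qty in self._items.values())


def test_total_sums_price_times_quantity():
    # Behavior test: uses only the public add() and total() methods,
    # so it stays green if the internal storage is rewritten.
    cart = Cart()
    cart.add("apple", unit_price=50, quantity=2)
    cart.add("pear", unit_price=30)
    assert cart.total() == 130


def test_coupled_to_storage_layout():
    # Anti-example: asserts on the private dict, so it would break
    # on a storage rewrite even though behavior is unchanged.
    cart = Cart()
    cart.add("apple", unit_price=50, quantity=2)
    assert cart._items == {"apple": (50, 2)}
```

Both tests pass today, but only the first one would be expected to survive a complete rewrite of `Cart`.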
---
## § 2 · Workflow
### Phase 1 — Plan
Before writing any code:
1. Confirm the public interface (function signatures, HTTP endpoints, component props) with the user
2. List prioritized behaviors from most critical to least
3. Identify what "done" looks like for each behavior
**Gate:** Interface agreed. Behavior list approved. No code written yet.
### Phase 2 — Tracer Bullet
1. Write **one** test describing the first behavior
2. The test must:
   - Call a public interface (not internal functions)
   - Assert on observable output (not internal state)
   - Read like a specification ("given X, when Y, then Z")
3. Run the test — confirm it **fails** for the right reason
4. Write the minimal implementation to make it pass
5. Run again — confirm green
**Gate:** Test is green. Test would survive a complete rewrite of the implementation.
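A single tracer bullet might look like the sketch below. The `slugify` function is a hypothetical public interface assumed to have been agreed in Phase 1; the test is written first, and the implementation is the least code that turns it green.

```python
# Step 1: one test for the first behavior, written before any implementation.
def test_slugify_lowercases_and_joins_words_with_hyphens():
    # given a title with mixed case and a space
    title = "Hello World"
    # when it is converted to a slug
    slug = slugify(title)
    # then the slug is lowercase with words joined by hyphens
    assert slug == "hello-world"


# Step 3 (red): against a stub that returned "", the test above would fail
# on the assertion itself, which is the "right reason" (not an import or
# syntax error).

# Step 4 (green): the minimal implementation that satisfies the one test.
def slugify(title: str) -> str:
    """Hypothetical public interface; handles only the behavior tested so far."""
    return title.lower().replace(" ", "-")
```

Punctuation and repeated whitespace are deliberately left unhandled here, since no test has asked for them yet.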
### Phase 3 — Incremental Loop
Repeat Phase 2 for each behavior in priority order:
```
for each behavior:
  → write test (public interface, behavior assertion)
  → confirm red (right reason)
  → implement minimum
  → confirm green
  → micro-refactor if obvious duplication
```
Never write more than one failing test at a time.
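As a sketch of where a few cycles of this loop might end up, the example below shows a hypothetical `slugify` function after three behaviors have been added one at a time. The function name and the chosen behaviors are assumptions for illustration; each test was the only failing test at the moment it was written.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical public interface, grown one behavior per cycle."""
    # Cycle 1: lowercase the title.
    # Cycle 2: drop characters that are not letters, digits, spaces, or hyphens.
    cleaned = re.sub(r"[^a-z0-9\s-]", "", title.lower())
    # Cycle 3: collapse runs of whitespace or hyphens and trim the ends.
    return re.sub(r"[\s-]+", "-", cleaned).strip("-")


def test_lowercases_and_joins_words_with_hyphens():  # cycle 1
    assert slugify("Hello World") == "hello-world"


def test_drops_punctuation():  # cycle 2
    assert slugify("Hello, World!") == "hello-world"


def test_collapses_and_trims_whitespace():  # cycle 3
    assert slugify("  Hello   World  ") == "hello-world"
```

Every test goes through `slugify` alone, so the regular expressions can be swapped for any other approach without touching the tests.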
### Phase 4 — Refactor
After all behaviors are covered:
1. Extract duplication into well-named helpers
2. Deepen modules: if a function does two conceptually separate things, split it
3. Rerun all tests — must stay green
4. If a refactor breaks a test, the test was coupled to implementation; fix the test
---
## § 3 · Test Quality Checklist
Before marking any test done, verify:
| Check | Pass Condition |
|-------|---------------|
| Public interface | Test calls the outward-facing API, not internals |
| Behavior assertion | Asserts what the system does, not how |
| Survives refactor | A complete internal rewrite would leave this test green |
| Reads like spec | A reader unfamiliar with the code can understand the intent |
| One behavior | Each test asserts exactly one behavior |
---
## § 4 · When to Use This Skill
**Use when:**
- Adding a new feature to an existing codebase
- Building a module from scratch with known requirements
- Fixing a bug (write a failing test for the bug first)
- Pair-programming with an agent on incremental delivery
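For the bug-fix case, a minimal sketch might look like this. The `average_rating` function and the reported bug (a `ZeroDivisionError` on an empty list) are hypothetical; the regression test is written first, seen to fail, and only then is the guard added.

```python
def average_rating(ratings: list) -> float:
    """Hypothetical function; the assumed bug was a crash on an empty list."""
    if not ratings:
        # Fix added only after the regression test below had failed.
        return 0.0
    return sum(ratings) / len(ratings)


def test_average_rating_of_no_ratings_is_zero():
    # Regression test: reproduces the reported bug through the public function.
    assert average_rating([]) == 0.0


def test_average_rating_of_some_ratings():
    # Existing behavior must stay green after the fix.
    assert average_rating([4, 5]) == 4.5
```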
**Do NOT use when:**
- Exploring an unfamiliar codebase (use `zoom-out` first)
- The interface is entirely unknown (use `grill-with-docs` first to clarify)
- Writing tests for existing untested code (use `debug-diagnose` to stabilize first)
---
## § 5 · Relationship to Other Skills
| Skill | When to reach for it |
|-------|---------------------|
| `zoom-out` | Before starting — map the codebase context |
| `debug-diagnose` | When a test catches a bug mid-cycle |
| `architecture-review` | After a feature ships — assess if new code introduced shallow modules |
| `to-prd` | To convert the behavior list into a tracked issue |
Related Skills
write-skill
Meta-skill for creating high-quality SKILL.md files. Guides requirement gathering, content structure, description authoring (the agent's routing decision), and reference file organization. Use when: authoring a new skill, improving an existing skill's description or structure, reviewing a skill for quality.
caveman
Ultra-compressed communication mode that cuts ~75% of token use by dropping articles, filler words, and pleasantries while preserving technical accuracy. Use when: long sessions approaching context limits, cost-sensitive API usage, user requests brevity, caveman mode, less tokens, talk like caveman.
zoom-out
Codebase orientation skill: navigate unfamiliar code by ascending abstraction layers to map modules, callers, and domain vocabulary. Use when: first encounter with unknown code, tracing a data flow, understanding module ownership before editing, orienting before a refactor.
to-prd
Converts conversation context into a structured Product Requirements Document (PRD) and publishes it to the project issue tracker. Do NOT interview the user — synthesize what is already known. Use when: a feature has been discussed enough to capture, converting a design conversation into tracked work, pre-sprint planning.
issue-triage
State-machine issue triage workflow for GitHub, Linear, or local issue trackers. Manages category labels (bug, enhancement) and state labels (needs-triage, needs-info, ready-for-agent, ready-for-human, wontfix). Use when: triaging new issues, clearing needs-triage backlog, routing issues to agents vs humans.
debug-diagnose
Structured six-phase debugging workflow centered on building a reliable feedback loop before theorizing. Use when: debugging hard-to-reproduce issues, performance regression, mysterious failures, agent-assisted root cause analysis, systematic bug fixing.
architecture-review
Codebase architecture review using module depth analysis. Surfaces shallow modules, tight coupling, and locality violations. Proposes deepening opportunities. Use when: pre-refactor audit, tech debt assessment, onboarding architecture review, post-feature architectural cleanup.
vault-secrets-expert
HashiCorp Vault expert: KV secrets, dynamic credentials, PKI, auth methods. Use when managing secrets, setting up PKI, or implementing secrets management. Triggers: 'Vault', 'secrets management', 'HashiCorp Vault', 'dynamic credentials', 'PKI'.
nmap-expert
Expert-level Nmap skill for network reconnaissance, port scanning, service detection, and security assessment. Triggers: 'Nmap', '网络扫描', '端口扫描', 'NSE脚本'. Works with: Claude Code, Codex, OpenCode, Cursor, Cline, OpenClaw, Kimi.
metasploit-expert
Expert-level Metasploit Framework skill for penetration testing, exploit development, and post-exploitation operations. Triggers: 'Metasploit', '渗透测试', '红队', '漏洞利用'. Works with: Claude Code, Codex, OpenCode, Cursor, Cline, OpenClaw, Kimi.
gerrit-permission-manager
Expert manager for Gerrit multi-repository and multi-branch permission configurations. Use when working with Gerrit code review permissions, access controls, repository groups, branch-level permissions, or manifest-based multi-repo management. Use when: gerrit, permissions, code-review, access-control, devops.
container-security-expert
Expert-level Container Security skill using Trivy, Snyk, and other tools for vulnerability scanning, compliance checking, and container hardening. Triggers: '容器安全', '漏洞扫描', 'Trivy', 'Docker安全', 'K8s安全'.