Test-Driven Development — The Iron Law

> **Type:** Rigid (NO exceptions)

Best use case

Test-Driven Development — The Iron Law is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Test-Driven Development — The Iron Law can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/test-driven-development/SKILL.md --create-dirs "https://raw.githubusercontent.com/SufficientDaikon/archon/main/skills/test-driven-development/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/test-driven-development/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How Test-Driven Development — The Iron Law Compares

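| Feature / Agent | Test-Driven Development — The Iron Law | Standard Approach |
|-----------------|----------------------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |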

Frequently Asked Questions

What does this skill do?

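It enforces a rigid Red → Green → Refactor cycle with no exceptions: no production code is written without a failing test first. It triggers on any feature implementation or bugfix, before production code is written, and applies to every agent that writes code.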

Where can I find the source code?

You can find the source code on GitHub in the SufficientDaikon/archon repository, at skills/test-driven-development/SKILL.md.

SKILL.md Source

# Test-Driven Development — The Iron Law

> **Type:** Rigid (NO exceptions)  
> **Trigger:** Any feature implementation or bugfix, BEFORE writing production code  
> **Applies to:** Every agent that writes code

## Iron Law

```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```

## The Cycle: Red → Green → Refactor

### RED Phase
1. Write ONE failing test that describes the desired behavior
2. Run the test
3. **VERIFY it FAILS** — if it passes, your test is wrong or the feature already exists
4. Read the failure message — it should describe what's missing

### GREEN Phase
1. Write the MINIMAL code to make the test pass
2. No extra features, no "while I'm here" additions
3. Run ALL tests (not just the new one)
4. **VERIFY ALL pass** — if others broke, fix before continuing
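
A possible GREEN step for a test like "spaces become hyphens" might look like the sketch below. The names (`slugify`, `word_count`, `ALL_TEST_CASES`, `run_all_tests`) are hypothetical, and `word_count` stands in for code that already existed with its own test. The implementation does only what the one new test demands, and the runner executes every test case, not just the new one.

```python
import unittest


def word_count(text: str) -> int:
    # Pre-existing (hypothetical) production code with its own test below.
    return len(text.split())


def slugify(title: str) -> str:
    # Minimal code for the single behavior under test. Nothing here handles
    # punctuation, trimming, or unicode, because no test asks for it yet.
    return title.lower().replace(" ", "-")


class WordCountTest(unittest.TestCase):
    def test_counts_words(self):
        self.assertEqual(word_count("a b c"), 3)


class SlugifyTest(unittest.TestCase):
    def test_replaces_spaces_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")


# Every test case in the project, old and new.
ALL_TEST_CASES = [WordCountTest, SlugifyTest]


def run_all_tests() -> unittest.TestResult:
    """Run the full suite so a regression in older code is caught now."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case) for case in ALL_TEST_CASES
    )
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_all_tests()
    print("GREEN" if result.wasSuccessful() else "Fix failures before continuing")
```

In a real project the same check is usually a single run of the suite command (for example `pytest` with no arguments) instead of a hand-built list.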

### REFACTOR Phase
1. Clean up: remove duplication, improve names, extract helpers
2. Run ALL tests after EVERY refactor change
3. **Keep green** — if any test fails during refactor, revert immediately
4. Don't add features during refactor — that's a new RED cycle
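
As an illustration of a behavior-preserving refactor, the sketch below extracts a duplicated formatting expression into one helper while the tests stay untouched. All names (`format_usd`, `format_eur`, `_format_money`, `suite_is_green`) are hypothetical; the example assumes both public functions previously contained their own copy of the same f-string.

```python
import unittest


def _format_money(amount_cents: int, symbol: str) -> str:
    # Extracted helper: before the refactor, format_usd and format_eur each
    # carried their own copy of this expression.
    return f"{symbol}{amount_cents // 100}.{amount_cents % 100:02d}"


def format_usd(amount_cents: int) -> str:
    return _format_money(amount_cents, "$")


def format_eur(amount_cents: int) -> str:
    return _format_money(amount_cents, "€")


class MoneyFormatTest(unittest.TestCase):
    # Written in earlier RED cycles and NOT edited during the refactor.
    def test_usd(self):
        self.assertEqual(format_usd(1999), "$19.99")

    def test_eur(self):
        self.assertEqual(format_eur(500), "€5.00")


def suite_is_green() -> bool:
    """Run the unchanged tests; False means the refactor should be reverted."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MoneyFormatTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("Keep refactor" if suite_is_green() else "Revert immediately")
```

Supporting a new currency or a new rounding rule would be a feature, so it would start with a new failing test, not with an edit to `_format_money`.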

### REPEAT
Next behavior → new failing test → minimal code → refactor

## What "Delete" Means

If you wrote production code before the test:

1. **DELETE IT** — not "save for reference", not "I'll adapt it"
2. Don't look at it while writing the test
3. Start fresh: write the test first, THEN write the code
4. Delete means delete

## The 11-Entry Rationalization Table

| # | Excuse | Reality |
|---|--------|---------|
| 1 | "Too simple to test" | Simple code breaks too. Test takes 30 seconds. Write it. |
| 2 | "I'll test after" | Tests written after the code pass immediately, and a test you never saw fail proves nothing about your code. |
| 3 | "Already manually tested" | Manual ≠ systematic. No record, can't re-run, can't automate. |
| 4 | "Deleting X hours of work is wasteful" | Sunk cost fallacy. The code was written without design. |
| 5 | "TDD is dogmatic/purist" | TDD IS pragmatic. It finds bugs before commit, not after deploy. |
| 6 | "I just need to make a small fix" | Small fixes break things. Test proves the fix works AND nothing broke. |
| 7 | "The test framework isn't set up" | Set it up. That's step 0 before any code. |
| 8 | "I know this works" | You know what you THINK works. Tests prove what ACTUALLY works. |
| 9 | "Time pressure — we need this shipped" | Untested code ships bugs. Bug-fixing is slower than TDD. |
| 10 | "Legacy code doesn't have tests" | Write a characterization test first. Then change. |
| 11 | "It's just a config change" | Config changes break systems. Test the expected behavior. |
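
Entry 10 mentions characterization tests. A minimal sketch of one, assuming a hypothetical legacy function `legacy_discount` that has no tests and at least one surprising quirk: the test records what the code does today, quirks included, so that any later change in behavior becomes visible.

```python
import unittest


def legacy_discount(total: float, code: str) -> float:
    """Hypothetical untested legacy function, left exactly as found."""
    if code == "VIP":
        return total * 0.8
    if code == "vip":
        # Quirk: the lowercase code gives a different rate.
        return total * 0.9
    return total


class LegacyDiscountCharacterization(unittest.TestCase):
    def test_pins_current_behavior(self):
        # Assert what the code DOES today, not what it should do.
        self.assertAlmostEqual(legacy_discount(100.0, "VIP"), 80.0)
        self.assertAlmostEqual(legacy_discount(100.0, "vip"), 90.0)
        self.assertAlmostEqual(legacy_discount(100.0, "none"), 100.0)


def characterization_passes() -> bool:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        LegacyDiscountCharacterization
    )
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("Behavior pinned" if characterization_passes() else "Pin is wrong")
```

Unlike a RED-phase test, a characterization test is expected to pass on first run, because it describes existing behavior. The failing test comes next, for the change you actually want to make.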

## Red Flags

- Writing an implementation file before a test file
- A test that passes on first run (test is likely wrong)
- "Let me just get this working, then add tests"
- Tests that don't assert meaningful behavior
- Skipping the REFACTOR phase
- Running only the new test, not the full suite

## Framework Detection

Auto-detect and use the project's test framework:

| Language | Framework | Command |
|----------|-----------|---------|
| Python | pytest | `pytest` |
| JavaScript | jest/vitest | `npm test` |
| TypeScript | jest/vitest | `npm test` |
| Go | testing | `go test ./...` |
| Rust | cargo test | `cargo test` |
| Ruby | rspec | `rspec` |
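
The skill does not say how detection is performed. One plausible approach, sketched here purely as an assumption, is to look for well-known marker files in the project root and map them to the commands in the table. `MARKERS` and `detect_test_command` are hypothetical names, and the choice of marker files is illustrative, not taken from the skill.

```python
from pathlib import Path
from typing import Optional

# (marker file, command) pairs. The commands come from the table above; the
# marker files are an assumption about how a project might be recognized.
MARKERS = [
    ("pytest.ini", "pytest"),
    ("pyproject.toml", "pytest"),
    ("package.json", "npm test"),
    ("go.mod", "go test ./..."),
    ("Cargo.toml", "cargo test"),
    (".rspec", "rspec"),
    ("Gemfile", "rspec"),
]


def detect_test_command(project_root) -> Optional[str]:
    """Return the test command for the first marker found, else None."""
    root = Path(project_root)
    for marker, command in MARKERS:
        if (root / marker).is_file():
            return command
    return None


if __name__ == "__main__":
    command = detect_test_command(".")
    print(command or "No known test framework found; set one up first")
```

The first match wins, so a repository with several marker files (for example both `pyproject.toml` and `package.json`) would need a more careful rule than this sketch provides.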

## Integration

- **writing-plans** generates TDD-sized tasks (test → implement → verify)
- **subagent-driven-development** enforces TDD per task
- **verification-before-completion** ensures tests actually ran and passed
- **anti-rationalization synapse** catches TDD-skipping excuses

Related Skills

webapp-testing

7
from SufficientDaikon/archon

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

Vitest Unit Patterns

7
from SufficientDaikon/archon

> Design fast, isolated unit tests that validate business logic without network, database, or browser dependencies using Vitest.

ux-test-suite

7
from SufficientDaikon/archon

UX testing patterns including task completion testing, error recovery testing, usability scoring, cognitive load heuristics, and flow testing methodology. Use when testing user flows, evaluating usability, or creating UX test plans.

qa-test-planner

7
from SufficientDaikon/archon

Generate comprehensive test plans, manual test cases, regression test suites, and bug reports for QA engineers. Includes Figma MCP integration for design validation.

Finishing a Development Branch — The Closer

7
from SufficientDaikon/archon

> **Type:** Rigid

e2e-testing-patterns

7
from SufficientDaikon/archon

Master end-to-end testing with Playwright and Cypress to build reliable test suites that catch bugs, improve confidence, and enable fast deployment. Use when implementing E2E tests, debugging flaky tests, or establishing testing standards.

backend-development

7
from SufficientDaikon/archon

Backend API design, database architecture, microservices patterns, and test-driven development. Use for designing APIs, database schemas, or backend system architecture.

YAML Prompt Library

7
from SufficientDaikon/archon

> Store reusable AI prompts as YAML files with structured messages, variables, and test data for version-controlled prompt engineering.

writing-skills

7
from SufficientDaikon/archon

Use when creating new skills, editing existing skills, or verifying skills work before deployment

Writing Plans — TDD-Sized Task Breakdown

7
from SufficientDaikon/archon

> **Type:** Rigid process (follow structure exactly)

wireframing

7
from SufficientDaikon/archon

Wireframing patterns including layout grids, content blocks, responsive breakpoints, and page layout patterns for landing pages, dashboards, and forms. Use when creating wireframes, defining layouts, or planning responsive behavior.

windows-registry-editor

7
from SufficientDaikon/archon

Expert Windows Registry editor and optimizer via PowerShell. Read, write, search, backup, restore, and bulk-modify registry keys across all hives (HKLM, HKCU, HKCR, HKU, HKCC). Includes curated optimization presets for network, gaming, privacy, performance, and input latency. Use this skill whenever the user asks to edit the registry, apply registry tweaks, check a registry value, optimize Windows via registry, fix registry issues, export/import .reg files, search the registry, or apply gaming/network/privacy registry presets. Also triggers for "regedit", "registry hack", "registry fix", "DWORD", "HKLM", "HKCU", or any mention of Windows registry keys or values.