characterization-test-generator

Generates tests that capture current behavior of existing code before refactoring. Use when you need a safety net before AI-assisted refactoring or modifying legacy code.

110 stars

by gustavscirulis

Best use case

characterization-test-generator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using characterization-test-generator should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/characterization-test-generator/SKILL.md --create-dirs "https://raw.githubusercontent.com/gustavscirulis/snapgrid/main/.claude/skills/skills/testing/characterization-test-generator/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/characterization-test-generator/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How characterization-test-generator Compares

Feature / Agent	characterization-test-generator	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Generates tests that capture current behavior of existing code before refactoring. Use when you need a safety net before AI-assisted refactoring or modifying legacy code.

Where can I find the source code?

The source code is on GitHub in the gustavscirulis/snapgrid repository, the same repository the installation command above downloads SKILL.md from.

SKILL.md Source

# Characterization Test Generator

Generate tests that document what existing code **actually does** — not what it should do. These tests capture current behavior so you can refactor with confidence, especially when using AI to modify code.

## When This Skill Activates

Use this skill when the user:
- Says "I need to refactor this" or "help me refactor"
- Wants to "add tests before changing code"
- Asks for "characterization tests" or "golden master tests"
- Says "I want to safely modify this class/module"
- Mentions "legacy code" or "untested code"
- Wants AI to refactor but is worried about breaking things

## Why This Matters for AI-Generated Code

AI refactoring is powerful but risky:
- AI doesn't remember **why** code works a certain way
- AI may "improve" code and break subtle behavior
- Characterization tests freeze current behavior as a regression suite
- If any test fails after refactoring, you know something changed

## Process

### Phase 1: Discover Code Under Test

```
Glob: **/*.swift (in the target module)
Grep: "class |struct |enum |protocol |func " (public API surface)
```

Identify:
- [ ] Public types and their public methods
- [ ] Input types and return types
- [ ] Side effects (network, disk, UserDefaults, notifications)
- [ ] Dependencies (injected or created internally)
- [ ] State mutations (properties that change)

### Phase 2: Classify Behavior

For each public method, classify:

| Category | Example | Test Strategy |
|----------|---------|---------------|
| Pure computation | `calculate(a, b) -> c` | Input/output pairs |
| State mutation | `addItem(_:)` modifies array | Before/after state checks |
| Async operation | `fetchData() async -> [Item]` | Mock dependencies, verify results |
| Side effect | `save()` writes to disk | Verify mock was called |
| Event emission | `delegate?.didUpdate()` | Capture delegate calls |
| Error path | Throws on invalid input | Verify correct error type |
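
To make the classification concrete, here is a minimal sketch of one type whose methods fall into three of the categories above. `Cart`, `CartError`, and every method name are hypothetical and exist only for illustration; they are not part of any real module.

```swift
import Foundation

// Hypothetical error type, Equatable so a test can compare thrown values.
enum CartError: Error, Equatable {
    case invalidQuantity(Int)
}

// Hypothetical type used only to illustrate the categories in the table.
final class Cart {
    private(set) var items: [String] = []

    // Pure computation: output depends only on inputs, so test input/output pairs.
    func total(subtotal: Double, taxRate: Double) -> Double {
        subtotal + subtotal * taxRate
    }

    // State mutation: changes `items`, so test the state before and after.
    func addItem(_ name: String) {
        items.append(name)
    }

    // Error path: throws on invalid input, so test the thrown error.
    func setQuantity(_ quantity: Int) throws {
        guard quantity > 0 else { throw CartError.invalidQuantity(quantity) }
    }
}
```

Classifying each method first tells you which of the templates in Phase 3 to reach for.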

### Phase 3: Generate Tests

#### Test Naming Convention

Characterization tests should be clearly labeled:

```swift
import Testing
@testable import YourApp

@Suite("Characterization: ItemManager")
struct ItemManagerCharacterizationTests {
    // Tests document CURRENT behavior, not ideal behavior
}
```

#### Template: Pure Computation

```swift
@Test("current behavior: calculates total with tax")
func calculatesTotal() {
    let calculator = PriceCalculator()

    let result = calculator.total(subtotal: 100.0, taxRate: 0.08)

    // Document actual behavior — even if the rounding seems wrong
    #expect(result == 108.0)
}
```
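
As an illustration of why the asserted value must come from the code rather than from expectation, the sketch below assumes a hypothetical `PriceCalculator` that truncates to whole cents. The implementation is invented for this example; the real calculator may behave differently.

```swift
import Foundation

// Hypothetical implementation that truncates to whole cents instead of rounding.
struct PriceCalculator {
    func total(subtotal: Double, taxRate: Double) -> Double {
        let raw = subtotal + subtotal * taxRate
        return (raw * 100).rounded(.down) / 100
    }
}

let calculator = PriceCalculator()
let result = calculator.total(subtotal: 19.99, taxRate: 0.0825)

// The untruncated total is 21.639175. Rounding to the nearest cent would give
// 21.64, but this implementation truncates, so a characterization test pins 21.63.
print(result == 21.63)
```

An input like `100.0` at `0.08` would not expose the truncation, so it is worth choosing inputs whose exact result falls between two cents.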

#### Template: State Mutation

```swift
@Test("current behavior: addItem appends and sorts by date")
func addItemSortsByDate() {
    let manager = ItemManager()
    let older = Item(title: "A", date: .distantPast)
    let newer = Item(title: "B", date: .now)

    manager.addItem(older)
    manager.addItem(newer)

    // Document actual ordering behavior
    #expect(manager.items.first?.title == "A")
    #expect(manager.items.last?.title == "B")
    #expect(manager.items.count == 2)
}
```

#### Template: Async with Dependencies

```swift
@Test("current behavior: loadItems returns cached data when offline")
func loadItemsOffline() async throws {
    let mockNetwork = MockNetworkClient(shouldFail: true)
    let mockCache = MockCache(items: [Item.sample])
    let service = ItemService(network: mockNetwork, cache: mockCache)

    let items = try await service.loadItems()

    // Document: falls back to cache when network fails
    #expect(items.count == 1)
    #expect(items.first?.title == Item.sample.title)
}
```
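
The template above assumes the service receives its dependencies through its initializer. A minimal sketch of what those pieces might look like follows. The protocols `NetworkClient` and `ItemCache`, the `OfflineError` type, the simplified `Item`, and the fallback logic inside `ItemService` are all assumptions made for illustration, not the actual code under test.

```swift
import Foundation

// Simplified, hypothetical model with only the field the test reads.
struct Item: Equatable {
    let title: String
    static let sample = Item(title: "Sample")
}

// Hypothetical seams that let a test replace the network and the cache.
protocol NetworkClient {
    func fetchItems() async throws -> [Item]
}

protocol ItemCache {
    func cachedItems() -> [Item]
}

struct OfflineError: Error {}

// Mock that fails on demand, so a test can force the offline path.
struct MockNetworkClient: NetworkClient {
    let shouldFail: Bool

    func fetchItems() async throws -> [Item] {
        if shouldFail { throw OfflineError() }
        return []
    }
}

// Mock that serves a fixed list of items.
struct MockCache: ItemCache {
    let items: [Item]

    func cachedItems() -> [Item] { items }
}

// Assumed behavior: fall back to the cache when the network call throws.
struct ItemService {
    let network: NetworkClient
    let cache: ItemCache

    func loadItems() async throws -> [Item] {
        do {
            return try await network.fetchItems()
        } catch {
            return cache.cachedItems()
        }
    }
}
```

If the real type creates its dependencies internally instead of accepting them, the characterization test may need to go through whatever seam already exists, since adding an initializer parameter is itself a change to the code.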

#### Template: Side Effects

```swift
@Test("current behavior: save writes to UserDefaults")
func saveWritesToDefaults() {
    let defaults = MockUserDefaults()
    let settings = SettingsManager(defaults: defaults)

    settings.setTheme(.dark)
    settings.save()

    // Document: saves as string, not as enum raw value
    #expect(defaults.lastSetValue as? String == "dark")
    #expect(defaults.lastSetKey == "app_theme")
}
```
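
`MockUserDefaults` in the template above is not a Foundation type. One possible shape for it is sketched below, assuming the code under test writes through a small protocol. `KeyValueStore`, `Theme`, and this `SettingsManager` are hypothetical names introduced for the example.

```swift
import Foundation

// Hypothetical abstraction over the single UserDefaults call the code makes.
protocol KeyValueStore: AnyObject {
    func set(_ value: Any?, forKey key: String)
}

// UserDefaults already has a method with this signature, so it conforms as-is.
extension UserDefaults: KeyValueStore {}

// Spy that records the most recent write instead of touching disk.
final class MockUserDefaults: KeyValueStore {
    private(set) var lastSetKey: String?
    private(set) var lastSetValue: Any?

    func set(_ value: Any?, forKey key: String) {
        lastSetKey = key
        lastSetValue = value
    }
}

// Hypothetical enum with an Int raw value, to match the template's comment.
enum Theme: Int {
    case light, dark
}

// Hypothetical manager that stores the case name as a string, not the raw value.
final class SettingsManager {
    private let defaults: KeyValueStore
    private var theme: Theme = .light

    init(defaults: KeyValueStore) {
        self.defaults = defaults
    }

    func setTheme(_ theme: Theme) {
        self.theme = theme
    }

    func save() {
        defaults.set(String(describing: theme), forKey: "app_theme")
    }
}
```

Recording only the last write keeps the spy small; a test that needs to check several writes could store an array of key/value pairs instead.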

#### Template: Error Paths

```swift
@Test("current behavior: throws on empty title")
func throwsOnEmptyTitle() {
    let validator = ItemValidator()

    #expect(throws: ValidationError.emptyField("title")) {
        try validator.validate(Item(title: "", date: .now))
    }
}
```
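
Passing an error value to `#expect(throws:)`, as the template above does, requires the error type to conform to `Equatable` so the thrown value can be compared. A minimal sketch of types that would satisfy the template, with `ValidationError`, `Item`, and `ItemValidator` written here as hypothetical stand-ins:

```swift
import Foundation

// Hypothetical error type; the associated value names the offending field.
// Equatable conformance is what allows comparing the thrown error by value.
enum ValidationError: Error, Equatable {
    case emptyField(String)
}

// Hypothetical model matching the initializer used in the template.
struct Item {
    let title: String
    let date: Date
}

// Hypothetical validator that rejects an empty title.
struct ItemValidator {
    func validate(_ item: Item) throws {
        if item.title.isEmpty {
            throw ValidationError.emptyField("title")
        }
    }
}
```

If the existing error type is not `Equatable`, asserting on the error type alone with `#expect(throws: ValidationError.self)` avoids having to modify the code under test.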

#### Template: Edge Cases

```swift
@Test("current behavior: handles nil optional gracefully")
func handlesNilOptional() {
    let parser = DataParser()

    let result = parser.parse(data: nil)

    // Document: returns empty array on nil, doesn't crash
    #expect(result.isEmpty)
}

@Test("current behavior: handles empty collection")
func handlesEmptyCollection() {
    let aggregator = StatsAggregator()

    let stats = aggregator.compute(values: [])

    // Document: returns zeroes, not NaN or crash
    #expect(stats.average == 0.0)
    #expect(stats.count == 0)
}
```
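
The Phase 2 table also lists event emission, with "capture delegate calls" as the strategy. A minimal sketch of that approach uses a spy delegate that counts calls. `ItemManagerDelegate`, `SpyDelegate`, and this simplified `ItemManager` are hypothetical and only show the pattern.

```swift
import Foundation

// Hypothetical delegate protocol, for illustration only.
protocol ItemManagerDelegate: AnyObject {
    func didUpdate()
}

// Simplified, hypothetical emitter that notifies its delegate on every add.
final class ItemManager {
    weak var delegate: ItemManagerDelegate?
    private(set) var items: [String] = []

    func addItem(_ title: String) {
        items.append(title)
        delegate?.didUpdate()
    }
}

// Spy that counts calls so a test can assert how often the event fired.
final class SpyDelegate: ItemManagerDelegate {
    private(set) var didUpdateCallCount = 0

    func didUpdate() {
        didUpdateCallCount += 1
    }
}
```

In a test, hold the spy in a local constant for the duration of the test, because the `weak` delegate reference does not keep it alive, then assert on `didUpdateCallCount` after exercising the manager.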

### Phase 4: Fill in Actual Values

This is the key step. For each test:

1. **Read the source code** to understand what the method actually returns
2. **Trace the logic** for the given inputs
3. **Write the assertion with the actual value**, even if it seems wrong

```swift
// ❌ Wrong — this is what you WANT it to do
#expect(result == expectedCorrectValue)

// ✅ Right — this is what it ACTUALLY does
#expect(result == actualCurrentValue)  // Note: off-by-one, but current behavior
```

**Add comments for surprising behavior:**
```swift
// CHARACTERIZATION: This returns 11 not 10 due to inclusive range.
// Don't "fix" this until intentionally changing behavior.
#expect(range.count == 11)
```
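
The inclusive-range case in the comment above can be checked directly. This small sketch uses illustrative bounds of 0 and 10 to show where the count of 11 comes from:

```swift
// Closed range: includes both bounds, so 0...10 has 11 elements.
let inclusive = 0...10

// Half-open range: excludes the upper bound, so 0..<10 has 10 elements.
let halfOpen = 0..<10

print(inclusive.count) // 11
print(halfOpen.count)  // 10
```

A characterization test records the count the code produces today, whichever operator the original author chose.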

### Phase 5: Verify and Lock

1. **Run all characterization tests** — they must ALL pass
2. **If any fail**, adjust assertions to match actual behavior
3. **Tag them** so they're easy to find later:

```swift
// Tags are applied as a trait on @Suite; @Tag is only used to declare a tag
@Suite("Characterization: ItemManager", .tags(.characterization))
struct ItemManagerCharacterizationTests { ... }

// Define the tag
extension Tag {
    @Tag static var characterization: Self
}
```

4. **Run selectively:**
```bash
# Run only the ItemManager characterization suite
xcodebuild test -scheme YourApp \
  -only-testing "YourAppTests/ItemManagerCharacterizationTests"
```

## Output Format

```markdown
## Characterization Tests Generated

**Module**: [Module name]
**Classes tested**: [List]
**Tests generated**: [Count]

### Coverage Summary

| Class | Methods Covered | Edge Cases | Notes |
|-------|----------------|------------|-------|
| ItemManager | 5/7 | 3 | 2 private methods skipped |
| PriceCalculator | 3/3 | 2 | All public API covered |

### Files Created
- `Tests/CharacterizationTests/ItemManagerCharacterizationTests.swift`
- `Tests/CharacterizationTests/PriceCalculatorCharacterizationTests.swift`

### Surprising Behaviors Found
- `PriceCalculator.total()` rounds DOWN, not to nearest cent
- `ItemManager.sort()` is unstable — equal dates may reorder

### Ready to Refactor
All [X] characterization tests passing. Safe to refactor with AI.
Run `xcodebuild test` after each change to verify no behavior changed.
```

## When NOT to Use This

- Code is already well-tested (check coverage first)
- You're deleting the code entirely (no need to characterize)
- The code is trivially simple (single-line computed properties)
- You **want** to change the behavior (use `tdd-bug-fix` instead)

## References

- Michael Feathers, *Working Effectively with Legacy Code* — coined "characterization test"
- `generators/test-generator/` — for standard test generation (not characterization)
- `testing/tdd-refactor-guard/` — pre-refactor checklist that uses these tests

Related Skills

test-data-factory

110

from gustavscirulis/snapgrid

Generate test fixture factories for your models. Builder pattern and static factories for zero-boilerplate test data. Use when tests need sample data setup.

test-contract

110

from gustavscirulis/snapgrid

Generate protocol/interface test suites that any implementation must pass. Define the contract once, test every implementation. Use when designing protocols or swapping implementations.

snapshot-test-setup

110

from gustavscirulis/snapgrid

Set up SwiftUI visual regression testing with swift-snapshot-testing. Generates snapshot test boilerplate and CI configuration. Use for UI regression prevention.

integration-test-scaffold

110

from gustavscirulis/snapgrid

Generate cross-module test harness with mock servers, in-memory stores, and test configuration. Use when testing networking + persistence + business logic together.

testing

110

from gustavscirulis/snapgrid

TDD and testing skills for iOS/macOS apps. Covers characterization tests, TDD workflows, test contracts, snapshot tests, and test infrastructure. Use for test-driven development, adding tests to existing code, or building test infrastructure.

test-spec

110

from gustavscirulis/snapgrid

Generates comprehensive test specification with unit tests, UI tests, accessibility testing, and beta testing plan. Creates TEST_SPEC.md from PRD and implementation specs. Use when creating QA strategy.

prd-generator

110

from gustavscirulis/snapgrid

Generates comprehensive Product Requirements Document from product plan. Creates PRD.md with features, user stories, acceptance criteria, and success metrics. Use when creating product requirements.

idea-generator

110

from gustavscirulis/snapgrid

Brainstorm and rank iOS/macOS app ideas tailored to developer skills. Use when user says "what should I build", "give me app ideas", "I don't know what to build", "brainstorm app ideas", or "help me find an app idea".

beta-testing

110

from gustavscirulis/snapgrid

Beta testing strategy for iOS/macOS apps. Covers TestFlight program setup, beta tester recruitment, feedback collection methodology, user interviews, signal-vs-noise interpretation, and go/no-go launch readiness decisions. Use when planning a beta, setting up TestFlight, collecting user feedback, or deciding if ready to launch.

widget-generator

110

from gustavscirulis/snapgrid

Generate WidgetKit widgets for iOS/macOS home screen and lock screen with timeline providers, interactive elements, and App Intent configuration. Use when adding widgets to an app.

tipkit-generator

110

from gustavscirulis/snapgrid

Generate TipKit infrastructure with inline/popover tips, rules, display frequency, and testing utilities. Use when adding contextual tips or feature discovery to an iOS/macOS app.

test-generator

110

from gustavscirulis/snapgrid

Generate test templates for unit tests, integration tests, and UI tests using Swift Testing and XCTest. Use when adding tests to iOS/macOS apps.