nw-property-based-testing

Property-based testing strategies, mutation testing, shrinking, and combined PBT+mutation workflow for test quality validation

322 stars

by nWave-ai


Best use case

nw-property-based-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using nw-property-based-testing should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/nw-property-based-testing/SKILL.md --create-dirs "https://raw.githubusercontent.com/nWave-ai/nWave/main/nWave/skills/nw-property-based-testing/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/nw-property-based-testing/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How nw-property-based-testing Compares

Feature / Agent	nw-property-based-testing	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Property-based testing strategies, mutation testing, shrinking, and combined PBT+mutation workflow for test quality validation

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

SKILL.md Source

# Property-Based Testing and Mutation Testing

> Deferred to Phase 2.25: Mutation testing runs ONCE per feature as final quality gate at orchestrator Phase 2.25 (after all steps complete). Do NOT run mutation testing during inner TDD loop.

## Property-Based Testing (PBT)

Instead of examples ("given X, expect Y"), write properties ("for all valid inputs, condition Z holds").
The framework generates hundreds or thousands of inputs and checks the property against each one, exercising far more of the input space than hand-picked examples.

## Property Patterns
1. **Invariants**: "for all inputs, condition holds" (sorted list is ordered, balance >= 0)
2. **Roundtrip**: "encode then decode = original" (serialize/deserialize, compress/decompress)
3. **Oracle**: "compare against reference implementation" (optimized vs correct-but-slow)
4. **Metamorphic**: "related inputs or operations produce predictably related results" (add(a,b)==add(b,a), filter can't increase size)
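
The four patterns above can be sketched without any framework. The block below is a minimal illustration using only the standard library; `random_int_list`, `slow_max`, and `check_properties` are hypothetical names, and the hand-rolled loop stands in for what a framework such as Hypothesis would provide (input generation, shrinking, reporting).

```python
import json
import random

def random_int_list(rng, max_len=20):
    """Hypothetical generator: a list of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]

def slow_max(xs):
    """Reference ("oracle") implementation: obviously correct, not optimized."""
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best

def check_properties(runs=200, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        xs = random_int_list(rng)

        # 1. Invariant: a sorted list is ordered.
        s = sorted(xs)
        assert all(a <= b for a, b in zip(s, s[1:]))

        # 2. Roundtrip: decode(encode(x)) == x.
        assert json.loads(json.dumps(xs)) == xs

        # 3. Oracle: the built-in max agrees with the slow reference.
        if xs:
            assert max(xs) == slow_max(xs)

        # 4. Metamorphic: filtering never increases size; sum ignores order.
        assert len([x for x in xs if x > 0]) <= len(xs)
        assert sum(xs) == sum(reversed(xs))
    return runs
```

Calling `check_properties()` runs all four checks against 200 generated lists and raises `AssertionError` on the first violation.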

## Shrinking

When property fails, framework auto-finds minimal failing input. Dramatically accelerates debugging.
Algorithm: find failing input -> try simpler variants -> if still fails, use as new candidate -> repeat.
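
The loop described above can be sketched as a greedy search. This is a simplified illustration, not how any particular framework implements shrinking; `simpler_variants`, `shrink`, and the `all_below_ten` property are hypothetical names chosen for the example.

```python
def simpler_variants(xs):
    """Yield candidates simpler than xs: drop one element, or move one toward 0."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            halved = x // 2 if x > 0 else -(-x // 2)
            stepped = x - 1 if x > 0 else x + 1
            yield xs[:i] + [halved] + xs[i + 1:]
            yield xs[:i] + [stepped] + xs[i + 1:]

def shrink(failing, prop):
    """Repeatedly replace the failing input with a simpler input that still fails."""
    current = failing
    improved = True
    while improved:
        improved = False
        for candidate in simpler_variants(current):
            if not prop(candidate):   # still fails: keep it as the new candidate
                current = candidate
                improved = True
                break
    return current

def all_below_ten(xs):
    """Hypothetical property under test: every element is below 10."""
    return all(x < 10 for x in xs)

minimal = shrink([3, 57, 12, 8], all_below_ten)
print(minimal)  # [10]
```

The original failing input has four elements and two offenders; the shrunk input `[10]` points directly at the boundary that the property violates.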

## PBT Tools by Language

| Language | Framework |
|----------|-----------|
| Python | Hypothesis |
| JavaScript/TypeScript | fast-check |
| Haskell | QuickCheck |
| Rust | quickcheck |
| Java | jqwik |
| C# | FsCheck |

Adopted by Amazon, Volvo, Stripe, Jane Street (ICSE 2024 study).

## When PBT Adds Value
HIGH value: algorithms | data structures | serialization | business rules (validation, calculations) | protocols/state machines.
LOW value: simple CRUD | UI logic | external API integrations.
PBT complements example-based testing, doesn't replace it.

## PBT + TDD Integration
1. Start with example-based TDD for specific cases (drives detailed design)
2. Once basic implementation works, write properties to generalize
3. If a property fails: either a bug was found or the implementation (or the property itself) needs refining
4. Refactor freely - properties verify behavior preservation

Properties = higher-level spec that survives refactoring better than examples.
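
As an illustration of steps 1 and 2, the sketch below starts from example tests and then generalizes them into properties. `dedupe` is a hypothetical function under test, and the random loop is a stdlib stand-in for a PBT framework.

```python
import random

def dedupe(xs):
    """Hypothetical function under test: drop duplicates, keep first occurrences."""
    seen = set()
    out = []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def test_examples():
    # Step 1: concrete cases that drive the design.
    assert dedupe([]) == []
    assert dedupe([1, 1, 2]) == [1, 2]
    assert dedupe([3, 1, 3, 2, 1]) == [3, 1, 2]

def test_properties(runs=200, seed=0):
    # Step 2: generalize the examples into statements about all inputs.
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(0, 9) for _ in range(rng.randint(0, 30))]
        out = dedupe(xs)
        assert len(out) == len(set(out))   # no duplicates remain
        assert set(out) == set(xs)         # nothing lost, nothing invented
        assert dedupe(out) == out          # idempotent
```

The properties say nothing about how `dedupe` is implemented, so they keep passing if the loop is later replaced by a different algorithm with the same behavior.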

## Mutation Testing

Evaluates test suite quality by introducing artificial bugs (mutations) and checking if tests catch them.
Mutation score = killed mutants / total mutants. Stronger metric than code coverage, which shows only that code ran, not that any assertion checked its result.

## Mutation Score Targets

| Score | Quality |
|-------|---------|
| < 60% | Weak suite, significant gaps |
| 60-80% | Moderate, some gaps |
| > 80% | Strong, few gaps |

Target: 75-80% minimum. Not all survivors indicate bad tests (equivalent mutants exist).
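
The score formula and the bands in the table can be written down directly. This is a small sketch; the function names and band labels are illustrative, and a score of exactly 80% is treated as part of the 60-80% band.

```python
def mutation_score(killed, total):
    """Mutation score as a percentage: killed mutants / total mutants."""
    if total <= 0:
        raise ValueError("total mutants must be positive")
    return 100.0 * killed / total

def classify(score):
    """Map a percentage score onto the quality bands from the table."""
    if score < 60:
        return "weak"
    if score <= 80:
        return "moderate"
    return "strong"

score = mutation_score(killed=42, total=50)
print(score, classify(score))  # 84.0 strong
```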

## Mutation Operators
Change == to != | + to - | remove method call | change constant | modify loop boundary | alter comparison.
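
To make the operators concrete, the sketch below applies three of them by hand to a one-line function and checks which mutants each test suite kills. Everything here is hypothetical illustration: real tools such as PIT or mutmut generate and run the mutants automatically.

```python
def original(age):
    return age >= 18

# Hand-written mutants, one per operator.
MUTANTS = {
    "alter comparison (>= to >)": lambda age: age > 18,
    "change constant (18 to 19)": lambda age: age >= 19,
    "negate comparison (>= to <)": lambda age: age < 18,
}

def suite_passes(fn, cases):
    """True if fn returns the expected value for every (input, expected) case."""
    return all(fn(age) == expected for age, expected in cases)

def killed_mutants(cases):
    # The suite must be green on the real code before mutants are judged.
    assert suite_passes(original, cases)
    return sorted(name for name, m in MUTANTS.items() if not suite_passes(m, cases))

weak_suite = [(30, True), (5, False)]
strong_suite = weak_suite + [(18, True), (17, False)]

print(len(killed_mutants(weak_suite)))    # 1
print(len(killed_mutants(strong_suite)))  # 3
```

The weak suite never touches the boundary at 18, so the two boundary mutants survive; adding the boundary cases kills all three.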

## Mutation Testing Tools

| Language | Tool |
|----------|------|
| Java | PIT |
| JavaScript/TypeScript/C# | Stryker |
| Python | mutmut, Cosmic Ray |

Computationally expensive. Run incrementally: on changed code in PRs, and on the full codebase weekly.

## Combined PBT + Mutation Workflow
1. Write example-based tests (TDD) -> cover known scenarios
2. Apply mutation testing -> identify assertion gaps -> write more tests
3. Add PBT for complex logic -> cover input space systematically
4. Mutation testing again -> verify properties are comprehensive

Quality ratchet: each technique exposes gaps others miss. Prioritize critical paths and complex algorithms.
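
A compact sketch of the ratchet, under the assumption of a hypothetical `clamp` function and one hand-written mutant: the example tests pass on the mutant (it survives), and a property that compares against a simple oracle kills it.

```python
import random

def clamp(x, lo, hi):
    return max(lo, min(x, hi))

def mutant_clamp(x, lo, hi):
    # Hypothetical mutant: constant change on the upper bound.
    return max(lo, min(x, hi - 1))

def example_tests(fn):
    # Step 1: known scenarios only; the upper bound is never exercised.
    return fn(5, 0, 10) == 5 and fn(-3, 0, 10) == 0

def property_tests(fn, runs=500, seed=0):
    # Step 3: for all x and lo <= hi, the result equals the median of (lo, x, hi).
    rng = random.Random(seed)
    for _ in range(runs):
        lo = rng.randint(-20, 20)
        hi = rng.randint(lo, 20)
        x = rng.randint(-40, 40)
        if fn(x, lo, hi) != sorted((lo, x, hi))[1]:
            return False
    return True

# Step 2: the mutant survives the example tests, exposing a gap.
survives_examples = example_tests(mutant_clamp)
# Step 4: the same mutant is killed once the property is added.
survives_properties = property_tests(mutant_clamp)
```

Both suites pass on the real `clamp`; only the property distinguishes it from the mutant, which is the gap the second mutation run is meant to confirm closed.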

## PBT Performance Guidance
- Fast feedback: ~100 examples | CI/CD: ~1000 examples | Nightly builds: ~10000+ examples

Modern frameworks allow configuring example count per context.
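
One way to wire the counts above to the execution context is an environment variable. The sketch below is framework-agnostic and the `PBT_PROFILE` variable and profile names are assumptions made for illustration; Hypothesis, for example, offers settings profiles for the same purpose.

```python
import os

# Example counts per context, taken from the guidance above.
EXAMPLE_COUNTS = {"dev": 100, "ci": 1000, "nightly": 10000}

def example_count(env=None):
    """Pick the example count from a hypothetical PBT_PROFILE variable."""
    env = os.environ if env is None else env
    profile = env.get("PBT_PROFILE", "dev")
    if profile not in EXAMPLE_COUNTS:
        raise ValueError(f"unknown PBT profile: {profile!r}")
    return EXAMPLE_COUNTS[profile]
```

A local run with no variable set gets the fast-feedback count, while a CI job can export `PBT_PROFILE=ci` and a nightly job `PBT_PROFILE=nightly`.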

Related Skills

nw-hexagonal-testing

322

from nWave-ai/nWave

5-layer agent output validation, I/O contract specification, vertical slice development, and test doubles policy with per-layer examples

nw-agent-testing

322

from nWave-ai/nWave

5-layer testing approach for agent validation including adversarial testing, security validation, and prompt injection resistance

nw-ux-web-patterns

322

from nWave-ai/nWave

Web UI design patterns for product owners. Load when designing web application interfaces, writing web-specific acceptance criteria, or evaluating responsive designs.

nw-ux-tui-patterns

322

from nWave-ai/nWave

Terminal UI and CLI design patterns for product owners. Load when designing command-line tools, interactive terminal applications, or writing CLI-specific acceptance criteria.

nw-ux-principles

322

from nWave-ai/nWave

Core UX principles for product owners. Load when evaluating interface designs, writing acceptance criteria with UX requirements, or reviewing wireframes and mockups.

nw-ux-emotional-design

322

from nWave-ai/nWave

Emotional design and delight patterns for product owners. Load when designing onboarding flows, empty states, first-run experiences, or evaluating the emotional quality of an interface.

nw-ux-desktop-patterns

322

from nWave-ai/nWave

Desktop application UI patterns for product owners. Load when designing native or cross-platform desktop applications, writing desktop-specific acceptance criteria, or evaluating panel layouts and keyboard workflows.

nw-user-story-mapping

322

from nWave-ai/nWave

User story mapping for backlog management and outcome-based prioritization. Load during Phase 2.5 (User Story Mapping) to produce story-map.md and prioritization.md.

nw-tr-review-criteria

322

from nWave-ai/nWave

Review dimensions and scoring for root cause analysis quality assessment

nw-tlaplus-verification

322

from nWave-ai/nWave

TLA+ formal verification for design correctness and PBT pipeline integration

nw-test-refactoring-catalog

322

from nWave-ai/nWave

Detailed refactoring mechanics with step-by-step procedures, and test code smell catalog with detection patterns and before/after examples

nw-test-organization-conventions

322

from nWave-ai/nWave

Test directory structure patterns by architecture style, language conventions, naming rules, and fixture placement. Decision tree for selecting test organization strategy.