nw-property-based-testing
Property-based testing strategies, mutation testing, shrinking, and combined PBT+mutation workflow for test quality validation
Best use case
nw-property-based-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using nw-property-based-testing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/nw-property-based-testing/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How nw-property-based-testing Compares
| Feature / Agent | nw-property-based-testing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Property-Based Testing and Mutation Testing
> Deferred to Phase 2.25: Mutation testing runs ONCE per feature as final quality gate at orchestrator Phase 2.25 (after all steps complete). Do NOT run mutation testing during inner TDD loop.
## Property-Based Testing (PBT)
Instead of examples ("given X, expect Y"), write properties ("for all valid inputs, condition Z holds").
Framework generates hundreds or thousands of inputs and checks the property against each one. Exercises far more of the input space than hand-picked examples.
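A minimal sketch of the contrast, using only the Python standard library rather than a real PBT framework. `clamp` and both test names are hypothetical; a framework such as Hypothesis would replace the hand-written generation loop and add shrinking.

```python
import random


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


def test_clamp_example():
    # Example-based: one hand-picked input, one expected output.
    assert clamp(15, 0, 10) == 10


def test_clamp_property(runs=500, seed=0):
    # Property-based: for all generated inputs, the result stays in range.
    rng = random.Random(seed)
    for _ in range(runs):
        low = rng.randint(-1000, 1000)
        high = rng.randint(low, 1000)  # keeps low <= high, so the input is valid
        value = rng.randint(-10_000, 10_000)
        result = clamp(value, low, high)
        assert low <= result <= high, (value, low, high, result)


test_clamp_example()
test_clamp_property()
```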
## Property Patterns
1. **Invariants**: "for all inputs, condition holds" (sorted list is ordered, balance >= 0)
2. **Roundtrip**: "encode then decode = original" (serialize/deserialize, compress/decompress)
3. **Oracle**: "compare against reference implementation" (optimized vs correct-but-slow)
4. **Metamorphic**: "related inputs produce outputs with a known relation" (add(a,b)==add(b,a), filter can't increase size)
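The four patterns above can be sketched in one loop. This is an illustration with a hand-rolled random generator from the standard library, not a framework API; `random_int_list`, `slow_max`, `fast_max`, and `check_patterns` are names invented for the example.

```python
import json
import random


def random_int_list(rng, max_len=20):
    """Hypothetical generator: a list of small integers of random length."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]


def slow_max(xs):
    """Correct-but-slow reference used as the oracle."""
    return sorted(xs)[-1]


def fast_max(xs):
    """Single-pass implementation being checked against the oracle."""
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


def check_patterns(runs=300, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        xs = random_int_list(rng)

        # 1. Invariant: a sorted list is ordered.
        ys = sorted(xs)
        assert all(a <= b for a, b in zip(ys, ys[1:]))

        # 2. Roundtrip: encode then decode returns the original.
        assert json.loads(json.dumps(xs)) == xs

        # 3. Oracle: optimized version agrees with the reference.
        if xs:
            assert fast_max(xs) == slow_max(xs)

        # 4. Metamorphic: swapping arguments keeps the sum; filtering never grows the list.
        a, b = rng.randint(-100, 100), rng.randint(-100, 100)
        assert a + b == b + a
        assert len([x for x in xs if x % 2 == 0]) <= len(xs)
    return runs


check_patterns()
```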
## Shrinking
When property fails, framework auto-finds minimal failing input. Dramatically accelerates debugging.
Algorithm: find failing input -> try simpler variants -> if still fails, use as new candidate -> repeat.
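The loop described above can be sketched as a greedy search over lists of integers. Real frameworks use more sophisticated strategies; `simpler_variants` and `shrink` are hypothetical helpers, and the failing property ("no element reaches 10") is made up for the illustration.

```python
def simpler_variants(xs):
    """Yield candidates that are simpler than xs."""
    # Structural shrink: drop one element at a time.
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    # Value shrink: move one element toward zero, by halving or by one step.
    for i, x in enumerate(xs):
        if x != 0:
            step = x - 1 if x > 0 else x + 1
            for smaller in (int(x / 2), step):
                yield xs[:i] + [smaller] + xs[i + 1:]


def shrink(failing_input, fails):
    """Repeat: try simpler variants; if one still fails, adopt it as the new candidate."""
    current = failing_input
    improved = True
    while improved:
        improved = False
        for candidate in simpler_variants(current):
            if fails(candidate):
                current = candidate
                improved = True
                break
    return current


def violates_property(xs):
    # The property "every element is below 10" fails for this input.
    return any(x >= 10 for x in xs)


print(shrink([3, 57, 8, 12], violates_property))  # prints [10]
```

The search stops at `[10]` because every simpler variant (`[]`, `[5]`, `[9]`) satisfies the property again.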
## PBT Tools by Language
| Language | Framework |
|----------|-----------|
| Python | Hypothesis |
| JavaScript/TypeScript | fast-check |
| Haskell | QuickCheck |
| Rust | quickcheck |
| Java | jqwik |
| C# | FsCheck |
Adopted by Amazon, Volvo, Stripe, Jane Street (ICSE 2024 study).
## When PBT Adds Value
HIGH value: algorithms | data structures | serialization | business rules (validation, calculations) | protocols/state machines.
LOW value: simple CRUD | UI logic | external API integrations.
PBT complements example-based testing, doesn't replace it.
## PBT + TDD Integration
1. Start with example-based TDD for specific cases (drives detailed design)
2. Once basic implementation works, write properties to generalize
3. If property fails: found a bug, implementation needs refining, or the property itself is wrong
4. Refactor freely - properties verify behavior preservation
Properties = higher-level spec that survives refactoring better than examples.
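A sketch of steps 1 and 2 for a hypothetical run-length encoder: examples pin down specific cases first, then a roundtrip property generalizes them. The function names and the generated alphabet are assumptions made for the illustration, and the generation loop stands in for a PBT framework.

```python
import random
from itertools import groupby


def rle_encode(text):
    """Hypothetical implementation driven out by the examples below."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


def rle_decode(pairs):
    return "".join(ch * count for ch, count in pairs)


def test_examples():
    # Step 1: example-based tests for specific cases.
    assert rle_encode("aaab") == [("a", 3), ("b", 1)]
    assert rle_encode("") == []
    assert rle_decode([("a", 3), ("b", 1)]) == "aaab"


def test_roundtrip_property(runs=300, seed=0):
    # Step 2: generalize. Decoding an encoding returns the original text.
    rng = random.Random(seed)
    for _ in range(runs):
        text = "".join(rng.choice("ab ") for _ in range(rng.randint(0, 30)))
        assert rle_decode(rle_encode(text)) == text


test_examples()
test_roundtrip_property()
```

The roundtrip property says nothing about how the encoder is written, so it keeps passing through refactorings that preserve behavior.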
## Mutation Testing
Evaluates test suite quality by introducing artificial bugs (mutations) and checking if tests catch them.
Mutation score = killed mutants / total mutants. Stronger metric than code coverage: coverage shows which lines ran, mutation score shows whether the tests detect changed behavior.
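To make the formula concrete, here is a small sketch with hand-written mutants of a hypothetical `is_adult` function. Real tools generate mutants automatically; the mutants, suites, and `mutation_score` helper are all assumptions for the example.

```python
def is_adult(age):
    """Hypothetical function under test."""
    return age >= 18


# Hand-written mutants, each with one artificial bug.
MUTANTS = {
    ">= to >": lambda age: age > 18,
    ">= to <": lambda age: age < 18,
    "18 to 19": lambda age: age >= 19,
    "18 to 17": lambda age: age >= 17,
}


def weak_suite(fn):
    # Executes every line of is_adult but never probes the boundary.
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    # Adds boundary checks on both sides of 18.
    return weak_suite(fn) and fn(18) is True and fn(17) is False


def mutation_score(suite, mutants):
    # A mutant is killed when the suite fails against it.
    killed = sum(1 for mutant in mutants.values() if not suite(mutant))
    return killed / len(mutants)


print(mutation_score(weak_suite, MUTANTS))    # prints 0.25
print(mutation_score(strong_suite, MUTANTS))  # prints 1.0
```

Both suites give full line coverage of `is_adult`, yet only the second one notices the boundary mutants.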
## Mutation Score Targets
| Score | Quality |
|-------|---------|
| < 60% | Weak suite, significant gaps |
| 60-80% | Moderate, some gaps |
| > 80% | Strong, few gaps |
Target: 75-80% minimum. Not all survivors indicate bad tests (equivalent mutants exist).
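An equivalent mutant changes the code without changing observable behavior, so no test can kill it. A sketch under the assumption of integer inputs, with hypothetical names: replacing `>` with `>=` in a maximum search only swaps `best` for an equal value on ties.

```python
import random


def largest(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


def largest_mutant(xs):
    """Mutant: '>' changed to '>='. On ties, best is replaced by an equal value."""
    best = xs[0]
    for x in xs[1:]:
        if x >= best:
            best = x
    return best


def find_difference(runs=1000, seed=0):
    """Search for an input that distinguishes the mutant; None means none was found."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-5, 5) for _ in range(rng.randint(1, 8))]
        if largest(xs) != largest_mutant(xs):
            return xs
    return None


print(find_difference())  # prints None
```

A survivor like this lowers the score without pointing to a missing test, which is one reason a 100% target is not realistic.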
## Mutation Operators
Change == to != | + to - | remove method call | change constant | modify loop boundary | alter comparison.
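Two of these operators can be sketched with Python's `ast` module. This is an illustration of the idea, not how any particular tool works: `SwapOperators`, `SOURCE`, and `load` are hypothetical, and real tools typically create one mutant per change rather than applying every change at once.

```python
import ast


class SwapOperators(ast.NodeTransformer):
    """Apply two operators from the list above: + to - and == to !=."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.NotEq() if isinstance(op, ast.Eq) else op for op in node.ops]
        return node


SOURCE = """
def total(a, b):
    return a + b

def same(a, b):
    return a == b
"""


def load(tree):
    """Compile a module AST and return the namespace it defines."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace


original = load(ast.parse(SOURCE))
mutated_tree = ast.fix_missing_locations(SwapOperators().visit(ast.parse(SOURCE)))
mutant = load(mutated_tree)

print(original["total"](2, 3), mutant["total"](2, 3))  # prints 5 -1
print(original["same"](1, 1), mutant["same"](1, 1))    # prints True False
```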
## Mutation Testing Tools
| Language | Tool |
|----------|------|
| Java | PIT |
| JavaScript/TypeScript/C# | Stryker |
| Python | mutmut, Cosmic Ray |
Computationally expensive. Run incrementally: changed code only in PRs, full codebase weekly.
## Combined PBT + Mutation Workflow
1. Write example-based tests (TDD) -> cover known scenarios
2. Apply mutation testing -> identify assertion gaps -> write more tests
3. Add PBT for complex logic -> cover input space systematically
4. Mutation testing again -> verify properties are comprehensive
Quality ratchet: each technique exposes gaps others miss. Prioritize critical paths and complex algorithms.
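The four steps above can be sketched end to end on a hypothetical `apply_discount` rule. The mutants are written by hand and the property loop uses the standard library's `random` in place of a PBT framework; every name here is an assumption for the illustration.

```python
import random


def apply_discount(price, percent):
    """Hypothetical business rule: percent off, capped at 100%."""
    percent = min(percent, 100)
    return price - price * percent // 100


# Hand-written mutants standing in for a mutation tool's output.
MUTANTS = {
    "- to +": lambda price, percent: price + price * min(percent, 100) // 100,
    "cap 100 to 101": lambda price, percent: price - price * min(percent, 101) // 100,
    "cap removed": lambda price, percent: price - price * percent // 100,
}


def example_tests(fn):
    # Step 1: example-based tests for known scenarios.
    return fn(200, 50) == 100 and fn(80, 0) == 80


def property_tests(fn, runs=500, seed=0):
    # Step 3: for all prices and percents, the result stays within [0, price].
    rng = random.Random(seed)
    for _ in range(runs):
        price = rng.randint(0, 1000)
        percent = rng.randint(0, 200)
        if not 0 <= fn(price, percent) <= price:
            return False
    return True


def survivors(suite):
    """Names of mutants the suite fails to kill."""
    return [name for name, mutant in MUTANTS.items() if suite(mutant)]


# Step 2: mutation run against the examples exposes the untested cap.
print(survivors(example_tests))  # prints ['cap 100 to 101', 'cap removed']

# Step 4: mutation run again with the property added.
print(survivors(lambda fn: example_tests(fn) and property_tests(fn)))
```

With percents above 100 in the generated range, the second run is expected to report no survivors.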
## PBT Performance Guidance
- Fast feedback: ~100 examples | CI/CD: ~1000 examples | Nightly builds: ~10000+ examples
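One way to switch between the example counts above is an environment variable read at test start-up. This is a sketch: the `PBT_PROFILE` variable, the profile names, and `example_count` are hypothetical, and the returned number would be passed to whatever run-count setting the chosen framework provides (for instance `max_examples` in Hypothesis or `numRuns` in fast-check).

```python
import os

# Counts taken from the guidance above; profile names are assumptions.
EXAMPLE_COUNTS = {"dev": 100, "ci": 1000, "nightly": 10_000}


def example_count(env=None):
    """Pick the run count from PBT_PROFILE, defaulting to fast feedback."""
    env = os.environ if env is None else env
    profile = env.get("PBT_PROFILE", "dev")
    return EXAMPLE_COUNTS.get(profile, EXAMPLE_COUNTS["dev"])


print(example_count({"PBT_PROFILE": "ci"}))  # prints 1000
```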
Modern frameworks allow configuring example count per context.
Related Skills
nw-hexagonal-testing
5-layer agent output validation, I/O contract specification, vertical slice development, and test doubles policy with per-layer examples
nw-agent-testing
5-layer testing approach for agent validation including adversarial testing, security validation, and prompt injection resistance
nw-ux-web-patterns
Web UI design patterns for product owners. Load when designing web application interfaces, writing web-specific acceptance criteria, or evaluating responsive designs.
nw-ux-tui-patterns
Terminal UI and CLI design patterns for product owners. Load when designing command-line tools, interactive terminal applications, or writing CLI-specific acceptance criteria.
nw-ux-principles
Core UX principles for product owners. Load when evaluating interface designs, writing acceptance criteria with UX requirements, or reviewing wireframes and mockups.
nw-ux-emotional-design
Emotional design and delight patterns for product owners. Load when designing onboarding flows, empty states, first-run experiences, or evaluating the emotional quality of an interface.
nw-ux-desktop-patterns
Desktop application UI patterns for product owners. Load when designing native or cross-platform desktop applications, writing desktop-specific acceptance criteria, or evaluating panel layouts and keyboard workflows.
nw-user-story-mapping
User story mapping for backlog management and outcome-based prioritization. Load during Phase 2.5 (User Story Mapping) to produce story-map.md and prioritization.md.
nw-tr-review-criteria
Review dimensions and scoring for root cause analysis quality assessment
nw-tlaplus-verification
TLA+ formal verification for design correctness and PBT pipeline integration
nw-test-refactoring-catalog
Detailed refactoring mechanics with step-by-step procedures, and test code smell catalog with detection patterns and before/after examples
nw-test-organization-conventions
Test directory structure patterns by architecture style, language conventions, naming rules, and fixture placement. Decision tree for selecting test organization strategy.