scientific-tdd
Pragmatic test-driven development for scientific code with numerical validation
Best use case
scientific-tdd is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using scientific-tdd should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/scientific-tdd-majiayu000/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How scientific-tdd Compares
| Feature / Agent | scientific-tdd | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It provides a pragmatic test-driven development workflow for scientific code, with numerical validation when mathematical code changes.
SKILL.md Source
# Scientific Test-Driven Development
## Overview
Pragmatic test-driven development for scientific code: write tests first for new features and complex changes, verify with tests for simple bug fixes.
**Core principle:** Write tests before implementation for new behavior; use tests to verify the implementation for known bugs.
**Announce at start:** "I'm using the scientific-tdd skill to implement this feature."
## When to Use This Skill
**MUST use for:**
- New features or algorithms
- Complex modifications to existing code
- Adding new mathematical models
- Implementing new likelihood functions or state transitions
**Can skip test-first for:**
- Simple bug fixes where existing tests already cover the behavior
- Documentation changes
- Refactoring with existing comprehensive tests (use safe-refactoring instead)
## Process Checklist
Copy to TodoWrite:
```
Scientific TDD Progress:
- [ ] Understand existing behavior (read code and tests)
- [ ] Write test capturing desired new behavior
- [ ] Run test to confirm RED (fails as expected)
- [ ] Implement minimal code to pass test
- [ ] Run test to confirm GREEN (passes)
- [ ] Run full test suite (check for regressions)
- [ ] Run numerical validation if mathematical code changed
- [ ] Run code-reviewer agent (and/or ux-reviewer when appropriate)
- [ ] Refactor if needed (keep tests green)
- [ ] Commit with descriptive message
```
## Detailed Steps
### Step 1: Understand Existing Behavior
Before writing new tests, understand current state:
- Read relevant source files
- Read existing tests for similar functionality
- Run existing tests to see current behavior
- Identify what needs to change
**Commands:**
```bash
# Find relevant tests
pytest --collect-only -q | grep <relevant_term>
# Run specific test file
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py -v
```
### Step 2: Write Failing Test (RED)
Write test that captures desired behavior:
**Test Structure:**
```python
import numpy as np


def test_descriptive_name_of_behavior():
    """Test that [specific behavior] works correctly.

    This test verifies that [explain what you're testing] when [condition].
    """
    # Arrange: set up test data (create_test_input is a placeholder)
    input_data = create_test_input()

    # Act: call the function/method (function_under_test is a placeholder)
    result = function_under_test(input_data)

    # Assert: verify behavior (expected_shape is a placeholder)
    assert result.shape == expected_shape
    # rtol=0.0 makes atol the only tolerance; the default rtol is 1e-5
    assert np.allclose(result.sum(), 1.0, rtol=0.0, atol=1e-10)  # Probabilities sum to 1
```
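A note on the tolerance in the assertion: `np.allclose(a, b)` accepts values when `|a - b| <= atol + rtol * |b|`, and `rtol` defaults to `1e-5`. Tightening only `atol` therefore still lets errors of about `1e-5` through when comparing against 1.0. The sketch below mirrors that documented criterion in plain Python so it runs without dependencies; `allclose_scalar` is an illustrative helper, not a NumPy function.

```python
def allclose_scalar(a, b, rtol=1e-5, atol=1e-8):
    """Scalar version of the documented np.allclose criterion."""
    return abs(a - b) <= atol + rtol * abs(b)


total = 1.0 + 5e-6  # a "probability sum" that is off by 5e-6

# Only atol is tightened: the default rtol still lets the error through.
loose = allclose_scalar(total, 1.0, atol=1e-10)

# rtol=0.0 makes atol the only tolerance, so the error is caught.
strict = allclose_scalar(total, 1.0, rtol=0.0, atol=1e-10)

print(loose, strict)
```

If a 1e-10 bound on probabilities is the intent, passing `rtol=0.0` alongside `atol=1e-10` is one way to enforce it.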
**For mathematical code, verify:**
- Correct output shapes
- Mathematical invariants (probabilities sum to 1, matrices are stochastic)
- Expected numerical values (with appropriate tolerances)
- Edge cases (empty inputs, single element, boundary conditions)
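The checklist above could be turned into tests roughly as follows. This is a minimal sketch in dependency-free Python: `normalize_rows` is a hypothetical function invented for illustration, and in the real codebase the inputs would be NumPy or JAX arrays instead of nested lists.

```python
import math


def normalize_rows(matrix):
    """Scale each row of a non-negative matrix so that it sums to 1.

    Hypothetical helper used only for illustration.
    """
    normalized = []
    for row in matrix:
        total = math.fsum(row)
        if total <= 0.0:
            raise ValueError("each row needs positive total mass")
        normalized.append([value / total for value in row])
    return normalized


def test_rows_are_stochastic():
    result = normalize_rows([[1.0, 3.0], [2.0, 2.0]])
    # Output shape matches the input shape
    assert len(result) == 2 and all(len(row) == 2 for row in result)
    # Invariant: every row is a probability distribution
    for row in result:
        assert all(value >= 0.0 for value in row)
        assert math.isclose(math.fsum(row), 1.0, rel_tol=0.0, abs_tol=1e-10)
    # Expected numerical value with an explicit tolerance
    assert math.isclose(result[0][0], 0.25, rel_tol=0.0, abs_tol=1e-10)


def test_edge_cases():
    # Single element: the only entry carries all the mass
    assert normalize_rows([[5.0]]) == [[1.0]]
    # Empty input: no rows in, no rows out
    assert normalize_rows([]) == []
    # Boundary condition: a zero-mass row cannot be normalized
    try:
        normalize_rows([[0.0, 0.0]])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero-mass row")


test_rows_are_stochastic()
test_edge_cases()
```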
### Step 3: Run Test - Confirm RED
**CRITICAL:** Test MUST fail before implementing:
```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py::test_name -v
```
**Expected output:** Test fails with clear error (function not defined, wrong output, etc.)
**If test passes:** The test isn't testing new behavior - reconsider what you're testing.
### Step 4: Implement Minimal Code
Write simplest code that makes test pass:
- Don't over-engineer
- Don't add features not tested
- Follow YAGNI (You Aren't Gonna Need It)
- Use existing patterns from codebase
**For scientific code:**
- Maintain numerical stability
- Use JAX operations where appropriate
- Follow existing conventions for shapes and broadcasting
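As one illustration of what "maintain numerical stability" can mean in practice, the sketch below normalizes log-probabilities with the log-sum-exp trick. It is written in plain Python so it runs anywhere; `normalize_log_probs` is a hypothetical name, and in JAX code `jax.scipy.special.logsumexp` offers the same stabilization.

```python
import math


def normalize_log_probs(log_probs):
    """Return probabilities from unnormalized log-probabilities.

    Subtracting the maximum before exponentiating (the log-sum-exp trick)
    keeps the largest term at exp(0) = 1 and avoids underflow to 0.0.
    """
    max_log = max(log_probs)
    shifted = [math.exp(lp - max_log) for lp in log_probs]
    total = math.fsum(shifted)
    return [value / total for value in shifted]


log_likelihoods = [-1000.0, -1001.0, -1002.0]

# Naive approach: every exp() underflows to 0.0, so dividing by the
# total would fail with a division by zero.
naive_total = sum(math.exp(lp) for lp in log_likelihoods)

# Stable approach: the result is a valid probability distribution.
probs = normalize_log_probs(log_likelihoods)
```

A test for this kind of code would typically include extreme inputs like the ones above, since they are where a naive implementation silently breaks.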
### Step 5: Run Test - Confirm GREEN
```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py::test_name -v
```
**Expected output:** Test passes
**If test fails:** Debug until it passes, then verify you're testing the right thing.
### Step 6: Run Full Test Suite
Check for regressions:
```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest -v
```
**Expected:** All tests pass (the previous count plus any tests you added)
**If new failures:** Your change broke something - fix before proceeding.
### Step 7: Numerical Validation (if applicable)
If you modified mathematical/algorithmic code:
**Use numerical-validation skill:**
```
@numerical-validation
```
This verifies:
- Mathematical invariants still hold
- Property-based tests pass
- Golden regression tests pass
- No unexpected numerical differences
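To make "golden regression test" concrete, here is a minimal sketch of the idea: store reference outputs once, then compare later runs against them. It does not describe how the numerical-validation skill itself is implemented; `compute_posterior`, `check_against_golden`, and the file name are hypothetical.

```python
import json
import math
import tempfile
from pathlib import Path


def compute_posterior(likelihood, prior):
    """Hypothetical function under test: normalized elementwise product."""
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = math.fsum(unnormalized)
    return [value / total for value in unnormalized]


def check_against_golden(result, golden_path, atol=1e-10):
    """Compare a result with stored reference values; record them if missing."""
    if not golden_path.exists():
        golden_path.write_text(json.dumps(result))
        return True  # First run only records the reference
    golden = json.loads(golden_path.read_text())
    return len(golden) == len(result) and all(
        math.isclose(r, g, rel_tol=0.0, abs_tol=atol)
        for r, g in zip(result, golden)
    )


with tempfile.TemporaryDirectory() as tmp:
    golden_file = Path(tmp) / "posterior_golden.json"
    result = compute_posterior([0.2, 0.5, 0.3], [0.5, 0.25, 0.25])
    first_run = check_against_golden(result, golden_file)   # records reference
    second_run = check_against_golden(result, golden_file)  # compares against it
    # A small numerical drift should be flagged as a difference
    drifted = [value + 1e-6 for value in result]
    drift_detected = not check_against_golden(drifted, golden_file)
```

In a real project the golden file would live in the repository next to the tests, so that any unexpected numerical difference shows up in review.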
### Step 8: Refactor (optional)
If code can be improved while keeping tests green:
- Improve readability
- Extract reusable functions
- Optimize performance (but verify numerics don't change)
**After each refactor:**
```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest -v
```
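When a refactor changes how a value is computed, one way to check that the numerics do not change is to keep the old implementation as a reference and compare outputs directly. The functions below are hypothetical examples, not code from the project.

```python
import math


def mean_variance_reference(values):
    """Original two-pass implementation, kept as the reference."""
    mean = math.fsum(values) / len(values)
    variance = math.fsum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def mean_variance_refactored(values):
    """Refactored single-pass version using Welford's algorithm."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for count, value in enumerate(values, start=1):
        delta = value - mean
        mean += delta / count
        m2 += delta * (value - mean)
    return mean, m2 / len(values)


def test_refactor_preserves_numerics():
    data = [0.1 * i for i in range(1, 101)]
    reference = mean_variance_reference(data)
    refactored = mean_variance_refactored(data)
    for expected, actual in zip(reference, refactored):
        assert math.isclose(actual, expected, rel_tol=1e-12, abs_tol=1e-12)


test_refactor_preserves_numerics()
```

The tolerance here is tight but nonzero: two correct algorithms can differ in the last few bits, so exact equality is usually too strict for floating-point refactors.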
### Step 9: Commit
```bash
git add <test_file> <implementation_file>
git commit -m "feat: add <feature description>

- Add test for <specific behavior>
- Implement <what you implemented>
- All tests passing (<N> tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>"
```
## Example Workflow
**Task:** Add new random walk transition with custom variance
```
1. Read: src/non_local_detector/continuous_state_transitions.py
2. Read: src/non_local_detector/tests/transitions/test_continuous_transitions.py
3. Write test: test_random_walk_custom_variance()
4. Run test: FAIL - "NotImplementedError: custom variance not supported"
5. Implement: Add variance parameter to RandomWalk class
6. Run test: PASS
7. Run full suite: 427 tests passed
8. Run numerical validation: All invariants hold
9. Commit: "feat: add custom variance support to RandomWalk"
```
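The test written in step 3 of this workflow might look something like the sketch below. The real `RandomWalk` class and its API are not shown in this document, so `random_walk_row` is a toy stand-in invented for illustration; only the shape of the test is the point.

```python
import math


def random_walk_row(n_bins, center, variance=1.0):
    """Toy stand-in: discretized Gaussian transition probabilities from `center`.

    Hypothetical; this is not the RandomWalk class from the codebase.
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    weights = [math.exp(-0.5 * (i - center) ** 2 / variance) for i in range(n_bins)]
    total = math.fsum(weights)
    return [w / total for w in weights]


def test_random_walk_custom_variance():
    narrow = random_walk_row(n_bins=21, center=10, variance=1.0)
    wide = random_walk_row(n_bins=21, center=10, variance=9.0)
    # Invariant: each row is a probability distribution
    for row in (narrow, wide):
        assert math.isclose(math.fsum(row), 1.0, rel_tol=0.0, abs_tol=1e-10)
    # Both distributions peak at the current position
    assert narrow.index(max(narrow)) == 10
    assert wide.index(max(wide)) == 10
    # New behavior: a larger variance spreads mass away from the center
    assert wide[10] < narrow[10]


test_random_walk_custom_variance()
```

Before the `variance` parameter exists, a test like this fails (RED); once the parameter is implemented, it passes (GREEN) while the sum-to-one invariant keeps guarding against normalization mistakes.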
## Integration with Other Skills
- **Before using this skill:** Often preceded by brainstorming or design discussion
- **Use with numerical-validation:** For mathematical code changes
- **After this skill:** May use safe-refactoring for cleanup
- **Alternative to this skill:** Use safe-refactoring if changing structure, not behavior
## Red Flags
**Don't:**
- Write implementation before test (except for documented bug fixes)
- Skip running test to see it fail
- Add untested code "for future use"
- Skip full test suite after implementation
- Commit failing tests
- Skip numerical validation for mathematical code
**Do:**
- Write descriptive test names
- Test one behavior per test
- Use appropriate numerical tolerances (1e-10 for probabilities)
- Run tests frequently
- Commit small, working increments
- Ask if unsure whether to use TDD for a specific change