scientific-tdd

Pragmatic test-driven development for scientific code with numerical validation

by diegosouzapw

Best use case

scientific-tdd is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using scientific-tdd should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/scientific-tdd-majiayu000/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/scientific-tdd-majiayu000/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/scientific-tdd-majiayu000/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How scientific-tdd Compares

Feature / Agent	scientific-tdd	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Pragmatic test-driven development for scientific code with numerical validation

Where can I find the source code?

The source is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/devops/scientific-tdd-majiayu000/SKILL.md.

SKILL.md Source

# Scientific Test-Driven Development

## Overview

Pragmatic test-driven development for scientific code: write tests first for new features and complex changes, verify with tests for simple bug fixes.

**Core principle:** Tests before implementation for new behavior, tests verify implementation for known bugs.

**Announce at start:** "I'm using the scientific-tdd skill to implement this feature."

## When to Use This Skill

**MUST use for:**

- New features or algorithms
- Complex modifications to existing code
- Adding new mathematical models
- Implementing new likelihood functions or state transitions

**Can skip test-first for:**

- Simple bug fixes where existing tests already cover the behavior
- Documentation changes
- Refactoring with existing comprehensive tests (use safe-refactoring instead)

## Process Checklist

Copy to TodoWrite:

```
Scientific TDD Progress:
- [ ] Understand existing behavior (read code and tests)
- [ ] Write test capturing desired new behavior
- [ ] Run test to confirm RED (fails as expected)
- [ ] Implement minimal code to pass test
- [ ] Run test to confirm GREEN (passes)
- [ ] Run full test suite (check for regressions)
- [ ] Run numerical validation if mathematical code changed
- [ ] Run code-reviewer agent (and/or ux-reviewer when appropriate)
- [ ] Refactor if needed (keep tests green)
- [ ] Commit with descriptive message
```

## Detailed Steps

### Step 1: Understand Existing Behavior

Before writing new tests, understand current state:

- Read relevant source files
- Read existing tests for similar functionality
- Run existing tests to see current behavior
- Identify what needs to change

**Commands:**

```bash
# Find relevant tests
pytest --collect-only -q | grep <relevant_term>

# Run a specific test file (the absolute path is the original author's conda environment; substitute your own pytest)
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py -v
```

### Step 2: Write Failing Test (RED)

Write test that captures desired behavior:

**Test Structure:**

```python
import numpy as np


def test_descriptive_name_of_behavior():
    """Test that [specific behavior] works correctly.

    This test verifies that [explain what you're testing] when [condition].
    """
    # Arrange: Set up test data
    # (create_test_input, function_under_test, and expected_shape are placeholders)
    input_data = create_test_input()

    # Act: Call the function/method
    result = function_under_test(input_data)

    # Assert: Verify behavior
    assert result.shape == expected_shape
    # Probabilities sum to 1; rtol=0 makes atol the only tolerance applied
    assert np.allclose(result.sum(), 1.0, rtol=0, atol=1e-10)
```

**For mathematical code, verify:**

- Correct output shapes
- Mathematical invariants (probabilities sum to 1, matrices are stochastic)
- Expected numerical values (with appropriate tolerances)
- Edge cases (empty inputs, single element, boundary conditions)
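
The checks listed above might look like the following minimal sketch. `toy_transition_matrix` is a hypothetical stand-in written for this example, not a function from the codebase; it builds a row-normalized Gaussian random-walk matrix so there is something concrete to assert against.

```python
import numpy as np


def toy_transition_matrix(n_bins: int, variance: float) -> np.ndarray:
    """Hypothetical stand-in: row-stochastic Gaussian random-walk matrix."""
    positions = np.arange(n_bins, dtype=float)
    # Squared distance between every pair of bins, shape (n_bins, n_bins)
    sq_dist = (positions[:, None] - positions[None, :]) ** 2
    weights = np.exp(-0.5 * sq_dist / variance)
    # Normalize each row so it is a probability distribution
    return weights / weights.sum(axis=1, keepdims=True)


def test_transition_matrix_is_row_stochastic():
    matrix = toy_transition_matrix(n_bins=5, variance=2.0)
    assert matrix.shape == (5, 5)  # output shape
    assert np.all(matrix >= 0.0)  # probabilities are non-negative
    # Invariant: every row sums to 1 (rtol=0 so atol is the only tolerance)
    np.testing.assert_allclose(matrix.sum(axis=1), 1.0, rtol=0, atol=1e-10)


def test_transition_matrix_known_value():
    # Expected value derived by hand for the 2-bin case with variance 1
    matrix = toy_transition_matrix(n_bins=2, variance=1.0)
    expected_stay = 1.0 / (1.0 + np.exp(-0.5))
    np.testing.assert_allclose(matrix[0, 0], expected_stay, rtol=0, atol=1e-10)


def test_transition_matrix_single_bin_edge_case():
    # With one bin, the only possible transition is staying put
    matrix = toy_transition_matrix(n_bins=1, variance=2.0)
    np.testing.assert_allclose(matrix, [[1.0]], rtol=0, atol=1e-10)
```

Each test covers one of the four categories (shape and invariant share a test here only for brevity), which keeps a failure easy to attribute.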

### Step 3: Run Test - Confirm RED

**CRITICAL:** Test MUST fail before implementing:

```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py::test_name -v
```

**Expected output:** Test fails with clear error (function not defined, wrong output, etc.)

**If test passes:** The test isn't testing new behavior - reconsider what you're testing.

### Step 4: Implement Minimal Code

Write simplest code that makes test pass:

- Don't over-engineer
- Don't add features not tested
- Follow YAGNI (You Aren't Gonna Need It)
- Use existing patterns from codebase

**For scientific code:**

- Maintain numerical stability
- Use JAX operations where appropriate
- Follow existing conventions for shapes and broadcasting
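
As one illustration of what "maintain numerical stability" can mean in practice, the sketch below normalizes log-probabilities with the log-sum-exp trick. This is a general technique chosen for the example, not something the skill prescribes, and `normalize_log_probs` is a hypothetical helper. In a JAX codebase an existing routine such as `jax.scipy.special.logsumexp` would normally be preferable to hand-written code.

```python
import math


def normalize_log_probs(log_probs: list[float]) -> list[float]:
    """Return probabilities proportional to exp(log_probs), computed stably."""
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow
    max_log = max(log_probs)
    shifted = [math.exp(lp - max_log) for lp in log_probs]
    total = math.fsum(shifted)
    return [s / total for s in shifted]


# Inputs this large overflow a naive exp(): math.exp(1000.0) raises OverflowError
probs = normalize_log_probs([1000.0, 1001.0, 1002.0])
assert math.isclose(math.fsum(probs), 1.0, rel_tol=0.0, abs_tol=1e-10)

# Very negative inputs would underflow to 0/0 naively; the shift avoids that too
assert normalize_log_probs([-1000.0, -1000.0]) == [0.5, 0.5]
```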

### Step 5: Run Test - Confirm GREEN

```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest src/non_local_detector/tests/<test_file>.py::test_name -v
```

**Expected output:** Test passes

**If test fails:** Debug until it passes, then verify you're testing the right thing.

### Step 6: Run Full Test Suite

Check for regressions:

```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest -v
```

**Expected:** All tests pass, and the count equals the previous total plus the tests you just added.

**If new failures:** Your change broke something - fix before proceeding.

### Step 7: Numerical Validation (if applicable)

If you modified mathematical/algorithmic code:

**Use numerical-validation skill:**

```
@numerical-validation
```

This verifies:

- Mathematical invariants still hold
- Property-based tests pass
- Golden regression tests pass
- No unexpected numerical differences
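
The numerical-validation skill itself is not shown in this document, so the following is only a rough sketch of what two of these checks could look like: a seeded, hand-rolled property check (a stand-in for a property-based testing library) and a golden regression comparison. `softmax`, `GOLDEN_INPUT`, and `GOLDEN_OUTPUT` are hypothetical names, and the golden values are derived by hand from the definition rather than loaded from a stored file.

```python
import math
import random


def softmax(values: list[float]) -> list[float]:
    """Hypothetical function under validation."""
    max_value = max(values)
    exps = [math.exp(v - max_value) for v in values]
    total = math.fsum(exps)
    return [e / total for e in exps]


def check_invariants_on_random_inputs(n_trials: int = 200, seed: int = 0) -> None:
    """Property check: outputs are non-negative and sum to 1 for many inputs."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(n_trials):
        size = rng.randint(1, 10)
        values = [rng.uniform(-50.0, 50.0) for _ in range(size)]
        probs = softmax(values)
        assert all(p >= 0.0 for p in probs)
        assert math.isclose(math.fsum(probs), 1.0, rel_tol=0.0, abs_tol=1e-10)


# Golden regression: softmax([0, ln 3]) = [1/4, 3/4], derived by hand
GOLDEN_INPUT = [0.0, math.log(3.0)]
GOLDEN_OUTPUT = [0.25, 0.75]


def max_abs_difference_from_golden() -> float:
    """Largest elementwise deviation of the current output from the golden output."""
    result = softmax(GOLDEN_INPUT)
    return max(abs(r - g) for r, g in zip(result, GOLDEN_OUTPUT))


check_invariants_on_random_inputs()
assert max_abs_difference_from_golden() < 1e-12
```

Reporting the maximum absolute difference, rather than only pass or fail, makes it easier to judge whether a discrepancy is rounding noise or a real change in behavior.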

### Step 8: Refactor (optional)

If code can be improved while keeping tests green:

- Improve readability
- Extract reusable functions
- Optimize performance (but verify numerics don't change)
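
One way to verify that a performance refactor leaves the numerics unchanged is to keep the original implementation around as a reference and compare outputs on the same inputs. The sketch below assumes NumPy; both functions are hypothetical examples rather than code from the project.

```python
import numpy as np


def row_normalize_loop(weights: np.ndarray) -> np.ndarray:
    """Original, readable version: normalize one row at a time."""
    out = np.empty_like(weights, dtype=float)
    for i in range(weights.shape[0]):
        out[i] = weights[i] / weights[i].sum()
    return out


def row_normalize_vectorized(weights: np.ndarray) -> np.ndarray:
    """Refactored version: same computation expressed with broadcasting."""
    return weights / weights.sum(axis=1, keepdims=True)


rng = np.random.default_rng(seed=0)  # seeded so the comparison is reproducible
weights = rng.uniform(0.1, 1.0, size=(50, 20))

reference = row_normalize_loop(weights)
refactored = row_normalize_vectorized(weights)

# Tight absolute tolerance with rtol=0: flags any drift beyond rounding error
np.testing.assert_allclose(refactored, reference, rtol=0, atol=1e-12)
```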

**After each refactor:**

```bash
/Users/edeno/miniconda3/envs/non_local_detector/bin/pytest -v
```

### Step 9: Commit

```bash
git add <test_file> <implementation_file>
git commit -m "feat: add <feature description>

- Add test for <specific behavior>
- Implement <what you implemented>
- All tests passing (<N> tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>"
```

## Example Workflow

**Task:** Add new random walk transition with custom variance

```
1. Read: src/non_local_detector/continuous_state_transitions.py
2. Read: src/non_local_detector/tests/transitions/test_continuous_transitions.py
3. Write test: test_random_walk_custom_variance()
4. Run test: FAIL - "NotImplementedError: custom variance not supported"
5. Implement: Add variance parameter to RandomWalk class
6. Run test: PASS
7. Run full suite: 427 tests passed
8. Run numerical validation: All invariants hold
9. Commit: "feat: add custom variance support to RandomWalk"
```

## Integration with Other Skills

- **Before using this skill:** Often preceded by brainstorming or design discussion
- **Use with numerical-validation:** For mathematical code changes
- **After this skill:** May use safe-refactoring for cleanup
- **Alternative to this skill:** Use safe-refactoring if changing structure, not behavior

## Red Flags

**Don't:**

- Write implementation before test (except for documented bug fixes)
- Skip running test to see it fail
- Add untested code "for future use"
- Skip full test suite after implementation
- Commit failing tests
- Skip numerical validation for mathematical code

**Do:**

- Write descriptive test names
- Test one behavior per test
- Use appropriate numerical tolerances (1e-10 for probabilities)
- Run tests frequently
- Commit small, working increments
- Ask if unsure whether to use TDD for a specific change
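
One detail worth keeping in mind when choosing tolerances: NumPy's `np.allclose` combines `atol` with a relative tolerance `rtol` that defaults to `1e-5`, so passing only `atol=1e-10` does not by itself enforce a `1e-10` bound for values near 1. The sketch below demonstrates this with a made-up total that is off by `1e-7`.

```python
import numpy as np

total = 1.0 + 1e-7  # hypothetical probability sum that is off by 1e-7

# The default rtol=1e-5 dominates, so the 1e-7 error slips through
loose = np.allclose(total, 1.0, atol=1e-10)

# With rtol=0, atol is the only tolerance and the error is caught
strict = np.allclose(total, 1.0, rtol=0, atol=1e-10)

print(loose, strict)  # True False
```

A tolerance of `1e-10` also presumes double precision. If the code under test runs in float32, which is JAX's default unless 64-bit mode is enabled, that bound is below the precision of the data and a looser tolerance is likely needed.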
