simulation-validator

Validate simulations before, during, and after execution. Use for pre-flight checks, runtime monitoring, post-run validation, diagnosing failed simulations, checking convergence, detecting NaN/Inf, or verifying mass/energy conservation.

1,802 stars

Best use case

simulation-validator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using simulation-validator should expect more consistent output, faster repeated runs, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/simulation-validator/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/simulation-validator/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/simulation-validator/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How simulation-validator Compares

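| Feature / Agent | simulation-validator | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |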

SKILL.md Source

# Simulation Validator

## Goal

Provide a three-stage validation protocol for materials simulations: pre-flight checks, runtime monitoring, and post-flight validation.

## Requirements

- Python 3.8+
- No external dependencies (uses Python standard library only)
- Works on Linux, macOS, and Windows

## Inputs to Gather

Before running validation scripts, collect from the user:

| Input | Description | Example |
|-------|-------------|---------|
| Config file | Simulation configuration (JSON/YAML) | `simulation.json` |
| Log file | Runtime output log | `simulation.log` |
| Metrics file | Post-run metrics (JSON) | `results.json` |
| Required params | Parameters that must exist | `dt,dx,kappa` |
| Valid ranges | Parameter bounds | `dt:1e-6:1e-2` |
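The `name:min:max` range format and the required-parameter list above can be illustrated with a small parser. This is a minimal sketch, not the code in `preflight_checker.py`: the function names `parse_ranges` and `check_params` are hypothetical, and the message wording only loosely mirrors the errors listed later in this document.

```python
from typing import Dict, List, Tuple


def parse_ranges(spec: str) -> Dict[str, Tuple[float, float]]:
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges: Dict[str, Tuple[float, float]] = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected name:min:max, got {item!r}")
        name, lo, hi = parts[0].strip(), float(parts[1]), float(parts[2])
        if lo > hi:
            raise ValueError(f"min exceeds max for {name!r}")
        ranges[name] = (lo, hi)
    return ranges


def check_params(
    config: dict,
    required: List[str],
    ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems: List[str] = []
    for name in required:
        if name not in config:
            problems.append(f"missing required parameter: {name}")
    for name, (lo, hi) in ranges.items():
        if name not in config:
            continue  # absence is reported by the required-parameter check
        value = config[name]
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"Non-numeric value for {name}: {value!r}")
        elif not lo <= value <= hi:
            problems.append(f"{name}={value} out of range [{lo}, {hi}]")
    return problems
```

For example, `check_params({"dt": 0.1}, ["dt", "kappa"], parse_ranges("dt:1e-6:1e-2"))` reports two problems: `kappa` is missing and `dt` is out of range.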

## Decision Guidance

### When to Run Each Stage

```
Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│         ├── BLOCK status? → Fix issues, do NOT run simulation
│         ├── WARN status? → Review warnings, document if accepted
│         └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│         └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│         ├── Failed checks? → Do NOT use results
│                            → Run failure_diagnoser.py
│         └── All passed? → Results are valid
```

### Choosing Validation Thresholds

| Metric | Conservative | Standard | Relaxed |
|--------|--------------|----------|---------|
| Mass tolerance | 1e-6 | 1e-3 | 1e-2 |
| Residual growth | 2x | 10x | 100x |
| dt reduction | 10x | 100x | 1000x |
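One way to keep these threshold choices consistent across runs is to store them as named presets and generate the command-line arguments from them. The sketch below assumes the table rows map onto the `--mass-tol`, `--residual-growth`, and `--dt-drop` flags used in the stage examples in this document; the `PRESETS` dictionary and both helper functions are hypothetical.

```python
from typing import List

# Values copied from the threshold table; the key names are illustrative.
PRESETS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def monitor_args(preset: str, log_path: str) -> List[str]:
    """Build an argument list for runtime_monitor.py from a preset."""
    p = PRESETS[preset]
    return [
        "python3", "scripts/runtime_monitor.py",
        "--log", log_path,
        "--residual-growth", str(p["residual_growth"]),
        "--dt-drop", str(p["dt_drop"]),
        "--json",
    ]


def validator_args(preset: str, metrics_path: str) -> List[str]:
    """Build an argument list for result_validator.py from a preset."""
    p = PRESETS[preset]
    return [
        "python3", "scripts/result_validator.py",
        "--metrics", metrics_path,
        "--mass-tol", str(p["mass_tol"]),
        "--json",
    ]
```

The returned lists can be passed directly to `subprocess.run`, which avoids shell quoting issues.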

## Script Outputs (JSON Fields)

| Script | Output Fields |
|--------|---------------|
| `scripts/preflight_checker.py` | `report.status`, `report.blockers`, `report.warnings` |
| `scripts/runtime_monitor.py` | `alerts`, `residual_stats`, `dt_stats` |
| `scripts/result_validator.py` | `checks`, `confidence_score`, `failed_checks` |
| `scripts/failure_diagnoser.py` | `probable_causes`, `recommended_fixes` |
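A calling script might gate the simulation launch on the preflight output. The sketch below assumes that `report.status` in the table denotes a nested JSON object (`{"report": {"status": ...}}`) and that `blockers` and `warnings` are lists; neither is confirmed by this document, and the function name is hypothetical.

```python
import json
from typing import List, Tuple


def preflight_decision(raw_json: str) -> Tuple[bool, List[str]]:
    """Return (may_proceed, notes) from preflight JSON output."""
    # Assumed shape: {"report": {"status": ..., "blockers": [...], "warnings": [...]}}
    report = json.loads(raw_json).get("report", {})
    status = report.get("status")
    if status == "PASS":
        return True, []
    if status == "WARN":
        # Warnings do not block, but should be reviewed and documented.
        return True, list(report.get("warnings", []))
    if status == "BLOCK":
        return False, list(report.get("blockers", []))
    # Unknown or missing status: fail closed rather than run blindly.
    return False, [f"unrecognized status: {status!r}"]
```

Failing closed on an unrecognized status is a design choice for this sketch: a malformed report should not be treated as permission to run.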

## Three-Stage Validation Protocol

### Stage 1: Pre-flight (Before Simulation)

1. Run `scripts/preflight_checker.py --config simulation.json`
2. **BLOCK status**: Stop immediately, fix all blocker issues
3. **WARN status**: Review warnings, document accepted risks
4. **PASS status**: Proceed to simulation

```bash
python3 scripts/preflight_checker.py \
    --config simulation.json \
    --required dt,dx,kappa \
    --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
    --min-free-gb 1.0 \
    --json
```

### Stage 2: Runtime (During Simulation)

1. Run `scripts/runtime_monitor.py --log simulation.log` periodically
2. Configure alert thresholds based on problem type
3. Stop simulation if critical alerts appear

```bash
python3 scripts/runtime_monitor.py \
    --log simulation.log \
    --residual-growth 10.0 \
    --dt-drop 100.0 \
    --json
```
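To make the two runtime thresholds concrete, the sketch below scans log lines for residual and time-step values and raises alerts when they cross the limits. Everything here is an assumption for illustration: the log line format (`residual = ...`, `dt = ...`), the definition of growth as latest residual over smallest residual seen, and the definition of drop as largest dt over latest dt. The actual `runtime_monitor.py` may parse and compute these differently.

```python
import re
from typing import Iterable, List

NUM = r"[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"
# Hypothetical log format, e.g. "step 3 residual = 2e-2 dt = 1e-5"
RESIDUAL_RE = re.compile(r"residual\s*[=:]\s*(" + NUM + ")", re.IGNORECASE)
DT_RE = re.compile(r"\bdt\s*[=:]\s*(" + NUM + ")")


def scan_log(
    lines: Iterable[str],
    residual_growth: float = 10.0,
    dt_drop: float = 100.0,
) -> List[str]:
    """Return alert messages for residual growth and time-step collapse."""
    residuals: List[float] = []
    dts: List[float] = []
    for line in lines:
        m = RESIDUAL_RE.search(line)
        if m:
            residuals.append(float(m.group(1)))
        m = DT_RE.search(line)
        if m:
            dts.append(float(m.group(1)))

    alerts: List[str] = []
    # Growth: latest residual relative to the smallest residual seen so far.
    if residuals and min(residuals) > 0:
        growth = residuals[-1] / min(residuals)
        if growth > residual_growth:
            alerts.append(f"residual grew {growth:.3g}x (limit {residual_growth:g}x)")
    # Drop: largest time step seen relative to the latest one.
    if dts and dts[-1] > 0:
        drop = max(dts) / dts[-1]
        if drop > dt_drop:
            alerts.append(f"dt dropped {drop:.3g}x (limit {dt_drop:g}x)")
    return alerts
```

Calling `scan_log(open("simulation.log"))` on each periodic check re-reads the whole log, which matches the periodic invocation described in the steps above.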

### Stage 3: Post-flight (After Simulation)

1. Run `scripts/result_validator.py --metrics results.json`
2. **All checks PASS**: Results are valid for analysis
3. **Any check FAIL**: Do NOT use results, diagnose failure

```bash
python3 scripts/result_validator.py \
    --metrics results.json \
    --bound-min 0.0 \
    --bound-max 1.0 \
    --mass-tol 1e-3 \
    --json
```
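The bounds and mass-tolerance options above suggest checks along the following lines. This is a sketch under stated assumptions, not the logic of `result_validator.py`: the metrics keys (`mass_initial`, `mass_final`, `field_min`, `field_max`) are hypothetical, mass drift is taken as a relative difference, and the confidence score is assumed to be the fraction of checks passed.

```python
import math
from typing import Dict


def validate_metrics(
    metrics: Dict[str, float],
    bound_min: float = 0.0,
    bound_max: float = 1.0,
    mass_tol: float = 1e-3,
) -> dict:
    """Run finite-value, bounds, and mass-conservation checks."""
    keys = ("mass_initial", "mass_final", "field_min", "field_max")
    finite = all(math.isfinite(metrics[k]) for k in keys)  # rejects NaN and Inf

    in_bounds = (
        finite
        and bound_min <= metrics["field_min"]
        and metrics["field_max"] <= bound_max
    )

    m0 = metrics["mass_initial"]
    # Relative drift; the tiny floor avoids division by zero when m0 == 0.
    drift = abs(metrics["mass_final"] - m0) / max(abs(m0), 1e-300)
    mass_ok = finite and drift <= mass_tol

    checks = {"finite": finite, "bounds": in_bounds, "mass": mass_ok}
    failed = [name for name, ok in checks.items() if not ok]
    return {
        "checks": checks,
        # Assumption: score is the fraction of checks that passed.
        "confidence_score": (len(checks) - len(failed)) / len(checks),
        "failed_checks": failed,
    }
```

Any entry in `failed_checks` corresponds to the "Do NOT use results" branch of Stage 3.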

### Failure Diagnosis

When validation fails:

```bash
python3 scripts/failure_diagnoser.py --log simulation.log --json
```

## Conversational Workflow Example

**User**: My phase field simulation crashed after 1000 steps. Can you help me figure out why?

**Agent workflow**:
1. First, check the log for obvious errors:
   ```bash
   python3 scripts/failure_diagnoser.py --log simulation.log --json
   ```
2. If diagnosis suggests numerical blow-up, check runtime stats:
   ```bash
   python3 scripts/runtime_monitor.py --log simulation.log --json
   ```
3. Recommend fixes based on findings:
   - If residual grew rapidly → reduce time step
   - If dt collapsed → check stability conditions
   - If NaN detected → check initial conditions

## Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| `Config not found` | File path invalid | Verify config path exists |
| `Non-numeric value` | Parameter is not a number | Fix config file format |
| `out of range` | Parameter outside bounds | Adjust parameter or bounds |
| `Output directory not writable` | Permission issue | Check directory permissions |
| `Insufficient disk space` | Disk nearly full | Free up space or reduce output |
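The last two rows of the table can be reproduced with standard-library calls. The following is an illustrative sketch of how an output directory might be checked; the function name and exact message text are assumptions, and the real preflight script may behave differently.

```python
import os
import shutil
from typing import List


def check_output_dir(path: str, min_free_gb: float = 1.0) -> List[str]:
    """Return problems with the output directory; empty list means OK."""
    problems: List[str] = []
    if not os.path.isdir(path):
        return [f"Output directory not found: {path}"]
    if not os.access(path, os.W_OK):
        problems.append(f"Output directory not writable: {path}")
    # disk_usage reports bytes for the filesystem containing `path`.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(
            f"Insufficient disk space: {free_gb:.2f} GB free, need {min_free_gb:g} GB"
        )
    return problems
```

Note that `os.access` reports permissions for the current user only and can be inaccurate on some network filesystems, so a failed write at run time is still possible.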

## Interpretation Guidance

### Status Meanings

| Status | Meaning | Action |
|--------|---------|--------|
| PASS | All checks passed | Proceed with confidence |
| WARN | Non-critical issues found | Review and document |
| BLOCK | Critical issues found | Must fix before proceeding |

### Confidence Score Interpretation

| Score | Meaning |
|-------|---------|
| 1.0 | All validation checks passed |
| 0.75 to < 1.0 | Most checks passed, minor issues |
| 0.5 to < 0.75 | Significant issues, review carefully |
| < 0.5 | Major problems, do not trust results |
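The score bands translate directly into a lookup function. In this sketch the function name is hypothetical, and a score of exactly 0.75 is assumed to fall in the "most checks passed" band.

```python
def interpret_confidence(score: float) -> str:
    """Map a confidence score in [0, 1] to the meaning from the table."""
    if not 0.0 <= score <= 1.0:  # also rejects NaN
        raise ValueError(f"score must be in [0, 1], got {score!r}")
    if score == 1.0:
        return "All validation checks passed"
    if score >= 0.75:
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"
```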

### Common Failure Patterns

| Pattern in Log | Likely Cause | Recommended Fix |
|----------------|--------------|-----------------|
| NaN, Inf, overflow | Numerical instability | Reduce dt, increase damping |
| max iterations, did not converge | Solver failure | Tune preconditioner, tolerances |
| out of memory | Memory exhaustion | Reduce mesh, enable out-of-core |
| dt reduced | Adaptive stepping triggered | May be okay if controlled |
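The table above amounts to a pattern-to-diagnosis lookup, which a regex scan can approximate. This is a minimal sketch: the regular expressions are illustrative guesses rather than the patterns in `references/log_patterns.md`, and the output keys simply mirror the `failure_diagnoser.py` field names listed earlier.

```python
import re
from typing import Dict, Iterable, List

# (pattern, likely cause, recommended fix), taken from the table rows.
PATTERNS = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max(imum)? iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect causes and fixes for every pattern found, in first-seen order."""
    causes: List[str] = []
    fixes: List[str] = []
    for line in lines:
        for pattern, cause, fix in PATTERNS:
            if pattern.search(line) and cause not in causes:
                causes.append(cause)
                fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}
```

The word boundaries around `nan` and `inf` keep ordinary words such as "INFO" from triggering a false match, which is one of the ways regex-based parsing can go wrong on unusual log formats.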

## Limitations

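- **Not a real-time monitor**: Scripts analyze the log as it exists when they are invoked; they do not stream output or watch the process, so Stage 2 monitoring depends on re-running them periodically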
- **Regex-based**: Log parsing depends on pattern matching; may miss unusual formats
- **No automatic fixes**: Scripts diagnose but don't modify simulations

## References

- `references/validation_protocol.md` - Detailed checklist and criteria
- `references/log_patterns.md` - Common failure signatures and regex patterns

## Version History

- **v1.1.0** (2024-12-24): Enhanced documentation, decision guidance, Windows compatibility
- **v1.0.0**: Initial release with 4 validation scripts
