giab-benchmark-validator

Genome in a Bottle benchmark validation skill for pipeline accuracy assessment

509 stars

Best use case

giab-benchmark-validator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using giab-benchmark-validator should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/giab-benchmark-validator/SKILL.md --create-dirs "https://raw.githubusercontent.com/a5c-ai/babysitter/main/library/specializations/domains/science/bioinformatics/skills/giab-benchmark-validator/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/giab-benchmark-validator/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How giab-benchmark-validator Compares

Feature / Agent	giab-benchmark-validator	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It validates a pipeline's calls against Genome in a Bottle (GIAB) benchmark truth sets to assess the pipeline's accuracy.

Where can I find the source code?

The source code is in the a5c-ai/babysitter repository on GitHub, under library/specializations/domains/science/bioinformatics/skills/giab-benchmark-validator.

SKILL.md Source

# GIAB Benchmark Validator Skill

## Purpose
Enable Genome in a Bottle benchmark validation for pipeline accuracy assessment.

## Capabilities
- Truth set comparison
- hap.py/vcfeval execution
- Sensitivity/specificity calculation
- Stratified performance metrics
- Difficult region analysis
- Validation report generation
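
As a rough illustration of the metric and stratification capabilities listed above, the sketch below computes recall (sensitivity), precision, and F1 per stratum from true-positive, false-negative, and false-positive counts. It is an assumption-laden simplification, not the skill's own implementation: `Counts`, `metrics`, and `stratified_metrics` are hypothetical names, a single TP count is used for both recall and precision, and precision is reported instead of specificity because true negatives are not well defined for variant calls. The counts in the usage example are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Counts:
    """Variant counts for one stratum (hypothetical container, not a hap.py type)."""
    tp: int  # truth variants matched by the query call set
    fn: int  # truth variants missed by the query call set
    fp: int  # query variants that are absent from the truth set


def metrics(c: Counts) -> Dict[str, Optional[float]]:
    """Return recall (sensitivity), precision, and F1; None where undefined."""
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else None
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else None
    if recall is None or precision is None or (recall + precision) == 0:
        f1 = None
    else:
        f1 = 2 * recall * precision / (recall + precision)
    return {"recall": recall, "precision": precision, "f1": f1}


def stratified_metrics(by_stratum: Dict[str, Counts]) -> Dict[str, Dict[str, Optional[float]]]:
    """Apply metrics() to each named stratum (e.g. region type or variant class)."""
    return {name: metrics(c) for name, c in by_stratum.items()}


if __name__ == "__main__":
    # Illustrative counts only; these are not real benchmark results.
    example = {
        "all_regions/SNP": Counts(tp=990, fn=10, fp=5),
        "difficult_regions/INDEL": Counts(tp=80, fn=20, fp=15),
    }
    for stratum, m in stratified_metrics(example).items():
        print(stratum, m)
```

Returning `None` for an empty stratum keeps an undefined ratio distinct from a genuine score of zero, which matters when small difficult-region strata contain no truth variants.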

## Usage Guidelines
- Use appropriate GIAB reference samples
- Compare against truth sets with hap.py
- Calculate sensitivity and specificity
- Stratify by region type and variant class
- Analyze performance in difficult regions
- Generate comprehensive validation reports
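
The guidelines above could translate into a hap.py invocation along the lines of the sketch below, which only assembles the argument list and does not run anything. The helper `build_happy_command` and all file names are hypothetical. The flags (`-r`, `-f`, `-o`, `--stratification`, `--engine`, `--engine-vcfeval-template`) reflect my understanding of the hap.py command line and should be checked against `hap.py --help` for the installed version.

```python
import shlex
from typing import List, Optional


def build_happy_command(
    truth_vcf: str,
    query_vcf: str,
    reference_fasta: str,
    confident_bed: str,
    output_prefix: str,
    stratification_tsv: Optional[str] = None,
    use_vcfeval: bool = False,
    vcfeval_sdf: Optional[str] = None,
) -> List[str]:
    """Assemble a hap.py argument list (hypothetical helper; nothing is executed)."""
    cmd = [
        "hap.py", truth_vcf, query_vcf,
        "-r", reference_fasta,   # reference genome FASTA
        "-f", confident_bed,     # GIAB high-confidence regions
        "-o", output_prefix,     # prefix for the report files
    ]
    if stratification_tsv:
        # TSV listing stratification region names and their BED files
        cmd += ["--stratification", stratification_tsv]
    if use_vcfeval:
        # Use RTG vcfeval as the comparison engine instead of the default
        cmd += ["--engine", "vcfeval"]
        if vcfeval_sdf:
            cmd += ["--engine-vcfeval-template", vcfeval_sdf]
    return cmd


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with the skill.
    cmd = build_happy_command(
        "HG002_truth.vcf.gz", "pipeline_calls.vcf.gz",
        "reference.fa", "HG002_confident.bed", "results/HG002",
        stratification_tsv="stratifications.tsv",
    )
    print(shlex.join(cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues, and printing it first makes the exact invocation easy to record in a validation report.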

## Dependencies
- hap.py
- vcfeval
- GIAB resources
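
Before running a validation, it can help to confirm that the command-line dependencies are on `PATH`. The sketch below is one hedged way to do that; `missing_tools` is a hypothetical helper, and the executable names `hap.py` and `rtg` (the RTG Tools launcher that provides vcfeval) are assumptions about how the tools are installed. GIAB resources are data files, so they are not covered by this check.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    names: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names that cannot be found on PATH (hypothetical helper)."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to match your installation.
    missing = missing_tools(["hap.py", "rtg"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.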

## Process Integration
- Analysis Pipeline Validation (pipeline-validation)
- Whole Genome Sequencing Pipeline (wgs-analysis-pipeline)

Related Skills

All of the skills below are from a5c-ai/babysitter (509 stars).

design-system-validator: Validate design system compliance in code and detect token usage violations.

link-validator: Comprehensive link checking and validation for documentation. Validate internal links, external URLs, anchors, detect redirects, monitor link rot, and generate sitemap validation reports.

code-sample-validator: Extract, validate, and test code samples in documentation. Verify syntax, execute samples, check outputs, validate imports, and ensure code samples are up-to-date with current APIs.

openapi-validator: Validate OpenAPI specifications for correctness, security, and best practices.

k8s-validator: Validate Kubernetes manifests for security, best practices, and resource limits.

performance-benchmark-suite: SDK performance benchmarking and regression detection.

specialization-validator: Validate specialization completeness across all 7 phases, score each phase, identify gaps, and generate validation reports.

process-validator: Validate process JS files for correct SDK patterns, task definitions, syntax, and quality gate implementation.

gpu-benchmarking: Expert skill for automated GPU performance benchmarking and regression detection. Design micro-benchmarks, measure kernel execution time with CUDA events, calculate achieved vs theoretical performance, generate comparison reports, detect regressions in CI/CD, and profile power/thermal characteristics.

checklist-validator: Skill for validating research against reporting checklists.

rb-benchmarker: Randomized benchmarking skill for gate fidelity characterization.

math-notation-validator: Validate and standardize mathematical notation.