repro-enforcer

Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.

1,802 stars

byFreedomIntelligence

View on GitHub Installation ↓

Best use case

repro-enforcer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.

Teams using repro-enforcer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/repro-enforcer/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/repro-enforcer/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/repro-enforcer/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How repro-enforcer Compares

Feature / Agent	repro-enforcer	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# 🦖 Repro Enforcer

You are the **Repro Enforcer**, a specialised agent for making bioinformatics analyses reproducible and portable.

## Core Capabilities

1. **Conda Export**: Capture the current environment as a pinned `environment.yml`
2. **Singularity Definition**: Generate a Singularity `.def` file from the analysis dependencies
3. **Docker Compose**: Generate Dockerfile and docker-compose.yml for containerised execution
4. **Nextflow Pipeline**: Convert a sequence of shell commands into a Nextflow DSL2 pipeline
5. **Snakemake Workflow**: Alternative workflow export as Snakefile
6. **Checksum Manifest**: SHA-256 hashes for all input/output files
7. **README Generation**: Human-readable reproduction instructions

## Dependencies

- `pyyaml` (YAML generation)
- Optional: `conda` (environment capture), `singularity` (container build), `nextflow` (pipeline validation)

## Example Queries

- "Make this analysis reproducible as a Nextflow pipeline"
- "Export my current conda environment with pinned versions"
- "Generate a Singularity container for this workflow"
- "Create a checksums file for all input and output data"

## Status

**Planned** -- implementation targeting Week 6 (Apr 3-9).

Related Skills

tcga-bulk-data-preprocessing-with-omicverse

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Guide Claude through ingesting TCGA sample sheets, expression archives, and clinical carts into omicverse, initialising survival metadata, and exporting annotated AnnData files.

single-cell-preprocessing-with-omicverse

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Walk through omicverse's single-cell preprocessing tutorials to QC PBMC3k data, normalise counts, detect HVGs, and run PCA/embedding pipelines on CPU, CPU–GPU mixed, or GPU stacks.

bio-spatial-transcriptomics-spatial-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Quality control, filtering, normalization, and feature selection for spatial transcriptomics data. Calculate QC metrics, filter spots/cells, normalize counts, and identify highly variable genes. Use when filtering and normalizing spatial transcriptomics data.

bio-single-cell-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Quality control, filtering, and normalization for single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for calculating QC metrics, filtering cells and genes, normalizing counts, identifying highly variable genes, and scaling data. Use when filtering, normalizing, and selecting features in single-cell data.

bio-ribo-seq-riboseq-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Preprocess ribosome profiling data including adapter trimming, size selection, rRNA removal, and alignment. Use when preparing Ribo-seq reads for downstream analysis of translation.

bio-metabolomics-xcms-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

XCMS3 workflow for LC-MS/MS metabolomics preprocessing. Covers peak detection, retention time alignment, correspondence (grouping), and gap filling. Use when processing raw LC-MS data into a feature table for untargeted metabolomics.

bio-metabolomics-msdial-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

MS-DIAL-based metabolomics preprocessing as alternative to XCMS. Covers peak detection, alignment, annotation, and export for downstream analysis. Use when processing MS-DIAL output files for R/Python analysis or when preferring GUI-based preprocessing.

bio-imaging-mass-cytometry-data-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Load and preprocess imaging mass cytometry (IMC) and MIBI data. Covers MCD/TIFF handling, hot pixel removal, and image normalization. Use when starting IMC analysis from raw MCD files or preparing images for segmentation.

bio-cfdna-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Preprocesses cell-free DNA sequencing data including adapter trimming, alignment optimized for short fragments, and UMI-aware duplicate removal using fgbio. Applies cfDNA-specific quality thresholds and fragment length filtering. Use when processing plasma cfDNA sequencing data before downstream analysis.

zinc-database

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.

zarr-python

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

xlsx

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.