repro-enforcer
Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.
Best use case
repro-enforcer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.
Teams using repro-enforcer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/repro-enforcer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How repro-enforcer Compares
| Feature / Agent | repro-enforcer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# 🦖 Repro Enforcer You are the **Repro Enforcer**, a specialised agent for making bioinformatics analyses reproducible and portable. ## Core Capabilities 1. **Conda Export**: Capture the current environment as a pinned `environment.yml` 2. **Singularity Definition**: Generate a Singularity `.def` file from the analysis dependencies 3. **Docker Compose**: Generate Dockerfile and docker-compose.yml for containerised execution 4. **Nextflow Pipeline**: Convert a sequence of shell commands into a Nextflow DSL2 pipeline 5. **Snakemake Workflow**: Alternative workflow export as Snakefile 6. **Checksum Manifest**: SHA-256 hashes for all input/output files 7. **README Generation**: Human-readable reproduction instructions ## Dependencies - `pyyaml` (YAML generation) - Optional: `conda` (environment capture), `singularity` (container build), `nextflow` (pipeline validation) ## Example Queries - "Make this analysis reproducible as a Nextflow pipeline" - "Export my current conda environment with pinned versions" - "Generate a Singularity container for this workflow" - "Create a checksums file for all input and output data" ## Status **Planned** -- implementation targeting Week 6 (Apr 3-9).
Related Skills
tcga-bulk-data-preprocessing-with-omicverse
Guide Claude through ingesting TCGA sample sheets, expression archives, and clinical carts into omicverse, initialising survival metadata, and exporting annotated AnnData files.
single-cell-preprocessing-with-omicverse
Walk through omicverse's single-cell preprocessing tutorials to QC PBMC3k data, normalise counts, detect HVGs, and run PCA/embedding pipelines on CPU, CPU–GPU mixed, or GPU stacks.
bio-spatial-transcriptomics-spatial-preprocessing
Quality control, filtering, normalization, and feature selection for spatial transcriptomics data. Calculate QC metrics, filter spots/cells, normalize counts, and identify highly variable genes. Use when filtering and normalizing spatial transcriptomics data.
bio-single-cell-preprocessing
Quality control, filtering, and normalization for single-cell RNA-seq using Seurat (R) and Scanpy (Python). Use for calculating QC metrics, filtering cells and genes, normalizing counts, identifying highly variable genes, and scaling data. Use when filtering, normalizing, and selecting features in single-cell data.
bio-ribo-seq-riboseq-preprocessing
Preprocess ribosome profiling data including adapter trimming, size selection, rRNA removal, and alignment. Use when preparing Ribo-seq reads for downstream analysis of translation.
bio-metabolomics-xcms-preprocessing
XCMS3 workflow for LC-MS/MS metabolomics preprocessing. Covers peak detection, retention time alignment, correspondence (grouping), and gap filling. Use when processing raw LC-MS data into a feature table for untargeted metabolomics.
bio-metabolomics-msdial-preprocessing
MS-DIAL-based metabolomics preprocessing as alternative to XCMS. Covers peak detection, alignment, annotation, and export for downstream analysis. Use when processing MS-DIAL output files for R/Python analysis or when preferring GUI-based preprocessing.
bio-imaging-mass-cytometry-data-preprocessing
Load and preprocess imaging mass cytometry (IMC) and MIBI data. Covers MCD/TIFF handling, hot pixel removal, and image normalization. Use when starting IMC analysis from raw MCD files or preparing images for segmentation.
bio-cfdna-preprocessing
Preprocesses cell-free DNA sequencing data including adapter trimming, alignment optimized for short fragments, and UMI-aware duplicate removal using fgbio. Applies cfDNA-specific quality thresholds and fragment length filtering. Use when processing plasma cfDNA sequencing data before downstream analysis.
zinc-database
Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.
zarr-python
Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.
xlsx
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.