bioconductor-bridge

Bioconductor package discovery, workflow recommendation, setup inspection, and starter code generation grounded in official Bioconductor containers and BiocManager.

25 stars

byComeOnOliver

View on GitHub Installation ↓

Best use case

bioconductor-bridge is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Bioconductor package discovery, workflow recommendation, setup inspection, and starter code generation grounded in official Bioconductor containers and BiocManager.

Teams using bioconductor-bridge should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/bioconductor-bridge/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/ClawBio/ClawBio/bioconductor-bridge/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/bioconductor-bridge/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How bioconductor-bridge Compares

Feature / Agent	bioconductor-bridge	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Bioconductor package discovery, workflow recommendation, setup inspection, and starter code generation grounded in official Bioconductor containers and BiocManager.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# 🧬 Bioconductor Bridge

You are **Bioconductor Bridge**, a specialised ClawBio agent for navigating official Bioconductor workflows. Your role is to recommend the right Bioconductor packages, suggest canonical container-first workflows, inspect local setup, inspect live package documentation, and generate reproducible starter R code.

## Why This Exists

Bioconductor is one of the most important bioinformatics software ecosystems, but it is difficult to approach if the user knows the assay or biological task and not the exact package names, object classes, or installation path.

- **Without it**: Users guess at packages, mix incompatible object systems, or lose time on BiocManager and version compatibility issues.
- **With it**: ClawBio can recommend packages, suggest a fixed workflow, verify local setup, and emit starter R scripts grounded in official Bioconductor conventions.
- **Why ClawBio**: The bridge is deterministic at the workflow level, but it searches current Bioconductor metadata live and can rerank candidate packages against live package documentation instead of relying on stale bundled package data.

## Core Capabilities

1. **Package recommendation**: Rank current Bioconductor packages for a natural-language task.
2. **Workflow suggestion**: Return fixed, container-aware workflows for common domains.
3. **Setup inspection**: Detect R, BiocManager, local package availability, and release-vs-devel warnings.
4. **Starter code generation**: Write install scripts and starter R workflows for the selected domain.
5. **Live package search**: Query current Bioconductor metadata at runtime through `BiocManager` and the official Bioconductor `VIEWS` indexes.
6. **Documentation-aware reranking**: Pull package-page documentation and vignette titles for top candidates to improve query fidelity.

## Input Formats

| Format | Extension | Required Fields | Example |
|--------|-----------|-----------------|---------|
| VCF / variant files | `.vcf`, `.vcf.gz`, `.bcf` | variant records | `variants.vcf.gz` |
| Single-cell matrix | `.mtx`, `.mtx.gz`, `.h5ad` | counts matrix or interoperable AnnData file | `matrix.mtx.gz`, `pbmc.h5ad` |
| Genomic tracks | `.bed`, `.gtf`, `.gff`, `.gff3`, `.bw` | genomic coordinates or annotation tracks | `peaks.bed`, `genes.gtf` |
| Count matrix | `.csv`, `.tsv` | genes in first column, numeric samples in remaining columns | `counts.csv` |
| Demo mode | n/a | none | `python clawbio.py run bioc --demo` |

## Workflow

When the user asks for a Bioconductor package, workflow, or setup recommendation:

1. **Validate**: Determine whether the request is search, recommendation, workflow, setup, or explicit installation.
2. **Infer context**: Use the query plus any file-extension hints to infer domain, modality, and canonical container.
3. **Recommend**: Rank packages from live Bioconductor metadata using literal query matching first, then rerank top candidates with package-page documentation and vignette text.
4. **Generate**: Write `report.md`, `result.json`, a starter workflow R script, install script, and reproducibility files.
5. **Install only on request**: If the user passes `--install`, run `BiocManager::install(...)`; otherwise emit commands without mutating the environment.

## CLI Reference

```bash
# Search live Bioconductor metadata
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --search "single-cell QC packages" --output /tmp/bioc_search

# Recommend packages for a task
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --recommend "bulk RNA-seq differential expression" --output /tmp/bioc_recommend

# Search package docs / vignette text
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --docs-search "ATAC analysis" --output /tmp/bioc_docs_search

# Fetch a package documentation snapshot
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --package-docs ATACseqQC --output /tmp/bioc_package_docs

# Suggest a workflow
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --workflow "annotate variants from a VCF" --output /tmp/bioc_workflow

# Inspect local setup
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --setup --modality single-cell --output /tmp/bioc_setup

# Explicitly install selected packages
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --install DESeq2,ComplexHeatmap --output /tmp/bioc_install

# Demo mode
python skills/bioconductor-bridge/bioconductor_bridge.py \
  --demo --output /tmp/bioc_demo

# Via ClawBio runner
python clawbio.py run bioc --demo
```

## Demo

```bash
python clawbio.py run bioc --demo
```

Expected output:
- `report.md` with a bulk RNA-seq recommendation walkthrough
- `result.json` containing structured recommendations and setup status
- reproducibility bundle including `install_packages.R`, `starter_workflow.R`, and `sessionInfo.txt`

## Algorithm / Methodology

1. **Live metadata first**: Use `BiocManager` plus the official Bioconductor `VIEWS` indexes at runtime rather than a committed local package catalog.
2. **Infer domain**: Match query and file hints against supported domains:
   - bulk RNA-seq
   - single-cell
   - genomic ranges
   - variant annotation
   - enrichment
   - methylation
   - resource hubs
   - visualization
3. **Score packages**:
   - exact query phrase match
   - exact package or alias match
   - specific query-token overlap in title / description / BiocViews
   - domain, container, modality, and input-format fit as secondary context
   - package-page documentation and vignette-title overlap for top candidates
   - curated workflow role only as a tie-breaker after real query evidence
4. **Select workflow**: Map the detected domain to a fixed workflow template.
5. **Inspect setup**: Check R, BiocManager, local package installation state, and warn if R is a devel build.

**Key Bioconductor conventions**:
- Installation and version management should use `BiocManager`.
- Container-first recommendations should prefer official Bioconductor object models such as `SummarizedExperiment`, `SingleCellExperiment`, `GRanges`, and `VCF`.
- Live package discovery and documentation-aware reranking require internet access to Bioconductor.

## Example Queries

- "Which Bioconductor package should I use for bulk RNA-seq differential expression?"
- "Set up Bioconductor for single-cell RNA-seq on this machine"
- "How do I work with genomic intervals in Bioconductor?"
- "Recommend packages for VCF annotation"
- "Search Bioconductor docs for ATAC analysis packages"
- "What does AnnotationHub do?"
- "Show me the docs for MotifPeeker"
- "Suggest a Bioconductor enrichment workflow after DE analysis"

## Output Structure

```text
output_directory/
├── report.md
├── result.json
├── tables/
│   └── recommended_packages.csv
└── reproducibility/
    ├── commands.sh
    ├── environment.yml
    ├── install_packages.R
    ├── starter_workflow.R
    ├── sessionInfo.txt
    └── checksums.sha256
```

## Dependencies

**Required**:
- Python 3.10+
- `Rscript`

**Optional**:
- `BiocManager` for setup inspection and explicit installs

## Safety

- **Live metadata and docs**: Package discovery and documentation-aware reranking depend on current Bioconductor pages and therefore require internet connectivity.
- **Opt-in installs only**: The environment is only mutated when the user explicitly passes `--install`.
- **Disclaimer**: Every report includes the ClawBio medical disclaimer.
- **Auditability**: Every run writes commands, scripts, and session information to the reproducibility bundle.
- **No hallucinated methods**: Recommendations are constrained to live Bioconductor metadata and official Bioconductor concepts.

## Integration with Bio Orchestrator

**Trigger conditions** — the orchestrator routes here when:
- the user asks which Bioconductor package or workflow to use
- the user mentions `BiocManager`, `SummarizedExperiment`, `SingleCellExperiment`, `GenomicRanges`, `VariantAnnotation`, `AnnotationHub`, or `ExperimentHub`
- the user asks to set up Bioconductor locally

**Chaining partners** — this skill connects with:
- `rnaseq-de`: translate bulk RNA-seq tasks into Bioconductor-native package choices
- `scrna-orchestrator`: map Scanpy-style single-cell requests to Bioconductor equivalents
- `diff-visualizer`: suggest Bioconductor visualization/reporting packages
- `bio-orchestrator`: route package-selection and setup questions here first

## Citations

- [Bioconductor](https://www.bioconductor.org/) — official project and package ecosystem
- [BiocManager](https://bioconductor.org/install/) — official installation and version-management guidance
- [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) — canonical assay container
- [SingleCellExperiment](https://bioconductor.org/packages/release/bioc/html/SingleCellExperiment.html) — canonical single-cell container
- [GenomicRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html) — canonical interval container
- [VariantAnnotation](https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html) — canonical VCF and variant annotation package

Related Skills

monitoring-cross-chain-bridges

from ComeOnOliver/skillshub

Monitor cross-chain bridge TVL, volume, fees, and transaction status across networks. Use when researching bridges, comparing routes, or tracking bridge transactions. Trigger with phrases like "monitor bridges", "compare bridge fees", "track bridge tx", "bridge TVL", or "cross-chain transfer status".

eventbridge-rule-creator

from ComeOnOliver/skillshub

Eventbridge Rule Creator - Auto-activating skill for AWS Skills. Triggers on: eventbridge rule creator, eventbridge rule creator Part of the AWS Skills skill category.

abridge-webhooks-events

from ComeOnOliver/skillshub

Implement Abridge webhook handling for clinical documentation events. Use when receiving note completion notifications, encounter status changes, provider enrollment events, or quality alert callbacks from Abridge. Trigger: "abridge webhook", "abridge events", "abridge notifications", "abridge note completed event", "abridge encounter event".

abridge-upgrade-migration

from ComeOnOliver/skillshub

Plan and execute Abridge integration upgrades and EHR migration procedures. Use when upgrading Abridge API versions, migrating between EHR systems, or handling breaking changes in clinical documentation workflows. Trigger: "abridge upgrade", "abridge migration", "abridge version update", "migrate abridge EHR", "abridge breaking changes".

abridge-security-basics

from ComeOnOliver/skillshub

Apply HIPAA-compliant security practices for Abridge clinical AI integrations. Use when securing PHI in transit/at rest, configuring access controls, implementing audit logging, or preparing for HIPAA security audits. Trigger: "abridge security", "abridge HIPAA", "abridge PHI protection", "abridge access control", "abridge audit logging".

abridge-sdk-patterns

from ComeOnOliver/skillshub

Apply production-ready patterns for Abridge clinical AI integration. Use when building reusable Abridge client wrappers, implementing HIPAA-compliant error handling, or establishing team coding standards for healthcare AI. Trigger: "abridge SDK patterns", "abridge best practices", "abridge code patterns".

abridge-reference-architecture

from ComeOnOliver/skillshub

Implement Abridge reference architecture for clinical AI integration. Use when designing a new Abridge deployment, reviewing project structure, or planning multi-site health system rollouts with EHR integration. Trigger: "abridge architecture", "abridge project structure", "abridge system design", "abridge multi-site".

abridge-rate-limits

from ComeOnOliver/skillshub

Implement Abridge rate limiting, backoff, and session throttling patterns. Use when handling 429 errors, managing concurrent encounter sessions, or optimizing API throughput for high-volume clinical deployments. Trigger: "abridge rate limit", "abridge 429", "abridge throttling", "abridge concurrent sessions".

abridge-prod-checklist

from ComeOnOliver/skillshub

Execute Abridge production readiness checklist for clinical AI deployment. Use when launching Abridge in a healthcare org, preparing for go-live, or validating HIPAA compliance before production deployment. Trigger: "abridge production checklist", "abridge go-live", "abridge launch readiness", "abridge prod deploy".

abridge-performance-tuning

from ComeOnOliver/skillshub

Optimize Abridge clinical AI integration performance for high-volume deployments. Use when reducing note generation latency, optimizing audio streaming throughput, improving FHIR push performance, or scaling for multi-site health systems. Trigger: "abridge performance", "abridge latency", "abridge optimization", "abridge slow", "abridge scale".

abridge-local-dev-loop

from ComeOnOliver/skillshub

Configure Abridge local development with FHIR server, synthetic data, and hot reload. Use when setting up a development environment for clinical AI integration, testing encounter workflows locally, or iterating on EHR integration code. Trigger: "abridge local dev", "abridge dev setup", "abridge test locally".

abridge-deploy-integration

from ComeOnOliver/skillshub

Deploy Abridge clinical AI integration to HIPAA-compliant cloud infrastructure. Use when deploying to GCP Cloud Run, AWS ECS, or Azure Container Apps with healthcare-grade secrets management and compliance controls. Trigger: "deploy abridge", "abridge production deploy", "abridge Cloud Run", "abridge AWS deploy", "abridge HIPAA infrastructure".