bio-methylation-calling
Extract methylation calls from Bismark BAM files using bismark_methylation_extractor. Generates per-cytosine reports for CpG, CHG, and CHH contexts. Use when extracting methylation levels from aligned bisulfite sequencing data for downstream analysis.
Best use case
bio-methylation-calling is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using bio-methylation-calling should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/bio-methylation-calling/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How bio-methylation-calling Compares
| Feature / Agent | bio-methylation-calling | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
SKILL.md Source
## Version Compatibility
Reference examples tested with: pandas 2.2+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- CLI: `<tool> --version` then `<tool> --help` to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# Methylation Calling
**"Extract methylation calls from my Bismark BAM"** → Generate per-cytosine methylation reports (CpG, CHG, CHH contexts) from aligned bisulfite sequencing data.
- CLI: `bismark_methylation_extractor --bedGraph --cytosine_report sample.bam`
## Basic Extraction
```bash
# Extract methylation calls from Bismark BAM
bismark_methylation_extractor --gzip --bedGraph \
sample_bismark_bt2.bam
```
## Paired-End Extraction
```bash
bismark_methylation_extractor --paired-end --gzip --bedGraph \
sample_bismark_bt2_pe.bam
```
## Common Options
```bash
# --paired-end       For paired-end data
# --gzip             Compress output
# --bedGraph         Generate bedGraph file
# --cytosine_report  Genome-wide cytosine report
# --genome_folder    Required for --cytosine_report
# --buffer_size      Memory buffer
# --parallel         Parallel extraction
bismark_methylation_extractor \
    --paired-end \
    --gzip \
    --bedGraph \
    --cytosine_report \
    --genome_folder /path/to/genome/ \
    --buffer_size 10G \
    --parallel 4 \
    -o output_dir/ \
    sample.bam
```
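When the extractor is launched from a script, the same options can be assembled as an argument list. The sketch below is a minimal illustration: `build_extractor_cmd` and its parameters are hypothetical names, and only flags already listed above are used. It builds and prints the command without running it.

```python
from typing import List, Optional


def build_extractor_cmd(
    bam: str,
    paired_end: bool = True,
    genome_folder: Optional[str] = None,
    output_dir: Optional[str] = None,
    parallel: int = 1,
) -> List[str]:
    """Assemble a bismark_methylation_extractor argument list (hypothetical helper)."""
    cmd = ["bismark_methylation_extractor", "--gzip", "--bedGraph"]
    if paired_end:
        cmd.append("--paired-end")
    if genome_folder is not None:
        # --cytosine_report needs the genome folder, so the two are added together
        cmd += ["--cytosine_report", "--genome_folder", genome_folder]
    if parallel > 1:
        cmd += ["--parallel", str(parallel)]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    cmd.append(bam)  # the BAM file goes last
    return cmd


cmd = build_extractor_cmd("sample.bam", genome_folder="/path/to/genome/", parallel=4)
print(" ".join(cmd))
# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.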
## CpG Context Only
```bash
# Most common - extract only CpG methylation
# --no_overlap avoids double counting overlapping reads
# Adding --CX would extend the bedGraph and coverage output to CHG/CHH (optional)
bismark_methylation_extractor \
    --paired-end \
    --no_overlap \
    --gzip \
    --bedGraph \
    sample.bam
```
## Genome-Wide Cytosine Report
```bash
# Comprehensive report with all CpGs in genome
bismark_methylation_extractor \
--paired-end \
--gzip \
--bedGraph \
--cytosine_report \
--genome_folder /path/to/genome/ \
sample.bam
```
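A minimal sketch of loading the cytosine report with pandas follows. It assumes the usual seven-column, tab-separated layout without a header row (chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide). The column names are labels chosen for this example and the rows are made up, so check them against your own file before relying on them.

```python
import io

import pandas as pd

# Column labels are chosen for this sketch; the report itself has no header row.
REPORT_COLUMNS = [
    "chr", "pos", "strand", "count_meth", "count_unmeth", "context", "trinucleotide",
]

# Made-up rows standing in for sample.CpG_report.txt.gz
example = io.StringIO(
    "chr1\t100\t+\t8\t2\tCG\tCGA\n"
    "chr1\t101\t-\t0\t0\tCG\tCGT\n"
    "chr1\t250\t+\t3\t9\tCG\tCGG\n"
)

report = pd.read_csv(example, sep="\t", header=None, names=REPORT_COLUMNS)
report["coverage"] = report["count_meth"] + report["count_unmeth"]

# The report lists every CpG in the genome, so uncovered sites (coverage 0) are expected.
covered = report[report["coverage"] > 0].copy()
covered["meth_pct"] = 100 * covered["count_meth"] / covered["coverage"]
print(covered[["chr", "pos", "strand", "coverage", "meth_pct"]])
```

For a real file, replace `example` with the path to the report; pandas infers gzip compression from the `.gz` extension.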
## Strand-Specific Output
```bash
# Default: strand-specific output
# CpG_OT_sample.txt - Original Top strand
# CpG_OB_sample.txt - Original Bottom strand
# CpG_CTOT_sample.txt - Complementary to OT
# CpG_CTOB_sample.txt - Complementary to OB
# Merge the four strand-specific files into one file per context
# (CpG methylation is usually symmetric)
bismark_methylation_extractor --comprehensive --gzip sample.bam
# --merge_non_CpG is a separate option: it pools CHG and CHH into one non-CpG context
```
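The per-read files can be tallied into per-position counts. The sketch below assumes the commonly documented five-column layout (read ID, methylation state, chromosome, position, call), where `Z` marks a methylated CpG and `z` an unmethylated one, preceded by a single version header line. The input lines are made up for illustration.

```python
from collections import defaultdict

# Made-up lines in the assumed per-read layout:
# read ID, methylation state (+/-), chromosome, position, call (Z = methylated CpG, z = unmethylated)
lines = [
    "Bismark methylation extractor version (header line)",
    "read1\t+\tchr1\t100\tZ",
    "read2\t-\tchr1\t100\tz",
    "read3\t+\tchr1\t100\tZ",
    "read4\t-\tchr1\t205\tz",
]

# (chromosome, position) -> [methylated count, unmethylated count]
counts = defaultdict(lambda: [0, 0])
for line in lines:
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        continue  # skip the header line and anything malformed
    _, _, chrom, pos, call = fields
    if call == "Z":
        counts[(chrom, int(pos))][0] += 1
    elif call == "z":
        counts[(chrom, int(pos))][1] += 1

for (chrom, pos), (meth, unmeth) in sorted(counts.items()):
    print(chrom, pos, meth, unmeth)
```

In practice `bismark2bedGraph` performs this aggregation; the sketch only shows what the per-read calls contain.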
## Avoid Double-Counting Overlapping Reads
```bash
# For paired-end data with overlapping reads
# --no_overlap ignores the overlapping portion of read 2
bismark_methylation_extractor \
    --paired-end \
    --no_overlap \
    --gzip \
    sample_pe.bam
```
## Generate Coverage File
```bash
# bismark2bedGraph creates coverage file
bismark_methylation_extractor --bedGraph --gzip sample.bam
# Or run separately
bismark2bedGraph -o sample CpG_context_sample.txt.gz
# Coverage format: chr start end methylation_percentage count_meth count_unmeth
```
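The methylation percentage in the coverage file follows from the two count columns: methylated calls divided by total calls, times 100. A minimal sketch, using a made-up coverage line and a hypothetical helper name:

```python
def methylation_percentage(count_meth: int, count_unmeth: int) -> float:
    """Percent methylation from methylated and unmethylated call counts."""
    total = count_meth + count_unmeth
    if total == 0:
        raise ValueError("no coverage at this position")
    return 100.0 * count_meth / total


# Made-up coverage line: chr, start, end, methylation_percentage, count_meth, count_unmeth
line = "chr1\t100\t100\t75.0\t3\t1"
chrom, start, end, pct, meth, unmeth = line.split("\t")

recomputed = methylation_percentage(int(meth), int(unmeth))
print(recomputed, float(pct))
```

Recomputing the percentage from the counts is a quick sanity check that the columns were parsed in the right order.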
## Convert to BigWig for Visualization
```bash
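# bedGraph to BigWig (requires UCSC tools)
# bedGraphToBigWig expects an uncompressed, sorted bedGraph without a track line
zcat sample.bedGraph.gz | grep -v '^track' | LC_ALL=C sort -k1,1 -k2,2n > sample.sorted.bedGraph
bedGraphToBigWig sample.sorted.bedGraph chrom.sizes sample.bw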
```
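The `chrom.sizes` file is a two-column table of sequence name and length. One way to obtain it, sketched below, is to take the first two columns of a FASTA index (`.fai`, as written by `samtools faidx`). The helper name and the example index lines are illustrative, not taken from a real genome build.

```python
from typing import Iterable, List


def fai_to_chrom_sizes(fai_lines: Iterable[str]) -> List[str]:
    """Keep the name and length columns of a .fai index (hypothetical helper)."""
    sizes = []
    for line in fai_lines:
        if not line.strip():
            continue  # ignore blank lines
        name, length = line.rstrip("\n").split("\t")[:2]
        sizes.append(f"{name}\t{int(length)}")
    return sizes


# Illustrative .fai lines: name, length, offset, bases per line, bytes per line
fai_example = [
    "chr1\t248956422\t6\t60\t61\n",
    "chrM\t16569\t253105708\t60\t61\n",
]
chrom_sizes = fai_to_chrom_sizes(fai_example)
print("\n".join(chrom_sizes))
```

The sequence names must match those used in the bedGraph, which in turn come from the genome the reads were aligned to.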
## M-Bias Plot
```bash
# Check for methylation bias across read positions
# --mbias_only generates only the M-bias report and plots
bismark_methylation_extractor --paired-end \
    --mbias_only \
    sample.bam
# Generates sample.M-bias.txt and sample.M-bias_R1.png, sample.M-bias_R2.png
```
## Ignore End Bias
```bash
# Ignore positions with systematic bias (found from M-bias plot)
# --ignore N            Ignore first N bp of read 1
# --ignore_r2 N         Ignore first N bp of read 2
# --ignore_3prime N     Ignore last N bp of read 1
# --ignore_3prime_r2 N  Ignore last N bp of read 2
bismark_methylation_extractor \
    --paired-end \
    --ignore 2 \
    --ignore_r2 2 \
    --ignore_3prime 2 \
    --ignore_3prime_r2 2 \
    sample.bam
```
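Choosing the trim values is usually done by eye from the M-bias plot. As a rough aid, the sketch below counts how many leading and trailing read positions deviate from the read-wide median methylation by more than a tolerance. This is a heuristic assumed for illustration, not something Bismark computes, and the function name, tolerance, and input values are all hypothetical.

```python
from statistics import median
from typing import Iterable, Sequence, Tuple


def suggest_trim(pcts: Sequence[float], tolerance: float = 5.0) -> Tuple[int, int]:
    """Return (n_5prime, n_3prime): runs of end positions whose methylation
    differs from the median by more than `tolerance` percentage points."""
    mid = median(pcts)

    def run_length(values: Iterable[float]) -> int:
        n = 0
        for value in values:
            if abs(value - mid) <= tolerance:
                break  # first position that looks like the bulk of the read
            n += 1
        return n

    five_prime = run_length(pcts)
    # If every position deviates, do not count the same run from both ends.
    three_prime = run_length(reversed(pcts)) if five_prime < len(pcts) else 0
    return five_prime, three_prime


# Made-up per-position CpG methylation percentages for read 2
r2_cpg = [45.0, 58.0, 71.0, 70.5, 69.8, 70.2, 71.1, 70.0, 62.0]
trim = suggest_trim(r2_cpg)
print(trim)
```

With these example values the result would correspond to `--ignore_r2 2 --ignore_3prime_r2 1`; always confirm against the plot before applying it.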
## Output Files
```bash
# Main output files:
# CpG_context_sample.txt.gz - Per-read CpG methylation (with --comprehensive; otherwise the strand-specific CpG_OT/CpG_OB/CpG_CTOT/CpG_CTOB files)
# sample.bismark.cov.gz - Coverage file
# sample.bedGraph.gz - bedGraph for visualization
# sample.CpG_report.txt.gz - Genome-wide CpG report (with --cytosine_report)
# Coverage file format:
# chr start end methylation% count_methylated count_unmethylated
```
## Parse Output in Python
```python
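import pandas as pd

# The coverage file has no header row, so column names are supplied here
cov = pd.read_csv('sample.bismark.cov.gz', sep='\t', header=None,
                  names=['chr', 'start', 'end', 'meth_pct', 'count_meth', 'count_unmeth'])
# Total read depth per cytosine
cov['coverage'] = cov['count_meth'] + cov['count_unmeth']
# Keep positions covered by at least 10 reads
cov_filtered = cov[cov['coverage'] >= 10]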
```
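After filtering, a common next step is a coverage-weighted methylation level per chromosome, computed from summed counts, not by averaging the per-site percentages. The sketch below uses made-up rows in the coverage-file layout so that it runs without a real file. The 10-read threshold mirrors the example above.

```python
import io

import pandas as pd

COV_COLUMNS = ["chr", "start", "end", "meth_pct", "count_meth", "count_unmeth"]

# Made-up rows standing in for sample.bismark.cov.gz
example = io.StringIO(
    "chr1\t100\t100\t80.0\t8\t2\n"
    "chr1\t200\t200\t50.0\t10\t10\n"
    "chr2\t300\t300\t0.0\t0\t4\n"
)
cov = pd.read_csv(example, sep="\t", header=None, names=COV_COLUMNS)
cov["coverage"] = cov["count_meth"] + cov["count_unmeth"]
kept = cov[cov["coverage"] >= 10]

# Sum counts per chromosome, then convert to a percentage
totals = kept.groupby("chr")[["count_meth", "count_unmeth"]].sum()
weighted = 100 * totals["count_meth"] / (totals["count_meth"] + totals["count_unmeth"])
print(weighted)
```

Summing counts first gives deeply covered sites proportionally more weight, which is usually what is wanted for a region-level estimate.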
## Key Parameters
| Parameter | Description |
|-----------|-------------|
| --paired-end | Paired-end mode |
| --gzip | Compress output |
| --bedGraph | Generate bedGraph |
| --cytosine_report | Full genome cytosine report |
| --genome_folder | Path to genome (for cytosine_report) |
| --CX | Report CHG/CHH contexts |
| --no_overlap | Avoid counting overlapping reads twice |
| --parallel | Parallel extraction threads |
| --mbias_only | Only M-bias analysis |
| --ignore N | Ignore first N bp of read 1 |
| --ignore_r2 N | Ignore first N bp of read 2 |
## Output Formats
| Format | Description | Use Case |
|--------|-------------|----------|
| CpG_context | Per-read methylation calls | Detailed analysis |
| .bismark.cov | Per-CpG coverage summary | methylKit input |
| .bedGraph | Methylation track | Genome browser |
| .CpG_report | All genome CpGs | Comprehensive analysis |
## Related Skills
- bismark-alignment - Generate input BAM files
- methylkit-analysis - Import coverage files to R
- dmr-detection - Find differentially methylated regions