bio-methylation-calling

Extract methylation calls from Bismark BAM files using bismark_methylation_extractor. Generates per-cytosine reports for CpG, CHG, and CHH contexts. Use when extracting methylation levels from aligned bisulfite sequencing data for downstream analysis.

1,802 stars

Best use case

bio-methylation-calling is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using bio-methylation-calling should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bio-methylation-calling/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/bio-methylation-calling/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/bio-methylation-calling/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How bio-methylation-calling Compares

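| Feature / Agent | bio-methylation-calling | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |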

Frequently Asked Questions

What does this skill do?

Extract methylation calls from Bismark BAM files using bismark_methylation_extractor. Generates per-cytosine reports for CpG, CHG, and CHH contexts. Use when extracting methylation levels from aligned bisulfite sequencing data for downstream analysis.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## Version Compatibility

Reference examples tested with: pandas 2.2+

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- CLI: `<tool> --version` then `<tool> --help` to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.

# Methylation Calling

**"Extract methylation calls from my Bismark BAM"** → Generate per-cytosine methylation reports (CpG, CHG, CHH contexts) from aligned bisulfite sequencing data.
- CLI: `bismark_methylation_extractor --bedGraph --cytosine_report --genome_folder /path/to/genome/ sample.bam`

## Basic Extraction

```bash
# Extract methylation calls from Bismark BAM
bismark_methylation_extractor --gzip --bedGraph \
    sample_bismark_bt2.bam
```

## Paired-End Extraction

```bash
bismark_methylation_extractor --paired-end --gzip --bedGraph \
    sample_bismark_bt2_pe.bam
```

## Common Options

```bash
# --paired-end: paired-end data; --gzip: compress output; --bedGraph: generate bedGraph file
# --cytosine_report: genome-wide cytosine report (requires --genome_folder)
# --buffer_size: memory buffer; --parallel: parallel extraction
bismark_methylation_extractor \
    --paired-end \
    --gzip \
    --bedGraph \
    --cytosine_report \
    --genome_folder /path/to/genome/ \
    --buffer_size 10G \
    --parallel 4 \
    -o output_dir/ \
    sample.bam
```
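
When the extractor is launched from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of Bismark: `build_extractor_cmd` and its parameters are hypothetical names, and it only uses flags already listed above. It pairs `--cytosine_report` with `--genome_folder` so that one is never passed without the other.

```python
import shlex
from typing import List, Optional


def build_extractor_cmd(
    bam: str,
    paired_end: bool = False,
    genome_folder: Optional[str] = None,
    output_dir: Optional[str] = None,
    parallel: Optional[int] = None,
) -> List[str]:
    """Assemble a bismark_methylation_extractor argument list (hypothetical helper)."""
    cmd = ["bismark_methylation_extractor", "--gzip", "--bedGraph"]
    if paired_end:
        cmd.append("--paired-end")
    if genome_folder is not None:
        # The cytosine report needs the genome, so both flags are added together.
        cmd += ["--cytosine_report", "--genome_folder", genome_folder]
    if parallel is not None:
        cmd += ["--parallel", str(parallel)]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    cmd.append(bam)
    return cmd


cmd = build_extractor_cmd(
    "sample.bam", paired_end=True, genome_folder="/path/to/genome/", parallel=4
)
# Print the command for inspection; it is not executed here.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it without going through a shell, which avoids quoting problems with unusual file names.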

## CpG Context Only

```bash
# Most common - extract only CpG methylation
# --no_overlap avoids double counting overlapping reads
# Add --CX only if the bedGraph/coverage output should also include CHG/CHH
bismark_methylation_extractor \
    --paired-end \
    --no_overlap \
    --gzip \
    --bedGraph \
    sample.bam
```

## Genome-Wide Cytosine Report

```bash
# Comprehensive report with all CpGs in genome
bismark_methylation_extractor \
    --paired-end \
    --gzip \
    --bedGraph \
    --cytosine_report \
    --genome_folder /path/to/genome/ \
    sample.bam
```
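
Because the cytosine report lists each strand of a CpG separately, a common follow-up is to pool the two strands into one record per CpG dinucleotide. The sketch below assumes the report has seven tab-separated columns (chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide context); the column names, the `merge_cpg_strands` helper, and the inline sample rows are illustrative only.

```python
import io

import pandas as pd

# Assumed column layout of a CpG report; names are chosen for this example.
REPORT_COLS = ["chr", "pos", "strand", "count_meth", "count_unmeth", "context", "trinuc"]


def merge_cpg_strands(report: pd.DataFrame) -> pd.DataFrame:
    """Pool top- and bottom-strand counts of each CpG (hypothetical helper)."""
    df = report.copy()
    # The bottom-strand C of a CpG sits one base after the top-strand C,
    # so minus-strand rows are shifted back by one to share a key.
    df["cpg_pos"] = df["pos"].where(df["strand"] == "+", df["pos"] - 1)
    merged = df.groupby(["chr", "cpg_pos"], as_index=False)[
        ["count_meth", "count_unmeth"]
    ].sum()
    merged["coverage"] = merged["count_meth"] + merged["count_unmeth"]
    # Uncovered CpGs are dropped so the percentage is always defined.
    covered = merged[merged["coverage"] > 0].copy()
    covered["meth_pct"] = 100 * covered["count_meth"] / covered["coverage"]
    return covered


# Made-up rows for illustration; a real file would be read with
# pd.read_csv("sample.CpG_report.txt.gz", sep="\t", header=None, names=REPORT_COLS)
sample = (
    "chr1\t100\t+\t8\t2\tCG\tCGA\n"
    "chr1\t101\t-\t6\t4\tCG\tCGT\n"
    "chr1\t250\t+\t0\t0\tCG\tCGG\n"
    "chr1\t251\t-\t0\t0\tCG\tCGC\n"
)
report = pd.read_csv(io.StringIO(sample), sep="\t", header=None, names=REPORT_COLS)
merged = merge_cpg_strands(report)
print(merged)
```

Pooling strands roughly doubles per-site coverage, which is only appropriate where CpG methylation can be treated as symmetric.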

## Strand-Specific Output

```bash
# Default: strand-specific output
# CpG_OT_sample.txt - Original Top strand
# CpG_OB_sample.txt - Original Bottom strand
# CpG_CTOT_sample.txt - Complementary to OT
# CpG_CTOB_sample.txt - Complementary to OB

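# Merge the four strand files into one file per context (CpG methylation is usually symmetric)
bismark_methylation_extractor --comprehensive --gzip sample.bam

# --merge_non_CpG does not merge strands: it pools CHG and CHH calls into one non-CpG output
bismark_methylation_extractor --comprehensive --merge_non_CpG --gzip sample.bam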
```

## Avoid Double-Counting Overlapping Reads

```bash
# For paired-end data with overlapping reads
# --no_overlap ignores the overlapping portion of read 2
bismark_methylation_extractor \
    --paired-end \
    --no_overlap \
    --gzip \
    sample_pe.bam
```

## Generate Coverage File

```bash
# bismark2bedGraph creates coverage file
bismark_methylation_extractor --bedGraph --gzip sample.bam

# Or run separately
bismark2bedGraph -o sample CpG_context_sample.txt.gz

# Coverage format: chr start end methylation_percentage count_meth count_unmeth
```

## Convert to BigWig for Visualization

```bash
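# bedGraph to BigWig (requires UCSC tools)
# bedGraphToBigWig needs uncompressed, sorted input without a track header line
zcat sample.bedGraph.gz | grep -v '^track' | LC_ALL=C sort -k1,1 -k2,2n > sample.sorted.bedGraph
bedGraphToBigWig sample.sorted.bedGraph chrom.sizes sample.bw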
```
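
The `chrom.sizes` argument is a two-column, tab-separated file of chromosome names and lengths. If one is not already available for the reference, it can be derived from the genome FASTA. The following is a minimal sketch under that assumption; `fasta_chrom_sizes` is a hypothetical helper and the sequences are toy data.

```python
from typing import Dict, Iterable


def fasta_chrom_sizes(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} from FASTA lines (hypothetical helper)."""
    sizes: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence name.
            name = line[1:].split()[0]
            sizes[name] = 0
        elif name is not None:
            sizes[name] += len(line)
    return sizes


# Toy FASTA content; a real genome would be read with open("genome.fa").
example = [">chr1 test sequence", "ACGTACGT", "ACG", ">chr2", "GGCC"]
sizes = fasta_chrom_sizes(example)
for chrom, length in sizes.items():
    print(f"{chrom}\t{length}")
```

The chromosome names must match those used in the bedGraph, so the FASTA should be the same one the reads were aligned against.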

## M-Bias Plot

```bash
# Check for methylation bias across read positions
# --mbias_only: only generate the M-bias plot
bismark_methylation_extractor --paired-end \
    --mbias_only \
    sample.bam

# Generates sample.M-bias.txt and sample.M-bias_R1.png, sample.M-bias_R2.png
```
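
The M-bias text report can also be screened programmatically before choosing `--ignore` values. The sketch below assumes each section of `sample.M-bias.txt` starts with a title line such as `CpG context (R1)`, followed by a row of `=` characters, a header line, and tab-separated rows of position, methylated count, unmethylated count, percent methylation, and coverage. The `parse_mbias` and `biased_positions` helpers and the 5-point tolerance are hypothetical choices, not Bismark defaults, so check the layout against a real report first.

```python
import statistics
from typing import Dict, List, Tuple


def parse_mbias(text: str) -> Dict[str, List[Tuple[int, float]]]:
    """Return {section title: [(position, percent methylation), ...]}."""
    sections: Dict[str, List[Tuple[int, float]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, the '=' underline, and the column header.
        if not line or line.startswith("=") or line.startswith("position"):
            continue
        if "context" in line:
            current = line
            sections[current] = []
            continue
        fields = line.split("\t")
        # Rows without a percentage (no coverage) are skipped.
        if current is not None and len(fields) >= 4 and fields[3] != "":
            sections[current].append((int(fields[0]), float(fields[3])))
    return sections


def biased_positions(rows: List[Tuple[int, float]], tolerance: float = 5.0) -> List[int]:
    """Positions whose methylation deviates from the section median by more than tolerance."""
    median = statistics.median(pct for _, pct in rows)
    return [pos for pos, pct in rows if abs(pct - median) > tolerance]


# Made-up report fragment for illustration only.
example = (
    "CpG context (R1)\n"
    "================\n"
    "position\tcount methylated\tcount unmethylated\t% methylation\tcoverage\n"
    "1\t50\t50\t50.00\t100\n"
    "2\t78\t22\t78.00\t100\n"
    "3\t80\t20\t80.00\t100\n"
    "4\t81\t19\t81.00\t100\n"
    "5\t79\t21\t79.00\t100\n"
)
sections = parse_mbias(example)
flagged = biased_positions(sections["CpG context (R1)"])
print(flagged)
```

A flagged run at the start or end of the read suggests candidate values for the ignore options; the plot itself remains the better guide for borderline cases.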

## Ignore End Bias

```bash
# Ignore positions with systematic bias (found from M-bias plot)
# --ignore / --ignore_r2: ignore first N bp of read 1 / read 2
# --ignore_3prime / --ignore_3prime_r2: ignore last N bp of read 1 / read 2
bismark_methylation_extractor \
    --paired-end \
    --ignore 2 \
    --ignore_r2 2 \
    --ignore_3prime 2 \
    --ignore_3prime_r2 2 \
    sample.bam
```

## Output Files

```bash
# Main output files:
# CpG_context_sample.txt.gz      - Per-read CpG methylation (strands merged, written with --comprehensive)
# sample.bismark.cov.gz          - Coverage file
# sample.bedGraph.gz             - bedGraph for visualization
# sample.CpG_report.txt.gz       - Genome-wide CpG report (with --cytosine_report)

# Coverage file format:
# chr  start  end  methylation%  count_methylated  count_unmethylated
```

## Parse Output in Python

```python
import pandas as pd

cov = pd.read_csv('sample.bismark.cov.gz', sep='\t', header=None,
                   names=['chr', 'start', 'end', 'meth_pct', 'count_meth', 'count_unmeth'])
cov['coverage'] = cov['count_meth'] + cov['count_unmeth']
cov_filtered = cov[cov['coverage'] >= 10]
```
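
Once sites are filtered by coverage, a read-weighted methylation level per chromosome is a quick sanity check. The sketch below is one way to do it, assuming the six-column coverage layout described above; `chrom_summary` is a hypothetical helper and the inline rows are made-up data. It sums counts before dividing, so deeply covered sites contribute more than shallow ones, unlike a plain mean of `meth_pct`.

```python
import io

import pandas as pd

COV_COLS = ["chr", "start", "end", "meth_pct", "count_meth", "count_unmeth"]


def chrom_summary(cov: pd.DataFrame, min_coverage: int = 10) -> pd.DataFrame:
    """Per-chromosome site count and read-weighted methylation (hypothetical helper)."""
    df = cov.copy()
    df["coverage"] = df["count_meth"] + df["count_unmeth"]
    df = df[df["coverage"] >= min_coverage]
    totals = df.groupby("chr")[["count_meth", "count_unmeth"]].sum()
    totals["n_sites"] = df.groupby("chr").size()
    totals["weighted_meth_pct"] = (
        100 * totals["count_meth"] / (totals["count_meth"] + totals["count_unmeth"])
    )
    return totals


# Made-up rows; a real file would be read from 'sample.bismark.cov.gz' instead.
sample = (
    "chr1\t100\t100\t80.0\t8\t2\n"
    "chr1\t200\t200\t50.0\t10\t10\n"
    "chr1\t300\t300\t100.0\t3\t0\n"
    "chr2\t50\t50\t0.0\t0\t12\n"
)
cov_example = pd.read_csv(
    io.StringIO(sample), sep="\t", header=None, names=COV_COLS, dtype={"chr": str}
)
summary = chrom_summary(cov_example)
print(summary)
```

Reading the `chr` column as a string avoids mixed types when a genome uses numeric chromosome names alongside names such as `X` or `MT`.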

## Key Parameters

| Parameter | Description |
|-----------|-------------|
| --paired-end | Paired-end mode |
| --gzip | Compress output |
| --bedGraph | Generate bedGraph |
| --cytosine_report | Full genome cytosine report |
| --genome_folder | Path to genome (for cytosine_report) |
| --CX | Report CHG/CHH contexts |
| --no_overlap | Avoid counting overlapping reads twice |
| --parallel | Parallel extraction threads |
| --mbias_only | Only M-bias analysis |
| --ignore N | Ignore first N bp of read 1 |
| --ignore_r2 N | Ignore first N bp of read 2 |

## Output Formats

| Format | Description | Use Case |
|--------|-------------|----------|
| CpG_context | Per-read methylation calls | Detailed analysis |
| .bismark.cov | Per-CpG coverage summary | methylKit input |
| .bedGraph | Methylation track | Genome browser |
| .CpG_report | All genome CpGs | Comprehensive analysis |

## Related Skills

- bismark-alignment - Generate input BAM files
- methylkit-analysis - Import coverage files to R
- dmr-detection - Find differentially methylated regions

Related Skills

bio-variant-calling

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Call SNPs and indels from aligned reads using bcftools mpileup and call. Use when detecting variants from BAM files or generating VCF from alignments.

bio-variant-calling-structural-variant-calling

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Call structural variants (SVs) from short-read sequencing using Manta, Delly, and LUMPY. Detects deletions, insertions, inversions, duplications, and translocations that are too large for standard SNV callers. Use when detecting structural variants from short-read data.

bio-variant-calling-joint-calling

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Joint genotype calling across multiple samples using GATK CombineGVCFs and GenotypeGVCFs. Essential for cohort studies, population genetics, and leveraging VQSR. Use when performing joint genotyping across multiple samples.

bio-variant-calling-filtering-best-practices

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Comprehensive variant filtering including GATK VQSR, hard filters, bcftools expressions, and quality metric interpretation for SNPs and indels. Use when filtering variants using GATK best practices.

bio-variant-calling-deepvariant

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Deep learning-based variant calling with Google DeepVariant. Provides high accuracy for germline SNPs and indels from Illumina, PacBio, and ONT data. Use when calling variants with DeepVariant deep learning caller.

bio-variant-calling-clinical-interpretation

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Clinical variant interpretation using ClinVar, ACMG guidelines, and pathogenicity predictors. Prioritize variants for diagnostic and research applications. Use when interpreting clinical significance of variants.

bio-methylation-methylkit

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

DNA methylation analysis with methylKit in R. Import Bismark coverage files, filter by coverage, normalize samples, and perform statistical comparisons. Use when analyzing single-base methylation patterns, comparing samples, or preparing data for DMR detection.

bio-methylation-dmr-detection

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Differentially methylated region (DMR) detection using methylKit tiles, bsseq BSmooth, and DMRcate. Use when identifying contiguous genomic regions with methylation differences between experimental conditions or cell types.

bio-methylation-bismark-alignment

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Bisulfite sequencing read alignment using Bismark with bowtie2/hisat2. Handles genome preparation and produces BAM files with methylation information. Use when aligning WGBS, RRBS, or other bisulfite-converted sequencing reads to a reference genome.

bio-methylation-based-detection

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Analyzes cfDNA methylation patterns for cancer detection using cfMeDIP-seq or bisulfite sequencing with MethylDackel. Identifies cancer-specific methylation signatures and performs tissue-of-origin deconvolution. Use when using methylation biomarkers for early cancer detection or minimal residual disease.

bio-long-read-sequencing-nanopore-methylation

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Calls DNA methylation from Oxford Nanopore sequencing data using signal-level analysis. Use when detecting 5mC or 6mA modifications directly from nanopore reads without bisulfite conversion.

bio-hi-c-analysis-loop-calling

1802
from FreedomIntelligence/OpenClaw-Medical-Skills

Detect chromatin loops and point interactions from Hi-C data using cooltools, chromosight, and HiCCUPS-like methods. Identify CTCF-mediated loops and enhancer-promoter contacts. Use when detecting chromatin loops from Hi-C data.