correlation-methylation-epiFeatures

This skill provides a complete pipeline for integrating CpG methylation data with chromatin features such as ATAC-seq signal, H3K27ac, H3K4me3, or other histone marks/TF signals.

by diegosouzapw

Best use case

correlation-methylation-epiFeatures is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using correlation-methylation-epiFeatures should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/correlation-methylation-epifeatures/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/product/correlation-methylation-epifeatures/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/correlation-methylation-epifeatures/SKILL.md inside your project.
3. Restart your AI agent so that it auto-discovers the skill.

How correlation-methylation-epiFeatures Compares

Feature / Agent	correlation-methylation-epiFeatures	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

SKILL.md Source

# Integrative Analysis of DNA Methylation and Chromatin Features

## 1. Overview

Main steps include:
- Refer to the **Inputs & Outputs** section to check the required inputs and set up the output directory structure.
- **Always prompt the user** for the genome assembly used.
- **Always prompt the user** for which columns in the methylation BED files hold the methylation fraction/percent, the coverage, and the strand.
- Load and preprocess the CpG methylation data.
- Tile methylation into fixed-size windows (e.g., 1 kb) or summarize it over target regions.
- Import chromatin feature signal from bigWig files.
- Build a unified region-level integration table.
- Calculate correlations between every pair of features.
- Visualize the correlations.
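
Before prompting for the methylation, coverage, and strand columns, it can help to show the user the first rows of the file with explicit column indices. The following is a minimal sketch in base R; `preview_columns` is a hypothetical helper name, and the toy file contents are invented for illustration and do not follow any particular caller's format.

```r
# Hypothetical helper (not part of methylKit): read the first rows of a
# tab-separated methylation file and label the columns col1, col2, ...
preview_columns <- function(path, n = 5) {
  head_df <- read.table(path, header = FALSE, sep = "\t", nrows = n,
                        stringsAsFactors = FALSE, comment.char = "#")
  colnames(head_df) <- paste0("col", seq_len(ncol(head_df)))
  head_df
}

# Toy file standing in for <methylation_coverage>.bed (made-up values)
toy_path <- tempfile(fileext = ".bed")
writeLines(c(
  "chr1\t100\t101\tCpG\t0\t+\t12\t75",
  "chr1\t250\t251\tCpG\t0\t-\t8\t25"
), toy_path)

preview <- preview_columns(toy_path)
print(preview)
```

The column indices confirmed by the user can then be passed as `strand.col`, `coverage.col`, and `freqC.col` when reading the data.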

---

## 2. When to Use This Skill

Use this pipeline when you want to explore how DNA methylation relates to chromatin state, accessibility, or histone modifications. Suitable scenarios include:
- Assessing promoter/enhancer activation via methylation & ATAC/H3K27ac
- Integrating multi-omics datasets (ChIP-seq, ATAC-seq, WGBS)
- Evaluating epigenomic shifts across conditions, tissues, or cell types

---

## 3. Inputs & Outputs

### Inputs

- `<methylation_coverage>.bed`
- `<epi_feature_1>.bw`
- `<epi_feature_2>.bw`
- `<target_regions>.bed` (optional)
- `<genomic_annotation>.gtf` (optional)

### Outputs

```bash
corr_epi_methylation/
  stats/
    region_signal_table.tsv   # Unified table of methylation + chromatin signal
    correlation_table.tsv     # Spearman correlations between features
  plots/
    *.pdf                     # Heatmap/scatterplots of the correlations
  temp/
```
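
The directory tree above can be created up front so later steps can write into it. This is a minimal sketch in base R; the variable name `output_dir` is an assumption chosen to match the path used in the visualization step.

```r
# Assumption: output_dir is a name picked here; adjust the path as needed
output_dir <- "corr_epi_methylation"

# Create stats/, plots/, and temp/ (no error if they already exist)
for (sub in c("stats", "plots", "temp")) {
  dir.create(file.path(output_dir, sub), recursive = TRUE, showWarnings = FALSE)
}

list.dirs(output_dir, full.names = FALSE, recursive = FALSE)
```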
---

## 4. Decision Tree

### STEP 1: Prepare the sample methylation data

```r
library(GenomicRanges)
library(methylKit)

# Single-sample example. For several samples, pass lists of files and
# sample IDs plus a treatment vector to methRead() instead.
meth <- methRead(
  location = "sample.bed",
  sample.id = "S1",
  assembly = "hg38",  # provided by user
  treatment = 0,
  context = "CpG",
  pipeline = list(
    fraction = FALSE,  # FALSE if the methylation column is 0-100 (percent), TRUE if it is 0-1
    chr.col = 1,
    start.col = 2,
    end.col = 3,
    strand.col = 6,    # provided by user
    coverage.col = 10, # provided by user
    freqC.col = 11     # provided by user
  )
)
```
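
Whether to set `fraction = TRUE` or `FALSE` depends on the scale of the methylation column. The sketch below is a rough heuristic for checking this before asking the user to confirm; `looks_like_fraction` is a hypothetical helper, and it will misclassify a percent column whose values all happen to be at most 1.

```r
# Hypothetical helper: guess whether a methylation column is on a 0-1 scale.
# Heuristic only: a maximum of at most 1 is taken to mean "fraction".
looks_like_fraction <- function(x) {
  x <- x[!is.na(x)]
  stopifnot(length(x) > 0, all(x >= 0))
  max(x) <= 1
}

# Made-up example columns
frac_col    <- c(0, 0.25, 0.9, 1)
percent_col <- c(0, 25, 90, 100)

is_frac_1 <- looks_like_fraction(frac_col)
is_frac_2 <- looks_like_fraction(percent_col)
print(c(fraction_column = is_frac_1, percent_column = is_frac_2))
```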

### STEP 3: Tile methylation into 1kb bins or count methylation in target regions

Option 1: No BED file of target regions provided; calculate correlations in fixed-size bins

``` r
library(rtracklayer)
meth_tile <- tileMethylCounts(meth, win.size = 1000)
d <- getData(meth_tile)
mean_methylation <- d$numCs / (d$numCs + d$numTs)
regions <- as(meth_tile, "GRanges")
```
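
To make the pooling in `numCs / (numCs + numTs)` concrete, here is a toy base-R sketch of tiling, independent of methylKit. The counts are invented, and the 1-based window boundaries (1-1000, 1001-2000, ...) are an assumption made for illustration. Summing counts per window weights each CpG by its coverage, which generally differs from averaging per-CpG ratios.

```r
# Toy per-CpG counts (made-up values); column names mirror numCs/numTs
cpg <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr1"),
  pos   = c(100, 900, 1500, 2500),
  numCs = c(9, 1, 5, 0),
  numTs = c(1, 1, 5, 10)
)

win_size <- 1000
# Assumed 1-based windows: positions 1-1000 -> 1, 1001-2000 -> 1001, ...
cpg$win_start <- ((cpg$pos - 1) %/% win_size) * win_size + 1

# Sum methylated and unmethylated counts per window
tiles <- aggregate(cbind(numCs, numTs) ~ chr + win_start, data = cpg, FUN = sum)
tiles <- tiles[order(tiles$chr, tiles$win_start), ]

# Pooled (coverage-weighted) methylation per window
tiles$mean_methylation <- tiles$numCs / (tiles$numCs + tiles$numTs)
print(tiles)
```

In the first window the pooled value is 10/12, whereas the unweighted mean of the two per-CpG ratios (0.9 and 0.5) would be 0.7.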

Option 2: Target regions provided; calculate correlations in the target regions

``` r
library(rtracklayer)
bed_file <- "targets.bed"
targets <- import(bed_file, format = "BED")
meth_region <- regionCounts(meth, regions = targets)
d <- getData(meth_region)
mean_methylation <- d$numCs / (d$numCs + d$numTs)
regions     <- as(meth_region, "GRanges")  # similar to 'targets'
```

Option 3: Calculate correlations in annotated genomic regions (e.g., promoters)
```r
library(TxDb.Hsapiens.UCSC.hg38.knownGene) # choose the TxDb matching the genome assembly provided by the user
library(rtracklayer)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
gene_gr <- genes(txdb)   # one range per gene
# Prompt the user for the promoter definition (bases upstream/downstream of the TSS)
regions <- promoters(gene_gr,
                     upstream   = 2000,
                     downstream = 200)
regions <- keepStandardChromosomes(regions, pruning.mode = "coarse")

meth_region <- regionCounts(meth, regions = regions)
d <- getData(meth_region)
mean_methylation <- d$numCs / (d$numCs + d$numTs)
regions <- as(meth_region, "GRanges")  # similar to the promoter regions above
```

### STEP 4: Build integrated region table

```r
bw_ATAC    <- "ATAC.bigWig"
bw_H3K27ac <- "H3K27ac.bigWig"
bw_H3K4me3 <- "H3K4me3.bigWig"
# ... other available genomic features

# Mean bigWig signal per region (one value per element of `regions`)
get_bw_mean <- function(bw_file, regions) {
  bw_list <- import(bw_file, which = regions, as = "NumericList")
  sapply(bw_list, function(x) mean(x, na.rm = TRUE))
}

ATAC_sig    <- get_bw_mean(bw_ATAC,    regions)
H3K27ac_sig <- get_bw_mean(bw_H3K27ac, regions)
H3K4me3_sig <- get_bw_mean(bw_H3K4me3, regions)

# Do not add a gene_id column when building the data frame here
df <- data.frame(
  seqnames = as.character(seqnames(regions)),
  start = start(regions),
  end = end(regions),
  mean_methylation = mean_methylation,
  ATAC = ATAC_sig,
  H3K27ac = H3K27ac_sig,
  H3K4me3 = H3K4me3_sig
)

output_dir <- "corr_epi_methylation"  # matches the Outputs tree
dir.create(file.path(output_dir, "stats"), recursive = TRUE, showWarnings = FALSE)
write.table(df, file.path(output_dir, "stats", "region_signal_table.tsv"),
            sep = "\t", quote = FALSE, row.names = FALSE)
```


### STEP 6: Calculate correlations

```r
features_mat <- df[, c("mean_methylation", "ATAC", "H3K27ac", "H3K4me3")]
cor_mat <- cor(
  features_mat,
  use = "pairwise.complete.obs",
  method = "spearman"
)

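output_dir <- "corr_epi_methylation"  # matches the Outputs tree
dir.create(file.path(output_dir, "stats"), recursive = TRUE, showWarnings = FALSE)
write.table(
  cor_mat,
  file.path(output_dir, "stats", "correlation_table.tsv"),
  sep = "\t",
  quote = FALSE,
  col.names = NA
)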
```
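
`cor()` returns only the correlation coefficients. If p-values for methylation versus each feature are also wanted, one possible sketch uses base R's `cor.test()` per feature. The toy table below is invented and only reuses the column names from the region table; the Benjamini-Hochberg adjustment is an illustrative choice, not something the pipeline prescribes.

```r
# Toy region table (made-up values) with the same column names as df
toy <- data.frame(
  mean_methylation = c(0.9, 0.8, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05),
  ATAC             = c(1, 2, 3, 5, 8, 13, 21, 34),
  H3K27ac          = c(0.5, 1.5, 1.0, 3.0, 2.5, 6.0, 5.0, 9.0)
)

features <- setdiff(colnames(toy), "mean_methylation")

# Spearman test of methylation against each chromatin feature
per_feature <- do.call(rbind, lapply(features, function(f) {
  ct <- cor.test(toy$mean_methylation, toy[[f]],
                 method = "spearman", exact = FALSE)
  data.frame(feature = f, rho = unname(ct$estimate), p_value = ct$p.value)
}))

# Multiple-testing adjustment across features (illustrative choice)
per_feature$p_adj <- p.adjust(per_feature$p_value, method = "BH")
print(per_feature)
```

`exact = FALSE` requests the asymptotic p-value, which avoids warnings about ties that are common in binned genomic signal.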

### STEP 7: Visualization

```r
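library(pheatmap)

output_dir <- "corr_epi_methylation"  # matches the Outputs tree
dir.create(file.path(output_dir, "plots"), recursive = TRUE, showWarnings = FALSE)

# Heatmap of the pairwise correlation matrix
pdf(file.path(output_dir, "plots", "feature_correlation_heatmap.pdf"), width = 4, height = 4)
pheatmap(
  cor_mat,
  cluster_rows = TRUE,
  cluster_cols = TRUE,
  display_numbers = TRUE,
  number_format = "%.2f",
  main = "Feature correlation"
)
dev.off()

# df_clean is not defined in the earlier steps; as an assumption, it is taken
# here to be the rows of df with complete data for all plotted features
df_clean <- df[complete.cases(df[, c("mean_methylation", "ATAC", "H3K27ac", "H3K4me3")]), ]

# Scatter plots
pdf(file.path(output_dir, "plots", "methylation_epi_scatterplots.pdf"), width = 10, height = 5)
par(mfrow = c(1, 2))

# Methylation vs ATAC (mean_methylation is a 0-1 fraction)
plot(df_clean$mean_methylation, df_clean$ATAC,
     xlab = "Mean methylation (fraction)", ylab = "ATAC-seq signal",
     main = paste("Methylation vs ATAC-seq\nrho =", round(cor_mat["mean_methylation", "ATAC"], 3)),
     pch = 16, cex = 0.5, col = rgb(0, 0, 1, 0.3))
# ... other methylation vs. feature pairs
dev.off()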
```
