bio-atac-seq-differential-accessibility
Find differentially accessible chromatin regions between conditions using DiffBind or DESeq2. Use when comparing chromatin accessibility between treatment groups, cell types, or developmental stages in ATAC-seq experiments.
Best use case
bio-atac-seq-differential-accessibility is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using bio-atac-seq-differential-accessibility should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/bio-atac-seq-differential-accessibility/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How bio-atac-seq-differential-accessibility Compares
| Feature / Agent | bio-atac-seq-differential-accessibility | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
SKILL.md Source
## Version Compatibility
Reference examples tested with: DESeq2 1.42+, GenomicRanges 1.54+, Subread 2.0+, numpy 1.26+, pandas 2.2+, scanpy 1.10+, scipy 1.12+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
- CLI: `<tool> --version` then `<tool> --help` to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
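For the Python packages, the version check described above can also be scripted. The following is a minimal sketch using only the standard library; `MINIMUM_VERSIONS`, `parse_version`, and `check_versions` are hypothetical names introduced for illustration, and the minimums are copied from the compatibility line above.

```python
from importlib import metadata

# Minimum versions copied from the compatibility note (Python packages only).
MINIMUM_VERSIONS = {"numpy": "1.26", "pandas": "2.2", "scipy": "1.12"}


def parse_version(text):
    """Turn '2.2.1' into (2, 2, 1), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_versions(minimums):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            report[package] = (None, False)
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[package] = (installed, ok)
    return report
```

Calling `check_versions(MINIMUM_VERSIONS)` returns one entry per package, so a failing entry can be inspected before any of the example code is run.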
# Differential Accessibility
**"Find differentially accessible regions between my conditions"** → Identify chromatin regions with statistically significant changes in accessibility between treatment groups, cell types, or timepoints.
- R: `DiffBind` or `DESeq2` on a peak-by-sample count matrix
## DiffBind Workflow
**Goal:** Identify differentially accessible chromatin regions between experimental conditions.
**Approach:** Load sample metadata and peak files into DiffBind, count reads in consensus peaks, normalize, define contrasts, and run the differential analysis with the DESeq2 backend.
```r
library(DiffBind)
# 1. Create sample sheet
samples <- data.frame(
SampleID = c('ctrl_1', 'ctrl_2', 'treat_1', 'treat_2'),
Condition = c('control', 'control', 'treated', 'treated'),
Replicate = c(1, 2, 1, 2),
bamReads = c('ctrl_1.bam', 'ctrl_2.bam', 'treat_1.bam', 'treat_2.bam'),
Peaks = c('ctrl_1.narrowPeak', 'ctrl_2.narrowPeak', 'treat_1.narrowPeak', 'treat_2.narrowPeak')
)
write.csv(samples, 'samples.csv', row.names=FALSE)
# 2. Load data
dba <- dba(sampleSheet='samples.csv')
# 3. Count reads
dba <- dba.count(dba)
# 4. Normalize
dba <- dba.normalize(dba)
# 5. Set up contrasts
dba <- dba.contrast(dba, contrast=c('Condition', 'treated', 'control'))
# 6. Differential analysis
dba <- dba.analyze(dba)
# 7. Get results
results <- dba.report(dba)
```
## DiffBind with Consensus Peaks
```r
library(DiffBind)
# Load samples
dba <- dba(sampleSheet='samples.csv')
# Count with specific parameters
dba <- dba.count(dba,
summits=250, # Re-center peaks on summit
minOverlap=2, # Peak in at least 2 samples
score=DBA_SCORE_NORMALIZED)
# Normalize
dba <- dba.normalize(dba, normalize=DBA_NORM_NATIVE)
# Analyze
dba <- dba.contrast(dba, contrast=c('Condition', 'treated', 'control'))
dba <- dba.analyze(dba, method=DBA_DESEQ2)
# Extract results
results <- dba.report(dba, th=0.05, bCounts=TRUE)
# Save
write.csv(as.data.frame(results), 'differential_peaks.csv')
```
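To make the `minOverlap=2` setting concrete, here is a rough plain-Python sketch of the idea behind a consensus peak set: overlapping peaks from different samples are merged, and a merged region is kept only if enough distinct samples contributed to it. This is an illustration under simplifying assumptions (half-open intervals, no summit re-centering), not DiffBind's actual implementation; `consensus_peaks` is a hypothetical helper.

```python
def consensus_peaks(peaks_by_sample, min_overlap=2):
    """Merge overlapping peaks and keep regions seen in >= min_overlap samples.

    peaks_by_sample maps a sample name to a list of (chrom, start, end) tuples.
    """
    # Flatten to (chrom, start, end, sample) and sort by genomic position.
    tagged = sorted(
        (chrom, start, end, sample)
        for sample, peaks in peaks_by_sample.items()
        for chrom, start, end in peaks
    )
    merged = []  # each entry: [chrom, start, end, set_of_supporting_samples]
    for chrom, start, end, sample in tagged:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start < last[2]:
            # Overlaps the current merged region: extend it and record the sample.
            last[2] = max(last[2], end)
            last[3].add(sample)
        else:
            merged.append([chrom, start, end, {sample}])
    return [(c, s, e) for c, s, e, samples in merged if len(samples) >= min_overlap]


peaks = {
    "ctrl_1": [("chr1", 100, 200), ("chr1", 500, 600)],
    "ctrl_2": [("chr1", 150, 250)],
    "treat_1": [("chr1", 900, 950)],
}
print(consensus_peaks(peaks, min_overlap=2))
```

Only the region supported by both `ctrl_1` and `ctrl_2` survives the filter; peaks seen in a single sample are dropped.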
## DiffBind Visualizations
```r
# PCA plot
dba.plotPCA(dba, attributes=DBA_CONDITION)
# MA plot
dba.plotMA(dba)
# Volcano plot
dba.plotVolcano(dba)
# Heatmap of differential peaks
dba.plotHeatmap(dba, contrast=1, correlations=FALSE)
# Venn diagram of differential peaks for contrast 1 (all, gained, lost)
dba.plotVenn(dba, contrast=1, bDB=TRUE, bGain=TRUE, bLoss=TRUE)
```
## Using DESeq2 Directly
**Goal:** Run differential accessibility analysis using DESeq2 on a peak count matrix without DiffBind.
**Approach:** Load peak-by-sample counts into a DESeqDataSet, filter low counts, run the DESeq2 pipeline, and extract significant differential peaks.
```r
library(DESeq2)
library(GenomicRanges)
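# Load peak counts (from featureCounts or custom counting)
# featureCounts output starts with a '#' comment line, so skip it when parsing
counts <- read.delim('peak_counts.txt', row.names=1, comment.char='#')
# Drop the featureCounts annotation columns (Chr, Start, End, Strand, Length);
# skip this line if the file is already a plain peak-by-sample matrix
counts <- counts[, -(1:5)]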
# Sample metadata
coldata <- data.frame(
row.names = colnames(counts),
condition = factor(c('control', 'control', 'treated', 'treated'))
)
# Create DESeq object
dds <- DESeqDataSetFromMatrix(countData=counts, colData=coldata, design=~condition)
# Filter low counts
dds <- dds[rowSums(counts(dds)) >= 10, ]
# Run DESeq2
dds <- DESeq(dds)
# Results
res <- results(dds, contrast=c('condition', 'treated', 'control'))
res <- res[order(res$padj), ]
# Significant peaks
sig <- subset(res, padj < 0.05 & abs(log2FoldChange) > 1)
```
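Before testing, DESeq2 corrects for sequencing depth with median-of-ratios size factors. The sketch below shows that idea in plain Python for intuition only; it is not DESeq2's code, and `size_factors` is a hypothetical helper that expects at least one peak with non-zero counts in every sample.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts is a list of rows (one per peak), each a list of per-sample counts.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if min(row) <= 0:
            # A zero count makes the geometric mean zero, so skip this peak.
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor is the median ratio of a sample to the per-peak geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Second sample sequenced twice as deeply as the first.
example = [[10, 20], [100, 200], [5, 10], [0, 7]]
print(size_factors(example))
```

In this toy matrix the second sample has twice the depth of the first, so its size factor comes out twice as large.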
## Count Reads in Peaks
**Goal:** Generate a peak-by-sample count matrix as input for differential analysis.
**Approach:** Convert consensus peaks to SAF format and run featureCounts to count reads from all BAM files in each peak region.
```bash
# Using featureCounts
# First convert peaks to SAF format
# BED starts are 0-based while SAF starts are 1-based, so add 1 to the start
awk 'BEGIN{OFS="\t"; print "GeneID\tChr\tStart\tEnd\tStrand"}
{print $1"_"$2"_"$3, $1, $2+1, $3, "."}' consensus_peaks.bed > peaks.saf
featureCounts \
-a peaks.saf \
-F SAF \
-o peak_counts.txt \
-p \
--countReadPairs \
-T 8 \
*.bam
```
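The BED-to-SAF conversion can also be written in Python, which makes the coordinate handling explicit. This is a minimal sketch: `bed_to_saf` is a hypothetical helper, the peak ID scheme (`chrom_start_end`) and the `.` strand follow the awk command above, and the start is shifted by one because BED starts are 0-based while SAF starts are 1-based.

```python
SAF_HEADER = ["GeneID", "Chr", "Start", "End", "Strand"]


def bed_to_saf(bed_lines):
    """Yield SAF rows (lists of strings) from an iterable of BED lines."""
    yield SAF_HEADER
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        peak_id = f"{chrom}_{start}_{end}"
        # Convert the 0-based BED start to a 1-based SAF start.
        yield [peak_id, chrom, str(int(start) + 1), end, "."]


for row in bed_to_saf(["chr1\t99\t200", "chr2\t0\t50\tpeak2"]):
    print("\t".join(row))
```

Writing the rows to `peaks.saf` with tab separators gives a file that can be passed to `featureCounts -F SAF`.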
## Python Alternative
```python
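import pandas as pd
import numpy as np
from scipy import stats


def simple_differential(counts_file, groups):
    '''Simple differential accessibility test (per-peak t-test on log2 CPM).

    groups maps each sample column name to 'control' or 'treated'.
    '''
    counts = pd.read_csv(counts_file, sep='\t', index_col=0, comment='#')
    # Keep only sample columns; this drops the featureCounts annotation
    # columns (Chr, Start, End, Strand, Length) if they are present
    counts = counts[[c for c in counts.columns if c in groups]]

    # Normalize to CPM
    cpm = counts.div(counts.sum()) * 1e6
    # Log transform
    log_cpm = np.log2(cpm + 1)

    # Separate groups
    group1 = [c for c in counts.columns if groups[c] == 'control']
    group2 = [c for c in counts.columns if groups[c] == 'treated']

    results = []
    for peak in counts.index:
        g1_vals = log_cpm.loc[peak, group1]
        g2_vals = log_cpm.loc[peak, group2]
        # Positive values mean more accessible in the treated group
        log2fc = g2_vals.mean() - g1_vals.mean()
        t_stat, pval = stats.ttest_ind(g1_vals, g2_vals)
        results.append({
            'peak': peak,
            'log2FoldChange': log2fc,
            'pvalue': pval
        })

    df = pd.DataFrame(results)
    # Benjamini-Hochberg correction; peaks with an undefined p-value
    # (e.g. zero variance in both groups) keep NaN instead of raising an error
    tested = df['pvalue'].notna()
    df['padj'] = np.nan
    df.loc[tested, 'padj'] = stats.false_discovery_control(df.loc[tested, 'pvalue'])
    return df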
```
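The `padj` column above comes from `stats.false_discovery_control`, which applies the Benjamini-Hochberg procedure by default. For readers who want to see what that adjustment does, here is a small plain-Python sketch of the procedure; `benjamini_hochberg` is a hypothetical helper written for illustration, not SciPy's implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is
    # capped by the adjusted value of the next larger p-value.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each raw p-value is scaled by `n / rank`, and the backward pass keeps the adjusted values monotone, so a peak never receives a smaller adjusted p-value than one with a smaller raw p-value.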
## Annotate Differential Peaks
**Goal:** Map differential peaks to nearby genes and genomic features for biological interpretation.
**Approach:** Use ChIPseeker to annotate peaks with promoter/intron/intergenic classification and distance to nearest TSS.
```r
library(ChIPseeker)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
# Annotate differential peaks
diff_peaks <- dba.report(dba)
peakAnno <- annotatePeak(diff_peaks, TxDb=TxDb.Hsapiens.UCSC.hg38.knownGene)
# Plot annotation
plotAnnoPie(peakAnno)
plotDistToTSS(peakAnno)
# Get genes
genes <- as.data.frame(peakAnno)$geneId
```
## Filter Results
```r
# Get significant results
sig_peaks <- dba.report(dba, th=0.05, fold=1)
# Opened in treatment
opened <- sig_peaks[sig_peaks$Fold > 0]
# Closed in treatment
closed <- sig_peaks[sig_peaks$Fold < 0]
# Export as BED (export.bed is provided by rtracklayer)
library(rtracklayer)
export.bed(opened, 'opened_peaks.bed')
export.bed(closed, 'closed_peaks.bed')
```
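The same split can be applied to a results table produced outside DiffBind, for example the DESeq2 or Python output with `peak`, `log2FoldChange`, and `padj` columns. The sketch below assumes peak IDs of the form `chrom_start_end` (as produced by the SAF conversion in this document) and uses the same cutoffs as the R code; `peak_id_to_bed`, `split_by_direction`, and `to_bed_text` are hypothetical helpers.

```python
def peak_id_to_bed(peak_id):
    """Split 'chr1_100_200' into ('chr1', 100, 200)."""
    # rsplit keeps underscores that belong to contig names such as 'chrUn_KI270742v1'.
    chrom, start, end = peak_id.rsplit("_", 2)
    return chrom, int(start), int(end)


def split_by_direction(rows, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return (opened, closed) interval lists from result rows (dicts)."""
    opened, closed = [], []
    for row in rows:
        padj = row["padj"]
        lfc = row["log2FoldChange"]
        if padj is None or padj != padj:  # skip missing or NaN adjusted p-values
            continue
        if padj >= padj_cutoff or abs(lfc) <= lfc_cutoff:
            continue
        # Positive fold change: more accessible in treatment.
        (opened if lfc > 0 else closed).append(peak_id_to_bed(row["peak"]))
    return sorted(opened), sorted(closed)


def to_bed_text(intervals):
    """Format intervals as three-column, tab-separated BED text."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in intervals)
```

The text returned by `to_bed_text` can be written straight to `opened_peaks.bed` or `closed_peaks.bed`.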
## Multi-factor Designs
```r
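# Complex design with batch correction
# DiffBind design formulas can only use its standard metadata columns
# (Tissue, Factor, Condition, Treatment, Replicate), so the batch label
# is stored in the Factor column here
samples$Factor <- c('A', 'B', 'A', 'B')
dba <- dba(sampleSheet=samples)
dba <- dba.count(dba)
dba <- dba.normalize(dba)
# Design formula approach
dba <- dba.contrast(dba, design='~Factor + Condition',
contrast=c('Condition', 'treated', 'control'))
dba <- dba.analyze(dba)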
```
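A batch term only helps when batch and condition are not confounded. The following plain-Python sketch checks this for a two-condition design before any counting is done; `conditions_per_batch` and `is_confounded` are hypothetical helpers, and the `batch`/`condition` keys are illustrative names rather than DiffBind column names.

```python
from collections import defaultdict


def conditions_per_batch(samples):
    """Map each batch label to the set of conditions observed in it."""
    seen = defaultdict(set)
    for sample in samples:
        seen[sample["batch"]].add(sample["condition"])
    return dict(seen)


def is_confounded(samples):
    """True when every batch contains a single condition.

    In that case the condition effect cannot be separated from the batch
    effect in a batch + condition design with two conditions.
    """
    return all(len(conds) < 2 for conds in conditions_per_batch(samples).values())


balanced = [
    {"batch": "A", "condition": "control"},
    {"batch": "B", "condition": "control"},
    {"batch": "A", "condition": "treated"},
    {"batch": "B", "condition": "treated"},
]
print(is_confounded(balanced))
```

The balanced layout mirrors the four-sample example in this document, where each batch contains one control and one treated sample.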
## Related Skills
- atac-seq/atac-peak-calling - Generate input peaks
- differential-expression/deseq2-basics - DESeq2 methods
- chip-seq/differential-binding - Similar DiffBind workflow
- pathway-analysis/go-enrichment - Analyze differential genes
Related Skills
datacommons-client
Work with Data Commons, a platform providing programmatic access to public statistical data from global sources. Use this skill when working with demographic data, economic indicators, health statistics, environmental data, or any public datasets available through Data Commons. Applicable for querying population statistics, GDP figures, unemployment rates, disease prevalence, geographic entity resolution, and exploring relationships between statistical entities.
bulk-rna-seq-differential-expression-with-omicverse
Guide Claude through omicverse's bulk RNA-seq DEG pipeline, from gene ID mapping and DESeq2 normalization to statistical testing, visualization, and pathway enrichment. Use when a user has bulk count matrices and needs differential expression analysis in omicverse.
bio-single-cell-scatac-analysis
Single-cell ATAC-seq analysis with Signac (R/Seurat) and ArchR. Process 10X Genomics scATAC data, perform QC, dimensionality reduction, clustering, peak calling, and motif activity scoring with chromVAR. Use when analyzing single-cell ATAC-seq data.
bio-proteomics-differential-abundance
Statistical testing for differentially abundant proteins between conditions. Covers limma and MSstats workflows with multiple testing correction. Use when identifying proteins with significant abundance changes between experimental groups.
bio-microbiome-differential-abundance
Differential abundance testing for microbiome data using compositionally-aware methods like ALDEx2, ANCOM-BC2, and MaAsLin2. Use when identifying taxa that differ between experimental groups while accounting for the compositional nature of microbiome data.
bio-hi-c-analysis-hic-differential
Compare Hi-C contact matrices between conditions to identify differential chromatin interactions. Compute log2 fold changes, statistical significance, and visualize differential contact maps. Use when comparing Hi-C contacts between conditions.
bio-flow-cytometry-differential-analysis
Differential abundance and state analysis for cytometry data. Compare cell populations between conditions using statistical methods. Use when testing for significant changes in cell frequencies or marker expression between groups.
bio-differential-splicing
Detects differential alternative splicing between conditions using rMATS-turbo (BAM-based) or SUPPA2 diffSplice (TPM-based). Reports events with FDR-corrected significance and delta PSI effect sizes. Use when comparing splicing patterns between treatment groups, tissues, or disease states.
bio-differential-expression-timeseries-de
Analyze time-series RNA-seq data using limma voom with splines, maSigPro, and ImpulseDE2. Identify genes with dynamic expression patterns. Use when analyzing time-series or longitudinal expression data.
bio-differential-expression-batch-correction
Remove batch effects from RNA-seq data using ComBat, ComBat-Seq, limma removeBatchEffect, and SVA for unknown batch variables. Use when correcting batch effects in expression data.
bio-chipseq-differential-binding
Differential binding analysis using DiffBind. Compare ChIP-seq peaks between conditions with statistical rigor. Requires replicate samples. Outputs differentially bound regions with fold changes and p-values. Use when comparing ChIP-seq binding between conditions.
bio-atac-seq-nucleosome-positioning
Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.