bio-atac-seq-nucleosome-positioning
Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.
Best use case
bio-atac-seq-nucleosome-positioning is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.
Teams using bio-atac-seq-nucleosome-positioning should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/bio-atac-seq-nucleosome-positioning/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How bio-atac-seq-nucleosome-positioning Compares
| Feature / Agent | bio-atac-seq-nucleosome-positioning | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Extract nucleosome positions from ATAC-seq data using NucleoATAC, ATACseqQC, and fragment analysis. Use when analyzing chromatin organization, identifying nucleosome-free regions at promoters, or characterizing nucleosome occupancy patterns from ATAC-seq fragment size distributions.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
## Version Compatibility
Reference examples tested with: Rsamtools 2.18+, matplotlib 3.8+, numpy 1.26+, pyBigWig 0.3+, pysam 0.22+, samtools 1.19+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
- CLI: `<tool> --version` then `<tool> --help` to confirm flags
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# Nucleosome Positioning
**"Map nucleosome positions from ATAC-seq"** → Separate nucleosome-free and mono-nucleosome fragments by size, then call nucleosome center positions and occupancy scores.
- CLI: `nucleoatac run --bed peaks.bed --bam atac.bam --fasta ref.fa`
- R: `ATACseqQC::splitGAlignmentsByCut()` for fragment separation
Extract nucleosome positions and occupancy from ATAC-seq fragment size patterns.
## Background
ATAC-seq fragments reflect chromatin structure:
- **< 100 bp**: Nucleosome-free regions (NFR)
- **180-247 bp**: Mono-nucleosome
- **315-473 bp**: Di-nucleosome
- **558-615 bp**: Tri-nucleosome
## ATACseqQC (R)
### Installation
```r
BiocManager::install('ATACseqQC')
```
### Fragment Size Distribution
```r
library(ATACseqQC)
library(Rsamtools)
# Read BAM
bamfile <- 'sample.bam'
# Fragment size distribution
fragSize <- fragSizeDist(bamfile, 'sample')
# Nucleosome-free and mono-nucleosome ratios
# Automatic QC metrics
```
### Nucleosome Positioning
**Goal:** Map nucleosome positions around TSS using ATAC-seq fragment size classes.
**Approach:** Read BAM, apply Tn5 shift correction, split fragments into NFR and mono-nucleosome classes by size, then compute signal profiles around TSS.
```r
library(ATACseqQC)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(BSgenome.Hsapiens.UCSC.hg38)
# Get TSS regions
txs <- transcripts(TxDb.Hsapiens.UCSC.hg38.knownGene)
tss <- promoters(txs, upstream=1000, downstream=1000)
# Read BAM
gal <- readBamFile(bamfile, asMates=TRUE, bigFile=TRUE)
# Shift reads (Tn5 offset correction)
gal_shifted <- shiftGAlignmentsList(gal)
# Split by nucleosome-free and nucleosomal
objs <- splitGAlignmentsByCut(gal_shifted, txs=txs,
genome=BSgenome.Hsapiens.UCSC.hg38)
# nucleosome-free fragments
nfr <- objs$NussomeFree
# Mono-nucleosome fragments
mono <- objs$mononucleosome
# Signal around TSS
sigs <- featureAlignedSignal(cvglist=objs,
feature.gr=tss,
upstream=1000,
downstream=1000)
```
### V-Plot (Fragment Size vs Position)
```r
# V-plot showing nucleosome positioning around TSS
vp <- vPlot(gal_shifted, tss,
genome=BSgenome.Hsapiens.UCSC.hg38,
upstream=1000, downstream=1000)
```
### Footprinting
```r
# Transcription factor footprinting
library(MotifDb)
# Get motif
motif <- query(MotifDb, 'CTCF')[[1]]
# Find motif occurrences
library(motifmatchr)
motif_pos <- matchMotifs(motif, BSgenome.Hsapiens.UCSC.hg38,
genome='hg38', out='positions')
# Calculate footprint
fp <- factorFootprints(gal_shifted, motif_pos,
genome=BSgenome.Hsapiens.UCSC.hg38,
upstream=100, downstream=100)
```
## NucleoATAC (Python)
### Installation
```bash
pip install nucleoatac
```
### Run NucleoATAC
**Goal:** Call precise nucleosome center positions and occupancy scores from ATAC-seq data.
**Approach:** Run NucleoATAC on defined genomic regions with a reference genome, producing nucleosome position calls and occupancy tracks.
```bash
# Call nucleosomes
nucleoatac run --bed regions.bed --bam sample.bam --fasta reference.fa \
--out nucleoatac_output --cores 8
```
### Output Files
| File | Description |
|------|-------------|
| `.nucpos.bed` | Nucleosome positions |
| `.nucpos.redundant.bed` | All nucleosome calls |
| `.nfrpos.bed` | NFR positions |
| `.occ.bedgraph` | Nucleosome occupancy track |
| `.nucmap_combined.bed` | Combined nucleosome map |
### Visualize Output
```bash
# Convert to bigWig for visualization
bedGraphToBigWig nucleoatac_output.occ.bedgraph chrom.sizes nucleosome_occ.bw
```
## Fragment Analysis (Custom)
### Extract Fragment Sizes
**Goal:** Visualize ATAC-seq fragment size distribution to assess nucleosome periodicity.
**Approach:** Extract template lengths from properly paired reads, then plot the histogram with NFR and mono-nucleosome cutoff markers.
```python
import pysam
import numpy as np
import matplotlib.pyplot as plt
bam = pysam.AlignmentFile('sample.bam', 'rb')
fragment_sizes = []
for read in bam.fetch():
if read.is_proper_pair and read.is_read1:
frag_size = abs(read.template_length)
if 0 < frag_size < 1000:
fragment_sizes.append(frag_size)
bam.close()
# Plot distribution
plt.figure(figsize=(10, 6))
plt.hist(fragment_sizes, bins=200, edgecolor='none', alpha=0.7)
plt.axvline(100, color='red', linestyle='--', label='NFR cutoff')
plt.axvline(180, color='blue', linestyle='--', label='Mono-nuc start')
plt.xlabel('Fragment Size (bp)')
plt.ylabel('Count')
plt.legend()
plt.savefig('fragment_distribution.png', dpi=300)
```
### Split by Fragment Size
```bash
# Extract nucleosome-free reads
samtools view -h sample.bam | \
awk '$9 > -100 && $9 < 100 || $1 ~ /^@/' | \
samtools view -b > nfr.bam
# Extract mono-nucleosome reads
samtools view -h sample.bam | \
awk '($9 >= 180 && $9 <= 247) || ($9 <= -180 && $9 >= -247) || $1 ~ /^@/' | \
samtools view -b > mono_nuc.bam
```
### Signal Around Features
```python
import pysam
import numpy as np
import pyBigWig
def signal_around_sites(bam_file, sites, upstream=1000, downstream=1000):
bam = pysam.AlignmentFile(bam_file, 'rb')
window_size = upstream + downstream
signal = np.zeros(window_size)
for chrom, pos, strand in sites:
start = pos - upstream if strand == '+' else pos - downstream
end = pos + downstream if strand == '+' else pos + upstream
for read in bam.fetch(chrom, max(0, start), end):
if read.is_proper_pair and read.is_read1:
frag_center = read.reference_start + abs(read.template_length) // 2
rel_pos = frag_center - start
if 0 <= rel_pos < window_size:
signal[rel_pos] += 1
bam.close()
return signal / len(sites)
# Load TSS sites
tss_sites = [] # Load from GTF
nfr_signal = signal_around_sites('nfr.bam', tss_sites)
mono_signal = signal_around_sites('mono_nuc.bam', tss_sites)
```
## DANPOS
### Installation
```bash
conda install -c bioconda danpos
```
### Run DANPOS
```bash
# Single sample
danpos.py dpos sample.bam -o danpos_output
# Compare conditions
danpos.py dpeak -b treatment.bam -c control.bam -o danpos_diff
```
## Complete Workflow
**Goal:** Run end-to-end nucleosome positioning analysis from BAM to heatmaps and V-plots.
**Approach:** Read BAM, shift reads for Tn5 offset, split fragments by size class, compute signal profiles around TSS, and generate heatmaps and V-plots.
```r
library(ATACseqQC)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(BSgenome.Hsapiens.UCSC.hg38)
bamfile <- 'sample.bam'
# 1. Fragment size QC
fragSize <- fragSizeDist(bamfile, 'sample')
pdf('fragment_size.pdf')
plot(fragSize)
dev.off()
# 2. Read and shift
gal <- readBamFile(bamfile, asMates=TRUE, bigFile=TRUE)
gal_shifted <- shiftGAlignmentsList(gal)
# 3. Get TSS regions
txs <- transcripts(TxDb.Hsapiens.UCSC.hg38.knownGene)
tss <- promoters(txs, upstream=2000, downstream=2000)
# 4. Split by fragment size
objs <- splitGAlignmentsByCut(gal_shifted, txs=txs,
genome=BSgenome.Hsapiens.UCSC.hg38)
# 5. Calculate signals
sigs <- featureAlignedSignal(cvglist=objs,
feature.gr=tss,
upstream=2000,
downstream=2000)
# 6. Plot heatmap
pdf('nucleosome_heatmap.pdf', width=8, height=10)
featureAlignedHeatmap(sigs, tss, upstream=2000, downstream=2000)
dev.off()
# 7. V-plot
pdf('vplot.pdf')
vPlot(gal_shifted, tss, genome=BSgenome.Hsapiens.UCSC.hg38,
upstream=1000, downstream=1000)
dev.off()
# 8. Export nucleosome-free and nucleosomal BAMs
export(objs$NuclsomeFree, 'nfr.bam')
export(objs$mononucleosome, 'mono_nucleosome.bam')
```
## Related Skills
- atac-seq/atac-peak-calling - Call accessibility peaks
- atac-seq/atac-qc - Quality control metrics
- atac-seq/footprinting - TF footprinting
- chip-seq/peak-annotation - Annotate nucleosome positionsRelated Skills
datacommons-client
Work with Data Commons, a platform providing programmatic access to public statistical data from global sources. Use this skill when working with demographic data, economic indicators, health statistics, environmental data, or any public datasets available through Data Commons. Applicable for querying population statistics, GDP figures, unemployment rates, disease prevalence, geographic entity resolution, and exploring relationships between statistical entities.
bio-single-cell-scatac-analysis
Single-cell ATAC-seq analysis with Signac (R/Seurat) and ArchR. Process 10X Genomics scATAC data, perform QC, dimensionality reduction, clustering, peak calling, and motif activity scoring with chromVAR. Use when analyzing single-cell ATAC-seq data.
bio-atac-seq-motif-deviation
Analyze transcription factor motif accessibility variability using chromVAR. Use when identifying which TF motifs show variable accessibility across samples or conditions in ATAC-seq data.
bio-atac-seq-footprinting
Detect transcription factor binding sites through footprinting analysis in ATAC-seq data using TOBIAS. Use when identifying TF occupancy patterns within accessible regions, as TF binding protects DNA from Tn5 cutting.
bio-atac-seq-differential-accessibility
Find differentially accessible chromatin regions between conditions using DiffBind or DESeq2. Use when comparing chromatin accessibility between treatment groups, cell types, or developmental stages in ATAC-seq experiments.
bio-atac-seq-atac-qc
Quality control metrics for ATAC-seq data including fragment size distribution, TSS enrichment, FRiP, and library complexity. Use when assessing ATAC-seq library quality before or after peak calling to identify problematic samples.
bio-atac-seq-atac-peak-calling
Call accessible chromatin regions from ATAC-seq data using MACS3 with ATAC-specific parameters. Use when identifying open chromatin regions from aligned ATAC-seq BAM files, different from ChIP-seq peak calling.
zinc-database
Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.
zarr-python
Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.
xlsx
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.
writing-skills
Use when creating new skills, editing existing skills, or verifying skills work before deployment
writing-plans
Use when you have a spec or requirements for a multi-step task, before touching code