TF-differential-binding

The TF-differential-binding pipeline performs differential transcription factor (TF) binding analysis on ChIP-seq datasets (TF peaks) using the DiffBind package in R. It identifies genomic regions where TF binding intensity differs significantly between experimental conditions (e.g., treatment vs. control, mutant vs. wild-type). Use it when you need to compare how the same TF binds across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. The pipeline is suited to studying regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

Best use case

TF-differential-binding is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using TF-differential-binding should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/9-tf-differential-binding/SKILL.md --create-dirs "https://raw.githubusercontent.com/majiayu000/claude-skill-registry/main/skills/data/9-tf-differential-binding/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/9-tf-differential-binding/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How TF-differential-binding Compares

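| Feature / Agent | TF-differential-binding | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |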

SKILL.md Source

# DiffBind TF Differential Binding Analysis

## Overview

This skill enables comprehensive differential TF binding analysis using **DiffBind** in R. DiffBind integrates read counting, normalization, and statistical modeling to identify differentially bound peaks between conditions.

To perform DiffBind differential binding analysis:
- Initialize the project directory.
- Refer to the **Inputs & Outputs** section to check the inputs and build the output directory structure. All output files should be written under the `${proj_dir}` created in Step 0.
- **Always prompt user** if required files are missing.
- Provide a sample sheet with ChIP-seq peak files and corresponding BAM files for each sample.
- Construct a `DBA` object from the sample sheet.
- Compute read counts over consensus peak regions.
- Specify experimental conditions (e.g., treatment vs. control or cell_type_A vs. cell_type_B).
- Run statistical tests to identify differentially bound regions.
- Generate correlation heatmaps, PCA plots, and volcano plots; extract significant binding events.

---

## When to use this skill
Use the TF-differential-binding pipeline when you need to compare how the same TF binds across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. This pipeline is suited to studying regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

Recommended applications include:

- Comparing treated vs. control or wild-type vs. mutant conditions to identify TF binding changes in response to stimuli, drugs, or mutations.
- Comparing TF binding profiles between two cell types or experimental conditions to identify differentially bound regions (DBRs).
- Comparing the function of the same TF between two conditions.
- Integrating with RNA-seq to correlate TF binding alterations with gene expression changes.
- Investigating co-factor dependencies or chromatin remodeling events linked to TF occupancy.

---

## Inputs & Outputs

### Inputs (choose one)
- If starting from BAM files and BED peak files → generate consensus peaks and a count matrix.
- If starting from an existing count matrix → go directly to the DiffBind analysis.
- If there are multiple conditions or batches → include batch/condition in the design.

### Outputs
```bash
${proj_dir}/  # the ${sample}_TF_DB directory created in Step 0
    DBs/
      DB_results.csv # DESeq2 results (log2FC, p-values)
      DB_up_${treated_condition}.bed
      DB_down_${treated_condition}.bed
    plots/ # visualization outputs
      PCA.pdf
      volcano.pdf
      heatmap.pdf
    logs/ # analysis logs 
    temp/ # other temp files
```
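
The layout above can be created up front so that later steps only need to write files. The following is a minimal sketch in base R; `make_output_dirs` is a hypothetical helper, and `proj_dir` is a stand-in (a temporary directory here) for the path returned in Step 0.

```r
# Sketch: create the output layout under the project directory.
# `proj_dir` is a placeholder; in the workflow it is the path returned in Step 0.
proj_dir <- file.path(tempdir(), "c1_vs_c2_TF_DB")

# Hypothetical helper: create each expected subdirectory and return the paths.
make_output_dirs <- function(proj_dir,
                             subdirs = c("DBs", "plots", "logs", "temp")) {
  paths <- file.path(proj_dir, subdirs)
  for (p in paths) {
    # recursive = TRUE also creates proj_dir itself if it does not exist yet
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

out_dirs <- make_output_dirs(proj_dir)
print(basename(out_dirs))
```

Calling the helper a second time is harmless, because existing directories are left untouched.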

---

## Decision Tree

### Step 0: Initialize Project

1. Create the directory for this project:

Call:

- `mcp__project-init-tools__project_init`

with:

- `sample`: sample name (e.g. c1_vs_c2)
- `task`: TF_DB

The tool will:

- Create `${sample}_TF_DB` directory.
- Return the full path of the `${sample}_TF_DB` directory, which will be used as `${proj_dir}`.

### Step 1: Prepare Input Data

Create a CSV sample sheet (`samplesheet.csv`) with the following columns:

| SampleID | Tissue | Factor | Condition | bamReads | Peaks | PeakCaller |
|-----------|------------|------------|-----------|--------|-------------|-------------|
| TF_A_1    | A    | TF   | Control       | Control1.bam | Control1_peaks.narrowPeak | narrow |
| TF_A_2    | A    | TF   | Control       | Control2.bam | Control2_peaks.narrowPeak | narrow |
| TF_B_1    | A    | TF   | Treated       | Treated1.bam | Treated1_peaks.narrowPeak | narrow |
| TF_B_2    | A    | TF   | Treated       | Treated2.bam | Treated2_peaks.narrowPeak | narrow |
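
Before handing the sheet to DiffBind, it can help to check its structure, since a missing column or file is easier to diagnose at this point. The sketch below uses only base R; `check_samplesheet` and `required_cols` are hypothetical names introduced for illustration, and the column list simply mirrors the table above.

```r
# Columns taken from the sample sheet table above.
required_cols <- c("SampleID", "Tissue", "Factor", "Condition",
                   "bamReads", "Peaks", "PeakCaller")

# Hypothetical helper: return a character vector of problems (empty if none).
check_samplesheet <- function(samples, check_files = TRUE) {
  problems <- character(0)
  missing_cols <- setdiff(required_cols, colnames(samples))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  if ("SampleID" %in% colnames(samples) &&
      anyDuplicated(samples$SampleID) > 0) {
    problems <- c(problems, "duplicated SampleID values")
  }
  if (check_files) {
    # Only check the file columns that are actually present
    for (col in intersect(c("bamReads", "Peaks"), colnames(samples))) {
      files <- as.character(samples[[col]])
      absent <- files[!file.exists(files)]
      if (length(absent) > 0) {
        problems <- c(problems, paste("file not found:", absent))
      }
    }
  }
  problems
}

# Toy sheet matching the example table
samples <- data.frame(
  SampleID = c("TF_A_1", "TF_A_2", "TF_B_1", "TF_B_2"),
  Tissue = "A",
  Factor = "TF",
  Condition = c("Control", "Control", "Treated", "Treated"),
  bamReads = c("Control1.bam", "Control2.bam", "Treated1.bam", "Treated2.bam"),
  Peaks = c("Control1_peaks.narrowPeak", "Control2_peaks.narrowPeak",
            "Treated1_peaks.narrowPeak", "Treated2_peaks.narrowPeak"),
  PeakCaller = "narrow",
  stringsAsFactors = FALSE
)

# Structure-only check; file existence is skipped for this toy example
print(check_samplesheet(samples, check_files = FALSE))
```

With `check_files = TRUE`, any reported "file not found" entry is a natural point to stop and prompt the user for the missing input.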

### Step 2: Load Data and Build the DiffBind Object

```r
library(DiffBind)
samples <- read.csv("samplesheet.csv")
dbObj <- dba(sampleSheet=samples)
```

**Key parameters:**
- `sampleSheet`: data frame read from the CSV sample sheet (or the path to the CSV file) with BAM and peak information
- Supports both narrowPeak and broadPeak formats


### Step 3: Read Counting and Consensus Peak Generation

Count reads overlapping consensus peaks across samples:

```r
# Generate a consensus peakset
dbObj <- dba.count(dbObj, summits=250)
```

**Notes:**
- `summits`: re-centers peaks ±250 bp around summits for consistency.
- The resulting matrix contains normalized counts for all samples.
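
To make the effect of `summits=250` concrete, the interval arithmetic can be sketched in base R. This only illustrates the window calculation on made-up summit coordinates; DiffBind itself locates the summits from the read pileup, and `recenter` is a hypothetical helper.

```r
# Toy summit positions (hypothetical coordinates, 1-based)
peaks <- data.frame(chr = c("chr1", "chr1", "chr2"),
                    summit = c(1000, 5200, 300),
                    stringsAsFactors = FALSE)

# Hypothetical helper: build a window of +/- `summits` bp around each summit.
recenter <- function(peaks, summits = 250) {
  data.frame(chr = peaks$chr,
             start = peaks$summit - summits,
             end = peaks$summit + summits,
             stringsAsFactors = FALSE)
}

windows <- recenter(peaks)
# Inclusive width: 250 bp on each side plus the summit base itself
windows$width <- windows$end - windows$start + 1
print(windows)
```

Every re-centered interval has the same width, which keeps read counts comparable between peaks that were originally called with very different lengths.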

---

### Step 4: Contrast Definition

Define conditions for comparison:

```r
# Define experimental contrasts (e.g., Treated vs Control)
dbObj <- dba.contrast(dbObj, categories=DBA_CONDITION, minMembers=2)
```

**Alternatives:**
- For multifactor experiments: use `DBA_TISSUE`, `DBA_TREATMENT`, or custom metadata.
- Check contrasts:
  ```r
  dba.show(dbObj, bContrasts=TRUE)
  ```
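
Because `minMembers=2` requires at least two samples in each group, a quick count of replicates per condition can flag an under-replicated design before the contrast is built. This is a base-R sketch on a toy condition vector; `groups_below_min` is a hypothetical helper and is not part of DiffBind.

```r
# Toy condition labels, one entry per sample (as in the sample sheet)
conditions <- c("Control", "Control", "Treated", "Treated")

# Hypothetical helper: names of the groups with fewer than `min_members` samples.
groups_below_min <- function(conditions, min_members = 2) {
  counts <- table(conditions)
  names(counts)[counts < min_members]
}

too_small <- groups_below_min(conditions, min_members = 2)
if (length(too_small) > 0) {
  message("Groups with too few replicates: ", paste(too_small, collapse = ", "))
} else {
  message("All groups have enough replicates for the contrast.")
}
```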

---

### Step 5: Differential Binding Analysis

```r
# Perform analysis
dbObj <- dba.analyze(dbObj, method=DBA_DESEQ2)
```

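**Parameters:**
- `method`: choose `DBA_DESEQ2` (default) or `DBA_EDGER`
- The following are reporting options, passed to `dba.report` (Steps 6 and 7) rather than to `dba.analyze`:
  - `th`: FDR threshold (default 0.05)
  - `fold`: minimum absolute log2 fold change
  - `bUsePval=TRUE`: apply `th` to p-values instead of FDR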

---

### Step 6: Visualization and Quality Control

#### Correlation Heatmap

```r
dba.plotHeatmap(dbObj, correlations=TRUE, scale="row")
```

#### PCA Plot

```r
dba.plotPCA(dbObj, attributes=DBA_CONDITION, label=DBA_ID)
```

#### Volcano Plot

```r
# Volcano plot
allResults <- dba.report(dbObj, method=DBA_DESEQ2, th=1)
with(allResults, plot(Fold, -log10(FDR),
     col=ifelse(FDR < 0.05 & abs(Fold) > 1, "red", "grey"),
     pch=16, main="Volcano Plot"))
```
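Output: `heatmap.pdf`, `volcano.pdf`, `PCA.pdf`. The plotting calls above draw to the active graphics device; to write each file, wrap the call in `pdf("<name>.pdf")` and `dev.off()`.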

---

### Step 7: Result Extraction

Export significant differential peaks:

```r
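library(rtracklayer)

# Full table of all sites (allResults comes from Step 6, where th=1)
write.csv(as.data.frame(allResults), "DB_results.csv", row.names = FALSE)

# Extract results with FDR < 0.05 and |log2FC| > 1
sigSites <- dba.report(dbObj, method=DBA_DESEQ2, th=0.05, fold=1)
print("Differential binding results summary:")
print(summary(sigSites))

# Split sites by direction. Fold > 0 means higher binding in the first group
# of the contrast; confirm the group order with dba.show(dbObj, bContrasts=TRUE).
treated_condition <- "Treated"  # label used in the file names; set to your condition
diff_up <- sigSites[sigSites$Fold > 0]
diff_down <- sigSites[sigSites$Fold < 0]
# R does not expand ${...} inside strings, so build the file names with paste0()
export(diff_up, paste0("DB_up_", treated_condition, ".bed"))
export(diff_down, paste0("DB_down_", treated_condition, ".bed"))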
```
Output: `DB_results.csv`  `DB_up_${treated_condition}.bed` `DB_down_${treated_condition}.bed` 
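
If `rtracklayer` is not available, the BED files can also be written from a plain data frame. The sketch below is a base-R fallback under stated assumptions: the input has the `seqnames`, `start`, `end`, and `Fold` columns that `as.data.frame()` yields for a DiffBind report, the toy coordinates are invented, and `to_bed` and `write_bed` are hypothetical helpers. The one detail to get right is that BED starts are 0-based while the report coordinates are 1-based.

```r
# Toy significant sites (hypothetical values, 1-based inclusive coordinates)
sig <- data.frame(
  seqnames = c("chr1", "chr2", "chr3"),
  start = c(1001, 501, 2001),
  end = c(1501, 1001, 2501),
  Fold = c(1.7, -2.3, 1.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert to the three mandatory BED columns.
to_bed <- function(df) {
  data.frame(chrom = df$seqnames,
             # BED starts are 0-based; integers avoid scientific notation
             chromStart = as.integer(df$start) - 1L,
             chromEnd = as.integer(df$end),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: write a tab-separated BED file without a header.
write_bed <- function(df, path) {
  write.table(to_bed(df), path, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

treated_condition <- "Treated"  # placeholder label for the file names
out_dir <- tempdir()            # placeholder; use the DBs/ directory in practice
up_path <- file.path(out_dir, paste0("DB_up_", treated_condition, ".bed"))
down_path <- file.path(out_dir, paste0("DB_down_", treated_condition, ".bed"))

write_bed(sig[sig$Fold > 0, ], up_path)
write_bed(sig[sig$Fold < 0, ], down_path)
```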


---

## Interpretation and Biological Insights

### Significance Criteria

- **FDR < 0.05** → statistically significant  
- **|log2FC| > 1** → biologically meaningful difference  
- **Consistent replicates** → at least two replicates per condition recommended
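
The two thresholds above combine into a simple three-way classification of sites. The following base-R sketch applies them to a toy results table; the `Fold` and `FDR` column names follow the DiffBind report used earlier, the values are invented, and `classify_sites` is a hypothetical helper.

```r
# Toy results table (hypothetical values)
res <- data.frame(
  Fold = c(2.1, -1.8, 0.4, 1.5, -0.2),
  FDR = c(0.001, 0.01, 0.02, 0.20, 0.90)
)

# Hypothetical helper: label each site as up, down, or not_significant.
classify_sites <- function(res, fdr = 0.05, lfc = 1) {
  status <- rep("not_significant", nrow(res))
  # A site must pass both the FDR and the fold-change cutoff
  sig <- res$FDR < fdr & abs(res$Fold) > lfc
  status[sig & res$Fold > 0] <- "up"
  status[sig & res$Fold < 0] <- "down"
  status
}

res$status <- classify_sites(res)
print(table(res$status))
```

Note that the third site passes the FDR cutoff but not the fold-change cutoff, so it is not counted as a differential binding event under these criteria.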

### Typical Biological Interpretations

- **Increased binding** in treated condition → potential activation or recruitment of TFs
- **Decreased binding** → loss of TF affinity or chromatin closing
- Combine with RNA-seq to correlate with target gene expression.

---

## Troubleshooting

| Problem | Possible Cause | Solution |
|----------|----------------|-----------|
| No differential peaks found | Insufficient replicates or low coverage | Add replicates, increase sequencing depth, or relax (raise) the FDR threshold |
| Errors in sample sheet | Column names incorrect or missing | Use standard DiffBind column format |
| Inconsistent genome build | Mixed genome assemblies | Ensure all BAM and peak files use the same genome reference |
| Over-normalization | Strong batch effects | Include batch term in design or run `dba.contrast(..., block=...)` |
