TF-differential-binding

The TF-differential-binding pipeline performs differential transcription factor (TF) binding analysis on ChIP-seq datasets (TF peaks) using the DiffBind package in R. It identifies genomic regions where TF binding intensity differs significantly between experimental conditions (e.g., treatment vs. control, mutant vs. wild-type). Use it when you need to compare how the same TF binds and functions across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. The pipeline is well suited to studying the regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

Best use case

TF-differential-binding is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using TF-differential-binding should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/tf-differential-binding/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/tf-differential-binding/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/tf-differential-binding/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How TF-differential-binding Compares

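| Feature / Agent | TF-differential-binding | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |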

SKILL.md Source

# DiffBind TF Differential Binding Analysis

## Overview

This skill enables comprehensive differential TF binding analysis using **DiffBind** in R. DiffBind integrates read counting, normalization, and statistical modeling to identify differentially bound peaks between conditions.

To perform DiffBind differential binding analysis:
- Initialize the project directory.
- Refer to the **Inputs & Outputs** section to check inputs and build the output architecture. All output files should be located in `${proj_dir}`, which is created in Step 0.
- **Always prompt user** if required files are missing.
- Provide a sample sheet with ChIP-seq peak files and corresponding BAM files for each sample.
- Construct a `DBA` object from the sample sheet.
- Compute read counts over consensus peak regions.
- Specify experimental conditions (e.g., treatment vs. control or cell_type_A vs. cell_type_B).
- Run statistical tests to identify differentially bound regions.
- Generate correlation heatmaps, PCA plots, and volcano plots; extract significant binding events.

---

## When to use this skill
Use the TF-differential-binding pipeline when you need to compare how the same TF binds and functions across two or more biological conditions, cell types, or treatments, starting from ChIP-seq data or TF binding peaks. This pipeline is well suited to studying the regulatory mechanisms that underlie transcriptional differences or epigenetic responses to perturbations.

Recommended applications include:

- Comparing treated vs. control or wild-type vs. mutant conditions to identify TF binding changes in response to stimuli, drugs, or mutations.
- Comparing TF binding profiles between two cell types or experimental conditions to identify differentially bound regions (DBRs).
- Comparing how a TF's function differs between two conditions.
- Integrating with RNA-seq to correlate TF binding alterations with gene expression changes.
- Investigating co-factor dependencies or chromatin remodeling events linked to TF occupancy.

---

## Inputs & Outputs

### Inputs (choose one)
- If starting from BAM files and BED peak files → Generate consensus peaks and a count matrix.
- If starting from an existing count matrix → Go directly to DiffBind analysis.
- If multiple conditions or batches → Include batch/condition in the design.

### Outputs
```bash
${sample}_TF_DB/ # project directory created in Step 0 (${proj_dir})
    DBs/
      DB_results.csv # DESeq2 results (log2FC, p-values)
      DB_up_${treated_condition}.bed
      DB_down_${treated_condition}.bed
    plots/ # visualization outputs
      PCA.pdf
      volcano.pdf
      heatmap.pdf
    logs/ # analysis logs 
    temp/ # other temp files
```
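
The subdirectories in the layout above could be created up front with a few lines of base R. This is a minimal sketch: `make_output_dirs` is a hypothetical helper (it is not part of DiffBind or of the project-init tool), and it assumes `proj_dir` is the path returned in Step 0.

```r
# Hypothetical helper: create the DBs/, plots/, logs/ and temp/
# subdirectories under a project directory and return their paths.
make_output_dirs <- function(proj_dir) {
  subdirs <- c("DBs", "plots", "logs", "temp")
  paths <- file.path(proj_dir, subdirs)
  for (p in paths) {
    # recursive = TRUE also creates proj_dir itself if it does not exist yet
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  names(paths) <- subdirs
  paths
}

# Example: build the layout inside a temporary directory
out <- make_output_dirs(file.path(tempdir(), "c1_vs_c2_TF_DB"))
```

The returned named vector can then be used to build output paths, for example `file.path(out["plots"], "PCA.pdf")`.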

---

## Decision Tree

### Step 0: Initialize Project

1. Create the directory for this project:

Call:

- `mcp__project-init-tools__project_init`

with:

- `sample`: sample name (e.g. c1_vs_c2)
- `task`: TF_DB

The tool will:

- Create `${sample}_TF_DB` directory.
- Return the full path of the `${sample}_TF_DB` directory, which will be used as `${proj_dir}`.

### Step 1: Prepare Input Data

Create a CSV sample sheet (`samplesheet.csv`) with the following columns:

| SampleID | Tissue | Factor | Condition | bamReads | Peaks | PeakCaller |
|-----------|------------|------------|-----------|--------|-------------|-------------|
| TF_A_1    | A    | TF   | Control       | Control1.bam | Control1_peaks.narrowPeak | narrow |
| TF_A_2    | A    | TF   | Control       | Control2.bam | Control2_peaks.narrowPeak | narrow |
| TF_B_1    | A    | TF   | Treated       | Treated1.bam | Treated1_peaks.narrowPeak | narrow |
| TF_B_2    | A    | TF   | Treated       | Treated2.bam | Treated2_peaks.narrowPeak | narrow |
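
Before building the `DBA` object, it can help to check the sample sheet for the columns shown above and for missing files, so the user can be prompted early. The following is a minimal base-R sketch; `check_samplesheet` is a hypothetical helper and the column list simply mirrors the table above.

```r
# Hypothetical helper: report missing columns and, optionally, BAM or
# peak files that do not exist on disk. Returns a character vector of
# problems (empty when the sheet looks fine).
check_samplesheet <- function(samples, check_files = TRUE) {
  required <- c("SampleID", "Tissue", "Factor", "Condition",
                "bamReads", "Peaks", "PeakCaller")
  problems <- character(0)
  missing_cols <- setdiff(required, colnames(samples))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  # Only look for files when the file columns are actually present
  if (check_files && length(missing_cols) == 0) {
    files <- c(as.character(samples$bamReads), as.character(samples$Peaks))
    absent <- files[!file.exists(files)]
    if (length(absent) > 0) {
      problems <- c(problems, paste("file not found:", absent))
    }
  }
  problems
}

# Example sheet matching the table above (file checks disabled here
# because the example files are placeholders)
samples <- data.frame(
  SampleID   = c("TF_A_1", "TF_A_2", "TF_B_1", "TF_B_2"),
  Tissue     = "A",
  Factor     = "TF",
  Condition  = c("Control", "Control", "Treated", "Treated"),
  bamReads   = c("Control1.bam", "Control2.bam", "Treated1.bam", "Treated2.bam"),
  Peaks      = c("Control1_peaks.narrowPeak", "Control2_peaks.narrowPeak",
                 "Treated1_peaks.narrowPeak", "Treated2_peaks.narrowPeak"),
  PeakCaller = "narrow"
)
problems <- check_samplesheet(samples, check_files = FALSE)
```

An empty result means the sheet has the expected columns; any returned strings can be shown to the user as-is.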

### Step 2: Load Data and Build the DiffBind Object

```r
library(DiffBind)
samples <- read.csv("samplesheet.csv")
dbObj <- dba(sampleSheet=samples)
```

**Key parameters:**
- `sampleSheet`: a data frame (as above) or the path to a CSV file with BAM and peak information
- Supports both narrowPeak and broadPeak formats


### Step 3: Read Counting and Consensus Peak Generation

Count reads overlapping consensus peaks across samples:

```r
# Count reads over the consensus peakset, re-centering each peak on its summit
dbObj <- dba.count(dbObj, summits=250)
```

**Notes:**
- `summits`: re-centers peaks ±250 bp around summits for consistency.
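- The resulting binding matrix holds read counts for every consensus peak in every sample. In DiffBind 3.0 and later, the normalization method can be set explicitly with `dba.normalize()`.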
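
To make the effect of `summits=250` concrete, the sketch below computes the interval a re-centered peak would cover, assuming the peak extends `summits` bp on each side of the summit (so every peak is `2 * summits + 1` bp wide). `recenter_on_summit` is an illustrative helper, not a DiffBind function, the summit positions are made up, and clipping at chromosome ends is ignored.

```r
# Illustrative only: interval covered by a peak after re-centering on
# its summit with `summits` bp on each side.
recenter_on_summit <- function(summit_pos, summits = 250) {
  data.frame(
    start = summit_pos - summits,
    end   = summit_pos + summits,
    width = 2 * summits + 1
  )
}

# Two hypothetical summit positions on one chromosome
recentered <- recenter_on_summit(c(10500, 48200), summits = 250)
print(recentered)
```

Every row has a width of 501 bp, which is what makes read counts comparable across peaks of originally different sizes.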

---

### Step 4: Contrast Definition

Define conditions for comparison:

```r
# Define experimental contrasts (e.g., Treated vs Control)
dbObj <- dba.contrast(dbObj, categories=DBA_CONDITION, minMembers=2)
```

**Alternatives:**
- For multifactor experiments: use `DBA_TISSUE`, `DBA_TREATMENT`, or custom metadata.
- Check contrasts:
  ```r
  dba.show(dbObj, bContrasts=TRUE)
  ```
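
Because `minMembers=2` requires at least two samples per group, a quick replicate count on the sample sheet can flag conditions that cannot form a contrast. This is a base-R sketch; `underpowered_conditions` is a hypothetical helper and the example sheet is deliberately missing a replicate.

```r
# Hypothetical pre-check: list conditions with fewer than `min_members`
# samples in the sample sheet.
underpowered_conditions <- function(samples, min_members = 2) {
  counts <- table(samples$Condition)
  names(counts)[as.vector(counts) < min_members]
}

# Example sheet where "Treated" has only one sample
samples <- data.frame(
  SampleID  = c("TF_A_1", "TF_A_2", "TF_B_1"),
  Condition = c("Control", "Control", "Treated")
)
flagged <- underpowered_conditions(samples)
print(flagged)
```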

---

### Step 5: Differential Binding Analysis

```r
# Perform analysis
dbObj <- dba.analyze(dbObj, method=DBA_DESEQ2)
```

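**Parameters:**
- `method`: choose `DBA_DESEQ2` (default) or `DBA_EDGER`

The following thresholds are arguments to `dba.report()` and are applied when results are retrieved (Steps 6 and 7), not by `dba.analyze()`:
- `th`: FDR threshold (default 0.05)
- `fold`: minimum absolute log2 fold change
- `bUsePval=TRUE`: apply `th` to p-values instead of FDR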
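
The FDR that `th` is compared against is a p-value adjusted for multiple testing. As a sketch of why an FDR cutoff is stricter than the same cutoff on raw p-values, the example below applies the Benjamini-Hochberg adjustment (assumed here because it is the DESeq2 default) to a made-up vector of p-values using base R's `p.adjust`.

```r
# Hypothetical raw p-values for five consensus peaks
pvals <- c(0.001, 0.008, 0.02, 0.045, 0.30)

# Benjamini-Hochberg adjustment (assumed adjustment method)
fdr <- p.adjust(pvals, method = "BH")

# Sites passing th = 0.05 on FDR versus on the raw p-value
th <- 0.05
passes_fdr  <- fdr < th
passes_pval <- pvals < th

print(data.frame(pvals, fdr, passes_fdr, passes_pval))
```

In this example the fourth peak passes on its raw p-value but not on FDR, which is the situation `bUsePval=TRUE` changes.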

---

### Step 6: Visualization and Quality Control

#### Correlation Heatmap

```r
dba.plotHeatmap(dbObj, correlations=TRUE, scale="row")
```

#### PCA Plot

```r
dba.plotPCA(dbObj, attributes=DBA_CONDITION, label=DBA_ID)
```

#### Volcano Plot

```r
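# Volcano plot from the full report (th=1 keeps every site)
allResults <- dba.report(dbObj, method=DBA_DESEQ2, th=1)
resDf <- as.data.frame(allResults)
isSig <- resDf$FDR < 0.05 & abs(resDf$Fold) > 1
pdf("volcano.pdf")
plot(resDf$Fold, -log10(resDf$FDR),
     col=ifelse(isSig, "red", "grey"), pch=16,
     xlab="log2 fold change", ylab="-log10(FDR)", main="Volcano Plot")
dev.off()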
```
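Output: `heatmap.pdf`, `volcano.pdf`, `PCA.pdf` (open a `pdf()` device before each plotting call and close it with `dev.off()` to write the file)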

---

### Step 7: Result Extraction

Export significant differential peaks:

```r
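library(rtracklayer)

# Full report (th=1 keeps every site), written as a table
allResults <- dba.report(dbObj, method=DBA_DESEQ2, th=1)
write.csv(as.data.frame(allResults), "DB_results.csv", row.names = FALSE)

# Extract results with FDR < 0.05 and |log2FC| > 1
sigSites <- dba.report(dbObj, method=DBA_DESEQ2, th=0.05, fold=1)
print("Differential binding results summary:")
print(summary(sigSites))

# Split by direction. Fold > 0 means stronger binding in the first group of
# the contrast; confirm the group order with dba.show(dbObj, bContrasts=TRUE).
treated_condition <- "Treated"  # placeholder: set to your treated condition name
diff_up <- sigSites[sigSites$Fold > 0]
diff_down <- sigSites[sigSites$Fold < 0]
# R does not expand ${...} inside strings, so build the file names explicitly
export(diff_up, paste0("DB_up_", treated_condition, ".bed"))
export(diff_down, paste0("DB_down_", treated_condition, ".bed"))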
```
Output: `DB_results.csv`  `DB_up_${treated_condition}.bed` `DB_down_${treated_condition}.bed` 
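
`rtracklayer::export` converts the 1-based, inclusive coordinates of a `GRanges` object to the 0-based, half-open coordinates that BED uses. If rtracklayer is unavailable, the same conversion can be sketched in base R from the data-frame form of the report. `write_bed` is a hypothetical helper, the example rows are made up, and only the three mandatory BED columns are written.

```r
# Hypothetical helper: write a 3-column BED file from a data frame with
# seqnames/start/end columns (as produced by as.data.frame on a GRanges).
write_bed <- function(df, path) {
  bed <- data.frame(
    chrom = as.character(df$seqnames),
    start = as.integer(df$start) - 1L,  # BED starts are 0-based
    end   = as.integer(df$end),         # BED ends are exclusive, so unchanged
    stringsAsFactors = FALSE
  )
  write.table(bed, file = path, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(bed)
}

# Made-up sites with 1-based coordinates and a fold-change column
sites <- data.frame(
  seqnames = c("chr1", "chr2"),
  start    = c(1001, 5001),
  end      = c(1501, 5501),
  Fold     = c(1.8, -2.3)
)
bed_path <- file.path(tempdir(), "DB_up_example.bed")
write_bed(sites[sites$Fold > 0, ], bed_path)
```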


---

## Interpretation and Biological Insights

### Significance Criteria

- **FDR < 0.05** → statistically significant  
- **|log2FC| > 1** → biologically meaningful difference  
- **Consistent replicates** → at least two replicates per condition recommended
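
The first two criteria can be applied to a results table in one pass. The sketch below labels each site as `up`, `down`, or `not_significant`; `classify_sites` is a hypothetical helper, the cutoffs are the ones listed above, and the input values are made up. Here "up" only means a positive fold change, so which condition that corresponds to depends on the contrast order.

```r
# Hypothetical helper: label sites using the FDR and log2 fold-change cutoffs.
classify_sites <- function(fold, fdr, fdr_cutoff = 0.05, lfc_cutoff = 1) {
  status <- rep("not_significant", length(fold))
  # Sites with a missing FDR are treated as not significant
  significant <- !is.na(fdr) & fdr < fdr_cutoff
  status[significant & fold >  lfc_cutoff] <- "up"
  status[significant & fold < -lfc_cutoff] <- "down"
  status
}

# Made-up fold changes and FDR values for four sites
fold <- c(2.1, -1.6, 0.4, 3.0)
fdr  <- c(0.001, 0.03, 0.01, 0.20)
status <- classify_sites(fold, fdr)
print(table(status))
```

The third site is significant by FDR but below the fold-change cutoff, and the fourth has a large fold change but fails the FDR cutoff, so both are labeled `not_significant`.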

### Typical Biological Interpretations

- **Increased binding** in treated condition → potential activation or recruitment of TFs
- **Decreased binding** → loss of TF affinity or chromatin closing
- Combine with RNA-seq to correlate with target gene expression.

---

## Troubleshooting

| Problem | Possible Cause | Solution |
|----------|----------------|-----------|
| No differential peaks found | Insufficient replicates or low coverage | Increase sequencing depth or relax (raise) the FDR threshold |
| Errors in sample sheet | Column names incorrect or missing | Use standard DiffBind column format |
| Inconsistent genome build | Mixed genome assemblies | Ensure all BAM and peak files use the same genome reference |
| Over-normalization | Strong batch effects | Include batch term in design or run `dba.contrast(..., block=...)` |
