genomic-feature-annotation

This skill is used to perform genomic feature annotation and visualization for any file containing genomic region information using Homer (Hypergeometric Optimization of Motif EnRichment). It annotates regions such as promoters, exons, introns, intergenic regions, and TSS proximity, and generates visual summaries of feature distributions. ChIPseeker mode is also supported according to requirements.

Best use case

genomic-feature-annotation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using genomic-feature-annotation should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/10-toolbased-genomic-feature-annotation/SKILL.md --create-dirs "https://raw.githubusercontent.com/majiayu000/claude-skill-registry/main/skills/data/10-toolbased-genomic-feature-annotation/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/10-toolbased-genomic-feature-annotation/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How genomic-feature-annotation Compares

| Feature / Agent | genomic-feature-annotation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# Genomic Feature Annotation and Visualization with Homer

## Overview

- Prepare genomic region files in BED or another supported format. The input regions must be in valid BED format (chromosome, start, end); if the file does not meet this format, extract the required columns to create a valid BED regions file.
- Identify and specify the correct genome assembly for annotation.
- Always ask the user which tool to use: ChIPseeker or HOMER.
- If the user chooses HOMER, then:
    - Annotate the genomic regions using Homer's `annotatePeaks.pl`.  
    - Generate annotation statistics and feature distribution summaries.  
    - Visualize annotation results (e.g., pie charts, barplots).
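
The column-extraction step mentioned above could look roughly like the following. This is a minimal sketch, assuming the input is tab-delimited with chromosome, start, and end in the first three columns; the function name `extract_bed3` is hypothetical and not part of this skill's tools.

```python
def extract_bed3(src: str, dest: str) -> int:
    """Copy the first three columns (chrom, start, end) of a tab-delimited
    region file into a minimal BED file; return the number of rows written."""
    written = 0
    with open(src) as fin, open(dest, "w") as fout:
        for line in fin:
            # Skip blank lines and the comment/track lines some tools emit.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                raise ValueError(f"expected at least 3 columns, got: {line!r}")
            # int() raises ValueError on non-numeric coordinates (e.g. a header row).
            chrom, start, end = fields[0], int(fields[1]), int(fields[2])
            if start < 0 or end < start:
                raise ValueError(f"invalid interval: {line!r}")
            fout.write(f"{chrom}\t{start}\t{end}\n")
            written += 1
    return written
```

Files whose coordinates live in other columns, or that carry an uncommented header row, would need the column indices or the skip rule adjusted.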

---

## When to use this skill

- Find target genes of a given TF. This skill returns an annotated peak file listing the genes near each TF peak. Genes whose promoters are annotated to TF peaks are candidate target genes of the TF.
- Annotate the genomic regions like TF peaks, histone modification peaks, ATAC-seq peaks, etc.
- Generate annotation statistics and feature distribution summaries.
- Visualize annotation results (e.g., pie charts, barplots).


---

## Inputs & Outputs

### Inputs
Genomic region formats supported:
- **BED files**: Standard genomic interval format
- **narrowPeak**: narrow peak format
- **broadPeak**: broad peak format

### Outputs
```bash
${sample}_genomic_feature_annotation/
    results/
        ${sample}.anno_genomic_features.txt
        ${sample}.anno_genomic_features_stats.txt
    logs/
        ${sample}.anno_genomic_features.log
    plots/
        ${sample}.anno_genomic_features.pdf
```


## Decision Tree

### Step 0: Gather Required Information from the User

Before calling any tool, **ask the user**:

1. Sample name (`sample`): used as prefix and for the output directory `${sample}_genomic_feature_annotation`.
2. Genome assembly (`genome`): e.g. `hg38`, `mm10`, `danRer11`.  
   - **Never** guess or auto-detect.

---

### Step 1: Initialize Project

1. Create the directory for this project:

Call:
- `mcp__project-init-tools__project_init`

with:
- `sample`: the user-provided sample name
- `task`: genomic_feature_annotation

The tool will:
- Create `${sample}_genomic_feature_annotation` directory.
- Get the full path of the `${sample}_genomic_feature_annotation` directory, which will be used as `${proj_dir}`.
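
For orientation, the effect of this step can be approximated as below. This is a sketch under stated assumptions, not the implementation of `mcp__project-init-tools__project_init`: the function name `init_project`, the `base_dir` parameter, and the creation of the `results/`, `logs/`, and `plots/` subdirectories at this point are all assumptions based on the output layout listed earlier.

```python
from pathlib import Path


def init_project(sample: str, task: str = "genomic_feature_annotation",
                 base_dir: str = ".") -> Path:
    """Create <sample>_<task>/ with results/, logs/ and plots/ inside it and
    return the absolute project path (the value used as proj_dir later on)."""
    proj_dir = (Path(base_dir) / f"{sample}_{task}").resolve()
    # Assumption: the three output subdirectories are created up front.
    for sub in ("results", "logs", "plots"):
        (proj_dir / sub).mkdir(parents=True, exist_ok=True)
    return proj_dir
```

Because `exist_ok=True` is set, calling the function again for the same sample reuses the existing directory instead of failing.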

---



### Step 2 (Optional): Standardize chromosome names for BED files

Perform this step only if the input file is a BED file. If the input file is a gene list, skip this step.

The tool converts chromosome names as follows:
- From `1` format to `chr1` format
- From `MT` format to `chrM` format

Call:
- `mcp__file-format-tools__standardize_bed_chrom_names`

with:
- `input_bed`: the user-provided BED file
- `output_bed`: the path to save the standardized BED file

The tool will:
- Standardize the chromosome names in the BED file.
- Return the path of the standardized BED file.
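
The renaming rule can be sketched as follows. This is an illustrative approximation, assuming only numeric chromosomes, `X`, `Y`, and `MT`/`M` are renamed; how the actual tool treats unplaced scaffolds or other contig names is not stated in this document, so the sketch leaves them unchanged.

```python
def standardize_chrom(name: str) -> str:
    """Map Ensembl-style names to UCSC-style: '1' -> 'chr1', 'MT' -> 'chrM'."""
    if name in ("MT", "M"):
        return "chrM"
    if name.isdigit() or name in ("X", "Y"):
        return "chr" + name
    # Already prefixed names and unrecognized contigs pass through untouched.
    return name


def standardize_bed_chrom_names(input_bed: str, output_bed: str) -> str:
    """Rewrite the first column of a BED file and return the output path."""
    with open(input_bed) as fin, open(output_bed, "w") as fout:
        for line in fin:
            # Copy blank, comment, and track lines verbatim.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                fout.write(line)
                continue
            chrom, sep, rest = line.partition("\t")
            fout.write(standardize_chrom(chrom) + sep + rest)
    return output_bed
```

Leaving unknown names untouched is a conservative choice: a blind `chr` prefix would produce scaffold names that exist in neither naming convention.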


---


### Step 3: Genomic Feature Annotation

- (Option 1) HOMER mode

Call:
- `mcp__homer-tools__annotate_genomic_features`

With:
- `sample`: the user-provided sample name
- `proj_dir`: directory to save the genomic feature annotation results. In this skill, it is the full path of the `${sample}_genomic_feature_annotation` directory returned by `mcp__project-init-tools__project_init`
- `regions_bed`: the user-provided regions file in BED format. May end with `.bed`, `.narrowPeak`, `.broadPeak`, etc.
- `genome`: the user-provided genome assembly, e.g. `hg38`, `mm10`, `danRer11`
- `ann`: custom HOMER annotation file created by `assignGenomeAnnotation.pl` (default: None)
- `size_given`: keep original region sizes (default: True)
- `cpg`: include CpG information (default: False)

The tool will:
- Annotate the genomic regions using Homer's `annotatePeaks.pl`.
- Return the path of the annotated regions file under `${proj_dir}/results/` directory, and the path to the log file under `${proj_dir}/logs/` directory.
    - `${proj_dir}/results/${sample}.anno_genomic_features.txt`
    - `${proj_dir}/results/${sample}.anno_genomic_features_stats.txt`
    - `${proj_dir}/logs/${sample}.anno_genomic_features.log`
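
To make the parameter mapping concrete, the sketch below shows one way these parameters could translate into an `annotatePeaks.pl` command line. It is an assumption about what the MCP tool does internally, not its actual code; the helper names `build_annotate_command` and `run_annotation` are hypothetical. The flags used (`-annStats`, `-size given`, `-CpG`, `-ann`) are standard `annotatePeaks.pl` options.

```python
import subprocess
from typing import List, Optional


def build_annotate_command(regions_bed: str, genome: str, stats_path: str,
                           ann: Optional[str] = None,
                           size_given: bool = True,
                           cpg: bool = False) -> List[str]:
    """Assemble the annotatePeaks.pl argument list without running it."""
    cmd = ["annotatePeaks.pl", regions_bed, genome, "-annStats", stats_path]
    if size_given:
        cmd += ["-size", "given"]   # keep the original region sizes
    if cpg:
        cmd.append("-CpG")          # add CpG/GC content columns
    if ann is not None:
        cmd += ["-ann", ann]        # custom annotation file
    return cmd


def run_annotation(cmd: List[str], out_path: str, log_path: str) -> None:
    """Run the command; annotatePeaks.pl prints the annotation table to
    stdout and its progress messages to stderr."""
    with open(out_path, "w") as out, open(log_path, "w") as log:
        subprocess.run(cmd, stdout=out, stderr=log, check=True)
```

Building the argument list separately from running it makes the exact command easy to log alongside the results, which helps reproducibility.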

- (Option 2) ChIPseeker mode

```r
library(ChIPseeker)
library(TxDb.Mmusculus.UCSC.mm10.knownGene) # adjust for your species and assembly
library(org.Mm.eg.db)                       # adjust for your species

sample <- "sample"                          # replace with the user-provided sample name
txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene  # adjust for your species and assembly
peak_file <- paste0(sample, ".narrowPeak")

peak_anno <- annotatePeak(
  peak_file,
  TxDb      = txdb,
  tssRegion = c(-3000, 3000),     # define the "promoter" window around the TSS
  annoDb    = "org.Mm.eg.db"      # adds SYMBOL, GENENAME, etc.
)

# Write all three plots to one multi-page PDF
dir.create("plots", showWarnings = FALSE)
pdf(paste0("plots/", sample, "_anno_ChIPseeker.pdf"), width = 6, height = 5)
plotAnnoPie(peak_anno)
print(plotAnnoBar(peak_anno))     # ggplot objects need print() when sourced
print(plotDistToTSS(peak_anno))
dev.off()
```


---


### Step 4: Visualize the annotation results (executed only in HOMER mode)

Call:
- `mcp__plot-anno-tools__visualize_annotation_results`

With:
- `sample`: the user-provided sample name
- `proj_dir`: directory to save the annotation results. In this skill, it is the full path of the `${sample}_genomic_feature_annotation` directory returned by `mcp__project-init-tools__project_init`
- `chart_type`: Type of plot: 'pie' for pie chart, 'bar' for barplot. Default: 'pie'.

The tool will:
- Visualize the annotation results.
- Return the path of the plot file, which is saved under the `${proj_dir}/plots/` directory and ends with `.pdf`.

---

### Step 5: Interpretation of Results

Typical annotation categories:
- **Promoter**: -1 kb to +100 bp from TSS  
- **5' UTR**, **Exon**, **Intron**, **3' UTR**, **Intergenic**, **TTS**  

Quality indicators:
- **Annotation rate**: % of peaks successfully annotated.  
- **Promoter fraction**: Often high in TF ChIP-seq.  
- **Intergenic fraction**: Reflects enhancer-rich or noncoding regions.  
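
The quality indicators above can be computed directly from the annotated table. The following is a minimal sketch, assuming the HOMER output is tab-delimited with an `Annotation` column whose values look like `intron (NM_..., intron 2 of 10)` or `Intergenic`, and that missing values appear as empty strings or `NA`; the function name `summarize_annotations` and the `unannotated` label are hypothetical.

```python
import csv
from collections import Counter
from typing import Dict


def summarize_annotations(path: str) -> Dict[str, float]:
    """Return the fraction of regions in each annotation category."""
    counts: Counter = Counter()
    with open(path, newline="") as handle:
        # QUOTE_NONE keeps apostrophes such as the one in "3' UTR" literal.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            raw = (row.get("Annotation") or "").strip()
            if raw and raw != "NA":
                # Drop the parenthetical transcript detail, keep the category.
                category = raw.split(" (", 1)[0]
            else:
                category = "unannotated"
            counts[category] += 1
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}
```

With this summary, the annotation rate is one minus the `unannotated` fraction, and the promoter and intergenic fractions can be read from the `promoter-TSS` and `Intergenic` keys.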


---

## Best Practices

- Use high-confidence regions (e.g., IDR-filtered peaks).  
- Ensure genome naming convention matches input files.  
- Use visualization to assess annotation patterns across datasets.  
- Save annotation parameters and plots for reproducibility.  

---
