functional-enrichment
Perform GO and KEGG functional enrichment using HOMER from genomic regions (BED/narrowPeak/broadPeak) or gene lists, and produce R-based barplot/dotplot visualizations. Use this skill when you want to run HOMER-based GO and KEGG enrichment on genomic regions, or when you only need to link genomic regions to genes.
Best use case
functional-enrichment is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using functional-enrichment should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/11-toolbased-functional-enrichment/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# Functional Enrichment (HOMER + R)
## Overview
- **Validate input**: Accept BED/peak files with genomic coordinates or gene lists; check format and genome assembly.
- **Map regions to genes**: Convert regions to a unique gene set using HOMER `annotatePeaks.pl`.
- **Run GO enrichment**: Use HOMER `findGO.pl` (or `annotatePeaks.pl -go`) for BP/MF/CC.
- **Run KEGG enrichment**: Read KEGG results from `kegg.txt`, which HOMER `findGO.pl` (or `annotatePeaks.pl -go`) writes to the same output directory as the GO tables.
- **Collect outputs**: Save tidy tables for downstream plotting and a compact summary of top terms.
- **Visualize in R**: Create barplots and dotplots (GO/KEGG) with `ggplot2` from standardized outputs.
- **QC & troubleshooting**: Provide checks for genome mismatch, chromosome naming, and low-signal inputs.
## Inputs & Outputs
### Inputs (choose one):
#### Option 1: Input is a genomic region file (BED/narrowPeak/broadPeak)
Genomic region formats supported:
- **BED files**: Standard genomic interval format
- **narrowPeak**: narrow peak format
- **broadPeak**: broad peak format
#### Option 2: Input is a gene list (txt)
- `gene_list.txt` with one official gene symbol per line (no header), plus an optional `gene_list_background.txt` in the same format.
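The input validation mentioned in the overview could look roughly like the following. This is a minimal sketch, not the skill's actual validator: the function name `validate_bed_lines` is hypothetical, and it only checks the basic BED rules (tab-delimited, at least 3 columns, integer coordinates, no header lines).

```python
from typing import List


def validate_bed_lines(lines: List[str]) -> List[str]:
    """Return human-readable problems found in BED-like lines (empty list = OK)."""
    problems = []
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # blank lines are ignored
        if line.startswith(("track", "browser", "#")):
            problems.append(f"line {n}: header/comment line present")
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append(f"line {n}: fewer than 3 tab-delimited columns")
            continue
        start, end = fields[1], fields[2]
        if not (start.isdigit() and end.isdigit()):
            # Also catches column-name headers such as "chrom<TAB>start<TAB>end"
            problems.append(f"line {n}: start/end are not non-negative integers")
            continue
        if int(start) >= int(end):
            problems.append(f"line {n}: start must be smaller than end")
    return problems
```

An empty return value means the lines passed these basic checks; it does not confirm that the coordinates match the chosen genome assembly.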
### Outputs (directory layout):
```bash
${sample}_functional_enrichment/
  results/
    ${sample}.anno_genomic_features.txt
    ${sample}.anno_genomic_features_stats.txt
    biological_process.txt
    cellular_component.txt
    molecular_function.txt
    kegg.txt
    biocyc.txt
    chromosome.txt
    cosmic.txt
    interactions.txt
    interpro.txt
    gene3d.txt
    pathwayInteractionDB.txt
    pfam.txt
    prints.txt
    prosite.txt
    reactome.txt
    smpdb.txt
    wikipathways.txt
    gwas.txt
    lipidmaps.txt
    msigdb.txt
    smart.txt
  tables/
    ${sample}.gene_list.txt
    go_bp.tsv
    go_mf.tsv
    go_cc.tsv
    kegg.tsv
  logs/
    ${sample}.anno_genomic_features.log  # if a genomic region file is provided
    findGO.log
```
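For orientation, the top-level layout above can be reproduced with a few lines of Python. This is only an illustrative sketch; in the skill itself the directory is created by the project-init tool, and `init_layout` is a hypothetical helper name.

```python
from pathlib import Path


def init_layout(base_dir: str, sample: str) -> Path:
    """Create <sample>_functional_enrichment/ with results/, tables/ and logs/."""
    proj_dir = Path(base_dir) / f"{sample}_functional_enrichment"
    for sub in ("results", "tables", "logs"):
        # exist_ok=True makes repeated runs safe
        (proj_dir / sub).mkdir(parents=True, exist_ok=True)
    return proj_dir.resolve()
```

The returned absolute path plays the role of `${proj_dir}` in the steps that follow.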
## Decision Tree
### Step 0: Gather Required Information from the User
Before calling any tool, **ask the user**:
1. Sample name (`sample`): used as prefix and for the output directory `${sample}_functional_enrichment`.
2. Genome assembly (`genome`): e.g. `hg38`, `mm10`, `danRer11`.
- **Never** guess or auto-detect.
---
### Step 1: Initialize Project
1. Make a directory for this project:
Call:
- `mcp__project-init-tools__project_init`
with:
- `sample`: the user-provided sample name
- `task`: `functional_enrichment`
The tool will:
- Create `${sample}_functional_enrichment` directory.
- Get the full path of the `${sample}_functional_enrichment` directory, which will be used as `${proj_dir}`.
---
### Step 2: Prepare genome file for homer
Call:
- `mcp__homer-tools__check_genome_installation`
With:
- `genome`: the user-provided genome assembly, e.g. `hg38`, `mm10`, `danRer11`
The tool will:
- Check if the genome is installed in HOMER.
- If not, install the genome.
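A rough way to picture the installation check: HOMER keeps installed genomes under `data/genomes/<genome>/` inside its installation directory, so a check can test for that folder. The sketch below assumes that layout; `homer_genome_installed` and the `homer_dir` argument are hypothetical names, and the MCP tool may work differently.

```python
from pathlib import Path


def homer_genome_installed(homer_dir: str, genome: str) -> bool:
    """Return True if <homer_dir>/data/genomes/<genome>/ exists and is non-empty."""
    genome_dir = Path(homer_dir) / "data" / "genomes" / genome
    return genome_dir.is_dir() and any(genome_dir.iterdir())


# If the check fails, HOMER's own installer can fetch the genome, e.g.:
#   perl <homer_dir>/configureHomer.pl -install hg38
```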
---
### Step 3 (Optional): Standardize chromosome names for BED files
This step is optional. Only perform this step if the input file is a BED file. If the input file is a gene list, skip this step.
Chromosome names are converted from the `1` format to the `chr1` format, and from `MT` to `chrM`.
Call:
- `mcp__file-format-tools__standardize_bed_chrom_names`
with:
- `input_bed`: the user-provided BED file
- `output_bed`: the path to save the standardized BED file
The tool will:
- Standardize the chromosome names in the BED file.
- Return the path of the standardized BED file.
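The renaming rule can be sketched as follows. This is an illustration under simple assumptions, not the tool's implementation: the function names are hypothetical, and unplaced scaffolds (for example `GL000219.1`) would simply get a `chr` prefix, which may not match the HOMER genome.

```python
def standardize_chrom(name: str) -> str:
    """Map Ensembl-style names to UCSC style: '1' -> 'chr1', 'MT' -> 'chrM'."""
    if name in ("MT", "M", "chrMT"):
        return "chrM"
    if name.startswith("chr"):
        return name  # already UCSC style
    return "chr" + name


def standardize_bed_line(line: str) -> str:
    """Rewrite only the first column of a BED line; pass other lines through."""
    if not line.strip() or line.startswith(("#", "track", "browser")):
        return line
    fields = line.split("\t")
    fields[0] = standardize_chrom(fields[0])
    return "\t".join(fields)
```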
---
### Step 4 (Optional): Convert gene ID to gene symbol
This step is optional. Only perform this step if the input file is a gene list file. If the input file is a BED file, skip this step.
Call:
- `mcp__mygene-tools__convert_gene_ids_mygene`
With:
- `input_ids_file`: the user-provided gene list file. May end with `.txt`.
- `scopes`: the source ID type for mygene (e.g., 'ensembl.gene', 'symbol', 'entrezgene', 'uniprot', or a comma-separated list).
- `fields`: the comma-separated target fields to retrieve from mygene (e.g., 'symbol,ensembl.gene,uniprot,entrezgene').
- `species`: the species for mygene (e.g., 'human', 'mouse', 'zebrafish', or NCBI taxon ID like '9606').
- `out_file`: the path to save the converted gene list file. In this skill, place it under the `${sample}_functional_enrichment` directory whose full path is returned by `mcp__project-init-tools__project_init`.
- `batch_size`: the batch size for mygene.querymany (default 1000).
The tool will:
- Convert the gene ID to gene symbol.
- Return the path of the converted gene list file.
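As a sketch of what the conversion involves, the snippet below uses the `mygene` Python client directly. It assumes the default list-of-dicts return format of `querymany`, where unmatched queries carry `notfound: True`; the helper names `symbols_from_hits` and `convert_ids` are hypothetical and the MCP tool may behave differently.

```python
from typing import Dict, Iterable, List


def symbols_from_hits(hits: Iterable[Dict]) -> List[str]:
    """Collect unique gene symbols from mygene hits, preserving first-seen order."""
    seen = set()
    symbols = []
    for hit in hits:
        if hit.get("notfound"):
            continue  # the query ID had no match
        symbol = hit.get("symbol")
        if symbol and symbol not in seen:
            seen.add(symbol)
            symbols.append(symbol)
    return symbols


def convert_ids(ids, scopes="ensembl.gene", species="human"):
    """Query mygene.info (needs network access and the mygene package)."""
    import mygene  # imported lazily so symbols_from_hits works without it

    mg = mygene.MyGeneInfo()
    hits = mg.querymany(ids, scopes=scopes, fields="symbol", species=species)
    return symbols_from_hits(hits)
```

Writing the returned symbols one per line gives a file in the `gene_list.txt` format described under Inputs.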
---
### Step 5: GO enrichment analysis
#### Option 1: from genomic regions file
Only if the input file is a BED file. If the input file is a gene list, call tools in Option 2.
1. Annotate the genomic regions using HOMER's `annotatePeaks.pl` with the `-go` option. If the user also provides a background genomic region file, such as a control peak file, call this tool for that file too, using a different `${sample}` name for the background sample.
Call:
`mcp__homer-tools__annotate_genomic_features`
With:
- `sample`: the user-provided sample name
- `proj_dir`: directory to save the genomic feature annotation results. In this skill, it is the full path of the `${sample}_functional_enrichment` directory returned by `mcp__project-init-tools__project_init`
- `regions_bed`: the user-provided regions file in BED format. May end with `.bed`, `.narrowPeak`, `.broadPeak`, etc.
- `genome`: the user-provided genome assembly, e.g. `hg38`, `mm10`, `danRer11`
- `ann`: custom HOMER annotation file (created by `assignGenomeAnnotation.pl`) (default: None).
- `size_given`: keep original region sizes (default: True)
- `cpg`: include CpG information (default: False)
- `go`: `True` to perform GO enrichment analysis.
The tool will:
- Annotate the genomic regions using Homer's `annotatePeaks.pl`.
- Return the path of the annotated regions file under `${proj_dir}/results/` directory, and the path to the log file under `${proj_dir}/logs/` directory.
- `${proj_dir}/results/${sample}.anno_genomic_features.txt`
- `${proj_dir}/results/${sample}.anno_genomic_features_stats.txt`
- `${proj_dir}/logs/${sample}.anno_genomic_features.log`
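To make the parameter mapping concrete, here is a sketch of how these arguments might translate into an `annotatePeaks.pl` command line. It is an assumption about what the tool runs, not its actual code; `build_annotate_cmd`, `go_dir`, and `stats_file` are hypothetical names, while the flags themselves (`-size given`, `-CpG`, `-ann`, `-annStats`, `-go`) are standard `annotatePeaks.pl` options.

```python
from typing import List, Optional


def build_annotate_cmd(
    regions_bed: str,
    genome: str,
    go_dir: str,
    stats_file: str,
    ann: Optional[str] = None,
    size_given: bool = True,
    cpg: bool = False,
) -> List[str]:
    """Assemble an annotatePeaks.pl argument list (the command is not executed here)."""
    cmd = ["annotatePeaks.pl", regions_bed, genome]
    if size_given:
        cmd += ["-size", "given"]  # keep original region sizes
    if cpg:
        cmd.append("-CpG")  # add CpG/GC content columns
    if ann:
        cmd += ["-ann", ann]  # custom annotation file
    cmd += ["-annStats", stats_file, "-go", go_dir]
    return cmd
```

Because `annotatePeaks.pl` writes the annotation table to standard output, a caller would typically run the list with `subprocess.run(cmd, stdout=out_handle, stderr=log_handle, check=True)`, directing stdout to the `.anno_genomic_features.txt` file and stderr to the log.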
---
2. (Optional) Extract the genes from the annotated regions file if they are needed for later analysis or the user requests the target gene list. Otherwise, skip this step.
Call:
`mcp__file-format-tools__extract_gene_list`
With:
- `sample`: the user-provided sample name
- `proj_dir`: directory to save the genomic feature annotation results. In this skill, it is the full path of the `${sample}_functional_enrichment` directory returned by `mcp__project-init-tools__project_init`
The tool will:
- Extract the genes from the annotated regions file.
- Return the path of the gene list file under `${proj_dir}/tables/` directory.
- `${proj_dir}/tables/${sample}.gene_list.txt`
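The extraction amounts to collecting the unique values of the gene-symbol column in the annotation table. The sketch below assumes the column is named `Gene Name`, as in typical `annotatePeaks.pl` output; check the header of your own file, since HOMER versions may differ. `extract_gene_names` is a hypothetical helper name.

```python
import csv
from typing import Iterable, List


def extract_gene_names(rows: Iterable[str], column: str = "Gene Name") -> List[str]:
    """Return sorted unique gene symbols from tab-delimited annotation lines."""
    # QUOTE_NONE: free-text description columns may contain stray quote characters
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    genes = set()
    for row in reader:
        name = (row.get(column) or "").strip()
        if name and name != "NA":
            genes.add(name)
    return sorted(genes)
```

With a real file, pass an open handle, for example `extract_gene_names(open(path, newline=""))`.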
---
#### Option 2: from gene list file
Only if the input file is a gene list file. If the input file is a BED file, call tools in Option 1.
Call:
`mcp__homer-tools__gene_function_enrichment`
With:
- `sample`: the user-provided sample name
- `proj_dir`: directory to save the GO & KEGG enrichment results. In this skill, it is the full path of the `${sample}_functional_enrichment` directory returned by `mcp__project-init-tools__project_init`
- `gene_list_file`: the user-provided gene list file. May end with `.txt`.
- `organism`: the user-provided organism name, e.g. `human`, `mouse`, `zebrafish`, etc.
- `background_gene_list_file`: the user-provided background gene list file. May end with `.txt`. If not provided, set this parameter to `None`.
The tool will:
- Find the GO enrichment for the gene list.
- Return the path of the GO & KEGG enrichment results under `${proj_dir}/results/` directory.
- `${proj_dir}/results/biological_process.txt`
- `${proj_dir}/results/kegg.txt`
- ... other GO and KEGG enrichment results files.
- Return the path of the log file under `${proj_dir}/logs/` directory.
- `${proj_dir}/logs/${sample}.find_go_and_kegg_enrichment.log`
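Under the hood this step corresponds to a `findGO.pl` call of the form `findGO.pl <gene list> <organism> <output directory> [-bg <background list>]`. The sketch below only assembles that argument list; `build_find_go_cmd` is a hypothetical name, and how the MCP tool actually invokes HOMER is an assumption.

```python
from typing import List, Optional


def build_find_go_cmd(
    gene_list_file: str,
    organism: str,
    out_dir: str,
    background: Optional[str] = None,
) -> List[str]:
    """Assemble a findGO.pl argument list (the command is not executed here)."""
    cmd = ["findGO.pl", gene_list_file, organism, out_dir]
    if background:
        # Restrict the enrichment universe to the supplied background genes
        cmd += ["-bg", background]
    return cmd
```

A single run writes all ontology tables, including `biological_process.txt` and `kegg.txt`, into `out_dir`.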
---
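> **Alternative direct from BED**
> `annotatePeaks.pl peaks.bed hg38 -go results/{run}/tables/go_dir > results/{run}/tables/peaks.annotated.txt`
> The `-go` output directory contains the GO tables together with `kegg.txt`; to add Genome Ontology analysis, pass `-genomeOntology <output directory>` as well.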
## Notes & Best Practices
- **Genome & naming**: Ensure the HOMER genome key matches the species; chromosome naming must be consistent (`chr1` vs `1`).
- **BED format**: Tab-delimited, ≥3 columns, 0-based coordinates, no header.
- **Multiple testing**: Prefer FDR (BH) if provided; otherwise fall back to the P-value.
- **Background set**: `-bg` helps reduce bias; choose a reasonable universe (e.g., all expressed or all accessible regions → genes).
- **Direct-from-BED**: `annotatePeaks.pl -go` is convenient and also writes `kegg.txt`; the gene-list route yields uniform TSVs for plotting.
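When a results table carries only raw P-values, a Benjamini-Hochberg adjustment can be computed before ranking terms for the tidy tables. The sketch below is one way to do that; it assumes the P-value sits in a column named `Enrichment` (as in typical `findGO.pl` output), and `bh_adjust` and `top_terms` are hypothetical helper names.

```python
import math
from typing import Dict, List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def top_terms(rows: List[Dict], n: int = 10, p_col: str = "Enrichment") -> List[Dict]:
    """Return the n best rows with added 'FDR' and 'neg_log10_p' fields."""
    pvals = [float(r[p_col]) for r in rows]
    fdr = bh_adjust(pvals)
    ranked = sorted(zip(rows, pvals, fdr), key=lambda t: (t[2], t[1]))
    return [
        dict(r, FDR=q, neg_log10_p=-math.log10(p) if p > 0 else float("inf"))
        for r, p, q in ranked[:n]
    ]
```

The `neg_log10_p` field is a convenient x-axis value for the barplots and dotplots; rows can be loaded with `csv.DictReader(handle, delimiter="\t")`.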
## Troubleshooting
- **Many NAs after annotation**: Check genome version, chromosome naming, BED formatting, and headers.
- **Empty/weak enrichment**: Ensure sufficient genes (suggest ≥50), verify species of symbols, tune thresholds or background.
- **Column name drift**: HOMER versions may differ; adjust R column mappings if needed.