bio-microbiome-differential-abundance
Differential abundance testing for microbiome data using compositionally-aware methods like ALDEx2, ANCOM-BC2, and MaAsLin2. Use when identifying taxa that differ between experimental groups while accounting for the compositional nature of microbiome data.
Best use case
bio-microbiome-differential-abundance is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Differential abundance testing for microbiome data using compositionally-aware methods like ALDEx2, ANCOM-BC2, and MaAsLin2. Use when identifying taxa that differ between experimental groups while accounting for the compositional nature of microbiome data.
Teams using bio-microbiome-differential-abundance should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/bio-microbiome-differential-abundance/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How bio-microbiome-differential-abundance Compares
| Feature / Agent | bio-microbiome-differential-abundance | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Differential abundance testing for microbiome data using compositionally-aware methods like ALDEx2, ANCOM-BC2, and MaAsLin2. Use when identifying taxa that differ between experimental groups while accounting for the compositional nature of microbiome data.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
## Version Compatibility
Reference examples tested with: DESeq2 1.42+, ggplot2 3.5+, phyloseq 1.46+, scanpy 1.10+
Before using code patterns, verify installed versions match. If versions differ:
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# Differential Abundance Testing
**"Find which taxa differ between my groups"** → Identify differentially abundant taxa between experimental conditions using compositionally-aware methods that account for the relative nature of microbiome data.
- R: `ALDEx2::aldex()` for CLR-transformed Welch's t-test
- R: `ANCOMBC::ancombc2()` for bias-corrected log-linear models
- R: `Maaslin2::Maaslin2()` for multivariable association
## The Compositionality Problem
Microbiome data is compositional - abundances are relative, not absolute. Standard tests (t-test, DESeq2) can give false positives.
## ALDEx2 (Recommended)
**Goal:** Identify differentially abundant taxa between groups using a compositionally-aware statistical framework.
**Approach:** Apply CLR transformation with Monte Carlo sampling on the OTU table, run Welch's t-test per taxon, and filter by FDR-corrected p-value and effect size.
```r
library(ALDEx2)
library(phyloseq)
ps <- readRDS('phyloseq_object.rds')
otu <- as.data.frame(otu_table(ps))
if (!taxa_are_rows(ps)) otu <- t(otu)
# Define groups
groups <- sample_data(ps)$Group
# Run ALDEx2 (CLR transformation + Welch's t-test)
aldex_results <- aldex(otu, groups, mc.samples = 128, test = 'welch',
effect = TRUE, include.sample.summary = FALSE)
# Filter significant
sig_aldex <- aldex_results[aldex_results$we.eBH < 0.05 & abs(aldex_results$effect) > 1, ]
# Volcano-like plot
aldex.plot(aldex_results, type = 'MW', test = 'welch')
```
## ANCOM-BC2 (Recommended)
```r
library(ANCOMBC)
# Run ANCOM-BC2 with sensitivity analysis
ancom_result <- ancombc2(data = ps, fix_formula = 'Group',
p_adj_method = 'BH', pseudo_sens = TRUE,
prv_cut = 0.1, lib_cut = 1000,
group = 'Group', struc_zero = TRUE)
# Extract results (includes sensitivity analysis)
res_df <- ancom_result$res
# Primary results
sig_ancom <- res_df[res_df$diff_Group == TRUE, ]
# Check sensitivity (passed_ss = passed sensitivity analysis)
robust_hits <- res_df[res_df$diff_Group == TRUE & res_df$passed_ss_Group == TRUE, ]
```
## MaAsLin2
```r
library(Maaslin2)
# Prepare data
features <- as.data.frame(t(otu_table(ps)))
metadata <- as.data.frame(sample_data(ps))
# Run MaAsLin2
maaslin_results <- Maaslin2(
input_data = features,
input_metadata = metadata,
output = 'maaslin2_output',
fixed_effects = 'Group',
normalization = 'CLR',
transform = 'NONE',
analysis_method = 'LM'
)
# Results in maaslin2_output/all_results.tsv
sig_maaslin <- maaslin_results$results[maaslin_results$results$qval < 0.05, ]
```
## DESeq2 (with caution)
```r
library(DESeq2)
library(phyloseq)
# Convert to DESeq2 (use geometric mean of poscounts)
ps_deseq <- ps
ps_deseq <- prune_samples(sample_sums(ps_deseq) > 1000, ps_deseq)
dds <- phyloseq_to_deseq2(ps_deseq, ~ Group)
dds <- DESeq(dds, test = 'Wald', fitType = 'parametric', sfType = 'poscounts')
res <- results(dds, alpha = 0.05)
sig_deseq <- res[which(res$padj < 0.05 & abs(res$log2FoldChange) > 1), ]
```
## Visualization
```r
library(ggplot2)
# Volcano plot from ALDEx2
ggplot(aldex_results, aes(x = effect, y = -log10(we.eBH))) +
geom_point(aes(color = we.eBH < 0.05 & abs(effect) > 1), alpha = 0.6) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed') +
geom_vline(xintercept = c(-1, 1), linetype = 'dashed') +
scale_color_manual(values = c('grey', 'red')) +
theme_minimal() +
labs(x = 'Effect Size', y = '-log10(Adjusted P-value)')
```
## Method Comparison
| Method | Handles | Covariates | Speed | Notes |
|--------|---------|------------|-------|-------|
| ALDEx2 | Compositionality | Limited | Slow | Best for simple designs |
| ANCOM-BC2 | Compositionality, zeros, sensitivity | Yes | Medium | Recommended for complex designs |
| MaAsLin2 | Compositionality | Yes | Fast | Good for longitudinal |
| DESeq2 | Sparsity (less ideal) | Yes | Fast | Use with caution for microbiome |
## Related Skills
- diversity-analysis - Identify overall differences first
- differential-expression/deseq2-basics - Similar concepts
- pathway-analysis/go-enrichment - Enrichment of differential taxaRelated Skills
bulk-rna-seq-differential-expression-with-omicverse
Guide Claude through omicverse's bulk RNA-seq DEG pipeline, from gene ID mapping and DESeq2 normalization to statistical testing, visualization, and pathway enrichment. Use when a user has bulk count matrices and needs differential expression analysis in omicverse.
bio-proteomics-differential-abundance
Statistical testing for differentially abundant proteins between conditions. Covers limma and MSstats workflows with multiple testing correction. Use when identifying proteins with significant abundance changes between experimental groups.
bio-microbiome-taxonomy-assignment
Taxonomic classification of ASVs using reference databases like SILVA, GTDB, or UNITE. Covers naive Bayes classifiers (DADA2, IDTAXA) and exact matching approaches. Use when assigning taxonomy to ASVs after DADA2 amplicon processing.
bio-microbiome-qiime2-workflow
QIIME2 command-line workflow for 16S/ITS amplicon analysis. Alternative to DADA2/phyloseq R workflow with built-in provenance tracking. Use when preferring CLI over R, needing reproducible provenance, or working within QIIME2 ecosystem.
bio-microbiome-functional-prediction
Predict metagenome functional content from 16S rRNA marker gene data using PICRUSt2. Infer KEGG, MetaCyc, and EC abundances from ASV tables. Use when functional profiling is needed from 16S data without shotgun metagenomics sequencing.
bio-microbiome-diversity-analysis
Alpha and beta diversity analysis for microbiome data. Calculate within-sample richness, evenness, and between-sample dissimilarity with phyloseq and vegan. Use when comparing community composition across samples or testing for group differences in microbiome structure.
bio-microbiome-amplicon-processing
Amplicon sequence variant (ASV) inference from 16S rRNA or ITS amplicon sequencing using DADA2. Covers quality filtering, error learning, denoising, and chimera removal. Use when processing demultiplexed amplicon FASTQ files to generate an ASV table for downstream analysis.
bio-metagenomics-abundance
Species abundance estimation using Bracken with Kraken2 output. Redistributes reads from higher taxonomic levels to species for more accurate estimates. Use when accurate species-level abundances are needed from Kraken2 classification output.
bio-hi-c-analysis-hic-differential
Compare Hi-C contact matrices between conditions to identify differential chromatin interactions. Compute log2 fold changes, statistical significance, and visualize differential contact maps. Use when comparing Hi-C contacts between conditions.
bio-flow-cytometry-differential-analysis
Differential abundance and state analysis for cytometry data. Compare cell populations between conditions using statistical methods. Use when testing for significant changes in cell frequencies or marker expression between groups.
bio-differential-splicing
Detects differential alternative splicing between conditions using rMATS-turbo (BAM-based) or SUPPA2 diffSplice (TPM-based). Reports events with FDR-corrected significance and delta PSI effect sizes. Use when comparing splicing patterns between treatment groups, tissues, or disease states.
bio-differential-expression-timeseries-de
Analyze time-series RNA-seq data using limma voom with splines, maSigPro, and ImpulseDE2. Identify genes with dynamic expression patterns. Use when analyzing time-series or longitudinal expression data.