bio-metagenomics-abundance

Species abundance estimation using Bracken with Kraken2 output. Redistributes reads from higher taxonomic levels to species for more accurate estimates. Use when accurate species-level abundances are needed from Kraken2 classification output.

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

bio-metagenomics-abundance is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-metagenomics-abundance should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/bio-metagenomics-abundance/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/bio-metagenomics-abundance/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/bio-metagenomics-abundance/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How bio-metagenomics-abundance Compares

Feature / Agent	bio-metagenomics-abundance	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Abundance Estimation with Bracken

## Basic Abundance Estimation

```bash
# Run Bracken on Kraken2 report
bracken -d /path/to/kraken2_db \
    -i kraken_report.txt \
    -o bracken_output.txt \
    -r 150 \                       # Read length (100, 150, 200, 250, 300)
    -l S                           # Taxonomic level
```

## Full Workflow with Kraken2

```bash
# Step 1: Classify with Kraken2
kraken2 --db /path/to/kraken2_db \
    --threads 8 \
    --paired \
    --report sample_kraken_report.txt \
    reads_R1.fastq.gz reads_R2.fastq.gz

# Step 2: Estimate abundances with Bracken
bracken -d /path/to/kraken2_db \
    -i sample_kraken_report.txt \
    -o sample_bracken_species.txt \
    -w sample_bracken_report.txt \
    -r 150 \
    -l S
```

## Different Taxonomic Levels

```bash
# Species level (default)
bracken -d db -i report.txt -o species.txt -r 150 -l S

# Genus level
bracken -d db -i report.txt -o genus.txt -r 150 -l G

# Family level
bracken -d db -i report.txt -o family.txt -r 150 -l F

# Phylum level
bracken -d db -i report.txt -o phylum.txt -r 150 -l P
```

## Build Bracken Database

```bash
# Build Bracken database for specific read lengths
# Run AFTER building Kraken2 database
bracken-build -d /path/to/kraken2_db -t 8 -l 150

# Build for multiple read lengths
bracken-build -d /path/to/kraken2_db -t 8 -l 100
bracken-build -d /path/to/kraken2_db -t 8 -l 250
```

## Output Format

```
name                    taxonomy_id    taxonomy_lvl    kraken_assigned_reads    added_reads    new_est_reads    fraction_total_reads
Escherichia coli        562           S               5234                     1245           6479             0.52
Staphylococcus aureus   1280          S               2156                     456            2612             0.21
```

## Filter Low-Abundance Taxa

```bash
# Use threshold for minimum reads
bracken -d db \
    -i report.txt \
    -o bracken.txt \
    -r 150 \
    -l S \
    -t 10                          # Minimum reads threshold
```

## Combine Multiple Samples

```bash
# Run Bracken on each sample
for report in kraken_reports/*.txt; do
    sample=$(basename $report _kraken_report.txt)
    bracken -d db -i $report -o bracken/${sample}_species.txt -r 150 -l S
done

# Combine into abundance matrix
combine_bracken_outputs.py --files bracken/*_species.txt -o combined_abundance.txt
```

## Parse Bracken Output in Python

```python
import pandas as pd

bracken = pd.read_csv('bracken_output.txt', sep='\t')

bracken_sorted = bracken.sort_values('new_est_reads', ascending=False)
bracken_sorted[['name', 'fraction_total_reads']].head(20)

total_reads = bracken['new_est_reads'].sum()
bracken['relative_abundance'] = bracken['new_est_reads'] / total_reads * 100
```

## Convert to Relative Abundance

```python
import pandas as pd

df = pd.read_csv('bracken_output.txt', sep='\t')

total = df['new_est_reads'].sum()
df['relative_abundance'] = df['new_est_reads'] / total * 100

df.to_csv('bracken_relative_abundance.txt', sep='\t', index=False)
```

## Create Abundance Matrix

```python
import pandas as pd
import os

files = [f for f in os.listdir('bracken') if f.endswith('_species.txt')]

dfs = []
for f in files:
    sample = f.replace('_species.txt', '')
    df = pd.read_csv(f'bracken/{f}', sep='\t')
    df = df[['name', 'new_est_reads']].rename(columns={'new_est_reads': sample})
    dfs.append(df)

merged = dfs[0]
for df in dfs[1:]:
    merged = merged.merge(df, on='name', how='outer')

merged = merged.fillna(0)
merged.to_csv('abundance_matrix.txt', sep='\t', index=False)
```

## Key Parameters

| Parameter | Description |
|-----------|-------------|
| -d | Kraken2 database path |
| -i | Input Kraken2 report |
| -o | Output abundance file |
| -w | Output updated report (optional) |
| -r | Read length used |
| -l | Taxonomic level |
| -t | Minimum read threshold |

## Taxonomic Levels

| Level | Code | Description |
|-------|------|-------------|
| Kingdom | K | Bacteria, Archaea |
| Phylum | P | Major divisions |
| Class | C | Class level |
| Order | O | Order level |
| Family | F | Family level |
| Genus | G | Genus level |
| Species | S | Species level |

## Read Length Options

Pre-built databases typically include: 50, 75, 100, 150, 200, 250, 300 bp

Choose the length closest to your actual read length.

## Related Skills

- kraken-classification - Generate Kraken2 report
- metaphlan-profiling - Alternative profiling method
- metagenome-visualization - Visualize abundances

Related Skills

bio-metagenomics-amr-detection

from diegosouzapw/awesome-omni-skill

Detect antimicrobial resistance genes using AMRFinderPlus, ResFinder, and CARD. Screen isolates and metagenomes for resistance determinants. Use when characterizing resistance profiles in clinical isolates, surveillance samples, or metagenomic data.

bgo

from diegosouzapw/awesome-omni-skill

Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.

Coding & Development

nosql-expert

from diegosouzapw/awesome-omni-skill

Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems.

nosql-databases

from diegosouzapw/awesome-omni-skill

Apply NoSQL best practices for MongoDB, Convex, and document databases. Use when designing schemas, writing queries, optimizing performance, or building applications with non-relational databases. Use with database-expert for query optimization and DBA-level tuning (20+ years experience).

nodejs-javascript-vitest

from diegosouzapw/awesome-omni-skill

Guidelines for writing Node.js and JavaScript code with Vitest testing Triggers on: **/*.js, **/*.mjs, **/*.cjs

nodejs-best-practices

from diegosouzapw/awesome-omni-skill

Node.js development principles and decision-making. Framework selection, async patterns, security, and architecture. Teaches thinking, not copying.

nodejs-backend-typescript

from diegosouzapw/awesome-omni-skill

Node.js backend development with TypeScript, Express/Fastify servers, routing, middleware, and database integration

nodejs-backend-patterns

from diegosouzapw/awesome-omni-skill

Build production-ready Node.js backend services with Express/Fastify, implementing middleware patterns, error handling, authentication, database integration, and API design best practices. Use when creating Node.js servers, REST APIs, GraphQL backends, or microservices architectures.

Node Tuning Helper Scripts

from diegosouzapw/awesome-omni-skill

Generate tuned manifests and evaluate node tuning snapshots

nock-interpreter-engineer-assistant

from diegosouzapw/awesome-omni-skill

Expert Nock interpreter builder for implementing virtual machines in C, Python, Rust, Haskell, or JavaScript. Use when building Nock interpreters, porting between languages, implementing evaluation loops, or understanding Nock runtime behavior.

nix-workspace

from diegosouzapw/awesome-omni-skill

Node.js dashboard for Nix agents - task management with comments, agent status, action log, and task routing. Start with `node server.js`. Access at http://localhost:5050 or https://nixbot.auromations.com

ninja-enrich

from diegosouzapw/awesome-omni-skill

Enrich meta.yaml long_description fields from man pages and websites