bio-ribo-seq-ribosome-stalling

Detect ribosome pausing and stalling sites from Ribo-seq data at codon resolution. Use when studying translational regulation, identifying pause sites, or analyzing codon-specific translation dynamics.

1,802 stars

byFreedomIntelligence

View on GitHub Installation ↓

Best use case

bio-ribo-seq-ribosome-stalling is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-ribo-seq-ribosome-stalling should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/bio-ribo-seq-ribosome-stalling/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/bio-ribo-seq-ribosome-stalling/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/bio-ribo-seq-ribosome-stalling/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How bio-ribo-seq-ribosome-stalling Compares

Feature / Agent	bio-ribo-seq-ribosome-stalling	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## Version Compatibility

Reference examples tested with: BioPython 1.83+, numpy 1.26+, scipy 1.12+

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.

# Ribosome Stalling Detection

**"Find ribosome pause sites in my data"** → Detect codon-level ribosome stalling and pausing events from Ribo-seq footprint density, identifying positions with abnormally high ribosome occupancy.
- Python: `plastid` for codon-resolution density calculation, `scipy` for statistical scoring

## Concept

Ribosome stalling/pausing occurs when ribosomes slow or stop at specific codons:
- Rare codons (low tRNA availability)
- Specific amino acid motifs (polyproline)
- Regulatory pause sites (upstream of stress response genes)
- Nascent chain interactions

## Calculate Codon-Level Occupancy

**Goal:** Quantify ribosome occupancy at each codon position across all transcripts.

**Approach:** Map reads to P-sites using a fixed offset, then bin counts into codons along each CDS.

```python
from plastid import BAMGenomeArray, GTF2_TranscriptAssembler, FivePrimeMapFactory
import numpy as np
from collections import defaultdict

def get_codon_occupancy(bam_path, gtf_path, psite_offset=12):
    '''Calculate ribosome occupancy per codon'''
    # Load reads with P-site mapping
    alignments = BAMGenomeArray(
        bam_path,
        mapping=FivePrimeMapFactory(offset=psite_offset)
    )

    transcripts = list(GTF2_TranscriptAssembler(gtf_path))

    codon_counts = defaultdict(lambda: defaultdict(int))

    for tx in transcripts:
        if tx.cds_start is None:
            continue

        cds = tx.get_cds()
        cds_seq = tx.get_sequence(cds)

        # Get counts at each position
        counts = alignments.count_in_region(cds)

        # Assign to codons
        for i in range(0, len(cds_seq) - 2, 3):
            codon = cds_seq[i:i+3]
            codon_pos = i // 3
            codon_counts[tx.get_name()][codon_pos] = counts  # Simplified

    return codon_counts
```

## Identify Pause Sites

**Goal:** Detect codon positions with significantly elevated ribosome occupancy indicative of translational pausing.

**Approach:** Z-score normalize occupancy per transcript and flag positions exceeding a threshold (default z > 3).

```python
def find_pause_sites(codon_occupancy, threshold_zscore=3):
    '''Find positions with significantly elevated ribosome occupancy

    Pause sites have much higher occupancy than surrounding codons
    '''
    pause_sites = []

    for tx, occupancy in codon_occupancy.items():
        values = np.array(list(occupancy.values()))

        if len(values) < 10 or values.sum() < 100:
            continue

        # Z-score normalization
        mean_occ = values.mean()
        std_occ = values.std()

        if std_occ == 0:
            continue

        zscores = (values - mean_occ) / std_occ

        # Find positions above threshold
        for pos, zscore in enumerate(zscores):
            if zscore > threshold_zscore:
                pause_sites.append({
                    'transcript': tx,
                    'codon_position': pos,
                    'occupancy': values[pos],
                    'zscore': zscore
                })

    return pause_sites
```

## Codon-Specific Occupancy

**Goal:** Calculate average ribosome occupancy for each of the 64 codon types across all genes.

**Approach:** Aggregate read density per codon identity across all CDS positions and compute per-codon mean occupancy.

```python
from Bio.Seq import Seq
from Bio.Data import CodonTable

def codon_occupancy_table(bam_path, gtf_path, psite_offset=12):
    '''Calculate average occupancy per codon type'''
    # Count reads per codon type
    codon_reads = defaultdict(list)

    alignments = BAMGenomeArray(bam_path,
                                 mapping=FivePrimeMapFactory(offset=psite_offset))
    transcripts = list(GTF2_TranscriptAssembler(gtf_path))

    for tx in transcripts:
        if tx.cds_start is None:
            continue

        cds = tx.get_cds()
        cds_seq = str(tx.get_sequence(cds))

        # Get read density
        density = alignments.get_density(cds)

        for i in range(0, len(cds_seq) - 2, 3):
            codon = cds_seq[i:i+3]
            if len(density) > i + 2:
                codon_reads[codon].append(sum(density[i:i+3]))

    # Calculate mean occupancy per codon
    codon_means = {codon: np.mean(reads) for codon, reads in codon_reads.items()}

    return codon_means
```

## Correlate with Codon Usage

**Goal:** Test whether ribosome pausing correlates with tRNA availability across codons.

**Approach:** Compute Spearman rank correlation between per-codon occupancy and tRNA abundance; expect a negative relationship.

```python
def correlate_with_trna(codon_occupancy, trna_abundance):
    '''Test if pausing correlates with tRNA availability

    Rare codons (low tRNA) should have higher occupancy
    '''
    from scipy import stats

    codons = list(set(codon_occupancy.keys()) & set(trna_abundance.keys()))

    occ = [codon_occupancy[c] for c in codons]
    trna = [trna_abundance[c] for c in codons]

    corr, pval = stats.spearmanr(occ, trna)

    return corr, pval  # Expect negative correlation
```

## Motif Analysis at Pause Sites

**Goal:** Extract amino acid sequence context around identified pause sites to discover recurrent motifs.

**Approach:** Translate the coding region flanking each pause site and collect fixed-width windows for motif analysis.

```python
def extract_pause_motifs(pause_sites, sequences, window=10):
    '''Extract amino acid context around pause sites'''
    motifs = []

    for site in pause_sites:
        tx = site['transcript']
        pos = site['codon_position']
        seq = sequences.get(tx, '')

        if len(seq) > pos * 3 + window * 3:
            start = max(0, (pos - window) * 3)
            end = min(len(seq), (pos + window + 1) * 3)
            aa_seq = str(Seq(seq[start:end]).translate())
            motifs.append(aa_seq)

    return motifs
```

## Known Pause Motifs

| Motif | Description |
|-------|-------------|
| PPP | Polyproline (ribosome tunnel interaction) |
| XPX | Proline-containing |
| D/E-rich | Negatively charged nascent chain |
| Stop codon context | Influenced by nucleotides around stop |

## Related Skills

- ribosome-periodicity - Validate data quality
- orf-detection - Context for pause sites
- translation-efficiency - Gene-level translation

Related Skills

bio-ribo-seq-translation-efficiency

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Calculate translation efficiency (TE) as the ratio of ribosome occupancy to mRNA abundance. Use when comparing translational regulation between conditions or identifying genes with altered translation independent of transcription.

bio-ribo-seq-ribosome-periodicity

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Validate Ribo-seq data quality by checking 3-nucleotide periodicity and calculating P-site offsets. Use when assessing library quality or determining read offsets for downstream analysis.

bio-ribo-seq-riboseq-preprocessing

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Preprocess ribosome profiling data including adapter trimming, size selection, rRNA removal, and alignment. Use when preparing Ribo-seq reads for downstream analysis of translation.

bio-ribo-seq-orf-detection

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Detect and quantify translated ORFs from Ribo-seq data including uORFs and novel ORFs using RiboCode and ORFquant. Use when identifying translated regions beyond annotated coding sequences or quantifying ORF-level translation.

zinc-database

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.

zarr-python

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

xlsx

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

writing-skills

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Use when creating new skills, editing existing skills, or verifying skills work before deployment

writing-plans

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Use when you have a spec or requirements for a multi-step task, before touching code

wikipedia-search

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Search and fetch structured content from Wikipedia using the MediaWiki API for reliable, encyclopedic information

wellally-tech

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

Integrate digital health data sources (Apple Health, Fitbit, Oura Ring) and connect to WellAlly.tech knowledge base. Import external health device data, standardize to local format, and recommend relevant WellAlly.tech knowledge base articles based on health data. Support generic CSV/JSON import, provide intelligent article recommendations, and help users better manage personal health data.

weightloss-analyzer

1802

from FreedomIntelligence/OpenClaw-Medical-Skills

分析减肥数据、计算代谢率、追踪能量缺口、管理减肥阶段