tooluniverse-noncoding-rna
Analyze non-coding RNAs (miRNAs, lncRNAs, circRNAs) using miRBase, LNCipedia, RNAcentral, Rfam, and target prediction databases. Covers ncRNA identification, target prediction, disease associations, expression profiling, and functional annotation. Use when asked about microRNAs, long non-coding RNAs, RNA interference, miRNA targets, lncRNA function, or ncRNA-disease associations.
Best use case
tooluniverse-noncoding-rna is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using tooluniverse-noncoding-rna should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/tooluniverse-noncoding-rna/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How tooluniverse-noncoding-rna Compares
| Feature / Agent | tooluniverse-noncoding-rna | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Non-Coding RNA Analysis
Pipeline for identifying, annotating, and interpreting non-coding RNAs and their biological roles. Covers microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and other ncRNA classes.
**Key principles**:
1. **Class determines function**: miRNAs repress mRNA translation; lncRNAs have diverse mechanisms (scaffolds, guides, decoys, enhancers); rRNAs/tRNAs are core components of the translation machinery
2. **Targets matter more than the ncRNA itself**: for miRNAs, the regulated mRNA targets determine the phenotype
3. **Expression context is critical**: ncRNAs are highly tissue/cell-type specific
4. **Conservation indicates function**: deeply conserved ncRNAs (let-7, MALAT1) have well-established roles
5. **Evidence grading**: T1 = validated targets (reporter assay, CLIP-seq); T2 = high-confidence computational prediction; T3 = expression correlation; T4 = sequence-based prediction only
**Type-based reasoning — look up, don't guess**:
Non-coding RNA function depends on type: miRNAs silence target mRNAs (look up targets in miRTarBase/TargetScan); lncRNAs have diverse functions (scaffolding, guiding, decoying; check the literature for the specific lncRNA); circRNAs may sponge miRNAs.
For any ncRNA query: first identify the class from the name/sequence, then select the appropriate evidence source. Do not assume function based on name alone — a gene named "LINC" may have a characterized mechanism, or none at all. Always search PubMed for the specific ncRNA before interpreting. For miRNAs, validated targets (T1) from miRTarBase outweigh any computational prediction — a predicted target with no experimental support is a hypothesis, not a finding. For lncRNAs, mechanism is almost always determined by experimental studies; use `PubMed_search_articles` with the lncRNA name + "mechanism" or "function" to find relevant evidence. For circRNAs, miRNA sponging is the most common proposed mechanism but is frequently over-claimed — look for CLIP-seq or reporter assay evidence before asserting it.
---
## When to Use
- "What are the targets of miR-21?"
- "Find lncRNAs associated with breast cancer"
- "Is this lncRNA conserved across species?"
- "What miRNAs regulate TP53?"
- "Annotate these non-coding RNA IDs"
- "Which miRNAs are biomarkers for [disease]?"
**Not this skill**: For mRNA expression analysis, use `tooluniverse-rnaseq-deseq2`. For CRISPR screens, use `tooluniverse-crispr-screen-analysis`.
---
## Core Tools
| Tool | Use For |
|------|---------|
| `miRBase_search_mirna` | Search miRNAs by name, accession, or sequence |
| `miRBase_get_mirna` | Detailed miRNA info (sequence, genomic location, family) |
| `miRBase_get_mature_mirna` | Mature miRNA sequences and annotations |
| `PubMed_search_articles` | Search for validated miRNA targets in literature (e.g., "miR-21 target validation") |
| `LNCipedia_search_lncrna` | Search lncRNAs by name, gene symbol, or transcript ID |
| `LNCipedia_get_lncrna` | Detailed lncRNA transcript info (sequence, structure, conservation) |
| `LNCipedia_get_lncrna_xrefs` | Cross-references from a lncRNA to external databases |
| `LNCipedia_search_ncrna_by_type` | Search ncRNA entries by RNA type |
| `LNCipedia_get_lncrna_publications` | Publications linked to a lncRNA |
| `RNAcentral_search` | Search all ncRNA types across databases |
| `RNAcentral_get_rna` | Detailed ncRNA annotations from 40+ databases |
| `Rfam_get_family` | RNA family details (structure, alignment, species distribution) |
| `Rfam_search` | Search RNA families by keyword |
| `DisGeNET_search_gene` | ncRNA-disease associations |
| `PubMed_search_articles` | ncRNA literature |
| `GTEx_get_median_gene_expression` | Tissue expression of ncRNA genes |
---
## Workflow
```
Phase 0: ncRNA Identity & Classification
Name/ID → miRBase/LNCipedia/RNAcentral → class, sequence, genomic location
|
Phase 1: Target & Interaction Analysis
miRNA → target mRNAs; lncRNA → interacting proteins/RNAs/chromatin
|
Phase 2: Expression & Tissue Specificity
GTEx/GEO → where is it expressed? Tissue-specific or ubiquitous?
|
Phase 3: Disease Associations
DisGeNET/PubMed/CTD → ncRNA-disease links with evidence
|
Phase 4: Functional Interpretation
Pathway enrichment of targets → biological role → clinical significance
```
### Phase 0: ncRNA Identity & Classification
ncRNA classes by size and database:
- **miRNA** (~22 nt, miRBase): Post-transcriptional silencing via 3'UTR binding
- **lncRNA** (>200 nt, LNCipedia): Diverse — chromatin remodeling, transcription regulation, miRNA sponges
- **rRNA** (120-5000 nt, RNAcentral/Rfam): Ribosome components
- **tRNA** (~76 nt, RNAcentral): Amino acid delivery
- **snoRNA** (60-300 nt, Rfam): rRNA modification (methylation, pseudouridylation)
- **snRNA** (~150 nt, Rfam): Spliceosome components
- **piRNA** (26-31 nt, RNAcentral): Transposon silencing in germline
- **circRNA** (variable, RNAcentral): miRNA sponges, protein scaffolds (experimental evidence required)
**Identification workflow**:
- Name starts with `miR-`, `let-7`, or `hsa-mir-` → search miRBase
- Name starts with `LINC`, `MALAT`, `HOTAIR`, `XIST`, or ends in `-AS1` → search LNCipedia
- Any ncRNA type → search RNAcentral (aggregates all databases)
- RNA family question → search Rfam
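The naming rules above can be sketched as a small router. This is a minimal illustration, not part of ToolUniverse: `route_ncrna` is a hypothetical helper, the `let-` prefix and the generalised `-AS<number>` suffix are assumptions that extend the list, and the `RF` + five digits pattern is the Rfam accession format. A name-based guess only picks the first database to query; it says nothing about function.

```python
import re


def route_ncrna(name: str) -> str:
    """Suggest which database to query first, based only on the identifier."""
    n = name.strip()
    # Rfam family accessions look like RF00005 (assumption: exact-match only)
    if re.fullmatch(r"RF\d{5}", n, re.IGNORECASE):
        return "Rfam"
    # miR-21, hsa-mir-21, let-7a, hsa-let-7a (optional 3-letter species prefix)
    if re.match(r"([a-z]{3}-)?(mir|let)-", n, re.IGNORECASE):
        return "miRBase"
    # LINC*, MALAT*, HOTAIR, XIST, or antisense names ending in -AS1, -AS2, ...
    if re.match(r"(LINC|MALAT|HOTAIR|XIST)", n, re.IGNORECASE) or re.search(
        r"-AS\d+$", n, re.IGNORECASE
    ):
        return "LNCipedia"
    # Everything else: RNAcentral aggregates all databases
    return "RNAcentral"


for example in ["hsa-mir-21", "let-7a", "MALAT1", "ZEB1-AS1", "RF00005", "SNORD3A"]:
    print(example, "->", route_ncrna(example))
```

Identifiers that match nothing fall through to RNAcentral, which mirrors the "any ncRNA type" rule.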
### Phase 1: Target & Interaction Analysis
**For miRNAs** — the targets determine the biology:
**NOTE**: There is no dedicated miRNA target lookup tool in ToolUniverse. To find miRNA targets:
1. **Literature search** (most reliable): `PubMed_search_articles(query="miR-21 target validation luciferase")`
2. **Cross-references**: `miRBase_get_mirna_xrefs(accession="MIMAT0000076")` — may link to external target databases
3. **Known targets for well-studied miRNAs**: Use the reference table below, then validate via STRING/Reactome
4. **For novel miRNAs**: Search PubMed for "[miRNA] target" and extract validated targets from papers
Well-studied miRNA targets (for common oncomiRs/tumor suppressors):
- **miR-21**: PTEN, PDCD4, TPM1, RECK, SPRY1, SPRY2, BTG2
- **miR-155**: SOCS1, SHIP1, AID, TP53INP1
- **miR-122**: SLC7A1, ADAM17 (also HCV IRES cofactor)
- **let-7**: RAS, HMGA2, MYC, LIN28
**Target interpretation framework**:
- **Validated** (T1): Luciferase reporter, CLIP-seq, degradome-seq — base conclusions on these
- **High-confidence prediction** (T2): TargetScan conserved sites, DIANA-microT score > 0.9 — support validated findings
- **Low-confidence** (T3-T4): expression correlation alone (T3) or sequence-only predictions from miRanda, PicTar, RNA22 (T4). Use for hypothesis generation only; do not report as findings
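The framework above can be sketched as a grading function. This is a hedged illustration: `grade_target` and its parameters are hypothetical names, the keyword list is taken only from the assays named in the framework, and real evidence strings will need more careful matching.

```python
# Assay keywords taken from the T1 examples in the framework (illustrative only)
T1_KEYWORDS = ("luciferase", "reporter", "clip-seq", "degradome")


def grade_target(evidence, targetscan_conserved=False, diana_score=None):
    """Return the evidence tier (T1-T4) for one miRNA-target pair.

    evidence: list of free-text evidence descriptions for the pair.
    targetscan_conserved: True if TargetScan reports a conserved site.
    diana_score: DIANA-microT score, if available.
    """
    text = " ".join(evidence).lower()
    if any(keyword in text for keyword in T1_KEYWORDS):
        return "T1"  # experimentally validated
    if targetscan_conserved or (diana_score is not None and diana_score > 0.9):
        return "T2"  # high-confidence prediction
    if "expression correlation" in text:
        return "T3"  # correlative only
    return "T4"  # sequence-based prediction only (miRanda, PicTar, RNA22, ...)


def reportable(tier: str) -> bool:
    """Only T1 supports a finding on its own; T2 supports validated findings."""
    return tier == "T1"


print(grade_target(["Luciferase reporter assay"]))
print(grade_target([], diana_score=0.95))
print(grade_target(["miRanda"]))
```

Checking validated evidence first means a pair with both a reporter assay and a prediction is graded by its strongest support.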
**For lncRNAs** — the mechanism varies:
| lncRNA Mechanism | Example | How to Investigate |
|---|---|---|
| **Chromatin modifier** | HOTAIR, XIST | Check interacting proteins (PRC2, LSD1) via PubMed |
| **Transcription regulator** | NEAT1, MEG3 | Check nearby genes (cis-regulation) via genomic location |
| **miRNA sponge** | MALAT1, circRNAs | Search for miRNA binding sites |
| **Scaffold** | NKILA, BCAR4 | Check protein interactions |
| **Enhancer RNA** | eRNAs | Check ENCODE enhancer annotations |
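For the "miRNA sponge" row, searching for miRNA binding sites usually starts with a seed match. The sketch below is a minimal illustration under stated assumptions: it looks only for canonical 7mer-m8 sites (the reverse complement of miRNA nucleotides 2-8), the helper names are hypothetical, and the target sequence is made up. A seed match is T4 evidence at best and does not demonstrate sponging.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna_seq: str) -> str:
    """Return the 7mer-m8 target site: reverse complement of miRNA nt 2-8."""
    seed = mirna_seq.upper().replace("T", "U")[1:8]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna_seq: str, target_seq: str) -> list:
    """Return 0-based start positions of 7mer-m8 seed matches in target_seq."""
    site = seed_site(mirna_seq)
    target = target_seq.upper().replace("T", "U")
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target[i : i + len(site)] == site
    ]


MIR21_5P = "UAGCUUAUCAGACUGAUGUUGA"  # hsa-miR-21-5p mature sequence
TOY_TARGET = "GGCAUAAGCUACC"  # made-up sequence, for illustration only

print(seed_site(MIR21_5P))  # AUAAGCU
print(find_seed_sites(MIR21_5P, TOY_TARGET))  # [3]
```

DNA input is accepted by converting T to U before matching.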
### Phase 2: Expression & Tissue Specificity
```python
GTEx_get_median_gene_expression(gene_symbol="MIR21") # miRNA host gene expression
# Note: GTEx is bulk RNA-seq and does not quantify mature miRNAs reliably;
# use miRNA-seq datasets from GEO for mature miRNA expression
```
**Interpretation**: Tissue-restricted ncRNAs are often functionally important in that tissue. Ubiquitous ncRNAs (like MALAT1) tend to have housekeeping roles.
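One way to put a number on "tissue-restricted versus ubiquitous" is the tau specificity index. Using tau is an assumption here, not something this skill prescribes; `tau_index` is a hypothetical helper and the expression values are made up. Tau is 0 for perfectly uniform expression and 1 when expression is confined to a single tissue.

```python
def tau_index(expression: dict) -> float:
    """Tau tissue-specificity index over a {tissue: expression} mapping.

    tau = sum(1 - x_i / x_max) / (n - 1), where x_max is the peak expression.
    """
    values = list(expression.values())
    peak = max(values)
    if len(values) < 2 or peak <= 0:
        raise ValueError("need at least two tissues and a positive peak value")
    return sum(1 - value / peak for value in values) / (len(values) - 1)


# Made-up median expression values, for illustration only
restricted = {"liver": 100.0, "lung": 0.0, "brain": 0.0}
uniform = {"liver": 50.0, "lung": 50.0, "brain": 50.0}

print(tau_index(restricted))  # 1.0
print(tau_index(uniform))  # 0.0
```

In practice tau is often computed on log-transformed expression values, which dampens the influence of a single very high tissue.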
### Phase 3: Disease Associations
```python
DisGeNET_search_gene(query="MIR21") # miR-21 disease associations
PubMed_search_articles(query="miR-21 biomarker cancer")
```
**Key ncRNA-disease associations** (well-established T1 examples — always verify via DisGeNET or PubMed for the specific ncRNA):
- miR-21: OncomiR in multiple cancers; targets PTEN, PDCD4, TPM1 (hundreds of T1 studies)
- miR-155: B-cell lymphoma, inflammation — immune regulation
- miR-122: Hepatitis C liver disease — HCV replication cofactor; therapeutic target (miravirsen)
- let-7 family: Lung cancer, stem cell differentiation — tumor suppressor targeting RAS, HMGA2
- HOTAIR: Breast/colorectal cancer — recruits PRC2, promotes metastasis
- MALAT1: Lung cancer/metastasis — splicing regulation
- XIST: X-inactivation, cancer — chromatin silencing
- H19: Beckwith-Wiedemann syndrome, cancer — imprinted lncRNA, miR-675 host
- ANRIL: CVD, diabetes, cancer — CDKN2A/B locus regulation (GWAS-validated)
### Phase 4: Functional Interpretation
After identifying miRNA targets (Phase 1), run pathway enrichment:
```python
# Collect validated target gene symbols
targets = ["PTEN", "PDCD4", "TPM1", "RECK", "SPRY1"]  # miR-21 targets
# Pathway enrichment (space-separated identifiers)
ReactomeAnalysis_pathway_enrichment(identifiers=" ".join(targets))
# Interaction network (carriage-return-separated identifiers; 9606 = human)
STRING_get_network(identifiers="\r".join(targets), species=9606)
```
**Interpretation**: If miR-21 targets are enriched in apoptosis and PI3K-AKT signaling, this supports a model in which miR-21 acts as an oncomiR, promoting survival by repressing multiple tumor suppressors at once.
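To make "enriched" concrete: over-representation analysis is commonly scored with a hypergeometric test. The sketch below assumes that test and uses made-up counts; it is not a description of how `ReactomeAnalysis_pathway_enrichment` computes its statistics, and `enrichment_pvalue` is a hypothetical helper.

```python
from math import comb


def enrichment_pvalue(n_background: int, n_pathway: int,
                      n_targets: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) when n_targets genes are drawn at random.

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_targets:    size of the target gene list
    n_overlap:    targets that fall in the pathway
    """
    upper = min(n_pathway, n_targets)
    total = comb(n_background, n_targets)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up numbers: 5 targets, 3 of them in a 100-gene pathway, 20000-gene background
p = enrichment_pvalue(n_background=20000, n_pathway=100, n_targets=5, n_overlap=3)
print(f"p = {p:.2e}")
```

With several pathways tested at once, the raw p-values still need multiple-testing correction before any pathway is called enriched.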
**Report structure**:
1. **ncRNA Identity** — class, sequence, genomic location, conservation
2. **Targets/Interactions** — validated targets with evidence grades
3. **Expression Profile** — tissue specificity, disease-specific expression changes
4. **Disease Associations** — evidence-graded disease links
5. **Pathway Analysis** — enriched pathways among targets
6. **Mechanistic Model** — how this ncRNA contributes to disease biology
7. **Clinical Potential** — biomarker utility, therapeutic target potential (antagomirs, ASOs)
---
## Computational Procedures
### Computational Procedure: TargetScan Predicted Targets (Download-and-Process)
TargetScan provides widely used computational miRNA target predictions but has no REST API. Download and process locally:
```python
# Step 1: Download TargetScan predicted targets (one-time, ~10MB zipped)
import io
import zipfile

import pandas as pd
import requests

url = "https://www.targetscan.org/vert_80/vert_80_data_download/Summary_Counts.default_predictions.txt.zip"
resp = requests.get(url, timeout=60)
resp.raise_for_status()  # stop early on HTTP errors
with zipfile.ZipFile(io.BytesIO(resp.content)) as z:
    with z.open(z.namelist()[0]) as fh:
        df = pd.read_csv(fh, sep="\t")

# Column names are assumed from the TargetScan 8.0 Summary_Counts header;
# run print(df.columns) and adjust if your download differs.
gene_col, score_col = "Gene Symbol", "Cumulative weighted context++ score"
name_cols = ["miRNA family", "Representative miRNA"]

# Step 2: Query for a specific miRNA family (human rows, taxon 9606)
mirna = "miR-21-5p"  # or "miR-21/590-5p" (TargetScan uses family names)
pattern = r"miR-21(?!\d)"  # lookahead keeps miR-210, miR-214, ... out
human = df[df["Species ID"] == 9606]
hit = human[name_cols].astype(str).apply(
    lambda col: col.str.contains(pattern, case=False)
).any(axis=1)
targets = human[hit].copy()

# Step 3: Rank by cumulative weighted context++ score (most negative first)
targets[score_col] = pd.to_numeric(targets[score_col], errors="coerce")
targets_ranked = targets.dropna(subset=[score_col]).sort_values(score_col)
print(f"Top 20 predicted targets of {mirna}:")
for _, row in targets_ranked.head(20).iterrows():
    print(f"  {row[gene_col]:10s} score={row[score_col]:.3f} "
          f"sites={row['Total num conserved sites']}")
```
**Interpretation**: A more negative context++ score means stronger predicted repression. Targets with more than one conserved site are higher confidence.
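The interpretation rule can be tried on a toy table before running it on the full download. In this sketch the gene names and scores are placeholders, `rank_predictions` is a hypothetical helper, and the "more than one conserved site" cutoff is applied as a hard filter, which is stricter than the text requires.

```python
def rank_predictions(rows, min_conserved_sites=2):
    """Keep targets with enough conserved sites, most negative score first."""
    kept = [row for row in rows if row["conserved_sites"] >= min_conserved_sites]
    return sorted(kept, key=lambda row: row["score"])


# Placeholder rows, not real TargetScan output
toy_rows = [
    {"gene": "GENE_A", "score": -0.15, "conserved_sites": 1},
    {"gene": "GENE_B", "score": -0.62, "conserved_sites": 2},
    {"gene": "GENE_C", "score": -0.40, "conserved_sites": 3},
]

ranked = rank_predictions(toy_rows)
print([row["gene"] for row in ranked])  # ['GENE_B', 'GENE_C']
```

Lowering `min_conserved_sites` to 1 keeps single-site targets in the ranking while preserving the score order.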
### Computational Procedure: miRTarBase Validated Targets (Download-and-Process)
miRTarBase has Cloudflare protection blocking programmatic access. Use the R/Bioconductor data package or bulk download:
```python
# Option 1: Download from miRTarBase bulk export (requires browser download first)
# Go to: https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2025/
# Download: hsa_MTI.xlsx (human miRNA-target interactions)
# Option 2: Use the GitHub data dump
# https://github.com/jorainer/mirtarbase (R package with cached data)

# Once you have the file:
import pandas as pd

mti = pd.read_excel("hsa_MTI.xlsx")  # needs openpyxl; use pd.read_csv(..., sep="\t") for TSV

# Filter for your miRNA; the lookahead keeps miR-210, miR-214, ... out
mir21_targets = mti[mti["miRNA"].str.contains(r"hsa-miR-21(?!\d)", case=False, na=False)]
print(f"miR-21 validated interactions: {len(mir21_targets)}")

# Filter by evidence strength. Assay names are assumed to be in the "Experiments"
# column ("Support Type" holds Functional / Non-Functional MTI labels);
# check mti.columns if your export differs.
strong = mir21_targets[mir21_targets["Experiments"].str.contains(
    "Luciferase|Reporter|Western|CLIP", case=False, na=False
)]
print(f"  Strong evidence (reporter/Western/CLIP): {len(strong)}")
for _, row in strong.head(10).iterrows():
    print(f"  {row['Target Gene']:10s}: {row['Experiments']}")
```
**When download is not available**: Use the built-in reference table in Phase 1 for well-studied miRNAs, or search PubMed for validated targets.
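The fallback order can be sketched as follows. This is an illustration under assumptions: `validated_targets` is a hypothetical helper, the local file is assumed to be a tab-separated export with `miRNA` and `Target Gene` columns, and the reference dictionary simply copies the Phase 1 list.

```python
import csv
import os
import re

# Copied from the Phase 1 reference list of well-studied miRNA targets
REFERENCE_TARGETS = {
    "miR-21": ["PTEN", "PDCD4", "TPM1", "RECK", "SPRY1", "SPRY2", "BTG2"],
    "miR-155": ["SOCS1", "SHIP1", "AID", "TP53INP1"],
    "miR-122": ["SLC7A1", "ADAM17"],
    "let-7": ["RAS", "HMGA2", "MYC", "LIN28"],
}


def validated_targets(mirna, mti_tsv=None):
    """Return (targets, source), preferring a local miRTarBase export."""
    if mti_tsv and os.path.exists(mti_tsv):
        # The lookahead stops "miR-21" from matching miR-210, miR-214, ...
        pattern = re.compile(rf"{re.escape(mirna)}(?!\d)", re.IGNORECASE)
        with open(mti_tsv, newline="") as fh:
            rows = csv.DictReader(fh, delimiter="\t")
            genes = sorted(
                {row["Target Gene"] for row in rows if pattern.search(row["miRNA"])}
            )
        return genes, "miRTarBase export"
    if mirna in REFERENCE_TARGETS:
        return REFERENCE_TARGETS[mirna], "reference table"
    return [], "none: search PubMed"


print(validated_targets("miR-21"))
print(validated_targets("miR-999"))
```

Returning the source alongside the targets keeps the evidence trail visible in the final report.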
---
## Limitations
- **miRNA target prediction is noisy** — even the best algorithms have >50% false positive rates; always prioritize experimentally validated targets
- **lncRNA function is poorly characterized** — only ~5% of annotated lncRNAs have known functions
- **Expression measurement varies** — miRNA-seq, RNA-seq, and microarray capture different ncRNA classes; check the assay type
- **Species differences**: miRNAs are often conserved but lncRNAs are frequently species-specific; cross-species lncRNA comparisons are unreliable