tooluniverse-gwas-snp-interpretation

Interpret genetic variants (SNPs) from GWAS studies by aggregating evidence from multiple databases (GWAS Catalog, Open Targets Genetics, ClinVar). Retrieves variant annotations, GWAS trait associations, fine-mapping evidence, locus-to-gene predictions, and clinical significance. Use when asked to interpret a SNP by rsID, find disease associations for a variant, assess clinical significance, or answer questions like "What diseases is rs429358 associated with?" or "Interpret rs7903146".

by FreedomIntelligence

Best use case

tooluniverse-gwas-snp-interpretation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using tooluniverse-gwas-snp-interpretation can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/tooluniverse-gwas-snp-interpretation/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/tooluniverse-gwas-snp-interpretation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/tooluniverse-gwas-snp-interpretation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

SKILL.md Source

# GWAS SNP Interpretation Skill

## Overview

Interpret genetic variants (SNPs) from GWAS studies by aggregating evidence from multiple sources to provide comprehensive clinical and biological context.

**Use Cases:**
- "Interpret rs7903146" (TCF7L2 diabetes variant)
- "What diseases is rs429358 associated with?" (APOE Alzheimer's variant)
- "Clinical significance of rs1801133" (MTHFR variant)
- "Is rs12913832 in any fine-mapped loci?" (Eye color variant)

## What It Does

The skill provides a comprehensive interpretation of SNPs by:

1. **SNP Annotation**: Retrieves basic variant information including genomic coordinates, alleles, functional consequence, and mapped genes
2. **Association Discovery**: Finds all GWAS trait/disease associations with statistical significance
3. **Fine-Mapping Evidence**: Identifies credible sets the variant belongs to (fine-mapped causal loci)
4. **Gene Mapping**: Uses Locus-to-Gene (L2G) predictions to identify likely causal genes
5. **Clinical Summary**: Aggregates evidence into actionable clinical significance

## Workflow

```
User Input: rs7903146
    ↓
[1] SNP Lookup
    → Get location, consequence, MAF
    → gwas_get_snp_by_id
    ↓
[2] Association Search
    → Find all trait/disease associations
    → gwas_get_associations_for_snp
    ↓
[3] Fine-Mapping (Optional)
    → Get credible set membership
    → OpenTargets_get_variant_credible_sets
    ↓
[4] Gene Predictions
    → Extract L2G scores for causal genes
    → (embedded in credible sets)
    ↓
[5] Clinical Summary
    → Aggregate evidence
    → Identify key traits and genes
    ↓
Output: Comprehensive Interpretation Report
```
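The step order above can be sketched as a small orchestration function. This is a minimal sketch, not the skill's actual implementation: `call_tool` is a hypothetical callable that runs a named tool, and the argument keys (`rs_id`, `variantId`) are placeholders, since the real parameter names are not given here.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical interface: call_tool(tool_name, arguments) -> tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def run_workflow(
    rs_id: str,
    call_tool: ToolCaller,
    variant_id: Optional[str] = None,
    include_credible_sets: bool = True,
) -> Dict[str, Any]:
    """Run steps [1]-[3] in order and collect the raw results."""
    report: Dict[str, Any] = {"rs_id": rs_id}

    # [1] SNP lookup (argument key is an assumption).
    report["snp"] = call_tool("gwas_get_snp_by_id", {"rs_id": rs_id})

    # [2] Association search (argument key is an assumption).
    report["associations"] = call_tool(
        "gwas_get_associations_for_snp", {"rs_id": rs_id}
    )

    # [3] Optional fine-mapping; skipped when no variant ID is available.
    if include_credible_sets and variant_id is not None:
        report["credible_sets"] = call_tool(
            "OpenTargets_get_variant_credible_sets", {"variantId": variant_id}
        )
    else:
        report["credible_sets"] = []

    return report


def echo_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in tool caller that returns its inputs instead of querying an API."""
    return {"tool": name, "arguments": arguments}


report = run_workflow("rs7903146", echo_tool, variant_id="10_112998590_C_T")
print(sorted(report))
```

Passing the tool caller in as a parameter keeps the step order testable without network access; steps [4] and [5] would operate on the collected `credible_sets` and `associations`.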

## Data Sources

### GWAS Catalog (EMBL-EBI)
- **SNP annotations**: Functional consequences, mapped genes, population frequencies
- **Associations**: P-values, effect sizes, study metadata
- **Coverage**: Thousands of publications, 670,000+ associations

### Open Targets Genetics
- **Fine-mapping**: Statistical credible sets from SuSiE, FINEMAP methods
- **L2G predictions**: Machine learning-based gene prioritization
- **Colocalization**: QTL evidence for causal genes
- **Coverage**: UK Biobank, FinnGen, and other large cohorts

## Input Parameters

### Required
- `rs_id` (str): dbSNP rs identifier
  - Format: "rs" + number (e.g., "rs7903146")
  - Must be a valid rsID that is present in the GWAS Catalog

### Optional
- `include_credible_sets` (bool, default=True): Query fine-mapping data
  - True: Complete interpretation (slower, ~10-30s)
  - False: Fast associations only (~2-5s)
- `p_threshold` (float, default=5e-8): Genome-wide significance threshold
- `max_associations` (int, default=100): Maximum associations to retrieve
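
As an illustration, the parameters above could be bundled and validated as follows. The class name `InterpretationParams` is hypothetical, and the check covers only the rsID format; it cannot confirm that the rsID exists in the GWAS Catalog.

```python
import re
from dataclasses import dataclass

# "rs" followed by one or more digits, per the format described above.
RSID_PATTERN = re.compile(r"rs\d+")


@dataclass(frozen=True)
class InterpretationParams:
    """Hypothetical container for the documented parameters and defaults."""

    rs_id: str
    include_credible_sets: bool = True
    p_threshold: float = 5e-8
    max_associations: int = 100

    def __post_init__(self) -> None:
        if not RSID_PATTERN.fullmatch(self.rs_id):
            raise ValueError(f"rs_id must look like 'rs7903146', got {self.rs_id!r}")
        if not 0 < self.p_threshold <= 1:
            raise ValueError("p_threshold must be in the interval (0, 1]")
        if self.max_associations < 1:
            raise ValueError("max_associations must be at least 1")


params = InterpretationParams("rs7903146", include_credible_sets=False)
print(params)
```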

## Output Format

Returns `SNPInterpretationReport` containing:

### 1. SNP Basic Info
```python
{
    'rs_id': 'rs7903146',
    'chromosome': '10',
    'position': 112998590,
    'ref_allele': 'C',
    'alt_allele': 'T',
    'consequence': 'intron_variant',
    'mapped_genes': ['TCF7L2'],
    'maf': 0.293
}
```

### 2. Trait Associations
```python
[
    {
        'trait': 'Type 2 diabetes',
        'p_value': 1.2e-128,
        'beta': '0.28 unit increase',
        'study_id': 'GCST010555',
        'pubmed_id': '33536258',
        'effect_allele': 'T'
    },
    ...
]
```
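
A minimal sketch of how `p_threshold` and `max_associations` might be applied to records of this shape. The helper name `select_associations` is an assumption; the field name `p_value` follows the example above.

```python
from typing import Any, Dict, List


def select_associations(
    associations: List[Dict[str, Any]],
    p_threshold: float = 5e-8,
    max_associations: int = 100,
) -> List[Dict[str, Any]]:
    """Keep associations with p < p_threshold, strongest first, capped in number."""
    significant = [
        a for a in associations
        if a.get("p_value") is not None and a["p_value"] < p_threshold
    ]
    significant.sort(key=lambda a: a["p_value"])
    return significant[:max_associations]


example = [
    {"trait": "HbA1c levels", "p_value": 3e-20},
    {"trait": "Type 2 diabetes", "p_value": 1.2e-128},
    {"trait": "Some nominal trait", "p_value": 0.01},  # illustrative record
]
for assoc in select_associations(example):
    print(assoc["trait"], assoc["p_value"])
```

Records without a `p_value` are dropped rather than treated as significant.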

### 3. Credible Sets (Fine-Mapping)
```python
[
    {
        'study_id': 'GCST90476118',
        'trait': 'Renal failure',
        'finemapping_method': 'SuSiE-inf',
        'p_value': 3.5e-42,
        'predicted_genes': [
            {'gene': 'TCF7L2', 'score': 0.863}
        ],
        'region': '10:112950000-113050000'
    },
    ...
]
```
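
Because the same gene can appear in several credible sets, one way to summarize step [4] is to keep each gene's best L2G score. This is a sketch over the record shape shown above; the helper name `top_predicted_genes` is an assumption.

```python
from typing import Any, Dict, List, Tuple


def top_predicted_genes(credible_sets: List[Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Return (gene, best L2G score) pairs, highest score first."""
    best: Dict[str, float] = {}
    for credible_set in credible_sets:
        for prediction in credible_set.get("predicted_genes", []):
            gene, score = prediction["gene"], prediction["score"]
            if score > best.get(gene, float("-inf")):
                best[gene] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


example_sets = [
    {"study_id": "GCST90476118", "predicted_genes": [{"gene": "TCF7L2", "score": 0.863}]},
    # Second record is illustrative only.
    {"study_id": "EXAMPLE", "predicted_genes": [{"gene": "TCF7L2", "score": 0.41}]},
]
print(top_predicted_genes(example_sets))
```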

### 4. Clinical Significance
```
Genome-wide significant associations with 100 traits/diseases:
  - Type 2 diabetes
  - Diabetic retinopathy
  - HbA1c levels
  ...

Identified in 20 fine-mapped loci.
Predicted causal genes: TCF7L2
```
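
A summary in this layout could be assembled roughly as follows. This is a sketch that assumes the associations passed in have already been filtered to genome-wide significance; the function name and the `max_traits` parameter are hypothetical.

```python
from typing import Any, Dict, List


def build_clinical_summary(
    associations: List[Dict[str, Any]],
    credible_sets: List[Dict[str, Any]],
    max_traits: int = 3,
) -> str:
    """Format a plain-text summary from filtered associations and credible sets."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    traits = list(dict.fromkeys(a["trait"] for a in associations))
    genes = list(dict.fromkeys(
        p["gene"] for cs in credible_sets for p in cs.get("predicted_genes", [])
    ))

    lines = [f"Genome-wide significant associations with {len(traits)} traits/diseases:"]
    lines += [f"  - {trait}" for trait in traits[:max_traits]]
    if len(traits) > max_traits:
        lines.append("  ...")
    lines.append("")
    lines.append(f"Identified in {len(credible_sets)} fine-mapped loci.")
    lines.append("Predicted causal genes: " + (", ".join(genes) if genes else "none"))
    return "\n".join(lines)


print(build_clinical_summary(
    [{"trait": "Type 2 diabetes"}, {"trait": "HbA1c levels"}],
    [{"predicted_genes": [{"gene": "TCF7L2", "score": 0.863}]}],
))
```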

## Example Usage

See `QUICK_START.md` for platform-specific examples.

## Tools Used

### GWAS Catalog Tools
1. `gwas_get_snp_by_id`: Get SNP annotation
2. `gwas_get_associations_for_snp`: Get all trait associations

### Open Targets Tools
3. `OpenTargets_get_variant_info`: Get variant details with population frequencies
4. `OpenTargets_get_variant_credible_sets`: Get fine-mapping credible sets with L2G

## Interpretation Guide

### P-value Significance Levels
- **p < 5e-8**: Genome-wide significant (strong evidence)
- **p < 5e-6**: Suggestive (moderate evidence)
- **p < 0.05**: Nominal (weak evidence)
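
These tiers can be applied from most to least stringent, so each p-value receives the strongest label it qualifies for. A minimal sketch; the function name and the "not significant" fallback label are assumptions.

```python
def classify_p_value(p_value: float) -> str:
    """Map a p-value to the significance tiers listed above."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < 5e-8:
        return "genome-wide significant"
    if p_value < 5e-6:
        return "suggestive"
    if p_value < 0.05:
        return "nominal"
    return "not significant"


for p in (1.2e-128, 1e-7, 0.01, 0.2):
    print(p, classify_p_value(p))
```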

### L2G Score Interpretation
- **> 0.5**: High confidence causal gene
- **0.1-0.5**: Moderate confidence
- **< 0.1**: Low confidence
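
The same bands expressed as a small helper, as a sketch. It assumes both boundary values (0.1 and 0.5) fall in the moderate band, which matches the ranges as written; the function name is hypothetical.

```python
def classify_l2g(score: float) -> str:
    """Map an L2G score in [0, 1] to the confidence bands listed above."""
    if not 0 <= score <= 1:
        raise ValueError("L2G score must be between 0 and 1")
    if score > 0.5:
        return "high"
    if score >= 0.1:
        return "moderate"
    return "low"


print(classify_l2g(0.863))
```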

### Clinical Actionability
1. **High**: Multiple genome-wide significant associations + in credible sets + high L2G scores
2. **Moderate**: Genome-wide significant associations but limited fine-mapping
3. **Low**: Suggestive associations or limited replication
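
One possible encoding of these tiers, as a sketch only. It interprets "multiple" as at least two genome-wide significant associations and "high L2G" as a best score above 0.5; both cut-offs are assumptions rather than rules stated by the skill, and replication is not modeled.

```python
from typing import List, Optional


def actionability_tier(
    p_values: List[float],
    n_credible_sets: int,
    best_l2g: Optional[float],
) -> str:
    """Assign High / Moderate / Low from association and fine-mapping evidence."""
    n_significant = sum(1 for p in p_values if p < 5e-8)
    has_high_l2g = best_l2g is not None and best_l2g > 0.5

    if n_significant >= 2 and n_credible_sets > 0 and has_high_l2g:
        return "High"
    if n_significant >= 1:
        return "Moderate"
    return "Low"


print(actionability_tier([1.2e-128, 3e-20], n_credible_sets=20, best_l2g=0.863))
```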

## Limitations

1. **Variant ID Conversion**: OpenTargets identifies variants in chr_pos_ref_alt format (e.g., 10_112998590_C_T), so an rsID must first be resolved to its coordinates and alleles
2. **Population Specificity**: Associations may vary by ancestry
3. **Effect Sizes**: Beta values are study-dependent (different phenotype scales)
4. **Causality**: Associations don't prove causation; fine-mapping improves confidence
5. **Currency**: Data reflects published GWAS; latest studies may not be included
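
For the variant ID conversion noted in the first limitation, the ID can be built from the fields shown under "SNP Basic Info". This is a minimal sketch with a hypothetical helper name; it assumes the coordinates are already on the genome build that OpenTargets uses and that the alleles are simple nucleotide strings.

```python
import re
from typing import Any, Dict


def to_open_targets_variant_id(snp: Dict[str, Any]) -> str:
    """Build a chr_pos_ref_alt variant ID from a SNP annotation record."""
    chromosome = str(snp["chromosome"])
    if chromosome.lower().startswith("chr"):
        chromosome = chromosome[3:]  # accept "chr10" as well as "10"

    position = int(snp["position"])
    ref, alt = str(snp["ref_allele"]).upper(), str(snp["alt_allele"]).upper()
    for allele in (ref, alt):
        if not re.fullmatch(r"[ACGT]+", allele):
            raise ValueError(f"Unsupported allele: {allele!r}")

    return f"{chromosome}_{position}_{ref}_{alt}"


snp_info = {
    "rs_id": "rs7903146",
    "chromosome": "10",
    "position": 112998590,
    "ref_allele": "C",
    "alt_allele": "T",
}
print(to_open_targets_variant_id(snp_info))  # 10_112998590_C_T
```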

## Best Practices

1. **Use Full Interpretation**: Enable `include_credible_sets=True` for clinical decisions
2. **Check Multiple Variants**: Look at other variants in the same locus
3. **Validate Populations**: Consider ancestry-specific effect sizes
4. **Review Publications**: Check original studies for context
5. **Integrate Evidence**: Combine with functional data, eQTLs, pQTLs

## Technical Notes

### Performance
- **Fast mode** (no credible sets): 2-5 seconds
- **Full mode** (with credible sets): 10-30 seconds
- **Bottleneck**: OpenTargets GraphQL API rate limits

### Error Handling
- Invalid rs_id: Returns error message
- No associations: Returns empty list with note
- API failures: Graceful degradation (returns partial results)
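
The three behaviors above might be combined as in the following sketch. The fetcher callables stand in for the real tool calls, and the report keys and note wording are illustrative assumptions.

```python
import re
from typing import Any, Callable, Dict, List

Fetcher = Callable[[str], List[Dict[str, Any]]]  # hypothetical: rs_id -> records


def interpret_safely(
    rs_id: str,
    fetch_associations: Fetcher,
    fetch_credible_sets: Fetcher,
) -> Dict[str, Any]:
    """Return an error for bad input, otherwise as much of the report as possible."""
    if not re.fullmatch(r"rs\d+", rs_id):
        return {"error": f"Invalid rs_id: {rs_id!r}"}

    report: Dict[str, Any] = {
        "rs_id": rs_id, "associations": [], "credible_sets": [], "notes": [],
    }

    try:
        report["associations"] = fetch_associations(rs_id)
    except Exception as exc:  # keep going with partial results
        report["notes"].append(f"Association lookup failed: {exc}")
    else:
        if not report["associations"]:
            report["notes"].append("No associations found")

    try:
        report["credible_sets"] = fetch_credible_sets(rs_id)
    except Exception as exc:  # fine-mapping is optional, so degrade gracefully
        report["notes"].append(f"Credible set lookup failed: {exc}")

    return report


def failing_fetcher(rs_id: str) -> List[Dict[str, Any]]:
    raise RuntimeError("rate limited")


print(interpret_safely("rs7903146", lambda _: [{"trait": "Type 2 diabetes"}], failing_fetcher))
```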

## Related Skills

- **Gene Function Analysis**: Interpret predicted causal genes
- **Disease Ontology Lookup**: Understand trait classifications
- **PubMed Literature Search**: Find original GWAS publications
- **Variant Effect Prediction**: Functional consequence analysis

## References

1. GWAS Catalog: https://www.ebi.ac.uk/gwas/
2. Open Targets Genetics: https://genetics.opentargets.org/
3. GWAS Significance Thresholds: Fadista et al. 2016
4. L2G Method: Mountjoy et al. 2021 (Nature Genetics)

## Version

- **Version**: 1.0.0
- **Last Updated**: 2026-02-13
- **ToolUniverse Version**: >= 1.0.0
- **Tools Required**: gwas_get_snp_by_id, gwas_get_associations_for_snp, OpenTargets_get_variant_credible_sets
