pubmed-summariser
Search PubMed for a gene name or disease term and generate a structured research briefing of the top recent English-language papers.
Best use case
pubmed-summariser is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using pubmed-summariser should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/pubmed-summariser/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How pubmed-summariser Compares
| Feature / Agent | pubmed-summariser | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Search PubMed for a gene name or disease term and generate a structured research briefing of the top recent English-language papers.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# 📄 PubMed Summariser

You are **PubMed Summariser**, a specialised ClawBio agent for literature retrieval. Your role is to take a gene name or disease term, query PubMed via the NCBI Entrez API, and return a structured briefing of the top recent English-language papers.

## Why This Exists

- **Without it**: Researchers manually search PubMed and read each abstract to stay current, which takes hours
- **With it**: A formatted briefing of the top papers arrives in seconds
- **Why ClawBio**: Grounded in real PubMed data via NCBI Entrez API, not AI-hallucinated citations

## Core Capabilities

1. **PubMed query**: Search by gene name (e.g. `BRCA1`) or disease term (e.g. `type 2 diabetes`)
2. **Structured extraction**: Title, authors, journal, publication date, abstract excerpt, PubMed URL
3. **Dual output**: Terminal summary for quick review + HTML report for sharing

## Input Formats

| Format | Example |
|--------|---------|
| Gene symbol | `BRCA1`, `TP53`, `MTHFR` |
| Disease term | `type 2 diabetes`, `cystic fibrosis` |

## Workflow

When the user asks to summarise PubMed papers about a gene or disease:

1. **Receive query**: `--query <term>` or `--demo` (uses BRCA1)
2. **esearch**: Query `https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi` for PMIDs
3. **efetch**: Fetch full XML records for those PMIDs
4. **Parse XML**: Extract title, authors, journal, date, abstract
5. **Render output**: Print terminal summary and write `report.html`

## Algorithm / Methodology

- Query: `<term> AND english[la]`, sorted by date descending, max 10 results (default)
- Author formatting: up to 3 authors as "Last FM", then "et al." if more exist
- Abstract: first sentence heuristic, split on `. ` followed by uppercase letter, max 300 chars
- All NCBI requests include `tool=clawbio&email=clawbio@example.com` per NCBI E-utilities policy
- Network timeout: 10 seconds

## Output Structure

```
PubMed Research Briefing: <query>
================================
Found N papers (sorted by date, English only)

1. <title>
   Authors: <authors>
   Journal: <journal> | <date>
   Abstract: <first sentence>
   URL: https://pubmed.ncbi.nlm.nih.gov/<pmid>/
```

HTML report saved to `<output>/report.html`.

## Dependencies

- `requests` (HTTP)
- `xml.etree.ElementTree` (stdlib, XML parsing)
- `clawbio.common.html_report.HtmlReportBuilder` (HTML rendering)

## Safety

Every report includes the standard ClawBio medical disclaimer:

> ClawBio is a research and educational tool. It is not a medical device and does not provide clinical diagnoses. Consult a healthcare professional before making any medical decisions.

## Integration with Bio Orchestrator

Triggered by: "summarise PubMed papers about X", "recent papers on BRCA1", "research briefing", "gene papers", "disease papers"

Chaining partners: `lit-synthesizer` (broader literature), `gwas-lookup` (variant context), `gwas-prs` (polygenic risk)
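
The query, author, and abstract rules in the Algorithm / Methodology notes above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's actual source: the function names (`build_esearch_params`, `format_authors`, `first_sentence`, `parse_articles`) are hypothetical, `sort=pub_date` and `retmode=json` are guesses at how "sorted by date descending" might be requested from esearch, the regex is one reading of the first-sentence heuristic, and `SAMPLE_XML` is invented placeholder data shaped like a PubMed efetch record.

```python
import re
import xml.etree.ElementTree as ET

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_params(term, retmax=10):
    """Build esearch query parameters following the rules listed above."""
    return {
        "db": "pubmed",
        "term": f"{term} AND english[la]",
        "sort": "pub_date",  # assumption: descending publication-date sort
        "retmax": retmax,
        "retmode": "json",  # assumption: the skill may request XML instead
        "tool": "clawbio",
        "email": "clawbio@example.com",
    }


def format_authors(authors, limit=3):
    """authors is a list of (last_name, initials) tuples."""
    names = [f"{last} {initials}".strip() for last, initials in authors[:limit]]
    if len(authors) > limit:
        names.append("et al.")
    return ", ".join(names)


def first_sentence(abstract, max_chars=300):
    """Cut at the first period + whitespace that precedes an uppercase letter."""
    parts = re.split(r"(?<=\.)\s+(?=[A-Z])", abstract.strip(), maxsplit=1)
    return parts[0][:max_chars]


def _text(elem):
    """Full text of an element, including text inside inline child tags."""
    return "".join(elem.itertext()).strip() if elem is not None else ""


def parse_articles(xml_text):
    """Extract the briefing fields from an efetch-style PubmedArticleSet."""
    root = ET.fromstring(xml_text)
    papers = []
    for art in root.iter("PubmedArticle"):
        pmid = art.findtext("MedlineCitation/PMID", default="")
        authors = [
            (a.findtext("LastName"), a.findtext("Initials", default=""))
            for a in art.iter("Author")
            if a.find("LastName") is not None  # skips collective author names
        ]
        papers.append({
            "title": _text(art.find(".//ArticleTitle")),
            "authors": format_authors(authors),
            "journal": _text(art.find(".//Journal/Title")),
            "date": art.findtext(".//PubDate/Year", default=""),
            "abstract": first_sentence(_text(art.find(".//AbstractText"))),
            "url": f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/",
        })
    return papers


# Invented placeholder record, for illustration only (not a real paper).
SAMPLE_XML = """<PubmedArticleSet><PubmedArticle><MedlineCitation>
<PMID>00000000</PMID><Article>
<Journal><JournalIssue><PubDate><Year>2024</Year></PubDate></JournalIssue>
<Title>Example Journal</Title></Journal>
<ArticleTitle>An example title.</ArticleTitle>
<Abstract><AbstractText>First sentence here. Second sentence here.</AbstractText></Abstract>
<AuthorList><Author><LastName>Smith</LastName><Initials>JA</Initials></Author></AuthorList>
</Article></MedlineCitation></PubmedArticle></PubmedArticleSet>"""

if __name__ == "__main__":
    for paper in parse_articles(SAMPLE_XML):
        print(paper)
```

In a real run, the parameters would presumably be sent with something like `requests.get(ESEARCH_URL, params=build_esearch_params("BRCA1"), timeout=10)`, matching the 10-second timeout noted above, and the returned PMIDs would then be passed to efetch to obtain the XML that `parse_articles` reads.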
Related Skills
wes-clinical-report-es
Generates professional clinical PDF reports in Spanish from WES (Whole Exome Sequencing) data with clinical interpretation, pharmacogenomic alerts, and follow-up recommendations.
wes-clinical-report-en
Generates professional clinical PDF reports in English from WES (Whole Exome Sequencing) data with clinical interpretation summary, pharmacogenomic alerts, and follow-up recommendations.
vcf-annotator
Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.
variant-annotation
Annotate VCF variants with Ensembl VEP REST, ClinVar significance, gnomAD/population frequency context, and prioritized variant ranking.
ukb-navigator
Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.
target-validation-scorer
Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns
struct-predictor
Protein structure prediction with Boltz-2. Accepts YAML inputs (single protein or multi-chain complex), runs boltz predict, extracts per-residue pLDDT and PAE confidence, and writes a markdown report with figures.
soul2dna
Compile SOUL.md character profiles into synthetic diploid genomes (.genome.json) via trait-to-allele mapping
seq-wrangler
Sequence QC, alignment, and BAM processing. Wraps FastQC, BWA/Bowtie2, SAMtools for automated read-to-BAM pipelines.
scrna-orchestrator
Local Scanpy pipeline for single-cell RNA-seq QC, optional doublet detection, clustering, marker discovery, optional CellTypist annotation, optional latent downstream mode from integrated.h5ad/X_scvi, and optional dataset-level plus within-cluster contrastive marker analysis from raw-count .h5ad or 10x Matrix Market input.
scrna-embedding
Local scVI/scANVI-based single-cell latent embedding and batch-aware integration from raw-count .h5ad or 10x Matrix Market input, with stable integrated AnnData export for downstream latent analysis.
rnaseq-de
Differential expression analysis for bulk RNA-seq and pseudo-bulk count matrices with QC, PCA, and contrast testing.