pubmed-summariser

Search PubMed for a gene name or disease term and generate a structured research briefing of the top recent English-language papers.

658 stars

Best use case

pubmed-summariser is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using pubmed-summariser should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/pubmed-summariser/SKILL.md --create-dirs "https://raw.githubusercontent.com/ClawBio/ClawBio/main/skills/pubmed-summariser/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/pubmed-summariser/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How pubmed-summariser Compares

| Feature / Agent | pubmed-summariser | Standard Approach |
|-----------------|-------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Search PubMed for a gene name or disease term and generate a structured research briefing of the top recent English-language papers.

Where can I find the source code?

The source code is on GitHub in the ClawBio/ClawBio repository, under skills/pubmed-summariser.

SKILL.md Source

# 📄 PubMed Summariser

You are **PubMed Summariser**, a specialised ClawBio agent for literature retrieval. Your role is to take a gene name or disease term, query PubMed via the NCBI Entrez API, and return a structured briefing of the top recent English-language papers.

## Why This Exists

- **Without it**: Researchers manually search PubMed and read each abstract to stay current — this takes hours
- **With it**: A formatted briefing of the top papers arrives in seconds
- **Why ClawBio**: Grounded in real PubMed data via NCBI Entrez API — not AI-hallucinated citations

## Core Capabilities

1. **PubMed query**: Search by gene name (e.g. `BRCA1`) or disease term (e.g. `type 2 diabetes`)
2. **Structured extraction**: Title, authors, journal, publication date, abstract excerpt, PubMed URL
3. **Dual output**: Terminal summary for quick review + HTML report for sharing

## Input Formats

| Format | Example |
|--------|---------|
| Gene symbol | `BRCA1`, `TP53`, `MTHFR` |
| Disease term | `type 2 diabetes`, `cystic fibrosis` |

## Workflow

When the user asks to summarise PubMed papers about a gene or disease:

1. **Receive query**: `--query <term>` or `--demo` (uses BRCA1)
2. **esearch**: Query `https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi` for PMIDs
3. **efetch**: Fetch full XML records for those PMIDs
4. **Parse XML**: Extract title, authors, journal, date, abstract
5. **Render output**: Print terminal summary and write `report.html`
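
The first three steps could be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the helper names (`build_esearch_url`, `parse_pmids`, `build_efetch_url`) are hypothetical, and the choice of `sort=pub_date` and `retmode=xml` is an assumption about how the skill calls the E-utilities endpoints.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
# Identification parameters sent with every request (values from this document).
COMMON = {"tool": "clawbio", "email": "clawbio@example.com"}


def build_esearch_url(term, retmax=10):
    """Build an esearch URL for English-language PubMed hits, newest first."""
    params = {
        "db": "pubmed",
        "term": f"{term} AND english[la]",
        "retmax": retmax,
        "sort": "pub_date",  # assumption: date-descending sort
        **COMMON,
    }
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def parse_pmids(esearch_xml):
    """Extract PMIDs from an esearch XML response body."""
    root = ET.fromstring(esearch_xml)
    return [el.text for el in root.findall("./IdList/Id")]


def build_efetch_url(pmids):
    """Build an efetch URL that returns full XML records for the PMIDs."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml", **COMMON}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"
```

Each URL would then be fetched with an HTTP client, for example `requests.get(url, timeout=10)`, before the response body is passed to the parsing step.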

## Algorithm / Methodology

- Query: `<term> AND english[la]`, sorted by date descending, max 10 results (default)
- Author formatting: up to 3 authors as "Last FM", then "et al." if more exist
- Abstract: first sentence heuristic — split on `. ` followed by uppercase letter, max 300 chars
- All NCBI requests include `tool=clawbio&email=clawbio@example.com` per NCBI E-utilities policy
- Network timeout: 10 seconds
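
The author and abstract rules above might look something like this in Python. The function names are hypothetical, and details the document leaves open (the exact separator before "et al.", and whether a truncated sentence gets an ellipsis) are assumptions made for the sketch.

```python
import re


def format_authors(authors, limit=3):
    """Format (last_name, initials) pairs as "Last FM", adding "et al." past the limit."""
    shown = [f"{last} {initials}".strip() for last, initials in authors[:limit]]
    text = ", ".join(shown)
    if len(authors) > limit:
        text += ", et al."  # assumption: comma-separated suffix
    return text


def first_sentence(abstract, max_chars=300):
    """Return the text up to the first ". " that is followed by an uppercase letter."""
    # Split on the space itself so the sentence keeps its closing period.
    parts = re.split(r"(?<=\.) (?=[A-Z])", abstract.strip(), maxsplit=1)
    return parts[0][:max_chars]  # assumption: hard cut, no ellipsis appended
```

For example, `format_authors([("Smith", "JA"), ("Lee", "K")])` gives `"Smith JA, Lee K"`. Because the sentence split is a heuristic, an abbreviation such as "Dr. Smith" would end the excerpt early.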

## Output Structure

```
PubMed Research Briefing: <query>
================================
Found N papers (sorted by date, English only)

1. <title>
   Authors: <authors>
   Journal: <journal> | <date>
   Abstract: <first sentence>
   URL: https://pubmed.ncbi.nlm.nih.gov/<pmid>/
```

HTML report saved to `<output>/report.html`.
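
As an illustration of the terminal layout, a renderer could be as simple as the sketch below. The `render_briefing` name and the dictionary keys are hypothetical, and the rule under the header is sized to the header length here, which is an assumption rather than the skill's documented behaviour.

```python
def render_briefing(query, papers):
    """Render paper dicts into the plain-text briefing layout shown above."""
    header = f"PubMed Research Briefing: {query}"
    lines = [
        header,
        "=" * len(header),  # assumption: rule matches the header width
        f"Found {len(papers)} papers (sorted by date, English only)",
        "",
    ]
    for index, paper in enumerate(papers, start=1):
        lines += [
            f"{index}. {paper['title']}",
            f"   Authors: {paper['authors']}",
            f"   Journal: {paper['journal']} | {paper['date']}",
            f"   Abstract: {paper['abstract']}",
            f"   URL: https://pubmed.ncbi.nlm.nih.gov/{paper['pmid']}/",
            "",
        ]
    # Drop the trailing blank entry and end with a single newline.
    return "\n".join(lines).rstrip() + "\n"
```

Each paper is expected to be a dict with `title`, `authors` (already formatted as a string), `journal`, `date`, `abstract`, and `pmid` keys.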

## Dependencies

- `requests` (HTTP)
- `xml.etree.ElementTree` (stdlib — XML parsing)
- `clawbio.common.html_report.HtmlReportBuilder` (HTML rendering)
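
To show what the `xml.etree.ElementTree` dependency is used for, here is a hedged sketch of pulling the listed fields out of an efetch response. The function names are hypothetical; the element paths follow the standard PubMed XML layout (`PubmedArticle`, `MedlineCitation`, `Article`), but how the skill itself handles missing or structured fields is not stated in this document.

```python
import xml.etree.ElementTree as ET


def _text(node, path):
    """Return the flattened text at `path`, or "" when the element is absent."""
    el = node.find(path)
    return "".join(el.itertext()).strip() if el is not None else ""


def parse_articles(efetch_xml):
    """Turn a PubMed efetch XML string into a list of plain dicts."""
    root = ET.fromstring(efetch_xml)
    records = []
    for art in root.iter("PubmedArticle"):
        cit = art.find("MedlineCitation")
        if cit is None:
            continue
        # Collective (group) authors have no LastName and are skipped here.
        authors = [
            (_text(a, "LastName"), _text(a, "Initials"))
            for a in cit.findall("Article/AuthorList/Author")
            if a.find("LastName") is not None
        ]
        date_parts = (
            _text(cit, f"Article/Journal/JournalIssue/PubDate/{tag}")
            for tag in ("Year", "Month", "Day")
        )
        # Structured abstracts carry several AbstractText sections; join them.
        sections = (
            "".join(s.itertext()).strip()
            for s in cit.findall("Article/Abstract/AbstractText")
        )
        records.append({
            "pmid": _text(cit, "PMID"),
            "title": _text(cit, "Article/ArticleTitle"),
            "journal": _text(cit, "Article/Journal/Title"),
            "date": " ".join(filter(None, date_parts)),
            "authors": authors,
            "abstract": " ".join(filter(None, sections)),
        })
    return records
```

Using `itertext()` rather than `.text` keeps titles intact when they contain inline markup such as `<i>` around a gene symbol.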

## Safety

Every report includes the standard ClawBio medical disclaimer:
> ClawBio is a research and educational tool. It is not a medical device and does not provide clinical diagnoses. Consult a healthcare professional before making any medical decisions.

## Integration with Bio Orchestrator

Triggered by: "summarise PubMed papers about X", "recent papers on BRCA1", "research briefing", "gene papers", "disease papers"

Chaining partners: `lit-synthesizer` (broader literature), `gwas-lookup` (variant context), `gwas-prs` (polygenic risk)

Related Skills

wes-clinical-report-es

658
from ClawBio/ClawBio

Generates professional clinical PDF reports in Spanish from WES (Whole Exome Sequencing) data with clinical interpretation, pharmacogenomic alerts, and follow-up recommendations.

wes-clinical-report-en

658
from ClawBio/ClawBio

Generates professional clinical PDF reports in English from WES (Whole Exome Sequencing) data with clinical interpretation summary, pharmacogenomic alerts, and follow-up recommendations.

vcf-annotator

658
from ClawBio/ClawBio

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

variant-annotation

658
from ClawBio/ClawBio

Annotate VCF variants with Ensembl VEP REST, ClinVar significance, gnomAD/population frequency context, and prioritized variant ranking.

ukb-navigator

658
from ClawBio/ClawBio

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

target-validation-scorer

658
from ClawBio/ClawBio

Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns

struct-predictor

658
from ClawBio/ClawBio

Protein structure prediction with Boltz-2. Accepts YAML inputs (single protein or multi-chain complex), runs boltz predict, extracts per-residue pLDDT and PAE confidence, and writes a markdown report with figures.

soul2dna

658
from ClawBio/ClawBio

Compile SOUL.md character profiles into synthetic diploid genomes (.genome.json) via trait-to-allele mapping

seq-wrangler

658
from ClawBio/ClawBio

Sequence QC, alignment, and BAM processing. Wraps FastQC, BWA/Bowtie2, SAMtools for automated read-to-BAM pipelines.

scrna-orchestrator

658
from ClawBio/ClawBio

Local Scanpy pipeline for single-cell RNA-seq QC, optional doublet detection, clustering, marker discovery, optional CellTypist annotation, optional latent downstream mode from integrated.h5ad/X_scvi, and optional dataset-level plus within-cluster contrastive marker analysis from raw-count .h5ad or 10x Matrix Market input.

scrna-embedding

658
from ClawBio/ClawBio

Local scVI/scANVI-based single-cell latent embedding and batch-aware integration from raw-count .h5ad or 10x Matrix Market input, with stable integrated AnnData export for downstream latent analysis.

rnaseq-de

658
from ClawBio/ClawBio

Differential expression analysis for bulk RNA-seq and pseudo-bulk count matrices with QC, PCA, and contrast testing.