biostatistics

Performs biostatistical analyses specialized for clinical and biomedical research including survival analysis, Kaplan-Meier estimation, Cox proportional hazards regression, longitudinal data modeling, and diagnostic test evaluation; trigger when users discuss clinical outcomes, survival curves, or biomedical study statistics.

564 stars

Best use case

biostatistics is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using biostatistics should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/biostatistics/SKILL.md --create-dirs "https://raw.githubusercontent.com/beita6969/ScienceClaw/main/skills/biostatistics/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/biostatistics/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How biostatistics Compares

| Feature / Agent | biostatistics | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Performs biostatistical analyses specialized for clinical and biomedical research including survival analysis, Kaplan-Meier estimation, Cox proportional hazards regression, longitudinal data modeling, and diagnostic test evaluation; trigger when users discuss clinical outcomes, survival curves, or biomedical study statistics.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## When to Trigger

Activate this skill when the user mentions:
- Survival analysis, time-to-event, censoring
- Kaplan-Meier curves, log-rank test, median survival
- Cox regression, proportional hazards, hazard ratio
- Longitudinal data, mixed-effects models, GEE
- Diagnostic accuracy, sensitivity, specificity, ROC/AUC
- Competing risks, Fine-Gray model, cumulative incidence
- Sample size for clinical endpoints, multiplicity adjustment
- Missing data in clinical studies, multiple imputation, MCAR/MAR/MNAR

## Step-by-Step Methodology

1. **Study design assessment** - Confirm study type (cohort, case-control, cross-sectional, RCT). Identify primary endpoint type (continuous, binary, time-to-event, count, ordinal). Determine if data is clustered or longitudinal.
2. **Survival analysis** - Define time origin, event definition, and censoring mechanism. Assess whether censoring is plausibly non-informative. Estimate survival curves with the Kaplan-Meier method. Compare groups with the log-rank test, or with weighted variants (Gehan-Breslow generalized Wilcoxon, Tarone-Ware) when hazards are non-proportional.
3. **Cox regression** - Check proportional hazards assumption (Schoenfeld residuals, log-log plots). If violated, use time-varying coefficients, stratified Cox, or restricted mean survival time (RMST). Report hazard ratios with 95% CIs. Handle multiple covariates with purposeful selection or penalized regression.
4. **Competing risks** - When multiple event types exist, use cumulative incidence functions rather than 1 - KM, which overestimates incidence when competing events are present. Apply the Fine-Gray subdistribution hazard model or cause-specific hazard models. Report cumulative incidence at clinically relevant timepoints.
5. **Longitudinal analysis** - For repeated measures: linear or generalized mixed-effects models (random intercepts/slopes). Choose appropriate correlation structure. Handle dropout with pattern mixture models or joint models for longitudinal and survival data.
6. **Diagnostic test evaluation** - Compute sensitivity, specificity, PPV, NPV at defined cutoffs. Generate ROC curve and compute AUC with DeLong confidence intervals. For biomarker discovery, apply cross-validation to avoid overoptimism.
7. **Missing data handling** - Classify missingness mechanism (MCAR, MAR, MNAR). For MAR: multiple imputation (m >= 20 imputations, Rubin's rules for pooling). Conduct sensitivity analysis under MNAR assumptions.
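
The Kaplan-Meier estimate in step 2 can be sketched in a few lines of plain Python. This is a minimal illustration of the product-limit formula, not a replacement for the packages listed under Key Databases and Tools; the function names and the toy data are hypothetical, and confidence intervals are left out.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # everyone who leaves the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Subjects censored at t count as at risk at t, then drop out.
        at_risk -= exits[t]
    return curve

def median_survival(curve):
    """Smallest event time with S(t) <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical toy data: six subjects, 0 marks a censored observation.
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
median = median_survival(curve)   # 5 for this toy data
```

Subjects censored at an event time are kept in the risk set for that time, which is the usual convention. If the curve never drops to 0.5, the median is not estimable and the sketch returns `None`.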

## Key Databases and Tools

- **R survival / survminer** - Survival analysis packages
- **SAS PROC PHREG / LIFETEST** - Clinical biostatistics standard
- **Stata stcox / stcurve** - Survival modeling
- **R mice / Amelia** - Multiple imputation
- **pROC / cutpointr** - ROC analysis
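
Whichever imputation tool is used, step 7 pools the per-imputation estimates with Rubin's rules. A minimal Python sketch of that pooling follows. The function name and the toy log hazard ratios are hypothetical, and only five imputations are shown for brevity even though step 7 recommends m >= 20.

```python
import math

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                               # pooled point estimate
    within = sum(variances) / m                              # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between                   # total variance
    return q_bar, math.sqrt(total)

# Hypothetical log hazard ratios and variances from five imputed datasets.
estimates = [0.40, 0.50, 0.45, 0.55, 0.35]
variances = [0.04, 0.04, 0.04, 0.04, 0.04]
pooled, se = pool_rubin(estimates, variances)
hazard_ratio = math.exp(pooled)
```

Pooling is done on the log scale and exponentiated afterwards. A confidence interval would additionally need the Rubin degrees of freedom for the t reference distribution, which this sketch omits.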

## Output Format

- Kaplan-Meier curves with number-at-risk table, median survival with 95% CI.
- Cox model results as a table: variable, HR, 95% CI, p-value, with PH assumption test.
- Cumulative incidence curves for competing risks with event-specific estimates.
- ROC curves with AUC, optimal cutpoint, and sensitivity/specificity at that point.
- Missing data report: pattern, mechanism assessment, imputation method, sensitivity results.
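
The diagnostic quantities named above can be computed directly from scores and labels. The following Python sketch is illustrative only: the helper names and data are hypothetical, the "optimal" cutpoint is taken to be the one maximizing Youden's J (an assumption, since the source does not say which criterion to use), and DeLong confidence intervals are not implemented.

```python
def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def confusion_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV and NPV, calling score >= cutoff positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }

def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Observed score that maximizes sensitivity + specificity - 1."""
    def j(cutoff):
        m = confusion_metrics(scores, labels, cutoff)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(scores)), key=j)

# Hypothetical biomarker scores; label 1 marks a diseased subject.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 0, 1]
area = auc(scores, labels)
best = youden_cutoff(scores, labels)
at_best = confusion_metrics(scores, labels, best)
```

A cutpoint chosen and evaluated on the same data is optimistic, which is why step 6 asks for cross-validation in biomarker discovery.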

## Quality Checklist

- [ ] Time origin and event definition clearly specified
- [ ] Censoring mechanism described and non-informative assumption justified
- [ ] Proportional hazards assumption tested and result reported
- [ ] Competing risks handled appropriately (not ignored)
- [ ] Multiple comparisons adjustment applied when needed
- [ ] Missing data mechanism assessed and appropriate method used
- [ ] Sample size adequate for number of covariates (events per variable, EPV, >= 10 for Cox as a rule of thumb)
- [ ] Effect estimates reported with confidence intervals, not just p-values
- [ ] Sensitivity analyses performed for key assumptions
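
Two of the checklist items lend themselves to a quick numeric check. The Python sketch below shows one possible multiplicity adjustment and the EPV calculation. The checklist does not name a specific adjustment method, so the Holm step-down procedure here is an assumption, and the function names and inputs are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted

def events_per_variable(n_events, n_covariates):
    """Observed events divided by the number of candidate covariates."""
    return n_events / n_covariates

# Hypothetical p-values for three secondary endpoints.
adjusted = holm_adjust([0.01, 0.04, 0.03])

# Hypothetical Cox model: 80 events and 10 candidate covariates.
epv = events_per_variable(80, 10)
epv_ok = epv >= 10   # False here, so the model would be flagged
```

EPV counts events, not subjects, so a large cohort with few events can still fail the check.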

Related Skills

xurl

564
from beita6969/ScienceClaw

A CLI tool for making authenticated requests to the X (Twitter) API. Use this skill when you need to post tweets, reply, quote, search, read posts, manage followers, send DMs, upload media, or interact with any X API v2 endpoint.

xlsx

564
from beita6969/ScienceClaw

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

writing

564
from beita6969/ScienceClaw

No description provided.

world-bank-data

564
from beita6969/ScienceClaw

World Bank Open Data API for development indicators. Use when: user asks about GDP, population, poverty, health, or education statistics by country. NOT for: real-time financial data or stock prices.

wikipedia-search

564
from beita6969/ScienceClaw

Search and fetch structured content from Wikipedia using the MediaWiki API for reliable, encyclopedic information.

wikidata-knowledge

564
from beita6969/ScienceClaw

Query Wikidata for structured knowledge using SPARQL and entity search. Use when: (1) finding structured facts about entities (people, places, organizations), (2) querying relationships between entities, (3) cross-referencing external identifiers (Wikipedia, VIAF, GND, ORCID), (4) building knowledge graphs from linked data. NOT for: full-text article content (use Wikipedia API), scientific literature (use semantic-scholar), geospatial data (use OpenStreetMap).

weather

564
from beita6969/ScienceClaw

Get current weather and forecasts via wttr.in or Open-Meteo. Use when: user asks about weather, temperature, or forecasts for any location. NOT for: historical weather data, severe weather alerts, or detailed meteorological analysis. No API key needed.

wacli

564
from beita6969/ScienceClaw

Send WhatsApp messages to other people or search/sync WhatsApp history via the wacli CLI (not for normal user chats).

voice-call

564
from beita6969/ScienceClaw

Start voice calls via the OpenClaw voice-call plugin.

visualization

564
from beita6969/ScienceClaw

Create publication-quality scientific figures and plots using Python (matplotlib, seaborn, plotly). Supports bar charts, scatter plots, heatmaps, box plots, violin plots, survival curves, network graphs, and more. Use when user asks to plot data, create figures, make charts, visualize results, or generate publication-ready graphics. Triggers on "plot", "chart", "figure", "graph", "visualize", "heatmap", "scatter plot", "bar chart", "histogram".

video-frames

564
from beita6969/ScienceClaw

Extract frames or short clips from videos using ffmpeg.

venue-templates

564
from beita6969/ScienceClaw

Access comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). Use this skill when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals that need venue-specific formatting requirements and templates.