outcome-extraction-for-clinical-trials

Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.

53 stars

by aipoch

Best use case

outcome-extraction-for-clinical-trials is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using outcome-extraction-for-clinical-trials should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/outcome-extraction-for-clinical-trials/SKILL.md --create-dirs "https://raw.githubusercontent.com/aipoch/medical-research-skills/main/scientific-skills/Data%20Analysis/outcome-extraction-for-clinical-trials/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/outcome-extraction-for-clinical-trials/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How outcome-extraction-for-clinical-trials Compares

Feature / Agent	outcome-extraction-for-clinical-trials	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

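What does this skill do?

It extracts outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, using either database lookup by PMID or real-time LLM extraction.
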
Where can I find the source code?

The source code is on GitHub at https://github.com/aipoch/medical-research-skills.

SKILL.md Source

> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)
# Clinical Outcome Extraction

Extract structured outcome data from clinical research papers for meta-analysis.

## When to Use

- Use this skill when you need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, via database lookup by PMID or real-time LLM extraction, in a reproducible workflow.
- Use this skill when a data analytics task needs a packaged method instead of ad-hoc freeform output.
- Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
- Use this skill when `scripts/extract_pdf.py` is the most direct path to complete the request.
- Use this skill when you need the packaged `outcome-extraction-for-clinical-trials` behavior rather than a generic answer.

## Key Features

- Scope-focused workflow for extracting outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis, with both database lookup by PMID and real-time LLM extraction.
- Packaged executable path(s): `scripts/extract_pdf.py`.
- Reference material available in `references/` for task-specific guidance.
- Structured execution path designed to keep outputs consistent and reviewable.

## Dependencies

- `Python`: `3.10+`. Repository baseline for current packaged skills.
- `Third-party packages`: `not explicitly version-pinned in this skill package`. Add pinned versions if this skill needs stricter environment control.

## Example Usage

```bash
cd "20260316/scientific-skills/Data Analytics/outcome-extraction-for-clinical-trials"
python -m py_compile scripts/extract_pdf.py
python scripts/extract_pdf.py --help
```

Example run plan:
1. Confirm the user input, output path, and any required config values.
2. Edit the in-file `CONFIG` block or documented parameters if the script uses fixed settings.
3. Run `python scripts/extract_pdf.py` with the validated inputs.
4. Review the generated output and return the final artifact with any assumptions called out.

## Implementation Details

See `## Workflow` below for related details.

- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface: `scripts/extract_pdf.py`.
- Reference guidance: `references/` contains supporting rules, prompts, or checklists.
- Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

## Workflow

1. **Input Processing**
   - User provides: full paper text + optional PMID
   - If PMID provided: query database first for existing results
   - If no PMID or no database match: proceed to LLM extraction

2. **Outcome Identification** (LLM)
   - Extract all outcome measures from the paper
   - Determine outcome types: binary, continuous, or survival
   - Identify measurement time points
   - Output JSON format with outcome classification

3. **Data Classification** (Code)
   - Separate outcomes into three categories:
     - `bi_outcomes`: Binary/dichotomous outcomes
     - `con_outcomes`: Continuous outcomes
     - `sur_outcomes`: Survival outcomes

4. **Data Extraction by Type**

### Binary Outcomes
Extract for each intervention group:
- Sample size (n)
- Number of events (event)
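
A minimal sketch of how one binary group record might be assembled, assuming the field names shown under `## Output Format`. The helper `binary_group` is hypothetical and is not taken from `scripts/extract_pdf.py`; the sanity check on the event count is likewise an added assumption.

```python
def binary_group(group_name: str, n: int, events: int) -> dict:
    """Build one group record for a binary outcome (hypothetical helper)."""
    # Assumed sanity check: events cannot exceed the group's sample size.
    if not 0 <= events <= n:
        raise ValueError(f"events ({events}) must be between 0 and n ({n})")
    return {
        "group_name": group_name,
        "sample_size": str(n),  # values are emitted as strings
        "outcome_type": "Binary",
        "data": [{"value_type": "Events", "value": str(events)}],
    }


record = binary_group("Treatment A", n=100, events=25)
print(record["data"])  # [{'value_type': 'Events', 'value': '25'}]
```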

### Continuous Outcomes
Extract for each intervention group:
- Sample size (n)
- Mean (mean)
- Standard deviation (sd)
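
For continuous outcomes the record carries two values per group. The sketch below assumes the `Mean` and `SD` labels from `## Output Format`; `continuous_group` is a hypothetical name, and rejecting a negative standard deviation is an added assumption rather than documented behavior.

```python
def continuous_group(group_name: str, n: int, mean: float, sd: float) -> dict:
    """Build one group record for a continuous outcome (hypothetical helper)."""
    # Assumed sanity check: a standard deviation is never negative.
    if sd < 0:
        raise ValueError(f"sd must be non-negative, got {sd}")
    return {
        "group_name": group_name,
        "sample_size": str(n),
        "outcome_type": "Continuous",
        "data": [
            {"value_type": "Mean", "value": str(mean)},
            {"value_type": "SD", "value": str(sd)},
        ],
    }


# Placeholder numbers, for illustration only.
record = continuous_group("Treatment A", n=100, mean=12.5, sd=3.1)
print([item["value"] for item in record["data"]])  # ['12.5', '3.1']
```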

### Survival Outcomes
Extract for each intervention group:
- Sample size (n)
- Hazard ratio (HR)
- 95% Lower CI
- 95% Upper CI
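
A survival record could be sketched the same way, using the `HR`, `95%Lower CI`, and `95%Upper CI` labels from `## Output Format`. The function name `survival_group` and the ordering check on the interval are assumptions made for illustration.

```python
def survival_group(
    group_name: str, n: int, hr: float, ci_lower: float, ci_upper: float
) -> dict:
    """Build one group record for a survival outcome (hypothetical helper)."""
    # Assumed sanity check: the point estimate lies inside a positive interval.
    if not 0 < ci_lower <= hr <= ci_upper:
        raise ValueError("expected 0 < ci_lower <= hr <= ci_upper")
    return {
        "group_name": group_name,
        "sample_size": str(n),
        "outcome_type": "Survival",
        "data": [
            {"value_type": "HR", "value": str(hr)},
            {"value_type": "95%Lower CI", "value": str(ci_lower)},
            {"value_type": "95%Upper CI", "value": str(ci_upper)},
        ],
    }


# Placeholder numbers, for illustration only.
record = survival_group("Treatment A", n=100, hr=0.72, ci_lower=0.55, ci_upper=0.94)
print(record["data"][0])  # {'value_type': 'HR', 'value': '0.72'}
```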

5. **Output Formatting**
   - Combine all extracted data
   - Ensure consistent JSON structure
   - Convert values to strings
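
Steps 1 and 3 of the workflow can be sketched roughly as follows. This is only an illustration under stated assumptions: `identify_outcomes`, `classify_outcomes`, `db_lookup`, and `llm_extract` are hypothetical names, the database and LLM are stand-in callables, and the intermediate record shape (a dict with `outcome_name` and `outcome_type`) is assumed rather than taken from the skill's code.

```python
from typing import Callable, Optional

# Assumed mapping from the outcome_type label to the bucket names in step 3.
TYPE_TO_BUCKET = {
    "binary": "bi_outcomes",
    "continuous": "con_outcomes",
    "survival": "sur_outcomes",
}


def identify_outcomes(
    paper_text: str,
    pmid: Optional[str],
    db_lookup: Callable[[str], Optional[list]],
    llm_extract: Callable[[str], list],
) -> list:
    """Step 1: use stored results for a PMID if any, else fall back to the LLM."""
    if pmid:
        stored = db_lookup(pmid)
        if stored:
            return stored
    return llm_extract(paper_text)


def classify_outcomes(outcomes: list) -> dict:
    """Step 3: split outcomes into bi_outcomes, con_outcomes, sur_outcomes."""
    buckets = {name: [] for name in TYPE_TO_BUCKET.values()}
    for outcome in outcomes:
        label = str(outcome.get("outcome_type", "")).strip().lower()
        bucket = TYPE_TO_BUCKET.get(label)
        if bucket is None:
            raise ValueError(f"unknown outcome_type: {outcome.get('outcome_type')!r}")
        buckets[bucket].append(outcome)
    return buckets


# Stand-ins for the real database and LLM; the PMID is a placeholder.
fake_db = {"00000000": [{"outcome_name": "PFS", "outcome_type": "Survival"}]}


def fake_llm(text: str) -> list:
    return [{"outcome_name": "Response rate", "outcome_type": "Binary"}]


cached = classify_outcomes(identify_outcomes("paper text", "00000000", fake_db.get, fake_llm))
fresh = classify_outcomes(identify_outcomes("paper text", None, fake_db.get, fake_llm))
print(len(cached["sur_outcomes"]), len(fresh["bi_outcomes"]))  # 1 1
```

Passing the lookup and extraction steps in as callables keeps the routing logic testable without a live database or model.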

## Output Format

```json
[
  {
    "outcome_name": "PFS",
    "detection_time_point": "12 months",
    "groups": [
      {
        "group_name": "Treatment A",
        "sample_size": "100",
        "outcome_type": "Binary|Continuous|Survival",
        "data": [
          {"value_type": "Events|Mean|SD|HR|95%Lower CI|95%Upper CI", "value": "25"}
        ]
      }
    ]
  }
]
```
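
For downstream meta-analysis it may help to flatten this nested structure into one row per group. The sketch below assumes the output follows the format above exactly; `flatten_outcomes` is a hypothetical helper, and the sample document reuses the placeholder values from the example with a concrete `outcome_type`.

```python
import json


def flatten_outcomes(output_json: str) -> list[dict]:
    """Turn the nested output into one flat row per outcome and group."""
    rows = []
    for outcome in json.loads(output_json):
        for group in outcome["groups"]:
            row = {
                "outcome_name": outcome["outcome_name"],
                "detection_time_point": outcome["detection_time_point"],
                "group_name": group["group_name"],
                "sample_size": group["sample_size"],
                "outcome_type": group["outcome_type"],
            }
            # Each data entry becomes its own column, keyed by value_type.
            for item in group["data"]:
                row[item["value_type"]] = item["value"]
            rows.append(row)
    return rows


sample = json.dumps([
    {
        "outcome_name": "PFS",
        "detection_time_point": "12 months",
        "groups": [
            {
                "group_name": "Treatment A",
                "sample_size": "100",
                "outcome_type": "Binary",
                "data": [{"value_type": "Events", "value": "25"}],
            }
        ],
    }
])

rows = flatten_outcomes(sample)
print(rows[0]["group_name"], rows[0]["Events"])  # Treatment A 25
```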

## ‼️‼️‼️See references (extraction-promots.md) for detailed JSON structures for each outcome type (binary, continuous, survival)‼️‼️‼️

## Requirements

- Extract from full text, not just abstract
- Consider ALL intervention groups in the paper
- Include ALL outcome measures of interest
- Report all data regardless of statistical significance
- Use specific group names (intervention names in English), not generic terms like "treatment group"
- Output in JSON format
- Output language: English for all field values
- If data not found: output an empty string `""`
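
Some of these requirements can be checked mechanically before the output is returned. The following is a sketch only: `to_field` and `check_group` are hypothetical helpers, and the set of generic names beyond "treatment group" is an assumption about what counts as generic.

```python
# Only "treatment group" is named in the requirements; the rest are assumed.
GENERIC_NAMES = {
    "treatment group",
    "control group",
    "intervention group",
    "experimental group",
}


def to_field(value) -> str:
    """Missing data becomes "", everything else is converted to a string."""
    return "" if value is None else str(value)


def check_group(group: dict) -> list[str]:
    """Return a list of requirement violations found in one group record."""
    problems = []
    if group.get("group_name", "").strip().lower() in GENERIC_NAMES:
        problems.append("generic group name")
    for item in group.get("data", []):
        if not isinstance(item.get("value"), str):
            problems.append(f"non-string value for {item.get('value_type')}")
    return problems


bad = {"group_name": "Treatment group", "data": [{"value_type": "Events", "value": 25}]}
print(check_group(bad))  # ['generic group name', 'non-string value for Events']
print(repr(to_field(None)), repr(to_field(25)))  # '' '25'
```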

Related Skills

clinical-reports

from aipoch/medical-research-skills

Write comprehensive clinical reports (case reports, diagnostic reports, clinical trial reports, and patient documentation) when accuracy, regulatory compliance (HIPAA/FDA/ICH-GCP), and template-driven validation are required.

clinicaltrials-gov-parser

from aipoch/medical-research-skills

Monitor and summarize competitor clinical trial status changes from ClinicalTrials.gov.

clinicaltrials-db

from aipoch/medical-research-skills

Query the ClinicalTrials.gov API v2 to search for clinical trials, retrieve detailed study protocols, and analyze recruitment status; use when you need to find trials by condition/drug, export results, or verify study details by NCT ID.

clinical-study-info-extractor

from aipoch/medical-research-skills

Batch extracts and verifies structured information (PMID, title, abstract, methodology, results, etc.) from clinical research literature using PMIDs. Use when the user wants to extract details from specific PMIDs.

preclinical-pkpd-analyst

from aipoch/medical-research-skills

Use preclinical pkpd analyst for data analysis workflows that need structured execution, explicit assumptions, and clear output boundaries.

clinical-data-cleaner

from aipoch/medical-research-skills

Use when cleaning clinical trial data, preparing data for FDA/EMA submission, standardizing SDTM datasets, handling missing values in clinical studies, detecting outliers in lab results, or converting raw CRF data to CDISC format. Cleans and standardizes clinical trial data for regulatory compliance with audit trails.

baseline-extraction-for-clinical-trials

from aipoch/medical-research-skills

Extracts clinical trial baseline data (study, region, participants, etc.) from article text or PMID. Checks PubMed for metadata; always falls back to LLM extraction for full details.

nhanes-clinical-retrospective-biomarker

from aipoch/medical-research-skills

Generates complete NHANES-style cross-sectional epidemiology + retrospective clinical validation research designs from a user-provided disease and biomarker direction. Always use this skill whenever a user wants to design, plan, or build a population-level biomarker association study using NHANES or similar survey datasets, especially when the article logic includes disease definition, biomarker formula derivation, multivariable logistic regression, restricted cubic spline analysis, subgroup stability testing, and a secondary hospital-based retrospective validation cohort. Covers five study patterns (cross-sectional association, dose-response / RCS, subgroup-stability, NHANES + retrospective validation, preliminary screening-performance) and always outputs four workload configs (Lite / Standard / Advanced / Publication+) with recommended primary plan, step-by-step workflow, figure plan, validation strategy, minimal executable version, publication upgrade path...

clinical-cohort-protocol-designer

from aipoch/medical-research-skills

Designs retrospective or prospective clinical cohort study protocols for biomedical and clinical research. Always use this skill when the user needs a cohort-based study plan rather than a general study idea, evidence summary, or mechanistic experiment design. Focus on cohort appropriateness, enrollment logic, baseline time-zero definition, follow-up structure, endpoint definition, variable collection, confounding control, and a coherent primary statistical analysis line. Do not invent data availability, follow-up completeness, outcome ascertainment quality, sample size adequacy, or causal interpretability.

unmet-clinical-need-extractor

from aipoch/medical-research-skills

Extracts concrete unmet clinical needs from guidelines, reviews, real-world studies, and clinical-practice evidence. Use this skill when a user wants to turn broad medical research value into specific clinical pain points such as weak early detection, poor risk stratification, treatment-response heterogeneity, monitoring gaps, diagnostic delay, undertreatment, overtreatment, or implementation failure. Always ground unmet-need claims in retrieved evidence and distinguish true care gaps from generic statements of importance.

clinical-question-clarifier

from aipoch/medical-research-skills

Clarifies a vague clinical or biomedical research idea into a structured, bounded, searchable, researchable, and testable question. Always use this skill whenever a user has an early-stage clinical or research thought, an over-broad topic, an ill-defined evidence question, or an unclear problem statement that must be translated into a question framing suitable for literature retrieval, evidence synthesis, gap analysis, study design, or downstream protocol planning. Never jump straight to answering the substantive medical question unless the user explicitly asks for that. Focus first on question framing, boundary setting, and downstream-ready formulation.

skill-auditor

from aipoch/medical-research-skills

A comprehensive auditor for any agent skill — including Manus, OpenClaw/ClawHub, Claude, LobeHub, or custom SKILL.md-based skills. Use this skill whenever a user wants to evaluate, audit, review, score, or quality-check an agent skill before publishing, updating, or deploying. Covers two hard veto gates (structural redlines + research integrity redlines), static quality scoring across 25 criteria (ISO 25010 + OpenSSF + Agent), dynamic test input generation, multi-mode execution testing, multi-layer output evaluation with five specialized category rubrics (Evidence Insight / Protocol Design / Data Analysis / Academic Writing / Other), a Research Veto that applies to all four research categories, human eval viewer generation, actionable P0/P1/P2 optimization recommendations, and automatic skill improvement that outputs a polished, production-ready SKILL.md. Also use whenever a user says "audit my skill", "evaluate my skill", "improve my skill", or wants a corrected version after evaluation.