clinical-study-info-extractor
Batch extracts and verifies structured information (PMID, title, abstract, methodology, results, etc.) from clinical research literature using PMIDs. Use when the user wants to extract details from specific PMIDs.
Best use case
clinical-study-info-extractor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using clinical-study-info-extractor should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/clinical-study-info-extractor/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How clinical-study-info-extractor Compares
| Feature / Agent | clinical-study-info-extractor | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Batch extracts and verifies structured information (PMID, title, abstract, methodology, results, etc.) from clinical research literature using PMIDs. Use when the user wants to extract details from specific PMIDs.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)
# Clinical Study Info Extractor
This skill extracts structured information from clinical study literature based on provided PMIDs. It performs a search, parses the results, and uses LLM extraction with strict quality rules to produce a consolidated Markdown table.
## When to Use
- Use this skill when you need to batch-extract and verify structured information (PMID, title, abstract, methodology, results, etc.) from clinical research literature for specific PMIDs in a reproducible workflow.
- Use this skill when an evidence-insight task needs a packaged method instead of ad hoc freeform output.
- Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
- Use this skill when `scripts/utils.py` is the most direct path to complete the request.
- Use this skill when you need the `clinical-study-info-extractor` package behavior rather than a generic answer.
## Key Features
- Scope-focused workflow: batch extraction and verification of structured information (PMID, title, abstract, methodology, results, etc.) from clinical research literature, keyed by PMID.
- Packaged executable path(s): `scripts/utils.py`.
- Reference material available in `references/` for task-specific guidance.
- Structured execution path designed to keep outputs consistent and reviewable.
## Dependencies
- `Python`: `3.10+`. Repository baseline for current packaged skills.
- `Third-party packages`: `not explicitly version-pinned in this skill package`. Add pinned versions if this skill needs stricter environment control.
## Example Usage
See `## Usage` below for related details.
```bash
cd "20260316/scientific-skills/Evidence Insight/clinical-study-info-extractor"
python -m py_compile scripts/utils.py
python scripts/utils.py --help
```
Example run plan:
1. Confirm the user input, output path, and any required config values.
2. Edit the in-file `CONFIG` block or documented parameters if the script uses fixed settings.
3. Run `python scripts/utils.py` with the validated inputs.
4. Review the generated output and return the final artifact with any assumptions called out.
## Implementation Details
See `## Workflow` below for related details.
- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface: `scripts/utils.py`.
- Reference guidance: `references/` contains supporting rules, prompts, or checklists.
- Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
## Workflow
1. **Input Normalization**: Splits and cleans the input string of PMIDs.
2. **Literature Search**: Queries the PubMed API directly to fetch document details.
3. **Information Extraction**: Iterates through documents to extract fields (Title, Year, Journal, Abstract, DOI, Type, Population, Sample Size, Intervention, Results, Conclusion).
4. **Verification**: Enforces quality rules (e.g., sample size only for research articles).
5. **Output Formatting**: Aggregates results into a Chinese Markdown table.
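Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/utils.py`: the function names `normalize_pmids_sketch` and `build_efetch_url` are hypothetical, the accepted separators are a guess, and the NCBI E-utilities `efetch` endpoint is assumed to be the "PubMed API" the workflow refers to. The sketch only builds the request URL and makes no network call.

```python
import re
from urllib.parse import urlencode

# Assumption: NCBI E-utilities efetch is the PubMed endpoint being queried;
# the request made by the packaged script may differ.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def normalize_pmids_sketch(raw: str) -> list[str]:
    """Split a free-form string into unique, digit-only PMIDs, keeping order."""
    # Split on whitespace plus common ASCII and full-width separators.
    tokens = re.split(r"[\s,;，；、]+", raw.strip())
    seen: set[str] = set()
    pmids: list[str] = []
    for token in tokens:
        # Keep only plain ASCII digit strings and drop repeats.
        if token.isascii() and token.isdigit() and token not in seen:
            seen.add(token)
            pmids.append(token)
    return pmids


def build_efetch_url(pmids: list[str]) -> str:
    """Build an efetch URL that requests the given PMIDs as XML."""
    query = urlencode({"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"})
    return f"{EFETCH_URL}?{query}"


if __name__ == "__main__":
    ids = normalize_pmids_sketch("12345678, 23456789;12345678\n34567890")
    print(ids)
    print(build_efetch_url(ids))
```

Deduplicating while preserving input order keeps the final table rows in the order the user supplied the PMIDs, which makes the output easier to review against the request.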
## Usage
When you have a list of PMIDs and need structured details:
1. **Normalize Input**:
   Use `scripts/utils.py` with `normalize_pmids` to parse the input string.
2. **Search & Process**:
   Use `scripts/utils.py` with `fetch_pubmed_data` to query PubMed and get a list of document JSON strings.
3. **Extract & Verify**:
   For each document, use the prompts defined in `references/extraction_rules.md` to extract and verify information.
   - Step 1: Extraction
   - Step 2: Verification
4. **Format Output**:
   Use `scripts/utils.py` with `format_table` to generate the final Markdown table.
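The verification and formatting steps can be illustrated with a small sketch. It is not the packaged `format_table` or the prompts in `references/extraction_rules.md`: `verify_record` and `format_table_sketch` are hypothetical names, the column set is a shortened subset, and the rules applied are the ones listed under Quality Rules. Two assumptions are worth flagging: "non-research" is read here as any article type other than `Research`, and English labels are used for readability even though the real output is in Chinese.

```python
# Allowed article types, taken from the Quality Rules section.
ALLOWED_TYPES = {"Research", "Meta-analysis", "Case Report", "Review"}
# Assumption: a shortened column set in English, for illustration only.
COLUMNS = ["PMID", "Title", "Type", "Sample Size", "Intervention"]


def verify_record(record: dict[str, str]) -> dict[str, str]:
    """Return a copy of the record with the listed quality rules applied."""
    checked = dict(record)
    article_type = checked.get("Type", "")
    if article_type not in ALLOWED_TYPES:
        raise ValueError(f"Unexpected article type: {article_type!r}")
    sample = checked.get("Sample Size", "").strip()
    # Sample size: numeric only, and empty unless the type is Research.
    if article_type != "Research" or not sample.isdigit():
        sample = ""
    checked["Sample Size"] = sample
    # Intervention: a single column, "None" when nothing is mentioned.
    checked["Intervention"] = checked.get("Intervention", "").strip() or "None"
    return checked


def format_table_sketch(records: list[dict[str, str]]) -> str:
    """Render verified records as one Markdown table."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "|" + "---|" * len(COLUMNS)
    rows = []
    for record in records:
        # Escape pipes so a cell value cannot break the table layout.
        cells = [str(record.get(col, "")).replace("|", "\\|") for col in COLUMNS]
        rows.append("| " + " | ".join(cells) + " |")
    return "\n".join([header, divider, *rows])


if __name__ == "__main__":
    raw = {"PMID": "1", "Title": "Example", "Type": "Review",
           "Sample Size": "120", "Intervention": ""}
    print(format_table_sketch([verify_record(raw)]))
```

Running verification before formatting means the table builder never has to know the rules; it only renders whatever cleaned values it receives.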
## Quality Rules
See `references/extraction_rules.md` for detailed extraction logic and constraints.
- **Article Type**: Must be one of Research, Meta-analysis, Case Report, Review.
- **Sample Size**: Numeric only, empty for non-research.
- **Intervention**: Single column, "None" if not mentioned.
- **Language**: All Chinese except Journal Name.
Related Skills
clinical-reports
Write comprehensive clinical reports (case reports, diagnostic reports, clinical trial reports, and patient documentation) when accuracy, regulatory compliance (HIPAA/FDA/ICH-GCP), and template-driven validation are required.
methodology-extractor
Batch extraction of experimental methods from multiple papers for protocol.
gene-info
Retrieves comprehensive gene information including PubMed publication counts, NCBI summaries, and Ensembl transcript data. Supports batch processing and file input. Invoke when the user asks for gene details, publication statistics, or needs to analyze a list of genes.
clinicaltrials-gov-parser
Monitor and summarize competitor clinical trial status changes from ClinicalTrials.gov.
clinicaltrials-db
Query the ClinicalTrials.gov API v2 to search for clinical trials, retrieve detailed study protocols, and analyze recruitment status; use when you need to find trials by condition/drug, export results, or verify study details by NCT ID.
study-design-scale-selector
Determines the appropriate Risk of Bias assessment scale for a medical study based on its design (RCT, Cohort, etc.), using PubMed metadata lookup or text analysis. Use when the user wants to know which quality assessment tool to use for a specific paper (given PMID or abstract).
preclinical-pkpd-analyst
Use preclinical pkpd analyst for data analysis workflows that need structured execution, explicit assumptions, and clear output boundaries.
outcome-extraction-for-clinical-trials
Clinical research outcome extraction for meta-analysis. Use when users need to extract outcome measures (binary, continuous, or survival data) from clinical research papers for systematic review and meta-analysis. Handles both database lookup by PMID and real-time LLM extraction.
diagnostic-study-quality-assessment-quadas-2
Analyzes clinical diagnostic accuracy studies for bias using the QUADAS-2 tool. Use when Claude needs to assess the quality, risk of bias, or applicability of diagnostic accuracy studies (e.g., "Assess this paper using QUADAS-2").
cohort-study-quality-assessment-nos
Evaluates the quality of cohort studies using the Newcastle-Ottawa Scale (NOS). Use when the user provides a cohort study article or text and needs a quality assessment report.
clinical-data-cleaner
Use when cleaning clinical trial data, preparing data for FDA/EMA submission, standardizing SDTM datasets, handling missing values in clinical studies, detecting outliers in lab results, or converting raw CRF data to CDISC format. Cleans and standardizes clinical trial data for regulatory compliance with audit trails.
baseline-extraction-for-clinical-trials
Extracts clinical trial baseline data (study, region, participants, etc.) from article text or PMID. Checks PubMed for metadata; always falls back to LLM extraction for full details.