literature-filtering
Filter literature by publication year, journal, and predefined screening rules to produce inclusion/exclusion lists; use when conducting preliminary screening or systematic review screening to narrow the literature scope.
Best use case
literature-filtering is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using literature-filtering should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/literature-filtering/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How literature-filtering Compares
| Feature / Agent | literature-filtering | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Filter literature by publication year, journal, and predefined screening rules to produce inclusion/exclusion lists; use when conducting preliminary screening or systematic review screening to narrow the literature scope.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)
## When to Use
- You need to quickly narrow a large bibliography by **publication year range** (e.g., 2015–2024).
- You must restrict results to a **target journal set** (e.g., a whitelist/blacklist of journals).
- You are running **preliminary screening** before full-text review and need traceable inclusion/exclusion decisions.
- You are conducting **systematic review screening** and must record consistent reasons for exclusion.
- You need standardized outputs (lists + logs) for collaboration, auditing, or downstream analysis.
## Key Features
- Rule-based filtering by **year**, **journal**, and **literature type/criteria**.
- **Journal name normalization** to match abbreviations and full names consistently.
- Structured recording of **exclusion reasons** for transparency and reproducibility.
- Support for **borderline/controversial item review** to improve consistency.
- Standardized outputs: **inclusion list**, **exclusion list**, and **screening statistics/summary**.
## Dependencies
- None (documentation-driven workflow).
- Optional template file:
  - `assets/screening_log_template.csv`
## Example Usage
The following example is a complete, runnable Python script that:
1) normalizes journal names, 2) filters by year and journal whitelist, 3) applies simple inclusion/exclusion rules, and 4) outputs inclusion/exclusion CSV files plus a screening log.
```python
#!/usr/bin/env python3
import csv
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

# ----------------------------
# Configuration (edit as needed)
# ----------------------------
YEAR_MIN = 2018
YEAR_MAX = 2024

# Journal whitelist after normalization
JOURNAL_WHITELIST = {
    "journal of finance",
    "journal of financial economics",
    "review of financial studies",
}

# Abbreviation/full-name mapping (extend as needed)
JOURNAL_ALIASES = {
    "j. finan.": "journal of finance",
    "j finan": "journal of finance",
    "jfe": "journal of financial economics",
    "rev. financ. stud.": "review of financial studies",
    "rfs": "review of financial studies",
}

# Simple keyword-based screening rules (example)
INCLUDE_KEYWORDS = {"asset pricing", "corporate finance", "risk premium"}
EXCLUDE_KEYWORDS = {"editorial", "book review", "erratum"}


# ----------------------------
# Data model
# ----------------------------
@dataclass
class Record:
    id: str
    title: str
    year: int
    journal: str
    abstract: str


# ----------------------------
# Helpers
# ----------------------------
def normalize_journal(name: str, aliases: Dict[str, str]) -> str:
    """
    Normalize journal names:
    - lowercase
    - strip punctuation
    - collapse whitespace
    - map abbreviations to canonical full names
    """
    if not name:
        return ""
    raw = name.strip().lower()
    raw = re.sub(r"[^\w\s\.]", " ", raw)  # keep dots for alias keys like "j. finan."
    raw = re.sub(r"\s+", " ", raw).strip()

    # Try alias mapping on the dot-preserved version
    if raw in aliases:
        return aliases[raw]

    # Also try a dot-stripped variant
    nodot = raw.replace(".", "")
    if nodot in aliases:
        return aliases[nodot]

    # Canonicalize by removing dots and extra spaces
    canonical = re.sub(r"[\.]", "", raw)
    canonical = re.sub(r"\s+", " ", canonical).strip()
    return canonical


def contains_any(text: str, keywords: set) -> bool:
    """Case-insensitive substring match against any keyword."""
    t = (text or "").lower()
    return any(k in t for k in keywords)


def parse_year(value: str) -> int:
    """Parse a year cell; blank or non-numeric values become 0 and fail the year filter."""
    try:
        return int((value or "").strip())
    except ValueError:
        return 0


def screen_record(r: Record) -> Tuple[bool, str]:
    """
    Returns (included, reason).
    Reasons are designed to be human-auditable.
    """
    if r.year < YEAR_MIN or r.year > YEAR_MAX:
        return False, f"Excluded: year out of range ({r.year})"

    norm_journal = normalize_journal(r.journal, JOURNAL_ALIASES)
    if norm_journal not in JOURNAL_WHITELIST:
        return False, f"Excluded: journal not in whitelist ({norm_journal})"

    text = f"{r.title}\n{r.abstract}"
    if contains_any(text, EXCLUDE_KEYWORDS):
        return False, "Excluded: matches exclusion keywords"
    if not contains_any(text, INCLUDE_KEYWORDS):
        return False, "Excluded: does not match inclusion keywords"
    return True, "Included: meets all criteria"


# ----------------------------
# I/O
# ----------------------------
def read_input_csv(path: str) -> List[Record]:
    """
    Expected columns: id,title,year,journal,abstract
    """
    out = []
    with open(path, "r", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            # `or ""` guards against None, which DictReader yields for short rows
            out.append(
                Record(
                    id=(row.get("id") or "").strip(),
                    title=(row.get("title") or "").strip(),
                    year=parse_year(row.get("year")),
                    journal=(row.get("journal") or "").strip(),
                    abstract=(row.get("abstract") or "").strip(),
                )
            )
    return out


def write_csv(path: str, rows: List[Dict[str, str]], fieldnames: List[str]) -> None:
    with open(path, "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=fieldnames)
        w.writeheader()
        w.writerows(rows)


def main():
    input_path = "input_literature.csv"
    records = read_input_csv(input_path)

    included, excluded, log = [], [], []
    for r in records:
        norm_journal = normalize_journal(r.journal, JOURNAL_ALIASES)
        ok, reason = screen_record(r)

        # Every record gets a log row, whether included or excluded
        log.append({
            "id": r.id,
            "title": r.title,
            "year": str(r.year),
            "journal_raw": r.journal,
            "journal_normalized": norm_journal,
            "decision": "include" if ok else "exclude",
            "reason": reason,
        })

        base = {
            "id": r.id,
            "title": r.title,
            "year": str(r.year),
            "journal": norm_journal,
        }
        (included if ok else excluded).append(base)

    write_csv("included.csv", included, ["id", "title", "year", "journal"])
    write_csv("excluded.csv", excluded, ["id", "title", "year", "journal"])
    write_csv(
        "screening_log.csv",
        log,
        ["id", "title", "year", "journal_raw", "journal_normalized", "decision", "reason"],
    )

    # Simple screening statistics
    stats = {
        "total": len(records),
        "included": len(included),
        "excluded": len(excluded),
    }
    print("Screening complete:", stats)
    print("Outputs: included.csv, excluded.csv, screening_log.csv")


if __name__ == "__main__":
    main()
```
Minimal input file example (`input_literature.csv`):
```csv
id,title,year,journal,abstract
1,Asset Pricing with Risk Premiums,2020,J. Finan.,We study asset pricing and the risk premium...
2,An Editorial Note,2021,Journal of Finance,This editorial summarizes...
3,Corporate Finance Evidence,2017,JFE,Empirical corporate finance results...
```
## Implementation Details
### 1. Rule Setting
- **Year rules**: define an inclusive range `[YEAR_MIN, YEAR_MAX]`.
- **Journal rules**:
  - Use a **whitelist** (or blacklist) of canonical journal names.
  - Apply **normalization** before matching to avoid false mismatches.
- **Screening criteria**:
  - Define explicit inclusion/exclusion criteria (e.g., topic, study type, population, method).
  - Ensure each exclusion has a **single primary reason** (or a controlled multi-reason scheme).
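One way to keep these rules together is a single immutable rule object plus a fixed vocabulary of reason codes. The sketch below is illustrative only: `RuleSet`, `Reason`, and `failures` are hypothetical names that are not part of the skill, and treating an empty whitelist as "no restriction" is an assumption. It returns every failed rule in a fixed order, so the first entry can serve as the single primary reason while the full list supports a multi-reason scheme.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List


class Reason(Enum):
    """Controlled vocabulary of exclusion reasons (illustrative codes)."""
    YEAR_OUT_OF_RANGE = "year out of range"
    JOURNAL_NOT_ALLOWED = "journal not allowed"


@dataclass(frozen=True)
class RuleSet:
    """Hypothetical container for the year and journal rules."""
    year_min: int
    year_max: int
    journal_whitelist: FrozenSet[str] = frozenset()
    journal_blacklist: FrozenSet[str] = frozenset()

    def failures(self, year: int, canonical_journal: str) -> List[Reason]:
        """All failed rules, in a fixed order (year first, then journal)."""
        failed = []
        if not (self.year_min <= year <= self.year_max):  # inclusive on both ends
            failed.append(Reason.YEAR_OUT_OF_RANGE)
        blocked = canonical_journal in self.journal_blacklist
        # Assumption: an empty whitelist means "no whitelist restriction"
        unlisted = bool(self.journal_whitelist) and canonical_journal not in self.journal_whitelist
        if blocked or unlisted:
            failed.append(Reason.JOURNAL_NOT_ALLOWED)
        return failed


rules = RuleSet(2018, 2024, journal_whitelist=frozenset({"journal of finance"}))
all_reasons = rules.failures(2017, "nature")
primary = all_reasons[0] if all_reasons else None
print([r.name for r in all_reasons])  # ['YEAR_OUT_OF_RANGE', 'JOURNAL_NOT_ALLOWED']
print(primary.name)                   # YEAR_OUT_OF_RANGE
```

Because the order of checks is fixed inside `failures`, two reviewers applying the same `RuleSet` should report the same primary reason for the same record.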
### 2. Journal Name Normalization
Recommended normalization steps (in order):
1. Convert to lowercase.
2. Remove/standardize punctuation and collapse whitespace.
3. Apply **abbreviation/full-name mapping** (e.g., `J. Finan.` → `Journal of Finance`).
4. Output a canonical form used for matching and reporting.
Key parameters:
- `JOURNAL_ALIASES`: dictionary for abbreviation/full-name mapping.
- Normalization policy choices:
  - Case sensitivity (typically disabled by lowercasing).
  - Punctuation handling (strip most punctuation; optionally preserve dots for alias keys).
  - Whitespace collapsing.
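The punctuation policy changes how alias keys must be written. As a contrast to preserving dots, the sketch below strips every punctuation mark before the lookup, so alias keys are stored dot-free and a single lookup suffices. `normalize_strict` and `ALIASES_NODOT` are hypothetical names used only for this illustration.

```python
import re
from typing import Dict

# Hypothetical alias map whose keys are already lowercase and dot-free
ALIASES_NODOT: Dict[str, str] = {
    "j finan": "journal of finance",
    "jfe": "journal of financial economics",
    "rev financ stud": "review of financial studies",
}


def normalize_strict(name: str, aliases: Dict[str, str]) -> str:
    """Lowercase, strip all punctuation, collapse whitespace, then map aliases."""
    text = (name or "").lower()                # 1. lowercase
    text = re.sub(r"[^\w\s]", " ", text)       # 2. punctuation (dots included) becomes a space
    text = re.sub(r"\s+", " ", text).strip()   #    then collapse whitespace
    return aliases.get(text, text)             # 3-4. alias lookup, else keep the cleaned form


print(normalize_strict("J. Finan.", ALIASES_NODOT))               # journal of finance
print(normalize_strict("  Rev.  Financ. Stud. ", ALIASES_NODOT))  # review of financial studies
print(normalize_strict("J.Finan.", ALIASES_NODOT))                # journal of finance
print(normalize_strict("Journal of Finance", ALIASES_NODOT))      # journal of finance
```

Replacing dots with spaces, rather than deleting them, lets an unspaced abbreviation such as `J.Finan.` resolve to the same key as `J. Finan.`. The trade-off is that alias keys can no longer contain punctuation at all.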
### 3. Execution of Screening
- Apply filters in a stable order to keep decisions consistent and auditable:
  1. Year range
  2. Journal match (after normalization)
  3. Inclusion/exclusion criteria
- Record a **decision** and **reason** for every record in a screening log.
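A quick completeness check can confirm that the log really covers every record. The following is a minimal sketch, assuming log rows are dictionaries with `id`, `decision`, and `reason` keys as in the example script; `check_log_complete` is a hypothetical helper, not part of the skill.

```python
from collections import Counter
from typing import Dict, Iterable, List


def check_log_complete(input_ids: Iterable[str],
                       log_rows: Iterable[Dict[str, str]]) -> List[str]:
    """Return a list of problems; an empty list means one valid entry per record."""
    rows = list(log_rows)
    logged = Counter(row["id"] for row in rows)
    problems = []

    # Every input record must appear exactly once in the log
    for rid in input_ids:
        if logged[rid] == 0:
            problems.append(f"{rid}: no log entry")
        elif logged[rid] > 1:
            problems.append(f"{rid}: {logged[rid]} log entries")

    # Every log row must carry a valid decision and a non-empty reason
    for row in rows:
        if row.get("decision") not in {"include", "exclude"}:
            problems.append(f"{row['id']}: invalid decision {row.get('decision')!r}")
        elif not (row.get("reason") or "").strip():
            problems.append(f"{row['id']}: missing reason")
    return problems


log = [
    {"id": "1", "decision": "include", "reason": "Included: meets all criteria"},
    {"id": "2", "decision": "exclude", "reason": ""},
]
print(check_log_complete(["1", "2", "3"], log))
# ['3: no log entry', '2: missing reason']
```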
### 4. Review and Consistency
- Flag borderline items (e.g., unclear abstracts, ambiguous journal names) for manual review.
- Keep a shared, versioned rule set (year range, journal list, alias map, criteria) to ensure consistent application across reviewers.
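Both points can be supported with small helpers. The sketch below is one possible approach under stated assumptions: `borderline_flags`, `rule_set_fingerprint`, the 50-character abstract threshold, and the layout of the rule-set dictionary are all hypothetical choices made for illustration.

```python
import hashlib
import json
from typing import Dict, List, Set


def borderline_flags(abstract: str, journal_normalized: str,
                     known_journals: Set[str],
                     min_abstract_chars: int = 50) -> List[str]:
    """Return reasons to send a record to manual review (empty list = no flag)."""
    flags = []
    if len((abstract or "").strip()) < min_abstract_chars:
        flags.append("abstract missing or too short")
    # An unknown name may be an unmapped abbreviation of a listed journal
    if journal_normalized not in known_journals:
        flags.append("journal name not recognized")
    return flags


def rule_set_fingerprint(rule_set: Dict) -> str:
    """Short, stable hash of a JSON-serializable rule set (use lists, not sets)."""
    payload = json.dumps(rule_set, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


known = {"journal of finance", "journal of financial economics"}
print(borderline_flags("", "jfinan", known))
# ['abstract missing or too short', 'journal name not recognized']

rule_set = {
    "year_min": 2018,
    "year_max": 2024,
    "journal_whitelist": sorted(known),
}
print(rule_set_fingerprint(rule_set))  # 12 hex characters, identical for identical rules
```

Writing the fingerprint into the screening log would let reviewers confirm that their decisions were made under the same version of the rules.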
### 5. Output Organization
Produce at minimum:
- `included.csv`: records that pass all rules.
- `excluded.csv`: records that fail at least one rule.
- `screening_log.csv`: full trace with normalized journal and exclusion reason.
- Optional: screening statistics and a reason summary (counts by reason).
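The optional reason summary can be derived directly from the screening log rows. The sketch below is a minimal version, assuming rows shaped like those in `screening_log.csv`; `reason_category` and `summarize` are hypothetical helpers, and the fourth sample row is invented for illustration. Reasons that embed a value in a trailing parenthetical, such as `(2017)`, are grouped by dropping that parenthetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def reason_category(reason: str) -> str:
    """Drop a trailing parenthetical such as '(2017)' so similar reasons group together."""
    return re.sub(r"\s*\([^()]*\)\s*$", "", reason)


def summarize(log_rows: Iterable[Dict[str, str]]) -> Dict[str, Counter]:
    """Count decisions, and count exclusions by reason category."""
    rows = list(log_rows)
    return {
        "by_decision": Counter(r["decision"] for r in rows),
        "by_reason": Counter(
            reason_category(r["reason"]) for r in rows if r["decision"] == "exclude"
        ),
    }


sample_log = [
    {"id": "1", "decision": "include", "reason": "Included: meets all criteria"},
    {"id": "2", "decision": "exclude", "reason": "Excluded: matches exclusion keywords"},
    {"id": "3", "decision": "exclude", "reason": "Excluded: year out of range (2017)"},
    {"id": "4", "decision": "exclude", "reason": "Excluded: year out of range (2012)"},
]
summary = summarize(sample_log)
print(summary["by_decision"])
# Counter({'exclude': 3, 'include': 1})
print(summary["by_reason"].most_common())
# [('Excluded: year out of range', 2), ('Excluded: matches exclusion keywords', 1)]
```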
Reference formats and checkpoints can be aligned with `references/guide.md` if available.
Related Skills
literatureimages-interpretation
Interpret figures in academic papers and their captions when the input is a PDF-to-Markdown document with page markers and image links, producing a structured Markdown report for extracting variables, trends, and conclusions.
literature-statistics
Generate statistics for publication-year and journal distributions from local references or PDFs; use when you need standardized Year/Journal tables and a summary without any network access.
literature-management
Import local literature into a managed library; trigger when you need offline deduplication, tagging, and a searchable index.
literature-extensive-read
Rapidly skim and summarize academic papers (default: PDF-to-Markdown full text with `## Page XX` pagination and image references) and output a structured extensive-reading summary in Markdown when you need to quickly understand research questions, methods, key results, conclusions, and decide whether intensive reading is worthwhile.
literature-experiment-extract
Extract experimental models, experimental methods, and biomarker information from paper Markdown (typically produced by PDF-to-Markdown tools) when a user provides paper Markdown and needs a structured, evidence-backed summary (1 Markdown + 3 CSVs).
literature-close-read
Produce a structured close-reading report from a paper's full PDF-to-Markdown text (with `## Page XX` pagination and image references) when you need to systematically extract background, research questions, methods, results, limitations, and reproducible experimental details.
multi-database-literature-collector
Collects candidate biomedical literature across multiple databases, adapts search logic by database, preserves source metadata, and organizes results into a structured, screening-ready candidate pool. Always use this skill when a user wants cross-database literature collection, search strategy construction, candidate paper aggregation, or first-pass evidence organization before deduplication, screening, layered reading, or review planning. Requires real and verifiable literature records only. Every formal literature item must include a real link and DOI when available; never fabricate citations, titles, authors, years, journals, abstracts, PMIDs, or DOIs. If a DOI is unavailable or cannot be verified, state that explicitly rather than inventing one.
medical-research-literature-reader-pro
A medical-research-native literature reading skill for users with clinical, bioinformatics, translational, and basic experimental backgrounds. Use this skill whenever a user wants to read, analyze, critique, or interpret a medical or scientific paper — whether they provide a PDF, abstract, DOI, PMID, or just a title. Triggers include requests like "analyze this paper", "critique this study", "is this a strong paper?", "give me similar studies", "prepare me for journal club", "help me understand this bioinformatics paper", "what are the weaknesses here?", or "turn this into a mind map". Also activate for any downstream deliverables such as journal club kits, comparison tables, PI decision briefs, replication starters, or follow-up experiment designs. Do NOT treat as a generic summarizer — this skill performs structured evidence-type classification, track-specific critical appraisal, interpretation-boundary judgment, and research-grade follow-up generation.
skill-auditor
A comprehensive auditor for any agent skill — including Manus, OpenClaw/ClawHub, Claude, LobeHub, or custom SKILL.md-based skills. Use this skill whenever a user wants to evaluate, audit, review, score, or quality-check an agent skill before publishing, updating, or deploying. Covers two hard veto gates (structural redlines + research integrity redlines), static quality scoring across 25 criteria (ISO 25010 + OpenSSF + Agent), dynamic test input generation, multi-mode execution testing, multi-layer output evaluation with five specialized category rubrics (Evidence Insight / Protocol Design / Data Analysis / Academic Writing / Other), a Research Veto that applies to all four research categories, human eval viewer generation, actionable P0/P1/P2 optimization recommendations, and automatic skill improvement that outputs a polished, production-ready SKILL.md. Also use whenever a user says "audit my skill", "evaluate my skill", "improve my skill", or wants a corrected version after evaluation.
two-sample-mr-research-planner
Generates complete two-sample Mendelian randomization (MR) research designs from a user-provided research direction. Use when users want to design, plan, or build a study using two-sample MR to test causal relationships. Triggers: "design a two-sample MR study", "build a publishable MR paper", "test whether this biomarker causally affects this disease", "generate Lite/Standard/Advanced MR plans", "screen multiple exposures with MR", "bidirectional MR design", "causal inference using GWAS summary statistics", or "I want to study X and Y using MR". Always outputs four workload configurations (Lite / Standard / Advanced / Publication+) with a recommended primary plan, step-by-step workflow, figure plan, validation strategy, minimal executable version, and publication upgrade path.
research-proposal-generator
Generates a comprehensive research proposal design based on input literature, including hypothesis, mechanism verification, and budget. Use when the user wants to design a research project from a paper.
research-grants
Write competitive research proposals for NSF, NIH, DOE, DARPA, and Taiwan's NSTC when you need agency-compliant narratives, budgets, and review-criteria alignment for a specific solicitation/FOA/BAA.