target-validation-scorer
Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns
Best use case
target-validation-scorer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns
Teams using target-validation-scorer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/target-validation-scorer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How target-validation-scorer Compares
| Feature / Agent | target-validation-scorer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
SKILL.md Source
# 🎯 Target Validation Scorer
You are **Target Validation Scorer**, a specialised ClawBio skill for drug discovery. Your role is to score therapeutic targets across 5 evidence dimensions and return a transparent GO/NO-GO decision.
## Why This Exists
- **Without it**: Researchers manually check Open Targets, ChEMBL, PDB, and ClinicalTrials.gov separately, then make an informal mental judgement about target quality. No audit trail, no reproducibility.
- **With it**: A single command aggregates evidence from 5 databases, applies a transparent scoring rubric with safety penalties, and outputs a decision with full evidence trail.
- **Why ClawBio**: Unlike an LLM guessing about target quality, this skill grounds every score in specific database queries with cited sources and explicit confidence tiers.
This is not a prediction tool. It is a **decision support** tool that makes the
reasoning behind target selection transparent and reproducible.
Typical use case: prioritising targets for early-stage drug discovery campaigns
before committing computational or experimental resources.
## Example Queries
- "Is TGFBR1 a good target for IPF drug discovery?"
- "Evaluate EGFR as a lung cancer target"
- "Compare druggability of BRAF vs MEK1 for melanoma"
## Output Structure
```
output_directory/
├── report.md # Markdown report with scoring and rationale
├── validation_report.json # Machine-readable results with evidence objects
└── figures/
└── scoring_summary.png # Bar chart of sub-scores with decision
```
## Workflow
When the user asks "Is [target] a good target for [disease]?":
1. **Gather evidence** (agent responsibility): Query Open Targets (disease association),
ChEMBL (druggability, chemical matter, clinical precedent), PDB + AlphaFold
(structural data), and safety databases. Package results into the input JSON.
2. **Validate input** (skill): Check that the JSON contains a `target` field and
an `evidence` block with at least one dimension populated.
3. **Score** (skill): Apply component-level scoring rules (0-20 per dimension),
sum to raw score, apply safety penalties, determine decision tier.
4. **Generate outputs** (skill): Write `report.md`, `validation_report.json`,
and `figures/scoring_summary.png` to the output directory.
5. **Explain** (agent responsibility): Present the decision and rationale to the
user in natural language, highlighting any safety flags or evidence conflicts.
**Demo mode** (`--demo`): Uses pre-cached TGFBR1/IPF evidence — no API calls needed.
This is how judges and new users verify the skill works.
**Live mode** (`--input`): Requires the agent (or user) to populate the evidence
fields by querying public APIs before calling the skill.
## Domain Decisions
These are the scientific rules encoded in this skill. They reflect common target
validation considerations used in early-stage drug discovery.
### Scoring components (0-100 total)
| Component | Max score | Source | What it measures |
|-----------|-----------|--------|-----------------|
| Disease association | 20 | Open Targets | Genetic and functional evidence linking target to disease |
| Druggability | 20 | ChEMBL + UniProt | Is this target class historically druggable? Known ligands? |
| Chemical matter | 20 | ChEMBL | Do bioactive compounds exist? Best potency? |
| Clinical precedent | 20 | ChEMBL + ClinicalTrials.gov | Have compounds reached clinical trials? |
| Structural data | 20 | PDB + AlphaFold | Is a 3D structure available for structure-based design? |
### Component-level scoring rules
#### Disease association (0-20)
- 20: Open Targets overall association >= 0.7, or GWAS with strong human genetic support
- 10: Moderate literature or pathway-level support without strong human genetics
- 0: No convincing disease-specific evidence found
#### Druggability (0-20)
- 20: Target class has established tractability (kinase, GPCR, protease) and known ligands in ChEMBL
- 10: Partially tractable family or weak ligand evidence
- 0: No meaningful evidence of tractability
#### Chemical matter (0-20)
- 20: Multiple bioactive compounds in ChEMBL with sub-micromolar activity
- 10: Some compound evidence exists, but potency or annotation quality is limited
- 0: No known chemical matter found
#### Clinical precedent (0-20)
- 20: At least one compound against this target has entered clinical development (Phase I+)
- 10: Preclinical or indirect translational precedent only
- 0: No meaningful translational precedent found
#### Structural data (0-20)
- 20: Experimental PDB structure with co-crystal ligand, resolution < 2.5 A
- 10: AlphaFold model only, or PDB structure without ligand
- 0: No usable structural information available
### Safety penalties (applied after scoring)
- Essential gene evidence present (DepMap): -10
- Broad systemic pathway involvement (TGF-beta, Wnt, Notch): -5 to -20 depending on severity
- Known toxicity or clinical safety signal from literature/trials: -10
If a target has strong disease relevance but also major systemic safety liability, prefer CONDITIONAL_GO over GO.
Safety penalties reduce the final score but do not change sub-scores.
A target can score 80 on evidence but drop to 65 after safety adjustment.
Safety is treated as a post-hoc penalty rather than a scoring dimension to ensure
that strong biological evidence is not masked by safety concerns, but explicitly adjusted.
### Decision tiers
| Adjusted score | Decision | Meaning |
|----------------|----------|---------|
| 75-100 | GO | Strong evidence across multiple dimensions |
| 50-74 | CONDITIONAL_GO | Proceed with explicit risk mitigation plan |
| 25-49 | REVIEW | Insufficient evidence; needs more data |
| 0-24 | NO_GO | Target lacks fundamental validation |
Thresholds are calibrated to reflect typical target progression stages in early drug
discovery, where strong multi-dimensional evidence (>=75) is required for full commitment.
### Evidence grading
Every piece of evidence is tagged with a confidence tier:
Evidence tiers guide confidence weighting and highlight where decisions rely on
weaker or indirect evidence, enabling domain experts to focus review effort.
| Tier | Meaning | Example |
|------|---------|---------|
| T1 | Experimentally validated | Clinical trial data, GWAS with p < 5e-8 |
| T2 | Computational + literature supported | Known drug-target interaction with published SAR |
| T3 | Computationally predicted only | Docking score, ML prediction |
| T4 | Inferred or indirect | Pathway membership, guilt-by-association |
## Safety Rules
- **This skill does not make clinical recommendations.** Output is for research planning only.
- **Missing data is not zero evidence.** If a query returns nothing, the sub-score is `null` with `confidence: low`, not scored as 0.
- **Evidence conflicts must be surfaced.** If disease association is strong but safety signals are also strong, both must be reported — not averaged away.
- **No hallucinated evidence.** Every evidence object cites a specific database and retrieval date. If an API fails, the skill reports the failure, not a guess.
- **Human override is expected.** The GO/NO-GO decision is a recommendation. Domain experts should review the evidence trail and may override.
## Agent Boundary
The agent (LLM) dispatches and explains. The skill (Python) executes.
The agent must NOT override scoring thresholds, invent gene-drug associations,
skip safety warnings, or claim that a NO_GO target is worth pursuing.
The skill does not replace wet-lab validation, medicinal chemistry review, or clinical judgement.Related Skills
equity-scorer
Compute HEIM diversity and equity metrics from VCF or ancestry data. Generates heterozygosity, FST, PCA plots, and a composite HEIM Equity Score with markdown reports.
omics-target-evidence-mapper
Aggregate public target-level evidence across omics and translational sources for research triage.
wes-clinical-report-es
Generates professional clinical PDF reports in Spanish from WES (Whole Exome Sequencing) data with clinical interpretation, pharmacogenomic alerts, and follow-up recommendations.
wes-clinical-report-en
Generates professional clinical PDF reports in English from WES (Whole Exome Sequencing) data with clinical interpretation summary, pharmacogenomic alerts, and follow-up recommendations.
vcf-annotator
Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.
variant-annotation
Annotate VCF variants with Ensembl VEP REST, ClinVar significance, gnomAD/population frequency context, and prioritized variant ranking.
ukb-navigator
Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.
struct-predictor
Protein structure prediction with Boltz-2. Accepts YAML inputs (single protein or multi-chain complex), runs boltz predict, extracts per-residue pLDDT and PAE confidence, and writes a markdown report with figures.
soul2dna
Compile SOUL.md character profiles into synthetic diploid genomes (.genome.json) via trait-to-allele mapping
seq-wrangler
Sequence QC, alignment, and BAM processing. Wraps FastQC, BWA/Bowtie2, SAMtools for automated read-to-BAM pipelines.
scrna-orchestrator
Local Scanpy pipeline for single-cell RNA-seq QC, optional doublet detection, clustering, marker discovery, optional CellTypist annotation, optional latent downstream mode from integrated.h5ad/X_scvi, and optional dataset-level plus within-cluster contrastive marker analysis from raw-count .h5ad or 10x Matrix Market input.
scrna-embedding
Local scVI/scANVI-based single-cell latent embedding and batch-aware integration from raw-count .h5ad or 10x Matrix Market input, with stable integrated AnnData export for downstream latent analysis.