target-validation-scorer

Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns

658 stars

Best use case

target-validation-scorer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using target-validation-scorer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/target-validation-scorer/SKILL.md --create-dirs "https://raw.githubusercontent.com/ClawBio/ClawBio/main/skills/target-validation-scorer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/target-validation-scorer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How target-validation-scorer Compares

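| Feature / Agent | target-validation-scorer | Standard Approach |
|-----------------|--------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |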

Frequently Asked Questions

What does this skill do?

Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns

Where can I find the source code?

The source code is on GitHub in the ClawBio/ClawBio repository, under skills/target-validation-scorer/.

SKILL.md Source

# 🎯 Target Validation Scorer

You are **Target Validation Scorer**, a specialised ClawBio skill for drug discovery. Your role is to score therapeutic targets across 5 evidence dimensions and return a transparent GO/NO-GO decision.

## Why This Exists

- **Without it**: Researchers manually check Open Targets, ChEMBL, PDB, and ClinicalTrials.gov separately, then make an informal mental judgement about target quality. No audit trail, no reproducibility.
- **With it**: A single command aggregates evidence from 5 databases, applies a transparent scoring rubric with safety penalties, and outputs a decision with full evidence trail.
- **Why ClawBio**: Unlike an LLM guessing about target quality, this skill grounds every score in specific database queries with cited sources and explicit confidence tiers.

This is not a prediction tool. It is a **decision support** tool that makes the
reasoning behind target selection transparent and reproducible.

Typical use case: prioritising targets for early-stage drug discovery campaigns
before committing computational or experimental resources.

## Example Queries

- "Is TGFBR1 a good target for IPF drug discovery?"
- "Evaluate EGFR as a lung cancer target"
- "Compare druggability of BRAF vs MEK1 for melanoma"

## Output Structure

```
output_directory/
├── report.md                      # Markdown report with scoring and rationale
├── validation_report.json         # Machine-readable results with evidence objects
└── figures/
    └── scoring_summary.png        # Bar chart of sub-scores with decision
```
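
The layout above could be produced along these lines. This is a minimal sketch, not the skill's actual writer: the function name `write_outputs` and the `target` and `decision` keys of the result dictionary are assumptions made for illustration, and the PNG figure is left out because it would need a plotting library.

```python
import json
from pathlib import Path


def write_outputs(output_dir, result):
    """Create the report layout shown above (figure rendering omitted)."""
    out = Path(output_dir)
    # figures/ is created even though this sketch does not draw the chart.
    (out / "figures").mkdir(parents=True, exist_ok=True)

    # Machine-readable results.
    (out / "validation_report.json").write_text(
        json.dumps(result, indent=2), encoding="utf-8"
    )

    # Human-readable report; 'target' and 'decision' are hypothetical keys.
    lines = [
        f"# Target validation report: {result['target']}",
        "",
        f"Decision: {result['decision']}",
    ]
    (out / "report.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```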

## Workflow

When the user asks "Is [target] a good target for [disease]?":

1. **Gather evidence** (agent responsibility): Query Open Targets (disease association),
   ChEMBL (druggability, chemical matter, clinical precedent), PDB + AlphaFold
   (structural data), and safety databases. Package results into the input JSON.
2. **Validate input** (skill): Check that the JSON contains a `target` field and
   an `evidence` block with at least one dimension populated.
3. **Score** (skill): Apply component-level scoring rules (0-20 per dimension),
   sum to raw score, apply safety penalties, determine decision tier.
4. **Generate outputs** (skill): Write `report.md`, `validation_report.json`,
   and `figures/scoring_summary.png` to the output directory.
5. **Explain** (agent responsibility): Present the decision and rationale to the
   user in natural language, highlighting any safety flags or evidence conflicts.
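
Step 2 can be sketched as a small validator. The dimension key names below are assumptions (the input JSON field names are not spelled out here), and `validate_input` is a hypothetical helper rather than the skill's own function.

```python
# Hypothetical dimension keys; the real input schema may name them differently.
DIMENSIONS = (
    "disease_association",
    "druggability",
    "chemical_matter",
    "clinical_precedent",
    "structural_data",
)


def validate_input(payload):
    """Return a list of problems; an empty list means the input is usable."""
    problems = []

    target = payload.get("target")
    if not isinstance(target, str) or not target.strip():
        problems.append("missing or empty 'target' field")

    evidence = payload.get("evidence")
    if not isinstance(evidence, dict):
        problems.append("missing 'evidence' block")
    elif not any(evidence.get(dim) for dim in DIMENSIONS):
        # At least one dimension must carry some evidence.
        problems.append("'evidence' block has no populated dimension")

    return problems
```

Returning every problem at once, rather than raising on the first, lets the agent report all gaps in the evidence package in a single pass.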

**Demo mode** (`--demo`): Uses pre-cached TGFBR1/IPF evidence — no API calls needed.
This is how judges and new users verify the skill works.

**Live mode** (`--input`): Requires the agent (or user) to populate the evidence
fields by querying public APIs before calling the skill.
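
A command-line front end for the two modes might look like the sketch below. Only `--demo` and `--input` come from this document; the `--output` flag, its default, and the choice to make the two modes mutually exclusive are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Score a therapeutic target and emit a GO/NO-GO decision."
    )
    # Assumption: exactly one of the two modes must be chosen.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument(
        "--demo",
        action="store_true",
        help="use the pre-cached TGFBR1/IPF evidence (no API calls)",
    )
    mode.add_argument(
        "--input",
        metavar="JSON",
        help="path to an evidence JSON file populated by the agent or user",
    )
    # Hypothetical flag: the document does not say how the output path is set.
    parser.add_argument("--output", default="output_directory")
    return parser
```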

## Domain Decisions

These are the scientific rules encoded in this skill. They reflect common target
validation considerations used in early-stage drug discovery.

### Scoring components (0-100 total)

| Component | Max score | Source | What it measures |
|-----------|-----------|--------|-----------------|
| Disease association | 20 | Open Targets | Genetic and functional evidence linking target to disease |
| Druggability | 20 | ChEMBL + UniProt | Is this target class historically druggable? Known ligands? |
| Chemical matter | 20 | ChEMBL | Do bioactive compounds exist? Best potency? |
| Clinical precedent | 20 | ChEMBL + ClinicalTrials.gov | Have compounds reached clinical trials? |
| Structural data | 20 | PDB + AlphaFold | Is a 3D structure available for structure-based design? |

### Component-level scoring rules

#### Disease association (0-20)

- 20: Open Targets overall association >= 0.7, or GWAS with strong human genetic support
- 10: Moderate literature or pathway-level support without strong human genetics
- 0: No convincing disease-specific evidence found

#### Druggability (0-20)

- 20: Target class has established tractability (kinase, GPCR, protease) and known ligands in ChEMBL
- 10: Partially tractable family or weak ligand evidence
- 0: No meaningful evidence of tractability

#### Chemical matter (0-20)

- 20: Multiple bioactive compounds in ChEMBL with sub-micromolar activity
- 10: Some compound evidence exists, but potency or annotation quality is limited
- 0: No known chemical matter found

#### Clinical precedent (0-20)

- 20: At least one compound against this target has entered clinical development (Phase I+)
- 10: Preclinical or indirect translational precedent only
- 0: No meaningful translational precedent found

#### Structural data (0-20)

- 20: Experimental PDB structure with co-crystal ligand, resolution < 2.5 Å
- 10: AlphaFold model only, or PDB structure without ligand
- 0: No usable structural information available
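
Two of the rubrics above translate fairly directly into code. The sketch below is illustrative only: the function and parameter names are hypothetical, the qualitative 10-point tiers are passed in as flags decided by whoever gathers the evidence, and a liganded PDB structure at 2.5 Å or worse is assumed to fall into the 10-point tier because the rubric does not cover that case explicitly.

```python
def score_disease_association(overall_score=None,
                              strong_human_genetics=False,
                              moderate_support=False):
    """0/10/20 per the disease association rubric."""
    if (overall_score is not None and overall_score >= 0.7) or strong_human_genetics:
        return 20
    if moderate_support:
        return 10
    return 0


def score_structural_data(has_pdb=False, has_ligand=False,
                          resolution_angstrom=None, has_alphafold=False):
    """0/10/20 per the structural data rubric."""
    if (has_pdb and has_ligand
            and resolution_angstrom is not None
            and resolution_angstrom < 2.5):
        return 20
    # Assumption: any other PDB entry, or an AlphaFold model, scores 10.
    if has_pdb or has_alphafold:
        return 10
    return 0
```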

### Safety penalties (applied after scoring)

- Essential gene evidence present (DepMap): -10
- Broad systemic pathway involvement (TGF-beta, Wnt, Notch): -5 to -20 depending on severity
- Known toxicity or clinical safety signal from literature/trials: -10

If a target has strong disease relevance but also major systemic safety liability, prefer CONDITIONAL_GO over GO.

Safety penalties reduce the final score but do not change sub-scores.
A target can score 80 on evidence but drop to 65 after safety adjustment.
Safety is applied as a post-hoc penalty rather than as a sixth scoring dimension, so strong
biological evidence stays visible in the sub-scores while the safety adjustment is reported explicitly.
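
The 80 to 65 example corresponds to an essential-gene penalty (-10) plus the smallest pathway penalty (-5). A minimal sketch of the adjustment follows; the function name and arguments are hypothetical, and clamping the result at 0 is an assumption made so the score stays inside the 0-100 decision range.

```python
ESSENTIAL_GENE_PENALTY = 10
KNOWN_TOXICITY_PENALTY = 10
PATHWAY_PENALTY_RANGE = (5, 20)  # severity-dependent, chosen by the reviewer


def safety_adjusted_score(raw_score, essential_gene=False,
                          pathway_penalty=0, known_toxicity=False):
    """Subtract safety penalties from the raw evidence score."""
    low, high = PATHWAY_PENALTY_RANGE
    if pathway_penalty and not (low <= pathway_penalty <= high):
        raise ValueError(f"pathway penalty must be 0 or within {low}-{high}")

    total = pathway_penalty
    if essential_gene:
        total += ESSENTIAL_GENE_PENALTY
    if known_toxicity:
        total += KNOWN_TOXICITY_PENALTY

    # Assumption: the adjusted score never goes below 0.
    return max(0, raw_score - total)
```

The sub-scores are not touched here; only the summed score is adjusted, which matches the rule that penalties do not change sub-scores.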

### Decision tiers

| Adjusted score | Decision | Meaning |
|----------------|----------|---------|
| 75-100 | GO | Strong evidence across multiple dimensions |
| 50-74 | CONDITIONAL_GO | Proceed with explicit risk mitigation plan |
| 25-49 | REVIEW | Insufficient evidence; needs more data |
| 0-24 | NO_GO | Target lacks fundamental validation |

Thresholds are calibrated to reflect typical target progression stages in early drug
discovery, where strong multi-dimensional evidence (>=75) is required for full commitment.
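
The tier table maps onto a simple threshold lookup. This is a sketch under the assumption that the adjusted score is already within 0-100; the function name `decide` is hypothetical.

```python
def decide(adjusted_score):
    """Map a safety-adjusted score (0-100) to a decision tier."""
    if not 0 <= adjusted_score <= 100:
        raise ValueError("adjusted score must be between 0 and 100")
    if adjusted_score >= 75:
        return "GO"
    if adjusted_score >= 50:
        return "CONDITIONAL_GO"
    if adjusted_score >= 25:
        return "REVIEW"
    return "NO_GO"
```

For instance, a target that scores 80 on evidence would be a GO, but after a 15-point safety adjustment its score of 65 falls in the CONDITIONAL_GO band.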

### Evidence grading

Every piece of evidence is tagged with a confidence tier. The tiers guide confidence
weighting and highlight where decisions rely on weaker or indirect evidence, so domain
experts can focus their review effort:

| Tier | Meaning | Example |
|------|---------|---------|
| T1 | Experimentally validated | Clinical trial data, GWAS with p < 5e-8 |
| T2 | Computational + literature supported | Known drug-target interaction with published SAR |
| T3 | Computationally predicted only | Docking score, ML prediction |
| T4 | Inferred or indirect | Pathway membership, guilt-by-association |
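
One way to carry these tiers in code is an ordered enum attached to each evidence object. The sketch below is an assumption about shape, not the skill's actual data model: the `Evidence` fields and the `needs_expert_review` helper are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum


class Tier(IntEnum):
    T1 = 1  # experimentally validated
    T2 = 2  # computational + literature supported
    T3 = 3  # computationally predicted only
    T4 = 4  # inferred or indirect


@dataclass(frozen=True)
class Evidence:
    dimension: str   # e.g. "structural_data" (hypothetical key)
    claim: str       # short statement of what the evidence shows
    source: str      # database the evidence was retrieved from
    retrieved: date  # retrieval date
    tier: Tier


def needs_expert_review(items, weakest_trusted=Tier.T2):
    """Return evidence that rests on weaker tiers than `weakest_trusted`."""
    return [item for item in items if item.tier > weakest_trusted]
```

Using `IntEnum` keeps the tiers comparable, so "weaker than T2" is just a numeric comparison.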

## Safety Rules

- **This skill does not make clinical recommendations.** Output is for research planning only.
- **Missing data is not zero evidence.** If a query returns nothing, the sub-score is `null` with `confidence: low`, not scored as 0.
- **Evidence conflicts must be surfaced.** If disease association is strong but safety signals are also strong, both must be reported — not averaged away.
- **No hallucinated evidence.** Every evidence object cites a specific database and retrieval date. If an API fails, the skill reports the failure, not a guess.
- **Human override is expected.** The GO/NO-GO decision is a recommendation. Domain experts should review the evidence trail and may override.
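
The "missing data is not zero evidence" rule can be illustrated by keeping unscored dimensions as `None` (serialised as JSON `null`) instead of coercing them to 0. How the skill totals a score when dimensions are missing is not stated here, so the sketch below assumes it sums only the scored dimensions and reports the attainable maximum alongside; all names are hypothetical.

```python
import json


def build_sub_score_records(sub_scores):
    """Wrap each sub-score; a missing one stays None and is marked low confidence."""
    records = {}
    for dimension, score in sub_scores.items():
        if score is None:
            records[dimension] = {"score": None, "confidence": "low"}
        else:
            records[dimension] = {"score": score}
    return records


def raw_score(records):
    """Return (sum of scored dimensions, maximum attainable from them)."""
    scored = [r["score"] for r in records.values() if r["score"] is not None]
    # Assumption: missing dimensions are excluded, not counted as 0.
    return sum(scored), 20 * len(scored)


records = build_sub_score_records(
    {"disease_association": 20, "druggability": 20, "structural_data": None}
)
print(json.dumps(records["structural_data"]))
print(raw_score(records))
```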

## Agent Boundary

The agent (LLM) dispatches and explains. The skill (Python) executes.
The agent must NOT override scoring thresholds, invent gene-drug associations,
skip safety warnings, or claim that a NO_GO target is worth pursuing.
The skill does not replace wet-lab validation, medicinal chemistry review, or clinical judgement.

Related Skills

equity-scorer

658
from ClawBio/ClawBio

Compute HEIM diversity and equity metrics from VCF or ancestry data. Generates heterozygosity, FST, PCA plots, and a composite HEIM Equity Score with markdown reports.

omics-target-evidence-mapper

656
from ClawBio/ClawBio

Aggregate public target-level evidence across omics and translational sources for research triage.

wes-clinical-report-es

658
from ClawBio/ClawBio

Generates professional clinical PDF reports in Spanish from WES (Whole Exome Sequencing) data with clinical interpretation, pharmacogenomic alerts, and follow-up recommendations.

wes-clinical-report-en

658
from ClawBio/ClawBio

Generates professional clinical PDF reports in English from WES (Whole Exome Sequencing) data with clinical interpretation summary, pharmacogenomic alerts, and follow-up recommendations.

vcf-annotator

658
from ClawBio/ClawBio

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

variant-annotation

658
from ClawBio/ClawBio

Annotate VCF variants with Ensembl VEP REST, ClinVar significance, gnomAD/population frequency context, and prioritized variant ranking.

ukb-navigator

658
from ClawBio/ClawBio

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

struct-predictor

658
from ClawBio/ClawBio

Protein structure prediction with Boltz-2. Accepts YAML inputs (single protein or multi-chain complex), runs boltz predict, extracts per-residue pLDDT and PAE confidence, and writes a markdown report with figures.

soul2dna

658
from ClawBio/ClawBio

Compile SOUL.md character profiles into synthetic diploid genomes (.genome.json) via trait-to-allele mapping

seq-wrangler

658
from ClawBio/ClawBio

Sequence QC, alignment, and BAM processing. Wraps FastQC, BWA/Bowtie2, SAMtools for automated read-to-BAM pipelines.

scrna-orchestrator

658
from ClawBio/ClawBio

Local Scanpy pipeline for single-cell RNA-seq QC, optional doublet detection, clustering, marker discovery, optional CellTypist annotation, optional latent downstream mode from integrated.h5ad/X_scvi, and optional dataset-level plus within-cluster contrastive marker analysis from raw-count .h5ad or 10x Matrix Market input.

scrna-embedding

658
from ClawBio/ClawBio

Local scVI/scANVI-based single-cell latent embedding and batch-aware integration from raw-count .h5ad or 10x Matrix Market input, with stable integrated AnnData export for downstream latent analysis.