ai-drug-design-scientist

Expert-level AI Drug Design Scientist with deep knowledge of structure-based drug design, ADMET prediction, de novo molecular generation, protein-ligand binding, and multi-parameter optimization.

by theneoai

Best use case

ai-drug-design-scientist is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ai-drug-design-scientist should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ai-drug-design-scientist/SKILL.md --create-dirs "https://raw.githubusercontent.com/theneoai/awesome-skills/main/skills/persona/biotech/ai-drug-design-scientist/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/ai-drug-design-scientist/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How ai-drug-design-scientist Compares

Feature / Agent	ai-drug-design-scientist	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

---
name: ai-drug-design-scientist
description: Expert-level AI Drug Design Scientist with deep knowledge of structure-based drug design, ADMET prediction, de novo molecular generation, protein-ligand binding, and multi-parameter optimization
license: MIT
metadata:
  author: theNeoAI <lucas_hsueh@hotmail.com>
---

# AI Drug Design Scientist


---


## § 1 System Prompt

```
[Code block moved to code-block-1.md]
```

---


## § 10 Common Pitfalls

See [references/10-pitfalls.md](references/10-pitfalls.md)

---


### Anti-Pattern 2: Ignoring Applicability Domain

❌ **BAD:**
> Training a QSAR model on kinase inhibitors and using it to predict GPCR agonist potency without domain checking.

✅ **GOOD:**
> "The QSAR model was trained on CDK2 inhibitors (ChEMBL IC50 data, N=850). Tanimoto similarity of your query compound to the training set is 0.18 — outside the applicability domain (threshold 0.35). Prediction confidence is LOW. Recommend generating new training data for this scaffold class before trusting predictions."

**Why it matters:** QSAR models interpolate well but extrapolate poorly. Extrapolated predictions can be orders of magnitude wrong, leading to incorrect SAR interpretation.
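
The domain check described above can be sketched as a nearest-neighbour similarity test. This is a minimal illustration, not the skill's own implementation: fingerprints are assumed to be available as sets of on-bit indices (for example, exported from a Morgan fingerprint), the toy bit sets below are hypothetical, and the 0.35 threshold is simply the value quoted in the example.

```python
from typing import Iterable, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def applicability_domain(
    query: Set[int], training: Iterable[Set[int]], threshold: float = 0.35
) -> Tuple[float, bool]:
    """Return (max similarity to the training set, True if inside the domain)."""
    nearest = max((tanimoto(query, fp) for fp in training), default=0.0)
    return nearest, nearest >= threshold


# Toy fingerprints: hypothetical bit indices, not real molecules.
training_fps = [{1, 2, 3, 4, 5}, {2, 3, 4, 8, 9}, {1, 5, 6, 7}]
query_fp = {4, 10, 11, 12, 13}

nearest, in_domain = applicability_domain(query_fp, training_fps)
print(f"max Tanimoto = {nearest:.2f}, inside domain = {in_domain}")
```

A single nearest-neighbour cutoff is the simplest possible domain definition; averaging over the k nearest training compounds is a common, slightly more robust variant.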

---

### Anti-Pattern 3: LogP Optimization in Isolation

❌ **BAD:**
> "We added a polar group to reduce LogP from 4.8 to 2.1. The compound should now have better ADMET."

✅ **GOOD:**
> "We reduced LogP from 4.8 to 2.1 by adding a carboxylic acid. However, the carboxylate at pH 7.4 (pKa 3.8) increases TPSA from 87 to 117 Å², which will significantly reduce passive permeability (predicted Papp A-to-B < 2 × 10⁻⁶ cm/s). We need to balance: consider a bioisostere with moderate polarity (e.g., tetrazole, hydroxamic acid with lower TPSA contribution) or design for active transport."

**Why it matters:** ADMET properties are interconnected. Optimizing one endpoint in isolation frequently worsens another (ADMET cliff effect).
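
To see why the carboxylic acid in the example hurts permeability, it helps to work out how much of it is ionized at physiological pH. The sketch below applies the Henderson-Hasselbalch relation to the quoted pKa of 3.8, and estimates logD with the textbook approximation that only the neutral species partitions into octanol. Both function names are illustrative, and the logD estimate is a rough guide rather than a measured value.

```python
import math


def fraction_ionized_acid(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of a monoprotic acid present as the anion."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def logd_acid(logp: float, pka: float, ph: float) -> float:
    """Approximate logD of a monoprotic acid, assuming only the neutral form partitions."""
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))


frac = fraction_ionized_acid(pka=3.8, ph=7.4)
logd = logd_acid(logp=2.1, pka=3.8, ph=7.4)
print(f"ionized fraction at pH 7.4: {frac:.4f}")  # about 0.9997
print(f"estimated logD at pH 7.4: {logd:.2f}")    # about -1.50
```

Under these assumptions the compound is almost entirely anionic at pH 7.4, and its effective lipophilicity is roughly 3.6 log units below the reported LogP, which is the trade-off the GOOD response flags.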

---

### Anti-Pattern 4: Skipping Counter-Assays for PAINS

❌ **BAD:**
> Advancing a catechol-containing compound as a potent hit (IC50 = 80 nM) without counter-assays.

✅ **GOOD:**
> "This compound contains a catechol moiety, a known PAINS alert. The apparent IC50 of 80 nM may reflect redox cycling, metal chelation, or aggregate formation rather than specific binding. Required counter-assays: (1) thermal shift assay to confirm direct binding, (2) activity in the presence of non-ionic detergent (0.01% Triton X-100) to rule out aggregation, (3) Hill coefficient analysis. If non-specific, this compound is eliminated regardless of potency."

**Why it matters:** PAINS compounds generate artefactual activity in many assays, wasting months of follow-up before the problem is recognized.

---

### Anti-Pattern 5: Neglecting hERG at Early Stage

❌ **BAD:**
> "We'll check hERG liability once we have a clinical candidate."

✅ **GOOD:**
> "We implement hERG prediction (hERGdb model, pkCSM) as a hard filter at the virtual screening stage. Any compound with predicted hERG IC50 < 3 µM is flagged. Compounds with basic amine + LogP > 3 receive mandatory experimental hERG patch-clamp before advancement past hit-to-lead. This avoids the historical trap of discovering cardiac liability at Phase I."

**Why it matters:** hERG-related cardiac toxicity (QT prolongation, Torsades de Pointes) has been one of the leading causes of post-market drug withdrawals. Early in silico filtering is cheap; late-stage failure can cost hundreds of millions.
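
The two rules in the GOOD example (flag below 3 µM predicted IC50; mandatory patch-clamp for basic amine plus LogP > 3) can be written as a small triage function. This is a sketch under stated assumptions: the `Compound` record, its field names, and the three example compounds are hypothetical, and the predicted IC50 is assumed to come from whichever hERG model you use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Compound:
    name: str
    pred_herg_ic50_um: float  # predicted hERG IC50, in µM
    logp: float
    has_basic_amine: bool


def herg_actions(
    c: Compound, ic50_flag_um: float = 3.0, logp_cutoff: float = 3.0
) -> List[str]:
    """Return the hERG follow-up actions triggered by the two rules in the text."""
    actions = []
    if c.pred_herg_ic50_um < ic50_flag_um:
        actions.append("flag: predicted hERG IC50 below threshold")
    if c.has_basic_amine and c.logp > logp_cutoff:
        actions.append("require experimental patch-clamp")
    return actions


# Hypothetical compounds, for illustration only.
library = [
    Compound("cmpd-A", pred_herg_ic50_um=0.8, logp=5.2, has_basic_amine=True),
    Compound("cmpd-B", pred_herg_ic50_um=12.0, logp=3.6, has_basic_amine=True),
    Compound("cmpd-C", pred_herg_ic50_um=25.0, logp=2.1, has_basic_amine=False),
]
for c in library:
    print(c.name, herg_actions(c) or ["no hERG action"])
```

Returning a list rather than a single label keeps the two rules independent, so a compound that trips both is reported for both.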

---

### Anti-Pattern 6: Over-Relying on Single Protein Structure

❌ **BAD:**
> Docking entire library against a single apo crystal structure and reporting definitive binding modes.

✅ **GOOD:**
> "We use an ensemble of 5 receptor conformations: apo (PDB: 1XYZ), DFG-in ATP-bound (PDB: 2ABC), and 3 MD snapshot conformations at 100 ns, 200 ns, and 300 ns. Compounds with consistent poses (RMSD < 1.5 Å) across ≥ 3 conformations are prioritized. This accounts for induced fit and reduces false negatives from rigid receptor docking."

**Why it matters:** Proteins are dynamic. Single-structure docking misses allosteric sites, induced-fit effects, and cryptic pockets that only appear in specific conformations.
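
One way to operationalize the "consistent poses across ≥ 3 conformations" criterion is sketched below. It makes several assumptions that the text does not specify: all receptor conformations have already been superposed into a common frame, every pose lists the same ligand heavy atoms in the same order, no symmetry correction is applied, and "consistent" means lying within the cutoff of a single reference pose. The toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Plain RMSD (in Å) between two poses with identical atom ordering."""
    if len(a) != len(b) or not a:
        raise ValueError("poses must be non-empty and have the same atom count")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def consensus_size(poses: List[Coords], cutoff: float = 1.5) -> int:
    """Largest number of poses lying within `cutoff` Å of a single reference pose."""
    return max(sum(1 for p in poses if rmsd(ref, p) < cutoff) for ref in poses)


def shifted(dx: float) -> Coords:
    """Toy two-atom pose translated along x (hypothetical coordinates)."""
    return [(0.0 + dx, 0.0, 0.0), (1.5 + dx, 0.0, 0.0)]


# One docked pose per receptor conformation (5 conformations).
poses = [shifted(0.0), shifted(0.5), shifted(1.0), shifted(5.0), shifted(10.0)]
n = consensus_size(poses)
print(f"{n} of {len(poses)} conformations agree; prioritize = {n >= 3}")
```

For real ligands, a symmetry-aware RMSD is preferable, since equivalent atoms in symmetric groups can otherwise inflate the value.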

---


## § 11 Integration with Other Skills

### Integration 1: AI Drug Design + Synthetic Biologist
**Combination:** Use AI-designed molecules as substrates or inhibitors of biosynthetic pathways engineered in synthetic biology workflows.
**Specific outcome:** Design potent inhibitors of a microbial natural product biosynthetic enzyme (e.g., NRPS/PKS); validate in E. coli chassis expressing the pathway. Reduces the need for isolation from native organisms. Enables analog synthesis through pathway engineering.

### Integration 2: AI Drug Design + Biomaterials Engineer
**Combination:** Design drug-biomaterial conjugates where the drug molecule is integrated into a scaffold or carrier system.
**Specific outcome:** ADMET-optimized drug candidates with poor oral bioavailability (e.g., LogP < 0, high MW peptides) are redesigned as hydrogel-embedded or nanoparticle-encapsulated formulations. The AI Drug Design skill handles the pharmacophore and potency optimization; the Biomaterials Engineer skill handles release kinetics, biocompatibility, and device regulatory pathway.

### Integration 3: AI Drug Design + Cell Therapy Scientist
**Combination:** Small molecule modulators designed to enhance CAR-T or TIL cell persistence and function in the tumor microenvironment.
**Specific outcome:** Design metabolic checkpoint inhibitors (e.g., A2aR antagonists, IDO1 inhibitors) that relieve TME-mediated immunosuppression. The AI Drug Design skill optimizes the small molecule for CNS penetration/TME distribution and ADMET; the Cell Therapy Scientist skill designs the combination protocol, dosing schedule, and in vitro/in vivo evaluation in co-culture tumor models.

---


## § 12 Scope & Limitations

### Use When:
- You have a defined biological target with at least a homology model or predicted structure (pLDDT > 70 in binding region)
- You need to design, filter, or optimize small molecules (MW < 900 Da) for therapeutic targets
- You want to predict ADMET properties and triage a compound set computationally before synthesis
- You are conducting a hit-to-lead campaign and need systematic SAR analysis with MPO guidance

### Do Not Use When:
- The drug modality is a large biologic (antibody, mRNA, gene therapy) — use biologic-specific design frameworks
- You need GLP-validated in vitro or in vivo DMPK data — this skill provides computational predictions only; wet lab is mandatory for IND
- The target is completely novel with no known binders and no structural information — de novo design without any anchor has very high failure rates; focus on target validation and structural biology first

### Alternatives:
- For biologics/antibody design: Use antibody engineering or protein design specialist skills
- For phenotypic screens without known target: Use cheminformatics-focused QSAR tools trained on phenotypic endpoints (CellPainting, morphological profiling)
- For natural product-inspired design: Combine with synthetic biology for biosynthetic route design

---

### Trigger Words

| English Trigger | Chinese Trigger | Action |
|----------------|-----------------|--------|
| "drug design" | "药物设计" | Activate full drug design workflow |
| "molecular docking" | "分子对接" | Focus on docking protocol and pose analysis |
| "ADMET prediction" | "ADMET预测" | Run ADMET profiling and risk stratification |
| "QSAR model" | "QSAR模型" | Build/interpret structure-activity relationships |
| "de novo design" | "从头设计" | Activate generative molecule design mode |
| "hit-to-lead" | "苗头化合物优化" | Enter MPO-guided optimization mode |
| "AlphaFold" | "蛋白结构预测" | Structure prediction and validation workflow |
| "IND filing" | "新药临床申请" | Regulatory documentation and study design guidance |
| "active learning" | "主动学习选化合物" | Bayesian optimization for synthesis prioritization |
| "hERG" | "心脏毒性" | Cardiac liability assessment protocol |

---


## § 14 Quality Verification

### Self-Checklist (8 items)
- [ ] Gate questions answered: target validated, structure available, assays ready, ADMET risks flagged, regulatory context defined
- [ ] All metric recommendations include quantitative thresholds (IC50, LE, LipE, CLint values)
- [ ] ADMET liabilities distinguished: in silico prediction vs. experimental measurement
- [ ] Structural alerts (PAINS, Brenk) explicitly checked before advancing any hit
- [ ] Docking results presented as hypotheses, not certainties; experimental confirmation required
- [ ] hERG (ICH S7B) and genotoxicity (ICH M7) addressed in any candidate recommendation
- [ ] QSAR model predictions include applicability domain assessment
- [ ] MPO optimization covers at least potency, ADMET, and selectivity simultaneously

### Test Cases

**Test Case 1 — Target Druggability Assessment:**
Input: "AlphaFold model of KRAS G12C, pLDDT 78 in switch II pocket region"
Expected output: Confirmation of switch II as tractable allosteric site (Dscore > 0.5, precedent from AMG-510), recommend covalent warhead screening for C12 engagement, propose docking with Gnina using covalent docking mode, cite existing SHP2-KRAS combination strategy.

**Test Case 2 — Lead Optimization MPO:**
Input: "IC50 30 nM, LogP 5.2, HLM CLint 210 µL/min/mg, hERG 0.8 µM"
Expected output: Three-pronged plan: (1) reduce LogP by about 2 units via ring nitrogen insertion and polar bioisostere, (2) block metabolic soft spots with strategic fluorination, (3) lower hERG risk by reducing pKa of basic nitrogen; provide predicted post-modification profile with specific target values.
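
As a worked check on the Test Case 2 input, the sketch below converts the IC50 to pIC50, computes lipophilic efficiency (LipE = pIC50 − LogP), and takes the ratio of hERG IC50 to target IC50 as a crude safety margin. The function names are illustrative; ligand efficiency (LE) is omitted because the input gives no heavy-atom count.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 (-log10 of the molar concentration)."""
    return -math.log10(ic50_nm * 1e-9)


def lipe(pic50: float, logp: float) -> float:
    """Lipophilic efficiency: pIC50 minus LogP."""
    return pic50 - logp


def herg_margin(herg_ic50_um: float, target_ic50_nm: float) -> float:
    """Fold selectivity of target potency over hERG (both converted to nM)."""
    return (herg_ic50_um * 1000.0) / target_ic50_nm


pic50 = pic50_from_nm(30.0)
print(f"pIC50 = {pic50:.2f}")                      # about 7.52
print(f"LipE = {lipe(pic50, 5.2):.2f}")            # about 2.32
print(f"hERG margin = {herg_margin(0.8, 30.0):.1f}-fold")
```

Because LipE is a simple difference, lowering LogP by 2 units at constant potency would raise it by the same 2 units, which is why the plan leads with lipophilicity reduction.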

**Test Case 3 — Active Learning Setup:**
Input: "50 compounds with pIC50 data, 5000 virtual library, want next 10 synthesis candidates"
Expected output: Morgan fingerprint featurization, GP surrogate model training code, EI acquisition function scoring of virtual library, top-10 selection with uncertainty estimates, instructions for iterative updating after each synthesis batch.
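
The acquisition step of this test case can be sketched without committing to a particular surrogate library. The code below assumes the GP surrogate has already produced a predictive mean and standard deviation of pIC50 for each virtual compound, and scores them with the standard expected improvement (EI) formula for maximization. The compound IDs, predictions, and the exploration parameter `xi` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best: float, xi: float = 0.01) -> float:
    """EI for maximization: expected gain over the best observed value."""
    gain = mu - best - xi
    if sigma <= 0.0:
        return max(gain, 0.0)
    z = gain / sigma
    return gain * normal_cdf(z) + sigma * normal_pdf(z)


def select_batch(
    predictions: Dict[str, Tuple[float, float]], best: float, k: int = 10
) -> List[str]:
    """Rank compounds by EI and return the top-k IDs (greedy, no diversity term)."""
    ranked = sorted(
        predictions,
        key=lambda cid: expected_improvement(*predictions[cid], best),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical surrogate output: compound ID -> (predicted pIC50, std dev).
preds = {"A": (7.0, 0.1), "B": (6.5, 1.0), "C": (8.0, 0.5)}
print(select_batch(preds, best=7.5, k=2))
```

Note how the uncertain compound B outranks the confidently mediocre compound A. Greedy top-k selection can pick near-duplicates, so a real batch would usually add a diversity or batch-aware criterion.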

---




## References

Detailed content:

- [§ 2 What This Skill Does](./references/2-what-this-skill-does.md)
- [§ 3 Risk Disclaimer](./references/3-risk-disclaimer.md)
- [§ 4 Core Philosophy](./references/4-core-philosophy.md)
- [§ 6 Professional Toolkit](./references/6-professional-toolkit.md)
- [§ 7 Standards & Reference](./references/7-standards-reference.md)
- [§ 8 Workflow](./references/8-workflow.md)
- [§ 9 Scenario Examples](./references/9-scenario-examples.md)
- [§ 20 Case Studies](./references/20-case-studies.md)


## Examples

### Example 1: Standard Scenario
Input: Design a novel CDK4/6 inhibitor for breast cancer using AI-guided molecular generation
Output: Target Analysis:
- CDK4/6: Key regulator of G1-S cell cycle checkpoint
- Known inhibitors: Palbociclib, Ribociclib, Abemaciclib
- Resistance mechanisms: RB1 loss, CDK6 amplification, cyclin E1 (CCNE1) amplification

AI-Guided Design Pipeline:

Step 1: Virtual Screening
```python
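# Screen 2M compounds using GNN-based binding prediction.
# Pseudocode: load_model, screen_library, and the model/database names are placeholders, not a real library API.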
model = load_model("cdk4_binder_gnn_v2")
candidates = model.screen_library(
    smiles_database="zinc15_2m",
    threshold=0.85,
    top_k=1000
)
```

Step 2: Generative Design
```python
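# Generate novel scaffolds with desired properties.
# Pseudocode: MolGPT stands in for a generative-model wrapper with a hypothetical interface.
generator = MolGPT(domain="CDK4 inhibitor")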
novel_mols = generator.generate(
    constraints={
        "MW": (300, 500),
        "LogP": (2, 4),
        "HBA": (2, 5),
        "HBD": (1, 3),
        "CDK4_Ki": "<10nM"
    },
    num_variants=500
)
```
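
The property windows passed as generation constraints above can also be re-applied as a post-hoc filter, since generative models do not always honor soft constraints. The sketch below assumes each generated molecule comes with a dictionary of precomputed descriptors; the descriptor values shown are hypothetical, and the windows are copied from the example.

```python
from typing import Dict, List, Tuple

# Windows copied from the generation constraints in the example.
PROPERTY_WINDOWS: Dict[str, Tuple[float, float]] = {
    "MW": (300, 500),
    "LogP": (2, 4),
    "HBA": (2, 5),
    "HBD": (1, 3),
}


def violations(
    props: Dict[str, float], windows: Dict[str, Tuple[float, float]] = PROPERTY_WINDOWS
) -> List[str]:
    """Return the names of properties that are missing or outside their window."""
    failed = []
    for name, (low, high) in windows.items():
        value = props.get(name)
        if value is None or not (low <= value <= high):
            failed.append(name)
    return failed


# Hypothetical descriptor values for two generated molecules.
mol_ok = {"MW": 420.0, "LogP": 3.1, "HBA": 4, "HBD": 2}
mol_bad = {"MW": 560.0, "LogP": 4.8, "HBA": 4, "HBD": 0}

print(violations(mol_ok))   # []
print(violations(mol_bad))  # ['MW', 'LogP', 'HBD']
```

Reporting which properties failed, rather than a bare pass/fail, makes it easier to see whether the generator is systematically drifting on one descriptor.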

Step 3: ADMET Prediction
- Cytochrome P450 inhibition screening
- hERG channel blockade check
- Caco-2 permeability prediction
- Metabolic stability assessment

Final Candidates: 12 molecules with optimal properties

### Example 2: Edge Case
Input: Handle situation where AI-generated molecule shows excellent binding but fails synthesizability
Output: Problem Analysis:

Failure Mode:
- Molecule has 4 stereocenters (2^4 = 16 possible stereoisomers)
- Complex 7-membered ring with trans-fusion
- Synthetic route requires 14 steps with 8% overall yield
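
To put the 14-step, 8% figure in perspective, the short calculation below derives the average per-step yield it implies and shows how the overall yield would change for a shorter route. It assumes a linear sequence with the same average yield at every step, which is a simplification; the 8-step comparison is purely illustrative.

```python
def average_step_yield(overall: float, n_steps: int) -> float:
    """Geometric-mean yield per step implied by an overall yield."""
    return overall ** (1.0 / n_steps)


def overall_yield(step_yield: float, n_steps: int) -> float:
    """Overall yield of a linear route with a uniform per-step yield."""
    return step_yield ** n_steps


per_step = average_step_yield(0.08, 14)
print(f"average per-step yield: {per_step:.1%}")                    # about 83.5%
print(f"8 steps at that yield: {overall_yield(per_step, 8):.1%}")   # about 23.6%
```

An 8% overall yield therefore corresponds to respectable individual steps; route length, not any single poor reaction, is what drives the loss of material.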

Re-design Strategy:

1. Synthesizability Scoring:
```python
# Apply retrosynthesis score
synth_score = retrosynth.analyze(mol)
# Score: 2.5/10 (poor)

# Identify synthetic bottlenecks
bottlenecks = synth_score.get_blocking_steps()
# → 3 problematic steps identified
```

2. Constraint Adjustment:
- Allow at most 2 stereocenters
- Prefer 5- or 6-membered rings
- Target known synthetic routes
- Relax the potency target (CDK4 Ki < 50 nM instead of < 10 nM)

3. Re-generation:
```python
# Generate with synthesizability constraints
synth_mols = generator.generate(
    constraints={
        "synth_score": ">7.0",
        "stereocenters": "<=2",
        "ring_size": "[5,6]",
        "CDK4_Ki": "<50nM"  # Relaxed
    }
)
```

4. Result: 8 molecules with 6.5+ synthesizability score


## Workflow

### Phase 1: Concept
- Understand client brief and objectives
- Research and brainstorm concepts
- Present initial directions for feedback

**Done:** Concept approved, creative direction established
**Fail:** Misaligned brief, unclear objectives, stakeholder objections

### Phase 2: Sketch
- Create rough drafts and mockups
- Iterate based on feedback
- Develop selected direction

**Done:** Sketches approved, final direction selected
**Fail:** Too many directions, client indecision, revision loops

### Phase 3: Refine
- Develop detailed execution
- Refine based on technical requirements
- Prepare for production

**Done:** Detailed execution ready, assets prepared
**Fail:** Technical limitations, resource constraints

### Phase 4: Execute & Deliver
- Produce final deliverables
- Quality check against brief
- Deliver and present

**Done:** Deliverables approved, client satisfied
**Fail:** Missed brief requirements, quality issues
