drug-development-guide

End-to-end drug development pipeline from target identification to regulatory...

191 stars

Best use case

drug-development-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

End-to-end drug development pipeline from target identification to regulatory...

Teams using drug-development-guide should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/drug-development-guide/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/domains/pharma/drug-development-guide/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/drug-development-guide/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How drug-development-guide Compares

Feature / Agent	drug-development-guide	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

End-to-end drug development pipeline from target identification to regulatory...

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Drug Development Guide

A comprehensive skill covering the drug development pipeline from target identification through regulatory approval. Designed for pharmaceutical researchers, medicinal chemists, and clinical scientists conducting academic or industry research.

## Drug Discovery Pipeline Overview

```
Target ID -> Hit Finding -> Lead Optimization -> Preclinical -> Phase I -> Phase II -> Phase III -> Regulatory Filing
  (1-2 yr)    (1-2 yr)      (1-3 yr)            (1-2 yr)      (1 yr)     (2 yr)      (3 yr)       (1-2 yr)

Total timeline: ~10-15 years | Success rate: ~5-10% from Phase I to approval
Estimated cost: $1.3B-$2.8B per approved drug (DiMasi et al., 2016)
```

## Target Identification and Validation

### Computational Target Discovery

```python
import pandas as pd
from scipy import stats

def differential_expression_analysis(expression_data: pd.DataFrame,
                                      disease_group: list[str],
                                      control_group: list[str],
                                      fdr_threshold: float = 0.05) -> pd.DataFrame:
    """
    Identify differentially expressed genes as potential drug targets.

    Args:
        expression_data: Gene x Sample expression matrix
        disease_group: Sample IDs in disease condition
        control_group: Sample IDs in control condition
        fdr_threshold: False discovery rate threshold
    """
    results = []
    for gene in expression_data.index:
        disease_vals = expression_data.loc[gene, disease_group]
        control_vals = expression_data.loc[gene, control_group]
        t_stat, p_value = stats.ttest_ind(disease_vals, control_vals)
        fold_change = disease_vals.mean() / (control_vals.mean() + 1e-10)
        results.append({
            'gene': gene,
            'fold_change': fold_change,
            'log2_fc': np.log2(abs(fold_change) + 1e-10),
            'p_value': p_value,
            't_statistic': t_stat
        })

    df = pd.DataFrame(results)
    # Benjamini-Hochberg FDR correction
    from statsmodels.stats.multitest import multipletests
    df['fdr'] = multipletests(df['p_value'], method='fdr_bh')[1]
    df['significant'] = df['fdr'] < fdr_threshold
    return df.sort_values('fdr')
```

### Target Validation Criteria

A robust drug target should satisfy multiple criteria:

| Criterion | Method | Evidence Strength |
|-----------|--------|------------------|
| Genetic association | GWAS, Mendelian randomization | Strong |
| Expression in disease tissue | RNA-seq, immunohistochemistry | Moderate |
| Functional role | CRISPR knockout, siRNA | Strong |
| Druggability | Structural analysis, binding pockets | Essential |
| Safety (anti-target) | Phenotype of loss-of-function mutations | Essential |

## Lead Optimization

### ADMET Property Prediction

Assess absorption, distribution, metabolism, excretion, and toxicity early:

```python
def lipinski_rule_of_five(molecular_weight: float, logp: float,
                           hbd: int, hba: int) -> dict:
    """
    Evaluate Lipinski's Rule of Five for oral bioavailability.

    Args:
        molecular_weight: Molecular weight in Da
        logp: Calculated LogP (lipophilicity)
        hbd: Number of hydrogen bond donors
        hba: Number of hydrogen bond acceptors
    """
    violations = 0
    details = []

    if molecular_weight > 500:
        violations += 1
        details.append(f"MW {molecular_weight} > 500")
    if logp > 5:
        violations += 1
        details.append(f"LogP {logp} > 5")
    if hbd > 5:
        violations += 1
        details.append(f"HBD {hbd} > 5")
    if hba > 10:
        violations += 1
        details.append(f"HBA {hba} > 10")

    return {
        'violations': violations,
        'passes': violations <= 1,
        'details': details,
        'assessment': 'Likely orally bioavailable' if violations <= 1
                      else 'Poor oral bioavailability expected'
    }
```

## Pharmacokinetics Modeling

### Compartmental PK Analysis

```python
import numpy as np
from scipy.optimize import curve_fit

def one_compartment_iv(t, dose, V, CL):
    """One-compartment IV bolus model."""
    k_el = CL / V
    return (dose / V) * np.exp(-k_el * t)

def compute_pk_parameters(time_points: np.ndarray,
                           concentrations: np.ndarray,
                           dose: float) -> dict:
    """
    Fit one-compartment model and derive PK parameters.
    """
    popt, pcov = curve_fit(
        lambda t, V, CL: one_compartment_iv(t, dose, V, CL),
        time_points, concentrations,
        p0=[10, 1], bounds=(0, [1000, 100])
    )
    V, CL = popt
    t_half = 0.693 * V / CL
    auc = dose / CL

    return {
        'volume_of_distribution_L': round(V, 2),
        'clearance_L_hr': round(CL, 2),
        'half_life_hr': round(t_half, 2),
        'AUC_mg_hr_L': round(auc, 2)
    }
```

## Clinical Trial Design

### Phase Selection and Endpoints

| Phase | Primary Goal | Typical N | Key Endpoints |
|-------|-------------|-----------|---------------|
| Phase I | Safety, dose finding | 20-80 | MTD, DLT, PK |
| Phase II | Efficacy signal | 100-300 | ORR, PFS, biomarkers |
| Phase III | Confirmatory efficacy | 300-3000 | OS, PFS, PROs |
| Phase IV | Post-marketing surveillance | 1000+ | ADRs, real-world effectiveness |

Always pre-register clinical trials on ClinicalTrials.gov and follow CONSORT guidelines for reporting. Use adaptive trial designs (e.g., Bayesian adaptive randomization, seamless Phase II/III) when appropriate to improve efficiency.

## References

- DiMasi, J. A., Grabowski, H. G., & Hansen, R. W. (2016). Innovation in the pharmaceutical industry. *Journal of Health Economics*, 47, 20-33.
- Lipinski, C. A. (2004). Lead- and drug-like compounds. *Advanced Drug Delivery Reviews*, 56(3), 215-217.