protein-qc
Quality control metrics and filtering thresholds for protein design. Use this skill when: (1) Evaluating design quality for binding, expression, or structure, (2) Setting filtering thresholds for pLDDT, ipTM, PAE, (3) Checking sequence liabilities (cysteines, deamidation, polybasic clusters), (4) Creating multi-stage filtering pipelines, (5) Computing PyRosetta interface metrics (dG, SC, dSASA), (6) Checking biophysical properties (instability, GRAVY, pI), (7) Ranking designs with composite scoring. This skill provides research-backed thresholds from binder design competitions and published benchmarks.
Best use case
protein-qc is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using protein-qc should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/protein-qc/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# Protein Design Quality Control
## Critical Limitation
**Individual metrics have weak predictive power for binding**. Research shows:
- Individual metric ROC AUC: about 0.64-0.66 for ipTM and PAE, only modestly above random (0.50); the best single metric, ESM2 PLL, reaches about 0.72
- Metrics are **pre-screening filters**, not affinity predictors
- **Composite scoring is essential** for meaningful ranking
These thresholds filter out poor designs but do NOT predict binding affinity.
## QC Organization
QC is organized by **purpose** and **level**:
| Purpose | What it assesses | Key metrics |
|---------|------------------|-------------|
| **Binding** | Interface quality, binding geometry | ipTM, PAE, SC, dG, dSASA |
| **Expression** | Manufacturability, solubility | Instability, GRAVY, pI, cysteines |
| **Structural** | Fold confidence, consistency | pLDDT, pTM, scRMSD |
Each category has two levels:
- **Metric-level**: Calculated values with thresholds (e.g., pLDDT > 0.85 on a 0-1 scale, equivalent to 85 on AlphaFold2's 0-100 scale)
- **Design-level**: Pattern/motif detection (e.g., odd cysteine counts, NG sites)
---
## Quick Reference: All Thresholds
| Category | Metric | Standard | Stringent | Source |
|----------|--------|----------|-----------|--------|
| **Structural** | pLDDT | > 0.85 | > 0.90 | AF2/Chai/Boltz |
| | pTM | > 0.70 | > 0.80 | AF2/Chai/Boltz |
| | scRMSD | < 2.0 Å | < 1.5 Å | Designed vs. predicted structure |
| **Binding** | ipTM | > 0.50 | > 0.60 | AF2/Chai/Boltz |
| | PAE_interaction | < 12 Å | < 10 Å | AF2/Chai/Boltz |
| | Shape Comp (SC) | > 0.50 | > 0.60 | PyRosetta |
| | interface_dG | < -10 REU | < -15 REU | PyRosetta |
| **Expression** | Instability | < 40 | < 30 | BioPython |
| | GRAVY | < 0.4 | < 0.2 | BioPython |
| | ESM2 PLL | > 0.0 | > 0.2 | ESM2 |
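
The thresholds in the table above can be kept in one place and applied programmatically. The following is a minimal sketch, not part of the original skill: the dictionary keys `pTM`, `interface_dG`, and `GRAVY` are assumed column names (the others match the names used elsewhere in this document), and `failed_metrics` is a hypothetical helper.

```python
# Thresholds transcribed from the Quick Reference table.
# Each entry: metric -> (comparison, standard cutoff, stringent cutoff).
# Key names are assumptions about how your results table is labelled.
THRESHOLDS = {
    "pLDDT": (">", 0.85, 0.90),
    "pTM": (">", 0.70, 0.80),
    "scRMSD": ("<", 2.0, 1.5),
    "ipTM": (">", 0.50, 0.60),
    "PAE_interaction": ("<", 12.0, 10.0),
    "shape_complementarity": (">", 0.50, 0.60),
    "interface_dG": ("<", -10.0, -15.0),
    "instability_index": ("<", 40.0, 30.0),
    "GRAVY": ("<", 0.4, 0.2),
    "esm2_pll_normalized": (">", 0.0, 0.2),
}


def failed_metrics(design, level="standard"):
    """Return the names of metrics in `design` that miss the chosen cutoff.

    Metrics absent from `design` are skipped rather than treated as failures.
    """
    if level not in ("standard", "stringent"):
        raise ValueError(f"unknown level: {level!r}")
    idx = 1 if level == "standard" else 2
    failures = []
    for metric, spec in THRESHOLDS.items():
        if metric not in design:
            continue
        op, cutoff = spec[0], spec[idx]
        value = design[metric]
        ok = value > cutoff if op == ">" else value < cutoff
        if not ok:
            failures.append(metric)
    return failures


# Illustrative values only, not real design data.
example = {"pLDDT": 0.88, "ipTM": 0.55, "PAE_interaction": 11.0, "scRMSD": 1.8}
print(failed_metrics(example, "standard"))
print(failed_metrics(example, "stringent"))
```

Returning the list of failed metrics, rather than a single pass/fail flag, makes it easier to see which filter is removing most designs.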
### Design-Level Checks (Expression)
| Pattern | Risk | Action |
|---------|------|--------|
| Odd cysteine count | Unpaired disulfides | Redesign |
| NG/NS/NT motifs | Deamidation | Flag/avoid |
| K/R >= 3 consecutive | Proteolysis | Flag |
| >= 6 hydrophobic run | Aggregation | Redesign |
See: references/binding-qc.md, references/expression-qc.md, references/structural-qc.md
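
The pattern checks in the table above map naturally onto regular expressions. The sketch below is one possible implementation under stated assumptions: the hydrophobic residue set (`AVILMFWY`) is not defined in the source and is chosen here for illustration, and `find_liabilities` and the issue labels are hypothetical names.

```python
import re

# Assumption: the source does not say which residues count as hydrophobic.
HYDROPHOBIC = "AVILMFWY"

LIABILITY_PATTERNS = {
    # N followed by G, S, or T (deamidation-prone motifs from the table)
    "deamidation_motif": re.compile(r"N[GST]"),
    # Three or more consecutive K/R residues
    "polybasic_cluster": re.compile(r"[KR]{3,}"),
    # Six or more consecutive hydrophobic residues
    "hydrophobic_run": re.compile(f"[{HYDROPHOBIC}]{{6,}}"),
}


def find_liabilities(sequence):
    """Return a dict of detected liabilities; an empty dict means none found.

    Motif hits are reported as (0-based start position, matched text).
    """
    seq = sequence.upper()
    report = {}
    n_cys = seq.count("C")
    if n_cys % 2 == 1:
        report["odd_cysteine_count"] = n_cys
    for name, pattern in LIABILITY_PATTERNS.items():
        hits = [(m.start(), m.group()) for m in pattern.finditer(seq)]
        if hits:
            report[name] = hits
    return report


# Made-up sequence that triggers every check.
print(find_liabilities("MKRKNGSCAVILLFDE"))
```

Note that `finditer` reports non-overlapping matches, so a long basic or hydrophobic stretch is counted once rather than once per window.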
---
## Sequential Filtering Pipeline
```python
import pandas as pd

# Expects one row per design, with pLDDT and ipTM on a 0-1 scale
designs = pd.read_csv('designs.csv')

# Stage 1: Structural confidence
designs = designs[designs['pLDDT'] > 0.85]

# Stage 2: Self-consistency (Å)
designs = designs[designs['scRMSD'] < 2.0]

# Stage 3: Binding quality (PAE < 10 Å is the stringent cutoff; standard is < 12 Å)
designs = designs[(designs['ipTM'] > 0.5) & (designs['PAE_interaction'] < 10)]

# Stage 4: Sequence plausibility
designs = designs[designs['esm2_pll_normalized'] > 0.0]

# Stage 5: Expression checks
designs = designs[designs['cysteine_count'] % 2 == 0]  # Even cysteine count (design-level)
designs = designs[designs['instability_index'] < 40]   # Instability index (metric-level)
```
---
## Composite Scoring (Required for Ranking)
Individual metrics alone are too weak. Use composite scoring:
```python
def composite_score(row):
    # Weights sum to 1.0; PAE (Å) is rescaled so that lower error scores higher
    return (
        0.30 * row['pLDDT'] +
        0.20 * row['ipTM'] +
        0.20 * (1 - row['PAE_interaction'] / 20) +
        0.15 * row['shape_complementarity'] +
        0.15 * row['esm2_pll_normalized']
    )

# `designs` is the filtered DataFrame from the sequential pipeline
designs['score'] = designs.apply(composite_score, axis=1)
top_designs = designs.nlargest(100, 'score')
```
For advanced composite scoring, see references/composite-scoring.md.
---
## Tool-Specific Filtering
### BindCraft Filter Levels
| Level | Use Case | Stringency / Expected Outcome |
|-------|----------|------------|
| Default | Standard design | Most stringent |
| Relaxed | Need more designs | Higher failure rate |
| Peptide | Designs < 30 AA | ~5-10x lower success |
### BoltzGen Filtering
```bash
boltzgen run ... \
--budget 60 \
--alpha 0.01 \
--filter_biased true \
--refolding_rmsd_threshold 2.0 \
--additional_filters 'ALA_fraction<0.3'
```
- `alpha=0.0`: Quality-only ranking
- `alpha=0.01`: Default (slight diversity)
- `alpha=1.0`: Diversity-only
---
## Design-Level Severity Scoring
For pattern-based checks, use severity scoring:
| Severity Level | Score | Action |
|----------------|-------|--------|
| LOW | 0-15 | Proceed |
| MODERATE | 16-35 | Review flagged issues |
| HIGH | 36-60 | Redesign recommended |
| CRITICAL | 61+ | Redesign required |
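
The score bands above can be turned into a small lookup. In this sketch only the bands and actions come from the table; the per-issue point values in `ISSUE_POINTS` are hypothetical placeholders (the source does not say how points are assigned), and the issue names are illustrative labels.

```python
# Hypothetical penalty points per detected issue. The source gives only the
# score bands, not how individual issues are weighted; tune these yourself.
ISSUE_POINTS = {
    "odd_cysteine_count": 30,
    "hydrophobic_run": 25,
    "deamidation_motif": 10,
    "polybasic_cluster": 10,
}

# (inclusive upper bound, level, action), taken from the severity table.
SEVERITY_BANDS = [
    (15, "LOW", "Proceed"),
    (35, "MODERATE", "Review flagged issues"),
    (60, "HIGH", "Redesign recommended"),
]


def severity(issue_counts):
    """Map {issue name: number of occurrences} to (score, level, action)."""
    score = sum(ISSUE_POINTS[name] * count for name, count in issue_counts.items())
    for upper, level, action in SEVERITY_BANDS:
        if score <= upper:
            return score, level, action
    return score, "CRITICAL", "Redesign required"


print(severity({"deamidation_motif": 2}))
print(severity({"odd_cysteine_count": 1, "hydrophobic_run": 1}))
```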
---
## Experimental Correlation
| Metric | AUC | Use |
|--------|-----|-----|
| ipTM | ~0.64 | Pre-screening |
| PAE | ~0.65 | Pre-screening |
| ESM2 PLL | ~0.72 | Best single metric |
| Composite | ~0.75+ | **Always use** |
**Key insight**: Metrics work as **filters** (eliminating failures) not **predictors** (ranking successes).
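
To check how well a metric separates binders from non-binders in your own experimental data, ROC AUC can be computed directly as the probability that a randomly chosen binder scores higher than a randomly chosen non-binder. The sketch below uses this pairwise (Mann-Whitney) formulation in plain Python; `roc_auc` is a hypothetical helper and the numbers are toy values, not experimental results.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (binder, non-binder) pairs ranked correctly.

    `labels` holds truthy values for binders. Ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only.
iptm = [0.82, 0.40, 0.75, 0.55, 0.61, 0.70]
bound = [1, 0, 0, 1, 0, 1]
print(round(roc_auc(iptm, bound), 3))
```

An AUC of 0.5 means the metric ranks binders no better than chance, which is why values around 0.64 support filtering but not fine-grained ranking. The pairwise loop is quadratic, so for large campaigns a rank-based implementation would be preferable.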
---
## Campaign Health Assessment
Quick assessment of your design campaign:
| Pass Rate | Status | Interpretation |
|-----------|--------|----------------|
| > 15% | Excellent | Above average, proceed |
| 10-15% | Good | Normal, proceed |
| 5-10% | Marginal | Below average, review issues |
| < 5% | Poor | Significant problems, diagnose |
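
The table above translates into a simple classifier. This is a sketch with one assumption: the table does not say which band the exact boundaries (5%, 10%, 15%) belong to, so lower bounds are treated as inclusive here. `campaign_status` is a hypothetical name.

```python
def campaign_status(n_passed, n_total):
    """Return (pass rate, status) using the campaign health bands.

    Assumption: a rate of exactly 10% counts as Good and exactly 5% as
    Marginal; exactly 15% falls in Good because Excellent requires > 15%.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    rate = n_passed / n_total
    if rate > 0.15:
        return rate, "Excellent"
    if rate >= 0.10:
        return rate, "Good"
    if rate >= 0.05:
        return rate, "Marginal"
    return rate, "Poor"


rate, status = campaign_status(n_passed=12, n_total=100)
print(f"{rate:.1%} pass rate: {status}")
```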
---
## Failure Recovery Trees
### Too Few Pass pLDDT Filter (< 5% with pLDDT > 0.85)
```
Low pLDDT across campaign
├── Check scRMSD distribution
│ ├── High scRMSD (>2.5Å): Backbone issue
│ │ └── Fix: Regenerate backbones with lower noise_scale (0.5-0.8)
│ └── Low scRMSD but low pLDDT: Disordered regions
│ └── Fix: Check design length, simplify topology
├── Try more sequences per backbone
│ └── modal run modal_proteinmpnn.py --num-seq-per-target 32 --sampling-temp 0.1
├── Use SolubleMPNN instead of ProteinMPNN
│ └── Better for expression-optimized sequences
└── Consider different design tool
└── BindCraft (integrated design) may work better
```
### Too Few Pass ipTM Filter (< 5% with ipTM > 0.5)
```
Low ipTM across campaign
├── Review hotspot selection
│ ├── Are hotspots surface-exposed? (SASA > 20Ų)
│ ├── Are hotspots conserved? (check MSA)
│ └── Try 3-6 different hotspot combinations
├── Increase binder length (more contact area)
│ └── Try 80-100 AA instead of 60-80 AA
├── Check interface geometry
│ ├── Is target flat? → Try helical binders
│ └── Is target concave? → Try smaller binders
└── Try all-atom design tool
└── BoltzGen (all-atom, better packing)
```
### High scRMSD (> 50% with scRMSD > 2.0Å)
```
Sequences don't specify intended structure
├── ProteinMPNN issue
│ ├── Lower temperature: --sampling-temp 0.1
│ ├── Increase sequences: --num-seq-per-target 32
│ └── Check fixed_positions aren't over-constraining
├── Backbone geometry issue
│ ├── Backbones may be unusual/strained
│ ├── Regenerate with lower noise_scale (0.5-0.8)
│ └── Reduce diffuser.T to 30-40
└── Try different sequence design
└── ColabDesign (AF2 gradient-based) may work better
```
### Everything Passes But No Experimental Hits
```
In silico metrics don't predict affinity
├── Generate MORE designs (10x current)
│ └── Computational metrics have high false positive rate
├── Increase diversity
│ ├── Higher ProteinMPNN temperature (0.2-0.3)
│ ├── Different backbone topologies
│ └── Different hotspot combinations
├── Try different design approach
│ ├── BindCraft (different algorithm)
│ ├── ColabDesign (AF2 hallucination)
│ └── BoltzGen (all-atom diffusion)
└── Check if target is druggable
└── Some targets are inherently difficult
```
### Too Many Designs Pass (> 50%)
```
Suspiciously high pass rate
├── Check if thresholds are too lenient
│ └── Use stringent thresholds: pLDDT > 0.90, ipTM > 0.60
├── Verify prediction quality
│ ├── Are predictions actually running? Check output files
│ └── Are complexes being predicted, not just monomers?
├── Check for data issues
│ ├── Same sequence being predicted multiple times?
│ └── Wrong FASTA format (missing chain separator)?
└── Apply diversity filter
└── Cluster at 70% identity, take top per cluster
```
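
The diversity filter in the last branch ("cluster at 70% identity, take top per cluster") can be approximated with a greedy pass over designs sorted by score. This is a rough sketch, not the method used by any particular tool: it uses `difflib.SequenceMatcher` as an alignment-free stand-in for sequence identity, whereas a dedicated clustering tool would normally be used in practice. The `sequence` and `score` keys and the function names are assumptions.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Approximate pairwise similarity in [0, 1] (a proxy for sequence identity)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def top_per_cluster(designs, threshold=0.70):
    """Greedy clustering that keeps the best-scoring design of each cluster.

    Designs are visited in descending score order. A design is kept as a new
    cluster representative only if it is below `threshold` similarity to every
    representative kept so far; otherwise it is dropped as redundant.
    """
    representatives = []
    for design in sorted(designs, key=lambda d: d["score"], reverse=True):
        if all(identity(design["sequence"], rep["sequence"]) < threshold
               for rep in representatives):
            representatives.append(design)
    return representatives


# Made-up sequences: "a" and "b" differ by one residue, "c" is unrelated.
designs = [
    {"name": "a", "sequence": "MKTAYIAKQR", "score": 0.70},
    {"name": "b", "sequence": "MKTAYIAKQK", "score": 0.80},
    {"name": "c", "sequence": "GSPEDLWWGS", "score": 0.60},
]
print([d["name"] for d in top_per_cluster(designs)])
```

Because designs are processed best-first, each kept representative is automatically the top-scoring member of its cluster.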
---
## Diagnostic Commands
### Quick Campaign Assessment
```python
import pandas as pd

df = pd.read_csv('designs.csv')

# Pass rates at each stage
print(f"Total designs: {len(df)}")
print(f"pLDDT > 0.85: {(df['pLDDT'] > 0.85).mean():.1%}")
print(f"ipTM > 0.50: {(df['ipTM'] > 0.50).mean():.1%}")
print(f"scRMSD < 2.0: {(df['scRMSD'] < 2.0).mean():.1%}")
print(f"All filters: {((df['pLDDT'] > 0.85) & (df['ipTM'] > 0.5) & (df['scRMSD'] < 2.0)).mean():.1%}")

# Identify top issue (checked in order; only the first match is reported)
if (df['pLDDT'] > 0.85).mean() < 0.1:
    print("ISSUE: Low pLDDT - check backbone or sequence quality")
elif (df['ipTM'] > 0.50).mean() < 0.1:
    print("ISSUE: Low ipTM - check hotspots or interface geometry")
elif (df['scRMSD'] < 2.0).mean() < 0.5:
    print("ISSUE: High scRMSD - sequences don't specify backbone")
```
---