bio-structural-biology-alphafold-predictions
Access and analyze AlphaFold protein structure predictions. Use when predicted structures are needed for proteins without experimental structures, or for confidence scores (pLDDT).
Best use case
bio-structural-biology-alphafold-predictions is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using bio-structural-biology-alphafold-predictions should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/bio-structural-biology-alphafold-predictions/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How bio-structural-biology-alphafold-predictions Compares
| Feature / Agent | bio-structural-biology-alphafold-predictions | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
## Version Compatibility
Reference examples tested with: BioPython 1.83+, matplotlib 3.8+, numpy 1.26+, scanpy 1.10+
Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
# AlphaFold Predictions
**"Get the AlphaFold predicted structure for my protein"** → Download pre-computed AlphaFold structures by UniProt ID and assess prediction quality via per-residue pLDDT confidence scores.
- Python: `requests.get(f'https://alphafold.ebi.ac.uk/files/AF-{uniprot}-F1-model_v4.pdb')`
Download and analyze AlphaFold predicted protein structures from the AlphaFold Protein Structure Database.
## Download Structures
**Goal:** Retrieve pre-computed AlphaFold protein structure predictions and assess prediction quality via pLDDT confidence scores.
**Approach:** Query the AlphaFold Protein Structure Database API by UniProt accession to download PDB/CIF files, then extract per-residue pLDDT scores from B-factor columns to identify high-confidence and disordered regions.
### Single Structure by UniProt ID
```python
import requests

def download_alphafold(uniprot_id, output_dir='.'):
    '''Download AlphaFold structure for UniProt accession'''
    base_url = 'https://alphafold.ebi.ac.uk/files'
    pdb_url = f'{base_url}/AF-{uniprot_id}-F1-model_v4.pdb'
    cif_url = f'{base_url}/AF-{uniprot_id}-F1-model_v4.cif'  # mmCIF alternative (not fetched here)

    response = requests.get(pdb_url)
    if response.status_code == 200:
        output_path = f'{output_dir}/AF-{uniprot_id}-F1-model_v4.pdb'
        with open(output_path, 'w') as f:
            f.write(response.text)
        return output_path
    return None

pdb_file = download_alphafold('P04637')  # Human p53
```
### Check Availability
```python
import requests

def check_alphafold_exists(uniprot_id):
    '''Check if AlphaFold prediction exists'''
    url = f'https://alphafold.ebi.ac.uk/api/prediction/{uniprot_id}'
    response = requests.get(url)
    return response.status_code == 200

if check_alphafold_exists('P04637'):
    print('AlphaFold structure available')
```
### Get Metadata
```python
import requests

def get_alphafold_info(uniprot_id):
    '''Get AlphaFold prediction metadata'''
    url = f'https://alphafold.ebi.ac.uk/api/prediction/{uniprot_id}'
    response = requests.get(url)
    if response.status_code == 200:
        # The endpoint returns a list; the first entry describes the prediction
        return response.json()[0]
    return None

info = get_alphafold_info('P04637')
if info:  # None when no prediction was found
    print(f"Gene: {info['gene']}")
    print(f"Organism: {info['organismScientificName']}")
    print(f"Model version: {info['latestVersion']}")
```
## File Types Available
Database version v4 (current as of 2025). The version number refers to the database release, not the AlphaFold model version.
| File | URL Pattern | Description |
|------|-------------|-------------|
| PDB | `AF-{id}-F1-model_v4.pdb` | Structure coordinates |
| mmCIF | `AF-{id}-F1-model_v4.cif` | Structure with metadata |
| PAE JSON | `AF-{id}-F1-predicted_aligned_error_v4.json` | Predicted aligned error |
```python
import requests

def download_pae(uniprot_id, output_dir='.'):
    '''Download PAE (predicted aligned error) matrix'''
    url = f'https://alphafold.ebi.ac.uk/files/AF-{uniprot_id}-F1-predicted_aligned_error_v4.json'
    response = requests.get(url)
    if response.status_code == 200:
        # Saved under a shortened local name
        output_path = f'{output_dir}/AF-{uniprot_id}-F1-pae.json'
        with open(output_path, 'w') as f:
            f.write(response.text)
        return output_path
    return None
```
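The URL patterns in the table above differ only in the accession, the file kind, and the database version suffix. The following is a minimal sketch that assembles all three URLs from those parts. `alphafold_file_urls` and its `version` parameter are hypothetical names introduced here for illustration, and the default of 4 simply mirrors the v4 patterns listed above.

```python
def alphafold_file_urls(uniprot_id, version=4,
                        base_url='https://alphafold.ebi.ac.uk/files'):
    '''Build file URLs from the AF-{id}-F1-..._v{version} patterns (hypothetical helper).'''
    stem = f'{base_url}/AF-{uniprot_id}-F1'
    return {
        'pdb': f'{stem}-model_v{version}.pdb',
        'cif': f'{stem}-model_v{version}.cif',
        'pae': f'{stem}-predicted_aligned_error_v{version}.json',
    }

urls = alphafold_file_urls('P04637')
print(urls['pdb'])
```

Keeping the version as a parameter means a later database release should only require changing one argument, assuming the naming pattern itself stays the same.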
## Analyze pLDDT Confidence Scores
### Extract from PDB B-factors
AlphaFold stores pLDDT scores in the B-factor column.
```python
from Bio.PDB import PDBParser

def extract_plddt(pdb_file):
    '''Extract pLDDT confidence scores from AlphaFold PDB'''
    parser = PDBParser(QUIET=True)
    structure = parser.get_structure('protein', pdb_file)

    residue_plddt = {}
    for model in structure:
        for chain in model:
            for residue in chain:
                if residue.id[0] == ' ':  # Standard residue
                    # Use the CA atom when present, otherwise the first atom
                    ca = residue['CA'] if 'CA' in residue else list(residue.get_atoms())[0]
                    residue_plddt[residue.id[1]] = ca.get_bfactor()
    return residue_plddt

plddt = extract_plddt('AF-P04637-F1-model_v4.pdb')
avg_plddt = sum(plddt.values()) / len(plddt)
print(f'Average pLDDT: {avg_plddt:.1f}')
```
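Once the per-residue scores are in a dictionary, contiguous stretches of low confidence can be located, which is one way to flag candidate disordered regions. The sketch below is an illustration only: `low_confidence_segments`, the cutoff of 70, and the minimum length of 5 residues are assumptions made here, not values prescribed by AlphaFold.

```python
def low_confidence_segments(plddt_dict, threshold=70.0, min_length=5):
    '''Return (start, end) residue ranges whose pLDDT stays below threshold.

    threshold and min_length are illustrative defaults (assumptions).
    '''
    segments = []
    run = []  # residue numbers in the current low-confidence run
    for resnum in sorted(plddt_dict):
        is_low = plddt_dict[resnum] < threshold
        extends_run = bool(run) and resnum == run[-1] + 1
        if is_low and (not run or extends_run):
            run.append(resnum)
            continue
        # The run ended (score recovered or numbering gap): keep it if long enough
        if len(run) >= min_length:
            segments.append((run[0], run[-1]))
        run = [resnum] if is_low else []
    # Flush a run that reaches the end of the chain
    if len(run) >= min_length:
        segments.append((run[0], run[-1]))
    return segments

# Toy scores standing in for the output of extract_plddt()
example = {1: 95, 2: 60, 3: 55, 4: 40, 5: 45, 6: 65, 7: 92, 8: 91}
print(low_confidence_segments(example, min_length=3))
```

Low pLDDT is a hint of disorder rather than proof, so segments found this way are best treated as regions to inspect further.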
### Confidence Interpretation
| pLDDT | Confidence | Interpretation |
|-------|------------|----------------|
| >90 | Very high | High accuracy, often comparable to experimental structures |
| 70-90 | Confident | Good backbone, may have sidechain errors |
| 50-70 | Low | Caution, may be disordered |
| <50 | Very low | Likely disordered or wrong |
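
The bands in the table can be applied to a per-residue score dictionary to summarize how much of a model falls into each confidence class. This is a minimal sketch: `plddt_band` and `band_fractions` are hypothetical helper names, and assigning the boundary values (exactly 90, 70, or 50) to a band is an assumption, since the table leaves those edges open.

```python
from collections import Counter

BANDS = ('very_high', 'confident', 'low', 'very_low')

def plddt_band(score):
    '''Map a pLDDT score to a confidence band (boundary handling is an assumption).'''
    if score > 90:
        return 'very_high'
    if score >= 70:
        return 'confident'
    if score >= 50:
        return 'low'
    return 'very_low'

def band_fractions(plddt_dict):
    '''Fraction of residues in each band; all zeros for an empty input.'''
    total = len(plddt_dict)
    counts = Counter(plddt_band(s) for s in plddt_dict.values())
    return {band: (counts[band] / total if total else 0.0) for band in BANDS}

# Toy scores standing in for the output of extract_plddt()
fractions = band_fractions({1: 95.0, 2: 80.0, 3: 60.0, 4: 30.0})
for band, frac in fractions.items():
    print(f'{band}: {frac:.0%}')
```

A band summary is often more informative than the mean pLDDT, because a single average can hide a well-folded domain sitting next to a long disordered tail.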
### Plot pLDDT per Residue
```python
import matplotlib.pyplot as plt

def plot_plddt(plddt_dict, output='plddt_plot.png'):
    residues = sorted(plddt_dict.keys())
    scores = [plddt_dict[r] for r in residues]

    plt.figure(figsize=(12, 4))
    plt.fill_between(residues, scores, alpha=0.3)
    plt.plot(residues, scores)
    plt.axhline(y=70, color='orange', linestyle='--', label='Confident threshold')
    plt.axhline(y=90, color='green', linestyle='--', label='Very high threshold')
    plt.xlabel('Residue')
    plt.ylabel('pLDDT')
    plt.ylim(0, 100)
    plt.legend()
    plt.savefig(output)
    plt.close()

plot_plddt(plddt)
```
## Analyze PAE (Predicted Aligned Error)
```python
import json
import numpy as np
import matplotlib.pyplot as plt

def load_pae(pae_file):
    '''Load PAE matrix from JSON'''
    with open(pae_file) as f:
        data = json.load(f)
    # AlphaFold DB v4 format: a list holding a single dict
    if isinstance(data, list):
        data = data[0]
    # Otherwise assume a bare dict carrying the same key
    return np.array(data['predicted_aligned_error'])

def plot_pae(pae_matrix, output='pae_plot.png'):
    plt.figure(figsize=(8, 8))
    plt.imshow(pae_matrix, cmap='Greens_r', vmin=0, vmax=30)
    plt.colorbar(label='Expected position error (Å)')
    plt.xlabel('Scored residue')
    plt.ylabel('Aligned residue')
    plt.title('Predicted Aligned Error')
    plt.savefig(output)
    plt.close()

pae = load_pae('AF-P04637-F1-pae.json')
plot_pae(pae)
```
### PAE Interpretation
- **Low PAE (dark green):** Residues have well-defined relative positions
- **High PAE (white):** Uncertain relative positions (for example across flexible linkers or between domains)
- **Diagonal blocks:** Low-PAE blocks along the diagonal mark distinct structural domains
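
The visual reading above can be backed by a number: the mean PAE over the off-diagonal block linking two residue ranges. The sketch below is illustrative only. `mean_pae_between` is a hypothetical helper, it assumes matrix row and column `i` correspond to residue `i + 1`, and it averages both directions because PAE is not symmetric.

```python
def mean_pae_between(pae, range_a, range_b):
    '''Mean PAE between two residue ranges given as 1-based inclusive (start, end).

    Works on nested lists or a 2-D NumPy array (hypothetical helper).
    '''
    (a_start, a_end), (b_start, b_end) = range_a, range_b
    idx_a = range(a_start - 1, a_end)
    idx_b = range(b_start - 1, b_end)
    # PAE is directional, so collect both the A->B and B->A blocks
    values = [pae[i][j] for i in idx_a for j in idx_b]
    values += [pae[j][i] for i in idx_a for j in idx_b]
    return sum(values) / len(values)

# Toy 4x4 matrix: residues 1-2 and 3-4 form two tight blocks
toy_pae = [
    [0, 1, 20, 22],
    [1, 0, 21, 23],
    [19, 20, 0, 2],
    [21, 22, 2, 0],
]
print(mean_pae_between(toy_pae, (1, 2), (3, 4)))  # between the two blocks
print(mean_pae_between(toy_pae, (1, 2), (1, 2)))  # within the first block
```

A low value within a range combined with a high value between ranges suggests two confidently predicted domains whose relative orientation is uncertain.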
## Batch Download
```python
import os

def batch_download_alphafold(uniprot_ids, output_dir='.'):
    '''Download multiple AlphaFold structures'''
    os.makedirs(output_dir, exist_ok=True)
    results = {}

    for uid in uniprot_ids:
        # Reuses download_alphafold() from "Single Structure by UniProt ID"
        pdb_file = download_alphafold(uid, output_dir)
        results[uid] = pdb_file
        if pdb_file:
            print(f'Downloaded: {uid}')
        else:
            print(f'Not found: {uid}')
    return results

# 'P53_HUMAN' is a UniProt entry name, not an accession, so it is expected to come back as not found
ids = ['P04637', 'P53_HUMAN', 'Q9Y6K9']
files = batch_download_alphafold(ids, 'alphafold_structures')
```
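Because the file URLs are keyed by UniProt accession, it can help to screen a batch for malformed identifiers before making any requests. The sketch below uses the accession pattern documented by UniProt. `split_valid_accessions` is a hypothetical helper, and the check covers format only: it does not confirm that an entry or a prediction exists.

```python
import re

# Accession format as documented by UniProt (6 or 10 characters)
UNIPROT_ACCESSION = re.compile(
    r'[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}'
)

def split_valid_accessions(ids):
    '''Split identifiers into (well-formed accessions, everything else).'''
    valid, rejected = [], []
    for uid in ids:
        if UNIPROT_ACCESSION.fullmatch(uid):
            valid.append(uid)
        else:
            rejected.append(uid)
    return valid, rejected

valid, rejected = split_valid_accessions(['P04637', 'P53_HUMAN', 'Q9Y6K9'])
print('Will download:', valid)
print('Skipped (not an accession):', rejected)
```

Rejected identifiers such as entry names can then be mapped to accessions separately instead of silently showing up as missing structures.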
## Compare with Experimental Structure
```python
from Bio.PDB import PDBParser, Superimposer

def compare_structures(alphafold_pdb, experimental_pdb):
    '''Calculate RMSD between AlphaFold and experimental structure'''
    parser = PDBParser(QUIET=True)
    af_struct = parser.get_structure('af', alphafold_pdb)
    exp_struct = parser.get_structure('exp', experimental_pdb)

    # Get CA atoms from the first model (all chains)
    af_atoms = [r['CA'] for r in af_struct[0].get_residues() if 'CA' in r]
    exp_atoms = [r['CA'] for r in exp_struct[0].get_residues() if 'CA' in r]

    # Truncate to a common length (simple approach; assumes both start at the same residue)
    min_len = min(len(af_atoms), len(exp_atoms))
    af_atoms = af_atoms[:min_len]
    exp_atoms = exp_atoms[:min_len]

    # Superimpose the AlphaFold atoms (moving) onto the experimental atoms (fixed)
    super_imposer = Superimposer()
    super_imposer.set_atoms(exp_atoms, af_atoms)
    rmsd = super_imposer.rms
    return rmsd
```
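Truncating both atom lists to a common length pairs atoms correctly only when the two structures start at the same residue and have no gaps. One alternative, sketched below, pairs entries by residue number instead. `pair_by_residue_number` is a hypothetical helper, and it assumes the experimental file uses the same (UniProt) residue numbering as the AlphaFold model, which is not always the case.

```python
def pair_by_residue_number(af_ca, exp_ca):
    '''Return two equal-length lists for residue numbers present in both inputs.

    af_ca and exp_ca map residue number -> CA atom (any object works here).
    '''
    shared = sorted(set(af_ca) & set(exp_ca))
    return [af_ca[n] for n in shared], [exp_ca[n] for n in shared]

# Strings stand in for Bio.PDB atoms in this toy example
af_ca = {1: 'af1', 2: 'af2', 3: 'af3', 4: 'af4'}
exp_ca = {2: 'exp2', 3: 'exp3', 5: 'exp5'}  # residues 1 and 4 unresolved
af_paired, exp_paired = pair_by_residue_number(af_ca, exp_ca)
print(af_paired, exp_paired)
```

With real structures, the dictionaries could be built as `{r.id[1]: r['CA'] for r in chain if 'CA' in r}` for a single chain, and the paired lists passed to `Superimposer.set_atoms` in place of the truncated ones.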
## Related Skills
- structural-biology/structure-io - Load and parse PDB/mmCIF files
- structural-biology/geometric-analysis - RMSD, superimposition
- database-access/uniprot-access - Get UniProt IDs for proteins
- structural-biology/structure-navigation - Navigate structure hierarchy