bio-pdb-structure-modification

Modify protein structures using Biopython Bio.PDB. Use when transforming coordinates, removing atoms or residues, adding new entities, modifying B-factors and occupancies, or building structures programmatically.

1,802 stars

Best use case

bio-pdb-structure-modification is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using bio-pdb-structure-modification should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/bio-pdb-structure-modification/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/bio-pdb-structure-modification/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/bio-pdb-structure-modification/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How bio-pdb-structure-modification Compares

| Feature / Agent | bio-pdb-structure-modification | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Modify protein structures using Biopython Bio.PDB. Use when transforming coordinates, removing atoms or residues, adding new entities, modifying B-factors and occupancies, or building structures programmatically.

Where can I find the source code?

The source code is on GitHub in the FreedomIntelligence/OpenClaw-Medical-Skills repository, under skills/bio-pdb-structure-modification.

SKILL.md Source

## Version Compatibility

Reference examples tested with: Biopython 1.83+, numpy 1.26+

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.
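The version check and signature introspection described above can also be scripted. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and `json.dumps` is only a stand-in for whichever function you actually need to inspect.

```python
import inspect
import json
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names are the names used with pip, e.g. 'biopython' and 'numpy'
for name in ('biopython', 'numpy'):
    print(name, installed_version(name))

# inspect.signature is a programmatic counterpart to help(module.function);
# json.dumps is used here only as a stand-in target
signature = inspect.signature(json.dumps)
print(signature)
```

Returning `None` rather than raising lets a calling script decide whether a missing package is fatal.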

# Structure Modification

**"Extract a chain from a PDB file"** → Remove/add atoms, residues, or chains; transform coordinates; modify B-factors and occupancies; build structures programmatically.
- Python: `Bio.PDB.PDBIO()` with `Select` subclass for filtering, `Bio.PDB.Superimposer()` for transforms

Transform coordinates, remove/add entities, modify properties, and build structures programmatically.
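As an illustration of the `Select`-based filtering mentioned above, here is a minimal sketch that writes only one chain. To stay self-contained it builds a toy two-chain structure in memory and writes to a string buffer instead of a file; the names `build_two_chain_structure` and `ChainSelect` are hypothetical helpers, not part of Biopython.

```python
import io

import numpy as np
from Bio.PDB import PDBIO, Select
from Bio.PDB.StructureBuilder import StructureBuilder


def build_two_chain_structure():
    """Build a tiny in-memory structure with chains A and B (toy coordinates)."""
    sb = StructureBuilder()
    sb.init_structure('toy')
    sb.init_model(0)
    serial = 1
    for chain_id, x in (('A', 0.0), ('B', 10.0)):
        sb.init_chain(chain_id)
        sb.init_seg('    ')
        sb.init_residue('ALA', ' ', 1, ' ')
        coord = np.array([x, 0.0, 0.0], dtype='f')
        sb.init_atom('CA', coord, 20.0, 1.0, ' ', ' CA ', serial, 'C')
        serial += 1
    return sb.get_structure()


class ChainSelect(Select):
    """Accept only chains whose ID is in chain_ids; everything else is skipped on write."""

    def __init__(self, chain_ids):
        self.chain_ids = set(chain_ids)

    def accept_chain(self, chain):
        return chain.id in self.chain_ids


structure = build_two_chain_structure()
writer = PDBIO()
writer.set_structure(structure)

# PDBIO.save accepts a file name or an open handle; a StringIO keeps the sketch file-free
buffer = io.StringIO()
writer.save(buffer, select=ChainSelect({'A'}))
atom_lines = [line for line in buffer.getvalue().splitlines() if line.startswith('ATOM')]
print('\n'.join(atom_lines))
```

Filtering at write time leaves the in-memory structure untouched, which is convenient when the same structure is exported several ways. `Select` also offers `accept_model`, `accept_residue`, and `accept_atom` hooks for finer control.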

## Required Imports

```python
from Bio.PDB import PDBParser, PDBIO
from Bio.PDB.StructureBuilder import StructureBuilder
from Bio.PDB.Structure import Structure
from Bio.PDB.Model import Model
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue
from Bio.PDB.Atom import Atom
import numpy as np
```

## Transforming Coordinates

```python
from Bio.PDB import PDBParser, PDBIO
import numpy as np

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Translate all atoms
translation = np.array([10.0, 0.0, 0.0])
for atom in structure.get_atoms():
    atom.coord = atom.coord + translation

# Save transformed structure
io = PDBIO()
io.set_structure(structure)
io.save('translated.pdb')
```

## Rotation Around Axis

```python
from Bio.PDB import PDBParser
import numpy as np

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Rotate around the Z axis by 90 degrees
angle = np.radians(90)

# Use the geometric center (unweighted mean of atom coordinates) as the rotation origin
coords = np.array([a.coord for a in structure.get_atoms()])
center = coords.mean(axis=0)

# Rotation matrix for a counter-clockwise rotation about Z (column-vector convention)
cos_a = np.cos(angle)
sin_a = np.sin(angle)
rot_matrix = np.array([
    [cos_a, -sin_a, 0],
    [sin_a, cos_a, 0],
    [0, 0, 1]
])

# Apply rotation around center
for atom in structure.get_atoms():
    atom.coord = np.dot(rot_matrix, atom.coord - center) + center
```
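For an axis other than Z, the matrix does not have to be written out by hand. The sketch below uses `rotaxis2m` and `Vector` from `Bio.PDB.vectors` and works on a plain coordinate array; the helper name `rotate_about_axis` is hypothetical.

```python
import numpy as np
from Bio.PDB.vectors import Vector, rotaxis2m


def rotate_about_axis(coords, axis, angle_rad, origin):
    """Rotate an (N, 3) coordinate array about `axis` passing through `origin`."""
    # rotaxis2m returns a 3x3 matrix; the axis vector is normalized internally
    rot = rotaxis2m(angle_rad, Vector(*axis))
    # Row vectors are multiplied by rot.T, which equals rot @ v for each column vector v
    return (coords - origin) @ rot.T + origin


coords = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
rotated = rotate_about_axis(
    coords, axis=(0.0, 0.0, 1.0), angle_rad=np.radians(90), origin=np.zeros(3)
)
print(np.round(rotated, 3))
```

To apply the result to a parsed structure, collect the atom coordinates into an array, rotate them, and assign each row back to the corresponding `atom.coord`.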

## Applying Transformation Matrix

```python
from Bio.PDB import PDBParser
import numpy as np

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# 4x4 transformation matrix (rotation + translation)
# From superimposition or external source
transform = np.array([
    [1.0, 0.0, 0.0, 10.0],
    [0.0, 1.0, 0.0, 5.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0]
])

rotation = transform[:3, :3]
translation = transform[:3, 3]

for atom in structure.get_atoms():
    atom.coord = np.dot(rotation, atom.coord) + translation
```
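When the transformation comes from a superimposition, `Bio.PDB.Superimposer` provides it as a `(rotation, translation)` pair. The following sketch uses made-up coordinates and a hypothetical `make_atoms` helper. One convention detail matters: `Superimposer.rotran` is meant for right-multiplication (`coord @ rot + tran`), so the rotation is transposed before being placed in a 4x4 matrix used as `np.dot(rotation, coord) + translation`.

```python
import numpy as np
from Bio.PDB.Atom import Atom
from Bio.PDB.Superimposer import Superimposer


def make_atoms(coords):
    """Wrap an (N, 3) array as standalone CA atoms (toy data, no parent residue)."""
    return [
        Atom('CA', np.array(xyz, dtype='f'), 20.0, 1.0, ' ', ' CA ', serial, 'C')
        for serial, xyz in enumerate(coords, start=1)
    ]


reference_xyz = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],
    [1.5, 1.5, 0.0],
    [0.0, 1.5, 1.0],
])

# Moving set: the reference rotated 90 degrees about Z, then shifted
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moving_xyz = reference_xyz @ rot_z.T + np.array([10.0, 5.0, 0.0])

fixed = make_atoms(reference_xyz)
moving = make_atoms(moving_xyz)

sup = Superimposer()
sup.set_atoms(fixed, moving)  # fits `moving` onto `fixed`
rot, tran = sup.rotran        # right-multiplication convention

# Assemble a 4x4 matrix in the column-vector convention
transform = np.eye(4)
transform[:3, :3] = rot.T
transform[:3, 3] = tran

# Move the atoms in place; equivalent to atom.transform(rot, tran) per atom
sup.apply(moving)
moved_xyz = np.array([atom.coord for atom in moving])
print(round(float(sup.rms), 4))
```

In practice `fixed` and `moving` would be equal-length lists of matched atoms (for example CA atoms of aligned residues) taken from two parsed structures.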

## Center Structure at Origin

```python
from Bio.PDB import PDBParser
import numpy as np

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Calculate geometric center (unweighted mean of atom coordinates)
coords = np.array([a.coord for a in structure.get_atoms()])
center = coords.mean(axis=0)

# Translate to origin
for atom in structure.get_atoms():
    atom.coord = atom.coord - center
```
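The mean of the coordinates is the geometric center, not the center of mass. If a mass-weighted center is wanted, each `Atom` carries a `mass` attribute derived from its element. A minimal sketch with two toy atoms, whose names and coordinates are invented for illustration:

```python
import numpy as np
from Bio.PDB.Atom import Atom

# Two toy atoms: a carbon at the origin and a heavier sulfur 2 angstroms along X
atoms = [
    Atom('C1', np.array([0.0, 0.0, 0.0], dtype='f'), 20.0, 1.0, ' ', ' C1 ', 1, 'C'),
    Atom('S1', np.array([2.0, 0.0, 0.0], dtype='f'), 20.0, 1.0, ' ', ' S1 ', 2, 'S'),
]

coords = np.array([atom.coord for atom in atoms])
masses = np.array([atom.mass for atom in atoms])

centroid = coords.mean(axis=0)                               # unweighted
center_of_mass = np.average(coords, axis=0, weights=masses)  # mass-weighted

print(centroid, center_of_mass)
```

The mass-weighted center is pulled toward the sulfur. For all-atom protein structures the two centers are usually close, so the simpler unweighted mean is often sufficient for centering.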

## Removing Atoms

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Remove hydrogens
for residue in structure.get_residues():
    atoms_to_remove = [a.id for a in residue if a.element == 'H']
    for atom_id in atoms_to_remove:
        residue.detach_child(atom_id)

io = PDBIO()
io.set_structure(structure)
io.save('no_hydrogens.pdb')
```

## Removing Residues

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Remove water molecules
for model in structure:
    for chain in model:
        residues_to_remove = [r.id for r in chain if r.id[0] == 'W']
        for res_id in residues_to_remove:
            chain.detach_child(res_id)

io = PDBIO()
io.set_structure(structure)
io.save('no_water.pdb')
```

## Removing a Chain

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Remove chain B
model = structure[0]
if model.has_id('B'):
    model.detach_child('B')

io = PDBIO()
io.set_structure(structure)
io.save('without_chain_B.pdb')
```

## Modifying B-factors

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Set all B-factors to same value
for atom in structure.get_atoms():
    atom.bfactor = 20.0

# Or set based on residue property (e.g., conservation score)
conservation_scores = {100: 9.0, 101: 5.0, 102: 3.0}  # resnum -> score
for residue in structure.get_residues():
    resnum = residue.id[1]
    score = conservation_scores.get(resnum, 5.0)
    for atom in residue:
        atom.bfactor = score * 10  # Scale to B-factor range

io = PDBIO()
io.set_structure(structure)
io.save('modified_bfactor.pdb')
```

## Modifying Occupancy

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Set occupancy for specific chain
for atom in structure[0]['A'].get_atoms():
    atom.occupancy = 0.5

io = PDBIO()
io.set_structure(structure)
io.save('modified_occupancy.pdb')
```

## Renumbering Residues

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

chain = structure[0]['A']

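# Renumber sequentially starting from 1.
# Detach every residue first: assigning an id that a sibling still holds
# raises ValueError (e.g. when the original numbering starts below 1).
residues = list(chain)
for residue in residues:
    chain.detach_child(residue.id)

for i, residue in enumerate(residues, start=1):
    hetfield, _, icode = residue.id
    residue.id = (hetfield, i, icode)
    chain.add(residue)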

io = PDBIO()
io.set_structure(structure)
io.save('renumbered.pdb')
```

## Changing Chain ID

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Rename chain A to X
model = structure[0]
chain = model['A']
chain.id = 'X'

io = PDBIO()
io.set_structure(structure)
io.save('renamed_chain.pdb')
```

## Building Structure from Scratch

```python
from Bio.PDB.Structure import Structure
from Bio.PDB.Model import Model
from Bio.PDB.Chain import Chain
from Bio.PDB.Residue import Residue
from Bio.PDB.Atom import Atom
from Bio.PDB import PDBIO
import numpy as np

# Create hierarchy
structure = Structure('new_struct')
model = Model(0)
chain = Chain('A')
residue = Residue((' ', 1, ' '), 'ALA', '')

# Add atoms
ca = Atom('CA', np.array([0.0, 0.0, 0.0]), 20.0, 1.0, ' ', 'CA', 1, 'C')
cb = Atom('CB', np.array([1.5, 0.0, 0.0]), 20.0, 1.0, ' ', 'CB', 2, 'C')

residue.add(ca)
residue.add(cb)
chain.add(residue)
model.add(chain)
structure.add(model)

io = PDBIO()
io.set_structure(structure)
io.save('new_structure.pdb')
```

## Using StructureBuilder

```python
from Bio.PDB import PDBIO
from Bio.PDB.StructureBuilder import StructureBuilder
import numpy as np

sb = StructureBuilder()
sb.init_structure('built')
sb.init_model(0)
sb.init_chain('A')
sb.init_seg(' ')

# Add residue with atoms
sb.init_residue('ALA', ' ', 1, ' ')
sb.init_atom('N', np.array([-1.0, 0.0, 0.0]), 20.0, 1.0, ' ', 'N', 1, 'N')
sb.init_atom('CA', np.array([0.0, 0.0, 0.0]), 20.0, 1.0, ' ', 'CA', 2, 'C')
sb.init_atom('C', np.array([1.0, 0.0, 0.0]), 20.0, 1.0, ' ', 'C', 3, 'C')
sb.init_atom('O', np.array([1.5, 1.0, 0.0]), 20.0, 1.0, ' ', 'O', 4, 'O')

sb.init_residue('GLY', ' ', 2, ' ')
sb.init_atom('N', np.array([1.5, -1.0, 0.0]), 20.0, 1.0, ' ', 'N', 5, 'N')
sb.init_atom('CA', np.array([2.5, -1.0, 0.0]), 20.0, 1.0, ' ', 'CA', 6, 'C')
sb.init_atom('C', np.array([3.5, -1.0, 0.0]), 20.0, 1.0, ' ', 'C', 7, 'C')
sb.init_atom('O', np.array([4.0, 0.0, 0.0]), 20.0, 1.0, ' ', 'O', 8, 'O')

structure = sb.get_structure()

io = PDBIO()
io.set_structure(structure)
io.save('built_structure.pdb')
```

## Adding a Residue to Existing Chain

```python
from Bio.PDB import PDBParser, PDBIO
from Bio.PDB.Residue import Residue
from Bio.PDB.Atom import Atom
import numpy as np

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

chain = structure[0]['A']

# Create new residue
new_residue = Residue((' ', 999, ' '), 'ALA', '')
ca = Atom('CA', np.array([50.0, 50.0, 50.0]), 20.0, 1.0, ' ', 'CA', 9999, 'C')
new_residue.add(ca)

# Add to chain
chain.add(new_residue)

io = PDBIO()
io.set_structure(structure)
io.save('with_new_residue.pdb')
```

## Copying a Chain

```python
from Bio.PDB import PDBParser, PDBIO

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Deep copy chain A as chain B
model = structure[0]
if model.has_id('B'):
    raise ValueError('Chain B already exists; pick an unused chain ID')

# Entity.copy() duplicates the residues and atoms; the copy has no parent yet
new_chain = model['A'].copy()
new_chain.id = 'B'
model.add(new_chain)

io = PDBIO()
io.set_structure(structure)
io.save('duplicated_chain.pdb')
```

## Extract and Save Specific Residues

```python
from Bio.PDB import PDBParser, PDBIO
from Bio.PDB.Structure import Structure
from Bio.PDB.Model import Model
from Bio.PDB.Chain import Chain

parser = PDBParser(QUIET=True)
structure = parser.get_structure('protein', 'protein.pdb')

# Extract residues 50-100
new_structure = Structure('subset')
new_model = Model(0)
new_chain = Chain('A')

for residue in structure[0]['A']:
    if 50 <= residue.id[1] <= 100 and residue.id[0] == ' ':
        new_chain.add(residue.copy())

new_model.add(new_chain)
new_structure.add(new_model)

io = PDBIO()
io.set_structure(new_structure)
io.save('residues_50_100.pdb')
```

## Merge Two Structures

```python
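from Bio.PDB import PDBParser, PDBIO
import string

parser = PDBParser(QUIET=True)
struct1 = parser.get_structure('s1', 'structure1.pdb')
struct2 = parser.get_structure('s2', 'structure2.pdb')

# Add chains from struct2 to struct1, renaming only when the ID is already taken
candidates = string.ascii_uppercase + string.ascii_lowercase + string.digits
used_ids = {chain.id for chain in struct1[0]}

for chain in list(struct2[0]):
    struct2[0].detach_child(chain.id)  # Detach first so the ID can change freely
    if chain.id in used_ids:
        chain.id = next(c for c in candidates if c not in used_ids)
    used_ids.add(chain.id)
    struct1[0].add(chain)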

io = PDBIO()
io.set_structure(struct1)
io.save('merged.pdb')
```

## Related Skills

- structure-io - Parse and write structure files
- structure-navigation - Access chains, residues, atoms
- geometric-analysis - Calculate distances, angles, RMSD
- sequence-manipulation/seq-objects - Generate sequences from modified structures
