cell-free-expression

Guidance for cell-free protein synthesis (CFPS) optimization. Use when: (1) Planning CFPS experiments, (2) Troubleshooting low yield or aggregation, (3) Optimizing DNA template design for CFPS, (4) Expressing difficult proteins (disulfide-rich, toxic, membrane).

by FreedomIntelligence

Best use case

cell-free-expression is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using cell-free-expression should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/cell-free-expression/SKILL.md --create-dirs "https://raw.githubusercontent.com/FreedomIntelligence/OpenClaw-Medical-Skills/main/skills/cell-free-expression/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/cell-free-expression/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

SKILL.md Source

# Cell-Free Protein Synthesis (CFPS)

## System Selection Guide

| System | Best For | Yield | PTMs | Disulfides | Cost |
|--------|----------|-------|------|------------|------|
| **E. coli extract** | Rapid prototyping, prokaryotic proteins | High (100-400 μg/mL) | None | Poor (reducing) | Low |
| **E. coli PURE** | Defined conditions, unnatural AAs | Medium (50-150 μg/mL) | None | Controllable | High |
| **Wheat germ** | Eukaryotic proteins, membrane proteins | High (100-500 μg/mL) | Limited | Moderate | Medium |
| **Rabbit reticulocyte** | Mammalian proteins, post-translational studies | Low (10-50 μg/mL) | Some | Poor | High |
| **Insect (Sf21)** | Glycoproteins, complex folds | Medium (50-100 μg/mL) | Glycosylation | Good | High |
| **HeLa/CHO** | Native mammalian proteins | Low (10-50 μg/mL) | Full mammalian | Good | Very High |

---

## CFPS Troubleshooting Matrix

| Problem | Likely Causes | Design Fix | Reagent Fix |
|---------|---------------|------------|-------------|
| **No expression** | Rare codons at N-terminus, poor RBS | Codon optimize first 30 codons | Use BL21-CodonPlus extract |
| **Low yield** | Strong mRNA secondary structure, template issues | Optimize 5' UTR (ΔG > -5 kcal/mol) | Titrate Mg²⁺ (10-18 mM), increase ATP |
| **Aggregation** | Hydrophobic protein, fast translation | Add solubility tags (MBP, SUMO) | Add 0.1% Tween-20, chaperones |
| **Inactive protein** | Misfolding, missing cofactors | Slow translation (use rare codons!) | Add GroEL/ES, DnaK/J |
| **Truncation** | Rare codon clusters, mRNA instability | Remove AGG/AGA/CUA clusters | Supplement rare tRNAs |
| **Degradation** | Proteolysis | N-terminal Met-Ala | Add protease inhibitors |

---

## Codon Optimization for CFPS

### Codons to Avoid in E. coli CFPS

| Codon | Amino Acid | Issue | Codon Usage Frequency |
|-------|------------|-------|----------------|
| AGG | Arg | Very rare, stalling | 0.2% |
| AGA | Arg | Very rare, stalling | 0.4% |
| CUA | Leu | Low abundance | 0.4% |
| AUA | Ile | Rare | 0.5% |
| CGA | Arg | Inefficient decoding | 0.6% |
| CCC | Pro | Can cause pausing | 0.5% |
| GGA | Gly | Moderate | 1.1% |

### Design Rules

1. **First 30 codons**: Most critical - use only high-frequency codons
2. **Rare codon clusters**: Avoid 2+ rare codons within 10 nt
3. **Rare codon content**: Keep overall <5% of coding sequence
4. **GC content**: Target 40-60% for balanced expression
5. **Avoid runs**: No >6 consecutive G or C residues (secondary structure)
6. **Strategic slow codons**: Place rare codons between domains (aids folding!)
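
The rules above can be checked mechanically before ordering a template. The following is a minimal sketch, not a validated tool: the function names are hypothetical, the rare-codon set is simply the codons listed in the table above, and the "within 10 nt" cluster rule is approximated as a sliding window of 3 codons (9 nt), which is an assumption.

```python
import re

# Codons from the "Codons to Avoid" table, written in the DNA alphabet.
RARE_CODONS = {"AGG", "AGA", "CTA", "ATA", "CGA", "CCC", "GGA"}


def split_codons(cds):
    """Split a coding sequence into codons, accepting DNA or RNA letters."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def check_design_rules(cds, window_codons=3):
    """Return simple metrics for the design rules (hypothetical helper).

    window_codons=3 is an assumed approximation of the 10 nt cluster rule.
    """
    seq = cds.upper().replace("U", "T")
    codons = split_codons(seq)
    is_rare = [codon in RARE_CODONS for codon in codons]

    # Window start indices (in codons) that contain two or more rare codons.
    cluster_starts = [
        i for i in range(len(codons) - window_codons + 1)
        if sum(is_rare[i:i + window_codons]) >= 2
    ]

    return {
        "rare_in_first_30": sum(is_rare[:30]),
        "rare_fraction": sum(is_rare) / len(codons),
        "cluster_starts": cluster_starts,
        "gc_fraction": sum(seq.count(base) for base in "GC") / len(seq),
        # Rule 5: flag any run of more than 6 consecutive G or C residues.
        "long_gc_run": re.search(r"[GC]{7,}", seq) is not None,
    }


if __name__ == "__main__":
    report = check_design_rules("ATGAGGAGA" + "GCT" * 30 + "TAA")
    for name, value in report.items():
        print(name, value)
```

Rule 6 (deliberately placed slow codons) is not modeled here, so rare codons at domain boundaries will still be counted and should be reviewed by hand.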

### When to Use Rare Codons
- Domain boundaries (allow cotranslational folding)
- Before complex structural elements
- When protein is prone to misfolding

---

## mRNA Template Design

### 5' UTR Optimization

| Element | Optimal Design | Impact |
|---------|----------------|--------|
| **RBS (SD sequence)** | AGGAGG, 7-9 nt from start | Ribosome binding |
| **Spacing** | 7 nt between SD and AUG | Translation initiation |
| **Secondary structure** | ΔG > -5 kcal/mol | Accessibility |
| **Upstream AUG** | Avoid (causes false starts) | Reduces truncations |
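
The sequence-level entries in this table (SD motif, spacing, upstream AUG) can be checked with plain string operations. This is a sketch under stated assumptions: `check_utr` is a hypothetical helper, the input is taken to be the sequence immediately upstream of the start codon, and spacing is counted as the number of nucleotides between the last SD base and the first base of the start codon. Secondary-structure ΔG is not computed here; that needs an RNA folding tool.

```python
def check_utr(utr, sd="AGGAGG", spacing_range=(7, 9)):
    """Check SD presence, SD-to-start spacing, and upstream ATG triplets.

    utr: sequence directly upstream of the start codon (start codon excluded).
    The spacing convention used here is an assumption, not a standard.
    """
    seq = utr.upper().replace("U", "T")
    sd_pos = seq.rfind(sd)  # use the SD copy closest to the start codon
    spacing = None if sd_pos < 0 else len(seq) - (sd_pos + len(sd))
    low, high = spacing_range
    return {
        "has_sd": sd_pos >= 0,
        "spacing_nt": spacing,
        "spacing_ok": spacing is not None and low <= spacing <= high,
        # Any ATG in the UTR is a potential false start, in any frame.
        "upstream_atg": [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"],
    }


if __name__ == "__main__":
    print(check_utr("TTTAAGAAGGAGGATATACC"))
```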

### Secondary Structure Targets

| Region | Ideal ΔG | Impact |
|--------|----------|--------|
| -30 to +30 around AUG | > -5 kcal/mol | Translation initiation |
| Full 5' UTR | > -10 kcal/mol | Ribosome loading |
| RBS accessibility | Unpaired | Critical |

### Template Format

| Format | Advantages | Disadvantages |
|--------|------------|---------------|
| **Plasmid** | Stable, high yield | Requires cloning |
| **Linear PCR** | Fast, no cloning | May need stabilization |
| **mRNA** | Direct translation | Unstable, expensive |

---

## Disulfide Bond Formation

### System Capabilities

| System | Native Disulfide Support | Additives Needed |
|--------|--------------------------|------------------|
| Standard E. coli extract | Poor (DTT present) | IAM, PDI, GSSG/GSH |
| Oxidizing E. coli extract | Good | Pre-oxidized glutathione |
| Wheat germ | Moderate | Lower DTT, add PDI |
| PURE system | Minimal | Full oxidative system |
| Insect/Mammalian | Good | Microsome membranes |

### Oxidative Folding Protocol (E. coli extract)

```
1. Deplete DTT from extract (dialysis or treatment with IAM 5 mM)
2. Add oxidized/reduced glutathione: 4 mM GSSG, 1 mM GSH (4:1 ratio)
3. Add 10 μM PDI (protein disulfide isomerase)
4. Optional: Add 5 μM DsbC (disulfide isomerase)
5. Express at 25°C (not 37°C) for better folding
6. Incubation time: 4-6 hours
```
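
To turn the final concentrations in steps 2-4 into pipetting volumes, the usual dilution relation C1·V1 = C2·V2 applies. The sketch below is illustrative only: the stock concentrations are hypothetical placeholders (substitute your own), and `additive_volumes` is not part of any published protocol.

```python
# Final concentrations from the protocol above, all expressed in micromolar.
FINAL_UM = {"GSSG": 4000.0, "GSH": 1000.0, "PDI": 10.0, "DsbC": 5.0}

# Hypothetical stock concentrations in micromolar; replace with real stocks.
STOCK_UM = {"GSSG": 100_000.0, "GSH": 100_000.0, "PDI": 200.0, "DsbC": 100.0}


def additive_volumes(reaction_ul, final_um=FINAL_UM, stock_um=STOCK_UM):
    """Return microliters of each stock to add, using C1*V1 = C2*V2."""
    volumes = {}
    for name, final in final_um.items():
        stock = stock_um[name]
        if stock <= final:
            raise ValueError(f"{name} stock must exceed the target concentration")
        volumes[name] = reaction_ul * final / stock
    return volumes


if __name__ == "__main__":
    vols = additive_volumes(50.0)
    for name, ul in vols.items():
        print(f"{name}: {ul:.2f} uL")
    print(f"Total additive volume: {sum(vols.values()):.2f} uL")
```

With these assumed stocks the additives take up a noticeable share of a 50 µL reaction, so more concentrated stocks may be preferable to avoid diluting the extract.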

### Disulfide-Rich Protein Tips
- Start with wheat germ or oxidizing extract
- Use PURE system for precise control
- Consider co-expression of PDI/DsbC
- Verify by non-reducing SDS-PAGE

---

## Expression Prediction from Sequence

| Feature | Good | Marginal | Bad |
|---------|------|----------|-----|
| **Rare codon content** | <3% | 3-8% | >10% |
| **First 30 codons rare** | 0 | 1-2 | >2 |
| **GC content** | 45-55% | 35-45% or 55-65% | <30% or >70% |
| **5' UTR ΔG** | > -3 kcal/mol | -3 to -8 | < -10 kcal/mol |
| **Hydrophobic stretches** | <5 consecutive | 5-7 | >8 consecutive |
| **N-terminal residue** | Met-Ala, Met-Ser, Met-Gly | Met-Val, Met-Thr | Met-Arg, Met-Lys |
| **Cysteine pairs** | Paired (even number) | Mixed | Odd number (free thiols) |
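
The table can be applied as a simple lookup once the features have been measured. The sketch below uses hypothetical function names and takes the thresholds exactly as written above; values that fall between the listed bands (for example a rare codon content of 9%) are reported as "unclassified" rather than guessed. The cysteine row is left out because "Mixed" cannot be decided from a count alone.

```python
def band(value, good, marginal, bad):
    """Return the first matching band name, or 'unclassified' if none match."""
    for label, accepts in (("good", good), ("marginal", marginal), ("bad", bad)):
        if accepts(value):
            return label
    return "unclassified"


def classify_features(rare_pct, rare_first30, gc_pct, utr_dg,
                      hydrophobic_run, second_residue):
    """Map measured features onto the Good / Marginal / Bad bands.

    rare_pct and gc_pct are percentages, utr_dg is in kcal/mol,
    second_residue is the one-letter residue that follows the initial Met.
    """
    return {
        "rare_codon_content": band(
            rare_pct, lambda v: v < 3, lambda v: 3 <= v <= 8, lambda v: v > 10),
        "first_30_rare": band(
            rare_first30, lambda v: v == 0, lambda v: 1 <= v <= 2, lambda v: v > 2),
        # "good" is tested first, so the marginal test only sees 35-45 and 55-65.
        "gc_content": band(
            gc_pct, lambda v: 45 <= v <= 55, lambda v: 35 <= v <= 65,
            lambda v: v < 30 or v > 70),
        "utr_dg": band(
            utr_dg, lambda v: v > -3, lambda v: -8 <= v <= -3, lambda v: v < -10),
        "hydrophobic_stretch": band(
            hydrophobic_run, lambda v: v < 5, lambda v: 5 <= v <= 7, lambda v: v > 8),
        "n_terminal": band(
            second_residue.upper(), lambda r: r in {"A", "S", "G"},
            lambda r: r in {"V", "T"}, lambda r: r in {"R", "K"}),
    }


if __name__ == "__main__":
    print(classify_features(2.0, 1, 50.0, -9.0, 9, "A"))
```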

---

## Solubility Enhancement Strategies

### Fusion Tags (ranked by effectiveness)

| Tag | Size | Solubility Enhancement | Cleavage | Notes |
|-----|------|------------------------|----------|-------|
| **MBP** | 40 kDa | Excellent | TEV, Factor Xa | Best overall |
| **SUMO** | 11 kDa | Very Good | SUMO protease | Native N-terminus after cleavage |
| **NusA** | 55 kDa | Excellent | - | Large size |
| **Trx** | 12 kDa | Good | Enterokinase | For disulfide proteins |
| **GST** | 26 kDa | Moderate | - | Dimeric |
| **His₆** | 1 kDa | Minimal | - | Mainly for purification |

### Buffer Additives for Solubility

| Additive | Concentration | Mechanism |
|----------|---------------|-----------|
| Trehalose | 50-100 mM | Chemical chaperone |
| Glycerol | 5-10% | Reduces hydrophobic aggregation |
| L-Arginine | 50-100 mM | Suppresses aggregation |
| Tween-20 | 0.05-0.1% | Prevents surface adsorption |
| Proline | 50 mM | Osmolyte stabilization |

### Chaperone Supplementation

| Chaperone System | Target Problem | Concentration |
|------------------|----------------|---------------|
| GroEL/GroES | General folding | 1-2 μM |
| DnaK/DnaJ/GrpE | Aggregation-prone | 1 μM each |
| Trigger Factor | Nascent chain | 1-2 μM |
| ClpB | Aggregate resolubilization | 0.5 μM |

---

## Temperature Optimization

| Temperature | Use Case | Trade-offs |
|-------------|----------|------------|
| **37°C** | Fast expression, stable proteins | Higher aggregation risk |
| **30°C** | Balanced (default) | Good compromise |
| **25°C** | Disulfide proteins, complex folds | Slower, better folding |
| **18-20°C** | Aggregation-prone proteins | Much slower, best folding |
| **16°C** | Cold-shock proteins | Very slow, specialized |

---

## E. coli Extract Preparation (Key Variables)

| Variable | Impact | Optimal Range |
|----------|--------|---------------|
| **Cell density at harvest** | Ribosome content | OD₆₀₀ 2.5-3.5 |
| **Lysis method** | Extract activity | Sonication, bead beating |
| **Run-off reaction** | Removes endogenous mRNA | 20-80 min at 37°C |
| **Mg²⁺ concentration** | Translation fidelity | 10-18 mM |
| **K⁺ concentration** | Translation rate | 150-200 mM |
| **Energy system** | Sustained synthesis | ATP/GTP, creatine phosphate |
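
Because the Mg²⁺ and K⁺ optima vary between extract batches, the two ranges above are often screened together. The sketch below enumerates one possible grid; the step sizes (2 mM for Mg²⁺, 25 mM for K⁺) and the function name are assumptions chosen for illustration, not recommended values.

```python
from itertools import product


def screening_grid(mg_range=(10, 18), k_range=(150, 200), mg_step=2, k_step=25):
    """List every Mg/K combination (in mM) across the given inclusive ranges."""
    mg_values = range(mg_range[0], mg_range[1] + 1, mg_step)
    k_values = range(k_range[0], k_range[1] + 1, k_step)
    return [{"Mg_mM": mg, "K_mM": k} for mg, k in product(mg_values, k_values)]


if __name__ == "__main__":
    grid = screening_grid()
    print(f"{len(grid)} conditions")
    for condition in grid:
        print(condition)
```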

---

## PURE System Specifics

### Advantages
- Defined composition (no proteases/nucleases)
- Linear DNA templates work well
- Unnatural amino acid incorporation
- Reproducible between batches

### Limitations
- No chaperones (add separately)
- No post-translational modifications
- Lower yields than crude extracts
- Higher cost

### When to Use PURE
- Unnatural amino acid incorporation
- Studying translation mechanisms
- "Clean" proteins needed
- Protease-sensitive targets
- Linear template expression

---

## Common Artifacts and Solutions

### Low Molecular Weight Bands
**Causes**: Premature termination, proteolysis, internal initiation
**Solutions**:
- Optimize rare codon clusters
- Add protease inhibitors
- Check for internal AUG codons
- Use PURE system
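
For the internal AUG check, it helps to list each in-frame internal start codon together with the size of the product it would give, so the list can be compared against the observed bands. This is a rough sketch: `internal_start_products` is a hypothetical helper, and the mass estimate uses the common rule of thumb of about 110 Da per residue, so it is approximate.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (approximation)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_start_products(cds):
    """Find in-frame internal ATG codons and estimate the resulting products.

    Returns (codon_index, product_length_in_residues, approx_kDa) tuples.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # the stop codon does not encode a residue
    total = len(codons)
    return [
        (i, total - i, round((total - i) * AVG_RESIDUE_DA / 1000, 1))
        for i in range(1, total)  # index 0 is the intended start codon
        if codons[i] == "ATG"
    ]


if __name__ == "__main__":
    example = "ATG" + "GCT" * 9 + "ATG" + "GCT" * 9 + "TAA"
    print(internal_start_products(example))
```

An internal ATG is only a candidate; initiation there is more plausible when an SD-like motif sits a few nucleotides upstream of it.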

### Higher MW Bands
**Causes**: Incomplete termination, read-through, aggregation
**Solutions**:
- Ensure strong stop codon (UAA preferred)
- Check template 3' end
- Add release factors (RF1/RF2)
- Reduce protein concentration

### No Soluble Protein
**Causes**: Aggregation during synthesis
**Solutions**:
- Lower temperature (25°C → 18°C)
- Add chaperones
- Use solubility tag
- Optimize translation rate

---

## References

### CFPS Overview
- [User's Guide to CFPS - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC6481089/)
- [Optimising Protein Synthesis in Cell-Free Systems - PMC](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9996726/)
- [CFPS Systems Comparison - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC8258279/)

### Extract Preparation
- [Crude Extract Preparation - MDPI Methods](https://www.mdpi.com/2409-9279/2/3/68)
- [Simple Rapid Cell-Free Lysate - PLOS One](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0165137)
- [High-Throughput Extract Preparation - Nature Scientific Reports](https://www.nature.com/articles/srep08663)

### PURE System
- [PURE System Evolution - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC10521753/)
- [PURE System for Membrane Proteins - Nature Protocols](https://www.nature.com/articles/nprot.2015.082)

### Wheat Germ
- [Wheat Germ Systems Review - FEBS Letters](https://febs.onlinelibrary.wiley.com/doi/10.1016/j.febslet.2014.05.061)
- [Wheat Germ for Structural Biology - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC8027086/)

### Codon Optimization
- [Rare Codons and Solubility - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC2723077/)
- [Codon Influence on Expression - Nature](https://www.nature.com/articles/nature16509)
- [Synonymous Codon Substitutions Perturb Folding - PNAS](https://www.pnas.org/doi/10.1073/pnas.1907126117)

### Disulfide Formation
- [Oxidative Protein Folding in ER - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC4312752/)
- [PDI-Regulated Disulfide Formation - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC7794689/)

### Solubility Tags
- [SUMO Fusion for Difficult Proteins - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC7129290/)
- [Fusion Tags Review - Frontiers Microbiol](https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2014.00063/full)

### Temperature Effects
- [Cold Shock Promoters - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC9800685/)
- [Strategies to Optimize E. coli Expression - PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC7162232/)
