soul2dna

Compile SOUL.md character profiles into synthetic diploid genomes (.genome.json) via trait-to-allele mapping

658 stars

by ClawBio

GitHub: ClawBio/ClawBio

Best use case

soul2dna is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using soul2dna should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/soul2dna/SKILL.md --create-dirs "https://raw.githubusercontent.com/ClawBio/ClawBio/main/skills/soul2dna/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it at .claude/skills/soul2dna/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How soul2dna Compares

Feature / Agent	soul2dna	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It compiles SOUL.md character profiles into synthetic diploid genomes (.genome.json files) by mapping each trait score to alleles at defined loci.

Where can I find the source code?

The source code is in the ClawBio/ClawBio repository on GitHub, under skills/soul2dna/.

SKILL.md Source

# 🧬 Soul2DNA Compiler

## Purpose

Compile SOUL.md character profiles into synthetic diploid genomes. Each soul file
describes a historical or fictional figure with trait scores (0.0 to 1.0). The
compiler maps these scores to alleles at defined loci using additive, dominant, or
recessive inheritance models, producing a `.genome.json` file per character.

## How It Works

1. **Parse SOUL.md** files (`*.soul.md`) from `GENOMEBOOK/DATA/SOULS/`, extracting
   identity metadata (name, sex, ancestry, domain, era) and trait scores.
2. **Load trait registry** (`GENOMEBOOK/DATA/trait_registry.json`) which defines
   loci, alleles, chromosomal positions, dominance models, and effect sizes for
   each trait.
3. **Assign genotypes** at each locus based on trait score thresholds:
   - Additive: <0.33 ref/ref, 0.33-0.66 ref/alt, >0.66 alt/alt
   - Dominant: <0.40 ref/ref, 0.40-0.75 ref/alt, >0.75 alt/alt
   - Recessive: <0.50 ref/ref, 0.50-0.80 ref/alt, >0.80 alt/alt
4. **Write genome** as JSON with full locus detail, trait scores, and metadata.
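
The threshold rules in step 3 can be sketched as a small lookup. This is a minimal illustration, not the code in `soul2dna.py`: the function name `assign_genotype`, the `THRESHOLDS` table, and the `ref`/`alt` parameters are hypothetical. It assumes the middle band is inclusive at both ends, which is one reading of the ranges above.

```python
from typing import Tuple

# (low, high) score cut-offs per inheritance model, copied from step 3.
# The table name and layout are assumptions made for this sketch.
THRESHOLDS = {
    "additive": (0.33, 0.66),
    "dominant": (0.40, 0.75),
    "recessive": (0.50, 0.80),
}


def assign_genotype(score: float, model: str, ref: str, alt: str) -> Tuple[str, str]:
    """Map a 0.0-1.0 trait score to a diploid genotype (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"trait score out of range: {score}")
    if model not in THRESHOLDS:
        raise ValueError(f"unknown inheritance model: {model}")
    low, high = THRESHOLDS[model]
    if score < low:
        return (ref, ref)   # below the lower cut-off: homozygous reference
    if score <= high:
        return (ref, alt)   # middle band (inclusive here): heterozygous
    return (alt, alt)       # above the upper cut-off: homozygous alternate


# Example: the same score can yield different genotypes under different models.
for model in THRESHOLDS:
    print(model, assign_genotype(0.70, model, "A", "G"))
```

With a score of 0.70, the additive model returns alt/alt, while the dominant and recessive models both return ref/alt, because their upper cut-offs (0.75 and 0.80) sit higher.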

## Input

- `GENOMEBOOK/DATA/SOULS/*.soul.md` (20 historical figures)
- `GENOMEBOOK/DATA/trait_registry.json`

## Output

- `GENOMEBOOK/DATA/GENOMES/<name>-g0.genome.json` per character

## CLI Usage

```bash
# Compile all souls to genomes
python skills/soul2dna/soul2dna.py

# Demo mode (shows summary without writing files)
python skills/soul2dna/soul2dna.py --demo
```

## Output Format

Each `.genome.json` contains:

```json
{
  "id": "einstein-g0",
  "name": "Albert Einstein",
  "sex": "Male",
  "sex_chromosomes": "XY",
  "ancestry": "...",
  "generation": 0,
  "parents": [null, null],
  "loci": { "<locus_id>": { "chromosome": "...", "alleles": ["A","G"], ... } },
  "trait_scores": { "curiosity": 0.95, ... }
}
```
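
As a rough consumer-side sketch, a genome in this shape can be summarised by zygosity. The helper `zygosity_summary`, the locus IDs `locus_a` and `locus_b`, and the sample values are hypothetical placeholders. Only the `loci` and `alleles` keys shown above are relied on.

```python
import json
from collections import Counter


def zygosity_summary(genome: dict) -> Counter:
    """Count homozygous vs. heterozygous loci in a parsed genome dict."""
    counts: Counter = Counter()
    for locus in genome.get("loci", {}).values():
        first, second = locus["alleles"]  # diploid: exactly two alleles per locus
        counts["heterozygous" if first != second else "homozygous"] += 1
    return counts


# Illustrative stand-in for a real .genome.json file; the locus IDs and
# chromosome values are made up for this sketch.
sample = json.loads("""
{
  "id": "einstein-g0",
  "generation": 0,
  "parents": [null, null],
  "loci": {
    "locus_a": {"chromosome": "1", "alleles": ["A", "G"]},
    "locus_b": {"chromosome": "2", "alleles": ["T", "T"]}
  }
}
""")

summary = zygosity_summary(sample)
print(dict(summary))
```

To run it on real output, replace the inline sample with `json.load` on a file from `GENOMEBOOK/DATA/GENOMES/`.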

Related Skills

wes-clinical-report-es

658

from ClawBio/ClawBio

Generates professional clinical PDF reports in Spanish from WES (Whole Exome Sequencing) data with clinical interpretation, pharmacogenomic alerts, and follow-up recommendations.

wes-clinical-report-en

658

from ClawBio/ClawBio

Generates professional clinical PDF reports in English from WES (Whole Exome Sequencing) data with clinical interpretation summary, pharmacogenomic alerts, and follow-up recommendations.

vcf-annotator

658

from ClawBio/ClawBio

Annotate VCF variants with VEP, ClinVar, gnomAD frequencies, and ancestry-aware context. Generates prioritised variant reports.

variant-annotation

658

from ClawBio/ClawBio

Annotate VCF variants with Ensembl VEP REST, ClinVar significance, gnomAD/population frequency context, and prioritized variant ranking.

ukb-navigator

658

from ClawBio/ClawBio

Semantic search across UK Biobank's 12,000+ data fields and publications — find the right variables for your research question.

target-validation-scorer

658

from ClawBio/ClawBio

Evidence-grounded target validation scoring with GO/NO-GO decisions for drug discovery campaigns

struct-predictor

658

from ClawBio/ClawBio

Protein structure prediction with Boltz-2. Accepts YAML inputs (single protein or multi-chain complex), runs boltz predict, extracts per-residue pLDDT and PAE confidence, and writes a markdown report with figures.

seq-wrangler

658

from ClawBio/ClawBio

Sequence QC, alignment, and BAM processing. Wraps FastQC, BWA/Bowtie2, SAMtools for automated read-to-BAM pipelines.

scrna-orchestrator

658

from ClawBio/ClawBio

Local Scanpy pipeline for single-cell RNA-seq QC, optional doublet detection, clustering, marker discovery, optional CellTypist annotation, optional latent downstream mode from integrated.h5ad/X_scvi, and optional dataset-level plus within-cluster contrastive marker analysis from raw-count .h5ad or 10x Matrix Market input.

scrna-embedding

658

from ClawBio/ClawBio

Local scVI/scANVI-based single-cell latent embedding and batch-aware integration from raw-count .h5ad or 10x Matrix Market input, with stable integrated AnnData export for downstream latent analysis.

rnaseq-de

658

from ClawBio/ClawBio

Differential expression analysis for bulk RNA-seq and pseudo-bulk count matrices with QC, PCA, and contrast testing.

repro-enforcer

658

from ClawBio/ClawBio

Export any bioinformatics analysis as a reproducible bundle with Conda environment, Singularity container definition, and Nextflow pipeline.