Scientific Diagram Generation

AI-powered scientific illustration generation using Gemini Image models. Creates publication-quality mechanism diagrams, pathway illustrations, and scientific figures.

912 stars

by wu-yc

Best use case

Scientific Diagram Generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Scientific Diagram Generation can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/scientific-diagram-generation/SKILL.md --create-dirs "https://raw.githubusercontent.com/wu-yc/LabClaw/main/skills/general/scientific-diagram-generation/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/scientific-diagram-generation/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How Scientific Diagram Generation Compares

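| Feature / Agent | Scientific Diagram Generation | Standard Approach |
|-----------------|-------------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |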

Frequently Asked Questions

What does this skill do?

AI-powered scientific illustration generation using Gemini Image models. Creates publication-quality mechanism diagrams, pathway illustrations, and scientific figures.

Where can I find the source code?

The source code is in the wu-yc/LabClaw repository on GitHub, under skills/general/scientific-diagram-generation/.

SKILL.md Source

# Scientific Diagram Generation

AI-powered scientific illustration generation using Gemini Image models. Creates publication-quality mechanism diagrams, pathway illustrations, and scientific figures.

## API Configuration

| Parameter | Value |
|-----------|-------|
| **Provider** | Google Gemini via yunwu.ai relay |
| **Model** | `gemini-3.1-flash-image-preview` |
| **Base URL** | `https://yunwu.ai/v1beta/models` |
| **Full Endpoint** | `https://yunwu.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent` |
| **Auth** | `Authorization: Bearer <LLM_API_KEY>` |
| **API Key env var** | `LLM_API_KEY` (Gemini series key) |
| **Response** | Image in `candidates[].content.parts[].inlineData.data` (base64 PNG) |

## API Call Structure

```bash
curl -X POST "https://yunwu.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LLM_API_KEY" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "YOUR_PROMPT_HERE"}]}],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
```

## Python Implementation

```python
import base64
import os

import httpx

# Read the key from the LLM_API_KEY environment variable instead of hard-coding it.
API_KEY = os.environ.get("LLM_API_KEY", "")
MODEL = "gemini-3.1-flash-image-preview"
URL = f"https://yunwu.ai/v1beta/models/{MODEL}:generateContent"


async def generate(prompt: str) -> bytes:
    """Return the first generated image as raw bytes, or b"" if none is returned."""
    payload = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    headers = {"Content-Type": "application/json",
               "Authorization": f"Bearer {API_KEY}"}
    async with httpx.AsyncClient(timeout=120) as client:
        r = await client.post(URL, json=payload, headers=headers)
        r.raise_for_status()  # raises httpx.HTTPStatusError on 4xx/5xx
    # A candidate may lack content or parts, so read them defensively.
    for cand in r.json().get("candidates", []):
        for part in cand.get("content", {}).get("parts", []):
            if "inlineData" in part:
                return base64.b64decode(part["inlineData"]["data"])
    return b""
```

## Style Presets

### Publication Style (default)
```
Cell/Nature/Science publication style. Realistic cell morphology with smooth
membranes. Activation arrows: solid black. Inhibition: red T-bar. Secretion:
dashed arrow. Proteins as colored ovals. Receptors as Y-shapes on membranes.
Clean, professional, suitable for journal figures.
```

### Vector-Friendly Style
```
Flat vector style with clean outlines and solid color fills. No gradients,
textures, or noise. High contrast. Easy to edit in Adobe Illustrator or Inkscape.
```

### Infographic Style
```
Modern infographic style with grid layout. Rounded rectangles for cells.
Circles for molecules. 3-5 accent colors maximum. Clean geometric shapes.
```

## Core Prompt Rules (append to every prompt)

```
CRITICAL RULES FOR SCIENTIFIC DIAGRAM GENERATION:
1. BIOLOGICAL COMPLETENESS: Name all cell types, receptors, ligands, molecules,
   transcription factors. Include activation, inhibition, binding, phosphorylation,
   secretion, translocation.
2. VISUAL COMPOSITION: Describe spatial layout (top/bottom/left/right). Define
   compartments (membrane, cytoplasm, nucleus, extracellular). Choose layout flow.
3. FONT: Use Arial or clean sans-serif for ALL labels. Never decorative fonts.
4. TEXT LABELS: Title Case for all labels. Never ALL CAPS. Gene/protein
   abbreviations kept as-is (PD-L1, IFN-γ, JAK).
5. BACKGROUND: Pure white #FFFFFF. No gradients, textures, or vignettes.
6. EDITABILITY: Elements clearly separated with sharp edges. No overlapping.
   Easy post-editing.
```

## SketchGraph Schema (for structured bio diagrams)

### BioNode Types
`cell`, `protein`, `receptor`, `mRNA`, `DNA`, `complex`, `small_molecule`, `vesicle`, `exosome`, `antibody`, `other`

### BioEdge Actions
`activate`, `inhibit`, `bind`, `phosphorylate`, `secrete`, `recruit`, `translocate`, `transcribe`, `degrade`, `upregulate`, `downregulate`

### Compartments
`membrane`, `cytoplasm`, `nucleus`, `extracellular_space`, `mitochondria`, `endoplasmic_reticulum`, `golgi`
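
The lists above give the allowed vocabularies but not the shape of the schema itself. As a minimal sketch, assuming hypothetical field names (`id`, `label`, `type`, `compartment`, `source`, `target`, `action`) that do not come from this skill, a graph could be checked against those vocabularies before it is turned into a prompt:

```python
from dataclasses import dataclass
from typing import Optional

# Vocabularies copied from the SketchGraph lists above.
NODE_TYPES = {"cell", "protein", "receptor", "mRNA", "DNA", "complex",
              "small_molecule", "vesicle", "exosome", "antibody", "other"}
EDGE_ACTIONS = {"activate", "inhibit", "bind", "phosphorylate", "secrete", "recruit",
                "translocate", "transcribe", "degrade", "upregulate", "downregulate"}
COMPARTMENTS = {"membrane", "cytoplasm", "nucleus", "extracellular_space",
                "mitochondria", "endoplasmic_reticulum", "golgi"}


@dataclass
class BioNode:  # field names are hypothetical, chosen for illustration
    id: str
    label: str
    type: str
    compartment: Optional[str] = None


@dataclass
class BioEdge:  # field names are hypothetical, chosen for illustration
    source: str
    target: str
    action: str


def validate_graph(nodes, edges):
    """Return a list of problems; an empty list means the graph passed every check."""
    problems = []
    ids = {n.id for n in nodes}
    for n in nodes:
        if n.type not in NODE_TYPES:
            problems.append(f"node {n.id}: unknown type {n.type!r}")
        if n.compartment is not None and n.compartment not in COMPARTMENTS:
            problems.append(f"node {n.id}: unknown compartment {n.compartment!r}")
    for e in edges:
        if e.action not in EDGE_ACTIONS:
            problems.append(f"edge {e.source}->{e.target}: unknown action {e.action!r}")
        for end in (e.source, e.target):
            if end not in ids:
                problems.append(f"edge {e.source}->{e.target}: missing node {end!r}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to fix a whole graph in a single pass.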

## Prompt Templates

### Template 1: Mechanism Diagram
```
Generate a scientific mechanism diagram:
[DESCRIPTION OF THE MECHANISM]

STYLE: [publication / vector_friendly / infographic]

[CORE_PROMPT_RULES]

EXACT TEXT LABELS to render (use these exact strings, do NOT change
capitalization): ["Label1", "Label2", ...]
```
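
Template 1 can also be filled in programmatically. The sketch below is illustrative only: `build_mechanism_prompt` is a hypothetical helper, and it assumes the chosen style preset's text is substituted on the `STYLE:` line and the full Core Prompt Rules block is passed in as a string.

```python
import json


def build_mechanism_prompt(description, style_text, core_rules, labels):
    """Fill Template 1 with a description, a style preset, the core rules, and exact labels."""
    # ensure_ascii=False keeps characters such as γ readable in the rendered list.
    label_list = json.dumps(list(labels), ensure_ascii=False)
    return (
        "Generate a scientific mechanism diagram:\n"
        f"{description}\n\n"
        f"STYLE: {style_text}\n\n"
        f"{core_rules}\n\n"
        "EXACT TEXT LABELS to render (use these exact strings, do NOT change "
        f"capitalization): {label_list}"
    )


# Example values; in practice pass the full preset and rules text from this file.
prompt = build_mechanism_prompt(
    description="IFN-γ binds its receptor on a tumor cell and induces PD-L1 via JAK/STAT1.",
    style_text="Flat vector style with clean outlines and solid color fills.",
    core_rules="BACKGROUND: Pure white #FFFFFF. No gradients, textures, or vignettes.",
    labels=["IFN-γ", "JAK", "STAT1", "PD-L1"],
)
```

Serializing the labels with `json.dumps` reproduces the quoted, bracketed list format the template shows, so the exact strings reach the model unchanged.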

### Template 2: Signaling Pathway
```
Generate a signaling pathway diagram showing:
- Ligand: [name] binding to receptor: [name] on [cell type]
- Intracellular cascade: [kinase1] → [kinase2] → [transcription factor]
- Downstream effects: [gene expression changes]
- Compartments: extracellular, membrane, cytoplasm, nucleus

STYLE: publication
[CORE_PROMPT_RULES]
```

### Template 3: Tumor Microenvironment
```
Generate a tumor microenvironment diagram showing:
- Central: tumor cells (irregular shape, dark)
- Surrounding: [immune cells, fibroblasts, endothelial cells]
- Key interactions: [list of interactions with arrow types]
- Secreted factors: [cytokines, chemokines with dashed arrows]

Layout: radial, tumor at center
STYLE: publication
[CORE_PROMPT_RULES]
```

### Template 4: Cell Biology Overview
```
Generate a cell biology diagram showing:
- Cell with organelles: nucleus, mitochondria, ER, Golgi, lysosomes
- Process: [e.g., autophagy, apoptosis, protein trafficking]
- Key molecules at each step: [list]
- Arrows showing process flow

STYLE: publication
[CORE_PROMPT_RULES]
```

### Template 5: Non-Bio Scientific Diagram (GENERAL)
```
Generate a scientific diagram (bypassing SketchGraph, direct prompting):
[DESCRIPTION — can be circuit diagram, chemical reaction scheme,
geological cross-section, physics experiment setup, etc.]

STYLE RULES:
- White background #FFFFFF
- Arial font for all labels
- Title Case for labels
- Clean, professional, publication quality
- No decorative elements
```

## Retry Logic

The API may return 429 (rate limit) or 5xx errors. Recommended retry:
- Attempt 1: immediate
- Attempt 2: wait 15 seconds
- Attempt 3: wait 30 seconds
- Max 3 attempts
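
The schedule above could be implemented roughly as follows. This is a sketch, not part of the skill: `RetryableError`, `is_retryable`, and `with_retry` are hypothetical names, and the caller is assumed to translate a 429 or 5xx response into `RetryableError`.

```python
import asyncio

RETRY_DELAYS = (0, 15, 30)  # seconds to wait before attempts 1, 2, and 3


class RetryableError(Exception):
    """Hypothetical marker raised by the caller for a 429 or 5xx response."""


def is_retryable(status_code: int) -> bool:
    """True for rate-limit (429) and server-side (5xx) status codes."""
    return status_code == 429 or 500 <= status_code <= 599


async def with_retry(make_call, delays=RETRY_DELAYS, sleep=asyncio.sleep):
    """Await make_call() once per entry in delays, waiting before each attempt."""
    last_error = None
    for delay in delays:
        if delay:
            await sleep(delay)
        try:
            return await make_call()
        except RetryableError as err:
            last_error = err  # remember the failure and move to the next attempt
    if last_error is None:
        raise ValueError("delays must not be empty")
    raise last_error
```

In practice `make_call` would wrap the image request, catch `httpx.HTTPStatusError`, and raise `RetryableError` only when `is_retryable(err.response.status_code)` is true, so that other 4xx errors fail immediately.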

## Output

- Format: PNG (default), can request JPEG
- Resolution: ~1024x1024 (model default)
- File size: typically 800KB-1.5MB
- Save to: `~/.scienceclaw/workspace/diagrams/diagram_{uuid}.png`
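
A minimal sketch of the save step, assuming the `{uuid}` placeholder is a random UUID4 in hex form (the skill does not say which UUID format it uses); `save_diagram` is a hypothetical helper name:

```python
import uuid
from pathlib import Path

DIAGRAM_DIR = Path("~/.scienceclaw/workspace/diagrams").expanduser()


def save_diagram(image_bytes, out_dir=DIAGRAM_DIR):
    """Write image bytes to diagram_{uuid}.png under out_dir and return the path."""
    if not image_bytes:
        raise ValueError("no image data to save")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the workspace on first use
    path = out_dir / f"diagram_{uuid.uuid4().hex}.png"
    path.write_bytes(image_bytes)
    return path
```

Rejecting empty input matters here because an image request can succeed without returning any image data, and writing that out would leave a zero-byte PNG.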

## Tips for Best Results

1. **Be specific**: "TREM2 receptor on macrophage" > "a receptor on a cell"
2. **Name everything**: Every molecule, cell, and arrow should have a label
3. **Specify compartments**: "extracellular space", "cytoplasm", "nucleus"
4. **Use exact label injection**: Always provide a list of exact text labels
5. **Keep it focused**: One mechanism per diagram, not an entire pathway map
6. **Iterate**: If the first result isn't perfect, refine the prompt and regenerate
