notebook-writer
Create and document Jupyter notebooks for reproducible analyses
Best use case
notebook-writer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Create and document Jupyter notebooks for reproducible analyses
Teams using notebook-writer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/notebook-writer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How notebook-writer Compares
| Feature / Agent | notebook-writer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Create and document Jupyter notebooks for reproducible analyses
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Notebook Writer Skill
You are a specialist in creating well-structured Jupyter notebooks for scientific analyses and documentation.
## When to Use This Skill
Use this skill when:
- Creating parameter sweeps or sensitivity analyses
- Documenting calculations with reproducible code
- Generating analysis reports that combine code, results, and interpretation
- Packaging agent work (Calculator, Researcher) into shareable notebooks
## Notebook Format: Jupytext Markdown
We use **Jupytext-compatible Markdown** for notebooks to enable git-friendly version control.
### Cell Markers
- **Markdown cells**: Regular Markdown text (no special marker)
- **Code cells**: Start with `# %%` on its own line
### Example Structure
```markdown
---
jupyter:
kernelspec:
display_name: Python 3
language: python
name: python3
---
# Analysis Title
Brief description of what this notebook does.
## Section 1: Data Loading
# %%
import pandas as pd
import numpy as np
# %%
data = pd.read_csv('data.csv')
data.head()
## Section 2: Analysis
Explanation of the analysis approach.
# %%
# Perform calculation
result = np.mean(data['value'])
print(f"Mean: {result:.2f}")
```
## Python Utility API
Many projects provide `src/utils/notebook_builder.py` with helper functions for programmatic notebook creation.
### Core Function: create_notebook_markdown
```python
create_notebook_markdown(
title: str,
cells: List[Dict[str, str]],
output_path: Path,
kernelspec: Optional[Dict] = None
) -> Path
```
**Parameters:**
- `title`: Notebook title (becomes H1 header)
- `cells`: List of dicts with `'type'` (`'code'` or `'markdown'`) and `'content'`
- `output_path`: Where to save `.md` file
- `kernelspec`: Optional kernel specification (defaults to Python 3)
**Example:**
```python
from pathlib import Path
from src.utils.notebook_builder import create_notebook_markdown
cells = [
{'type': 'markdown', 'content': '## Introduction\n\nThis analysis...'},
{'type': 'code', 'content': 'import numpy as np'},
{'type': 'code', 'content': 'x = np.linspace(0, 10)\nprint(x)'}
]
create_notebook_markdown(
title="My Analysis",
cells=cells,
output_path=Path('docs/analysis/my_analysis.md')
)
```
### Template: Parameter Sweep
```python
create_parameter_sweep_notebook(
param_name: str,
param_range: str,
calculation_code: str,
output_path: Path
) -> Path
```
Creates a notebook with:
- Imports (numpy, matplotlib, pandas)
- Parameter range definition
- Your calculation code
- Visualization boilerplate
**Example:**
```python
from pathlib import Path
from src.utils.notebook_builder import create_parameter_sweep_notebook
create_parameter_sweep_notebook(
param_name='temperature',
param_range='np.linspace(20, 40, 20)',
calculation_code='''
# Reaction rate calculation
results = []
for T in temperature_values:
rate = arrhenius_equation(T, activation_energy)
results.append(rate)
''',
output_path=Path('analysis/temperature_sweep.md')
)
```
### Template: Analysis Report
```python
create_analysis_report_notebook(
analysis_title: str,
sections: List[Dict[str, str]],
output_path: Path
) -> Path
```
**Section dict keys:**
- `title`: Section heading (required)
- `description`: Explanatory text (optional)
- `code`: Code to execute (optional)
- `interpretation`: Results interpretation (optional)
**Example:**
```python
from src.utils.notebook_builder import create_analysis_report_notebook
sections = [
{
'title': 'Model Setup',
'description': 'Define parameters',
'code': 'diffusion_coeff = 2.1e-5 # cm²/s'
},
{
'title': 'Calculation',
'code': 'result = compute_model(diffusion_coeff)',
'interpretation': 'Result shows X is dominated by Y'
}
]
create_analysis_report_notebook(
'Transport Analysis',
sections,
Path('analysis/transport.md')
)
```
### Validation Function
```python
validate_notebook(notebook_path: Path) -> bool
```
Validates `.ipynb` structure using nbformat. Returns `True` if valid, raises exception if invalid.
**Example:**
```python
from pathlib import Path
from src.utils.notebook_builder import validate_notebook
validate_notebook(Path('analysis/notebook.ipynb'))
# Returns True or raises ValidationError
```
## Workflow
1. **Create notebook** using utility or manual Markdown
2. **Edit `.md` file** directly (agents write Markdown well)
3. **Convert to `.ipynb`**: `python3 -m jupytext --to ipynb notebook.md`
4. **Run in Jupyter**: `jupyter notebook notebook.ipynb`
5. **Sync changes back**: `python3 -m jupytext --sync notebook.ipynb` (bidirectional)
## Jupyter AI Integration
Modern Jupyter environments (JupyterLab 4.0+, JetBrains IDEs) provide AI-powered assistance to enhance productivity and reduce errors.
### %%ai Magic Commands
The `%%ai` cell magic enables AI-powered code generation and analysis directly in notebooks:
```python
# %%
# %load_ext jupyter_ai_magics
# %%
%%ai chatgpt
Generate a function to calculate the Pearson correlation coefficient between two arrays
```
**Key use cases:**
- **Code generation**: Generate boilerplate code, data transformations, or analysis functions
- **Data exploration**: Ask questions about DataFrames or arrays
- **Debugging assistance**: Get suggestions for fixing errors
- **Documentation**: Generate docstrings or explanations
### Providing Context for Better Results
AI assistants work best when given relevant context. **Always provide**:
1. **API documentation**: For specialized libraries (scanpy, pydeseq2, biopython)
```python
# Include relevant API documentation in a markdown cell
# Example: scanpy.pp.filter_cells(data, min_genes=200)
```
2. **Dataset descriptions**: Shape, columns, data types
```python
# Document your data structure:
# RNA-seq counts matrix: 20,000 genes × 5,000 cells
# AnnData object: .X (sparse CSR matrix), .obs (cell metadata), .var (gene metadata)
```
3. **Domain context**: Biological meaning, expected ranges, units
```python
# Oxygen consumption rate: 10-20 pmol/s/million cells
# Temperature: 37°C, pH: 7.4
```
### Chat UI Assistance
JupyterLab's chat interface provides conversational help:
**Best practices:**
- Use for exploratory questions: "What's the best way to normalize this data?"
- Ask for code review: "Does this analysis handle missing values correctly?"
- Request visualizations: "Create a heatmap of the top 50 variable genes"
- Get explanations: "Explain what this cell is doing"
### When to Use AI Assistance vs. Manual Coding
**Use AI assistance for:**
- Boilerplate code (imports, data loading templates)
- Exploratory analysis (quick plots, summary statistics)
- Learning new library syntax
- Generating test data or examples
**Write code manually for:**
- Core analysis logic (hypothesis testing, modeling)
- Publication-quality figures (fine-grained control needed)
- Performance-critical sections (AI-generated code may not be optimal)
- Complex domain-specific algorithms
**Warning**: Always verify AI-generated code. Check for:
- Correct library syntax (APIs change frequently)
- Appropriate statistical methods (AI may suggest invalid tests)
- Proper handling of biological data (species, units, measurement context)
### JetBrains AI Assistant
For notebooks in PyCharm/DataSpell:
**Features:**
- **Explain cell**: Understand what code does (Alt+Enter → "Explain")
- **Create visualization**: Generate plots from data descriptions
- **Edit cell**: Refactor or improve code (Alt+Enter → "AI Actions")
- **Fix errors**: Get suggestions for runtime errors
**Access**: Right-click cell → "AI Assistant" or use AI chat sidebar
## Jupytext Configuration
Projects should include `.jupytext.toml` in repository root:
```toml
# Jupytext configuration
# Enables git-friendly notebook version control
# Pair markdown and ipynb files
# Use myst format which supports # %% cell markers
formats = "md:myst,ipynb"
```
This tells Jupytext to:
- Recognize `.md` files as notebooks
- Use MyST Markdown format (supports `# %%` markers)
- Auto-sync with `.ipynb` when either is modified
## Git Tracking Strategy
**Recommended `.gitignore` configuration:**
```gitignore
# Track .md notebooks (Jupytext source), ignore generated .ipynb files
*.ipynb
.ipynb_checkpoints/
```
**What's tracked:**
- ✅ `.md` notebook files (human-readable source)
- ❌ `.ipynb` files (generated, binary JSON)
- ❌ `.ipynb_checkpoints/` (Jupyter temp files)
**Rationale:** `.md` files produce readable git diffs. `.ipynb` files are JSON with embedded outputs and can be regenerated from `.md`.
## Common Operations
### Create from scratch (manual)
1. Write Markdown file with `# %%` markers
2. Add YAML frontmatter (kernel info)
3. Convert: `python3 -m jupytext --to ipynb file.md`
### Convert existing notebook to Markdown
```bash
python3 -m jupytext --to md:myst notebook.ipynb
```
### Edit existing notebook
**Option 1: Edit .md file directly** (recommended for agents)
```bash
# Edit notebook.md in text editor
# Then convert:
python3 -m jupytext --to ipynb notebook.md
```
**Option 2: Edit in Jupyter, sync back**
```bash
jupyter notebook notebook.ipynb
# Make changes in Jupyter
# Sync back to .md:
python3 -m jupytext --sync notebook.ipynb
```
### Validate structure
```bash
python3 -c "
from pathlib import Path
from src.utils.notebook_builder import validate_notebook
validate_notebook(Path('notebook.ipynb'))
print('✓ Valid')
"
```
### Convert multiple notebooks
```bash
# Convert all .md notebooks in a directory
python3 -m jupytext --to ipynb analysis/*.md
# Or sync all paired notebooks
python3 -m jupytext --sync analysis/*.ipynb
```
## Best Practices
1. **Title every notebook** with clear purpose
2. **Start with imports** in first code cell
3. **Explain calculations** with markdown cells before code
4. **Interpret results** with markdown cells after code
5. **Use meaningful variable names** (not x, y, z)
6. **Include units** in comments and axis labels
7. **Save outputs** (figures) to files for documentation
## Reproducibility Standards
Scientific notebooks must be fully reproducible. Every notebook should enable another researcher to:
1. Recreate your computational environment
2. Rerun your analysis and get identical results
3. Understand your data sources and processing steps
### Environment Documentation
**Every notebook must include an environment documentation cell:**
```python
# %%
# Environment Information
# Run: pip freeze > requirements.txt
# Or: conda env export > environment.yml
import sys
import numpy as np
import pandas as pd
import scanpy as sc # Example for single-cell analysis
print(f"Python: {sys.version}")
print(f"NumPy: {np.__version__}")
print(f"Pandas: {pd.__version__}")
print(f"Scanpy: {sc.__version__}")
# Include this output in your notebook for documentation
```
**Create environment files:**
```bash
# For pip users:
pip freeze > requirements.txt
# For conda users:
conda env export > environment.yml
# Include these files in your repository
```
**Document kernel selection:**
```markdown
## Computational Environment
- **Kernel**: Python 3.11 (project-env)
- **Dependencies**: See `requirements.txt` for full package list
- **Critical packages**: scanpy==1.10.0, numpy==1.26.3, pandas==2.2.0
```
### Random Seed Setting
**For any stochastic process, set random seeds:**
```python
# %%
# Set random seeds for reproducibility
import numpy as np
import random
RANDOM_SEED = 42 # Document why this value was chosen (convention, previous analysis, etc.)
np.random.seed(RANDOM_SEED)
random.seed(RANDOM_SEED)
# For machine learning:
import torch
torch.manual_seed(RANDOM_SEED)
# For scanpy:
import scanpy as sc
sc.settings.seed = RANDOM_SEED
print(f"Random seed set to {RANDOM_SEED}")
```
**Stochastic processes requiring seeds:**
- UMAP, t-SNE (dimensionality reduction)
- Random forest, neural networks (machine learning)
- Monte Carlo simulations
- Random sampling or bootstrapping
- Graph algorithms with random initialization
### Session Info Output
**End every notebook with a session info cell:**
```python
# %%
# Session Information (for reproducibility)
import session_info
session_info.show(
dependencies=True,
html=False
)
# Alternative for single-cell analysis:
# import scanpy as sc
# sc.logging.print_versions()
```
This captures:
- Python version
- Operating system
- Package versions (all dependencies)
- Execution timestamp
### File Path Best Practices
**Use relative paths and variables:**
```python
# %%
from pathlib import Path
# Define paths at the top of the notebook
DATA_DIR = Path("data/raw")
RESULTS_DIR = Path("results/analysis_2025-01-29")
# Ensure output directories exist
RESULTS_DIR.mkdir(parents=True, exist_ok=True)
# Use variables throughout
input_file = DATA_DIR / "counts.csv"
output_file = RESULTS_DIR / "normalized_counts.csv"
```
**Never use hardcoded absolute paths:**
```python
# BAD:
data = pd.read_csv("/Users/yourname/project/data.csv")
# GOOD:
data = pd.read_csv(DATA_DIR / "data.csv")
```
### Reproducibility Checklist
Before sharing or archiving a notebook:
- [ ] Environment documented (Python version, key package versions)
- [ ] `requirements.txt` or `environment.yml` exists and is current
- [ ] Random seeds set for all stochastic processes
- [ ] Session info cell at end of notebook
- [ ] File paths use variables (not hardcoded)
- [ ] Data sources documented (where to download, version, date)
- [ ] Notebook runs end-to-end without errors (Restart & Run All)
- [ ] Results match expected output (if re-running existing analysis)
**Integration with other skills:**
- **notebook-debugger**: Use to verify end-to-end execution
- **bioinformatician**: Apply reproducibility standards to all computational biology analyses
- **copilot**: Review notebooks for reproducibility compliance
## Project-Specific Usage
Many projects have a `docs/NOTEBOOK-WORKFLOW.md` or similar document with project-specific examples and patterns. Check your project's documentation for:
- Domain-specific notebook templates
- Agent integration patterns (which skills create notebooks)
- Directory structure conventions (where to save notebooks)
- Project-specific best practices
## Troubleshooting
### Issue: Jupytext can't find format
**Error:** `Format 'percent' is not associated to extension '.md'`
**Fix:** Use `md:myst` format in `.jupytext.toml` (not `md:percent`). MyST Markdown supports `# %%` markers.
```toml
formats = "md:myst,ipynb" # Correct
```
### Issue: Sync not working
**Symptom:** Changes to `.ipynb` don't appear in `.md`
**Solution:**
1. Check `.jupytext.toml` exists and has correct format
2. Run sync explicitly: `python3 -m jupytext --sync notebook.ipynb`
3. Check both files exist (create `.md` first if needed)
### Issue: Validation fails
**Error:** `nbformat.ValidationError`
**Causes:**
- Missing cell IDs (required in nbformat v4.5+)
- Invalid JSON structure
- Missing required fields
**Solution:** Use `notebook_builder.py` utility functions which handle validation automatically.
### Issue: Git shows .ipynb files
**Symptom:** `.ipynb` files appearing in `git status`
**Fix:** Ensure `.gitignore` contains `*.ipynb`. Check with:
```bash
git check-ignore -v notebook.ipynb
```
## Error Prevention
### Common Issues
1. **Missing `# %%`**: Code cells must start with this marker
2. **Frontmatter syntax**: YAML header must be exact (see example structure above)
3. **Path handling**: Use `Path` objects, ensure directories exist
4. **Cell validation**: Use `notebook_builder.validate_notebook()` after creation
### Validation Checklist
Before finalizing a notebook:
- [ ] Frontmatter present with kernelspec
- [ ] All code cells have `# %%` marker
- [ ] Imports in first code cell
- [ ] Results interpreted with markdown
- [ ] Saved to appropriate location
- [ ] Validated with nbformat if creating .ipynb directly
## Dependencies
**Required packages:**
```bash
pip3 install jupytext nbformat
```
**Check installation:**
```bash
pip3 list | grep -E "(jupytext|nbformat)"
```
**Tested versions:**
- jupytext: 1.19+
- nbformat: 5.10+
- Python: 3.9+
## Integration with Other Skills
Common patterns for skill integration:
- **Quantitative analysis skills**: Package calculations as reproducible notebooks with parameter sweeps
- **Research skills**: Document literature-derived parameters with citations in data notebooks
- **Planning skills**: Generate protocol notebooks with expected results and analysis templates
- **Review skills**: Check notebook code for correctness and best practices
---
Remember: Notebooks are for **interactive exploration** and **reproducible documentation**. For production code, use Python modules in `src/`.
**For project-specific examples and patterns**, see your project's documentation (often `docs/NOTEBOOK-WORKFLOW.md` or similar).Related Skills
opened-daily-newsletter-writer
Creates Monday-Thursday OpenEd Daily newsletters (500-800 words) with Thought-Trend-Tool structure. Use when the user asks to create a daily newsletter, write daily content, or transform source material into newsletter segments. Not for Friday Weekly digests.
content-research-writer
Assists in writing high-quality content by conducting research, adding citations, improving hooks, iterating on outlines, and providing real-time feedback on each section. Transforms your writing process from solo effort to collaborative partnership.
Ad Copy Writer
Write high-converting advertising copy for paid media campaigns
writer
Document creation, format conversion (ODT/DOCX/PDF), mail merge, and automation with LibreOffice Writer.
tone-rewriter
Rewrite text in any of 10 tones (professional, casual, friendly, formal, empathetic, persuasive, academic, simple, witty, urgent) while preserving meaning. x402 pay-per-use: $0.01 USDC. Use when: tone adjustment, rewrite text, change tone, professional rewrite, casual rewrite, make friendly, formalize text.
cs-guide-writer
CS 학습 문서를 작성합니다. "오늘의 CS", "CS 정리", "{주제} 정리해줘", "최근 이슈 CS" 요청 시 사용하세요.
api-tutorial-writer
Эксперт по написанию API-туториалов и документации. Используй для создания гайдов по интеграции API, документации endpoints, примеров кода на разных языках, обработки ошибок и best practices.
api-documentation-writer
Expert guide for writing comprehensive API documentation including OpenAPI specs, endpoint references, authentication guides, and code examples. Use when documenting APIs, creating developer portals, or improving API discoverability.
adr-writer
Creates Architecture Decision Records documenting key technical decisions with context, alternatives considered, tradeoffs, consequences, and decision owners. Use when documenting "architecture decisions", "technical choices", "design decisions", or "ADRs".
trae-rules-writer
Create Trae IDE rules (.trae/rules/*.md) for AI behavior constraints. Use when user wants to: create a project rule, set up code style guidelines, enforce naming conventions, make AI always do X, or customize AI behavior for specific files. Triggers on: '创建 rule', 'project rule', '.trae/rules/', 'AGENTS.md', 'CLAUDE.md', 'make AI always use PascalCase'. Do NOT use for skills (use trae-skill-writer) or agents (use trae-agent-writer).
system-prompt-writer
This skill should be used when writing or improving system prompts for AI agents, providing expert guidance based on Anthropic's context engineering principles.
structured-prompt-writer
结构化AI提示词写作工具,内置395+提示词模板。支持详细模式和简单模式。用于创建专业的AI角色提示词、系统提示词或任务提示词。当用户需要:(1) 创建新的AI提示词/Prompt (2) 设计AI角色/Persona (3) 编写系统提示词 (4) 优化现有提示词结构时使用此技能。