open-notebook

Self-hosted, open-source alternative to Google NotebookLM for AI-powered research and document analysis. Use when organizing research materials into notebooks, ingesting diverse content sources (PDFs, videos, audio, web pages, Office documents), generating AI-powered notes and summaries, creating multi-speaker podcasts from research, chatting with documents using context-aware AI, searching across materials with full-text and vector search, or running custom content transformations. Supports 16+ AI providers including OpenAI, Anthropic, Google, Ollama, Groq, and Mistral with complete data privacy through self-hosting.

1,174 stars

byforyourhealth111-pixel

View on GitHub Installation ↓

Best use case

open-notebook is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using open-notebook should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/open-notebook/SKILL.md --create-dirs "https://raw.githubusercontent.com/foryourhealth111-pixel/Vibe-Skills/main/bundled/skills/open-notebook/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/open-notebook/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How open-notebook Compares

Feature / Agent	open-notebook	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

Best AI Skills for ChatGPT

Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.

AI Agent for YouTube Script Writing

Find AI agent skills for YouTube script writing, video research, content outlining, and repeatable channel production workflows.

AI Agent for Product Research

Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.

SKILL.md Source

# Open Notebook

## Overview

Open Notebook is an open-source, self-hosted alternative to Google's NotebookLM that enables researchers to organize materials, generate AI-powered insights, create podcasts, and have context-aware conversations with their documents — all while maintaining complete data privacy.

Unlike Google's Notebook LM, which has no publicly available API outside of the Enterprise version, Open Notebook provides a comprehensive REST API, supports 16+ AI providers, and runs entirely on your own infrastructure.

**Key advantages over NotebookLM:**
- Full REST API for programmatic access and automation
- Choice of 16+ AI providers (not locked to Google models)
- Multi-speaker podcast generation with 1-4 customizable speakers (vs. 2-speaker limit)
- Complete data sovereignty through self-hosting
- Open source and fully extensible (MIT license)

**Repository:** https://github.com/lfnovo/open-notebook

## Quick Start

### Prerequisites

- Docker Desktop installed
- API key for at least one AI provider (or local Ollama for free local inference)

### Installation

Deploy Open Notebook using Docker Compose:

```bash
# Download the docker-compose file
curl -o docker-compose.yml https://raw.githubusercontent.com/lfnovo/open-notebook/main/docker-compose.yml

# Set the required encryption key
export OPEN_NOTEBOOK_ENCRYPTION_KEY="your-secret-key-here"

# Launch the services
docker-compose up -d
```

Access the application:
- **Frontend UI:** http://localhost:8502
- **REST API:** http://localhost:5055
- **API Documentation:** http://localhost:5055/docs

### Configure AI Provider

After startup, configure at least one AI provider:

1. Navigate to **Settings > API Keys** in the UI
2. Add credentials for your preferred provider (OpenAI, Anthropic, etc.)
3. Test the connection and discover available models
4. Register models for use across the platform

Or configure via the REST API:

```python
import requests

BASE_URL = "http://localhost:5055/api"

# Add a credential for an AI provider
response = requests.post(f"{BASE_URL}/credentials", json={
    "provider": "openai",
    "name": "My OpenAI Key",
    "api_key": "sk-..."
})
credential = response.json()

# Discover available models
response = requests.post(
    f"{BASE_URL}/credentials/{credential['id']}/discover"
)
discovered = response.json()

# Register discovered models
requests.post(
    f"{BASE_URL}/credentials/{credential['id']}/register-models",
    json={"model_ids": [m["id"] for m in discovered["models"]]}
)
```

## Core Features

### Notebooks
Organize research into separate notebooks, each containing sources, notes, and chat sessions.

```python
import requests

BASE_URL = "http://localhost:5055/api"

# Create a notebook
response = requests.post(f"{BASE_URL}/notebooks", json={
    "name": "Cancer Genomics Research",
    "description": "Literature review on tumor mutational burden"
})
notebook = response.json()
notebook_id = notebook["id"]
```

### Sources
Ingest diverse content types including PDFs, videos, audio files, web pages, and Office documents. Sources are processed for full-text and vector search.

```python
# Add a web URL source
response = requests.post(f"{BASE_URL}/sources", data={
    "url": "https://arxiv.org/abs/2301.00001",
    "notebook_id": notebook_id,
    "process_async": "true"
})
source = response.json()

# Upload a PDF file
with open("paper.pdf", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/sources",
        data={"notebook_id": notebook_id},
        files={"file": ("paper.pdf", f, "application/pdf")}
    )
```

### Notes
Create and manage notes (human or AI-generated) associated with notebooks.

```python
# Create a human note
response = requests.post(f"{BASE_URL}/notes", json={
    "title": "Key Findings",
    "content": "TMB correlates with immunotherapy response in NSCLC...",
    "note_type": "human",
    "notebook_id": notebook_id
})
```

### Context-Aware Chat
Chat with your research materials using AI that cites sources.

```python
# Create a chat session
session = requests.post(f"{BASE_URL}/chat/sessions", json={
    "notebook_id": notebook_id,
    "title": "TMB Discussion"
}).json()

# Send a message with context from sources
response = requests.post(f"{BASE_URL}/chat/execute", json={
    "session_id": session["id"],
    "message": "What are the key biomarkers for immunotherapy response?",
    "context": {"include_sources": True, "include_notes": True}
})
```

### Search
Search across all materials using full-text or vector (semantic) search.

```python
# Vector search across the knowledge base
results = requests.post(f"{BASE_URL}/search", json={
    "query": "tumor mutational burden immunotherapy",
    "search_type": "vector",
    "limit": 10
}).json()

# Ask a question with AI-powered answer
answer = requests.post(f"{BASE_URL}/search/ask/simple", json={
    "query": "How does TMB predict checkpoint inhibitor response?"
}).json()
```

### Podcast Generation
Generate professional multi-speaker podcasts from research materials with 1-4 customizable speakers.

```python
# Generate a podcast episode
job = requests.post(f"{BASE_URL}/podcasts/generate", json={
    "notebook_id": notebook_id,
    "episode_profile_id": episode_profile_id,
    "speaker_profile_ids": [speaker1_id, speaker2_id]
}).json()

# Check generation status
status = requests.get(f"{BASE_URL}/podcasts/jobs/{job['job_id']}").json()

# Download audio when ready
audio = requests.get(
    f"{BASE_URL}/podcasts/episodes/{status['episode_id']}/audio"
)
```

### Content Transformations
Apply custom AI-powered transformations to content for summarization, extraction, and analysis.

```python
# Create a custom transformation
transform = requests.post(f"{BASE_URL}/transformations", json={
    "name": "extract_methods",
    "title": "Extract Methods",
    "description": "Extract methodology details from papers",
    "prompt": "Extract and summarize the methodology section...",
    "apply_default": False
}).json()

# Execute transformation on text
result = requests.post(f"{BASE_URL}/transformations/execute", json={
    "transformation_id": transform["id"],
    "input_text": "...",
    "model_id": "model_id_here"
}).json()
```

## Supported AI Providers

Open Notebook supports 16+ AI providers through the Esperanto library:

| Provider | LLM | Embedding | Speech-to-Text | Text-to-Speech |
|----------|-----|-----------|----------------|----------------|
| OpenAI | Yes | Yes | Yes | Yes |
| Anthropic | Yes | No | No | No |
| Google GenAI | Yes | Yes | No | Yes |
| Vertex AI | Yes | Yes | No | Yes |
| Ollama | Yes | Yes | No | No |
| Groq | Yes | No | Yes | No |
| Mistral | Yes | Yes | No | No |
| Azure OpenAI | Yes | Yes | No | No |
| DeepSeek | Yes | No | No | No |
| xAI | Yes | No | No | No |
| OpenRouter | Yes | No | No | No |
| ElevenLabs | No | No | Yes | Yes |
| Perplexity | Yes | No | No | No |
| Voyage | No | Yes | No | No |

## Environment Variables

Key configuration variables for Docker deployment:

| Variable | Description | Default |
|----------|-------------|---------|
| `OPEN_NOTEBOOK_ENCRYPTION_KEY` | **Required.** Secret key for encrypting stored credentials | None |
| `SURREAL_URL` | SurrealDB connection URL | `ws://surrealdb:8000/rpc` |
| `SURREAL_NAMESPACE` | Database namespace | `open_notebook` |
| `SURREAL_DATABASE` | Database name | `open_notebook` |
| `OPEN_NOTEBOOK_PASSWORD` | Optional password protection for the UI | None |

## API Reference

The REST API is available at `http://localhost:5055/api` with interactive documentation at `/docs`.

Core endpoint groups:
- `/api/notebooks` - Notebook CRUD and source association
- `/api/sources` - Source ingestion, processing, and retrieval
- `/api/notes` - Note management
- `/api/chat/sessions` - Chat session management
- `/api/chat/execute` - Chat message execution
- `/api/search` - Full-text and vector search
- `/api/podcasts` - Podcast generation and management
- `/api/transformations` - Content transformation pipelines
- `/api/models` - AI model configuration and discovery
- `/api/credentials` - Provider credential management

For complete API reference with all endpoints and request/response formats, see `references/api_reference.md`.

## Architecture

Open Notebook uses a modern stack:
- **Backend:** Python with FastAPI
- **Database:** SurrealDB (document + relational)
- **AI Integration:** LangChain with the Esperanto multi-provider library
- **Frontend:** Next.js with React
- **Deployment:** Docker Compose with persistent volumes

## Important Notes

- Open Notebook requires Docker for deployment
- At least one AI provider must be configured for AI features to work
- For free local inference without API costs, use Ollama
- The `OPEN_NOTEBOOK_ENCRYPTION_KEY` must be set before first launch and kept consistent across restarts
- All data is stored locally in Docker volumes for complete data sovereignty

Related Skills

pyopenms

1174

from foryourhealth111-pixel/Vibe-Skills

Python interface to OpenMS for mass spectrometry data analysis. Use for LC-MS/MS proteomics and metabolomics workflows including file handling (mzML, mzXML, mzTab, FASTA, pepXML, protXML, mzIdentML), signal processing, feature detection, peptide identification, and quantitative analysis. Apply when working with mass spectrometry data, analyzing proteomics experiments, or processing metabolomics datasets.

opentrons-integration

1174

from foryourhealth111-pixel/Vibe-Skills

Official Opentrons Protocol API for OT-2 and Flex robots. Use when writing protocols specifically for Opentrons hardware with full access to Protocol API v2 features. Best for production Opentrons protocols, official API compatibility. For multi-vendor automation or broader equipment control use pylabrobot.

opentargets-database

1174

from foryourhealth111-pixel/Vibe-Skills

Query Open Targets Platform for target-disease associations, drug target discovery, tractability/safety data, genetics/omics evidence, known drugs, for therapeutic target identification.

openalex-database

1174

from foryourhealth111-pixel/Vibe-Skills

Query and analyze scholarly literature using the OpenAlex database. This skill should be used when searching for academic papers, analyzing research trends, finding works by authors or institutions, tracking citations, discovering open access publications, or conducting bibliometric analysis across 240M+ scholarly works. Use for literature searches, research output analysis, citation analysis, and academic database queries.

openai-knowledge

1174

from foryourhealth111-pixel/Vibe-Skills

Compatibility alias for OpenAI platform documentation guidance. Delegate to the canonical local `openai-docs` payload while preserving route compatibility.

jupyter-notebook

1174

from foryourhealth111-pixel/Vibe-Skills

Use when the user asks to create, scaffold, or edit Jupyter notebooks (`.ipynb`) for experiments, explorations, or tutorials; prefer the bundled templates and run the helper script `new_notebook.py` to generate a clean starting notebook.

openai-docs

1140

from foryourhealth111-pixel/Vibe-Skills

Use when the user asks how to build with OpenAI products or APIs and needs up-to-date official documentation with citations (for example: Codex, Responses API, Chat Completions, Apps SDK, Agents SDK, Realtime, model capabilities or limits); prioritize OpenAI docs MCP tools and restrict any fallback browsing to official OpenAI domains.

zinc-database

1174

from foryourhealth111-pixel/Vibe-Skills

Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.

zarr-python

1174

from foryourhealth111-pixel/Vibe-Skills

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.

yeet

1174

from foryourhealth111-pixel/Vibe-Skills

Use only when the user explicitly asks to stage, commit, push, and open a GitHub pull request in one flow using the GitHub CLI (`gh`).

xlsx

1174

from foryourhealth111-pixel/Vibe-Skills

Spreadsheet toolkit (.xlsx/.csv). Create/edit with formulas/formatting, analyze data, visualization, recalculate formulas, for spreadsheet processing and analysis.

xan

1174

from foryourhealth111-pixel/Vibe-Skills

High-performance CSV processing with xan CLI for large tabular datasets, streaming transformations, and low-memory pipelines.