memory-pipeline

Complete agent memory + performance system. Extracts structured facts, builds knowledge graphs, generates briefings, and enforces execution discipline via pre-game routines, tool policies, result compression, and after-action reviews. Includes external knowledge ingestion (ChatGPT exports, etc.) into searchable memory. Use when working on memory management, briefing generation, knowledge consolidation, external data ingestion, agent consistency, or improving execution quality across sessions.

7 stars

byDemerzels-lab

View on GitHub Installation ↓

Best use case

memory-pipeline is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using memory-pipeline should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/memory-pipeline/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/joe-rlo/memory-pipeline/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/memory-pipeline/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How memory-pipeline Compares

Feature / Agent	memory-pipeline	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Memory Pipeline

**Give your AI agent a memory that actually works.**

AI agents wake up blank every session. Memory Pipeline fixes that — it extracts what matters from past conversations, connects the dots, and generates a daily briefing so your agent starts each session primed instead of clueless.

## What It Does

| Component | When it runs | What it does |
|-----------|-------------|--------------|
| **Extract** | Between sessions | Pulls structured facts (decisions, preferences, learnings) from daily notes and transcripts |
| **Link** | Between sessions | Builds a knowledge graph — connects related facts, flags contradictions |
| **Brief** | Between sessions | Generates a compact `BRIEFING.md` loaded at session start |
| **Ingest** | On demand | Imports external knowledge (ChatGPT exports, etc.) into searchable memory |
| **Performance Hooks** | During sessions | Pre-game briefing injection, tool discipline, output compression, after-action review |

## Why This Is Different

Most "memory" solutions are just vector search over chat logs. This is a **cognitive architecture** — inspired by how human memory actually works:

- **Extraction over accumulation** — Instead of dumping everything into a database, it identifies what's worth remembering: decisions, preferences, learnings, commitments. The rest is noise.
- **Knowledge graph, not just embeddings** — Facts get linked to each other with bidirectional relationships. Your agent doesn't just find similar text — it understands that a decision about your tech stack relates to a project deadline relates to a preference you stated three weeks ago.
- **Briefing over retrieval** — Rather than hoping the right context gets retrieved at query time, your agent starts every session with a curated cheat sheet. Active projects, recent decisions, personality reminders. Zero cold-start lag.
- **No mid-swing coaching** — Borrowed from performance psychology. Corrections happen *between* sessions, not during. The after-action review feeds into the next briefing. The loop is closed — just not mid-execution.

## Quick Start

### Install

```bash
clawdhub install memory-pipeline
```

### Setup

```bash
bash skills/memory-pipeline/scripts/setup.sh
```

The setup script will detect your workspace, check dependencies (Python 3 + any LLM API key), create the `memory/` directory, and run the full pipeline.

### Requirements

- **Python 3**
- **At least one LLM API key** (auto-detected):
  - OpenAI (`OPENAI_API_KEY` or `~/.config/openai/api_key`)
  - Anthropic (`ANTHROPIC_API_KEY` or `~/.config/anthropic/api_key`)
  - Gemini (`GEMINI_API_KEY` or `~/.config/gemini/api_key`)

### Run Manually

```bash
# Full pipeline
python3 skills/memory-pipeline/scripts/memory-extract.py
python3 skills/memory-pipeline/scripts/memory-link.py
python3 skills/memory-pipeline/scripts/memory-briefing.py
```

### Automate via Heartbeat

Add to your `HEARTBEAT.md` for daily automatic runs:

```markdown
### Daily Memory Pipeline
- **Frequency:** Once per day (morning)
- **Action:** Run the memory pipeline:
  1. `python3 skills/memory-pipeline/scripts/memory-extract.py`
  2. `python3 skills/memory-pipeline/scripts/memory-link.py`
  3. `python3 skills/memory-pipeline/scripts/memory-briefing.py`
```

## Import External Knowledge

Already have years of conversations in ChatGPT? Import them so your agent knows what you know.

### ChatGPT Export

```bash
# 1. Export from ChatGPT: Settings → Data Controls → Export Data
# 2. Drop the zip in your workspace
# 3. Run:
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip

# Preview first (recommended):
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip --dry-run
```

**What it does:**
- Parses ChatGPT's conversation tree format
- Filters out throwaway conversations (configurable: `--min-turns`, `--min-length`)
- Supports topic exclusion (edit `EXCLUDE_PATTERNS` to skip unwanted topics)
- Outputs clean, dated markdown files to `memory/knowledge/chatgpt/`
- Files are automatically indexed by OpenClaw's semantic search

**Options:**
- `--dry-run` — Preview without writing files
- `--keep-all` — Skip all filtering
- `--min-turns N` — Minimum user messages to keep (default: 2)
- `--min-length N` — Minimum total characters (default: 200)

### Adding Other Sources

The pattern is extensible. Create `ingest-<source>.py`, parse the format, write markdown to `memory/knowledge/<source>/`. The indexer handles the rest.

## How the Pipeline Works

### Stage 1: Extract

**Script:** `memory-extract.py`

Reads daily notes (`memory/YYYY-MM-DD.md`) and session transcripts, then uses an LLM to extract structured facts:

```json
{"type": "decision", "content": "Use Rust for the backend", "subject": "Project Architecture", "confidence": 0.9}
{"type": "preference", "content": "Prefers Google Drive over Notion", "subject": "Tools", "confidence": 0.95}
```

**Output:** `memory/extracted.jsonl`

### Stage 2: Link

**Script:** `memory-link.py`

Takes extracted facts and builds a knowledge graph:
- Generates embeddings for semantic similarity
- Creates bidirectional links between related facts
- Detects contradictions and marks superseded facts
- Auto-generates domain tags

**Output:** `memory/knowledge-graph.json` + `memory/knowledge-summary.md`

### Stage 3: Briefing

**Script:** `memory-briefing.py`

Generates a compact daily briefing (< 2000 chars) combining:
- Personality traits (from `SOUL.md`)
- User context (from `USER.md`)
- Active projects and recent decisions
- Open todos

**Output:** `BRIEFING.md` (workspace root)

## Performance Hooks (Optional)

Four lifecycle hooks that enforce execution discipline during sessions. Based on a principle from performance psychology: **separate preparation from execution**.

```
User Message → Agent Loop
  ├── before_agent_start  →  Briefing packet (memory + checklist)
  ├── before_tool_call    →  Policy enforcement (deny list)
  ├── tool_result_persist →  Output compression (prevent context bloat)
  └── agent_end           →  After-action review (durable notes)
```

### Configuration

```json
{
  "enabled": true,
  "briefing": {
    "maxChars": 6000,
    "checklist": [
      "Restate the task in one sentence.",
      "List constraints and success criteria.",
      "Retrieve only the minimum relevant memory.",
      "Prefer tools over guessing when facts matter."
    ],
    "memoryFiles": ["memory/IDENTITY.md", "memory/PROJECTS.md"]
  },
  "tools": {
    "deny": ["dangerous_tool"],
    "maxToolResultChars": 12000
  },
  "afterAction": {
    "writeMemoryFile": "memory/AFTER_ACTION.md",
    "maxBullets": 8
  }
}
```

### Hook Details

| Hook | What it does |
|------|-------------|
| `before_agent_start` | Loads memory files, builds bounded briefing packet, injects into system prompt |
| `before_tool_call` | Checks tool against deny list, prevents unsafe calls |
| `tool_result_persist` | Head (60%) + tail (30%) compression of large results |
| `agent_end` | Appends session summary to memory file with tools used and outcomes |

## Output Files

| File | Location | Purpose |
|------|----------|---------|
| `BRIEFING.md` | Workspace root | Daily context cheat sheet |
| `extracted.jsonl` | `memory/` | All extracted facts (append-only) |
| `knowledge-graph.json` | `memory/` | Full graph with embeddings and links |
| `knowledge-summary.md` | `memory/` | Human-readable graph summary |
| `knowledge/chatgpt/*.md` | `memory/` | Ingested ChatGPT conversations |

## Customization

- **Change LLM models** — Edit model names in each script (supports OpenAI, Anthropic, Gemini)
- **Adjust extraction** — Modify the extraction prompt in `memory-extract.py` to focus on different fact types
- **Tune link sensitivity** — Change the similarity threshold in `memory-link.py` (default: 0.3)
- **Filter ingestion** — Edit `EXCLUDE_PATTERNS` in `ingest-chatgpt.py` for topic exclusion

## Troubleshooting

| Problem | Fix |
|---------|-----|
| No facts extracted | Check that daily notes or transcripts exist; verify API key |
| Low-quality links | Add OpenAI key for embedding-based similarity; adjust threshold |
| Briefing too long | Reduce facts in template or let LLM generation handle it (auto-constrained to 2000 chars) |

## See Also

- [Setup Guide](references/setup.md) — Detailed installation and configuration

Related Skills

memory-system-v2

from Demerzels-lab/elsamultiskillagent

Fast semantic memory system with JSON indexing, auto-consolidation, and <20ms search. Capture learnings, decisions, insights, events. Use when you need persistent memory across sessions or want to recall prior work/decisions.

dory-memory

from Demerzels-lab/elsamultiskillagent

File-based memory system for AI agents that forget between sessions.

memory-setup

from Demerzels-lab/elsamultiskillagent

Enable and configure Moltbot/Clawdbot memory search for persistent context. Use when setting up memory, fixing "goldfish brain," or helping users configure memorySearch in their config. Covers MEMORY.md, daily logs, and vector search setup.

jpj-memory-manager

from Demerzels-lab/elsamultiskillagent

Automatic session logging and memory management for infrastructure, projects, and tools.

human-like-memory

from Demerzels-lab/elsamultiskillagent

This skill provides long-term memory capabilities for conversations.

webnovel-serial-pipeline

from Demerzels-lab/elsamultiskillagent

Build and publish a Quartz-hosted Korean web-novel serial (draft→cover→webp→episode md→lint→publish)

ai-podcast-pipeline

from Demerzels-lab/elsamultiskillagent

Create Korean AI podcast packages from QuickView trend notes.

enhanced-memory

from Demerzels-lab/elsamultiskillagent

Enhanced memory search with hybrid vector+keyword scoring, temporal routing, filepath scoring, adaptive weighting.

persistent-memory

from Demerzels-lab/elsamultiskillagent

Three-layer persistent memory system (Markdown + ChromaDB vectors + NetworkX knowledge graph) for long-term agent.

vta-memory

from Demerzels-lab/elsamultiskillagent

Reward and motivation system for AI agents. Dopamine-like wanting, not just doing. Part of the AI Brain series.

insula-memory

from Demerzels-lab/elsamultiskillagent

Internal state awareness for AI agents. Energy, mood, and interoception. Part of the AI Brain series.

hippocampus-memory

from Demerzels-lab/elsamultiskillagent

Persistent memory system for AI agents. Automatic encoding, decay, and semantic reinforcement — just like the hippocampus in your brain. Based on Stanford Generative Agents (Park et al., 2023).