hierarchical-agent-memory

Scoped CLAUDE.md memory system that reduces context token spend. Creates directory-level context files, tracks savings via dashboard, and routes agents to the right sub-context.

16 stars

Best use case

hierarchical-agent-memory is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Scoped CLAUDE.md memory system that reduces context token spend. Creates directory-level context files, tracks savings via dashboard, and routes agents to the right sub-context.

Teams using hierarchical-agent-memory should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/hierarchical-agent-memory/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/hierarchical-agent-memory/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/hierarchical-agent-memory/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How hierarchical-agent-memory Compares

Feature / Agenthierarchical-agent-memoryStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Scoped CLAUDE.md memory system that reduces context token spend. Creates directory-level context files, tracks savings via dashboard, and routes agents to the right sub-context.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Hierarchical Agent Memory (HAM)

Scoped memory system that gives AI coding agents a cheat sheet for each directory instead of re-reading your entire project every prompt. Root CLAUDE.md holds global context (~200 tokens), subdirectory CLAUDE.md files hold scoped context (~250 tokens each), and a `.memory/` layer stores decisions, patterns, and an inbox for unconfirmed inferences.

## When to Use This Skill

- Use when you want to reduce input token costs across Claude Code sessions
- Use when your project has 3+ directories and the agent keeps re-reading the same files
- Use when you want directory-scoped context instead of one monolithic CLAUDE.md
- Use when you want a dashboard to visualize token savings, session history, and context health
- Use when setting up a new project and want structured agent memory from day one

## How It Works

### Step 1: Setup ("go ham")

Auto-detects your project platform and maturity, then generates the memory structure:

```
project/
├── CLAUDE.md              # Root context (~200 tokens)
├── .memory/
│   ├── decisions.md       # Architecture Decision Records
│   ├── patterns.md        # Reusable patterns
│   ├── inbox.md           # Inferred items awaiting confirmation
│   └── audit-log.md       # Audit history
└── src/
    ├── api/CLAUDE.md      # Scoped context for api/
    ├── components/CLAUDE.md
    └── lib/CLAUDE.md
```

### Step 2: Context Routing

The root CLAUDE.md includes a routing section that tells the agent exactly which sub-context to load:

```markdown
## Context Routing

→ api: src/api/CLAUDE.md
→ components: src/components/CLAUDE.md
→ lib: src/lib/CLAUDE.md
```

The agent reads root, then immediately loads the relevant subdirectory context — no guessing.

### Step 3: Dashboard ("ham dashboard")

Launches a web dashboard at localhost:7777 that visualizes:
- Token savings (HAM-on vs HAM-off sessions)
- Daily token and cost trends
- Per-directory session breakdown
- Context file health (missing/stale/inherited CLAUDE.md coverage)
- Routing compliance (how often the agent follows the routing map)
- Carbon/energy estimates

## Commands

| Trigger | What it does |
|---|---|
| `go ham` | Set up HAM — auto-detect platform, generate CLAUDE.md files |
| `ham savings` | Show token and cost savings report |
| `ham dashboard` | Launch the interactive web dashboard |
| `ham audit` | Health check on memory files |
| `ham insights` | Generate actionable insights from session data |
| `ham route` | Add/update Context Routing section in root CLAUDE.md |
| `ham carbon` | Show energy and carbon efficiency data |

## Examples

### Example 1: First-time setup

```
User: go ham

Agent: HAM setup complete. Created 8 files.
  - CLAUDE.md (root)
  - .memory/decisions.md
  - .memory/patterns.md
  - .memory/inbox.md
  - src/api/CLAUDE.md
  - src/components/CLAUDE.md
  - src/lib/CLAUDE.md
  - src/utils/CLAUDE.md

Baseline captured in .memory/baseline.json
Run "HAM savings" to see your token and cost savings.
```

### Example 2: Checking savings

```
User: ham savings

Agent:
  Before HAM: ~7,500 tokens/prompt
  After HAM:    ~450 tokens/prompt
  Savings:    7,050 tokens (94%)

  Monthly projection (1,500 prompts):
    Sonnet: ~$31.73 saved
    Opus:   ~$158.63 saved
```

## Best Practices

- Keep root CLAUDE.md under 60 lines / 250 tokens
- Keep subdirectory CLAUDE.md files under 75 lines each
- Run `ham audit` every 2 weeks to catch stale or missing context files
- Use `ham route` after adding new directories to keep routing current
- Review `.memory/inbox.md` periodically — confirm or reject inferred items

## Limitations

- Token estimates use ~4 chars = 1 token approximation, not a real tokenizer
- Baseline savings comparisons are estimates based on typical agent behavior
- Dashboard requires Node.js 18+ and reads session data from `~/.claude/projects/`
- Context routing detection relies on CLAUDE.md read order in session JSONL files
- Does not auto-update subdirectory CLAUDE.md content — you maintain those manually or via `ham audit`
- Carbon estimates use regional grid averages, not real-time energy data

## Related Skills

- `agent-memory-systems` — general agent memory architecture patterns
- `agent-memory-mcp` — MCP-based memory integration

Related Skills

helix-memory

16
from diegosouzapw/awesome-omni-skill

Long-term memory system for Claude Code using HelixDB graph-vector database. Store and retrieve facts, preferences, context, and relationships across sessions using semantic search, reasoning chains, and time-window filtering.

agentMemory

16
from diegosouzapw/awesome-omni-skill

A hybrid memory system that provides persistent, searchable knowledge management for AI agents.

agent-memory-systems

16
from diegosouzapw/awesome-omni-skill

Memory is the cornerstone of intelligent agents. Without it, every interaction starts from zero. This skill covers the architecture of agent memory: short-term (context window), long-term (vector stores), and the cognitive architectures that organize them. Key insight: Memory isn't just storage - it's retrieval. A million stored facts mean nothing if you can't find the right one. Chunking, embedding, and retrieval strategies determine whether your agent remembers or forgets. The field is fragm

agent-memory-skills

16
from diegosouzapw/awesome-omni-skill

Self-improving agent architecture using ChromaDB for continuous learning, self-evaluation, and improvement storage. Agents maintain separate memory collections for learned patterns, performance metrics, and self-assessments without modifying their static .md configuration.

agent-memory-mcp

16
from diegosouzapw/awesome-omni-skill

A hybrid memory system that provides persistent, searchable knowledge management for AI agents (Architecture, Patterns, Decisions).

agent-memory

16
from diegosouzapw/awesome-omni-skill

Long-term memory store for AI agents - save, search, and manage persistent memories across sessions. Load this skill for complete command reference.

memorylane

16
from diegosouzapw/awesome-omni-skill

Zero-config persistent memory for Claude with automatic cost savings. Use when you need to remember project context, reduce API token costs, track learned patterns, manage memories across sessions, or curate/clean up memories. Automatically compresses context 6x and saves 84% on API costs. Keywords: memory, remember, recall, context, cost savings, reduce tokens, learn, patterns, insights, curate, clean up memories, review memories.

ai-runtime-memory

16
from diegosouzapw/awesome-omni-skill

AI Runtime分层记忆系统,支持SQL风格的事件查询、时间线管理,以及记忆的智能固化和检索,用于项目历史追踪和经验传承

qdrant-memory

16
from diegosouzapw/awesome-omni-skill

Intelligent token optimization through Qdrant-powered semantic caching and long-term memory. Use for (1) Semantic Cache - avoid LLM calls entirely for semantically similar queries with 100% token savings, (2) Long-Term Memory - retrieve only relevant context chunks instead of full conversation history with 80-95% context reduction, (3) Hybrid Search - combine vector similarity with keyword filtering for technical queries, (4) Memory Management - store and retrieve conversation memories, decisions, and code patterns with metadata filtering. Triggers when needing to cache responses, remember past interactions, optimize context windows, or implement RAG patterns.

project-memory

16
from diegosouzapw/awesome-omni-skill

Set up and maintain a structured project memory system in docs/project_notes/ that tracks bugs with solutions, architectural decisions, key project facts, and work history. Use this skill when asked to "set up project memory", "track our decisions", "log a bug fix", "update project memory", or "initialize memory system". Configures both CLAUDE.md and AGENTS.md to maintain memory awareness across different AI coding tools.

memory-systems

16
from diegosouzapw/awesome-omni-skill

Design short-term, long-term, and graph-based memory architectures

elite-longterm-memory

16
from diegosouzapw/awesome-omni-skill

Ultimate AI agent memory system for Cursor, Claude, ChatGPT & Copilot. WAL protocol + vector search + git-notes + cloud backup. Never lose context again. Vibe-coding ready.