memory-systems
Design short-term, long-term, and graph-based memory architectures. Use when building agents that must persist across sessions, needing to maintain entity consistency across conversations, or implementing reasoning over accumulated knowledge.
About this skill
This skill empowers an AI agent to conceptualize and propose designs for sophisticated memory systems. It encompasses architectures for short-term conversational memory, long-term knowledge retention, and intricate graph-based representations essential for maintaining entity consistency across interactions, reasoning over accumulated knowledge, and enabling agents to persist across multiple sessions. The agent can outline the structure, components, and interaction patterns for these memory types, tailoring them to specific requirements for agent persistence, context management, and complex information recall.
Best use case
For AI developers or agents needing to outline blueprints for robust memory systems that enable agent persistence, maintain consistent understanding of entities across conversations, or facilitate complex reasoning over a continuously growing knowledge base. This includes designing components for chat history, user profiles, learned facts, and semantic networks.
A detailed textual description, conceptual diagram (if the agent has image generation capabilities), or pseudo-code outline of a proposed short-term, long-term, or graph-based memory architecture, tailored to specified requirements for persistence, entity tracking, or knowledge reasoning.
Practical example
Example input
Design a memory architecture for a customer service AI agent that needs to remember user preferences across sessions, track open support tickets, and understand product relationships from a knowledge base.
Example output
```json
{
  "memory_architecture_proposal": {
    "agent_type": "Customer Service AI Agent",
    "requirements": [
      "Remember user preferences across sessions",
      "Track open support tickets",
      "Understand product relationships from a knowledge base"
    ],
    "short_term_memory": {
      "type": "Context Window / FIFO Queue",
      "purpose": "Current conversation turn, recent user queries, immediate agent responses.",
      "data_structure": "List of (timestamp, speaker, message) tuples.",
      "retention": "Last 10-20 turns or until session timeout (e.g., 30 mins inactivity)."
    },
    "long_term_memory": {
      "type": "Key-Value Store / Relational Database (e.g., PostgreSQL)",
      "purpose": "User profiles, preferences, past interactions summaries, open ticket status.",
      "components": [
        "User Profile DB (id, name, email, preferences, interaction_history_summary)",
        "Ticket Management DB (ticket_id, user_id, status, description, last_update)"
      ],
      "retrieval_strategy": "User ID lookup, keyword search on interaction history."
    },
    "graph_based_memory": {
      "type": "Knowledge Graph (e.g., Neo4j, RDF Store)",
      "purpose": "Product relationships (e.g., 'product X is a component of product Y', 'product A is compatible with product B'), common issues, troubleshooting steps.",
      "entities": ["Product", "Feature", "Issue", "Solution", "User", "Ticket"],
      "relationships": ["HAS_FEATURE", "IS_COMPONENT_OF", "IS_COMPATIBLE_WITH", "CAUSES", "RESOLVES", "REPORTED_BY", "RELATED_TO"],
      "retrieval_strategy": "Graph traversal, semantic search, entity linking."
    },
    "integration_notes": "Implement an orchestrator to manage memory access, prioritizing short-term, then long-term, and consulting the knowledge graph for complex queries. Embeddings for semantic search in long-term and graph memory are recommended."
  }
}
```
When to use this skill
When an agent needs to generate architectural designs or detailed conceptual outlines for memory systems that support:
1. **Session Persistence**: Allowing agents to remember past interactions beyond a single conversational turn.
2. **Entity Consistency**: Ensuring agents maintain a consistent understanding and reference for entities (people, objects, concepts) across various conversations.
3. **Knowledge Accumulation & Reasoning**: Enabling agents to build a knowledge base over time and perform complex reasoning tasks based on this accumulated information.
4. **Complex Context Management**: Handling evolving contexts in long-running interactions.
When not to use this skill
This skill is not suitable for:
1. **Direct Memory Manipulation**: It does not directly manage, write to, or read from an existing memory system; it focuses on *design*.
2. **Simple Stateless Tasks**: Agents designed for single-turn, stateless operations with no need for memory or context.
3. **Executing Code**: The skill generates design *proposals*, not executable memory management code or database queries.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/memory-systems/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How memory-systems Compares
| Feature / Agent | memory-systems | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |
Frequently Asked Questions
What does this skill do?
Design short-term, long-term, and graph-based memory architectures. Use when building agents that must persist across sessions, needing to maintain entity consistency across conversations, or implementing reasoning over accumulated knowledge.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
## When to Use This Skill
Use this skill when designing short-term, long-term, and graph-based memory architectures.
# Memory System Design
Memory provides the persistence layer that allows agents to maintain continuity across sessions and reason over accumulated knowledge. Simple agents rely entirely on context for memory, losing all state when sessions end. Sophisticated agents implement layered memory architectures that balance immediate context needs with long-term knowledge retention. The evolution from vector stores to knowledge graphs to temporal knowledge graphs represents increasing investment in structured memory for improved retrieval and reasoning.
## When to Use
Activate this skill when:
- Building agents that must persist across sessions
- Needing to maintain entity consistency across conversations
- Implementing reasoning over accumulated knowledge
- Designing systems that learn from past interactions
- Creating knowledge bases that grow over time
- Building temporal-aware systems that track state changes
## Core Concepts
Memory exists on a spectrum from immediate context to permanent storage. At one extreme, working memory in the context window provides zero-latency access but vanishes when sessions end. At the other extreme, permanent storage persists indefinitely but requires retrieval to enter context.
Simple vector stores lack relationship and temporal structure. Knowledge graphs preserve relationships for reasoning. Temporal knowledge graphs add validity periods for time-aware queries. Implementation choices depend on query complexity, infrastructure constraints, and accuracy requirements.
## Detailed Topics
### Memory Architecture Fundamentals
**The Context-Memory Spectrum**
Memory spans a spectrum from immediate context to permanent storage, trading access latency against persistence. Effective architectures use multiple layers along this spectrum.
The spectrum includes working memory (context window, zero latency, volatile), short-term memory (session-persistent, searchable, volatile), long-term memory (cross-session persistent, structured, semi-permanent), and permanent memory (archival, queryable, permanent). Each layer has different latency, capacity, and persistence characteristics.
**Why Simple Vector Stores Fall Short**
Vector RAG provides semantic retrieval by embedding queries and documents in a shared embedding space. Similarity search retrieves the most semantically similar documents. This works well for document retrieval but lacks structure for agent memory.
Vector stores lose relationship information. If an agent learns that "Customer X purchased Product Y on Date Z," a vector store can retrieve this fact if asked directly. But it cannot answer "What products did customers who purchased Product Y also buy?" because relationship structure is not preserved.
Vector stores also struggle with temporal validity. Facts change over time, but vector stores provide no mechanism to distinguish "current fact" from "outdated fact" except through explicit metadata and filtering.
**The Move to Graph-Based Memory**
Knowledge graphs preserve relationships between entities. Instead of isolated document chunks, graphs encode that Entity A has Relationship R to Entity B. This enables queries that traverse relationships rather than just similarity.
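As a minimal sketch of what relationship traversal buys, the "customers who purchased Product Y also bought" question can be answered with a two-hop walk over stored triples. The triple layout, the `PURCHASED` relation name, and the sample data are all illustrative assumptions, not part of any particular graph database.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; names are illustrative only.
TRIPLES = [
    ("customer_x", "PURCHASED", "product_y"),
    ("customer_x", "PURCHASED", "product_a"),
    ("customer_w", "PURCHASED", "product_y"),
    ("customer_w", "PURCHASED", "product_b"),
    ("customer_v", "PURCHASED", "product_c"),
]


def build_indexes(triples):
    """Index purchases in both directions so each hop is a dictionary lookup."""
    bought_by = defaultdict(set)  # product -> customers
    bought = defaultdict(set)     # customer -> products
    for subject, relation, obj in triples:
        if relation == "PURCHASED":
            bought_by[obj].add(subject)
            bought[subject].add(obj)
    return bought_by, bought


def also_bought(product, triples):
    """Two-hop traversal: product -> its buyers -> their other products."""
    bought_by, bought = build_indexes(triples)
    related = set()
    for customer in bought_by.get(product, ()):
        related |= bought[customer]
    related.discard(product)
    return sorted(related)


print(also_bought("product_y", TRIPLES))  # ['product_a', 'product_b']
```

A similarity search over embedded sentences has no equivalent of the second hop, which is the gap the graph structure fills.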
Temporal knowledge graphs add validity periods to facts. Each fact has a "valid from" and optionally "valid until" timestamp. This enables time-travel queries that reconstruct knowledge at specific points in time.
**Benchmark Performance Comparison**
The Deep Memory Retrieval (DMR) benchmark provides concrete performance data across memory architectures:
| Memory System | DMR Accuracy | Retrieval Latency | Notes |
|---------------|--------------|-------------------|-------|
| Zep (Temporal KG) | 94.8% | 2.58s | Best accuracy, fast retrieval |
| MemGPT | 93.4% | Variable | Good general performance |
| GraphRAG | ~75-85% | Variable | 20-35% gains over baseline RAG |
| Vector RAG | ~60-70% | Fast | Loses relationship structure |
| Recursive Summarization | 35.3% | Low | Severe information loss |
Zep demonstrated a roughly 90% reduction in retrieval latency compared to full-context baselines (2.58s vs 28.9s for GPT-4o). This efficiency comes from retrieving only relevant subgraphs rather than the entire context history.
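The reduction figure can be checked directly from the two latencies quoted above. The helper name below is an illustrative assumption; the arithmetic is simply one minus the ratio of the two times.

```python
def latency_reduction(baseline_s: float, optimized_s: float) -> float:
    """Fractional latency reduction relative to the baseline."""
    return 1.0 - optimized_s / baseline_s


reduction = latency_reduction(baseline_s=28.9, optimized_s=2.58)
print(f"{reduction:.1%}")  # 91.1%
```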
GraphRAG achieves approximately 20-35% accuracy gains over baseline RAG in complex reasoning tasks and reduces hallucination by up to 30% through community-based summarization.
### Memory Layer Architecture
**Layer 1: Working Memory**
Working memory is the context window itself. It provides immediate access to information currently being processed but has limited capacity and vanishes when sessions end.
Working memory usage patterns include scratchpad calculations where agents track intermediate results, conversation history that preserves dialogue for the current task, current task state that tracks progress on active objectives, and active retrieved documents that hold information currently being used.
Optimize working memory by keeping only active information, summarizing completed work before it falls out of attention, and using attention-favored positions for critical information.
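One way to sketch "summarize completed work before it falls out of attention" is to keep the most recent turns verbatim and fold everything older into a single summary entry. The function name and the placeholder summarizer are assumptions for illustration; a real agent would call a model to produce the summary.

```python
def compact_history(turns, keep_last=4, summarize=None):
    """Keep the most recent turns verbatim and fold older ones into one summary entry."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    if summarize is None:
        # Placeholder summarizer (assumption): stands in for a model call.
        def summarize(older):
            return f"[summary of {len(older)} earlier turns]"
    if len(turns) <= keep_last:
        return list(turns)
    older, recent = turns[:-keep_last], turns[-keep_last:]
    # The summary goes first so it sits at the start of the context.
    return [summarize(older)] + list(recent)


history = [f"turn {i}" for i in range(1, 8)]
print(compact_history(history, keep_last=3))
# ['[summary of 4 earlier turns]', 'turn 5', 'turn 6', 'turn 7']
```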
**Layer 2: Short-Term Memory**
Short-term memory persists across the current session but not across sessions. It provides search and retrieval capabilities without the latency of permanent storage.
Common implementations include session-scoped databases that persist until session end, file-system storage in designated session directories, and in-memory caches keyed by session ID.
Short-term memory use cases include tracking conversation state across turns without stuffing context, storing intermediate results from tool calls that may be needed later, maintaining task checklists and progress tracking, and caching retrieved information within sessions.
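The in-memory variant can be sketched as a dictionary keyed by session ID. The class and method names here are hypothetical, and the substring search is deliberately naive; it only illustrates that short-term memory is searchable and is discarded when the session ends.

```python
class SessionMemory:
    """In-memory short-term store keyed by session ID (illustrative sketch)."""

    def __init__(self):
        self._sessions = {}

    def put(self, session_id, key, value):
        self._sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id, key, default=None):
        return self._sessions.get(session_id, {}).get(key, default)

    def search(self, session_id, text):
        """Naive substring search over keys and stringified values in one session."""
        needle = text.lower()
        items = self._sessions.get(session_id, {}).items()
        return [k for k, v in items if needle in k.lower() or needle in str(v).lower()]

    def end_session(self, session_id):
        """Drop everything for the session; short-term memory does not outlive it."""
        self._sessions.pop(session_id, None)


store = SessionMemory()
store.put("s1", "tool:weather", {"city": "Oslo", "temp_c": 4})
store.put("s1", "checklist", ["fetch data", "draft reply"])
print(store.search("s1", "oslo"))  # ['tool:weather']
store.end_session("s1")
print(store.get("s1", "checklist"))  # None
```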
**Layer 3: Long-Term Memory**
Long-term memory persists across sessions indefinitely. It enables agents to learn from past interactions and build knowledge over time.
Long-term memory implementations range from simple key-value stores to sophisticated graph databases. The choice depends on complexity of relationships to model, query patterns required, and acceptable infrastructure complexity.
Long-term memory use cases include learning user preferences across sessions, building domain knowledge bases that grow over time, maintaining entity registries with relationship history, and storing successful patterns that can be reused.
**Layer 4: Entity Memory**
Entity memory specifically tracks information about entities (people, places, concepts, objects) to maintain consistency. This creates a rudimentary knowledge graph where entities are recognized across multiple interactions.
Entity memory maintains entity identity by tracking that "John Doe" mentioned in one conversation is the same person in another. It maintains entity properties by storing facts discovered about entities over time. It maintains entity relationships by tracking relationships between entities as they are discovered.
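The three responsibilities (identity, properties, relationships) can be sketched with an alias table that maps surface forms to a stable entity ID. Everything below is an assumption for illustration: the `EntityMemory` class, the `ent_N` ID scheme, and the whitespace/case normalization, which is far simpler than real entity resolution.

```python
class EntityMemory:
    """Tracks entity identity, properties, and relationships (illustrative sketch)."""

    def __init__(self):
        self._entities = {}  # entity_id -> {"properties": {}, "relations": set()}
        self._aliases = {}   # normalized surface form -> entity_id

    @staticmethod
    def _norm(name):
        return " ".join(name.lower().split())

    def resolve(self, name, create=True):
        """Map a mention to its entity ID, creating the entity on first sight."""
        key = self._norm(name)
        if key not in self._aliases and create:
            entity_id = f"ent_{len(self._entities) + 1}"
            self._entities[entity_id] = {"properties": {}, "relations": set()}
            self._aliases[key] = entity_id
        return self._aliases.get(key)

    def add_alias(self, name, alias):
        self._aliases[self._norm(alias)] = self.resolve(name)

    def set_property(self, name, key, value):
        self._entities[self.resolve(name)]["properties"][key] = value

    def relate(self, subject, relation, obj):
        target = self.resolve(obj)
        self._entities[self.resolve(subject)]["relations"].add((relation, target))

    def describe(self, name):
        return self._entities.get(self.resolve(name, create=False))


em = EntityMemory()
em.set_property("John Doe", "role", "account manager")  # first conversation
em.add_alias("John Doe", "J. Doe")
em.set_property("j. doe", "city", "Lisbon")             # later conversation, same person
em.relate("John Doe", "WORKS_AT", "Acme")
print(em.describe("JOHN  DOE")["properties"])
# {'role': 'account manager', 'city': 'Lisbon'}
```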
**Layer 5: Temporal Knowledge Graphs**
Temporal knowledge graphs extend entity memory with explicit validity periods. Facts are not just true or false but true during specific time ranges.
This enables queries like "What was the user's address on Date X?" by retrieving facts valid during that date range. It prevents context clash when outdated information contradicts new data. It enables temporal reasoning about how entities changed over time.
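A minimal in-process sketch of validity periods, assuming half-open intervals (`valid_from` inclusive, `valid_until` exclusive) and an open-ended `None` for facts that are still true. The `Fact` class, the `as_of` helper, and the sample addresses are illustrative, not a specific library's API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: date
    valid_until: Optional[date] = None  # None means "still valid"

    def valid_at(self, when: date) -> bool:
        return self.valid_from <= when and (self.valid_until is None or when < self.valid_until)


def as_of(facts, subject, predicate, when):
    """Return the values that were valid for (subject, predicate) on the given date."""
    return [f.value for f in facts
            if f.subject == subject and f.predicate == predicate and f.valid_at(when)]


facts = [
    Fact("user_1", "LIVES_AT", "12 Oak St", date(2022, 3, 1), date(2024, 2, 1)),
    Fact("user_1", "LIVES_AT", "98 Pine Ave", date(2024, 2, 1)),
]
print(as_of(facts, "user_1", "LIVES_AT", date(2024, 1, 15)))  # ['12 Oak St']
print(as_of(facts, "user_1", "LIVES_AT", date(2024, 6, 1)))   # ['98 Pine Ave']
```

Because the old address is closed rather than deleted, the outdated fact never competes with the current one, yet remains available for historical questions.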
### Memory Implementation Patterns
**Pattern 1: File-System-as-Memory**
The file system itself can serve as a memory layer. This pattern is simple, requires no additional infrastructure, and enables the same just-in-time loading that makes file-system-based context effective.
Implementation uses the file system hierarchy for organization. Use naming conventions that convey meaning. Store facts in structured formats (JSON, YAML). Use timestamps in filenames or metadata for temporal tracking.
Advantages: Simplicity, transparency, portability.
Disadvantages: No semantic search, no relationship tracking, manual organization required.
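A small sketch of this pattern using only the standard library. The directory layout (`<root>/<category>/<name>.json`), the function names, and the `saved_at` field are assumptions chosen for illustration; the example writes into a temporary directory so it leaves nothing behind.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_fact(root, category, name, fact):
    """Write one fact as JSON under <root>/<category>/<name>.json with a timestamp."""
    path = Path(root) / category / f"{name}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"fact": fact, "saved_at": datetime.now(timezone.utc).isoformat()}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


def load_fact(root, category, name):
    """Load a single fact on demand; return None if it was never stored."""
    path = Path(root) / category / f"{name}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))["fact"]


def list_facts(root, category):
    """List stored fact names in a category without reading their contents."""
    return sorted(p.stem for p in (Path(root) / category).glob("*.json"))


with tempfile.TemporaryDirectory() as root:
    save_fact(root, "users", "user_1_preferences", {"language": "en", "tone": "concise"})
    print(list_facts(root, "users"))                       # ['user_1_preferences']
    print(load_fact(root, "users", "user_1_preferences"))  # {'language': 'en', 'tone': 'concise'}
```

Listing names without opening files is what makes just-in-time loading cheap: the agent sees what exists and reads only what it needs.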
**Pattern 2: Vector RAG with Metadata**
Vector stores enhanced with rich metadata provide semantic search with filtering capabilities.
Implementation embeds facts or documents and stores them with metadata including entity tags, temporal validity, source attribution, and confidence scores. Queries combine metadata filters with semantic search.
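A toy sketch of filtered semantic search in pure Python. The three-dimensional vectors stand in for real embeddings, and the `entity`, `valid`, and `confidence` metadata fields are illustrative assumptions; a production system would delegate both steps to a vector store.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 3-dimensional vectors stand in for real embeddings (assumption).
MEMORIES = [
    {"text": "User prefers email contact", "vector": [0.9, 0.1, 0.0],
     "entity": "user_1", "valid": True, "confidence": 0.9},
    {"text": "User prefers phone contact", "vector": [0.8, 0.2, 0.1],
     "entity": "user_1", "valid": False, "confidence": 0.7},
    {"text": "Order 42 shipped late", "vector": [0.1, 0.9, 0.2],
     "entity": "user_2", "valid": True, "confidence": 0.8},
]


def search(query_vector, memories, top_k=2, **filters):
    """Apply exact-match metadata filters first, then rank survivors by cosine similarity."""
    candidates = [m for m in memories
                  if all(m.get(k) == v for k, v in filters.items())]
    ranked = sorted(candidates, key=lambda m: cosine(query_vector, m["vector"]), reverse=True)
    return [m["text"] for m in ranked[:top_k]]


print(search([1.0, 0.0, 0.0], MEMORIES, entity="user_1", valid=True))
# ['User prefers email contact']
```

Without the `valid=True` filter the outdated phone preference would rank second, which is the failure mode the metadata is there to prevent.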
**Pattern 3: Knowledge Graph**
Knowledge graphs explicitly model entities and relationships. Implementation defines entity types and relationship types, uses a graph database or property graph storage, and maintains indexes for common query patterns.
**Pattern 4: Temporal Knowledge Graph**
Temporal knowledge graphs add validity periods to facts, enabling time-travel queries and preventing context clash from outdated information.
### Memory Retrieval Patterns
**Semantic Retrieval**
Retrieve memories semantically similar to the current query using embedding similarity search.
**Entity-Based Retrieval**
Retrieve all memories related to specific entities by traversing graph relationships.
**Temporal Retrieval**
Retrieve memories valid at a specific time or within a time range using validity period filters.
### Memory Consolidation
Memories accumulate over time and require consolidation to prevent unbounded growth and remove outdated information.
**Consolidation Triggers**
Trigger consolidation after significant memory accumulation, when retrieval returns too many outdated results, periodically on a schedule, or when explicit consolidation is requested.
**Consolidation Process**
Identify outdated facts, merge related facts, update validity periods, archive or delete obsolete facts, and rebuild indexes.
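The steps above can be sketched as a single pass over stored facts. This is a simplification under stated assumptions: each (subject, predicate) pair holds one value at a time, timestamps are plain integers, and "rebuilding indexes" means rebuilding one lookup dictionary. The function name and fact layout are hypothetical.

```python
from collections import defaultdict


def consolidate(facts, now):
    """Close superseded facts, archive expired ones, and rebuild a lookup index.

    Each fact is a dict with keys: subject, predicate, value, valid_from, valid_until.
    """
    groups = defaultdict(list)
    for fact in facts:
        # Copy each fact so the caller's records are not mutated.
        groups[(fact["subject"], fact["predicate"])].append(dict(fact))

    active, archived = [], []
    for group in groups.values():
        group.sort(key=lambda f: f["valid_from"])
        # A newer fact for the same subject/predicate supersedes the older one.
        for older, newer in zip(group, group[1:]):
            if older["valid_until"] is None:
                older["valid_until"] = newer["valid_from"]
        for fact in group:
            expired = fact["valid_until"] is not None and fact["valid_until"] <= now
            (archived if expired else active).append(fact)

    index = {(f["subject"], f["predicate"]): f["value"] for f in active}
    return active, archived, index


facts = [
    {"subject": "user_1", "predicate": "plan", "value": "free", "valid_from": 1, "valid_until": None},
    {"subject": "user_1", "predicate": "plan", "value": "pro", "valid_from": 5, "valid_until": None},
    {"subject": "user_1", "predicate": "email", "value": "a@example.com", "valid_from": 2, "valid_until": None},
]
active, archived, index = consolidate(facts, now=10)
print(index[("user_1", "plan")])              # pro
print([f["value"] for f in archived])         # ['free']
```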
## Practical Guidance
### Integration with Context
Memories must integrate with context systems to be useful. Use just-in-time memory loading to retrieve relevant memories when needed. Use strategic injection to place memories in attention-favored positions.
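A rough sketch of both ideas together, assuming that "attention-favored positions" means the beginning and end of the prompt. The `build_prompt` function, the section labels, and the stand-in `lookup` retriever are all hypothetical; the point is that retrieval happens only at prompt-assembly time and that placement is deliberate.

```python
def build_prompt(task, memory_lookup, query, max_memories=3):
    """Fetch memories only when building the prompt, then place them at the edges.

    `memory_lookup` is any callable returning memories ordered by relevance (assumption).
    """
    memories = memory_lookup(query)[:max_memories]
    sections = []
    if memories:
        # The most relevant memory goes first.
        sections.append("Key memory: " + memories[0])
    sections.append("Task: " + task)
    if len(memories) > 1:
        # Remaining memories go last, the other edge of the prompt.
        sections.append("Supporting memories:\n" + "\n".join("- " + m for m in memories[1:]))
    return "\n\n".join(sections)


def lookup(query):
    # Stand-in retriever returning memories already sorted by relevance.
    return ["User prefers concise answers", "User is on the Pro plan", "Last ticket: billing"]


prompt = build_prompt("Draft a reply about the invoice", lookup, "invoice", max_memories=2)
print(prompt)
```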
### Memory System Selection
Choose memory architecture based on requirements:
- Simple persistence needs: File-system memory
- Semantic search needs: Vector RAG with metadata
- Relationship reasoning needs: Knowledge graph
- Temporal validity needs: Temporal knowledge graph
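The selection list can be read as a decision function. The ordering below, where the most demanding requirement wins, is an assumption added for illustration; the list itself does not rank the options.

```python
def choose_memory_architecture(needs_semantic_search=False,
                               needs_relationships=False,
                               needs_temporal_validity=False):
    """Pick the simplest architecture from the list that covers the stated needs."""
    if needs_temporal_validity:
        return "temporal knowledge graph"
    if needs_relationships:
        return "knowledge graph"
    if needs_semantic_search:
        return "vector RAG with metadata"
    return "file-system memory"


print(choose_memory_architecture())                          # file-system memory
print(choose_memory_architecture(needs_relationships=True))  # knowledge graph
```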
## Examples
**Example 1: Entity Tracking**
```python
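from datetime import datetime, timezone

# Stand-in for a persistent store; a dict keeps the example runnable.
memory = {}

# Track entity across conversations
def remember_entity(entity_id, properties):
    memory[entity_id] = {
        "type": "entity",
        "id": entity_id,
        "properties": properties,
        "last_updated": datetime.now(timezone.utc).isoformat(),
    }

def get_entity(entity_id):
    # Returns None when the entity has never been stored.
    return memory.get(entity_id)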
```
**Example 2: Temporal Query**
```python
# What was the user's address on January 15, 2024?
# `temporal_graph` is assumed to be a graph client whose query() accepts
# a Cypher string and a parameter dict.
def query_address_at_time(user_id, query_time):
    return temporal_graph.query("""
        MATCH (user)-[r:LIVES_AT]->(address)
        WHERE user.id = $user_id
          AND r.valid_from <= $query_time
          AND (r.valid_until IS NULL OR r.valid_until > $query_time)
        RETURN address
    """, {"user_id": user_id, "query_time": query_time})
```
## Guidelines
1. Match memory architecture to query requirements
2. Implement progressive disclosure for memory access
3. Use temporal validity to prevent outdated information conflicts
4. Consolidate memories periodically to prevent unbounded growth
5. Design systems to handle memory retrieval failures gracefully
6. Consider privacy implications of persistent memory
7. Implement backup and recovery for critical memories
8. Monitor memory growth and performance over time
## Integration
This skill builds on context-fundamentals. It connects to:
- multi-agent-patterns - Shared memory across agents
- context-optimization - Memory-based context loading
- evaluation - Evaluating memory quality
## References
Internal reference:
- Implementation Reference - Detailed implementation patterns
Related skills in this collection:
- context-fundamentals - Context basics
- multi-agent-patterns - Cross-agent memory
External resources:
- Graph database documentation (Neo4j, etc.)
- Vector store documentation (Pinecone, Weaviate, etc.)
- Research on knowledge graphs and reasoning
---
## Skill Metadata
**Created**: 2025-12-20
**Last Updated**: 2025-12-20
**Author**: Agent Skills for Context Engineering Contributors
**Version**: 1.0.0