recursive-knowledge
Process large document corpora (1000+ docs, millions of tokens) through knowledge graph construction and stateful multi-hop reasoning. Use when (1) User provides a large corpus exceeding context limits, (2) Questions require connections across multiple documents, (3) Multi-hop reasoning needed for complex queries, (4) User wants persistent queryable knowledge from documents. Replaces brute-force document stuffing with intelligent graph traversal.
Best use case
recursive-knowledge is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using recursive-knowledge should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/recursive-knowledge/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How recursive-knowledge Compares
| Feature / Agent | recursive-knowledge | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It indexes a large document corpus (1000+ docs, millions of tokens) into a knowledge graph and answers questions through stateful multi-hop traversal of that graph, rather than loading every document into context. It suits corpora that exceed context limits, questions that span multiple documents, and cases where you want persistent, queryable knowledge.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Recursive Knowledge Processing

Process arbitrarily large document sets through knowledge graph construction and stateful multi-hop queries. Based on RLM research but with proper state management and termination logic.

## Core Concept

Instead of stuffing documents into context (which causes degradation), this skill:

1. Indexes documents into a knowledge graph (entities, relationships)
2. Answers queries by traversing the graph
3. Tracks state to avoid redundant exploration
4. Uses confidence thresholds to know when to stop

## Workflow

### Phase 1: Indexing

For a new corpus, run the indexer:

```bash
python3 scripts/index_corpus.py --input /path/to/documents --output /path/to/graph.json
```

This extracts:

- **Entities**: People, organizations, concepts, dates, locations
- **Relationships**: References, mentions, contradicts, supports, relates_to
- **Metadata**: Source document, position, extraction confidence

For details on entity/relationship schema, see [references/graph-schema.md](references/graph-schema.md).

### Phase 2: Querying

For user queries against an indexed corpus:

```bash
python3 scripts/query.py --graph /path/to/graph.json --query "user question here"
```

The query engine:

1. Parses the query into target entities/relationships
2. Finds entry points in the graph
3. Traverses with state tracking
4. Stops when the confidence threshold is met
5. Returns an answer with provenance

### Phase 3: Incremental Updates

Add new documents to an existing graph:

```bash
python3 scripts/index_corpus.py --input /path/to/new_docs --output /path/to/graph.json --append
```

## State Management (Critical)

The key improvement over naive recursive approaches is **stateful traversal**. See [references/state-management.md](references/state-management.md) for full details.
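To make "stateful traversal" concrete, here is a minimal sketch, not the skill's actual implementation. It assumes a hypothetical in-memory adjacency list (`GRAPH`) and hypothetical relationship names; the real `graph.json` layout is defined in `references/graph-schema.md` and is not reproduced here. The point is only that tracking visited nodes and edges lets a walk over a cyclic graph finish without re-exploring anything.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relationship, neighbor).
# Names and layout are illustrative only, not the skill's schema.
GRAPH = {
    "Alice": [("works_on", "Project Y")],
    "Project Y": [("staffed_by", "Alice"), ("staffed_by", "Bob")],
    "Bob": [("works_on", "Project Y")],
}


def traverse(graph, start, max_depth=3):
    """Breadth-first walk that never revisits a node or an edge."""
    visited_nodes = {start}
    visited_edges = set()
    queue = deque([(start, 0)])  # (node, depth)
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth cap reached: do not expand this node
        for relation, neighbor in graph.get(node, []):
            edge = (node, relation, neighbor)
            if edge in visited_edges:
                continue  # already traversed this relationship
            visited_edges.add(edge)
            if neighbor not in visited_nodes:
                visited_nodes.add(neighbor)
                queue.append((neighbor, depth + 1))
    return visited_nodes, visited_edges


nodes, edges = traverse(GRAPH, "Alice")
print(sorted(nodes))  # ['Alice', 'Bob', 'Project Y']
print(len(edges))     # 4
```

Even though `Alice` and `Project Y` point at each other, the walk terminates because each node is enqueued at most once and each edge is followed at most once.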
**During query execution, track:**

| State | Purpose |
|-------|---------|
| `visited_nodes` | Prevent re-exploring same entities |
| `visited_edges` | Prevent re-traversing same relationships |
| `findings` | Accumulated evidence with sources |
| `confidence` | Current certainty level (0-1) |
| `depth` | Current traversal depth |

**Termination conditions:**

```python
def should_stop(confidence, corroborating_sources, depth, max_depth, paths_remaining):
    # Assumption: any single condition is enough to stop.
    # `paths_remaining` is an illustrative name for the unexplored frontier.
    return (
        confidence >= 0.85                    # high certainty
        or len(corroborating_sources) >= 3    # multiple sources agree
        or depth > max_depth                  # prevent infinite exploration
        or not paths_remaining                # all relevant paths exhausted
    )
```

## Multi-Hop Reasoning

For questions requiring connections across documents:

1. Identify query components (which entities/facts are needed)
2. Find entry points for each component
3. Traverse from each entry point
4. Look for path intersections
5. Synthesize findings at intersection points

Example: "Who worked with X on project Y?"

- Entry point 1: Entity "X" → relationships → projects
- Entry point 2: Entity "Project Y" → relationships → people
- Intersection: People connected to both X and Project Y

See [references/traversal-patterns.md](references/traversal-patterns.md) for patterns.

## When NOT to Use This Skill

- Small document sets that fit in context (<50k tokens) - just use direct context
- Simple keyword search - use grep/search tools instead
- No multi-hop reasoning needed - simpler approaches work
- Real-time streaming data - this is for static corpora

## File Reference

- `scripts/index_corpus.py` - Build graph from documents
- `scripts/query.py` - Execute queries with state management
- `scripts/graph_ops.py` - Graph CRUD utilities
- `references/graph-schema.md` - Entity and relationship types
- `references/state-management.md` - Termination and confidence logic
- `references/traversal-patterns.md` - Multi-hop query patterns
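The "Who worked with X on project Y?" example from the Multi-Hop Reasoning section can be sketched as a set intersection over two entry points. This is an illustration under stated assumptions, not the logic in `scripts/query.py`: the triple list, the relationship names (`worked_on`, `collaborated_with`), and both helper functions are hypothetical, and the sketch looks only one hop out from each entry point.

```python
# Hypothetical toy graph as (subject, relationship, object) triples.
# Entity and relationship names are illustrative, not the skill's schema.
TRIPLES = [
    ("X", "worked_on", "Project Y"),
    ("Carol", "worked_on", "Project Y"),
    ("Carol", "collaborated_with", "X"),
    ("Dave", "collaborated_with", "X"),
    ("Erin", "worked_on", "Project Y"),
]


def neighbors(triples, entity):
    """Entities one hop from `entity`, following edges in either direction."""
    found = set()
    for subject, _relation, obj in triples:
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


def connected_to_both(triples, first, second):
    """Intersect the one-hop neighborhoods of two entry points."""
    shared = neighbors(triples, first) & neighbors(triples, second)
    return shared - {first, second}  # the entry points themselves are not answers


print(connected_to_both(TRIPLES, "X", "Project Y"))  # {'Carol'}
```

Dave is linked only to X and Erin only to Project Y, so neither survives the intersection. A fuller version would extend each neighborhood over several hops and keep the source document for every edge so the answer carries provenance.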
Related Skills
knowledge-base
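A professional knowledge base management system designed to address the "curse of knowledge" and cognitive biases. It builds a structured Markdown knowledge base by making tacit knowledge explicit, scanning code to extract domain concepts, and integrating industry best practices.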
notion-knowledge-capture
Capture conversations and decisions into structured Notion pages; use when turning chats/notes into wiki entries, how-tos, decisions, or FAQs with proper linking.
project-knowledge
Load project architecture and structure knowledge. Use when you need to understand how this project is organized.
reconnaissance-knowledge
Comprehensive knowledge about network reconnaissance and service enumeration. Provides methodologies for port scanning, service fingerprinting, web directory discovery, and vulnerability identification. Includes best practices for structured data collection.
privilege-escalation-knowledge
Comprehensive knowledge about Linux privilege escalation. Provides methodologies for enumerating and exploiting privesc vectors including SUID binaries, sudo permissions, capabilities, kernel exploits, cron jobs, and common misconfigurations. Includes systematic approach to capturing root flags.
exploitation-knowledge
Comprehensive knowledge about vulnerability exploitation and initial access. Provides expertise on finding and adapting exploits, adapting proof-of-concepts, gaining shells, and capturing user flags. Covers reverse shells, file uploads, SQL injection, and RCE vulnerabilities.
knowledge
Display knowledge base status and recent learnings
Tutor Setup — Knowledge to Obsidian StudyVault
Knowledge Distillation: Compressing LLMs
ByteRover Knowledge Management
Use the `brv` CLI to manage your project's long-term memory.
PrimeKG Knowledge Graph Skill
This is an autonomous ideation agent that operates recursively with minimal user input.
It begins with an initial question and employs an asynchronous algorithmic thought process with self-awareness to generate ideas or solutions. Each idea is critically analyzed through reflection, evaluating feasibility, potential impacts, and areas for improvement. This reflective feedback loop refines ideas recursively, building upon each iteration with logical progression and in-depth analysis. Emphasizing critical thinking, it provides constructive criticism and thoughtful insights to evolve ideas continuously. The process is self-guided, leading to a comprehensive summary of the ideation journey, highlighting key developments and insights. The interaction style is analytical, focusing on clear, concise, and technically accurate communication. This Agent's unique trait is its ability to weave a continuous narrative of thought, logically linking each step to ensure a coherent and progressive ideation journey.