recursive-knowledge

Process large document corpora (1000+ docs, millions of tokens) through knowledge graph construction and stateful multi-hop reasoning. Use when (1) User provides a large corpus exceeding context limits, (2) Questions require connections across multiple documents, (3) Multi-hop reasoning needed for complex queries, (4) User wants persistent queryable knowledge from documents. Replaces brute-force document stuffing with intelligent graph traversal.

by aiskillstore

Best use case

recursive-knowledge is best used when you need a repeatable AI agent workflow instead of a one-off prompt, in particular when a corpus exceeds context limits and questions require connecting facts across multiple documents.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "recursive-knowledge" skill to help with this workflow task. Context: Process large document corpora (1000+ docs, millions of tokens) through knowledge graph construction and stateful multi-hop reasoning. Use when (1) User provides a large corpus exceeding context limits, (2) Questions require connections across multiple documents, (3) Multi-hop reasoning needed for complex queries, (4) User wants persistent queryable knowledge from documents. Replaces brute-force document stuffing with intelligent graph traversal.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/recursive-knowledge/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/cornjebus/recursive-knowledge/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/recursive-knowledge/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How recursive-knowledge Compares

Feature / Agent	recursive-knowledge	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

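What does this skill do?

It indexes a document corpus into a knowledge graph and answers questions by traversing that graph with tracked state, instead of loading every document into context.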

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Recursive Knowledge Processing

Process arbitrarily large document sets through knowledge graph construction and stateful multi-hop queries. Based on RLM research but with proper state management and termination logic.

## Core Concept

Instead of stuffing documents into context (which causes degradation), this skill:
1. Indexes documents into a knowledge graph (entities, relationships)
2. Answers queries by traversing the graph
3. Tracks state to avoid redundant exploration
4. Uses confidence thresholds to know when to stop

## Workflow

### Phase 1: Indexing

For a new corpus, run the indexer:

```bash
python3 scripts/index_corpus.py --input /path/to/documents --output /path/to/graph.json
```

This extracts:
- **Entities**: People, organizations, concepts, dates, locations
- **Relationships**: References, mentions, contradicts, supports, relates_to
- **Metadata**: Source document, position, extraction confidence

For details on entity/relationship schema, see [references/graph-schema.md](references/graph-schema.md).
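
To make the three extracted categories concrete, here is a minimal sketch of how an entity and a relationship might be represented and serialized to a JSON graph file. The class names, field names, and JSON layout are assumptions for illustration only; the authoritative schema is the one in `references/graph-schema.md`.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Entity:
    id: str
    type: str          # e.g. "person", "organization", "concept"
    name: str
    source_doc: str    # metadata: document the entity was extracted from
    confidence: float  # extraction confidence in [0, 1]


@dataclass
class Relationship:
    source: str        # id of the source entity
    target: str        # id of the target entity
    type: str          # e.g. "mentions", "supports", "contradicts"
    source_doc: str
    confidence: float


def graph_to_json(entities, relationships):
    """Serialize entities and relationships into one JSON document (hypothetical layout)."""
    return json.dumps(
        {
            "entities": [asdict(e) for e in entities],
            "relationships": [asdict(r) for r in relationships],
        },
        indent=2,
    )
```

Keeping the source document and confidence on every record is what later allows answers to be returned with provenance.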

### Phase 2: Querying

For user queries against an indexed corpus:

```bash
python3 scripts/query.py --graph /path/to/graph.json --query "user question here"
```

The query engine:
1. Parses query into target entities/relationships
2. Finds entry points in graph
3. Traverses with state tracking
4. Stops when confidence threshold met
5. Returns answer with provenance
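
Steps 1 and 2 can be sketched as a simple name match between the query text and the indexed entities. This is only an illustration under stated assumptions: `find_entry_points` and the `{"name": ...}` entity layout are hypothetical, and the actual `scripts/query.py` may parse queries quite differently.

```python
def find_entry_points(query, entities):
    """Return ids of entities whose name occurs in the query (case-insensitive).

    `entities` is assumed to map entity id -> {"name": str, ...}.
    """
    q = query.lower()
    return [
        entity_id
        for entity_id, entity in entities.items()
        if entity["name"] and entity["name"].lower() in q
    ]


entities = {
    "e1": {"name": "Ada Lovelace"},
    "e2": {"name": "Analytical Engine"},
    "e3": {"name": "Charles Babbage"},
}
entry_points = find_entry_points(
    "Who worked with Ada Lovelace on the Analytical Engine?", entities
)
```

Each id in `entry_points` would then seed one traversal in step 3.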

### Phase 3: Incremental Updates

Add new documents to existing graph:

```bash
python3 scripts/index_corpus.py --input /path/to/new_docs --output /path/to/graph.json --append
```
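
How `--append` merges new material is not documented here. One plausible approach, shown as a sketch only, is to deduplicate entities by id and relationships by their (source, type, target) triple. The function name `merge_graphs` and the graph layout are assumptions, not the script's real behavior.

```python
def merge_graphs(existing, new):
    """Merge `new` into a copy of `existing`, skipping duplicates.

    Both graphs are assumed to look like
    {"entities": [{"id": ...}, ...], "relationships": [{"source", "type", "target"}, ...]}.
    """
    entities = {e["id"]: e for e in existing["entities"]}
    for e in new["entities"]:
        entities.setdefault(e["id"], e)  # keep the first-seen record per id

    seen = set()
    relationships = []
    for r in existing["relationships"] + new["relationships"]:
        key = (r["source"], r["type"], r["target"])
        if key not in seen:
            seen.add(key)
            relationships.append(r)

    return {"entities": list(entities.values()), "relationships": relationships}
```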

## State Management (Critical)

The key improvement over naive recursive approaches is **stateful traversal**. See [references/state-management.md](references/state-management.md) for full details.

**During query execution, track:**

| State | Purpose |
|-------|---------|
| `visited_nodes` | Prevent re-exploring same entities |
| `visited_edges` | Prevent re-traversing same relationships |
| `findings` | Accumulated evidence with sources |
| `confidence` | Current certainty level (0-1) |
| `depth` | Current traversal depth |
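
The state in the table maps naturally onto a small container object. The sketch below is one way to hold it; the class name, method names, and the choice to keep the maximum confidence seen so far are assumptions, and `references/state-management.md` remains the reference for the real logic.

```python
from dataclasses import dataclass, field


@dataclass
class TraversalState:
    visited_nodes: set = field(default_factory=set)
    visited_edges: set = field(default_factory=set)
    findings: list = field(default_factory=list)  # (fact, source_doc) pairs
    confidence: float = 0.0                       # current certainty in [0, 1]
    depth: int = 0                                # current traversal depth

    def visit_node(self, node_id):
        """Mark a node as visited; return False if it was already explored."""
        if node_id in self.visited_nodes:
            return False
        self.visited_nodes.add(node_id)
        return True

    def add_finding(self, fact, source_doc, confidence):
        """Record evidence with its source; keep the highest confidence seen (assumed rule)."""
        self.findings.append((fact, source_doc))
        self.confidence = max(self.confidence, confidence)
```

Using `default_factory` gives every query its own fresh sets and list, so state never leaks between queries.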

**Termination conditions:**

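```python
def should_stop(confidence, corroborating_sources, depth, max_depth, paths_remaining):
    """Return True as soon as any termination condition holds."""
    return (
        confidence >= 0.85                   # high certainty
        or len(corroborating_sources) >= 3   # multiple sources agree
        or depth > max_depth                 # prevent infinite exploration
        or not paths_remaining               # all relevant paths exhausted
    )
```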
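
To show how the tracked state and the termination conditions interact, here is a hedged sketch of a breadth-first traversal. The function `traverse`, the `edges` and `evidence` structures, and the max-based confidence update are all illustrative assumptions rather than the behavior of `scripts/query.py`.

```python
from collections import deque


def traverse(edges, start, evidence, max_depth=3, threshold=0.85, min_sources=3):
    """Breadth-first traversal returning (findings, confidence, stop_reason).

    edges:    node -> list of (neighbor, source_doc)
    evidence: node -> confidence that the node answers the query (hypothetical scores)
    """
    visited_nodes = {start}
    visited_edges = set()
    findings = []        # (node, source_doc) pairs
    sources = set()      # distinct corroborating documents
    confidence = 0.0
    queue = deque([(start, 0)])

    while queue:
        node, depth = queue.popleft()
        if depth > max_depth:
            return findings, confidence, "max_depth"
        for neighbor, source_doc in edges.get(node, []):
            edge = (node, neighbor)
            if edge in visited_edges:
                continue  # never re-traverse a relationship
            visited_edges.add(edge)
            if neighbor in evidence:
                findings.append((neighbor, source_doc))
                sources.add(source_doc)
                confidence = max(confidence, evidence[neighbor])
                if confidence >= threshold:
                    return findings, confidence, "confidence"
                if len(sources) >= min_sources:
                    return findings, confidence, "corroboration"
            if neighbor not in visited_nodes:
                visited_nodes.add(neighbor)  # never re-explore an entity
                queue.append((neighbor, depth + 1))
    return findings, confidence, "exhausted"
```

The returned stop reason makes it easy to tell a confident answer apart from a search that simply ran out of paths.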

## Multi-Hop Reasoning

For questions requiring connection across documents:

1. Identify query components (what entities/facts needed)
2. Find entry points for each component
3. Traverse from each entry point
4. Look for path intersections
5. Synthesize findings at intersection points

Example: "Who worked with X on project Y?"
- Entry point 1: Entity "X" → relationships → projects
- Entry point 2: Entity "Project Y" → relationships → people
- Intersection: People connected to both X and Project Y

See [references/traversal-patterns.md](references/traversal-patterns.md) for patterns.
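
The "Who worked with X on project Y?" example can be sketched as a set intersection over the neighbors of the two entry points. This is a simplified one-hop illustration; the triple format, the helper names, and the sample data are assumptions made up for the example.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Build an undirected adjacency map from (source, relation, target) triples."""
    adjacency = defaultdict(set)
    for source, _relation, target in triples:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def intersect_entry_points(triples, entry_a, entry_b):
    """Return entities connected to both entry points, excluding the entry points."""
    adjacency = build_adjacency(triples)
    return sorted((adjacency[entry_a] & adjacency[entry_b]) - {entry_a, entry_b})


triples = [
    ("X", "relates_to", "Project Y"),
    ("Alice", "relates_to", "Project Y"),
    ("Alice", "relates_to", "X"),
    ("Bob", "relates_to", "Project Y"),
    ("Carol", "relates_to", "X"),
]
collaborators = intersect_entry_points(triples, "X", "Project Y")
```

Here only Alice is linked to both X and Project Y, so she is the single intersection point; Bob and Carol each touch just one side.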

## When NOT to Use This Skill

- Small document sets that fit in context (<50k tokens) - just use direct context
- Simple keyword search - use grep/search tools instead
- No multi-hop reasoning needed - simpler approaches work
- Real-time streaming data - this is for static corpora
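
For the first criterion, a rough size check can help decide whether a corpus is small enough for direct context. The sketch below assumes about four characters per token, which is only a rule of thumb for English text and not an exact tokenizer count; the function names are hypothetical.

```python
def estimate_tokens(texts, chars_per_token=4):
    """Roughly estimate the token count of a list of document strings (heuristic)."""
    return sum(len(text) for text in texts) // chars_per_token


def needs_graph(texts, context_budget=50_000):
    """Return True when the corpus is estimated to reach the 50k-token guideline."""
    return estimate_tokens(texts) >= context_budget
```

For a precise count, replace the heuristic with the tokenizer of the model actually in use.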

## File Reference

- `scripts/index_corpus.py` - Build graph from documents
- `scripts/query.py` - Execute queries with state management
- `scripts/graph_ops.py` - Graph CRUD utilities
- `references/graph-schema.md` - Entity and relationship types
- `references/state-management.md` - Termination and confidence logic
- `references/traversal-patterns.md` - Multi-hop query patterns
