knowledge-extractor
Extract tribal knowledge from code, documentation, and commit history to preserve institutional memory
Best use case
knowledge-extractor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using knowledge-extractor should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/knowledge-extractor/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How knowledge-extractor Compares
| Feature / Agent | knowledge-extractor | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It extracts tribal knowledge from code, documentation, and commit history to preserve institutional memory.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Knowledge Extractor Skill
Extracts tribal knowledge from code comments, commit messages, documentation, and other sources to preserve institutional memory during migration.
## Purpose
Enable knowledge preservation for:
- Comment analysis and extraction
- Commit message mining
- Documentation parsing
- Pattern recognition
- Business rule discovery
## Capabilities
### 1. Comment Analysis
- Extract TODO/FIXME comments
- Parse documentation comments
- Identify explanatory notes
- Find warning comments
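As a rough illustration of the comment analysis above, the following sketch scans source text for marker comments and emits records shaped like the `comments` entries in the output schema. The function name `extract_comments`, the marker-to-type mapping, and the choice to store the whole line as `context` are assumptions made for this example, not part of the skill itself. It only recognizes `#` and `//` comments and does not try to skip string literals.

```python
import re
from typing import Dict, List

# Hypothetical mapping from comment markers to the schema's "type" values.
MARKER_TYPES = {"TODO": "todo", "FIXME": "fixme", "NOTE": "note", "WARNING": "warning"}

# Matches '#' or '//' comments that begin with one of the markers above.
COMMENT_RE = re.compile(r"(?:#|//)\s*(TODO|FIXME|NOTE|WARNING)\b[:\s]*(.*)")


def extract_comments(source: str, file: str) -> List[Dict[str, object]]:
    """Return one record per marker comment found in `source`."""
    records = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        match = COMMENT_RE.search(text)
        if match:
            records.append({
                "type": MARKER_TYPES[match.group(1)],
                "file": file,
                "line": line_no,
                "content": match.group(2).strip(),
                # Assumption: the full source line serves as context.
                "context": text.strip(),
            })
    return records
```

Calling `extract_comments(open(path).read(), path)` on each file of interest would produce a flat list that can be merged across files.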
### 2. Commit Message Mining
- Extract rationale from commits
- Identify bug fix context
- Find feature explanations
- Track decision history
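One possible way to mine commit messages is to ask `git log` for machine-readable output and parse it into records shaped like the schema's `commits` entries. The sketch below assumes a local clone and the `git` CLI rather than the GitHub API listed under Tool Integrations. The helper names `read_git_log` and `parse_git_log` are hypothetical, as is the decision to use the commit body as `context` and to leave `relatedFiles` empty.

```python
import subprocess
from typing import Dict, List

FIELD_SEP = "\x1f"   # ASCII unit separator between fields
RECORD_SEP = "\x1e"  # ASCII record separator between commits
# hash, author name, subject, body, each separated as above
GIT_FORMAT = "%H%x1f%an%x1f%s%x1f%b%x1e"


def read_git_log(repo_path: str, max_count: int = 100) -> str:
    """Run `git log` in repo_path and return its raw output."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log",
         f"--max-count={max_count}", f"--pretty=format:{GIT_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_git_log(raw: str) -> List[Dict[str, object]]:
    """Split raw `git log` output into commit records."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # Strip only newlines: Python treats \x1f as whitespace, so a bare
        # strip() would eat the separator in front of an empty body.
        record = record.strip("\n")
        if not record:
            continue
        parts = record.split(FIELD_SEP)
        if len(parts) != 4:
            continue  # skip malformed records
        commit_hash, author, subject, body = parts
        commits.append({
            "hash": commit_hash,
            "message": subject,
            "author": author,
            "context": body.strip(),  # assumption: body carries the rationale
            "relatedFiles": [],       # not populated in this sketch
        })
    return commits
```

A typical call would be `parse_git_log(read_git_log("."))`. Filtering the result for subjects that mention fixes or decisions is left to the caller.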
### 3. Documentation Parsing
- Parse markdown documentation
- Extract from wikis
- Process README files
- Catalog API docs
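For Markdown sources, documentation parsing could start by splitting a file into heading-delimited sections. The sketch below is a minimal example under that assumption. The output schema leaves the shape of `documentation` entries unspecified, so the `level`, `heading`, and `body` keys and the function name `split_markdown_sections` are hypothetical. Only ATX-style (`#`) headings are handled, and lines inside fenced code blocks are never treated as headings.

```python
import re
from typing import Dict, List

# An ATX heading: one to six '#' characters, whitespace, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


def split_markdown_sections(text: str) -> List[Dict[str, object]]:
    """Split Markdown text into sections keyed by their headings."""
    sections = []
    current = None
    in_fence = False
    for line in text.splitlines():
        # Toggle fence state so '#' lines inside code blocks are ignored.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            current = {"level": len(match.group(1)),
                       "heading": match.group(2),
                       "body": []}
            sections.append(current)
        elif current is not None:
            current["body"].append(line)
    # Join the collected body lines into a single string per section.
    for section in sections:
        section["body"] = "\n".join(section["body"]).strip()
    return sections
```

Text that appears before the first heading is dropped in this version. A caller that needs it could seed `current` with an untitled section.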
### 4. Pattern Recognition
- Identify coding patterns
- Recognize idioms
- Detect conventions
- Map architectural patterns
### 5. Business Rule Extraction
- Find business logic comments
- Extract validation rules
- Identify calculation explanations
- Document edge cases
### 6. Glossary Generation
- Build domain vocabulary
- Define abbreviations
- Map term usage
- Create terminology guide
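To illustrate the "define abbreviations" step, the sketch below looks for the common "Long Form (ABBR)" pattern and keeps a pair only when the initials of the preceding words spell the abbreviation. This heuristic and the function name `find_abbreviations` are assumptions for the example. The skill does not state how abbreviations are detected, and a real glossary would likely need more than this.

```python
import re
from typing import Dict

# A parenthesised all-caps token such as "(SLA)".
ABBR_RE = re.compile(r"\(([A-Z]{2,6})\)")


def find_abbreviations(text: str) -> Dict[str, str]:
    """Map abbreviations to the long form that precedes them."""
    glossary: Dict[str, str] = {}
    for match in ABBR_RE.finditer(text):
        abbr = match.group(1)
        # Take as many preceding words as the abbreviation has letters.
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(abbr):]
        initials = "".join(word[0].upper() for word in candidate)
        if initials == abbr:
            # Keep the first definition seen for each abbreviation.
            glossary.setdefault(abbr, " ".join(candidate))
    return glossary
```

The returned dictionary has the same shape as the schema's `glossary` object, so results from several documents can be merged with `dict.update`.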
## Tool Integrations
| Tool | Purpose | Integration Method |
|------|---------|-------------------|
| Sourcegraph | Code search | API |
| GitHub API | Commit history | API |
| grep/ripgrep | Pattern search | CLI |
| Custom NLP | Text analysis | Library |
| Confluence API | Wiki extraction | API |
## Output Schema
```json
{
  "extractionId": "string",
  "timestamp": "ISO8601",
  "knowledge": {
    "comments": [
      {
        "type": "todo|fixme|note|warning|explanation",
        "file": "string",
        "line": "number",
        "content": "string",
        "context": "string"
      }
    ],
    "commits": [
      {
        "hash": "string",
        "message": "string",
        "author": "string",
        "context": "string",
        "relatedFiles": []
      }
    ],
    "documentation": [],
    "businessRules": [],
    "glossary": {}
  }
}
```
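A result document matching this schema could be assembled along the following lines. This is a sketch only: the schema says `extractionId` is a string and `timestamp` is ISO 8601, so the use of a random UUID and the current UTC time here is an assumption, as is the helper name `build_extraction`.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def build_extraction(
    comments: List[Dict[str, Any]],
    commits: List[Dict[str, Any]],
    documentation: Optional[List[Any]] = None,
    business_rules: Optional[List[Any]] = None,
    glossary: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Wrap extracted items in the top-level output structure."""
    return {
        # Assumption: any unique string is acceptable as an ID.
        "extractionId": str(uuid.uuid4()),
        # Current UTC time in ISO 8601 form.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "knowledge": {
            "comments": comments,
            "commits": commits,
            "documentation": documentation or [],
            "businessRules": business_rules or [],
            "glossary": glossary or {},
        },
    }


if __name__ == "__main__":
    example = build_extraction(
        comments=[{"type": "todo", "file": "a.py", "line": 1,
                   "content": "remove magic number", "context": ""}],
        commits=[],
    )
    print(json.dumps(example, indent=2))
```

Because every value is a plain dictionary, list, or string, the result serializes with `json.dumps` without a custom encoder.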
## Integration with Migration Processes
- **legacy-codebase-assessment**: Knowledge discovery
- **documentation-migration**: Source material
## Related Skills
- `legacy-code-interpreter`: Code understanding
- `documentation-generator`: Doc creation
## Related Agents
- `legacy-system-archaeologist`: Uses for excavation
- `documentation-migration-agent`: Uses for doc creation
Related Skills
mock-spec-extractor
Extracts design specifications from mock images including colors, typography, spacing, and component details
contract-extractor
Extracts key terms from contracts, identifies risks, flags unusual provisions
knowledge-analytics
Knowledge base analytics, usage reporting, and effectiveness measurement
domain-model-extractor
Extract domain models from monolithic codebases using DDD principles for microservices decomposition
knowledge-curation
Context priming before work (bd prime) and self-reflection after completion to extract patterns, gotchas, and decisions into the knowledge base.
knowledge-graph-management
Capture, validate, query, and sync architectural patterns and design decisions in the knowledge graph
cog-knowledge-consolidation
Build structured knowledge frameworks from scattered vault notes with source attribution
process-builder
Scaffold new babysitter process definitions following SDK patterns, proper structure, and best practices. Guides the 3-phase workflow from research to implementation.
babysitter
Orchestrate via @babysitter. Use this skill when asked to babysit a run, orchestrate a process or whenever it is called explicitly. (babysit, babysitter, orchestrate, orchestrate a run, workflow, etc.)
yolo
Run Babysitter autonomously with minimal manual interruption.
user-install
Install the user-level Babysitter Codex setup.
team-install
Install the team-pinned Babysitter Codex workspace setup.