analyzing-codebases
Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic representations including multi-level summaries, entry point identification, and hash-based change tracking. Provides 74-97% token reduction compared to reading raw source files. Useful for understanding code architecture, debugging complex systems, reviewing pull requests, and onboarding to unfamiliar projects.
Best use case
analyzing-codebases is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic representations including multi-level summaries, entry point identification, and hash-based change tracking. Provides 74-97% token reduction compared to reading raw source files. Useful for understanding code architecture, debugging complex systems, reviewing pull requests, and onboarding to unfamiliar projects.
Teams using analyzing-codebases should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/analyzing-codebases/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How analyzing-codebases Compares
| Feature / Agent | analyzing-codebases | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic representations including multi-level summaries, entry point identification, and hash-based change tracking. Provides 74-97% token reduction compared to reading raw source files. Useful for understanding code architecture, debugging complex systems, reviewing pull requests, and onboarding to unfamiliar projects.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
SKILL.md Source
# LLM Context Tools
Generate compact, semantically-rich code context for LLM consumption with 99%+ faster incremental updates.
## ⚠️ CRITICAL BEHAVIOR CHANGE REQUIRED
**BEFORE using Grep, Bash, or Read for code exploration:**
1. **ALWAYS check** if `.llm-context/` directory exists
2. **IF IT EXISTS**, use queries FIRST:
```bash
llm-context query find-function <name>
llm-context query calls-to <name>
llm-context side-effects
```
3. **ONLY use grep/read** if queries don't provide needed info
**This is NOT optional** - queries provide richer context (call graphs, side effects, patterns) that grep cannot match.
## What This Skill Does
Transforms raw source code into LLM-optimized context:
- **Function call graphs** with side effect detection
- **Multi-level summaries** (System → Domain → Module)
- **Incremental updates** (only re-analyze changed files)
- **Hash-based change tracking**
- **Query interface** for instant lookups
**Token Efficiency**: 74-97% reduction vs reading raw files
## When To Use
✅ User asks to "analyze this codebase"
✅ User wants to understand code architecture
✅ User needs help debugging or refactoring
✅ You need efficient context about a large codebase
✅ User wants LLM-friendly code documentation
## Quick Start
```bash
# Check if tool is available
llm-context version
# If not available, user needs to install
# See setup.md for installation
# Run analysis
llm-context analyze
# Query results
llm-context stats
```
## How It Works
### Progressive Disclosure Strategy
Read in this order for maximum token efficiency:
1. **L0** (200 tokens) → `.llm-context/summaries/L0-system.md`
- Architecture overview, entry points, statistics
2. **L1** (50-100 tokens/domain) → `.llm-context/summaries/L1-domains.json`
- Domain boundaries, module lists
3. **L2** (20-50 tokens/module) → `.llm-context/summaries/L2-modules.json`
- File-level exports, entry points
4. **Graph** (variable) → `.llm-context/graph.jsonl`
- Function details, call relationships, side effects
5. **Source** (as needed) → Read targeted files only
**Never read raw source files first!** Use summaries and graph for context.
## Common Commands
```bash
# Analysis
llm-context analyze # Auto-detect full/incremental
llm-context check-changes # Preview what changed
# Queries
llm-context stats # Show statistics
llm-context entry-points # Find entry points
llm-context side-effects # Functions with side effects
llm-context query calls-to func # Who calls this?
llm-context query trace func # Call tree
```
## Usage Patterns
### Pattern 1: First-Time Codebase Understanding
```bash
# 1. Run analysis
llm-context analyze
# 2. Read L0 for overview
cat .llm-context/summaries/L0-system.md
# 3. Get statistics
llm-context stats
# 4. Find entry points
llm-context entry-points
```
**Response template**:
```
I've analyzed the codebase:
[L0 content - architecture, components, entry points]
Statistics: X functions, Y files, Z calls
Would you like me to:
1. Explain a specific domain?
2. Trace a function's call path?
3. Review the architecture?
```
### Pattern 2: After Code Changes
```bash
# Incremental analysis (only changed files)
llm-context analyze
# See what changed
cat .llm-context/manifest.json
```
**Response**: Highlight new/modified functions and their impact
### Pattern 3: Debugging
```bash
# Find function
llm-context query find-function buggyFunc
# Trace calls
llm-context query trace buggyFunc
# Check side effects
llm-context side-effects | grep buggy
```
**Response**: Explain call path and identify potential issues based on side effects
## Detailed Guides
**Setup & Installation**: See [setup.md](setup.md)
**Usage Examples**: See [examples.md](examples.md)
**Command Reference**: See [reference.md](reference.md)
**Common Workflows**: See [workflows.md](workflows.md)
## Side Effect Types
When analyzing functions, these effects are detected:
- `file_io` - Reads/writes files
- `network` - HTTP, fetch, API calls
- `database` - DB queries, ORM
- `logging` - Console, logger
- `dom` - Browser DOM manipulation
## Graph Format
Each function in `graph.jsonl`:
```json
{
"id": "functionName",
"file": "path/file.js",
"line": 42,
"calls": ["foo", "bar"],
"effects": ["database", "network"]
}
```
## Best Practices
### ✅ REQUIRED WORKFLOW
**When .llm-context/ exists, you MUST:**
1. **Check for analysis data FIRST**:
```bash
ls .llm-context/ # Verify data exists
```
2. **Read progressive disclosure hierarchy**:
- L0 summary: `.llm-context/summaries/L0-system.md`
- L1 domains: `.llm-context/summaries/L1-domains.json`
- L2 modules: `.llm-context/summaries/L2-modules.json`
3. **Use queries for exploration** (NOT grep/bash):
```bash
# Finding functions
llm-context query find-function <name>
# Understanding dependencies
llm-context query calls-to <name>
llm-context query trace <name>
# Side effect analysis
llm-context side-effects | grep <keyword>
```
4. **Only read source** after queries don't provide needed info
5. **Run incremental analysis** after code changes:
```bash
llm-context analyze # 99% faster than full re-analysis
```
### ❌ ANTI-PATTERNS (DO NOT DO THESE)
**These waste tokens and miss critical context:**
- ❌ Using `grep -r "pattern"` when `.llm-context/` exists → Use queries instead
- ❌ Using `Bash` to explore code → Use queries instead
- ❌ Reading raw source files first → Read summaries first
- ❌ Re-reading entire codebase on changes → Use incremental analysis
- ❌ Skipping L0/L1/L2 summaries → Miss architectural overview
**Remember**: Grep shows text matches. Queries show semantic relationships, call graphs, and side effects.
## Token Efficiency
**Traditional approach**:
- Read 10 files = 10,000 tokens
- Missing: call graphs, side effects
**LLM-context approach**:
- L0 + L1 + Graph = 500-2,000 tokens
- Includes: complete context + relationships
**Savings**: 80-95%
## Performance
### Initial Analysis
- 100 files: 2-5s
- 1,000 files: 30s-2min
- 10,000 files: 5-15min
### Incremental Updates
- 1 file: 30-50ms
- 10 files: 200-500ms
- 50 files: 1-2s
**Key**: Incremental is 99%+ faster at scale
## Troubleshooting
**"No manifest found"**
→ Run `llm-context analyze` first
**"Cannot find module"**
→ User needs to install: See [setup.md](setup.md)
**"Graph is empty"**
→ No JavaScript files found. Check directory.
## Success Criteria
This skill is working when:
1. ✅ Analysis completes without errors
2. ✅ `.llm-context/` exists with all files
3. ✅ `llm-context stats` shows functions
4. ✅ You use summaries before reading source
5. ✅ Token usage is 50-95% less than raw reading
## Summary
Transform from:
- ❌ Reading thousands of lines token-by-token
- ❌ Missing global context
- ❌ Slow re-analysis
To:
- ✅ Compact semantic representations
- ✅ Call graphs + side effects
- ✅ 99%+ faster incremental updates
- ✅ 50-95% token savingsRelated Skills
analyzing-user-feedback
Help users synthesize and act on customer feedback. Use when someone is analyzing NPS responses, processing support tickets, reviewing user research, synthesizing feedback from multiple channels, or trying to identify patterns in customer input.
analyzing-unknown-codebases
Analyze unfamiliar codebases systematically to produce subsystem catalog entries - emphasizes strict contract compliance and confidence marking
analyzing-text-patterns
Extract and analyze recurring patterns from log messages, span names, and event names using punctuation-based template discovery. Use when you need to understand log diversity, identify common message structures, detect unusual formats, or prepare for log parser development. Works by removing variable content and preserving structural markers.
analyzing-taint-flow
Tracks untrusted input propagation from sources to sinks in binary code to identify injection vulnerabilities. Use when analyzing data flow, tracing user input to dangerous functions, or detecting command/SQL injection.
Analyzing Spreadsheets
Analyzes Excel spreadsheets, summarizes trends, and recommends charts when users mention spreadsheets, Excel workbooks, or .xlsx files.
analyzing-research-papers
Expert methodology for analyzing and summarizing research papers, extracting key contributions, methodological details, and contextualizing findings. Use when reading papers from PDFs, DOIs, or URLs to create structured summaries for researchers.
analyzing-projects
Analyzes codebases to understand structure, tech stack, patterns, and conventions. Use when onboarding to a new project, exploring unfamiliar code, or when asked "how does this work?" or "what's the architecture?"
analyzing-patterns
Automatically activated when user asks to "find patterns in...", "identify repeated code...", "analyze the architecture...", "what design patterns are used...", or needs to understand code organization, recurring structures, or architectural decisions
analyzing-logs
Analyze application logs for performance insights and issue detection including slow requests, error patterns, and resource usage. Use when troubleshooting performance issues or debugging errors. Trigger with phrases like "analyze logs", "find slow requests", or "detect error patterns".
analyzing-implementations
Documents HOW code works with surgical precision - traces data flow, explains implementation details, provides file:line references. Purely documentarian, no critiques or suggestions for improvement.
analyzing-funding-landscape
Analyzes venture capital, investment trends, funding rounds, investor strategies, M&A activity, and funding patterns in specific markets or industries. Use when the user requests funding analysis, VC landscape research, investment trend analysis, or wants to understand investor activity and funding dynamics.
analyzing-frontend-layer
Use when analyzing frontend/UI layer including components, state management, routing, and API integration (optional - skip if no frontend exists)