token-audit
Audit Claude Code configuration to measure fixed-context token overhead and produce a prioritized action plan
Best use case
token-audit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using token-audit should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/token-audit/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How token-audit Compares
| Feature / Agent | token-audit | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Audit Claude Code configuration to measure fixed-context token overhead and produce a prioritized action plan
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# /token-audit — Context Token Audit
**Purpose**: Measure how many tokens your Claude Code configuration consumes before any user task begins. Identify the biggest sources of overhead. Produce a concrete action plan with savings estimates.
**When to use**:
- You're hitting rate limits before end of day
- Sessions feel slow or context compresses early
- You've added a lot of rules files and want to know the real cost
- After a major config change
---
## What You Will Measure
| Component | Loaded when | Typical range |
|-----------|-------------|---------------|
| `~/.claude/CLAUDE.md` + @imports | Always | 5-15K tokens |
| Project `CLAUDE.md` | Always | 2-8K tokens |
| `.claude/rules/*.md` | Always (all files) | 5-40K tokens |
| `MEMORY.md` | Always | 1-3K tokens |
| Claude Code system prompt | Always | ~7,500 tokens |
| Hook stdout | Per tool call | variable |
| Commands, agents, skills | On invocation only | 0 by default |
Key insight: `.claude/rules/` loads every `.md` file at session start, regardless of relevance. Commands and agents are lazy-loaded — they cost zero until invoked. Rules files are the most common source of unexpected overhead.
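A minimal Python sketch of that point, assuming the same 4-characters-per-token heuristic used in the steps below. It sizes every `.md` file under the rules directory, since all of them load at session start. The function names `estimate_tokens` and `rules_overhead` are illustrative and not part of Claude Code.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary with content


def estimate_tokens(size_bytes: int) -> int:
    """Convert a byte count (what wc -c reports) into approximate tokens."""
    return size_bytes // CHARS_PER_TOKEN


def rules_overhead(rules_dir: str = ".claude/rules") -> dict[str, int]:
    """Return {relative path: estimated tokens} for each .md file under rules_dir."""
    root = Path(rules_dir)
    if not root.is_dir():
        return {}
    return {
        p.relative_to(root).as_posix(): estimate_tokens(p.stat().st_size)
        for p in sorted(root.rglob("*.md"))
        if p.is_file()
    }


if __name__ == "__main__":
    costs = rules_overhead()
    # Largest files first: these are the first candidates for review
    for name, tokens in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{tokens:>8}  {name}")
    print(f"{sum(costs.values()):>8}  TOTAL (auto-loaded every session)")
```

Run it from the project root. A missing rules directory yields an empty result rather than an error.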
---
## Step 1 — Run the Measurement
Execute these commands from the project root:
```bash
# Component sizes
echo "=== PROJECT CLAUDE.md ===" && wc -c CLAUDE.md 2>/dev/null || echo "none"
echo ""
echo "=== RULES FILES (sorted by size) ===" && find .claude/rules -name "*.md" 2>/dev/null \
| xargs wc -c 2>/dev/null | sort -rn | head -20
echo ""
echo "=== GLOBAL ~/.claude ===" && ls -la ~/.claude/*.md 2>/dev/null \
| awk '{print $5, $9}' | sort -rn
```
Then calculate the full budget:
```bash
# Character counts per component; ~/.claude/*.md already includes CLAUDE.md
GLOBAL=$(cat ~/.claude/*.md 2>/dev/null | wc -c)
PROJECT=$(cat CLAUDE.md 2>/dev/null | wc -c)
RULES=$(find .claude/rules -name "*.md" -exec cat {} + 2>/dev/null | wc -c)
# Heuristic: first MEMORY.md whose text mentions this directory's name
MEMORY=$(find ~/.claude/projects -name "MEMORY.md" 2>/dev/null \
  | xargs grep -l "$(basename "$(pwd)")" 2>/dev/null | head -1 \
  | xargs wc -c 2>/dev/null | awk '{print $1}')
MEMORY=${MEMORY:-0}
# 30000 chars stands in for the ~7,500-token system prompt
TOTAL=$(( GLOBAL + PROJECT + RULES + MEMORY + 30000 ))
echo "Global ~/.claude : ~$(( GLOBAL / 4 )) tokens ($(( GLOBAL / 1000 ))K chars)"
echo "Project CLAUDE.md : ~$(( PROJECT / 4 )) tokens"
echo "Rules (auto-loaded): ~$(( RULES / 4 )) tokens"
echo "MEMORY.md : ~$(( MEMORY / 4 )) tokens"
echo "System prompt : ~7,500 tokens"
echo "---"
echo "TOTAL fixed context: ~$(( TOTAL / 4 )) tokens"
echo "% of 200K window : $(( TOTAL / 4 * 100 / 200000 ))%"
```
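The budget arithmetic can also be checked by hand. The sketch below repeats it in Python with hypothetical character counts (not measurements from a real project), assuming the same 4-chars-per-token ratio and the 200K window used in the script. `fixed_context` is an illustrative name.

```python
WINDOW_TOKENS = 200_000       # context window size assumed by the script above
SYSTEM_PROMPT_CHARS = 30_000  # ~7,500 tokens at 4 chars per token


def fixed_context(chars: dict[str, int], window: int = WINDOW_TOKENS):
    """Return (tokens per component, total tokens, percent of window used)."""
    tokens = {name: count // 4 for name, count in chars.items()}
    tokens["system_prompt"] = SYSTEM_PROMPT_CHARS // 4
    total = sum(tokens.values())
    return tokens, total, total * 100 // window


# Hypothetical sizes in characters
tokens, total, percent = fixed_context(
    {"global": 20_000, "project": 12_000, "rules": 80_000, "memory": 6_000}
)
print(tokens)
print(f"TOTAL fixed context: ~{total} tokens ({percent}% of window)")
```

With these inputs the rules directory accounts for 20,000 of 37,000 tokens, which is why Step 2 focuses on rules files.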
---
## Step 2 — Classify Rules Files
For each file in `.claude/rules/`, classify it:
| Class | Definition | Action |
|-------|------------|--------|
| **ALWAYS** | Applies to most tasks (conventions, output format, safety) | Keep auto-loaded |
| **SOMETIMES** | Relevant in 20-40% of sessions | Keep if small (<3K chars); lazy-load if large |
| **RARELY** | Relevant in <10% of sessions (Figma, Windows, design system) | Remove from auto-load |
| **NEVER** | Outdated or covered elsewhere | Delete or archive |
Run this classification prompt:
```
Read every file in .claude/rules/. For each file, output a table row:
| File | Size (chars) | Class (ALWAYS/SOMETIMES/RARELY/NEVER) | Reasoning (one sentence) |
Sort by size descending within each class.
At the end, calculate: total chars that would leave the fixed context if all
RARELY and NEVER files were excluded. Convert to tokens (÷ 4).
```
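The classification table can be expressed as a small decision function. This is a sketch under stated assumptions: the table leaves the 10-20% band and the cutoff for "most tasks" unspecified, so here anything from 10% to 40% is treated as SOMETIMES and anything above 40% as ALWAYS. The `relevance` fraction and the `outdated` flag are inputs you would supply from your own judgment.

```python
def classify(relevance: float, size_chars: int, outdated: bool = False) -> tuple[str, str]:
    """Map a rules file to (class, action) following the table above."""
    if outdated:
        return "NEVER", "delete or archive"
    if relevance < 0.10:
        return "RARELY", "remove from auto-load"
    if relevance <= 0.40:  # assumption: the 10-20% gap is folded into SOMETIMES
        action = "keep auto-loaded" if size_chars < 3_000 else "lazy-load"
        return "SOMETIMES", action
    return "ALWAYS", "keep auto-loaded"  # assumption: above 40% counts as "most tasks"


def excludable_tokens(files: list[tuple[str, int, str]]) -> int:
    """Tokens that leave fixed context if RARELY and NEVER files are excluded.

    Each entry is (file name, size in chars, class).
    """
    chars = sum(size for _, size, cls in files if cls in {"RARELY", "NEVER"})
    return chars // 4


print(classify(0.30, 9_000))  # a large file needed in roughly a third of sessions
```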
---
## Step 3 — Audit Hook Overhead
Hooks on `PreToolUse` and `PostToolUse` fire on every tool call. Each invocation injects its stdout into context. A hook that prints 500 chars on each of 150 tool calls in a session adds 75K chars, or roughly 19K extra tokens.
Check what you have:
```bash
# List hooks by event type
python3 - << 'EOF'
import json, os

for path in [os.path.expanduser("~/.claude/settings.json"), ".claude/settings.json"]:
    if not os.path.exists(path):
        continue
    print(f"\n--- {path} ---")
    with open(path) as f:
        data = json.load(f)
    for event, groups in data.get("hooks", {}).items():
        for group in groups:
            matcher = group.get("matcher") or "*"
            # Commands usually sit in a nested "hooks" list; fall back to a flat entry
            for h in group.get("hooks", [group]):
                cmd = h.get("command", "?")
                print(f"  [{event}] matcher={matcher} → {cmd[:80]}")
EOF
```
For each `PreToolUse` or `PostToolUse` hook, estimate its stdout size by running it manually. Multiply by your average tool call count per session (visible in `/cost` after a session).
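The manual measurement can be scripted. A minimal sketch, assuming the hook is a shell command that tolerates an empty JSON object on stdin (real hooks receive a JSON payload describing the tool call, so the output size may differ). `hook_stdout_chars` and `session_tokens` are illustrative names. Running a hook this way also triggers its side effects, so try it on harmless hooks first.

```python
import subprocess


def hook_stdout_chars(command: str, stdin_payload: str = "{}") -> int:
    """Run a hook command once and return the number of characters it prints."""
    result = subprocess.run(
        command,
        shell=True,           # hook commands are shell command lines
        input=stdin_payload,  # placeholder payload; an assumption, not the real schema
        capture_output=True,
        text=True,
        timeout=30,
    )
    return len(result.stdout)


def session_tokens(stdout_chars: int, calls_per_session: int) -> int:
    """Estimate tokens added per session at 4 chars per token."""
    return stdout_chars * calls_per_session // 4


# The worked example from this step: 500 chars on 150 tool calls
print(session_tokens(500, 150))
```

Pass the measured size and your own call count to `session_tokens` to fill in the Hook Overhead table.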
**Red flags**:
- Hooks that `cat` files unconditionally
- `git status` or `git log` on every call
- Multi-line echo output for debugging that was never removed
- JSON blobs injected as context
---
## Step 4 — Build the Action Plan
Produce a prioritized table. Rule of thumb: only include actions achievable without external infrastructure (no RAG, no vector databases, no custom MCP servers).
| Action | Estimated token savings | Effort | Risk |
|--------|------------------------|--------|------|
| Remove RARELY files from auto-load | varies | 30 min | Low |
| Split large rules into core + detail | varies | 1-2h | Low |
| Trim hook stdout to essential fields | varies | 1h | Low |
| Compress verbose rules (see §8 context-engineering.md) | 20-30% of rules | 1-2h | Low |
| Archive outdated MEMORY.md entries | 500-1K tokens | 30 min | Low |
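One way to order the table, sketched below, is by tokens saved per hour of effort. That ranking criterion is an assumption, since the skill only asks for a prioritized table, and the savings figures in the example are hypothetical placeholders for your own measurements.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    tokens_saved: int
    effort_hours: float  # must be greater than zero
    risk: str = "Low"


def prioritize(actions: list[Action]) -> list[Action]:
    """Order actions by tokens saved per hour of effort, highest first."""
    return sorted(actions, key=lambda a: a.tokens_saved / a.effort_hours, reverse=True)


# Hypothetical savings; replace with numbers from Steps 1-3
plan = prioritize([
    Action("Split large rules into core + detail", 4_000, 2.0),
    Action("Remove RARELY files from auto-load", 6_000, 0.5),
    Action("Trim hook stdout to essential fields", 3_000, 1.0),
])
for a in plan:
    print(f"| {a.name} | -{a.tokens_saved} tokens | {a.effort_hours}h | {a.risk} |")
```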
---
## Step 5 — The RAG Question
Lazy-loading via a vector database (RAG) is sometimes pitched as the solution. Assess it honestly before committing:
1. What fixed-context tokens remain after Steps 1-4? (Measure this first.)
2. Is RAG justified? A pgvector + custom MCP setup is a 1-2 week project.
3. Break-even: if you have 10 rules files averaging 3K chars each, classification (30 min) saves as much as RAG would. RAG earns its cost at 50+ rule files where intent-based routing is the only scalable solution.
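The break-even reasoning in item 3 comes down to a ceiling: no approach can save more than the rules files cost in the first place. A small sketch, using the 50-file threshold from the text as a default and the usual 4-chars-per-token estimate. Both function names are illustrative.

```python
def rules_ceiling_tokens(file_count: int, avg_chars: int) -> int:
    """Upper bound on savings: every rules file removed from fixed context."""
    return file_count * avg_chars // 4


def rag_worth_evaluating(file_count: int, threshold: int = 50) -> bool:
    """Heuristic from the text: RAG starts to earn its cost at 50+ rule files."""
    return file_count >= threshold


# The example from item 3: 10 files averaging 3K chars each
print(rules_ceiling_tokens(10, 3_000), rag_worth_evaluating(10))
```

For that example the ceiling is 7,500 tokens, which a 30-minute classification pass can largely recover without new infrastructure.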
---
## Output Format
After running the audit, produce this report:
```markdown
## Token Audit — [PROJECT] — [DATE]
### Budget Summary
| Component | Tokens | % of total |
|-----------|--------|------------|
| Global ~/.claude | X | Y% |
| Project CLAUDE.md | X | Y% |
| Rules (auto-loaded) | X | Y% |
| MEMORY.md | X | Y% |
| System prompt | 7,500 | Y% |
| **TOTAL** | **X** | **100%** |
Context window used before any task: X% of 200K
### Rules Classification
| File | Chars | Class | Action |
|------|-------|-------|--------|
| ... | ... | ALWAYS/SOMETIMES/RARELY/NEVER | keep/lazy-load/remove/delete |
### Hook Overhead
| Hook | Event | Est. stdout | Calls/session | Total tokens/session |
|------|-------|-------------|---------------|----------------------|
| ... | PreToolUse | X chars | ~Y | ~Z tokens |
### Action Plan
| Action | Savings | Effort | Risk |
|--------|---------|--------|------|
| ... | -X tokens | 30 min | Low |
**Total achievable without infrastructure**: -X tokens → from Y to Z (N% reduction)
### RAG Verdict
[One paragraph: remaining overhead after action plan, whether RAG is justified,
estimated setup cost vs savings.]
```
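The Budget Summary table in the template can be filled in mechanically once token estimates exist. A minimal sketch, assuming a dict of component names to token counts (the values below are hypothetical). `budget_table` is an illustrative helper, not part of the skill.

```python
def budget_table(tokens: dict[str, int]) -> str:
    """Render the Budget Summary rows as a markdown table."""
    total = sum(tokens.values())
    lines = [
        "| Component | Tokens | % of total |",
        "|-----------|--------|------------|",
    ]
    for name, count in tokens.items():
        share = round(count * 100 / total) if total else 0
        lines.append(f"| {name} | {count:,} | {share}% |")
    lines.append(f"| **TOTAL** | **{total:,}** | **100%** |")
    return "\n".join(lines)


# Hypothetical token counts
print(budget_table({
    "Global ~/.claude": 5_000,
    "Project CLAUDE.md": 3_000,
    "Rules (auto-loaded)": 20_000,
    "MEMORY.md": 1_500,
    "System prompt": 7_500,
}))
```

Shares are rounded per row, so they may not sum to exactly 100 for other inputs.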
---
## Interpreting Results
| Fixed context | Assessment |
|---------------|------------|
| < 20K tokens | Healthy — no urgent action needed |
| 20-40K tokens | Moderate — run the classification pass, grab easy wins |
| 40-60K tokens | High — rules audit is worth an afternoon |
| > 60K tokens | Critical — you are burning 30%+ of your window before any task |
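The bands above translate directly into a lookup. The table does not say which band owns the exact boundaries, so this sketch assumes 20K and 40K belong to the higher band and 60K stays in High.

```python
def assess(fixed_tokens: int) -> str:
    """Map a fixed-context token count to the assessment bands above."""
    if fixed_tokens < 20_000:
        return "Healthy"
    if fixed_tokens < 40_000:
        return "Moderate"
    if fixed_tokens <= 60_000:  # assumption: exactly 60K is still High
        return "High"
    return "Critical"


for tokens in (12_000, 37_000, 55_000, 75_000):
    print(tokens, assess(tokens))
```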
A 48% reduction is typical after a first-pass audit on a heavily configured project, with no infrastructure changes: just removing the RARELY-used files from auto-load.