token-audit

Audit Claude Code configuration to measure fixed-context token overhead and produce a prioritized action plan

3,046 stars

Best use case

token-audit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using token-audit should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/token-audit/SKILL.md --create-dirs "https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/skills/token-audit/skill.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/token-audit/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How token-audit Compares

| Feature / Agent | token-audit | Standard Approach |
|-----------------|-------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Audit Claude Code configuration to measure fixed-context token overhead and produce a prioritized action plan

Where can I find the source code?

The source is in the FlorianBruniaux/claude-code-ultimate-guide repository on GitHub, under examples/skills/token-audit/.

SKILL.md Source

# /token-audit — Context Token Audit

**Purpose**: Measure how many tokens your Claude Code configuration consumes before any user task begins. Identify the biggest sources of overhead. Produce a concrete action plan with savings estimates.

**When to use**:
- You're hitting rate limits before end of day
- Sessions feel slow or context compresses early
- You've added a lot of rules files and want to know the real cost
- After a major config change

---

## What You Will Measure

| Component | Loaded when | Typical range |
|-----------|-------------|---------------|
| `~/.claude/CLAUDE.md` + @imports | Always | 5-15K tokens |
| Project `CLAUDE.md` | Always | 2-8K tokens |
| `.claude/rules/*.md` | Always (all files) | 5-40K tokens |
| `MEMORY.md` | Always | 1-3K tokens |
| Claude Code system prompt | Always | ~7,500 tokens |
| Hook stdout | Per tool call | variable |
| Commands, agents, skills | On invocation only | 0 by default |

Key insight: `.claude/rules/` loads every `.md` file at session start, regardless of relevance. Commands and agents are lazy-loaded — they cost zero until invoked. Rules files are the most common source of unexpected overhead.

---

## Step 1 — Run the Measurement

Execute these commands from the project root:

```bash
# Component sizes
echo "=== PROJECT CLAUDE.md ===" && wc -c CLAUDE.md 2>/dev/null || echo "none"

echo ""
echo "=== RULES FILES (sorted by size) ===" && find .claude/rules -name "*.md" 2>/dev/null \
  | xargs wc -c 2>/dev/null | sort -rn | head -20

echo ""
echo "=== GLOBAL ~/.claude ===" && ls -la ~/.claude/*.md 2>/dev/null \
  | awk '{print $5, $9}' | sort -rn
```

Then calculate the full budget:

```bash
# ~/.claude/*.md already matches CLAUDE.md, so use the glob alone to avoid double-counting
GLOBAL=$(cat ~/.claude/*.md 2>/dev/null | wc -c)
PROJECT=$(cat CLAUDE.md 2>/dev/null | wc -c)
RULES=$(find .claude/rules -name "*.md" -exec cat {} + 2>/dev/null | wc -c)
# Heuristic: take the first MEMORY.md whose content mentions this directory's name
MEMORY=$(find ~/.claude/projects -name "MEMORY.md" 2>/dev/null \
  | xargs grep -l "$(basename "$(pwd)")" 2>/dev/null | head -1 \
  | xargs cat 2>/dev/null | wc -c)
# 30000 chars stands in for the ~7,500-token system prompt (7,500 x 4)
TOTAL=$(( GLOBAL + PROJECT + RULES + MEMORY + 30000 ))

echo "Global ~/.claude   : ~$(( GLOBAL / 4 )) tokens ($(( GLOBAL / 1000 ))K chars)"
echo "Project CLAUDE.md  : ~$(( PROJECT / 4 )) tokens"
echo "Rules (auto-loaded): ~$(( RULES / 4 )) tokens"
echo "MEMORY.md          : ~$(( MEMORY / 4 )) tokens"
echo "System prompt      : ~7,500 tokens"
echo "---"
echo "TOTAL fixed context: ~$(( TOTAL / 4 )) tokens"
echo "% of 200K window   : $(( TOTAL / 4 * 100 / 200000 ))%"
```
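
The arithmetic in the script above can be checked by hand. Here is a minimal Python sketch of the same calculation, assuming the heuristics the script already uses (roughly 4 characters per token, a ~7,500-token system prompt, a 200K window). The function name `fixed_context_budget` and the sample character counts are hypothetical.

```python
CHARS_PER_TOKEN = 4            # rough heuristic used throughout this audit
SYSTEM_PROMPT_TOKENS = 7_500   # figure quoted in the component table
WINDOW_TOKENS = 200_000        # window size assumed by the script


def fixed_context_budget(global_chars, project_chars, rules_chars, memory_chars):
    """Return (total_tokens, percent_of_window) for the given character counts."""
    config_chars = global_chars + project_chars + rules_chars + memory_chars
    total_tokens = config_chars // CHARS_PER_TOKEN + SYSTEM_PROMPT_TOKENS
    percent = total_tokens * 100 // WINDOW_TOKENS
    return total_tokens, percent


# Hypothetical sizes in characters: 20K global, 12K project, 80K rules, 8K memory
total, pct = fixed_context_budget(20_000, 12_000, 80_000, 8_000)
print(f"~{total} tokens fixed context, {pct}% of the window")
```

With these made-up inputs the rules files account for two thirds of the configurable overhead, which is the pattern Step 2 is meant to catch.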

---

## Step 2 — Classify Rules Files

For each file in `.claude/rules/`, classify it:

| Class | Definition | Action |
|-------|------------|--------|
| **ALWAYS** | Applies to most tasks (conventions, output format, safety) | Keep auto-loaded |
| **SOMETIMES** | Relevant in 20-40% of sessions | Keep if small (<3K chars); lazy-load if large |
| **RARELY** | Relevant in <10% of sessions (Figma, Windows, design system) | Remove from auto-load |
| **NEVER** | Outdated or covered elsewhere | Delete or archive |
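
The table above can be read as a small decision rule. The sketch below encodes it in Python, assuming the 3K-character threshold from the SOMETIMES row; the function name `recommended_action` is hypothetical and the class label itself still has to come from human or model judgment.

```python
SMALL_FILE_CHARS = 3_000  # "small" threshold from the SOMETIMES row


def recommended_action(rule_class, size_chars):
    """Map a rules-file class and size to the action listed in the table."""
    rule_class = rule_class.upper()
    if rule_class == "ALWAYS":
        return "keep auto-loaded"
    if rule_class == "SOMETIMES":
        # Small files stay; large ones are worth lazy-loading
        return "keep auto-loaded" if size_chars < SMALL_FILE_CHARS else "lazy-load"
    if rule_class == "RARELY":
        return "remove from auto-load"
    if rule_class == "NEVER":
        return "delete or archive"
    raise ValueError(f"unknown class: {rule_class}")


print(recommended_action("SOMETIMES", 2_500))
print(recommended_action("SOMETIMES", 9_000))
```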

Run this classification prompt:

```
Read every file in .claude/rules/. For each file, output a table row:

| File | Size (chars) | Class (ALWAYS/SOMETIMES/RARELY/NEVER) | Reasoning (one sentence) |

Sort by size descending within each class.
At the end, calculate: total chars that would leave the fixed context if all
RARELY and NEVER files were excluded. Convert to tokens (÷ 4).
```
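
The final calculation the prompt asks for is simple enough to verify independently. A minimal sketch, assuming the classification has already been turned into `(file, chars, class)` rows; the file names and sizes are invented for illustration.

```python
def excluded_savings(rows, chars_per_token=4):
    """Sum the chars of RARELY and NEVER files and convert to tokens."""
    chars = sum(size for _, size, cls in rows if cls in {"RARELY", "NEVER"})
    return chars, chars // chars_per_token


# Hypothetical classification output
rows = [
    ("conventions.md", 4_200, "ALWAYS"),
    ("figma.md", 9_000, "RARELY"),
    ("legacy-api.md", 3_000, "NEVER"),
    ("testing.md", 2_500, "SOMETIMES"),
]

chars, tokens = excluded_savings(rows)
print(f"{chars} chars would leave the fixed context (~{tokens} tokens)")
```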

---

## Step 3 — Audit Hook Overhead

Hooks on `PreToolUse` and `PostToolUse` fire on every tool call. Each invocation injects its stdout into context. A hook outputting 500 chars on 150 tool calls per session = 75K chars ≈ 19K extra tokens.
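
The estimate above is plain multiplication. The following sketch reproduces it, assuming the same 4-characters-per-token heuristic; `hook_tokens_per_session` is a hypothetical helper name.

```python
def hook_tokens_per_session(stdout_chars, tool_calls, chars_per_token=4):
    """Return (total_chars, approx_tokens) injected by one hook in a session."""
    total_chars = stdout_chars * tool_calls
    return total_chars, round(total_chars / chars_per_token)


# The example from the text: 500 chars of stdout on 150 tool calls
chars, tokens = hook_tokens_per_session(500, 150)
print(f"{chars} chars, about {tokens} tokens")
```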

Check what you have:

```bash
# List hooks by event type
python3 - << 'EOF'
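import json, os

# Assumes the documented layout: hooks[event] is a list of matcher groups,
# each holding its commands in a nested "hooks" list.
for path in [os.path.expanduser("~/.claude/settings.json"), ".claude/settings.json"]:
    if not os.path.exists(path):
        continue
    print(f"\n--- {path} ---")
    with open(path) as f:
        data = json.load(f)
    for event, groups in data.get("hooks", {}).items():
        for group in groups:
            matcher = group.get("matcher") or "*"
            # Fall back to the group itself for flat entries
            for h in group.get("hooks", [group]):
                cmd = h.get("command", "?")
                print(f"  [{event}] matcher={matcher} → {cmd[:80]}")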
EOF
```

For each `PreToolUse` or `PostToolUse` hook, estimate its stdout size by running it manually. Multiply by your average tool call count per session (visible in `/cost` after a session).
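
Running a hook manually and counting its output can be scripted. This is a rough sketch, not a faithful reproduction of how Claude Code invokes hooks: it assumes the hook is a shell command and that passing an empty or hand-written payload on stdin is enough to make it produce representative output. The helper name `hook_stdout_chars` and the `echo` stand-in command are hypothetical.

```python
import subprocess


def hook_stdout_chars(command, stdin_text=""):
    """Run a hook command through the shell and return the length of its stdout."""
    result = subprocess.run(
        command,
        shell=True,
        input=stdin_text,     # assumption: a hand-written payload, if the hook needs one
        capture_output=True,
        text=True,
        check=False,          # a failing hook still tells us how much it printed
    )
    return len(result.stdout)


# Stand-in for a real hook command; replace with the command from settings.json
chars = hook_stdout_chars("echo hello")
calls_per_session = 150  # hypothetical average
print(f"{chars} chars per call, ~{chars * calls_per_session // 4} tokens per session")
```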

**Red flags**:
- Hooks that `cat` files unconditionally
- `git status` or `git log` on every call
- Multi-line echo output for debugging that was never removed
- JSON blobs injected as context

---

## Step 4 — Build the Action Plan

Produce a prioritized table. Rule of thumb: only include actions achievable without external infrastructure (no RAG, no vector databases, no custom MCP servers).

| Action | Estimated token savings | Effort | Risk |
|--------|------------------------|--------|------|
| Remove RARELY files from auto-load | varies | 30 min | Low |
| Split large rules into core + detail | varies | 1-2h | Low |
| Trim hook stdout to essential fields | varies | 1h | Low |
| Compress verbose rules (see §8 context-engineering.md) | 20-30% of rules | 1-2h | Low |
| Archive outdated MEMORY.md entries | 500-1K tokens | 30 min | Low |
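
Once each row has a savings estimate, the plan's headline numbers follow directly. A minimal sketch, assuming savings are expressed in tokens and do not overlap; the function name `plan_summary` and all figures are hypothetical.

```python
def plan_summary(current_tokens, savings_by_action):
    """Return (total_saved, remaining_tokens, reduction_percent)."""
    if current_tokens <= 0:
        raise ValueError("current_tokens must be positive")
    total_saved = sum(savings_by_action.values())
    remaining = current_tokens - total_saved
    reduction_pct = round(total_saved * 100 / current_tokens)
    return total_saved, remaining, reduction_pct


# Hypothetical audit: 50K tokens of fixed context before any changes
savings = {
    "Remove RARELY files from auto-load": 12_000,
    "Trim hook stdout to essential fields": 5_000,
    "Archive outdated MEMORY.md entries": 1_000,
}
saved, remaining, pct = plan_summary(50_000, savings)
print(f"-{saved} tokens: from 50000 to {remaining} ({pct}% reduction)")
```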

---

## Step 5 — The RAG Question

Lazy-loading via a vector database (RAG) is sometimes pitched as the solution. Assess it honestly before committing:

1. What fixed-context tokens remain after Steps 1-4? (Measure this first.)
2. Is RAG justified? A pgvector + custom MCP setup is a 1-2 week project.
3. Break-even: if you have 10 rules files averaging 3K chars each, classification (30 min) saves as much as RAG would. RAG earns its cost at 50+ rule files where intent-based routing is the only scalable solution.
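
The break-even point in item 3 rests on how much rules content exists in the first place. The sketch below puts numbers on it, assuming 4 characters per token and the 3K-character average from the text; `max_rules_tokens` is a hypothetical helper, and the result is an upper bound on what any lazy-loading scheme could save from rules files.

```python
def max_rules_tokens(file_count, avg_chars, chars_per_token=4):
    """Upper bound on tokens held by rules files of the given count and size."""
    return file_count * avg_chars // chars_per_token


small_setup = max_rules_tokens(10, 3_000)   # the 10-file case from the text
large_setup = max_rules_tokens(50, 3_000)   # the 50+ file case

print(f"10 files: at most ~{small_setup} tokens to recover")
print(f"50 files: at most ~{large_setup} tokens to recover")
```

At 10 files the whole rules directory is about as large as the system prompt, so a 30-minute classification pass covers most of what a 1-2 week RAG build could recover.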

---

## Output Format

After running the audit, produce this report:

```markdown
## Token Audit — [PROJECT] — [DATE]

### Budget Summary

| Component | Tokens | % of total |
|-----------|--------|------------|
| Global ~/.claude | X | Y% |
| Project CLAUDE.md | X | Y% |
| Rules (auto-loaded) | X | Y% |
| MEMORY.md | X | Y% |
| System prompt | 7,500 | Y% |
| **TOTAL** | **X** | **100%** |

Context window used before any task: X% of 200K

### Rules Classification

| File | Chars | Class | Action |
|------|-------|-------|--------|
| ... | ... | ALWAYS/SOMETIMES/RARELY/NEVER | keep/lazy-load/remove/delete |

### Hook Overhead

| Hook | Event | Est. stdout | Calls/session | Total tokens/session |
|------|-------|-------------|---------------|----------------------|
| ... | PreToolUse | X chars | ~Y | ~Z tokens |

### Action Plan

| Action | Savings | Effort | Risk |
|--------|---------|--------|------|
| ... | -X tokens | 30 min | Low |

**Total achievable without infrastructure**: -X tokens → from Y to Z (N% reduction)

### RAG Verdict

[One paragraph: remaining overhead after action plan, whether RAG is justified,
estimated setup cost vs savings.]
```

---

## Interpreting Results

| Fixed context | Assessment |
|---------------|------------|
| < 20K tokens | Healthy — no urgent action needed |
| 20-40K tokens | Moderate — run the classification pass, grab easy wins |
| 40-60K tokens | High — rules audit is worth an afternoon |
| > 60K tokens | Critical — you are burning 30%+ of your window before any task |

A 48% reduction is typical after a first-pass audit on a heavily configured project, with no infrastructure changes — just removing the RARELY-used files from auto-load.
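
The assessment bands in the table can be applied mechanically. A small sketch follows. The table does not say which band the exact boundaries 40K and 60K belong to, so treating both as High here is an assumption, as is the function name `assess`.

```python
def assess(fixed_tokens):
    """Map a fixed-context token count to the assessment bands in the table."""
    if fixed_tokens < 20_000:
        return "Healthy"
    if fixed_tokens < 40_000:
        return "Moderate"
    if fixed_tokens <= 60_000:   # assumption: 40K and 60K both count as High
        return "High"
    return "Critical"


for tokens in (12_000, 37_500, 55_000, 72_000):
    print(tokens, assess(tokens))
```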
