codebase-analyzer

Statistical rule discovery from Go codebase patterns.

290 stars

by notque

Best use case

codebase-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using codebase-analyzer should expect more consistent output, faster repeated runs, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/codebase-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/notque/claude-code-toolkit/main/skills/codebase-analyzer/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/codebase-analyzer/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How codebase-analyzer Compares

| Feature / Agent | codebase-analyzer | Standard Approach |
|-----------------|-------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Statistical rule discovery from Go codebase patterns.

Where can I find the source code?

The source code is on GitHub in the notque/claude-code-toolkit repository, under skills/codebase-analyzer/.

SKILL.md Source

# Codebase Analyzer Skill

Statistical rule discovery through measurement of Go codebases. Python scripts count patterns to avoid LLM training bias, then statistics are interpreted to derive confidence-scored rules. The core principle is **Measure First, Interpret Second** -- what IS in the code is the local standard, not what an LLM thinks "should be" there.

## Instructions

### Phase 1: CONFIGURE

**Goal**: Validate target and select analyzer variant.

Read and follow the repository's CLAUDE.md before doing anything else -- project instructions override default behaviors.

**Step 1: Validate the target**
- Confirm path points to a Go repository root with .go files
- Check for standard structure (cmd/, internal/, pkg/)
- Verify sufficient file count: 50+ files for meaningful rules, 100+ ideal. Below 50 files, statistics produce high variance -- patterns that look consistent may be coincidence. For small repos, combine analysis across multiple team repos rather than treating thin data as definitive.
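
The file-count check above can be sketched in a few lines of standard-library Python. This is an illustration rather than part of the skill's scripts: the function names `count_go_files` and `sample_size_verdict` are hypothetical, and the 50/100 thresholds are taken from the step above.

```python
import os

MIN_FILES = 50     # below this, the skill treats statistics as high-variance
IDEAL_FILES = 100  # the skill's "ideal" sample size


def count_go_files(root):
    """Count .go files under root, skipping vendor/ and testdata/ directories."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in ("vendor", "testdata")]
        total += sum(1 for name in filenames if name.endswith(".go"))
    return total


def sample_size_verdict(file_count):
    """Map a file count onto the three bands described in Step 1."""
    if file_count >= IDEAL_FILES:
        return "ideal"
    if file_count >= MIN_FILES:
        return "sufficient"
    return "too small"
```

A "too small" verdict is the cue to combine analysis across several team repos instead of proceeding with thin data.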

**Step 2: Select cartographer variant**

| Variant | Script | Metrics | Use When |
|---------|--------|---------|----------|
| Omni (recommended) | `cartographer_omni.py` (not yet implemented) | 100 across 25 categories | Full codebase profiling |
| Basic | `cartographer.py` (not yet implemented) | ~15 categories | Quick pattern overview |
| Ultimate | `cartographer_ultimate.py` | 6 focused categories | Performance pattern detection |

**Step 3: Verify environment**
- Python 3.7+ available
- No external dependencies needed (uses only Python standard library)
- Output directories exist or can be created

```
===============================================================
 PHASE 1: CONFIGURE
===============================================================

 Target Repository:
   - Path: [/path/to/repo]
   - Go Files: [N files found]
   - Structure: [cmd/ | internal/ | pkg/ | flat]

 Variant Selected: [Omni | Basic | Ultimate]
 Reason: [why this variant]

 Validation:
   - [ ] Path exists and contains .go files
   - [ ] File count >= 50 (actual: N)
   - [ ] Python 3.7+ available
   - [ ] Output directory writable

 CONFIGURE complete. Proceeding to MEASURE...
===============================================================
```

**Gate**: Target directory exists, contains 50+ Go files, variant selected. Proceed only when gate passes.

### Phase 2: MEASURE

**Goal**: Run statistical analysis scripts. Pure measurement -- no interpretation yet.

This phase is strictly mechanical. Scripts count and measure; keep interpretation separate from data collection. Combining measurement with interpretation introduces LLM training bias -- the model reports what "should be" instead of what IS. Run scripts first, interpret the numbers second, always as separate steps.

Automatically filter vendor/, testdata/, and generated code (files with "Code generated by..." markers) to avoid polluting statistics with external patterns.
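
One way such a filter might look, as a rough sketch: the helper names are hypothetical, and the marker check assumes the generated-code comment appears as a `// Code generated` line near the top of the file, which is how Go tools conventionally emit it.

```python
def is_excluded_path(path):
    """True if any path component is vendor or testdata."""
    parts = path.replace("\\", "/").split("/")
    return "vendor" in parts or "testdata" in parts


def is_generated(source, header_lines=20):
    """True if a '// Code generated' comment appears in the first few lines.

    header_lines is an illustrative cutoff, not a value taken from the skill.
    """
    for line in source.splitlines()[:header_lines]:
        if line.startswith("// Code generated"):
            return True
    return False
```

Matching on whole path components avoids false positives on names that merely contain the word, such as `internal/vendorapi/`.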

**Step 1: Execute the cartographer**

```bash
# TODO: scripts/cartographer_omni.py not yet implemented
# Manual alternative: use grep/find to count patterns across Go files
# Example: count error wrapping patterns
grep -rn 'fmt.Errorf.*%w' ~/repos/my-project --include="*.go" | wc -l
# Example: count constructor patterns
grep -rn 'func New' ~/repos/my-project --include="*.go" | wc -l
```
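
The same two counts can be reproduced in Python when a deterministic, scriptable version is preferred over shell pipelines. This is a minimal sketch, not the unimplemented cartographer: the `PATTERNS` table and `count_matching_lines` are hypothetical names, and like `grep ... | wc -l` it counts matching lines rather than individual matches.

```python
import re

# Same patterns as the grep commands above, written as Python regexes.
PATTERNS = {
    "error_wrapping": re.compile(r"fmt\.Errorf.*%w"),
    "constructors": re.compile(r"func New"),
}


def count_matching_lines(sources):
    """Count lines matching each pattern across an iterable of Go source strings."""
    counts = {name: 0 for name in PATTERNS}
    for text in sources:
        for line in text.splitlines():
            for name, pattern in PATTERNS.items():
                if pattern.search(line):
                    counts[name] += 1
    return counts
```

Because the function only counts, it keeps the measurement step free of any judgment about whether a pattern is good or bad.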

Always take measurements with scripts (or, until the cartographer scripts are implemented, with the manual counting commands above); reserve LLM interpretation for Phase 3. When an LLM sees `return err` it may report "not wrapping errors properly" even if that IS the local standard. Scripted counts are deterministic and reproducible; the LLM's role begins only once the numbers exist.

**Step 2: Verify output integrity**
- Confirm JSON output is valid and complete
- Check file count matches expectations (no vendor pollution)
- Verify all three lenses (Consistency, Signature, Idiom; described in Phase 3) produced data
- Confirm derived_rules section exists in output

**Step 3: Check for data quality issues**
- File count suspiciously high? Vendor code may be included
- File count suspiciously low? Subdirectories may be missed
- All percentages near 50%? May indicate mixed codebase or insufficient data
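
The integrity and data-quality checks above could be automated along these lines. The report schema is not documented here, so every key name in this sketch (`files_analyzed`, `lenses`, and the lowercase lens names) is an assumption, as are the `expected_files` and `tolerance` parameters; only `derived_rules` is named in the steps above.

```python
import json

# Assumed lens keys; the real report layout may differ.
LENSES = ("consistency", "signature", "idiom")


def check_report(raw, expected_files, tolerance=0.2):
    """Return a list of problems found in a report JSON string (empty = passes)."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(report, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    if "derived_rules" not in report:
        problems.append("missing derived_rules section")

    lenses = report.get("lenses")
    if not isinstance(lenses, dict):
        lenses = {}
    for lens in LENSES:
        if not lenses.get(lens):
            problems.append(f"lens has no data: {lens}")

    # Compare against an independent file count (for example from find).
    analyzed = report.get("files_analyzed", 0)
    if analyzed > expected_files * (1 + tolerance):
        problems.append("file count high: vendor code may be included")
    elif analyzed < expected_files * (1 - tolerance):
        problems.append("file count low: subdirectories may be missed")
    return problems
```

An empty list corresponds to the Phase 2 gate passing; anything else should be resolved before interpretation starts.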

```
===============================================================
 PHASE 2: MEASURE
===============================================================

 Script Executed: [cartographer_omni.py (not yet implemented — use manual pattern counting)]
 Target: [/path/to/repo]

 Results:
   - Files analyzed: [N]
   - Total lines: [N]
   - Categories measured: [N of 25]
   - Derived rules: [N auto-extracted]

 Data Quality:
   - [ ] JSON output valid
   - [ ] File count reasonable (no vendor pollution)
   - [ ] All three lenses have data
   - [ ] No unexpected zeros in major categories

 Output saved to: [path/to/output.json]

 MEASURE complete. Proceeding to INTERPRET...
===============================================================
```

**Gate**: Script completed without errors, JSON output is valid, file count is reasonable. Proceed only when gate passes.

### Phase 3: INTERPRET

**Goal**: Derive rules from statistics. This is where LLM interpretation happens -- AFTER measurement is complete.

Show the complete statistics rather than paraphrasing them, and report facts without editorializing about code quality -- the numbers speak for themselves.

**Step 1: Review the three lenses**

| Lens | Question | Measures |
|------|----------|----------|
| Consistency (Frequency) | "How often do they use X?" | Imports, test frameworks, logging, modern features |
| Signature (Structure) | "How do they name/structure things?" | Constructors, receivers, parameter order, variables |
| Idiom (Implementation) | "How do they implement patterns?" | Error handling, control flow, context usage, defer |

For detailed lens explanations, see `references/three-lenses.md`.

**Step 2: Extract rules by confidence**

Only derive rules from patterns with sufficient consistency. Forcing rules from weak patterns causes false positives in reviews and may impose standards the team has not organically adopted.

| Confidence | Threshold | Action | Example |
|------------|-----------|--------|---------|
| HIGH | >85% consistency | Extract as enforceable rule | "96% use err not e" -> MUST use err |
| MEDIUM | 70-85% consistency | Extract as recommendation | "78% guard clauses" -> SHOULD prefer guards |
| Below 70% | Not extracted as rule | Report as observation only | "55% single-letter receivers" -> No rule |
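
The thresholds in the table translate directly into a small classifier. The sketch below is illustrative: the function names are hypothetical, "consistency" is assumed to mean the dominant variant's share of all occurrences, and the boundary value 85% is placed in MEDIUM because the table lists HIGH as strictly greater than 85%.

```python
def consistency_pct(dominant_count, total_count):
    """Share of occurrences that use the dominant variant, as a percentage."""
    if total_count == 0:
        return 0.0
    return 100.0 * dominant_count / total_count


def classify(pct):
    """Map a consistency percentage onto the confidence bands in the table."""
    if pct > 85:
        return "HIGH"         # extract as enforceable rule
    if pct >= 70:
        return "MEDIUM"       # extract as recommendation
    return "OBSERVATION"      # report only, no rule
```

With the table's own examples, `classify(96)` gives `"HIGH"`, `classify(78)` gives `"MEDIUM"`, and `classify(55)` gives `"OBSERVATION"`.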

**Step 3: Review Style Vector** (Omni only)
- 10 composite scores (0-100): Consistency, Modernization, Safety, Idiomaticity, Documentation, Testing Maturity, Architecture, Performance, Observability, Production Readiness
- Identify strengths (scores >75) and gaps (scores <50)
- Note shadow constitution entries (accepted linter suppressions)
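
The strength and gap cutoffs can be applied mechanically once the scores exist. A minimal sketch, assuming the Style Vector arrives as a mapping from dimension name to a 0-100 score; `assess` and `summarize` are hypothetical names, and "Neutral" covers everything between the two cutoffs.

```python
def assess(score):
    """Label a 0-100 composite score using the cutoffs from Step 3."""
    if score > 75:
        return "Strength"
    if score < 50:
        return "Gap"
    return "Neutral"


def summarize(style_vector):
    """Group dimension names by assessment, preserving input order."""
    groups = {"Strength": [], "Gap": [], "Neutral": []}
    for dimension, score in style_vector.items():
        groups[assess(score)].append(dimension)
    return groups
```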

**Step 4: Cross-reference lenses**
- Pattern confirmed across multiple lenses = higher confidence
- Pattern in one lens only = standard confidence
- Contradictions between lenses = investigate further

**Gate**: Rules extracted with evidence and confidence levels. Style Vector reviewed (Omni only). Proceed only when gate passes.

### Phase 4: DELIVER

**Goal**: Produce actionable output artifacts.

**Step 1: Save statistical report**
```
cartography_data/{repo_name}_cartography.json
```

**Step 2: Generate derived rules document**
```
derived_rules/{repo_name}_rules.md
```

Format each rule as:
```markdown
## Rule: [Statement]
**Confidence**: HIGH/MEDIUM
**Evidence**: [X% consistency across N occurrences]
**Category**: [error_handling | naming | control_flow | architecture | ...]
**Lens**: [Consistency | Signature | Idiom | Multiple]
```
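
Rendering rules into this format is easy to script, which keeps the rules document consistent from run to run. The `Rule` dataclass and `render_rule` below are a hypothetical sketch; the field names are chosen for illustration and do not come from the skill.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    statement: str
    confidence: str         # "HIGH" or "MEDIUM"
    consistency_pct: float
    occurrences: int
    category: str
    lens: str


def render_rule(rule):
    """Render one rule in the markdown layout shown above."""
    return "\n".join([
        f"## Rule: {rule.statement}",
        f"**Confidence**: {rule.confidence}",
        f"**Evidence**: {rule.consistency_pct:g}% consistency "
        f"across {rule.occurrences} occurrences",
        f"**Category**: {rule.category}",
        f"**Lens**: {rule.lens}",
    ])
```

Joining the rendered rules with blank lines would then give the body of `derived_rules/{repo_name}_rules.md`.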

**Step 3: Summarize Style Vector** (Omni only)

```markdown
## Style Vector Summary
| Dimension | Score | Assessment |
|-----------|-------|------------|
| Consistency | [0-100] | [Strength/Gap/Neutral] |
| Modernization | [0-100] | [Strength/Gap/Neutral] |
| ... | ... | ... |
```

**Step 4: Recommend next steps**
- Compare with pr-workflow (miner) data if available (explicit vs implicit rules)
- Suggest CLAUDE.md updates for high-confidence rules
- Identify golangci-lint rules that could enforce discovered patterns
- Suggest quarterly re-analysis schedule -- coding patterns evolve with team growth and new Go versions, so a one-time snapshot becomes stale within months

```
===============================================================
 PHASE 4: DELIVER
===============================================================

 Artifacts:
   - [ ] JSON report: [path]
   - [ ] Rules document: [path]
   - [ ] Style Vector summary: [included in rules doc]

 Results Summary:
   - HIGH confidence rules: [N]
   - MEDIUM confidence rules: [N]
   - Observations (below threshold): [N]
   - Style Vector overall: [strong/mixed/weak]

 Next Steps:
   1. [Specific recommendation]
   2. [Specific recommendation]
   3. [Specific recommendation]

 DELIVER complete. Analysis finished.
===============================================================
```

**Gate**: JSON report saved, rules document generated, next steps documented. Analysis complete.

---

## Complementary Skills

| Skill | Extracts | Combined Value |
|-------|----------|----------------|
| pr-workflow (miner) | Explicit rules (what people argue about in reviews) | Agreement = HIGH confidence; Silence + consistency = implicit rule |
| codebase-analyzer | Implicit rules (what they actually do) | pr-workflow (miner) says X but code does Y = rule not followed |

### Reconciliation Matrix

| pr-workflow (miner) | codebase-analyzer | Conclusion |
|----------|-------------------|------------|
| Says X | Shows X at >85% | Confirmed rule (both explicit and practiced) |
| Silent | Shows X at >85% | Implicit rule (nobody argues because everyone agrees) |
| Says X | Shows Y at >85% | Rule stated but not followed (needs enforcement or is outdated) |
| Mixed signals | Inconsistent | No standard yet (opportunity to establish one) |
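
The matrix can be read as a small decision function. The sketch below is one possible encoding, not the skill's implementation: `reconcile` and its parameters are hypothetical, `stated` is `None` when reviews are silent, and any measurement at or below the 85% threshold is treated as "no standard yet", which goes slightly beyond the four rows the matrix lists.

```python
HIGH_THRESHOLD = 85  # same cutoff as the HIGH confidence band


def reconcile(stated, measured, measured_pct):
    """Combine an explicit review rule with the measured dominant pattern.

    stated:       pattern reviewers ask for, or None if reviews are silent
    measured:     dominant pattern found in the code
    measured_pct: consistency of that dominant pattern, 0-100
    """
    if measured_pct > HIGH_THRESHOLD:
        if stated is None:
            return "Implicit rule"
        if stated == measured:
            return "Confirmed rule"
        return "Rule stated but not followed"
    # Assumption: the matrix pairs inconsistent code only with mixed review
    # signals; this sketch applies the same conclusion to every weak pattern.
    return "No standard yet"
```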

---

## Examples

### Example 1: Single Repository Analysis
User says: "What conventions does this repo follow?"
Actions:
1. Validate target has 100+ Go files (CONFIGURE)
2. Run pattern counting against the repo (MEASURE)
3. Extract rules from statistics: error wrapping 89%, guard clauses 5.2x, New{Type} 94% (INTERPRET)
4. Save JSON report and rules document (DELIVER)
Result: 30+ rules extracted with confidence levels, Style Vector produced

### Example 2: Team-Wide Standards Discovery
User says: "Find our team's coding patterns across all services"
Actions:
1. Validate all target repos, confirm 50+ files each (CONFIGURE)
2. Run cartographer on each repo separately (MEASURE)
3. Cross-reference patterns: error wrapping 87-91% across all repos = team standard (INTERPRET)
4. Produce team-wide rules document with per-repo breakdowns (DELIVER)
Result: Team-wide standards with cross-repo evidence

### Example 3: Onboarding New Developer
User says: "I just joined the team, what coding patterns should I follow?"
Actions:
1. Identify main team repos, validate Go file counts (CONFIGURE)
2. Run the Omni cartographer on primary service (MEASURE)
3. Extract top 10 HIGH confidence rules as onboarding checklist (INTERPRET)
4. Produce concise rules doc focusing on error handling, naming, and control flow (DELIVER)
Result: Evidence-based onboarding guide with concrete examples from actual codebase

---

## Error Handling

### Error: "No Go files found"
Cause: Path does not point to a Go repository root, or .go files are in subdirectories not being scanned
Solution:
1. Verify path points to repository root with `ls *.go` or `find . -name "*.go" | head`
2. If Go files are nested, point to parent directory
3. Confirm vendor/ is not the only directory containing Go files

### Error: "No rules derived"
Cause: Codebase too small (<50 files) or patterns genuinely inconsistent
Solution:
1. Check file count -- if <50, combine analysis across multiple repos from same team
2. If 50+ files but no rules, team genuinely lacks consistent patterns
3. Lower threshold to 60% to find emerging patterns (note reduced confidence)

### Error: "Statistics dominated by vendor/generated code"
Cause: Vendor directory or generated files not filtered, polluting pattern data
Solution:
1. Verify scripts are filtering vendor/, testdata/, and `_test.go` files for core patterns
2. If non-standard structure, analyze specific directories manually
3. Check for generated code markers (Code generated by...) and exclude those files

---

## References

### Reference Files
- `${CLAUDE_SKILL_DIR}/references/three-lenses.md`: Detailed explanation of the three analysis lenses
- `${CLAUDE_SKILL_DIR}/references/examples.md`: Real-world analysis examples and workflows
- `${CLAUDE_SKILL_DIR}/references/metrics-catalog.md`: Complete 100-metric catalog across 25 categories

### Prerequisites
- Python 3.7+
- Go codebase to analyze (50+ files recommended)
- No external dependencies (uses only Python standard library)
