reference-enrichment

Analyze agent/skill reference depth and generate missing domain-specific reference files.

290 stars

Best use case

reference-enrichment is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using reference-enrichment should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/reference-enrichment/SKILL.md --create-dirs "https://raw.githubusercontent.com/notque/claude-code-toolkit/main/skills/reference-enrichment/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub
2. Place it in .claude/skills/reference-enrichment/SKILL.md inside your project
3. Restart your AI agent; it will auto-discover the skill

How reference-enrichment Compares

| Feature / Agent | reference-enrichment | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Analyze agent/skill reference depth and generate missing domain-specific reference files.

Where can I find the source code?

The source code is in the notque/claude-code-toolkit repository on GitHub, under skills/reference-enrichment/.

SKILL.md Source

# Reference Enrichment Skill

Enrich an agent or skill's reference files from Level 0-1 to Level 2-3. The pipeline runs five
phases with explicit gates because each phase feeds the next — starting Phase 3 without Phase 2
research produces filler, not depth.

## Workflow

### Phase 1: DISCOVER

**Goal**: Identify which sub-domains are missing reference coverage.

1. Run the gap analyzer: `python3 skills/reference-enrichment/scripts/gap-analyzer.py --agent {name}`
   (or `--skill {name}`)
2. Read the component's .md file to understand its stated purpose, triggers, and domain claims
3. Read any existing reference files to map current coverage
4. Compare stated domains against covered domains to identify gaps

Output format:
```
DISCOVER: {name}
  Current level: {0-3}
  Existing references: [{filenames}]
  Stated domains: [{domains from description and body}]
  Gaps: [{sub-domains with no reference coverage}]
  Recommended files: [{filename} → {why}]
```

**Gate**: Gap report exists with at least one identified gap. If no gaps exist (Level 3 already),
report and stop — over-generating creates noise, not signal.
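
The comparison in step 4 amounts to a set difference. A minimal sketch, assuming the stated and covered domains have already been extracted as plain strings; `find_gaps` and the sample domain names are hypothetical and are not taken from `gap-analyzer.py`:

```python
def find_gaps(stated_domains, covered_domains):
    """Return stated domains with no reference coverage, in stated order."""
    # Compare case-insensitively so "Testing" and "testing" count as the same domain.
    covered = {d.strip().lower() for d in covered_domains}
    return [d for d in stated_domains if d.strip().lower() not in covered]


# Hypothetical example data
stated = ["concurrency", "error handling", "testing"]
covered = ["Testing"]
gaps = find_gaps(stated, covered)
print(gaps)  # ['concurrency', 'error handling']
```

An empty result corresponds to the "no gaps" case, where the phase reports and stops.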

---

### Phase 2: RESEARCH

**Goal**: Compile concrete, domain-specific content for each gap.

For each identified gap:
1. Read existing Level 3 reference files in this repo as exemplars — golang-general-engineer's
   references/ is the benchmark: version-specific patterns, grep commands, error-fix mappings
2. Identify:
   - version-specific patterns (what changed in version X.Y)
   - common anti-patterns with detection commands (`grep -rn "pattern" --include="*.ext"`)
   - error-fix mappings (error message → root cause → fix)
   - project-specific conventions visible in the codebase

Dispatch up to 5 parallel research agents — one per sub-domain gap — because sequential research
bottlenecks the pipeline. Each agent receives: the sub-domain, the component's .md as context,
and a path to an exemplar Level 3 reference file.
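
One way to picture the fan-out is a bounded worker pool. This is only a sketch under assumptions: `research_gap` is a hypothetical stand-in for however a research agent is actually invoked, and the skill does not say that threads are the mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_AGENTS = 5  # cap stated in this phase


def research_gap(gap, component_md, exemplar_path):
    """Hypothetical stand-in for one research agent; returns a list of findings."""
    return [f"{gap}: placeholder finding (context: {component_md}, exemplar: {exemplar_path})"]


def dispatch_research(gaps, component_md, exemplar_path):
    """Run one research task per gap, at most five at a time."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_AGENTS) as pool:
        # Each task gets the three inputs listed above: sub-domain, component .md, exemplar path.
        futures = {
            gap: pool.submit(research_gap, gap, component_md, exemplar_path)
            for gap in gaps
        }
        return {gap: future.result() for gap, future in futures.items()}
```

With more than five gaps, the pool queues the remainder until a worker frees up, so the cap holds without extra bookkeeping.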

**Gate**: Each gap has at least 10 concrete findings (version numbers, function names, grep
patterns, code examples). Generic advice ("follow best practices") does not count toward this gate.
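
The gate can be approximated mechanically. The sketch below is a rough heuristic, not the skill's actual check: it assumes a finding counts as concrete when it contains a version number, a `grep` command, or a call-like identifier, and the names `is_concrete` and `passes_research_gate` are hypothetical.

```python
import re

MIN_FINDINGS = 10  # threshold stated in the gate

# Heuristic markers of a concrete finding (an assumption, not the skill's rubric):
# a version number, a grep command, or a call-like identifier such as errors.Is().
CONCRETE_MARKERS = [
    re.compile(r"\b\d+\.\d+(\.\d+)?\b"),
    re.compile(r"\bgrep\b"),
    re.compile(r"\b[\w.]+\(\)"),
]


def is_concrete(finding):
    """True if the finding matches at least one concreteness marker."""
    return any(pattern.search(finding) for pattern in CONCRETE_MARKERS)


def passes_research_gate(findings):
    """True if at least MIN_FINDINGS findings look concrete."""
    return sum(1 for f in findings if is_concrete(f)) >= MIN_FINDINGS
```

A heuristic like this only filters out obvious filler such as "follow best practices"; it cannot confirm that a finding is correct.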

---

### Phase 3: COMPILE

**Goal**: Assemble research into structured reference files.

For each gap, create one reference file following `references/reference-file-template.md`:
- One file per major sub-domain (not one monolithic file) because focused files are faster to
  load and easier to update as language versions change
- Max 500 lines per file (CLAUDE.md standard) — split into sub-topics if content exceeds this
- Include: overview paragraph, pattern table with version ranges, anti-pattern table with
  detection commands, error-fix mappings where applicable
- Match the tone of existing Level 3 references: direct, concrete, no hedging

Write files to: `agents/{name}/references/` or `skills/{name}/references/`

**Gate**: Each generated file is between 80 and 500 lines. Run
`python3 scripts/validate-references.py --agent {name}` if the script exists.
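
The line-count half of this gate is easy to check directly. A minimal sketch, assuming reference files are `.md` files sitting directly in the references directory; `check_line_counts` is a hypothetical helper, not one of the repo's scripts:

```python
from pathlib import Path

MIN_LINES, MAX_LINES = 80, 500  # bounds stated in the gate


def check_line_counts(references_dir):
    """Map each .md file in references_dir to (line_count, within_limits)."""
    results = {}
    for path in sorted(Path(references_dir).glob("*.md")):
        count = len(path.read_text(encoding="utf-8").splitlines())
        results[path.name] = (count, MIN_LINES <= count <= MAX_LINES)
    return results
```

Calling `check_line_counts("agents/{name}/references")` with the real component name returns one entry per file; any entry whose second element is `False` needs to be expanded or split.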

---

### Phase 4: VALIDATE

**Goal**: Confirm the reference files meet Level 2+ depth before integrating.

**Tier 1 (Deterministic):**
```bash
python3 scripts/audit-reference-depth.py --agent {name} --json
```
Verify the `level` field is 2 or 3 in the output. If still Level 1, the files are too
generic — return to Phase 2 for the weak sub-domain.
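
The Tier 1 check can be scripted. This sketch assumes the JSON output is a single object with a top-level integer `level` field; the source only confirms that a `level` field exists, so the rest of the schema is a guess. `run_audit`, `level_from_audit`, and `passes_tier1` are hypothetical names.

```python
import json
import subprocess


def run_audit(name):
    """Run the audit command shown above and return its stdout."""
    completed = subprocess.run(
        ["python3", "scripts/audit-reference-depth.py", "--agent", name, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits non-zero
    )
    return completed.stdout


def level_from_audit(json_text):
    """Extract `level`, assuming it is a top-level field of one JSON object."""
    return int(json.loads(json_text)["level"])


def passes_tier1(json_text):
    """Tier 1 passes when the reported level is 2 or 3."""
    return level_from_audit(json_text) >= 2
```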

**Tier 2 (LLM self-assessment):**
Read each generated file and apply the rubric from `references/quality-rubric.md`. Ask: would
a reviewer using only this file produce Level 3 quality output? Concrete test: pick one
anti-pattern from the file — does it include a grep command to detect it?

**Gate**: Both tiers pass. If Tier 2 fails for a specific sub-domain, loop back to Phase 2 for
that gap only (not all gaps). Maximum 2 loops per gap before flagging for manual enrichment.
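
The loop-back rule can be sketched as a bounded retry per gap. The callbacks `passes_tier2` and `redo_research` are hypothetical placeholders for the rubric check and for rerunning Phase 2 (and recompiling) on a single gap:

```python
MAX_LOOPS_PER_GAP = 2  # limit stated in the gate


def validate_with_retries(gaps, passes_tier2, redo_research):
    """Return the gaps that still fail Tier 2 after the allowed loops.

    passes_tier2(gap) -> bool and redo_research(gap) -> None are
    hypothetical callbacks supplied by the caller.
    """
    flagged = []
    for gap in gaps:
        loops = 0
        while not passes_tier2(gap):
            if loops == MAX_LOOPS_PER_GAP:
                flagged.append(gap)  # out of loops: flag for manual enrichment
                break
            redo_research(gap)  # loop back for this gap only
            loops += 1
    return flagged
```

Each gap keeps its own counter, so one weak sub-domain does not consume the retry budget of the others.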

---

### Phase 5: INTEGRATE

**Goal**: Wire the new references into the component so they are actually loaded.

1. Read the component's .md file
2. Add or update a loading table in the body — pattern from `skills/do/references/repo-architecture.md`:
   ```
   | Task type | Load |
   |-----------|------|
   | {task}    | `references/{file}.md` |
   ```
3. Write the updated .md file
4. Run validation:
   ```bash
   python3 scripts/validate-references.py --agent {name}
   python3 -m pytest scripts/tests/test_reference_loading.py -k {name} -v
   ```
5. Stage all changes: `git add agents/{name}/` for an agent, or `git add skills/{name}/` for a skill (naming a path that does not exist makes `git add` fail)

**Gate**: Validation passes. Changes staged. Report the level change (was: N, now: M) and list
each new file with its line count.
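
The loading table from step 2 and the gate's report can both be generated rather than hand-written. A minimal sketch; `loading_table` and `integration_report` are hypothetical helpers, and the report wording is an assumption since the skill does not define an exact format:

```python
def loading_table(rows):
    """Render (task type, reference filename) pairs as the loading table."""
    lines = ["| Task type | Load |", "|-----------|------|"]
    for task, filename in rows:
        lines.append(f"| {task} | `references/{filename}` |")
    return "\n".join(lines)


def integration_report(old_level, new_level, line_counts):
    """Summarize the level change and each new file's line count."""
    lines = [f"Level change: was {old_level}, now {new_level}"]
    for filename, count in sorted(line_counts.items()):
        lines.append(f"  {filename}: {count} lines")
    return "\n".join(lines)


# Hypothetical example data
print(loading_table([("Writing concurrent code", "concurrency.md")]))
print(integration_report(1, 3, {"concurrency.md": 212}))
```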

---

## Reference Material

Load when the task type matches:

| Task type | Load |
|-----------|------|
| Understanding Level 0-3 criteria | `references/quality-rubric.md` |
| Creating new reference files | `references/reference-file-template.md` |

---

## Error Handling

**Gap analyzer fails**: The component may not exist in expected paths. Check both `agents/` and
`skills/` directories, and `~/.claude/agents/` for deployed copies.
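
The path check can be scripted before rerunning the analyzer. This sketch assumes a component appears either as a directory named `{name}` or as a file named `{name}.md` under each location; the repo's actual layout is not confirmed by this document, and both function names are hypothetical.

```python
from pathlib import Path


def candidate_locations(name, repo_root="."):
    """List the three places named above where a component may live."""
    root = Path(repo_root)
    return [
        root / "agents" / name,
        root / "skills" / name,
        Path.home() / ".claude" / "agents" / name,
    ]


def locate_component(name, repo_root="."):
    """Return candidate locations that exist as a directory or as `{name}.md`."""
    found = []
    for base in candidate_locations(name, repo_root):
        # Layout assumption: either agents/{name}/ or agents/{name}.md (same for skills).
        if base.is_dir() or base.with_name(base.name + ".md").is_file():
            found.append(base)
    return found
```

An empty list suggests the name is misspelled or the component is not installed in any of the expected paths.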

**Phase 2 gate fails** (fewer than 10 concrete findings): The domain may be too narrow or already
well-documented upstream. Flag and suggest manual enrichment with project-specific production
incidents rather than generic docs research.

**Phase 4 Tier 1 still Level 1 after compile**: The files are too short or too generic. Read one
file, apply the rubric directly, identify the weakest section, and target Phase 2 research at
that section specifically.

**validate-references.py not found**: Script may not exist for this component. Skip that check,
proceed with `audit-reference-depth.py` as the sole Tier 1 gate.
