check-completeness
Analyze collection completeness against canonical discography and generate prioritized gap report
Best use case
check-completeness is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Teams using check-completeness can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/check-completeness/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How check-completeness Compares
| Feature / Agent | check-completeness | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It analyzes collection completeness against the canonical discography and generates a prioritized gap report.
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# check-completeness
Analyze collection completeness by comparing actual inventory against canonical discography, identify gaps, score coverage, and generate prioritized acquisition plans.
## Purpose
Answer collection completeness questions:
- "How complete is my [artist] collection?"
- "What am I missing?"
- "What should I acquire next?"
- "What legendary performances am I missing?"
- "Where should I upgrade quality?"
Produces comprehensive gap reports with actionable acquisition strategies prioritized by importance, availability, and cost.
## Parameters
### Required
**`<artist>`** - Artist name (quoted if contains spaces)
- Matched against collection directory structure
- Used to fetch canonical discography from MusicBrainz
- Example: `"Pink Floyd"` or `"The Beatles"`
### Optional
**`--collection <path>`** - Collection root directory
- Default: Current working directory
- Must contain artist subdirectories
- Example: `/mnt/media/Music`
**`--gaps-only`** - Show only missing items, skip completeness scoring
- Faster execution when you only need the gap list
- Useful for quick "what's missing?" queries
**`--priority <level>`** - Filter gaps by priority level
- Values: `high`, `medium`, `low`, `all` (default: `all`)
- Example: `--priority high` shows only critical gaps
**`--output <file>`** - Write report to file instead of stdout
- Format determined by extension: `.yaml`, `.json`, `.md`
- Default: stdout (terminal display)
- Example: `--output gaps-report.yaml`
**`--canonical-source <source>`** - Canonical discography source
- Values: `musicbrainz` (default), `discogs`, `allmusic`
- Falls back to secondary sources if primary fails
- Example: `--canonical-source discogs`
**`--include-legendary`** - Include legendary/bootleg content in analysis
- Default: Official releases only
- Adds section for high-priority unofficial recordings
- Requires legendary catalog file or web search
**`--quality-threshold <level>`** - Minimum quality for "complete" status
- Values: `lossless`, `320kbps`, `256kbps`, `any` (default: `any`)
- Example: `--quality-threshold lossless` treats 320kbps as needing upgrade
**`--create-gap-notes`** - Generate GAP-NOTE.md files for missing items
- Creates placeholder files in artist directory structure
- Each contains acquisition strategy and priority
- Enables parallel gap-fill workflow
**`--incremental`** - Incremental update mode
- Loads the previous state file if one exists
- Only re-fetches the canonical discography if the cached copy is more than 7 days old
- Faster for routine monitoring
**`--cost-estimate`** - Include cost estimates in report
- Estimates based on typical pricing for format/availability
- Adds budget planning section to output
- Example range: "$15-25 for used CD"
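The parameters above map naturally onto a command-line parser. The following is a minimal sketch using Python's `argparse`, assuming the skill were exposed as a plain CLI. The skill itself is invoked as a slash command, so `build_parser` is a hypothetical helper; only the flag names, allowed values, and defaults come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(prog="check-completeness")
    parser.add_argument("artist", help="Artist name (quote it if it contains spaces)")
    parser.add_argument("--collection", default=".", help="Collection root directory")
    parser.add_argument("--gaps-only", action="store_true")
    parser.add_argument("--priority", choices=["high", "medium", "low", "all"], default="all")
    parser.add_argument("--output", help="Report file (.yaml, .json, or .md)")
    parser.add_argument(
        "--canonical-source",
        choices=["musicbrainz", "discogs", "allmusic"],
        default="musicbrainz",
    )
    parser.add_argument("--include-legendary", action="store_true")
    parser.add_argument(
        "--quality-threshold",
        choices=["lossless", "320kbps", "256kbps", "any"],
        default="any",
    )
    parser.add_argument("--create-gap-notes", action="store_true")
    parser.add_argument("--incremental", action="store_true")
    parser.add_argument("--cost-estimate", action="store_true")
    return parser


# Example invocation with an explicit argument list
args = build_parser().parse_args(["Pink Floyd", "--priority", "high", "--gaps-only"])
```

Using `choices` means an unsupported value such as `--priority urgent` is rejected before any scanning starts.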
## Workflow
### 1. Load Collection Inventory
```bash
# Use existing scan results if available
SCAN_FILE=".media-curator/scans/${ARTIST_SLUG}/scan-latest.json"
if [[ -f "$SCAN_FILE" ]]; then
  INVENTORY=$(cat "$SCAN_FILE")
else
  # Trigger fresh scan
  /scan-collection "$ARTIST" --format json --output "$SCAN_FILE"
  INVENTORY=$(cat "$SCAN_FILE")
fi
```
Inventory provides:
- All albums/releases present in collection
- Track counts and file formats
- Quality levels (lossless/lossy bitrates)
- File sizes and directory structure
### 2. Fetch Canonical Discography
```bash
# Check for cached canonical data
CANONICAL_FILE=".media-curator/completeness/${ARTIST_SLUG}/canonical.yaml"
# Cache age in whole days; a missing file yields mtime 0, so it always counts as stale.
# Note: `stat -c %Y` is GNU coreutils syntax.
CACHE_AGE_DAYS=$(( ( $(date +%s) - $(stat -c %Y "$CANONICAL_FILE" 2>/dev/null || echo 0) ) / 86400 ))
if [[ ! -f "$CANONICAL_FILE" ]] || [[ $CACHE_AGE_DAYS -gt 7 ]]; then
  # Make sure the cache directory exists before writing to it
  mkdir -p "$(dirname "$CANONICAL_FILE")"
  # Fetch fresh canonical discography
  # MusicBrainz API example:
  MBID=$(musicbrainz-lookup "$ARTIST" --type artist --format json | jq -r '.artists[0].id')
  RELEASES=$(musicbrainz-releases "$MBID" --format json)
  # Parse into canonical.yaml structure
  echo "$RELEASES" | parse-canonical > "$CANONICAL_FILE"
fi
```
Canonical discography includes:
- Studio albums with track counts and years
- EPs and singles with A/B sides
- Official live albums
- Compilations (filtered by novelty)
- Official box sets
**MusicBrainz advantages:**
- Comprehensive and accurate
- Includes release dates and track counts
- Distinguishes editions (original, remaster, deluxe)
- Free API with rate limiting
**Discogs alternative:**
- More complete for rare/indie releases
- Includes pressing details and variants
- Better for vinyl collectors
- Requires API key
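To illustrate the "parse into canonical structure" step, here is a minimal sketch that sorts a MusicBrainz release-group listing into the categories used for scoring. It assumes the JSON field names of the MusicBrainz web service (`release-groups`, `primary-type`, `secondary-types`, `first-release-date`); the category names, the `categorize` and `parse_release_groups` helpers, and the sample payload are illustrative assumptions, and no network request is made.

```python
from collections import defaultdict


def categorize(release_group: dict) -> str:
    """Map one release group to a canonical category (assumed mapping)."""
    primary = release_group.get("primary-type")
    secondary = set(release_group.get("secondary-types") or [])
    if "Live" in secondary:
        return "live_albums"
    if "Compilation" in secondary:
        return "compilations"
    if primary == "Album":
        # Albums with other secondary types (soundtrack, remix, ...) are set aside
        return "studio_albums" if not secondary else "other"
    if primary == "EP":
        return "eps"
    if primary == "Single":
        return "singles"
    return "other"


def parse_release_groups(payload: dict) -> dict:
    """Group release groups by category, keeping title and release year."""
    canonical = defaultdict(list)
    for rg in payload.get("release-groups", []):
        date = rg.get("first-release-date") or ""
        # Dates may be partial ("1973") or empty
        year = int(date[:4]) if date[:4].isdigit() else None
        canonical[categorize(rg)].append({"title": rg["title"], "year": year})
    return dict(canonical)


# Illustrative payload, not real API output
SAMPLE = {
    "release-groups": [
        {"title": "Studio LP", "primary-type": "Album", "secondary-types": [], "first-release-date": "1973"},
        {"title": "Concert LP", "primary-type": "Album", "secondary-types": ["Live"], "first-release-date": "1995-05"},
        {"title": "Best Of", "primary-type": "Album", "secondary-types": ["Compilation"], "first-release-date": "1971"},
        {"title": "Radio Edit", "primary-type": "Single", "secondary-types": [], "first-release-date": ""},
    ]
}
canonical = parse_release_groups(SAMPLE)
```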
### 3. Build Extended Content Catalog
If the `--include-legendary` flag is set:
```bash
# Load legendary performances catalog
LEGENDARY_FILE=".media-curator/completeness/${ARTIST_SLUG}/legendary.yaml"
if [[ ! -f "$LEGENDARY_FILE" ]]; then
  # Web search for "artist legendary performances bootleg"
  # Or load from community-maintained catalog
  # Or use private tracker tags (if integrated)
  fetch-legendary-catalog "$ARTIST" > "$LEGENDARY_FILE"
fi
```
Extended catalog includes:
- Notable covers (tribute albums, sessions)
- Soundtrack contributions
- Collaboration tracks (appears on other artists' albums)
- Legendary live performances (soundboard, historical significance)
- Unreleased demos and outtakes
### 4. Compare Inventory to Canonical
```python
# Comparison logic (pseudocode)
for release in canonical_releases:
    match = find_in_inventory(release.title, release.year)
    if match:
        if match.track_count == release.track_count:
            release.status = "complete"
        else:
            release.status = "partial"
            release.missing_tracks = release.track_count - match.track_count
        # Record quality and source for both complete and partial matches
        release.quality = match.quality
        release.source = match.source
    else:
        release.status = "missing"
```
**Fuzzy matching rules:**
- Tolerate minor title differences ("The Album" vs "Album, The")
- Match by year if title ambiguous
- Prefer exact track count match over year match
- Flag multiple matches as potential duplicates
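The matching rules above could be sketched as follows. This is a minimal illustration built on the standard library's `difflib.SequenceMatcher`; the 0.85 similarity threshold, the dictionary-based inventory entries, and this version of `find_in_inventory` (which also takes a track count) are assumptions, not the skill's actual implementation.

```python
import re
from difflib import SequenceMatcher


def normalize_title(title: str) -> str:
    """Lowercase, undo 'Album, The' inversion, drop leading articles and punctuation."""
    t = title.strip().lower()
    inverted = re.match(r"^(.*),\s*(the|a|an)$", t)
    if inverted:
        t = f"{inverted.group(2)} {inverted.group(1)}"
    t = re.sub(r"^(the|a|an)\s+", "", t)
    t = re.sub(r"[^\w ]+", "", t)
    return re.sub(r"\s+", " ", t).strip()


def find_in_inventory(title, year, track_count, inventory, threshold=0.85):
    """Return (best_match, other_candidates); other candidates are potential duplicates."""
    target = normalize_title(title)
    candidates = []
    for item in inventory:
        ratio = SequenceMatcher(None, target, normalize_title(item["title"])).ratio()
        if ratio >= threshold:
            candidates.append((item, ratio))
    if not candidates:
        return None, []

    def rank(pair):
        item, ratio = pair
        # Exact track count outranks year, which outranks raw title similarity
        return (item.get("track_count") == track_count, item.get("year") == year, ratio)

    candidates.sort(key=rank, reverse=True)
    best = candidates[0][0]
    duplicates = [item for item, _ in candidates[1:]]
    return best, duplicates
```

Returning the remaining candidates alongside the best match lets the caller flag them as potential duplicates instead of silently discarding them.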
### 5. Score Completeness
```python
# Weighted scoring formula
weights = {
    'studio_albums': 0.40,
    'eps': 0.15,
    'singles': 0.15,
    'live_albums': 0.10,
    'notable_covers': 0.10,
    'legendary': 0.10
}
component_scores = {}
for category, weight in weights.items():
    items = canonical[category]
    total = len(items)
    complete = sum(1 for item in items if item.status == 'complete')
    # Partial releases earn half credit
    partial = sum(0.5 for item in items if item.status == 'partial')
    # A category with no canonical items counts as fully complete
    component_scores[category] = ((complete + partial) / total * 100) if total > 0 else 100
overall_score = sum(component_scores[cat] * weights[cat] for cat in weights)
# Quality bonus: +5 points when more than 80% of files are lossless
lossless_pct = (lossless_count / total_count) * 100 if total_count > 0 else 0
if lossless_pct > 80:
    # Cap at 100 so the bonus cannot push the score past 100%
    overall_score = min(overall_score + 5, 100)
```
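As a worked example of the weighted formula, the sketch below scores a small hypothetical collection. Statuses are plain strings here for brevity, the collection data is invented for illustration, and capping the result at 100 is an assumption so that the quality bonus cannot exceed a perfect score.

```python
WEIGHTS = {
    "studio_albums": 0.40,
    "eps": 0.15,
    "singles": 0.15,
    "live_albums": 0.10,
    "notable_covers": 0.10,
    "legendary": 0.10,
}


def category_score(statuses):
    """Percent complete for one category; partial items earn half credit."""
    if not statuses:
        return 100.0  # nothing to collect counts as complete
    credit = sum(1.0 if s == "complete" else 0.5 if s == "partial" else 0.0 for s in statuses)
    return credit / len(statuses) * 100


def overall_score(canonical, lossless_pct):
    base = sum(category_score(canonical.get(cat, [])) * w for cat, w in WEIGHTS.items())
    bonus = 5 if lossless_pct > 80 else 0
    return min(base + bonus, 100)  # cap is an assumption


# Hypothetical collection
collection = {
    "studio_albums": ["complete"] * 8 + ["partial"] * 2,   # (8 + 1) / 10 -> 90%
    "eps": ["complete", "complete"],                        # 100%
    "singles": ["complete"] * 3 + ["missing"],              # 75%
    "live_albums": [],                                      # empty -> 100%
    "notable_covers": ["complete", "missing"],              # 50%
    "legendary": ["missing"] * 5,                           # 0%
}
# 90*0.40 + 100*0.15 + 75*0.15 + 100*0.10 + 50*0.10 + 0*0.10 = 77.25
score_without_bonus = overall_score(collection, lossless_pct=50)
# 90% lossless clears the 80% bar, so 5 points are added: 82.25
score_with_bonus = overall_score(collection, lossless_pct=90)
```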
### 6. Identify and Prioritize Gaps
```python
# Gap prioritization logic
priority_order = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}
gaps = []
for release in canonical_releases:
    if release.status in ("missing", "partial"):
        priority = calculate_priority(release)
        availability = check_availability(release)
        cost = estimate_cost(release, availability)
        gaps.append({
            'item': release.title,
            'type': release.type,
            'year': release.year,
            'priority': priority,
            'availability': availability,
            'estimated_cost': cost,
            'reason': priority_reason(release)
        })
# Highest priority first; within a tier, oldest release first
gaps.sort(key=lambda x: (priority_order[x['priority']], x['year']))
```
**Priority calculation:**
```python
def calculate_priority(release):
    score = 0
    # Type importance
    if release.type == 'studio_album':
        score += 40
    elif release.type in ['ep', 'single']:
        score += 15
    elif release.type == 'live_album':
        score += 10
    # Critical acclaim (if available)
    if release.rating and release.rating > 8.0:
        score += 20
    # Classic period (artist-specific)
    if release.year in artist_classic_years:
        score += 15
    # Completes a set
    if is_part_of_trilogy(release) and have_other_parts(release):
        score += 10
    # Historical significance
    if release.legendary or release.historically_significant:
        score += 25
    # Assign tier
    if score >= 60:
        return "HIGH"
    elif score >= 30:
        return "MEDIUM"
    else:
        return "LOW"
```
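To make the point arithmetic concrete, here is a self-contained sketch that applies the same point values and tier thresholds to a few hypothetical releases. The `Release` dataclass, the `completes_set` flag (standing in for the trilogy helpers), and the classic-years range are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

TYPE_POINTS = {"studio_album": 40, "ep": 15, "single": 15, "live_album": 10}


@dataclass
class Release:
    title: str
    type: str
    year: int
    rating: Optional[float] = None
    legendary: bool = False
    completes_set: bool = False  # stand-in for the trilogy checks


def priority_score(release: Release, classic_years) -> int:
    score = TYPE_POINTS.get(release.type, 0)
    if release.rating is not None and release.rating > 8.0:
        score += 20
    if release.year in classic_years:
        score += 15
    if release.completes_set:
        score += 10
    if release.legendary:
        score += 25
    return score


def tier(score: int) -> str:
    if score >= 60:
        return "HIGH"
    if score >= 30:
        return "MEDIUM"
    return "LOW"


classic = range(1970, 1980)  # hypothetical classic period

acclaimed = Release("Acclaimed LP", "studio_album", 1975, rating=9.0)  # 40 + 20 + 15 = 75
bootleg = Release("Famous Concert", "live_album", 1988, legendary=True)  # 10 + 25 = 35
late_album = Release("Late LP", "studio_album", 1994)  # 40
b_side = Release("B-side", "single", 1995)  # 15

tiers = [tier(priority_score(r, classic)) for r in (acclaimed, bootleg, late_album, b_side)]
```

Note that under these weights a studio album with no rating and no other signals scores 40, which lands in the MEDIUM tier rather than HIGH.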
### 7. Check Availability
```python
def check_availability(release):
    sources = []
    # Streaming services (easy, legal, lossy)
    if available_on_streaming(release):
        sources.append({
            'type': 'streaming',
            'services': ['Apple Music', 'Spotify', 'Tidal'],
            'quality': '256kbps AAC',
            'cost': 'subscription',
            'difficulty': 'easy'
        })
    # Digital purchase (legal, often lossless)
    if available_on_bandcamp(release):
        sources.append({
            'type': 'digital_purchase',
            'platform': 'Bandcamp',
            'quality': 'FLAC / 320kbps MP3',
            'cost': '$7-12',
            'difficulty': 'easy'
        })
    # Physical (CD/vinyl)
    discogs_listings = check_discogs(release)
    if discogs_listings:
        sources.append({
            'type': 'physical',
            'format': 'CD',
            'availability': f"{len(discogs_listings)} listings",
            'price_range': f"${min(discogs_listings)} - ${max(discogs_listings)}",
            'difficulty': 'moderate'
        })
    # Private trackers (torrents, requires ratio)
    if user_has_tracker_access():
        tracker_results = search_trackers(release)
        if tracker_results:
            sources.append({
                'type': 'private_tracker',
                'sites': tracker_results.sites,
                'quality': 'FLAC',
                'cost': 'ratio',
                'difficulty': 'moderate' if tracker_results.seeded else 'hard'
            })
    return sources
```
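The source list returned above still has to be reduced to a single recommendation. The sketch below shows one possible rule, offered as an assumption and not as the skill's documented behavior: prefer lossless sources, then the lowest difficulty. Treating physical CDs as lossless and the `recommend` helper itself are hypothetical.

```python
DIFFICULTY_ORDER = {"easy": 0, "moderate": 1, "hard": 2}


def is_lossless(source: dict) -> bool:
    """Assumed rule: FLAC sources and physical media count as lossless."""
    return "FLAC" in source.get("quality", "") or source.get("type") == "physical"


def recommend(sources: list):
    """Pick the preferred source, or None when nothing is available."""
    if not sources:
        return None
    return min(
        sources,
        key=lambda s: (not is_lossless(s), DIFFICULTY_ORDER.get(s.get("difficulty"), 99)),
    )


sources = [
    {"type": "streaming", "quality": "256kbps AAC", "difficulty": "easy"},
    {"type": "digital_purchase", "quality": "FLAC / 320kbps MP3", "difficulty": "easy"},
    {"type": "physical", "format": "CD", "difficulty": "moderate"},
]
best = recommend(sources)  # the digital purchase: lossless and easy
```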
### 8. Generate Report
Output format depends on `--output` extension or defaults to rich terminal display:
**Terminal output (default):**
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Pink Floyd Collection Completeness Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Overall Score: 92.3% (Excellent Collection)
🎵 Canonical Releases: 94.1%
✅ Studio Albums: 16/17 (94.1%)
✅ EPs: 4/4 (100%)
✅ Singles: 24/30 (80.0%)
✅ Live Albums: 12/15 (80.0%)
🎸 Extended Content: 78.5%
✅ Notable Covers: 8/12 (66.7%)
✅ Soundtracks: 3/5 (60.0%)
⚠️ Legendary: 12/47 (25.5%)
💿 Quality Distribution:
🟢 Lossless (FLAC): 87% (298 files)
🟡 High (320kbps): 11% (38 files)
🔴 Medium (256kbps): 2% (6 files)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Top 5 Acquisition Priorities
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔴 HIGH: The Endless River (2014)
Missing: Complete studio album
Reason: Only missing album from official discography
Sources:
• Apple Music (stream rip, 256kbps) - FREE with subscription
• Bandcamp (FLAC) - $9.99
• Discogs (used CD) - $12-18
Recommended: Bandcamp FLAC purchase
🔴 HIGH: Live at Pompeii (1972) - Soundboard
Missing: Legendary performance
Reason: Exceptional quality, historically significant
Sources:
• Private tracker (FLAC soundboard) - Ratio cost
• YouTube rip (256kbps, unofficial) - FREE but low quality
Recommended: Private tracker if available
🟡 MEDIUM: Obscured by Clouds B-sides
Missing: 3 tracks from singles
Reason: Completes album era, unique content
Sources:
• Compilation "Works" (1983) - Discogs $15-20
• Individual singles (rare) - eBay ~$30 each
Recommended: Works compilation
🟡 MEDIUM: The Wall (24bit/96kHz remaster)
Upgrade: Current 320kbps → Lossless remaster
Reason: Significant audio improvement, bonus tracks
Sources:
• HDtracks (24bit FLAC) - $19.99
• Qobuz (24bit streaming) - Subscription
Recommended: Wait for HDtracks sale (<$15)
🟢 LOW: Relics (1971 compilation)
Missing: Compilation of earlier singles
Reason: Low priority, all tracks owned on original albums
Sources:
• Streaming (256kbps) - FREE
• CD (out of print) - eBay $20-30
Recommended: Stream rip if desired for convenience
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Acquisition Plan Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Next 5 acquisitions:
1. The Endless River - Bandcamp FLAC ($9.99)
2. Live at Pompeii - Private tracker (ratio)
3. Obscured by Clouds B-sides - Works CD ($18)
4. The Wall remaster - Wait for sale (~$15)
5. (Optional) Relics - Stream rip (FREE)
💰 Estimated cost: $43-60 (excluding sale wait items)
⏱️ Estimated time: 1-2 weeks
🎯 Next score: ~96.5% (Completionist tier)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Report saved to: .media-curator/completeness/pink-floyd/gap-report-2026-02-14.yaml
GAP-NOTE files created: 5 items
```
**YAML output (--output report.yaml):**
Uses the full gap report structure defined in the Completeness Tracker agent documentation.
**Markdown output (--output report.md):**
Human-readable report with sections, tables, and acquisition instructions.
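A minimal sketch of how the output format might be selected from the `--output` extension, with a simple Markdown table renderer. The helper names (`detect_format`, `render_markdown`) and the table columns are assumptions; the gap fields reuse the keys from the gap prioritization step.

```python
from pathlib import Path

# Extensions listed for --output
FORMATS = {".yaml": "yaml", ".json": "json", ".md": "markdown"}


def detect_format(output):
    """Return the report format for an --output value; None means terminal display."""
    if output is None:
        return "terminal"
    suffix = Path(output).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unsupported output extension: {suffix or '(none)'}")
    return FORMATS[suffix]


def render_markdown(artist: str, gaps: list) -> str:
    """Render gaps as a Markdown table, one row per gap."""
    lines = [
        f"# {artist} Gap Report",
        "",
        "| Priority | Item | Year | Reason |",
        "|---|---|---|---|",
    ]
    for gap in gaps:
        lines.append(f"| {gap['priority']} | {gap['item']} | {gap['year']} | {gap['reason']} |")
    return "\n".join(lines) + "\n"
```

Rejecting unknown extensions up front avoids writing a report in a format the user did not expect.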
### 9. Create GAP-NOTE Files (if --create-gap-notes)
```bash
# Pseudocode: the ${gap.*} fields and format_* helpers are placeholders, not valid bash
for gap in high_priority_gaps:
  NOTE_PATH="${COLLECTION}/${ARTIST}/${gap.year} - ${gap.title}/GAP-NOTE.md"
  mkdir -p "$(dirname "$NOTE_PATH")"
  cat > "$NOTE_PATH" << EOF
# GAP-NOTE: ${gap.title} (${gap.year})
## Expected Content
- ${gap.type}: ${gap.track_count} tracks
- ${gap.description}
- Rating: ${gap.rating}/10
## Acquisition Strategy
${format_acquisition_steps(gap.sources)}
## Priority
**${gap.priority}** - ${gap.reason}
## Sources
${format_source_links(gap)}
- Last searched: Never
EOF
done
```
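Because the template above is pseudocode, a runnable equivalent may help. The following sketch writes one GAP-NOTE.md per gap with `pathlib`; the `write_gap_note` helper, the dictionary keys, and the slash replacement in titles are assumptions, and only a subset of the template's sections is reproduced.

```python
from pathlib import Path


def write_gap_note(collection, artist: str, gap: dict) -> Path:
    """Create '<collection>/<artist>/<year> - <title>/GAP-NOTE.md' and return its path."""
    # Slashes in a title would otherwise create extra directory levels
    safe_title = gap["title"].replace("/", "-")
    note_path = Path(collection) / artist / f"{gap['year']} - {safe_title}" / "GAP-NOTE.md"
    note_path.parent.mkdir(parents=True, exist_ok=True)

    sources = "\n".join(f"- {s}" for s in gap.get("sources", [])) or "- None found yet"
    body = "\n".join([
        f"# GAP-NOTE: {gap['title']} ({gap['year']})",
        "## Expected Content",
        f"- {gap['type']}: {gap['track_count']} tracks",
        "## Priority",
        f"**{gap['priority']}** - {gap['reason']}",
        "## Sources",
        sources,
        "- Last searched: Never",
        "",
    ])
    note_path.write_text(body, encoding="utf-8")
    return note_path
```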
## Examples
### Example 1: Basic completeness check
```bash
/check-completeness "Pink Floyd" --collection /mnt/media/Music
```
Output: Full terminal report with scores and top 5 priorities
### Example 2: Just show high-priority gaps
```bash
/check-completeness "The Beatles" --gaps-only --priority high
```
Output:
```
High-Priority Gaps for The Beatles:
1. Let It Be Sessions (1970) - Legendary unreleased sessions
2. Live at Hollywood Bowl (original 1977) - Out of print, reissue available
3. Anthology outtakes (various) - 12 tracks not on official release
```
### Example 3: Create acquisition workflow
```bash
/check-completeness "Led Zeppelin" \
--include-legendary \
--create-gap-notes \
--output gaps-report.yaml
```
Creates:
- `gaps-report.yaml` - Full structured report
- GAP-NOTE.md files in each missing album directory
- Updated completeness state file
### Example 4: Incremental monitoring
```bash
# Weekly cron job
/check-completeness "Grateful Dead" \
--incremental \
--quality-threshold lossless \
--output /reports/weekly-$(date +%F).yaml
```
Fast execution, only re-fetches canonical if stale.
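The staleness test behind incremental mode can be sketched in a few lines. This is an illustrative helper (`needs_refresh` is a hypothetical name) that treats a missing cache file as stale and compares the file's modification time against a 7-day window; unlike the whole-day arithmetic in the shell snippet of step 2, it compares exact seconds.

```python
import os
import time


def needs_refresh(path, max_age_days=7, now=None) -> bool:
    """Return True when the cache file is missing or older than max_age_days."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable cache always triggers a fetch
    now = time.time() if now is None else now
    return (now - mtime) > max_age_days * 86400
```

The optional `now` argument exists only to make the function easy to test without waiting a week.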
## Integration with Other Commands
**Before completeness check:**
```bash
/scan-collection "Artist" --format json # Generates inventory
```
**After completeness check:**
```bash
/acquisition-plan "Artist" --from-gaps gaps-report.yaml # Plan downloads
/quality-audit "Artist" --upgrades-only # Find quality improvement targets
```
**Parallel gap filling:**
```bash
/completeness-fill "Artist" --priority high --source streaming &
/completeness-fill "Artist" --priority high --source physical &
wait
/check-completeness "Artist" --incremental # Re-check after acquisition
```
## Error Handling
**MusicBrainz API failure:**
- Fall back to Discogs API
- If both fail, use cached canonical (warn if >30 days old)
- Suggest manual canonical.yaml creation
**Artist not found:**
- Suggest fuzzy matches from collection directory names
- Offer to use directory name as canonical source
**No collection inventory:**
- Auto-trigger `/scan-collection` if not found
- Error if collection path invalid
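The MusicBrainz-to-Discogs-to-cache fallback described above might look like the following sketch. The `load_canonical` helper, its callable-based interface, and the warning strings are assumptions chosen to keep the example free of network calls.

```python
def load_canonical(fetchers, load_cache, max_cache_age_days=30):
    """Try each (name, fetch) pair in order, then fall back to the cache.

    load_cache returns (data, age_days), or None when no cache exists.
    Returns (data, source_name, warnings).
    """
    warnings = []
    for name, fetch in fetchers:
        try:
            return fetch(), name, warnings
        except Exception as exc:  # any source failure moves on to the next source
            warnings.append(f"{name} failed: {exc}")

    cached = load_cache()
    if cached is None:
        raise RuntimeError("No canonical source available; create canonical.yaml manually")
    data, age_days = cached
    if age_days > max_cache_age_days:
        warnings.append(f"cached canonical is {age_days} days old")
    return data, "cache", warnings
```

Passing the sources as an ordered list keeps the fallback order in one place, so swapping the primary source (for example via `--canonical-source discogs`) only changes the list.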
## Performance
**Expected execution time:**
- Small collection (<500 files): 5-10 seconds
- Medium collection (500-2000 files): 15-30 seconds
- Large collection (>2000 files): 30-60 seconds
- Incremental mode: about 50% faster (reuses the cached canonical discography)
**Rate limiting:**
- MusicBrainz: 1 request/second (built-in delay)
- Discogs: 60 requests/minute (with API key)
- Wait and retry on 429 responses
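A minimal sketch of the rate-limiting behavior: space requests at least one second apart and back off when the server answers 429. Both documented limits (1 request/second and 60 requests/minute) work out to a one-second interval. The `RateLimiter` class, the `request_with_retry` helper, the linear backoff, and the injectable clock are assumptions for illustration.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def request_with_retry(send, limiter, max_retries=3, backoff_seconds=2.0, sleep=time.sleep):
    """Call send() -> (status, body), retrying with linear backoff on HTTP 429."""
    for attempt in range(max_retries + 1):
        limiter.wait()
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(backoff_seconds * (attempt + 1))
    raise RuntimeError("rate limited: retries exhausted")
```

In real use, `send` would wrap the actual HTTP call; injecting `clock` and `sleep` keeps the logic testable without real delays.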
## Output Files
All output written to `.media-curator/completeness/{artist-slug}/`:
```
.media-curator/completeness/pink-floyd/
├── canonical.yaml # Cached canonical discography
├── legendary.yaml # Extended content catalog
├── state.json # Completeness tracking state
├── gap-report-2026-02-14.yaml
├── gap-report-2026-01-15.yaml # Historical reports
└── history.json # Completeness score over time
```
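The `{artist-slug}` component and the dated report name can be derived as in the sketch below. The source only shows that "Pink Floyd" becomes `pink-floyd`; the exact slug rule (ASCII folding, non-alphanumerics collapsed to hyphens) and the helper names are assumptions.

```python
import re
import unicodedata
from datetime import date
from pathlib import PurePosixPath


def artist_slug(name: str) -> str:
    """Lowercase, strip accents, and collapse non-alphanumerics to single hyphens."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def report_path(artist: str, day: date) -> PurePosixPath:
    """Path of the dated gap report inside the artist's completeness directory."""
    base = PurePosixPath(".media-curator/completeness") / artist_slug(artist)
    return base / f"gap-report-{day.isoformat()}.yaml"


path = report_path("Pink Floyd", date(2026, 2, 14))
# -> .media-curator/completeness/pink-floyd/gap-report-2026-02-14.yaml
```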
## Success Criteria
- Completeness scores accurate within ±2%
- Zero false positives (claiming missing when present)
- Priority assignments align with collector consensus >90%
- Availability checks current within 7 days
- Report generation <60 seconds for any collection size
## References
- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/research-before-decision.md - Research canonical discography (MusicBrainz, Discogs) before determining completeness
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/analyze-artist/SKILL.md - Artist analysis that establishes the canonical discography used as completeness baseline
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/gap-documentation/SKILL.md - Gap documentation skill used to record identified missing items
- @$AIWG_ROOT/agentic/code/frameworks/media-curator/skills/find-sources/SKILL.md - Source discovery skill used to locate missing releases identified by completeness check