measure-ai-proficiency
Assess and improve repository AI coding proficiency and context engineering maturity. Use when users ask about: (1) AI readiness or AI maturity assessment, (2) context engineering quality or improvement, (3) CLAUDE.md, .cursorrules, or copilot-instructions files, (4) measuring how well a repo is prepared for AI coding assistants, (5) recommendations for improving AI collaboration, (6) what context files to add, or (7) comparing their repo to AI proficiency best practices.
Best use case
measure-ai-proficiency is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using measure-ai-proficiency can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/measure-ai-proficiency/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How measure-ai-proficiency Compares
| Feature / Agent | measure-ai-proficiency | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It assesses a repository's AI coding proficiency and context engineering maturity, then recommends improvements such as which context files (CLAUDE.md, .cursorrules, copilot-instructions, and similar) to add.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Measure AI Proficiency

Assess repository context engineering maturity and provide actionable recommendations for improving AI collaboration.

This skill works with Claude Code, GitHub Copilot, Cursor, and OpenAI Codex (via the [Agent Skills](https://agentskills.io/) open standard).

## Prerequisites

Install the measure-ai-proficiency tool:

```bash
pip install measure-ai-proficiency
```

## Workflow

### 1. Choose Your Scanning Method

#### Option A: Scan GitHub Directly (No Cloning Required!)

Scan GitHub repositories without cloning them:

```bash
# Scan a single GitHub repository
measure-ai-proficiency --github-repo owner/repo

# Scan entire GitHub organization
measure-ai-proficiency --github-org your-org-name

# Limit number of repos scanned
measure-ai-proficiency --github-org your-org-name --limit 50

# Output to file
measure-ai-proficiency --github-org your-org --format json --output report.json
```

**Requirements:** [GitHub CLI (gh)](https://cli.github.com/) authenticated with `gh auth login`

**How it works:**
- Uses GitHub API to fetch repository file tree
- Downloads only AI proficiency files (CLAUDE.md, .cursorrules, skills, etc.)
- Scans in temporary directories
- Cleans up automatically
- Much faster than cloning!

#### Option B: Discover Then Clone (Traditional Method)

For organizations wanting more control, first discover which repos have AI context artifacts:

```bash
# Find active repos (commits in last 90 days) with AI context files
./scripts/find-org-repos.sh your-org-name

# JSON output for automation
./scripts/find-org-repos.sh your-org-name --json > repos.json
```

**What you get:**
- Total repos in organization
- Active repos (with recent commits)
- Repos with AI context artifacts (CLAUDE.md, AGENTS.md, .cursorrules, etc.)
- Percentage baseline for your org
- List of repos to scan

**Requirements:** [GitHub CLI (gh)](https://cli.github.com/) and [jq](https://stedolan.github.io/jq/)

Then clone and scan the identified repos.
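
As a rough illustration of the discovery step in Option B, the sketch below checks each repository's file tree for a few AI context file names and computes the percentage baseline for an organization. The file-name set, the shape of the `repos` input, and both function names are assumptions made for this example; they are not the actual API or output format of the tool or of `find-org-repos.sh`.

```python
from pathlib import PurePosixPath

# Assumed marker names, taken from the examples in the text above.
# The real tool is described as matching hundreds of patterns.
AI_CONTEXT_NAMES = {"CLAUDE.md", "AGENTS.md", ".cursorrules"}


def has_ai_context(file_tree):
    """Return True if any path in the tree ends in a known AI context file name."""
    return any(PurePosixPath(path).name in AI_CONTEXT_NAMES for path in file_tree)


def org_baseline(repos):
    """Given {repo name: [file paths]}, return (repos with AI context, percentage)."""
    matching = sorted(name for name, tree in repos.items() if has_ai_context(tree))
    percent = 100.0 * len(matching) / len(repos) if repos else 0.0
    return matching, percent


# Hypothetical organization with four repositories.
repos = {
    "api": ["README.md", "CLAUDE.md", "src/main.py"],
    "web": ["README.md", "package.json"],
    "infra": ["docs/AGENTS.md", ".cursorrules"],
    "legacy": ["Makefile"],
}

matching, percent = org_baseline(repos)
print(matching, percent)  # ['api', 'infra'] 50.0
```

The matching list corresponds to the "list of repos to scan", and the percentage to the org baseline.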
#### Option C: Scan Local Repositories

```bash
# Scan current directory
measure-ai-proficiency

# Scan specific repository
measure-ai-proficiency /path/to/repo

# Scan multiple repositories
measure-ai-proficiency /path/to/repo1 /path/to/repo2

# Scan all repos in directory (cloned org)
measure-ai-proficiency --org /path/to/org-repos
```

### 2. Run Assessment

Most common commands:

```bash
# Local scan
measure-ai-proficiency

# GitHub scan (recommended for orgs)
measure-ai-proficiency --github-org your-org-name
```

### 3. Interpret Results

**Maturity Levels (aligned with Steve Yegge's 8-stage model):**

| Level | Name | Yegge Stage | Indicators |
|-------|------|-------------|------------|
| 1 | Zero AI | Stage 1 | No AI-specific files (baseline) |
| 2 | Basic Instructions | Stage 2 | CLAUDE.md, .cursorrules exist |
| 3 | Comprehensive Context | Stage 3 | Architecture, conventions documented |
| 4 | Skills & Automation | Stage 4 | Hooks, commands, memory files, skills |
| 5 | Multi-Agent Ready | Stage 5 | Specialized agents, MCP configs |
| 6 | Fleet Infrastructure | Stage 6 | Beads, shared context, workflows |
| 7 | Agent Fleet | Stage 7 | Governance, scheduling, 10+ agents |
| 8 | Custom Orchestration | Stage 8 | Gas Town, meta-automation, frontier |

**Score interpretation:** File count matters more than percentage. The tool includes hundreds of patterns for comprehensive detection.

### Understanding Quality Scoring

Each AI instruction file is scored 0-10 based on quality indicators:

| Symbol | Indicator | What It Means | Points |
|--------|-----------|---------------|--------|
| § | Sections | Markdown headers (`##`) | 0-2 |
| ⌘ | Paths | File/dir paths (`/src/`) | 0-2 |
| $ | Commands | CLI in backticks | 0-2 |
| ! | Constraints | never/avoid/don't | 0-2 |
| ↻N | Commits | Git history (N commits) | 0-2 |

**Commit scoring:** Files with 5+ commits get full points (indicates active maintenance). 3-4 commits = 1pt.
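
To make the arithmetic concrete, here is a minimal sketch of how the five 0-2 indicators could add up to a 0-10 file score. It is not the tool's implementation: the function names are hypothetical, and awarding 0 points for fewer than 3 commits is an assumption, since the text only specifies the 5+ and 3-4 commit cases.

```python
def commit_points(commit_count):
    """Map a file's commit count to 0-2 points, following the rule above."""
    if commit_count >= 5:
        return 2  # full points: actively maintained
    if commit_count >= 3:
        return 1  # 3-4 commits
    return 0  # assumption: the text does not say what fewer than 3 commits earns


def quality_score(sections, paths, commands, constraints, commit_count):
    """Sum four 0-2 indicator scores plus commit points into a 0-10 score."""
    indicators = (sections, paths, commands, constraints)
    for value in indicators:
        if not 0 <= value <= 2:
            raise ValueError("each indicator must be between 0 and 2")
    return sum(indicators) + commit_points(commit_count)


# A file with good sections and commands, some paths and constraints, 4 commits.
print(quality_score(sections=2, paths=1, commands=2, constraints=1, commit_count=4))  # 7
```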
### Cross-Reference Detection

The tool detects links between your AI instruction files:

- **Markdown links:** `[architecture](ARCHITECTURE.md)`
- **File mentions:** `"CONVENTIONS.md"` or `` `TESTING.md` ``
- **Relative paths:** `./docs/API.md`
- **Directory refs:** `skills/`, `.claude/commands/`

Resolution tracking shows whether referenced files exist (helps find broken links).

**Bonus points:** Up to +10 points from cross-references (5 pts) + quality (5 pts).

### Content Validation

The tool validates that your documentation references real files:

- **Missing references:** Files mentioned in docs that don't exist
- **Stale references:** References to deleted files (detected via git history)
- **Template markers:** Uncustomized content (TODO, PLACEHOLDER, etc.)

**Validation penalty:** Up to -4 points for validation issues.

**Skip false positives:** If your docs contain example file names (meta-tools, templates), configure `skip_validation_patterns` in `.ai-proficiency.yaml`:

```yaml
skip_validation_patterns:
  - "COMPLIANCE.md"  # Example mentioned in docs
  - ".mcp.json"      # Best practice not yet adopted
  - "examples/*"     # All files under examples/
```

### 4. Provide Recommendations

After assessment, offer to create missing high-priority files:

**Level 2 gaps:** Create CLAUDE.md, .cursorrules, or .github/copilot-instructions.md

**Level 3 gaps:** Create ARCHITECTURE.md, CONVENTIONS.md, or TESTING.md

**Level 4 gaps:**
- Create skills directories: `.claude/skills/`, `.github/skills/`, or `.cursor/skills/`
- Add `.claude/commands/` for slash commands
- Create MEMORY.md or LEARNINGS.md
- Consider SOUL.md or IDENTITY.md (ClawdBot pattern) for agent personality
- **Boris Cherny's key insight:** Add verification loops (tests, linters); this can improve quality 2-3x

**Level 5 gaps:**
- Create specialized agents in `.github/agents/` or `.claude/agents/`
- Set up `.mcp.json` at root (Boris Cherny pattern) for team-shared tool configs
- Add agents/HANDOFFS.md for agent coordination

**Level 6 gaps:** Beads memory system, shared context, workflow pipelines

**Level 7 gaps:**
- Add GOVERNANCE.md for agent permissions and policies
- Set up convoys/ or molecules/ (Gas Town work decomposition)
- Consider swarm/, wisps/, polecats/ for advanced agent patterns

**Level 8 gaps:**
- Build custom orchestration in orchestration/
- Consider .gastown/ for Kubernetes-like agent management
- Add protocols: MAIL_PROTOCOL.md, FEDERATION.md, ESCALATION.md, watchdog/

### 5. Create Missing Files

When creating context files, include:

**CLAUDE.md structure:**
- Project overview (what it does, who it's for)
- Directory structure and key files
- Important conventions and patterns
- Common tasks and how to perform them
- Things to avoid

**ARCHITECTURE.md structure:**
- System overview and purpose
- Key components and responsibilities
- Data flow between components
- Important design decisions

**CONVENTIONS.md structure:**
- Naming conventions
- Code organization patterns
- Error handling approach
- Testing conventions

## Quick Reference

Common triggers for this skill:
- "Assess my AI proficiency"
- "How mature is my context engineering?"
- "What context files should I add?"
- "Help me improve for AI coding"
- "Check my CLAUDE.md setup"
- "Am I ready for AI-assisted development?"

## Customization

Use the **customize-measurement** skill for guided configuration:

```
"Customize measurement for my repo"
```

Or see the manual guide: https://github.com/pskoett/measuring-ai-proficiency/blob/main/CUSTOMIZATION.md
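
The Cross-Reference Detection and Content Validation sections describe finding file references in instruction files and checking whether they resolve. The sketch below shows one way that idea could work for markdown links, quoted or backticked `.md` mentions, and skip patterns. The regular expressions, function names, and use of `fnmatch` for `skip_validation_patterns` are all assumptions for illustration, not the tool's actual logic, and stale-reference detection via git history is not covered.

```python
import re
import tempfile
from fnmatch import fnmatch
from pathlib import Path

# Illustrative patterns only; the real tool's detection rules may differ.
MD_LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+)\)")  # [text](target)
BACKTICKED = re.compile(r"`([^`\s]+\.md)`")          # `FILE.md`
QUOTED = re.compile(r'"([^"\s]+\.md)"')              # "FILE.md"


def find_references(text):
    """Collect local file references, ignoring http(s) links."""
    refs = set()
    for pattern in (MD_LINK, BACKTICKED, QUOTED):
        for ref in pattern.findall(text):
            if ref.startswith(("http://", "https://")):
                continue
            refs.add(ref[2:] if ref.startswith("./") else ref)
    return refs


def missing_references(text, root, skip_patterns=()):
    """Return referenced paths that do not exist under root, minus skipped ones."""
    missing = []
    for ref in sorted(find_references(text)):
        if any(fnmatch(ref, pattern) for pattern in skip_patterns):
            continue  # treated as a known false positive
        if not (Path(root) / ref).exists():
            missing.append(ref)
    return missing


DOC = (
    'See [architecture](ARCHITECTURE.md), `TESTING.md`, "CONVENTIONS.md" '
    "and [api](./docs/API.md). Example only: `examples/DEMO.md`."
)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ARCHITECTURE.md").write_text("# Architecture")
    (root / "docs").mkdir()
    (root / "docs" / "API.md").write_text("# API")
    missing = missing_references(DOC, root, skip_patterns=["examples/*"])

print(missing)  # ['CONVENTIONS.md', 'TESTING.md']
```

Here `examples/DEMO.md` is skipped by the `examples/*` pattern, so only the two genuinely absent files are reported as missing references.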
Related Skills
customize-measurement
Customize AI proficiency measurement for your specific repository through a guided interview. Use when: setting up measure-ai-proficiency for a new repo, adjusting thresholds for your team's size, hiding irrelevant recommendations, or mapping custom file names to standard patterns.
verify-gate
Runs project compile, test, and lint commands between implementation and quality review. Gates simplify-and-harden behind machine verification. If checks fail, routes back to implementation with diagnostics for a fix loop. If checks pass, signals ready for the quality pass. Use after any implementation work completes and before simplify-and-harden. Essential for the inner loop's verify step.
use-agent-factory
How to drive the 14-workflow agent factory in this repo from a Claude session. Covers: when to use the factory vs. direct edits, how to start the chain, where the human gates are, how to pick an implementer, how to recover from stuck PRs, and all the failure modes learned to date. Use this skill when the user asks you to ship a feature, fix, or refactor through the factory; when they reference an existing issue or PR in the factory chain; when a workflow is stuck or misbehaving; or when you need to file issues or plan files that the factory will pick up. Do NOT use this skill for: single-file scratch edits on an untracked branch, research questions, one-shot script runs, or any work that does not produce a PR to main.
simplify-and-harden
Post-completion self-review for coding agents that runs simplify, harden, and micro-documentation passes on non-trivial code changes. Use when: a coding task is complete in a general agent session and you want a bounded quality and security sweep before signaling done. For CI pipeline execution, use simplify-and-harden-ci.
pre-flight-check
[Beta] Session-start scan that surfaces relevant learnings, recent errors, and eval status before work begins. Bridges the outer loop back into the inner loop by making accumulated knowledge visible at task start. Activated via SessionStart hook or manually before major tasks.
plan-interview
Ensures alignment between user and Claude during feature/spec planning through a structured interview process. Use this skill when the user invokes /plan-interview before implementing a new feature, refactoring, or any non-trivial implementation task. The skill runs an upfront interview to gather requirements across technical constraints, scope boundaries, risk tolerance, and success criteria before any codebase exploration. Do NOT use this skill for: pure research/exploration tasks, simple bug fixes, or when the user just wants standard planning without the interview process.
learning-aggregator
[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ranked promotion candidates. This is the outer loop's inspect step — it turns raw learning data into actionable gap reports. Use on a regular cadence (weekly, before major tasks, or at session start for critical projects). Can be invoked manually or scheduled.
intent-framed-agent
Frames coding-agent work sessions with explicit intent capture and drift monitoring. Use when a session transitions from planning/Q&A to implementation for coding tasks, refactors, feature builds, bug fixes, or other multi-step execution where scope drift is a risk.
eval-creator
[Beta] Creates permanent eval cases from promoted learnings and runs regression checks against them. Turns failures into test cases that prevent silent regression. This is the outer loop's regress-test step. Use when a learning is promoted and has a clear pass/fail condition, or on cadence to verify promoted rules still hold.
context-surfing
Monitors context window health throughout a session and rides peak context quality for maximum output fidelity. Activates automatically after plan-interview and intent-framed-agent. Stays active through execution and hands off cleanly to simplify-and-harden and self-improvement when the wave completes naturally or exits via handoff. Use this skill whenever a multi-step agent task is underway and session continuity or context drift is a concern. Especially important for long-running tasks, complex refactors, or any work where degraded context would silently corrupt the output. Trigger even if the user doesn't say "context surfing" — if an agent task is running across multiple steps with intent and a plan already established, this skill is live.
Agentic Workflow Creator
Create natural language GitHub Actions workflows using the agentic workflows pattern from GitHub Next.
rv-measure
Quantifies R_V contraction signatures in AI models.