verify-gate

Runs project compile, test, and lint commands between implementation and quality review. Gates simplify-and-harden behind machine verification. If checks fail, routes back to implementation with diagnostics for a fix loop. If checks pass, signals ready for the quality pass. Use after any implementation work completes and before simplify-and-harden. Essential for the inner loop's verify step.

6 stars

Best use case

verify-gate is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using verify-gate should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/verify-gate/SKILL.md --create-dirs "https://raw.githubusercontent.com/pskoett/measuring-ai-proficiency/main/.claude/skills/verify-gate/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/verify-gate/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How verify-gate Compares

| Feature / Agent | verify-gate | Standard Approach |
|-----------------|-------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

Where can I find the source code?

The source code is on GitHub in the pskoett/measuring-ai-proficiency repository, at .claude/skills/verify-gate/SKILL.md.

SKILL.md Source

# Verify Gate

Machine verification gate between implementation and quality review. Runs the project's compile, test, and lint commands. If any fail, enters a fix loop. If all pass, unblocks simplify-and-harden.

This is the inner loop's **verify** step. Without it, the agent hands off code with zero machine signal about whether it actually works.

## When to Use

- After any implementation work completes, before signaling "done"
- Before running simplify-and-harden
- After fixing audit findings from agent-teams-simplify-and-harden
- Any time you want a machine-verified green signal

## Pipeline Position

```
[implementation] → verify-gate → simplify-and-harden → self-improvement
                   ↻ fix loop
```

## Step 1: Discover Project Commands

Read the project's configuration to find verification commands. Check these sources in order:

1. **Project instruction files** (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md) — look for a `## Verification` or `## Test Commands` section
2. **package.json** — `scripts.test`, `scripts.lint`, `scripts.typecheck`, `scripts.build`. Also check for a `bun.lock` / `bun.lockb` alongside it → prefer `bun run <script>` over `npm run <script>` when present. Check for `pnpm-lock.yaml` → prefer `pnpm run`. Check for `yarn.lock` → prefer `yarn`.
3. **Makefile** / **Justfile** — `test`, `lint`, `check`, `build` targets
4. **Cargo.toml** — `cargo build`, `cargo test`, `cargo clippy`
5. **pyproject.toml** / **setup.cfg** — `pytest`, `mypy`, `ruff`
6. **go.mod** — `go build ./...`, `go test ./...`, `go vet ./...`
7. **deno.json** / **deno.jsonc** — `deno task <name>` for any defined tasks
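
The lockfile rule in item 2 can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the helper names `pick_script_runner` and `discover_node_commands` are hypothetical, and the precedence among bun, pnpm, and yarn when several lockfiles coexist is an assumption (the list above does not define it).

```python
import json
from pathlib import Path

# Lockfile -> script runner, checked in the order item 2 mentions them.
# The precedence between lockfiles is an assumption made for this sketch.
LOCKFILE_RUNNERS = [
    ("bun.lock", "bun run"),
    ("bun.lockb", "bun run"),
    ("pnpm-lock.yaml", "pnpm run"),
    ("yarn.lock", "yarn"),
]


def pick_script_runner(project_dir: str) -> str:
    """Return the command prefix used to run package.json scripts."""
    root = Path(project_dir)
    for lockfile, runner in LOCKFILE_RUNNERS:
        if (root / lockfile).exists():
            return runner
    return "npm run"  # fallback when no bun, pnpm, or yarn lockfile is found


def discover_node_commands(project_dir: str) -> dict[str, str]:
    """Map the script names the gate cares about to runnable commands."""
    package_json = Path(project_dir) / "package.json"
    if not package_json.exists():
        return {}
    scripts = json.loads(package_json.read_text()).get("scripts", {})
    runner = pick_script_runner(project_dir)
    wanted = ("build", "typecheck", "test", "lint")
    return {name: f"{runner} {name}" for name in wanted if name in scripts}
```

An empty result from `discover_node_commands` would mean falling through to the next source in the list (Makefile, Cargo.toml, and so on).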

If no commands are discoverable, ask the user once and suggest they add a `## Verification` section to their project instruction files (CLAUDE.md, AGENTS.md, or equivalent) for future sessions:

```markdown
## Verification

- Build: `npm run build`
- Test: `npm test`
- Lint: `npm run lint`
- Type check: `npx tsc --noEmit`
```

## Step 2: Run Verification

Run discovered commands in this order. Stop at the first phase that fails and enter the fix loop (Step 3) before moving on.

### Phase 1: Compile / Type Check
Run the build or type-check command. These catch structural errors before wasting time on tests.

```
Exit 0 → proceed to Phase 2
Exit non-zero → enter fix loop with compiler output
```

### Phase 2: Tests
Run the test command. Scope to changed files if the test runner supports it.

```
Exit 0 → proceed to Phase 3
Exit non-zero → enter fix loop with test output
```

### Phase 3: Lint (optional, skippable with --skip-lint)
Run the lint command. Lint failures are lower severity but still worth catching.

```
Exit 0 → all phases green, gate passes
Exit non-zero → enter fix loop with lint output
```
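
The three phases share one rule: exit code 0 advances, anything else stops the run and hands the captured output to the fix loop. A minimal sketch of that control flow, assuming each phase is a single command given as an argument list (the names `PhaseResult` and `run_phases` are hypothetical):

```python
import subprocess
from dataclasses import dataclass


@dataclass
class PhaseResult:
    name: str
    passed: bool
    output: str  # combined stdout and stderr, used as fix-loop diagnostics


def run_phases(phases: list[tuple[str, list[str]]]) -> list[PhaseResult]:
    """Run phases in order and stop at the first one that exits non-zero."""
    results = []
    for name, argv in phases:
        proc = subprocess.run(argv, capture_output=True, text=True)
        passed = proc.returncode == 0
        results.append(PhaseResult(name, passed, proc.stdout + proc.stderr))
        if not passed:
            break  # later phases are not reached until this one is fixed
    return results
```

A caller would pass something like `[("build", ["npm", "run", "build"]), ("test", ["npm", "test"])]` and omit the lint entry when `--skip-lint` is set.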

## Step 3: Fix Loop

When a phase fails:

1. **Read the output.** Parse the error output for actionable diagnostics — file paths, line numbers, error messages.
2. **Scope the fix.** Only fix what the verification caught. Do not refactor, improve, or touch unrelated code.
3. **Apply the fix.** Make the minimal change to resolve the failure.
4. **Re-run the failed phase.** Not all phases — just the one that failed.
5. **If it passes**, continue to the next phase.
6. **If it fails again**, increment the attempt counter and repeat from step 1 until the fix limit is reached.

### Fix Loop Limits

- **Default max attempts:** 3 per phase (configurable via `--fix-limit N`)
- **Counter increments on every attempt**, even if the error changes. Fixing Error A and uncovering Error B counts as attempt 2, not attempt 1. The counter tracks fix attempts, not unique errors.
- **If limit reached:** Stop. Report what failed, what was tried, and the remaining error output. Do not guess further — signal to the user that manual intervention is needed.
- **Total budget:** The fix loop should not exceed 20% of the original implementation effort. If fixes are snowballing, stop and report.
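
The attempt-counting rule can be sketched as a small loop. This is an illustration under assumptions, not the skill's code: `run_phase` and `apply_fix` are hypothetical callbacks standing in for "re-run the failed phase" and "make the minimal change", and the 20% effort budget is not modeled.

```python
from typing import Callable


def fix_loop(
    run_phase: Callable[[], tuple[bool, str]],
    apply_fix: Callable[[str], None],
    fix_limit: int = 3,
) -> tuple[bool, int, str]:
    """Return (passed, attempts_used, last_output) for one phase."""
    passed, output = run_phase()
    attempts = 0
    while not passed and attempts < fix_limit:
        attempts += 1       # every fix counts, even if the error has changed
        apply_fix(output)   # scoped to the reported diagnostics only
        passed, output = run_phase()
    return passed, attempts, output
```

When the function returns `passed=False`, the limit was reached and the remaining output is what gets reported to the user.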

## Step 4: Gate Signal

When all phases pass:

```markdown
## Verify Gate: PASSED

- Build: passed
- Tests: passed (N tests, M suites)
- Lint: passed (or skipped)

Ready for simplify-and-harden.
```

When the fix loop is exhausted:

```markdown
## Verify Gate: BLOCKED

- Build: passed
- Tests: FAILED (attempt 3/3)
  - [file:line] error description
  - [file:line] error description
- Lint: not reached

Fix loop exhausted. Manual intervention needed before quality review.
```
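
Both reports follow the same shape, so they could be produced by one formatter. A minimal sketch, assuming phase statuses arrive as preformatted strings and that any status beginning with `FAILED` blocks the gate (`render_gate_signal` is a hypothetical name):

```python
def render_gate_signal(statuses: dict[str, str]) -> str:
    """Format the PASSED or BLOCKED report from per-phase status strings."""
    blocked = any(s.startswith("FAILED") for s in statuses.values())
    lines = ["## Verify Gate: BLOCKED" if blocked else "## Verify Gate: PASSED", ""]
    lines += [f"- {label}: {status}" for label, status in statuses.items()]
    lines.append("")
    if blocked:
        lines.append(
            "Fix loop exhausted. Manual intervention needed before quality review."
        )
    else:
        lines.append("Ready for simplify-and-harden.")
    return "\n".join(lines)
```

Per-error detail lines such as `[file:line] error description` are left out here; they would be appended under the failing phase.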

## Integration with Other Skills

### skill-pipeline
verify-gate should run at every pipeline depth except Trivial:

| Task size | Pipeline |
|-----------|----------|
| Trivial | None |
| Small | verify-gate → simplify-and-harden |
| Medium | intent-framed-agent + verify-gate → simplify-and-harden |
| Large | Full pipeline with verify-gate before quality pass |

### agent-teams-simplify-and-harden
agent-teams already has compile + tests embedded in Step 4. verify-gate can replace that embedded logic for consistency — the team lead spawns verify-gate instead of running ad-hoc compile/test commands.

### self-improvement
If the fix loop resolves an error that was non-obvious, log it:
- Pattern: what broke and why
- Fix: what resolved it
- Prevention: what convention or check would have caught it earlier

## What This Skill Does NOT Do

- Does not review code quality (that's simplify-and-harden)
- Does not check security (that's harden-auditor)
- Does not verify spec compliance (that's spec-auditor)
- Does not modify test files or add new tests
- Does not run tests for code it didn't change (unless the test runner doesn't support scoping)

## Configuration

If the project has a `.verify-gate.yml` or a `verify-gate` section in its project instruction files (CLAUDE.md, AGENTS.md, or equivalent), use those settings instead of auto-discovery:

```yaml
verify-gate:
  build: npm run build
  test: npm test
  lint: npm run lint
  type_check: npx tsc --noEmit
  fix_limit: 3
  skip_lint: false
  test_scope: changed  # changed | all
```

If no configuration exists, discover commands automatically (Step 1) and suggest persisting them.

### Custom Verification Tools (mcp-scripts)

Projects with custom invariants can define inline verification tools using gh-aw's `mcp-scripts`. These run as additional phases after the standard compile/test/lint checks.

Example — a project that needs API schema validation and legacy import checks:

```yaml
# In .github/workflows/verify-gate-ci.md or plugin config
mcp-scripts:
  verify-api-schema:
    lang: shell
    description: "Validate API schema matches implementation"
    run: |
      python scripts/validate_schema.py --strict

  check-no-legacy-imports:
    lang: shell
    description: "Ensure no imports from deprecated legacy/ directory"
    run: |
      ! grep -r "from legacy" src/ --include="*.py"

  verify-rate-limits:
    lang: javascript
    description: "All API routes must have rate limiting middleware"
    run: |
      const routes = require('./src/routes');
      const missing = routes.filter(r => !r.middleware.includes('rateLimit'));
      if (missing.length) { console.error('Missing rate limit:', missing); process.exit(1); }
```

When mcp-scripts are defined, verify-gate runs them as **Phase 4** after lint. Each script's exit code determines pass/fail. Failed scripts enter the same fix loop as standard phases.

This moves project-specific invariants from "knowledge in your head" to "knowledge in the harness" — exactly where the agent can reach it.

Related Skills

use-agent-factory

6
from pskoett/measuring-ai-proficiency

How to drive the 14-workflow agent factory in this repo from a Claude session. Covers: when to use the factory vs. direct edits, how to start the chain, where the human gates are, how to pick an implementer, how to recover from stuck PRs, and all the failure modes learned to date. Use this skill when the user asks you to ship a feature, fix, or refactor through the factory; when they reference an existing issue or PR in the factory chain; when a workflow is stuck or misbehaving; or when you need to file issues or plan files that the factory will pick up. Do NOT use this skill for: single-file scratch edits on an untracked branch, research questions, one-shot script runs, or any work that does not produce a PR to main.

simplify-and-harden

6
from pskoett/measuring-ai-proficiency

Post-completion self-review for coding agents that runs simplify, harden, and micro-documentation passes on non-trivial code changes. Use when: a coding task is complete in a general agent session and you want a bounded quality and security sweep before signaling done. For CI pipeline execution, use simplify-and-harden-ci.

pre-flight-check

6
from pskoett/measuring-ai-proficiency

[Beta] Session-start scan that surfaces relevant learnings, recent errors, and eval status before work begins. Bridges the outer loop back into the inner loop by making accumulated knowledge visible at task start. Activated via SessionStart hook or manually before major tasks.

plan-interview

6
from pskoett/measuring-ai-proficiency

Ensures alignment between user and Claude during feature/spec planning through a structured interview process. Use this skill when the user invokes /plan-interview before implementing a new feature, refactoring, or any non-trivial implementation task. The skill runs an upfront interview to gather requirements across technical constraints, scope boundaries, risk tolerance, and success criteria before any codebase exploration. Do NOT use this skill for: pure research/exploration tasks, simple bug fixes, or when the user just wants standard planning without the interview process.

measure-ai-proficiency

6
from pskoett/measuring-ai-proficiency

Assess and improve repository AI coding proficiency and context engineering maturity. Use when users ask about: (1) AI readiness or AI maturity assessment, (2) context engineering quality or improvement, (3) CLAUDE.md, .cursorrules, or copilot-instructions files, (4) measuring how well a repo is prepared for AI coding assistants, (5) recommendations for improving AI collaboration, (6) what context files to add, or (7) comparing their repo to AI proficiency best practices.

learning-aggregator

6
from pskoett/measuring-ai-proficiency

[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ranked promotion candidates. This is the outer loop's inspect step — it turns raw learning data into actionable gap reports. Use on a regular cadence (weekly, before major tasks, or at session start for critical projects). Can be invoked manually or scheduled.

intent-framed-agent

6
from pskoett/measuring-ai-proficiency

Frames coding-agent work sessions with explicit intent capture and drift monitoring. Use when a session transitions from planning/Q&A to implementation for coding tasks, refactors, feature builds, bug fixes, or other multi-step execution where scope drift is a risk.

eval-creator

6
from pskoett/measuring-ai-proficiency

[Beta] Creates permanent eval cases from promoted learnings and runs regression checks against them. Turns failures into test cases that prevent silent regression. This is the outer loop's regress-test step. Use when a learning is promoted and has a clear pass/fail condition, or on cadence to verify promoted rules still hold.

customize-measurement

6
from pskoett/measuring-ai-proficiency

Customize AI proficiency measurement for your specific repository through a guided interview. Use when: setting up measure-ai-proficiency for a new repo, adjusting thresholds for your team's size, hiding irrelevant recommendations, or mapping custom file names to standard patterns.

context-surfing

6
from pskoett/measuring-ai-proficiency

Monitors context window health throughout a session and rides peak context quality for maximum output fidelity. Activates automatically after plan-interview and intent-framed-agent. Stays active through execution and hands off cleanly to simplify-and-harden and self-improvement when the wave completes naturally or exits via handoff. Use this skill whenever a multi-step agent task is underway and session continuity or context drift is a concern. Especially important for long-running tasks, complex refactors, or any work where degraded context would silently corrupt the output. Trigger even if the user doesn't say "context surfing" — if an agent task is running across multiple steps with intent and a plan already established, this skill is live.

Agentic Workflow Creator

6
from pskoett/measuring-ai-proficiency

Create natural language GitHub Actions workflows using the agentic workflows pattern from GitHub Next.

create-issue-gate

31392
from sickn33/antigravity-awesome-skills

Use when starting a new implementation task and an issue must be created with strict acceptance criteria gating before execution.
