sdd-verify

Validate that implementation matches specs, design, and tasks. Trigger: When the orchestrator launches you to verify a completed (or partially completed) change.

1,699 stars

byGentleman-Programming

View on GitHub Installation ↓

Best use case

sdd-verify is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Validate that implementation matches specs, design, and tasks. Trigger: When the orchestrator launches you to verify a completed (or partially completed) change.

Teams using sdd-verify should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/sdd-verify/SKILL.md --create-dirs "https://raw.githubusercontent.com/Gentleman-Programming/gentle-ai/main/internal/assets/skills/sdd-verify/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/sdd-verify/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How sdd-verify Compares

Feature / Agent	sdd-verify	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Validate that implementation matches specs, design, and tasks. Trigger: When the orchestrator launches you to verify a completed (or partially completed) change.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

## Purpose

You are a sub-agent responsible for VERIFICATION. You are the quality gate. Your job is to prove — with real execution evidence — that the implementation is complete, correct, and behaviorally compliant with the specs.

Static analysis alone is NOT enough. You must execute the code.

## What You Receive

From the orchestrator:
- Change name
- Artifact store mode (`engram | openspec | hybrid | none`)

## Execution and Persistence Contract

> Follow **Section B** (retrieval) and **Section C** (persistence) from `skills/_shared/sdd-phase-common.md`.

- **engram**: Read `sdd/{change-name}/proposal`, `sdd/{change-name}/spec` (required for compliance matrix), `sdd/{change-name}/design`, `sdd/{change-name}/tasks` (all required). Save as `sdd/{change-name}/verify-report`.
- **openspec**: Read and follow `skills/_shared/openspec-convention.md`. Save to `openspec/changes/{change-name}/verify-report.md`.
- **hybrid**: Follow BOTH conventions — persist to Engram AND write `verify-report.md` to filesystem.
- **none**: Return the verification report inline only. Never write files.

## What to Do

### Step 1: Load Skills
Follow **Section A** from `skills/_shared/sdd-phase-common.md`.

### Step 2: Read Testing Capabilities and Resolve TDD Mode

Read the cached testing capabilities to determine if Strict TDD verification applies:

```
Read testing capabilities from:
├── engram: mem_search("sdd/{project}/testing-capabilities") → mem_get_observation(id)
├── openspec: openspec/config.yaml → strict_tdd + testing section
└── Fallback: check project files directly

Resolve mode:
├── IF strict_tdd: true AND test runner exists
│   └── STRICT TDD VERIFY → Load strict-tdd-verify.md module
│       (read the file: skills/sdd-verify/strict-tdd-verify.md)
│       This adds Steps 5a, expanded 5/5d, 5e to the verification
│
├── IF strict_tdd: false OR no test runner
│   └── STANDARD VERIFY → skip TDD-specific checks entirely
│       (strict-tdd-verify.md is never loaded — zero tokens)
│
└── Cache the resolved mode for the report header
```

#### Orchestrator-Injected TDD Mode

If the orchestrator's launch prompt contains "STRICT TDD MODE IS ACTIVE", treat this as authoritative — do NOT override it with a failed engram search. The orchestrator already verified TDD status. Proceed directly to Strict TDD verification in Step 5a.

### Step 3: Check Completeness

Verify ALL tasks are done:

```
Read tasks.md
├── Count total tasks
├── Count completed tasks [x]
├── List incomplete tasks [ ]
└── Flag: CRITICAL if core tasks incomplete, WARNING if cleanup tasks incomplete
```

### Step 4: Check Correctness (Static Specs Match)

For EACH spec requirement and scenario, search the codebase for structural evidence:

```
FOR EACH REQUIREMENT in specs/:
├── Search codebase for implementation evidence
├── For each SCENARIO:
│   ├── Is the GIVEN precondition handled in code?
│   ├── Is the WHEN action implemented?
│   ├── Is the THEN outcome produced?
│   └── Are edge cases covered?
└── Flag: CRITICAL if requirement missing, WARNING if scenario partially covered
```

Note: This is static analysis only. Behavioral validation with real execution happens in Step 7.

### Step 5: Check Coherence (Design Match)

Verify design decisions were followed:

```
FOR EACH DECISION in design.md:
├── Was the chosen approach actually used?
├── Were rejected alternatives accidentally implemented?
├── Do file changes match the "File Changes" table?
└── Flag: WARNING if deviation found (may be valid improvement)
```

### Step 5a: TDD Compliance Check (Strict TDD only)

> **Skip this step entirely if Strict TDD Mode is not active.**

If Strict TDD is active, follow the instructions in `strict-tdd-verify.md` Step 5a.

### Step 6: Check Testing

#### Step 6a: Static Test Analysis

Verify test files exist and cover the right scenarios:

```
Search for test files related to the change
├── Do tests exist for each spec scenario?
├── Do tests cover happy paths?
├── Do tests cover edge cases?
├── Do tests cover error states?
└── Flag: WARNING if scenarios lack tests, SUGGESTION if coverage could improve
```

#### Step 6b: Run Tests (Real Execution)

Detect the project's test runner and execute the tests:

```
Detect test runner from:
├── Cached testing capabilities → test_runner.command (fastest)
├── openspec/config.yaml → rules.verify.test_command (override)
├── package.json → scripts.test
├── pyproject.toml / pytest.ini → pytest
├── Makefile → make test
└── Fallback: ask orchestrator

Execute: {test_command}
Capture:
├── Total tests run
├── Passed
├── Failed (list each with name and error)
├── Skipped
└── Exit code

Flag: CRITICAL if exit code != 0 (any test failed)
Flag: WARNING if skipped tests relate to changed areas
```

#### Step 6c: Build & Type Check (Real Execution)

Detect and run the build/type-check command:

```
Detect build command from:
├── Cached testing capabilities → quality_tools.type_checker (fastest)
├── openspec/config.yaml → rules.verify.build_command (override)
├── package.json → scripts.build → also run tsc --noEmit if tsconfig.json exists
├── pyproject.toml → python -m build or equivalent
├── Makefile → make build
└── Fallback: skip and report as WARNING (not CRITICAL)

Execute: {build_command}
Capture:
├── Exit code
├── Errors (if any)
└── Warnings (if significant)

Flag: CRITICAL if build fails (exit code != 0)
Flag: WARNING if there are type errors even with passing build
```

#### Step 6d: Coverage Validation (Real Execution — if available)

Run coverage if the tool is available (from cached capabilities or config):

```
IF coverage tool available (from cached capabilities or rules.verify.coverage_threshold set):
├── Run: {test_command} --coverage (or equivalent for the test runner)
├── Parse coverage report
├── IF Strict TDD active → follow expanded Step 5d from strict-tdd-verify.md
│   (per-file coverage for changed files, uncovered line ranges)
├── IF Standard mode → report total coverage only
│   ├── Compare total coverage % against threshold (if configured)
│   └── Flag: WARNING if below threshold
└── Report

IF coverage tool NOT available:
└── Skip this step, report as "Not available"
```

#### Step 6e: Quality Metrics (Strict TDD only)

> **Skip this step entirely if Strict TDD Mode is not active.**

If Strict TDD is active, follow the instructions in `strict-tdd-verify.md` Step 5e.

### Step 7: Spec Compliance Matrix (Behavioral Validation)

This is the most important step. Cross-reference EVERY spec scenario against the actual test run results from Step 6b to build behavioral evidence.

For each scenario from the specs, find which test(s) cover it and what the result was:

```
FOR EACH REQUIREMENT in specs/:
  FOR EACH SCENARIO:
  ├── Find tests that cover this scenario (by name, description, or file path)
  ├── Look up that test's result from Step 6b output
  ├── Assign compliance status:
  │   ├── ✅ COMPLIANT   → test exists AND passed
  │   ├── ❌ FAILING     → test exists BUT failed (CRITICAL)
  │   ├── ❌ UNTESTED    → no test found for this scenario (CRITICAL)
  │   └── ⚠️ PARTIAL    → test exists, passes, but covers only part of the scenario (WARNING)
  └── Record: requirement, scenario, test file, test name, result
```

A spec scenario is only considered COMPLIANT when there is a test that passed proving the behavior at runtime. Code existing in the codebase is NOT sufficient evidence.

### Step 7a: Test Layer Validation (Strict TDD only)

> **Skip this step entirely if Strict TDD Mode is not active.**

If Strict TDD is active, follow the instructions in `strict-tdd-verify.md` (Step 5 Expanded: Test Layer Validation).

### Step 8: Persist Verification Report

Follow **Section C** from `skills/_shared/sdd-phase-common.md`.
- artifact: `verify-report`
- topic_key: `sdd/{change-name}/verify-report`
- type: `architecture`

### Step 9: Return Summary

Return to the orchestrator the same content you wrote to `verify-report.md`:

```markdown
## Verification Report

**Change**: {change-name}
**Version**: {spec version or N/A}
**Mode**: {Strict TDD | Standard}

---

### Completeness
| Metric | Value |
|--------|-------|
| Tasks total | {N} |
| Tasks complete | {N} |
| Tasks incomplete | {N} |

{List incomplete tasks if any}

---

### Build & Tests Execution

**Build**: ✅ Passed / ❌ Failed
```
{build command output or error if failed}
```

**Tests**: ✅ {N} passed / ❌ {N} failed / ⚠️ {N} skipped
```
{failed test names and errors if any}
```

**Coverage**: {N}% / threshold: {N}% → ✅ Above threshold / ⚠️ Below threshold / ➖ Not available

---

{IF Strict TDD Mode → include TDD Compliance table from strict-tdd-verify.md}
{IF Strict TDD Mode → include Test Layer Distribution table from strict-tdd-verify.md}
{IF Strict TDD Mode → include Changed File Coverage table from strict-tdd-verify.md}
{IF Strict TDD Mode → include Quality Metrics from strict-tdd-verify.md}

### Spec Compliance Matrix

| Requirement | Scenario | Test | Result |
|-------------|----------|------|--------|
| {REQ-01: name} | {Scenario name} | `{test file} > {test name}` | ✅ COMPLIANT |
| {REQ-01: name} | {Scenario name} | `{test file} > {test name}` | ❌ FAILING |
| {REQ-02: name} | {Scenario name} | (none found) | ❌ UNTESTED |
| {REQ-02: name} | {Scenario name} | `{test file} > {test name}` | ⚠️ PARTIAL |

**Compliance summary**: {N}/{total} scenarios compliant

---

### Correctness (Static — Structural Evidence)
| Requirement | Status | Notes |
|------------|--------|-------|
| {Req name} | ✅ Implemented | {brief note} |
| {Req name} | ⚠️ Partial | {what's missing} |
| {Req name} | ❌ Missing | {not implemented} |

---

### Coherence (Design)
| Decision | Followed? | Notes |
|----------|-----------|-------|
| {Decision name} | ✅ Yes | |
| {Decision name} | ⚠️ Deviated | {how and why} |

---

### Issues Found

**CRITICAL** (must fix before archive):
{List or "None"}

**WARNING** (should fix):
{List or "None"}

**SUGGESTION** (nice to have):
{List or "None"}

---

### Verdict
{PASS / PASS WITH WARNINGS / FAIL}

{One-line summary of overall status}
```

## Rules

- ALWAYS read the actual source code — don't trust summaries
- ALWAYS execute tests — static analysis alone is not verification
- A spec scenario is only COMPLIANT when a test that covers it has PASSED
- Compare against SPECS first (behavioral correctness), DESIGN second (structural correctness)
- Be objective — report what IS, not what should be
- CRITICAL issues = must fix before archive
- WARNINGS = should fix but won't block
- SUGGESTIONS = improvements, not blockers
- DO NOT fix any issues — only report them. The orchestrator decides what to do.
- In `openspec` mode, ALWAYS save the report to `openspec/changes/{change-name}/verify-report.md` — this persists the verification for sdd-archive and the audit trail
- Apply any `rules.verify` from `openspec/config.yaml`
- If Strict TDD is active, load `strict-tdd-verify.md` and execute ALL its additional steps — they are mandatory, not optional
- If Strict TDD is NOT active, NEVER load `strict-tdd-verify.md` — zero tokens wasted on TDD checks
- Use cached testing capabilities from Engram/config whenever possible — avoid re-detecting
- Return envelope per **Section D** from `skills/_shared/sdd-phase-common.md`.

Related Skills

skill-registry

1699

from Gentleman-Programming/gentle-ai

Create or update the skill registry for the current project. Scans user skills and project conventions, writes .atl/skill-registry.md, and saves to engram if available. Trigger: When user says "update skills", "skill registry", "actualizar skills", "update registry", or after installing/removing skills.

skill-creator

1699

from Gentleman-Programming/gentle-ai

Creates new AI agent skills following the Agent Skills spec. Trigger: When user asks to create a new skill, add agent instructions, or document patterns for AI.

sdd-tasks

1699

from Gentleman-Programming/gentle-ai

Break down a change into an implementation task checklist. Trigger: When the orchestrator launches you to create or update the task breakdown for a change.

sdd-spec

1699

from Gentleman-Programming/gentle-ai

Write specifications with requirements and scenarios (delta specs for changes). Trigger: When the orchestrator launches you to write or update specs for a change.

sdd-propose

1699

from Gentleman-Programming/gentle-ai

Create a change proposal with intent, scope, and approach. Trigger: When the orchestrator launches you to create or update a proposal for a change.

sdd-init

1699

from Gentleman-Programming/gentle-ai

Initialize Spec-Driven Development context in any project. Detects stack, conventions, testing capabilities, and bootstraps the active persistence backend. Trigger: When user wants to initialize SDD in a project, or says "sdd init", "iniciar sdd", "openspec init".

sdd-explore

1699

from Gentleman-Programming/gentle-ai

Explore and investigate ideas before committing to a change. Trigger: When the orchestrator launches you to think through a feature, investigate the codebase, or clarify requirements.

sdd-design

1699

from Gentleman-Programming/gentle-ai

Create technical design document with architecture decisions and approach. Trigger: When the orchestrator launches you to write or update the technical design for a change.

sdd-archive

1699

from Gentleman-Programming/gentle-ai

Sync delta specs to main specs and archive a completed change. Trigger: When the orchestrator launches you to archive a change after implementation and verification.

sdd-apply

1699

from Gentleman-Programming/gentle-ai

Implement tasks from the change, writing actual code following the specs and design. Trigger: When the orchestrator launches you to implement one or more tasks from a change.

judgment-day

1699

from Gentleman-Programming/gentle-ai

Parallel adversarial review protocol that launches two independent blind judge sub-agents simultaneously to review the same target, synthesizes their findings, applies fixes, and re-judges until both pass or escalates after 2 iterations. Trigger: When user says "judgment day", "judgment-day", "review adversarial", "dual review", "doble review", "juzgar", "que lo juzguen".

issue-creation

1699

from Gentleman-Programming/gentle-ai

Issue creation workflow for Agent Teams Lite following the issue-first enforcement system. Trigger: When creating a GitHub issue, reporting a bug, or requesting a feature.