Empirical Validation
Requires proof before marking work complete — no "trust me, it works"
Best use case
Empirical Validation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using Empirical Validation should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/empirical-validation/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
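In the spirit of the skill itself, the installation can be checked rather than assumed. The following is a minimal Python sketch, assuming the install path given in the steps above; the `skill_installed` helper is hypothetical and not part of the skill.

```python
from pathlib import Path


def skill_installed(project_root: str = ".") -> bool:
    """Return True if SKILL.md exists at the path named in the install steps."""
    # Hypothetical helper: the path components come from the installation steps.
    skill_path = (
        Path(project_root) / ".claude" / "skills" / "empirical-validation" / "SKILL.md"
    )
    return skill_path.is_file()


print("installed" if skill_installed() else "SKILL.md not found")
```

Run it from the project root. It only confirms that the file is in place, not that the agent has loaded it.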
How Empirical Validation Compares
| Feature / Agent | Empirical Validation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Requires proof before marking work complete — no "trust me, it works"
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Empirical Validation
## Core Principle
> **"The code looks correct" is NOT validation.**
>
> Every change must be verified with empirical evidence before being marked complete.
## Validation Methods by Change Type
| Change Type | Required Validation | Tool |
|-------------|---------------------|------|
| **UI Changes** | Screenshot showing expected visual state | `browser_subagent` |
| **API Endpoints** | Command showing correct response | `run_command` |
| **Build/Config** | Successful build or test output | `run_command` |
| **Data Changes** | Query showing expected data state | `run_command` |
| **File Operations** | File listing or content verification | `run_command` |
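The table above can be read as a lookup from change type to the evidence that must be produced. The following is a minimal Python sketch of that lookup. The key names (`ui`, `api`, and so on) and the `required_validation` helper are illustrative assumptions; only the evidence descriptions and tool names come from the table.

```python
# Hypothetical lookup mirroring the table above; the key names are illustrative.
VALIDATION_METHODS = {
    "ui": ("Screenshot showing expected visual state", "browser_subagent"),
    "api": ("Command showing correct response", "run_command"),
    "build_config": ("Successful build or test output", "run_command"),
    "data": ("Query showing expected data state", "run_command"),
    "file_ops": ("File listing or content verification", "run_command"),
}


def required_validation(change_type: str) -> tuple[str, str]:
    """Return (required evidence, tool) for a change type."""
    try:
        return VALIDATION_METHODS[change_type]
    except KeyError:
        # An unknown change type is an error, never a reason to skip validation.
        raise ValueError(f"No validation method defined for {change_type!r}") from None


evidence, tool = required_validation("api")
print(f"{tool}: {evidence}")
```

Raising on an unknown change type keeps the sketch consistent with the core principle: a change with no defined validation method cannot silently pass.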
## Validation Protocol
### Before Marking Any Task "Done"
1. **Identify Verification Criteria**
- What should be true after this change?
- How can that be observed?
2. **Execute Verification**
- Run the appropriate command or action
- Capture the output/evidence
3. **Document Evidence**
- Add to `.agent/state/JOURNAL.md` under the task
- Include actual output, not just "passed"
4. **Confirm Against Criteria**
- Does evidence match expected outcome?
- If not, task is NOT complete
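The four steps above could be sketched as a single helper. This is an illustrative Python sketch, not the skill's implementation: the `validate_task` function, the journal entry layout, and the "expected text appears in the output" criterion are all assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def validate_task(task: str, command: list[str], expected: str, journal: Path) -> bool:
    """Run `command`, record its real output in the journal, and check it.

    Returns True only when the command exits with code 0 and `expected`
    (the criterion chosen in step 1) appears in its combined output.
    """
    # Step 2: execute the verification and capture the evidence.
    result = subprocess.run(command, capture_output=True, text=True)
    output = (result.stdout + result.stderr).strip()

    # Step 4: confirm the evidence against the criterion.
    passed = result.returncode == 0 and expected in output

    # Step 3: document the actual output, not just "passed".
    journal.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(f"\n### {task} ({stamp})\n")
        fh.write(f"Command: {' '.join(command)}\n")
        fh.write(f"Expected to contain: {expected}\n")
        fh.write(f"Exit code: {result.returncode}\n")
        fh.write(f"Output:\n{output}\n")
        fh.write(f"Result: {'PASS' if passed else 'FAIL'}\n")
    return passed


# Demo with a stand-in command and a temporary journal so the sketch runs anywhere.
# In a real project the journal would be .agent/state/JOURNAL.md.
with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / ".agent" / "state" / "JOURNAL.md"
    done = validate_task(
        task="Build succeeds",
        command=[sys.executable, "-c", "print('Successfully compiled')"],
        expected="Successfully compiled",
        journal=journal,
    )
    print(journal.read_text(encoding="utf-8"))
    print("Task complete" if done else "Task NOT complete")
```

The journal entry is written whether the check passes or fails, so a failed verification leaves evidence behind as well.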
## Examples
### API Endpoint Verification
```powershell
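# Good: Actual test showing response
# (Invoke-RestMethod is used because `curl` is an alias for Invoke-WebRequest in Windows PowerShell 5.1)
Invoke-RestMethod -Method Post -Uri http://localhost:3000/api/login -ContentType 'application/json' -Body '{"email":"test@test.com"}' | ConvertTo-Json -Compress
# Output: {"success":true,"token":"..."}
# Bad: Just saying "endpoint works"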
```
### UI Verification
```
# Good: Take screenshot with browser tool
- Navigate to /dashboard
- Capture screenshot
- Confirm: Header visible? Data loaded? Layout correct?
# Bad: "The component should render correctly"
```
### Build Verification
```powershell
# Good: Show build output
npm run build
# Output: Successfully compiled...
# Bad: "Build should work now"
```
## Forbidden Phrases
Never use these as justification for completion:
- "This should work"
- "The code looks correct"
- "I've made similar changes before"
- "Based on my understanding"
- "It follows the pattern"
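A completion note can be screened for these phrases mechanically before a task is marked done. The following is a minimal Python sketch, assuming a simple case-insensitive substring match; the `find_forbidden_phrases` helper is hypothetical and not part of the skill.

```python
# Phrases taken from the list above, lowercased for case-insensitive matching.
FORBIDDEN_PHRASES = (
    "this should work",
    "the code looks correct",
    "i've made similar changes before",
    "based on my understanding",
    "it follows the pattern",
)


def find_forbidden_phrases(note: str) -> list[str]:
    """Return the forbidden phrases found in a completion note."""
    # Normalise curly apostrophes so that "I’ve" matches "I've".
    lowered = note.lower().replace("\u2019", "'")
    return [phrase for phrase in FORBIDDEN_PHRASES if phrase in lowered]


note = "Refactored the login handler. The code looks correct, so this should work."
print(find_forbidden_phrases(note))
# prints ['this should work', 'the code looks correct']
```

A substring match will miss paraphrases, so a check like this only supplements the evidence requirement and does not replace it.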
## Integration
This skill integrates with:
- `/verify` — Primary workflow using this skill
- `/execute` — Must validate before marking tasks complete
- Rule 4 in `GEMINI.md` — Empirical Validation enforcement
## Failure Handling
If verification fails:
1. **Do NOT mark task complete**
2. **Document** the failure in `.agent/state/STATE.md`
3. **Create** fix task if cause is known
4. **Trigger** Context Health Monitor if 3+ failures
Related Skills
assumption-validation
Test whether assumptions are true before making commitments. Use when assumptions have low certainty and high risk.
type-inference-validation
Static type inference and validation for navigation paths
bio-alignment-validation
Validate alignment quality with insert size distribution, proper pairing rates, GC bias, strand balance, and other post-alignment metrics. Use when verifying alignment data quality before variant calling or quantification.
date-validation
Use when editing Planning Hubs, timelines, calendars, or any file with day-name + date combinations (Wed Nov 12), relative dates (tomorrow), or countdowns (18 days until) - validates day-of-week accuracy, relative date calculations, and countdown math with two-source ground truth verification before allowing edits
spring-validation
Bean Validation (Jakarta Validation) with Spring Boot. Custom validators, validation groups, cross-field validation, and internationalized error messages.
fullstack-validation
Comprehensive validation methodology for multi-component applications including backend, frontend, database, and infrastructure
api-validation
Apply when validating API request inputs: body, query params, path params, and headers. This skill covers Zod v4 patterns.
api-request-validation
A skill for implementing robust API request validation in Python web frameworks like FastAPI using Pydantic. Covers Pydantic models, custom validators (email, password), field-level and cross-field validation, query/file validation, and structured error responses. Use when you need to validate incoming API requests.
api-contracts-and-zod-validation
Generate Zod schemas and TypeScript types for forms, API routes, and Server Actions with runtime validation. Use this skill when creating API contracts, validating request/response payloads, generating form schemas, adding input validation to Server Actions or route handlers, or ensuring type safety across client-server boundaries. Trigger terms include zod, schema, validation, API contract, form validation, type inference, runtime validation, parse, safeParse, input validation, request validation, Server Action validation.
api-contracts-and-validation
Define and validate API contracts using Zod
api-contract-validation
Detect breaking changes in API contracts (OpenAPI/Swagger specs)
android-playstore-api-validation
Create and run validation script to test Play Store API connection