Codex

claims-validator

Validate documentation for unsupported claims, made-up metrics, and unverifiable statements

104 stars

byjmagly

View on GitHub Installation ↓

Best use case

claims-validator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Validate documentation for unsupported claims, made-up metrics, and unverifiable statements

Teams using claims-validator should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/claims-validator/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/claims-validator/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/claims-validator/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How claims-validator Compares

Feature / Agent	claims-validator	Standard Approach
Platform Support	Codex	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Validate documentation for unsupported claims, made-up metrics, and unverifiable statements

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

Cursor vs Codex for AI Workflows

Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.

AI Agents for Startups

Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

SKILL.md Source

# claims-validator

Validate documentation for unsupported claims, made-up metrics, and unverifiable statements.

## Triggers


Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):

- "fact-check this" → claim validation
- "verify [claim]" → specific claim check

## Purpose

This skill identifies statements that make claims without evidence, including:
- Performance metrics without benchmarks or data
- Time/cost estimates without basis
- Percentage claims without citation
- Comparative statements without baselines
- Features described as implemented that don't exist
- Marketing superlatives presented as facts

## Behavior

When triggered, this skill:

1. **Scans for metric claims**:
   - Percentage improvements ("40% faster", "reduces by 60%")
   - Time estimates ("saves 2-3 hours", "in minutes not hours")
   - Cost projections ("$50-150/month", "ROI of 3x")
   - Performance numbers ("99x faster", "sub-millisecond")

2. **Identifies unsupported comparatives**:
   - "faster than", "better than", "more efficient"
   - "best", "leading", "revolutionary", "game-changing"
   - "comprehensive", "complete", "full-featured"

3. **Checks for feature claims**:
   - Commands or flags mentioned that don't exist in codebase
   - Features described in present tense that aren't implemented
   - Integration claims without actual integration code

4. **Validates citations**:
   - Claims that reference data should have sources
   - Benchmarks should link to methodology
   - Statistics should be reproducible

5. **Generates report**:
   - List each claim found
   - Classification (metric, comparative, feature, cost)
   - Recommendation (remove, add citation, verify, rephrase)

## Claim Categories

### Metrics Without Data

```markdown
# Flagged
"Time Saved: 92-96% (9-15 hours → 45-60 minutes)"
"99x faster routing"
"45x cache speedup"

# Problem
No benchmark data, methodology, or reproducible test

# Fix
Remove claim, or add: "Based on [benchmark/test], measured [how]"
```

### Cost Estimates Without Basis

```markdown
# Flagged
"Budget $20-50/month for moderate use"
"Light usage: ~$10-20/month"
"Enterprise teams may see $100-500+/month"

# Problem
No actual usage data, varies wildly by use case

# Fix
Remove specific numbers, or link to pricing calculator/methodology
```

### Time Estimates Without Data

```markdown
# Flagged
"Deploy Full SDLC Framework (2 Minutes)"
"5 minutes, replaces 2-4 hours manual work"
"campaign setup from 2-3 weeks → 1 week"

# Problem
No measurement, varies by project complexity

# Fix
Remove time claims, describe what it does instead
```

### Comparative Claims Without Baseline

```markdown
# Flagged
"faster than manual processes"
"more efficient than traditional approaches"
"better than existing solutions"

# Problem
No specific comparison, no baseline defined

# Fix
Remove comparison, or specify exactly what's being compared
```

### Feature Claims for Unimplemented Features

```markdown
# Flagged
"aiwg -migrate-workspace  # Optional migration tool"
"Run 'config-validator --fix' to apply automated fixes"

# Problem
Command doesn't exist in codebase

# Fix
Remove until implemented, or mark as "Planned:"
```

### Marketing Superlatives

```markdown
# Flagged
"comprehensive", "revolutionary", "game-changing"
"best-in-class", "industry-leading", "cutting-edge"
"seamless", "effortless", "zero-friction"

# Problem
Subjective claims that can't be verified

# Fix
Replace with specific, factual descriptions
```

## Validation Report Format

```markdown
# Claims Validation Report

**Document**: README.md
**Date**: 2025-12-09
**Claims Found**: 12
**Issues**: 8

## Summary

| Category | Found | Unsupported | Action Needed |
|----------|-------|-------------|---------------|
| Metrics | 5 | 4 | Remove or cite |
| Time estimates | 3 | 3 | Remove |
| Cost estimates | 2 | 2 | Remove |
| Comparatives | 1 | 1 | Rephrase |
| Features | 1 | 1 | Remove (not implemented) |

## Issues

### 1. Unsupported Metric
**Line 204**: "Time Savings: 20-98% reduction across 5 core use cases"
**Problem**: No benchmark data or methodology
**Action**: Remove claim

### 2. Cost Estimate Without Data
**Line 1180**: "Light usage: ~$10-20/month"
**Problem**: No actual usage data to support estimate
**Action**: Remove specific numbers

### 3. Non-existent Feature
**Line 583**: "aiwg -migrate-workspace"
**Problem**: Command not implemented
**Action**: Remove until implemented

## Recommendations

1. **Remove all percentage-based time savings claims**
2. **Remove cost estimates** - direct users to measure their own usage
3. **Remove non-existent commands** - only document what exists
4. **Replace superlatives with specifics** - say what it does, not how good it is
```

## Usage Examples

### Validate a Document

```
User: "Check README.md for unsupported claims"

Output:
"Claims Validation: README.md

Found 8 unsupported claims:

Metrics (4):
- Line 204: '20-98% reduction' - no data
- Line 362: '56-63% time saved' - no data
- Line 588: '99x faster routing' - no benchmark
- Line 589: '45x cache speedup' - no benchmark

Time Estimates (2):
- Line 187: '5 minutes, replaces 2-4 hours' - unmeasured
- Line 366: '2 Minutes' in heading - varies by project

Non-existent Features (2):
- Line 583: 'aiwg -migrate-workspace' - not implemented
- Line 651: 'aiwg -rollback-workspace' - not implemented

Recommendation: Remove these claims. Describe what features
do, not how much time/money they save."
```

### Quick Check

```
User: "Any BS in this section?"

Output:
"Found 2 issues:

1. 'Revolutionary approach' - marketing speak, be specific
2. '10x productivity gain' - no measurement

Suggest: Replace with factual descriptions of functionality."
```

## Integration

This skill complements:
- **Voice Framework**: Voice defines *how* to write, claims-validator checks *what* you claim
- **config-validator**: Validates config files, claims-validator validates prose claims

## What This Skill Does NOT Flag

- Factual descriptions of features that exist
- Documented benchmarks with methodology
- Qualified statements ("may vary", "depending on", "in our testing")
- User testimonials clearly attributed
- Comparative claims with specific baselines cited

## Output Location

- Validation reports: `.aiwg/reports/claims-validation.md`

## References

- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/README.md — aiwg-utils addon overview
- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/research-before-decision.md — Verify claims before accepting them
- @$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/vague-discretion.md — Requirements for measurable, verifiable claims
- @$AIWG_ROOT/docs/cli-reference.md — CLI reference for validation commands

Related Skills

config-validator

104

from jmagly/aiwg

Validate AIWG configuration files and project setup for correctness and completeness

Codex

aiwg-orchestrate

104

from jmagly/aiwg

Route structured artifact work to AIWG workflows via MCP with zero parent context cost

venv-manager

104

from jmagly/aiwg

Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.

pytest-runner

104

from jmagly/aiwg

Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.

vitest-runner

104

from jmagly/aiwg

Execute JavaScript/TypeScript tests with Vitest, supporting coverage, watch mode, and parallel execution. Use for JS/TS test automation.

eslint-checker

104

from jmagly/aiwg

Run ESLint for JavaScript/TypeScript code quality and style enforcement. Use for static analysis and auto-fixing.

repo-analyzer

104

from jmagly/aiwg

Analyze GitHub repositories for structure, documentation, dependencies, and contribution patterns. Use for codebase understanding and health assessment.