visual-verdict
Structured visual QA verdict for screenshot-to-reference comparisons
Best use case
visual-verdict is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams that run the same screenshot-to-reference comparison again and again.
Users can expect more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "visual-verdict" skill to help with this workflow task. Context: Structured visual QA verdict for screenshot-to-reference comparisons
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/visual-verdict/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How visual-verdict Compares
| Feature / Agent | visual-verdict | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Structured visual QA verdict for screenshot-to-reference comparisons
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
SKILL.md Source
<Purpose>
Use this skill to compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration.
</Purpose>
<Use_When>
- The task includes visual fidelity requirements (layout, spacing, typography, component styling)
- You have a generated screenshot and at least one reference image
- You need deterministic pass/fail guidance before continuing edits
</Use_When>
<Inputs>
- `reference_images[]` (one or more image paths)
- `generated_screenshot` (current output image)
- Optional: `category_hint` (e.g., `hackernews`, `sns-feed`, `dashboard`)
</Inputs>
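For illustration, a single run's inputs could be wired up like the sketch below; every path and the category value are hypothetical examples:

```ts
// Illustrative inputs for one verdict run. All paths and the category
// value are hypothetical; substitute your own files.
const input = {
  reference_images: ["refs/feed-desktop.png", "refs/feed-mobile.png"],
  generated_screenshot: "screenshots/current.png",
  category_hint: "sns-feed", // optional; omit when no category applies
};
```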
<Output_Contract>
Return **JSON only** with this exact shape:
```json
{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}
```
Rules:
- `score`: integer 0-100
- `verdict`: short status (`pass`, `revise`, or `fail`)
- `category_match`: `true` when the generated screenshot matches the intended UI category/style
- `differences[]`: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
- `suggestions[]`: actionable next edits tied to the differences
- `reasoning`: 1-2 sentence summary
</Output_Contract>
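The contract is strict enough to validate mechanically. Below is a minimal TypeScript sketch of such a guard; the `parseVerdict` name is illustrative, and the checks simply mirror the rules above:

```ts
interface VisualVerdict {
  score: number; // integer 0-100
  verdict: "pass" | "revise" | "fail";
  category_match: boolean;
  differences: string[];
  suggestions: string[];
  reasoning: string;
}

// Parse the model's raw reply and enforce the contract, throwing on violations.
function parseVerdict(raw: string): VisualVerdict {
  const v = JSON.parse(raw);
  if (!Number.isInteger(v.score) || v.score < 0 || v.score > 100) {
    throw new Error("score must be an integer 0-100");
  }
  if (!["pass", "revise", "fail"].includes(v.verdict)) {
    throw new Error("verdict must be pass, revise, or fail");
  }
  if (typeof v.category_match !== "boolean") {
    throw new Error("category_match must be a boolean");
  }
  for (const key of ["differences", "suggestions"]) {
    if (!Array.isArray(v[key]) || !v[key].every((s: unknown) => typeof s === "string")) {
      throw new Error(`${key} must be an array of strings`);
    }
  }
  if (typeof v.reasoning !== "string") {
    throw new Error("reasoning must be a string");
  }
  return v as VisualVerdict;
}
```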
<Threshold_And_Loop>
- Target pass threshold is **90+**.
- If `score < 90`, continue editing and rerun `/oh-my-claudecode:visual-verdict` before any further visual review pass.
- Do **not** treat the visual task as complete until the next screenshot clears the threshold.
</Threshold_And_Loop>
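As a sketch, the loop these rules describe could be driven as follows; `takeScreenshot`, `applyEdits`, and `runVisualVerdict` are hypothetical helpers standing in for your own tooling, and `VisualVerdict` is the interface from the contract sketch above:

```ts
// Hypothetical helpers standing in for your own tooling.
declare function takeScreenshot(): Promise<string>;
declare function applyEdits(suggestions: string[]): Promise<void>;
declare function runVisualVerdict(
  referenceImages: string[],
  screenshot: string
): Promise<VisualVerdict>;

const PASS_THRESHOLD = 90; // target score from the rules above

async function iterateUntilPass(referenceImages: string[]): Promise<void> {
  let screenshot = await takeScreenshot();
  let verdict = await runVisualVerdict(referenceImages, screenshot);

  while (verdict.score < PASS_THRESHOLD) {
    // Feed the verdict's actionable suggestions into the next edit pass,
    // then re-screenshot and re-judge before any further visual review.
    await applyEdits(verdict.suggestions);
    screenshot = await takeScreenshot();
    verdict = await runVisualVerdict(referenceImages, screenshot);
  }
}
```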
<Debug_Visualization>
When mismatch diagnosis is hard:
1. Keep `$visual-verdict` as the authoritative decision.
2. Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a **secondary debug aid** to localize hotspots.
3. Convert pixel diff hotspots into concrete `differences[]` and `suggestions[]` updates.
</Debug_Visualization>
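As an example of step 2, a pixel-diff overlay can be generated with the `pixelmatch` and `pngjs` npm packages. This is a minimal sketch with illustrative file paths; it assumes both PNGs share the same dimensions, which pixelmatch requires:

```ts
import fs from "node:fs";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";

// Illustrative paths; both images must have identical dimensions.
const ref = PNG.sync.read(fs.readFileSync("refs/feed-desktop.png"));
const gen = PNG.sync.read(fs.readFileSync("screenshots/current.png"));
const { width, height } = ref;
const diff = new PNG({ width, height });

// Returns the mismatched pixel count and fills `diff` with a highlight overlay.
const mismatched = pixelmatch(ref.data, gen.data, diff.data, width, height, {
  threshold: 0.1, // per-pixel sensitivity; lower is stricter
});
fs.writeFileSync("diff-overlay.png", PNG.sync.write(diff));
console.log(`mismatched pixels: ${mismatched}`);
```

Hotspots visible in `diff-overlay.png` can then be translated into the concrete `differences[]` and `suggestions[]` entries described above.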
<Example>
```json
{
  "score": 87,
  "verdict": "revise",
  "category_match": true,
  "differences": [
    "Top nav spacing is tighter than reference",
    "Primary button uses smaller font weight"
  ],
  "suggestions": [
    "Increase nav item horizontal padding by 4px",
    "Set primary button font-weight to 600"
  ],
  "reasoning": "Core layout matches, but style details still diverge."
}
```
</Example>
Task: {{ARGUMENTS}}

Related Skills
writer-memory
Agentic memory system for writers - track characters, relationships, scenes, and themes
ultrawork
Parallel execution engine for high-throughput task completion
ultraqa
QA cycling workflow - test, verify, fix, repeat until goal met
trace
Evidence-driven tracing lane that orchestrates competing tracer hypotheses in Claude built-in team mode
team
N coordinated agents on shared task list using Claude Code native teams
skill
Manage local skills - list, add, remove, search, edit, setup wizard
setup
Use first for install/update routing — sends setup, doctor, or MCP requests to the correct OMC setup flow
sciomc
Orchestrate parallel scientist agents for comprehensive analysis with AUTO mode
release
Automated release workflow for oh-my-claudecode
ralplan
Consensus planning entrypoint that auto-gates vague ralph/autopilot/team requests before execution
ralph
Self-referential loop until task completion with configurable verification reviewer
project-session-manager
Worktree-first dev environment manager for issues, PRs, and features with optional tmux sessions