visual-review
Take screenshots of web pages and UI using Playwright for visual review and iteration
Best use case
visual-review is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Take screenshots of web pages and UI using Playwright for visual review and iteration
Teams using visual-review should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/visual-review/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How visual-review Compares
| Feature / Agent | visual-review | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Take screenshots of web pages and UI using Playwright for visual review and iteration
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Visual Review
## Purpose
Agents cannot see rendered web pages, presentations, or UI. This skill bridges that gap by using Playwright to capture screenshots of web pages (local files, localhost servers, or deployed URLs) so agents can visually inspect their work, identify layout issues, and iterate.
This is especially useful for:
- **Reveal.js presentations** — capture each slide individually
- **Web UIs** — verify layout, styling, and responsiveness
- **Landing pages** — check visual design matches intent
- **Email templates** — verify rendering
## Prerequisites
This skill requires **Playwright** to be available. Before first use, check and install if needed:
```bash
# Check if Playwright is available
npx playwright --version 2>/dev/null
# If not available, install it (one-time setup)
npx playwright install chromium
```
**Important:** Only the `chromium` browser is needed. Do not install all browsers — it wastes disk space and time.
If `npx playwright` fails entirely, the user may need to install it:
```bash
npm install -g playwright
npx playwright install chromium
```
## Workflow
### Phase 1: Setup
1. **Verify Playwright is available** — run `npx playwright --version`
2. **If not installed** — run `npx playwright install chromium` and inform the user
3. **Create output directory** — `mkdir -p /tmp/visual-review`
4. **Determine the target** — local file path, localhost URL, or deployed URL
### Phase 2: Capture Screenshots
#### For a single page:
```bash
npx playwright screenshot \
--browser chromium \
--viewport-size "1456,816" \
--wait-for-timeout 3000 \
"<URL_OR_FILE_PATH>" \
/tmp/visual-review/page.png
```
**Parameters:**
- `--viewport-size "1456,816"` — 16:9 aspect ratio, good for presentations
- `--wait-for-timeout 3000` — wait 3 seconds for fonts, images, and animations to settle
- Use `file:///absolute/path/to/file.html` for local files
- Use full URLs for deployed sites
#### For Reveal.js presentations (multiple slides):
Capture each slide by appending the slide hash to the URL:
```bash
# Capture all slides (adjust count as needed)
for i in $(seq 0 15); do
npx playwright screenshot \
--browser chromium \
--viewport-size "1456,816" \
--wait-for-timeout 2000 \
"<BASE_URL>#/$i" \
"/tmp/visual-review/slide-$i.png"
done
```
**Tip:** To capture slides with all fragments visible, add `?fragments=true` or use JavaScript injection via a Playwright script. The simpler approach is to accept that fragment-heavy slides will appear in their initial (pre-click) state.
#### For responsive testing:
```bash
# Mobile
npx playwright screenshot --viewport-size "390,844" "<URL>" /tmp/visual-review/mobile.png
# Tablet
npx playwright screenshot --viewport-size "768,1024" "<URL>" /tmp/visual-review/tablet.png
# Desktop wide
npx playwright screenshot --viewport-size "1920,1080" "<URL>" /tmp/visual-review/desktop.png
```
### Phase 3: Show and Review Screenshots
Use the Read tool to analyze each screenshot:
```
Read /tmp/visual-review/slide-0.png
Read /tmp/visual-review/slide-1.png
...
```
**Showing screenshots to the user:** The agent can see screenshots via Read, but the user cannot. When the user wants to see what you see — or when you want to discuss a visual issue together — there are two approaches depending on the environment:
**Local machine (has a display):**
```bash
# Open a specific screenshot
open /tmp/visual-review/slide-2.png
# Open all screenshots at once
open /tmp/visual-review/slide-*.png
```
**Remote/headless machine (SSH, tmux, etc.):**
Serve the screenshots directory over HTTP so the user can view them in their local browser:
```bash
# Start a simple file server (runs in background)
# Use a high port (>10000) to avoid conflicts with dev tools
python3 -m http.server 18080 --directory /tmp/visual-review &
# User can now browse to:
# http://<hostname>:18080/
# http://<hostname>:18080/slide-2.png
# etc.
```
Pick any open high-numbered port (18080, 19090, etc.) — low ports like 8080 are often taken by dev tools. This serves the entire `/tmp/visual-review/` directory with an auto-generated file listing. The user can click through all screenshots in their browser. Stop the server when done:
```bash
# Stop the background server
kill %1
# Or find and kill by port
lsof -ti:18080 | xargs kill
```
Use whichever approach fits the environment. This is useful when collaborating on visual issues, when the user asks what something looks like, or when you want confirmation on a subjective design choice.
For each screenshot, evaluate:
- **Layout** — Is content centered/aligned as intended?
- **Typography** — Are headings, body text, and code readable?
- **Colors** — Do accent colors, backgrounds, and contrasts look right?
- **Spacing** — Is there enough whitespace? Any cramped areas?
- **Content** — Is all expected content visible? (Note: fragment content may be hidden)
- **Responsiveness** — Does it work at the target viewport size?
### Phase 4: Document Findings
Create a findings summary:
```markdown
## Visual Review — [Project Name]
### Reviewed: [date]
| Slide/Page | Status | Notes |
|-----------|--------|-------|
| slide-0 | Good | Title centered, accent color correct |
| slide-1 | Issue | Text overflows right edge on mobile |
| ... | ... | ... |
### Issues to Fix
1. [Description of issue + which slide/page]
2. [...]
### Looks Good
- [Things that are working well]
```
### Phase 5: Fix and Re-capture
After making fixes:
1. Re-capture only the affected slides/pages
2. Review the new screenshots
3. Repeat until satisfied
## Reveal.js-Specific Tips
- **Slide numbering**: Reveal.js uses `#/0` for the first slide, `#/1` for the second, etc.
- **Vertical slides**: Use `#/horizontal/vertical` format (e.g., `#/2/1`)
- **Fragments**: Content with `class="fragment"` is hidden until clicked. Screenshots show the initial state. This is expected behavior, not a bug.
- **Speaker notes**: Not visible in screenshots (they appear in a separate window with `S` key)
- **Theme loading**: Use `--wait-for-timeout 3000` to ensure CDN-loaded themes and fonts are ready
## Common Mistakes
| Mistake | Why It's Wrong | Do This Instead |
|---------|---------------|-----------------|
| Not waiting for page load | Fonts/images may not render | Use `--wait-for-timeout 3000` |
| Using `http://` for local files | File URLs need `file:///` protocol | Use `file:///absolute/path` |
| Installing all Playwright browsers | Wastes time and disk | Only install `chromium` |
| Treating fragment-hidden content as a bug | Fragments are designed to reveal on click | Note it as expected |
| Forgetting to create output directory | Screenshots fail silently | Always `mkdir -p /tmp/visual-review` first |
| Taking too many screenshots at once | Floods context window | Review in batches of 4-6 |
| Not offering to `open` when discussing visuals | User can't see what you're describing | Use `open` when collaborating on visual issues |
## Integration Points
**Upstream:** Any skill that produces visual output (presentations, web UIs, email templates)
**Downstream:** Fixes feed back into the source files, then re-capture to verify
## Verification Checklist
- [ ] Playwright is installed and working
- [ ] Screenshots captured at correct viewport size
- [ ] All pages/slides reviewed
- [ ] Issues documented with specific descriptions
- [ ] Fixes verified with re-captured screenshotsRelated Skills
plan-review-gate
Automatic adversarial review gate that spawns 3 independent reviewers in parallel after any plan is drafted - all must PASS before presenting to user
design-review-gate
Automatic review gate that runs after brainstorming completes - spawns PM, Architect, Designer, Security, and CTO agents in parallel, iterates until all approve
status
Diagnostic status report — shows metaswarm installation state, project setup, and potential issues
start
Use when starting work on any task, when the user mentions metaswarm, or when the user wants to begin tracked development work
setup
Interactive project setup — detects your project, configures metaswarm, writes project-local files
pr-shepherd
Monitor a PR through to merge — handle CI failures, review comments, and thread resolution automatically until all checks pass
orchestrated-execution
4-phase execution loop for work units - IMPLEMENT, VALIDATE, ADVERSARIAL REVIEW, COMMIT
migrate
Migrate from npm-installed metaswarm to the marketplace plugin — removes redundant files with safety checks
handling-pr-comments
Address PR review feedback systematically — fetch inline comments, review bodies, handle outside-diff-range comments, resolve threads with proper attribution
external-tools
Delegate implementation and review tasks to external AI CLI tools (Codex, Gemini) with cross-model adversarial review
create-issue
Create comprehensive GitHub issues with TDD plans, acceptance criteria, and agent instructions for autonomous PR lifecycle management
brainstorming-extension
Enforces design review gate after brainstorming — bridges superpowers:brainstorming into the metaswarm quality pipeline