testing-visual-regression

Detect visual changes in UI components using screenshot comparison. Use when detecting unintended UI changes or pixel differences. Trigger with phrases like "test visual changes", "compare screenshots", or "detect UI regressions".

1,868 stars

by jeremylongshore

Best use case

testing-visual-regression is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using testing-visual-regression should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/testing-visual-regression/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/testing/visual-regression-tester/skills/testing-visual-regression/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/testing-visual-regression/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How testing-visual-regression Compares

Feature / Agent	testing-visual-regression	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

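What does this skill do?

It detects unintended visual changes in UI components by capturing screenshots and comparing them pixel-by-pixel against approved baselines.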

Where can I find the source code?

The source code is on GitHub in the jeremylongshore/claude-code-plugins-plus-skills repository, under plugins/testing/visual-regression-tester/skills/testing-visual-regression/.

SKILL.md Source

# Visual Regression Tester

## Overview

Detect unintended visual changes in UI components by capturing screenshots and comparing them pixel-by-pixel against approved baselines. Supports Playwright visual comparisons, Percy, Chromatic, BackstopJS, and reg-suit.

## Prerequisites

- Browser automation tool installed (Playwright, Puppeteer, or Cypress)
- Visual regression library configured (Playwright `toHaveScreenshot`, Percy, Chromatic, or BackstopJS)
- Baseline screenshots committed to version control or stored in a cloud service
- Storybook or component playground running for isolated component captures (optional)
- Consistent rendering environment (Docker or CI with fixed OS/fonts/GPU settings)

## Instructions

1. Identify all UI components and pages requiring visual coverage using Glob to scan component directories and route definitions.
2. Create a visual test file for each component or page:
   - Navigate to the component URL or Storybook story.
   - Wait for all network requests, animations, and lazy-loaded images to complete.
   - Set a consistent viewport size (e.g., 1280x720 for desktop, 375x812 for mobile).
3. Capture screenshots with deterministic settings:
   - Disable animations and transitions (`* { animation: none !important; transition: none !important; }`).
   - Mask dynamic content (timestamps, random avatars, ads) with CSS overlays.
   - Use `fullPage: true` for scrollable pages.
4. Compare captured screenshots against baselines:
   - Configure pixel difference threshold (recommended: 0.1% for component tests, 0.5% for full-page).
   - Generate diff images highlighting changed regions.
   - Flag tests as failed when differences exceed the threshold.
5. For responsive testing, capture at multiple breakpoints:
   - Mobile: 375px width
   - Tablet: 768px width
   - Desktop: 1280px width
   - Wide: 1920px width
6. Review diff images for each failure and classify as:
   - **Intentional change**: Update the baseline with `--update-snapshots`.
   - **Regression**: File a bug with the diff image attached.
7. Integrate into CI so visual tests run on every pull request with diff images uploaded as artifacts.
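
The comparison in step 4 can be sketched without any tooling. The snippet below is a minimal illustration, not how Playwright or BackstopJS implement their diffing (real tools also account for anti-aliasing and perceptual color distance). The names `compareScreenshots`, `DiffResult`, `COMPONENT_THRESHOLD`, and `FULL_PAGE_THRESHOLD` are hypothetical, and the images are assumed to be raw RGBA buffers of identical dimensions.

```typescript
// Thresholds from step 4, expressed as ratios (0.1% and 0.5% of pixels).
const COMPONENT_THRESHOLD = 0.001;
const FULL_PAGE_THRESHOLD = 0.005;

interface DiffResult {
  diffPixels: number;     // pixels that differ in any channel
  totalPixels: number;
  ratio: number;          // diffPixels / totalPixels, in the range 0..1
  passed: boolean;        // true when ratio does not exceed the threshold
  diffImage: Uint8Array;  // RGBA: changed pixels in opaque red, others transparent
}

// Hypothetical helper: compares two raw RGBA buffers (4 bytes per pixel).
function compareScreenshots(
  baseline: Uint8Array,
  current: Uint8Array,
  maxDiffPixelRatio: number,
): DiffResult {
  if (baseline.length !== current.length || baseline.length % 4 !== 0) {
    throw new Error('Images must be RGBA buffers with identical dimensions');
  }
  const totalPixels = baseline.length / 4;
  const diffImage = new Uint8Array(baseline.length); // zero-filled = transparent
  let diffPixels = 0;

  for (let i = 0; i < baseline.length; i += 4) {
    const differs =
      baseline[i] !== current[i] ||
      baseline[i + 1] !== current[i + 1] ||
      baseline[i + 2] !== current[i + 2] ||
      baseline[i + 3] !== current[i + 3];
    if (differs) {
      diffPixels++;
      diffImage[i] = 255;     // red
      diffImage[i + 3] = 255; // fully opaque
    }
  }

  const ratio = totalPixels === 0 ? 0 : diffPixels / totalPixels;
  return { diffPixels, totalPixels, ratio, passed: ratio <= maxDiffPixelRatio, diffImage };
}

// Usage: 1000 white pixels, two of which change in the current capture.
const baseline = new Uint8Array(1000 * 4).fill(255);
const current = baseline.slice();
current[0] = 0;  // pixel 0, red channel
current[40] = 0; // pixel 10, red channel

const componentCheck = compareScreenshots(baseline, current, COMPONENT_THRESHOLD);
const fullPageCheck = compareScreenshots(baseline, current, FULL_PAGE_THRESHOLD);
console.log(componentCheck.ratio, componentCheck.passed); // 0.002 false
console.log(fullPageCheck.passed);                        // true
```

The same capture fails the stricter component threshold but passes the full-page one, which is why the two thresholds in step 4 are worth keeping separate.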

## Output

- Screenshot baseline images stored in `__screenshots__/` or equivalent directory
- Diff images highlighting pixel-level changes between baseline and current
- Visual regression test report with pass/fail status per component
- CI artifacts containing all captured, baseline, and diff images
- Responsive coverage matrix showing results across breakpoints
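
One possible shape for the responsive coverage matrix is sketched below. It assumes each test run yields a flat list of per-component, per-breakpoint results; `VisualResult`, `buildCoverageMatrix`, and `formatCoverageMatrix` are hypothetical names, and the widths are the breakpoints listed in the instructions.

```typescript
type Breakpoint = 'mobile' | 'tablet' | 'desktop' | 'wide';
type Status = 'pass' | 'fail' | 'missing';

// Widths taken from the breakpoints in the instructions.
const BREAKPOINT_WIDTHS: Record<Breakpoint, number> = {
  mobile: 375,
  tablet: 768,
  desktop: 1280,
  wide: 1920,
};
const BREAKPOINTS = Object.keys(BREAKPOINT_WIDTHS) as Breakpoint[];

// Hypothetical result record emitted by a visual test run.
interface VisualResult {
  component: string;
  breakpoint: Breakpoint;
  passed: boolean;
}

type CoverageMatrix = Record<string, Record<Breakpoint, Status>>;

// Groups results by component; breakpoints with no result stay 'missing'.
function buildCoverageMatrix(results: VisualResult[]): CoverageMatrix {
  const matrix: CoverageMatrix = {};
  for (const result of results) {
    if (!matrix[result.component]) {
      matrix[result.component] = {
        mobile: 'missing',
        tablet: 'missing',
        desktop: 'missing',
        wide: 'missing',
      };
    }
    matrix[result.component][result.breakpoint] = result.passed ? 'pass' : 'fail';
  }
  return matrix;
}

// Renders the matrix as a pipe table, one row per component.
function formatCoverageMatrix(matrix: CoverageMatrix): string {
  const header = `| Component | ${BREAKPOINTS.map((b) => `${b} (${BREAKPOINT_WIDTHS[b]}px)`).join(' | ')} |`;
  const divider = `| --- | ${BREAKPOINTS.map(() => '---').join(' | ')} |`;
  const rows = Object.keys(matrix).map(
    (component) => `| ${component} | ${BREAKPOINTS.map((b) => matrix[component][b]).join(' | ')} |`,
  );
  return [header, divider, ...rows].join('\n');
}

// Usage with two sample results for a single component.
const sampleResults: VisualResult[] = [
  { component: 'Button', breakpoint: 'mobile', passed: true },
  { component: 'Button', breakpoint: 'desktop', passed: false },
];
const coverage = buildCoverageMatrix(sampleResults);
const coverageTable = formatCoverageMatrix(coverage);
console.log(coverageTable);
```

Marking untested breakpoints as `missing` rather than omitting them makes coverage gaps visible alongside failures.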

## Error Handling

| Error | Cause | Solution |
|-------|-------|---------|
| Anti-aliasing differences across OS | Font rendering varies between macOS, Linux, and Windows | Run visual tests in Docker with fixed fonts; use `threshold` option to allow sub-pixel variance |
| Flaky screenshots from animations | CSS transitions or JS animations still running at capture time | Emulate `prefers-reduced-motion: reduce` (in Playwright, `page.emulateMedia({ reducedMotion: 'reduce' })`) or disable animations via `addStyleTag` before capture |
| Missing baseline on first run | No previous screenshot exists to compare against | Run with `--update-snapshots` to create initial baselines; commit them to the repository |
| Viewport size mismatch | Browser chrome or scrollbar width differs between environments | Use `setViewportSize` explicitly; hide scrollbars with CSS `overflow: hidden` |
| Dynamic content causes false failures | Timestamps, user avatars, or ads change between runs | Mask dynamic elements with `mask` option or replace content via `page.evaluate` |
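
When Playwright is the chosen tool, several of these mitigations can be set once in the project config rather than repeated in every test. The fragment below is a sketch under assumptions: the `baseURL`, the project names, and the tablet and wide viewport heights are illustrative values that do not come from this skill, and the thresholds mirror the recommendations in the instructions.

```typescript
// playwright.config.ts (illustrative fragment)
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Assumption: the app under test is served locally on port 3000.
    baseURL: 'http://localhost:3000',
  },
  expect: {
    toHaveScreenshot: {
      // Allow up to 0.1% of pixels to differ (component-level recommendation).
      maxDiffPixelRatio: 0.001,
      // Per-pixel color tolerance, to absorb minor anti-aliasing variance.
      threshold: 0.2,
      // Stop CSS animations and transitions before the capture.
      animations: 'disabled',
    },
  },
  // One project per breakpoint so every test is captured at a fixed viewport.
  projects: [
    { name: 'mobile', use: { viewport: { width: 375, height: 812 } } },
    { name: 'tablet', use: { viewport: { width: 768, height: 1024 } } },   // height assumed
    { name: 'desktop', use: { viewport: { width: 1280, height: 720 } } },
    { name: 'wide', use: { viewport: { width: 1920, height: 1080 } } },    // height assumed
  ],
});
```

Dynamic content still needs handling per test, for example by passing locators to the `mask` option of `toHaveScreenshot`.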

## Examples

**Playwright visual regression test:**
```typescript
import { test, expect } from '@playwright/test';

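test('homepage matches baseline', async ({ page }) => {
  // '/' resolves against `baseURL` from the Playwright config.
  await page.goto('/');
  await page.waitForLoadState('networkidle');
  // Freeze animations and transitions so the capture is deterministic.
  await page.addStyleTag({
    content: '* { animation: none !important; transition: none !important; }',
  });
  await expect(page).toHaveScreenshot('homepage.png', {
    maxDiffPixelRatio: 0.001, // up to 0.1% of pixels may differ
    fullPage: true,
  });
});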
```

**BackstopJS scenario configuration:**
```json
{
  "label": "Login Page",
  "url": "http://localhost:3000/login",
  "selectors": ["document"],
  "misMatchThreshold": 0.1,
  "viewports": [
    { "label": "phone", "width": 375, "height": 812 },
    { "label": "desktop", "width": 1280, "height": 720 }
  ]
}
```

## Resources

- Playwright visual comparisons: https://playwright.dev/docs/test-snapshots
- Percy visual testing: https://www.percy.io/
- Chromatic (Storybook): https://www.chromatic.com/
- BackstopJS: https://github.com/garris/BackstopJS
- reg-suit visual regression: https://reg-viz.github.io/reg-suit/
