agent-eyes

Visual context analyzer for AI agents. Provides screenshots, accessibility scans, DOM snapshots, and element descriptions for web pages. Use when you need to see what a web page looks like, analyze accessibility issues, inspect DOM structure, or get detailed element information. Triggers on requests like "take a screenshot", "check accessibility", "what does this page look like", "analyze the UI", "inspect this element", or any visual/UI analysis task.

Best use case

agent-eyes is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using agent-eyes should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/agent-eyes/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/agent-eyes/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/agent-eyes/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How agent-eyes Compares

| Feature / Agent | agent-eyes | Standard Approach |
|-----------------|------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Visual context analyzer for AI agents. Provides screenshots, accessibility scans, DOM snapshots, and element descriptions for web pages. Use when you need to see what a web page looks like, analyze accessibility issues, inspect DOM structure, or get detailed element information. Triggers on requests like "take a screenshot", "check accessibility", "what does this page look like", "analyze the UI", "inspect this element", or any visual/UI analysis task.

Where can I find the source code?

The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/ai-agents/agent-eyes, the same location the installation command downloads SKILL.md from.

SKILL.md Source

# Agent Eyes

Visual context analyzer for web pages. Provides AI agents with the ability to "see" web applications through screenshots, accessibility scans, DOM snapshots, and element descriptions.

## Prerequisites

- Python 3.10+
- `uv` package manager (recommended)
- Playwright browsers installed: `playwright install chromium`

## Compact Mode (Token-Efficient Output)

**All commands support the `--compact` / `-c` flag** for token-efficient output:

| Mode | Screenshot | DOM | A11y | Total Tokens |
|------|------------|-----|------|--------------|
| Standard | Base64 inline | depth=5, 20 children | Full violations | ~500K+ |
| **Compact** | File path only | depth=3, 10 children | Summary only | **~3-5K** |

Use compact mode when context window size is a concern (which is most of the time).

```bash
# Compact context - reduces ~500K tokens to ~3-5K tokens
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact

# Compact screenshot - always saves to file, never returns base64
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --compact

# Compact a11y - returns summary + top N issues only
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact

# Compact DOM - stricter limits on depth and children
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --compact
```

## Commands

All commands use `uv run` for automatic dependency management. Every example in this document assumes `SKILL_DIR` points at the skill's scripts directory:

```bash
SKILL_DIR=".claude/skills/agent-eyes/scripts"
```

### Screenshot

Capture full page or element screenshots:

```bash
# Full page screenshot (saves to .canvas/screenshots/)
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000

# Element screenshot
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --selector ".hero"

# Save to specific path
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --output ./tmp/page.png

# Get as base64 (for inline context) - NOT recommended, use --compact instead
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --base64

# RECOMMENDED: Compact mode - always saves to file, never returns base64
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --compact
```

### Accessibility Scan

Run axe-core accessibility analysis:

```bash
# Full page scan (WCAG 2.1 AA)
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000

# Scoped to element
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --selector "main"

# WCAG AAA level
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --level AAA

# RECOMMENDED: Compact mode - summary + top issues only (~1-2K tokens vs 100K+)
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact --max-issues 5
```

### DOM Snapshot

Get simplified DOM tree:

```bash
# Full page DOM
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000

# Subtree only
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --selector ".content"

# Control depth and children
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --depth 3 --max-children 10

# RECOMMENDED: Compact mode - depth=3, max-children=10, text=50 chars
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --compact
```

### Describe Element

Get detailed element information (styles, bounding box, attributes):

```bash
uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".hero-button"
```

### Full Context

Get comprehensive context bundle (screenshot + a11y + DOM + description):

```bash
# Full context for page
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000

# Focused on element
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --selector ".hero"

# Without screenshot (smaller output)
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --no-screenshot

# RECOMMENDED: Compact mode - file paths only, limited DOM/a11y (~3-5K tokens)
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact

# Compact with custom limits
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact \
  --dom-depth 2 --max-children 5 --max-issues 5
```
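When an agent drives these commands from Python rather than a shell, the invocations above can be assembled as argument lists. The sketch below is illustrative only: `build_command` is a hypothetical helper name, and it uses only the `--selector` and `--compact` flags documented above.

```python
from __future__ import annotations

# Same value the SKILL_DIR shell variable holds in the examples above.
SKILL_DIR = ".claude/skills/agent-eyes/scripts"


def build_command(subcommand: str, url: str, *, compact: bool = True,
                  selector: str | None = None) -> list[str]:
    """Return the argv list for one agent_eyes.py invocation (hypothetical helper)."""
    argv = ["uv", "run", f"{SKILL_DIR}/agent_eyes.py", subcommand, url]
    if selector is not None:
        argv += ["--selector", selector]
    if compact:
        # Compact is the default here because it is the recommended mode.
        argv.append("--compact")
    return argv


print(build_command("context", "http://localhost:3000", selector=".hero"))
```

The returned list can be passed directly to `subprocess.run(argv, capture_output=True, text=True)`, which avoids shell quoting problems with selectors such as `".hero > a"`.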

## Output Format

All commands return JSON to stdout:

```json
{
  "ok": true,
  "...": "command-specific fields"
}
```

On error:

```json
{
  "ok": false,
  "error": "Error description"
}
```
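Because every command reports success through the `ok` field, a caller can check it once in a small wrapper. The following is a minimal sketch, not part of the tool: `parse_result` and `AgentEyesError` are hypothetical names, and the only fields assumed are the `ok` and `error` keys shown above.

```python
import json


class AgentEyesError(RuntimeError):
    """Raised when a command reports "ok": false (hypothetical exception type)."""


def parse_result(stdout: str) -> dict:
    """Decode a command's stdout and fail loudly on an error payload."""
    payload = json.loads(stdout)
    if not payload.get("ok", False):
        # "error" is the field documented for failed commands.
        raise AgentEyesError(payload.get("error", "unknown error"))
    return payload


result = parse_result('{"ok": true, "title": "My App"}')
print(result["title"])
```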

### Compact Mode Output Examples

**Compact context output** (~3-5K tokens instead of ~500K):

```json
{
  "ok": true,
  "url": "http://localhost:3000",
  "title": "My App",
  "timestamp": "2026-01-22T10-30-00-000Z",
  "compact": true,
  "screenshot_path": ".canvas/screenshots/2026-01-22T10-30-00-000Z.png",
  "screenshot_size": 443281,
  "dom": {
    "tag": "body",
    "children": [...]
  },
  "a11y_summary": {
    "total_violations": 5,
    "by_severity": {"critical": 1, "serious": 2, "moderate": 2, "minor": 0},
    "top_issues": [
      {"id": "color-contrast", "impact": "serious", "affected_count": 3}
    ]
  }
}
```

**Compact a11y output** (~1-2K tokens instead of ~100K):

```json
{
  "ok": true,
  "total_violations": 15,
  "by_severity": {"critical": 2, "serious": 5, "moderate": 6, "minor": 2},
  "by_category": {"color": 3, "aria": 5, "keyboard": 2},
  "top_issues": [
    {
      "id": "color-contrast",
      "impact": "serious",
      "description": "Elements must have sufficient color contrast...",
      "affected_count": 3,
      "help_url": "https://dequeuniversity.com/rules/axe/..."
    }
  ],
  "passes": 42,
  "incomplete": 3
}
```
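The `by_severity` counts in the compact a11y output are enough to drive a simple pass/fail gate, for example in CI. This is a sketch under the assumption that the payload has the shape shown above. The helper names and the choice of which severities count as blocking are illustrative, not something the tool defines.

```python
from __future__ import annotations

# Severity levels in descending order, matching the keys in "by_severity".
SEVERITY_ORDER = ["critical", "serious", "moderate", "minor"]


def worst_severity(summary: dict) -> str | None:
    """Return the highest severity with at least one violation, or None."""
    by_severity = summary.get("by_severity", {})
    for level in SEVERITY_ORDER:
        if by_severity.get(level, 0) > 0:
            return level
    return None


def blocking_count(summary: dict, blocking=("critical", "serious")) -> int:
    """Count violations at the severities treated as blocking (an assumed policy)."""
    by_severity = summary.get("by_severity", {})
    return sum(by_severity.get(level, 0) for level in blocking)


# Counts copied from the compact a11y example above.
sample = {
    "ok": True,
    "total_violations": 15,
    "by_severity": {"critical": 2, "serious": 5, "moderate": 6, "minor": 2},
}
print(worst_severity(sample), blocking_count(sample))
```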

## Typical Agent Workflow

1. **Start dev server** (if not running):
   ```bash
   npm run dev &
   ```

2. **Take initial screenshot** to see current state:
   ```bash
   uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000
   ```

3. **Run accessibility scan** to find issues:
   ```bash
   uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
   ```

4. **Inspect specific element** for details:
   ```bash
   uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".problematic-button"
   ```

5. **Get full context** for comprehensive analysis:
   ```bash
   uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --selector ".hero"
   ```

## Example: Analyze and Fix A11y Issues

```bash
# 1. Get accessibility violations
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000

# Output shows violations like:
# {
#   "ok": true,
#   "violations": [
#     {
#       "id": "color-contrast",
#       "impact": "serious",
#       "description": "Elements must have sufficient color contrast",
#       "nodes": [{"html": "<button class='cta'>..."}]
#     }
#   ]
# }

# 2. Describe the element to understand current styles
uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".cta"

# 3. Make code changes to fix the contrast issue

# 4. Re-run a11y to verify fix
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
```
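Steps 1 and 4 of the example can be compared programmatically. The sketch below assumes the standard-mode output has the `violations` / `nodes` / `html` shape shown in the comment above. The function names are hypothetical and the sample data is illustrative, not real scan output.

```python
from __future__ import annotations


def offending_html(result: dict, rule_id: str) -> list[str]:
    """Collect the HTML snippets of every node that violates one rule."""
    return [
        node["html"]
        for violation in result.get("violations", [])
        if violation.get("id") == rule_id
        for node in violation.get("nodes", [])
    ]


def rules_fixed(before: dict, after: dict) -> set[str]:
    """Return rule ids present in the first scan but absent from the re-run."""
    def ids(result: dict) -> set[str]:
        return {v["id"] for v in result.get("violations", [])}
    return ids(before) - ids(after)


# Illustrative payloads: "before" mirrors the example output, "after" is a clean re-run.
before = {
    "ok": True,
    "violations": [
        {
            "id": "color-contrast",
            "impact": "serious",
            "description": "Elements must have sufficient color contrast",
            "nodes": [{"html": "<button class='cta'>..."}],
        }
    ],
}
after = {"ok": True, "violations": []}

print(offending_html(before, "color-contrast"))
print(rules_fixed(before, after))
```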

## Notes

- Screenshots are saved to `.canvas/screenshots/` by default with ISO timestamps
- The tool runs headless Chromium via Playwright
- All commands wait for `networkidle` before capturing
- DOM snapshots are simplified to reduce output size
- A11y scans use axe-core, a widely used open-source accessibility testing engine

## Token Budget Guide

| Operation | Standard Mode | Compact Mode |
|-----------|---------------|--------------|
| Screenshot | ~100-470K tokens (base64) | ~50 tokens (path only) |
| DOM Snapshot | ~50-150K tokens | ~2-3K tokens |
| A11y Scan | ~50-100K tokens | ~0.5-1K tokens |
| Full Context | ~500K+ tokens | **~3-5K tokens** |

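**Recommendation**: Use the `--compact` flag unless you specifically need base64 data for inline image processing or full violation details. For a full context bundle, compact mode cuts token usage by roughly **99%** (~500K down to ~3-5K) while keeping the screenshot path, a trimmed DOM tree, and an accessibility summary.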
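The arithmetic behind these figures can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the function names are hypothetical, the 443281 figure is the `screenshot_size` from the compact example (assumed to be a byte count), and no particular tokenizer is assumed, so only the base64 character count is computed.

```python
import math


def base64_length(num_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


def reduction_percent(standard_tokens: int, compact_tokens: int) -> float:
    """Percentage of tokens saved by switching from standard to compact mode."""
    return 100 * (1 - compact_tokens / standard_tokens)


# A 443281-byte PNG becomes 591044 base64 characters before any tokenization.
print(base64_length(443281))

# Full Context row: ~500K standard versus ~3-5K compact.
print(round(reduction_percent(500_000, 5_000), 1))
print(round(reduction_percent(500_000, 3_000), 1))
```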

### When to Use Each Mode

| Mode | Use Case |
|------|----------|
| Standard | Debugging, cases that need full HTML snippets, feeding a screenshot to a vision model |
| **Compact** | Most agent workflows, design reviews, accessibility audits, CI/CD pipelines |
