web-screenshot

Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.

3,791 stars

Best use case

web-screenshot is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.

Teams using web-screenshot should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/sjht-web-screenshot/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aowind/sjht-web-screenshot/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/sjht-web-screenshot/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How web-screenshot Compares

Feature / Agentweb-screenshotStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Capture screenshots of web pages running on local or remote servers using Puppeteer in headless Chromium. Use when user asks to screenshot web pages, capture web UI, take website screenshots, or document web application interfaces. Supports login-required SPAs (Vue/React/Angular) by performing form-based authentication before navigating. Generates screenshots and an optional result.json with per-page descriptions.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# Web Screenshot

Capture screenshots of web pages (especially SPA applications) with automatic login handling.

## Dependencies

- `puppeteer-core` (npm global)
- `chromium-browser` (`/usr/bin/chromium-browser`)
- Node.js

Verify with: `which chromium-browser && npm ls -g puppeteer-core`

## Quick Start

```bash
node <skill_dir>/scripts/screenshot.js <config.json>
```

## Config Format

```json
{
  "baseUrl": "http://192.168.7.66:8080",
  "outputDir": "/root/screenpics/my-capture",
  "resolution": [1920, 1080],
  "login": {
    "url": "/login",
    "usernameSelector": "input[placeholder='请输入用户名']",
    "passwordSelector": "input[type='password']",
    "submitSelector": "button.el-button--primary",
    "credentials": { "username": "admin", "password": "123456" }
  },
  "pages": [
    { "name": "01_dashboard", "path": "/dashboard", "waitMs": 3000 },
    { "name": "02_project_list", "path": "/project/list", "waitMs": 2000 }
  ],
  "descriptions": {
    "01_dashboard": "工作台首页,展示KPI卡片和图表。",
    "02_project_list": "项目管理列表页面。"
  }
}
```

### Login Flow (SPA Authentication)

The script handles Vue/React SPA login by:
1. Navigating to the login page
2. Setting input values via native `HTMLInputElement.value` setter + dispatching `input` events (Vue-reactive compatible)
3. Clicking the submit button
4. Waiting for SPA router navigation (URL change)
5. Using Vue's `$router.push()` for subsequent page navigation (avoids Pinia/Redux store reset on full page reload)

### Fields

| Field | Required | Description |
|-------|----------|-------------|
| `baseUrl` | ✅ | Base URL of the web app |
| `outputDir` | ✅ | Output directory for screenshots |
| `resolution` | No | Viewport size `[width, height]`, default `[1920, 1080]` |
| `login` | No | Login config (skip for public pages) |
| `login.usernameSelector` | ✅* | CSS selector for username input |
| `login.passwordSelector` | ✅* | CSS selector for password input |
| `login.submitSelector` | ✅* | CSS selector for submit button |
| `login.credentials` | ✅* | `{ username, password }` |
| `pages` | ✅ | Array of pages to capture |
| `pages[].name` | ✅ | Filename prefix (e.g. `01_dashboard`) |
| `pages[].path` | ✅ | URL path (e.g. `/dashboard`) |
| `pages[].waitMs` | No | Extra wait in ms after navigation (default 2000) |
| `descriptions` | No | Map of `name` → description text (included in result.json) |

## Output

- `{outputDir}/{name}.png` — one PNG per page
- `{outputDir}/result.json` — metadata with filenames, titles, URLs, descriptions

### result.json Format

```json
{
  "project": "auto-generated",
  "captureDate": "2026-03-22",
  "baseUrl": "...",
  "resolution": "1920x1080",
  "screenshots": [
    {
      "filename": "01_dashboard.png",
      "title": "Dashboard",
      "url": "...",
      "description": "..."
    }
  ]
}
```

## Capture Login Page Too

To include the login page as the first screenshot, add it to `pages` with a special flag:

```json
{
  "pages": [
    { "name": "00_login", "path": "/login", "isLoginPage": true, "waitMs": 2000 }
  ]
}
```

When `isLoginPage: true`, the script captures this page before performing login.

## Advanced: Custom Vue Store Login

If the form-based login doesn't work (e.g., custom auth flow), use `storeLogin` instead:

```json
{
  "login": {
    "url": "/login",
    "storeLogin": {
      "storeName": "user",
      "method": "login",
      "args": ["平台管理员"]
    }
  }
}
```

This directly calls `pinia._s.get(storeName).method(...args)` via CDP.

## Troubleshooting

- **Blank charts (ECharts/Chart.js)**: Headless Chromium has no GPU. Charts using Canvas may render empty. Use `--disable-gpu` (already included).
- **Redirected to login on all pages**: Login failed. Check selectors match the actual form elements. Try `storeLogin` approach.
- **SPA navigation not working**: Ensure `login` section is configured. Without login, `page.goto()` is used instead of `$router.push()`.

Related Skills

app-store-screenshots-generator

3791
from openclaw/skills

Generate production-ready App Store screenshots for iOS apps using AI agents, Next.js, and html-to-image

macpilot-screenshot-ocr

3791
from openclaw/skills

Capture screenshots and extract text via OCR using MacPilot. Take full-screen, region, or window screenshots, and recognize text in images or screen areas with multi-language support.

screenshot-tool

3791
from openclaw/skills

网页截图 + 文档截图工具。支持网页全页截图、PPT/Word/Excel/PDF 转高清图片。保留原始样式,300 DPI 高清输出。

screenshot-ux-auditor

3791
from openclaw/skills

Turn app screenshots into structured UX, copywriting, and conversion audits with issue severity and recommended fixes.

screenshot-to-task

3791
from openclaw/skills

把截图里的待办或灵感整理成任务、备注和优先级。;use for screenshots, tasks, capture workflows;do not use for 伪造截图内容, 替代 OCR 系统.

clawford

3795
from openclaw/skills

Browse the Clawford skill marketplace — verified skill packs with benchmarks that prove they work. Free courses included.

General Utilities

virtualbox

3795
from openclaw/skills

Control and manage VirtualBox virtual machines directly from openclaw. Start, stop, snapshot, clone, configure and monitor VMs using VBoxManage CLI. Supports full lifecycle management including VM creation, network configuration, shared folders, and performance monitoring.

DevOps & Infrastructure

0xarchive

3795
from openclaw/skills

Query historical crypto market data from 0xArchive across Hyperliquid, Lighter.xyz, and HIP-3. Covers orderbooks, trades, candles, funding rates, open interest, liquidations, and data quality. Use when the user asks about crypto market data, orderbooks, trades, funding rates, or historical prices on Hyperliquid, Lighter.xyz, or HIP-3.

Data & Research

image-gen

3795
from openclaw/skills

Generate AI images from text prompts. Triggers on: "生成图片", "画一张", "AI图", "generate image", "配图", "create picture", "draw", "visualize", "generate an image".

Content & Documentation

asr

3795
from openclaw/skills

Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".

General Utilities

listenhub

3795
from openclaw/skills

Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video", "read this aloud", "generate an image", or share knowledge in audio/visual form. Supports: topic descriptions, YouTube links, article URLs, plain text, and image prompts.

Content & Documentation

explainer

3795
from openclaw/skills

Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introduce X (video)", "解释一下XX(视频形式)".

Content & Documentation