screenshot-tool

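A web page and document screenshot tool. It supports full-page web screenshots and converts PPT/Word/Excel/PDF files to high-resolution images, preserving the original styling with 300 DPI output.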

3,891 stars

Best use case

screenshot-tool is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using screenshot-tool should expect more consistent output, faster repeated execution, less prompt rewriting, and better workflow continuity with their supporting tools.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.
You already have the supporting tools or dependencies needed by this skill.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/screenshot-tool/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/834948655/screenshot-tool/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/screenshot-tool/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How screenshot-tool Compares

Feature / Agent	screenshot-tool	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

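It captures web pages (including full-page screenshots) and converts PPT/Word/Excel/PDF files to high-resolution images, preserving the original styling with 300 DPI output.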

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

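# Screenshot Tool - Web Page & Document Screenshot Tool

Captures web pages and converts documents to high-resolution images while preserving the original styling.

## Features

| Feature | Description |
|------|------|
| **Web page screenshots** | Captures pages with a headless browser; supports full-page and single-page capture |
| **Document to image** | Converts PPT/Word/Excel/PDF to 300 DPI high-resolution images |
| **High-resolution output** | 4000×2250 pixels, suitable for print and presentation |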

## Installing Dependencies

### Required dependencies

| Dependency | Purpose | Install command |
|------|------|---------|
| **agent-browser** | Web page screenshots | `npm install -g agent-browser && agent-browser install` |
| **LibreOffice** | Document to PDF conversion | `sudo apt-get install -y libreoffice-impress libreoffice-writer libreoffice-calc` |
| **poppler-utils** | PDF processing | `sudo apt-get install -y poppler-utils` |
| **Python libraries** | PDF to image conversion | `pip3 install pdf2image pillow` |

### Installation steps

```bash
# 1. Install agent-browser (required for web page screenshots)
npm install -g agent-browser
agent-browser install
agent-browser install --with-deps  # if system dependencies are also needed

# 2. Install LibreOffice (required for document conversion)
sudo apt-get install -y libreoffice-impress libreoffice-writer libreoffice-calc

# 3. Install poppler-utils (required for PDF processing)
sudo apt-get install -y poppler-utils

# 4. Install the Python dependencies
pip3 install pdf2image pillow

# 5. Install Chinese fonts (optional, for Chinese-language documents)
sudo apt-get install -y fonts-wqy-zenhei fonts-wqy-microhei fonts-noto-cjk
```

### Verifying the installation

```bash
# Verify agent-browser
agent-browser --version

# Verify LibreOffice
libreoffice --version

# Verify poppler
which pdftoppm pdfinfo
```
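
The same checks can be scripted so that a missing dependency is reported before a conversion is attempted. The following is a minimal sketch, not part of the skill itself: `find_missing_tools` and `REQUIRED_TOOLS` are hypothetical names, and it assumes the executables are called exactly as in the commands above (on some systems LibreOffice is installed as `soffice` instead).

```python
import shutil

# Executables named in the verification commands above (assumed names).
REQUIRED_TOOLS = ["agent-browser", "libreoffice", "pdftoppm", "pdfinfo"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH, in input order."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.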

## Usage

### 1. Web page screenshots

```bash
# Capture a single web page
python3 skills/screenshot-tool/scripts/web_screenshot.py --url "https://example.com" --output page.png

# Capture the full page with scrolling (for long pages)
python3 skills/screenshot-tool/scripts/web_screenshot.py --url "https://example.com" --full-page --output page.png
```

### 2. Converting documents to images

```bash
# Convert PPT/Word/Excel/PDF to images
python3 skills/screenshot-tool/scripts/doc_screenshot.py --input file.pptx --output-dir ./images

# Set the DPI (default 300)
python3 skills/screenshot-tool/scripts/doc_screenshot.py --input file.pdf --dpi 200 --output-dir ./images
```

### 3. Taking screenshots with agent-browser

```bash
# Open the web page
agent-browser open "https://example.com" --timeout 60000

# Take the screenshot
agent-browser screenshot output.png --full

# Close the browser
agent-browser close
```
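
The three commands above can be wrapped so that the browser is closed even when the screenshot step fails. This is a rough sketch under stated assumptions: `build_capture_commands` and `capture` are hypothetical helpers, only the subcommands and flags shown above are used, and omitting `--full` is assumed (not confirmed by the source) to capture only the visible viewport.

```python
import subprocess


def build_capture_commands(url, output_path, timeout_ms=60000, full_page=True):
    """Build the open / screenshot / close sequence as argument lists."""
    screenshot = ["agent-browser", "screenshot", output_path]
    if full_page:
        screenshot.append("--full")
    return [
        ["agent-browser", "open", url, "--timeout", str(timeout_ms)],
        screenshot,
        ["agent-browser", "close"],
    ]


def capture(url, output_path, runner=subprocess.run, **options):
    """Run the sequence; always attempt `close`, even if an earlier step fails."""
    open_cmd, shot_cmd, close_cmd = build_capture_commands(url, output_path, **options)
    try:
        runner(open_cmd, check=True)
        runner(shot_cmd, check=True)
    finally:
        # A failing close should not mask the original error.
        runner(close_cmd, check=False)
```

Calling `capture("https://example.com", "output.png")` runs the same sequence as the shell commands. Passing argument lists rather than a shell string avoids quoting problems with URLs that contain `&` or `?`.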

## Supported Formats

### Document formats
| Format | Extensions | Status |
|------|--------|------|
| PowerPoint | .pptx, .ppt | ✅ Supported |
| Word | .docx, .doc | ✅ Supported |
| Excel | .xlsx, .xls | ✅ Supported |
| PDF | .pdf | ✅ Supported |
| OpenDocument | .odp, .odt, .ods | ✅ Supported |
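
Because PDFs can be rasterized directly while office formats must first pass through LibreOffice, a caller may want to route files by extension before converting. The sketch below is illustrative only: `conversion_route` and `OFFICE_EXTENSIONS` are hypothetical names, and the extension set is copied from the table above.

```python
from pathlib import Path

# Extensions from the table above; these need LibreOffice to produce a PDF first.
OFFICE_EXTENSIONS = {
    ".pptx", ".ppt", ".docx", ".doc", ".xlsx", ".xls", ".odp", ".odt", ".ods",
}


def conversion_route(path):
    """Classify a file as 'pdf', 'office' or 'unsupported' by its extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in OFFICE_EXTENSIONS:
        return "office"
    return "unsupported"
```

The comparison is case-insensitive, so `deck.PPTX` is routed the same way as `deck.pptx`.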

### Web page screenshots
| Method | Description | Dependency |
|------|------|------|
| agent-browser | Uses headless Chrome | **agent-browser** |
| OpenClaw browser | Built-in browser tool | Built into OpenClaw |

## Output Specifications

| Parameter | Default | Description |
|------|--------|------|
| DPI | 300 | Resolution |
| Format | PNG | Image format |
| Dimensions | 4000×2250 | 16:9 aspect ratio |
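
The pixel dimensions follow from the page size and the DPI: pixels = inches × DPI. The source does not say which page size produces 4000×2250, so the sketch below assumes a 16:9 slide of 13.333 in × 7.5 in (the default widescreen size in PowerPoint); `pixels_at_dpi` is a hypothetical helper. Documents with other page sizes, such as A4, will produce different dimensions at the same DPI.

```python
def pixels_at_dpi(width_in, height_in, dpi=300):
    """Pixel dimensions of a page of the given size (inches) rendered at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed 16:9 slide size: 13.333 in x 7.5 in.
WIDESCREEN_SLIDE_IN = (40 / 3, 7.5)

# A4 paper is 210 mm x 297 mm; 25.4 mm per inch.
A4_IN = (210 / 25.4, 297 / 25.4)

print(pixels_at_dpi(*WIDESCREEN_SLIDE_IN, dpi=300))  # (4000, 2250)
print(pixels_at_dpi(*WIDESCREEN_SLIDE_IN, dpi=200))  # (2667, 1500)
print(pixels_at_dpi(*A4_IN, dpi=300))                # (2480, 3508)
```

Lowering `--dpi` therefore shrinks both dimensions proportionally, which is the main lever for reducing output file size.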

## Examples

### Example 1: Web page screenshot
```bash
# Capture the JD.com home page
python3 skills/screenshot-tool/scripts/web_screenshot.py \
  --url "https://www.jd.com" \
  --output jd_homepage.png \
  --wait 5
```

### Example 2: PPT to images
```bash
# Convert an entire PPT
python3 skills/screenshot-tool/scripts/doc_screenshot.py \
  --input presentation.pptx \
  --output-dir ./slides \
  --dpi 300
```

### Example 3: PDF to images
```bash
# Convert the first 5 pages of a PDF
python3 skills/screenshot-tool/scripts/doc_screenshot.py \
  --input document.pdf \
  --output-dir ./pages \
  --first-page 1 \
  --last-page 5
```
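
The source of `doc_screenshot.py` is not shown here, so the following is only a guess at how the flags used in these examples could be parsed with `argparse`. The flag names and the default DPI of 300 come from the examples above; `build_parser`, the help strings, and the page-range validation are assumptions.

```python
import argparse


def build_parser():
    """Parser mirroring the doc_screenshot.py flags shown in the examples."""
    parser = argparse.ArgumentParser(description="Convert a document to PNG images.")
    parser.add_argument("--input", required=True, help="PPT/Word/Excel/PDF file")
    parser.add_argument("--output-dir", required=True, help="directory for the images")
    parser.add_argument("--dpi", type=int, default=300, help="render resolution")
    parser.add_argument("--first-page", type=int, default=None, help="1-based start page")
    parser.add_argument("--last-page", type=int, default=None, help="1-based end page")
    return parser


def parse_args(argv):
    """Parse `argv` and reject an inverted page range (assumed behaviour)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.first_page and args.last_page and args.first_page > args.last_page:
        parser.error("--first-page must not be greater than --last-page")
    return args
```

With this sketch, `parse_args(["--input", "document.pdf", "--output-dir", "./pages", "--first-page", "1", "--last-page", "5"])` yields `dpi=300`, `first_page=1` and `last_page=5`.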

## How It Works

### Document-to-image pipeline
```
PPT/Word/Excel → LibreOffice → PDF → pdf2image → PNG (300 DPI)
                                         ↑
                              requires: poppler-utils
```
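
The pipeline above can be sketched in a few lines of Python. This is not the skill's actual `doc_screenshot.py`, whose source is not shown; `libreoffice_pdf_command`, `page_image_name` and `document_to_images` are hypothetical helpers, and the output file naming is an assumption. It relies on LibreOffice's `--outdir` option and on `pdf2image.convert_from_path`, which in turn needs poppler-utils installed.

```python
import subprocess
from pathlib import Path


def libreoffice_pdf_command(input_path, out_dir):
    """Argument list for a headless LibreOffice conversion to PDF."""
    return [
        "libreoffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(input_path),
    ]


def page_image_name(stem, page_number, total_pages):
    """Zero-padded file name such as 'deck_03.png' so pages sort correctly."""
    width = max(2, len(str(total_pages)))
    return f"{stem}_{page_number:0{width}d}.png"


def document_to_images(input_path, out_dir, dpi=300, first_page=None, last_page=None):
    """Convert a document to one PNG per page and return the written paths."""
    from pdf2image import convert_from_path  # lazy import; needs poppler-utils

    input_path, out_dir = Path(input_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    pdf_path = input_path
    if input_path.suffix.lower() != ".pdf":
        # Office formats go through LibreOffice, which writes <stem>.pdf to out_dir.
        subprocess.run(libreoffice_pdf_command(input_path, out_dir), check=True)
        pdf_path = out_dir / (input_path.stem + ".pdf")

    pages = convert_from_path(
        str(pdf_path), dpi=dpi, first_page=first_page, last_page=last_page
    )
    start = first_page or 1
    last = start + len(pages) - 1
    written = []
    for offset, image in enumerate(pages):
        target = out_dir / page_image_name(input_path.stem, start + offset, last)
        image.save(target, "PNG")
        written.append(target)
    return written
```

A call such as `document_to_images("presentation.pptx", "./slides")` would mirror Example 2. `convert_from_path` holds every rendered page in memory at once, so for very long documents it may be worth converting in page ranges.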

### Web page screenshot pipeline
```
URL → agent-browser (headless Chrome) → Screenshot → PNG
              ↑
    requires: agent-browser CLI
```

## Troubleshooting

### LibreOffice conversion fails
```bash
# Check the LibreOffice installation
libreoffice --version

# Test a manual conversion
libreoffice --headless --convert-to pdf file.pptx
```

### pdf2image errors
```bash
# Check the poppler installation
which pdftoppm pdfinfo

# Reinstall
sudo apt-get install -y poppler-utils
```

### Chinese font rendering problems
```bash
# Install Chinese fonts
sudo apt-get install -y fonts-wqy-zenhei fonts-wqy-microhei fonts-noto-cjk
```

## File Structure

```
skills/screenshot-tool/
├── SKILL.md              # This file
├── scripts/
│   ├── web_screenshot.py    # Web page screenshot script
│   └── doc_screenshot.py    # Document-to-image script
└── README.md             # Detailed documentation
```

## License

MIT
