browser-screenshot

Captures a screenshot of the current browser page (webpage content only, not the desktop). Use it when you need to show page state, document results, or debug issues. For desktop screenshots, use desktop_screenshot instead.

1,592 stars

by openakita

Best use case

browser-screenshot is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using browser-screenshot should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/browser-screenshot/SKILL.md --create-dirs "https://raw.githubusercontent.com/openakita/openakita/main/skills/system/browser-screenshot/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub
2. Place it in .claude/skills/browser-screenshot/SKILL.md inside your project
3. Restart your AI agent; it will auto-discover the skill
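
The manual steps can also be scripted. The following is a minimal sketch in Python, assuming the raw file URL from the curl command above is still valid; `skill_destination` and `install_skill` are hypothetical helper names introduced here for illustration and are not part of the skill.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Raw file URL copied from the curl command in the installation section.
SKILL_URL = (
    "https://raw.githubusercontent.com/openakita/openakita/main/"
    "skills/system/browser-screenshot/SKILL.md"
)


def skill_destination(project_root: str) -> Path:
    """Return the project-local path where SKILL.md should be placed."""
    return Path(project_root) / ".claude" / "skills" / "browser-screenshot" / "SKILL.md"


def install_skill(project_root: str) -> Path:
    """Download SKILL.md into the project, creating parent directories first."""
    dest = skill_destination(project_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(SKILL_URL, str(dest))  # needs network access
    return dest
```

Calling `install_skill(".")` from the project root places the file where the steps above expect it; the agent still has to be restarted afterwards.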

How browser-screenshot Compares

Feature / Agent	browser-screenshot	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It captures a screenshot of the current browser page (webpage content only, not the desktop). It is meant for showing page state, documenting results, or debugging issues. For desktop screenshots, use desktop_screenshot instead.

Where can I find the source code?

The source code is in the openakita/openakita repository on GitHub, at skills/system/browser-screenshot/SKILL.md.

SKILL.md Source

# Browser Screenshot

Capture a screenshot of the current page.

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | No | Save path (optional; generated automatically if omitted) |

## Examples

**Capture the current page**:
```json
{}
```

**Save to a specific path**:
```json
{"path": "C:/screenshots/result.png"}
```
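
The two payloads differ only in whether `path` is present. A minimal sketch of building them in Python follows; `build_screenshot_args` is a hypothetical helper, and how the JSON is actually passed to the tool depends on the agent runtime, which is not described here.

```python
import json
from typing import Optional


def build_screenshot_args(path: Optional[str] = None) -> str:
    """Serialize the arguments for a browser-screenshot call as JSON."""
    args = {}
    if path is not None:
        # Leave the key out entirely to let the tool pick a path itself.
        args["path"] = path
    return json.dumps(args)


print(build_screenshot_args())
print(build_screenshot_args("C:/screenshots/result.png"))
```

The first call prints `{}` and the second prints `{"path": "C:/screenshots/result.png"}`, matching the two examples.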

## Notes

- Captures browser page content only
- To capture the desktop or other applications, use `desktop_screenshot`

## Workflow

1. After taking the screenshot, read the returned `file_path`
2. Use `deliver_artifacts` to send it to the user
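
The two steps can be sketched as follows. This is illustrative only: `browser_screenshot` and `deliver_artifacts` below are hypothetical Python stand-ins for the real tools, and their signatures and return shapes (other than the `file_path` field named above) are assumptions.

```python
from typing import List, Optional


def browser_screenshot(path: Optional[str] = None) -> dict:
    """Hypothetical stand-in for the screenshot tool (assumed return shape)."""
    # "screenshot.png" is a placeholder for whatever name the tool generates.
    return {"file_path": path or "screenshot.png"}


def deliver_artifacts(file_paths: List[str]) -> dict:
    """Hypothetical stand-in for the delivery tool (assumed signature)."""
    return {"delivered": list(file_paths)}


def capture_and_deliver(path: Optional[str] = None) -> dict:
    """Step 1: take the screenshot and read file_path. Step 2: deliver it."""
    result = browser_screenshot(path)
    file_path = result["file_path"]
    return deliver_artifacts([file_path])
```

The point of the sketch is the ordering: the path passed to delivery comes from the screenshot result rather than being assumed in advance.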

## Related Skills

- `desktop-screenshot`: Capture desktop applications
- `deliver-artifacts`: Send the screenshot to the user

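## Recommendation

For multi-step browser tasks, prefer the `browser_task` tool. It can plan and execute complex browser operations automatically, so you do not have to call each tool manually step by step.

Example: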
```python
browser_task(task="Open Baidu, search for Fuzhou, Fujian, and take a screenshot")
```

Related Skills

desktop-screenshot

1592

from openakita/openakita

Capture Windows desktop screenshot with automatic file saving. When you need to show desktop state, capture application windows, or record operation results. IMPORTANT - must actually call this tool, never say 'screenshot done' without calling. Returns file_path for deliver_artifacts.

browser-type

1592

from openakita/openakita

Type text into input fields on webpage. When you need to fill forms, enter search queries, or input data. PREREQUISITE - must use browser_navigate first. May need to click field first for focus.

browser-task

1592

from openakita/openakita

Smart browser task agent - describe what you want done in natural language and it completes automatically. PREFERRED tool for multi-step browser operations like searching, form filling, and data extraction.

browser-switch-tab

1592

from openakita/openakita

Switch to a specific browser tab by index. When you need to work with a different tab or return to previous page. Use browser_list_tabs to get tab indices.

browser-status

1592

from openakita/openakita

Check browser current state including open status, current URL, page title, tab count. Useful for checking current page URL/title. Note - browser_open already includes status check and auto-starts if needed, so you don't need to call browser_status before browser_open.

browser-open

1592

from openakita/openakita

Launch browser or check its status. Returns current state (is_open, url, title, tab_count). If already running, returns status without restarting. Auto-handles everything - no need to call browser_status first.

browser-new-tab

1592

from openakita/openakita

Open new browser tab and navigate to URL (keeps current page open). When you need to open additional page without closing current, or multi-task across pages. PREREQUISITE - must confirm browser is running first.

browser-navigate

1592

from openakita/openakita

Navigate browser to specified URL to open a webpage. When you need to open webpages or start web automation. PREREQUISITE - must call before browser_click/type operations. Auto-starts browser if not running.

browser-list-tabs

1592

from openakita/openakita

List all open browser tabs with their index, URL and title. When you need to check what pages are open, manage multiple tabs, or find a specific tab to switch to.

browser-get-content

1592

from openakita/openakita

Extract page content and element text from current webpage. When you need to read page information, get element values, scrape data, or verify page content.

browser-click

1592

from openakita/openakita

Click page elements by CSS selector or text content. When you need to click buttons, links, or select options. PREREQUISITE - must use browser_navigate to open target page first.

openakita/skills@yuque-skills

1592

from openakita/openakita

Manage Yuque (语雀) knowledge bases, documents, and team collaboration through API integration. Supports personal search, weekly reports, knowledge base management, document CRUD, and group collaboration workflows. Based on yuque/yuque-skills.