powerskills-browser

Edge browser automation via Chrome DevTools Protocol (CDP). List tabs, navigate, take screenshots, extract page content/HTML, execute JavaScript, click elements, type text, fill forms, scroll. Use when needing to control Edge browser, scrape web content, automate web forms, or take browser screenshots on Windows. Requires Edge with --remote-debugging-port=9222.

3,891 stars

Best use case

powerskills-browser is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using powerskills-browser should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/browser/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aloth/powerskills/skills/browser/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/browser/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How powerskills-browser Compares

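| Feature / Agent | powerskills-browser | Standard Approach |
|-----------------|---------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |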

SKILL.md Source

# PowerSkills — Browser

Edge browser automation via CDP (Chrome DevTools Protocol).

## Requirements

- Microsoft Edge running with remote debugging:
  ```powershell
  Start-Process "msedge" -ArgumentList "--remote-debugging-port=9222"
  ```
- Default port configurable in `config.json` (`edge_debug_port`)
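
Before running any action, it can help to confirm that the debugging port is actually answering. The sketch below is not part of `powerskills.ps1`: `Test-EdgeDebugPort` is a hypothetical helper name, and it assumes the default port 9222 and a browser listening on 127.0.0.1. It relies only on the standard CDP HTTP endpoint `/json/version`.

```powershell
# Hypothetical helper, not part of powerskills.ps1.
# Returns $true when a CDP endpoint answers on the given local port.
function Test-EdgeDebugPort {
    param([int] $Port = 9222)
    try {
        # /json/version is a standard CDP HTTP endpoint served by the browser.
        $null = Invoke-RestMethod -Uri "http://127.0.0.1:$Port/json/version" -TimeoutSec 3 -ErrorAction Stop
        return $true
    }
    catch {
        # Connection refused, timeout, or a non-CDP service on that port.
        return $false
    }
}

if (-not (Test-EdgeDebugPort)) {
    Write-Warning "No CDP endpoint found on port 9222. If Edge was already running, the flag may have been ignored; try closing Edge and starting it again with --remote-debugging-port=9222."
}
```

If `edge_debug_port` in `config.json` has been changed, pass the same value with `-Port`.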

## Actions

```powershell
.\powerskills.ps1 browser <action> [--params]
```

| Action | Params | Description |
|--------|--------|-------------|
| `tabs` | | List open browser tabs |
| `navigate` | `--url URL` | Navigate to URL |
| `screenshot` | `--out-file path.png [--target-id id]` | Capture page as PNG |
| `content` | `[--target-id id]` | Get page text content |
| `html` | `[--target-id id]` | Get full page HTML |
| `evaluate` | `--expression "js"` | Execute JavaScript expression |
| `click` | `--selector "#btn"` | Click element by CSS selector |
| `type` | `--selector "#input" --text "hello"` | Type into element |
| `new-tab` | `--url URL` | Open new tab |
| `close-tab` | `--target-id id` | Close tab by ID |
| `scroll` | `--scroll-target top\|bottom\|selector` | Scroll page |
| `fill` | `--fields-json '[{"selector":"#a","value":"b"}]'` | Fill multiple form fields |
| `wait` | `--seconds N` | Wait N seconds (default: 3) |

## Examples

```powershell
# List open tabs
.\powerskills.ps1 browser tabs

# Navigate and screenshot
.\powerskills.ps1 browser navigate --url "https://example.com"
.\powerskills.ps1 browser screenshot --out-file page.png

# Extract page text
.\powerskills.ps1 browser content

# Run JavaScript
.\powerskills.ps1 browser evaluate --expression "document.title"

# Fill a login form
.\powerskills.ps1 browser fill --fields-json '[{"selector":"#user","value":"alex"},{"selector":"#pass","value":"secret","submit":"#login"}]'
```

## Multi-Tab Support

Pass `--target-id` (from `tabs` output) to operate on a specific tab. Without it, actions target the first page.
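
One possible way to pick a tab programmatically is sketched below. The output format of the `tabs` action is not documented here, so this example reads the standard CDP endpoint `/json/list` directly instead, and it assumes that the `id` field CDP reports is the value `--target-id` expects. `Select-PageTarget` is a hypothetical helper name, not part of PowerSkills.

```powershell
# Hypothetical helper, not part of powerskills.ps1.
# Returns the first target of type 'page' whose URL matches a wildcard pattern.
function Select-PageTarget {
    param(
        [Parameter(Mandatory)] [object[]] $Targets,
        [string] $UrlLike = '*'
    )
    $Targets |
        Where-Object { $_.type -eq 'page' -and $_.url -like $UrlLike } |
        Select-Object -First 1
}

# Usage (requires Edge running with remote debugging enabled):
# $targets = Invoke-RestMethod -Uri 'http://127.0.0.1:9222/json/list'
# $tab     = Select-PageTarget -Targets $targets -UrlLike '*example.com*'
# .\powerskills.ps1 browser content --target-id $tab.id
```

Filtering on `type -eq 'page'` skips entries such as service workers and extension background pages, which CDP lists alongside regular tabs.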

## Fill Fields Format

JSON array of objects with `selector`, `value`, and optional `submit`:

```json
[
  {"selector": "#search-input", "value": "PowerShell automation"},
  {"selector": "#filter-type", "value": "recent", "submit": "#apply-btn"}
]
```

Supports text inputs, selects, and checkboxes. The last field can include `submit` with the selector of a button to click.
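
Hand-written JSON inside a quoted string is easy to get wrong. As a sketch, the array can instead be built from PowerShell objects and serialized with `ConvertTo-Json`; the selectors and values below are illustrative placeholders taken from the login example, and the variable names are arbitrary.

```powershell
# Build the field list as ordered hashtables so key order is preserved.
$fields = @(
    [ordered]@{ selector = '#user'; value = 'alex' },
    [ordered]@{ selector = '#pass'; value = 'secret'; submit = '#login' }
)

# -InputObject keeps the result a JSON array even if there is only one field;
# -Compress emits it on a single line, suitable for a command-line argument.
$fieldsJson = ConvertTo-Json -InputObject $fields -Compress

# Usage:
# .\powerskills.ps1 browser fill --fields-json $fieldsJson
```

Passing the array through `-InputObject` rather than the pipeline avoids PowerShell unrolling a single-element array into a bare object.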

Related Skills

my-browser-agent

3891
from openclaw/skills

A custom browser automation skill using Playwright.

Web Automation

rent-my-browser

3891
from openclaw/skills

When the agent is idle, connect to the Rent My Browser marketplace and execute browser tasks for consumers. Earn money by renting out the node's browser during downtime. Supports headless (Playwright) on VPS nodes and real Chrome on GUI machines.

Monetization & Resource Management

browser-cdp

3880
from openclaw/skills

Real Chrome browser automation via CDP Proxy: access pages with full user login state, bypass anti-bot detection, perform interactive operations (click/fill/scroll), extract dynamic JavaScript-rendered content, take screenshots. Triggers (satisfy ANY one):

  • Target URL is a search results page (Bing/Google/YouTube search)
  • Static fetch (agent-reach/WebFetch) is blocked by anti-bot (captcha/intercept/empty)
  • Need to read logged-in user's private content
  • YouTube, Twitter/X, Xiaohongshu, WeChat public accounts, etc.
  • Task involves "click", "fill form", "scroll", "drag"
  • Need screenshot or dynamic-rendered page capture

Web Automation

browser-automation

3891
from openclaw/skills

Automate web browser interactions using natural language via CLI commands, plus 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Agent Browser Skill

3891
from openclaw/skills

stealth-browser

3891
from openclaw/skills

Anti-detection web browsing that bypasses bot detection, CAPTCHAs, and IP blocks using puppeteer-extra with stealth plugin and optional residential proxy support. Use when (1) websites block headless browsers or datacenter IPs, (2) need to bypass Cloudflare/Vercel protection, (3) accessing sites that detect automation (Reddit, Twitter/X, signup flows), (4) scraping protected content, or (5) automating web tasks that require human-like behavior.

agent-browser-zh

3891
from openclaw/skills

A fast Rust-based headless browser automation CLI with Node.js fallback that enables AI agents to navigate, click, type, and snapshot pages via structured commands. (Chinese localized version)

browser-booking-agent

3891
from openclaw/skills

Execute booking/search flows via browser automation with verification artifacts. Use for reservation forms, availability checks, and capture of proof (screenshots/confirmation IDs).

Agent Browser

3891
from openclaw/skills

Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection

setup-browser-cookies

3891
from openclaw/skills

Import cookies from your real Chromium browser into the headless browse session. Interactive picker UI lets you select which cookie domains to import. Use before QA testing authenticated pages. Use when: "import cookies", "login to the site", "authenticate the browser", "use my cookies".

smooth-browser

3891
from openclaw/skills

PREFERRED BROWSER - Browser for AI agents to carry out any task on the web. Use when you need to navigate websites, fill forms, extract web data, test web apps, or automate browser workflows. Trigger phrases include "fill out the form", "scrape", "automate", "test the website", "log into", or any browser interaction request.

human-browser-use Skill

3891
from openclaw/skills

Human-like browser automation extension for [browser-use](https://github.com/browser-use/browser-use).