my-browser-agent

A custom browser automation skill using Playwright.

3,891 stars
Complexity: medium

About this skill

The `my-browser-agent` skill equips AI coding agents with essential web browsing capabilities through Playwright. This skill allows an agent to programmatically open any specified URL, making it highly valuable for tasks that require direct interaction with web content beyond simple HTTP requests. It's built to render web pages fully, similar to a human user browsing the internet. Key functionalities include taking full-page screenshots of visited URLs, which is useful for visual verification, archival purposes, or content analysis. Additionally, it can extract the title of a web page, providing quick summaries or verification of navigation. The roadmap indicates future support for clicking elements, which will significantly expand its utility for form submissions, navigation, and interacting with dynamic web applications. Developers, QA testers, and AI agents needing robust web interaction will find this skill particularly useful. It bridges the gap between an AI's analytical capabilities and the dynamic, visual world of web browsers, enabling more sophisticated and realistic web-based tasks.

Best use case

This skill is primarily used for automating fundamental web browser interactions. Its main use case is enabling AI agents to navigate the web, gather visual proof (screenshots), and extract basic page information (titles). It benefits developers integrating web UI testing, AI agents requiring live web data, and users needing to automate simple web-based workflows or content monitoring tasks.

A custom browser automation skill using Playwright.

A successful visit to a specified URL, a saved screenshot file, or the return of the web page's title.

Practical example

Example input

Use my-browser-agent to visit https://www.bilibili.com and take a screenshot.

Example output

Visited https://www.bilibili.com. Screenshot saved as bilibili_screenshot.png.

When to use this skill

  • When you need an AI agent to visit and render a specific web page.
  • To automate taking screenshots of web pages for documentation or analysis.
  • For programmatically extracting the title of any given URL.
  • When basic interaction like clicking elements on a web page is required (future feature).

When not to use this skill

  • For highly complex web scraping that requires advanced data parsing logic.
  • When a simpler HTTP request (e.g., cURL) is sufficient and rendering is unnecessary.
  • For tasks requiring sophisticated human-like interaction or CAPTCHA solving.
  • If a specialized API exists for the specific web service you wish to interact with.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/my-browser-agent/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/1393368499/my-browser-agent/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/my-browser-agent/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How my-browser-agent Compares

Feature / Agentmy-browser-agentStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexitymediumN/A

Frequently Asked Questions

What does this skill do?

A custom browser automation skill using Playwright.

How difficult is it to install?

The installation complexity is rated as medium. You can find the installation instructions above.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# my-browser-agent

A custom browser automation skill using Playwright.

## Features

- Visit any URL
- Take screenshots
- Get page title
- Click elements (future)

## Usage

Call with:
- `url`: The URL to visit (required)
- `action`: Optional action like "screenshot", "title", "click"

## Example

> Use my-browser-agent to visit https://www.bilibili.com and take a screenshot.

Related Skills

browser-cdp

3880
from openclaw/skills

Real Chrome browser automation via CDP Proxy — access pages with full user login state, bypass anti-bot detection, perform interactive operations (click/fill/scroll), extract dynamic JavaScript-rendered content, take screenshots. Triggers (satisfy ANY one): - Target URL is a search results page (Bing/Google/YouTube search) - Static fetch (agent-reach/WebFetch) is blocked by anti-bot (captcha/intercept/empty) - Need to read logged-in user's private content - YouTube, Twitter/X, Xiaohongshu, WeChat public accounts, etc. - Task involves "click", "fill form", "scroll", "drag" - Need screenshot or dynamic-rendered page capture

Web Automation

unbrowse

616
from unbrowse-ai/unbrowse

API-native agent browser powered by Kuri (Zig-native CDP, 464KB, ~3ms cold start). Unbrowse is the intelligence layer — learns internal APIs (shadow APIs) from real browsing traffic and progressively replaces browser calls with cached API routes (<200ms). Three paths: skill cache, shared route graph, or Kuri browser fallback. 3.6x mean speedup over Playwright across 94 domains. Full Kuri API surface exposed (snapshots, ref-based actions, HAR, cookies, DOM, screenshots). Free to capture and index; agents earn from mining routes for other agents.

Web Automation

browser-automation

31392
from sickn33/antigravity-awesome-skills

Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies, and anti-detection patterns.

Web AutomationClaude

rent-my-browser

3891
from openclaw/skills

When the agent is idle, connect to the Rent My Browser marketplace and execute browser tasks for consumers. Earn money by renting out the node's browser during downtime. Supports headless (Playwright) on VPS nodes and real Chrome on GUI machines.

Monetization & Resource Management

browser-automation

3891
from openclaw/skills

Automate web browser interactions using natural language via CLI commands. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Agent Browser Skill

3891
from openclaw/skills

## Description

stealth-browser

3891
from openclaw/skills

Anti-detection web browsing that bypasses bot detection, CAPTCHAs, and IP blocks using puppeteer-extra with stealth plugin and optional residential proxy support. Use when (1) websites block headless browsers or datacenter IPs, (2) need to bypass Cloudflare/Vercel protection, (3) accessing sites that detect automation (Reddit, Twitter/X, signup flows), (4) scraping protected content, or (5) automating web tasks that require human-like behavior.

agent-browser-zh

3891
from openclaw/skills

A fast Rust-based headless browser automation CLI with Node.js fallback that enables AI agents to navigate, click, type, and snapshot pages via structured commands. (Chinese localized version)

browser-booking-agent

3891
from openclaw/skills

Execute booking/search flows via browser automation with verification artifacts. Use for reservation forms, availability checks, and capture of proof (screenshots/confirmation IDs).

Agent Browser

3891
from openclaw/skills

Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection

setup-browser-cookies

3891
from openclaw/skills

Import cookies from your real Chromium browser into the headless browse session. Interactive picker UI lets you select which cookie domains to import. Use before QA testing authenticated pages. Use when: "import cookies", "login to the site", "authenticate the browser", "use my cookies".

smooth-browser

3891
from openclaw/skills

PREFERRED BROWSER - Browser for AI agents to carry out any task on the web. Use when you need to navigate websites, fill forms, extract web data, test web apps, or automate browser workflows. Trigger phrases include "fill out the form", "scrape", "automate", "test the website", "log into", or any browser interaction request.