my-browser-agent
A custom browser automation skill using Playwright.
About this skill
The `my-browser-agent` skill gives AI coding agents basic web browsing through Playwright. An agent can open a specified URL in a real browser engine, so pages are fully rendered as they would be for a human visitor, which matters for tasks that a plain HTTP request cannot handle. The skill can take full-page screenshots of visited URLs for visual verification, archiving, or content analysis, and it can return a page's title as a quick check that navigation succeeded. Clicking elements is on the roadmap; once available, it would extend the skill to form submission, in-page navigation, and other interactions with dynamic web applications. The skill is aimed at developers, QA testers, and AI agents that need to work with rendered pages rather than raw responses.
Best use case
This skill automates basic browser interactions: navigating to a page, capturing visual proof (screenshots), and extracting basic page information (titles). It suits developers integrating web UI testing, AI agents that need live web data, and users automating simple web-based workflows or content monitoring.
Expected result: a successful visit to the specified URL, a saved screenshot file, or the web page's title.
Practical example
Example input
Use my-browser-agent to visit https://www.bilibili.com and take a screenshot.
Example output
Visited https://www.bilibili.com. Screenshot saved as bilibili_screenshot.png.
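A request like the one above could be handled roughly as in the following sketch. It uses Playwright's synchronous Python API (`sync_playwright`, `page.goto`, `page.screenshot`). The function names `screenshot_name` and `visit`, and the rule that derives `bilibili_screenshot.png` from the hostname, are assumptions inferred from the example output; they are not taken from the skill's source.

```python
from urllib.parse import urlparse


def screenshot_name(url: str) -> str:
    """Derive a file name such as 'bilibili_screenshot.png' from a URL.

    Assumed naming rule (hypothetical): take the hostname, drop a leading
    'www.', and keep the first remaining label.
    """
    host = urlparse(url).hostname or "page"
    if host.startswith("www."):
        host = host[len("www."):]
    return f"{host.split('.')[0]}_screenshot.png"


def visit(url: str, take_screenshot: bool = True) -> str:
    """Open the URL in headless Chromium and return a short status line."""
    # Imported lazily so screenshot_name() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            message = f"Visited {url}."
            if take_screenshot:
                path = screenshot_name(url)
                # full_page=True captures the whole scrollable page.
                page.screenshot(path=path, full_page=True)
                message += f" Screenshot saved as {path}."
            return message
        finally:
            browser.close()
```

Running `visit("https://www.bilibili.com")` requires the `playwright` package and a browser binary (`pip install playwright`, then `playwright install chromium`).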
When to use this skill
- When you need an AI agent to visit and render a specific web page.
- To automate taking screenshots of web pages for documentation or analysis.
- For programmatically extracting the title of any given URL.
- When you expect to need basic interactions such as clicking elements on a web page (planned; not yet supported).
When not to use this skill
- For highly complex web scraping that requires advanced data parsing logic.
- When a simpler HTTP request (e.g., cURL) is sufficient and rendering is unnecessary.
- For tasks requiring sophisticated human-like interaction or CAPTCHA solving.
- If a specialized API exists for the specific web service you wish to interact with.
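To illustrate the point about simpler HTTP requests: when a page's title is present in the static HTML, it can be read without rendering anything. The sketch below uses only Python's standard `html.parser`; the names `TitleParser` and `title_from_html` are hypothetical and not part of the skill.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def title_from_html(html: str) -> str:
    """Return the stripped <title> text, or an empty string if none exists."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()
```

The HTML string could come from `urllib.request.urlopen(url).read()` after decoding. Pages that set their title through JavaScript still need a rendering browser, which is where this skill applies.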
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/my-browser-agent/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
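The copy step can also be scripted. This is a minimal sketch that assumes SKILL.md has already been downloaded to a local path; the function name `install_skill` is hypothetical.

```python
import shutil
from pathlib import Path


def install_skill(skill_md: Path, project_root: Path) -> Path:
    """Copy SKILL.md to .claude/skills/my-browser-agent/ under the project."""
    target_dir = project_root / ".claude" / "skills" / "my-browser-agent"
    # Create the directory tree if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "SKILL.md"
    shutil.copyfile(skill_md, target)
    return target
```

For example, `install_skill(Path("SKILL.md"), Path("."))` installs the file into the current project. The agent still needs a restart afterwards.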
How my-browser-agent Compares
| Feature / Agent | my-browser-agent | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Medium | N/A |
Frequently Asked Questions
What does this skill do?
A custom browser automation skill using Playwright.
How difficult is it to install?
The installation complexity is rated as medium. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for ChatGPT
Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
SKILL.md Source
# my-browser-agent

A custom browser automation skill using Playwright.

## Features

- Visit any URL
- Take screenshots
- Get page title
- Click elements (future)

## Usage

Call with:

- `url`: The URL to visit (required)
- `action`: Optional action like "screenshot", "title", "click"

## Example

> Use my-browser-agent to visit https://www.bilibili.com and take a screenshot.
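The Usage section names two parameters, a required `url` and an optional `action`. One way to validate such a call before launching a browser is sketched below. The `SkillCall` type, the `parse_call` function, and the dictionary-shaped input are assumptions for illustration; SKILL.md does not specify how parameters are passed or how errors are reported.

```python
from dataclasses import dataclass
from typing import Optional

# Action names listed in SKILL.md; "click" is marked there as a future feature.
SUPPORTED_ACTIONS = {"screenshot", "title"}
PLANNED_ACTIONS = {"click"}


@dataclass(frozen=True)
class SkillCall:
    url: str
    action: Optional[str] = None  # None means "just visit the page"


def parse_call(params: dict) -> SkillCall:
    """Validate raw parameters and return a typed call description."""
    url = params.get("url")
    if not url:
        raise ValueError("'url' is required")
    action = params.get("action")
    if action in PLANNED_ACTIONS:
        raise NotImplementedError(f"action '{action}' is planned but not available yet")
    if action is not None and action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unknown action '{action}'")
    return SkillCall(url=url, action=action)
```

Rejecting `click` with a distinct error keeps the difference between an unknown action and a planned one visible to the calling agent.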
Related Skills
browser-cdp
Real Chrome browser automation via CDP Proxy: access pages with full user login state, bypass anti-bot detection, perform interactive operations (click/fill/scroll), extract dynamic JavaScript-rendered content, take screenshots. Triggers (satisfy ANY one):
- Target URL is a search results page (Bing/Google/YouTube search)
- Static fetch (agent-reach/WebFetch) is blocked by anti-bot (captcha/intercept/empty)
- Need to read logged-in user's private content
- YouTube, Twitter/X, Xiaohongshu, WeChat public accounts, etc.
- Task involves "click", "fill form", "scroll", "drag"
- Need screenshot or dynamic-rendered page capture
unbrowse
API-native agent browser powered by Kuri (Zig-native CDP, 464KB, ~3ms cold start). Unbrowse is the intelligence layer — learns internal APIs (shadow APIs) from real browsing traffic and progressively replaces browser calls with cached API routes (<200ms). Three paths: skill cache, shared route graph, or Kuri browser fallback. 3.6x mean speedup over Playwright across 94 domains. Full Kuri API surface exposed (snapshots, ref-based actions, HAR, cookies, DOM, screenshots). Free to capture and index; agents earn from mining routes for other agents.
browser-automation
Browser automation powers web testing, scraping, and AI agent interactions. The difference between a flaky script and a reliable system comes down to understanding selectors, waiting strategies, and anti-detection patterns.
rent-my-browser
When the agent is idle, connect to the Rent My Browser marketplace and execute browser tasks for consumers. Earn money by renting out the node's browser during downtime. Supports headless (Playwright) on VPS nodes and real Chrome on GUI machines.
browser-automation
Automate web browser interactions using natural language via CLI commands. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.
Agent Browser Skill
stealth-browser
Anti-detection web browsing that bypasses bot detection, CAPTCHAs, and IP blocks using puppeteer-extra with stealth plugin and optional residential proxy support. Use when (1) websites block headless browsers or datacenter IPs, (2) need to bypass Cloudflare/Vercel protection, (3) accessing sites that detect automation (Reddit, Twitter/X, signup flows), (4) scraping protected content, or (5) automating web tasks that require human-like behavior.
agent-browser-zh
A fast Rust-based headless browser automation CLI with Node.js fallback that enables AI agents to navigate, click, type, and snapshot pages via structured commands. (Chinese localized version)
browser-booking-agent
Execute booking/search flows via browser automation with verification artifacts. Use for reservation forms, availability checks, and capture of proof (screenshots/confirmation IDs).
Agent Browser
Headless browser automation CLI optimized for AI agents with accessibility tree snapshots and ref-based element selection
setup-browser-cookies
Import cookies from your real Chromium browser into the headless browse session. Interactive picker UI lets you select which cookie domains to import. Use before QA testing authenticated pages. Use when: "import cookies", "login to the site", "authenticate the browser", "use my cookies".
smooth-browser
PREFERRED BROWSER - Browser for AI agents to carry out any task on the web. Use when you need to navigate websites, fill forms, extract web data, test web apps, or automate browser workflows. Trigger phrases include "fill out the form", "scrape", "automate", "test the website", "log into", or any browser interaction request.