agent-browser
Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.
Best use case
agent-browser is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "agent-browser" skill to help with this workflow task. Context: Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
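As a more concrete illustration, the sketch below shows the kind of command sequence the skill might drive for a form-filling task. It uses only the `open`, `snapshot`, `fill`, and `click` commands from the quick-start in the SKILL.md source. The function names, the `AB` override variable, and the refs passed in are hypothetical, so treat this as a sketch rather than the skill's actual implementation.

```shell
# Name of the CLI. Overridable so the flow can be dry-run, e.g. AB=echo (hypothetical convention).
AB="${AB:-agent-browser}"

# Step 1: open the page and print a snapshot so the refs (@e1, @e2, ...) can be read.
inspect_page() {
  url="$1"
  "$AB" open "$url" &&
  "$AB" snapshot
}

# Step 2: act on refs chosen from the snapshot output of step 1.
fill_and_click() {
  field_ref="$1"    # ref of the input to fill, taken from the snapshot
  value="$2"        # text to type into that input
  button_ref="$3"   # ref of the button to click, taken from the snapshot
  "$AB" fill "$field_ref" "$value" &&
  "$AB" click "$button_ref"
}
```

A run would call `inspect_page "https://example.com"`, read the refs from the output, and then call something like `fill_and_click @e5 "hello@example.com" @e3`. The two steps are separate because refs are only known once the snapshot has been read.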
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/agent-browser/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
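The manual steps can be scripted. The sketch below assumes SKILL.md has already been downloaded to a local path. The `install_skill` helper and its arguments are hypothetical; only the destination path comes from the steps above.

```shell
# Hypothetical helper: copy a downloaded SKILL.md into the project's skills directory.
install_skill() {
  src="$1"                 # path to the downloaded SKILL.md
  project_root="${2:-.}"   # project directory, defaults to the current one
  dest_dir="$project_root/.claude/skills/agent-browser"

  # Fail early with a non-zero status if the source file is missing.
  [ -f "$src" ] || { echo "SKILL.md not found: $src" >&2; return 1; }

  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/SKILL.md"
}
```

For example, `install_skill ~/Downloads/SKILL.md .` places the file in the current project. The agent still needs a restart afterwards to discover it.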
How agent-browser Compares
| Feature / Agent | agent-browser | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.
Where can I find the source code?
The source code is on GitHub at https://github.com/vercel-labs/agent-browser.
SKILL.md Source
## When to use this skill

Use this skill whenever the user wants to:

- Automate browser interactions (click, fill, navigate, screenshot) via CLI
- Scrape web content or extract data from pages
- Build AI agent workflows that interact with websites
- Use refs-based element selection for deterministic automation
- Run browser automation in agent mode with JSON output
- Manage authenticated sessions with custom headers or CDP

## How to use this skill

This skill is organized to match the agent-browser official documentation structure (https://github.com/vercel-labs/agent-browser/blob/main/README.md). When working with agent-browser:

### Quick-Start Example: Snapshot → Identify → Interact

```bash
# 1. Install (the package is published on npm as "agent-browser")
npm install -g agent-browser

# 2. Open a page and take a snapshot to get element refs
agent-browser open "https://example.com"
agent-browser snapshot
# Output includes refs like @e1, @e2, @e3 for each element

# 3. Click an element by ref
agent-browser click @e3

# 4. Fill a form field
agent-browser fill @e5 "hello@example.com"

# 5. Agent mode (JSON output for programmatic use)
agent-browser snapshot --json
```

### Detailed Documentation

1. **Install agent-browser**:
   - Load `examples/getting-started/installation.md` for installation instructions
2. **Quick Start**:
   - Load `examples/quick-start/quick-start.md` for basic workflow examples
3. **Learn core commands**:
   - Load `examples/commands/basic-commands.md` for basic commands (open, click, fill, etc.)
   - Load `examples/commands/advanced-commands.md` for advanced commands (snapshot, eval, etc.)
   - Load `examples/commands/get-info/` for information retrieval commands
   - Load `examples/commands/check-state/` for state checking commands
   - Load `examples/commands/find-elements/` for semantic locator commands
   - Load `examples/commands/wait/` for wait commands
   - Load `examples/commands/mouse-control/` for mouse control commands
   - Load `examples/commands/browser-settings/` for browser configuration
   - Load `examples/commands/cookies-storage/` for cookies and storage management
   - Load `examples/commands/network/` for network interception
   - Load `examples/commands/tabs-windows/` for tab and window management
   - Load `examples/commands/frames/` for iframe handling
   - Load `examples/commands/dialogs/` for dialog handling
   - Load `examples/commands/debug/` for debugging commands
   - Load `examples/commands/navigation/` for navigation commands
   - Load `examples/commands/setup/` for setup commands
4. **Understand selectors**:
   - Load `examples/selectors/refs.md` for refs-based selection (@e1, @e2, etc.)
   - Load `examples/selectors/traditional-selectors.md` for CSS, XPath, and semantic locators
5. **Use agent mode**:
   - Load `examples/agent-mode/introduction.md` for agent mode overview
   - Load `examples/agent-mode/optimal-workflow.md` for optimal AI workflow
   - Load `examples/agent-mode/integration.md` for integrating with AI agents
6. **Advanced features**:
   - Load `examples/advanced/sessions.md` for session management
   - Load `examples/advanced/headed-mode.md` for debugging with visible browser
   - Load `examples/advanced/authenticated-sessions.md` for authentication via headers
   - Load `examples/advanced/custom-executable.md` for custom browser executable
   - Load `examples/advanced/cdp-mode.md` for Chrome DevTools Protocol integration
   - Load `examples/advanced/streaming.md` for browser viewport streaming
   - Load `examples/advanced/architecture.md` for architecture overview
   - Load `examples/advanced/platforms.md` for platform support
   - Load `examples/advanced/usage-with-agents.md` for AI agent integration patterns
7. **Configure options**:
   - Load `examples/options/global-options.md` for global CLI options
   - Load `examples/options/snapshot-options.md` for snapshot-specific options
   - Load `examples/options/session-options.md` for session management options
8. **Reference API documentation** when needed:
   - `api/commands.md` - Complete command reference
   - `api/selectors.md` - Selector reference
   - `api/options.md` - Options reference
9. **Use templates** for quick start:
   - `templates/basic-automation.md` - Basic automation workflow
   - `templates/ai-agent-workflow.md` - AI agent workflow template

### Doc mapping (one-to-one with official documentation)

- See examples and API files → https://github.com/vercel-labs/agent-browser

## Examples and Templates

This skill includes detailed examples organized to match the official documentation structure. All examples are in the `examples/` directory (see mapping above).

**To use examples:**

- Identify the topic from the user's request
- Load the appropriate example file from the mapping above
- Follow the instructions, syntax, and best practices in that file
- Adapt the code examples to your specific use case

**To use templates:**

- Reference templates in `templates/` directory for common scaffolding
- Adapt templates to your specific needs and coding style

## API Reference

- **Commands API**: `api/commands.md` - Complete command reference with syntax and examples
- **Selectors API**: `api/selectors.md` - Selector types and usage reference
- **Options API**: `api/options.md` - All options reference

## Best Practices

1. **Use Refs**: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation
2. **Snapshot First**: Always snapshot before interacting with elements to get refs
3. **Agent Mode**: Use `--json` flag for machine-readable output in agent mode
4. **Session Management**: Use `--session` to maintain state across commands
5. **Interactive Snapshot**: Use the `-i` flag to limit the snapshot to interactive elements
6. **Semantic Locators**: Use semantic locators (role/name) when refs are not available
7. **Error Handling**: Check command exit codes and error messages
8. **Wait for Navigation**: Commands automatically wait for navigation to complete
9. **Headed Mode**: Use `--headed` for debugging, headless for production
10. **CDP Integration**: Use `--cdp` for Chrome DevTools Protocol integration
11. **Streaming**: Use `AGENT_BROWSER_STREAM_PORT` for live browser preview
12. **Authenticated Sessions**: Use `--headers` for authentication without login flows
13. **Custom Executable**: Use `--executable-path` for serverless deployments or custom browsers
14. **Snapshot Options**: Combine `-i`, `-c`, `-d`, `-s` options to optimize snapshot output

## Resources

- **GitHub Repository**: https://github.com/vercel-labs/agent-browser
- **Official README**: https://github.com/vercel-labs/agent-browser/blob/main/README.md
- **Agent Mode Documentation**: https://agent-browser.dev/agent-mode
- **Issues**: https://github.com/vercel-labs/agent-browser/issues

## Keywords

agent-browser, CLI browser automation, AI agents, browser automation CLI, refs, snapshot, agent mode, semantic locators, browser automation tool, command-line browser, AI agent browser, deterministic selectors, accessibility tree, browser commands, web automation CLI, sessions, headed mode, authenticated sessions, CDP mode, streaming, Chrome DevTools Protocol, Playwright, browser automation for AI
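
Several of the best practices above (a shared `--session`, `--json` output, and exit-code checks) can be combined in one small wrapper. This is a minimal sketch, not part of the skill itself. The `ab_json` function, the `AB` and `SESSION` variables, and the session name `demo` are hypothetical. It also assumes that `--session` is accepted before the subcommand and `--json` after it.

```shell
# Name of the CLI and of the shared session. Both are overridable (hypothetical convention).
AB="${AB:-agent-browser}"
SESSION="${SESSION:-demo}"

# Run one command in the shared session with JSON output.
# On failure, print a message to stderr and return a non-zero status.
ab_json() {
  if ! out="$("$AB" --session "$SESSION" "$@" --json)"; then
    echo "agent-browser command failed: $*" >&2
    return 1
  fi
  printf '%s\n' "$out"
}
```

A script could then run `ab_json open "https://example.com" && ab_json snapshot -i`, which stops at the first failing command and leaves only machine-readable output on stdout.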