agent-browser

Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.

261 stars

by partme-ai

Best use case

agent-browser is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "agent-browser" skill to help with this workflow task. Context: Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/agent-browser/SKILL.md --create-dirs "https://raw.githubusercontent.com/partme-ai/full-stack-skills/main/skills/dev-utils-skills/agent-browser/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/agent-browser/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill
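
The manual steps above can be sketched as a small shell helper. This is a minimal sketch, not an official installer: `skill_path` and `install_skill` are hypothetical names introduced here, and the URL is the one from the curl command in the Installation section.

```bash
# Sketch only: skill_path and install_skill are hypothetical helper names.
SKILL_URL="https://raw.githubusercontent.com/partme-ai/full-stack-skills/main/skills/dev-utils-skills/agent-browser/SKILL.md"

# Print where SKILL.md should live inside a project (default: current directory).
skill_path() {
  printf '%s/.claude/skills/agent-browser/SKILL.md\n' "${1:-.}"
}

# Download SKILL.md into the project. --create-dirs makes missing parent
# directories; -f turns HTTP errors into a non-zero exit code.
install_skill() {
  curl -fsSL --create-dirs -o "$(skill_path "$1")" "$SKILL_URL"
}
```

Running `install_skill /path/to/project` and then restarting the agent would cover all three steps.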

How agent-browser Compares

Feature / Agent	agent-browser	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

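What does this skill do?

It guides an AI agent in automating a browser from the command line with agent-browser: navigating, clicking, filling forms, taking snapshots, and selecting elements by ref.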

Where can I find the source code?

The skill's source is on GitHub in the partme-ai/full-stack-skills repository, under skills/dev-utils-skills/agent-browser.

SKILL.md Source

## When to use this skill

Use this skill whenever the user wants to:
- Automate browser interactions (click, fill, navigate, screenshot) via CLI
- Scrape web content or extract data from pages
- Build AI agent workflows that interact with websites
- Use refs-based element selection for deterministic automation
- Run browser automation in agent mode with JSON output
- Manage authenticated sessions with custom headers or CDP

## How to use this skill

This skill is organized to match the structure of the official agent-browser documentation (https://github.com/vercel-labs/agent-browser/blob/main/README.md). Start with the quick-start example below, then load the detailed documentation for the topic at hand.

### Quick-Start Example: Snapshot → Identify → Interact

```bash
# 1. Install the CLI (published on npm as agent-browser), then download Chromium
npm install -g agent-browser
agent-browser install

# 2. Open a page and take a snapshot to get element refs
agent-browser open "https://example.com"
agent-browser snapshot
# Output includes refs like @e1, @e2, @e3 for each element

# 3. Click an element by ref
agent-browser click @e3

# 4. Fill a form field
agent-browser fill @e5 "hello@example.com"

# 5. Agent mode (JSON output for programmatic use)
agent-browser snapshot --json
```
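
When the quick-start steps run from a script, it helps to stop at the first failing command. The following is a minimal sketch under stated assumptions: `run_step`, `open_and_snapshot`, and the `AB` variable are hypothetical names introduced here, not part of agent-browser, and only the `open` and `snapshot` commands shown above are used.

```bash
# Sketch only: run_step, open_and_snapshot, and AB are hypothetical names.
AB="${AB:-agent-browser}"

# Run one agent-browser command. On failure, report the command on stderr and
# propagate its exit code so the calling script can stop early.
run_step() {
  "$AB" "$@" || {
    status=$?
    echo "agent-browser step failed (exit $status): $*" >&2
    return "$status"
  }
}

# Snapshot first, then interact using refs read from that snapshot.
# Refs such as @e3 depend on the page, so they are not hard-coded here.
open_and_snapshot() {
  run_step open "$1" && run_step snapshot
}
```

A caller could then write `open_and_snapshot "https://example.com" && run_step click @e3`, using whichever ref the snapshot actually reported.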

### Detailed Documentation

1. **Install agent-browser**:
   - Load `examples/getting-started/installation.md` for installation instructions

2. **Quick Start**:
   - Load `examples/quick-start/quick-start.md` for basic workflow examples

3. **Learn core commands**:
   - Load `examples/commands/basic-commands.md` for basic commands (open, click, fill, etc.)
   - Load `examples/commands/advanced-commands.md` for advanced commands (snapshot, eval, etc.)
   - Load `examples/commands/get-info/` for information retrieval commands
   - Load `examples/commands/check-state/` for state checking commands
   - Load `examples/commands/find-elements/` for semantic locator commands
   - Load `examples/commands/wait/` for wait commands
   - Load `examples/commands/mouse-control/` for mouse control commands
   - Load `examples/commands/browser-settings/` for browser configuration
   - Load `examples/commands/cookies-storage/` for cookies and storage management
   - Load `examples/commands/network/` for network interception
   - Load `examples/commands/tabs-windows/` for tab and window management
   - Load `examples/commands/frames/` for iframe handling
   - Load `examples/commands/dialogs/` for dialog handling
   - Load `examples/commands/debug/` for debugging commands
   - Load `examples/commands/navigation/` for navigation commands
   - Load `examples/commands/setup/` for setup commands

4. **Understand selectors**:
   - Load `examples/selectors/refs.md` for refs-based selection (@e1, @e2, etc.)
   - Load `examples/selectors/traditional-selectors.md` for CSS, XPath, and semantic locators

5. **Use agent mode**:
   - Load `examples/agent-mode/introduction.md` for agent mode overview
   - Load `examples/agent-mode/optimal-workflow.md` for optimal AI workflow
   - Load `examples/agent-mode/integration.md` for integrating with AI agents

6. **Advanced features**:
   - Load `examples/advanced/sessions.md` for session management
   - Load `examples/advanced/headed-mode.md` for debugging with visible browser
   - Load `examples/advanced/authenticated-sessions.md` for authentication via headers
   - Load `examples/advanced/custom-executable.md` for custom browser executable
   - Load `examples/advanced/cdp-mode.md` for Chrome DevTools Protocol integration
   - Load `examples/advanced/streaming.md` for browser viewport streaming
   - Load `examples/advanced/architecture.md` for architecture overview
   - Load `examples/advanced/platforms.md` for platform support
   - Load `examples/advanced/usage-with-agents.md` for AI agent integration patterns

7. **Configure options**:
   - Load `examples/options/global-options.md` for global CLI options
   - Load `examples/options/snapshot-options.md` for snapshot-specific options
   - Load `examples/options/session-options.md` for session management options

8. **Reference API documentation** when needed:
   - `api/commands.md` - Complete command reference
   - `api/selectors.md` - Selector reference
   - `api/options.md` - Options reference

9. **Use templates** for quick start:
   - `templates/basic-automation.md` - Basic automation workflow
   - `templates/ai-agent-workflow.md` - AI agent workflow template


### Doc mapping (one-to-one with official documentation)

- See examples and API files → https://github.com/vercel-labs/agent-browser

## Examples and Templates

This skill includes detailed examples organized to match the official documentation structure. All examples are in the `examples/` directory (see mapping above).

**To use examples:**
- Identify the topic from the user's request
- Load the appropriate example file from the mapping above
- Follow the instructions, syntax, and best practices in that file
- Adapt the code examples to your specific use case
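
The lookup described above (topic in, example file out) can be sketched as a `case` statement. This is an illustration only: `example_path` and its topic keywords are hypothetical, the paths are copied from the Detailed Documentation list, and falling back to `api/commands.md` is an assumption rather than a rule stated by the skill.

```bash
# Sketch only: example_path and the topic keywords are hypothetical.
example_path() {
  case "$1" in
    install)   echo "examples/getting-started/installation.md" ;;
    refs)      echo "examples/selectors/refs.md" ;;
    selectors) echo "examples/selectors/traditional-selectors.md" ;;
    sessions)  echo "examples/advanced/sessions.md" ;;
    cdp)       echo "examples/advanced/cdp-mode.md" ;;
    # Command topics whose directory name matches the topic keyword.
    wait|network|frames|dialogs|debug|navigation|setup)
               echo "examples/commands/$1/" ;;
    # Assumed fallback: the complete command reference.
    *)         echo "api/commands.md" ;;
  esac
}
```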

**To use templates:**
- Reference templates in `templates/` directory for common scaffolding
- Adapt templates to your specific needs and coding style

## API Reference

- **Commands API**: `api/commands.md` - Complete command reference with syntax and examples
- **Selectors API**: `api/selectors.md` - Selector types and usage reference
- **Options API**: `api/options.md` - All options reference

## Best Practices

1. **Use Refs**: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation
2. **Snapshot First**: Always snapshot before interacting with elements to get refs
3. **Agent Mode**: Use `--json` flag for machine-readable output in agent mode
4. **Session Management**: Use `--session` to maintain state across commands
5. **Interactive Snapshot**: Use the `-i` flag to limit the snapshot to interactive elements (buttons, links, inputs)
6. **Semantic Locators**: Use semantic locators (role/name) when refs are not available
7. **Error Handling**: Check command exit codes and error messages
8. **Wait for Navigation**: Commands automatically wait for navigation to complete
9. **Headed Mode**: Use `--headed` for debugging, headless for production
10. **CDP Integration**: Use `--cdp` for Chrome DevTools Protocol integration
11. **Streaming**: Use `AGENT_BROWSER_STREAM_PORT` for live browser preview
12. **Authenticated Sessions**: Use `--headers` for authentication without login flows
13. **Custom Executable**: Use `--executable-path` for serverless deployments or custom browsers
14. **Snapshot Options**: Combine `-i` (interactive elements only), `-c` (compact), `-d` (depth limit), and `-s` (CSS scope) to trim snapshot output
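
Several of these practices (items 3, 4, 5, and 14) combine naturally into a single command line. The sketch below assembles one. `snapshot_cmd` and the `AB` variable are hypothetical names, and the flag meanings reflect the agent-browser README as understood here, so verify them against `agent-browser --help` before relying on them.

```bash
# Sketch only: snapshot_cmd and AB are hypothetical names.
AB="${AB:-agent-browser}"

# Usage: snapshot_cmd [session] [depth] [css-scope]
# Builds an interactive-only, compact, JSON snapshot command.
snapshot_cmd() {
  session="${1:-}"; depth="${2:-}"; scope="${3:-}"
  set -- snapshot -i -c --json
  if [ -n "$depth" ]; then set -- "$@" -d "$depth"; fi   # limit tree depth
  if [ -n "$scope" ]; then set -- "$@" -s "$scope"; fi   # scope to a CSS selector
  # --session is a global option, so it goes before the command name.
  if [ -n "$session" ]; then set -- --session "$session" "$@"; fi
  "$AB" "$@"
}
```

For example, `snapshot_cmd agent1 3 '#main'` would run `agent-browser --session agent1 snapshot -i -c --json -d 3 -s '#main'`.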

## Resources

- **GitHub Repository**: https://github.com/vercel-labs/agent-browser
- **Official README**: https://github.com/vercel-labs/agent-browser/blob/main/README.md
- **Agent Mode Documentation**: https://agent-browser.dev/agent-mode
- **Issues**: https://github.com/vercel-labs/agent-browser/issues

## Keywords

agent-browser, CLI browser automation, AI agents, browser automation CLI, refs, snapshot, agent mode, semantic locators, browser automation tool, command-line browser, AI agent browser, deterministic selectors, accessibility tree, browser commands, web automation CLI, sessions, headed mode, authenticated sessions, CDP mode, streaming, Chrome DevTools Protocol, Playwright, browser automation for AI
