browser-use
Browser Use Cloud API for AI-powered browser automation. Use when user mentions "Browser Use", "browser automation", "web task", "AI agent browser", "run browser task", or "automated browser session".
Best use case
browser-use is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Browser Use Cloud API for AI-powered browser automation. Use when user mentions "Browser Use", "browser automation", "web task", "AI agent browser", "run browser task", or "automated browser session".
Teams using browser-use should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/browser-use/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How browser-use Compares
| Feature / Agent | browser-use | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Browser Use Cloud API for AI-powered browser automation. Use when user mentions "Browser Use", "browser automation", "web task", "AI agent browser", "run browser task", or "automated browser session".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Browser Use
Browser Use Cloud runs AI agents in hosted browsers to complete web tasks. You submit a natural-language task prompt; the agent navigates the browser, interacts with pages, and returns a result.
> Official docs: `https://docs.browser-use.com/cloud/api-reference`
---
## When to Use
Use this skill when you need to:
- Run an AI agent to complete a web task (e.g. "search for X and return results")
- Automate multi-step browser interactions via natural language
- Check task status or retrieve results from a completed browser session
- View a live stream of an agent working in a browser
- Manage browser sessions and billing
---
## Prerequisites
Connect the **Browser Use** connector at [app.vm0.ai/connectors](https://app.vm0.ai/connectors).
> **Troubleshooting:** If requests fail, run `zero doctor check-connector --env-name BROWSER_USE_TOKEN` or `zero doctor check-connector --url https://api.browser-use.com/api/v2/tasks --method GET`
---
## How to Use
### 1. Run a Task
Submit a natural-language task. The agent will open a browser and complete it.
Write to `/tmp/browser_use_task.json`:
```json
{
"task": "Search for the top Hacker News post and return the title and URL."
}
```
```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```
Returns `{"id": "<task-id>", "sessionId": "<session-id>"}`.
### 2. Get Task Status and Result
Poll until `status` is `"finished"` or `"failed"`. The `output` field contains the agent's result.
```bash
curl -s "https://api.browser-use.com/api/v2/tasks/<task-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```
Task `status` values: `created`, `running`, `paused`, `finished`, `failed`, `stopped`.
### 3. Watch a Live Session
Get the live browser stream URL from the session.
```bash
curl -s "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```
The response includes `"liveUrl"` — open it in a browser to watch the agent in real time.
### 4. Stop a Session
```bash
curl -s -X PATCH "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d '{"action": "stop"}'
```
### 5. List Tasks
```bash
curl -s "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```
Optional query params: `pageSize`, `pageNumber`, `sessionId`, `status`.
### 6. Run a Task with Advanced Settings
Customize the model, timeouts, and browser behavior.
Write to `/tmp/browser_use_task.json`:
```json
{
"task": "Go to linkedin.com and find the CEO of Anthropic.",
"browserSettings": {
"viewport": {
"width": 1280,
"height": 800
}
},
"maxSteps": 30,
"useVision": true
}
```
```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```
### 7. Get Account Billing
Check credit balances and plan information.
```bash
curl -s "https://api.browser-use.com/api/v2/billing/account" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```
Returns `monthlyCreditsBalanceUsd`, `additionalCreditsBalanceUsd`, `totalCreditsBalanceUsd`, and plan details.
---
## Guidelines
1. **Poll for results**: Tasks are async — poll `GET /api/v2/tasks/<task-id>` until `status` is `finished` or `failed`.
2. **Sessions persist**: After a task finishes, the session stays open. Stop it explicitly with PATCH if you don't need it.
3. **Task writing**: Clear, specific task prompts produce better results. Include the exact URL if applicable.
4. **Credits**: Each task consumes credits. Check balance via the billing endpoint before running large batches.
5. **Session limit**: Sessions are capped at 15 minutes of total runtime.Related Skills
browserless
Browserless API for headless Chrome. Use when user mentions "headless Chrome", "browserless", or needs browser automation.
browserbase
Browserbase API for headless browser automation. Use when user mentions "headless browser", "browser automation", or "Browserbase".
zoom
Zoom API for managing meetings, webinars, cloud recordings, and user data. Use when user mentions "Zoom", "Zoom meeting", "join URL", "cloud recording", or "webinar".
zeptomail
ZeptoMail API for transactional email. Use when user mentions "ZeptoMail", "transactional email", "send email", or Zoho email.
zep
Zep API for long-term memory and conversation history management in AI agents. Use when user mentions "Zep", "conversation memory", "session memory", "memory search", "user facts", "agent memory", or "long-term memory".
zendesk
Zendesk API for customer support. Use when user mentions "Zendesk", "support ticket", "customer service", or help desk.
zapsign
ZapSign API for e-signatures. Use when user mentions "ZapSign", "e-signature", "sign document", or Brazilian e-signature.
zapier
Zapier API for workflow automation. Use when user mentions "Zapier", "zap", "automation", or asks about connecting apps.
youtube
YouTube API for videos and channels. Use when user mentions "YouTube", "youtube.com", "youtu.be", shares a video link, "channel stats", or asks about video content.
xero
Xero API for accounting. Use when user mentions "Xero", "accounting", "invoices", "bookkeeping", or asks about financial management.
x
X (Twitter) API for tweets and profiles. Use when user mentions "X", "Twitter", "x.com", "twitter.com", shares a tweet link, "check X", or asks about social media posts.
wrike
Wrike API for project management. Use when user mentions "Wrike", "wrike.com", shares a Wrike link, "Wrike task", or asks about Wrike workspace.