browser-use

Browser Use Cloud API for AI-powered browser automation. Use when user mentions "Browser Use", "browser automation", "web task", "AI agent browser", "run browser task", or "automated browser session".

50 stars

by vm0-ai

View on GitHub

Best use case

browser-use is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using browser-use should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/browser-use/SKILL.md --create-dirs "https://raw.githubusercontent.com/vm0-ai/vm0-skills/main/browser-use/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/browser-use/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How browser-use Compares

Feature / Agent	browser-use	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

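What does this skill do?

It lets an AI agent call the Browser Use Cloud API to run natural-language web tasks in hosted browsers, check task status and results, and manage sessions and billing.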

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Browser Use

Browser Use Cloud runs AI agents in hosted browsers to complete web tasks. You submit a natural-language task prompt; the agent navigates the browser, interacts with pages, and returns a result.

> Official docs: `https://docs.browser-use.com/cloud/api-reference`

---

## When to Use

Use this skill when you need to:

- Run an AI agent to complete a web task (e.g. "search for X and return results")
- Automate multi-step browser interactions via natural language
- Check task status or retrieve results from a completed browser session
- View a live stream of an agent working in a browser
- Manage browser sessions and billing

---

## Prerequisites

Connect the **Browser Use** connector at [app.vm0.ai/connectors](https://app.vm0.ai/connectors).

> **Troubleshooting:** If requests fail, run `zero doctor check-connector --env-name BROWSER_USE_TOKEN` or `zero doctor check-connector --url https://api.browser-use.com/api/v2/tasks --method GET`

---

## How to Use

### 1. Run a Task

Submit a natural-language task. The agent will open a browser and complete it.

Write to `/tmp/browser_use_task.json`:

```json
{
  "task": "Search for the top Hacker News post and return the title and URL."
}
```

```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```

Returns `{"id": "<task-id>", "sessionId": "<session-id>"}`.
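Both ids are needed by the later steps, so it can help to capture them when the task is created. The following is a minimal sketch, assuming `jq` is installed (the skill itself does not require it). The helper names `make_task_payload` and `create_task` are hypothetical and not part of the API.

```bash
# Sketch only: assumes jq is installed and BROWSER_USE_TOKEN is exported.
API_BASE="https://api.browser-use.com/api/v2"

# Build the request body with jq so quotes and newlines in the prompt
# are escaped correctly. (Hypothetical helper name.)
make_task_payload() {
  jq -cn --arg task "$1" '{task: $task}'
}

# Submit the task and print "<task-id> <session-id>" on one line.
# curl reads the request body from stdin via "-d @-".
create_task() {
  make_task_payload "$1" |
    curl -s -X POST "$API_BASE/tasks" \
      --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" \
      --header "Content-Type: application/json" \
      -d @- |
    jq -r '"\(.id) \(.sessionId)"'
}

# Usage:
#   ids="$(create_task "Search for the top Hacker News post and return the title and URL.")"
#   TASK_ID="${ids% *}"      # text before the space
#   SESSION_ID="${ids#* }"   # text after the space
```

Building the body with `jq` instead of a hand-written file avoids broken JSON when the prompt itself contains double quotes.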

### 2. Get Task Status and Result

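Poll until `status` is `"finished"` or `"failed"`. Also stop polling on `"stopped"`, which appears in the status list below and would otherwise keep a polling loop running indefinitely. The `output` field contains the agent's result.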

```bash
curl -s "https://api.browser-use.com/api/v2/tasks/<task-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Task `status` values: `created`, `running`, `paused`, `finished`, `failed`, `stopped`.
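The polling step can be sketched as a bounded loop. This is an illustration rather than an official client: it assumes `jq` is installed, the function names are hypothetical, and treating `stopped` as a terminal status is an assumption based on the list above. The default of 180 attempts at 5-second intervals comes to roughly 15 minutes, in line with the session cap noted in the Guidelines.

```bash
# Sketch only: assumes jq is installed and BROWSER_USE_TOKEN is exported.
API_BASE="https://api.browser-use.com/api/v2"

# Succeeds when the status will not change on its own.
# Counting "stopped" as terminal is an assumption.
is_terminal_status() {
  case "$1" in
    finished|failed|stopped) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch the raw JSON for one task.
fetch_task() {
  curl -s "$API_BASE/tasks/$1" \
    --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
}

# poll_task <task-id> [interval-seconds] [max-attempts]
# Prints the final task JSON; returns 1 if no terminal status is seen in time.
poll_task() {
  local task_id interval max_attempts attempt body status
  task_id="$1"
  interval="${2:-5}"
  max_attempts="${3:-180}"   # 180 x 5 s = 15 minutes
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    body="$(fetch_task "$task_id")"
    status="$(printf '%s' "$body" | jq -r '.status // empty')"
    if is_terminal_status "$status"; then
      printf '%s\n' "$body"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "timed out waiting for task $task_id" >&2
  return 1
}
```

For example, `poll_task "<task-id>" | jq -r '.output'` would print the agent's result once the task ends.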

### 3. Watch a Live Session

Get the live browser stream URL from the session.

```bash
curl -s "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

The response includes `"liveUrl"` — open it in a browser to watch the agent in real time.
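To pull out just the stream URL, a small wrapper such as the following may be convenient. It is a sketch that assumes `jq` is installed; `live_url` is a hypothetical helper name.

```bash
# Sketch only: assumes jq is installed and BROWSER_USE_TOKEN is exported.
API_BASE="https://api.browser-use.com/api/v2"

# Print the session's liveUrl, or nothing if the field is absent or null.
live_url() {
  curl -s "$API_BASE/sessions/$1" \
    --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" |
    jq -r '.liveUrl // empty'
}

# Usage:
#   live_url "<session-id>"
```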

### 4. Stop a Session

```bash
curl -s -X PATCH "https://api.browser-use.com/api/v2/sessions/<session-id>" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d '{"action": "stop"}'
```

### 5. List Tasks

```bash
curl -s "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Optional query params: `pageSize`, `pageNumber`, `sessionId`, `status`.
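As an illustration of combining these parameters, the sketch below joins `key=value` pairs into a query string. The helper names are hypothetical, the filter values in the usage comment are examples only, and values are assumed to be URL-safe because no encoding is applied.

```bash
# Sketch only: assumes BROWSER_USE_TOKEN is exported and values are URL-safe.
API_BASE="https://api.browser-use.com/api/v2"

# Join key=value arguments into a query string ("" when none are given).
tasks_query() {
  local query pair
  query=""
  for pair in "$@"; do
    query="${query:+$query&}$pair"
  done
  printf '%s' "${query:+?$query}"
}

# list_tasks [key=value ...]
list_tasks() {
  curl -s "$API_BASE/tasks$(tasks_query "$@")" \
    --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
}

# Usage (filter values are illustrative):
#   list_tasks status=finished pageSize=10 pageNumber=1
```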

### 6. Run a Task with Advanced Settings

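Customize browser behavior and agent limits. The example below sets the viewport size, caps the number of agent steps with `maxSteps`, and enables vision with `useVision`.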

Write to `/tmp/browser_use_task.json`:

```json
{
  "task": "Go to linkedin.com and find the CEO of Anthropic.",
  "browserSettings": {
    "viewport": {
      "width": 1280,
      "height": 800
    }
  },
  "maxSteps": 30,
  "useVision": true
}
```

```bash
curl -s -X POST "https://api.browser-use.com/api/v2/tasks" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" --header "Content-Type: application/json" -d @/tmp/browser_use_task.json
```

### 7. Get Account Billing

Check credit balances and plan information.

```bash
curl -s "https://api.browser-use.com/api/v2/billing/account" --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN"
```

Returns `monthlyCreditsBalanceUsd`, `additionalCreditsBalanceUsd`, `totalCreditsBalanceUsd`, and plan details.

---

## Guidelines

1. **Poll for results**: Tasks are async — poll `GET /api/v2/tasks/<task-id>` until `status` is `finished` or `failed`.
2. **Sessions persist**: After a task finishes, the session stays open. Stop it explicitly with PATCH if you don't need it.
3. **Task writing**: Clear, specific task prompts produce better results. Include the exact URL if applicable.
4. **Credits**: Each task consumes credits. Check balance via the billing endpoint before running large batches.
5. **Session limit**: Sessions are capped at 15 minutes of total runtime.
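Guidelines 2 and 4 can be combined into a small wrapper around a batch run. The sketch below assumes `jq` is installed; the function names and the 1.00 USD floor are hypothetical choices for illustration.

```bash
# Sketch only: assumes jq is installed and BROWSER_USE_TOKEN is exported.
API_BASE="https://api.browser-use.com/api/v2"

# Succeeds when balance >= required (both decimal USD amounts).
# awk does the comparison because POSIX shell arithmetic is integer-only.
has_enough_credits() {
  awk -v have="$1" -v need="$2" 'BEGIN { exit !((have + 0) >= (need + 0)) }'
}

# Print the total credit balance reported by the billing endpoint.
total_credits() {
  curl -s "$API_BASE/billing/account" \
    --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" |
    jq -r '.totalCreditsBalanceUsd'
}

# Stop a session that is no longer needed.
stop_session() {
  curl -s -X PATCH "$API_BASE/sessions/$1" \
    --header "X-Browser-Use-API-Key: $BROWSER_USE_TOKEN" \
    --header "Content-Type: application/json" \
    -d '{"action": "stop"}'
}

# Usage (the 1.00 USD floor is an arbitrary example):
#   has_enough_credits "$(total_credits)" 1.00 || { echo "low balance" >&2; exit 1; }
#   ... run tasks ...
#   stop_session "<session-id>"
```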

Related Skills

browserless

from vm0-ai/vm0-skills

Browserless API for headless Chrome. Use when user mentions "headless Chrome", "browserless", or needs browser automation.

browserbase

from vm0-ai/vm0-skills

Browserbase API for headless browser automation. Use when user mentions "headless browser", "browser automation", or "Browserbase".
