browser-automation

Automate web browser interactions using natural language via CLI commands. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

3,891 stars

Best use case

browser-automation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Automate web browser interactions using natural language via CLI commands. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Teams using browser-automation should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/browser-automations/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/basillytton/browser-automations/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/browser-automations/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How browser-automation Compares

Feature / Agentbrowser-automationStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Automate web browser interactions using natural language via CLI commands. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# SkillBoss

One API key, 50+ models across providers (Bedrock, OpenAI, Vertex, ElevenLabs, Replicate, Minimax, and more). Call any model directly by ID, or use smart routing to auto-select the cheapest or highest-quality option for a task.

**Base URL:** `https://api.heybossai.com/v1`
**Auth:** `-H "Authorization: Bearer $SKILLBOSS_API_KEY"`

## List Models

```bash
curl -s https://api.heybossai.com/v1/models \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Filter by type:

```bash
curl -s "https://api.heybossai.com/v1/models?types=image" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Get full docs for specific models:

```bash
curl -s "https://api.heybossai.com/v1/models?ids=mm/img,bedrock/claude-4-5-sonnet" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Types: `chat`, `image`, `video`, `tts`, `stt`, `music`, `search`, `scraper`, `email`, `storage`, `ppt`, `embedding`

## Chat

```bash
curl -s -X POST https://api.heybossai.com/v1/chat/completions \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/claude-4-5-sonnet",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `bedrock/claude-4-5-sonnet`, `bedrock/claude-4-6-opus`, `openai/gpt-5`, `vertex/gemini-2.5-flash`, `deepseek/deepseek-chat` |
| `messages` | Array of `{role, content}` objects |
| `system` | Optional system prompt |
| `temperature` | Optional, 0.0–1.0 |
| `max_tokens` | Optional, max output tokens |

Response: `choices[0].message.content`

## Image Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/img",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

Save to file:

```bash
URL=$(curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mm/img", "inputs": {"prompt": "A sunset over mountains"}}' \
  | jq -r '.image_url // .result.image_url // .data[0]')
curl -sL "$URL" -o sunset.png
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/img`, `replicate/black-forest-labs/flux-2-pro`, `replicate/black-forest-labs/flux-1.1-pro-ultra`, `vertex/gemini-2.5-flash-image-preview`, `vertex/gemini-3-pro-image-preview` |
| `inputs.prompt` | Text description of the image |
| `inputs.size` | Optional, e.g. `"1024*768"` |
| `inputs.aspect_ratio` | Optional, e.g. `"16:9"` |

Response: `image_url`, `data[0]`, or `generated_images[0]`

## Video Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/t2v",
    "inputs": {"prompt": "A cat playing with yarn"}
  }'
```

Image-to-video:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/i2v",
    "inputs": {"prompt": "Zoom in slowly", "image": "https://example.com/photo.jpg"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/t2v` (text-to-video), `mm/i2v` (image-to-video), `vertex/veo-3-generate-preview` |
| `inputs.prompt` | Text description |
| `inputs.image` | Image URL (for i2v) |
| `inputs.duration` | Optional, seconds |

Response: `video_url`

## Text-to-Speech

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/speech-01-turbo",
    "inputs": {"text": "Hello world", "voice_setting": {"voice_id": "male-qn-qingse", "speed": 1.0}}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `minimax/speech-01-turbo`, `elevenlabs/eleven_multilingual_v2`, `openai/tts-1` |
| `inputs.text` | Text to speak |
| `inputs.voice` | Voice name (e.g. `alloy`, `nova`, `shimmer`) for OpenAI |
| `inputs.voice_id` | Voice ID for ElevenLabs |

Response: `audio_url` or binary audio data

## Speech-to-Text

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/whisper-1",
    "inputs": {"audio_data": "BASE64_AUDIO", "filename": "recording.mp3"}
  }'
```

Response: `text`

## Music Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/elevenlabs/music",
    "inputs": {"prompt": "upbeat electronic", "duration": 30}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `replicate/elevenlabs/music`, `replicate/meta/musicgen`, `replicate/google/lyria-2` |
| `inputs.prompt` | Music description |
| `inputs.duration` | Duration in seconds |

Response: `audio_url`

## Background Removal

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/remove-bg",
    "inputs": {"image": "https://example.com/photo.jpg"}
  }'
```

Response: `image_url` or `data[0]`

## Document Processing

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "reducto/parse",
    "inputs": {"document_url": "https://example.com/file.pdf"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `reducto/parse` (PDF/DOCX to markdown), `reducto/extract` (structured extraction) |
| `inputs.document_url` | URL of the document |
| `inputs.instructions` | For extract: `{"schema": {...}}` |

## Web Search

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "linkup/search",
    "inputs": {"query": "latest AI news", "depth": "standard", "outputType": "searchResults"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `linkup/search`, `perplexity/sonar`, `firecrawl/scrape` |
| `inputs.query` | Search query |
| `inputs.depth` | `standard` or `deep` |
| `inputs.outputType` | `searchResults`, `sourcedAnswer`, `structured` |

## Email

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "email/send",
    "inputs": {"to": "user@example.com", "subject": "Hello", "html": "<p>Hi</p>"}
  }'
```

## SMS Verification

Send OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-send",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}}
  }'
```

Verify OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-check",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}, "code": "123456"}
  }'
```

## Smart Mode (auto-select best model)

List task types:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"discover": true}'
```

Run a task:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "image",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

## Available Models (50+)

| Category | Models | Details |
|----------|--------|---------|
| Chat | 25+ models — Claude, GPT, Gemini, DeepSeek, Qwen, HuggingFace | `chat-models.md` |
| Image | 9 models — Gemini, FLUX, upscaling, background removal | `image-models.md` |
| Video | 3 models — Veo, text-to-video, image-to-video | `video-models.md` |
| Audio | 11 models — TTS, STT, music generation | `audio-models.md` |
| Search & Scraping | 19 models — Perplexity, Firecrawl, ScrapingDog, CEO interviews | `search-models.md` |
| Tools | 11 models — documents, email, SMS, embeddings, presentations | `tools-models.md` |

Notes:
- Get SKILLBOSS_API_KEY at https://www.skillboss.co
- Use the models endpoint to discover all available models live
- Use smart mode (pilot) to auto-select the best model for any task

Related Skills

n8n Workflow Mastery — Complete Automation Engineering System

3891
from openclaw/skills

You are an expert n8n workflow architect. You design, build, debug, optimize, and scale n8n automations following production-grade methodology. Every workflow you create is complete, functional, and follows the patterns in this guide.

Workflow & Productivity

Insurance Operations Automation

3891
from openclaw/skills

Comprehensive insurance operations framework for AI agents. Covers the full insurance lifecycle — underwriting, claims, policy management, renewals, compliance, and broker operations.

Workflow & Productivity

afrexai-business-automation

3891
from openclaw/skills

Turn your AI agent into a business automation architect. Design, document, implement, and monitor automated workflows across sales, ops, finance, HR, and support — no n8n or Zapier required.

Workflow & Productivity

Business Automation Strategy — AfrexAI

3891
from openclaw/skills

> The complete methodology for identifying, designing, building, and scaling business automations. Platform-agnostic — works with n8n, Zapier, Make, Power Automate, custom code, or any combination.

AI Automation Agency Blueprint

3891
from openclaw/skills

You are an AI Automation Agency strategist. Help the user build, price, sell, and scale an AI agent services business — from solo consultant to 7-figure agency. Every recommendation must be specific, actionable, and backed by real economics.

Business Strategy & Growth

Accounts Payable Automation Framework

3891
from openclaw/skills

You are an AP process optimizer. When the user describes their payable workflows, vendor relationships, or payment processes, generate a complete accounts payable management framework.

Workflow & Productivity

my-browser-agent

3891
from openclaw/skills

A custom browser automation skill using Playwright.

Web Automation

n8n-workflow-automation

3891
from openclaw/skills

Designs and outputs n8n workflow JSON with robust triggers, idempotency, error handling, logging, retries, and human-in-the-loop review queues. Use when you need an auditable automation that won’t silently fail.

Workflow & Productivity

rent-my-browser

3891
from openclaw/skills

When the agent is idle, connect to the Rent My Browser marketplace and execute browser tasks for consumers. Earn money by renting out the node's browser during downtime. Supports headless (Playwright) on VPS nodes and real Chrome on GUI machines.

Monetization & Resource Management

google-workspace-automation

3891
from openclaw/skills

Design Gmail, Drive, Sheets, and Calendar automations with scope-aware plans. Use for repeatable daily task automation with explicit OAuth scopes and audit-ready outputs.

Workflow & Productivity

docs-pipeline-automation

3891
from openclaw/skills

Build repeatable data-to-Docs pipelines from Sheets and Drive sources. Use for automated status reports, template-based document assembly, and scheduled publishing workflows.

Workflow & Productivity

agentic-workflow-automation

3891
from openclaw/skills

Generate reusable multi-step agent workflow blueprints. Use for trigger/action orchestration, deterministic workflow definitions, and automation handoff artifacts.

Workflow & Productivity