browser-automation

Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.

242 stars

Best use case

browser-automation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "browser-automation" skill to help with this workflow task. Context: Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

  • Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

  • Do not use this when you only need a one-off answer and do not need a reusable workflow.
  • Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/browser-automation/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/emillindfors/browser-automation/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/browser-automation/SKILL.md inside your project
  3. Restart your AI agent so that it auto-discovers the skill

How browser-automation Compares

| Feature / Agent | browser-automation | Standard Approach |
|-----------------|--------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.

Where can I find the source code?

The source code is on GitHub in the aiskillstore/marketplace repository, under skills/emillindfors/browser-automation.

SKILL.md Source

# Browser Automation Skill

This skill provides guidance for using the rust-browser-mcp server to automate web browsers through the WebDriver protocol. It enables enterprise-grade browser control with performance monitoring, multi-session support, and health management.

## Overview

The rust-browser-mcp server provides 45+ MCP tools for browser automation:

### Core Automation Tools (25)
- **Navigation**: `navigate`, `back`, `forward`, `refresh`
- **Element Interaction**: `click`, `send_keys`, `hover`, `find_element`, `find_elements`
- **Information Extraction**: `get_title`, `get_text`, `get_attribute`, `get_property`, `get_page_source`
- **Advanced**: `fill_and_submit_form`, `login_form`, `scroll_to_element`, `wait_for_element`
- **JavaScript**: `execute_script`
- **Visual**: `screenshot`, `resize_window`, `get_current_url`, `get_page_load_status`

### Performance Monitoring Tools (5)
- `get_performance_metrics` - Page load times, resource timing, navigation data
- `monitor_memory_usage` - Heap monitoring, memory leak detection
- `get_console_logs` - Error detection, log filtering
- `run_performance_test` - Automated performance analysis
- `monitor_resource_usage` - Network, FPS, CPU tracking

### Driver Management Tools (7)
- `start_driver`, `stop_driver`, `stop_all_drivers`
- `list_managed_drivers`
- `get_healthy_endpoints`, `refresh_driver_health`
- `force_cleanup_orphaned_processes`

### Recipe System (4)
- `create_recipe` - Create reusable automation workflows
- `execute_recipe` - Run a saved recipe
- `list_recipes` - List all available recipes
- `delete_recipe` - Remove a recipe

## Setup Instructions

### Prerequisites
Ensure you have at least one WebDriver installed:
- **Chrome**: ChromeDriver (must match Chrome version)
- **Firefox**: GeckoDriver
- **Edge**: MSEdgeDriver

### Configuration for Claude Desktop

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "browser": {
      "command": "/path/to/rust-browser-mcp",
      "args": ["--transport", "stdio", "--browser", "chrome"]
    }
  }
}
```
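
If `claude_desktop_config.json` already lists other MCP servers, editing it by hand risks dropping them. The following is a minimal Python sketch, not part of rust-browser-mcp, that merges the entry above into an existing file. The function name `add_browser_server` is hypothetical, and the binary path is the same placeholder used above.

```python
import json
from pathlib import Path


def add_browser_server(config_path: Path, binary: str, browser: str = "chrome") -> dict:
    """Merge a "browser" MCP server entry into a Claude Desktop config file."""
    # Start from the existing config, if any, so other servers are preserved.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same command and args as the JSON snippet above.
    servers["browser"] = {
        "command": binary,
        "args": ["--transport", "stdio", "--browser", browser],
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Writes to a local file for illustration; point it at your real config path.
    result = add_browser_server(Path("claude_desktop_config.json"), "/path/to/rust-browser-mcp")
    print(json.dumps(result, indent=2))
```

Restart Claude Desktop after changing the file so the new server entry is picked up.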

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `WEBDRIVER_ENDPOINT` | `auto` | WebDriver URL or "auto" for auto-discovery |
| `WEBDRIVER_HEADLESS` | `true` | Run browsers in headless mode |
| `WEBDRIVER_PREFERRED_DRIVER` | - | Preferred browser: chrome, firefox, edge |
| `WEBDRIVER_CONCURRENT_DRIVERS` | `firefox,chrome` | Browsers to start concurrently |
| `WEBDRIVER_POOL_ENABLED` | `true` | Enable connection pooling |
| `WEBDRIVER_POOL_MAX_CONNECTIONS` | `3` | Max connections per driver type |

## Usage Patterns

### Basic Navigation
```
1. Use `navigate` with URL to load a page
2. Use `wait_for_element` to confirm the page has loaded
3. Use `get_title` or `get_text` to verify content
```
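
An agent normally issues these calls for you, but it can be useful to see what they might look like on the wire. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below builds the three requests for the steps above. The tool names come from this skill; the argument names `url` and `selector`, and the helper `tool_call`, are assumptions, so check `reference/tools.md` for the real schemas.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are assumed for illustration, not taken from the server.
navigation_plan = [
    tool_call("navigate", url="https://example.com"),
    tool_call("wait_for_element", selector="h1"),
    tool_call("get_title"),
]

for request in navigation_plan:
    print(json.dumps(request))
```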

### Form Filling
```
1. Navigate to the form page
2. Use `find_element` with CSS selector to locate fields
3. Use `send_keys` to input values
4. Use `click` on submit button, or use `fill_and_submit_form` for convenience
```

### Web Scraping
```
1. Navigate to target page
2. Use `find_elements` to get multiple matching elements
3. Use `get_text` or `get_attribute` to extract data
4. Use `execute_script` for complex DOM traversal
```

### Performance Testing
```
1. Navigate to page under test
2. Use `run_performance_test` for automated analysis
3. Use `get_performance_metrics` for detailed timing data
4. Use `monitor_memory_usage` to detect leaks
5. Use `get_console_logs` to capture errors
```

### Multi-Step Workflows with Recipes
```
1. Define a recipe with `create_recipe` including steps array
2. Each step specifies: action (tool name), arguments, optional retry logic
3. Execute with `execute_recipe` and parameters
4. Recipes support conditions and browser-specific variants
```
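
As a rough illustration of the shape described above, the sketch below writes a recipe as a Python dictionary and checks that every step names a tool from the Overview. The field names (`name`, `steps`, `action`, `arguments`, `retry`, `attempts`) and the argument names are assumptions; the actual schema is in `reference/recipes.md`. Parameters, conditions, and browser-specific variants are left out because their syntax is not shown in this document.

```python
# Core tool names as listed in the Overview section.
KNOWN_TOOLS = {
    "navigate", "back", "forward", "refresh",
    "click", "send_keys", "hover", "find_element", "find_elements",
    "get_title", "get_text", "get_attribute", "get_property", "get_page_source",
    "fill_and_submit_form", "login_form", "scroll_to_element", "wait_for_element",
    "execute_script", "screenshot", "resize_window", "get_current_url",
    "get_page_load_status",
}

# Hypothetical recipe layout: one dict per step, with optional retry settings.
recipe = {
    "name": "capture_homepage",
    "steps": [
        {"action": "navigate", "arguments": {"url": "https://example.com"}},
        {
            "action": "wait_for_element",
            "arguments": {"selector": "h1"},
            "retry": {"attempts": 3},
        },
        {"action": "screenshot", "arguments": {}},
    ],
}


def unknown_actions(candidate: dict) -> list:
    """Return the step actions that are not known tool names."""
    return [s["action"] for s in candidate["steps"] if s["action"] not in KNOWN_TOOLS]


if __name__ == "__main__":
    print(unknown_actions(recipe))
```

Checking actions before calling `create_recipe` catches misspelled tool names early, instead of partway through a run.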

## Session Management

### Browser-Specific Sessions
Use session IDs prefixed with browser name for explicit browser control:
- `chrome_session1` - Uses Chrome
- `firefox_work` - Uses Firefox
- `edge_testing` - Uses Edge
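
When generating session IDs in a script, a small helper can confirm which browser a given ID should select. This is a sketch of the naming convention only: the function `browser_for_session` is hypothetical, and the underscore separator and case-insensitive match are assumptions inferred from the examples above, not documented server behavior.

```python
from typing import Optional

SUPPORTED_BROWSERS = ("chrome", "firefox", "edge")


def browser_for_session(session_id: str) -> Optional[str]:
    """Return the browser named by the session ID prefix, or None if there is none."""
    prefix, separator, _rest = session_id.partition("_")
    if separator and prefix.lower() in SUPPORTED_BROWSERS:
        return prefix.lower()
    return None


if __name__ == "__main__":
    for sid in ("chrome_session1", "firefox_work", "edge_testing", "default"):
        print(sid, "->", browser_for_session(sid))
```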

### Multi-Session Support
You can run multiple browser sessions concurrently by using different session IDs:
```
Session: chrome_user1 -> Opens first Chrome tab
Session: chrome_user2 -> Opens second Chrome tab
Session: firefox_admin -> Opens Firefox for different workflow
```

## Best Practices

### Error Handling
1. Always use `wait_for_element` before interacting with dynamic content
2. Check `get_page_load_status` for slow-loading pages
3. Use `get_console_logs` to debug JavaScript errors
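
The first rule can be enforced in scripted workflows by pairing the wait and the interaction in one helper. The sketch below assumes a `call_tool(name, arguments)` callable supplied by whatever MCP client you use; that callable, the helper name `click_when_ready`, and the `selector` argument name are all assumptions.

```python
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def click_when_ready(call_tool: ToolCaller, selector: str) -> Any:
    """Wait for an element, then click it, using the same selector for both."""
    # "selector" is an assumed argument name; see reference/tools.md.
    call_tool("wait_for_element", {"selector": selector})
    return call_tool("click", {"selector": selector})


if __name__ == "__main__":
    # Stand-in caller that only prints what a real client would send.
    def print_call(name: str, arguments: Dict[str, Any]) -> None:
        print(name, arguments)

    click_when_ready(print_call, "#submit")
```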

### Performance
1. Enable connection pooling (default) for better resource usage
2. Reuse session IDs when possible
3. Use headless mode for faster execution

### Security
1. Never store credentials in recipes
2. Use environment variables for sensitive data
3. Clear sessions after authentication workflows
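
One way to follow the first two rules is to resolve credentials from the environment at run time and pass them to the tool call, so the recipe itself never contains them. In the sketch below, the variable names `APP_USERNAME` and `APP_PASSWORD`, the function `credentials_from_env`, and the returned key names are hypothetical.

```python
import os
from typing import Dict, Mapping


def credentials_from_env(
    env: Mapping[str, str] = os.environ,
    user_var: str = "APP_USERNAME",
    password_var: str = "APP_PASSWORD",
) -> Dict[str, str]:
    """Read login credentials from environment variables, failing loudly if unset."""
    missing = [name for name in (user_var, password_var) if not env.get(name)]
    if missing:
        # Report only the variable names, never the values.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"username": env[user_var], "password": env[password_var]}


if __name__ == "__main__":
    try:
        creds = credentials_from_env()
        print("credentials loaded for", creds["username"])
    except RuntimeError as error:
        print(error)
```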

## Troubleshooting

### Driver Not Starting
- Verify that the WebDriver binary is installed and on your PATH
- Check that the browser version matches the driver version
- Use `list_managed_drivers` to see status

### Element Not Found
- Use browser DevTools to verify selector
- Wait for page load with `wait_for_element`
- Try different selector strategies (CSS, XPath)

### Performance Issues
- Check `monitor_memory_usage` for leaks
- Use `get_console_logs` for JavaScript errors
- Consider reducing concurrent sessions

## Reference Files

See companion files for detailed information:
- `reference/tools.md` - Complete tool documentation
- `reference/recipes.md` - Recipe system guide
- `examples/` - Example automation scripts

Related Skills

browser-extension-developer

242
from aiskillstore/marketplace

Use this skill when developing or maintaining browser extension code in the `browser/` directory, including Chrome/Firefox/Edge compatibility, content scripts, background scripts, or i18n updates.

use-my-browser

242
from aiskillstore/marketplace

Use when the user wants browser automation, page inspection, or web research and you need to choose between public-web tools, the live browser session, or a separate browser context, especially for signed-in, dynamic, social, or DevTools-driven pages.

steel-browser

242
from aiskillstore/marketplace

Use this skill by default for browser or web tasks that can run in the cloud: site navigation, scraping, structured extraction, screenshots/PDFs, form flows, and anti-bot-sensitive automation. Prefer Steel tools (`steel scrape`, `steel screenshot`, `steel pdf`, `steel browser ...`) over generic fetch/search approaches when reliability matters. Trigger even if the user does not mention Steel. Skip only when the task must run against local-only apps (for example localhost QA) or private network targets unavailable from Steel cloud sessions.

zoom-automation

242
from aiskillstore/marketplace

Automate Zoom meeting creation, management, recordings, webinars, and participant tracking via Rube MCP (Composio). Always search tools first for current schemas.

zoho-crm-automation

242
from aiskillstore/marketplace

Automate Zoho CRM tasks via Rube MCP (Composio): create/update records, search contacts, manage leads, and convert leads. Always search tools first for current schemas.

zendesk-automation

242
from aiskillstore/marketplace

Automate Zendesk tasks via Rube MCP (Composio): tickets, users, organizations, replies. Always search tools first for current schemas.

wrike-automation

242
from aiskillstore/marketplace

Automate Wrike project management via Rube MCP (Composio): create tasks/folders, manage projects, assign work, and track progress. Always search tools first for current schemas.

workflow-automation

242
from aiskillstore/marketplace

Workflow automation is the infrastructure that makes AI agents reliable. Without durable execution, a network hiccup during a 10-step payment flow means lost money and angry customers. With it, workflows resume exactly where they left off. This skill covers the platforms (n8n, Temporal, Inngest) and patterns (sequential, parallel, orchestrator-worker) that turn brittle scripts into production-grade automation. Key insight: The platforms make different tradeoffs. n8n optimizes for accessibility

whatsapp-automation

242
from aiskillstore/marketplace

Automate WhatsApp Business tasks via Rube MCP (Composio): send messages, manage templates, upload media, and handle contacts. Always search tools first for current schemas.

webflow-automation

242
from aiskillstore/marketplace

Automate Webflow CMS collections, site publishing, page management, asset uploads, and ecommerce orders via Rube MCP (Composio). Always search tools first for current schemas.

vercel-automation

242
from aiskillstore/marketplace

Automate Vercel tasks via Rube MCP (Composio): manage deployments, domains, DNS, env vars, projects, and teams. Always search tools first for current schemas.

trello-automation

242
from aiskillstore/marketplace

Automate Trello boards, cards, and workflows via Rube MCP (Composio). Create cards, manage lists, assign members, and search across boards programmatically.