browser-automation
Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.
Best use case
browser-automation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.
Teams using browser-automation should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/browser-automation/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How browser-automation Compares
| Feature / Agent | browser-automation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Enterprise-grade browser automation using WebDriver protocol. Use when the user needs to automate web browsers, perform web scraping, test web applications, fill forms, take screenshots, monitor performance, or execute multi-step browser workflows. Supports Chrome, Firefox, and Edge with connection pooling and health management.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Browser Automation Skill
This skill provides guidance for using the rust-browser-mcp server to automate web browsers through the WebDriver protocol. It enables enterprise-grade browser control with performance monitoring, multi-session support, and health management.
## Overview
The rust-browser-mcp server provides 45+ MCP tools for browser automation:
### Core Automation Tools (25)
- **Navigation**: `navigate`, `back`, `forward`, `refresh`
- **Element Interaction**: `click`, `send_keys`, `hover`, `find_element`, `find_elements`
- **Information Extraction**: `get_title`, `get_text`, `get_attribute`, `get_property`, `get_page_source`
- **Advanced**: `fill_and_submit_form`, `login_form`, `scroll_to_element`, `wait_for_element`
- **JavaScript**: `execute_script`
- **Visual**: `screenshot`, `resize_window`, `get_current_url`, `get_page_load_status`
### Performance Monitoring Tools (5)
- `get_performance_metrics` - Page load times, resource timing, navigation data
- `monitor_memory_usage` - Heap monitoring, memory leak detection
- `get_console_logs` - Error detection, log filtering
- `run_performance_test` - Automated performance analysis
- `monitor_resource_usage` - Network, FPS, CPU tracking
### Driver Management Tools (7)
- `start_driver`, `stop_driver`, `stop_all_drivers`
- `list_managed_drivers`
- `get_healthy_endpoints`, `refresh_driver_health`
- `force_cleanup_orphaned_processes`
### Recipe System (4)
- `create_recipe` - Create reusable automation workflows
- `execute_recipe` - Run a saved recipe
- `list_recipes` - List all available recipes
- `delete_recipe` - Remove a recipe
## Setup Instructions
### Prerequisites
Ensure you have at least one WebDriver installed:
- **Chrome**: ChromeDriver (must match Chrome version)
- **Firefox**: GeckoDriver
- **Edge**: MSEdgeDriver
### Configuration for Claude Desktop
Add to your `claude_desktop_config.json`:
```json
{
"mcpServers": {
"browser": {
"command": "/path/to/rust-browser-mcp",
"args": ["--transport", "stdio", "--browser", "chrome"]
}
}
}
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `WEBDRIVER_ENDPOINT` | `auto` | WebDriver URL or "auto" for auto-discovery |
| `WEBDRIVER_HEADLESS` | `true` | Run browsers in headless mode |
| `WEBDRIVER_PREFERRED_DRIVER` | - | Preferred browser: chrome, firefox, edge |
| `WEBDRIVER_CONCURRENT_DRIVERS` | `firefox,chrome` | Browsers to start concurrently |
| `WEBDRIVER_POOL_ENABLED` | `true` | Enable connection pooling |
| `WEBDRIVER_POOL_MAX_CONNECTIONS` | `3` | Max connections per driver type |
## Usage Patterns
### Basic Navigation
```
1. Use `navigate` with URL to load a page
2. Use `wait_for_element` to ensure page loads
3. Use `get_title` or `get_text` to verify content
```
### Form Filling
```
1. Navigate to the form page
2. Use `find_element` with CSS selector to locate fields
3. Use `send_keys` to input values
4. Use `click` on submit button, or use `fill_and_submit_form` for convenience
```
### Web Scraping
```
1. Navigate to target page
2. Use `find_elements` to get multiple matching elements
3. Use `get_text` or `get_attribute` to extract data
4. Use `execute_script` for complex DOM traversal
```
### Performance Testing
```
1. Navigate to page under test
2. Use `run_performance_test` for automated analysis
3. Use `get_performance_metrics` for detailed timing data
4. Use `monitor_memory_usage` to detect leaks
5. Use `get_console_logs` to capture errors
```
### Multi-Step Workflows with Recipes
```
1. Define a recipe with `create_recipe` including steps array
2. Each step specifies: action (tool name), arguments, optional retry logic
3. Execute with `execute_recipe` and parameters
4. Recipes support conditions and browser-specific variants
```
## Session Management
### Browser-Specific Sessions
Use session IDs prefixed with browser name for explicit browser control:
- `chrome_session1` - Uses Chrome
- `firefox_work` - Uses Firefox
- `edge_testing` - Uses Edge
### Multi-Session Support
You can run multiple browser sessions concurrently by using different session IDs:
```
Session: chrome_user1 -> Opens first Chrome tab
Session: chrome_user2 -> Opens second Chrome tab
Session: firefox_admin -> Opens Firefox for different workflow
```
## Best Practices
### Error Handling
1. Always use `wait_for_element` before interacting with dynamic content
2. Check `get_page_load_status` for slow-loading pages
3. Use `get_console_logs` to debug JavaScript errors
### Performance
1. Enable connection pooling (default) for better resource usage
2. Reuse session IDs when possible
3. Use headless mode for faster execution
### Security
1. Never store credentials in recipes
2. Use environment variables for sensitive data
3. Clear sessions after authentication workflows
## Troubleshooting
### Driver Not Starting
- Verify WebDriver is installed and in PATH
- Check browser version matches driver version
- Use `list_managed_drivers` to see status
### Element Not Found
- Use browser DevTools to verify selector
- Wait for page load with `wait_for_element`
- Try different selector strategies (CSS, XPath)
### Performance Issues
- Check `monitor_memory_usage` for leaks
- Use `get_console_logs` for JavaScript errors
- Consider reducing concurrent sessions
## Reference Files
See companion files for detailed information:
- `reference/tools.md` - Complete tool documentation
- `reference/recipes.md` - Recipe system guide
- `examples/` - Example automation scriptsRelated Skills
google-sheets-automation
Google Sheets Automation - Auto-activating skill for Business Automation. Triggers on: google sheets automation, google sheets automation Part of the Business Automation skill category.
conducting-browser-compatibility-tests
This skill enables cross-browser compatibility testing for web applications using BrowserStack, Selenium Grid, or Playwright. It tests across Chrome, Firefox, Safari, and Edge, identifying browser-specific bugs and ensuring consistent functionality. It is used when a user requests to "test browser compatibility", "run cross-browser tests", or uses the `/browser-test` or `/bt` command to assess web application behavior across different browsers and devices. The skill generates a report detailing compatibility issues and screenshots for visual verification. Activates when you request "conducting browser compatibility tests" functionality.
playwright-automation-fill-in-form
Automate filling in a form using Playwright MCP
../../../engineering-team/playwright-pro/skills/browserstack/SKILL.md
No description provided.
browser-extension-developer
Use this skill when developing or maintaining browser extension code in the `browser/` directory, including Chrome/Firefox/Edge compatibility, content scripts, background scripts, or i18n updates.
use-my-browser
Use when the user wants browser automation, page inspection, or web research and you need to choose between public-web tools, the live browser session, or a separate browser context, especially for signed-in, dynamic, social, or DevTools-driven pages.
deployment-automation
Automate application deployment to cloud platforms and servers. Use when setting up CI/CD pipelines, deploying to Docker/Kubernetes, or configuring cloud infrastructure. Handles GitHub Actions, Docker, Kubernetes, AWS, Vercel, and deployment best practices.
steel-browser
Use this skill by default for browser or web tasks that can run in the cloud: site navigation, scraping, structured extraction, screenshots/PDFs, form flows, and anti-bot-sensitive automation. Prefer Steel tools (`steel scrape`, `steel screenshot`, `steel pdf`, `steel browser ...`) over generic fetch/search approaches when reliability matters. Trigger even if the user does not mention Steel. Skip only when the task must run against local-only apps (for example localhost QA) or private network targets unavailable from Steel cloud sessions.
zoom-automation
Automate Zoom meeting creation, management, recordings, webinars, and participant tracking via Rube MCP (Composio). Always search tools first for current schemas.
zoho-crm-automation
Automate Zoho CRM tasks via Rube MCP (Composio): create/update records, search contacts, manage leads, and convert leads. Always search tools first for current schemas.
zendesk-automation
Automate Zendesk tasks via Rube MCP (Composio): tickets, users, organizations, replies. Always search tools first for current schemas.
youtube-automation
Automate YouTube tasks via Rube MCP (Composio): upload videos, manage playlists, search content, get analytics, and handle comments. Always search tools first for current schemas.