fox-pilot
Firefox browser automation CLI for AI agents. Use when users ask to automate Firefox, navigate websites, fill forms, take screenshots, extract web data, or test web apps in Firefox. Trigger phrases include "in Firefox", "fox-pilot", "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", or any browser interaction request mentioning Firefox.
Best use case
fox-pilot is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using fox-pilot should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/fox-pilot/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How fox-pilot Compares
| Feature / Agent | fox-pilot | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It gives AI agents a CLI for automating Firefox: navigating websites, filling forms, taking screenshots, extracting web data, and testing web apps.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Fox Pilot Skill
CLI-based Firefox browser automation optimized for AI agents. Simple commands, persistent sessions, accessibility snapshots with refs.
**Difference from agent-browser:** Fox Pilot controls a real Firefox browser (not Chromium), preserving your existing session, cookies, and extensions.
## Setup
```bash
# Install the CLI globally
npm install -g @fox-pilot/cli
# Install native messaging host (run once)
fox-pilot install
# Install Firefox extension from:
# https://addons.mozilla.org/firefox/addon/fox-pilot/
```
## Quick Start
```bash
fox-pilot open example.com
fox-pilot snapshot # Get accessibility tree with refs
fox-pilot click @e2 # Click by ref from snapshot
fox-pilot fill @e3 "test@example.com" # Fill by ref
fox-pilot screenshot /tmp/page.png
```
## Workflow Pattern
1. **Navigate** to URL
2. **Snapshot** to discover elements (use `-i` for interactive only)
3. **Interact** using refs (`@e1`, `@e2`, etc.)
4. **Verify** state with `get url`, `get title`, or screenshot
5. **Repeat** as needed
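The steps above can be sketched as a single shell function. This is a minimal sketch, not part of fox-pilot: `click_by_name` and the `FOX_PILOT` variable are hypothetical names, and it assumes that snapshot lines look like `- role "name" [ref=@eN]` and that the CLI exits non-zero on failure.

```bash
# FOX_PILOT names the CLI to run. The variable is an assumption of this
# sketch (not a fox-pilot feature) so the command can be swapped out.
FOX_PILOT=${FOX_PILOT:-fox-pilot}

# Hypothetical helper. $1 = URL, $2 = accessible name of the element to click.
click_by_name() {
  "$FOX_PILOT" open "$1" || return 1              # 1. Navigate
  snap=$("$FOX_PILOT" snapshot -i) || return 1    # 2. Snapshot (interactive only)
  # Assumed line format: - role "name" [ref=@eN]
  ref=$(printf '%s\n' "$snap" | grep -F -- "\"$2\" [ref=" |
        sed -n '1s/.*\[ref=\(@e[0-9][0-9]*\)\].*/\1/p')
  [ -n "$ref" ] || { echo "no element named \"$2\"" >&2; return 1; }
  "$FOX_PILOT" click "$ref" || return 1           # 3. Interact using the ref
  "$FOX_PILOT" get url                            # 4. Verify the resulting URL
}

# Usage: click_by_name https://example.com/login "Sign in"
```

Looking the ref up from a fresh snapshot on every call avoids relying on refs staying the same between page loads, which the source does not state.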
## Core Commands
### Navigation
```bash
fox-pilot open <url> # Navigate to URL
fox-pilot back # Go back
fox-pilot forward # Go forward
fox-pilot reload # Reload page
```
### Snapshot (Element Discovery)
The snapshot command returns an accessibility tree with refs for each element:
```bash
fox-pilot snapshot
# Output:
# - heading "Example Domain" [ref=@e1] [level=1]
# - button "Submit" [ref=@e2]
# - textbox "Email" [ref=@e3]
# - link "Learn more" [ref=@e4]
```
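When a script rather than a person reads the snapshot, it can help to flatten the tree into one row per element. The helper below is a rough sketch: `snapshot_table` is a hypothetical name, and it assumes every element line follows the `- role "name" [ref=@eN]` shape shown in the sample output above. Lines that do not match are dropped.

```bash
# Hypothetical helper: read snapshot text on stdin and print one
# "ref role name" row per element (the name comes last because it may
# contain spaces). Assumed line format: - role "name" [ref=@eN]
snapshot_table() {
  sed -n 's/^[[:space:]]*- \([^ ]*\) "\(.*\)" \[ref=\(@e[0-9][0-9]*\)\].*/\3 \1 \2/p'
}

# Usage: fox-pilot snapshot | snapshot_table
```

Each row can then be filtered with standard tools, for example `awk '$2 == "button" { print $1 }'` to list the refs of all buttons.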
#### Snapshot Options
```bash
fox-pilot snapshot -i # Interactive elements only (buttons, inputs, links)
fox-pilot snapshot -c # Compact (remove empty structural elements)
fox-pilot snapshot -d 3 # Limit depth to 3 levels
fox-pilot snapshot -s "#main" # Scope to CSS selector
fox-pilot snapshot -i -c -d 5 # Combine options
```
**Recommended:** Use `snapshot -i -c` for most cases; it shows only actionable elements.
### Interactions
```bash
fox-pilot click <sel> # Click element
fox-pilot dblclick <sel> # Double-click
fox-pilot fill <sel> <text> # Clear and fill input
fox-pilot type <sel> <text> # Type into element (append)
fox-pilot press <key> [sel] # Press key (Enter, Tab, Control+a)
fox-pilot select <sel> <val> # Select dropdown option
fox-pilot check <sel> # Check checkbox
fox-pilot uncheck <sel> # Uncheck checkbox
fox-pilot scroll <dir> [px] # Scroll (up/down/left/right)
fox-pilot hover <sel> # Hover element
```
### Get Information
```bash
fox-pilot get text <sel> # Get text content
fox-pilot get html <sel> # Get innerHTML
fox-pilot get value <sel> # Get input value
fox-pilot get attr <sel> <attr> # Get attribute
fox-pilot get title # Get page title
fox-pilot get url # Get current URL
fox-pilot get count <sel> # Count matching elements
```
### Check State
```bash
fox-pilot is visible <sel> # Check if visible
fox-pilot is enabled <sel> # Check if enabled
fox-pilot is checked <sel> # Check if checked
```
### Screenshots
```bash
fox-pilot screenshot [path] # Take screenshot
fox-pilot screenshot -f [path] # Full page screenshot
```
### Waiting
```bash
fox-pilot wait <selector> # Wait for element visible
fox-pilot wait 2000 # Wait 2 seconds
fox-pilot wait --text "Welcome" # Wait for text
fox-pilot wait --url "**/success" # Wait for URL pattern
```
### Semantic Locators (find command)
```bash
fox-pilot find role button click --name "Submit"
fox-pilot find text "Sign In" click
fox-pilot find label "Email" fill "test@test.com"
fox-pilot find placeholder "Search" fill "query"
```
### Tabs
```bash
fox-pilot tab # List tabs
fox-pilot tab new [url] # New tab
fox-pilot tab 2 # Switch to tab 2
fox-pilot tab close # Close current tab
```
### JavaScript Execution
```bash
fox-pilot eval "return document.title"
fox-pilot eval "localStorage.getItem('token')"
```
## Selectors
### Refs (Recommended)
Use refs from snapshot for reliable element selection:
```bash
fox-pilot click @e2
fox-pilot fill @e3 "value"
fox-pilot get text @e1
```
### CSS Selectors
```bash
fox-pilot click "#id"
fox-pilot click ".class"
fox-pilot click "button[type=submit]"
```
## Common Patterns
### Login Flow
```bash
fox-pilot open https://example.com/login
fox-pilot snapshot -i
# Shows: textbox "Email" [ref=@e1], textbox "Password" [ref=@e2], button "Sign in" [ref=@e3]
fox-pilot fill @e1 "user@example.com"
fox-pilot fill @e2 "password123"
fox-pilot click @e3
fox-pilot wait --url "**/dashboard"
fox-pilot get url
```
### Form Submission
```bash
fox-pilot open https://example.com/contact
fox-pilot snapshot -i -c
fox-pilot fill @e1 "John Doe"
fox-pilot fill @e2 "john@example.com"
fox-pilot fill @e3 "Hello, this is my message"
fox-pilot click @e4 # Submit button
fox-pilot wait --text "Thank you"
fox-pilot screenshot /tmp/confirmation.png
```
### Scraping Data
```bash
fox-pilot open https://example.com/products
fox-pilot snapshot -s ".product-list"
fox-pilot get text ".product-title"
fox-pilot get attr ".product-link" "href"
```
### Debug Failed Interaction
```bash
fox-pilot screenshot /tmp/debug.png
fox-pilot get url
fox-pilot get title
fox-pilot snapshot -i
```
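The debug commands above can be run automatically whenever a step fails. The wrapper below is a sketch under one stated assumption: that fox-pilot reports failure through a non-zero exit status, which the source does not confirm. `fox_try` and the `FOX_PILOT` variable are hypothetical names.

```bash
# FOX_PILOT names the CLI to run. The variable is an assumption of this
# sketch (not a fox-pilot feature) so the command can be swapped out.
FOX_PILOT=${FOX_PILOT:-fox-pilot}

# Hypothetical wrapper: run one fox-pilot command. If it exits non-zero,
# collect the debug information to stderr and return the original exit code.
fox_try() {
  "$FOX_PILOT" "$@" && return 0
  code=$?
  echo "fox-pilot $* failed (exit $code), collecting debug info" >&2
  "$FOX_PILOT" screenshot /tmp/debug.png >&2
  "$FOX_PILOT" get url >&2
  "$FOX_PILOT" get title >&2
  "$FOX_PILOT" snapshot -i >&2
  return "$code"
}

# Usage: fox_try click @e3
```

Sending the debug output to stderr keeps stdout clean for callers that capture a command's result.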
## Options
| Option | Description |
| ------------------ | ------------------------- |
| `--json` | JSON output (for parsing) |
| `-f, --full` | Full page screenshot |
| `-i, --interactive`| Interactive elements only |
| `-c, --compact` | Compact snapshot |
| `-d, --depth <n>` | Limit snapshot depth |
| `-s, --scope <sel>`| Scope snapshot to selector|
## Tips
- **Always snapshot first** before interacting with unknown pages
- **Use `-i -c` flags** on snapshot to reduce noise
- **Prefer refs over CSS** for reliability
- **Check `--json` output** when you need to parse results programmatically
- **Firefox session is preserved** - your cookies, logins, and extensions work
## Troubleshooting
```bash
# Check if native host is running
lsof -i :9222
# View native host logs
tail -f /tmp/fox-pilot.log
# Reinstall native host
fox-pilot install
# Reload extension
# Go to about:debugging#/runtime/this-firefox and click "Reload"
```