cmux-browser
Open websites, take screenshots, inspect elements, and debug UI issues using cmux browser automation. Use this skill whenever you need to visually verify a web page, debug CSS/layout problems, check if a UI change looks correct, take a screenshot of a running site, inspect computed styles or DOM structure, or interact with a page in a real browser. Also trigger when the user says "open this in the browser", "take a screenshot", "check the page", "what does it look like", "inspect the element", "debug the layout", or references cmux. Use this proactively when working on frontend/UI tasks to verify your changes actually render correctly rather than just assuming they do.
Best use case
cmux-browser is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using cmux-browser should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/cmux-browser/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How cmux-browser Compares
| Feature / Agent | cmux-browser | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It drives a real browser through the cmux CLI so an agent can open pages, take screenshots, inspect the DOM and computed styles, and interact with elements to verify that frontend changes render correctly.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# cmux Browser Automation
Open pages, screenshot, inspect the DOM, and debug layout — all from the CLI.
All commands target a surface ID (e.g., `surface:7`). Open a page first to get one, then reuse it for the session.
## Open and Navigate
```bash
cmux browser open-split <url> # opens in split pane, returns surface ID
cmux browser surface:N navigate <url> # navigate existing surface
cmux browser surface:N reload
cmux browser surface:N back
```
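A minimal sketch of capturing the surface ID for reuse. `open_surface` is a hypothetical helper, and it assumes the `open-split` output contains a token of the form `surface:<number>`; the exact output format is not specified here, so adjust the pattern if it differs.

```bash
# Hypothetical helper: open a URL in a split pane and print the surface ID.
# Assumption: the command output contains a token like "surface:7".
open_surface() {
  cmux browser open-split "$1" | grep -oE 'surface:[0-9]+' | head -n 1
}

# Usage (requires a running cmux session):
#   surface=$(open_surface "http://localhost:8000")
#   cmux browser "$surface" reload
```

Storing the ID in a variable keeps later commands pointed at the same surface for the rest of the session.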
## Screenshot
Save to /tmp, then Read the PNG to view it.
```bash
cmux browser surface:N screenshot --out /tmp/page.png
```
If the screenshot is blank, the page hasn't loaded — `navigate` to the URL again and retry.
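The retry above could be wrapped in a small function. This is a sketch, not part of the skill: `reshoot` is a hypothetical name, and it combines only subcommands listed in this document (`navigate`, `wait --load-state complete`, `screenshot --out`). It assumes each subcommand exits non-zero on failure.

```bash
# Hypothetical helper: navigate again, wait for the load to finish, then screenshot.
# Args: surface ID, URL, output path.
# Assumption: each cmux subcommand returns a non-zero status when it fails.
reshoot() {
  cmux browser "$1" navigate "$2" &&
    cmux browser "$1" wait --load-state complete &&
    cmux browser "$1" screenshot --out "$3"
}

# Usage:
#   reshoot surface:7 "http://localhost:8000" /tmp/page.png
```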
## DOM Snapshot
Returns the accessibility tree with `[ref=eN]` identifiers — faster than reading HTML.
```bash
cmux browser surface:N snapshot --interactive --compact
cmux browser surface:N snapshot --selector ".component" --interactive # scoped
```
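To see which refs a snapshot exposes, the output can be filtered with standard tools. A rough sketch, assuming the refs appear literally as `[ref=eN]` in the text output as described above; `list_refs` is a hypothetical helper name.

```bash
# Hypothetical helper: list the unique [ref=eN] identifiers in a snapshot.
# Assumption: refs are printed literally in the form [ref=e12].
list_refs() {
  cmux browser "$1" snapshot --interactive --compact |
    grep -oE '\[ref=e[0-9]+\]' | sort -u
}

# Usage:
#   list_refs surface:7
```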
## Inspect Elements
```bash
cmux browser surface:N get styles "<sel>" --property <prop> # computed CSS
cmux browser surface:N get text "<sel>" # visible text
cmux browser surface:N get html "<sel>" # innerHTML
cmux browser surface:N get box "<sel>" # bounding box
cmux browser surface:N get count "<sel>" # match count
cmux browser surface:N is visible "<sel>"
cmux browser surface:N highlight "<sel>" # visual highlight
```
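When several computed properties matter for one element, the `get styles` call can be looped. A sketch under assumptions: `style_report` is a hypothetical helper, and it assumes `get styles ... --property` prints the value on a single line.

```bash
# Hypothetical helper: print several computed CSS properties for one selector.
# Args: surface ID, selector, then one or more property names.
# Assumption: "get styles" prints the property value followed by a newline.
style_report() {
  surface=$1
  selector=$2
  shift 2
  for prop in "$@"; do
    printf '%s: ' "$prop"
    cmux browser "$surface" get styles "$selector" --property "$prop"
  done
}

# Usage:
#   style_report surface:7 ".card" display position width
```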
## JavaScript Eval
For multi-property inspection, use `eval` with an IIFE returning JSON:
```bash
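cmux browser surface:N eval "(() => {
  const el = document.querySelector('.my-element');
  // Guard against a selector that matches nothing
  if (el === null) return JSON.stringify({ error: 'no match for .my-element' });
  const cs = getComputedStyle(el);
  return JSON.stringify({
    width: el.offsetWidth,
    scrollWidth: el.scrollWidth,
    overflow: cs.overflow,
    // clientWidth excludes borders, so this catches small overflows too
    isOverflowing: el.scrollWidth > el.clientWidth
  });
})()"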
```
## Interact
```bash
cmux browser surface:N click "<sel>"
cmux browser surface:N fill "<sel>" --text "value"
cmux browser surface:N scroll down 300
cmux browser surface:N eval "document.querySelector('<sel>').scrollIntoView({block:'center'})"
```
## Wait
```bash
cmux browser surface:N wait --load-state complete
cmux browser surface:N wait --selector ".loaded"
cmux browser surface:N wait --text "Expected text"
```
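The interact and wait commands combine naturally into a form check. The following is a sketch only: `fill_and_submit` is a hypothetical helper, the selectors and text in the usage line are placeholders, and it assumes `wait --text` exits non-zero if the text never appears.

```bash
# Hypothetical helper: fill a field, click a button, then wait for confirmation text.
# Args: surface ID, field selector, value, button selector, expected text.
# Assumption: "wait --text" fails (non-zero exit) when the text does not show up.
fill_and_submit() {
  cmux browser "$1" fill "$2" --text "$3" &&
    cmux browser "$1" click "$4" &&
    cmux browser "$1" wait --text "$5"
}

# Usage (placeholder selectors):
#   fill_and_submit surface:7 "#email" "a@example.test" "#submit" "Saved"
```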
## Gotchas
- **Server restarts**: Gunicorn `--reload` only watches .py files. Touch a Python file or `kill -HUP <master-pid>` to pick up template changes.
- **Narrow viewport**: The cmux split is narrower than a real browser. JS eval widths reflect actual layout at full page width — more reliable than what the screenshot shows.
- **Debug outlines**: `cmux browser surface:N addstyle "* { outline: 1px solid red; }"` to visualize element boundaries.
- **Console errors**: `cmux browser surface:N errors list`
Related Skills
stop-slop
Use this skill when writing or editing prose to eliminate predictable AI writing patterns. Helps make writing more direct, authentic, and human.
sonos-control
Control Sonos speakers on Tim's home network. Use when the user wants to (1) play, pause, or stop music on Sonos speakers, (2) change volume on speakers, (3) skip tracks, (4) check what's playing, (5) see speaker status, (6) group or ungroup speakers, (7) any Sonos or music/audio playback task involving home speakers. Triggers on "sonos", "speakers", "play music", "what's playing", "volume", "turn up", "turn down", "pause music", "stop music".
slack-message
Draft and send Slack messages in Tim's natural voice. Use when the user wants to (1) post an update to a channel, (2) draft a Slack message, (3) share something on Slack, (4) send a DM, (5) reply in a thread. Applies Tim's Slack writing style and prose principles automatically.
skill-creator
Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
sending-to-codex
Delegate tasks or ask questions to OpenAI's Codex CLI from within Claude Code. Use this skill when the user says "ask codex", "send to codex", "delegate to codex", "have codex do this", "get codex's opinion", "run this in codex", or wants to offload a coding task or question to the Codex agent. Supports both fire-and-forget coding tasks (fix bugs, add features, refactor) and research questions (analyze code, explain behavior, get a second opinion).
reviewing-writing
Review and critique writing using Michael Nielsen's principles on craft. Analyzes text for purpose focus, brevity, danger words, opening strength, originality, reader psychology, truthfulness, and title impact. Use when the user says "review my writing", "nielsen review", "writing review", "review this writing", "critique my writing", or asks for feedback on prose quality.
reviewing-code
Review pull requests, branch changes, or code diffs. Triggers on "review this PR", "review my changes", "code review", "review branch", or GitHub PR URLs. Focuses on bugs, tests, complexity, and performance - not linting.
resend-email
Send emails via Resend.com API. Use when the user wants to (1) send an email, (2) email someone, (3) send a message to an email address, (4) send email with attachments, (5) schedule an email for later. Requires RESEND_API_KEY environment variable.
refresh-dotfiles
Full sync of personal (yadm) and work (yadm-work) dotfiles. Pulls remote changes, commits and pushes local changes, and audits for untracked files that should be tracked. Use when the user says 'refresh yadm', 'sync dotfiles', 'dotfiles sync', or 'update dotfiles'.
omnifocus
Interact with OmniFocus task manager via the command-line interface (@stephendolan/omnifocus-cli). Use when the user wants to: (1) Add tasks or projects to OmniFocus, (2) List, view, or search tasks/projects, (3) Update or complete tasks, (4) Manage inbox items, (5) Work with tags and analyze tag usage, (6) Process or organize their OmniFocus database from the command line.
omnifocus-triage
Interactively process OmniFocus inbox items using AskUserQuestion. Use when the user wants to (1) triage their inbox, (2) process inbox items, (3) organize their OmniFocus inbox, (4) clear out their inbox, (5) do a GTD-style inbox review. Triggers on "triage inbox", "process inbox", "organize inbox", "clear inbox", "inbox zero".
Nightshift
Manage and interact with Nightshift, an AI-powered development automation tool that runs coding tasks during off-hours.