chrome-cdp

Interact with local Chrome browser session (only on explicit user approval after being asked to inspect, debug, or interact with a page open in Chrome)

215 stars

Best use case

chrome-cdp is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Interact with local Chrome browser session (only on explicit user approval after being asked to inspect, debug, or interact with a page open in Chrome)

Teams using chrome-cdp should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/chrome-cdp/SKILL.md --create-dirs "https://raw.githubusercontent.com/megalithic/dotfiles/main/home/common/programs/ai/pi-coding-agent/skills/chrome-cdp/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/chrome-cdp/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How chrome-cdp Compares

Feature / Agentchrome-cdpStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Interact with local Chrome browser session (only on explicit user approval after being asked to inspect, debug, or interact with a page open in Chrome)

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Chrome CDP

Lightweight Chrome DevTools Protocol CLI. Connects directly via WebSocket — no Puppeteer, works with 100+ tabs, instant connection.

## Prerequisites

- Chrome (or Chromium, Brave, Edge, Vivaldi) with remote debugging enabled: open `chrome://inspect/#remote-debugging` and toggle the switch
- Node.js 22+ (uses built-in WebSocket)
- If your browser's `DevToolsActivePort` is in a non-standard location, set `CDP_PORT_FILE` to its full path

## Commands

All commands use `scripts/cdp.mjs`. The `<target>` is a **unique** targetId prefix from `list`; copy the full prefix shown in the `list` output (for example `6BE827FA`). The CLI rejects ambiguous prefixes.

### List open pages

```bash
scripts/cdp.mjs list
```

### Take a screenshot

```bash
scripts/cdp.mjs shot <target> [file]    # default: screenshot-<target>.png in runtime dir
```

Captures the **viewport only**. Scroll first with `eval` if you need content below the fold. Output includes the page's DPR and coordinate conversion hint (see **Coordinates** below).

### Accessibility tree snapshot

```bash
scripts/cdp.mjs snap <target>
```

### Evaluate JavaScript

```bash
scripts/cdp.mjs eval <target> <expr>
```

> **Watch out:** avoid index-based selection (`querySelectorAll(...)[i]`) across multiple `eval` calls when the DOM can change between them (e.g. after clicking Ignore, card indices shift). Collect all data in one `eval` or use stable selectors.

### Other commands

```bash
scripts/cdp.mjs html    <target> [selector]   # full page or element HTML
scripts/cdp.mjs nav     <target> <url>         # navigate and wait for load
scripts/cdp.mjs net     <target>               # resource timing entries
scripts/cdp.mjs click   <target> <selector>    # click element by CSS selector
scripts/cdp.mjs clickxy <target> <x> <y>       # click at CSS pixel coords
scripts/cdp.mjs type    <target> <text>         # Input.insertText at current focus; works in cross-origin iframes unlike eval
scripts/cdp.mjs loadall <target> <selector> [ms]  # click "load more" until gone (default 1500ms between clicks)
scripts/cdp.mjs evalraw <target> <method> [json]  # raw CDP command passthrough
scripts/cdp.mjs open    [url]                  # open new tab (each triggers Allow prompt)
scripts/cdp.mjs stop    [target]               # stop daemon(s)
```

## Coordinates

`shot` saves an image at native resolution: image pixels = CSS pixels × DPR. CDP Input events (`clickxy` etc.) take **CSS pixels**.

```
CSS px = screenshot image px / DPR
```

`shot` prints the DPR for the current page. Typical Retina (DPR=2): divide screenshot coords by 2.

## Tips

- Prefer `snap --compact` over `html` for page structure.
- Use `type` (not eval) to enter text in cross-origin iframes — `click`/`clickxy` to focus first, then `type`.
- Chrome shows an "Allow debugging" modal once per tab on first access. A background daemon keeps the session alive so subsequent commands need no further approval. Daemons auto-exit after 20 minutes of inactivity.

Related Skills

writing-clearly-and-concisely

215
from megalithic/dotfiles

Apply Strunk's timeless writing rules to ANY prose humans will read - documentation, commit messages, error messages, explanations, reports, or UI text. Makes your writing clearer, stronger, and more professional.

web-search

215
from megalithic/dotfiles

Web search using DuckDuckGo (free, unlimited). Falls back to pi-web-access extension for content extraction.

web-browser

215
from megalithic/dotfiles

Interact with web pages using agent-browser CLI. MUST run 'browser connect 9222' FIRST to use existing browser with authenticated sessions.

tmux

215
from megalithic/dotfiles

Remote control tmux sessions for interactive CLIs (python, gdb, etc.) by sending keystrokes and scraping pane output.

ticket-worker

215
from megalithic/dotfiles

Work on a single tk ticket end-to-end. Use when the user says 'work on ticket X' or when spawned by work-tickets.sh.

ticket-creator

215
from megalithic/dotfiles

Create and refine tickets for the tk ticket system. Use when the user says 'create tickets for X', 'refine ticket X', 'break this into tickets', 'seed tickets from plan', or anything about creating or refining tk tickets.

tell

215
from megalithic/dotfiles

Delegate tasks to other agents - pi sessions or external agents (claude, opencode, aider). Non-blocking with task tracking and completion notifications.

task-pipeline

215
from megalithic/dotfiles

Structured workflow for research → plan → tickets → work. Use when starting or continuing a task with /task, /plan, or /tickets commands.

preview

215
from megalithic/dotfiles

Display code, diffs, images, and other content in a tmux pane or popup. Auto-detects nvim/megaterm for floating popups.

mcpctl

215
from megalithic/dotfiles

Manage MCP server configurations — add, remove, list, inspect, troubleshoot. Use when asked to "add mcp server", "remove mcp", "list mcp servers", "mcp status", "configure mcp", "troubleshoot mcp", or any MCP server management task.

handoff

215
from megalithic/dotfiles

Save session state for later pickup. Use /handoff when context is degrading, /pickup to resume in a new session.

github

215
from megalithic/dotfiles

Interact with GitHub using the `gh` CLI. Use `gh issue`, `gh pr`, `gh run`, and `gh api` for issues, PRs, CI runs, and advanced queries.