peekaboo
Capture and automate macOS UI with the Peekaboo CLI.
Best use case
peekaboo is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using peekaboo should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/peekaboo/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
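The manual steps above can be sketched as a small shell helper. This is only an illustration: it assumes SKILL.md has already been downloaded somewhere on disk, and the function name `install_skill` and its arguments are hypothetical, not part of peekaboo or any agent tooling.

```bash
#!/usr/bin/env bash
set -euo pipefail

# install_skill SRC [PROJECT_DIR]
# Hypothetical helper: copies an already-downloaded SKILL.md into
# PROJECT_DIR/.claude/skills/peekaboo/ and prints the destination path.
install_skill() {
  local src="$1"
  local project_dir="${2:-.}"
  local dest_dir="$project_dir/.claude/skills/peekaboo"

  # Refuse to continue if the source file is missing.
  if [ ! -f "$src" ]; then
    echo "error: $src not found" >&2
    return 1
  fi

  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/SKILL.md"
  echo "$dest_dir/SKILL.md"
}
```

For example, `install_skill ~/Downloads/SKILL.md .` would place the file under the current project (the download location is an assumption). The agent still needs to be restarted afterwards.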
How peekaboo Compares
| Feature / Agent | peekaboo | Standard Approach |
|---|---|---|
| Platform Support | macOS | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Capture and automate macOS UI with the Peekaboo CLI.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Peekaboo

Peekaboo is a full macOS UI automation CLI: capture/inspect screens, target UI elements, drive input, and manage apps/windows/menus. Commands share a snapshot cache and support `--json`/`-j` for scripting. Run `peekaboo` or `peekaboo <cmd> --help` for flags; `peekaboo --version` prints build metadata. Tip: run via `polter peekaboo` to ensure fresh builds.

## Features (all CLI capabilities, excluding agent/MCP)

Core

- `bridge`: inspect Peekaboo Bridge host connectivity
- `capture`: live capture or video ingest + frame extraction
- `clean`: prune snapshot cache and temp files
- `config`: init/show/edit/validate, providers, models, credentials
- `image`: capture screenshots (screen/window/menu bar regions)
- `learn`: print the full agent guide + tool catalog
- `list`: apps, windows, screens, menubar, permissions
- `permissions`: check Screen Recording/Accessibility status
- `run`: execute `.peekaboo.json` scripts
- `sleep`: pause execution for a duration
- `tools`: list available tools with filtering/display options

Interaction

- `click`: target by ID/query/coords with smart waits
- `drag`: drag & drop across elements/coords/Dock
- `hotkey`: modifier combos like `cmd,shift,t`
- `move`: cursor positioning with optional smoothing
- `paste`: set clipboard -> paste -> restore
- `press`: special-key sequences with repeats
- `scroll`: directional scrolling (targeted + smooth)
- `swipe`: gesture-style drags between targets
- `type`: text + control keys (`--clear`, delays)

System

- `app`: launch/quit/relaunch/hide/unhide/switch/list apps
- `clipboard`: read/write clipboard (text/images/files)
- `dialog`: click/input/file/dismiss/list system dialogs
- `dock`: launch/right-click/hide/show/list Dock items
- `menu`: click/list application menus + menu extras
- `menubar`: list/click status bar items
- `open`: enhanced `open` with app targeting + JSON payloads
- `space`: list/switch/move-window (Spaces)
- `visualizer`: exercise Peekaboo visual feedback animations
- `window`: close/minimize/maximize/move/resize/focus/list

Vision

- `see`: annotated UI maps, snapshot IDs, optional analysis

Global runtime flags

- `--json`/`-j`, `--verbose`/`-v`, `--log-level <level>`
- `--no-remote`, `--bridge-socket <path>`

## Quickstart (happy path)

```bash
peekaboo permissions
peekaboo list apps --json
peekaboo see --annotate --path /tmp/peekaboo-see.png
peekaboo click --on B1
peekaboo type "Hello" --return
```

## Common targeting parameters (most interaction commands)

- App/window: `--app`, `--pid`, `--window-title`, `--window-id`, `--window-index`
- Snapshot targeting: `--snapshot` (ID from `see`; defaults to latest)
- Element/coords: `--on`/`--id` (element ID), `--coords x,y`
- Focus control: `--no-auto-focus`, `--space-switch`, `--bring-to-current-space`, `--focus-timeout-seconds`, `--focus-retry-count`

## Common capture parameters

- Output: `--path`, `--format png|jpg`, `--retina`
- Targeting: `--mode screen|window|frontmost`, `--screen-index`, `--window-title`, `--window-id`
- Analysis: `--analyze "prompt"`, `--annotate`
- Capture engine: `--capture-engine auto|classic|cg|modern|sckit`

## Common motion/typing parameters

- Timing: `--duration` (drag/swipe), `--steps`, `--delay` (type/scroll/press)
- Human-ish movement: `--profile human|linear`, `--wpm` (typing)
- Scroll: `--direction up|down|left|right`, `--amount <ticks>`, `--smooth`

## Examples

### See -> click -> type (most reliable flow)

```bash
peekaboo see --app Safari --window-title "Login" --annotate --path /tmp/see.png
peekaboo click --on B3 --app Safari
peekaboo type "user@example.com" --app Safari
peekaboo press tab --count 1 --app Safari
peekaboo type "supersecret" --app Safari --return
```

### Target by window id

```bash
peekaboo list windows --app "Visual Studio Code" --json
peekaboo click --window-id 12345 --coords 120,160
peekaboo type "Hello from Peekaboo" --window-id 12345
```

### Capture screenshots + analyze

```bash
peekaboo image --mode screen --screen-index 0 --retina --path /tmp/screen.png
peekaboo image --app Safari --window-title "Dashboard" --analyze "Summarize KPIs"
peekaboo see --mode screen --screen-index 0 --analyze "Summarize the dashboard"
```

### Live capture (motion-aware)

```bash
peekaboo capture live --mode region --region 100,100,800,600 --duration 30 \
  --active-fps 8 --idle-fps 2 --highlight-changes --path /tmp/capture
```

### App + window management

```bash
peekaboo app launch "Safari" --open https://example.com
peekaboo window focus --app Safari --window-title "Example"
peekaboo window set-bounds --app Safari --x 50 --y 50 --width 1200 --height 800
peekaboo app quit --app Safari
```

### Menus, menubar, dock

```bash
peekaboo menu click --app Safari --item "New Window"
peekaboo menu click --app TextEdit --path "Format > Font > Show Fonts"
peekaboo menu click-extra --title "WiFi"
peekaboo dock launch Safari
peekaboo menubar list --json
```

### Mouse + gesture input

```bash
peekaboo move 500,300 --smooth
peekaboo drag --from B1 --to T2
peekaboo swipe --from-coords 100,500 --to-coords 100,200 --duration 800
peekaboo scroll --direction down --amount 6 --smooth
```

### Keyboard input

```bash
peekaboo hotkey --keys "cmd,shift,t"
peekaboo press escape
peekaboo type "Line 1\nLine 2" --delay 10
```

Notes

- Requires Screen Recording + Accessibility permissions.
- Use `peekaboo see --annotate` to identify targets before clicking.
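The see -> click -> type flow from the examples can be wrapped in a reusable function so that an agent repeats it with the same structure each time. The sketch below is an assumption-laden illustration, not part of Peekaboo: the helper names `pk` and `fill_field` and the `DRY_RUN` variable are hypothetical, and only subcommands and flags that appear in the SKILL.md examples are used. With `DRY_RUN=1` the commands are printed instead of executed, which allows previewing a sequence without touching the UI.

```bash
#!/usr/bin/env bash
set -euo pipefail

# pk ARGS...
# Hypothetical wrapper: runs peekaboo with the given arguments, or only
# prints the command line (space-joined, without shell quoting) when DRY_RUN=1.
pk() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'peekaboo %s\n' "$*"
  else
    peekaboo "$@"
  fi
}

# fill_field APP ELEMENT_ID TEXT
# Hypothetical helper: snapshot the app, click the element, type the text.
fill_field() {
  local app="$1" element_id="$2" text="$3"
  pk see --app "$app" --annotate --path /tmp/see.png   # refresh the snapshot first
  pk click --on "$element_id" --app "$app"             # element ID comes from the annotated map
  pk type "$text" --app "$app" --return                # type, then press Return
}
```

A preview run such as `DRY_RUN=1 fill_field Safari B3 hello` prints the three commands. Element IDs like `B3` are only meaningful after inspecting the annotated output of `see`, so in practice the ID would be chosen from that snapshot rather than hard-coded.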
Related Skills
opentwitter
Twitter/X data via the 6551 API. Supports user profiles, tweet search, user tweets, follower events, deleted tweets, and KOL followers.
opennews
Crypto news search, AI ratings, trading signals, and real-time updates via the OpenNews 6551 API. Supports keyword search, coin filtering, source filtering, AI score ranking, and WebSocket live feeds.
agent-reach
Give your AI agent eyes to see the entire internet. Read and search across Twitter/X, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu, Instagram, LinkedIn, Boss直聘, RSS, and any web page — all from a single CLI. Use when: (1) reading content from URLs (tweets, Reddit posts, articles, videos), (2) searching across platforms (web, Twitter, Reddit, GitHub, YouTube, Bilibili, XiaoHongShu, Instagram, LinkedIn, Boss直聘), (3) user asks to configure/enable a platform channel, (4) checking channel health or updating Agent Reach. Triggers: "search Twitter/Reddit/YouTube", "read this URL", "find posts about", "搜索", "读取", "查一下", "看看这个链接", "帮我配", "帮我添加", "帮我安装".
searxng-search
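Search the internet using a self-hosted SearXNG search engine. Trigger words: "搜索", "查一下", "帮我查", "查找", "搜一下", "帮我搜索" (Chinese phrases meaning "search", "look it up", "look it up for me", "find", "do a quick search", "search for me").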
multi-search-engine
Multi search engine integration with 17 engines (8 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required.
weather
Get current weather and forecasts via wttr.in or Open-Meteo. Use when: user asks about weather, temperature, or forecasts for any location. NOT for: historical weather data, severe weather alerts, or detailed meteorological analysis. No API key needed.
wacli
Send WhatsApp messages to other people or search/sync WhatsApp history via the wacli CLI (not for normal user chats).
voice-call
Start voice calls via the OpenClaw voice-call plugin.
video-frames
Extract frames or short clips from videos using ffmpeg.
trello
Manage Trello boards, lists, and cards via the Trello REST API.
tmux
Remote-control tmux sessions for interactive CLIs by sending keystrokes and scraping pane output.
things-mac
Manage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database). Use when a user asks OpenClaw to add a task to Things, list inbox/today/upcoming, search tasks, or inspect projects/areas/tags.