Puppeteer

Automate Chrome and Chromium with Puppeteer for scraping, testing, screenshots, and browser workflows.

3,891 stars
Complexity: medium

About this skill

The Puppeteer skill lets AI agents control Chrome or Chromium programmatically. Agents can scrape data from web pages, run end-to-end tests against web applications, generate screenshots and PDFs, and carry out multi-step browser workflows such as filling forms, clicking buttons, and navigating between pages.

The skill addresses common browser-automation failures with a set of core rules: wait for elements before acting on them, use specific and stable CSS selectors, and handle page navigation explicitly. These rules make interaction with dynamic web content more reliable for data collection, quality assurance, and content generation, tasks that otherwise require manual browser work or custom scripting.

Best use case

The primary use case is automating any task that requires interacting with a web browser. This includes data scraping from websites, running UI tests to ensure website functionality, or creating visual assets like screenshots and PDFs. It benefits developers, data analysts, QA professionals, and anyone needing programmatic control over web browsers for information gathering, testing, or content creation.

A successful run completes the requested web workflow and produces scraped data, a test report, a screenshot, or a PDF.

Practical example

Example input

Scrape the product titles, prices, and images from the first three pages of the 'electronics' category on 'example-ecommerce.com'.

Example output

Successfully extracted data for 30 products. Saved product details to `~/puppeteer/output/example-ecommerce_electronics.json` and screenshots to `~/puppeteer/output/images/`.
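
A request like the one above could be handled by a loop such as the following. This is a minimal sketch, not the skill's actual script: the `scrapeCategory` name, the `?page=N` URL scheme, and every CSS selector are hypothetical placeholders that would have to be replaced after inspecting the real site. `page` is assumed to be a Puppeteer `Page`.

```javascript
// Hypothetical sketch: the URL scheme and all selectors are placeholders.
async function scrapeCategory(page, baseUrl, pageCount = 3) {
  const products = [];
  for (let n = 1; n <= pageCount; n += 1) {
    await page.goto(`${baseUrl}?page=${n}`, { waitUntil: 'domcontentloaded' });
    // Wait for at least one product card before reading the list.
    await page.waitForSelector('.product-card');
    // The mapping runs in the browser; only plain objects come back to Node.
    const batch = await page.$$eval('.product-card', (cards) =>
      cards.map((card) => ({
        title: card.querySelector('.product-title')?.textContent.trim() ?? null,
        price: card.querySelector('.product-price')?.textContent.trim() ?? null,
        image: card.querySelector('img')?.src ?? null,
      }))
    );
    products.push(...batch);
  }
  return products;
}
```

Missing fields come back as `null` rather than throwing, so one malformed card does not abort the whole run.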

When to use this skill

  • When you need to perform web scraping for data extraction.
  • When conducting end-to-end (E2E) testing of web applications.
  • When generating screenshots or PDFs of web pages.
  • When automating complex multi-step browser workflows like filling forms or navigating gated content.

When not to use this skill

  • When the task only involves simple HTTP requests without needing a full browser context.
  • When interacting with APIs is sufficient and a browser is unnecessary.
  • For tasks that are entirely local and do not involve web interaction.
  • If the target website employs strong, evolving bot detection that specifically targets headless browsers.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/puppeteer-1-0-0/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/1215656/puppeteer-1-0-0/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/puppeteer-1-0-0/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How Puppeteer Compares

| Feature / Agent | Puppeteer | Standard Approach |
|-----------------|-----------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | medium | N/A |

Frequently Asked Questions

What does this skill do?

Automate Chrome and Chromium with Puppeteer for scraping, testing, screenshots, and browser workflows.

How difficult is it to install?

The installation complexity is rated as medium. You can find the installation instructions above.

Where can I find the source code?

The source is hosted on GitHub in the openclaw/skills repository; the raw SKILL.md URL appears in the installation command above.

SKILL.md Source

## Setup

On first use, read `setup.md` for integration guidelines.

## When to Use

User needs browser automation: web scraping, E2E testing, PDF generation, screenshots, or any headless Chrome task. Agent handles page navigation, element interaction, waiting strategies, and data extraction.

## Architecture

Scripts and outputs in `~/puppeteer/`. See `memory-template.md` for structure.

```
~/puppeteer/
├── memory.md       # Status + preferences
├── scripts/        # Reusable automation scripts
└── output/         # Screenshots, PDFs, scraped data
```
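
A script can create this layout before writing any output. The following is a minimal sketch using only Node's standard library; the `ensurePuppeteerDirs` name is hypothetical, and `memory.md` is left out because its contents come from `memory-template.md`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Create scripts/ and output/ under a base directory (default: ~/puppeteer).
// The function name is hypothetical, not part of the skill.
function ensurePuppeteerDirs(baseDir = path.join(os.homedir(), 'puppeteer')) {
  const dirs = {
    base: baseDir,
    scripts: path.join(baseDir, 'scripts'),
    output: path.join(baseDir, 'output'),
  };
  for (const dir of Object.values(dirs)) {
    fs.mkdirSync(dir, { recursive: true }); // no error if it already exists
  }
  return dirs;
}
```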

## Quick Reference

| Topic | File |
|-------|------|
| Setup process | `setup.md` |
| Memory template | `memory-template.md` |
| Selectors guide | `selectors.md` |
| Waiting patterns | `waiting.md` |

## Core Rules

### 1. Always Wait Before Acting
Never click or type immediately after navigation. Always wait for the element:
```javascript
await page.waitForSelector('#button');
await page.click('#button');
```
Clicking before the element exists is one of the most common causes of "element not found" errors.
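
The wait and the click can be combined in one helper. This is a sketch under assumptions: `waitAndClick` is a hypothetical name and the 10-second timeout is an arbitrary choice. It uses the `visible` and `timeout` options of `page.waitForSelector` and clicks the element handle that the wait returns.

```javascript
// Wait until the element is present and visible, then click the handle that
// waitForSelector resolved to. `page` is a Puppeteer Page.
async function waitAndClick(page, selector, timeoutMs = 10000) {
  const handle = await page.waitForSelector(selector, {
    visible: true,      // require the element to be rendered, not just attached
    timeout: timeoutMs, // fail after this many ms instead of the 30 s default
  });
  await handle.click();
  return handle;
}
```

Clicking the returned handle avoids a second selector lookup between the wait and the click.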

### 2. Use Specific Selectors
Prefer stable selectors in this order:
1. `[data-testid="submit"]` — test attributes (most stable)
2. `#unique-id` — IDs
3. `form button[type="submit"]` — semantic combinations
4. `.class-name` — classes (least stable, changes often)

Avoid: `div > div > div > button` — breaks on any DOM change.
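
When it is unclear which selector a page supports, the preference order above can be expressed as a fallback list. This is a sketch only: `firstMatchingSelector` and the candidate selectors are hypothetical. It relies on `page.$` resolving to `null` when nothing matches.

```javascript
// Try candidate selectors from most to least stable and return the first one
// that matches an element on the page, or null if none do.
async function firstMatchingSelector(page, candidates) {
  for (const selector of candidates) {
    const handle = await page.$(selector);
    if (handle) {
      await handle.dispose(); // only the selector string is needed
      return selector;
    }
  }
  return null;
}

// Hypothetical candidates, ordered: test attribute, ID, semantic, class.
const submitCandidates = [
  '[data-testid="submit"]',
  '#submit',
  'form button[type="submit"]',
  '.btn-submit',
];
```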

### 3. Handle Navigation Explicitly
After clicks that navigate, wait for navigation:
```javascript
await Promise.all([
  page.waitForNavigation(),
  page.click('a.next-page')
]);
```
Without this, the script continues before the new page loads.
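
The same pattern can be wrapped in a reusable function. In this sketch, `clickAndWaitForNavigation` is a hypothetical name, and `waitUntil: 'domcontentloaded'` is one possible choice among the values Puppeteer accepts (`load`, `domcontentloaded`, `networkidle0`, `networkidle2`).

```javascript
// Register the navigation wait before dispatching the click, so a fast page
// load cannot complete before the script starts listening for it.
async function clickAndWaitForNavigation(page, selector, timeoutMs = 30000) {
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil: 'domcontentloaded', timeout: timeoutMs }),
    page.click(selector),
  ]);
  // Main-resource response, or null for History API / anchor navigations.
  return response;
}
```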

### 4. Set Realistic Viewport
Always set viewport for consistent rendering:
```javascript
await page.setViewport({ width: 1280, height: 800 });
```
Default viewport is 800x600 — many sites render differently or show mobile views.

### 5. Handle Popups and Dialogs
Dismiss dialogs before they block interaction:
```javascript
page.on('dialog', async dialog => {
  await dialog.dismiss(); // or dialog.accept()
});
```
Unhandled dialogs freeze the script.
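
A handler can also record what each dialog said and choose a response by dialog type. The policy in this sketch (accept `confirm()` dialogs, dismiss everything else) and the `installDialogHandler` name are assumptions for illustration; it uses the `type()`, `message()`, `accept()`, and `dismiss()` methods of Puppeteer's `Dialog`.

```javascript
// Register one dialog handler before navigating. It logs every dialog,
// accepts confirm() dialogs, and dismisses all other types.
function installDialogHandler(page, log = []) {
  page.on('dialog', async (dialog) => {
    log.push({ type: dialog.type(), message: dialog.message() });
    if (dialog.type() === 'confirm') {
      await dialog.accept();
    } else {
      await dialog.dismiss();
    }
  });
  return log; // inspect later to see which dialogs appeared
}
```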

### 6. Close Browser on Errors
Always wrap in try/finally:
```javascript
const browser = await puppeteer.launch();
try {
  // ... automation code
} finally {
  await browser.close();
}
```
Leaked browser processes consume memory and ports.
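
To avoid repeating the try/finally in every script, the cleanup can live in one wrapper. This is a sketch: `withBrowser` is a hypothetical helper, and `launch` is any function that resolves to a browser, for example `() => puppeteer.launch()`.

```javascript
// Run `task` with a browser and always close the browser afterwards, whether
// the task resolves or throws. The task's result or error is passed through.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}
```

Passing `launch` in as a function keeps launch options (headless mode, `userDataDir`, and so on) at the call site.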

### 7. Respect Rate Limits
Add delays between requests to avoid blocks:
```javascript
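// page.waitForTimeout() was deprecated and then removed in Puppeteer v22;
// a plain timer works in every version.
await new Promise((resolve) => setTimeout(resolve, 1000 + Math.random() * 2000));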
```
Hammering sites triggers CAPTCHAs and IP bans.
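
For a batch of URLs, the delay can be applied between visits rather than scattered through the script. This is a sketch under assumptions: `sleep`, `randomDelayMs`, and `visitPolitely` are hypothetical names, and the 1 to 3 second range simply mirrors the rule above. The timer is built on `setTimeout`, so it does not rely on any Puppeteer delay API.

```javascript
// Promise-based sleep built on setTimeout.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Random delay in the range [minMs, minMs + jitterMs).
function randomDelayMs(minMs = 1000, jitterMs = 2000) {
  return minMs + Math.random() * jitterMs;
}

// Visit URLs one at a time, pausing a random interval between requests.
// `handlePage` is a caller-supplied async function that extracts data.
async function visitPolitely(page, urls, handlePage, minMs = 1000, jitterMs = 2000) {
  const results = [];
  for (const [index, url] of urls.entries()) {
    if (index > 0) await sleep(randomDelayMs(minMs, jitterMs));
    await page.goto(url);
    results.push(await handlePage(page, url));
  }
  return results;
}
```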

## Common Traps

- `page.click()` on a hidden element → throws, or lands on whatever overlay covers it; use `waitForSelector` with `visible: true`
- Screenshots of elements off-screen → blank image, scroll into view first
- `page.evaluate()` returns undefined → cannot return DOM nodes, only serializable data
- Headless blocked by site → set a realistic user agent; on Puppeteer versions before v22, also try `headless: 'new'` (the new headless mode is the default from v22)
- Form submit reloads page → `page.waitForNavigation()` or data is lost
- Shadow DOM elements invisible to selectors → use `page.evaluateHandle()` to pierce shadow roots
- Cookies not persisting → launch with `userDataDir` for session persistence
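
The `page.evaluate()` trap is usually avoided by converting DOM nodes to plain data inside the browser context. This sketch uses `page.$$eval` for that; `extractLinks` and the default selector are hypothetical.

```javascript
// Map DOM nodes to plain objects inside the browser; only the serializable
// array crosses back to Node, so nothing comes back as undefined.
async function extractLinks(page, selector = 'a[href]') {
  return page.$$eval(selector, (anchors) =>
    anchors.map((a) => ({
      text: (a.textContent || '').trim(),
      href: a.href,
    }))
  );
}
```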

## Security & Privacy

**Data that stays local:**
- All scraped data in ~/puppeteer/output/
- Browser profile in specified userDataDir

**This skill does NOT:**
- Send scraped data anywhere
- Store credentials (you provide them per-script)
- Access files outside ~/puppeteer/

## Related Skills
Install with `clawhub install <slug>` if user confirms:
- `playwright` — Cross-browser automation alternative
- `chrome` — Chrome DevTools and debugging
- `web` — General web development

## Feedback

- If useful: `clawhub star puppeteer`
- Stay updated: `clawhub sync`
