web-scraping
This skill activates for web scraping and Actor development. It proactively discovers APIs via traffic interception, recommends optimal strategy (traffic interception/sitemap/API/DOM scraping/hybrid), and implements iteratively. For production, it guides TypeScript Actor creation via Apify CLI.
29 stars
by yfe404
SKILL.md Source
# Web Scraping with Intelligent Strategy Selection
## When This Skill Activates
Activate automatically when the user requests:
- "Scrape [website]"
- "Extract data from [site]"
- "Get product information from [URL]"
- "Find all links/pages on [site]"
- "I'm getting blocked" or "Getting 403 errors" (loads `strategies/anti-blocking.md`)
- "Make this an Apify Actor" (loads `apify/` subdirectory)
- "Productionize this scraper"
## Input Parsing
Determine reconnaissance depth from user request:
| User Says | Mode | Phases Run |
|-----------|------|------------|
| "quick recon", "just check", "what framework" | Quick | Phase 0 only |
| "scrape X", "extract data from X" (default) | Standard | Phases 0-3 + 5, Phase 4 only if protection signals detected |
| "full recon", "deep scan", "production scraping" | Full | All phases (0-5) including protection testing |
Default is Standard mode. Escalate to Full if protection signals appear during any phase.
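The mode selection above can be sketched as a simple keyword match. This is an illustrative sketch only; the function name and keyword lists mirror the table but are not part of the skill itself.

```javascript
// Illustrative sketch: map a user request to a reconnaissance mode.
// Keyword patterns mirror the table above; the function is hypothetical.
function selectMode(request) {
    const text = request.toLowerCase();
    if (/quick recon|just check|what framework/.test(text)) return 'quick';
    if (/full recon|deep scan|production scraping/.test(text)) return 'full';
    return 'standard'; // default mode
}
```

In practice the escalation rule still applies: a request that starts in Standard mode is upgraded to Full as soon as protection signals appear.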
## Adaptive Reconnaissance Workflow
This skill uses an adaptive phased workflow with quality gates. Each gate asks **"Do I have enough?"** — continue only when the answer is no.
**See**: `strategies/framework-signatures.md` for framework detection tables referenced throughout.
### Phase 0: QUICK ASSESSMENT (curl, no browser)
Gather maximum intelligence with minimum cost — a single HTTP request.
**Step 0a: Fetch raw HTML and headers**
```bash
curl -s -D- -L "https://target.com/page" -o response.html
```
**Step 0b: Check response headers**
- Match headers against `strategies/framework-signatures.md` → Response Header Signatures table
- Note `Server`, `X-Powered-By`, `X-Shopify-Stage`, `Set-Cookie` (protection markers)
- Check HTTP status code (200 = accessible, 403 = protected, 3xx = redirects)
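This header triage can be sketched as follows. The helper is hypothetical and assumes lowercased header keys; the header names come from the signature tables referenced above.

```javascript
// Hypothetical triage of a Phase 0 response: status code plus a few
// well-known headers (assumes header keys are lowercased).
function triageResponse(status, headers) {
    const result = { accessible: status === 200, protection: [], hints: [] };
    if (status === 403 || status === 503) result.protection.push(`HTTP ${status}`);
    if (headers['cf-ray']) result.protection.push('Cloudflare (cf-ray)');
    if (headers['x-shopify-stage']) result.hints.push('Shopify');
    if (/php/i.test(headers['x-powered-by'] || '')) result.hints.push('PHP backend');
    return result;
}
```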
**Step 0c: Check Known Major Sites table**
- Match domain against `strategies/framework-signatures.md` → Known Major Sites
- If matched: use the specified data strategy, skip generic pattern scanning
**Step 0d: Detect framework from HTML**
- Search raw HTML for signatures in `strategies/framework-signatures.md` → HTML Signatures table
- Look for `__NEXT_DATA__`, `__NUXT__`, `ld+json`, `/wp-content/`, `data-reactroot`
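A condensed version of this signature scan might look like the sketch below. The marker-to-framework map is a small illustrative subset of the full tables in `strategies/framework-signatures.md`, not a replacement for them.

```javascript
// Hypothetical condensed HTML-signature lookup (subset of the full tables).
const SIGNATURES = {
    'Next.js': '__NEXT_DATA__',
    'Nuxt': '__NUXT__',
    'WordPress': '/wp-content/',
    'React': 'data-reactroot',
    'Structured data': 'ld+json',
};

// Return every framework whose marker appears in the raw HTML.
function detectFrameworks(html) {
    return Object.entries(SIGNATURES)
        .filter(([, marker]) => html.includes(marker))
        .map(([name]) => name);
}
```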
**Step 0e: Search for target data points**
- For each data point the user wants: search raw HTML for that content
- Track which data points are found vs missing
- Check for sitemaps: `curl -s https://[site]/robots.txt | grep -i Sitemap`
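Tracking found-vs-missing data points can be sketched like this. The shape of each data point (a name plus a sample value observed on the live page) is an assumption for illustration.

```javascript
// Hypothetical coverage check: which requested data points already appear
// in the raw HTML, and which are still missing. Each data point carries a
// sample value observed on the live page (e.g. a known price string).
function coverage(html, dataPoints) {
    const found = dataPoints.filter(dp => html.includes(dp.sample));
    const missing = dataPoints.filter(dp => !html.includes(dp.sample));
    return { found: found.map(d => d.name), missing: missing.map(d => d.name) };
}
```

Quality Gate A passes when `missing` is empty and no protection signals were recorded.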
**Step 0f: Note protection signals**
- 403/503 status, Cloudflare challenge HTML, CAPTCHA elements, `cf-ray` header
- Record for Phase 4 decision
**See**: `strategies/cheerio-vs-browser-test.md` for the Cheerio viability assessment
> **QUALITY GATE A**: All target data points found in raw HTML + no protection signals?
> → YES: Skip to Phase 3 (Validate Findings). No browser needed.
> → NO: Continue to Phase 1.
### Phase 1: BROWSER RECONNAISSANCE (only if Phase 0 needs it)
Launch browser only for data points missing from raw HTML or when JavaScript rendering is required.
**Step 1a: Initialize browser session**
- `proxy_start()` → Start traffic interception proxy
- `interceptor_chrome_launch(url, stealthMode: true)` → Launch Chrome with anti-detection
- `interceptor_chrome_devtools_attach(target_id)` → Attach DevTools bridge
- `interceptor_chrome_devtools_screenshot()` → Capture visual state
**Step 1b: Capture traffic and rendered DOM**
- `proxy_list_traffic()` → Review all traffic from page load
- `proxy_search_traffic(query: "application/json")` → Find JSON responses
- `interceptor_chrome_devtools_list_network(resource_types: ["xhr", "fetch"])` → XHR/fetch calls
- `interceptor_chrome_devtools_snapshot()` → Accessibility tree (rendered DOM)
**Step 1c: Search rendered DOM for missing data points**
- For each data point NOT found in Phase 0: search rendered DOM
- Use framework-specific search strategy from `strategies/framework-signatures.md` → Framework → Search Strategy table
- Only search patterns relevant to the detected framework
**Step 1d: Inspect discovered endpoints**
- `proxy_get_exchange(exchange_id)` → Full request/response for promising endpoints
- Document: method, headers, auth, response structure, pagination
> **QUALITY GATE B**: All target data points now covered (raw HTML + rendered DOM + traffic)?
> → YES: Skip to Phase 3 (Validate Findings). No deep scan needed.
> → NO: Continue to Phase 2 for missing data points only.
### Phase 2: DEEP SCAN (only for missing data points)
Targeted investigation for data points not yet found. Only search for what's missing.
**Step 2a: Test interactions for missing data**
- `proxy_clear_traffic()` before each action → Isolate API calls
- `humanizer_click(target_id, selector)` → Trigger dynamic content loads
- `humanizer_scroll(target_id, direction, amount)` → Trigger lazy loading / infinite scroll
- `humanizer_idle(target_id, duration_ms)` → Wait for delayed content
- After each action: `proxy_list_traffic()` → Check for new API calls
**Step 2b: Sniff APIs (framework-aware)**
- Search only patterns relevant to detected framework:
- Next.js → `proxy_list_traffic(url_filter: "/_next/data/")`
- WordPress → `proxy_list_traffic(url_filter: "/wp-json/")`
- GraphQL → `proxy_search_traffic(query: "graphql")`
- Generic → `proxy_list_traffic(url_filter: "/api/")` + `proxy_search_traffic(query: "application/json")`
- Skip patterns that don't apply to the detected framework
**Step 2c: Test pagination and filtering**
- Only if pagination data is a missing data point or needed for coverage assessment
- `proxy_clear_traffic()` → click next page → `proxy_list_traffic(url_filter: "page=")`
- Document pagination type (URL-based, API offset, cursor, infinite scroll)
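Classifying the captured "next page" request can be sketched as below. The query parameter names (`cursor`, `after`, `offset`, `skip`, `page`) are common conventions, not guaranteed for any given site.

```javascript
// Hypothetical pagination classifier for the request captured after
// clicking "next page". Parameter names are common conventions only.
function paginationType(requestUrl) {
    const params = new URL(requestUrl).searchParams;
    if (params.has('cursor') || params.has('after')) return 'cursor';
    if (params.has('offset') || params.has('skip')) return 'API offset';
    if (params.has('page')) return 'URL-based';
    return 'unknown (possibly infinite scroll)';
}
```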
> **QUALITY GATE C**: Enough data points covered for a useful report?
> → YES: Go to Phase 3.
> → NO: Document gaps, go to Phase 3 anyway (report will note missing data in self-critique).
### Phase 3: VALIDATE FINDINGS
Every claimed extraction method must be verified. A data point is not "found" until the extraction path is specified and tested.
**See**: `strategies/cheerio-vs-browser-test.md` for validation methodology
**Step 3a: Validate CSS selectors**
- For each Cheerio/selector-based method: confirm the selector matches actual HTML
- Test against raw HTML (curl output) or rendered DOM (snapshot)
- Confirm selector extracts the correct value, not a different element
**Step 3b: Validate JSON paths**
- For each JSON extraction (e.g., `__NEXT_DATA__`, API response): confirm the path resolves
- Parse the JSON, follow the path, verify it returns the expected data type and value
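The path check can be sketched with a small dotted-path resolver. Both helpers are illustrative; a real validation might also check the value's type against the report's expectations.

```javascript
// Hypothetical dotted-path resolver: walks parsed JSON one key at a time,
// returning undefined as soon as any segment is missing.
function resolvePath(obj, path) {
    return path.split('.').reduce(
        (node, key) => (node == null ? undefined : node[key]),
        obj,
    );
}

// Validate that a claimed JSON path resolves and matches the expected value.
function validateJsonPath(json, path, expected) {
    const value = resolvePath(JSON.parse(json), path);
    return { resolves: value !== undefined, matches: value === expected, value };
}
```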
**Step 3c: Validate API endpoints**
- For each discovered API: replay the request (curl or `proxy_get_exchange`)
- Confirm: response status 200, expected data structure, correct values
- Test pagination if claimed (at least page 1 and page 2)
**Step 3d: Downgrade or re-investigate failures**
- If a selector doesn't match: try alternative selectors, or downgrade to PARTIAL confidence
- If an API returns 403: note protection requirement, flag for Phase 4
- If a JSON path is wrong: re-examine the JSON structure, correct the path
### Phase 4: PROTECTION TESTING (conditional)
**See**: `strategies/proxy-escalation.md` for complete skip/run decision logic
**Skip Phase 4 when ALL true**:
- No protection signals detected in Phases 0-2
- All data points have validated extraction methods
- User didn't request "full recon"
**Run Phase 4 when ANY true**:
- 403/challenge page observed during any phase
- Known high-protection domain
- High-volume or production intent
- User explicitly requested it
**If running**:
**Step 4a: Test raw HTTP access**
```bash
curl -s -o /dev/null -w "%{http_code}" "https://target.com/page"
```
- 200 → Cheerio viable, no browser needed for accessible endpoints
- 403/503 → Escalate to stealth browser
**Step 4b: Test with stealth browser** (if needed)
- Already running from Phase 1 — check if pages loaded without challenges
- `interceptor_chrome_devtools_list_cookies(domain_filter: "cloudflare")` → Protection cookies
- `interceptor_chrome_devtools_list_storage_keys(storage_type: "local")` → Fingerprint markers
- `proxy_get_tls_fingerprints()` → TLS fingerprint analysis
**Step 4c: Test with upstream proxy** (if needed)
- `proxy_set_upstream("http://user:pass@proxy-provider:port")`
- Re-test blocked endpoints through proxy
- Document minimum access level for each data point
**Step 4d: Document protection profile**
- What protections exist, what worked to bypass them, what production scrapers will need
### Phase 5: REPORT + SELF-CRITIQUE
Generate the intelligence report, then critically review it for gaps.
**See**: `reference/report-schema.md` for complete report format
**Step 5a: Generate report**
- Follow `reference/report-schema.md` schema (Sections 1-6)
- Include `Validated?` status for every strategy (YES / PARTIAL / NO)
- Include all discovered endpoints with full specs
**Step 5b: Self-critique**
- Write Section 7 (Self-Critique) per `reference/report-schema.md`:
- **Gaps**: Data points not found — why, and what would find them
- **Skipped steps**: Which phases skipped, with quality gate reasoning
- **Unvalidated claims**: Anything marked PARTIAL or NO
- **Assumptions**: Things not verified (e.g., "consistent layout across categories")
- **Staleness risk**: Geo-dependent prices, A/B layouts, session-specific content
- **Recommendations**: Targeted next steps (not "re-run everything")
**Step 5c: Fix gaps with targeted re-investigation**
- If self-critique reveals fixable gaps: go back to the specific phase/step, not a full re-run
- Example: "Price selector untested" → run one curl + parse, don't re-launch browser
- Update report with results
**Step 5d: Record session** (if browser was used)
- `proxy_session_start(name)` → `proxy_session_stop(session_id)` → `proxy_export_har(session_id, path)`
- HAR file captures all traffic for replay. See `strategies/session-workflows.md`
---
### IMPLEMENTATION (after reconnaissance)
After reconnaissance report is accepted, implement scraper iteratively.
**Core Pattern**:
1. Implement recommended approach (minimal code)
2. Test with small batch (5-10 items)
3. Validate data quality
4. Scale to full dataset or fallback
5. Handle blocking if encountered
6. Add robustness (error handling, retries, logging)
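The small-batch-then-scale loop above can be sketched as follows. The `scrapeOne` and `isValid` callbacks, the batch size, and the 80% pass threshold are all illustrative placeholders for the strategy and quality bar chosen during reconnaissance.

```javascript
// Hypothetical iterative runner: scrape a small batch, validate quality,
// then scale only if the batch passes. Callbacks stand in for the
// recommended strategy; the 0.8 threshold is an illustrative default.
async function runIteratively(items, scrapeOne, isValid, batchSize = 5) {
    const batch = items.slice(0, batchSize);
    const results = await Promise.all(batch.map(scrapeOne));
    const passRate = results.filter(isValid).length / results.length;
    if (passRate < 0.8) {
        // Fix extraction or fall back to another strategy before scaling.
        return { scaled: false, passRate, results };
    }
    const rest = await Promise.all(items.slice(batchSize).map(scrapeOne));
    return { scaled: true, passRate, results: results.concat(rest) };
}
```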
**See**: `workflows/implementation.md` for complete implementation patterns and code examples
### PRODUCTIONIZATION (on request)
Convert scraper to production-ready Apify Actor.
**Activation triggers**: "Make this an Apify Actor", "Productionize this", "Deploy to Apify"
**Core Pattern**:
1. Confirm TypeScript preference (STRONGLY RECOMMENDED)
2. Initialize with `apify create` command (CRITICAL)
3. Port scraping logic to Actor format
4. Test locally and deploy
**Note**: During development, proxy-mcp provides reconnaissance and traffic analysis. For production Actors, use Crawlee crawlers (CheerioCrawler/PlaywrightCrawler) on Apify infrastructure.
**See**: `workflows/productionization.md` for complete workflow and `apify/` for Actor development guides
## Quick Reference
| Task | Pattern/Command | Documentation |
|------|----------------|---------------|
| **Reconnaissance** | **Adaptive Phases 0-5** | **`workflows/reconnaissance.md`** |
| Framework detection | Header + HTML signature matching | `strategies/framework-signatures.md` |
| Cheerio vs Browser | Three-way test + early exit | `strategies/cheerio-vs-browser-test.md` |
| Traffic analysis | `proxy_list_traffic()` + `proxy_get_exchange()` | `strategies/traffic-interception.md` |
| Protection testing | Conditional escalation | `strategies/proxy-escalation.md` |
| Report format | Sections 1-7 with self-critique | `reference/report-schema.md` |
| Find sitemaps | `RobotsFile.find(url)` | `strategies/sitemap-discovery.md` |
| Filter sitemap URLs | `RequestList + regex` | `reference/regex-patterns.md` |
| Discover APIs | Traffic capture (automatic) | `strategies/api-discovery.md` |
| DOM scraping | DevTools bridge + humanizer | `strategies/dom-scraping.md` |
| HTTP scraping | `CheerioCrawler` | `strategies/cheerio-scraping.md` |
| Hybrid approach | Sitemap + API | `strategies/hybrid-approaches.md` |
| Handle blocking | Stealth mode + upstream proxies | `strategies/anti-blocking.md` |
| Session recording | `proxy_session_start()` / `proxy_export_har()` | `strategies/session-workflows.md` |
| Proxy-MCP tools | Complete reference | `reference/proxy-tool-reference.md` |
| Fingerprint configs | Stealth + TLS presets | `reference/fingerprint-patterns.md` |
| Create Apify Actor | `apify create` | `apify/cli-workflow.md` |
| Template selection | Cheerio vs Playwright | `workflows/productionization.md` |
| Input schema | `.actor/input_schema.json` | `apify/input-schemas.md` |
| Deploy actor | `apify push` | `apify/deployment.md` |
## Common Patterns
### Pattern 1: Sitemap-Based Scraping
```javascript
import { RobotsFile, CheerioCrawler, Dataset } from 'crawlee';

// Auto-discover and parse sitemaps
const robots = await RobotsFile.find('https://example.com');
const urls = await robots.parseUrlsFromSitemaps();

const crawler = new CheerioCrawler({
    async requestHandler({ $, request }) {
        const data = {
            title: $('h1').text().trim(),
            // ... extract data
        };
        await Dataset.pushData(data);
    },
});

await crawler.addRequests(urls);
await crawler.run();
```
See `examples/sitemap-basic.js` for complete example.
### Pattern 2: API-Based Scraping
```javascript
import { gotScraping } from 'got-scraping';

const productIds = [123, 456, 789];

for (const id of productIds) {
    const response = await gotScraping({
        url: `https://api.example.com/products/${id}`,
        responseType: 'json',
    });
    console.log(response.body);
}
```
See `examples/api-scraper.js` for complete example.
### Pattern 3: Hybrid (Sitemap + API)
```javascript
import { RobotsFile } from 'crawlee';
import { gotScraping } from 'got-scraping';

// Get URLs from sitemap
const robots = await RobotsFile.find('https://shop.com');
const urls = await robots.parseUrlsFromSitemaps();

// Extract IDs from URLs
const productIds = urls
    .map(url => url.match(/\/products\/(\d+)/)?.[1])
    .filter(Boolean);

// Fetch data via API
for (const id of productIds) {
    const data = await gotScraping({
        url: `https://api.shop.com/v1/products/${id}`,
        responseType: 'json',
    });
    // Process data
}
```
See `examples/hybrid-sitemap-api.js` for complete example.
## Directory Navigation
This skill uses **progressive disclosure** - detailed information is organized in subdirectories and loaded only when needed.
### Workflows (Implementation Patterns)
**For**: Step-by-step workflow guides for each phase
- `workflows/reconnaissance.md` - **Adaptive reconnaissance workflow, Phases 0-5 (CRITICAL)**
- `workflows/implementation.md` - Iterative implementation patterns
- `workflows/productionization.md` - Apify Actor creation workflow
### Strategies (Deep Dives)
**For**: Detailed guides on specific scraping approaches
- `strategies/framework-signatures.md` - **Framework detection lookup tables (Phase 0/1)**
- `strategies/cheerio-vs-browser-test.md` - **Cheerio vs Browser decision test with early exit**
- `strategies/proxy-escalation.md` - **Protection testing skip/run conditions (Phase 4)**
- `strategies/traffic-interception.md` - Traffic interception via MITM proxy
- `strategies/sitemap-discovery.md` - Complete sitemap guide (4 patterns)
- `strategies/api-discovery.md` - Finding and using APIs
- `strategies/dom-scraping.md` - DOM scraping via DevTools bridge
- `strategies/cheerio-scraping.md` - HTTP-only scraping
- `strategies/hybrid-approaches.md` - Combining strategies
- `strategies/anti-blocking.md` - Multi-layer anti-detection (stealth, humanizer, proxies, TLS)
- `strategies/session-workflows.md` - Session recording, HAR export, replay
### Examples (Runnable Code)
**For**: Working code to reference or execute
**JavaScript Learning Examples** (Simple standalone scripts):
- `examples/sitemap-basic.js` - Simple sitemap scraper
- `examples/api-scraper.js` - Pure API approach
- `examples/traffic-interception-basic.js` - Proxy-based reconnaissance
- `examples/hybrid-sitemap-api.js` - Combined approach
- `examples/iterative-fallback.js` - Try traffic interception → sitemap → API → DOM scraping
**TypeScript Production Examples** (Complete Actors):
- `apify/examples/basic-scraper/` - Sitemap + Playwright
- `apify/examples/anti-blocking/` - Fingerprinting + proxies
- `apify/examples/hybrid-api/` - Sitemap + API (optimal)
### Reference (Quick Lookup)
**For**: Quick patterns and troubleshooting
- `reference/report-schema.md` - **Intelligence report format (Sections 1-7 + self-critique)**
- `reference/proxy-tool-reference.md` - Proxy-MCP tool reference (all 80+ tools)
- `reference/regex-patterns.md` - Common URL regex patterns
- `reference/fingerprint-patterns.md` - Stealth mode + TLS fingerprint presets
- `reference/anti-patterns.md` - What NOT to do
### Apify (Production Deployment)
**For**: Creating production Apify Actors
- `apify/README.md` - When and how to use Apify
- `apify/typescript-first.md` - **Why TypeScript for actors**
- `apify/cli-workflow.md` - **apify create workflow (CRITICAL)**
- `apify/initialization.md` - Complete setup guide
- `apify/input-schemas.md` - Input validation patterns
- `apify/configuration.md` - actor.json setup
- `apify/deployment.md` - Testing and deployment
- `apify/templates/` - TypeScript boilerplate
**Note**: Each file is self-contained and can be read independently. Claude will navigate to specific files as needed.
## Core Principles
### 1. Assess Before Committing Resources
Start cheap (curl), escalate only when needed:
- Phase 0 (curl) before Phase 1 (browser) before Phase 2 (deep scan)
- Quality gates skip phases when data is sufficient
- Never launch a browser if curl gives you everything
### 2. Detect First, Then Search Relevant Patterns
Use framework detection to focus searches:
- Match against `strategies/framework-signatures.md` before scanning
- Skip patterns that don't apply (no `__NEXT_DATA__` on Amazon)
- Known major sites get direct strategy lookup
### 3. Validate, Don't Assume
Every claimed extraction method must be tested:
- "Found text in HTML" is not enough — need a working selector/path
- Phase 3 validates every finding before the report
- Unvalidated claims are marked PARTIAL or NO in the report
### 4. Iterative Implementation
Build incrementally:
- Small test batch first (5-10 items)
- Validate quality
- Scale or fallback
- Add robustness last
### 5. Production-Ready Code
When productionizing:
- Use TypeScript (strongly recommended)
- Use `apify create` (never manual setup)
- Add proper error handling
- Include logging and monitoring
---
**Remember**: Traffic interception first, sitemaps second, APIs third, DOM scraping last!
For detailed guidance on any topic, navigate to the relevant subdirectory file listed above.