AI Agent Skill HUB

Data & Research

clawpod

Read any website or search Google, bypassing anti-bot measures, CAPTCHAs, and geo-restrictions. It handles JavaScript rendering server-side via residential proxies and returns HTML or structured JSON.

73 stars

Complexity: easy

About this skill

ClawPod is an AI agent skill for fetching website content and running Google searches in cases where ordinary web fetching fails. It uses Massive's Unblocker APIs, which route requests through a network of residential proxies and handle anti-bot protection, CAPTCHAs, JavaScript rendering, and geo-restrictions. Page fetches return HTML; Google searches return HTML or structured JSON. This makes the skill useful for data extraction and research tasks where standard tools return blocked or incomplete content.

Best use case

The primary use case for ClawPod is to reliably access and extract information from websites or Google search results that actively block automated tools, require extensive JavaScript rendering, or are geo-restricted. This is particularly beneficial for data analysts, developers, and researchers who need to gather comprehensive and accurate web data from challenging or protected sources for analysis, monitoring, or content generation.

Users should expect to receive the requested web page content (HTML or structured JSON) or Google search results, successfully bypassing common web access barriers like CAPTCHAs and anti-bot protections.

Practical example

Example input

Scrape the latest articles from 'nytimes.com' and provide me with the main headlines and URLs as structured JSON. Be sure to bypass any bot protections.

Example output

```json
{
  "articles": [
    {
      "title": "Major Political Event Unfolds",
      "url": "https://www.nytimes.com/politics/event-unfolds"
    },
    {
      "title": "Tech Breakthrough Announced",
      "url": "https://www.nytimes.com/technology/breakthrough"
    }
  ]
}
```

When to use this skill

- A standard web fetch fails, returns a CAPTCHA page, or gives incomplete/blocked content.
- The target site is known to block bots (e.g., news sites, social media, e-commerce, search engines).
- The page requires JavaScript rendering that a standard fetch can't handle.
- The user needs Google search results (organic and paid) as structured data or HTML, especially localized ones.

When not to use this skill

- A standard web fetch or search tool already retrieves the desired content without issues.
- The target is a local file or an internal network that is not exposed to the public internet.
- The task does not involve web data retrieval or searching.
- The source is an API that provides structured data directly and does not block bots.

How clawpod Compares

| Feature / Agent | clawpod | Standard Approach |
|-----------------|---------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |

Frequently Asked Questions

What does this skill do?

Read any website or search Google, bypassing anti-bot measures, CAPTCHAs, and geo-restrictions. It handles JavaScript rendering server-side via residential proxies and returns HTML or structured JSON.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agent for YouTube Script Writing

Find AI agent skills for YouTube script writing, video research, content outlining, and repeatable channel production workflows.

Best AI Skills for ChatGPT

Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

SKILL.md Source

# Massive Unblocker

Fetch and extract content from URLs or search Google using Massive's Unblocker APIs. Handles JS rendering, CAPTCHAs, and retries automatically. Returns HTML or structured JSON.

## When to Use This Skill

Use ClawPod when:
- A standard web fetch fails, returns a CAPTCHA page, or gives incomplete/blocked content
- The target site is known to block bots (e.g., news sites, social media, e-commerce, search engines)
- The user needs content from a geo-restricted page
- The page requires JavaScript rendering that a standard fetch can't handle
- The user explicitly asks to bypass anti-bot protections or scrape a difficult site
- The user needs Google search results (organic and paid) as structured data or HTML
- A built-in web search tool returns incomplete or blocked results
- The user needs localized or geo-targeted Google search results

If another fetch or search tool fails or returns blocked content, suggest retrying with ClawPod.

## Setup

Check for the API token:

```bash
[ -n "$MASSIVE_UNBLOCKER_TOKEN" ] && echo "TOKEN=SET" || echo "TOKEN=MISSING"
```

If token is `MISSING`, stop and tell the user:

> To use ClawPod, you need an API token. It takes under a minute to set up:
>
> 1. Sign up at **clawpod.joinmassive.com/signup** - when you sign up, you get 1,000 free credits. No credit card required.
> 2. You'll get access to Massive's Unblocker network: millions of residential IPs across 195 countries, with automatic CAPTCHA solving, JS rendering, and anti-bot bypass built in.
> 3. Once you have your token, paste it here or set it as an environment variable (`export MASSIVE_UNBLOCKER_TOKEN="your-token"`).

Do not proceed until the token is available.
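
The check above can be wrapped in a small guard so that later commands refuse to run without a token. This is a minimal sketch; `require_token` is a hypothetical helper name, not something ClawPod ships.

```bash
# Hypothetical helper: succeed only when the token is set and non-empty.
require_token() {
  if [ -z "${MASSIVE_UNBLOCKER_TOKEN:-}" ]; then
    echo "MASSIVE_UNBLOCKER_TOKEN is not set; complete Setup first." >&2
    return 1
  fi
}

# Usage: stop a script early when the token is missing.
# require_token || exit 1
```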

## How It Works

Two endpoints. Both use `GET` requests with the same auth token.

**Browser** — fetch and render any URL, returns HTML:
```
https://unblocker.joinmassive.com/browser?url=<encoded-url>
```

**Search** — Google search results as HTML or structured JSON:
```
https://unblocker.joinmassive.com/search?terms=<encoded-terms>
```

Auth header: `Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN`

## Fetching a URL

```bash
curl -s -G --data-urlencode "url=THE_URL" \
  -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
  "https://unblocker.joinmassive.com/browser"
```

Replace `THE_URL` with the actual URL. `curl --data-urlencode` handles URL-encoding automatically.
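
To see why the encoding matters, the sketch below percent-encodes a target URL in plain shell, so that the target's own `?` and `&` are not mistaken for Unblocker parameters. `urlencode` is a hypothetical helper for illustration only and assumes ASCII input; in practice `curl --data-urlencode` does this job, and its exact output may differ in details.

```bash
# Hypothetical helper (ASCII input only): percent-encode every character
# except the RFC 3986 unreserved set: A-Z a-z 0-9 - . _ ~
# The function body is a subshell, so LC_ALL and the variables stay local.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [a-zA-Z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

urlencode 'https://example.com/a?b=1&c=2'
# prints: https%3A%2F%2Fexample.com%2Fa%3Fb%3D1%26c%3D2
```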

## Fetching Multiple URLs

Loop through them sequentially. Each call can take up to 2 minutes (CAPTCHA solving, retries).

```bash
URLS=(
  "https://example.com/page1"
  "https://example.com/page2"
)

for url in "${URLS[@]}"; do
  echo "=== $url ==="
  curl -s -G --data-urlencode "url=$url" \
    -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
    "https://unblocker.joinmassive.com/browser"
done
```
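
The loop can be hardened with a per-request time limit and explicit failure reporting. The sketch below assumes the API signals failures through HTTP error codes, which the source does not state. `--max-time` and `--fail` are standard curl options; `fetch_one` and `fetch_all` are hypothetical names.

```bash
# Hypothetical helper: fetch one URL through the /browser endpoint.
# --max-time 120 caps the request at two minutes; --fail makes curl
# return a non-zero status on HTTP errors (>= 400).
fetch_one() {
  curl -sS --fail --max-time 120 -G \
    --data-urlencode "url=$1" \
    -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
    "https://unblocker.joinmassive.com/browser"
}

# Fetch each argument in order (never in parallel) and report failures
# on stderr. Returns 0 only if every URL was fetched.
fetch_all() {
  failed=0
  for url in "$@"; do
    echo "=== $url ==="
    if ! fetch_one "$url"; then
      echo "FAILED: $url" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}

# Usage:
# fetch_all "https://example.com/page1" "https://example.com/page2"
```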

## Searching Google

Search endpoint. `GET` request. Returns all organic and paid Google results as HTML or structured JSON.

```
https://unblocker.joinmassive.com/search?terms=<encoded-terms>
```

Auth header: `Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN` (same token as browser fetching)

### Basic Search

```bash
curl -s -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
  "https://unblocker.joinmassive.com/search?terms=foo+bar+baz&format=json"
```

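Replace `foo+bar+baz` with the search query. Encode spaces as `+` or `%20`, and percent-encode any other reserved characters in the query (for example `&` and `#`), since they would otherwise be read as part of the URL syntax.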

### Search with Options

```bash
curl -s -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
  "https://unblocker.joinmassive.com/search?terms=vpn+comparison&format=json&size=100&offset=20"
```
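
Letting curl encode the query avoids replacing spaces by hand. The sketch below is one way to do that, using the same `-G --data-urlencode` pattern as the browser examples. `search_google` is a hypothetical helper name; the option names come from the parameter table in this document.

```bash
# Hypothetical helper: search_google "query" [key=value ...]
# The query is URL-encoded by curl; extra options such as size=100 are
# appended to the query string as given. format=json is always sent.
search_google() {
  terms=$1
  shift
  # Rotate through "$@", turning each option into a "-d option" pair.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" -d "$1"
    shift
    n=$((n - 1))
  done
  curl -sS --max-time 120 -G \
    --data-urlencode "terms=$terms" \
    -d "format=json" \
    "$@" \
    -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
    "https://unblocker.joinmassive.com/search"
}

# Usage:
# search_google "vpn comparison" size=100 offset=20
```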

### Search Parameters

| Parameter | Required | Values | Default | Use when |
|-----------|----------|--------|---------|----------|
| `terms` | yes | search query (`+` for spaces) | — | Always required |
| `format` | no | `html`, `json` | `html` | Use `json` for structured results |
| `serps` | no | `1` to `10` | `1` | Need multiple pages of results |
| `size` | no | `0` to `100` | unset | Control results per page |
| `offset` | no | `0` to `100` | `0` | Skip initial results |
| `language` | no | name, ISO code, or Google code | unset | Localize search language |
| `uule` | no | encoded location string | unset | Geo-target the search location |
| `expiration` | no | `0` to N (days) | `1` | Set `0` to bypass cache |
| `subaccount` | no | up to 255 chars | unset | Separate billing |
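
The ranges in the table can be checked before a request is sent, which avoids spending a call on a malformed query. This is a sketch under the assumption that the documented ranges are inclusive; `in_range` and `check_search_option` are hypothetical helpers, and `terms` is left out because it is passed separately.

```bash
# Hypothetical helper: in_range VALUE MIN MAX
# Succeeds when VALUE is a non-negative integer within [MIN, MAX].
in_range() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Hypothetical helper: check one key=value option against the table above.
check_search_option() {
  key=${1%%=*}
  value=${1#*=}
  case $key in
    format)        [ "$value" = html ] || [ "$value" = json ] ;;
    serps)         in_range "$value" 1 10 ;;
    size|offset)   in_range "$value" 0 100 ;;
    expiration)    in_range "$value" 0 "$value" ;;  # any non-negative integer
    subaccount)    [ "${#value}" -le 255 ] ;;
    language|uule) [ -n "$value" ] ;;
    *)             return 1 ;;                      # unknown option
  esac
}

# Usage:
# check_search_option serps=3 || echo "invalid option" >&2
```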

### JSON Output

When `format=json`, results are returned as structured nested objects with organic results, paid results, and metadata parsed out — no HTML parsing needed.

### Search Tips

- **Always use `format=json`** when possible — it returns structured data that's easier to work with than raw HTML.
- **Use `size=10`** for a quick overview, `size=100` for comprehensive results.
- **Use `offset`** to paginate through results beyond the first page.
- **Use `language`** to get results in a specific language (e.g., `language=es` for Spanish).
- **Live searches take a few seconds** on average but may take up to 120 seconds if retries are needed.

## Browser Parameters

Append to the `/browser` query string as needed:

| Parameter | Values | Default | Use when |
|-----------|--------|---------|----------|
| `format` | `rendered`, `raw` | `rendered` | Use `raw` to skip JS rendering (faster) |
| `expiration` | `0` to N (days) | `1` | Set `0` to bypass cache |
| `delay` | `0.1` to `10` (seconds) | none | Page needs extra time to load dynamic content |
| `device` | device name string | desktop | Need mobile-specific content |
| `ip` | `residential`, `isp` | `residential` | Use `isp` if still blocked (lower detection rates) |

Example with browser options:

```bash
curl -s -G --data-urlencode "url=THE_URL" \
  -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
  "https://unblocker.joinmassive.com/browser?expiration=0&delay=2"
```
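
An alternative to writing options into the endpoint URL is to pass them with `-d`, so that curl assembles the whole query string. The sketch below assumes nothing beyond standard curl behavior (`-G` appends `-d` and `--data-urlencode` data to the URL); `browser_fetch` is a hypothetical helper name.

```bash
# Hypothetical helper: browser_fetch URL ["key=value&key=value"]
# The target URL is encoded by curl; the optional second argument is
# appended to the query string as given.
browser_fetch() {
  target=$1
  options=${2:-}
  set -- -sS --max-time 120 -G --data-urlencode "url=$target"
  if [ -n "$options" ]; then
    set -- "$@" -d "$options"
  fi
  curl "$@" \
    -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
    "https://unblocker.joinmassive.com/browser"
}

# Usage:
# browser_fetch "https://example.com/app" "expiration=0&delay=2"
```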

## Error Handling

- **401 Unauthorized** — Token is invalid or missing. Tell the user: "Your ClawPod API token appears to be invalid or expired. You can get a new one at **clawpod.joinmassive.com**."
- **Empty response** — The page may need more time to render. Retry with `delay=3`. If the response is still empty and the request used `format=raw`, switch back to `format=rendered` (the default). Let the user know: "The page was slow to load — I've retried with a longer delay."
- **Timeout or connection error** — Some pages are very slow. Let the user know the request timed out and offer to retry. Do not silently fail.
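
The cases above can be told apart from curl's exit code, the HTTP status, and the size of the saved body. The sketch below is one possible mapping; `classify_result` and its labels are hypothetical, and treating every other 4xx/5xx status as a generic error is an assumption. Exit code 28 is curl's standard code for a timed-out operation.

```bash
# Hypothetical helper: classify_result CURL_EXIT HTTP_STATUS BODY_FILE
# Prints one of: timeout, error, bad-token, empty, ok
classify_result() {
  if [ "$1" -eq 28 ]; then echo timeout; return; fi
  if [ "$1" -ne 0 ]; then echo error; return; fi
  case $2 in
    401)    echo bad-token ;;
    [45]??) echo error ;;
    *)      if [ -s "$3" ]; then echo ok; else echo empty; fi ;;
  esac
}

# Usage: save the body to a file and capture the status code.
# status=$(curl -s --max-time 120 -o page.html -w '%{http_code}' -G \
#   --data-urlencode "url=THE_URL" \
#   -H "Authorization: Bearer $MASSIVE_UNBLOCKER_TOKEN" \
#   "https://unblocker.joinmassive.com/browser")
# classify_result "$?" "$status" page.html
```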

## Tips

- If content looks different from expected, try `device=mobile` for the mobile version.
- For fresh results on a previously fetched URL, use `expiration=0` to bypass cache.
- If still blocked, try `ip=isp` — ISP-grade IPs have lower detection rates.
- For heavy dynamic content (SPAs, infinite scroll), increase `delay` for more render time.

## Rules

- **One fetch = one result.** The content is in the output. Do not re-fetch a URL that has already returned content.
- **URL-encode the target URL.** Always.
- **Sequential for multiple URLs.** No parallel requests.
- **2 minute timeout per request.** If a page or search is slow, it's the API handling retries/CAPTCHAs.
- **Use `format=json` for search.** Structured JSON is preferred over HTML for search results.
- **Form-encode search terms.** Replace spaces with `+` or `%20` in the `terms` parameter.

Related Skills

tavily-search

from openclaw/skills

Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the web. Requires Tavily API key.

Data & Research

baidu-search

from openclaw/skills

Search the web using Baidu AI Search Engine (BDSE). Use for live information, documentation, or research topics.

Data & Research

notebooklm

from openclaw/skills

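An OpenClaw skill for the unofficial Google NotebookLM Python API. Supports content generation (podcasts, videos, slides, quizzes, mind maps, and more), document management, and research automation. Triggered when the user needs NotebookLM to generate audio overviews, videos, or study materials, or to manage a knowledge base.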

Data & Research

openclaw-search

from openclaw/skills

Intelligent search for agents. Multi-source retrieval with confidence scoring - web, academic, and Tavily in one unified API.

Data & Research

aisa-tavily

from openclaw/skills

AI-optimized web search via AIsa's Tavily API proxy. Returns concise, relevant results for AI agents through AIsa's unified API gateway.

Data & Research

Market Sizing — TAM/SAM/SOM Calculator

from openclaw/skills

Build defensible market sizing for any product, pitch deck, or business case. Top-down and bottom-up methodologies combined.

Data & Research

Data Analyst — AfrexAI ⚡📊

from openclaw/skills

**Transform raw data into decisions. Not just charts — answers.**

Data & Research

Competitor Monitor

from openclaw/skills

Tracks and analyzes competitor moves — pricing changes, feature launches, hiring, and positioning shifts

Data & Research

afrexai-competitive-intel

from openclaw/skills

Complete competitive intelligence system — market mapping, product teardowns, pricing intel, win/loss analysis, battlecards, and strategic monitoring. Goes far beyond SEO to cover the full business landscape.

Data & Research

trending-news-aggregator

from openclaw/skills

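Intelligent trending-news aggregator: automatically collects trending news from multiple platforms, analyzes trends with AI, and supports scheduled push delivery and popularity scoring. Core features: daily automatic aggregation of trending topics across platforms (Weibo, Zhihu, Baidu, and others); intelligent categorization (technology, finance, society, international, and others); a popularity-scoring algorithm; incremental detection (flagging newly trending topics); AI trend analysis.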

Data & Research

search-cluster

from openclaw/skills

Search aggregator using Google CSE, GNews RSS, Wikipedia, Reddit, and Scrapling.

Data & Research

data-analysis-partner

from openclaw/skills

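Intelligent data analysis skill: takes CSV/Excel files and an analysis request as input, and outputs a self-contained HTML analysis report with interactive ECharts charts.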

Data & Research