review-site-scraper

Scrape product reviews from G2, Capterra, and Trustpilot using Apify. Single script with platform dispatch. Use when you need to monitor competitor reviews, track product sentiment, or gather customer feedback from review sites.

380 stars

Best use case

review-site-scraper is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using review-site-scraper should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/review-site-scraper/SKILL.md --create-dirs "https://raw.githubusercontent.com/gooseworks-ai/goose-skills/main/skills/capabilities/review-site-scraper/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/review-site-scraper/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How review-site-scraper Compares

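| Feature / Agent | review-site-scraper | Standard Approach |
|-----------------|---------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |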

Frequently Asked Questions

What does this skill do?

Scrape product reviews from G2, Capterra, and Trustpilot using Apify. Single script with platform dispatch. Use when you need to monitor competitor reviews, track product sentiment, or gather customer feedback from review sites.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Review Site Scraper

Scrape product reviews from G2, Capterra, and Trustpilot using platform-specific Apify actors.

## Quick Start

Requires `APIFY_API_TOKEN` env var (or `--token` flag). No external dependencies needed (uses stdlib `urllib`).

```bash
# Trustpilot reviews
python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \
  --platform trustpilot \
  --url "https://www.trustpilot.com/review/example.com" \
  --max-reviews 10 --output summary

# G2 reviews with keyword filter
python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \
  --platform g2 \
  --url "https://www.g2.com/products/example/reviews" \
  --keywords "pricing,support"

# Capterra reviews (uses company name, not URL)
python3 skills/capabilities/review-site-scraper/scripts/scrape_reviews.py \
  --platform capterra \
  --company-name "HubSpot CRM" \
  --max-reviews 20
```
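
If you want to drive the script from another Python program, the commands above can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: `build_command` is a hypothetical helper, and it only uses the flags and script path shown in the examples above.

```python
import sys

# Script path as used in the Quick Start examples above.
SCRIPT = "skills/capabilities/review-site-scraper/scripts/scrape_reviews.py"


def build_command(platform, url=None, company_name=None, max_reviews=None, output=None):
    """Assemble an argv list for scrape_reviews.py (hypothetical helper)."""
    if platform == "capterra":
        # Capterra is addressed by company name, not by URL.
        if not company_name:
            raise ValueError("capterra requires company_name")
        target = ["--company-name", company_name]
    elif platform in ("g2", "trustpilot"):
        if not url:
            raise ValueError(f"{platform} requires url")
        target = ["--url", url]
    else:
        raise ValueError(f"unknown platform: {platform}")

    cmd = [sys.executable, SCRIPT, "--platform", platform, *target]
    if max_reviews is not None:
        cmd += ["--max-reviews", str(max_reviews)]
    if output is not None:
        cmd += ["--output", output]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)`; passing a list rather than a shell string avoids quoting problems with names such as "HubSpot CRM".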

## Supported Platforms

| Platform | Actor | Input | Cost |
|----------|-------|-------|------|
| G2 | `focused_vanguard/g2-reviews-scraper` | `--url` (G2 product page URL) | Free tier available |
| Capterra | `getdataforme/capterra-reviews-scraper-bulk` | `--company-name` (company name, not a URL) | Pay-per-result |
| Trustpilot | `agents/trustpilot-reviews` | `--url` (Trustpilot review page URL) | ~$0.20/1k reviews |
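
To illustrate what "platform dispatch" might look like with only the standard library, here is a hedged sketch that maps each platform to its actor from the table and builds (but does not send) an Apify API request. The choice of the `run-sync-get-dataset-items` endpoint, the `build_run_request` name, and the shape of `actor_input` are assumptions; the actual script may call Apify differently, and each actor defines its own input schema, which is not shown here.

```python
import json
import urllib.parse
import urllib.request

# Actor IDs taken from the Supported Platforms table.
ACTORS = {
    "g2": "focused_vanguard/g2-reviews-scraper",
    "capterra": "getdataforme/capterra-reviews-scraper-bulk",
    "trustpilot": "agents/trustpilot-reviews",
}

API_BASE = "https://api.apify.com/v2"


def build_run_request(platform, actor_input, token, timeout=300):
    """Build a POST request that runs the platform's actor (hypothetical helper).

    actor_input is passed through unchanged; its fields depend on the actor.
    """
    # In Apify API URLs the "username/actor-name" slash is written as "~".
    actor_id = ACTORS[platform].replace("/", "~")
    query = urllib.parse.urlencode({"timeout": timeout})
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(request)` would start a billable actor run, so the sketch stops at constructing it.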

## CLI Reference

| Flag | Default | Description |
|------|---------|-------------|
| `--platform` | *required* | `g2`, `capterra`, or `trustpilot` |
| `--url` | none | Product review page URL (required for G2 and Trustpilot) |
| `--company-name` | none | Company name to search (Capterra only) |
| `--max-reviews` | 50 | Max reviews to scrape |
| `--keywords` | none | Keywords to filter (comma-separated, OR logic) |
| `--days` | none | Only include reviews from last N days |
| `--output` | json | Output format: `json` or `summary` |
| `--token` | env var | Apify token (prefer `APIFY_API_TOKEN` env var) |
| `--timeout` | 300 | Max seconds for Apify run |
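
The flag table above maps naturally onto `argparse`. The sketch below is an assumption about how such a parser could be written, not the script's actual source; `build_parser` and `resolve_token` are hypothetical names, and only the flags and defaults from the table are used.

```python
import argparse
import os


def build_parser():
    """Parser mirroring the CLI Reference table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Scrape product reviews via Apify")
    parser.add_argument("--platform", required=True, choices=["g2", "capterra", "trustpilot"])
    parser.add_argument("--url", help="Review page URL (G2 and Trustpilot)")
    parser.add_argument("--company-name", help="Company name to search (Capterra only)")
    parser.add_argument("--max-reviews", type=int, default=50)
    parser.add_argument("--keywords", help="Comma-separated keywords, OR logic")
    parser.add_argument("--days", type=int, help="Only reviews from the last N days")
    parser.add_argument("--output", choices=["json", "summary"], default="json")
    parser.add_argument("--token", help="Apify token; APIFY_API_TOKEN is preferred")
    parser.add_argument("--timeout", type=int, default=300)
    return parser


def resolve_token(args, env=os.environ):
    """Use --token when given, otherwise fall back to APIFY_API_TOKEN."""
    return args.token or env.get("APIFY_API_TOKEN")
```

Keeping the token lookup in its own function makes the fallback order explicit: an explicit `--token` wins, and the environment variable is used otherwise.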

## Normalized Output Schema

Reviews from every platform are normalized into a flat JSON object, but the set of fields differs per platform.

**G2 output fields:**

```json
{
  "platform": "g2",
  "id": "review-id",
  "product_name": "Product Name",
  "title": null,
  "text": "Review body text",
  "rating": 4,
  "author": "Reviewer Name",
  "author_title": "Job Title",
  "author_company": "Company Name",
  "author_company_size": "51-200",
  "author_industry": "Software",
  "date": "2026-02-18",
  "source": "organic",
  "url": "https://..."
}
```

**Capterra output fields:**

```json
{
  "platform": "capterra",
  "title": "Review title",
  "text": "Review body text",
  "overall_rating": 4,
  "ease_of_use": 5,
  "customer_service": 3,
  "features": 4,
  "author": "Reviewer Name",
  "job_title": "Marketing Manager",
  "industry": "Marketing and Advertising",
  "usage_duration": "1-2 years",
  "date": "2026-02-18",
  "url": "https://..."
}
```

**Trustpilot output fields:**

```json
{
  "platform": "trustpilot",
  "id": "review-id",
  "title": "Review title",
  "text": "Review body text",
  "rating": 4,
  "author": "Reviewer Name",
  "date": "2026-02-18T12:00:00.000Z",
  "experienced_date": "2026-02-15T00:00:00.000Z",
  "likes": 2,
  "input_source": "organic",
  "url": "https://..."
}
```
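
Note that the `date` field is a plain date for G2 and Capterra but a full ISO 8601 timestamp for Trustpilot, so any post-processing has to handle both. The following is a rough sketch of `--keywords` (OR logic) and `--days` style filtering over normalized reviews. The helper names are hypothetical, and matching keywords against `title` and `text` is an assumption about which fields are searched.

```python
from datetime import datetime, timedelta, timezone


def parse_review_date(value):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS.sssZ' into an aware UTC datetime."""
    if value.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def filter_reviews(reviews, keywords=None, days=None, now=None):
    """Keep reviews matching any keyword and dated within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    terms = [k.strip().lower() for k in (keywords or "").split(",") if k.strip()]
    cutoff = now - timedelta(days=days) if days is not None else None

    kept = []
    for review in reviews:
        # title can be null (as in the G2 example), so fall back to "".
        haystack = f"{review.get('title') or ''} {review.get('text') or ''}".lower()
        if terms and not any(term in haystack for term in terms):
            continue
        if cutoff is not None and parse_review_date(review["date"]) < cutoff:
            continue
        kept.append(review)
    return kept
```

The `now` parameter exists only to make the date cutoff deterministic in tests; in normal use it can be omitted.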

Related Skills

sales-performance-review

381
from gooseworks-ai/goose-skills

Periodic sales performance review composite. Aggregates ALL sales initiatives taken in a given period — outbound campaigns, inbound efforts, events, partnerships, content, referrals — and measures the impact of each on pipeline and revenue. Produces a team-presentable report covering initiative-level performance, cross-initiative comparisons, pipeline attribution, what's working, what's not, and where to invest next. Tool-agnostic — pulls data from any combination of CRM, outreach tools, and tracking systems.

twitter-scraper

381
from gooseworks-ai/goose-skills

Search and scrape Twitter/X posts using Apify. Use when you need to find tweets, track brand mentions, monitor competitors on Twitter, or analyze Twitter discussions. Uses Twitter native search syntax (since:/until:) for reliable date filtering.

review-scraper

381
from gooseworks-ai/goose-skills

Scrape product reviews from G2, Capterra, and Trustpilot using Apify. Single script with platform dispatch. Use when you need to monitor competitor reviews, track product sentiment, or gather customer feedback from review sites.

reddit-scraper

381
from gooseworks-ai/goose-skills

Scrape and search Reddit posts using Apify. Use when you need to find Reddit discussions, track competitor mentions, monitor product feedback, discover pain points, or analyze subreddit content. Supports keyword filtering, time-based searches, and subreddit-specific queries.

meta-ad-scraper

381
from gooseworks-ai/goose-skills

Scrape competitor ads from Meta's Ad Library (Facebook, Instagram, Messenger, Threads, WhatsApp). Search by company name, Facebook Page URL, or keyword. Returns ad creatives, spend estimates, reach, impressions, and campaign details. Use for competitive ad research, messaging analysis, and creative inspiration.

icp-website-review

381
from gooseworks-ai/goose-skills

Evaluate a website, landing page, content, or any online asset through the eyes of pre-built synthetic ICP personas. Loads personas from icp-persona-builder output, then runs them against target URLs. Supports three modes: structured scorecard, freeform focus group, and head-to-head competitive comparison. Reusable — run against the same site after changes, or against new content anytime.

blog-scraper

381
from gooseworks-ai/goose-skills

Scrape blog posts via RSS feeds (free, no API key) with Apify fallback for JS-heavy sites. Use when you need to monitor competitor blogs, track industry content, or aggregate blog posts by keyword.

review-intelligence-digest

380
from gooseworks-ai/goose-skills

Scrape G2, Capterra, and Trustpilot reviews for your product and competitors, then extract recurring themes, objections, proof points, and exact customer language for use in messaging. Chains review-site-scraper with LLM analysis. Produces a weekly or monthly digest that feeds directly into copywriting, positioning, and sales enablement. Use when a marketing team needs to ground messaging in real customer language.

pipeline-review

380
from gooseworks-ai/goose-skills

Pipeline analysis composite. Pulls deal/meeting data from any CRM or tracking system, analyzes the pipeline over a user-defined period (weekly, fortnightly, monthly, quarterly), and produces both an executive summary and a detailed diagnostic report. Covers volume, qualification rates, source effectiveness, stage velocity, stuck deals, and actionable recommendations. Tool-agnostic — works with any CRM (Salesforce, HubSpot, Pipedrive, Close, Supabase, CSV).

icp-website-audit

380
from gooseworks-ai/goose-skills

End-to-end website audit through ICP eyes. Builds synthetic personas (if they don't already exist), runs a structured scorecard review of the client's site, then runs a head-to-head competitive comparison against top competitors. Produces a single consolidated report with persona feedback, competitive positioning, and prioritized recommendations. The complete "how do our buyers actually experience our site vs the competition?" workflow.

web-archive-scraper

380
from gooseworks-ai/goose-skills

Search the Wayback Machine for archived versions of websites. Extract cached pages, customer lists, testimonials, and partner directories from sites that have changed or gone offline. Uses the free CDX API — no API key needed.

site-content-catalog

380
from gooseworks-ai/goose-skills

Crawl a website's sitemap and blog index to build a complete content inventory. Lists every page with URL, title, publish date, content type, and topic cluster. Groups content by category and topic. Optionally deep-reads top N pages for quality analysis and funnel stage tagging. Use before SEO audits, content gap analysis, or brand voice extraction.