xhs-scraper

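Xiaohongshu (小红书) search scraping skill. Scrapes Xiaohongshu search results through agent-browser (CDP), with support for list and detail pages and multiple output formats. Use cases: scraping note lists and note bodies by keyword; generating RSS, JSON, or Markdown.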

1,172 stars

Best use case

xhs-scraper is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using xhs-scraper should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/xhs-scraper/SKILL.md --create-dirs "https://raw.githubusercontent.com/inclusionAI/AWorld/main/examples/skill_agent/skills/xhs-scraper/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/xhs-scraper/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill
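
The manual steps can be scripted. The following is a minimal sketch, assuming `curl` is available and reusing the URL from the one-line install command; the function names `skill_target` and `install_skill` are hypothetical helpers, not part of the skill.

```bash
#!/usr/bin/env bash
# URL copied from the one-line install command in this section.
SKILL_URL="https://raw.githubusercontent.com/inclusionAI/AWorld/main/examples/skill_agent/skills/xhs-scraper/SKILL.md"

# Print where SKILL.md belongs inside a project directory (default: current dir).
skill_target() {
  printf '%s/.claude/skills/xhs-scraper/SKILL.md\n' "${1:-.}"
}

# Hypothetical helper: create the skill directory and download SKILL.md into it.
install_skill() {
  target=$(skill_target "$1")
  mkdir -p "$(dirname "$target")" && curl -fsSL -o "$target" "$SKILL_URL"
}

# Example: install_skill /path/to/project
```

Restarting the agent afterwards is still required for the skill to be discovered.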

How xhs-scraper Compares

| Feature / Agent | xhs-scraper | Standard Approach |
|------|------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

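It scrapes Xiaohongshu (小红书) search results through agent-browser (CDP), covering both list and detail pages, and supports multiple output formats. Typical uses are scraping note lists and note bodies by keyword and generating RSS, JSON, or Markdown.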

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

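# Xiaohongshu Scraper (xhs-scraper)

## Overview

Scrapes Xiaohongshu search results through a CDP-connected browser (agent-browser): it scrolls the list page to collect card information, optionally opens detail pages to fetch the note body, and outputs Markdown, RSS, or JSON.

## Tool path

- Script: `.claude/skills/xhs-scraper/scrape_xhs.sh`
- Dependencies: `agent-browser` (with CDP connected), `python3`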
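
Before running the script, it can help to confirm that both dependencies are present and that the browser is listening. The sketch below is an illustration only: the helper names are hypothetical, and the reachability check assumes the browser exposes the standard DevTools HTTP endpoint `/json/version` on the CDP port, as Chrome does when started with `--remote-debugging-port`.

```bash
#!/usr/bin/env bash
# Print the name of every required command that is not on PATH.
missing_deps() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Return 0 if a DevTools endpoint answers on the given port (default 9222).
# Assumption: the browser serves /json/version on that port.
cdp_reachable() {
  port="${1:-9222}"
  curl -fsS --max-time 3 "http://127.0.0.1:${port}/json/version" >/dev/null 2>&1
}

# Example:
#   missing_deps agent-browser python3   # prints nothing when both are installed
#   cdp_reachable 9222 || echo "no browser listening on CDP port 9222" >&2
```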

## Usage

```bash
./scrape_xhs.sh -k <keyword> [-p <cdp_port>] [-n <max_scrolls>] [-d <detail_count>] [-o <output_file>] [-f <format>]
```

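### Parameters

| Flag | Description | Default |
|------|------|------|
| `-k` | Search keyword (required) | - |
| `-p` | CDP port | 9222 |
| `-n` | Maximum number of scrolls on the list page | 5 |
| `-d` | Number of notes whose detail page is opened to fetch the body (0 = list only) | 10 |
| `-o` | Output file path | stdout |
| `-f` | Format: `md` \| `rss` \| `json` | md |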
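
To make the flags and defaults concrete, here is a hypothetical sketch of how a script could parse them with `getopts`. It is not the actual source of `scrape_xhs.sh`; the function name `parse_args` and the variable names are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical parser mirroring the parameter table; not the real script's code.
parse_args() {
  # Defaults taken from the table above.
  keyword=""
  port=9222
  max_scrolls=5
  detail_count=10
  output=""        # empty means: write to stdout
  format="md"

  OPTIND=1         # reset so the function can be called more than once
  while getopts "k:p:n:d:o:f:" opt "$@"; do
    case "$opt" in
      k) keyword="$OPTARG" ;;
      p) port="$OPTARG" ;;
      n) max_scrolls="$OPTARG" ;;
      d) detail_count="$OPTARG" ;;
      o) output="$OPTARG" ;;
      f) format="$OPTARG" ;;
      *) return 2 ;;
    esac
  done

  # -k is the only required flag.
  [ -n "$keyword" ] || { echo "error: -k <keyword> is required" >&2; return 2; }

  # Only the three documented formats are accepted.
  case "$format" in
    md|rss|json) ;;
    *) echo "error: -f must be md, rss, or json" >&2; return 2 ;;
  esac
}
```

With this sketch, `parse_args -k "test" -d 0` leaves the port at 9222 and the format at `md`, matching the defaults in the table.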

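### Examples

```bash
# Keyword "Agent development engineer": all defaults (5 scrolls, 10 detail pages, Markdown to stdout)
./scrape_xhs.sh -k "Agent开发工程师"
# Keyword "AI Agent jobs": body text for 5 notes, written as an RSS feed to feed.xml
./scrape_xhs.sh -k "AI Agent岗位" -d 5 -f rss -o feed.xml
# Keyword "LLM interview experiences": up to 10 scrolls, body text for 20 notes, JSON to data.json
./scrape_xhs.sh -k "大模型面经" -n 10 -d 20 -f json -o data.json
```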
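
When several keywords need the same treatment, the documented flags can be wrapped in a loop. This is a minimal sketch under stated assumptions: the `SCRAPER` variable, the `scrape_all` function, and the `search_<n>.json` naming are hypothetical, and only flags from the parameter table are used (`-d 0` keeps each run list-only).

```bash
#!/usr/bin/env bash
# Path to the script; override SCRAPER if it lives elsewhere (assumption).
SCRAPER="${SCRAPER:-./scrape_xhs.sh}"

# Usage: scrape_all <output_dir> <keyword>...
# Runs one list-only JSON scrape per keyword and numbers the output files.
scrape_all() {
  out_dir="$1"
  shift
  mkdir -p "$out_dir" || return 1
  i=0
  for kw in "$@"; do
    i=$((i + 1))
    "$SCRAPER" -k "$kw" -d 0 -f json -o "$out_dir/search_$i.json" \
      || echo "warn: scrape failed for keyword: $kw" >&2
  done
}

# Example: scrape_all ./out "大模型面经" "AI Agent岗位"
```

Numbered file names avoid having to turn non-ASCII keywords into safe file names; a failed keyword is reported and the loop continues.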

Related Skills

x-scraper

1172
from inclusionAI/AWorld

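X (Twitter) scraping skill. Scrapes a specified user's tweets or the home recommendation feed through agent-browser (CDP), with keyword filtering, tab switching, and multiple output formats. Use cases: scraping timelines by user or keyword, viewing the home recommendation feed, generating RSS, JSON, or Markdown.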

xhs-publisher

1172
from inclusionAI/AWorld

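Xiaohongshu publishing skill. Automatically publishes Xiaohongshu image-and-text notes through agent-browser (CDP), with multi-image upload, title and body filling, and one-click publishing. Use case: automating the publication of image-and-text notes to the Xiaohongshu creator center.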

read large webpage or knowledge

1172
from inclusionAI/AWorld

This skill is used for segmented reading and organization when facing large-scale knowledge bases or web pages. It captures original content segment by segment, summarizes key points in real-time, and continuously deposits them into the knowledge base, ensuring orderly information ingestion, clear structure, and traceability.

text2agent

1172
from inclusionAI/AWorld

Creates new agents from user requirements by generating Python implementation and mcp_config.

optimizer

1172
from inclusionAI/AWorld

Analyzes and automatically optimizes existing agents by improving system prompts and tool configuration.

media_comprehension

1172
from inclusionAI/AWorld

An intelligent assistant specialized in handling media files (images/audio/video). **Only for media file analysis**; it does not handle document types.

✅ Media files that can be processed:
- Images: .jpg, .jpeg, .png, .gif, .bmp, .webp, .svg
- Audio: .mp3, .wav, .m4a, .flac, .aac, .ogg
- Video: .mp4, .avi, .mov, .mkv, .webm, .flv

❌ Files that cannot be processed (please do not trigger this skill):
- Documents: .pdf, .doc, .docx, .txt, .md, .rtf
- Spreadsheets: .xlsx, .xls, .csv, .tsv
- Presentations: .pptx, .ppt, .key
- Code: .py, .js, .ts, .java, .cpp, .go, .rs
- Archives: .zip, .tar, .gz, .rar, .7z
- Executables: .exe, .bin, .app, .dmg
- Databases: .db, .sqlite, .sql
- Configuration files: .json, .xml, .yaml, .yml, .toml, .ini
- Web pages: .html, .htm, .css

**Trigger conditions**: When the user explicitly requests analysis of image, audio, or video content, or when the file extension belongs to one of the media types listed above.

app_evaluator

1172
from inclusionAI/AWorld

A professional skill for App Evaluation (evaluating app's performance with score) and App Improvement (giving professional suggestions for improving the app's performance).

agent-browser

1172
from inclusionAI/AWorld

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

OpenClaw

1172
from inclusionAI/AWorld

Complete guide for OpenClaw installation, Discord configuration, and sending messages, including common issues and solutions

html-to-image

1172
from inclusionAI/AWorld

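HTML-to-image skill. Renders an HTML file or HTML content through agent-browser and captures it as an image. Suited to generating infographics, social media images, data visualization screenshots, and similar output.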

news-hot-scraper

3891
from openclaw/skills

This skill should be used when users need to scrape hot news topics from Chinese platforms (微博、知乎、B站、抖音、今日头条、腾讯新闻、澎湃新闻), generate summaries, and cite sources. It supports both API-based and direct scraping methods, and offers both extractive and abstractive summarization techniques.

Data & Research

x-twitter-scraper

31392
from sickn33/antigravity-awesome-skills

X (Twitter) data platform skill — tweet search, user lookup, follower extraction, engagement metrics, giveaway draws, monitoring, webhooks, 19 extraction tools, MCP server.