harvest-single
Single page smart extraction - articles, docs, blog posts to clean markdown
Best use case
harvest-single is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using harvest-single should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/harvest-single/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How harvest-single Compares
| Feature / Agent | harvest-single | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Single page smart extraction - articles, docs, blog posts to clean markdown
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Harvest Single Page

Extract and clean content from a single web page. Auto-detects content type (article, documentation, API reference, blog post) and produces clean, structured markdown.

## Usage

```
/harvest <url>
```

## Examples

```bash
# Extract a blog post
/harvest https://blog.example.com/best-practices-2024

# Extract API documentation page
/harvest https://docs.stripe.com/api/charges

# Extract a GitHub README
/harvest https://github.com/owner/repo
```

## How It Works

1. Fetch URL content via WebFetch or crawl4ai
2. Detect content type (article, docs, API ref, blog, wiki)
3. Extract main content, strip navigation/ads/footers
4. Preserve code blocks, tables, images
5. Add metadata header (source, date, word count)
6. Save to `.claude/cache/agents/harvest/`

## Output Format

```markdown
# [Page Title]

> Source: [URL]
> Extracted: [timestamp]
> Type: [article|docs|api|blog|wiki]
> Words: [count]

[Clean extracted content in markdown]

## Links Found
- [Link text](URL)
```

## Fallback Chain

1. crawl4ai Docker (port 11235) - preferred
2. WebFetch tool - built-in fallback
3. curl + html2text - last resort

## When to Use

- Quick grab of a single page's content
- Extracting a specific doc page for reference
- Saving an article for later analysis
- Getting clean markdown from messy HTML
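
The fallback chain and metadata header described in the SKILL.md source can be illustrated with a small Python sketch. This is not the skill's actual implementation, which the source does not show. The names `fetch_with_fallback` and `build_header` and the two stub fetchers are hypothetical, and the ISO 8601 timestamp format is an assumption, since the source only says `[timestamp]`.

```python
from datetime import datetime, timezone
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: a fetcher takes a URL and returns page text, or raises on failure.
Fetcher = Callable[[str], str]


def fetch_with_fallback(url: str, fetchers: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, str]:
    """Try each (name, fetcher) pair in order; return (name, content) from the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            content = fetch(url)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
            continue
        if content.strip():
            return name, content
        errors.append(f"{name}: empty response")
    raise RuntimeError("all fetchers failed: " + "; ".join(errors))


def build_header(title: str, url: str, content_type: str, body: str,
                 extracted: Optional[datetime] = None) -> str:
    """Render the metadata header in the shape given under Output Format."""
    extracted = extracted or datetime.now(timezone.utc)
    words = len(body.split())  # whitespace-separated word count (an assumption)
    return "\n".join([
        f"# {title}",
        "",
        f"> Source: {url}",
        f"> Extracted: {extracted.isoformat()}",
        f"> Type: {content_type}",
        f"> Words: {words}",
    ])


# Stub backends standing in for crawl4ai and WebFetch (both hypothetical here).
def crawl4ai_stub(url: str) -> str:
    raise ConnectionError("port 11235 unreachable")


def webfetch_stub(url: str) -> str:
    return "Hello from the fallback fetcher"


used, body = fetch_with_fallback(
    "https://blog.example.com/best-practices-2024",
    [("crawl4ai", crawl4ai_stub), ("webfetch", webfetch_stub)],
)
header = build_header("Best Practices", "https://blog.example.com/best-practices-2024",
                      "blog", body, datetime(2024, 1, 1, tzinfo=timezone.utc))
```

In this sketch the preferred backend fails, so the second one supplies the content. A real chain would replace the stubs with calls to whichever tools are available in the agent's environment.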
Related Skills
harvest-structured
Structured data extraction - tables, pricing, products, API endpoints with schema
harvest-monitor
Web change monitoring - track changes on pages, detect updates, changelog diffs
harvest-deep-crawl
Multi-page deep crawling - documentation sites, wikis, knowledge bases
harvest-competitive
Competitive intelligence - extract features, pricing, tech stack from competitor sites
harvest-adaptive
Adaptive content summarization - auto-detect content type and produce relevant summary
workflow-router
Goal-based workflow orchestration - routes tasks to specialist agents based on user goals
wiring
Wiring Verification
websocket-patterns
Connection management, room patterns, reconnection strategies, message buffering, and binary protocol design.
visual-verdict
Screenshot comparison QA for frontend development. Takes a screenshot of the current implementation, scores it across multiple visual dimensions, and returns a structured PASS/REVISE/FAIL verdict with concrete fixes. Use when implementing UI from a design reference or verifying visual correctness.
verification-loop
Comprehensive verification system covering build, types, lint, tests, security, and diff review before a PR.
vector-db-patterns
Embedding strategies, ANN algorithms, hybrid search, RAG chunking strategies, and reranking for semantic search and retrieval.
variant-analysis
Find similar vulnerabilities across a codebase after discovering one instance. Uses pattern matching, AST search, Semgrep/CodeQL queries, and manual tracing to propagate findings. Adapted from Trail of Bits. Use after finding a bug to check if the same pattern exists elsewhere.