anydocs

Generic Documentation Indexing & Search. Index any documentation site (SPA/static) and search it instantly.


Best use case

anydocs is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using anydocs should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/anydocs/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/documentation/anydocs/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/anydocs/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How anydocs Compares

| Feature / Agent | anydocs | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It indexes any documentation site, whether a single-page app (SPA) or a static site, and lets you search it from the command line or the Python API.

Where can I find the source code?

The source code is on GitHub in the `diegosouzapw/awesome-omni-skill` repository, under `skills/documentation/anydocs/`.

SKILL.md Source

# anydocs - Generic Documentation Indexing & Search

A powerful, reusable skill for indexing and searching **ANY** documentation site.

## What It Does

`anydocs` solves a real problem: accessing documentation from code or CLI. Instead of opening a browser every time, you can:

- **Index** any documentation site (Discord, OpenClaw, internal docs, etc.)
- **Search** instantly from the command line or Python API
- **Cache** pages locally to avoid repeated network calls
- **Configure** multiple profiles for different doc sites

## When to Use It

Use `anydocs` when you need to:
- Quickly look up API documentation without leaving the terminal
- Build agents that need to reference docs
- Extract specific information from documentation
- Search across multiple documentation sites
- Integrate docs into your workflow

## Key Features

### 🔍 Multi-Method Search
- **Keyword search**: Fast, term-based matching with BM25-style scoring
- **Hybrid search**: Keyword + phrase proximity for better relevance
- **Regex search**: Advanced pattern matching for power users

### 🌐 Works with Any Docs Site
- Sitemap-based discovery (standard XML sitemap)
- Fallback crawling from base URL
- HTML content extraction with smart selector detection
- Automatic rate limiting to be respectful

### 💾 Smart Caching
- Pages cached locally with 7-day TTL (configurable)
- Search indexes cached for instant second searches
- Cache statistics and cleanup commands
- Cached entries are invalidated when their TTL expires

### ⚙️ Profile-Based Configuration
- Support multiple doc sites simultaneously
- Per-profile search methods and cache TTLs
- Configuration stored in `~/.anydocs/config.json`
- Examples for Discord, OpenClaw, and custom sites

### 🌐 JavaScript Rendering (Optional)
- Uses Playwright to render client-side SPAs (Single Page Apps)
- Automatically discovers links on JS-heavy sites like Discord docs
- Gracefully falls back to standard HTTP if Playwright unavailable
- Configure per-discovery session or globally per profile

## Installation

```bash
cd /path/to/skills/anydocs
pip install -r requirements.txt
chmod +x anydocs.py
```

### Optional: Browser-based rendering (for JavaScript-heavy sites)

For sites like Discord that use client-side rendering, install Playwright:

```bash
pip install playwright==1.40.0
playwright install chromium  # Downloads Chromium only
```

If Playwright is unavailable, anydocs gracefully falls back to standard HTTP fetching.
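
The fallback described above can be sketched as a simple availability check. This is a minimal illustration, not the code anydocs actually ships: the function name `pick_fetch_mode` and its `prefer_js` parameter are hypothetical.

```python
import importlib.util


def pick_fetch_mode(prefer_js: bool = True) -> str:
    """Hypothetical helper: choose a fetch strategy for a discovery session.

    Returns "playwright" only when JS rendering is wanted AND the
    playwright package can be imported; otherwise returns "http".
    """
    if prefer_js and importlib.util.find_spec("playwright") is not None:
        return "playwright"
    return "http"


# With JS rendering disabled the result is always plain HTTP.
print(pick_fetch_mode(prefer_js=False))
```

Checking with `find_spec` avoids importing Playwright (and starting a browser) just to learn whether it is installed.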

## Quick Start

### 1. Configure a Documentation Site

```bash
python anydocs.py config vuejs \
  https://vuejs.org \
  https://vuejs.org/sitemap.xml
```

### 2. Build the Index

```bash
python anydocs.py index vuejs
```

This discovers all pages via sitemap, scrapes content, and builds a searchable index.
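
The sitemap discovery step can be sketched with the standard library alone. The helper name `urls_from_sitemap` and the inline sample sitemap are assumptions made for illustration; the real logic lives in `scraper.py` and may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Hypothetical helper: return every <loc> URL in a standard XML sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample; a real run would download the profile's sitemap_url instead.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://vuejs.org/guide/introduction</loc></url>
  <url><loc>https://vuejs.org/api/</loc></url>
</urlset>"""

for url in urls_from_sitemap(SAMPLE):
    print(url)
```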

### 3. Search

```bash
python anydocs.py search "composition api" --profile vuejs
python anydocs.py search "reactivity" --profile vuejs --limit 5
```

### 4. Fetch a Specific Page

```bash
python anydocs.py fetch "guide/introduction" --profile vuejs
```

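## CLI Commands

The examples below assume an `anydocs` command is on your PATH; if it is not, substitute `python anydocs.py` as in the Quick Start.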

### Configuration

```bash
# Add or update a profile
anydocs config <profile> <base_url> <sitemap_url> [--search-method hybrid] [--ttl-days 7]

# List configured profiles
anydocs list-profiles
```

### Indexing

```bash
# Build index for a profile
anydocs index <profile>

# Force re-index (skip cache)
anydocs index <profile> --force
```

### Search

```bash
# Basic keyword search
anydocs search "query" --profile discord

# Limit results
anydocs search "query" --profile discord --limit 5

# Regex search
anydocs search "^API" --profile discord --regex
```

### Fetch

```bash
# Fetch a specific page (URL or path)
anydocs fetch "https://discord.com/developers/docs/resources/webhook"
anydocs fetch "resources/webhook" --profile discord
```
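
Accepting either a full URL or a path implies some resolution against the profile's `base_url`. One way to do that is sketched below; `resolve_doc_url` is a hypothetical name, and whether anydocs resolves paths exactly this way is an assumption.

```python
from urllib.parse import urljoin, urlparse


def resolve_doc_url(base_url: str, target: str) -> str:
    """Hypothetical helper: turn a URL-or-path argument into an absolute URL."""
    # Full URLs are returned untouched.
    if urlparse(target).scheme in ("http", "https"):
        return target
    # urljoin drops the last path segment of a base without a trailing slash,
    # so normalise the base first to keep ".../docs" in the result.
    return urljoin(base_url.rstrip("/") + "/", target.lstrip("/"))


print(resolve_doc_url("https://discord.com/developers/docs", "resources/webhook"))
```

The trailing-slash normalisation matters: without it, `urljoin` would resolve `resources/webhook` relative to `/developers/` rather than `/developers/docs/`.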

### Cache Management

```bash
# Show cache statistics
anydocs cache status

# Clear all cache
anydocs cache clear

# Clear specific profile's cache
anydocs cache clear --profile discord
```

## Python API

For use in agents and scripts:

```python
# Assumes the script runs from the anydocs skill directory so that `lib/` is importable
from lib.config import ConfigManager
from lib.scraper import DiscoveryEngine
from lib.indexer import SearchIndex

# Load configuration
config_mgr = ConfigManager()
config = config_mgr.get_profile("discord")

# Scrape documentation
scraper = DiscoveryEngine(config["base_url"], config["sitemap_url"])
pages = scraper.fetch_all()

# Build search index
index = SearchIndex()
index.build(pages)

# Search
results = index.search("webhooks", limit=10)
for result in results:
    print(f"{result['title']} ({result['relevance_score']})")
    print(f"  {result['url']}")
```

## Configuration File Format

Configuration is stored in `~/.anydocs/config.json`:

```json
{
  "discord": {
    "name": "discord",
    "base_url": "https://discord.com/developers/docs",
    "sitemap_url": "https://discord.com/developers/docs/sitemap.xml",
    "search_method": "hybrid",
    "cache_ttl_days": 7
  },
  "openclaw": {
    "name": "openclaw",
    "base_url": "https://docs.openclaw.ai",
    "sitemap_url": "https://docs.openclaw.ai/sitemap.xml",
    "search_method": "hybrid",
    "cache_ttl_days": 7
  }
}
```
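
If you edit the file by hand, a quick sanity check can catch missing fields before indexing. The sketch below treats all five keys shown above as required, which is an assumption; `load_profiles` is a hypothetical helper, not part of `lib/config.py`.

```python
import json

# Assumption: every key shown in the documented example is mandatory.
REQUIRED_KEYS = {"name", "base_url", "sitemap_url", "search_method", "cache_ttl_days"}


def load_profiles(text: str) -> dict:
    """Hypothetical helper: parse config JSON and check each profile's keys."""
    profiles = json.loads(text)
    for name, profile in profiles.items():
        missing = REQUIRED_KEYS - set(profile)
        if missing:
            raise ValueError(f"profile {name!r} is missing keys: {sorted(missing)}")
    return profiles


# In real use the text would be read from ~/.anydocs/config.json.
CONFIG_TEXT = """
{
  "discord": {
    "name": "discord",
    "base_url": "https://discord.com/developers/docs",
    "sitemap_url": "https://discord.com/developers/docs/sitemap.xml",
    "search_method": "hybrid",
    "cache_ttl_days": 7
  }
}
"""

PROFILES = load_profiles(CONFIG_TEXT)
print(sorted(PROFILES))
```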

## Search Methods

### Keyword Search
- **Speed**: Fast
- **Best for**: Common terms, exact matches
- **How it works**: Term matching with position weighting (title > tags > content)
- **Example**: `anydocs search "webhooks"`
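
Position weighting can be illustrated with a small field-weighted term counter. Only the ordering title > tags > content comes from the description above; the numeric weights, the `keyword_score` name, and the page layout (string fields) are assumptions, and this simplified counter is not BM25.

```python
import re

# Assumed weights: the document only states the ordering title > tags > content.
FIELD_WEIGHTS = {"title": 3.0, "tags": 2.0, "content": 1.0}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query: str, page: dict) -> float:
    """Hypothetical scorer: count query-term hits per field, scaled by field weight."""
    terms = set(tokenize(query))
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(page.get(field, ""))
        score += weight * sum(1 for tok in tokens if tok in terms)
    return score


# Made-up pages: the same term scores higher in a title than in body text.
TITLE_HIT = {"title": "Webhooks", "tags": "", "content": "Overview of the resource"}
CONTENT_HIT = {"title": "Overview", "tags": "", "content": "You can use webhooks here"}

print(keyword_score("webhooks", TITLE_HIT), keyword_score("webhooks", CONTENT_HIT))
```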

### Hybrid Search (Default)
- **Speed**: Fast
- **Best for**: Natural language queries
- **How it works**: Keyword search + phrase proximity scoring
- **Example**: `anydocs search "how to set up webhooks"`
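
Phrase proximity can be sketched as a bonus that grows as the query terms appear closer together. The formula below (number of terms divided by the shortest window containing them all) is an illustrative assumption, as are the names `min_window` and `proximity_bonus`; how anydocs combines this with the keyword score is not specified here.

```python
import re
from typing import Optional


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_window(tokens: list[str], terms: set[str]) -> Optional[int]:
    """Length of the shortest token span containing every term, or None."""
    if not terms:
        return None
    last_seen: dict[str, int] = {}
    best: Optional[int] = None
    for i, tok in enumerate(tokens):
        if tok in terms:
            last_seen[tok] = i
            if len(last_seen) == len(terms):
                # Shortest span ending at i that still covers all terms.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def proximity_bonus(query: str, content: str) -> float:
    """Hypothetical bonus: 1.0 when terms are adjacent, smaller when spread out."""
    terms = set(tokenize(query))
    window = min_window(tokenize(content), terms)
    return 0.0 if window is None else len(terms) / window


QUERY = "set up webhooks"
ADJACENT = "how to set up webhooks quickly"
SCATTERED = "set the timer, then wake up, later add webhooks"

print(proximity_bonus(QUERY, ADJACENT), proximity_bonus(QUERY, SCATTERED))
```

A hybrid score could then add this bonus to a keyword score, so pages containing the phrase rank above pages that merely mention each word somewhere.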

### Regex Search
- **Speed**: Medium
- **Best for**: Complex patterns
- **How it works**: Compiled regex pattern matching across all content
- **Example**: `anydocs search "^(GET|POST)" --regex`
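
A compiled-pattern scan over page content might look like the sketch below. The `regex_search` name, the page dictionaries, and the use of `re.MULTILINE` (so that `^` matches at the start of every line, not only the start of the page) are assumptions for illustration.

```python
import re


def regex_search(pattern: str, pages: list[dict], limit: int = 10) -> list[dict]:
    """Hypothetical helper: return the first match per page, up to `limit` pages."""
    compiled = re.compile(pattern, re.MULTILINE)  # compile once, reuse per page
    hits = []
    for page in pages:
        match = compiled.search(page["content"])
        if match:
            hits.append({"url": page["url"], "match": match.group(0)})
        if len(hits) >= limit:
            break
    return hits


# Made-up pages for the demo.
PAGES = [
    {"url": "resources/webhook", "content": "Create Webhook\nPOST /channels/{id}/webhooks"},
    {"url": "intro", "content": "This page mentions GET only mid-sentence."},
]

print(regex_search(r"^(GET|POST)", PAGES))
```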

## Caching Behavior

- **Pages**: Cached as JSON with 7-day TTL (configurable)
- **Indexes**: Cached after indexing, invalidated on TTL expiry
- **Cache location**: `~/.anydocs/cache/`
- **Manual refresh**: Use `--force` flag or clear cache
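
The TTL behaviour above can be sketched as a small JSON file cache. The class name `TTLFileCache`, the hashed file names, and the `now` parameter (added so the demo does not need to wait) are assumptions; the actual `cache.py` may store entries differently.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


class TTLFileCache:
    """Hypothetical sketch: one JSON file per key, expired after ttl_days."""

    def __init__(self, directory: Path, ttl_days: float = 7):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_days * 86400

    def _path(self, key: str) -> Path:
        # Hash the key so URLs become safe file names.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def set(self, key: str, value, now: float = None) -> None:
        stored_at = time.time() if now is None else now
        record = {"stored_at": stored_at, "value": value}
        self._path(key).write_text(json.dumps(record), encoding="utf-8")

    def get(self, key: str, now: float = None):
        path = self._path(key)
        if not path.exists():
            return None
        record = json.loads(path.read_text(encoding="utf-8"))
        current = time.time() if now is None else now
        if current - record["stored_at"] > self.ttl_seconds:
            path.unlink()  # expired: drop the entry so it is fetched again
            return None
        return record["value"]


# Demo in a throwaway directory; anydocs itself uses ~/.anydocs/cache/.
URL = "https://vuejs.org/guide/introduction"
with tempfile.TemporaryDirectory() as tmp:
    cache = TTLFileCache(Path(tmp), ttl_days=7)
    cache.set(URL, {"title": "Introduction"}, now=0)
    fresh = cache.get(URL, now=6 * 86400)   # read on day 6: still valid
    stale = cache.get(URL, now=8 * 86400)   # read on day 8: past the TTL
    missing = cache.get("https://vuejs.org/never-cached")

print(fresh, stale, missing)
```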

## Performance Notes

- First index build takes 2-10 minutes depending on site size
- Subsequent searches are instant (cached indexes)
- Rate limit: a 0.5s delay per page request, to avoid overloading the documentation server
- Typical search returns ~100 results in <100ms
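
The per-page delay can be sketched as a minimum-interval limiter. The `RateLimiter` class and its injectable `clock` and `sleep` parameters are hypothetical, added so the demo runs instantly; only the 0.5s figure comes from the notes above. At that rate, a 600-page site spends at least 300 seconds (5 minutes) waiting, which is consistent with the build times quoted.

```python
import time


class RateLimiter:
    """Hypothetical sketch: enforce a minimum interval between requests."""

    def __init__(self, min_interval: float = 0.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Demo with a fake clock so nothing actually sleeps.
fake_now = [0.0]
slept = []


def fake_sleep(seconds: float) -> None:
    slept.append(seconds)
    fake_now[0] += seconds


limiter = RateLimiter(0.5, clock=lambda: fake_now[0], sleep=fake_sleep)
limiter.wait()  # first request goes out immediately
limiter.wait()  # second request waits out the remaining interval
print(slept)
```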

## Troubleshooting

### "No index for 'profile'" error
Run `anydocs index <profile>` first to build the index.

### Sitemap not found
Check the sitemap URL. If the sitemap is unavailable, anydocs falls back to crawling from `base_url`.
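
Crawling from a base URL amounts to collecting same-site links from each fetched page. A minimal sketch using the standard library follows; `LinkCollector` and `discover_links` are hypothetical names, and the same-host filter is an assumption about how the crawl is scoped.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical sketch: gather absolute same-host links from one page."""

    def __init__(self, page_url: str, base_url: str):
        super().__init__()
        self.page_url = page_url
        self.base_host = urlparse(base_url).netloc
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop "#fragment" parts to avoid duplicates.
        absolute, _fragment = urldefrag(urljoin(self.page_url, href))
        if urlparse(absolute).netloc == self.base_host:
            self.links.add(absolute)


def discover_links(html: str, page_url: str, base_url: str) -> list[str]:
    collector = LinkCollector(page_url, base_url)
    collector.feed(html)
    return sorted(collector.links)


# Made-up page content for the demo.
HTML = (
    '<a href="/guide/intro#top">Intro</a> '
    '<a href="https://example.com/elsewhere">External</a> '
    '<a href="api/">API</a>'
)

print(discover_links(HTML, "https://docs.company.local/", "https://docs.company.local"))
```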

### Slow indexing
This is normal for large sites. Rate limiting prevents overwhelming servers.

### Cache grows too large
Run `anydocs cache clear` or set `--ttl-days` to a smaller value.

## Examples

### Vue.js Framework Docs (SPA Example)
```bash
anydocs config vuejs \
  https://vuejs.org \
  https://vuejs.org/sitemap.xml
anydocs index vuejs
anydocs search "composition api" --profile vuejs
```

### Next.js API Docs
```bash
anydocs config nextjs \
  https://nextjs.org \
  https://nextjs.org/sitemap.xml
anydocs index nextjs
anydocs search "app router" --profile nextjs
```

### Internal Company Documentation
```bash
anydocs config internal \
  https://docs.company.local \
  https://docs.company.local/sitemap.xml
anydocs index internal --force
anydocs search "deployment" --profile internal
```

## Architecture

- **scraper.py**: Discovers URLs via sitemap, fetches and parses HTML
- **indexer.py**: Builds searchable indexes, implements multiple search strategies
- **config.py**: Manages configuration profiles
- **cache.py**: TTL-based file caching for pages and indexes
- **cli.py**: Click-based command-line interface

## Contributing

To add new documentation sites, run:
```bash
anydocs config <profile> <base_url> <sitemap_url>
```

To extend search functionality, modify `lib/indexer.py`.

## License

Part of the OpenClaw system.
