webthinker-deep-research

Deep web research for VCO: multi-hop search+browse+extract with an auditable action trace and a structured report (WebThinker-style).

1,174 stars

by foryourhealth111-pixel

Best use case

webthinker-deep-research is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using webthinker-deep-research can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/webthinker-deep-research/SKILL.md --create-dirs "https://raw.githubusercontent.com/foryourhealth111-pixel/Vibe-Skills/main/bundled/skills/webthinker-deep-research/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/webthinker-deep-research/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How webthinker-deep-research Compares

Feature / Agent	webthinker-deep-research	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Deep web research for VCO: multi-hop search+browse+extract with an auditable action trace and a structured report (WebThinker-style).

Where can I find the source code?

The source code is in the foryourhealth111-pixel/Vibe-Skills repository on GitHub, under bundled/skills/webthinker-deep-research.

SKILL.md Source

# WebThinker Deep Research (VCO)

## When to use

Use this skill when the task requires **deep web research** (not just one-shot search), for example:

- Multi-hop questions (“find → open → follow links → verify”)
- “Deep research report” / “调研报告” (research report) / “竞品调研” (competitor research) / “技术调研” (technical research)
- Need an **auditable trace** of web actions and sources
- Need to merge findings into a structured deliverable (report / brief / spec)

## Non-goals (avoid redundancy)

- For **quick citations** or “give me 3 sources”, prefer `research-lookup`.
- For **interactive UI flows** (login / forms / downloads), prefer `playwright` or `turix-cua` overlays.
- For **codebase structure / call chains**, prefer GitNexus overlays (not web research).

## Output contract (must)

Produce a folder with:

- `report.md` — structured report (problem → findings → implications → next steps)
- `sources.json` — all sources (URL/title/access time/snippet)
- `trace.jsonl` — append-only action trace (search/open/extract/decision)
- `notes.md` — working notes with per-source anchors

Use `scripts/init_webthinker_run.py` to scaffold the folder.
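
To make the contract concrete, here is a minimal sketch of what scaffolding such a folder could look like. It is not the bundled `scripts/init_webthinker_run.py`, whose contents are not shown here. The function name `scaffold_run`, the report headings, and the choice to start `sources.json` as an empty list are all assumptions for illustration.

```python
import json
from pathlib import Path


def scaffold_run(out_dir: str, topic: str) -> Path:
    """Create the four contract files in out_dir (hypothetical helper)."""
    run = Path(out_dir)
    run.mkdir(parents=True, exist_ok=True)

    # report.md: headings mirror problem -> findings -> implications -> next steps.
    sections = ["Problem", "Findings", "Implications", "Next steps"]
    report = f"# {topic}\n\n" + "".join(f"## {s}\n\n" for s in sections)
    (run / "report.md").write_text(report, encoding="utf-8")

    # sources.json: assumed to start as an empty JSON list of source records.
    (run / "sources.json").write_text(json.dumps([], indent=2), encoding="utf-8")

    # trace.jsonl: starts empty; events are appended one JSON object per line.
    (run / "trace.jsonl").touch()

    # notes.md: working notes, seeded with a title only.
    (run / "notes.md").write_text(f"# Notes: {topic}\n", encoding="utf-8")
    return run
```

Calling `scaffold_run("outputs/webthinker", "my topic")` would leave the four files ready for the research loop to fill in.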

## Runtime (Upstream vendoring)

This VCO skill supports a **stable Lite mode** by default, and keeps the upstream WebThinker repo **vendored** for optional advanced use.

- Vendored upstream paths:
  - `C:\Users\羽裳\.codex\_external\ruc-nlpir\WebThinker\`
- Runtime config (no secrets stored):
  - `C:\Users\羽裳\.codex\skills\vibe\config\ruc-nlpir-runtime.json`
- Preflight / install (no secrets echoed):
  - `pwsh C:\Users\羽裳\.codex\skills\vibe\scripts\ruc-nlpir\preflight.ps1`
  - Manually create an isolated venv for the vendored runtime and install only the minimal packages you need. The old `install-upstreams.ps1` auto-install path has been removed on purpose.

LLM endpoint conventions (recommended):

- Base URL: `OPENAI_BASE_URL` (or runtime default)
- API key: `OPENAI_API_KEY` (**env var only; never write into files or CLI args**)
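
A minimal sketch of following these conventions from Python, assuming the caller supplies the runtime default base URL. The helper names `load_llm_settings` and `redact` are hypothetical and not part of the skill. The key is read from the environment only and masked before anything is logged.

```python
import os
from typing import Optional


def load_llm_settings(default_base_url: Optional[str] = None) -> dict:
    """Read endpoint settings from environment variables only (hypothetical helper)."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set in the environment")
    # Fall back to the caller-supplied default when OPENAI_BASE_URL is unset or empty.
    base_url = os.environ.get("OPENAI_BASE_URL") or default_base_url
    return {"base_url": base_url, "api_key": api_key}


def redact(settings: dict) -> dict:
    """Return a copy that is safe to log or write to disk: the key is masked."""
    return {**settings, "api_key": "***"}
```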

## Modes

### Mode A (Recommended): Lite — tool-orchestrated deep research

Use existing tools (no heavy model hosting):

1. Scaffold outputs:
   - `python C:\Users\羽裳\.codex\skills\webthinker-deep-research\scripts\init_webthinker_run.py --topic "…" --out outputs/webthinker`
2. Search (broad → narrow):
   - Use `web.run` search queries or `mcp__tavily__tavily_search` if available.
3. Browse/extract:
   - Use `web.run open/click/find` for structured pages
   - Use `playwright` when pages require dynamic rendering / interactions
4. Draft + iterate:
   - Update `notes.md` and `sources.json` continuously
   - Write `report.md` as you go (think-search-and-draft), not only at the end
5. Verification:
   - Triangulate key claims across ≥2 sources when possible
   - Flag uncertainties explicitly
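
The verification step can be sketched as a small check over the claims collected so far. The skill does not prescribe a data shape. This sketch assumes a mapping from claim text to the URLs that support it, and it counts distinct hostnames rather than distinct URLs, which is a stricter reading of "≥2 sources". The name `flag_uncertain` is hypothetical.

```python
from urllib.parse import urlsplit


def flag_uncertain(claims: dict[str, list[str]], min_sources: int = 2) -> list[str]:
    """Return claims backed by fewer than min_sources distinct hosts."""
    flagged = []
    for claim, urls in claims.items():
        # Two pages on the same host count as one source here (an assumption).
        hosts = {urlsplit(u).hostname for u in urls}
        hosts.discard(None)  # drop strings that did not parse as URLs with a host
        if len(hosts) < min_sources:
            flagged.append(claim)
    return flagged
```

Claims returned by this function are the ones to mark as uncertain in `report.md`, or to research further.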

### Mode B (Optional): Full WebThinker stack

Only choose this if you want to run the upstream system end-to-end and you have the environment:

- Requires heavy deps (`torch`, `transformers`, `vllm`) + a served reasoning model
- Requires a search API (Serper recommended by upstream)
- Optional: Crawl4AI parser client for JS-heavy pages

This mode is for **high-throughput** deep research runs; for most VCO tasks, Lite mode is enough and cheaper.
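
Before committing to Mode B, a quick check for the heavy dependencies can save time. The sketch below is not the bundled `preflight.ps1`. It only tests whether top-level modules are importable, using the standard-library `importlib.util.find_spec`, and the name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed;
    # it does not import the module, so the check stays cheap.
    return [name for name in names if find_spec(name) is None]
```

For example, `missing_modules(["torch", "transformers", "vllm"])` returning a non-empty list suggests staying with Lite mode until the environment is ready.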

## Action trace format (trace.jsonl)

Each line is one JSON object, e.g.:

- `{"ts":"…","type":"search","query":"…","provider":"web.run"}`
- `{"ts":"…","type":"open","url":"…"}`
- `{"ts":"…","type":"extract","url":"…","highlights":["…","…"]}`
- `{"ts":"…","type":"decision","reason":"why this source matters","next":"…"}`
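
A minimal sketch of an append-only writer for this format. The timestamp format is elided above, so the ISO 8601 UTC string used here is an assumption, as is the helper name `append_trace`.

```python
import json
from datetime import datetime, timezone


def append_trace(path: str, event_type: str, **fields) -> dict:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # assumed timestamp format
        "type": event_type,
        **fields,
    }
    # Mode "a" only ever adds to the end of the file, which keeps the trace append-only.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Usage would look like `append_trace("trace.jsonl", "search", query="…", provider="web.run")`, followed by `append_trace("trace.jsonl", "open", url="…")`.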

## Quality gates

- Every major claim in `report.md` links back to at least one entry in `sources.json`.
- `sources.json` contains the exact URLs you used (no “I saw somewhere…”).
- Keep the report actionable: add “Next steps” with concrete verification tasks.
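
Part of these gates can be checked mechanically. The sketch below assumes that `report.md` cites sources by including their URLs, and that `sources.json` is a list of objects with a `url` field. Neither layout is confirmed by the skill, and the name `urls_missing_from_sources` is hypothetical. It catches URLs cited in the report but absent from `sources.json`. It cannot tell whether a claim lacks a citation altogether.

```python
import re

# Matches http(s) URLs up to whitespace or a closing bracket/quote.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")


def urls_missing_from_sources(report_md: str, sources: list[dict]) -> list[str]:
    """Return URLs cited in the report that have no entry in sources."""
    known = {s["url"] for s in sources}
    missing = []
    for raw in URL_RE.findall(report_md):
        url = raw.rstrip(".,;:")  # drop sentence punctuation stuck to the URL
        if url not in known and url not in missing:
            missing.append(url)
    return missing
```

An empty result means every URL in the report is exactly one that was recorded in `sources.json`.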

Related Skills

ux-researcher-designer

1174

from foryourhealth111-pixel/Vibe-Skills

UX research and design toolkit for Senior UX Designer/Researcher including data-driven persona generation, journey mapping, usability testing frameworks, and research synthesis. Use for user research, persona creation, journey mapping, and design validation.

research-lookup

1174

from foryourhealth111-pixel/Vibe-Skills

Look up current research information using Perplexity's Sonar Pro Search or Sonar Reasoning Pro models through OpenRouter. Automatically selects the best model based on query complexity. Search academic papers, recent studies, technical documentation, and general research information with citations.

research-grants

1174

from foryourhealth111-pixel/Vibe-Skills

Write competitive research proposals for NSF, NIH, DOE, DARPA, and Taiwan NSTC. Agency-specific formatting, review criteria, budget preparation, broader impacts, significance statements, innovation narratives, and compliance with submission requirements.

market-research-reports

1174

from foryourhealth111-pixel/Vibe-Skills

Generate comprehensive market research reports (50+ pages) in the style of top consulting firms (McKinsey, BCG, Gartner). Features professional LaTeX formatting, extensive visual generation with scientific-schematics and generate-image, deep integration with research-lookup for data gathering, and multi-framework strategic analysis including Porter Five Forces, PESTLE, SWOT, TAM/SAM/SOM, and BCG Matrix.

deeptools

1174

from foryourhealth111-pixel/Vibe-Skills

NGS analysis toolkit. BAM to bigWig conversion, QC (correlation, PCA, fingerprints), heatmaps/profiles (TSS, peaks), for ChIP-seq, RNA-seq, ATAC-seq visualization.

deepchem

1174

from foryourhealth111-pixel/Vibe-Skills

Molecular ML with diverse featurizers and pre-built datasets. Use for property prediction (ADMET, toxicity) with traditional ML or GNNs when you want extensive featurization options and MoleculeNet benchmarks. Best for quick experiments with pre-trained models, diverse molecular representations. For graph-first PyTorch workflows use torchdrug; for benchmark datasets use pytdc.

deepagent-toolchain-plan

1174

from foryourhealth111-pixel/Vibe-Skills

DeepAgent-style tool discovery for VCO: propose a minimal skill/tool chain (with verification points) and reduce confirm_required friction.

deepagent-memory-fold

1174

from foryourhealth111-pixel/Vibe-Skills

DeepAgent-style memory folding for VCO sessions: compress long context into structured working/tool memory without using episodic-memory.

content-research-writer

1174

from foryourhealth111-pixel/Vibe-Skills

Assists in writing high-quality content by conducting research, adding citations, improving hooks, iterating on outlines, and providing real-time feedback on each section. Transforms your writing process from solo effort to collaborative partnership.

comprehensive-research-agent

1174

from foryourhealth111-pixel/Vibe-Skills

Ensure thorough validation, error recovery, and transparent reasoning in research tasks with multiple tool calls.

zinc-database

1174

from foryourhealth111-pixel/Vibe-Skills

Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.

zarr-python

1174

from foryourhealth111-pixel/Vibe-Skills

Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.