autoresearchclaw-autonomous-research
Fully autonomous research pipeline that turns a topic idea into a complete academic paper with real citations, experiments, and conference-ready LaTeX.
Best use case
autoresearchclaw-autonomous-research is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using autoresearchclaw-autonomous-research should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/autoresearchclaw-autonomous-research/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How autoresearchclaw-autonomous-research Compares
| Feature / Agent | autoresearchclaw-autonomous-research | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Fully autonomous research pipeline that turns a topic idea into a complete academic paper with real citations, experiments, and conference-ready LaTeX.
Where can I find the source code?
The AutoResearchClaw source code is on GitHub at https://github.com/aiming-lab/AutoResearchClaw (the repository cloned in the Installation section of the SKILL.md below).
SKILL.md Source
# AutoResearchClaw — Autonomous Research Pipeline
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
AutoResearchClaw is a fully autonomous 23-stage research pipeline that takes a natural language topic and produces a complete academic paper: real arXiv/Semantic Scholar citations, sandboxed experiments, statistical analysis, multi-agent peer review, and conference-ready LaTeX (NeurIPS/ICML/ICLR). No hallucinated references. No human babysitting.
---
## Installation
```bash
# Clone and install
git clone https://github.com/aiming-lab/AutoResearchClaw.git
cd AutoResearchClaw
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
# Verify CLI is available
researchclaw --help
```
**Requirements:** Python 3.11+
---
## Configuration
```bash
cp config.researchclaw.example.yaml config.arc.yaml
```
### Minimum config (`config.arc.yaml`)
```yaml
project:
  name: "my-research"
research:
  topic: "Your research topic here"
llm:
  provider: "openai"
  base_url: "https://api.openai.com/v1"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]
experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
```
```bash
export OPENAI_API_KEY="$YOUR_OPENAI_KEY"
```
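The `api_key_env` field names an environment variable rather than holding the key itself. The sketch below shows how that indirection could be resolved, assuming the `llm` section has been parsed into a plain dict. The helper `resolve_api_key` is hypothetical and is not part of the `researchclaw` package.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(llm_cfg: Mapping[str, str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored in the variable named by `api_key_env` (hypothetical helper)."""
    env = os.environ if env is None else env
    var_name = llm_cfg["api_key_env"]
    key = env.get(var_name, "")
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set or is empty")
    return key


llm_cfg = {"provider": "openai", "api_key_env": "OPENAI_API_KEY"}
# A fake environment keeps the example independent of the real shell.
print(resolve_api_key(llm_cfg, env={"OPENAI_API_KEY": "sk-example"}))
```

Keeping only the variable name in the config file means the file can be committed or shared without exposing the secret.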
### OpenRouter config (200+ models)
```yaml
llm:
  provider: "openrouter"
  api_key_env: "OPENROUTER_API_KEY"
  primary_model: "anthropic/claude-3.5-sonnet"
  fallback_models:
    - "google/gemini-pro-1.5"
    - "meta-llama/llama-3.1-70b-instruct"
```
```bash
export OPENROUTER_API_KEY="$YOUR_OPENROUTER_KEY"
```
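The `primary_model` / `fallback_models` pair suggests an ordered fallback chain. The following is a minimal sketch of that idea, assuming models are simply tried in the listed order until one responds. The function `complete_with_fallback` and the `fake_call` stand-in are hypothetical, and the real pipeline may apply different retry rules.

```python
from typing import Callable, Sequence, Tuple


def complete_with_fallback(models: Sequence[str],
                           call_model: Callable[[str], str]) -> Tuple[str, str]:
    """Try each model in order and return (model_used, response)."""
    errors = []
    for model in models:
        try:
            return model, call_model(model)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    # Stand-in for a real API call: the primary model is simulated as down.
    if model == "anthropic/claude-3.5-sonnet":
        raise TimeoutError("simulated outage")
    return f"response from {model}"


models = ["anthropic/claude-3.5-sonnet", "google/gemini-pro-1.5"]
print(complete_with_fallback(models, fake_call))
```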
### ACP (Agent Client Protocol) — no API key needed
```yaml
llm:
  provider: "acp"
  acp:
    agent: "claude"   # or: codex, gemini, opencode, kimi
    cwd: "."
```
The agent CLI (e.g. `claude`) handles its own authentication.
### OpenClaw bridge (optional advanced capabilities)
```yaml
openclaw_bridge:
  use_cron: true            # Scheduled research runs
  use_message: true         # Progress notifications
  use_memory: true          # Cross-session knowledge persistence
  use_sessions_spawn: true  # Parallel sub-sessions
  use_web_fetch: true       # Live web search in literature review
  use_browser: false        # Browser-based paper collection
```
---
## Key CLI Commands
```bash
# Basic run — fully autonomous, no prompts
researchclaw run --topic "Your research idea" --auto-approve
# Run with explicit config file
researchclaw run --config config.arc.yaml --topic "Mixture-of-experts routing efficiency" --auto-approve
# Run with topic defined in config (omit --topic flag)
researchclaw run --config config.arc.yaml --auto-approve
# Interactive mode — pauses at gate stages for approval
researchclaw run --config config.arc.yaml --topic "Your topic"
# Check the status of a run (resuming uses `run --resume`, shown under Common Patterns)
researchclaw status --run-id rc-20260315-120000-abc123
# List past runs
researchclaw list
```
**Gate stages** (5, 9, 20) pause for human approval in interactive mode. Pass `--auto-approve` to skip all gates.
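The gate rule can be stated as a small predicate. This is only an illustration of the behaviour described above, not the project's implementation, and the name `needs_human_approval` is hypothetical.

```python
GATE_STAGES = frozenset({5, 9, 20})  # gate stage numbers as listed above


def needs_human_approval(stage: int, auto_approve: bool) -> bool:
    """A stage pauses only if it is a gate and auto-approve is off."""
    return stage in GATE_STAGES and not auto_approve


interactive_pauses = [s for s in range(1, 24) if needs_human_approval(s, auto_approve=False)]
print(interactive_pauses)  # [5, 9, 20]
print(any(needs_human_approval(s, auto_approve=True) for s in range(1, 24)))  # False
```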
---
## Python API
```python
from researchclaw.pipeline import Runner
from researchclaw.config import load_config
# Load config and run
config = load_config("config.arc.yaml")
config.research.topic = "Efficient attention mechanisms for long-context LLMs"
config.auto_approve = True
runner = Runner(config)
result = runner.run()
# Access outputs
print(result.artifact_dir) # artifacts/rc-YYYYMMDD-HHMMSS-<hash>/
print(result.deliverables_dir) # .../deliverables/
print(result.paper_draft_path) # .../deliverables/paper_draft.md
print(result.latex_path) # .../deliverables/paper.tex
print(result.bibtex_path) # .../deliverables/references.bib
print(result.verification_report) # .../deliverables/verification_report.json
```
```python
# Run specific stages only
from researchclaw.pipeline import Runner, StageRange
runner = Runner(config)
result = runner.run(stages=StageRange(start="LITERATURE_COLLECT", end="KNOWLEDGE_EXTRACT"))
```
```python
# Access knowledge base after a run
from researchclaw.knowledge import KnowledgeBase
kb = KnowledgeBase.load(result.artifact_dir)
findings = kb.get("findings")
literature = kb.get("literature")
decisions = kb.get("decisions")
```
---
## Output Structure
After a run, all outputs land in `artifacts/rc-YYYYMMDD-HHMMSS-<hash>/`:
```
artifacts/rc-20260315-120000-abc123/
├── deliverables/
│   ├── paper_draft.md            # Full academic paper (Markdown)
│   ├── paper.tex                 # Conference-ready LaTeX
│   ├── references.bib            # Real BibTeX, auto-pruned to inline citations
│   ├── verification_report.json  # 4-layer citation integrity report
│   └── reviews.md                # Multi-agent peer review
├── experiment_runs/
│   ├── run_001/
│   │   ├── code/                 # Generated experiment code
│   │   ├── results.json          # Structured metrics
│   │   └── sandbox_output.txt    # Execution logs
├── charts/
│   └── *.png                     # Auto-generated comparison charts
├── evolution/
│   └── lessons.json              # Self-learning lessons for future runs
└── knowledge_base/
    ├── decisions.json
    ├── experiments.json
    ├── findings.json
    ├── literature.json
    ├── questions.json
    └── reviews.json
```
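Because run directories are named `rc-YYYYMMDD-HHMMSS-<hash>`, sorting them by name also sorts them by time. The sketch below uses only the standard library to find the newest run and load its `knowledge_base/*.json` files. The helpers `latest_run` and `load_knowledge_base` are hypothetical, and the JSON schema inside each file is not documented here, so the files are loaded as-is.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def latest_run(artifacts: Path) -> Optional[Path]:
    """Newest run directory, relying on the timestamped rc-* naming scheme."""
    runs = sorted(p for p in artifacts.glob("rc-*") if p.is_dir())
    return runs[-1] if runs else None


def load_knowledge_base(run_dir: Path) -> dict:
    """Load every knowledge_base/*.json file, keyed by file stem."""
    kb_dir = run_dir / "knowledge_base"
    return {p.stem: json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(kb_dir.glob("*.json"))}


# Demo on a throwaway tree so the example runs without a real pipeline run.
with tempfile.TemporaryDirectory() as tmp:
    artifacts = Path(tmp) / "artifacts"
    for run_id in ("rc-20260314-090000-aaa111", "rc-20260315-120000-abc123"):
        kb = artifacts / run_id / "knowledge_base"
        kb.mkdir(parents=True)
        (kb / "findings.json").write_text("[]", encoding="utf-8")
    newest = latest_run(artifacts)
    print(newest.name)
    print(load_knowledge_base(newest))
```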
---
## Pipeline Stages Reference
| Phase | Stage # | Name | Notes |
|-------|---------|------|-------|
| A | 1 | TOPIC_INIT | Parse and scope research topic |
| A | 2 | PROBLEM_DECOMPOSE | Break into sub-problems |
| B | 3 | SEARCH_STRATEGY | Build search queries |
| B | 4 | LITERATURE_COLLECT | Real API calls to arXiv + Semantic Scholar |
| B | 5 | LITERATURE_SCREEN | **Gate** — approve/reject literature |
| B | 6 | KNOWLEDGE_EXTRACT | Extract structured knowledge |
| C | 7 | SYNTHESIS | Synthesize findings |
| C | 8 | HYPOTHESIS_GEN | Multi-agent debate to form hypotheses |
| D | 9 | EXPERIMENT_DESIGN | **Gate** — approve/reject design |
| D | 10 | CODE_GENERATION | Generate experiment code |
| D | 11 | RESOURCE_PLANNING | GPU/MPS/CPU auto-detection |
| E | 12 | EXPERIMENT_RUN | Sandboxed execution |
| E | 13 | ITERATIVE_REFINE | Self-healing on failure |
| F | 14 | RESULT_ANALYSIS | Multi-agent analysis |
| F | 15 | RESEARCH_DECISION | PROCEED / REFINE / PIVOT |
| G | 16 | PAPER_OUTLINE | Structure paper |
| G | 17 | PAPER_DRAFT | Write full paper |
| G | 18 | PEER_REVIEW | Evidence-consistency check |
| G | 19 | PAPER_REVISION | Incorporate review feedback |
| H | 20 | QUALITY_GATE | **Gate** — final approval |
| H | 21 | KNOWLEDGE_ARCHIVE | Save lessons to KB |
| H | 22 | EXPORT_PUBLISH | Emit LaTeX + BibTeX |
| H | 23 | CITATION_VERIFY | 4-layer anti-hallucination check |
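The table maps directly onto an ordered enumeration, which makes it easy to see which stages a name-based range would cover. This sketch assumes a range includes both endpoints. Whether the library's `StageRange` behaves the same way is not stated in this document, and `stages_between` and `phase_of` are hypothetical helpers.

```python
from enum import IntEnum

# Stage names in pipeline order, copied from the table above.
STAGE_NAMES = [
    "TOPIC_INIT", "PROBLEM_DECOMPOSE", "SEARCH_STRATEGY", "LITERATURE_COLLECT",
    "LITERATURE_SCREEN", "KNOWLEDGE_EXTRACT", "SYNTHESIS", "HYPOTHESIS_GEN",
    "EXPERIMENT_DESIGN", "CODE_GENERATION", "RESOURCE_PLANNING", "EXPERIMENT_RUN",
    "ITERATIVE_REFINE", "RESULT_ANALYSIS", "RESEARCH_DECISION", "PAPER_OUTLINE",
    "PAPER_DRAFT", "PEER_REVIEW", "PAPER_REVISION", "QUALITY_GATE",
    "KNOWLEDGE_ARCHIVE", "EXPORT_PUBLISH", "CITATION_VERIFY",
]
Stage = IntEnum("Stage", STAGE_NAMES)  # functional API numbers members from 1

# Last stage number of each phase, read off the table above.
PHASE_ENDS = {"A": 2, "B": 6, "C": 8, "D": 11, "E": 13, "F": 15, "G": 19, "H": 23}


def phase_of(stage: Stage) -> str:
    """Phase letter for a stage: the first phase whose last stage is >= it."""
    return next(phase for phase, end in PHASE_ENDS.items() if stage <= end)


def stages_between(start: str, end: str) -> list:
    """All stages from `start` to `end`, inclusive (assumed semantics)."""
    return [s for s in Stage if Stage[start] <= s <= Stage[end]]


selected = stages_between("LITERATURE_COLLECT", "KNOWLEDGE_EXTRACT")
print([(int(s), s.name, phase_of(s)) for s in selected])
```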
---
## Common Patterns
### Pattern: Quick paper on a topic
```bash
export OPENAI_API_KEY="$OPENAI_API_KEY"
researchclaw run \
--topic "Self-supervised learning for protein structure prediction" \
--auto-approve
```
### Pattern: Reproducible run with full config
```yaml
# config.arc.yaml
project:
  name: "protein-ssl-research"
research:
  topic: "Self-supervised learning for protein structure prediction"
llm:
  provider: "openai"
  api_key_env: "OPENAI_API_KEY"
  primary_model: "gpt-4o"
  fallback_models: ["gpt-4o-mini"]
experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
  max_iterations: 3
  timeout_seconds: 300
```
```bash
researchclaw run --config config.arc.yaml --auto-approve
```
### Pattern: Use Claude via OpenRouter for best reasoning
```bash
export OPENROUTER_API_KEY="$OPENROUTER_API_KEY"
cat > config.arc.yaml << 'EOF'
project:
  name: "my-research"
llm:
  provider: "openrouter"
  api_key_env: "OPENROUTER_API_KEY"
  primary_model: "anthropic/claude-3.5-sonnet"
  fallback_models: ["google/gemini-pro-1.5"]
experiment:
  mode: "sandbox"
  sandbox:
    python_path: ".venv/bin/python"
EOF
researchclaw run --config config.arc.yaml \
--topic "Efficient KV cache compression for transformer inference" \
--auto-approve
```
### Pattern: Resume after a failed run
```bash
# List runs to find the run ID
researchclaw list
# Resume from last completed stage
researchclaw run --resume rc-20260315-120000-abc123
```
### Pattern: Programmatic batch research
```python
from researchclaw.pipeline import Runner
from researchclaw.config import load_config

topics = [
    "LoRA fine-tuning on limited hardware",
    "Speculative decoding for LLM inference",
    "Flash attention variants comparison",
]

config = load_config("config.arc.yaml")
config.auto_approve = True

# Run the topics one after another, reusing the same base config
for topic in topics:
    config.research.topic = topic
    runner = Runner(config)
    result = runner.run()
    print(f"[{topic}] → {result.deliverables_dir}")
```
### Pattern: OpenClaw one-liner (if using OpenClaw agent)
```
Share the repo URL with OpenClaw, then say:
"Research mixture-of-experts routing efficiency"
```
OpenClaw auto-reads `RESEARCHCLAW_AGENTS.md`, clones, installs, configures, and runs the full pipeline.
---
## Compile the LaTeX Output
```bash
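# Navigate to the deliverables of one run (substitute your own run ID;
# a bare rc-* glob makes cd fail when several runs exist)
cd artifacts/rc-20260315-120000-abc123/deliverables/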
# Compile (requires a LaTeX distribution)
pdflatex paper.tex
bibtex paper
pdflatex paper.tex
pdflatex paper.tex
# Or upload paper.tex + references.bib directly to Overleaf
```
---
## Troubleshooting
### `researchclaw: command not found`
```bash
# Make sure the venv is active and package is installed
source .venv/bin/activate
pip install -e .
which researchclaw
```
### API key errors
```bash
# Verify the env var is set (without echoing the secret to the terminal)
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set" || echo "OPENAI_API_KEY is empty"
# Set it explicitly for the session
export OPENAI_API_KEY="sk-..."
```
### Experiment sandbox failures
The pipeline self-heals at Stage 13 (ITERATIVE_REFINE). If it keeps failing:
```yaml
# Increase timeout and iterations in config
experiment:
  max_iterations: 5
  timeout_seconds: 600
  sandbox:
    python_path: ".venv/bin/python"
```
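To show how `max_iterations` and `timeout_seconds` interact, here is a minimal retry loop around a subprocess. It is a sketch under stated assumptions, not the pipeline's sandbox: `run_with_retries` is hypothetical, and the `candidates` list stands in for the successive repaired scripts that a self-healing stage would generate from each error log.

```python
import subprocess
import sys


def run_with_retries(candidates, max_iterations=3, timeout_seconds=300):
    """Run script candidates in order until one exits cleanly.

    Returns (attempt_number, stdout). Each attempt is killed after
    `timeout_seconds`; at most `max_iterations` candidates are tried.
    """
    last_error = "no candidates were run"
    for attempt, code in enumerate(candidates[:max_iterations], start=1):
        try:
            proc = subprocess.run([sys.executable, "-c", code],
                                  capture_output=True, text=True,
                                  timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            last_error = f"attempt {attempt}: timed out after {timeout_seconds}s"
            continue
        if proc.returncode == 0:
            return attempt, proc.stdout.strip()
        # Keep only the final stderr line, usually the exception message.
        stderr_lines = proc.stderr.strip().splitlines() or ["no stderr output"]
        last_error = f"attempt {attempt}: {stderr_lines[-1]}"
    raise RuntimeError(f"all attempts failed; last error: {last_error}")


# First candidate crashes, second one succeeds.
candidates = ["1/0", "print('accuracy=0.9')"]
print(run_with_retries(candidates, max_iterations=5, timeout_seconds=60))
```

Raising the two settings gives slow or flaky experiments more room, at the cost of longer runs when the generated code is fundamentally broken.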
### Citation hallucination warnings
Stage 23 (CITATION_VERIFY) runs a 4-layer check. If references are pruned:
- This is **expected behaviour** — fake citations are removed automatically
- Check `verification_report.json` for details on which citations were rejected and why
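The pruning rule (a citation survives only if every layer accepts it) can be sketched as a simple filter. The layer names follow the four layers listed under Key Concepts, but the predicates here are placeholders that read precomputed flags. The real checks query external services, and the actual schema of `verification_report.json` is not shown in this document. `verify_citations` is a hypothetical name.

```python
from typing import Callable, Iterable, List, Tuple

Check = Callable[[dict], bool]


def verify_citations(citations: Iterable[dict],
                     layers: List[Tuple[str, Check]]):
    """Split citations into kept and rejected; record the first failing layer."""
    kept, rejected = [], []
    for cite in citations:
        failed = next((name for name, check in layers if not check(cite)), None)
        if failed is None:
            kept.append(cite)
        else:
            rejected.append({"key": cite["key"], "failed_layer": failed})
    return kept, rejected


# Placeholder layers: each reads a precomputed flag instead of calling an API.
layers = [
    ("arxiv", lambda c: c["arxiv_ok"]),
    ("crossref", lambda c: c["crossref_ok"]),
    ("datacite", lambda c: c["datacite_ok"]),
    ("llm_relevance", lambda c: c["relevance"] >= 0.5),  # threshold is an assumption
]
citations = [
    {"key": "paper_a", "arxiv_ok": True, "crossref_ok": True, "datacite_ok": True, "relevance": 0.9},
    {"key": "paper_b", "arxiv_ok": True, "crossref_ok": False, "datacite_ok": True, "relevance": 0.8},
]
kept, rejected = verify_citations(citations, layers)
print([c["key"] for c in kept])
print(rejected)
```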
### PIVOT loop running indefinitely
Stage 15 (RESEARCH_DECISION) may pivot multiple times. To cap iterations:
```yaml
research:
  max_pivots: 2
  max_refines: 3
```
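The effect of the two caps can be illustrated with a small decision loop. This sketch assumes the loop simply stops once a cap would be exceeded. What the pipeline actually does at that point is not described here, and `run_decision_loop` is a hypothetical name.

```python
def run_decision_loop(decide, max_pivots=2, max_refines=3):
    """Call `decide()` until it returns PROCEED or a cap is reached.

    Returns the list of decisions taken; a capped decision is marked.
    """
    counts = {"PIVOT": 0, "REFINE": 0}
    limits = {"PIVOT": max_pivots, "REFINE": max_refines}
    history = []
    while True:
        decision = decide()
        if decision not in ("PROCEED", "REFINE", "PIVOT"):
            raise ValueError(f"unknown decision: {decision!r}")
        if decision != "PROCEED" and counts[decision] >= limits[decision]:
            history.append(f"{decision} (capped)")
            return history
        history.append(decision)
        if decision == "PROCEED":
            return history
        counts[decision] += 1


# A decision function that would otherwise pivot forever.
print(run_decision_loop(lambda: "PIVOT", max_pivots=2))
# ['PIVOT', 'PIVOT', 'PIVOT (capped)']
```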
### LaTeX compilation errors
```bash
# Check for missing packages
pdflatex -interaction=nonstopmode paper.tex 2>&1 | grep "File.*not found"
# Install missing packages (TeX Live)
tlmgr install <package-name>
```
### Out of memory during experiments
```yaml
# Force CPU mode in config
experiment:
  sandbox:
    device: "cpu"
    max_memory_gb: 4
```
---
## Key Concepts
- **PIVOT/REFINE Loop**: Stage 15 autonomously decides PROCEED, REFINE (tweak params), or PIVOT (new hypothesis direction). All artifacts are versioned.
- **Multi-Agent Debate**: Stages 8, 14, 18 use structured multi-perspective debate — not a single LLM pass.
- **Self-Learning**: Each run extracts lessons with 30-day time decay. Future runs on similar topics benefit from past mistakes.
- **Sentinel Watchdog**: Background monitor detects NaN/Inf in results, checks paper-evidence consistency, scores citation relevance, and guards against fabrication throughout the run.
- **4-Layer Citation Verification**: arXiv lookup → CrossRef lookup → DataCite lookup → LLM relevance scoring. A citation must pass all layers to survive.