arxiv-mcp
Search and retrieve academic papers from arXiv.org using WebFetch and Exa. No MCP server required - uses existing tools to access arXiv API directly.
Best use case
arxiv-mcp is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using arxiv-mcp should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/arxiv-mcp/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How arxiv-mcp Compares
| Feature / Agent | arxiv-mcp | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
SKILL.md Source
# arXiv Search Skill
<identity>
arXiv Search Skill - Search and retrieve academic papers from arXiv.org using existing tools (WebFetch, Exa). No MCP server installation required.
</identity>
## ✅ No Installation Required
This skill uses **existing tools** to access arXiv:
- **WebFetch** - Direct access to arXiv API
- **Exa** - Semantic search with arXiv filtering
Works immediately - no MCP server, no restart needed.
<capabilities>
- Search academic papers by keywords, authors, categories, or date ranges
- Retrieve detailed paper metadata (title, authors, abstract, categories, PDF link)
- Get specific papers by arXiv ID
- Find related papers based on categories and keywords
- Filter by arXiv categories (cs.AI, cs.LG, cs.CV, math.*, physics.*, etc.)
- No API key required - uses public arXiv API
</capabilities>
<instructions>
<execution_process>
## Method 1: WebFetch with arXiv API (Recommended for specific queries)
The arXiv API is publicly accessible at `http://export.arxiv.org/api/query`.
### Search by Keywords
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=all:transformer+attention&max_results=10&sortBy=relevance',
  prompt: 'Extract paper titles, authors, abstracts, arXiv IDs, and PDF links from these results',
});
```
### Search by Author
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=au:LeCun&max_results=10&sortBy=submittedDate',
  prompt: 'Extract paper titles, authors, abstracts, and arXiv IDs',
});
```
### Search by Category
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=cat:cs.LG&max_results=15&sortBy=submittedDate',
  prompt: 'Extract paper titles, authors, abstracts, categories, and arXiv IDs',
});
```
### Get Specific Paper by ID
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?id_list=2301.07041',
  prompt:
    'Extract full details: title, all authors, abstract, categories, published date, PDF link',
});
```
### API Query Parameters
| Parameter | Description | Example |
| -------------- | ----------------------------------------------------------- | --------------------------------------------- |
| `search_query` | Search terms with field prefixes | `all:transformer`, `au:LeCun`, `ti:attention` |
| `id_list` | Comma-separated arXiv IDs | `2301.07041,2302.13971` |
| `max_results` | Number of results (default 10; the API allows up to 2000 per call) | `max_results=20` |
| `start` | Offset for pagination | `start=10` |
| `sortBy` | Sort order: `relevance`, `lastUpdatedDate`, `submittedDate` | `sortBy=submittedDate` |
| `sortOrder` | `ascending` or `descending` | `sortOrder=descending` |
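
The parameters above can also be assembled in code rather than by hand. The following is a minimal sketch, not part of the skill itself: `buildArxivUrl` is a hypothetical helper name, and it relies on the standard `URL` and `URLSearchParams` classes, which percent-encode characters such as `:` and `,` (assumed here to be accepted by the API as ordinary URL encoding).

```javascript
// Hypothetical helper (not part of the skill): builds an arXiv API query URL
// from the parameters listed in the table above.
const ARXIV_API = 'http://export.arxiv.org/api/query';

function buildArxivUrl({ searchQuery, idList, start, maxResults, sortBy, sortOrder } = {}) {
  const url = new URL(ARXIV_API);
  // Only set parameters the caller supplied, so API defaults apply otherwise.
  if (searchQuery) url.searchParams.set('search_query', searchQuery);
  if (idList && idList.length > 0) url.searchParams.set('id_list', idList.join(','));
  if (start !== undefined) url.searchParams.set('start', String(start));
  if (maxResults !== undefined) url.searchParams.set('max_results', String(maxResults));
  if (sortBy) url.searchParams.set('sortBy', sortBy);
  if (sortOrder) url.searchParams.set('sortOrder', sortOrder);
  return url.toString();
}

// Spaces in searchQuery are encoded as '+', matching the URLs shown earlier.
const exampleUrl = buildArxivUrl({
  searchQuery: 'all:transformer attention',
  maxResults: 10,
  sortBy: 'relevance',
});
console.log(exampleUrl);
```

The resulting string can be supplied as the `url` argument of a WebFetch call.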
### Field Prefixes for search_query
| Prefix | Field | Example |
| ------ | ---------- | ------------------------- |
| `all:` | All fields | `all:machine+learning` |
| `ti:` | Title | `ti:transformer` |
| `au:` | Author | `au:Vaswani` |
| `abs:` | Abstract | `abs:attention+mechanism` |
| `cat:` | Category | `cat:cs.LG` |
| `co:` | Comment | `co:accepted` |
### Boolean Operators
Combine terms with `AND`, `OR`, `ANDNOT`:
```
search_query=ti:transformer+AND+abs:attention
search_query=au:LeCun+OR+au:Bengio
search_query=cat:cs.LG+ANDNOT+ti:survey
```
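
These expressions can also be composed in code. The snippet below is a small sketch; the helper names (`and`, `or`, `andNot`) are hypothetical, and the output uses the same `+`-separated form as the examples above.

```javascript
// Hypothetical helpers: join field-prefixed terms with arXiv boolean operators.
// The '+' characters stand for spaces in the URL, as in the examples above.
const joinWith = (operator, terms) => terms.join(`+${operator}+`);

const and = (...terms) => joinWith('AND', terms);
const or = (...terms) => joinWith('OR', terms);
const andNot = (include, exclude) => joinWith('ANDNOT', [include, exclude]);

console.log(and('ti:transformer', 'abs:attention')); // ti:transformer+AND+abs:attention
console.log(or('au:LeCun', 'au:Bengio')); // au:LeCun+OR+au:Bengio
console.log(andNot('cat:cs.LG', 'ti:survey')); // cat:cs.LG+ANDNOT+ti:survey
```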
---
## Method 2: Exa Search (Better for semantic/natural language queries)
Use Exa for more natural language queries with arXiv filtering:
### Semantic Search
```javascript
mcp__Exa__web_search_exa({
  query: 'site:arxiv.org transformer architecture attention mechanism deep learning',
  numResults: 10,
});
```
### Recent Papers in a Field
```javascript
mcp__Exa__web_search_exa({
  query: 'site:arxiv.org large language model scaling laws 2024',
  numResults: 15,
});
```
### Author-Focused Search
```javascript
mcp__Exa__web_search_exa({
  query: 'site:arxiv.org author:"Yann LeCun" deep learning',
  numResults: 10,
});
```
---
## Common arXiv Categories
| Category | Field |
| ---------- | ------------------------------- |
| cs.AI | Artificial Intelligence |
| cs.LG | Machine Learning |
| cs.CL | Computation and Language (NLP) |
| cs.CV | Computer Vision |
| cs.SE | Software Engineering |
| cs.CR | Cryptography and Security |
| stat.ML | Machine Learning (Statistics) |
| math.\* | Mathematics (all subcategories) |
| physics.\* | Physics (all subcategories) |
| q-bio.\* | Quantitative Biology |
| econ.\* | Economics |
---
## Workflow: Complete Research Process
### Step 1: Initial Search
```javascript
// Start with broad Exa search for semantic matching
mcp__Exa__web_search_exa({
  query: 'site:arxiv.org transformer attention mechanism neural networks',
  numResults: 10,
});
```
### Step 2: Get Specific Papers
```javascript
// Get details for interesting papers by ID
WebFetch({
  url: 'http://export.arxiv.org/api/query?id_list=2301.07041,2302.13971',
  prompt: 'Extract full metadata for each paper: title, authors, abstract, categories, PDF URL',
});
```
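
Step 1 returns web results, while Step 2 needs arXiv IDs. One way to bridge the two is sketched below. It assumes the search results expose URLs of the form `arxiv.org/abs/<id>` or `arxiv.org/pdf/<id>` and that only new-style IDs (`YYMM.NNNNN`, with an optional `vN` suffix) matter; `extractArxivId` and `toIdList` are hypothetical names, not part of the skill.

```javascript
// Hypothetical helper: pull a new-style arXiv ID (e.g. 1706.03762 or
// 1706.03762v5) out of an arxiv.org abs/pdf URL. Old-style IDs such as
// hep-th/9901001 are deliberately not handled in this sketch.
const ARXIV_ID_IN_URL = /arxiv\.org\/(?:abs|pdf)\/(\d{4}\.\d{4,5})(v\d+)?/i;

function extractArxivId(url) {
  const match = ARXIV_ID_IN_URL.exec(url);
  return match ? match[1] : null; // the version suffix is dropped
}

// Build a de-duplicated, comma-separated value for the id_list parameter.
function toIdList(urls) {
  const ids = urls.map(extractArxivId).filter((id) => id !== null);
  return [...new Set(ids)].join(',');
}

const resultUrls = [
  'https://arxiv.org/abs/2301.07041',
  'https://arxiv.org/pdf/2302.13971v1',
  'https://arxiv.org/abs/2301.07041v2', // same paper, different version
  'https://example.com/not-arxiv',
];
console.log(toIdList(resultUrls)); // 2301.07041,2302.13971
```

Dropping the version suffix asks the API for the latest version of each paper; keep the suffix instead if a specific version is required.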
### Step 3: Find Related Work
```javascript
// Search by category of interesting paper
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=cat:cs.LG+AND+ti:attention&max_results=10&sortBy=submittedDate',
  prompt: 'Find related papers, extract titles and abstracts',
});
```
### Step 4: Get Recent Papers
```javascript
// Latest papers in the field
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=cat:cs.LG&max_results=20&sortBy=submittedDate&sortOrder=descending',
  prompt: 'Extract the 20 most recent machine learning papers',
});
```
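
When more results are needed than a single request returns, the `start` and `max_results` parameters can be stepped through page by page. The following is a minimal sketch, assuming a hypothetical `pageUrls` helper and a caller-chosen page size. The URLs would be fetched one at a time; the arXiv API manual asks clients to pause a few seconds between consecutive requests.

```javascript
// Hypothetical helper: list the URLs needed to page through `total` results
// for one search_query, `pageSize` results at a time.
function pageUrls(searchQuery, total, pageSize) {
  if (pageSize <= 0) throw new RangeError('pageSize must be positive');
  const urls = [];
  for (let start = 0; start < total; start += pageSize) {
    const count = Math.min(pageSize, total - start); // last page may be short
    urls.push(
      'http://export.arxiv.org/api/query' +
        `?search_query=${searchQuery}` +
        `&start=${start}&max_results=${count}` +
        '&sortBy=submittedDate&sortOrder=descending'
    );
  }
  return urls;
}

// 45 results in pages of 20 -> start offsets 0, 20, 40
for (const url of pageUrls('cat:cs.LG', 45, 20)) {
  console.log(url);
}
```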
</execution_process>
<best_practices>
1. **Use Exa for discovery**: Natural language queries find semantically related papers
2. **Use WebFetch for precision**: Specific IDs, categories, or API queries
3. **Combine approaches**: Exa to discover, WebFetch to deep-dive
4. **Use specific queries**: "transformer attention mechanism" > "machine learning"
5. **Check multiple categories**: Papers often span cs.AI + cs.LG + cs.CL
6. **Sort by date for recent work**: `sortBy=submittedDate&sortOrder=descending`
</best_practices>
</instructions>
<examples>
<usage_example>
**Example 1: Search for transformer papers**:
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=ti:transformer+AND+abs:attention&max_results=10&sortBy=relevance',
  prompt: 'Extract paper titles, authors, abstracts, and arXiv IDs',
});
```
**Example 2: Find papers by researcher**:
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=au:Vaswani&max_results=15',
  prompt: 'List all papers by this author with titles and dates',
});
```
**Example 3: Get recent ML papers**:
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?search_query=cat:cs.LG&max_results=20&sortBy=submittedDate&sortOrder=descending',
  prompt: 'Extract the 20 most recent machine learning papers with titles and abstracts',
});
```
**Example 4: Semantic search with Exa**:
```javascript
mcp__Exa__web_search_exa({
  query: 'site:arxiv.org multimodal large language models vision 2024',
  numResults: 10,
});
```
**Example 5: Get specific paper details**:
```javascript
WebFetch({
  url: 'http://export.arxiv.org/api/query?id_list=1706.03762',
  prompt: "Extract complete details for the 'Attention Is All You Need' paper",
});
```
</usage_example>
</examples>
## Agent Integration
This skill is automatically assigned to:
- **researcher** - Academic research, literature review
- **scientific-research-expert** - Deep scientific analysis
- **developer** - Finding technical papers for implementation
## Memory Protocol (MANDATORY)
**Before starting:**
```bash
cat .claude/context/memory/learnings.md
```
**After completing:**
- New pattern -> `.claude/context/memory/learnings.md`
- Issue found -> `.claude/context/memory/issues.md`
- Decision made -> `.claude/context/memory/decisions.md`
> ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.

Related Skills
arxiv-paper-extract
Extract, translate and save arXiv CS.CV papers for a specific date. Use when user asks to fetch arXiv papers, download paper lists, extract CV papers, translate paper titles to Chinese, or save paper metadata from arxiv.org/list/cs.CV.
arxivterminal
CLI tool (arxivterminal) for fetching, searching, and managing arXiv papers locally. Use when working with arXiv papers using the arxivterminal command - fetching new papers by category, searching the local database, viewing papers from specific dates, or managing the local paper database.
arxiv-reader
A skill that fetches and summarizes the content of arXiv papers. Use when the URL has the form arxiv.org/abs/{paper ID}. Downloads the PDF and reads it with the Read tool.
arxiv-search
Search arXiv preprint repository for papers in physics, mathematics, computer science, quantitative biology, and related fields
bgo
Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.
moai-lang-r
R 4.4+ best practices with testthat 3.2, lintr 3.2, and data analysis patterns.
moai-lang-python
Python 3.13+ development specialist covering FastAPI, Django, async patterns, data science, testing with pytest, and modern Python features. Use when developing Python APIs, web applications, data pipelines, or writing tests.
moai-icons-vector
Vector icon libraries ecosystem guide covering 10+ major libraries with 200K+ icons, including React Icons (35K+), Lucide (1000+), Tabler Icons (5900+), Iconify (200K+), Heroicons, Phosphor, and Radix Icons with implementation patterns, decision trees, and best practices.
moai-foundation-trust
Complete TRUST 4 principles guide covering Test First, Readable, Unified, Secured. Validation methods, enterprise quality gates, metrics, and November 2025 standards. Enterprise v4.0 with 50+ software quality standards references.
moai-foundation-memory
Persistent memory across sessions using MCP Memory Server for user preferences, project context, and learned patterns
moai-foundation-core
MoAI-ADK's foundational principles - TRUST 5, SPEC-First TDD, delegation patterns, token optimization, progressive disclosure, modular architecture, agent catalog, command reference, and execution rules for building AI-powered development workflows
moai-cc-claude-md
Authoring CLAUDE.md Project Instructions. Design project-specific AI guidance, document workflows, define architecture patterns. Use when creating CLAUDE.md files for projects, documenting team standards, or establishing AI collaboration guidelines.