arxiv-search
Search arXiv for preprints in physics, math, CS, quantitative biology, quantitative finance, statistics, electrical engineering, economics. Use when: (1) finding preprints by topic, (2) searching by author, (3) browsing arXiv categories, (4) getting paper metadata/abstracts. NOT for: published journal articles (use crossref-search), biomedical (use pubmed-search).
Best use case
arxiv-search is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Search arXiv for preprints in physics, math, CS, quantitative biology, quantitative finance, statistics, electrical engineering, economics. Use when: (1) finding preprints by topic, (2) searching by author, (3) browsing arXiv categories, (4) getting paper metadata/abstracts. NOT for: published journal articles (use crossref-search), biomedical (use pubmed-search).
Teams using arxiv-search should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/arxiv-search/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How arxiv-search Compares
| Feature / Agent | arxiv-search | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Search arXiv for preprints in physics, math, CS, quantitative biology, quantitative finance, statistics, electrical engineering, economics. Use when: (1) finding preprints by topic, (2) searching by author, (3) browsing arXiv categories, (4) getting paper metadata/abstracts. NOT for: published journal articles (use crossref-search), biomedical (use pubmed-search).
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# arXiv Search
Search arXiv preprints via public API. Covers physics, math, CS, q-bio, q-fin,
statistics, electrical engineering, and economics.
## API Endpoint
```bash
curl -s "http://export.arxiv.org/api/query?search_query=all:transformer+attention&start=0&max_results=5"
```
Parameters: `search_query=` (required), `id_list=` (direct lookup by arXiv ID),
`start=` (pagination offset), `max_results=` (default 10, max 30000),
`sortBy=relevance|lastUpdatedDate|submittedDate`, `sortOrder=ascending|descending`.
## Query Syntax
**Field prefixes**: `ti:` title, `au:` author, `abs:` abstract, `co:` comment,
`jr:` journal ref, `cat:` category, `all:` all fields.
**Boolean**: `AND`, `OR`, `ANDNOT`. Example:
```bash
curl -s "http://export.arxiv.org/api/query?search_query=au:bengio+AND+cat:cs.LG+AND+ti:attention&max_results=10"
```
## Category Codes
**Physics**: `astro-ph` (.CO/.EP/.GA/.HE/.IM/.SR), `cond-mat` (.dis-nn/.mes-hall/.mtrl-sci/.soft/.stat-mech/.str-el/.supr-con), `hep-ex`, `hep-lat`, `hep-ph`, `hep-th`, `quant-ph`, `gr-qc`, `nucl-ex`, `nucl-th`
**CS**: `cs.AI`, `cs.CL` (NLP), `cs.CV`, `cs.LG` (ML), `cs.CR`, `cs.DB`, `cs.DS`, `cs.SE`, `cs.RO`
**Math**: `math.AG`, `math.AP`, `math.CO`, `math.PR`, `math.ST`
**Other**: `q-bio` (.BM/.CB/.GN/.MN/.NC/.PE/.QM/.SC/.TO), `q-fin` (.CP/.EC/.GN/.MF/.PM/.PR/.RM/.ST/.TR), `stat` (.AP/.CO/.ME/.ML/.OT/.TH), `eess` (.AS/.IV/.SP/.SY), `econ` (.EM/.GN/.TH)
## Response Parsing
The API returns Atom XML. Parse with Python:
```bash
curl -s "http://export.arxiv.org/api/query?search_query=ti:large+language+model&max_results=5&sortBy=submittedDate&sortOrder=descending" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom'}
root = ET.parse(sys.stdin).getroot()
for entry in root.findall('a:entry', ns):
title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
aid = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
pub = entry.find('a:published', ns).text[:10]
authors = ', '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
print(f'[{aid}] {pub} | {title}')
print(f' Authors: {authors}\n')
"
```
## Direct Lookup and Pagination
```bash
# By ID
curl -s "http://export.arxiv.org/api/query?id_list=2301.07041,2302.13971"
# Pagination
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.AI&start=0&max_results=25&sortBy=submittedDate&sortOrder=descending"
curl -s "http://export.arxiv.org/api/query?search_query=cat:cs.AI&start=25&max_results=25&sortBy=submittedDate&sortOrder=descending"
```
## Rate Limiting
No official limit, but keep to 1 request per 3 seconds for bulk queries.
For large-scale harvesting, use the OAI-PMH bulk access endpoint instead.
## Best Practices
1. Use `sortBy=submittedDate&sortOrder=descending` for latest papers.
2. Combine `cat:` with keyword searches for targeted results.
3. Check `opensearch:totalResults` in the response for total match count.
4. For PDF access, replace `/abs/` with `/pdf/` in the paper URL.
5. Use `id_list` for direct lookups (faster and more reliable).
6. URL-encode spaces as `+` in query terms.
## Zero-Hallucination Rule
NEVER fabricate results from training data. Every paper title, author, DOI, PMID, citation count, and metadata detail presented to the user MUST come from an actual API response in this conversation. If the API returns no results or partial data, report exactly what was returned. Do not "fill in" missing details from memory.Related Skills
wikipedia-search
Search and fetch structured content from Wikipedia using the MediaWiki API for reliable, encyclopedic information
social-science-research
Orchestrates a social science research workflow from literature review through data collection, text analysis, statistical modeling, and report generation. Use when conducting empirical social science research, policy analysis, or mixed-methods studies. NOT for pure natural science analysis or clinical trial data.
search-strategy
COPYRIGHT NOTICE
research-reflection
Reflect on completed research tasks to improve future performance. Use when: a research task has just been completed and the agent should evaluate its own process, store lessons learned, or retrieve past reflections before starting new work. NOT for: active research execution or data analysis.
research-lookup
Look up current research information using the Parallel Chat API (primary) or Perplexity sonar-pro-search (academic paper searches). Automatically routes queries to the best backend. Use for finding papers, gathering research data, and verifying scientific information.
research-literature
COPYRIGHT NOTICE
research-grants
Write competitive research proposals for NSF, NIH, DOE, DARPA, and Taiwan NSTC. Agency-specific formatting, review criteria, budget preparation, broader impacts, significance statements, innovation narratives, and compliance with submission requirements.
research-ethics
Guides research ethics compliance including IRB protocol preparation, informed consent document drafting, research integrity standards, data management plans, and ethical considerations for human/animal subjects; trigger when users discuss IRB, ethical approval, consent forms, or responsible conduct of research.
pubmed-search
Search PubMed/MEDLINE for biomedical literature via NCBI E-utilities API. Use when: (1) searching medical/biomedical papers, (2) finding clinical studies, (3) querying with MeSH terms, (4) retrieving abstracts by PMID. NOT for: non-biomedical papers (use arxiv-search or semantic-scholar), full-text access (PubMed provides abstracts), or social science literature.
psychology-research
Conduct psychological research analysis including mental health, cognitive science, and behavioral studies
perplexity-search
Perform AI-powered web searches with real-time information using Perplexity models via LiteLLM and OpenRouter. This skill should be used when conducting web searches for current information, finding recent scientific literature, getting grounded answers with source citations, or accessing information beyond the model knowledge cutoff. Provides access to multiple Perplexity models including Sonar Pro, Sonar Pro Search (advanced agentic search), and Sonar Reasoning Pro through a single OpenRouter API key.
openalex-search
Open academic metadata via OpenAlex API. Use when: user needs author profiles, institution data, concept mapping, or open citation data. NOT for: full-text search or downloading papers.