academic-search
Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields
Best use case
academic-search is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields
Teams using academic-search should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/academic-search/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How academic-search Compares
| Feature / Agent | academic-search | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Academic Search Skill
This skill provides access to academic paper repositories, primarily arXiv, for searching scholarly articles. arXiv is a free distribution service and open-access archive for preprints in physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, systems science, and economics.
## When to Use This Skill
Use this skill when you need to:
- **Find cutting-edge research**: Access preprints and recent papers before formal journal publication
- **Search AI/ML papers**: Find machine learning, deep learning, and artificial intelligence research
- **Explore computational methods**: Search for algorithms, theoretical frameworks, and mathematical foundations
- **Research interdisciplinary topics**: Find papers spanning computer science, biology, physics, and mathematics
- **Gather literature reviews**: Collect relevant papers for comprehensive topic overviews
- **Track state-of-the-art**: Find the latest advances in rapidly evolving fields
### Ideal Use Cases
| Scenario | Example Query |
|----------|---------------|
| Understanding new architectures | "transformer attention mechanism" |
| Exploring applications | "large language models code generation" |
| Finding benchmarks | "image classification benchmark ImageNet" |
| Surveying methods | "reinforcement learning robotics" |
| Technical deep-dives | "backpropagation neural networks" |
## How to Use
The skill provides a Python script that searches arXiv and returns formatted results with titles and abstracts.
### Basic Usage
**Note:** Always use the absolute path from your skills directory.
If running from a virtual environment:
```bash
.venv/bin/python [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```
Or for system Python:
```bash
python3 [YOUR_SKILLS_DIR]/academic-search/arxiv_search.py "your search query"
```
Replace `[YOUR_SKILLS_DIR]` with the absolute skills directory path from your system prompt.
### Command-Line Arguments
| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `query` | Yes | - | The search query string |
| `--max-papers` | No | 10 | Maximum number of papers to retrieve |
| `--output-format` | No | text | Output format: `text`, `json`, or `markdown` |
### Examples
**Search for transformer architecture papers:**
```bash
python3 arxiv_search.py "attention is all you need transformer" --max-papers 5
```
**Search for reinforcement learning papers:**
```bash
python3 arxiv_search.py "deep reinforcement learning continuous control" --max-papers 10
```
**Search for LLM papers with JSON output:**
```bash
python3 arxiv_search.py "large language model reasoning" --output-format json
```
**Search for specific author or topic:**
```bash
python3 arxiv_search.py "author:Hinton deep learning"
```
**Search in specific arXiv categories:**
```bash
python3 arxiv_search.py "cat:cs.LG neural network pruning"
```
## Step-by-Step Workflow
### 1. Formulate Your Query
- Use specific, technical terms (e.g., "convolutional neural network image segmentation" not "AI for pictures")
- Include key authors if known: `author:Bengio`
- Specify arXiv categories for focused results: `cat:cs.CL` (Computation and Language)
- Combine terms for intersection: `"graph neural network" AND "molecular property"`
### 2. Execute the Search
```bash
python3 [SKILLS_DIR]/academic-search/arxiv_search.py "your refined query" --max-papers 10
```
### 3. Review Results
The output includes:
- **Title**: Full paper title
- **Authors**: List of paper authors
- **Published**: Publication date
- **arXiv ID**: Unique identifier (useful for citing)
- **URL**: Direct link to the paper
- **Summary**: Abstract text
### 4. Iterate if Needed
- Too many irrelevant results? Add more specific terms or use category filters
- Too few results? Broaden the query or remove restrictive terms
- Looking for recent work? arXiv sorts by relevance by default
### 5. Save and Synthesize
Save relevant findings to your research workspace for later synthesis:
```
research_workspace/
papers/
topic_findings.md
```
## Output Formats
### Text Format (Default)
```
================================================================================
Title: Attention Is All You Need
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, ...
Published: 2017-06-12
arXiv ID: 1706.03762
URL: https://arxiv.org/abs/1706.03762
--------------------------------------------------------------------------------
Summary: The dominant sequence transduction models are based on complex
recurrent or convolutional neural networks...
================================================================================
```
### JSON Format
```json
{
"query": "transformer attention",
"total_results": 5,
"papers": [
{
"title": "Attention Is All You Need",
"authors": ["Ashish Vaswani", "Noam Shazeer", ...],
"published": "2017-06-12",
"arxiv_id": "1706.03762",
"url": "https://arxiv.org/abs/1706.03762",
"summary": "The dominant sequence transduction models..."
}
]
}
```
### Markdown Format
```markdown
## Attention Is All You Need
**Authors:** Ashish Vaswani, Noam Shazeer, ...
**Published:** 2017-06-12
**arXiv ID:** [1706.03762](https://arxiv.org/abs/1706.03762)
### Abstract
The dominant sequence transduction models are based on complex...
```
## arXiv Category Reference
Common categories for AI/ML research:
| Category | Description |
|----------|-------------|
| `cs.LG` | Machine Learning |
| `cs.AI` | Artificial Intelligence |
| `cs.CL` | Computation and Language (NLP) |
| `cs.CV` | Computer Vision |
| `cs.NE` | Neural and Evolutionary Computing |
| `cs.RO` | Robotics |
| `stat.ML` | Machine Learning (Statistics) |
| `q-bio` | Quantitative Biology |
| `math.OC` | Optimization and Control |
## Best Practices
### Query Construction
1. **Be specific**: "graph attention network node classification" > "graph neural network"
2. **Use quotation marks**: For exact phrases: `"self-supervised learning"`
3. **Combine operators**: `cat:cs.CV AND "object detection" AND 2023`
4. **Include variations**: Search for both "LLM" and "large language model"
### Research Workflow Integration
1. **Start broad, then narrow**: Begin with general queries, refine based on initial results
2. **Track paper IDs**: Save arXiv IDs for citing and revisiting
3. **Check references**: Seminal papers often cite foundational work
4. **Note publication dates**: Preprints may be superseded by updated versions
### Limitations to Consider
- **Preprint status**: Papers may not be peer-reviewed
- **Version updates**: Check for newer versions (v2, v3, etc.)
- **Coverage gaps**: Not all fields are well-represented on arXiv
- **Rate limiting**: Avoid excessive rapid queries
## Dependencies
This skill requires the `arxiv` Python package:
```bash
# Virtual environment (recommended)
.venv/bin/python -m pip install arxiv
# System-wide
python3 -m pip install arxiv
```
The script will detect if the package is missing and display installation instructions.
## Troubleshooting
### "Error: arxiv package not installed"
Install the arxiv package as shown in Dependencies section.
### No results returned
- Try broader search terms
- Remove category restrictions
- Check for typos in technical terms
### Rate limiting errors
- Wait a few seconds between queries
- Reduce `--max-papers` value
### Connection errors
- Check internet connectivity
- arXiv API may have temporary outages
## Integration with Research Workflow
This skill works well with the web-research skill for comprehensive research:
1. **Use academic-search** for foundational/theoretical papers
2. **Use web-research** for current implementations, tutorials, and practical guides
3. **Synthesize** findings from both sources in your research report
## Notes
- arXiv is particularly strong for:
- Computer Science (cs.*)
- Physics (physics.*, hep-*, cond-mat.*)
- Mathematics (math.*)
- Quantitative Biology (q-bio.*)
- Statistics (stat.*)
- Results are sorted by relevance by default
- The arXiv API is free and requires no authentication
- Consider checking cited papers for deeper understandingRelated Skills
gpt-researcher
Run GPT-Researcher multi-agent deep research framework locally using OpenAI GPT-5.2. Replaces ChatGPT Deep Research with local control. Researches 100+ sources in parallel, provides comprehensive citations. Use for Phase 3 industry/technical research or comprehensive synthesis. Takes 6-20 min depending on report type. Supports multiple LLM providers.
deep-research
Web research with Graph-of-Thoughts for fast-changing topics. Use when user requests research, analysis, investigation, or comparison requiring current information. Features hypothesis testing, source triangulation, claim verification, Red Team, self-critique, and gap analysis. Supports Quick/Standard/Deep/Exhaustive tiers. Creative Mode for cross-industry innovation.
brutal-deepresearch
Structured deep research pipeline with confirmation gates and resume support. Generates outline, launches parallel research agents, produces validated JSON results and markdown report.
agent-market-researcher
Expert market researcher specializing in market analysis, consumer insights, and competitive intelligence. Masters market sizing, segmentation, and trend analysis with focus on identifying opportunities and informing strategic business decisions.
agent-data-researcher
Expert data researcher specializing in discovering, collecting, and analyzing diverse data sources. Masters data mining, statistical analysis, and pattern recognition with focus on extracting meaningful insights from complex datasets to support evidence-based decisions.
agency-researcher
Find and qualify real estate agencies in a given suburb
add-search-engine
Integrate a new LLM search provider into Mentha
academic-data-integration
When the user needs to integrate multiple data sources (Canvas API, user memory, file systems) to create comprehensive academic reports. This skill combines course information, assignment details, submission status, and user context to generate actionable insights. Triggers include requests that involve cross-referencing multiple data sources or creating consolidated academic reports from disparate systems.
academic-course-setup-automator
When the user needs to set up multiple academic courses in a learning management system (Canvas/LMS) from structured data sources. This skill automates the entire workflow extracting course schedules from emails/attachments, matching instructors from CSV files, creating courses, enrolling teachers, publishing announcements with class details, uploading syllabi, enabling resource sharing for instructors teaching multiple courses, and publishing all courses. Triggers include course schedule setup, Canvas/LMS administration, academic term preparation, instructor assignment, syllabus distribution, and multi-course management.
academic-benchmark-researcher
When the user requests information about academic benchmarks, datasets, or research papers, particularly in machine learning, deep learning, or logical reasoning domains. This skill enables systematic research of academic benchmarks by searching web sources, downloading and analyzing arXiv papers, extracting key metadata (number of tasks, training availability, difficulty levels), and compiling comparative summaries. It triggers on requests involving dataset comparisons, benchmark analysis, or academic paper research for table creation.
content-research-writer
Assists in writing high-quality content by conducting research, adding citations, improving hooks, iterating on outlines, and providing real-time feedback on each section. Transforms your writing process from solo effort to collaborative partnership.
Automate YouTube Top-Ten Video Creation with OpenAI and Safe Image Search
Integrates OpenAI API for content generation, Bing Image Search API for safe image retrieval, and Pexels API for video footage. Handles authentication via Bearer token, enforces safe search, formats ChatGPT responses into a top-ten list, and includes error handling for API failures.