arxiv-batch-reporting
Batch search and report generation from arXiv preprint repository
Best use case
arxiv-batch-reporting is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Batch search and report generation from arXiv preprint repository
Teams using arxiv-batch-reporting should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/arxiv-batch-reporting/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How arxiv-batch-reporting Compares
| Feature / Agent | arxiv-batch-reporting | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Batch search and report generation from arXiv preprint repository
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# arXiv Batch Reporting
## Overview
Keeping up with the flood of new preprints on arXiv is one of the most persistent challenges in fast-moving fields like machine learning, physics, mathematics, and computer science. The arXiv Batch Reporting skill provides a systematic approach to searching, filtering, and generating structured reports from arXiv at scale.
Unlike ad-hoc manual searches, this skill enables researchers to define persistent query profiles, run batch searches across date ranges, and produce formatted reports that highlight the most relevant papers. It is particularly useful for weekly or monthly literature surveillance, lab meeting preparation, and trend analysis across subfields.
The skill leverages the arXiv API and supports advanced query syntax, category filtering, and result ranking by relevance or recency. Reports can be generated in Markdown, HTML, or CSV formats for integration into existing workflows.
## Setting Up Batch Queries
### Query Profile Definition
Define your search profiles as structured configurations. Each profile specifies the search terms, category filters, date range, and output preferences:
```yaml
profile_name: "transformer-architectures-weekly"
queries:
- "ti:transformer AND abs:attention mechanism"
- "ti:vision transformer"
- "abs:efficient transformer AND cat:cs.LG"
categories:
- cs.LG
- cs.CL
- cs.CV
date_range: "last_7_days"
max_results: 100
sort_by: "submittedDate"
sort_order: "descending"
```
### arXiv API Query Syntax
The arXiv API supports field-specific searches:
- `ti:` — Search in title
- `abs:` — Search in abstract
- `au:` — Search by author
- `cat:` — Filter by category (e.g., `cs.AI`, `math.PR`, `physics.comp-ph`)
- Boolean operators: `AND`, `OR`, `ANDNOT`
- Group with parentheses for complex queries
**Example queries:**
- Find recent GAN papers in computer vision: `abs:generative adversarial AND cat:cs.CV`
- Find a specific author's work: `au:bengio AND ti:deep learning`
- Exclude survey papers: `abs:reinforcement learning ANDNOT ti:survey`
### Rate Limiting and Pagination
The arXiv API enforces rate limits. Follow these guidelines:
- Wait at least 3 seconds between API requests
- Use pagination with `start` and `max_results` parameters (max 2000 per request)
- For large batch jobs, implement exponential backoff on HTTP 503 responses
- Cache results locally to avoid redundant API calls
## Report Generation
### Standard Report Template
After collecting batch results, generate a report with the following structure:
```markdown
# arXiv Batch Report: [Profile Name]
**Date range:** [start] to [end]
**Total results:** [N] papers
**Generated:** [timestamp]
## Highlights (Top 10 by Relevance)
| # | Title | Authors | Category | Date |
|---|-------|---------|----------|------|
| 1 | [Title](arxiv-link) | First Author et al. | cs.LG | 2026-03-08 |
## Category Breakdown
- cs.LG: 45 papers
- cs.CL: 23 papers
- cs.CV: 18 papers
## Keyword Frequency
- "transformer": 38 mentions
- "attention": 29 mentions
- "efficient": 15 mentions
## Full Results
[Expandable table with all papers]
```
### Filtering and Ranking
After retrieving raw results, apply post-processing filters to surface the most relevant papers:
1. **Relevance scoring**: Score each paper based on keyword density in the title and abstract relative to your query terms.
2. **Author filtering**: Boost papers from authors on your watch list (key researchers in your field).
3. **Citation proxy**: Papers that appear in multiple query results likely sit at the intersection of your interests—rank them higher.
4. **Novelty detection**: Flag papers whose abstracts contain terms not seen in your previous reports, indicating potentially new directions.
## Automation and Scheduling
For ongoing literature surveillance, automate your batch reports:
- **Cron scheduling**: Run batch queries weekly (e.g., every Monday at 8 AM) using a scheduled task or CI pipeline.
- **Diff reports**: Compare the current week's results against the previous week to highlight only new papers.
- **Alert thresholds**: Set alerts when a report contains more than N papers matching a high-priority query, indicating a burst of activity in that area.
- **Email or Slack delivery**: Route generated reports to your inbox or lab Slack channel for team-wide awareness.
Store all generated reports in a versioned directory structure for longitudinal trend analysis:
```
reports/
transformer-architectures-weekly/
2026-03-03.md
2026-03-10.md
...
```
## References
- arXiv API documentation: https://info.arxiv.org/help/api/index.html
- arXiv category taxonomy: https://arxiv.org/category_taxonomy
- arXiv Batch Search: wentor-research-pluginsRelated Skills
arxiv-paper-processor
Process and analyze arXiv papers systematically for research workflows
arxiv-cli-tools
Command-line tools for searching and batch-downloading arXiv papers
arxiv-api
Search and retrieve preprints from the arXiv open-access repository
arxiv-latex-source
Download and parse LaTeX source files from arXiv preprints
zotero-arxiv-daily-guide
Guide to Zotero arXiv Daily for personalized daily paper recommendations
arxiv-preprint-template
LaTeX template for arXiv preprints in NIPS/NeurIPS style format
thuthesis-guide
Write Tsinghua University theses using the ThuThesis LaTeX template
thesis-writing-guide
Templates, formatting rules, and strategies for thesis and dissertation writing
thesis-template-guide
Set up LaTeX templates for PhD and Master's thesis documents
sjtuthesis-guide
Write SJTU theses using the SJTUThesis LaTeX template with full compliance
scientific-article-pdf
Generate publication-ready scientific article PDFs from templates
novathesis-guide
LaTeX thesis template supporting multiple universities and formats