academic-benchmark-researcher

Use this skill when the user requests information about academic benchmarks, datasets, or research papers, particularly in machine learning, deep learning, or logical reasoning. It enables systematic research of academic benchmarks by searching web sources, downloading and analyzing arXiv papers, extracting key metadata (number of tasks, training data availability, difficulty levels), and compiling comparative summaries. It triggers on requests involving dataset comparisons, benchmark analysis, or academic paper research for table creation.

181 stars

Best use case

academic-benchmark-researcher is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using academic-benchmark-researcher should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/academic-benchmark-researcher/SKILL.md --create-dirs "https://raw.githubusercontent.com/majiayu000/claude-skill-registry/main/skills/data/academic-benchmark-researcher/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/academic-benchmark-researcher/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How academic-benchmark-researcher Compares

| Feature / Agent | academic-benchmark-researcher | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It researches academic benchmarks, datasets, and research papers on request: it searches web sources, downloads and analyzes arXiv papers, extracts key metadata (number of tasks, training data availability, difficulty levels), and compiles the findings into a comparative summary, typically a table.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Instructions

## Primary Objective
Systematically research academic benchmarks, datasets, or research papers to extract and compile comparative information (e.g., into a summary table). The core workflow involves: 1) Identifying relevant sources, 2) Extracting key metadata, 3) Synthesizing findings into a structured output (like a LaTeX table).

## Core Workflow
1.  **Clarify & Parse Request:** Identify the specific benchmarks/datasets/papers mentioned by the user. Note any required output format (e.g., LaTeX table with specific columns) and constraints (e.g., "no commented lines").
2.  **Initial Information Gathering:** For each identified entity (dataset/paper):
    *   Use `local-web_search` to find general information, official pages (GitHub, project sites), and relevant arXiv IDs.
    *   For arXiv papers, use `arxiv_local-download_paper` or `fetch-fetch_markdown` to obtain the paper content.
    *   Search for specific attributes requested by the user (e.g., "number of tasks," "training set," "difficulty levels").
3.  **Deep Dive & Verification:** Read paper abstracts, introductions, and methodology sections (using `arxiv_local-read_paper` or parsed markdown) to confirm key details. Cross-reference information from multiple sources (official site, paper, blog posts) for accuracy.
4.  **Information Synthesis:** Compile the extracted metadata into a structured format aligned with the user's request. Resolve any ambiguities (e.g., whether a "task" count refers to broad categories or individual instances) using the most authoritative source, typically the original paper.
5.  **Output Generation:** Create the final deliverable (e.g., a `.tex` file). Ensure it strictly adheres to the user's formatting specifications. Optionally, provide a concise textual summary of the findings.
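
The tie-breaking rule in step 4 can be sketched in Python. This is only an illustration under stated assumptions, not part of the skill: the `Claim` type, the `SOURCE_PRIORITY` ranking, and the `resolve` function are hypothetical names, and the ranking simply encodes "original paper first" from the step above.

```python
from dataclasses import dataclass

# Hypothetical ranking (an assumption): lower number = more authoritative.
SOURCE_PRIORITY = {"paper": 0, "official_site": 1, "blog": 2}


@dataclass(frozen=True)
class Claim:
    attribute: str  # e.g. "num_tasks"
    value: str      # the value as reported by the source
    source: str     # a key of SOURCE_PRIORITY


def resolve(claims):
    """Pick, per attribute, the claim from the most authoritative source.

    Returns {attribute: (value, source, conflicted)}, where `conflicted`
    is True when the sources disagreed and the summary should say so.
    """
    best = {}    # attribute -> (rank, claim)
    values = {}  # attribute -> set of all reported values
    for claim in claims:
        values.setdefault(claim.attribute, set()).add(claim.value)
        # Unknown source types rank below every known one.
        rank = SOURCE_PRIORITY.get(claim.source, len(SOURCE_PRIORITY))
        current = best.get(claim.attribute)
        if current is None or rank < current[0]:
            best[claim.attribute] = (rank, claim)
    return {
        attr: (claim.value, claim.source, len(values[attr]) > 1)
        for attr, (_, claim) in best.items()
    }
```

A conflicted attribute is still resolved in favor of the paper, but the flag gives the final summary a reason to mention the disagreement.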

## Key Metadata to Extract
When researching a benchmark/dataset, prioritize finding:
*   **Full Name & Acronym**
*   **Number of Tasks/Categories:** Distinguish between broad task categories and individual task instances.
*   **Training Data Availability:** Does it include a dedicated training set, or is it for evaluation only?
*   **Difficulty Levels:** Does it feature adjustable or tiered difficulty levels?
*   **Core Purpose/Description**
*   **Primary Source (arXiv ID, GitHub repo)**
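
One way to keep these fields consistent across benchmarks is a small record type. The sketch below is an assumption about how the notes might be structured: the class name `BenchmarkRecord` and its field names are hypothetical and not defined by the skill. `None` means "not stated in the sources", which is kept distinct from `False`.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BenchmarkRecord:
    """Hypothetical container for the metadata listed above."""
    name: str
    acronym: Optional[str] = None
    task_categories: Optional[int] = None        # broad categories
    task_instances: Optional[int] = None         # individual tasks
    has_training_set: Optional[bool] = None      # None = not stated
    has_difficulty_levels: Optional[bool] = None
    description: str = ""
    arxiv_id: Optional[str] = None
    repo_url: Optional[str] = None

    def missing_fields(self):
        """Names of fields still unknown, i.e. what remains to be searched."""
        return [k for k, v in asdict(self).items() if v is None or v == ""]
```

Calling `missing_fields()` after each research pass gives a checklist of attributes that still need a source.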

## Tool Usage Guidelines
*   `local-web_search`: Use for initial discovery and finding high-level descriptions. Employ specific queries combining the dataset name and target attributes (e.g., "BBH training set few-shot examples").
*   `arxiv_local-download_paper` / `fetch-fetch_markdown`: Use to access the canonical source for detailed information. Prefer `arxiv_local-download_paper` for full text analysis when needed.
*   `filesystem-write_file` / `filesystem-read_file`: Use for creating and verifying final output files in the workspace.
*   `local-claim_done`: Use only after successfully delivering the requested output and providing a final summary.
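
The query pattern and the hand-off from search results to arXiv IDs can be sketched as follows. The helper names `build_queries` and `extract_arxiv_ids` are hypothetical and not part of any tool above, and the regular expression is an assumption that covers only new-style identifiers (`YYMM.NNNNN`, optionally with a version suffix), not the older `archive/NNNNNNN` form.

```python
import re

# Matches new-style arXiv IDs inside abs/ or pdf/ URLs; the version suffix is dropped.
ARXIV_ID = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)


def build_queries(dataset, attributes):
    """Combine the dataset name with each target attribute, one query per attribute."""
    return [f"{dataset} {attr}" for attr in attributes]


def extract_arxiv_ids(text):
    """Return the distinct arXiv IDs found in `text`, in order of first appearance."""
    found = []
    for match in ARXIV_ID.finditer(text):
        if match.group(1) not in found:
            found.append(match.group(1))
    return found
```

For example, `build_queries("BBH", ["training set", "few-shot examples"])` yields one specific query per attribute, each of which could be passed to `local-web_search`.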

## Output Standards
*   **LaTeX Tables:** Ensure the output contains only the specified table content, without extra comments, document headers, or unrelated text.
*   **Summaries:** Be concise but complete, highlighting the sourced information for each dataset.
*   **Accuracy:** Base conclusions on the original paper or official project documentation where possible. Acknowledge if information is not explicitly stated.
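
As an illustration of the LaTeX standard above, the sketch below renders one table row and checks the result for stray comments. The function names and the four-column layout are assumptions chosen for the example. `\ding{51}` and `\ding{55}` are the check and cross marks from the `pifont` package, which the enclosing document would need to load.

```python
import re

CHECK, CROSS = r"\ding{51}", r"\ding{55}"  # pifont check mark and cross mark


def latex_escape(text):
    """Escape the characters most likely to break a table cell: & % _ #."""
    return re.sub(r"([&%_#])", r"\\\1", text)


def mark(flag):
    """Map a boolean to the corresponding pifont symbol."""
    return CHECK if flag else CROSS


def render_row(name, num_tasks, has_train, has_levels):
    """Render one row; the column layout is an assumption for this example."""
    return f"{latex_escape(name)} & {num_tasks} & {mark(has_train)} & {mark(has_levels)} \\\\"


def has_comment(tex):
    """True if any line contains an unescaped %, i.e. a LaTeX comment.

    Simplification: a % preceded by an escaped backslash (\\\\%) is not detected.
    """
    return any(re.search(r"(?<!\\)%", line) for line in tex.splitlines())
```

Running `has_comment` on the final `.tex` content before writing it is a cheap guard for a "no commented lines" constraint.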

## Common Pitfalls & Resolutions
*   **Ambiguous Task Counts:** If a paper mentions "5 task categories" (like KOR-Bench), report that as the task count unless the user specifies otherwise. Clarify in the summary if needed.
*   **Missing Information:** If a key attribute (e.g., training set) is not mentioned in primary sources, infer it from the benchmark type (e.g., many evaluation benchmarks lack training sets) and mark it with `\ding{55}`, the cross mark from the LaTeX `pifont` package. State the assumption in your summary.
*   **arXiv Paper Processing:** If `arxiv_local-download_paper` returns a "converting" status, use `fetch-fetch_markdown` on the arXiv abstract page as a reliable fallback to get the paper's metadata and abstract.
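
The fallback in the last item can be sketched like this. The agent tools are passed in as plain callables, and the shape of the download result (a dict with `status` and `text` keys) is an assumption made for illustration, since the actual return format of `arxiv_local-download_paper` is not specified here. Only the `https://arxiv.org/abs/<id>` URL pattern is taken as given.

```python
def arxiv_abs_url(arxiv_id):
    """Abstract-page URL for an arXiv identifier."""
    return f"https://arxiv.org/abs/{arxiv_id}"


def get_paper_text(arxiv_id, download, fetch_markdown):
    """Try the full-text download first, then fall back to the abstract page.

    `download` and `fetch_markdown` stand in for the agent tools; the dict
    shape returned by `download` is hypothetical.
    """
    result = download(arxiv_id)
    if result.get("status") == "converting":
        # Full text is not ready yet: use the abstract page for metadata and abstract.
        return {"source": "abstract_page", "text": fetch_markdown(arxiv_abs_url(arxiv_id))}
    return {"source": "full_text", "text": result.get("text", "")}
```

Recording which source was used makes it easier to note in the summary when a detail came from the abstract alone rather than the full paper.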

Related Skills

academic-writing-style

181
from majiayu000/claude-skill-registry

Personalized academic writing assistant for university assignments in Chinese and English. Use when users need help writing/revising academic reports, project docs, technical analyses, research reviews, or case studies. Produces natural prose avoiding AI markers. Triggers (also recognized in Chinese): academic writing, assignment, report, technical analysis, research review, case study, project documentation.

academic-writing-standards

181
from majiayu000/claude-skill-registry

Expert knowledge of academic writing standards for peer-reviewed papers, including citation integrity, style compliance, clarity, and scientific writing best practices. Use when reviewing or editing academic manuscripts, papers, or research documentation.

academic-writing-cs

181
from majiayu000/claude-skill-registry

Comprehensive toolkit for writing high-quality computer science research papers (conference, journal, thesis). Provides narrative construction guidance, sentence-level clarity principles (Gopen & Swan), academic phrasebank, CS-specific conventions, and section-by-section quality checklists. Use when assisting with academic paper writing, revision, or structure planning across all stages from drafting to submission.

academic-task-planner

181
from majiayu000/claude-skill-registry

Transform academic course assignment PDFs into structured, actionable markdown checklists with dates, unique IDs, and custom tags. Asks when the user will start, assigns tasks only to weekdays (Monday-Friday), respects weekends automatically, and calculates forum deadlines 3 days before due date. Use this skill when the user uploads academic PDFs or asks to create a task plan from course assignments.

academic-search

181
from majiayu000/claude-skill-registry

Search academic paper repositories (arXiv, Semantic Scholar) for scholarly articles in physics, mathematics, computer science, quantitative biology, AI/ML, and related fields

academic-reviewer

181
from majiayu000/claude-skill-registry

Expert guidance for reviewing academic manuscripts submitted to journals, particularly in political science, economics, and quantitative social sciences. Use when asked to review, critique, or provide feedback on academic papers, research designs, or empirical strategies. Emphasizes methodological rigor, causal identification strategies, and constructive feedback on research design.

Academic Researcher

181
from majiayu000/claude-skill-registry

Academic paper search across 14+ scholarly platforms including arXiv, PubMed, Google Scholar, Web of Science, Semantic Scholar, Sci-Hub, and more. Use for literature review, research discovery, and citation management.

academic-research

181
from majiayu000/claude-skill-registry

Systematic academic literature search with source prioritization and APA 7th edition citations. Use when the user needs to research a topic with scholarly sources, verify claims with academic backing, find peer-reviewed evidence, compile research findings, or generate properly cited reports. Triggers: "research [topic]", "what does the research say about...", "find studies on...", "verify this claim...", "literature review", "academic sources for...", "peer-reviewed evidence", "scholarly articles about...", "evidence-based", "cite sources for...". This skill provides basic APA citation capabilities; for advanced citation work (complex source types, edge cases, batch formatting), consider the `apa-style-citation` skill which offers enhanced citation expertise.

academic-research-writing

181
from majiayu000/claude-skill-registry

Use when writing CS research papers (conference, journal, thesis), reviewing scientific manuscripts, improving academic writing clarity, or preparing IEEE/ACM submissions. Invoke when user mentions paper, manuscript, research writing, journal submission, or needs help with academic structure, formatting, or revision.

academic-research-writer

181
from majiayu000/claude-skill-registry

Write academic research documents following academic guidelines with peer-reviewed sources from Google Scholar and other academic databases. Always verify source credibility and generate IEEE standard references. Use for research papers, literature reviews, technical reports, theses, dissertations, conference papers, and academic proposals requiring proper citations and scholarly rigor.

Academic Paper

181
from majiayu000/claude-skill-registry

## Description

academic-letter-architect

181
from majiayu000/claude-skill-registry

Use when writing recommendation letters, reference letters, or award nominations for students, postdocs, or colleagues. Invoke when user mentions recommendation letter, reference, nomination, letter of support, endorsement, or needs help with strong advocacy, comparative statements, or evidence-based character assessment.