mineru-pdf-extractor
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Best use case
mineru-pdf-extractor is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Teams using mineru-pdf-extractor should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/mineru-pdf-extractor/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How mineru-pdf-extractor Compares
| Feature / Agent | mineru-pdf-extractor | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Marketing
Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.
Best AI Agents for Marketing
A curated list of the best AI agents and skills for marketing teams focused on SEO, content systems, outreach, and campaign execution.
Best AI Skills for ChatGPT
Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.
SKILL.md Source
# MinerU PDF Extractor
Extract PDF documents to structured Markdown using the MinerU API. Supports formula recognition, table extraction, and OCR.
> **Note**: This is a community skill, not an official MinerU product. You need to obtain your own API key from [MinerU](https://mineru.net/).
---
## 📁 Skill Structure
```
mineru-pdf-extractor/
├── SKILL.md # English documentation
├── SKILL_zh.md # Chinese documentation
├── docs/ # Documentation
│ ├── Local_File_Parsing_Guide.md # Local PDF parsing detailed guide (English)
│ ├── Online_URL_Parsing_Guide.md # Online PDF parsing detailed guide (English)
│ ├── MinerU_本地文档解析完整流程.md # Local parsing complete guide (Chinese)
│ └── MinerU_在线文档解析完整流程.md # Online parsing complete guide (Chinese)
└── scripts/ # Executable scripts
├── local_file_step1_apply_upload_url.sh # Local parsing Step 1
├── local_file_step2_upload_file.sh # Local parsing Step 2
├── local_file_step3_poll_result.sh # Local parsing Step 3
├── local_file_step4_download.sh # Local parsing Step 4
├── online_file_step1_submit_task.sh # Online parsing Step 1
└── online_file_step2_poll_result.sh # Online parsing Step 2
```
---
## 🔧 Requirements
### Required Environment Variables
Scripts automatically read MinerU Token from environment variables (choose one):
```bash
# Option 1: Set MINERU_TOKEN
export MINERU_TOKEN="your_api_token_here"
# Option 2: Set MINERU_API_KEY
export MINERU_API_KEY="your_api_token_here"
```
### Required Command-Line Tools
- `curl` - For HTTP requests (usually pre-installed)
- `unzip` - For extracting results (usually pre-installed)
### Optional Tools
- `jq` - For enhanced JSON parsing and security (recommended but not required)
- If not installed, scripts will use fallback methods
- Install: `apt-get install jq` (Debian/Ubuntu) or `brew install jq` (macOS)
### Optional Configuration
```bash
# Set API base URL (default is pre-configured)
export MINERU_BASE_URL="https://mineru.net/api/v4"
```
> 💡 **Get Token**: Visit https://mineru.net/apiManage/docs to register and obtain an API Key
---
## 📄 Feature 1: Parse Local PDF Documents
For locally stored PDF files. Requires 4 steps.
### Quick Start
```bash
cd scripts/
# Step 1: Apply for upload URL
./local_file_step1_apply_upload_url.sh /path/to/your.pdf
# Output: BATCH_ID=xxx UPLOAD_URL=xxx
# Step 2: Upload file
./local_file_step2_upload_file.sh "$UPLOAD_URL" /path/to/your.pdf
# Step 3: Poll for results
./local_file_step3_poll_result.sh "$BATCH_ID"
# Output: FULL_ZIP_URL=xxx
# Step 4: Download results
./local_file_step4_download.sh "$FULL_ZIP_URL" result.zip extracted/
```
### Script Descriptions
#### local_file_step1_apply_upload_url.sh
Apply for upload URL and batch_id.
**Usage:**
```bash
./local_file_step1_apply_upload_url.sh <pdf_file_path> [language] [layout_model]
```
**Parameters:**
- `language`: `ch` (Chinese), `en` (English), `auto` (auto-detect), default `ch`
- `layout_model`: `doclayout_yolo` (fast), `layoutlmv3` (accurate), default `doclayout_yolo`
**Output:**
```
BATCH_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
UPLOAD_URL=https://mineru.oss-cn-shanghai.aliyuncs.com/...
```
---
#### local_file_step2_upload_file.sh
Upload PDF file to the presigned URL.
**Usage:**
```bash
./local_file_step2_upload_file.sh <upload_url> <pdf_file_path>
```
---
#### local_file_step3_poll_result.sh
Poll extraction results until completion or failure.
**Usage:**
```bash
./local_file_step3_poll_result.sh <batch_id> [max_retries] [retry_interval_seconds]
```
**Output:**
```
FULL_ZIP_URL=https://cdn-mineru.openxlab.org.cn/pdf/.../xxx.zip
```
---
#### local_file_step4_download.sh
Download result ZIP and extract.
**Usage:**
```bash
./local_file_step4_download.sh <zip_url> [output_zip_filename] [extract_directory_name]
```
**Output Structure:**
```
extracted/
├── full.md # 📄 Markdown document (main result)
├── images/ # 🖼️ Extracted images
├── content_list.json # Structured content
└── layout.json # Layout analysis data
```
### Detailed Documentation
📚 **Complete Guide**: See `docs/Local_File_Parsing_Guide.md`
---
## 🌐 Feature 2: Parse Online PDF Documents (URL Method)
For PDF files already available online (e.g., arXiv, websites). Only 2 steps, more concise and efficient.
### Quick Start
```bash
cd scripts/
# Step 1: Submit parsing task (provide URL directly)
./online_file_step1_submit_task.sh "https://arxiv.org/pdf/2410.17247.pdf"
# Output: TASK_ID=xxx
# Step 2: Poll results and auto-download/extract
./online_file_step2_poll_result.sh "$TASK_ID" extracted/
```
### Script Descriptions
#### online_file_step1_submit_task.sh
Submit parsing task for online PDF.
**Usage:**
```bash
./online_file_step1_submit_task.sh <pdf_url> [language] [layout_model]
```
**Parameters:**
- `pdf_url`: Complete URL of the online PDF (required)
- `language`: `ch` (Chinese), `en` (English), `auto` (auto-detect), default `ch`
- `layout_model`: `doclayout_yolo` (fast), `layoutlmv3` (accurate), default `doclayout_yolo`
**Output:**
```
TASK_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```
---
#### online_file_step2_poll_result.sh
Poll extraction results, automatically download and extract when complete.
**Usage:**
```bash
./online_file_step2_poll_result.sh <task_id> [output_directory] [max_retries] [retry_interval_seconds]
```
**Output Structure:**
```
extracted/
├── full.md # 📄 Markdown document (main result)
├── images/ # 🖼️ Extracted images
├── content_list.json # Structured content
└── layout.json # Layout analysis data
```
### Detailed Documentation
📚 **Complete Guide**: See `docs/Online_URL_Parsing_Guide.md`
---
## 📊 Comparison of Two Parsing Methods
| Feature | **Local PDF Parsing** | **Online PDF Parsing** |
|---------|----------------------|------------------------|
| **Steps** | 4 steps | 2 steps |
| **Upload Required** | ✅ Yes | ❌ No |
| **Average Time** | 30-60 seconds | 10-20 seconds |
| **Use Case** | Local files | Files already online (arXiv, websites, etc.) |
| **File Size Limit** | 200MB | Limited by source server |
---
## ⚙️ Advanced Usage
### Batch Process Local Files
```bash
for pdf in /path/to/pdfs/*.pdf; do
echo "Processing: $pdf"
# Step 1
result=$(./local_file_step1_apply_upload_url.sh "$pdf" 2>&1)
batch_id=$(echo "$result" | grep BATCH_ID | cut -d= -f2)
upload_url=$(echo "$result" | grep UPLOAD_URL | cut -d= -f2)
# Step 2
./local_file_step2_upload_file.sh "$upload_url" "$pdf"
# Step 3
zip_url=$(./local_file_step3_poll_result.sh "$batch_id" | grep FULL_ZIP_URL | cut -d= -f2)
# Step 4
filename=$(basename "$pdf" .pdf)
./local_file_step4_download.sh "$zip_url" "${filename}.zip" "${filename}_extracted"
done
```
### Batch Process Online Files
```bash
for url in \
"https://arxiv.org/pdf/2410.17247.pdf" \
"https://arxiv.org/pdf/2409.12345.pdf"; do
echo "Processing: $url"
# Step 1
result=$(./online_file_step1_submit_task.sh "$url" 2>&1)
task_id=$(echo "$result" | grep TASK_ID | cut -d= -f2)
# Step 2
filename=$(basename "$url" .pdf)
./online_file_step2_poll_result.sh "$task_id" "${filename}_extracted"
done
```
---
## ⚠️ Notes
1. **Token Configuration**: Scripts prioritize `MINERU_TOKEN`, fall back to `MINERU_API_KEY` if not found
2. **Token Security**: Do not hard-code tokens in scripts; use environment variables
3. **URL Accessibility**: For online parsing, ensure the provided URL is publicly accessible
4. **File Limits**: Single file recommended not exceeding 200MB, maximum 600 pages
5. **Network Stability**: Ensure stable network when uploading large files
6. **Security**: This skill includes input validation and sanitization to prevent JSON injection and directory traversal attacks
7. **Optional jq**: Installing `jq` provides enhanced JSON parsing and additional security checks
---
## 📚 Reference Documentation
| Document | Description |
|----------|-------------|
| `docs/Local_File_Parsing_Guide.md` | Detailed curl commands and parameters for local PDF parsing |
| `docs/Online_URL_Parsing_Guide.md` | Detailed curl commands and parameters for online PDF parsing |
External Resources:
- 🏠 **MinerU Official**: https://mineru.net/
- 📖 **API Documentation**: https://mineru.net/apiManage/docs
- 💻 **GitHub Repository**: https://github.com/opendatalab/MinerU
---
*Skill Version: 1.0.0*
*Release Date: 2026-02-18*
*Community Skill - Not affiliated with MinerU official*Related Skills
recipe-video-extractor
Extract a structured cooking recipe from a shared video URL when the user sends `recipe <url>`. Prioritize caption/description and comments via browser automation, then use web search/fetch as fallback with clear source attribution.
pdf-process-mineru
PDF document parsing tool based on local MinerU, supports converting PDF to Markdown, JSON, and other machine-readable formats.
invoice-extractor
Extract invoice information from images and PDF files using Baidu OCR API, export to Excel. Supports single file, multiple files, or entire directory processing. Use when the user mentions invoices, invoice recognition, extracting invoice data, processing receipts, converting invoices to Excel, or batch processing invoice files.
methodology-extractor
Batch extraction of experimental methods from multiple papers for protocol.
clinical-data-extractor
Extract clinical trial data from pharmaceutical conference websites or PDF documents. Use when user provides a URL or PDF file containing innovative drug clinical trial data and needs structured extraction of: drug name, manufacturer, indication, clinical phase, trial name, conference, efficacy and safety data (presented as tables), and markdown output to "药品名称@适应症.md" file.
terabox-link-extractor
Direct link extraction from TeraBox URLs using the XAPIverse protocol. Extracts high-speed download and stream links (All Resolutions) without browser session requirements. Use when the user provides a TeraBox link and wants to download or stream content directly.
pdf-figure-extractor
从PDF论文中精确提取Figure图片,自动分析PDF结构、定位caption位置、裁剪干净图形,并验证图片质量。支持学术新闻稿、论文写作等场景的自动化图片处理。
Web Content Extractor - 网页内容提取器
**版本**: 2.0
---
name: article-factory-wechat
humanizer
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.
find-skills
Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.
tavily-search
Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the web. Requires Tavily API key.