document-parser

A high-accuracy document parsing skill that extracts structured data from PDFs, images, and Word documents.

3,891 stars

by openclaw

Best use case

document-parser is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using document-parser should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/document-parser/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/ankylala/document-parser/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/document-parser/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How document-parser Compares

Feature / Agent	document-parser	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It is a high-accuracy document parsing skill that extracts structured data from PDFs, images, and Word documents.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# document-parser

A high-accuracy document parsing skill that extracts structured data from PDFs, images, and Word documents.

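## Purpose
- Parse PDFs, images (JPG/PNG), and Word documents
- Layout analysis and structure extraction
- Table recognition (output as HTML/Markdown)
- OCR text recognition
- Seal (stamp) detection
- Table-of-contents extraction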

## Commands

### Parse a document
```
document-parser parse <file_path> [options]
```

Examples:
```
document-parser parse C:\docs\report.pdf
document-parser parse C:\docs\scan.jpg --layout --table
document-parser parse C:\docs\contract.docx --output markdown
```
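
When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below is only an illustration: `build_parse_command` is a hypothetical helper, not part of the skill. It uses only the flags documented in this file and assumes a `document-parser` executable is available on the PATH.

```python
from typing import List, Optional


def build_parse_command(
    file_path: str,
    layout: bool = False,
    table: bool = False,
    seal: bool = False,
    output: Optional[str] = None,
    pages: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `document-parser parse` (hypothetical helper)."""
    # Only the three output formats listed in the parameter table are accepted.
    if output is not None and output not in ("json", "markdown", "both"):
        raise ValueError(f"unsupported output format: {output!r}")

    cmd = ["document-parser", "parse", file_path]
    if layout:
        cmd.append("--layout")
    if table:
        cmd.append("--table")
    if seal:
        cmd.append("--seal")
    if output is not None:
        cmd += ["--output", output]
    if pages is not None:
        cmd += ["--pages", pages]
    return cmd


# Mirrors the second example above; pass the list to subprocess.run() to execute it.
print(build_parse_command(r"C:\docs\scan.jpg", layout=True, table=True))
```

Returning a list instead of a single string avoids shell quoting problems with paths that contain spaces.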

### Check task status
```
document-parser status <task_id>
```

## Parameters

| Parameter | Description | Example |
|------|------|------|
| file path | Path to a PDF, image, or Word file | `C:\docs\report.pdf` |
| --layout | Enable layout analysis | `--layout` |
| --table | Enable table recognition | `--table` |
| --seal | Enable seal (stamp) detection | `--seal` |
| --output | Output format (json/markdown/both) | `--output markdown` |
| --pages | Page range | `--pages 1-5,8,10-12` |
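
To make the `--pages` syntax concrete, here is a minimal sketch of how a spec such as `1-5,8,10-12` could be expanded into individual page numbers. It assumes ranges are inclusive and pages are 1-based, which this file does not state explicitly. `expand_page_ranges` is a hypothetical name, not a function the skill provides.

```python
from typing import List


def expand_page_ranges(spec: str) -> List[int]:
    """Expand a spec like '1-5,8,10-12' into page numbers (assumed inclusive, 1-based)."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(expand_page_ranges("1-5,8,10-12"))
# → [1, 2, 3, 4, 5, 8, 10, 11, 12]
```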

## Configuration

### Option 1: Environment variables
```
DOCUMENT_PARSER_API_KEY=your_api_key
DOCUMENT_PARSER_BASE_URL=http://47.111.146.164:8088/taidp/v1/idp/general_parse
```

### Option 2: Configuration file
Create `config.json` in the skill directory:
```json
{
  "api_key": "your_api_key",
  "base_url": "http://47.111.146.164:8088/taidp/v1/idp/general_parse"
}
```
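
The two configuration options can be combined in a small loader. The following is a sketch under stated assumptions, not the skill's actual loading logic: it assumes environment variables take precedence over `config.json`, which this file does not specify. `load_config` is a hypothetical helper.

```python
import json
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def load_config(
    config_path: str = "config.json",
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Resolve api_key and base_url from the environment, then from config.json."""
    env = os.environ if env is None else env

    # Read the file only if it exists; otherwise rely on the environment alone.
    file_cfg: Dict[str, str] = {}
    path = Path(config_path)
    if path.is_file():
        file_cfg = json.loads(path.read_text(encoding="utf-8"))

    # Assumption: an environment variable overrides the matching file entry.
    api_key = env.get("DOCUMENT_PARSER_API_KEY") or file_cfg.get("api_key")
    base_url = env.get("DOCUMENT_PARSER_BASE_URL") or file_cfg.get("base_url")
    if not api_key or not base_url:
        raise RuntimeError("api_key and base_url must both be configured")
    return {"api_key": api_key, "base_url": base_url}
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.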

## Output format

Returns structured JSON containing:
- **pages**: array of parsed pages
- **elements**: layout elements (text, tables, images, etc.)
- **markdown**: text in Markdown format
- **data**: summary statistics
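
As an illustration of consuming this output, the sketch below summarizes a result dictionary. Only the four top-level keys come from this file. The sample payload is invented, and the assumption that each element carries a `type` field is not confirmed by the source, so treat the shape as hypothetical.

```python
from collections import Counter
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a parse result; the per-element 'type' key is an assumption."""
    elements = result.get("elements") or []
    type_counts = Counter(el.get("type", "unknown") for el in elements)
    return {
        "page_count": len(result.get("pages") or []),
        "element_types": dict(type_counts),
        "markdown_chars": len(result.get("markdown") or ""),
    }


# Invented sample payload, for illustration only.
sample = {
    "pages": [{}, {}],
    "elements": [{"type": "text"}, {"type": "table"}, {"type": "text"}],
    "markdown": "# Title\n",
    "data": {},
}
print(summarize_result(sample))
# → {'page_count': 2, 'element_types': {'text': 2, 'table': 1}, 'markdown_chars': 8}
```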

## Dependencies
- requests
- python-docx (Word support)
- Pillow (image processing)

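## Error codes

| Error code | Message | Description |
|--------|------|------|
| 10000 | Success | Recognition succeeded |
| 10001 | Missing parameter | A required parameter is missing |
| 10002 | Invalid parameter | A parameter is invalid |
| 10003 | Invalid file | The file format is invalid |
| 10004 | Failed to recognize | Recognition failed |
| 10005 | Internal error | Internal error |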
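
A caller might turn these codes into exceptions. The sketch below copies the codes and messages from the table above. `DocumentParserError` and `check_response_code` are hypothetical names, and how the code appears in the service response is not documented here, so the function takes the integer directly.

```python
from typing import Dict

# Codes and messages copied from the error code table.
ERROR_MESSAGES: Dict[int, str] = {
    10000: "Success",
    10001: "Missing parameter",
    10002: "Invalid parameter",
    10003: "Invalid file",
    10004: "Failed to recognize",
    10005: "Internal error",
}


class DocumentParserError(Exception):
    """Raised for any non-success code (hypothetical exception type)."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(f"[{code}] {message}")
        self.code = code


def check_response_code(code: int) -> None:
    """Return silently on 10000; raise DocumentParserError for anything else."""
    if code == 10000:
        return
    raise DocumentParserError(code, ERROR_MESSAGES.get(code, "Unknown error code"))
```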
