
pdf-processing

Extract text and tables from PDF files, fill forms, and merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.

231 stars

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/pdf-processing/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/0xkynz/pdf-processing/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/pdf-processing/SKILL.md inside your project
  3. Restart your AI agent so that it auto-discovers the skill

How pdf-processing Compares

| Feature / Agent | pdf-processing | Standard Approach |
| --- | --- | --- |
| Platform Support | multi | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Extract text and tables from PDF files, fill forms, and merge documents. Use when working with PDF files or when the user mentions PDFs, forms, or document extraction.

Which AI agents support this skill?

This skill is listed as compatible with multiple agents ("multi"). The installation instructions name Claude Code, Cursor, and Codex.

Where can I find the source code?

The source is hosted on GitHub in the aiskillstore/marketplace repository, under skills/0xkynz/pdf-processing.

SKILL.md Source

# PDF Processing Skill

This skill covers text and table extraction, form handling, and document manipulation for PDF files.

## Quick Start

Use pdfplumber to extract text from PDFs:

```python
import pdfplumber

# Open the PDF and extract the text of the first page (pages are 0-indexed).
with pdfplumber.open("document.pdf") as pdf:
    text = pdf.pages[0].extract_text()

print(text)
```

## Capabilities

### Text Extraction
- Extract text from single or multiple pages
- Preserve layout and formatting
- Handle multi-column documents
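
The text extraction capabilities above could be sketched as follows. This is a minimal illustration, not the skill's own implementation: the helper names (`join_pages`, `column_boxes`, `extract_all_text`, `extract_text_by_column`) are hypothetical, and the multi-column approach assumes equal-width columns, which real documents often violate. It relies on pdfplumber's `extract_text(layout=...)`, `page.bbox`, and `page.crop(...)`.

```python
from typing import Iterable, List, Optional, Tuple

# (x0, top, x1, bottom), the bounding-box order pdfplumber uses
BBox = Tuple[float, float, float, float]


def join_pages(page_texts: Iterable[Optional[str]], separator: str = "\n\n") -> str:
    """Join per-page text, skipping pages that produced no text (None or blank)."""
    return separator.join(t for t in page_texts if t and t.strip())


def column_boxes(page_bbox: BBox, columns: int = 2) -> List[BBox]:
    """Split a page's bounding box into equal-width strips, left to right.

    Assumption: columns are equally wide. Adjust the boxes by hand otherwise.
    """
    if columns < 1:
        raise ValueError("columns must be at least 1")
    x0, top, x1, bottom = page_bbox
    step = (x1 - x0) / columns
    # min(...) keeps the last strip inside the page despite float rounding
    return [
        (x0 + i * step, top, min(x0 + (i + 1) * step, x1), bottom)
        for i in range(columns)
    ]


def extract_all_text(path: str, keep_layout: bool = False) -> str:
    """Extract text from every page of the PDF at `path`."""
    import pdfplumber  # third-party; imported lazily so the helpers above need no install

    with pdfplumber.open(path) as pdf:
        # layout=True asks pdfplumber to approximate the visual layout with whitespace
        texts = [page.extract_text(layout=keep_layout) for page in pdf.pages]
    return join_pages(texts)


def extract_text_by_column(path: str, columns: int = 2) -> str:
    """Read each page strip by strip so columns are not interleaved line by line."""
    import pdfplumber

    page_texts = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            strips = [
                page.crop(box).extract_text()
                for box in column_boxes(page.bbox, columns)
            ]
            page_texts.append(join_pages(strips, separator="\n"))
    return join_pages(page_texts)
```

Calling `extract_text_by_column("document.pdf", columns=2)` would read the left half of each page before the right half, which is usually the intended reading order for a two-column layout.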

### Table Extraction
- Identify and extract tables
- Convert to structured data (CSV, JSON)
- Handle complex table layouts
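
A possible sketch of the table workflow above: pdfplumber's `page.extract_tables()` returns each table as a list of rows, where a row is a list of cell strings or `None`. The conversion helpers (`table_to_csv`, `table_to_json`, `extract_tables`) are hypothetical names, and treating the first row as the header is an assumption that will not hold for every table.

```python
import csv
import io
import json
from typing import List, Optional, Sequence

Row = Sequence[Optional[str]]


def table_to_csv(rows: Sequence[Row]) -> str:
    """Serialize one table to CSV text, writing empty strings for missing cells."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def table_to_json(rows: Sequence[Row]) -> str:
    """Serialize one table to a JSON array of objects.

    Assumption: the first row is the header. Blank header cells get a
    positional name such as "column_2".
    """
    if not rows:
        return "[]"
    header, *body = rows
    keys = [name or f"column_{i}" for i, name in enumerate(header)]
    return json.dumps([dict(zip(keys, row)) for row in body])


def extract_tables(path: str) -> List[List[List[Optional[str]]]]:
    """Collect every table pdfplumber detects, across all pages."""
    import pdfplumber  # third-party; imported lazily so the converters need no install

    tables = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables())
    return tables
```

For complex layouts, pdfplumber's table detection accepts a `table_settings` dictionary; tuning it is document-specific and is left out of this sketch.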

### Form Operations
- Fill PDF forms programmatically
- Extract form field values
- Create fillable forms
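
Reading and filling form fields might look like the sketch below, which uses PyPDF2 because that is the library this skill names. The function names (`clean_field_values`, `read_form_fields`, `fill_form`) are hypothetical. Form handling differs between PyPDF2 and its successor pypdf, so treat the page-copying approach here as an assumption to verify against the installed version. Creating new fillable forms is not covered.

```python
from typing import Any, Dict, Mapping


def clean_field_values(values: Mapping[str, Any]) -> Dict[str, str]:
    """Drop None values and convert the rest to strings before writing them."""
    return {name: str(value) for name, value in values.items() if value is not None}


def read_form_fields(path: str) -> Dict[str, Any]:
    """Return a mapping of field name to current value (None if unset)."""
    from PyPDF2 import PdfReader  # third-party; imported lazily

    fields = PdfReader(path).get_fields() or {}
    # "/V" is the PDF dictionary key that stores a field's value
    return {name: field.get("/V") for name, field in fields.items()}


def fill_form(path: str, output: str, values: Mapping[str, Any]) -> None:
    """Write a copy of the PDF at `path` with the given field values filled in."""
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(path)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)

    cleaned = clean_field_values(values)
    for page in writer.pages:
        # Only pages with annotations can carry form widgets
        if "/Annots" in page:
            writer.update_page_form_field_values(page, cleaned)

    with open(output, "wb") as handle:
        writer.write(handle)
```

A call such as `fill_form("form.pdf", "filled.pdf", {"name": "Ada"})` assumes the form has a field literally named `name`; `read_form_fields` can be used first to list the real field names.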

### Document Operations
- Merge multiple PDFs
- Split PDFs by page
- Rotate pages
- Add watermarks
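
The document operations above can be sketched with PyPDF2's `PdfReader` and `PdfWriter`. This is an illustrative outline under stated assumptions rather than the skill's implementation: all function names are hypothetical, the output file naming is arbitrary, and the watermark is assumed to be the first page of a separate PDF.

```python
from pathlib import Path
from typing import Iterable, List


def normalize_rotation(angle: int) -> int:
    """Reduce a clockwise angle to 0, 90, 180, or 270; reject anything else."""
    if angle % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    return angle % 360


def page_filename(stem: str, index: int) -> str:
    """Name for a single-page output file (index is 1-based)."""
    return f"{stem}_page_{index:03d}.pdf"


def _write(writer, output: str) -> None:
    with open(output, "wb") as handle:
        writer.write(handle)


def merge_pdfs(inputs: Iterable[str], output: str) -> None:
    """Concatenate the pages of several PDFs, in the order given."""
    from PyPDF2 import PdfReader, PdfWriter  # third-party; imported lazily

    writer = PdfWriter()
    for path in inputs:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    _write(writer, output)


def split_pdf(path: str, out_dir: str) -> List[Path]:
    """Write each page of `path` to its own file inside `out_dir`."""
    from PyPDF2 import PdfReader, PdfWriter

    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, page in enumerate(PdfReader(path).pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        target = target_dir / page_filename(Path(path).stem, index)
        _write(writer, str(target))
        written.append(target)
    return written


def rotate_pdf(path: str, output: str, angle: int) -> None:
    """Rotate every page clockwise by `angle` degrees."""
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    for page in PdfReader(path).pages:
        page.rotate(normalize_rotation(angle))
        writer.add_page(page)
    _write(writer, output)


def watermark_pdf(path: str, watermark_path: str, output: str) -> None:
    """Overlay the first page of `watermark_path` onto every page of `path`."""
    from PyPDF2 import PdfReader, PdfWriter

    stamp = PdfReader(watermark_path).pages[0]
    writer = PdfWriter()
    for page in PdfReader(path).pages:
        page.merge_page(stamp)  # draws the stamp on top of the page content
        writer.add_page(page)
    _write(writer, output)
```

Building the merge on `PdfWriter.add_page` rather than a dedicated merger class is a deliberate choice here, since page-by-page copying is available in both PyPDF2 and pypdf.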

## Best Practices

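1. Check whether the PDF is encrypted before processing, and decrypt it first if it is
2. Fall back to OCR for scanned documents, which contain page images rather than extractable text
3. Validate extracted data against the source document, since extraction can drop or misorder text
4. Use appropriate libraries (pdfplumber for extraction, PyPDF2 for manipulation); note that PyPDF2 is no longer developed and its successor is published as pypdf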
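
The first two practices could be sketched as below. Both helper names are hypothetical, and the 20-character threshold for deciding that a page needs OCR is an arbitrary assumption to tune per document set. The OCR step itself is left out because the skill does not name an OCR tool.

```python
from typing import Optional


def needs_ocr(text: Optional[str], min_chars: int = 20) -> bool:
    """Guess whether a page is scanned: little or no extractable text.

    Assumption: fewer than `min_chars` non-whitespace-trimmed characters
    means the page is an image. The threshold is a heuristic.
    """
    return len((text or "").strip()) < min_chars


def open_reader(path: str, password: Optional[str] = None):
    """Open a PDF with PyPDF2, decrypting it first when it is encrypted."""
    from PyPDF2 import PdfReader  # third-party; imported lazily

    reader = PdfReader(path)
    if reader.is_encrypted:
        if password is None:
            raise PermissionError(f"{path} is encrypted and no password was given")
        # decrypt() returns a falsy value when the password does not match
        if not reader.decrypt(password):
            raise PermissionError(f"wrong password for {path}")
    return reader
```

A caller might run `needs_ocr(page.extract_text())` on each pdfplumber page and route only the flagged pages to an OCR pipeline.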