ocrmypdf
OCRmyPDF core skill — add searchable OCR text layer to scanned PDFs, convert images to searchable PDFs, support 100+ languages via Tesseract. Use when the user needs to OCR a PDF, make a scanned PDF searchable, or extract text from scanned documents.
Best use case
ocrmypdf is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "ocrmypdf" skill to help with this workflow task. Context: OCRmyPDF core skill — add searchable OCR text layer to scanned PDFs, convert images to searchable PDFs, support 100+ languages via Tesseract. Use when the user needs to OCR a PDF, make a scanned PDF searchable, or extract text from scanned documents.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/ocrmypdf/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
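The manual steps above can be sketched as a small shell helper. This is only an illustration: it assumes SKILL.md has already been downloaded locally, and the function name `install_skill` is hypothetical, not part of any tool.

```shell
# install_skill SRC [PROJECT_ROOT]
# Copies a downloaded SKILL.md into PROJECT_ROOT/.claude/skills/ocrmypdf/.
# PROJECT_ROOT defaults to the current directory. (Hypothetical helper.)
install_skill() {
  src="$1"
  project_root="${2:-.}"
  dest_dir="$project_root/.claude/skills/ocrmypdf"

  # Create the skill directory if needed, then copy the file into place.
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/SKILL.md"
}

# Example (run from the project root after downloading SKILL.md):
#   install_skill ./SKILL.md
```

After copying the file, restart the agent so it can pick up the skill.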
How ocrmypdf Compares
| Feature / Agent | ocrmypdf | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
OCRmyPDF core skill — add searchable OCR text layer to scanned PDFs, convert images to searchable PDFs, support 100+ languages via Tesseract. Use when the user needs to OCR a PDF, make a scanned PDF searchable, or extract text from scanned documents.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# OCRmyPDF: Core OCR Guide

## Overview

[OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF) adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. It uses Tesseract OCR, supports 100+ languages, produces PDF/A by default, and distributes work across all CPU cores.

For image processing (deskew, rotate, clean), see the **ocrmypdf-image** skill. For optimization and PDF/A options, see **ocrmypdf-optimize**. For batch/Docker/scripting, see **ocrmypdf-batch**. For Python API and plugins, see **ocrmypdf-api**.

## Installation

### One-liner installs (recommended)

| OS | Command |
|----|---------|
| **Debian / Ubuntu** | `apt install ocrmypdf` |
| **Fedora** | `dnf install ocrmypdf tesseract-osd` |
| **macOS (Homebrew)** | `brew install ocrmypdf` |
| **macOS (MacPorts)** | `port install ocrmypdf` |
| **FreeBSD** | `pkg install py-ocrmypdf` |
| **Snap** | `snap install ocrmypdf` |

### pip install (latest version)

```bash
# After installing system dependencies (Tesseract, Ghostscript)
pip install ocrmypdf
```

### Verify

```bash
ocrmypdf --version
ocrmypdf --help
```

### Requirements

- **Python 3.11+**
- **Tesseract 4.1.1+** (OCR engine)
- **Ghostscript 9.54+** or **pypdfium2** (PDF rasterization)
- Optional: jbig2enc (compression), pngquant (image optimization), unpaper (cleaning)

## Quick Start

```bash
# Basic OCR: input scanned PDF, output searchable PDF/A
ocrmypdf input.pdf output.pdf

# OCR an image file directly
ocrmypdf --image-dpi 300 scan.png output.pdf

# OCR in place (only overwrites on success)
ocrmypdf myfile.pdf myfile.pdf
```

## Language Support

OCRmyPDF uses Tesseract language packs. Install them for your OS:

```bash
# Debian / Ubuntu
apt-cache search tesseract-ocr        # List all language packs
apt install tesseract-ocr-chi-sim     # Chinese Simplified
apt install tesseract-ocr-fra         # French

# macOS (Homebrew)
brew install tesseract-lang           # All languages

# Fedora
dnf search tesseract-langpack
dnf install tesseract-langpack-ita    # Italian
```

### Using languages

```bash
# Single language
ocrmypdf -l fra document.pdf output.pdf

# Multiple languages
ocrmypdf -l eng+fra bilingual.pdf output.pdf

# Chinese Simplified + English
ocrmypdf -l chi_sim+eng chinese-doc.pdf output.pdf
```

**Note**: Use [ISO 639-3 codes](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html) for language identifiers.

## OCR Modes

### Default mode (skip existing text)

```bash
# Skip pages that already have text; only OCR pages without text
ocrmypdf input.pdf output.pdf
```

### Force OCR (`--force-ocr` or `-m force`)

```bash
# Rasterize and OCR all pages, even those with existing text
ocrmypdf --force-ocr input.pdf output.pdf

# v17+ short form:
ocrmypdf -m force input.pdf output.pdf
```

### Redo OCR (`--redo-ocr` or `-m redo`)

```bash
# Replace existing OCR without rasterizing (preserves quality)
ocrmypdf --redo-ocr input.pdf output.pdf

# v17+ short form:
ocrmypdf -m redo input.pdf output.pdf
```

### Skip text (`--skip-text` or `-m skip`)

```bash
# Skip pages with any text, only OCR blank/image pages
ocrmypdf --skip-text input.pdf output.pdf

# v17+ short form:
ocrmypdf -m skip input.pdf output.pdf
```

### No OCR (image processing only)

```bash
# Apply image processing / PDF/A conversion without OCR
ocrmypdf --ocr-engine none input.pdf output.pdf
```

## Page Selection

```bash
# OCR only specific pages
ocrmypdf --pages 1,3,5-10 input.pdf output.pdf

# OCR only the first page, minimal changes elsewhere
ocrmypdf --pages 1 --output-type pdf --optimize 0 input.pdf output.pdf
```

## Output Types

```bash
# PDF/A (default), for archival
ocrmypdf --output-type pdfa input.pdf output.pdf

# Standard PDF
ocrmypdf --output-type pdf input.pdf output.pdf

# Auto (v17+): speculative PDF/A, falls back to standard PDF
ocrmypdf --output-type auto input.pdf output.pdf

# No output PDF: only produce sidecar text
ocrmypdf --output-type none --sidecar text.txt input.pdf -
```

## Sidecar Text File

```bash
# Produce a companion text file with OCR text
ocrmypdf --sidecar output.txt input.pdf output.pdf
```

## Metadata

```bash
# Set output PDF metadata
ocrmypdf --title "My Document" --author "Author Name" --subject "Subject" input.pdf output.pdf
```

## Parallel Processing

```bash
# Use 4 CPU cores (default: all available)
ocrmypdf --jobs 4 input.pdf output.pdf

# Single-threaded
ocrmypdf --jobs 1 input.pdf output.pdf
```

## Common Recipes

### Make a scanned PDF searchable

```bash
ocrmypdf scanned.pdf searchable.pdf
```

### Convert image to searchable PDF

```bash
ocrmypdf --image-dpi 300 scan.jpg output.pdf
```

### OCR a multilingual document

```bash
ocrmypdf -l eng+deu+fra multilingual.pdf output.pdf
```

### Re-OCR with newer Tesseract

```bash
ocrmypdf --redo-ocr old-ocr.pdf updated.pdf
```

### Strip all text/OCR from a PDF

```bash
ocrmypdf --ocr-engine none --force-ocr input.pdf stripped.pdf
```

## Quick Reference

| Task | Command |
|------|---------|
| Basic OCR | `ocrmypdf input.pdf output.pdf` |
| Specify language | `ocrmypdf -l fra input.pdf output.pdf` |
| Multiple languages | `ocrmypdf -l eng+fra input.pdf output.pdf` |
| Force re-OCR all pages | `ocrmypdf --force-ocr input.pdf output.pdf` |
| Replace existing OCR | `ocrmypdf --redo-ocr input.pdf output.pdf` |
| Skip pages with text | `ocrmypdf --skip-text input.pdf output.pdf` |
| Specific pages only | `ocrmypdf --pages 1,3,5-10 input.pdf output.pdf` |
| Output standard PDF | `ocrmypdf --output-type pdf input.pdf output.pdf` |
| Extract text sidecar | `ocrmypdf --sidecar text.txt input.pdf output.pdf` |
| Image to PDF | `ocrmypdf --image-dpi 300 image.png output.pdf` |
| In-place OCR | `ocrmypdf myfile.pdf myfile.pdf` |
| Set metadata | `ocrmypdf --title "Title" input.pdf output.pdf` |
| Parallel jobs | `ocrmypdf --jobs 4 input.pdf output.pdf` |

## Troubleshooting

- **"Tesseract not found"**: Install Tesseract and ensure it's on PATH.
- **Poor OCR quality**: Check language packs (`-l`), try `--deskew` (see ocrmypdf-image), or `--oversample 300`.
- **"Input file has text"**: Use `--force-ocr`, `--redo-ocr`, or `--skip-text` as appropriate.
- **Large output files**: See ocrmypdf-optimize for `--optimize` levels and JBIG2.
- **Signed PDFs**: Use `--invalidate-digital-signatures` to override (signatures will be invalidated).

## References

- [OCRmyPDF Documentation](https://ocrmypdf.readthedocs.io/en/latest/)
- [OCRmyPDF GitHub](https://github.com/ocrmypdf/OCRmyPDF)
- [Tesseract Language Packs](https://github.com/tesseract-ocr/tessdata)
- [OCRmyPDF Cookbook](https://ocrmypdf.readthedocs.io/en/latest/cookbook.html)
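
As a rough illustration of how the pieces in the guide above fit together, the sketch below wraps three of its steps in shell helpers: checking that required tools are on PATH (the "Tesseract not found" case), joining language codes with `+` for `-l`, and running OCR with a sidecar text file. The helper names `missing_tools`, `join_langs`, and `ocr_multilang` are hypothetical and not part of OCRmyPDF. Only the `-l` and `--sidecar` flags shown in the guide are used.

```shell
# Print each named tool that is NOT found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Join language codes with '+', the separator -l expects for several languages.
join_langs() {
  # The subshell keeps the IFS change local to this function.
  ( IFS='+'; printf '%s\n' "$*" )
}

# ocr_multilang INPUT OUTPUT LANG [LANG...]
# Runs ocrmypdf with the given languages and writes OUTPUT's text
# to a sidecar file next to it (OUTPUT with .pdf replaced by .txt).
ocr_multilang() {
  input="$1"
  output="$2"
  shift 2
  ocrmypdf -l "$(join_langs "$@")" --sidecar "${output%.pdf}.txt" "$input" "$output"
}

# Example:
#   missing_tools ocrmypdf tesseract      # prints nothing if both are installed
#   join_langs eng fra                    # prints eng+fra
#   ocr_multilang bilingual.pdf output.pdf eng fra
```

Running `missing_tools` before the OCR step is a cheap way to surface an installation problem before a long job starts.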
Related Skills
ocrmypdf-optimize
OCRmyPDF optimization skill — compress PDFs, configure PDF/A output, JBIG2 encoding, and lossless optimization. Use when the user needs to reduce PDF file size, create archival PDF/A files, or optimize OCR output.
ocrmypdf-image
OCRmyPDF image processing skill — deskew, rotate, clean, despeckle, remove border from scanned documents. Use when the user needs to improve scanned PDF quality, fix skewed pages, remove noise, or clean up scanned documents before OCR.
ocrmypdf-batch
OCRmyPDF batch processing skill — process multiple PDFs, Docker automation, shell scripting, and CI/CD integration. Use when the user needs to OCR many PDFs, set up automated OCR pipelines, or integrate OCR into workflows.
ocrmypdf-api
OCRmyPDF Python API and plugin skill — use OCRmyPDF programmatically from Python, integrate with applications, and extend with plugins (EasyOCR, PaddleOCR, AppleOCR). Use when the user needs to call OCRmyPDF from Python code, build OCR pipelines, or use alternative OCR engines.