image-ocr-processing

image ocr processing

7,385 stars

Best use case

image-ocr-processing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using image-ocr-processing should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/image-ocr-processing/SKILL.md --create-dirs "https://raw.githubusercontent.com/kreuzberg-dev/kreuzberg/main/.ai-rulez/domains/ocr-integration/skills/image-ocr-processing/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/image-ocr-processing/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How image-ocr-processing Compares

Feature / Agent	image-ocr-processing	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

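It processes image documents with OCR. The workflow checks a cache keyed by the image hash, selects an OCR backend, applies preprocessing if configured, runs the backend, parses the results (hOCR, confidence, tables), stores them in the cache, and returns a structured result.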

Where can I find the source code?

You can find the source code on GitHub in the kreuzberg-dev/kreuzberg repository, at .ai-rulez/domains/ocr-integration/skills/image-ocr-processing/SKILL.md.

Related Guides

Top AI Agents for Productivity

See the top AI agent skills for productivity, workflow automation, operational systems, documentation, and everyday task execution.

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Cursor vs Codex for AI Workflows

Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.

SKILL.md Source

Process image documents with OCR

1. Create OcrProcessor with cache config
2. Check cache for image hash
3. If cache hit, return cached result
4. Select optimal OCR backend
5. Apply preprocessing if configured
6. Execute OCR backend
7. Parse results (hOCR, confidence, tables)
8. Store in cache
9. Return structured result
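
The nine steps above can be sketched in Python as follows. This is a minimal illustration and not Kreuzberg's actual API: the OcrProcessor fields, the OcrResult type, the backend signature, and the "first registered backend" selection policy are all assumptions made for the example. Result parsing (step 7) is assumed to happen inside each backend callable.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class OcrResult:
    """Hypothetical structured result; real results would also carry hOCR and tables."""
    text: str
    confidence: float
    backend: str


# Hypothetical backend signature: image bytes in, parsed OcrResult out.
Backend = Callable[[bytes], OcrResult]


@dataclass
class OcrProcessor:
    """Illustrative processor; names and fields are assumptions, not the library's."""
    backends: Dict[str, Backend]
    preprocess: Optional[Callable[[bytes], bytes]] = None
    use_cache: bool = True
    cache: Dict[str, OcrResult] = field(default_factory=dict)

    def select_backend(self, image: bytes) -> str:
        # Placeholder policy (assumption): pick the first registered backend.
        return next(iter(self.backends))

    def process(self, image: bytes) -> OcrResult:
        # Steps 2-3: look up the image hash and return early on a cache hit.
        key = hashlib.sha256(image).hexdigest()
        if self.use_cache and key in self.cache:
            return self.cache[key]

        # Step 4: select a backend.
        name = self.select_backend(image)

        # Step 5: apply preprocessing only if one was configured.
        data = self.preprocess(image) if self.preprocess is not None else image

        # Steps 6-7: run the backend, which returns an already parsed result.
        result = self.backends[name](data)

        # Steps 8-9: store the result in the cache, then return it.
        if self.use_cache:
            self.cache[key] = result
        return result
```

Hashing the original bytes rather than the preprocessed ones means a repeated image skips preprocessing as well as OCR. The trade-off is that a change to the preprocessing configuration would need a new cache, or a key that also covers the configuration.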

Related Skills

image-preprocessing-pipeline

7385

from kreuzberg-dev/kreuzberg

image preprocessing pipeline

batch-ocr-processing

7385

from kreuzberg-dev/kreuzberg

batch ocr processing

kreuzberg

7385

from kreuzberg-dev/kreuzberg

Extract text, tables, metadata, and images from 91+ document formats (PDF, Office, images, HTML, email, archives, academic) using Kreuzberg. Use when writing code that calls Kreuzberg APIs in Python, Node.js/TypeScript, Rust, or CLI. Covers installation, extraction (sync/async), configuration (OCR, chunking, output format), batch processing, error handling, and plugins.