receipt-ocr
Implement and extend the Tesseract.js OCR pipeline for receipt field extraction.
Best use case
receipt-ocr is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using receipt-ocr should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/receipt-ocr/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How receipt-ocr Compares
| Feature / Agent | receipt-ocr | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Implement and extend the Tesseract.js OCR pipeline for receipt field extraction.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# receipt-ocr skill
## OCR pipeline overview
Receipt OCR uses Tesseract.js (WASM, runs in Node.js) to extract three fields from uploaded images: vendor name, transaction date, and total amount. The result also carries a single overall confidence score (0-100) that Tesseract reports for the whole page; it is not computed per field.
### Server-side implementation
`server/src/ocr/extractFields.ts`:
```typescript
import Tesseract from 'tesseract.js';

export interface OcrResult {
  vendor: string | null;
  date: string | null; // ISO 8601 or null
  amount: number | null;
  confidence: number; // 0-100, Tesseract's overall confidence for the page
  rawText: string;
}

export async function extractReceiptFields(imagePath: string): Promise<OcrResult> {
  const { data } = await Tesseract.recognize(imagePath, 'eng', {
    logger: () => {}, // suppress progress logs
  });
  const raw = data.text;
  const confidence = Math.round(data.confidence);
  return {
    vendor: extractVendor(raw),
    date: extractDate(raw),
    amount: extractAmount(raw),
    confidence,
    rawText: raw,
  };
}
```
### Vendor extraction
Heuristic: the vendor name is typically on one of the first few non-empty lines, in all-caps or title case. The extractor checks the first four non-empty lines.
```typescript
function extractVendor(text: string): string | null {
  const lines = text.split('\n').map(l => l.trim()).filter(Boolean);
  // Take the first line that looks like a business name (>3 chars, not purely numeric)
  for (const line of lines.slice(0, 4)) {
    if (line.length > 3 && !/^\d+$/.test(line)) {
      return titleCase(line);
    }
  }
  return null;
}

function titleCase(s: string): string {
  return s.toLowerCase().replace(/\b\w/g, c => c.toUpperCase());
}
```
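To see the heuristic on concrete input, the sketch below reproduces it in self-contained form and runs it against a made-up OCR text. `pickVendor` and `SAMPLE_TEXT` are illustrative names, not part of the skill's source.

```typescript
// Self-contained copy of the vendor heuristic for experimentation.
// `pickVendor` and SAMPLE_TEXT are hypothetical names used only in this sketch.
function pickVendor(text: string): string | null {
  const lines = text.split('\n').map(l => l.trim()).filter(Boolean);
  for (const line of lines.slice(0, 4)) {
    // Skip short fragments and purely numeric lines (store numbers, phone digits)
    if (line.length > 3 && !/^\d+$/.test(line)) {
      return line.toLowerCase().replace(/\b\w/g, c => c.toUpperCase());
    }
  }
  return null;
}

// The blank line is dropped and "0042" is skipped as purely numeric.
const SAMPLE_TEXT = ['', '0042', 'CORNER MARKET', '123 Main St', 'TOTAL $12.50'].join('\n');
console.log(pickVendor(SAMPLE_TEXT)); // prints: Corner Market
```

One thing to watch: the `\b\w` title-casing also capitalizes the letter after an apostrophe, so "JOE'S" comes out as "Joe'S".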
### Date extraction
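Matches common date formats: MM/DD/YYYY, DD/MM/YYYY, YYYY-MM-DD, Month DD YYYY, DD Month YYYY. Numeric dates with the year last are ambiguous: they are read as MM/DD/YYYY unless the first number is greater than 12, in which case they are read as DD/MM/YYYY.
```typescript
const MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'];
// Convert a month name or abbreviation ("Mar", "march") to a zero-padded number ("03")
function monthNum(name: string): string {
  return String(MONTHS.indexOf(name.slice(0, 3).toLowerCase()) + 1).padStart(2, '0');
}
const pad = (s: string): string => s.padStart(2, '0');

const DATE_PATTERNS: Array<{ re: RegExp; parse: (m: RegExpMatchArray) => string }> = [
  {
    // MM/DD/YYYY, or DD/MM/YYYY when the first number cannot be a month
    re: /\b(\d{1,2})[\/\-\.](\d{1,2})[\/\-\.](\d{4})\b/,
    parse: ([, a, b, y]) =>
      Number(a) > 12 && Number(b) <= 12 ? `${y}-${pad(b)}-${pad(a)}` : `${y}-${pad(a)}-${pad(b)}`,
  },
  {
    // YYYY-MM-DD
    re: /\b(\d{4})[\/\-\.](\d{1,2})[\/\-\.](\d{1,2})\b/,
    parse: ([, y, m, d]) => `${y}-${pad(m)}-${pad(d)}`,
  },
  {
    // Month DD YYYY
    re: /\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+(\d{1,2}),?\s+(\d{4})\b/i,
    parse: ([, mon, d, y]) => `${y}-${monthNum(mon)}-${pad(d)}`,
  },
  {
    // DD Month YYYY
    re: /\b(\d{1,2})\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?,?\s+(\d{4})\b/i,
    parse: ([, d, mon, y]) => `${y}-${monthNum(mon)}-${pad(d)}`,
  },
];

function extractDate(text: string): string | null {
  for (const { re, parse } of DATE_PATTERNS) {
    const m = text.match(re);
    if (m) return parse(m);
  }
  return null;
}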
```
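The patterns only check the shape of a date, so OCR noise such as "13/45/2024" can still produce an ISO-looking string. A minimal sketch of a post-check follows; `isRealIsoDate` is a hypothetical helper, not part of the skill, and it assumes the extractor's output is in `YYYY-MM-DD` form.

```typescript
// Hypothetical helper: checks that a YYYY-MM-DD string names a real calendar date.
function isRealIsoDate(iso: string): boolean {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso);
  if (!m) return false;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  // Date.UTC rolls invalid values over (Feb 30 becomes Mar 1), so a round
  // trip that changes any component means the input was not a real date.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```

A caller could treat a failed check the same way as a missing date, for example by returning `null` so the field is left for the user to fill in.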
### Amount extraction
Returns the largest amount on the first line containing "TOTAL", "AMOUNT DUE", "GRAND TOTAL", or "DUE". If no such line contains an amount, falls back to the largest amount in the document.
```typescript
// Optional currency symbol, then an amount with two decimals ("12.50" or "12,50")
const AMOUNT_RE = /[$€£]?\s?(\d{1,6}[.,]\d{2})/g;

function extractAmount(text: string): number | null {
  const lines = text.split('\n');
  // Priority: line contains a total keyword ("t[o0]tal" tolerates the O/0 OCR confusion)
  const totalKeywords = /\b(t[o0]tal|amount\s*due|grand\s*t[o0]tal|due)\b/i;
  for (const line of lines) {
    if (totalKeywords.test(line)) {
      const amounts = [...line.matchAll(AMOUNT_RE)].map(m => parseAmount(m[1]));
      if (amounts.length > 0) return Math.max(...amounts);
    }
  }
  // Fallback: largest amount in document
  const all = [...text.matchAll(AMOUNT_RE)].map(m => parseAmount(m[1]));
  return all.length > 0 ? Math.max(...all) : null;
}

function parseAmount(s: string): number {
  // Treat a comma as the decimal separator ("52,00" -> 52.00)
  return parseFloat(s.replace(',', '.'));
}
```
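The amount pattern captures a single separator, so a value with a thousands grouping such as "1,234.56" is not read as one number. The sketch below shows one possible way to normalize such strings. `normalizeAmount` is a hypothetical helper, and it assumes that amounts always carry exactly two decimal digits.

```typescript
// Hypothetical helper: converts an OCR'd amount string to a number.
// Assumption: amounts always have exactly two decimal digits, so the last
// '.' or ',' followed by two digits is the decimal mark and any earlier
// separators are thousands groupings.
function normalizeAmount(raw: string): number | null {
  // Drop currency symbols, spaces, and anything else that is not a digit or separator
  const cleaned = raw.replace(/[^\d.,]/g, '');
  const m = /^(\d{1,3}(?:[.,]\d{3})*|\d+)[.,](\d{2})$/.exec(cleaned);
  if (!m) return null;
  const whole = m[1].replace(/[.,]/g, '');
  return Number(`${whole}.${m[2]}`);
}

console.log(normalizeAmount('1,234.56'));   // prints: 1234.56
console.log(normalizeAmount('$ 1.234,56')); // prints: 1234.56
console.log(normalizeAmount('52,00'));      // prints: 52
```

Strings without a two-digit decimal part (for example "12.5") return `null` rather than a guess.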
## Multer upload configuration
`server/src/middleware/upload.ts`:
```typescript
import multer from 'multer';
import { randomUUID } from 'node:crypto';
import fs from 'node:fs';

const UPLOAD_DIR = process.env.UPLOAD_DIR ?? './tmp/receipts';
// Ensure upload directory exists
fs.mkdirSync(UPLOAD_DIR, { recursive: true });

// Allowed MIME types, mapped to the extension used for the stored file
const ALLOWED_MIME = new Map([
  ['image/jpeg', '.jpg'],
  ['image/png', '.png'],
  ['image/webp', '.webp'],
]);
const MAX_BYTES = parseInt(process.env.MAX_FILE_BYTES ?? '10485760', 10); // default 10 MiB

export const receiptUpload = multer({
  storage: multer.diskStorage({
    destination: UPLOAD_DIR,
    // Keep the extension in line with the uploaded type instead of always using .jpg
    filename: (_req, file, cb) =>
      cb(null, `${randomUUID()}${ALLOWED_MIME.get(file.mimetype) ?? ''}`),
  }),
  limits: { fileSize: MAX_BYTES },
  fileFilter: (_req, file, cb) => {
    if (ALLOWED_MIME.has(file.mimetype)) cb(null, true);
    else cb(new Error(`Unsupported file type: ${file.mimetype}`));
  },
});
```
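When an upload is rejected, multer passes an error to the next error handler rather than calling the route. The sketch below is one way to turn those errors into HTTP responses. `describeUploadError` and the chosen status codes are assumptions for illustration; the sketch relies on multer's size-limit errors carrying the code `LIMIT_FILE_SIZE` and on the "Unsupported file type" message used by the `fileFilter` above.

```typescript
// Hypothetical helper: maps an upload error to an HTTP status and message.
interface UploadErrorResponse {
  status: number;
  error: string;
}

function describeUploadError(err: unknown): UploadErrorResponse {
  // multer's size-limit error exposes a `code` property
  const code = (err as { code?: unknown } | null | undefined)?.code;
  if (code === 'LIMIT_FILE_SIZE') {
    return { status: 413, error: 'File exceeds the upload size limit' };
  }
  // Errors raised by the fileFilter are plain Error objects
  if (err instanceof Error && err.message.startsWith('Unsupported file type')) {
    return { status: 415, error: err.message };
  }
  return { status: 500, error: 'Upload failed' };
}
```

In an Express app this could be called from an error-handling middleware registered after the receipts router, sending `res.status(status).json({ error })` with the returned values.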
## OCR route
`server/src/routes/receipts.ts`:
```typescript
import { Router } from 'express';
import { receiptUpload } from '../middleware/upload.js';
import { extractReceiptFields } from '../ocr/extractFields.js';

const router = Router();

router.post('/ocr', receiptUpload.single('receipt'), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ error: 'No file uploaded' });
  }
  try {
    const result = await extractReceiptFields(req.file.path);
    res.json({
      vendor: result.vendor,
      date: result.date,
      amount: result.amount,
      confidence: result.confidence,
      rawText: result.rawText,
      filePath: req.file.filename, // relative path for storage
    });
  } catch (err) {
    res.status(500).json({ error: 'OCR failed', detail: String(err) });
  }
});

export default router;
```
## Confidence thresholds
| Confidence | Meaning | UI treatment |
|------------|---------|--------------|
| >= 85 | High | Green badge, field auto-filled, no warning |
| 65-84 | Medium | Yellow badge, field pre-filled, user prompted to verify |
| < 65 | Low | Red badge, field left blank, user must fill manually |
The `OCR_CONFIDENCE_THRESHOLD` env var (default 75) controls which fields are flagged in the UI. Fields below the threshold receive a warning indicator.
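The bands and the threshold can be expressed as a few small functions. This is a sketch under the assumption that the UI receives the page-level `confidence` value. The function names are hypothetical, and the environment is passed in as a plain object so the logic can be tested without touching `process.env`.

```typescript
type ConfidenceBand = 'high' | 'medium' | 'low';

// Maps a 0-100 confidence score to the bands in the table above.
function confidenceBand(confidence: number): ConfidenceBand {
  if (confidence >= 85) return 'high';
  if (confidence >= 65) return 'medium';
  return 'low';
}

// Reads OCR_CONFIDENCE_THRESHOLD, falling back to the documented default of 75.
function readThreshold(env: Record<string, string | undefined>): number {
  const parsed = Number.parseInt(env['OCR_CONFIDENCE_THRESHOLD'] ?? '', 10);
  return Number.isNaN(parsed) ? 75 : parsed;
}

// A field gets a warning indicator when the score is below the threshold.
function needsWarning(confidence: number, threshold: number): boolean {
  return confidence < threshold;
}
```

With the default threshold, a score of 70 falls in the medium band and is also flagged with a warning, while a score of 80 is medium without a warning.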
## Improving OCR accuracy
1. Ensure the uploaded image is at least 800px wide. Narrow images lose character detail.
2. Receipt images that are skewed more than 15 degrees may reduce accuracy. Consider adding a deskew step using `sharp` before passing to Tesseract.
3. For images with dark backgrounds (thermal receipts), use `sharp` to invert and increase contrast before OCR.
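4. Set the Tesseract page segmentation mode: PSM 6 (assume a single uniform block of text) works well for receipts. In Tesseract.js this is a worker parameter: create a worker with `createWorker` and call `worker.setParameters({ tessedit_pageseg_mode: PSM.SINGLE_BLOCK })` before `worker.recognize`. The `Tesseract.recognize` shorthand takes worker options such as `logger`, not recognition parameters.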
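The first three tips can be turned into a small decision step that runs before any image processing. The sketch below only decides which steps to apply and does not touch pixels. All names are hypothetical, and the brightness cut-off of 100 is an assumption rather than a value from the skill.

```typescript
// Hypothetical types and thresholds for this sketch.
interface ImageInfo {
  width: number;          // pixels
  skewDegrees: number;    // estimated rotation, positive or negative
  meanBrightness: number; // 0 (black) to 255 (white), averaged over the image
}

type PreprocessStep = 'upscale' | 'deskew' | 'invert' | 'increase-contrast';

const MIN_WIDTH = 800;           // tip 1
const MAX_SKEW_DEGREES = 15;     // tip 2
const DARK_BACKGROUND_MAX = 100; // assumption: below this mean, treat the background as dark

function planPreprocessing(info: ImageInfo): PreprocessStep[] {
  const steps: PreprocessStep[] = [];
  if (info.width < MIN_WIDTH) steps.push('upscale');
  if (Math.abs(info.skewDegrees) > MAX_SKEW_DEGREES) steps.push('deskew');
  if (info.meanBrightness < DARK_BACKGROUND_MAX) steps.push('invert', 'increase-contrast');
  return steps;
}
```

Each step would then map to an image operation, for example sharp's `resize`, `negate`, and `normalise`. sharp's `rotate` applies an angle it is given, so the skew itself has to be estimated separately.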
## Testing OCR extraction
```typescript
// server/src/ocr/__tests__/extractFields.test.ts
import { extractReceiptFields } from '../extractFields.js';
import path from 'node:path';

test('extracts vendor, date, and amount from sample receipt', async () => {
  const result = await extractReceiptFields(
    path.resolve(__dirname, 'fixtures/sample_receipt.jpg')
  );
  expect(result.vendor).toBeTruthy();
  expect(result.date).toMatch(/^\d{4}-\d{2}-\d{2}$/);
  expect(result.amount).toBeGreaterThan(0);
  expect(result.confidence).toBeGreaterThan(60);
}, 30_000); // OCR can take up to 30s on first run (WASM init)
```
Place test receipt images in `server/src/ocr/__tests__/fixtures/`.
## Known OCR edge cases
- **"T0TAL" (digit zero):** Tesseract sometimes confuses uppercase O with 0. The amount extractor accepts both in the total keyword regex.
- **Comma as decimal separator:** European receipts use commas (e.g. "52,00"). `parseAmount` normalizes commas to periods.
- **Multi-currency symbols:** The amount regex matches `$`, euro, and pound signs. The currency is not extracted from OCR; it defaults to the app default and must be corrected by the user.
- **Tip vs subtotal:** The extractor uses the largest amount on a "TOTAL" line, which should be the grand total including tip. If the receipt only shows a subtotal line, the tip is not included.