llmwhisperer

Extract text and layout from images and PDFs using the LLMWhisperer API. Good for handwriting and complex forms.

7 stars

Best use case

llmwhisperer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using llmwhisperer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/llmwhisperer/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/gumadeiras/llmwhisperer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/llmwhisperer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How llmwhisperer Compares

Feature / Agent         | llmwhisperer  | Standard Approach
Platform Support        | Not specified | Limited / Varies
Context Awareness       | High          | Baseline
Installation Complexity | Unknown       | N/A

Frequently Asked Questions

What does this skill do?

Extract text and layout from images and PDFs using the LLMWhisperer API. Good for handwriting and complex forms.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# LLMWhisperer

Extract text from images and PDFs using the [LLMWhisperer API](https://unstract.com/llmwhisperer/) — great for handwriting and complex forms.

## Configuration

Requires `LLMWHISPERER_API_KEY`, either exported in the environment or stored in `~/.clawdbot/.env`:
```bash
echo "LLMWHISPERER_API_KEY=your_key_here" >> ~/.clawdbot/.env
```
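
To confirm the setup before running an extraction, a small check along the following lines may help. It is only a sketch: `has_llmwhisperer_key` is a hypothetical helper name, not part of the skill, and it mirrors the lookup order described above (environment first, then the env file).

```bash
# Hypothetical helper (not part of the skill): succeeds when a key is available
# either in the environment or as a non-empty assignment in the env file.
has_llmwhisperer_key() {
  local env_file="${1:-$HOME/.clawdbot/.env}"
  # A variable that is already set takes precedence.
  [ -n "${LLMWHISPERER_API_KEY:-}" ] && return 0
  # Otherwise look for a non-empty assignment in the env file.
  [ -f "$env_file" ] && grep -Eq '^LLMWHISPERER_API_KEY=.+' "$env_file"
}
```

For example, `has_llmwhisperer_key && echo "key found"` prints a message only when the key is configured. Because the file holds a secret, restricting it with `chmod 600 ~/.clawdbot/.env` is a reasonable precaution.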

### Get an API Key
Get a free API key at [unstract.com/llmwhisperer](https://unstract.com/llmwhisperer/).
- **Free Tier:** 100 pages/day

## Usage

```bash
llmwhisperer <file>
```
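
The command above only works if the script is executable and on your `PATH`. One way to arrange that is sketched below, assuming you have a local copy of `scripts/llmwhisperer` and that `~/.local/bin` is on your `PATH`. Both are assumptions, and `install_llmwhisperer` is a hypothetical helper name.

```bash
# Hypothetical helper: copy the script into a bin directory and mark it executable.
install_llmwhisperer() {
  local src="${1:-scripts/llmwhisperer}"
  local bin_dir="${2:-$HOME/.local/bin}"   # assumed to be on PATH
  [ -f "$src" ] || { echo "Error: $src not found" >&2; return 1; }
  mkdir -p "$bin_dir"
  cp "$src" "$bin_dir/llmwhisperer"
  chmod +x "$bin_dir/llmwhisperer"
}
```

Running `install_llmwhisperer` from the skill directory uses the defaults. Pass a source path and target directory to override them.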

## Script Source

The executable script is located at `scripts/llmwhisperer`.

```bash
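#!/bin/bash
# Extract text using the LLMWhisperer API

# Fall back to ~/.clawdbot/.env when the key is not already in the environment.
if [ -z "$LLMWHISPERER_API_KEY" ]; then
  if [ -f ~/.clawdbot/.env ]; then
    # Read only the key's assignment; xargs trims whitespace and surrounding quotes.
    LLMWHISPERER_API_KEY=$(grep '^LLMWHISPERER_API_KEY=' ~/.clawdbot/.env | tail -n 1 | cut -d= -f2- | xargs)
    export LLMWHISPERER_API_KEY
  fi
fi

if [ -z "$LLMWHISPERER_API_KEY" ]; then
  echo "Error: LLMWHISPERER_API_KEY not found in env or ~/.clawdbot/.env" >&2
  exit 1
fi

FILE="$1"
if [ -z "$FILE" ]; then
  echo "Usage: $0 <file>" >&2
  exit 1
fi
if [ ! -r "$FILE" ]; then
  echo "Error: cannot read file: $FILE" >&2
  exit 1
fi

# Upload the raw file bytes; -sS hides the progress meter but still prints transfer errors.
curl -sS -X POST "https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper?mode=high_quality&output_mode=layout_preserving" \
  -H "Content-Type: application/octet-stream" \
  -H "unstract-key: $LLMWHISPERER_API_KEY" \
  --data-binary "@$FILE"
```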
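
The script prints whatever the endpoint returns. If the v2 endpoint responds with a job reference rather than the extracted text, a polling wrapper like the sketch below may be needed. Everything beyond the `/whisper` call is an assumption to verify against the LLMWhisperer API documentation: the `/whisper-status` and `/whisper-retrieve` paths, the `whisper_hash` and `status` fields, the `processed` and `error` status values, and the `text_only` parameter. `json_field` and `whisper_and_wait` are hypothetical helper names.

```bash
API_BASE="https://llmwhisperer-api.us-central.unstract.com/api/v2"

# Print the first string value stored under the given JSON key.
# A simplification for flat responses; jq is more robust when available.
json_field() {
  grep -o "\"$1\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" | head -n 1 \
    | sed -e 's/^[^:]*:[[:space:]]*"//' -e 's/"$//'
}

# Submit a file, poll until the job finishes, then print the extracted text.
whisper_and_wait() {
  local file="$1" hash status="" i
  [ -n "${LLMWHISPERER_API_KEY:-}" ] || { echo "Error: LLMWHISPERER_API_KEY not set" >&2; return 1; }

  hash=$(curl -sS -X POST "$API_BASE/whisper?mode=high_quality&output_mode=layout_preserving" \
    -H "Content-Type: application/octet-stream" \
    -H "unstract-key: $LLMWHISPERER_API_KEY" \
    --data-binary "@$file" | json_field whisper_hash)
  [ -n "$hash" ] || { echo "Error: no whisper_hash in response" >&2; return 1; }

  # Poll up to 60 times, two seconds apart (assumed status values).
  for ((i = 0; i < 60; i++)); do
    status=$(curl -sS -G "$API_BASE/whisper-status" \
      --data-urlencode "whisper_hash=$hash" \
      -H "unstract-key: $LLMWHISPERER_API_KEY" | json_field status)
    case "$status" in
      processed) break ;;
      error) echo "Error: extraction failed" >&2; return 1 ;;
    esac
    sleep 2
  done
  [ "$status" = "processed" ] || { echo "Error: timed out waiting for result" >&2; return 1; }

  # Fetch the text (assumed parameter: text_only=true returns plain text).
  curl -sS -G "$API_BASE/whisper-retrieve" \
    --data-urlencode "whisper_hash=$hash" \
    --data-urlencode "text_only=true" \
    -H "unstract-key: $LLMWHISPERER_API_KEY"
}
```

With the key exported, `whisper_and_wait invoice.pdf > invoice.txt` would write the final text under these assumptions. The query values are URL-encoded with `--data-urlencode` in case the job reference contains characters that are not safe in a URL.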

## Examples

**Print text to terminal:**
```bash
llmwhisperer flyer.jpg
```

**Save output to a text file:**
```bash
llmwhisperer invoice.pdf > invoice.txt
```

**Process a handwritten note:**
```bash
llmwhisperer notes.jpg
```
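
**Process several files in one go:**

The single-file examples above can be combined into a loop. The following is a minimal sketch, not part of the skill: `extract_all` is a hypothetical helper, and `EXTRACT_CMD` is an illustrative override that defaults to the `llmwhisperer` command.

```bash
# Hypothetical helper: run the extractor over each file and write the output
# to <file>.txt next to the input. Returns non-zero if any extraction failed.
extract_all() {
  local cmd="${EXTRACT_CMD:-llmwhisperer}" f failed=0
  for f in "$@"; do
    [ -f "$f" ] || continue          # skip arguments that are not regular files
    if "$cmd" "$f" > "$f.txt"; then
      echo "ok: $f" >&2
    else
      echo "failed: $f" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `extract_all scans/*.pdf` would produce `scans/a.pdf.txt`, `scans/b.pdf.txt`, and so on. Keep the free tier's limit of 100 pages/day in mind when batching.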

Related Skills

paylock

7
from Demerzels-lab/elsamultiskillagent

Non-custodial SOL escrow for AI agent deals.

agent-reputation

7
from Demerzels-lab/elsamultiskillagent

Cross-platform AI agent reputation checker with trust scoring and PayLock escrow recommendations.

Telecom Agent Skill

7
from Demerzels-lab/elsamultiskillagent

Turn your AI Agent into a Telecom Operator. Bulk calling, ChatOps, and Field Monitoring.

OpenClaw-Finnhub

7
from Demerzels-lab/elsamultiskillagent

OpenClaw skill for real-time stock quotes and financials via the Finnhub API.

OpenClaw-Last.fm

7
from Demerzels-lab/elsamultiskillagent

security-operator

7
from Demerzels-lab/elsamultiskillagent

Runtime security guardrails for OpenClaw agents.

operator-humanizer

7
from Demerzels-lab/elsamultiskillagent

Transform AI-generated text into authentic human writing.

kit-email-operator

7
from Demerzels-lab/elsamultiskillagent

AI-powered email marketing for Kit (ConvertKit).

agora

7
from Demerzels-lab/elsamultiskillagent

Trade prediction markets on Agora — the prediction market exclusively for AI agents. Register, browse markets, trade YES/NO, create markets, earn reputation via Brier scores.

surf-check

7
from Demerzels-lab/elsamultiskillagent

Surf forecast decision engine.

jinko-flight-search

7
from Demerzels-lab/elsamultiskillagent

Search flights and discover travel destinations using the Jinko MCP server. Provides two core capabilities: (1) Destination discovery — find where to travel based on criteria like budget, climate, or activities when the user has no specific destination in mind, and (2) Specific flight search — compare flights between two known cities/airports with flexible dates, cabin classes, and budget filters. Use this skill when the user wants to: search for flights, find cheap flights, discover travel destinations, compare flight prices, plan a trip, find deals from a specific city, or explore where to go. Triggers on any flight-booking, travel-planning, or destination-discovery request. Requires the Jinko MCP server connected at https://mcp.gojinko.com.

mlx-whisper

7
from Demerzels-lab/elsamultiskillagent

Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).