openrouter-transcribe

Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc.).

533 stars

Best use case

openrouter-transcribe is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using openrouter-transcribe should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/openrouter-transcribe/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/awesome-openclaw-skills/main/skills/openrouter-transcribe/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/openrouter-transcribe/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How openrouter-transcribe Compares

Feature / Agent	openrouter-transcribe	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It transcribes audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc.).

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# OpenRouter Audio Transcription

Transcribe audio files using OpenRouter's chat completions API with `input_audio` content type. Works with any audio-capable model.

## Quick start

```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
```

Output goes to stdout.

## Useful flags

```bash
# Custom model (default: google/gemini-2.5-flash)
{baseDir}/scripts/transcribe.sh audio.ogg --model openai/gpt-4o-audio-preview

# Custom instructions
{baseDir}/scripts/transcribe.sh audio.m4a --prompt "Transcribe with speaker labels"

# Save to file
{baseDir}/scripts/transcribe.sh audio.m4a --out /tmp/transcript.txt

# Custom caller identifier (for OpenRouter dashboard)
{baseDir}/scripts/transcribe.sh audio.m4a --title "MyApp"
```
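
The `--out` flag lends itself to batch runs. The following is a minimal sketch, not part of the skill: `TRANSCRIBE_SH`, `transcript_path`, and `transcribe_all` are hypothetical names, and the script location is assumed to be whatever `{baseDir}/scripts/transcribe.sh` resolves to on your machine.

```bash
# Hypothetical: point this at the skill's script ({baseDir}/scripts/transcribe.sh).
TRANSCRIBE_SH="${TRANSCRIBE_SH:-./scripts/transcribe.sh}"

# Map an audio path to a sibling .txt path by replacing the last extension.
transcript_path() {  # transcript_path <audio-file>
  printf '%s.txt\n' "${1%.*}"
}

# Transcribe every given file, writing each transcript next to its source.
# A failure on one file is reported and does not stop the loop.
transcribe_all() {  # transcribe_all <audio-file>...
  for audio in "$@"; do
    "$TRANSCRIBE_SH" "$audio" --out "$(transcript_path "$audio")" \
      || printf 'failed: %s\n' "$audio" >&2
  done
}

# Usage (not executed here):
#   transcribe_all recordings/*.m4a
```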

## How it works

1. Converts audio to WAV (mono, 16 kHz) using ffmpeg
2. Base64 encodes the audio
3. Sends to OpenRouter chat completions with `input_audio` content
4. Extracts transcript from response
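
The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `transcribe.sh`: the function names, the default prompt text, and the `.choices[0].message.content` extraction path are assumptions, and it requires `ffmpeg`, `jq` (1.6 or later for `--rawfile`), and `curl`.

```bash
API_URL="https://openrouter.ai/api/v1/chat/completions"

# Step 1: convert any input to mono 16 kHz WAV.
to_wav() {  # to_wav <input> <output.wav>
  ffmpeg -nostdin -loglevel error -y -i "$1" -ac 1 -ar 16000 "$2"
}

# Step 2: base64-encode a file as one unwrapped line.
encode_b64() {  # encode_b64 <file>
  base64 < "$1" | tr -d '\n'
}

# Step 3a: build the request body; the audio is read from a file, not argv.
build_payload() {  # build_payload <model> <prompt> <b64-file>
  jq -n --arg model "$1" --arg prompt "$2" --rawfile audio "$3" '{
    model: $model,
    messages: [{
      role: "user",
      content: [
        {type: "text", text: $prompt},
        {type: "input_audio", input_audio: {data: $audio, format: "wav"}}
      ]
    }]
  }'
}

# Step 3b: POST the body from a file.
send_request() {  # send_request <payload-file>
  curl -sS "$API_URL" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -H "Content-Type: application/json" \
    --data-binary "@$1"
}

# Step 4: pull the transcript text out of the JSON response on stdin.
extract_transcript() {
  jq -r '.choices[0].message.content // empty'
}

# Tie the steps together in a subshell so the temp directory is always removed.
transcribe() (  # transcribe <audio-file>
  work=$(mktemp -d "${TMPDIR:-/tmp}/transcribe.XXXXXX") || exit 1
  trap 'rm -rf "$work"' EXIT
  to_wav "$1" "$work/audio.wav" &&
  encode_b64 "$work/audio.wav" > "$work/audio.b64" &&
  build_payload "google/gemini-2.5-flash" "Transcribe this audio." \
    "$work/audio.b64" > "$work/payload.json" &&
  send_request "$work/payload.json" | extract_transcript
)
```

Calling `transcribe /path/to/audio.m4a` with `OPENROUTER_API_KEY` set would print the transcript to stdout, mirroring the quick-start behaviour.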

## API key

Set the `OPENROUTER_API_KEY` environment variable, or configure the key in `~/.clawdbot/clawdbot.json`:

```json5
{
  skills: {
    "openrouter-transcribe": {
      apiKey: "YOUR_OPENROUTER_KEY"
    }
  }
}
```
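
A key lookup along these lines might look like the sketch below. It is an assumption that the environment variable takes precedence over the config file; `resolve_api_key` and `CLAWDBOT_CONFIG` are hypothetical names, and the `jq` fallback only works if the file on disk is strict JSON, since `jq` cannot parse JSON5 syntax such as unquoted keys.

```bash
# Hypothetical helper: prints the key, or returns non-zero if none is found.
resolve_api_key() {
  # CLAWDBOT_CONFIG is an assumed override; the default path is from the text above.
  config="${CLAWDBOT_CONFIG:-$HOME/.clawdbot/clawdbot.json}"
  if [ -n "${OPENROUTER_API_KEY:-}" ]; then
    printf '%s\n' "$OPENROUTER_API_KEY"
  elif [ -r "$config" ]; then
    # -e makes jq exit non-zero when the key is missing or null.
    jq -er '.skills["openrouter-transcribe"].apiKey // empty' "$config"
  else
    echo "No OpenRouter API key found" >&2
    return 1
  fi
}
```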

## Headers

The script sends identification headers to OpenRouter:
- `X-Title`: Caller name (default: "Peanut/Clawdbot")
- `HTTP-Referer`: Reference URL (default: "https://clawdbot.com")

These show up in your OpenRouter dashboard for tracking.
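
As a rough sketch of how such headers could be attached to the request (the helper names and argument order are hypothetical; only the header names and defaults come from the list above):

```bash
# Build each header, falling back to the documented default when no value is given.
x_title_header() {  # x_title_header [title]
  printf 'X-Title: %s' "${1:-Peanut/Clawdbot}"
}

referer_header() {  # referer_header [url]
  printf 'HTTP-Referer: %s' "${1:-https://clawdbot.com}"
}

# Send a prepared JSON payload with the identification headers attached.
send_with_headers() {  # send_with_headers <payload-file> [title] [referer]
  curl -sS "https://openrouter.ai/api/v1/chat/completions" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -H "Content-Type: application/json" \
    -H "$(x_title_header "${2:-}")" \
    -H "$(referer_header "${3:-}")" \
    --data-binary "@$1"
}
```

Passing a title as the second argument corresponds to what `--title "MyApp"` does in the script.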

## Troubleshooting

**ffmpeg format errors**: The script uses a temp directory (not `mktemp -t file.wav`) because macOS's mktemp adds random suffixes after the extension, breaking format detection.
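
The temp-directory pattern can be sketched like this; the variable names and the `transcribe.XXXXXX` template are illustrative, not taken from the script:

```bash
# Create a private working directory; the explicit template works with both
# GNU and BSD (macOS) mktemp.
make_workdir() {
  mktemp -d "${TMPDIR:-/tmp}/transcribe.XXXXXX"
}

workdir=$(make_workdir)
trap 'rm -rf "$workdir"' EXIT   # remove the directory when the shell exits

# Pick the file name yourself so it keeps a real .wav extension,
# which lets ffmpeg infer the output format from the name.
wav="$workdir/audio.wav"

# By contrast, `mktemp -t file.wav` on macOS appends its random suffix after
# ".wav", so the resulting name no longer ends in a recognisable extension.
```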

**Argument list too long**: Large audio files produce huge base64 strings that exceed shell argument limits. The script writes to temp files (`--rawfile` for jq, `@file` for curl) instead of passing data as arguments.
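
To get a feel for the sizes involved, here is a small back-of-the-envelope sketch. It assumes 16-bit samples (ffmpeg's usual default for WAV) and ignores the WAV header; `b64_length` is a hypothetical helper.

```bash
# Base64 emits 4 output characters for every 3 input bytes, rounded up.
b64_length() {  # b64_length <byte-count>
  echo $(( ($1 + 2) / 3 * 4 ))
}

# One minute of mono 16 kHz audio at 2 bytes per sample.
wav_bytes=$(( 60 * 16000 * 2 ))

echo "WAV bytes:    $wav_bytes"
echo "base64 chars: $(b64_length "$wav_bytes")"
# The system's limit on total argument size, for comparison.
echo "ARG_MAX:      $(getconf ARG_MAX 2>/dev/null || echo unknown)"
```

Even a short recording produces megabytes of base64 text, which is why the data is handed to jq and curl through files.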

**Empty response**: If you get "Empty response from API", the script will dump the raw response for debugging. Common causes:
- Invalid API key
- Model doesn't support audio input
- Audio file too large or corrupted
