openrouter-transcribe
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).
Best use case
openrouter-transcribe is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).
Teams using openrouter-transcribe should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/openrouter-transcribe/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How openrouter-transcribe Compares
| Feature / Agent | openrouter-transcribe | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# OpenRouter Audio Transcription
Transcribe audio files using OpenRouter's chat completions API with `input_audio` content type. Works with any audio-capable model.
## Quick start
```bash
{baseDir}/scripts/transcribe.sh /path/to/audio.m4a
```
Output goes to stdout.
## Useful flags
```bash
# Custom model (default: google/gemini-2.5-flash)
{baseDir}/scripts/transcribe.sh audio.ogg --model openai/gpt-4o-audio-preview
# Custom instructions
{baseDir}/scripts/transcribe.sh audio.m4a --prompt "Transcribe with speaker labels"
# Save to file
{baseDir}/scripts/transcribe.sh audio.m4a --out /tmp/transcript.txt
# Custom caller identifier (for OpenRouter dashboard)
{baseDir}/scripts/transcribe.sh audio.m4a --title "MyApp"
```
## How it works
1. Converts audio to WAV (mono, 16kHz) using ffmpeg
2. Base64 encodes the audio
3. Sends to OpenRouter chat completions with `input_audio` content
4. Extracts transcript from response
## API key
Set `OPENROUTER_API_KEY` env var, or configure in `~/.clawdbot/clawdbot.json`:
```json5
{
skills: {
"openrouter-transcribe": {
apiKey: "YOUR_OPENROUTER_KEY"
}
}
}
```
## Headers
The script sends identification headers to OpenRouter:
- `X-Title`: Caller name (default: "Peanut/Clawdbot")
- `HTTP-Referer`: Reference URL (default: "https://clawdbot.com")
These show up in your OpenRouter dashboard for tracking.
## Troubleshooting
**ffmpeg format errors**: The script uses a temp directory (not `mktemp -t file.wav`) because macOS's mktemp adds random suffixes after the extension, breaking format detection.
**Argument list too long**: Large audio files produce huge base64 strings that exceed shell argument limits. The script writes to temp files (`--rawfile` for jq, `@file` for curl) instead of passing data as arguments.
**Empty response**: If you get "Empty response from API", the script will dump the raw response for debugging. Common causes:
- Invalid API key
- Model doesn't support audio input
- Audio file too large or corruptedRelated Skills
assemblyai-transcribe
Transcribe audio/video with AssemblyAI (local upload or URL), plus subtitles + paragraph/sentence exports.
portfolio-watcher
Monitor stock/crypto holdings, get price alerts, track portfolio performance
portainer
Control Docker containers and stacks via Portainer API. List containers, start/stop/restart, view logs, and redeploy stacks from git.
portable-tools
Build cross-device tools without hardcoding paths or account names
polymarket
Trade prediction markets on Polymarket. Analyze odds, place bets, track positions, automate alerts, and maximize returns from event outcomes. Covers sports, politics, entertainment, and more.
polymarket-traiding-bot
No description provided.
polymarket-analysis
Analyze Polymarket prediction markets for trading edges. Pair Cost arbitrage, whale tracking, sentiment analysis, momentum signals, user profile tracking. No execution.
polymarket-agent
Autonomous prediction market agent - analyzes markets, researches news, and identifies trading opportunities
polymarket-5
Query Polymarket prediction markets. Use for questions about prediction markets, betting odds, market prices, event probabilities, or when user asks about Polymarket data.
polymarket-4
Query Polymarket prediction markets. Use for questions about prediction markets, betting odds, market prices, event probabilities, or when user asks about Polymarket data.
polymarket-3
Query Polymarket prediction market odds and events via CLI. Search for markets, get current prices, list events by category. Supports sports betting (NFL, NBA, soccer/EPL, Champions League), politics, crypto, elections, geopolitics. Real money markets = more accurate than polls. No API key required. Use when asked about odds, probabilities, predictions, or "what are the chances of X".
polymarket-2
Query Polymarket prediction markets - check odds, trending markets, search events, track prices.