elevenlabs-stt
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Best use case
elevenlabs-stt is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Teams using elevenlabs-stt should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/elevenlabs-stt/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How elevenlabs-stt Compares
| Feature / Agent | elevenlabs-stt | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# ElevenLabs Speech-to-Text
Transcribe audio files using ElevenLabs' Scribe v2 model. Supports 90+ languages with speaker diarization.
## Quick Start
```bash
# Basic transcription
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3
# With speaker diarization
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --diarize
# Specify language (improves accuracy)
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --lang en
# Full JSON output with timestamps
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --json
```
## Options
| Flag | Description |
|------|-------------|
| `--diarize` | Identify different speakers |
| `--lang CODE` | ISO language code (e.g., en, pt, es) |
| `--json` | Output full JSON with word timestamps |
| `--events` | Tag audio events (laughter, music, etc.) |
## Supported Formats
All major audio/video formats: mp3, m4a, wav, ogg, webm, mp4, etc.
## API Key
Set `ELEVENLABS_API_KEY` environment variable, or configure in clawdbot.json:
```json5
{
skills: {
entries: {
"elevenlabs-stt": {
apiKey: "sk_..."
}
}
}
}
```
## Examples
```bash
# Transcribe a WhatsApp voice note
{baseDir}/scripts/transcribe.sh ~/Downloads/voice_note.ogg
# Meeting recording with multiple speakers
{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize --lang en
# Get JSON for processing
{baseDir}/scripts/transcribe.sh podcast.mp3 --json > transcript.json
```Related Skills
miranda-elevenlabs-speech
Text-to-Speech and Speech-to-Text using ElevenLabs AI.
elevenlabs-cli
CLI for ElevenLabs AI audio platform - text-to-speech, speech-to-text, voice cloning.
elevenlabs-phone-reminder-lite
Build AI phone call reminders with ElevenLabs Conversational AI + Twilio. Free starter guide.
elevenlabs-ai
OpenClaw skill for ElevenLabs APIs: text-to-speech, speech-to-speech, realtime speech-to-text, voices/models.
elevenlabs-music
Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated lyrics, instrumental tracks, and multiple genres/styles. Requires paid ElevenLabs plan.
elevenlabs-speech
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
paylock
Non-custodial SOL escrow for AI agent deals.
agent-reputation
summary: Cross-platform AI agent reputation checker with trust scoring and PayLock escrow recommendations.
Telecom Agent Skill
Turn your AI Agent into a Telecom Operator. Bulk calling, ChatOps, and Field Monitoring.
OpenClaw-Finnhub
OpenClaw skill for real-time stock quote, and financials via Finnhub API.
```markdown
# OpenClaw-Last.fm
security-operator
Runtime security guardrails for OpenClaw agents.