elevenlabs-stt

Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).

7 stars

Best use case

elevenlabs-stt is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using elevenlabs-stt should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/elevenlabs-stt/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/clawdbotborges/elevenlabs-stt/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/elevenlabs-stt/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill
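
The first two manual steps can be sketched as a small shell function. This is only an illustration: `install_skill` and `SKILL_URL` are hypothetical names, the URL is the one from the curl command above, and the optional second argument exists only so the download source can be overridden.

```bash
# SKILL_URL is copied from the install command above.
SKILL_URL="https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/clawdbotborges/elevenlabs-stt/SKILL.md"

# install_skill PROJECT_DIR [URL]: hypothetical helper, not part of the skill.
# Creates .claude/skills/elevenlabs-stt inside the project and downloads SKILL.md into it.
install_skill() {
  dest="$1/.claude/skills/elevenlabs-stt"
  mkdir -p "$dest" || return 1
  # -f makes curl fail on HTTP errors instead of saving an error page.
  curl -fsSL -o "$dest/SKILL.md" "${2:-$SKILL_URL}"
}
```

Running `install_skill .` from the project root would place the file where step 2 expects it; the agent still needs a restart afterwards.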

How elevenlabs-stt Compares

| Feature / Agent | elevenlabs-stt | Standard Approach |
|-----------------|----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).

Where can I find the source code?

The source code is on GitHub in the Demerzels-lab/elsamultiskillagent repository, under public/skills/clawdbotborges/elevenlabs-stt.

SKILL.md Source

# ElevenLabs Speech-to-Text

Transcribe audio files using ElevenLabs' Scribe v2 model. Supports 90+ languages with speaker diarization.

## Quick Start

```bash
# Basic transcription
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3

# With speaker diarization
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --diarize

# Specify language (improves accuracy)
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --lang en

# Full JSON output with timestamps
{baseDir}/scripts/transcribe.sh /path/to/audio.mp3 --json
```

## Options

| Flag | Description |
|------|-------------|
| `--diarize` | Identify different speakers |
| `--lang CODE` | ISO language code (e.g., en, pt, es) |
| `--json` | Output full JSON with word timestamps |
| `--events` | Tag audio events (laughter, music, etc.) |

## Supported Formats

All major audio/video formats: mp3, m4a, wav, ogg, webm, mp4, etc.

## API Key

Set the `ELEVENLABS_API_KEY` environment variable, or configure the key in `clawdbot.json`:

```json5
{
  skills: {
    entries: {
      "elevenlabs-stt": {
        apiKey: "sk_..."
      }
    }
  }
}
```
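
For the environment-variable route, a minimal sketch of exporting the key and checking for it before a run might look like the following. `require_api_key` is a hypothetical helper and not part of the skill; the key value is a placeholder.

```bash
# require_api_key: hypothetical guard that fails early with a clear message
# when ELEVENLABS_API_KEY is empty or unset.
require_api_key() {
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi
}

# Typical usage in a shell session or profile (placeholder value):
#   export ELEVENLABS_API_KEY="sk_..."
#   require_api_key && {baseDir}/scripts/transcribe.sh /path/to/audio.mp3
```

Keeping the key in the environment avoids writing it into a config file that could end up in version control.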

## Examples

```bash
# Transcribe a WhatsApp voice note
{baseDir}/scripts/transcribe.sh ~/Downloads/voice_note.ogg

# Meeting recording with multiple speakers
{baseDir}/scripts/transcribe.sh meeting.mp3 --diarize --lang en

# Get JSON for processing
{baseDir}/scripts/transcribe.sh podcast.mp3 --json > transcript.json
```
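
To process a folder of voice notes in one pass, the single-file commands above could be wrapped in a loop. The sketch below assumes the script prints the transcript to stdout, as the `--json > transcript.json` example suggests. `transcribe_dir` and `TRANSCRIBE` are hypothetical names; `TRANSCRIBE` stands in for `{baseDir}/scripts/transcribe.sh`.

```bash
# TRANSCRIBE is a stand-in for {baseDir}/scripts/transcribe.sh; point it at your copy.
TRANSCRIBE="${TRANSCRIBE:-./scripts/transcribe.sh}"

# transcribe_dir DIR [FLAGS...]: hypothetical helper that writes one .txt file
# next to each .ogg file in DIR, passing any extra flags through to the script.
transcribe_dir() {
  dir="$1"
  shift
  for f in "$dir"/*.ogg; do
    [ -e "$f" ] || continue        # skip when the glob matches nothing
    out="${f%.ogg}.txt"
    if "$TRANSCRIBE" "$f" "$@" > "$out"; then
      echo "ok: $f"
    else
      echo "failed: $f" >&2
      rm -f "$out"                 # do not leave a partial transcript behind
    fi
  done
}
```

For example, `transcribe_dir ~/Downloads --lang en` would transcribe every `.ogg` file in `~/Downloads` with the language flag applied to each call.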

Related Skills

miranda-elevenlabs-speech

7
from Demerzels-lab/elsamultiskillagent

Text-to-Speech and Speech-to-Text using ElevenLabs AI.

elevenlabs-cli

7
from Demerzels-lab/elsamultiskillagent

CLI for ElevenLabs AI audio platform - text-to-speech, speech-to-text, voice cloning.

elevenlabs-phone-reminder-lite

7
from Demerzels-lab/elsamultiskillagent

Build AI phone call reminders with ElevenLabs Conversational AI + Twilio. Free starter guide.

elevenlabs-ai

7
from Demerzels-lab/elsamultiskillagent

OpenClaw skill for ElevenLabs APIs: text-to-speech, speech-to-speech, realtime speech-to-text, voices/models.

elevenlabs-music

7
from Demerzels-lab/elsamultiskillagent

Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated lyrics, instrumental tracks, and multiple genres/styles. Requires paid ElevenLabs plan.

elevenlabs-speech

7
from Demerzels-lab/elsamultiskillagent

Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.

paylock

7
from Demerzels-lab/elsamultiskillagent

Non-custodial SOL escrow for AI agent deals.

agent-reputation

7
from Demerzels-lab/elsamultiskillagent

Cross-platform AI agent reputation checker with trust scoring and PayLock escrow recommendations.

Telecom Agent Skill

7
from Demerzels-lab/elsamultiskillagent

Turn your AI Agent into a Telecom Operator. Bulk calling, ChatOps, and Field Monitoring.

OpenClaw-Finnhub

7
from Demerzels-lab/elsamultiskillagent

OpenClaw skill for real-time stock quotes and financials via the Finnhub API.

OpenClaw-Last.fm

7
from Demerzels-lab/elsamultiskillagent

security-operator

7
from Demerzels-lab/elsamultiskillagent

Runtime security guardrails for OpenClaw agents.