transcribe

Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.

7 stars

Best use case

transcribe is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using transcribe should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/transcribe/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/javicasper/transcribe/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/transcribe/SKILL.md inside your project
  3. Restart your AI agent; it will auto-discover the skill

How transcribe Compares

Feature / Agent         | transcribe    | Standard Approach
Platform Support        | Not specified | Limited / Varies
Context Awareness       | High          | Baseline
Installation Complexity | Unknown       | N/A

Frequently Asked Questions

What does this skill do?

Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Transcribe

Local audio transcription using faster-whisper in Docker.

## Installation

```bash
cd /path/to/skills/transcribe/scripts
chmod +x install.sh
./install.sh
```

This builds the Docker image `whisper:local` and installs the `transcribe` CLI.

## Usage

```bash
transcribe /path/to/audio.mp3 [language]
```

- Default language: `es` (Spanish)
- Use `auto` for auto-detection
- Outputs plain text to stdout
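
Because the transcript goes to stdout, it can be redirected like the output of any other command. The following is a minimal sketch rather than part of the skill: `transcribe_to_txt` and the `TRANSCRIBE_BIN` override are hypothetical names introduced for illustration, and the sketch assumes `transcribe` exits non-zero on failure, which the source does not state.

```bash
# Hypothetical helper (not shipped with the skill): write the transcript
# next to the audio file as <name>.txt.
# TRANSCRIBE_BIN is an assumed override so the CLI can be swapped or stubbed.
transcribe_to_txt() {
  local audio="$1" lang="${2:-es}"   # es mirrors the documented default
  local out="${audio%.*}.txt"        # replace the file extension with .txt
  if "${TRANSCRIBE_BIN:-transcribe}" "$audio" "$lang" > "$out"; then
    printf '%s\n' "$out"             # report where the transcript was written
  else
    rm -f -- "$out"                  # assumption: non-zero exit means failure
    return 1
  fi
}
```

For example, `transcribe_to_txt /tmp/meeting.mp3 en` would write the transcript to `/tmp/meeting.txt` and print that path.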

## Examples

```bash
transcribe /tmp/voice.ogg          # Spanish (default)
transcribe /tmp/meeting.mp3 en     # English
transcribe /tmp/audio.m4a auto     # Auto-detect
```

## Supported Formats

mp3, m4a, ogg, wav, webm, flac, aac
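
A caller may want to reject unsupported files before starting a container. The sketch below checks a file's extension against the list above; `is_supported_audio` is a hypothetical helper, not something the skill provides, and it looks only at the file name, not at the actual audio content.

```bash
# Hypothetical helper: succeed only if the file name ends in one of the
# extensions listed under "Supported Formats" (case-insensitive).
is_supported_audio() {
  local ext
  case "$1" in
    *.*) ;;              # has an extension, keep going
    *) return 1 ;;       # no dot at all: treat as unsupported
  esac
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    mp3|m4a|ogg|wav|webm|flac|aac) return 0 ;;
    *) return 1 ;;
  esac
}
```

It could be used as a guard, for example `is_supported_audio "$file" && transcribe "$file"`.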

## When Receiving Voice Messages

1. Save the audio attachment to a temp file
2. Run `transcribe <path>`
3. Include the transcription in your response
4. Clean up the temp file
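
The four steps above can be sketched as a single shell function. This is an illustration under stated assumptions, not the skill's own code: `transcribe_attachment` and `TRANSCRIBE_BIN` are hypothetical names, copying a local file stands in for "saving the attachment", and the sketch assumes `transcribe` signals failure with a non-zero exit status.

```bash
# Hypothetical sketch of the voice-message workflow.
# TRANSCRIBE_BIN is an assumed override; it defaults to the `transcribe` CLI.
transcribe_attachment() {
  local src="$1" lang="${2:-es}"     # es mirrors the documented default
  local workdir text status

  # 1. Save the attachment to a temp location (a private temp directory,
  #    so the original file name and extension are preserved).
  workdir="$(mktemp -d)" || return 1
  cp -- "$src" "$workdir/" || { rm -rf -- "$workdir"; return 1; }

  # 2. Run transcribe and capture the plain-text output from stdout.
  text="$("${TRANSCRIBE_BIN:-transcribe}" "$workdir/$(basename -- "$src")" "$lang")"
  status=$?

  # 4. Clean up the temp copy whether or not transcription succeeded.
  rm -rf -- "$workdir"

  # 3. Hand the transcription back to the caller for inclusion in a response.
  [ "$status" -eq 0 ] || return "$status"
  printf '%s\n' "$text"
}
```

Cleanup runs before the result is printed so the temp copy is removed even when transcription fails.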

## Files

- `scripts/transcribe` - CLI wrapper (bash)
- `scripts/install.sh` - Installation script (includes Dockerfile inline)

## Notes

- Model: `small` (fast). To use `large-v3` (more accurate), edit `install.sh`.
- Fully local, no API key needed
