assemblyai-transcribe

Transcribe podcast and audio files with speaker diarization using the AssemblyAI API. Use when the user wants to: (1) transcribe a podcast or audio file with AssemblyAI, (2) get speaker-labeled transcripts (who said what), (3) diarize audio to identify different speakers, (4) generate SRT subtitles from audio. Triggers on: "assemblyai", "transcribe with assemblyai", "diarize podcast", "assemblyai transcribe".

by tdhopper

Best use case

assemblyai-transcribe is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using assemblyai-transcribe should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/assemblyai-transcribe/SKILL.md --create-dirs "https://raw.githubusercontent.com/tdhopper/dotfiles2.0/main/.claude/skills/assemblyai-transcribe/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/assemblyai-transcribe/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How assemblyai-transcribe Compares

| Feature / Agent | assemblyai-transcribe | Standard Approach |
|------|-------------|-------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

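What does this skill do?

It transcribes podcast and audio files with speaker diarization using the AssemblyAI API, and can produce speaker-labeled transcripts, plain text, or SRT subtitles.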

Where can I find the source code?

The source code is on GitHub in the tdhopper/dotfiles2.0 repository, under .claude/skills/assemblyai-transcribe/.

SKILL.md Source

# Podcast Transcription with AssemblyAI

Transcribe audio files with speaker diarization using `scripts/transcribe.py`.

## Requirements

- Set the `ASSEMBLYAI_API_KEY` environment variable
- Dependencies are installed automatically via `uv run`
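
One way a script can meet both requirements is sketched below. This is a guess at the mechanism, not the contents of `scripts/transcribe.py`: the inline metadata header (PEP 723) is what lets `uv run` install dependencies on the fly, the `requests` dependency is only a placeholder, and `get_api_key` is a hypothetical helper name.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["requests"]  # placeholder; the real script's dependencies are not shown here
# ///
import os


def get_api_key(env=os.environ):
    """Return the AssemblyAI key, or raise a clear error if it is unset."""
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the ASSEMBLYAI_API_KEY environment variable first")
    return key
```

Failing early with a readable message avoids uploading a large file only to have the request rejected for a missing key.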

## Supported Formats

WAV, MP3, AIFF, AAC, OGG, FLAC, M4A, WMA, WEBM
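
A quick pre-flight check against this list might look like the following sketch. The helper names `is_url` and `has_supported_extension` are hypothetical, and checking the file extension is only a heuristic; whether the actual script validates input this way is not stated.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the supported formats listed above.
SUPPORTED = {".wav", ".mp3", ".aiff", ".aac", ".ogg", ".flac", ".m4a", ".wma", ".webm"}


def is_url(source):
    """Treat http(s) sources as URLs; everything else is a local path."""
    return urlparse(source).scheme in ("http", "https")


def has_supported_extension(source):
    """Check the extension of a local path or of a URL's path component."""
    path = urlparse(source).path if is_url(source) else source
    return PurePosixPath(path).suffix.lower() in SUPPORTED
```

For a URL, only the path component is inspected, so a query string such as `?token=...` does not affect the result.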

## Usage

Transcribe a local file with speaker diarization (default):
```bash
uv run scripts/transcribe.py /path/to/podcast.mp3
```

Transcribe from a URL:
```bash
uv run scripts/transcribe.py https://example.com/podcast.mp3
```

Save to file:
```bash
uv run scripts/transcribe.py /path/to/podcast.mp3 -o transcript.txt
```

Specify expected number of speakers:
```bash
uv run scripts/transcribe.py /path/to/podcast.mp3 -n 3
```

Plain text output (no speaker labels):
```bash
uv run scripts/transcribe.py /path/to/podcast.mp3 --no-diarize -f text
```

SRT subtitle format:
```bash
uv run scripts/transcribe.py /path/to/podcast.mp3 -f srt -o subtitles.srt
```

## Options

| Flag | Description |
|------|-------------|
| `-o, --output` | Output file path (default: stdout) |
| `-f, --format` | Output format: `diarized` (default), `text`, `srt` |
| `--no-diarize` | Disable speaker diarization |
| `-n, --speakers` | Expected number of speakers (helps accuracy) |
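
For readers who want to see how these flags could map onto a parser, here is a minimal `argparse` sketch. It mirrors the table above, but the positional argument name `audio`, the defaults, and the help strings are assumptions rather than the script's actual code.

```python
import argparse


def build_parser():
    """Build a CLI parser mirroring the options table (illustrative only)."""
    parser = argparse.ArgumentParser(description="Transcribe audio with AssemblyAI")
    parser.add_argument("audio", help="Local file path or public URL")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (default: stdout)")
    parser.add_argument("-f", "--format", choices=["diarized", "text", "srt"],
                        default="diarized", help="Output format")
    parser.add_argument("--no-diarize", action="store_true",
                        help="Disable speaker diarization")
    parser.add_argument("-n", "--speakers", type=int, default=None,
                        help="Expected number of speakers")
    return parser


args = build_parser().parse_args(["podcast.mp3", "-f", "srt", "-n", "3"])
print(args.format, args.speakers, args.no_diarize)  # srt 3 False
```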

## Output Formats

- **diarized** (default): `[MM:SS] Speaker A: text` with blank lines between utterances
- **text**: Plain transcript without speaker labels or timestamps
- **srt**: SRT subtitle format with speaker labels
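
The two timestamped formats can be sketched as below. The sketch assumes each utterance is a dict with `speaker`, `text`, and millisecond `start`/`end` fields, which matches how AssemblyAI reports utterances as far as I know; the function names are hypothetical and the real script may format things differently.

```python
def mmss(ms):
    """Format milliseconds as MM:SS (minutes keep counting past 59)."""
    total_seconds = ms // 1000
    return f"{total_seconds // 60:02d}:{total_seconds % 60:02d}"


def srt_time(ms):
    """Format milliseconds as the SRT timestamp HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_diarized(utterances):
    """One '[MM:SS] Speaker X: text' line per utterance, blank line between."""
    lines = [f"[{mmss(u['start'])}] Speaker {u['speaker']}: {u['text']}"
             for u in utterances]
    return "\n\n".join(lines)


def to_srt(utterances):
    """Numbered SRT cues with the speaker label prefixed to the text."""
    cues = [
        f"{i}\n{srt_time(u['start'])} --> {srt_time(u['end'])}\n"
        f"Speaker {u['speaker']}: {u['text']}"
        for i, u in enumerate(utterances, start=1)
    ]
    return "\n\n".join(cues) + "\n"


sample = [
    {"speaker": "A", "text": "Welcome to the show.", "start": 0, "end": 2500},
    {"speaker": "B", "text": "Thanks for having me.", "start": 65000, "end": 67250},
]
print(to_diarized(sample))
print(to_srt(sample))
```

With the sample data, the second diarized line starts with `[01:05] Speaker B:` and the second SRT cue spans `00:01:05,000 --> 00:01:07,250`.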

## Notes

- Local files are first uploaded to AssemblyAI's servers, then transcribed
- URLs are passed directly (the audio must be publicly accessible)
- Polling interval is 5 seconds; long audio files may take several minutes
- By default, AssemblyAI detects up to 10 speakers; use `-n` to hint if you know the count
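
The request and polling behavior described in these notes can be sketched as follows. The helper names are hypothetical, and the field names (`audio_url`, `speaker_labels`, `speakers_expected`) and status values (`completed`, `error`) reflect my understanding of AssemblyAI's REST API rather than anything shown in this skill. `fetch_status` stands in for whatever call retrieves the transcript JSON, so the loop can be exercised without network access.

```python
import time


def build_request(audio_url, diarize=True, speakers=None):
    """Build the JSON body for a transcription request (assumed field names)."""
    payload = {"audio_url": audio_url, "speaker_labels": diarize}
    if diarize and speakers is not None:
        payload["speakers_expected"] = speakers  # hint only, mirrors the -n flag
    return payload


def wait_for_transcript(fetch_status, interval=5.0, sleep=time.sleep):
    """Poll until the transcript completes, sleeping `interval` seconds between checks."""
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(interval)  # any other status (e.g. queued, processing): wait and retry
```

Passing `sleep` as a parameter keeps the five-second default while letting a test substitute a no-op.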
