whisper

OpenAI Whisper speech recognition. Use for speech-to-text.

by G1Joshi

Best use case

whisper is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using whisper should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/whisper/SKILL.md --create-dirs "https://raw.githubusercontent.com/G1Joshi/Agent-Skills/main/skills/ai-ml/whisper/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/whisper/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How whisper Compares

Feature / Agent	whisper	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

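It guides an AI agent in using OpenAI's Whisper for speech-to-text: transcribing audio, translating speech into English text, and running recognition locally.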

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Whisper

Whisper (OpenAI) is a widely used open-source model for **Speech-to-Text**. It supports 99 languages and can translate speech into English. `large-v3` is the latest and generally the most accurate full-size checkpoint.

## When to Use

- **Transcription**: Creating subtitles for videos.
- **Translation**: Translating audio to English text.
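- **Local Privacy**: The open-source model runs entirely on your own hardware, so sensitive recordings (for example, internal meetings) never leave the machine.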
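
As a rough illustration of the transcription and translation use cases above, the sketch below turns Whisper-style segments into SRT subtitles. It assumes the `openai-whisper` package and `ffmpeg` are installed; the helper names (`format_timestamp`, `segments_to_srt`, `transcribe_to_srt`) and the default model choice are hypothetical, not part of this skill.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Convert segments with "start", "end" and "text" keys into SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


def transcribe_to_srt(audio_path: str, model_name: str = "small",
                      task: str = "transcribe") -> str:
    """Run Whisper locally and return the result as an SRT string.

    Pass task="translate" to request English output. The Whisper README
    notes that the turbo model is not trained for translation, so pick a
    multilingual model such as "small" or "medium" in that case.
    """
    # Imported lazily so the formatting helpers work without Whisper installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, task=task)
    return segments_to_srt(result["segments"])
```

Calling `transcribe_to_srt("meeting.mp3")` (a placeholder file name) would run fully on the local machine, which matches the privacy use case; writing the returned string to a `.srt` file gives subtitles most video players accept.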

## Core Concepts

### Models

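`tiny`, `base`, `small`, `medium`, `large` (checkpoints `large-v1`, `large-v2`, `large-v3`), and `turbo` (`large-v3-turbo`). English-only `.en` variants exist for the four smaller sizes.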
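
To make the trade-off between model sizes concrete, here is a minimal sketch of a picker that chooses a model for a given GPU memory budget. The VRAM and speed figures are the approximate values from the openai/whisper README and may change between releases; the quality ordering and the `pick_model` helper are assumptions made for illustration.

```python
# Approximate figures from the openai/whisper README; treat them as rough guides.
# name -> (approx. VRAM required in GB, approx. speed relative to "large")
MODEL_SPECS = {
    "tiny": (1, 10),
    "base": (1, 7),
    "small": (2, 4),
    "medium": (5, 2),
    "large": (10, 1),
    "turbo": (6, 8),
}

# Assumed ordering from most to least accurate (turbo is a pruned large-v3).
QUALITY_ORDER = ["large", "turbo", "medium", "small", "base", "tiny"]


def pick_model(vram_gb: float, need_translation: bool = False) -> str:
    """Return the most accurate model that fits in the given VRAM budget."""
    for name in QUALITY_ORDER:
        if need_translation and name == "turbo":
            continue  # the README says turbo is not trained for translation
        required_vram, _relative_speed = MODEL_SPECS[name]
        if required_vram <= vram_gb:
            return name
    raise ValueError(f"No Whisper model fits in {vram_gb} GB of VRAM")
```

With these figures, an 8 GB GPU would get `turbo` for plain transcription and fall back to `medium` when translation is required.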

### Distil-Whisper

Distilled versions of Whisper that are smaller and faster: roughly a 6x speedup while staying within about 1% word error rate (WER) of the original model.
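
One common way to run Distil-Whisper is through the Hugging Face `transformers` speech-recognition pipeline. The sketch below assumes `torch` and `transformers` are installed and uses the `distil-whisper/distil-large-v3` checkpoint; the helper names, batch size, and chunk length are illustrative choices rather than settings prescribed by this skill.

```python
def build_pipeline_kwargs(use_gpu: bool) -> dict:
    """Keyword arguments for transformers.pipeline (dtype is added by the caller)."""
    return {
        "task": "automatic-speech-recognition",
        "model": "distil-whisper/distil-large-v3",
        "device": "cuda:0" if use_gpu else "cpu",
    }


def transcribe_with_distil(audio_path: str) -> str:
    """Transcribe one audio file with Distil-Whisper and return the text."""
    # Imported lazily so the configuration helper works without these packages.
    import torch
    from transformers import pipeline

    use_gpu = torch.cuda.is_available()
    asr = pipeline(
        **build_pipeline_kwargs(use_gpu),
        # Half precision on GPU, full precision on CPU.
        torch_dtype=torch.float16 if use_gpu else torch.float32,
    )
    # Chunking lets the pipeline handle audio longer than 30 seconds;
    # 25 s chunks and a batch size of 8 are assumptions, tune them as needed.
    result = asr(audio_path, chunk_length_s=25, batch_size=8)
    return result["text"]
```

The distilled checkpoints target English transcription, so a full multilingual Whisper model is likely the safer choice for other languages or for translation.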

## Best Practices (2025)

**Do**:

- **Use `insanely-fast-whisper`**: A CLI wrapper that uses batching and Flash Attention to transcribe roughly 2 hours of audio in about 2 minutes on a high-end GPU.
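- **Use the API for streaming**: The OpenAI API can stream transcription results as they are produced. Check which hosted models support it, since streaming is not available for `whisper-1`.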

**Don't**:

- **Don't use `large` for realtime**: It's too slow. Use `turbo` or `distil` models.
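
The speed claims above can be compared with the real-time factor (RTF): processing time divided by audio duration. The sketch below is a minimal calculation under the assumption that a model is usable for live audio only when its RTF stays well below 1; the `headroom` threshold is a hypothetical value, not a benchmark result.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def keeps_up_with_live_audio(processing_seconds: float, audio_seconds: float,
                             headroom: float = 0.5) -> bool:
    """True if the RTF leaves spare capacity for live use (threshold is an assumption)."""
    return real_time_factor(processing_seconds, audio_seconds) <= headroom


# "2 hours of audio in 2 minutes" corresponds to 120 s / 7200 s,
# an RTF of about 0.017, or roughly 60x faster than real time.
batch_rtf = real_time_factor(2 * 60, 2 * 60 * 60)
```

Measuring the RTF of `large`, `turbo`, and a distilled model on your own hardware is a more reliable guide than published figures, since throughput depends heavily on the GPU and batch size.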

## References

- [Whisper GitHub](https://github.com/openai/whisper)