parakeet-stt

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

533 stars

Best use case

parakeet-stt is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using parakeet-stt should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/parakeet-stt/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/awesome-openclaw-skills/main/skills/parakeet-stt/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/parakeet-stt/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How parakeet-stt Compares

| Feature / Agent | parakeet-stt | Standard Approach |
|-----------------|--------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.

Where can I find the source code?

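The skill's SKILL.md is hosted on GitHub in the sundial-org/awesome-openclaw-skills repository, and the transcription server it calls lives in the groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai repository.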

SKILL.md Source

# Parakeet TDT (Speech-to-Text)

Local transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime.
Runs on CPU — no GPU required. ~30x faster than realtime.

## Installation

```bash
# Clone the repo
git clone https://github.com/groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai.git
cd parakeet-tdt-0.6b-v3-fastapi-openai

# Run with Docker (recommended)
docker compose up -d parakeet-cpu

# Or run directly with Python
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 5000
```

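The server listens on port `5000` by default. If your instance is reachable at a different address, set `PARAKEET_URL` to its base URL (e.g., `http://localhost:5092`) so the commands below point at it.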

## API Endpoint

OpenAI-compatible API at `$PARAKEET_URL` (default: `http://localhost:5000`).

## Quick Start

```bash
# Fall back to the default address when PARAKEET_URL is unset
PARAKEET_URL="${PARAKEET_URL:-http://localhost:5000}"

# Transcribe audio file (plain text)
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=text"

# Get timestamps and segments
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=verbose_json"

# Generate subtitles (SRT)
curl -X POST "$PARAKEET_URL/v1/audio/transcriptions" \
  -F "file=@/path/to/audio.mp3" \
  -F "response_format=srt"
```
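
If neither `curl` nor the OpenAI SDK is available, the same multipart upload can be approximated with the Python standard library. The following is a minimal sketch, not part of the original skill: the helper names `encode_multipart` and `transcribe` are hypothetical, and it assumes the server accepts the file part with an `application/octet-stream` content type.

```python
import os
import urllib.request
import uuid


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Plain form fields, e.g. response_format=text
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The audio file goes in a part named "file", as in the curl examples
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, response_format="text"):
    """POST an audio file to the transcription endpoint and return the raw body."""
    base = (os.environ.get("PARAKEET_URL") or "http://localhost:5000").rstrip("/")
    with open(path, "rb") as f:
        body, content_type = encode_multipart(
            {"response_format": response_format}, os.path.basename(path), f.read()
        )
    request = urllib.request.Request(
        f"{base}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("/path/to/audio.mp3", "srt")` requires a running server; `encode_multipart` can be exercised on its own.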

## Python / OpenAI SDK

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("PARAKEET_URL", "http://localhost:5000") + "/v1",
    api_key="not-needed"
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=f,
        response_format="text"
    )
print(transcript)
```
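
To keep subtitles next to the audio file, the SDK call above can be wrapped in a small helper. This is a sketch under assumptions: `save_subtitles` and the extension mapping are hypothetical names, and it assumes the SDK hands back a plain string for the `text`, `srt`, and `vtt` formats, so it converts the result with `str()` either way. The `client` argument is the `OpenAI` client built as shown above.

```python
from pathlib import Path

# Assumed mapping from text-based response formats to file extensions
TEXT_EXTENSIONS = {"text": ".txt", "srt": ".srt", "vtt": ".vtt"}


def save_subtitles(client, audio_path, response_format="srt",
                   model="parakeet-tdt-0.6b-v3"):
    """Transcribe audio_path and write the result beside it; return the new path."""
    if response_format not in TEXT_EXTENSIONS:
        raise ValueError(f"unsupported text format: {response_format!r}")
    audio_path = Path(audio_path)
    out_path = audio_path.with_suffix(TEXT_EXTENSIONS[response_format])
    with audio_path.open("rb") as f:
        result = client.audio.transcriptions.create(
            model=model,
            file=f,
            response_format=response_format,
        )
    out_path.write_text(str(result), encoding="utf-8")
    return out_path
```

For example, `save_subtitles(client, "audio.mp3", "vtt")` would write `audio.vtt` in the same directory.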

## Response Formats

| Format | Output |
|--------|--------|
| `text` | Plain text |
| `json` | `{"text": "..."}` |
| `verbose_json` | Segments with timestamps and words |
| `srt` | SRT subtitles |
| `vtt` | WebVTT subtitles |
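
The `verbose_json` output can be post-processed into a readable, timestamped transcript. The sketch below assumes the payload follows the OpenAI `verbose_json` layout, with a `segments` list whose items carry `start`, `end` (in seconds), and `text`. The source does not spell out these field names, so check them against a real response before relying on this. The function names and the sample payload are hypothetical.

```python
def format_timestamp(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(payload):
    """Turn a verbose_json-style dict into one '[start - end] text' line per segment."""
    return [
        f"[{format_timestamp(seg['start'])} - {format_timestamp(seg['end'])}] "
        f"{seg['text'].strip()}"
        for seg in payload.get("segments", [])
    ]


# Hypothetical payload for illustration only
sample = {
    "text": "Hello there. How are you?",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 2.25, "text": " How are you?"},
    ],
}
for line in segment_lines(sample):
    print(line)
```

With a real response, parse the JSON body with `json.loads` and pass the resulting dict to `segment_lines`.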

## Supported Languages (25)

English, Spanish, French, German, Italian, Portuguese, Polish, Russian,
Ukrainian, Dutch, Swedish, Danish, Finnish, Norwegian, Greek, Czech,
Romanian, Hungarian, Bulgarian, Slovak, Croatian, Lithuanian, Latvian,
Estonian, Slovenian

Language is auto-detected — no configuration needed.

## Web Interface

Open `$PARAKEET_URL` in a browser for a drag-and-drop transcription UI.

## Docker Management

```bash
# Check status
docker ps --filter "name=parakeet"

# View logs
docker logs -f <container-name>

# Restart
docker compose restart

# Stop
docker compose down
```

## Why Parakeet over Whisper?

- **Speed**: ~30x faster than realtime on CPU
- **Accuracy**: Comparable to Whisper large-v3
- **Privacy**: Runs 100% locally, no cloud calls
- **Compatibility**: Drop-in replacement for OpenAI's transcription API
