youtube-voice-summarizer
Transform YouTube videos into podcast-style voice summaries using ElevenLabs TTS
Best use case
youtube-voice-summarizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using youtube-voice-summarizer should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/youtube-voice-summarizer-elevenlabs/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# YouTube Voice Summarizer
Transform any YouTube video into a professional voice summary delivered in under 60 seconds.
## What It Does
When a user sends a YouTube URL, this skill:
1. Extracts the video transcript via Supadata
2. Generates a concise AI summary via OpenRouter/Cerebras
3. Converts the summary to natural speech via ElevenLabs
4. Returns an audio file the user can listen to
## Requirements
This skill requires a running backend server. Deploy the summarizer service:
```bash
git clone https://github.com/Franciscomoney/elevenlabs-moltbot.git
cd elevenlabs-moltbot
npm install
cp .env.example .env
# Add your API keys to .env
npm start
```
### Required API Keys
| Service | Purpose | Get Key |
|---------|---------|---------|
| ElevenLabs | Text-to-speech | https://elevenlabs.io |
| Supadata | YouTube transcripts | https://supadata.ai |
| OpenRouter | AI summarization | https://openrouter.ai |
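Before running `npm start`, it can help to confirm that all three keys are actually set. The sketch below assumes the keys are exposed as environment variables; the variable names are hypothetical placeholders, since the repository's `.env.example` is not reproduced here and may use different names.

```python
import os

# Hypothetical variable names: check .env.example in the repository for the real ones.
REQUIRED_KEYS = {
    "ELEVENLABS_API_KEY": "ElevenLabs (text-to-speech)",
    "SUPADATA_API_KEY": "Supadata (YouTube transcripts)",
    "OPENROUTER_API_KEY": "OpenRouter (AI summarization)",
}


def missing_keys(env=None):
    """Return the names of required keys that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    for name in missing_keys():
        print(f"Missing {name}: needed for {REQUIRED_KEYS[name]}")
```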
## How to Use
When the user sends a YouTube URL:
### Step 1: Start the voice summary job
```bash
curl -s -X POST http://127.0.0.1:3050/api/summarize \
-H "Content-Type: application/json" \
-d '{"url":"YOUTUBE_URL","length":"short","voice":"podcast"}'
```
Returns: `{"jobId": "job_xxx", "status": "processing"}`
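For callers that prefer a script over curl, the same request could be sketched in Python with only the standard library. The endpoint, fields, and `jobId` response key come from the example above; the function names and the 30-second timeout are illustrative assumptions.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:3050"  # host and port taken from the curl example


def build_summarize_request(video_url, length="short", voice="podcast", base=API_BASE):
    """Build the POST /api/summarize request without sending it."""
    body = json.dumps({"url": video_url, "length": length, "voice": voice})
    return urllib.request.Request(
        f"{base}/api/summarize",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_job(video_url, length="short", voice="podcast", base=API_BASE):
    """Send the request and return the jobId from the JSON response."""
    request = build_summarize_request(video_url, length, voice, base)
    # Assumption: 30 seconds is enough for the server to accept the job.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["jobId"]
```

Keeping request construction separate from sending makes the payload easy to inspect without a running server.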
### Step 2: Poll for completion (wait 3-5 seconds between checks)
```bash
curl -s http://127.0.0.1:3050/api/status/JOB_ID
```
Keep polling until status is "completed".
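The polling loop might look like the following sketch. It waits 4 seconds between checks (inside the 3-5 second range above) and stops after a fixed number of attempts. That cap is an assumption added here so the loop cannot run forever; this document only names the `processing` and `completed` statuses, so any other status is treated as "not done yet".

```python
import json
import time
import urllib.request

API_BASE = "http://127.0.0.1:3050"  # host and port taken from the curl example


def fetch_status(job_id, base=API_BASE):
    """GET /api/status/<job_id> and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base}/api/status/{job_id}", timeout=30) as response:
        return json.load(response)


def wait_for_completion(job_id, fetch=fetch_status, interval=4.0,
                        max_checks=60, sleep=time.sleep):
    """Poll until status is "completed"; give up after max_checks attempts."""
    for attempt in range(max_checks):
        job = fetch(job_id)
        if job.get("status") == "completed":
            return job
        # Skip the pause after the final attempt.
        if attempt < max_checks - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} not completed after {max_checks} checks")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a server or real delays.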
### Step 3: Return the audio to user
When complete, the response includes:
- `result.audioUrl` - The MP3 audio URL (send this to the user!)
- `result.teaser` - Short hook text about the content
- `result.summary` - Full text summary
- `result.keyPoints` - Array of key takeaways
Send the user:
1. The teaser text as a message
2. The audio URL so they can listen
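A minimal sketch of turning a completed status response into the two messages described above. The field names come from the list in this step; `build_reply` is a hypothetical helper name, and skipping an empty teaser is an assumption about how to handle a missing field.

```python
def build_reply(job):
    """Return the messages to send: teaser text first, then the audio URL."""
    result = job["result"]
    messages = []
    teaser = result.get("teaser")
    if teaser:  # assumption: send only the audio URL if no teaser was returned
        messages.append(teaser)
    messages.append(result["audioUrl"])
    return messages
```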
## Voice Options
| Voice | Style |
|-------|-------|
| `podcast` | Deep male narrator (default) |
| `news` | British authoritative |
| `casual` | Friendly conversational |
| `female_warm` | Warm female voice |
## Summary Lengths
| Length | Duration | Best For |
|--------|----------|----------|
| `short` | 1-2 min | Quick overview |
| `medium` | 3-5 min | Balanced detail |
| `detailed` | 5-10 min | Comprehensive |
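Whether the server rejects unknown values is not stated here, so a client may want to check the `voice` and `length` fields before sending a request. The sketch below hard-codes the values from the two tables above; `validate_options` is a hypothetical helper name.

```python
VOICES = {"podcast", "news", "casual", "female_warm"}
LENGTHS = {"short", "medium", "detailed"}


def validate_options(length, voice="podcast"):
    """Raise ValueError for a length or voice not listed in the tables above."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {sorted(VOICES)}")
    if length not in LENGTHS:
        raise ValueError(f"unknown length {length!r}; expected one of {sorted(LENGTHS)}")
    return {"length": length, "voice": voice}
```

Only `voice` has a default because the table marks `podcast` as the default voice, while no default length is documented.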
## Example Flow
User: "Summarize this: https://www.youtube.com/watch?v=dQw4w9WgXcQ"
1. Start job:
```bash
curl -s -X POST http://127.0.0.1:3050/api/summarize \
-H "Content-Type: application/json" \
-d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","length":"short","voice":"podcast"}'
```
2. Poll status with the returned jobId
3. When complete, send the audioUrl to the user
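In the example above the URL is embedded in a free-text message, so the agent first has to pick it out. One way to do that is sketched below. The host list is an assumption about which links users typically paste, and `find_youtube_url` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse

# Assumption: these hosts cover the links users typically paste.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}
URL_PATTERN = re.compile(r"https?://\S+")


def find_youtube_url(message):
    """Return the first YouTube URL in the message, or None if there is none."""
    for candidate in URL_PATTERN.findall(message):
        # Drop trailing punctuation that belongs to the sentence, not the URL.
        url = candidate.rstrip(".,;:!?)\"'")
        host = (urlparse(url).hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return url
    return None
```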
## Text-Only Summary (No Audio)
For faster, cheaper text-only summaries:
```bash
curl -s -X POST http://127.0.0.1:3050/api/quick-summary \
-H "Content-Type: application/json" \
-d '{"url":"YOUTUBE_URL","length":"short"}'
```
## Troubleshooting
**"Video may not have captions"**
- The video needs subtitles enabled on YouTube
- Auto-generated captions may take time on new videos
**Audio URL not working**
- Ensure the `BASE_URL` set in `.env` points to a publicly accessible address
- Check that the firewall allows traffic on port 3050
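To test the port from another machine, a plain TCP connection attempt is usually enough. This sketch uses only the standard library. The 3-second timeout is an arbitrary choice, and a successful connection shows only that something is listening on the port, not that the summarizer itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("127.0.0.1", 3050)` on the server confirms the service is listening locally. Running the same check from outside against the host in `BASE_URL` shows whether the firewall lets traffic through.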
## Cost Per Summary
| Service | Cost |
|---------|------|
| Supadata | ~$0.001 |
| OpenRouter | ~$0.005-0.02 |
| ElevenLabs | ~$0.05-0.15 |
| **Total** | **~$0.06-0.17** |
Related Skills
invoice-tracker-pro
Complete freelance billing workflow — generate professional invoices, track payment status, send automated.
invoice-template
Free simple invoice generator.
youtube-video-downloader
Download YouTube videos in various formats and qualities. Use when you need to save videos for offline viewing, extract audio, download playlists, or get specific video formats.
youtube-thumbnail-grabber
Download YouTube video thumbnails in various resolutions. Use when you need to get video preview images, create collages, or save thumbnails for reference.
youtube-summarize
Summarize YouTube videos by extracting transcripts and captions. Use when you need to get a quick summary of a video, extract key points, or analyze video content without watching it.
kagi-summarizer
Summarize any URL or text using Kagi's Universal Summarizer API.
youtube-editor
Automate YouTube video editing workflow: Download -> Transcribe (Whisper) -> Analyze (GPT-4) -> High-Quality.
voicemonkey
Control Alexa devices via VoiceMonkey API v2 - make announcements, trigger routines, start flows, and display media.
vibevoice
Local Spanish TTS using Microsoft VibeVoice.
percept-voice-cmd
Voice command detection and action execution for OpenClaw agents.
worthclip-youtube-video-scorer
AI-powered YouTube video scoring.
youtube-apify-transcript
Fetch YouTube transcripts via APIFY API (works from cloud IPs, bypasses YouTube bot detection).