youtube-voice-summarizer

Transform YouTube videos into podcast-style voice summaries using ElevenLabs TTS

7 stars

Best use case

youtube-voice-summarizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using youtube-voice-summarizer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/youtube-voice-summarizer-elevenlabs/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/franciscoandsam/youtube-voice-summarizer-elevenlabs/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/youtube-voice-summarizer-elevenlabs/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How youtube-voice-summarizer Compares

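| Feature / Agent | youtube-voice-summarizer | Standard Approach |
|-----------------|--------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |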

Frequently Asked Questions

What does this skill do?

Transform YouTube videos into podcast-style voice summaries using ElevenLabs TTS

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# YouTube Voice Summarizer

Transform any YouTube video into a professional voice summary delivered in under 60 seconds.

## What It Does

When a user sends a YouTube URL, this skill:
1. Extracts the video transcript via Supadata
2. Generates a concise AI summary via OpenRouter/Cerebras
3. Converts the summary to natural speech via ElevenLabs
4. Returns an audio file the user can listen to

## Requirements

This skill requires a running backend server. Deploy the summarizer service:

```bash
git clone https://github.com/Franciscomoney/elevenlabs-moltbot.git
cd elevenlabs-moltbot
npm install
cp .env.example .env
# Add your API keys to .env
npm start
```

### Required API Keys

| Service | Purpose | Get Key |
|---------|---------|---------|
| ElevenLabs | Text-to-speech | https://elevenlabs.io |
| Supadata | YouTube transcripts | https://supadata.ai |
| OpenRouter | AI summarization | https://openrouter.ai |

## How to Use

When the user sends a YouTube URL:

### Step 1: Start the voice summary job

```bash
curl -s -X POST http://127.0.0.1:3050/api/summarize \
  -H "Content-Type: application/json" \
  -d '{"url":"YOUTUBE_URL","length":"short","voice":"podcast"}'
```

Returns: `{"jobId": "job_xxx", "status": "processing"}`
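
The job ID from this response is needed for the status check that follows. A minimal sketch of capturing it in a script is shown here; `start_job` and the `API_BASE` variable are hypothetical names, the default length of `short` is an assumption, and the `sed` pattern assumes the ID and URL contain no double quotes or backslashes.

```bash
# Base address of the summarizer service (the local address used in this document).
API_BASE="${API_BASE:-http://127.0.0.1:3050}"

# Hypothetical helper: start a summary job and print only its job ID.
start_job() {
  local url="$1" length="${2:-short}" voice="${3:-podcast}"
  local response job_id
  response=$(curl -s -X POST "$API_BASE/api/summarize" \
    -H "Content-Type: application/json" \
    -d "{\"url\":\"$url\",\"length\":\"$length\",\"voice\":\"$voice\"}") || return 1
  # Pull the jobId string out of the JSON response.
  job_id=$(printf '%s\n' "$response" |
    sed -n 's/.*"jobId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  # Fail if the response carried no jobId.
  [ -n "$job_id" ] || return 1
  printf '%s\n' "$job_id"
}
```

For example, `JOB_ID=$(start_job "YOUTUBE_URL")` stores the ID for later use, and the function returns a non-zero status when no `jobId` is present.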

### Step 2: Poll for completion (wait 3-5 seconds between checks)

```bash
curl -s http://127.0.0.1:3050/api/status/JOB_ID
```

Keep polling until status is "completed".
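
The polling step can be sketched as a small loop. This is an illustration under stated assumptions rather than the service's own client: `poll_job`, `API_BASE`, `POLL_INTERVAL`, and `MAX_ATTEMPTS` are hypothetical names, and the cap on attempts is an addition not mentioned in the source.

```bash
# Base address of the summarizer service (the local address used in this document).
API_BASE="${API_BASE:-http://127.0.0.1:3050}"

# Hypothetical helper: poll a job until its status is "completed",
# then print the full JSON response.
poll_job() {
  local job_id="$1"
  local interval="${POLL_INTERVAL:-4}"      # seconds; the text suggests 3-5
  local max_attempts="${MAX_ATTEMPTS:-60}"  # assumed cap, not from the source
  local attempt=0 body
  while [ "$attempt" -lt "$max_attempts" ]; do
    # Treat a failed request as an empty response and try again.
    body=$(curl -s "$API_BASE/api/status/$job_id") || body=""
    if printf '%s' "$body" |
      grep -Eq '"status"[[:space:]]*:[[:space:]]*"completed"'; then
      printf '%s\n' "$body"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "job $job_id did not complete after $max_attempts checks" >&2
  return 1
}
```

Capping the number of checks keeps a stuck job from blocking the agent indefinitely; with the defaults above the loop gives up after roughly four minutes.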

### Step 3: Return the audio to the user

When complete, the response includes:
- `result.audioUrl` - The MP3 audio URL (send this to the user!)
- `result.teaser` - Short hook text about the content
- `result.summary` - Full text summary
- `result.keyPoints` - Array of key takeaways

Send the user:
1. The teaser text as a message
2. The audio URL so they can listen
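
As a rough sketch, the two items can be pulled out of the completed response as follows. `json_string_field` and `format_reply` are hypothetical helpers; the `sed` extraction is deliberately simple and assumes the values contain no escaped quotes, so a JSON-aware tool is a safer choice where one is available.

```bash
# Hypothetical helper: print the first string value found for a key.
# Simplistic: breaks on values that contain escaped double quotes.
json_string_field() {
  printf '%s\n' "$1" |
    sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" |
    head -n 1
}

# Build the two-part reply: teaser text first, audio URL second.
format_reply() {
  local teaser audio_url
  teaser=$(json_string_field "$1" teaser)
  audio_url=$(json_string_field "$1" audioUrl)
  # Without an audio URL there is nothing to send yet.
  [ -n "$audio_url" ] || return 1
  printf '%s\n%s\n' "$teaser" "$audio_url"
}
```

Calling `format_reply "$response"` on the completed status response prints the teaser on the first line and the audio URL on the second.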

## Voice Options

| Voice | Style |
|-------|-------|
| `podcast` | Deep male narrator (default) |
| `news` | British authoritative |
| `casual` | Friendly conversational |
| `female_warm` | Warm female voice |

## Summary Lengths

| Length | Duration | Best For |
|--------|----------|----------|
| `short` | 1-2 min | Quick overview |
| `medium` | 3-5 min | Balanced detail |
| `detailed` | 5-10 min | Comprehensive |
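
The source does not say how the server handles unknown `voice` or `length` values, so a client-side check before sending the request may be useful. The sketch below uses only the names from the two tables above; `valid_voice`, `valid_length`, and `build_payload` are hypothetical helpers, and defaulting the length to `short` is an assumption.

```bash
# Accept only the voice names listed in the Voice Options table.
valid_voice() {
  case "$1" in
    podcast|news|casual|female_warm) return 0 ;;
    *) return 1 ;;
  esac
}

# Accept only the length names listed in the Summary Lengths table.
valid_length() {
  case "$1" in
    short|medium|detailed) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the JSON body for /api/summarize,
# or fail before any request is made.
build_payload() {
  local url="$1" length="${2:-short}" voice="${3:-podcast}"
  valid_length "$length" || { echo "unknown length: $length" >&2; return 1; }
  valid_voice "$voice" || { echo "unknown voice: $voice" >&2; return 1; }
  printf '{"url":"%s","length":"%s","voice":"%s"}\n' "$url" "$length" "$voice"
}
```

The output of `build_payload` can be passed to `curl` with `-d`, so a mistyped option fails locally instead of starting a job.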

## Example Flow

User: "Summarize this: https://www.youtube.com/watch?v=dQw4w9WgXcQ"

1. Start job:
```bash
curl -s -X POST http://127.0.0.1:3050/api/summarize \
  -H "Content-Type: application/json" \
  -d '{"url":"https://www.youtube.com/watch?v=dQw4w9WgXcQ","length":"short","voice":"podcast"}'
```

2. Poll status with the returned jobId
3. When complete, send the audioUrl to the user

## Text-Only Summary (No Audio)

For faster, cheaper text-only summaries:

```bash
curl -s -X POST http://127.0.0.1:3050/api/quick-summary \
  -H "Content-Type: application/json" \
  -d '{"url":"YOUTUBE_URL","length":"short"}'
```

## Troubleshooting

**"Video may not have captions"**
- The video needs subtitles enabled on YouTube
- Auto-generated captions may take time on new videos

**Audio URL not working**
- Ensure the address set as `BASE_URL` in `.env` is publicly accessible
- Check that your firewall allows traffic on port 3050

## Cost Per Summary

| Service | Cost |
|---------|------|
| Supadata | ~$0.001 |
| OpenRouter | ~$0.005-0.02 |
| ElevenLabs | ~$0.05-0.15 |
| **Total** | **~$0.06-0.17** |
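
The totals are the sum of the per-service figures: 0.001 + 0.005 + 0.05 = 0.056 at the low end and 0.001 + 0.02 + 0.15 = 0.171 at the high end. To scale these to a batch of summaries, a small calculation like the following may help; `estimate_cost` is a hypothetical helper and the figures are the approximate ones from the table above.

```bash
# Print the low and high cost estimates, in US dollars, for N summaries.
estimate_cost() {
  awk -v n="$1" 'BEGIN {
    low  = 0.001 + 0.005 + 0.05   # Supadata + OpenRouter (low) + ElevenLabs (low)
    high = 0.001 + 0.02  + 0.15   # Supadata + OpenRouter (high) + ElevenLabs (high)
    printf "%.2f %.2f\n", low * n, high * n
  }'
}
```

For example, `estimate_cost 100` prints `5.60 17.10`, meaning 100 summaries would cost roughly $5.60 to $17.10 at these rates.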

Related Skills

invoice-tracker-pro

7
from Demerzels-lab/elsamultiskillagent

Complete freelance billing workflow — generate professional invoices, track payment status, send automated.

invoice-template

7
from Demerzels-lab/elsamultiskillagent

Free simple invoice generator.

youtube-video-downloader

7
from Demerzels-lab/elsamultiskillagent

Download YouTube videos in various formats and qualities. Use when you need to save videos for offline viewing, extract audio, download playlists, or get specific video formats.

youtube-thumbnail-grabber

7
from Demerzels-lab/elsamultiskillagent

Download YouTube video thumbnails in various resolutions. Use when you need to get video preview images, create collages, or save thumbnails for reference.

youtube-summarize

7
from Demerzels-lab/elsamultiskillagent

Summarize YouTube videos by extracting transcripts and captions. Use when you need to get a quick summary of a video, extract key points, or analyze video content without watching it.

kagi-summarizer

7
from Demerzels-lab/elsamultiskillagent

Summarize any URL or text using Kagi's Universal Summarizer API.

youtube-editor

7
from Demerzels-lab/elsamultiskillagent

Automate YouTube video editing workflow: Download -> Transcribe (Whisper) -> Analyze (GPT-4) -> High-Quality.

voicemonkey

7
from Demerzels-lab/elsamultiskillagent

Control Alexa devices via VoiceMonkey API v2 - make announcements, trigger routines, start flows, and display media.

vibevoice

7
from Demerzels-lab/elsamultiskillagent

Local Spanish TTS using Microsoft VibeVoice.

percept-voice-cmd

7
from Demerzels-lab/elsamultiskillagent

Voice command detection and action execution for OpenClaw agents.

worthclip-youtube-video-scorer

7
from Demerzels-lab/elsamultiskillagent

AI-powered YouTube video scoring.

youtube-apify-transcript

7
from Demerzels-lab/elsamultiskillagent

Fetch YouTube transcripts via APIFY API (works from cloud IPs, bypasses YouTube bot detection).