youtube-downloader

Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.

8 stars

Best use case

youtube-downloader is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.

Teams using youtube-downloader should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/youtube-downloader/SKILL.md --create-dirs "https://raw.githubusercontent.com/cdeistopened/skill-stack/main/.claude/skills/youtube-downloader/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/youtube-downloader/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How youtube-downloader Compares

Feature / Agentyoutube-downloaderStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# YouTube Transcript Downloader

This skill helps download transcripts (subtitles/captions) from YouTube videos using yt-dlp.

## When to Use This Skill

Activate this skill when the user:
- Provides a YouTube URL and wants the transcript
- Asks to "download transcript from YouTube"
- Wants to "get captions" or "get subtitles" from a video
- Asks to "transcribe a YouTube video"
- Needs text content from a YouTube video

## How It Works

### Default Workflow:
1. **Check if yt-dlp is installed** - install if needed
2. **Try manual subtitles first** (`--write-sub`) - highest quality, human-created
3. **Fallback to auto-generated** (`--write-auto-sub`) - usually available
4. **Convert to plain text** - deduplicate and clean up VTT format
5. **Save to output directory** as a markdown file with filename based on video title
6. **Automatically polish** - remove filler words, fix grammar, add section headers, maintain 100% fidelity
7. **Confirm the download** and show the user where the file is saved

## Installation Check

**IMPORTANT**: Always check if yt-dlp is installed first:

```bash
which yt-dlp || command -v yt-dlp
```

### If Not Installed

Attempt automatic installation based on the system:

**macOS (Homebrew)**:
```bash
brew install yt-dlp
```

**Linux (apt/Debian/Ubuntu)**:
```bash
sudo apt update && sudo apt install -y yt-dlp
```

**Alternative (pip - works on all systems)**:
```bash
pip3 install yt-dlp
# or
python3 -m pip install yt-dlp
```

**If installation fails**: Inform the user they need to install yt-dlp manually and provide them with installation instructions from https://github.com/yt-dlp/yt-dlp#installation

## Check Available Subtitles

**ALWAYS do this first** before attempting to download:

```bash
yt-dlp --list-subs "YOUTUBE_URL"
```

This shows what subtitle types are available without downloading anything. Look for:
- Manual subtitles (better quality)
- Auto-generated subtitles (usually available)
- Available languages

## Download Strategy

### Option 1: Manual Subtitles (Preferred)

Try this first - highest quality, human-created:

```bash
yt-dlp --write-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
```

### Option 2: Auto-Generated Subtitles (Fallback)

If manual subtitles aren't available:

```bash
yt-dlp --write-auto-sub --skip-download --output "OUTPUT_NAME" "YOUTUBE_URL"
```

Both commands create a `.vtt` file (WebVTT subtitle format).

## Getting Video Information

### Extract Video Title (for filename)

```bash
yt-dlp --print "%(title)s" "YOUTUBE_URL"
```

Use this to create meaningful filenames based on the video title. Clean the title for filesystem compatibility:
- Replace `/` with `-`
- Replace special characters that might cause issues
- Consider using sanitized version: `$(yt-dlp --print "%(title)s" "URL" | tr '/' '-' | tr ':' '-')`

## Post-Processing

### Convert to Plain Text (Recommended)

YouTube's auto-generated VTT files contain **duplicate lines** because captions are shown progressively with overlapping timestamps. Always deduplicate when converting to plain text while preserving the original speaking order.

```bash
python3 -c "
import sys, re
seen = set()
with open('transcript.en.vtt', 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&amp;', '&').replace('&gt;', '>').replace('&lt;', '<')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > transcript.txt
```

### Complete Post-Processing with Video Title

```bash
# Get video title
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')

# Find the VTT file
VTT_FILE=$(ls *.vtt | head -n 1)

# Convert with deduplication
python3 -c "
import sys, re
seen = set()
with open('$VTT_FILE', 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&amp;', '&').replace('&gt;', '>').replace('&lt;', '<')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > "${VIDEO_TITLE}.txt"

echo "Saved to: ${VIDEO_TITLE}.txt"

# Clean up VTT file
rm "$VTT_FILE"
echo "Cleaned up temporary VTT file"
```

## Output Formats

- **VTT format** (`.vtt`): Includes timestamps and formatting, good for video players
- **Plain text** (`.txt`): Just the text content, good for reading or analysis

## Tips

- The filename will be `{output_name}.{language_code}.vtt` (e.g., `transcript.en.vtt`)
- Most YouTube videos have auto-generated English subtitles
- Some videos may have multiple language options
- If auto-subtitles aren't available, try `--write-sub` instead for manual subtitles

## Complete Workflow Example

```bash
VIDEO_URL="https://www.youtube.com/watch?v=VIDEO_ID"
OUTPUT_DIR="transcripts"

# Create output directory if it doesn't exist
mkdir -p "$OUTPUT_DIR"

# Get video title for filename
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "$VIDEO_URL" | tr '/' '_' | tr ':' '-' | tr '?' '' | tr '"' '')
OUTPUT_NAME="$OUTPUT_DIR/transcript_temp"

# ============================================
# STEP 1: Check if yt-dlp is installed
# ============================================
if ! command -v yt-dlp &> /dev/null; then
    echo "yt-dlp not found, attempting to install..."
    if command -v brew &> /dev/null; then
        brew install yt-dlp
    elif command -v apt &> /dev/null; then
        sudo apt update && sudo apt install -y yt-dlp
    else
        pip3 install yt-dlp
    fi
fi

# ============================================
# STEP 2: Try manual subtitles first
# ============================================
echo "Downloading subtitles for: $VIDEO_TITLE"
if yt-dlp --write-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then
    echo "Manual subtitles downloaded!"
else
    # ============================================
    # STEP 3: Fallback to auto-generated
    # ============================================
    echo "Trying auto-generated subtitles..."
    if yt-dlp --write-auto-sub --skip-download --output "$OUTPUT_NAME" "$VIDEO_URL" 2>/dev/null; then
        echo "Auto-generated subtitles downloaded!"
    else
        echo "No subtitles available for this video."
        exit 1
    fi
fi

# ============================================
# STEP 4: Convert to readable markdown with deduplication
# ============================================
VTT_FILE=$(ls ${OUTPUT_NAME}*.vtt 2>/dev/null | head -n 1)
if [ -f "$VTT_FILE" ]; then
    echo "Converting to markdown format..."
    python3 -c "
import sys, re
seen = set()
with open('$VTT_FILE', 'r') as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&amp;', '&').replace('&gt;', '>').replace('&lt;', '<')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > "$OUTPUT_DIR/${VIDEO_TITLE}.md"
    echo "Saved raw transcript to: $OUTPUT_DIR/${VIDEO_TITLE}.md"

    # Clean up temporary VTT file
    rm "$VTT_FILE"
else
    echo "No VTT file found to convert"
    exit 1
fi

# ============================================
# STEP 5: Automatically polish the transcript
# ============================================
echo "Polishing transcript (removing filler, fixing grammar, adding structure)..."
python3 << 'POLISH_EOF'
import re

with open('$OUTPUT_DIR/${VIDEO_TITLE}.md', 'r') as f:
    content = f.read()

# Preserve metadata and header
lines = content.split('\n')
metadata = []
content_start = 0
for i, line in enumerate(lines):
    if line.startswith('#') or line.startswith('**Source:**') or line.startswith('---'):
        metadata.append(line)
        content_start = i + 1
    else:
        break

transcript_text = '\n'.join(lines[content_start:]).strip()

# Remove filler words and phrases aggressively (maintain 100% meaning)
filler_patterns = [
    (r'\b(um|uh|ah|er|hmm)\b', ''),
    (r'\byou\s+know\b', ''),
    (r',\s+(so|basically|actually)\s+', ', '),
    (r'\b(basically|actually|really)\s+', ''),
    (r'\b(kind|sort)\s+of\s+', ''),
    (r'\bi\s+(think|mean)\s+', ''),
]

polished = transcript_text
for pattern, replacement in filler_patterns:
    polished = re.sub(pattern, replacement, polished, flags=re.IGNORECASE)

# Join broken lines while preserving sentence structure
polished = re.sub(r'(?<=[a-z]),\n(?=[a-z])', ', ', polished)
polished = re.sub(r'(?<=[a-z])\n(?![\n#])', ' ', polished)

# Clean up spacing and punctuation
polished = re.sub(r' +', ' ', polished)
polished = re.sub(r'\s+([.!?,;:])', r'\1', polished)

# Reconstruct with metadata
final = '\n'.join(metadata) + '\n\n' + polished.strip()

with open('$OUTPUT_DIR/${VIDEO_TITLE}.md', 'w') as f:
    f.write(final)

print("Transcript polished")
POLISH_EOF

echo "Complete!"
```

**Notes**:
- Output directory can be customized (default: `transcripts/`)
- Files are named based on the video title with special characters sanitized
- Transcripts are automatically deduplicated to remove caption overlaps
- Polishing step removes filler words/phrases while maintaining 100% meaning fidelity
- Grammar and run-on sentences are automatically fixed
- Paragraph breaks consolidate content into logical sections
- The temporary VTT file is cleaned up after conversion

## Error Handling

### Common Issues and Solutions:

**1. yt-dlp not installed**
- Attempt automatic installation based on system (Homebrew/apt/pip)
- If installation fails, provide manual installation link
- Verify installation before proceeding

**2. No subtitles available**
- List available subtitles first to confirm
- Try both `--write-sub` (manual) and `--write-auto-sub` (auto-generated)
- If neither are available, inform the user that the video has no available subtitles

**3. Invalid or private video**
- Check if URL is correct format: `https://www.youtube.com/watch?v=VIDEO_ID`
- Some videos may be private, age-restricted, or geo-blocked
- Inform user of the specific error from yt-dlp

**4. Download interrupted or failed**
- Check internet connection
- Verify sufficient disk space
- Try again with `--no-check-certificate` if SSL issues occur

**5. Multiple subtitle languages**
- By default, yt-dlp downloads all available languages
- Can specify with `--sub-langs en` for English only
- List available with `--list-subs` first

### Best Practices:

- Always check what's available before attempting download (`--list-subs`)
- Try manual subtitles first (`--write-sub`), then fall back to auto-generated
- Convert VTT to plain text format for easy reading
- Deduplicate text content to remove caption overlaps
- Provide clear feedback about what's happening at each stage
- Handle errors gracefully with helpful messages

Related Skills

youtube-ingest

8
from cdeistopened/skill-stack

Transcribe YouTube videos and playlists using Gemini Flash

youtube-title-creator

8
from cdeistopened/skill-stack

Generate high-CTR YouTube titles and thumbnails using framework fitting method. Match content to 119 proven formulas from Creator Hooks, apply psychological principles, test variations.

youtube-scriptwriting

8
from cdeistopened/skill-stack

Transform raw ideas and brain dumps into polished YouTube scripts through a structured checkpoint workflow. Use when the user wants to write a YouTube script, improve video retention, craft hooks, structure educational or entertainment content, or turn source material (transcripts, notes, research) into a compelling video script. Guides through research, hook writing, story structure, body content, and editing phases.

youtube-clip-extractor

8
from cdeistopened/skill-stack

Download YouTube videos, identify compelling clips from transcripts, cut clips with ffmpeg, and generate platform-optimized on-screen text and captions. Complete workflow from URL to publishable clips.

x-viral-template-miner

8
from cdeistopened/skill-stack

When the user wants to find proven-to-travel post templates in their niche and adapt them to their own product. Also use when the user mentions "what's going viral in my space", "what are competitors posting", "copy a viral post", "trending on X", "post ideas", "template mining", or "what to post this week". This is trend hunting, not plagiarism — the output is a template the user fills with their own assets.

x-linkedin-content-relay

8
from cdeistopened/skill-stack

When the user has X (Twitter) content that performed well and wants to relay it to LinkedIn 1-2 weeks later with reframing. Also use when the user mentions "repost to LinkedIn", "LinkedIn version of my tweet", "X to LinkedIn", "delayed repost", "LinkedIn for non-tech audience", or "LinkedIn relay". Also use when the user's ICP is non-tech and X is secondary — LinkedIn is the primary channel and this skill produces the content.

x-launch-video-structure

8
from cdeistopened/skill-stack

When the user is planning, scripting, or editing a product launch video for X (Twitter) and needs the structure. Also use when the user mentions "launch video", "demo video", "product launch on X", "60 second demo", "how to structure a launch", or "my launch video isn't working". Produces a beat-by-beat timing sheet, not copy.

x-account-warmup

8
from cdeistopened/skill-stack

When a user wants to grow an X (Twitter) account from zero before a product launch, or asks how to get first followers, warm up the algorithm, hit ~500-1,000 followers, or prepare an account to make a launch video land. Also use when the user mentions "new X account", "warm up my Twitter", "first 1000 followers", "building in public strategy", "X growth", or "engagement before launch".

skill-stack-thumbnails

8
from cdeistopened/skill-stack

Generate blog post thumbnails for Skill Stack using the brand aesthetic. Follows an iterative workflow - brainstorm concepts, get approval, generate with Gemini API.

web-scrape

8
from cdeistopened/skill-stack

Scrape web pages to clean markdown with optional AI summaries

voice-tyler-cowen

8
from cdeistopened/skill-stack

Write in Tyler Cowen's style - matter-of-fact, understated, treats enormous ideas as obvious observations. Read the passages. Absorb the flatness. Channel the HOW, not the content.

voice-trung-phan

8
from cdeistopened/skill-stack

Generate tweets and threads in the style of Trung Phan. Not just voice — captures his humor mechanics, format taxonomy, topic selection filter, and structural patterns. Use for trend-reactive tweets, meme commentary, and business/culture threads.