advanced-video-downloader

Download and transcribe videos from YouTube, Bilibili, TikTok, and 1000+ other platforms. Use when the user requests a video download, transcription (转录/字幕提取, "transcription/subtitle extraction"), or conversion of video to text/Markdown. Supports quality selection, audio extraction, playlist downloads, cookie-based authentication, and AI-powered transcription via the SiliconFlow API (免费转录, "free transcription").

7 stars

Best use case

advanced-video-downloader is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using advanced-video-downloader should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/advanced-video-downloader/SKILL.md --create-dirs "https://raw.githubusercontent.com/Jst-Well-Dan/Skill-Box/main/content-pipeline/advanced-video-downloader/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/advanced-video-downloader/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How advanced-video-downloader Compares

| Feature / Agent | advanced-video-downloader | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It downloads videos from YouTube, Bilibili, TikTok, and 1000+ other platforms with yt-dlp, and transcribes video or audio files to Markdown through the SiliconFlow API. It also supports quality selection, audio extraction, playlist downloads, and cookie-based authentication.

Where can I find the source code?

The source code is in the Jst-Well-Dan/Skill-Box repository on GitHub, under content-pipeline/advanced-video-downloader/.

SKILL.md Source

# Advanced Video Downloader

## Overview

This skill provides comprehensive video downloading and transcription capabilities from 1000+ platforms including YouTube, Bilibili, TikTok, Twitter, Instagram, and more. It combines:
- **yt-dlp**: Powerful video downloading tool
- **SiliconFlow API**: Free AI-powered transcription to convert videos to Markdown

## When to Use This Skill

Activate this skill when the user:
- Explicitly requests to download a video ("download this video", "下载视频" [download video])
- Provides video URLs from any platform
- Mentions saving videos for offline viewing
- Wants to extract audio from videos
- Needs to download multiple videos or playlists
- Asks about video quality options
- Requests video transcription ("转录视频" [transcribe video], "提取字幕" [extract subtitles], "视频转文字" [video to text])
- Wants to convert video/audio to text or Markdown
- Asks to download AND transcribe a video in one workflow

## Core Capabilities

### 1. Single Video Download
Download individual videos from any supported platform with automatic quality selection.

**Example usage:**
```
User: "Download this YouTube video: https://youtube.com/watch?v=abc123"
User: "下载这个B站视频: https://bilibili.com/video/BV1xxx" (English: "Download this Bilibili video")
```

### 2. Batch & Playlist Download
Download multiple videos or entire playlists at once.

**Example usage:**
```
User: "Download all videos from this playlist"
User: "Download these 3 videos: [URL1], [URL2], [URL3]"
```

### 3. Audio Extraction
Extract audio only from videos, saving as MP3 or M4A.

**Example usage:**
```
User: "Download only the audio from this video"
User: "Convert this video to MP3"
```

### 4. Quality Selection
Choose specific video quality (4K, 1080p, 720p, etc.).

**Example usage:**
```
User: "Download in 4K quality"
User: "Get the 720p version to save space"
```

### 5. Video/Audio Transcription
Convert video or audio files to Markdown text using SiliconFlow's free AI transcription API.

**Example usage:**
```
User: "Transcribe this video to text" / "转录这个视频"
User: "Download and transcribe this YouTube video"
User: "将这个音频转成文字" (English: "Convert this audio to text")
User: "Extract transcript from this MP4 file"
```

**Supported formats:**
- Audio: MP3, WAV, M4A, FLAC, AAC, OGG, OPUS, WMA
- Video: MP4, AVI, MOV, MKV, FLV, WMV, WEBM, M4V

## Response Pattern

When a user requests video download:

### Step 1: Identify the Platform and URL(s)
```python
# Extract video URL(s) from user message
# Identify platform: YouTube, Bilibili, TikTok, etc.
```
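
Step 1 can be sketched in Python roughly as follows. This is an illustration only and not part of the skill: the names `extract_urls` and `identify_platform` and the small hostname table are hypothetical, and yt-dlp performs its own, far more complete URL matching.

```python
import re
from urllib.parse import urlparse

# Hypothetical hostname -> platform table for illustration; extend as needed.
KNOWN_PLATFORMS = {
    "youtube.com": "YouTube",
    "youtu.be": "YouTube",
    "bilibili.com": "Bilibili",
    "tiktok.com": "TikTok",
}

# Match http(s) URLs up to the next whitespace or bracket character.
URL_PATTERN = re.compile(r"https?://[^\s<>\[\]]+")


def extract_urls(message: str) -> list[str]:
    """Return every http(s) URL in a user message, in order of appearance."""
    # Trailing punctuation usually belongs to the sentence, not the URL.
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(message)]


def identify_platform(url: str) -> str:
    """Map a URL to a platform name, or 'Unknown' if the host is not listed."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, name in KNOWN_PLATFORMS.items():
        if host == suffix or host.endswith("." + suffix):
            return name
    return "Unknown"
```

An "Unknown" result does not mean the URL is unsupported; it only means the host is missing from this small table.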

### Step 2: Check Tool Availability
```bash
# Check if yt-dlp is installed
yt-dlp --version
```
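
The same availability check can be done from Python before any download is attempted. This is a minimal sketch; the helper name `tool_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(tool: str = "yt-dlp") -> Optional[str]:
    """Return the version string the tool reports, or None if it is not on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    # Equivalent to running `yt-dlp --version` in a shell.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None
```

A `None` result is the cue to point the user at the yt-dlp installation guide before continuing.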

### Step 3: Select Appropriate yt-dlp Command

Based on platform and requirements:
- **YouTube, Twitter, Instagram, TikTok**: Basic command works
- **Bilibili**: Basic command works for most videos
- **Quality selection**: Use `-f` with height filter
- **Audio only**: Use `-x --audio-format mp3`
- **Playlists**: Use playlist-specific output template
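
The selection rules above can be sketched as a small command builder. The function name and its parameters are hypothetical; the yt-dlp options are the ones used elsewhere in this document.

```python
from typing import Optional


def build_ytdlp_command(
    url: str,
    max_height: Optional[int] = None,
    audio_only: bool = False,
    cookies: Optional[str] = None,
    playlist: bool = False,
) -> list[str]:
    """Assemble a yt-dlp argument list from the user's requirements."""
    cmd = ["yt-dlp"]
    if cookies:
        # Authenticated access for protected content.
        cmd += ["--cookies", cookies]
    if audio_only:
        cmd += ["-x", "--audio-format", "mp3"]
    else:
        if max_height:
            # Best video at or below the requested height.
            fmt = f"bestvideo[height<={max_height}]+bestaudio/best[height<={max_height}]"
        else:
            fmt = "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best"
        cmd += ["-f", fmt, "--merge-output-format", "mp4"]
    if playlist:
        template = "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s"
    else:
        template = "%(title)s.%(ext)s"
    return cmd + ["-o", template, url]
```

Returning a list rather than a shell string lets the result be passed straight to `subprocess.run` without quoting problems in titles or URLs.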

### Step 4: Execute Download

Use yt-dlp directly with appropriate options:

```bash
# Basic download (best quality MP4)
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Cap quality at 1080p (selects the best stream at or below 1080p)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Audio only (MP3)
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# With cookies file (for protected content)
yt-dlp --cookies cookies.txt -o "%(title)s.%(ext)s" "VIDEO_URL"

# Playlist download
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"
```

### Step 5: Report Results
After download completes, report:
- ✅ Video title and duration
- ✅ File size and format
- ✅ Save location
- ✅ Download speed and time taken
- ⚠️ Any warnings or quality limitations

**Example output:**
```
✅ Downloaded: "Video Title Here"
   Duration: 15:30
   Quality: 1080p MP4
   Size: 234 MB
   Location: ./Video Title Here.mp4
   Time: 45 seconds at 5.2 MB/s
```
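
One way to produce a report in this shape is sketched below. The function `format_report` and its parameters are hypothetical, and sizes are assumed to be in decimal megabytes.

```python
def format_report(
    title: str,
    duration_s: int,
    quality: str,
    size_bytes: int,
    path: str,
    elapsed_s: float,
) -> str:
    """Render a download summary in the layout of the example output."""
    minutes, seconds = divmod(int(duration_s), 60)
    size_mb = size_bytes / 1_000_000
    # Guard against division by zero for near-instant downloads.
    speed = size_mb / elapsed_s if elapsed_s > 0 else 0.0
    return "\n".join(
        [
            f'✅ Downloaded: "{title}"',
            f"   Duration: {minutes}:{seconds:02d}",
            f"   Quality: {quality}",
            f"   Size: {size_mb:.0f} MB",
            f"   Location: {path}",
            f"   Time: {elapsed_s:.0f} seconds at {speed:.1f} MB/s",
        ]
    )
```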

## Transcription Response Pattern

When a user requests video/audio transcription:

### Step 1: Check Prerequisites
```bash
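# Verify the SiliconFlow API key is set (without printing the secret)
[ -n "$SILICONFLOW_API_KEY" ] && echo "SILICONFLOW_API_KEY is set" || echo "SILICONFLOW_API_KEY is not set"
# Otherwise the user must supply the key via the --api-key parameter
```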
```

**API Key Setup:**
- Get free API key from: https://cloud.siliconflow.cn/account/ak
- Copy `.env.example` to `.env` and add your API key
- Or set environment variable: `SILICONFLOW_API_KEY=sk-xxx`
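
The key lookup order (explicit parameter first, then environment variable) could be sketched as below. The helper `resolve_api_key` is hypothetical and is not taken from the bundled script; the `sk-` prefix check follows the troubleshooting note in this document.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    cli_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the API key from --api-key or SILICONFLOW_API_KEY, or raise."""
    key = (cli_key or env.get("SILICONFLOW_API_KEY", "")).strip()
    if not key:
        raise ValueError("No API key: pass --api-key or set SILICONFLOW_API_KEY")
    if not key.startswith("sk-"):
        raise ValueError("API key should start with 'sk-'")
    return key
```

Failing early with a clear message avoids sending a request that the API would reject anyway.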

### Step 2: Validate File
Ensure the file exists and is a supported format (audio or video).
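
A minimal validation sketch, using the supported-format lists given earlier in this document. The function name `media_kind` is hypothetical.

```python
from pathlib import Path

AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".aac", ".ogg", ".opus", ".wma"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".webm", ".m4v"}


def media_kind(path: str) -> str:
    """Return 'audio' or 'video'; raise if the file is missing or unsupported."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"File not found: {path}")
    # Compare case-insensitively so "CLIP.MP4" is accepted.
    ext = file_path.suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"Unsupported format: {ext or '(no extension)'}")
```

This checks the extension only; it does not inspect the file contents.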

### Step 3: Execute Transcription
Use the bundled script `scripts/transcribe_siliconflow.py`:

```bash
# Basic transcription
python scripts/transcribe_siliconflow.py --file video.mp4 --api-key sk-xxx

# With custom output path
python scripts/transcribe_siliconflow.py --file audio.mp3 --output transcript.md --api-key sk-xxx

# Using environment variable for API key
python scripts/transcribe_siliconflow.py --file video.mp4
```

### Step 4: Report Transcription Results
```
✅ Transcription complete!
   File: video.mp4
   Output: 2025-01-15-video.md
   Size: 12.5 KB

   Preview:
   --------------------------------------------------
   [First 200 characters of transcription...]
   --------------------------------------------------
```

## Combined Workflow: Download + Transcribe

For requests like "Download and transcribe this video":

```bash
# Step 1: Download video
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Step 2: Transcribe the downloaded file
python scripts/transcribe_siliconflow.py --file "Downloaded Video Title.mp4" --api-key sk-xxx
```
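
The two steps can be chained so that the transcription step receives the real downloaded filename instead of a hand-typed one. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it relies on yt-dlp's `--print after_move:filepath` option to report the final file path on stdout.

```python
import subprocess
import sys
from typing import Optional

FORMAT = "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best"


def download_command(url: str) -> list[str]:
    """yt-dlp arguments that download a video and print its final path."""
    return [
        "yt-dlp", "-f", FORMAT, "--merge-output-format", "mp4",
        "-o", "%(title)s.%(ext)s",
        "--print", "after_move:filepath",
        url,
    ]


def transcribe_command(path: str, api_key: Optional[str] = None) -> list[str]:
    """Arguments for the bundled transcription script."""
    cmd = [sys.executable, "scripts/transcribe_siliconflow.py", "--file", path]
    if api_key:
        # Without --api-key the script falls back to SILICONFLOW_API_KEY.
        cmd += ["--api-key", api_key]
    return cmd


def download_and_transcribe(url: str, api_key: Optional[str] = None) -> str:
    """Download one video, transcribe it, and return the video's path."""
    done = subprocess.run(
        download_command(url), capture_output=True, text=True, check=True
    )
    printed = done.stdout.strip().splitlines()
    if not printed:
        raise RuntimeError("yt-dlp did not report an output file")
    video_path = printed[-1]
    subprocess.run(transcribe_command(video_path, api_key), check=True)
    return video_path
```

Passing argument lists to `subprocess.run` keeps titles containing spaces or quotes intact, which matters because the output template is based on the video title.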

## Platform-Specific Notes

### YouTube
- Fully supported by yt-dlp
- No authentication needed for public videos
- Supports all quality levels including 4K/8K

### Bilibili
- Supported by yt-dlp
- High-quality downloads may require login cookies
- Use `--cookies` with cookies.txt for member-only content

### Other Platforms
- Most platforms work well with yt-dlp
- Check `references/supported_platforms.md` for full list

## Handling Cookies for Protected Content

For platforms requiring authentication (Bilibili VIP, member-only content, etc.):

### Method 1: Export Cookies File (Recommended)
```bash
# Use browser extension "Get cookies.txt LOCALLY"
# Export cookies.txt, then:
yt-dlp --cookies cookies.txt "VIDEO_URL"
```

### Method 2: Manual Cookies File
```bash
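# Create cookies.txt by hand in Netscape cookie-file format:
# a "# Netscape HTTP Cookie File" header line, then one cookie per line
# with seven tab-separated fields (domain, subdomain flag, path, secure, expiry, name, value)
# Then pass the file to yt-dlp
yt-dlp --cookies cookies.txt "VIDEO_URL"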
```
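
Before handing a cookies file to yt-dlp, it can help to confirm that the file really is in Netscape format. The sketch below uses Python's standard `http.cookiejar.MozillaCookieJar`, which reads that format; the helper name `count_cookies` is hypothetical.

```python
from http.cookiejar import LoadError, MozillaCookieJar


def count_cookies(path: str) -> int:
    """Load a Netscape-format cookies.txt and return how many cookies it holds.

    Raises http.cookiejar.LoadError if the file is not in Netscape format.
    """
    jar = MozillaCookieJar(path)
    # Keep session and expired cookies so the count reflects the file contents.
    jar.load(ignore_discard=True, ignore_expires=True)
    return len(jar)


def is_valid_cookie_file(path: str) -> bool:
    """Return True if the file parses as a Netscape cookie file."""
    try:
        count_cookies(path)
    except (LoadError, OSError):
        return False
    return True
```

A file that parses but holds zero cookies for the target site usually means the export was done while logged out.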

## Troubleshooting

### Issue: Video quality lower than expected
**Solution:**
1. Check if platform requires login for HD
2. Use `--cookies cookies.txt` for authenticated access
3. Explicitly specify quality with `-f` parameter

### Issue: Download very slow
**Solution:**
1. Check internet connection
2. Try different time of day (peak hours affect speed)
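3. Use `--concurrent-fragments N` (short form `-N`, for example `-N 4`) to download several fragments in parallel; this only helps with fragmented (DASH/HLS) streams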

### Issue: "Video unavailable" or geo-restricted
**Solution:**
1. Video may be region-locked
2. Use proxy/VPN if legally permitted
3. Check if video is still available on platform

### Issue: Transcription API key error
**Solution:**
1. Verify API key starts with `sk-`
2. Get free key from: https://cloud.siliconflow.cn/account/ak
3. Set environment variable: `SILICONFLOW_API_KEY=sk-xxx`

### Issue: Transcription returns empty text
**Solution:**
1. Check if audio/video has clear speech
2. Verify file format is supported
3. File may be too short or contain only music

## Common Commands

### Quality Presets

```bash
# 4K (2160p)
yt-dlp -f "bestvideo[height<=2160]+bestaudio/best[height<=2160]" --merge-output-format mp4 "VIDEO_URL"

# 1080p (Full HD)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 "VIDEO_URL"

# 720p (HD)
yt-dlp -f "bestvideo[height<=720]+bestaudio/best[height<=720]" --merge-output-format mp4 "VIDEO_URL"

# 480p (SD)
yt-dlp -f "bestvideo[height<=480]+bestaudio/best[height<=480]" --merge-output-format mp4 "VIDEO_URL"
```

### Audio Extraction

```bash
# Extract as MP3
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Extract as M4A (can avoid re-encoding when the source audio is already AAC)
yt-dlp -x --audio-format m4a -o "%(title)s.%(ext)s" "VIDEO_URL"
```

### Batch Downloads

```bash
# Download multiple URLs from file
yt-dlp -a urls.txt

# Download playlist with custom naming
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"

# Download channel's videos
yt-dlp -o "%(uploader)s/%(title)s.%(ext)s" "CHANNEL_URL"
```
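
When the user supplies several URLs in a message, they can be written to a batch file for `yt-dlp -a`. This is a minimal sketch with hypothetical helper names; it removes duplicates while preserving order.

```python
from pathlib import Path
from typing import Iterable


def write_batch_file(urls: Iterable[str], path: str = "urls.txt") -> list[str]:
    """Write unique, non-empty URLs one per line and return them in order."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


def batch_command(path: str = "urls.txt") -> list[str]:
    """yt-dlp arguments that download every URL listed in the batch file."""
    return ["yt-dlp", "-a", path]
```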

## Bundled Resources

### Configuration

#### `.env.example`
Template for environment variables. Copy to `.env` and add your SiliconFlow API key.

### Scripts

#### `scripts/transcribe_siliconflow.py`
AI-powered transcription script using SiliconFlow's free API.

**Usage:**
```bash
python scripts/transcribe_siliconflow.py --file <audio/video> [--api-key <key>] [--output <path>]
```

**Parameters:**
- `--file, -f`: Input audio/video file (required)
- `--api-key, -k`: SiliconFlow API key (or use `SILICONFLOW_API_KEY` env var)
- `--output, -o`: Output markdown file path (default: `YYYY-MM-DD-filename.md`)
- `--model, -m`: Model to use (default: `FunAudioLLM/SenseVoiceSmall`)
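
For orientation, a command-line interface with these parameters could be declared as in the sketch below. This is not the bundled script's actual code: `build_parser` and `default_output` are hypothetical names, and only the flags and defaults listed above are taken from this document.

```python
import argparse
from datetime import date
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the four documented options with their documented defaults."""
    parser = argparse.ArgumentParser(description="Transcribe audio/video to Markdown")
    parser.add_argument("--file", "-f", required=True, help="Input audio/video file")
    parser.add_argument("--api-key", "-k", default=None,
                        help="SiliconFlow API key (else SILICONFLOW_API_KEY is used)")
    parser.add_argument("--output", "-o", default=None,
                        help="Output Markdown path (default: YYYY-MM-DD-filename.md)")
    parser.add_argument("--model", "-m", default="FunAudioLLM/SenseVoiceSmall",
                        help="Transcription model")
    return parser


def default_output(input_path: str, today: Optional[date] = None) -> str:
    """Build the default output name, e.g. 2025-01-15-video.md for video.mp4."""
    today = today or date.today()
    return f"{today.isoformat()}-{Path(input_path).stem}.md"
```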

### References

#### `references/supported_platforms.md`
Comprehensive list of 1000+ supported platforms with platform-specific notes and requirements.

#### `references/quality_formats.md`
Detailed explanation of video formats, codecs, and quality selection strategies.

## Tips for Best Results

1. **Always specify quality if user has preference** - saves bandwidth and storage
2. **Batch downloads save time** - use playlist URLs when possible
3. **Audio extraction is faster** - recommend for podcast/music content
4. **Check file size before downloading** - warn user for very large files (>1GB)
5. **Transcription works best with clear audio** - consider extracting audio first for better results

## Sources

- [yt-dlp Documentation](https://github.com/yt-dlp/yt-dlp)
- [yt-dlp Installation Guide](https://github.com/yt-dlp/yt-dlp#installation)
- [SiliconFlow API Documentation](https://docs.siliconflow.cn/)
- [SiliconFlow Free API Key](https://cloud.siliconflow.cn/account/ak)

Related Skills

video-frames

7
from Jst-Well-Dan/Skill-Box

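Extract frames or short clips from videos using ffmpeg. Supports capturing images at specific timestamps, which is useful for video previews and UI analysis.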

theme-factory

7
from Jst-Well-Dan/Skill-Box

Toolkit for styling artifacts with a theme. These artifacts can be slides, docs, reports, HTML landing pages, etc. There are 10 pre-set themes with colors/fonts that you can apply to any artifact that has been created, or you can generate a new theme on-the-fly.

slack-gif-creator

7
from Jst-Well-Dan/Skill-Box

Toolkit for creating animated GIFs optimized for Slack, with validators for size constraints and composable animation primitives. This skill applies when users request animated GIFs or emoji animations for Slack from descriptions like "make me a GIF for Slack of X doing Y".

remotion-best-practices

7
from Jst-Well-Dan/Skill-Box

Best practices for Remotion - Video creation in React

image-enhancer

7
from Jst-Well-Dan/Skill-Box

Improves the quality of images, especially screenshots, by enhancing resolution, sharpness, and clarity. Perfect for preparing images for presentations, documentation, or social media posts.

canvas-design

7
from Jst-Well-Dan/Skill-Box

Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations.

algorithmic-art

7
from Jst-Well-Dan/Skill-Box

Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations.

raffle-winner-picker

7
from Jst-Well-Dan/Skill-Box

Picks random winners from lists, spreadsheets, or Google Sheets for giveaways, raffles, and contests. Ensures fair, unbiased selection with transparency.

nlm-skill

7
from Jst-Well-Dan/Skill-Box

Expert guide for the NotebookLM CLI (`nlm`) and MCP server - interfaces for Google NotebookLM. Use this skill when users want to interact with NotebookLM programmatically, including: creating/managing notebooks, adding sources (URLs, YouTube, text, Google Drive), generating content (podcasts, reports, quizzes, flashcards, mind maps, slides, infographics, videos, data tables), conducting research, chatting with sources, or automating NotebookLM workflows. Triggers on mentions of "nlm", "notebooklm", "notebook lm", "podcast generation", "audio overview", or any NotebookLM-related automation task.

md-to-pdf

7
from Jst-Well-Dan/Skill-Box

Use this skill when users want to convert Markdown files to PDF. Handles workflows like "Convert this markdown to PDF", "转换为PDF" (convert to PDF), "批量转换MD文件" (batch-convert MD files). Supports single file and batch directory conversion with excellent CJK (Chinese) font support, image embedding, and clean typography.

markdown-to-epub-converter

7
from Jst-Well-Dan/Skill-Box

Convert markdown documents and chat summaries into formatted EPUB ebook files that can be read on any device or uploaded to Kindle.

xlsx

7
from Jst-Well-Dan/Skill-Box

Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. Use when Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc.) for: (1) creating new spreadsheets with formulas and formatting, (2) reading or analyzing data, (3) modifying existing spreadsheets while preserving formulas, (4) data analysis and visualization in spreadsheets, or (5) recalculating formulas.