advanced-video-downloader

Download and transcribe videos from YouTube, Bilibili, TikTok and 1000+ platforms. Use when user requests video download, transcription (转录/字幕提取), or converting video to text/markdown. Supports quality selection, audio extraction, playlist downloads, cookie-based authentication, and AI-powered transcription via SiliconFlow API (免费转录).

16 stars

Best use case

advanced-video-downloader is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Download and transcribe videos from YouTube, Bilibili, TikTok and 1000+ platforms. Use when user requests video download, transcription (转录/字幕提取), or converting video to text/markdown. Supports quality selection, audio extraction, playlist downloads, cookie-based authentication, and AI-powered transcription via SiliconFlow API (免费转录).

Teams using advanced-video-downloader should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/advanced-video-downloader/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/backend/advanced-video-downloader/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/advanced-video-downloader/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How advanced-video-downloader Compares

Feature / Agentadvanced-video-downloaderStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Download and transcribe videos from YouTube, Bilibili, TikTok and 1000+ platforms. Use when user requests video download, transcription (转录/字幕提取), or converting video to text/markdown. Supports quality selection, audio extraction, playlist downloads, cookie-based authentication, and AI-powered transcription via SiliconFlow API (免费转录).

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# Advanced Video Downloader

## Overview

This skill provides comprehensive video downloading and transcription capabilities from 1000+ platforms including YouTube, Bilibili, TikTok, Twitter, Instagram, and more. It combines:
- **yt-dlp**: Powerful video downloading tool
- **SiliconFlow API**: Free AI-powered transcription to convert videos to Markdown

## When to Use This Skill

Activate this skill when the user:
- Explicitly requests to download a video ("download this video", "下载视频")
- Provides video URLs from any platform
- Mentions saving videos for offline viewing
- Wants to extract audio from videos
- Needs to download multiple videos or playlists
- Asks about video quality options
- Requests video transcription ("转录视频", "提取字幕", "视频转文字")
- Wants to convert video/audio to text or Markdown
- Asks to download AND transcribe a video in one workflow

## Core Capabilities

### 1. Single Video Download
Download individual videos from any supported platform with automatic quality selection.

**Example usage:**
```
User: "Download this YouTube video: https://youtube.com/watch?v=abc123"
User: "下载这个B站视频: https://bilibili.com/video/BV1xxx"
```

### 2. Batch & Playlist Download
Download multiple videos or entire playlists at once.

**Example usage:**
```
User: "Download all videos from this playlist"
User: "Download these 3 videos: [URL1], [URL2], [URL3]"
```

### 3. Audio Extraction
Extract audio only from videos, saving as MP3 or M4A.

**Example usage:**
```
User: "Download only the audio from this video"
User: "Convert this video to MP3"
```

### 4. Quality Selection
Choose specific video quality (4K, 1080p, 720p, etc.).

**Example usage:**
```
User: "Download in 4K quality"
User: "Get the 720p version to save space"
```

### 5. Video/Audio Transcription
Convert video or audio files to Markdown text using SiliconFlow's free AI transcription API.

**Example usage:**
```
User: "Transcribe this video to text" / "转录这个视频"
User: "Download and transcribe this YouTube video"
User: "将这个音频转成文字"
User: "Extract transcript from this MP4 file"
```

**Supported formats:**
- Audio: MP3, WAV, M4A, FLAC, AAC, OGG, OPUS, WMA
- Video: MP4, AVI, MOV, MKV, FLV, WMV, WEBM, M4V

## Response Pattern

When a user requests video download:

### Step 1: Identify the Platform and URL(s)
```python
# Extract video URL(s) from user message
# Identify platform: YouTube, Bilibili, TikTok, etc.
```

### Step 2: Check Tool Availability
```bash
# Check if yt-dlp is installed
yt-dlp --version
```

### Step 3: Select Appropriate yt-dlp Command

Based on platform and requirements:
- **YouTube, Twitter, Instagram, TikTok**: Basic command works
- **Bilibili**: Basic command works for most videos
- **Quality selection**: Use `-f` with height filter
- **Audio only**: Use `-x --audio-format mp3`
- **Playlists**: Use playlist-specific output template

### Step 4: Execute Download

Use yt-dlp directly with appropriate options:

```bash
# Basic download (best quality MP4)
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Specific quality (1080p)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Audio only (MP3)
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# With cookies file (for protected content)
yt-dlp --cookies cookies.txt -o "%(title)s.%(ext)s" "VIDEO_URL"

# Playlist download
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"
```

### Step 5: Report Results
After download completes, report:
- ✅ Video title and duration
- ✅ File size and format
- ✅ Save location
- ✅ Download speed and time taken
- ⚠️ Any warnings or quality limitations

**Example output:**
```
✅ Downloaded: "Video Title Here"
   Duration: 15:30
   Quality: 1080p MP4
   Size: 234 MB
   Location: ./Video Title Here.mp4
   Time: 45 seconds at 5.2 MB/s
```

## Transcription Response Pattern

When a user requests video/audio transcription:

### Step 1: Check Prerequisites
```bash
# Verify SiliconFlow API key is available
echo $SILICONFLOW_API_KEY
# Or user must provide via --api-key parameter
```

**API Key Setup:**
- Get free API key from: https://cloud.siliconflow.cn/account/ak
- Copy `.env.example` to `.env` and add your API key
- Or set environment variable: `SILICONFLOW_API_KEY=sk-xxx`

### Step 2: Validate File
Ensure the file exists and is a supported format (audio or video).

### Step 3: Execute Transcription
Use the bundled script `scripts/transcribe_siliconflow.py`:

```bash
# Basic transcription
python scripts/transcribe_siliconflow.py --file video.mp4 --api-key sk-xxx

# With custom output path
python scripts/transcribe_siliconflow.py --file audio.mp3 --output transcript.md --api-key sk-xxx

# Using environment variable for API key
python scripts/transcribe_siliconflow.py --file video.mp4
```

### Step 4: Report Transcription Results
```
✅ Transcription complete!
   File: video.mp4
   Output: 2025-01-15-video.md
   Size: 12.5 KB

   Preview:
   --------------------------------------------------
   [First 200 characters of transcription...]
   --------------------------------------------------
```

## Combined Workflow: Download + Transcribe

For requests like "Download and transcribe this video":

```bash
# Step 1: Download video
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Step 2: Transcribe the downloaded file
python scripts/transcribe_siliconflow.py --file "Downloaded Video Title.mp4" --api-key sk-xxx
```

## Platform-Specific Notes

### YouTube
- Fully supported by yt-dlp
- No authentication needed for public videos
- Supports all quality levels including 4K/8K

### Bilibili
- Supported by yt-dlp
- High-quality downloads may require login cookies
- Use `--cookies` with cookies.txt for member-only content

### Other Platforms
- Most platforms work well with yt-dlp
- Check `references/supported_platforms.md` for full list

## Handling Cookies for Protected Content

For platforms requiring authentication (Bilibili VIP, member-only content, etc.):

### Method 1: Export Cookies File (Recommended)
```bash
# Use browser extension "Get cookies.txt LOCALLY"
# Export cookies.txt, then:
yt-dlp --cookies cookies.txt "VIDEO_URL"
```

### Method 2: Manual Cookies File
```bash
# Create cookies.txt in Netscape format
# Use browser extension "Get cookies.txt LOCALLY"
# Then use with yt-dlp
yt-dlp --cookies cookies.txt "VIDEO_URL"
```

## Troubleshooting

### Issue: Video quality lower than expected
**Solution:**
1. Check if platform requires login for HD
2. Use `--cookies cookies.txt` for authenticated access
3. Explicitly specify quality with `-f` parameter

### Issue: Download very slow
**Solution:**
1. Check internet connection
2. Try different time of day (peak hours affect speed)
3. Use `--concurrent-fragments` for faster downloads

### Issue: "Video unavailable" or geo-restricted
**Solution:**
1. Video may be region-locked
2. Use proxy/VPN if legally permitted
3. Check if video is still available on platform

### Issue: Transcription API key error
**Solution:**
1. Verify API key starts with `sk-`
2. Get free key from: https://cloud.siliconflow.cn/account/ak
3. Set environment variable: `SILICONFLOW_API_KEY=sk-xxx`

### Issue: Transcription returns empty text
**Solution:**
1. Check if audio/video has clear speech
2. Verify file format is supported
3. File may be too short or contain only music

## Common Commands

### Quality Presets

```bash
# 4K (2160p)
yt-dlp -f "bestvideo[height<=2160]+bestaudio/best[height<=2160]" --merge-output-format mp4 "VIDEO_URL"

# 1080p (Full HD)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 "VIDEO_URL"

# 720p (HD)
yt-dlp -f "bestvideo[height<=720]+bestaudio/best[height<=720]" --merge-output-format mp4 "VIDEO_URL"

# 480p (SD)
yt-dlp -f "bestvideo[height<=480]+bestaudio/best[height<=480]" --merge-output-format mp4 "VIDEO_URL"
```

### Audio Extraction

```bash
# Extract as MP3
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Extract as M4A (better quality)
yt-dlp -x --audio-format m4a -o "%(title)s.%(ext)s" "VIDEO_URL"
```

### Batch Downloads

```bash
# Download multiple URLs from file
yt-dlp -a urls.txt

# Download playlist with custom naming
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"

# Download channel's videos
yt-dlp -o "%(uploader)s/%(title)s.%(ext)s" "CHANNEL_URL"
```

## Bundled Resources

### Configuration

#### `.env.example`
Template for environment variables. Copy to `.env` and add your SiliconFlow API key.

### Scripts

#### `scripts/transcribe_siliconflow.py`
AI-powered transcription script using SiliconFlow's free API.

**Usage:**
```bash
python scripts/transcribe_siliconflow.py --file <audio/video> [--api-key <key>] [--output <path>]
```

**Parameters:**
- `--file, -f`: Input audio/video file (required)
- `--api-key, -k`: SiliconFlow API key (or use `SILICONFLOW_API_KEY` env var)
- `--output, -o`: Output markdown file path (default: `YYYY-MM-DD-filename.md`)
- `--model, -m`: Model to use (default: `FunAudioLLM/SenseVoiceSmall`)

### References

#### `references/supported_platforms.md`
Comprehensive list of 1000+ supported platforms with platform-specific notes and requirements.

#### `references/quality_formats.md`
Detailed explanation of video formats, codecs, and quality selection strategies.

## Tips for Best Results

1. **Always specify quality if user has preference** - saves bandwidth and storage
2. **Batch downloads save time** - use playlist URLs when possible
3. **Audio extraction is faster** - recommend for podcast/music content
4. **Check file size before downloading** - warn user for very large files (>1GB)
5. **Transcription works best with clear audio** - consider extracting audio first for better results

## Sources

- [yt-dlp Documentation](https://github.com/yt-dlp/yt-dlp)
- [yt-dlp Installation Guide](https://github.com/yt-dlp/yt-dlp#installation)
- [SiliconFlow API Documentation](https://docs.siliconflow.cn/)
- [SiliconFlow Free API Key](https://cloud.siliconflow.cn/account/ak)

Related Skills

advanced-kubernetes

16
from diegosouzapw/awesome-omni-skill

Custom Resource Definitions (CRDs) extend Kubernetes API with custom object types. Operators are controllers that manage these custom resources using domain-specific logic.

youtube-downloader

16
from diegosouzapw/awesome-omni-skill

Download YouTube videos with customizable quality and format options. Use this skill when the user asks to download, save, or grab YouTube videos. Supports various quality settings (best, 1080p, 720p, 480p, 360p), multiple formats (mp4, webm, mkv), and audio-only downloads as MP3.

video-sdk/web

16
from diegosouzapw/awesome-omni-skill

Zoom Video SDK for Web - JavaScript/TypeScript integration for browser-based video sessions, real-time communication, screen sharing, recording, and live transcription

typescript-advanced-types

16
from diegosouzapw/awesome-omni-skill

Master TypeScript's advanced type system including generics, conditional types, mapped types, template literals, and utility types for building type-safe applications. Use when implementing complex...

notion-uploader-downloader

16
from diegosouzapw/awesome-omni-skill

Bidirectional sync between Markdown and Notion. Upload .md files with images to Notion pages/databases, append to existing pages, or download Notion content back to markdown. Supports rich formatting, tables, code blocks, GitHub-flavored markdown, and recursive page hierarchy downloads with YAML frontmatter for round-trip sync.

Fixed Video Format (9:16)

16
from diegosouzapw/awesome-omni-skill

Fixed 1080x1920 pixel video format with percentage-based positioning. Use this when laying out video compositions, positioning elements on the canvas, or calculating dimensions. All videos render at exactly 9:16 aspect ratio for TikTok/Instagram Reels.

aspnet-core-advanced

16
from diegosouzapw/awesome-omni-skill

Master advanced ASP.NET Core development including Entity Framework Core, authentication, testing, and enterprise patterns for production applications.

animating-advanced

16
from diegosouzapw/awesome-omni-skill

Creates Awwwards-level, high-performance animations using industry-standard tools like GSAP, Three.js/R3F, and Lenis. Specializes in "Hero Sections", 3D interactions, and scroll-linked storytelling.

advanced_tools

16
from diegosouzapw/awesome-omni-skill

Use when finding files by name, searching code content, locating patterns with regex, exploring codebase, or batch refactoring across multiple files. Conforms to docs/reference/skill-routing-value-standard.md.

advanced-workflows

16
from diegosouzapw/awesome-omni-skill

Multi-tool orchestration patterns for complex Bluera Knowledge operations. Teaches progressive library exploration, adding libraries with job monitoring, handling large result sets, multi-store searches, and error recovery workflows.

Advanced Typescript Type Level

16
from diegosouzapw/awesome-omni-skill

Master TypeScript type-level programming with conditional types, mapped types, template literals, and infer patterns. Use when writing advanced types, creating utility types, or solving complex type challenges.

advanced-typescript-patterns

16
from diegosouzapw/awesome-omni-skill

Advanced TypeScript patterns for TMNL. Covers conditional types, mapped types, branded types, generic constraints, type inference, and utility type composition. Pure TypeScript patterns beyond Effect Schema.