video-toolkit
Intelligent video processor for downloading media and extracting transcripts from YouTube and 1000+ supported sites. Automatically handles format selection, subtitle extraction, and post-processing.
Best use case
video-toolkit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using video-toolkit should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/video-toolkit/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
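The manual steps above can be sketched as a small script. This is a minimal sketch, assuming SKILL.md has already been downloaded to a local path. The helper name `install_skill` and its parameters are hypothetical and are not part of video-toolkit. The SKILL.md source also refers to files under scripts/ and references/, which this sketch does not copy.

```python
from pathlib import Path
import shutil


def install_skill(skill_md: Path, project_root: Path) -> Path:
    """Copy a downloaded SKILL.md into the project's skill directory.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Target layout taken from the installation steps:
    # <project>/.claude/skills/video-toolkit/SKILL.md
    target_dir = project_root / ".claude" / "skills" / "video-toolkit"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "SKILL.md"
    shutil.copyfile(skill_md, target)
    return target
```

After the copy, restarting the agent is still a manual step.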
How video-toolkit Compares
| Feature / Agent | video-toolkit | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Intelligent video processor for downloading media and extracting transcripts from YouTube and 1000+ supported sites. Automatically handles format selection, subtitle extraction, and post-processing.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Video Toolkit

This skill provides intelligent capabilities to download video/audio and extract transcripts using `yt-dlp`.

## Usage Guidelines

**Goal-Oriented Workflow**: DO NOT ask for confirmation unless the user's intent is ambiguous.

- If user says "Download": Download the best available quality video.
- If user says "Transcribe" or "Summarize": Extract subtitles (auto-generated if needed) and convert to clean text.
- If user says "Audio only": Extract mp3/m4a.

## Smart Defaults

- **Language**: Match the video's recognized language. Fallback to English (`en`).
- **Cookies**: Use specific browser cookies (e.g., `--cookies-from-browser chrome`) **ONLY** if the download fails with "Sign in" or "403 Forbidden" errors.
- **Format**: `mp4` for video, `srt` -> `txt` for transcripts.

## Workflows

### 1. Transcript & Text Extraction

Best for extracting content for reading, summarizing, or RAG.

1. **Extract Subtitles**: Downloads subtitles (manual or auto-generated) without downloading the video.

   ```bash
   yt-dlp --write-sub --write-auto-sub --sub-lang "en,zh-Hans,.*" --convert-subs srt --skip-download -o "%(title)s.%(ext)s" "[URL]"
   ```

   *(Note: Adjust `--sub-lang` priority based on the video's likely language if known)*

2. **Convert to Text**: Run the cleaning script to produce a readable `.txt` file:

   ```bash
   python3 scripts/clean_transcript.py "[FILE].srt"
   ```

### 2. Media Download

Best for saving files locally.

- **Standard Video**:

  ```bash
  yt-dlp -o "%(title)s.%(ext)s" "[URL]"
  ```

- **Audio Only**:

  ```bash
  yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "[URL]"
  ```

### 3. Advanced & Troubleshooting

- **Permission Errors**: If you encounter 403 or Sign-in errors, **retry** with cookies: `yt-dlp --cookies-from-browser chrome ...` (or firefox/safari/edge depending on user OS).
- **Specific Formats**: If user requests specific resolution (e.g., 1080p) or format, consult the [yt-dlp Full Manual](references/yt-dlp-readme.md).
- **Unsupported Sites**: Check the [Supported Sites List](references/supports-site.md) if the URL seems obscure.

## References

- [yt-dlp Full Manual](references/yt-dlp-readme.md)
- [Supported Sites List](references/supports-site.md)
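The "Convert to Text" step calls `scripts/clean_transcript.py`, whose contents are not shown on this page. To illustrate what an `srt` -> `txt` cleaning pass could involve, here is a minimal sketch. It assumes standard SubRip input (cue index, timing line, text, blank line). The function name `srt_to_text` is hypothetical, and this is not the skill's actual script.

```python
import re

# Timing lines look like "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}[,.]\d{3}"
)
# Inline markup such as <i>, </font>, or <00:00:01.500> word-timing tags.
TAG = re.compile(r"<[^>]+>")


def srt_to_text(srt: str) -> str:
    """Reduce SRT content to plain text lines (illustrative sketch only)."""
    rows = srt.lstrip("\ufeff").splitlines()  # drop a leading BOM if present
    lines = []
    for i, raw in enumerate(rows):
        line = TAG.sub("", raw).strip()
        if not line or TIMING.match(line):
            continue  # skip blank separators and timing lines
        nxt = rows[i + 1].strip() if i + 1 < len(rows) else ""
        if line.isdigit() and TIMING.match(nxt):
            continue  # a number directly before a timing line is a cue index
        if lines and lines[-1] == line:
            continue  # auto-generated captions often repeat the previous line
        lines.append(line)
    return "\n".join(lines)
```

A caller would read the `.srt` file as UTF-8, pass its contents to `srt_to_text`, and write the result to a `.txt` file. Only consecutive duplicate lines are dropped, so a phrase that legitimately recurs later in the video is kept.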
Related Skills
vidu-video
Generate videos using the Vidu Q3 Pro model. Use this skill when the user wants text-to-video, wants to generate video with audio, or mentions vidu.
videodb-skills
Upload, stream, search, edit, transcribe, and generate AI video and audio using the VideoDB SDK.
videocut:安装
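Environment setup. Installs dependencies and downloads models. Trigger words: 安装 (install), 环境准备 (environment setup), 初始化 (initialize).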
video
Generate videos using fal.ai (Wan, Kling) or Sora. Text-to-video and image-to-video.
video-processing-editing
FFmpeg automation for cutting, trimming, concatenating videos. Audio mixing, timeline editing, transitions, effects. Export optimization for YouTube, social media. Subtitle handling, color grading, batch processing. Use for videogen projects, content creation, automated video production. Activate on "video editing", "FFmpeg", "trim video", "concatenate", "transitions", "export optimization". NOT for real-time video editing UI, 3D compositing, or motion graphics.
video-commercial
Generate 30-second video commercials from a concept. Creates storyboard, generates scene images, adds narration via ElevenLabs, assembles final video. Use when asked to create commercials, promo videos, video ads, or short marketing videos.
video-analyzer
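Intelligently analyzes Bilibili, YouTube, and local videos, generating transcripts, evaluations, and summaries. Supports automatic embedding of keyframe screenshots.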
Media Uploader - R2/S3 with video download
Upload files or download videos from popular platforms (YouTube, Vimeo, Bilibili, etc.) and upload to Cloudflare R2, AWS S3, or any S3-compatible storage with secure presigned download links.
ltxv2-video
Build LTX-V2 19B video workflows — text-to-video, image-to-video, distilled model, camera control LoRAs, and two-stage upscaling
edu-video-analyzer
Analyze educational YouTube channels for classroom adoption potential, curriculum alignment, and pedagogical effectiveness. Use when comparing educational video content (like MRU vs Crash Course), evaluating teaching methodologies, identifying content gaps for course design, or developing educational video strategy focused on student learning outcomes rather than monetization.
Automate YouTube Top-Ten Video Creation with OpenAI and Safe Image Search
Integrates OpenAI API for content generation, Bing Image Search API for safe image retrieval, and Pexels API for video footage. Handles authentication via Bearer token, enforces safe search, formats ChatGPT responses into a top-ten list, and includes error handling for API failures.
apex-video-generator
Generate real estate marketing videos from property data. Use when creating property showcases, social media content, market reports, or neighborhood tours. Integrates Firecrawl scraped data with Remotion rendering.