video-transcript-downloader
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.
Best use case
video-transcript-downloader is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using video-transcript-downloader should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/video-transcript-downloader/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How video-transcript-downloader Compares
| Feature / Agent | video-transcript-downloader | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Video Transcript Downloader

`./scripts/vtd.js` can:

- Print a transcript as a clean paragraph (timestamps optional).
- Download video/audio/subtitles.

Transcript behavior:

- YouTube: fetch via `youtube-transcript-plus` when possible.
- Otherwise: pull subtitles via `yt-dlp`, then clean into a paragraph.

## Setup

```bash
cd ~/Projects/agent-scripts/skills/video-transcript-downloader && npm ci
```

## Transcript (default: clean paragraph)

```bash
./scripts/vtd.js transcript --url 'https://…'
./scripts/vtd.js transcript --url 'https://…' --lang en
./scripts/vtd.js transcript --url 'https://…' --timestamps
./scripts/vtd.js transcript --url 'https://…' --keep-brackets
```

## Download video / audio / subtitles

```bash
./scripts/vtd.js download --url 'https://…' --output-dir ~/Downloads
./scripts/vtd.js audio --url 'https://…' --output-dir ~/Downloads
./scripts/vtd.js subs --url 'https://…' --output-dir ~/Downloads --lang en
```

## Formats (list + choose)

List available formats (format ids, resolution, container, audio-only, etc):

```bash
./scripts/vtd.js formats --url 'https://…'
```

Download a specific format id (example):

```bash
./scripts/vtd.js download --url 'https://…' --output-dir ~/Downloads -- --format 137+140
```

Prefer MP4 container without re-encoding (remux when possible):

```bash
./scripts/vtd.js download --url 'https://…' --output-dir ~/Downloads -- --remux-video mp4
```

## Notes

- Default transcript output is a single paragraph. Use `--timestamps` only when asked.
- Bracketed cues like `[Music]` are stripped by default; keep them via `--keep-brackets`.
- Pass extra `yt-dlp` args after `--` for `transcript` fallback, `download`, `audio`, `subs`, `formats`.

```bash
./scripts/vtd.js formats --url 'https://…' -- -v
```

## Troubleshooting (only when needed)

- Missing `yt-dlp` / `ffmpeg`:

  ```bash
  brew install yt-dlp ffmpeg
  ```

- Verify:

  ```bash
  yt-dlp --version
  ffmpeg -version | head -n 1
  ```
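The fallback path described above (pull subtitles via `yt-dlp`, then clean them into a paragraph) can be illustrated with a small sketch. This is not the actual `vtd.js` code: the function name `cleanVttToParagraph` and its `keepBrackets` option are hypothetical, chosen to mirror the documented `--keep-brackets` flag, and the input is assumed to be WebVTT text.

```javascript
// Hypothetical helper, NOT the real vtd.js implementation.
// Assumption: the subtitles arrive as WebVTT text.
function cleanVttToParagraph(vtt, { keepBrackets = false } = {}) {
  const kept = [];
  for (const rawLine of vtt.split(/\r?\n/)) {
    let line = rawLine.trim();
    // Skip blank lines, the header, header metadata, numeric cue ids, and timing lines.
    if (
      line === '' ||
      line.startsWith('WEBVTT') ||
      /^(Kind|Language):/.test(line) ||
      /^\d+$/.test(line) ||
      line.includes('-->')
    ) {
      continue;
    }
    // Drop inline tags such as <c> or <00:00:01.000>.
    line = line.replace(/<[^>]*>/g, '');
    // Strip bracketed cues like [Music] unless the caller wants them kept.
    if (!keepBrackets) {
      line = line.replace(/\[[^\]]*\]/g, '');
    }
    line = line.replace(/\s+/g, ' ').trim();
    // Auto-generated captions often repeat the previous line; keep one copy.
    if (line !== '' && line !== kept[kept.length - 1]) {
      kept.push(line);
    }
  }
  return kept.join(' ');
}

const sample = [
  'WEBVTT',
  '',
  '00:00:00.000 --> 00:00:02.000',
  '[Music]',
  '',
  '00:00:02.000 --> 00:00:04.000',
  'Hello and welcome',
  '',
  '00:00:04.000 --> 00:00:06.000',
  'Hello and welcome',
  'to the show.',
].join('\n');

console.log(cleanVttToParagraph(sample));
// → Hello and welcome to the show.
console.log(cleanVttToParagraph(sample, { keepBrackets: true }));
// → [Music] Hello and welcome to the show.
```

The sketch only removes consecutive duplicate lines and would also drop a caption that consists solely of digits; a production cleaner would likely need more careful handling of both cases.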
Related Skills
youtube-transcript
Fetch and summarize YouTube video transcripts. Use when asked to summarize, transcribe, or extract content from YouTube videos. Handles transcript fetching via residential IP proxy to bypass YouTube's cloud IP blocks.
video-frames
Extract frames or short clips from videos using ffmpeg.
sglang-diffusion-video
Generate videos using a local SGLang-Diffusion server (Wan2.2, Hunyuan, FastWan, etc.). Use when: user asks to generate, create, or render a video with a locally running SGLang-Diffusion instance. NOT for: cloud-hosted video APIs or image generation (use sglang-diffusion for images). Requires a running SGLang-Diffusion server with a video model loaded.
seek-and-analyze-video
Video intelligence and content analysis using Memories.ai LVMM. Discover videos on TikTok, YouTube, Instagram by topic or creator. Analyze video content, summarize meetings, build searchable knowledge bases across multiple videos. Use for video research, competitor content analysis, meeting notes, lecture summaries, or building video knowledge libraries.
ltx-video
Generate videos via LTX-2.3 API (ltx.video). Supports text-to-video, image-to-video, audio-to-video (lip-sync from audio + image), extend, and retake. Use when: generating AI video from text/image/audio, animating a portrait, creating lip-sync video from an existing image + audio recording.
citedy-video-shorts
Generate branded AI avatar lip-sync video shorts for TikTok, Reels, and YouTube Shorts. Create 15-second talking-head videos with custom avatars, auto-generated scripts, and burned-in subtitles for $1.85.
youtube-watcher
Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.
youtube-auto-captions
YouTube automatic captions.
youtube
YouTube Data API integration with managed OAuth. Search videos, manage playlists, access channel data, and interact with comments. Use this skill when users want to interact with YouTube. For other third party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).
yahoo-finance
Get stock prices, quotes, fundamentals, earnings, options, dividends, and analyst ratings using Yahoo Finance. Uses yfinance library - no API key required.
xurl
A Twitter research and content intelligence skill focused on attracting WordPress and Shopify clients. Use to analyze Twitter profiles, threads, and conversations for: (1) Identifying what small agency founders and eCommerce brands are discussing; (2) Understanding pain points around WordPress performance, Shopify CRO, and development bottlenecks; (3) Extracting high-performing content angles; (4) Turning insights into authority-building posts; (5) Converting Twitter intelligence into business leverage for clear content angles, strong positioning, and qualified inbound leads.
xlsx
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.