youtube-ingest
Transcribe YouTube videos and playlists using Gemini Flash
Best use case
youtube-ingest is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using youtube-ingest should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/youtube-ingest/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
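The manual installation steps can be sketched as a small shell helper. This is only an illustration: the function name `install_skill` is hypothetical, and it assumes SKILL.md has already been downloaded to a local path.

```shell
# Hypothetical helper, not shipped with the skill.
# Copies an already-downloaded SKILL.md into the skills directory
# of the given project.
install_skill() {
  src="$1"            # path to the downloaded SKILL.md
  project_root="$2"   # root directory of your project
  dest_dir="$project_root/.claude/skills/youtube-ingest"

  # Create the skill directory if needed, then copy the file into place.
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/SKILL.md"
}
```

From the project root, with the file downloaded into the current directory, this would be called as `install_skill ./SKILL.md .` before restarting the agent.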
How youtube-ingest Compares
| Feature / Agent | youtube-ingest | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Transcribe YouTube videos and playlists using Gemini Flash
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# YouTube Ingest

Downloads YouTube audio via yt-dlp and produces polished markdown transcripts via Gemini Flash.

## When to Use

- User wants to transcribe a YouTube video or playlist
- Building a knowledge base from YouTube content
- Extracting insights from video lectures, interviews, talks

## Commands

```bash
# Transcribe a single video
bun run youtube-ingest/scripts/ingest.ts --url "https://youtube.com/watch?v=..."

# List videos in a playlist
bun run youtube-ingest/scripts/ingest.ts --playlist "https://youtube.com/playlist?list=..." --list

# Transcribe first 5 videos from playlist
bun run youtube-ingest/scripts/ingest.ts --playlist "https://youtube.com/playlist?list=..." --limit 5
```

## Requirements

- `GEMINI_API_KEY` environment variable
- `yt-dlp` installed (`brew install yt-dlp`)
- Output goes to `output/youtube/` by default

## Output Format

Each video produces a markdown file with YAML frontmatter containing title, channel, videoId, source URL, and full transcript with sections.

## Cost

~$0.005 per hour of audio (Gemini Flash).
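Before running the commands from the SKILL.md source, it may help to confirm that the two listed requirements are in place. The following is a minimal preflight sketch: `check_requirements` is a hypothetical helper, not part of the skill, and it checks only that `GEMINI_API_KEY` is set and that `yt-dlp` is on the `PATH`.

```shell
# Hypothetical preflight helper; not part of the youtube-ingest skill itself.
# Returns 0 when both listed requirements are present, 1 otherwise.
check_requirements() {
  missing=0

  # The skill expects the Gemini API key in the environment.
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "missing: GEMINI_API_KEY environment variable" >&2
    missing=1
  fi

  # yt-dlp must be resolvable on PATH for the audio download step.
  if ! command -v yt-dlp >/dev/null 2>&1; then
    echo "missing: yt-dlp (install with: brew install yt-dlp)" >&2
    missing=1
  fi

  return "$missing"
}
```

One way to use it is to chain it in front of an ingest command with `&&`, so that the transcription only starts when both checks pass.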
Related Skills
podcast-ingest
Transcribe podcast episodes from RSS feeds using Gemini Flash
youtube-title-creator
Generate high-CTR YouTube titles and thumbnails using framework fitting method. Match content to 119 proven formulas from Creator Hooks, apply psychological principles, test variations.
youtube-scriptwriting
Transform raw ideas and brain dumps into polished YouTube scripts through a structured checkpoint workflow. Use when the user wants to write a YouTube script, improve video retention, craft hooks, structure educational or entertainment content, or turn source material (transcripts, notes, research) into a compelling video script. Guides through research, hook writing, story structure, body content, and editing phases.
youtube-downloader
Download YouTube video transcripts when user provides a YouTube URL or asks to download/get/fetch a transcript from YouTube. Also use when user wants to transcribe or get captions/subtitles from a YouTube video.
youtube-clip-extractor
Download YouTube videos, identify compelling clips from transcripts, cut clips with ffmpeg, and generate platform-optimized on-screen text and captions. Complete workflow from URL to publishable clips.
x-viral-template-miner
When the user wants to find proven-to-travel post templates in their niche and adapt them to their own product. Also use when the user mentions "what's going viral in my space", "what are competitors posting", "copy a viral post", "trending on X", "post ideas", "template mining", or "what to post this week". This is trend hunting, not plagiarism — the output is a template the user fills with their own assets.
x-linkedin-content-relay
When the user has X (Twitter) content that performed well and wants to relay it to LinkedIn 1-2 weeks later with reframing. Also use when the user mentions "repost to LinkedIn", "LinkedIn version of my tweet", "X to LinkedIn", "delayed repost", "LinkedIn for non-tech audience", or "LinkedIn relay". Also use when the user's ICP is non-tech and X is secondary — LinkedIn is the primary channel and this skill produces the content.
x-launch-video-structure
When the user is planning, scripting, or editing a product launch video for X (Twitter) and needs the structure. Also use when the user mentions "launch video", "demo video", "product launch on X", "60 second demo", "how to structure a launch", or "my launch video isn't working". Produces a beat-by-beat timing sheet, not copy.
x-account-warmup
When a user wants to grow an X (Twitter) account from zero before a product launch, or asks how to get first followers, warm up the algorithm, hit ~500-1,000 followers, or prepare an account to make a launch video land. Also use when the user mentions "new X account", "warm up my Twitter", "first 1000 followers", "building in public strategy", "X growth", or "engagement before launch".
skill-stack-thumbnails
Generate blog post thumbnails for Skill Stack using the brand aesthetic. Follows an iterative workflow - brainstorm concepts, get approval, generate with Gemini API.
web-scrape
Scrape web pages to clean markdown with optional AI summaries
voice-tyler-cowen
Write in Tyler Cowen's style - matter-of-fact, understated, treats enormous ideas as obvious observations. Read the passages. Absorb the flatness. Channel the HOW, not the content.