gemini-yt-video-transcript

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

533 stars

Best use case

gemini-yt-video-transcript is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

Teams using gemini-yt-video-transcript should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/gemini-yt-video-transcript/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/awesome-openclaw-skills/main/skills/gemini-yt-video-transcript/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/gemini-yt-video-transcript/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How gemini-yt-video-transcript Compares

Feature / Agentgemini-yt-video-transcriptStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# Gemini YouTube Video Transcript

Create a **verbatim transcript** for a YouTube URL using **Google Gemini**.

**Output format**
- First line: YouTube video title
- Then transcript lines only in the form:

```
Speaker: text
```

**Requirements**
- No time codes
- No extra headings / lists / commentary

## Usage

```bash
python3 {baseDir}/scripts/youtube_transcript.py "https://www.youtube.com/watch?v=..."
```

Options:
- `--out <path>` Write transcript to a specific file (default: auto-named in the workspace `out/` folder).

## Delivery

When chatting: send the resulting transcript as a document/attachment.

Related Skills

google-gemini-media

533
from sundial-org/awesome-openclaw-skills

Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".

gemini

533
from sundial-org/awesome-openclaw-skills

Gemini CLI for one-shot Q&A, summaries, and generation.

gemini-stt

533
from sundial-org/awesome-openclaw-skills

Transcribe audio files using Google's Gemini API or Vertex AI

gemini-image-simple

533
from sundial-org/awesome-openclaw-skills

Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.

gemini-deep-research

533
from sundial-org/awesome-openclaw-skills

Perform complex, long-running research tasks using Gemini Deep Research Agent. Use when asked to research topics requiring multi-source synthesis, competitive analysis, market research, or comprehensive technical investigations that benefit from systematic web search and analysis.

gemini-computer-use

533
from sundial-org/awesome-openclaw-skills

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.

ffmpeg-video-editor

533
from sundial-org/awesome-openclaw-skills

Generate FFmpeg commands from natural language video editing requests - cut, trim, convert, compress, change aspect ratio, extract audio, and more.

demo-video

533
from sundial-org/awesome-openclaw-skills

Create product demo videos by automating browser interactions and capturing frames. Use when the user wants to record a demo, walkthrough, product showcase, or interactive video of a web application. Supports Playwright CDP screencast for high-quality capture and FFmpeg for video encoding.

ai-video-gen

533
from sundial-org/awesome-openclaw-skills

End-to-end AI video generation - create videos from text prompts using image generation, video synthesis, voice-over, and editing. Supports OpenAI DALL-E, Replicate models, LumaAI, Runway, and FFmpeg editing.

portfolio-watcher

533
from sundial-org/awesome-openclaw-skills

Monitor stock/crypto holdings, get price alerts, track portfolio performance

portainer

533
from sundial-org/awesome-openclaw-skills

Control Docker containers and stacks via Portainer API. List containers, start/stop/restart, view logs, and redeploy stacks from git.

portable-tools

533
from sundial-org/awesome-openclaw-skills

Build cross-device tools without hardcoding paths or account names