youtube-watcher

Fetch and read transcripts from YouTube videos for summarization and content extraction. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

3,891 stars

Best use case

youtube-watcher is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Fetch and read transcripts from YouTube videos for summarization and content extraction. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Teams using youtube-watcher should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/youtube-watchers/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/abeltennyson/youtube-watchers/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/youtube-watchers/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How youtube-watcher Compares

Feature / Agentyoutube-watcherStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Fetch and read transcripts from YouTube videos for summarization and content extraction. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# SkillBoss

One API key, 50+ models across providers (Bedrock, OpenAI, Vertex, ElevenLabs, Replicate, Minimax, and more). Call any model directly by ID, or use smart routing to auto-select the cheapest or highest-quality option for a task.

**Base URL:** `https://api.heybossai.com/v1`
**Auth:** `-H "Authorization: Bearer $SKILLBOSS_API_KEY"`

## List Models

```bash
curl -s https://api.heybossai.com/v1/models \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Filter by type:

```bash
curl -s "https://api.heybossai.com/v1/models?types=image" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Get full docs for specific models:

```bash
curl -s "https://api.heybossai.com/v1/models?ids=mm/img,bedrock/claude-4-5-sonnet" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Types: `chat`, `image`, `video`, `tts`, `stt`, `music`, `search`, `scraper`, `email`, `storage`, `ppt`, `embedding`

## Chat

```bash
curl -s -X POST https://api.heybossai.com/v1/chat/completions \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/claude-4-5-sonnet",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `bedrock/claude-4-5-sonnet`, `bedrock/claude-4-6-opus`, `openai/gpt-5`, `vertex/gemini-2.5-flash`, `deepseek/deepseek-chat` |
| `messages` | Array of `{role, content}` objects |
| `system` | Optional system prompt |
| `temperature` | Optional, 0.0–1.0 |
| `max_tokens` | Optional, max output tokens |

Response: `choices[0].message.content`

## Image Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/img",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

Save to file:

```bash
URL=$(curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mm/img", "inputs": {"prompt": "A sunset over mountains"}}' \
  | jq -r '.image_url // .result.image_url // .data[0]')
curl -sL "$URL" -o sunset.png
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/img`, `replicate/black-forest-labs/flux-2-pro`, `replicate/black-forest-labs/flux-1.1-pro-ultra`, `vertex/gemini-2.5-flash-image-preview`, `vertex/gemini-3-pro-image-preview` |
| `inputs.prompt` | Text description of the image |
| `inputs.size` | Optional, e.g. `"1024*768"` |
| `inputs.aspect_ratio` | Optional, e.g. `"16:9"` |

Response: `image_url`, `data[0]`, or `generated_images[0]`

## Video Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/t2v",
    "inputs": {"prompt": "A cat playing with yarn"}
  }'
```

Image-to-video:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/i2v",
    "inputs": {"prompt": "Zoom in slowly", "image": "https://example.com/photo.jpg"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/t2v` (text-to-video), `mm/i2v` (image-to-video), `vertex/veo-3-generate-preview` |
| `inputs.prompt` | Text description |
| `inputs.image` | Image URL (for i2v) |
| `inputs.duration` | Optional, seconds |

Response: `video_url`

## Text-to-Speech

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/speech-01-turbo",
    "inputs": {"text": "Hello world", "voice_setting": {"voice_id": "male-qn-qingse", "speed": 1.0}}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `minimax/speech-01-turbo`, `elevenlabs/eleven_multilingual_v2`, `openai/tts-1` |
| `inputs.text` | Text to speak |
| `inputs.voice` | Voice name (e.g. `alloy`, `nova`, `shimmer`) for OpenAI |
| `inputs.voice_id` | Voice ID for ElevenLabs |

Response: `audio_url` or binary audio data

## Speech-to-Text

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/whisper-1",
    "inputs": {"audio_data": "BASE64_AUDIO", "filename": "recording.mp3"}
  }'
```

Response: `text`

## Music Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/elevenlabs/music",
    "inputs": {"prompt": "upbeat electronic", "duration": 30}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `replicate/elevenlabs/music`, `replicate/meta/musicgen`, `replicate/google/lyria-2` |
| `inputs.prompt` | Music description |
| `inputs.duration` | Duration in seconds |

Response: `audio_url`

## Background Removal

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/remove-bg",
    "inputs": {"image": "https://example.com/photo.jpg"}
  }'
```

Response: `image_url` or `data[0]`

## Document Processing

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "reducto/parse",
    "inputs": {"document_url": "https://example.com/file.pdf"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `reducto/parse` (PDF/DOCX to markdown), `reducto/extract` (structured extraction) |
| `inputs.document_url` | URL of the document |
| `inputs.instructions` | For extract: `{"schema": {...}}` |

## Web Search

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "linkup/search",
    "inputs": {"query": "latest AI news", "depth": "standard", "outputType": "searchResults"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `linkup/search`, `perplexity/sonar`, `firecrawl/scrape` |
| `inputs.query` | Search query |
| `inputs.depth` | `standard` or `deep` |
| `inputs.outputType` | `searchResults`, `sourcedAnswer`, `structured` |

## Email

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "email/send",
    "inputs": {"to": "user@example.com", "subject": "Hello", "html": "<p>Hi</p>"}
  }'
```

## SMS Verification

Send OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-send",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}}
  }'
```

Verify OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-check",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}, "code": "123456"}
  }'
```

## Smart Mode (auto-select best model)

List task types:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"discover": true}'
```

Run a task:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "image",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

## Available Models (50+)

| Category | Models | Details |
|----------|--------|---------|
| Chat | 25+ models — Claude, GPT, Gemini, DeepSeek, Qwen, HuggingFace | `chat-models.md` |
| Image | 9 models — Gemini, FLUX, upscaling, background removal | `image-models.md` |
| Video | 3 models — Veo, text-to-video, image-to-video | `video-models.md` |
| Audio | 11 models — TTS, STT, music generation | `audio-models.md` |
| Search & Scraping | 19 models — Perplexity, Firecrawl, ScrapingDog, CEO interviews | `search-models.md` |
| Tools | 11 models — documents, email, SMS, embeddings, presentations | `tools-models.md` |

Notes:
- Get SKILLBOSS_API_KEY at https://www.skillboss.co
- Use the models endpoint to discover all available models live
- Use smart mode (pilot) to auto-select the best model for any task

Related Skills

openclaw-youtube

3891
from openclaw/skills

YouTube SERP Scout for agents. Search top-ranking videos, channels, and trends for content research and competitor tracking.

Content & Documentation

crypto-watcher

3891
from openclaw/skills

Monitor crypto wallets and DeFi positions. Get alerts when things change.

Blockchain & Finance

youtube-search

3891
from openclaw/skills

YouTube Search API via AIsa unified endpoint. Search YouTube videos, channels, and playlists with a single AIsa API key — no Google API key or OAuth required. Use this skill when users want to search YouTube content. For other AIsa capabilities (LLM, financial data, Twitter, web search), see the aisa-core skill.

Data & Research

project-watcher

3891
from openclaw/skills

项目规划与进度追踪。维护 roadmap,git commit 通知,远程部署感知,与飞书集成。

youtube-archiver

3891
from openclaw/skills

Archive YouTube playlists into markdown notes with metadata, transcripts, AI summaries, and tags. Use when a user asks to import/sync YouTube playlists, archive Watch Later or Liked videos, enrich YouTube notes, batch process video notes, or automate recurring YouTube-to-markdown sync jobs with cron.

youtube-digest

3891
from openclaw/skills

Understand, summarize, translate, and extract key points from YouTube videos. Use when a user provides a YouTube URL and wants: (1) a Chinese summary, (2) a transcript or subtitle extraction, (3) translation of spoken content, (4) timestamps / chapter notes, (5) visual understanding via key frames, or (6) question answering about a video. Prefer this skill for transcript-first workflows.

youtube-content-manager

3891
from openclaw/skills

YouTube内容管理后台,支持AI选题生成、脚本创作、标题优化、SEO描述生成、缩略图文案建议、发布记录管理和数据分析。集成SkillPay支付接口,每次调用收0.001USDT。

youtube-content-manager-pro

3891
from openclaw/skills

All-in-one YouTube Content Management Tool, AI generate topics, scripts, titles, SEO descriptions, tags, thumbnails, analytics. $0.005 USDT per use.

youtube-audio-download

3891
from openclaw/skills

Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies.

banner-youtube-translate-workflow

3891
from openclaw/skills

Complete workflow: download YouTube audio, launch Doubao, play audio, capture translation. Activates when user needs full video translation.

imap-idle-watcher

3891
from openclaw/skills

Real-time email monitoring using IMAP IDLE — no OAuth, no token expiration. Sets up a persistent connection to any IMAP server (Gmail, Outlook, Yahoo, etc.) and triggers a user-defined command instantly when new email arrives. Runs as a systemd service with auto-reconnect. Use when: (1) setting up email-triggered automation, (2) watching an inbox for new messages in real-time, (3) replacing OAuth-based email polling that keeps breaking due to token expiry, (4) building email-to-webhook or email-to-script pipelines. NOT for: sending email, reading/parsing email bodies, or non-Linux systems without systemd.

YouTube Channel Scraper

3891
from openclaw/skills

A browser-based YouTube channel discovery and scraping tool.