youtube-watcher

Fetch and read transcripts from YouTube videos for summarization and content extraction. Also provides access to 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, email, and SMS.

3,891 stars

by openclaw


Best use case

youtube-watcher is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using youtube-watcher can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/youtube-watchers/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/abeltennyson/youtube-watchers/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/youtube-watchers/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How youtube-watcher Compares

| Feature / Agent | youtube-watcher | Standard Approach |
|-----------------|-----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

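What does this skill do?

According to its listing, it fetches and reads transcripts from YouTube videos for summarization and content extraction. The bundled SKILL.md also documents the SkillBoss API, which gives access to 50+ models through one API key.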

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agent for YouTube Script Writing

Find AI agent skills for YouTube script writing, video research, content outlining, and repeatable channel production workflows.

AI Agents for Marketing

Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.

Best AI Agents for Marketing

A curated list of the best AI agents and skills for marketing teams focused on SEO, content systems, outreach, and campaign execution.

SKILL.md Source

# SkillBoss

One API key, 50+ models across providers (Bedrock, OpenAI, Vertex, ElevenLabs, Replicate, Minimax, and more). Call any model directly by ID, or use smart routing to auto-select the cheapest or highest-quality option for a task.

**Base URL:** `https://api.heybossai.com/v1`
**Auth:** `-H "Authorization: Bearer $SKILLBOSS_API_KEY"`
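
The base URL and bearer header repeat in every POST request in this file, so a small wrapper can cut the repetition. The following is a minimal sketch, not part of SkillBoss: the function name `sb_post` and the `SKILLBOSS_BASE_URL` variable are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SkillBoss): POST a JSON body to an API path.
SKILLBOSS_BASE_URL="https://api.heybossai.com/v1"

sb_post() {
  local path="$1" body="$2"
  # Fail early with a clear message if the key is missing.
  if [ -z "${SKILLBOSS_API_KEY:-}" ]; then
    echo "SKILLBOSS_API_KEY is not set" >&2
    return 1
  fi
  curl -s -X POST "${SKILLBOSS_BASE_URL}${path}" \
    -H "Authorization: Bearer ${SKILLBOSS_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

With the key exported, a call might look like `sb_post /run '{"model": "mm/img", "inputs": {"prompt": "A sunset over mountains"}}'`.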

## List Models

```bash
curl -s https://api.heybossai.com/v1/models \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Filter by type:

```bash
curl -s "https://api.heybossai.com/v1/models?types=image" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Get full docs for specific models:

```bash
curl -s "https://api.heybossai.com/v1/models?ids=mm/img,bedrock/claude-4-5-sonnet" \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY"
```

Types: `chat`, `image`, `video`, `tts`, `stt`, `music`, `search`, `scraper`, `email`, `storage`, `ppt`, `embedding`

## Chat

```bash
curl -s -X POST https://api.heybossai.com/v1/chat/completions \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/claude-4-5-sonnet",
    "messages": [{"role": "user", "content": "Explain quantum computing"}]
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `bedrock/claude-4-5-sonnet`, `bedrock/claude-4-6-opus`, `openai/gpt-5`, `vertex/gemini-2.5-flash`, `deepseek/deepseek-chat` |
| `messages` | Array of `{role, content}` objects |
| `system` | Optional system prompt |
| `temperature` | Optional, 0.0–1.0 |
| `max_tokens` | Optional, max output tokens |

Response: `choices[0].message.content`
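
To use the reply in a script, the field above can be pulled out with `jq`. This is a minimal sketch that assumes the response follows the `choices[0].message.content` shape stated above; the `extract_reply` function and the sample JSON are illustrative only.

```bash
# Hypothetical helper: read a chat response on stdin, print the reply text.
extract_reply() {
  # `// empty` prints nothing (instead of "null") when the field is absent.
  jq -r '.choices[0].message.content // empty'
}

# Illustrative sample in the documented shape; a real response may carry more fields.
sample='{"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}'
printf '%s' "$sample" | extract_reply
```

In practice the function would sit at the end of the `curl` pipeline shown above, for example `curl ... | extract_reply`.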

## Image Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/img",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

Save to file:

```bash
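# Request an image, then read the URL from whichever field the model returns.
URL=$(curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mm/img", "inputs": {"prompt": "A sunset over mountains"}}' \
  | jq -r '.image_url // .result.image_url // .data[0] // .generated_images[0] // empty')
# Download only if a URL was found.
[ -n "$URL" ] && curl -sL "$URL" -o sunset.png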
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/img`, `replicate/black-forest-labs/flux-2-pro`, `replicate/black-forest-labs/flux-1.1-pro-ultra`, `vertex/gemini-2.5-flash-image-preview`, `vertex/gemini-3-pro-image-preview` |
| `inputs.prompt` | Text description of the image |
| `inputs.size` | Optional, e.g. `"1024*768"` |
| `inputs.aspect_ratio` | Optional, e.g. `"16:9"` |

Response: `image_url`, `data[0]`, or `generated_images[0]`

## Video Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/t2v",
    "inputs": {"prompt": "A cat playing with yarn"}
  }'
```

Image-to-video:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mm/i2v",
    "inputs": {"prompt": "Zoom in slowly", "image": "https://example.com/photo.jpg"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `mm/t2v` (text-to-video), `mm/i2v` (image-to-video), `vertex/veo-3-generate-preview` |
| `inputs.prompt` | Text description |
| `inputs.image` | Image URL (for i2v) |
| `inputs.duration` | Optional, seconds |

Response: `video_url`

## Text-to-Speech

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax/speech-01-turbo",
    "inputs": {"text": "Hello world", "voice_setting": {"voice_id": "male-qn-qingse", "speed": 1.0}}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `minimax/speech-01-turbo`, `elevenlabs/eleven_multilingual_v2`, `openai/tts-1` |
| `inputs.text` | Text to speak |
| `inputs.voice` | Voice name (e.g. `alloy`, `nova`, `shimmer`) for OpenAI |
| `inputs.voice_id` | Voice ID for ElevenLabs |
| `inputs.voice_setting` | Object with `voice_id` and `speed` for Minimax, as in the example above |

Response: `audio_url` or binary audio data

## Speech-to-Text

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/whisper-1",
    "inputs": {"audio_data": "BASE64_AUDIO", "filename": "recording.mp3"}
  }'
```

Response: `text`
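
The `audio_data` field in the request above carries the audio file as base64. Below is a minimal sketch of building that request body from a local file. `build_stt_payload` is a hypothetical name, and the sketch assumes a simple filename with no quotes or backslashes, since it does no JSON escaping.

```bash
# Hypothetical helper: print a /v1/run request body for a local audio file.
build_stt_payload() {
  local file="$1" audio
  # Strip newlines so the base64 text is a single JSON-safe string
  # (GNU base64 wraps its output at 76 columns by default).
  audio=$(base64 < "$file" | tr -d '\n')
  # Assumes the filename needs no JSON escaping.
  printf '{"model": "openai/whisper-1", "inputs": {"audio_data": "%s", "filename": "%s"}}\n' \
    "$audio" "$(basename "$file")"
}
```

The output can be piped straight into the request, for example `build_stt_payload recording.mp3 | curl -s -X POST https://api.heybossai.com/v1/run -H "Authorization: Bearer $SKILLBOSS_API_KEY" -H "Content-Type: application/json" -d @-`, where `-d @-` tells curl to read the body from standard input.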

## Music Generation

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/elevenlabs/music",
    "inputs": {"prompt": "upbeat electronic", "duration": 30}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `replicate/elevenlabs/music`, `replicate/meta/musicgen`, `replicate/google/lyria-2` |
| `inputs.prompt` | Music description |
| `inputs.duration` | Duration in seconds |

Response: `audio_url`

## Background Removal

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/remove-bg",
    "inputs": {"image": "https://example.com/photo.jpg"}
  }'
```

Response: `image_url` or `data[0]`

## Document Processing

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "reducto/parse",
    "inputs": {"document_url": "https://example.com/file.pdf"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `reducto/parse` (PDF/DOCX to markdown), `reducto/extract` (structured extraction) |
| `inputs.document_url` | URL of the document |
| `inputs.instructions` | For extract: `{"schema": {...}}` |

## Web Search

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "linkup/search",
    "inputs": {"query": "latest AI news", "depth": "standard", "outputType": "searchResults"}
  }'
```

| Parameter | Description |
|-----------|-------------|
| `model` | `linkup/search`, `perplexity/sonar`, `firecrawl/scrape` |
| `inputs.query` | Search query |
| `inputs.depth` | `standard` or `deep` |
| `inputs.outputType` | `searchResults`, `sourcedAnswer`, `structured` |

## Email

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "email/send",
    "inputs": {"to": "user@example.com", "subject": "Hello", "html": "<p>Hi</p>"}
  }'
```

## SMS Verification

Send OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-send",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}}
  }'
```

Verify OTP:

```bash
curl -s -X POST https://api.heybossai.com/v1/run \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "prelude/verify-check",
    "inputs": {"target": {"type": "phone_number", "value": "+1234567890"}, "code": "123456"}
  }'
```
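
The two requests above differ only in the model ID and the extra `code` field, so the bodies can be generated from a phone number and a code. This is a minimal sketch under stated assumptions: the function names `otp_send_payload` and `otp_check_payload` are hypothetical, and inputs are assumed to be plain digits with an optional leading `+`, so no JSON escaping is done.

```bash
# Hypothetical helpers: print request bodies for the two OTP steps.
# Assumes phone numbers and codes contain no characters that need JSON escaping.
otp_send_payload() {
  printf '{"model": "prelude/verify-send", "inputs": {"target": {"type": "phone_number", "value": "%s"}}}\n' "$1"
}

otp_check_payload() {
  printf '{"model": "prelude/verify-check", "inputs": {"target": {"type": "phone_number", "value": "%s"}, "code": "%s"}}\n' "$1" "$2"
}
```

Each body can then be passed to the `/v1/run` endpoint with `-d "$(otp_send_payload "+1234567890")"` in place of the inline JSON.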

## Smart Mode (auto-select best model)

List task types:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"discover": true}'
```

Run a task:

```bash
curl -s -X POST https://api.heybossai.com/v1/pilot \
  -H "Authorization: Bearer $SKILLBOSS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "image",
    "inputs": {"prompt": "A sunset over mountains"}
  }'
```

## Available Models (50+)

| Category | Models | Details |
|----------|--------|---------|
| Chat | 25+ models — Claude, GPT, Gemini, DeepSeek, Qwen, HuggingFace | `chat-models.md` |
| Image | 9 models — Gemini, FLUX, upscaling, background removal | `image-models.md` |
| Video | 3 models — Veo, text-to-video, image-to-video | `video-models.md` |
| Audio | 11 models — TTS, STT, music generation | `audio-models.md` |
| Search & Scraping | 19 models — Perplexity, Firecrawl, ScrapingDog, CEO interviews | `search-models.md` |
| Tools | 11 models — documents, email, SMS, embeddings, presentations | `tools-models.md` |

Notes:
- Get SKILLBOSS_API_KEY at https://www.skillboss.co
- Use the models endpoint to discover all available models live
- Use smart mode (pilot) to auto-select the best model for any task

Related Skills

openclaw-youtube

3891

from openclaw/skills

YouTube SERP Scout for agents. Search top-ranking videos, channels, and trends for content research and competitor tracking.

Content & Documentation

crypto-watcher

3891

from openclaw/skills

Monitor crypto wallets and DeFi positions. Get alerts when things change.

Blockchain & Finance

youtube-search

3891

from openclaw/skills

YouTube Search API via AIsa unified endpoint. Search YouTube videos, channels, and playlists with a single AIsa API key — no Google API key or OAuth required. Use this skill when users want to search YouTube content. For other AIsa capabilities (LLM, financial data, Twitter, web search), see the aisa-core skill.

Data & Research

project-watcher

3891

from openclaw/skills

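Project planning and progress tracking. Maintains a roadmap, sends git commit notifications, detects remote deployments, and integrates with Feishu.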

youtube-archiver

3891

from openclaw/skills

Archive YouTube playlists into markdown notes with metadata, transcripts, AI summaries, and tags. Use when a user asks to import/sync YouTube playlists, archive Watch Later or Liked videos, enrich YouTube notes, batch process video notes, or automate recurring YouTube-to-markdown sync jobs with cron.

youtube-digest

3891

from openclaw/skills

Understand, summarize, translate, and extract key points from YouTube videos. Use when a user provides a YouTube URL and wants: (1) a Chinese summary, (2) a transcript or subtitle extraction, (3) translation of spoken content, (4) timestamps / chapter notes, (5) visual understanding via key frames, or (6) question answering about a video. Prefer this skill for transcript-first workflows.

youtube-content-manager

3891

from openclaw/skills

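YouTube content management dashboard supporting AI topic generation, script writing, title optimization, SEO description generation, thumbnail copy suggestions, publishing record management, and data analytics. Integrates the SkillPay payment interface and charges 0.001 USDT per call.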

youtube-content-manager-pro

3891

from openclaw/skills

All-in-one YouTube content management tool: AI-generated topics, scripts, titles, SEO descriptions, tags, thumbnails, and analytics. $0.005 USDT per use.

youtube-audio-download

3891

from openclaw/skills

Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies.

banner-youtube-translate-workflow

3891

from openclaw/skills

Complete workflow: download YouTube audio, launch Doubao, play audio, capture translation. Activates when user needs full video translation.

imap-idle-watcher

3891

from openclaw/skills

Real-time email monitoring using IMAP IDLE — no OAuth, no token expiration. Sets up a persistent connection to any IMAP server (Gmail, Outlook, Yahoo, etc.) and triggers a user-defined command instantly when new email arrives. Runs as a systemd service with auto-reconnect. Use when: (1) setting up email-triggered automation, (2) watching an inbox for new messages in real-time, (3) replacing OAuth-based email polling that keeps breaking due to token expiry, (4) building email-to-webhook or email-to-script pipelines. NOT for: sending email, reading/parsing email bodies, or non-Linux systems without systemd.

YouTube Channel Scraper

3891

from openclaw/skills

A browser-based YouTube channel discovery and scraping tool.