video-ad-analyzer

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.


by Demerzels-lab


Best use case

video-ad-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using video-ad-analyzer can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/meta-video-ad-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/fortytwode/meta-video-ad-analyzer/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/meta-video-ad-analyzer/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How video-ad-analyzer Compares

| Feature / Agent | video-ad-analyzer | Standard Approach |
|-----------------|-------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

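What does this skill do?

It extracts and analyzes content from video ads using Gemini Vision AI: frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis.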

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.


SKILL.md Source

# Video Ad Analyzer

AI-powered video content extraction using Google Gemini Vision.

## What This Skill Does

- **Frame Extraction**: Smart sampling with scene change detection
- **OCR Text Detection**: Extract text overlays using EasyOCR
- **Audio Transcription**: Convert speech to text with Google Cloud Speech
- **AI Scene Analysis**: Describe each scene using Gemini Vision
- **Native Video Analysis**: Direct video understanding for longer content
- **Thumbnail Generation**: Auto-generate thumbnails from first frame
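
The Key Features table in this document lists native video analysis for files under 20MB. A minimal sketch of how a caller might route a file by size is shown here. The function name `choose_analysis_mode`, the returned labels, the reading of "20MB" as 20 MiB, and the fallback to frame sampling for larger files are all assumptions for illustration, not the skill's documented behavior.

```python
import os

# Assumption: "20MB" is interpreted as 20 MiB; the skill may use a different definition.
NATIVE_LIMIT_BYTES = 20 * 1024 * 1024


def choose_analysis_mode(video_path, limit_bytes=NATIVE_LIMIT_BYTES):
    """Return "native" for files smaller than the limit, otherwise "frames".

    Hypothetical helper: the labels and the frame-sampling fallback are
    illustrative and not part of the skill's API.
    """
    size = os.path.getsize(video_path)
    return "native" if size < limit_bytes else "frames"
```

A caller could use the returned label to decide whether to hand the whole file to Gemini or to sample frames first.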

## Setup

### 1. Environment Variables

```bash
# Required for Gemini Vision and for audio transcription.
# The same service account also needs the Speech-to-Text API enabled.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
```
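
Before running an analysis, it can help to confirm that the credentials variable is set and points at a real file. The sketch below is one way to do that with the standard library only; `check_credentials` is a hypothetical helper name, not part of this skill.

```python
import os
from pathlib import Path


def check_credentials(env=None):
    """Return the service-account path if it is set and exists, else raise.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(raw).expanduser()
    if not path.is_file():
        raise RuntimeError(f"Credentials file not found: {path}")
    return path
```

This only checks that the file exists. It does not verify that the service account has the required APIs enabled.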

### 2. Dependencies

```bash
pip install opencv-python pillow easyocr ffmpeg-python google-cloud-speech vertexai google-api-python-client
```

The `ffmpeg` and `ffprobe` binaries must also be installed on the system.
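
A quick way to check for the two system binaries is `shutil.which`, which looks a command up on `PATH`. The helper name `missing_binaries` below is hypothetical and only illustrates the check.

```python
import shutil


def missing_binaries(names=("ffmpeg", "ffprobe")):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Example use:
#   missing = missing_binaries()
#   if missing:
#       raise SystemExit(f"Missing required binaries: {', '.join(missing)}")
```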

## Usage

### Basic Video Analysis

```python
from scripts.video_extractor import VideoExtractor
from scripts.models import ExtractedVideoContent
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize Vertex AI
vertexai.init(project="your-project-id", location="us-central1")
gemini_model = GenerativeModel("gemini-1.5-flash")

# Create extractor
extractor = VideoExtractor(gemini_model=gemini_model)

# Analyze video
result = extractor.extract_content("/path/to/video.mp4")

print(f"Duration: {result.duration}s")
print(f"Scenes: {len(result.scene_timeline)}")
print(f"Text overlays: {len(result.text_timeline)}")
print(f"Transcript: {result.transcript[:200]}...")
```

### Extract Only Frames

```python
frames, timestamps, text_timeline, scene_timeline, thumbnail = extractor.extract_smart_frames(
    "/path/to/video.mp4",
    scene_interval=2,    # Check for scene changes every 2s
    text_interval=0.5    # Check for text every 0.5s
)
```
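
The scene-change check is described under Key Features as histogram-based with a threshold of 65. The sketch below illustrates the general idea in plain Python on flat lists of 8-bit grayscale pixel values. The L1 histogram distance, the 0 to 100 score scale, the 16 bins, and all function names are assumptions made for illustration; the skill's actual metric and scale are not documented here.

```python
def gray_histogram(pixels, bins=16):
    """Normalized histogram of 8-bit grayscale values (entries sum to 1.0).

    Assumes `pixels` is a non-empty sequence of ints in 0..255.
    """
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1
    total = len(pixels)
    return [count / total for count in counts]


def change_score(prev_pixels, curr_pixels, bins=16):
    """Histogram difference on a 0-100 scale (0 = identical, 100 = disjoint).

    Assumption: L1 distance between normalized histograms, rescaled to 0..100.
    """
    prev_hist = gray_histogram(prev_pixels, bins)
    curr_hist = gray_histogram(curr_pixels, bins)
    l1 = sum(abs(a - b) for a, b in zip(prev_hist, curr_hist))  # ranges 0..2
    return 50.0 * l1


def detect_scene_changes(frames, timestamps, threshold=65):
    """Return timestamps whose frame differs from the previous one by more than threshold."""
    changes = []
    for i in range(1, len(frames)):
        if change_score(frames[i - 1], frames[i]) > threshold:
            changes.append(timestamps[i])
    return changes
```

With frames sampled every `scene_interval` seconds, only the timestamps returned by `detect_scene_changes` would need a fresh scene description.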

### Analyze Images

```python
# Works with images too
result = extractor.extract_content("/path/to/image.jpg")
print(result.scene_timeline[0]['description'])
```

## Output Structure

```python
ExtractedVideoContent(
    video_path="/path/to/video.mp4",
    duration=30.5,
    transcript="Here's what we found...",
    text_timeline=[
        {"at": 0.0, "text": ["Download Now"]},
        {"at": 5.5, "text": ["50% Off Today"]}
    ],
    scene_timeline=[
        {"timestamp": 0.0, "description": "Woman using phone app..."},
        {"timestamp": 2.0, "description": "Product showcase with features..."}
    ],
    thumbnail_url="/static/thumbnails/video_thumb.jpg",
    extraction_complete=True
)
```
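
Because `text_timeline` keys its entries by `at` and `scene_timeline` by `timestamp`, it can be convenient to merge both into one chronological list. The following is a minimal sketch that operates on plain lists of dicts shaped like the example above; `merge_timelines` and the `time`/`kind`/`content` keys are hypothetical names, not part of the skill.

```python
def merge_timelines(text_timeline, scene_timeline):
    """Combine both timelines into one list of events sorted by time."""
    events = []
    for entry in scene_timeline:
        events.append(
            {"time": entry["timestamp"], "kind": "scene", "content": entry["description"]}
        )
    for entry in text_timeline:
        events.append(
            {"time": entry["at"], "kind": "text", "content": " | ".join(entry["text"])}
        )
    # sorted() is stable, so at equal times a scene entry stays ahead of a text entry.
    return sorted(events, key=lambda event: event["time"])
```

Called with the timelines from the example output, this yields the events in the order scene (0.0), text (0.0), scene (2.0), text (5.5).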

## Key Features

| Feature | Description |
|---------|-------------|
| Scene Detection | Histogram-based change detection (threshold=65) |
| OCR Confidence | Tiered thresholds (0.5 high, 0.3 low) |
| AI Proofreading | Gemini cleans up OCR errors |
| Source Reconciliation | Merges OCR + Vision text intelligently |
| Native Video | Direct Gemini analysis for <20MB files |
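
The tiered OCR thresholds can be pictured as a simple three-way split. The sketch below assumes input shaped like EasyOCR's default `readtext` output, a list of `(bbox, text, confidence)` tuples. What the skill actually does with each tier is not stated here, so the behavior shown (accept at 0.5 and above, flag 0.3 to 0.5 for review, drop the rest) and the name `tier_ocr_results` are assumptions.

```python
HIGH_CONFIDENCE = 0.5
LOW_CONFIDENCE = 0.3


def tier_ocr_results(results, high=HIGH_CONFIDENCE, low=LOW_CONFIDENCE):
    """Split (bbox, text, confidence) tuples into accepted and needs-review texts.

    Hypothetical helper: entries below `low` are dropped as likely noise.
    """
    accepted, review = [], []
    for _bbox, text, confidence in results:
        if confidence >= high:
            accepted.append(text)
        elif confidence >= low:
            review.append(text)
    return accepted, review
```

One plausible use of the review tier, again as an assumption, is to pass those strings to the AI proofreading step rather than trusting them directly.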

## Prompts

Customize AI behavior by editing prompts in the `prompts/` folder:

- `scene_analysis.md` - Frame analysis prompts
- `scene_reconciliation.md` - Scene enrichment prompts

## Common Questions This Answers

- "What text appears in this video ad?"
- "Describe each scene in this creative"
- "What does the narrator say?"
- "Extract the call-to-action from this ad"
