video-ad-analyzer

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.

7 stars

Best use case

video-ad-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.

Teams using video-ad-analyzer should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/meta-video-ad-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/fortytwode/meta-video-ad-analyzer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/meta-video-ad-analyzer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How video-ad-analyzer Compares

Feature / Agentvideo-ad-analyzerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# Video Ad Analyzer

AI-powered video content extraction using Google Gemini Vision.

## What This Skill Does

- **Frame Extraction**: Smart sampling with scene change detection
- **OCR Text Detection**: Extract text overlays using EasyOCR
- **Audio Transcription**: Convert speech to text with Google Cloud Speech
- **AI Scene Analysis**: Describe each scene using Gemini Vision
- **Native Video Analysis**: Direct video understanding for longer content
- **Thumbnail Generation**: Auto-generate thumbnails from first frame

## Setup

### 1. Environment Variables

```bash
# Required for Gemini Vision
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Required for audio transcription
# (same service account needs Speech-to-Text API enabled)
```

### 2. Dependencies

```bash
pip install opencv-python pillow easyocr ffmpeg-python google-cloud-speech vertexai google-api-python-client
```

Also requires `ffmpeg` and `ffprobe` installed on system.

## Usage

### Basic Video Analysis

```python
from scripts.video_extractor import VideoExtractor
from scripts.models import ExtractedVideoContent
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize Vertex AI
vertexai.init(project="your-project-id", location="us-central1")
gemini_model = GenerativeModel("gemini-1.5-flash")

# Create extractor
extractor = VideoExtractor(gemini_model=gemini_model)

# Analyze video
result = extractor.extract_content("/path/to/video.mp4")

print(f"Duration: {result.duration}s")
print(f"Scenes: {len(result.scene_timeline)}")
print(f"Text overlays: {len(result.text_timeline)}")
print(f"Transcript: {result.transcript[:200]}...")
```

### Extract Only Frames

```python
frames, timestamps, text_timeline, scene_timeline, thumbnail = extractor.extract_smart_frames(
    "/path/to/video.mp4",
    scene_interval=2,    # Check for scene changes every 2s
    text_interval=0.5    # Check for text every 0.5s
)
```

### Analyze Images

```python
# Works with images too
result = extractor.extract_content("/path/to/image.jpg")
print(result.scene_timeline[0]['description'])
```

## Output Structure

```python
ExtractedVideoContent(
    video_path="/path/to/video.mp4",
    duration=30.5,
    transcript="Here's what we found...",
    text_timeline=[
        {"at": 0.0, "text": ["Download Now"]},
        {"at": 5.5, "text": ["50% Off Today"]}
    ],
    scene_timeline=[
        {"timestamp": 0.0, "description": "Woman using phone app..."},
        {"timestamp": 2.0, "description": "Product showcase with features..."}
    ],
    thumbnail_url="/static/thumbnails/video_thumb.jpg",
    extraction_complete=True
)
```

## Key Features

| Feature | Description |
|---------|-------------|
| Scene Detection | Histogram-based change detection (threshold=65) |
| OCR Confidence | Tiered thresholds (0.5 high, 0.3 low) |
| AI Proofreading | Gemini cleans up OCR errors |
| Source Reconciliation | Merges OCR + Vision text intelligently |
| Native Video | Direct Gemini analysis for <20MB files |

## Prompts

Customize AI behavior by editing prompts in the `prompts/` folder:

- `scene_analysis.md` - Frame analysis prompts
- `scene_reconciliation.md` - Scene enrichment prompts

## Common Questions This Answers

- "What text appears in this video ad?"
- "Describe each scene in this creative"
- "What does the narrator say?"
- "Extract the call-to-action from this ad"

Related Skills

Portfolio Risk & Optimization Analyzer

7
from Demerzels-lab/elsamultiskillagent

**AI-powered crypto portfolio risk analysis with automated $BANKR buyback monetization.**

video-watcher

7
from Demerzels-lab/elsamultiskillagent

Analyze video content by extracting frames at regular intervals.

youtube-video-downloader

7
from Demerzels-lab/elsamultiskillagent

Download YouTube videos in various formats and qualities. Use when you need to save videos for offline viewing, extract audio, download playlists, or get specific video formats.

ai-notes-video

7
from Demerzels-lab/elsamultiskillagent

The video AI notes tool is provided by Baidu.

ai-notes-of-video

7
from Demerzels-lab/elsamultiskillagent

The video AI notes tool is provided by Baidu.

fliz-ai-video-generator

7
from Demerzels-lab/elsamultiskillagent

Complete integration guide for the Fliz REST API - an AI-powered video generation platform that transforms text content into professional videos with voiceovers, AI-generated images, and subtitles. Use this skill when: - Creating integrations with Fliz API (WordPress, Zapier, Make, n8n, custom apps) - Building video generation workflows via API - Implementing webhook handlers for video completion notifications - Developing automation tools that create, manage, or translate videos - Troubleshooting Fliz API errors or authentication issues - Understanding video processing steps and status polling Key capabilities: video creation from text/Brief, video status monitoring, translation, duplication, voice/music listing, webhook notifications.

seedance-video-generation

7
from Demerzels-lab/elsamultiskillagent

Generate AI videos using ByteDance Seedance.

seedance-video-generation-byteplus

7
from Demerzels-lab/elsamultiskillagent

Generate AI videos using BytePlus Seedance API (International)

terrain-route-video

7
from Demerzels-lab/elsamultiskillagent

Generate a minimalist terrain-style animated driving route video (MP4) from a list of stops (cities/POIs)

worthclip-youtube-video-scorer

7
from Demerzels-lab/elsamultiskillagent

AI-powered YouTube video scoring.

videogames

7
from Demerzels-lab/elsamultiskillagent

A skill to lookup video game information, prices, compatibility, and duration.

universal-video-downloader

7
from Demerzels-lab/elsamultiskillagent

Download videos from YouTube, Instagram, TikTok, Twitter/X, and 1000+ other sites using yt-dlp.