gemini-video-analyzer

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe speech, identify objects and actions. Use when: (1) User sends a video file and wants it analyzed, (2) Video summarization or description needed, (3) Extracting text, UI elements, or information from screen recordings, (4) Answering questions about video content, (5) Comparing multiple videos, (6) Analyzing tutorials, demos, or walkthroughs.

3,891 stars

Best use case

gemini-video-analyzer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using gemini-video-analyzer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/gemini-video-analyzer/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aiwithabidi/gemini-video-analyzer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/gemini-video-analyzer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How gemini-video-analyzer Compares

| Feature / Agent | gemini-video-analyzer | Standard Approach |
|------|------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe speech, identify objects and actions. Use when: (1) User sends a video file and wants it analyzed, (2) Video summarization or description needed, (3) Extracting text, UI elements, or information from screen recordings, (4) Answering questions about video content, (5) Comparing multiple videos, (6) Analyzing tutorials, demos, or walkthroughs.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Gemini Video Analyzer

Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.

## Quick Start

```bash
# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4

# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"

# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup
```
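
The `list` and `cleanup` commands are not shown in this document, so the following is only a sketch of what such helpers might look like. It assumes a client object that behaves like `google.genai.Client` (with `files.list()` and `files.delete(name=...)`); the function names `list_uploaded` and `cleanup_uploaded` are hypothetical and not taken from `manage_files.py`.

```python
def list_uploaded(client):
    """Return the names of files currently stored for this API key."""
    # client.files.list() is assumed to yield objects with a .name attribute.
    return [f.name for f in client.files.list()]


def cleanup_uploaded(client):
    """Delete every stored file and return how many were removed."""
    names = list_uploaded(client)
    for name in names:
        client.files.delete(name=name)
    return len(names)
```

Passing the client in as an argument keeps the helpers easy to exercise with a stand-in object before pointing them at a real account.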

## Supported Formats

MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
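
A pre-upload check can catch unsupported files early. The sketch below is illustrative only: `check_video` is a hypothetical helper, not part of the skill's scripts, and it assumes "2GB" means 2 GiB (2 × 1024³ bytes).

```python
from pathlib import Path
from typing import List, Optional

# Extensions from the list above. The byte limit is an assumption (2 GiB).
SUPPORTED_EXTENSIONS = {
    ".mp4", ".avi", ".mov", ".mkv", ".webm",
    ".flv", ".mpeg", ".mpg", ".wmv", ".3gp",
}
MAX_BYTES = 2 * 1024**3


def check_video(path: str, size_bytes: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable.

    size_bytes may be passed explicitly (useful for testing); otherwise the
    size is read from disk.
    """
    problems = []
    p = Path(path)
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if size_bytes is None:
        if not p.is_file():
            problems.append("file not found")
            return problems
        size_bytes = p.stat().st_size
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```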

## How It Works

1. The video uploads to Google's Files API (temporary storage; files auto-delete after 48h)
2. Gemini samples the video at 1 frame/sec and interprets motion, transitions, and audio context
3. The model generates a response based on your prompt

Because the model sees the sampled frames and the audio together, this generally captures temporal content better than manual frame extraction.
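
The upload, process, and generate steps above can be sketched in Python. This is not the contents of `analyze.py`, which is not shown here; it is a minimal sketch assuming a client that behaves like `google.genai.Client` from the `google-genai` package. The function name `analyze_video`, the default prompt text, and the polling interval are all hypothetical.

```python
import time


def analyze_video(client, path, prompt="Describe this video in detail.",
                  model="gemini-2.5-flash", poll_seconds=5.0):
    """Upload a video, wait for server-side processing, then query the model.

    `client` is assumed to expose files.upload, files.get, and
    models.generate_content in the style of google.genai.Client.
    """
    video = client.files.upload(file=path)

    # Uploaded videos are processed before they can be referenced in a prompt.
    while video.state.name == "PROCESSING":
        time.sleep(poll_seconds)
        video = client.files.get(name=video.name)

    if video.state.name != "ACTIVE":
        raise RuntimeError(f"upload ended in state {video.state.name}")

    response = client.models.generate_content(model=model, contents=[video, prompt])
    return response.text


# Possible usage, assuming the google-genai package is installed:
#   import os
#   from google import genai
#   client = genai.Client(api_key=os.environ["GOOGLE_AI_API_KEY"])
#   print(analyze_video(client, "/path/to/video.mp4"))
```

Taking the client as a parameter keeps the function independent of how the API key is loaded.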

## Use Cases

| Task | Example Prompt |
|------|---------------|
| General description | *(default — no prompt needed)* |
| UI/text extraction | `"What text and UI elements are visible?"` |
| Tutorial summary | `"Summarize the steps shown in this tutorial"` |
| Bug report from video | `"Describe what went wrong in this screen recording"` |
| Meeting notes | `"Summarize the key points discussed"` |
| Content comparison | Upload 2 videos, ask for differences |

## Configuration

Set `GOOGLE_AI_API_KEY` in your environment or `.env` file. Get a free key at [aistudio.google.com](https://aistudio.google.com/apikey).

Default model: `gemini-2.5-flash` (fast, cheap, excellent vision). Override with `--model gemini-2.5-pro` for complex analysis.
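
As a rough illustration of how the key lookup and the `--model` override could fit together, here is a sketch built on `argparse`. The actual argument handling in `analyze.py` is not shown in this document, so `build_parser` and `get_api_key` are hypothetical names, and `.env` loading is left out.

```python
import argparse
import os

DEFAULT_MODEL = "gemini-2.5-flash"


def build_parser():
    """Command line: <video> [prompt] [--model NAME]."""
    parser = argparse.ArgumentParser(description="Analyze a video with Gemini")
    parser.add_argument("video", help="path to the video file")
    parser.add_argument("prompt", nargs="?", default=None,
                        help="optional question; omitted means a full description")
    parser.add_argument("--model", default=DEFAULT_MODEL,
                        help="model name, e.g. gemini-2.5-pro for complex analysis")
    return parser


def get_api_key(env=None):
    """Read GOOGLE_AI_API_KEY, failing with a clear message if it is missing."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_AI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_AI_API_KEY is not set")
    return key
```

For example, `build_parser().parse_args(["clip.mp4", "--model", "gemini-2.5-pro"])` yields the pro model, while omitting the flag falls back to `gemini-2.5-flash`.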

## API Reference

See [references/gemini-files-api.md](references/gemini-files-api.md) for file upload limits, processing details, and advanced options.

## Credits

Built by [M. Abidi](https://www.agxntsix.ai) · [LinkedIn](https://www.linkedin.com/in/mohammad-ali-abidi) · [YouTube](https://youtube.com/@aiwithabidi) · [GitHub](https://github.com/aiwithabidi) · [Book a Call](https://cal.com/agxntsix/abidi-openclaw)
