video-understand Skill

## Overview

506 stars

Best use case

The video-understand skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using the video-understand skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/video-understand/SKILL.md --create-dirs "https://raw.githubusercontent.com/xiaotianfotos/skills/main/video-understand/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/video-understand/SKILL.md inside your project
  3. Restart your AI agent so that it auto-discovers the skill

How the video-understand Skill Compares

| Feature / Agent | video-understand Skill | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

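It sends video or image content to a multimodal Omni model and uses a natural-language request to produce the output you need, such as a generation prompt that recreates the video, a scene description, an audio transcription, or a content analysis.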

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

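# video-understand Skill

## Overview

Understands video and image content through a multimodal Omni model and uses natural-language instructions to produce whatever output you need: a generation prompt that recreates the video, scene descriptions, audio transcription, content analysis, and more.

The current implementation is built on `xiaomi/mimo-v2-omni` via OpenRouter, but the tool can be adapted to any multimodal model that accepts video or image input (not only Omni-type models): just replace the model ID and the corresponding API integration.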
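
To illustrate why swapping models can be a small change, here is a minimal sketch of an OpenAI-style chat request body of the kind OpenRouter's chat completions endpoint accepts. It is not taken from `scripts/openrouter-mimo-omni.js`: the helper names `buildRequestBody` and `send` are hypothetical, and it assumes the target model accepts `image_url` content parts (video handling is left out because the source does not describe it).

```javascript
// Hypothetical helper, not the skill's actual code: builds a chat request
// body with one text part followed by base64-encoded image parts.
function buildRequestBody({ model, prompt, images = [] }) {
  const content = [{ type: "text", text: prompt }];
  for (const { mimeType, base64 } of images) {
    content.push({
      type: "image_url",
      image_url: { url: `data:${mimeType};base64,${base64}` },
    });
  }
  return { model, messages: [{ role: "user", content }] };
}

// Hypothetical sender using the global fetch available in Node 18+.
async function send(requestBody, apiKey = process.env.OPENROUTER_API_KEY) {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(requestBody),
  });
  if (!res.ok) throw new Error(`OpenRouter request failed: ${res.status}`);
  return res.json();
}

// Switching models only changes the `model` string in this sketch.
const example = buildRequestBody({
  model: "xiaomi/mimo-v2-omni",
  prompt: "Describe this image",
  images: [{ mimeType: "image/jpeg", base64: Buffer.from("demo").toString("base64") }],
});
console.log(JSON.stringify(example, null, 2));
```

Keeping the model ID as a plain parameter means a different provider or model would mostly require changing that string and, if its API differs, the `send` function.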

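## Core Features

Hand a video or image to the model together with a description of what you need, and it generates the requested content directly:

- A generation prompt that recreates the video (for use with AI video generation tools)
- Video content understanding and description
- Audio transcription
- Comparative analysis across multiple media files
- Other custom requests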

## Usage

> Prerequisite: `export OPENROUTER_API_KEY=your_api_key`

```bash
# Single video
node scripts/openrouter-mimo-omni.js -p "your request" -v video.mp4

# Multiple images
node scripts/openrouter-mimo-omni.js -p "your request" -i img1.jpg -i img2.jpg

# Image and video combined
node scripts/openrouter-mimo-omni.js -p "your request" -i photo.jpg -v video.mp4

# Set the number of frames to extract
node scripts/openrouter-mimo-omni.js -p "your request" -v video.mp4 --frames 15
```
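
The commands above repeat `-i` to pass several images. As a rough sketch of how a Node.js script could accept that flag shape, the example below uses the built-in `util.parseArgs`. It is an assumption for illustration, not the actual parser in `scripts/openrouter-mimo-omni.js`: the long option names `--prompt`, `--video`, and `--image` and the function `parseCli` are hypothetical, and only `-p`, `-v`, `-i`, and `--frames` appear in the source.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical parser: the long names are placeholders required by parseArgs.
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      prompt: { type: "string", short: "p" },
      video: { type: "string", short: "v", multiple: true },
      image: { type: "string", short: "i", multiple: true },
      frames: { type: "string" },
    },
  });

  if (!values.prompt) {
    throw new Error("a prompt (-p) is required");
  }

  // --frames is optional; when given it must be a positive integer.
  let frames;
  if (values.frames !== undefined) {
    frames = Number.parseInt(values.frames, 10);
    if (!Number.isInteger(frames) || frames <= 0) {
      throw new Error("--frames must be a positive integer");
    }
  }

  return {
    prompt: values.prompt,
    videos: values.video ?? [],
    images: values.image ?? [],
    frames,
  };
}

// Example: the "image and video combined" invocation with a frame count.
console.log(
  parseCli(["-p", "your request", "-i", "photo.jpg", "-v", "video.mp4", "--frames", "15"])
);
```

Declaring `-i` and `-v` with `multiple: true` collects repeated flags into arrays, which matches the multi-image usage shown above.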

## Notes

- `ffmpeg` must be installed (used to automatically compress large videos)
- `OPENROUTER_API_KEY` must be set
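
Both requirements can be checked before running the script. The following is a minimal preflight sketch, assuming `ffmpeg` is expected on the `PATH`. The function names `ffmpegAvailable` and `missingPrerequisites` are hypothetical and not part of the skill.

```javascript
const { spawnSync } = require("node:child_process");

// Returns true if `ffmpeg -version` can be launched and exits successfully.
function ffmpegAvailable() {
  const result = spawnSync("ffmpeg", ["-version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Hypothetical check: lists which prerequisites are missing.
// Both arguments can be injected, which keeps the function easy to test.
function missingPrerequisites(env = process.env, hasFfmpeg = ffmpegAvailable()) {
  const missing = [];
  if (!env.OPENROUTER_API_KEY) missing.push("OPENROUTER_API_KEY");
  if (!hasFfmpeg) missing.push("ffmpeg");
  return missing;
}

const missingNow = missingPrerequisites();
if (missingNow.length > 0) {
  console.error(`Missing prerequisites: ${missingNow.join(", ")}`);
} else {
  console.log("All prerequisites found.");
}
```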
