Best use case
The video-understand skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using the video-understand skill should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/video-understand/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
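The manual steps above can be sketched as a small shell helper. This is a minimal illustration, not part of the skill: the function name `install_skill` and its arguments are hypothetical, and it assumes you have already downloaded SKILL.md to a local path.

```shell
# Hypothetical helper (not shipped with the skill): copy a downloaded
# SKILL.md into the project's skill directory.
# Usage: install_skill <path-to-SKILL.md> [project-dir]
install_skill() {
  src="$1"
  project="${2:-.}"   # default to the current directory
  dest="$project/.claude/skills/video-understand"

  if [ ! -f "$src" ]; then
    echo "SKILL.md not found at: $src" >&2
    return 1
  fi

  # Create the skill directory if needed, then copy the file into place.
  mkdir -p "$dest" && cp "$src" "$dest/SKILL.md"
}
```

For example, `install_skill ./SKILL.md .` would place the file under the current project; the agent still needs a restart afterwards to discover it.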
How the video-understand skill Compares
| Feature / Agent | video-understand skill | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
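It uses a multimodal Omni model to understand video and image content, and produces whatever output you request in natural language, such as a generation prompt that recreates the video, scene descriptions, audio transcription, or content analysis.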
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
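# video-understand skill

## Overview

Understands video and image content through a multimodal Omni model, with natural language driving whatever output you need: a generation prompt that recreates the video, scene descriptions, audio transcription, content analysis, and more.

The current implementation is based on `xiaomi/mimo-v2-omni` on OpenRouter, but the tool itself can be adapted to any multimodal model that supports video/image input (not limited to Omni-type models); you only need to replace the model ID and the corresponding API integration.

## Core features

Hand a video or image to the model together with a description of what you need, and it generates the requested content directly:

- A generation prompt that recreates the video (for AI video generation tools)
- Video content understanding and description
- Audio transcription
- Multimedia comparative analysis
- Other custom requests

## Usage

> Prerequisite: `export OPENROUTER_API_KEY=your_api_key`

```bash
# Single video
node scripts/openrouter-mimo-omni.js -p "your request" -v video.mp4

# Multiple images
node scripts/openrouter-mimo-omni.js -p "your request" -i img1.jpg -i img2.jpg

# Mixed images + video
node scripts/openrouter-mimo-omni.js -p "your request" -i photo.jpg -v video.mp4

# Specify the number of frames to extract
node scripts/openrouter-mimo-omni.js -p "your request" -v video.mp4 --frames 15
```

## Notes

- Requires `ffmpeg` (used to automatically compress large videos)
- Requires `OPENROUTER_API_KEY` to be set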
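
The two requirements in the notes above can be verified before invoking the script. The following is a minimal sketch and is not part of SKILL.md; the function name `check_prereqs` is hypothetical, and it only checks that `ffmpeg` is on the `PATH` and that `OPENROUTER_API_KEY` is non-empty, not that the key is valid.

```shell
# Hypothetical pre-flight check (not shipped with the skill).
# Returns 0 when both prerequisites are met, 1 otherwise.
check_prereqs() {
  status=0

  # ffmpeg must be discoverable on PATH.
  if ! command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg not found on PATH" >&2
    status=1
  fi

  # The API key must be set to a non-empty value.
  if [ -z "${OPENROUTER_API_KEY:-}" ]; then
    echo "OPENROUTER_API_KEY is not set" >&2
    status=1
  fi

  return "$status"
}
```

One possible use is to gate the command, for example `check_prereqs && node scripts/openrouter-mimo-omni.js -p "your request" -v video.mp4`.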
Related Skills
tutor
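One-on-one tutor skill for solving math problems, generating HTML explanation documents and Manim animation videos with voice-over. Core workflow: math analysis → HTML visualization → storyboard script → TTS audio → verification and update → scaffolding → Manim code → render verification. Trigger conditions: the student pastes an image of a math problem, needs a teaching video, or needs HTML explanation material.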
dlna
Control DLNA MediaRenderer devices. Discover devices and play media URLs on DLNA-compatible TVs, speakers, and media players. Supports default device configuration.
claude-in-tmux
Run Claude Code CLI inside tmux sessions within Docker containers. Use this skill when the user wants to run Claude Code in a tmux session, check running tmux sessions, read logs from tmux sessions, or manage claude.sh wrapper script in OpenClaw agent containers.
demo-video
Create product demo videos by automating browser interactions and capturing frames. Use when the user wants to record a demo, walkthrough, product showcase, or interactive video of a web application. Supports Playwright CDP screencast for high-quality capture and FFmpeg for video encoding.
remotion-video-creation
Best practices for Remotion - Video creation in React. 29 domain-specific rules covering 3D, animations, audio, captions, charts, transitions, and more.
understanding-streamlit-architecture
Explains Streamlit's internal architecture including backend runtime, frontend rendering, and WebSocket communication. Use when debugging cross-layer issues, understanding how features work end-to-end, planning architectural changes, or onboarding to the codebase. Covers ForwardMsg/BackMsg protocol, script rerun model, element tree, widget state management, and more.
videodb
Video and audio perception, indexing, and editing. Ingest files/URLs/live streams, build visual/spoken indexes, search with timestamps, edit timelines, add overlays/subtitles, generate media, and create real-time alerts.
videodb-skills
Upload, stream, search, edit, transcribe, and generate AI video and audio using the VideoDB SDK.
seek-and-analyze-video
Seek and analyze video content using Memories.ai Large Visual Memory Model for persistent video intelligence
azure-ai-contentunderstanding-py
Azure AI Content Understanding SDK for Python. Use for multimodal content extraction from documents, images, audio, and video.
short-video-script-generator-pro
AI Short Video Script Generator, support TikTok/YouTube Shorts/Instagram Reels, auto generate hook, shots, voiceover, subtitles, BGM, CTA. $0.005 USDT per use.
ai-notes-of-video
The video AI notes tool is provided by Baidu. Based on the video download address provided by the user, it downloads and parses the video, and finally generates AI notes corresponding to the video (a total of three types of notes can be generated: document notes, outline notes, and image-text notes).