Best use case
mv-generator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is MediaClaw's built-in skill for end-to-end music video generation.
Teams using mv-generator should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/mv-generator/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How mv-generator Compares
| Feature / Agent | mv-generator | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It is MediaClaw's built-in skill for end-to-end music video generation.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
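# MV Generator: Music Video Generation Pipeline
MediaClaw's built-in skill for end-to-end music video generation.
## Trigger phrases
- 生成MV (generate an MV)
- MV制作 (MV production)
- 音乐视频 (music video)
- mv generator
- 虚拟歌手MV (virtual singer MV)
## Pipeline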
```
Audio file + lyrics + character card
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 1: Audio analysis           │
│ scripts/audio-analyzer.py        │
│ → duration, BPM, scene timeline  │
└────────────────┬─────────────────┘
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 2: Lyrics parsing           │
│ scripts/lyrics-parser.py         │
│ → section structure, mood curve  │
└────────────────┬─────────────────┘
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 3: Scene and frame design   │
│ scripts/scene-design.py          │
│ → scene type, camera movement,   │
│   first/last-frame prompts       │
└────────────────┬─────────────────┘
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 4: Frame image generation   │
│ qingyun gpt-image-1 API          │
│ → first + last frame per scene   │
└────────────────┬─────────────────┘
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 5: Video clip generation    │
│ kling-2 / veo-3.1 / etc.         │
│ → 5 s clip: first → last frame   │
└────────────────┬─────────────────┘
                 │
                 ▼
┌──────────────────────────────────┐
│ Step 6: Final composition        │
│ FFmpeg merge                     │
│ → audio alignment, transitions,  │
│   subtitles                      │
└──────────────────────────────────┘
```
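The final FFmpeg step can be sketched as follows. This is a minimal illustration, not the skill's actual merge script: it only joins the clips with FFmpeg's concat demuxer and muxes the song over them, leaving transitions and subtitles out. The function names `write_concat_list` and `build_merge_command` are hypothetical, and stream copy (`-c:v copy`) assumes all clips share the same codec and resolution.

```python
from pathlib import Path
from typing import List


def write_concat_list(clips: List[str], list_path: str) -> str:
    """Write the text file read by ffmpeg's concat demuxer (one clip per line)."""
    # Relative paths in this file are resolved relative to the list file itself.
    lines = [f"file '{Path(clip).as_posix()}'" for clip in clips]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path


def build_merge_command(list_path: str, audio: str, output: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that joins clips and adds the song."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_path,  # input 0: the clip list
        "-i", audio,                                      # input 1: the song
        "-map", "0:v:0", "-map", "1:a:0",               # video from clips, audio from song
        "-c:v", "copy", "-c:a", "aac",                  # assumption: clips are codec-compatible
        "-shortest",                                      # stop at the shorter of the two streams
        output,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg` is installed; building the command separately keeps it easy to inspect or log before running.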
## Inputs
| Parameter | Required | Description |
|------|------|------|
| `audio_file` | ✅ | Path to the audio file (.mp3/.wav/.m4a) |
| `lyrics` | ✅ | Lyrics text |
| `character_card` | ✅ | Character card JSON (appearance, voice, style) |
| `style` | ❌ | cinematic / realistic / anime / ethereal |
| `num_scenes` | ❌ | Number of storyboard scenes (default 9) |
| `output_dir` | ❌ | Output directory (default: `MediaClaw/output/<project>/`) |
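One way to check these parameters before starting the pipeline is sketched below. The `MVRequest` class and its `validate` method are hypothetical helpers, not part of the skill; only the parameter names, allowed extensions, style values, and the default of 9 scenes come from the table above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a"}
STYLES = {"cinematic", "realistic", "anime", "ethereal"}


@dataclass
class MVRequest:
    """Hypothetical container for the skill's input parameters."""
    audio_file: str
    lyrics: str
    character_card: dict
    style: Optional[str] = None
    num_scenes: int = 9                # default from the parameter table
    output_dir: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if Path(self.audio_file).suffix.lower() not in AUDIO_EXTENSIONS:
            errors.append("audio_file must be .mp3, .wav, or .m4a")
        if not self.lyrics.strip():
            errors.append("lyrics must not be empty")
        if not self.character_card:
            errors.append("character_card must not be empty")
        if self.style is not None and self.style not in STYLES:
            errors.append(f"unknown style: {self.style}")
        if self.num_scenes < 1:
            errors.append("num_scenes must be at least 1")
        return errors
```

Collecting all problems in a list, rather than raising on the first one, lets the agent report every missing or malformed input in a single pass.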
## Output directory structure
```
output/<project-name>/
├── README.md              ← project summary + timeline
├── frames/                ← first and last frame images
│   ├── 01_first.png
│   ├── 01_last.png
│   └── ...
├── audio/                 ← audio files (multiple versions)
│   └── song_v1.mp3
├── script/                ← MV script
│   ├── mv_final.json      ← precise timeline JSON
│   ├── mv_final.md        ← human-readable script
│   └── character.json     ← character card
├── docs/                  ← documents and reference material
│   └── avatar.png
└── video/                 ← generated video clips (output of Step 5)
    ├── scene_01.mp4
    └── ...
```
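The layout above could be created up front with a small helper like the one below. This is a sketch under the assumption that the pipeline scripts expect these folders to exist; `create_project_layout` and `frame_paths` are hypothetical names, while the folder names and the `NN_first.png` / `NN_last.png` pattern follow the tree.

```python
from pathlib import Path
from typing import Tuple

SUBDIRS = ("frames", "audio", "script", "docs", "video")


def create_project_layout(root: str, project: str) -> Path:
    """Create output/<project-name>/ with its subfolders; safe to call twice."""
    project_dir = Path(root) / project
    for name in SUBDIRS:
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir


def frame_paths(project_dir: Path, scene: int) -> Tuple[Path, Path]:
    """Return the first/last frame paths for a 1-based scene number."""
    frames = project_dir / "frames"
    return frames / f"{scene:02d}_first.png", frames / f"{scene:02d}_last.png"
```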
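## Lessons learned (from the Chloe 《樱花落尽时》 project)
### Image generation best practices
| Item | Recommendation | Notes |
|------|------|------|
| Model | qingyun `gpt-image-1` | Highest success rate |
| API | `/v1/images/generations` | More stable than edits |
| Size | `1024x1536` (portrait) | Must be one of the supported sizes |
| Concurrency | 2-3 concurrent requests | Higher concurrency triggers SSL EOF errors |
| Retry | Retry failed items individually | Do not rerun the whole batch |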
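The concurrency and retry guidance above can be sketched with a bounded thread pool that re-submits only the failed items. This is an illustration, not the skill's own code: `generate_batch` is a hypothetical helper, and `generate` stands in for whatever function actually calls the image API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def generate_batch(
    prompts: List[str],
    generate: Callable[[str], object],
    max_workers: int = 2,   # keep concurrency low (2-3) to avoid SSL EOF errors
    max_retries: int = 2,
) -> Tuple[Dict[int, object], List[int]]:
    """Run `generate` over prompts; retry only the items that failed."""
    results: Dict[int, object] = {}
    pending = list(range(len(prompts)))
    for _attempt in range(max_retries + 1):
        if not pending:
            break
        # Leaving the `with` block waits for every submitted task to finish.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {i: pool.submit(generate, prompts[i]) for i in pending}
        failed = []
        for i, future in futures.items():
            try:
                results[i] = future.result()
            except Exception:
                failed.append(i)  # re-queue this item only, not the whole batch
        pending = failed
    return results, pending  # `pending` holds indices that never succeeded
```

Results are keyed by prompt index, so successful frames from an earlier attempt are kept while only the failures are submitted again.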
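### Character consistency tips
- **Shared prompt prefix**: include the character's core appearance traits
- **Reference image**: pass the character's avatar through the edits API (the success rate is low, though)
- **Scene qualifiers**: build each frame prompt from the fixed prefix plus scene variables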
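The fixed-prefix idea can be illustrated with a small prompt builder. The character card field names used here (`name`, `appearance`) are assumptions made for the example; the source does not specify the card's schema, and both function names are hypothetical.

```python
from typing import Dict, List


def character_prefix(card: Dict[str, object]) -> str:
    """Build the shared prefix from hypothetical `name` and `appearance` fields."""
    traits: List[str] = list(card.get("appearance", []))
    return ", ".join([str(card["name"])] + traits)


def frame_prompt(prefix: str, scene: str, frame: str) -> str:
    """Combine the fixed prefix with per-scene variables for one frame."""
    return f"{prefix}. {scene}. {frame} frame"


card = {"name": "Chloe", "appearance": ["long silver hair", "white dress"]}
prefix = character_prefix(card)
prompts = [frame_prompt(prefix, "cherry trees at dusk", f) for f in ("first", "last")]
```

Because every prompt starts with the same prefix string, the character description cannot drift between scenes; only the scene and frame parts vary.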
### Audio sync
- Read the MP3 duration with `mutagen` (no ffprobe needed)
- Allocate scene times in proportion to the song structure
- BPM detection requires `librosa` (optional)
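A minimal sketch of the duration lookup and proportional allocation follows. `mutagen.mp3.MP3(path).info.length` is mutagen's documented way to get the length in seconds; the `allocate_scene_times` helper and the idea of expressing song structure as relative weights are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def mp3_duration(path: str) -> float:
    """Read the MP3 length in seconds with mutagen (third-party dependency)."""
    from mutagen.mp3 import MP3  # imported lazily so the rest works without it
    return MP3(path).info.length


def allocate_scene_times(
    duration: float, weights: Sequence[float]
) -> List[Tuple[float, float]]:
    """Split `duration` into consecutive (start, end) spans proportional to weights."""
    total = sum(weights)
    timeline = []
    start = 0.0
    for i, weight in enumerate(weights):
        # Pin the last scene to the exact duration to avoid float drift.
        if i == len(weights) - 1:
            end = duration
        else:
            end = start + duration * weight / total
        timeline.append((round(start, 3), round(end, 3)))
        start = end
    return timeline
```

For example, a 180-second song with weights `[1, 2, 1]` (short intro, long chorus, short outro) yields spans of 45, 90, and 45 seconds.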
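### Video model selection
| Mood intensity | Model | Scene type |
|------|------|---------|
| ≤ 6.0 | kling-1.6 | Static / nature / traditional Chinese style (国风) |
| 6.0-8.0 | kling-1.6 or kling-2 | Transition |
| ≥ 8.0 | kling-2 | Climax / power |
| 360° rotation | veo-3.1 | Special camera movement |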
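The selection table can be expressed as a small lookup function. The table's ranges overlap at exactly 6.0 and 8.0, so this sketch makes an assumption: 6.0 is treated as the low band and 8.0 as the high band. `candidate_models` is a hypothetical name.

```python
from typing import List


def candidate_models(mood: float, rotation_360: bool = False) -> List[str]:
    """Return the video models the table suggests for a scene."""
    if rotation_360:
        return ["veo-3.1"]             # special camera movement overrides mood
    if mood >= 8.0:
        return ["kling-2"]             # climax / power scenes
    if mood <= 6.0:
        return ["kling-1.6"]           # static / nature scenes
    return ["kling-1.6", "kling-2"]    # transition band: either model
```

Returning a list rather than a single name keeps the "either model" case explicit, so the caller can pick by cost or availability.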
## Dependencies
- Python: mutagen, requests
- External APIs: qingyun (images), xingjiabi (video), Google (veo-3.1)
- Optional: librosa (BPM), ffmpeg (compositing)
## References
- [audio-analyzer.py](scripts/audio-analyzer.py): audio analysis
- [Full output of the Chloe project](../../output/chloe-sakura-mv/)
- [virtual-singer-mv-script](~/clawd/skills/virtual-singer-mv-script/): standalone MV script skill