mv-generator

The end-to-end music video generation skill built into MediaClaw.

by aAAaqwq

Best use case

mv-generator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using mv-generator should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/mv-generator/SKILL.md --create-dirs "https://raw.githubusercontent.com/aAAaqwq/AGI-Super-Team/main/skills/mv-generator/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/mv-generator/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How mv-generator Compares

Feature / Agent	mv-generator	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

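It is the end-to-end music video generation skill built into MediaClaw: it takes an audio file, lyrics, and a character card, and produces frames, video clips, and a final composed MV.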

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# MV Generator: Music Video Generation Pipeline

The end-to-end music video generation skill built into MediaClaw.

## Trigger phrases
- 生成MV (generate an MV)
- MV制作 (MV production)
- 音乐视频 (music video)
- mv generator
- 虚拟歌手MV (virtual singer MV)

## Pipeline

```
Audio file + lyrics + character card
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 1: Audio analysis                │
│  scripts/audio-analyzer.py             │
│  → duration, BPM, scene timeline       │
└───────────────────┬────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 2: Lyrics parsing                │
│  scripts/lyrics-parser.py              │
│  → section structure, emotion curve    │
└───────────────────┬────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 3: Scene and frame design        │
│  scripts/scene-design.py               │
│  → scene type, camera movement,        │
│    first/last-frame prompts            │
└───────────────────┬────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 4: Frame image generation        │
│  qingyun gpt-image-1 API               │
│  → first + last frame per scene        │
└───────────────────┬────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 5: Video clip generation         │
│  kling-2 / veo-3.1 / etc               │
│  → 5s clip from first to last frame    │
└───────────────────┬────────────────────┘
                    │
                    ▼
┌────────────────────────────────────────┐
│  Step 6: Final composition             │
│  FFmpeg merge                          │
│  → audio sync, transitions, subtitles  │
└────────────────────────────────────────┘
```
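
Step 6 is described only as an FFmpeg merge, so the following is a minimal sketch of one way the clip-joining part could look, not the skill's actual implementation. It assumes the Step 5 clips share the same codec and resolution (so the concat demuxer can copy the video stream) and leaves out transitions and subtitles. The helper names `concat_list_text` and `build_merge_command` are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def concat_list_text(clips: Sequence[str]) -> str:
    """Render the text of an FFmpeg concat-demuxer list file, one clip per line."""
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)


def build_merge_command(list_file: str, audio_file: str, output_file: str) -> List[str]:
    """Build (but do not run) an FFmpeg command that joins clips and adds the song."""
    return [
        "ffmpeg",
        "-f", "concat", "-safe", "0", "-i", list_file,  # input 0: the clip list
        "-i", audio_file,                               # input 1: the song
        "-map", "0:v:0", "-map", "1:a:0",               # video from clips, audio from song
        "-c:v", "copy", "-c:a", "aac",                  # copy video, re-encode audio
        "-shortest",                                    # stop at the shorter stream
        output_file,
    ]


if __name__ == "__main__":
    print(concat_list_text(["video/scene_01.mp4", "video/scene_02.mp4"]), end="")
    print(" ".join(build_merge_command("clips.txt", "audio/song_v1.mp3", "mv_final.mp4")))
```

The command is returned as a list so it can be passed to `subprocess.run` without shell quoting issues. Paths inside the list file are resolved relative to the list file's own location.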

## Inputs

| Parameter | Required | Description |
|------|------|------|
| `audio_file` | ✅ | Path to the audio file (.mp3/.wav/.m4a) |
| `lyrics` | ✅ | Lyrics text |
| `character_card` | ✅ | Character card JSON (appearance, voice, style) |
| `style` | ❌ | cinematic / realistic / anime / ethereal |
| `num_scenes` | ❌ | Number of storyboard scenes (default 9) |
| `output_dir` | ❌ | Output directory (default: `MediaClaw/output/<project>/`) |
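
As an illustration of the table above, the sketch below collects the parameters in a dataclass and checks them before the pipeline starts. The class name `MVRequest` and its `validate` method are hypothetical and not part of the skill; only the parameter names, allowed values, and the default of 9 scenes come from the table.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

ALLOWED_AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a"}
ALLOWED_STYLES = {"cinematic", "realistic", "anime", "ethereal"}


@dataclass
class MVRequest:
    """Hypothetical container for the input parameters listed in the table."""
    audio_file: str
    lyrics: str
    character_card: dict
    style: Optional[str] = None
    num_scenes: int = 9  # default taken from the table
    output_dir: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request looks usable."""
        problems = []
        if Path(self.audio_file).suffix.lower() not in ALLOWED_AUDIO_SUFFIXES:
            problems.append("audio_file must end in .mp3, .wav, or .m4a")
        if not self.lyrics.strip():
            problems.append("lyrics must not be empty")
        if not isinstance(self.character_card, dict) or not self.character_card:
            problems.append("character_card must be a non-empty JSON object")
        if self.style is not None and self.style not in ALLOWED_STYLES:
            problems.append("style must be one of: " + ", ".join(sorted(ALLOWED_STYLES)))
        if self.num_scenes < 1:
            problems.append("num_scenes must be at least 1")
        return problems


if __name__ == "__main__":
    request = MVRequest("song.mp3", "first line of the lyrics", {"name": "Example Singer"})
    print(request.validate())
```

Returning every problem at once, rather than raising on the first, lets an agent report all missing inputs in a single reply.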

## Output directory structure

```
output/<project-name>/
├── README.md              ← project summary + timeline
├── frames/                ← first/last frame images
│   ├── 01_first.png
│   ├── 01_last.png
│   └── ...
├── audio/                 ← audio files (multiple versions)
│   └── song_v1.mp3
├── script/                ← MV script
│   ├── mv_final.json      ← precise timeline JSON
│   ├── mv_final.md        ← human-readable script
│   └── character.json     ← character card
├── docs/                  ← documentation and reference material
│   └── avatar.png
└── video/                 ← generated video clips (output of Step 5)
    ├── scene_01.mp4
    └── ...
```
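
The layout above can be created up front so that later steps only need to write files. This is a small sketch using `pathlib`; the function names `create_project_layout` and `frame_paths` are hypothetical, while the folder names and the `NN_first.png` / `NN_last.png` pattern follow the tree.

```python
from pathlib import Path
from typing import Tuple

SUBDIRS = ("frames", "audio", "script", "docs", "video")


def create_project_layout(root: str, project_name: str) -> Path:
    """Create the project folder and its five subfolders; safe to call repeatedly."""
    project = Path(root) / project_name
    for name in SUBDIRS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project


def frame_paths(project: Path, scene_index: int) -> Tuple[Path, Path]:
    """Return the first-frame and last-frame image paths for a 1-based scene index."""
    frames = Path(project) / "frames"
    return (
        frames / f"{scene_index:02d}_first.png",
        frames / f"{scene_index:02d}_last.png",
    )


if __name__ == "__main__":
    project_dir = create_project_layout("output", "demo-project")
    print(frame_paths(project_dir, 1))
```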

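## Lessons learned (from Chloe's 《樱花落尽时》 project)

### Image generation best practices
| Item | Recommendation | Notes |
|------|------|------|
| Model | qingyun `gpt-image-1` | Highest success rate |
| API | `/v1/images/generations` | More stable than edits |
| Size | `1024x1536` (portrait) | Must be one of the prescribed sizes |
| Concurrency | 2-3 concurrent requests | Higher concurrency causes SSL EOF errors |
| Retries | Retry failed items individually | Do not rerun the whole batch |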
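
The concurrency and retry rows can be sketched as a small worker pool that retries each failed frame on its own. This is an illustration under assumptions, not the skill's code: `generate_all` is a hypothetical helper, and `generate_one` stands for whatever function wraps the `/v1/images/generations` request (the request itself is omitted because the source does not give a base URL or authentication details).

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def generate_all(
    prompts: Dict[str, str],
    generate_one: Callable[[str], object],  # hypothetical wrapper around the image API
    max_workers: int = 2,                   # the table recommends 2-3 concurrent requests
    max_attempts: int = 3,
    backoff_s: float = 1.0,
) -> Tuple[Dict[str, object], Dict[str, Exception]]:
    """Generate every frame, retrying failures per item instead of per batch."""

    def attempt(item):
        name, prompt = item
        last_error = None
        for n in range(1, max_attempts + 1):
            try:
                return name, generate_one(prompt), None
            except Exception as exc:  # e.g. an SSL EOF raised by the HTTP client
                last_error = exc
                if n < max_attempts:
                    time.sleep(backoff_s * n)  # simple linear backoff
        return name, None, last_error

    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, value, error in pool.map(attempt, prompts.items()):
            if error is None:
                results[name] = value
            else:
                failures[name] = error
    return results, failures


if __name__ == "__main__":
    done, failed = generate_all({"01_first": "prompt a"}, lambda p: p.upper(), backoff_s=0)
    print(done, failed)
```

Frames that still fail after `max_attempts` are returned separately, so only those need to be resubmitted later.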

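### Character consistency tips
- **Shared prompt prefix**: include the character's core appearance traits
- **Reference image**: pass the character's avatar through the edits API (though the success rate is low)
- **Scene qualifiers**: build each frame prompt from the fixed prefix plus scene-specific variables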
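
The fixed-prefix idea can be shown in a few lines. The sketch below assumes a character card with `name` and `appearance` fields; those field names, the example card, and the helper names are hypothetical, since the source does not define the card's schema.

```python
from typing import Dict, Optional


def character_prefix(card: Dict) -> str:
    """Join the character's core traits into a prefix shared by every frame prompt."""
    traits = [card["name"], *card.get("appearance", [])]
    return ", ".join(traits)


def build_frame_prompt(card: Dict, scene: str, style: Optional[str] = None) -> str:
    """Fixed character prefix first, then the scene-specific part, then the style."""
    parts = [character_prefix(card), scene]
    if style:
        parts.append(f"{style} style")
    return ", ".join(parts)


if __name__ == "__main__":
    # Hypothetical card, for illustration only.
    example_card = {"name": "Example Singer", "appearance": ["long hair", "white dress"]}
    print(build_frame_prompt(example_card, "standing under cherry blossoms", "cinematic"))
```

Because every prompt begins with the same string, only the scene part varies between frames.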

### Audio sync
- Use `mutagen` to read the MP3 duration (no ffprobe needed)
- Allocate scene times in proportion to the song structure
- BPM detection requires `librosa` (optional)
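
Proportional allocation can be sketched as follows. The section labels and weights are hypothetical inputs (the source does not say how the song structure is represented), and the duration is passed in as a plain number so the example runs without an audio file.

```python
from typing import Dict, List, Sequence, Tuple


def allocate_scenes(duration_s: float, sections: Sequence[Tuple[str, float]]) -> List[Dict]:
    """Split the song duration across sections in proportion to their weights.

    In practice duration_s could come from mutagen, for example
    mutagen.mp3.MP3(path).info.length; here it is just a float.
    """
    total_weight = sum(weight for _, weight in sections)
    timeline = []
    start = 0.0
    for i, (label, weight) in enumerate(sections):
        is_last = i == len(sections) - 1
        # Pin the last section to the exact end so rounding never leaves a gap.
        end = duration_s if is_last else start + duration_s * weight / total_weight
        timeline.append({"label": label, "start": round(start, 3), "end": round(end, 3)})
        start = end
    return timeline


if __name__ == "__main__":
    for scene in allocate_scenes(180.0, [("intro", 1), ("verse", 2), ("chorus", 2), ("outro", 1)]):
        print(scene)
```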

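### Video model selection
| Emotion score | Model | Scene type |
|------|------|---------|
| ≤ 6.0 | kling-1.6 | Static / nature / traditional Chinese style (国风) |
| 6.0-8.0 | kling-1.6 or kling-2 | Transition |
| ≥ 8.0 | kling-2 | Climax / power |
| 360° rotation | veo-3.1 | Special camera movement |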
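
The table can be read as a simple decision function. The sketch below is one interpretation: the table's ranges overlap at exactly 6.0 and 8.0, so this version assumes 6.0 goes to kling-1.6 and 8.0 to kling-2, and it lets the caller pick the model for the in-between band. The function name and its parameters are hypothetical.

```python
def pick_video_model(
    emotion: float,
    needs_360_rotation: bool = False,
    transition_choice: str = "kling-1.6",  # table allows kling-1.6 or kling-2 here
) -> str:
    """Map an emotion score and camera requirement to a video model name."""
    if needs_360_rotation:
        return "veo-3.1"  # special camera movement overrides the emotion score
    if emotion <= 6.0:
        return "kling-1.6"  # static / nature scenes
    if emotion >= 8.0:
        return "kling-2"  # climax / power scenes
    if transition_choice not in ("kling-1.6", "kling-2"):
        raise ValueError("transition_choice must be 'kling-1.6' or 'kling-2'")
    return transition_choice


if __name__ == "__main__":
    for score in (4.5, 7.0, 9.0):
        print(score, pick_video_model(score))
```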

## Dependencies
- Python: mutagen, requests
- External APIs: qingyun (images), xingjiabi (video), Google (veo-3.1)
- Optional: librosa (BPM), ffmpeg (composition)

## References
- [audio-analyzer.py](scripts/audio-analyzer.py): audio analysis
- [Full output of the Chloe project](../../output/chloe-sakura-mv/)
- [virtual-singer-mv-script](~/clawd/skills/virtual-singer-mv-script/): standalone MV script skill
