videocut:自进化

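Subtitle generation and burn-in. Volcengine transcription → dictionary correction → review → burn-in. Trigger words: 加字幕 (add subtitles), 生成字幕 (generate subtitles), 字幕 (subtitles)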

33 stars

Best use case

videocut:自进化 is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using videocut:自进化 should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/videocut-subtitle/SKILL.md --create-dirs "https://raw.githubusercontent.com/aAAaqwq/AGI-Super-Team/main/skills/videocut-subtitle/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/videocut-subtitle/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How videocut:自进化 Compares

| Feature / Agent | videocut:自进化 | Standard Approach |
|------|------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

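Subtitle generation and burn-in. Volcengine transcription → dictionary correction → review → burn-in. Trigger words: 加字幕 (add subtitles), 生成字幕 (generate subtitles), 字幕 (subtitles)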

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

<!--
input: video file
output: video with subtitles
pos: downstream skill, invoked after editing is finished
-->

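# Subtitles

> Transcription → Agent proofreading → human review → burn-in

## Core workflow (about 8-15 minutes in total, including human review)

```
1. Extract audio + upload                   ⏱ ~1min
    ↓
2. Volcengine transcription (with hotwords) ⏱ ~2min
    ↓
3. Agent proofreading (automatic step)      ⏱ ~3-5min
    ↓
4. Human review and confirmation            ⏱ depends on the user
    ↓
5. Burn in subtitles                        ⏱ ~1-2min
```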

---

## Step 1: Extract audio and upload

```bash
# Extract the audio track
ffmpeg -i "video.mp4" -vn -acodec libmp3lame -y audio.mp3

# Upload to uguu.se (temporary file hosting)
curl -s -F "files[]=@audio.mp3" https://uguu.se/upload
# Returns a URL such as: https://o.uguu.se/xxxxx.mp3
```
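When the extraction step is driven from Node rather than typed into a shell, passing the arguments as an array avoids quoting problems with file names that contain spaces or non-ASCII characters. The sketch below only builds that array for the command shown above; `buildExtractArgs` is a hypothetical helper name, not part of the skill.

```javascript
// Hypothetical helper: returns the argument list for the audio-extraction
// command above, suitable for child_process.spawn('ffmpeg', args).
function buildExtractArgs(videoPath, audioPath) {
  return [
    '-i', videoPath,          // input video
    '-vn',                    // drop the video stream
    '-acodec', 'libmp3lame',  // encode the audio as MP3
    '-y',                     // overwrite the output file if it exists
    audioPath,
  ];
}

console.log(buildExtractArgs('video.mp4', 'audio.mp3').join(' '));
```

The array can be handed to `child_process.spawn` unchanged, so no shell escaping is needed.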

---

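## Step 2: Volcengine transcription (with hotwords)

The transcription script **reads the dictionary automatically** and uses it as hotwords to improve recognition accuracy:

```bash
# Dictionary location: /Users/chengfeng/Desktop/AIos/剪辑Agent/.claude/skills/字幕/词典.txt
# The script loads it automatically

bash ../剪口播/scripts/volcengine_transcribe.sh "https://o.uguu.se/xxxxx.mp3"
```

**Dictionary format** (one term per line):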
```
skills
Claude
Agent
成峰
剪辑skills
claude code
```
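How `volcengine_transcribe.sh` actually loads the dictionary is not shown here. As a rough illustration of the one-term-per-line format, the sketch below turns dictionary text into a hotword list; skipping blank lines and duplicates is an assumption, and `parseDictionary` is a hypothetical name.

```javascript
// Hypothetical helper: turn dictionary text (one term per line) into a hotword list.
// Skipping blank lines and duplicate terms is an assumption, not documented behavior.
function parseDictionary(text) {
  const seen = new Set();
  const words = [];
  for (const line of text.split(/\r?\n/)) {
    const word = line.trim();
    if (word === '' || seen.has(word)) continue;
    seen.add(word);
    words.push(word);
  }
  return words;
}

const sample = 'skills\nClaude\nAgent\n成峰\n\nClaude\n';
console.log(parseDictionary(sample));
```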

---

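## Step 3: Agent proofreading (automatic step)

### 3.1 Generate subtitles with timestamps

```javascript
const fs = require('fs');

// Load the raw Volcengine transcription result
const result = JSON.parse(fs.readFileSync('volcengine_result.json', 'utf8'));

// One entry per utterance; times are divided by 1000 (assumed ms → seconds)
const subtitles = result.utterances.map((u, i) => ({
  id: i + 1,
  text: u.text,
  start: u.start_time / 1000,
  end: u.end_time / 1000
}));
fs.writeFileSync('subtitles_with_time.json', JSON.stringify(subtitles, null, 2));
```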

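### 3.2 Agent manual proofreading (no scripts)

**After transcription, the Agent must read every subtitle line one by one and manually correct the following problems:**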

#### Common misrecognition rules

| Misrecognized | Correct | Type |
|--------|------|------|
| 成风 | 成峰 | Homophone |
| 正特/整特 | Agent | Misrecognition |
| IT就 | Agent就 | Similar pronunciation |
| edge的叉100 | Agentx100 | Misrecognition |
| cloud code | Claude Code | Similar pronunciation |
| Schill/skill | skills | Similar pronunciation |
| 剪口拨/剪口波 | 剪口播 | Homophone |
| 自净化/资金化 | 自进化 | Homophone |
| 减口播 | 剪口播 | Homophone |
| 录剪 | 漏剪 | Homophone |
| 作为这个 | 做这个 | Homophone |
| 斜杠V1/斜杠v | /v | Spoken description |
| 斜杠v点口拨 | /v.口播 | Spoken description + homophone |
| a p i t/APIK | API Key | Misrecognition |
| excuse | skills | Misrecognition |
| 移完了 | 剪完了 | Homophone |
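The table can also be kept as data so that known patterns are surfaced during the line-by-line read. The sketch below only reports possible hits and never rewrites subtitle text, so it does not replace reading every line. `flagSuspects` and the `RULES` array are hypothetical, and the entry shape is the one produced in 3.1.

```javascript
// A few rows from the table above, kept as data: [misrecognized, correct].
const RULES = [
  ['成风', '成峰'],
  ['cloud code', 'Claude Code'],
  ['剪口拨', '剪口播'],
  ['自净化', '自进化'],
];

// Hypothetical helper: list possible misrecognitions without changing any text.
// Matching is case-insensitive so "Cloud Code" and "cloud code" are both caught.
function flagSuspects(subtitles, rules) {
  const hits = [];
  for (const sub of subtitles) {
    const lower = sub.text.toLowerCase();
    for (const [wrong, right] of rules) {
      if (lower.includes(wrong.toLowerCase())) {
        hits.push({ id: sub.id, found: wrong, suggest: right });
      }
    }
  }
  return hits;
}

const subs = [
  { id: 1, text: '大家好我是成风', start: 0, end: 1.8 },
  { id: 2, text: '今天讲Cloud Code', start: 1.8, end: 3.5 },
];
console.log(flagSuspects(subs, RULES));
```

A hit is only a prompt to look at that line; whether to apply the suggestion stays a manual decision.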

#### Common dropped-character problems

| Transcribed | Corrected | Note |
|------|------|------|
| 步呢是配置 | 第二步呢是配置 | Missing "第二" |
| 4步就是 | 第4步就是 | Missing "第" |
| 别省时间 | 特别省时间 | Missing "特" |
| 这个我们的 | 这个是我们的 | Missing "是" |
| 在里面看到 | 可以在里面看到 | Missing "可以" |
| 跟大家处理完 | 跟大家讲完 | Wrong word choice |
| 剪辑了逻辑 | 剪辑的逻辑 | Grammatical error |

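### 3.3 Proofread against the original manuscript (if available)

If an original manuscript or script exists, it can help with proofreading, but **do not match it against the transcript automatically with a program** (text differences cause timestamp errors to accumulate).

The Agent should:
1. Read the original manuscript as a reference
2. Compare line by line manually and correct any differences found
3. Mark anything uncertain and leave it for human review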

---

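## Step 4: Start the review server

```bash
cd 字幕目录/   # 字幕目录 = the subtitle directory for this video
node /path/to/skills/字幕/scripts/subtitle_server.js 8898 "video.mp4"
```

Open http://localhost:8898

**Features:**
- Video player on the left, subtitle list on the right
- The current subtitle is highlighted automatically during playback
- Double-click the subtitle text to edit it (timestamps stay unchanged)
- Playback speed control (1x/1.5x/2x/3x)
- Save subtitles / export SRT / burn in subtitles
- Dictionary quick-insert shown at the bottom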
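The export code inside `subtitle_server.js` is not shown here. As a minimal sketch of what an SRT export involves, assuming entries shaped like those in `subtitles_with_time.json` (`text`, plus `start` and `end` in seconds), the hypothetical helpers below format timestamps as `HH:MM:SS,mmm` and join the cues.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render subtitle entries as SRT text; cues are numbered from 1 in array order
// and separated by a blank line.
function toSrt(subtitles) {
  return subtitles
    .map((sub, i) => `${i + 1}\n${toSrtTime(sub.start)} --> ${toSrtTime(sub.end)}\n${sub.text}\n`)
    .join('\n');
}

console.log(toSrt([
  { id: 1, text: '你好', start: 0, end: 1.25 },
  { id: 2, text: '先点这里', start: 1.25, end: 3 },
]));
```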

---

## Step 5: Burn in subtitles

**Default style: size 22, golden-yellow bold, 2px black outline, bottom-centered**

```bash
ffmpeg -i "video.mp4" \
  -vf "subtitles='video.srt':force_style='FontSize=22,FontName=PingFang SC,Bold=1,PrimaryColour=&H0000deff,OutlineColour=&H00000000,Outline=2,Alignment=2,MarginV=30'" \
  -c:a copy \
  -y "video_字幕.mp4"
```

| Parameter | Value | Description |
|------|------|------|
| FontSize | 22 | Font size |
| FontName | PingFang SC | PingFang font |
| Bold | 1 | Bold |
| PrimaryColour | &H0000deff | Golden yellow #ffde00 |
| OutlineColour | &H00000000 | Black outline |
| Outline | 2 | Outline width |
| Alignment | 2 | Bottom-centered |
| MarginV | 30 | Bottom margin |
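The colour values in the table use the ASS byte order `&HAABBGGRR`, which is why #ffde00 appears as `&H0000deff`. The sketch below shows that conversion and assembles the `force_style` string from the table's values. `hexToAssColour` and `buildForceStyle` are hypothetical helper names, not part of the skill.

```javascript
// Convert a #RRGGBB colour to the &HAABBGGRR form used by ASS styles.
// alpha '00' means fully opaque.
function hexToAssColour(hex, alpha = '00') {
  const match = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!match) throw new Error(`Not a #RRGGBB colour: ${hex}`);
  const [, r, g, b] = match;
  return '&H' + (alpha + b + g + r).toLowerCase();
}

// Join style fields as Key=Value pairs, in insertion order.
function buildForceStyle(style) {
  return Object.entries(style)
    .map(([key, value]) => `${key}=${value}`)
    .join(',');
}

const defaultStyle = {
  FontSize: 22,
  FontName: 'PingFang SC',
  Bold: 1,
  PrimaryColour: hexToAssColour('#ffde00'),
  OutlineColour: hexToAssColour('#000000'),
  Outline: 2,
  Alignment: 2,
  MarginV: 30,
};

console.log(buildForceStyle(defaultStyle));
```

With these values the result matches the `force_style` string in the ffmpeg command above.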

---

## Directory structure

```
output/YYYY-MM-DD_视频名/字幕/
├── 1_转录/
│   ├── audio.mp3
│   └── volcengine_result.json
├── subtitles_with_time.json    # core file
└── 3_输出/
    ├── video.srt
    └── video_字幕.mp4
```
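For scripts that need these locations, the layout above can be derived from the date and the video name. This is a sketch only: `subtitlePaths` is a hypothetical helper, and the file names are copied from the tree as shown.

```javascript
// Hypothetical helper: derive the file paths in the layout above
// from a date string (YYYY-MM-DD) and a video name.
function subtitlePaths(date, videoName) {
  const root = `output/${date}_${videoName}/字幕`;
  return {
    audio: `${root}/1_转录/audio.mp3`,
    transcript: `${root}/1_转录/volcengine_result.json`,
    subtitles: `${root}/subtitles_with_time.json`,
    srt: `${root}/3_输出/video.srt`,
    burned: `${root}/3_输出/video_字幕.mp4`,
  };
}

console.log(subtitlePaths('2026-01-31', 'demo').subtitles);
```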

---

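## Subtitle conventions

| Rule | Description |
|------|------|
| One line per screen | No line breaks, no stacking |
| No punctuation at the end of a line | `你好`, not `你好。` |
| Keep punctuation inside a line | `先点这里,再点那里` |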
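These conventions are easy to check mechanically before review. The sketch below reports violations for a single subtitle line. The set of characters treated as trailing punctuation is an assumption, and `checkSubtitle` is a hypothetical helper name.

```javascript
// Assumed set of punctuation marks that must not end a subtitle line.
const TRAILING_PUNCTUATION = /[。，、！？；：,.!?;:]$/;

// Return a list of convention violations for one subtitle line (empty if none).
function checkSubtitle(text) {
  const problems = [];
  if (/\r|\n/.test(text)) problems.push('contains a line break');
  if (TRAILING_PUNCTUATION.test(text.trim())) problems.push('ends with punctuation');
  return problems;
}

console.log(checkSubtitle('你好'));
console.log(checkSubtitle('你好。'));
console.log(checkSubtitle('先点这里,再点那里'));
```

Punctuation inside a line is left alone, in keeping with the third rule.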

---

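## Feedback log

### 2026-01-31
- Volcengine supports hotwords; this is now integrated into the transcription script
- After transcription the Agent must proofread on its own first and must not hand the raw result straight to the user
- Subtitle style: golden-yellow bold #ffde00, 2px outline
- Agent is often misrecognized as IT (for example IT就 for Agent就); this needs to be in the correction rules
- **Important**: Agent proofreading must be a manual line-by-line read; automatic matching by script is not allowed
- Added 17 common misrecognition rules (see Section 3.2)
- Dropped characters are harder to spot than misrecognitions and need extra attention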
