videocut-clip-oral

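Transcribes talking-head (口播) videos and identifies speech slips, then generates a review draft and a deletion task list. Trigger phrases: 剪口播 (cut talking-head video), 处理视频 (process video), 识别口误 (identify slips)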

23 stars

Best use case

videocut-clip-oral is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using videocut-clip-oral should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/videocut-clip-oral/SKILL.md --create-dirs "https://raw.githubusercontent.com/christophacham/agent-skills-library/main/skills/media-production/videocut-clip-oral/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/videocut-clip-oral/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How videocut-clip-oral Compares

| Feature / Agent | videocut-clip-oral | Standard Approach |
|------|------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

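It transcribes talking-head (口播) videos and identifies speech slips, then generates a review draft and a deletion task list. Trigger phrases: 剪口播 (cut talking-head video), 处理视频 (process video), 识别口误 (identify slips)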

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

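<!--
input: video file (*.mp4)
output: transcript JSON, review draft, deletion-task TodoList
pos: transcription + detection, up to the point of user review

Architecture guardian: whenever this file is modified, also update:
1. the Skill list in ../README.md
2. the routing table in /CLAUDE.md
-->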

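# 剪口播 (Cut Talking-Head Video)

> Transcription + slip/silence detection → generate a review draft

## Quick start

```
User: Help me cut this talking-head video (帮我剪这个口播视频)
User: Process this video (处理一下这个视频)
```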

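## Workflow

```
1. FunASR transcription in 30s segments (character-level timestamps)
    ↓
2. Identify slips (sentence-by-sentence check)
    ↓
3. Identify micro-slips (VAD detects short segments)
    ↓
4. Identify filler words (嗯/哎/诶, etc.)
    ↓
5. Identify silences (≥1s)
    ↓
6. Generate the review draft (timestamp-driven)
    ↓
7. Output the deletion-task TodoList
    ↓
[Wait for user confirmation] → after the user confirms, run /videocut:剪辑
```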
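
Step 5 of the flow (silences of at least 1s) can be read directly off the character-level timestamps. The following is a minimal sketch, assuming each transcribed character is available as a dict with the hypothetical keys `text`, `start`, and `end` (in seconds). The real FunASR output layout may differ and is covered in `tips/转录最佳实践.md`.

```python
from typing import Dict, List, Tuple

def find_silences(tokens: List[Dict], min_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) gaps between consecutive tokens lasting at least min_gap seconds.

    The token shape {"text", "start", "end"} is an assumption for illustration.
    """
    silences = []
    for prev, nxt in zip(tokens, tokens[1:]):
        gap = nxt["start"] - prev["end"]
        if gap >= min_gap:
            silences.append((prev["end"], nxt["start"]))
    return silences

# Hypothetical sample: a 1.5s pause between the second and third character.
sample = [
    {"text": "大", "start": 0.0, "end": 0.25},
    {"text": "家", "start": 0.25, "end": 0.5},
    {"text": "好", "start": 2.0, "end": 2.25},
]
print(find_silences(sample))
```

Each returned pair maps onto one silence entry in the deletion task list.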

### ⚠️ Why 30s segments

FunASR timestamps drift on long videos; transcribing in 30s segments avoids this.
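
As an illustration of the segmenting idea, the sketch below computes 30s segment boundaries and shifts per-segment timestamps back onto the full video's timeline. It does not call FunASR. The function names and the token shape (`text`, `start`, `end` in seconds) are assumptions, not part of the skill.

```python
from typing import Dict, List, Tuple

def segment_bounds(duration: float, seg_len: float = 30.0) -> List[Tuple[float, float]]:
    """Split [0, duration) into consecutive windows of at most seg_len seconds."""
    bounds = []
    start = 0.0
    while start < duration:
        bounds.append((start, min(start + seg_len, duration)))
        start += seg_len
    return bounds

def to_global(tokens: List[Dict], seg_start: float) -> List[Dict]:
    """Shift segment-relative timestamps by the segment's start offset."""
    return [
        {**t, "start": t["start"] + seg_start, "end": t["end"] + seg_start}
        for t in tokens
    ]

# A 75s video yields two full segments and one 15s remainder.
print(segment_bounds(75.0))
# A token at 1.5s inside the second segment sits at 31.5s in the full video.
print(to_global([{"text": "好", "start": 1.5, "end": 2.0}], seg_start=30.0))
```

Fixed boundaries can fall in the middle of a word, so a real pipeline may prefer to cut at a nearby pause. That refinement is left out here.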

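## Progress TodoList

Create at startup:

```
- [ ] Read "转录最佳实践" (transcription best practices) → transcribe the video
- [ ] Read "口误识别方法论" (slip-identification methodology) → identify slips
- [ ] Detect micro-slips with VAD (short segments < 0.5s)
- [ ] Scan for filler words (嗯/哎/诶, etc.)
- [ ] Identify silences (≥1s)
- [ ] Generate the review draft
- [ ] Output the deletion task list
```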
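
The micro-slip item (VAD segments shorter than 0.5s) reduces to a duration filter once a VAD has produced speech segments. The sketch below assumes those segments arrive as `(start, end)` tuples in seconds. It is not tied to any particular VAD library, and the function name is hypothetical.

```python
from typing import List, Tuple

def short_segments(segments: List[Tuple[float, float]], max_len: float = 0.5) -> List[Tuple[float, float]]:
    """Keep only speech segments shorter than max_len seconds (micro-slip candidates)."""
    return [(start, end) for start, end in segments if end - start < max_len]

# Hypothetical VAD output: the middle segment lasts only 0.25s.
vad_output = [(0.0, 2.0), (2.5, 2.75), (4.0, 6.0)]
print(short_segments(vad_output))
```

The result is a list of candidates only. Each one still needs to be checked against the transcript before it goes into the review draft.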

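### ⚠️ Read the methodology before executing

| Stage | Read first | Then execute |
|------|------|--------|
| Transcription | `tips/转录最佳实践.md` | Call the ASR |
| Slip identification | `tips/口误识别方法论.md` | Sentence-by-sentence analysis |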

---

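## Core: timestamp-driven

### Deletion task format

Every item **must be annotated with a precise timestamp** `(start-end)`:

```
Slips (N items):
- [ ] 1. `(start-end)` delete "wrong text" → keep "correct text"

Filler words (N items):
- [ ] 1. `(previous char end-next char start)` delete "嗯" context: XX【嗯】YY

Silences (N items):
- [ ] 1. `(start-end)` silence of Xs
```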
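
The filler-word entries in this format can be generated mechanically. The sketch below is an illustration under assumptions: tokens are dicts with the hypothetical keys `text`, `start`, `end` (seconds), the filler set contains only the three examples named in this document, and the English wording of each line is illustrative.

```python
from typing import Dict, List

# Partial list taken from the examples in this document; extend as needed.
FILLERS = {"嗯", "哎", "诶"}

def filler_tasks(tokens: List[Dict]) -> List[str]:
    """Build checklist lines for filler words.

    The deletion span runs from the end of the previous character to the
    start of the next one, so the pause around the filler is removed too.
    """
    tasks = []
    for i, tok in enumerate(tokens):
        text = tok["text"]
        if text not in FILLERS:
            continue
        prev_tok = tokens[i - 1] if i > 0 else None
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
        start = prev_tok["end"] if prev_tok else tok["start"]
        end = next_tok["start"] if next_tok else tok["end"]
        before = prev_tok["text"] if prev_tok else ""
        after = next_tok["text"] if next_tok else ""
        tasks.append(
            f'- [ ] {len(tasks) + 1}. `({start:.2f}-{end:.2f})` '
            f'delete "{text}" context: {before}【{text}】{after}'
        )
    return tasks

filler_sample = [
    {"text": "我", "start": 0.0, "end": 0.25},
    {"text": "嗯", "start": 0.5, "end": 0.75},
    {"text": "们", "start": 1.0, "end": 1.25},
]
print("\n".join(filler_tasks(filler_sample)))
```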

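### Slip types

| Type | Example | Deletion strategy |
|------|------|----------|
| Repetition | `拉满新拉满` | Delete only the difference ("新") |
| Substitution | `AI就是AI就会` | Delete the first complete version ("AI就是") |
| Stutter | `听会会` | Delete the first repeated character |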
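
The last row of the table (a character repeated back to back, delete the first copy) is the simplest to sketch in code. The function below is a hypothetical candidate finder over tokens with assumed keys `text`, `start`, `end`. It is not the skill's actual detection logic, which lives in `tips/口误识别方法论.md`.

```python
from typing import Dict, List, Tuple

def stutter_candidates(tokens: List[Dict]) -> List[Tuple[float, float, str]]:
    """Flag the first of two identical adjacent characters as a deletion candidate."""
    return [
        (a["start"], a["end"], a["text"])
        for a, b in zip(tokens, tokens[1:])
        if a["text"] == b["text"]
    ]

# The table's example: 听会会, with hypothetical timestamps.
stutter_sample = [
    {"text": "听", "start": 0.0, "end": 0.25},
    {"text": "会", "start": 0.25, "end": 0.5},
    {"text": "会", "start": 0.5, "end": 0.75},
]
print(stutter_candidates(stutter_sample))
```

Legitimate reduplication (for example 看看 or 谢谢) would also match, so the output is only a list of candidates for sentence-by-sentence review.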

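### ⚠️ Key rules

1. **Timestamp-driven**: the review draft carries the timestamps directly, so the editing step never searches the text again
2. **Token-by-token analysis**: for "delete the earlier part, keep the later part" slips, look up the timestamp of every token
3. **Check the time span**: if a slip spans more than 2 seconds, it must contain silence and has to be split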
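
Rule 3 can be sketched as a span check followed by a split at internal pauses. The sketch below is one possible reading, not the skill's own implementation. The token keys (`text`, `start`, `end`) are assumed, and `min_gap` reuses the 1s silence threshold as a guess at what counts as a pause.

```python
from typing import Dict, List, Tuple

def split_span(tokens: List[Dict], max_span: float = 2.0, min_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return one span if the slip is short, otherwise split it at internal pauses.

    min_gap is an assumed threshold; the pauses themselves are left out of the
    returned spans so they can be listed as separate silence items.
    """
    if not tokens:
        return []
    first, last = tokens[0], tokens[-1]
    if last["end"] - first["start"] <= max_span:
        return [(first["start"], last["end"])]
    spans = []
    start = first["start"]
    for prev, nxt in zip(tokens, tokens[1:]):
        if nxt["start"] - prev["end"] >= min_gap:
            spans.append((start, prev["end"]))
            start = nxt["start"]
    spans.append((start, last["end"]))
    return spans

# Hypothetical slip covering 3.5s with a 1.5s pause in the middle.
long_slip = [
    {"text": "A", "start": 0.0, "end": 0.5},
    {"text": "I", "start": 0.5, "end": 1.0},
    {"text": "就", "start": 2.5, "end": 3.0},
    {"text": "是", "start": 3.0, "end": 3.5},
]
print(split_span(long_slip))
```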

---

## Output files

```
01-xxx-v1_transcript.json  # transcription result (with timestamps)
01-xxx-v1_审查稿.md        # slip review draft
```

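### Display requirements

After generating the review draft, **you must show it to the user**:
1. Write the file `01-xxx-v1_审查稿.md`
2. Read it back and display the contents
3. Wait for the user to confirm which items to delete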

---

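## Methodology

See `tips/口误识别方法论.md` for details:
- Slip identification method (sentence-by-sentence check)
- Precise handling of "delete the earlier part, keep the later part"
- FunASR timestamp alignment rules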
