ppt-to-video-workflow

PPT/슬라이드를 나레이션과 자막이 포함된 영상으로 변환합니다. PPTX 파일 또는 slides.json에서 슬라이드 이미지를 추출/렌더링하고, TTS로 나레이션을 생성하며, 자막을 추가하여 최종 MP4 영상을 만듭니다. "PPT를 영상으로 만들어줘", "발표 영상 생성", "자막 포함 영상 만들기" 요청 시 사용합니다.

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

ppt-to-video-workflow is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ppt-to-video-workflow should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ppt-to-video-workflow/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/tools/ppt-to-video-workflow/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/ppt-to-video-workflow/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How ppt-to-video-workflow Compares

Feature / Agent	ppt-to-video-workflow	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# PPT to Video Workflow

PPT/슬라이드를 나레이션과 자막이 포함된 영상으로 변환하는 종합 스킬입니다.

## 트리거

다음과 같은 요청 시 이 스킬을 사용합니다:
- "PPT를 영상으로 만들어줘"
- "슬라이드 영상 생성해줘"
- "발표 영상 만들어줘"
- "나레이션이 포함된 영상 생성"
- "자막 포함 영상 만들어줘"
- "PPTX 파일을 영상으로 변환"

---

## 전체 워크플로우

```
┌─────────────────────────────────────────────────────────────────────┐
│                        입력 소스 (택1)                               │
├─────────────────────────────────────────────────────────────────────┤
│  [PPTX 파일]              또는           [slides.json]              │
│       │                                       │                     │
│       ▼                                       ▼                     │
│  이미지 추출 (python-pptx)          슬라이드 렌더링 (Pillow)         │
│       │                                       │                     │
│       └───────────────┬───────────────────────┘                     │
│                       ▼                                             │
│              slide_01.png, slide_02.png, ...                        │
└─────────────────────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      나레이션 생성                                   │
├─────────────────────────────────────────────────────────────────────┤
│  [slides.json + reference_audio.wav]                                │
│       │                                                             │
│       ▼                                                             │
│  Chatterbox TTS API (슬라이드별 음성 생성)                          │
│       │                                                             │
│       ▼                                                             │
│  narration_01.mp3, narration_02.mp3, ... + timing.json              │
└─────────────────────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      자막 생성 (선택)                                │
├─────────────────────────────────────────────────────────────────────┤
│  [timing.json]                                                      │
│       │                                                             │
│       ▼                                                             │
│  문장 분할 + ASS 자막 생성                                          │
│       │                                                             │
│       ▼                                                             │
│  subtitle.ass                                                       │
└─────────────────────────────────────────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────────────────┐
│                      영상 합성                                       │
├─────────────────────────────────────────────────────────────────────┤
│  [slide_*.png + narration_*.mp3 + subtitle.ass]                     │
│       │                                                             │
│       ▼                                                             │
│  FFmpeg (페이드 전환 + 자막 하드코딩)                               │
│       │                                                             │
│       ▼                                                             │
│  final_video.mp4                                                    │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 입력 형식

### 옵션 1: PPTX 파일
기존 PPTX 파일에서 슬라이드 이미지를 추출합니다.

```python
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE
from PIL import Image
import io

prs = Presentation("presentation.pptx")
for i, slide in enumerate(prs.slides):
    for shape in slide.shapes:
        if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
            image = Image.open(io.BytesIO(shape.image.blob))
            image.save(f"slide_{i+1:02d}.png")
```

### 옵션 2: slides.json
구조화된 JSON에서 슬라이드를 렌더링합니다.

```json
{
  "slides": [
    {
      "slide": 1,
      "title": "슬라이드 제목",
      "content": ["항목1", "항목2", "항목3"],
      "narration": "이 슬라이드에서 읽을 전체 대본 텍스트입니다."
    }
  ]
}
```

### 필수 파일
- **reference_audio.wav**: 음성 클로닝을 위한 참조 음성 (3-10초)

---

## 스크립트

### 1. render_slides.py
slides.json을 1920x1080 이미지로 렌더링합니다.

```bash
# Academic 테마 (기본, 흰색 배경)
python scripts/render_slides.py slides.json -i generated_images/ -o rendered_slides/

# RDF 다크 테마 (골드/청록 강조)
python scripts/render_slides.py slides.json -i generated_images/ -o rendered_slides/ -t rdf
```

**옵션**:
- `-i, --images`: AI 생성 이미지 디렉토리
- `-o, --output`: 출력 디렉토리
- `-t, --theme`: 테마 선택 (`academic` 또는 `rdf`)

**출력**: `rendered_slides/slide_01.png`, `slide_02.png`, ...

### 2. generate_narration.py
슬라이드별 TTS 음성을 생성합니다.

```bash
python scripts/generate_narration.py slides.json reference_audio.wav -o narrations/
```

**옵션**:
- `-o, --output`: 출력 디렉토리
- `-l, --language`: 언어 코드 (기본: `ko`)

**출력**:
- `narrations/narration_01.mp3`, `narration_02.mp3`, ...
- `narrations/timing.json` (시간 정보 + 전체 텍스트)

**timing.json 구조**:
```json
{
  "total_slides": 17,
  "total_duration": 374.96,
  "slides": [
    {
      "slide": 1,
      "audio_file": "narrations/narration_01.mp3",
      "duration": 11.76,
      "text": "전체 나레이션 텍스트..."
    }
  ]
}
```

### 3. generate_subtitles.py
timing.json에서 ASS 자막 파일을 생성합니다.

```bash
python scripts/generate_subtitles.py narrations/timing.json -o subtitle.ass
```

**자막 분할 규칙**:
- 한 자막당 최대 2줄
- 줄당 최대 40자 (한글 기준)
- 구두점(., !, ?) 기준 분할
- 최소 2초 표시 시간 보장

**출력**: `subtitle.ass`

### 4. create_video.py
이미지와 나레이션을 합성하여 영상을 생성합니다.

```bash
# 기본 (자막 없음)
python scripts/create_video.py rendered_slides/ narrations/ -o final_video.mp4

# 자막 포함
python scripts/create_video.py rendered_slides/ narrations/ -o final_video.mp4 --subtitle

# 페이드 효과 없이
python scripts/create_video.py rendered_slides/ narrations/ -o final_video.mp4 --no-fade
```

**옵션**:
- `-o, --output`: 출력 파일명
- `--subtitle`: 자막 포함 (ASS 파일 자동 생성)
- `--no-fade`: 페이드 전환 효과 비활성화

**출력**:
- `final_video.mp4`
- `final_video.ass` (--subtitle 사용 시)

---

## 전체 워크플로우 예시

### 케이스 1: slides.json에서 영상 생성

```bash
# 1. 슬라이드 렌더링
python ~/.claude/skills/ppt-to-video-workflow/scripts/render_slides.py \
    slides.json -i generated_images/ -o rendered_slides/ -t rdf

# 2. 나레이션 생성
python ~/.claude/skills/ppt-to-video-workflow/scripts/generate_narration.py \
    slides.json reference_audio.wav -o narrations/

# 3. 영상 합성 (자막 포함)
python ~/.claude/skills/ppt-to-video-workflow/scripts/create_video.py \
    rendered_slides/ narrations/ -o presentation.mp4 --subtitle
```

### 케이스 2: PPTX 파일에서 영상 생성

```python
# 1. PPTX에서 이미지 추출
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE
from PIL import Image
import io, os

os.makedirs("pptx_slides", exist_ok=True)
prs = Presentation("presentation.pptx")

for i, slide in enumerate(prs.slides):
    for shape in slide.shapes:
        if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
            img = Image.open(io.BytesIO(shape.image.blob))
            img = img.resize((1920, 1080), Image.LANCZOS)
            img.save(f"pptx_slides/slide_{i+1:02d}.png")
            break
```

```bash
# 2. 나레이션 생성 (slides.json 필요)
python ~/.claude/skills/ppt-to-video-workflow/scripts/generate_narration.py \
    slides.json audio.wav -o narrations/

# 3. 영상 합성
python ~/.claude/skills/ppt-to-video-workflow/scripts/create_video.py \
    pptx_slides/ narrations/ -o output.mp4 --subtitle
```

### 케이스 3: 기존 영상에 자막만 추가

```bash
# 1. 자막 파일 생성
python ~/.claude/skills/ppt-to-video-workflow/scripts/generate_subtitles.py \
    narrations/timing.json -o subtitle.ass

# 2. FFmpeg로 자막 하드코딩
ffmpeg -i input.mp4 -vf "ass=subtitle.ass" -c:a copy output_with_subtitle.mp4
```

---

## 테마

### Academic (기본)
- 흰색 배경
- 빨간색 강조 (#DC3545)
- 얇은 상단 바
- 심플한 불릿 포인트

### RDF Dark Theme
- 다크 헤더 (#1A1A1A) + 격자 도트 패턴
- 골드 강조 (#F5A623)
- 청록 보조 (#4ECDC4)
- 골드 세로바 섹션 제목
- 이미지에 그림자 효과
- 청록색 하단 바

---

## 설정

| 항목 | 값 |
|------|-----|
| 해상도 | 1920x1080 (Full HD) |
| FPS | 30 |
| 영상 형식 | MP4 (H.264 + AAC) |
| 전환 효과 | 페이드 (0.5초) |
| 한글 폰트 | AppleSDGothicNeo, NanumGothic |
| 테마 | academic, rdf |
| 자막 형식 | ASS (Advanced SubStation Alpha) |
| 자막 폰트 | AppleSDGothicNeo-Medium (42pt) |
| 자막 위치 | 하단 중앙 |
| 자막 배경 | 반투명 검정 (50% opacity) |

---

## 의존성

```bash
pip install pillow pydub gradio-client python-pptx
```

- **Python 3.x**
- **Pillow**: 이미지 렌더링
- **FFmpeg**: 영상 합성 (시스템 설치 필요)
- **pydub**: 오디오 처리
- **gradio-client**: Chatterbox TTS API
- **python-pptx**: PPTX 파일 처리

---

## 디렉토리 구조

```
ppt-to-video-workflow/
├── SKILL.md                    # 이 문서
└── scripts/
    ├── render_slides.py        # 슬라이드 → 이미지 렌더링
    ├── generate_narration.py   # TTS 나레이션 생성
    ├── generate_subtitles.py   # ASS 자막 생성
    └── create_video.py         # 영상 합성
```

---

## 관련 스킬

- `academic-ppt-generator`: slides.json 생성 및 PPTX 변환
- `chatterbox-tts-workflow`: TTS 생성 (단일 파일)

---

## 예상 결과물

**입력**:
- `slides.json` (제목, 내용, 나레이션)
- `reference_audio.wav` (참조 음성)

**출력**:
- `slide_*.png` (슬라이드 이미지들)
- `narration_*.mp3` (슬라이드별 음성)
- `timing.json` (타이밍 정보)
- `subtitle.ass` (자막 파일)
- `final_video.mp4` (최종 영상)

Related Skills

workflow-status

from diegosouzapw/awesome-omni-skill

Display project workflow progress by reading handoff documents in .docs/ directory. This skill should be used when users want to check their workflow status, see what phase they're in, or when other workflow skills need to verify prerequisites. Provides reusable prerequisite-checking templates for integration with other workflow skills.

workflow-new-plugin

from diegosouzapw/awesome-omni-skill

Guided workflow for creating a new Volon plugin — ideation, requirements, spec, plan, tasks.

workflow-integration-git

from diegosouzapw/awesome-omni-skill

Git commit workflow with conventional commits, artifact cleanup, and optional push/PR creation

workflow-guide

from diegosouzapw/awesome-omni-skill

Provides guidance on Cursor ↔ Claude Code 2-agent workflow. Use when user mentions ワークフローについて, Cursorとの連携, 作業の流れ, 2-agent workflow, collaboration. Do NOT load for: 実装作業, ワークフロー設定, ハンドオフ実行.

workflow-creator

from diegosouzapw/awesome-omni-skill

Create complete Claude Code workflow directories with curated skills. Use when user wants to (1) create a new workflow for specific use case (media creator, developer, marketer, etc.), (2) set up a Claude Code project with pre-configured skills, (3) download and organize skills from GitHub repositories, or (4) generate README.md and AGENTS.md documentation for workflows. Triggers on phrases like "create workflow", "new workflow", "set up workflow", "build a xxx-workflow".

wan-t2v-video

from diegosouzapw/awesome-omni-skill

Build WAN 2.2 Text-to-Video workflows — dual hi-lo models, lightning LoRAs, VACE modules, and KSamplerAdvanced two-pass

video-to-claude

from diegosouzapw/awesome-omni-skill

Analyze videos with Gemini and get build instructions for Claude. Supports YouTube URLs, GIFs, screen recordings, and local video files. Use when user shares a video/GIF and wants to understand how to build it, clone a UI, reverse engineer a product, or extract tutorial steps. Triggers on "watch this video", "clone this", "reverse engineer", "how do I build this", "analyze this demo".

video-testimonial

from diegosouzapw/awesome-omni-skill

Capture authentic customer testimonials through guided self-recording workflows, from outreach and briefing to recording and publishing. Use when: Building social proof for marketing campaigns; Creating customer success stories; Gathering testimonials for website and ads; Producing case study videos; Collecting user-generated content

video-sourcing

from diegosouzapw/awesome-omni-skill

Run the Video Sourcing Agent with deterministic, concise chat UX for /video_sourcing using a pinned self-bootstrap runtime.

video-outreach

from diegosouzapw/awesome-omni-skill

When the user wants to build video-first cold outreach, create personalized video at scale, implement async selling, or use AI demo generation for prospecting. Also use when the user mentions 'video outreach,' 'personalized video,' 'video prospecting,' 'Tavus,' 'Sendspark,' 'HeyGen,' 'video email,' 'async selling,' 'video demo,' or 'made this for you.' This skill covers video-first outreach systems from personalization through conversion optimization.

testing-workflow

from diegosouzapw/awesome-omni-skill

Integrated testing workflow combining all testing tools and MCPs. Use when deciding which testing tools to use, planning testing strategy, or executing tests in different environments and phases. Tags official skills: wallaby-testing, web-browser, agent-browser. Triggers on "testing workflow", "which test tool", "testing strategy", "run tests", "test combination".

spec-workflow-orchestrator

from diegosouzapw/awesome-omni-skill

Orchestrate comprehensive planning phase from ideation to development-ready specifications using 4 specialized agents