ai-video-gen
End-to-end AI video generation - create videos from text prompts using image generation, video synthesis, voice-over, and editing. Supports OpenAI DALL-E, Replicate models, LumaAI, Runway, and FFmpeg editing.
Best use case
ai-video-gen is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using ai-video-gen should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/ai-video-gen/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How ai-video-gen Compares
| Feature / Agent | ai-video-gen | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
End-to-end AI video generation - create videos from text prompts using image generation, video synthesis, voice-over, and editing. Supports OpenAI DALL-E, Replicate models, LumaAI, Runway, and FFmpeg editing.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# AI Video Generation Skill

Generate complete videos from text descriptions using AI.

## Capabilities

1. **Image Generation** - DALL-E 3, Stable Diffusion, Flux
2. **Video Generation** - LumaAI, Runway, Replicate models
3. **Voice-over** - OpenAI TTS, ElevenLabs
4. **Video Editing** - FFmpeg assembly, transitions, overlays

## Quick Start

```bash
# Generate a complete video
python skills/ai-video-gen/generate_video.py --prompt "A sunset over mountains" --output sunset.mp4

# Just images to video
python skills/ai-video-gen/images_to_video.py --images img1.png img2.png --output result.mp4

# Add voiceover
python skills/ai-video-gen/add_voiceover.py --video input.mp4 --text "Your narration" --output final.mp4
```

## Setup

### Required API Keys

Add to your environment or `.env` file:

```bash
# Image Generation (pick one)
OPENAI_API_KEY=sk-...        # DALL-E 3
REPLICATE_API_TOKEN=r8_...   # Stable Diffusion, Flux

# Video Generation (pick one)
LUMAAI_API_KEY=luma_...      # LumaAI Dream Machine
RUNWAY_API_KEY=...           # Runway ML
REPLICATE_API_TOKEN=r8_...   # Multiple models

# Voice (optional)
OPENAI_API_KEY=sk-...        # OpenAI TTS
ELEVENLABS_API_KEY=...       # ElevenLabs

# Or use FREE local options (no API needed)
```

### Install Dependencies

```bash
pip install openai requests pillow replicate python-dotenv
```

### FFmpeg

Already installed via winget.

## Usage Examples

### 1. Text to Video (Full Pipeline)

```bash
python skills/ai-video-gen/generate_video.py \
  --prompt "A futuristic city at night with flying cars" \
  --duration 5 \
  --voiceover "Welcome to the future" \
  --output future_city.mp4
```

### 2. Multiple Scenes

```bash
python skills/ai-video-gen/multi_scene.py \
  --scenes "Morning sunrise" "Busy city street" "Peaceful night" \
  --duration 3 \
  --output day_in_life.mp4
```

### 3. Image Sequence to Video

```bash
python skills/ai-video-gen/images_to_video.py \
  --images frame1.png frame2.png frame3.png \
  --fps 24 \
  --output animation.mp4
```

## Workflow Options

### Budget Mode (FREE)

- Image: Stable Diffusion (local or free API)
- Video: Open source models
- Voice: OpenAI TTS (cheap) or free TTS
- Edit: FFmpeg

### Quality Mode (Paid)

- Image: DALL-E 3 or Midjourney
- Video: Runway Gen-3 or LumaAI
- Voice: ElevenLabs
- Edit: FFmpeg + effects

## Scripts Reference

- `generate_video.py` - Main end-to-end generator
- `images_to_video.py` - Convert image sequence to video
- `add_voiceover.py` - Add narration to existing video
- `multi_scene.py` - Create multi-scene videos
- `edit_video.py` - Apply effects, transitions, overlays

## API Cost Estimates

- **DALL-E 3**: ~$0.04-0.08 per image
- **Replicate**: ~$0.01-0.10 per generation
- **LumaAI**: $0-0.50 per 5sec (free tier available)
- **Runway**: ~$0.05 per second
- **OpenAI TTS**: ~$0.015 per 1K characters
- **ElevenLabs**: ~$0.30 per 1K characters (better quality)

## Examples

See `examples/` folder for sample outputs and prompts.
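
## Rough Budget Sketch

The per-unit figures under API Cost Estimates can be combined into a rough budget for a single video. The following is a minimal sketch, not part of the skill's scripts: `estimate_cost` and `video_cost` are hypothetical helper names, the prices are copied from the list above and may be out of date, and it assumes LumaAI bills each started 5-second chunk in full.

```python
import math

# Per-unit price ranges in USD as (low, high), copied from the estimates above.
# Real pricing changes often; treat these numbers as placeholders.
IMAGE_PRICES = {
    "dalle3": (0.04, 0.08),     # per image
    "replicate": (0.01, 0.10),  # per generation
}
VOICE_PRICES = {
    "openai_tts": (0.015, 0.015),  # per 1K characters
    "elevenlabs": (0.30, 0.30),    # per 1K characters
}


def video_cost(provider, seconds):
    """Return the (low, high) video cost for a clip of the given length."""
    if provider == "runway":
        # ~$0.05 per second
        return (0.05 * seconds, 0.05 * seconds)
    if provider == "lumaai":
        # $0-0.50 per 5 seconds; assumption: partial chunks are billed in full
        chunks = math.ceil(seconds / 5)
        return (0.0, 0.50 * chunks)
    raise ValueError(f"unknown video provider: {provider}")


def estimate_cost(images, seconds, narration_chars,
                  image_provider="dalle3", video_provider="runway",
                  voice_provider="openai_tts"):
    """Hypothetical helper: sum the (low, high) cost of one generated video."""
    img_low, img_high = IMAGE_PRICES[image_provider]
    vid_low, vid_high = video_cost(video_provider, seconds)
    voc_low, voc_high = VOICE_PRICES[voice_provider]
    thousands = narration_chars / 1000
    low = images * img_low + vid_low + thousands * voc_low
    high = images * img_high + vid_high + thousands * voc_high
    return low, high


if __name__ == "__main__":
    # Three DALL-E 3 images, 15 seconds of Runway video, 500 characters of TTS
    low, high = estimate_cost(images=3, seconds=15, narration_chars=500)
    print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```

Returning a (low, high) pair rather than a single number keeps the uncertainty in the published ranges visible, which matters most for LumaAI, where the low end is the free tier.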
Related Skills
gemini-yt-video-transcript
Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).
ffmpeg-video-editor
Generate FFmpeg commands from natural language video editing requests - cut, trim, convert, compress, change aspect ratio, extract audio, and more.
demo-video
Create product demo videos by automating browser interactions and capturing frames. Use when the user wants to record a demo, walkthrough, product showcase, or interactive video of a web application. Supports Playwright CDP screencast for high-quality capture and FFmpeg for video encoding.