ai-video-generation
Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative
Best use case
ai-video-generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative
Teams using ai-video-generation should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/ai-video-generation/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How ai-video-generation Compares
| Feature / Agent | ai-video-generation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# AI Video Generation
Generate videos with 40+ AI models via [inference.sh](https://inference.sh) CLI.

## Quick Start
```bash
# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login
# Generate a video with Veo
infsh app run google/veo-3-1-fast --input '{"prompt": "drone shot flying over a forest"}'
```
## Available Models
### Text-to-Video
| Model | App ID | Best For |
|-------|--------|----------|
| Veo 3.1 Fast | `google/veo-3-1-fast` | Fast, with optional audio |
| Veo 3.1 | `google/veo-3-1` | Best quality, frame interpolation |
| Veo 3 | `google/veo-3` | High quality with audio |
| Veo 3 Fast | `google/veo-3-fast` | Fast with audio |
| Veo 2 | `google/veo-2` | Realistic videos |
| Grok Video | `xai/grok-imagine-video` | xAI, configurable duration |
| Seedance 1.5 Pro | `bytedance/seedance-1-5-pro` | With first-frame control |
| Seedance 1.0 Pro | `bytedance/seedance-1-0-pro` | Up to 1080p |
### Image-to-Video
| Model | App ID | Best For |
|-------|--------|----------|
| Wan 2.5 | `falai/wan-2-5` | Animate any image |
| Wan 2.5 I2V | `falai/wan-2-5-i2v` | High quality i2v |
| Seedance Lite | `bytedance/seedance-1-0-lite` | Lightweight 720p |
### Avatar / Lipsync
| Model | App ID | Best For |
|-------|--------|----------|
| OmniHuman 1.5 | `bytedance/omnihuman-1-5` | Multi-character |
| OmniHuman 1.0 | `bytedance/omnihuman-1-0` | Single character |
| Fabric 1.0 | `falai/fabric-1-0` | Image talks with lipsync |
| PixVerse Lipsync | `falai/pixverse-lipsync` | Realistic lipsync |
### Utilities
| Tool | App ID | Description |
|------|--------|-------------|
| HunyuanVideo Foley | `infsh/hunyuanvideo-foley` | Add sound effects to video |
| Topaz Upscaler | `falai/topaz-video-upscaler` | Upscale video quality |
| Media Merger | `infsh/media-merger` | Merge videos with transitions |
## Browse All Video Apps
```bash
infsh app list --category video
```
## Examples
### Text-to-Video with Veo
```bash
infsh app run google/veo-3-1-fast --input '{
"prompt": "A timelapse of a flower blooming in a garden"
}'
```
### Grok Video
```bash
infsh app run xai/grok-imagine-video --input '{
"prompt": "Waves crashing on a beach at sunset",
"duration": 5
}'
```
### Image-to-Video with Wan 2.5
```bash
infsh app run falai/wan-2-5 --input '{
"image_url": "https://your-image.jpg"
}'
```
### AI Avatar / Talking Head
```bash
infsh app run bytedance/omnihuman-1-5 --input '{
"image_url": "https://portrait.jpg",
"audio_url": "https://speech.mp3"
}'
```
### Fabric Lipsync
```bash
infsh app run falai/fabric-1-0 --input '{
"image_url": "https://face.jpg",
"audio_url": "https://audio.mp3"
}'
```
### PixVerse Lipsync
```bash
infsh app run falai/pixverse-lipsync --input '{
"image_url": "https://portrait.jpg",
"audio_url": "https://speech.mp3"
}'
```
### Video Upscaling
```bash
infsh app run falai/topaz-video-upscaler --input '{"video_url": "https://..."}'
```
### Add Sound Effects (Foley)
```bash
infsh app run infsh/hunyuanvideo-foley --input '{
"video_url": "https://silent-video.mp4",
"prompt": "footsteps on gravel, birds chirping"
}'
```
### Merge Videos
```bash
infsh app run infsh/media-merger --input '{
"videos": ["https://clip1.mp4", "https://clip2.mp4"],
"transition": "fade"
}'
```
## Related Skills
```bash
# Full platform skill (all 150+ apps)
npx skills add inferencesh/skills@inference-sh
# Google Veo specific
npx skills add inferencesh/skills@google-veo
# AI avatars & lipsync
npx skills add inferencesh/skills@ai-avatar-video
# Text-to-speech (for video narration)
npx skills add inferencesh/skills@text-to-speech
# Image generation (for image-to-video)
npx skills add inferencesh/skills@ai-image-generation
# Twitter (post videos)
npx skills add inferencesh/skills@twitter-automation
```
Browse all apps: `infsh app list`
## Documentation
- [Running Apps](https://inference.sh/docs/apps/running) - How to run apps via CLI
- [Streaming Results](https://inference.sh/docs/api/sdk/streaming) - Real-time progress updates
- [Content Pipeline Example](https://inference.sh/docs/examples/content-pipeline) - Building media workflowsRelated Skills
generational-agent-succession
Parallel agent swarms with generational succession. Combines agent-architect's multi-agent parallelism with automatic succession when agents degrade. Each parallel agent gets fresh context through controlled handoffs while maintaining accumulated wisdom.
vidu-video
使用 Vidu Q3 Pro 模型生成视频。当用户想要文生视频、生成带音频的视频,或提到 vidu 时使用此 skill。
videodb-skills
Upload, stream, search, edit, transcribe, and generate AI video and audio using the VideoDB SDK.
videocut:安装
环境准备。安装依赖、下载模型。触发词:安装、环境准备、初始化
video
Generate videos using fal.ai (Wan, Kling) or Sora. Text-to-video and image-to-video.
video-toolkit
Intelligent video processor for downloading media and extracting transcripts from YouTube and 1000+ supported sites. Automatically handles format selection, subtitle extraction, and post-processing.
video-processing-editing
FFmpeg automation for cutting, trimming, concatenating videos. Audio mixing, timeline editing, transitions, effects. Export optimization for YouTube, social media. Subtitle handling, color grading, batch processing. Use for videogen projects, content creation, automated video production. Activate on "video editing", "FFmpeg", "trim video", "concatenate", "transitions", "export optimization". NOT for real-time video editing UI, 3D compositing, or motion graphics.
video-commercial
Generate 30-second video commercials from a concept. Creates storyboard, generates scene images, adds narration via ElevenLabs, assembles final video. Use when asked to create commercials, promo videos, video ads, or short marketing videos.
video-analyzer
鏅鸿兘鍒嗘瀽 Bilibili/YouTube/鏈湴瑙嗛锛岀敓鎴愯浆鍐欍€佽瘎浼板拰鎬荤粨銆傛敮鎸佸叧閿抚鎴浘鑷姩宓屽叆銆?
Media Uploader - R2/S3 with video download
Upload files or download videos from popular platforms (YouTube, Vimeo, Bilibili, etc.) and upload to Cloudflare R2, AWS S3, or any S3-compatible storage with secure presigned download links.
media-generation
Generate images, videos, and audio using Google's Gemini APIs. Use for image generation/editing (Gemini 3 Pro Image), video generation (Veo 3), and speech (TBD). Trigger words - images: generate, create, draw, design, make, edit, modify image/picture. Video: generate video, create video, animate, make a video. Supports text-to-image, image-to-image editing, text-to-video, and image-to-video.
ltxv2-video
Build LTX-V2 19B video workflows — text-to-video, image-to-video, distilled model, camera control LoRAs, and two-stage upscaling