ai-video-generation

Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative

16 stars

Best use case

ai-video-generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative

Teams using ai-video-generation should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ai-video-generation/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/content-media/ai-video-generation/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ai-video-generation/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ai-video-generation Compares

Feature / Agentai-video-generationStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# AI Video Generation

Generate videos with 40+ AI models via [inference.sh](https://inference.sh) CLI.

![AI Video Generation](https://cloud.inference.sh/app/files/u/4mg21r6ta37mpaz6ktzwtt8krr/01kg2c0egyg243mnyth4y6g51q.jpeg)

## Quick Start

```bash
# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate a video with Veo
infsh app run google/veo-3-1-fast --input '{"prompt": "drone shot flying over a forest"}'
```

## Available Models

### Text-to-Video

| Model | App ID | Best For |
|-------|--------|----------|
| Veo 3.1 Fast | `google/veo-3-1-fast` | Fast, with optional audio |
| Veo 3.1 | `google/veo-3-1` | Best quality, frame interpolation |
| Veo 3 | `google/veo-3` | High quality with audio |
| Veo 3 Fast | `google/veo-3-fast` | Fast with audio |
| Veo 2 | `google/veo-2` | Realistic videos |
| Grok Video | `xai/grok-imagine-video` | xAI, configurable duration |
| Seedance 1.5 Pro | `bytedance/seedance-1-5-pro` | With first-frame control |
| Seedance 1.0 Pro | `bytedance/seedance-1-0-pro` | Up to 1080p |

### Image-to-Video

| Model | App ID | Best For |
|-------|--------|----------|
| Wan 2.5 | `falai/wan-2-5` | Animate any image |
| Wan 2.5 I2V | `falai/wan-2-5-i2v` | High quality i2v |
| Seedance Lite | `bytedance/seedance-1-0-lite` | Lightweight 720p |

### Avatar / Lipsync

| Model | App ID | Best For |
|-------|--------|----------|
| OmniHuman 1.5 | `bytedance/omnihuman-1-5` | Multi-character |
| OmniHuman 1.0 | `bytedance/omnihuman-1-0` | Single character |
| Fabric 1.0 | `falai/fabric-1-0` | Image talks with lipsync |
| PixVerse Lipsync | `falai/pixverse-lipsync` | Realistic lipsync |

### Utilities

| Tool | App ID | Description |
|------|--------|-------------|
| HunyuanVideo Foley | `infsh/hunyuanvideo-foley` | Add sound effects to video |
| Topaz Upscaler | `falai/topaz-video-upscaler` | Upscale video quality |
| Media Merger | `infsh/media-merger` | Merge videos with transitions |

## Browse All Video Apps

```bash
infsh app list --category video
```

## Examples

### Text-to-Video with Veo

```bash
infsh app run google/veo-3-1-fast --input '{
  "prompt": "A timelapse of a flower blooming in a garden"
}'
```

### Grok Video

```bash
infsh app run xai/grok-imagine-video --input '{
  "prompt": "Waves crashing on a beach at sunset",
  "duration": 5
}'
```

### Image-to-Video with Wan 2.5

```bash
infsh app run falai/wan-2-5 --input '{
  "image_url": "https://your-image.jpg"
}'
```

### AI Avatar / Talking Head

```bash
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
```

### Fabric Lipsync

```bash
infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'
```

### PixVerse Lipsync

```bash
infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
```

### Video Upscaling

```bash
infsh app run falai/topaz-video-upscaler --input '{"video_url": "https://..."}'
```

### Add Sound Effects (Foley)

```bash
infsh app run infsh/hunyuanvideo-foley --input '{
  "video_url": "https://silent-video.mp4",
  "prompt": "footsteps on gravel, birds chirping"
}'
```

### Merge Videos

```bash
infsh app run infsh/media-merger --input '{
  "videos": ["https://clip1.mp4", "https://clip2.mp4"],
  "transition": "fade"
}'
```

## Related Skills

```bash
# Full platform skill (all 150+ apps)
npx skills add inferencesh/skills@inference-sh

# Google Veo specific
npx skills add inferencesh/skills@google-veo

# AI avatars & lipsync
npx skills add inferencesh/skills@ai-avatar-video

# Text-to-speech (for video narration)
npx skills add inferencesh/skills@text-to-speech

# Image generation (for image-to-video)
npx skills add inferencesh/skills@ai-image-generation

# Twitter (post videos)
npx skills add inferencesh/skills@twitter-automation
```

Browse all apps: `infsh app list`

## Documentation

- [Running Apps](https://inference.sh/docs/apps/running) - How to run apps via CLI
- [Streaming Results](https://inference.sh/docs/api/sdk/streaming) - Real-time progress updates
- [Content Pipeline Example](https://inference.sh/docs/examples/content-pipeline) - Building media workflows

Related Skills

generational-agent-succession

16
from diegosouzapw/awesome-omni-skill

Parallel agent swarms with generational succession. Combines agent-architect's multi-agent parallelism with automatic succession when agents degrade. Each parallel agent gets fresh context through controlled handoffs while maintaining accumulated wisdom.

vidu-video

16
from diegosouzapw/awesome-omni-skill

使用 Vidu Q3 Pro 模型生成视频。当用户想要文生视频、生成带音频的视频,或提到 vidu 时使用此 skill。

videodb-skills

16
from diegosouzapw/awesome-omni-skill

Upload, stream, search, edit, transcribe, and generate AI video and audio using the VideoDB SDK.

videocut:安装

16
from diegosouzapw/awesome-omni-skill

环境准备。安装依赖、下载模型。触发词:安装、环境准备、初始化

video

16
from diegosouzapw/awesome-omni-skill

Generate videos using fal.ai (Wan, Kling) or Sora. Text-to-video and image-to-video.

video-toolkit

16
from diegosouzapw/awesome-omni-skill

Intelligent video processor for downloading media and extracting transcripts from YouTube and 1000+ supported sites. Automatically handles format selection, subtitle extraction, and post-processing.

video-processing-editing

16
from diegosouzapw/awesome-omni-skill

FFmpeg automation for cutting, trimming, concatenating videos. Audio mixing, timeline editing, transitions, effects. Export optimization for YouTube, social media. Subtitle handling, color grading, batch processing. Use for videogen projects, content creation, automated video production. Activate on "video editing", "FFmpeg", "trim video", "concatenate", "transitions", "export optimization". NOT for real-time video editing UI, 3D compositing, or motion graphics.

video-commercial

16
from diegosouzapw/awesome-omni-skill

Generate 30-second video commercials from a concept. Creates storyboard, generates scene images, adds narration via ElevenLabs, assembles final video. Use when asked to create commercials, promo videos, video ads, or short marketing videos.

video-analyzer

16
from diegosouzapw/awesome-omni-skill

鏅鸿兘鍒嗘瀽 Bilibili/YouTube/鏈湴瑙嗛锛岀敓鎴愯浆鍐欍€佽瘎浼板拰鎬荤粨銆傛敮鎸佸叧閿抚鎴浘鑷姩宓屽叆銆?

Media Uploader - R2/S3 with video download

16
from diegosouzapw/awesome-omni-skill

Upload files or download videos from popular platforms (YouTube, Vimeo, Bilibili, etc.) and upload to Cloudflare R2, AWS S3, or any S3-compatible storage with secure presigned download links.

media-generation

16
from diegosouzapw/awesome-omni-skill

Generate images, videos, and audio using Google's Gemini APIs. Use for image generation/editing (Gemini 3 Pro Image), video generation (Veo 3), and speech (TBD). Trigger words - images: generate, create, draw, design, make, edit, modify image/picture. Video: generate video, create video, animate, make a video. Supports text-to-image, image-to-image editing, text-to-video, and image-to-video.

ltxv2-video

16
from diegosouzapw/awesome-omni-skill

Build LTX-V2 19B video workflows — text-to-video, image-to-video, distilled model, camera control LoRAs, and two-stage upscaling