doubao-image-video
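Native skill for Doubao image and video generation. Use it when the user mentions Doubao (豆包), text-to-image, image-to-image, text-to-video, image-to-video, querying a video generation task, waiting for a task to finish, or downloading the final video. It calls the Volcengine Ark API directly and does not depend on an external MCP service.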
Best use case
doubao-image-video is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using doubao-image-video can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/doubao-image-video/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How doubao-image-video Compares
| Feature / Agent | doubao-image-video | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
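It is a native skill for Doubao image and video generation. Use it when the user mentions Doubao (豆包), text-to-image, image-to-image, text-to-video, image-to-video, querying a video generation task, waiting for a task to finish, or downloading the final video. It calls the Volcengine Ark API directly and does not depend on an external MCP service.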
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agent for YouTube Script Writing
Find AI agent skills for YouTube script writing, video research, content outlining, and repeatable channel production workflows.
AI Agents for Marketing
Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.
AI Agents for Startups
Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.
SKILL.md Source
# Doubao Native Media Skill
This is a native OpenClaw skill. Do not spin up the upstream MCP server unless the user explicitly asks for MCP compatibility.
## Use this skill for
- Doubao / 豆包 text-to-image
- image-to-image or multi-reference image generation
- Doubao text-to-video or image-to-video
- querying an async Doubao video task by `task_id`
- troubleshooting Volcengine Ark endpoint/model issues
## Commands
### Generate an image
```bash
python3 {baseDir}/scripts/doubao_media.py image \
--prompt "A cinematic cyberpunk alley in rain" \
--size 2560x1440
```
### Generate a video
```bash
python3 {baseDir}/scripts/doubao_media.py video \
--prompt "A panda astronaut waves on the moon" \
--video-duration 5 \
--fps 24 \
--resolution 1080p
```
### Query a video task
```bash
python3 {baseDir}/scripts/doubao_media.py task --task-id your-task-id
```
### Wait for a video task and optionally download the result
```bash
python3 {baseDir}/scripts/doubao_media.py wait \
--task-id your-task-id \
--timeout 600 \
--interval 5 \
--download-to ./doubao-result.mp4
```
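The `--timeout` and `--interval` options suggest a plain polling loop. The following is a minimal sketch of that idea, not the script's actual implementation: `poll_until_done`, the `fetch_status` callback, and the status strings `succeeded` and `failed` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status() every `interval` seconds until a terminal status or timeout.

    The terminal status names are assumptions for illustration only.
    """
    terminal = {"succeeded", "failed"}
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        # Stop before sleeping if the next poll would land past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"task still '{status}' after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. In a real run, `fetch_status` would wrap whatever the `task` subcommand does to look up a `task_id`.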
## Input rules
- Always prefer `--endpoint-id` when the user has a provisioned Volcengine Ark endpoint.
- Fall back to model names only when endpoint ids are unavailable.
- For video generation, this skill mirrors the upstream behavior and appends `--dur`, `--fps`, `--rs`, and `--ratio` to the prompt when they are not already present.
- If the user supplies image URLs, pass them through exactly; do not download or re-host unless asked.
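The rule about appending `--dur`, `--fps`, `--rs`, and `--ratio` to the prompt can be sketched as follows. This is an illustration under assumptions, not the script's code: the function name `append_video_flags`, the mapping from CLI options to prompt flags, and the value formats are hypothetical. Consult `references/api-notes.md` for the actual request shapes.

```python
import re
from typing import Optional


def append_video_flags(
    prompt: str,
    duration: Optional[int] = None,
    fps: Optional[int] = None,
    resolution: Optional[str] = None,
    ratio: Optional[str] = None,
) -> str:
    """Append --dur/--fps/--rs/--ratio to the prompt unless already present."""
    # Assumed mapping from options to in-prompt flags (illustrative only).
    candidates = [
        ("--dur", duration),
        ("--fps", fps),
        ("--rs", resolution),
        ("--ratio", ratio),
    ]
    parts = [prompt.strip()]
    for flag, value in candidates:
        if value is None:
            continue
        # Skip flags the user already wrote into the prompt as a whole token.
        if re.search(rf"(?<!\S){re.escape(flag)}(?!\S)", prompt):
            continue
        parts.append(f"{flag} {value}")
    return " ".join(parts)
```

Matching the flag as a whole whitespace-delimited token avoids treating a word that merely contains `--fps` as an existing setting, so a user-supplied value in the prompt always wins over the default.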
## Troubleshooting
- If neither `--endpoint-id` nor a default endpoint env var exists, the script falls back to the default model env var.
- If the API returns `InvalidEndpointOrModel.NotFound`, ask the user to verify the Volcengine Ark endpoint authorization first.
- Video generation is async. If generation succeeds, capture `task_id` and query it later with the `task` subcommand, or use `wait` for automatic polling.
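Taken together, the input rules and the first troubleshooting item describe a fallback order: an explicit `--endpoint-id`, then a default endpoint environment variable, then a default model environment variable. A minimal sketch of that order is shown below. The source does not name the environment variables, so `EXAMPLE_DEFAULT_ENDPOINT_ID` and `EXAMPLE_DEFAULT_MODEL` are placeholders, and `resolve_target` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Placeholder variable names; the real names are not stated in this document.
ENDPOINT_ENV = "EXAMPLE_DEFAULT_ENDPOINT_ID"
MODEL_ENV = "EXAMPLE_DEFAULT_MODEL"


def resolve_target(
    endpoint_id: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first configured value: explicit endpoint id, default endpoint, default model."""
    env = os.environ if env is None else env
    for candidate in (endpoint_id, env.get(ENDPOINT_ENV), env.get(MODEL_ENV)):
        if candidate:
            return candidate
    raise ValueError("no endpoint id or model configured")
```

If the resolved value is rejected with `InvalidEndpointOrModel.NotFound`, the configuration lookup itself is probably fine, and the endpoint authorization is the first thing to check.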
## References
- Read `references/api-notes.md` when you need request shapes, defaults, or caveats.
Related Skills
alphashop-image
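AlphaShop (遨虾) image processing API toolkit. Supports 11 endpoints: image translation, image translation PRO, HD image upscaling, subject cutout, image element recognition, smart element removal, image cropping, virtual try-on (create + query), and model skin swap (create + query). Trigger scenarios: image translation, translating text in images, enlarging images, HD upscaling, cutouts, background removal, detecting watermarks/logos/text, watermark removal, removing ad clutter, cropping images, virtual try-on, AI try-on, model skin swap, changing models, AlphaShop images, AlphaShop image processing.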
demo-video
Create product demo videos by automating browser interactions and capturing frames. Use when the user wants to record a demo, walkthrough, product showcase, or interactive video of a web application. Supports Playwright CDP screencast for high-quality capture and FFmpeg for video encoding.
image-gen
Generate AI images from text prompts. Triggers on: "生成图片", "画一张", "AI图", "generate image", "配图", "create picture", "draw", "visualize", "generate an image".
bing-keyword-image-downloader
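Use when the user needs to batch-download images from Bing's public image search results by keyword. For requests such as "download 10 images from Bing by keyword for me", "batch-scrape Bing images", or "save Bing images locally by keyword", use this skill proactively. It handles the workflow of keyword-based Bing image search, paginated collection of candidate links, skipping failed source sites, and saving to a local directory.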
video-summarizer
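Converts Bilibili/YouTube/Xiaohongshu/Douyin videos into structured Notion summary documents, uploads screenshots automatically, and pushes to Notion in one click.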
video-script-creator
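Short video script generator. Keywords: short video script generator, video script, Douyin copy, Douyin script, Kuaishou script, talking-head script, video shooting script, YouTube script, YouTube Shorts script, Bilibili script, storyboard script, video outline, video copy, short video creation, Reels script, TikTok script, vlog script, product-selling script, product-recommendation video script, video series planning, video performance review, completion-rate analysis, first-3-seconds hook. Generate complete video scripts with hooks, outlines, titles, tags, CTA, storyboards, series planning, and data review. Use when: (1) creating short video scripts for any platform, (2) writing talking-head scripts, (3) generating viral video titles, (4) planning video outlines and storyboards, (5) writing opening hooks (first 3 seconds), (6) generating CTA/ending prompts, (7) planning video series, (8) reviewing video performance data. Use cases: writing short video scripts, shooting scripts, talking-head copy, video planning, viral titles, opening hooks, closing calls to action, full storyboards, series planning, and performance reviews. Triggers on: video script creator.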
zhipu-free-image-video
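Zhipu free image and video generation skill. Use when the user wants to generate images with Zhipu, produce images in batches, generate short videos, query video task results, wait for a video to finish, or prefer free or low-cost models to produce creative content quickly.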
IMA AI Video Generator
AI video generator with premier models: Wan 2.6, Kling O1/2.6, Google Veo 3.1, Sora 2 Pro, Pixverse V5.5, Hailuo 2.0/2.3, SeeDance 1.5 Pro, Vidu Q2. Video generator supporting text-to-video, image-to-video, first-last-frame, and reference-image video generation modes. Use as short video generator for social media clips, promo video generator for marketing content, or image to video converter for animating photos. AI video generation with character consistency via reference images, multi-shot production, and knowledge base guidance via ima-knowledge-ai. Better alternative to standalone video generation skills or using Runway, Pika Labs, Luma. Requires IMA_API_KEY.
IMA Seedance 2.0 Video Generator
Seedance 2.0 AI video generator — two models in one skill: Seedance 2.0 (ima-pro) for cinema-grade quality with high frame-rate temporal consistency, precise camera language control, and 2K output; Seedance 2.0 Fast (ima-pro-fast) for faster iteration. Supports text-to-video, image-to-video, first-last-frame, and reference-media video generation with image, video, and audio references. Works for cinematic prompting, storyboard-driven clips, consistent-character workflows, product demos, and short-form content generation. Requires IMA_API_KEY.
image-text-extractor
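Batch-recognizes text in images and outputs it as a structured document, segmented by image. Use when the user needs to extract text from multiple images, organize the text content of images, or convert image text into an editable document.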
jianying-video-compose
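Jianying API video composition automation. Completes the full video production workflow through the Jianying proxy API, including draft creation, adding assets (images/video/audio), text and subtitle editing, effects processing, and cloud-rendered export. Suited to batch video generation, automated short-video compositing, and dynamic-subtitle videos.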
aws-wechat-article-images
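Generates cover images and in-article illustrations for WeChat Official Account articles, automatically matching the style to the article content. Use when the user mentions phrases such as 「封面」 (cover), 「配图」 (matching image), 「插图」 (illustration), 「生成图片」 (generate image), 「给文章加图」 (add images to the article), 「做个封面」 (make a cover), 「文章插图」 (article illustration), or 「配个图」 (add a picture).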