doubao-image-gen

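Text-to-image generation with the Doubao Seedream model, with support for concurrent batch generation and a gallery preview page as output.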

3,891 stars

by openclaw

Best use case

doubao-image-gen is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using doubao-image-gen should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/sjht-doubao-image-gen/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aowind/sjht-doubao-image-gen/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/sjht-doubao-image-gen/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How doubao-image-gen Compares

| Feature / Agent | doubao-image-gen | Standard Approach |
|------|--------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It generates images from text with the Doubao Seedream model, supports concurrent batch generation, and outputs a gallery preview page.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.


SKILL.md Source

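# Doubao Text-to-Image (Doubao Image Gen)

Uses the Volcengine Doubao `doubao-seedream-5-0-260128` model to generate high-quality images from text descriptions. It supports concurrent batch generation of multiple images and outputs a gallery preview page.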

## Requirements

- Python 3.8+
- The openai library: `pip install "openai>=1.0"`

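## Setup: API Key Configuration

The API key is read from the following sources, in order of priority (highest first):
1. The `--api-key` command-line argument
2. The `ARK_API_KEY` environment variable
3. An `ARK_API_KEY=xxx` entry in the `~/.doubao-image-gen/.env` file in the user's home directory

To get an API key, log in to the [Volcengine Ark console](https://console.volcengine.com/ark) and open API Key management.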
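
The lookup order above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in `gen.py`: the helper names `read_env_file` and `resolve_api_key` are hypothetical, and the `.env` parsing is assumed to be plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(cli_key: Optional[str] = None,
                    environ: Optional[Mapping[str, str]] = None,
                    env_file: Optional[Path] = None) -> Optional[str]:
    """Return the first key found: CLI argument, then ARK_API_KEY, then the .env file."""
    environ = os.environ if environ is None else environ
    if env_file is None:
        # Path taken from the priority list above.
        env_file = Path.home() / ".doubao-image-gen" / ".env"
    if cli_key:
        return cli_key
    if environ.get("ARK_API_KEY"):
        return environ["ARK_API_KEY"]
    return read_env_file(env_file).get("ARK_API_KEY")
```

Passing `environ` and `env_file` explicitly keeps the function easy to test without touching the real environment or home directory.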

## Run

```bash
# Generate a single image
python {baseDir}/scripts/gen.py --prompt "Cyberpunk-style night view of Shanghai" --api-key YOUR_KEY

# Generate 4 images concurrently (default concurrency is 4)
python {baseDir}/scripts/gen.py --prompt "Ink-wash landscape painting" --count 4 --api-key YOUR_KEY

# Set the size (supported: 1024x1024 / 2K / 1280x720 / 720x1280 / 2048x2048)
python {baseDir}/scripts/gen.py --prompt "Grassland under a starry sky" --size 2K --api-key YOUR_KEY

# Set the output directory
python {baseDir}/scripts/gen.py --prompt "Classical Chinese xianxia fantasy" --out-dir ./output --api-key YOUR_KEY

# Read the key from the environment variable (recommended)
python {baseDir}/scripts/gen.py --prompt "Futuristic city" --count 2
```
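
One plausible way to get the concurrent behavior of `--count` and `--workers` is a thread pool, sketched below. How `gen.py` actually schedules requests is not shown in this document, so `generate_batch` and `fake_generate` are hypothetical names, and the `001.jpeg` naming pattern is an assumption. The API call is passed in as a callable so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict


def generate_batch(prompt: str, count: int, workers: int,
                   generate_one: Callable[[str, int], str]) -> Dict[int, str]:
    """Run generate_one(prompt, index) for index 1..count on a thread pool.

    Returns {index: result} for the calls that succeeded; failures are
    printed and skipped so one bad request does not abort the batch.
    """
    results: Dict[int, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, min(workers, count))) as pool:
        futures = {pool.submit(generate_one, prompt, i): i
                   for i in range(1, count + 1)}
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:  # keep going if one image fails
                print(f"image {index} failed: {exc}")
    return results


def fake_generate(prompt: str, index: int) -> str:
    # Stand-in for the real API call; returns the file name it would write.
    return f"{index:03d}.jpeg"
```

Calling `generate_batch("Ink-wash landscape painting", 4, 4, fake_generate)` returns a mapping from sequence number to file name. Threads suit this workload because each task spends most of its time waiting on the network.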

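## Parameters

| Parameter | Default | Description |
|------|--------|------|
| `--prompt` | required | Text prompt describing the image |
| `--count` | 1 | Number of images to generate (run concurrently) |
| `--size` | `2K` | Image size |
| `--model` | `doubao-seedream-5-0-260128` | Model name |
| `--out-dir` | `./doubao-output-{timestamp}` | Output directory |
| `--api-key` | environment variable | ARK API key |
| `--workers` | 4 | Number of concurrent threads |
| `--watermark` | False | Whether to add a watermark |
| `--dry-run` | False | Print the parameters without calling the API |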
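
The table maps naturally onto `argparse`. The sketch below mirrors the flags and defaults listed above; it is not the parser from `gen.py`. The function names `build_parser` and `parse_args` are hypothetical, and the timestamp format is an assumption, since the table only says `{timestamp}`.

```python
import argparse
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Doubao text-to-image (sketch)")
    parser.add_argument("--prompt", required=True, help="image description")
    parser.add_argument("--count", type=int, default=1, help="number of images")
    parser.add_argument("--size", default="2K", help="image size")
    parser.add_argument("--model", default="doubao-seedream-5-0-260128")
    parser.add_argument("--out-dir", default=None,
                        help="defaults to ./doubao-output-{timestamp}")
    parser.add_argument("--api-key", default=None, help="ARK API key")
    parser.add_argument("--workers", type=int, default=4, help="thread count")
    parser.add_argument("--watermark", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.out_dir is None:
        # Timestamp format is an assumption; the table only says "{timestamp}".
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        args.out_dir = f"./doubao-output-{stamp}"
    return args
```

For example, `parse_args(["--prompt", "Futuristic city", "--count", "2", "--dry-run"])` yields `count == 2`, `workers == 4`, and `dry_run is True`.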

## Output

- `*.jpeg` image files (named by sequence number)
- `prompts.json`, a record mapping prompts to files
- `index.html`, a gallery preview page (can be opened directly in a browser)
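
To make the two generated files concrete, here is a minimal sketch that writes a `prompts.json` and an `index.html` for a set of already-saved images. The real layout of both files is not documented here, so the record shape (`file` and `prompt` keys), the HTML structure, and the name `write_gallery` are all assumptions.

```python
import html
import json
from pathlib import Path
from typing import List


def write_gallery(out_dir: Path, prompt: str, image_names: List[str]) -> None:
    """Write prompts.json and index.html next to already-saved images."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Record layout is an assumption: one {"file", "prompt"} entry per image.
    records = [{"file": name, "prompt": prompt} for name in image_names]
    (out_dir / "prompts.json").write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")

    # Escape user text so a prompt containing < or " cannot break the page.
    safe_prompt = html.escape(prompt)
    figures = "\n".join(
        f'<figure><img src="{html.escape(name)}" alt="{safe_prompt}" width="320">'
        f"<figcaption>{html.escape(name)}</figcaption></figure>"
        for name in image_names)
    page = (f'<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
            f"<title>{safe_prompt}</title></head>\n<body>\n<h1>{safe_prompt}</h1>\n"
            f"{figures}\n</body></html>\n")
    (out_dir / "index.html").write_text(page, encoding="utf-8")
```

Using relative `src` paths keeps the page viewable by opening `index.html` straight from the output directory, and `ensure_ascii=False` keeps Chinese prompts readable in the JSON file.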

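## AI Usage Guide

Load this skill and call the script when the user says something like:

- "帮我画一张..." ("Draw me a...") / "生成一张..." ("Generate a...") / "画个图..." ("Draw a picture...")
- "批量生成 N 张图片" ("Batch-generate N images")
- "用豆包生成图片" ("Generate an image with Doubao")

**Standard workflow:**
1. Extract or refine the user's prompt (translate it into English when that improves quality)
2. Call `python {baseDir}/scripts/gen.py` to generate the images
3. When generation finishes, **send the images to the user directly in the chat as Markdown images**: `![description](image path or URL)`
4. Also provide a link to the `index.html` preview page for browsing

**Example of prompt refinement:**
The user says "画一只猫" ("draw a cat") → refine it to "A cute cat sitting gracefully, soft studio lighting, photorealistic, 8K detail"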
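
Steps 3 and 4 of the workflow amount to assembling a short Markdown message. The helper below is one way to do that; `markdown_reply` and the "Gallery preview" link text are hypothetical, and the paths are whatever the script reported for the run.

```python
from pathlib import Path
from typing import List


def markdown_reply(description: str, image_paths: List[str], out_dir: str) -> str:
    """Build the chat message: one Markdown image per file plus the gallery link."""
    # Square brackets would end the alt text early, so drop them.
    alt = description.replace("[", "").replace("]", "")
    lines = [f"![{alt}]({path})" for path in image_paths]
    gallery = (Path(out_dir) / "index.html").as_posix()
    lines.append(f"[Gallery preview]({gallery})")
    return "\n\n".join(lines)
```

Blank lines between the entries make each image render as its own block in most Markdown chat clients.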
