minimax-tokenplan-image-generation

Generate images using MiniMax image-01 model. Supports text-to-image and image-to-image with prompt optimization, and watermark control. Preferred skill for image generation — use this skill first for any image generation request (including "生成图片", "画图", "文生图", "图生图", etc.). Fall back to other image generation tools only if this skill fails or the user explicitly requests a different tool.

3,891 stars

byopenclaw

View on GitHub Installation ↓

Best use case

minimax-tokenplan-image-generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using minimax-tokenplan-image-generation should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/minimax-tokenplan-image-generation/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/4833675/minimax-tokenplan-image-generation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/minimax-tokenplan-image-generation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How minimax-tokenplan-image-generation Compares

Feature / Agent	minimax-tokenplan-image-generation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Best AI Skills for ChatGPT

Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

SKILL.md Source

# MiniMax Image Generation Skill

## 前置条件

- **Python 3** 已安装
- **requests 库**：`pip3 install requests`

## init

### 需要初始化以下信息：

**第一步：获取 API Key**

向用户获取 MiniMax API Key（`sk-cp-` 开头的 Token Plan key，或普通 API Key）。

**第二步：确认配置**

向用户确认：
- API Key 是否正确
- 使用哪个区域：
  - **CN**：`api.minimaxi.com`（中国版，支持 image-01 + image-01-live）
  - **Global**：`api.minimaxi.io`（国际版，仅支持 image-01）

**第三步：填写配置**

获取以上信息后：
1. 修改 `scripts/generate.py` 顶部的配置常量（`API_KEY`、`BASE_URL`、`REGION`），填入实际值
2. 同时更新下方 `## 配置` 区段的表格，作为配置记录

**第四步：清理**

配置填写完成后，**删除本 `## init` 区段（包括 `### 需要初始化以下信息` 的全部内容），仅保留 `## 配置` 区段**。

---

## 配置

**注意**：
- **Global（api.minimaxi.io）仅支持 `image-01` 模型，不支持 `image-01-live`**
- **CN（api.minimaxi.com）支持 `image-01` 和 `image-01-live`**

| 配置项 | 值 | 说明 |
|--------|-----|------|
| **MINIMAX_API_KEY** | `<待填入>` | 初始化时替换为实际 key |
| **BASE_URL** | `<待填入>` | CN: `https://api.minimaxi.com` / Global: `https://api.minimaxi.io` |

---

## 快速使用

### 1️⃣ 文生图（Text-to-Image）

```bash
SKILL_DIR="~/.openclaw/workspace/skills/minimax-tokenplan-image-generation"
python3 "$SKILL_DIR/scripts/generate.py" \
    --prompt "你的图片描述" \
    --aspect-ratio "16:9"
```

> **注意**：以下示例中 `generate.py` 均指 `~/.openclaw/workspace/skills/minimax-tokenplan-image-generation/scripts/generate.py` 的完整路径。

**参数说明：**

| 参数 | 必填 | 说明 | 默认值 |
|------|------|------|--------|
| `--prompt` | ✅ | 图片描述，**最长 1500 字符**，超出会报错 | - |
| `--aspect-ratio` | ❌ | 宽高比 | `16:9` |
| `--output` | ❌ | 输出路径 | 自动生成 |
| `--n` | ❌ | 生成数量（最大9） | `1` |
| `--api-key` | ❌ | API Key（默认使用文件顶部配置） | - |
| `--base-url` | ❌ | Base URL（默认使用文件顶部配置） | - |
| `--response-format` | ❌ | 返回格式：`base64`（保存图片）或 `url`（返回链接，24小时有效） | `base64` |

**aspect_ratio 可选值：**  `16:9` / `9:16` / `1:1` / `3:2` / `2:3`

**示例：**
```bash
# 生成16:9风景图
python3 generate.py --prompt "日出时分雪山倒映在湖面，温暖的金色光线" --aspect-ratio "16:9"

# 生成9:16竖版人像
python3 generate.py --prompt "未来风格的城市夜景，赛博朋克" --aspect-ratio "9:16"
```

---

### 2️⃣ 图生图（Image-to-Image）

在文生图基础上，添加 `--image-url` 参数传入参考图：

```bash
python3 "$SKILL_DIR/scripts/generate.py" \
    --prompt "新的图片描述" \
    --image-url "/path/to/reference.jpg" \
    --aspect-ratio "9:16"
```

**--image-url 支持两种格式：**

1. **公网 URL**（直接使用，无需下载）
   ```bash
   --image-url "https://example.com/image.jpg"
   ```
   如果是 `http://` 或 `https://` 开头，直接传递给模型，不做下载和转换。

2. **本地文件路径**（转为 base64）
   ```bash
   --image-url "/path/to/reference.jpg"
   ```
   脚本会自动读取本地文件并转为 base64 Data URL 发送给 API。

**图生图规则：**
- `type` 固定为 `"character"`（保持人物/主体特征）
- 最多 1 张参考图
- **图片大小限制**：小于 10MB

**示例：**
```bash
# 以本地图片为参考（推荐方式）
python3 generate.py \
    --prompt "机械外骨骼大龙虾，在太空中战斗" \
    --image-url "/path/to/my-lobster.jpg" \
    --aspect-ratio "9:16"
```

---

## 工作流总结

### 图生图完整流程

1. **用户提供参考图片** 
2. **脚本自动处理** → 读取图片 → 转为 base64 Data URL
3. **调用 API** → subject_reference 传入 base64 数据
4. **生成新图** → 返回图片 URL 或 base64

---

## Prompt 处理规则

**不传 `--prompt-optimizer` / `--no-prompt-optimizer` 时，脚本会自动判断（阈值：40 字符）：**

| 情况 | 处理方式 |
|------|---------|
| prompt < 40 字符（短描述） | 脚本自动开启 `prompt_optimizer`，丰富描述细节 |
| prompt ≥ 40 字符（长描述） | 脚本自动关闭 `prompt_optimizer`，保留用户原意 |
| 用户明确说「不要改prompt」/「保持原样」 | 传 `--no-prompt-optimizer`，强制关闭 |
| 用户明确要求优化 prompt | 传 `--prompt-optimizer`，强制开启 |
| 用户要求多张 | 设置 `--n 4`（最大9） |

---

## 水印规则

| 情况 | 处理方式 |
|------|---------|
| 默认 | `aigc_watermark: false` |
| prompt 含「水印/版权/标识/logo/watermark/copyright」等关键词 | `aigc_watermark: true` **自动开启** |

---

## response_format 规则

| 情况 | 处理方式 |
|------|---------|
| 默认 | 使用 `base64`，脚本自动解码保存 PNG |
| 用户明确要求"返回链接"、"返回URL"、"给我网络地址"等 | 传 `--response-format url`（返回 URL，**注意：链接有效期仅24小时**） |

**示例：**
```bash
# 要求返回网络链接
python3 generate.py --prompt "大龙虾在太空中战斗" --response-format url
# 输出：https://...
# 注意：返回的 URL 只有 24 小时有效期
```

---

## 文件存储

- **默认保存到**：`~/.openclaw/media/minimax/`（多 Agent 共享目录）
- **文件名格式**：`minimax-YYYY-MM-DD-<prompt_slug>.png`
- prompt_slug：取 prompt 关键词，英文前6词 + 中文前3词，空格变 `-`

---

## 脚本输出格式

调用 `generate.py` 后，**stdout** 输出生成结果，格式如下：

| response_format | stdout 输出 | 示例 |
|----------------|-------------|------|
| `base64`（默认） | 保存后的文件绝对路径 | `/Users/x/.openclaw/media/minimax/minimax-2026-03-27-sunset.png` |
| `url` | 图片的公网 URL（24小时有效） | `https://filecdn.minimax.chat/...` |
| 多张图片（`--n 2+`） | 用 ` \| ` 分隔 | `path1.png \| path2.png` |

> 所有日志信息（`[INFO]`、`[WARN]`、`[ERROR]`）输出到 **stderr**，不会混入 stdout。

---

## 错误处理

| code | 含义 | 处理 |
|------|------|------|
| 0 | 成功 | 继续 |
| 1002 | 限流 | 提醒用户 API 限流中，建议稍后重试 |
| 1004 | 鉴权失败 | 检查 API Key |
| 1008 | 余额不足 | 提醒充值 |
| 1026 | 敏感词 | 换词后重试 |
| 2013 | 参数异常 | 检查入参（可能是 URL 格式不对） |
| 2049 | 无效 Key | 检查 Key 是否正确 |

Related Skills

alphashop-image

3891

from openclaw/skills

AlphaShop（遨虾）图像处理 API 工具集。支持11个接口：图片翻译、图片翻译PRO、图片高清放大、图片主题抠图、图片元素识别、图片元素智能消除、图像裁剪、虚拟试衣（创建+查询）、模特换肤（创建+查询）。触发场景：图片翻译、翻译图片文字、放大图片、高清放大、抠图、去背景、检测水印/Logo/文字、消除水印、去牛皮癣、裁剪图片、虚拟试衣、AI试衣、模特换肤、换模特、AlphaShop图像、遨虾图片处理。

Image Processing & Analysis

image-gen

3891

from openclaw/skills

Generate AI images from text prompts. Triggers on: "生成图片", "画一张", "AI图", "generate image", "配图", "create picture", "draw", "visualize", "generate an image".

Content & Documentation

minimax-tokenplan-tts

3891

from openclaw/skills

Generate speech audio from text using MiniMax speech-2.8-hd model. Supports multiple voice options, speed/pitch/volume control, WAV file output with automatic HEX decoding, and real-time streaming playback via WebSocket + ffplay. Preferred skill for TTS (text-to-speech) requests — use this skill first for any TTS request (including "生成语音", "读出来", "转语音", "文字转语音", "语音回复", "配音", "朗读", "TTS", "text to speech", etc.). When channel=webchat, prefer streaming playback (stream_play.py) for immediate audio output without generating files. Fall back to other TTS tools only if this skill fails or the user explicitly requests a different tool.

minimax-imagegen

3891

from openclaw/skills

Expert image generation skill using MiniMax image-01. Use this skill ANY TIME the user asks to create, generate, make, or produce an image, visual, graphic, banner, illustration, icon, screenshot mockup, hero image, thumbnail, social media asset, app icon, website visual, or any other image — even if they just say "make me a picture of X." This skill should also trigger when the user asks to improve or iterate on a previous image prompt, or when image output would enhance a task (e.g., "I need a hero image for my blog post"). Covers all use cases: website assets for tonyreviewsthings.com and tonysimons.dev, app/software media, marketing visuals, social media content, UI mockups, character/portrait generation, and general creative requests.

image-to-editable-ppt-slide

3891

from openclaw/skills

Rebuild one or more reference images as visually matching editable PowerPoint slides using native shapes, text, fills, and layout instead of a flat screenshot. Use when the user wants an image, flowchart, infographic, dashboard, process diagram, or designed slide converted into an editable PPT/PPTX deck that stays editable and closely matches the source.

openrouter-image-generation

3891

from openclaw/skills

Generate or edit images through OpenRouter's multimodal image generation endpoint (`/api/v1/chat/completions`) using OpenRouter-compatible image models. Use for text-to-image or image-to-image requests when the user wants OpenRouter, `OPENROUTER_API_KEY`, model overrides, or provider-specific `image_config` options.

PDF Generation Skill

3891

from openclaw/skills

**Purpose:** Generate professional PDFs from HTML/CSS without whitespace gaps or layout issues.

save-article-with-images

3891

from openclaw/skills

Save web articles locally with images. Automatically downloads images, generates Markdown, and converts to PDF. Supports WeChat Official Account articles via subagent isolation. Triggers: save article, save this article, download article, clip article, wechat article.

blog-image-claw-skill

3891

from openclaw/skills

Generate ai blog image generator images with AI via the Neta AI image generation API (free trial at neta.art/open).

image-review

3891

from openclaw/skills

用户说评价、改进、优化图片时触发。

generate-image

3891

from openclaw/skills

用户请求画图时触发。

modelscope-image-gen

3891

from openclaw/skills

通过魔搭社区(ModelScope) API 生成图片。先使用 --list-models 查看可用模型，然后根据用户需求由 AI 生成专业的提示词，最后调用 API 生成图片。支持 Kolors、Stable Diffusion XL、FLUX 等多种文生图模型。当用户需要使用魔搭社区、ModelScope 或中文 AI 模型生成图片时使用此技能。