AI Image Generation & Editor — Nanobanana, GPT Image, ComfyUI

Generate images from text with multi-provider routing — supports Nanobanana 2, Seedream 5.0, GPT Image, and local ComfyUI workflows. Includes 1,300+ curated prompts and style-aware prompt enhancement. Use when users want to create images, design assets, enhance prompts, or manage AI art workflows.

1,864 stars

Best use case

AI Image Generation & Editor — Nanobanana, GPT Image, ComfyUI is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using this skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/creative-toolkit/SKILL.md --create-dirs "https://raw.githubusercontent.com/LeoYeAI/openclaw-master-skills/main/skills/creative-toolkit/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/creative-toolkit/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How AI Image Generation & Editor — Nanobanana, GPT Image, ComfyUI Compares

| Feature / Agent | This skill | Standard Approach |
|-----------------|------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

Where can I find the source code?

The source code is on GitHub in the LeoYeAI/openclaw-master-skills repository, under skills/creative-toolkit.

SKILL.md Source

# Creative Toolkit

Generate professional AI images through a unified interface that routes across multiple providers. Search curated prompts, enhance ideas into production-ready descriptions, and manage local ComfyUI workflows — all from a single MCP server.

## Quick Start

Add the MCP server to your mcporter config (`~/.config/mcporter/config.json`):

```json
{
  "mcpServers": {
    "creative-toolkit": {
      "command": "npx",
      "args": ["-y", "meigen@1.2.5"]
    }
  }
}
```

Free tools (search, enhance, inspire) work immediately — no API key needed:

```bash
mcporter call creative-toolkit.search_gallery query="cyberpunk"
mcporter call creative-toolkit.enhance_prompt brief="a cat in space" style="realistic"
```
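
If you script these calls rather than typing them, the `key=value` arguments have to be quoted correctly. The following is a minimal sketch that assembles the same command shown above. The helper name `build_mcporter_call` is hypothetical, and the sketch only builds the argument list; it does not run `mcporter`.

```python
import shlex


def build_mcporter_call(server: str, tool: str, **params: str) -> list[str]:
    """Return the argv for `mcporter call <server>.<tool> key=value ...`."""
    argv = ["mcporter", "call", f"{server}.{tool}"]
    # Each parameter becomes one key=value argument, in the order given.
    argv += [f"{key}={value}" for key, value in params.items()]
    return argv


argv = build_mcporter_call(
    "creative-toolkit", "enhance_prompt", brief="a cat in space", style="realistic"
)
# shlex.join quotes arguments that contain spaces so the line is shell-safe.
command_line = shlex.join(argv)
print(command_line)
```

Passing `argv` as a list to `subprocess.run` would avoid shell quoting altogether; `command_line` is only useful for logging or pasting into a terminal.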

To unlock image generation, configure **one** of these providers:

| Provider | Config | What you need |
|----------|--------|---------------|
| **MeiGen Cloud** | `MEIGEN_API_TOKEN` | Token from [meigen.ai](https://www.meigen.ai) (avatar → Settings → API Keys) |
| **Local ComfyUI** | `comfyuiUrl` | A running ComfyUI instance — no external API needed |
| **Any OpenAI-compatible API** | `openaiApiKey` + `openaiBaseUrl` + `openaiModel` | Your own key from Together AI, Fireworks AI, etc. |

Set credentials in `~/.clawdbot/.env`, `~/.config/meigen/config.json`, or add an `"env"` block to the mcporter config above. See `references/providers.md` for details.
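
As an illustration of the `"env"` option, the sketch below adds an `"env"` block to the Quick Start config and prints the resulting JSON. It assumes the block belongs inside the `creative-toolkit` server entry, which is the usual MCP client convention but is not confirmed here; `with_env` is a hypothetical helper and the token value is a placeholder.

```python
import json


def with_env(config: dict, server: str, env: dict) -> dict:
    """Return a copy of the config with an "env" block on one server entry."""
    # Copy each server entry so the original config is left untouched.
    servers = {name: dict(entry) for name, entry in config["mcpServers"].items()}
    servers[server]["env"] = dict(env)  # assumed location of the "env" block
    return {**config, "mcpServers": servers}


base = {
    "mcpServers": {
        "creative-toolkit": {"command": "npx", "args": ["-y", "meigen@1.2.5"]}
    }
}
# Placeholder value: substitute your own token.
updated = with_env(base, "creative-toolkit", {"MEIGEN_API_TOKEN": "<your-token>"})
print(json.dumps(updated, indent=2))
```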

## Available Tools

### Free — no API key required

| Tool | What it does |
|------|-------------|
| `search_gallery` | Semantic search across 1,300+ AI image prompts. Supports category filtering and curated browsing. Returns prompt text, thumbnails, and metadata. |
| `get_inspiration` | Get the full prompt and high-res images for any gallery entry. Use after `search_gallery` to get copyable prompts. |
| `enhance_prompt` | Expand a brief idea into a detailed, style-aware prompt with lighting, composition, and material directions. Supports realistic, anime, and illustration styles. |
| `list_models` | List all available models across configured providers with capabilities and supported features. |

### Requires configured provider

| Tool | What it does |
|------|-------------|
| `generate_image` | Generate an image from a text prompt. Routes to the best available provider. Supports aspect ratio, seed, and reference images. |
| `upload_reference_image` | Compress a local image (max 2MB, 2048px) and upload to temporary storage (expires in 24 hours) for use as a style reference. Call this MCP tool directly — do NOT construct upload HTTP requests manually. ComfyUI users can skip this — pass local file paths directly to `generate_image`. |
| `comfyui_workflow` | List, view, import, modify, and delete ComfyUI workflow templates. Adjust steps, CFG scale, sampler, and checkpoint without editing JSON. |
| `manage_preferences` | Save and load user preferences (default style, aspect ratio, style notes, favorite prompts). |

## Important Rules

### Never describe generated images

You **cannot see** generated images. After generation, only present the **exact** data from the tool response:

```
**Direction 1: Modern Minimal**
- Image URL: https://images.meigen.art/...
- Saved to: ~/Pictures/meigen/2026-02-08_xxxx.jpg
```

Do NOT write creative commentary about what the image "looks like".
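
One way to follow this rule is to render the reply purely from the tool response. The sketch below is illustrative only: the key names `imageUrl` and `savedPath` are hypothetical, since the response schema is not documented here, and the example values are placeholders.

```python
def format_result(title: str, response: dict) -> str:
    """Render only fields present in the tool response; add no commentary."""
    lines = [f"**{title}**"]
    # "imageUrl" and "savedPath" are hypothetical key names for illustration.
    if "imageUrl" in response:
        lines.append(f"- Image URL: {response['imageUrl']}")
    if "savedPath" in response:
        lines.append(f"- Saved to: {response['savedPath']}")
    return "\n".join(lines)


response = {
    "imageUrl": "https://example.com/image.jpg",
    "savedPath": "~/Pictures/meigen/example.jpg",
}
text = format_result("Direction 1: Modern Minimal", response)
print(text)
```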

### Never specify model or provider

Do NOT pass `model` or `provider` to `generate_image` unless the user explicitly asks. The server auto-selects the best available provider and model.
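
A small sketch of this rule: build the `generate_image` arguments so that `model` and `provider` appear only when the user supplied them. The function name `build_generate_args` and the model name in the example are hypothetical; `prompt`, `aspectRatio`, `model`, and `provider` are the parameter names used in this document.

```python
from typing import Optional


def build_generate_args(
    prompt: str,
    aspect_ratio: Optional[str] = None,
    model: Optional[str] = None,
    provider: Optional[str] = None,
) -> dict:
    """Build generate_image arguments, omitting anything the user did not ask for."""
    args = {"prompt": prompt}
    if aspect_ratio is not None:
        args["aspectRatio"] = aspect_ratio
    # Only forward model/provider when the user explicitly requested them;
    # otherwise the server picks the provider and model itself.
    if model is not None:
        args["model"] = model
    if provider is not None:
        args["provider"] = provider
    return args


default_args = build_generate_args("a red fox in snow", aspect_ratio="16:9")
explicit_args = build_generate_args("a red fox in snow", model="example-model")
```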

### Always confirm before generating multiple images

When the user wants multiple variations, present options first and ask which direction(s) to try. Include an "all of the above" option. Never auto-generate all variants without user confirmation.

---

## Workflow Modes

### Mode 1: Single Image

User wants one image. Write a prompt (or call `enhance_prompt` if the description is brief), generate, present URL + path.

### Mode 2: Prompt Enhancement + Generation

For brief ideas (under ~30 words, lacking visual details), enhance first:

```
1. enhance_prompt brief="futuristic city" style="realistic"
   -> Returns detailed prompt with camera lens, lighting, atmospheric effects

2. generate_image prompt="<enhanced prompt>" aspectRatio="16:9"
```
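
The "brief idea" test can be approximated in code. This sketch checks only the word count (the ~30-word threshold above); whether a description lacks visual detail still needs judgment. `is_brief` and `plan_calls` are hypothetical helpers, and `<enhanced prompt>` stands for whatever `enhance_prompt` returns.

```python
def is_brief(description: str, max_words: int = 30) -> bool:
    """True when the description has fewer than max_words whitespace-separated words."""
    return len(description.split()) < max_words


def plan_calls(description: str, style: str = "realistic") -> list[tuple[str, dict]]:
    """Return the tool calls to make, in order, as (tool name, arguments) pairs."""
    if is_brief(description):
        return [
            ("enhance_prompt", {"brief": description, "style": style}),
            # Placeholder: the real prompt is the output of enhance_prompt.
            ("generate_image", {"prompt": "<enhanced prompt>"}),
        ]
    return [("generate_image", {"prompt": description})]


steps = plan_calls("futuristic city")
```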

### Mode 3: Parallel Generation (2+ images)

User needs multiple variations — different directions, styles, or concepts.

1. Plan directions, present as a table
2. Ask user which direction(s) to try
3. Write distinct prompts for each — don't just tweak one word
4. Generate selected directions (max 4 parallel for API providers, 1 at a time for ComfyUI)
5. Present URLs + paths
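
The concurrency limit in step 4 can be sketched with a bounded thread pool. This is an illustration under assumptions, not the server's implementation: `generate` stands in for whatever function invokes `generate_image`, `fake_generate` is a stub, and the `"comfyui"` / `"api"` labels are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def max_parallel(provider_kind: str) -> int:
    """At most 4 concurrent requests for API providers, 1 for ComfyUI."""
    return 1 if provider_kind == "comfyui" else 4


def generate_all(
    prompts: list[str], generate: Callable[[str], str], provider_kind: str
) -> list[str]:
    """Run generate() for every prompt without exceeding the provider's limit."""
    with ThreadPoolExecutor(max_workers=max_parallel(provider_kind)) as pool:
        # pool.map returns results in the same order as the prompts.
        return list(pool.map(generate, prompts))


def fake_generate(prompt: str) -> str:
    """Stub standing in for a real generate_image call."""
    return f"generated: {prompt}"


results = generate_all(
    ["minimal logo", "retro logo", "hand-drawn logo"], fake_generate, "api"
)
```

Run this only after the user has chosen which directions to generate.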

### Mode 4: Multi-Step Creative (base + extensions)

User wants a base design plus derivatives (e.g., "design a logo and make mockups").

1. Plan 3-5 directions, ask user which to try
2. Generate selected direction(s)
3. Present results, ask user to approve or try another
4. Plan extensions using the approved Image URL as `referenceImages`
5. Generate extensions

Never jump from plan to generating everything at once.
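
Step 4 can be sketched as follows: extensions are planned only once an approved image URL exists, and each one carries that URL in `referenceImages`. `plan_extensions` is a hypothetical helper and the URL is a placeholder.

```python
def plan_extensions(approved_url: str, extension_prompts: list[str]) -> list[dict]:
    """Build generate_image arguments for each extension of an approved base image."""
    if not approved_url:
        # No approved base image yet: go back and ask the user first.
        raise ValueError("an approved image URL is required before planning extensions")
    return [
        {"prompt": prompt, "referenceImages": [approved_url]}
        for prompt in extension_prompts
    ]


extensions = plan_extensions(
    "https://example.com/approved-logo.png",
    ["coffee mug mockup with this logo", "business card mockup with this logo"],
)
```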

### Mode 5: Edit/Modify Existing Image

User provides an image and asks for changes (add text, change background, etc.).

- Upload the reference image (if local), then generate with a **short, literal prompt** describing ONLY the edit
- The reference image carries all visual context — do NOT re-describe the original image
- Example prompt: "Add the text 'meigen.ai' at the bottom of this image"

### Mode 6: Inspiration Search

```
1. search_gallery query="dreamy portrait with soft light"
   -> Finds semantically similar prompts with thumbnails

2. get_inspiration id="<entry_id>"
   -> Get full prompt text — copy and modify for your own generation
```

### Mode 7: Reference Image Generation

Use an existing image to guide visual style. You MUST call the MCP tool directly — do NOT construct HTTP requests manually.

```
1. upload_reference_image filePath="~/Desktop/my-logo.png"
   -> Compresses and returns a temporary URL (expires in 24 hours)

2. generate_image prompt="coffee mug mockup with this logo" referenceImages=["<url>"]
```

Reference image sources: gallery URLs, previous generation URLs, `upload_reference_image` for local files. ComfyUI users can pass local file paths directly — no upload needed.

**Important**: If `upload_reference_image` fails or is unavailable, do NOT attempt to replicate its behavior by calling HTTP endpoints yourself. Instead, tell the user to upload the image manually at [meigen.ai](https://www.meigen.ai) and provide the URL, or use the image URL directly if it's already online.
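
The fallback described above might look like this in code. It is a sketch for API providers, assuming `upload` is a callable that wraps the `upload_reference_image` tool and raises on failure; the `askUser` key and both helper names are hypothetical. Note that the failure path returns a message for the user and never attempts its own HTTP upload.

```python
from typing import Callable

MANUAL_UPLOAD_MESSAGE = (
    "Upload failed. Please upload the image manually at https://www.meigen.ai "
    "and share the resulting URL."
)


def resolve_reference(path_or_url: str, upload: Callable[[str], str]) -> dict:
    """Return referenceImages for generate_image, or a message to show the user."""
    # Images that are already online can be used directly.
    if path_or_url.startswith(("http://", "https://")):
        return {"referenceImages": [path_or_url]}
    try:
        url = upload(path_or_url)
    except Exception:
        # Do not replicate the upload over HTTP; hand the problem to the user.
        return {"askUser": MANUAL_UPLOAD_MESSAGE}
    return {"referenceImages": [url]}


def failing_upload(path: str) -> str:
    """Stub that simulates upload_reference_image being unavailable."""
    raise RuntimeError("upload_reference_image unavailable")


fallback = resolve_reference("~/Desktop/my-logo.png", failing_upload)
```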

### Mode 8: ComfyUI Workflows

```
1. comfyui_workflow action="list"           -> See saved workflows
2. comfyui_workflow action="view" name="txt2img"  -> See adjustable parameters
3. comfyui_workflow action="modify" name="txt2img" modifications={"steps": 30}
4. generate_image prompt="..." workflow="txt2img"  -> Generate
```

## Alternative Providers

You can use your own OpenAI-compatible API or a local ComfyUI instance instead of — or alongside — the default MeiGen provider. See `references/providers.md` for detailed configuration, model pricing, and provider comparison.

## Troubleshooting

See `references/troubleshooting.md` for common issues, solutions, and security & privacy details.

Related Skills

openai-image-gen

1864
from LeoYeAI/openclaw-master-skills

Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.

media-generation

1864
from LeoYeAI/openclaw-master-skills

Generate images, edit existing images, create short videos, run inpainting/outpainting and object-focused edits, use reference images as provider inputs, batch related media jobs from a manifest, and fetch returned media from URLs/HTML/JSON/data URLs/base64. Use when working on AI image generation, AI image editing, mask-based inpainting, outpainting, reference-image workflows, short AI video generation, product-shot variations, or reusable media-production pipelines.

main-image-editor

1864
from LeoYeAI/openclaw-master-skills

Orchestrate screenshot + Chinese instruction into PSD batch edits with transaction rollback by reusing psd-automator.

image-generate

1864
from LeoYeAI/openclaw-master-skills

Generate images with the built-in image_generate.py script. Prepare a clear, specific `prompt`.

youtube-watcher

1864
from LeoYeAI/openclaw-master-skills

Fetch and read transcripts from YouTube videos. Use when you need to summarize a video, answer questions about its content, or extract information from it.

youtube-transcript

1864
from LeoYeAI/openclaw-master-skills

Fetch and summarize YouTube video transcripts. Use when asked to summarize, transcribe, or extract content from YouTube videos. Handles transcript fetching via residential IP proxy to bypass YouTube's cloud IP blocks.

youtube-auto-captions - YouTube Auto Captions

1864
from LeoYeAI/openclaw-master-skills

## Description

youtube

1864
from LeoYeAI/openclaw-master-skills

YouTube Data API integration with managed OAuth. Search videos, manage playlists, access channel data, and interact with comments. Use this skill when users want to interact with YouTube. For other third party apps, use the api-gateway skill (https://clawhub.ai/byungkyu/api-gateway).

yahoo-finance

1864
from LeoYeAI/openclaw-master-skills

Get stock prices, quotes, fundamentals, earnings, options, dividends, and analyst ratings using Yahoo Finance. Uses yfinance library - no API key required.

xurl

1864
from LeoYeAI/openclaw-master-skills

A Twitter research and content intelligence skill focused on attracting WordPress and Shopify clients. Use to analyze Twitter profiles, threads, and conversations for: (1) Identifying what small agency founders and eCommerce brands are discussing; (2) Understanding pain points around WordPress performance, Shopify CRO, and development bottlenecks; (3) Extracting high-performing content angles; (4) Turning insights into authority-building posts; (5) Converting Twitter intelligence into business leverage for clear content angles, strong positioning, and qualified inbound leads.

xlsx

1864
from LeoYeAI/openclaw-master-skills

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

xiaohongshu-mcp

1864
from LeoYeAI/openclaw-master-skills

Automate Xiaohongshu (RedNote) content operations using a Python client for the xiaohongshu-mcp server. Use for: (1) Publishing image, text, and video content, (2) Searching for notes and trends, (3) Analyzing post details and comments, (4) Managing user profiles and content feeds. Triggers: xiaohongshu automation, rednote content, publish to xiaohongshu, xiaohongshu search, social media management.