Image Generation
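AI image generation and editing, built on Nano Banana (Gemini Image), covering text-to-image, image-to-image, and image editing. Suited to creative design, marketing assets, social media content, presentation illustrations, and similar scenarios. Supports multiple styles, high-resolution output (up to 4K), text rendering, and character consistency.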
Best use case
Image Generation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using Image Generation should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/image_generation/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How Image Generation Compares
| Feature / Agent | Image Generation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
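It provides AI image generation and editing, built on Nano Banana (Gemini Image), covering text-to-image, image-to-image, and image editing. It is suited to creative design, marketing assets, social media content, presentation illustrations, and similar scenarios, and supports multiple styles, high-resolution output (up to 4K), text rendering, and character consistency.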
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
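## Capability Overview
The AI image generation capability lets you:
- **Text-to-image**: generate an image from a text description
- **Image-to-image**: generate a new image based on a reference image
- **Image editing**: modify specific parts of an existing image
- **Style transfer**: change the style of an image (realistic, anime, oil painting, etc.)
- **Text rendering**: generate clear, legible text inside an image
Under the hood it is based on Google Gemini's Nano Banana / Nano Banana Pro models.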
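## Workflow
### Phase 1: Understand the Request
1. Understand the user's image requirements (subject, style, intended use)
2. Confirm the output format (dimensions, resolution, number of images)
3. If there is a reference image, confirm the editing intent
### Phase 2: Build the Prompt
1. Turn the user's intent into an English prompt (this gives better results)
2. Follow the prompt formula: `<subject> <action> <scene> <style> <quality>`
3. Add any necessary descriptive detail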
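The prompt formula in Phase 2 can be sketched as a small helper. This is only an illustration: `build_prompt` and its default quality string are hypothetical names chosen here, not part of the skill or of any Gemini API.

```python
def build_prompt(subject, action, scene, style, quality="high quality, detailed"):
    """Join the five formula slots into one comma-separated English prompt.

    Hypothetical helper: the skill only defines the slot order, not this function.
    """
    # Subject and action read as one phrase; the remaining slots are separate clauses.
    parts = [f"{subject} {action}".strip(), scene, style, quality]
    # Skip empty slots so a missing value does not leave a dangling comma.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="A majestic horse",
    action="galloping",
    scene="through cherry blossoms at golden hour",
    style="photorealistic",
)
print(prompt)
# A majestic horse galloping, through cherry blossoms at golden hour, photorealistic, high quality, detailed
```

Keeping the slots as separate arguments makes it easy to vary one of them (for example the style) while holding the rest of the prompt fixed.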
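### Phase 3: Generate the Image
1. Call the `generate_image` tool
2. If editing is needed, call the `edit_image` tool
3. Generate multiple candidates (if the user wants options to choose from)
### Phase 4: Deliver
1. Present the generated results
2. Ask whether any adjustments are needed
3. Save to the location the user specifies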
## Tool Usage
### generate_image
- **Purpose**: generate an image from a text description
- **Parameters**:
  - `prompt`: image description (English works best)
  - `style`: style preset (realistic, anime, oil_painting, watercolor, minimal, cinematic)
  - `aspect_ratio`: aspect ratio (1:1, 16:9, 9:16, 4:3, 3:4)
  - `resolution`: resolution (1K, 2K, 4K)
  - `num_images`: number of images to generate (1-4)
- **Example**:
```python
generate_image(
    prompt="A majestic horse galloping through cherry blossoms, golden hour lighting, Chinese New Year festive atmosphere",
    style="realistic",
    aspect_ratio="16:9",
    resolution="2K",
    num_images=2,
)
```
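Because `generate_image` accepts only a fixed set of values for most parameters, it can help to check arguments before making the call. The sketch below assumes the allowed values are exactly those in the parameter list above; `validate_generate_args` is a hypothetical helper, not something the skill provides.

```python
# Allowed values copied from the generate_image parameter list above.
ALLOWED_STYLES = {"realistic", "anime", "oil_painting", "watercolor", "minimal", "cinematic"}
ALLOWED_ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}
ALLOWED_RESOLUTIONS = {"1K", "2K", "4K"}


def validate_generate_args(prompt, style, aspect_ratio, resolution, num_images):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if not prompt or not prompt.strip():
        problems.append("prompt must not be empty")
    if style not in ALLOWED_STYLES:
        problems.append(f"unsupported style: {style!r}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution!r}")
    if not isinstance(num_images, int) or not 1 <= num_images <= 4:
        problems.append("num_images must be an integer from 1 to 4")
    return problems


ok = validate_generate_args("A horse in a meadow", "realistic", "16:9", "2K", 2)
bad = validate_generate_args("", "sketch", "5:4", "8K", 5)
```

Returning a list of problems instead of raising on the first one lets the agent report every invalid argument to the user in a single message.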
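### edit_image
- **Purpose**: edit an existing image
- **Parameters**:
  - `image_path`: path or URL of the original image
  - `prompt`: editing instruction (e.g. "Change the background to a night scene")
  - `preserve_subject`: whether to keep the main subject unchanged (default True)
- **Example**: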
```python
edit_image(
    image_path="/workspace/photo.jpg",
    prompt="Add Chinese New Year decorations and red lanterns to the background",
    preserve_subject=True,
)
```
## Prompt Best Practices
### Basic Formula
```
[subject] + [action/pose] + [scene/background] + [style] + [mood/lighting]
```
### Style Keywords
- **Realistic**: photorealistic, hyperrealistic, 8K, detailed
- **Anime**: anime style, Ghibli style, cel shading
- **Oil painting**: oil painting style, impressionist, Van Gogh style
- **Minimal**: minimal, flat design, vector art
- **Cinematic**: cinematic, dramatic lighting, movie poster style
### Quality Boosters
- `high quality`, `detailed`, `sharp focus`
- `professional photography`, `award winning`
- `4K resolution`, `ultra detailed`
### Things to Avoid
- ❌ Avoid vague descriptions: "a nice-looking picture"
- ❌ Avoid contradictory descriptions: "a cartoon in a realistic style"
- ❌ Avoid sensitive content
- ✅ Be specific, clear, and layered
## Scenario Templates
### Scenario 1: WeChat Red Envelope Cover / Holiday Greeting Image
```yaml
prompt_template: |
  A {animal} {pose}, surrounded by {decorations},
  Chinese New Year theme, festive red and gold colors,
  {style} style, high quality, {text_content}
variables:
  animal: "majestic horse"  # Year of the Horse
  pose: "running gracefully"
  decorations: "cherry blossoms, red lanterns, gold coins"
  style: "elegant illustration"
  text_content: "with Chinese text '恭喜发财' in golden calligraphy"
```
### Scenario 2: Presentation Illustrations
```yaml
prompt_template: |
  {concept} visualization, professional infographic style,
  clean white background, modern corporate aesthetic,
  subtle gradients, minimalist design
variables:
  concept: "AI workflow automation"
```
### Scenario 3: Social Media Content
```yaml
prompt_template: |
  {subject} {action}, {platform} optimized aspect ratio,
  vibrant colors, eye-catching composition,
  trending aesthetic, shareable content style
variables:
  subject: "coffee cup"
  action: "with steam rising"
  platform: "Instagram"  # 1:1 or 4:5
```
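The templates above use `{name}` placeholders, which happen to match Python's `str.format` syntax. A minimal sketch of filling one in follows, assuming the template and variables have already been loaded into plain Python values; `render_prompt` is a hypothetical helper, and the skill does not say how it substitutes variables internally.

```python
# Scenario 3 template, written as one string for illustration.
TEMPLATE = (
    "{subject} {action}, {platform} optimized aspect ratio, "
    "vibrant colors, eye-catching composition, "
    "trending aesthetic, shareable content style"
)

VARIABLES = {
    "subject": "coffee cup",
    "action": "with steam rising",
    "platform": "Instagram",
}


def render_prompt(template, variables):
    """Substitute {name} placeholders; a missing variable raises KeyError."""
    return template.format(**variables)


rendered = render_prompt(TEMPLATE, VARIABLES)
print(rendered)
```

Letting a missing variable raise `KeyError` is deliberate in this sketch: a prompt with an unfilled slot would otherwise be sent to the model with a literal `{platform}` in it.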
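## Output Format
### Presenting the Results
```markdown
## 🎨 Image Generation Complete
**Prompt**: [the English prompt used]
**Parameters**:
- Style: [style]
- Aspect ratio: [aspect_ratio]
- Resolution: [resolution]
**Results**:

**Next steps**:
- [ ] Satisfied; save to the specified location
- [ ] Adjust the style or colors
- [ ] Modify a specific part
- [ ] Regenerate
```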
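The result report can also be assembled programmatically. The sketch below produces an English rendering of the template above; `format_result` is a hypothetical helper, and the Markdown image lines under the results heading are an assumption, since the template leaves that slot empty.

```python
def format_result(prompt, style, aspect_ratio, resolution, image_paths):
    """Build the Markdown result report as a single string."""
    lines = [
        "## 🎨 Image Generation Complete",
        f"**Prompt**: {prompt}",
        "**Parameters**:",
        f"- Style: {style}",
        f"- Aspect ratio: {aspect_ratio}",
        f"- Resolution: {resolution}",
        "**Results**:",
    ]
    # Assumption: each candidate is shown as a Markdown image link.
    lines += [f"![Candidate {i}]({path})" for i, path in enumerate(image_paths, start=1)]
    lines += [
        "**Next steps**:",
        "- [ ] Satisfied; save to the specified location",
        "- [ ] Adjust the style or colors",
        "- [ ] Modify a specific part",
        "- [ ] Regenerate",
    ]
    return "\n".join(lines)


report = format_result(
    prompt="A majestic horse galloping through cherry blossoms",
    style="realistic",
    aspect_ratio="16:9",
    resolution="2K",
    image_paths=["out/horse_1.png", "out/horse_2.png"],  # hypothetical paths
)
print(report)
```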
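## Notes
1. **Copyright compliance**: generated images carry a SynthID watermark
2. **Content policy**: follow Google's usage policies and do not generate sensitive content
3. **Commercial use**: commercial use is supported (marketing, products)
4. **Text rendering**: Nano Banana Pro supports multilingual text, but the quality of Chinese text should be verified
5. **Character consistency**: keeping a character's features consistent across images requires the reference-image feature
## Resources
- `resources/prompt_templates.yaml` - preset prompt templates
- `resources/style_presets.md` - detailed guide to the style presets
- `resources/chinese_new_year_2026.md` - Year of the Horse templates
Related Skills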
generational-agent-succession
Parallel agent swarms with generational succession. Combines agent-architect's multi-agent parallelism with automatic succession when agents degrade. Each parallel agent gets fresh context through controlled handoffs while maintaining accumulated wisdom.
all-images-ai-automation
Automate All Images AI tasks via Rube MCP (Composio). Always search tools first for current schemas.
ai-image-generator
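Generate AI images using platforms such as ModelScope. Use this skill when the user needs to generate images, design icons, create character portraits, or wants help writing AI art prompts. Supports two modes: direct image generation and prompt optimization only.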
xhs-images
Xiaohongshu (Little Red Book) infographic series generator with multiple style options. Breaks down content into 1-10 cartoon-style infographics. Use when user asks to create "小红书图片", "XHS images", or "RedNote infographics".
x-image-cards
Create X/Twitter cards that look like images, not marketing banners. Use when asked to "create OG images", "set up X cards", "make social cards", or "twitter card without text".
wiro-image-fill
Generate missing or placeholder images in a project by calling the Wiro image generation API, saving assets under public/assets generated folders, and producing a JSON mapping. Use when you see empty img src, placeholder.png, or other image gaps that need real assets.
seedream-image-generator
Generate images using the Doubao SeeDream API based on text prompts. Use this skill when users request AI-generated images, artwork, illustrations, or visual content creation. The skill handles API calls, downloads generated images to the project's /pic folder, and supports batch generation of up to 4 sequential images.
placeholder-images
Rule to use placekitten.com for placeholder images in seed data.
og-image-generator
Generate and optimize Open Graph meta images for social media sharing. Use this skill when building web applications that need dynamic OG image generation with support for Vercel's @vercel/og library, pre-generated image storage, and social media optimization (Twitter Cards, Facebook, LinkedIn). Handles dynamic routes, performance optimization, and includes best practices for crawler compatibility and testing.
nanobanana-image
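A skill for generating and editing images with Nano Banana (Google Gemini API). Use it when the user asks, in Japanese or English, to "generate an image", "make an illustration", "draw a picture of X", "create a picture", "edit this image", or "make X based on this image". Supports generation from text, generation from reference images, image editing, and image generation grounded in Google Search to reflect up-to-date information. Also handles requests such as "the latest X", "reflect current trends", or "real-time information".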
nano-image-generate
Generate images using Nano Banana (Flash) or Nano Banana Pro. Use 'flash' for speed/efficiency and 'pro' for high quality, text rendering, and complex prompt adherence. Triggers include 'generate image', 'create logo', 'fast image', 'high quality image'.
media-generation
Generate images, videos, and audio using Google's Gemini APIs. Use for image generation/editing (Gemini 3 Pro Image), video generation (Veo 3), and speech (TBD). Trigger words - images: generate, create, draw, design, make, edit, modify image/picture. Video: generate video, create video, animate, make a video. Supports text-to-image, image-to-image editing, text-to-video, and image-to-video.