nano-banana
AI image generation using Nano Banana PRO (Gemini 3 Pro Image) and Nano Banana (Gemini 2.5 Flash Image). Use this skill when: (1) Generating images from text prompts, (2) Editing existing images, (3) Creating professional visual assets like infographics, logos, product shots, stickers, (4) Working with character consistency across multiple images, (5) Creating images with accurate text rendering, (6) Any task requiring AI-generated visuals. Triggers on: 'generate image', 'create image', 'make a picture', 'design a logo', 'create infographic', 'AI image', 'nano banana', or any image generation request.
Best use case
nano-banana is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using nano-banana should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/nano-banana/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How nano-banana Compares
| Feature / Agent | nano-banana | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It generates and edits images with Nano Banana PRO (Gemini 3 Pro Image) and Nano Banana (Gemini 2.5 Flash Image) through the Gemini API. Typical tasks are text-to-image generation, editing existing images, professional assets such as infographics, logos, product shots, and stickers, character consistency across multiple images, and images with accurate text rendering.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Nano Banana PRO Image Generation
Generate professional AI images using Google's Nano Banana models via the Gemini API.
## Prerequisites
- API key must be set as `GEMINI_API_KEY` environment variable
- Uses curl for all API calls (no SDK required)
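The two prerequisites can be checked before any request is sent. The following is a minimal sketch, not part of the skill itself; the function name `check_prereqs` is hypothetical.

```bash
# Hypothetical helper: returns 0 when GEMINI_API_KEY is set and curl is on PATH.
# Prints one line to stderr for each missing prerequisite.
check_prereqs() {
  local rc=0
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    rc=1
  fi
  if ! command -v curl >/dev/null 2>&1; then
    echo "curl not found on PATH" >&2
    rc=1
  fi
  return "$rc"
}

# Typical use at the top of a script:
#   check_prereqs || exit 1
```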
## Model Selection
| Model | Identifier | Best For |
|-------|------------|----------|
| **Nano Banana PRO** | `gemini-3-pro-image-preview` | Professional assets, text rendering, infographics, 4K output, complex multi-turn editing |
| **Nano Banana** | `gemini-2.5-flash-image` | Fast generation, simple edits, lower cost |
**Default to PRO** for quality work. Use Flash for rapid iterations or simple tasks.
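One way to encode this choice in a script is a small lookup that defaults to PRO. This is a sketch only: the tier names `pro` and `flash` and the function name `pick_model` are assumptions made for illustration, while the identifiers come from the table above.

```bash
# Hypothetical helper: map a quality tier to a model identifier.
# With no argument it defaults to PRO, matching the guidance above.
pick_model() {
  case "${1:-pro}" in
    pro)   echo "gemini-3-pro-image-preview" ;;
    flash) echo "gemini-2.5-flash-image" ;;
    *)     echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

MODEL=$(pick_model flash)
echo "https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent"
```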
## CRITICAL: Prompt Engineering First
**BEFORE calling the API, always craft an effective prompt.** Read [`references/prompting-guide.md`](references/prompting-guide.md) for comprehensive prompting strategies. Key principles:
### The Golden Rules
1. **Describe scenes, don't list keywords** - Write narrative descriptions, not tag soup
2. **Use natural language** - Full sentences with proper grammar
3. **Be specific** - Define subject, setting, lighting, mood, materials
4. **Provide context** - The "why" helps the model make better artistic decisions
5. **Edit, don't re-roll** - If 80% correct, ask for specific changes
### The ICS Framework (Quick Reference)
For any image, specify:
- **I**mage type: What kind of visual (photo, infographic, logo, sticker, etc.)
- **C**ontent: Specific elements, data, or information to include
- **S**tyle: Visual style, color palette, artistic approach
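The three ICS parts can be assembled into a full-sentence prompt rather than a keyword list. The sketch below is illustrative: the function name `build_ics_prompt` and the sentence template are assumptions, not part of the prompting guide.

```bash
# Hypothetical helper: combine image type, content, and style into one prompt.
# $1 = image type, $2 = content, $3 = style
build_ics_prompt() {
  if [ "$#" -ne 3 ]; then
    echo "usage: build_ics_prompt IMAGE_TYPE CONTENT STYLE" >&2
    return 2
  fi
  printf 'Create %s. It should show %s. Style: %s.\n' "$1" "$2" "$3"
}

build_ics_prompt \
  "a clean, modern infographic" \
  "photosynthesis explained as a recipe, with sunlight, water, and CO2 as ingredients" \
  "a colorful cookbook page for children"
```

A template like this only guarantees that all three parts are present. The wording of each part still matters, so the golden rules above apply to every argument.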
## API Reference
### Text-to-Image Generation
```bash
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [{"text": "YOUR_PROMPT_HERE"}]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "16:9",
"imageSize": "2K"
}
}
}'
```
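Pasting a prompt into the single-quoted JSON above breaks as soon as the prompt contains an apostrophe or a double quote. A possible workaround, assuming `jq` is installed, is to let `jq` build the body so the prompt is escaped correctly. The function name `build_generate_payload` and its defaults are hypothetical.

```bash
# Hypothetical helper: print a generateContent request body for text-to-image.
# $1 = prompt, $2 = aspect ratio (default 16:9), $3 = image size (default 2K)
build_generate_payload() {
  jq -n \
    --arg prompt "$1" \
    --arg ratio "${2:-16:9}" \
    --arg size "${3:-2K}" \
    '{
      contents: [{parts: [{text: $prompt}]}],
      generationConfig: {
        responseModalities: ["TEXT", "IMAGE"],
        imageConfig: {aspectRatio: $ratio, imageSize: $size}
      }
    }'
}

# Usage (sends the body on stdin with curl's "-d @-"):
#   build_generate_payload "YOUR_PROMPT_HERE" 16:9 2K | curl -s -X POST \
#     "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
#     -H "x-goog-api-key: $GEMINI_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @-
```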
### Image Editing (with input image)
```bash
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [
{"text": "YOUR_EDIT_INSTRUCTION"},
{"inline_data": {"mime_type": "image/png", "data": "BASE64_IMAGE_DATA"}}
]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
```
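The `BASE64_IMAGE_DATA` placeholder has to be produced from a file on disk. The sketch below shows one way to do that, assuming `jq` 1.6 or later (for `--rawfile`) is available. All three function names are hypothetical, and the extension-to-MIME mapping covers only a few common formats.

```bash
# Hypothetical helper: guess the MIME type from the file extension.
mime_type_for() {
  case "$1" in
    *.png)        echo "image/png" ;;
    *.jpg|*.jpeg) echo "image/jpeg" ;;
    *.webp)       echo "image/webp" ;;
    *)            echo "unsupported image extension: $1" >&2; return 1 ;;
  esac
}

# Base64-encode a file as a single line. Reading from stdin and stripping
# newlines behaves the same with GNU and BSD/macOS base64.
encode_image() {
  base64 < "$1" | tr -d '\n'
}

# Print the request body for an edit.
# $1 = edit instruction, $2 = path to the input image
build_edit_payload() {
  local mime b64_file rc
  mime=$(mime_type_for "$2") || return 1
  b64_file=$(mktemp) || return 1
  encode_image "$2" > "$b64_file"
  # --rawfile reads the data from a file, so large images do not hit
  # the shell's argument-length limit.
  jq -n --arg text "$1" --arg mime "$mime" --rawfile data "$b64_file" '{
    contents: [{parts: [
      {text: $text},
      {inline_data: {mime_type: $mime, data: $data}}
    ]}],
    generationConfig: {responseModalities: ["TEXT", "IMAGE"]}
  }'
  rc=$?
  rm -f "$b64_file"
  return "$rc"
}

# Usage: build_edit_payload "YOUR_EDIT_INSTRUCTION" input.png > payload.json
# then pass the file to curl with: -d @payload.json
```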
### Configuration Options
| Parameter | Values | Notes |
|-----------|--------|-------|
| `aspectRatio` | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` | Match use case |
| `imageSize` | `1K`, `2K`, `4K` | Use an uppercase K; 4K is available on the PRO model only |
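Checking these values locally avoids a failed API call caused by a typo such as `2k`. This is a minimal sketch based only on the table above; the function names are hypothetical.

```bash
# Hypothetical helper: succeed only for aspect ratios listed in the table.
valid_aspect_ratio() {
  case "$1" in
    1:1|2:3|3:2|3:4|4:3|4:5|5:4|9:16|16:9|21:9) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeed only for sizes listed in the table.
# $1 = image size, $2 = model identifier (4K is accepted for PRO only)
valid_image_size() {
  case "$1" in
    1K|2K) return 0 ;;
    4K)    [ "$2" = "gemini-3-pro-image-preview" ] ;;
    *)     return 1 ;;
  esac
}

if valid_aspect_ratio "16:9" && valid_image_size "4K" "gemini-3-pro-image-preview"; then
  echo "configuration accepted"
fi
```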
### Google Search Grounding (Real-time Data)
Add `"tools": [{"google_search": {}}]` to generate images based on current information:
```bash
curl -s -X POST \
"https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{"parts": [{"text": "Create an infographic of current tech stock prices"}]}],
"tools": [{"google_search": {}}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {"aspectRatio": "16:9"}
}
}'
```
## Workflow
### Step 1: Craft the Prompt
Use the ICS framework and prompting guide. Examples:
**Photorealistic:**
```
A photorealistic close-up portrait of an elderly Japanese ceramicist with deep wrinkles and a warm smile, inspecting a tea bowl. Soft golden hour light from a window. 85mm lens, shallow depth of field. Serene mood.
```
**Infographic:**
```
Create a clean, modern infographic explaining photosynthesis as a recipe. Show "ingredients" (sunlight, water, CO2) and "finished dish" (energy). Style like a colorful kids' cookbook page.
```
**Product Shot:**
```
High-resolution studio photograph of a matte black ceramic coffee mug on polished concrete. Three-point softbox lighting, 45-degree angle, sharp focus on rising steam. Square format.
```
### Step 2: Generate Image
Use `scripts/generate-image.sh` or call the API directly:
```bash
./scripts/generate-image.sh "Your prompt here" output.png --ratio 16:9 --size 2K
```
### Step 3: Process Response
The API returns base64-encoded image data. Extract and decode:
```bash
# Response contains: {"candidates":[{"content":{"parts":[{"inlineData":{"mimeType":"image/png","data":"BASE64..."}}]}}]}
# Save the curl output first (for example, add `-o response.json` to the request),
# then extract with jq and decode:
jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' response.json | base64 -d > image.png
```
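A response may contain no image at all, for example when the request fails or the model answers with text only. The sketch below adds a basic check for that case. The function name `save_first_image` is hypothetical, and it assumes the response was saved to a file and that failures carry an `error.message` field.

```bash
# Hypothetical helper: decode the first image part of a saved response.
# $1 = response JSON file, $2 = output image path
save_first_image() {
  jq -r '[.candidates[0].content.parts[]? | select(.inlineData) | .inlineData.data][0] // empty' "$1" \
    | base64 --decode > "$2"
  # An empty output file means no inlineData part was found.
  if [ ! -s "$2" ]; then
    echo "no image in response: $(jq -r '.error.message // "no inlineData part"' "$1")" >&2
    rm -f "$2"
    return 1
  fi
}

# Usage: save_first_image response.json image.png
```

Taking only the first image part keeps the output file valid when a response contains several images, since concatenated base64 payloads would not decode to a usable file.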
## Common Use Cases
### Landing Pages & Ads
- Use 16:9 or 21:9 for hero images
- Specify brand colors, modern/minimal style
- Include text requirements in prompt
### Logos & Icons
- Use 1:1 aspect ratio
- Request "minimalist", "clean lines", "vector-style"
- Specify color scheme explicitly
### Product Photography
- Describe lighting setup (softbox, natural, studio)
- Mention surface/background materials
- Include camera angle and lens type
### Infographics
- Define data to visualize
- Specify style (corporate, playful, technical)
- Request clear text and labeled sections
### Stickers & Illustrations
- Request "bold outlines", "kawaii", "cel-shading"
- Specify "white background" or "transparent background"
- Define color palette
### Character Consistency (Multiple Images)
- PRO supports up to 14 reference images
- Explicitly state: "Keep facial features exactly the same as Image 1"
- Describe expression/pose changes while maintaining identity
## Scripts
See [`scripts/generate-image.sh`](scripts/generate-image.sh) for a ready-to-use generation script.
## Detailed Prompting Guide
For advanced techniques including:
- Photorealistic scene templates
- Text rendering best practices
- Sequential art and storyboarding
- Dimensional translation (2D↔3D)
- Search grounding for real-time data
Read [`references/prompting-guide.md`](references/prompting-guide.md).
Related Skills
nanobanana-ppt-skills
AI-powered PPT generation with document analysis and styled images
nanobanana-image
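A skill for generating and editing images with Nano Banana (Google Gemini API). Use it for requests such as "generate an image", "make an illustration", "draw a picture of X", "create an image", "edit this image", "make X based on this image", "create a picture", and similar. Supports generation from text, generation from reference images, image editing, and generation that reflects current information through Google Search grounding. Also handles requests such as "the latest X", "reflect current trends", and "real-time information".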
nano-image-generate
Generate images using Nano Banana (Flash) or Nano Banana Pro. Use 'flash' for speed/efficiency and 'pro' for high quality, text rendering, and complex prompt adherence. Triggers include 'generate image', 'create logo', 'fast image', 'high quality image'.
nano-banana-skill
Generates, edits, and restores images using Google Gemini image models (Nano Banana). Use when the user wants to create images from text prompts, edit existing images with natural language, restore or enhance photos, or generate icons, patterns, diagrams, or visual content. Requires a GEMINI_API_KEY environment variable.
bgo
Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.
ui-ux-pro-max
UI/UX design intelligence. 50 styles, 21 palettes, 50 font pairings, 20 charts, 9 stacks.
ui-ux-principles
Apply core UI/UX design principles for intuitive, beautiful interfaces. Covers visual hierarchy, color theory, typography, spacing systems, Gestalt principles, usability heuristics, and user-centered design. Use for design decisions, layout planning, and creating polished user experiences.
UI/UX Intelligence Expert
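UI/UX design intelligence library and recommendation expert. Includes 67 styles, 96 color palettes, 57 font pairings, and 99 UX guidelines, and supports design system generation across technology stacks.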
ui ux
Searchable database of UI styles, color palettes, font pairings, chart types, product recommendations, UX guidelines, and stack-specific best practices.
ui-ux-improve
Research UI/UX improvements with trend analysis and generate actionable recommendations. Use when you need comprehensive UI/UX analysis and improvement suggestions.
ui-ux-designer
Create interface designs, wireframes, and design systems. Masters user research, accessibility standards, and modern design tools.
ui-ux-design-system
Expert in building premium, accessible UI/UX design systems for SaaS apps. Covers design tokens, component architecture with shadcn/ui and Radix, dark mode, glassmorphism, micro-animations, responsive layouts, and accessibility. Use when: ui, ux, design system, shadcn, radix, tailwind, dark mode, animation, accessibility, components, figma to code.