nano-banana-pro

Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on Nano Banana Pro, Gemini 3 Pro Image, gemini-3-pro-image-preview, Google image generation.

152 stars

Best use case

nano-banana-pro is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on Nano Banana Pro, Gemini 3 Pro Image, gemini-3-pro-image-preview, Google image generation.

Teams using nano-banana-pro should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/nano-banana-pro/SKILL.md --create-dirs "https://raw.githubusercontent.com/hoodini/ai-agents-skills/main/skills/nano-banana-pro/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/nano-banana-pro/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How nano-banana-pro Compares

Feature / Agentnano-banana-proStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Generate images with Google's Nano Banana Pro (Gemini 3 Pro Image). Use when generating AI images via Gemini API, creating professional visuals, or building image generation features. Triggers on Nano Banana Pro, Gemini 3 Pro Image, gemini-3-pro-image-preview, Google image generation.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Nano Banana Pro (Gemini 3 Pro Image)

Generate high-quality images with Google's Gemini 3 Pro Image API.

## Overview

**Nano Banana Pro** is the marketing name for **Gemini 3 Pro Image** (`gemini-3-pro-image-preview`), Google's state-of-the-art image generation and editing model built on Gemini 3 Pro.

## Quick Start

### Get API Key
1. Go to [Google AI Studio](https://aistudio.google.com)
2. Click "Get API Key"
3. Store securely as environment variable

### Basic Image Generation (Python)
```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="A serene Japanese garden with cherry blossoms and a koi pond",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)

# Process response
for part in response.candidates[0].content.parts:
    if hasattr(part, 'text'):
        print(f"Description: {part.text}")
    elif hasattr(part, 'inline_data'):
        # Save image
        image_data = part.inline_data.data  # Base64 encoded
        mime_type = part.inline_data.mime_type  # image/png
        
        import base64
        with open("output.png", "wb") as f:
            f.write(base64.b64decode(image_data))
```

### REST API (cURL)
```bash
curl -s -X POST \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{
      "role": "user",
      "parts": [{"text": "Create a vibrant infographic about photosynthesis"}]
    }],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
```

### TypeScript/JavaScript
```typescript
const GEMINI_API_KEY = process.env.GEMINI_API_KEY;

async function generateImage(prompt: string) {
  const response = await fetch(
    'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
    {
      method: 'POST',
      headers: {
        'x-goog-api-key': GEMINI_API_KEY!,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        contents: [{ 
          role: 'user', 
          parts: [{ text: prompt }] 
        }],
        generationConfig: {
          responseModalities: ['TEXT', 'IMAGE'],
        },
      }),
    }
  );

  const data = await response.json();
  return data;
}
```

## Configuration Options

### Image Configuration
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Professional product photo of a coffee mug",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # Options: 1:1, 3:2, 16:9, 9:16, 21:9
            image_size="2K"       # Options: 1K, 2K, 4K
        )
    )
)
```

### With Google Search Grounding
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Create an infographic showing today's stock market trends",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]  # Enable search grounding
    )
)
```

## Multi-Turn Conversations (Iterative Editing)

```python
# Create a chat session
chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

# Initial generation
response1 = chat.send_message(
    "Create a vibrant infographic explaining photosynthesis"
)

# Edit the image
response2 = chat.send_message(
    "Update this infographic to be in Spanish. Keep all other elements the same."
)
```

## Key Capabilities

### 1. Superior Text Rendering
```python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="""Create a professional poster with:
    - Title: "Annual Tech Summit 2025"
    - Date: March 15-17, 2025
    - Location: San Francisco Convention Center
    """,
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

### 2. Character Consistency (Up to 5 Subjects)
```python
import base64

def load_image(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

character_ref = load_image("character.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        {"text": "Generate an image of this person at a tech conference"},
        {"inline_data": {"mime_type": "image/png", "data": character_ref}}
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE']
    )
)
```

## Next.js API Route

```typescript
// app/api/generate-image/route.ts
import { NextRequest, NextResponse } from 'next/server';

export async function POST(request: NextRequest) {
  const { prompt, aspectRatio = '1:1', imageSize = '2K' } = await request.json();

  try {
    const response = await fetch(
      'https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent',
      {
        method: 'POST',
        headers: {
          'x-goog-api-key': process.env.GEMINI_API_KEY!,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          contents: [{ role: 'user', parts: [{ text: prompt }] }],
          generationConfig: {
            responseModalities: ['TEXT', 'IMAGE'],
            imageConfig: { aspectRatio, imageSize },
          },
        }),
      }
    );

    const data = await response.json();
    const parts = data.candidates?.[0]?.content?.parts || [];
    const imagePart = parts.find((p: any) => p.inline_data);

    return NextResponse.json({
      image: imagePart ? {
        data: imagePart.inline_data.data,
        mimeType: imagePart.inline_data.mime_type,
        url: `data:${imagePart.inline_data.mime_type};base64,${imagePart.inline_data.data}`,
      } : null,
    });
  } catch (error) {
    return NextResponse.json({ error: 'Generation failed' }, { status: 500 });
  }
}
```

## Model Comparison

| Feature | Nano Banana (2.5 Flash) | Nano Banana Pro (3 Pro Image) |
|---------|-------------------------|-------------------------------|
| Model ID | gemini-2.5-flash-image | gemini-3-pro-image-preview |
| Quality | Good | Best |
| Speed | Faster | Slower |
| Cost | Lower | Higher |
| Best For | Previews, high-volume | Production, professional |

## Resources

- **Documentation**: https://ai.google.dev/gemini-api/docs/image-generation
- **Google AI Studio**: https://aistudio.google.com
- **Prompt Guide**: https://ai.google.dev/gemini-api/docs/prompting-intro

Related Skills

google-workspace-cli

152
from hoodini/ai-agents-skills

Interact with all Google Workspace APIs via the gws CLI. Use when managing Drive files, sending/reading Gmail, creating Calendar events, reading/writing Sheets/Docs/Slides, managing Chat spaces, contacts, Admin users/groups, Vault eDiscovery, Classroom, Apps Script, Workspace Events, or configuring the gws MCP server. Triggers on Google Workspace, gws, Drive, Gmail, Calendar, Sheets, Docs, Slides, Chat, Tasks, Meet, Forms, Keep, Admin, People, Vault, Classroom, Apps Script, Cloud Identity, Alert Center, Groups Settings, Licensing, Reseller, Model Armor, gws CLI, gws mcp, Google API, Workspace automation, npx skills add.

Workflow & Productivity

skill-name

152
from hoodini/ai-agents-skills

Brief description of what this skill enables. Include trigger keywords that should activate this skill. Triggers on keyword1, keyword2, keyword3.

web-accessibility

152
from hoodini/ai-agents-skills

Build accessible web applications following WCAG guidelines. Use when implementing ARIA patterns, keyboard navigation, screen reader support, or ensuring accessibility compliance. Triggers on accessibility, a11y, WCAG, ARIA, screen reader, keyboard navigation.

vercel

152
from hoodini/ai-agents-skills

Deploy and configure applications on Vercel. Use when deploying Next.js apps, configuring serverless functions, setting up edge functions, or managing Vercel projects. Triggers on Vercel, deploy, serverless, edge function, Next.js deployment.

ux-design-systems

152
from hoodini/ai-agents-skills

Build consistent design systems with tokens, components, and theming. Use when creating component libraries, implementing design tokens, building theme systems, or ensuring design consistency. Triggers on design system, design tokens, component library, theming, dark mode.

shabbat-times

152
from hoodini/ai-agents-skills

Access Jewish calendar data and Shabbat times via Hebcal API. Use when building apps with Shabbat times, Jewish holidays, Hebrew dates, or Zmanim. Triggers on Shabbat times, Hebcal, Jewish calendar, Hebrew date, Zmanim.

railway

152
from hoodini/ai-agents-skills

Deploy applications on Railway platform. Use when deploying containerized apps, setting up databases, configuring private networking, or managing Railway projects. Triggers on Railway, railway.app, deploy container, Railway database.

owasp-security

152
from hoodini/ai-agents-skills

Implement secure coding practices following OWASP Top 10. Use when preventing security vulnerabilities, implementing authentication, securing APIs, or conducting security reviews. Triggers on OWASP, security, XSS, SQL injection, CSRF, authentication security, secure coding, vulnerability.

mongodb

152
from hoodini/ai-agents-skills

Work with MongoDB databases using best practices. Use when designing schemas, writing queries, building aggregation pipelines, or optimizing performance. Triggers on MongoDB, Mongoose, NoSQL, aggregation pipeline, document database, MongoDB Atlas.

mobile-responsiveness

152
from hoodini/ai-agents-skills

Build responsive, mobile-first web applications. Use when implementing responsive layouts, touch interactions, mobile navigation, or optimizing for various screen sizes. Triggers on responsive design, mobile-first, breakpoints, touch events, viewport.

mermaid-diagrams

152
from hoodini/ai-agents-skills

Create diagrams and visualizations using Mermaid syntax. Use when generating flowcharts, sequence diagrams, class diagrams, entity-relationship diagrams, Gantt charts, or any visual documentation. Triggers on Mermaid, flowchart, sequence diagram, class diagram, ER diagram, Gantt chart, diagram, visualization.

local-llm-router

152
from hoodini/ai-agents-skills

Route AI coding queries to local LLMs in air-gapped networks. Integrates Serena MCP for semantic code understanding. Use when working offline, with local models (Ollama, LM Studio, Jan, OpenWebUI), or in secure/closed environments. Triggers on local LLM, Ollama, LM Studio, Jan, air-gapped, offline AI, Serena, local inference, closed network, model routing, defense network, secure coding.