muapi-nano-banana

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

3,891 stars

byopenclaw

View on GitHub Installation ↓

Best use case

muapi-nano-banana is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

Teams using muapi-nano-banana should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/muapi-nano-banana-skill/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/anil-matcha/muapi-nano-banana-skill/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/muapi-nano-banana-skill/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How muapi-nano-banana Compares

Feature / Agent	muapi-nano-banana	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

Best AI Skills for ChatGPT

Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.

Best AI Skills for Claude

Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.

ChatGPT vs Claude for Agent Skills

Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.

SKILL.md Source

# 🍌 Nano-Banana Expert Skill (Gemini 3 Style)

**A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.**
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

## Core Competencies

1. **Reasoning-Driven Prompting**: Using natural language logic to define physics, lighting, and spatial relationships.
2. **Structured Creative Briefs**: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting`.
3. **Text Rendering Precision**: Explicitly defining typography and signifiers for legible text integration.
4. **Contextual Grounding**: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

---

## 🏗️ Technical Specification

### 1. The "Perfect Prompt" Formula

| Component | Description | Example |
| :--- | :--- | :--- |
| **Subject** | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| **Action** | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| **Context** | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| **Composition** | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| **Lighting** | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| **Style** | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |

### 2. Advanced Features
- **Negative Constraint Logic**: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- **Identity Consistency**: (Simulated) "Maintain consistent facial structure across variations."
- **Text Integration**: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.

---

## 🧠 Prompt Optimization Protocol (Agent Instruction)

**Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:**

1. **NO KEYWORD SOUP**: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
2. **PHYSICAL CONSISTENCY**: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
3. **TEXT PRECISION**: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
4. **OPTICAL DIRECTIVES**: Specify lens behavior: *Shallow Depth of Field (f/1.8)*, *Macro Lens*, *Anamorphic Flare*.

---

## 🚀 Protocol: Using Nano-Banana

### Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.

### Step 2: Invoke the Script
The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on a obsidian table" \
  --style "macro photography"
```

---

## ⚠️ Constraints & Guardrails

- **No Keyword Soup**: **MANDATORY** - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- **Physics Logic**: Ensure the prompt describes *physically possible* lighting and reflection interactions.
- **Full Sentences**: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".

---

## ⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.

Related Skills

nano-pdf

3891

from openclaw/skills

Edit PDFs with natural-language instructions using the nano-pdf CLI.

Content & Documentation

evolink-nano-banana-2

3891

from openclaw/skills

Nano Banana 2 — AI image generation powered by Google Gemini 3.1 Flash. Fast, versatile text-to-image and image editing via Evolink API. One API key.

muapi-workflow

3891

from openclaw/skills

Build, run, and visualize multi-step AI generation workflows. The AI architect translates natural language descriptions into connected node graphs — chain image generation, video creation, enhancement, and editing into automated pipelines.

muapi-ui-design

3891

from openclaw/skills

Generate high-fidelity UI/UX mockups for mobile and web apps using Atomic Design principles — creates wireframes and design systems via muapi.ai

muapi-seedance-2

3891

from openclaw/skills

Expert Cinema Director skill for Seedance 2.0 (ByteDance) — high-fidelity video generation using technical camera grammar and multimodal references. Supports text-to-video, image-to-video, and video extension.

muapi-platform

3891

from openclaw/skills

Setup and utility scripts for muapi.ai — configure API keys, test connectivity, and poll for async generation results

muapi-photo-pack-generator

3891

from openclaw/skills

Generate a pack of professional or aesthetic photos from a single reference image while preserving the exact identity of the person.

muapi-media-generation

3891

from openclaw/skills

Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5

muapi-media-editing

3891

from openclaw/skills

Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more

muapi-cinema-director

3891

from openclaw/skills

Direct high-fidelity cinematic video with AI — translates creative intent into technical cinematographic directives for Veo3, Kling, and Luma video models via muapi.ai

muapi-logo-creator

3891

from openclaw/skills

Engineer professional-grade brand logos using geometric primitives and negative space — generates minimalist, scalable vector-style marks via muapi.ai

IMA Nano Banana Image Generator

3891

from openclaw/skills

Nano Banana-only image generation on IMA Open API. Supports text_to_image and image_to_image with gemini-3.1-flash-image (budget) and gemini-3-pro-image (premium). Deterministic size/ratio mapping, 512/1K/2K/4K resolution. Requires IMA_API_KEY.