muapi-nano-banana

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

3,891 stars

Best use case

muapi-nano-banana is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using muapi-nano-banana should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/muapi-nano-banana-skill/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/anil-matcha/muapi-nano-banana-skill/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/muapi-nano-banana-skill/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How muapi-nano-banana Compares

| Feature / Agent | muapi-nano-banana | Standard Approach |
| :--- | :--- | :--- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting

Where can I find the source code?

The source code is on GitHub in the openclaw/skills repository, under skills/anil-matcha/muapi-nano-banana-skill.

SKILL.md Source

# 🍌 Nano-Banana Expert Skill (Gemini 3 Style)

**A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.**
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

## Core Competencies

1. **Reasoning-Driven Prompting**: Using natural language logic to define physics, lighting, and spatial relationships.
2. **Structured Creative Briefs**: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting + Style`.
3. **Text Rendering Precision**: Explicitly defining typography and signifiers for legible text integration.
4. **Contextual Grounding**: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

---

## 🏗️ Technical Specification

### 1. The "Perfect Prompt" Formula

| Component | Description | Example |
| :--- | :--- | :--- |
| **Subject** | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| **Action** | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| **Context** | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| **Composition** | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| **Lighting** | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| **Style** | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |
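
The six components in the table can be joined into a single sentence-based brief. The sketch below is only an illustration: `build_brief` and its sentence template are hypothetical and are not part of this skill's scripts.

```bash
# Hypothetical helper (not shipped with the skill): joins the six formula
# components into one narrative brief made of full sentences.
# Argument order: subject, action, context, composition, lighting, style.
build_brief() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  printf '%s is %s %s. The shot is framed as %s, lit by %s. The overall look is %s.\n' \
    "$subject" "$action" "$context" "$composition" "$lighting" "$style"
}

# Example using the values from the table, lightly reworded so they read as prose.
build_brief \
  "A stoic robot barista with exposed copper wiring" \
  "pouring a latte art leaf with mechanical precision" \
  "inside a neon-lit cyberpunk cafe at midnight" \
  "a close-up with an 85mm lens at f/1.8" \
  "a volumetric blue rim light and a warm cafe glow" \
  "cinematic and photorealistic"
```

The template keeps each component inside a sentence, so the result reads as a description and not as a comma-separated tag list.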

### 2. Advanced Features
- **Negative Constraint Logic**: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- **Identity Consistency**: (Simulated) "Maintain consistent facial structure across variations."
- **Text Integration**: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.

---

## 🧠 Prompt Optimization Protocol (Agent Instruction)

**Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:**

1. **NO KEYWORD SOUP**: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
2. **PHYSICAL CONSISTENCY**: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
3. **TEXT PRECISION**: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
4. **OPTICAL DIRECTIVES**: Specify lens behavior: *Shallow Depth of Field (f/1.8)*, *Macro Lens*, *Anamorphic Flare*.
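
As a rough first pass for rule 1, an agent could drop filler tags from a comma-separated prompt before rewriting what remains into full sentences. The function name `strip_keyword_soup` and the short tag list are assumptions made for this sketch; the skill itself does not define them.

```bash
# Hypothetical pre-processing step: removes a small, illustrative list of
# filler tags from a comma-separated prompt. It does not rewrite the prompt
# into sentences; that remains the agent's job.
strip_keyword_soup() {
  local prompt="$1" part trimmed lowered result=""
  local -a parts
  IFS=',' read -r -a parts <<< "$prompt"
  for part in "${parts[@]}"; do
    # Trim leading, then trailing, whitespace.
    trimmed="${part#"${part%%[![:space:]]*}"}"
    trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"
    lowered=$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]')
    case "$lowered" in
      ''|8k|masterpiece|ultra-detailed|'trending on artstation') continue ;;
    esac
    result="${result:+$result, }$trimmed"
  done
  printf '%s\n' "$result"
}

strip_keyword_soup 'a robot barista pouring a latte, 8k, Masterpiece, ultra-detailed'
# prints: a robot barista pouring a latte
```

Only whole comma-separated fragments are removed, so a phrase that merely contains one of the tags inside a longer sentence is left alone.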

---

## 🚀 Protocol: Using Nano-Banana

### Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.

### Step 2: Invoke the Script
The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

---

## ⚠️ Constraints & Guardrails

- **No Keyword Soup**: **MANDATORY** - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- **Physics Logic**: Ensure the prompt describes *physically possible* lighting and reflection interactions.
- **Full Sentences**: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".
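
The first guardrail can be checked mechanically before a prompt is sent. The following is a minimal sketch, assuming a hypothetical `check_prompt` function and a three-term ban list taken from the rule above; it uses plain substring matching, so it can produce false positives (for example, "8k" inside "48k").

```bash
# Hypothetical guard: returns non-zero when a prompt contains any of the
# banned filler terms, and reports each hit on stderr.
check_prompt() {
  local lowered term status=0
  lowered=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  for term in "trending on artstation" "masterpiece" "8k"; do
    case "$lowered" in
      *"$term"*)
        printf 'keyword soup detected: %s\n' "$term" >&2
        status=1
        ;;
    esac
  done
  return "$status"
}

# Usage: refuse to continue when the guard fails.
if check_prompt 'Light reflecting off the water at dusk.'; then
  echo "prompt accepted"
fi
```

The physics and full-sentence guardrails are judgment calls that a simple string check like this cannot cover.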

---

## ⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
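
To make the "Logic Wrapper" idea concrete, here is a sketch of how fragmented flags might be turned into one narrative prompt. It is not the actual `generate-nano-art.sh`: the function `nano_brief` and its sentence template are hypothetical, and because the interface of `core/media/generate-image.sh` is not shown here, the sketch only prints the prompt instead of calling that script.

```bash
# Hypothetical wrapper logic: parses the four flags used in the example
# invocation and prints a narrative prompt. It deliberately stops short of
# calling core/media/generate-image.sh, whose arguments are not documented here.
nano_brief() {
  local subject="" action="" context="" style=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --subject|--action|--context|--style)
        if [ "$#" -lt 2 ]; then
          printf 'missing value for %s\n' "$1" >&2
          return 2
        fi
        ;;
      *)
        printf 'unknown option: %s\n' "$1" >&2
        return 2
        ;;
    esac
    case "$1" in
      --subject) subject="$2" ;;
      --action)  action="$2" ;;
      --context) context="$2" ;;
      --style)   style="$2" ;;
    esac
    shift 2
  done
  if [ -z "$subject" ]; then
    printf 'missing required --subject\n' >&2
    return 2
  fi
  # Optional parts are appended only when they were supplied.
  printf 'An image of %s%s%s.%s\n' \
    "$subject" "${action:+ $action}" "${context:+ $context}" \
    "${style:+ Rendered as $style.}"
}

nano_brief \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
# prints: An image of a glass chess piece shattering into liquid shards on an obsidian table. Rendered as macro photography.
```

Treating only `--subject` as required is a design choice of this sketch; the real script may enforce different rules.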

Related Skills

nano-pdf

3891
from openclaw/skills

Edit PDFs with natural-language instructions using the nano-pdf CLI.

Content & Documentation

evolink-nano-banana-2

3891
from openclaw/skills

Nano Banana 2 — AI image generation powered by Google Gemini 3.1 Flash. Fast, versatile text-to-image and image editing via Evolink API. One API key.

muapi-workflow

3891
from openclaw/skills

Build, run, and visualize multi-step AI generation workflows. The AI architect translates natural language descriptions into connected node graphs — chain image generation, video creation, enhancement, and editing into automated pipelines.

muapi-ui-design

3891
from openclaw/skills

Generate high-fidelity UI/UX mockups for mobile and web apps using Atomic Design principles — creates wireframes and design systems via muapi.ai

muapi-seedance-2

3891
from openclaw/skills

Expert Cinema Director skill for Seedance 2.0 (ByteDance) — high-fidelity video generation using technical camera grammar and multimodal references. Supports text-to-video, image-to-video, and video extension.

muapi-platform

3891
from openclaw/skills

Setup and utility scripts for muapi.ai — configure API keys, test connectivity, and poll for async generation results

muapi-photo-pack-generator

3891
from openclaw/skills

Generate a pack of professional or aesthetic photos from a single reference image while preserving the exact identity of the person.

muapi-media-generation

3891
from openclaw/skills

Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5

muapi-media-editing

3891
from openclaw/skills

Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more

muapi-cinema-director

3891
from openclaw/skills

Direct high-fidelity cinematic video with AI — translates creative intent into technical cinematographic directives for Veo3, Kling, and Luma video models via muapi.ai

muapi-logo-creator

3891
from openclaw/skills

Engineer professional-grade brand logos using geometric primitives and negative space — generates minimalist, scalable vector-style marks via muapi.ai

IMA Nano Banana Image Generator

3891
from openclaw/skills

Nano Banana-only image generation on IMA Open API. Supports text_to_image and image_to_image with gemini-3.1-flash-image (budget) and gemini-3-pro-image (premium). Deterministic size/ratio mapping, 512/1K/2K/4K resolution. Requires IMA_API_KEY.