muapi-nano-banana
Reasoning-driven image generation using structured creative briefs (Gemini 3 style) — generates high-fidelity images via muapi.ai with logic-based prompting
Best use case
muapi-nano-banana is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using muapi-nano-banana can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/muapi-nano-banana-skill/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
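The manual steps above could be scripted roughly as follows. This is a minimal sketch, not an official installer: the function name `install_skill` and its arguments are hypothetical, and it assumes SKILL.md has already been downloaded to a local path, since the download URL is not given here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copies an already-downloaded SKILL.md into the
# skill directory named in the installation steps.
# Usage: install_skill <path-to-SKILL.md> [project-root]
install_skill() {
  local src="$1"
  local project_root="${2:-.}"
  local dest_dir="$project_root/.claude/skills/muapi-nano-banana-skill"

  # Refuse to continue if the source file is missing.
  if [ ! -f "$src" ]; then
    echo "SKILL.md not found at: $src" >&2
    return 1
  fi

  # Create the skill directory (and any missing parents), then copy the file.
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/SKILL.md"
}
```

For example, `install_skill ~/Downloads/SKILL.md .` would place the file under the current project. The agent still needs to be restarted afterwards to discover the skill.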
How muapi-nano-banana Compares
| Feature / Agent | muapi-nano-banana | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It generates high-fidelity images via muapi.ai from structured, reasoning-driven creative briefs (Gemini 3 style) instead of keyword lists.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for ChatGPT
Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# 🍌 Nano-Banana Expert Skill (Gemini 3 Style)

**A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.**

Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

## Core Competencies

1. **Reasoning-Driven Prompting**: Using natural language logic to define physics, lighting, and spatial relationships.
2. **Structured Creative Briefs**: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting`.
3. **Text Rendering Precision**: Explicitly defining typography and signifiers for legible text integration.
4. **Contextual Grounding**: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

---

## 🏗️ Technical Specification

### 1. The "Perfect Prompt" Formula

| Component | Description | Example |
| :--- | :--- | :--- |
| **Subject** | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| **Action** | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| **Context** | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| **Composition** | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| **Lighting** | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| **Style** | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |

### 2. Advanced Features

- **Negative Constraint Logic**: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- **Identity Consistency**: (Simulated) "Maintain consistent facial structure across variations."
- **Text Integration**: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.

---

## 🧠 Prompt Optimization Protocol (Agent Instruction)

**Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:**

1. **NO KEYWORD SOUP**: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
2. **PHYSICAL CONSISTENCY**: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
3. **TEXT PRECISION**: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
4. **OPTICAL DIRECTIVES**: Specify lens behavior: *Shallow Depth of Field (f/1.8)*, *Macro Lens*, *Anamorphic Flare*.

---

## 🚀 Protocol: Using Nano-Banana

### Step 1: Define the Creative Logic

Provide the agent with a subject and a specific scenario.

### Step 2: Invoke the Script

The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

---

## ⚠️ Constraints & Guardrails

- **No Keyword Soup**: **MANDATORY** - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- **Physics Logic**: Ensure the prompt describes *physically possible* lighting and reflection interactions.
- **Full Sentences**: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".

---

## ⚙️ Implementation Details

This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
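
To make the "Logic Wrapper" idea concrete, here is one way the rewrite step could be sketched in bash. It is an illustration only and does not reproduce `generate-nano-art.sh`, whose contents are not shown here. The function names `strip_keyword_soup` and `build_brief`, the keyword list, and the sentence templates are all assumptions chosen to match the formula table and the "No Keyword Soup" and "Full Sentences" rules.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a prompt "Logic Wrapper"; not the skill's actual script.

# Drop comma-separated fragments that are pure quality keywords
# (illustrative list taken from the guardrails above).
strip_keyword_soup() {
  local input="$1" part trimmed lower out=""
  local -a parts
  IFS=',' read -r -a parts <<< "$input"
  for part in "${parts[@]}"; do
    # Trim leading and trailing whitespace from the fragment.
    trimmed="${part#"${part%%[![:space:]]*}"}"
    trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"
    lower="$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
      ""|8k|masterpiece|ultra-detailed|"trending on artstation") continue ;;
    esac
    out="${out:+$out, }$trimmed"
  done
  printf '%s\n' "$out"
}

# Join the formula components into full sentences.
# Arguments: subject action context composition lighting style
# Empty optional components (composition, lighting, style) are skipped.
build_brief() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  local first brief

  # Capitalize the first letter so the brief starts as a proper sentence.
  first="$(printf '%s' "${subject:0:1}" | tr '[:lower:]' '[:upper:]')"
  brief="${first}${subject:1} ${action} ${context}."

  if [ -n "$composition" ]; then brief+=" The shot is composed as ${composition}."; fi
  if [ -n "$lighting" ]; then brief+=" The scene is lit by ${lighting}."; fi
  if [ -n "$style" ]; then brief+=" The overall style is ${style}."; fi
  printf '%s\n' "$brief"
}
```

Called with the chess-piece inputs from Step 2 and empty composition and lighting, `build_brief` would emit a two-sentence brief ending in "The overall style is macro photography." A real wrapper would presumably pass that text on to the image-generation primitive. That step is omitted here because its interface is not documented in this file.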
Related Skills
nano-pdf
Edit PDFs with natural-language instructions using the nano-pdf CLI.
evolink-nano-banana-2
Nano Banana 2 — AI image generation powered by Google Gemini 3.1 Flash. Fast, versatile text-to-image and image editing via Evolink API. One API key.
muapi-workflow
Build, run, and visualize multi-step AI generation workflows. The AI architect translates natural language descriptions into connected node graphs — chain image generation, video creation, enhancement, and editing into automated pipelines.
muapi-ui-design
Generate high-fidelity UI/UX mockups for mobile and web apps using Atomic Design principles — creates wireframes and design systems via muapi.ai
muapi-seedance-2
Expert Cinema Director skill for Seedance 2.0 (ByteDance) — high-fidelity video generation using technical camera grammar and multimodal references. Supports text-to-video, image-to-video, and video extension.
muapi-platform
Setup and utility scripts for muapi.ai — configure API keys, test connectivity, and poll for async generation results
muapi-photo-pack-generator
Generate a pack of professional or aesthetic photos from a single reference image while preserving the exact identity of the person.
muapi-media-generation
Generate AI images, videos, music, and audio from the terminal via muapi.ai — supports 100+ models including Flux, Midjourney v7, Kling 3.0, Veo3, and Suno V5
muapi-media-editing
Edit and enhance images and videos with AI via muapi.ai — prompt-based editing, upscaling, background removal, face swap, lipsync, video effects, and more
muapi-cinema-director
Direct high-fidelity cinematic video with AI — translates creative intent into technical cinematographic directives for Veo3, Kling, and Luma video models via muapi.ai
muapi-logo-creator
Engineer professional-grade brand logos using geometric primitives and negative space — generates minimalist, scalable vector-style marks via muapi.ai
IMA Nano Banana Image Generator
Nano Banana-only image generation on IMA Open API. Supports text_to_image and image_to_image with gemini-3.1-flash-image (budget) and gemini-3-pro-image (premium). Deterministic size/ratio mapping, 512/1K/2K/4K resolution. Requires IMA_API_KEY.