gemini-image-simple

Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.

533 stars

by sundial-org

Best use case

gemini-image-simple is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using gemini-image-simple should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/gemini-image-simple/SKILL.md --create-dirs "https://raw.githubusercontent.com/sundial-org/awesome-openclaw-skills/main/skills/gemini-image-simple/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/gemini-image-simple/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How gemini-image-simple Compares

Feature / Agent	gemini-image-simple	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Gemini Image Simple

Generate and edit images using Google's Gemini 2.0 Flash image generation API.

## Why This Skill

| Feature | This Skill | Others (nano-banana-pro, etc.) |
|---------|------------|-------------------------------|
| **Dependencies** | None (stdlib only) | google-genai, pillow, etc. |
| **Requires pip/uv** | ❌ No | ✅ Yes |
| **Works on Fly.io free** | ✅ Yes | ❌ Fails |
| **Works in containers** | ✅ Yes | ❌ Often fails |
| **Image generation** | ✅ Full | ✅ Full |
| **Image editing** | ✅ Yes | ✅ Yes |
| **Setup complexity** | Just set API key | Install packages first |

**Bottom line:** This skill works anywhere Python 3.10+ is installed. No package managers, no virtual environments, no permission issues.

## Quick Start

```bash
# Generate
python3 {baseDir}/scripts/generate.py "A cat wearing a tiny hat" cat.png

# Edit existing image
python3 {baseDir}/scripts/generate.py "Make it sunset lighting" edited.png --input original.png
```

## Usage

### Generate new image

```bash
python3 {baseDir}/scripts/generate.py "your prompt" output.png
```

### Edit existing image

```bash
python3 {baseDir}/scripts/generate.py "edit instructions" output.png --input source.png
```

Supported input formats: PNG, JPG, JPEG, GIF, WEBP
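
The command-line shape shown above (a prompt, an output path, and an optional `--input` image) could be parsed with `argparse` roughly as follows. This is a minimal sketch, not the skill's actual `generate.py`: the `parse_args` and `mime_type_for` helpers and the extension-to-MIME table are assumptions made for illustration.

```python
import argparse
from pathlib import Path

# Assumed mapping from the supported input extensions to MIME types.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def mime_type_for(path):
    """Return the MIME type for a supported image path, or raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"Unsupported input format: {suffix or path}")
    return MIME_TYPES[suffix]


def parse_args(argv=None):
    """Parse `prompt output [--input source]` (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generate or edit an image")
    parser.add_argument("prompt", help="text prompt or edit instructions")
    parser.add_argument("output", help="path of the image to write")
    parser.add_argument("--input", help="existing image to edit")
    return parser.parse_args(argv)


# Example: an edit invocation with an explicit argument list.
args = parse_args(["Make it sunset lighting", "edited.png", "--input", "original.png"])
print(args.output, mime_type_for(args.input))
```

Rejecting unknown extensions before any network call keeps failures cheap and the error message specific.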

## Environment

Set the `GEMINI_API_KEY` environment variable. Get a key at https://aistudio.google.com/apikey
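
A script can fail early with a readable message when the key is missing. The sketch below assumes a hypothetical `get_api_key` helper; it is not taken from the skill's script.

```python
import os


def get_api_key(env=os.environ):
    """Return the Gemini API key, or raise a readable error (hypothetical helper)."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set. Get a key at https://aistudio.google.com/apikey"
        )
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.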

## How It Works

Uses Gemini 2.0 Flash experimental image generation, called with the standard library only:
- Pure `urllib.request` for HTTP (no requests library)
- Pure `json` for parsing (stdlib)
- Pure `base64` for encoding (stdlib)

That's it. No external packages. Works on any Python 3.10+ installation.
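
The three stdlib pieces listed above fit together roughly as follows. This is a sketch under stated assumptions, not the skill's actual code: the model name, endpoint URL, and request/response field names follow the public Gemini REST API as I understand it and may differ from what `generate.py` uses, and the function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumptions: model name and endpoint follow the public Gemini REST API;
# the skill's script may use different values.
MODEL = "gemini-2.0-flash-exp-image-generation"
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_request(prompt, api_key, image_bytes=None, mime_type="image/png"):
    """Build a urllib Request for generation, or for editing if image_bytes is given."""
    parts = [{"text": prompt}]
    if image_bytes is not None:
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        })
    body = {
        "contents": [{"parts": parts}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    return urllib.request.Request(
        ENDPOINT.format(model=MODEL),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_image(response):
    """Return the decoded bytes of the first image part in a parsed response."""
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData") or part.get("inline_data")
            if inline and "data" in inline:
                return base64.b64decode(inline["data"])
    raise ValueError("No image data in response")


def generate(prompt, api_key, output_path, image_bytes=None, mime_type="image/png"):
    """Send the request and write the returned image to output_path."""
    request = build_request(prompt, api_key, image_bytes, mime_type)
    with urllib.request.urlopen(request, timeout=120) as resp:
        payload = json.load(resp)
    with open(output_path, "wb") as fh:
        fh.write(extract_image(payload))
```

Keeping request construction and response decoding separate from the network call means both can be checked offline; only `generate` touches the network.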

## Examples

```bash
# Landscape
python3 {baseDir}/scripts/generate.py "Misty mountains at sunrise, photorealistic" mountains.png

# Product shot
python3 {baseDir}/scripts/generate.py "Minimalist product photo of a coffee cup, white background" coffee.png

# Edit: change style
python3 {baseDir}/scripts/generate.py "Convert to watercolor painting style" watercolor.png --input photo.jpg

# Edit: add element
python3 {baseDir}/scripts/generate.py "Add a rainbow in the sky" rainbow.png --input landscape.png
```

Related Skills

openai-image-gen

533

from sundial-org/awesome-openclaw-skills

Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.

imagemagick

533

from sundial-org/awesome-openclaw-skills

No description provided.

google-gemini-media

533

from sundial-org/awesome-openclaw-skills

Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".

gemini

533

from sundial-org/awesome-openclaw-skills

Gemini CLI for one-shot Q&A, summaries, and generation.

gemini-yt-video-transcript

533

from sundial-org/awesome-openclaw-skills

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

gemini-stt

533

from sundial-org/awesome-openclaw-skills

Transcribe audio files using Google's Gemini API or Vertex AI

gemini-deep-research

533

from sundial-org/awesome-openclaw-skills

Perform complex, long-running research tasks using Gemini Deep Research Agent. Use when asked to research topics requiring multi-source synthesis, competitive analysis, market research, or comprehensive technical investigations that benefit from systematic web search and analysis.

gemini-computer-use

533

from sundial-org/awesome-openclaw-skills

Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.

chart-image

533

from sundial-org/awesome-openclaw-skills

Generate publication-quality chart images from data. Supports line, bar, area, and point charts. Use when visualizing data, creating graphs, plotting time series, or generating chart images for reports/alerts. Designed for Fly.io/VPS deployments - no native compilation, no Puppeteer, no browser required. Pure Node.js with prebuilt binaries.

brave-images

533

from sundial-org/awesome-openclaw-skills

Search for images using Brave Search API. Use when you need to find images, pictures, photos, or visual content on any topic. Requires BRAVE_API_KEY environment variable.

antigravity-image-gen

533

from sundial-org/awesome-openclaw-skills

Generate images using the internal Google Antigravity API (Gemini 3 Pro Image). High quality, native generation without browser automation.

portfolio-watcher

533

from sundial-org/awesome-openclaw-skills

Monitor stock/crypto holdings, get price alerts, track portfolio performance