gemini-image-simple
Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.
Best use case
gemini-image-simple is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.
Teams using gemini-image-simple should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/gemini-image-simple/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How gemini-image-simple Compares
| Feature / Agent | gemini-image-simple | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Gemini Image Simple
Generate and edit images using Google's Gemini 2.0 Flash image generation API.
## Why This Skill
| Feature | This Skill | Others (nano-banana-pro, etc.) |
|---------|------------|-------------------------------|
| **Dependencies** | None (stdlib only) | google-genai, pillow, etc. |
| **Requires pip/uv** | ❌ No | ✅ Yes |
| **Works on Fly.io free** | ✅ Yes | ❌ Fails |
| **Works in containers** | ✅ Yes | ❌ Often fails |
| **Image generation** | ✅ Full | ✅ Full |
| **Image editing** | ✅ Yes | ✅ Yes |
| **Setup complexity** | Just set API key | Install packages first |
**Bottom line:** This skill works anywhere Python 3 exists. No package managers, no virtual environments, no permission issues.
## Quick Start
```bash
# Generate
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "A cat wearing a tiny hat" cat.png
# Edit existing image
python3 /data/clawd/skills/gemini-image-simple/scripts/generate.py "Make it sunset lighting" edited.png --input original.png
```
## Usage
### Generate new image
```bash
python3 {baseDir}/scripts/generate.py "your prompt" output.png
```
### Edit existing image
```bash
python3 {baseDir}/scripts/generate.py "edit instructions" output.png --input source.png
```
Supported input formats: PNG, JPG, JPEG, GIF, WEBP
## Environment
Set `GEMINI_API_KEY` environment variable. Get one at https://aistudio.google.com/apikey
## How It Works
Uses Gemini 2.0 Flash experimental image generation:
- Pure `urllib.request` for HTTP (no requests library)
- Pure `json` for parsing (stdlib)
- Pure `base64` for encoding (stdlib)
That's it. No external packages. Works on any Python 3.10+ installation.
## Examples
```bash
# Landscape
python3 {baseDir}/scripts/generate.py "Misty mountains at sunrise, photorealistic" mountains.png
# Product shot
python3 {baseDir}/scripts/generate.py "Minimalist product photo of a coffee cup, white background" coffee.png
# Edit: change style
python3 {baseDir}/scripts/generate.py "Convert to watercolor painting style" watercolor.png --input photo.jpg
# Edit: add element
python3 {baseDir}/scripts/generate.py "Add a rainbow in the sky" rainbow.png --input landscape.png
```Related Skills
openai-image-gen
Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.
imagemagick
No description provided.
google-gemini-media
Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".
gemini
Gemini CLI for one-shot Q&A, summaries, and generation.
gemini-yt-video-transcript
Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).
gemini-stt
Transcribe audio files using Google's Gemini API or Vertex AI
gemini-deep-research
Perform complex, long-running research tasks using Gemini Deep Research Agent. Use when asked to research topics requiring multi-source synthesis, competitive analysis, market research, or comprehensive technical investigations that benefit from systematic web search and analysis.
gemini-computer-use
Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.
chart-image
Generate publication-quality chart images from data. Supports line, bar, area, and point charts. Use when visualizing data, creating graphs, plotting time series, or generating chart images for reports/alerts. Designed for Fly.io/VPS deployments - no native compilation, no Puppeteer, no browser required. Pure Node.js with prebuilt binaries.
brave-images
Search for images using Brave Search API. Use when you need to find images, pictures, photos, or visual content on any topic. Requires BRAVE_API_KEY environment variable.
antigravity-image-gen
Generate images using the internal Google Antigravity API (Gemini 3 Pro Image). High quality, native generation without browser automation.
portfolio-watcher
Monitor stock/crypto holdings, get price alerts, track portfolio performance