generate-image
Generate or edit images using AI models (FLUX, Gemini). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that isn't a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.
Best use case
generate-image is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Generate or edit images using AI models (FLUX, Gemini). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that isn't a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.
Teams using generate-image should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/generate-image/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How generate-image Compares
| Feature / Agent | generate-image | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate or edit images using AI models (FLUX, Gemini). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that isn't a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Generate Image Generate and edit high-quality images using OpenRouter's image generation models including FLUX.2 Pro and Gemini 3 Pro. ## When to Use This Skill **Use generate-image for:** - Photos and photorealistic images - Artistic illustrations and artwork - Concept art and visual concepts - Visual assets for presentations or documents - Image editing and modifications - Any general-purpose image generation needs **Use scientific-schematics instead for:** - Flowcharts and process diagrams - Circuit diagrams and electrical schematics - Biological pathways and signaling cascades - System architecture diagrams - CONSORT diagrams and methodology flowcharts - Any technical/schematic diagrams ## Quick Start Use the `scripts/generate_image.py` script to generate or edit images: ```bash # Generate a new image python scripts/generate_image.py "A beautiful sunset over mountains" # Edit an existing image python scripts/generate_image.py "Make the sky purple" --input photo.jpg ``` This generates/edits an image and saves it as `generated_image.png` in the current directory. ## API Key Setup **CRITICAL**: The script requires an OpenRouter API key. Before running, check if the user has configured their API key: 1. Look for a `.env` file in the project directory or parent directories 2. Check for `OPENROUTER_API_KEY=<key>` in the `.env` file 3. If not found, inform the user they need to: - Create a `.env` file with `OPENROUTER_API_KEY=your-api-key-here` - Or set the environment variable: `export OPENROUTER_API_KEY=your-api-key-here` - Get an API key from: https://openrouter.ai/keys The script will automatically detect the `.env` file and provide clear error messages if the API key is missing. ## Model Selection **Default model**: `google/gemini-3-pro-image-preview` (high quality, recommended) **Available models for generation and editing**: - `google/gemini-3-pro-image-preview` - High quality, supports generation + editing - `black-forest-labs/flux.2-pro` - Fast, high quality, supports generation + editing **Generation only**: - `black-forest-labs/flux.2-flex` - Fast and cheap, but not as high quality as pro Select based on: - **Quality**: Use gemini-3-pro or flux.2-pro - **Editing**: Use gemini-3-pro or flux.2-pro (both support image editing) - **Cost**: Use flux.2-flex for generation only ## Common Usage Patterns ### Basic generation ```bash python scripts/generate_image.py "Your prompt here" ``` ### Specify model ```bash python scripts/generate_image.py "A cat in space" --model "black-forest-labs/flux.2-pro" ``` ### Custom output path ```bash python scripts/generate_image.py "Abstract art" --output artwork.png ``` ### Edit an existing image ```bash python scripts/generate_image.py "Make the background blue" --input photo.jpg ``` ### Edit with a specific model ```bash python scripts/generate_image.py "Add sunglasses to the person" --input portrait.png --model "black-forest-labs/flux.2-pro" ``` ### Edit with custom output ```bash python scripts/generate_image.py "Remove the text from the image" --input screenshot.png --output cleaned.png ``` ### Multiple images Run the script multiple times with different prompts or output paths: ```bash python scripts/generate_image.py "Image 1 description" --output image1.png python scripts/generate_image.py "Image 2 description" --output image2.png ``` ## Script Parameters - `prompt` (required): Text description of the image to generate, or editing instructions - `--input` or `-i`: Input image path for editing (enables edit mode) - `--model` or `-m`: OpenRouter model ID (default: google/gemini-3-pro-image-preview) - `--output` or `-o`: Output file path (default: generated_image.png) - `--api-key`: OpenRouter API key (overrides .env file) ## Example Use Cases ### For Scientific Documents ```bash # Generate a conceptual illustration for a paper python scripts/generate_image.py "Microscopic view of cancer cells being attacked by immunotherapy agents, scientific illustration style" --output figures/immunotherapy_concept.png # Create a visual for a presentation python scripts/generate_image.py "DNA double helix structure with highlighted mutation site, modern scientific visualization" --output slides/dna_mutation.png ``` ### For Presentations and Posters ```bash # Title slide background python scripts/generate_image.py "Abstract blue and white background with subtle molecular patterns, professional presentation style" --output slides/background.png # Poster hero image python scripts/generate_image.py "Laboratory setting with modern equipment, photorealistic, well-lit" --output poster/hero.png ``` ### For General Visual Content ```bash # Website or documentation images python scripts/generate_image.py "Professional team collaboration around a digital whiteboard, modern office" --output docs/team_collaboration.png # Marketing materials python scripts/generate_image.py "Futuristic AI brain concept with glowing neural networks" --output marketing/ai_concept.png ``` ## Error Handling The script provides clear error messages for: - Missing API key (with setup instructions) - API errors (with status codes) - Unexpected response formats - Missing dependencies (requests library) If the script fails, read the error message and address the issue before retrying. ## Notes - Images are returned as base64-encoded data URLs and automatically saved as PNG files - The script supports both `images` and `content` response formats from different OpenRouter models - Generation time varies by model (typically 5-30 seconds) - For image editing, the input image is encoded as base64 and sent to the model - Supported input image formats: PNG, JPEG, GIF, WebP - Check OpenRouter pricing for cost information: https://openrouter.ai/models ## Image Editing Tips - Be specific about what changes you want (e.g., "change the sky to sunset colors" vs "edit the sky") - Reference specific elements in the image when possible - For best results, use clear and detailed editing instructions - Both Gemini 3 Pro and FLUX.2 Pro support image editing through OpenRouter ## Integration with Other Skills - **scientific-schematics**: Use for technical diagrams, flowcharts, circuits, pathways - **generate-image**: Use for photos, illustrations, artwork, visual concepts - **scientific-slides**: Combine with generate-image for visually rich presentations - **latex-posters**: Use generate-image for poster visuals and hero images
Related Skills
daily-paper-push-writing(wo image)
A research/push notification writing guide. Use this skill with high priority when users ask you to perform tasks like daily paper push.
acpx
Use the ACPX CLI through DrClaw's existing exec/long_exec tools to run Codex in the current project workspace.
ui-ux-pro-max
[Frontend] Frontend UI/UX design intelligence - activate FIRST when user requests beautiful, stunning, gorgeous, or aesthetic interfaces. 50 styles, 21 palettes, 50 font pairings, 20 charts, 8 stacks. Triggers on ui design, ux design, design system, color palette, typography, glassmorphism, claymorphism, neumorphism, bento grid, font pairing, ui-ux-pro-max, stunning interface, beautiful ui.
fetch
Fetch metadata and links from arXiv for a given query.
web_literature_mining
Scientific Literature Mining - Mine scientific literature: PubMed search, arXiv search, web search, and Tavily deep search. Use this skill for scientific informatics tasks involving pubmed search search literature search web tavily search. Combines 4 tools from 2 SCP server(s).
uniprot_deep_analysis
UniProt Deep Protein Analysis - Deep UniProt analysis: entry data, UniRef clusters, UniParc cross-references, and gene-centric view. Use this skill for protein science tasks involving get uniprotkb entry by accession get uniref cluster by id get uniparc entry by upi get gene centric by accession. Combines 4 tools from 1 SCP server(s).
synthetic_biology_design
Synthetic Biology Design - Design synthetic biology construct: gene lookup, codon optimization, protein property prediction, and structure prediction. Use this skill for synthetic biology tasks involving get sequence id DegenerateCodonCalculatorbyAminoAcid calculate protein sequence properties pred protein structure esmfold. Combines 4 tools from 4 SCP server(s).
structural_homology_modeling
Structural Homology & Evolution Analysis - Analyze protein evolution: get gene tree from Ensembl, find homologs, compare sequences, and predict structure. Use this skill for evolutionary biology tasks involving get homology symbol get genetree member symbol calculate protein sequence properties pred protein structure esmfold. Combines 4 tools from 3 SCP server(s).
proteome_analysis
Proteome-Level Analysis - Analyze at proteome level: get proteome from UniProt, gene-centric view, functional annotation from STRING. Use this skill for proteomics tasks involving get proteome by id get gene centric by proteome get functional annotation. Combines 3 tools from 2 SCP server(s).
protein_structure_analysis
Protein Structure Comprehensive Analysis - Comprehensive structure analysis: download PDB, extract chains, calculate geometry, quality metrics, and composition. Use this skill for structural biology tasks involving retrieve protein data by pdbcode extract pdb chains calculate pdb structural geometry calculate pdb quality metrics calculate pdb composition info. Combines 5 tools from 1 SCP server(s).
protein_solubility_optimization
Protein Solubility Optimization - Optimize protein solubility: calculate properties, predict solubility, predict hydrophilicity, and suggest mutations. Use this skill for protein engineering tasks involving calculate protein sequence properties predict protein function ComputeHydrophilicity zero shot sequence prediction. Combines 4 tools from 3 SCP server(s).
protein_similarity_search
Protein Similarity Search - Search for similar proteins: extract sequence from PDB, search structures with FoldSeek, find homologs with STRING, and check UniProt. Use this skill for bioinformatics tasks involving extract pdb sequence foldseek search get best similarity hits between species search uniprotkb entries. Combines 4 tools from 3 SCP server(s).