gpt-image-1-5
Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.
Best use case
gpt-image-1-5 is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.
Teams using gpt-image-1-5 should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/gpt-image-1-5/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How gpt-image-1-5 Compares
| Feature / Agent | gpt-image-1-5 | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# GPT Image 1.5 - Image Generation & Editing
Generate new images or edit existing ones using OpenAI's GPT Image 1.5 model.
- **Generation**: Uses the Responses API with image_generation tool
- **Editing**: Uses the Image API for reliable mask-based inpainting
## Usage
Run the script using absolute path (do NOT cd to skill directory first):
**Generate new image:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "your image description" --filename "output-name.png" [--quality low|medium|high] [--size 1024x1024|1024x1536|1536x1024|auto] [--background transparent|opaque|auto] [--api-key KEY]
```
**Edit existing image (without mask - full image edit):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "editing instructions" --filename "output-name.png" --input-image "path/to/input.png" [--size 1024x1024|1024x1536|1536x1024|auto] [--api-key KEY]
```
**Edit existing image (with mask - precise inpainting):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "what to put in masked area" --filename "output-name.png" --input-image "path/to/input.png" --mask "path/to/mask.png" [--size 1024x1024|1024x1536|1536x1024|auto] [--api-key KEY]
```
**Important:** Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.
## Parameters
### Quality Options
- **low** - Fastest generation, lower quality
- **medium** (default) - Balanced quality and speed
- **high** - Best quality, slower generation
Map user requests:
- No mention of quality -> `medium`
- "quick", "fast", "draft" -> `low`
- "high quality", "best", "detailed", "high-res" -> `high`
### Size Options
- **1024x1024** (default) - Square format
- **1024x1536** - Portrait format
- **1536x1024** - Landscape format
- **auto** - Let the model decide based on prompt
Map user requests:
- No mention of size -> `1024x1024`
- "square" -> `1024x1024`
- "portrait", "vertical", "tall" -> `1024x1536`
- "landscape", "horizontal", "wide" -> `1536x1024`
### Background Options (generation only)
- **auto** (default) - Model decides
- **transparent** - Transparent background (PNG/WebP output)
- **opaque** - Solid background
## API Key
The script checks for API key in this order:
1. `--api-key` argument (use if user provided key in chat)
2. `OPENAI_API_KEY` environment variable
If neither is available, the script exits with an error message.
## Filename Generation
Generate filenames with the pattern: `yyyy-mm-dd-hh-mm-ss-name.png`
**Format:** `{timestamp}-{descriptive-name}.png`
- Timestamp: Current date/time in format `yyyy-mm-dd-hh-mm-ss` (24-hour format)
- Name: Descriptive lowercase text with hyphens
- Keep the descriptive part concise (1-5 words typically)
- Use context from user's prompt or conversation
- If unclear, use random identifier (e.g., `x9k2`, `a7b3`)
Examples:
- Prompt "A serene Japanese garden" -> `2025-12-17-14-23-05-japanese-garden.png`
- Prompt "sunset over mountains" -> `2025-12-17-15-30-12-sunset-mountains.png`
- Prompt "create an image of a robot" -> `2025-12-17-16-45-33-robot.png`
- Unclear context -> `2025-12-17-17-12-48-x9k2.png`
## Image Editing
Both editing modes use the Image API (images.edit endpoint) with gpt-image-1.5 for reliable results.
### Without Mask (Full Image Edit)
When the user wants to modify an existing image without specifying exact regions:
1. Use `--input-image` parameter with the path to the image
2. The prompt should contain editing instructions (e.g., "make the sky more dramatic", "change to cartoon style")
3. A fully transparent mask is auto-generated, allowing the model to edit the entire image
### With Mask (Precise Inpainting)
When the user wants to edit specific regions:
1. Use `--input-image` parameter with the path to the image
2. Use `--mask` parameter with a PNG mask file
3. The mask should have transparent areas (alpha=0) where edits should occur
4. The prompt describes what should appear in the masked region
Common editing tasks: add/remove elements, change style, adjust colors, replace backgrounds, etc.
## Prompt Handling
**For generation:** Pass user's image description as-is to `--prompt`. Only rework if clearly insufficient.
**For editing:** Pass editing instructions in `--prompt` (e.g., "add a rainbow in the sky", "make it look like a watercolor painting")
Preserve user's creative intent in both cases.
## Output
- Saves PNG to current directory (or specified path if filename includes directory)
- Script outputs the full path to the generated image
- **Do not read the image back** - just inform the user of the saved path
## Examples
**Generate new image:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "A serene Japanese garden with cherry blossoms" --filename "2025-12-17-14-23-05-japanese-garden.png" --quality high --size 1536x1024
```
**Generate with transparent background:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "A cute cartoon cat mascot" --filename "2025-12-17-14-25-30-cat-mascot.png" --background transparent --quality high
```
**Edit existing image (full image):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "make the sky more dramatic with storm clouds" --filename "2025-12-17-14-27-00-dramatic-sky.png" --input-image "original-photo.jpg"
```
**Edit with mask (inpainting):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "a flamingo swimming" --filename "2025-12-17-14-30-00-lounge-flamingo.png" --input-image "lounge.png" --mask "mask.png"
```Related Skills
image-optimization-helper
Image Optimization Helper - Auto-activating skill for Frontend Development. Triggers on: image optimization helper, image optimization helper Part of the Frontend Development skill category.
azure-image-builder
Build Azure managed images and Azure Compute Gallery images with Packer. Use when creating custom images for Azure VMs.
java-add-graalvm-native-image-support
GraalVM Native Image expert that adds native image support to Java applications, builds the project, analyzes build errors, applies fixes, and iterates until successful compilation using Oracle best practices.
image-manipulation-image-magick
Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata. Use when working with images, creating thumbnails, resizing wallpapers, or performing batch image operations.
OpenAI Image Gen
Generate a handful of “random but structured” prompts and render them via the OpenAI Images API.
Nano Banana Pro (Gemini 3 Pro Image)
Use the bundled script to generate or edit images.
image-gen
AI 图片生成。通过 ModelScope API 生成图片,支持文生图、异步任务轮询、LoRA 风格叠加、自定义尺寸。当用户要求生成图片、画图、创建插画、制作海报配图时使用。
image-analysis
图片分析与识别,可分析本地图片、网络图片、视频、文件。适用于 OCR、物体识别、场景理解等。当用户发送图片或要求分析图片时必须使用此技能。
image-assistant
配图助手 - 把文章/模块内容转成统一风格、少字高可读的 16:9 信息图提示词;先定“需要几张图+每张讲什么”,再压缩文案与隐喻,最后输出可直接复制的生图提示词并迭代。
zimage-skill
Generate images using ModelScope Z-Image-Turbo API. Use when user asks to generate, create, or make images, pictures, or illustrations.
qwen-image
Generate and edit images with Alibaba Qwen-Image-2.0 models via inference.sh CLI. Models: Qwen-Image-2.0 (fast), Qwen-Image-2.0-Pro (professional text rendering). Capabilities: text-to-image, multi-image editing, complex text rendering. Triggers: qwen image, qwen-image, alibaba image, dashscope image, qwen image 2, qwen image pro
qwen-image-pro
Generate images with Alibaba Qwen-Image-2.0-Pro via inference.sh CLI. Professional text rendering, fine-grained realism, enhanced semantic adherence. Ideal for posters, banners, and text-heavy designs. Triggers: qwen image pro, qwen-image-pro, qwen 2 pro, alibaba image pro, dashscope pro, professional text rendering