gpt-image-1-5

Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.

25 stars

Best use case

gpt-image-1-5 is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.

Teams using gpt-image-1-5 should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/gpt-image-1-5/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/intellectronica/agent-skills/gpt-image-1-5/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/gpt-image-1-5/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How gpt-image-1-5 Compares

Feature / Agentgpt-image-1-5Standard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# GPT Image 1.5 - Image Generation & Editing

Generate new images or edit existing ones using OpenAI's GPT Image 1.5 model.

- **Generation**: Uses the Responses API with image_generation tool
- **Editing**: Uses the Image API for reliable mask-based inpainting

## Usage

Run the script using absolute path (do NOT cd to skill directory first):

**Generate new image:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "your image description" --filename "output-name.png" [--quality low|medium|high] [--size 1024x1024|1024x1536|1536x1024|auto] [--background transparent|opaque|auto] [--api-key KEY]
```

**Edit existing image (without mask - full image edit):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "editing instructions" --filename "output-name.png" --input-image "path/to/input.png" [--size 1024x1024|1024x1536|1536x1024|auto] [--api-key KEY]
```

**Edit existing image (with mask - precise inpainting):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "what to put in masked area" --filename "output-name.png" --input-image "path/to/input.png" --mask "path/to/mask.png" [--size 1024x1024|1024x1536|1536x1024|auto] [--api-key KEY]
```

**Important:** Always run from the user's current working directory so images are saved where the user is working, not in the skill directory.

## Parameters

### Quality Options
- **low** - Fastest generation, lower quality
- **medium** (default) - Balanced quality and speed
- **high** - Best quality, slower generation

Map user requests:
- No mention of quality -> `medium`
- "quick", "fast", "draft" -> `low`
- "high quality", "best", "detailed", "high-res" -> `high`

### Size Options
- **1024x1024** (default) - Square format
- **1024x1536** - Portrait format
- **1536x1024** - Landscape format
- **auto** - Let the model decide based on prompt

Map user requests:
- No mention of size -> `1024x1024`
- "square" -> `1024x1024`
- "portrait", "vertical", "tall" -> `1024x1536`
- "landscape", "horizontal", "wide" -> `1536x1024`

### Background Options (generation only)
- **auto** (default) - Model decides
- **transparent** - Transparent background (PNG/WebP output)
- **opaque** - Solid background

## API Key

The script checks for API key in this order:
1. `--api-key` argument (use if user provided key in chat)
2. `OPENAI_API_KEY` environment variable

If neither is available, the script exits with an error message.

## Filename Generation

Generate filenames with the pattern: `yyyy-mm-dd-hh-mm-ss-name.png`

**Format:** `{timestamp}-{descriptive-name}.png`
- Timestamp: Current date/time in format `yyyy-mm-dd-hh-mm-ss` (24-hour format)
- Name: Descriptive lowercase text with hyphens
- Keep the descriptive part concise (1-5 words typically)
- Use context from user's prompt or conversation
- If unclear, use random identifier (e.g., `x9k2`, `a7b3`)

Examples:
- Prompt "A serene Japanese garden" -> `2025-12-17-14-23-05-japanese-garden.png`
- Prompt "sunset over mountains" -> `2025-12-17-15-30-12-sunset-mountains.png`
- Prompt "create an image of a robot" -> `2025-12-17-16-45-33-robot.png`
- Unclear context -> `2025-12-17-17-12-48-x9k2.png`

## Image Editing

Both editing modes use the Image API (images.edit endpoint) with gpt-image-1.5 for reliable results.

### Without Mask (Full Image Edit)
When the user wants to modify an existing image without specifying exact regions:
1. Use `--input-image` parameter with the path to the image
2. The prompt should contain editing instructions (e.g., "make the sky more dramatic", "change to cartoon style")
3. A fully transparent mask is auto-generated, allowing the model to edit the entire image

### With Mask (Precise Inpainting)
When the user wants to edit specific regions:
1. Use `--input-image` parameter with the path to the image
2. Use `--mask` parameter with a PNG mask file
3. The mask should have transparent areas (alpha=0) where edits should occur
4. The prompt describes what should appear in the masked region

Common editing tasks: add/remove elements, change style, adjust colors, replace backgrounds, etc.

## Prompt Handling

**For generation:** Pass user's image description as-is to `--prompt`. Only rework if clearly insufficient.

**For editing:** Pass editing instructions in `--prompt` (e.g., "add a rainbow in the sky", "make it look like a watercolor painting")

Preserve user's creative intent in both cases.

## Output

- Saves PNG to current directory (or specified path if filename includes directory)
- Script outputs the full path to the generated image
- **Do not read the image back** - just inform the user of the saved path

## Examples

**Generate new image:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "A serene Japanese garden with cherry blossoms" --filename "2025-12-17-14-23-05-japanese-garden.png" --quality high --size 1536x1024
```

**Generate with transparent background:**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "A cute cartoon cat mascot" --filename "2025-12-17-14-25-30-cat-mascot.png" --background transparent --quality high
```

**Edit existing image (full image):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "make the sky more dramatic with storm clouds" --filename "2025-12-17-14-27-00-dramatic-sky.png" --input-image "original-photo.jpg"
```

**Edit with mask (inpainting):**
```bash
uv run ~/.claude/skills/gpt-image-1-5/scripts/generate_image.py --prompt "a flamingo swimming" --filename "2025-12-17-14-30-00-lounge-flamingo.png" --input-image "lounge.png" --mask "mask.png"
```

Related Skills

image-optimization-helper

25
from ComeOnOliver/skillshub

Image Optimization Helper - Auto-activating skill for Frontend Development. Triggers on: image optimization helper, image optimization helper Part of the Frontend Development skill category.

azure-image-builder

25
from ComeOnOliver/skillshub

Build Azure managed images and Azure Compute Gallery images with Packer. Use when creating custom images for Azure VMs.

java-add-graalvm-native-image-support

25
from ComeOnOliver/skillshub

GraalVM Native Image expert that adds native image support to Java applications, builds the project, analyzes build errors, applies fixes, and iterates until successful compilation using Oracle best practices.

image-manipulation-image-magick

25
from ComeOnOliver/skillshub

Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata. Use when working with images, creating thumbnails, resizing wallpapers, or performing batch image operations.

OpenAI Image Gen

25
from ComeOnOliver/skillshub

Generate a handful of “random but structured” prompts and render them via the OpenAI Images API.

Nano Banana Pro (Gemini 3 Pro Image)

25
from ComeOnOliver/skillshub

Use the bundled script to generate or edit images.

image-gen

25
from ComeOnOliver/skillshub

AI 图片生成。通过 ModelScope API 生成图片,支持文生图、异步任务轮询、LoRA 风格叠加、自定义尺寸。当用户要求生成图片、画图、创建插画、制作海报配图时使用。

image-analysis

25
from ComeOnOliver/skillshub

图片分析与识别,可分析本地图片、网络图片、视频、文件。适用于 OCR、物体识别、场景理解等。当用户发送图片或要求分析图片时必须使用此技能。

image-assistant

25
from ComeOnOliver/skillshub

配图助手 - 把文章/模块内容转成统一风格、少字高可读的 16:9 信息图提示词;先定“需要几张图+每张讲什么”,再压缩文案与隐喻,最后输出可直接复制的生图提示词并迭代。

zimage-skill

25
from ComeOnOliver/skillshub

Generate images using ModelScope Z-Image-Turbo API. Use when user asks to generate, create, or make images, pictures, or illustrations.

qwen-image

25
from ComeOnOliver/skillshub

Generate and edit images with Alibaba Qwen-Image-2.0 models via inference.sh CLI. Models: Qwen-Image-2.0 (fast), Qwen-Image-2.0-Pro (professional text rendering). Capabilities: text-to-image, multi-image editing, complex text rendering. Triggers: qwen image, qwen-image, alibaba image, dashscope image, qwen image 2, qwen image pro

qwen-image-pro

25
from ComeOnOliver/skillshub

Generate images with Alibaba Qwen-Image-2.0-Pro via inference.sh CLI. Professional text rendering, fine-grained realism, enhanced semantic adherence. Ideal for posters, banners, and text-heavy designs. Triggers: qwen image pro, qwen-image-pro, qwen 2 pro, alibaba image pro, dashscope pro, professional text rendering