openai-image-gen

Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.

411 stars

by understudy-ai

Best use case

openai-image-gen is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using openai-image-gen should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/openai-image-gen/SKILL.md --create-dirs "https://raw.githubusercontent.com/understudy-ai/understudy/main/skills/openai-image-gen/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/openai-image-gen/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How openai-image-gen Compares

Feature / Agent	openai-image-gen	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.

Where can I find the source code?

The source code is on GitHub in the understudy-ai/understudy repository, under skills/openai-image-gen.

SKILL.md Source

# OpenAI Image Gen

Generate a handful of “random but structured” prompts and render them via the OpenAI Images API.
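
One way to read "random but structured" is a fixed template whose slots are filled at random. The sketch below illustrates that idea only; the word lists, the template, and the function names are hypothetical and are not taken from `gen.py`.

```python
import random

# Hypothetical vocabulary; the real script's word lists are not shown here.
STYLES = ["ultra-detailed studio photo", "35mm film still", "isometric illustration"]
SUBJECTS = ["a lobster astronaut", "a brutalist lighthouse", "a vintage robot"]
LIGHTING = ["soft morning light", "neon rim lighting", "overcast daylight"]


def sample_prompt(rng: random.Random) -> str:
    """Fill one fixed template so every prompt shares the same structure."""
    return f"{rng.choice(STYLES)} of {rng.choice(SUBJECTS)}, {rng.choice(LIGHTING)}"


def sample_prompts(count: int, seed=None) -> list[str]:
    """Return `count` prompts; pass a seed to make the batch reproducible."""
    rng = random.Random(seed)
    return [sample_prompt(rng) for _ in range(count)]


if __name__ == "__main__":
    for prompt in sample_prompts(4, seed=0):
        print(prompt)
```

Seeding the generator keeps a batch repeatable, which helps when comparing models on the same prompts.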

## Run

Note: Image generation can take longer than common exec timeouts (for example 30 seconds).
When invoking this skill via Understudy’s exec tool, set a higher timeout to avoid premature termination/retries (e.g., exec timeout=300).

```bash
python3 {baseDir}/scripts/gen.py
open ~/Projects/tmp/openai-image-gen-*/index.html  # if ~/Projects/tmp exists; else ./tmp/...
```
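
The timeout advice above applies to any wrapper that launches the script. As a generic illustration outside Understudy (this is not Understudy's exec tool), a Python caller could enforce a 300-second limit with `subprocess.run`. The helper name `run_with_timeout` is an assumption, and the demo runs a stand-in command rather than `gen.py`.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout_s: float = 300.0) -> subprocess.CompletedProcess:
    """Run cmd; subprocess.TimeoutExpired is raised if it exceeds timeout_s."""
    return subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout_s, check=False
    )


if __name__ == "__main__":
    # Stand-in command; in practice this would be the gen.py invocation.
    result = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=300)
    print(result.returncode, result.stdout.strip())
```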

Useful flags:

```bash
# GPT image models with various options
python3 {baseDir}/scripts/gen.py --count 16 --model gpt-image-1
python3 {baseDir}/scripts/gen.py --prompt "ultra-detailed studio photo of a lobster astronaut" --count 4
python3 {baseDir}/scripts/gen.py --size 1536x1024 --quality high --out-dir ./out/images
python3 {baseDir}/scripts/gen.py --model gpt-image-1.5 --background transparent --output-format webp

# DALL-E 3 (note: count is automatically limited to 1)
python3 {baseDir}/scripts/gen.py --model dall-e-3 --quality hd --size 1792x1024 --style vivid
python3 {baseDir}/scripts/gen.py --model dall-e-3 --style natural --prompt "serene mountain landscape"

# DALL-E 2
python3 {baseDir}/scripts/gen.py --model dall-e-2 --size 512x512 --count 4
```

## Model-Specific Parameters

Different models support different parameter values. The script automatically selects appropriate defaults based on the model.

### Size

- **GPT image models** (`gpt-image-1`, `gpt-image-1-mini`, `gpt-image-1.5`): `1024x1024`, `1536x1024` (landscape), `1024x1536` (portrait), or `auto`
  - Default: `1024x1024`
- **dall-e-3**: `1024x1024`, `1792x1024`, or `1024x1792`
  - Default: `1024x1024`
- **dall-e-2**: `256x256`, `512x512`, or `1024x1024`
  - Default: `1024x1024`

### Quality

- **GPT image models**: `auto`, `high`, `medium`, or `low`
  - Default: `high`
- **dall-e-3**: `hd` or `standard`
  - Default: `standard`
- **dall-e-2**: `standard` only
  - Default: `standard`
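
The size and quality rules above can be expressed as a lookup table plus a small resolver. This is a minimal sketch, not the script's actual code: the allowed values and defaults are copied from the lists above, while the names `family` and `resolve` are hypothetical.

```python
GPT_IMAGE_MODELS = {"gpt-image-1", "gpt-image-1-mini", "gpt-image-1.5"}

# Allowed values per model family, copied from the Size and Quality lists.
SIZES = {
    "gpt-image": {"1024x1024", "1536x1024", "1024x1536", "auto"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
}
QUALITIES = {
    "gpt-image": {"auto", "high", "medium", "low"},
    "dall-e-3": {"hd", "standard"},
    "dall-e-2": {"standard"},
}
# (size, quality) defaults per family.
DEFAULTS = {
    "gpt-image": ("1024x1024", "high"),
    "dall-e-3": ("1024x1024", "standard"),
    "dall-e-2": ("1024x1024", "standard"),
}


def family(model: str) -> str:
    """Map a model name to the family key used in the tables."""
    if model in GPT_IMAGE_MODELS:
        return "gpt-image"
    if model in SIZES:
        return model
    raise ValueError(f"unknown model: {model}")


def resolve(model: str, size=None, quality=None) -> tuple[str, str]:
    """Fill in defaults and reject values the model does not accept."""
    fam = family(model)
    default_size, default_quality = DEFAULTS[fam]
    size = size or default_size
    quality = quality or default_quality
    if size not in SIZES[fam]:
        raise ValueError(f"{model} does not support size {size}")
    if quality not in QUALITIES[fam]:
        raise ValueError(f"{model} does not support quality {quality}")
    return size, quality


if __name__ == "__main__":
    print(resolve("dall-e-3", size="1792x1024", quality="hd"))
```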

### Other Notable Differences

- **dall-e-3** only supports generating 1 image at a time (`n=1`). The script automatically limits count to 1 when using this model.
- **GPT image models** support additional parameters:
  - `--background`: `transparent`, `opaque`, or `auto` (default)
  - `--output-format`: `png` (default), `jpeg`, or `webp`
  - Note: `stream` and `moderation` are available via API but not yet implemented in this script
- **dall-e-3** has a `--style` parameter: `vivid` (hyper-real, dramatic) or `natural` (more natural looking)
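
The differences above amount to clamping the count for `dall-e-3` and attaching model-specific options only where they apply. The sketch below shows one way to do that; the function name `build_request` is hypothetical, and raising on an unsupported flag is an assumption (the real script may ignore it instead).

```python
def build_request(model, prompt, count=1, background=None, output_format=None, style=None) -> dict:
    """Assemble request parameters, keeping only options the model supports."""
    params = {"model": model, "prompt": prompt, "n": count}
    is_gpt_image = model.startswith("gpt-image")

    if model == "dall-e-3":
        params["n"] = 1  # dall-e-3 generates one image per request

    if style is not None:
        if model != "dall-e-3":
            raise ValueError("--style is only supported by dall-e-3")
        if style not in {"vivid", "natural"}:
            raise ValueError(f"unknown style: {style}")
        params["style"] = style

    if background is not None:
        if not is_gpt_image:
            raise ValueError("--background is only supported by GPT image models")
        if background not in {"transparent", "opaque", "auto"}:
            raise ValueError(f"unknown background: {background}")
        params["background"] = background

    if output_format is not None:
        if not is_gpt_image:
            raise ValueError("--output-format is only supported by GPT image models")
        if output_format not in {"png", "jpeg", "webp"}:
            raise ValueError(f"unknown output format: {output_format}")
        params["output_format"] = output_format

    return params


if __name__ == "__main__":
    print(build_request("dall-e-3", "serene mountain landscape", count=4, style="natural"))
```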

## Output

- `*.png`, `*.jpeg`, or `*.webp` images (output format depends on model + `--output-format`)
- `prompts.json` (prompt → file mapping)
- `index.html` (thumbnail gallery)
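
To make the output layout concrete, here is a hedged sketch that writes a `prompts.json` mapping and a bare-bones `index.html` gallery. The record shape (`prompt` and `file` keys), the HTML markup, and the name `write_gallery` are assumptions for illustration; the files produced by `gen.py` may differ.

```python
import html
import json
import tempfile
from pathlib import Path


def write_gallery(out_dir: Path, items: list[dict]) -> None:
    """Write prompts.json and index.html for items shaped {"prompt", "file"}."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "prompts.json").write_text(json.dumps(items, indent=2), encoding="utf-8")

    figures = []
    for item in items:
        src = html.escape(item["file"], quote=True)
        caption = html.escape(item["prompt"])  # escape so prompts cannot inject markup
        figures.append(
            f'<figure><img src="{src}" width="256" loading="lazy">'
            f"<figcaption>{caption}</figcaption></figure>"
        )

    page = "\n".join(
        ["<!doctype html>", '<meta charset="utf-8">', "<title>openai-image-gen</title>", *figures]
    )
    (out_dir / "index.html").write_text(page + "\n", encoding="utf-8")


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp(prefix="openai-image-gen-"))
    write_gallery(demo_dir, [{"prompt": "a lobster astronaut", "file": "001.png"}])
    print(demo_dir / "index.html")
```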

Related Skills

All from understudy-ai/understudy (411 stars).

openai-whisper: Local speech-to-text with the Whisper CLI (no API key).

openai-whisper-api: Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

xurl: A CLI tool for making authenticated requests to the X (Twitter) API. Use this skill when you need to post tweets, reply, quote, search, read posts, manage followers, send DMs, upload media, or interact with any X API v2 endpoint.

weather: Get current weather and forecasts via wttr.in or Open-Meteo. Use when: user asks about weather, temperature, or forecasts for any location. NOT for: historical weather data, severe weather alerts, or detailed meteorological analysis. No API key needed.

wacli: Send WhatsApp messages to other people or search/sync WhatsApp history via the wacli CLI (not for normal user chats).

video-frames: Extract frames or short clips from videos using ffmpeg.

trello: Manage Trello boards, lists, and cards via the Trello REST API.

tmux: Remote-control tmux sessions for interactive CLIs by sending keystrokes and scraping pane output.

things-mac: Manage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database). Use when a user asks Understudy to add a task to Things, list inbox/today/upcoming, search tasks, or inspect projects/areas/tags.

summarize: Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).

spotify-player: Terminal Spotify playback/search via spogo (preferred) or spotify_player.

sonoscli: Control Sonos speakers (discover/status/play/volume/group).