image-generator
Generate professional visuals using Gemini via browser automation with 6-gate quality control. Use when creating chapter illustrations, diagrams, or teaching visuals. NOT for stock photos or decorative images.
231 stars
Installation
Claude Code / Cursor / Codex
$ curl -o ~/.claude/skills/image-generator/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/92bilal26/image-generator/SKILL.md"
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/image-generator/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How image-generator Compares
| Feature / Agent | image-generator | Standard Approach |
|---|---|---|
| Platform Support | multi | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Which AI agents support this skill?
This skill is compatible with multiple AI agents, including Claude Code, Cursor, and Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Image Generator
Generate professional teaching visuals using Gemini 3 with multi-turn reasoning partnership.
## Quick Start
```bash
# 1. Start browser (via browser-use skill)
bash .claude/skills/browser-use/scripts/start-server.sh
# 2. Navigate to Gemini
# Use browser_navigate to https://gemini.google.com/
# 3. Generate image from creative brief
# Paste creative brief → Wait 30-35s → Verify 6 gates → Download
```
## Core Principles
1. **Reasoning over prediction** - Creative briefs (Story/Intent/Metaphor) activate reasoning; pixel specs don't
2. **Multi-turn partnership** - Teach Gemini your standards through principle-based feedback
3. **6-gate quality** - Explicit pass/fail before download
4. **Autonomous batch** - No permission-asking between visuals
## Input: Creative Brief Format
Receive the creative brief from visual-asset-workflow:
```markdown
## The Story
[Narrative about what's visualized]
## Emotional Intent
[What it should FEEL like]
## Visual Metaphor
[Universal concept for instant comprehension]
## Subject / Composition / Action / Location / Style
[Gemini 3 prompt structure]
## Color Semantics
Blue (#2563eb) = Authority | Green (#10b981) = Execution
## Typography Hierarchy
Largest: Key insight | Medium: Supporting | Smallest: Context
```
**Do NOT convert to pixel specs** - use as-is to activate reasoning.
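Before pasting, it can help to confirm that a brief still carries every section of the format above. The following is a minimal sketch, assuming the brief is a markdown string with `## ` headings; `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, not part of the skill.

```python
# Section headings copied from the creative brief format above.
REQUIRED_SECTIONS = [
    "The Story",
    "Emotional Intent",
    "Visual Metaphor",
    "Subject / Composition / Action / Location / Style",
    "Color Semantics",
    "Typography Hierarchy",
]


def missing_sections(brief: str) -> list[str]:
    """Return the required headings that do not appear in the brief."""
    headings = {
        line[3:].strip()
        for line in brief.splitlines()
        if line.startswith("## ")
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

An empty result means the brief can be pasted as-is; the check only looks at headings and never rewrites the brief content.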
## Workflow (Per Visual)
| Step | Action | Tool |
|------|--------|------|
| 1 | Navigate to gemini.google.com | browser_navigate |
| 2 | Select "🍌 Create Image" | browser_click |
| 3 | Paste creative brief | browser_type |
| 4 | Wait 30-35 seconds | browser_wait_for |
| 5 | Verify 6 gates (below) | Visual inspection |
| 6 | If fail: Iterate with feedback (max 3) | browser_type |
| 7 | If pass: Download full size | browser_click |
| 8 | Copy to `apps/learn-app/static/img/part-{N}/chapter-{NN}/` | Bash |
| 9 | Embed in lesson immediately | Edit |
| 10 | NEW CHAT for next visual | browser_navigate |
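Step 8 can be sketched in Python as follows. This is an illustration only: it assumes `{NN}` means a two-digit, zero-padded chapter number, and the helper names `destination_dir` and `copy_image` are hypothetical.

```python
from pathlib import Path
import shutil

IMG_ROOT = "apps/learn-app/static/img"  # path prefix taken from step 8


def destination_dir(part: int, chapter: int, root: str = IMG_ROOT) -> Path:
    """Build <root>/part-{N}/chapter-{NN}."""
    # Assumption: {NN} is the chapter number zero-padded to two digits.
    return Path(root) / f"part-{part}" / f"chapter-{chapter:02d}"


def copy_image(downloaded: Path, part: int, chapter: int, root: str = IMG_ROOT) -> Path:
    """Copy a downloaded image into its part/chapter folder, creating it if needed."""
    dest = destination_dir(part, chapter, root)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(downloaded, dest / downloaded.name))
```

For example, `destination_dir(3, 7)` yields `apps/learn-app/static/img/part-3/chapter-07`.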
## Quality Gates (ALL Must Pass)
| Gate | Criterion | Fail Action |
|------|-----------|-------------|
| 1. Spelling | 99% accuracy, including proper nouns (e.g. Y-Combinator, Kubernetes) | Iterate |
| 2. Layout | Proportions match prompt (2×2 not 3×1) | Iterate |
| 3. Color | Brand colors match (#2563eb not #002050) | Iterate |
| 4. Typography | Largest = key concept (not decoration) | Iterate |
| 5. Teaching | <5 sec concept grasp at target proficiency | Iterate |
| 6. Uniqueness | Not duplicate of existing chapter image | New chat |
**Decision**: ALL pass → Download | ANY fail → Iterate (max 3 tries)
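The decision rule can be written down as a small function. This is a sketch under stated assumptions: gate results arrive as a name-to-boolean mapping, a missing gate counts as a failure, and a visual that still fails after three tries is marked `"defer"` (borrowed from the "Deferred" line of the batch report; the skill does not spell out this case). The names `GATES`, `MAX_TRIES`, and `decide` are hypothetical.

```python
GATES = ["Spelling", "Layout", "Color", "Typography", "Teaching", "Uniqueness"]
MAX_TRIES = 3  # "max 3 tries" from the decision rule


def decide(results: dict[str, bool], tries_used: int) -> str:
    """Map gate results to the next action for one visual."""
    # A gate missing from the results is treated as failed (assumption).
    failed = [gate for gate in GATES if not results.get(gate, False)]
    if not failed:
        return "download"
    if tries_used >= MAX_TRIES:
        return "defer"  # assumption: out of tries means defer, not download
    if "Uniqueness" in failed:
        return "new_chat"  # gate 6 fail action is a new chat
    return "iterate"
```

Checking the try budget before the fail action keeps the loop bounded even when the same gate fails repeatedly.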
## Iteration: Principle-Based Feedback
When a gate fails, provide teaching feedback:
```
Gate 4 FAILED: Typography hierarchy incorrect
The largest text is "$100K" (supporting detail) but should be "$3T"
(key insight students must grasp).
Increase '$3T' to dominant size. Reduce '$100K' to supporting size.
Information importance drives sizing.
```
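The feedback above follows a fixed shape: the failed gate, what is wrong, the concrete fix, and the principle behind it. A minimal sketch of a formatter for that shape, assuming those four parts are written by hand for each failure; `gate_feedback` is a hypothetical helper name.

```python
def gate_feedback(gate: int, summary: str, problem: str, fix: str, principle: str) -> str:
    """Assemble a principle-based feedback message for a failed gate."""
    return "\n".join([
        f"Gate {gate} FAILED: {summary}",
        problem,    # what the image shows versus what it should show
        fix,        # the concrete change requested
        principle,  # the general rule to apply on later attempts
    ])


message = gate_feedback(
    4,
    "Typography hierarchy incorrect",
    'The largest text is "$100K" (supporting detail) but should be "$3T".',
    "Increase '$3T' to dominant size. Reduce '$100K' to supporting size.",
    "Information importance drives sizing.",
)
```

Keeping the principle as its own final line makes it harder to send a fix-only message, which would correct one image without teaching the standard.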
## Batch Mode
When invoked with "generate all visuals":
```
For EACH visual in list:
A. NEW CHAT (context isolation)
B. Generate (paste brief)
C. Verify 6 gates
D. Iterate if needed (max 3)
E. Download when pass
F. Embed in lesson
G. Log "✅ N/M"
H. NEXT (no stopping)
```
**Never ask**: "Continue?" "Pause here?" "Review?"
**Report at END only**:
```
BATCH COMPLETE
✅ Generated: 16/18
⚠️ Deferred: 2 (quality issues)
Location: apps/learn-app/static/img/part-{N}/
```
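The batch loop and end-of-run report can be sketched as below. This is not the skill's implementation: `generate_and_verify` is a hypothetical callable standing in for steps A to F (new chat, generate, verify, iterate, download, embed), returning `True` when a visual passes all gates within its tries.

```python
from typing import Callable


def run_batch(visuals: list[str], generate_and_verify: Callable[[str], bool]) -> str:
    """Process every visual without pausing, then return the final report."""
    generated: list[str] = []
    deferred: list[str] = []
    for visual in visuals:
        # No prompt for confirmation between items: failures are deferred.
        if generate_and_verify(visual):
            generated.append(visual)
        else:
            deferred.append(visual)
    lines = ["BATCH COMPLETE", f"✅ Generated: {len(generated)}/{len(visuals)}"]
    if deferred:
        lines.append(f"⚠️ Deferred: {len(deferred)} (quality issues)")
    return "\n".join(lines)
```

Collecting deferred items instead of stopping on them is what lets the report appear only once, at the end.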
## Proficiency Limits
| Level | Max Elements | Grasp Time |
|-------|--------------|------------|
| A2 | 5-7 | <5 sec |
| B1 | 7-10 | <10 sec |
| C2 | No limit | N/A |
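The table can be applied as a simple lookup when planning a visual. A sketch, assuming the upper end of each range is the hard cap and `None` encodes "No limit"; `MAX_ELEMENTS` and `within_limit` are hypothetical names.

```python
from typing import Optional

# Upper bound of each "Max Elements" range; None means no limit.
MAX_ELEMENTS: dict[str, Optional[int]] = {"A2": 7, "B1": 10, "C2": None}


def within_limit(level: str, element_count: int) -> bool:
    """Check whether a visual's element count fits the proficiency level."""
    limit = MAX_ELEMENTS[level]
    return limit is None or element_count <= limit
```

An unknown level raises `KeyError` rather than silently passing, so a typo in the level name is caught early.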
## Token Conservation (Batch Mode)
For >8 visuals, condense briefs:
**Original** (250 tokens):
```
"Top Layer shows Coordinator at center top with label 'Orchestrator'
featuring conductor icon, with role 'Strategic oversight'..."
```
**Condensed** (80 tokens):
```
"Top Layer - Coordinator: Center top, 'Orchestrator' (conductor),
Role: 'Strategic oversight', Gold (#fbbf24), Large hexagon."
```
- Keep: Story, Intent, Metaphor, Colors, Reasoning
- Condense: Long examples → Short labels
## Anti-Patterns
| Don't | Why |
|-------|-----|
| Accept first output without 6 gates | Quality standard violation |
| Ask permission between batch items | Breaks autonomous agency |
| Convert briefs to pixel specs | Defeats reasoning activation |
| Skip embedding step | Creates orphan images |
| Reuse same chat for next visual | Context contamination |
## Session Interruption
If the session ends mid-batch, create a checkpoint:
```markdown
# Checkpoint: Part {N}
Status: INTERRUPTED at 8/18
## Completed:
- ✅ Image 1: filename (embedded lesson-01.md)
- ✅ Image 2: filename (embedded lesson-02.md)
## Remaining:
- ⏳ Image 8: filename
```
On continuation: Read checkpoint → Resume → Update incrementally
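Writing and reading the checkpoint can be sketched as follows. The layout mirrors the sample above; reading the status counter as the number of the first unfinished image is an assumption drawn from that sample, and both function names are hypothetical.

```python
def render_checkpoint(part: int, total: int,
                      completed: list[tuple[str, str]],
                      remaining: list[str]) -> str:
    """Render a checkpoint; completed holds (filename, lesson) pairs."""
    # Assumption: the counter names the first unfinished image, as in the sample.
    current = len(completed) + 1
    lines = [
        f"# Checkpoint: Part {part}",
        f"Status: INTERRUPTED at {current}/{total}",
        "## Completed:",
    ]
    for index, (filename, lesson) in enumerate(completed, start=1):
        lines.append(f"- ✅ Image {index}: {filename} (embedded {lesson})")
    lines.append("## Remaining:")
    for index, filename in enumerate(remaining, start=current):
        lines.append(f"- ⏳ Image {index}: {filename}")
    return "\n".join(lines) + "\n"


def remaining_from_checkpoint(text: str) -> list[str]:
    """Return the filenames still to generate, in checkpoint order."""
    return [
        line.split(": ", 1)[1]
        for line in text.splitlines()
        if line.startswith("- ⏳ ")
    ]
```

On continuation, `remaining_from_checkpoint` gives the list to resume from, and re-rendering after each image keeps the file updated incrementally.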
## Success Indicators
- ✅ All 6 gates verified before download
- ✅ Batch completion without permission-asking
- ✅ Principle-based iteration feedback
- ✅ Images organized by part/chapter
- ✅ Immediate embedding (no orphans)
- ✅ >85% production-ready rate
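The last indicator is a simple ratio. A minimal sketch, assuming "production-ready rate" means generated visuals divided by total visuals in a batch (the skill does not define it); the function names are hypothetical.

```python
def production_ready_rate(generated: int, total: int) -> float:
    """Fraction of visuals in a batch that passed all gates."""
    if total <= 0:
        raise ValueError("total must be positive")
    return generated / total


def meets_target(generated: int, total: int, target: float = 0.85) -> bool:
    """True when the rate is strictly above the target (">85%")."""
    return production_ready_rate(generated, total) > target
```

With the batch report figures from earlier (16 of 18), the rate is about 0.89, which clears the target.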