ai-generation-client

External AI API integration with retry logic, rate limiting, content safety detection, and multi-turn conversation support for image generation.


by diegosouzapw


Best use case

ai-generation-client is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using ai-generation-client can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ai-generation-client/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/content-media/ai-generation-client/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/ai-generation-client/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How ai-generation-client Compares

Feature / Agent	ai-generation-client	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

External AI API integration with retry logic, rate limiting, content safety detection, and multi-turn conversation support for image generation.

Where can I find the source code?

The source code is in the diegosouzapw/awesome-omni-skill repository on GitHub, under skills/content-media/ai-generation-client/.

SKILL.md Source

# AI Generation Client

Robust AI API integration with retry logic and content safety.

## When to Use This Skill

- Integrating with AI generation APIs (Gemini, OpenAI, etc.)
- Adding retry logic for flaky AI services
- Handling rate limits gracefully
- Detecting content policy violations
- Supporting multi-turn refinements

## Core Concepts

AI API integration requires:
1. **Exponential backoff** - Retry with increasing delays
2. **Rate limit handling** - Respect Retry-After headers
3. **Content safety** - Detect and handle policy violations
4. **Multi-turn context** - Enable cheaper refinements
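
The first two concepts can be combined in one small helper. The following is a minimal sketch, not the client's actual code: `next_delay` and its parameters are hypothetical names, and the delay table simply mirrors the 1, 2, 4 second schedule used in the implementation further down.

```python
from typing import Optional, Sequence

RETRY_DELAYS = (1, 2, 4)  # seconds; each retry waits twice as long as the last


def next_delay(attempt: int, retry_after: Optional[int] = None,
               delays: Sequence[int] = RETRY_DELAYS) -> int:
    """Return how many seconds to wait before retrying (attempt is 0-based)."""
    if retry_after is not None and retry_after > 0:
        return retry_after  # a server-provided Retry-After wins over our schedule
    # Clamp the index so attempts past the end of the table reuse the largest delay.
    return delays[min(attempt, len(delays) - 1)]
```

Clamping the index is a design choice made for this sketch: it keeps the helper safe if a caller allows more retries than there are entries in the table.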

## Implementation

### Python

```python
import asyncio
import base64
import time
import uuid
from dataclasses import dataclass
from typing import Optional, List
import aiohttp


@dataclass
class GenerationRequest:
    prompt: str
    width: int
    height: int
    model: str = "gemini-2.0-flash-exp"
    seed: Optional[int] = None
    input_image: Optional[bytes] = None
    conversation_history: Optional[List[dict]] = None


@dataclass
class GenerationResponse:
    image_data: bytes
    generation_id: str
    seed: int
    inference_time_ms: int
    thought_signature: Optional[bytes] = None


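class RateLimitError(Exception):
    def __init__(self, retry_after: int = 60):
        # Pass a message to Exception so str(e) and tracebacks are informative.
        super().__init__(f"Rate limit exceeded. Retry after {retry_after} seconds.")
        self.retry_after = retry_after


class ContentPolicyError(Exception):
    def __init__(self, reason: str = "Content violates usage policies"):
        super().__init__(f"Content policy violation: {reason}")
        self.reason = reason


class GenerationError(Exception):
    def __init__(self, message: str, details: Optional[dict] = None):
        super().__init__(message)
        self.message = message
        self.details = details or {}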


class AIGenerationClient:
    """Async client for AI generation APIs with retry logic."""
    
    RETRY_DELAYS = [1, 2, 4]  # Exponential backoff
    BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
    
    STRICT_CONSTRAINT = """STRICT RULES:
1. CREATE ORIGINAL ART - Do NOT use screenshots or existing images.
2. TEXT RENDERING - Render ALL text EXACTLY as written.
3. QUANTITIES - If prompt says "3 items" render EXACTLY 3.
4. NO ADDITIONS - Do NOT add elements not mentioned.
"""
    
    def __init__(self, api_key: str, timeout: int = 120, max_retries: int = 3):
        self.api_key = api_key
        self.timeout = timeout
        self.max_retries = min(max_retries, len(self.RETRY_DELAYS))
        self._session: Optional[aiohttp.ClientSession] = None
    
    async def _get_session(self) -> aiohttp.ClientSession:
        if self._session is None or self._session.closed:
            self._session = aiohttp.ClientSession(
                timeout=aiohttp.ClientTimeout(total=self.timeout)
            )
        return self._session
    
    async def close(self):
        if self._session and not self._session.closed:
            await self._session.close()
    
    async def generate(self, request: GenerationRequest) -> GenerationResponse:
        """Generate with exponential backoff retry."""
        last_exception = None
        
        for attempt in range(self.max_retries):
            try:
                return await self._execute_generation(request)
            
            except ContentPolicyError:
                raise  # Don't retry content policy violations
            
            except RateLimitError as e:
                last_exception = e
                delay = e.retry_after if e.retry_after else self.RETRY_DELAYS[attempt]
                if attempt < self.max_retries - 1:
                    await asyncio.sleep(delay)
                    continue
                raise
            
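            except (GenerationError, asyncio.TimeoutError, aiohttp.ClientError) as e:
                # aiohttp.ClientError covers connection-level failures, which are also transient.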
                last_exception = e
                if attempt < self.max_retries - 1:
                    await asyncio.sleep(self.RETRY_DELAYS[attempt])
                    continue
                raise
        
        raise last_exception or GenerationError("Generation failed after all retries")
    
    async def _execute_generation(self, request: GenerationRequest) -> GenerationResponse:
        generation_id = str(uuid.uuid4())
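        # Compare against None so an explicit seed of 0 is respected.
        # The seed is only recorded in the response; it is not sent in the request body below.
        used_seed = request.seed if request.seed is not None else int(time.time() * 1000) % (2**31)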
        start_time = time.time()
        
        # Build prompt with constraints
        constrained_prompt = f"{self.STRICT_CONSTRAINT}{request.prompt}\n\nGenerate as {request.width}x{request.height} pixels."
        
        parts = []
        if request.input_image:
            parts.append({
                "inlineData": {
                    "mimeType": "image/png",
                    "data": base64.b64encode(request.input_image).decode()
                }
            })
        parts.append({"text": constrained_prompt})
        
        # Handle multi-turn conversation
        if request.conversation_history:
            contents = self._build_multi_turn(request.conversation_history, request.prompt, request.width, request.height)
        else:
            contents = [{"parts": parts}]
        
        request_body = {
            "contents": contents,
            "generationConfig": {
                "responseModalities": ["IMAGE", "TEXT"],
            },
            "safetySettings": [
                {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
                {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
                {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
                {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
            ]
        }
        
        url = f"{self.BASE_URL}/models/{request.model}:generateContent"
        headers = {"Content-Type": "application/json", "x-goog-api-key": self.api_key}
        
        session = await self._get_session()
        async with session.post(url, json=request_body, headers=headers) as response:
            inference_time_ms = int((time.time() - start_time) * 1000)
            
            if response.status == 200:
                data = await response.json()
                image_data, thought_sig = self._extract_image(data)
                return GenerationResponse(
                    image_data=image_data,
                    generation_id=generation_id,
                    seed=used_seed,
                    inference_time_ms=inference_time_ms,
                    thought_signature=thought_sig,
                )
            
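            elif response.status == 429:
                # Retry-After may be an HTTP date rather than seconds; fall back to 60.
                try:
                    retry_after = int(response.headers.get("Retry-After", 60))
                except ValueError:
                    retry_after = 60
                raise RateLimitError(retry_after=retry_after)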
            
            elif response.status == 400:
                error_data = await response.json()
                error_str = str(error_data).lower()
                if any(term in error_str for term in ["safety", "blocked", "policy"]):
                    raise ContentPolicyError(reason=str(error_data))
                raise GenerationError(f"Bad request: {error_data}")
            
            else:
                error_text = await response.text()
                raise GenerationError(f"API error {response.status}: {error_text}")
    
    def _build_multi_turn(self, history: List[dict], prompt: str, width: int, height: int) -> List[dict]:
        contents = []
        for turn in history:
            parts = []
            if turn.get("text"):
                parts.append({"text": turn["text"]})
            if turn.get("image_data"):
                image_b64 = base64.b64encode(turn["image_data"]).decode() if isinstance(turn["image_data"], bytes) else turn["image_data"]
                parts.append({"inlineData": {"mimeType": "image/png", "data": image_b64}})
            if parts:
                contents.append({"role": turn.get("role", "user"), "parts": parts})
        
        contents.append({
            "role": "user",
            "parts": [{"text": f"Refinement: {prompt}\n\nKeep at {width}x{height} pixels."}]
        })
        return contents
    
    def _extract_image(self, data: dict) -> tuple:
        candidates = data.get("candidates", [])
        if not candidates:
            raise GenerationError("No image generated")
        
        parts = candidates[0].get("content", {}).get("parts", [])
        for part in parts:
            if "inlineData" in part and "data" in part["inlineData"]:
                image_data = base64.b64decode(part["inlineData"]["data"])
                thought_sig = base64.b64decode(part["thoughtSignature"]) if "thoughtSignature" in part else None
                return image_data, thought_sig
        
        raise GenerationError("No image data in response")
```

### TypeScript

```typescript
interface GenerationRequest {
  prompt: string;
  width: number;
  height: number;
  model?: string;
  seed?: number;
  inputImage?: Buffer;
  conversationHistory?: Array<{ role: string; text?: string; imageData?: Buffer }>;
}

interface GenerationResponse {
  imageData: Buffer;
  generationId: string;
  seed: number;
  inferenceTimeMs: number;
  thoughtSignature?: Buffer;
}

class RateLimitError extends Error {
  constructor(public retryAfter: number = 60) {
    super(`Rate limit exceeded. Retry after ${retryAfter} seconds.`);
  }
}

class ContentPolicyError extends Error {
  constructor(public reason: string = "Content violates usage policies") {
    super(`Content policy violation: ${reason}`);
  }
}

class AIGenerationClient {
  private static RETRY_DELAYS = [1000, 2000, 4000];
  
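  private readonly maxRetries: number;

  constructor(
    private apiKey: string,
    private timeout: number = 120000,
    maxRetries: number = 3
  ) {
    // Clamp so every retry attempt has a matching entry in RETRY_DELAYS.
    this.maxRetries = Math.min(maxRetries, AIGenerationClient.RETRY_DELAYS.length);
  }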

  async generate(request: GenerationRequest): Promise<GenerationResponse> {
    let lastError: Error | null = null;

    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      try {
        return await this.executeGeneration(request);
      } catch (error) {
        if (error instanceof ContentPolicyError) throw error;
        
        lastError = error as Error;
        if (attempt < this.maxRetries - 1) {
          const delay = error instanceof RateLimitError
            ? error.retryAfter * 1000
            : AIGenerationClient.RETRY_DELAYS[attempt];
          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    }

    throw lastError || new Error('Generation failed after all retries');
  }

  private async executeGeneration(request: GenerationRequest): Promise<GenerationResponse> {
    const generationId = crypto.randomUUID();
    const seed = request.seed ?? Math.floor(Date.now() % (2 ** 31));
    const startTime = Date.now();

    const response = await fetch(
      `https://generativelanguage.googleapis.com/v1beta/models/${request.model || 'gemini-2.0-flash-exp'}:generateContent`,
      {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'x-goog-api-key': this.apiKey,
        },
        body: JSON.stringify(this.buildRequestBody(request)),
        signal: AbortSignal.timeout(this.timeout),
      }
    );

    const inferenceTimeMs = Date.now() - startTime;

    if (response.status === 429) {
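      // Retry-After may be an HTTP date; fall back to 60 seconds if it is not a number.
      const parsed = Number.parseInt(response.headers.get('Retry-After') ?? '', 10);
      throw new RateLimitError(Number.isNaN(parsed) ? 60 : parsed);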
    }

    if (response.status === 400) {
      const error = await response.json();
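      // Match the same terms as the Python client: safety, blocked, policy.
      const errorText = JSON.stringify(error).toLowerCase();
      if (['safety', 'blocked', 'policy'].some(term => errorText.includes(term))) {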
        throw new ContentPolicyError(JSON.stringify(error));
      }
      throw new Error(`Bad request: ${JSON.stringify(error)}`);
    }

    if (!response.ok) {
      throw new Error(`API error ${response.status}`);
    }

    const data = await response.json();
    const imageData = this.extractImage(data);

    return { imageData, generationId, seed, inferenceTimeMs };
  }

  private buildRequestBody(request: GenerationRequest): object {
    const parts: any[] = [];
    
    if (request.inputImage) {
      parts.push({
        inlineData: {
          mimeType: 'image/png',
          data: request.inputImage.toString('base64'),
        },
      });
    }
    
    parts.push({ text: request.prompt });

    return {
      contents: [{ parts }],
      generationConfig: { responseModalities: ['IMAGE', 'TEXT'] },
    };
  }

  private extractImage(data: any): Buffer {
    const parts = data.candidates?.[0]?.content?.parts || [];
    for (const part of parts) {
      if (part.inlineData?.data) {
        return Buffer.from(part.inlineData.data, 'base64');
      }
    }
    throw new Error('No image data in response');
  }
}
```

## Usage Examples

### Basic Generation

```python
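import asyncio

async def main():
    client = AIGenerationClient(api_key="your-key")
    try:
        response = await client.generate(GenerationRequest(
            prompt="A cute cartoon banana mascot waving",
            width=512,
            height=512,
        ))
        # response.image_data holds the decoded image bytes
    finally:
        await client.close()  # release the aiohttp session

asyncio.run(main())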
```

### Multi-Turn Refinement

```python
# Run this inside an async function; top-level await is not valid in a script.
# First generation
response1 = await client.generate(GenerationRequest(
    prompt="Gaming thumbnail with bold text 'EPIC WIN'",
    width=1280,
    height=720,
))

# Refinement (cheaper, uses context)
response2 = await client.generate(GenerationRequest(
    prompt="Make the text bigger and add more glow",
    width=1280,
    height=720,
    conversation_history=[
        {"role": "user", "text": "Gaming thumbnail with bold text 'EPIC WIN'"},
        {"role": "model", "image_data": response1.image_data},
    ],
))
```
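
For more than one round of refinement, the history has to grow with each turn. The helper below is a rough sketch of one way to do that. `append_turn` is a hypothetical name that is not part of the skill; it assumes the same `role` / `text` / `image_data` dictionary keys that `_build_multi_turn` reads.

```python
from typing import List, Optional


def append_turn(history: Optional[List[dict]], prompt: str, image_data: bytes) -> List[dict]:
    """Return a new history extended with a prompt and the image the model returned."""
    updated = list(history or [])  # copy so the caller's list is left untouched
    updated.append({"role": "user", "text": prompt})
    updated.append({"role": "model", "image_data": image_data})
    return updated
```

After each successful call, pass the prompt you sent and `response.image_data` to the helper, then hand the result to the next `GenerationRequest` as `conversation_history`.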

## Best Practices

1. Always use retry logic - AI APIs can be flaky
2. Respect Retry-After headers for rate limits
3. Don't retry content policy errors
4. Use strict prompt constraints to reduce unrequested elements and text errors
5. Track generation IDs for debugging
6. Set appropriate timeouts (30-120s)
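
Respecting `Retry-After` takes some care, because the HTTP specification allows the header to carry either a number of seconds or an HTTP date. The sketch below handles both forms using only the standard library. `parse_retry_after` is a hypothetical helper name, and the 60-second default is an assumption carried over from the client code above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_retry_after(value: Optional[str], default: int = 60,
                      now: Optional[datetime] = None) -> int:
    """Convert a Retry-After header value into whole seconds to wait."""
    if not value:
        return default
    value = value.strip()
    if value.isdigit():
        return int(value)  # delay-seconds form, e.g. "120"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means we can retry immediately.
    return max(0, int((when - now).total_seconds()))
```

The `now` parameter exists only to make the function easy to test with a fixed clock.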

## Common Mistakes

- No retry logic (fails on transient errors)
- Retrying content policy violations (wastes quota)
- Ignoring Retry-After headers (gets blocked)
- No timeout (hangs forever)
- Missing generation ID logging
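
Several of these mistakes come down to deciding, per response, whether a retry makes sense. The function below sketches that decision as a small, pure lookup. `classify_response` and its return labels are hypothetical. Treating non-policy 400 responses as non-retryable is an assumption of this sketch; the client above retries them.

```python
POLICY_TERMS = ("safety", "blocked", "policy")


def classify_response(status: int, body_text: str = "") -> str:
    """Return 'ok', 'rate_limited', 'policy', 'retry', or 'fail' for an HTTP response."""
    if status == 200:
        return "ok"
    if status == 429:
        return "rate_limited"  # wait for Retry-After, then retry
    if status == 400:
        lowered = body_text.lower()
        if any(term in lowered for term in POLICY_TERMS):
            return "policy"  # never retry: the same prompt will be blocked again
        return "fail"  # assumption: other 400s are caller errors, so a retry will not help
    if status >= 500:
        return "retry"  # transient server error: retry with backoff
    return "fail"
```

Keeping the decision separate from the HTTP call makes it straightforward to unit test without network access.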

## Related Patterns

- prompt-engine - Template-based prompt building
- rate-limiting - Protect your API quota
- circuit-breaker - Handle AI service outages

Related Skills

generational-agent-succession

from diegosouzapw/awesome-omni-skill

Parallel agent swarms with generational succession. Combines agent-architect's multi-agent parallelism with automatic succession when agents degrade. Each parallel agent gets fresh context through controlled handoffs while maintaining accumulated wisdom.

media-generation

from diegosouzapw/awesome-omni-skill

Generate images, videos, and audio using Google's Gemini APIs. Use for image generation/editing (Gemini 3 Pro Image), video generation (Veo 3), and speech (TBD). Trigger words - images: generate, create, draw, design, make, edit, modify image/picture. Video: generate video, create video, animate, make a video. Supports text-to-image, image-to-image editing, text-to-video, and image-to-video.

Image Generation

from diegosouzapw/awesome-omni-skill

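AI image generation and editing built on Nano Banana (Gemini Image), covering text-to-image, image-to-image, and image editing. Suited to creative design, marketing assets, social media content, presentation illustrations, and similar scenarios. Supports multiple styles, high-resolution output (up to 4K), text rendering, and character consistency.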

ai-video-generation

from diegosouzapw/awesome-omni-skill

Generate AI videos with Google Veo, Seedance, Wan, Grok and 40+ models via inference.sh CLI. Models: Veo 3.1, Veo 3, Seedance 1.5 Pro, Wan 2.5, Grok Imagine Video, OmniHuman, Fabric, HunyuanVideo. Capabilities: text-to-video, image-to-video, lipsync, avatar animation, video upscaling, foley sound. Use for: social media videos, marketing content, explainer videos, product demos, AI avatars. Triggers: video generation, ai video, text to video, image to video, veo, animate image, video from image, ai animation, video generator, generate video, t2v, i2v, ai video maker, create video with ai, runway alternative, pika alternative, sora alternative, kling alternative

apollo-client-patterns

from diegosouzapw/awesome-omni-skill

Use when implementing Apollo Client patterns for queries, mutations, cache management, and local state in React applications.

podcast-generation

from diegosouzapw/awesome-omni-skill

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creatio...

Invoice Generation

from diegosouzapw/awesome-omni-skill

This skill provides comprehensive patterns for generating invoices in billing systems. It covers invoice data structure, PDF generation using PDFKit and Puppeteer, invoice templates, line item calcula

apify-lead-generation

from diegosouzapw/awesome-omni-skill

Generates B2B/B2C leads by scraping Google Maps, websites, Instagram, TikTok, Facebook, LinkedIn, YouTube, and Google Search. Use when user asks to find leads, prospects, businesses, build lead lis...

api-generation

from diegosouzapw/awesome-omni-skill

Generate TypeScript API client from Swagger/Go comments. Use when updating API endpoints, adding new routes, or regenerating the frontend API client after backend changes.

api-doc-generation

from diegosouzapw/awesome-omni-skill

Generate and update API documentation from NestJS controllers. Use when modifying controllers, adding endpoints, or when the user asks about API documentation.

api-client

from diegosouzapw/awesome-omni-skill

REST API client builder with authentication, error handling, retry logic, and request management. Supports OAuth, JWT, API keys. Use when building API integrations, creating API clients, or working with REST services.

api-client-patterns

from diegosouzapw/awesome-omni-skill

HTTP client patterns, API integration, request/response handling, error handling, retry logic, axios usage. Use when building API clients, integrating external services, handling API errors, or making HTTP requests.