ai-generation-persistence
AI generation persistence patterns — unique IDs, addressable URLs, database storage, and cost tracking for every LLM generation
Best use case
ai-generation-persistence is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using ai-generation-persistence can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/ai-generation-persistence/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How ai-generation-persistence Compares
| Feature / Agent | ai-generation-persistence | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
AI generation persistence patterns — unique IDs, addressable URLs, database storage, and cost tracking for every LLM generation
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# AI Generation Persistence
**AI generations are expensive, non-reproducible assets. Never discard them.**
Every call to an LLM costs real money and produces unique output that cannot be exactly reproduced. Treat generations like database records — assign an ID, persist immediately, and make them retrievable.
## Core Rules
1. **Generate an ID before the LLM call** — use `nanoid()` or `createId()` from `@paralleldrive/cuid2`
2. **Persist every generation** — text and metadata to database, images and files to Vercel Blob
3. **Make every generation addressable** — URL pattern: `/chat/[id]`, `/generate/[id]`, `/image/[id]`
4. **Track metadata** — model name, token usage, estimated cost, timestamp, user ID
5. **Never stream without saving** — if the user refreshes, the generation must survive
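Rule 5 can be sketched without any framework: forward each chunk to the client while accumulating the full text, then write the final state to storage whether the stream finished or failed. The `GenerationStore` interface, the `persistStream` function, and the record shape below are hypothetical stand-ins for the real database and SDK stream, not part of this skill.

```typescript
// Hypothetical minimal record and store; a real app would write to the database.
type Status = "pending" | "streaming" | "complete" | "error";

interface GenerationRecord {
  id: string;
  result: string;
  status: Status;
}

interface GenerationStore {
  save(record: GenerationRecord): Promise<void>;
}

// Forwards each chunk to `onChunk` while accumulating the full text, then
// persists the final state whether the stream completed or threw.
async function persistStream(
  id: string,
  chunks: AsyncIterable<string>,
  store: GenerationStore,
  onChunk: (chunk: string) => void = () => {},
): Promise<GenerationRecord> {
  let text = "";
  await store.save({ id, result: text, status: "streaming" });
  try {
    for await (const chunk of chunks) {
      text += chunk;
      onChunk(chunk);
    }
    const record: GenerationRecord = { id, result: text, status: "complete" };
    await store.save(record);
    return record;
  } catch {
    // Keep whatever arrived before the failure instead of discarding it.
    const record: GenerationRecord = { id, result: text, status: "error" };
    await store.save(record);
    return record;
  }
}
```

Because the partial text is saved on failure as well, a page refresh after an interrupted stream can still show what was generated and paid for.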
## Generate-Then-Redirect Pattern
The standard UX flow for AI features: create the resource first, then redirect to its page.
```ts
// app/api/chat/route.ts
import { nanoid } from "nanoid";
import { redirect } from "next/navigation";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

export async function POST(req: Request) {
  const { prompt, model } = await req.json();
  const id = nanoid();

  // Create the record BEFORE generation starts
  await db.insert(generations).values({
    id,
    prompt,
    model,
    status: "pending",
    createdAt: new Date(),
  });

  // Redirect to the generation page; it handles streaming.
  // Note: in a Route Handler, redirect() responds with 307, which keeps the
  // POST method. Use NextResponse.redirect(url, 303) if the client should
  // follow up with a GET.
  redirect(`/chat/${id}`);
}
```
```tsx
// app/chat/[id]/page.tsx
import { eq } from "drizzle-orm";
import { notFound } from "next/navigation";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";
import { ChatView } from "@/components/chat-view"; // assumed location of the ChatView component

export default async function ChatPage({ params }: { params: Promise<{ id: string }> }) {
  const { id } = await params;
  const generation = await db.query.generations.findFirst({
    where: eq(generations.id, id),
  });
  if (!generation) notFound();

  // Render with streaming if still pending, or show saved result
  return <ChatView generation={generation} />;
}
```
This gives you: shareable URLs, back-button support, multi-tab sessions, and generation history for free.
## Persistence Schema
```ts
// lib/db/schema.ts
import { pgTable, text, integer, timestamp, jsonb } from "drizzle-orm/pg-core";

export const generations = pgTable("generations", {
  id: text("id").primaryKey(),                  // nanoid
  userId: text("user_id"),                      // auth user
  model: text("model").notNull(),               // "openai/gpt-5.4"
  prompt: text("prompt"),                       // input text
  result: text("result"),                       // generated output
  imageUrls: jsonb("image_urls"),               // Blob URLs for generated images
  tokenUsage: jsonb("token_usage"),             // { promptTokens, completionTokens }
  estimatedCostCents: integer("estimated_cost_cents"),
  status: text("status").default("pending"),    // pending | streaming | complete | error
  createdAt: timestamp("created_at").defaultNow(),
});
```
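The `status` column is plain text, so nothing stops a record from moving backwards, for example from `complete` to `streaming`. A small guard can make the lifecycle explicit. The transition table below is an assumption: the schema only lists the four values, and `canTransition` is a hypothetical helper.

```typescript
type GenerationStatus = "pending" | "streaming" | "complete" | "error";

// Assumed lifecycle: pending -> streaming -> complete, with `error` reachable
// from any non-terminal state. Non-streaming calls may go straight from
// pending to complete.
const allowedTransitions: Record<GenerationStatus, GenerationStatus[]> = {
  pending: ["streaming", "complete", "error"],
  streaming: ["complete", "error"],
  complete: [],
  error: [],
};

// Returns true when a record may move from `from` to `to`.
function canTransition(from: GenerationStatus, to: GenerationStatus): boolean {
  return allowedTransitions[from].includes(to);
}
```

Checking `canTransition(current, next)` before each status update keeps a late-arriving stream callback from overwriting a finished record.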
## Storage Strategy
| Data Type | Storage | Why |
|-----------|---------|-----|
| Text, metadata, history | Neon Postgres via Drizzle | Queryable, relational, supports search |
| Generated images & files | Vercel Blob (`@vercel/blob`) | Permanent URLs, CDN-backed, no expiry |
| Prompt dedup cache | Upstash Redis | Fast lookup, TTL-based expiry |
## Image Persistence
Never serve generated images as ephemeral base64 or temporary URLs. Save to Blob immediately:
```ts
import { put } from "@vercel/blob";
import { generateText } from "ai";
import { eq } from "drizzle-orm";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

// `model`, `prompt`, and `generationId` come from the surrounding request handler
const result = await generateText({ model, prompt });

// Save every generated image to permanent storage
const imageUrls: string[] = [];
for (const [index, file] of (result.files ?? []).entries()) {
  if (file.mediaType?.startsWith("image/")) {
    const ext = file.mediaType.split("/")[1] || "png";
    // Include the index so multiple images from one generation get distinct paths
    const blob = await put(`generations/${generationId}-${index}.${ext}`, file.uint8Array, {
      access: "public",
      contentType: file.mediaType,
    });
    imageUrls.push(blob.url);
  }
}

// Update the generation record with permanent URLs
await db.update(generations)
  .set({ imageUrls, status: "complete" })
  .where(eq(generations.id, generationId));
```
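Deriving the file extension with `mediaType.split("/")[1]` yields `svg+xml` for SVG images and keeps any `; charset=...` parameter. A minimal sketch of a more defensive path builder follows. The `blobPathFor` name and the `generations/<id>-<index>.<ext>` layout are assumptions for illustration, not something the Blob SDK requires.

```typescript
// Builds a distinct blob pathname per image. Strips media-type parameters and
// structured-syntax suffixes so "image/svg+xml" becomes "svg", and falls back
// to "png" when no subtype is present.
function blobPathFor(generationId: string, index: number, mediaType: string): string {
  const subtype = mediaType.split(";")[0].split("/")[1] ?? "";
  const ext = subtype.split("+")[0].trim().toLowerCase() || "png";
  return `generations/${generationId}-${index}.${ext}`;
}
```

Including the index in the pathname means two images from the same generation never target the same blob.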
## Cost Tracking
Extract usage from every generation and store it. This enables billing, budgeting, and abuse detection:
```ts
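import { generateText } from "ai";
import { eq } from "drizzle-orm";
import { db } from "@/lib/db";
import { generations } from "@/lib/db/schema";

// `estimateCost` is an application-defined helper, not an AI SDK export
const result = await generateText({ model, prompt });
// Token counts; field names depend on the AI SDK version
// (v4: promptTokens/completionTokens, v5: inputTokens/outputTokens)
const usage = result.usage;
const estimatedCostCents = estimateCost(model, usage);

await db.update(generations).set({
  result: result.text,
  tokenUsage: usage,
  estimatedCostCents,
  status: "complete",
}).where(eq(generations.id, generationId));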
```
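The snippet above calls `estimateCost` without defining it. One possible shape, as a sketch only: look up per-million-token prices for the model and multiply by the token counts. The price table, the model name `example/model-small`, and the numbers in it are placeholders, not real pricing. The usage type accepts both token field naming styles because they differ between AI SDK versions.

```typescript
// Hypothetical prices in cents per million tokens. Real values must come from
// the provider's current pricing.
interface ModelPrice {
  inputCentsPerMTok: number;
  outputCentsPerMTok: number;
}

const PRICES = new Map<string, ModelPrice>([
  ["example/model-small", { inputCentsPerMTok: 50, outputCentsPerMTok: 150 }],
]);

// Accepts either naming style for the token counts.
interface TokenUsage {
  inputTokens?: number;
  outputTokens?: number;
  promptTokens?: number;
  completionTokens?: number;
}

// Returns the estimated cost in whole cents, rounded up, or null when the
// model has no entry in the price table.
function estimateCost(model: string, usage: TokenUsage): number | null {
  const price = PRICES.get(model);
  if (!price) return null;
  const input = usage.inputTokens ?? usage.promptTokens ?? 0;
  const output = usage.outputTokens ?? usage.completionTokens ?? 0;
  const cents =
    (input * price.inputCentsPerMTok + output * price.outputCentsPerMTok) / 1_000_000;
  return Math.ceil(cents);
}
```

Rounding up to whole cents overstates very small generations. If per-request precision matters, storing a finer unit than cents is one option.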
## Prompt Dedup / Caching
Avoid paying for identical generations. Cache by content hash:
```ts
import { Redis } from "@upstash/redis";
import { createHash } from "node:crypto";

const redis = Redis.fromEnv();

function hashPrompt(model: string, prompt: string): string {
  return createHash("sha256").update(`${model}:${prompt}`).digest("hex");
}

// Inside the request handler: check the cache before generating
const cacheKey = `gen:${hashPrompt(model, prompt)}`;
const cached = await redis.get<string>(cacheKey);
if (cached) return cached; // Return cached generation ID

// After generation, cache the generation ID
await redis.set(cacheKey, generationId, { ex: 3600 }); // 1hr TTL
```
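The check-then-cache steps above can be combined into a single helper. The sketch below uses a hypothetical `IdCache` interface in place of the Redis client so that it runs without a network connection. `getOrGenerate` and `cacheKeyFor` are illustrative names, and the `generate` callback is assumed to persist the generation and return its ID.

```typescript
import { createHash } from "node:crypto";

// Minimal cache contract; a Redis client could be adapted to it.
interface IdCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Same key scheme as above: SHA-256 over "model:prompt", prefixed with "gen:".
function cacheKeyFor(model: string, prompt: string): string {
  const hash = createHash("sha256").update(`${model}:${prompt}`).digest("hex");
  return `gen:${hash}`;
}

// Returns the ID of an existing generation for this model and prompt, or runs
// `generate` once and caches the ID it returns.
async function getOrGenerate(
  model: string,
  prompt: string,
  cache: IdCache,
  generate: () => Promise<string>,
  ttlSeconds = 3600,
): Promise<{ id: string; cached: boolean }> {
  const key = cacheKeyFor(model, prompt);
  const existing = await cache.get(key);
  if (existing) return { id: existing, cached: true };
  const id = await generate();
  await cache.set(key, id, ttlSeconds);
  return { id, cached: false };
}
```

This lookup is not atomic, so two concurrent requests with the same prompt can both miss the cache and both generate. The key also covers only the model and prompt. If other parameters such as a system prompt or temperature affect the output, they would need to be part of the hashed string.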
## Anti-Patterns
- **Streaming to client without saving** — generation lost on page refresh. Always write to DB as tokens arrive or on completion.
- **Routes without `[id]` segments** — `/api/chat` with no ID means generations aren't addressable. Use `/chat/[id]`.
- **Re-generating identical prompts** — check cache first. Same prompt + same model = same cost for no new value.
- **Ephemeral base64 images** — generated images served inline are lost when the component unmounts. Save to Vercel Blob.
- **Missing metadata** — always store model name, token counts, and timestamp. You need this for cost tracking and debugging.
- **Client-only state** — storing generations only in React state or localStorage. Use a database: generations must survive across devices and sessions.

Related Skills
workflow
Vercel Workflow DevKit (WDK) expert guidance. Use when building durable workflows, long-running tasks, API routes or agents that need pause/resume, retries, step-based execution, or crash-safe orchestration with Vercel Workflow.
verification
Full-story verification — infers what the user is building, then verifies the complete flow end-to-end: browser → API → data → response. Triggers on dev server start and 'why isn't this working' signals.
vercel-storage
Vercel storage expert guidance — Blob, Edge Config, and Marketplace storage (Neon Postgres, Upstash Redis). Use when choosing, configuring, or using data storage with Vercel applications.
vercel-services
Vercel Services — deploy multiple services within a single Vercel project. Use for monorepo layouts or when combining a backend (Python, Go) with a frontend (Next.js, Vite) in one deployment.
vercel-sandbox
Vercel Sandbox guidance — ephemeral Firecracker microVMs for running untrusted code safely. Supports AI agents, code generation, and experimentation. Use when executing user-generated or AI-generated code in isolation.
vercel-queues
Vercel Queues guidance (public beta) — durable event streaming with topics, consumer groups, retries, and delayed delivery. $0.60/1M ops. Powers Workflow DevKit. Use when building async processing, fan-out patterns, or event-driven architectures.
vercel-functions
Vercel Functions expert guidance — Serverless Functions, Edge Functions, Fluid Compute, streaming, Cron Jobs, and runtime configuration. Use when configuring, debugging, or optimizing server-side code running on Vercel.
vercel-flags
Vercel Flags guidance — feature flags platform with unified dashboard, Flags Explorer, gradual rollouts, A/B testing, and provider adapters. Use when implementing feature flags, experimentation, or staged rollouts.
vercel-firewall
Vercel Firewall and security expert guidance. Use when configuring DDoS protection, WAF rules, rate limiting, bot filtering, IP allow/block lists, OWASP rulesets, Attack Challenge Mode, or any security configuration on the Vercel platform.
vercel-cli
Vercel CLI expert guidance. Use when deploying, managing environment variables, linking projects, viewing logs, managing domains, or interacting with the Vercel platform from the command line.
vercel-api
Vercel MCP and REST API expert guidance. Use when the agent needs live access to Vercel projects, deployments, environment variables, domains, logs, or documentation through the MCP server or REST API.
vercel-agent
Vercel Agent guidance — AI-powered code review, incident investigation, and SDK installation. Automates PR analysis and anomaly debugging. Use when configuring or understanding Vercel's AI development tools.