elevenlabs-reference-architecture
Implement ElevenLabs reference architecture for production TTS/voice applications. Use when designing new ElevenLabs integrations, reviewing project structure, or building a scalable audio generation service. Trigger: "elevenlabs architecture", "elevenlabs project structure", "how to organize elevenlabs", "TTS service architecture", "elevenlabs design patterns", "voice API architecture".
Best use case
elevenlabs-reference-architecture is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using elevenlabs-reference-architecture can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/elevenlabs-reference-architecture/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# ElevenLabs Reference Architecture
## Overview
Production-ready architecture for ElevenLabs TTS/voice applications. Covers project layout, service layers, caching, streaming, and multi-model orchestration.
## Prerequisites
- Understanding of layered architecture patterns
- ElevenLabs SDK knowledge (see `elevenlabs-sdk-patterns`)
- TypeScript project with async patterns
- Redis (optional, for distributed caching)
## Instructions
### Step 1: Project Structure
```
my-elevenlabs-service/
├── src/
│   ├── elevenlabs/
│   │   ├── client.ts           # Singleton client with retry config
│   │   ├── config.ts           # Environment-aware configuration
│   │   ├── models.ts           # Model selection logic
│   │   ├── errors.ts           # Error classification (see sdk-patterns)
│   │   └── types.ts            # TypeScript interfaces
│   ├── services/
│   │   ├── tts-service.ts      # Text-to-Speech orchestration
│   │   ├── voice-service.ts    # Voice management (clone, list, settings)
│   │   ├── audio-service.ts    # SFX, isolation, transcription
│   │   └── cache-service.ts    # Audio caching layer
│   ├── api/
│   │   ├── routes/
│   │   │   ├── tts.ts          # POST /api/tts
│   │   │   ├── voices.ts       # GET/POST /api/voices
│   │   │   ├── webhooks.ts     # POST /webhooks/elevenlabs
│   │   │   └── health.ts       # GET /health
│   │   └── middleware/
│   │       ├── rate-limit.ts   # Request throttling
│   │       └── auth.ts         # Your app's auth (not ElevenLabs auth)
│   ├── queue/
│   │   ├── tts-queue.ts        # Async TTS job processing
│   │   └── workers.ts          # Queue workers
│   └── monitoring/
│       ├── metrics.ts          # Latency, error rate, quota tracking
│       └── alerts.ts           # Budget and health alerts
├── tests/
│   ├── unit/
│   │   ├── tts-service.test.ts
│   │   └── cache-service.test.ts
│   └── integration/
│       └── tts-smoke.test.ts
├── config/
│   ├── development.json
│   ├── staging.json
│   └── production.json
└── .env.example
```
### Step 2: Configuration Layer
```typescript
// src/elevenlabs/config.ts
export interface ElevenLabsConfig {
  apiKey: string;
  environment: "development" | "staging" | "production";
  defaults: {
    modelId: string;
    voiceId: string;
    outputFormat: string;
    voiceSettings: {
      stability: number;
      similarity_boost: number;
      style: number;
      speed: number;
    };
  };
  performance: {
    maxConcurrency: number;
    timeoutMs: number;
    maxRetries: number;
  };
  cache: {
    enabled: boolean;
    maxSizeMB: number;
    ttlSeconds: number;
  };
}

const KNOWN_ENVS = ["development", "staging", "production"] as const;

const ENV_CONFIGS: Record<string, Partial<ElevenLabsConfig>> = {
  development: {
    defaults: {
      modelId: "eleven_flash_v2_5", // Cheap + fast for dev
      voiceId: "21m00Tcm4TlvDq8ikWAM", // Rachel
      outputFormat: "mp3_22050_32", // Small files
      voiceSettings: { stability: 0.5, similarity_boost: 0.75, style: 0, speed: 1 },
    },
    performance: { maxConcurrency: 2, timeoutMs: 30_000, maxRetries: 1 },
    cache: { enabled: true, maxSizeMB: 50, ttlSeconds: 3600 },
  },
  production: {
    defaults: {
      modelId: "eleven_multilingual_v2", // High quality for prod
      voiceId: "21m00Tcm4TlvDq8ikWAM",
      outputFormat: "mp3_44100_128", // High quality
      voiceSettings: { stability: 0.5, similarity_boost: 0.75, style: 0, speed: 1 },
    },
    performance: { maxConcurrency: 10, timeoutMs: 60_000, maxRetries: 3 },
    cache: { enabled: true, maxSizeMB: 500, ttlSeconds: 86_400 },
  },
};

export function loadConfig(): ElevenLabsConfig {
  // Fail fast instead of sending requests with an undefined key
  const apiKey = process.env.ELEVENLABS_API_KEY;
  if (!apiKey) {
    throw new Error("ELEVENLABS_API_KEY is not set");
  }
  const rawEnv = process.env.NODE_ENV || "development";
  // Unrecognized NODE_ENV values (e.g. "test") are treated as development
  const environment = (KNOWN_ENVS as readonly string[]).includes(rawEnv)
    ? (rawEnv as ElevenLabsConfig["environment"])
    : "development";
  // No staging entry is defined above, so staging falls back to development settings
  const envConfig = ENV_CONFIGS[environment] || ENV_CONFIGS.development;
  return { apiKey, environment, ...envConfig } as ElevenLabsConfig;
}
```
### Step 3: TTS Service Layer
```typescript
// src/services/tts-service.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import PQueue from "p-queue";
import { loadConfig } from "../elevenlabs/config";
import { classifyError } from "../elevenlabs/errors";

// Request field names keep the casing used throughout this skill; verify them
// against the SDK version pinned in package.json.

export interface GenerateOptions {
  voiceId?: string;
  modelId?: string;
  outputFormat?: string;
  streaming?: boolean;
  previousText?: string; // Text spoken just before this request (prosody context)
  nextText?: string; // Text spoken just after this request (prosody context)
}

export class TTSService {
  private client: ElevenLabsClient;
  private queue: PQueue;
  private config: ReturnType<typeof loadConfig>;

  constructor() {
    this.config = loadConfig();
    this.client = new ElevenLabsClient({
      apiKey: this.config.apiKey,
      maxRetries: this.config.performance.maxRetries,
      timeoutInSeconds: this.config.performance.timeoutMs / 1000,
    });
    // The queue caps in-flight requests at maxConcurrency
    this.queue = new PQueue({
      concurrency: this.config.performance.maxConcurrency,
    });
  }

  async generate(text: string, options?: GenerateOptions): Promise<ReadableStream | Buffer> {
    const voiceId = options?.voiceId || this.config.defaults.voiceId;
    const modelId = options?.modelId || this.config.defaults.modelId;
    const format = options?.outputFormat || this.config.defaults.outputFormat;

    // Streaming and non-streaming calls send the same request body.
    // previous_text / next_text are the API's fields for prosody continuity.
    const request = {
      text,
      model_id: modelId,
      output_format: format,
      voice_settings: this.config.defaults.voiceSettings,
      previous_text: options?.previousText,
      next_text: options?.nextText,
    };

    return this.queue.add(async () => {
      const start = performance.now();
      try {
        if (options?.streaming) {
          return await this.client.textToSpeech.stream(voiceId, request);
        }
        const audio = await this.client.textToSpeech.convert(voiceId, request);
        const latency = performance.now() - start;
        console.log(`[TTS] ${text.length} chars, ${modelId}, ${latency.toFixed(0)}ms`);
        return audio;
      } catch (error) {
        throw classifyError(error);
      }
    }) as Promise<ReadableStream | Buffer>;
  }

  // Split long text into chunks and pass neighboring text as prosody context
  async generateLongText(text: string, voiceId?: string): Promise<Buffer[]> {
    const chunks = this.splitText(text, 4500); // Stay under 5000 limit
    const results: Buffer[] = [];
    for (let i = 0; i < chunks.length; i++) {
      const audio = await this.generate(chunks[i], {
        voiceId,
        previousText: chunks[i - 1], // undefined for the first chunk
        nextText: chunks[i + 1], // undefined for the last chunk
      });
      // Assumes the non-streaming call resolved to in-memory bytes
      results.push(audio as Buffer);
    }
    return results;
  }

  private splitText(text: string, maxChars: number): string[] {
    const chunks: string[] = [];
    // Match sentences ending in punctuation, plus trailing text that has none
    const sentences = text.match(/[^.!?]+(?:[.!?]+|$)/g) || [text];
    let current = "";
    for (const sentence of sentences) {
      if ((current + sentence).length > maxChars) {
        if (current.trim()) chunks.push(current.trim());
        current = sentence; // A single sentence longer than maxChars is kept whole
      } else {
        current += sentence;
      }
    }
    if (current.trim()) chunks.push(current.trim());
    return chunks;
  }
}
```
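`generateLongText` treats each result as a `Buffer`, which only holds if the SDK call resolved to in-memory bytes. If the pinned SDK version returns a stream instead, each chunk has to be drained first. The following is a minimal sketch under that assumption: it expects an async-iterable stream of `Uint8Array` chunks, and `collectAudio` and `concatAudio` are hypothetical helper names, not part of the ElevenLabs SDK.

```typescript
// Hypothetical helpers (not part of the ElevenLabs SDK).

/** Join byte arrays back to back, preserving order. */
export function concatAudio(parts: Uint8Array[]): Uint8Array {
  const total = parts.reduce((sum, p) => sum + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

/** Drain an async-iterable byte stream into one contiguous Uint8Array. */
export async function collectAudio(stream: AsyncIterable<Uint8Array>): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  for await (const part of stream) {
    parts.push(part);
  }
  return concatAudio(parts);
}
```

`concatAudio` can also join the per-chunk results of a long-text run. Byte-level joining is a simplification: it tends to be tolerated for MP3 output, but container formats would likely need a real audio tool to merge correctly.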
### Step 4: Voice Management Service
```typescript
// src/services/voice-service.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

export class VoiceService {
  private client: ElevenLabsClient;

  // The shared client is injected so every service reuses one instance
  constructor(client: ElevenLabsClient) {
    this.client = client;
  }

  // List all voices, optionally narrowed to one category
  async listVoices(filter?: { category?: "premade" | "cloned" | "generated" }) {
    const { voices } = await this.client.voices.getAll();
    if (filter?.category) {
      return voices.filter(v => v.category === filter.category);
    }
    return voices;
  }

  // Create a cloned voice from one or more audio samples
  async cloneVoice(name: string, description: string, audioFiles: NodeJS.ReadableStream[]) {
    return this.client.voices.add({
      name,
      description,
      files: audioFiles,
    });
  }

  async getVoiceSettings(voiceId: string) {
    return this.client.voices.getSettings(voiceId);
  }

  async updateVoiceSettings(voiceId: string, settings: {
    stability: number;
    similarity_boost: number;
  }) {
    return this.client.voices.editSettings(voiceId, settings);
  }

  async deleteVoice(voiceId: string) {
    return this.client.voices.delete(voiceId);
  }
}
```
### Step 5: Data Flow Diagram
```
             ┌──────────────┐
             │    Client    │
             │  (Browser/   │
             │   Mobile)    │
             └──────┬───────┘
                    │
             ┌──────▼───────┐
             │  API Layer   │
             │  /api/tts    │
             │  /api/voices │
             └──────┬───────┘
                    │
        ┌───────────┼───────────┐
        │           │           │
 ┌──────▼──┐  ┌─────▼─────┐  ┌──▼──────┐
 │ Cache   │  │ TTS       │  │ Voice   │
 │ Service │  │ Service   │  │ Service │
 └────┬────┘  └─────┬─────┘  └─────────┘
      │             │
 ┌────▼───┐   ┌─────▼──────────┐
 │ Redis/ │   │ Concurrency    │
 │ LRU    │   │ Queue (p-queue)│
 └────────┘   └─────┬──────────┘
                    │
             ┌──────▼───────┐
             │ ElevenLabs   │
             │ Client SDK   │
             │ (singleton)  │
             └──────┬───────┘
                    │
        ┌───────────┼───────────┐
        │           │           │
 ┌──────▼──┐  ┌─────▼─────┐  ┌──▼──────┐
 │ /v1/tts │  │ /v1/voices│  │ /v1/sfx │
 │ REST/WS │  │ REST      │  │ REST    │
 └─────────┘  └───────────┘  └─────────┘
   ElevenLabs API (api.elevenlabs.io)
```
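The diagram places the cache beside the TTS service, which suggests a cache-aside flow: look up the audio first and only call the TTS path on a miss. The sketch below shows that flow under stated assumptions. `TtsRequest`, `cacheKey`, and `getOrGenerate` are hypothetical names, a plain `Map` stands in for Redis or an LRU, and the `generate` callback stands in for the TTS service.

```typescript
// Hypothetical names throughout; the Map stands in for Redis or an LRU cache.

export interface TtsRequest {
  text: string;
  voiceId: string;
  modelId: string;
  outputFormat: string;
}

/** Build a deterministic key from every input that changes the audio. */
export function cacheKey(req: TtsRequest): string {
  return "tts:" + JSON.stringify([req.voiceId, req.modelId, req.outputFormat, req.text]);
}

/** Cache-aside lookup: return cached audio if present, otherwise generate and store it. */
export async function getOrGenerate(
  cache: Map<string, Uint8Array>,
  req: TtsRequest,
  generate: (req: TtsRequest) => Promise<Uint8Array>,
): Promise<{ audio: Uint8Array; cacheHit: boolean }> {
  const key = cacheKey(req);
  const cached = cache.get(key);
  if (cached !== undefined) {
    return { audio: cached, cacheHit: true };
  }
  const audio = await generate(req);
  cache.set(key, audio);
  return { audio, cacheHit: false };
}
```

If voice settings vary per request, they would need to be part of the key as well. For a distributed cache, hashing the key keeps it short.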
### Step 6: Health Check Composition
```typescript
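// src/api/routes/health.ts
// checkElevenLabsConnectivity, checkQuotaStatus, and checkCacheHealth are assumed
// to be defined elsewhere in the service; each resolves to a status object or rejects.
export async function healthCheck() {
  // allSettled reports a failed check as null instead of failing the whole endpoint
  const checks = await Promise.allSettled([
    checkElevenLabsConnectivity(),
    checkQuotaStatus(),
    checkCacheHealth(),
  ]);
  const elevenlabs = checks[0].status === "fulfilled" ? checks[0].value : null;
  const quota = checks[1].status === "fulfilled" ? checks[1].value : null;
  const cache = checks[2].status === "fulfilled" ? checks[2].value : null;
  // An unreachable API is unhealthy; high quota usage alone is only degraded
  const degraded = quota !== null && quota.pctUsed > 90;
  return {
    status: !elevenlabs ? "unhealthy" : degraded ? "degraded" : "healthy",
    services: { elevenlabs, quota, cache },
    timestamp: new Date().toISOString(),
  };
}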
```
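The health check reads `quota.pctUsed` but does not show where it comes from. One way to derive it, sketched here as an assumption, is from a used-characters count and a character limit such as a subscription lookup might return. `toQuotaStatus` and `deriveStatus` are hypothetical helpers; the 90% threshold is the one used in the health check.

```typescript
// Hypothetical helpers; the inputs are assumed to come from a subscription lookup.

export interface QuotaStatus {
  used: number;
  limit: number;
  pctUsed: number;
}

export function toQuotaStatus(characterCount: number, characterLimit: number): QuotaStatus {
  // Treat a zero or negative limit as fully used so the result is never NaN or Infinity
  const raw = characterLimit > 0 ? (characterCount / characterLimit) * 100 : 100;
  return { used: characterCount, limit: characterLimit, pctUsed: Math.round(raw * 10) / 10 };
}

export type HealthStatus = "healthy" | "degraded" | "unhealthy";

export function deriveStatus(apiReachable: boolean, quota: QuotaStatus | null): HealthStatus {
  if (!apiReachable) return "unhealthy";
  if (quota !== null && quota.pctUsed > 90) return "degraded";
  return "healthy";
}
```

Keeping the status rule in a pure function makes it straightforward to unit test without calling the API.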
## Architecture Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Client pattern | Singleton | One connection pool, shared retry config |
| Concurrency | p-queue | Keeps in-flight requests within plan concurrency limits, reducing 429 responses |
| Caching | LRU (local) or Redis (distributed) | Repeated content is common in TTS |
| Long text | Sentence-boundary splitting | Preserves natural speech prosody |
| Error handling | Classification + retry | Different strategies for 429 vs 401 vs 500 |
| Model selection | Environment-based | Flash in dev (cheap), Multilingual in prod (quality) |
| Streaming | HTTP streaming + WebSocket | HTTP for simple, WS for LLM integration |
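
The "Classification + retry" row can be illustrated with a small status-code classifier and a backoff schedule. This is a sketch under stated assumptions, not the `classifyError` from `errors.ts`: `classifyStatus`, `backoffMs`, the category names, and the default delays are all hypothetical.

```typescript
// Hypothetical classification; categories and delays are illustrative only.

export type ErrorKind = "rate_limited" | "auth" | "client" | "server" | "unknown";

export interface Classified {
  kind: ErrorKind;
  retryable: boolean;
}

export function classifyStatus(status: number | undefined): Classified {
  if (status === 429) return { kind: "rate_limited", retryable: true };
  if (status === 401 || status === 403) return { kind: "auth", retryable: false };
  if (status !== undefined && status >= 500) return { kind: "server", retryable: true };
  if (status !== undefined && status >= 400) return { kind: "client", retryable: false };
  return { kind: "unknown", retryable: false };
}

/** Exponential backoff: baseMs * 2^attempt, capped at maxMs (attempt starts at 0). */
export function backoffMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

Rate-limit and server errors are worth retrying after a delay, while an authentication failure will not fix itself and should surface immediately.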
## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Circular dependencies | Wrong layering | Services depend on the client, never the reverse |
| Cold start latency | Client initialization | Pre-warm in server startup |
| Memory pressure | Unbounded audio cache | Set `maxSizeMB` on cache |
| Type errors | SDK version mismatch | Pin SDK version in package.json |
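
To make the "Memory pressure" row concrete, here is a minimal sketch of a byte-budgeted LRU cache. `BoundedAudioCache` is a hypothetical class; in practice a library such as lru-cache would normally fill this role. The byte budget is assumed to be derived from the config as `maxSizeMB * 1024 * 1024`.

```typescript
// Hypothetical in-memory cache that evicts least recently used audio
// once the total stored bytes exceed the budget.

export class BoundedAudioCache {
  private entries = new Map<string, Uint8Array>();
  private bytes = 0;
  private maxBytes: number;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  get(key: string): Uint8Array | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      // Re-insert so the key becomes most recently used (Map keeps insertion order)
      this.entries.delete(key);
      this.entries.set(key, value);
    }
    return value;
  }

  set(key: string, value: Uint8Array): void {
    if (value.byteLength > this.maxBytes) return; // Too large to ever fit; skip caching
    const existing = this.entries.get(key);
    if (existing !== undefined) {
      this.bytes -= existing.byteLength;
      this.entries.delete(key);
    }
    this.entries.set(key, value);
    this.bytes += value.byteLength;
    // Evict from the oldest end until the byte budget is met
    while (this.bytes > this.maxBytes) {
      const oldest = this.entries.keys().next().value as string;
      this.bytes -= this.entries.get(oldest)!.byteLength;
      this.entries.delete(oldest);
    }
  }

  get sizeBytes(): number {
    return this.bytes;
  }
}
```

Budgeting by bytes rather than entry count matters for audio, since a long clip can be far larger than a short one.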
## Resources
- [ElevenLabs API Reference](https://elevenlabs.io/docs/api-reference/introduction)
- [ElevenLabs SDK Source](https://github.com/elevenlabs/elevenlabs-js)
- [p-queue](https://github.com/sindresorhus/p-queue)
- [LRU Cache](https://github.com/isaacs/node-lru-cache)
## Next Steps
Start with `elevenlabs-install-auth` for setup, then apply this architecture. Use `elevenlabs-core-workflow-a` and `elevenlabs-core-workflow-b` for feature implementation.
Related Skills
exa-reference-architecture
Implement Exa reference architecture for search pipelines, RAG, and content discovery. Use when designing new Exa integrations, reviewing project structure, or establishing architecture standards for neural search applications. Trigger with phrases like "exa architecture", "exa project structure", "exa RAG pipeline", "exa reference design", "exa search pipeline".
exa-architecture-variants
Choose and implement Exa architecture patterns at different scales: direct search, cached search, and RAG pipeline. Use when designing Exa integrations, choosing between simple search and full RAG, or planning architecture for different traffic volumes. Trigger with phrases like "exa architecture", "exa blueprint", "how to structure exa", "exa RAG design", "exa at scale".
evernote-reference-architecture
Reference architecture for Evernote integrations. Use when designing system architecture, planning integrations, or building scalable Evernote applications. Trigger with phrases like "evernote architecture", "design evernote system", "evernote integration pattern", "evernote scale".
elevenlabs-webhooks-events
Implement ElevenLabs webhook HMAC signature verification and event handling. Use when setting up webhook endpoints for transcription completion, call recording, or agent conversation events from ElevenLabs. Trigger: "elevenlabs webhook", "elevenlabs events", "elevenlabs webhook signature", "handle elevenlabs notifications", "elevenlabs post-call webhook", "elevenlabs transcription webhook".
elevenlabs-upgrade-migration
Upgrade ElevenLabs SDK versions and migrate between API model generations. Use when upgrading the elevenlabs-js or elevenlabs Python SDK, migrating from v1 to v2 models, or handling deprecations. Trigger: "upgrade elevenlabs", "elevenlabs migration", "elevenlabs breaking changes", "update elevenlabs SDK", "migrate elevenlabs model", "eleven_v3 migration".
elevenlabs-security-basics
Apply ElevenLabs security best practices for API keys, webhook HMAC validation, and voice data protection. Use when securing API keys, validating webhook signatures, or auditing ElevenLabs security configuration. Trigger: "elevenlabs security", "elevenlabs secrets", "secure elevenlabs", "elevenlabs API key security", "elevenlabs webhook signature", "elevenlabs HMAC".
elevenlabs-sdk-patterns
Apply production-ready ElevenLabs SDK patterns for TypeScript and Python. Use when implementing ElevenLabs integrations, refactoring SDK usage, or establishing team coding standards for audio AI applications. Trigger: "elevenlabs SDK patterns", "elevenlabs best practices", "elevenlabs code patterns", "idiomatic elevenlabs", "elevenlabs typescript".
elevenlabs-rate-limits
Implement ElevenLabs rate limiting, concurrency queuing, and backoff patterns. Use when handling 429 errors, implementing retry logic, or managing concurrent TTS request throughput. Trigger: "elevenlabs rate limit", "elevenlabs throttling", "elevenlabs 429", "elevenlabs retry", "elevenlabs backoff", "elevenlabs concurrent requests".
elevenlabs-prod-checklist
Execute ElevenLabs production deployment checklist with health checks and rollback. Use when deploying TTS/voice integrations to production, preparing for launch, or implementing go-live procedures for ElevenLabs-powered apps. Trigger: "elevenlabs production", "deploy elevenlabs", "elevenlabs go-live", "elevenlabs launch checklist", "production TTS".
elevenlabs-performance-tuning
Optimize ElevenLabs TTS latency with model selection, streaming, caching, and audio format tuning. Use when experiencing slow TTS responses, implementing real-time voice features, or optimizing audio generation throughput. Trigger: "elevenlabs performance", "optimize elevenlabs", "elevenlabs latency", "elevenlabs slow", "fast TTS", "reduce elevenlabs latency", "TTS streaming".
elevenlabs-local-dev-loop
Configure local ElevenLabs development with mocking, hot reload, and audio testing. Use when setting up a dev environment for TTS/voice projects, configuring test workflows, or building a fast iteration cycle with ElevenLabs audio. Trigger: "elevenlabs dev setup", "elevenlabs local development", "elevenlabs dev environment", "develop with elevenlabs", "test elevenlabs locally".
elevenlabs-install-auth
Install and configure ElevenLabs SDK authentication for Node.js or Python. Use when setting up a new ElevenLabs project, configuring API keys, or initializing the elevenlabs npm/pip package. Trigger: "install elevenlabs", "setup elevenlabs", "elevenlabs auth", "configure elevenlabs API key", "elevenlabs credentials".