elevenlabs-reference-architecture

Implement ElevenLabs reference architecture for production TTS/voice applications. Use when designing new ElevenLabs integrations, reviewing project structure, or building a scalable audio generation service. Trigger: "elevenlabs architecture", "elevenlabs project structure", "how to organize elevenlabs", "TTS service architecture", "elevenlabs design patterns", "voice API architecture".

25 stars

Best use case

elevenlabs-reference-architecture is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Implement ElevenLabs reference architecture for production TTS/voice applications. Use when designing new ElevenLabs integrations, reviewing project structure, or building a scalable audio generation service. Trigger: "elevenlabs architecture", "elevenlabs project structure", "how to organize elevenlabs", "TTS service architecture", "elevenlabs design patterns", "voice API architecture".

Teams using elevenlabs-reference-architecture should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/elevenlabs-reference-architecture/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/jeremylongshore/claude-code-plugins-plus-skills/elevenlabs-reference-architecture/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/elevenlabs-reference-architecture/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How elevenlabs-reference-architecture Compares

Feature / Agentelevenlabs-reference-architectureStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Implement ElevenLabs reference architecture for production TTS/voice applications. Use when designing new ElevenLabs integrations, reviewing project structure, or building a scalable audio generation service. Trigger: "elevenlabs architecture", "elevenlabs project structure", "how to organize elevenlabs", "TTS service architecture", "elevenlabs design patterns", "voice API architecture".

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# ElevenLabs Reference Architecture

## Overview

Production-ready architecture for ElevenLabs TTS/voice applications. Covers project layout, service layers, caching, streaming, and multi-model orchestration.

## Prerequisites

- Understanding of layered architecture patterns
- ElevenLabs SDK knowledge (see `elevenlabs-sdk-patterns`)
- TypeScript project with async patterns
- Redis (optional, for distributed caching)

## Instructions

### Step 1: Project Structure

```
my-elevenlabs-service/
├── src/
│   ├── elevenlabs/
│   │   ├── client.ts            # Singleton client with retry config
│   │   ├── config.ts            # Environment-aware configuration
│   │   ├── models.ts            # Model selection logic
│   │   ├── errors.ts            # Error classification (see sdk-patterns)
│   │   └── types.ts             # TypeScript interfaces
│   ├── services/
│   │   ├── tts-service.ts       # Text-to-Speech orchestration
│   │   ├── voice-service.ts     # Voice management (clone, list, settings)
│   │   ├── audio-service.ts     # SFX, isolation, transcription
│   │   └── cache-service.ts     # Audio caching layer
│   ├── api/
│   │   ├── routes/
│   │   │   ├── tts.ts           # POST /api/tts
│   │   │   ├── voices.ts        # GET/POST /api/voices
│   │   │   ├── webhooks.ts      # POST /webhooks/elevenlabs
│   │   │   └── health.ts        # GET /health
│   │   └── middleware/
│   │       ├── rate-limit.ts    # Request throttling
│   │       └── auth.ts          # Your app's auth (not ElevenLabs auth)
│   ├── queue/
│   │   ├── tts-queue.ts         # Async TTS job processing
│   │   └── workers.ts           # Queue workers
│   └── monitoring/
│       ├── metrics.ts           # Latency, error rate, quota tracking
│       └── alerts.ts            # Budget and health alerts
├── tests/
│   ├── unit/
│   │   ├── tts-service.test.ts
│   │   └── cache-service.test.ts
│   └── integration/
│       └── tts-smoke.test.ts
├── config/
│   ├── development.json
│   ├── staging.json
│   └── production.json
└── .env.example
```

### Step 2: Configuration Layer

```typescript
// src/elevenlabs/config.ts
export interface ElevenLabsConfig {
  apiKey: string;
  environment: "development" | "staging" | "production";
  defaults: {
    modelId: string;
    voiceId: string;
    outputFormat: string;
    voiceSettings: {
      stability: number;
      similarity_boost: number;
      style: number;
      speed: number;
    };
  };
  performance: {
    maxConcurrency: number;
    timeoutMs: number;
    maxRetries: number;
  };
  cache: {
    enabled: boolean;
    maxSizeMB: number;
    ttlSeconds: number;
  };
}

const ENV_CONFIGS: Record<string, Partial<ElevenLabsConfig>> = {
  development: {
    defaults: {
      modelId: "eleven_flash_v2_5",    // Cheap + fast for dev
      voiceId: "21m00Tcm4TlvDq8ikWAM", // Rachel
      outputFormat: "mp3_22050_32",     // Small files
      voiceSettings: { stability: 0.5, similarity_boost: 0.75, style: 0, speed: 1 },
    },
    performance: { maxConcurrency: 2, timeoutMs: 30_000, maxRetries: 1 },
    cache: { enabled: true, maxSizeMB: 50, ttlSeconds: 3600 },
  },
  production: {
    defaults: {
      modelId: "eleven_multilingual_v2", // High quality for prod
      voiceId: "21m00Tcm4TlvDq8ikWAM",
      outputFormat: "mp3_44100_128",     // High quality
      voiceSettings: { stability: 0.5, similarity_boost: 0.75, style: 0, speed: 1 },
    },
    performance: { maxConcurrency: 10, timeoutMs: 60_000, maxRetries: 3 },
    cache: { enabled: true, maxSizeMB: 500, ttlSeconds: 86_400 },
  },
};

export function loadConfig(): ElevenLabsConfig {
  const env = process.env.NODE_ENV || "development";
  const envConfig = ENV_CONFIGS[env] || ENV_CONFIGS.development;

  return {
    apiKey: process.env.ELEVENLABS_API_KEY!,
    environment: env as any,
    ...envConfig,
  } as ElevenLabsConfig;
}
```

### Step 3: TTS Service Layer

```typescript
// src/services/tts-service.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import PQueue from "p-queue";
import { loadConfig } from "../elevenlabs/config";
import { classifyError } from "../elevenlabs/errors";

export class TTSService {
  private client: ElevenLabsClient;
  private queue: PQueue;
  private config: ReturnType<typeof loadConfig>;

  constructor() {
    this.config = loadConfig();
    this.client = new ElevenLabsClient({
      apiKey: this.config.apiKey,
      maxRetries: this.config.performance.maxRetries,
      timeoutInSeconds: this.config.performance.timeoutMs / 1000,
    });
    this.queue = new PQueue({
      concurrency: this.config.performance.maxConcurrency,
    });
  }

  async generate(text: string, options?: {
    voiceId?: string;
    modelId?: string;
    outputFormat?: string;
    streaming?: boolean;
  }): Promise<ReadableStream | Buffer> {
    const voiceId = options?.voiceId || this.config.defaults.voiceId;
    const modelId = options?.modelId || this.config.defaults.modelId;
    const format = options?.outputFormat || this.config.defaults.outputFormat;

    return this.queue.add(async () => {
      const start = performance.now();

      try {
        if (options?.streaming) {
          return await this.client.textToSpeech.stream(voiceId, {
            text,
            model_id: modelId,
            output_format: format,
            voice_settings: this.config.defaults.voiceSettings,
          });
        }

        const audio = await this.client.textToSpeech.convert(voiceId, {
          text,
          model_id: modelId,
          output_format: format,
          voice_settings: this.config.defaults.voiceSettings,
        });

        const latency = performance.now() - start;
        console.log(`[TTS] ${text.length} chars, ${modelId}, ${latency.toFixed(0)}ms`);
        return audio;
      } catch (error) {
        throw classifyError(error);
      }
    }) as Promise<ReadableStream | Buffer>;
  }

  // Split long text into chunks with prosody context
  async generateLongText(text: string, voiceId?: string): Promise<Buffer[]> {
    const chunks = this.splitText(text, 4500); // Stay under 5000 limit
    const results: Buffer[] = [];

    for (let i = 0; i < chunks.length; i++) {
      const audio = await this.generate(chunks[i], {
        voiceId,
        // Pass context for natural prosody across chunks
      });
      results.push(audio as Buffer);
    }

    return results;
  }

  private splitText(text: string, maxChars: number): string[] {
    const chunks: string[] = [];
    const sentences = text.match(/[^.!?]+[.!?]+/g) || [text];
    let current = "";

    for (const sentence of sentences) {
      if ((current + sentence).length > maxChars) {
        if (current) chunks.push(current.trim());
        current = sentence;
      } else {
        current += sentence;
      }
    }
    if (current) chunks.push(current.trim());
    return chunks;
  }
}
```

### Step 4: Voice Management Service

```typescript
// src/services/voice-service.ts
export class VoiceService {
  private client: ElevenLabsClient;

  constructor(client: ElevenLabsClient) {
    this.client = client;
  }

  async listVoices(filter?: { category?: "premade" | "cloned" | "generated" }) {
    const { voices } = await this.client.voices.getAll();
    if (filter?.category) {
      return voices.filter(v => v.category === filter.category);
    }
    return voices;
  }

  async cloneVoice(name: string, description: string, audioFiles: NodeJS.ReadableStream[]) {
    return this.client.voices.add({
      name,
      description,
      files: audioFiles,
    });
  }

  async getVoiceSettings(voiceId: string) {
    return this.client.voices.getSettings(voiceId);
  }

  async updateVoiceSettings(voiceId: string, settings: {
    stability: number;
    similarity_boost: number;
  }) {
    return this.client.voices.editSettings(voiceId, settings);
  }

  async deleteVoice(voiceId: string) {
    return this.client.voices.delete(voiceId);
  }
}
```

### Step 5: Data Flow Diagram

```
                         ┌──────────────┐
                         │   Client     │
                         │  (Browser/   │
                         │   Mobile)    │
                         └──────┬───────┘
                                │
                         ┌──────▼───────┐
                         │   API Layer  │
                         │   /api/tts   │
                         │   /api/voice │
                         └──────┬───────┘
                                │
                    ┌───────────┼───────────┐
                    │           │           │
             ┌──────▼──┐ ┌─────▼─────┐ ┌──▼──────┐
             │  Cache   │ │   TTS     │ │  Voice  │
             │ Service  │ │  Service  │ │ Service │
             └──────┬───┘ └─────┬─────┘ └────────┘
                    │           │
              ┌─────▼─┐  ┌─────▼──────────┐
              │ Redis/ │  │ Concurrency    │
              │ LRU    │  │ Queue (p-queue)│
              └────────┘  └─────┬──────────┘
                                │
                         ┌──────▼───────┐
                         │  ElevenLabs  │
                         │  Client SDK  │
                         │  (singleton) │
                         └──────┬───────┘
                                │
                    ┌───────────┼───────────┐
                    │           │           │
             ┌──────▼──┐ ┌─────▼─────┐ ┌──▼──────┐
             │ /v1/tts  │ │ /v1/voices│ │ /v1/sfx │
             │ REST/WS  │ │  REST     │ │  REST   │
             └──────────┘ └───────────┘ └─────────┘
                    ElevenLabs API (api.elevenlabs.io)
```

### Step 6: Health Check Composition

```typescript
// src/api/routes/health.ts
export async function healthCheck() {
  const checks = await Promise.allSettled([
    checkElevenLabsConnectivity(),
    checkQuotaStatus(),
    checkCacheHealth(),
  ]);

  const elevenlabs = checks[0].status === "fulfilled" ? checks[0].value : null;
  const quota = checks[1].status === "fulfilled" ? checks[1].value : null;
  const cache = checks[2].status === "fulfilled" ? checks[2].value : null;

  const degraded = !elevenlabs || (quota && quota.pctUsed > 90);

  return {
    status: !elevenlabs ? "unhealthy" : degraded ? "degraded" : "healthy",
    services: { elevenlabs, quota, cache },
    timestamp: new Date().toISOString(),
  };
}
```

## Architecture Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Client pattern | Singleton | One connection pool, shared retry config |
| Concurrency | p-queue | Respects plan limits, prevents 429 |
| Caching | LRU (local) or Redis (distributed) | Repeated content is common in TTS |
| Long text | Sentence-boundary splitting | Preserves natural speech prosody |
| Error handling | Classification + retry | Different strategies for 429 vs 401 vs 500 |
| Model selection | Environment-based | Flash in dev (cheap), Multilingual in prod (quality) |
| Streaming | HTTP streaming + WebSocket | HTTP for simple, WS for LLM integration |

## Error Handling

| Issue | Cause | Solution |
|-------|-------|----------|
| Circular dependencies | Wrong layering | Services depend on client, never reverse |
| Cold start latency | Client initialization | Pre-warm in server startup |
| Memory pressure | Unbounded audio cache | Set `maxSizeMB` on cache |
| Type errors | SDK version mismatch | Pin SDK version in package.json |

## Resources

- [ElevenLabs API Reference](https://elevenlabs.io/docs/api-reference/introduction)
- [ElevenLabs SDK Source](https://github.com/elevenlabs/elevenlabs-js)
- [p-queue](https://github.com/sindresorhus/p-queue)
- [LRU Cache](https://github.com/isaacs/node-lru-cache)

## Next Steps

Start with `elevenlabs-install-auth` for setup, then apply this architecture. Use `elevenlabs-core-workflow-a` and `elevenlabs-core-workflow-b` for feature implementation.

Related Skills

exa-reference-architecture

25
from ComeOnOliver/skillshub

Implement Exa reference architecture for search pipelines, RAG, and content discovery. Use when designing new Exa integrations, reviewing project structure, or establishing architecture standards for neural search applications. Trigger with phrases like "exa architecture", "exa project structure", "exa RAG pipeline", "exa reference design", "exa search pipeline".

exa-architecture-variants

25
from ComeOnOliver/skillshub

Choose and implement Exa architecture patterns at different scales: direct search, cached search, and RAG pipeline. Use when designing Exa integrations, choosing between simple search and full RAG, or planning architecture for different traffic volumes. Trigger with phrases like "exa architecture", "exa blueprint", "how to structure exa", "exa RAG design", "exa at scale".

evernote-reference-architecture

25
from ComeOnOliver/skillshub

Reference architecture for Evernote integrations. Use when designing system architecture, planning integrations, or building scalable Evernote applications. Trigger with phrases like "evernote architecture", "design evernote system", "evernote integration pattern", "evernote scale".

elevenlabs-webhooks-events

25
from ComeOnOliver/skillshub

Implement ElevenLabs webhook HMAC signature verification and event handling. Use when setting up webhook endpoints for transcription completion, call recording, or agent conversation events from ElevenLabs. Trigger: "elevenlabs webhook", "elevenlabs events", "elevenlabs webhook signature", "handle elevenlabs notifications", "elevenlabs post-call webhook", "elevenlabs transcription webhook".

elevenlabs-upgrade-migration

25
from ComeOnOliver/skillshub

Upgrade ElevenLabs SDK versions and migrate between API model generations. Use when upgrading the elevenlabs-js or elevenlabs Python SDK, migrating from v1 to v2 models, or handling deprecations. Trigger: "upgrade elevenlabs", "elevenlabs migration", "elevenlabs breaking changes", "update elevenlabs SDK", "migrate elevenlabs model", "eleven_v3 migration".

elevenlabs-security-basics

25
from ComeOnOliver/skillshub

Apply ElevenLabs security best practices for API keys, webhook HMAC validation, and voice data protection. Use when securing API keys, validating webhook signatures, or auditing ElevenLabs security configuration. Trigger: "elevenlabs security", "elevenlabs secrets", "secure elevenlabs", "elevenlabs API key security", "elevenlabs webhook signature", "elevenlabs HMAC".

elevenlabs-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready ElevenLabs SDK patterns for TypeScript and Python. Use when implementing ElevenLabs integrations, refactoring SDK usage, or establishing team coding standards for audio AI applications. Trigger: "elevenlabs SDK patterns", "elevenlabs best practices", "elevenlabs code patterns", "idiomatic elevenlabs", "elevenlabs typescript".

elevenlabs-rate-limits

25
from ComeOnOliver/skillshub

Implement ElevenLabs rate limiting, concurrency queuing, and backoff patterns. Use when handling 429 errors, implementing retry logic, or managing concurrent TTS request throughput. Trigger: "elevenlabs rate limit", "elevenlabs throttling", "elevenlabs 429", "elevenlabs retry", "elevenlabs backoff", "elevenlabs concurrent requests".

elevenlabs-prod-checklist

25
from ComeOnOliver/skillshub

Execute ElevenLabs production deployment checklist with health checks and rollback. Use when deploying TTS/voice integrations to production, preparing for launch, or implementing go-live procedures for ElevenLabs-powered apps. Trigger: "elevenlabs production", "deploy elevenlabs", "elevenlabs go-live", "elevenlabs launch checklist", "production TTS".

elevenlabs-performance-tuning

25
from ComeOnOliver/skillshub

Optimize ElevenLabs TTS latency with model selection, streaming, caching, and audio format tuning. Use when experiencing slow TTS responses, implementing real-time voice features, or optimizing audio generation throughput. Trigger: "elevenlabs performance", "optimize elevenlabs", "elevenlabs latency", "elevenlabs slow", "fast TTS", "reduce elevenlabs latency", "TTS streaming".

elevenlabs-local-dev-loop

25
from ComeOnOliver/skillshub

Configure local ElevenLabs development with mocking, hot reload, and audio testing. Use when setting up a dev environment for TTS/voice projects, configuring test workflows, or building a fast iteration cycle with ElevenLabs audio. Trigger: "elevenlabs dev setup", "elevenlabs local development", "elevenlabs dev environment", "develop with elevenlabs", "test elevenlabs locally".

elevenlabs-install-auth

25
from ComeOnOliver/skillshub

Install and configure ElevenLabs SDK authentication for Node.js or Python. Use when setting up a new ElevenLabs project, configuring API keys, or initializing the elevenlabs npm/pip package. Trigger: "install elevenlabs", "setup elevenlabs", "elevenlabs auth", "configure elevenlabs API key", "elevenlabs credentials".