add-tts
Adds text-to-speech audio help to a feature using the TTS system. Use when adding voice narration, audio feedback, or spoken instructions to any part of the app.
Best use case
add-tts is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using add-tts should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/add-tts/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How add-tts Compares
| Feature / Agent | add-tts | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Adds text-to-speech audio help to a feature using the TTS system. Use when adding voice narration, audio feedback, or spoken instructions to any part of the app.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Adding TTS Audio to a Feature
This skill walks you through adding text-to-speech audio to a feature in the app. The TTS system plays audio via a **voice chain** (pregenerated mp3 → on-demand generation via OpenAI → browser SpeechSynthesis → subtitles). Clips are collected at runtime, persisted to the database, and generated as high-quality OpenAI TTS mp3s — either on-the-fly during playback (if the `generate` chain entry is configured) or in batch from the admin panel.
## Before You Start
**Read the integration guide:** `apps/web/.claude/reference/tts-audio-system.md`
It contains the full API reference, patterns, anti-patterns, and existing implementations.
## Your Job
1. Understand what text the feature needs spoken and when
2. Create a feature-specific audio hook
3. Wire it into the component
4. Verify with TypeScript
## Step 1: Design the Utterances
For each piece of audio the feature needs, determine:
- **What** text to speak (static string or dynamic from state/props)
- **When** to speak it (on mount, on state change, on user action, on completion)
- **How** it should sound (tone — write as voice-actor stage directions)
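One way to record these three decisions before writing the hook is a small typed table. The sketch below is purely illustrative: `Trigger`, `UtterancePlan`, and `textsFor` are hypothetical names and are not part of the TTS system.

```typescript
// Hypothetical planning types; nothing here is part of the TTS API.
type Trigger = 'mount' | 'state-change' | 'user-action' | 'completion'

interface UtterancePlan {
  // What: a static string, or a function deriving the text from state
  text: string | ((state: { step: string }) => string)
  // When: the event that should start playback
  trigger: Trigger
  // How: voice-actor stage directions
  tone: string
}

const INSTRUCTION_TONE = 'Patiently guiding a young child. Clear, slow, friendly.'
const CELEBRATION_TONE = 'Warmly congratulating a child. Genuinely encouraging and happy.'

const plan: UtterancePlan[] = [
  { text: (s) => s.step, trigger: 'state-change', tone: INSTRUCTION_TONE },
  { text: 'Well done!', trigger: 'completion', tone: CELEBRATION_TONE },
]

// Resolve the text of every utterance that fires on a given trigger.
function textsFor(trigger: Trigger, state: { step: string }): string[] {
  return plan
    .filter((u) => u.trigger === trigger)
    .map((u) => (typeof u.text === 'function' ? u.text(state) : u.text))
}
```

Each row of such a plan then maps onto one `useTTS` declaration and one guarded effect in the hook.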
## Step 2: Create a Feature Audio Hook
Create a hook in the feature's `hooks/` directory. This hook owns text construction, tone strings, auto-play logic, and cleanup.
**Read the reference implementation first:**
```
apps/web/src/components/practice/hooks/usePracticeAudioHelp.ts
```
**Key rules:**
1. **Tone strings must be module-level constants** — never compute them dynamically per render
2. **Always clean up on unmount** — call `stop()` in a cleanup effect
3. **Use refs to track previous values** — prevents re-playing on every render
4. **Guard with `isEnabled`** — respect the user's audio toggle
**Template:**
```typescript
'use client'

import { useEffect, useRef } from 'react'
import { useTTS } from '@/hooks/useTTS'
import { useAudioManager } from '@/hooks/useAudioManager'

// Stable tone constants; changing these creates new clips
const INSTRUCTION_TONE =
  'Patiently guiding a young child. Clear, slow, friendly.'
const CELEBRATION_TONE =
  'Warmly congratulating a child. Genuinely encouraging and happy.'

interface UseMyFeatureAudioHelpOptions {
  currentStep: string
  isComplete: boolean
}

export function useMyFeatureAudioHelp({
  currentStep,
  isComplete,
}: UseMyFeatureAudioHelpOptions) {
  const { isEnabled, stop } = useAudioManager()

  // Declare utterances
  const sayInstruction = useTTS(currentStep, { tone: INSTRUCTION_TONE })
  const sayCelebration = useTTS(
    isComplete ? 'Well done!' : '',
    { tone: CELEBRATION_TONE },
  )

  // Auto-play when step changes
  const prevStepRef = useRef<string>('')
  useEffect(() => {
    if (!isEnabled || !currentStep || currentStep === prevStepRef.current) return
    prevStepRef.current = currentStep
    sayInstruction()
  }, [isEnabled, currentStep, sayInstruction])

  // Auto-play celebration on completion
  useEffect(() => {
    if (!isEnabled || !isComplete) return
    sayCelebration()
  }, [isEnabled, isComplete, sayCelebration])

  // Stop audio on unmount
  useEffect(() => {
    return () => stop()
  }, [stop])

  return { replay: sayInstruction }
}
```
## Step 3: Wire Into the Component
```typescript
import { useMyFeatureAudioHelp } from './hooks/useMyFeatureAudioHelp'
import { useAudioManager } from '@/hooks/useAudioManager'

function MyFeature() {
  const { isEnabled, isPlaying } = useAudioManager()
  const { replay } = useMyFeatureAudioHelp({
    currentStep: 'Tap the bead to move it up',
    isComplete: false,
  })

  return (
    <div>
      {isEnabled && (
        <button onClick={replay} disabled={isPlaying}>
          {isPlaying ? 'Speaking...' : 'Replay'}
        </button>
      )}
    </div>
  )
}
```
## Step 4: Verify
```bash
cd apps/web && npx tsc --noEmit
```
## Common Patterns
### Dynamic text from state
```typescript
const text = useMemo(
  () => (terms ? termsToSentence(terms) : ''),
  [terms],
)
const sayProblem = useTTS(text, { tone: MATH_TONE })
```
### One-shot playback (play once, don't repeat)
```typescript
const playedRef = useRef(false)
useEffect(() => {
  if (!shouldPlay || playedRef.current) return
  playedRef.current = true
  sayIt()
}, [shouldPlay, sayIt])
// Reset when trigger resets
useEffect(() => {
  if (!shouldPlay) playedRef.current = false
}, [shouldPlay])
```
### Multiple utterances — play the right one
```typescript
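const sayStep1 = useTTS('First, look at the abacus', { tone: INST })
const sayStep2 = useTTS('Now tap the bead', { tone: INST })
// Trigger from an effect or event handler, never directly during render.
// speak() stops the previous utterance before starting the next.
useEffect(() => {
  if (!isEnabled) return
  if (step === 0) sayStep1()
  if (step === 1) sayStep2()
}, [isEnabled, step, sayStep1, sayStep2])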
```
## Tone String Guidelines
Write tones as **voice-actor stage directions**. Be specific about emotion, pace, and audience.
**Good examples:**
- `'Speaking clearly and steadily, reading a math problem to a young child. Pause slightly between each number and operator.'`
- `'Warmly congratulating a child. Genuinely encouraging and happy.'`
- `'Gently guiding a child after a wrong answer. Kind, not disappointed.'`
- `'Patiently guiding a young child through an abacus tutorial. Clear, slow, friendly.'`
**Bad examples:**
- `'Read this text'` — too vague
- `` `Speaking ${mood}` `` — dynamic per render, creates new clips every time
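To see why a dynamic tone is costly, suppose each clip is identified by its exact (text, tone) pair. That keying scheme is an assumption made for illustration, and `clipKey` is a hypothetical helper rather than the system's actual function.

```typescript
// Hypothetical: assume a clip is identified by its exact text and tone.
function clipKey(text: string, tone: string): string {
  return JSON.stringify([text, tone])
}

const STEADY_TONE = 'Speaking clearly and steadily to a young child.'
const moods = ['calm', 'happy', 'sleepy']

// A constant tone maps every use of the same text to a single clip.
const stableKeys = new Set(
  moods.map(() => clipKey('Five plus three', STEADY_TONE)),
)

// A tone interpolated from state yields a distinct clip per value.
const dynamicKeys = new Set(
  moods.map((mood) => clipKey('Five plus three', `Speaking ${mood}`)),
)

console.log(stableKeys.size, dynamicKeys.size) // 1 3
```

Under that assumption, three mood values turn one sentence into three separate clips to collect and generate, which is the growth that constant tones avoid.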
## Anti-Patterns to Avoid
1. **Never use raw `speechSynthesis`** — always go through `useTTS` so the voice chain and collection work
2. **Never forget cleanup** — always `useEffect(() => () => stop(), [stop])`
3. **Never use dynamic tone strings** — keep them as module-level constants
4. **Never call `speak()` unconditionally in render** — always guard with refs and `isEnabled`
## Key Files
| File | Role |
|------|------|
| `src/hooks/useTTS.ts` | Primary hook — declare (text, tone), get speak function |
| `src/hooks/useAudioManager.ts` | Reactive state — isEnabled, isPlaying, volume, subtitles, stop() |
| `src/lib/audio/TtsAudioManager.ts` | Core engine — voice chain, playback, collection, subtitles |
| `src/lib/audio/voiceSource.ts` | Voice source class hierarchy — polymorphic `generate()` per voice type |
| `src/contexts/AudioManagerContext.tsx` | React context — singleton manager, boot-time manifest loading |
| `src/lib/audio/termsToSentence.ts` | `[5, 3]` → `"five plus three"` |
| `src/lib/audio/buildFeedbackText.ts` | Correct/incorrect feedback sentences |
| `src/lib/audio/numberToEnglish.ts` | `42` → `"forty two"` |
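The two text helpers in the table can be pictured with a small stand-in. The sketch below is not the repository's code: it only handles integers from 0 to 99, it assumes negative terms are read as "minus", and it reuses the names `numberToEnglish` and `termsToSentence` purely to mirror the examples in the table.

```typescript
// Illustrative stand-ins only; the real helpers live in src/lib/audio/.
const ONES = [
  'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine',
  'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen', 'sixteen',
  'seventeen', 'eighteen', 'nineteen',
]
const TENS = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']

// Spell out an integer in the range 0..99 (assumed limit for this sketch).
function numberToEnglish(n: number): string {
  if (!Number.isInteger(n) || n < 0 || n > 99) {
    throw new RangeError('this sketch only supports integers 0..99')
  }
  if (n < 20) return ONES[n]
  const tens = TENS[Math.floor(n / 10)]
  const ones = n % 10
  return ones === 0 ? tens : `${tens} ${ONES[ones]}`
}

// Join terms with "plus" or "minus" (the minus handling is an assumption).
function termsToSentence(terms: number[]): string {
  return terms
    .map((term, index) => {
      const word = numberToEnglish(Math.abs(term))
      if (index === 0) return term < 0 ? `minus ${word}` : word
      return `${term < 0 ? 'minus' : 'plus'} ${word}`
    })
    .join(' ')
}

console.log(numberToEnglish(42)) // forty two
console.log(termsToSentence([5, 3])) // five plus three
```

In a feature hook, prefer importing the real helpers over re-implementing them, so spoken text stays consistent across features.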
## Voice Chain
Audio plays through the voice chain in order. The typical chain is:
```
pregenerated voice (nova) → auto-generate → browser TTS → subtitles
```
- **Pregenerated**: instant playback from pre-generated mp3 on disk
- **Auto-generate**: calls OpenAI on-the-fly if the pregenerated mp3 is missing, caches result
- **Browser TTS**: uses the browser's built-in speech synthesis
- **Subtitles**: shows text on screen with a reading-time timer
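The fallback order above can be pictured as a loop over an ordered list of sources that stops at the first one able to serve the clip. This is a simplified, synchronous sketch under assumed names (`VoiceSource`, `canServe`, `resolveSource`); the real manager in `TtsAudioManager.ts` may be structured quite differently.

```typescript
// Hypothetical model of a chain entry; not the actual voiceSource.ts classes.
interface VoiceSource {
  name: string
  canServe(text: string): boolean
}

// Pretend these texts already have an mp3 on disk.
const pregeneratedClips = new Set(['Well done!'])

const chain: VoiceSource[] = [
  { name: 'pregenerated', canServe: (text) => pregeneratedClips.has(text) },
  { name: 'generate', canServe: () => false }, // assume on-demand generation is unavailable here
  { name: 'browser-tts', canServe: () => true },
  { name: 'subtitles', canServe: () => true },
]

// Walk the chain in order and return the first source that can serve the text.
function resolveSource(sources: VoiceSource[], text: string): string | null {
  for (const source of sources) {
    if (source.canServe(text)) return source.name
  }
  return null
}

console.log(resolveSource(chain, 'Well done!')) // pregenerated
console.log(resolveSource(chain, 'Tap the bead')) // browser-tts
```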
You don't need to think about this when adding TTS to a feature — just use `useTTS()` and the chain handles fallback automatically. The admin configures the chain at `/admin/audio`.
## Reference Implementations
| Hook | Location | What it does |
|------|----------|-------------|
| `usePracticeAudioHelp` | `src/components/practice/hooks/` | Reads math problems, correct/incorrect feedback |
| `useTutorialAudioHelp` | `src/components/tutorial/hooks/` | Speaks tutorial step instructions |
Follow `usePracticeAudioHelp` as the most complete example.