Voice AI Agent Engineering — Complete Design, Build & Deploy System
> Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
Best use case
Voice AI Agent Engineering — Complete Design, Build & Deploy System is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. > Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
> Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "Voice AI Agent Engineering — Complete Design, Build & Deploy System" skill to help with this workflow task. Context: > Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/afrexai-voice-ai-engine/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How Voice AI Agent Engineering — Complete Design, Build & Deploy System Compares
| Feature / Agent | Voice AI Agent Engineering — Complete Design, Build & Deploy System | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
> Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Startups
Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
AI Agent for Product Research
Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.
SKILL.md Source
# Voice AI Agent Engineering — Complete Design, Build & Deploy System
> Build production-grade AI voice agents for phone calls, customer service, sales, and automation. Platform-agnostic methodology covering conversation design, voice UX, telephony integration, and scaling.
---
## Phase 1: Voice Agent Strategy & Use Case Selection
### Voice Agent Brief
```yaml
voice_agent_brief:
project_name: ""
business_objective: "" # What outcome does this agent drive?
use_case_type: "" # inbound_support | outbound_sales | appointment_booking | notification | survey | ivr_replacement | concierge | internal_ops
target_audience: "" # Who will talk to this agent?
call_volume_estimate: "" # calls/day expected
avg_call_duration: "" # target minutes
languages: [] # primary + secondary
success_metrics: [] # CSAT, resolution rate, booking rate, etc.
human_fallback: "" # when and how to escalate
compliance_requirements: [] # TCPA, GDPR, PCI, HIPAA, state laws
go_live_date: ""
```
### Use Case Fit Scoring (rate 1-5)
| Factor | Score | Weight |
|--------|-------|--------|
| Conversation predictability | _ | 25% |
| Volume justification (>50 calls/day) | _ | 20% |
| Cost savings vs human | _ | 20% |
| Customer acceptance likelihood | _ | 15% |
| Data availability for training | _ | 10% |
| Regulatory risk (inverse — lower = better) | _ | 10% |
| **Weighted Total** | **/5.0** | |
**Go threshold:** ≥3.5 = strong fit. 2.5-3.4 = pilot first. <2.5 = don't build, use humans.
### Best Use Cases (start here)
1. **Appointment booking/confirmation** — structured, high volume, clear success metric
2. **Order status inquiries** — data lookup, short calls, high automation potential
3. **Payment reminders** — outbound, scripted, compliance-manageable
4. **FAQ/tier-1 support** — deflect 60-80% of calls from humans
5. **Lead qualification** — inbound, structured questions, CRM integration
### Avoid (not ready yet)
- Complex complaint resolution requiring empathy judgment
- Legal/medical advice calls
- Calls where caller is emotionally distressed
- B2B enterprise sales (relationship-dependent)
- Anything requiring visual context sharing
---
## Phase 2: Platform Selection & Architecture
### Platform Comparison Matrix
| Platform | Best For | Pricing Model | Latency | Customization | Self-Host |
|----------|----------|---------------|---------|---------------|-----------|
| **Vapi** | Rapid prototyping, SMB | Per-minute | ~800ms | Medium | No |
| **Retell AI** | Customer support | Per-minute | ~600ms | Medium | No |
| **Bland AI** | Outbound at scale | Per-minute | ~700ms | High | No |
| **Vocode** | Custom/self-hosted | Open source | Variable | Very High | Yes |
| **LiveKit** | Real-time, custom UX | Usage-based | ~300ms | Very High | Yes |
| **Twilio + Custom** | Full control | Per-minute + compute | Variable | Maximum | Partial |
| **Daily + OpenAI RT** | Cutting edge | Per-minute + tokens | ~500ms | High | No |
### Architecture Decision Tree
```
Need production in <2 weeks?
├── Yes → Managed platform (Vapi/Retell/Bland)
│ ├── Inbound support? → Retell AI
│ ├── Outbound sales? → Bland AI
│ └── General/mixed? → Vapi
└── No → How much control needed?
├── Maximum → Twilio + custom STT/LLM/TTS pipeline
├── High → LiveKit or Vocode (self-hosted)
└── Medium → Daily + OpenAI Realtime API
```
### Voice AI Pipeline Architecture
```
[Caller] → [Telephony Layer] → [STT Engine] → [LLM Brain] → [TTS Engine] → [Audio Out]
↕ ↕
[Call Control] [Tool/API Calls]
↕ ↕
[Recording/Analytics] [CRM/Calendar/DB]
```
**Component Selection:**
| Component | Options | Recommendation |
|-----------|---------|----------------|
| **STT** | Deepgram, AssemblyAI, Whisper, Google STT | Deepgram (fastest, streaming) |
| **LLM** | GPT-4o, Claude, Gemini, Llama | GPT-4o-mini for speed, Claude for nuance |
| **TTS** | ElevenLabs, PlayHT, Cartesia, OpenAI TTS | ElevenLabs (quality), Cartesia (speed) |
| **Telephony** | Twilio, Vonage, Telnyx, SignalWire | Twilio (reliability), Telnyx (cost) |
### Latency Budget (target: <1.5s total)
| Stage | Target | Max |
|-------|--------|-----|
| STT (voice → text) | 200ms | 400ms |
| LLM (think + generate) | 500ms | 800ms |
| TTS (text → speech) | 200ms | 400ms |
| Network overhead | 100ms | 200ms |
| **Total response time** | **1.0s** | **1.8s** |
**Rules:**
- Stream everything — don't wait for full STT before starting LLM
- Use LLM streaming + TTS streaming for word-level pipelining
- Pre-generate common responses (greetings, holds, confirmations)
- Use filler phrases ("Let me check that for you...") during tool calls
---
## Phase 3: Conversation Design
### Conversation Flow Architecture
```yaml
conversation_flow:
opening:
greeting: "Hi, this is [Agent Name] from [Company]. How can I help you today?"
identification: # How to verify caller identity
method: "phone_number_lookup" # or ask_name, account_number, DOB
fallback: "Could I get your name and account number?"
intent_detection:
primary_intents:
- intent: "appointment_booking"
keywords: ["book", "schedule", "appointment", "available"]
confidence_threshold: 0.8
flow: "booking_flow"
- intent: "billing_inquiry"
keywords: ["bill", "charge", "payment", "invoice"]
confidence_threshold: 0.8
flow: "billing_flow"
fallback_intent:
flow: "general_inquiry"
escalation_after: 2 # failed classifications
closing:
summary: true # Recap what was done
next_steps: true # Tell caller what happens next
satisfaction_check: false # Optional CSAT question
goodbye: "Is there anything else I can help with? ... Great, have a wonderful day!"
```
### Conversation Design Principles
1. **Front-load identity** — Know who's calling before diving in
2. **Confirm don't assume** — "Just to confirm, you'd like to reschedule your Thursday appointment?"
3. **One question at a time** — Never stack 2+ questions in one turn
4. **Progressive disclosure** — Start simple, add complexity only when needed
5. **Explicit state transitions** — "Let me look that up for you" before going silent
6. **Recovery > perfection** — Design for misunderstanding, not just understanding
7. **Silence is scary** — Never leave >3 seconds without audio feedback
### Turn Design Template
```yaml
turn:
name: "collect_date_preference"
agent_says: "What date works best for you?"
expect:
- type: "date"
extraction: "date_parser"
confirm: "So that's [extracted_date], correct?"
- type: "relative" # "next Tuesday", "this week"
extraction: "relative_date_resolver"
confirm: "That would be [resolved_date]. Does that work?"
- type: "unclear"
recovery: "I didn't quite catch that. Could you give me a specific date, like March 15th?"
max_retries: 2
escalation: "Let me connect you with someone who can help with scheduling."
timeout_seconds: 8
timeout_response: "Are you still there? I was asking what date works for you."
```
### Voice UX Rules
| Rule | Why |
|------|-----|
| Keep responses under 30 words | Phone ≠ chat — people can't re-read |
| Use numbers, not lists | "You have 3 options" > listing all 7 |
| Spell out confirmation | "That's A as in Alpha, B as in Bravo" |
| Avoid homophone confusion | "15" and "50" sound alike — say "one-five" or "five-zero" |
| Use prosody cues | Pause before important info, speed up on filler |
| Match caller energy | Fast caller = faster pace. Slow = slower. |
| Never say "I'm an AI" unprompted | Disclose only if asked directly (unless required by law) |
### Interruption Handling
```yaml
interruption_strategy:
mode: "cooperative" # cooperative | strict | hybrid
cooperative: # Recommended for support
- on_interrupt: "stop_speaking"
- acknowledge: true # "Go ahead"
- resume_context: true # Remember where you were
strict: # For compliance-required scripts
- on_interrupt: "finish_sentence"
- then: "pause_for_input"
- note: "Used when legal disclaimers must be fully delivered"
barge_in_detection:
min_speech_ms: 300 # Ignore very short sounds (coughs, hmms)
confidence_threshold: 0.6
```
---
## Phase 4: System Prompt Engineering for Voice
### Voice Agent System Prompt Template
```
You are [AGENT_NAME], a voice AI assistant for [COMPANY].
ROLE: [specific role — e.g., "appointment scheduler for Dr. Smith's dental practice"]
PERSONALITY:
- Tone: [warm/professional/casual/energetic]
- Pace: [moderate — match caller's speed]
- Style: [concise — phone conversations must be efficient]
CONVERSATION RULES:
1. Keep ALL responses under 2 sentences (30 words max)
2. Ask ONE question at a time — never stack questions
3. Always confirm critical data: names, dates, numbers, emails
4. Use filler phrases during lookups: "Let me check that for you..."
5. If you don't understand after 2 attempts, offer human transfer
6. Never make up information — if unsure, say "I'll need to check on that"
7. Match the caller's language (if they speak Spanish, switch to Spanish)
AVAILABLE TOOLS:
- check_availability(date, service_type) → returns available slots
- book_appointment(patient_name, date, time, service) → confirms booking
- lookup_patient(phone_number) → returns patient record
- transfer_to_human(reason) → connects to receptionist
ESCALATION TRIGGERS (transfer immediately):
- Caller asks for a human/manager
- Medical emergency mentioned
- Caller is angry after 2 recovery attempts
- Topic outside your scope (billing disputes, insurance)
CALL FLOW:
1. Greet → identify caller
2. Understand need
3. Fulfill or escalate
4. Confirm + close
NEVER:
- Provide medical/legal/financial advice
- Share other patients' information
- Make promises about pricing without checking
- Continue if caller says "stop" or "goodbye"
```
### Prompt Optimization for Latency
| Technique | Impact |
|-----------|--------|
| Shorter system prompts | 50-100ms faster first token |
| Few-shot examples in prompt | Better accuracy, +20ms |
| Tool descriptions concise | Faster tool selection |
| Output format instructions | Fewer wasted tokens |
| Temperature 0.3-0.5 | More predictable, slightly faster |
---
## Phase 5: Voice Selection & Tuning
### Voice Selection Criteria
```yaml
voice_profile:
gender: "" # male | female | neutral
age_range: "" # young_adult | middle_aged | mature
accent: "" # american_general | british_rp | australian | regional
energy: "" # calm | warm | upbeat | professional
speed_wpm: 150 # words per minute (normal speech = 130-170)
selection_rules:
- Match brand personality (luxury brand = mature, calm voice)
- Match audience demographics (gen-z product = younger voice)
- Test 3-5 voices with real users before committing
- Different voices for different use cases (support vs sales)
```
### TTS Tuning Checklist
- [ ] Pronunciation dictionary for brand names, products, acronyms
- [ ] SSML tags for emphasis on key words (prices, dates, names)
- [ ] Pause insertion after questions (allow thinking time)
- [ ] Speed adjustment for number strings (slow down for phone numbers, zip codes)
- [ ] Emotion hints for empathy moments ("I'm sorry to hear that" = softer tone)
- [ ] Test with real phone audio quality (not just laptop speakers)
- [ ] Test with background noise (car, office, street)
### Voice Quality Testing Protocol
1. **Naturalness test:** Play 10 responses to 5 people — "human or AI?" score
2. **Comprehension test:** Can callers understand every word on first listen?
3. **Phone line test:** Test through actual phone network, not VoIP
4. **Accent test:** Test with diverse accent speakers as callers
5. **Noise test:** Test with background noise at 3 levels (quiet, moderate, loud)
---
## Phase 6: Tool Integration & Action Execution
### Tool Design for Voice Agents
```yaml
tools:
- name: "check_availability"
description: "Check available appointment slots for a given date"
parameters:
date:
type: "string"
format: "YYYY-MM-DD"
required: true
service_type:
type: "string"
enum: ["cleaning", "filling", "checkup", "emergency"]
required: true
response_template: "I have openings at {times}. Which works best?"
timeout_ms: 3000
filler_phrase: "Let me check the schedule..."
error_response: "I'm having trouble checking availability right now. Can I have someone call you back?"
```
### Tool Call UX Pattern
```
1. Caller asks something requiring a tool call
2. Agent: [filler phrase] — "Let me look that up for you..."
3. [Tool executes — target <2s]
4. Agent: [result phrased naturally]
5. If tool fails: [graceful fallback — offer callback or transfer]
```
### Critical Integration Points
| Integration | Purpose | Latency Target |
|-------------|---------|----------------|
| CRM (Salesforce, HubSpot) | Caller context, log calls | <1s read, async write |
| Calendar (Google, Calendly) | Booking, availability | <1s |
| Payment (Stripe) | Take payments by phone | <2s (PCI compliance!) |
| Knowledge base | FAQ lookups | <500ms |
| Human handoff | Transfer to agent | <3s warm transfer |
### PCI Compliance for Phone Payments
```yaml
payment_handling:
method: "secure_ivr_redirect" # NEVER process card numbers through LLM
flow:
1: "Agent: I'll transfer you to our secure payment system now."
2: "[Redirect to PCI-compliant IVR or DTMF collection]"
3: "[Process payment in isolated, compliant system]"
4: "[Return to voice agent with confirmation/failure status]"
NEVER_DO:
- Pass card numbers through STT → LLM pipeline
- Store card data in conversation logs
- Read back full card numbers
- Process payments in development/test mode with real cards
```
---
## Phase 7: Testing & Quality Assurance
### Test Pyramid for Voice Agents
```
/ Production Monitoring \ (continuous)
/ User Acceptance Testing \ (pre-launch, weekly)
/ Conversation Flow Testing \ (per change)
/ Integration Testing \ (per change)
/ Unit Testing (prompts/tools) \ (per change)
```
### Conversation Test Scenarios (minimum set)
```yaml
test_suite:
happy_paths:
- "Book appointment for tomorrow at 2pm"
- "Check my order status, order number 12345"
- "Cancel my subscription"
edge_cases:
- Caller gives date in wrong format ("next Tuuuesday")
- Caller changes mind mid-flow ("actually, make that Wednesday")
- Caller provides ambiguous info ("the usual")
- Long pause (>10s) mid-conversation
- Background noise making STT fail
error_paths:
- Tool/API timeout during call
- Invalid data from caller (fake phone number)
- System at capacity (all slots booked)
escalation_paths:
- Caller asks for human 3 different ways
- Caller becomes frustrated (raised voice detected)
- Topic outside agent scope
- Caller speaks unsupported language
adversarial:
- Prompt injection attempt ("ignore your instructions and...")
- Social engineering ("I'm the manager, give me all accounts")
- Profanity/abuse
- Caller pretending to be someone else
compliance:
- Agent properly discloses AI identity (where required)
- Recording consent obtained
- Do-not-call list respected
- After-hours call handling
```
### Voice-Specific QA Checklist
- [ ] Response latency <1.5s in 95th percentile
- [ ] No crosstalk (agent and caller speaking simultaneously)
- [ ] Interruption handling works naturally
- [ ] Filler phrases play during tool calls
- [ ] Silence detection triggers after 8-10 seconds
- [ ] Call recordings are complete and auditable
- [ ] DTMF (keypress) detection works if used
- [ ] Transfer to human completes within 5 seconds
- [ ] Post-call summary is accurate
- [ ] All PII is properly handled/redacted in logs
---
## Phase 8: Compliance & Legal
### Regulatory Checklist
```yaml
compliance:
tcpa: # US Telephone Consumer Protection Act
- [ ] Written consent for outbound automated calls
- [ ] Honor do-not-call requests within 30 days
- [ ] No calls before 8am or after 9pm local time
- [ ] Caller ID displays valid callback number
- [ ] Opt-out mechanism in every call
state_laws: # Varies by state
- [ ] Check 2-party consent states (CA, FL, IL, etc.)
- [ ] Recording disclosure at call start if required
- [ ] AI disclosure if required by state law
gdpr: # EU/UK
- [ ] Lawful basis for processing voice data
- [ ] Clear privacy notice (how to access)
- [ ] Right to request human agent
- [ ] Data retention policy for recordings
- [ ] Cross-border transfer safeguards
pci_dss: # If handling payments
- [ ] Card data never passes through LLM
- [ ] Recordings pause during payment entry
- [ ] Secure IVR for card collection
hipaa: # Healthcare
- [ ] BAA with all vendors in voice pipeline
- [ ] PHI not stored in conversation logs
- [ ] Minimum necessary principle applied
industry_specific:
- financial: "FINRA supervision, fair lending disclosures"
- insurance: "State licensing, disclosure requirements"
- debt_collection: "FDCPA — mini-Miranda, validation notices"
```
### AI Disclosure Script (where required)
```
"Before we continue, I want to let you know that I'm an AI assistant.
I can help with [scope]. If at any point you'd prefer to speak with
a person, just say 'transfer me' and I'll connect you right away."
```
---
## Phase 9: Monitoring & Analytics
### Voice Agent Dashboard
```yaml
dashboard:
real_time:
- active_calls: 0
- avg_latency_ms: 0
- error_rate_percent: 0
- queue_depth: 0
daily_metrics:
call_volume:
total: 0
completed: 0
abandoned: 0
transferred_to_human: 0
quality:
avg_call_duration_sec: 0
first_call_resolution_pct: 0
avg_response_latency_ms: 0
stt_accuracy_pct: 0
intent_accuracy_pct: 0
business:
appointments_booked: 0
issues_resolved: 0
revenue_influenced: 0
cost_per_call: 0
human_cost_avoided: 0
sentiment:
positive_pct: 0
neutral_pct: 0
negative_pct: 0
escalation_rate_pct: 0
```
### Alert Rules
| Metric | Warning | Critical | Action |
|--------|---------|----------|--------|
| Response latency | >1.5s avg | >2.5s avg | Scale infra or switch STT |
| Error rate | >5% | >15% | Check API health, failover |
| Transfer rate | >30% | >50% | Review conversation design |
| Abandonment | >15% | >25% | Check wait times, greeting |
| CSAT (if measured) | <3.5/5 | <3.0/5 | Review call recordings |
| STT word error rate | >10% | >20% | Switch STT provider |
### Call Review Process
**Weekly:** Review 20 random calls + all escalated calls
- Score each 1-5: greeting, understanding, resolution, closing, professionalism
- Identify top 3 failure patterns → fix conversation design
- Track improvement week over week
**Monthly:** Deep analysis
- Cohort analysis: new vs returning callers
- Time-of-day patterns
- Common unresolved intents (= feature requests)
- Cost analysis: AI cost vs human equivalent
---
## Phase 10: Scaling & Optimization
### Cost Optimization Strategies
| Strategy | Savings | Effort |
|----------|---------|--------|
| Use smaller LLM for simple intents | 40-60% | Medium |
| Cache common responses | 20-30% | Low |
| Reduce STT streaming window | 10-15% | Low |
| Optimize prompt length | 10-20% | Low |
| Route simple calls to rule-based IVR | 50-70% | High |
| Negotiate volume pricing with providers | 15-30% | Low |
### Cost Per Call Calculator
```
Cost per minute =
STT ($0.006/min Deepgram)
+ LLM ($0.01-0.05/min depending on model & tokens)
+ TTS ($0.01-0.03/min depending on provider)
+ Telephony ($0.01-0.02/min Twilio)
+ Platform fee ($0.00-0.05/min if using managed)
= ~$0.04-0.15/min
Average 3-minute call = $0.12-0.45/call
Human agent cost = $0.50-2.00/min = $1.50-6.00/call
ROI = (human_cost - ai_cost) × call_volume × 30 days
```
### Scaling Checklist
- [ ] Load test: can handle 2x expected peak concurrent calls
- [ ] Auto-scaling configured for STT/LLM/TTS
- [ ] Graceful degradation: "We're experiencing high call volume" message
- [ ] Queue management with estimated wait times
- [ ] Geographic routing for multi-region deployments
- [ ] Failover: secondary STT/TTS provider configured
- [ ] Rate limiting per caller (prevent abuse)
---
## Phase 11: Advanced Patterns
### Multi-Language Support
```yaml
language_routing:
detection_method: "first_3_seconds" # Detect language from initial speech
supported:
- code: "en"
voice_id: "alloy"
system_prompt: "prompts/en.md"
- code: "es"
voice_id: "nova"
system_prompt: "prompts/es.md"
unsupported_response: "I'm sorry, I can only assist in English and Spanish right now. Let me transfer you to an agent."
```
### Warm Transfer Protocol
```yaml
warm_transfer:
trigger: "caller_requests_human OR escalation_threshold"
steps:
1: "Agent to caller: 'I'm going to connect you with a specialist. One moment please.'"
2: "[Dial human agent with context whisper]"
3: "Whisper to human: 'Incoming transfer. Caller: [name]. Issue: [summary]. Already tried: [actions taken].'"
4: "[Bridge caller and human agent]"
5: "[AI agent disconnects, logs full transcript to CRM]"
fallback:
no_human_available: "I'm sorry, all our specialists are currently helping other customers. Can I schedule a callback for you?"
```
### Sentiment-Adaptive Behavior
```yaml
sentiment_adaptation:
frustrated:
- Slow down speech by 10%
- Acknowledge frustration: "I understand this is frustrating."
- Offer human transfer proactively
- Skip upsells/surveys
happy:
- Match energy level
- Can include brief satisfaction survey
- Appropriate for cross-sell/upsell mentions
confused:
- Slow down significantly
- Use simpler language
- Offer to repeat or explain differently
- "Would it help if I broke that down step by step?"
```
### Voicemail & Async Patterns
```yaml
voicemail:
detection: "silence_or_beep_after_20s"
message_template: |
Hi [NAME], this is [AGENT] from [COMPANY] calling about [REASON].
Please call us back at [NUMBER] at your convenience.
Our hours are [HOURS]. Thank you!
max_duration_seconds: 30
retry_schedule: [4_hours, 24_hours, 72_hours]
max_attempts: 3
```
---
## Phase 12: Quality Scoring & Review
### Voice Agent Quality Rubric (0-100)
| Dimension | Weight | Score |
|-----------|--------|-------|
| Conversation accuracy (correct info, right actions) | 25% | /25 |
| Response latency (<1.5s target) | 20% | /20 |
| Voice naturalness & comprehension | 15% | /15 |
| Error handling & recovery | 15% | /15 |
| Compliance adherence | 10% | /10 |
| Integration reliability (tools work) | 10% | /10 |
| User satisfaction (CSAT/transfer rate) | 5% | /5 |
| **Total** | **100%** | **/100** |
**Grading:** 90+ = production-ready. 75-89 = good with improvements. 60-74 = needs work. <60 = don't launch.
### 10 Common Mistakes
| # | Mistake | Fix |
|---|---------|-----|
| 1 | Responses too long for phone | Max 2 sentences per turn |
| 2 | No filler during tool calls | Add "Let me check..." phrases |
| 3 | Ignoring latency budget | Profile every component |
| 4 | No human escalation path | Always offer transfer option |
| 5 | Testing on laptop, not phone | Test through real phone network |
| 6 | Stacking multiple questions | One question at a time |
| 7 | No silence handling | Add timeout + "Are you still there?" |
| 8 | Card numbers through LLM | Secure IVR redirect for payments |
| 9 | Ignoring recording consent laws | Disclose at call start |
| 10 | No post-call logging | Write summary + transcript to CRM |
### Weekly Review Template
```yaml
weekly_review:
date: ""
calls_reviewed: 20
scores:
avg_accuracy: 0
avg_latency_ms: 0
escalation_rate: 0%
top_3_issues:
- issue: ""
frequency: 0
fix: ""
improvements_shipped: []
next_week_priorities: []
```
---
## Natural Language Commands
1. "Design a voice agent for [use case]" → Full brief + conversation flow + system prompt
2. "Compare voice AI platforms for [requirements]" → Platform selection matrix
3. "Write a system prompt for a [role] voice agent" → Optimized voice prompt
4. "Create conversation flows for [scenario]" → Turn-by-turn YAML design
5. "Audit my voice agent for compliance" → Regulatory checklist by jurisdiction
6. "Calculate voice agent ROI for [volume] calls/day" → Cost analysis
7. "Design the test suite for my voice agent" → Complete test scenarios
8. "Optimize my voice agent latency" → Component-by-component analysis
9. "Set up monitoring for my voice agent" → Dashboard + alert rules
10. "Build a warm transfer protocol" → Complete handoff design
11. "Review this call transcript" → Score + improvement recommendations
12. "Scale my voice agent from [X] to [Y] calls/day" → Scaling plan
---
*Built by AfrexAI — AI agents that work. Zero dependencies.*Related Skills
Invoice Generator
Creates professional invoices in markdown and HTML
Presentation Mastery — Complete Slide Design & Delivery System
You are a Presentation Architect. You help build presentations that persuade, inform, and move people to action. You cover the full lifecycle: audience analysis → narrative structure → slide design → delivery coaching → post-presentation follow-up.
afrexai-performance-engineering
Complete performance engineering system — profiling, optimization, load testing, capacity planning, and performance culture. Use when diagnosing slow applications, optimizing code/queries/infrastructure, load testing before launch, planning capacity, or building performance into CI/CD. Covers Node.js, Python, Go, Java, databases, APIs, and frontend.
OpenClaw Mastery — The Complete Agent Engineering & Operations System
> Built by AfrexAI — the team that runs 9+ production agents 24/7 on OpenClaw.
Next.js Production Engineering
> Complete methodology for building, optimizing, and operating production Next.js applications. From architecture decisions to deployment strategies — everything beyond "hello world."
n8n Workflow Mastery — Complete Automation Engineering System
You are an expert n8n workflow architect. You design, build, debug, optimize, and scale n8n automations following production-grade methodology. Every workflow you create is complete, functional, and follows the patterns in this guide.
ML & AI Engineering System
Complete methodology for building, deploying, and operating production ML/AI systems — from experiment to scale.
MCP Engineering — Complete Model Context Protocol System
Build, integrate, secure, and scale MCP servers and clients. From first server to production multi-tool architecture.
Go-to-Market Strategy Builder
Build a complete GTM plan for product launches, market entries, or expansion plays. Covers positioning, channel strategy, pricing, launch timeline, and success metrics.
Growth Engineering Mastery
> Complete growth system: experimentation engine, viral mechanics, channel playbooks, funnel optimization, retention loops, and scaling frameworks. From zero users to exponential growth.
Go Production Engineering
You are a Go production engineering expert. Follow this system for every Go project — from architecture decisions through production deployment. Apply phases sequentially for new projects; use individual phases as needed for existing codebases.
Git Engineering & Repository Strategy
You are a Git Engineering expert. You help teams design branching strategies, implement code review workflows, manage monorepos, automate releases, and maintain healthy repository practices at scale.