phone-voice

Connect ElevenLabs Agents to your OpenClaw via phone with Twilio. Includes caller ID auth, voice PIN security, call screening, memory injection, and cost tracking.

7 stars

Best use case

phone-voice is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Connect ElevenLabs Agents to your OpenClaw via phone with Twilio. Includes caller ID auth, voice PIN security, call screening, memory injection, and cost tracking.

Teams using phone-voice should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/phone-voice/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/cortexuvula/phone-voice/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/phone-voice/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How phone-voice Compares

Feature / Agentphone-voiceStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Connect ElevenLabs Agents to your OpenClaw via phone with Twilio. Includes caller ID auth, voice PIN security, call screening, memory injection, and cost tracking.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Phone Voice Integration

Turn your OpenClaw into a phone-callable assistant with ElevenLabs Agents + Twilio.

**What you get:**
- 📞 Call your bot from any phone
- 🔐 Caller ID authentication + voice PIN security
- 🛡️ Call screening (whitelist-based)
- 🧠 Full memory context (loads MEMORY.md, USER.md)
- 💰 Cost tracking per call
- 📝 Call transcripts with summaries
- ⏱️ Rate limiting
- 🌐 Permanent tunnel (Cloudflare) or temporary (ngrok)

## Architecture

```
Phone → Twilio → ElevenLabs Agent → Your Bridge → Anthropic Claude → OpenClaw Tools
                                          ↓
                                    Memory Context
                                    (MEMORY.md, USER.md)
```

**Flow:**
1. Caller dials your Twilio number
2. Twilio routes to ElevenLabs Agent
3. Agent sends chat completions to your bridge (mimics OpenAI API)
4. Bridge translates to Anthropic, injects context from memory files
5. Claude response → ElevenLabs TTS → caller hears it

## Prerequisites

- OpenClaw installed and running
- ElevenLabs account + API key
- Twilio account + phone number
- Anthropic API key
- Cloudflare tunnel **or** ngrok (for exposing localhost)

## Setup

### 1. Enable Chat Completions in OpenClaw

Not needed for this skill — the bridge bypasses OpenClaw and calls Claude directly. This gives you more control over memory injection and cost tracking.

### 2. Create the Bridge Server

The bridge is a FastAPI server that:
- Accepts OpenAI-compatible `/v1/chat/completions` requests from ElevenLabs
- Injects memory context (MEMORY.md, USER.md, live data)
- Calls Anthropic Claude API
- Streams responses back in OpenAI format
- Logs costs and transcripts

**Key files:**
- `server.py` — FastAPI app with /v1/chat/completions endpoint
- `fred_prompt.py` — System prompt builder (loads memory files)
- `.env` — Secrets (API keys, tokens, whitelist)
- `contacts.json` — Caller whitelist for screening

### 3. Set Up Cloudflare Tunnel (Recommended)

Permanent, secure alternative to ngrok:

```bash
# Install cloudflared
brew install cloudflare/cloudflare/cloudflared

# Login and configure
cloudflared tunnel login
cloudflared tunnel create <tunnel-name>

# Run the tunnel
cloudflared tunnel --url http://localhost:8013 run <tunnel-name>
```

Add a CNAME in Cloudflare DNS:
```
voice.yourdomain.com → <tunnel-id>.cfargotunnel.com
```

**Or use ngrok (temporary):**
```bash
ngrok http 8013
```

### 4. Configure ElevenLabs Agent

#### Option A: Manual (UI)
1. Go to ElevenLabs dashboard → Conversational AI
2. Create new agent
3. Under LLM settings → Custom LLM
4. Set URL: `https://voice.yourdomain.com/v1/chat/completions`
5. Add header: `Authorization: Bearer <YOUR_BRIDGE_TOKEN>`

#### Option B: Programmatic (API)

```bash
# Step 1: Store your bridge auth token as a secret
curl -X POST https://api.elevenlabs.io/v1/convai/secrets \
  -H "xi-api-key: YOUR_ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "new",
    "name": "bridge_auth_token",
    "value": "YOUR_BRIDGE_AUTH_TOKEN"
  }'

# Response: {"secret_id": "abc123..."}

# Step 2: Create the agent
curl -X POST https://api.elevenlabs.io/v1/convai/agents/create \
  -H "xi-api-key: YOUR_ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation_config": {
      "agent": {
        "language": "en",
        "prompt": {
          "llm": "custom-llm",
          "prompt": "You are a helpful voice assistant.",
          "custom_llm": {
            "url": "https://voice.yourdomain.com/v1/chat/completions",
            "api_key": {"secret_id": "abc123..."}
          }
        }
      }
    }
  }'
```

### 5. Connect Twilio Phone Number

In ElevenLabs agent settings:
1. Go to **Phone** section
2. Enter Twilio Account SID and Auth Token
3. Select your Twilio phone number
4. Save

Done! Your bot now answers that phone number.

## Security Features

### Caller ID Authentication
Recognizes whitelisted numbers automatically:
```json
// contacts.json
{
  "+12505551234": {
    "name": "Alice",
    "role": "family"
  }
}
```

### Voice PIN Challenge
For unknown callers or high-security actions:
```python
VOICE_PIN = "banana"  # Set in .env
```

Caller must say the PIN to proceed.

### Call Screening
Unknown numbers get a receptionist prompt:
> "This is Fred's assistant. I can take a message or help with general questions."

### Rate Limiting
Configurable per-hour limits:
```python
RATE_LIMIT_PER_HOUR = 10
```

Prevents abuse and runaway costs.

## Memory Injection

The bridge auto-loads context before each call:

**Files read:**
- `MEMORY.md` — Long-term facts about user, projects, preferences
- `USER.md` — User profile (name, location, timezone)
- Recent call transcripts (cross-call memory)

**Live data injection:**
- Current time/date
- Weather (optional, via API)
- Calendar events (optional, via gog CLI)

All injected into the system prompt before Claude sees the conversation.

## Cost Tracking

Every call logs to `memory/voice-calls/costs.jsonl`:

```json
{
  "call_sid": "CA123...",
  "timestamp": "2026-02-03T10:30:00",
  "caller": "+12505551234",
  "duration_sec": 45,
  "total_cost_usd": 0.12,
  "breakdown": {
    "twilio": 0.02,
    "elevenlabs": 0.08,
    "anthropic": 0.02
  }
}
```

Run analytics on the JSONL to track monthly spend.

## Usage Example

**Call your bot:**
1. Dial your Twilio number
2. If you're whitelisted → casual conversation starts
3. If you're unknown → receptionist mode
4. Ask it to check your calendar, send a message, set a reminder, etc.

**Outbound calling (optional):**
```bash
curl -X POST https://voice.yourdomain.com/call/outbound \
  -H "Authorization: Bearer <BRIDGE_TOKEN>" \
  -d '{"to": "+12505551234", "message": "Reminder: dentist at 3pm"}'
```

## Configuration Options

**Environment variables (.env):**
```bash
ANTHROPIC_API_KEY=sk-ant-...
ELEVENLABS_API_KEY=sk_...
ELEVENLABS_AGENT_ID=agent_...
TWILIO_ACCOUNT_SID=AC...
TWILIO_AUTH_TOKEN=...
TWILIO_NUMBER=+1...
LLM_BRIDGE_TOKEN=<random-secure-token>
VOICE_PIN=<your-secret-word>
CLAWD_DIR=/path/to/clawd
```

**Whitelist (contacts.json):**
```json
{
  "+12505551234": {"name": "Alice", "role": "family"},
  "+12505555678": {"name": "Bob", "role": "friend"}
}
```

## Advanced: Office Hours

Restrict calls to business hours:

```python
# In server.py
OFFICE_HOURS = {
    "enabled": True,
    "timezone": "America/Vancouver",
    "weekdays": {"start": "09:00", "end": "17:00"},
    "weekends": False
}
```

Outside hours → voicemail prompt.

## Debugging

**Test the bridge directly:**
```bash
curl -X POST https://voice.yourdomain.com/v1/chat/completions \
  -H "Authorization: Bearer <BRIDGE_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
```

**Check logs:**
```bash
tail -f ~/clawd/memory/voice-calls/bridge.log
```

**Verify Twilio webhook:**
1. Call your number
2. Check Twilio console → Call logs → Webhook status
3. Should see 200 responses from ElevenLabs

## Cost Estimates

**Per-minute breakdown:**
- Twilio: ~$0.01/min (inbound) + carrier fees
- ElevenLabs TTS: ~$0.05/min (varies by voice quality)
- Anthropic Claude: ~$0.01/min (depends on token usage)
- **Total: ~$0.07-0.10/min** (~$4-6/hour of talk time)

Use rate limiting and call screening to control costs.

## Comparison: This vs Basic Tutorial

**ElevenLabs official tutorial:**
- ✅ Basic integration
- ❌ No security
- ❌ No memory persistence
- ❌ No cost tracking
- ❌ Temporary ngrok URL

**This skill (Phone Voice v2.0):**
- ✅ All of the above
- ✅ Caller ID + PIN security
- ✅ Cross-call memory
- ✅ Cost tracking & analytics
- ✅ Permanent tunnel (Cloudflare)
- ✅ Rate limiting
- ✅ Call screening
- ✅ Transcript logging

## Links

- ElevenLabs Agents: https://elevenlabs.io/conversational-ai
- Twilio: https://www.twilio.com/
- Cloudflare Tunnels: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps/
- Reference implementation: (Available on request — DM @FredMolty)

## License

MIT — use freely, credit appreciated.

---

**Built by Fred (@FredMolty) — running on OpenClaw since 2026.**

Related Skills

phone-agent

7
from Demerzels-lab/elsamultiskillagent

Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to: (1) Test voice AI capabilities, (2) Handle phone calls programmatically, (3) Build a conversational voice bot.

invoice-tracker-pro

7
from Demerzels-lab/elsamultiskillagent

Complete freelance billing workflow — generate professional invoices, track payment status, send automated.

invoice-template

7
from Demerzels-lab/elsamultiskillagent

Free simple invoice generator.

voicemonkey

7
from Demerzels-lab/elsamultiskillagent

Control Alexa devices via VoiceMonkey API v2 - make announcements, trigger routines, start flows, and display media.

vibevoice

7
from Demerzels-lab/elsamultiskillagent

Local Spanish TTS using Microsoft VibeVoice.

apipick-telegram-phone-check

7
from Demerzels-lab/elsamultiskillagent

Check if a phone number is registered on Telegram using the apipick Telegram Checker API.

percept-voice-cmd

7
from Demerzels-lab/elsamultiskillagent

Voice command detection and action execution for OpenClaw agents.

free-groq-voice

7
from Demerzels-lab/elsamultiskillagent

FREE voice recognition using Groq's complimentary Whisper API.

voice-recognition

7
from Demerzels-lab/elsamultiskillagent

Local speech-to-text with OpenAI Whisper CLI.

x-voice-match

7
from Demerzels-lab/elsamultiskillagent

Analyze a Twitter/X account's posting style and generate authentic posts that match their voice. Use when the user wants to create X posts that sound like them, analyze their posting patterns, or maintain consistent voice across posts. Works with Bird CLI integration.

jarvis-voice

7
from Demerzels-lab/elsamultiskillagent

Metallic AI voice persona with TTS and visual transcript styling.

voiceai-voiceover-creator

7
from Demerzels-lab/elsamultiskillagent

Turn scripts into publishable voiceovers with Voice.ai TTS, including segments, chapters, captions, and video muxing.