phone-call

Make autonomous phone calls with AI voice using Twilio, Deepgram, and ElevenLabs

3,891 stars

Best use case

phone-call is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using phone-call should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/phone-call/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/arein/concierge/skills/phone-call/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/phone-call/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How phone-call Compares

| Feature / Agent | phone-call | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Make autonomous phone calls with AI voice using Twilio, Deepgram, and ElevenLabs

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Phone Call Skill

Make autonomous phone calls with a goal-driven AI agent. The AI handles the conversation until the goal is achieved.

## Prerequisites

1. **Required configuration:**
   ```bash
   concierge config set twilioAccountSid <your-sid>
   concierge config set twilioAuthToken <your-token>
   concierge config set twilioPhoneNumber <your-number>
   concierge config set deepgramApiKey <your-key>
   concierge config set elevenLabsApiKey <your-key>
   concierge config set elevenLabsVoiceId <voice-id>
   concierge config set anthropicApiKey <your-key>
   ```

2. **Optional for auto-managed ngrok:**
   ```bash
   concierge config set ngrokAuthToken <your-ngrok-token>
   ```
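The keys above can be checked and applied in one pass. This is a minimal sketch, assuming the secrets live in environment variables; the variable names (`TWILIO_ACCOUNT_SID` and so on) and the helpers `missing_config_vars` and `apply_config` are hypothetical. Only the `concierge config set <key> <value>` form comes from this document.

```bash
# Map each concierge config key (from the list above) to an environment
# variable. The variable names are hypothetical; pick your own.
CONFIG_MAP="twilioAccountSid=TWILIO_ACCOUNT_SID
twilioAuthToken=TWILIO_AUTH_TOKEN
twilioPhoneNumber=TWILIO_PHONE_NUMBER
deepgramApiKey=DEEPGRAM_API_KEY
elevenLabsApiKey=ELEVENLABS_API_KEY
elevenLabsVoiceId=ELEVENLABS_VOICE_ID
anthropicApiKey=ANTHROPIC_API_KEY"

# Print the name of every mapped environment variable that is unset or empty.
missing_config_vars() {
  for pair in $CONFIG_MAP; do
    var=${pair#*=}
    eval "value=\${$var:-}"
    [ -n "$value" ] || printf '%s\n' "$var"
  done
}

# Write every mapped value with `concierge config set <key> <value>`.
apply_config() {
  for pair in $CONFIG_MAP; do
    key=${pair%%=*}
    var=${pair#*=}
    eval "value=\${$var:-}"
    concierge config set "$key" "$value" || return 1
  done
}
```

Run `missing_config_vars` first. If it prints nothing, `apply_config` writes all seven required keys.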

## Usage

### Basic call
```bash
concierge call "+1-555-123-4567" \
  --goal "Book a hotel room for February 15" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --context "2 nights, king bed preferred"
```

### Interactive mode
```bash
concierge call "+1-555-123-4567" \
  --goal "Make a reservation" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --interactive
```
In interactive mode, you type what the AI should say in real time.

### Infrastructure behavior

- By default, `call` auto-starts `ngrok` and `server` if the server is unavailable.
- Use `--no-auto-infra` to disable this and run everything manually.
- Auto-managed processes are stopped automatically when the call ends.
- Log files are written to:
  - `~/.config/concierge/call-runs/<run-id>/server.log`
  - `~/.config/concierge/call-runs/<run-id>/ngrok.log`
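To inspect the logs from the most recent call, it helps to locate the newest run directory. The sketch below assumes each run gets its own directory under `call-runs` and that modification time reflects run order; `latest_run_dir` is a hypothetical helper, not part of the `concierge` CLI.

```bash
# Print the most recently modified run directory under a call-runs root.
# The default root is the log location listed above.
latest_run_dir() {
  root=${1:-"$HOME/.config/concierge/call-runs"}
  # -t sorts newest first; -d lists the directories themselves, not their contents.
  ls -1dt "$root"/*/ 2>/dev/null | head -n 1
}
```

The printed path ends with a slash, so `tail -n 50 "$(latest_run_dir)server.log"` shows the end of the latest server log.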

### Server management
```bash
# Check server status
concierge server status

# Start server
concierge server start --public-url <ngrok-url>

# Stop server
concierge server stop
```
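In manual mode, `--public-url` needs the address of a running ngrok tunnel. The sketch below pulls that address out of the tunnel list served by ngrok's local agent API. It assumes the agent API is on its default address (`http://127.0.0.1:4040/api/tunnels`) and that the server listens on port 3000, the port named under Troubleshooting; `extract_public_url` is a hypothetical helper.

```bash
# Read ngrok's tunnel list (JSON) on stdin and print the first https public URL.
extract_public_url() {
  grep -o '"public_url": *"https://[^"]*"' | head -n 1 | cut -d'"' -f4
}

# Possible manual flow (assumptions: port 3000, ngrok agent API on 4040):
#   ngrok http 3000 > /dev/null &
#   url=$(curl -s http://127.0.0.1:4040/api/tunnels | extract_public_url)
#   concierge server start --public-url "$url"
```

After the server is up, run `concierge call` with `--no-auto-infra` so it uses the manually started processes.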

## Preflight checks

Before dialing, the system validates:
- Local runtime dependencies (`ffmpeg` binary + MP3 decode support, plus `ngrok` if auto-infra is used)
- Twilio credentials/account status/from-number availability
- Deepgram API key/auth reachability
- ElevenLabs character quota sufficiency (estimated call budget)
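The local part of these checks can be approximated by hand before running a call. This is a rough sketch, not the tool's own preflight: the helper names are hypothetical, and it covers only the `ffmpeg` and `ngrok` checks, not the Twilio, Deepgram, or ElevenLabs ones.

```bash
# Succeed if the named program is on PATH.
have_binary() {
  command -v "$1" >/dev/null 2>&1
}

# Succeed if the local ffmpeg build lists an MP3 decoder.
ffmpeg_has_mp3() {
  ffmpeg -hide_banner -decoders 2>/dev/null | grep -qw 'mp3'
}

# Print one line per local dependency problem; print nothing if all is well.
local_preflight() {
  have_binary ffmpeg || echo "ffmpeg not found on PATH"
  have_binary ffmpeg && ! ffmpeg_has_mp3 && echo "ffmpeg lacks an MP3 decoder"
  have_binary ngrok || echo "ngrok not found (needed only for auto-infra)"
  return 0
}
```

Calling `local_preflight` with no output means both binaries were found and `ffmpeg` reports MP3 decode support.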

## How It Works

1. CLI sends a call request with goal + customer identity details
2. The server places the call via Twilio
3. Audio streams bidirectionally via WebSocket
4. Deepgram transcribes human speech in real time
5. Claude generates appropriate responses
6. ElevenLabs synthesizes speech for responses
7. Call continues until goal is achieved or human hangs up
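The control flow of steps 4 through 7 can be illustrated with a toy loop. Everything here is a stand-in: `generate_reply` and `goal_achieved` are hypothetical stubs that match on the word "confirmed", whereas the real pipeline calls Deepgram, Claude, and ElevenLabs. Each line on stdin plays the role of one transcribed utterance, and end of input plays the role of the human hanging up.

```bash
# Stub for response generation (steps 5-6 in the real pipeline).
generate_reply() {
  case $1 in
    *confirmed*) echo "Thank you, goodbye." ;;
    *) echo "Could you confirm the booking?" ;;
  esac
}

# Toy goal check: the goal counts as achieved once the human says "confirmed".
goal_achieved() {
  case $1 in
    *confirmed*) return 0 ;;
    *) return 1 ;;
  esac
}

# Step 7: keep replying until the goal is achieved or input ends.
run_call() {
  while IFS= read -r heard; do
    echo "AI: $(generate_reply "$heard")"
    if goal_achieved "$heard"; then
      echo "RESULT: goal achieved"
      return 0
    fi
  done
  echo "RESULT: caller hung up"
  return 1
}
```

For example, `printf 'Hello\nYour room is confirmed\n' | run_call` replies twice and then reports that the goal was achieved.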

## Examples

### Book a hotel reservation
```bash
concierge call "+1-800-HILTON" \
  --goal "Book a room for 2 nights" \
  --name "Sarah Johnson" \
  --email "sarah@example.com" \
  --customer-phone "+1-555-000-2222" \
  --context "Check-in: March 10, Guest: Sarah Johnson, King bed, non-smoking"
```

### Make a restaurant reservation
```bash
concierge call "+1-555-DINER" \
  --goal "Reserve a table for dinner" \
  --name "Garcia" \
  --email "garcia@example.com" \
  --customer-phone "+1-555-000-3333" \
  --context "Party of 4, 7:30 PM, Saturday, name: Garcia"
```

### Cancel an appointment
```bash
concierge call "+1-555-DOCTOR" \
  --goal "Cancel appointment" \
  --name "Mike Chen" \
  --email "mike@example.com" \
  --customer-phone "+1-555-000-4444" \
  --context "Patient: Mike Chen, Appointment on Tuesday at 2 PM"
```

## Supported Voice IDs

Some popular ElevenLabs voices:
- `EXAVITQu4vr4xnSDxMaL` - Bella (default, conversational female)
- `pNInz6obpgDQGcFmaJgB` - Adam (conversational male)
- `21m00Tcm4TlvDq8ikWAM` - Rachel (narration)
- `AZnzlk1XvdvUeBnXmlld` - Domi (young female)

Set your preferred voice:
```bash
concierge config set elevenLabsVoiceId <voice-id>
```

## Latency

Target voice-to-voice latency: < 500ms

- Deepgram STT: ~150ms
- Response generation: ~100-200ms
- ElevenLabs TTS: ~75ms
- Network: ~50ms
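As a quick check, the component estimates above add up to less than the 500 ms target at both ends of the response-generation range. The helper below is a hypothetical illustration of that arithmetic only; it does not measure anything.

```bash
# Sum per-stage latencies (ms) and compare the total against a target.
# Usage: latency_budget <target-ms> <stage-ms>...
latency_budget() {
  target=$1
  shift
  total=0
  for ms in "$@"; do
    total=$(( total + ms ))
  done
  if [ "$total" -lt "$target" ]; then
    echo "$total ok"
  else
    echo "$total over"
  fi
}

latency_budget 500 150 100 75 50   # prints "375 ok" (fastest response generation)
latency_budget 500 150 200 75 50   # prints "475 ok" (slowest response generation)
```

With these figures the slowest case leaves about 25 ms of headroom, so any extra network delay can push a turn past the target.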

## Troubleshooting

### Server won't start
- Check all config keys are set: `concierge config show`
- If using manual mode, ensure ngrok is running and URL is correct
- Check port 3000 is available

### Call not connecting
- Verify Twilio phone number is active
- Check Twilio account has sufficient balance
- Ensure ngrok URL is publicly accessible (manual mode)

### TTS fails mid-call
- Check ElevenLabs quota/credits.
- The preflight check usually catches this before dialing.
- If it still happens, reduce prompt/context length or top up ElevenLabs.

### Audio quality issues
- ElevenLabs uses optimized phone call settings
- Deepgram uses the phone call model
- Call audio is 8 kHz (telephone quality)

Related Skills

Bland AI — Voice Calling Skill

3891
from openclaw/skills

Make and manage AI-powered phone calls via the Bland AI API.

Workflow & Productivity

openclaw-phone

3891
from openclaw/skills

Use CallMyCall API to start, end, and check AI phone calls, and return results in chat. Use when the user asks to call someone, plan a future call, end a call, or fetch call results.

clawphone-wechat-control

3891
from openclaw/skills

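Handles the WeChat conversation list, entering chats, sending messages, handling pop-ups inside WeChat, and troubleshooting chat-page failures. Use when the user asks to view WeChat messages, reply to a contact, forward a message, work with the chat input box, or deal with a failed send. Before acting, always confirm which WeChat page is currently open, then work through the chat scenario one step at a time, verifying each step.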

clawphone-phone-control

3891
from openclaw/skills

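Uses the phone-control MCP to perceive and operate the phone's interface. Use for reading the current phone state, opening apps, handling pop-ups, tapping controls, entering text, and troubleshooting phone automation failures. When executing, read the interface state first; any coordinate-based tap must be determined from the current screenshot, and historical coordinates must never be treated as general rules.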

phone-calls

3891
from openclaw/skills

Make and manage real phone calls through Twilio. Handles outbound calls with a stated objective, monitors call progress, and returns transcripts and summaries. Use when the user wants to call someone, check on a call, or review call history.

call-screening

3891
from openclaw/skills

Screen incoming phone calls with an AI receptionist. Amber answers calls, identifies the caller, determines the purpose, takes a message, and delivers a structured summary. Use when the user wants to set up call screening, check screened call results, or customize screening behavior.

tool-call-retry

3891
from openclaw/skills

Auto retry & fix LLM tool calls with exponential backoff, format validation, error correction, boost tool call success rate by 90%

feishu-memory-recall

3891
from openclaw/skills

Cross-group memory, search, and event sharing for OpenClaw Feishu agents

calling-agent-squad

3891
from openclaw/skills

Activate a multi-agent team (the Squad) to manage complex projects, business tasks, or development workflows. The squad includes a Manager, Architect, Coder, Reviewer, and Observer. Use when the user wants to "call a squad", "start a project", or "deploy squad" with specialized roles and quality control loops.

memory-totalrecall

3891
from openclaw/skills

Total Recall memory backend — git-branch-based persistent memory store with time-decay relevance.

call-web-search-agent

3891
from openclaw/skills

AI agent for call web search agent tasks

call-web-search-agent-strategy

3891
from openclaw/skills

AI agent for call web search agent strategy tasks