phone-call

Make autonomous phone calls with AI voice using Twilio, Deepgram, and ElevenLabs

by openclaw

Best use case

phone-call is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using phone-call can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/phone-call/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/arein/concierge/skills/phone-call/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/phone-call/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How phone-call Compares

Feature / Agent	phone-call	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It places autonomous, goal-driven phone calls with an AI voice: Twilio carries the call, Deepgram transcribes the other party, and ElevenLabs synthesizes the spoken replies.

Where can I find the source code?

The source is in the openclaw/skills repository on GitHub, under skills/arein/concierge/skills/phone-call.

SKILL.md Source

# Phone Call Skill

Make autonomous phone calls with a goal-driven AI agent. The AI handles the conversation until the goal is achieved.

## Prerequisites

1. **Required configuration:**
   ```bash
   concierge config set twilioAccountSid <your-sid>
   concierge config set twilioAuthToken <your-token>
   concierge config set twilioPhoneNumber <your-number>
   concierge config set deepgramApiKey <your-key>
   concierge config set elevenLabsApiKey <your-key>
   concierge config set elevenLabsVoiceId <voice-id>
   concierge config set anthropicApiKey <your-key>
   ```

2. **Optional for auto-managed ngrok:**
   ```bash
   concierge config set ngrokAuthToken <your-ngrok-token>
   ```
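
The keys above can also be applied in one pass from environment variables. This is a minimal sketch, not part of concierge: the `CONCIERGE_<key>` variable names and the `apply_config_from_env` helper are hypothetical, and only the `concierge config set` command comes from this document.

```bash
# Sketch: apply the required concierge config keys from environment variables.
# ASSUMPTION: the CONCIERGE_<key> variable names are hypothetical, not defined by concierge.

REQUIRED_KEYS="twilioAccountSid twilioAuthToken twilioPhoneNumber deepgramApiKey elevenLabsApiKey elevenLabsVoiceId anthropicApiKey"

# Usage: apply_config_from_env [runner]
# runner defaults to the real `concierge` binary; pass `echo` for a dry run.
apply_config_from_env() {
  runner="${1:-concierge}"
  missing=0
  for key in $REQUIRED_KEYS; do
    var="CONCIERGE_${key}"
    # Indirect lookup; $var is built only from the fixed key list above.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var" >&2
      missing=1
    else
      "$runner" config set "$key" "$value"
    fi
  done
  # Non-zero when at least one required variable was not set.
  return "$missing"
}
```

Running `apply_config_from_env echo` prints the commands that would be executed without touching the stored configuration, which is a cheap way to spot a missing key before the first call.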

## Usage

### Basic call
```bash
concierge call "+1-555-123-4567" \
  --goal "Book a hotel room for February 15" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --context "2 nights, king bed preferred"
```

### Interactive mode
```bash
concierge call "+1-555-123-4567" \
  --goal "Make a reservation" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --interactive
```
In interactive mode, you type what the AI should say in real time.

### Infrastructure behavior

- By default, `call` auto-starts `ngrok` and `server` if the server is unavailable.
- Use `--no-auto-infra` to disable this and run everything manually.
- Auto-managed processes are stopped automatically when the call ends.
- Log files are written to:
  - `~/.config/concierge/call-runs/<run-id>/server.log`
  - `~/.config/concierge/call-runs/<run-id>/ngrok.log`
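
To inspect the logs of the most recent run, something like the following may help. It is a sketch under one assumption: that the newest directory under `call-runs` (by modification time) belongs to the latest call. The `latest_run_dir` helper is hypothetical; only the log paths come from the list above.

```bash
# Sketch: locate the most recently modified run directory.
# ASSUMPTION: directory modification time reflects run order; the run-id format is not documented here.

# Usage: latest_run_dir [base-dir]
latest_run_dir() {
  base="${1:-$HOME/.config/concierge/call-runs}"
  # ls -t sorts newest first; -d lists the directories themselves.
  dir=$(ls -td "$base"/*/ 2>/dev/null | head -n 1)
  [ -n "$dir" ] || return 1
  # Strip the trailing slash left by the glob.
  printf '%s\n' "${dir%/}"
}

# Example:
#   tail -n 50 "$(latest_run_dir)/server.log"
#   tail -n 50 "$(latest_run_dir)/ngrok.log"
```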

### Server management
```bash
# Check server status
concierge server status

# Start server
concierge server start --public-url <ngrok-url>

# Stop server
concierge server stop
```
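
For manual mode, the `<ngrok-url>` has to come from a running ngrok agent. The sketch below shows one way to obtain it. It assumes the agent's local inspection API is on its default address (`127.0.0.1:4040`) and that the concierge server listens on port 3000, the port named under Troubleshooting. The `extract_public_url` helper is hypothetical.

```bash
# Sketch of manual mode: pull the public HTTPS URL out of ngrok's local API response.
# ASSUMPTIONS: ngrok inspection API on 127.0.0.1:4040 (its default); server on port 3000.

# Reads the /api/tunnels JSON on stdin and prints the first https public_url.
extract_public_url() {
  grep -o '"public_url": *"https://[^"]*"' \
    | head -n 1 \
    | sed 's/.*"\(https:[^"]*\)"$/\1/'
}

# Typical manual sequence (commented out so the helper can be sourced safely):
#   ngrok http 3000 &
#   sleep 2   # give the tunnel a moment to come up
#   url=$(curl -s http://127.0.0.1:4040/api/tunnels | extract_public_url)
#   concierge server start --public-url "$url"
#   concierge call "+1-555-123-4567" \
#     --goal "Make a reservation" \
#     --name "John Smith" \
#     --email "john@example.com" \
#     --customer-phone "+1-555-444-1212" \
#     --no-auto-infra
#   concierge server stop
```

The JSON is parsed with `grep` and `sed` to keep the sketch dependency-free; a JSON tool such as `jq` would be more robust if it is available.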

## Preflight checks

Before dialing, the system validates:
- Local runtime dependencies (`ffmpeg` binary + MP3 decode support, plus `ngrok` if auto-infra is used)
- Twilio credentials/account status/from-number availability
- Deepgram API key/auth reachability
- ElevenLabs character quota sufficiency (estimated call budget)
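
The local-dependency part of these checks can be reproduced by hand when a preflight failure needs debugging. The following is a sketch only, not concierge's own implementation; both helper names are hypothetical.

```bash
# Sketch of the local runtime checks (ffmpeg, MP3 decoding, ngrok).

# Prints each command from the argument list that is not on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Succeeds if the installed ffmpeg lists an MP3 decoder.
has_mp3_decoder() {
  ffmpeg -hide_banner -decoders 2>/dev/null | grep -q ' mp3 '
}

# Example:
#   missing=$(missing_commands ffmpeg ngrok)
#   [ -z "$missing" ] || echo "install first: $missing"
#   has_mp3_decoder || echo "ffmpeg cannot decode MP3"
```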

## How It Works

1. The CLI sends a call request with the goal and customer identity details
2. The server places the call via Twilio
3. Audio streams bidirectionally over a WebSocket
4. Deepgram transcribes human speech in real time
5. Claude generates appropriate responses
6. ElevenLabs synthesizes speech for the responses
7. The call continues until the goal is achieved or the human hangs up

## Examples

### Book a hotel reservation
```bash
concierge call "+1-800-HILTON" \
  --goal "Book a room for 2 nights" \
  --name "Sarah Johnson" \
  --email "sarah@example.com" \
  --customer-phone "+1-555-000-2222" \
  --context "Check-in: March 10, Guest: Sarah Johnson, King bed, non-smoking"
```

### Make a restaurant reservation
```bash
concierge call "+1-555-DINER" \
  --goal "Reserve a table for dinner" \
  --name "Garcia" \
  --email "garcia@example.com" \
  --customer-phone "+1-555-000-3333" \
  --context "Party of 4, 7:30 PM, Saturday, name: Garcia"
```

### Cancel an appointment
```bash
concierge call "+1-555-DOCTOR" \
  --goal "Cancel appointment" \
  --name "Mike Chen" \
  --email "mike@example.com" \
  --customer-phone "+1-555-000-4444" \
  --context "Patient: Mike Chen, Appointment on Tuesday at 2 PM"
```
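
The lettered numbers in these examples (`+1-800-HILTON`, `+1-555-DINER`, `+1-555-DOCTOR`) read as placeholders: none of them maps to a full 10-digit national number. Twilio expects numbers in E.164 form, so a small normalizer can catch this before dialing. This is a hedged sketch; the `to_e164` helper is hypothetical, and the source does not say whether concierge normalizes numbers itself.

```bash
# Sketch: normalize a phone number to E.164 before passing it to `concierge call`.
# ASSUMPTION: illustrative helper only; concierge may already do its own normalization.
to_e164() {
  case "$1" in
    +*) ;;
    *) echo "expected a leading + and country code: $1" >&2; return 1 ;;
  esac
  # Map keypad letters to digits, then drop everything that is not a digit.
  digits=$(printf '%s' "$1" \
    | tr '[:lower:]' '[:upper:]' \
    | tr 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' '22233344455566677778889999' \
    | tr -cd '0-9')
  len=${#digits}
  # E.164 allows at most 15 digits in total.
  if [ "$len" -eq 0 ] || [ "$len" -gt 15 ]; then
    echo "not a valid E.164 length: $1" >&2
    return 1
  fi
  # Country code 1 (NANP) numbers have exactly 11 digits including the 1.
  case "$digits" in
    1*)
      if [ "$len" -ne 11 ]; then
        echo "+1 numbers need 10 digits after the country code: $1" >&2
        return 1
      fi
      ;;
  esac
  printf '+%s\n' "$digits"
}

# Example:
#   concierge call "$(to_e164 "+1-555-123-4567")" --goal "Cancel appointment" ...
```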

## Supported Voice IDs

Some popular ElevenLabs voices:
- `EXAVITQu4vr4xnSDxMaL` - Sarah, formerly Bella (default, conversational female)
- `pNInz6obpgDQGcFmaJgB` - Adam (conversational male)
- `21m00Tcm4TlvDq8ikWAM` - Rachel (narration)
- `AZnzlk1XvdvUeBnXmlld` - Domi (young female)

Set your preferred voice:
```bash
concierge config set elevenLabsVoiceId <voice-id>
```

## Latency

Target voice-to-voice latency: < 500ms

- Deepgram STT: ~150ms
- Response generation: ~100-200ms
- ElevenLabs TTS: ~75ms
- Network: ~50ms
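
As a quick sanity check, the per-stage estimates above can be summed against the 500 ms target. The sketch below uses only the figures from this section; `sum_ms` is a hypothetical helper.

```bash
# Sums per-stage latency estimates (milliseconds) passed as arguments.
sum_ms() {
  total=0
  for ms in "$@"; do
    total=$((total + ms))
  done
  echo "$total"
}

budget_ms=500
best=$(sum_ms 150 100 75 50)    # STT + fastest response + TTS + network
worst=$(sum_ms 150 200 75 50)   # STT + slowest response + TTS + network

echo "best case:  ${best} ms (headroom $((budget_ms - best)) ms)"
echo "worst case: ${worst} ms (headroom $((budget_ms - worst)) ms)"
```

The stages sum to 375 ms in the best case and 475 ms in the worst, so the target holds only while response generation stays at or below the upper estimate.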

## Troubleshooting

### Server won't start
- Check all config keys are set: `concierge config show`
- If using manual mode, ensure ngrok is running and URL is correct
- Check port 3000 is available

### Call not connecting
- Verify Twilio phone number is active
- Check Twilio account has sufficient balance
- Ensure ngrok URL is publicly accessible (manual mode)

### TTS fails mid-call
- Check your ElevenLabs quota/credits.
- The preflight check usually catches this before dialing.
- If it still happens, reduce the prompt/context length or top up your ElevenLabs credits.
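
The remaining quota can also be checked directly against the ElevenLabs subscription endpoint. This is a sketch with assumptions: `ELEVENLABS_API_KEY` is a hypothetical variable name, `remaining_characters` is a hypothetical helper, and the JSON is parsed with `grep` and `sed` for brevity (a JSON tool such as `jq` would be more robust).

```bash
# Sketch: compute remaining ElevenLabs characters from the subscription response.

# Reads subscription JSON on stdin and prints character_limit - character_count.
remaining_characters() {
  json=$(cat)
  count=$(printf '%s' "$json" \
    | grep -o '"character_count": *[0-9]*' | head -n 1 | sed 's/[^0-9]*//')
  limit=$(printf '%s' "$json" \
    | grep -o '"character_limit": *[0-9]*' | head -n 1 | sed 's/[^0-9]*//')
  # Fail if either field is absent from the response.
  if [ -z "$count" ] || [ -z "$limit" ]; then
    return 1
  fi
  echo $((limit - count))
}

# Example:
#   curl -s -H "xi-api-key: $ELEVENLABS_API_KEY" \
#     https://api.elevenlabs.io/v1/user/subscription | remaining_characters
```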

### Audio quality issues
- ElevenLabs uses optimized phone call settings
- Deepgram uses the phone call model
- Audio is at 8 kHz (telephone quality)
