voice-memo

Send native iMessage voice bubbles with ElevenLabs TTS via BlueBubbles. Use when the user asks to send a voice message, wants something spoken aloud, requests storytelling or a summary, or when voice delivery would be more engaging than text. Requires an ElevenLabs API key and BlueBubbles.

3,891 stars

Best use case

voice-memo is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using voice-memo should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/imessage-voice-memo-skill/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/amzzzzzzz/imessage-voice-memo-skill/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/imessage-voice-memo-skill/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How voice-memo Compares

| Feature / Agent | voice-memo | Standard Approach |
|-----------------|------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Voice Memo

Send native iMessage voice bubbles (not file attachments) using ElevenLabs TTS and BlueBubbles.

## Quick Start

Run the script with text and recipient:

```bash
scripts/send-voice-memo.sh "Your message here" +14169060839
```

This will:
1. Generate TTS audio via ElevenLabs (Rachel voice by default)
2. Convert to Opus CAF @ 24kHz (iMessage native format)
3. Send as native voice bubble via BlueBubbles
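
The first two steps above could be sketched roughly as follows. This is a minimal illustration, not the contents of `scripts/send-voice-memo.sh`: the function names (`json_escape`, `tts_request_body`, `generate_tts`, `convert_to_caf`) are hypothetical, the ElevenLabs call assumes the public text-to-speech REST endpoint with an `xi-api-key` header, and the `afconvert` flags are an assumption chosen to match the Opus CAF @ 24kHz target. The send step is left out because its parameters are covered under The Working Formula.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for steps 1 and 2; names are illustrative only.

# Escape a string for use inside a JSON string literal
# (backslashes, double quotes, newlines, tabs).
json_escape() {
  local s="$1"
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  printf '%s' "$s"
}

# Build the JSON request body: the text to speak and the model ID.
tts_request_body() {
  printf '{"text":"%s","model_id":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Step 1: request speech from ElevenLabs and save the audio to a file.
# Assumes the endpoint returns MP3 audio by default.
generate_tts() {
  local text="$1" out="$2"
  local voice="${ELEVENLABS_VOICE_ID:-21m00Tcm4TlvDq8ikWAM}"
  local model="${ELEVENLABS_MODEL_ID:-eleven_turbo_v2_5}"
  curl -sS --fail "https://api.elevenlabs.io/v1/text-to-speech/${voice}" \
    -H "xi-api-key: ${ELEVENLABS_API_KEY:?ELEVENLABS_API_KEY is not set}" \
    -H "Content-Type: application/json" \
    -d "$(tts_request_body "$text" "$model")" \
    -o "$out"
}

# Step 2: convert to mono Opus in a CAF container at 24 kHz (macOS only).
# The flag set is an assumption; check `afconvert -h` on your system.
convert_to_caf() {
  afconvert -f caff -d opus@24000 -c 1 "$1" "$2"
}
```

With those defined, `generate_tts "Your message here" /tmp/memo.mp3 && convert_to_caf /tmp/memo.mp3 /tmp/memo.caf` would leave a CAF file ready for the send step.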

## Requirements

- BlueBubbles running locally with Private API enabled
- ElevenLabs API key (for TTS)
- macOS (for `afconvert` audio conversion)
- Environment variables in `~/.openclaw/.env`:
  ```bash
  ELEVENLABS_API_KEY=your-key-here
  BLUEBUBBLES_PASSWORD=your-password-here
  # Optional overrides:
  ELEVENLABS_VOICE_ID=21m00Tcm4TlvDq8ikWAM  # Rachel (default)
  ELEVENLABS_MODEL_ID=eleven_turbo_v2_5      # Turbo v2.5 (default)
  ```
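
A script that depends on these variables might load and validate them along the following lines. This is a sketch under assumptions: `load_voice_memo_env` is a hypothetical name, and it assumes the `.env` file uses plain shell `KEY=value` syntax so it can be sourced directly.

```bash
# Hypothetical loader: sources the env file, applies the documented
# defaults, and reports any required variable that is still missing.
load_voice_memo_env() {
  local env_file="${1:-$HOME/.openclaw/.env}"
  if [ -f "$env_file" ]; then
    set -a              # export every variable assigned while sourcing
    . "$env_file"
    set +a
  fi

  # Optional overrides fall back to the defaults listed above.
  : "${ELEVENLABS_VOICE_ID:=21m00Tcm4TlvDq8ikWAM}"
  : "${ELEVENLABS_MODEL_ID:=eleven_turbo_v2_5}"
  export ELEVENLABS_VOICE_ID ELEVENLABS_MODEL_ID

  local missing=0
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "missing required variable: ELEVENLABS_API_KEY" >&2
    missing=1
  fi
  if [ -z "${BLUEBUBBLES_PASSWORD:-}" ]; then
    echo "missing required variable: BLUEBUBBLES_PASSWORD" >&2
    missing=1
  fi
  return "$missing"
}
```

Calling `load_voice_memo_env || exit 1` at the top of a script would stop early with a readable message instead of failing later inside an API call.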

## The Working Formula

**Critical parameters discovered 2026-02-19:**

| Parameter | Value | Why |
|-----------|-------|-----|
| chatGuid | `any;-;+PHONE` | NOT `iMessage;-;` (causes timeouts) |
| method | `private-api` | Required for native bubble |
| isAudioMessage | `true` | Required |
| Audio format | Opus @ 24kHz in CAF | iMessage native format |
| Pre-convert | Yes | Don't let BlueBubbles convert (wrong codec) |
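
The parameters in the table could be combined into a single request roughly like this. Treat it as a sketch, not the skill's actual script: `chat_guid_for` and `send_voice_bubble` are hypothetical names, `BLUEBUBBLES_URL` and its `http://localhost:1234` default are assumptions, and the `/api/v1/message/attachment` endpoint with `tempGuid`, `name`, and `attachment` fields reflects my understanding of the BlueBubbles REST API, so verify it against your server version.

```bash
# Build the chatGuid for a phone number using the "any;-;" prefix
# from the table (not "iMessage;-;").
chat_guid_for() {
  printf 'any;-;%s' "$1"
}

# Send a pre-converted CAF file as a native voice bubble.
# --form-string passes each value literally, so the semicolons in the
# chatGuid are not interpreted by curl's -F option syntax.
send_voice_bubble() {
  local caf="$1" phone="$2"
  local base="${BLUEBUBBLES_URL:-http://localhost:1234}"   # assumed variable and default
  # Note: the password is placed in the query string unencoded, so this
  # sketch only works for passwords without URL-reserved characters.
  curl -sS --fail --max-time 60 \
    "${base}/api/v1/message/attachment?password=${BLUEBUBBLES_PASSWORD:?BLUEBUBBLES_PASSWORD is not set}" \
    --form-string "chatGuid=$(chat_guid_for "$phone")" \
    --form-string "tempGuid=temp-$(uuidgen)" \
    --form-string "name=$(basename "$caf")" \
    --form-string "method=private-api" \
    --form-string "isAudioMessage=true" \
    -F "attachment=@${caf}"
}
```

A call such as `send_voice_bubble /tmp/memo.caf +15555550123` (placeholder number) prints the server's JSON response, which can then be inspected for `"isAudioMessage": true`.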

## Voice Options

**Default voice:** Rachel (ElevenLabs)
- Voice ID: `21m00Tcm4TlvDq8ikWAM`
- Model: `eleven_turbo_v2_5` (fast, natural)
- Cost: ~$0.04 per 30s message

**Expressive tags:**
- `[laughs]` — natural laughter
- `[sighs]` — expressive sigh
- `[excited]` — energetic delivery

Example: `"[excited] Oh my god, it worked!"`
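
A small helper can keep tag usage consistent when messages are assembled in a script. The `with_tag` function below is hypothetical, and restricting it to the three tags listed above is a choice of this sketch, not a limit stated by the skill.

```bash
# Prefix message text with one of the expressive tags listed above.
# Unknown tags are rejected so typos do not end up spoken aloud.
with_tag() {
  local tag="$1" text="$2"
  case "$tag" in
    laughs|sighs|excited)
      printf '[%s] %s' "$tag" "$text"
      ;;
    *)
      echo "unknown tag: $tag" >&2
      return 1
      ;;
  esac
}
```

For example, `scripts/send-voice-memo.sh "$(with_tag excited 'Oh my god, it worked!')" +15555550123` (placeholder number) passes the tagged text to the script.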

For full voice list and IDs, see [VOICES.md](references/VOICES.md).

## Bidirectional Voice Memos

**Sending (Amz → Amy):**
Use this skill. Native voice bubbles appear with waveform UI.

**Receiving (Amy → Amz):**
BlueBubbles auto-converts incoming voice memos to MP3. OpenClaw transcribes via Whisper. Transcribed text flows into conversation context automatically.

**Memory note:** Incoming voice memo transcriptions flow into conversation context like any text message. They are NOT automatically persisted to memory or files — the agent must explicitly choose to store them, same as any conversation content. If you want to prevent transcriptions from being retained, instruct the agent not to record voice memo content in memory.

## Troubleshooting

**Voice bubble arrives as file attachment:**
- Check `method=private-api` is set
- Verify chatGuid uses `any;-;` prefix (not `iMessage;-;`)
- Check response has `"isAudioMessage": true`
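
The last check can be automated by scanning the response body for the flag. This is a sketch: `is_audio_message_response` is a hypothetical helper, and it makes no assumption about the response shape beyond the `"isAudioMessage": true` pair named above.

```bash
# Read a JSON response on stdin and succeed only if it contains
# "isAudioMessage": true (whitespace around the colon is tolerated).
is_audio_message_response() {
  grep -Eq '"isAudioMessage"[[:space:]]*:[[:space:]]*true'
}
```

Piping the send request's output through it, for instance `curl ... | is_audio_message_response || echo "sent as file attachment?" >&2`, flags the failure case without reading the JSON by eye.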

**API times out:**
- Use `any;-;+PHONE` format for chatGuid
- Verify BlueBubbles Private API is enabled
- Restart BlueBubbles if consistently slow

**Audio is 0 seconds / unplayable:**
- Ensure pre-conversion to Opus @ 24kHz
- Don't let BlueBubbles convert (uses wrong codec)
- Verify with: `afinfo output.caf` (should show opus @ 24000 Hz)
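
The `afinfo` check can be scripted so a bad conversion is caught before sending. The helper below is hypothetical and assumes `afinfo` reports the codec and sample rate on the same line, as in "opus @ 24000 Hz" or "24000 Hz, 'opus'"; adjust the pattern if your output differs.

```bash
# Read afinfo output on stdin and succeed only if some line mentions
# both the opus codec and a 24000 Hz sample rate.
is_opus_24k() {
  grep -Eiq '24000[[:space:]]*Hz.*opus|opus.*24000[[:space:]]*Hz'
}
```

Usage would look like `afinfo output.caf | is_opus_24k || echo "unexpected audio format" >&2`.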

Related Skills

Invoice Generator

3891
from openclaw/skills

Creates professional invoices in markdown and HTML

Workflow & Productivity

Agent Memory Architecture

3891
from openclaw/skills

Complete zero-dependency memory system for AI agents — file-based architecture, daily notes, long-term curation, context management, heartbeat integration, and memory hygiene. No APIs, no databases, no external tools. Works with any agent framework.

memory-cache

3891
from openclaw/skills

High-performance temporary storage system using Redis. Supports namespaced keys (mema:*), TTL management, and session context caching. Use for: (1) Saving agent state, (2) Caching API results, (3) Sharing data between sub-agents.

General Utilities

brand-voice-generator

3891
from openclaw/skills

Creates consistent brand voice guidelines and content. Generates copy that matches your brand personality across all channels. Perfect for startups building their identity.

Content & Documentation

Memory

3891
from openclaw/skills

Infinite organized memory that complements your agent's built-in memory with unlimited categorized storage.

Memory Management

invoice-ocr

3891
from openclaw/skills

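Invoice OCR skill. Scans a folder for invoice files (PDF/images), calls the Alibaba Cloud OCR API to extract invoice information, and exports it to an Excel spreadsheet. Supports 17+ invoice types (VAT invoices, train tickets, taxi receipts, flight itineraries, fixed-amount invoices, motor vehicle sales invoices, road and bridge toll invoices, and others). Use when: (1) the user mentions "invoice recognition", "invoice statistics", "invoice organization", or "invoice summary"; (2) the user needs to process invoices in bulk; (3) the user mentions recognizing invoices with Alibaba Cloud OCR. **Important: Alibaba Cloud credentials must be configured before first use. Proactively ask the user for the AccessKey ID and AccessKey Secret, or guide the user to run the --config command to configure them.**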

Workflow & Productivity

Bland AI — Voice Calling Skill

3891
from openclaw/skills

Make and manage AI-powered phone calls via the Bland AI API.

Workflow & Productivity

auto-memory

3891
from openclaw/skills

Indestructible agent memory — permanently stored, never lost. Save decisions, identity, and context as a memory chain on the Autonomys Network. Rebuild your full history from a single CID, even after total state loss.

AI Persistence & Memory

afrexai-invoice-engine

3880
from openclaw/skills

Generate, manage, and track professional invoices with payment terms, recurring billing, overdue automation, and financial reporting. Use when creating invoices, tracking payments, managing clients, or reviewing revenue.

Workflow & Productivity

Triple-Layer Memory System

3880
from openclaw/skills

Triple-layer memory system - solves memory loss in long conversations and context management problems for AI agents

Memory & Context Management

agent-memory-os

3891
from openclaw/skills

Stop agents from "forgetting, mixing projects, and rotting over time" by giving them a practical memory operating system: global memory, project memory, promotion rules, validation cases, and a maintenance loop.

benos-memory-core

3891
from openclaw/skills

Core runtime/volatile memory module for BenOS agent environment. Use to: store and retrieve active session state, open loops, decisions, and scratch notes at runtime.