voicemonkey

Control Alexa devices via VoiceMonkey API v2 - make announcements, trigger routines, start flows, and display media.

by Demerzels-lab

Best use case

voicemonkey is best used when you need a repeatable AI agent workflow for controlling Alexa devices rather than a one-off prompt.

Teams using voicemonkey should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/voicemonkey/SKILL.md --create-dirs "https://raw.githubusercontent.com/Demerzels-lab/elsamultiskillagent/main/public/skills/jayakumark/voicemonkey/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/voicemonkey/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How voicemonkey Compares

Feature / Agent	voicemonkey	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Control Alexa devices via VoiceMonkey API v2 - make announcements, trigger routines, start flows, and display media.

Where can I find the source code?

The source code is hosted on GitHub in the Demerzels-lab/elsamultiskillagent repository, under public/skills/jayakumark/voicemonkey.

SKILL.md Source

# VoiceMonkey

Control Alexa/Echo devices via VoiceMonkey API v2. Make TTS announcements, trigger Alexa routines, start flows, and display images/videos on Echo Show devices.

## Setup

1. Get your secret token from [Voice Monkey Console](https://console.voicemonkey.io) → Settings → API Credentials
2. Set environment variable:
   ```bash
   export VOICEMONKEY_TOKEN="your-secret-token"
   ```
   Or add to `~/.clawdbot/clawdbot.json`:
   ```json
   {
     "skills": {
       "entries": {
         "voicemonkey": {
           "env": { "VOICEMONKEY_TOKEN": "your-secret-token" }
         }
       }
     }
   }
   ```
3. Find your Device IDs in the Voice Monkey Console → Settings → Devices
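
For scripts that run outside the agent, the two token locations from step 2 can be checked in order. The following is a minimal sketch, not part of the skill itself: the helper name `load_token` is hypothetical, and reading `~/.clawdbot/clawdbot.json` directly is an assumption based only on the JSON layout shown above.

```python
import json
import os
from pathlib import Path


def load_token(config_path=None, environ=None):
    """Return the token from VOICEMONKEY_TOKEN, else from the clawdbot config file."""
    environ = os.environ if environ is None else environ
    token = environ.get("VOICEMONKEY_TOKEN")
    if token:
        return token

    # Assumed fallback: the JSON layout shown in step 2 of the setup.
    if config_path is None:
        config_path = Path.home() / ".clawdbot" / "clawdbot.json"
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
        return config["skills"]["entries"]["voicemonkey"]["env"]["VOICEMONKEY_TOKEN"]
    except (OSError, KeyError, TypeError, json.JSONDecodeError) as exc:
        raise RuntimeError(
            "VOICEMONKEY_TOKEN is not set and no usable config entry was found"
        ) from exc
```

Raising an error instead of returning an empty string keeps a missing token from surfacing later as a confusing API failure.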

## API Base URL

```
https://api-v2.voicemonkey.io
```

## Announcement API

Make TTS announcements, play audio/video, or display images on Alexa devices.

**Endpoint:** `https://api-v2.voicemonkey.io/announcement`

### Basic TTS Announcement

```bash
curl -X GET "https://api-v2.voicemonkey.io/announcement?token=$VOICEMONKEY_TOKEN&device=YOUR_DEVICE_ID&text=Hello%20from%20Echo"
```
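
The `%20` in the URL above is manual percent-encoding, which gets error-prone once the text contains characters such as `&` or `?`. As a sketch, the same GET URL could be assembled with Python's standard library; the function name `announcement_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api-v2.voicemonkey.io"


def announcement_url(token, device, text):
    """Build the GET URL for a basic TTS announcement with safe percent-encoding."""
    # quote_via=quote encodes spaces as %20 (the default would use "+").
    query = urlencode({"token": token, "device": device, "text": text}, quote_via=quote)
    return f"{BASE_URL}/announcement?{query}"


print(announcement_url("secret", "kitchen", "Hello from Echo"))
# → https://api-v2.voicemonkey.io/announcement?token=secret&device=kitchen&text=Hello%20from%20Echo
```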

### With Authorization Header (recommended)

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "text": "Hello from Echo the Fox!"
  }'
```
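
A rough Python equivalent of the curl call above, using only the standard library. The helper names `build_request` and `send` are hypothetical; building the request is kept separate from sending it so the payload can be inspected (or confirmed with the user) before anything plays on a device.

```python
import json
import urllib.request

BASE_URL = "https://api-v2.voicemonkey.io"


def build_request(endpoint, token, payload):
    """Build, but do not send, a JSON POST request with the token in the Authorization header."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

A call would then look like `send(build_request("announcement", token, {"device": "YOUR_DEVICE_ID", "text": "Hello"}))`.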

### With Voice and Chime

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "text": "Dinner is ready!",
    "voice": "Brian",
    "chime": "soundbank://soundlibrary/alarms/beeps_and_bloops/bell_02"
  }'
```

### Display Image on Echo Show

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "text": "Check out this image",
    "image": "https://example.com/image.jpg",
    "media_width": "100",
    "media_height": "100",
    "media_scaling": "best-fit"
  }'
```

### Play Audio File

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "audio": "https://example.com/sound.mp3"
  }'
```

### Play Video on Echo Show

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "video": "https://example.com/video.mp4",
    "video_repeat": 1
  }'
```

### Open Website on Echo Show

```bash
curl -X POST "https://api-v2.voicemonkey.io/announcement" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "website": "https://example.com",
    "no_bg": "true"
  }'
```

### Announcement Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `token` | Yes* | Secret token (*or use Authorization header) |
| `device` | Yes | Device ID from Voice Monkey console |
| `text` | No | TTS text (supports SSML) |
| `voice` | No | Voice for TTS (see API Playground for options) |
| `language` | No | Language code for better pronunciation |
| `chime` | No | Sound URL or Alexa sound library reference |
| `audio` | No | HTTPS URL of audio file to play |
| `background_audio` | No | Audio to play behind TTS |
| `image` | No | HTTPS URL of image for Echo Show |
| `video` | No | HTTPS URL of MP4 video for Echo Show |
| `video_repeat` | No | Number of times to loop video |
| `website` | No | URL to open on Echo Show |
| `no_bg` | No | Set "true" to hide Voice Monkey branding |
| `media_width` | No | Image width |
| `media_height` | No | Image height |
| `media_scaling` | No | Image scaling mode |
| `media_align` | No | Image alignment |
| `media_radius` | No | Corner radius for image clipping |
| `var-[name]` | No | Update Voice Monkey variables |
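
To catch misspelled parameter names before a request is sent, the table above can be mirrored in a small payload builder. This is a sketch under assumptions: `announcement_payload` and its `variables` argument are hypothetical, and `var-[name]` is interpreted as the literal prefix `var-` followed by the variable name.

```python
# Optional announcement parameters, copied from the table above.
ANNOUNCEMENT_FIELDS = {
    "text", "voice", "language", "chime", "audio", "background_audio",
    "image", "video", "video_repeat", "website", "no_bg",
    "media_width", "media_height", "media_scaling", "media_align", "media_radius",
}


def announcement_payload(device, variables=None, **fields):
    """Return a JSON-ready dict for /announcement, rejecting unknown parameter names."""
    if not device:
        raise ValueError("device is required")
    unknown = set(fields) - ANNOUNCEMENT_FIELDS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    payload = {"device": device}
    # Drop parameters left as None so only explicit values are sent.
    payload.update({key: value for key, value in fields.items() if value is not None})
    # Assumed naming: each variable is sent as "var-<name>".
    for name, value in (variables or {}).items():
        payload[f"var-{name}"] = value
    return payload


print(announcement_payload("YOUR_DEVICE_ID", text="Dinner is ready!", voice="Brian"))
# → {'device': 'YOUR_DEVICE_ID', 'text': 'Dinner is ready!', 'voice': 'Brian'}
```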

## Routine Trigger API

Trigger Voice Monkey devices to start Alexa Routines.

**Endpoint:** `https://api-v2.voicemonkey.io/trigger`

```bash
curl -X POST "https://api-v2.voicemonkey.io/trigger" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_TRIGGER_DEVICE_ID"
  }'
```

| Parameter | Required | Description |
|-----------|----------|-------------|
| `token` | Yes* | Secret token (*or use Authorization header) |
| `device` | Yes | Trigger Device ID from Voice Monkey console |
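
Because the trigger call takes nothing but a Trigger Device ID, it can be convenient to look IDs up by a friendly name. A minimal sketch: the `ROUTINES` mapping, its IDs, and the function name `trigger_request` are all hypothetical placeholders, and the request is only built, not sent.

```python
import json
import urllib.request

TRIGGER_URL = "https://api-v2.voicemonkey.io/trigger"

# Hypothetical mapping from friendly names to Trigger Device IDs;
# replace the values with IDs from your own Voice Monkey console.
ROUTINES = {
    "movie night": "movie-night-trigger",
    "bedtime": "bedtime-trigger",
}


def trigger_request(token, routine_name):
    """Build (without sending) the POST request that fires the named routine's trigger."""
    device = ROUTINES[routine_name.strip().lower()]  # KeyError for unknown names
    return urllib.request.Request(
        url=TRIGGER_URL,
        data=json.dumps({"device": device}).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it.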

## Flows Trigger API

Start Voice Monkey Flows.

**Endpoint:** `https://api-v2.voicemonkey.io/flows`

```bash
curl -X POST "https://api-v2.voicemonkey.io/flows" \
  -H "Authorization: $VOICEMONKEY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "device": "YOUR_DEVICE_ID",
    "flow": 12345
  }'
```

| Parameter | Required | Description |
|-----------|----------|-------------|
| `token` | Yes* | Secret token (*or use Authorization header) |
| `device` | Yes | Device ID |
| `flow` | Yes | Numeric Flow ID from Voice Monkey console |
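
Since `flow` is described as numeric, a script that receives the ID as text (for example from a command-line argument) may want to normalize it first. A small sketch with a hypothetical helper name, `flow_payload`; whether the API also accepts the ID as a string is not stated in the source.

```python
def flow_payload(device, flow):
    """Return the JSON body for /flows, accepting the Flow ID as an int or a digit string."""
    if isinstance(flow, int) and not isinstance(flow, bool):
        flow_id = flow
    elif isinstance(flow, str) and flow.strip().isdecimal():
        flow_id = int(flow.strip())
    else:
        raise ValueError(f"flow must be a numeric Flow ID, got {flow!r}")
    return {"device": device, "flow": flow_id}


print(flow_payload("YOUR_DEVICE_ID", "12345"))
# → {'device': 'YOUR_DEVICE_ID', 'flow': 12345}
```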

## Media Requirements

### Images
- Most common formats supported (JPG, PNG, etc.)
- **No animated GIFs**
- Optimize file size for faster loading
- Must be hosted at HTTPS URL with valid SSL
- CORS must allow wildcard: `Access-Control-Allow-Origin: *`

### Videos
- **MP4 format only** (MPEG-4 Part-14)
- Audio codecs: AAC, MP3
- Max resolution: 1080p at 30 fps or 60 fps
- Must be hosted at HTTPS URL with valid SSL

### Audio
- Formats: AAC, MP3, OGG, Opus, WAV
- Bit rate: ≤ 1411.20 kbps
- Sample rate: ≤ 48 kHz
- File size: ≤ 10 MB
- Total response length: ≤ 240 seconds
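
Some of the requirements above can be checked locally before a URL is sent. The sketch below is a heuristic only: it inspects the URL scheme and file extension, the extension-to-format mapping is an assumption, and it does not verify the SSL certificate, CORS headers, bit rate, sample rate, file size, or duration. The name `check_media_url` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed file extensions for the audio formats listed above.
AUDIO_EXTENSIONS = {".aac", ".mp3", ".ogg", ".opus", ".wav"}


def check_media_url(kind, url):
    """Return a list of problems found for a media URL; an empty list means none were found."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("must be hosted at an HTTPS URL")

    extension = PurePosixPath(parsed.path).suffix.lower()
    if kind == "image" and extension == ".gif":
        problems.append("GIF images may be animated, which is not supported")
    elif kind == "video" and extension != ".mp4":
        problems.append("video must be an MP4 file")
    elif kind == "audio" and extension not in AUDIO_EXTENSIONS:
        problems.append("audio extension is not one of the listed formats")
    return problems


print(check_media_url("video", "http://example.com/clip.mov"))
# → ['must be hosted at an HTTPS URL', 'video must be an MP4 file']
```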

## SSML Examples

Use SSML in the `text` parameter for richer announcements:

```xml
<speak>
  <amazon:emotion name="excited" intensity="high">
    This is exciting news!
  </amazon:emotion>
</speak>
```

```xml
<speak>
  The time is <say-as interpret-as="time">3:30pm</say-as>
</speak>
```
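
When SSML is assembled from dynamic text, characters such as `&` and `<` need escaping or the markup becomes invalid XML. A minimal sketch of building the `say-as` example above in Python; the function name `ssml_time` is hypothetical.

```python
from xml.sax.saxutils import escape


def ssml_time(lead_in, time_text):
    """Wrap a lead-in phrase and a time in SSML, escaping XML special characters."""
    return (
        f"<speak>{escape(lead_in)} "
        f'<say-as interpret-as="time">{escape(time_text)}</say-as></speak>'
    )


print(ssml_time("The time is", "3:30pm"))
# → <speak>The time is <say-as interpret-as="time">3:30pm</say-as></speak>
```

The resulting string would go into the `text` field of an announcement payload.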

## Notes

- Keep your token secure; rotate via Console → Settings → API Credentials if compromised
- Use the [API Playground](https://console.voicemonkey.io) to test and explore options
- Premium members can upload media directly in the Voice Monkey console
- Always confirm before sending announcements to avoid unexpected noise
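
The last note above can be enforced in a script with a simple confirmation gate. This is a sketch only: `confirm_then_send` is a hypothetical helper, and `send` stands for whatever zero-argument callable actually performs the API request.

```python
def confirm_then_send(summary, send, ask=input):
    """Ask for confirmation and call send() only on an explicit yes; return True if sent."""
    answer = ask(f"Send to Alexa: {summary}? [y/N] ")
    if answer.strip().lower() not in {"y", "yes"}:
        return False
    send()
    return True
```

Defaulting to "no" means an empty or mistyped answer never results in an announcement; the `ask` parameter exists so the prompt can be replaced in tests or non-interactive agents.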
