avatar
Control the VTuber avatar system — speak through it with lip sync, change expressions, manage the avatar renderer and control server. Use when interacting with the avatar, making it speak, changing expressions, or troubleshooting avatar connection issues.
Best use case
avatar is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using avatar should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/avatar/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How avatar Compares
| Feature / Agent | avatar | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Control the VTuber avatar system — speak through it with lip sync, change expressions, manage the avatar renderer and control server. Use when interacting with the avatar, making it speak, changing expressions, or troubleshooting avatar connection issues.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Avatar — VTuber Control
## Quick Start
- Start system: `~/openclaw/scripts/start-avatar.sh`
- Stop system: `~/openclaw/scripts/stop-avatar.sh`
- Check health: `curl -s http://localhost:8766/health`
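The health check can also be run from a script. The following is a minimal sketch, assuming that an HTTP 200 from the health endpoint means the control server is up; the response body format is not documented here, so it is not inspected. The function name `is_healthy` is hypothetical.

```python
import urllib.request

HEALTH_URL = "http://localhost:8766/health"  # endpoint from the Quick Start list


def is_healthy(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200.

    Assumption: status 200 means "healthy". The body is ignored because
    its format is not described in this document.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Refused connections, timeouts, URLError and HTTPError all derive from OSError.
        return False


if __name__ == "__main__":
    print("avatar control server healthy:", is_healthy())
```

If this prints `False` right after `start-avatar.sh`, the service status command in the Service Control section is the next thing to check.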
## Speaking
`~/openclaw/scripts/avatar-speak.sh "text" [emotion] [output]`
Output controls where audio plays:
- speakers — default system sink, people in the room hear it
- mic — AvatarMic sink, people in Meet/calls hear it
- both — both simultaneously
Default output is speakers.
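Because emotion and output are both positional, an output can only be given when an emotion is given too. The sketch below wraps the script call and validates both values before running it. The helper names `speak_command` and `speak` are hypothetical, and the allowed values are copied from this document rather than read from the script itself.

```python
import os
import subprocess

SPEAK_SCRIPT = os.path.expanduser("~/openclaw/scripts/avatar-speak.sh")
EMOTIONS = {"neutral", "happy", "sad", "angry", "relaxed", "surprised"}
OUTPUTS = {"speakers", "mic", "both"}


def speak_command(text, emotion="neutral", output="speakers"):
    """Build the argv list for avatar-speak.sh, rejecting unknown values early."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    if output not in OUTPUTS:
        raise ValueError(f"unknown output: {output!r}")
    # Emotion is always passed explicitly so that output stays in the third slot.
    return [SPEAK_SCRIPT, text, emotion, output]


def speak(text, emotion="neutral", output="speakers"):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(speak_command(text, emotion, output), check=True)
```

For example, `speak("Joining the call now", output="mic")` would route the audio to the AvatarMic sink only, so people in the room would not hear it.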
## Emotions
neutral (default, eyes open), happy, sad, angry, relaxed, surprised
Use neutral by default. happy closes the eyes (anime smile) — only use for genuine excitement.
## Infrastructure
While the avatar system is running, these are always available:
- Virtual mic: AvatarMic.monitor (set as default source for Meet)
- Virtual camera: /dev/video10 (captures renderer via CDP)
- Virtual speaker sink: AvatarSpeaker (available for routing)
The bot chooses per-speak where audio goes. Virtual mic and camera are always-on pipes.
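To confirm that these devices exist before a call, something like the following could be used. It is a sketch under the assumption that both sinks appear by name in `pactl list sinks short` and that the camera is a plain device node; the helper names are hypothetical.

```python
import os
import subprocess

EXPECTED_SINKS = ("AvatarMic", "AvatarSpeaker")
CAMERA_DEVICE = "/dev/video10"


def sink_names(pactl_output: str) -> list:
    """Extract sink names from `pactl list sinks short` output.

    Each line has whitespace-separated columns; the second one is the sink name.
    """
    names = []
    for line in pactl_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            names.append(fields[1])
    return names


def missing_devices(pactl_output: str, camera_exists: bool) -> list:
    """Return the expected sinks and camera that were not found."""
    present = sink_names(pactl_output)
    missing = [sink for sink in EXPECTED_SINKS if sink not in present]
    if not camera_exists:
        missing.append(CAMERA_DEVICE)
    return missing


def check_devices() -> list:
    """Query the live system; requires pactl to be installed."""
    result = subprocess.run(
        ["pactl", "list", "sinks", "short"],
        capture_output=True, text=True, check=True,
    )
    return missing_devices(result.stdout, os.path.exists(CAMERA_DEVICE))
```

An empty list from `check_devices()` means both sinks and the camera node were found.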
## Service Control
`systemctl --user {start|stop|status|restart} avatar-control-server`
Renderer: `cd ~/openclaw/avatar/renderer && npm run dev`
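The service commands above can be scripted in the same way. This is a small sketch; `service_command` and `control_service` are hypothetical names, and only the four actions listed above are accepted.

```python
import subprocess

SERVICE = "avatar-control-server"
ACTIONS = ("start", "stop", "status", "restart")


def service_command(action):
    """Build the systemctl argv for the control server's user unit."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["systemctl", "--user", action, SERVICE]


def control_service(action) -> int:
    """Run the command and return systemctl's exit code (0 means success)."""
    return subprocess.run(service_command(action)).returncode
```

For `status`, a non-zero exit code generally indicates that the unit is not running.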
## WebSocket API (Advanced)
Port 8765. You must send identify first:
`{ type: "identify", role: "agent", name: "@agentName@" }`
Commands after identify:
- speak: `{ type: "speak", text: "Hello", emotion: "neutral", output: "mic" }`
- setExpression: `{ type: "setExpression", name: "happy", intensity: 1 }`
- setIdle: `{ type: "setIdle", mode: "breathing" }`
- getStatus: `{ type: "getStatus" }`
Wait about 1 s after identify before sending commands.
After a speak command, wait for the duration reported in speakAck plus a 2 s buffer before closing the WebSocket.
See server.js in ~/openclaw/avatar/control-server/ for full protocol.
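The ordering and timing rules above can be sketched without committing to a particular WebSocket library. In the example below, `send` stands for whatever function your client uses to send a text frame. Several things are assumptions rather than documented facts: that messages travel as JSON text, that the speakAck duration is expressed in seconds, and all helper names. Check server.js before relying on any of them.

```python
import json
import time


def identify_message(name):
    """First message on every connection."""
    return {"type": "identify", "role": "agent", "name": name}


def speak_message(text, emotion="neutral", output="speakers"):
    return {"type": "speak", "text": text, "emotion": emotion, "output": output}


def close_delay(ack_duration_s, buffer_s=2.0):
    """Seconds to keep the socket open after a speakAck.

    Assumption: the ack's duration is given in seconds.
    """
    return ack_duration_s + buffer_s


def run_session(send, name, commands, sleep=time.sleep):
    """Send identify, wait about 1 s, then send each command in order."""
    send(json.dumps(identify_message(name)))
    sleep(1.0)
    for command in commands:
        send(json.dumps(command))
```

Passing `sleep` as a parameter keeps the sequencing testable without real delays. With a real client, the caller would read the speakAck, sleep for `close_delay(duration)`, and only then close the connection.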
## Ports
- 8765: WebSocket (control)
- 8766: HTTP (audio serving + health)
- 3000: Renderer (browser, visual only)
- /dev/video10: Virtual camera
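A quick way to see which of the three TCP ports are accepting connections is sketched below. It only tests that something is listening, not that the listener is the avatar system; the function names are hypothetical.

```python
import socket

PORTS = {
    "control WebSocket": 8765,
    "HTTP audio + health": 8766,
    "renderer": 3000,
}


def port_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_ports(ports=PORTS, host="localhost"):
    """Return the labels of the ports that refused or timed out."""
    return [label for label, port in ports.items() if not port_open(port, host)]


if __name__ == "__main__":
    print("not listening:", closed_ports())
```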
## Audio Flow
Agent sends speak with output target -> control server runs edge-tts -> generates MP3 -> ffmpeg plays to chosen PulseAudio sink(s) -> renderer gets lip sync data only (visual animation, no browser audio).
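The routing step in this flow amounts to mapping the output target to one or two PulseAudio sinks. The sketch below illustrates that mapping only. `@DEFAULT_SINK@` is PulseAudio's alias for the current default sink; whether the control server refers to the default sink this way is an assumption, and `sinks_for_output` is a hypothetical name.

```python
AVATAR_MIC_SINK = "AvatarMic"
DEFAULT_SINK = "@DEFAULT_SINK@"  # PulseAudio alias; the server's actual usage is assumed

ROUTES = {
    "speakers": [DEFAULT_SINK],
    "mic": [AVATAR_MIC_SINK],
    "both": [DEFAULT_SINK, AVATAR_MIC_SINK],
}


def sinks_for_output(output="speakers"):
    """Return the sink names that audio would be played to for an output target."""
    if output not in ROUTES:
        raise ValueError(f"unknown output: {output!r}")
    return list(ROUTES[output])
```

In every case the renderer receives only lip sync data, so no target routes audio through the browser.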
## Troubleshooting
- "Control Server Disconnected" in browser: check systemctl --user status avatar-control-server
- No audio in Meet: verify AvatarMic sink exists (pactl list sinks short | grep AvatarMic), check output is "mic" or "both"
- No audio in room: check output is "speakers" or "both", check default system sink volume
- Speak command hangs: must send identify before any other command
- Virtual camera not in Meet: restart Meet after starting avatar (Chrome enumerates devices at join time)
- Renderer won't start: check ~/openclaw/avatar/renderer/node_modules exists, run npm install if needed