azure-ai-voicelive-java
Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket. Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
Best use case
azure-ai-voicelive-java is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket. Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket. Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "azure-ai-voicelive-java" skill to help with this workflow task. Context: Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket. Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/azure-ai-voicelive-java/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How azure-ai-voicelive-java Compares
| Feature / Agent | azure-ai-voicelive-java | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket. Triggers: "VoiceLiveClient java", "voice assistant java", "real-time voice java", "audio streaming java", "voice activity detection java".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Azure AI VoiceLive SDK for Java
Real-time, bidirectional voice conversations with AI assistants using WebSocket technology.
## Installation
```xml
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-ai-voicelive</artifactId>
<version>1.0.0-beta.2</version>
</dependency>
```
## Environment Variables
```bash
AZURE_VOICELIVE_ENDPOINT=https://<resource>.openai.azure.com/
AZURE_VOICELIVE_API_KEY=<your-api-key>
```
## Authentication
### API Key
```java
import com.azure.ai.voicelive.VoiceLiveAsyncClient;
import com.azure.ai.voicelive.VoiceLiveClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
.endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
.credential(new AzureKeyCredential(System.getenv("AZURE_VOICELIVE_API_KEY")))
.buildAsyncClient();
```
### DefaultAzureCredential (Recommended)
```java
import com.azure.identity.DefaultAzureCredentialBuilder;
VoiceLiveAsyncClient client = new VoiceLiveClientBuilder()
.endpoint(System.getenv("AZURE_VOICELIVE_ENDPOINT"))
.credential(new DefaultAzureCredentialBuilder().build())
.buildAsyncClient();
```
## Key Concepts
| Concept | Description |
|---------|-------------|
| `VoiceLiveAsyncClient` | Main entry point for voice sessions |
| `VoiceLiveSessionAsyncClient` | Active WebSocket connection for streaming |
| `VoiceLiveSessionOptions` | Configuration for session behavior |
### Audio Requirements
- **Sample Rate**: 24kHz (24000 Hz)
- **Bit Depth**: 16-bit PCM
- **Channels**: Mono (1 channel)
- **Format**: Signed PCM, little-endian
## Core Workflow
### 1. Start Session
```java
import reactor.core.publisher.Mono;
client.startSession("gpt-4o-realtime-preview")
.flatMap(session -> {
System.out.println("Session started");
// Subscribe to events
session.receiveEvents()
.subscribe(
event -> System.out.println("Event: " + event.getType()),
error -> System.err.println("Error: " + error.getMessage())
);
return Mono.just(session);
})
.block();
```
### 2. Configure Session Options
```java
import com.azure.ai.voicelive.models.*;
import java.util.Arrays;
ServerVadTurnDetection turnDetection = new ServerVadTurnDetection()
.setThreshold(0.5) // Sensitivity (0.0-1.0)
.setPrefixPaddingMs(300) // Audio before speech
.setSilenceDurationMs(500) // Silence to end turn
.setInterruptResponse(true) // Allow interruptions
.setAutoTruncate(true)
.setCreateResponse(true);
AudioInputTranscriptionOptions transcription = new AudioInputTranscriptionOptions(
AudioInputTranscriptionOptionsModel.WHISPER_1);
VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
.setInstructions("You are a helpful AI voice assistant.")
.setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)))
.setModalities(Arrays.asList(InteractionModality.TEXT, InteractionModality.AUDIO))
.setInputAudioFormat(InputAudioFormat.PCM16)
.setOutputAudioFormat(OutputAudioFormat.PCM16)
.setInputAudioSamplingRate(24000)
.setInputAudioNoiseReduction(new AudioNoiseReduction(AudioNoiseReductionType.NEAR_FIELD))
.setInputAudioEchoCancellation(new AudioEchoCancellation())
.setInputAudioTranscription(transcription)
.setTurnDetection(turnDetection);
// Send configuration
ClientEventSessionUpdate updateEvent = new ClientEventSessionUpdate(options);
session.sendEvent(updateEvent).subscribe();
```
### 3. Send Audio Input
```java
byte[] audioData = readAudioChunk(); // Your PCM16 audio data
session.sendInputAudio(BinaryData.fromBytes(audioData)).subscribe();
```
### 4. Handle Events
```java
session.receiveEvents().subscribe(event -> {
ServerEventType eventType = event.getType();
if (ServerEventType.SESSION_CREATED.equals(eventType)) {
System.out.println("Session created");
} else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STARTED.equals(eventType)) {
System.out.println("User started speaking");
} else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STOPPED.equals(eventType)) {
System.out.println("User stopped speaking");
} else if (ServerEventType.RESPONSE_AUDIO_DELTA.equals(eventType)) {
if (event instanceof SessionUpdateResponseAudioDelta) {
SessionUpdateResponseAudioDelta audioEvent = (SessionUpdateResponseAudioDelta) event;
playAudioChunk(audioEvent.getDelta());
}
} else if (ServerEventType.RESPONSE_DONE.equals(eventType)) {
System.out.println("Response complete");
} else if (ServerEventType.ERROR.equals(eventType)) {
if (event instanceof SessionUpdateError) {
SessionUpdateError errorEvent = (SessionUpdateError) event;
System.err.println("Error: " + errorEvent.getError().getMessage());
}
}
});
```
## Voice Configuration
### OpenAI Voices
```java
// Available: ALLOY, ASH, BALLAD, CORAL, ECHO, SAGE, SHIMMER, VERSE
VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
.setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)));
```
### Azure Voices
```java
// Azure Standard Voice
options.setVoice(BinaryData.fromObject(new AzureStandardVoice("en-US-JennyNeural")));
// Azure Custom Voice
options.setVoice(BinaryData.fromObject(new AzureCustomVoice("myVoice", "endpointId")));
// Azure Personal Voice
options.setVoice(BinaryData.fromObject(
new AzurePersonalVoice("speakerProfileId", PersonalVoiceModels.PHOENIX_LATEST_NEURAL)));
```
## Function Calling
```java
VoiceLiveFunctionDefinition weatherFunction = new VoiceLiveFunctionDefinition("get_weather")
.setDescription("Get current weather for a location")
.setParameters(BinaryData.fromObject(parametersSchema));
VoiceLiveSessionOptions options = new VoiceLiveSessionOptions()
.setTools(Arrays.asList(weatherFunction))
.setInstructions("You have access to weather information.");
```
## Best Practices
1. **Use async client** — VoiceLive requires reactive patterns
2. **Configure turn detection** for natural conversation flow
3. **Enable noise reduction** for better speech recognition
4. **Handle interruptions** gracefully with `setInterruptResponse(true)`
5. **Use Whisper transcription** for input audio transcription
6. **Close sessions** properly when conversation ends
## Error Handling
```java
session.receiveEvents()
.doOnError(error -> System.err.println("Connection error: " + error.getMessage()))
.onErrorResume(error -> {
// Attempt reconnection or cleanup
return Flux.empty();
})
.subscribe();
```
## Reference Links
| Resource | URL |
|----------|-----|
| GitHub Source | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-voicelive |
| Samples | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-voicelive/src/samples |Related Skills
azure-quotas
Check/manage Azure quotas and usage across providers. For deployment planning, capacity validation, region selection. WHEN: "check quotas", "service limits", "current usage", "request quota increase", "quota exceeded", "validate capacity", "regional availability", "provisioning limits", "vCPU limit", "how many vCPUs available in my subscription".
modern-javascript-patterns
Master ES6+ features including async/await, destructuring, spread operators, arrow functions, promises, modules, iterators, generators, and functional programming patterns for writing clean, efficient JavaScript code. Use when refactoring legacy code, implementing modern patterns, or optimizing JavaScript applications.
microsoft-azure-webjobs-extensions-authentication-events-dotnet
Microsoft Entra Authentication Events SDK for .NET. Azure Functions triggers for custom authentication extensions. Use for token enrichment, custom claims, attribute collection, and OTP customization in Entra ID. Triggers: "Authentication Events", "WebJobsAuthenticationEventsTrigger", "OnTokenIssuanceStart", "OnAttributeCollectionStart", "custom claims", "token enrichment", "Entra custom extension", "authentication extension".
javascript-typescript-typescript-scaffold
You are a TypeScript project architecture expert specializing in scaffolding production-ready Node.js and frontend applications. Generate complete project structures with modern tooling (pnpm, Vite, N
javascript-testing-patterns
Implement comprehensive testing strategies using Jest, Vitest, and Testing Library for unit tests, integration tests, and end-to-end testing with mocking, fixtures, and test-driven development. Use when writing JavaScript/TypeScript tests, setting up test infrastructure, or implementing TDD/BDD workflows.
javascript-pro
Master modern JavaScript with ES6+, async patterns, and Node.js APIs. Handles promises, event loops, and browser/Node compatibility. Use PROACTIVELY for JavaScript optimization, async debugging, or complex JS patterns.
javascript-mastery
Comprehensive JavaScript reference covering 33+ essential concepts every developer should know. From fundamentals like primitives and closures to advanced patterns like async/await and functional programming. Use when explaining JS concepts, debugging JavaScript issues, or teaching JavaScript fundamentals.
java-pro
Master Java 21+ with modern features like virtual threads, pattern matching, and Spring Boot 3.x. Expert in the latest Java ecosystem including GraalVM, Project Loom, and cloud-native patterns. Use PROACTIVELY for Java development, microservices architecture, or performance optimization.
azure-web-pubsub-ts
Build real-time messaging applications using Azure Web PubSub SDKs for JavaScript (@azure/web-pubsub, @azure/web-pubsub-client). Use when implementing WebSocket-based real-time features, pub/sub messaging, group chat, or live notifications.
azure-storage-queue-ts
Azure Queue Storage JavaScript/TypeScript SDK (@azure/storage-queue) for message queue operations. Use for sending, receiving, peeking, and deleting messages in queues. Supports visibility timeout, message encoding, and batch operations. Triggers: "queue storage", "@azure/storage-queue", "QueueServiceClient", "QueueClient", "send message", "receive message", "dequeue", "visibility timeout".
azure-storage-queue-py
Azure Queue Storage SDK for Python. Use for reliable message queuing, task distribution, and asynchronous processing. Triggers: "queue storage", "QueueServiceClient", "QueueClient", "message queue", "dequeue".
azure-storage-file-share-ts
Azure File Share JavaScript/TypeScript SDK (@azure/storage-file-share) for SMB file share operations. Use for creating shares, managing directories, uploading/downloading files, and handling file metadata. Supports Azure Files SMB protocol scenarios. Triggers: "file share", "@azure/storage-file-share", "ShareServiceClient", "ShareClient", "SMB", "Azure Files".