Stream
Stream live call audio to a WebSocket endpoint.
The Stream verb streams live call audio to a WebSocket endpoint in real time. It is the key verb for integrating AI agents and real-time audio processing.
Example
{
  "voxml_version": "1.0",
  "instructions": [
    {
      "verb": "Stream",
      "url": "wss://ai-agent.example.com/stream",
      "track": "both_tracks"
    }
  ]
}
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | yes | — | WebSocket URL (must start with wss://) |
| name | string | no | — | Identifier for this stream |
| track | string | no | both_tracks | "inbound_track", "outbound_track", or "both_tracks" |
| status_callback_url | string | no | — | URL to POST stream status events to |
| parameters | object | no | — | Custom parameters sent in start message |
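The constraints in the table can be checked before a document is returned to TryVox. The following is a minimal sketch, not part of any TryVox SDK: the helper names `build_stream_instruction` and `build_document` are hypothetical, and the checks cover only the rules stated in the table (a `wss://` URL and one of the three track values).

```python
from typing import Any, Dict, Optional

# Track values listed in the Parameters table.
VALID_TRACKS = {"inbound_track", "outbound_track", "both_tracks"}


def build_stream_instruction(
    url: str,
    track: str = "both_tracks",
    name: Optional[str] = None,
    status_callback_url: Optional[str] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a Stream instruction dict (hypothetical helper, not a TryVox API)."""
    if not url.startswith("wss://"):
        raise ValueError("url must start with wss://")
    if track not in VALID_TRACKS:
        raise ValueError(f"track must be one of {sorted(VALID_TRACKS)}")

    instruction: Dict[str, Any] = {"verb": "Stream", "url": url, "track": track}
    # Optional fields are included only when the caller provides them.
    if name is not None:
        instruction["name"] = name
    if status_callback_url is not None:
        instruction["status_callback_url"] = status_callback_url
    if parameters is not None:
        instruction["parameters"] = parameters
    return instruction


def build_document(instruction: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap one instruction in the top-level shape used in the Example."""
    return {"voxml_version": "1.0", "instructions": [instruction]}
```

Calling `build_document(build_stream_instruction("wss://ai-agent.example.com/stream"))` yields the same structure as the Example, while a `ws://` URL or an unknown track raises `ValueError` before anything is sent.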
Audio Tracks
Both Tracks (Default)
Stream both caller and callee audio:
{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "both_tracks"
}
Inbound Track Only
Stream only the caller's audio:
{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "inbound_track"
}
Outbound Track Only
Stream only the callee's audio (what the caller hears):
{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "outbound_track"
}
Custom Parameters
Pass metadata to your WebSocket endpoint:
{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "parameters": {
    "agent_id": "agent_123",
    "language": "en-US",
    "context": "customer_support"
  }
}
WebSocket Protocol
Connection
TryVox connects to your WebSocket URL and sends a start message:
{
  "event": "start",
  "stream_id": "stream_abc123",
  "call_uuid": "call_xyz789",
  "account_id": "acc_123",
  "from": "+919876543210",
  "to": "+911234567890",
  "custom_parameters": {
    "agent_id": "agent_123"
  }
}
Media Messages
Audio is sent as binary messages with this structure:
{
  "event": "media",
  "stream_id": "stream_abc123",
  "track": "inbound",
  "payload": "<base64-encoded-audio>",
  "timestamp": 1234567890
}
Audio format:
- Codec: μ-law (PCMU)
- Sample rate: 8kHz
- Encoding: Base64
- Frame size: 20ms
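Most speech pipelines expect linear PCM rather than μ-law, so the payload usually needs two decoding steps: Base64 to μ-law bytes, then μ-law to 16-bit samples. The sketch below uses the standard G.711 μ-law expansion and assumes the `payload` field holds raw μ-law bytes with no extra header; the function names are illustrative.

```python
import base64
import struct

BIAS = 0x84  # 132, the standard G.711 mu-law bias

# A 20 ms frame at 8 kHz holds 8000 * 20 / 1000 = 160 samples,
# and mu-law uses one byte per sample.
SAMPLES_PER_FRAME = 8000 * 20 // 1000


def mulaw_byte_to_pcm16(value: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    value = ~value & 0xFF  # mu-law bytes are stored bit-inverted
    sign = value & 0x80
    exponent = (value >> 4) & 0x07
    mantissa = value & 0x0F
    magnitude = ((mantissa << 3) + BIAS) << exponent
    return BIAS - magnitude if sign else magnitude - BIAS


def decode_media_payload(payload_b64: str) -> bytes:
    """Convert a Base64 mu-law payload to little-endian 16-bit PCM bytes."""
    mulaw = base64.b64decode(payload_b64)
    samples = [mulaw_byte_to_pcm16(b) for b in mulaw]
    return struct.pack(f"<{len(samples)}h", *samples)
```

The output is twice the size of the decoded μ-law input, since each one-byte sample becomes a two-byte sample; a 160-byte frame becomes 320 bytes of PCM.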
Stop Message
When streaming ends, TryVox sends:
{
  "event": "stop",
  "stream_id": "stream_abc123",
  "call_uuid": "call_xyz789"
}
Status Callbacks
Monitor stream lifecycle events:
{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "status_callback_url": "https://example.com/stream-status"
}
TryVox POSTs to status_callback_url for these events:
- stream-started - Stream connected successfully
- stream-stopped - Stream ended normally
- stream-failed - Stream connection failed
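A callback receiver mostly needs to tell these three events apart and log enough to trace a stream. The sketch below is framework-agnostic and assumes the POST body is JSON that has already been parsed into a dict with `event`, `stream_id`, and `call_uuid` fields, as in the example payload in this section; `summarize_status_event` is a hypothetical helper name.

```python
from typing import Any, Dict

# The three event names listed for status_callback_url.
KNOWN_EVENTS = {"stream-started", "stream-stopped", "stream-failed"}


def summarize_status_event(payload: Dict[str, Any]) -> str:
    """Return a log line for one status callback body (hypothetical helper)."""
    event = payload.get("event")
    stream_id = payload.get("stream_id", "<unknown>")
    call_uuid = payload.get("call_uuid", "<unknown>")
    if event not in KNOWN_EVENTS:
        # Tolerate event names this sketch does not know about.
        return f"ignored unknown event {event!r} for stream {stream_id}"
    # Only a failed connection is treated as an error here.
    level = "ERROR" if event == "stream-failed" else "INFO"
    return f"{level} {event} stream={stream_id} call={call_uuid}"
```

In a real handler, the returned line would go to a logger, and a `stream-failed` event might also trigger an alert or a fallback instruction.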
Example callback payload:
{
  "call_uuid": "call_xyz789",
  "stream_id": "stream_abc123",
  "event": "stream-started",
  "timestamp": 1234567890
}
AI Agent Integration
Stream is designed for AI voice agents. Here's a typical flow:
{
  "voxml_version": "1.0",
  "instructions": [
    {
      "verb": "Say",
      "text": "Connecting you to our AI assistant."
    },
    {
      "verb": "Stream",
      "url": "wss://ai-agent.example.com/voice",
      "track": "both_tracks",
      "parameters": {
        "user_id": "user_123",
        "intent": "support"
      }
    }
  ]
}
Your AI agent receives audio, processes it, and can:
- Send audio back through the WebSocket
- Use real-time STT/TTS
- Maintain conversation context
- Trigger actions based on speech
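On the receiving side, an agent typically keeps one piece of state per stream, created on start, updated on media, and released on stop. The sketch below shows that dispatch with no WebSocket library attached, so it can sit behind whichever server you use. The class name `StreamSessions` is hypothetical, and the field names are taken from the message examples on this page. The format for sending audio back is not described here, so the sketch only covers the receive path.

```python
import base64
import json
from typing import Any, Dict, Optional, Union


class StreamSessions:
    """Tracks per-stream state from start, media, and stop messages."""

    def __init__(self) -> None:
        self.active: Dict[str, Dict[str, Any]] = {}

    def handle_message(self, raw: Union[str, bytes]) -> Optional[str]:
        """Process one WebSocket message and return the event name seen."""
        msg = json.loads(raw)  # json.loads accepts both str and bytes
        event = msg.get("event")
        stream_id = msg.get("stream_id")

        if event == "start":
            # Keep the call context so later media can be tied to it.
            self.active[stream_id] = {
                "call_uuid": msg.get("call_uuid"),
                "custom_parameters": msg.get("custom_parameters", {}),
                "audio_bytes": {},  # bytes received so far, per track
            }
        elif event == "media":
            session = self.active.get(stream_id)
            if session is not None:
                audio = base64.b64decode(msg["payload"])
                track = msg.get("track", "unknown")
                counts = session["audio_bytes"]
                counts[track] = counts.get(track, 0) + len(audio)
                # A real agent would hand `audio` to its STT pipeline here.
        elif event == "stop":
            # Release everything held for this stream.
            self.active.pop(stream_id, None)
        return event
```

Media for a stream that never sent start is ignored rather than raising, which keeps one misbehaving connection from affecting others.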
Best Practices
- Always use secure WebSocket (wss://) URLs
- Implement reconnection logic in your WebSocket server
- Handle start and stop events to manage resources
- Use parameters to pass call context to your AI agent
- Monitor status_callback_url for debugging
- Process audio in real time to avoid latency
- Use both_tracks for full conversation context in AI agents
Common Use Cases
- AI voice agents and conversational AI
- Real-time speech analytics
- Call quality monitoring
- Live transcription
- Sentiment analysis
- Voice biometrics
- Custom audio processing pipelines