Stream

The Stream verb sends live call audio to a WebSocket endpoint in real time. It is the key verb for integrating AI agents and other real-time audio processing.

Example

{
  "voxml_version": "1.0",
  "instructions": [
    {
      "verb": "Stream",
      "url": "wss://ai-agent.example.com/stream",
      "track": "both_tracks"
    }
  ]
}

Parameters

Parameter	Type	Required	Default	Description
url	string	yes	—	WebSocket URL (must start with wss://)
name	string	no	—	Identifier for this stream
track	string	no	both_tracks	"inbound_track", "outbound_track", or "both_tracks"
status_callback_url	string	no	—	URL to POST stream status events to
parameters	object	no	—	Custom parameters sent in start message
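
The constraints in the table can be checked before a document is returned to TryVox. The following is a minimal sketch, not part of any TryVox SDK: build_stream_verb and VALID_TRACKS are hypothetical names, and the only rules enforced are the ones the table states (a wss:// URL and one of the three track values).

```python
import json
from typing import Any, Optional

# Track values listed in the Parameters table.
VALID_TRACKS = {"inbound_track", "outbound_track", "both_tracks"}


def build_stream_verb(
    url: str,
    track: str = "both_tracks",
    name: Optional[str] = None,
    status_callback_url: Optional[str] = None,
    parameters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a Stream instruction dict (hypothetical helper)."""
    if not url.startswith("wss://"):
        raise ValueError("url must start with wss://")
    if track not in VALID_TRACKS:
        raise ValueError(f"track must be one of {sorted(VALID_TRACKS)}")

    verb: dict[str, Any] = {"verb": "Stream", "url": url, "track": track}
    # Optional fields are included only when the caller supplies them.
    if name is not None:
        verb["name"] = name
    if status_callback_url is not None:
        verb["status_callback_url"] = status_callback_url
    if parameters is not None:
        verb["parameters"] = parameters
    return verb


document = {
    "voxml_version": "1.0",
    "instructions": [build_stream_verb("wss://ai-agent.example.com/stream")],
}
print(json.dumps(document, indent=2))
```

Rejecting a bad URL or track value locally gives a clearer error than waiting for the stream to fail at call time.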

Audio Tracks

Both Tracks (Default)

Stream both caller and callee audio:

{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "both_tracks"
}

Inbound Track Only

Stream only the caller's audio:

{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "inbound_track"
}

Outbound Track Only

Stream only the callee's audio (what the caller hears):

{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "track": "outbound_track"
}

Custom Parameters

Pass metadata to your WebSocket endpoint:

{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "parameters": {
    "agent_id": "agent_123",
    "language": "en-US",
    "context": "customer_support"
  }
}

WebSocket Protocol

Connection

TryVox connects to your WebSocket URL and sends a start message:

{
  "event": "start",
  "stream_id": "stream_abc123",
  "call_uuid": "call_xyz789",
  "account_id": "acc_123",
  "from": "+919876543210",
  "to": "+911234567890",
  "custom_parameters": {
    "agent_id": "agent_123"
  }
}
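
On the receiving side, the start message can be parsed into a typed object so the rest of the handler does not work with raw dictionaries. This is a sketch under assumptions: StreamStart and parse_start are hypothetical names, the field names are taken from the example above, and custom_parameters is treated as optional because the example only shows the case where parameters were set.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StreamStart:
    stream_id: str
    call_uuid: str
    account_id: str
    from_number: str  # "from" is a Python keyword, so the field is renamed
    to_number: str
    custom_parameters: dict[str, Any] = field(default_factory=dict)


def parse_start(raw: str) -> StreamStart:
    """Parse the first message of a stream; raise if it is not a start event."""
    msg = json.loads(raw)
    if msg.get("event") != "start":
        raise ValueError(f"expected a start event, got {msg.get('event')!r}")
    return StreamStart(
        stream_id=msg["stream_id"],
        call_uuid=msg["call_uuid"],
        account_id=msg["account_id"],
        from_number=msg["from"],
        to_number=msg["to"],
        # Assumption: the key may be absent when no parameters were configured.
        custom_parameters=msg.get("custom_parameters", {}),
    )
```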

Media Messages

Audio is sent as JSON messages carrying a Base64-encoded audio payload, with this structure:

{
  "event": "media",
  "stream_id": "stream_abc123",
  "track": "inbound",
  "payload": "<base64-encoded-audio>",
  "timestamp": 1234567890
}

Audio format:

Codec: μ-law (PCMU)
Sample rate: 8 kHz
Encoding: Base64
Frame size: 20 ms
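
Most speech and AI libraries expect linear PCM rather than μ-law, so the payload usually needs two steps: Base64 decoding, then μ-law expansion. The sketch below uses the standard G.711 μ-law expansion with only the Python standard library; ulaw_to_linear and decode_media_payload are hypothetical names. The frame arithmetic follows from the format above: 8000 samples per second for 20 ms is 160 samples, which is 160 bytes of μ-law for a single track.

```python
import base64
import struct

SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples


def ulaw_to_linear(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF            # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


# Precompute all 256 values so decoding a frame is a table lookup.
_ULAW_TABLE = [ulaw_to_linear(b) for b in range(256)]


def decode_media_payload(payload: str) -> bytes:
    """Turn a media message payload into little-endian 16-bit PCM bytes."""
    ulaw = base64.b64decode(payload)
    samples = [_ULAW_TABLE[b] for b in ulaw]
    return struct.pack(f"<{len(samples)}h", *samples)


# One frame of mu-law silence (0xFF expands to 0).
silence = base64.b64encode(bytes([0xFF]) * SAMPLES_PER_FRAME).decode("ascii")
pcm = decode_media_payload(silence)
print(len(pcm))  # 320: two bytes per sample for 160 samples
```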

Stop Message

When streaming ends, TryVox sends:

{
  "event": "stop",
  "stream_id": "stream_abc123",
  "call_uuid": "call_xyz789"
}
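
The three events map naturally onto a small dispatcher that allocates per-stream state on start and releases it on stop. The following sketch is independent of any WebSocket library and assumes each message arrives as JSON text, as the examples show; StreamRegistry and its fields are hypothetical names.

```python
import base64
import json
from typing import Any, Optional


class StreamRegistry:
    """Track active streams by stream_id (illustrative only)."""

    def __init__(self) -> None:
        self.active: dict[str, dict[str, Any]] = {}

    def handle(self, raw: str) -> Optional[str]:
        """Process one message and return the event name it carried."""
        msg = json.loads(raw)
        event = msg.get("event")
        stream_id = msg.get("stream_id")

        if event == "start":
            # Allocate whatever the stream needs (buffers, model sessions, ...).
            self.active[stream_id] = {
                "call_uuid": msg.get("call_uuid"),
                "audio_bytes": 0,
            }
        elif event == "media":
            state = self.active.get(stream_id)
            if state is not None:  # ignore media for unknown streams
                state["audio_bytes"] += len(base64.b64decode(msg["payload"]))
        elif event == "stop":
            # Release resources; pop() tolerates a stop without a start.
            self.active.pop(stream_id, None)
        return event
```

Call handle() once per received message from whichever WebSocket server you use. Keeping the state keyed by stream_id lets one server process serve several concurrent calls.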

Status Callbacks

Monitor stream lifecycle events:

{
  "verb": "Stream",
  "url": "wss://ai.example.com/stream",
  "status_callback_url": "https://example.com/stream-status"
}

TryVox POSTs to status_callback_url for these events:

stream-started - Stream connected successfully
stream-stopped - Stream ended normally
stream-failed - Stream connection failed

Example callback payload:

{
  "call_uuid": "call_xyz789",
  "stream_id": "stream_abc123",
  "event": "stream-started",
  "timestamp": 1234567890
}
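
A callback receiver mostly needs to tell normal lifecycle events from failures. Here is a minimal sketch of that logic, assuming the POST body is JSON shaped like the example payload; classify_status_callback and the severity labels are hypothetical, and the function can be called from any HTTP framework's request handler.

```python
import json

# Events listed above, mapped to a log severity chosen for this sketch.
EVENT_SEVERITY = {
    "stream-started": "info",
    "stream-stopped": "info",
    "stream-failed": "error",
}


def classify_status_callback(body: bytes) -> tuple[str, str]:
    """Return (severity, summary) for one status callback body."""
    payload = json.loads(body)
    event = payload["event"]
    # Unknown event names are flagged rather than dropped silently.
    severity = EVENT_SEVERITY.get(event, "warning")
    summary = (
        f"{event} stream={payload.get('stream_id')} "
        f"call={payload.get('call_uuid')}"
    )
    return severity, summary
```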

AI Agent Integration

Stream is designed for AI voice agents. Here's a typical flow:

{
  "voxml_version": "1.0",
  "instructions": [
    {
      "verb": "Say",
      "text": "Connecting you to our AI assistant."
    },
    {
      "verb": "Stream",
      "url": "wss://ai-agent.example.com/voice",
      "track": "both_tracks",
      "parameters": {
        "user_id": "user_123",
        "intent": "support"
      }
    }
  ]
}

Your AI agent receives audio, processes it, and can:

Send audio back through the WebSocket
Use real-time STT/TTS
Maintain conversation context
Trigger actions based on speech

Best Practices

Always use secure WebSocket (wss://) URLs
Implement reconnection logic in your WebSocket server
Handle start and stop events to manage resources
Use parameters to pass call context to your AI agent
Monitor status_callback_url for debugging
Process audio in real time to avoid adding latency
Use both_tracks for full conversation context in AI agents

Common Use Cases

AI voice agents and conversational AI
Real-time speech analytics
Call quality monitoring
Live transcription
Sentiment analysis
Voice biometrics
Custom audio processing pipelines
