VoxML Reference
Declarative call control with JSON — TryVox's answer to TwiML.
VoxML is TryVox's declarative call control language: you control a call by returning JSON from your webhooks. When TryVox requests your answer_url, your server responds with VoxML instructions, which TryVox executes in order.
VoxML Response Envelope
Every VoxML response follows this structure:
{
  "voxml_version": "1.0",
  "instructions": [
    {"verb": "Say", "text": "Hello world"},
    {"verb": "Hangup"}
  ]
}
Available Verbs
VoxML supports 11 verbs for controlling call flow:
| Verb | Description |
|---|---|
| Say | Convert text to speech on the call |
| Play | Play an audio file on the call |
| Gather | Collect DTMF digits or speech input from the caller |
| Dial | Connect the call to another party |
| Record | Record audio from the caller |
| Stream | Stream live call audio to a WebSocket endpoint |
| Conference | Join the caller into a named conference room |
| Redirect | Fetch new VoxML from a different URL |
| Pause | Pause execution for a specified duration |
| Hangup | End the call |
| Reject | Reject an incoming call without answering |
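As a rough illustration of the envelope and verb list above, the following Python sketch wraps a list of instructions in the response structure and rejects verb names that are not in the table. The names `build_voxml` and `VOXML_VERBS` are hypothetical helpers, not part of any TryVox SDK, and the check covers verb names only, since per-verb parameters are not described in this section.

```python
import json

# The 11 verb names listed in the table above.
VOXML_VERBS = frozenset({
    "Say", "Play", "Gather", "Dial", "Record", "Stream",
    "Conference", "Redirect", "Pause", "Hangup", "Reject",
})


def build_voxml(instructions, version="1.0"):
    """Wrap instructions in the VoxML envelope, rejecting unknown verbs.

    Hypothetical helper: only the "verb" field is checked here.
    """
    for index, instruction in enumerate(instructions):
        verb = instruction.get("verb")
        if verb not in VOXML_VERBS:
            raise ValueError(f"instruction {index}: unknown verb {verb!r}")
    return {"voxml_version": version, "instructions": list(instructions)}


if __name__ == "__main__":
    response = build_voxml([
        {"verb": "Say", "text": "Hello world"},
        {"verb": "Hangup"},
    ])
    # Serialize to the JSON body a webhook would return.
    print(json.dumps(response, indent=2))
```

Validating verb names before serializing catches typos such as `"Speak"` on your side, before the response is sent.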
Execution Model
Instructions are executed sequentially from top to bottom. Some verbs, such as Gather and Dial, can interrupt this flow: they make a callback to your server, and your response can supply new VoxML instructions dynamically.
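To make the webhook side of this model concrete, here is a minimal sketch of an answer_url endpoint using only the Python standard library. It rests on several assumptions that this section does not confirm: that TryVox delivers the webhook as a POST request, that the response should be served as `application/json`, and that port 8080 is reachable. `answer_voxml`, `AnswerHandler`, and `serve` are hypothetical names chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def answer_voxml():
    """Return the VoxML document for a newly answered call."""
    return {
        "voxml_version": "1.0",
        "instructions": [
            {"verb": "Say", "text": "Hello world"},
            {"verb": "Hangup"},
        ],
    }


class AnswerHandler(BaseHTTPRequestHandler):
    # Assumption: the answer_url webhook arrives as a POST request.
    def do_POST(self):
        # Drain the request body; this sketch does not inspect it.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        body = json.dumps(answer_voxml()).encode("utf-8")
        self.send_response(200)
        # Assumption: VoxML is served as application/json.
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve the answer_url handler until interrupted."""
    HTTPServer(("", port), AnswerHandler).serve_forever()
```

Calling `serve()` starts the listener. A handler for a Gather or Dial callback could follow the same pattern, returning a different instruction list based on the request it receives.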