Initiate a Stream

Learn how to establish a WebSocket connection and start streaming audio from active calls using the Stream XML element.

What Happens

When Vobiz encounters a Stream element in your XML response:

  1. Vobiz initiates a WebSocket connection to your specified URL
  2. Once connected, Vobiz streams raw audio packets to your server in real time
  3. Your application can process, analyze, or forward the audio
  4. For bidirectional streams, your app can also send audio back to the call

XML Setup

To initiate an audio stream, include the Stream element in your XML response with the WebSocket URL:

Basic Stream XML
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Stream
        bidirectional="false"
        audioTrack="inbound"
        streamTimeout="7200">
        wss://stream.vobiz.ai/stream
    </Stream>
</Response>

Key Configuration Parameters

  • WebSocket URL: The text content of the element (e.g., wss://stream.vobiz.ai/stream)
  • bidirectional: Set to true to enable sending audio back to the call
  • audioTrack: Choose which audio to stream (inbound, outbound, or both)
  • streamTimeout: Maximum streaming duration in seconds (default: 86400, or 24 hours)
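
If your application generates the XML response in code, the parameters above map directly onto element attributes and text content. The following is a minimal sketch using Python's standard library; the helper name build_stream_xml is hypothetical and not part of any Vobiz SDK, and the attribute names and defaults are taken from the list above.

```python
import xml.etree.ElementTree as ET


def build_stream_xml(ws_url, bidirectional=False, audio_track="inbound",
                     stream_timeout=86400):
    """Return a <Response> document containing a single <Stream> element.

    Hypothetical helper for illustration; not part of a Vobiz SDK.
    """
    if audio_track not in ("inbound", "outbound", "both"):
        raise ValueError(f"unsupported audioTrack: {audio_track!r}")

    response = ET.Element("Response")
    stream = ET.SubElement(response, "Stream", {
        "bidirectional": "true" if bidirectional else "false",
        "audioTrack": audio_track,
        "streamTimeout": str(stream_timeout),
    })
    # The WebSocket URL is the element's text content, not an attribute.
    stream.text = ws_url

    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_stream_xml("wss://stream.vobiz.ai/stream", stream_timeout=7200))
```

Building the document with an XML library rather than string concatenation means special characters in the URL (for example & in a query string) are escaped automatically.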

WebSocket Connection

Your WebSocket server must be ready to accept connections from Vobiz. Here's what the initial connection looks like:

Connection Start Message

Vobiz sends this when the stream starts
{
  "event": "start",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "accountId": "MA_2210JXXN",
  "callId": "CALL_abc123xyz",
  "from": "+14155551111",
  "to": "+14155552222",
  "direction": "inbound",
  "contentType": "audio/x-l16;rate=8000",
  "audioTrack": "inbound"
}

This message provides context about the call and audio stream configuration.
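
The contentType field packs the encoding and the sample rate into a single string. One way to separate them is sketched below; the function name parse_content_type is hypothetical, and falling back to 8000 Hz when no rate parameter is present is an assumption rather than documented behavior.

```python
def parse_content_type(content_type):
    """Split 'audio/x-l16;rate=8000' into ('audio/x-l16', {'rate': '8000'})."""
    media_type, *params = [part.strip() for part in content_type.split(";")]
    parsed = {}
    for param in params:
        if not param:
            continue  # tolerate a trailing semicolon
        key, _, value = param.partition("=")
        parsed[key.strip().lower()] = value.strip()
    return media_type.lower(), parsed


# Fields copied from the sample "start" message above.
start = {"event": "start", "contentType": "audio/x-l16;rate=8000"}

media_type, params = parse_content_type(start["contentType"])
# Assumption: default to 8000 Hz if the rate parameter is missing.
sample_rate = int(params.get("rate", 8000))
print(media_type, sample_rate)
```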

Audio Data Messages

Continuous audio packets sent during the stream
{
  "event": "media",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "media": {
    "payload": "base64-encoded-audio-data...",
    "contentType": "audio/x-l16",
    "sampleRate": 8000,
    "timestamp": 1634567890123
  }
}

Vobiz sends these messages continuously, containing base64-encoded raw audio data.
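
To reason about latency or buffering, it helps to know how much audio each media message carries. The sketch below derives the duration of one chunk from its decoded size. It assumes that audio/x-l16 means 16-bit linear PCM with a single channel (2 bytes per sample); verify this against your own stream before relying on it. The function name chunk_duration_ms is hypothetical.

```python
import base64

BYTES_PER_SAMPLE = 2  # assumption: 16-bit linear PCM, one channel


def chunk_duration_ms(media):
    """Return the playback duration of one "media" payload in milliseconds."""
    audio = base64.b64decode(media["payload"])
    samples = len(audio) // BYTES_PER_SAMPLE
    return 1000.0 * samples / media["sampleRate"]


# Synthetic example: 320 bytes of silence at 8000 Hz.
example = {
    "payload": base64.b64encode(bytes(320)).decode("ascii"),
    "contentType": "audio/x-l16",
    "sampleRate": 8000,
}
print(chunk_duration_ms(example))  # 20.0
```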

Stream Stop Message

Vobiz sends this when the stream ends
{
  "event": "stop",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "reason": "call_ended"
}

The connection closes gracefully when the call ends or the stream timeout is reached.

Connection Flow

  1. Vobiz receives your XML response containing the Stream element
  2. Vobiz establishes a WebSocket connection to your specified URL
  3. Vobiz sends a "start" event with call metadata and stream configuration
  4. Continuous "media" events stream audio packets in real time
  5. Bidirectional streams (optional): your app can send playAudio events back to Vobiz
  6. Vobiz sends a "stop" event when the call ends or the timeout is reached
  7. The WebSocket connection closes gracefully
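
The receiving side of this flow can be modeled as a small state machine that is independent of any WebSocket library. The following sketch is illustrative only: the StreamSession class and its state names are hypothetical, and rejecting out-of-order events is a design choice for the example, not documented Vobiz behavior. It covers the start, media, and stop events and leaves out playAudio.

```python
import base64
import json


class StreamSession:
    """Tracks one stream through its start -> media -> stop lifecycle."""

    def __init__(self):
        self.state = "connected"   # socket open, no "start" event seen yet
        self.stream_id = None
        self.bytes_received = 0
        self.stop_reason = None

    def handle(self, raw_message):
        """Apply one JSON message and return the resulting state."""
        msg = json.loads(raw_message)
        event = msg.get("event")

        if event == "start" and self.state == "connected":
            self.stream_id = msg["streamId"]
            self.state = "streaming"
        elif event == "media" and self.state == "streaming":
            audio = base64.b64decode(msg["media"]["payload"])
            self.bytes_received += len(audio)
        elif event == "stop" and self.state == "streaming":
            self.stop_reason = msg.get("reason")
            self.state = "stopped"
        else:
            raise ValueError(
                f"unexpected event {event!r} in state {self.state!r}")
        return self.state


# Replay the three message types in order, using a synthetic payload.
stream_id = "20170ada-f610-433b-8758-c02a2aab3662"
payload = base64.b64encode(bytes(320)).decode("ascii")
session = StreamSession()
for message in (
    {"event": "start", "streamId": stream_id},
    {"event": "media", "streamId": stream_id, "media": {"payload": payload}},
    {"event": "stop", "streamId": stream_id, "reason": "call_ended"},
):
    session.handle(json.dumps(message))

print(session.state, session.bytes_received, session.stop_reason)
# stopped 320 call_ended
```

Keeping the lifecycle logic separate from the socket handler makes it possible to unit test event handling with recorded messages.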

Implementation Examples

Node.js WebSocket Server

Simple WebSocket server to receive audio
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws) => {
  console.log('Vobiz connected to audio stream');

  ws.on('message', (message) => {
    const data = JSON.parse(message);

    if (data.event === 'start') {
      console.log('Stream started:', data.streamId);
      console.log('Call from:', data.from, 'to:', data.to);
    } else if (data.event === 'media') {
      // Process base64-encoded audio data
      const audioBuffer = Buffer.from(data.media.payload, 'base64');
      console.log('Received audio chunk:', audioBuffer.length, 'bytes');

      // Your audio processing logic here
      // e.g., send to transcription service, save to file, etc.
    } else if (data.event === 'stop') {
      console.log('Stream stopped:', data.reason);
    }
  });

  ws.on('close', () => {
    console.log('WebSocket connection closed');
  });
});

console.log('WebSocket server listening on port 8080');

Python WebSocket Handler

Python asyncio WebSocket server
import asyncio
import websockets
import json
import base64

async def handle_audio_stream(websocket, path=None):
    # path is optional: recent releases of websockets pass only the connection
    print("Vobiz connected to audio stream")

    async for message in websocket:
        data = json.loads(message)

        if data['event'] == 'start':
            print(f"Stream started: {data['streamId']}")
            print(f"Call from {data['from']} to {data['to']}")

        elif data['event'] == 'media':
            # Decode base64 audio data
            audio_bytes = base64.b64decode(data['media']['payload'])
            print(f"Received audio chunk: {len(audio_bytes)} bytes")

            # Your audio processing logic here
            # e.g., send to transcription, analysis, etc.

        elif data['event'] == 'stop':
            print(f"Stream stopped: {data.get('reason', 'unknown')}")
            break

async def main():
    async with websockets.serve(handle_audio_stream, "0.0.0.0", 8080):
        print("WebSocket server listening on port 8080")
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())

Vobiz XML Response with Stream

Complete example with status callbacks
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Speak>This call is being recorded for quality assurance.</Speak>
    <Stream
        bidirectional="false"
        audioTrack="both"
        streamTimeout="3600"
        statusCallbackUrl="https://api.vobiz.ai/stream-status"
        statusCallbackMethod="POST"
        contentType="audio/x-l16;rate=16000"
        extraHeaders="session_id=abc123,agent_id=john">
        wss://stream.vobiz.ai/stream
    </Stream>
    <Dial>+14155551234</Dial>
</Response>

This example streams both inbound and outbound audio at a 16 kHz sample rate, includes custom session headers, and sends status updates to the URL given in statusCallbackUrl.
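
The extraHeaders attribute in this example is a comma-separated list of key=value pairs. How these values are delivered to your server is not described here, so the sketch below only shows how such a string could be split into a dictionary, assuming it reaches your application in the same form as it appears in the attribute. The function name parse_extra_headers is hypothetical.

```python
def parse_extra_headers(raw):
    """Turn 'session_id=abc123,agent_id=john' into a dict of strings."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # ignore empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in header pair: {pair!r}")
        headers[key.strip()] = value.strip()
    return headers


print(parse_extra_headers("session_id=abc123,agent_id=john"))
# {'session_id': 'abc123', 'agent_id': 'john'}
```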