Speak Text on Calls

Convert text to speech and play it to participants on an active call.

The Speak Text API uses Vobiz's text-to-speech (TTS) engine to convert written text into spoken audio during ongoing calls. This lets you deliver dynamic messages, notifications, or instructions without pre-recording audio files. It supports 29 languages and multiple voices.

Supported: 29 languages, with WOMAN and MAN voices (availability varies by language). The default is English (US) with the WOMAN voice.

Key Features

29 Languages

Support for major global languages including English, Spanish, French, German, Chinese, Japanese, and more.

Multiple Voices

Choose between WOMAN and MAN voices for most languages (availability varies).

Dynamic Content

Generate speech on the fly without pre-recording audio, which suits personalized messages.

Call Leg Selection

Choose which participants hear the speech: caller, callee, or both.

Loop Support

Repeat text indefinitely for continuous notifications or prompts.

Audio Mixing

Mix speech with call audio or mute participants during playback.
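
As a rough illustration of how these options might fit together in one request body, here is a minimal Python sketch. The field names (text, voice, language, legs, loop, mix), the leg values, and the "en-US" language code are assumptions made for this example, not confirmed Vobiz parameters; only the WOMAN and MAN voice names and the English (US) / WOMAN defaults come from this page. The defaults chosen for legs, loop, and mix are illustrative. Check the API reference before relying on any of them.

```python
# Hypothetical option values: the real API may use different names.
ALLOWED_VOICES = {"WOMAN", "MAN"}
ALLOWED_LEGS = {"caller", "callee", "both"}


def build_speak_payload(text, *, voice="WOMAN", language="en-US",
                        legs="both", loop=False, mix=True):
    """Validate the options and return a dict that could be sent as a request body."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"voice must be one of {sorted(ALLOWED_VOICES)}")
    if legs not in ALLOWED_LEGS:
        raise ValueError(f"legs must be one of {sorted(ALLOWED_LEGS)}")
    return {
        "text": text.strip(),
        "voice": voice,        # WOMAN is the documented default voice
        "language": language,  # English (US) is the documented default; code format assumed
        "legs": legs,          # which participants hear the speech (assumed field)
        "loop": bool(loop),    # repeat the text until stopped (assumed field)
        "mix": bool(mix),      # mix with call audio instead of muting (assumed field)
    }


payload = build_speak_payload("Your call is being transferred.", legs="caller")
```

Validating the options before sending keeps obvious mistakes, such as an unsupported voice name, out of the request.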

Common Use Cases

  • Dynamic IVR Menus

    Generate menu options and prompts based on customer data or time of day.

  • Personalized Greetings

    Welcome callers by name or deliver customized messages based on caller ID.

  • Queue Updates

    Announce queue position, wait times, or estimated callback times dynamically.

  • Multi-Language Support

    Deliver messages in the caller's preferred language for global customer support.

  • Real-Time Notifications

    Deliver account balances, order statuses, or appointment confirmations on demand.
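
Several of the use cases above come down to filling a text template with live data before it is spoken. The following Python sketch shows one way to do that for a queue update. The function name and wording are hypothetical, and the resulting string would still need to be passed to the Speak Text API.

```python
def queue_update_text(name: str, position: int, wait_minutes: int) -> str:
    """Build a short, fully punctuated queue announcement for one caller."""
    minute_word = "minute" if wait_minutes == 1 else "minutes"
    return (
        f"Hello {name}. You are number {position} in the queue. "
        f"Your estimated wait time is {wait_minutes} {minute_word}."
    )


message = queue_update_text("Asha", 3, 5)
```

Writing the template as complete sentences keeps the spoken result natural, whatever values are filled in.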

Best Practices

Keep Text Concise

Short, clear messages are easier to understand. Break long content into multiple segments if needed.
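
One way to break long content into segments is to split at sentence boundaries and pack sentences into chunks under a length budget. This is a minimal Python sketch. The 200-character budget is an arbitrary value for illustration, not a documented Vobiz limit, and a single sentence longer than the budget is kept whole rather than cut mid-sentence.

```python
import re


def split_into_segments(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into segments of at most max_chars where possible."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


parts = split_into_segments(
    "Your order has shipped. It arrives on Friday. Reply to this call for help.",
    max_chars=50,
)
```

Each segment can then be spoken as a separate request, so no sentence is interrupted partway through.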

Use Proper Punctuation

Punctuation affects speech pacing and tone. Use periods, commas, and question marks for natural-sounding speech.

Test Different Voices

Try both WOMAN and MAN voices to find which works best for your use case and audience.

Choose the Right Language

Match the language to your caller's preference for a better user experience and comprehension.

Avoid Special Characters

Emojis and special symbols may not be pronounced correctly. Use plain text for best results.
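
A simple pre-processing step can strip emojis and pictographic symbols before text is sent for synthesis. This Python sketch filters by Unicode category. Which categories to drop is a choice made for this example, not a rule from Vobiz, and it also removes symbols such as the degree or copyright sign, so adjust it if your messages need them.

```python
import re
import unicodedata

# Categories dropped in this sketch: other symbols (most emojis), modifier
# symbols (emoji skin tones), format characters (zero-width joiner), and
# surrogate, private-use, or unassigned code points.
_DROP_CATEGORIES = {"So", "Sk", "Cf", "Cs", "Co", "Cn"}
_VARIATION_SELECTOR_16 = "\ufe0f"  # forces emoji presentation; not speakable


def to_plain_text(text: str) -> str:
    """Remove emojis and similar symbols, then collapse leftover whitespace."""
    normalized = unicodedata.normalize("NFC", text)
    kept = []
    for ch in normalized:
        if ch == _VARIATION_SELECTOR_16:
            continue
        # Replace dropped characters with a space so neighboring words stay apart.
        kept.append(" " if unicodedata.category(ch) in _DROP_CATEGORIES else ch)
    return re.sub(r"\s+", " ", "".join(kept)).strip()


clean = to_plain_text("Your order shipped 🎉 Thanks!")
```

Letters with accents, digits, and ordinary punctuation pass through unchanged, so the step is safe for non-English text.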