
Text-to-Speech (TTS)

Convert text to natural-sounding audio using the AWS Polly, Azure Neural TTS, OpenAI, and ElevenLabs voice engines, all accessed through a single unified API. Power IVR prompts, ringless voicemail (RVM) drops, outbound voice notifications, and AI voice bots with production-ready synthesis across 60+ languages and 300+ voices.

One API, Multiple TTS Engines

Rather than integrating each TTS vendor separately, MOBITELSMS provides a single REST endpoint that routes synthesis requests to the appropriate engine based on voice selection, cost preference, or quality requirements. Synthesised audio is stored in our high-speed distributed cache, so identical text-voice combinations are not synthesised twice, which reduces cost by up to 80% on high-volume use cases.
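As an illustration of how identical text-voice combinations could be recognised, here is a minimal Python sketch of a deterministic cache key. The function name `cache_key`, the `audio_format` parameter, and the whitespace normalisation are assumptions made for this example; the actual key scheme used by the platform is not documented here.

```python
import hashlib
import unicodedata


def cache_key(text: str, voice: str, audio_format: str = "mp3") -> str:
    """Derive a deterministic key for a text-voice-format combination.

    Hypothetical helper: the real service may build its keys differently.
    """
    # Assumption: normalise Unicode and collapse runs of whitespace so that
    # trivially different spellings of one prompt share a cache entry.
    normalised = " ".join(unicodedata.normalize("NFC", text).split())
    # Join the fields with a separator that cannot appear in normal text,
    # so ("ab", "c") and ("a", "bc") never produce the same payload.
    payload = "\x1f".join([voice, audio_format, normalised])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two requests with the same text and voice hash to the same key and can be served from the cache, while changing the voice or the output format yields a different key and triggers a fresh synthesis.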

TTS Platform Capabilities

Multi-Engine Support
Route to AWS Polly, Azure Neural TTS, OpenAI TTS, or ElevenLabs based on voice name, engine policy, or fallback rules. Automatic engine failover maintains availability if one provider has an outage.
60+ Languages
Synthesis in English, Spanish, French, German, Portuguese, Arabic, Mandarin, Japanese, and 50+ more languages. Neural voices deliver near-human prosody and natural pacing across all supported languages.
SSML Support
Full Speech Synthesis Markup Language support for fine-grained control over pronunciation, rate, pitch, volume, pauses, and phoneme substitution. Standardised SSML works consistently across all supported TTS engines.
Audio Caching
Identical text-voice pairs are synthesised once and cached in our high-speed distributed cache with configurable TTL. Cache hit rates above 80% are typical for IVR prompts, reducing both latency and provider API costs significantly.
Output Formats
Receive audio as MP3, WAV (PCM at 8 or 16 kHz), OGG, or telephony-optimised ULAW/ALAW for direct injection into IVR systems. Streaming output is supported for real-time voice bot applications.
Telecom Integration
Native integration with the MOBITELSMS IVR and RVM systems: synthesise prompts on demand during a live call or pre-generate audio for voicemail drops. No external TTS API keys are needed when using our hosted service.
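The failover and caching behaviour described in the capabilities above can be sketched in Python roughly as follows. This is a simplified in-process model, not the platform's implementation: `TTLCache`, `synthesise`, and the engine callables are hypothetical names, and the real service uses a distributed cache rather than a local dictionary.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical engine interface: takes (text, voice) and returns audio bytes.
Synthesiser = Callable[[str, str], bytes]


class TTLCache:
    """Tiny in-memory stand-in for a distributed cache with a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[Tuple[str, str], Tuple[float, bytes]] = {}

    def get(self, key: Tuple[str, str]) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, audio = entry
        if self._clock() >= expires_at:
            # Entry has outlived its TTL: drop it and report a miss.
            del self._items[key]
            return None
        return audio

    def put(self, key: Tuple[str, str], audio: bytes) -> None:
        self._items[key] = (self._clock() + self._ttl, audio)


def synthesise(text: str, voice: str,
               engines: List[Tuple[str, Synthesiser]],
               cache: TTLCache) -> Tuple[str, bytes]:
    """Return (source, audio), where source is 'cache' or an engine name."""
    key = (text, voice)
    cached = cache.get(key)
    if cached is not None:
        return "cache", cached
    errors = []
    # Try engines in priority order and fall through to the next on failure.
    for name, engine in engines:
        try:
            audio = engine(text, voice)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        cache.put(key, audio)
        return name, audio
    raise RuntimeError("all engines failed: " + "; ".join(errors))
```

In this sketch a repeated prompt is served from the cache until its TTL expires, and an outage at the first engine only costs one failed attempt before the next engine in the list is tried.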

Specifications

Engines: AWS Polly, Azure, OpenAI, ElevenLabs
Languages: 60+
Voices: 300+ across all engines
Output Formats: MP3, WAV, OGG, ULAW, ALAW
SSML: Full W3C SSML 1.1 support
Cache Backend: Distributed in-memory cache (configurable TTL)
API: REST JSON + Streaming
Latency (cold): <800ms typical
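To make the SSML controls concrete, the following Python sketch builds a short IVR prompt with a pause and a slower speaking rate using only the standard library. The helper name `build_prompt` and the prompt wording are invented for this example; `speak`, `break`, and `prosody` are standard SSML elements. The root element is left without the `version` and `xmlns` attributes that a strictly conformant SSML 1.1 document carries, on the assumption that the target engine accepts a bare `speak` element.

```python
import xml.etree.ElementTree as ET


def build_prompt(greeting: str, menu: str) -> str:
    """Build an SSML string: greeting, a 500 ms pause, then a slow menu."""
    speak = ET.Element("speak")
    # Text placed directly inside <speak> is read at the default rate.
    speak.text = greeting
    # <break> inserts a pause between the greeting and the menu.
    ET.SubElement(speak, "break", {"time": "500ms"})
    # <prosody rate="slow"> slows down the menu options.
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow"})
    prosody.text = menu
    # ElementTree escapes characters such as & and < in the text content.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_prompt("Welcome to support.", "Press 1 for billing."))
```

Building the markup with an XML library rather than string concatenation keeps user-supplied text from breaking the document, since special characters are escaped automatically.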

Add Natural-Sounding Voice to Your App

One API key gives you access to all supported TTS engines. Start synthesising in minutes with our REST API and pre-built IVR integration.
