ElevenLabs’ new AI voice can speak like a human in 70+ languages

Unlike earlier iterations of robotic-sounding voice synthesis tools, Eleven v3 has been designed to perform like a trained voice actor.

Written by BrandWagon Online

Updated: June 10, 2025 16:14 IST

ElevenLabs, an AI voice technology company, has announced the launch of Eleven v3 (alpha), a major upgrade to its text-to-speech platform. According to the company, the new version brings synthetic speech markedly closer to human voice acting in realism and expressiveness. The company claims that the model can interpret text not only with clarity, but also with a full range of emotions, tones, and even dramatic cues.

Unlike earlier iterations of robotic-sounding voice synthesis tools, Eleven v3 has been designed to perform like a trained voice actor. It can adapt mid-sentence to changes in tone, convey complex emotional shifts such as excitement or sadness, and even include non-verbal cues like laughter or sighs. For creators working on video content, podcasts, audiobooks, or interactive applications, the new tool offers the ability to deliver spoken text with personality and nuance.

One of the most significant updates is its expanded language support. While earlier versions were limited to around 30 languages, Eleven v3 now supports over 70, including several widely spoken Indian languages such as Hindi, Tamil, and Bengali. This makes it particularly relevant for the Indian market, where regional language content consumption is on the rise.

“Our goal was to build the most expressive text-to-speech model ever created,” said Mati Staniszewski, Co-Founder and CEO of ElevenLabs. “With full control over delivery, pacing, and emotion, users can now tailor AI voices to match any script. We’re especially proud to include Indian languages as part of this global rollout.”

The model enables users to insert specific instructions into the text, such as [whisper], [laugh], or [sing], to control non-verbal and stylistic elements. It can also shift accents, emulate multiple characters in a single recording, and adjust speech dynamics for dramatic or storytelling purposes.
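For readers curious how such inline instructions might look in practice, the following is a minimal sketch of preparing a tagged script in Python. It only builds and inspects the text; it does not call any ElevenLabs API. The helper names (`tag_line`, `find_tags`, `strip_tags`) are hypothetical, and the tag spellings are taken from the examples quoted above, so the actual model may accept a different or larger set.

```python
import re

# Tags quoted in the article. This set is an assumption for illustration;
# the real model may accept other tags or different spellings.
KNOWN_TAGS = {"whisper", "laugh", "sing"}

# Matches a lowercase word inside square brackets, e.g. "[whisper]".
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def tag_line(text: str, tag: str) -> str:
    """Prefix one line of script with a bracketed delivery tag (hypothetical helper)."""
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag: {tag}")
    return f"[{tag}] {text}"


def find_tags(script: str) -> list[str]:
    """Return every bracketed tag in the script, in order of appearance."""
    return TAG_PATTERN.findall(script)


def strip_tags(script: str) -> str:
    """Remove the tags and tidy whitespace to recover the plain spoken text."""
    without_tags = TAG_PATTERN.sub("", script)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


# Build a short two-line script with different delivery cues.
script = " ".join([
    tag_line("Don't tell anyone.", "whisper"),
    tag_line("That was a good one!", "laugh"),
])

print(script)
print(find_tags(script))
print(strip_tags(script))
```

Keeping the tags in the script text itself, rather than in separate metadata, mirrors the approach described above: the delivery cue sits exactly where the change in tone should happen.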

This new capability opens doors for a wide range of Indian users. Content creators on platforms like YouTube can produce voiceovers that sound natural and emotionally engaging. Educators and edtech platforms can use the tool to create immersive, audio-rich learning materials. Game developers and app builders can generate realistic character voices or virtual assistants. Even businesses can benefit, using the AI to build smarter voice bots and more human-like customer service systems. Authors and publishers can also convert books into lifelike audiobooks, reducing dependence on human narrators while preserving the emotive quality of the original text.