Bengaluru-based voice AI startup Gnani.ai has raised $10 million in a Series B round led by Aavishkaar Capital at a “significantly higher” valuation, says co-founder Ganesh Gopalan. The raise comes at a time when AI funding is surging both globally and within India, yet Gnani’s total capital raised so far, at roughly $15 million, remains modest in comparison.

Meanwhile, its peer Sarvam AI is reportedly nearing a unicorn valuation in its next round of funding. In a conversation with Ayanti Bera, Gopalan speaks about the drivers behind the fundraise, his approach to multilingual voice AI models, and the company’s plans to scale its voice AI agents globally. Excerpts:

You last raised capital in mid-2024. What drove this latest fundraise?

The decision was driven by strong business growth and increasing market demand for our voice AI solutions. Over the past year, we’ve added more than 100 enterprise customers, and there’s significant interest in what we’re building. At the same time, we’re doubling down on R&D, especially as part of the India AI Mission, where we’re developing advanced models like voice-to-voice systems. This round helps us scale both the business and our R&D capabilities. 

Can you tell us more about the Series B round and what it will be used for?

This is a $10 million Series B round led by Aavishkaar Capital, with participation from InfoEdge Ventures. The investment will help us accelerate global expansion, deepen our generative AI capabilities, and continue building our engineering and product teams.

What are your global expansion plans? Which markets are you targeting?

We’re already seeing traction outside India. We’ve partnered with customers in Japan and East Asia, are working in the Middle East, and have early customers in the US. While India remains a core market, we’re taking a measured approach to expanding globally, while deepening our presence in existing markets.

With countries increasingly focusing on sovereign AI, does that create barriers in global markets?

It’s true that sovereign AI is becoming important globally. But we see that as an opportunity as well. Many countries and organisations are exploring partnerships to build their own sovereign AI capabilities, and our deep-tech stack positions us well to support that. 

You recently launched multiple models under the Vachana stack. What’s the strategy there?

We’ve launched several models, including speech-to-text, text-to-speech, and a preview of our voice-to-voice model, which we are building as part of the India AI Mission. These are multilingual and built with LLM assistance. We’re also building small language models (SLMs) for specific industries.

All of this goes along with our fundamental premise that to win in this industry, you need to have control over all elements of the AI software pipeline. That’s how you reduce latency, reduce hallucinations, improve accuracy, and also get better margins.

One standout feature is voice cloning. What are its real-world applications?

Voice cloning opens up a wide range of use cases. For instance, a person can communicate in multiple languages using their own voice. Enterprises can deploy voice agents using a brand ambassador’s voice, making interactions more natural. Beyond that, applications extend to media, dubbing, and multilingual content creation. Once you have the ability to speak in multiple languages in your own voice, it creates an infinite number of possibilities.

How do you differentiate yourself in an increasingly crowded voice AI market?

Our core differentiation is that we build our models in-house rather than relying on external APIs. That gives us control over pricing, latency, and performance. For example, adding a new language to our system can take just about a month because the core models are ours. We believe wrapper businesses, which are dependent on external APIs, will struggle to survive in this space.