Giving shape to LLM dreams

Indian deeptech startups like CoRover, Gnani.ai, and Fractal are making strides in building India-specific large language models (LLMs) and small language models (SLMs), focusing on multilingual, domain-specific, and energy-efficient AI under the IndiaAI Mission.


Ever since China’s DeepSeek rocked the world with its low-cost, open-source AI model that rivals OpenAI’s ChatGPT, there has been a clamour among Indian deeptech startups to demonstrate their capabilities in building foundational AI models. While Bengaluru-based Sarvam is the first firm chosen under the Rs 10,370 crore IndiaAI Mission to build the first homegrown large language model (LLM), S Shanthi & Sudhir Chowdhary explore some other promising players in the arena:

Right-sizing vital for big impact

CoRover

Bengaluru

Expertise: Conversational AI platform

CoRover, a human-centric conversational AI platform for enterprises, is currently focused on architecting India-specific LLMs. Its multilingual strategy involves developing foundational LLMs, followed by the creation of specialised domain-specific LLMs and efficient small language models (SLMs) designed for edge deployment. “Our philosophy centres on developing right-sized models optimised for specific use-cases rather than solely pursuing maximal parameter counts. We are heavily invested in responsible AI, prioritising not only cost-effectiveness but also energy efficiency by exploring methods for reduced computational footprints while achieving superior results,” Ankush Sabharwal, founder and CEO, CoRover, told FE.

The startup is also exploring the application of the Mixture of Experts (MoE) principle, a machine learning technique in which multiple specialised models work together to solve a complex problem, to enhance model specialisation and reduce hallucinated outputs. This aligns with the startup’s focus on domain-specific expertise, offering a more targeted approach to specific problem domains. This strategy, it says, promises reduced computational cost, enhanced energy efficiency, and improved problem-solving capabilities tailored to the Indian context.
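For readers curious about how a Mixture of Experts arrangement works in principle, the toy Python sketch below may help. It is not CoRover's implementation: the three "experts", the linear gating rule and every name in it are hypothetical stand-ins chosen for illustration. A gate scores each expert for a given input, only the top-scoring experts are run, and their outputs are blended by the gate's weights, which is where the potential savings in computation come from.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical experts: tiny functions standing in for specialised models.
def expert_a(x):
    return 2.0 * x

def expert_b(x):
    return x + 10.0

def expert_c(x):
    return -x

EXPERTS = [expert_a, expert_b, expert_c]

def gate(x, weights, biases):
    """Illustrative linear gate: one score per expert, then softmax."""
    return softmax([w * x + b for w, b in zip(weights, biases)])

def moe_forward(x, weights, biases, top_k=2):
    """Run only the top_k experts and blend their outputs by gate weight."""
    probs = gate(x, weights, biases)
    ranked = sorted(range(len(EXPERTS)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * EXPERTS[i](x) for i in chosen)
    return output, chosen

if __name__ == "__main__":
    # Assumed gate parameters; in a real system these would be learned.
    out, used = moe_forward(3.0, weights=[1.0, 0.0, -1.0], biases=[0.0, 0.0, 0.0])
    print("experts used:", used, "blended output:", round(out, 3))
```

In this sketch the third expert is never evaluated for the sample input, since the gate ranks it lowest. Production systems typically apply the same idea inside a neural network, with learned gates routing each token to a few expert sub-networks.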

CoRover claims that it is strategically positioned with access to a substantial and relevant data ecosystem, effectively mitigating the data scarcity challenge commonly faced by others. “We also commend governmental initiatives such as AI-Kosh and the Bhashini initiatives, which are contributing to more data sources nationwide. Our existing strengths in data accessibility and talent, coupled with a proactive growth strategy, uniquely equip CoRover to excel in the development of advanced LLMs/SLMs for various AI-based applications,” Sabharwal added.


Strong focus on smaller, more specialised models

Gnani.ai

Bengaluru

Expertise: Voice-first AI solutions

Voice AI startup Gnani.ai has been focusing on developing India-specific language models that address the country’s vast linguistic diversity, low-resource languages, and nuanced speech patterns. A key area of innovation for the firm is its development of voice-to-voice models specifically for Indic languages—an emerging field with limited global players. The company is also building SLMs tailored for Indian languages, dialects, and code-mixed speech, optimised for low-latency and on-device deployment.

The firm’s other efforts include curating high-quality, domain-specific datasets and training models across the stack, including speech-to-text (ASR), text-to-speech (TTS), and natural language understanding (NLU), for multiple Indian languages. “In addition, we are investing heavily in model compression and fine-tuning techniques to ensure our models are scalable, efficient, and affordable for real-world enterprise use, particularly in BFSI, healthcare, and governance sectors,” said Ganesh Gopalan, co-founder and CEO, Gnani.ai.

Building India-specific LLMs comes with several critical challenges, chief among them the limited availability of high-quality, annotated data across Indian languages, which hampers effective model training and performance. Many regional languages are low-resource, making data collection and curation a significant hurdle. To address this, Gnani.ai has built a large proprietary audio dataset in India, supplemented with open data sources, and pairs it with its expertise in training large-scale models and solving complex multilingual AI problems.


Unlocking the next phase of AI

Fractal Analytics

Mumbai, New York

Expertise: Analytics & AI solutions

AI startup Fractal’s research team is working to extend the boundaries of AI towards artificial general intelligence (AGI) by developing foundational model systems. Its research involves building models across modalities, including complex multi-modal systems. It is working on advanced reasoning systems that achieve state-of-the-art performance in areas like mathematical reasoning. It is also focused on developing agentic systems capable of accomplishing complex tasks that compete on frontier benchmarks.

The startup has submitted two proposals: one to build a state-of-the-art (SOTA) large reasoning model that can compete with the best reasoning models globally, and another to build a ground-up medical multi-modal model aimed at best-in-class performance in the medical and healthcare domain. Suraj Amonkar, chief AI research and platforms officer, Fractal, says the principal challenges in building large foundational models are the lack of access to sufficiently large GPU infrastructure and the limited availability of high-quality datasets.


This article was first uploaded on May 14, 2025, at 10:13 pm.