The flaw AI can’t fix (yet)

AI hallucination—the generation of convincing but fabricated information—is a structural flaw in Large Language Models (LLMs), as exposed by the 2023 New York lawyers’ case.

Nikhil Malhotra, chief innovation officer and global head – AI at Tech Mahindra

By Sandeep Budki

It all started with a bold deception. In May 2023, two lawyers from New York filed a court document that had been drafted using ChatGPT. The submission appeared impeccable, complete with citations and quotes from judges. However, the court discovered that none of the referenced cases actually existed. The AI had fabricated them – a textbook example of hallucination, that is, AI generating convincing but false information.

That courtroom embarrassment exposed the central paradox of AI. Generative models can write with confidence but cannot always separate fact from fiction. “By nature, large language models are designed to respond to everything, and that’s precisely why they hallucinate,” said Ankush Sabharwal, founder and CEO of CoRover.ai. “Hallucination is not an engineering bug that disappears with scale; it is a direct consequence of how modern generative models are built.”

Nikhil Malhotra, chief innovation officer and global head – AI at Tech Mahindra, describes the issue as structural: “These models are fundamentally predictive engines. They generate responses by statistically estimating the next most probable word or sequence. In the absence of a verified knowledge base, the model relies purely on probabilistic predictions rather than factual validation.”

As businesses and consumers integrate GenAI into daily workflows for tasks like content creation, customer support, data analysis, and decision-making, the potential impact of errors becomes more severe. No wonder startups across India are tackling AI hallucination from multiple directions. Tredence, for instance, is building retrieval-augmented generation (RAG) pipelines that ground AI responses in verified sources, an approach particularly valuable for enterprises and regulated sectors that demand factual precision. At the same time, companies such as Gupshup and Vexoo Labs are refining context-aware prompt engineering and fine-tuning frameworks, training models to recognise ambiguity and flag uncertainty instead of producing fabricated or overconfident answers.
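Tredence’s actual pipeline is proprietary, but the core RAG idea can be illustrated with a minimal, hypothetical Python sketch: look up a passage in a small verified corpus, answer only from that passage, and flag uncertainty when nothing relevant is retrieved. The toy corpus, the keyword-overlap retriever and the answer() helper below are invented purely for illustration, not drawn from any vendor’s code.

import re

# Minimal, hypothetical sketch of a retrieval-augmented generation (RAG) check:
# answers are drawn only from a small verified corpus, and the system flags
# uncertainty instead of guessing when nothing relevant is found.

VERIFIED_CORPUS = {
    "refund_policy": "Refunds are processed within 7 working days of approval.",
    "kyc_norms": "KYC verification requires a government-issued photo ID.",
}

def tokens(text):
    # Lower-case word tokens with punctuation stripped.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, min_overlap=2):
    # Return the best-matching verified passage, or None if the match is too weak.
    best_text, best_score = None, 0
    for text in corpus.values():
        score = len(tokens(query) & tokens(text))
        if score > best_score:
            best_text, best_score = text, score
    return best_text if best_score >= min_overlap else None

def answer(query):
    passage = retrieve(query, VERIFIED_CORPUS)
    if passage is None:
        # Flag uncertainty rather than fabricating an answer.
        return "I don't have a verified source for that; please check with support."
    # A production pipeline would hand the passage to an LLM as grounding context;
    # here the verified text itself is returned.
    return f"According to our verified records: {passage}"

print(answer("How long do refunds take to be processed?"))
print(answer("What is the company's stock price today?"))

The second query deliberately falls outside the corpus, so the system declines rather than inventing a figure, which is the behaviour regulated sectors demand.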

Several players are also shaping India’s broader AI ecosystem. Gnani AI and Gan AI have been selected under the IndiaAI Foundation Models initiative to develop multilingual generative systems suited for Indian contexts. Sarvam AI is building India-focused large language models with emphasis on factual accuracy and local reasoning, working with government and academic partners to establish responsible AI standards. Meanwhile, HalluDetect, a joint effort by IIT Bombay and NLSIU, is creating benchmark datasets and live evaluation tools to measure and reduce hallucination rates in legal and enterprise chatbot deployments across the country.

Malhotra threw some light on Tech Mahindra’s Project Indus, an initiative designed to create India’s own large language model grounded in local languages, dialects and cultural nuances. “By integrating deep linguistic datasets from across the country and combining them with domain-specific enterprise knowledge, we are significantly reducing hallucinations in multilingual contexts,” he revealed.

Sathish Murthy, field CTO at Rubrik for India and the Asia Pacific region, said, “When you acknowledge the hallucinations, AI can become a powerful tool. An acknowledgement helps people avoid over-relying on AI and trusting everything a tool shares. Moreover, users would trust AI more if systems disclosed a reliability score.”
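No specific product is named here, so the sketch below is purely hypothetical; it only shows what disclosing a reliability score could look like in practice: every answer carries a number the user can see, and an interface can warn whenever that number falls below a chosen threshold. The ScoredAnswer class and the scoring rule are invented for illustration.

from dataclasses import dataclass

@dataclass
class ScoredAnswer:
    text: str
    reliability: float  # 0.0 (unverified guess) to 1.0 (fully grounded in sources)

def score_answer(text, supporting_sources):
    # Crude, illustrative rule: more independent verified sources means a higher score.
    reliability = min(1.0, 0.3 + 0.35 * len(supporting_sources))
    return ScoredAnswer(text, round(reliability, 2))

ans = score_answer("Refunds are processed within 7 working days.", ["refund_policy_v3.pdf"])
print(f"{ans.text} (reliability: {ans.reliability})")
# A UI could warn users whenever reliability falls below a chosen threshold, e.g. 0.7.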

Google’s core AI research is pursuing something similar. It focuses on reasoning, grounding, and factuality, supported by extensive pre-launch testing, ecosystem-wide collaboration, and guardrails that ensure most generative results include reference links for cross-verification.

Yet total elimination remains unrealistic. According to Sabharwal, hallucination can never be completely eliminated from LLMs. “But in GenAI solutions, it can and should be minimised.” His BharatGPT platform uses smaller, domain-specific systems trained on verified client data.

This article was first uploaded on November 5, 2025, at 10:00 pm.
