For as long as concerns have existed about artificial intelligence (AI) making office jobs redundant, conversations about its limitations have run alongside them.
But a new finding may aggravate and allay your worries about AI at the same time: leading AI models and chatbots are showing signs of cognitive decline similar to what ageing does to humans.
A study published in the British Medical Journal in December 2024, titled Age Against the Machine – Susceptibility of Large Language Models to Cognitive Impairment: Cross Sectional Analysis, examined chatbots and large language models (LLMs) used for medical diagnosis through the Montreal Cognitive Assessment (MoCA) test. It evaluated ChatGPT 4, ChatGPT 4o, Claude 3.5, Gemini 1, and Gemini 1.5.
MoCA is a screening test used to detect early signs of dementia in older adults. The researchers adapted it for digital administration and used it to assess attention, memory, spatial skills, and executive function, among other abilities.
The study stated, “With the exception of ChatGPT 4o, almost all large language models subjected to the MoCA test showed signs of mild cognitive impairment and early dementia. Moreover, as in humans, age is a key determinant of cognitive decline:
‘older’ chatbots, like older patients, tend to perform worse on the MoCA test.” It went on to say, “These findings challenge the assumption that artificial intelligence will soon replace human doctors, as the cognitive impairment evident in leading chatbots may affect their reliability in medical diagnostics and undermine patients’ confidence.”
What this means
Researchers behind this study said that these findings point to the cognitive limitations of AI models: they are already being used as medical tools, yet their reliability could now come into question. This is not the first time such doubts have been raised; how reliable AI models are has long been an open question. In October last year, another study reported that OpenAI’s AI-powered transcription tool Whisper, which was already being used in many hospitals globally to transcribe patient records, was “hallucinating”. The tool was essentially “making up” things that patients had never said, including racial slurs, violent narratives, and medical treatments that doctors had never advised.
Around the same time, researchers at the University of South Carolina (USC) in the United States also put AI’s cognitive abilities to the test, evaluating AI models with IQ tests and visual problems. The team found that while both open-source and closed-source AI models struggled on the cognitive tests, the open-source models in particular also had trouble with abstract visual reasoning puzzles.
AI in medicine
But is AI actually used that closely in medicine?
Yes. Globally, AI is already being used to streamline patient records, offer personalised treatment plans, and analyse tests such as CT scans, X-rays, and MRIs.
It is also being used to increase efficiency and reduce costs in clinical trials and drug development, and it offers support in treating serious diseases such as cancer.
Analysing tests with AI can help with early detection, diagnosis, and prognosis. AI tools can also assist in understanding how a disease might progress, and in monitoring it so that the course of medication can be changed if required.
So is this study concerning?
Yes. If AI is being used in high-risk fields such as medicine, the margin of error has to be minimal. As the study noted, findings like these could make patients doubt whether AI is needed in the first place.
Interestingly, a recent study published in JAMA Network Open showed that when it comes to the use of AI in healthcare, a majority of people in the US do not trust that it will be used responsibly, or that their health-related data will be protected.
According to the study, 65.8% of adults in the US do not believe that AI can be used responsibly in healthcare, and 57.7% do not trust that the healthcare system can ensure AI does them no harm.
A Lancet study from December last year also said that while AI tools used in psychiatry have shown promise, more work is needed to ensure that the use of AI in other medical fields is transparent and standardised for better results.
The positive side, however, is that a great deal of work is going into making AI better.
According to the data analysis firm Crunchbase, funding for AI crossed the $100 billion mark in 2024.
An article recently published in the journal Nature also said, “It is clear that AI companies are laser-focused on giving their systems the whole range of cognitive abilities enjoyed by humans. Companies developing AI models have a strong incentive to maintain the idea that AGI (artificial general intelligence) is nigh, to attract interest and therefore investment.”
Not just that, the USC researchers also found that when the AI models were “prompted to think step by step through reasoning tasks,” they performed significantly better. One of the study authors noted at the time, “By guiding the models with hints, we were able to see up to 100% improvement in performance.”
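To see what that kind of guidance looks like in practice, here is a minimal, illustrative sketch, not taken from the study itself: it assumes the current OpenAI Python SDK, and the model name and sample puzzle are placeholders. The “step by step” hint is simply extra instruction text appended to the prompt.

```python
# Illustrative sketch of "step by step" prompting (chain-of-thought style).
# Assumptions: OpenAI Python SDK >= 1.0, OPENAI_API_KEY set in the environment.
# The model name and puzzle below are placeholders, not from the study.
from openai import OpenAI

client = OpenAI()

puzzle = (
    "A train leaves at 3:40 pm and arrives at 6:10 pm the same day. "
    "How long is the journey in minutes?"
)

# Plain prompt: the model is asked to answer directly.
direct = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": puzzle}],
)

# Guided prompt: the model is asked to reason step by step before answering,
# the kind of hint the researchers say improved performance.
guided = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": puzzle + "\nThink through the problem step by step, then give the final answer.",
    }],
)

print("Direct answer:", direct.choices[0].message.content)
print("Guided answer:", guided.choices[0].message.content)
```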
As the saying goes, “Where there’s a will, there’s a way.”