Babylon Healthcare Services Ltd. said its artificial intelligence software, in tests, can assess common conditions more accurately than human doctors. (Reuters)

Babylon Healthcare Services Ltd., the fast-growing mobile medical consultation service, said its artificial intelligence software, in tests, can assess common conditions more accurately than human doctors. London-based Babylon’s AI correctly answered 81 percent of diagnostic questions designed to mimic those that trainee doctors must answer in the Royal College of General Practitioners’ exam, which they must pass to qualify as a GP in the U.K. The exam is graded on a curve, but over the past five years, the average score trainees needed to pass was 72 percent.

Babylon demonstrated this technology publicly for the first time in a live test at an event at London’s Royal College of Physicians on Wednesday. The company said it would publish the results of its tests online. It has made the AI tool available for free through its app and website in some parts of the U.K., including London, as well as in Rwanda. Because the technology has not been approved by regulators, Babylon calls the software’s answers “health information,” not “diagnoses.”

Babylon’s announcement shows the rapid progress technology companies are making toward creating AI-enabled software that can assist or, perhaps in the future, replace doctors. In addition to Babylon, others racing to create general diagnostic software include Ada, a startup with offices in Berlin and London that has launched a symptom-checker app similar to Babylon’s; HealthTap, in Palo Alto, California; Your.MD, a health information app created by British medical publisher BMJ; and International Business Machines Corp.

So far, AI companies’ efforts to change healthcare have not always been smooth. IBM’s Watson Health business has been criticized for falling short of promises that its system could help doctors select the best available cancer treatments for a given patient. Meanwhile, British regulators faulted a London hospital for transferring millions of patient records to Alphabet Inc.’s DeepMind.

There are also concerns about exactly how tools like Babylon’s will be used. “There is already pushback from patients that they are not getting enough face time with the doctor,” said Michael Nash, a medical researcher and scientific advisor to Analytics Venture Lab.

Ali Parsa, Babylon’s founder and chief executive officer, took a different view. “There’s no way to make health care accessible and affordable if we have to send people to doctors, and doctors take as long as they currently do with each patient,” he said.

But the CEO agreed that in an ideal situation, and in most developed countries, the best results would likely occur when doctors used the Babylon tool in conjunction with their own skills and intuition. “They can improve each other,” he said.

Parsa, a former physicist and investment banker, had previously founded Circle Holdings, a large partnership of private doctors that also sought to manage British hospitals on behalf of the government. It went public in 2011, but struggled to run hospitals profitably and was taken private by investment firm Toscafund in 2017.

Babylon said its existing service, which lets people consult with a doctor over their mobile phone, has 2 million users in Rwanda, a country where there are 0.064 doctors per 1,000 people, according to World Bank data, compared with 2.8 in the U.K. and 2.3 in the U.S.

Babylon also pitted its software against a group of seven experienced primary care doctors on a set of 100 hypothetical cases developed by primary care experts at the Royal College of Physicians and the health systems at Stanford University and Yale University. Babylon’s software made the correct diagnosis in 80 percent of cases, while the human doctors’ accuracy ranged from 64 percent to 94 percent, it said.

When the software was presented with a smaller set of cases that represented the conditions most commonly seen by GPs, its accuracy was 98 percent, the company said.

Martin Marshall, vice chairman of the Royal College of General Practitioners, which administers the exam Babylon’s AI sought to benchmark itself against, said that software should never be compared to the abilities of human doctors.

“No app or algorithm will ever be able to do what a GP does,” Marshall said in a statement Wednesday. “Much of what a GP does is based on a trusting relationship between a patient and a doctor and research has shown that GPs have a ‘gut feeling’ when they just know something is wrong with a patient.”

He said that Babylon’s “GP at Hand” service, which allows patients in the U.K. to use Babylon as their primary care provider, “risks undermining and damaging traditional general practice services.” He said the Royal College of General Practitioners had criticized Babylon for “cherry-picking” the healthier patients, leaving those with more complex needs to fall back on traditional doctors. As a result, the group did not endorse the way Babylon’s app is currently being used.

Parsa said Babylon isn’t about replacing doctors, who have a vital role to play in patient care. Still, he was disappointed that the Royal College of General Practitioners commented without reading Babylon’s research. Marshall’s response was “driven by national little politics and not science,” Parsa said.