
ChatGPT Health in new controversy: Sends user to doctor after misreading Apple Health data

Technology columnist Geoffrey A. Fowler decided to test ChatGPT Health by uploading nearly ten years of his Apple Watch data


A few weeks ago, OpenAI released ChatGPT Health as a new add-on feature for its commercial AI services. The tool can analyse health and fitness data from devices such as the Apple Watch and surface insights that general health-tracking apps may miss. However, a recent test suggests it may not be as reliable as it seems, especially when it comes to health-related advice.

Alarming results from AI

Technology columnist Geoffrey A. Fowler decided to test ChatGPT Health by uploading nearly ten years of his Apple Watch data. This included millions of steps and heart-rate readings. After reviewing the data, ChatGPT Health gave him an “F” grade for heart health, warning him that he could be at serious risk.

Understandably, this was cause for concern. Fowler changed his daily habits and even visited a doctor to get checked. The medical results, however, told a completely different story: his doctor confirmed that he was in excellent health and at very low risk of heart disease. In fact, he was healthy enough that additional heart tests were not even considered necessary.

How ChatGPT Health made the mistake

The main problem was how the AI interpreted the data. One key issue was VO2 max, a number that estimates aerobic fitness. While the Apple Watch provides this value, Apple clearly states that it is only an estimate and not a medical measurement; a proper VO2 max test requires lab equipment. ChatGPT Health did not take this distinction into account.
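For readers who want to see the distinction in their own data, here is a minimal sketch. It assumes the standard export.xml file produced by Apple Health's "Export All Health Data" option and the usual HKQuantityTypeIdentifierVO2Max record type; the function name is only illustrative. It pulls out the watch-derived VO2 max readings and labels each one as a wearable estimate rather than a lab result.

```python
# Minimal sketch: extract Apple Watch VO2 max estimates from an Apple Health
# export and label them clearly as estimates, not clinical measurements.
# Assumes the standard export.xml produced by "Export All Health Data".
import xml.etree.ElementTree as ET

def vo2max_estimates(export_path="export.xml"):
    """Yield (date, value) pairs for watch-derived VO2 max records."""
    tree = ET.parse(export_path)
    for record in tree.getroot().iter("Record"):
        if record.get("type") == "HKQuantityTypeIdentifierVO2Max":
            yield record.get("startDate"), float(record.get("value"))

if __name__ == "__main__":
    for date, value in vo2max_estimates():
        # Flag every reading so downstream tools (or an AI prompt) carry the caveat.
        print(f"{date}: {value:.1f} ml/kg/min (wearable estimate, not a lab test)")
```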

Another issue came from shifts in heart-rate readings when Fowler upgraded his Apple Watch. These shifts were caused by the better sensors in newer models, not by any real change in his health. The AI, however, treated them as warning signs and adjusted its assessment accordingly.
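One way to sanity-check such a jump is to group readings by the device that recorded them before treating a shift as a health signal. The sketch below assumes the same export.xml, the standard HKQuantityTypeIdentifierRestingHeartRate record type, and the device attribute Apple includes on watch-recorded records; a jump that lines up neatly with a watch upgrade is more likely a sensor change than a cardiac one.

```python
# Minimal sketch: average resting heart rate per recording device in an Apple
# Health export, so a shift that coincides with a watch upgrade can be spotted
# before it is read as a change in health.
import xml.etree.ElementTree as ET
from collections import defaultdict

def resting_hr_by_device(export_path="export.xml"):
    readings = defaultdict(list)
    tree = ET.parse(export_path)
    for record in tree.getroot().iter("Record"):
        if record.get("type") == "HKQuantityTypeIdentifierRestingHeartRate":
            device = record.get("device", "unknown")  # string includes the watch model
            readings[device].append(float(record.get("value")))
    return {dev: sum(vals) / len(vals) for dev, vals in readings.items()}

if __name__ == "__main__":
    for device, avg in resting_hr_by_device().items():
        print(f"{device}: average resting HR {avg:.1f} bpm")
```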

The tool was also inconsistent. When Fowler asked the same question again, ChatGPT Health returned different grades, ranging from an F to a B. At times it even forgot basic information such as his age and gender, which are important for any health assessment.

Are AI health insights helpful yet?

The incident underlines an important point: while AI can help track health trends, it should not be trusted for medical judgement. OpenAI itself has stated that ChatGPT Health is not meant to diagnose conditions. Even so, handing out grades and serious warnings on the basis of data that is not medical-grade can create scares that are simply unnecessary.

Additionally, several experts agree that AI health tools should be used carefully and always alongside professional medical advice, not as a replacement for it.

This article was first uploaded on January 28, 2026, at 12:54 am.