ChatGPT Health missed emergency care in over half of cases: Study

A study published in Nature Medicine found that ChatGPT Health failed to recommend immediate medical care in more than half of emergency-level cases. Should you be concerned?

Published By: Divya | Published: Mar 05, 2026, 03:56 PM (IST)

Whether we accept it or not, AI is becoming part of our day-to-day conversations, including those about health. Whether it is checking symptoms or getting quick advice, AI is now often top of mind even before speaking to a doctor. To address this, ChatGPT even introduced a Health mode recently. However, a new study suggests that relying too heavily on AI for medical guidance could sometimes be risky.

A recent research paper published in Nature Medicine has raised concerns about the triage ability of ChatGPT Health, a specialised health-focused AI assistant. According to the study, the tool failed to recommend immediate medical care in more than half of simulated emergency scenarios.

What did the study observe?

Researchers from the Icahn School of Medicine at Mount Sinai tested how the AI system responds to different medical situations. To gauge how reliable the system really is, they built a set of 60 patient scenarios. Some were minor issues like everyday illnesses, while others involved situations that doctors would normally treat as medical emergencies.

Before showing these cases to the AI, the researchers asked three independent doctors to review them using standard clinical guidelines. Their job was simple: decide what level of care each patient would realistically need.

Once that was done, the same scenarios were fed to ChatGPT Health. The researchers didn’t just test it once either. They ran the cases multiple times, changing details such as gender, adding lab reports, or including extra symptoms. In total, the team analysed close to 1,000 responses from the chatbot to see how its recommendations changed.

Where the AI struggled

The findings suggest that the system had the most trouble with emergency cases. In about 52 percent of the cases that doctors classified as emergencies, the chatbot recommended a less urgent response. Instead of advising an immediate trip to the emergency room, the AI often suggested that the patient could wait 24 to 48 hours and see a doctor later.

Some of these cases involved serious conditions such as diabetic ketoacidosis or symptoms pointing toward respiratory failure, situations where doctors would typically recommend getting help right away.

At the same time, the AI sometimes made the opposite mistake. In around 65 percent of non-serious scenarios, it suggested seeing a doctor even though the symptoms could likely be managed at home.