
AI Chatbots Mislead In 50% Cases, Struggle With Complex Clinical Diagnosis: Study

A new study finds AI chatbots often fail at clinical reasoning, missing key diagnoses and giving misleading advice. Experts warn these tools cannot replace doctors, especially in complex medical decision-making.

  • AI chatbots often fail to provide accurate differential diagnoses in medical scenarios
  • AI models failed to produce appropriate differential diagnoses in more than 80% of cases
  • AI accuracy improves with complete patient data but lacks real-world diagnostic context

Artificial intelligence (AI) chatbots are increasingly being used for health advice, symptom checking, and even self-diagnosis. However, a growing body of research suggests that these tools may not yet be reliable for clinical decision-making. A new study by researchers at Mass General Brigham, published in JAMA Network Open, highlights a critical gap: while AI models may appear knowledgeable, they struggle with clinical reasoning, a cornerstone of medical diagnosis.

The findings show that even advanced AI systems frequently fail to generate accurate differential diagnoses, the process doctors use to distinguish between conditions with similar symptoms. In some cases, this can lead to misleading or incomplete medical advice.

With millions of people turning to chatbots for quick health answers, these findings raise urgent questions about safety, accuracy, and the role of AI in healthcare. Experts now emphasise that while AI can assist doctors, it cannot replace trained medical judgment, especially when it comes to complex or uncertain cases.

What The Study Found

The study evaluated 21 leading AI models, including some of the most advanced large language models (LLMs), across a range of clinical scenarios.

Key findings include:

  • AI systems failed to produce appropriate differential diagnoses more than 80% of the time
  • Even when they reached correct final answers, they struggled with the step-by-step reasoning process required in medicine
  • Diagnostic accuracy improved only when complete clinical data, such as lab results and imaging, was provided

Researchers concluded that current AI models are not ready for unsupervised clinical use, particularly in real-world patient care settings.


Why Clinical Reasoning Matters

At the heart of the problem is something called differential diagnosis, a systematic method doctors use to evaluate multiple possible conditions before arriving at a final diagnosis.

Unlike AI, physicians:

  • Weigh probabilities
  • Consider rare but serious conditions
  • Continuously refine decisions as new data emerges

The study notes that AI lacks this structured reasoning. As researchers explained, differential diagnosis represents the "art of medicine", requiring judgment, context, and experience, not just data retrieval.

Further analysis in JAMA Network Open shows that AI models often:

  • Jump to conclusions prematurely
  • Miss critical symptoms
  • Fail to prioritise life-threatening conditions

The Risk Of Misleading Medical Advice

One of the most concerning findings is that AI chatbots can produce confident but incorrect advice. Recent research published in journals such as The Lancet Digital Health and Nature Medicine has shown that:

  • Chatbots may accept and repeat false medical claims if phrased convincingly
  • Around one in three responses may fail to detect misinformation

In extreme cases, AI systems have suggested unsafe or scientifically unfounded remedies, highlighting the risks of relying on them without verification.

Why AI Performs Better In Some Cases

Interestingly, the study found that AI performs relatively well when:

  • All patient information is provided
  • The task is limited to identifying a final diagnosis

In such scenarios, accuracy rates ranged from 60% to over 90%, depending on the model. However, this does not reflect real-world medicine, where:

  • Information is often incomplete
  • Symptoms evolve over time
  • Decisions must be made under uncertainty

This gap explains why AI may appear accurate in controlled tests but fail in practical use.

Global Evidence: A Consistent Pattern

The findings align with broader research on AI in healthcare. According to studies published in Nature Medicine and research indexed by the National Institutes of Health:

  • AI systems excel at pattern recognition
  • But struggle with contextual reasoning and uncertainty

A 2026 evaluation published in Nature Digital Medicine also emphasised that AI tools require regulation and use-case-specific validation, as they are not designed to function as independent medical devices.


The Human Advantage In Diagnosis

Medical diagnosis is not just about identifying symptoms; it also involves:

  • Patient history
  • Emotional and behavioural cues
  • Ethical decision-making
  • Risk assessment

These are areas where human doctors still outperform AI. As experts point out, clinical reasoning involves "weighing probabilities and risks", something AI models currently do inconsistently.

Should You Trust AI For Medical Advice?

Health authorities and researchers agree on a cautious approach.

AI chatbots can be useful for:

  • General health information
  • Understanding symptoms
  • Preparing questions for doctors

But they should not be used for diagnosis or treatment decisions.

The study strongly recommends a "human-in-the-loop" model, where AI supports, but does not replace, doctors.

AI chatbots are transforming access to health information, but they are far from replacing doctors. The latest research makes one thing clear: while these tools can provide quick answers, they lack the clinical reasoning needed for safe and accurate diagnosis.

From missing differential diagnoses to confidently delivering incorrect advice, the risks are significant, especially when users rely on AI without medical supervision. As technology evolves, AI may become a valuable assistant in healthcare. But for now, when it comes to diagnosis and treatment, human expertise remains irreplaceable.

Disclaimer: This content, including advice, provides generic information only. It is in no way a substitute for a qualified medical opinion. Always consult a specialist or your own doctor for more information. NDTV does not claim responsibility for this information.
