
AI Chatbots Mislead In 50% Cases, Struggle With Complex Clinical Diagnosis: Study

A new study finds AI chatbots often fail at clinical reasoning, missing key diagnoses and giving misleading advice. Experts warn these tools cannot replace doctors, especially in complex medical decision-making.

  • AI chatbots often fail to provide accurate differential diagnoses in medical scenarios
  • AI models failed to produce appropriate differential diagnoses in more than 80% of cases
  • AI accuracy improves with complete patient data but lacks real-world diagnostic context

Artificial intelligence (AI) chatbots are increasingly being used for health advice, symptom checking, and even self-diagnosis. However, a growing body of research suggests that these tools may not yet be reliable for clinical decision-making. A new study by researchers at Mass General Brigham, published in JAMA Network Open, highlights a critical gap: while AI models may appear knowledgeable, they struggle with clinical reasoning, a cornerstone of medical diagnosis.

The findings show that even advanced AI systems frequently fail to generate accurate differential diagnoses, the process doctors use to distinguish between conditions with similar symptoms. In some cases, this can lead to misleading or incomplete medical advice.

With millions of people turning to chatbots for quick health answers, these findings raise urgent questions about safety, accuracy, and the role of AI in healthcare. Experts now emphasise that while AI can assist doctors, it cannot replace trained medical judgment, especially when it comes to complex or uncertain cases.

What The Study Found

The study evaluated 21 leading AI models, including some of the most advanced large language models (LLMs), across a range of clinical scenarios.

Key findings include:

  • AI systems failed to produce appropriate differential diagnoses more than 80% of the time
  • Even when they reached correct final answers, they struggled with the step-by-step reasoning process required in medicine
  • Diagnostic accuracy improved only when complete clinical data, such as lab results and imaging, was provided

Researchers concluded that current AI models are not ready for unsupervised clinical use, particularly in real-world patient care settings.


Why Clinical Reasoning Matters

At the heart of the problem is something called differential diagnosis, a systematic method doctors use to evaluate multiple possible conditions before arriving at a final diagnosis.

Unlike AI, physicians:

  • Weigh probabilities
  • Consider rare but serious conditions
  • Continuously refine decisions as new data emerges

The study notes that AI lacks this structured reasoning. As researchers explained, differential diagnosis represents the "art of medicine", requiring judgment, context, and experience, not just data retrieval.

Further analysis in JAMA Network Open shows that AI models often:

  • Jump to conclusions prematurely
  • Miss critical symptoms
  • Fail to prioritise life-threatening conditions

The Risk Of Misleading Medical Advice

One of the most concerning findings is that AI chatbots can produce confident but incorrect advice. Recent research published in journals such as The Lancet Digital Health and Nature Medicine has shown that:

  • Chatbots may accept and repeat false medical claims if phrased convincingly
  • Around one in three responses may fail to detect misinformation

In extreme cases, AI systems have suggested unsafe or scientifically unfounded remedies, highlighting the risks of relying on them without verification.

Why AI Performs Better In Some Cases

Interestingly, the study found that AI performs relatively well when:

  • All patient information is provided
  • The task is limited to identifying a final diagnosis

In such scenarios, accuracy rates ranged from 60% to over 90%, depending on the model. However, this does not reflect real-world medicine, where:

  • Information is often incomplete
  • Symptoms evolve over time
  • Decisions must be made under uncertainty

This gap explains why AI may appear accurate in controlled tests but fail in practical use.

Global Evidence: A Consistent Pattern

The findings align with broader research on AI in healthcare. According to studies published in Nature Medicine and research indexed by the National Institutes of Health:

  • AI systems excel at pattern recognition
  • But struggle with contextual reasoning and uncertainty

A 2026 evaluation published in Nature Digital Medicine also emphasised that AI tools require regulation and use-case-specific validation, as they are not designed to function as independent medical devices.


The Human Advantage In Diagnosis

Medical diagnosis is not just about identifying symptoms; it also involves:

  • Patient history
  • Emotional and behavioural cues
  • Ethical decision-making
  • Risk assessment

These are areas where human doctors still outperform AI. As experts point out, clinical reasoning involves "weighing probabilities and risks", something AI models currently do inconsistently.

Should You Trust AI For Medical Advice?

Health authorities and researchers agree on a cautious approach.

AI chatbots can be useful for:

  • General health information
  • Understanding symptoms
  • Preparing questions for doctors

But they should not be used for diagnosis or treatment decisions.

The study strongly recommends a "human-in-the-loop" model, where AI supports, but does not replace, doctors.

AI chatbots are transforming access to health information, but they are far from replacing doctors. The latest research makes one thing clear: while these tools can provide quick answers, they lack the clinical reasoning needed for safe and accurate diagnosis.

From missing differential diagnoses to confidently delivering incorrect advice, the risks are significant, especially when users rely on AI without medical supervision. As technology evolves, AI may become a valuable assistant in healthcare. But for now, when it comes to diagnosis and treatment, human expertise remains irreplaceable.

Disclaimer: This content, including advice, provides generic information only. It is in no way a substitute for a qualified medical opinion. Always consult a specialist or your own doctor for more information. NDTV does not claim responsibility for this information.
