AI chatbots often fail to provide accurate differential diagnoses in medical scenarios.
Over 80% of AI models tested struggled with step-by-step clinical reasoning processes.
AI accuracy improves with complete patient data but lacks real-world diagnostic context.