Good news! Finally, human doctors/physicians get some serious competition, to the benefit of our health!
"... This vision of AI-assisted emergency health care may soon be reality. In a new study, researchers show that a type of AI known as a large language model (LLM) often outperformed physicians at diagnosing complex and potentially life-threatening conditions, including decreased blood flow to the heart, even in the fast-moving stages of real ER care when information is limited, they report today in Science.
In early ER cases, the model identified the correct or a very close diagnosis in about 67% of cases, compared with roughly 50% to 55% for physicians. And the technology is only getting better. ..."
From the editor's summary and abstract:
"Editor’s summary
Computational tools for medical decision support have been advancing over time, mainly by serving as resources for limited applications. Machine learning tools for autonomous interpretation of clinical cases have also been gradually improving over time.
Brodeur et al. pitted a large language model, the OpenAI o1 series, directly against hundreds of physicians at different levels of training and experience on a variety of clinical cases ranging from published patient vignettes to evaluations of brand-new emergency room patients, as well as on clinical tasks including both diagnosis and planning of clinical management ... Across a variety of scenarios and applications, the large language model outperformed both human physicians and older models, suggesting its potential utility for clinical care. ...
Abstract
More than 65 years ago, complex clinical diagnostic reasoning cases were introduced as the gold standard for the evaluation of expert medical computing systems, a standard that has held ever since.
In this study, we report the results of a physician evaluation of a large language model (LLM) on challenging clinical cases across five experiments with a baseline of hundreds of physicians.
We then report a real-world study comparing human expert and artificial intelligence (AI) second opinions in randomly selected patients in the emergency room of a major tertiary academic medical center.
In all experiments, the LLM outperformed physician baselines and displayed continued improvement from prior generations of AI clinical decision support. Our study suggests that LLMs have eclipsed most benchmarks of clinical reasoning, motivating the urgent need for prospective trials."
AI can reason like a physician—what comes next? (Perspective, open access) "Text-based AI can think like a physician; the challenge is achieving safe clinical implementation" [This Perspective has no abstract!]
Performance of a large language model on the reasoning tasks of a physician (no public access)
Superhuman performance of a large language model on the reasoning tasks of a physician (similar preprint, published 12/14/2024, open access)