ChatGPT outperforms docs who *use* it: NYTimes. (Small study.)
Doctors *overruled* the AI if they disagreed with it. Will they overrule *patients* who use it?
An article last Sunday in the New York Times reported on a small study of doctors who used ChatGPT in a test of diagnostic reasoning. To the authors’ surprise, GPT-4 alone did better than doctors using the same tool.
GPT got 90% right. Docs using GPT got 76% right … only a smidge better than docs without GPT (74%).
What’s going on?
Apparently, two things:
Many of the docs weren’t very good at prompting!
We obviously need to fix this, or else savvy patients will snort at them … and trust in medicine will erode.
The other docs were good at prompting but ignored the AI when it disagreed with their own ideas.
This might be the well-known problem of diagnostic anchoring (clinging to an initial diagnosis and discounting evidence that contradicts it), or it might be lack of trust in this unknown new tool, or who knows what.
Savvy readers will know that a study this small (n=50) needs to be replicated with a much bigger population. Happily, that’s already happening: study co-author Adam Rodman, of Beth Israel Deaconess (my hospital), tells me a much larger study is underway, and early indications are that the results are similar. He also cited a similar small study published a year ago by Google.
The #PatientsUseAI angle: What if the “GPT-only” diagnosis came from a patient using the AI? If the doctor snorts and rejects the AI’s advice (the patient’s contribution), we’ll get more diagnostic errors … and trust in doctors will further erode.
We’re at the tip of the cultural iceberg.
Make no mistake, we are at a big moment in history. Our decisions will have long-lasting consequences, and we have a great opportunity to think ahead to a future that has never before been feasible.
It parallels another turning point: the launch of PubMed in the 1990s, which for the first time gave the public access to the medical literature on the Web. Suddenly, savvy patients could pursue questions autonomously, without consuming doctor time. (This was pointed out to me by the pioneering Gilles Frydman, founder (1996) of the ACOR patient community that helped save my life in 2007.)
But information is only part of empowerment: as I said in the Times last summer, “Google gives you access to information. A.I. gives access to clinical thought.” Generative AI is a huge boost to the potential contribution of patients to the clinical relationship … a contribution to care itself, to “outcomes,” as the medical nerds call it.
To make the most of this moment we need two things that don’t exist yet:
Easy-to-understand training for the public on how to do it well and not be clueless.
Training for doctors on how to partner with (not reject) patients who know what they’re doing.
Let’s get to work on both. For my part, this blog occasionally provides how-to tips for patients - but I could use an organization or some co-workers to help! Ping me on LinkedIn.
Dave: It turned out to be yet another hoax. Unfortunately, The New York Times ran with it, and the story spiraled out of control: https://sergeiai.substack.com/p/the-ai-outperforms-doctors-claim
I actually think the JAMA study was quite good, and its authors are very clear that it did not mean AI should be used alone. The NYT and others picked up on that finding.
And it is a great one. The core finding is that the AI made a huge difference, yet the doctors (fairly young and highly educated) were unable to incorporate that into their practice.
I wrote about this here too, and I think it is a key finding: https://www.linkedin.com/feed/update/urn:li:activity:7264598665728516096/