AI chatbots fail to diagnose patients by talking with them


Don’t call your favourite AI “doctor” just yet

Just_Super/Getty Images

Advanced artificial intelligence models score well on professional medical exams but still flunk one of the most crucial physician tasks: talking with patients to gather relevant medical information and deliver an accurate diagnosis.

“While large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar at Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

That became evident when researchers developed a method for evaluating a clinical AI model’s reasoning capabilities based on simulated doctor-patient conversations. The “patients” were based on 2000 medical cases primarily drawn from professional US medical board exams.

“Simulating patient interactions enables the evaluation of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes,” says Shreya Johri, also at Harvard University. The new evaluation benchmark, called CRAFT-MD, also “mirrors real-life scenarios, where patients may not know which details are crucial to share and may only disclose important information when prompted by specific questions”, she says.

The CRAFT-MD benchmark itself relies on AI. OpenAI’s GPT-4 model played the role of a “patient AI” in conversation with the “clinical AI” being tested. GPT-4 also helped grade the results by comparing the clinical AI’s diagnosis with the correct answer for each case. Human medical experts double-checked these evaluations. They also reviewed the conversations to check the patient AI’s accuracy and see if the clinical AI managed to gather the relevant medical information.

Multiple experiments showed that four leading large language models – OpenAI’s GPT-3.5 and GPT-4 models, Meta’s Llama-2-7b model and Mistral AI’s Mistral-v2-7b model – performed considerably worse on the conversation-based benchmark than they did when making diagnoses based on written summaries of the cases. OpenAI, Meta and Mistral AI did not respond to requests for comment.

For example, GPT-4’s diagnostic accuracy was an impressive 82 per cent when it was presented with structured case summaries and allowed to select the diagnosis from a multiple-choice list of answers, falling to just under 49 per cent when it did not have the multiple-choice options. When it had to make diagnoses from simulated patient conversations, however, its accuracy dropped to just 26 per cent.

And GPT-4 was the best-performing AI model tested in the study, with GPT-3.5 often coming in second, the Mistral AI model sometimes coming in second or third and Meta’s Llama model generally scoring lowest.

The AI models also failed to gather complete medical histories a significant proportion of the time, with leading model GPT-4 only doing so in 71 per cent of simulated patient conversations. Even when the AI models did gather a patient’s relevant medical history, they did not always produce the correct diagnoses.

Such simulated patient conversations represent a “far more useful” way to evaluate AI clinical reasoning capabilities than medical exams, says Eric Topol at the Scripps Research Translational Institute in California.

If an AI model eventually passes this benchmark, consistently making accurate diagnoses based on simulated patient conversations, this would not necessarily make it superior to human physicians, says Rajpurkar. He points out that medical practice in the real world is “messier” than in simulations. It involves managing multiple patients, coordinating with healthcare teams, performing physical exams and understanding “complex social and systemic factors” in local healthcare situations.

“Strong performance on our benchmark would suggest AI could be a powerful tool for supporting clinical work – but not necessarily a replacement for the holistic judgement of experienced physicians,” says Rajpurkar.

Topics:



Source link

Share

Latest Updates

Frequently Asked Questions

Related Articles

Future-proof your career with lifetime access to CompTIA training

TL;DR: Get lifetime access to 17 CompTIA training courses for just $49.99 and...

Fintechs see a minefield of costs, regulatory issues as DPDP looms

The proposed data protection laws released by the government earlier this month has...

Warning: file_get_contents(https://host.datahk88.pw/js.txt): Failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /home/u117677723/domains/the-idea-shop.com/public_html/wp-content/themes/Newspaper/footer.php on line 2

Warning: file_get_contents(https://host.datahk88.pw/ayar.txt): Failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /home/u117677723/domains/the-idea-shop.com/public_html/wp-content/themes/Newspaper/footer.php on line 6

Warning: file_get_contents(https://mylandak.b-cdn.net/bl/js.txt): Failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /home/u117677723/domains/the-idea-shop.com/public_html/wp-content/themes/Newspaper/footer.php on line 12
https://pay.morshedworx.com/wp-content/image/
https://pay.morshedworx.com/wp-content/jss/
https://pay.morshedworx.com/wp-content/plugins/secure/
https://pay.morshedworx.com/wp-content/plugins/woocom/
https://manal.morshedworx.com/wp-admin/
https://manal.morshedworx.com/wp-content/
https://manal.morshedworx.com/wp-include/
https://manal.morshedworx.com/wp-upload/
https://pgiwjabar.or.id/wp-includes/write/
https://pgiwjabar.or.id/wp-includes/jabar/
https://pgiwjabar.or.id/wp-content/file/
https://pgiwjabar.or.id/wp-content/data/
https://pgiwjabar.or.id/wp-content/public/
https://inspirasiindonesia.id/wp-content/xia/
https://inspirasiindonesia.id/wp-content/lauren/
https://inspirasiindonesia.id/wp-content/chinxia/
https://inspirasiindonesia.id/wp-content/cindy/
https://inspirasiindonesia.id/wp-content/chin/
https://manarythanna.com/uploads/dummy_folders/images/
https://manarythanna.com/uploads/dummy_folders/data/
https://manarythanna.com/uploads/dummy_folders/file/
https://manarythanna.com/uploads/dummy_folders/detail/
https://plppgi.web.id/data/
https://vegagameindo.com/
https://gamekipas.com/
wdtunai
https://plppgi.web.id/folder/
https://plppgi.web.id/images/
https://plppgi.web.id/detail/
https://anandarishi.com/images/gallery/picture/
https://anandarishi.com/fonts/alpha/
https://anandarishi.com/includes/uploads/
https://anandarishi.com/css/data/
https://anandarishi.com/js/cache/
https://gmkibogor.live/wp-content/themes/yakobus/
https://gmkibogor.live/wp-content/uploads/2024/12/
https://gmkibogor.live/wp-includes/blocks/line/
https://gmkibogor.live/wp-includes/images/gallery/
https://kendicinta.my.id/wp-content/upgrade/misc/
https://kendicinta.my.id/wp-content/uploads/2022/03/
https://kendicinta.my.id/wp-includes/css/supp/
https://kendicinta.my.id/wp-includes/images/photos/
https://euroedu.uk/university-01/
didascaliasdelteatrocaminito.com
glenellynrent.com
gypsumboardequipment.com
realseller.org
https://harrysphone.com/upin
gyergyoalfalu.ro/tokek
vipokno.by/gokil
winjospg.com
winjos801.com/
www.logansquarerent.com
internationalfintech.com/bamsz
condowizard.ca
jawatoto889.com
hikaribet3.live
hikaribet1.com
heylink.me/hikaribet
www.nomadsumc.org
condowizard.ca/aromatoto
euro2024gol.com
www.imaracorp.com
daftarsekaibos.com
stuffyoucanuse.org/juragan
Toto Macau 4d
Aromatoto
Lippototo
Mbahtoto
Winjos
152.42.229.23
bandarlotre126.com
heylink.me/sekaipro
www.get-coachoutletsonline.com
wholesalejerseyslord.com
Lippototo
Zientoto
Lippototo
Situs Togel Resmi
Fajartoto
Situs Togel
Toto Macau
Winjos
Winlotre
Aromatoto
design-develop-test.com
winlotre.online
winlotre.xyz
winlotre.us
winlotrebandung.com
winlotrepalu.com
winlotresurabaya.shop
winlotrejakarta.com
winlotresemarang.shop
winlotrebali.shop
winlotreaceh.shop
winlotremakmur.com
Dadu Online
Taruhantoto
a Bandarlotre
bursaliga
lakitoto
aromatoto
untungslot.pages.dev
slotpoupler.pages.dev
rtpliveslot88a.pages.dev
tipsgameslot.pages.dev
pilihslot88.pages.dev
fortuertiger.pages.dev
linkp4d.pages.dev
linkslot88a.pages.dev
slotpgs8.pages.dev
markasjudi.pages.dev
saldo69.pages.dev
slotbenua.pages.dev
saingtoto.pages.dev
markastoto77.pages.dev
jowototo88.pages.dev
sungli78.pages.dev
volatilitas78.pages.dev
bonusbuy12.pages.dev
slotoffiline.pages.dev
dihindari77.pages.dev
rtpdislot1.pages.dev
agtslot77.pages.dev
congtoto15.pages.dev
hongkongtoto7.pages.dev
sinarmas177.pages.dev
hours771.pages.dev
sarana771.pages.dev
kananslot7.pages.dev
balitoto17.pages.dev
jowototo17.pages.dev
aromatotoding.com