AI no better than other methods for patients seeking medical advice, study shows


FILE PHOTO: AI (Artificial Intelligence) letters and a robot hand are placed on a computer motherboard in this illustration created on June 23, 2023. REUTERS/Dado Ruvic/Illustration/File Photo

LONDON, Feb 9 (Reuters) - Asking AI about medical symptoms does not help patients make better decisions about their health than other methods, such as a standard internet search, according to a new study published in Nature Medicine.

The authors said the study was important because people were increasingly turning to AI chatbots for advice on their health, without evidence that this was necessarily the best or safest approach.

Researchers led by the University of Oxford’s Internet Institute worked alongside a group of doctors to draw up 10 different medical scenarios, ranging from a common cold to a life-threatening haemorrhage causing bleeding on the brain.

When tested without human participants, three large language models – OpenAI's GPT-4o, Meta's Llama 3 and Cohere's Command R+ – identified the conditions in 94.9% of cases and chose the correct course of action, such as calling an ambulance or going to the doctor, in an average of 56.3% of cases. The companies did not respond to requests for comment.

'HUGE GAP' BETWEEN AI'S POTENTIAL AND ACTUAL PERFORMANCE

The researchers then recruited 1,298 participants in Britain to use either AI or their usual resources – such as an internet search, their own experience, or the National Health Service website – to investigate the symptoms and decide their next step.

When the participants did this, relevant conditions were identified in less than 34.5% of cases, and the right course of action was chosen in less than 44.2% – no better than the control group using more traditional tools.

Adam Mahdi, co-author of the paper and associate professor at Oxford, said the study showed the “huge gap” between the potential of AI and the pitfalls when it was used by people.

“The knowledge may be in those bots; however, this knowledge doesn’t always translate when interacting with humans,” he said, adding that more work was needed to identify why this was happening.

HUMANS OFTEN GIVING INCOMPLETE INFORMATION

The team studied around 30 of the interactions in detail and concluded that humans were often providing incomplete or wrong information, while the LLMs were also sometimes generating misleading or incorrect responses.

For example, one participant reporting the symptoms of a subarachnoid haemorrhage – a life-threatening condition causing bleeding on the brain – was correctly told by the AI to go to hospital after describing a stiff neck, light sensitivity and the "worst headache ever". Another described the same symptoms but only a "terrible" headache, and was told to lie down in a darkened room.

The team now plans similar studies in different countries and languages, and over time, to test whether those factors affect AI’s performance.

The study was supported by the data company Prolific, the German non-profit Dieter Schwarz Stiftung, and the UK and U.S. governments.

(Reporting by Jennifer Rigby; Additional reporting by Supantha Mukherjee; Editing by David Holmes)
