ChatGPT's recommendations for guidelines-based cancer treatments prove limited

Correct and incorrect recommendations were intermingled in one-third of ChatGPT’s responses on guidelines-based cancer treatment, making errors more difficult to detect.

The Internet is a powerful tool for self-education on medical topics for many patients.

With ChatGPT now at patients’ fingertips, researchers from Brigham and Women’s Hospital, a founding member of the Mass General Brigham healthcare system, assessed how consistently the artificial intelligence chatbot provides recommendations for cancer treatment that align with National Comprehensive Cancer Network (NCCN) guidelines.

Their findings, published in JAMA Oncology, show that ChatGPT 3.5 provided an inappropriate (“non-concordant”) recommendation in approximately one-third of cases, highlighting the need for awareness of the technology’s limitations.

ChatGPT on a smartphone – artistic visualization. Image credit: Levart Photographer via Unsplash, free license

“Patients should feel empowered to educate themselves about their medical conditions, but they should always discuss with a clinician, and resources on the Internet should not be consulted in isolation,” said corresponding author Danielle Bitterman, MD, of the Department of Radiation Oncology at Brigham and Women’s Hospital and the Artificial Intelligence in Medicine (AIM) Program of Mass General Brigham.

“ChatGPT responses can sound a lot like a human and can be quite convincing. But, when it comes to clinical decision-making, there are so many subtleties for every patient’s unique situation. A right answer can be very nuanced, and not necessarily something ChatGPT or another large language model can provide.”

The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care.

Artificial intelligence, ChatGPT – artistic concept image. Photo credit: Pexels / Cottonbro Studio, free license

Mass General Brigham, as one of the nation’s top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.

Although medical decision-making can be influenced by many factors, Bitterman and colleagues chose to evaluate the extent to which ChatGPT’s recommendations aligned with the NCCN guidelines, which are used by physicians at institutions across the country.

They focused on the three most common cancers (breast, prostate and lung cancer) and prompted ChatGPT to provide a treatment approach for each cancer based on the severity of the disease.

In total, the researchers included 26 unique diagnosis descriptions and used four slightly different prompts to ask ChatGPT for a treatment approach for each, generating a total of 104 prompts.
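The prompt count follows from pairing every diagnosis description with every prompt wording. The short Python sketch below illustrates that arithmetic only; the names `diagnoses` and `templates` and their placeholder values are hypothetical, since the article does not list the actual descriptions or wordings the researchers used.

```python
from itertools import product

# Hypothetical placeholders: the article gives only the counts (26 and 4),
# not the actual diagnosis descriptions or prompt wordings.
diagnoses = [f"diagnosis_{i}" for i in range(1, 27)]  # 26 descriptions
templates = [f"template_{j}" for j in range(1, 5)]    # 4 prompt variants

# Pair every description with every prompt variant.
prompts = list(product(diagnoses, templates))

print(len(prompts))  # 26 * 4 = 104
```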

Medical monitoring machine in a hospital. Image credit: Stephen Andrews via Unsplash

Nearly all responses (98 percent) included at least one treatment approach that agreed with NCCN guidelines. However, the researchers found that 34 percent of these responses also included one or more non-concordant recommendations, which were sometimes difficult to detect amidst otherwise sound guidance.

A non-concordant treatment recommendation was defined as one that was only partially correct; for example, for a locally advanced breast cancer, a recommendation of surgery alone, without mention of another therapy modality.

Notably, complete agreement in scoring occurred in only 62 percent of cases, underscoring both the complexity of the NCCN guidelines themselves and the extent to which ChatGPT’s output could be vague or difficult to interpret.

In 12.5 percent of cases, ChatGPT produced “hallucinations,” or a treatment recommendation entirely absent from NCCN guidelines. These included recommendations of novel therapies, or curative therapies for non-curative cancers.

The authors emphasized that this form of misinformation can incorrectly set patients’ expectations about treatment and potentially impact the clinician-patient relationship.
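For a sense of scale, the reported percentages can be converted into approximate numbers of responses. The Python sketch below is a back-of-the-envelope calculation, not figures quoted from the study. It assumes that 98 percent and 12.5 percent are shares of all 104 responses and that 34 percent is a share of the responses containing at least one concordant recommendation; rounding to whole responses is also an assumption.

```python
TOTAL_RESPONSES = 104  # 26 diagnosis descriptions x 4 prompt variants

# Assumption: 98% and 12.5% are shares of all 104 responses, while 34% is a
# share of the responses with at least one guideline-concordant recommendation.
concordant = round(0.98 * TOTAL_RESPONSES)     # at least one concordant approach
mixed = round(0.34 * concordant)               # also had a non-concordant one
hallucinated = round(0.125 * TOTAL_RESPONSES)  # recommendation absent from NCCN

print(concordant, mixed, hallucinated)
```

Under these assumptions, roughly 102 responses contained a concordant recommendation, about 35 of those also contained a non-concordant one, and about 13 responses contained a hallucinated treatment.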

Going forward, the researchers are exploring how well both patients and clinicians can distinguish medical advice written by a clinician from advice written by a large language model (LLM) like ChatGPT. They are also prompting ChatGPT with more detailed clinical cases to further evaluate its clinical knowledge.

The authors used GPT-3.5-turbo-0301, one of the largest models available at the time they conducted the study and the model class that is currently used in the open-access version of ChatGPT (a newer version, GPT-4, is only available with the paid subscription).

They also used the 2021 NCCN guidelines, because GPT-3.5-turbo-0301 was developed using data up to September 2021. While results may vary if other LLMs and/or clinical guidelines are used, the researchers emphasize that many LLMs are similar in the way they are built and the limitations they possess.

“It is an open research question as to the extent LLMs provide consistent logical responses as oftentimes ‘hallucinations’ are observed,” said first author Shan Chen, MS, of the AIM Program.

“Users are likely to seek answers from the LLMs to educate themselves on health-related topics—similarly to how Google searches have been used. At the same time, we need to raise awareness that LLMs are not the equivalent of trained medical professionals.”

Source: BWH
