Patients managing complex illnesses like cancer often face a daunting information challenge. They not only need to understand their condition but also learn how to effectively manage it. Close interaction with healthcare experts has been shown to improve patient learning and ultimately their disease outcome. However, this approach is resource-intensive and can take expert time away from other critical tasks. With recent advancements in Generative AI models aimed at improving the healthcare system, researchers set out to investigate whether and how generative visual question answering systems could responsibly support patient information needs in the context of radiology imaging data. Through a formative need-finding study involving discussions of chest computed tomography (CT) scans and associated radiology reports with a cardiothoracic radiologist, participants highlighted common themes such as clarifying medical terminology, locating issues mentioned in reports within scanned images, understanding disease prognosis, discussing diagnostic steps, and comparing treatment options. Thematic analysis of these interactions led to the identification of 91 content-based codes grouped into 10 broad themes that captured the essence of participant-radiologist conversations. Evaluation of two state-of-the-art generative visual language models - ChatGPT-4V and MedFlamingo - against responses provided by the radiologist revealed varying levels of response quality across different themes. While MedFlamingo tended to provide concise answers in clinical language that may be challenging for patients and caregivers to understand, ChatGPT-4V generated lengthy responses with generic descriptions without focusing on specific case details or filtering relevant information. The study's findings suggest that current generative AI systems may not adequately address patients' information needs when it comes to understanding medical scans and reports. 
High error rates were observed in both models during evaluation on real interaction questions, raising concerns about misinformation being conveyed to patients who may not possess medical expertise to discern inaccuracies. The models also struggled with relevance and frequently produced irrelevant elaborations rather than directly addressing questions asked. Overall, the research highlights a critical gap in evaluating generative AI systems for practical healthcare applications and underscores the challenges of ensuring accountable deployment in real-world scenarios where accuracy and relevance are paramount for supporting patients' informational needs effectively.
- Patients managing complex illnesses like cancer face a daunting information challenge
- Close interaction with healthcare experts improves patient learning and disease outcome
- Generative AI models are being explored to support patient information needs in radiology imaging data
- Thematic analysis identified common themes in discussions between participants and radiologists
- Evaluation of ChatGPT-4V and MedFlamingo showed varying levels of response quality
- Current generative AI systems may not adequately address patients' information needs for medical scans and reports
- High error rates were observed during evaluation, raising concerns about misinformation being conveyed to patients
- Models struggled with relevance and frequently produced irrelevant elaborations
Summary:
1. Patients with complex illnesses like cancer have a hard time understanding their condition and learning how to manage it.
2. Talking closely with healthcare experts helps patients learn and can improve how their disease turns out.
3. Smart computer programs are being tested to see whether they can help patients understand CT scans and the reports that come with them.
4. Researchers found common topics in talks between participants and a radiologist about chest CT scans.
5. Both computer programs that were tested made many mistakes and often gave answers that did not match the question.
Definitions:
- Patients: People who are sick and need medical help.
- Healthcare experts: Doctors, nurses, or other people who know a lot about keeping people healthy.
- Generative AI models: Computer programs that create new text or images based on patterns learned from large amounts of data.
- Thematic analysis: Looking for patterns or common things in conversations or discussions.
- Evaluation: Checking how well something works by testing it out.
- Error rates: How often mistakes happen when using something.
- Relevance: How closely something matches what is needed or wanted.
- Misinformation: Wrong or false information that can be confusing or harmful.
Introduction:
Patients managing complex illnesses like cancer often face a daunting information challenge. Not only do they need to understand their condition, but they also need to learn how to effectively manage it. This can be overwhelming and confusing, especially when it comes to understanding medical scans and reports.
Close interaction with healthcare experts has been shown to improve patient learning and ultimately their disease outcome. However, this approach is resource-intensive and can take expert time away from other critical tasks. With recent advancements in Generative AI models aimed at improving the healthcare system, researchers set out to investigate whether and how generative visual question answering systems could responsibly support patient information needs in the context of radiology imaging data.
Research Objective:
The main objective of this research was to explore the potential use of generative AI models in supporting patients' informational needs related to radiology imaging data. The study aimed to identify common themes that arise during discussions between patients and a cardiothoracic radiologist, evaluate two state-of-the-art generative visual language models - ChatGPT-4V and MedFlamingo - against responses provided by the radiologist, and highlight any gaps or challenges in using these models for practical healthcare applications.
Methodology:
To achieve their research objective, the researchers conducted a formative need-finding study involving discussions of chest computed tomography (CT) scans and associated radiology reports with a cardiothoracic radiologist. Participants included both patients managing complex illnesses like cancer as well as caregivers who assist them in understanding their condition.
During these discussions, participants highlighted common themes such as clarifying medical terminology, locating issues mentioned in reports within scanned images, understanding disease prognosis, discussing diagnostic steps, and comparing treatment options. Thematic analysis of these interactions led to the identification of 91 content-based codes grouped into 10 broad themes that captured the essence of participant-radiologist conversations.
Evaluation Process:
After identifying common themes through thematic analysis, the researchers evaluated two state-of-the-art generative visual language models - ChatGPT-4V and MedFlamingo - against responses provided by the radiologist. The evaluation used questions taken from the real participant-radiologist interactions, and the models were assessed on response quality, relevance, and accuracy.
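To make the evaluation step concrete, the following is a minimal sketch of how per-theme ratings could be tallied once each model answer has been judged against the radiologist's answer. It is not the researchers' procedure: the `Rating` record, its `accurate` and `relevant` fields, the `summarize` helper, and the placeholder model and theme names are all assumptions introduced for illustration, and the example ratings are made up rather than taken from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One judged model answer (hypothetical record layout)."""
    model: str      # which model produced the answer
    theme: str      # theme of the question, from the thematic analysis
    accurate: bool  # judged consistent with the radiologist's answer
    relevant: bool  # judged to address the question that was asked


def summarize(ratings):
    """Return {(model, theme): {"n", "error_rate", "irrelevance_rate"}}."""
    groups = defaultdict(list)
    for r in ratings:
        groups[(r.model, r.theme)].append(r)

    summary = {}
    for key, rows in groups.items():
        n = len(rows)
        summary[key] = {
            "n": n,
            # Share of answers judged inaccurate.
            "error_rate": sum(not r.accurate for r in rows) / n,
            # Share of answers judged off-topic.
            "irrelevance_rate": sum(not r.relevant for r in rows) / n,
        }
    return summary


# Placeholder ratings for illustration only; not data from the study.
example = [
    Rating("model_a", "terminology", accurate=True, relevant=True),
    Rating("model_a", "terminology", accurate=False, relevant=True),
    Rating("model_b", "terminology", accurate=False, relevant=False),
]

if __name__ == "__main__":
    for (model, theme), stats in sorted(summarize(example).items()):
        print(model, theme, stats)
```

Keeping accuracy and relevance as separate fields reflects the distinction drawn in the findings, where an answer can be on topic but wrong, or correct but beside the point.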
Findings:
The study's findings suggest that current generative AI systems may not adequately address patients' information needs when it comes to understanding medical scans and reports. High error rates were observed in both models during evaluation on real interaction questions, raising concerns about misinformation being conveyed to patients who may not possess medical expertise to discern inaccuracies.
Furthermore, the models struggled with relevance and frequently produced irrelevant elaborations rather than directly addressing questions asked. While MedFlamingo tended to provide concise answers in clinical language that may be challenging for patients and caregivers to understand, ChatGPT-4V generated lengthy responses with generic descriptions without focusing on specific case details or filtering relevant information.
Implications:
This research highlights a critical gap in evaluating generative AI systems for practical healthcare applications. It underscores the challenges of ensuring accountable deployment in real-world scenarios where accuracy and relevance are paramount for supporting patients' informational needs effectively.
Conclusion:
In conclusion, this research sheds light on the potential use of generative AI models in supporting patient information needs related to radiology imaging data. However, it also highlights significant gaps and challenges that need to be addressed before these models can be responsibly deployed in practical healthcare settings. Future studies should focus on improving model performance and addressing issues such as relevance and accuracy to ensure effective support for patients managing complex illnesses like cancer.