Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness

AI-generated keywords: GPLMs, ChatGPT, Prompt Knowledge, Health Advice, Accuracy

AI-generated Key Points

  • The study explores the impact of prompt knowledge on the correctness of answers generated by generative pre-trained language models (GPLMs) like ChatGPT in the context of consumers seeking health advice.
  • Prompt knowledge can override the model's encoded knowledge, leading to a decrease in answer correctness.
  • ChatGPT answers health-related questions with 80% accuracy.
  • Prompt knowledge often overturns the model's answers about treatments, lowering overall accuracy to 63%.
  • Incorporating prompt knowledge can reduce the reliability of GPLMs in providing accurate health advice.
  • Further development is needed to ensure trustworthy outcomes.

Authors: Guido Zuccon, Bevan Koopman

License: CC BY 4.0

Abstract: Generative pre-trained language models (GPLMs) like ChatGPT encode in the model's parameters knowledge the models observe during the pre-training phase. This knowledge is then used at inference to address the task specified by the user in their prompt. For example, for the question-answering task, the GPLMs leverage the knowledge and linguistic patterns learned at training to produce an answer to a user question. Aside from the knowledge encoded in the model itself, answers produced by GPLMs can also leverage knowledge provided in the prompts. For example, a GPLM can be integrated into a retrieve-then-generate paradigm where a search engine is used to retrieve documents relevant to the question; the content of the documents is then transferred to the GPLM via the prompt. In this paper we study the differences in answer correctness generated by ChatGPT when leveraging the model's knowledge alone vs. in combination with the prompt knowledge. We study this in the context of consumers seeking health advice from the model. Aside from measuring the effectiveness of ChatGPT in this context, we show that the knowledge passed in the prompt can overturn the knowledge encoded in the model and this is, in our experiments, to the detriment of answer correctness. This work has important implications for the development of more robust and transparent question-answering systems based on generative pre-trained language models.
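The two prompting conditions described in the abstract (model knowledge alone versus model knowledge plus prompt knowledge) can be illustrated with a minimal Python sketch. The function names, the prompt wording, and the example question are assumptions made for illustration; they are not the templates used in the paper, and no model is called here.

```python
from typing import Sequence


def question_only_prompt(question: str) -> str:
    """Build a prompt that relies on the model's encoded knowledge alone."""
    return f"{question}\nAnswer Yes or No, then explain briefly."


def evidence_prompt(question: str, passages: Sequence[str]) -> str:
    """Build a prompt that also passes retrieved text (prompt knowledge).

    In a retrieve-then-generate setup the passages would come from a
    search engine; here they are supplied by the caller.
    """
    evidence = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        f"{question}\n"
        f"A web search returned the following passages:\n{evidence}\n"
        "Answer Yes or No, then explain briefly."
    )


# Hypothetical question and passages, for illustration only.
question = "Does treatment X help with condition Y?"
prompt_a = question_only_prompt(question)
prompt_b = evidence_prompt(question, ["Passage A", "Passage B"])
```

Sending `prompt_a` and `prompt_b` to the same model and comparing the answers is one way to observe whether the passed-in passages change the model's response.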

Submitted to arXiv on 23 Feb. 2023


Results of the summarizing process for the arXiv paper: 2302.13793v1

This paper, titled "Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness", explores how prompt knowledge affects the correctness of answers generated by generative pre-trained language models (GPLMs) such as ChatGPT when consumers seek health advice. GPLMs encode knowledge observed during pre-training and use it at inference to address user prompts; they can also draw on knowledge supplied in the prompt itself. The study compares answer correctness when ChatGPT relies solely on its encoded knowledge with correctness when that knowledge is combined with prompt knowledge. When prompted with the question alone, ChatGPT answers health-related questions with 80% accuracy. When the prompt also includes supporting or contrary evidence, prompt knowledge often overturns the model's answers about treatments, and overall accuracy drops to 63%. The researchers conclude that prompt knowledge can override the model's encoded knowledge, to the detriment of answer correctness in their experiments. These findings bear on the development of more robust and transparent GPLM-based question-answering systems and point to the need for further work before such systems can be trusted for health advice.
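The comparison summarized above comes down to two measurements: accuracy under each prompting condition, and how often adding evidence changes the model's answer. The Python sketch below shows one way to compute both. The `Trial` structure and the toy data are assumptions for illustration only and do not reproduce the paper's dataset or results.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Trial:
    truth: str          # ground-truth label, "yes" or "no"
    question_only: str  # model answer given the question alone
    with_evidence: str  # model answer when evidence is added to the prompt


def accuracy(trials: Sequence[Trial], pick: Callable[[Trial], str]) -> float:
    """Fraction of trials where the selected answer matches the truth."""
    return sum(pick(t) == t.truth for t in trials) / len(trials)


def overturn_rate(trials: Sequence[Trial]) -> float:
    """Fraction of trials where adding evidence changed the answer."""
    return sum(t.question_only != t.with_evidence for t in trials) / len(trials)


# Toy data, invented for illustration (not results from the paper).
toy_trials = [
    Trial("yes", "yes", "no"),
    Trial("no", "no", "yes"),
    Trial("yes", "yes", "yes"),
    Trial("no", "yes", "no"),
]

acc_question_only = accuracy(toy_trials, lambda t: t.question_only)
acc_with_evidence = accuracy(toy_trials, lambda t: t.with_evidence)
flipped = overturn_rate(toy_trials)
```

Reporting the overturn rate alongside the two accuracies separates "the evidence changed the answer" from "the evidence made the answer wrong", which an accuracy difference alone cannot show.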
Created on 28 Jun. 2023

