Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback

AI-generated keywords: LLM-Augmenter, External Knowledge, Automated Feedback, Hallucinations, Wiki QA

AI-generated Key Points

  • Challenges of applying large language models (LLMs) to real-world applications
  • Issues with LLMs such as hallucinations and lack of external knowledge integration
  • Introduction of LLM-Augmenter system to enhance LLMs
  • LLM-Augmenter incorporates plug-and-play modules for external knowledge integration
  • Iterative revision of LLM prompts using feedback generated by utility functions
  • Empirical validation in task-oriented dialog and open-domain question answering scenarios
  • Reduction of hallucinations without sacrificing response fluency and informativeness compared to ChatGPT alone
  • Availability of source code and models for public use
  • Evaluation results on Wiki QA showing top-5 answer recall of consolidated evidence (CORE) at 50.83%
  • Conclusion: LLM-Augmenter effectively improves large language models by incorporating external knowledge and automated feedback mechanisms

Authors: Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao

10 pages
License: CC BY 4.0

Abstract: Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering. However, applying LLMs to real-world, mission-critical applications remains challenging mainly due to their tendency to generate hallucinations and inability to use external knowledge. This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules. Our system makes the LLM generate responses grounded in consolidated external knowledge, e.g., stored in task-specific databases. It also iteratively revises LLM prompts to improve model responses using feedback generated by utility functions, e.g., the factuality score of an LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated on two types of mission-critical scenarios, task-oriented dialog and open-domain question answering. LLM-Augmenter significantly reduces ChatGPT's hallucinations without sacrificing the fluency and informativeness of its responses. We make the source code and models publicly available.
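
The loop the abstract describes (ground the prompt in retrieved evidence, score the response with a utility function, revise the prompt with the feedback, and try again) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name augmented_respond, its parameters, the prompt layout, the threshold, and the toy stand-ins for the retriever, the LLM, and the utility function are all assumptions made for this example.

```python
from typing import Callable, List, Tuple


def augmented_respond(
    query: str,
    retrieve: Callable[[str], List[str]],
    llm: Callable[[str], str],
    utility: Callable[[str, List[str]], Tuple[float, str]],
    threshold: float = 0.5,   # assumed cutoff, not taken from the paper
    max_rounds: int = 3,      # assumed revision budget, not taken from the paper
) -> str:
    """Generate a response grounded in evidence, revising the prompt on low utility."""
    evidence = retrieve(query)
    prompt = "Evidence:\n" + "\n".join(evidence) + "\n\nQuestion: " + query
    response = llm(prompt)
    for _ in range(max_rounds):
        score, feedback = utility(response, evidence)
        if score >= threshold:
            break
        # Fold the feedback into the prompt and query the black-box LLM again.
        prompt += "\n\nPrevious answer: " + response + "\nFeedback: " + feedback
        response = llm(prompt)
    return response


# Hypothetical stand-ins so the sketch runs without any external service.
def toy_retrieve(query: str) -> List[str]:
    return ["The Eiffel Tower is in Paris."]


def toy_llm(prompt: str) -> str:
    # Answers wrongly at first, then correctly once feedback is in the prompt.
    return "Paris" if "Feedback:" in prompt else "London"


def toy_utility(response: str, evidence: List[str]) -> Tuple[float, str]:
    # Crude factuality proxy: is the response a substring of any evidence passage?
    supported = any(response in passage for passage in evidence)
    if supported:
        return 1.0, ""
    return 0.0, "The answer is not supported by the evidence."


final_answer = augmented_respond(
    "Where is the Eiffel Tower?", toy_retrieve, toy_llm, toy_utility
)
```

In this toy run the first response fails the utility check, the feedback is appended to the prompt, and the second response passes. The LLM is only ever called through its text interface, which is what lets the surrounding modules stay plug-and-play around a black-box model.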

Submitted to arXiv on 24 Feb. 2023

Results of the summarizing process for the arXiv paper: 2302.12813v1

The paper "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback" addresses the challenges of applying large language models (LLMs) such as ChatGPT to real-world, mission-critical applications. Although LLMs can generate human-like responses for a variety of tasks, they tend to hallucinate and cannot draw on external knowledge.

To overcome these limitations, the authors propose a system called LLM-Augmenter, which augments a black-box LLM with a set of plug-and-play modules. These modules make the LLM generate responses grounded in consolidated external knowledge, for example knowledge stored in task-specific databases. The system also iteratively revises LLM prompts using feedback generated by utility functions, such as the factuality score of an LLM-generated response.

LLM-Augmenter is empirically validated in two mission-critical scenarios: task-oriented dialog and open-domain question answering. The results show that it significantly reduces hallucinations compared to ChatGPT alone, without sacrificing the fluency and informativeness of the responses. Evaluation results on Wiki QA show a top-5 answer recall of consolidated evidence (CORE) of 50.83%. The authors make the source code and models publicly available.
Created on 17 Aug. 2023
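
The summary quotes a top-5 answer recall figure without defining it. One common reading of "top-k answer recall" is the fraction of questions for which the gold answer string appears in at least one of the top k evidence passages. The sketch below computes that quantity under this assumption; the function name top_k_answer_recall, the case-insensitive substring match, and the toy data are illustrative and are not taken from the paper.

```python
from typing import List


def top_k_answer_recall(
    evidence_lists: List[List[str]], answers: List[str], k: int = 5
) -> float:
    """Fraction of questions whose gold answer occurs in any of the top-k passages."""
    if not answers:
        return 0.0
    hits = 0
    for passages, answer in zip(evidence_lists, answers):
        # Assumed matching rule: case-insensitive substring match.
        if any(answer.lower() in passage.lower() for passage in passages[:k]):
            hits += 1
    return hits / len(answers)


# Toy data: the first answer is found in its evidence, the second is not.
toy_evidence = [
    ["Paris is the capital of France."],
    ["Berlin is in Germany."],
]
toy_answers = ["Paris", "Rome"]
toy_recall = top_k_answer_recall(toy_evidence, toy_answers, k=5)
```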
