Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback

AI-generated keywords: LLM-Augmenter, External Knowledge, Automated Feedback, Hallucinations, Wiki QA

AI-generated Key Points

Challenges of applying large language models (LLMs) to real-world applications
Issues with LLMs such as hallucinations and lack of external knowledge integration
Introduction of LLM-Augmenter system to enhance LLMs
LLM-Augmenter incorporates plug-and-play modules for external knowledge integration
Iterative revision of LLM prompts using feedback generated by utility functions
Empirical validation in task-oriented dialog and open-domain question answering scenarios
Reduction of hallucinations without sacrificing response fluency and informativeness compared to ChatGPT alone
Availability of source code and models for public use
Evaluation results on Wiki QA showing top-5 answer recall of consolidated evidence (CORE) at 50.83%
Conclusion: LLM-Augmenter effectively improves large language models by incorporating external knowledge and automated feedback mechanisms

Authors: Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao

arXiv: 2302.12813v1 (cs.CL)

10 pages

License: CC BY 4.0

Abstract: Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering. However, applying LLMs to real-world, mission-critical applications remains challenging mainly due to their tendency to generate hallucinations and inability to use external knowledge. This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules. Our system makes the LLM generate responses grounded in consolidated external knowledge, e.g., stored in task-specific databases. It also iteratively revises LLM prompts to improve model responses using feedback generated by utility functions, e.g., the factuality score of an LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated on two types of mission-critical scenarios, task-oriented dialog and open-domain question answering. LLM-Augmenter significantly reduces ChatGPT's hallucinations without sacrificing the fluency and informativeness of its responses. We make the source code and models publicly available.

Submitted to arXiv on 24 Feb. 2023

Results of the summarizing process for the arXiv paper: 2302.12813v1

The paper titled "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback" addresses the challenges of applying large language models (LLMs) like ChatGPT to real-world, mission-critical applications. While LLMs can generate human-like responses for various tasks, they are prone to hallucinations and cannot readily use external knowledge. To overcome these limitations, the authors propose a system called LLM-Augmenter. This system augments a black-box LLM with a set of plug-and-play modules. These modules enable the LLM to generate responses grounded in consolidated external knowledge, for example knowledge stored in task-specific databases. Additionally, the system iteratively revises LLM prompts using feedback generated by utility functions, such as the factuality score of an LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated in two mission-critical scenarios: task-oriented dialog and open-domain question answering. The results demonstrate that LLM-Augmenter significantly reduces hallucinations compared to ChatGPT alone, without sacrificing response fluency and informativeness. The paper also states that the source code and models are publicly available. Furthermore, it provides evaluation results on Wiki QA showing that the top-5 answer recall of consolidated evidence (CORE) is 50.83%. In conclusion, this paper presents an approach that improves large language models by augmenting them with external knowledge and automated feedback mechanisms. The proposed LLM-Augmenter system addresses issues related to hallucinations while maintaining response quality.

The key points are about using big computer programs to help with language, but there are some problems. The programs sometimes make mistakes and don't know everything. A new system called LLM-Augmenter helps make the programs better by adding more knowledge from outside sources. It also uses feedback to make the programs smarter over time. The new system has been tested and it works well, making the programs better without losing their ability to give good answers. The source code and models for the new system are available for everyone to use.

Improving Large Language Models with External Knowledge and Automated Feedback

Recent advances in natural language processing (NLP) have enabled the development of large language models (LLMs), such as ChatGPT, which can generate human-like responses for various tasks. However, these LLMs are prone to hallucinations and cannot readily use external knowledge. To address these challenges, researchers at Microsoft Research and Columbia University proposed a system called LLM-Augmenter in their paper titled “Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback”.

Background

The authors note that while LLMs can generate fluent responses to user queries, they often fail to provide accurate answers due to their limited understanding of the world. This is because they rely solely on training data without any external knowledge integration or automated feedback mechanisms. As a result, these models tend to hallucinate or generate inaccurate responses when faced with unseen scenarios or complex questions.

Proposed System – LLM-Augmenter

To overcome the limitations of existing LLMs, the authors propose a system called LLM-Augmenter, which augments a black-box LLM with plug-and-play modules that ground its responses in consolidated external knowledge, for example knowledge stored in task-specific databases. Additionally, the system iteratively revises the LLM prompt using feedback generated by utility functions, such as the factuality score of a generated response. The effectiveness of this approach is evaluated empirically in two mission-critical scenarios: task-oriented dialog and open-domain question answering (QA).
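The retrieve, generate, score, and revise cycle described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`build_prompt`, `augmented_generate`), the prompt layout, the stopping threshold, and the toy retriever, LLM, and utility function are all assumptions made for illustration. The black-box LLM is modeled as any callable that maps a prompt string to a response string.

```python
from typing import Callable, List, Tuple


def build_prompt(query: str, evidence: List[str], feedback: str) -> str:
    """Assemble a prompt from the query, the evidence, and any feedback (hypothetical layout)."""
    parts = ["Evidence:"] + [f"- {e}" for e in evidence]
    if feedback:
        parts.append(f"Feedback on previous attempt: {feedback}")
    parts.append(f"Question: {query}")
    return "\n".join(parts)


def augmented_generate(
    query: str,
    retrieve: Callable[[str], List[str]],
    llm: Callable[[str], str],
    utility: Callable[[str, List[str]], Tuple[float, str]],
    threshold: float = 1.0,  # assumed stopping criterion, not taken from the paper
    max_iters: int = 3,      # assumed cap on revision rounds
) -> str:
    """Query a black-box LLM, score the response, and revise the prompt with feedback."""
    evidence = retrieve(query)
    feedback = ""
    best_response, best_score = "", float("-inf")
    for _ in range(max_iters):
        response = llm(build_prompt(query, evidence, feedback))
        score, feedback = utility(response, evidence)
        if score > best_score:
            best_response, best_score = response, score
        if score >= threshold:
            break
    return best_response


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_retrieve(query: str) -> List[str]:
    return ["The Eiffel Tower is in Paris."]


def toy_llm(prompt: str) -> str:
    # Pretends to hallucinate until feedback appears in the prompt.
    if "Feedback on previous attempt" in prompt:
        return "The Eiffel Tower is in Paris."
    return "The Eiffel Tower is in Rome."


def toy_utility(response: str, evidence: List[str]) -> Tuple[float, str]:
    # Crude factuality proxy: share of response words that occur in the evidence.
    evidence_words = set(" ".join(evidence).lower().split())
    words = response.lower().split()
    score = sum(w in evidence_words for w in words) / max(len(words), 1)
    feedback = "" if score == 1.0 else "The response is not fully supported by the evidence."
    return score, feedback


if __name__ == "__main__":
    print(augmented_generate("Where is the Eiffel Tower?", toy_retrieve, toy_llm, toy_utility))
```

Because the LLM is only ever called through a prompt-in, text-out interface, the retriever and the utility function can be swapped without touching the model, which is the sense in which the modules are plug-and-play.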

Experimental Results

The results demonstrate that, compared to ChatGPT alone, the proposed system significantly reduces hallucinations without sacrificing response fluency and informativeness for both tasks. Furthermore, evaluation results on Wiki QA show that the top-5 answer recall of consolidated evidence (CORE) is 50.83%. The source code and models used in this research are publicly available.
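For readers unfamiliar with the metric, top-k answer recall is commonly computed as the fraction of questions for which the gold answer appears in at least one of the k highest-ranked evidence passages. The sketch below follows that common definition. The summary does not state the paper's exact matching rule, so the case-insensitive substring match and the function name `top_k_answer_recall` are assumptions.

```python
from typing import List


def top_k_answer_recall(retrieved: List[List[str]], answers: List[str], k: int = 5) -> float:
    """Fraction of questions whose gold answer occurs in one of the top-k passages.

    retrieved[i] is the ranked passage list for question i; answers[i] is its gold answer.
    Matching is a case-insensitive substring test (an assumption for this sketch).
    """
    if len(retrieved) != len(answers):
        raise ValueError("retrieved and answers must have the same length")
    if not answers:
        return 0.0
    hits = 0
    for passages, answer in zip(retrieved, answers):
        if any(answer.lower() in passage.lower() for passage in passages[:k]):
            hits += 1
    return hits / len(answers)


if __name__ == "__main__":
    # Made-up toy data, unrelated to the paper's evaluation set.
    retrieved = [
        ["Berlin is in Germany.", "Paris is the capital of France."],
        ["Mount Everest is in Nepal."],
    ]
    answers = ["Paris", "K2"]
    print(top_k_answer_recall(retrieved, answers, k=5))
```

Under this definition, a reported value of 50.83% would mean that for roughly half of the questions the answer is present somewhere in the top five evidence items.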

Conclusion

In conclusion, this paper presents an approach for improving large language models by augmenting them, through plug-and-play modules, with external knowledge sources and automated feedback mechanisms. The proposed system reduces hallucinations while maintaining response quality across mission-critical applications such as task-oriented dialog and open-domain QA.

Created on 17 Aug. 2023
