A Survey on Retrieval-Augmented Text Generation

AI-generated keywords: Retrieval-Augmented Text Generation, NLP, Dialogue Response Generation, Text Summarization, Paraphrase Generation

AI-generated Key Points

  • Retrieval-augmented text generation has gained significant attention in computational linguistics
  • It offers advantages over conventional generation models and has achieved state-of-the-art performance in various NLP tasks
  • The authors aim to conduct a comprehensive survey on retrieval-augmented text generation
  • The survey highlights the generic paradigm of retrieval-augmented generation
  • Notable approaches for dialogue response generation, machine translation, and other tasks are reviewed
  • RETRO, a large pre-trained language model enhanced with retrieved documents, shows comparable performance to GPT-3 with fewer parameters
  • Adaptive decoding frameworks are proposed for text summarization using retrieval-based techniques
  • Paraphrase generation utilizes retrieval-based frameworks to generate paraphrased sentences based on similar sentences retrieved from a corpus
  • Sentential exemplars are used as syntax templates to control linguistic syntax in generated text
  • Retrieval-based frameworks are employed for text style transfer tasks by retrieving similar texts and editing them to derive the output
  • Incorporating retrieval information from multiple sources improves model performance in style transfer tasks
  • Retrieval-augmented generation is adapted for data-to-text generation tasks by retrieving candidate texts based on source data
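
The generic paradigm named in the key points (retrieve relevant text, then condition generation on it) can be illustrated with a minimal Python sketch. This is not the survey's formulation: the function names, the Jaccard token-overlap metric, and the `[SEP]` separator are all assumptions chosen for illustration, and the generator itself is left out.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings (a stand-in for any retrieval metric)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def retrieve(query: str, memory: List[str], k: int = 2) -> List[str]:
    """Return the k memory entries most similar to the query."""
    return sorted(memory, key=lambda m: jaccard(query, m), reverse=True)[:k]


def build_augmented_input(query: str, retrieved: List[str]) -> str:
    """Concatenate the query with the retrieved texts.

    Hypothetical input format: a real system would pass this string
    (or separate encodings of it) to a trained generation model.
    """
    return " [SEP] ".join([query] + retrieved)
```

In practice the retrieval metric is often a sparse or dense vector similarity, and the retrieved texts may be integrated through attention or other mechanisms rather than plain concatenation; the sketch only shows the overall flow.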

Authors: Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu

all authors contributed equally
License: CC BY 4.0

Abstract: Recently, retrieval-augmented text generation attracted increasing attention of the computational linguistics community. Compared with conventional generation models, retrieval-augmented text generation has remarkable advantages and particularly has achieved state-of-the-art performance in many NLP tasks. This paper aims to conduct a survey about retrieval-augmented text generation. It firstly highlights the generic paradigm of retrieval-augmented generation, and then it reviews notable approaches according to different tasks including dialogue response generation, machine translation, and other generation tasks. Finally, it points out some important directions on top of recent methods to facilitate future research.

Submitted to arXiv on 02 Feb. 2022

Results of the summarizing process for the arXiv paper: 2202.01110v2

Retrieval-augmented text generation has recently gained significant attention in computational linguistics. Compared with conventional generation models, it offers several advantages and has achieved state-of-the-art performance on various natural language processing (NLP) tasks. In this paper, the authors conduct a survey of retrieval-augmented text generation. The survey begins by highlighting the generic paradigm of retrieval-augmented generation. It then reviews notable approaches for different tasks, excluding question answering. For dialogue response generation, machine translation, and other generation tasks, the authors discuss how retrieval-augmented techniques have been applied and how effective they are.

In dialogue response generation, a corrupted input sequence is used during learning along with a set of retrieved multilingual texts, and the model learns to reconstruct the original sequence from these retrieved documents. RETRO, a large pre-trained language model enhanced with retrieved documents, has shown performance comparable to GPT-3 with significantly fewer parameters.

Text summarization is another area where retrieval-augmented techniques have been applied. Adaptive decoding frameworks have been proposed that retrieve exemplar documents based on the source document and then generate summaries through an adaptive generation process. Some approaches also add an intermediate re-ranking stage to improve summarization quality.

For paraphrase generation, retrieval-based frameworks retrieve similar sentences and use them as the basis for generating paraphrases. A related line of work controls the syntax of generated text by extracting sentential exemplars and using them as syntax templates.

In text style transfer, retrieval-based frameworks retrieve similar texts based on lexical-level similarity. Irrelevant tokens are then deleted from the retrieved texts, and the output is derived from the edited template. Incorporating retrieval information from multiple sources has been shown to improve model performance in this area.

Retrieval-augmented generation has also been adapted for data-to-text generation. One proposed framework retrieves candidate texts from an unlabelled corpus based on the source data. A neural selector then measures the similarity between the source data and the candidate texts to extract more fine-grained prototypes, which are used as input for generating text descriptions of the structured data.
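
The retrieve-then-edit procedure summarized for text style transfer (retrieve a lexically similar text, delete irrelevant tokens, derive the output from the edited template) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method of any paper covered by the survey: the function names are hypothetical, "irrelevant" is assumed to mean tokens that appear neither in the source nor in a hand-written set of target-style markers, and joining the kept tokens stands in for a trained generator.

```python
from typing import List, Set


def lexical_similarity(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of token sets (assumed lexical-level similarity)."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def retrieve_template(source: str, target_corpus: List[str]) -> str:
    """Pick the target-style sentence that is lexically closest to the source."""
    src_tokens = source.lower().split()
    return max(
        target_corpus,
        key=lambda t: lexical_similarity(src_tokens, t.lower().split()),
    )


def edit_template(template: str, source: str, style_markers: Set[str]) -> str:
    """Delete template tokens that are neither source content nor style markers.

    Assumption: a real system would feed the edited template to a neural
    generator; here the kept tokens are simply joined.
    """
    keep = set(source.lower().split()) | style_markers
    return " ".join(tok for tok in template.split() if tok.lower() in keep)


# Usage: transfer a negative review toward a positive style.
positive_corpus = ["the food was delicious and cheap", "great service every time"]
template = retrieve_template("the food was terrible", positive_corpus)
output = edit_template(template, "the food was terrible", {"delicious", "great"})
```

Here the retrieved template is the first corpus sentence, and the tokens "and" and "cheap" are dropped because they are neither source content nor style markers.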
Created on 02 Oct. 2023
