You Only Need One Model for Open-domain Question Answering
AI-generated Key Points
⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.
- The paper proposes a new approach to Open-domain Question Answering (QA) using a singular model architecture instead of the traditional three-model approach.
- The existing approach uses separate retriever, reranker, and reader models, which have separate parameters and are only weakly coupled during training.
- The proposed method uses hard-attention mechanisms within its transformer architecture to sequentially apply the retriever and reranker and feed resulting computed representations to the reader.
- This singular model architecture progressively refines hidden representations from the retriever to the reranker to the reader, leading to better gradient flow when trained in an end-to-end manner.
- A pre-training methodology is proposed to effectively train this architecture.
- The authors evaluate their model on the Natural Questions and TriviaQA open datasets and show that, for a fixed parameter budget, it outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match points, respectively.
- In summary, the paper contributes (1) a singular model architecture for Open-domain QA that uses model capacity more efficiently, (2) hard-attention mechanisms inside the transformer that allow end-to-end training with better gradient flow, and (3) a pre-training methodology to train the architecture effectively.
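The retriever-to-reranker-to-reader flow described in the key points can be illustrated with a toy sketch. This is not the authors' implementation (only the paper metadata is available here); it assumes that "hard attention" means keeping the top-k highest-scoring passages at each stage, and every name in it (`hard_attention`, `pipeline`, `W_RERANK`, `W_READ`, the random vectors) is hypothetical. The point it shows is that each stage reuses and refines the representations produced by the previous one instead of re-encoding the passages with a separate model.

```python
import numpy as np

def hard_attention(query, keys, k):
    """Keep only the k keys with the highest dot-product score (top-k selection)."""
    scores = keys @ query
    top = np.argsort(-scores)[:k]  # indices sorted by descending score
    return top, scores[top]

rng = np.random.default_rng(0)
DIM = 8
# Hypothetical stand-ins for successive groups of transformer layers.
W_RERANK = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)
W_READ = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)

def pipeline(question, passages, k_retrieve=8, k_rerank=3):
    # Stage 1 (retriever): hard attention over all passage representations.
    idx1, _ = hard_attention(question, passages, k_retrieve)

    # Stage 2 (reranker): refine the surviving representations, then select again.
    q = np.tanh(question @ W_RERANK)
    h = np.tanh(passages[idx1] @ W_RERANK)
    idx2, _ = hard_attention(q, h, k_rerank)

    # Stage 3 (reader): refine once more and pool with ordinary soft attention.
    q = np.tanh(q @ W_READ)
    h = np.tanh(h[idx2] @ W_READ)
    logits = h @ q
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    reader_input = weights @ h

    selected = idx1[idx2]  # map back to indices in the original passage set
    return selected, reader_input

passages = rng.normal(size=(20, DIM))  # toy passage representations
question = rng.normal(size=DIM)        # toy question representation
selected, reader_input = pipeline(question, passages)
```

Because the same hidden states pass through all three stages, a loss on the reader output can, in a real differentiable implementation, send gradients back through the layers shared with the reranker and retriever; the NumPy sketch only mirrors the forward pass.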
Authors: Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning, Kyoung-Gu Woo
Abstract: Recent works for Open-domain Question Answering refer to an external knowledge base using a retriever model, optionally rerank the passages with a separate reranker model and generate an answer using another reader model. Despite performing related tasks, the models have separate parameters and are weakly coupled during training. In this work, we propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture and feeding the resulting computed representations to the reader. In this singular model architecture, the hidden representations are progressively refined from the retriever to the reranker to the reader, which is a more efficient use of model capacity and also leads to better gradient flow when we train it in an end-to-end manner. We also propose a pre-training methodology to effectively train this architecture. We evaluate our model on the Natural Questions and TriviaQA open datasets and, for a fixed parameter budget, our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.
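For readers unfamiliar with the exact match (EM) metric quoted in the abstract, the following is a minimal sketch of how it is commonly computed for open-domain QA. It assumes the widely used SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace) and a percentage scale; the metadata does not confirm that the paper uses exactly this procedure, and the function names are illustrative.

```python
import re
import string

def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold_answers):
    """1.0 if the normalized prediction equals any normalized gold answer, else 0.0."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))

def exact_match_score(predictions, references):
    """Average exact match over a dataset, as a percentage (assumed scale)."""
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)

score = exact_match_score(
    ["The Eiffel Tower.", "Rome"],
    [["Eiffel Tower"], ["Milan", "Turin"]],
)
```

Under that percentage assumption, a gain of 1.0 EM corresponds to roughly one additional question in a hundred being answered exactly right.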