Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices

AI-generated keywords: Question Answering, Natural Language Understanding, Deep Learning, Hybrid Question Answering, Knowledge-Based QA

AI-generated Key Points

  • The internet has led to an increase in available information, requiring automated answering systems
  • Question-Answering (QA) is used to provide relevant answers using Natural Language Understanding (NLU)
  • QA involves mapping user questions, retrieving relevant information, and finding the best answer
  • Deep learning models have shown significant improvements in QA tasks
  • Open challenges include automatic question generation, similarity detection, and low resource availability for language processing
  • State-of-the-art models on QA datasets are evaluated using performance metrics such as F1 score and Exact Match (EM) score
  • Approaches used by researchers include pre-training BERT or GPT models without feature extraction or using Bidirectional Long Short Term Memory (BLSTM) networks with embedding layers for end-to-end training
  • Hybrid Question Answering requires multiple semantic clues to constrain the answer set for complex questions
  • QA can be divided into Raw Text-Based QA and Knowledge-Based QA (KBQA)
  • Deep learning techniques are gaining popularity in QA research for resource-rich languages but remain far off for low-resource languages, where rule-based or machine learning approaches prevail
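
The three QA steps listed in the key points (mapping the question to a query, retrieving relevant information, and finding the best answer) can be illustrated with a minimal sketch. This is not the method of the paper or of any system it surveys: the function names, the stopword list, and the word-overlap scoring are all assumptions chosen for illustration, standing in for the learned models a real system would use.

```python
import re

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"a", "an", "the", "is", "of", "in", "what", "which", "who", "where", "when"}


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def question_to_query(question: str) -> set:
    """Step 1: map the user question to a query (here: its content words)."""
    return tokenize(question) - STOPWORDS


def retrieve(query: set, passages: list, top_k: int = 1) -> list:
    """Step 2: rank passages by word overlap with the query and keep the top_k."""
    ranked = sorted(passages, key=lambda p: len(query & tokenize(p)), reverse=True)
    return ranked[:top_k]


def select_answer(query: set, passages: list) -> str:
    """Step 3: pick the sentence from the retrieved passages that best matches the query."""
    sentences = [
        s.strip()
        for p in passages
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    return max(sentences, key=lambda s: len(query & tokenize(s)), default="")


passages = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany. It lies on the Spree.",
]
query = question_to_query("What is the capital of France?")
answer = select_answer(query, retrieve(query, passages))
print(answer)
```

Word overlap is used here only because it keeps the sketch self-contained; each of the three functions marks the place where a trained retriever or reader would be substituted.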

Authors: Hariom A. Pandya, Brijesh S. Bhatt

License: CC BY 4.0

Abstract: The usage and amount of information available on the internet have increased over the past decade. This digitization leads to the need for automated answering systems to extract fruitful information from redundant and transitional knowledge sources. Such systems are designed to serve the most prominent answer from this giant knowledge source to the user query using natural language understanding (NLU) and thus depend eminently on the Question-Answering (QA) field. Question answering involves, but is not limited to, steps such as mapping the user question to a pertinent query, retrieving relevant information, and finding the most suitable answer from the retrieved information. Recent improvements in deep learning models show compelling performance gains in all these tasks. In this review work, the research directions of the QA field are analyzed based on the type of question, answer type, source of evidence-answer, and modeling approach. This analysis is followed by open challenges of the field such as automatic question generation, similarity detection, and low resource availability for a language. In the end, a survey of available datasets and evaluation measures is presented.

Submitted to arXiv on 07 Dec. 2021


Results of the summarizing process for the arXiv paper: 2112.03572v1

The past decade has seen a significant increase in the usage and amount of information available on the internet, leading to the need for automated answering systems that can extract useful information from redundant and transitional knowledge sources. These systems rely heavily on the field of Question-Answering (QA) to provide users with relevant answers using Natural Language Understanding (NLU). QA involves several steps, including mapping user questions to pertinent queries, retrieving relevant information, and finding the most suitable answer in the retrieved data. Recent improvements in deep learning models have brought compelling performance gains in all these tasks.

This review work analyzes research directions in QA based on question type, answer type, source of evidence-answer, and modeling approach. The paper also highlights open challenges such as automatic question generation, similarity detection, and low resource availability for language processing. The authors present a survey of available datasets and evaluation measures. State-of-the-art models on QA datasets are evaluated using performance metrics such as F1 score and Exact Match (EM) score.

The authors discuss various approaches used by researchers, such as pre-training BERT or GPT models without feature extraction or using Bidirectional Long Short Term Memory (BLSTM) networks with embedding layers for end-to-end training. Hybrid Question Answering is another area, in which multiple semantic clues are required to constrain the answer set for complex questions. Researchers have proposed methods such as generating multiple query graphs for a given question or dividing the model into two parts: question interpretation and answer inference. Based on where the answer is looked up, QA can be further divided into Raw Text-Based QA and Knowledge-Based QA (KBQA).

While humans can easily detect an answer paragraph or sentence in a given passage, machines struggle with this task due to reasoning disparities between humans and machines. Overall, while deep learning techniques are gaining popularity in QA research for resource-rich languages, they remain far off for low-resource languages, where rule-based or machine learning approaches still prevail.
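
The F1 and EM scores mentioned in the summary can be made concrete with a small sketch. The summary does not say how the surveyed work computes them, so the code below assumes the common convention for extractive QA benchmarks: answers are normalized (lowercased, punctuation and English articles removed), EM checks string equality, and F1 is the harmonic mean of token-level precision and recall. The function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """EM: 1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, count it as a match only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower"))
print(f1_score("in Paris France", "Paris"))
```

EM rewards only a fully correct answer span, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.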
Created on 25 Jun. 2023

