MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering

AI-generated keywords: MultiTabQA, Tabular QA, Semantic Parsing, Neural Models, Table Reasoning

AI-generated Key Points

  • Recent tabular question answering (QA) models answer questions over a single table only. This limits their coverage, because single-table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries.
  • Two major directions have been explored to address this gap: semantic parsing-based techniques and end-to-end neural models.
  • The proposed model is called MultiTabQA, which answers questions over multiple tables and generates tabular answers.
  • A pre-training dataset comprising 132,645 SQL queries and tabular answers was built for effective training.
  • MultiTabQA outperforms state-of-the-art single-table QA models adapted to multi-table QA settings by finetuning on three datasets (Spider, Atis, and GeoQuery) in terms of accuracy and generalization ability.
  • The approach is particularly useful for non-normalized tables from sources other than relational databases such as web tables or tables in text documents.
  • Novel evaluation metrics are introduced that assess the quality of generated tables at different levels of granularity, based on structural properties such as column names and the data types of the generated values.

Authors: Vaishali Pal, Andrew Yates, Evangelos Kanoulas, Maarten de Rijke

Accepted at ACL-2023
License: CC BY-NC-SA 4.0

Abstract: Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often over multiple tables in a relational database or web page. Single table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries. Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities of tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers. To enable effective training, we build a pre-training dataset comprising 132,645 SQL queries and tabular answers. Further, we evaluate the generated tables by introducing table-specific metrics of varying strictness assessing various levels of granularity of the table structure. MultiTabQA outperforms state-of-the-art single table QA models adapted to a multi-table QA setting by finetuning on three datasets: Spider, Atis and GeoQuery.
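To make the table operations named in the abstract concrete, here is a minimal sketch using Python's built-in sqlite3 module. The two toy tables (singer, concert), their contents, and the helper run_query are hypothetical and are not taken from the paper or its datasets. The sketch only illustrates that a join, a set operation, and a nested query each span more than one table and each return a table as the answer, which is the kind of output a multi-table QA model has to generate.

```python
import sqlite3


def run_query(conn, sql):
    """Run a SQL query and return the answer as a (header, rows) table."""
    cur = conn.execute(sql)
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()


# Hypothetical toy database with two related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE concert (concert_id INTEGER PRIMARY KEY, singer_id INTEGER, year INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 'Spain'), (2, 'Bo', 'Norway'), (3, 'Cy', 'Spain');
    INSERT INTO concert VALUES (10, 1, 2014), (11, 1, 2015), (12, 2, 2015);
""")

# Join: "Which singers performed in which years?"
join_answer = run_query(conn, """
    SELECT s.name, c.year
    FROM singer AS s
    JOIN concert AS c ON s.singer_id = c.singer_id
    ORDER BY s.name, c.year
""")

# Set operation: "Which singer ids have no concert?"
set_answer = run_query(conn, """
    SELECT singer_id FROM singer
    EXCEPT
    SELECT singer_id FROM concert
""")

# Nested query: "Which singers performed in 2015?"
nested_answer = run_query(conn, """
    SELECT name FROM singer
    WHERE singer_id IN (SELECT singer_id FROM concert WHERE year = 2015)
    ORDER BY name
""")

for header, rows in (join_answer, set_answer, nested_answer):
    print(header, rows)
```

Each answer is a header plus a list of rows rather than a single value, so a single-table QA model that emits one cell or one short string cannot represent it directly.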

Submitted to arXiv on 22 May 2023


Results of the summarizing process for the arXiv paper: 2305.12820v2

Recent tabular question answering (QA) models answer questions over a single table only. This limits their coverage, because single-table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries. To address this gap, researchers have explored two major directions: semantic parsing-based techniques, which transform natural language questions into logical forms used to query a relational database, and end-to-end neural models, which combine question understanding with table reasoning. Our work focuses on the latter direction and proposes a new task of answering questions over multiple tables, together with a model for it, MultiTabQA. This model not only answers questions over multiple tables but also generates tabular answers. To enable effective training, we build a pre-training dataset comprising 132,645 SQL queries and tabular answers. Furthermore, we evaluate the generated tables by introducing table-specific metrics of varying strictness that assess the table structure at various levels of granularity. Compared to state-of-the-art single-table QA models adapted to multi-table QA settings by finetuning on three datasets (Spider, Atis, and GeoQuery), MultiTabQA outperforms them in terms of accuracy and generalization ability. Our approach is particularly useful for non-normalized tables from sources other than relational databases, such as web tables or tables in text documents. Moreover, because multi-table operations often produce a table rather than a single value, our work highlights the importance of generating accurate tabular outputs. The evaluation metrics we introduce assess the quality of generated tables at different levels of granularity, based on structural properties such as column names and the data types of the generated values.
In summary, our proposed MultiTabQA model addresses the limitations of existing tabular QA systems by enabling accurate answer generation for complex multi-table queries from diverse sources beyond relational databases while providing an effective training methodology and novel evaluation metrics for generated tables.
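The summary describes table-specific metrics of varying strictness but does not give their definitions. The following is a minimal sketch of what scoring at several granularities could look like, not the paper's actual metrics: the function names, the multiset-overlap definition of precision and recall, and the example tables are all assumptions made for illustration. The strictest check requires the whole table to match, and looser checks score rows, columns (header name plus values), and individual cells.

```python
from collections import Counter


def prf(predicted, gold):
    """Precision, recall and F1 from the multiset overlap of two item lists."""
    pred_counts, gold_counts = Counter(predicted), Counter(gold)
    overlap = sum((pred_counts & gold_counts).values())
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def table_exact_match(pred_header, pred_rows, gold_header, gold_rows):
    """Strictest level: header and all rows must be identical, in order."""
    return list(pred_header) == list(gold_header) and \
        [tuple(r) for r in pred_rows] == [tuple(r) for r in gold_rows]


def row_scores(pred_rows, gold_rows):
    """Row level: a row counts only if every cell in it matches."""
    return prf([tuple(r) for r in pred_rows], [tuple(r) for r in gold_rows])


def column_scores(pred_header, pred_rows, gold_header, gold_rows):
    """Column level: a column counts if its name and all its values match."""
    def columns(header, rows):
        return [(name, tuple(r[i] for r in rows)) for i, name in enumerate(header)]
    return prf(columns(pred_header, pred_rows), columns(gold_header, gold_rows))


def cell_scores(pred_rows, gold_rows):
    """Loosest level: individual cell values, ignoring their position."""
    return prf([c for r in pred_rows for c in r], [c for r in gold_rows for c in r])


# Hypothetical gold table and a prediction with one wrong cell.
header = ["name", "year"]
gold = [("Ana", 2014), ("Ana", 2015), ("Bo", 2015)]
pred = [("Ana", 2014), ("Ana", 2015), ("Bo", 2016)]

exact = table_exact_match(header, pred, header, gold)
row_p, row_r, row_f1 = row_scores(pred, gold)
col_p, col_r, col_f1 = column_scores(header, pred, header, gold)
cell_p, cell_r, cell_f1 = cell_scores(pred, gold)
print(exact, row_p, col_p, cell_p)
```

In this example the single wrong cell fails the exact-match check, costs one of three rows and one of two columns, but only one of six cells, which shows how the looser levels give partial credit to a mostly correct table.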
Created on 25 Jun. 2023
