DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction

AI-generated keywords: Large Language Models (LLMs)

AI-generated Key Points

  • Authors propose a decomposition approach to improve the performance of Large Language Models (LLMs) in complex text-to-SQL tasks
  • Decomposition involves breaking down SQL queries into smaller sub-problems and feeding solutions into LLMs
  • Approach significantly enhances LLM performance, bridging the gap between fine-tuned models and prompting approaches on challenging datasets like Spider
  • Experiments conducted using three LLMs: two variants of CodeX family (Davinci and Cushman) and GPT-4
  • Evaluation performed on Spider dataset consisting of 10,181 questions and 5,693 unique complex SQL queries across 200 databases
  • Proposed approach improves LLM performance by approximately 10%, pushing accuracy towards state-of-the-art levels
  • New state-of-the-art execution accuracy achieved using this approach is 85.3% on holdout test set of Spider, surpassing previous best results at 79.9%
  • Proposed approach outperforms heavily fine-tuned models by at least 5%
  • Experimental setup includes greedy decoding with zero temperature for output generation and specific hyperparameters for each module involved in the approach
  • Evaluation metrics used are exact-set match accuracy (EM) and execution accuracy (EX)
  • Study demonstrates that decomposing text-to-SQL tasks into subproblems can significantly enhance LLM performance
  • Findings contribute to advancing natural language interfaces to databases by improving access to data through relational databases using more efficient reasoning processes
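
The decomposition idea in the key points above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the sub-task names (`link_schema`, `draft_sql`, `self_correct`), the prompt layout, and the `fake_llm` stand-in are all hypothetical and chosen only to show how the answer to each sub-problem can be fed into the prompt for the next one.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def decompose_and_solve(
    question: str, schema: str, llm: LLM, steps: List[str]
) -> Dict[str, str]:
    """Solve sub-tasks in order, passing earlier answers into later prompts."""
    solved: Dict[str, str] = {}
    for step in steps:
        # Everything solved so far becomes context for the current sub-task.
        context = "\n".join(f"{name}: {answer}" for name, answer in solved.items())
        prompt = (
            f"Schema:\n{schema}\n"
            f"Question: {question}\n"
            f"Earlier sub-task results:\n{context}\n"
            f"Sub-task: {step}"
        )
        solved[step] = llm(prompt)
    return solved


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): canned answer per sub-task."""
    canned = {
        "link_schema": "singer.name, singer.age",
        "draft_sql": "SELECT name FROM singer WHERE age > 30",
        "self_correct": "SELECT name FROM singer WHERE age > 30",
    }
    subtask = prompt.rsplit("Sub-task: ", 1)[1]
    return canned[subtask]


# Illustrative sub-task names, not taken from the summary.
STEPS = ["link_schema", "draft_sql", "self_correct"]
result = decompose_and_solve(
    "Which singers are older than 30?", "singer(name, age)", fake_llm, STEPS
)
print(result["self_correct"])
```

In a real setting `fake_llm` would be replaced by a call to an actual model; the point of the sketch is only that the final query is produced from intermediate results rather than from the question alone.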

Authors: Mohammadreza Pourreza, Davood Rafiei

License: CC BY 4.0

Abstract: We study the problem of decomposing a complex text-to-sql task into smaller sub-tasks and how such a decomposition can significantly improve the performance of Large Language Models (LLMs) in the reasoning process. There is currently a significant gap between the performance of fine-tuned models and prompting approaches using LLMs on challenging text-to-sql datasets such as Spider. We show that SQL queries, despite their declarative structure, can be broken down into sub-problems and the solutions of those sub-problems can be fed into LLMs to significantly improve their performance. Our experiments with three LLMs show that this approach consistently improves their performance by roughly 10%, pushing the accuracy of LLMs towards state-of-the-art, and even beating large fine-tuned models on the holdout Spider dataset.

Submitted to arXiv on 21 Apr. 2023


Results of the summarizing process for the arXiv paper: 2304.11015v2

In this study, the authors address the problem of improving the performance of Large Language Models (LLMs) on complex text-to-SQL tasks. They propose a decomposition approach in which a text-to-SQL task is broken down into smaller sub-problems and the solutions to these sub-problems are fed into the LLM. The authors demonstrate that this decomposition significantly enhances the performance of LLMs, narrowing the gap between fine-tuned models and prompting approaches on challenging text-to-SQL datasets like Spider.

To evaluate their approach, the authors conduct experiments using three LLMs: two variants of the CodeX family (Davinci and Cushman) and GPT-4. These models are chosen for their large size and applicability to prompting tasks. The evaluation is performed on the Spider dataset, which consists of 10,181 questions and 5,693 unique complex SQL queries across 200 databases.

The results show that the proposed approach consistently improves the performance of LLMs by approximately 10%, pushing their accuracy towards state-of-the-art levels. On the holdout test set of Spider, the approach reaches a new state-of-the-art execution accuracy of 85.3%, surpassing the previous best result of 79.9%. Furthermore, when compared to heavily fine-tuned models, the proposed approach outperforms them by at least 5%.

The authors also provide details about their experimental setup. They use greedy decoding with zero temperature for output generation and set specific hyperparameters for each module involved in their approach. Two evaluation metrics are used: exact-set match accuracy (EM), which compares the predicted query with the ground-truth query clause by clause, and execution accuracy (EX), which compares the results of running both queries on the database.

Overall, this study demonstrates that decomposing text-to-SQL tasks into sub-problems can significantly enhance LLM performance. The findings contribute to advancing natural language interfaces to databases, making data stored in relational databases easier to access through more efficient reasoning processes.
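
The execution accuracy (EX) metric mentioned in the summary can be illustrated with a small sketch using Python's built-in `sqlite3` module. This is a simplified reading of the metric, not the official Spider evaluation script: it assumes two queries "match" when they return the same rows regardless of order, and the `singer` table, the query pairs, and the helper name `execution_match` are all made up for illustration.

```python
import sqlite3
from collections import Counter


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """Return True if both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to run counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Counter compares rows as a multiset, so row order is ignored (assumption).
    return Counter(predicted_rows) == Counter(gold_rows)


# Toy in-memory database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (name TEXT, age INTEGER);
    INSERT INTO singer VALUES ('Ann', 41), ('Bo', 25), ('Cy', 33);
    """
)

# (predicted, gold) pairs: differently written but equivalent, wrong rows, invalid SQL.
pairs = [
    ("SELECT name FROM singer WHERE age > 30", "SELECT name FROM singer WHERE age >= 31"),
    ("SELECT name FROM singer", "SELECT name FROM singer WHERE age > 30"),
    ("SELECT nme FROM singer", "SELECT name FROM singer"),
]

matches = [execution_match(conn, predicted, gold) for predicted, gold in pairs]
ex_accuracy = sum(matches) / len(pairs)
print(matches, ex_accuracy)
```

The first pair shows why EX and EM can disagree: the two queries are written differently, which a strict textual comparison would reject, yet they return the same rows on this database and so count as a match under EX.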
Created on 25 Oct. 2023

