Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models

AI-generated keywords: Large Language Models

AI-generated Key Points

  • Large language models (LLMs) are effective in natural language processing tasks
  • Multi-step reasoning tasks still pose a challenge for LLMs
  • Few-shot chain-of-thought (CoT) prompting uses manually crafted step-by-step reasoning demonstrations to help LLMs generate explicit reasoning steps and improve accuracy
  • Zero-shot-CoT concatenates the problem statement with "Let's think step by step" as an input prompt to LLMs, but suffers from errors such as calculation errors, missing-step errors, and semantic misunderstanding errors
  • Plan-and-Solve (PS) Prompting is proposed to address missing-step errors in Zero-shot-CoT by dividing the task into smaller subtasks and carrying them out according to a plan
  • PS+ prompting extends PS prompting with more detailed instructions to further improve the quality of generated reasoning steps and address calculation errors
  • The proposed zero-shot prompting strategy is evaluated on ten datasets across three reasoning problems using GPT-3
  • Plan-and-Solve Prompting consistently outperforms Zero-shot-CoT across all datasets by a large margin and is comparable to or exceeds Zero-shot-Program-of-Thought Prompting
  • It performs comparably to 8-shot CoT prompting on math reasoning problems
  • Plan-and-Solve Prompting provides an effective way to generate explicit reasoning steps in LLMs without manually crafted demonstrations

Authors: Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim

ACL 2023
License: CC BY 4.0

Abstract: Large language models (LLMs) have recently been shown to deliver impressive performance in various NLP tasks. To tackle multi-step reasoning tasks, few-shot chain-of-thought (CoT) prompting includes a few manually crafted step-by-step reasoning demonstrations which enable LLMs to explicitly generate reasoning steps and improve their reasoning task accuracy. To eliminate the manual effort, Zero-shot-CoT concatenates the target problem statement with "Let's think step by step" as an input prompt to LLMs. Despite the success of Zero-shot-CoT, it still suffers from three pitfalls: calculation errors, missing-step errors, and semantic misunderstanding errors. To address the missing-step errors, we propose Plan-and-Solve (PS) Prompting. It consists of two components: first, devising a plan to divide the entire task into smaller subtasks, and then carrying out the subtasks according to the plan. To address the calculation errors and improve the quality of generated reasoning steps, we extend PS prompting with more detailed instructions and derive PS+ prompting. We evaluate our proposed prompting strategy on ten datasets across three reasoning problems. The experimental results over GPT-3 show that our proposed zero-shot prompting consistently outperforms Zero-shot-CoT across all datasets by a large margin, is comparable to or exceeds Zero-shot-Program-of-Thought Prompting, and has comparable performance with 8-shot CoT prompting on the math reasoning problem. The code can be found at https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting.
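The abstract describes both Zero-shot-CoT and PS prompting as a trigger sentence concatenated with the problem statement. A minimal Python sketch of that idea follows. Only the "Let's think step by step" trigger is quoted in the abstract; the PS wording below is an illustrative paraphrase of "devise a plan, then carry it out" and may differ from the paper's exact prompt (see the linked repository for that). The `Q:`/`A:` layout and the `build_prompt` helper are assumptions made for this example.

```python
# Trigger sentence quoted in the abstract for Zero-shot-CoT.
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."

# Illustrative paraphrase of the two PS components (plan, then execute).
# Assumption: this is NOT guaranteed to match the paper's exact wording.
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve it. "
    "Then, let's carry out the plan and solve the problem step by step."
)


def build_prompt(problem: str, trigger: str) -> str:
    """Concatenate a problem statement with a zero-shot trigger sentence.

    The "Q: ... A: ..." layout is a hypothetical template for this sketch.
    """
    return f"Q: {problem.strip()}\nA: {trigger}"


if __name__ == "__main__":
    question = "A shop sells 3 pens for $2. How much do 12 pens cost?"
    print(build_prompt(question, ZERO_SHOT_COT_TRIGGER))
    print()
    print(build_prompt(question, PS_TRIGGER))
```

Because the two strategies differ only in the trigger string, the same helper can produce either prompt, and no demonstrations need to be written by hand.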

Submitted to arXiv on 06 May. 2023


Results of the summarizing process for the arXiv paper: 2305.04091v1

Large language models (LLMs) have demonstrated impressive performance in various natural language processing tasks, but multi-step reasoning remains a challenge. Few-shot chain-of-thought (CoT) prompting addresses this by including manually crafted step-by-step reasoning demonstrations, which help LLMs generate explicit reasoning steps and improve their accuracy. To eliminate the manual effort, Zero-shot-CoT concatenates the problem statement with "Let's think step by step" as the input prompt. Despite its success, Zero-shot-CoT suffers from calculation errors, missing-step errors, and semantic misunderstanding errors.

To address the missing-step errors, the paper proposes Plan-and-Solve (PS) Prompting, which has two components: devising a plan that divides the entire task into smaller subtasks, and then carrying out the subtasks according to the plan. To address calculation errors and improve the quality of the generated reasoning steps, PS prompting is extended with more detailed instructions, yielding PS+ prompting.

The proposed zero-shot prompting strategy is evaluated on ten datasets across three reasoning problems using GPT-3. It consistently outperforms Zero-shot-CoT across all datasets by a large margin, is comparable to or exceeds Zero-shot-Program-of-Thought Prompting, and performs comparably to 8-shot CoT prompting on the math reasoning problem. In conclusion, Plan-and-Solve Prompting offers an effective way to elicit explicit reasoning steps from LLMs without manually crafted demonstrations, targeting the missing-step and calculation errors of Zero-shot-CoT.
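The comparison above is stated in terms of accuracy per prompting strategy. The sketch below shows one way such an accuracy figure could be computed for a math dataset, assuming the final answer is the last number in the model's output. That extraction rule, the `extract_last_number` and `accuracy` helpers, and the stub model are all hypothetical and are not taken from the paper, whose own evaluation procedure is not described on this page.

```python
import re
from typing import Callable, Optional, Sequence, Tuple


def extract_last_number(text: str) -> Optional[float]:
    """Return the last number in the text, or None if there is none.

    Assumption: the final answer is the last number the model wrote.
    Thousands separators such as "1,250" are removed before matching.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None


def accuracy(
    model: Callable[[str], str],
    examples: Sequence[Tuple[str, float]],
    trigger: str,
) -> float:
    """Fraction of examples whose extracted answer matches the gold answer."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    for problem, gold in examples:
        # Same hypothetical "Q: ... A: <trigger>" layout for every strategy.
        output = model(f"Q: {problem}\nA: {trigger}")
        predicted = extract_last_number(output)
        if predicted is not None and abs(predicted - gold) < 1e-6:
            correct += 1
    return correct / len(examples)


if __name__ == "__main__":
    # Stub standing in for an LLM call; it always answers 8.
    def stub_model(prompt: str) -> str:
        return "Plan: add the numbers. Step 1: 3 + 5 = 8. The answer is 8."

    data = [("What is 3 + 5?", 8.0), ("What is 2 + 2?", 4.0)]
    print(accuracy(stub_model, data, "Let's think step by step."))
```

Running the same loop once per trigger sentence gives per-strategy accuracies that can be compared directly, since everything except the trigger is held fixed.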
Created on 26 May. 2023

