Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving

AI-generated keywords: Large Language Models, Hierarchical Policy, In-Context Learning, Tournament-Based Approach, MATH Dataset

AI-generated Key Points

  • Large language models (LLMs) struggle with complex reasoning tasks in mathematics
  • Current approaches involve sampling or searching detailed reasoning chains but have limitations
  • Proposed approach leverages LLMs as hierarchical policies through in-context learning
  • Approach pairs a visionary leader, which proposes high-level problem-solving tactics, with a follower, which carries them out
  • Follower explores multiple reasoning chains guided by leader's hints
  • Tournament-based approach used to select the best solution group
  • Enhances problem-solving strategy exploration and improves accuracy on challenging problems
  • Code will be released at https://github.com/lz1oceani/LLM-As-Hierarchical-Policy
  • Unleashes LLMs' creative potential and improves reasoning abilities in challenging tasks across domains

Authors: Zhan Ling, Yunhao Fang, Xuanlin Li, Tongzhou Mu, Mingu Lee, Reza Pourreza, Roland Memisevic, Hao Su

License: CC BY 4.0

Abstract: Large Language Models (LLMs) have achieved tremendous progress, yet they still often struggle with challenging reasoning problems. Current approaches address this challenge by sampling or searching detailed and low-level reasoning chains. However, these methods are still limited in their exploration capabilities, making it challenging for correct solutions to stand out in the huge solution space. In this work, we unleash LLMs' creative potential for exploring multiple diverse problem-solving strategies by framing an LLM as a hierarchical policy via in-context learning. This policy comprises a visionary leader that proposes multiple diverse high-level problem-solving tactics as hints, accompanied by a follower that executes detailed problem-solving processes following each high-level instruction. The follower uses each of the leader's directives as a guide and samples multiple reasoning chains to tackle the problem, generating a solution group for each leader proposal. Additionally, we propose an effective and efficient tournament-based approach to select among these explored solution groups to reach the final answer. Our approach produces meaningful and inspiring hints, enhances problem-solving strategy exploration, and improves the final answer accuracy on challenging problems in the MATH dataset. Code will be released at https://github.com/lz1oceani/LLM-As-Hierarchical-Policy.
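
The leader/follower structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: `explore`, `toy_leader`, and `toy_follower` are hypothetical names, and the two toy functions are deterministic stand-ins for what would in practice be prompted LLM calls. The sketch only shows the control flow: the leader proposes several hints, and the follower samples several reasoning chains per hint, giving one solution group per hint.

```python
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not taken from the paper's code):
#   leader(problem, n_hints) -> list of high-level hints
#   follower(problem, hint)  -> final answer of one sampled reasoning chain
Leader = Callable[[str, int], List[str]]
Follower = Callable[[str, str], str]


def explore(problem: str, leader: Leader, follower: Follower,
            n_hints: int = 3, chains_per_hint: int = 4) -> Dict[str, List[str]]:
    """Build one solution group (a list of answers) for each leader hint."""
    groups: Dict[str, List[str]] = {}
    for hint in leader(problem, n_hints):
        # The follower is sampled several times under the same hint.
        groups[hint] = [follower(problem, hint) for _ in range(chains_per_hint)]
    return groups


# Deterministic toy stand-ins so the sketch runs without any model.
def toy_leader(problem: str, n_hints: int) -> List[str]:
    tactics = ["complete the square", "use the quadratic formula",
               "factor the polynomial"]
    return tactics[:n_hints]


def toy_follower(problem: str, hint: str) -> str:
    return f"answer via {hint}"


groups = explore("Solve x^2 - 5x + 6 = 0", toy_leader, toy_follower,
                 n_hints=2, chains_per_hint=3)
```

With real models, the follower's answers within a group would differ between samples, which is what makes a later selection step among groups necessary.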

Submitted to arXiv on 01 Nov. 2023

Results of the summarizing process for the arXiv paper: 2311.00694v1

Large language models (LLMs) have shown great potential in various domains but still struggle with challenging reasoning problems, such as those in the MATH dataset. Current approaches address this by sampling or searching detailed, low-level reasoning chains. Because these methods explore only a small part of a very large solution space, correct solutions rarely stand out.

To overcome this limitation, the authors frame an LLM as a hierarchical policy through in-context learning. The policy consists of a visionary leader and a follower. The leader proposes multiple diverse high-level problem-solving tactics as hints. For each hint, the follower samples multiple detailed reasoning chains, producing one solution group per proposed tactic.

In addition to the hierarchical policy, the authors introduce a tournament-based approach that selects among the solution groups to reach the final answer. According to the abstract, the method produces meaningful and inspiring hints, enhances exploration of problem-solving strategies, and improves final-answer accuracy on challenging problems in the MATH dataset. The code will be released at https://github.com/lz1oceani/LLM-As-Hierarchical-Policy.

The reported evaluation covers the MATH dataset only; whether the gains carry over to reasoning tasks in other domains is not established by the abstract.
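
Selecting among solution groups can be sketched as a simple sequential knockout. The summary does not spell out the tournament rules, so everything below is an assumption made for illustration: the names `tournament`, `majority_answer`, and `toy_judge` are hypothetical, the judge is a toy function standing in for a model-based comparison, and the rule "a challenger replaces the champion if it wins a majority of sampled comparisons" is one plausible design, not necessarily the paper's.

```python
import random
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical judge interface: returns 0 if the first answer looks better,
# 1 if the second does. In practice this would be a model-based comparison.
Judge = Callable[[str, str], int]


def majority_answer(group: List[str]) -> str:
    """Most frequent answer within one solution group."""
    return Counter(group).most_common(1)[0][0]


def tournament(groups: List[List[str]], judge: Judge, rounds: int = 3,
               rng: Optional[random.Random] = None) -> List[str]:
    """Sequential knockout: each new group challenges the current champion."""
    rng = rng or random.Random(0)
    champion = groups[0]
    for challenger in groups[1:]:
        wins = 0
        for _ in range(rounds):
            # Compare one sampled answer from each group.
            a, b = rng.choice(champion), rng.choice(challenger)
            if judge(a, b) == 1:
                wins += 1
        if wins * 2 > rounds:  # challenger won a strict majority
            champion = challenger
    return champion


# Toy judge for demonstration only: it prefers the answer "42".
def toy_judge(a: str, b: str) -> int:
    return 1 if (b == "42" and a != "42") else 0


solution_groups = [["41", "41", "40"], ["42", "42", "42"], ["7", "7", "8"]]
winner = tournament(solution_groups, toy_judge)
final_answer = majority_answer(winner)
```

A knockout of this kind needs only a linear number of group comparisons, whereas comparing every pair of groups would grow quadratically with the number of hints.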
Created on 02 Nov. 2023
