A Prefrontal Cortex-inspired Architecture for Planning in Large Language Models

AI-generated keywords: Large Language Models; Multi-step Reasoning; Goal-directed Planning; Prefrontal Cortex; Black Box Architecture

AI-generated Key Points

  • Researchers explore limitations of Large Language Models (LLMs) in tasks requiring multi-step reasoning or goal-directed planning
  • Proposal of a novel black box architecture built from multiple LLM-based (GPT-4) modules inspired by the human brain's prefrontal cortex (PFC)
  • Modules mimic PFC functions such as conflict monitoring, state prediction, state evaluation, task decomposition, and task coordination
  • New architecture improves planning by having the specialized PFC-inspired modules break a larger problem down into multiple brief automated calls to the LLM
  • Evaluation of combined architecture on challenging planning tasks like graph traversal, Tower of Hanoi, and logistics shows significant performance improvements compared to standard LLM methods and competitive baselines
  • Study demonstrates potential benefits of integrating knowledge from cognitive neuroscience to enhance planning capabilities in LLMs

Authors: Taylor Webb, Shanka Subhra Mondal, Chi Wang, Brian Krabach, Ida Momennejad

License: CC BY 4.0

Abstract: Large language models (LLMs) demonstrate impressive performance on a wide variety of tasks, but they often struggle with tasks that require multi-step reasoning or goal-directed planning. To address this, we take inspiration from the human brain, in which planning is accomplished via the recurrent interaction of specialized modules in the prefrontal cortex (PFC). These modules perform functions such as conflict monitoring, state prediction, state evaluation, task decomposition, and task coordination. We find that LLMs are sometimes capable of carrying out these functions in isolation, but struggle to autonomously coordinate them in the service of a goal. Therefore, we propose a black box architecture with multiple LLM-based (GPT-4) modules. The architecture improves planning through the interaction of specialized PFC-inspired modules that break down a larger problem into multiple brief automated calls to the LLM. We evaluate the combined architecture on three challenging planning tasks -- graph traversal, Tower of Hanoi, and logistics -- finding that it yields significant improvements over standard LLM methods (e.g., zero-shot prompting, in-context learning, and chain-of-thought). These results demonstrate the benefit of utilizing knowledge from cognitive neuroscience to improve planning in LLMs.
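The modular loop described in the abstract can be illustrated with a minimal sketch on a toy graph-traversal problem. This is not the authors' implementation: every function name, the toy graph, and the control flow are assumptions made for illustration, and each "module" is a hand-written stub standing in for what would be a separate prompted LLM call in the paper's setting.

```python
from collections import deque
from typing import List, Optional

# Hypothetical toy graph (directed edges). Not taken from the paper.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}


def bfs_path(start: str, goal: str) -> Optional[List[str]]:
    """Shortest path by breadth-first search; used only to fake module outputs."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in GRAPH.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def decompose(start: str, goal: str) -> List[str]:
    """Task decomposition stub: propose one intermediate subgoal, then the goal."""
    path = bfs_path(start, goal)
    if path is None or len(path) < 3:
        return [goal]
    return [path[len(path) // 2], goal]


def propose_actions(state: str) -> List[str]:
    """Action proposal stub; adds an invalid move 'Z' to imitate a hallucination."""
    return GRAPH.get(state, []) + ["Z"]


def monitor(state: str, action: str) -> bool:
    """Conflict monitoring stub: accept only moves along an existing edge."""
    return action in GRAPH.get(state, [])


def predict(state: str, action: str) -> str:
    """State prediction stub: moving to a node makes it the next state."""
    return action


def evaluate(state: str, goal: str) -> float:
    """State evaluation stub: estimated number of steps left to the goal."""
    path = bfs_path(state, goal)
    return float("inf") if path is None else len(path) - 1


def goal_reached(state: str, goal: str) -> bool:
    """Task coordination stub: decide when the current subgoal is done."""
    return state == goal


def plan(start: str, goal: str, max_steps: int = 10) -> List[str]:
    """Coordinate the modules: one short module call per decision."""
    path = [start]
    for subgoal in decompose(start, goal):
        steps = 0
        while not goal_reached(path[-1], subgoal):
            state = path[-1]
            valid = [a for a in propose_actions(state) if monitor(state, a)]
            scored = [(evaluate(predict(state, a), subgoal), a) for a in valid]
            if not scored or min(scored)[0] == float("inf") or steps >= max_steps:
                raise RuntimeError(f"no plan found from {state} to {subgoal}")
            path.append(predict(state, min(scored)[1]))
            steps += 1
    return path


print(plan("A", "E"))  # ['A', 'B', 'D', 'E']
```

The point of the sketch is the division of labour: no single function solves the whole problem, and the planner only succeeds because proposals are filtered, simulated, and scored by separate components before each move is committed.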

Submitted to arXiv on 30 Sep. 2023

Results of the summarizing process for the arXiv paper: 2310.00194v3

In this study, the researchers explore the limitations of Large Language Models (LLMs) in tasks that require multi-step reasoning or goal-directed planning. Drawing inspiration from the human brain's prefrontal cortex (PFC), in which planning arises from the interaction of specialized modules, they propose a novel black box architecture built from multiple LLM-based (GPT-4) modules. These modules mimic PFC functions such as conflict monitoring, state prediction, state evaluation, task decomposition, and task coordination. The architecture improves planning by having these modules break a larger problem down into multiple brief automated calls to the LLM. The researchers evaluate the combined architecture on three challenging planning tasks: graph traversal, Tower of Hanoi, and logistics. Compared with standard LLM methods (e.g., zero-shot prompting, in-context learning, and chain-of-thought) and competitive baselines, it yields significant performance improvements. The study thus demonstrates the benefit of applying knowledge from cognitive neuroscience to improve planning in LLMs.
Created on 28 Apr. 2024

