Token-Budget-Aware LLM Reasoning

AI-generated keywords: Token-Budget-Aware LLM Reasoning, reasoning, large language models (LLMs), Chain-of-Thought (CoT) reasoning, token budget

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Authors: Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen
  • Importance of Reasoning in LLMs:
      • Enhances performance across tasks
      • Chain-of-Thought (CoT) reasoning is effective
      • Drawback: increased token usage and cost during reasoning
  • Proposed Solution:
      • Include a reasonable token budget in the prompt
      • The choice of token budget is pivotal to how well the compression works
  • Introduction of Token-Budget-Aware LLM Reasoning Framework:
      • Dynamically estimates token budgets based on problem complexity
      • Uses the estimated budgets to guide the reasoning process
  • Experimental Results:
      • Reduces token costs in CoT reasoning with only a marginal impact on performance
  • Balancing Efficiency and Accuracy:
      • Practical solution for optimizing LLM reasoning tasks
  • Future Developments:
      • Valuable insights for optimizing reasoning processes in language models
  • For more details and access to their code repository, visit https://github.com/GeniusHTX/TALE.

Authors: Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen

Abstract: Reasoning is critical for large language models (LLMs) to excel in a wide range of tasks. While methods like Chain-of-Thought (CoT) reasoning enhance LLM performance by decomposing problems into intermediate steps, they also incur significant overhead in token usage, leading to increased costs. We find that the reasoning process of current LLMs is unnecessarily lengthy and it can be compressed by including a reasonable token budget in the prompt, but the choice of token budget plays a crucial role in the actual compression effectiveness. We then propose a token-budget-aware LLM reasoning framework, which dynamically estimates token budgets for different problems based on reasoning complexity and uses the estimated token budgets to guide the reasoning process. Experiments show that our method effectively reduces token costs in CoT reasoning with only a slight performance reduction, offering a practical solution to balance efficiency and accuracy in LLM reasoning. Code: https://github.com/GeniusHTX/TALE.
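
The idea in the abstract, placing a token budget in the prompt and choosing that budget per problem, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: this page only has access to the paper's metadata, so the prompt wording, the function names `estimate_budget` and `build_budgeted_prompt`, and the word-count heuristic are hypothetical stand-ins rather than the authors' method. The actual implementation is in the linked repository.

```python
def estimate_budget(question: str, min_budget: int = 32, max_budget: int = 512) -> int:
    """Hypothetical stand-in for a budget estimator.

    Assumption: question length is used as a crude proxy for reasoning
    complexity. This is NOT the estimator described in the paper.
    """
    words = len(question.split())
    # Assumed heuristic: 8 reasoning tokens per question word, clamped to a range.
    return max(min_budget, min(max_budget, 8 * words))


def build_budgeted_prompt(question: str, budget: int) -> str:
    """Append an explicit token budget to a Chain-of-Thought style instruction.

    The instruction wording is an assumption, not a quote from the paper.
    """
    return (
        f"{question}\n"
        f"Let's think step by step and use fewer than {budget} tokens."
    )


if __name__ == "__main__":
    q = "A train travels 60 km in 1.5 hours. What is its average speed in km/h?"
    print(build_budgeted_prompt(q, estimate_budget(q)))
```

The abstract notes that the choice of budget strongly affects how much compression is actually achieved, so a fixed or naive estimate like the one above would likely need to be replaced by a more careful, per-problem estimate in practice.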

Submitted to arXiv on 24 Dec. 2024


Results of the summarizing process for the arXiv paper: 2412.18547v4

This paper's license does not allow us to build upon its content, so the summary below was generated from the paper's metadata rather than the full article.

In their paper titled "Token-Budget-Aware LLM Reasoning," authors Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, and Zhenyu Chen address the critical role of reasoning in enhancing the performance of large language models (LLMs) across various tasks. They highlight the effectiveness of methods like Chain-of-Thought (CoT) reasoning in breaking down complex problems into manageable intermediate steps. However, they also identify a significant drawback in the form of increased token usage and associated costs.

The researchers observe that the current reasoning process employed by LLMs is often unnecessarily lengthy and propose a solution to compress it by incorporating a reasonable token budget within the prompt. They emphasize that the choice of token budget is pivotal in determining the actual effectiveness of this compression strategy. To address this challenge, they introduce a novel token-budget-aware LLM reasoning framework. This framework dynamically estimates token budgets based on the complexity of each problem and utilizes these estimates to guide the reasoning process effectively.

Through experiments conducted as part of their study, the authors demonstrate that their proposed method successfully reduces token costs in CoT reasoning while only marginally impacting performance. This approach offers a practical solution for striking a balance between efficiency and accuracy in LLM reasoning tasks. The research provides valuable insights into optimizing reasoning processes within language models and presents a promising avenue for future developments in this field. For more details and access to their code repository, interested readers can refer to https://github.com/GeniusHTX/TALE.
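
The efficiency/accuracy trade-off described in this summary can be quantified by comparing a plain CoT run with a budgeted run. The following Python sketch shows one way to do that. The class `RunStats`, the function `compare`, and all numbers are hypothetical and chosen only to make the arithmetic concrete; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Aggregate statistics for one evaluation run (hypothetical structure)."""
    output_tokens: int  # total tokens generated across all problems
    correct: int        # number of problems answered correctly
    total: int          # number of problems attempted

    @property
    def accuracy(self) -> float:
        return self.correct / self.total


def compare(baseline: RunStats, budgeted: RunStats) -> tuple[float, float]:
    """Return (relative token reduction, absolute accuracy change)."""
    token_reduction = 1 - budgeted.output_tokens / baseline.output_tokens
    accuracy_change = budgeted.accuracy - baseline.accuracy
    return token_reduction, accuracy_change


if __name__ == "__main__":
    # Illustrative numbers only; not taken from the paper.
    plain_cot = RunStats(output_tokens=50_000, correct=80, total=100)
    with_budget = RunStats(output_tokens=20_000, correct=78, total=100)
    reduction, delta = compare(plain_cot, with_budget)
    print(f"token reduction: {reduction:.0%}, accuracy change: {delta:+.1%}")
```

Reporting both numbers side by side makes it clear whether a given budget saves tokens at an acceptable accuracy cost.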
Created on 01 May. 2025
