Small Language Models are the Future of Agentic AI

AI-generated keywords: agentic AI, large language models, small language models, specialized tasks, efficiency

AI-generated Key Points

  • Debate between large language models (LLMs) and small language models (SLMs) in the evolving landscape of agentic AI systems
  • Shift towards specialized tasks with repetitive functions in agentic AI systems
  • Argument that SLMs are powerful, suitable, and cost-effective for many applications in agentic systems
  • Detailed process outlined from data curation and filtering to SLM selection, specialized SLM fine-tuning, and continuous iteration and refinement
  • Importance of embracing SLMs in agentic AI systems for efficiency, cost reduction, and enhanced performance in specialized task-oriented applications

Authors: Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, Pavlo Molchanov

License: CC BY 4.0

Abstract: Large language models (LLMs) are often praised for exhibiting near-human performance on a wide range of tasks and valued for their ability to hold a general conversation. The rise of agentic AI systems is, however, ushering in a mass of applications in which language models perform a small number of specialized tasks repetitively and with little variation. Here we lay out the position that small language models (SLMs) are sufficiently powerful, inherently more suitable, and necessarily more economical for many invocations in agentic systems, and are therefore the future of agentic AI. Our argumentation is grounded in the current level of capabilities exhibited by SLMs, the common architectures of agentic systems, and the economy of LM deployment. We further argue that in situations where general-purpose conversational abilities are essential, heterogeneous agentic systems (i.e., agents invoking multiple different models) are the natural choice. We discuss the potential barriers for the adoption of SLMs in agentic systems and outline a general LLM-to-SLM agent conversion algorithm. Our position, formulated as a value statement, highlights the significance of the operational and economic impact even a partial shift from LLMs to SLMs is to have on the AI agent industry. We aim to stimulate the discussion on the effective use of AI resources and hope to advance the efforts to lower the costs of AI of the present day. Calling for both contributions to and critique of our position, we commit to publishing all such correspondence at https://research.nvidia.com/labs/lpr/slm-agents.
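The heterogeneous setup the abstract describes (one agent invoking several different models) can be sketched as a simple dispatcher. This is a minimal illustration rather than the authors' implementation: `make_router`, the task labels, and the two stub model functions are hypothetical names introduced here, and real systems would call actual inference endpoints.

```python
from typing import Callable, Dict

# A "model" here is any callable from prompt text to response text.
ModelFn = Callable[[str], str]


def make_router(specialists: Dict[str, ModelFn], generalist: ModelFn) -> Callable[[str, str], str]:
    """Build a dispatcher that prefers a task-specific SLM when one exists."""

    def route(task: str, prompt: str) -> str:
        # Fall back to the general-purpose LLM for tasks without a specialist.
        model = specialists.get(task, generalist)
        return model(prompt)

    return route


# Stub models standing in for real inference endpoints (hypothetical).
def slm_field_extractor(prompt: str) -> str:
    return f"[slm:extract] {prompt}"


def general_llm(prompt: str) -> str:
    return f"[llm:general] {prompt}"


route = make_router({"extract_fields": slm_field_extractor}, general_llm)
print(route("extract_fields", "Invoice 17"))
print(route("open_dialogue", "Help me plan a launch."))
```

In this sketch the narrow, repetitive task goes to the specialist, while open-ended conversation falls through to the general model.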

Submitted to arXiv on 02 Jun. 2025


Results of the summarizing process for the arXiv paper: 2506.02153v1

The rise of agentic AI systems has sparked a debate over whether large language models (LLMs) or small language models (SLMs) should power them. LLMs are praised for near-human performance and conversational ability, but agentic systems mostly invoke language models for a small number of specialized tasks, repeated with little variation. The authors therefore argue that SLMs are sufficiently powerful, more suitable, and more economical for many invocations in agentic systems, which makes them the future of agentic AI.

To support this position, the paper outlines an LLM-to-SLM agent conversion process that runs from data curation and filtering (S2) through SLM selection (S4) and specialized SLM fine-tuning (S5) to continuous iteration and refinement (S6). The process involves collecting usage data, clustering it into tasks, selecting appropriate SLMs based on criteria such as capabilities and benchmark performance, fine-tuning them on task-specific datasets, and continuously refining the models with new data to adapt to changing usage patterns.

The authors highlight the transformative potential of agentic AI in white-collar work and beyond, and emphasize the importance of cost savings and sustainability in AI infrastructure. They invite contributions to and critiques of their position by email and commit to publishing all such correspondence on their website. Overall, the paper makes the case for adopting SLMs in agentic AI systems to improve efficiency, reduce costs, and enhance performance in specialized, task-oriented applications.
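The early steps of the conversion process summarized above (curation, task clustering, and choosing which tasks merit a specialized SLM) can be sketched as follows. This is an illustrative simplification under stated assumptions: the `CallRecord` schema, the `contains_sensitive_data` flag, and the `min_examples` threshold are hypothetical, and clustering is reduced to grouping by a logged task label rather than any technique the paper prescribes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """One logged language model call made by an agent (hypothetical schema)."""
    task: str                      # coarse task label, e.g. the tool or prompt template
    prompt: str
    response: str
    contains_sensitive_data: bool = False


def curate(records):
    """Curation and filtering: drop sensitive records and empty responses."""
    return [r for r in records if not r.contains_sensitive_data and r.response.strip()]


def cluster_by_task(records):
    """Task clustering, simplified here to grouping by the logged task label."""
    clusters = defaultdict(list)
    for r in records:
        clusters[r.task].append(r)
    return dict(clusters)


def select_candidates(clusters, min_examples=3):
    """Keep only clusters with enough examples to justify fine-tuning an SLM."""
    return {task: recs for task, recs in clusters.items() if len(recs) >= min_examples}


logs = [
    CallRecord("extract_fields", "Invoice A", "{}"),
    CallRecord("extract_fields", "Invoice B", "{}"),
    CallRecord("extract_fields", "Invoice C", "{}"),
    CallRecord("extract_fields", "Invoice D", "{}", contains_sensitive_data=True),
    CallRecord("open_dialogue", "Plan my week", "Sure, here is a plan."),
    CallRecord("summarize_ticket", "Ticket 1", ""),
]

candidates = select_candidates(cluster_by_task(curate(logs)), min_examples=3)
print({task: len(recs) for task, recs in candidates.items()})
```

Each surviving cluster would then feed the later steps (SLM selection, fine-tuning, and periodic refinement as new logs arrive), which are out of scope for this sketch.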
Created on 29 Sep. 2025

