Continuous Autoregressive Language Models

AI-generated keywords: Continuous Autoregressive Language Models, High-fidelity Autoencoder, Next-vector Prediction, Likelihood-free Framework, Ultra-efficient Language Models

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Continuous Autoregressive Language Models (CALM) are introduced as a shift from discrete next-token prediction to continuous next-vector prediction
  • CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector, cutting the number of generative steps by a factor of K and improving the performance-compute trade-off
  • A comprehensive likelihood-free framework supports robust training, evaluation, and controllable sampling in the continuous domain
  • In experiments, CALM matches the performance of strong discrete baselines at a significantly lower computational cost
  • The findings establish next-vector prediction as a scalable pathway towards ultra-efficient language models
  • The authors release their code on GitHub (https://github.com/shaochenze/calm) and describe the project on a dedicated website (https://shaochenze.github.io/blog/2025/CALM)

Authors: Chenze Shao, Darren Li, Fandong Meng, Jie Zhou

Abstract: The efficiency of large language models (LLMs) is fundamentally limited by their sequential, token-by-token generation process. We argue that overcoming this bottleneck requires a new design axis for LLM scaling: increasing the semantic bandwidth of each generative step. To this end, we introduce Continuous Autoregressive Language Models (CALM), a paradigm shift from discrete next-token prediction to continuous next-vector prediction. CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector, from which the original tokens can be reconstructed with over 99.9% accuracy. This allows us to model language as a sequence of continuous vectors instead of discrete tokens, which reduces the number of generative steps by a factor of K. The paradigm shift necessitates a new modeling toolkit; therefore, we develop a comprehensive likelihood-free framework that enables robust training, evaluation, and controllable sampling in the continuous domain. Experiments show that CALM significantly improves the performance-compute trade-off, achieving the performance of strong discrete baselines at a significantly lower computational cost. More importantly, these findings establish next-vector prediction as a powerful and scalable pathway towards ultra-efficient language models. Code: https://github.com/shaochenze/calm. Project: https://shaochenze.github.io/blog/2025/CALM.
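
To make the "factor of K" claim in the abstract concrete, here is a minimal sketch of the bookkeeping only: it groups a token sequence into chunks of K and counts autoregressive steps. It does not implement the autoencoder. The function names, the padding of a short final chunk, and the `pad_id` value are assumptions made for illustration; the abstract does not say how a trailing partial chunk is handled.

```python
from math import ceil


def chunk_tokens(token_ids, k, pad_id=0):
    """Split token ids into consecutive chunks of k tokens.

    Padding the last chunk with pad_id is an assumption for this sketch,
    not something stated in the abstract.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    chunks = []
    for start in range(0, len(token_ids), k):
        chunk = list(token_ids[start:start + k])
        chunk += [pad_id] * (k - len(chunk))  # pad a short final chunk
        chunks.append(chunk)
    return chunks


def generative_steps(num_tokens, k):
    """Number of autoregressive steps when each step emits k tokens."""
    return ceil(num_tokens / k)


tokens = list(range(1, 11))          # 10 hypothetical token ids
chunks = chunk_tokens(tokens, k=4)   # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 0, 0]]
steps_discrete = generative_steps(len(tokens), 1)  # 10 steps, one per token
steps_chunked = generative_steps(len(tokens), 4)   # 3 steps, one per chunk
```

In the setup the abstract describes, each chunk would be encoded into one continuous vector, so the number of chunks equals the number of generative steps.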

Submitted to arXiv on 31 Oct. 2025

Results of the summarizing process for the arXiv paper: 2510.27688v1

The authors introduce Continuous Autoregressive Language Models (CALM), a shift from discrete next-token prediction to continuous next-vector prediction. A high-fidelity autoencoder compresses each chunk of K tokens into a single continuous vector, which reduces the number of generative steps by a factor of K and improves the performance-compute trade-off. Because the model operates in the continuous domain, the authors develop a comprehensive likelihood-free framework that supports robust training, evaluation, and controllable sampling. Their experiments show that CALM achieves the performance of strong discrete baselines at a lower computational cost, and the findings establish next-vector prediction as a scalable pathway towards ultra-efficient language models. The code is available on GitHub, and the project is described further on a dedicated website.
Created on 06 Nov. 2025
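
The summary above mentions a likelihood-free framework but, being based on metadata only, does not say which objective or metric the authors use. As a generic illustration of what "likelihood-free" can mean, the sketch below estimates the energy score, a standard sample-based scoring rule that compares samples drawn from a model against an observed target vector without evaluating a probability density. Choosing the energy score here is an assumption for illustration and should not be read as the paper's method; the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_score(samples, target):
    """Sample-based estimate of the energy score (lower is better).

    ES = E||X - y|| - 0.5 * E||X - X'||, where X and X' are independent
    draws from the model and y is the observed target. Only samples are
    needed; no density is evaluated.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Average distance from each sample to the target (fit term).
    fit = sum(euclidean(s, target) for s in samples) / n
    # Average distance between distinct sample pairs (spread term).
    pair_sum = sum(
        euclidean(samples[i], samples[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    spread = 2.0 * pair_sum / (n * (n - 1))
    return fit - 0.5 * spread


target = [1.0, 1.0]
near = [[0.9, 1.1], [1.1, 0.9], [1.0, 1.0]]   # hypothetical model samples
far = [[4.0, 5.0], [5.0, 4.0], [4.5, 4.5]]
better = energy_score(near, target) < energy_score(far, target)  # True
```

The fit term rewards samples that land close to the target, while the spread term stops the score from favouring a model that collapses all samples onto a single point.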
