A Survey of Large Language Models

AI-generated keywords: Large Language Models (LLMs), Pre-training, Adaptation Tuning, Utilization, Capacity Evaluation

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Language is a complex system governed by grammatical rules
  • Modeling language has evolved from statistical models to neural models over the past two decades
  • Pre-trained language models (PLMs) have been proposed using Transformer models pre-trained over large corpora
  • Model scaling can lead to performance improvement
  • Large language models (LLMs) are PLMs of significant size; once the parameter scale exceeds a certain level, they show special abilities not present in small-scale language models
  • Wayne Xin Zhao et al. review recent advances in LLMs focusing on pre-training, adaptation tuning, utilization, and capacity evaluation
  • Pre-training is crucial for achieving high performance on downstream tasks while adaptation tuning aims at fine-tuning PLMs on specific tasks or domains
  • Utilization involves using PLMs as building blocks to construct more complex systems such as chatbots or question-answering systems
  • Capacity evaluation assesses whether larger models are necessary for specific tasks
  • Ethical implications of LLMs include exacerbating existing biases or generating fake news
  • Researchers and developers should be aware of these issues and work towards mitigating them

Authors: Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

ongoing work; 51 pages

Abstract: Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural language models. Recently, pre-trained language models (PLMs) have been proposed by pre-training Transformer models over large-scale corpora, showing strong capabilities in solving various NLP tasks. Since researchers have found that model scaling can lead to performance improvement, they further study the scaling effect by increasing the model size to an even larger size. Interestingly, when the parameter scale exceeds a certain level, these enlarged language models not only achieve a significant performance improvement but also show some special abilities that are not present in small-scale language models. To discriminate the difference in parameter scale, the research community has coined the term large language models (LLM) for the PLMs of significant size. Recently, the research on LLMs has been largely advanced by both academia and industry, and a remarkable progress is the launch of ChatGPT, which has attracted widespread attention from society. The technical evolution of LLMs has been making an important impact on the entire AI community, which would revolutionize the way how we develop and use AI algorithms. In this survey, we review the recent advances of LLMs by introducing the background, key findings, and mainstream techniques. In particular, we focus on four major aspects of LLMs, namely pre-training, adaptation tuning, utilization, and capacity evaluation. Besides, we also summarize the available resources for developing LLMs and discuss the remaining issues for future directions.

Submitted to arXiv on 31 Mar. 2023


Results of the summarizing process for the arXiv paper: 2303.18223v1


Language is a complex and intricate system of human expression governed by grammatical rules. Language modeling has been widely studied as an approach to understanding and generating language, evolving from statistical models to neural models over the past two decades. More recently, pre-trained language models (PLMs), built by pre-training Transformer models over large corpora, have shown strong capabilities in solving various NLP tasks. Researchers have found that scaling up these models improves performance. Moreover, once the parameter scale exceeds a certain level, the enlarged models not only achieve significant performance gains but also show special abilities that are not present in small-scale language models. To mark this difference in parameter scale, the research community coined the term large language models (LLMs) for PLMs of significant size.

Wayne Xin Zhao et al. review recent advances in LLMs by introducing the background, key findings, and mainstream techniques, focusing on four major aspects: pre-training, adaptation tuning, utilization, and capacity evaluation. They also summarize available resources for developing LLMs and discuss remaining issues for future directions. They highlight that pre-training is crucial for achieving high performance on downstream tasks, while adaptation tuning aims at fine-tuning PLMs on specific tasks or domains. Utilization involves using PLMs as building blocks for more complex systems such as chatbots or question-answering systems, and capacity evaluation assesses whether larger models are necessary for specific tasks. The authors also discuss ethical implications of LLMs, such as their potential to exacerbate existing biases or generate fake news, and suggest that researchers and developers should be aware of these issues and work towards mitigating them.

This survey provides a comprehensive overview of recent advances in LLMs and highlights their potential impact on the AI community, serving as a valuable resource for those interested in this rapidly evolving field.
Created on 21 Apr. 2023


