OneRec Technical Report
AI-generated keywords:
Recommender Systems
Multi-stage Cascaded Architectures
Artificial Intelligence
OneRec
End-to-end Generative Framework
- Longstanding reliance on multi-stage cascaded architectures in recommender systems
- Limitations due to computational fragmentation and optimization inconsistencies
- Development of OneRec by Guorui Zhou, Jiaxin Deng, and team as a groundbreaking approach
- OneRec reshapes recommendation systems through an end-to-end generative framework
- Increases the computational FLOPs of the current recommendation model by 10 times and identifies scaling laws for recommendation within certain boundaries
- Shows that reinforcement learning techniques, previously difficult to apply to recommendation, have significant potential in this framework
- Achieves 23.7% (training) and 28.8% (inference) Model FLOPs Utilization (MFU) on flagship GPUs through infrastructure optimizations
- Deployment in Kuaishou/Kuaishou Lite APP handles 25% of total queries per second and increases overall App Stay Time by 0.54% and 1.24%, respectively
- Significant increases in metrics such as 7-day Lifetime, a crucial indicator of recommendation experience
- Operating expense is only 10.6% of that of traditional recommendation pipelines, owing to reduced communication and storage overhead
- Technical report authored by Zhou et al. provides insights into the development and optimization process behind OneRec, with real-world implications for production-scale recommendation systems
Authors:
Guorui Zhou,
Jiaxin Deng,
Jinghao Zhang,
Kuo Cai,
Lejian Ren,
Qiang Luo,
Qianqian Wang,
Qigen Hu,
Rui Huang,
Shiyao Wang,
Weifeng Ding,
Wuchao Li,
Xinchen Luo,
Xingmei Wang,
Zexuan Cheng,
Zixing Zhang,
Bin Zhang,
Boxuan Wang,
Chaoyi Ma,
Chengru Song,
Chenhui Wang,
Di Wang,
Dongxue Meng,
Fan Yang,
Fangyu Zhang,
Feng Jiang,
Fuxing Zhang,
Gang Wang,
Guowang Zhang,
Han Li,
Hengrui Hu,
Hezheng Lin,
Hongtao Cheng,
Hongyang Cao,
Huanjie Wang,
Jiaming Huang,
Jiapeng Chen,
Jiaqiang Liu,
Jinghui Jia,
Kun Gai,
Lantao Hu,
Liang Zeng,
Liao Yu,
Qiang Wang,
Qidong Zhou,
Shengzhe Wang,
Shihui He,
Shuang Yang,
Shujie Yang,
Sui Huang,
Tao Wu,
Tiantian He,
Tingting Gao,
Wei Yuan,
Xiao Liang,
Xiaoxiao Xu,
Xugang Liu,
Yan Wang,
Yi Wang,
Yiwu Liu,
Yue Song,
Yufei Zhang,
Yunfan Wu,
Yunfeng Zhao,
Zhanyu Liu
Authors are listed alphabetically by their first name
Abstract: Recommender systems have been widely used in various large-scale user-oriented platforms for many years. However, compared to the rapid developments in the AI community, recommendation systems have not achieved a breakthrough in recent years. For instance, they still rely on a multi-stage cascaded architecture rather than an end-to-end approach, leading to computational fragmentation and optimization inconsistencies, and hindering the effective application of key breakthrough technologies from the AI community in recommendation scenarios. To address these issues, we propose OneRec, which reshapes the recommendation system through an end-to-end generative approach and achieves promising results. Firstly, we have enhanced the computational FLOPs of the current recommendation model by 10 $\times$ and have identified the scaling laws for recommendations within certain boundaries. Secondly, reinforcement learning techniques, previously difficult to apply for optimizing recommendations, show significant potential in this framework. Lastly, through infrastructure optimizations, we have achieved 23.7% and 28.8% Model FLOPs Utilization (MFU) on flagship GPUs during training and inference, respectively, aligning closely with the LLM community. This architecture significantly reduces communication and storage overhead, resulting in operating expense that is only 10.6% of traditional recommendation pipelines. Deployed in Kuaishou/Kuaishou Lite APP, it handles 25% of total queries per second, enhancing overall App Stay Time by 0.54% and 1.24%, respectively. Additionally, we have observed significant increases in metrics such as 7-day Lifetime, which is a crucial indicator of recommendation experience. We also provide practical lessons and insights derived from developing, optimizing, and maintaining a production-scale recommendation system with significant real-world impact.
Submitted to arXiv on 16 Jun. 2025
Recommender systems have long relied on multi-stage cascaded architectures that have not kept pace with the rapid advancements in artificial intelligence. This has led to computational fragmentation and optimization inconsistencies, limiting the effective integration of key AI breakthroughs into recommendation scenarios. To address these challenges, Guorui Zhou, Jiaxin Deng, and their colleagues developed OneRec, an approach that reshapes recommendation systems through an end-to-end generative framework.
OneRec increases the computational FLOPs of the current recommendation model by 10 times and identifies scaling laws for recommendation within certain boundaries. Reinforcement learning techniques, previously difficult to apply to recommendation, show significant potential in this framework. Through infrastructure optimizations, the team achieved Model FLOPs Utilization (MFU) of 23.7% during training and 28.8% during inference on flagship GPUs, which the authors describe as aligning closely with the LLM community.
Deployed in the Kuaishou/Kuaishou Lite APP, OneRec handles 25% of total queries per second and increases overall App Stay Time by 0.54% and 1.24%, respectively. Metrics such as 7-day Lifetime, a crucial indicator of recommendation experience, have also increased significantly. Because the architecture reduces communication and storage overhead, its operating expense is only 10.6% of that of traditional recommendation pipelines.
The technical report by Zhou et al. presents the development and optimization process behind OneRec and offers practical lessons from developing, optimizing, and maintaining a production-scale recommendation system.
Summary
- For a long time, recommender systems have picked what to show you by passing items through a chain of separate stages.
- This old design has problems: the computing work is split into pieces, and the stages are not all tuned toward the same goal.
- A new method called OneRec was created by Guorui Zhou, Jiaxin Deng, and their team.
- OneRec replaces the chain of stages with one model that generates recommendations from start to finish.
- It puts about 10 times more computation into the model than the current one, makes good use of powerful GPUs, and costs far less to run (10.6% of the operating expense of a traditional pipeline).
Definitions
- Recommender systems: Tools that suggest things you might like based on your past preferences or behavior.
- Framework: A basic structure or set of ideas that something is built upon.
- FLOPs: Floating-point operations, a count of the arithmetic calculations a model performs. This differs from FLOPS (floating-point operations per second), which measures how fast hardware can calculate.
- Reinforcement learning: A type of machine learning where the system learns through trial and error based on rewards or punishments.
- Inference: The stage where an already trained model is used to produce outputs (here, recommendations), as opposed to the training stage where the model learns.
Recommender systems have become an integral part of our daily lives, providing personalized recommendations for products, services, and content. However, the traditional multi-stage cascaded architectures used in these systems have not kept pace with the rapid advancements in artificial intelligence (AI). This has led to computational fragmentation and optimization inconsistencies, limiting the effective integration of key AI breakthroughs into recommendation scenarios.
To address these challenges, Guorui Zhou, Jiaxin Deng, and their colleagues developed OneRec, an approach that reshapes recommender systems through an end-to-end generative framework. Their paper, "OneRec Technical Report", presents their findings and insights on this approach.
The Need for Change
Traditional recommender systems rely on multi-stage cascaded architectures in which each stage narrows the candidate set further, typically retrieval, then pre-ranking, then ranking. While this approach has been successful in the past, it has not evolved to keep up with recent advancements in AI. As a result, new technologies are not integrated efficiently into recommendation scenarios.
The cascade leads to computational fragmentation and optimization inconsistencies: compute is split across separately built stages, and each stage is tuned toward its own objective rather than the final outcome, which makes it hard to optimize the system as a whole. According to the authors, the cascaded design also hinders the use of techniques such as reinforcement learning, which had previously been difficult to apply to optimizing recommendations.
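To make the cascade concrete, here is a minimal sketch of a three-stage funnel. It assumes the common retrieval, pre-ranking, and ranking split; the stage sizes, the noise levels, and the names `run_cascade` and `noisy_scorer` are hypothetical and are not taken from the report. It illustrates one consequence of the design: an item discarded by an early, coarse stage can never be recovered by a later, more precise one.

```python
import random

def run_cascade(candidates, stages):
    """Pass candidates through successive (score_fn, keep_n) stages."""
    survivors = list(candidates)
    for score_fn, keep_n in stages:
        # Each stage ranks with its own scorer and keeps only its top keep_n.
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep_n]
    return survivors

rng = random.Random(0)

# Hypothetical catalogue: every item has a hidden "true" relevance.
items = [{"id": i, "relevance": rng.random()} for i in range(10_000)]

def noisy_scorer(noise):
    """Build a scorer that sees relevance blurred by Gaussian noise."""
    def score(item):
        return item["relevance"] + rng.gauss(0.0, noise)
    return score

stages = [
    (noisy_scorer(0.30), 1000),  # retrieval: cheap and coarse
    (noisy_scorer(0.10), 100),   # pre-ranking: moderately precise
    (noisy_scorer(0.01), 10),    # ranking: expensive and precise
]

final = run_cascade(items, stages)
ideal = sorted(items, key=lambda it: it["relevance"], reverse=True)[:10]
missed = len({it["id"] for it in ideal} - {it["id"] for it in final})
print(f"returned {len(final)} items; {missed} of the ideal top 10 were lost")
```

An end-to-end model avoids this hand-off between independently tuned scorers, which is the motivation the report gives for replacing the cascade.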
Introducing OneRec
In response to these challenges, Zhou et al. developed OneRec, an end-to-end generative framework for recommendation. The key idea behind OneRec is to replace the traditional multi-stage architecture with a single generative model that produces recommendations end-to-end. Within this framework, reinforcement learning techniques, previously difficult to apply to recommendation, show significant potential.
By doing so, the team increased the computational FLOPs (floating-point operations) of the current recommendation model by 10 times and identified scaling laws for recommendation within certain boundaries. A scaling law describes how model quality changes predictably as model size or compute grows, so within the tested range it indicates what return to expect from a larger model.
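As an illustration of what identifying a scaling law can involve, the sketch below fits a power law of the form loss ≈ a · compute^(-b) by least squares in log-log space. The power-law form is an assumption borrowed from scaling-law work on language models; this summary does not state the functional form used in the report. The data are synthetic, generated from a = 5.0 and b = 0.1, and `fit_power_law` is a hypothetical helper.

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of loss ~ a * compute**(-b) in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(v) for v in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(loss) vs log(compute).
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

# Synthetic points from a = 5.0, b = 0.1; NOT measurements from the report.
compute = [1e18, 1e19, 1e20, 1e21]
loss = [5.0 * c ** -0.1 for c in compute]

a, b = fit_power_law(compute, loss)
print(f"a = {a:.3f}, b = {b:.3f}")
```

The qualifier "within certain boundaries" matters: a fit like this only supports predictions inside the range of model sizes and compute budgets that were actually measured.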
Impressive Results
OneRec has been deployed in the Kuaishou/Kuaishou Lite APP, a popular Chinese video-sharing and social networking platform. The system handles 25% of total queries per second and increases overall App Stay Time by 0.54% on Kuaishou and 1.24% on Kuaishou Lite. This means that users are spending more time on the app, engaging with recommended content.
The authors also report significant increases in metrics such as 7-day Lifetime, which they describe as a crucial indicator of recommendation experience. This points to a positive effect on user engagement and retention.
Efficiency Improvements
In addition to improving user-facing metrics, OneRec uses hardware efficiently. Through infrastructure optimizations, the team achieved Model FLOPs Utilization (MFU) of 23.7% during training and 28.8% during inference on flagship GPUs, which the authors describe as aligning closely with the LLM community.
Furthermore, the architecture significantly reduces communication and storage overhead, resulting in operating expense that is only 10.6% of that of traditional recommendation pipelines. OneRec therefore not only improves performance but also reduces the cost of running a production-scale recommendation system.
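The two efficiency figures can be unpacked with a little arithmetic. The sketch below uses the usual definition of MFU, the model FLOPs actually executed per second divided by the hardware's theoretical peak, and converts the 10.6% operating-expense figure into a saving and a cost ratio. Only 23.7% and 10.6% come from the report; the per-sample FLOPs, throughput, GPU count, and peak rate are hypothetical values chosen so that the result lands on 23.7%, and both function names are illustrative.

```python
def model_flops_utilization(flops_per_sample, samples_per_second,
                            num_gpus, peak_flops_per_gpu):
    """Fraction of theoretical peak throughput spent on model FLOPs."""
    achieved = flops_per_sample * samples_per_second  # model FLOPs per second
    return achieved / (num_gpus * peak_flops_per_gpu)

def cost_summary(opex_fraction):
    """Convert 'new cost as a fraction of old cost' into saving and ratio."""
    return {
        "saving": 1.0 - opex_fraction,
        "times_cheaper": 1.0 / opex_fraction,
    }

# Hypothetical workload: 2e12 FLOPs per sample, 948 samples/s,
# 8 GPUs with an assumed peak of 1e15 FLOP/s each.
mfu = model_flops_utilization(2e12, 948, 8, 1e15)

# 10.6% is the operating-expense figure quoted from the report.
summary = cost_summary(0.106)

print(f"MFU = {mfu:.1%}")
print(f"saving = {summary['saving']:.1%}, "
      f"about {summary['times_cheaper']:.1f}x cheaper")
```

Read this way, operating expense at 10.6% of the traditional pipeline corresponds to a saving of roughly 89%, or a pipeline that is a little over nine times cheaper to run.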
Real-World Implications
The technical report authored by Zhou et al. not only presents the development and optimization process behind OneRec but also offers valuable insights derived from maintaining a production-scale recommendation system with tangible real-world implications. The comprehensive approach taken by the research team showcases how cutting-edge technologies can be effectively harnessed to revolutionize recommender systems and enhance user experiences across diverse platforms.
Conclusion
In conclusion, the "OneRec Technical Report" by Zhou et al. presents an approach that reshapes recommender systems through an end-to-end generative framework. By replacing the traditional multi-stage architecture with a single model that generates recommendations end-to-end, OneRec raises model compute by 10 times while cutting operating expense to 10.6% of that of traditional pipelines. Its deployment in the Kuaishou/Kuaishou Lite APP handles 25% of total queries per second and increases App Stay Time by 0.54% and 1.24%, respectively. The report also offers practical lessons for developing, optimizing, and maintaining production-scale recommendation systems.