Large Language Models for Compiler Optimization

AI-generated keywords: Large Language Models, Code Optimization, LLVM Assembly, Instruction Counts, Compiler

AI-generated Key Points

  • Large Language Models (LLMs) applied to code optimization
  • 7B-parameter transformer model trained from scratch for LLVM assembly optimization
  • Model predicts instruction counts before and after optimization, as well as optimized code
  • Auxiliary learning tasks enhance model's performance and understanding
  • Achieves 3.0% improvement in reducing instruction counts compared to the compiler
  • Outperforms two state-of-the-art baselines requiring thousands of compilations
  • Strong code reasoning abilities: generates compilable code 91% of the time, emulates compiler output 70% of the time
  • Unique focus on optimizing code compared to other LLMs trained on source code for different tasks
  • Demonstrates potential of LLMs in improving code performance through automated optimizations

Authors: Chris Cummins, Volker Seeker, Dejan Grubisic, Mostafa Elhoushi, Youwei Liang, Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Kim Hazelwood, Gabriel Synnaeve, Hugh Leather

License: CC BY 4.0

Abstract: We explore the novel application of Large Language Models to code optimization. We present a 7B-parameter transformer model trained from scratch to optimize LLVM assembly for code size. The model takes as input unoptimized assembly and outputs a list of compiler options to best optimize the program. Crucially, during training, we ask the model to predict the instruction counts before and after optimization, and the optimized code itself. These auxiliary learning tasks significantly improve the optimization performance of the model and improve the model's depth of understanding. We evaluate on a large suite of test programs. Our approach achieves a 3.0% improvement in reducing instruction counts over the compiler, outperforming two state-of-the-art baselines that require thousands of compilations. Furthermore, the model shows surprisingly strong code reasoning abilities, generating compilable code 91% of the time and perfectly emulating the output of the compiler 70% of the time.
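The training setup described in the abstract (one input, one main target, three auxiliary targets) can be pictured as a single training sample. The following is a minimal sketch only: the field names, the text layout of the target, and the example pass list are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    """One hypothetical training sample; all field names are illustrative."""
    unoptimized_ir: str    # model input: unoptimized LLVM assembly
    pass_list: list[str]   # main target: compiler options to apply
    count_before: int      # auxiliary target: instruction count of the input
    count_after: int       # auxiliary target: instruction count after optimization
    optimized_ir: str      # auxiliary target: the optimized code itself


def build_target(ex: TrainingExample) -> str:
    """Serialize all targets into one answer string (layout is an assumption)."""
    return "\n".join([
        "passes: " + ",".join(ex.pass_list),
        f"instruction_count_before: {ex.count_before}",
        f"instruction_count_after: {ex.count_after}",
        "optimized_code:",
        ex.optimized_ir,
    ])


# Hand-written toy sample, not real model or compiler output.
example = TrainingExample(
    unoptimized_ir="""
define i32 @add(i32 %a, i32 %b) {
  %p = alloca i32
  %s = add i32 %a, %b
  store i32 %s, ptr %p
  %r = load i32, ptr %p
  ret i32 %r
}
""".strip(),
    pass_list=["mem2reg", "instcombine"],
    count_before=5,
    count_after=2,
    optimized_ir="""
define i32 @add(i32 %a, i32 %b) {
  %s = add i32 %a, %b
  ret i32 %s
}
""".strip(),
)

target = build_target(example)
print(target)
```

At inference time only the pass list is needed to drive the compiler; the abstract presents the instruction counts and optimized code as auxiliary tasks that improve what the model learns.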

Submitted to arXiv on 11 Sep. 2023


Results of the summarizing process for the arXiv paper: 2309.07062v1

This paper applies Large Language Models (LLMs) to code optimization. The authors present a 7B-parameter transformer model, trained from scratch, that optimizes LLVM assembly for code size. The model takes unoptimized assembly as input and outputs a list of compiler options that best optimize the program. During training, the model is also asked to predict the instruction counts before and after optimization, as well as the optimized code itself. These auxiliary learning tasks significantly improve the model's optimization performance and deepen its understanding of the code.

The authors evaluate their approach on a large suite of test programs and find that it achieves a 3.0% improvement in reducing instruction counts over the compiler, outperforming two state-of-the-art baselines that require thousands of compilations. The model also demonstrates strong code reasoning abilities, generating compilable code 91% of the time and perfectly emulating the output of the compiler 70% of the time.

Previous LLMs have been trained on source code for tasks such as code search, summarization, and documentation generation, and most LLMs are trained at least partly on code; this work differs in its focus on optimizing code. Overall, the research shows that LLMs can be used effectively for compiler optimization and highlights their potential to improve code performance through automated optimizations.
Created on 15 Sep. 2023
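
To make the headline metric concrete, here is a minimal sketch of one plausible way to compute an improvement in instruction count over a compiler baseline. The summary does not state how the paper aggregates results across programs, so summing counts over the whole test set is an assumption, and the numbers below are invented for illustration rather than taken from the paper.

```python
def instruction_count_improvement(baseline_counts, model_counts):
    """Percent reduction in total instruction count relative to the baseline.

    baseline_counts[i] and model_counts[i] are the instruction counts of
    program i when optimized by the compiler default and by the
    model-suggested options, respectively.
    """
    if len(baseline_counts) != len(model_counts):
        raise ValueError("both lists must cover the same programs")
    baseline_total = sum(baseline_counts)
    if baseline_total == 0:
        raise ValueError("baseline total must be non-zero")
    model_total = sum(model_counts)
    return 100.0 * (baseline_total - model_total) / baseline_total


# Invented counts for three programs (not data from the paper).
baseline = [100, 200, 700]
model = [95, 200, 675]
improvement = instruction_count_improvement(baseline, model)
print(f"{improvement:.1f}% fewer instructions than the baseline")
```

A positive value means the model-suggested options produced fewer instructions in total than the baseline; a negative value would mean a regression.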
