Moccasin: Efficient Tensor Rematerialization for Neural Networks
AI-generated Key Points
- Deploying and training neural networks on edge computing devices is challenging because these devices have little memory
- Tensor rematerialization (recomputation) addresses the high memory requirements of neural network training and inference by discarding some intermediate tensors and recomputing them when they are needed again
- MOCCASIN is a new constraint programming formulation that minimizes execution time of compute graphs subject to a memory budget
- MOCCASIN uses only O(n) integer variables, where n is the number of nodes in the compute graph, a significant improvement over recent formulations that need O(n^2) Boolean variables
- Retention interval formulation for rematerialization simplifies problem formulation greatly by defining output retention intervals for each node in the computation graph
- The parameter C_v caps the number of times a node v can be computed in the final sequence; restricting the search this way retains solution quality even for very small values of C_v
- MOCCASIN is up to an order of magnitude faster than recent work, especially for large-scale graphs
- Empirical results demonstrate MOCCASIN's effectiveness compared to other recent works and highlight its scalability to larger graphs
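The retention-interval idea in the key points above can be illustrated with a toy evaluator. This is a minimal sketch and not the paper's constraint programming model: the function `evaluate`, the discrete step axis, the unit durations, and the five-node example graph are all assumptions made for illustration. Each node gets a list of `(start, end)` intervals, meaning its output is computed at step `start` and kept in memory through step `end`; a node with two intervals is rematerialized once.

```python
from typing import Dict, List, Tuple

# (start, end): output is computed at step `start` and retained through step `end`.
Interval = Tuple[int, int]


def evaluate(
    intervals: Dict[str, List[Interval]],
    preds: Dict[str, List[str]],
    duration: Dict[str, int],
    size: Dict[str, int],
) -> Tuple[int, int]:
    """Return (total_duration, peak_memory) for a set of retention intervals.

    Raises ValueError if two computations share a step or if a node is
    computed while one of its inputs is not held in memory.
    """
    # Every interval start is one computation of that node.
    events = [(s, v) for v, ivs in intervals.items() for (s, _) in ivs]
    steps = [s for s, _ in events]
    if len(set(steps)) != len(steps):
        raise ValueError("two computations share a step")

    for s, v in events:
        for u in preds.get(v, ()):
            # The input u must have been computed before step s and
            # still be retained at step s.
            if not any(a < s <= b for a, b in intervals[u]):
                raise ValueError(f"{u} is not in memory when {v} runs at step {s}")

    total_duration = sum(duration[v] for _, v in events)
    last_step = max(e for ivs in intervals.values() for _, e in ivs)
    peak_memory = max(
        sum(size[v] for v, ivs in intervals.items() if any(a <= t <= b for a, b in ivs))
        for t in range(last_step + 1)
    )
    return total_duration, peak_memory


# Hypothetical graph: a -> b -> c -> d, and e consumes both a and d.
preds = {"b": ["a"], "c": ["b"], "d": ["c"], "e": ["a", "d"]}
duration = {v: 1 for v in "abcde"}
size = {"a": 3, "b": 3, "c": 3, "d": 1, "e": 1}

# Each node computed once; a is held from step 0 until e uses it at step 4.
no_remat = {"a": [(0, 4)], "b": [(1, 2)], "c": [(2, 3)], "d": [(3, 4)], "e": [(4, 4)]}

# a is dropped after b uses it and recomputed at step 4, just before e.
remat = {"a": [(0, 1), (4, 5)], "b": [(1, 2)], "c": [(2, 3)], "d": [(3, 5)], "e": [(5, 5)]}

print(evaluate(no_remat, preds, duration, size))  # (5, 9)
print(evaluate(remat, preds, duration, size))     # (6, 6)
```

In this toy case, recomputing `a` once lowers peak memory from 9 to 6 units at the cost of one extra unit of execution time. A solver in the spirit of the summary would search over the interval endpoints to minimize total duration under a memory budget, rather than evaluating hand-written intervals as done here.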
Authors: Burak Bartan, Haoming Li, Harris Teague, Christopher Lott, Bistra Dilkina
Abstract: The deployment and training of neural networks on edge computing devices pose many challenges. The low memory nature of edge devices is often one of the biggest limiting factors encountered in the deployment of large neural network models. Tensor rematerialization or recompute is a way to address high memory requirements for neural network training and inference. In this paper we consider the problem of execution time minimization of compute graphs subject to a memory budget. In particular, we develop a new constraint programming formulation called Moccasin with only $O(n)$ integer variables, where $n$ is the number of nodes in the compute graph. This is a significant improvement over the works in the recent literature that propose formulations with $O(n^2)$ Boolean variables. We present numerical studies that show that our approach is up to an order of magnitude faster than recent work, especially for large-scale graphs.
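The variable counts quoted in the abstract can be made concrete with a rough tally. The following is a sketch under stated assumptions, not the paper's exact model: it assumes each node $v$ is allotted at most $C_v$ retention intervals, each described by one integer start and one integer end, and that all $C_v$ are bounded by a constant $C$. It contrasts this with a hypothetical formulation that keeps one Boolean per (node, step) pair over $n$ steps.

```latex
% Assumed interval-based encoding: two integers per allowed interval.
\[
  \#\text{integer variables}
  \;=\; \sum_{v=1}^{n} 2\,C_v
  \;\le\; 2\,C\,n
  \;=\; O(n)
  \qquad \text{for constant } C .
\]
% Hypothetical per-step encoding: one Boolean per (node, step) pair.
\[
  \#\text{Boolean variables}
  \;=\; n \cdot n
  \;=\; O(n^2).
\]
% Illustration: n = 1000 and C = 2 give at most 4000 integers
% versus 10^6 Booleans.
```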