LUT-NN: Towards Unified Neural Network Inference by Table Lookup

AI-generated keywords: DNN Inference, LUT-NN, Centroid Learning, Table Lookup, Hardware Design

AI-generated Key Points

  • Deep Neural Network (DNN) inference is computationally intensive and costly
  • LUT-NN is a technique for DNN inference by table lookup
  • LUT-NN learns typical features, called centroids, of each layer from training data
  • The output of each centroid with the model weights is precomputed and saved in tables for use on future inputs
  • LUT-NN achieves comparable accuracy (<5% difference) with original models on real complex datasets such as CIFAR, ImageNet, and GLUE
  • LUT-NN simplifies computing operators to only two: closest centroid search and table lookup
  • LUT-NN has been implemented for Intel and ARM CPUs, reducing model size by up to 3.5x for CNN models and 7x for BERT, with measured latency speedups of up to 7x for BERT and 2x for ResNet
  • Measured speedups are much lower than the theoretical ones because current hardware is poorly suited to table lookup
  • The authors expect first-class table lookup support in future hardware designs to unleash the full potential of LUT-NN
  • The proposed approach simplifies DNN inference while keeping accuracy close to that of the original models and reducing system development effort and resource cost
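
The two operators named in the key points can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single linear layer, whole-vector centroids chosen by hand, and squared Euclidean distance for the closest-centroid search. The helper names (`build_table`, `closest_centroid`, `lut_layer`) are hypothetical.

```python
def build_table(centroids, weights):
    """Offline step: precompute (centroid x weights) for every centroid."""
    out_dim = len(weights[0])
    return [
        [sum(c[i] * weights[i][j] for i in range(len(c))) for j in range(out_dim)]
        for c in centroids
    ]


def closest_centroid(x, centroids):
    """Operator 1: index of the centroid nearest to x.

    Squared Euclidean distance is an assumption made for this sketch.
    """
    def sq_dist(k):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[k]))

    return min(range(len(centroids)), key=sq_dist)


def lut_layer(x, centroids, table):
    """Operator 2: read the precomputed row for the nearest centroid."""
    return table[closest_centroid(x, centroids)]


# Hypothetical toy layer: 2 inputs -> 3 outputs, 2 hand-picked centroids.
weights = [[1.0, 0.0, 2.0],
           [0.0, 1.0, -1.0]]
centroids = [[0.0, 0.0], [1.0, 1.0]]
table = build_table(centroids, weights)

x = [0.9, 1.2]
approx = lut_layer(x, centroids, table)  # table row of the nearest centroid
exact = [sum(x[i] * weights[i][j] for i in range(2)) for j in range(3)]
```

Here `approx` stands in for `exact` without any multiplication at inference time; how close the two are depends entirely on how well the centroids cover the inputs, which is what the centroid learning step is meant to address.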

Authors: Xiaohu Tang, Yang Wang, Ting Cao, Li Lyna Zhang, Qi Chen, Deng Cai, Yunxin Liu, Mao Yang

License: CC BY 4.0

Abstract: DNN inference requires huge effort of system development and resource cost. This drives us to propose LUT-NN, the first trial towards empowering deep neural network (DNN) inference by table lookup, to eliminate the diverse computation kernels as well as save running cost. Based on the feature similarity of each layer, LUT-NN can learn the typical features, named centroids, of each layer from the training data, precompute them with model weights, and save the results in tables. For future input, the results of the closest centroids with the input features can be directly read from the table, as the approximation of layer output. We propose the novel centroid learning technique for DNN, which enables centroid learning through backpropagation, and adapts three levels of approximation to minimize the model loss. By this technique, LUT-NN achieves comparable accuracy (<5% difference) with original models on real, complex datasets, including CIFAR, ImageNet, and GLUE. LUT-NN simplifies the computing operators to only two: closest centroid search and table lookup. We implement them for Intel and ARM CPUs. The model size is reduced by up to 3.5x for CNN models and 7x for BERT. Latency-wise, the real speedup of LUT-NN is up to 7x for BERT and 2x for ResNet, much lower than theoretical results because of the current unfriendly hardware design for table lookup. We expect first-class table lookup support in the future to unleash the potential of LUT-NN.
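
The approximation described in the abstract can be written compactly. The notation below is ours, not the paper's: it assumes a single linear layer with weight matrix $W$, an input feature vector $x$, and $K$ learned centroids $c_1, \dots, c_K$, with Euclidean distance assumed for the closest-centroid search.

```latex
\begin{align}
  T[k] &= c_k W, \qquad k = 1, \dots, K
       && \text{(precomputed once and stored)} \\
  k^{*}(x) &= \arg\min_{k} \lVert x - c_k \rVert_2^{2}
       && \text{(closest centroid search)} \\
  x W &\approx c_{k^{*}(x)} W = T[k^{*}(x)]
       && \text{(table lookup)}
\end{align}
```

The approximation error is $(x - c_{k^{*}(x)}) W$, so it shrinks as the centroids fit the layer's input features more closely.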

Submitted to arXiv on 07 Feb. 2023

Results of the summarizing process for the arXiv paper: 2302.03213v1

Deep Neural Network (DNN) inference is computationally intensive and demands significant system development effort and resource cost. To address this, the authors propose LUT-NN, a technique that performs DNN inference by table lookup. LUT-NN removes the need for diverse computation kernels and reduces running cost. It learns typical features, called centroids, for each layer from the training data, precomputes the output of each centroid with the model weights, and saves the results in tables. For a new input, the result for the centroid closest to the input features is read directly from the table as an approximation of the layer output.

The authors introduce a centroid learning technique for DNNs that learns centroids through backpropagation and adapts three levels of approximation to minimize model loss. With this technique, LUT-NN achieves accuracy comparable to the original models (<5% difference) on real, complex datasets such as CIFAR, ImageNet, and GLUE.

LUT-NN reduces the computing operators to only two: closest centroid search and table lookup. Both have been implemented for Intel and ARM CPUs. Model size shrinks by up to 3.5x for CNN models and 7x for BERT, and measured latency improves by up to 7x for BERT and 2x for ResNet. These speedups are much lower than the theoretical ones because current hardware is poorly suited to table lookup. The authors expect first-class table lookup support in future hardware designs to unleash the full potential of LUT-NN. Overall, the approach simplifies DNN inference while keeping accuracy close to that of the original models and reducing system development effort and resource cost.
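
To make the gap between theoretical and measured gains more concrete, here is a rough counting sketch. It is a toy cost model under our own assumptions (one dense layer, whole-vector centroids, hypothetical sizes), not the paper's analysis, and its numbers are not results from the paper. It counts only arithmetic and stored values and ignores memory-access cost, which is the part the summary says current hardware handles poorly.

```python
def dense_cost(in_dim, out_dim):
    """Multiply-accumulates and stored values for an ordinary dense layer."""
    return {"macs": in_dim * out_dim, "stored_values": in_dim * out_dim}


def lookup_cost(in_dim, out_dim, num_centroids):
    """Same counts when the layer becomes centroid search plus one table read."""
    return {
        # One distance computation against every centroid.
        "macs": num_centroids * in_dim,
        # The centroids themselves plus one precomputed table row per centroid.
        "stored_values": num_centroids * (in_dim + out_dim),
    }


# Hypothetical sizes chosen only for illustration.
dense = dense_cost(in_dim=512, out_dim=512)
lut = lookup_cost(in_dim=512, out_dim=512, num_centroids=16)

mac_ratio = dense["macs"] / lut["macs"]
size_ratio = dense["stored_values"] / lut["stored_values"]
print(f"arithmetic reduced {mac_ratio:.0f}x, storage reduced {size_ratio:.0f}x")
```

Ratios like these are upper bounds on paper; the table read itself can dominate real latency when the hardware has no efficient lookup path.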
Created on 18 May. 2023