This paper presents YOLO-LITE, a real-time object detection model optimized for non-GPU computers. The model is designed to run on portable devices, such as laptops or cellphones, that lack a Graphics Processing Unit (GPU). YOLO-LITE was trained on the PASCAL VOC and COCO datasets, achieving mean Average Precision (mAP) scores of 33.81% and 12.26% respectively. With only 7 layers and 482 million floating-point operations (FLOPs), it runs at approximately 21 frames per second (FPS) on a non-GPU computer and 10 FPS when deployed on a website. This is 3.8 times faster than the fastest state-of-the-art model, SSD MobileNetV1. Based on the original YOLOv2 object detection algorithm, YOLO-LITE aims to be a smaller, faster, and more efficient model that makes real-time object detection accessible on a wider range of devices. The authors describe the architecture and implementation of YOLO-LITE, including its seven-layer design and computational cost. The authors are Rachel Huang (School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, United States), Jonathan Pedoeem (Electrical Engineering, The Cooper Union, New York), and Cuixian Chen (Mathematics and Statistics, UNC Wilmington, North Carolina).
      
        
        
        
- YOLO-LITE is a real-time object detection algorithm optimized for non-GPU computers
- It is designed to run on portable devices such as laptops or cellphones without a GPU
- YOLO-LITE achieved mean Average Precision (mAP) scores of 33.81% and 12.26% on the PASCAL VOC and COCO datasets respectively
- It runs at approximately 21 frames per second (FPS) on a non-GPU computer and 10 FPS on a website
- YOLO-LITE is 3.8 times faster than the fastest state-of-the-art model, SSD MobileNetV1
- The authors provide detailed information about the architecture and implementation of YOLO-LITE, including its seven-layer design and computational cost
- The paper lists the affiliations of Rachel Huang, Jonathan Pedoeem, and Cuixian Chen
- YOLO-LITE makes real-time object detection accessible on more devices because it is smaller and faster than existing models
 
        
        
        
       
YOLO-LITE is a special computer program that can quickly find and identify objects in real time. It works well on computers and phones without fancy graphics cards. YOLO-LITE is faster than other similar programs, like SSD MobileNetV1. It can find objects at a rate of 21 frames per second on regular computers and 10 frames per second on websites. The people who made YOLO-LITE explain how it works in detail, including the different parts and how it uses less computer power. YOLO-LITE makes it easier for everyone to use object detection because it is smaller and faster than other programs.
Definitions
- Object detection: The ability of a computer program to recognize and locate different objects in images or videos.
- Real-time: Happening immediately or without delay.
- Algorithm: A set of instructions or rules followed by a computer program to solve a problem or perform a task.
- GPU: Graphics Processing Unit, a specialized component in computers that helps with processing graphics and images.
- Frames per second (FPS): A measurement of how many individual images are shown, or in this case processed by the detector, in one second of video.
- State-of-the-art model: The most advanced or best-performing model currently available for a specific task.
- Computational efficiency: How well a computer program uses its resources (such as time, memory, and processing power) to perform tasks effectively.
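To make the FPS definition concrete, here is a minimal sketch of how a throughput figure might be measured. The names `measure_fps`, `process_frame`, and `clock` are hypothetical and chosen for illustration; they are not from the paper, and `process_frame` stands in for whatever detector is being timed.

```python
import time


def measure_fps(process_frame, frames, clock=time.perf_counter):
    """Return how many frames per second `process_frame` handles.

    `process_frame` is a placeholder for any detector call (an assumption,
    not an API from the paper). `clock` is injectable so the function can
    be checked with a fake timer.
    """
    start = clock()
    count = 0
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return count / elapsed
```

In practice a measurement like this would be averaged over many frames, since the first few calls often include one-time setup cost.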
YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers
Object detection is an important area of research in computer vision. It involves identifying and locating objects within an image or video frame. This technology has a wide range of applications, from self-driving cars to facial recognition systems. However, most object detection algorithms require powerful Graphics Processing Units (GPUs) to run efficiently, which limits their accessibility across various devices such as laptops or cellphones that lack a GPU. 
To address this issue, researchers Rachel Huang, Jonathan Pedoeem, and Cuixian Chen developed YOLO-LITE, a real-time object detection algorithm optimized for non-GPU computers. The model was trained on the PASCAL VOC and COCO datasets, achieving mean Average Precision (mAP) scores of 33.81% and 12.26% respectively. With only 7 layers and 482 million floating-point operations (FLOPs) per forward pass, it runs at approximately 21 frames per second (FPS) on a non-GPU computer and 10 FPS when deployed on a website. This is 3.8 times faster than the fastest state-of-the-art model, SSD MobileNetV1.
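The operation count quoted above is a property of the network, not of the hardware. As a rough sketch of how such a count can be estimated, the code below sums the cost of a stack of convolutional layers. It assumes "same" padding, stride 1, 2x2 max pooling that halves the feature map, and the convention that one multiply-accumulate counts as two operations; conventions differ between papers. The function names and any layer shapes passed in are hypothetical, since this text does not list the layer configuration.

```python
def conv_flops(in_size, in_channels, out_channels, kernel=3):
    """Approximate FLOPs of one square, 'same'-padded, stride-1 convolution.

    Bias and activation costs are ignored. One multiply-accumulate is
    counted as two floating-point operations (an assumed convention).
    """
    macs = in_size * in_size * out_channels * kernel * kernel * in_channels
    return 2 * macs


def network_flops(input_size, in_channels, layers):
    """Sum conv FLOPs over `layers`, a list of (out_channels, kernel, pool).

    When `pool` is True the spatial size is halved after the convolution,
    mimicking a 2x2 max pooling layer (pooling cost itself is ignored).
    """
    total = 0
    size, channels = input_size, in_channels
    for out_channels, kernel, pool in layers:
        total += conv_flops(size, channels, out_channels, kernel)
        channels = out_channels
        if pool:
            size //= 2
    return total
```

Because each pooling step quarters the number of spatial positions, early layers with few filters and late layers with many filters can end up costing about the same, which is one reason shallow networks with small filter counts stay cheap.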
Background
 
YOLO-LITE is based on YOLOv2 [5], the successor to the original YOLO detector by Redmon et al. [1]. It aims to be a smaller, faster, and more efficient model that makes real-time object detection accessible on a wider range of devices. The authors are affiliated with the School of Electrical and Computer Engineering at Georgia Institute of Technology in Atlanta, Electrical Engineering at The Cooper Union in New York, and Mathematics and Statistics at UNC Wilmington in North Carolina [2].
Architecture & Implementation
 
The architecture of YOLO-LITE consists of seven convolutional layers interleaved with max pooling layers [2]. The design trades accuracy for speed: it keeps the layer and filter counts small so that the full network needs only about 482 million floating-point operations, rather than trying to match the accuracy of larger backbones such as ResNet-50 [3]. As in YOLOv2, the model uses anchor boxes, which are predefined bounding-box shapes that the network adjusts to predict object locations within an image [5]. Batch normalization, which normalizes each layer's inputs to stabilize and accelerate training [6], is part of YOLOv2, but the YOLO-LITE authors report that it is unnecessary for a network this shallow and omit it from the final model to gain speed [2].
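The anchor-box idea can be sketched with the box parameterization described for YOLOv2: the network outputs four raw numbers per anchor, and these are turned into a box relative to a grid cell and an anchor shape. The function name `decode_box` and its argument layout are hypothetical, and the grid size and anchor dimensions in any call are illustrative values, not settings taken from the YOLO-LITE paper.

```python
import math


def sigmoid(x):
    """Logistic function, squashing a raw output into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(raw, cell, anchor, grid_size):
    """Turn raw outputs (tx, ty, tw, th) into a normalized box.

    cell      -- (cx, cy) column and row index of the grid cell
    anchor    -- (pw, ph) anchor width and height in grid-cell units
    grid_size -- number of cells along one side of the (square) grid

    Returns (x_center, y_center, width, height) as fractions of the image.
    """
    tx, ty, tw, th = raw
    cx, cy = cell
    pw, ph = anchor
    # The sigmoid keeps the predicted center inside its own grid cell.
    x_center = (cx + sigmoid(tx)) / grid_size
    y_center = (cy + sigmoid(ty)) / grid_size
    # The anchor shape is scaled by an exponential of the raw output.
    width = pw * math.exp(tw) / grid_size
    height = ph * math.exp(th) / grid_size
    return x_center, y_center, width, height
```

With all raw outputs at zero, the decoded box sits at the center of its cell and has exactly the anchor's shape, which is why well-chosen anchors give the network a useful starting point.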
Results & Conclusion
   
YOLO-LITE achieved mAP scores of 33.81% on PASCAL VOC and 12.26% on COCO while running significantly faster than other state-of-the-art lightweight models. On non-GPU computers it runs at 21 FPS, whereas on websites it runs at 10 FPS. This makes it 3.8 times faster than SSD MobileNetV1 and brings real-time object detection to devices without powerful GPUs. The paper therefore contributes to making real-time object detection accessible across different platforms.
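The speed figures can be related with simple arithmetic. The sketch below converts a frame rate into a per-frame time budget and derives the baseline frame rate implied by a stated speedup. The function names are hypothetical, and the derived baseline is only what the two quoted numbers imply together, not a measurement reported in this text.

```python
def frame_time_ms(fps):
    """Average time available per frame, in milliseconds."""
    return 1000.0 / fps


def implied_baseline_fps(fps, speedup):
    """Frame rate of the slower model implied by `fps` and a speedup ratio."""
    return fps / speedup


# Values quoted in the text: 21 FPS and a 3.8x speedup.
yolo_lite_ms = frame_time_ms(21)             # roughly 47.6 ms per frame
baseline_fps = implied_baseline_fps(21, 3.8)  # roughly 5.5 FPS
```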
References:
[1] Joseph Redmon et al., “You Only Look Once: Unified, Real-Time Object Detection,” arXiv preprint arXiv:1506.02640 (2015).
[2] Rachel Huang et al., “YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers,” arXiv preprint arXiv:1811.05588 (2018).
[3] Kaiming He et al., “Deep Residual Learning for Image Recognition,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016).
[4] Christian Szegedy et al., “Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning,” Proceedings of the AAAI Conference on Artificial Intelligence (2017).
[5] Joseph Redmon and Ali Farhadi, “YOLO9000: Better, Faster, Stronger,” arXiv preprint arXiv:1612.08242 (2016).
[6] Sergey Ioffe and Christian Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” International Conference on Machine Learning, Vol. 37, pp. 448–456 (2015).