HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices

AI-generated keywords: HARFLOW3D, 3D CNNs, FPGA, ONNX, latency

AI-generated Key Points

  • 3D Convolutional Neural Networks (CNNs) are effective for Human Action Recognition (HAR) tasks
  • However, they require more computational and memory resources than 2D CNNs due to the additional temporal dimension
  • HARFLOW3D is a novel toolflow that maps 3D CNN models onto Field Programmable Gate Array (FPGA) devices while considering the model's inherent characteristics and the features of the targeted FPGA device
  • The toolflow takes as input a 3D CNN in ONNX format and a description of the FPGA characteristics, generating a design that minimizes computation latency
  • It comprises several parts, including a 3D CNN parser, performance and resource model, scheduling algorithm for executing 3D models on generated hardware, resource-aware optimization engine tailored for 3D models, and automated mapping to synthesizable code for FPGAs
  • Experiments showed that HARFLOW3D can support a broad range of models and devices with high-performing results compared to existing hand-tuned approaches
  • HARFLOW3D was able to achieve up to five times better performance compared to some existing hand-tuned approaches
  • There is an increasing need for efficient tools like HARFLOW3D that can accelerate computations involving volumetric data using FPGAs in video-related applications such as video surveillance, autonomous driving and patient monitoring.

Authors: Petros Toupas, Alexander Montgomerie-Corcoran, Christos-Savvas Bouganis, Dimitrios Tzovaras

11 pages, 8 figures, 6 tables
License: CC BY 4.0

Abstract: For Human Action Recognition (HAR) tasks, 3D Convolutional Neural Networks have proven to be highly effective, achieving state-of-the-art results. This study introduces a novel streaming-architecture-based toolflow for mapping such models onto FPGAs, considering the model's inherent characteristics and the features of the targeted FPGA device. The HARFLOW3D toolflow takes as input a 3D CNN in ONNX format and a description of the FPGA characteristics, generating a design that minimizes the latency of the computation. The toolflow comprises a number of parts, including i) a 3D CNN parser, ii) a performance and resource model, iii) a scheduling algorithm for executing 3D models on the generated hardware, iv) a resource-aware optimization engine tailored for 3D models, and v) an automated mapping to synthesizable code for FPGAs. The ability of the toolflow to support a broad range of models and devices is shown through a number of experiments on various 3D CNN and FPGA system pairs. Furthermore, the toolflow has produced high-performing results for 3D CNN models that have not been mapped to FPGAs before, demonstrating the potential of FPGA-based systems in this space. Overall, HARFLOW3D has demonstrated its ability to deliver competitive latency compared to a range of state-of-the-art hand-tuned approaches, achieving up to 5× better performance compared to some of the existing works.

Submitted to arXiv on 30 Mar. 2023

Results of the summarizing process for the arXiv paper: 2303.17218v1

3D Convolutional Neural Networks (CNNs) have become increasingly popular for Human Action Recognition (HAR) tasks, where they achieve state-of-the-art results. However, their computational and memory requirements are often larger than those of 2D CNNs because of the additional temporal dimension. To address this issue, the study introduces a novel toolflow called HARFLOW3D. The toolflow is based on a streaming architecture and maps 3D CNN models onto Field Programmable Gate Array (FPGA) devices while considering the model's inherent characteristics and the features of the targeted FPGA device. It takes as input a 3D CNN in ONNX format and a description of the FPGA characteristics, and generates a design that minimizes the latency of the computation. It comprises several parts: a 3D CNN parser, a performance and resource model, a scheduling algorithm for executing 3D models on the generated hardware, a resource-aware optimization engine tailored for 3D models, and an automated mapping to synthesizable code for FPGAs. Experiments on various pairs of 3D CNN models and FPGA systems demonstrate that HARFLOW3D supports a broad range of models and devices. The toolflow also produced high-performing results for 3D CNN models that had not previously been mapped to FPGAs, demonstrating the potential of FPGA-based systems in this space. In particular, HARFLOW3D achieved up to five times better performance than some existing hand-tuned approaches. The growing focus on video-related applications such as video surveillance, autonomous driving and patient monitoring calls for algorithms that take the temporal domain into account. As such, there is an increasing need for efficient tools like HARFLOW3D that can accelerate computations on volumetric data using FPGAs.
Overall, HARFLOW3D has demonstrated its ability to deliver competitive latency compared to a range of state-of-the-art hand-tuned approaches.
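
The point about the additional temporal dimension can be made concrete with a small back-of-the-envelope calculation. The sketch below is not taken from the paper: the layer sizes (56x56 feature map, 64 channels, 3x3 kernel, 16 frames, 3 temporal taps) and the function names are illustrative assumptions, and it assumes stride 1 with "same" padding so the output has the same spatial and temporal size as the input.

```python
def conv2d_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates for one 2D convolution layer
    (assumes stride 1 and 'same' padding)."""
    return h * w * c_in * c_out * k * k


def conv3d_macs(d, h, w, c_in, c_out, k, k_t):
    """Multiply-accumulates for one 3D convolution layer.
    d is the number of frames, k_t the temporal kernel size
    (assumes stride 1 and 'same' padding)."""
    return d * h * w * c_in * c_out * k * k * k_t


def conv_weights(c_in, c_out, k, k_t=1):
    """Number of kernel weights (bias ignored); k_t=1 gives the 2D case."""
    return c_in * c_out * k * k * k_t


# Hypothetical layer dimensions, chosen only for illustration.
H, W, C_IN, C_OUT, K = 56, 56, 64, 64, 3
FRAMES, K_T = 16, 3

macs_2d = conv2d_macs(H, W, C_IN, C_OUT, K)
macs_3d = conv3d_macs(FRAMES, H, W, C_IN, C_OUT, K, K_T)

# Compute grows by (frames x temporal taps); weights grow by temporal taps.
mac_ratio = macs_3d // macs_2d
weight_ratio = conv_weights(C_IN, C_OUT, K, K_T) // conv_weights(C_IN, C_OUT, K)

print(f"2D MACs: {macs_2d:,}")
print(f"3D MACs: {macs_3d:,} ({mac_ratio}x)")
print(f"Weight growth: {weight_ratio}x")
```

Under these assumptions the 3D layer needs 48 times the multiply-accumulates of its 2D counterpart (16 frames times 3 temporal taps) and 3 times the weights, which gives a rough sense of the resource pressure a toolflow of this kind has to manage on an FPGA.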
Created on 17 Apr. 2023
