Accelerating deep neural networks on heterogeneous computing platforms

Cittadini, Edoardo

The integration of machine learning (ML) algorithms and neural networks into safety-critical and autonomous systems raises significant concerns regarding safety, security, and performance predictability. This thesis makes three main contributions in this area. First, it proposes an innovative architecture that leverages heterogeneous platforms and virtualization technologies to manage AI-powered applications with mixed criticalities. The architecture exploits the Xilinx UltraScale+ multiprocessor system-on-a-chip (MPSoC) family to create two isolated domains: one for running high-performance deep learning algorithms under Linux, and one for running safety-critical tasks under the FreeRTOS real-time operating system. The effectiveness of this dual-domain approach is validated on an unmanned aerial vehicle (UAV) that tracks moving targets with a deep neural network accelerated on the platform's FPGA. As a second contribution, the thesis addresses the problem of efficiently executing deep neural networks for real-time tasks in small cyber-physical systems, such as drones and robots. In particular, it presents a general-purpose FPGA accelerator capable of handling both preprocessing and postprocessing operations for vision tasks, thus overcoming the typical bottleneck of conventional accelerators, which are specialized only for neural network layers. Tested on the AMD Xilinx Kria KR-260 UltraScale+ MPSoC, the accelerator showed excellent performance, enabling onboard real-time visual processing with limited power consumption. As a third contribution, the thesis proposes a new approach to multi-object tracking that achieves reduced and predictable execution times without sacrificing accuracy. By partitioning the matching process into smaller sub-problems and selectively applying re-identification, the proposed method improves tracking efficiency while maintaining high performance. This solution was tested in urban scenarios, where it reduced execution times without sacrificing accuracy relative to state-of-the-art tracking models. In summary, this thesis advances the integration of AI in autonomous and safety-critical systems by proposing architectural improvements, efficient hardware acceleration solutions, and an innovative formulation of the multi-object tracking pipeline. These contributions facilitate the development of more robust, efficient, and secure AI-driven applications across various domains, from autonomous vehicles to robotics.
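The abstract does not detail how the matching process is partitioned, so the following is only an illustrative sketch of the general idea, not the method of the thesis. It assumes a track-to-detection cost matrix and a gating threshold `gate` (both hypothetical), links a track and a detection whenever their cost is within the gate, and treats each connected group as an independent sub-problem. Each sub-problem is solved here by brute force, which is practical only because the groups are small; re-identification is not modeled.

```python
from itertools import permutations


def partition(cost, gate):
    """Split tracks and detections into independent groups.

    Track i and detection j are linked when cost[i][j] <= gate (assumed rule).
    Returns a list of (track_indices, detection_indices) pairs.
    """
    n_tracks = len(cost)
    n_dets = len(cost[0]) if cost else 0
    parent = list(range(n_tracks + n_dets))  # union-find over both sides

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n_tracks):
        for j in range(n_dets):
            if cost[i][j] <= gate:
                parent[find(i)] = find(n_tracks + j)

    groups = {}
    for node in range(n_tracks + n_dets):
        tracks, dets = groups.setdefault(find(node), ([], []))
        if node < n_tracks:
            tracks.append(node)
        else:
            dets.append(node - n_tracks)
    # Groups with only tracks or only detections have nothing to match.
    return [g for g in groups.values() if g[0] and g[1]]


def solve(cost, tracks, dets, gate):
    """Brute-force the cheapest assignment inside one small group.

    An unmatched track is charged `gate`, so any admissible pair is
    never worse than leaving the track unmatched.
    """
    options = dets + [None] * len(tracks)
    best, best_cost = [], float("inf")
    for choice in permutations(options, len(tracks)):
        pairs = [(t, d) for t, d in zip(tracks, choice) if d is not None]
        if any(cost[t][d] > gate for t, d in pairs):
            continue  # uses a pair outside the gate
        total = sum(cost[t][d] for t, d in pairs)
        total += gate * (len(tracks) - len(pairs))
        if total < best_cost:
            best, best_cost = pairs, total
    return best


def match(cost, gate):
    """Match tracks to detections by solving each group separately."""
    pairs = []
    for tracks, dets in partition(cost, gate):
        pairs.extend(solve(cost, tracks, dets, gate))
    return sorted(pairs)


if __name__ == "__main__":
    # Rows are tracks, columns are detections (made-up costs).
    cost = [
        [0.1, 0.9, 0.9],
        [0.2, 0.3, 0.9],
        [0.9, 0.9, 0.4],
    ]
    print(partition(cost, gate=0.5))
    print(match(cost, gate=0.5))
```

Because each group is solved on its own, the cost of a frame depends on the size of the largest group rather than on the total number of objects in the scene, which is one plausible way such a partitioning could make execution times smaller and easier to bound.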

2025
