Towards Artificial Intelligence in the loop in Animal-Robot Interaction

FAZZARI, EDOARDO
2026

Abstract

Integrating robotic systems with animals in closed-loop interactions requires the robot to interpret animal behavior accurately and in real time. Artificial Intelligence offers a powerful approach to understanding animal behavior by recognizing actions in video data. Although deep learning models for action recognition have advanced significantly in recent years, much of this progress has centered on human actions. A notable exception is the release of Animal Kingdom, the largest dataset specifically designed for animal action recognition. This thesis focuses on designing and developing a fast and efficient action recognition model to be integrated into a robotic dog, enabling it to navigate around animals and understand their behavior. Extending previous state-of-the-art approaches for the Animal Kingdom action recognition tasks, the work surpasses existing models and establishes new benchmarks on this dataset. The development process prioritized improving accuracy while reducing computational demands to ensure real-time performance. It first employed selective state models and later incorporated distillation strategies to streamline the network by reducing the number of modalities used. The final architecture was integrated into the robotic dog and tested in real-world environments, where it recognized and interpreted the actions of nearby animals. The findings of this research have the potential to benefit the machine learning, robotics, and entomology communities, with applications ranging from industrial farming tasks to laboratory research on animal behavior.
8 January 2026
Italian
Deep Learning
Multimodal Fusion
Animal Action Recognition
Animal-Robot Interaction
Animal Behavior
MAZZONI, ALBERTO
MARIO CIMINO
GIULIA DE MASI
Files in this item:
File Size Format
PhD_Thesis_Fazzari.pdf

embargoed until 3 October 2028

License: All rights reserved
Size: 8.63 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/357846
The NBN code of this thesis is URN:NBN:IT:SSSUP-357846