Learning with uncertainty via Hyperbolic Neural Networks

Mandica, Paolo

This thesis explores the application of hyperbolic geometry and hyperbolic neural networks across various domains, with a focus on leveraging uncertainty estimation to improve the learning process and performance in complex tasks. We begin with a brief introduction to hyperbolic neural networks, providing the theoretical foundation and key concepts that underpin our subsequent research. This work then spans three main areas: self-supervised representation learning for skeleton-based action recognition, active domain adaptation for semantic segmentation, and multimodal large language models.

First, this thesis investigates self-supervised learning in the context of skeleton-based action recognition, where effective representation learning remains challenging due to the hierarchical nature of human motion data. We introduce hyperbolic neural networks to address this challenge through uncertainty-aware learning, developing a novel Hyperbolic Self-Paced learning model (HYSP). This approach leverages the hyperbolic radius as an uncertainty metric to adaptively pace the learning process, scaling each sample's gradient contribution by the norm of its hyperbolic embedding. When evaluated on standard action recognition benchmarks, HYSP demonstrates superior performance while eliminating the need for computationally expensive negative mining procedures.

Next, we explore active learning for semantic segmentation under domain shift, where efficient label acquisition is crucial for adapting to new environments while keeping labeling costs down. For this challenge, we develop a hyperbolic approach named HALO (Hyperbolic Active Learning Optimization), which interprets the hyperbolic radius as an indicator of data scarcity. By combining the hyperbolic radius with prediction entropy, we obtain an estimator of epistemic uncertainty, which we use for selective annotation of pixels in the image. HALO achieves state-of-the-art results on domain adaptation benchmarks while requiring only a small fraction of target labels, surpassing even fully supervised domain adaptation methods.

Finally, this thesis examines large-scale vision-language modeling, where uncertainty estimation becomes particularly challenging due to the scale and multimodal nature of the data. By developing a novel training strategy for a hyperbolic version of BLIP-2, we demonstrate that hyperbolic learning can be successfully scaled to billion-parameter architectures without compromising stability or performance. Our approach achieves results comparable to its Euclidean counterpart while providing meaningful uncertainty estimates thanks to hyperbolic embeddings, offering a new perspective on uncertainty quantification in large multimodal models.

Throughout these studies, we demonstrate that learning in hyperbolic space offers unique advantages in estimating uncertainty and improving model performance and efficiency across diverse machine learning tasks. This work contributes to the broader understanding of hyperbolic neural networks and their potential to advance the field of deep learning.
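
The abstract refers to two quantities: the hyperbolic radius of an embedding, and its combination with prediction entropy. The following is a minimal NumPy sketch of how these could be computed on the Poincaré ball. It is an illustration under stated assumptions, not the thesis implementation: the function names (`expmap0`, `hyperbolic_radius`, `pace_weight`, `radius_entropy_score`), the curvature parameter `c`, and the choice of a simple product to combine radius and entropy are all hypothetical.

```python
import numpy as np


def expmap0(v, c=1.0, eps=1e-7):
    """Map Euclidean vectors onto the Poincare ball (curvature -c) via the
    exponential map at the origin: tanh(sqrt(c)*||v||) * v / (sqrt(c)*||v||)."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)


def hyperbolic_radius(x, c=1.0, eps=1e-7):
    """Geodesic distance of x from the origin of the ball:
    (2 / sqrt(c)) * artanh(sqrt(c) * ||x||)."""
    sqrt_c = np.sqrt(c)
    # Clip to stay strictly inside the ball, where artanh is finite.
    arg = np.clip(sqrt_c * np.linalg.norm(x, axis=-1), 0.0, 1.0 - eps)
    return 2.0 / sqrt_c * np.arctanh(arg)


def pace_weight(x):
    """Assumed per-sample weight: the Euclidean norm of the ball embedding,
    so embeddings near the origin (treated as uncertain) weigh less."""
    return np.linalg.norm(x, axis=-1)


def entropy(probs, eps=1e-12):
    """Shannon entropy of class probabilities along the last axis."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)


def radius_entropy_score(x, probs, c=1.0):
    """Assumed acquisition score: hyperbolic radius times prediction entropy."""
    return hyperbolic_radius(x, c) * entropy(probs)


# Two toy embeddings and their (hypothetical) class probabilities.
features = np.array([[1.0, 0.0], [0.1, 0.0]])
probs = np.array([[0.25, 0.25, 0.25, 0.25], [0.97, 0.01, 0.01, 0.01]])
ball_points = expmap0(features)
scores = radius_entropy_score(ball_points, probs)
```

In this sketch, an embedding far from the origin with a high-entropy prediction receives the largest score. How the thesis actually weights or normalizes the two terms is not stated in the abstract.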

2025

Faculty/Department	DIPARTIMENTO DI INFORMATICA; FACOLTA' DI INGEGNERIA DELL'INFORMAZIONE, INFORMATICA e STATISTICA
Degree programme	Informatica
Publication date	15 January 2025
Language	English
Supervisor, Advisor, or Tutor	GALASSO, FABIO
Co-advisor, Second Reader, Co-Supervisor, Co-Tutor, or Coordinators	MANCINI, MAURIZIO
Publisher	Università degli Studi di Roma "La Sapienza"
Number of pages	122
Collection	Università degli Studi di Roma La Sapienza

Files in this item:

File	Access	License	Size	Format
Tesi_dottorato_Mandica.pdf	Open access	All rights reserved	10.57 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/188447

The NBN code of this thesis is URN:NBN:IT:UNIROMA1-188447