Navigating Rough Landscapes: Statistical Physics Approaches to Machine Learning and Inference

Straziota, Davide

Artificial Intelligence (AI) has rapidly become a central field in contemporary research, driving innovation in several disciplines, including physics, engineering, economics, and medicine. At the heart of AI lies the development of algorithms capable of autonomously learning from data and adapting to new challenges, moving beyond traditional rule-based programming. Despite remarkable progress, the reasons for the success of many AI models remain largely unexplained, underscoring the need for a deeper theoretical understanding of their inner mechanisms. This thesis addresses this challenge by employing tools from statistical physics to analyze and interpret machine learning algorithms, with a particular focus on the geometry of their loss landscapes. By leveraging methods such as the replica method, random matrix theory, and message-passing algorithms, the thesis investigates how the structure of the loss landscape influences inference, sampling, and optimization in high-dimensional settings. The thesis investigates four canonical machine-learning tasks, each characterized by a non-convex loss landscape: for the negative and binary perceptron we introduce and analyze a stochastic-localization algorithm that fairly samples their solution space; for phase retrieval we recast the task within a phase-selection framework, separating its combinatorial and continuous components; we develop a theoretical framework for the spectral analysis of neural-network loss Hessians and specialize it to the Tree Committee Machine; and for Restricted Boltzmann Machines we propose pseudo-likelihood maximization (which can be reframed in terms of loss minimization) as a stable and efficient alternative to contrastive-divergence-based learning. Building on these concepts, the thesis offers new theoretical insights into the interplay between optimization, inference, and the underlying geometry of complex machine learning models.

2026

Degree programme: STATISTICS AND COMPUTER SCIENCE
Publication date: 29 Jan 2026
Language: English
Supervisor, Advisor or Tutor: LUCIBELLO, CARLO; MALATESTA, ENRICO MARIA
Publisher: Università Bocconi
Collection: Università Commerciale Luigi Bocconi di Milano

Files in this item:

File	Size	Format
ThesisStraziotaFinalSubmission.pdf (open access; license: all rights reserved)	5.34 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/355867

The NBN code of this thesis is URN:NBN:IT:UNIBOCCONI-355867