Navigating Rough Landscapes: Statistical Physics Approaches to Machine Learning and Inference

STRAZIOTA, DAVIDE
2026

Abstract

Artificial Intelligence (AI) has rapidly become a central field of contemporary research, driving innovation in disciplines including physics, engineering, economics, and medicine. At the heart of AI lies the development of algorithms that learn autonomously from data and adapt to new challenges, moving beyond traditional rule-based programming. Despite remarkable progress, the reasons behind the success of many AI models remain largely unexplained, underscoring the need for a deeper theoretical understanding of their inner mechanisms. This thesis addresses this challenge by employing tools from statistical physics to analyze and interpret machine learning algorithms, with a particular focus on the geometry of their loss landscapes. Using methods such as the replica method, random matrix theory, and message-passing algorithms, it investigates how the structure of the loss landscape influences inference, sampling, and optimization in high-dimensional settings. It studies four canonical machine learning tasks, each characterized by a non-convex loss landscape. For the negative and binary perceptron, we introduce and analyze a stochastic-localization algorithm that fairly samples their solution space. For phase retrieval, we recast the task within a phase-selection framework that separates its combinatorial and continuous components. For neural networks, we develop a theoretical framework for the spectral analysis of loss Hessians and specialize it to the Tree Committee Machine. For Restricted Boltzmann Machines, we propose pseudo-likelihood maximization (which can be reframed in terms of loss minimization) as a stable and efficient alternative to learning based on contrastive divergence. Building on these results, the thesis offers new theoretical insights into the interplay between optimization, inference, and the underlying geometry of complex machine learning models.
29 January 2026
English
LUCIBELLO, CARLO
MALATESTA, ENRICO MARIA
Università Bocconi
Files in this item:
ThesisStraziotaFinalSubmission.pdf
Access: open access
License: All rights reserved
Size: 5.34 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/355867
The NBN code of this thesis is URN:NBN:IT:UNIBOCCONI-355867