The systematic exploitation of data allows to optimize complex processes, discover hidden patterns, and support evidence-based decision-making. To ensure that data is used accurately, efficiently, and fairly, a wide range of data analysis and optimization algorithms have emerged over the years. However, the modern landscape of data and data-driven applications creates challenges for which these algorithms are not prepared, limiting or preventing their applicability and leaving several relevant problems unsolved. In this thesis, we design distributed algorithms for data analysis and data-driven optimization that effectively address these challenges and possess a number of desirable properties, such as communication efficiency, fast convergence, and robustness to various types of heterogeneity. We consider both fully distributed and federated scenarios, and focus on three major research topics. The first is curvature-aware optimization, which exploits the Hessian matrix of the objective function to accelerate the convergence rate of algorithms. The second is zeroth-order optimization, which tackles problems where the objective function is a black box accessible only through expensive function queries. The third is distributed clustering of heterogeneous data, where local datasets are not identically distributed and belong to different and partially overlapping feature spaces. We make significant contributions to each of these research areas, particularly by analyzing the properties of various zeroth-order derivative estimators, proposing algorithms that largely outperform previous state-of-the-art solutions, and addressing understudied problems.
Distributed algorithms for curvature-aware and zeroth-order optimization, and for masked data clustering
MARITAN, ALESSIO
2026
Abstract
The systematic exploitation of data allows to optimize complex processes, discover hidden patterns, and support evidence-based decision-making. To ensure that data is used accurately, efficiently, and fairly, a wide range of data analysis and optimization algorithms have emerged over the years. However, the modern landscape of data and data-driven applications creates challenges for which these algorithms are not prepared, limiting or preventing their applicability and leaving several relevant problems unsolved. In this thesis, we design distributed algorithms for data analysis and data-driven optimization that effectively address these challenges and possess a number of desirable properties, such as communication efficiency, fast convergence, and robustness to various types of heterogeneity. We consider both fully distributed and federated scenarios, and focus on three major research topics. The first is curvature-aware optimization, which exploits the Hessian matrix of the objective function to accelerate the convergence rate of algorithms. The second is zeroth-order optimization, which tackles problems where the objective function is a black box accessible only through expensive function queries. The third is distributed clustering of heterogeneous data, where local datasets are not identically distributed and belong to different and partially overlapping feature spaces. We make significant contributions to each of these research areas, particularly by analyzing the properties of various zeroth-order derivative estimators, proposing algorithms that largely outperform previous state-of-the-art solutions, and addressing understudied problems.| File | Dimensione | Formato | |
|---|---|---|---|
|
tesi_definitiva_Alessio_Maritan.pdf
embargo fino al 19/02/2027
Licenza:
Tutti i diritti riservati
Dimensione
13.52 MB
Formato
Adobe PDF
|
13.52 MB | Adobe PDF |
I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/20.500.14242/361055
URN:NBN:IT:UNIPD-361055