Conjugate Gradient and Alternating Minimization for High-Dimensional Statistical Models

PANDOLFI, ANDREA
2026

Abstract

In many contemporary applications of statistics and machine learning, such as recommendation systems, network analysis, and genomics, data can be represented as large, sparse matrices in which only a small fraction of the entries are observed. We refer to these matrices as high-dimensional, since the number of variables (e.g., individuals or genes) can be very large, often on the order of hundreds of thousands or even millions. In contrast, the number of observations, which corresponds to the number of non-zero entries, grows more slowly than the square of the number of variables; as a result, the sparsity of the matrices increases with dimensionality. For instance, in the MovieLens dataset, which contains user ratings of movies, the proportion of observed ratings decreases from approximately 1% to 0.1% as the number of users and movies grows from tens of thousands to hundreds of thousands. This poses serious challenges for traditional statistical methods, which often rely on matrix factorizations, such as the singular value decomposition or the Cholesky decomposition, that may be computationally infeasible in high-dimensional settings even when optimized for sparse matrices. Iterative algorithms are a popular alternative, as they provide approximate solutions without requiring full matrix decompositions. These methods have a cheap per-iteration cost, usually linear in the number of variables and observations, and are memory-efficient, since they only need to store the current iterate. However, in high-dimensional settings, specific features of the models or the data can create ill-conditioned regions of the variable space; exploring these regions becomes slower as the dimension increases, so that more iterations are needed for convergence. In this thesis, we investigate two iterative algorithms: the conjugate gradient method applied to linear mixed models and the alternating least squares algorithm for the matrix completion problem. We provide a detailed analysis of their convergence in high-dimensional settings, propose strategies to speed up the exploration of problematic regions, and show that these strategies make the algorithms converge in a number of iterations that is independent of the dimension.
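To make the first of these algorithms concrete, below is a minimal Python sketch of the conjugate gradient method applied to a sparse symmetric positive-definite linear system, the kind of system that arises when fitting linear mixed models. The dimensions, sparsity level, and tolerance are illustrative assumptions, not values from the thesis.

import numpy as np
import scipy.sparse as sp

# Illustrative sparse SPD system A x = b, standing in for the normal
# equations of a linear mixed model (sizes are made up for the demo).
n = 10_000
A = sp.random(n, n, density=1e-3, format="csr", random_state=0)
A = A @ A.T + sp.identity(n)      # symmetrize and shift so that A is SPD

b = np.ones(n)

def conjugate_gradient(A, b, tol=1e-8, max_iter=500):
    """Plain conjugate gradient: each iteration costs one sparse
    matrix-vector product, i.e. linear in the number of non-zeros."""
    x = np.zeros_like(b)
    r = b - A @ x                 # residual
    p = r.copy()                  # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)     # exact step size along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p # new direction, conjugate to the old ones
        rs = rs_new
    return x

x = conjugate_gradient(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))

Each iteration requires only one sparse matrix-vector product and a few vector operations, which matches the per-iteration cost and memory footprint described in the abstract.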
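Likewise, here is a minimal sketch of alternating least squares for matrix completion: each sweep fixes one factor and solves a small ridge-regularized least-squares problem per row (or column), using only the observed entries. The matrix sizes, rank, and regularization parameter are made-up illustrative choices, and this plain implementation is not the accelerated variant studied in the thesis.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative low-rank matrix with a sparse observation pattern.
n_rows, n_cols, rank = 200, 150, 5
M = rng.standard_normal((n_rows, rank)) @ rng.standard_normal((rank, n_cols))
observed = rng.random((n_rows, n_cols)) < 0.1   # ~10% of entries observed

def als(M, observed, rank, lam=0.1, n_iter=50):
    """Alternating least squares: with V fixed, each row of U solves a
    rank-sized ridge regression on that row's observed entries; then
    the roles of U and V are swapped."""
    U = rng.standard_normal((M.shape[0], rank))
    V = rng.standard_normal((M.shape[1], rank))
    for _ in range(n_iter):
        for i in range(M.shape[0]):
            idx = observed[i, :]
            Vi = V[idx]
            U[i] = np.linalg.solve(Vi.T @ Vi + lam * np.eye(rank),
                                   Vi.T @ M[i, idx])
        for j in range(M.shape[1]):
            idx = observed[:, j]
            Uj = U[idx]
            V[j] = np.linalg.solve(Uj.T @ Uj + lam * np.eye(rank),
                                   Uj.T @ M[idx, j])
    return U, V

U, V = als(M, observed, rank)
err = np.abs(U @ V.T - M)[~observed].mean()
print("mean absolute error on unobserved entries:", err)

Every update touches only the observed entries, so the cost of a full sweep is linear in the number of observations, again in line with the per-iteration costs discussed above.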
27 January 2026
Language: English
Supervisors: PAPASPILIOPOULOS, OMIROS; ZANELLA, GIACOMO
Università Bocconi
File: Revised thesis_Pandolfi_Andrea.pdf (Adobe PDF, 1.27 MB), open access; license: all rights reserved.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/356328
The NBN code of this thesis is URN:NBN:IT:UNIBOCCONI-356328