Conjugate Gradient and Alternating Minimization for High-Dimensional Statistical Models

PANDOLFI, ANDREA
2026

Abstract

In many contemporary applications of statistics and machine learning, such as recommendation systems, network analysis, and genomics, data can be represented as large, sparse matrices in which only a small fraction of the entries are observed. We refer to these matrices as high-dimensional, since the number of variables (e.g., individuals or genes) can be very large, often on the order of hundreds of thousands or even millions. In contrast, the number of observations, which corresponds to the number of non-zero entries, grows more slowly than the square of the number of variables; as a result, the sparsity of the matrices increases with dimensionality. For instance, in the MovieLens dataset, which contains user ratings of movies, the proportion of observed ratings decreases from approximately 1% to 0.1% as the number of users and movies grows from tens of thousands to hundreds of thousands. This poses serious challenges for traditional statistical methods, which often rely on matrix factorizations, such as the singular value decomposition or the Cholesky decomposition, that may be computationally infeasible in high-dimensional settings even when optimized for sparse matrices. Iterative algorithms are a popular alternative, as they provide approximate solutions without requiring full matrix decompositions. These methods have a cheap per-iteration cost, usually linear in the number of variables and observations, and are memory-efficient, since they only need to store the current iterate. However, in high-dimensional settings, specific features of the models or the data can create ill-conditioned regions of the variable space; exploring these regions becomes slower as the dimension increases, so that more iterations are needed for convergence. In this thesis, we investigate two iterative algorithms: the conjugate gradient method applied to linear mixed models and the alternating least squares algorithm for the matrix completion problem. We provide a detailed analysis of their convergence in high-dimensional settings, propose strategies to speed up the exploration of problematic regions, and show that these strategies make the algorithms converge in a number of iterations that is independent of the dimension.
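To make the first of these algorithms concrete, below is a minimal Python sketch of the conjugate gradient method applied to a sparse symmetric positive-definite linear system, the kind of system that arises when fitting linear mixed models. The dimensions, sparsity level, and tolerance are illustrative assumptions, not values from the thesis.

import numpy as np
import scipy.sparse as sp

# Illustrative sparse SPD system A x = b, standing in for the normal
# equations of a linear mixed model (sizes are made up for the demo).
n = 10_000
A = sp.random(n, n, density=1e-3, format="csr", random_state=0)
A = A @ A.T + sp.identity(n)      # symmetrize and shift so that A is SPD

b = np.ones(n)

def conjugate_gradient(A, b, tol=1e-8, max_iter=500):
    """Plain conjugate gradient: each iteration costs one sparse
    matrix-vector product, i.e. linear in the number of non-zeros."""
    x = np.zeros_like(b)
    r = b - A @ x                 # residual
    p = r.copy()                  # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)     # exact step size along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p # new direction, conjugate to the old ones
        rs = rs_new
    return x

x = conjugate_gradient(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))

Each iteration requires only one sparse matrix-vector product and a few vector operations, which matches the per-iteration cost and memory footprint described in the abstract.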
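Likewise, here is a minimal sketch of alternating least squares for matrix completion: each sweep fixes one factor and solves a small ridge-regularized least-squares problem per row (or column), using only the observed entries. The matrix sizes, rank, and regularization parameter are made-up illustrative choices, and this plain implementation is not the accelerated variant studied in the thesis.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative low-rank matrix with a sparse observation pattern.
n_rows, n_cols, rank = 200, 150, 5
M = rng.standard_normal((n_rows, rank)) @ rng.standard_normal((rank, n_cols))
observed = rng.random((n_rows, n_cols)) < 0.1   # ~10% of entries observed

def als(M, observed, rank, lam=0.1, n_iter=50):
    """Alternating least squares: with V fixed, each row of U solves a
    rank-sized ridge regression on that row's observed entries; then
    the roles of U and V are swapped."""
    U = rng.standard_normal((M.shape[0], rank))
    V = rng.standard_normal((M.shape[1], rank))
    for _ in range(n_iter):
        for i in range(M.shape[0]):
            idx = observed[i, :]
            Vi = V[idx]
            U[i] = np.linalg.solve(Vi.T @ Vi + lam * np.eye(rank),
                                   Vi.T @ M[i, idx])
        for j in range(M.shape[1]):
            idx = observed[:, j]
            Uj = U[idx]
            V[j] = np.linalg.solve(Uj.T @ Uj + lam * np.eye(rank),
                                   Uj.T @ M[idx, j])
    return U, V

U, V = als(M, observed, rank)
err = np.abs(U @ V.T - M)[~observed].mean()
print("mean absolute error on unobserved entries:", err)

Every update touches only the observed entries, so the cost of a full sweep is linear in the number of observations, again in line with the per-iteration costs discussed above.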
27 January 2026
Language: English
Supervisors: PAPASPILIOPOULOS, OMIROS; ZANELLA, GIACOMO
Università Bocconi
File: Revised thesis_Pandolfi_Andrea.pdf (Adobe PDF, 1.27 MB), open access; license: all rights reserved.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/356328
The NBN code of this thesis is URN:NBN:IT:UNIBOCCONI-356328