Advances in Cluster Analysis with Applications to Financial Risk Management

Mecchina, Andrea

Cluster analysis, commonly referred to as clustering, is a subfield of machine learning that aims to partition a set of unlabeled data into groups based on their similarities. Clustering techniques are highly versatile and can be applied to a wide variety of data, among which time series are a notable example due to their intrinsic complexity and ordered structure. A substantial portion of this dissertation is devoted to the development of new strategies for cluster analysis, with a focus on possible applications to the analysis of financial data. More specifically: (i) we contribute to clustering methodologies by proposing a novel density-based algorithm; (ii) we consider the use of an evidence accumulation strategy to detect groups of financial time series with the aim of managing risk; (iii) we consider a specific problem related to the allocation of volatility in portfolios of financial assets.

We begin by introducing Stochastic Density Peaks, a clustering algorithm that embeds a stochastic formulation of density-based methods, such as Density Peaks clustering, within the solid linear algebra foundations of Generalized Perron Cluster Analysis. Because of its general-purpose design, we assess its performance against competitors on diverse problems, from synthetic datasets and handwritten digit partitioning to the identification of metastable states in molecular dynamics simulations.

Next, we focus on time series of financial logarithmic returns. Bivariate copula models have been extensively adopted in the literature to identify clusters of assets that exhibit extreme co-movements. However, the specification of a single model cannot capture varying dependence structures. We propose an evidence accumulation clustering methodology that combines results from multiple partitions, each relying on a different underlying copula, and is inherently robust to other design choices. The quality of the resulting groups is assessed by their effectiveness in diversifying financial risk within a portfolio backtesting framework.

Finally, we tackle a problem relevant to portfolio managers. We introduce a pipeline to allocate the volatility contributions of a portfolio's returns to a set of exogenous risk factors, i.e., factors that are not part of the investable universe, while accounting for the potential multicollinearity among them. A typical use case is a practitioner who aims to disentangle the effect of the risk-free interest rate from that of corporate credit spread risk, two factors that are significantly correlated. The methodology is demonstrated across different portfolio strategies, and we are able to quantify the residual volatility exposure left unexplained by a set of available risk factors.
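
The evidence accumulation idea mentioned in the abstract, combining several partitions of the same items into one consensus grouping, can be illustrated with a minimal sketch. It is not the thesis's implementation: the three toy partitions stand in for clusterings that would each come from a different copula model, the function names `co_association` and `consensus_clusters` are hypothetical, and the consensus rule used here (link two items when they share a cluster in more than half of the partitions, then take connected components) is one simple choice assumed for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(coassoc, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold (an assumed rule)
    and return the connected components as consensus labels."""
    n = len(coassoc)
    parent = list(range(n))  # union-find forest over the items

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    roots = {}
    # Relabel components 0, 1, 2, ... in order of first appearance.
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy input: three hypothetical partitions of six assets.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
C = co_association(partitions)
print(consensus_clusters(C))  # [0, 0, 0, 1, 1, 1]
```

Label values differ across the input partitions (the third one swaps 0 and 1), which is why the sketch compares pairs of items rather than raw labels: only co-membership carries over from one partition to another.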

2026

Degree programme	APPLIED DATA SCIENCE AND ARTIFICIAL INTELLIGENCE
Publication date	3 Feb 2026
Language	English
Keywords	Cluster Analysis; Density-Based Models; Time Series; Extreme-Value Theory; Risk Management
Supervisor, advisor, or tutor	Torelli, Nicola; Bortolussi, Luca; Rodriguez Garcia, Alejandro
Publisher	Università degli Studi di Trieste
Collection	Università degli Studi di Trieste

Files in this record:

File	Access	License	Size	Format
PhD_Thesis.pdf	Embargoed until 3 Feb 2027	All rights reserved	12.84 MB	Adobe PDF
PhD_Thesis_1.pdf	Embargoed until 3 Feb 2027	All rights reserved	12.84 MB	Adobe PDF
PhD_Thesis_2.pdf	Embargoed until 3 Feb 2027	All rights reserved	12.84 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/356206

The NBN code of this thesis is URN:NBN:IT:UNITS-356206