Advances in Cluster Analysis with Applications to Financial Risk Management

MECCHINA, ANDREA
2026

Abstract

Cluster analysis, commonly referred to as clustering, is a subfield of machine learning that aims to partition a set of unlabeled data into groups based on their similarities. Clustering techniques are highly versatile and can be applied to a wide variety of data, among which time series are a notable example because of their intrinsic complexity and ordered structure. A substantial portion of this dissertation is devoted to the development of new strategies for cluster analysis, with a focus on applications to financial data. More specifically: (i) we contribute to clustering methodology by proposing a novel density-based algorithm; (ii) we use an evidence accumulation strategy to detect groups of financial time series for the purpose of managing risk; (iii) we address a specific problem related to the allocation of volatility in portfolios of financial assets. We begin by introducing Stochastic Density Peaks, a clustering algorithm that embeds a stochastic formulation of density-based methods, such as Density Peaks clustering, within the linear algebra foundations of Generalized Perron Cluster Analysis. Because of its general-purpose design, we assess its performance against competitors on diverse problems, from synthetic datasets and handwritten digit partitioning to the identification of metastable states in molecular dynamics simulations. Next, we focus on time series of financial logarithmic returns. Bivariate copula models have been extensively adopted in the literature to identify clusters of assets that exhibit extreme co-movements. However, a single model specification cannot capture varying dependence structures. We propose an evidence accumulation clustering methodology that combines results from multiple partitions, each relying on a different underlying copula, and is inherently robust to other design choices.
The quality of the resulting groups is assessed by their effectiveness in diversifying financial risk within a portfolio backtesting framework. Finally, we tackle a problem relevant to portfolio managers. We introduce a pipeline to allocate the volatility contributions of a portfolio's returns to a set of exogenous risk factors, i.e., factors that are not part of the investable universe, while accounting for potential multicollinearity among them. A possible use case is a practitioner who aims to disentangle the effect of the risk-free interest rate from corporate credit spread risk, two factors that are significantly correlated. The methodology is demonstrated across different portfolio strategies, and it allows us to quantify the residual volatility exposure left unexplained by the set of available risk factors.
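As an illustration of the evidence accumulation idea mentioned in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes that several partitions of the same assets are already available as label vectors (in the thesis each partition would come from a different copula model; here the three label vectors are hypothetical). The function names `co_association` and `consensus_clusters` and the 0.5 threshold are assumptions made for this example, and linking pairs above a threshold is only one simple way to extract a consensus partition.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(coassoc, threshold=0.5):
    """Link pairs whose co-association exceeds the threshold and
    return the connected components as consensus cluster labels."""
    n = len(coassoc)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical partitions of five assets produced by three different models.
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 2],
]
coassoc = co_association(partitions)
labels = consensus_clusters(coassoc)
```

In this toy input, the first two assets are grouped together by every partition, while the remaining three are linked through pairs that co-occur in two of the three partitions, so the consensus does not depend on any single model.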
3 February 2026
English
Cluster Analysis; Density-Based Models; Time Series; Extreme-Value Theory; Risk Management
TORELLI, Nicola
BORTOLUSSI, LUCA
RODRIGUEZ GARCIA, ALEJANDRO
Università degli Studi di Trieste
Files in this record:
PhD_Thesis.pdf: 12.84 MB, Adobe PDF; embargoed until 03/02/2027; license: all rights reserved
PhD_Thesis_1.pdf: 12.84 MB, Adobe PDF; embargoed until 03/02/2027; license: all rights reserved
PhD_Thesis_2.pdf: 12.84 MB, Adobe PDF; embargoed until 03/02/2027; license: all rights reserved

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/356206
The NBN code of this thesis is URN:NBN:IT:UNITS-356206