Advances in Cluster Analysis with Applications to Financial Risk Management
MECCHINA, ANDREA
2026
Abstract
Cluster analysis, commonly referred to as clustering, is a subfield of machine learning that aims at partitioning a set of unlabeled data into groups based on their similarities. Clustering techniques are highly versatile and can be applied to a wide variety of data, among which time series represent a notable example due to their intrinsic complexity and ordered structure. A significant portion of this dissertation is devoted to the development of new strategies for cluster analysis, with a focus on possible applications to the analysis of financial data. More specifically: (i) we contribute to clustering methodologies by proposing a novel density-based algorithm; (ii) we consider the use of an evidence accumulation strategy to detect groups of financial time series with the aim of managing risk; (iii) we consider a specific problem related to the allocation of volatility in portfolios of financial assets. We begin by introducing Stochastic Density Peaks, a clustering algorithm that embeds a stochastic formulation of density-based methods, such as Density Peaks clustering, within the solid linear algebra foundations of Generalized Perron Cluster Analysis. Due to its general-purpose design, we assess its performance against competitors on diverse problems, from synthetic datasets and handwritten digit partitioning to the identification of metastable states in molecular dynamics simulations. Subsequently, we focus on time series of financial logarithmic returns. Bivariate copula models have been extensively adopted in the literature to identify clusters of assets that exhibit extreme co-movements. However, a single model specification cannot capture varying dependence structures. We propose an evidence accumulation clustering methodology that combines results from multiple partitions, each relying on a different underlying copula, and is inherently robust to other design choices.
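As a rough illustration of the evidence accumulation idea, the sketch below builds a co-association matrix from several partitions and extracts a consensus grouping. It is not the dissertation's implementation: the partitions are hypothetical hard-coded labels standing in for copula-based clusterings, the function names `co_association` and `consensus_clusters` and the 0.5 threshold are assumptions made for this example, and the consensus step simply links items whose co-association exceeds the threshold, which corresponds to cutting a single-linkage tree at that level.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(coassoc, threshold=0.5):
    """Label connected components of the graph linking pairs above threshold."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical partitions of six assets; in the setting described above,
# each one would come from a clustering based on a different copula model.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, with permuted labels
]

coassoc = co_association(partitions)
consensus = consensus_clusters(coassoc)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

Because the co-association matrix only records whether two items share a cluster, it is unaffected by how each partition names its clusters, which is what allows partitions from different models to be pooled.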
The quality of the resulting groups is assessed by their effectiveness in diversifying financial risk within a portfolio backtesting framework. Finally, we tackle a problem relevant to portfolio managers. We introduce a pipeline to allocate the volatility contributions of a portfolio's returns to a set of exogenous risk factors, i.e., factors that are not part of the investable universe, while accounting for the potential multicollinearity among them. A typical use case is a practitioner who aims to disentangle the effect of the risk-free interest rate from corporate credit spread risk, which are indeed significantly correlated. The methodology is demonstrated across different portfolio strategies, and we are able to quantify the residual volatility exposure left unexplained by a set of available risk factors.

| File | Embargo | License | Size | Format |
|---|---|---|---|---|
| PhD_Thesis.pdf | until 03/02/2027 | All rights reserved | 12.84 MB | Adobe PDF |
| PhD_Thesis_1.pdf | until 03/02/2027 | All rights reserved | 12.84 MB | Adobe PDF |
| PhD_Thesis_2.pdf | until 03/02/2027 | All rights reserved | 12.84 MB | Adobe PDF |

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/20.500.14242/356206
URN:NBN:IT:UNITS-356206