Local depth functions and applications to clustering

Francisci, Giacomo
2022

Abstract

Local depth functions (LDFs) describe the local geometric features and mode(s) of multidimensional distributions. In this thesis, we undertake a rigorous, systematic study of LDFs and establish several of their analytical and statistical properties. First, we show that, when the underlying probability distribution is absolutely continuous, scaled versions of LDFs (referred to as τ-approximations) converge to the density, uniformly and in L^q, as τ converges to zero. Second, we establish that, as the sample size diverges to infinity, the centered and scaled sample LDFs converge in distribution to a centered Gaussian process, uniformly in the space of bounded functions on H_G, a class of functions yielding LDFs. Third, using the sample version of the τ-approximation and gradient system analysis, we develop a new clustering algorithm. The validity of this algorithm requires several results concerning the uniform finite difference approximation of the gradient system associated with the sample τ-approximation. For this reason, we establish a Bernstein-type inequality for deviations between the centered and scaled sample LDFs. Finally, invoking the above results, we establish consistency of the clustering algorithm. Applications of the proposed methods to mode estimation and upper level set estimation are also provided.
4 Feb 2022
English
Italian
Spanish
Agostinelli, Claudio
Università degli studi di Trento
TRENTO
148
Files in this item:
phd_unitn_Giacomo_Francisci.pdf
Open Access since 4 February 2024
Size: 2.21 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/175922
The NBN code of this thesis is URN:NBN:IT:UNITN-175922