High-dimensional statistics for complex data

PACELLA, MASSIMO
2018

Abstract

High-dimensional data analysis has become a popular research topic in recent years, driven by new applications in several fields of science that require the analysis of massive data sets. One of the main challenges in analysing high-dimensional data concerns the interpretability of the estimated models and the computational efficiency of the procedures adopted. Both goals can be pursued by identifying the relevant variables that actually affect the phenomenon of interest, so that effective models can subsequently be constructed and applied to practical problems. The first two chapters of the thesis are devoted to high-dimensional statistics for variable selection. We first give a short but exhaustive review of the main techniques developed for the general problem of variable selection using nonparametric statistics. Then, in chapter 3, we present our proposal: a feature screening approach for non-additive models that uses conditional information in the estimation procedure. The second part of the thesis focuses instead on spatio-temporal models in high-dimensional contexts. Over the last decade, a particular class of spatio-temporal models has developed rapidly: the spatial dynamic panel data (SDPD) models. Several versions of the SDPD model have been proposed, based on different assumptions on the spatial parameters and with different properties of the estimators. The standard version of the model assumes that the spatial parameters are constant across locations, whereas a recently proposed version assumes that they are adaptive across locations. The assumption of different scalar coefficients across locations is motivated by practical situations in which empirical evidence shows that imposing a constant effect for every location can be restrictive.
Chapter 4 introduces the principal elements of spatio-temporal models in the statistical and econometric frameworks. In chapter 5 we propose a strategy for testing the particular structure of the SDPD model, by means of a multiple testing procedure that allows one to choose between the version of the model with adaptive spatial parameters and some specific versions derived from the general one by imposing particular constraints on the parameters. The multiple test is carried out in the high-dimensional setting using the Bonferroni technique, and the distribution of the multiple test statistic is derived by a residual bootstrap resampling scheme. [edited by Author]
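
As a rough illustration of the last point, the sketch below combines a residual bootstrap with a Bonferroni correction on a toy problem. It is not the SDPD test developed in the thesis: the no-intercept regression model, the simulated data, and the names `slope`, `bootstrap_pvalue` and `bonferroni_reject` are all assumptions made only for this example.

```python
import random


def slope(x, y):
    """Least-squares slope for the no-intercept model y = b * x + e."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def bootstrap_pvalue(x, y, n_boot=499, rng=None):
    """Residual-bootstrap p-value for H0: b = 0 (toy model, an assumption)."""
    if rng is None:
        rng = random.Random(0)
    b_hat = slope(x, y)
    # Residuals of the fitted model, centred so they have mean zero.
    resid = [yi - b_hat * xi for xi, yi in zip(x, y)]
    mean_r = sum(resid) / len(resid)
    resid = [r - mean_r for r in resid]
    stat = abs(b_hat)
    exceed = 0
    for _ in range(n_boot):
        # Under H0 the response is pure noise, so resample the residuals.
        y_star = [rng.choice(resid) for _ in x]
        if abs(slope(x, y_star)) >= stat:
            exceed += 1
    return (1 + exceed) / (1 + n_boot)


def bonferroni_reject(pvalues, alpha=0.05):
    """Reject hypothesis j when its p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


# Simulated data: unit 0 has a real effect, units 1-3 are pure noise.
rng = random.Random(42)
x = [float(i) for i in range(1, 21)]
true_slopes = [3.0, 0.0, 0.0, 0.0]
pvals = []
for b in true_slopes:
    y = [b * xi + rng.gauss(0.0, 0.1) for xi in x]
    pvals.append(bootstrap_pvalue(x, y, rng=rng))
reject = bonferroni_reject(pvals, alpha=0.05)
print(pvals, reject)
```

With `m` hypotheses each p-value is compared with `alpha / m`, so the number of bootstrap replicates must be large enough for the smallest attainable p-value, `1 / (n_boot + 1)`, to fall below that threshold.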
29 May 2018
English
Variable selection
High-dimensional data
Spatio-temporal models
Giordano, Francesco
DESTEFANIS, Sergio Pietro
Università degli Studi di Salerno
Files in this item:

128320859104572737794855465461663201046.pdf
Open access. License: All rights reserved. Size: 38.16 kB. Format: Adobe PDF.

160488184873290253001003276535535197779.pdf
Open access. License: All rights reserved. Size: 819.56 kB. Format: Adobe PDF.

8926295636198342340712027643296458480.pdf
Open access. License: All rights reserved. Size: 32.08 kB. Format: Adobe PDF.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/311487
The NBN code of this thesis is URN:NBN:IT:UNISA-311487