Classification and regression energy tree for functional data
BRANDI, MARCO
2018
Abstract
Tree-based methods for regression and classification have a long and successful history in statistics and data analysis. They are essentially based on a recursive partition of the covariate space, possibly driven by specific testing procedures designed to control branch creation. Starting from the conditional approach, in which the choice of the split variable and the choice of the split value are separated into two distinct steps to allow unbiased feature selection, this work introduces an energy-based testing scheme to validate each of these phases. Energy methods are based on metrics such as distance correlation which, under suitable conditions, characterizes independence between variables and is therefore more informative than standard association measures. Moreover, since distance correlation measures can be defined for (almost) any kind of variable, the proposed framework is flexible enough to accommodate multiple types of covariates. We focus in particular on the case of functional covariates, for which we show simulated and real data examples, as well as comparisons with more established functional data analysis methods.

File: Tesi dottorato Brandi (open access, 5.11 MB, format unknown)
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/94415
URN:NBN:IT:UNIROMA1-94415