Introducing a "Shift" predictor in the Multivariate Small Area Estimation

MARCIS, Laura
2023

Abstract

This work aims to highlight the peculiarities and strengths of the multivariate mixed model and of principal components. Principal Components Analysis (PCA) is usually treated as an exploratory tool that reduces dimension while avoiding a large loss of information. This thesis instead uses the probabilistic formulation of PCA (see Tipping and Bishop, Ulfarsson and Solo, or both, as will be seen in Chapter 2), which allows Bayesian methods to be applied because it defines a likelihood. As with Linear Mixed Models (LMM), the Bayesian approach to estimation requires the expectation of some random model parameters, conditional on the observed data. The work therefore concerns two models: a probabilistic model for the principal components (PCs) and a multivariate mixed model, applied within the Small Area Estimation (SAE) framework and specifically to a multivariate unit-level nested error regression (NER) model. Several works have analyzed multivariate models for small areas (see Fay, Datta et al., Gonzales and Manteiga, and Benavent and Morales). Unlike in the univariate case, dealing with several dependent variables usually creates convergence problems that are difficult to solve. Furthermore, when covariates are present, problems such as omitted variables (which can bias the estimates) or redundancy (which increases the variability of the estimates) may arise. Despite these drawbacks, high correlation between the response variables reduces the Mean Squared Error (MSE). On the one hand, then, the correlation between the dependent variables can justify applying multivariate estimation to sample data; on the other hand, the relationships within the data can also be preserved in the prediction. This motivates building a new predictor that takes into account the information contained in the data.
This is done not only by using a probabilistic model for PCs but also by ensuring that the predictor links the multivariate mixed model to that model. Chapter 3 presents this new predictor, called the "shift predictor" because it consists of the classic unit-level predictor plus an additional part that takes advantage of all the available information. In this sense it can be seen as a predictor adjusted by the principal components and, by premultiplying it by the eigenvector matrix, one obtains principal components adjusted by the predictor. The new predictor is unbiased, and the simulations show that its prediction MSE is quite similar to that of the standard EBLUP, while its Average Absolute Relative Bias is much lower.
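As a rough illustration of the probabilistic PCA formulation mentioned in the abstract, the sketch below computes the closed-form maximum-likelihood estimates usually attributed to Tipping and Bishop, together with the conditional expectation of the latent scores given the data. It is not the thesis's shift predictor, whose formula is not given here; the function names `fit_ppca` and `posterior_mean_scores` and the choice of NumPy are assumptions made only for this example, and the number of components `q` is assumed to be smaller than the number of variables.

```python
import numpy as np


def fit_ppca(X, q):
    """Closed-form ML estimates for the model x = W z + mu + eps,
    with z ~ N(0, I_q) and eps ~ N(0, sigma2 * I_d). Assumes q < d."""
    mu = X.mean(axis=0)
    # ML covariance estimate (divides by n, not n - 1).
    S = np.cov(X, rowvar=False, bias=True)
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Noise variance: average of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    # Loadings: leading eigenvectors scaled by sqrt(lambda_i - sigma2),
    # taking the arbitrary rotation to be the identity.
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2


def posterior_mean_scores(X, mu, W, sigma2):
    """E[z | x] = M^{-1} W^T (x - mu), with M = W^T W + sigma2 * I_q."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)
    return np.linalg.solve(M, W.T @ (X - mu).T).T
```

The conditional expectation in `posterior_mean_scores` plays a role loosely analogous to the conditional expectation of random effects in a linear mixed model, which is the parallel the abstract draws between the two models.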
24 Oct 2023
English
small area estimation; mixed models; survey sampling; random principal components
SALVATORE, Renato
TOMASSONI, Rosella
Università degli studi di Cassino
Cassino (FR)
Files in this item:
MARCIS_Laura_Introducing a Shift Predictor in the Multivariate Small Area Estimation.pdf (open access, Adobe PDF, 2.04 MB)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/70984
The NBN code of this thesis is URN:NBN:IT:UNICAS-70984