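Socio-occupational and psychological indicators and new biomarkers in the prediction of major cardiovascular events (original title: Indicatori socio-occupazionali, psicologici e nuovi biomarcatori nella predizione di eventi cardiovascolari maggiori)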

BERTU', LORENZA
2016

Abstract

Introduction. The role of psychosocial factors in the onset of cardiovascular disease is well established, but it is not clear which features are most strongly associated with cardiovascular risk, or whether they improve the predictive capability of the classic models based on age, sex, smoking, diabetes, blood pressure and cholesterol. The association between the risk of an event and individual characteristics is generally studied using one of two approaches: i) stochastic data modeling, or ii) algorithmic modeling. Epidemiology has traditionally preferred the first approach because its results can be interpreted in probabilistic terms; however, it relies on parametric assumptions that often fail to identify important factors or interactions between them. The second approach, based on statistical learning techniques (Random Forest), is promising for variable selection and for identifying possible interactions between variables.

Objective. To identify the items of psychosocial scales most involved in the prediction of major cardiovascular events and to evaluate the association between events and the selected items, integrating the information with biomarker data where possible.

Methods. The statistical learning analysis for item selection was conducted on 6567 individuals aged 25-64 years from the MONICA-Brianza and PAMELA cohorts, who experienced 527 events during a median follow-up of 15 years. Random forests were used to select the psychosocial items. Because one of the main problems of random forests is their difficulty in detecting signals in strongly unbalanced data, we analyzed a dataset in which each event is paired with a non-event of the same sex and age (± 5 years) who was still under observation at the onset of the event.
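The pairing of each event with a non-event can be illustrated with a minimal Python sketch. It is not the thesis code: the `Subject` record, the `match_controls` function, and the choices to sample controls at random, without replacement, and only among subjects with no event during follow-up are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subject:
    sid: int          # hypothetical subject identifier
    sex: str
    age: float        # age at baseline, in years
    followup: float   # years from baseline to event or end of observation
    event: bool       # True if a major cardiovascular event occurred


def match_controls(subjects: List[Subject], age_tol: float = 5.0,
                   seed: int = 0) -> List[Tuple[Subject, Subject]]:
    """Pair each event with one non-event of the same sex, with age within
    age_tol years, who was still under observation at the event time."""
    rng = random.Random(seed)
    used = set()   # controls already assigned (assumption: no reuse)
    pairs = []
    cases = sorted((s for s in subjects if s.event), key=lambda s: s.followup)
    for case in cases:
        candidates = [
            c for c in subjects
            if not c.event
            and c.sid not in used
            and c.sex == case.sex
            and abs(c.age - case.age) <= age_tol
            and c.followup >= case.followup   # still observed at event onset
        ]
        if not candidates:
            continue   # in this sketch an unmatched event is simply skipped
        control = rng.choice(candidates)
        used.add(control.sid)
        pairs.append((case, control))
    return pairs
```

The matched pairs form a balanced dataset (one non-event per event), which is the kind of input the abstract describes for the random forest step.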
We identified the number and type of items most associated with the event of interest and used them in a Cox proportional hazards model, both to estimate risk and to evaluate their contribution. The additional contribution of the psychosocial items was measured as the increase in the index of discrimination (c-index).

Results. The random forest analysis highlighted as potential predictors of cardiovascular risk two items from the Jenkins Sleep Questionnaire, four items from the Jenkins Activity Survey, and two items from the Job Content Questionnaire. These items led to a 1.3% increase in AUC when added to a model with age, sex, and the major risk factors.

Conclusions. The results suggest that measuring some psychosocial aspects is important in predicting the risk of a cardiovascular event. The mixed strategy used to develop the risk model (algorithmic for variable selection, stochastic for risk estimation) makes the most of the strengths of both approaches: the algorithmic approach is less constrained by distributional and linearity assumptions, while the stochastic approach is better suited to providing a risk estimate that can be interpreted directly.
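As an illustration of how a gain in discrimination can be quantified, the sketch below computes Harrell's concordance index for two sets of risk scores and takes their difference. It is a simplified, assumed implementation (pairs with tied follow-up times are ignored, and the names `harrell_c` and `c_index_gain` are hypothetical), not the procedure used in the thesis.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[bool],
              scores: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an event and a strictly
    shorter follow-up than subject j. It is concordant when subject i also
    has the higher risk score; tied scores count as 0.5.
    Simplification: pairs with equal follow-up times are not counted.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue   # a censored subject cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def c_index_gain(times: Sequence[float], events: Sequence[bool],
                 base_scores: Sequence[float],
                 extended_scores: Sequence[float]) -> float:
    """Difference in c-index between an extended and a base risk model."""
    return (harrell_c(times, events, extended_scores)
            - harrell_c(times, events, base_scores))
```

Here `base_scores` would stand for the predicted risk from a model with age, sex and the major risk factors, and `extended_scores` for the same model with the selected psychosocial items added; in practice both would come from fitted Cox models.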
15-mar-2016
Italian
Università degli Studi di Milano-Bicocca
Files in this item:
phd_unimib_528429.pdf (Adobe PDF, 2.49 MB), Open Access since 16/03/2017

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/106777
The NBN code of this thesis is URN:NBN:IT:UNIMIB-106777