Estimation of morbidity indicators in population-based cancer studies: methodological developments

DI MARI, FABRIZIO
2026

Abstract

Cancer survival analysis in population-based settings relies on the relative survival framework, which enables the estimation of disease-specific survival when the individual cause of death is unknown. Within this framework, the overall hazard is decomposed into the excess hazard due to cancer and the hazard due to other competing causes of death, the latter derived from routine sources such as national or regional life tables. The survival function associated with the excess hazard alone is the estimand of interest, known as net survival. Non-parametric estimators of this quantity, including Ederer I, Ederer II, Hakulinen, and Pohar-Perme, have been widely used in cancer epidemiology, yet their application has been the subject of long-standing methodological debates regarding bias, variability, and consistency. Researchers have also increasingly emphasized the value of model-based approaches, which allow flexible modeling, improved interpretability, and comparison across demographic groups. This thesis begins by introducing the relative survival framework and providing the mathematical background needed to understand the main non-parametric estimators and their statistical properties. We also review the historical development and discussion of these methods over the past decades. We then focus on the estimation of cure fractions in cancer populations and propose two novel mixture cure models. The first is a cure model with Weibull components, where the number of components is chosen using model selection criteria. The second is a Weibull mixture cure model in which the cure fraction is modeled as a flexible function of the covariates using smoothing splines and neural networks. The performance of each model is evaluated separately through simulation studies under different scenarios, including varying sample sizes and covariate distributions. We then apply these methods to a cohort of colon cancer patients diagnosed in the municipality of Varese, Italy, between 1993 and 2013, providing relevant insights into colon cancer survival outcomes.
The thesis next addresses prevalence estimation and projection using an illness-death model, in which incidence and net survival are combined in a forward calculation method to estimate prevalence. We improve the state-of-the-art procedure by applying regularization techniques for incidence model selection and by estimating net survival within the framework of flexible parametric relative survival cure models. We conduct a simulation study to compare the proposed model selection algorithm with the classical stepwise selection procedure. The combined novel and classical pipelines for estimating prevalence are then compared using data from colon cancer patients diagnosed in Sweden between 1958 and 2019. The thesis concludes with a summary of the main findings and a discussion of directions for future research.
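As a reading aid, the hazard decomposition, net survival, and mixture cure structure described in the abstract can be written in the standard textbook form below. The notation is an assumption and is not taken from the thesis: \lambda_O, \lambda_P, and \lambda_E denote the overall, expected (life-table), and excess hazards, x the covariates, \pi(x) the cure fraction, and S_u the net survival of the uncured, which in the proposed models would be built from Weibull components.

```latex
\begin{align}
  % Overall hazard = expected (other-cause) hazard + excess hazard due to cancer
  \lambda_O(t \mid x) &= \lambda_P(t \mid x) + \lambda_E(t \mid x), \\
  % Net survival: the survival function associated with the excess hazard alone
  S_N(t \mid x) &= \exp\!\left( -\int_0^t \lambda_E(u \mid x)\, du \right), \\
  % Mixture cure model: a cured fraction \pi(x) plus the uncured survival S_u
  S_N(t \mid x) &= \pi(x) + \bigl(1 - \pi(x)\bigr)\, S_u(t \mid x).
\end{align}
```

Under the last expression, net survival levels off at \pi(x) as t grows, which is why the plateau of the net survival curve is read as the cure fraction.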
27 January 2026
English
DE ANGELIS, Roberta
ROCCI, Roberto
VICARI, Donatella
Università degli Studi di Roma "La Sapienza"
97
Files in this item:
File: Tesi_dottorato_DiMari.pdf
Access: open access
License: Creative Commons
Size: 2.76 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/357137
The NBN code of this thesis is URN:NBN:IT:UNIROMA1-357137