Since the first risk prediction model for a chronic disease, the Framingham Coronary Risk Prediction Model (1), was published in 1976, an increasing number of researchers and clinicians have turned their attention to these tools. In the 1980s and early 1990s, investigators began to publish models that predicted the absolute risk of breast cancer, that is, the probability that an individual would develop breast cancer over a defined period of time. The absolute risk, sometimes called the cumulative incidence or crude risk, is reduced by the chance that the person will die of competing causes before developing the disease of interest. The recent identification of high-risk genetic mutations, such as mutations in BRCA1 and BRCA2 for breast cancer, has given rise to models that predict whether an individual carries a mutation (“genetic susceptibility models”). By coupling genetic susceptibility models with the cumulative incidence of disease associated with the mutations, one can use such models to predict absolute risk. Genetic variants with weaker associations with disease, such as single nucleotide polymorphisms (SNPs), also provide information that can be combined with clinical, epidemiological, and biological risk factors to assess risk more accurately. Risk prediction models have proved useful and can be applied for many purposes at either the individual or the population level. They can be used to design, plan, and establish eligibility criteria for prevention and intervention trials; to identify specific groups of people who may benefit from targeted actions; to assess the likely effects of interventions on the burden of disease in a population; and, in clinical decision making, to help physicians and patients determine appropriate screening regimens and/or interventions.
There are two main approaches to modeling the absolute risk from population-based data: one based on modeling cause-specific hazards and the other based on modeling the absolute risk itself (2). The second approach requires cohort data. The first has the advantage that it can be estimated from various kinds of data: a prospective cohort study, a retrospective or historical cohort study, case-cohort data, nested case-control data, or a combination of case-control and registry data. For this reason, we concentrate on modeling cause-specific hazards in this thesis, and in particular on variations of the so-called “Gail Breast Cancer Risk Assessment Model” proposed by Gail et al. in 1989 (3) for breast cancer (BC). We estimate the absolute risk by combining case-control data with registry data from Italy. The development of this model involves four steps: selecting risk factors and estimating their relative risks (RR); determining the population attributable risk fraction (AR); estimating the baseline age-specific BC hazard rate; and combining this information with data on the age-specific hazard of death from competing causes to produce an estimate of the absolute risk of breast cancer. These concepts are explained in more detail under Materials and Methods in Section 3.1. The Gail model was derived using information on irreversible or non-modifiable risk factors: age, age at menarche, age at birth of first live child, number of previous benign breast biopsies, and number of first-degree relatives with breast cancer. In this work, we evaluate, in addition to non-modifiable risk factors, the effect of potentially modifiable predictors of risk on the long-term probability of developing invasive BC.
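The way these ingredients combine can be illustrated with a minimal sketch. It assumes that all hazards are constant within each age interval, that the baseline hazard is obtained as the composite age-specific incidence rate multiplied by (1 - AR), and that the relative risk is fixed within each interval. The function name, argument names, and the numbers in the usage example are hypothetical and are not taken from the thesis.

```python
import math

def absolute_risk(incidence, mortality, attributable_risk, relative_risk, widths):
    """Absolute risk of disease over consecutive age intervals.

    Assumes piecewise-constant hazards. Each argument is a sequence with
    one entry per age interval:
      incidence         - composite age-specific disease rate (per person-year)
      mortality         - hazard of death from competing causes (per person-year)
      attributable_risk - population attributable risk fraction (AR)
      relative_risk     - relative risk (RR) for the individual's risk profile
      widths            - interval length in years
    """
    survival = 1.0  # probability of being alive and disease-free at interval start
    risk = 0.0
    for inc, mort, ar, rr, width in zip(
        incidence, mortality, attributable_risk, relative_risk, widths
    ):
        baseline = inc * (1.0 - ar)   # baseline hazard (all risk factors at reference level)
        h_disease = baseline * rr     # cause-specific hazard for this individual
        total = h_disease + mort      # hazard of leaving the disease-free state
        if total > 0.0:
            p_any_event = 1.0 - math.exp(-total * width)
            # Share of events in this interval that are disease rather than death.
            risk += survival * (h_disease / total) * p_any_event
            survival *= math.exp(-total * width)
    return risk

# Hypothetical inputs for two 5-year age intervals (illustrative values only).
example = absolute_risk(
    incidence=[0.002, 0.003],
    mortality=[0.005, 0.008],
    attributable_risk=[0.4, 0.4],
    relative_risk=[2.0, 2.0],
    widths=[5.0, 5.0],
)
print(f"10-year absolute risk: {example:.4f}")
```

Because the competing mortality hazard enters the survival term, raising it lowers the computed absolute risk, which is the sense in which the absolute risk is "reduced by" competing causes of death.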
The incidence of female breast cancer varies approximately five-fold between countries, being highest in the United States and Northern Europe, intermediate in Southern and Eastern Europe and South America, and lowest in Asia (4,5). For Italy as a whole, no overall estimates are available. The age-standardized (world) incidence rates, ASR(W), per 100,000 reported in Cancer Incidence in Five Continents (CI5) Volume IX, published in 2007, showed the lowest Italian rate in Naples, 62.9 (standard error 1.98), and the highest in Milan, 94.4 (standard error 1.45) (6). Although deaths from breast cancer have been decreasing in many Western countries, the incidence of the disease continues to increase. Moreover, although our understanding of the causes of breast cancer has improved substantially over the past 50 years, few of the established risk factors are modifiable in light of existing social norms. The principal motivation for this research was to incorporate modifiable risk factors, not only to improve risk predictions but also to evaluate the potential impact of changes in behavior or lifestyle that might prevent the disease. In particular, we calculate the effect on absolute risk of setting a group of modifiable risk factors at their lowest-risk levels while keeping the other, non-modifiable, known risk factors unchanged. Based on a review of the literature and a study of potentially modifiable factors in our data, we chose alcohol consumption, body mass index, and leisure-time physical activity as the modifiable risk factors to include in our absolute risk model. We evaluated the performance of the risk model, in terms of calibration and discriminatory ability, using data from the Florence-EPIC cohort. We examined the effects of modifying risk factors at the individual level and report some examples. We also measured the impact of risk modification at the population level in terms of average and percent risk reduction.
Both of these quantities were estimated for the entire population and for defined subsets of the population. We derived methods for defining subset indicators, such as indicators for high-risk subsets based on all risk factors, non-modifiable and modifiable, or on only the non-modifiable risk factors. For all these quantities, 95% confidence intervals were calculated by means of a nonparametric bootstrap procedure that resampled the case-control data. Chapter 2 introduces the Gail model and its assumptions, notation, and methods. In Chapter 3 we describe the data used to build the new absolute risk model with non-modifiable and modifiable risk factors, the construction of the new model, its validation, and its application to assessing the effects of modifying risk factors at the individual level. Chapter 4 presents methods and results for estimating the effects of a modification of risk factors at the population level. We discuss these findings in Chapter 5.
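The population-level summaries and their bootstrap intervals can be sketched as follows. This is a simplified illustration, not the procedure used in the thesis: it assumes that the average risk reduction is the difference between the mean absolute risk before and after modification, that the percent risk reduction is this difference relative to the mean risk before modification, and it resamples individuals' projected risks directly, whereas the thesis resampled the case-control data used to estimate the model. All function names and input values are hypothetical.

```python
import random

def risk_reduction(before, after):
    """Return (average, percent) risk reduction for paired risk projections.

    Assumed definitions: average = mean(before) - mean(after);
    percent = 100 * average / mean(before).
    """
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    average = mean_before - mean_after
    percent = 100.0 * average / mean_before
    return average, percent

def bootstrap_percentile_ci(before, after, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the average risk reduction.

    Simplification: individuals are resampled with replacement and the model
    is not re-estimated in each replicate.
    """
    rng = random.Random(seed)
    n = len(before)
    replicates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        average, _ = risk_reduction([before[i] for i in idx],
                                    [after[i] for i in idx])
        replicates.append(average)
    replicates.sort()
    lower = replicates[int((1.0 - level) / 2.0 * n_boot)]
    upper = replicates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

# Hypothetical 20-year absolute risks before and after modifying risk factors.
risk_before = [0.050, 0.080, 0.065, 0.120, 0.040, 0.095]
risk_after = [0.045, 0.070, 0.060, 0.100, 0.040, 0.080]

avg, pct = risk_reduction(risk_before, risk_after)
low, high = bootstrap_percentile_ci(risk_before, risk_after)
print(f"average reduction {avg:.4f}, percent reduction {pct:.1f}%")
print(f"95% percentile interval for the average reduction: ({low:.4f}, {high:.4f})")
```

Restricting the two input lists to the members of a subset, for example women flagged by a high-risk indicator, gives the corresponding subset-specific quantities.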
Effects of Risk factor modifications on projections of absolute breast cancer risk
PETRACCI, ELISABETTA
2011
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/75224
URN:NBN:IT:UNIMI-75224