Effectiveness of early detection on breast cancer mortality reduction in Catalonia (Spain)

Background: At present, it is complicated to use screening trials to determine the optimal age intervals and periodicities of breast cancer early detection. Mathematical models are an alternative that has been widely used. The aim of this study was to estimate the effect of different breast cancer early detection strategies in Catalonia (Spain), in terms of breast cancer mortality reduction (MR) and years of life gained (YLG), using the stochastic models developed by Lee and Zelen (LZ).
Methods: We used the LZ model to estimate the cumulative probability of death for a cohort exposed to different screening strategies after T years of follow-up. We also obtained the cumulative probability of death for a cohort with no screening. These probabilities were used to estimate the possible breast cancer MR and YLG by age, period and cohort of birth. The inputs of the model were: incidence of, mortality from and survival after breast cancer, mortality from other causes, distribution of breast cancer stages at diagnosis and sensitivity of mammography. The outputs were relative breast cancer MR and YLG.
Results: Relative breast cancer MR varied from 20% for biennial exams in the 50 to 69 age interval to 30% for annual exams in the 40 to 74 age interval. When strategies differ in periodicity but not in the age interval of exams, biennial screening achieved almost 80% of the annual screening MR. In contrast to MR, the effect on YLG of extending screening from 69 to 74 years of age was smaller than the effect of extending the screening from 50 to 45 or 40 years.
Conclusion: In this study we have obtained a measure of the effect of breast cancer screening in terms of mortality and years of life gained. The Lee and Zelen mathematical models have been very useful for assessing the impact of different modalities of early detection on MR and YLG in Catalonia (Spain).


Background
Randomized controlled trials (RCT) are the gold standard for measuring medical interventions. Although controversial, RCT assessing the effectiveness of screening with mammography have provided valuable information [1,2]. While there is still debate about the best screening strategies, and what benefits they produce, at present the cost, time, contamination issues and difficulties with compliance preclude additional RCT for early detection of breast cancer. Because statistical population trends are affected by many factors and therefore are not accurate in measuring the effect of health interventions, there has been increased interest in using population data and mathematical models to assess the effectiveness of early breast cancer detection.
Mathematical models for assessing the effect of health interventions are structured representations of health states and the transitions between them. These models may describe the relationship between an intervention and changes in incidence and mortality rates for a specific disease in a particular population. In the United States (US), a consortium of researchers participating in the Cancer Intervention and Surveillance Modeling Network (CISNET) used statistical and simulation modeling to quantify the relative impact of adjuvant therapy and screening mammography on the decline of breast cancer mortality [3,4]. The stochastic models that Sandra Lee and Marvin Zelen designed in the US under CISNET are an alternative to population trials for addressing and responding to most of the questions that arise when developing and assessing the effect of early detection [5][6][7][8][9][10].
In Spain there is a National Health System (NHS), financed primarily by taxes, which provides universal and free health coverage, including early detection of breast cancer. Catalonia is an autonomous region of Spain which has approximately one sixth of the Spanish population. By the year 2007, the Catalan Health Service was providing services to seven million inhabitants, including 3.5 million women. The Catalan Breast Cancer Screening Program (BCSP) started gradually, at the beginning of the 1990s, providing biennial mammography screening tests, with the target population being women 50-64 years old.
Since the year 2000, women older than 64 are kept in the program until the age of 69, based on the results of a model-derived cost-effectiveness study published in 1998 [11]. At the present time, there is interest in assessing the impact and cost-effectiveness of different modalities (age at the first exam, number of exams and periodicity of exams) of breast cancer early detection in Catalonia (Spain).
The aim of this study was to estimate the effect of different strategies of breast cancer early detection in Catalonia (Spain), in terms of reduction of breast cancer mortality and years of life gained, using the stochastic models developed by Lee and Zelen.
Lee and Zelen developed a probabilistic model that predicts mortality as a function of the early detection modality. The characteristics and assumptions of the LZ model are described in detail elsewhere [5][6][7][8][9][10]. The assumptions of the LZ model are illustrated in Figure 1. The LZ model considers:
• Three chronological times (see Figure 1):
- $x$: time at entering $S_p$; $z + x$: age when entering $S_p$.
The time $x$ is not observed but can be derived from the incidence function and the distribution of sojourn time in the $S_p$ state. $x$ takes a negative value if the transition to $S_p$ occurs before the age at first exam, $z$.
A very relevant element that the LZ models use is the concept of an individual of generation $j$, which is defined as an individual that enters the pre-clinical state in the $j$th interval, $(t_{j-1}, t_j)$. The formulas used to estimate the model probabilities are based on this concept.
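The generation concept can be illustrated with a minimal sketch. The function name `generation`, the exam schedule and the entry times below are hypothetical, and the sketch assumes the interval convention $t_{j-1} < x \leq t_j$, with generation 0 standing for entry into the pre-clinical state at or before the first exam.

```python
from bisect import bisect_left

def generation(x, exam_times):
    """Return j with exam_times[j-1] < x <= exam_times[j].

    j = 0 means the pre-clinical state was entered at or before the first
    exam; None means it was entered after the last scheduled exam.
    """
    j = bisect_left(exam_times, x)
    return j if j < len(exam_times) else None

# Hypothetical biennial schedule, in years since the first exam.
exams = [0.0, 2.0, 4.0, 6.0]
print(generation(3.0, exams))   # 2, because 2.0 < 3.0 <= 4.0
print(generation(-1.5, exams))  # 0, entered before the first exam
```

A negative entry time maps to generation 0, which matches the remark that the time of entry is negative when the transition happens before the first exam.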
The LZ basic model calculates the cumulative probability of death for the cohort group exposed to any screening program after T years of follow-up. Similarly, the cumulative probability of death for the cohort group having typical health care can be calculated. These probabilities are used to calculate the possible reduction in mortality from an early detection program after T years of follow-up.
Survival distributions for exam-diagnosed, interval, and control cases are assumed to be conditional on the stage at diagnosis and treatment, but are not dependent on the mode of diagnosis. The LZ model assumes $k$ stages; $\theta_s(j)$, $\theta_i(j)$ and $\theta_c(j)$ represent the probability of being diagnosed at stage $j$, $j = 1, \ldots, k$, for exam-diagnosed, interval and control cases, respectively, and $f_j(t|z + \tau)$ is the probability density function (pdf) of survival time $t$ among subjects who would have been clinically diagnosed at stage $j$ in the absence of screening. Then the survival time pdfs of the exam-diagnosed, interval and control cases are the mixtures $g_s(t|z + \tau) = \sum_{j=1}^{k} \theta_s(j) f_j(t|z + \tau)$, $g_i(t|z + \tau) = \sum_{j=1}^{k} \theta_i(j) f_j(t|z + \tau)$, and $g_c(t|z + \tau) = \sum_{j=1}^{k} \theta_c(j) f_j(t|z + \tau)$, respectively.
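The mixture idea can be sketched as follows: the stage-specific survival pdfs are shared by all modes of diagnosis, and only the stage probabilities change. Everything numeric here is hypothetical. The exponential survival densities, their rates and the two stage distributions (`theta_c` for control cases, `theta_s` for exam-diagnosed cases) are placeholders chosen for illustration, not the inputs used in the study.

```python
import math

def exp_pdf(rate):
    """Exponential survival-time density; a stand-in for a stage-specific pdf."""
    return lambda t: rate * math.exp(-rate * t)

def mixture_pdf(stage_probs, stage_pdfs):
    """Weight the stage-specific pdfs by the probability of each stage."""
    if abs(sum(stage_probs) - 1.0) > 1e-9:
        raise ValueError("stage probabilities must sum to 1")
    def g(t):
        return sum(p * f(t) for p, f in zip(stage_probs, stage_pdfs))
    return g

# Hypothetical three-stage example (localized, regional, distant).
stage_pdfs = [exp_pdf(0.02), exp_pdf(0.08), exp_pdf(0.40)]
theta_c = [0.50, 0.40, 0.10]  # assumed stage distribution, control cases
theta_s = [0.75, 0.22, 0.03]  # assumed stage distribution, exam-diagnosed cases

g_c = mixture_pdf(theta_c, stage_pdfs)
g_s = mixture_pdf(theta_s, stage_pdfs)
print(g_c(5.0), g_s(5.0))
```

Because the same `stage_pdfs` feed both mixtures, any survival difference between the two groups in this sketch comes only from the shift in stage at diagnosis.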
Since screening will appear to increase survival time, the LZ model controls for lead time bias by setting the origin of survival time for the screened, interval, and clinical cases at the time of clinical diagnosis. Consequently, there is an implied guarantee time for disease-specific survival, that is, the cases diagnosed earlier would have been alive at the time the disease would have been clinically diagnosed. This guarantee time, also called lead time, is a random variable and is incorporated into the equations of the model.
Explicitly, the lead time is $\tau - t_r$, where $\tau$ is the time at which the individual enters the clinical state and $t_r$ is the time of the $r$th detection exam, at which the disease is diagnosed.
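As a small illustration of the lead time, the sketch below lists, for one hypothetical woman, the exams that fall inside her pre-clinical window and the lead time that each would give if it detected the disease. The function name `lead_times` and all numeric values are assumptions for illustration.

```python
def lead_times(x, tau, exam_times):
    """Return (r, tau - t_r) for every exam with x <= t_r < tau.

    x is the time of entry into the pre-clinical state and tau the time of
    entry into the clinical state, both on the same scale as exam_times.
    """
    return [(r, tau - t) for r, t in enumerate(exam_times) if x <= t < tau]

# Hypothetical case: pre-clinical from year 1.0, clinical at year 4.5,
# biennial exams at years 0, 2, 4 and 6.
print(lead_times(1.0, 4.5, [0.0, 2.0, 4.0, 6.0]))  # [(1, 2.5), (2, 0.5)]
```

Setting the origin of survival time at the clinical entry time, rather than at the exam, is what removes these extra years from the survival comparison.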

The inputs of the Lee and Zelen model for Catalonia
We studied women born during the calendar years 1930 to 1959. In this article we present results for the cohorts born from 1955 to 1959. We assumed that the incidence of breast cancer for ages younger than 25 years was insignificant. When incidence or mortality data were not available, they were estimated using age-cohort models [12].

Incidence of breast cancer in Catalonia
Incidence data from the population-based cancer registries of the Catalan provinces of Girona and Tarragona were used. These two registries cover 20% of the Catalan population. Data from the province of Girona were provided by the Girona Cancer Registry and data from Tarragona were downloaded from the International Agency for Research on Cancer (IARC) [13]. The available periods with information on breast cancer incidence were 1980-1989 and 1994-2002 for Girona and 1983-1997 for Tarragona. We obtained the observed incidence rate by combining both sources of information and using the population counts of the official census for the same time periods [14]. To model the data and obtain incidence estimates outside of the observed calendar years we used a generalized linear model with a Poisson distribution and polynomial parametrization of age and cohort variables [15]. This model included a fourth-degree polynomial for the age effects and a second-degree polynomial for the cohort effects. Models with lower degree terms for the age or cohort effects did not fit the observed data properly.
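The structure of such an age-cohort Poisson model can be sketched through its log-linear predictor: the log of the incidence rate is a fourth-degree polynomial in age plus a second-degree polynomial in cohort, and person-years act as an offset. The function names, the centring of age and cohort, and every coefficient below are hypothetical placeholders. They are not the fitted values of the study, and no fitting step is shown.

```python
import math

def poly(value, coefs):
    """Evaluate coefs[0]*value + coefs[1]*value**2 + ... (no constant term)."""
    return sum(c * value ** (k + 1) for k, c in enumerate(coefs))

def expected_cases(age, cohort, person_years, intercept, age_coefs, cohort_coefs):
    """Expected Poisson count under a log link with a person-years offset."""
    if len(age_coefs) != 4 or len(cohort_coefs) != 2:
        raise ValueError("expected 4 age coefficients and 2 cohort coefficients")
    log_rate = intercept + poly(age, age_coefs) + poly(cohort, cohort_coefs)
    return person_years * math.exp(log_rate)

# Hypothetical inputs: age and cohort are centred and scaled (in decades).
mu = expected_cases(
    age=0.5, cohort=-1.0, person_years=100_000,
    intercept=math.log(1e-3),
    age_coefs=[0.6, -0.3, 0.05, -0.01],
    cohort_coefs=[0.15, -0.02],
)
print(mu)
```

Dropping the highest-order terms from `age_coefs` or `cohort_coefs` gives the lower-degree models that the text reports as fitting poorly.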

Mortality due to causes other than breast cancer in Catalonia
We used the multi-decrement life table methodology to partition overall mortality into mortality due to breast cancer and mortality due to causes other than breast cancer [16]. Mortality data were obtained from the Catalan Mortality Registry and the National Institute of Statistics (INE) [14,17]. Overall mortality data were available for calendar years 1900 to 2004 and breast cancer mortality data were available for calendar years 1975-2004. Population estimates were obtained from the INE and the Catalan Statistics Institute (IDESCAT) [14,18]. We subtracted the conditional probabilities of dying from breast cancer from the overall conditional probabilities of death to obtain the probabilities of dying from causes other than breast cancer. We estimated the missing breast cancer mortality probabilities for earlier years of birth using an age-cohort model similar to the generalized linear model described in the incidence data section. Details of these estimations can be found elsewhere [19].
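The subtraction step can be written out as a minimal sketch. The function name and the age-specific probabilities are hypothetical; only the operation itself (overall conditional probability of death minus the breast cancer component) comes from the description above.

```python
def other_cause_probability(q_all, q_bc):
    """Conditional probability of dying from causes other than breast cancer."""
    if not 0.0 <= q_bc <= q_all <= 1.0:
        raise ValueError("need 0 <= q_bc <= q_all <= 1")
    return q_all - q_bc

# Hypothetical conditional probabilities of death at ages 60, 61 and 62.
q_all = [0.0100, 0.0108, 0.0117]
q_bc = [0.0020, 0.0021, 0.0021]
q_other = [other_cause_probability(a, b) for a, b in zip(q_all, q_bc)]
print(q_other)
```

The validity check guards against a breast cancer probability that exceeds the overall probability, which would signal inconsistent source data.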

Distribution of stages at diagnosis
Since there was limited information in Catalonia on the distribution of disease stages at diagnosis, we used US data. Lee and Zelen [7] reported the AJCC distributions of stages for cases diagnosed without screening, screening-detected cases and interval cases. The stage distribution for cases without screening was provided by the Surveillance Epidemiology and End Results (SEER) program of the National Cancer Institute and the stage distributions for the screen-detected and interval cases were provided by the Breast Cancer Surveillance Consortium (BCSC). For screen-detected cases and interval cases, Lee and Zelen distinguish between annual, biennial and irregular screening. Details on stage distribution and definitions of screen-detected or interval cases can be found in Lee and Zelen's work.

Sensitivity of mammography
The sensitivity of mammography in our model was assumed to be the following: 0.55 for < 40 years, 0.65 for 40-45 years, 0.70 for 45-50 years, 0.75 for 50-70 years and 0.80 for ≥ 70 years. These values were used by Lee and Zelen for screening exams conducted in 1995-2000, when they estimated the impact of mammography and adjuvant treatments in the US [7]. Lee and Zelen derived these data from the BCSC database, which contains mammogram screening data and follow-up for approximately one million US women starting from 1994.

Estimation of survival functions in Catalonia
If the hazard ratios of Girona versus the US were not proportional over time, we estimated a time-dependent hazard ratio using expression (1). The Catalan cumulative survival functions, by stage of disease, obtained from the resulting hazard functions fit the observed data from the Girona Cancer Registry well, based on the deviance statistic.
More details on how we obtained the Catalan breast cancer survival functions can be found elsewhere [20].
We used the same method to obtain estimates of the survival functions for the 1990-2001 period and we used these functions to assess the impact of changing the survival functions on the effectiveness of early detection.
Since screening was prevalent in Catalonia during the 1990s, these survival functions are affected by the lead time bias and overestimate survival time after breast cancer diagnosis. In this paper we used them as a very favourable scenario to compare with results obtained when using the survival functions of the 1980s.

The application of Lee and Zelen's model to assess the effect of different breast cancer screening scenarios in Catalonia
Estimation of mortality for the not-screened group (control group)
Lee and Zelen [5] estimated $I_c(y|z)$, the probability of dying $y$ years after the start of the study conditional on being age $z$ at time zero, for the group receiving usual care as:
$$I_c(y|z) = \int_0^y I(z + \tau)\, g_c(y - \tau|z + \tau)\, d\tau$$
where $I(z + \tau)$ is the age-specific point incidence function for age $z + \tau$ and $g_c(y - \tau|z + \tau)$ is the survival time pdf of control detected cases.
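A numerical sketch may help to see how the control-group probability combines its two ingredients. It assumes a convolution form: the incidence at age z + tau multiplied by the survival pdf of the remaining y - tau years, integrated over tau from 0 to y. The function name, the midpoint rule, the constant incidence and the exponential survival pdf are all assumptions made for illustration, not the inputs of the study.

```python
import math

def control_death_density(y, z, incidence, survival_pdf, steps=2000):
    """Midpoint-rule approximation of the integral over tau in (0, y) of
    incidence(z + tau) * survival_pdf(y - tau, z + tau)."""
    if y <= 0:
        return 0.0
    h = y / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += incidence(z + tau) * survival_pdf(y - tau, z + tau)
    return total * h

# Hypothetical inputs: constant incidence of 2 per 1000 per year and an
# exponential survival pdf that ignores the age at diagnosis.
incidence = lambda age: 0.002
survival_pdf = lambda t, age_at_dx: 0.05 * math.exp(-0.05 * t)

print(control_death_density(10.0, 50.0, incidence, survival_pdf))
```

With these toy inputs the integral has the closed form 0.002 * (1 - exp(-0.05 * y)), which gives a convenient check on the numerical approximation.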
In the LZ model, the probability of disease-specific death for the control group at age $z + T$, and the cumulative probability of disease-specific death for the control group after $T$ years of follow-up, are both derived from $I_c(y|z)$.

Estimation of mortality for the screened group
Mortality from cases detected in the screening exams
Lee and Zelen estimate the probability $D_r(y|z)$ of being diagnosed at the $r$th exam (time $t_r$) and dying $y$ years after the start of the study, where $z$ is the age at the start of the study. In order to obtain $D_r(y|z)$, Lee and Zelen distinguish two situations:
1. Being diagnosed at the first exam ($t_0$, $r = 0$). In this case the woman had entered $S_p$ before $t_0$.
2. Being diagnosed at a later exam ($t_r$, $r \geq 1$). In this case there are three possible situations:
(a) being in $S_p$ before $t_0$. All the previous screening exams gave false negative results.
(b) entering $S_p$ at a later time $x$ after $t_0$, but prior to exam $r - 1$ (time $t_{r-1}$), ($t_{j-1} < x \leq t_j$, $j = 1, 2, \ldots, r - 1$, $r > 1$). At least exam $r - 1$ gave a false negative result.
(c) entering $S_p$ in $(t_{r-1}, t_r)$. No previous false negative results.
The four situations cover all the possibilities of early detection in a specific exam. Adding up the cumulative probabilities of dying in any of these four situations, one obtains the probability that a diagnosed case dies $y$ years after the start of the study.
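The role of false negative results in these situations can be sketched with the age-specific sensitivities quoted earlier in the paper. The sketch computes only one factor of the model: the probability that a case already in the pre-clinical state is missed at every earlier exam and detected at exam r. It deliberately ignores the sojourn-time distribution and survival, and the function names, the handling of the boundary ages (45, 50 and 70 assigned to the older band) and the exam ages are assumptions.

```python
def sensitivity(age):
    """Mammography sensitivity by age, using the values quoted in the text."""
    if age < 40:
        return 0.55
    if age < 45:
        return 0.65
    if age < 50:
        return 0.70
    if age < 70:
        return 0.75
    return 0.80

def prob_first_detected_at(exam_ages, r, first_preclinical_exam):
    """Probability of being missed at exams first_preclinical_exam..r-1
    and detected at exam r, given a pre-clinical state at all of them."""
    p = sensitivity(exam_ages[r])
    for i in range(first_preclinical_exam, r):
        p *= 1.0 - sensitivity(exam_ages[i])
    return p

# Hypothetical biennial exams at ages 50, 52 and 54.
ages = [50, 52, 54]
print(prob_first_detected_at(ages, 2, 0))  # two false negatives, then detection
print(prob_first_detected_at(ages, 2, 2))  # no earlier exam while pre-clinical
```

The first call corresponds to a woman who was pre-clinical before the first exam, the second to one who entered the pre-clinical state in the last interval before exam r.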

Mortality from cases detected in intervals between exams
Similarly to cases detected by exams, one can estimate the probability $I_r(y|z)$ of being diagnosed in the $r$th interval between exams $(t_{r-1}, t_r)$ and dying $y$ years after the start of the study.
Once the probabilities $I_r(y|z)$ are estimated, adding them up over the intervals gives the probability that an interval case dies $y$ years after the start of the study. Combining both possibilities of detection gives the probability of disease-specific death for cases diagnosed in the early detection program at age $z + T$, and from it the cumulative probability of disease-specific death for cases diagnosed in the early detection program after $T$ years of follow-up time.

Measures of effect
Relative breast cancer mortality reduction (MR) up to a specific age
For a specific cohort, the relative breast cancer mortality reduction up to age $w$ is the difference between the cumulative probabilities of breast cancer death without and with screening, divided by the cumulative probability without screening. In our analysis we have considered the upper limit of age $w$ to be 80 years. MR has been obtained from 40 to 80 years of age for the cohort.
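A minimal sketch of a relative mortality reduction, assuming the usual definition (the drop in cumulative breast cancer mortality divided by the cumulative mortality without screening). The function name and the two cumulative probabilities are hypothetical values chosen only to show the arithmetic.

```python
def relative_mortality_reduction(cum_death_control, cum_death_screened):
    """Relative reduction in cumulative breast cancer mortality."""
    if cum_death_control <= 0.0:
        raise ValueError("control cumulative mortality must be positive")
    return (cum_death_control - cum_death_screened) / cum_death_control

# Hypothetical cumulative probabilities of breast cancer death up to age 80.
mr = relative_mortality_reduction(0.030, 0.024)
print(f"{100 * mr:.1f}%")
```

Because the measure is relative, the same absolute difference yields a larger MR in a cohort with lower baseline mortality.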

Sensitivity analysis
In order to assess the impact on mortality reduction of changes in the input parameters, we varied the mammography sensitivity and the survival time pdfs. We estimated the impact of changing the sensitivity of mammography by setting the initial sensitivities in the model to 90% for all age groups. However, since changes in the sensitivity of mammography may affect the distribution of stages at diagnosis as well as the distribution of sojourn time in a pre-clinical state, for which we do not have accurate data, we present only data on the effect of changing the survival time pdfs.
Table 2 shows the effect on MR and YLG when different survival pdfs are used. Results for annual exams in ages 40-74 years and biennial exams in ages 50-69 years are presented. In scenario A we assumed that the survival time pdfs were the 1980-1989 Catalan functions (pre-mammography).
When looking at the YLG between 40 and 80 years of age, we did not see the same pattern as with MR (Table 2). Contrary to what was expected, YLG per woman screened or per breast cancer diagnosed remained similar or even decreased when the survival functions improved. We attribute this result to the fact that, when survival by stage of disease improves, there is a gain in life-years in the no-screening group as well.

Discussion
Randomized clinical trials are important for assessing the effects of screening. Nevertheless, the benefit of screening for breast cancer has remained controversial because of inconsistent results from clinical trials and controversies in systematic reviews [1,2]. Mathematical models may help answer questions for which empirical evidence is scarce and aid in understanding some of the basic issues relating to the early diagnosis of breast cancer [5]. We have identified mathematical models which are very useful for assessing the impact of different modalities of early detection on reducing mortality and potential years of life lost in Catalonia (Spain). Among the different approaches to population modeling, we chose the Lee and Zelen model because its assumptions are realistic and consistent with other data sources [5,10]. The LZ model is flexible, can incorporate complex information and interventions and may be used to determine optimal screening modalities.
Our aim was to assess the effect of different early detection scenarios using data from Catalonia, when available. We used Catalan population and mortality statistics. Breast cancer incidence was estimated using data from Cancer Registries in two Catalan provinces (Girona and Tarragona) and survival after a diagnosis of breast cancer was obtained using data from one of the Catalan Cancer Registries (the Girona Cancer Registry) and the US survival data. Incidence and mortality data for future time periods was projected using age-cohort models. When regional data was not available, we used information from the literature. We assumed that screening started in 1985, which  is consistent with the fact that the Catalan Breast Cancer Screening Program (BCSP) started gradually at the beginning of the 1990s, but some opportunistic breast cancer screening was done in the public and private health care sector during the 1980s [21]. We also assumed that exams started at 40 years of age or later and ceased at 69 or 74 years of age. In order to compare the effect of different modalities of screening, we assessed the effect of different screening scenarios in the same age span, 40-80 years of age. Our results reflect the effect of early detection if all women in each specific cohort had participated and complied with the screening scenarios assessed.

Our findings
For all the studied birth cohorts, depending on the screening scenario, our estimated reduction in breast cancer mortality in the 40-80 age span varied from about 20% for biennial exams in the 50-69 age interval to about 30% for annual exams in the 40-74 age interval. When exams were performed biennially, the MR achieved was almost 80% of the annual screening MR.
With annual exams, extending the program from 69 to 74 years produced mortality reductions roughly 2% higher, whereas extending the program from 50 to 45 years produced increases of 1.5%. Extending from 45 to 40 years represented an increase of 1% in mortality reduction. If we look at the years of life gained, there were minimal changes when the program was extended from 69 to 74 years of age, whereas extending the program from 50 to 40 years of age increased the time gained per breast cancer diagnosed by about 0.3 years (four months). In any case, these results should be interpreted with caution, because the impact of early detection also needs to take into account the potential harm and cost of intensive screening [22][23][24]. We have also observed that changes in breast cancer survival have an impact on the mortality reduction achieved but not on the years of life gained.
Changes in the sensitivity of mammography to 90% sensitivity in all age groups did not result in changes in MR or YLG (data not shown). As we mentioned in the methods section, an improvement in mammography sensitivity may affect the distribution of the stages at diagnosis and the distribution of sojourn time in a pre-clinical state.
Since we did not change these distributions, our assessment of the impact of changing the sensitivity of mammography may not fully reflect what would happen in practice.

Comparison with other studies
Tabar et al, using data from the Regional Oncology Centres and Statistics Sweden for two Swedish counties (Dalarna and Linköping), compared all deaths from breast cancer diagnosed in the 20 years before screening was introduced with those in the 20 years after introduction of screening [25]. After adjustment for age, self-selection bias, and changes in breast-cancer incidence in the 40-69 year age-group, Tabar et al estimated a 44% reduction in breast cancer mortality in women exposed to screening (RR = 0.56). They estimated a 16% reduction in women not exposed to screening (RR = 0.84). The 33.3% ((0.84-0.56)/0.84) difference in breast cancer mortality reduction between the screened and non-screened group can be interpreted as the reduction attributable to screening. This figure is higher than our estimated 21% reduction in the 40-69 year age-group with biennial exams, a scenario comparable to the study of two Swedish counties where the screening interval was 18 months for women in the 40-54 year age-group and two years in older women [26]. Similarly, the 39% reduction reported by the Swedish Organised Service Screening Evaluation group, using data from six counties with interscreening intervals of approximately two years, is higher than our result for annual screening in the 40-69 year age interval [27,28].
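The attributable reduction quoted above follows directly from the two relative risks. The short sketch below reproduces the arithmetic; the function name is an illustrative choice.

```python
def attributable_reduction(rr_exposed, rr_unexposed):
    """Share of the mortality reduction in exposed women that goes beyond
    the reduction also seen in women not exposed to screening."""
    return (rr_unexposed - rr_exposed) / rr_unexposed

print(round(100 * attributable_reduction(0.56, 0.84), 1))  # 33.3
```

Dividing by the relative risk of the non-exposed group, rather than subtracting the two percentage reductions (44% minus 16%), treats the secular decline in mortality as the baseline against which screening is measured.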
Anderson et al assessed the impact of early detection in Connecticut. They found that incidence rates for early-stage tumors increased dramatically, whereas rates for late-stage tumors experienced a modest decrease. Breast cancer mortality rates fell 31.6%. These results were consistent with effective early detection and improved treatment over time, but also suggest that many mammography-detected early-stage lesions may never progress to late-stage cancers [29].
A Cochrane systematic review updated in 2006 by Gotzsche and Nielsen, based on seven trials involving half a million women in Europe and North America, estimated a 15%-20% reduction in breast cancer mortality [24]. These trials were conducted in the 1970s and the 1980s, and had different age intervals for exams and different periodicities. Most of the trials included women aged 50 to 69 and the periodicity was variable; in most of them screening was nearer to biennial exams than annual. Reductions obtained using the LZ models with the Catalan 1980-89 data are consistent with the results reported in the Cochrane review.

Limitations and other considerations
We may have overstated the advantages of modelling and the inability of RCT to answer some specific questions, such as age at initiation and the ideal frequency of screening. It should be noted that many of the parameters used in the models are obtained from the published RCT. In fact, the models could also be seen as a form of sensitivity analysis around health policy decision making. Models can complement RCT by testing hypotheses that would be impossible to test otherwise.
The assumption that the distribution of breast cancer stages at diagnosis was the same in Catalonia as in the US may have slightly biased our results. We made this assumption because information on the distribution of stages for different screening patterns was not available in Catalonia. We know that, before the introduction of mammography, the distribution of stages in the Girona Cancer Registry (GCR) was worse than in the US (41% localized, 49% regional and 10% distant stages in the GCR in the 1980-89 period versus 53% localized, 38% regional and 9% distant stages in the US in the 1975-79 period). If we assume that the stage distribution for different screening scenarios in Catalonia is similar to the US distribution (which is consistent with most of the RCT results, in which 70-80% of early diagnosed cases are node negative), then the stage shift attributable to screening would have been even higher in Catalonia, resulting in a larger effect of early detection. Therefore, the assumption we made about the distribution of breast cancer stages at diagnosis may have resulted in an underestimation of the effects.
We do not present a validation of our findings in this article comparing the estimated effect of screening with observed mortality data. Breast cancer mortality in the last two decades has not only been influenced by the introduction of mammography but also by other events like the dissemination of adjuvant treatments. To validate our findings we need to incorporate the dissemination of mammography and adjuvant treatments, similar to what the CISNET did in the US [3]. We are planning to do that soon.
Several authors have reported a significant increase in incidence attributable to screening [29,30]. We used an age-period-cohort model to assess changes in incidence trend related to the dissemination of screening. We found a slight increase in incidence starting at the beginning of the 1990s that did not reach statistical significance. As a consequence, we projected breast cancer incidence using an age-cohort model.
Gotzsche and Nielsen estimated a 30% increase in overdiagnosis and overtreatment [24]. According to these authors, for every 2000 women invited for screening throughout 10 years, one will have her life prolonged and 10 healthy women, who would not have been diagnosed if there had not been screening, will be diagnosed as breast cancer patients and will be treated unnecessarily.
This study has not estimated the impact of false positive results of early detection exams. We think that this is an important issue to take into account when assessing the effect of early detection. We are planning to extend the LZ models to estimate the impact of false positive screening results. Some authors have estimated that after 10 years of annual screening, 30 to 50% of women have at least one false-positive mammogram result [31][32][33]. In Catalonia, Castells et al estimated that after 10 mammograms, the cumulative false positive recall rate was 32.4% [34].
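To give a sense of scale for these cumulative figures, the sketch below converts between a per-exam false positive rate and the cumulative rate after n exams. It assumes that exams are independent and share a constant false positive rate, which is a simplification made here for illustration and not the method used in the cited studies. The function names are hypothetical.

```python
def cumulative_false_positive(per_exam_rate, n_exams):
    """Probability of at least one false positive in n independent exams."""
    return 1.0 - (1.0 - per_exam_rate) ** n_exams

def implied_per_exam_rate(cumulative_rate, n_exams):
    """Constant per-exam rate that reproduces a given cumulative rate."""
    return 1.0 - (1.0 - cumulative_rate) ** (1.0 / n_exams)

# Per-exam rate implied by a 32.4% cumulative rate after 10 mammograms.
p = implied_per_exam_rate(0.324, 10)
print(round(100 * p, 2))
```

Under this simplification the Catalan estimate corresponds to a per-exam rate of just under 4%. In practice recall rates tend to differ between first and subsequent exams, so this is only a rough guide.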
Finally, the ultimate goal of our project is to assess the cost-effectiveness of different strategies for the early detection of breast cancer in Catalonia. In this study we have obtained a measure of the effect of breast cancer screening in terms of mortality and years of life gained. The impact of false positives and a cost-effectiveness analysis using an extension of the LZ models are our next targets.

Conclusion
We have estimated the impact of different strategies for early detection, using mammography, on breast cancer mortality reduction. For the first time our study presents a measure of the effect of early detection based on the observed Catalan incidence and breast cancer survival data. Since it is currently difficult to use experimental studies to determine optimal age intervals and periodicities for screening, mathematical models are an alternative for assessing the effects of early detection.