Overdiagnosis of breast cancer in the Norwegian Breast Cancer Screening Program estimated by the Norwegian Women and Cancer cohort study

Background: There is increasing ambivalence towards national mammographic screening programs due to varying publicized estimates of overdiagnosis, i.e., breast cancer that would not have been diagnosed in the women’s lifetime outside screening. This analysis compares the cumulative incidence of breast cancer in screened and unscreened women in Norway from the start of the fully implemented Norwegian Breast Cancer Screening Program (NBCSP) in 2005.
Methods: Subjects were 53 363 women in the Norwegian Women and Cancer (NOWAC) study, aged 52–79 years, with follow-up through 2010. Mammography history and breast cancer risk factor information were taken from the most recent questionnaire (2002–07) before the start of individual follow-up. The analysis differentiated screening into incidence screening (52–69 years) and post-screening (70–79 years). Relative risks (RR) were estimated by Poisson regression.
Results: The analysis failed to detect a significantly increased cumulative incidence rate in screened versus other women 52–79 years. RR of breast cancer among women outside the NBCSP, the “control group”, was non-significantly reduced by 7% (RR = 0∙93; 95% confidence interval 0∙79 to 1∙10) compared to those in the program. The RR was attenuated when adjusted for risk factors; RRadj = 0∙97 (0∙82 to 1∙15). The control group consisted of two subpopulations, those who only had a mammogram outside the program (RRadj = 1∙04; 0∙86 to 1∙26) and those who never had a mammogram (RRadj = 0∙77; 0∙59 to 1∙01). These groups differed significantly with respect to risk factors for breast cancer, partly as a consequence of the prescription rules for hormone therapy which indicate a mammogram.
Conclusions: In the fully implemented NBCSP, no significant difference was found in cumulative incidence rates of breast cancer between NOWAC women screened and not screened. Naïve comparisons of screened and unscreened women may be affected by important differences in risk factors. The current challenge for the screening program is to improve the diagnostics used at prevalence screenings (ages 50–51).


Background
The public discussion following a large number of scientific articles related to overdiagnosis in national mammographic screening programs for breast cancer has become a major concern both for national screening programs and for women deciding whether to participate. In the context of screening, overdiagnosis is the discovery of cancers that without screening would not have been diagnosed and consequently treated in a woman's lifetime [1,2]. The main problem is the lack of diagnostic procedures that can subclassify breast tumors into overdiagnosed and clinically important invasive cancers, which would obviate overtreatment. This limitation has forced researchers to try many different approaches to estimate overdiagnosis [3-14]. An independent meta-analysis of three clinical trials reported a 19% increased incidence of breast cancer among screened women during the screening period and an 11% increased incidence if the years after the active screening were included [1]. Estimates based on ecological analyses are heterogeneous [6,8,9,11-13]. A major weakness of ecological analyses is the inability to adequately control for the confounding effect of hormone therapy (HT) [15]. In Norway, as in most other countries, public guidelines for prescribing HT include an initial mammogram or participation in a national screening program [16,17]. Thus, the participants in the program will more often be users of HT. Since HT users have a higher risk of breast cancer [18,19], some of the estimated overdiagnosis might be due to the more extensive HT use among screening program participants. In addition, HT can reduce both mammogram sensitivity and specificity due to high breast density associated with HT use [20].
Estimates of overdiagnosis have included either only invasive cancers, or both invasive and ductal carcinoma in situ (DCIS) which are most often identified through mammography. Several recent cohort analyses were published from Norway [2], Denmark [21], and Italy [22] using a record linkage design with information on screening invitations or participations from program registries, and outcomes from cancer registries. The estimated overdiagnosis varied from almost zero to around ten percent when the years after the end of active screening were included. Individual level data were used to examine overdiagnosis in Norway, resulting in overdiagnosis estimates between 10 and 20 percent [2]. None of the studies had access to information on mammograms taken outside the program or necessary information for control of confounders or assessment of risk factors.
While the historical development of the screening program in Norway and many other countries has been used for estimating overdiagnosis, the core question for women entering the system today is the current and future level of risk of overdiagnosis in the national ongoing program. The Norwegian Breast Cancer Screening Program (NBCSP) has operated on a national scale since 2005 [23]. When estimating the consequence of participating in a mammographic screening program, three different screening phases are identified. First, prevalence screening occurs during the first participation. In the NBCSP, all women are first invited at 50 or 51 years of age. Second, later screening examinations (52-69 years), in which both the clinical and mammographic examinations are compared to previous ones, constitute incidence screening. Finally, the "compensatory drop" occurs in the years after the age of 69, when the women are no longer offered screening. Since screening should detect cancers earlier than they would normally be identified, a drop in incidence is expected when screening stops [10].
This analysis uses the national population based cohort Norwegian Women and Cancer study (NOWAC) to compare cumulative breast cancer incidence rates among women with different mammography histories between 52 and 79 years of age using incidence data for 2005-2010.

Norwegian Breast Cancer Screening Program (NBCSP)
The NBCSP started in 1996 as an evaluation project in four counties, but later expanded to the entire country [24]. It follows European guidelines with mammograms obtained in two views and each read independently by two radiologists. In 2005, the program was fully implemented in most of the country, but first invitations to the program were still being sent to all age groups (50-69) in two small counties: Hedmark and Vestfold. Starting in 2006, the screening program was fully implemented and all women were first invited to the program at 50 or 51 years. In 2006, there were 28 375 women with first invitations to the NBCSP, including 25 357 (89%) at ages 50-51, 1361 (5%) at ages 52-53 and the rest (6%) in older age groups (Cancer Registry of Norway, unpublished data). Women are then invited back every two years through age 69.

Norwegian Women and Cancer (NOWAC)
NOWAC was initiated in 1991 [25]. Questionnaires were mailed to women randomly selected from the national population register held by Statistics Norway during 1991-2007. For each woman the unique person number, name, and address were extracted. Before mailing the letter of information and the questionnaire, the person number was replaced by a serial number that was the only identification on the questionnaire. The overall response rate for NOWAC questionnaires is 62%. All linkages between NOWAC members and national registries were done by Statistics Norway based on the unique person number.
The NOWAC questionnaires include information about mammography as well as lifestyle and sociodemographic factors. During 2002-2005, the questionnaires included detailed questions about the type of mammogram. Women were asked if they had a mammogram and if so, how many were through an invitation to the NBCSP, through a referral from their doctor, or without an invitation or referral. Ninety-one percent of women aged 52 or older at the time of their submitted questionnaire answered these detailed mammography questions. Women who indicated that they had at least one examination via NBCSP invitation were considered as participating in the mammography program. During 2005-2007, after the nationwide implementation of the NBCSP, the referral questions were removed and instead women were asked how many years it had been since their last mammography examination. The change in questions was based on the assumption that information on participation could be taken from the screening register held by the Cancer Registry of Norway. However, this detailed information has not been available to researchers on an individual level, with a recent exception [2].
Although the questions about mammography were asked only once, the answers for women over age 52 were stable indicators of mammography patterns. A random subset of NOWAC participants who were asked about their mammography history in 2003 received another questionnaire during 2010-2011. For those who were 52 years or more at the time of the first questionnaire and answered a second questionnaire (N = 7361), 93% of those who reported never having had a mammogram on the first questionnaire had the same response. The answers regarding programmed mammograms were also robust. Of those indicating participation in the NBCSP, 83% responded the same 7 years later and for those indicating that they only had mammograms outside the NBCSP, 78% responded the same 7 years later.

Sample selection
The NOWAC Cohort includes 172 478 women between the ages of 30 and 70 at recruitment who were randomly selected from the Norwegian population. A subset of these women was selected to form a Mammography Evaluation Cohort for this study. The sample was restricted to women who completed a NOWAC questionnaire during 2002-2007 at an age of 52 or higher and who lived in a county with a fully implemented screening program. These restrictions ensured that women in the study would have received at least one invitation to participate in the screening program prior to the questionnaire, making it possible to determine which women were participating in the national screening program. The evaluation cohort also excluded women with a diagnosis of invasive breast cancer or DCIS prior to their questionnaire, and those who did not answer the mammography questions. The Mammography Evaluation Cohort includes 53 363 women divided into three groups based upon their mammography history. The "never had a mammogram" group includes all women who reported only "No" to the questions about mammogram history. Since the Mammography Evaluation Cohort necessarily excludes women in the prevalence screening (ages 50-51) in order to accurately identify those participating in the program, previously published rates for invasive breast cancer and DCIS for Norwegian women aged 50-51 were used for comparison [2].

Person-years and follow-up
The number of people at risk for having a breast cancer diagnosis during 2005-2010 was calculated at the person level. Person-years (PY) were based on date of entrance into the study, age group, and date of exit from the study (date of invasive breast cancer diagnosis, death, or end of follow-up on 31 December 2010). Follow-up data during 2005-2010 came from the Cancer Registry of Norway and the Cause of Death Registry. Women in the Mammography Evaluation Cohort had an average follow-up time of 5•6 years (median 6•0) for a total of 300 016 PY, and 972 incident diagnoses of breast cancer (Table 1). The majority of these diagnoses were invasive breast cancer (89%) with the remaining DCIS (11%). The participation rate in the NBCSP for the cohort was 75%, which is similar to previously reported national participation rates of 76% [23]. As a randomly selected cohort, NOWAC participants have similar age-specific incidence rates as national figures [26] for 2006-2010 (Figure 1). NOWAC participants 65-69 years had a slightly higher incidence rate than those nationally, but the cumulative incidence rates were similar. The incidence rates for those in the Mammography Evaluation Cohort, the current study population, are representative of the overall cohort and thus comparable to national rates (dashed line in Figure 1; ages 55-79). For DCIS, the incidence rates were closely correlated.
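The follow-up rule described above can be sketched in a few lines. This is a minimal illustration, not the study's SAS code: the function name `person_years`, the 365•25-day year, and the example dates are assumptions made for the sketch, and the splitting of follow-up time by age group is omitted. The totals at the end are the figures reported in the text, used only as a consistency check.

```python
from datetime import date

END_OF_FOLLOW_UP = date(2010, 12, 31)  # end of follow-up stated in the text

def person_years(entry, diagnosis=None, death=None, end=END_OF_FOLLOW_UP):
    """Years at risk from entry to the earliest of diagnosis, death, or end of follow-up.

    Assumption for this sketch: a year is 365.25 days and dates are exact.
    """
    exit_date = min(d for d in (diagnosis, death, end) if d is not None)
    return max((exit_date - entry).days, 0) / 365.25

# Hypothetical women (illustrative dates, not study data)
examples = [
    person_years(date(2005, 1, 1)),                              # followed to end of 2010
    person_years(date(2005, 1, 1), diagnosis=date(2008, 1, 1)),  # exits at diagnosis
]

# Consistency check on the totals reported in the text
total_py, n_women, n_cases = 300_016, 53_363, 972
mean_follow_up = total_py / n_women        # average years of follow-up per woman
crude_rate = n_cases / total_py * 100_000  # all breast cancer diagnoses per 100 000 PY
```

The crude rate computed here is simply the reported number of diagnoses divided by the reported person-years; it is not age-adjusted and is not a figure quoted in the paper.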

Statistical analyses
Characteristics of the Mammography Evaluation Cohort groups were compared using chi-square tests of independence. Confidence intervals for the age-specific breast cancer incidence rates were calculated assuming a Poisson distribution [27]. Cumulative incidence rates for ages 52-79 were calculated as the cumulated sums of the age-specific incidence rates. Rates for each age were estimated from rates calculated for each age group assuming a constant rate within each group [28]. Log rank tests were used to compare cumulative incidence rates between groups. Age-adjusted relative risks and their 95% confidence intervals were estimated using Poisson regression with robust error variance [29].
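As an illustration of the rate calculations described above, the following sketch computes an age-specific rate with an approximate Poisson confidence interval and a cumulative rate as the sum of single-year rates. It rests on stated assumptions: Byar's approximation is used for the interval because it needs no statistical library (the method of reference [27] may differ), the function names are hypothetical, and the example counts are invented rather than taken from the study.

```python
import math

def rate_per_100k(cases, person_years):
    """Incidence rate per 100 000 person-years."""
    return cases / person_years * 100_000

def byar_ci(cases, person_years, z=1.96):
    """Approximate 95% Poisson interval for a rate (Byar's approximation).

    Assumption: this stands in for whichever Poisson interval the analysis used.
    Returns (lower, upper) per 100 000 person-years.
    """
    if cases == 0:
        lower = 0.0
    else:
        lower = cases * (1 - 1 / (9 * cases) - z / (3 * math.sqrt(cases))) ** 3
    d = cases + 1
    upper = d * (1 - 1 / (9 * d) + z / (3 * math.sqrt(d))) ** 3
    return lower / person_years * 100_000, upper / person_years * 100_000

def cumulative_rate(groups):
    """Sum of single-year rates, assuming a constant rate within each age group.

    groups: iterable of (first_age, last_age, cases, person_years).
    """
    total = 0.0
    for first_age, last_age, cases, py in groups:
        width = last_age - first_age + 1  # number of single ages in the group
        total += width * rate_per_100k(cases, py)
    return total

# Invented example: two age groups with rates of 300 and 400 per 100 000 PY
toy_groups = [(52, 54, 30, 10_000), (55, 59, 40, 10_000)]
toy_cumulative = cumulative_rate(toy_groups)  # 3 * 300 + 5 * 400 per 100 000
toy_ci = byar_ci(100, 100_000)                # interval around 100 per 100 000 PY
```

Because each age group contributes its rate once per single year of age, wider age groups weigh more in the cumulated sum, which is why the constant-rate assumption within groups matters.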
The NOWAC questions before 2005 made it possible to perform the analyses taking into account three groups: the program group of women with at least one mammogram in the NBCSP, the outside group of women with mammograms only outside the screening program, and the never group of women who reported never having a mammogram. The last two groups were combined into a "control" or reference group for comparison with women participating in the screening program in order to be comparable with other analyses of program screened versus a control group. Estimates of relative risks were adjusted for major risk factors for breast cancer taken from the woman's most recent questionnaire.
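The grouping rule can be written out explicitly. The sketch below is only an illustration of the definitions given above; the argument names `n_program` and `n_outside` are hypothetical stand-ins for the self-reported numbers of mammograms inside and outside the NBCSP, not actual NOWAC variable names.

```python
def mammography_group(n_program, n_outside):
    """Assign one of the three mammography history groups.

    n_program: self-reported mammograms via NBCSP invitation (hypothetical name).
    n_outside: self-reported mammograms via referral or without invitation
               or referral (hypothetical name).
    """
    if n_program >= 1:
        return "program"   # at least one mammogram in the NBCSP
    if n_outside >= 1:
        return "outside"   # mammograms only outside the program
    return "never"         # never had a mammogram

def analysis_group(group):
    """Collapse 'outside' and 'never' into the combined control group."""
    return "program" if group == "program" else "control"

# A woman with mammograms both inside and outside the program counts as 'program'
mixed_history = mammography_group(n_program=2, n_outside=1)
```

Note that a woman with any NBCSP mammogram is placed in the program group regardless of additional mammograms taken elsewhere, which follows from the "at least one" definition.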
All statistical analyses were conducted in SAS version 9•2 (SAS Incorporated, Cary, NC, USA). Statistical significance was defined as a two-sided test resulting in a p-value less than 0•05.

Results
The distribution of lifestyle factors related to breast cancer risk (Table 2) shows several distinct differences. Women who never had a mammogram tended to be older than those in the other groups, had more children, were less likely to have had a maternal history of breast cancer, and most distinctly, were less likely to be current users of HT (12% versus 25% for those in the program and 32% for those with mammograms outside the program). Differences between women in the control group, i.e., those who had a mammogram outside the program and those who never had a mammogram, were statistically significant for all risk factors (p < 0•05). The age-specific incidence rates for the program group versus the control group showed similar rates across the study interval 52-79 years (Figure 2). Women who had a mammogram in the program and those who did not both demonstrated increasing rates of incident breast cancer during ages 52-69, although this was most evident in those who participated in the NBCSP. The drop in incidence rates for women over 69 years was clearly shown for both groups and was in the order of 200/100 000 PY from the preceding age group. Values for ages 50-51 are previously published rates of invasive breast cancer and DCIS for Norwegian women who attended the NBCSP during 1995-2009 (394/100 000 PY) and for those invited but who did not attend (211/100 000 PY) [2].
The cumulative incidence rates of invasive breast cancer and DCIS for women in the program and those in the control group were similar over the age-groups 52-79 years (Figure 3; p = 0•47). The cumulative rate ratio for ages 52-79 was 1•05 between the program and control groups. If the previously published rates for ages 50-51 are included, then the cumulative rate ratio for ages 50-79 is 1•09 between the program and control groups. When examined separately, the heterogeneity of the control group becomes evident. Women who reported never having had a mammogram had the lowest cumulative rate of breast cancer, although it was not significantly different from that of women in the screening program (p = 0•07). Women who reported only having had mammograms outside the screening program had the highest cumulative rate of breast cancer. Notably, the only significant difference in cumulative rates of cancer was between women who had mammograms outside the program and those who never had mammograms, i.e., between the two groups in the 'control' group (p = 0•05). When cumulative rates of only invasive breast cancers are examined (not shown), the general patterns are the same although the differences in cumulated rates for ages 52-79 are much smaller among the groups. The age-adjusted relative risks of having a breast cancer diagnosis for those who never had a mammogram compared with those in the program group showed a marginally significant 23% reduced risk (RR = 0•77; 95% CI: 0•59 to 1•01) (Table 3). The results were nonsignificant and attenuated after adjusting for important risk factors (RRadj = 0•82, 0•61 to 1•11). Women who had received a mammogram only outside the screening program had a slightly higher, although also nonsignificant, age-adjusted relative risk for a breast cancer diagnosis compared to those in the screening program (RRadj = 1•04, 0•86 to 1•26). The combined group of all women without a screening program mammogram (never had a mammogram or only outside screening) versus the program group showed an age-adjusted relative risk of 0•93 (0•79 to 1•10). Further adjustments for risk factors attenuated the estimate; 0•97 (0•82 to 1•15).
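A reader can check the internal consistency of the reported relative risks with a short calculation. The sketch below assumes that the intervals are Wald-type intervals built on the log scale, which is the usual output of Poisson regression but is not stated in the text; under that assumption the point estimate should sit close to the geometric mean of the interval limits, and an interval that includes 1 corresponds to a non-significant result at the 5% level. The function name and dictionary keys are illustrative.

```python
import math

def log_scale_summary(lower, upper, z=1.96):
    """Summarise a ratio's 95% interval, assuming it is symmetric on the log scale."""
    return {
        "se_log_rr": math.log(upper / lower) / (2 * z),  # implied standard error of log(RR)
        "midpoint": math.sqrt(lower * upper),            # geometric mean of the limits
        "excludes_one": lower > 1 or upper < 1,          # True if significant at 5%
    }

# Relative risks and intervals as reported in the text (versus the program group)
reported = {
    "control, age-adjusted": (0.93, 0.79, 1.10),
    "control, risk-factor adjusted": (0.97, 0.82, 1.15),
    "never had a mammogram, age-adjusted": (0.77, 0.59, 1.01),
}

summaries = {
    label: (rr, log_scale_summary(lower, upper))
    for label, (rr, lower, upper) in reported.items()
}
```

For each of the three comparisons the geometric midpoint agrees with the reported point estimate to two decimals, and none of the intervals excludes 1, in line with the description of these results as non-significant or marginally significant.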

Discussion
This analysis did not show a significantly increased cumulative incidence rate in screened women versus other women from NOWAC in age-groups covering 52-79 years. Relative risks of those not screened compared with screened showed a 7% decreased risk, although non-significant. After adjusting for risk factors, the decreased risk was 3%. The control group, which did not participate in the screening program, included two quite different subpopulations: those who never had a mammogram, and those who only had them outside the program. These subpopulations differed significantly on risk factors and although the cumulative risk of breast cancer for those with mammograms outside the program was higher compared with those who never had a mammogram, the difference was non-significant.
The results and interpretations have some limitations. First, the cohort is small. This is clearly demonstrated in the analysis of different subgroups. Second, the information on screening participation was based on a self-report questionnaire of mammography history. It is unknown if some women in the control group may have had a screening program mammogram after their last questionnaire, although the small retest group suggests low bias. A source of systematic bias could be the introduction of digital technology between 2004 and 2011. This could give an additional prevalence screening in all age-groups dependent on the introduction in the different counties. The introduction was gradual and the last counties changed in 2011 (Cancer Registry of Norway, unpublished data).
On the other hand, NOWAC is one of very few nationally representative cohorts that can be used for the analysis of an ongoing screening program. The results clearly demonstrate the problem of comparing a control and a screened group and the interpretations of the control group concept. The control group in Norway consisted of women either with mammograms taken outside the program as a consequence of diagnostic procedures or opportunistic screening, and those without any mammograms. The relative structure of these groups might differ from country to country and over time, invalidating ecological analyses. The importance of the weights of the three groups was clearly related to the systematic differences in risk factors and by the underlying incidence rates. The most important was the confounding by current use of HT, which could be a serious bias in ecological analyses due to its known carcinogenic properties [30]. Another systematic selection bias between the two groups could be due to the Norwegian action plan for surveillance of breast cancer in carriers of certain BRCA mutations or women with evidence of risk of hereditary breast cancer without documented mutations [31]. This plan calls for annual mammograms for women from as young as 25 through age 60 and is independent of the NBCSP. Thus, these women are included in the group who had mammograms outside the NBCSP, which could explain the high prevalence of a maternal history of breast cancer of 8% in this group versus 3% among those who reported never having had a mammogram. This differential prevalence of women with a maternal history of breast cancer could partially explain the differences in incidence.
The analysis included two age-related groups and background data on a third covering the ages of screening and a 10-year post-screening follow-up. The previously published data for the prevalence screening (ages 50-51) indicated that screened women had twice the rate of breast cancer during these ages as those who did not attend screening. Our analysis of the incidence screening (52-69 years) showed no significant differences according to mammogram history. There was a drop in breast cancer rates following the end of screening for both those screened and those not participating in the NBCSP, possibly related to repeated screenings outside the NBCSP for those in the control group.
The cumulative incidence rates for the individual mammography histories elucidated the differences within the control group. Those who never had a mammogram had the lowest cumulative rate for ages 52-79, while those who only had mammograms outside the NBCSP had the highest cumulative rate. Although not significantly different, these subtle differences in cumulative rates mirror the significant differences in risk factors observed. These findings suggest that ecological comparisons among self-selected groups of mammography attendees may be misleading if they fail to account for the heterogeneity within the unscreened population.
Other approaches for the analysis of the Norwegian screening program used a cohort design with counties as proxy for individual information on screening participation [3]. This gave an overdiagnosis of 25% in the age range 50-69 years or 15% including the "compensatory drop". Again, the analysis was based on the assumption of a control group without knowledge of the use of mammography in that group. The Cancer Registry of Norway [2] recently estimated overdiagnosis in the order of 10-13% for invasive breast cancer and 14-20% for both DCIS and invasive breast cancers depending on whether women were only invited to mammography or had multiple screening visits. The analysis was based on individual information on screening status in the national screening program, but did not examine risk factors. The discrepancies with the estimates based on the older clinical trials [1] could be due to the improvement in diagnostics over time. The estimate of overdiagnosis in the present analysis is lower than both the UK independent estimate [1] and the estimate from the Cancer Registry of Norway [2], but it did not include prevalence diagnoses. Given these published overdiagnosis estimates that included prevalence screenings, the high rates of breast cancer cases during ages 50-51, and our non-significant findings during ages 52-69, the data suggest that most overdiagnosis may occur during the prevalence screenings.

Conclusions
The results from the present analysis differ from previous studies due to the focus on recent, modern screening in the context of the Norwegian health care system and with adjustment for important confounders. Future work on overdiagnosis should include examination of risk factors, especially use of HT, when providing estimates. The findings suggest that women participating in the ongoing national mammographic screening program for breast cancer after its full implementation might only have overdiagnosis related to the prevalence screening. This should lead to a more careful diagnostic work-up for women during the initial prevalence screening and careful consideration of necessary treatment.

Consent
All women provided informed consent for later linkages to the Cancer Registry of Norway, the Norwegian Breast Cancer Screening Program, and the register of death certificates in Statistics Norway as well as the use of these data in research. The NOWAC study was approved by the Regional Committee for Medical and Health Research Ethics in North Norway. The Directorate of Health gave an exemption from the Norwegian rules of confidentiality for linking data using personal identifiers.