Flemish breast cancer screening programme: 15 years of key performance indicators (2002–2016)

Background: We examined 15 years of key performance indicators (KPIs) of the population-based mammography screening programme (PMSP) in Flanders, Belgium.
Methods: Individual screening data were linked to the national cancer registry to obtain oncological follow-up. We benchmarked crude KPI results against KPI-targets set by the European guidelines and KPI results of other national screening programmes. Temporal trends were examined by plotting age-standardised KPIs against the year of screening and estimating the Average Annual Percentage Change (AAPC).
Results: PMSP coverage increased significantly over the period of 15 years (+7.5% AAPC), but the increase fell to +1.6% after invitation coverage was maximised. In 2016, PMSP coverage was at 50.0% and opportunistic coverage was at 14.1%, resulting in a total coverage by screening of 64.2%. The response to the invitations was 49.8% in 2016, without a trend. Recall rate decreased significantly (AAPC -1.5% and -5.0% in initial and subsequent regular screenings respectively) while cancer detection remained stable (AAPC 0.0%). The result was an increased positive predictive value (AAPC +3.8%). Overall programme sensitivity was stable and was at 65.1% in 2014. In initial screens of 2015, the proportion of DCIS, tumours stage II+, and node negative invasive cancers was 18.2, 31.2, and 61.6% respectively. In subsequent regular screens of 2015, those proportions were 14.0, 24.8, and 65.4% respectively. Trends were not significant.
Conclusion: Besides a suboptimal attendance rate, most KPIs in the Flemish PMSP meet EU benchmark targets. Nonetheless, there are several priorities for further investigation such as a critical evaluation of strategies to increase screening participation, organising a biennial radiological review of interval cancers, analysing the effect that preceding opportunistic screening has on the KPI for initial screenings, and efforts to estimate the impact on breast cancer mortality.


Introduction
Breast cancer (BC) is a leading cause of disease burden among women in Europe: an estimated 522,513 women were diagnosed with BC in 2018, and 137,707 died of BC that year (GLOBOCAN 2018). Mammographic screening can reduce BC mortality in women over 50 years old, although the magnitude of this mortality reduction is the subject of ongoing debate. Estimates range from 20% or less for the group invited to screening, to 48% for the group that gets screened [1,2]. Mammographic screening also has limitations, including the occurrence of interval cancers and the diagnosis of BC that would never have been diagnosed or caused symptoms in the absence of screening (overdiagnosis).
Many countries offer mammographic screening in the framework of a population-based mammography screening programme (PMSP), which aims to give all asymptomatic women in the target population systematic and equal access to screening while quality assurance and data collection are performed in a centralized manner. A PMSP can exist in parallel with opportunistic screening, which follows the spontaneous initiative of the woman or her physician [3].
Using breast cancer mortality as an endpoint in the evaluation of a PMSP seems obvious, but it takes many years before an effect on mortality can be observed [4]. Key performance indicators (KPIs) cannot replace a mortality analysis, but enable programmes to compare performance against objectives. Monitoring and evaluating KPIs (such as cancer detection rate or programme sensitivity) is a necessity for public health interventions such as a PMSP to justify the use of public means [1,4].
We calculated KPIs for the Flemish PMSP for the years 2002-2016, benchmarked crude KPI results against KPI-targets set by the European guidelines and KPI results of other national screening programmes, and examined temporal trends in age-standardised KPIs.

General outline of the PMSP in Flanders
The mammograms are read independently by two certified screening radiologists. Both readers categorize mammograms according to a five-category classification similar to BI-RADS (Breast Imaging-Reporting and Data System) [5]. Classes III (probably benign), IV (suspicious abnormality), and V (highly suspicious lesion) are recalled for diagnostic assessment. If the two readers do not reach the same conclusion, a third radiologist performs the third (and decisive) reading.
All results are sent to women (by post) and their physicians (electronically, and also by post in case of a suspicious finding). The physician's letter describes breast density, type of lesion, location of the lesion, and advice regarding the nature of diagnostic assessment, and it is sent 3 days before the woman's letter. Diagnostic assessment can take place in any radiological centre.

Two pathways of PMSP participation
There are two pathways by which a woman can get screened in the PMSP. In pathway-1-screenings, physicians specifically prescribe a PMSP screening. This prescription is equivalent to a PMSP letter of invitation as in pathway-2-screenings (see below). Pathway-1-screenings are reported as self-registration since these women did not receive an invitation prior to their participation. This pathway is not a safety net for unequal access to the PMSP, but is rather meant to acknowledge the fact that some physicians have an excellent physician-patient relationship, rendering an invitation unnecessary. Women can be screened on a regular basis in pathway 1 for many years, without ever receiving an invitation.
In pathway 2, the Centre for Cancer Detection (Centrum voor Kankeropsporing, CvKO) uses the list of the eligible population to send out invitations by post every 2 years (eligible population is explained in the next section). Invitations contain an appointment at a certified mammography unit, which can be changed by calling a toll-free number. Besides this letter, there is no other formal system to remind women of an upcoming appointment.

Population
The target population includes all women in Flanders aged 50-69, identified with the central population registry.
The eligible population is the target population minus all women who had a bilateral mastectomy or BC in the last 10 years. These women are identified by using a unique 11-digit personal identification number to cross-link each individual of the target population to the Belgian Cancer Registry (BCR). This exclusion is performed twice per year, before sending out the invitations that are scheduled for the following 6 months.
All women from the eligible population should receive an invitation the same year, except women who: actively opted out; already had a PMSP screening in the previous year; were already invited in the previous year; had a pathway-1-screening in the current year.
We calculate invitation coverage to assess whether all these women did indeed receive an invitation.
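The invitation rule and the resulting invitation coverage can be sketched as follows. This is a minimal illustration, not the programme's actual software: the `Woman` record, its field names, and both function names are hypothetical, and the coverage formula follows the definition of invitation coverage used in this paper (women invited in year x as a proportion of all women who should be invited in that year).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Woman:
    # Hypothetical flags mirroring the four exceptions listed in the text.
    opted_out: bool = False
    pmsp_screening_last_year: bool = False
    invited_last_year: bool = False
    pathway1_screening_this_year: bool = False


def should_be_invited(w: Woman) -> bool:
    """True if an eligible woman is due an invitation this year."""
    return not (
        w.opted_out
        or w.pmsp_screening_last_year
        or w.invited_last_year
        or w.pathway1_screening_this_year
    )


def invitation_coverage(women: Sequence[Woman], invited: Sequence[bool]) -> float:
    """Percentage of women due an invitation who actually received one."""
    due = [was_invited for w, was_invited in zip(women, invited) if should_be_invited(w)]
    if not due:
        raise ValueError("no woman is due an invitation")
    return 100.0 * sum(due) / len(due)


# Four eligible women; only the first and third are due an invitation.
women = [Woman(), Woman(opted_out=True), Woman(), Woman(invited_last_year=True)]
invited = [True, False, False, False]
coverage = invitation_coverage(women, invited)  # one of two due women was invited
```

Keeping the exceptions in a single predicate makes it easy to check that the denominator of invitation coverage contains only women who were actually due an invitation.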

Opportunistic screening in Flanders
Women can also have a mammogram outside the PMSP. These mammograms are billed to the health insurance as "diagnostic mammograms". They follow the spontaneous initiative of the woman or her physician, and require a prescription that is different from the prescription used for a pathway-1-screening. The results of these mammograms are communicated at the end of the exam and there is no systematic second reading. These mammograms can either have a diagnostic indication (women with symptoms of breast cancer, or mammograms meant as diagnostic assessment) or be intended for opportunistic screening (women without symptoms of breast cancer). Because data on diagnostic mammograms are not stored centrally, the total number of these mammograms can only be obtained from reimbursement records. Unfortunately, reimbursement records cannot distinguish between mammograms performed for a diagnostic indication and those done for opportunistic screening.
We therefore consider all of these mammograms as opportunistic screening, even though some of them were undoubtedly for diagnostic purposes (see below, Determining screening status).

Oncological follow-up of screenings
The BCR collects data concerning all new cancer cases in Belgium and has access to health insurance reimbursement data. The completeness of the BCR breast cancer data was previously estimated to be 99.7% [6]. At the time of screening, women are given the possibility to opt-out of their data being used for research. Refusal rates fluctuate around 1% or less of screened women. The national privacy commission approved using a unique 11-digit personal identification number to cross-link each consenting screened individual to the oncological data from the BCR. Relevant BCR data can therefore be used as oncological follow-up for every consenting screened woman. This is currently the only source of follow-up data.

Determining screening status
We report on two types of participation data:

Invitation response
Percentage of women who got a PMSP screening within 24 months after receiving their invitation (The invitation is valid up to 24 months after being sent).

Coverage
The basis of our coverage data was the eligible population. Since the eligible population fluctuates throughout the year (death, immigration, etc.), we used the data of the first of January of each year as the basis for coverage data. The Flemish Working Group on breast cancer screening developed a method to determine coverage status for all of these women: check for opportunistic screening and PMSP screening in year x and x-1 and then use

Definitions
The definitions in Table 2 were used together with the above descriptions of population and screening status.
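As an illustration of how the screening-type definitions can be applied to screening dates, the sketch below labels a screening as initial, subsequent regular, or subsequent irregular using the 30-month threshold from Table 2. The function names are hypothetical, and the way calendar months are counted (adding 30 calendar months to the previous screening date) is an assumption, since the paper does not specify it.

```python
import calendar
from datetime import date
from typing import Optional

REGULAR_INTERVAL_MONTHS = 30  # threshold from the definitions in Table 2


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    years, month_index = divmod(d.month - 1 + months, 12)
    year, month = d.year + years, month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def screening_type(previous_pmsp: Optional[date], current: date) -> str:
    """Classify a PMSP screening relative to the woman's previous PMSP screening."""
    if previous_pmsp is None:
        return "initial"
    if current <= add_months(previous_pmsp, REGULAR_INTERVAL_MONTHS):
        return "subsequent regular"
    return "subsequent irregular"


first = screening_type(None, date(2016, 5, 1))
on_time = screening_type(date(2014, 1, 15), date(2016, 7, 15))  # exactly 30 months
late = screening_type(date(2014, 1, 15), date(2016, 7, 16))     # one day past 30 months
```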

Statistical analysis
We included all screening mammograms made for women 50-69 years old during the period 2002-2016. Crude KPIs were calculated as described above, stratified by year of screening, and reported separately for initial and subsequent screenings (see Table 2). Age-standardised KPIs were calculated using the world standard population [7]. We benchmarked our crude KPI results against KPI results of other national screening programmes, and the KPI-targets set by the European guidelines for quality assurance in breast cancer screening [4].
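Direct age-standardisation amounts to a weighted average of age-specific KPIs, with weights taken from a standard population. The sketch below shows the arithmetic only; the function name, the five-year age bands, the rates, and the weights are illustrative assumptions and are not the values of the standard population in [7].

```python
from typing import Dict


def age_standardised_rate(
    age_specific_rates: Dict[str, float],
    standard_weights: Dict[str, float],
) -> float:
    """Weighted mean of age-specific rates, weights summing to any positive total."""
    if set(age_specific_rates) != set(standard_weights):
        raise ValueError("age groups of rates and weights must match")
    total_weight = sum(standard_weights.values())
    weighted_sum = sum(
        age_specific_rates[group] * standard_weights[group]
        for group in age_specific_rates
    )
    return weighted_sum / total_weight


# Illustrative (made-up) cancer detection rates per 1000 women screened.
rates = {"50-54": 4.0, "55-59": 5.0, "60-64": 6.0, "65-69": 7.0}
# Illustrative weights only; substitute the standard population actually used.
weights = {"50-54": 5.0, "55-59": 4.0, "60-64": 4.0, "65-69": 3.0}
standardised = age_standardised_rate(rates, weights)  # (20 + 20 + 24 + 21) / 16
```

Because the same weights are applied in every screening year, changes in the standardised KPI over time cannot be caused by shifts in the age structure of the screened population.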
Age-standardised KPIs were plotted against the year of screening to analyse temporal trends. APCs (Annual Percentage Change) were estimated from least squares regressions on the logarithm of the age-standardised KPIs versus year of screening. APC is to be interpreted as the mean multiplicative change per year (relative percentage change). If a trend could not be considered linear over the entire interval (on a log scale), the Average Annual Percentage Change (AAPC) was calculated instead of the APC. The AAPC is calculated as the average of the APC estimates of several segments, weighted by the corresponding segment length. In each of these segments the trend (on a log scale) can be considered linear [8]. This method has been used in many studies in a variety of fields to identify temporal patterns [9,10].
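The APC and AAPC calculations described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the segments are taken as given (the sketch does not search for joinpoints or run permutation tests), and the AAPC is computed by averaging the log-scale slopes weighted by segment length and then back-transforming, which is our understanding of how the AAPC is defined in the Joinpoint software.

```python
import math
from typing import Sequence


def log_linear_slope(years: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least squares slope of log(value) regressed on year."""
    logs = [math.log(v) for v in values]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    return sxy / sxx


def apc(years: Sequence[float], values: Sequence[float]) -> float:
    """Annual percentage change: mean multiplicative change per year, in percent."""
    return (math.exp(log_linear_slope(years, values)) - 1.0) * 100.0


def aapc(segment_slopes: Sequence[float], segment_lengths: Sequence[float]) -> float:
    """Length-weighted average of segment slopes, back-transformed to percent."""
    total = sum(segment_lengths)
    mean_slope = sum(b * w for b, w in zip(segment_slopes, segment_lengths)) / total
    return (math.exp(mean_slope) - 1.0) * 100.0


# Synthetic KPI that grows by exactly 5% per year.
years = [2002, 2003, 2004, 2005, 2006]
kpi = [10.0 * 1.05 ** k for k in range(5)]
single_segment_apc = apc(years, kpi)

# Two synthetic segments of equal length: +20% per year, then flat.
two_segment_aapc = aapc([math.log(1.20), 0.0], [1.0, 1.0])
```

In the two-segment example the AAPC equals the geometric mean of the yearly growth factors (the square root of 1.20, minus one), which is why it is lower than the arithmetic mean of the two APCs.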
We used the Joinpoint Regression Programme (version 4.7.0), developed by the US National Cancer Institute, to estimate the models that best fitted the data (default setting, Permutation Test) and to calculate AAPC. When a KPI had several joinpoints, we also report the APC of the last segment, since this can give interesting information about the most recent trend. All other analyses were conducted using Stata version 13 (StataCorp., USA); significance was set at p < 0.05.

Results
Table 1 shows that between 2002 and 2016, a total of 2,613,737 PMSP screenings were performed, of which a BCR link was established for 97.7%. These women had a mean age of 58.6 years.

Participation
In the first 10 years of the PMSP, the proportion of women receiving an invitation was suboptimal: invitation coverage did not reach 90% until 2011 and reached 96.0% in 2016 (see Fig. 1 and Table 3). PMSP coverage was at 50.0% in 2016 and opportunistic coverage was at 14.1%, resulting in total coverage by screening at 64.2%. PMSP coverage increased significantly (+7.5% AAPC), but the increase mainly occurred between 2002 and 2007 (APC +14.2%), coinciding with the sharp rise in invitation coverage. After 2007, the AAPC is still positive but falls to +1.6%. The response to the invitations was 49.8% in 2016 and has not displayed an upward trend since the initiation of the programme.

Recall rate & cancer detection
Figure 2 combines recall rates, positive predictive values, and cancer detection rates (as proposed by Blanks [11]). Figure 2 and Table 3 show that recall rate has decreased in initial and subsequent screenings (AAPC -1.5% & -5.0% in initial and subsequent regular screenings). In the subsequent regular screens a decrease in recall rates occurred together with a stable CDR (AAPC 0.0%), resulting in an increased positive predictive value (PPV) (AAPC +3.8%).

Interval cancers and sensitivity
Table 4 shows that overall programme sensitivity is stable and was 65.1% in 2014. There is only a significant trend in the initial screens (AAPC −1.3%). Most of the interval cancers (62.9% for women screened in 2014) arise in the second year after screening (no significant trend). The majority of interval cancers appear after a negative screening. Nonetheless, 9.6% of all interval cancers occurring after a 2014 screening were found after a positive screening followed by a false negative diagnostic assessment. This proportion shows a clear decreasing trend (AAPC −6.4%). The interval cancer rate for screenings from 2014 on was 3.6/1000 and 2.7/1000 (initial and subsequent regular screens respectively), without a significant trend.

Table 2 Definitions
Cancer detection rate: The number of breast cancers detected in a screening round per 1000 women screened.
False-positive recall: Any recall for diagnostic assessment that was not followed by a screen-detected cancer.
False-positive recall rate: The number of women with a false-positive recall per 1000 women screened.
Initial screening: The first screening examination of individual women within the PMSP, regardless of how long the programme has been running.
Interval cancer:
• Breast cancer that was diagnosed within 24 months of a negative screen.
• Breast cancer that was diagnosed more than 3 months after the first diagnostic assessment that followed a positive screen (but at the latest within 24 months of screening).
Interval cancer rate: The number of interval cancers diagnosed per 1000 women screened.
Invitation coverage: The number of women that receive an invitation in year x, as a proportion of all women that should be invited in that year.
Positive predictive value: The number of breast cancers detected per 100 women recalled for diagnostic assessment.
Screen-detected cancer: Breast cancer that was diagnosed within 3 months of the first diagnostic assessment that followed a positive screen (but at the latest within 24 months of screening).
Subsequent irregular screening: Any screening examination after the initial screening, where the most recent PMSP screening occurred > 30 months after the previous PMSP screening.
Subsequent regular screening: Any screening examination after the initial screening, where the most recent PMSP screening occurred <=30 months after the previous PMSP screening.
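The rate definitions in Table 2 translate into simple ratios of counts. The sketch below is illustrative only: the function name and the counts are hypothetical (the counts were chosen so that the results fall close to figures reported for 2016 in this paper, they are not programme data), and expressing the recall rate as a percentage of women screened is an assumption.

```python
from typing import Dict


def screening_kpis(screened: int, recalled: int, screen_detected: int) -> Dict[str, float]:
    """Compute recall, detection, PPV and false-positive rates from raw counts."""
    if screened <= 0 or recalled <= 0:
        raise ValueError("screened and recalled must be positive")
    if not 0 <= screen_detected <= recalled <= screened:
        raise ValueError("expected screen_detected <= recalled <= screened")
    # A recall not followed by a screen-detected cancer is a false-positive recall.
    false_positive_recalls = recalled - screen_detected
    return {
        "recall_rate_pct": 100.0 * recalled / screened,
        "cancer_detection_rate_per_1000": 1000.0 * screen_detected / screened,
        "ppv_per_100_recalled": 100.0 * screen_detected / recalled,
        "false_positive_recall_rate_per_1000": 1000.0 * false_positive_recalls / screened,
    }


# Hypothetical counts for one screening year.
kpis = screening_kpis(screened=100_000, recalled=2_567, screen_detected=547)
```

The example makes the link between the indicators explicit: with a stable cancer detection rate, any reduction in recalls lowers the false-positive recall rate and raises the positive predictive value.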

Tumour stage of screen-detected cancers
Figure 3 and Table 4 show the distribution of tumour stage. There appears to be little difference between the distributions for initial and subsequent screens, which is surprising.
The proportion of DCIS was 18.2 and 14.0% in 2015 (initial and subsequent regular screens respectively), without a significant trend.
The proportion of tumours stage II+ was 31.2 and 24.8% in 2015 (initial and subsequent regular screens respectively). There is a significant trend only in the initial screens (AAPC + 1.9%).
Benchmark targets for DCIS distribution were achieved. The benchmark for stage II+ was not achieved in initial screenings, while 2015 was the first year in which it was achieved for subsequent regular screens.

Nodal status of screen-detected cancers
The proportion of node negative cases among all invasive SDC was 61.6 and 65.4% in 2015 (initial and subsequent regular screens, respectively), without a significant trend (Fig. 4 and Table 4). This is below EU targets. The proportion of invasive SDC for which nodal status was unknown was 7.7 and 11.2% in 2015 (initial and subsequent regular screens respectively). Figure 4 also shows what the proportion of node negative SDC would be if all these unknown cases turn out to be node negative.
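The effect of unknown nodal status can be expressed as a simple range. In the sketch below, the function name is hypothetical; the lower bound assumes none of the unknown cases are node negative and the upper bound assumes all of them are, with both percentages taken on the same denominator (all invasive SDC), as reported above for 2015.

```python
from typing import Tuple


def node_negative_range(node_negative_pct: float, unknown_pct: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds for the true node-negative proportion, in percent."""
    if node_negative_pct < 0 or unknown_pct < 0 or node_negative_pct + unknown_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    lower = node_negative_pct                # no unknown case is node negative
    upper = node_negative_pct + unknown_pct  # every unknown case is node negative
    return lower, upper


initial_range = node_negative_range(61.6, 7.7)      # initial screens, 2015
subsequent_range = node_negative_range(65.4, 11.2)  # subsequent regular screens, 2015
```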

Discussion
We analysed key performance indicators for the Flemish PMSP for the period 2002-2016.
A much larger fraction of the population was covered in 2016 (64.2%) compared to the start of the programme (46.2% in 2003), even though the response to the screening invitation remained stable throughout 15 years. The growth in coverage slowed down after the majority of women started receiving timely invitations (93.2% in 2011). This indicates that the PMSP coverage increase was not so much the result of a change in intention to screen among the target group, but was instead largely due to the fact that more women were receiving their invitation on time.
Opportunistic screening was well established in Belgium long before the PMSP started [12]. Between 2003 and 2016, opportunistic coverage gradually decreased (AAPC −3.0%). Many of these women gradually switched to the PMSP. Several factors may have encouraged this switch: the quality of the opportunistic screening is not guaranteed (quality assurance of equipment, double-reading, etc.), opportunistic screening is not entirely free of charge, and booking appointments for a PMSP screening requires less effort from the women. [13]. The decrease in recall rate, combined with the stable CDR, means that fewer women are receiving a false-positive recall (20.2/1000 screens in 2016), leading to a higher positive predictive value of the screening mammograms (21.3% in 2016), which is also above the EU mean of 12.2% [3]. There are several hypotheses for this. Firstly, yearly symposia on lowering recall rate have been organized by the CvKO since 2010. Secondly, individual 4-monthly feedback has been sent to all readers since 2008-2009. These reports compare their individual recall rate with the anonymised rates of their colleagues. Thirdly, the introduction of digital mammography screening, which led to an increased CDR in other countries [14], occurred in the same period as the reduction of the recall rate. Theoretically, the introduction of digital screening could have increased the CDR and thereby masked a lowering of the CDR due to a more restrictive recall strategy. However, this is unlikely, as previous research has shown that digitalization in Flanders did not result in significantly different cancer detection rates [15]. Although the lowering of recall rate in combination with a stable CDR is a positive evolution, it is necessary to evaluate its potential downside, i.e. the interval cancer rate. More specifically, a review of interval cancers could determine whether breast cancers are more likely to be missed compared to other countries.
Surprisingly, the tumour stage distributions hardly differ between initial and subsequent regular screening. The same is true for CDR: in 2016 the CDR was 5.0‰ in subsequent regular screens (EU mean 5.6‰) and 6.3‰ in initial screens (EU mean 7.2‰) [3]. This could be explained by a large proportion of "initial screens" which were preceded by opportunistic screening [16]. In 2019, the CvKO will pilot a method that adjusts the KPIs of initial screens for the occurrence of such preceding opportunistic screening.
Benchmark targets for nodal status were not achieved in 2015. This could be partly caused by the fact that more than 10% of 2015's invasive SDC still have unknown nodal status. Assuming at least some of these unknown cases are node negative, the benchmark targets might be achieved.
Programme sensitivity is stable (65.1% in 2014) but lower than in other countries such as Germany (78.2%) [17], the Netherlands (74.4%) [18], Norway (75.5%) [19], or Canada (68%) [20]. Closer inspection reveals that the categorization of BC as either SDC or interval cancers differs between programmes. For instance, in the German programme any BC found within 24 months after a positive screening was considered an SDC, while the Canadian programme only considered a BC as screen detected if it was found within 6 months after a positive screening [17,20]. The Canadian programme will thus classify certain BC as interval cancers, while the German programme would see them as SDC. Such differences will influence programme sensitivity. The Flemish PMSP only considers a BC as screen detected if it was found within 3 months after the first diagnostic assessment that follows a positive screening (see also Table 2) [21]. The Canadian definition of an SDC is relatively close to the Flemish, which might explain why their programme sensitivity is similar (68% in Canada, 65.1% in Flanders) [20].
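The influence of the SDC definition on programme sensitivity can be illustrated with a small sketch. Everything in it is hypothetical: the `Case` record, the function names, and the four example cancers are made up for illustration. Programme sensitivity is taken here as SDC divided by the sum of SDC and interval cancers, which is the usual definition and is assumed rather than quoted from this paper, and a positive screen without a recorded diagnostic assessment is treated as an interval cancer, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

FOLLOW_UP_MONTHS = 24  # cancers diagnosed later are outside the follow-up window


@dataclass
class Case:
    screen_positive: bool
    months_screen_to_diagnosis: float
    # Time from first diagnostic assessment to diagnosis (positive screens only).
    months_assessment_to_diagnosis: Optional[float] = None


def classify_flemish(case: Case) -> str:
    """SDC if diagnosed within 3 months of the first diagnostic assessment."""
    if case.months_screen_to_diagnosis > FOLLOW_UP_MONTHS:
        return "outside follow-up"
    assessed = case.months_assessment_to_diagnosis
    if case.screen_positive and assessed is not None and assessed <= 3:
        return "screen-detected"
    return "interval"


def classify_by_window(case: Case, window_months: float) -> str:
    """SDC if diagnosed within a fixed window after a positive screen."""
    if case.months_screen_to_diagnosis > FOLLOW_UP_MONTHS:
        return "outside follow-up"
    if case.screen_positive and case.months_screen_to_diagnosis <= window_months:
        return "screen-detected"
    return "interval"


def programme_sensitivity(labels: List[str]) -> float:
    """SDC as a percentage of SDC plus interval cancers (assumed definition)."""
    sdc = labels.count("screen-detected")
    interval = labels.count("interval")
    if sdc + interval == 0:
        raise ValueError("no cancers within follow-up")
    return 100.0 * sdc / (sdc + interval)


cases = [
    Case(True, 1.5, 0.5),    # diagnosed promptly after assessment
    Case(True, 6.0, 5.0),    # diagnosed 5 months after a negative assessment
    Case(True, 14.0, 13.0),  # diagnosed long after a negative assessment
    Case(False, 18.0),       # diagnosed after a negative screen
]
flemish = programme_sensitivity([classify_flemish(c) for c in cases])
six_month = programme_sensitivity([classify_by_window(c, 6) for c in cases])
two_year = programme_sensitivity([classify_by_window(c, 24) for c in cases])
```

With identical cancers, the three rules give different sensitivities, which is why sensitivities from programmes with different SDC definitions are not directly comparable.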
To decrease the risk of missing BC (thereby increasing sensitivity), the CvKO started a self-teaching project in 2018 which provides all readers with a yearly list of BC for which they had made a negative reading. To counter a possible increase in recall rate, readers also receive a list of their positive readings in which no breast cancer was found in the 2 years following screening.
The major strength of this first nationwide analysis of KPIs in the Flemish PMSP is the availability of national data on all mammographic PMSP screenings performed over 15 years, together with the matched oncological follow-up data from the BCR. The completeness of BCR breast cancer data was previously estimated to be 99.7% [6]. Our study also has some limitations. Firstly, not all screened women provided informed consent to link their screening data to the BCR data, mostly during the programme initiation in 2002 and 2003. Refusal rates fluctuated around 1% or less of screened women. Secondly, we suspect that some of the "initial screens" in the programme are in fact preceded by opportunistic screening. We are investigating this further. Thirdly, some of the tumour characteristics have missing data, meaning the proportions calculated for those KPIs might still rise. For instance, in 2015 65.4% of invasive BC were node-negative, but a further 11.2% had unknown nodal status. The same is true for stage distribution. Fourthly, we considered all diagnostic mammograms as opportunistic screening, even though a minority are undoubtedly for diagnostic purposes [12]. The BCR and CvKO are currently investigating the proportion of all diagnostic mammograms that are for screening purposes. Fifthly, in the current analysis, we cannot estimate the impact on breast cancer mortality. The CvKO participates in the EU-topia project (https://eu-topia.org) to attempt to obtain an estimate, while the BCR is currently performing its own analysis.

Conclusion
Besides the suboptimal attendance rate, most performance indicators in the Flemish PMSP meet EU benchmark targets. Nonetheless, there are several priorities for further investigation. Firstly, the response to invitation has remained stable, indicating that the strategies that have been used to increase screening uptake these last 15 years have had limited effect. Now that the invitation scheme has been optimised, a critical evaluation should be made of these strategies. Secondly, interval cancers should be analysed by individual radiological review as described in the European guidelines [4]. If the proportion of "missed cancers" is comparable to the results in other countries, it can be concluded that Flanders has found a successful way of reducing recall rate while maintaining a stable CDR. The ensuing lower number of false positive screenings will lead to increased rescreening rates [22,23]. Thirdly, ways must be found to further limit the occurrence of interval cancers after positive screenings with negative diagnostic assessment. One possibility could be to let diagnostic assessment only take place in specialised centres. Fourthly, the clinical and health economic impact of the PMSP should be analysed, along with the effect of opportunistic screening on CDR in initial and subsequent irregular screens. The BCR and CvKO are therefore analysing the impact of mammographic screening in three scenarios: women attending PMSP, women attending only opportunistic screening, and women attending both screening types. Among other things, the study will compare cost-effectiveness and clinical outcome. This is being done in parallel with efforts to estimate the impact on breast cancer mortality.