Cost-effectiveness of using artificial intelligence versus polygenic risk score to guide breast cancer screening

Abstract

Background

Current guidelines for mammography screening for breast cancer vary across agencies, especially for women aged 40–49. Using artificial intelligence (AI) to read mammography images has been shown to predict breast cancer risk with higher accuracy than alternative approaches including polygenic risk scores (PRS), raising the question of whether AI-based screening is more cost-effective than screening based on PRS or existing guidelines. This study provides the first evidence to shed light on this important question.

Methods

This study is a model-based economic evaluation. We used a hybrid decision tree/microsimulation model to compare the cost-effectiveness of eight strategies of mammography screening for women aged 40–49 (screening beyond age 50 follows existing guidelines). Six of these strategies were defined by combinations of risk prediction approaches (AI, PRS or family history) and screening frequency for low-risk women (no screening or biennial screening). The other two strategies involved annual screening for all women and no screening, respectively. Data used to populate the model were sourced from the published literature.

Results

Risk prediction using AI followed by no screening for low-risk women is the most cost-effective strategy. It dominates (i.e., costs less and generates more quality-adjusted life years (QALYs) than) strategies for risk prediction using PRS followed by no screening or biennial screening for low-risk women, risk prediction using AI or family history followed by biennial screening for low-risk women, and annual screening for all women. It also extendedly dominates (i.e., achieves higher QALYs at a lower incremental cost per QALY) the strategy for risk prediction using family history followed by no screening for low-risk women. Meanwhile, it is cost-effective versus no screening, with an incremental cost-effectiveness ratio of $23,755 per QALY gained.

Conclusions

Risk prediction using AI followed by no breast cancer screening for low-risk women is the most cost-effective strategy. This finding can be explained by AI’s ability to identify high-risk women more accurately than PRS and family history (which reduces the possibility of delayed breast cancer diagnosis) and fewer false-positive diagnoses from not screening low-risk women.

Background

There is widespread debate among clinicians and researchers globally over what constitutes appropriate breast cancer screening, especially for women younger than age 50 [1]. Consequently, existing guidelines on mammography screening for breast cancer vary widely, even within a country. In the United States (US), the American College of Obstetricians and Gynecologists (ACOG) and the American College of Radiology (ACR) recommend annual mammography starting at age 40 for all women [2]. Meanwhile, the most recent US Preventive Services Task Force (USPSTF) guidelines recommend biennial mammography between ages 50 and 74 years for women without a family history of breast cancer, while indicating that women with a family history may benefit from starting screening between ages 40 and 49 [2]. In Canada, breast cancer experts have challenged the Canadian Preventive Task Force, which recommends against breast cancer screening for women aged between 40 and 49 years who are not at high risk, arguing that these recommendations are “outdated and dangerous”, and have called for annual screening of all women above age 40 [3].

Cost-effectiveness analyses can inform this debate by estimating and comparing the costs and effectiveness of alternative screening strategies to identify the most cost-effective one. However, although several cost-effectiveness analyses have examined the alternative screening intervals and starting ages associated with current screening guidelines [4, 5], the results remain inconclusive. Earlier studies found that starting screening at age 40 was not cost-effective relative to starting at age 50 [4], which lends support to the existing USPSTF guidelines, whereas more recent cost-effectiveness analyses point to the value of extending screening to women younger than age 50 [5], as recommended by ACOG/ACR.

A key limitation of existing guidelines is that these do not fully account for heterogeneity in women’s risk of breast cancer. For instance, while risk assessment tools may consider family history or breast density as risk factors, these tools do not consider the full set of genetic markers now known to be associated with breast cancer. Furthermore, breast density measurements are also subject to radiologists’ assessment and discernment. From an economic perspective, a more rigorous risk stratification can enable focusing health care resources on screening women with high risk while avoiding unnecessary screening and follow-up costs for those with low risk.

Two new risk prediction approaches have recently emerged, namely polygenic risk score (PRS) and artificial intelligence (AI). PRSs estimate a woman’s risk of breast cancer based on susceptibility loci identified through genome wide association studies [6]. AI algorithms, in contrast, identify discriminative image patterns from full-field mammograms to categorize a woman’s risk of developing breast cancer in the future [7].

To date, there is very little evidence on the cost-effectiveness of using these new risk-stratification approaches to aid breast cancer screening. Only one study has examined the cost-effectiveness of PRS-based risk-stratified mammography screening versus screening all women aged between 50 and 69 years and no screening for breast cancer. This study found that offering mammography screening only to women above the 70th percentile of the PRS-based risk distribution is cost-effective relative to screening all women aged between 50 and 69 years and no screening [8]. Notably, no study has compared the cost-effectiveness of risk-stratified mammography screening based on risk prediction using AI vs PRS. Our study fills this evidence gap.

In this study, we examine the cost-effectiveness of using AI or PRS to guide mammography screening for breast cancer compared with screening based exclusively on family history (similar to USPSTF guidelines), annual screening for all women (similar to ACOG/ACR guidelines) and no screening, among white women. As most of the debate over breast cancer screening centers on women aged between 40 and 49 years, and as the predictive ability of AI has been validated only for the short term [7], we consider AI and PRS for guiding screening only for women in the 40 to 49 years age group, with screening for older women based on existing guidelines.

Methods

Study cohort and risk of breast cancer

Our model simulated 100,000 white women aged 40 years with no previous history of breast cancer. Each woman had an underlying risk of developing breast cancer based on a recent risk distribution estimated for US white females using a comprehensive set of genetic and other non-modifiable and modifiable breast cancer risk factors [9]. As criteria for who is considered ‘high risk’ for screening purposes differ across guidelines, we conservatively classified women into three categories: (i) ‘true’ low risk, defined as those with an underlying risk of breast cancer less than 1.1 times the average risk in the population of 40-year-old women (that is, relative risk (RR) lower than 1.1); (ii) ‘true’ high risk, defined as those with RR of at least 1.1 but below 4; and (iii) ‘true’ very high risk, defined as those with RR of 4 or higher. The RR threshold of 1.1 was chosen because it can capture a broad range of factors known to increase the risk of breast cancer, including family history of breast cancer, reproductive risk factors, genetic variations and dense breasts on mammography [10]. Meanwhile, the RR threshold of 4 captures factors such as history of chest radiation and atypical hyperplasia [11, 12]. With these RR thresholds, 1% of our hypothetical study cohort was classified as ‘true’ very high risk, 42% as ‘true’ high risk and the remaining 57% as ‘true’ low risk.
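The three-way classification above can be written as a small function. This is only an illustrative sketch: the function name is hypothetical, and treating an RR of exactly 1.1 as ‘high’ is an assumption that follows the RR ≥ 1.1 rule used for screening in the strategies described later.

```python
def true_risk_category(rr: float) -> str:
    """Map an underlying relative risk (RR) to the 'true' risk category.

    Thresholds (1.1 and 4) are those stated in the text; the function
    name and the string labels are hypothetical.
    """
    if rr >= 4.0:
        return "very high"
    if rr >= 1.1:
        return "high"
    return "low"


for rr in (0.8, 1.1, 2.5, 4.0):
    print(rr, true_risk_category(rr))
```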

Screening strategies

We compared eight alternative screening strategies as shown in Fig. 1. The first two strategies involved no screening and annual screening for all women, respectively. The remaining six strategies were defined by combinations of risk prediction approaches (AI, PRS or family history) and screening frequencies among low-risk women aged 40–49 (no screening or biennial screening). We describe these strategies in detail below.

Fig. 1

Screening strategies. In strategies 3–6, ‘High risk’ and ‘Low risk’ at ages 40–49 refer to high and low risk as estimated by AI or PRS, while beyond age 50 they refer to the presence or absence of family history, respectively. In strategies 7–8, ‘High risk’ and ‘Low risk’ refer to presence or absence of family history, respectively. Beyond age 50, in all strategies except strategies 1 and 2, women without family history undergo biennial screening; those with family history undergo annual screening. Screening in all strategies ceases at age 74.

No screening

In strategy 1 (‘No screening’, hereafter), women were not screened at any age regardless of risk level.

Annual screening for all

In strategy 2 (‘Annual screening for all’, hereafter), all women (regardless of risk level) underwent annual mammography starting at age 40, similar to recommendations by ACOG and ACR.

Screening guided by AI

Strategies 3 and 4 involved risk stratification based on AI reading of an index mammogram. All women underwent an index mammogram at age 40, which was interpreted using AI to predict risk of breast cancer. This mammogram may or may not be part of standard screening services. Women predicted to have high risk (RR ≥ 1.1) underwent annual digital mammography starting at age 40. In strategy 3 (‘AI + no screening for low-risk’, hereafter), women predicted to have low risk were not screened while in strategy 4 (‘AI + biennial screening for low-risk’, hereafter), they underwent biennial screening. This screening pattern continued until age 49. Beyond age 50, screening followed the existing USPSTF guideline as described below.

Screening guided by PRS

In strategies 5 (‘PRS + no screening for low-risk’, hereafter) and 6 (‘PRS + biennial screening for low-risk’, hereafter), screening pathways were the same as in strategies 3 and 4; however, risk stratification was performed using PRS instead of AI. All women underwent genetic testing at age 40 in which 76 single nucleotide polymorphisms (SNPs) known to be associated with breast cancer were genotyped [6].

Screening guided by family history

In strategies 7 and 8, screening was guided by family history (similar to existing recommendations by the USPSTF). For women aged between 40 and 49 years, existing USPSTF recommendation to screen women without family history is only a grade C recommendation (i.e., the net benefit of screening in this group is small) [11, 13]. Therefore, in strategy 7 (‘Family history + no screening for low-risk’, hereafter), we considered that women younger than age 50 without family history were not screened, while in strategy 8 (‘Family history + biennial screening for low-risk’, hereafter), they were screened biennially. The USPSTF guidelines indicate that women with family history may benefit from starting screening before age 50 [2] but do not specify frequency of screening for these women. Given that most other screening guidelines recommend annual screening for high-risk women [2], we considered that women with family history underwent annual mammography starting at age 40.

Beyond age 50, screening in strategies 3–8 followed existing USPSTF guidelines. Therefore, women without family history were screened biennially [11]. Furthermore, as the USPSTF does not specify screening frequency for those with family history, we assumed that, as for younger women, women with family history underwent annual mammography. In all strategies (except ‘No screening’), screening ceased at age 74.

The eight strategies thus differed in the proportion of women subjected to aggressive screening. ‘Annual screening for all’ was the most aggressive as all women, including those at low risk, were screened annually starting at age 40. By contrast, in the remaining strategies, low-risk women younger than age 50 were either not screened or screened only biennially while those aged over 50 were screened biennially. While screening frequencies were the same in strategies 3, 5 and 7, and in strategies 4, 6 and 8, these strategies differed in their accuracy of risk prediction for women aged between 40 and 49, which in turn determined the proportion of women screened prior to age 50.

Model structure

We developed a hybrid decision tree/microsimulation model to estimate the costs and effectiveness of the eight screening strategies. The analysis was conducted from the health care system’s perspective. The cycle length was 1 year and a lifetime horizon was used.

Figure 2 shows a simplified depiction of the model. The decision tree component of the model captured risk prediction and stratification at age 40 based on AI, PRS or family history. Women entering the model had an underlying ‘true’ low, high or very high risk of breast cancer. As risk factors associated with very high risk (RR ≥ 4) are likely known a priori, women with RR ≥ 4 did not require risk prediction and underwent annual screening regardless of screening strategy (except in the ‘No screening’ strategy). Depending on risk-stratification strategy, AI, PRS or family history were used to predict the underlying risk for the remaining women; the extent to which the estimated risk category matched the underlying risk category was determined by the accuracy of each method (described below).

Fig. 2

Simplified depiction of model. Clinical pathways for strategies 4, 6 and 8 are the same as for strategies 3, 5 and 7, respectively, except that patients identified as low risk are screened biennially instead of not being screened. Clinical pathways for progression to in situ or invasive cancer and to death follow the pathways described in Schousboe et al., 2011 [14]. Beyond age 50, in all strategies except strategies 1 and 2, women without family history undergo biennial screening; those with family history undergo annual screening.

The microsimulation component, which was adapted from a previously published model [14], simulated the screening, diagnosis, disease progression and mortality from breast cancer. All women entering the microsimulation model had no tumor but could develop in-situ or invasive cancer over time based on observed age-specific incidence rates; in situ cancer could further progress to invasive cancer. Invasive cancers were classified into local, regional and distant stages [14]. Women who underwent mammography screening were more likely to be diagnosed with in situ cancer. However, more aggressive mammography screening also resulted in more cancers being diagnosed in earlier (instead of more advanced) stages [14]. Women who developed invasive breast cancer faced risk of death from cancer or from other causes.

Model inputs

Inputs used in our model are presented in Table 1 and described below. Further details are provided in the Online Supplementary Materials.

Table 1 Model Inputs

Accuracy of risk prediction

The key determinant of costs and effectiveness of each screening strategy was the accuracy of risk prediction. Higher accuracy of risk prediction implied that fewer women with underlying high-risk were incorrectly predicted to be at low risk, resulting in timely diagnosis and treatment of cancer for high-risk women. It also meant that fewer low-risk women were incorrectly predicted to be at high risk, leading to reduction in screening and fewer false-positive diagnoses.

In our model, accuracy of breast cancer risk prediction using AI and PRS was measured using area under the receiver operating characteristic curve (AUC) obtained from published studies [6, 7]. As real-world clinical decisions will also likely utilize information on other demographic and personal risk factors (such as weight, family history and breast density) in addition to AI or PRS, we used AUC values for models based on both AI or PRS and other risk factors. Using data from digital screening mammograms read by deep learning algorithms (AI), information on other demographic and personal risk factors and breast cancer outcomes from tumor registries, Yala et al. estimated an AUC of 0.71 for white females in the US [7]. We chose this study to obtain the AUC for AI owing to its large study sample of patients seen in the US (over 31,000 patients in the training dataset and over 3900 patients in the test set) [7]. Meanwhile, AUC for PRS was obtained from Vachon et al., a recent, high-quality study that estimated the AUC for PRS combined with other risk factors for a large study sample primarily consisting of American women [6]. Vachon et al. estimated an AUC of 0.69 for a model that combined PRSs developed based on 76 SNPs and information from the Breast Cancer Surveillance Consortium (BCSC) five-year risk-prediction model [6]. We followed a previously published method to simulate distributions of RR estimated using AI or PRS using these AUC values [29, 30]. Women with estimated RR of 1.1 or higher were then classified as high risk while those with estimated RR below 1.1 as low risk. We note that as AUC of both AI and PRS is below 1, not all ‘true’ high risk women will be correctly classified as such.
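To give a feel for what AUC values of 0.71 and 0.69 mean for classification, the sketch below converts an AUC into the separation between the score distributions of women who do and do not develop cancer. It assumes an equal-variance binormal model, under which AUC = Φ(d/√2); this is a textbook simplification chosen for illustration and is not the simulation method of [29, 30] used in the study. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def separation_from_auc(auc: float) -> float:
    """Standardized distance d between the two score distributions.

    Assumes an equal-variance binormal model, where AUC = Phi(d / sqrt(2)).
    This is an illustrative assumption, not the method used in the study.
    """
    return sqrt(2) * NormalDist().inv_cdf(auc)


for label, auc in (("AI + other risk factors", 0.71),
                   ("PRS + other risk factors", 0.69)):
    print(f"{label}: AUC={auc}, d={separation_from_auc(auc):.2f}")
```

Under this simplified model, an AUC of 0.5 corresponds to no separation at all, and the small gap between 0.71 and 0.69 translates into only a modest difference in separation.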

In strategies that involved risk prediction based on family history, we assumed that women with an underlying low risk do not have a family history of breast cancer, so all low-risk women are correctly classified as such. Among high-risk women, we assumed that 37% will be correctly classified. This proportion was calculated as the share of US women with first-degree family history of breast cancer (16% [15, 16]) among high-risk women (43% of our study cohort, i.e., the 42% at ‘true’ high risk plus the 1% at ‘true’ very high risk).
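The 37% figure follows directly from the two shares quoted above, as this short calculation shows (variable names are hypothetical):

```python
share_with_family_history = 0.16  # US women with first-degree family history [15, 16]
share_high_risk = 0.43            # share of the cohort treated as high risk in the text

# Proportion of high-risk women flagged by family history,
# given that no low-risk woman is assumed to have a family history.
correctly_classified = share_with_family_history / share_high_risk
print(f"{correctly_classified:.1%}")
```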

Incidence and stage distribution of breast cancer and mortality risk

To estimate a woman’s likelihood of developing in situ or invasive breast cancer, we multiplied age-specific breast cancer incidence rates per 100,000 white women in the US [31] (adjusted for increase in incidence rates due to mammography screening [14]) with the woman’s ‘true’ RR. The stage at cancer detection depended on screening frequency and sensitivity of mammography; the latter depended on patient age and was obtained from the published literature [20]. Women receiving more aggressive screening were diagnosed at earlier stages than those receiving less frequent screening. Stage distribution at diagnosis in the absence of screening was calculated based on the proportions of local, regional and distant cancers observed among white women aged below or above 50 years during 1975–1979 (when mammography screening was not widespread in the US) [17]. Meanwhile, stage distributions with annual or biennial screening were obtained from more recent estimates based on 1996–2012 Breast Cancer Surveillance Consortium data [18]. Patients diagnosed with invasive breast cancer faced risk of breast cancer mortality for up to 20 years after diagnosis. This risk was specific to age and stage at diagnosis as well as estrogen-receptor (ER) and human epidermal growth factor 2 (HER2) status [32]. All women faced risk of mortality from non-breast cancer causes which was age-specific, and was obtained by subtracting age-specific breast cancer mortality from the 2017 US life tables [33].

Costs

The cost of each strategy included the cost of risk prediction (index mammogram read by AI technology or genetic testing, as applicable), the cost of screening with digital mammography (if any), and the cost of breast cancer treatment determined by the stage at cancer diagnosis (treatment costs were lower for cancers detected at an earlier stage). The cost of the genetic test to determine PRSs was the cost of the OncoArray test in US laboratories [22]. We assumed that patients underwent genetic counselling before and after the genetic test, and that each counselling session cost $44 [23]. While the cost of AI-based risk prediction in clinical practice is not yet available, calculations by the European Society of Radiology suggest fixed costs of €60,000 ($65,300) in addition to an annual cost of €20,000 ($21,770) for the software license [21]. Assuming equipment is amortized over 10 years, and with 8695 mammogram facilities in the US [27] serving nearly two million women aged 40 years [28], the cost of AI reading of each mammogram amounts to ~$112. We varied the cost of AI reading per mammogram over a wide range (up to $500) in the sensitivity analyses.
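The logic of the per-mammogram AI cost can be sketched as below. This is one plausible reading of the calculation, assuming straight-line amortization of the fixed cost, one license per facility, and one AI-read index mammogram per 40-year-old woman; the function name is hypothetical. The exact cohort size is given only as "nearly two million", so the output depends on the value chosen and need not match the ~$112 reported.

```python
def ai_cost_per_mammogram(fixed_cost: float,
                          annual_license: float,
                          amortization_years: int,
                          n_facilities: int,
                          n_women: int) -> float:
    """Annualized AI cost across all facilities, spread over index mammograms.

    Assumes straight-line amortization and one AI installation per facility
    (illustrative assumptions, not confirmed by the source).
    """
    annual_cost_per_facility = fixed_cost / amortization_years + annual_license
    return annual_cost_per_facility * n_facilities / n_women


# Inputs from the text; n_women is rounded to two million for illustration.
cost = ai_cost_per_mammogram(65_300, 21_770, 10, 8_695, 2_000_000)
print(f"${cost:.0f} per AI-read mammogram")
```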

The cost of a mammogram was obtained from the Centers for Medicare & Medicaid Services’ 2020 Physician Fee Schedule [24]. The costs of diagnostic work-up following a positive diagnosis and of breast cancer treatment were obtained from the published literature [19, 25]. All costs were estimated in 2020 US dollars and discounted at 3% per year [34].

Effectiveness

Effectiveness was measured in terms of quality-adjusted life years (QALYs), which capture a person’s life expectancy adjusted by their health-related quality of life (utility). Screening entailed a disutility of 0.006 QALYs for 1 week, and diagnostic work-up following a positive screening result involved a disutility of 0.105 QALYs for 5 weeks [25]. Utilities were specific to patient age and stage of cancer [14]. For all cancer stages, utilities in the first year after breast cancer diagnosis were lower than in later years [14]. All utility values were discounted at 3% per year [34].
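Both costs and utilities are discounted at 3% per year. A minimal sketch of that calculation is shown here, assuming annual cycles and that the first cycle (year 0) is undiscounted; the function name and the example utility stream are hypothetical and are not taken from the model.

```python
def discounted_total(values_per_year, rate=0.03):
    """Present value of a stream of yearly costs or QALYs.

    Year 0 is left undiscounted (an assumed convention).
    """
    return sum(v / (1 + rate) ** t for t, v in enumerate(values_per_year))


# Hypothetical example: five years lived at a utility of 0.9 per year.
undiscounted = 5 * 0.9
discounted = discounted_total([0.9] * 5)
print(f"undiscounted QALYs: {undiscounted:.2f}, discounted QALYs: {discounted:.2f}")
```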

Cost effectiveness analysis

We estimated the total costs and QALYs of the eight strategies. A strategy was considered cost-effective relative to another strategy if the Incremental Cost Effectiveness Ratio (ICER), calculated as the difference between the overall costs of the two strategies divided by the difference between the total QALYs gained, was lower than the conventional willingness-to-pay threshold (WTP) of $100,000 per QALY. Meanwhile, a strategy was dominated if it was both more costly and less effective than the other strategy or extended dominated if it achieved fewer total QALYs than a more costly strategy at a higher incremental cost per QALY (i.e., its ICER relative to the next less costly strategy was higher than the ICER of a more effective strategy) [35].
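The definitions of ICER, dominance and extended dominance can be made concrete with a short sketch that builds the efficiency frontier from a set of strategies. The strategy names and numbers below are hypothetical and chosen only to exercise each rule; this is a generic illustration, not the TreeAge implementation used in the study.

```python
def icer(cost_a, qaly_a, cost_b, qaly_b):
    """ICER of strategy A versus comparator B (cost per QALY gained)."""
    return (cost_a - cost_b) / (qaly_a - qaly_b)


def efficiency_frontier(strategies):
    """Return the names of non-dominated strategies, ordered by cost.

    strategies: dict mapping name -> (total cost, total QALYs).
    """
    # Sort by increasing cost; for equal cost, higher QALYs first.
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    frontier = []
    for name, (cost, qaly) in ordered:
        # Dominated: costs at least as much as a kept strategy, no more QALYs.
        if frontier and qaly <= strategies[frontier[-1]][1]:
            continue
        frontier.append(name)
        # Extended dominance: ICERs must increase along the frontier.
        while len(frontier) >= 3:
            a, b, c = frontier[-3:]
            if icer(*strategies[b], *strategies[a]) > icer(*strategies[c], *strategies[b]):
                del frontier[-2]  # b is extended dominated by c
            else:
                break
    return frontier


# Hypothetical (cost, QALYs) pairs for illustration only.
example = {"none": (0, 0), "A": (100, 4), "B": (80, 2), "C": (150, 3)}
print(efficiency_frontier(example))  # ['none', 'A']
```

In this toy example, C is dominated by A (more costly, fewer QALYs), while B is extended dominated because its ICER versus "none" (40 per QALY) exceeds the ICER of the more effective strategy A versus B (10 per QALY).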

In addition to the eight strategies examined in the main analysis, we conducted an augmented analysis which included 4 additional strategies. These additional strategies were similar to strategies 3–6 above, except that risk prediction was performed exclusively using AI or PRS, i.e., without considering demographic and personal risk factors. Thus, AUC values in these additional strategies were 0.69 for AI [7] and 0.63 for PRS [36] (instead of 0.71 and 0.69, respectively, in strategies 3–6).

Furthermore, we conducted several sensitivity analyses. First, we varied values of key costs and utilities in one-way sensitivity analyses and addressed parameter uncertainty using probabilistic sensitivity analyses (PSA). Next, we varied AUCs of AI and PRS to 20% lower and higher values than those used in the main analysis.

We also examined the robustness of our findings to the choice of the RR threshold used to define estimated high risk. Following previous studies, we used alternative thresholds of 1.3 and 2 (instead of 1.1 used in the base case analysis) [37]. All analyses were performed using TreeAge Pro 2019 v2.1 [38].

Model validation

We assessed the validity of our model following the Assessment of the Validation Status of Health-Economic decision models (AdViSHE) tool [39] and guidelines of the International Society for Pharmacoeconomics and Outcomes Research [40]. First, we conducted trace analysis and compared the modelled lifetime cumulative breast cancer incidence and mortality with screening to recently observed proportions. Next, while cost-effectiveness of AI-based screening has not been examined previously, we cross-validated the estimated incremental costs, QALYs and false-positive rates (compared with no screening) against previous studies for the strategy where risk prediction is based on family history and those without family history are screened biennially starting at age 50.

Results

Base case analysis

Table 2 summarizes the lifetime costs and QALYs gained, and breast cancer outcomes with each screening strategy. ‘No screening’ involved the lowest lifetime total costs ($1.75 billion per 100,000 women) but also generated the fewest QALYs (1,976,720 per 100,000 women). The strategies involving screening resulted in $77.8 million–$276.3 million higher lifetime costs (per 100,000 women) and 1521–4110 additional QALYs (per 100,000 women) relative to ‘No screening’.

Table 2 Lifetime Costs, QALYs and Breast Cancer Outcomes by Screening Strategy

The cost-effectiveness plane in Fig. 3 shows the results from stepwise comparisons with the next less costly strategy. Among the eight strategies, only ‘No screening’ and ‘AI + no screening for low-risk’ strategies lay on the cost-effectiveness efficiency frontier. The ‘Family history + no screening for low-risk’ strategy was extended dominated while the remaining 5 strategies (that involved either risk prediction using PRS and/or biennial or annual screening for low-risk women) were dominated by ‘AI + no screening for low-risk’. Excluding these dominated and extended dominated strategies, ‘AI + no screening for low-risk’ was the most cost-effective strategy. It cost $97.6 million (per 100,000 women) more than ‘No screening’ but generated 4110 additional QALYs (per 100,000 women). The ICER compared with ‘No screening’ was $23,755 per QALY gained which was lower than the conventional WTP threshold of $100,000 per QALY gained.
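The reported ICER can be checked approximately from the rounded incremental figures quoted above. Because the inputs are rounded, the result lands close to, but not exactly on, the reported $23,755 per QALY. Variable names are hypothetical.

```python
incremental_cost = 97.6e6   # USD per 100,000 women, versus 'No screening'
incremental_qalys = 4110    # QALYs per 100,000 women, versus 'No screening'
wtp_threshold = 100_000     # USD per QALY

icer = incremental_cost / incremental_qalys
print(f"ICER: ${icer:,.0f} per QALY gained")
print("Below WTP threshold:", icer < wtp_threshold)
```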

Fig. 3

Cost-effectiveness plane. ICER: Incremental Cost-Effectiveness Ratio

The superior cost-effectiveness of ‘AI + no screening for low-risk’ compared with other screening strategies is explained by the combination of: (i) higher accuracy of AI in identifying high-risk women compared with family history and PRS; and (ii) prevention of costs and disutility of screening and additional diagnostic work-up for low-risk women. As shown in Table 2, AI correctly classifies 57% of true high-risk women as such, compared with 36% with family history. Consequently, even though total costs of ‘AI + no screening for low risk’ are higher than ‘Family history + no screening for low risk’ (because more women are screened during ages 40 to 49), more high-risk women would benefit from this screening, as reflected in fewer breast cancer deaths (2956 vs 2988 per 100,000 women). While risk prediction using AI is also more costly than genetic testing, its higher accuracy justifies the higher cost: 57% vs 56.7% high-risk women and 87% vs 86.4% of low-risk women are correctly classified with AI and PRS, respectively. The lower accuracy of PRS implies that more low-risk women incorrectly undergo annual screening between ages 40 and 49 compared with AI, leading to more false-positive diagnoses (141,443 per 100,000 women with ‘PRS + no screening for low risk’ vs 141,339 per 100,000 women with ‘AI + no screening for low risk’).

Meanwhile, not screening low-risk women aged 40–49 explains the lower costs and higher effectiveness of this strategy relative to strategies involving biennial or annual screening for low-risk women. Even though breast cancer deaths are higher as not all women are screened during ages 40–49, there are 17–51% fewer false-positive diagnoses. Thus, not screening women identified as low-risk saves both costs and disutility of screening and additional diagnostic work-up.

The results from the augmented analysis (that included the 4 additional strategies for risk prediction using AI or PRS without other risk factors) supported our base case findings. As shown in Table A1 (Online Supplementary Materials), ‘AI + no screening for low risk’ remained the most cost-effective strategy. In particular, it dominated the strategies involving risk prediction based exclusively on AI or PRS.

Sensitivity analyses

The results from one-way sensitivity analyses are presented in a tornado diagram in Fig. 4. They indicate that the ICER is most sensitive to cost of mammography and health state-specific utilities and costs. For all values of these costs and utilities in the ±25% range, however, ‘AI + no screening for low-risk’ remains cost-effective vs no screening. In particular, it remains the most cost-effective screening strategy as long as cost of AI reading is below $318 per mammogram (Fig. 5). The cost-effectiveness acceptability curve shows that, at the WTP threshold of $100,000/QALY, ‘AI + no screening for low risk’ is cost-effective in 96% of iterations (Fig. 6).

Fig. 4

Tornado Diagram. Costs and utilities are varied in a range of ±25% of base case values. ICER: Incremental Cost-Effectiveness Ratio

Fig. 5

Threshold analysis for cost of AI

Fig. 6

Cost-effectiveness acceptability curve

Table 3 presents results from additional sensitivity analyses. It shows that ‘AI + no screening for low risk’ remained the most cost-effective strategy even when we used AUC values for AI and PRS that were 20% lower or higher than that in the base case.

Table 3 Sensitivity analyses

Alternative RR thresholds

In our base case analysis, we used an RR threshold of 1.1 to define estimated high risk with AI or PRS. However, this RR threshold is likely to itself be a policy alternative to be determined by decision-makers. We, therefore, conducted additional analyses in which we used alternative thresholds of 1.3 and 2 (instead of 1.1 used in the base case analysis) (Table 4) [37]. We found that, for higher RR thresholds, ‘PRS + no screening for low risk’ strategy generated higher QALYs but it also resulted in higher total costs than ‘AI + no screening for low-risk’, yielding an ICER that exceeded the WTP threshold of $100,000/QALY. Thus, ‘AI + no screening for low risk’ was still the optimal strategy.

Table 4 Alternative RR thresholds

Model validation

Trace analysis indicated that the modelled lifetime cumulative breast cancer incidence and mortality were 16% and 2.9% with screening. These proportions were similar to the proportions observed for white women in 2016–18 reported by SEER (13% and 2.5%, respectively) [41]; the slight difference can be explained by less than 100% adherence to screening guidelines in the real world [10]. Cross-validation against previous studies showed that our estimated incremental costs and QALYs for the ‘Family history + no screening for low risk’ strategy were similar to those estimated in a recent, high-quality cost-effectiveness analysis [19]: $778 vs $682 (in 2020 $) incremental costs and 0.015 vs 0.017 incremental QALYs. Furthermore, the estimated number of false-positive diagnoses for this strategy (121,737 per 100,000 women) fell within the range indicated by USPSTF (830–1325 per 1000 women) [11].
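The false-positive comparison mixes two denominators (per 100,000 and per 1000 women). A short unit conversion, with hypothetical variable names, confirms that the modelled value sits inside the USPSTF range:

```python
false_positives_per_100k = 121_737        # modelled, per 100,000 women
uspstf_range_per_1000 = (830, 1325)       # range indicated by USPSTF [11]

# Convert to the same denominator before comparing.
false_positives_per_1000 = false_positives_per_100k / 100
low, high = uspstf_range_per_1000
within_range = low <= false_positives_per_1000 <= high
print(f"{false_positives_per_1000:.0f} per 1000 women; within range: {within_range}")
```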

Discussion

Our study provides the first cost-effectiveness analysis of using AI or PRS to risk-stratify 40–49 year old white women for breast cancer screening versus screening based on family history, annual screening for all women, and no screening. We found that risk prediction using AI followed by no screening for low-risk women is the most cost-effective strategy with an ICER of $23,755 per QALY gained.

Our results reveal several interesting patterns. We find that with both the AI and PRS algorithms, there exists a negative dose response relationship between screening frequencies and effectiveness, i.e., no screening is more effective than biennial screening. However, this pattern is reversed: (i) when risk prediction is based on family history; and (ii) when we compare ‘Annual screening for all’ with ‘AI/PRS + biennial screening for low risk’. These opposite patterns highlight how the accuracy of a risk prediction tool may reinforce or attenuate the effects of screening frequency on outcomes. The relatively lower accuracy of risk prediction using family history compared with AI/PRS means that more true high-risk women, who are incorrectly predicted as low risk based on family history, benefit from biennial screening. If these benefits outweigh the disutility from more frequent screening for low-risk women, effectiveness of ‘Family history + biennial screening for low risk’ can still be higher than ‘Family history + no screening for low risk’.

Meanwhile, under ‘Annual screening for all’, all high-risk women are screened annually, so none is missed through misclassification. In addition, some low-risk women who still develop cancer also benefit from annual screening. As a result, fewer cases are missed compared with ‘AI/PRS + biennial screening for low risk’. Even though ‘Annual screening for all’ also carries the burden of more frequent screening for low-risk women, its total effectiveness can still be higher than that of ‘AI/PRS + biennial screening for low risk’ if the utility gains from fewer missed cases more than offset the disutility from more frequent screening.

Our study provides useful insights to inform the ongoing debate over appropriate breast cancer screening practices for women aged between 40 and 49 years. We find that using AI to risk-stratify women and targeting screening at only high-risk women can generate greater economic value than existing screening guidelines. Compared with family history-based screening (which reflects current USPSTF guidelines), this AI-based strategy can help alleviate existing concerns about delayed diagnoses as more high-risk women would be accurately identified and screened. At the same time, it can reduce false-positive diagnoses from screening all women over age 40 annually (as recommended by ACOG/ACR guidelines).

Our study has several limitations. First, randomized controlled trials that directly compare AI with PRS or existing screening criteria are lacking. Thus, data on the efficacy of AI and PRS had to be obtained from different studies, and the demographic and personal risk factors considered in addition to AI and PRS differed slightly between the two studies. Second, the cost of using AI for breast cancer risk prediction in clinical practice is not yet known and was not available from the existing literature. For our analysis, we therefore relied on cost estimates from the European Society of Radiology [21]. Nevertheless, we varied the cost of AI in one-way sensitivity analyses, and our results continued to hold for AI costs of up to $318 per mammogram. Finally, in our model, AI was used to guide breast cancer screening over a 10-year duration (i.e., between ages 40 and 49), while existing data could validate the accuracy of AI-based risk prediction only for 5 years post risk-assessment [7]. However, these existing data provide suggestive evidence that AI is able to detect features associated with long-term risk [7]. As deep learning models improve and long-term data become available, future studies could re-examine the cost-effectiveness of using AI to guide breast cancer screening not just among women aged 40–49 but in women across the entire candidate age range, including those over age 50.
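The one-way sensitivity analysis on AI cost can be pictured as a threshold search: hold everything else fixed, vary the AI cost, and find the highest cost at which the strategy still meets the willingness-to-pay threshold. The sketch below assumes a simple linear cost structure; every numeric value, parameter name and function name in it is a hypothetical placeholder chosen for illustration and does not come from the study's model.

```python
def icer_given_ai_cost(ai_cost: float, other_delta_cost: float,
                       ai_reads: float, delta_qaly: float) -> float:
    """ICER when incremental cost = non-AI cost + AI reads x AI cost per read."""
    return (other_delta_cost + ai_reads * ai_cost) / delta_qaly


def max_ai_cost(wtp: float, other_delta_cost: float,
                ai_reads: float, delta_qaly: float) -> float:
    """Largest AI cost per read for which the ICER stays at or below wtp."""
    return (wtp * delta_qaly - other_delta_cost) / ai_reads


# Hypothetical inputs (assumptions for this sketch, not study values).
WTP = 50_000.0            # willingness to pay per QALY
OTHER_DELTA_COST = 200.0  # incremental cost excluding AI, per woman
AI_READS = 1.0            # AI-read mammograms per woman
DELTA_QALY = 0.02         # incremental QALYs per woman

threshold = max_ai_cost(WTP, OTHER_DELTA_COST, AI_READS, DELTA_QALY)

# One-way sweep: vary only the AI cost and record whether the result holds.
sweep = {
    cost: icer_given_ai_cost(cost, OTHER_DELTA_COST, AI_READS, DELTA_QALY) <= WTP
    for cost in (0.0, 400.0, 799.0, 801.0, 1200.0)
}

print(f"Threshold AI cost per read: ${threshold:,.2f}")
for cost, holds in sweep.items():
    print(f"AI cost ${cost:,.0f}: conclusion holds = {holds}")
```

Under a linear structure like this the threshold has a closed form, so the sweep only confirms it; a full microsimulation would instead need to be re-run at each candidate cost.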

Despite these limitations, our study can serve as a useful starting point to stimulate and inform future research and policy choices about breast cancer screening guided by novel AI technologies. Furthermore, it provides a general framework that can be easily updated (when new data on AI risk prediction become available) or adapted to study cost-effectiveness of using AI in other disease domains.

Conclusions

This study finds that using AI to risk-stratify women for breast cancer screening between ages 40 and 49 (followed by screening based on existing guidelines beyond age 50) is cost-effective compared with screening based on PRS or family history, annual screening for all women and no screening. By accurately identifying and screening more high-risk women and avoiding screening for low-risk women, this cost-effective AI-based screening strategy can help address existing concerns about delayed diagnoses as well as false-positive diagnoses that could arise with conventional screening strategies.

Availability of data and materials

Data used in this analysis were obtained from the published literature and references to sources have been provided in the manuscript. For further information, please contact the corresponding author Dr. Hai Nguyen at hvnguyen@mun.ca.

Abbreviations

AI: Artificial Intelligence

PRS: Polygenic Risk Score

QALY: Quality Adjusted Life Year

USPSTF: US Preventive Services Task Force

ACS: American Cancer Society

AUC: Area under the Receiver Operating Characteristic Curve

ICER: Incremental Cost-Effectiveness Ratio

WTP: Willingness to Pay

PSA: Probabilistic Sensitivity Analysis

ER: Estrogen Receptor

HER2: Human Epidermal Growth Factor Receptor 2

ACOG: American College of Obstetricians and Gynecologists

ACR: American College of Radiology

References

  1. CBS News. The high cost of breast cancer “false positives.” 2015. https://www.cbsnews.com/news/the-cost-of-breast-cancer-false-positives/. Accessed 5 Apr 2020.

  2. Centers for Disease Control and Prevention. Breast Cancer Screening Guidelines for Women 2016. https://www.cdc.gov/cancer/breast/pdf/BreastCancerScreeningGuidelines.pdf.

  3. Global News. New breast cancer screening guidelines are outdated and dangerous, experts say. 2019. https://globalnews.ca/news/4898068/breast-cancer-screening-guidelines-backlash/. Accessed 4 Apr 2020.

  4. Mittmann N, Stout NK, Lee P, Tosteson AN, Trentham-Dietz A, Alagoz O, et al. Total cost-effectiveness of mammography screening strategies. Health Rep. 2015;26:16.

  5. Mittmann N, Stout NK, Tosteson AN, Trentham-Dietz A, Alagoz O, Yaffe MJ. Cost-effectiveness of mammography from a publicly funded health care system perspective. CMAJ Open. 2018;6:E77.

  6. Vachon CM, Pankratz VS, Scott CG, Haeberle L, Ziv E, Jensen MR, et al. The contributions of breast density and common genetic variation to breast cancer risk. J Natl Cancer Inst. 2015;107:dju397.

  7. Yala A, Lehman C, Schuster T, Portnoi T, Barzilay R. A deep learning mammography-based model for improved breast cancer risk prediction. Radiology. 2019;292:60–6.

  8. Pashayan N, Morris S, Gilbert FJ, Pharoah PD. Cost-effectiveness and benefit-to-harm ratio of risk-stratified screening for breast cancer: a life-table model. JAMA Oncol. 2018;4:1504–10.

  9. Maas P, Barrdahl M, Joshi AD, Auer PL, Gaudet MM, Milne RL, et al. Breast cancer risk from modifiable and nonmodifiable risk factors among white women in the United States. JAMA Oncol. 2016;2:1295–302.

  10. American Cancer Society. Breast Cancer Facts & Figures 2019–2020. Atlanta: American Cancer Society, Inc.; 2019. https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/breast-cancer-facts-and-figures/breast-cancer-facts-and-figures-2019-2020.pdf

  11. United States Preventive Services Taskforce. Breast Cancer. Screening. 2016; https://www.uspreventiveservicestaskforce.org/uspstf/recommendation/breast-cancer-screening. Accessed 5 Apr 2020.

  12. Trentham-Dietz A, Kerlikowske K, Stout NK, Miglioretti DL, Schechter CB, Ergun MA, et al. Tailoring breast cancer screening intervals by breast density and risk for women aged 50 years or older: collaborative modeling of screening outcomes. Ann Intern Med. 2016;165:700–12.

  13. United States Preventive Services Taskforce. Grade definitions. https://epss.ahrq.gov/ePSS/gradedef.jsp. Accessed 15 Oct 2020.

  14. Schousboe JT, Kerlikowske K, Loh A, Cummings SR. Personalizing mammography by breast density and other risk factors for breast cancer: analysis of health benefits and cost-effectiveness. Ann Intern Med. 2011;155:10–20.

  15. Shiyanbola OO, Arao RF, Miglioretti DL, Sprague BL, Hampton JM, Stout NK, et al. Emerging trends in family history of breast cancer and associated risk. Cancer Epidemiol Prev Biomarkers. 2017;26:1753–60.

  16. Ahern TP, Sprague BL, Bissell MCS, Miglioretti DL, Buist DSM, Braithwaite D, et al. Family history of breast cancer, breast density, and breast cancer risk in a U.S. breast cancer screening population. Cancer Epidemiol Biomark Prev. 2017;26:938–44.

  17. Narod SA, Iqbal J, Miller AB. Why have breast cancer mortality rates declined? J Cancer Policy. 2015;5:8–17.

  18. Miglioretti DL, Zhu W, Kerlikowske K, Sprague BL, Onega T, Buist DS, et al. Breast tumor prognostic characteristics and biennial vs annual mammography, age, and menopausal status. JAMA Oncol. 2015;1:1069–77.

  19. Shih Y-CT, Dong W, Xu Y, Shen Y. Assessing the cost-effectiveness of updated breast cancer screening guidelines for average-risk women. Value Health. 2019;22:185–93.

  20. Kerlikowske K, Hubbard RA, Miglioretti DL, Geller BM, Yankaskas BC, Lehman CD, et al. Comparative effectiveness of digital versus film-screen mammography in community practice in the United States: a cohort study. Ann Intern Med. 2011;155:493–502.

  21. European Society of Radiology. The cost of AI in radiology: is it really worth it? 2019. https://ai.myesr.org/healthcare/the-cost-of-ai-in-radiology-is-it-really-worth-it/. Accessed 4 Apr 2020.

  22. Iowa Institute of Human Genetics. Microarrays and Fees 2020. https://medicine.uiowa.edu/humangenetics/research/genomics-division/microarray/microarrays-and-fees. Accessed 4 Apr 2020.

  23. Sun L, Brentnall A, Patel S, Buist DSM, Bowles EJA, Evans DGR, et al. A cost-effectiveness analysis of multigene testing for all patients with breast cancer. JAMA Oncol. 2019;5:1718–30.

  24. Centers for Medicare and Medicaid Services. Physician fee schedule search. 2020. https://www.cms.gov/apps/physician-fee-schedule/license-agreement.aspx. Accessed 4 Apr 2020.

  25. Stout NK, Lee SJ, Schechter CB, Kerlikowske K, Alagoz O, Berry D, et al. Benefits, harms, and costs for breast cancer screening after US implementation of digital mammography. J Natl Cancer Inst. 2014;106:dju092.

  26. Exchange rates.org. Euros (EUR) to US dollars (USD) Rates for 2/26/2020. 2020. https://www.exchange-rates.org/Rate/EUR/USD/2-26-2020.

  27. US Food and Drug Administration. MQSA National Statistics. 2020. https://www.fda.gov/radiation-emitting-products/mqsa-insights/mqsa-national-statistics. Accessed 4 Apr 2020.

  28. US Census Bureau. 2018 Population estimates by age, sex, race and Hispanic origin. 2019. https://www.census.gov/newsroom/press-kits/2019/detailed-estimates.html. Accessed 4 Apr 2020.

  29. Naber SK, Kundu S, Kuntz KM, Dotson WD, Williams MS, Zauber AG, et al. Cost-effectiveness of risk-stratified colorectal cancer screening based on polygenic risk: current status and future potential. JNCI Cancer Spectr. 2020;4(1):pkz086.

  30. Kundu S, Kers JG, Janssens ACJ. Constructing hypothetical risk data from the area under the ROC curve: modelling distributions of polygenic risk. Plos One. 2016;11:e0152359.

  31. National Cancer Institute Surveillance, Epidemiology, and End Results Program. Breast Cancer SEER incidence rates by age at diagnosis, 2013-2017. 2020. https://seer.cancer.gov/explorer/application.html.

  32. Munoz DF, Plevritis SK. Estimating breast cancer survival by molecular subtype in the absence of screening and adjuvant treatment. Med Decis Mak. 2018;38(1_suppl):32S–43S.

  33. Centers for Disease Control and Prevention. National Vital Statistics Report Volume 68, Number 7. United States Life Tables, 2017. https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_07-508.pdf. Accessed 21 Dec 2019.

  34. Sanders GD, Neumann PJ, Basu A, Brock DW, Feeny D, Krahn M, et al. Recommendations for conduct, methodological practices, and reporting of cost-effectiveness analyses: second panel on cost-effectiveness in health and medicine. JAMA. 2016;316:1093–103.

  35. US Department of Veteran Affairs. HERC: cost-effectiveness analysis. 2020. https://www.herc.research.va.gov/include/page.asp?id=cost-effectiveness-analysis. Accessed 14 Oct 2020.

  36. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am J Hum Genet. 2019;104:21–34.

  37. Khan SA, Hernandez-Villafuerte KV, Muchadeyi MT, Schlander M. Cost-effectiveness of risk-based breast cancer screening: a systematic review. Int J Cancer. 2021;149(4):790–810.

  38. TreeAge Software. TreeAge Pro 2019, R2. https://www.treeage.com/software-downloads/treeage-pro-2019-r2/. Accessed 17 Aug 2020.

  39. Vemer P, Corro Ramos I, van Voorn GAK, Al MJ, Feenstra TL. AdViSHE: a validation-assessment tool of health-economic models for decision makers and model users. PharmacoEconomics. 2016;34:349–61.

  40. Eddy DM, Hollingworth W, Caro JJ, Tsevat J, McDonald KM, Wong JB. Model transparency and validation: a report of the ISPOR-SMDM modeling good research practices task force–7. Med Decis Mak. 2012;32:733–43.

  41. Surveillance Research Program, National Cancer Institute. SEER*Explorer: An interactive website for SEER cancer statistics. 2020. https://seer.cancer.gov/explorer/index.html. Accessed 7 Oct 2020.

Acknowledgements

None.

Funding

The authors received no funding for this work.

Author information

Contributions

SM conceptualized the idea. Both SM and HVN contributed to data analysis, interpretation of results and manuscript writing. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Hai V. Nguyen.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mital, S., Nguyen, H.V. Cost-effectiveness of using artificial intelligence versus polygenic risk score to guide breast cancer screening. BMC Cancer 22, 501 (2022). https://doi.org/10.1186/s12885-022-09613-1

  • DOI: https://doi.org/10.1186/s12885-022-09613-1

Keywords

  • Artificial intelligence
  • Polygenic risk scores
  • Breast cancer screening
  • Cost-effectiveness