Performance of ultrasonography screening for breast cancer: a systematic review and meta-analysis
BMC Cancer volume 20, Article number: 499 (2020)
To investigate the performance of primary ultrasound (P-US) screening for breast cancer, and that of supplemental ultrasound (S-US) screening for breast cancer after negative mammography (MAM).
Electronic databases (PubMed, Scopus, Web of Science, and Embase) were systematically searched to identify relevant studies published between January 2003 and May 2018. Only high-quality or fair-quality studies reporting any of the following performance values for P-US or S-US screening were included: sensitivity, specificity, cancer detected rate (CDR), recall rate (RR), biopsy rate (BR), proportion of invasive cancers among screening-detected cancers (ProIC), and proportion of node-negative cancers among screening-detected invasive cancers (ProNNIC).
Twenty-three studies were included, comprising 12 studies in which S-US screening was used after negative MAM and 11 joint screening studies in which both primary MAM (P-MAM) and P-US were used. Meta-analyses revealed that S-US screening could detect 96% [95% confidence intervals (CIs): 82 to 99%] of occult breast cancers missed by MAM and identify 93% (95% CIs: 89 to 96%) of healthy women, with a CDR of 3.0/1000 (95% CIs: 1.8/1000 to 4.6/1000), RR of 8.8% (95% CIs: 5.0 to 13.4%), BR of 3.9% (95% CIs: 2.7 to 5.4%), ProIC of 73.9% (95% CIs: 49.0 to 93.7%), and ProNNIC of 70.9% (95% CIs: 46.0 to 91.6%). Compared with P-MAM screening, P-US screening led to the recall of significantly more women with positive screening results [1.5% (95% CIs: 0.6 to 2.3%), P = 0.001] and detected significantly more invasive cancers [16.3% (95% CIs: 10.6 to 22.1%), P < 0.001]. However, there were no significant differences for other performance measures between the two screening methods, including sensitivity, specificity, CDR, BR, and ProNNIC.
Current evidence suggests that S-US screening could detect occult breast cancers missed by MAM. P-US screening has been shown to be comparable to P-MAM screening in women with dense breasts in terms of sensitivity, specificity, cancer detection rate, and biopsy rate, but with higher recall rates and higher detection rates for invasive cancers.
Cancer is a global public health issue. In 2016, an estimated 17.2 million cancer cases and 8.9 million cancer deaths occurred worldwide. For women, breast cancer was both the most commonly occurring cancer and the leading cause of cancer deaths and disability-adjusted life-years (DALYs) (1.7 million incident cases, 535,000 deaths, and 14.9 million DALYs). Over the years, the burden of cancer has shifted from more developed countries to less developed countries. Moreover, the burden is expected to grow worldwide due to the aging of the population and the adoption of lifestyle behaviors such as smoking, poor diet, physical inactivity, and reproductive changes (including lower parity and later age at first birth), particularly in less developed countries. Therefore, broad prevention measures, such as cancer screening, are urgently needed to control this increasing burden, especially in less developed countries.
Mammography (MAM) has been used to screen for breast cancer since the 1970s and is now widely available in developed countries. However, in less developed countries, such as China, MAM is not easily accessible due to several barriers, including insufficient MAM equipment, inadequate insurance coverage for MAM, and widely dispersed populations. Moreover, MAM has a low sensitivity in women with dense breasts, who may be at higher risk of breast cancer than those without dense breasts. Worryingly, studies from Denmark and the Netherlands showed that nearly one in every three and one in every two screening-detected breast cancers, respectively, represent overdiagnosis [6, 7].
Recent data indicate that supplemental ultrasonography (S-US) screening could detect occult breast cancers missed by MAM, and that primary ultrasonography (P-US) screening seems to perform comparably to primary MAM (P-MAM) screening [8,9,10,11]. However, few systematic reviews have examined the performance of S-US or P-US screening. Moreover, in joint screening studies in which both P-MAM and P-US were used, researchers focused only on the performance differences between joint screening and P-MAM screening alone. Few studies investigated the independent performance of P-US screening. Therefore, we conducted this systematic review and meta-analysis to provide a global profile of S-US screening after negative MAM screening and of P-US screening for breast cancer.
This meta-analysis is reported in line with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA) statement.
Types of studies and participants
Randomized controlled trials (RCTs) and prospective or retrospective screening cohort studies focusing on the performance of P-US screening for breast cancer, or on the performance of S-US screening for breast cancer after negative MAM, were included. The screening performance included the following indicators: sensitivity, specificity, cancer detection rate (CDR), recall rate (RR), biopsy rate (BR), proportions of invasive cancers among screening-detected cancers (ProIC), and proportions of node-negative invasive cancers among screening-detected invasive cancers (ProNNIC). The types of ultrasonography (US) included were hand-held ultrasonography (HHUS) and automated whole breast ultrasonography (ABUS). Diagnostic studies of patients with histopathologically proven breast cancer or of women with suspicious findings after initial screening were excluded. Screening studies for second cancers among women previously diagnosed with breast cancer were also excluded.
A comprehensive search was conducted according to the Cochrane handbook guidelines. Because the American College of Radiology (ACR) introduced the Breast Imaging Reporting and Data System (BI-RADS) classification for breast ultrasonography examinations in 2003, the search period started in that year. Electronic databases (PubMed, Scopus, Web of Science, and Embase) were systematically searched to identify relevant studies published in English between January 2003 and May 2018. Five groups of key words were used in the search strategies: (1) breast neoplasm, breast cancer, breast carcinoma; (2) ultrasound, ultrasonography; (3) screening; (4) supplemental, supplementary, adjunct, adjunctive, combined, joint, primary, single, alone; (5) sensitivity, specificity, detection rate, recall rate, biopsy rate. Reference lists from retrieved articles were also reviewed. Detailed search strategies are provided in supplementary S1.
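As an illustration only, the accuracy and cancer-characteristic indicators listed above can be computed from the counts of a single screening study roughly as sketched below. The class, function, and field names are hypothetical, and the counts are invented for the example; they are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Counts from one hypothetical screening study (illustrative values only)."""
    true_pos: int       # screen-positive women with confirmed cancer
    false_neg: int      # screen-negative women with confirmed cancer
    true_neg: int       # screen-negative women without cancer
    false_pos: int      # screen-positive women without cancer
    invasive: int       # screening-detected cancers that were invasive
    node_negative: int  # screening-detected invasive cancers that were node-negative


def sensitivity(c: ScreeningCounts) -> float:
    # Share of women with cancer who screened positive
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ScreeningCounts) -> float:
    # Share of women without cancer who screened negative
    return c.true_neg / (c.true_neg + c.false_pos)


def pro_ic(c: ScreeningCounts) -> float:
    # Proportion of invasive cancers among screening-detected cancers
    return c.invasive / c.true_pos


def pro_nnic(c: ScreeningCounts) -> float:
    # Proportion of node-negative cancers among screening-detected invasive cancers
    return c.node_negative / c.invasive


# Invented counts, used only to exercise the functions
example = ScreeningCounts(true_pos=20, false_neg=5, true_neg=9000,
                          false_pos=975, invasive=16, node_negative=12)
print(sensitivity(example), specificity(example), pro_ic(example), pro_nnic(example))
```

Note that ProNNIC uses the number of invasive cancers, not all detected cancers, as its denominator, which follows the definition given in the text.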
Selection of studies
Two authors independently screened the titles and abstracts of all selected articles to confirm their eligibility. All selected articles were managed in EndNote software, which allows reviewers to organize articles and detect duplicate publications. When two or more articles from the same trial were selected, the article with the larger sample size, longer duration of follow-up, or the latest results was included. Any disagreement on the selection of articles was discussed and arbitrated by a third author. Details of the selection process are provided in supplementary S2.
Two authors independently extracted the following data from the qualifying studies: general information (name of first author, year of publication, and country or countries where the study was performed), design of study (sample size, median age, percentage of women with dense breasts among the whole population, type of US, screening mode), performance of US, and information for the risk-of-bias assessment (detailed in the following section). Because there is no consensus that dense breasts can be regarded as an independent risk factor for breast cancer [5, 14], and to avoid attaching a ‘high risk’ label to women with dense breasts, we recorded breast density as an attribute of average-risk women. All data were entered into STATA 14.0 software for analysis. Any disagreements on the data extracted were also discussed and arbitrated by the same third author.
Risk assessment of bias in included studies
Two investigators critically appraised all included studies independently according to pre-specified criteria, which were adapted from the USPSTF’s design-specific criteria and the STARD checklist for reporting diagnostic accuracy studies [15, 16]. The adapted criteria included 7 items: source of population, sample size, inclusion and exclusion criteria, blinding of test, data completeness, BI-RADS criteria, and reference standards. The result for each item was classified as high-risk or low-risk. Detailed information on the adapted risk-of-bias assessment criteria is provided in supplementary S3.
According to the above-mentioned criteria, high-quality studies were defined as those meeting at least six low-risk items for joint screening studies and at least five low-risk items for S-US screening studies. Fair-quality studies were those meeting four or five low-risk items for joint screening studies and three or four low-risk items for S-US screening studies. Poor-quality studies were defined as those meeting fewer than four low-risk items for joint screening studies and fewer than three low-risk items for S-US screening studies. Poor-quality studies were excluded from the review.
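The grading rule above can be expressed as a small lookup, sketched here for illustration. The thresholds come from the text; the function name, the design labels `"joint"` and `"s_us"`, and the dictionary layout are hypothetical choices made for this example.

```python
# Minimum number of low-risk items needed for each grade, by study design.
# Threshold values are those stated in the text; the labels are illustrative.
THRESHOLDS = {
    "joint": {"high": 6, "fair": 4},  # joint P-MAM + P-US screening studies
    "s_us": {"high": 5, "fair": 3},   # S-US screening after negative MAM
}


def grade_quality(low_risk_items: int, design: str) -> str:
    """Return 'high', 'fair', or 'poor' for a study with the given item count."""
    limits = THRESHOLDS[design]
    if low_risk_items >= limits["high"]:
        return "high"
    if low_risk_items >= limits["fair"]:
        return "fair"
    return "poor"  # poor-quality studies were excluded from the review


print(grade_quality(5, "joint"))  # a joint study with five low-risk items
print(grade_quality(5, "s_us"))   # an S-US study with five low-risk items
```

The same item count can lead to different grades depending on the design, which is why the design label is a required argument.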
Data synthesis and analysis
All data were extracted with pre-specified uniform tables and recalculated with uniform methods. The corresponding authors were contacted to obtain any missing information from their studies. Some studies used the number of ‘examinations’ rather than the number of ‘women’ as the denominator when calculating the detection rate of breast cancer, because each woman was followed up several times and was examined at each follow-up. Each woman therefore contributed several examinations in these studies. Had we used the number of ‘women’ as the denominator for these studies, the detection rate would have been substantially overestimated, since the number of ‘women’ was considerably smaller than the number of ‘examinations’. Therefore, to follow the analysis protocol of the original studies and to avoid potential overestimation of the detection rate, we treated each examination as an independent woman. However, this approach could bias the estimates, because repeated observations of the same woman are not ‘independent’ observations.
Cancer detection rate was defined as the number of cancers detected (including carcinoma in situ and invasive cancer, but not high-risk precancerous lesions) divided by the number of examinations or participants. The recall rate was calculated as the number of women recalled for further diagnostic examinations divided by the total number of women who participated in the screening. If the number of women recalled for further diagnostic examinations was not available, the number of women with a positive result on the index screening modality was used instead. The biopsy rate was calculated as the number of women recalled for pathological examination divided by the total number of women who participated in the screening.
The proportion of variation in each screening performance measure attributable to heterogeneity was measured as I². If the P value for heterogeneity was less than 0.1, significant heterogeneity was indicated among the included trials and the random-effect model was used to combine screening performances. Otherwise, the fixed-effect model was used. To search for sources of heterogeneity and obtain clinically meaningful estimates, subgroup analyses were conducted according to different study characteristics, such as sample size > 1000 (Yes/No), all women with dense breasts (Yes/No), type of US (HHUS/ABUS), and quality assessment (Yes/No), whenever possible.
The package “midas” was used to combine sensitivity and specificity, to investigate whether there were potential publication biases among included studies, and to plot the summary receiver operating characteristic (SROC) curve with its 95% confidence and prediction contours. The package “metaprop” was used to combine CDR, RR, BR, ProIC, and ProNNIC. In addition, the package “metan” was used to compare the performances between MAM and US.
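The three rates defined above, and the effect of choosing examinations rather than women as the denominator, can be illustrated with a short sketch. The function names and all counts are invented for this example and do not come from any included study.

```python
def rate(numerator: int, denominator: int, per: int = 1) -> float:
    """Return numerator/denominator scaled to 'per' units (e.g. per 1000)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return per * numerator / denominator


def screening_rates(cancers: int, recalled: int, biopsied: int, screened: int) -> dict:
    """Cancer detection rate (per 1000), recall rate (%), and biopsy rate (%)."""
    return {
        "cdr_per_1000": rate(cancers, screened, per=1000),
        "recall_rate_pct": rate(recalled, screened, per=100),
        "biopsy_rate_pct": rate(biopsied, screened, per=100),
    }


# Hypothetical study: 4000 women, each screened once
rates = screening_rates(cancers=12, recalled=300, biopsied=100, screened=4000)
print(rates)

# Hypothetical repeated screening: the same 12 cancers found over three rounds,
# so 4000 women contribute 12000 examinations.
cdr_per_woman = rate(12, 4000, per=1000)
cdr_per_exam = rate(12, 12000, per=1000)
print(cdr_per_woman, cdr_per_exam)
```

With these invented counts the per-woman figure is three times the per-examination figure, which is the kind of inflation the per-examination denominator was meant to avoid.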
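For readers unfamiliar with I², the sketch below shows one common way to obtain it from Cochran's Q under inverse-variance weighting, with I² = max(0, (Q − df)/Q) × 100%. It is an illustration under stated assumptions rather than a reproduction of the Stata routines used in this review: the function names are hypothetical, the inputs are invented, and the heterogeneity P value (which requires the chi-square distribution with k − 1 degrees of freedom) is not computed here.

```python
def cochran_q(estimates, variances):
    """Return the fixed-effect (inverse-variance) pooled estimate and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    return pooled, q


def i_squared(q: float, n_studies: int) -> float:
    """Percentage of total variation attributed to between-study heterogeneity."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Two invented study estimates with equal variances (illustrative only)
estimates = [0.1, 0.3]
variances = [0.01, 0.01]
pooled, q = cochran_q(estimates, variances)
print(pooled, q, i_squared(q, len(estimates)))
```

When Q is smaller than its degrees of freedom, I² is truncated at zero, so the statistic never becomes negative.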
All meta-analyses were conducted with STATA software (version 14.0). All tests were two-sided, and P values of less than 0.05 for all meta-analyses indicated statistical significance.
Supplementary S2 shows a flowchart of the study selection procedure. The electronic searches yielded 1162 potentially relevant studies, of which 23 eligible studies were included in the final review [9,10,11, 21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40], including 12 studies in which S-US screening was used after negative MAM and 11 joint screening studies in which both P-MAM and P-US were used.
Table 1 shows the baseline characteristics of the 23 studies. Twelve studies were conducted among women with dense breasts. Twenty studies screened women with HHUS. Twelve studies were conducted among general community women or well-defined high-risk women. Eleven studies excluded women who had a personal history of breast cancer. Eight joint screening studies masked the results of P-MAM screening and P-US screening. Nineteen studies had low risk of incomplete data. Sixteen studies reported US results according to BI-RADS classification criteria. The reference standard in seventeen studies was pathologic examination combined with 12-month clinical follow-up. Finally, according to the pre-specified criteria, seven studies were of high quality, while the remaining 16 were of fair quality.
Screening accuracy for S-US and P-US screening
Table 2 shows the original data on screening accuracy for S-US and P-US screening among the included studies. Based on meta-analyses, S-US screening could detect 96% [95% confidence intervals (CIs): 82 to 99%; I² = 64.9%, P < 0.01] of occult breast cancers missed by MAM and identify 93% (95% CIs: 89 to 96%; I² = 99.8%, P < 0.01) of healthy women (Fig. 1a, supplementary S4). The area under the SROC (AUC) for S-US screening was 98% (95% CIs: 97 to 99%) (Fig. 1a). No publication bias was found among these studies (P = 0.397).
Among 11 joint screening studies, P-MAM screening could detect 65% (95% CIs: 53 to 75%; I² = 93.2%, P < 0.01) of breast cancers and identify 97% (95% CIs: 93 to 99%; I² = 99.9%, P < 0.01) of healthy women (Fig. 1b, supplementary S5). P-US screening could detect 68% (95% CIs: 45 to 85%; I² = 96.2%, P < 0.01) of breast cancers and identify 98% (95% CIs: 94 to 99%; I² = 100%, P < 0.01) of healthy women (Fig. 1c, supplementary S6). The AUCs for P-MAM screening and P-US screening were 88% (95% CIs: 85 to 91%) (Fig. 1b) and 96% (95% CIs: 94 to 97%) (Fig. 1c), respectively. No publication bias was found for either P-MAM screening (P = 0.215) or P-US screening (P = 0.266). No significant differences were found for either sensitivity [0.3% (95% CIs: −14.4 to 14.9%), P = 0.970; I² = 88.0%, P < 0.001] or specificity [−0.1% (95% CIs: −0.7 to 0.5%), P = 0.860; I² = 96.3%, P < 0.001] between P-MAM screening and P-US screening (Fig. 2).
Screening efficacy for S-US and P-US screening
Table 3 shows the original data on screening efficacy for S-US and P-US screening reported by the included studies. Meta-analyses showed that the summary CDR for S-US screening was 3.0/1000 (95% CIs: 1.8/1000 to 4.6/1000; I² = 85.1%, P < 0.001), with an RR of 8.8% (95% CIs: 5.0 to 13.4%; I² = 99.7%, P < 0.001) and a BR of 3.9% (95% CIs: 2.7 to 5.4%; I² = 98.0%, P < 0.001) (Fig. 3).
Among 11 joint screening studies, the summary CDRs for P-MAM screening and P-US screening were 4.6/1000 (95% CIs: 3.2/1000 to 6.1/1000; I² = 89.8%, P < 0.001) and 4.6/1000 (95% CIs: 3.1/1000 to 6.3/1000; I² = 91.9%, P < 0.001), with summary RRs of 4.6% (95% CIs: 2.2 to 7.7%; I² = 99.8%, P < 0.001) and 5.9% (95% CIs: 2.7 to 10.2%; I² = 99.8%, P < 0.001), and summary BRs of 1.5% (95% CIs: 0.5 to 3.0%; I² = 98.9%, P < 0.001) and 2.3% (95% CIs: 0.9 to 4.5%; I² = 99.2%, P < 0.001), respectively (Fig. 4). Compared to P-MAM screening, P-US screening recalled significantly more women with positive screening results [1.5% (95% CIs: 0.6 to 2.3%), P = 0.001] (Fig. 2). No significant differences were found for either CDR [−0.2/1000 (95% CIs: −1.1/1000 to 0.6/1000), P = 0.581; I² = 46.1%, P = 0.046] or BR [−1.0% (95% CIs: −2.0 to 0.6%), P = 0.066; I² = 96.6%, P < 0.001] for P-MAM screening compared to P-US screening (Fig. 2).
Cancer characteristics for S-US and P-US screening
Table 4 shows the original data on cancer characteristics for S-US and P-US screening reported by the included studies. The studies by Corsetti, Hwang, Youk, and Brancato among the S-US screening studies, as well as the study by Shen among the joint screening studies, did not report detailed information on invasive cancers or node-negative invasive cancers among screening-detected cancers and are therefore absent from Table 4. In the meta-analyses, 73.9% (95% CIs: 49.0 to 93.7%; I² = 66.4%, P = 0.007) of cancers detected by S-US screening were invasive cancers, while 70.9% (95% CIs: 46.0 to 91.6%) of the invasive cancers were node-negative (Fig. 3).
Among 11 joint screening studies, 65.1% (95% CIs: 57.5 to 72.5%; I² = 45.9%, P = 0.055) and 86.9% (95% CIs: 77.4 to 94.5%; I² = 72.5%, P < 0.001) of cancers detected by P-MAM screening and by P-US screening, respectively, were invasive cancers, while 82.0% (95% CIs: 59.7 to 97.6%; I² = 82.8%, P < 0.001) and 83.4% (95% CIs: 64.9 to 96.7%; I² = 81.2%, P < 0.001) of the invasive cancers, respectively, were node-negative (Fig. 4). Compared to P-MAM screening, P-US screening detected significantly more invasive cancers [16.3% (95% CIs: 10.6 to 22.1%), P < 0.001; I² = 0%, P = 0.623] but a similar number of node-negative invasive cancers [0.3% (95% CIs: −6.0 to 6.7%), P = 0.916; I² = 0%, P = 0.923] (Fig. 2).
Subgroup analyses showed very similar results to those of the primary analyses (Supplementary S7 and S8). In addition, lower sensitivity, higher specificity, higher cancer detection rate, and higher biopsy rate were found for S-US screening among women with dense breasts compared to those without dense breasts (Supplementary S7). Moreover, the differences in sensitivity, specificity, and cancer detection rate between P-MAM screening and P-US screening were larger among women with dense breasts than among those without dense breasts (Supplementary S8).
The U.S. Preventive Services Task Force (USPSTF) initially reviewed the performance and clinical outcomes of S-US screening in women with dense breasts or negative mammography. However, only two studies were included. The authors concluded that the effects of S-US screening on breast cancer outcomes remain unclear because good-quality evidence is sparse. In addition, Gartlehner systematically reviewed the evidence on the joint effectiveness of screening with P-MAM and P-US compared to MAM screening alone. However, this review did not investigate the performance of P-US screening. Our study is the first systematic review and meta-analysis to investigate the performance of P-US screening for breast cancer, and it is also an important up-to-date systematic review and meta-analysis of the performance of S-US screening.
The role of S-US screening was first addressed in ACRIN 6666 by Berg in 2008. Berg concluded that adding S-US screening to P-MAM screening would yield an additional 1.1 to 7.2 cancers per 1000 high-risk women. Our analyses also found a similar additional 0.4 to 22.4 cancers per 1000 examinations. Moreover, after re-analysis of ACRIN 6666, Berg concluded that ultrasound could be used as the primary screening method for breast cancer. However, to date, screening guidelines for breast cancer have reached no consistent conclusion on whether US screening should be recommended as the primary screening method for women. For example, the National Comprehensive Cancer Network, the European Society of Breast Imaging (EUSOBI), the Japanese Breast Cancer Society, and the Chinese Anti-Cancer Association (CACA) recommended S-US screening for women with dense breasts after negative MAM [42,43,44,45], while the USPSTF, the American Cancer Society, the American College of Physicians, and the Canadian Task Force on Preventive Health Care made no clear recommendations on US screening [46,47,48,49].
Several reasons could explain these inconsistent recommendations among current guidelines. As argued by the USPSTF, the scarcity of good-quality evidence would be the major reason. However, as shown in our study, several high-quality and fair-quality studies have been conducted since 2003. Although EUSOBI supported S-US screening after P-MAM, it also raised the concern that breast US was inappropriately suggested as a primary screening method, since P-US screening had not been shown to reduce breast cancer mortality in the general female population. Moreover, US would lead to more biopsies and recalls than MAM. In this systematic review, we did observe higher recall rates for US compared to MAM. We also observed higher biopsy rates for US compared to MAM; however, the difference was nonsignificant. This nonsignificant difference in biopsy rates between US and MAM may be due to small sample sizes, but it may also reflect no actual difference. In addition, breast ultrasound has several limitations that could make it inappropriate as a screening test: US cannot image the whole breast at once as MAM does; US cannot show microcalcifications, which would be the most common feature of tissue around a tumor; and the skill level of the US operator makes a great difference to the screening results. However, as shown in our study, these concerns did not seem to cause significant differences in sensitivity and specificity, or even in cancer detection rates and cancer characteristics (such as the proportion of node-negative invasive cancers), between P-US screening and P-MAM screening. Moreover, lower price, larger coverage, absence of radiation effects, and lower overdiagnosis rates for US compared to MAM make US more easily accepted in China and other countries [3, 9, 50]. Therefore, the Chinese Anti-Cancer Association and other societies supported S-US screening in their guidelines.
Lastly, the following results merit attention. First, we observed significantly higher RR and ProIC for P-US screening compared with P-MAM screening. Higher recall rates would be an important barrier to promoting US screening. More studies are needed to investigate the factors associated with the higher recall rates of US screening in order to reduce unnecessary recalls. Second, as shown in supplementary S7, subgroup analyses did not find obvious differences in sensitivity, specificity, or cancer detection rate for S-US screening after negative MAM screening between women with and without dense breasts. These results suggest that the influence of dense breasts on the performance of S-US after negative MAM may itself depend on other factors. Moreover, as shown in supplementary S8, subgroup analyses also did not find significantly higher sensitivity for P-MAM compared to P-US among women with dense breasts. Small sample size could be an important factor, since only 3 of the 11 studies exclusively recruited women with dense breasts (a proportion of 100% dense breasts), and in another 5 of the 11 studies more than 40% of participating women had dense breasts.
This study has several limitations. First, owing to the lack of evidence for reduced breast cancer mortality, we cannot conclude that US screening would lead to a long-term benefit. More studies with sophisticated designs and long-term follow-up are needed to investigate the long-term benefits and potential risks (including false positivity, “unnecessary” recalls, and overdiagnosis) of P-US screening. Second, apart from breast density, no studies investigated whether other risk factors (such as obesity) influenced the differences in screening performance between US and MAM. Therefore, we cannot determine whether the differences in performance between US and MAM derived from confounding effects or from actual differences between the two modalities. Third, as shown in Table 3 and Table 4, not all included studies reported all screening performance indexes (such as biopsy rate, proportions of invasive cancers, and proportions of node-negative invasive cancers). Meta-analysis results based only on the studies that reported a given index could therefore be biased, and complete reporting of screening performance in US and MAM screening studies is needed to improve the current results. Fourth, combining data from repeated (longitudinal) US screening of a woman with data from an initial screening could also lead to biased results. Fifth, meta-analyses under the criterion of P < 0.05 could potentially overestimate the performance of US even though the random-effect model was used. More real-world studies with large sample sizes are needed in the future.
Current evidence suggests that S-US screening could detect occult breast cancers missed by MAM. P-US screening has been shown to be comparable to P-MAM screening in women with dense breasts in terms of sensitivity, specificity, cancer detection rate, and biopsy rate, but with higher recall rates and higher detection rates for invasive cancers. More studies are needed to investigate the long-term benefits and potential risks (including false positivity, “unnecessary” recalls, and overdiagnosis) of P-US screening. Moreover, US screening for breast cancer deserves more attention in the future, not only because US is comparable to MAM in women with dense breasts in terms of sensitivity, specificity, cancer detection rate, and biopsy rate, but also because ultrasound involves no radiation and is easier to access in low-resource areas, such as rural China.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.
ABUS: Automated whole breast ultrasonography
CDR: Cancer detected rate
ProIC: Proportions of invasive cancers
ProNNIC: Proportions of node-negative invasive cancers
USPSTF: The U.S. Preventive Services Task Force
EUSOBI: European Society of Breast Imaging
CACA: Chinese Anti-Cancer Association
Fitzmaurice C, Akinyemiju TF, Al LF, et al. Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life-years for 29 cancer groups, 1990 to 2016: a systematic analysis for the Global Burden of Disease Study. JAMA Oncol. 2018;4:1553–68.
Torre LA, Bray F, Siegel RL, et al. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108.
Fan L, Strasser-Weippl K, Li J, et al. Breast cancer in China. Lancet Oncol. 2014;15:e279–89.
Berg WA, Blume JD, Cormack JB, et al. Combined screening with ultrasound and mammography vs mammography alone in women at elevated risk of breast cancer. JAMA. 2008;299:2151–63.
Boyd NF, Guo H, Martin LJ, et al. Mammographic density and the risk and detection of breast cancer. N Engl J Med. 2007;356:227–36.
Autier P, Boniol M, Koechlin A, Pizot C, Boniol M. Effectiveness of and overdiagnosis from mammography screening in the Netherlands: population based study. BMJ. 2017;359:j5224.
Jorgensen KJ, Gotzsche PC, Kalager M, Zahl PH. Breast cancer screening in Denmark: a cohort study of tumor size and overdiagnosis. Ann Intern Med. 2017;166:313–23.
Berg WA, Zhang Z, Lehrer D, et al. Detection of breast cancer with addition of annual screening ultrasound or a single screening MRI to mammography in women with elevated breast cancer risk. JAMA. 2012;307:1394–404.
Dong H, Huang Y, Song F, et al. Improved performance of adjunctive ultrasonography after mammography screening for breast cancer among Chinese females. Clin Breast Cancer. 2017;18:e353–61.
Ohuchi N, Suzuki A, Sobue T, et al. Sensitivity and specificity of mammography and adjunctive ultrasonography to screen for breast cancer in the Japan strategic anti-cancer randomized trial (J-START): a randomised controlled trial. Lancet. 2016;387:341–8.
Berg WA, Bandos AI, Mendelson EB, et al. Ultrasound as the primary screening test for breast cancer: analysis from ACRIN 6666. J Natl Cancer Inst. 2016;108:djv367.
McInnes M, Moher D, Thombs BD, et al. Preferred reporting items for a systematic review and meta-analysis of diagnostic test accuracy studies: the PRISMA-DTA statement. JAMA. 2018;319:388–96.
Mendelson EB, Baum JK, Berg WA, Merritt CRB, Rubin E. Breast imaging reporting and data system BIRADS: ultrasound. Reston, VA: American College of Radiology; 2003.
Dai H, Yan Y, Wang P, et al. Distribution of mammographic density and its influential factors among Chinese women. Int J Epidemiol. 2014;43:1240–51.
Melnikow J, Fenton JJ, Whitlock EP, et al. Supplemental screening for breast cancer in women with dense breasts: a systematic review for the U.S. Preventive Services Task Force. Ann Intern Med. 2016;164:268–78.
Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. BMJ. 2003;326:41–4.
Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–60.
Dwamena BA. MIDAS: Stata module for meta-analytical integration of diagnostic test accuracy studies, Statistical Software Components S456880. Boston College Department of Economics; 2007. (revised 05 Feb 2009). https://ideas.repec.org/c/boc/bocode/s456880.html. Accessed 20 May 2018.
Nyaga VN, Arbyn M, Aerts M. Metaprop: a Stata command to perform meta-analysis of binomial data. Arch Public Health. 2014;72:39.
Ross JH, Michael JB, Jonathan JD, et al. Metan: fixed- and random-effects meta-analysis. Stata J. 2007;8:3–28.
Tagliafico AS, Calabrese M, Mariscotti G, et al. Adjunct screening with tomosynthesis or ultrasound in women with mammography-negative dense breasts: interim report of a prospective comparative trial. J Clin Oncol. 2016.
Kim SY, Kim MJ, Moon HJ, Yoon JH, Kim EK. Application of the downgrade criteria to supplemental screening ultrasound for women with negative mammography but dense breasts. Medicine (Baltimore). 2016;95:e5279.
Shen S, Zhou Y, Xu Y, et al. A multi-centre randomised trial comparing ultrasound vs mammography for screening breast cancer in high-risk Chinese women. Br J Cancer. 2015;112:998–1004.
Moon HJ, Jung I, Park SJ, et al. Comparison of cancer yields and diagnostic performance of screening mammography vs. supplemental screening ultrasound in 4394 women with average risk for breast cancer. Ultraschall Med. 2015;36:255–63.
Hwang JY, Han BK, Ko EY, et al. Screening ultrasound in women with negative mammography: outcome analysis. Yonsei Med J. 2015;56:1352–8.
Weigert J, Steenbergen S. The Connecticut experiments second year: ultrasound in the screening of women with dense breasts. Breast J. 2015;21:175–80.
Girardi V, Tonegutti M, Ciatto S, Bonetti F. Breast ultrasound in 22,131 asymptomatic women with negative mammography. Breast. 2013;22:806–9.
Parris T, Wakefield D, Frimmer H. Real world performance of screening breast ultrasound following enactment of Connecticut bill 458. Breast J. 2013;19:64–70.
Venturini E, Losio C, Panizza P, et al. Tailored breast cancer screening program with microdose mammography, US, and MR imaging: short-term results of a pilot study in 40-49-year-old women. Radiology. 2013;268:347–55.
Huang Y, Kang M, Li H, et al. Combined performance of physical examination, mammography, and ultrasonography for breast cancer screening among Chinese women: a follow-up study. Curr Oncol. 2012;19:S22–30.
Hooley RJ, Greenberg KL, Stackhouse RM, et al. Screening US in patients with mammographically dense breasts: initial experience with Connecticut public act 09-41. Radiology. 2012;265:59–69.
Leong LC, Gogna A, Pant R, Ng FC, Sim LS. Supplementary breast ultrasound screening in Asian women with negative but dense mammograms-a pilot study. Ann Acad Med Singap. 2012;41:432–9.
Corsetti V, Houssami N, Ghirardi M, et al. Evidence of the effect of adjunct ultrasound screening in women with mammography-negative dense breasts: interval breast cancers at 1 year follow-up. Eur J Cancer. 2011;47:1021–6.
Youk JH, Kim EK, Kim MJ, Kwak JY, Son EJ. Performance of hand-held whole-breast ultrasound based on BI-RADS in women with mammographically negative dense breast. Eur Radiol. 2011;21:667–75.
Weinstein SP, Localio AR, Conant EF, et al. Multimodality screening of high-risk women: a prospective cohort study. J Clin Oncol. 2009;27:6124–8.
Brancato B, Bonardi R, Catarzi S, et al. Negligible advantages and excess costs of routine addition of breast ultrasonography to mammography in dense breasts. Tumori. 2007;93:562–6.
Honjo S, Ando J, Tsukioka T, et al. Relative and combined performance of mammography and ultrasonography for breast cancer screening in the general population: a pilot study in Tochigi prefecture, Japan. Jpn J Clin Oncol. 2007;37:715–20.
Wilczek B, Wilczek HE, Rasouliyan L, Leifland K. Adding 3D automated breast ultrasound to mammography screening in women with heterogeneously and extremely dense breasts: report from a hospital-based, high-volume, single-center breast cancer screening program. Eur J Radiol. 2016;85:1554–63.
Brem RF, Tabar L, Duffy SW, et al. Assessing improvement in detection of breast cancer with three-dimensional automated breast US in women with dense breast tissue: the SomoInsight study. Radiology. 2015;274:663–73.
Kelly KM, Dean J, Comulada WS, Lee SJ. Breast cancer detection using automated whole breast ultrasound and mammography in radiographically dense breasts. Eur Radiol. 2010;20:734–42.
Gartlehner G, Thaler K, Chapman A, et al. Mammography in combination with breast ultrasonography versus mammography for breast cancer screening in women at average risk. Cochrane Database Syst Rev. 2013;4:CD009632.
Tozaki M, Kuroki Y, Kikuchi M, et al. The Japanese Breast Cancer Society clinical practice guidelines for screening and imaging diagnosis of breast cancer, 2015 edition. Breast Cancer. 2016;23:357–66.
National Comprehensive Cancer Network. NCCN Clinical Practice Guidelines in Oncology (NCCN Guidelines): Breast Cancer Screening and Diagnosis. V1 ed. 2016.
The Committee of Breast Cancer from the Chinese Anti-Cancer Association. Guidelines of diagnosis and treatment for breast cancer by the Chinese Anti-Cancer Association (2017 edition). J Chin Oncol. 2017;27:695–760.
Evans A, Trimboli RM, Athanasiou A, et al. Breast ultrasound: recommendations for information to women and referring physicians by the European Society of Breast Imaging. Insights Imaging. 2018;9:449–61.
Siu AL. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2016;164:279–96.
Wilt TJ, Harris RP, Qaseem A. Screening for cancer: advice for high-value care from the American College of Physicians. Ann Intern Med. 2015;162:718–25.
Oeffinger KC, Fontham ET, Etzioni R, et al. Breast cancer screening for women at average risk: 2015 guideline update from the American Cancer Society. JAMA. 2015;314:1599–614.
Tonelli M, Connor GS, Joffres M, et al. Recommendations on screening for breast cancer in average-risk women aged 40-74 years. CMAJ. 2011;183:1991–2001.
Huang Y, Dai H, Song F, et al. Preliminary effectiveness of breast cancer screening among 1.22 million Chinese females and different cancer patterns between urban and rural women. Sci Rep. 2016;6:39459.
This work was supported by the Natural Science Foundation of Tianjin [Grant number 18JCQNJC80300, Grantee: Yubei Huang]; the Chinese National Key Research and Development Project [Grant number 2018YFC1315600, Grantee: Ping Wang]; the National Natural Science Foundation of China [Grant number 81502476, Grantee: Yubei Huang]; and the Beijing Young Talent Program [Grant number 2016000021469G189, Grantee: Lei Yang]. The funders had no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary S1. Searching strategies in detail from four databases.
Supplementary S2. Flowchart of searching strategy.
Supplementary S3. Bias risk assessment criteria.
Supplementary S4. Screening accuracy for S-US screening.
Supplementary S5. Screening accuracy for P-MAM screening.
Supplementary S6. Screening accuracy for P-US screening.
Supplementary S7. Subgroup analyses on the performance of S-US screening for breast cancer.
Supplementary S8. Subgroup analyses on the performance differences between P-MAM and P-US for breast cancer.
Cite this article
Yang, L., Wang, S., Zhang, L. et al. Performance of ultrasonography screening for breast cancer: a systematic review and meta-analysis. BMC Cancer 20, 499 (2020). https://doi.org/10.1186/s12885-020-06992-1
- Breast cancer
- Supplemental ultrasonography