Study subjects
This prospective randomized controlled trial of parallel design was performed at a large tertiary care academic medical center and was approved by the hospital’s institutional review board. Our study adheres to the CONSORT guidelines. Patients at a single tertiary care cancer center were approached by their treating oncologists or surgeons during routine clinical appointments if they met the eligibility criteria and their scheduled appointment time allowed. The patients’ oncologists or surgeons obtained written informed consent. Eight oncologists and three breast surgeons recruited patients between 2/1/2015 and 4/30/2019. Patients were followed for a minimum of 12 months.
The eligibility criteria included: (a) female patients 18 years or older; (b) PHBC (including DCIS and invasive ductal or lobular carcinoma); (c) prior unilateral mastectomy or breast conservation surgery; (d) treatment for breast cancer completed; and (e) no symptoms of breast cancer. Patients were excluded if they were considered high-risk (lifetime risk ≥ 25%) [21], were unable to undergo an MRI for physical or psychological reasons (e.g., severe claustrophobia, allergy to gadolinium, severe renal failure), had undergone bilateral mastectomy, were pregnant or breastfeeding, or had undergone a breast MRI within the last 6 months. Regular surveillance imaging consisted of annual MG, irrespective of breast tissue density. All patients had undergone prior mammographic imaging, and some (< 50%) had undergone prior breast MRI.
Eligible patients were randomized in a 1:1 allocation ratio to one of the two study arms: (1) surveillance with MG or (2) MG plus A-MRI. Permuted blocks of variable length (2, 4, and 6) were used to ensure that recruiting physicians remained unaware of the randomization sequence. Neither researchers nor study participants were blinded to allocation. Patients could participate in the study only once.
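As an illustration of permuted-block randomization with variable block lengths, the following minimal Python sketch generates a 1:1 allocation sequence. It is not the software used in the study; the function name, arm labels, seed, and the truncation of the final block are assumptions made for the example.

```python
import random

def permuted_block_schedule(n_patients, block_sizes=(2, 4, 6),
                            arms=("MG", "MG+A-MRI"), seed=None):
    """Return a 1:1 allocation list built from randomly sized permuted blocks.

    Hypothetical helper: each block contains equal numbers of both arms in
    random order, and the block length itself is drawn at random so that the
    next assignment cannot be inferred from earlier ones.
    """
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_patients:
        size = rng.choice(block_sizes)             # block length: 2, 4, or 6
        block = list(arms) * (size // len(arms))   # equal share per arm
        rng.shuffle(block)                         # permute within the block
        schedule.extend(block)
    # Assumption: the final block is simply truncated at the target size.
    return schedule[:n_patients]

allocation = permuted_block_schedule(268, seed=2015)
print(allocation[:8])
```

Because every complete block is balanced, the two arms can differ by at most three patients at any point, which is the main reason for using blocks rather than simple randomization.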
Imaging technique and interpretation
All mammographic examinations were performed using a full-field digital technique (Hologic, Bedford, MA, USA) in accordance with national guidelines. Standard two-dimensional craniocaudal (CC) and mediolateral oblique (MLO) views were obtained.
All abbreviated dynamic contrast-enhanced breast MRI examinations were performed on a single 3 T system (Magnetom TrioTim Syngo, Siemens). The standardized protocol used an 8-channel breast coil (Sentinelle Medical Inc.) and consisted of a T1 localizer followed by a T1 dynamic contrast-enhanced fat-suppressed sequence with one precontrast and one 2 min postcontrast acquisition (3D transverse, phase encoding direction right to left, phase resolution of 60%, phase partial Fourier 6/8, no interpolation, FA 10 degrees, TR 4.07 ms and TE 1.96 ms, no IR, NEX 1, voxel size 1 × 1 × 1 mm, acceleration factor 4, base resolution 448, 1:01 min, slice thickness 1 mm). Axial subtracted sequences were generated in post-processing, and axial and sagittal maximum intensity projections were generated from the subtracted images. No T2-weighted sequences were obtained. For all examinations, gadolinium contrast material (Gadovist) was power injected (0.1 mmol/kg at 2 mL/s) followed by a 20 mL saline flush. The entire protocol took 3 min.
Surveillance MG and A-MRI were each reviewed independently by one of two breast radiologists (one with 8 years and the other with 20 years of experience reading mammography and breast MRI) using the ACR Breast Imaging-Reporting and Data System (BI-RADS) lexicon [22]. For patients in the A-MRI group, MG and A-MRI studies were performed on the same day according to the protocol. Radiologists were not blinded but reported each modality separately according to the imaging modality findings, with the mammograms interpreted first. Based on the imaging findings, additional mammographic images, including diagnostic tomosynthesis, or targeted ultrasound were requested at the discretion of the interpreting radiologist. Findings and management were communicated to the patient by telephone by the reporting radiologist. Subsequent imaging was performed on separate visits, within 3 weeks of the MG or A-MRI. Histologic samples for pathologic diagnosis were obtained under ultrasound (14G, 5–6 cores), stereotactic (10G, 6–12 cores) or MRI (10G, 6–12 cores) guidance.
Anxiety measures
Patients in both groups were asked to fill out four validated self-report questionnaires that measure anxiety level and overall health [23,24,25,26] (see supplemental materials). The primary outcome was the State-Trait Anxiety Inventory (STAI) [23]. The STAI consists of two separate 20-item scales that assess state anxiety (S-Anxiety) (i.e., how the person feels at this moment) and trait anxiety (T-Anxiety) (i.e., how the person generally feels). The items are rated on a 1 to 4 scale, with total scores ranging from 20–80. Cut-off scores of ≥ 32.2 and ≥ 31.8 indicate elevated levels of state and trait anxiety, respectively. Both STAI scales have solid psychometric properties and are sensitive to assessment of longitudinal change. There are no validated cutoff scores for the STAI scales in women with PHBC; however, cutoff scores of 41 on the trait form and 44 on the state form of the STAI have been used in previous research to identify clinical levels of anxiety in women with breast cancer [27, 28]. Other psychological measures included the Penn State Worry Questionnaire (PSWQ) [24], the Breast Cancer Worry Scale (BCWS) [25], and the Health Status Questionnaire 12 (HSQ-12) [26]. The PSWQ [24] is a 16-item self-report questionnaire that measures the frequency and intensity of worry symptoms. Items are rated on a 5-point scale, with total scores ranging from 16–80. A score of 16–39 indicates low worry, 40–59 moderate worry, and 60–80 high worry. The BCWS [25] is a 3-item scale that measures the frequency of breast cancer worry and the impact of worrying on mood and ability to perform daily activities. Higher scores indicate greater cancer worry. The HSQ-12 [26] assesses the impact of health on social, emotional and physical functioning over the past four weeks. Depending on the item, questions are rated on a 3-point, 5-point or 6-point scale. Items were recoded using the method described by Barry et al. [26].
Total HSQ scores range from 0 to 800, with higher scores indicating better health status. The questionnaires were completed at three time points: at time 1 (T1), upon enrolment during the consultation when patients were due for their surveillance test(s), to measure baseline levels of anxiety; at time 2 (T2), after the patient received their surveillance MG and/or MRI test results; and at time 3 (T3), 6 months later, to determine whether the type of surveillance test had a sustained effect. T3 questionnaires were mailed to patients and returned to the study coordination center.
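The score ranges and worry bands described above can be expressed as a short Python sketch. The function names are hypothetical, and the sketch assumes item responses have already been coded, including any reverse-scoring the instrument manuals require, before totals are computed.

```python
def stai_scale_total(item_scores):
    """Sum one 20-item STAI scale; each item is rated 1 to 4, total 20 to 80.

    Assumption: items are already coded in the scoring direction.
    """
    if len(item_scores) != 20:
        raise ValueError("each STAI scale has 20 items")
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("STAI items are rated from 1 to 4")
    return sum(item_scores)

def pswq_category(total):
    """Map a PSWQ total (16 to 80) to the worry bands given in the text."""
    if not 16 <= total <= 80:
        raise ValueError("PSWQ totals range from 16 to 80")
    if total <= 39:
        return "low"
    if total <= 59:
        return "moderate"
    return "high"

print(stai_scale_total([2] * 20))  # total for a scale with every item rated 2
print(pswq_category(45))           # falls in the 40 to 59 band
```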
Data collection and statistical analysis
Medical records were reviewed to determine patient age, family history of breast and/or ovarian cancer in a first-degree relative, surgery modality, initial breast tumor stage (TNM), histology, hormone receptor status, months since diagnosis of breast cancer, and breast density. Results were compared between the two groups. For malignant or atypical/high-risk lesions, surgical pathologic results were reviewed when available. Imaging and clinical follow-up were determined by review of the hospital picture archiving and communication system (PACS) and medical records, as well as the digital imaging repository, which includes all clinics and hospitals that serve the region’s population of 1.2 million. The emigration rate in the region is < 0.5% per year [29]. Imaging follow-up for all patients with benign imaging or pathology was documented with the date of the most recent negative mammogram.
The anxiety measures were analyzed using SPSS Statistics version 25. Analysis was based on intent-to-treat (ITT) principles. Data were analyzed using linear mixed models, with surveillance group (MG only versus MG plus A-MRI), time of assessment (T1, T2, T3), and the group-by-time interaction as fixed factors. Models were estimated using restricted maximum likelihood (REML) with an unstructured covariance structure to account for correlations among repeated measures over time. A significant group-by-time interaction would suggest that changes in measures over time differed between the surveillance methods; significant interactions were further analyzed with pairwise least-squares mean comparisons. Data from missing questionnaires were not imputed because, under an assumption of missing at random (MAR), our analytical strategy using REML allowed reliable parameter estimation without imputation [30]. Descriptive statistics were calculated using spreadsheet software (Excel, Version 2013, Microsoft). Screening outcomes were compared between groups using Fisher’s exact test. The sample size calculation was based on the primary outcome, the STAI. There is no generally accepted minimal clinically important difference for the STAI subscales, so a 4-point difference was selected. This choice was based on a previous study by Millar et al. [28], which used a 4-point difference in the STAI, on consensus within the research team, and on the experience of the psychologist researcher. To have 80% power to detect a 4-point difference between the groups at any of the three time points, we planned to enroll 134 patients per group. Recruitment stopped early due to differences in cancer detection rates (CDR). Results were considered significant if p < 0.05.
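For readers who want to see how a per-group sample size of this kind is typically derived, the sketch below applies the standard normal-approximation formula for comparing two means, n = 2(z_{1-α/2} + z_{1-β})²σ²/δ². This section does not state the formula or the standard deviation the authors used, so the function name and the standard deviation of 10 points are placeholders, and the output is not expected to reproduce the planned 134 patients per group.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison of means.

    Normal approximation; delta is the difference to detect and sd is the
    assumed common standard deviation (hypothetical, not from the paper).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# 4-point difference on the STAI with a placeholder SD of 10 points
print(n_per_group(delta=4, sd=10))
```

The required sample size grows with the square of the assumed standard deviation, so the planned enrollment depends heavily on that assumption.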
Imaging modalities (MG, A-MRI) and BI-RADS final assessment categories for each modality were noted. Imaging findings and outcomes were documented for all BI-RADS 3, 4 and 5 lesions, including suspicious extra-mammary findings. Results were compared between MG and A-MRI. A screening examination was considered positive when additional diagnostic imaging was recommended prior to the next routine screening examination; positive examinations included BI-RADS 0, 3, 4 and 5 and were defined as abnormal interpretations. True positive findings were defined as a cancer diagnosis within 12 months of a positive screening examination. Imaging studies were considered false negatives if there was a tissue diagnosis of cancer within 12 months of a negative study, or in the surveillance groups if there was a tissue diagnosis of cancer in the follow-up period. The following performance metrics were calculated for each modality: CDR, abnormal interpretation rate (AIR), biopsy rate, positive predictive value for biopsy recommendations (PPV2 = cancers diagnosed/biopsies recommended), positive predictive value for biopsies performed (PPV3 = cancers diagnosed/biopsies performed), sensitivity and specificity.
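The metrics listed above can be computed from simple counts, as in the following Python sketch. The function name, argument names, and the example counts are hypothetical and are not study data; the sketch also assumes that CDR is expressed per 1000 examinations and that the same cancer count serves as the numerator for both PPV2 and PPV3.

```python
def screening_metrics(n_exams, n_positive, true_positives, false_negatives,
                      biopsies_recommended, biopsies_performed,
                      cancers_diagnosed):
    """Return screening performance metrics for one modality from raw counts."""
    false_positives = n_positive - true_positives
    true_negatives = n_exams - n_positive - false_negatives
    return {
        # Assumption: CDR reported per 1000 examinations
        "cdr_per_1000": 1000 * true_positives / n_exams,
        "air": n_positive / n_exams,
        "biopsy_rate": biopsies_performed / n_exams,
        "ppv2": cancers_diagnosed / biopsies_recommended,
        "ppv3": cancers_diagnosed / biopsies_performed,
        "sensitivity": true_positives / (true_positives + false_negatives),
        "specificity": true_negatives / (true_negatives + false_positives),
    }

# Hypothetical counts for illustration only
metrics = screening_metrics(n_exams=200, n_positive=20, true_positives=4,
                            false_negatives=1, biopsies_recommended=10,
                            biopsies_performed=8, cancers_diagnosed=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```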