Validation of the CancerMath prognostic tool for breast cancer in Southeast Asia

Background CancerMath is a set of web-based prognostic tools which predict nodal status and survival up to 15 years after diagnosis of breast cancer. This study validated its performance in a Southeast Asian setting. Methods Using the Singapore Malaysia Hospital-Based Breast Cancer Registry, clinical information was retrieved from 7064 stage I to III breast cancer patients who were diagnosed between 1990 and 2011 and underwent surgery. Predicted and observed probabilities of positive nodes and survival were compared for each subgroup. Calibration was assessed by plotting observed value against predicted value for each decile of the predicted value. Discrimination was evaluated by area under a receiver operating characteristic curve (AUC) with 95 % confidence interval (CI). Results The median predicted probability of positive lymph nodes was 40.6 %, which was lower than the observed 43.6 % (95 % CI, 42.5 %–44.8 %). The calibration plot showed underestimation for most of the groups. The AUC was 0.71 (95 % CI, 0.70–0.72). CancerMath-predicted and observed overall survival probabilities were 87.3 % vs 83.4 % at 5 years after diagnosis and 75.3 % vs 70.4 % at 10 years after diagnosis. The difference was smaller for patients from Singapore, patients diagnosed more recently and patients with favorable tumor characteristics. The calibration plot also illustrated overprediction of survival for patients with poor prognosis. The AUC for 5-year and 10-year overall survival was 0.77 (95 % CI: 0.75–0.79) and 0.74 (95 % CI: 0.71–0.76), respectively. Conclusions The discrimination and calibration of CancerMath were modest. The results suggest that clinical application of CancerMath should be limited to patients with a better prognostic profile.


Background
Adjuvant chemotherapy and hormone therapy improve long-term survival and reduce the risk of recurrence in early breast cancer patients [1][2][3]. However, the benefit varies greatly from patient to patient due to biologic heterogeneity of the disease and differences in response to treatment [4,5]. Risk of adverse effects and high cost of adjuvant therapy also make it challenging for oncologists to choose the most appropriate treatment. Therefore, several clinical tools have been developed to predict prognosis and survival benefit from treatment, using clinicopathological features, genetic profiles, and novel biomarkers [6].
The Nottingham Prognostic Index was the first prognostic model introduced for breast cancer patients in 1982. It includes only tumor grade, size, and nodal status for prediction of disease-free survival [7,8]. The widely used Adjuvant! Online (www.adjuvantonline.com) calculates 10-year overall survival and disease-free survival of patients with non-metastatic breast cancer, based on the patient's age, tumor size, grade, estrogen-receptor (ER) status, nodal status, and co-morbidities. It also quantitatively predicts the absolute gain from adjuvant therapy [9]. Although it is recommended by the National Institute for Health and Clinical Excellence and widely used by oncologists [10][11][12][13], several validation studies have suggested that Adjuvant! Online is suboptimal in women younger than 40 years and older than 75 years [14,15]. The model was recently validated in Malaysia, Korea, and Taiwan, where it was shown to substantially overestimate actual survival [16][17][18]. CancerMath (http://www.lifemath.net/cancer/) is the latest web-based prognostic tool, which takes human epidermal growth factor receptor 2 (HER2) status into account [19]. It was established based on the binary biological model of cancer metastasis, and the parameters were derived from the Surveillance, Epidemiology and End-Result (SEER) registry in the United States [20]. CancerMath provides information on overall survival, conditional survival (the likelihood of surviving given being alive after a certain number of years) and benefit of systemic treatment for each of the first 15 years after diagnosis. This model also estimates the probability of positive lymph nodes and nipple involvement. A validation study has shown comparable results between CancerMath and Adjuvant! Online [19]. However, this new tool has not been validated outside the United States.
Given the differences in underlying distribution of prognostic factors and life expectancy between Asia and the United States [21][22][23], direct application without any correction may not generate reliable predictions. The aim of the study is to validate this model in the Singapore Malaysia Hospital-Based Breast Cancer Registry, demonstrating its predictive performance for different subgroups and determining its calibration and discrimination.

prospectively collected data on breast cancer patients diagnosed since 1993 [24]. No consent was needed and ethics approval was obtained from the Domain Specific Review Board under National Healthcare Group in Singapore and the Medical Ethics Committee under UMMC. The consolidated registry included information on ethnicity, age and date of diagnosis, histologically determined tumor size, number of positive lymph nodes, ER and progesterone receptor (PR) status (positive defined as 1 % or more positively stained tumor cells at NUH or 10 % or more positively stained tumor cells at TTSH and UMMC, negative, or unknown), HER2 status based on fluorescence in situ hybridization (FISH) or immunohistochemistry (IHC) if FISH was not performed (positive defined as FISH positive or IHC score of 3+, negative defined as FISH negative or IHC score of 0 or 1+, equivocal defined as IHC score of 2+, or unknown), histological type (ductal, lobular, mucinous, others, or unknown), grade (1, 2, 3, or unknown), type of surgery (no surgery, mastectomy, breast conserving surgery, or unknown), chemotherapy (yes, no, or unknown), hormone therapy (yes, no, or unknown), and radiotherapy (yes, no, or unknown). Detailed chemotherapeutic treatment regimens were only available for UMMC patients. For chemotherapy, cyclophosphamide, methotrexate and fluorouracil (CMF) was categorized as a first generation regimen, and fluorouracil, epirubicin and cyclophosphamide (FEC), and doxorubicin and cyclophosphamide (AC) followed by paclitaxel were second generation.
Docetaxel, doxorubicin and cyclophosphamide (TAC), and FEC followed by docetaxel were categorized as third generation. Hormone therapy was categorized into five groups: tamoxifen, aromatase inhibitors (AI), tamoxifen followed by AI, ovarian ablation, and ovarian ablation plus tamoxifen. Vital status was obtained from the hospitals' medical records and ascertained by linkage to death registries in both countries. Patients diagnosed until 31st December 2011 were followed up from date of diagnosis until date of death or date of last follow-up, whichever came first.
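The receptor definitions and the follow-up rule described above can be written down as a small set of functions. The following Python sketch is illustrative only: the function names, argument names and string codes are hypothetical and do not come from the registry, and the logic encodes only the rules stated in the text (site-specific ER/PR cut-offs, FISH taking precedence over IHC, and follow-up ending at death or last follow-up, whichever came first).

```python
from datetime import date
from typing import Optional, Tuple


def hormone_receptor_status(percent_stained: Optional[float], site: str) -> str:
    """ER or PR status using the site-specific cut-off (1 % at NUH, 10 % at TTSH and UMMC)."""
    if percent_stained is None:
        return "unknown"
    cutoff = 1.0 if site == "NUH" else 10.0
    return "positive" if percent_stained >= cutoff else "negative"


def her2_status(fish: Optional[str], ihc: Optional[str]) -> str:
    """HER2 status from FISH, falling back to the IHC score when FISH was not performed."""
    if fish in ("positive", "negative"):
        return fish
    if ihc == "3+":
        return "positive"
    if ihc in ("0", "1+"):
        return "negative"
    if ihc == "2+":
        return "equivocal"
    return "unknown"


def follow_up(diagnosis: date, death: Optional[date], last_follow_up: date) -> Tuple[int, bool]:
    """Days from diagnosis to death or last follow-up (whichever came first), and a death flag."""
    died = death is not None and death <= last_follow_up
    end = death if died else last_follow_up
    return (end - diagnosis).days, died


# Hypothetical patient: IHC 2+ without FISH, alive at last follow-up.
print(her2_status(None, "2+"))
print(follow_up(date(2010, 1, 1), None, date(2013, 3, 1)))
```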

Methods
Date of last follow-up was 1st March 2013 for UMMC, 31st July 2013 for NUH, and 1st October 2012 for TTSH. Male patients and patients with unknown age at diagnosis or tumor size were excluded from this analysis, as these two were essential predictors for all four CancerMath calculators. The JavaScript code of all four CancerMath calculators, which contained predetermined parameters and mathematical equations, was exported on 9th Nov 2013 from its website by selecting "view-> source" in the browser menu. The script was then transcribed into an R script to allow calculation for a group of patients. For the nodal status calculator, the patient's age, tumor size, ER and PR status, histological type, and grade were used by the program to calculate the probability of positive nodes for each patient. Overall mortality risk at each year up to 15 years after diagnosis was predicted by the outcome calculator, based on age, tumor size, number of positive nodes, grade, histological type, ER, PR, and HER2 status. The effect of hormone therapy and chemotherapy regimens on overall mortality was further adjusted for by the therapy calculator, and the number of years since diagnosis was considered in the conditional survival calculator. Results from the R script and the website were cross-checked on a random subset of 20 patients to verify the accuracy of the R script. Histological type recorded as others was re-categorized as unknown. If HER2 status was equivocal based on IHC and FISH was not performed, HER2 status was treated as unknown. Evidence of recurrence was set as unknown for conditional survival calculation.
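The exclusion and recoding rules in this paragraph amount to a short preprocessing step before the calculators are run. A minimal Python sketch follows, assuming each patient is held as a dictionary; the key names and category strings are hypothetical, and the study itself used an R script whose details are not given here.

```python
from typing import Dict, List


def is_eligible(patient: Dict) -> bool:
    """Keep female patients with known age at diagnosis and known tumor size."""
    return (
        patient.get("sex") == "female"
        and patient.get("age") is not None
        and patient.get("tumor_size_mm") is not None
    )


def recode_for_calculators(patient: Dict) -> Dict:
    """Apply the recoding rules stated in the text; returns a modified copy."""
    recoded = dict(patient)
    if recoded.get("histology") == "others":
        recoded["histology"] = "unknown"
    # Equivocal means IHC 2+ without FISH, which is treated as unknown.
    if recoded.get("her2") == "equivocal":
        recoded["her2"] = "unknown"
    # Evidence of recurrence is set to unknown for the conditional survival calculator.
    recoded["recurrence"] = "unknown"
    return recoded


def prepare(patients: List[Dict]) -> List[Dict]:
    """Filter to eligible patients, then recode each one."""
    return [recode_for_calculators(p) for p in patients if is_eligible(p)]
```

Returning a copy rather than editing records in place keeps the registry values available for the descriptive tables.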
In total, 7064 female breast cancer patients were included. Patients with minimum 5-year follow-up (N = 4517) and 10-year follow-up (N = 1649) were selected for comparison of observed and predicted survival. As NUH and TTSH did not collect details of hormone therapy and chemotherapy regimens before 2006, the therapy calculator was only validated for UMMC patients with minimum 5-year follow-up (N = 1538).

Statistical analysis
Nodal status calculator
Observed and predicted probability of positive lymph nodes were compared. Calibration was assessed by dividing the data into deciles based on the predicted probability of positive nodes and then plotting the observed probability of positive nodes against means of predicted probability for each decile. A 45 degree diagonal line was plotted to illustrate perfect agreement. Discrimination of nodal status calculator was evaluated by area under the curve (AUC) in receiver operating characteristic analysis. A value of 0.5 indicates no discrimination and a value of 1.0 means perfect discrimination.
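The decile-based calibration points and the AUC described above can be computed with a few lines of code. The sketch below is in plain Python with hypothetical function names. It assumes binary outcomes coded 0/1 and computes the AUC through its rank interpretation (the probability that a randomly chosen node-positive patient has a higher predicted probability than a randomly chosen node-negative patient, with ties counted as one half), which is not necessarily how the study computed it.

```python
from typing import List, Sequence, Tuple


def calibration_points(
    predicted: Sequence[float], observed: Sequence[int], n_groups: int = 10
) -> List[Tuple[float, float]]:
    """Mean predicted probability and observed proportion per quantile group."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    points = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        points.append((mean_pred, obs_rate))
    return points


def auc(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [p for p, o in zip(predicted, observed) if o == 1]
    neg = [p for p, o in zip(predicted, observed) if o == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Points from `calibration_points` that fall above the 45 degree line indicate underestimation by the model. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slow for large registries.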

Outcome and therapy calculator
The ratio of observed to predicted number of deaths within 5 years and 10 years of diagnosis was calculated as the mortality ratio (MR), with 95 % confidence interval (CI) constructed by an exact procedure [25]. MR was also calculated for different subgroups by country, period of diagnosis, age, race, and other clinical characteristics. Observed 5-year and 10-year survival rates were compared with the median predicted survival from CancerMath. A difference of less than 3 % would be considered reliable enough for clinical use, as a 10-year survival benefit of 3-5 % is an indication for adjuvant chemotherapy [26]. The relationship between average 5-year and 10-year predicted survival and observed 5-year and 10-year survival was illustrated by the calibration plot. Discrimination of the outcome and therapy calculators was evaluated by AUC using the datasets with minimum 5-year and 10-year follow-up accordingly. The outcome calculator was further evaluated using the concordance index (c-index) proposed by Harrell et al. for the entire dataset regardless of follow-up time [27]. The c-index is the probability of correctly identifying the patient who survives longer within a random pair of patients [27]. As with the AUC, a c-index of 0.5 indicates no discrimination and a c-index of 1.0 means perfect discrimination.
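The two summary measures in this paragraph, the mortality ratio with an exact interval and the c-index, can be sketched as follows. The text does not specify which exact procedure was used, so this sketch assumes a Poisson-based exact interval for the observed death count with the predicted count treated as fixed. The c-index is a simple pairwise version that skips pairs with equal follow-up times. All function names are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def _poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu), with terms computed in log space."""
    if k < 0:
        return 0.0
    if mu <= 0.0:
        return 1.0
    total = sum(math.exp(-mu + i * math.log(mu) - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, total)


def _bisect_increasing(f: Callable[[float], float], hi: float) -> float:
    """Root of an increasing function f on (0, hi) by bisection."""
    lo = 0.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(observed: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided confidence limits for a Poisson mean (assumed procedure)."""
    hi = observed + 10.0 * math.sqrt(observed + 1.0) + 20.0
    lower = 0.0
    if observed > 0:
        lower = _bisect_increasing(
            lambda mu: (1.0 - _poisson_cdf(observed - 1, mu)) - alpha / 2.0, hi
        )
    upper = _bisect_increasing(lambda mu: alpha / 2.0 - _poisson_cdf(observed, mu), hi)
    return lower, upper


def mortality_ratio(observed: int, predicted: float) -> Tuple[float, float, float]:
    """Observed/predicted deaths with limits scaled by the (fixed) predicted count."""
    lower, upper = exact_poisson_ci(observed)
    return observed / predicted, lower / predicted, upper / predicted


def c_index(time: Sequence[float], died: Sequence[bool], pred_survival: Sequence[float]) -> float:
    """Pairwise concordance: the patient who died earlier should have lower predicted survival."""
    concordant, usable = 0.0, 0
    for i in range(len(time)):
        for j in range(len(time)):
            if died[i] and time[i] < time[j]:  # pair is usable only if the earlier time is a death
                usable += 1
                if pred_survival[i] < pred_survival[j]:
                    concordant += 1.0
                elif pred_survival[i] == pred_survival[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

An MR above 1 with a lower limit above 1 corresponds to significantly more deaths than predicted, that is, overprediction of survival.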

Conditional survival calculator
For patients who survived two years after diagnosis, predicted 5-year survival was compared with observed 5-year survival. Similarly, predicted 10-year survival was compared with observed 10-year survival for patients who survived 5 years and 7 years, respectively. Discrimination was evaluated by AUC.
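Conditional survival, as defined in the Background, is the probability of surviving to a later time given being alive at an earlier one, which for a survival curve S is S(t) / S(s). The short Python sketch below illustrates this standard relationship with made-up survival values. It is not a transcription of the CancerMath conditional survival calculator, whose internal equations are not reproduced in this paper, and the function name is hypothetical.

```python
from typing import Dict


def conditional_survival(survival_by_year: Dict[int, float], alive_at: int, target: int) -> float:
    """P(alive at `target` years | alive at `alive_at` years) = S(target) / S(alive_at)."""
    if not 0 <= alive_at <= target:
        raise ValueError("require 0 <= alive_at <= target")
    return survival_by_year[target] / survival_by_year[alive_at]


# Illustrative (made-up) predicted overall survival for one patient.
curve = {0: 1.0, 5: 0.90, 10: 0.72}
print(conditional_survival(curve, alive_at=5, target=10))
```

With these values, a patient with 72 % predicted 10-year survival at diagnosis has a higher predicted 10-year survival once she has already survived 5 years, because the early deaths are no longer counted.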

Results
In total, 7064 female breast cancer patients were included. Tables 1, 2, 3 and 4 present clinical characteristics of 6807 patients with nodal status, 4517 patients with minimum 5-year follow-up, 1649 patients with 10-year follow-up, and 1538 patients with detailed treatment data and a minimum of 5 years of follow-up, respectively.

Nodal status calculator
A total of 6807 patients with nodal status data were selected for validation of the nodal status calculator. In this dataset, 43.6 % of patients (n = 2970) (95 % CI, 42.5 %-44.8 %) had at least one positive lymph node and the median predicted probability was 40.6 %. CancerMath underestimated the probability of positive nodes for most of the subgroups (Table 1). The calibration plot (Fig. 1) also illustrated underestimation except for the last two deciles. The discrimination of this calculator was fair, with an AUC of 0.71 (95 % CI, 0.70-0.72).
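The observed proportion and its confidence interval can be checked from the counts given above. The paper does not state how this interval was constructed, so the Python sketch below assumes a normal-approximation (Wald) interval, which happens to reproduce the reported limits to one decimal place. The function name is hypothetical.

```python
import math
from typing import Tuple


def wald_interval(events: int, n: int, z: float = 1.959964) -> Tuple[float, float, float]:
    """Proportion with a normal-approximation 95 % confidence interval."""
    p = events / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, p - half_width, p + half_width


p, lower, upper = wald_interval(2970, 6807)
print(f"{100 * p:.1f} % (95 % CI, {100 * lower:.1f} %-{100 * upper:.1f} %)")
```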

Outcome calculator
The observed number of deaths within 5 years after diagnosis was significantly higher than the predicted   We also observed notable differences for cases diagnosed in an earlier period and at a younger age (Tables 2 and 3). In addition, CancerMath significantly overpredicted survival for patients with unfavorable prognostic characteristics such as large tumor size, more positive nodes, and ER-negative tumors. For those with relatively better predicted survival, CancerMath predictions were similar to the observed outcome (Fig. 2a, b and c). The difference between 5-year predicted and observed survival was 15 %, 3 % and 1 % for the first, fifth, and tenth deciles, respectively. The Kaplan-Meier curves of overall survival by quintiles of predicted 5-year survival are shown in Fig. 3. The difference in survival experience between the five groups was statistically significant (p-value < 0.001 by the log-rank test). The AUC

Therapy calculator
For the therapy calculator, which was validated only in Malaysian patients, predicted survival was significantly higher than observed survival for almost all subgroups, except for those diagnosed recently and those with more favorable tumor characteristics (Table 4, Fig. 2d). The calculator showed fair discrimination for 5-year overall survival (AUC = 0.73, 95 % CI 0.70-0.77).

Discussion
Many prognostic tools have been developed over the past two decades to aid clinical decision making for breast cancer patients. This study validated four different prognostic calculators provided by CancerMath in the Singapore-Malaysia Hospital-Based Breast Cancer Registry. The discrimination was fair for the nodal status calculator. The CancerMath outcome, therapy and conditional survival calculators also moderately discriminated between survivors and non-survivors at 5 years and 10 years after diagnosis. However, they consistently overestimated survival for this cohort of Southeast Asian patients, especially for those with a poor prognostic profile.
CancerMath was previously built and validated using SEER data and patients diagnosed at Massachusetts General and Brigham and Women's Hospitals [19]. In the SEER database, 82.7 % of the invasive breast cancer cases diagnosed between 2003 and 2007 were white and only 6.9 % were Asian/Pacific Islander [28]. It was shown that the difference between observed and predicted survival was within 2 % for 97 % of the patients in the validation set [19]. Our study is the first to independently validate CancerMath outside the United States and is also the largest validation study of a western-derived breast cancer prognostic model in Asia. We demonstrated that CancerMath overpredicted survival by more than 3 % for almost all clinical and pathological subgroups. The findings were similar to previous validation studies of Adjuvant! Online conducted in Asia. In the Malaysian, Korean, and Taiwanese studies, the predicted and observed 10-year overall survival differed by 6.7 %, 11.1 %, and 3.9 %, respectively [16][17][18]. The   [16]. Furthermore, the prediction was too optimistic for young patients in almost all validation studies of Adjuvant! Online [12,[15][16][17]. Although an adjustment of a 1.5-fold increase in risk was added to Adjuvant! Online version 7.0 for patients younger than 36 years with ER positive breast cancer, overprediction was still found in recent validation studies [12,16,17]. Our findings from the current validation of CancerMath also suggest that correction for young age at diagnosis is needed.
The selection of patients for validation can partially explain the discrepancy between observed and predicted survival. CancerMath has only been validated among patients with tumor size no more than 50 mm and no more than seven positive nodes [29]. In our validation dataset, 10 % of patients had tumor size larger than 50 mm and 8 % had more than ten positive nodes. However, even for patients with tumor size between 20 mm and 50 mm and one to three positive nodes, the difference between the predicted and observed survival was more than 3 %. In general, Asian patients are more likely to present with unfavorable prognostic features such as young age, negative hormone receptor status, HER2 overexpression, and more advanced stage compared to their western counterparts [30][31][32]. In our current analysis, reduced agreement was observed for patients with poorer predicted outcome, especially for Malaysian patients, as illustrated by the calibration plot. In addition, the slope of the calibration plot for Malaysian patients was greater than 1 for the first three deciles, which suggested that the spread of the predicted survival was less than that of the observed survival. CancerMath's poorer performance in Malaysia might be explained by the higher proportion of patients in advanced stages and more heterogeneous prognosis in Malaysia. Such a limitation may restrict the use of CancerMath to patients with a better prognostic profile only. Furthermore, the CancerMath therapy calculator applies the same amount of risk reduction from adjuvant therapy as Adjuvant! Online, which was estimated from meta-analysis of clinical trials mainly conducted in western populations [9,19]. However, nonadherence to treatment is more common among Asian women [33][34][35]. Studies have also reported differences in drug metabolism and chemotherapy-induced toxicity between Asian and Caucasian patients [36]. This evidence may imply that CancerMath overestimates the effect of treatment in Asian patients.
Another possible explanation for the suboptimal performance of CancerMath, and also a limitation of our study, is missing data on ER (6 %), PR (15 %), HER2 status (47 %), and tumor grade (11 %). For patients with complete information on required predictors (N = 1872), the predicted and observed 5-year survival were 86.0 % and 82.5 %, respectively. The difference was similar to what we observed in the entire dataset. Therefore, the impact of missing data on the performance of CancerMath is relatively small.
Fig. 3 Kaplan-Meier curves of overall survival by quintiles of 5-year predicted survival from outcome calculator
Several gene expression profiling assays, such as MammaPrint [37] and Oncotype Dx [38], are currently available for breast cancer prognostication and treatment decisions. However, these tools do not incorporate clinicopathologic factors, which are readily available or relatively cheap to obtain. Due to the high cost of these tests and the larger proportion of patients with high predicted risk in Asia [39,40], their clinical utility is uncertain in this region. Therefore, traditional prognostic models using clinicopathologic factors seem more reasonable in our local setting.

Conclusions
In conclusion, CancerMath demonstrated modest discrimination and calibration among Southeast Asian patients. Our results suggest that CancerMath is more suitable for patients diagnosed with favorable disease.