Nomogram for predicting preoperative regional lymph nodes metastasis in patients with metaplastic breast cancer: a SEER population-based study

Background: Metaplastic breast cancer (MBC) is a rare subtype of breast cancer and is generally associated with poor outcomes. Lymph node metastasis (LNM) is a critical independent prognostic factor and helps determine the optimal treatment strategy in MBC patients. We aimed to develop and validate a nomogram to predict the probability of regional LNM preoperatively in MBC patients.
Methods: MBC patients diagnosed between 1990 and 2016 in the Surveillance, Epidemiology, and End Results (SEER) database were included and randomly divided into a training set and a validation set at a ratio of 7:3. Risk factors for regional LNM in the training set were identified by univariate and multivariate logistic regression analyses and then integrated to construct the nomogram. The nomogram was further verified in the validation set. Its discrimination, calibration and clinical utility were evaluated by the area under the receiver operating characteristic (ROC) curve (AUC), calibration plots and decision curve analysis (DCA), respectively.
Results: A total of 2205 female MBC patients were included, of whom 24.8% (546/2205) had positive regional lymph nodes. The nomogram for predicting the risk of regional LNM contained three predictors (grade, estrogen receptor (ER) status and tumor size), with AUCs of 0.683 (95% confidence interval (CI): 0.653–0.713) and 0.667 (95% CI: 0.621–0.712) in the training and validation sets, respectively. Calibration plots showed good agreement between actual and predicted regional LNM risks, and DCA indicated that the nomogram was clinically useful.
Conclusions: The nomogram established in this study showed moderate discrimination and good calibration, and could be used to preoperatively estimate the regional LNM risk in MBC.


Background
Metaplastic breast cancer (MBC) is a rare neoplasm, accounting for approximately 0.02-5% of breast cancer (BC) cases [1,2]. It is characterized by the presence of two or more histological components, usually a mixture of epithelial (e.g., adenocarcinoma) and mesenchymal (e.g., matrix, spindle cell, and sarcomatous) components [3,4]. MBC was first described by Huvos et al. in 1973 [5] and was recognized as a distinct histologic subtype in the 2000 World Health Organization (WHO) guidelines for histologic classification of tumors of the breast.
MBC is associated with a poor prognosis, and lymph node metastasis (LNM) is an important prognostic determinant for patients with BC. Accurate preoperative evaluation of regional LNM is critical for determining the optimal treatment strategy for MBC patients. Neoadjuvant systemic therapy (NAST) has several potential advantages, including downstaging the breast tumor and axilla, improving prognostication based on response, and increasing the chances of breast-conserving surgery. Hence, it is increasingly used in patients with clinically node-positive BC [6]. Moreover, patients with extensive axillary nodal involvement who are planned for irradiation of regional lymph nodes may benefit from a reduced probability of locoregional recurrence [7,8]. Currently, sentinel lymph node biopsy (SLNB) is performed in all clinically node-negative patients to assess axillary lymph node status, and modified radical mastectomy with axillary lymph node dissection (ALND) remains one of the most effective surgical methods for patients with locally advanced BC without distant metastasis.
However, unlike in other types of BC, the incidence of LNM in MBC is low. The largest study showed that the incidence of axillary lymph node (ALN) involvement in MBC was 21.9%, significantly lower than the 34.3% seen in infiltrating ductal carcinoma (IDC) [1]. Wargotz et al. also found that the ALN metastasis rate in MBC ranged from 6 to 26% [9]. In addition, in reviewing data from previous studies, we found that sentinel LNM occurred in only 30% of BC patients with clinically negative lymph nodes, and that SLNB had an inherent false-negative rate of 5-10% [10,11]. In many BC patients, histopathological examination of the dissected ALNs revealed no metastasis [12]. Furthermore, owing to advances in treatment, the overall breast cancer death rate has decreased rapidly [13], and health-related quality of life has become increasingly important. However, SLNB and ALND are both associated with a risk of complications, including breast cancer related lymphedema (BCRL) [14], axillary web syndrome [15], numbness, paraesthesia [10,16], reduced range of motion, upper limb pain [17-19] and cancer-related fatigue [20,21]. Notably, approximately 20% of breast cancer survivors (BCS) will develop BCRL, which is a lifelong threat due to its protracted time of onset [22]. Cancer-related fatigue is also a very common long-term side effect in BCS: a previous meta-analysis involving 12,327 BCS indicated that about a quarter suffered from severe fatigue [23]. Health-related quality of life and psychological health are severely impaired in patients with BC [24,25]. For these reasons, MBC patients without LNM may not require SLNB or ALND, and omitting these procedures would avoid overtreatment. Adequate and accurate preoperative assessment and prediction of regional lymph node status are therefore very important: if lymph node status can be predicted before surgery, lymph node negative patients could avoid unnecessary treatment.
At present, mammography, ultrasonography, computed tomography, and magnetic resonance imaging are the main methods used to evaluate lymph node status [26-29]. Nevertheless, imaging appearance alone is insufficient to screen and evaluate the lymph node status of MBC patients. More importantly, MBC is difficult to identify preoperatively on biopsy [30,31]. Recently, nomograms for predicting the probability of LNM preoperatively have proven effective and are widely used. To date, however, there is no nomogram to predict preoperative regional LNM in patients with MBC. Hence, we retrospectively analyzed the clinical characteristics of a large cohort of female MBC patients, aiming to establish an easy, reliable and sensitive clinical risk factor model to predict the risk of regional LNM before surgery.

Patient selection and data collection
This retrospective study was based on the SEER program. The SEER database is an open-access resource for cancer-based clinical data, so no ethics committee approval was needed. We included patients diagnosed with microscopically confirmed MBC between 1990 and 2016. Only patients whose sole malignancy was MBC were included. The ICD-O-3 codes included in this study were 8052, 8070-8072, 8074, 8560, 8571, 8572, 8575, and 8980, based on previously published studies [32,33]. The following clinicopathological factors were extracted from the SEER database: age at diagnosis, gender, race, marital status, grade, laterality, estrogen receptor (ER) status, progesterone receptor (PR) status, human epidermal growth factor receptor-2 (HER-2) status, tumor size and regional lymph node status. We excluded male patients and patients with unknown regional lymph node status, laterality, ER status, PR status, race, tumor size, stage or grade.

Statistical analysis
The chi-square test was used to compare categorical variables between the training set and the validation set.
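As an illustration of this comparison, the following minimal Python sketch computes a Pearson chi-square test for a 2x2 table, applied to the regional LNM counts reported for the two sets (379/1543 and 167/662). It assumes no continuity correction; the study ran its tests in SPSS and does not state which variant was used, and the function name `chi_square_2x2` is introduced here for illustration only.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rows: training set, validation set. Columns: node-positive, node-negative.
stat, p_value = chi_square_2x2(379, 1543 - 379, 167, 662 - 167)
print(f"chi-square = {stat:.3f}, P = {p_value:.3f}")
```

Under these assumptions the P-value is well above 0.05, in line with the reported absence of a significant difference in regional lymph node status between the two sets.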
Univariable and multivariable binary logistic regression analyses were used to identify factors associated with regional LNM. Variables with P < 0.05 were included in the nomogram. The accuracy of the nomogram was evaluated in terms of discrimination and calibration. Discrimination was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). Calibration, visualized as a calibration plot, was used to illustrate the agreement between the actual probability and the predicted probability of regional LNM. Clinical usefulness was estimated with decision curve analysis (DCA). Analyses were conducted with SPSS (version 18.0; SPSS, Inc., Chicago, IL) and R software version 4.0.2 (http://www.r-project.org) with packages including rms, Hmisc and rmda. A two-sided P-value less than 0.05 was considered statistically significant.
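To make the discrimination measure concrete, the sketch below computes the AUC through its rank interpretation: the probability that a randomly chosen node-positive patient receives a higher predicted risk than a randomly chosen node-negative patient, with ties counted as one half. This is not the authors' R code; the function name `auc` and the toy scores and labels are hypothetical and serve only as an illustration.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted probabilities and node status (1 = regional LNM), illustration only.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of node-positive from node-negative patients.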

Patient characteristics
In this study, we included 2205 female MBC patients diagnosed from 1990 to 2016. The patient selection process is shown in Fig. 1. Table 1 summarizes the clinicopathological features of the 2205 patients in detail. Most patients were married white women. Tumors were poorly differentiated or undifferentiated in 81.8% (1803/2205) of patients, and 74.1% (1633/2205) of patients were diagnosed with a tumor larger than 2 cm. Most of the tumors lacked ER, PR, and HER-2 expression. Regional LNM was confirmed in 24.8% (546/2205) of patients.
The 2205 MBC patients were randomly divided into a training set (n = 1543) and a validation set (n = 662). The regional LNM rates in the training set and the validation set were 24.6% (379/1543) and 25.2% (167/662), respectively. There were no significant differences in patient age, race, marital status, grade, laterality, receptor (ER, PR and HER-2) status, tumor size and regional lymph node status between the training set and the validation set (P > 0.05).
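A random 7:3 division of this kind can be sketched as follows. The study does not report how the randomization was implemented, so the shuffling approach, the seed value, and the function name `split_train_validation` are assumptions made purely for illustration.

```python
import random

def split_train_validation(ids, train_fraction=0.7, seed=2020):
    """Shuffle the identifiers with a fixed seed and cut them at the training fraction."""
    rng = random.Random(seed)  # fixed seed (hypothetical value) keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical patient identifiers 0..2204 standing in for the 2205 SEER records.
patient_ids = range(2205)
train_ids, validation_ids = split_train_validation(patient_ids)
print(len(train_ids), len(validation_ids))
```

Rounding the training size down reproduces the reported set sizes of 1543 and 662 patients.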

Nomogram construction and validation
A nomogram to predict preoperative regional LNM was established in the training set. Binary logistic regression analyses indicated that grade, ER status and tumor size were independent predictive factors of regional LNM in MBC patients. Therefore, we integrated those three variables to construct the nomogram (Fig. 2). The AUCs in the training and validation sets were 0.683 (95% CI: 0.653-0.713) and 0.667 (95% CI: 0.621-0.712), respectively, indicating moderate ability of the nomogram to discriminate regional LNM status. More importantly, the AUC of each predictor alone was lower than the AUC of the nomogram in both the training set (Fig. 3a) and the validation set (Fig. 3b). Furthermore, the calibration plot with 1000 bootstrap repetitions showed good agreement between the actual and predicted probabilities of regional LNM in both the training set (Fig. 4a) and the validation set (Fig. 4b).

Clinical utility of the nomogram
DCA was performed to evaluate the clinical utility of the nomogram based on net benefits at different threshold probabilities. The nomogram yielded a larger net benefit than grade, ER status or tumor size alone, which indicated that the nomogram was a reliable clinical tool for predicting regional LNM in MBC patients (Fig. 5).
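The net benefit underlying DCA can be written as the true-positive rate minus the false-positive rate weighted by the odds of the threshold probability. The sketch below uses this standard definition; it is not the rmda code used in the study, and the function names and the 20% threshold are hypothetical choices for illustration. The "treat all" reference is computed from the reported cohort counts (546 node-positive among 2205 patients).

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit = TP/n - (FP/n) * pt / (1 - pt) at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)
    return true_positives / n - (false_positives / n) * weight

def net_benefit_of_model(probabilities, labels, threshold):
    """Class patients as high risk when predicted probability >= threshold, then score."""
    pairs = list(zip(probabilities, labels))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return net_benefit(tp, fp, len(pairs), threshold)

# "Treat all" reference at a hypothetical 20% threshold: every patient is classed high risk.
treat_all_at_20 = net_benefit(546, 2205 - 546, 2205, 0.20)
# "Treat none" reference: nobody is classed high risk, so the net benefit is zero.
treat_none = 0.0
print(round(treat_all_at_20, 4), treat_none)
```

A model is clinically useful at a given threshold when its net benefit exceeds both reference strategies, which is the comparison a decision curve displays across thresholds.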

Discussion
MBC is a rare histological subtype of BC that most commonly presents with a triple-negative phenotype [34,35]. Compared with invasive ductal carcinoma, MBC is characterized by lower differentiation and larger tumor size. LNM is considered a significant negative prognostic factor and is vitally important for therapeutic decision-making in MBC patients. However, current preoperative imaging modalities do not have high sensitivity and specificity in the diagnosis of LNM. Moreover, MBC has a wide range of histological patterns and the metaplastic area can be extremely small, so it is difficult to identify by fine-needle aspiration or core biopsy before surgery [38]. Leyrer CM et al. reported that only 41% (46/113) of patients were identified preoperatively as having MBC on initial image-guided core biopsy [31]. A nomogram, as a simple and advanced prediction tool, can estimate individualized risk by integrating multiple clinicopathological characteristics [39]. Therefore, it is necessary to establish a simple and sensitive preoperative prediction model of regional LNM in MBC patients.
In our study, grade, ER status and tumor size were identified as independent predictors of regional LNM in patients with MBC, and these three clinicopathological variables were incorporated into a preoperative model for estimating regional LNM risk. To the best of our knowledge, this is the first population-based study to develop and validate a nomogram for predicting the preoperative individualized risk of regional LNM in MBC patients. In both the training set and the validation set, the AUC of the nomogram was higher than 0.6 and the calibration curves were close to the ideal 45° line, indicating moderate discrimination and good calibration. Nevertheless, discrimination and calibration alone are not sufficient, because they do not establish clinical utility. In addition, MBC has a low incidence rate, and most available studies are retrospective studies of small samples. Hence, we used DCA to estimate clinical usefulness, and the DCA curves revealed a greater net benefit for the established nomogram than for any single predictor. In other words, the nomogram can be used to estimate the regional lymph node status of MBC patients.
ER status has been reported in several studies as one of the most influential independent predictors of LNM. Gann PH et al. studied data from 18,025 breast carcinoma cases and suggested that tumors lacking ER had a significantly lower risk of LNM than tumors containing ER [40]. Additionally, Ye FG et al. reported that ER status was independently associated with the likelihood of LNM. Our results are consistent with these previous studies: 30.9% of women with ER-positive cancers were found to have positive regional lymph nodes, compared with 23.2% of women with ER-negative cancers (P = 0.006). Several studies have also demonstrated that histologic grade is related to LNM status. Kollias J et al. retrospectively analyzed the medical records of 2684 BC patients and showed that 29% of the patients with grade III cancers had positive lymph nodes, while the proportions for grade I and grade II cancers were 11% and 18%, respectively (P = 0.006) [42]. In addition, Bruno C et al. found that high pathological grade indicated high axillary nodal involvement in BC: the risk of axillary nodal involvement in grade III tumors was double that in grade I tumors (37.8% versus 18.3%) [43]. In our study, the highest positive rate of regional lymph nodes was observed in grade III MBC, which is consistent with these findings.
The relationship between tumor size and LNM in BC patients has been widely reported in previous studies. In 1989, Carter et al. showed that the incidence of LNM was approximately 31.1% (2591/8319) in BC patients with tumors smaller than 2 cm, while it was as high as 70.0% (1889/2698) in BC patients with tumors 5 cm or greater [44]. Moreover, in 2006, Wada et al. reported that nearly 50% (62/116) of BC patients with T2 tumors (> 2.0 cm) had positive non-sentinel lymph nodes [45]. Similarly, our study found that tumor size was an independent risk factor significantly associated with LNM in MBC patients. In our data, regional lymph nodes were positive in 12.2% (70/572) of tumors with a greatest dimension of 2 cm or less and in 42.6% (203/476) of tumors larger than 5 cm.
We constructed a nomogram based on a large population-based cohort for assessing the potential risk of regional LNM in MBC patients using grade, ER status, and tumor size. The established predictive model performed well and was based on easily available clinicopathological factors. Therefore, the preoperative risk of regional LNM can be conveniently estimated from the nomogram using readily accessible information.
Although the nomogram had good accuracy for LNM prediction in MBC patients, some potential limitations of our study should be noted. First, due to the nature of retrospective analyses, we could not exclude selection bias. For example, some patients were excluded due to missing data. Second, Ki-67 has been identified as an independent predictor of LNM in BC [46]. Unfortunately, it was not recorded in the SEER database, so we could not incorporate this important factor into our nomogram. Finally, both the training and validation sets came from the SEER database, which may lead to overfitting of the model. The nomogram needs to be validated in external cohorts from other institutions to demonstrate its reproducibility.
Fig. 4 Internal (a) and external (b) calibration plots of the nomogram for predicting regional LNM in patients with MBC
Fig. 5 Decision curve for prediction of regional LNM for MBC. Black line: assume no patient will have regional LNM; gray line: assume all patients will have regional LNM; orange line: binary decision rule based on ER status alone; green line: binary decision rule based on grade alone; blue line: binary decision rule based on tumor size alone; red line: decision based on nomogram. The x-axis and the y-axis show the threshold probability and the net benefit, respectively

Conclusion
In conclusion, through logistic regression analysis, we found that lower differentiation, ER positive status and larger tumor size were independent risk factors associated with regional LNM in MBC. Based on these clinical risk factors, we established the first nomogram that could accurately and easily predict the preoperative individualized risk of regional LNM for MBC patients, thus contributing to treatment decision making.