Development and validation of nomograms for predicting axillary non-SLN metastases in breast cancer patients with 1–2 positive sentinel lymph node macro-metastases: a retrospective analysis of two independent cohorts

Background It is reported that approximately 50% of early breast cancer patients with 1–2 positive sentinel lymph node (SLN) micro-metastases could not benefit from axillary lymph node dissection (ALND) or breast-conserving surgery with whole breast irradiation. However, whether patients with 1–2 positive SLN macro-metastases could benefit from ALND remains unknown. The aim of our study was to develop and validate nomograms for assessing axillary non-SLN metastases in patients with 1–2 positive SLN macro-metastases, using their pathological features alone or in combination with serum tumor markers (STMs). Methods We retrospectively reviewed pathological features and STMs of 1150 early breast cancer patients from two independent cohorts. Best subset regression was used for feature selection and signature building. The risk score of axillary non-SLN metastases was calculated for each patient as a linear combination of the selected predictors weighted by their respective coefficients. Results The pathology-based nomogram possessed a strong discrimination ability for axillary non-SLN metastases, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.727 (95% CI: 0.682–0.771) in the primary cohort and 0.722 (95% CI: 0.653–0.792) in the validation cohort. The addition of CA 15–3 and CEA significantly improved the performance of the pathology-based nomogram in the primary cohort (AUC: 0.773 (0.732–0.815) vs. 0.727 (0.682–0.771), P < 0.001) and in the validation cohort (AUC: 0.777 (0.713–0.840) vs. 0.722 (0.653–0.792), P < 0.001). Decision curve analysis demonstrated that the nomograms were clinically useful. Conclusion The nomograms based on pathological features can be used to identify axillary non-SLN metastases in breast cancer patients with 1–2 positive SLN. In addition, the combination of STMs and pathological features can identify patients with axillary non-SLN metastases more accurately than pathological characteristics alone.


Keywords: Nomogram, Axillary non-SLN metastases, Pathological features, Serum tumor markers

Background
Breast cancer is the most common type of cancer in women and a leading cause of cancer-related death worldwide [1]. Sentinel lymph node biopsy (SLNB) is the standard treatment in early breast cancer patients with clinically negative axillary lymph nodes, and no further axillary treatment is required for sentinel lymph node (SLN) negative patients [2]. However, the optimal management of SLN positive patients remains controversial, since no more than half of these patients have axillary non-SLN metastases when axillary lymph node dissection (ALND) is performed [3,4]. To reduce unnecessary postoperative complications following ALND, breast-conserving surgery with whole breast irradiation has been recommended in patients with 1-2 positive SLN micro-metastases [5,6]. However, whether patients with 1-2 positive SLN macro-metastases could benefit from breast-conserving therapy and whole-breast radiotherapy remains controversial. Therefore, there is an urgent need to develop a nomogram for predicting the risk of non-SLN metastases in patients with 1-2 positive SLN macro-metastases.
Although the pathology-based MSKCC breast nomogram [7] has been widely used to estimate a patient's individual risk of non-SLN metastases, its accuracy varies greatly among populations (with AUCs ranging from 0.58 to 0.86) [8][9][10], and it has not been validated in Chinese breast cancer patients. Wang et al. also reported that pathologic invasive tumor size, number of positive SLNs and ALN status on imaging were associated with non-SLN metastases in patients with 1-2 SLN macro-metastases, but the number of clinicopathological features included and the sample size were relatively small. Serum tumor markers (STMs) have been reported to be associated with the prognosis, recurrence and therapeutic effect of breast cancer [11,12], whereas their predictive value for non-SLN metastases remains unknown. The aim of our study was to develop and validate nomograms for identifying patients at high risk of axillary non-SLN metastases using their pathological features alone or in combination with STMs.

Study design and patient cohort
The study was approved by the institutional ethics committee of People's Hospital of Zhengzhou University (Henan Provincial People's Hospital), and written informed consent was obtained from all participants in accordance with the Declaration of Helsinki. A primary cohort of 618 patients with histologically confirmed breast cancer between April 2016 and July 2020 at the Henan Provincial People's Hospital (Henan, China) was retrospectively analyzed. The inclusion criteria were: I) histologically confirmed infiltrating breast carcinoma; II) clinically negative axillary lymph nodes; III) pathologically confirmed 1-2 positive SLN macro-metastases; IV) completion of axillary lymph node dissection and histopathological assessment of the dissected lymph nodes. Furthermore, an independent validation cohort of 532 patients was screened using the same criteria between October 2016 and November 2019 at Ruzhou First People's Hospital (Henan, China). The workflow for establishing and validating our nomograms for predicting axillary non-SLN metastases in breast cancer patients with 1-2 positive SLN is shown in Fig. 1.

Data collection
The pathological information of all eligible breast cancer patients was obtained from their medical records, including age, number of tumor lesions, tumor grade, histological type, T stage, number of positive SLN, number of negative SLN, lymphovascular invasion, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER-2) and Ki-67. Tumors displaying ≥10% nuclear-stained cells were considered ER or PR positive. HER-2/neu immunohistochemical staining was scored from 0 to 3+; 3+ was considered positive, and 0 or 1+ was considered negative. Fluorescence in situ hybridization tests were performed for patients with HER-2 scored as 2+. Ki-67 positivity corresponded to ≥14% nuclear-stained tumor cells. Serum samples obtained within 1 week before surgery were analyzed for each patient for carcinoembryonic antigen (CEA), carbohydrate antigen (CA) 125, and CA 15-3. Pathological cut-off levels were set at 5 ng/ml for CEA, 35 U/ml for CA 125, and 32 U/ml for CA 15-3.
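The positivity rules above can be written down compactly. The following is a minimal Python sketch, not the authors' code; all function and variable names are hypothetical. Two details are assumptions because the text does not state them: a serum value exactly at the cut-off is treated as not elevated, and a HER-2 score of 2+ is resolved by the FISH result.

```python
# Thresholds quoted in the text; every name below is illustrative.
STM_CUTOFFS = {"CEA": 5.0, "CA125": 35.0, "CA15_3": 32.0}  # ng/ml, U/ml, U/ml


def receptor_positive(percent_stained, cutoff=10.0):
    """ER or PR: positive when >= 10% of tumor cells show nuclear staining."""
    return percent_stained >= cutoff


def ki67_positive(percent_stained, cutoff=14.0):
    """Ki-67: positive when >= 14% of tumor cells show nuclear staining."""
    return percent_stained >= cutoff


def her2_status(ihc_score, fish_amplified=None):
    """Map an IHC score (0-3) to a HER-2 status.

    Assumption: a 2+ score follows the FISH result; without one it is
    reported as 'equivocal'.
    """
    if ihc_score == 3:
        return "positive"
    if ihc_score in (0, 1):
        return "negative"
    if ihc_score == 2:
        if fish_amplified is None:
            return "equivocal"
        return "positive" if fish_amplified else "negative"
    raise ValueError("IHC score must be 0, 1, 2 or 3")


def elevated_stms(values, cutoffs=STM_CUTOFFS):
    """Flag each serum marker as elevated (assumed: strictly above cut-off)."""
    return {marker: values[marker] > cut for marker, cut in cutoffs.items()}


# Made-up serum values for one patient.
example = elevated_stms({"CEA": 6.2, "CA125": 12.0, "CA15_3": 32.0})
```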

Nomogram building
Best subset regression was used to simplify the prediction models and to limit the overfitting and multicollinearity that can arise from a large number of candidate variables. Logistic regression, one of the most commonly used methods for building a model that classifies patients into two groups, was used to establish the prediction model, and a nomogram was used as a practical tool to visualize its results. The final model was the one that minimized the Akaike information criterion (AIC) with the fewest variables.
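To make the selection rule concrete, here is a small Python sketch of best subset selection for logistic regression using AIC = 2k - 2 ln L, where k counts the fitted parameters including the intercept. It is an illustration under stated assumptions, not the authors' analysis: the model is fitted by plain gradient ascent rather than a statistics package, the data are a toy example, and the names fit_logistic and best_subset are hypothetical.

```python
import itertools
import math


def fit_logistic(X, y, iters=2000, lr=0.5):
    """Fit logistic regression with an intercept by gradient ascent.

    Returns (weights, log_likelihood); weights[0] is the intercept.
    """
    n = len(X)
    p = len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    ll = 0.0
    for xi, yi in zip(X, y):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
        prob = 1.0 / (1.0 + math.exp(-z))
        ll += yi * math.log(prob) + (1 - yi) * math.log(1.0 - prob)
    return w, ll


def best_subset(X, y):
    """Try every feature subset and keep the one with the lowest AIC.

    Smaller subsets are visited first and only a strictly lower AIC
    replaces the current best, so ties favour fewer variables.
    """
    p = len(X[0])
    best = None
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            Xs = [[row[j] for j in subset] for row in X]
            _, ll = fit_logistic(Xs, y)
            aic = 2 * (k + 1) - 2 * ll
            if best is None or aic < best[1] - 1e-9:
                best = (subset, aic)
    return best


# Toy data: feature 0 is informative, feature 1 is pure noise.
rows = []
for x0, outcome, count in [(1, 1, 16), (1, 0, 4), (0, 1, 4), (0, 0, 16)]:
    for i in range(count):
        rows.append(((x0, i % 2), outcome))
X = [r[0] for r in rows]
y = [r[1] for r in rows]
best_features, best_aic = best_subset(X, y)
```

On this toy data the noise feature adds a parameter without improving the likelihood, so the subset containing only feature 0 is kept. Exhaustive search grows as 2^p, which is manageable for the dozen or so candidate features described here but not for much larger sets.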

Performance validation of nomograms

Discrimination
Discrimination ability was quantified by the area under the receiver operating characteristic (ROC) curve (AUC). AUC ranges from 0 to 1, with 1 indicating perfect concordance, 0.5 indicating no better concordance than chance, and 0 indicating perfect discordance.
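The AUC has a direct interpretation: it is the probability that a randomly chosen patient with non-SLN metastases receives a higher predicted risk than a randomly chosen patient without, with ties counted as one half. A minimal Python sketch of that pairwise definition follows; the function name auc is illustrative and the quadratic loop is chosen for clarity rather than speed.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (event) or 0 (no event); tied scores count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


perfect = auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])      # every pair concordant
reversed_ = auc([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0])    # every pair discordant
chance = auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])       # all ties
```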

Calibration
Calibration curves were plotted to assess the calibration of the nomogram [13]. Each plot consisted of two lines: a 45-degree reference line representing perfect calibration, and a line representing the observed performance of the nomogram. The distance between the two lines reflected the accuracy of the nomogram. The Hosmer-Lemeshow test was used to evaluate the calibration of the prediction model; a significant test statistic implies that the model is not perfectly calibrated.
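For readers unfamiliar with the Hosmer-Lemeshow test, the sketch below computes its statistic in Python: patients are sorted by predicted probability, split into groups, and observed and expected event counts are compared within each group. This is an illustrative implementation, not the authors' code. The grouping into ten equal-sized bins and the g - 2 degrees of freedom are the conventional choices and are assumed here; the helper chi2_sf_even_df is a hypothetical name and uses the closed-form chi-square tail that holds only for even degrees of freedom.

```python
import math


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (statistic, degrees_of_freedom) for the Hosmer-Lemeshow test."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(o for _, o in chunk)
        mean_p = expected / size
        if 0.0 < mean_p < 1.0:
            stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Ten patients all predicted at 0.5, but every one has the event.
stat, _ = hosmer_lemeshow([0.5] * 10, [1] * 10, groups=1)
```

With the default ten groups the degrees of freedom are eight, so chi2_sf_even_df(stat, 8) gives the P value; a large P value, such as those reported for the nomograms, indicates no detectable departure from good calibration.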

Clinical usefulness
Decision curve analysis was conducted to determine the clinical usefulness of the nomograms by quantifying the net benefits at different threshold probabilities in the primary and validation cohorts [14,15].
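The net benefit plotted in a decision curve is, at a threshold probability pt, the true-positive rate per patient minus the false-positive rate per patient weighted by pt / (1 - pt). The Python sketch below shows this standard formula together with the "treat all" reference strategy; "treat none" always has a net benefit of zero. Function names and the example data are illustrative.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 1)
    fp = sum(1 for p, o in zip(probs, outcomes) if p >= threshold and o == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of assuming every patient has the event."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Made-up predictions for four patients.
probs = [0.9, 0.8, 0.3, 0.1]
outcomes = [1, 1, 0, 0]
nb_model = net_benefit(probs, outcomes, 0.5)
nb_all = net_benefit_treat_all(outcomes, 0.5)
```

A model is considered useful over the range of thresholds where its curve lies above both reference strategies, which is how the threshold ranges in the Results are read.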

Statistical analysis
Continuous data with a normal distribution were expressed as mean (SD), while discrete variables were expressed as count (%). All tests were two-sided, and a P value < 0.05 was considered statistically significant. Continuous variables were transformed into binary variables by applying the inflexion points of ROC curves as the cut-offs. Differences in continuous data between patients with and without axillary non-SLN metastases were analyzed using Student's t-tests. Chi-square tests or Fisher exact tests were used to examine the association of categorical data between these two groups. Statistical analysis was conducted with STATA 15.0 (Stata Corp, Texas, USA) and RStudio software (Version 4.0.2, https://www.R-project.org).
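The text does not define how the inflexion point of a ROC curve was located. One common reading, assumed here and not confirmed by the source, is the threshold that maximizes Youden's J = sensitivity + specificity - 1. The Python sketch below illustrates that reading only; the function name and the data are made up.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A value is classified positive when it is >= the cutoff.
    Assumption: this stands in for the 'inflexion point' named in the text.
    """
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        sensitivity = sum(1 for v in pos if v >= cut) / len(pos)
        specificity = sum(1 for v in neg if v < cut) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Made-up marker values: the two groups separate cleanly at 10.
cut, j = youden_cutoff([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
```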

Clinical characteristics
In total, 1150 well-documented patients were included. The demographic and clinical features of these patients are shown in Table 1. Of these, 618 patients in the primary cohort (age range 27 to 86 years) and 532 patients in the validation cohort (age range 26 to 89 years) met the inclusion criteria. There was no significant difference in the incidence

Clinicopathological features selection and nomogram building
Among the twelve clinicopathological features in the primary cohort, five variables were finally selected as predictive factors to develop the prediction model: number of negative SLN, number of positive SLN, number of tumor lesions, tumor grade and lymphovascular invasion (Table 2). Using the regression coefficients of the multivariate logistic regression model to weight each feature, we developed a risk score formula to predict axillary non-SLN metastases: risk score = − To provide clinicians with a quantitative method for predicting the individual probability of axillary non-SLN metastases, we built a nomogram based on the selected clinicopathological features (Fig. 2a).
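A risk score of this kind is the linear predictor of a logistic model, and the nomogram converts it to a probability through the logistic function, probability = 1 / (1 + exp(-score)). The Python sketch below shows that conversion. The intercept and coefficients are HYPOTHETICAL placeholders chosen only to make the example run; they are not the fitted values of this study, which are not reproduced in the text above, and the way each feature is coded is likewise an assumption.

```python
import math

# HYPOTHETICAL values for illustration only; not the published model.
INTERCEPT = -1.0
COEFS = {
    "positive_sln": 0.8,
    "negative_sln": -0.3,
    "tumor_lesions": 0.5,
    "tumor_grade": 0.4,
    "lymphovascular_invasion": 0.9,
}


def risk_score(features):
    """Linear combination of predictors weighted by their coefficients."""
    return INTERCEPT + sum(COEFS[name] * features[name] for name in COEFS)


def predicted_probability(score):
    """Logistic transform of the risk score."""
    return 1.0 / (1.0 + math.exp(-score))


# Made-up patient; the coding of each feature is assumed.
patient = {
    "positive_sln": 2,
    "negative_sln": 1,
    "tumor_lesions": 1,
    "tumor_grade": 2,
    "lymphovascular_invasion": 1,
}
score = risk_score(patient)
prob = predicted_probability(score)
```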

Performance of the pathology-based nomogram
Internal performance
The calibration curve of the nomogram showed good agreement between prediction and observation in the primary cohort (Fig. 3a). The Hosmer-Lemeshow test yielded a nonsignificant statistic (P = 0.948), which suggested that there was no departure from perfect fit. In addition, a strong discrimination ability with an AUC of 0.727 (95% CI: 0.682-0.771) was observed in the primary cohort (Fig. 4a). The decision curve revealed that, if the threshold probability of a patient ranges from 0.09 to 0.64, using the nomogram to predict axillary non-SLN metastases would add more benefit than assuming that all patients or no patients had non-SLN metastases (Fig. 5a).

Independent validation
To determine whether the nomogram derived from the primary cohort was robust, we measured its performance in an independent validation cohort. The predictive score of each patient in the validation cohort was calculated by weighting their predictors with the respective regression coefficients. In line with the results in the primary cohort, good calibration was also observed in the validation cohort (Fig. 3b). In addition, the ROC curve yielded an AUC of 0.722 (95% CI: 0.653-0.792) (Fig. 4b), and the decision curve indicated more net benefit when the threshold probability ranged from 0.04 to 0.82 (Fig. 5b).

Incremental predictive value of STMs for the pathology-based nomogram
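To investigate the potential predictive value of STMs for axillary non-SLN metastases, the best subset regression was repeated with STMs included, and CA 15-3 and CEA were added to the pathology-based nomogram. The calibration curves of the combined nomogram showed good agreement between prediction and observation in the primary (Fig. 3c, P = 0.960) and validation cohorts (Fig. 3d, P = 0.853). ROC analysis was further performed to compare the discrimination ability of the two nomograms. As shown in Fig. 4a and b, the addition of CA 15-3 and CEA significantly improved discrimination in the primary cohort (AUC: 0.773 (95% CI: 0.732-0.815) vs. 0.727 (95% CI: 0.682-0.771), P < 0.001) and in the validation cohort (AUC: 0.777 (95% CI: 0.713-0.840) vs. 0.722 (95% CI: 0.653-0.792), P < 0.001). Although the decision curves of the two nomograms overlapped in places, the addition of CA 15-3 and CEA brought more net benefit to the pathology-based nomogram within the threshold probability range of 0.17-0.64 in both cohorts (Fig. 5a and b).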

Discussion
Using data from 1150 early breast cancer patients in two independent cohorts, our study confirmed that the pathology-based nomogram possessed a strong discrimination ability for axillary non-SLN metastases in Chinese breast cancer patients with 1-2 positive SLN. In addition, our study is the first to explore the predictive value of pathological features in combination with STMs for axillary non-SLN metastases. The nomograms obtained in our study may greatly help clinicians to predict the risk of axillary non-SLN metastases and thereby provide evidence to guide clinical decision-making on the radiation field.
The MSKCC nomogram, which is based on eight pathological features (number of tumor lesions, tumor size, tumor grade, number of positive SLN, number of negative SLN, detection method of SLN, lymphovascular invasion and ER status), has been the most widely used model for predicting axillary non-SLN metastases [7]. However, its predictive value varies greatly among populations. Degnim et al. reported that the MSKCC nomogram possessed a strong discrimination ability with an AUC of 0.86 [8], but Klar et al. reported an AUC of only 0.58 [9]. These differences among populations may be related to the detection methods of SLN and the evaluation criteria of pathological features.
The results of our study supported the conclusion that number of tumor lesions, tumor grade, lymphovascular invasion, number of positive SLN and number of negative SLN were independent risk factors for axillary non-SLN metastases. Previous studies have reported that the number of tumor lesions was significantly associated with axillary non-SLN metastases, but not with the SLN positive rate [16][17][18]. A possible explanation is that lymph containing tumor cells drains from multiple sites to the ipsilateral axilla, leading to a higher false negative rate of SLNB in the multifocal group than in the unifocal group. High tumor grade [19,20] and lymphovascular invasion [21,22] have long been considered to be associated with non-SLN metastases because of their link with tumor aggressiveness. Number of positive SLN, number of negative SLN and the ratio of negative to positive SLN have also been reported to be independent predictors of axillary non-SLN metastases [7,23]. However, tumor size and ER status were not found to be correlated with non-SLN metastases in our study. Chen et al. [24] and Abdessalam et al. [25] also reported no significant correlation between tumor size and non-SLN metastases. Although several studies have reported that the risk of axillary non-SLN metastases is higher in ER-positive breast cancer patients [26,27], increasing evidence suggests that there is no significant difference between ER-positive and ER-negative patients [28][29][30]. A possible reason is that the included patients and the methods for evaluating ER positivity differ among institutions.
Although previous studies have demonstrated that preoperative STMs are important prognostic factors in breast cancer patients, the predictive value of STMs in combination with pathological features for axillary non-SLN metastases in patients with 1-2 positive SLN macro-metastases remains unknown [11,12]. Li et al. reported that the preoperative serum CEA level could be an independent prognostic factor for overall survival, and that nomograms including it would provide more individualized predictions to optimize treatment for young breast cancer patients [11]. Wang et al. reported that elevated serum CEA and CA 15-3 are significantly associated with bone metastases of breast cancer [31]. In line with these findings, the results of our study showed that breast cancer patients with positive axillary non-SLN are prone to have elevated serum CEA and CA 15-3. In addition, the performance of the pathology-based model was significantly improved by the addition of CEA and CA 15-3.

Conclusions
To the best of our knowledge, few studies have assessed the validity of the MSKCC breast nomogram in Chinese breast cancer patients. This study therefore addresses that gap to some extent. In addition, our study is the first to investigate the predictive value of STMs for axillary non-SLN metastasis.
However, this study has a few limitations. First, it has the inherent defects of retrospective and cross-sectional studies, such as patient inclusion and sample selection biases. Second, some features, such as imaging examinations and some other detailed laboratory examinations, were not well documented in our breast cancer database, which led to the exclusion of some potential predictors to ensure data authenticity and integrity.