Individualized prediction of non-sentinel lymph node metastasis in Chinese breast cancer patients with ≥ 3 positive sentinel lymph nodes based on machine-learning algorithms
BMC Cancer volume 24, Article number: 1090 (2024)
Abstract
Background
Axillary lymph node dissection (ALND) is a standard procedure for early-stage breast cancer (BC) patients with three or more positive sentinel lymph nodes (SLNs). However, ALND can lead to significant postoperative complications without always providing additional clinical benefits. This study aims to develop machine-learning (ML) models to predict non-sentinel lymph node (non-SLN) metastasis in Chinese BC patients with three or more positive SLNs, potentially allowing the omission of ALND.
Methods
Data from 2217 BC patients who underwent SLN biopsy at Shantou University Medical College were analyzed, with 634 having positive SLNs. Patients were categorized into those with ≤ 2 positive SLNs and those with ≥ 3 positive SLNs. We applied nine ML algorithms to predict non-SLN metastasis. Model performance was evaluated using ROC curves, precision-recall curves, and calibration curves. Decision Curve Analysis (DCA) assessed the clinical utility of the models.
Results
The RF model showed superior predictive performance, achieving an AUC of 0.987 in the training set and 0.828 in the validation set. Key predictive features included size of positive SLNs, tumor size, number of SLNs, and ER status. In external validation, the RF model achieved an AUC of 0.870, demonstrating robust predictive capabilities.
Conclusion
The developed RF model accurately predicts non-SLN metastasis in BC patients with ≥ 3 positive SLNs, suggesting that ALND might be avoided in selected patients by applying additional axillary radiotherapy. This approach could reduce the incidence of postoperative complications and improve patient quality of life. Further validation in prospective clinical trials is warranted.
Introduction
At present, for patients with early-stage breast cancer (BC) and clinically negative axillary lymph nodes (ALNs), axillary sentinel lymph node biopsy (SLNB), as opposed to axillary lymph node dissection (ALND), demonstrates no significant difference in local disease control, disease-free survival (DFS), and overall survival (OS). However, SLNB can significantly mitigate numbness, sensory loss, shoulder joint dysfunction, and the incidence of upper limb lymphedema associated with ALND, as confirmed by several international multi-center clinical trials [1,2,3].
For patients with clinically negative ALNs and a low metastatic burden of sentinel lymph nodes (SLNs) ≤ 2 tumor metastases, evidence from multiple clinical trials, including ACOSOG Z0011, IBCSG 23–01, AMAROS, and OTOASOR, indicates that ALND can be safely avoided when high-tangential whole-breast irradiation (WBI) is added after breast-conserving surgery or when additional axillary regional nodal irradiation (RNI) is included after total mastectomy, without affecting regional recurrence rates and OS. This approach has been widely accepted and applied in clinical practice to reduce the incidence of complications such as lymphedema caused by ALND [4,5,6,7,8]. However, ALND is still performed in some breast centers under these circumstances. Studies show that only about 23–34% of non-SLN have metastasis, meaning that in 66–77% of patients without non-SLN metastasis, ALND provides no benefit while increasing complications when combined with RNI [4,5,6,7]. As a result, some prediction models have been developed to predict the status of non-SLN metastasis based on SLN and clinicopathological features. Patients accurately predicted as Non-SLN negative by these models could even potentially be spared from RNI. These prediction models have been validated in clinical applications across multiple centers [9,10,11,12,13,14].
However, in early-stage BC and clinically negative axillary nodes, if there are ≥ 3 SLNs with metastasis, the incidence of non-SLN metastasis is considered to be significantly increased. Current guidelines and clinical practice recommend ALND in these cases, although there are limited data from separate studies on these patients in the real world. Subgroup data reported in studies show that patients with ≥ 3 positive SLNs account for about 10% of SLN-positive cases, and still, more than 30% of the non-SLNs show no metastasis during ALND [15]. For these patients, ALND does not alter the postoperative treatment plan nor provide additional benefits, suggesting that further ALND can be exempted. Alternatively, WBI and RNI without ALND may also reduce the complications affecting the upper limb. It remains clinically significant to establish a prediction model for patients with ≥ 3 positive SLNs to predict the metastasis status of non-SLNs as a means to exclude the necessity of ALND and to evaluate prognosis. The MonarchE trial demonstrated that early-stage patients with hormone receptor-positive, Human epidermal growth factor receptor 2 (HER2)-negative status, having ≥ 4 positive LNs or 1–3 positive LNs alongside other high-risk factors, experience sustained survival benefits from abemaciclib combined with standard adjuvant endocrine therapy compared to endocrine therapy alone [16]. Machine learning (ML) is an emerging field in medicine, encompassing a robust set of algorithms designed for data representation, adaptation, learning, prediction, and analysis. To date, these algorithms have not been employed to construct predictive models for non-SLN metastasis.
In this study, we examined DFS and OS, and performed univariate and multivariate Cox regression analyses in patients with ≥ 3 positive SLNs compared to those with ≤ 2 positive SLNs. We utilized ML algorithms to develop predictive models for non-SLN metastasis in patients with three or more positive SLNs and assessed the feasibility of adding WBI or RNI without the need for ALND.
Patients and methods
Patients
This was a retrospective study. From January 2010 to January 2023, a total of 2217 consecutive female patients diagnosed with primary invasive BC underwent SLNB at the Breast Center, Cancer Hospital of Shantou University Medical College (CHSU). Among these patients, 634 were found to have positive SLNs and met the following inclusion criteria: (1) negative clinical and imaging examinations, or negative histopathological results for suspicious ALNs on hollow-core needle biopsy, with tumors staged as cT1-3N0M0 according to the eighth edition of the American Joint Committee on Cancer (AJCC) staging manual; (2) no prior neoadjuvant therapy; (3) positive SLNs, including tumor micrometastases or macrometastases, identified after SLNB; (4) SLNB performed by an experienced surgical team; (5) acceptance of further ALND; (6) no history of previous malignancy; and (7) complete follow-up. Patients were excluded if they met any of the following criteria: (1) BC in situ; (2) stage IV BC; (3) isolated tumor cells (ITC) in SLNs; or (4) unavailable necessary clinical information. Patients were divided into the ≤ 2 positive SLN group and the ≥ 3 positive SLN group according to the number of positive SLNs for subsequent analyses. Additionally, we recruited 42 patients who met the aforementioned inclusion criteria and had ≥ 3 positive SLNs from Jieyang People's Hospital (JPH) as the external validation cohort. This study was approved by the Ethics Committees of CHSU (No. 2024038) and JPH (No. 2024054), and was conducted in accordance with the 1964 Helsinki Declaration and its subsequent amendments, or comparable ethical standards. Our Ethics Committees granted a waiver of informed consent.
Surgery and pathology
SLNB was performed using methylene blue (MB) injection (Jumpcan Pharmaceutical Group Co., Ltd., Jiangsu, China) and indocyanine green (ICG) solution (Dandong Yichuang Pharmaceutical Co., Ltd., Jilin, China). First, 2 mL MB was injected subcutaneously into the periareolar area near the outer upper quadrant, and 5 min later 1 mL ICG solution (0.5 mg/mL) was injected subcutaneously in the same area. Then, the fluorescence detector (Mingde Pharmaceutical Co., Ltd., Jiangsu, China) was used to follow the lymphatic vessels, and the point where the fluorescence disappeared was marked as the SLNB incision site. Palpable and/or fluorescent lymph nodes (ICG positive) and/or blue-stained lymph nodes (MB positive) were excised as SLNs. SLN metastasis was diagnosed by intraoperative frozen section or by postoperative paraffin section. We routinely performed level I or II ALND when macrometastasis or micrometastasis was found in more than 2 SLNs, or when it was found in 1 to 2 SLNs and the patient was not willing to receive additional axillary RNI. If lymph nodes in level II displayed metastases, we also performed level III ALND. After the operation, all specimens were paraffin-embedded for immunohistochemistry.
Patient clinicopathological characteristics
The clinicopathological variables included age, tumor location, tumor size, multifocality, histological type, lymphovascular invasion, extracapsular extension (ECE), histological grade, estrogen receptor (ER), progesterone receptor (PR), HER2 status, Ki-67, molecular subtype, number of SLNs, number of negative SLNs, number of positive SLNs, size of the SLN metastasis, surgery, chemotherapy, radiotherapy, and endocrinotherapy. ER and PR were judged as positive if ≥ 1% of tumor cells showed nuclear staining. HER2-positive status was defined as a score of 3+ on immunohistochemistry or HER2 gene amplification on fluorescent in situ hybridization (FISH). Ki-67 was assessed according to the 2011 'International Ki-67 in BC Working Group Recommendations': a level ≤ 14% was considered low expression and a level > 14% was considered high expression. Macrometastasis was defined as a metastatic lesion larger than 2 mm in diameter, and micrometastasis was defined as a metastatic lesion larger than 0.2 mm and no larger than 2.0 mm in diameter, or more than 200 tumor cells in the slice.
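The cut-offs above can be written as simple rules. The following is a minimal sketch rather than the study's code: the function names are hypothetical, the ≤ 0.2 mm boundary for isolated tumor cells is the usual AJCC convention and is an assumption here (the text does not define it), and the alternative criterion of more than 200 tumor cells is omitted for brevity.

```python
def classify_sln_deposit(size_mm: float) -> str:
    """Classify an SLN tumor deposit by its largest diameter in millimetres."""
    if size_mm <= 0:
        raise ValueError("size_mm must be positive")
    if size_mm > 2.0:
        return "macrometastasis"
    if size_mm > 0.2:
        return "micrometastasis"
    # Assumption: deposits of 0.2 mm or less are treated as isolated tumor cells.
    return "isolated tumor cells"


def receptor_positive(percent_stained_nuclei: float) -> bool:
    """ER/PR are positive when at least 1% of tumor cells show nuclear staining."""
    return percent_stained_nuclei >= 1.0


def ki67_level(percent: float) -> str:
    """Ki-67 of 14% or less is low expression; above 14% is high expression."""
    return "high" if percent > 14.0 else "low"
```

Note that a deposit of exactly 2.0 mm falls in the micrometastasis category under these rules, matching the "no larger than 2.0 mm" wording.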
Survival analysis
We utilized the Kaplan–Meier (K-M) method to illustrate survival curves between the two cohorts. Univariate and multivariate Cox regression analyses were performed to ascertain independent prognostic factors for survival. The study endpoints encompassed OS and DFS. The survival interval was delineated as the duration from the date of BC diagnosis to the date of disease progression or recurrence, death, or the last follow-up.
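For readers unfamiliar with the K-M estimator, a minimal sketch is given below. It is illustrative only: the study's analyses were run in R and Python packages that are not specified at this level of detail, and the follow-up values used here are hypothetical. At each distinct event time the running survival probability is multiplied by one minus the fraction of at-risk patients who had the event; censored patients only shrink the risk set.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up durations in any consistent unit
    events: 1 if the endpoint occurred at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    n_events = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = n_events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up in months; 0 marks a censored patient.
months = [6, 12, 12, 20, 30, 30, 42]
status = [1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(months, status)
```

Patients censored at the same time as an event are still counted as at risk for that event, which is the usual convention.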
Feature selection and model construction
The Boruta algorithm, a feature selection method based on Random Forest (RF), was employed to identify important features by comparing the original features against randomly generated "shadow features". For this purpose, Boruta version 8.0.0 was utilized. To predict non-SLN metastasis in patients with ≥ 3 positive SLNs, nine commonly used ML algorithms were applied: RF, logistic regression, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), adaptive boosting (AdaBoost), decision tree (DT), gradient boosting decision tree (GBDT), complement naive Bayes (CNB), and Support Vector Machine (SVM). To enhance the models' robustness, hyperparameters were tuned by grid search with ten-fold cross-validation. Patients from the CHSU were randomly divided into training and validation sets in a 7:3 ratio to select the most effective ML model. Additionally, patients from JPH served as an external validation cohort to further verify the generalizability of the optimal model.
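The data partitioning described above can be sketched as follows. This is an assumption-laden illustration, not the authors' pipeline: the function names, the seed, and the use of plain (unstratified) shuffling are hypothetical choices, and the hyperparameter grids are not given in the text. In a grid search, each candidate setting would be fitted on `fit_idx` and scored on `held_out_idx` for every fold, and the setting with the best average score would be kept.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Shuffle patient ids and split them into training and validation lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_samples, k=10):
    """Yield (fit_idx, held_out_idx) pairs; each sample is held out exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        held_out_idx = indices[start:start + size]
        fit_idx = indices[:start] + indices[start + size:]
        yield fit_idx, held_out_idx
        start += size


# 112 is the size of the >= 3 positive SLN cohort; the ids are placeholders.
train_ids, val_ids = split_train_validation(range(112))
folds = list(k_fold_indices(len(train_ids), k=10))
```

With 112 patients, a 7:3 split gives 78 training and 34 validation patients.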
Evaluation of ML models
The performance of the ML models was evaluated using a variety of metrics, including the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score, and Kappa. Performance was further assessed through precision-recall curves, which show the trade-off between precision and recall at various thresholds, and calibration curves, which compare the models' predicted probabilities against the observed event rates to evaluate bias and accuracy. Additionally, Decision Curve Analysis (DCA) was employed to ascertain the clinical utility of these models. Kolmogorov–Smirnov (K-S) curves, based on the cumulative distribution function, were used to analyze model performance across different thresholds, and confusion matrices were used to visualize the classification accuracy of the optimal model. Feature contributions were explained using SHapley Additive exPlanations (SHAP) values, calculated via the "SHAP" software package. Statistical analyses were conducted using R software version 4.2.1 (r-project.org) and Python version 3.8 (Python Software Foundation), with a significance threshold set at p < 0.05.
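The threshold-dependent metrics listed above all derive from the four cells of a 2×2 confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical and the counts are made-up illustrative values, not figures from this study's tables.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common classification metrics from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (accuracy - expected) / (1 - expected),
    }


# Hypothetical counts (positive class = non-SLN metastasis present).
metrics = binary_metrics(tp=54, fp=3, fn=2, tn=19)
```

In this setting the NPV is the clinically relevant quantity for omitting ALND, since it is the proportion of patients predicted non-SLN negative who truly are negative.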
Results
Clinicopathologic characteristics
This study evaluated 634 BC patients with positive SLNs, divided into two groups: 522 patients with SLNs ≤ 2 positive and 112 patients with SLNs ≥ 3 positive. As shown in Table 1, the mean tumor size for the group with SLNs ≥ 3 positive was 3.57 cm, significantly larger than the 3.10 cm observed in the SLNs ≤ 2 positive group (P < 0.001). Moreover, the mean number of SLNs was 4.79 for patients with SLNs ≥ 3 positive, compared to 3.13 for those with SLNs ≤ 2 positive (P < 0.001). Patients with SLNs ≥ 3 positive also had fewer negative SLNs, averaging 1.28, versus 1.86 in the SLNs ≤ 2 positive group (P < 0.001). Furthermore, the proportions of ER and PR positivity were significantly higher in the SLNs ≤ 2 positive group, while the rates of HER2 positivity and high Ki-67 expression were significantly lower than in the SLNs ≥ 3 positive group. In terms of treatment, the SLNs ≤ 2 positive group had a higher rate of breast-conserving surgery, while the SLNs ≥ 3 positive group received more chemotherapy, radiotherapy, and endocrinotherapy.
Survival analysis
To evaluate the survival disparities between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive, Kaplan–Meier survival analysis was conducted. This analysis revealed no significant differences in OS and DFS between the two cohorts (OS: P = 0.129; DFS: P = 0.228, Fig. 1A and B). Additionally, univariate and multivariate Cox regression analyses were performed to identify independent predictors of survival. Initial multicollinearity tests indicated high collinearity, as the generalized variance inflation factors (GVIFs) for molecular subtype, ER, and endocrinotherapy exceeded 5 (Table S1). Therefore, molecular subtype and endocrinotherapy were excluded from further analysis. Univariate and multivariate Cox regression analyses then identified tumor size as an independent risk factor for OS, while G2 was an independent protective factor for OS. Regarding DFS, the number of SLNs was an independent risk factor, whereas G2, the number of negative SLNs, and radiotherapy were independent protective factors (Table 2). Crucially, these analyses confirmed the absence of statistically significant survival differences between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive. However, forest plots showed survival differences within receptor subgroups between patients with SLNs ≤ 2 positive and those with SLNs ≥ 3 positive (Fig. 1C and D). OS for the SLNs ≤ 2 positive group was superior to that of the SLNs ≥ 3 positive group in the ER-positive and PR-positive subgroups. Furthermore, DFS for the SLNs ≤ 2 positive group was better than that for the SLNs ≥ 3 positive group in the ER-positive, PR-positive, and HER2-negative subgroups.
Clinical characteristics and selection of features in patients with SLNs ≥ 3 positive
Among the 112 patients with ≥ 3 positive SLNs, 25% (28/112) did not have non-SLN metastases (Table 3). The incidence of micrometastases in SLNs was significantly higher in patients without non-SLN metastases than in those with such metastases (P < 0.001). To better identify patients with ≥ 3 positive SLNs who did not have non-SLN metastases, we developed predictive models using nine ML algorithms. Initial feature correlation analysis revealed that the number of negative SLNs had a correlation coefficient above 0.7 with another feature, indicating significant multicollinearity, as did ER and PR status with each other (Fig. 2A). Consequently, the number of negative SLNs and PR status were excluded from further analysis. The Boruta algorithm identified four features as significant predictors: size of positive SLNs, tumor size, number of SLNs, and ER status (Fig. 2B).
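The correlation screen described above can be illustrated with a short sketch. It assumes Pearson correlation and a cut-off of 0.7 on the absolute value; the text does not state which coefficient was used, and the feature names and patient values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def highly_correlated_pairs(columns, cutoff=0.7):
    """Return (feature_a, feature_b, r) for pairs with |r| above the cutoff."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > cutoff:
                flagged.append((a, b, r))
    return flagged


# Hypothetical values for six patients.
columns = {
    "n_sln": [3, 4, 5, 6, 7, 8],
    "n_negative_sln": [0, 1, 1, 2, 3, 5],
    "tumor_cm": [3.5, 2.0, 4.1, 2.8, 3.9, 3.0],
}
pairs = highly_correlated_pairs(columns)
```

For each flagged pair, one member is dropped before feature selection so that the remaining predictors carry less redundant information.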
Construction and evaluation of models
We integrated key features into the construction of ML models to predict the risk of non-SLN metastasis in patients with ≥ 3 positive SLNs. Figure 3A and Table 4 presented the performance of the nine ML models in predicting non-SLN metastasis within the training group. The RF model demonstrated superior predictive ability, with an AUC of 0.987, accuracy of 0.955, F1-score of 0.977, and Kappa statistic of 0.855. At the optimal cutoff, the RF model achieved a sensitivity of 0.966, specificity of 0.964, PPV of 0.988, and NPV of 0.871. The AUCs for the Logistic, XGBoost, LightGBM, AdaBoost, DT, GBDT, CNB, and SVM models were 0.648, 0.743, 0.669, 0.910, 0.856, 0.694, 0.648, and 0.757 respectively, with corresponding accuracies of 0.745, 0.640, 0.412, 0.754, 0.677, 0.448, 0.521, and 0.743. The F1-scores for these models were 0.823, 0.785, 0.402, 0.857, 0.608, 0.358, 0.576, and 0.827, respectively. Figure 3B and Table 4 illustrated the performance of the nine ML models in the validation group. Again, the RF model outperformed the others, achieving an AUC of 0.828, accuracy of 0.832, F1-score of 0.882, and Kappa of 0.569. The Logistic, XGBoost, LightGBM, AdaBoost, DT, GBDT, CNB, and SVM models yielded AUCs of 0.559, 0.619, 0.592, 0.793, 0.677, 0.638, 0.592, and accuracies of 0.668, 0.558, 0.367, 0.698, 0.574, 0.414, 0.428, 0.609, respectively. Their F1-scores were 0.659, 0.521, 0.230, 0.838, 0.398, 0.320, 0.204, and 0.707, respectively.
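As a quick consistency check on the training-set figures for the RF model, the F1-score can be recomputed from the reported PPV and sensitivity, since F1 is the harmonic mean of precision (PPV) and recall (sensitivity). This is a sketch with a hypothetical function name; it assumes the reported F1-score, PPV, and sensitivity all refer to the same classification threshold.

```python
def f1_from_ppv_and_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1 is the harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Training-set values reported for the RF model in the text above.
rf_f1 = f1_from_ppv_and_sensitivity(ppv=0.988, sensitivity=0.966)
print(round(rf_f1, 3))  # 0.977
```

The result agrees with the reported training F1-score of 0.977.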
PR curves were utilized to assess the precision and recall of the models at various thresholds. Figure 3C and D depict the average precision (AP) scores of the nine models in both the training and validation groups. Notably, the RF model exhibited the highest performance, achieving AP scores of 0.995 and 0.918 in the training and validation groups, respectively. Additionally, the calibration curves demonstrated a strong concordance between the predicted probabilities and the actual observations for the RF model (Fig. 3E). DCA was employed to evaluate the clinical utility of the models. The results indicated that the RF model provided significant net clinical benefits in predicting non-SLN metastasis (Fig. 3F). Consequently, the RF model emerged as the optimal choice for predicting non-SLN metastasis in patients with ≥ 3 positive-SLNs.
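The net benefit plotted in a decision curve follows a standard formula: at a threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), which is compared against the "treat all" and "treat none" (net benefit 0) strategies. The sketch below illustrates this with hypothetical labels and probabilities; the function names are not from the study's code, and "treatment" here would correspond to proceeding with ALND.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: how much one false positive costs relative to one true positive.
    weight = threshold / (1 - threshold)
    return tp / n - fp / n * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels (1 = non-SLN metastasis) and predicted probabilities.
labels = [1, 1, 1, 1, 1, 1, 0, 0]
probs = [0.9, 0.8, 0.75, 0.7, 0.6, 0.4, 0.3, 0.2]
nb_model = net_benefit(labels, probs, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both reference strategies.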
Performance and interpretability of RF model
The K-S curve was used to identify the optimal classification threshold, defined as the threshold that maximized the difference between the true positive rate and the false positive rate. Selecting the threshold with the highest K-S statistic improved classification performance. The maximum K-S statistic was 0.762 at a cutoff value of 0.236 (Fig. 4A). In clinical practice, accurately identifying patients who are unlikely to have non-SLN metastasis can prevent unnecessary ALND. The RF model correctly predicted non-SLN-negative status in 86.4% (19/22) of cases in the training group (Fig. 4B) and in 83.3% (5/6) of cases in the validation group (Fig. 4C).
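The K-S statistic for a binary classifier is the largest gap between the true positive rate and the false positive rate over all candidate thresholds. A minimal sketch with hypothetical data and a hypothetical function name follows; it scans every observed score as a threshold and classifies a patient as positive when the score is at or above it.

```python
def ks_statistic(y_true, y_prob):
    """Return (max TPR - FPR, threshold) over all observed score thresholds."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_gap, best_threshold = 0.0, None
    for threshold in sorted(set(y_prob), reverse=True):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
        gap = tp / positives - fp / negatives
        if gap > best_gap:
            best_gap, best_threshold = gap, threshold
    return best_gap, best_threshold


# Hypothetical labels (1 = non-SLN metastasis) and predicted probabilities.
labels = [1, 1, 1, 1, 0, 1, 0, 0]
probs = [0.95, 0.9, 0.8, 0.7, 0.65, 0.6, 0.3, 0.1]
ks, cutoff = ks_statistic(labels, probs)
```

The same quantity equals the maximum of Youden's J (sensitivity + specificity − 1), so the K-S threshold balances the two error types equally.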
The parallel coordinates plot visualized the distribution and trends among various features, facilitating a comprehensive comparison (Fig. 4D). Subsequently, SHAP analysis elucidated the predictive mechanisms of the RF model for non-SLN metastasis by quantifying the importance of each feature. The SHAP values for each feature varied across different levels; features with increasing values turned progressively redder, while decreasing values shifted towards blue (Fig. 4E). Notably, a feature point positioned to the right of the axis signified an increased risk of non-SLN metastasis, whereas a point on the left indicated a reduced risk. Furthermore, the features were ranked based on their importance to the model (Fig. 4E). Higher-ranked features played a more crucial role in the model’s decision-making process. Notably, tumor size and the number of SLNs were the most valued by the RF model for predicting non-SLN metastasis.
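To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions for a toy scoring function. It is not the "SHAP" package and not the fitted RF model: the package uses efficient tree-specific algorithms, whereas this brute-force version replaces features outside a coalition with a baseline value and is practical only for a handful of features. The toy coefficients, feature values, and baseline are hypothetical and carry no clinical meaning.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition take their baseline value.
    Cost grows as 2**n_features, so use only for tiny examples.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Probability that a coalition of this size precedes feature i
            # in a uniformly random ordering of the features.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


# Hypothetical risk score over tumor size (cm), number of SLNs, ER status (1 = positive).
def toy_risk(features):
    tumor_cm, n_sln, er_positive = features
    return 0.10 * tumor_cm + 0.05 * n_sln - 0.08 * er_positive


phi = shapley_values(toy_risk, x=[4.0, 5, 1], baseline=[3.0, 4, 0])
```

The values sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that SHAP summary plots rely on; a positive value pushes the prediction toward higher risk and a negative value toward lower risk.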
External validation of RF model
To further assess the robustness and applicability of the RF model, we enrolled 42 patients with ≥ 3 positive-SLNs from JPH as an external validation cohort (Table S2). The RF model demonstrated strong performance, achieving an AUC of 0.870 in this cohort (Fig. 5A). Additionally, DCA confirmed the clinical utility of the RF model in the external validation cohort (Fig. 5B).
Discussion
In our study, patients with SLNs ≥ 3 positive constituted 17.7% of SLN-positive BC patients, a higher proportion compared to the 5.6–10.7% reported in the literature [14, 15, 17, 18]. This discrepancy might be attributed to our use of MB in combination with ICG during SLNB, which typically achieves a higher detection rate than using either dye alone, thereby reducing the rate of missed diagnoses. Meanwhile, the removal of palpable nodes during SLNB undoubtedly increases the detection rate of SLNs. The NSABP B-32 trial demonstrated that the false-negative rate was 10.0% when two SLNs were found, 6.9% when three were identified, and 5.5% when four were detected [19]. This may also reflect variations in clinicians' understanding and performance of SLNB. Research has demonstrated that preoperative imaging for ALN identification and the development of a nomogram can aid in predicting the likelihood of involvement of three or more lymph nodes [20]. This approach enhances the accurate assessment of ALN metastasis and reduces the proportion of patients who are clinically negative for ALNs but have SLNs ≥ 3 positive intraoperatively. Some studies have reported that the expression of ER, PR, HER2, and Ki-67 can be used as predictors of non-SLN metastasis in patients with SLNs ≤ 2 positive. However, neither ER/PR, HER2, nor Ki-67 status independently predicted non-SLN metastasis [8, 10,11,12]. In our study, although the statuses of ER, PR, HER2, and Ki-67 in patients with SLNs ≥ 3 positive were not significantly associated with non-SLN metastasis, the Boruta algorithm's feature selection suggests that ER status should be included in the construction of ML models.
In this study, 75% of patients with SLNs ≥ 3 positive exhibited non-SLN metastasis. Other studies have reported that 55.5–67.7% of patients with SLNs ≥ 3 positive have non-SLN metastasis [15, 18]. This discrepancy may be related to the false-negative rate of hollow-core needle biopsies of suspicious ALNs in our study. Among patients with SLNs ≤ 2 positive, approximately 30% have non-SLN metastasis, while 70% do not [8, 21]. In recent years, various study designs and graphical and numerical models have been used to predict non-SLN metastatic status in early BC. Several nomograms and scoring systems based on clinicopathological variables have been developed to estimate the probability of non-SLN metastasis in patients with early BC and SLNs ≤ 2 positive. Notable examples include the Memorial Sloan Kettering Cancer Center (MSKCC) nomogram [9], the Cambridge nomogram [10], the Stanford nomogram [11], the Tenon score [22], the MD Anderson Cancer Center Score [13], and the Shanghai Cancer Hospital (SCH) nomogram [23]. These models have been validated and provide improved predictions of non-SLN metastasis status. For patients predicted by the model to have no non-SLN metastasis, avoiding further ALND and additional axillary radiotherapy can reduce the incidence of lymphedema and shoulder joint complications. However, for patients with SLNs ≥ 3 positive, it remains necessary to develop and validate models for clinical application, which would be valuable for further axillary management. For patients with a very low probability of non-SLN metastasis predicted by the model, ALND can be avoided while adding axillary radiotherapy, thus reducing surgical trauma and the occurrence of upper limb lymphedema. Therefore, in patients with SLNs ≥ 3 positive, the model could be useful for identifying the presence of non-SLN metastasis in early BC.
In this study, we employed machine-learning models to predict the risk of non-SLN metastasis in patients with SLNs ≥ 3 positive. Among the nine ML models used in the training set, the RF model exhibited the best performance in predicting non-SLN metastasis, achieving an AUC of 0.987, an accuracy of 0.955, an F1-score of 0.977, and a Kappa of 0.855. The RF model also demonstrated superior performance in the validation cohort, with an AUC of 0.828. These results suggest that the RF model is an excellent tool for evaluating whether patients with SLNs ≥ 3 positive can be exempted from ALND. The model demonstrates superior performance compared to previous logistic models. Reports indicate that the average accuracy, specificity, sensitivity, and AUC of the deep ML model TabNet are significantly better than those of the logistic regression model [24]. Our model's AUC also surpasses that of models predicting the risk of non-SLN metastasis in patients with SLNs ≤ 2 positive, such as the MSKCC and other prediction nomograms, where the AUC varies from 0.6 to 0.8 due to regional and patient population differences [9, 25, 26]. The SCH nomogram, the first model established using a population of Chinese individuals diagnosed with BC, had an original AUC of 0.779 [27]. Recently, Yang et al. also developed a new nomogram to predict the non-SLN status including early patients with SLNs ≥ 3 positive, with an AUC of 0.701–0.813 [14].
In clinical applications, it is crucial that the model accurately predicts patients without non-SLN metastasis to avoid unnecessary ALND. Our prediction model, RF, demonstrated an accuracy of 87.1% in identifying non-SLN-negative patients in the training cohort. Furthermore, the RF model successfully predicted 71.7% of non-SLN-negative patients in the validation cohort, surpassing the accuracy of previous prediction models for SLNs ≤ 2 positive [10, 11, 23, 25, 26, 28]. A significant advantage of the RF model is its ability to rank the features based on their importance, with higher-ranked features contributing more to the model. Our study revealed that the RF model prioritized tumor size and the number of SLNs, which were pivotal in our decision to prioritize the exemption of ALND. The prediction accuracy and predictive weight of the proposed model were superior to those of previous models for SLNs ≤ 2 positive [10, 11, 23, 25, 26, 28].
The AUC of the RF model in this study was 0.870 in the external validation cohort, indicating high validation efficiency. Its prediction accuracy, clinical applicability, and robustness were also excellent. Notably, a prediction model for non-SLN metastasis tailored to individuals diagnosed with BC in the Chinese population is needed [23]. Correspondingly, this RF model is particularly suitable for female BC patients in this region, and the model predicting non-SLN metastasis status when SLNs ≥ 3 positive is appropriate for regional promotion and application.
The limitations of this study are as follows: (1) The small sample size reduces the overall reliability and generalizability of the results. (2) The prediction model requires prospective clinical comparative study data to further validate its predictive efficiency. (3) Additional follow-up data are necessary to confirm local disease control, DFS, OS, and complications in the affected upper limbs of patients exempt from ALND.
Conclusion
Our study developed ML models to predict non-SLN metastatic status in patients with SLNs ≥ 3 positive, based on the Chinese BC population. The results demonstrate that when clinical ALNs are negative and SLNs ≥ 3 positive, it is feasible to construct RF prediction models using clinicopathological characteristics of the patients. The prediction accuracy and efficiency are excellent, making it applicable to regional populations. For cases with a very low rate of non-SLN metastasis as predicted by the model, ALND can be avoided by incorporating axillary radiotherapy.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Abbreviations
- BC: Breast cancer
- SLNB: Sentinel lymph node biopsy
- ALND: Axillary lymph node dissection
- DFS: Disease-free survival
- OS: Overall survival
- SLN: Sentinel lymph node
- non-SLN: Non-sentinel lymph node
- WBI: Whole-breast irradiation
- RNI: Regional nodal irradiation
- ML: Machine learning
- CHSU: Cancer Hospital of Shantou University Medical College
- AJCC: American Joint Committee on Cancer
- ITC: Isolated tumor cells
- JPH: Jieyang People's Hospital
- ECE: Extracapsular extension
- ER: Estrogen receptor
- PR: Progesterone receptor
- HER2: Human epidermal growth factor receptor 2
- FISH: Fluorescent in situ hybridization
- K-M: Kaplan–Meier
- RF: Random Forest
- XGBoost: Extreme gradient boosting
- LightGBM: Light gradient boosting machine
- AdaBoost: Adaptive boosting
- DT: Decision tree
- GBDT: Gradient boosting decision tree
- CNB: Complement naive Bayes
- SVM: Support Vector Machine
- ROC: Receiver operating characteristic
- PPV: Positive predictive value
- NPV: Negative predictive value
- DCA: Decision Curve Analysis
- K-S: Kolmogorov–Smirnov
- SHAP: SHapley Additive exPlanations
- GVIF: Generalized variance inflation factor
- AP: Average precision
- MSKCC: Memorial Sloan Kettering Cancer Center
- SCH: Shanghai Cancer Hospital
References
Krag DN, Anderson SJ, Julian TB, et al. Sentinel-lymph-node resection compared with conventional axillary-lymph-node dissection in clinically node-negative patients with breast cancer: overall survival findings from the NSABP B-32 randomised phase 3 trial. Lancet Oncol. 2010;11(10):927–33.
Fleissig A, Fallowfield LJ, Langridge CI, et al. Post-operative arm morbidity and quality of life. Results of the ALMANAC randomised trial comparing sentinel node biopsy with standard axillary treatment in the management of patients with early breast cancer. Breast Cancer Res Treat. 2006;95(3):279–93.
Ashikaga T, Krag DN, Land SR, et al. Morbidity results from the NSABP B-32 trial comparing sentinel lymph node dissection versus axillary dissection. J Surg Oncol. 2010;102(2):111–8.
Giuliano AE, Ballman K, McCall L, Beitsch P, Whitworth PW, Blumencranz P, et al. Locoregional recurrence after sentinel lymph node dissection with or without axillary dissection in patients with sentinel lymph node metastases: long-term follow-up from the American College of Surgeons Oncology Group (Alliance) ACOSOG Z0011 randomized trial. Ann Surg. 2016;264(3):413–20. https://doi.org/10.1097/SLA.0000000000001863.
Giuliano AE, Ballman KV, McCall L, Beitsch PD, Brennan MB, Kelemen PR, et al. Effect of axillary dissection vs no axillary dissection on 10-year overall survival among women with invasive breast cancer and sentinel node metastasis: the ACOSOG Z0011 (Alliance) randomized clinical trial. JAMA. 2017;318(10):918–26. https://doi.org/10.1001/jama.2017.11470.
Galimberti V, Cole BF, Viale G, Veronesi P, Vicini E, Intra M, et al. Axillary dissection versus no axillary dissection in patients with breast cancer and sentinel-node micrometastases (IBCSG 23–01): 10-year follow-up of a randomised, controlled phase 3 trial. Lancet Oncol. 2018;19(10):1385–93. https://doi.org/10.1016/S1470-2045(18)30380-2.
Donker M, van Tienhoven G, Straver ME, Meijnen P, van de Velde CJ, Mansel RE, et al. Radiotherapy or surgery of the axilla after a positive sentinel node in breast cancer (EORTC 10981–22023 AMAROS): a randomised, multicentre, open-label, phase 3 non-inferiority trial. Lancet Oncol. 2014;15(12):1303–10. https://doi.org/10.1016/S1470-2045(14)70460-7.
Sávolt Á, Péley G, Polgár C, Udvarhelyi N, Rubovszky G, Kovács E, et al. Eight-year follow up result of the OTOASOR trial: the optimal treatment of the axilla - surgery or radiotherapy after positive sentinel lymph node biopsy in early-stage breast cancer: a randomized, single centre, phase III, non-inferiority trial. Eur J Surg Oncol. 2017;43(4):672–9. https://doi.org/10.1016/j.ejso.2016.12.011.
Van Zee KJ, Manasseh DM, Bevilacqua JL, Boolbol SK, Fey JV, Tan LK, et al. A nomogram for predicting the likelihood of additional nodal metastases in breast cancer patients with a positive sentinel node biopsy. Ann Surg Oncol. 2003;10(10):1140–51. https://doi.org/10.1245/aso.2003.03.015.
Pal A, Provenzano E, Duffy SW, Pinder SE, Purushotham AD. A model for predicting non-sentinel lymph node metastatic disease when the sentinel lymph node is positive. Br J Surg. 2008;95(3):302–9. https://doi.org/10.1002/bjs.5943.
Kohrt HE, Olshen RA, Bermas HR, Goodson WH, Wood DJ, Henry S, et al. New models and online calculator for predicting non-sentinel lymph node status in sentinel lymph node positive breast cancer patients. BMC Cancer. 2008;8:66. https://doi.org/10.1186/1471-2407-8-66.
Duijm LE, Groenewoud JH, Roumen RM, de Koning HJ, Plaisier ML, Fracheboud J. A decade of breast cancer screening in The Netherlands: trends in the preoperative diagnosis of breast cancer. Breast Cancer Res Treat. 2007;106(1):113–9. https://doi.org/10.1007/s10549-006-9468-5.
Hwang RF, Krishnamurthy S, Hunt KK, Mirza N, Ames FC, Feig B, et al. Clinicopathologic factors predicting involvement of nonsentinel axillary nodes in women with breast cancer. Ann Surg Oncol. 2003;10(3):248–54. https://doi.org/10.1245/aso.2003.05.020.
Yang L, Zhao X, Yang L, Chang Y, Cao C, Li X, et al. A new prediction nomogram of non-sentinel lymph node metastasis in cT1-2 breast cancer patients with positive sentinel lymph nodes. Sci Rep. 2024;14(1):9596. https://doi.org/10.1038/s41598-024-60198-0.
Maimaitiaili A, Wu D, Liu Z, Liu H, Muyiduli X, Fan Z. Analysis of factors related to non-sentinel lymph node metastasis in 296 sentinel lymph node-positive Chinese breast cancer patients. Cancer Biol Med. 2018;15(3):282–9. https://doi.org/10.20892/j.issn.2095-3941.2018.0023.
Johnston S, Toi M, O’Shaughnessy J, Rastogi P, Campone M, Neven P, et al. Abemaciclib plus endocrine therapy for hormone receptor-positive, HER2-negative, node-positive, high-risk early breast cancer (monarchE): results from a preplanned interim analysis of a randomised, open-label, phase 3 trial. Lancet Oncol. 2023;24(1):77–90. https://doi.org/10.1016/S1470-2045(22)00694-5.
Tong C, Miao Q, Zheng J, Wu J. A novel nomogram for predicting the decision to delayed extubation after thoracoscopic lung cancer surgery. Ann Med. 2023;55(1):800–7. https://doi.org/10.1080/07853890.2022.2160490.
Dong LF, Xu SY, Long JP, Wan F, Chen YD. Role of number of sentinel nodes in predicting non-sentinel node metastasis in breast cancer. J Int Med Res. 2018;46(2):828–35. https://doi.org/10.1177/0300060517729589.
Krag DN, Anderson SJ, Julian TB, Brown AM, Harlow SP, Ashikaga T, et al. Technical outcomes of sentinel-lymph-node resection and conventional axillary-lymph-node dissection in patients with clinically node-negative breast cancer: results from the NSABP B-32 randomised phase III trial. Lancet Oncol. 2007;8(10):881–8. https://doi.org/10.1016/S1470-2045(07)70278-4.
Ahn SK, Kim MK, Kim J, Lee E, Yoo TK, Lee HB, et al. Can we skip intraoperative evaluation of sentinel lymph nodes? Nomogram predicting involvement of three or more axillary lymph nodes before breast cancer surgery. Cancer Res Treat. 2017;49(4):1088–96. https://doi.org/10.4143/crt.2016.473.
Ortega Expósito C, Falo C, Pernas S, Pérez Carton S, Gil Gil M, Ortega R, et al. The effect of omitting axillary dissection and the impact of radiotherapy on patients with breast cancer sentinel node macrometastases: a cohort study following the ACOSOG Z0011 and AMAROS trials. Breast Cancer Res Treat. 2021;189(1):111–20. https://doi.org/10.1007/s10549-021-06274-9.
Barranger E, Coutant C, Flahault A, Delpech Y, Darai E, Uzan S. An axilla scoring system to predict non-sentinel lymph node status in breast cancer patients with sentinel lymph node involvement. Breast Cancer Res Treat. 2005;91(2):113–9. https://doi.org/10.1007/s10549-004-5781-z.
Chen JY, Chen JJ, Xue JY, Chen Y, Liu GY, Han QX, et al. Predicting non-sentinel lymph node metastasis in a Chinese breast cancer population with 1–2 positive sentinel nodes: development and assessment of a new predictive nomogram. World J Surg. 2015;39(12):2919–27. https://doi.org/10.1007/s00268-015-3189-z.
Shahriarirad R, Meshkati Yazd SM, Fathian R, Fallahi M, Ghadiani Z, Nafissi N. Prediction of sentinel lymph node metastasis in breast cancer patients based on preoperative features: a deep machine learning approach. Sci Rep. 2024;14(1):1351. https://doi.org/10.1038/s41598-024-51244-y.
Bi X, Wang Y, Li M, Chen P, Zhou Z, Liu Y, et al. Validation of the Memorial Sloan Kettering Cancer Center nomogram for predicting non-sentinel lymph node metastasis in sentinel lymph node-positive breast-cancer patients. Onco Targets Ther. 2015;8:487–93. https://doi.org/10.2147/OTT.S78903.
Gur AS, Unal B, Johnson R, Ahrendt G, Bonaventura M, Gordon P, et al. Predictive probability of four different breast cancer nomograms for nonsentinel axillary lymph node metastasis in positive sentinel node biopsy. J Am Coll Surg. 2009;208(2):229–35. https://doi.org/10.1016/j.jamcollsurg.2008.10.029.
Ishizuka Y, Horimoto Y, Nakamura M, Arakawa A, Fujita T, Iijima K, et al. Predictive factors for non-sentinel nodal metastasis in patients with sentinel lymph node-positive breast cancer. Anticancer Res. 2020;40(8):4405–12. https://doi.org/10.21873/anticanres.14445.
Wu P, Zhao K, Liang Y, Ye W, Liu Z, Liang C. Validation of breast cancer models for predicting the nonsentinel lymph node metastasis after a positive sentinel lymph node biopsy in a Chinese population. Technol Cancer Res Treat. 2018;17:1533033818785032. https://doi.org/10.1177/1533033818785032.
Acknowledgements
Not applicable.
Funding
This work was supported by the Foundation of Basic and Applied Basic Research of Guangdong Province, China (No. 2022A1515220202); the 2023 Science and Technology Innovation Strategy Project of Guangdong Province (Big Project + Task List), China (No. STKJ2023009,20230403); and the Foundation of Basic and Applied Basic Research of Guangdong Province, China (No. 2023A1515220231).
Author information
Contributions
Xiangli Xie, Yutong Fang, and Lifang He participated in the data analysis. Qunchen Zhang organized the article writing. Jundong Wu critically revised the manuscript. Zexiao Chen revised the manuscript. Chunfa Chen and Huancheng Zeng drafted the manuscript. Bingfeng Chen was responsible for the acquisition of data. Guangshen Huang contributed to the literature search. Cuiping Guo corrected the language. All authors read and approved the manuscript and agree to be accountable for all aspects of the research, ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Ethics declarations
Ethics approval and consent to participate
This study was approved by the Ethics Committees of the Cancer Hospital of Shantou University Medical College (No. 2024038) and Jieyang People's Hospital (No. 2024054), and a waiver of informed consent was granted.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xie, X., Fang, Y., He, L. et al. Individualized prediction of non-sentinel lymph node metastasis in Chinese breast cancer patients with ≥ 3 positive sentinel lymph nodes based on machine-learning algorithms. BMC Cancer 24, 1090 (2024). https://doi.org/10.1186/s12885-024-12870-x
DOI: https://doi.org/10.1186/s12885-024-12870-x