Lymph node positivity in different early breast carcinoma phenotypes: a predictive model

Background: A strong correlation between breast cancer (BC) molecular subtypes and axillary status has been shown. It would be useful to predict the probability of lymph node (LN) positivity. Objective: To develop and compare the performance of multivariable models to predict LN metastases, including nomograms derived from logistic regression with clinical and pathologic variables provided by tumor surgical results or only by biopsy. Methods: A retrospective cohort was randomly divided into two separate patient sets: a training set and a validation set. In the training set, we used multivariable logistic regression techniques to build different predictive nomograms for the risk of developing LN metastases. The discrimination ability and calibration accuracy of the resulting nomograms were evaluated on the training and validation sets. Results: The cohort was a consecutive sample of 12,572 early BC patients with sentinel node biopsies and no neoadjuvant therapy. In our model predicting LN macro-metastases, the area under the curve (AUC) values were 0.780 and 0.717 for the pathologic and pre-operative models respectively, with good calibration, and results in the validation set were similar, with AUCs of 0.796 and 0.725 respectively. Among the candidate regression variables, we identified age, tumor size, LVI, and molecular subtype in the training set as statistically significant factors for predicting the risk of LN metastases. Conclusions: Several nomograms have been reported to predict the risk of SLN involvement and NSN involvement. We propose a new calculation model to assess the risk of positive LN, with similar performance, which could be useful to choose management strategies: to avoid axillary LN staging, to propose ALND for patients with a high probability of major axillary LN involvement, and to propose immediate breast reconstruction when post-mastectomy radiotherapy is not required for patients without LN macro-metastasis.
Electronic supplementary material The online version of this article (10.1186/s12885-018-5227-3) contains supplementary material, which is available to authorized users.


Background
In breast cancer (BC), nodal status is a major prognostic factor that determines therapeutic decisions to a large extent. Sentinel lymph node biopsy (SLNB) provides a reliable assessment of axillary status in early clinically node-negative BC [1]. Since it also causes less morbidity than axillary lymph node dissection (ALND), it is now considered a standard-of-care procedure. The omission of completion ALND in patients with negative sentinel lymph nodes (SLN) has been recognized as a reasonable approach since the publication of the NSABP B-32 results [2]. Moreover, with regard to survival outcomes, this omission can probably be safely extended to patients with minimal SLN involvement (isolated tumor cells and micro-metastases) [3,4]. Indeed, 40 to 70% of these patients do not have metastatic non-sentinel lymph nodes (NSLN) [5]. The main predictors of LN metastases are tumor size, grade, lymphovascular invasion (LVI), age at diagnosis, extracapsular extension of the positive SLN, and hormonal and HER2 receptor status [6][7][8][9][10]. In addition, numerous studies have shown a strong correlation between BC molecular subtypes and/or tumor phenotypes (determined by hormonal receptor and HER2 status) on the one hand and axillary status on the other [11][12][13][14][15][16].
Determining the risk of positive axillary LN can significantly contribute to therapeutic decisions. However, this risk cannot be directly inferred from the results of multivariate analyses, which provide only broad statistical information. Only an appropriate prediction tool, such as a nomogram, can indicate the individual risk of a given patient. These nomograms can also be used to compare populations from different studies. A large cohort is necessary to reliably determine the probability of positive SN, particularly for less frequent tumor phenotypes. In 2011, Reyal et al. published such a nomogram predicting the risk of developing SN metastases [11], built on a training set of 1543 early-stage BC patients and validated on two cohorts of 615 and 496 patients respectively. This model was further validated in a cohort of 755 consecutive patients treated at Institut Curie in 2009 [17].
The aim of our study was to develop and compare the performance of multivariable models to predict LN metastases, including nomograms derived from logistic regression, using as explanatory variables clinical and pathologic variables provided either by tumor surgical results or by biopsy alone.

Patients
Our cohort consisted of 12,572 consecutive patients with small (≤ 30 mm based on clinical and radiologic findings), clinically node-negative invasive BC, who did not receive neoadjuvant therapy and underwent SLNB between 1999 and 2012 at 13 French centers. HER2 status was determined for all patients. During the first years of the study, ALND was systematically performed in some sites; thereafter, ALND was performed only in case of SN involvement, and this practice was uniform across all the participating sites.

Evaluation
The following data were retrieved: characteristics of patients (age at the time of SLNB) and tumors [size, clinical stage, histological type, estrogen (ER), progesterone (PR) and HER2 status, LVI, Scarff-Bloom-Richardson (SBR) grade], description of ALND (number of LN sampled and involved), and results of the pathological examination of surgical resection specimens. Tumor size was determined from the pathological examination but could also be evaluated pre-operatively by mammography, sonography and, in selected cases, MRI (clinical T stage). LVI was assessed on the surgical specimen.
Tumor phenotype was defined by the combination of ER, PR and HER2 status, evaluated by immunohistochemistry (IHC) and confirmed by FISH in case of IHC-HER2 2+. Positivity for ER and PR was determined according to French guidelines (≥ 10% of cancer cells expressing ER/PR). Five molecular subtypes were defined according to clinico-pathological criteria [18]. Because information on Ki-67 was not available, we used grade to capture cell proliferation, as described by von Minckwitz et al. [19]. The following definitions were used: triple-negative
Although the methods used for histological examination were not standardized in the protocol, all sites proceeded similarly: serial sections were performed every 200 μm and stained with standard hematoxylin and eosin (HE). Six to ten sections were examined, or sectioning was pursued until node exhaustion in case of large SN. Additional IHC analysis was done in case of negative results at standard examination. For additional nodes identified by completion ALND, routine HE analysis was performed.

Statistical methods
Our main objective was to create prediction models for the risk of LN positivity and the risk of LN macroscopic metastases from clinical and pathologic variables provided by tumor surgical results or by biopsy, and to evaluate their performance with respect to three main features: discrimination (i.e. whether the relative ranking of individual predictions is in the correct order), calibration (i.e. agreement between observed outcomes and predictions) and clinical utility, defined as the proportions of patients classified into risk categories using predefined cutoff values (< 10%, between 10 and 20%, between 20 and 30%, between 30 and 40%, and ≥ 40%). Our main evaluation criteria were based on the final status of LN metastases (pN0(i+), pN1mi or pN1ma) as the result of SLNB alone or the final result of both SLNB and ALND. LN positivity was defined as the presence of isolated tumor cells, micro- or macro-metastases in the LN. We used logistic regression models [21] including age (≤ 40, 41-75, > 75 years), tumor size (≤ 20, 20-30, ≥ 30 mm) or clinical T stage (T0-T1, T2, T3-T4), tumor grade, histology type, LVI, and molecular subtypes as predictor factors to predict each individual's risk. The list of predictor factors was set beforehand, based on the investigators' experience and some reference papers [6-11, 13-15, 17]. Because only 5 or 6 predictor factors were identified beforehand, no additional procedure was used in the regression analysis to reduce this list.
Prior to analysis, we randomly divided our initial cohort (N = 12,572) into two separate sub-cohorts: a large training cohort (N = 8381) to create the prediction models and a confirmatory cohort (N = 4191) to evaluate their individual prediction performance. A split-sample approach was adopted in order to obtain unbiased estimates of model performance, as these estimates are known to be biased upwards when they are computed on the same dataset used to estimate the regression parameters [22].
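The split-sample design and the risk categories described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analyses were run in R). The function names, the random seed, and the choice to treat each band as closed at its lower bound (so that exactly 10% falls in the 10-20% band) are assumptions made for illustration; the intercept and coefficients passed to `predicted_risk` would come from a fitted model and are not given here.

```python
import math
import random
from collections import Counter

# Predefined cutoffs from the text: <10%, 10-20%, 20-30%, 30-40%, >=40%.
# Assumption: each band includes its lower bound and excludes its upper bound.
CUTOFFS = [0.10, 0.20, 0.30, 0.40]
LABELS = ["<10%", "10-20%", "20-30%", "30-40%", ">=40%"]


def predicted_risk(intercept, coefficients, indicators):
    """Logistic model: risk = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, indicators))
    return 1.0 / (1.0 + math.exp(-linear))


def risk_category(p):
    """Return the label of the risk band that contains probability p."""
    for cutoff, label in zip(CUTOFFS, LABELS):
        if p < cutoff:
            return label
    return LABELS[-1]


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly split ids into a training set of size n_train and the remainder."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]


def clinical_utility(probabilities):
    """Proportion of patients classified into each risk band."""
    counts = Counter(risk_category(p) for p in probabilities)
    n = len(probabilities)
    return {label: counts[label] / n for label in LABELS}
```

Calling `split_cohort(range(12572), 8381)` reproduces the cohort sizes reported above (8381 and 4191), and `clinical_utility` applied to the predicted risks of the confirmatory cohort gives the proportions used as the clinical utility criterion.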
First, we performed a descriptive analysis using the following criteria: patient's age at SN biopsy, clinical and pathological tumor size, tumor grade and histology type, presence or absence of lymphovascular invasion (LVI), presence of estrogen (ER), progesterone (PR) and hormone receptors (HR), HER2 positivity, tumor subtype, number of SN removed and final LN status. Each model was evaluated in the training sample and in the confirmatory sample. Differences in patient and tumor characteristics were compared using chi-square or Fisher's exact tests, and Student's t or Wilcoxon rank-sum tests, as appropriate. The discrimination ability was evaluated by the area under the ROC (Receiver Operating Characteristic) curve (AUC). We used the roc function of the pROC package in R to estimate AUC with 95% CI and to test for differences in AUCs using DeLong's method in the confirmatory sample [23]. Empirical distributions of AUC observed after re-fitting a model on bootstrap replicates (B = 2000) were used to estimate AUC and differences in AUCs with 95% CI in the training sample. Model calibration was evaluated using the Hosmer-Lemeshow goodness-of-fit test [24]. All statistical analyses were conducted in the R Language and Environment for Statistical Computing version 3.2.5 (The R Foundation, Vienna, Austria).
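As an illustration of these performance measures, the following Python sketch computes the AUC as a rank statistic, a percentile bootstrap confidence interval, and the Hosmer-Lemeshow statistic. It is a simplified sketch, not the R/pROC workflow used in the study: the bootstrap here resamples fixed predictions rather than re-fitting the model on each replicate, DeLong's test is not reproduced, and all function names and defaults other than B = 2000 are assumptions.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count one half). Quadratic in the
    number of cases, which is acceptable for a small illustration."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC of fixed predictions."""
    n = len(labels)
    if sum(labels) in (0, n):
        raise ValueError("need at least one positive and one negative case")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip replicates that lack one of the classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return low, high


def hosmer_lemeshow_statistic(labels, scores, groups=10):
    """Sum over risk-ordered groups of (observed - expected)^2 divided by
    n_g * mean_p * (1 - mean_p); usually compared with a chi-square
    distribution with groups - 2 degrees of freedom."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat
```

Here `labels` holds the observed LN status (1 for positive, 0 for negative) and `scores` the predicted probabilities. A large Hosmer-Lemeshow statistic relative to the chi-square reference indicates poor calibration.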
We evaluated the loss in discrimination ability in pre-operative prediction models omitting the information about LVI and substituting pathological tumor The decrease in AUC between the pathological and pre-operative models was statistically significant (p < 0.001). We also evaluated in the confirmatory sample the discrimination ability of the prediction models obtained when treating the variable age and

Discussion
The aim of this study was to better understand the relationships between tumor characteristics and the probability of axillary LN positivity. The large cohort used in our study is appropriate for less frequent tumor phenotypes (namely Her2+ and HR-Her2-). We distinguished between various histological tumor types, showing a lower LN positivity rate in tumors other than ductal, lobular or mixed, as previously reported for BC with favorable histology (tubular, mucinous, papillary, medullary, adenoid cystic and secretory), which is associated with a very low LN positivity rate [25]. In our model, we used the same independent variables as Reyal et al. [11], namely age, tumor size, molecular subtypes and LVI, and we added grade and histological type. However, age intervals were different, as were tumor phenotype definitions (ER only in the Reyal model) and tumor size description (continuous variable in the Reyal model). We obtained different odds ratios for the same variables, and clinical utility results also differed: in our population, the proportion of patients classified as having a low probability of positive lymph nodes was higher, particularly for macro-metastases, for both models. Clinical utility results for a low probability of positive lymph nodes could help avoid surgical axillary staging by sentinel lymph node biopsy or axillary lymph node dissection.

T stage (table fragment, column headers not preserved):
T1     3731  61   813  43   1835  61   405  42
T2      710  12   673  36    348  11   384  40
> T3     30   0   157   8     20   1    76   8

The models were less reliable when information about LVI was missing. LVI can be detected on pre-operative biopsies, but this detection is much less accurate than analysis of the surgical specimen.
HER2 status was unknown in older studies [8], and other studies were based on small numbers of patients. We found that HER2-negative tumors were associated with LN positivity less frequently than HER2-positive tumors (22.9% vs. 31.9%). Lu et al. reported that the lowest probability of node metastasis was for ER-/HER2- tumors [12]. Similarly, in our study, triple-negative tumors had the lowest probability of node metastasis, while HR-/Her2+ tumors had the highest probability. Reyal et al. hypothesized that the axillary LN metastatic process is predominantly related to intrinsic biological properties in ER-negative and HER2-negative BC, while tumor size, proliferation rate and LVI are the main determinants in ER-positive or HER2-positive breast cancers. However, in triple-negative BC, positive axillary lymph nodes were an adverse prognostic factor, not only in the case of sentinel node macro-metastases but also in the case of occult sentinel node involvement (pN0(i+) and pN1mi) [26].
A reliable predictive model of LN positivity, based on pathologic parameters, can be used to compare populations from different studies, particularly for trials with or without an axillary surgical procedure. Above all, it might make it possible to avoid SN biopsy when the probability of positivity is very low (< 10%). Some authors have already suggested that SN biopsy could be omitted in tumors with good-prognosis subtypes [25] or that axillary dissection is useless in older patients [27]. We believe that these criteria lack accuracy, and we prefer a decision-making approach based on molecular subtypes. However, we must be aware of the risk of insufficient treatment in small tumors with favorable prognostic factors, in which LN status is a major determinant of adjuvant chemotherapy and regional radiotherapy. Moreover, the model is less reliable when LVI is not documented, which is usually the case before surgery. Ultrasonography of the axilla with percutaneous biopsy is a growing practice.
These clinical predictive tools may help guide the use of axillary ultrasonography with percutaneous LN biopsy for patients at high risk of axillary LN involvement.
These models can also help determine indications for post-mastectomy radiotherapy in patients with axillary lymph node macro-metastases [28], particularly when immediate breast reconstruction can be proposed.

Conclusions
We report a reliable predictive model of LN positivity according to different early breast carcinoma phenotypes in a large cohort. Determining the risk of positive axillary LN can significantly contribute to therapeutic decisions. These models, with or without LVI results, can be used to determine the risk of positive axillary LN or the risk of LN macro-metastasis. Before surgery, clinical models can be used to decide whether or not to propose SLNB according to the probability of LN involvement. After surgery, in case of SLNB omission, a re-operation (SLNB or cALND) can be proposed if the probability of LN involvement is high, with possible modifications of adjuvant treatment indications according to LN status. Thus, clinical and pathologic models should be helpful in surgical planning, in the setting of a clinical trial and in clinical practice: to avoid SLNB when the risk of LN involvement is very low, to avoid re-operation in case of SLNB omission, to propose ALND for patients with a high probability of major axillary LN involvement, and to propose immediate breast reconstruction when post-mastectomy radiotherapy (PMRT) is not required for patients without LN macro-metastasis.