Nomogram for predicting preoperative lymph node involvement in patients with invasive micropapillary carcinoma of breast: a SEER population-based study

Background: Invasive micropapillary carcinoma (IMPC) is an unusual and distinct subtype of invasive breast tumor with a high propensity for regional lymph node metastasis. This study aimed to identify risk factors for lymph node involvement in IMPC of the breast and to develop a nomogram to preoperatively predict the probability of lymph node involvement.
Methods: A retrospective review of clinical and pathology records was performed for patients diagnosed with IMPC between 2003 and 2014 in the Surveillance, Epidemiology, and End Results (SEER) database. The cohort was divided into training and validation sets. The training set comprised patients diagnosed between 2003 and 2009, while the validation set included patients diagnosed thereafter. A logistic regression model was used to construct the nomogram in the training set, which was then verified in the validation set. Nomogram performance was quantified with respect to discrimination and calibration using R 3.4.1 software.
Results: Overall, 1407 patients diagnosed with IMPC were enrolled, of whom 527 were in the training set and 880 in the validation set. Logistic regression analysis indicated that larger lesions, younger age at diagnosis, black ethnicity and lack of hormone receptor expression were significantly related to regional lymph node involvement. The AUC of the nomogram was 0.735 (95% confidence interval (CI) 0.692 to 0.777), demonstrating good predictive performance. The calibration curve for the nomogram was plotted and its slope was close to 1, which demonstrated excellent calibration. The performance of the nomogram was further confirmed in the validation set, with an AUC of 0.748 (95% CI 0.701 to 0.767).
Conclusions: The striking difference between IMPC and invasive ductal carcinoma (IDC) remains the increased lymph node involvement in IMPC, which therefore merits aggressive treatment. A nomogram based on clinicopathologic parameters was established that could accurately predict regional lymph node status preoperatively. This nomogram would facilitate preoperative evaluation of lymph node status and thus treatment decision-making for individual patients.


Background
Invasive micropapillary carcinoma (IMPC) of the breast was first described by Fisher et al. in 1980 [1], and then defined by Siriaunkgul and Tavassoli in 1993 [2]. In the 2003 World Health Organization (WHO) guidelines for histologic classification of tumors of the breast, IMPC was considered a rare subtype of invasive breast carcinoma, accounting for approximately 2% to 8% of all breast cancers [3]. Over the past decades, a series of studies have been conducted to explore the clinicopathologic characteristics, clinical outcomes, prognostic factors and underlying mechanisms of IMPC.
Fundamental research on IMPC, in conjunction with clinical practice, has confirmed that IMPC is linked to a high frequency of lymph node metastasis (LNM) and lymphovascular invasion (LVI) [4][5][6][7][8]. Axillary lymph node metastasis is one of the most important prognostic determinants for patients with breast cancer. Accurate preoperative assessment of lymph node involvement has become essential for determining the need for neoadjuvant therapy and for guiding decisions on axillary lymph node dissection or alternative treatment options.
However, the relatively low incidence of IMPC makes it difficult to characterize the natural course of this aggressive subset of breast carcinoma. To date, the factors contributing to the lymphotropic nature of IMPC are not fully understood and need further elucidation. Although studies have indicated that tumor-infiltrating lymphocytes (TILs) [9], cytokines, membrane proteins (such as stromal cell-derived factor-1 and its receptor CXCR4, and caveolin-1) [10][11][12], and epigenetic regulators (such as the miRNAs let-7b, miR-30c, miR-148a, miR-181a and miR-181b, and promoter hypermethylation of the LZTS1 gene) [13,14] are associated with lymph node metastasis, no convenient nomogram has been available to facilitate preoperative prediction of lymph node involvement.
The objective of this retrospective study was to determine the clinicopathologic characteristics that correlate with the lymphotropic nature of IMPC. Furthermore, a nomogram was constructed from the identified factors to predict, prior to surgery, which lesions are likely to have regional lymph node involvement.

Ethical statement
This study was approved by the Ethical Committee of the Shanghai Cancer Center of Fudan University. The data released from the SEER database did not require informed patient consent because cancer is a reportable disease in every state in the US.

Patient selection
SEER*Stat version 8.3.4 was used to generate a case listing. A total of 1407 patients were eligible and enrolled according to the following inclusion criteria: year of diagnosis from 2003 to 2014; pathologically confirmed invasive micropapillary carcinoma of the breast (ICD-O-3 8507); unilateral breast cancer; and known adjusted AJCC 6th edition stage, tumor size, regional lymph node involvement, and ER and PR status. Patients with no record of regional lymph node status or tumor size, or diagnosed before 2003, were excluded from this analysis. Because the clinicopathologic characteristics of IMPC patients diagnosed in 2003 to 2009 and in 2010 to 2014 were comparable, the training set comprised patients diagnosed between 2003 and 2009, while the validation set included patients diagnosed thereafter.

Statistical analysis
Continuous variables were compared between the training set and validation set using Mann-Whitney U tests, and categorical variables were analysed using the Pearson's chi-square test or Fisher's exact test when needed. To identify factors that were associated with regional lymph node involvement, binary logistic regression analysis was used for multivariable analysis. Odds ratios (OR) were presented with 95% CI. Preoperatively available variables were included in the logistic regression analysis.
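As an illustration of how an odds ratio and its 95% CI follow from a logistic regression coefficient, the sketch below applies the usual Wald formula, OR = exp(beta) with limits exp(beta ± 1.96 × SE). The function name and the example coefficient and standard error are hypothetical and are not taken from this study; the study itself used SPSS and R.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for one coefficient.

    beta: log-odds coefficient from a fitted logistic regression model
    se:   standard error of that coefficient
    z:    normal quantile; 1.96 gives a two-sided 95% Wald interval
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


if __name__ == "__main__":
    # Hypothetical values chosen only to demonstrate the arithmetic.
    or_, lower, upper = odds_ratio_ci(beta=0.5, se=0.2)
    print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to a two-sided Wald P value below 0.05 for that coefficient.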
To construct a well calibrated and discriminative nomogram for predicting regional lymph node metastasis, a model was developed in a training set and then validated in another data set. A logistic regression model was used to construct the nomogram. Variables with P <0.05 were included in the nomogram, unless otherwise specified. A likelihood ratio test approach for model selection was performed.
Nomogram performance was quantified with respect to discrimination and calibration. Discrimination (the ability of a nomogram to separate patients with different lymph node status) was quantified by the area under the receiver operating characteristic (ROC) curve (AUC or C-index). Calibration was assessed graphically by plotting actual (observed) probabilities against predicted probabilities (calibration plot) and with the Hosmer-Lemeshow goodness-of-fit test [15]. Internal validation of performance was estimated with the bootstrapping method (1000 replications). All tests were two-sided, and P<0.05 was deemed significant. Statistical analyses were conducted using SPSS for Windows (version 22.0, SPSS Inc., Chicago, IL, USA) and the R programming language and environment version 3.4.1 (http://cran.r-project.org).
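For a binary outcome, the C-index equals the probability that a randomly chosen node-positive patient receives a higher predicted risk than a randomly chosen node-negative patient. The sketch below computes it by direct pair counting and adds a percentile bootstrap interval. It is a minimal illustration in Python on toy data: the function names are hypothetical, and the percentile interval is an assumption made here for demonstration, not necessarily the bootstrap procedure used in the original R analysis.

```python
import random


def auc(labels, scores):
    """C-index for binary labels (1 = node positive); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative assumption)."""
    auc(labels, scores)  # fails early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data, not patient data.
    y = [0, 0, 1, 1]
    risk = [0.10, 0.40, 0.35, 0.80]
    print(auc(y, risk))  # 3 of 4 positive-negative pairs are concordant: 0.75
    print(bootstrap_auc_ci(y, risk, n_boot=200))
```

Pair counting is quadratic in sample size, which is acceptable for a cohort of this size; a rank-based formula gives the same value more cheaply on large data sets.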

Clinical and pathologic characteristics of the study cohort
The clinicopathologic features of the study cohort are listed in Table 1. The cohort included 1407 patients with a median follow-up time of 37 months (25th-75th percentile, 15-69). The median survival times of the primary and validation cohorts were 79 months (25th-75th percentile, 64-97) and 22 months (25th-75th percentile, 9-37), respectively. As in invasive ductal carcinoma (IDC) of the breast, the overwhelming majority of patients were female (98.22%). The median ages at diagnosis in the primary and validation sets were 60 years (25th-75th percentile, 51-70) and 61 years (25th-75th percentile, 52-70), respectively. The lesion was located in the left breast in 724 patients (51.46%) and in the right breast in 683 patients (48.54%).
As to the routine immunoprofiles, the positive rates were 89.48% for ER and 77.83% for PR. In the present study, 50.46% (710/1407) of patients had pathologically confirmed lymph node metastasis, with positive rates of 49.91% and 50.80% in the primary and validation cohorts, respectively. There were no significant differences between the primary and validation groups with regard to clinicopathologic characteristics.

Factors associated with preoperative axillary lymph node metastasis
To identify factors that potentially predict axillary lymph node metastasis, univariate analysis was performed in the primary cohort, as indicated in Table 2. The results demonstrated that the factors most strongly associated with preoperative axillary lymph node involvement were older age at diagnosis, tumor size, histological grade, and stage. Of note, ER status, which is regarded as one of the most important drivers of breast cancer development, progression and metastasis, showed no significant relation with positive axillary lymph nodes, while PR status showed a marginal correlation.
To make flexible use of the available clinicopathologic features for preoperative prediction of lymph node metastasis, factors including age at diagnosis, ethnicity, gender, laterality, tumor size, and ER and PR status were further analyzed by binary logistic regression. The results showed that age at diagnosis, ER status and tumor size were independent factors associated with axillary lymph node positivity; these were then incorporated into the development of the nomogram, as indicated in Table 3.

Nomogram development
A nomogram to predict preoperative axillary lymph node positivity was developed in the primary cohort. Logistic regression analysis identified age at diagnosis, tumor size and ER status as risk factors (Table 3). The model incorporating these independent predictors together with parameters previously shown to be associated with axillary lymph node metastasis, including gender, laterality, ethnicity and PR status [16,17], was developed and is presented as the final nomogram (Fig. 1).
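A nomogram is a graphical rescaling of a regression model: each predictor's contribution to the linear predictor is mapped onto a points axis, usually so that the predictor with the widest contribution range spans 0 to 100 points, and the total points are mapped back to a probability through the logistic function. The sketch below shows that arithmetic. All coefficients, ranges and variable names are hypothetical placeholders and are not the fitted values from Table 3 or the scales of Fig. 1.

```python
import math

# Hypothetical model, for illustration only (NOT the values in Table 3).
INTERCEPT = -1.0
PREDICTORS = {
    "tumor_size_cm": {"beta": 0.40, "min": 0.0, "max": 5.0},
    "age_years": {"beta": -0.02, "min": 20.0, "max": 90.0},
    "er_negative": {"beta": 0.50, "min": 0.0, "max": 1.0},
}


def _contribution_range(p):
    """Smallest and largest value of beta * x over the predictor's range."""
    a, b = p["beta"] * p["min"], p["beta"] * p["max"]
    return min(a, b), max(a, b)


# The predictor with the widest contribution range defines the 0-100 scale.
MAX_SPAN = max(hi - lo for lo, hi in map(_contribution_range, PREDICTORS.values()))
POINTS_PER_UNIT = 100.0 / MAX_SPAN


def points(name, value):
    """Points for one predictor value; 0 points at its lowest contribution."""
    p = PREDICTORS[name]
    lo, _ = _contribution_range(p)
    return (p["beta"] * value - lo) * POINTS_PER_UNIT


def predicted_probability(patient):
    """Return (total points, predicted probability) for one patient dict."""
    if set(patient) != set(PREDICTORS):
        raise ValueError("patient must supply every predictor exactly once")
    total = sum(points(name, value) for name, value in patient.items())
    # Zero total points corresponds to every predictor at its lowest contribution.
    baseline = INTERCEPT + sum(_contribution_range(p)[0] for p in PREDICTORS.values())
    linear_predictor = baseline + total / POINTS_PER_UNIT
    return total, 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    total, prob = predicted_probability(
        {"tumor_size_cm": 2.5, "age_years": 45, "er_negative": 1}
    )
    print(f"total points = {total:.0f}, predicted probability = {prob:.3f}")
```

Because the points are a linear rescaling of the model, reading the nomogram gives the same probability as evaluating the logistic regression equation directly.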

Internal and external validation of the model
The nomogram was internally validated using the bootstrap method. It demonstrated good accuracy for predicting positive axillary lymph nodes, with an unadjusted concordance index (C-index) of 0.735 (95% CI, 0.692 to 0.777) (Fig. 2a) and a bootstrap-corrected C-index of 0.734 (1000 bootstrap resamples). Calibration curves for estimating positive axillary lymph nodes indicated no apparent departure from perfect fit, with good agreement between prediction and observation in the primary cohort (Fig. 2b).
Good calibration was observed for the probability of lymph node metastasis in the validation cohort (Fig. 2d), and the C-index of the nomogram for the prediction of lymph node status was 0.748 (95% CI, 0.701 to 0.767) (Fig. 2c).
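A calibration plot of the kind shown in Fig. 2b and 2d compares predicted risk with the observed frequency of lymph node metastasis. A simple way to obtain the plotted points is to sort patients by predicted probability, split them into equal-sized groups, and compare the mean prediction with the observed event rate in each group. The sketch below does this on toy data; the function name and the grouping scheme are assumptions made for illustration and do not reproduce the original R workflow.

```python
def calibration_table(labels, probs, n_bins=5):
    """Group patients by predicted risk and compare prediction with outcome.

    Returns a list of (group size, mean predicted probability,
    observed event rate), ordered from lowest to highest predicted risk.
    """
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue  # more bins than patients
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        rows.append((len(group), mean_pred, observed))
    return rows


if __name__ == "__main__":
    # Toy data, not patient data.
    y = [0, 0, 1, 1]
    p = [0.1, 0.2, 0.8, 0.9]
    for size, mean_pred, observed in calibration_table(y, p, n_bins=2):
        print(size, round(mean_pred, 2), observed)
```

Points lying close to the 45-degree line (mean prediction equal to observed rate) indicate good calibration.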

Discussion
IMPC is a rare subtype of breast carcinoma listed in the 2003 World Health Organization histologic classification of tumors of the breast, and is presumed to be more aggressive than invasive ductal carcinoma (IDC) [3]. In previous IMPC series, positive rates ranged from 25% to 91% for ER, 13% to 82% for PR, and 36% to 100% for HER2 [18][19][20][21][22][23][24][25]. In our study, the positive rates of ER, PR and HER2 were 89.48%, 77.83% and 19.43%, respectively. In addition, the rate of lymph node involvement, histological grade, tumor size and stage were similar to those reported in previous studies.
Although substantial controversy exists, most previous studies have demonstrated that IMPC is associated with advanced and aggressive clinicopathologic features, and the survival outcomes of IMPC are generally accepted to be worse than those of IDC [26][27][28]. To make matters worse, no standard treatment strategy for IMPC is available, and the low frequency of IMPC makes it difficult to compare its clinical outcomes and pathologic features directly with those of IDC. According to previous investigations, IMPC and IDC share some contributors to lymph node involvement and differ in others. Although some controversy exists, hormone receptor status and tumor size are generally accepted to account for lymph node involvement in both IMPC and IDC, which is consistent with our results. In contrast, histological type, lymphovascular invasion (LVI), number of involved sentinel lymph nodes and extranodal extension are thought to contribute to lymph node involvement in IDC, and their effects in IMPC require further research [29][30][31][32].
Thus, almost all IMPC patients are currently treated according to standard IDC treatments [33]. However, previous reports on IMPC have confirmed its distinct morphological and genetic profiles, establishing IMPC of the breast as a special entity. Most patients with IMPC present with axillary lymph node metastasis at initial diagnosis; the reported rate of lymphatic and lymph node spread ranges from 33% to 95%, higher than that of IDC [2,20,24,25]. However, the underlying mechanisms are poorly understood, with only a few laboratory studies indicating that some regulatory molecules are involved in the lymphotropism. Notably, lymph node metastasis is the most important prognostic predictor in IMPC and IDC. Some uncertainty and controversy remain concerning the value of sentinel lymph node biopsy (SLNB) in the IMPC setting [7]. Management strategies that avoid invasive axillary procedures are needed for lymph node negative patients. If the status of the axillary lymph nodes could be predicted before sentinel lymph node biopsy, individuals who are axillary negative could avoid unnecessary axillary surgery. However, preoperative clinical and imaging assessment of the axilla is affected by multiple factors. Moreover, Paterakos et al. suggested that patients with IMPC may not benefit from SLNB [7]. Therefore, accurate prediction and appropriate management of axillary lymph nodes are of important clinical significance.
To help guide axillary lymph node treatment decisions in this subpopulation, we used the large population-based SEER database to develop and validate a convenient nomogram for the preoperative individualized prediction of lymph node metastasis in patients with IMPC. In univariate and multivariate analysis, age at diagnosis, ER status and tumor size were independent factors associated with axillary lymph node positivity, and PR status showed a marginal correlation. Previously identified factors, namely gender, laterality and ethnicity, have also been reported to contribute to lymph node involvement, so we included them in constructing the nomogram. The nomogram incorporated preoperatively available clinical and certain pathological parameters and had some utility in clinical practice, with good discrimination and calibration. With this nomogram, a patient's axillary lymph node status could be accurately predicted, and clinical practice, especially surgical planning, could be tailored precisely to the individual.
Although the nomogram showed good accuracy for predicting axillary lymph node metastasis, there were some limitations to the data that must be considered when interpreting the results. First, the use of retrospective data introduced the possibility of selection bias. Moreover, the vast majority of patients in the study cohort were white, so the estimation may be less precise for patients of nonwhite race. Additionally, the duration of follow-up in this study should be longer to meet the accurate prediction as survival of breast cancer
In summary, we used a population-based cohort to develop a nomogram to preoperatively estimate axillary lymph node metastasis in IMPC patients. This clinically useful tool applies readily available clinicopathologic factors to estimate the propensity for positive lymph nodes and can further facilitate individualized clinical decision making. Of note, extrapolation of this conclusion to other databases should be made with caution, and the following points should be taken into account. On the one hand, the definitions and evaluation criteria of ER/PR/HER2 status are not clear in this database, and they may differ between time periods, as indicated by the ASCO/CAP guidelines and their updates; thus some discrepancies may exist between the training and validation sets. On the other hand, the study cohort includes data from surgical samples obtained after neoadjuvant chemotherapy, and these cases may have been downstaged, which may result in underestimation of lymph node involvement. Given that, future prospective studies with larger sample sizes and longer follow-up are needed to determine the accuracy of the nomogram model and to precisely identify the subpopulations warranting more intensive treatment and surveillance.

Conclusion
In conclusion, we found that younger age at diagnosis, larger tumor lesions, ER-negative status and advanced stage were risk factors associated with lymph node involvement in IMPC of the breast. Incorporating previously reported factors with the currently identified risk factors, we constructed a nomogram to preoperatively predict axillary lymph node status, with good discrimination and calibration in both the training and validation cohorts. This accessible nomogram will facilitate