Clinical characteristics and overall survival prognostic nomogram for invasive cribriform carcinoma of breast: a SEER population-based analysis

Background The prognostic factors in patients with invasive cribriform carcinoma (ICC) of the breast remain controversial. This study aims to establish a nomogram to predict survival outcomes in patients with ICC based on the Surveillance, Epidemiology, and End Results (SEER) database. Methods We retrieved clinical data from the SEER database for patients with ICC and infiltrating ductal carcinoma (IDC) diagnosed from 2004 to 2015. Kaplan-Meier survival analysis was used to compare survival outcomes between ICC and IDC. ICC patients were randomly allocated to a training cohort and a validation cohort. A nomogram was built to predict 3-year and 5-year survival for individual patients with ICC. The established TNM model and the newly built nomogram were further evaluated by the concordance index (C-index) and decision curve analysis (DCA). Results Compared with IDC, ICC showed significantly smaller tumors, fewer involved lymph nodes, a lower metastasis rate, better tumor differentiation, higher proportions of estrogen receptor (ER) and progesterone receptor (PR) positivity, and lower rates of chemotherapy and radiotherapy. Age at diagnosis, marital status, tumor location, T stage, M stage, PR status, and surgery were independent prognostic factors for overall survival (OS). The nomogram achieved a significantly higher C-index than the established TNM model in the validation cohort. Conclusions The prognosis of ICC patients is better than that of IDC patients. The nomogram is recommended for survival prediction in future patients with ICC. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-07895-5.


Background
Breast cancer has a high mortality rate among women worldwide. It comprises many pathological subtypes; among them, ICC is a rare but distinctive one, characterized by nests of cells with mild to moderate cytological atypia surrounded by a dense fibrous stroma, with an incidence of less than 4% [1]. Compared with other breast cancers, ICC is considered to have a higher survival rate, so its distinct behavior should be taken into account in clinical practice [2]. However, because ICC is relatively rare, previous reports are limited by small cohorts or by evident bias resulting from short follow-up periods [3][4][5]. Most local and systemic treatment strategies for ICC are inferred from experience with IDC and have not been rigorously verified in ICC patients. TNM is the most widely used staging system. It indicates objective tumor burden and metastasis status but has limited capacity to characterize biological behavior and guide decision making [6]. Because ICC lacks a dedicated prognostic evaluation system, ICC treatment is relatively uniform. Nomograms have been confirmed as reliable alternative prognostic assessment tools in many cancer types [7][8][9] and are even considered an emerging standard [10]. In this study, we aim to build a reliable and accurate nomogram to predict survival outcomes for individual ICC patients based on clinical and pathological data from the SEER database.

Data source and study population
The Surveillance, Epidemiology, and End Results (SEER) database collects information on cancer characteristics, incidence, and outcomes. We obtained permission to download and analyze the data for academic purposes (reference number: 14737-Nov2018). This study does not involve any experiments on humans or animals, or any use of human tissue samples, performed by any of the authors. Data were extracted from the SEER database covering 18 regions, released on August 8th, 2019, and screened with the following criteria. Inclusion criteria: (1) year of diagnosis from 2004 to 2015, (2) primary tumor site in the breast, and (3) histological type restricted to 8500/3 (IDC) or 8201/3 (ICC) according to ICD-O-3. Exclusion criteria: (1) unknown race, year of diagnosis, marital status, or important clinicopathological data, (2) age younger than 20 years, (3) a history of other cancer, (4) survival of less than 1 month after diagnosis, and (5) diagnosis based only on biopsy or autopsy. Patients with ICC who met the criteria were randomly allocated to a training cohort (n = 532) and a validation cohort (n = 228).
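The random allocation described above (532 + 228 = 760 ICC patients, i.e. a 7:3 split) can be sketched as follows. This is a minimal illustration in Python, not the authors' procedure: the function name `split_cohort`, the integer patient IDs, and the fixed seed are all assumptions introduced here for reproducibility.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into (training, validation) lists.

    The seed is an assumption made so the example is reproducible;
    the paper does not report how the randomization was seeded.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 0..759 stand in for the 760 eligible ICC patients.
train_ids, valid_ids = split_cohort(range(760), n_train=532)
print(len(train_ids), len(valid_ids))
```

Shuffling once and slicing guarantees that the two cohorts are disjoint and together cover every eligible patient.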

Endpoint and statistical analysis
Overall survival (OS) was defined as the time from the date of diagnosis to the date of death from any cause or the last follow-up. Race was divided into white, black, and other; estrogen receptor (ER) and progesterone receptor (PR) status were each divided into positive and negative; human epidermal growth factor receptor 2 (HER-2) status was divided into negative, positive, and unknown. Age cut-offs were determined with X-tile software (Fig. S1), giving three groups: < 68, 68-78, and ≥ 79 years. Marital status was reclassified as married, single (never married or with a domestic partner), or divorced (separated, divorced, or widowed). Clinicopathological features of the groups were compared with the chi-square test or Fisher's exact test. Survival curves were generated by the Kaplan-Meier method, and the log-rank test was used to assess differences in survival between groups. Three-year and five-year OS rates were calculated by the life table method. In the training cohort, Cox regression models were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs) and to adjust for prognostic variables. Variables with a p value < 0.05 in univariate Cox regression were entered into multivariate analysis using forward stepwise regression. In multivariate analysis, the T, N, and M variables were used instead of the stage variable to avoid multicollinearity. Based on the results of the multivariate Cox proportional hazards model, the nomogram was constructed using the rms package in R. The nomogram was validated by discrimination and calibration measures in the training and validation cohorts. The C-index, which measures the agreement between predicted and observed outcomes, was used to evaluate the discriminative power of the nomogram [11]. The receiver operating characteristic (ROC) curve was also used to verify the nomogram.
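To make the Kaplan-Meier step concrete, the product-limit estimator can be written in a few lines. The sketch below is plain Python rather than the SPSS/R workflow used in the study, and the follow-up times and event flags are toy values invented for illustration, not SEER data. It follows the usual convention that patients censored at a given time are still counted as at risk for deaths at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times:  follow-up in months
    events: 1 = death from any cause, 0 = censored at last follow-up
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # each patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk  # product-limit update
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def survival_at(curve, t):
    """Step-function lookup: the last estimate at or before time t."""
    s = 1.0
    for time, value in curve:
        if time > t:
            break
        s = value
    return s


# Toy cohort (hypothetical values, not taken from the study).
times = [6, 12, 12, 20, 36, 40, 60, 60]
events = [1, 0, 1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(times, events)
print(round(survival_at(km_curve, 36), 3))  # 3-year OS in the toy cohort
print(round(survival_at(km_curve, 60), 3))  # 5-year OS in the toy cohort
```

Reading the curve at 36 and 60 months gives 3-year and 5-year OS estimates of the kind reported in the study, although the study itself computed those rates by the life table method.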
Calibration plots based on marginal estimates were used to assess the agreement between the survival predicted by the nomogram and the observed survival. The clinical usefulness and benefit of the prediction model were evaluated by decision curve analysis (DCA) [12]. The C-index and DCA were used to compare the nomogram with the AJCC 6th edition TNM staging system in the validation cohort.
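The quantity plotted in a decision curve is the net benefit at a given risk threshold: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch below is a simplified illustration in Python that treats death by a fixed time point as a binary outcome and ignores censoring, which is an assumption made here; DCA for survival data normally accounts for censoring. The predicted risks and outcomes are hypothetical.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    risks:    predicted probabilities of death by the time horizon
    outcomes: 1 = died by the horizon, 0 = alive (censoring ignored here)
    """
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient as high risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted 5-year death risks and observed outcomes.
risks = [0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
outcomes = [0, 0, 0, 1, 0, 1]
nb_model = net_benefit(risks, outcomes, threshold=0.3)
nb_all = net_benefit_treat_all(outcomes, threshold=0.3)
print(round(nb_model, 4), round(nb_all, 4))
```

A decision curve repeats this calculation over a range of thresholds; a model is preferred where its net benefit lies above both the treat-all line and the treat-none line (net benefit of zero).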
All p values were two-sided, and values < 0.05 were considered statistically significant. Statistical analysis was performed with IBM SPSS Statistics 20 and R software (version 3.6.3).   Table 1).

Survival analysis
The median follow-up time was 60 months (range, 1-155 months). Kaplan-Meier analysis showed that ICC patients survived significantly longer than IDC patients (p < 0.001). The 3-year and 5-year OS rates were 94.43% and 90.26% for ICC patients and 90.88% and 85.26% for IDC patients, respectively (Fig. 1). In univariate Cox regression analysis, ICC histology was a favorable prognostic factor in breast cancer (HR = 0.683; 95% CI, 0.559-0.834; p < 0.001).  (Table 2). Kaplan-Meier survival curves for each independent prognostic factor are shown in Fig. 2.
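The reported hazard ratio and its confidence interval can be checked for internal consistency. A Cox model estimates a log-hazard coefficient β with standard error SE, so that HR = exp(β) and the 95% CI is exp(β ± 1.96·SE). The sketch below back-calculates β, SE, the Wald z statistic, and a two-sided p value from the published figures; it assumes a Wald-type interval on the log scale, which the paper does not state explicitly, and the helper name `hr_summary` is introduced here for illustration.

```python
import math


def hr_summary(hr, ci_low, ci_high, z_crit=1.96):
    """Recover beta, SE, Wald z and two-sided p from an HR and its 95% CI.

    Assumes the interval was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


# Values reported for ICC versus IDC histology.
beta, se, z, p = hr_summary(0.683, 0.559, 0.834)

# On the log scale the point estimate should sit midway between the CI limits.
hr_from_ci = math.exp((math.log(0.559) + math.log(0.834)) / 2.0)
print(round(beta, 3), round(se, 3), round(z, 2), p < 0.001, round(hr_from_ci, 3))
```

Under that assumption, the midpoint of the interval on the log scale reproduces the reported HR to three decimals, and the implied p value is below 0.001, in line with the text.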

Construction and validation of nomogram
The independent prognostic factors identified by Cox regression (age, marital status, tumor location, T stage, M stage, PR status, and surgery) were used to build a nomogram to predict OS in ICC (Fig. 3). In the nomogram, M stage had the greatest impact on prognosis and PR status the smallest. Each level of every variable was assigned a score (Table 3). The nomogram was verified internally and externally. Internal verification in the training cohort yielded a C-index for OS prediction of 0.845 (95% CI, 0.788-0.902), and external verification in the validation cohort yielded a C-index of 0.807 (95% CI, 0.728-0.876). The calibration plots showed good agreement between nomogram predictions and actual observations in both the training and validation cohorts (Fig. 4). The ROC curves for the training and validation cohorts are shown in Fig. 5.
In the validation cohort, the C-index of the nomogram for OS was 0.807 (95% CI, 0.728-0.876), higher than that of the AJCC 6th edition TNM staging system (C-index = 0.591; 95% CI, 0.505-0.677). DCA was used to compare the clinical usefulness and benefit of the nomogram and the AJCC 6th edition TNM staging system. Compared with the TNM staging system, the 3-year and 5-year DCA curves of the nomogram showed a larger net benefit across a range of death-risk thresholds in the validation cohort (Fig. 6).
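How a nomogram turns Cox coefficients into points can be illustrated with a short sketch. A common convention, and to our understanding the one used by nomogram tools such as the rms package, is to give 100 points to the full range of the most influential variable and to scale every other level proportionally. The coefficients below are hypothetical placeholders chosen only so that M stage has the widest range and PR status the narrowest; they are not the values in Table 3, and `nomogram_points` is a name introduced here.

```python
def nomogram_points(coefs, max_points=100.0):
    """Convert log-hazard coefficients into nomogram points.

    coefs: {variable: {level: coefficient}}, with the reference level at 0.0.
    The variable with the widest coefficient range spans 0..max_points.
    """
    ranges = {v: max(lv.values()) - min(lv.values()) for v, lv in coefs.items()}
    widest = max(ranges.values())
    return {
        v: {
            level: round(max_points * (b - min(lv.values())) / widest, 1)
            for level, b in lv.items()
        }
        for v, lv in coefs.items()
    }


# Hypothetical coefficients for illustration only (NOT from the paper).
hypothetical_coefs = {
    "M stage": {"M0": 0.0, "M1": 2.0},
    "T stage": {"T1": 0.0, "T2": 0.5, "T3-4": 1.2},
    "PR": {"positive": 0.0, "negative": 0.4},
}
points = nomogram_points(hypothetical_coefs)

# Total points for a hypothetical M0, T2, PR-negative patient.
total = points["M stage"]["M0"] + points["T stage"]["T2"] + points["PR"]["negative"]
print(points)
print(total)
```

The total score is then mapped to a predicted 3-year or 5-year survival probability through the fitted baseline survival, a step omitted here because it requires the fitted model.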
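The C-index comparison can be illustrated with a direct implementation of Harrell's concordance index: among all pairs in which the patient with the shorter follow-up died, it is the fraction in which that patient also received the higher predicted risk, with tied predictions counting one half. The sketch below is plain Python on toy data invented for illustration; pairs with tied follow-up times are skipped, a simplification adopted here. It shows why a coarse score such as a stage group, which produces many tied predictions, tends to yield a lower C-index than a finer-grained score.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data (higher score = higher risk)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is usable only if the earlier time is a death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count half
    return concordant / comparable


# Toy data (hypothetical, not from the study).
times = [5, 10, 15, 20, 25]
events = [1, 1, 0, 1, 0]
fine_score = [0.9, 0.45, 0.4, 0.5, 0.1]  # continuous, nomogram-like risk
coarse_score = [2, 1, 1, 2, 1]           # two-level, stage-like grouping

c_fine = harrell_c_index(times, events, fine_score)
c_coarse = harrell_c_index(times, events, coarse_score)
print(c_fine, c_coarse)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the reported 0.807 and 0.591 should be read.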

Discussion
ICC of the breast is a rare histological type with a low degree of malignancy. Previous studies have shown that ICC has a good prognosis and characteristics distinct from other histological types of breast cancer [13,14]. In our study as well, the prognosis of ICC patients was significantly better than that of IDC patients. However, ICC is difficult to distinguish from other types of breast cancer on imaging [15].
The prognosis of ICC patients remains controversial. The value of breast cancer subtypes as prognostic factors is now widely accepted by clinicians. Current NCCN and ASCO guidelines recommend ER and PR status as significant prognostic factors in medical decision-making. However, in this study, multivariate Cox analysis indicated that ER status was not an independent prognostic factor for ICC. One likely reason is that the ER-positive rate in our cohort was very high, which makes it difficult to determine the prognostic effect of ER status.
In this study, surgical treatment was an independent prognostic factor for ICC, whereas chemotherapy was not. Zhang et al. also suggested that, given the good prognosis of ICC, chemotherapy is not required [3]. The lack of an independent effect of chemotherapy in this study may be because the patients who required chemotherapy had later-stage disease. Therefore, whether ICC patients need chemotherapy requires further study.
The traditional TNM staging system evaluates prognosis using only T stage, N stage, and M stage and does not include other biological factors. The nomogram we constructed shows that, in addition to T stage and M stage, age at diagnosis, marital status, tumor location, and PR status also have a substantial impact on prognosis. However, N stage was not statistically significant for prognosis in our Cox regression analysis. ICC can be divided into a simple type and a mixed type. The mixed type has a higher lymph node positive rate and a poorer prognosis [3,5,13]. However, our data cannot distinguish between the subtypes of ICC. Therefore, we speculate either that N stage is less important than T and M stage in ICC or that N stage did not emerge as a prognostic factor because of bias arising from our inability to distinguish ICC subtypes.
A published study analyzed the prognosis of ICC using SEER data. The results showed that ICC patients were more likely to be elderly women with smaller tumors, better tumor differentiation, fewer lymph node metastases, and higher ER-positive and PR-positive rates, and that their prognosis was better than that of IDC patients [16]. As far as we know, this is the first study to build a nomogram for ICC based on a large sample. This nomogram has better accuracy (C-index 0.845 in the training cohort) and clinical usability than the TNM staging system. This study has several limitations. First, we could not distinguish ICC subtypes; ICC is divided into simple and mixed types, which have different prognoses. Second, SEER data lack information on Ki-67, chemotherapy regimens, endocrine therapy, and vascular invasion. Third, the nomogram was not validated in multiple centers. Finally, this is a retrospective rather than a prospective cohort study. Nevertheless, our study provides new insight into the clinicopathological characteristics and prognosis of patients with ICC.
Fig. 4 The calibration plot for predicting 3- and 5-year overall survival for patients with ICC. Calibration plot of nomogram prediction of a 3-year and b 5-year OS of patients with ICC

Conclusions
ICC patients have smaller tumors, less lymph node involvement, a lower rate of distant metastasis, a higher frequency of well-differentiated tumors, higher ER-positive and PR-positive rates, and less frequent chemotherapy and radiotherapy. The prognosis of ICC patients is significantly better than that of IDC patients. Our study is the first to build a nomogram model for ICC.