Nomogram to predict overall survival based on the log odds of positive lymph nodes for patients with endometrial carcinosarcoma after surgery

Purpose This study aimed to compare the prognostic performance of the number of positive lymph nodes (PLNN), lymph node ratio (LNR) and log odds of positive lymph nodes (LODDS), and to establish a prognostic nomogram to predict the overall survival (OS) rate of patients with endometrial carcinosarcoma (ECS). Methods Patients diagnosed from 2004 to 2015 were retrospectively identified from the Surveillance, Epidemiology and End Results (SEER) database. The prognostic value of PLNN, LNR and LODDS was assessed. A prediction model for OS was established based on univariate and multivariate analysis of the clinical and demographic characteristics of ECS patients. The clinical usefulness of the prediction model was evaluated by decision curve analysis (DCA), which quantifies net benefit. Results LODDS predicted OS in ECS more accurately than PLNN and LNR. Five factors (age, tumor size, 2009 FIGO stage, LODDS and peritoneal cytology) were independent prognostic factors for OS. The C-index of the nomogram was 0.743 in the training cohort. The AUCs were 0.740, 0.682 and 0.660 for predicting 1-, 3- and 5-year OS, respectively. The calibration plots and DCA showed good clinical applicability of the nomogram, which outperformed the 2009 FIGO staging system. These results were verified in the validation cohort. A risk classification system was built that classified ECS patients into three risk groups. The Kaplan-Meier curves showed that the risk classification system accurately differentiated OS among the groups and performed much better than FIGO 2009. Conclusion Our results indicated that LODDS was an independent prognostic indicator for ECS patients, with better predictive efficiency than PLNN and LNR. A novel prognostic nomogram for predicting the OS rate of ECS patients was established based on the population in the SEER database. Our nomogram based on LODDS predicts the OS of ECS patients more accurately and conveniently than the FIGO staging system alone. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-08888-0.


Introduction
Endometrial cancer is one of the most common gynecologic malignancies in the world. More than 65,000 new cases were confirmed in the United States in 2020 [1]. Endometrial carcinosarcoma (ECS), composed of epithelial and mesenchymal cells, is a rare and aggressive solid malignant tumor. It accounts for less than 5% of uterine malignancies, yet about 15% of uterine cancer deaths are related to ECS [2,3]. Recent studies have shown that ECS is more prone to lymph node (LN) metastasis and recurrence after surgery. The foundation of therapy for ECS is surgical resection, including total abdominal hysterectomy, bilateral salpingo-oophorectomy, and lymph node dissection, with or without chemoradiation [4,5]. Despite this aggressive surgical approach, the local recurrence rate is as high as 60%. The International Federation of Gynecology and Obstetrics (FIGO) recommended that ECS use the same staging system as endometrial adenocarcinoma, namely the 2009 FIGO staging system, and pointed out that the clinicopathological stage at the time of diagnosis is an important factor affecting the prognosis [6]. However, up to 30% of ECS patients may have extrauterine metastases at presentation, resulting in a significantly worse prognosis than that of endometrial adenocarcinoma [7]. Studies have reported that the 5-year overall survival rate is 30-45% for stage I or II ECS patients and 0-10% for stage III or IV ECS patients [2,6,7].
Some studies have demonstrated that the postoperative survival rate is affected not only by the overall LN status (i.e., no metastases versus metastases), but also by the number of metastatic LNs [8,9]. Therefore, adequate LN histopathological evaluation is essential to predict the prognosis of ECS. However, the current 2009 FIGO staging system for endometrial cancer is based only on the anatomical location of metastatic LNs, without considering their number, which may limit its prognostic accuracy [10]. Many studies have noted that the LN status of several solid tumors usually depends on both the anatomical location and the number of metastatic LNs [11]. In recent years, many studies have shown a significant correlation between patient survival outcomes and different LN staging systems, including the number of positive lymph nodes (PLNN), lymph node ratio (LNR), and log odds of positive lymph nodes (LODDS) [12][13][14].
LNR is the ratio of the number of positive LNs to the total number of resected LNs [13]. LNR has been reported to provide important guidance regarding the survival of patients with gastric adenocarcinoma, in whom it predicted prognosis better than PLNN [12]. LODDS, defined as the logarithm of the ratio of the number of positive LNs to the number of negative LNs, has been applied to predict the prognosis of several tumors. Even when the number of LNs removed is insufficient, LODDS can still stratify patients by prognosis [15]. At present, there are few studies on the value of different LN staging systems in predicting the prognosis of ECS, and the most appropriate method for predicting the prognosis of ECS remains unclear.
The overall prognosis of women with ECS is dismal. The survival outcomes of women with ECS are even worse than those of women with other types of high-grade endometrial cancer [16,17]. Therefore, the purpose of this study was to compare the prognostic performance of PLNN, LNR and LODDS and to establish a prognostic nomogram to predict the overall survival (OS) rate of patients with ECS based on the population derived from the Surveillance, Epidemiology and End Results (SEER) database.

Patient inclusion
Patients diagnosed with ECS between 2004 and 2015 were retrieved from the Surveillance, Epidemiology, and End Results (SEER) database (SEER*Stat version 8.3.8).
For data collection, we limited the primary site to the International Classification of Diseases for Oncology, third edition (ICD-O-3) code C54.1 and selected only malignant cancers with known age. In total, 99,177 records were collected.
The inclusion criteria were: (1) patients diagnosed with ECS between 2004 and 2015; (2) patients with a histologic diagnosis of ECS (ICD-O-3: 8930 to 8999); (3) patients who were 18 years old or older at diagnosis; (4) patients whose regional nodes were resected and examined after surgery. Patients were excluded if they had inadequate information on race, tumor size, tumor extension, the seventh edition of the AJCC stage, or LNs (including examined LNs and positive LNs), or if information on survival months or cause of death was absent. Based on these criteria, a total of 715 patients were included; the data processing flowchart is presented in Fig. 1. The patients were then randomly assigned to the training cohort and the validation cohort at a ratio of 7:3.
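The random 7:3 assignment described above can be illustrated with a minimal Python sketch. The function name `split_cohort`, the seed, and the rounding rule are assumptions for illustration only; the study does not report which routine was used, and its actual split was 499 training and 216 validation patients, so the exact cohort sizes depend on the sampling procedure.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and validation cohorts.

    Hypothetical helper: the seed and rounding rule are illustrative
    assumptions, not the procedure used in the study.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Example with 715 synthetic patient identifiers
train_ids, validation_ids = split_cohort(range(715))
```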

Characteristics
Clinical characteristics, including year of diagnosis, age, race, metastatic status, histologic grade, tumor size, cause of death, peritoneal cytology status, the seventh edition of the AJCC staging system, the total number of lymph nodes retrieved, the number of metastatic lymph nodes, survival time, and survival status, were collected from the SEER database. The original staging information for ECS in the SEER database follows the seventh edition of the AJCC staging system, which we converted to the 2009 FIGO staging system in this study. PLNN represents the number of positive lymph nodes. LNR is the ratio of the number of positive LNs to the total number of resected LNs. LODDS is defined as the logarithm of the ratio of the number of positive LNs to the number of negative LNs.
The main endpoint was overall survival (OS), calculated from the date of diagnosis to the date of death from any cause. Optimal cut-off values were determined using X-tile software. Based on the optimal cut-off values, tumor size, PLNN, LNR, and LODDS were converted into categorical variables. Tumor size was divided into ≤ 58 mm and > 58 mm groups. PLNN was classified into two groups, namely PLNN1 (= 0) and PLNN2 (> 0). LNR was divided into two categories, namely LNR1 (≤ 0.03448276) and LNR2 (> 0.03448276). LODDS was divided into two subgroups, namely LODDS1 (≤ −0.9199705) and LODDS2 (> −0.9199705). We obtained approval to access the SEER database of the National Cancer Institute in the United States using the reference number 20256-Nov2019.
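As an illustration of the three LN indices defined above, the following Python sketch computes PLNN, LNR and LODDS for one patient. Two details are assumptions not stated in the text: a continuity correction of 0.5 added to both counts (a common convention that keeps LODDS finite when no node or every node is positive) and the use of the natural logarithm. The function name `lymph_node_metrics` is hypothetical.

```python
import math


def lymph_node_metrics(positive, examined, correction=0.5):
    """Return PLNN, LNR and LODDS for one patient.

    Assumptions (not stated in the text): a 0.5 continuity correction
    and the natural logarithm are used for LODDS.
    """
    if examined <= 0 or not 0 <= positive <= examined:
        raise ValueError("need 0 <= positive <= examined and examined > 0")
    negative = examined - positive
    return {
        "PLNN": positive,                      # number of positive LNs
        "LNR": positive / examined,            # positive / total resected
        "LODDS": math.log((positive + correction) / (negative + correction)),
    }


# Two patients with the same PLNN but different numbers of examined nodes
few_nodes = lymph_node_metrics(positive=0, examined=3)
many_nodes = lymph_node_metrics(positive=0, examined=30)
```

Both patients in the usage example have PLNN = 0 and LNR = 0, but their LODDS values differ because the number of negative nodes enters the calculation, which is the property the text attributes to LODDS.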

Development of the model
Associations with OS were evaluated by univariate analysis using the Kaplan-Meier approach, with the log-rank test used to assess statistically significant differences among groups. To predict 1-, 3- and 5-year OS, a multivariate Cox proportional hazards model was fitted, which included the predictors that were relevant in the univariate analysis (P < 0.1) (Table 2). The nomogram was generated from the multivariate analysis using R software. We assessed the predictive performance of the nomogram by evaluating the concordance index (C-index), the area under the receiver operating characteristic (ROC) curve (AUC), the Akaike information criterion (AIC) and calibration plots (comparing the survival probability predicted by the nomogram with the value observed by Kaplan-Meier analysis). A smaller AIC value indicated a better model for predicting outcome. Backward stepwise selection was performed to determine independent covariates [12][13][14][15]. Variables entered into the model were age, tumor size, 2009 FIGO stage, LODDS and peritoneal cytology. Variables were eliminated from the model if their removal improved the overall quality of the model (as measured by AIC). Additionally, according to the total nomogram score of each patient in the training cohort, all patients were divided into three prognostic groups (low-, intermediate-, and high-risk) with a similar number of patients in each to establish a risk classification system. Kaplan-Meier curves and the log-rank test were used to illustrate and compare the OS of patients in different risk groups.
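Two of the performance measures named above can be sketched in a few lines of Python. The code below is an illustrative implementation of Harrell's C-index for right-censored data and of the AIC formula; it is not the R code used in the study, and the function names `concordance_index` and `aic` are hypothetical. Pairs with tied survival times are ignored, which is a simplifying assumption.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i died and did so before
    patient j's last follow-up. It is concordant when the patient who
    died earlier has the higher predicted risk; tied risks count 0.5.
    Pairs with tied times are skipped (simplifying assumption).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def aic(log_likelihood, n_parameters):
    """Akaike information criterion; smaller values indicate a better model."""
    return 2 * n_parameters - 2 * log_likelihood


# Toy data: survival months, death indicator, and model risk score
months = [5, 10, 15, 20]
died = [1, 1, 0, 1]
risk = [0.9, 0.6, 0.4, 0.1]
c_index = concordance_index(months, died, risk)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the nomogram and the FIGO staging system are compared in the Results.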

Validation of the model
The nomogram was validated using the validation cohort of 216 patients. Bootstrap resampling (1000 repetitions) was used for validation to obtain relatively unbiased estimates. For each of the 1000 bootstrap samples, the model was refitted and tested against the observed sample to estimate the predictive accuracy and bias [6,12,13].
Additionally, decision curve analysis (DCA) was used to determine the range of threshold probabilities over which the nomogram is beneficial, and the result was compared with that of the 2009 FIGO staging system. In addition, the predictive efficiency of PLNN, LNR, and LODDS was compared using the C-index, AIC, and AUC [12][13][14][15].
Descriptive statistics are reported as mean ± standard deviation (SD) for continuous variables and as number for categorical variables. A chi-square test was used for the analysis of all categorical data. The Kruskal-Wallis H test or Wilcoxon test was used for the analysis of continuous variables. Bonferroni-adjusted significance tests were applied for pairwise comparisons. The Kaplan-Meier method and the log-rank test were used to construct and compare the survival curves, respectively. Statistical analysis was carried out with SPSS (Statistical Package for the Social Sciences) for Windows, version 22, and R 3.6.3 software (http://www.r-project.org). A p < 0.1 was chosen as the criterion for retaining a variable from the univariate analysis in the multivariate Cox proportional hazards model, and a p < 0.05 was considered significant for all other tests.
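The quantity plotted in a decision curve is the net benefit at each threshold probability. The Python sketch below shows the standard net benefit formula for a binary outcome at a fixed time point. It assumes that every patient's status at that time is known; handling of censored follow-up, which a survival-based DCA requires, is deliberately left out. The function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(predicted_risk, died, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)

    Simplifying assumption: the outcome is a known binary status at a
    fixed time point, so censoring is not handled.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(died)
    true_pos = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and d)
    false_pos = sum(1 for p, d in zip(predicted_risk, died) if p >= threshold and not d)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Toy example: model-based net benefit versus treating every patient
risks = [0.9, 0.8, 0.3, 0.1]
deaths = [1, 0, 1, 0]
model_nb = net_benefit(risks, deaths, threshold=0.2)
treat_all_nb = net_benefit([1.0] * 4, deaths, threshold=0.2)
```

Repeating the calculation over a grid of thresholds for two models, for example the nomogram and the FIGO stage, yields the curves that a DCA compares.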

Patient characteristics and survival outcomes
The study enrolled 715 patients with ECS diagnosed from 2004 to 2015 in the SEER database. These patients were randomly divided into a training cohort and a validation cohort at a ratio of 7:3. The clinical and demographic characteristics of the included patients are summarized in Table 1. The mean age was 63.38 years (range 21 to 85 years) in the whole population, 63.52 years (range 24 to 85 years) in the training cohort, and 63.05 years (range 21 to 85 years) in the validation cohort. In the whole population, 385 (53.85%) patients were diagnosed with stage I disease, 55 (7.69%) with stage II, 196 (27.41%) with stage III, and 79 (11.05%) with stage IV. Moreover, the mean PLNN was 1.08 ± 2.23, the mean LNR was 0.81 ± 0.73, and the mean LODDS was 0.048 ± 1.71. To compare the different LN staging systems comprehensively and reasonably, we dichotomized the continuous PLNN, LNR and LODDS variables at their optimal cut-off points. PLNN was classified into two groups: 205 (29%) in PLNN1 (= 0) and 510 (71%) in PLNN2 (> 0). For LNR, 207 (29%) patients were in LNR1 (≤ 0.03448276) and 508 (71%) in LNR2 (> 0.03448276). For the LODDS system, 222 (31%) patients were in the LODDS1 group (LODDS ≤ −0.9199705), and 493 (69%) patients were in the LODDS2 group (LODDS > −0.9199705). At the last follow-up, 303 patients (42%) had died. Only 33 patients (11% of all deaths) died of causes other than ECS. The median OS in the whole population (n = 715) was 51 months. (Table 3).

Construction and validation of the Nomogram for OS
On the basis of the univariate and multivariate Cox regression analyses described above, a nomogram incorporating the significant risk factors was established to predict 1-, 3- and 5-year OS rates of ECS patients (Fig. 2). Additional file 1: Fig. S1 provides a visual guide for daily use of the model. For predicting the OS rate, the C-index of the nomogram in the training cohort was 0.743 [95% confidence interval (95% CI), 0.718-0.769], compared with 0.674 (95% CI, 0.647-0.701) for the 2009 FIGO staging system. The C-index of the nomogram in the validation cohort was 0.735 (95% CI, 0.717-0.753), compared with 0.669 (95% CI, 0.651-0.687) for the FIGO 2009 staging system. Furthermore, the predictive performance of our nomogram was assessed using time-dependent ROC curves (Fig. 3A, C). The AUC values for 1-, 3- and 5-year OS rates were 0.740, 0.682 and 0.660 in the training cohort and 0.798, 0.683 and 0.630 in the validation cohort, indicating good discriminative performance of the nomogram in both cohorts.
In addition, calibration was good in both the training and validation cohorts (Fig. 3B, D). These results indicated that, compared with the FIGO 2009 staging system, our nomogram demonstrated better discrimination and prognostic prediction capabilities.

Clinical value of nomogram
According to DCA, compared with the FIGO 2009 staging system, the nomogram demonstrated more net benefit across the range of decision threshold probabilities (Fig. 4). Most importantly, patients can benefit more from the nomogram to predict individual survival outcomes.

Risk classification system of Nomogram
In addition to the nomogram, a risk classification system for OS was developed from the total nomogram score of each patient in the training cohort, dividing all patients into three prognostic groups with a similar number of cases per group. Based on this novel classification system, patients were classified into the low-risk (166/499, 33.3%; score −1.025 to 0.837), intermediate-risk (167/499, 33.5%; score 0.837 to 1.655), or high-risk group (166/499, 33.3%; score 1.655 to 4) (Fig. 2). The Kaplan-Meier curves showed that the risk classification system accurately differentiated OS among the groups (Fig. 5) and performed much better than FIGO 2009.
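The score ranges reported above translate into a simple lookup, sketched here in Python. The cut-offs 0.837 and 1.655 are taken from the text; assigning a score that falls exactly on a cut-off to the lower-risk group is an assumption, since the text does not say how boundary values were handled. The function name `risk_group` is hypothetical.

```python
def risk_group(total_score, low_cutoff=0.837, high_cutoff=1.655):
    """Map a nomogram total score to a risk group.

    Cut-offs come from the reported score ranges; placing boundary
    scores in the lower-risk group is an illustrative assumption.
    """
    if total_score <= low_cutoff:
        return "low"
    if total_score <= high_cutoff:
        return "intermediate"
    return "high"


# Classify a few example scores
groups = [risk_group(s) for s in (-1.0, 1.2, 3.5)]
```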

Discussion
ECS is a rare tumor with a poor prognosis due to its extremely aggressive behavior [3]. An accurate staging system to predict survival is pivotal for guiding treatment selection and judging the prognosis of patients with ECS. According to the 2009 FIGO staging system, the [...] Recently, because of the close relation between LN status and prognosis in many tumors, many studies have sought an optimal LN staging system [13,22,23]. Previous studies evaluating these staging systems in different tumors have reported differing results: some support the prognostic ability of the LNR staging system, while others advocate the use of the LODDS staging system [13,[24][25][26][27]. In this study, we compared the predictive abilities of PLNN, LNR and LODDS. Comparison of the C-index, AUC and AIC of the three lymph node staging systems showed that LODDS was a slightly better prognostic indicator for predicting the OS of ECS patients. In addition, multivariate Cox analysis indicated that LODDS was an independent prognostic determinant for ECS patients. Moreover, LODDS takes into account not only the absolute number of positive lymph nodes but also the number of negative lymph nodes. Therefore, LODDS discriminates better, especially among patients with no lymph node involvement or with involvement of all lymph nodes. To the best of our knowledge, this was the first study to evaluate the prognostic ability of different LN staging systems for ECS patients.
According to our results, patients with advanced age, tumor size > 58 mm, or positive peritoneal cytology had a significantly worse OS rate than others. It was previously reported that, compared with the small-size group, the large-size group was more prone to lymph node metastasis, distant metastasis and aggressive growth, all of which were related to a poor prognosis [28,29]. The current staging system (2009 FIGO) does not include malignant peritoneal cytology as a staging factor, although malignant peritoneal cytology has been reported to be strongly related to an increased risk of all-cause mortality in ECS patients [30].
DCA is a practical tool for determining the effectiveness of model-based clinical decisions and can directly provide useful clinical information [31]. We applied DCA to estimate the effectiveness of threshold probability-based prediction models in clinical practice, and it demonstrated that the nomogram has advantages over the 2009 FIGO staging system alone. Therefore, based on the SEER database, we established a nomogram incorporating the LODDS system to predict the OS rate of ECS patients. Compared with FIGO 2009 staging, our nomogram shows higher accuracy and relatively better prognostic judgment. Furthermore, a novel risk classification was established on the basis of the predictive scores calculated by the nomogram, which classified all patients into three prognostic groups.
It should be pointed out that our research has some limitations. First, familial endometrial carcinoma, such as that associated with Lynch syndrome, an autosomal dominant susceptibility disease, is not recorded in the SEER database. Second, detailed individual information on chemotherapy, radiotherapy or any other treatment before surgery was unavailable; these may be relevant factors that are not included in the model. Third, although SEER is a large population-based database, retrospective data carry inherent bias, and external data from different cancer centers were lacking to validate the nomogram. Furthermore, because pathology data in the SEER database are entered retrospectively from multiple centers without central pathology review, there could be some deviation in diagnosis [32]. Future well-designed studies could improve the nomogram by incorporating these factors based on their predictive power.

Conclusions
Our results indicated that LODDS was an independent prognostic indicator for ECS patients, with better predictive efficiency than PLNN and LNR. A novel prognostic nomogram for predicting the OS rate of ECS patients was established based on the population in the SEER database. Our nomogram based on LODDS has a more accurate and convenient value for predicting the OS of ECS patients than the FIGO staging system alone.