Calculating the overall survival probability in patients with cervical cancer: a nomogram and decision curve analysis-based study

Background Cervical cancer has long been a common malignance troubling women. However, there are few studies developing nomogram with comprehensive factors for the prognosis of cervical cancer. Hence, we aimed to build a nomogram to calculate the overall survival (OS) probability in patients with cervical cancer. Methods Data of 9876 female patients in SEER database and diagnosed as cervical cancer during 2010–2015, was retrospectively analyzed. Univariate and multivariate Cox proportional hazard regression model were applied to select predicted factors and a nomogram was developed to visualize the prediction model. The nomogram was compared with the FIGO stage prediction model. Harrell’s C-index, receiver operating curve, calibration plot and decision curve analysis were used to assess the discrimination, accuracy, calibration and clinical utility of the prediction models. Result Eleven independent prognostic variables, including age at diagnosis, race, marital status at diagnosis, grade, histology, tumor size, FIGO stage, primary site surgery, regional lymph node surgery, radiotherapy and chemotherapy, were used to build the nomogram. The C-index of the nomogram was 0.826 (95% CI: 0.818 to 0.834), which was better than that of the FIGO stage prediction model (C-index: 0.785, 95% CI: 0.776 to 0.793). Calibration plot of the nomogram was well fitted in 3-year overall OS prediction, but overfitting in 5-year OS prediction. The net benefit of the nomogram was higher than the FIGO prediction model. Conclusion A clinical useful nomogram for calculating the overall survival probability in cervical cancer patients was developed. It performed better than the FIGO stage prediction model and could help clinicians to choose optimal treatments and precisely predict prognosis in clinical care and research.


Background
Cervical cancer has long been a common malignance troubling women. Although the screening programs for cervical cancer are conducted in many countries and regions, there are still a large number of people dead of advanced stage cervical cancer [1]. It was estimated that there were approximately 311,000 deaths owing to cervical cancer worldwide, which ranked only following the breast cancer [2]. And the proportion of cervical cancer in young women is increasing, which will shorten life expectancy [3]. Tough there are several clinical treatments on cervical cancer, the prognosis of advanced cervical cancer is still poor. The International Federation of Gynecologists and Obstetricians (FIGO) stage system mainly based on clinical examination, is widely used to stage cervical cancer for clinicians to choose specific treatments and predict prognosis [4]. However, the survival of cervical cancer patients differ from each other even with the same clinical stage. Therefore, using FIGO stage system to estimate prognosis of cervical cancer is not entirely satisfactory. And many other factors have been identified related to the survival of cervical cancer [5,6].
Nomogram is a visualized method of prediction model and it can generate a probability of a clinical event tailored to individual patient [7]. It integrates multiple predictors to provide comprehensively considered probabilities. In recent years, nomogram has gained increasing attention. Some researchers used it in oncology studies, and found that nomogram could precisely predict the oncology diagnosis and prognosis, and perform better than the frequently-used TNM stage system [8,9]. There were some nomograms built for predicting the survival of cervical cancer, but they tended to be based on small sample size cohorts, which might reduce robustness of the prediction models [10,11].
Decision curve analysis is a novel method and recommended by several top journals [12][13][14]. It can calculate net benefit of the prediction models to measure their clinical usefulness, which is significant for the final application of the prediction models. However, in the field of cervical cancer, there are few researches applying it to analyze the net benefit of the prediction models for its newness [10,11].
In this study, our goal was to develop a clinical useful nomogram to calculate the overall survival probability in cervical cancer patients, based on the sociodemographic characteristics and clinical treatment information. Such a nomogram would be a useful tool helping clinicians to choose optimal treatments in clinical care and research.

Patients
All data was obtained from the Surveillance, Epidemiology, and End Results (SEER) database by SEER*Stat 8.3.6. (https://seer.cancer.gov/seerstat/). SEER database is a database of cancer statistics, collecting information of patients in 18 tumor registries and covering 28% of the total U.S. population [15]. When downloading the SEER*Stat, we all signed and returned the research data agreement to the SEER Program and followed the agreement through the whole study in order to protect the privacy of patients.
A total of 9876 patients were finally included, and the inclusion and exclusion of patients were done through SEER*Stat 8.3.6. by choosing corresponding variables and limitations. Those who were newly diagnosed with cervical cancer during 2010-2015 were included. Those who were less than 18 years old, had multisource tumor and whose information was uncompleted or collected from autopsy or death certificate were excluded. The ending status was dead or censor by November 31st, 2018.
Data included the sociodemographic information (age at diagnosis, race and marital status at diagnosis), pathologic and histologic information (grade, histology, tumor size, FIGO stage, cancer cells in lymph nodes and metastasis), clinical treatments (primary site surgery, regional lymph node surgery, radiotherapy and chemotherapy) and survival information (survival time and ending status). Among them, grade represented pathological grade, including well differentiated (Grade I), moderately differentiated (Grade II) and poorly differentiated (Grade III). FIGO stage was classified by the 2009 FIGO stage system. And the continuous predictors (age and tumor size) were divided into subgroups by previously reported cutoff points [6,16]. The time calculated from diagnosis to dead or censored was defined as overall survival.

Statistical analysis
Descriptive statistics was used to embody the baseline characteristics. Kaplan-Meier method was conducted to estimate overall survival rates, and log-rank test was employed to contrast the different subgroups of the variables. Univariate and multivariate Cox proportional hazard regression model were performed to find significant predictors (P-value< 0.05). Forward stepwise was conducted by likelihood ratio (LR). A nomogram was built to visualize the prediction model based on the prognostic factors and it was compared to the FIGO stage prediction model which was only based on the FIGO stage.
Bootstrapping 1000 resamples was used to internally validate the predicted ability of the nomogram. Harrell's C-index was calculated to measure the discrimination of the prediction models, which mirrored their abilities to accurately distinguish patients who were dead and censored. The area under the receiver operating characteristic (ROC) curve (AUC) was used to assess the accuracy of the prediction models in 3-year and 5-year OS prediction.
Calibration plots were drawn to assess the calibration of the nomogram. The predicted probabilities for each cervical cancer patient were listed in orders and divided into ten groups, then compared with the actual probabilities. The calibration plots can measure whether a nomogram is erroneously estimating and overfitting. A prediction model is considered having good calibration when the plot perfectly agrees with the 45-degree line. When the slope is less than 1, it illustrates that the nomogram is overfitting; when the intercept is less than 0, it illustrates that the nomogram overestimates the probabilities [17].
Clinical value of the prediction models were estimated by decision curve analysis. It can compare net benefits of a prediction models with the scenes when all patients die or none. The x-axis and y-axis represent threshold probability and net benefit, respectively. Net benefit is calculated by benefits of the positives subtracting harms of the false positives [12]. If the prediction model has higher net benefits than the scenes when all patients die or none, it is considered of being clinical useful.

Baseline characteristics
A total of 9876 cervical cancer patients were included. Baseline characteristics could be seen in Table 1. There were 2505 (25.36%) death over a median follow-up time of 42.43 (95% CI, 41.95 to 42.91) months. In all patients, the 3-year OS rate was 74.4%, and the 5-year OS rate was 67.7%. The survival curves were shown in Additional file 1: Fig. S1.

Independent prognostic factors
The results of univariate and multivariate Cox regression analysis could be seen in Table 1. Age at diagnosis, race, marital status at diagnosis, grade, histology, tumor size, FIGO stage, cancer cells in lymph nodes, metastasis, primary site surgery, regional lymph node surgery, radiotherapy and chemotherapy were the factors influencing on the overall survival of cervical cancer patients in univariate analysis. However, cancer cells in lymph nodes and metastasis were not statistically significant in multivariate analysis. Therefore, eleven variables were the independent prognostic factors of the nomogram.

Development and internal validation
The independent prognostic factors were entered to develop the nomogram (Fig. 1). In the nomogram, FIGO stage shared the largest contribution, followed by histology, grade and chemotherapy. The sociodemographic factors (age at diagnosis, race and marital status at diagnosis) also partially dedicated to the nomogram. The Cindex of the nomogram was 0.826 (95% CI: 0.818 to 0.834), which was better than that of the FIGO stage prediction model (C-index: 0.785, 95% CI: 0.776 to 0.793). ROC curves for the nomogram and FIGO stage prediction model were shown in Fig. 2. AUC of the nomogram in 3-year and 5-year OS prediction were 0.847 and 0.831 respectively, which were higher than those of the FIGO stage prediction model (0.807 in 3year OS prediction and 0.793 in 5-year OS prediction). The calibration plots of nomogram were shown in Fig. 3. The nomogram had good calibration when predicting 3year OS probability. But when it came to 5-year OS prediction, it had poor calibration and was overfitting and overestimated the probabilities.

Clinical usefulness
Decision curves for the nomogram and FIGO stage prediction model were showed in Fig. 4. In 3-year OS prediction, when the threshold probability was between 0 and 90%, the net benefits of the nomogram were better than the scenes when all patients died or none. In 5-year OS prediction, when the threshold probability was between 4 and 91%, the net benefits of the nomogram were better than the scenes when all patients died or none. And the net benefits of the nomogram were higher than those of the FIGO stage prediction model.

Discussion
A nomogram was developed to calculate 3-year and 5year OS probability in patients with cervical cancer. The nomogram comprised eleven predictors on the sociodemographic characteristics and clinical treatment information. The discrimination of the nomogram was better than the FIGO stage prediction model. The calibration of the nomogram was great in 3-year OS prediction, but poor in 5-year OS prediction. In addition, the nomogram had higher net benefit than the FIGO stage prediction model.
We found that FIGO stage was not the only factor influencing the prognosis of cervical cancer, and sociodemographic characteristics and clinical treatment information were also important. For married women, they have sex life, which is an important factor of the occurrence of cervical cancer. But they could also get external support from their husband, which is a protective factor [18]. The influence of histology on the prognosis of cervical cancer has long been debated [19,20]. A study reported that adenocarcinoma had negative relation with the survival of advanced stage cervical cancer [21]. Tough FIGO stage system was widely used in clinical activities, the survival time of cervical cancer was diverse even with identical stage. Therefore, the FIGO     stage is not satisfactory enough when used to predicted prognosis. The nomogram could reduce the diversity due to different treatment and sociodemographic status when predicting prognosis of cervical cancer. And we found that the nomogram performed better than the FIGO stage prediction model on precise prognosis. Nomogram is widely used as diagnosis device to predict the probability of patients suffering from diseases and the prognosis of malignance [22][23][24][25]. But for the prognosis of cervical cancer, only a few studies applied it to visualize prediction models [10,11]. In previous studies, most were based on small sample size cohorts, which might reduce the robustness and generalizability of the nomograms [10,26]. Our nomogram was based on a cohort with large sample size, and it guaranteed the reliability and generality of the result. Some researchers developed nomograms to predict survival of cervical cancer with certain FIGO stages or specific treatments [27][28][29]. Marchetti et al. [27] estimated the survival of stage IB2-IIIB cervical cancer after curative chemotherapy and radical surgery with nomogram. The C-indexes of the previous nomograms tended to be between 0.65 and 0.75, which were acceptable [10,23,30]. The Cindex of our nomogram was 0.826 (95% CI, 0.818 to 0.834), indicating that the discrimination ability of our monogram was excellent. In addition, the C-index of the nomogram was better than that of FIGO stage prediction model. For current reports with nomogram, most are with great calibration, and their calibration plots are close to the ideal line [10,23,30]. In our study, the nomogram had good calibration in 3-year OS prediction. However, the calibration plot of the nomogram deviated from the ideal line in 5-year OS prediction. It might be owing to the inappropriate proportion of censored events. In this study, there were too many censored events covering over 70% of total patients. The increased censorship might lead to decreased accuracy and effectiveness, and increased bias of the prediction model [31]. Despite that, this nomogram was still meaningful as its calibration was good in 3-year OS prediction.
Decision curve analysis puts benefit and harm together to measure net benefit of diagnosis method or prediction model [12]. Compared with traditional ROC curve, decision curve analysis is better, because it takes clinical usefulness into consideration. Clinical usefulness is an important judging indicator whether a prediction model can be truly used in clinical activities and patients can benefit from it. As far as we know, the number of papers applying this new method to assess the net benefit of prediction models is very small. In some high-quality papers, researchers used it to assess the clinical usefulness of their prediction models about venous thromboembolism and gestational diabetes mellitus [24,25]. But there are only a few papers applying it to prediction models for cervical cancer survival. Zhang et al. [32] measured the net benefit of their risk assessment system for estimating survival time of distantly metastatic cervical cancer, and found that it was of clinical utility. In this study, we not only calculated the net benefit of the nomogram but also the FIGO stage prediction model. And we found that the net benefit of the nomogram was higher than the FIGO stage prediction model, indicating that our nomogram was clinical useful.
For patients with cervical cancer, the key point they are concerned about might be that how long they will live. Tough the FIGO stage system is currently available in clinical activities, the survival time of cervical cancer has a wide spectrum even with identical FIGO stage. Our study successfully developed a nomogram to predict the survivorship of cervical cancer patients, which was composed of sociodemographic and clinical treatment information. It could provide more comprehensive and more accurate prognostic prediction than the traditional FIGO stage system and it could be proposed as a complement of the FIGO stage system. The nomogram was simple-to-use and it could be used as a paper-based or online prediction tool to predict prognosis of cervical cancer before and after treatment, so that it could help clinicians make the optimal strategic decisions, provide individualized clinical care and consultation to meet the needs of patients. Moreover, it could aid clinicians to distinguish patients who might have more benefits from treatments, carry out clinical trials and make tailored follow-up plans. In short, the nomogram was helpful both in clinical treatment and research.
However, our study still had some limitations. Firstly, the nomogram did not contain some factors related to cervical cancer, such as lymph vascular space invasion. Umezu et al. [33] found that lymph-vascular space invasion was one of the prognostic factors of the overall survival of cervical cancer patients who were staged IA-IIA and underwent surgical resection. And Srisomboon et al. [34] identified that lymph vascular space invasion was an important factor influencing the survival of cervical cancer. Because information of lymph-vascular space invasion for cervical cancer patients was blank in the SEER database, we did not include this significant factor in this study. Besides, the SEER database did not have details of the chemotherapy, such as the use of targeted drug, which was critical for the prognosis of cervical cancer. And it lacked the information of living surroundings, lifestyle, adjuvant therapy and commodities, so we could not get all prognostic factors into consideration, which was an intrinsic limitation of SEER-based study. However, this nomogram embodied acceptable performance with present prognostic factors. Secondly, different treatments might have different impacts on the prognosis of cervical cancer. We failed to subdivide each treatment and only divided them by whether it was performed or not. For primary site surgery, there are many surgery methods for cervical cancer, such as conization of uterine cervix, total hysterectomy removes, et al. Patients with different surgery methods might have different outcome. Thirdly, we did not conduct external validation to further assess this nomogram.

Conclusions
In conclusion, we developed a clinical useful nomogram for calculating overall survival in cervical cancer patients, based on sociodemographic and clinical related information. It performed better than the FIGO stage prediction model and could help clinicians to choose optimal treatments and precisely predict prognosis in clinical care and research.