MRI-based random survival forest model improves prediction of progression-free survival to induction chemotherapy plus concurrent chemoradiotherapy in locoregionally advanced nasopharyngeal carcinoma

Background The present study aimed to compare the value of a random survival forest (RSF) model and Cox models in predicting progression-free survival (PFS) among patients with locoregionally advanced nasopharyngeal carcinoma (LANPC) after induction chemotherapy plus concurrent chemoradiotherapy (IC + CCRT). Methods Radiomics features were extracted from the pretreatment magnetic resonance imaging (MRI) scans of eligible LANPC patients. Radiomics and clinical features of patients in the training cohort were subjected to RSF analysis to predict PFS, and the resulting model was tested in the testing cohort. The performance of the RSF model with clinical and radiologic predictors was assessed with the area under the receiver operating characteristic (ROC) curve (AUC) and the DeLong test and was compared with Cox models based on clinical and radiologic parameters. Further, the Kaplan-Meier method was used for risk stratification of patients. Results A total of 294 LANPC patients (206 in the training cohort; 88 in the testing cohort) were enrolled and underwent MRI scans before treatment. For 3-year PFS, the AUCs of the clinical Cox model, radiomics Cox model, clinical + radiomics Cox model, and clinical + radiomics RSF model were 0.545, 0.648, 0.648, and 0.899 in the training cohort and 0.566, 0.736, 0.730, and 0.861 in the testing cohort; for 5-year PFS, they were 0.556, 0.604, 0.611, and 0.897 in the training cohort and 0.591, 0.661, 0.676, and 0.847 in the testing cohort. The DeLong test showed that the differences between the RSF model and each of the three Cox models were statistically significant, with the RSF model markedly improving prediction performance (P < 0.001). Additionally, the PFS of the high-risk group was lower than that of the low-risk group under the RSF model (P < 0.001), whereas the two groups had comparable PFS under the Cox model (P > 0.05). Conclusion The RSF model may be a potential tool for prognostic prediction and risk stratification of LANPC patients.
Supplementary Information The online version contains supplementary material available at 10.1186/s12885-022-09832-6.


Background
Nasopharyngeal carcinoma (NPC) is an epithelial malignant tumor that originates from the nasopharyngeal mucosa; it has a distinct geographical distribution and is particularly prevalent in southern China [1,2]. More than 70% of NPC patients are at a locoregionally advanced stage (stage III-IVa) at diagnosis [3]. Big-data and multi-center studies have shown that, compared with CCRT alone, IC + CCRT significantly improves survival in LANPC patients [4,5]. Moreover, IC + CCRT is recommended with level 2A evidence for these patients by the National Comprehensive Cancer Network (NCCN) guidelines, and it has become the first-line therapy for LANPC [6]. Nevertheless, approximately 20-30% of NPC patients respond unsatisfactorily to IC + CCRT [7,8], and local recurrence and distant metastasis remain the main causes of treatment failure in LANPC patients [9]. Giving IC + CCRT to patients who do not benefit from it substantially increases toxicity and treatment cost [10]. Therefore, an effective method is needed to predict, before treatment, the response, prognosis and survival of LANPC patients undergoing IC + CCRT, so that clinicians can develop individualized treatment regimens.
Presently, the TNM staging system and MRI are routine approaches for therapeutic decision-making and prognostic prediction of LANPC [11,12]. However, the TNM staging system and conventional MRI sequences such as T1-weighted imaging (T1WI) and T2-weighted imaging (T2WI) mainly describe the anatomical extent of tumor invasion and do not capture the microscopic conditions within the tumor, so they cannot accurately predict the prognosis of patients. Inflammatory biomarkers have been shown to be prognostic predictors for NPC patients. However, differences in sample size and therapeutic approach across studies lead to different cut-off values for inflammatory biomarkers, limiting their predictive value for the prognosis of LANPC patients [13,14]. Radiomics is a rapidly emerging analytical approach. Radiomics analysis of imaging data can reflect the heterogeneity within the tumor through features extracted automatically by numerous data-characterization algorithms [15]. Tumor heterogeneity may be closely associated with cancer staging, prognostic prediction, and treatment response [16]. Recently, radiomics has been applied to predict the efficacy and prognosis of NPC, and radiomics features have been shown to be associated with PFS, recurrence, metastasis, and other clinical outcomes [17-20]. Although many different algorithms are available for developing radiomics risk models for NPC, it is unclear which algorithm performs best. The traditional Cox risk regression model is the one most commonly used for predicting the efficacy and prognosis of NPC, but its predictive performance is unstable, and no standardized guideline is available. Thus, its use in the prognostic prediction of NPC remains controversial [21-23].
The RSF model is an ensemble machine learning model built from survival trees and is suited to constructing prognostic models from survival data. Unlike the Cox risk regression model, it does not require the distribution of parameters to be specified in advance, nor does it assume that variables have a linear effect on the risk function. Hence, it is suitable for modeling high-dimensional complex data and can explore the nonlinear effects of variables on prognosis [24,25]. In addition, the RSF model can rank variables by importance, which allows the more important variables to be screened and the number of variables to be reduced, and this is conducive to applying the model in clinical practice. Lin et al. [26] constructed an RSF model to predict the survival outcome of hepatocellular carcinoma (HCC) patients with Barcelona Clinic Liver Cancer (BCLC) stage B disease after transcatheter arterial chemoembolization (TACE). Studies comparing RSF with other methods, including the Cox regression model, have found the performance of RSF to be superior or comparable to that of the other models [27]. The RSF model has also shown good prediction performance in prognostic studies of tumors such as glioma and lung cancer [28,29]. Nevertheless, few data are available regarding the accuracy of the RSF model versus the traditional Cox risk regression model in predicting the prognosis of LANPC patients after IC + CCRT.
The present study aimed to construct prediction models using the RSF method and Cox regression, each based on clinical and radiomics parameters of LANPC patients after IC + CCRT, and to compare the prediction performance of these models. We hypothesized that the RSF model would perform better, which would help improve precise individualized treatment and clinical decision-making for LANPC patients.

Study design and participants
The present study used data from the medical records of our hospital from January 2015 to June 2018. Patients were eligible for inclusion if they had a histological diagnosis of LANPC, had not received any previous anti-tumor therapy, underwent a pretreatment MRI scan (including axial T2WI and contrast-enhanced T1WI (CET1WI) images), and were treated with IC + CCRT. The exclusion criteria were: 1) distant metastasis before the initial treatment; 2) pre-existing or concurrent malignant tumors; 3) insufficient quality of MRI due to motion artifacts or poor contrast material injection.
Eligible patients were randomly assigned to the training cohort (n = 206) and testing cohort (n = 88) at a ratio of 7:3. Tumor staging was classified according to the 8th edition of the American Joint Committee on Cancer (AJCC) TNM Staging System Manual. According to the World Health Organization (WHO) criteria, the histological tumor subtypes were classified as type I (differentiated keratinizing carcinoma), type II (differentiated non-keratinizing carcinoma), and type III (undifferentiated non-keratinizing carcinoma). The present study was approved by the Institutional Review Board, and the requirement for written informed consent was waived.
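The 7:3 random assignment described above can be sketched as follows. This is a minimal illustration only: the original text does not state how the randomization was implemented, so the function name, the integer patient IDs, and the fixed seed are all assumptions made for the example.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and testing lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical IDs 0..293 standing in for the 294 enrolled patients.
train_ids, test_ids = split_cohort(range(294))
print(len(train_ids), len(test_ids))  # 206 88
```

With 294 patients, rounding 0.7 x 294 gives the 206/88 split reported in the text.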

Treatment and data collection
Details of the treatments are provided in the Supplementary Materials. Patients were followed up every 1-3 months in the first 2 years, every 6 months in years 3-5, and once a year thereafter. All participants were followed up for at least 2 years. The study endpoint was PFS, calculated from the start of treatment to disease progression (or censored at the last follow-up).

Image acquisition and segmentation
The details regarding the acquisition parameters and image segmentation are presented in the Supplementary Materials. The radiomics workflow is shown in Fig. 1. All tumor segmentations were performed blindly by two radiologists (observers 1 and 2, with 10 and 15 years of clinical experience, respectively, in interpreting head and neck MRI images) (Fig. 1A).
A total of 2074 radiomics features were extracted from the T2WI and CET1WI images of each patient, including histogram, shape, and texture features (Fig. 1B). All features were standardized by Z-score using the training cohort data, and univariate/multivariate Cox regression and the RSF method were used to reduce the dimensionality of the data (Fig. 1C) and select the optimal features.
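Standardizing "based on training cohort data" means that the mean and standard deviation of each feature are estimated on the training cohort and then reused unchanged for the testing cohort. A minimal sketch is given below; the toy feature values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def fit_zscore(train_rows):
    """Estimate per-feature (mean, sd) from the training rows only."""
    columns = list(zip(*train_rows))
    return [(mean(col), stdev(col)) for col in columns]

def apply_zscore(rows, params):
    """Standardize rows with previously fitted (mean, sd) pairs."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, (m, s) in zip(row, params)]
        for row in rows
    ]

# Hypothetical values: 3 training patients, 1 testing patient, 2 features.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test = [[2.0, 40.0]]

params = fit_zscore(train)              # statistics come from the training cohort
train_z = apply_zscore(train, params)
test_z = apply_zscore(test, params)     # testing cohort reuses the training statistics
print(test_z)
```

Reusing the training statistics keeps information from the testing cohort out of the model-building step.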
Construction of the Cox prediction model: Based on the results of the multivariate stepwise Cox analysis of clinical and radiomics features in the training cohort, the Cox prediction model was constructed.
Construction of the RSF model: The RSF is an ensemble of binary survival trees; bootstrap sampling and random node splitting were used to grow independent trees, which were then combined to form the forest. Details of the training steps of the RSF model are provided in the Supplementary Materials. The risk scores output by the Cox and RSF models, based on clinical and radiomics features, were used to stratify patients in the training and testing cohorts into high- and low-risk groups, and the survival outcomes of the two groups were compared.
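The two sources of randomness in a random survival forest, bootstrap sampling of patients and random selection of candidate features at a node, can be illustrated without growing actual trees. The sketch below only plans the sampling. The square-root rule for the number of candidate features is a common default assumed here, and the cohort size, feature count, and tree count are borrowed from the text purely for illustration; the authors' actual training settings are in their Supplementary Materials.

```python
import math
import random

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

def candidate_features(n_features, rng, mtry=None):
    """Pick the random subset of features considered at one node split."""
    if mtry is None:
        mtry = math.ceil(math.sqrt(n_features))  # assumed default, not from the paper
    return rng.sample(range(n_features), mtry)

rng = random.Random(42)
n_patients, n_features, n_trees = 206, 2074, 100  # illustrative sizes

forest_plan = []
for _ in range(n_trees):
    in_bag, oob = bootstrap_indices(n_patients, rng)
    forest_plan.append({
        "in_bag": in_bag,   # patients used to grow this tree
        "oob": oob,         # patients left out, usable for an error estimate
        "first_split_candidates": candidate_features(n_features, rng),
    })

mean_oob = sum(len(t["oob"]) for t in forest_plan) / (n_trees * n_patients)
print(f"mean out-of-bag fraction: {mean_oob:.3f}")
```

Each tree leaves roughly a third of the patients out of its bootstrap sample, which is why forests can report an out-of-bag error rate without a separate validation set.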

Statistical analysis
Statistical analyses were performed using R software (version 4.1.1). Normally distributed continuous data are presented as mean ± standard deviation (SD) and were compared with the t test; skewed continuous data are presented as median (range) and were compared with the Mann-Whitney U test. Categorical data are presented as counts or percentages and were compared using the χ² test. Univariable and multivariable survival analyses were conducted using the Cox proportional hazards model. The Kaplan-Meier method was used to plot survival curves and estimate survival rates; X-tile software was used to select the optimal cut-off values for continuous variables, and the log-rank test was used to compare survival between two groups. All tests were two-tailed, and P < 0.05 was considered statistically significant. Time-dependent ROC curves were plotted, and the AUC was calculated to evaluate the prediction performance of the different models. The DeLong test was used to compare performance among models. To assess the stability of the results, the prediction models built in the training cohort were validated in the testing cohort.
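As a reminder of what the Kaplan-Meier method computes, the product-limit estimator can be written in a few lines. This is an illustrative sketch with hypothetical follow-up times, not the authors' R code; the function name and data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time per patient (e.g. months)
    events : 1 if progression was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve

# Hypothetical follow-up data for six patients.
times = [6, 12, 12, 20, 30, 36]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Censored patients lower the number at risk without producing a step in the curve, which is how the estimator uses incomplete follow-up.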

Clinical characteristics of the patients
A total of 294 patients (213 males and 81 females; mean age 43.6 years, SD 10.9 years, range 19-71 years) were enrolled in the present study. The last follow-up ended on May 21, 2021, and the median follow-up time was 43.9 months (range: 8.0-75.0 months). The clinical characteristics of all LANPC patients in the training and testing cohorts are summarized in Table 1. Univariate and multivariate Cox regression analyses of the clinical characteristics showed that Epstein-Barr virus (EBV) DNA, overall stage, and T stage were independent risk factors for the survival and prognosis of NPC patients (all P < 0.05) (Table 2).

Construction of radiomics labelling
Intraclass correlation coefficients (ICCs) were calculated both between the features obtained by the two observers and between the features extracted from the regions of interest (ROIs) delineated by observer A. The intra-observer repeatability of the features based on observer A was excellent (ICC = 0.782-0.957), and the inter-observer consistency of the features was good (ICC = 0.732-0.948). From the 2074 radiomics features extracted from the T2WI and CET1WI images, the radiomics labelling was constructed by univariate and multivariate stepwise Cox analysis.
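An ICC for a single radiomics feature measured on the same patients by two observers can be obtained from a two-way ANOVA decomposition. The text does not state which ICC form was used, so the sketch below computes two common single-measure forms, ICC(2,1) (absolute agreement) and ICC(3,1) (consistency), as an assumption; the ratings are hypothetical.

```python
def icc_two_way(ratings):
    """Return (ICC(2,1), ICC(3,1)) for an n_subjects x k_raters table."""
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc_agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_consistency = (msr - mse) / (msr + (k - 1) * mse)
    return icc_agreement, icc_consistency

# Hypothetical feature values: observer 2 reads exactly 1 unit higher than observer 1.
ratings = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc2, icc3 = icc_two_way(ratings)
print(round(icc2, 4), round(icc3, 4))
```

In this toy case a constant offset between observers leaves the consistency form at 1 while lowering the absolute-agreement form, which is why the choice of ICC form matters when reporting feature reproducibility.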

Construction and verification of the Cox nomogram model
A nomogram was constructed from the variables that were significant in the univariate and multivariate Cox analyses (these variables are presented in the Supplementary Materials). In the nomogram (Fig. 2), each variable was assigned a point score based on its hazard ratio (HR). Summing the scores of all variables and locating the sum on the total score scale gives the probability of 3- and 5-year PFS. In the training cohort, the AUCs of the clinical Cox model, the radiomics Cox model, and the clinical + radiomics Cox model for predicting 3-year PFS after NPC treatment were 0.545, 0.648, and 0.648, respectively; the AUCs for 5-year PFS were 0.556, 0.604, and 0.611, respectively. In the testing cohort, the AUCs of the three models for 3-year PFS were 0.566, 0.736, and 0.730, respectively; the AUCs for 5-year PFS were 0.591, 0.661, and 0.676, respectively. The ROC curves are shown in Figs. 3 and 4. Overall, the prediction performance of the three Cox models was comparable (Table 3).
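Reading a nomogram amounts to a table lookup followed by a sum. The sketch below uses only the point values that the Fig. 2 note reports for the three clinical predictors; the radiomics predictors and the mapping from total score to 3- or 5-year PFS probability are not given in the text and are deliberately left out. The dictionary keys and function name are illustrative assumptions.

```python
# Point values as reported in the Fig. 2 note (clinical predictors only).
EBV_DNA_POINTS = {"<1000": 0.0, ">=1000": 7.5}       # copies/ml
OVERALL_STAGE_POINTS = {"III": 0.0, "IV": 3.0}
T_STAGE_POINTS = {"T1": 0.0, "T2": 2.0, "T3": 4.0, "T4": 6.0}

def clinical_points(ebv_dna, overall_stage, t_stage):
    """Sum the nomogram points of the three clinical predictors.

    This is a partial total: the radiomics predictors of the nomogram
    would add further points that are not listed in the text.
    """
    return (
        EBV_DNA_POINTS[ebv_dna]
        + OVERALL_STAGE_POINTS[overall_stage]
        + T_STAGE_POINTS[t_stage]
    )

# Hypothetical patient: EBV-DNA >= 1000 copies/ml, overall stage IV, stage T3.
print(clinical_points(">=1000", "IV", "T3"))  # 14.5
```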

Construction and verification of the RSF model
The error rate as a function of the number of survival trees (up to 100) is shown in Fig. 5. With 100 survival trees, the error rate was low and relatively stable. The RSF model was therefore constructed with ntree = 100, and, as shown in Fig. 5 and the Supplementary Materials, 7 features associated with PFS were selected according to the importance score of each radiomics feature. The survival and cumulative hazard curves over time are shown in Fig. 6. As follow-up time increased, the survival rate predicted by the RSF model gradually decreased and the cumulative hazard increased. The decision rule diagram of the RSF model is shown in Fig. 7.
In the training cohort, the AUC of the RSF model for predicting 3- and 5-year PFS after NPC treatment was 0.899 and 0.897, respectively; in the testing cohort, it was 0.861 and 0.847, respectively. Compared with the three Cox models, the RSF model showed the highest prediction performance, and the differences between it and each Cox model were statistically significant (all P < 0.001, Table 4). Patients in the low-risk group achieved better PFS (all P < 0.001, Fig. 8), supporting the clinical value of this model.
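To make the meaning of a "3-year AUC" concrete, the sketch below computes a naive AUC at a fixed horizon by pairwise comparison of risk scores. It is a simplification and not the authors' time-dependent ROC procedure: patients censored before the horizon are simply dropped instead of being handled by weighting, and no DeLong comparison is performed. All data and names are hypothetical.

```python
def auc_at_horizon(times, events, risk_scores, horizon):
    """Naive AUC for 'progression by `horizon`' versus 'still followed after it'.

    Cases    : progression observed at or before the horizon.
    Controls : follow-up extends beyond the horizon.
    Patients censored before the horizon are excluded (simplifying assumption).
    """
    cases, controls = [], []
    for t, e, r in zip(times, events, risk_scores):
        if t <= horizon and e == 1:
            cases.append(r)
        elif t > horizon:
            controls.append(r)
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))

# Hypothetical cohort: times in months, horizon of 36 months (3 years).
times = [10, 20, 30, 40, 50, 60, 15]
events = [1, 1, 0, 0, 1, 0, 0]
risk = [0.9, 0.4, 0.5, 0.2, 0.6, 0.1, 0.8]
auc_3y = auc_at_horizon(times, events, risk, horizon=36)
print(round(auc_3y, 4))
```

An AUC near 0.5, as reported for the clinical Cox model, means the risk score orders such case-control pairs little better than chance.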

Stratification analysis of the clinical + radiomics Cox nomogram model and RSF model
According to the ROC curves of the Cox and RSF models in the training set, the prognostic risk score that maximized the Youden index was used as the threshold (cut-off value): patients with a risk score below the threshold were assigned to the low-risk group, and those with a risk score greater than or equal to the threshold were assigned to the high-risk group. Figure 8 shows the Kaplan-Meier survival curves for the two models, with patients stratified into high- and low-risk groups by risk score to inform treatment recommendations. Kaplan-Meier survival analysis showed that the combined Cox model could not distinguish PFS between high- and low-risk patients (P > 0.05; Fig. 8A and C), whereas the RSF model could (P < 0.001; Fig. 8B and D).
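The Youden-index cut-off described above can be sketched as a scan over candidate thresholds. For simplicity the example assumes a binary outcome (progression by a fixed time point, yes or no) in place of a full time-dependent ROC analysis; the scores, labels, and function name are hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A patient is called high-risk when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j

# Hypothetical training-cohort risk scores and outcomes (1 = progression).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

threshold, j = youden_threshold(scores, labels)
groups = ["high" if s >= threshold else "low" for s in scores]
print(threshold, round(j, 2), groups.count("high"), groups.count("low"))
```

The threshold is fixed on the training cohort and then applied unchanged to the testing cohort, so that the stratification in the testing cohort is a genuine out-of-sample check.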

Discussion
In the present study, two different models were constructed to predict the PFS of LANPC patients after IC + CCRT. The current findings suggested that compared with the conventional Cox model, the RSF model significantly improved the predictive value and successfully distinguished high-risk and low-risk patients, indicating that it can be used as a noninvasive and useful tool for predicting the prognosis of LANPC patients.
Previous studies have demonstrated that EBV-DNA and TNM staging indicators can help predict the prognosis of NPC [30,31]. The present multivariate analysis showed that pretreatment EBV-DNA, T stage and overall stage were valuable in predicting PFS in LANPC patients, consistent with previous findings [3,30,31], so they were included in the prediction model. However, the prediction performance of the Cox model based only on clinical features was relatively low. In the training cohort, the AUC of the clinical model for 3- and 5-year PFS was 0.545 and 0.556, respectively; in the testing cohort, it was 0.566 and 0.591, respectively. There are several possible reasons. First, all patients were in stage III-IVa, so the clinical stages were narrow and similar, which makes PFS harder to predict from clinical stage. Second, the T and N stages in the present study were unbalanced, with only 5.2% T1 and 2.0% N0 patients in the training set, so even if clinical staging is informative, it will produce large errors. Third, the T stage and overall stage are based on the gross anatomical information of the tumor and cannot reflect the heterogeneity within the tumor. Thus, despite the addition of EBV-DNA, the prediction performance of the model remained low.
Fig. 2 Visual nomogram of the clinical + radiomic Cox model for predicting 3- and 5-year PFS. Note: EBV-DNA, Epstein-Barr virus DNA (0, < 1000 copies/ml; 1, ≥ 1000 copies/ml). How to use the nomogram: first, the score of each predictor is read from the "node" line (EBV-DNA < 1000 copies/ml scores 0 points and EBV-DNA ≥ 1000 copies/ml 7.5 points; overall stage 3 scores 0 points and overall stage 4 3.0 points; stage T1 scores 0 points, stage T2 2.0 points, stage T3 4.0 points, and stage T4 6.0 points, and so on). Then the scores of the ten predictors are summed on the "total score" row. Finally, a vertical line is drawn down from the "total score" row to the "3- or 5-year survival rate" axis.
Recently, radiomics has become a popular approach for tumor prognostic prediction. By analyzing whole tumor lesions, radiomics transforms medical images into mineable, quantitative, and high-dimensional imaging features that reflect tumor heterogeneity, helping to assess patient risk and guide clinical decision-making [32,33]; it is a non-invasive, effective, and reliable approach. Therefore, radiomics labelling can be a useful supplement to clinical features in terms of prognostic value, which may explain why the prognostic prediction performance of the radiomics model in the present study was better than that of the clinical model. The potential clinical value of predictive models based on radiomics in predicting PFS in NPC patients has been previously emphasized [21,34]. However, previous reports mostly used the Cox model to predict the prognosis of NPC. Different studies included NPC patients with different stages and treatment methods, resulting in different clinical and radiomics features, thereby increasing the heterogeneity among studies and affecting the prediction performance [21-23]. A study [35] constructed a Cox proportional hazard regression model to predict the PFS of NPC patients.
However, as compared with the clinical   Similarly, in the present study, Cox model 3, which added radiomics features, did not significantly improve the prognostic prediction of LANPC patients. In addition, when comparing survival between groups, the Cox model requires the data to satisfy the proportional hazards assumption [36]. When the data do not satisfy this assumption, they should be stratified or transformed so that the assumption holds before analysis. At present, many researchers neglect to test the proportional hazards assumption when using the Cox regression model, which undermines the validity and reliability of their findings.
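For reference, the proportional hazards assumption discussed above can be written out explicitly. The notation below (baseline hazard $h_0$, coefficient vector $\beta$, covariate vectors $x_1$ and $x_2$) is standard textbook notation and does not appear in the original text.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\!\left(\beta^{\top}(x_1 - x_2)\right)
```

The hazard ratio between any two patients does not depend on $t$. This time-invariance, together with the log-linear dependence on $x$, is what must hold for a Cox model to be interpreted safely, and neither constraint is imposed by the RSF model.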
In the present study, survival prediction for LANPC patients after IC + CCRT was performed with the RSF model. The findings showed that, compared with the traditional Cox model, the RSF model significantly improved the prediction of PFS in LANPC and was more stable. The RSF model reportedly retains the advantages of the general random forest (RF) and limits overfitting through two random sampling processes [24]. In addition, the RSF model is not restricted by conditions such as the proportional hazard and log-linear hypotheses [37]. Its prediction accuracy is at least equal to that of traditional survival analysis methods such as the Cox model. Several studies have emphasized the important role of RF classifiers in the selection of radiomics features and model construction for NPC patients [38][39][40], which improves the accuracy of survival prediction. A previous study [28] reported that, compared with models that included clinical or genetic features alone, an RSF model that added radiomics to clinical and genetic features significantly improved the survival prediction of gliomas. Another study obtained radiomics features from CT images of 573 patients with non-small cell lung cancer and fitted an RSF model, revealing that the RSF model had the potential to predict distant metastasis in these patients [41]. These reports suggest that the RSF model has good potential for predicting the prognosis of cancer patients. Consistent with them, the RSF model in the present study performed better in both PFS prediction and risk stratification of LANPC patients.
To our knowledge, few studies have explored the prognosis of LANPC patients after IC + CCRT by comparing two radiomics-based models, so the present study, which compared the prediction performance of different models in both the training cohort and the testing cohort, may serve as a useful reference. Such comparative studies may improve the reliability of predictive analysis models based on radiomics and help broaden the scope of radiomics in cancer treatment.
In addition, the RSF model based on clinical and radiomics features showed better prognostic prediction performance than the Cox model. The Kaplan-Meier method was used to compare PFS between the risk groups. Under the RSF model, the PFS of the high-risk group was lower than that of the low-risk group, which is similar to previous findings [23,32,34,40]. This difference between the two models suggests that the RSF model may help stratify patients accurately for individual treatment strategies in clinical practice, thereby improving the clinical outcome of LANPC patients.
The present study has several limitations. First, as a single-center study, its findings may not apply to patients in other regions and centers and need to be verified in multi-center studies. Second, the present study extracted radiomics features only from the primary tumor and did not examine the lymph nodes. Further, N stage was not significantly associated with prognosis, which may be related to the small number of cases in this study. In addition, because of the retrospective design, selection bias may exist. Thus, well-designed prospective studies are warranted.
In conclusion, the present study demonstrates that as compared with the Cox model, the RSF model including clinical and radiomics features shows better performance in predicting the PFS of LANPC patients after IC + CCRT. The RSF model can divide patients into low-risk and high-risk groups, and it may offer additional information for individual treatment strategies for LANPC patients. The construction and comparison of different radiomics prediction models will facilitate the application of radiomics in tumor precision medicine and clinical practice.