Prediction of the Ki-67 expression level in head and neck squamous cell carcinoma with machine learning-based multiparametric MRI radiomics: a multicenter study

Abstract

Background

This study aimed to develop and validate a machine learning (ML)-based fusion model to preoperatively predict Ki-67 expression levels in patients with head and neck squamous cell carcinoma (HNSCC) using multiparametric magnetic resonance imaging (MRI).

Methods

A total of 351 patients with pathologically proven HNSCC from two medical centers were retrospectively enrolled in the study and divided into training (n = 196), internal validation (n = 84), and external validation (n = 71) cohorts. Radiomics features were extracted from T2-weighted images and contrast-enhanced T1-weighted images and screened. Seven ML classifiers, including k-nearest neighbors (KNN), support vector machine (SVM), logistic regression (LR), random forest (RF), linear discriminant analysis (LDA), naive Bayes (NB), and eXtreme Gradient Boosting (XGBoost) were trained. The best classifier was used to calculate radiomics (Rad)-scores and combine clinical factors to construct a fusion model. Performance was evaluated based on calibration, discrimination, reclassification, and clinical utility.

Results

Thirteen features combining multiparametric MRI were finally selected. The SVM classifier showed the best performance, with the highest average area under the curve (AUC) of 0.851 in the validation cohorts. The fusion model incorporating SVM-based Rad-scores with clinical T stage and MR-reported lymph node status achieved encouraging predictive performance in the training (AUC = 0.916), internal validation (AUC = 0.903), and external validation (AUC = 0.885) cohorts. Furthermore, the fusion model showed better clinical benefit and higher classification accuracy than the clinical model.

Conclusions

The ML-based fusion model based on multiparametric MRI exhibited promise for predicting Ki-67 expression levels in HNSCC patients, which might be helpful for prognosis evaluation and clinical decision-making.

Background

Head and neck squamous cell carcinoma (HNSCC), which arises from the mucosal epithelium of the oral cavity, larynx, and pharynx, is one of the most typical malignant tumors of the head and neck [1]. Although multimodal treatment strategies have been established in recent years, the prognosis for this highly malignant disease is still poor, and 5-year survival rates are unsatisfactory [2]. The current approach used to assess the prognosis of HNSCC patients mainly depends on the Tumor Node Metastasis (TNM) staging system. Nevertheless, due to variability in pathological features and tumor biology, even patients with the same TNM staging may have completely different survival outcomes and treatment results [3]. Therefore, it is crucial to identify a reliable prognostic indicator for HNSCC.

Ki-67 is a nuclear antigen related to cell proliferation and correlated positively with cancer aggressiveness [4]. Some studies confirmed that a high Ki-67 expression level was closely associated with the aggressive behavior and poor prognosis of HNSCC [5, 6]. In addition, prior studies demonstrated that tumors with a higher Ki-67 index were more sensitive to radiation and responded significantly better to radiation therapy [7,8,9]. Consequently, accurately determining the preoperative Ki-67 expression level is essential for evaluating the prognosis of HNSCC patients and clinical decision-making. In clinical practice, the Ki-67 expression level is mainly determined based on immunohistochemistry (IHC) using surgery- or biopsy-derived pathological tissues, which is invasive and time-consuming and does not enable real-time assessment. Moreover, due to the high heterogeneity of tumor tissues, local tissues obtained through biopsy alone may not accurately reflect the whole tumor [10, 11]. Therefore, an accurate and noninvasive tool is required to preoperatively assess Ki-67 expression levels in patients with HNSCC.

Magnetic resonance imaging (MRI) and computed tomography (CT) are widely used imaging modalities in the diagnosis, staging, and treatment follow-up of HNSCC. Compared with CT, MRI is considered to have substantial advantages in demonstrating the extent of tumor invasion and visualizing soft tissue [12]. However, conventional MRI results are mostly subjective and qualitative, which may lead to a lack of consistency and reproducibility between different institutions and physicians. Some quantitative parameters of functional MRI have been demonstrated to be associated with the Ki-67 expression level in HNSCC patients [13,14,15]. Nevertheless, these measurements are likely to be taken outside the biopsied area and may not fully reflect tumor heterogeneity. In addition, functional MRI examinations require additional scan sequences, resulting in increased costs and scan times for patients. Radiomics can extract deep quantitative features from medical images that cannot be recognized by the naked eye. By analyzing the correlation between these features and clinical, pathological, and genetic information, the overall heterogeneity and biological behavior of tumors can be unraveled [16]. In recent years, some studies have achieved a good predictive efficiency for Ki-67 in several malignant cancers, including breast cancer [17], meningioma [18], hepatocellular carcinoma [19], and sinonasal malignancy [20], using MRI-based radiomics. However, the predictive value of radiomics regarding the Ki-67 expression level in HNSCC patients remains uncertain. Moreover, multiple machine learning (ML) algorithms have not been combined with radiomics to predict Ki-67 expression in HNSCC thus far.

This study aimed to develop an ML-based radiomics model using multiparametric MRI to effectively predict the Ki-67 expression level in HNSCC patients. In addition, we constructed a fusion model based on clinical characteristics and MRI radiomics features to improve the predictive power and interpretability of the Ki-67 expression level.

Methods

Patient selection and clinical data

This study was approved by the Institutional Review Boards of the Fifth Affiliated Hospital of Wenzhou Medical University (Center 1) and the Sixth Affiliated Hospital of Wenzhou Medical University (Center 2). The requirement for informed consent from patients was waived due to the study’s retrospective nature. This study was conducted in accordance with STARD 2015 guidelines (equator-network.org). HNSCC patients with confirmed pathology were identified in Center 1 (from January 2017 to August 2023) and Center 2 (from January 2020 to August 2023). After applying the inclusion and exclusion criteria (see Appendix E1), 351 patients with HNSCC were enrolled in the study. Among them, 280 eligible patients from Center 1 were randomly divided into a training cohort (n = 196) and an internal validation cohort (n = 84) in a 7:3 ratio, while 71 patients from Center 2 were recruited as an external validation cohort. The clinical data of the enrolled patients were retrospectively collected from the medical record system and included age, sex, smoking history, tumor location, and clinical T stage. The clinical T stage of HNSCC was based on the 2017 8th edition manual of the American Joint Committee on Cancer (AJCC) [21]. The detailed patient enrollment process is shown in Fig. 1, and the sample size estimation is described in Appendix E2.

Fig. 1

Recruitment pathway for eligible patients in this study. HNSCC, head and neck squamous cell carcinoma; IHC, immunohistochemistry; MRI, magnetic resonance imaging

MRI acquisition and evaluation

MR images were obtained using different 3.0-T MR scanners from two manufacturers (Center 1: Ingenia, Philips Healthcare; Center 2: Discovery 750W, GE Healthcare), both with the neck orthogonal coil. The scan sequences included T2-weighted imaging fat suppression (T2WI-FS) and contrast-enhanced T1-weighted imaging (CE-T1WI) sequences. The acquisition parameters of these protocols are summarized in Table S1. Gd-DTPA (Schering, Germany) was used as the contrast agent, injected via the arm vein with a MEDRAD high-pressure syringe at a dose of 0.2 mmol/kg and a flow rate of 2.5 ml/s. After contrast injection, 20 ml of saline was injected continuously at the same flow rate. CE-T1WI was acquired 15 s following the contrast injection.

All MRI images were reviewed in consensus by two radiologists (reader A and reader B, with 7 and 16 years of experience in head and neck MRI diagnostics, respectively) in a blinded manner (knowing the diagnosis of HNSCC but not the other clinical and pathological details). The classification criterion of MR-reported lymph node (LN) status is shown in Appendix E3.

IHC staining of Ki-67

The expression level of Ki-67 was assessed by performing IHC staining on surgical histopathology samples. After sample fixation, embedding, drying, dewaxing, rinsing, and hydration, IHC staining was performed using a Ki-67 protein antibody (dilution 1:300). Cells were considered positive when the nuclei were dark yellow or brown. Positive cells were selected from among the five areas with the highest density of positives, following which 100 nuclei were counted at a high magnification (× 200) to determine the percentage of positive cells. According to previous studies [22,23,24], using 50% as the cut-off value of Ki-67 in HNSCC can effectively predict the prognosis. Thus, a Ki-67 index of < 50% was considered low expression, while that of ≥ 50% was defined as high expression. The Ki-67 analyses were retrospectively performed by two pathologists (with 5 and 10 years of experience) who were blinded to the clinical information. Representative MRI images from patients with Ki-67 expression levels identified as low and high are shown in Fig. 2.
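
As a worked illustration of the counting rule and cut-off described above, the following Python sketch pools positive-nucleus counts over hotspot fields and applies the 50% threshold. The function names, the assumption that 100 nuclei are counted in each of the five fields, and the example counts are hypothetical and are not taken from the study's pathology protocol.

```python
from typing import Sequence

KI67_CUTOFF = 50.0  # percent; the cut-off stated in the text


def ki67_index(positive_counts: Sequence[int], nuclei_per_field: int = 100) -> float:
    """Return the pooled percentage of Ki-67-positive nuclei across hotspot fields."""
    if not positive_counts:
        raise ValueError("at least one field is required")
    if any(c < 0 or c > nuclei_per_field for c in positive_counts):
        raise ValueError("each count must lie between 0 and nuclei_per_field")
    total_nuclei = nuclei_per_field * len(positive_counts)
    return 100.0 * sum(positive_counts) / total_nuclei


def ki67_label(index_percent: float) -> str:
    """Dichotomize the index: < 50% is low expression, >= 50% is high expression."""
    return "high" if index_percent >= KI67_CUTOFF else "low"


# Hypothetical positive-nucleus counts in five hotspot fields (assumed 100 nuclei each).
example_counts = [62, 48, 55, 51, 44]
example_index = ki67_index(example_counts)
print(example_index, ki67_label(example_index))
```

Pooling the fields before applying the cut-off is one reasonable reading of the text; averaging per-field percentages gives the same value when every field contains the same number of nuclei.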

Fig. 2

Representative MRI images of head and neck squamous cell carcinoma patients whose Ki-67 expression levels were determined as being low and high. A A 67-year-old male patient with a visible mass in the hypopharynx. The immunohistochemical image presented a low Ki-67 expression level (× 200; Ki-67 index = 20%). B A 62-year-old male patient with a visible mass in the hypopharynx. The immunohistochemical image presented a high Ki-67 expression level (× 200; Ki-67 index = 90%). MRI, magnetic resonance imaging

Tumor segmentation, feature extraction, and repeatability analysis

The radiomics workflow of the present study is shown in Fig. 3. Bias field correction was performed to eliminate signal intensity variations due to magnetic field inhomogeneities before outlining the regions of interest (ROIs). The pre-processed images were uploaded to the Radcloud platform (v7.1, http://radcloud.cn/). Two radiologists (reader A and reader B) who were blinded to the final pathological results outlined ROIs along the edges of the lesion layer-by-layer on the T2WI-FS and CE-T1WI images, respectively. For each sequence image, a separate whole-tumor volume of interest (VOI) was generated by ROI superimposition.

Fig. 3

The workflow of radiomics analysis in the present study. First, VOIs were manually delineated around the entire tumor outline on each axial slice of T2WI-FS and CE-T1WI images. Second, 1688 radiomics features were extracted from each three-dimensional segmentation. Third, three steps of feature selection were applied to all extracted features. Then, seven radiomics signatures were built using seven machine learning classifiers, and the radiomics signature with the best predictive performance was used to build the radiomics model. A clinical model was constructed using logistic regression analysis. Finally, a fusion model incorporating the optimal radiomics score and key clinical characteristics was built and presented as a nomogram, which was evaluated by ROC analysis, calibration curve, and DCA. CE-T1WI, contrast-enhanced T1-weighted imaging; DCA, decision curve analysis; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic; T2WI-FS, T2-weighted imaging fat suppression; VOI, volume of interest

Then, the radiomics features from each VOI were extracted using the Radcloud platform with a wide variety of engineered, hard-coded feature algorithms [25, 26]. Before feature extraction, all images were resampled to a voxel size of 1 × 1 × 1 mm³ using B-spline interpolation to reduce the effect of slice thickness variations and to obtain isotropic voxels, which ensures rotation invariance. Subsequently, to minimize inherent differences in pixel intensities between the two MR scanners, the gray-level intensity of all image volumes was scaled to the range of 0–255 after removing pixels with outlier values [27]. A total of 1688 radiomics features were initially extracted from each VOI for each patient image sequence. The details of the extracted features are given in Appendix E4. Intra- and inter-observer intraclass correlation coefficients (ICCs) were calculated for repeatability analysis (Appendix E5).
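
The intensity scaling step described above can be sketched in Python with NumPy. This is a minimal illustration, not the Radcloud implementation: the outlier rule (values beyond the mean ± 3 standard deviations) is an assumption, outliers are clipped rather than deleted so that the volume keeps its shape, and the function name `rescale_intensity` is hypothetical. Resampling to isotropic voxels is not shown.

```python
import numpy as np


def rescale_intensity(volume: np.ndarray, n_sigma: float = 3.0) -> np.ndarray:
    """Clip outlier voxels to mean +/- n_sigma * SD, then map linearly to 0..255."""
    vol = volume.astype(np.float64)
    mean, std = vol.mean(), vol.std()
    # Assumed outlier rule: anything outside mean +/- n_sigma * SD is clipped.
    clipped = np.clip(vol, mean - n_sigma * std, mean + n_sigma * std)
    lo, hi = clipped.min(), clipped.max()
    if hi == lo:
        # A constant image carries no contrast; return zeros instead of dividing by 0.
        return np.zeros_like(clipped)
    return (clipped - lo) / (hi - lo) * 255.0
```

Applying the same mapping to every volume places images from both scanners on a common gray-level range, which is the stated purpose of the step.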

Feature selection, radiomics signature construction, and evaluation

To eliminate scaling differences, radiomics features were normalized using a standardized method (Appendix E6). Then, a three-step procedure involving variance threshold, SelectKBest, and least absolute shrinkage and selection operator (LASSO) regression was performed for the selection of task-specific radiomics features in the training cohort from the feature subsets of T2WI-FS and CE-T1WI sequences alone and in combination, as detailed in Appendix E7.
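
The three-step selection described above can be sketched with scikit-learn. The thresholds, the number of features kept by SelectKBest, the ANOVA F-score, and the use of z-score standardization are all assumptions, since the actual settings are given in appendices not reproduced here. In this sketch the variance filter is applied to the raw feature values before standardization, because standardized features all have unit variance; the study's own ordering may differ.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler


def select_features(X, y, var_threshold=0.8, k=50, random_state=0):
    """Return column indices kept after variance threshold, SelectKBest, and LASSO."""
    idx = np.arange(X.shape[1])

    # Step 1: drop features whose variance does not exceed the (assumed) threshold.
    vt = VarianceThreshold(threshold=var_threshold).fit(X)
    idx = idx[vt.get_support()]

    # Standardize the surviving features to zero mean and unit variance.
    Xs = StandardScaler().fit_transform(X[:, idx])

    # Step 2: univariate filter; keep the k features with the highest F-scores.
    skb = SelectKBest(score_func=f_classif, k=min(k, Xs.shape[1])).fit(Xs, y)
    keep = skb.get_support()
    idx, Xs = idx[keep], Xs[:, keep]

    # Step 3: LASSO with a cross-validated penalty; keep non-zero coefficients.
    lasso = LassoCV(cv=5, random_state=random_state).fit(Xs, y)
    return idx[lasso.coef_ != 0]
```

All fitting is done on the training cohort only, as in the text; the returned indices would then be applied unchanged to the validation cohorts.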

Thereafter, the selected features were entered into the following ML classifiers to construct radiomics signatures: k-nearest neighbors (KNN), support vector machine (SVM), logistic regression (LR), random forest (RF), linear discriminant analysis (LDA), naive Bayes (NB), and eXtreme Gradient Boosting (XGBoost). The rationales and considerations behind the choice of the seven ML classifiers in this study are detailed in Appendix E8. A grid search in Python was used to automatically search for the optimal hyperparameter combination for each classifier (see Appendix E9). Additionally, the seven classifiers were validated in the validation cohorts. The prediction performance of the radiomics signatures was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, and accuracy. The classifier with the highest average AUC value in the validation cohorts was chosen as the best classifier [28,29,30]. The best classifier was used to classify key radiomics features of HNSCC patients according to different Ki-67 expression levels, thereby calculating Radiomics (Rad)-scores, which indicate the relative risk of high Ki-67 expression in HNSCC patients and were used to build the radiomics model. The distribution of the Rad-scores between the Ki-67 low- and high-expression groups was also analyzed to verify their diagnostic performance.
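
The classifier comparison described above can be sketched with scikit-learn as follows. The hyperparameter grids are hypothetical and deliberately small (the study's grids are in Appendix E9, not reproduced here), XGBoost is omitted because it needs a separate package, and the helper name `pick_best_classifier` is an assumption. Each classifier is tuned by cross-validated grid search on the training cohort and then ranked by its mean AUC over the validation cohorts.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hypothetical grids, kept small for illustration only.
CANDIDATES = {
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "SVM": (SVC(probability=True, random_state=0),
            {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}),
    "LR": (LogisticRegression(max_iter=1000), {"C": [0.1, 1, 10]}),
    "RF": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50], "max_depth": [3, None]}),
    "LDA": (LinearDiscriminantAnalysis(), {"solver": ["svd", "lsqr"]}),
    "NB": (GaussianNB(), {"var_smoothing": [1e-9, 1e-8]}),
}


def pick_best_classifier(X_train, y_train, validation_sets):
    """Tune each classifier on the training set; rank by mean AUC over validation sets."""
    mean_auc = {}
    for name, (estimator, grid) in CANDIDATES.items():
        search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=5)
        search.fit(X_train, y_train)  # refits the best setting on all training data
        aucs = [
            roc_auc_score(y_val, search.predict_proba(X_val)[:, 1])
            for X_val, y_val in validation_sets
        ]
        mean_auc[name] = float(np.mean(aucs))
    best = max(mean_auc, key=mean_auc.get)
    return best, mean_auc
```

Under this sketch, the predicted probability of the high-expression class from the winning classifier could serve as a Rad-score; whether the study used probabilities or decision values is not stated in the text.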

Development and validation of the prediction models

Univariate LR analysis was performed to assess the association between clinical-radiological characteristics and Ki-67 expression level, and clinical predictors with P < 0.1 were included in the multivariate LR analysis to develop the clinical model. Afterward, multivariate analysis and backward stepwise regression analysis based on the Akaike Information Criterion were performed to establish the fusion model and corresponding nomogram incorporating the Rad-score and significant clinical predictors in the training cohort. During this procedure, collinearity was examined, and variables with a variance inflation factor (VIF) of greater than 10 and P > 0.05 were excluded [31]. The models were tested in the validation cohorts. The predictive performance of the prediction models was evaluated using ROC analysis, calibration curves, and decision curve analysis (DCA). The percentage of true positive, false positive, true negative, and false negative results was determined according to the reference standard of pathological results by ROC analysis, and the results are displayed in the form of a confusion matrix diagram. Calibration curves were plotted by bootstrapping with 1000 resamples, and DCA was performed to visualize the net benefit for clinical decisions. The net reclassification improvement (NRI) and integrated discrimination improvement (IDI) values were used to quantify the different models’ clinical usefulness and net benefit.
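
To make the reclassification metrics named above concrete, the following Python sketch computes a two-category NRI and the IDI for a new model against an old one. It is an illustration of the standard definitions, not the PredictABEL implementation used in the study; the single risk cut-off of 0.5 and the function name `nri_idi` are assumptions.

```python
from statistics import mean


def nri_idi(y_true, p_old, p_new, cutoff=0.5):
    """Return (NRI, IDI) for predicted risks p_new versus p_old."""
    events = [i for i, y in enumerate(y_true) if y == 1]
    nonevents = [i for i, y in enumerate(y_true) if y == 0]
    if not events or not nonevents:
        raise ValueError("both outcome classes are required")

    def net_up(indices):
        # Fraction moved up a risk category minus fraction moved down.
        up = sum(p_old[i] < cutoff <= p_new[i] for i in indices)
        down = sum(p_new[i] < cutoff <= p_old[i] for i in indices)
        return (up - down) / len(indices)

    # NRI rewards upward moves in events and downward moves in non-events.
    nri = net_up(events) - net_up(nonevents)

    # IDI is the change in the gap between mean predicted risk of events
    # and mean predicted risk of non-events.
    idi = (mean(p_new[i] for i in events) - mean(p_old[i] for i in events)) - (
        mean(p_new[i] for i in nonevents) - mean(p_old[i] for i in nonevents)
    )
    return nri, idi
```

Positive values of either metric indicate that the new model separates high- and low-expression patients better than the old one.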

Statistical analysis

All statistical analyses were completed using Python v3.7.6 and R software. The Kolmogorov–Smirnov test was performed to test the normality of continuous variables. Student’s t-test was applied to compare continuous variables with a normal distribution, the Mann–Whitney U test was used for non-normally distributed variables, and the chi-square test was performed for categorical variables. The R packages used in this study included “glmnet” (for LASSO regression), “rms” (for LR analysis and calibration curves), “rmda” (for DCA), and “PredictABEL” (for the calculation of NRI and IDI). ROC analysis was performed using MedCalc, and the DeLong test was used to compare the differences in AUC values between models. All tests were two-tailed, and P < 0.05 was considered statistically significant.

Results

Patient characteristics and clinical model construction

The clinical characteristics and MRI features of the 351 patients in the training, internal validation, and external validation cohorts are summarized in Table 1 and Table S2. Overall, the three cohorts were balanced and comparable. Significant differences in the clinical T stage and MR-reported LN status were observed between the Ki-67 low- and high-expression groups in all three cohorts (all P < 0.05), while differences in other characteristics were not statistically significant (all P > 0.05). Following univariate and multivariate regression analyses, clinical T3-T4 stage (odds ratio [OR]: 3.715, confidence interval [CI]: 1.580–8.737, P = 0.003) and MR-reported LN metastasis (OR: 2.836, CI: 1.195–6.729, P = 0.018) were confirmed as independent predictors of high Ki-67 expression and used to construct the clinical model (Table S3). No collinearity was detected since the VIFs of the predictors were 1.077 and 1.131, respectively.
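
The collinearity check reported above relies on the variance inflation factor. A minimal NumPy sketch of the usual definition, VIF = 1 / (1 - R²) from regressing each predictor on the remaining ones, is shown below; the function name `vif` is hypothetical, and the study's own computation was presumably done in R.

```python
import numpy as np


def vif(X):
    """Return the variance inflation factor of each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares of column j on the other columns plus an intercept.
        design = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        r_squared = 1.0 - resid.var() / target.var()
        # Note: perfectly collinear columns give r_squared = 1 and an infinite VIF.
        factors.append(1.0 / (1.0 - r_squared))
    return factors
```

Values close to 1, such as those reported for the two clinical predictors, indicate that the predictors carry largely independent information.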

Table 1 Comparison of clinical characteristics and MRI features between patients with Ki-67 low expression and those with high expression of HNSCC

Radiomics feature selection and signature construction

Intra-observer ICCs ranged from 0.858 to 0.963, while inter-observer ICCs ranged from 0.827 to 0.931. The results of feature selection in the training cohort are shown in Fig. S1. The optimal radiomics feature subsets selected from T2WI-FS and CE-T1WI for predicting high Ki-67 expression are listed in Table S4. Finally, 13 radiomics features were retained from the combined images of T2WI-FS (n = 7) and CE-T1WI (n = 6) using LASSO regression (Fig. S2), and the relative importance of the 13 selected features is shown in Fig. S2C. The correlation heatmap indicated that the selected features from the combined images were relatively independent (Fig. S3). Then, all selected features were combined to generate the radiomics signature with seven ML classifiers (KNN, SVM, LR, RF, LDA, NB, and XGBoost).
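
For readers reproducing the repeatability analysis, an intraclass correlation coefficient can be computed from a two-way ANOVA decomposition. The sketch below implements the ICC(2,1) form (two-way random effects, absolute agreement, single measurement); which ICC form the study used is specified in Appendix E5 and is not reproduced here, so this choice and the function name `icc_2_1` are assumptions.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per reader."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for subjects (rows), readers (columns), and residual error.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

In a radiomics setting, each row would hold one feature value per patient and each column one reader (or one repeated segmentation by the same reader), with the coefficient computed feature by feature.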

Performance of radiomics signatures and radiomics model construction

For the combined sequences, the predictive performances of the radiomics signatures based on the seven classifiers in the training, internal validation, and external validation cohorts are shown in Fig. 4 and Table 2. Among these classifiers, the accuracy of RF was 100.0% in the training cohort but 60.2% and 63.4% in the internal and external validation cohorts, respectively, which suggested the presence of overfitting. The SVM classifier achieved the highest average AUC of 0.851 and the highest average accuracy of 0.832 in the validation cohorts. Moreover, the SVM classifier exhibited better predictive performance than the other classifiers in the validation cohorts according to the ROC curves (Fig. S4) and DeLong tests (Fig. 4B). Therefore, SVM was selected as the optimal classifier to calculate Rad-scores for constructing the radiomics model.

Fig. 4

ROC analysis results (A) and DeLong’s tests (P value) of different radiomics signatures (B) in the internal validation cohort (left) and external validation cohort (right). ACC, accuracy; AUC, area under the curve; KNN, k-nearest neighbors; LDA, linear discriminant analysis; LR, logistic regression; NB, naive Bayes; RF, random forest; ROC, receiver operating characteristic; SEN, sensitivity; SPE, specificity; SVM, support vector machine; XGBoost, eXtreme Gradient Boosting

Table 2 Diagnostic performance of various machine learning-based radiomics signatures

We also constructed radiomics models from the single-sequence images (T2WI-FS alone and CE-T1WI alone) using the SVM classifier. As depicted in Fig. S5, the radiomics model using the combined sequences had higher AUC values than the single-sequence T2WI-FS and CE-T1WI models in all three cohorts (all P < 0.05). The SVM-based Rad-scores using combined sequences showed significant differences between the Ki-67 low- and high-expression groups in all three cohorts (all P < 0.001, Fig. S6A-C), and the correlation between Ki-67 status, clinical features, and radiomics features is shown in Fig. S6D-F.

Development and validation of an individualized prediction nomogram

We further integrated the SVM-based Rad-scores with significant clinical factors (clinical T stage and MR-reported LN status) to build a fusion prediction model. The detailed performance of three models in the training and validation cohorts is summarized in Table 3 and depicted in Fig. 5A by confusion matrices. The ROCs for Ki-67 status prediction according to these three models and the results of DeLong’s tests are shown in Fig. 5B. Notably, the incorporation of Rad-scores led to a significant increase in the AUC values for the clinical model in the training, internal validation, and external validation cohorts from 0.737 to 0.916 (Z = 5.702, P < 0.001), 0.715 to 0.903 (Z = 3.485, P = 0.011), and 0.654 to 0.885 (Z = 3.477, P < 0.001), respectively. However, no significant difference in AUCs was found between the radiomics model and fusion model in the internal and external validation cohorts (all P > 0.05).

Table 3 Predictive performance of the clinical model, SVM-based radiomics model and fusion model in the training, internal validation and external validation cohorts
Fig. 5

A Confusion matrices and (B) ROC curves of Ki-67 expression level classification for different models in the training, internal validation, and external validation cohorts. A Confusion matrices of the clinical model, radiomics model, and fusion model. The color depends on the number inside the square: the higher the number, the darker the color. B ROC curves of different models for predicting Ki-67 expression levels and the results of DeLong’s tests. AUC, area under the curve; ROC, receiver operating characteristic

Finally, we visualized the fusion model as a nomogram to individually predict the risk of high Ki-67 expression in HNSCC patients (Fig. 6A). The calibration curves showed that the predicted Ki-67 high-expression probabilities of the fusion model had excellent agreement with the actual observations (Fig. 6B). Additionally, the DCA results showed that the radiomics model and fusion model had a higher overall net benefit than the clinical model across the majority of the range of reasonable threshold probabilities in the three cohorts (Fig. 6C). Furthermore, the inclusion of Rad-scores in the fusion model yielded a total NRI of 0.477 (95% CI: 0.206–0.748, P < 0.05) and IDI of 0.204 (95% CI: 0.112–0.325, P < 0.05). Similar results were observed in the validation cohorts (Fig. S7), which showed improved prediction efficiency and classification accuracy for the Ki-67 expression outcome. However, we found no significant difference in the NRI and IDI between the radiomics model and the fusion model in the three cohorts (all P > 0.05).
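
The net benefit plotted in a decision curve has a simple closed form, sketched below in Python. This illustrates the standard formula only and is not the rmda package used in the study; the function names are hypothetical.

```python
def net_benefit(y_true, p_pred, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, p_pred) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A decision curve evaluates `net_benefit` over a range of thresholds for each model and compares the result with the treat-all reference and with treating no one, whose net benefit is zero.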

Fig. 6

The nomogram, calibration curves, and DCA. A The fusion nomogram incorporating the SVM-based Rad-score and clinical characteristics (clinical T stage and MR-reported lymph node status) for predicting the probability of high Ki-67 expression (Ki-67 index ≥ 50%). B The calibration curves of the fusion model in the training, internal validation, and external validation cohorts. C The DCA results of the clinical model, radiomics model, and fusion model in the three cohorts. DCA, decision curve analysis; MR, magnetic resonance; SVM, support vector machine

Discussion

High Ki-67 expression in HNSCC correlates with strong proliferative activity and tumor invasiveness [5]. Additionally, the Ki-67 index can be used as an important indicator to help identify candidates for radiotherapy [7]. Thus, the accurate preoperative assessment of Ki-67 expression is essential for prognostic evaluation and treatment planning. This is the first study to establish a fusion model using multiparametric MRI to preoperatively predict the Ki-67 expression level in HNSCC patients by incorporating SVM-based radiomics signatures and clinical features. The fusion model accurately distinguished between Ki-67 indexes of < 50% and ≥ 50% with favorable AUCs (0.916, 0.903, and 0.885, respectively) and high accuracies (85.20%, 85.71%, and 84.51%, respectively) in the training, internal validation, and external validation cohorts. The proposed fusion model had superior performance to the clinical model that included clinical T stage and MR-reported LN status, suggesting that the addition of radiomics features enhanced its diagnostic efficacy and provided incremental value in predicting the Ki-67 expression level. Thus, this model can accurately and robustly predict high Ki-67 expression in HNSCC and provide additional information for clinical decision-making.

Radiomics can extract abundant high-dimensional information from medical images and characterize the heterogeneity within tumors comprehensively and accurately. Tumors with different Ki-67 expression levels have been reported to exhibit significant heterogeneity in terms of cell proliferation and differentiation [24]. Therefore, by analyzing the radiomics features corresponding to tumors with different Ki-67 expression levels, the correlation between these characteristics and their potential biological significance can be explored. By transforming CT or MRI images into high-throughput quantitative data, radiomics features have been used to predict the Ki-67 index in various tumor types [17,18,19,20, 32, 33]. In this study, 13 optimal radiomics features were screened for their correlation with Ki-67 expression level in HNSCC, including nine wavelet transformed features, three first-order statistical features, and one filter transformed feature. The wavelet transformed features obtained by wavelet decomposition of the first order and texture features can extract heterogeneity information from the original images [34]. The wavelet features mainly include the gray level size zone matrix (GLSZM), gray level dependence matrix (GLDM), and first-order features. GLSZM is the number of linker voxels with the same gray intensity, while GLDM is the number of linker voxels within a specific distance dependent on the central voxel. The above two texture parameters are calculated values based on voxel alignment, which can characterize the irregularity of voxel alignment in the tumor space. Additionally, tumor heterogeneity may be related to local tumor cell number, proliferation, hypoxia, angiogenesis, and necrosis [35], and these factors are closely related to Ki-67 expression levels. This further suggests that medical image-based radiomics analysis can reflect tumor heterogeneity by describing the voxel arrangement in the tumor space.

In recent years, MRI has become an indispensable part of radiomics analysis due to its ultra-high soft-tissue resolution, absence of ionizing radiation, and multiparametric imaging capabilities. Previous MRI-based radiomics studies mainly focused on the evaluation of the staging [36], prognosis [37, 38], and treatment efficacy [39] of HNSCC. Ren et al. [36] reported an MRI-based radiomics signature using combined T2WI-FS and CE-T1WI images for the preoperative assessment of stage I-II and III-IV HNSCC with an AUC of 0.850 in the training cohort, which was higher than that of the radiomics signatures based on T2WI-FS images (AUC: 0.818) and CE-T1WI images (AUC: 0.828) alone, indicating that combined sequences can more comprehensively mine the heterogeneous features of the tumor. The advantage of multiparametric MRI was also confirmed in the study of Khanfari et al. [40]. Thus, we extracted the radiomics features from these two conventional MRI images to predict the Ki-67 expression level in HNSCC. The results showed that both sequences contributed to radiomics signature construction (seven features from T2WI-FS and six features from CE-T1WI). Then, based on the selected features, we used various ML classifiers to generate radiomics signatures, among which the SVM classifier showed the best performance in the validation cohorts and was selected as the optimal classifier. One possible explanation for this result is that the SVM algorithm usually seeks the best balance between complexity and learning ability, which can facilitate maximum generalizability in limited sample data [41]. In the present study, the SVM-based radiomics model using the combined sequences had higher AUC values (training cohort AUC: 0.884, validation cohorts average AUC: 0.851) than the models based on a single sequence. 
The results are comparable to a previous study that established a CT-based radiomics model to predict Ki-67 expression in HNSCC, with AUCs of 0.919 and 0.825 in the training and validation cohorts, respectively [22]. However, it should be noted that the soft-tissue resolution of CT images is poorer than that of MR images, and it may be challenging to precisely distinguish tumor boundaries in clinical practice by outlining ROIs. In addition, the high level of ionizing radiation generated by CT is a key concern for operators and patients.

This study also confirmed that clinical T stage and MR-reported LN status were significantly associated with the Ki-67 expression level in HNSCC. Significantly more HNSCC patients in the T3-T4 stage were present among those with a high Ki-67 index than among those with a low Ki-67 index, similar to a previous study [6]. This may be because tumors with high Ki-67 expression grow faster, are more aggressive, and are more likely to exhibit invasive growth and invade surrounding tissues. MR-reported LN status is another essential predictor. Liu et al. [42] indicated that the Ki-67 index correlated with the LN metastasis of HNSCC, and Gadbail et al. [43] found that the Ki-67 index was significantly higher in oral squamous cell carcinoma patients with LN metastasis. Our results were consistent with these findings. Nevertheless, the clinical model constructed based on the above two features only showed moderate performance, with AUCs of 0.737, 0.715, and 0.654 in the training, internal validation, and external validation cohorts, respectively. This is because clinical characteristics provide only visually observed anatomical data and cannot adequately reflect intra-tumoral heterogeneity. Therefore, we further integrated Rad-scores with clinical features to establish the fusion model. The incorporation of Rad-scores led to a significant increase in predictive efficiency and classification accuracy for the clinical model. However, there was no significant difference between the fusion model and the radiomics model in terms of AUCs, DCA, NRI, and IDI in the validation cohorts, which further confirmed the limitations of clinical features and highlighted the unique advantage of the radiomics signature in predicting Ki-67 expression levels.

Our study has some limitations. First, it is a retrospective study with unavoidable bias and a limited sample size. Further prospective studies and larger datasets from more centers are required to validate our prediction model. Second, we used only T2WI-FS and CE-T1WI images, without diffusion-weighted imaging (DWI), because a significant proportion of patients lacked DWI. Given the potential value of DWI in radiomics analysis, subsequent studies should consider incorporating DWI when available. Third, only HNSCC patients with a maximal tumor diameter greater than 5 mm were included, to obtain clearer tumor boundaries and enough pixels for radiomics analysis. To broaden the model’s applicability, future research should consider including HNSCC patients with smaller tumors. Fourth, because prognosis and treatment response differ substantially between nasopharyngeal carcinoma (NPC) and HNSCC at other sites, NPC patients were not included in this study. Parallel research on NPC would therefore be beneficial. Finally, manual segmentation is complex and time-consuming, so an automated, reliable, and reproducible segmentation method needs to be developed in the future [44].

Conclusions

In summary, we developed and validated an ML-based fusion model and a corresponding nomogram that incorporate multiparametric MRI radiomics features and clinical factors to preoperatively predict the Ki-67 expression level in HNSCC patients. The nomogram-derived risk helps to stratify HNSCC patients by their likelihood of high Ki-67 expression and thus to identify those with highly aggressive tumors and a poor prognosis. This prediction model may therefore provide important supplementary information for evaluating prognosis and guiding treatment decisions.

Availability of data and materials

The datasets generated and/or analysed during the current study are not publicly available due [REASON WHY DATA ARE NOT PUBLIC] but are available from the corresponding author on reasonable request.

Abbreviations

AUC: Area under the curve

CE-T1WI: Contrast-enhanced T1-weighted

CI: Confidence interval

DCA: Decision curve analysis

GLDM: Gray level dependence matrix

GLSZM: Gray level size zone matrix

HNSCC: Head and neck squamous cell carcinoma

ICC: Intra-class correlation coefficient

KNN: K-nearest neighbors

LASSO: Least absolute shrinkage and selection operator

LDA: Linear discriminant analysis

LR: Logistic regression

LN: Lymph node

ML: Machine learning

NB: Naive Bayes

OR: Odds ratio

Rad-score: Radiomics score

RF: Random forest

ROC: Receiver operating characteristic

ROI: Region of interest

SVM: Support vector machine

T2WI-FS: T2-weighted imaging fat suppression

VIF: Variance inflation factor

VOI: Volume of interest

XGBoost: eXtreme Gradient Boosting

References

  1. Johnson DE, Burtness B, Leemans CR, Lui VWY, Bauman JE, Grandis JR. Head and neck squamous cell carcinoma. Nat Rev Dis Primers. 2020;6(1):92.

  2. Budach V, Tinhofer I. Novel prognostic clinical factors and biomarkers for outcome prediction in head and neck cancer: a systematic review. Lancet Oncol. 2019;20(6):e313–26.

  3. Cui J, Wang L, Tan G, Chen W, He G, Huang H, Chen Z, Yang H, Chen J, Liu G. Development and validation of nomograms to accurately predict risk of recurrence for patients with laryngeal squamous cell carcinoma: Cohort study. Int J Surg (London, England). 2020;76:163–70.

  4. Remnant L, Kochanova NY, Reid C, Cisneros-Soberanis F, Earnshaw WC. The intrinsically disorderly story of Ki-67. Open Biol. 2021;11(8):210120.

  5. Fischer CA, Jung M, Zlobec I, Green E, Storck C, Tornillo L, Lugli A, Wolfensberger M, Terracciano LM. Co-overexpression of p21 and Ki-67 in head and neck squamous cell carcinoma relative to a significantly poor prognosis. Head Neck. 2011;33(2):267–73.

  6. Dumitru CS, Ceausu AR, Comsa S, Raica M. Loss of E-Cadherin Expression Correlates With Ki-67 in Head and Neck Squamous Cell Carcinoma. In vivo (Athens, Greece). 2022;36(3):1150–4.

  7. Ahmed WA, Suzuki K, Imaeda Y, Horibe Y. Ki-67, p53 and epidermal growth factor receptor expression in early glottic cancer involving the anterior commissure treated with radiotherapy. Auris Nasus Larynx. 2008;35(2):213–9.

  8. Lothaire P, de Azambuja E, Dequanter D, Lalami Y, Sotiriou C, Andry G, Castro G Jr, Awada A. Molecular markers of head and neck squamous cell carcinoma: promising signs in need of prospective evaluation. Head Neck. 2006;28(3):256–69.

  9. Couture C, Raybaud-Diogène H, Têtu B, Bairati I, Murry D, Allard J, Fortin A. p53 and Ki-67 as markers of radioresistance in head and neck carcinoma. Cancer. 2002;94(3):713–22.

  10. Ahmed AA, Elmohr MM, Fuentes D, Habra MA, Fisher SB, Perrier ND, Zhang M, Elsayes KM. Radiomic mapping model for prediction of Ki-67 expression in adrenocortical carcinoma. Clin Radiol. 2020;75(6):479.e417-479.e422.

  11. Juan MW, Yu J, Peng GX, Jun LJ, Feng SP, Fang LP. Correlation between DCE-MRI radiomics features and Ki-67 expression in invasive breast cancer. Oncol Lett. 2018;16(4):5084–90.

  12. Jethanandani A, Lin TA, Volpe S, Elhalawani H, Mohamed ASR, Yang P, Fuller CD. Exploring Applications of Radiomics in Magnetic Resonance Imaging of Head and Neck Cancer: A Systematic Review. Front Oncol. 2018;8:131.

  13. Karabay N, Bülbül HM, Doğan E, İkiz A, Bülbül G, Sarıoğlu S. The correlations between dynamic contrast enhanced magnetic resonance imaging and immunohistochemical data in head and neck squamous cell carcinomas. Turk J Med Sci. 2022;52(6):1950–7.

  14. Surov A, Meyer HJ, Gawlitza M, Höhn AK, Boehm A, Kahn T, Stumpp P. Correlations Between DCE MRI and Histopathological Parameters in Head and Neck Squamous Cell Carcinoma. Transl Oncol. 2017;10(1):17–21.

  15. Surov A, Meyer HJ, Winter K, Richter C, Hoehn AK. Histogram analysis parameters of apparent diffusion coefficient reflect tumor cellularity and proliferation activity in head and neck squamous cell carcinoma. Oncotarget. 2018;9(34):23599–607.

  16. Bruixola G, Remacha E, Jiménez-Pastor A, Dualde D, Viala A, Montón JV, Ibarrola-Villava M, Alberich-Bayarri Á, Cervantes A. Radiomics and radiogenomics in head and neck squamous cell carcinoma: Potential contribution to patient management and challenges. Cancer Treat Rev. 2021;99:102263.

  17. Fan M, Yuan W, Zhao W, Xu M, Wang S, Gao X, Li L. Joint Prediction of Breast Cancer Histological Grade and Ki-67 Expression Level Based on DCE-MRI and DWI Radiomics. IEEE J Biomed Health Inform. 2020;24(6):1632–42.

  18. Ouyang ZQ, He SN, Zeng YZ, Zhu Y, Ling BB, Sun XJ, Gu HY, He B, Han D, Lu Y. Contrast enhanced magnetic resonance imaging-based radiomics nomogram for preoperatively predicting expression status of Ki-67 in meningioma: a two-center study. Quant Imaging Med Surg. 2023;13(2):1100–14.

  19. Fan Y, Yu Y, Wang X, Hu M, Hu C. Radiomic analysis of Gd-EOB-DTPA-enhanced MRI predicts Ki-67 expression in hepatocellular carcinoma. BMC Med Imaging. 2021;21(1):100.

  20. Bi S, Li J, Wang T, Man F, Zhang P, Hou F, Wang H, Hao D. Multi-parametric MRI-based radiomics signature for preoperative prediction of Ki-67 proliferation status in sinonasal malignancies: a two-centre study. Eur Radiol. 2022;32(10):6933–42.

  21. Huang SH, O’Sullivan B. Overview of the 8th Edition TNM Classification for Head and Neck Cancer. Curr Treat Options Oncol. 2017;18(7):40.

  22. Zheng YM, Chen J, Zhang M, Wu ZJ, Tang GZ, Zhang Y, Dong C. CT radiomics nomogram for prediction of the Ki-67 index in head and neck squamous cell carcinoma. Eur Radiol. 2023;33(3):2160–70.

  23. Huang W, Zhang Q, Wu G, Chen PP, Li J, McCabe Gillen K, Spincemaille P, Chiang GC, Gupta A, Wang Y, et al. DCE-MRI quantitative transport mapping for noninvasively detecting hypoxia inducible factor-1α, epidermal growth factor receptor overexpression, and Ki-67 in nasopharyngeal carcinoma patients. Radiother Oncol. 2021;164:146–54.

  24. Sakata K, Oouchi A, Nagakura H, Akiba H, Tamakawa M, Koito K, Hareyama M, Asakura K, Satoh M, Ohtani S. Accelerated radiotherapy for T1, 2 glottic carcinoma: analysis of results with KI-67 index. Int J Radiat Oncol Biol Phys. 2000;47(1):81–8.

  25. Nie P, Yang G, Wang N, Yan L, Miao W, Duan Y, Wang Y, Gong A, Zhao Y, Wu J, et al. Additional value of metabolic parameters to PET/CT-based radiomics nomogram in predicting lymphovascular invasion and outcome in lung adenocarcinoma. Eur J Nucl Med Mol Imaging. 2021;48(1):217–30.

  26. Nie P, Yang G, Wang Z, Yan L, Miao W, Hao D, Wu J, Zhao Y, Gong A, Cui J, et al. A CT-based radiomics nomogram for differentiation of renal angiomyolipoma without visible fat from homogeneous clear cell renal cell carcinoma. Eur Radiol. 2020;30(2):1274–84.

  27. Haga A, Takahashi W, Aoki S, Nawa K, Yamashita H, Abe O, Nakagawa K. Standardization of imaging features for radiomics analysis. J Med Invest. 2019;66(1.2):35–7.

  28. Bi Q, Wang Y, Deng Y, Liu Y, Pan Y, Song Y, Wu Y, Wu K. Different multiparametric MRI-based radiomics models for differentiating stage IA endometrial cancer from benign endometrial lesions: A multicenter study. Front Oncol. 2022;12:939930.

  29. Guo Y, Wu J, Wang Y, Jin Y. Development and Validation of an Ultrasound-Based Radiomics Nomogram for Identifying HER2 Status in Patients with Breast Carcinoma. Diagnostics (Basel, Switzerland). 2022;12(12):3130.

  30. Rui W, Qiao N, Wu Y, Zhang Y, Aili A, Zhang Z, Ye H, Wang Y, Zhao Y, Yao Z. Radiomics analysis allows for precise prediction of silent corticotroph adenoma among non-functioning pituitary adenomas. Eur Radiol. 2022;32(3):1570–8.

  31. Kim JH. Multicollinearity and misleading statistical results. Korean J Anesthesiol. 2019;72(6):558–69.

  32. Wu H, Han X, Wang Z, Mo L, Liu W, Guo Y, Wei X, Jiang X. Prediction of the Ki-67 marker index in hepatocellular carcinoma based on CT radiomics features. Phys Med Biol. 2020;65(23):235048.

  33. Liu Y, He C, Fang W, Peng L, Shi F, Xia Y, Zhou Q, Zhang R, Li C. Prediction of Ki-67 expression in gastrointestinal stromal tumors using radiomics of plain and multiphase contrast-enhanced CT. Eur Radiol. 2023;33(11):7609–17.

  34. Chen J, Lu S, Mao Y, Tan L, Li G, Gao Y, Tan P, Huang D, Zhang X, Qiu Y, et al. An MRI-based radiomics-clinical nomogram for the overall survival prediction in patients with hypopharyngeal squamous cell carcinoma: a multi-cohort study. Eur Radiol. 2022;32(3):1548–57.

  35. Ganeshan B, Goh V, Mandeville HC, Ng QS, Hoskin PJ, Miles KA. Non-small cell lung cancer: histopathologic correlates for texture parameters at CT. Radiology. 2013;266(1):326–36.

  36. Ren J, Tian J, Yuan Y, Dong D, Li X, Shi Y, Tao X. Magnetic resonance imaging based radiomics signature for the preoperative discrimination of stage I-II and III-IV head and neck squamous cell carcinoma. Eur J Radiol. 2018;106:1–6.

  37. Mes SW, van Velden FHP, Peltenburg B, Peeters CFW, Te Beest DE, van de Wiel MA, Mekke J, Mulder DC, Martens RM, Castelijns JA, et al. Outcome prediction of head and neck squamous cell carcinoma by MRI radiomic signatures. Eur Radiol. 2020;30(11):6311–21.

  38. Alfieri S, Romanò R, Bologna M, Calareso G, Corino V, Mirabile A, Ferri A, Bellanti L, Poli T, Marcantoni A, et al. Prognostic role of pre-treatment magnetic resonance imaging (MRI)-based radiomic analysis in effectively cured head and neck squamous cell carcinoma (HNSCC) patients. Acta Oncol. 2021;60(9):1192–200.

  39. Guha A, Anjari M, Cook G, Goh V, Connor S. Radiomic Analysis of Tumour Heterogeneity Using MRI in Head and Neck Cancer Following Chemoradiotherapy: A Feasibility Study. Front Oncol. 2022;12:784693.

  40. Khanfari H, Mehranfar S, Cheki M, Mohammadi Sadr M, Moniri S, Heydarheydari S, Rezaeijo SM. Exploring the efficacy of multi-flavored feature extraction with radiomics and deep features for prostate cancer grading on mpMRI. BMC Med Imaging. 2023;23(1):195.

  41. Way TW, Sahiner B, Hadjiiski LM, Chan HP. Effect of finite sample size on feature selection and classification: a simulation study. Med Phys. 2010;37(2):907–20.

  42. Liu M, Lawson G, Delos M, Jamart J, Ide C, Coche E, Weynand B, Desuter G, Hamoir M, Remacle M, et al. Predictive value of the fraction of cancer cells immunolabeled for proliferating cell nuclear antigen or Ki67 in biopsies of head and neck carcinomas to identify lymph node metastasis: comparison with clinical and radiologic examinations. Head Neck. 2003;25(4):280–8.

  43. Gadbail AR, Sarode SC, Chaudhary MS, Gondivkar SM, Tekade SA, Yuwanati M, Patil S. Ki67 Labelling Index predicts clinical outcome and survival in oral squamous cell carcinoma. J Appl Oral Sci. 2021;29:e20200751.

  44. Heydarheydari S, Birgani MJT, Rezaeijo SM. Auto-segmentation of head and neck tumors in positron emission tomography images using non-local means and morphological frameworks. Pol J Radiol. 2023;88:e365–70.

Acknowledgements

Not applicable.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 82072026 to Jiansong Ji) and the Medical and Health General Project of Zhejiang Province (Grant Nos. 2024KY568 to Weiyue Chen and 2023KY425 to Guihan Lin).

Author information

Contributions

Conception and design of the work: W.Y.C, G.H.L, C.Y.L, J.S.J; Data collection: W.Y.C, G.H.L, Y.J.C, F.C, X.L, J.Y.D, Y.Z; Data analysis: W.Y.C, G.H.L, C.L.K; Data interpretation: W.Y.C, G.H.L; Drafting the manuscript: W.Y.C, G.H.L; Critical revision of the manuscript: W.Y.C, M.J.C, S.W.X, C.Y.L, J.S.J; Final approval of the version to be published: W.Y.C, G.H.L, M.J.C, C.Y.L, J.S.J. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Chenying Lu or Jiansong Ji.

Ethics declarations

Ethics approval and consent to participate

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. This study was approved by the Institutional Review Board and Human Ethics Committee of the Fifth Affiliated Hospital of Wenzhou Medical University (protocol code 2023–729) and the Sixth Affiliated Hospital of Wenzhou Medical University, with the requirement for patient informed consent being waived due to its retrospective nature. All patients’ information was anonymized prior to the analysis.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Chen, W., Lin, G., Chen, Y. et al. Prediction of the Ki-67 expression level in head and neck squamous cell carcinoma with machine learning-based multiparametric MRI radiomics: a multicenter study. BMC Cancer 24, 418 (2024). https://doi.org/10.1186/s12885-024-12026-x

  • DOI: https://doi.org/10.1186/s12885-024-12026-x

Keywords