Association between cancer stem cell gene expression signatures and prognosis in head and neck squamous cell carcinoma
BMC Cancer volume 22, Article number: 1077 (2022)
Various cancer stem cell (CSC) biomarkers and the genes encoding them in head and neck squamous cell carcinoma (HNSCC) have been identified and evaluated. However, the validity of these factors in the prognosis of HNSCC has been questioned and remains unclear. In this study, we examined the clinical significance of CSC biomarker genes in HNSCC, using five publicly available HNSCC cohorts.
To predict the prognosis of patients with HNSCC, we developed and validated the expression signatures of CSC biomarker genes whose mRNA expression levels correlated with at least one of the four CSC genes (CD44, MET, ALDH1A1, and BMI1).
Patients in The Cancer Genome Atlas (TCGA) HNSCC cohort were classified into CSC gene expression-associated high-risk (CSC-HR; n = 285) and CSC gene expression-associated low-risk (CSC-LR; n = 281) subgroups. The 5-year overall survival and recurrence-free survival rates were significantly lower in the CSC-HR subgroup than in the CSC-LR subgroup (p = 0.04 and 0.02, respectively). The clinical significance of the CSC gene expression signature was validated using four independent cohorts. Cox proportional hazards models showed that the CSC gene expression signature was an independent prognostic factor in non-oropharyngeal HNSCC, which is mostly HPV (–). Furthermore, the CSC gene expression signature was associated with the prognosis of patients with HNSCC who received radiotherapy.
The CSC gene expression signature is associated with the prognosis of HNSCC and may help in personalizing treatment for patients with HNSCC, especially those with HPV (–) disease, by allowing them to be classified in more detail.
Head and neck squamous cell carcinoma (HNSCC) is the sixth most common cancer worldwide and includes all cancers that occur in the mucosa of the oropharynx, oral cavity, hypopharynx, or larynx. Approximately 650,000 new cases of HNSCC occur every year and 350,000 patients die of it worldwide. Despite advances in therapeutic methods, the survival rates of this condition have not markedly improved over the past few decades.
HPV status is a well-known factor that influences the prognosis of patients with HNSCC. Recently, various molecular markers that influence HNSCC prognosis have been identified for precision medicine and personalized treatment. However, there are many other molecular markers that require further investigation. Cancer stem cells (CSCs) and CSC markers are important targets in this respect.
CSCs constitute the part of a tumor that has long-term repopulation potential, the ability to evade cell death, clonal tumor initiation capacity, and self-renewal properties. CSCs have been identified by the expression of cell surface markers, most of which derive from embryonic stem cell markers or from lineage molecules involved in tissue development. The most well-known CSC marker is CD44, which is associated with cell proliferation, angiogenesis, adhesion, and migration during tumorigenesis. In addition to CD44, 27 other CSC biomarkers have been reported and evaluated in HNSCC.
However, although CSCs are generally considered to comprise only a small proportion of cancer cells, they can be regulated by both external and cell-autonomous forces. In other words, CSCs may not necessarily be rare within tumors, and non-CSCs in tumors can be reversibly reprogrammed to become CSCs. Thus, genes encoding CSC markers need to be analyzed in tumors as a whole, regardless of whether the cells expressing them are CSCs or non-CSCs. However, the validity and clinical significance of these genes have been questioned recently and remain to be ascertained in the context of HNSCC. Additionally, the validity of these CSC biomarker genes in predicting the prognosis of HNSCC, and the relationships among them, have not been demonstrated. Therefore, further research is needed to validate the role of CSC biomarker genes in HNSCC and to guide personalized treatments for patients with HNSCC.
In this study, we analyzed the genomic data of patients with HNSCC to determine the molecular subtypes associated with CSC biomarker genes, thereby predicting their prognosis. We hypothesized that the investigation of mRNA expression of various genes including CSC biomarker genes in The Cancer Genome Atlas (TCGA) HNSCC cohort would generate CSC gene expression-associated molecular signatures, which could be validated in various independent HNSCC cohorts. We also investigated the prognostic importance of CSC gene expression signatures in various subgroups of patients with HNSCC.
Gene expression levels and clinical data from five independent cohorts were downloaded from public databases. Clinical and gene expression data of TCGA cohort (n = 566) were obtained using the UCSC Cancer Genomics Browser (https://xena.ucsc.edu/public). Corresponding data from the Institute for Medical Informatics, Statistics and Epidemiology (Leipzig cohort, GSE65858, n = 270), Fred Hutchinson Cancer Research Center (FHCRC cohort, GSE41613, n = 97), MD Anderson Cancer Center (MDACC cohort, GSE42743, n = 74), and AHEPA Hospital in Thessaloniki (Greece cohort, GSE27020, n = 109) were obtained from the National Center for Biotechnology Information Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo). The gene expression profile of TCGA cohort was measured using Illumina HiSeq® 2000 (Illumina Inc., San Diego, CA, USA), that of the Leipzig cohort using the Illumina HumanHT-12 v4.0 Expression BeadChip, and those of the FHCRC, MDACC, and Greece cohorts using the Affymetrix Human Genome U133 Plus 2.0 Array (Affymetrix Inc., California, USA). Because the data were generated on different platforms, all gene expression data were standardized before analysis.
Selection of reference CSC genes (training cohort)
The reported CSC biomarkers and their encoding genes were obtained from the study by Xiao et al. Specifically, CSC biomarkers and encoding genes were searched for in PubMed, using the terms ‘tumor stem cells’, ‘tumor stem-like cells’, ‘CSCs’, ‘cancer stem cells’, or ‘cancer stem-like cells’ together with ‘HNSCC’ or ‘head and neck squamous cell carcinoma’. Twenty-eight genes were selected from the studies that met the following criteria: 1) studies in humans, 2) studies showing validated evidence, and 3) original research papers. Case reports, comments, reviews, letters to the editor, and conference abstracts were excluded.
To select reference CSC genes among the 28 CSC genes, we additionally searched for articles published between 2012 and 2021 on CSC biomarkers or their encoding genes in HNSCC. Articles were searched for in PubMed, using the terms ‘HNSCC’ or ‘head and neck squamous cell carcinoma’ together with the name of each CSC biomarker or encoding gene. We selected genes that satisfied the following criteria: (a) the CSC biomarker showed clinical significance when patients were classified according to its expression and (b) the biomarker had been studied more than twice.
The selected genes were analyzed using a training cohort (TCGA cohort). First, patients in TCGA cohort were classified into two subgroups based on the mRNA expression of each gene. To define a dichotomous cut-off value for the continuous mRNA expression of each gene, an online tool (http://molpath.charite.de/cutoff/) was used. The Kaplan–Meier method was used to generate survival curves for the two subgroups of each gene, and the log-rank test was used to compare their prognoses. CSC genes showing significant differences in the 5-year overall survival (OS) or recurrence-free survival (RFS) rates between the two subgroups classified according to mRNA expression were selected as reference CSC genes.
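The dichotomization and Kaplan–Meier steps described above can be sketched as follows. This is only a minimal illustration in Python, not the pipeline used in the study (which relied on an online cut-off tool and R); the function names, the cut-off value, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_cutoff(expression: Sequence[float], cutoff: float) -> Tuple[List[int], List[int]]:
    """Return patient indices with expression above the cutoff and at or below it."""
    high = [i for i, value in enumerate(expression) if value > cutoff]
    low = [i for i, value in enumerate(expression) if value <= cutoff]
    return high, low


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate as (event time, survival probability) steps.

    times  : follow-up time per patient (e.g. months)
    events : 1 if the event was observed, 0 if the patient was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)  # still under observation at t
        died = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1.0 - died / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: expression of one gene, follow-up in months, event flags.
expression = [2.1, 5.3, 4.8, 1.0, 6.2, 0.7]
times = [30, 12, 20, 60, 8, 55]
events = [0, 1, 1, 0, 1, 1]

high, low = split_by_cutoff(expression, cutoff=3.0)  # cutoff chosen arbitrarily
km_high = kaplan_meier([times[i] for i in high], [events[i] for i in high])
km_low = kaplan_meier([times[i] for i in low], [events[i] for i in low])
```

Each curve is a list of steps; the survival probability only drops at observed event times, while censored patients simply leave the risk set.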
Development of CSC gene expression-associated signature
Gene expression data from TCGA cohort were analyzed to identify CSC gene expression-associated signatures in HNSCC. Genes whose mRNA expression levels were negatively or positively correlated with at least one reference CSC gene were selected. We then performed unsupervised hierarchical clustering analysis with the centered correlation coefficient as a measure of similarity and a complete linkage clustering method using the Gene Cluster 3.0 program (Stanford University, Stanford, CA, USA; downloaded at https://cluster2.software.informer.com). In detail, the expression data of the selected genes were first adjusted by median-centering the genes. Next, the adjusted data were divided into two groups by unsupervised hierarchical clustering of both genes and arrays, using the centered correlation and complete linkage. The patients were thereby divided into CSC gene expression-associated high-risk (CSC-HR) and low-risk (CSC-LR) subgroups. The subgroup showing significantly lower survival rates than the other group was defined as the CSC-HR subgroup. The Java Treeview program was used to generate heat maps for the cluster analysis.
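The correlation-based gene selection can be illustrated with a short Python sketch. The correlation threshold, the helper names, and the expression values below are assumptions made for illustration only; the text does not state the threshold that was applied, and the clustering itself was performed in Gene Cluster 3.0 rather than in code like this.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Centered (Pearson) correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def select_correlated(expr: Dict[str, List[float]], references: List[str],
                      threshold: float) -> List[str]:
    """Keep genes whose |r| with at least one reference gene reaches the threshold."""
    selected = []
    for gene, values in expr.items():
        if gene in references:
            continue
        if any(abs(pearson(values, expr[ref])) >= threshold for ref in references):
            selected.append(gene)
    return selected


# Hypothetical expression profiles across five patients.
expr = {
    "CD44": [1.0, 2.0, 3.0, 4.0, 5.0],
    "BMI1": [5.0, 3.0, 4.0, 1.0, 2.0],
    "GENE_A": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with CD44
    "GENE_B": [9.0, 7.1, 5.2, 2.9, 1.0],  # falls as CD44 rises
    "GENE_C": [3.0, 1.0, 3.0, 1.0, 3.0],  # weakly related to both references
}
selected = select_correlated(expr, ["CD44", "BMI1"], threshold=0.9)  # threshold is illustrative
```

Taking the absolute value of the coefficient keeps both positively and negatively correlated genes, matching the selection rule described above.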
Construction of prediction models and validation in the four independent cohorts
Before constructing the prediction models, the gene expression data of each cohort were standardized by transformation to a median of 0 and a standard deviation of 1, because they had been generated using different platforms. The Support Vector Machine (SVM) class prediction engine was used to test the ability of CSC gene expression-associated signatures to predict the class of patients with HNSCC in four independent cohorts. Gene expression data from TCGA cohort were used to train a series of classifiers according to the SVM algorithm, following which the robustness of the classifier was estimated from the misclassification rate determined during leave-one-out cross-validation of the training set using BRB-Array Tools. The validation was conducted in four independent cohorts (Leipzig, FHCRC, MDACC, and Greece).
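A minimal Python sketch of the standardization and leave-one-out cross-validation steps is given below. Note the substitution: a simple nearest-centroid classifier stands in for the SVM engine of BRB-Array Tools, purely to keep the example self-contained, and all names and data are hypothetical.

```python
import statistics
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Shift one gene's values to median 0 and scale to sample standard deviation 1."""
    median = statistics.median(values)
    sd = statistics.stdev(values)
    return [(v - median) / sd for v in values]


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an SVM): return the label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_error(x, y, predict=nearest_centroid_predict) -> float:
    """Leave-one-out cross-validation: fraction of held-out samples misclassified."""
    errors = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]  # every sample except the i-th
        train_y = y[:i] + y[i + 1:]
        if predict(train_x, train_y, x[i]) != y[i]:
            errors += 1
    return errors / len(x)


# Hypothetical raw values of one gene across five patients.
z = standardize([4.0, 6.0, 8.0, 10.0, 12.0])

# Hypothetical standardized two-gene profiles with known subgroup labels.
samples = [[-1.2, -0.9], [-1.0, -1.1], [-0.8, -1.0], [1.0, 0.9], [1.1, 1.2], [0.9, 1.0]]
labels = ["CSC-LR", "CSC-LR", "CSC-LR", "CSC-HR", "CSC-HR", "CSC-HR"]
error_rate = loocv_error(samples, labels)
```

Standardizing each cohort separately is what allows a classifier trained on one platform to be applied to data measured on another.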
To identify gene ontology categories with significantly enriched gene numbers, the 81 genes of the CSC gene expression-associated signature were analyzed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (version 6.8). To map the CSC gene expression signature to the reference set of direct and indirect relationships, the default settings of the software were used. Annotations relevant to the gene list, such as biological functions and molecular networks, were then generated using the software’s algorithm. Significant gene annotation was determined using a two-tailed Fisher’s exact test (p < 0.05).
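For illustration, a two-tailed Fisher’s exact test for pathway enrichment can be written from the hypergeometric distribution as below. This is a sketch under stated assumptions: the gene counts are invented, the function names are hypothetical, and DAVID’s internal procedure may differ in detail.

```python
from math import comb


def hypergeom_pmf(k: int, total: int, annotated: int, drawn: int) -> float:
    """P(exactly k pathway genes among `drawn` genes sampled from `total` genes)."""
    return comb(annotated, k) * comb(total - annotated, drawn - k) / comb(total, drawn)


def fisher_exact_two_tailed(k: int, total: int, annotated: int, drawn: int) -> float:
    """Sum the probabilities of all outcomes that are no more likely than the observed one."""
    p_observed = hypergeom_pmf(k, total, annotated, drawn)
    lowest = max(0, drawn - (total - annotated))
    highest = min(annotated, drawn)
    p_value = 0.0
    for i in range(lowest, highest + 1):
        p_i = hypergeom_pmf(i, total, annotated, drawn)
        if p_i <= p_observed * (1 + 1e-7):  # small tolerance for floating-point ties
            p_value += p_i
    return min(p_value, 1.0)


# Hypothetical counts: 20 background genes, 5 of them in the pathway,
# 6 genes in the signature, 4 of which fall in the pathway.
p_value = fisher_exact_two_tailed(k=4, total=20, annotated=5, drawn=6)
```

The pathway would be called enriched when `p_value` falls below the chosen threshold (p < 0.05 in the text).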
Gene expression and survival data were used to test prognostic significance. OS was defined as the number of months between the date of diagnosis and the date of death. The number of months from the date of diagnosis to recurrence was defined as the RFS. The Kaplan–Meier method was used to produce OS and RFS curves in each subgroup of each cohort. The log-rank test was used to compare the OS and RFS rates between each subgroup. Receiver operating characteristic (ROC) curves were used to compare the sensitivity and specificity of the 1-year, 3-year, and 5-year survival predictions of CSC gene expression signatures. The area under the curve (AUC) was calculated for each ROC curve. Univariate and multivariate Cox regression models were used to evaluate the independent prognostic factors associated with the survival of patients with HNSCC. The results of the Cox regression analyses were reported as hazard ratios (HRs), 95% confidence intervals (95% CIs), and p-values. The R software package (http://www.r-project.org) was used for all statistical analyses. Statistical significance was set at p < 0.05.
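The log-rank comparison of two subgroups can be sketched in Python as follows. The study used R for all statistical analyses, so this is only an illustrative re-implementation of the standard two-group statistic; the function name and the toy follow-up data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value with 1 df)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for x, _, _ in pooled if x >= t)                # at risk, both groups
        n_a = sum(1 for x, _, g in pooled if x >= t and g == 0)   # at risk, group A
        d = sum(1 for x, e, _ in pooled if x == t and e == 1)     # events at t
        d_a = sum(1 for x, e, g in pooled if x == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up (months) and event flags for two subgroups.
high_times, high_events = [5, 8, 12, 14, 20], [1, 1, 1, 1, 1]
low_times, low_events = [18, 25, 30, 34, 40], [1, 0, 1, 0, 0]
chi2, p_value = logrank_test(high_times, high_events, low_times, low_events)
```

The statistic compares the observed number of events in one group with the number expected if both groups shared the same hazard at every event time.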
Development of CSC gene expression-associated signatures in patients with HNSCC
Among the 28 CSC genes encoding CSC biomarker proteins that have been validated in HNSCC, a literature search identified seven CSC genes (CD44, MET, ALDH1A1, BMI1, PROM1, SOX2, and POU5F1) that satisfied the following criteria: (a) showing clinical significance associated with the expression of the corresponding CSC biomarker protein in HNSCC and (b) studied more than twice over the past 10 years [5, 8, 9, 22–30]. In TCGA cohort, high expression of four CSC genes (CD44, MET, ALDH1A1, and BMI1) was significantly associated with patient prognosis (p = 0.0069, 0.0051, 0.028, and 0.021, respectively; Fig. S1). Thus, these four genes were selected as the reference CSC genes for this study.
We then identified genes whose mRNA expression was correlated with at least one of the four reference CSC genes in TCGA cohort. A total of 81 genes were identified and selected as the CSC gene expression-associated signature (Fig. S2a and Table S1). Using the CSC gene expression signature, patients in TCGA cohort (n = 566) were classified into the CSC-HR (n = 285) and CSC-LR (n = 281) subgroups (Fig. 1a). The mRNA expression levels of CD44 and MET were significantly higher in the CSC-HR subgroup than in the CSC-LR subgroup (p < 0.0001 for both). The mRNA expression levels of ALDH1A1 and BMI1 were significantly higher in the CSC-LR subgroup than in the CSC-HR subgroup (p < 0.0001 for both). The results of the Kaplan–Meier analysis and log-rank test indicated that the 5-year OS and RFS rates were significantly lower in the CSC-HR subgroup than in the CSC-LR subgroup in TCGA cohort (p = 0.04 and 0.02, respectively; Fig. 1b-c).
Independent validation of the CSC gene expression-associated signature
The CSC gene expression signature was validated using four independent cohorts: Leipzig (n = 270), FHCRC (n = 97), MDACC (n = 74), and Greece (n = 109) (Fig. S2b). Details of the clinical and pathological characteristics of each cohort used in this study are shown in Table 1. Patients in each validation cohort were classified into CSC-HR and CSC-LR subgroups based on the CSC gene expression signature. In each validation cohort, the CSC-HR subgroup showed poorer survival than the CSC-LR subgroup (Fig. 2), although the difference did not reach significance in every cohort. In the Leipzig cohort, the 5-year OS rates tended to be lower in the CSC-HR subgroup than in the CSC-LR subgroup, but the difference was not significant (p = 0.06; Fig. 2a). In the FHCRC and MDACC cohorts, the 5-year OS rates were significantly lower in the CSC-HR subgroup than in the CSC-LR subgroup (p < 0.0001 and p = 0.02, respectively; Fig. 2c and e). Furthermore, the 5-year RFS rates were significantly lower in the CSC-HR subgroup than in the CSC-LR subgroup in the Greece cohort (p = 0.009; Fig. 2g). In all patients in the five cohorts, the 5-year OS rates were significantly lower in the CSC-HR subgroups than in the CSC-LR subgroups (p < 0.0001; Fig. S3a).
The sensitivity and specificity of the CSC gene expression signature were evaluated in each cohort using ROC curves. The AUCs were 0.608, 0.572, and 0.541 for the 1-year, 3-year, and 5-year OS, respectively, in the Leipzig cohort (95% CI, 0.523–0.692, 0.489–0.655, and 0.413–0.668, respectively; Fig. 2b). The AUCs for the 5-year OS were 0.671 and 0.876 in the FHCRC and MDACC cohorts (95% CI, 0.574–0.767 and 0.790–0.960, respectively; Fig. 2d and f), indicating good discriminatory ability in the MDACC cohort. The AUCs for 1-year, 3-year, and 5-year RFS were 0.692, 0.628, and 0.538 in the Greece cohort, for which only RFS was reported (95% CI, 0.581–0.803, 0.518–0.737, and 0.385–0.691, respectively; Fig. 2h). The AUC for the 5-year OS was 0.582 for all patients in the five cohorts (95% CI, 0.531–0.632; Fig. S3b). These results support the prognostic value of the CSC gene expression signature in the analyzed cohorts, although the discriminatory ability was modest in most of them.
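To make the AUC values above easier to interpret, the following Python sketch computes an AUC as the probability that a patient who experienced the event is ranked as higher risk than one who did not. It is a simplified illustration with invented data; it ignores censoring, which a time-dependent ROC analysis of survival data would have to handle, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = CSC-HR, 0 = CSC-LR; outcome 1 = death within 5 years.
risk_group = [1, 1, 1, 0, 0, 0, 1, 0]
died_within_5y = [1, 1, 0, 0, 0, 1, 1, 0]
auc = roc_auc(risk_group, died_within_5y)
```

For a binary classifier such as a two-subgroup signature, this quantity equals the average of sensitivity and specificity, so an AUC near 0.5 indicates little discrimination.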
The CSC gene expression signature as an independent prognostic factor of non-oropharyngeal HNSCC
To assess the independent prognostic factors of patients with HPV (–) HNSCC, we selected patients with non-oropharyngeal HNSCC from the independent HNSCC cohorts. Because the HPV status was missing for many patients, we hypothesized that an analysis of non-oropharyngeal HNSCC might help identify prognostic factors for patients with HPV (–) HNSCC. The Greece cohort reported RFS but not OS, so patients with non-oropharyngeal HNSCC were selected from the other four cohorts (n = 816). Cox proportional hazards models were constructed using the CSC gene expression signature, patient demographics, alcohol history, smoking history, and clinical staging of these patients. In this analysis, the CSC gene expression signature (CSC-HR vs. CSC-LR) and regional lymph node metastasis (N+ vs. N–) were independent prognostic factors of OS in patients with non-oropharyngeal HNSCC (p = 0.0140 and 0.0292, respectively; Table S2).
Association of the CSC gene expression signature with HPV status of HNSCC
We reasoned that a separate survival analysis according to HPV status might help identify the patients in whom the CSC gene expression signature can predict prognosis. Thus, we analyzed the prognosis of the CSC-HR and CSC-LR subgroups in patients with HPV (+) and HPV (–) HNSCC from the three HNSCC cohorts (TCGA, Leipzig, and FHCRC) that include information about HPV status (Fig. 3). There were no significant differences in the 5-year OS rates between the CSC-HR and CSC-LR subgroups in patients with HPV (+) HNSCC (n = 128; p = 0.2; Fig. 3a). However, the CSC-HR subgroup showed significantly lower 5-year OS rates than the CSC-LR subgroup among patients with HPV (–) HNSCC (n = 578; p = 0.003; Fig. 3b).
Association of the CSC gene expression signature with the results of radiotherapy (RT)
The expression of CSC markers is correlated with poor prognosis after RT in HNSCC [32, 33]. However, the clinical relationship between RT and genes encoding CSC markers has not yet been clearly studied. Thus, we analyzed the prognosis of the CSC-HR and CSC-LR subgroups in the two HNSCC cohorts (TCGA and MDACC) that include information on whether RT was received. The CSC-HR subgroup showed significantly lower 5-year OS rates than the CSC-LR subgroup among patients with HNSCC who received RT (p < 0.0001; Fig. 4a). However, there were no significant differences in the 5-year OS rates between the two subgroups among patients with HNSCC who did not receive RT (Fig. 4b). Similarly, within the CSC-HR subgroup, there were no significant differences in 5-year OS rates between patients who received RT and those who did not (p = 0.1; Fig. 4c). However, within the CSC-LR subgroup, the prognosis was better in patients who received RT than in those who did not (p < 0.0001; Fig. 4d). To determine whether the effect of the CSC gene expression signature on OS depended on RT, we performed an interaction test for OS. The results revealed a significant interaction between the CSC gene expression signature and RT (p < 0.0001).
A total of eight significant Kyoto Encyclopedia of Genes and Genomes pathways were identified using DAVID (Table S3). Several of these pathways appeared to be related to cancer or HNSCC, including focal adhesion (p = 1.0E-5), small-cell lung cancer (p = 3.1E-4), ECM–receptor interaction (p = 4.6E-3), proteoglycans in cancer (p = 7.2E-3), and PI3K-Akt signaling pathway (p = 1.0E-2). Pathways associated with endothelial-mesenchymal transition signaling were also identified, such as regulation of the actin cytoskeleton (p = 8.5E-3) and leukocyte transendothelial migration (p = 9.9E-3).
In this study, we developed a CSC gene expression signature in TCGA cohort and validated it in four independent HNSCC cohorts. We observed that patients in the CSC-HR subgroup had a worse prognosis than those in the CSC-LR subgroup in most cohorts, with a non-significant trend in the Leipzig cohort. Similar results were observed among patients with HPV (–) HNSCC. Furthermore, the CSC gene expression signature was associated with the outcomes of patients receiving RT. Thus, the CSC gene expression signature might help identify patients with HNSCC who benefit less from RT and require intensified or personalized treatment.
Cancer cells within individual tumor masses often represent distinct phenotypic states that differ in their functional attributes. Within this tumor heterogeneity, CSCs are essential for tumor initiation, maintenance, recurrence, and metastasis. To date, the identification of CSCs has mainly been based on CSC surface markers. However, whether the genes encoding these CSC biomarkers can predict the prognosis of patients with HNSCC has not been clearly studied. Thus, we focused on the association between CSC biomarker genes and the prognosis of patients with HNSCC.
We believed that it would be difficult to predict the prognosis of HNSCC by considering all CSC biomarker genes, because each CSC biomarker gene affects prognosis differently and the relative expression of each gene varies from patient to patient. Thus, we selected CSC biomarker genes that satisfied the following criteria: (a) the expression of the corresponding biomarker showed clinical significance in more than two studies in the last 10 years and (b) high expression of the gene was significantly associated with prognosis in patients with HNSCC. On this basis, we selected four CSC biomarker genes: CD44, MET, ALDH1A1, and BMI1. We then comprehensively analyzed five independent public cohorts using the gene signature associated with these CSC biomarker genes.
CD44 is a transmembrane glycoprotein that is the major receptor for hyaluronan. CD44 is a commonly used CSC marker and is associated with prognosis in various human tumors, including HNSCC. High CD44 expression is associated with poor survival in HNSCC. CD44 is also highly expressed in proliferating cells obtained from N+ HNSCC metastasis, thereby highlighting its possible role in tumor progression. In addition, CD44 is a biological factor that is significantly correlated with response to RT in patients with early stage laryngeal cancer.
c-MET (mesenchymal-epithelial transition factor) was found to be a CSC marker whose expression is positively correlated with that of CD44 in HNSCC clinical databases. Lim et al. found that activation of the c-MET pathway is critical for the proliferation and maintenance of CSC traits in HNSCC. c-MET knockdown significantly decreased the CD44-positive cell population. c-MET is expressed in the majority of locally advanced HNSCC, and high expression of c-MET predicts a worse prognosis. High MET expression has also been found to be associated with poor loco-regional tumor control and increased metastasis after post-operative chemoradiotherapy in patients with HPV (–) HNSCC.
Aldehyde dehydrogenase 1 (ALDH1) and B-lymphoma Moloney murine leukemia virus insertion region-1 (BMI-1) are two of the most studied CSC markers in HNSCC. ALDH1 is an important stem cell marker in both normal and cancer cells. ALDH1 regulates cellular functions by detoxifying various aldehydes and by participating in retinoid signaling. ALDH1 has been reported to have protective properties against HNSCC. In another study, however, positive expression of ALDH1 showed a significant correlation with lymph node metastasis and poor prognosis. ALDH1 positivity was also found to be correlated with the number of cells undergoing epithelial-mesenchymal transition and with metastasis in early stage oral squamous cell carcinoma (OSCC). Thus, the reported associations between ALDH1 expression and prognosis are contradictory.
BMI-1 is important for the self-renewal ability of stem cells and is related to epithelial-mesenchymal transition. Rao et al. found a significant positive correlation between ALDH1 and BMI-1 expression in OSCC tissue samples, although the underlying pathways have not yet been elucidated. High expression of BMI-1 was associated with poor prognosis in advanced-stage HNSCC treated with primary chemoradiotherapy. BMI-1 is also upregulated after irradiation in OSCC, and is associated with poor prognosis. Based on these results, the CSC biomarker genes selected in this study may play a significant role in the prognosis of HNSCC.
There are genes, other than CSC genes, whose expression is associated with the diagnosis and prognosis of HNSCC. Lohavanichbutr et al. identified and validated a 13-gene expression signature that was strongly predictive of survival in patients with HPV (–) OSCC. They first identified 131 genes by comparing gene expression between OSCC and normal control groups. Thirteen of these genes were then selected using the L1-penalized Cox proportional hazards regression method. Three genes (LAMC2, SERPINE1, and KLF7) overlapped between the 13-gene signature identified by Lohavanichbutr et al. and the 81-gene CSC gene expression signature identified in our analysis. LAMC2, SERPINE1, and KLF7 play a role in cell proliferation, migration, and adhesion. High expression of these genes is associated with poor prognosis in HNSCC [41,42,43]. Hypoxia- and ferroptosis-related gene signatures predicting the prognosis of patients with OSCC have also been identified and validated [44, 45]. In this study, we developed and validated a signature associated with CSC biomarker genes, the expression of which was correlated with the prognosis of patients with HNSCC.
Patients with HPV (–) HNSCC have a worse prognosis in terms of OS and RFS rates than those with HPV (+) HNSCC. However, the prognosis of individual patients with HPV (–) HNSCC differs owing to various risk factors. We confirmed that the CSC gene expression signature was an independent prognostic factor in non-oropharyngeal HNSCC. Because HPV status was not available for many patients, we indirectly analyzed the role of the CSC gene expression signature in HPV (–) HNSCC using information about patients with non-oropharyngeal HNSCC. In addition, the CSC-HR subgroup showed a significantly worse prognosis than the CSC-LR subgroup among patients with HPV (–) HNSCC. Next, we investigated whether the CSC gene expression signature influences the prognosis of patients with HPV (+) HNSCC. In these patients, the 5-year OS rates tended to be lower in the CSC-HR subgroup than in the CSC-LR subgroup; however, the differences were not significant. This may be due to the relatively small size of the HPV (+) HNSCC cohort (n = 128). Further studies in larger HPV (+) HNSCC cohorts are needed to confirm the association between the CSC gene expression signature and the prognosis of patients with HPV (+) HNSCC.
CSCs can regulate their proliferative and self-renewal capacity, and are thus involved in metastasis, cancer development, and resistance to RT. However, the association between various CSC biomarker genes and the response to RT in HNSCC has not been studied. Only the mRNA expression of CD44 has been shown to be a significant predictor of local recurrence after RT in early stage laryngeal cancer. Thus, we hypothesized that the overexpression of specific CSC biomarker gene mRNAs in HNSCC might be correlated with the response to RT. However, each patient expresses the various CSC biomarker genes heterogeneously, and thus might respond heterogeneously to RT. Our results showed that, compared to the CSC-HR subgroup, the CSC-LR subgroup benefited significantly from RT. These results indicate that the CSC gene expression signature might help in planning an RT schedule, provided that further research is conducted on the response to various doses of irradiation in CSC-HR and CSC-LR HNSCC cell lines.
A limitation of our study is that we analyzed the CSC gene expression signature using five different public HNSCC cohorts. Thus, the essential information available differed between cohorts. In particular, the HPV status was missing for about 40% of patients in TCGA cohort and for all patients in the MDACC and Greece cohorts. Thus, it was not possible to accurately evaluate the effect of the CSC gene expression signature on the prognosis of patients with HPV (–) HNSCC. Instead, we hypothesized that an analysis of non-oropharyngeal HNSCC regardless of HPV status might help identify independent prognostic factors for patients with HPV (–) HNSCC. In addition, details of treatment modalities and doses, such as post-operative RT, concurrent chemoradiotherapy, and induction chemoradiotherapy with surgery, were not available for each cohort. To compensate for the missing information, we conducted additional analyses and found that the CSC gene expression signature was associated with the prognosis of patients with HPV (–) HNSCC and with the response to RT in HNSCC. Finally, the mRNA expression of the individual selected CSC biomarker genes showed very low AUC values, as well as sensitivity and specificity below the thresholds required for decision-making in clinical settings (AUCs were less than 0.6 for CD44, MET, ALDH1A1, and BMI1). A possible reason is that the prognosis of HNSCC is not determined by the mRNA expression of a single gene, because the cancer is caused by the accumulation of multiple mutations in various pathways. However, these four genes have shown clinically significant associations with the expression of the corresponding CSC biomarker proteins in HNSCC over the past 10 years [5, 8, 22,23,24,25,26,27,28]. Thus, we analyzed and confirmed the actual association between mRNA expression of these genes and prognosis in TCGA HNSCC cohort, by referring to these ROC curves.
To the best of our knowledge, this is the first study to assess the prognosis of patients with HNSCC using various CSC biomarker genes. Each CSC biomarker gene influences the prognosis of patients with HNSCC, but the relative expression of these genes is highly heterogeneous among patients. Thus, we first showed that the gene expression signature of the four reference CSC biomarker genes, CD44, MET, ALDH1A1, and BMI1, was significantly related to the prognosis of patients with HNSCC. In addition, the Cox proportional hazards model showed that the CSC gene expression signature was an independent prognostic factor that influenced the OS of patients with non-oropharyngeal HNSCC.
We developed a CSC gene expression signature that could predict the prognosis of patients with HNSCC, especially in cases with HPV (–) status. The CSC gene expression signature was an independent prognostic factor in non-oropharyngeal HNSCC, which is mostly HPV (–). In addition, there was a significant association between the CSC gene expression signature and the response to RT in HNSCC. Therefore, our data provide evidence that the CSC gene expression signature may help in the design of personalized treatments for patients with HPV (–) HNSCC by allowing them to be classified in more detail.
Availability of data and materials
The datasets supporting the conclusions of this study are available from the UCSC Cancer Genomics Browser (https://xena.ucsc.edu/public) (TCGA cohort, n = 566) and the National Center for Biotechnology Information Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo) (Leipzig cohort, GSE65858, n = 270; FHCRC cohort, GSE41613, n = 97; MDACC cohort, GSE42743, n = 74; Greece cohort, GSE27020, n = 109).
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A5A2019413) and by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI20C1205). The funders had no role in the study design, the analysis and interpretation of the data, or the writing of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare no conflicts of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Supplementary Figure 1.
Prognosis according to the mRNA expression of each CSC gene in TCGA cohort. Cut-off values for the continuous mRNA expression values were selected with reference to ROC curve analysis. Kaplan–Meier plots show the 5-year OS or RFS rates of the two subgroups defined by the cut-off value of each CSC gene in TCGA cohort. The log-rank test was used to compare the prognosis of the two subgroups for each gene, and for each gene the plot (5-year OS or RFS) with the lower p-value was selected. (a-d) CD44, MET, ALDH1A1, and BMI1 showed significant differences in the OS or RFS rates between the two subgroups classified according to mRNA expression in TCGA cohort (p=0.0069, 0.0051, 0.028, and 0.021, respectively). (e-g) There were no significant differences in the OS or RFS rates between the two subgroups classified according to the mRNA expression of PROM1, SOX2, and POU5F1 in TCGA cohort (p=0.11, 0.097, and 0.12, respectively). *p<0.05
Additional file 2: Supplementary Figure 2.
Construction of the prediction model. (a) Venn diagram showing CSC gene expression signatures correlated with the four CSC genes – CD44, MET, ALDH1A1, and BMI1. (b) Schematic overview of the strategy used for constructing the prediction models and evaluating the predicted outcomes based on the CSC gene expression signatures.
Additional file 3: Supplementary Figure 3.
Kaplan–Meier and ROC analyses for OS of all patients in the five independent cohorts. All patients were classified into CSC-HR and CSC-LR subgroups using the 81 CSC gene expression signatures. (a) Kaplan–Meier plots showing a significant difference in the OS rates between the two subgroups (p<0.0001). (b) ROC curves showing the sensitivity and specificity of the CSC gene expression signatures in predicting 1-year, 3-year, and 5-year patient OS in the five independent cohorts (AUC=0.582 for the 5-year OS). *p<0.05
Additional file 4: Supplementary Table 1.
Eighty-one CSC gene expression signatures in TCGA HNSCC cohort.
Supplementary Table 2. Univariate and multivariate analyses of the characteristics associated with overall survival in patients with non-oropharyngeal HNSCC in the four independent HNSCC cohorts (n=816).
Supplementary Table 3. Eight significant KEGG pathways associated with the CSC gene expression signatures.
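The log-rank comparison of two expression-defined subgroups described for Supplementary Figure 1 can be illustrated with a minimal sketch. This is not the authors' code: the function name `logrank_test` is hypothetical, the follow-up times and event flags are invented toy values, and the p-value assumes the usual chi-square approximation with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (illustrative sketch, not the authors' code).

    times_*  : follow-up times for each patient in the subgroup
    events_* : 1 if the event (death or recurrence) was observed, 0 if censored
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    # Pool both subgroups, tagging each patient with a group label (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = 0.0  # events observed in group A
    expected_a = 0.0  # events expected in group A if the groups do not differ
    variance = 0.0    # hypergeometric variance summed over event times

    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data (invented): months of follow-up and event indicators.
    high_times, high_events = [5, 8, 12, 20, 31], [1, 1, 1, 0, 1]
    low_times, low_events = [14, 25, 40, 52, 60], [1, 0, 1, 0, 0]
    chi2, p = logrank_test(high_times, high_events, low_times, low_events)
    print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

In practice, an established survival-analysis package would normally be preferred over a hand-written routine; the sketch only makes explicit the observed-versus-expected event counts that the test compares.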
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Kim, S.I., Woo, S.R., Noh, J.K. et al. Association between cancer stem cell gene expression signatures and prognosis in head and neck squamous cell carcinoma. BMC Cancer 22, 1077 (2022). https://doi.org/10.1186/s12885-022-10184-4
- Head and neck squamous cell carcinoma
- Cancer stem cell
- Gene expression signature
- Overall survival
- Recurrence-free survival