Identifying prognostic-related DEGs that were associated with mitophagy
A total of 5 normal and 169 GBM samples with mRNA expression profile and clinical data were extracted from TCGA. The 51 mitophagy-related genetic expressions were compared between normal group and tumor group via “limma” package. Twenty-seven mitophagy-related DEGs were identified with P < 0.05, among which 15 genes (GABARAPL1, PRKN, OPTN, PINK1, MAP1LC3A, ULK1, DNM1L, AMBRA1, TOMM20, ATG13, MFN2, PTEN, TOMM70, RNF41, VPS13C) were downregulated while 12 genes (TOMM40, GABARAP, CSNK2B, TOMM22, TOMM5, ROCK1, TOMM7, RIPK2, MTERF3, UBA52, PHB2, RPS27A) were upregulated. The heatmap of 51 DEGs was illustrated in Fig. 1A. To further explore the interaction between DEGs, we performed protein–protein interaction (PPI) on mitophagy-related gene (Fig. 1B). We identified 28 hub genes with the minimum required interaction score of 0.900 (highest confidence), among which 10 genes (ATG 13, CSNK2B, DNM1L, GABARAP, MAP1LC3A, MFN2, MTERF3, OPTN, PINK1, PTEN) were DEGs (Fig. 1B). The correlation network of 33 mitophagy-related genes expressions with significant difference is further shown in Fig. 1C (caption: red = positive correlations; blue = negative correlations).
Consensus clustering analysis of GBM based on different expression pattern
To further classify GBM subtypes based on their distinct expression pattern of mitophagy-related DEGs, we constructed consensus clustering analysis in GBM patient in the TCGA cohort. By using ConsensusClusterPlus package based on 27 mitophagy-related DEGs, we identified 2 different regulation patterns (k = 2) including 139 cases in cluster 1 and 30 in cluster 2 with the highest intragroup correlations (Fig. 2A). Then we integrated the cluster with the mRNA expression level and clinical informations including age (≤60 or > 60 years), sex (male or female) and survival status (dead or alive), which are illustrated in heatmap (Fig. 2B).
Construction of risk signature based on the TCGA cohort
After deleting duplicate and missing value, a total of 159 GBM samples were match with clinical information. To screen out candidates for constructing risk signature, we performed univariate Cox regression analysis on DEGs and identified 5 genes (MAP1LC3A, TOMM20, TOMM22, PHB2, UBA52) with the criteria of P < 0.1 (Fig. 3A). Among them, 1 gene (MAP1LC3A) was related to increased risk with HRs > 1, while the other 4 genes (TOMM20, TOMM22, PHB2, UBA52) were associated with protective effect with HRs < 1 (Fig. 3A). Next, least absolute shrinkage and selection operator (LASSO) Cox regression analysis was used to identified 4 survival-related genes (MAP1LC3A, TOMM20, PHB2, UBA52) to construct risk model for prognosis according to the optimum λ value (Fig. 3B, C). The risk score formula was as follows: risk score = (0.0272 * MAP1LC3A exp.) + (− 0.00546 * PHB2 exp.) + (− 0.008623 * TOMM20 exp.) + (− 0.002654 * UBA52 exp.). The downregulation of MAP1L3A, TOMM20 and upregulation of PHB2, UBA52 in GBM tissue were further validated by The Human Protein Atlas database (Fig. 3D).
The 159 samples were classified into high-risk group (79) and low-risk group (80) (Fig. 4A). Principal component analysis (PCA) showed that GBM patients were well divided into 2 clusters (Fig. 4B). As shown in Fig. 4C, patients in high-risk group exhibited more death (red dot) and lower survival time compared to low-risk group. Furthermore, the difference of OS between high-risk and low-risk groups was prominent (P = 0.0033) (Fig. 4D). Time-dependent ROC curves were utilized to evaluate the predictive performance of the model and the area under the curve (AUC) was 0.635 for 1-year, 0.75 for 2-year, and 0.724 for 3-year survival (Fig. 4E).
To evaluate whether risk score was an independent risk factor, we utilized univariate and multivariable Cox regression analyses in TCGA cohort. In univariate Cox regression, the risk model had a significant relationship with OS (HR = 1.7566, 95%CI (1.1988–2.5739), P < 0.005) (Fig. 4F). The multivariate analysis also indicated the risk model was an important factor correlated to OS (HR = 1.6734, 95%CI (1.1368–2. 4633), P < 0.01) (Fig. 4G). Furthermore, the heatmap of clinical features showed that the survival status was better in the low-risk group (Fig. 4H).
Validation of the prognostic model in the Chinese Gliomas Genome Atlas (CGGA) databases
After screening out, 134 GBM patient’s samples with clinical informations were extracted from CGGA mRNAseq_693. According to the median risk score built in TCGA cohort, patients were classified into high-risk (66) and low-risk (67) groups (Fig. 5A). PCA analysis indicated that patients were separated well into two groups (Fig. 5B). The dot line plot showed that patients with low-risk exhibited better survival time and survival rate (Fig. 5C). Furthermore, Kaplan–Meier analysis also showed significant difference in survival status between two groups (P = 0.0175) (Fig. 5D). Time-dependent ROC curves indicated that the 1-year, 2-year, 3-year AUC were 0.603, 0.709, 0.653 respectively, suggesting satisfactory predicting efficacy of the risk score model (Fig. 5E).
To evaluate whether risk score was an independent risk factor, we utilized univariate and multivariable Cox regression analyses in CCGA cohort. In univariate Cox regression, the risk model had a significant relationship with OS (HR = 1.5757, 95%CI (1.0786–2.3019), P = 0.0187) (Fig. 5F). The multivariate analysis also indicated the risk model was an important factor correlated to OS (HR = 1.5702, 95%CI (1.0599–2. 3262), P = 0.0245) (Fig. 5G).
Establishment and validation of nomograph
To construct a clinical-based method for predicting the prognosis of GBM patients, we established a nomograph based on 4 prognostic parameters including age, sex, radiation therapy, riskscore (Fig. 6A). The calibration plots indicated excellent agreement between the predicted and actual observation in both training and validation cohort (Fig. 6B, C). The AUC of nomograph in predicting 1-year, 2-year, 3-year survival were 0.693, 0.726, 0.731 respectively in TCGA cohort and were 0.637, 0.704, 0.681 respectively in CGGA cohort, indicating a favorable predictive ability of nomograph (Fig. 6D, E).
The GO and KEGG enrichment analysis based on risk model
To further clarify the pathway and the biological function of DEGs according to risk model, we performed Gene ontology (GO) enrichment analysis and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis on DEGs between low-risk and high-risk groups in TCGA cohort. First, we applied limma package in R language to identify 187 DEGs between two groups with the criteria FDR < 0.05 and |log2FC | ≥ 1. In 187 DEGs, 27 genes were downregulated in high-risk group, while the other 160 genes were upregulated. GO and KEGG enrichment analysis were applied on DEGs. According to GO analysis, DEGs based on risk model were mainly enriched in neurotransmitter transport, presynapse, ion channel activity (P < 0.005) (Fig. 7A, B). KEGG analysis showed that DEGs were closely associated with neuroactive ligand−receptor interaction and calcium signaling pathway (P < 0.01) (Fig. 7C, D).
Evaluation of the immune activity between subgroups
As immune micro-environmental abnormality plays an important part in oncogenesis, invasion and metastasis, we further evaluated the enrichment of 16 types of immune cells and 13 immune-related pathways in TCGA and CGGA in different risk score via single-sample gene set enrichment analysis (ssGSEA) package. As shown in TCGA boxplot, patients in high-risk group showed lower level of immune cells (CD8+ T cells, natural killer (NK) cells, follicular helper T cell (Tfh) cells, Th2 cell) than those in low-risk group (P < 0.05) (Fig. 8A), while the difference in immune pathway is not prominent (Fig. 8B).
In CGGA cohort, high-risk group had lower infiltration of immune cells (CD8+ T cells, NK cells, activated dendritic cells (aDCs), Mast cells, Neutrophils), which was almost the same as TCGA (Fig. 8C). Additionally, APC co stimulation, cytolytic, inflammation promoting, Type I IFN response were enriched in low-risk group, indicated the low immune activity in high-risk group (Fig. 8D). To further analyze the relationship between mitophagy-related DEGs (MAP1LC3A, TOMM20, PHB2, UBA52) and immune cells, the TIMER 2.0 was applied (Fig. 9). As shown in Fig. 9, TOMM20 was positively associated with CD8+ T cells (P < 0.0001) and negatively related to B cells and CD4+ cells (P = 0.043, 0.0052). UBA52 had positively relationship with macrophage, neutrophil, dendritic cell (DC) (P = 0.00076, 0.0045, 0.017). PHB2 had negatively relationship with CD4+ cells (P = 0.016). These findings suggested that the risk model genes were closely associated with immune infiltration.
Evaluation of the tumor microenvironment between subgroups
Due to the close relationship between mitophagy-related risk model and immune activity, CIBERSORT was utilized to analysis the tumor microenvironment (TME) between low-risk and high-risk groups. The overview of 22 immune cell compositions in GBM samples was shown in Fig. 10A. The high-risk group had higher proportion of resting NK cells as well as plasm cells, while low-risk group had increased proportion of activated NK cells, indicating that the activation of NK cells regulated anti-tumor effects in low-risk group (Fig. 10B, C).