Eight-gene metabolic signature related with tumor-associated macrophages predicting overall survival for hepatocellular carcinoma

Background In recent years, the relationship between tumor-associated macrophages (TAMs) and solid tumors has become a research hotspot. This study aims to explore the close relationship of TAMs with metabolic reprogramming genes in hepatocellular carcinoma (HCC) to provide new methods of treatment for HCC. Methods The study selected 343 HCC patients with complete survival information (survival time > = 1 month) in the Cancer Genome Atlas (TCGA) as study subjects. Kaplan-Meier survival analysis assisted in determining the relationship between macrophage infiltration and overall survival (OS), and Pearson correlation tests were used to identify metabolic reprogramming genes (MRGs) associated with tumor macrophage abundance. Lasso regression algorithms were used on prognosis-related MRGs identified by Kaplan-Meier survival analysis and univariate Cox regression analysis to construct a risk score; another independent cohort (including 228 HCC patients) from the International Cancer Genome Consortium (ICGC) was used to verify prognostic signature externally. Results A risk score composed of 8 metabolic genes could accurately predict the OS of a training cohort (TCGA) and a testing cohort (ICGC). The risk score could be widely used for people with different clinical characteristics, and it is a predictor that is independent of other clinical factors that affect prognosis. As expected, compared with the low-risk group, the high-risk group exhibited an obviously higher macrophage abundance, together with a positive correlation between the risk score and the expression levels of three commonly used immune checkpoints (PD1, PDL1, and CTLA4). Conclusion Our study constructed and validated a novel eight-gene signature for predicting HCC patient OS, which may contribute to clinical treatment decisions.


Background
Growing evidence shows that tumor progression and metastasis are closely related to the tumor microenvironment [1,2]. Once the tumor microenvironment is formed, many immune cells, such as T cells, myelogenic inhibitory cells, macrophages, and others, form the tumor microenvironment through chemotaxis [3]. Among these immune cells, tumor-associated macrophages (TAMs) are macrophages derived from the infiltration of peripheral blood monocytes into solid tumor tissues [4], and they are the most numerous inflammatory cell group in tumor stroma, accounting for approximately 30-50% of the total inflammatory cells [5,6]. TAMs have been reported to have a remarkable effect on tumor occurrence, growth, invasion and metastasis [7][8][9], and thus they are receiving growing attention.
To adapt to the decreased nutrients and oxygen available in the tumor microenvironment (TME) and to maintain the rapid proliferation and material synthesis of tumor cells, a series of changes to the metabolic processes of tumor cells occur that lead to the increase in related metabolites, such as lactate, nitrous oxide, reactive oxygen species, prostaglandin, and arachidonic acid, in the tumor microenvironment, thus creating an inflammatory microenvironment [10,11]. These changes also affect the function of tumor associated macrophages (TAMs), including changes in cytokines and angiogenic factors, which may prompt tumor progression and metastasis [12].
Hepatocellular carcinoma (HCC) contributes to more than 90% of all cases of liver cancer, and HCC ranks 2nd in cancer-related deaths worldwide [13]. Metabolic reprogramming has been reported to have a significant effect on the prognosis of HCC [14,15]. However, there is no definitive understanding of the relationship between TAMs and metabolic reprogramming in HCC.
To examine how the infiltration of six types of immune cells affected HCC prognosis, the Tumor Immune Estimation Resource (TIMER) database was adopted in this study for the collection of tumor immune cell infiltration data. The patients with higher macrophage infiltration levels had a poor overall survival rate. We also identified differentially expressed metabolismrelated genes (DEMRGs) between high-level and lowlevel macrophage infiltration groups of HCC patients and used these genes to build a prognostic signature.

Data collection
The immune infiltration data for tumors from the Cancer Genome Atlas (TCGA) were obtained from the  With the purpose of eliminating the impact exerted by perioperative factors on patients' survival, our study did not include patients who survived no more than 1 month. Sequencing data from TCGA and ICGC databases utilized in this study were collected from the same Illumina HiSeq_RNA-Seq platform, and the R package "combat" was used to remove batch effects. There is no need to obtain the approval of a local ethics committee since the above data can be accessed from public sources. This study was conducted following the regulations on the use of databases involved. Figure 1 displays the workflow.
Identification of TAM-related metabolic genes (TRMGs) According to the median value of immune cell infiltration, 343 HCC cases from the TCGA dataset were classified into two groups, which had high and low levels of infiltration. We extracted 2752 previously published genes related to metabolism that encode all the known human metabolic enzymes and transporters to perform the following analysis [17]. Then, we used the Wilcoxon test in the R package "limma" to identify metabolismrelated genes (DEMRGs) that were expressed differently by the two groups. How gene expression correlated with estimated immune infiltration was examined by the Pearson correlation test. A significant correlation with TAM infiltration was identified by a p-value of smaller

Identification of prognostic TRMGs
The association of prognosis in the TCGA database with TRMGs was examined by Kaplan-Meier survival analysis and univariate Cox regression analysis. P-values < 0.05 were used as selection criteria for both methods.
Construction of a risk score predict overall survival in the TCGA cohort To construct a risk score, a lasso regression algorithm was performed on the prognosis-associated TRMGs identified by Kaplan-Meier survival analysis and univariate Cox regression analysis. It involved confirming the optimal penalty parameter λ related to the smallest 10fold cross-validation in the TCGA dataset. In terms of the signature, the risk score was calculated as the sum of each mRNA coefficient * each mRNA expression. The formula assisted in computing the risk score specific to each patient in TCGA and ICGC datasets. With the median value as the unified cutoff, patients of the TCGA cohort were classified into low-and high-risk groups. The prediction accuracy of our risk signature for 0.5/1/3/5-year survival was assessed by the R package "survival ROC", which generated the timedependent receiver operating characteristic (ROC) curves. The Kaplan-Meier curves generated with the R package "suvminer" were used to compare patients' OS in different groups using a log-rank test. The R package "glmnet" was utilized in this study to conduct lasso regression analysis.  External validation of prognostic signature in ICGC cohort The same formula mentioned above was used to calculate each patient's risk score, resulting in high-risk and low-risk groups in the same way. Patients with different clinical features (previous malignancy, stage, age, gender) were also subject to Kaplan-Meier survival analysis.

Independence validation of the prognostic signature
Based on univariate and multivariate Cox regression analyses, the risk score and clinicopathological features were obtained to determine whether the risk score could be used as an independent predictor of HCC prognosis.

Correlation analysis between risk score and clinicopathology
We conducted chi-square tests on different risk groups for correlation analysis regarding clinical features. P less than 0.05 represented statistical significance.

Estimation of immune infiltration
Two algorithms were used to estimate immune infiltration in different risk groups: TIMER, which is a method for estimating the abundance of 6 types of tumorinfiltrating immune cells (dendritic cells, macrophages, CD8 T cells, CD4 T, neutrophils, and B cells) [18]; and CIBERSORT-ABS, which is a methodology based on the gene expression profile for evaluating the absolute abundance of 22 immune cell populations [16].

Gene set enrichment analysis
To clarify possible mechanisms underlying the prognostic signatures of patients with HCC, HCC patients in both TCGA and ICGC cohorts were subject to Gene Set Enrichment Analysis (GSEA). An annotated gene set file "c2.cp.kegg.v7.0.symbols.gmt" was used as reference. The threshold was confirmed as FDR < 0.25 and NOM pvalue < 0.05.

Statistical analysis
SPSS Statistics 25 (https://www.ibm.com/products/ software) together with R v.3.6.1 (https://www.r-project. org/) was employed for all statistical analyses. Mann-Whitney U-tests and Unpaired Student's t-tests assisted in comparing two groups containing variables with normal distributions and two groups containing variables with nonnormal distributions, respectively. To compare three groups, one-way analysis of variance served as the parametric method, and Kruskal-Wallis tests of variance were used as the nonparametric method. Fisher's exact tests or chi-square tests assisted in analyzing the contingency table variables. Kaplan-Meier survival analysis was performed, together with log-rank tests for comparison. A univariate Cox proportional hazards regression model assisted in estimating the hazard ratios exhibited by univariate analyses. Statistical significance was identified by a two-tailed P-value < 0.05 [19].

Identification and annotation of TRMGs
We observed that patients with a higher abundance of macrophage infiltration had a poorer prognosis (Fig. 2a), which prompted us to identify prognostic biomarkers of HCC according to the degree of macrophage infiltration. We identified 1382 metabolic genes with different expression levels between high macrophage infiltration and low macrophage infiltration patients (Fig. 2b), and 192 genes were significantly correlated with macrophage infiltration (cor > 0.4, p < 0.001) (Fig. 2c). KEGG and GO enrichment analysis demonstrates the involvement of these genes in many aspects of metabolism, including those related to glycoproteins, sulfur compounds, coenzymes, carbon, purines, glycolysis, and glycogenesis (Fig. 3).

Construction of prognostic signature
A total of 87 TRMGs were identified by Kaplan-Meier survival analysis and Univariate Cox regression analysis, and all were risk factors for the overall survival of HCC (Fig. 4a). The lasso regression model included the following 8 metabolic genes associated with prognosis ( Fig. 4b- Prognostic assessment of the signature in the TCGA cohort Kaplan-Meier survival showed poorer overall survival (OS) of the high-risk group than the low-risk one (log-rank test p < 0.001) (Fig. 6a). The AUC for 1, 3, and 5-year OS was 0.786, 0.727, and 0.693, respectively (Fig. 6b). The survival status distribution map also showed that mortality increased as the risk score increased (Fig. 6c-e). Univariate and multivariate Cox regression analysis reveals the risk score as an independent prognosis predictor for OS ( Fig. 7a-b).
Internal validation of the prognostic signature in the TCGA cohort Patients were assigned to 20 groups by clinical characteristics. The high-risk group exhibited a poorer prognostic outcome in each subgroup than the low-risk one based on the Kaplan-Meier analysis. All p values obtained from log-rank tests in each subgroup were less than 0.05 (Fig. 7c).

External validation of the prognostic signature in ICGC cohort
A group of 228 patients with complete survival information served as an external validation cohort. The formula mentioned previously was used to calculate the risk score, and patients were assigned to two groups with the same cutoff (0.731). Similar to the TCGA cohort, the high-risk group showed worse survival as demonstrated by Kaplan-Meier survival analysis (p < 0.01) (Fig. 8a). The AUC for OS at 1, 3, and 5 years was 0.775, 0.713, and 0.761, respectively (Fig.  8b). Overall survival was independently predicted by the risk score as implied by Univariate and multivariate Cox regression analysis ( Fig. 9a-b). Then patients assigned by clinical features (8 subgroups) were subject to Kaplan-Meier survival analysis, where the high-risk group exhibited worse OS in each subgroup (Fig. 9c). These results further confirmed the robustness of the signature in predicting overall survival of HCC.

Relationship between risk score and clinical features
According to the results of chi-square test, the high-risk group has higher histopathological grade, later clinical stage, greater vascular invasion, and higher probability of new tumor growth after initial treatment, and this group usually presents with a tumor, which may help to explain the reasons leading to the poor prognosis of high-risk group (Tables 1-2).

Relationship between risk score and immune infiltration
The TIMER algorithm showed that the high-risk group presented greater infiltration of 6 types of immune cells, relative to the low-risk group (Fig. 10a). Among these immune cells, the risk score has the strongest correlation with macrophages and neutrophils (Fig. 10b). Macrophages and neutrophils were also estimated to have greater infiltration in the highrisk group based on the CIBERSORT-ABS algorithm (Fig. 10c-d). We also found that a variety of immune checkpoints correlated with the expression of 8 genes and that the expression of CD274, PDCD1, and CTLA4 ( Fig. 10e-f) was positively correlated with the risk score.

GSEA between different risk groups
Five representative upregulated signaling pathways in the groups with low and high risk, respectively, were identified by the NES score from TCGA and ICGC cohorts (Fig. S1). Interestingly, the high-risk group presents significantly lower metabolic activity than the low-risk group in the high-risk group in these two independent cohorts (Fig. S1, Table S1-2).

Discussion
Hepatocellular carcinoma (HCC) is a representative type of primary liver cancer. Despite recent improvements in treatment, it still holds a low five-year survival [20]. The mechanisms underlying the initiation and development of HCC remain to be clarified. In recent years, the initiation and development of HCC has been shown to be not only related to the characteristics of the tumor itself but also to its microenvironment [3]. Tumor and stromal cells, various immune inflammatory cells, chemokines, and cytokines together constitute the tumor microenvironment [21], and tumor-associated macrophages (TAMs) play important roles as significant inflammatory cells. As is increasingly shown by several studies, TAMs promote the generation, metastasis, and immunosuppression of HCC [22,23]. However, how TAMs promote tumorigenesis, growth, invasion, and metastasis, how TAMs can lead to immunosuppression, and how tumor cells interact with TAMs remain unresolved topics. With the development of genomics and proteomics, we can better study the molecular mechanisms by which TAMs influence tumorigenesis and development, clarify the relationship between TAMs and tumors, and provide new clues for tumor biotherapy. It has been reported that the abnormal metabolism of malignant tumor cells can not only induce changes in the phenotype and function of TAMs but can also change the metabolic mode of TAMs, causing them to exert immunosuppressive activity and to ultimately promote the development and metastasis of tumors [12]. On that account, exploring the metabolic changes of tumor cells and TAMs and their intricate relationship is necessary. This study found that the HCC patients with higher levels of macrophage infiltration had poorer prognosis. We identified metabolism-related genes with expression levels closely correlated with macrophage infiltration. A prognostic signature containing 8 genes was established using lasso regression analysis, and this signature correlated with the overall survival of HCC patients in TCGA and ICGC cohorts. The signature was applicable for people with different clinical features, demonstrating Fig. 9 Verification of the universal applicability of the prognostic model in the ICGC cohort. a-b Univariate and multivariate regression analysis of the relationship between the RS and clinicopathological characteristics regarding overall survival in the ICGC cohort (green represents univariate analysis, and red represents multivariate analysis). c Subgroup survival analysis depending on different clinical features that this signature is robust. Additionally, the signature could be used as an independent predictor for overall survival of HCC as confirmed by Cox analysis. The immune microenvironment of different risk groups was also evaluated; immune infiltration has previously been reported to be correlated with clinical outcome for many kinds of cancer [24]. In this study, we found that the high-risk patients with unfavorable prognosis had higher levels of macrophage infiltration, which is consistent with the results of Li [25]. In addition, compared with the low-risk group, the high-risk one exhibited a higher infiltration level of Tregs and neutrophils; previous literature also demonstrated the negative correlation of prognosis in HCC with the infiltration of these two immune cell types [26,27]. However, we also found that the infiltration level of B cells, CD4 T cells, and CD8 T cells was higher in the high-risk group than in the lowrisk group using the TIMER database, which contradicts previous studies [28][29][30]. The two groups showed no difference in the infiltration of these three immune cell types as found in the CIBERSORT-ABS method. PDL1 (CD274), PDCD1, and CTLA4 are three immune checkpoint markers commonly analyzed in the clinic [31], and these proteins may be used by tumors to escape immune surveillance by controlling cell cycle progression and extracellular and intracellular signals. The positive correlation of the risk score with the expression levels of PDL1 (CD274), PDCD1, and CTLA4 was found, suggesting that metabolic reprogramming genes may have a significant effect on the tumor immune microenvironment. Interestingly, we found that the high-risk group had significantly lower activity in metabolic pathways than the low-risk group in two independent cohorts based on GSEA. This finding is consistent with the improved prognosis for the metabolism subgroup of HCC reported by Gao et al. [15]. Lactate dehydrogenase A (LDHA) is a metabolic enzyme that can produce lactate in human body, and it has become an important indicator of clinical tumor diagnosis. It has been reported that LDHA can mediate tumor immune escape by inhibiting the activity of T cells and NK cells [32]. LDHA expression levels are positively correlated with macrophage abundance in HCC, which also provides a clue for the further study of the tumor immune mechanisms regulated by LDHA. The human solute carrier protein 25 family is a superfamily of human solute carrier proteins, which play a role in molecular transport, oxidative phosphorylation, and iron metabolism related to urea and the citric acid cycle. Recently, several studies have shown that SLC25 family members can affect tumor initiation and development [33]. This study demonstrates for the first time the correlation among SLC25A24, HCC prognosis, and macrophage infiltration. Glycosyltransferase is crucial in glycosylation; it catalyzes the transfer of an active glycosyl group from a glycosyl donor to a glycosyl receptor and forms glycosidic bonds [34]. David Kessel et al. found that the levels of three plasma glycosyltransferases could affect cancer patients' neoplasia, especially for patients with tumors metastasizing to the liver [35]. However, the mechanism of glycosyltransferaselike domain containing 1 (GTDC1) in HCC remains to be elucidated. After HCC leads to the unlimited proliferation of hepatocytes, the speed of development begins to slow. The most common metabolic phenomenon observed in HCC cells is an increased glycolysis rate, which is known as the Warburg effect [36]. Glucose-6-phosphate dehydrogenase (G6PD) is an important metabolic enzyme in glycolysis, and it is correlated with the proliferation and apoptosis of HCC [37]. This study is the first to identify correlation between 6-phosphate dehydrogenase and the abundance of tumor macrophages, which will provide a new direction for further exploration of the mechanism of G6PD in regulating HCC development. The rapid growth of tumors also depends on the polyamine content in cells. The abnormal metabolism of polyamine can also cause malignant transformation of cells [38]. The metabolism of polyamine has attracted special attention in the tumor research field. Known as SAMDC, S-adenosylmethionine decarboxylase 1 (AMD1) is the rate limiting enzyme, which regulates polyamine metabolism [39,40]. In lymphoma, AMD1 acts as a tumor suppressor gene by regulating the posttranslational modification of eukaryotic translation initiation factor 5A (eIF5A) [39]. In prostate cancer, AMD1 regulates the mTOR pathway to influence tumor cell proliferation, thus promoting tumor development [41]. AMD1 also remarkably affects breast cancer initiation and development [42]. However, the expression and role of AMD1 in HCC are not commonly reported. The role of the remaining three genes, CAD, GNPDA1, and ELOVL1, in tumors remains unclear.
Our study is the first attempt to identify the metabolic genes with prognostic significance for HCC from the perspective of immune infiltration. Our results show a significant relationship between metabolic reprogramming and the tumor immune microenvironment, and metabolic disorders may affect tumor development by mediating tumor immune regulation. These results provide theoretical support for exploring the nonmetabolic mechanisms of metabolic genes in the future. However, there are some limitations in our research. We have not performed further experiments to explore the immune mechanisms of these metabolic genes, but we will address this in the future work. And as for the HCC patients who are mostly diagnosed by imaging modalities and treated with nonsurgical methods, the value of this model may be limited, because our model needs to quantify the expression levels of eight specific genes in tumor tissues. Our research preliminarily demonstrates the feasibility of exploring the immune activity of tumors from the perspective of metabolic reprogramming. Therefore, it is necessary to continue performing multicenter prospective research on this subject.

Conclusion
To evaluate the prognosis of HCC, our study established a novel risk score by examining how tumor macrophages correlated with metabolic genes. Considering the heterogeneity of HCC, the prognostic evaluation of HCC may be improved by the prognostic model. Our study also provides theoretical support for further elucidating the complex relationship between metabolic reprogramming and tumor immune mechanisms. Fig. 10 The landscape of immune infiltration in high-and low-risk HCC patients. a-b Relationships between the risk score and infiltration abundance of six types of immune cells (TIMER method, red represents high-risk group, blue represents low-risk group). c Cor-heatmap of absolute abundance of 22 immune infiltration cell proportions estimated using CIBERSORT-ABS method. d Vioplot of absolute abundance of 22 immune infiltration cell estimated by CIBERSORT-ABS method (red represents high-risk group, blue represents low-risk group). e Analysis of coexpression of 8 genes and immune checkpoints (f) The comparison of expression levels of CD274 (PDL1), PDCD1, and CTLA4 between different risk groups and correlation analysis between risk score and the expression levels of CD274 (PDL1), PDCD1, and CTLA4