- Open Access
Identification of prognostic immune-related gene signature associated with tumor microenvironment of colorectal cancer
BMC Cancer volume 21, Article number: 905 (2021)
The tumor microenvironment (TME) has significantly correlation with tumor occurrence and prognosis. Our study aimed to identify the prognostic immune-related genes (IRGs)in the tumor microenvironment of colorectal cancer (CRC).
Transcriptome and clinical data of CRC cases were downloaded from TCGA and GEO databases. Stromal score, immune score, and tumor purity were calculated by the ESTIMATE algorithm. Based on the scores, we divided CRC patients from the TCGA database into low and high groups, and the differentially expressed genes (DEGs) were identified. Immune-related genes (IRGs) were selected by venn plots. To explore underlying pathways, protein-protein interaction (PPI) networks and functional enrichment analysis were used. After utilizing LASSO Cox regression analysis, we finally established a multi-IRGs signature for predicting the prognosis of CRC patients. A nomogram consists of the thirteen-IRGs signature and clinical parameters was developed to predict the overall survival (OS). We investigated the association between prognostic validated IRGs and immune infiltrates by TIMER database.
Gene expression profiles and clinical information of 1635 CRC patients were collected from the TCGA and GEO databases. Higher stromal score, immune score and lower tumor purity were observed positive correlation with tumor stage and poor OS. Based on stromal score, immune score and tumor purity, 1517 DEGs, 1296 DEGs, and 1892 DEGs were identified respectively. The 948 IRGs were screened by venn plots. A thirteen-IRGs signature was constructed for predicting survival of CRC patients. Nomogram with a C-index of 0.769 (95%CI, 0.717–0.821) was developed to predict survival of CRC patients by integrating clinical parameters and thirteen-IRGs signature. The AUC for 1-, 3-, and 5-year OS were 0.789, 0.783 and 0.790, respectively. Results from TIMER database revealed that CD1B, GPX3 and IDO1 were significantly related with immune infiltrates.
In this study, we established a novel thirteen immune-related genes signature that may serve as a validated prognostic predictor for CRC patients, thus will be conducive to individualized treatment decisions.
Colorectal cancer (CRC) is ranked as the third most common cause of cancer-related mortality globally. In the gastrointestinal tract, the incidence of CRC is one of the most frequently diagnosed . The heterogeneity of CRC exists with differences in molecular pathogenesis, clinical features, and prognosis . A significant number of early-stage CRC patients successfully undergo aggressive surgical removal of the primary tumors . However, among stage II/III patients, the association of the recurrence with metastases has been evident . Stage IV patients with high risks of metastatic relapses receive standardized cytotoxic chemotherapy . Generally, the treatment of CRC involves a combination of one or more cancer drugs in their regimens, such as 5-fluorouracil (5-FU), folinic acid, irinotecan and oxaliplatin . But in the presence of metastases, new adjuvant chemotherapy may not produce successful results as it could lead to cancer progression and the patient’s resistance to the drug .
Although several high-risk pathological characteristics, such as venous invasion or serosal involvement, are now recognised as important determinants of survival, particularly in node negative disease [7, 8]. However, it has become evident that the profiles of other hosts and tumors could also impact the clinical prognosis. Furthermore, the fundamental characteristics of tumor cells and mechanisms within the tumor microenvironment (TME), such as tumor-infiltrating immune cells (TIICs)  and stromal cells linked to the tumor, affect the progression of CRC and the clinical outcomes .
In epithelial cancer cells, the increased presence of mesenchymal genes leads to the recommendation that the poor prognosis of CRC patients is associated with the epithelial-to-mesenchymal transition . However, the transcriptome of the tissue that contains the tumor characterizes the expression profiling of mesenchymal cells that establish the TME and the epithelial cancer cells [12, 13]. The stroma or TME is comprised of the structural and functional features of the connective tissue in homeostasis and pathological angiogenesis in wound healing and illness. Furthermore, the stroma is made up of lymph and blood vessels, inflammatory/immune cells, extracellular matrix, and fibroblasts. The cancer cells alter the dynamic and multifaceted structure of the stroma, which eventually affects the progression of malignant cells .
The two main classifications of non-tumor mechanisms are the stromal and immune cells in the TME. These play a significant role when diagnosing, making a prognosis for, and assessing tumors. In the recent decade, researchers have created several algorithms to forecast the purity of tumors in different types of cancer. These algorithms are grounded on the precise gene signature of immune cells and/or stromal cells [15, 16]. An example is the Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE), which is an algorithm designed by Yoshihara K et al. . This algorithm calculates both stromal score and immune score to forecast the presence of infiltrating non-tumor cells by analyzing the signatures of specific genes.
After the founding of the ESTIMATE logarithm, research studies have applied it to prove the efficacy of big-data algorithms on evaluating prostate cancer , glioblastoma , and cutaneous melanoma . Nevertheless, researchers also need to extensively study the value of the ESTIMATE when evaluating stromal and/or immune scores of CRC. In this literature, we have pioneered the extraction of the list that itemizes how the TME-associated genes forecast poor prognosis amongst CRC patients. We completed this by using the data of CRC cohorts from Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA).
Gene expression data sets
Our study utilized publicly available data sets. We collected gene expression profiles and identified the equivalent material on the prognosis, as well as the tumor and normal tissues of CRC patients. We obtained the data from the TCGA (https://tcga-data.nci.nih.gov/tcga/) and the GEO (https://www.ncbi.nlm.nih.gov/geo/) web sites, which were uploaded up to 31 March 2019. Moreover, we excluded duplications and datasets that have sample sizes of less than 50 (N < 50). Then, we organized the clinical data and expression profiles of each sample manually. Our inclusion criteria indicate that we include diagnosed CRC patients who have available clinicopathological and survival information. Therefore, six datasets were included in our study (GSE12945, GSE39582, GSE41258, GSE72970, GSE103479 and TCGA) [21,22,23,24,25,26]. Stromal scores, immune scores and tumor purity were calculated by the ESTIMATE algorithm . For the succeeding genomic analysis, we used the TCGA dataset. To authenticate the prognosis of genetic information recognized by the TCGA analysis, we selected the largest CRC dataset GSE39582 as the validation cohort from the GEO database. This study complied with the TCGA and GEO approved publication guidelines. Also, we sourced the data from both databases. Thus, this research did not require the approval of the ethics committee.
Identification of immune-related genes (IRGs)
Based on the ESTIMATE results, we classified the sample groups of the stromal scores, immune scores and tumor purity into high and low to choose the intersection genes. This study analyzed the data using the R package limma . The cutoff values were established at FDR < 0.05 and Fold change > 2 to filter the differentially expressed genes. Finally, we obtained intersect genes among stromal scores, immune scores and tumor purity as immune-related genes (IRGs).
Functional enrichment of IRGs
To complete the functional enrichment analysis of the IRGs and classify the GO categories according to the molecular functions (MF), the biological processes (BP), or the cellular components (CC), we used a R package called “clusterprofiler” . Additionally, we utilized it to make a pathway enrichment analysis while referring to the pathways of Kyoto Encyclopedia of Genes and Genomes (KEGG). Finally, the cut-off value was set at a false discovery rate (FDR) < 0.05.
Protein-protein interaction (PPI) network
The STRING database  produced the PPI network that the Cytoscape software rebuilt . For further examination, this study only included individual networks with 10 or more nodes and excluded those with less than 10. The connectivity degree of each network node was calculated, and then, we searched for the clusters according to their typology to trace densely connected regions using the Molecular COmplex DEtection (MCODE).
Establishment of the IRGs signature for CRC
To investigate the link between IRGs and the prognosis of CRC patients, we applied the univariate Cox regression analysis. In this analysis, the statistical significance was established at p < 0.05. We selected a panel of genes according to these outcomes through the LASSO Cox regression analysis that used the R package “glmnet”. Then, we set up a multigene signature to forecast the prognosis of these patients. By conducting cross-validations 1000 times and adopting the best penalty parameter λ value’s one standard error, we established the most simplified (smallest perimeter) model of immune gene expression signatures. For each patient, the sum of the corresponding coefficients and products of each gene expression level determined the risk score formula. To verify the efficiency of the signature constructed by the TCGA cohort, we validated the results in the GEO cohort (GSE39582).
TIMER database analysis
To methodically analyses the immune infiltrates in different cancers, TIMER is a wide-ranging resource (https://cistrome.shinyapps.io/timer/) . This algorithm uses a statistical deconvolution method, which was recently made available to infer the TIICs based on the gene expression profiles . Moreover, TIMER covers 32 types of cancer and contains 10,897 samples from TCGA to approximate the abundance of immune-infiltrating. Our analysis encompassed the identification of prognostic immune-related genes (IRGs) in CRC and its correlation with the abundance of immune-infiltrating. These infiltrates include macrophages, neutrophils, B cells, CD8+ T cells, CD4+ T cells and dendritic cells (DC) through gene modules. On the left-most panel, the graph illustrates the gene expression profiles against tumor purity .
Results were showed as mean ± standard deviation (SD). To contrast the immune score, stromal score and tumor purity in different groups, we used the Kruskal-Wallis test of variance. The cutoff value of high and lower groups of immune score, stromal score and tumor purity were calculated by X-tile . We calculated the OS using the Kaplan-Meier method, and the statistical significance was determined by the log-rank test. The specificity and sensitivity of survival prediction according to the determined risk score were obtained by time-dependent receiver operating characteristic (ROC) curves, with AUC values quantified with the survivalROC package. This paper also executed the univariate and multivariate analyses of OS to distinguish the prognostic determinants of CRC patients from the TCGA and GSE39582 datasets. Furthermore, using the R package “rms”, we established a nomogram according to the TCGA CRC cohort and incorporated different prognostic factors. Moreover, we analyzed its performance through the C-index, calibration plots and decision curve analysis (DCA). Additionally, this study utilized the R software version 3.6.0 to conduct all of the statistical analyses (http://www.rproject.org/). Finally, we set the statistical significance at a two-sided P < 0.05.
Association of stromal scores, immune scores and tumor purity with CRC prognosis
Stromal scores, immune scores and tumor purity were significantly associated with CRC clinical stages and prognosis. Using a variety of technologies, we obtained the genomic profiles. The stromal and immune scores, as well as the tumor purity, were generated through the uniform algorithm. The results did not reveal any obvious cohort-bias clustering. Hence, we merged the groups from TCGA and GEO and further investigated whether there were stromal scores, immune scores and tumor purity statistically correlated with CRC clinical stages and prognosis. In the present study, we downloaded the gene expression profiles and clinical data of 1635 CRC patients from the TCGA and GEO databases. Patients were 66.06 ± 12.95 years old, with 882 males (53.9%) and 753 (46.1%) females.
For CRC, TNM stages are valuable prognostic indicators. We included them in the analysis. The stromal scores ranged from − 2232.54 to 2193.09, immune scores were distributed between − 899.57 and 3202.84, and tumor purity ranged from 0.28 to 0.98, respectively (Data were calculated by ESTIMATE algorithm). Figure 1 illustrates that across different stages, the distribution of stromal scores differed (p = 4.79e-10) while the distribution of immune scores had no variation (p = 0.58). Figure 1 show that tumor purity (p = 1.40e-04) were inversely associated with different stages.
To study the possible correlation of the prognosis with stromal score, as well as immune score and tumor purity, we classified the CRC patients into high and lower groups. For this purpose, we created the X-tile and Kaplan-Meier survival curves. The results suggest that the stromal scores were positive correlation with the OS (p = 0.036) is statistically significant. Also, the immune scores are positively associated with OS, but it was not statistically significant (p = 0.38). Conversely, tumor purity was significantly negative association with OS (p = 0.024) (Fig. 1).
Identification of DEGs
To uncover the correlation of gene expression profiles with stromal scores, immune scores and tumor purity, we analyzed RNAseq data of all 611 CRC cases obtained from the TCGA database. Figure 2, the volcano plots highlight unique gene expression profiles of cases that we classified under high/low stromal scores, immune scores, and tumor purity groups. We based our comparative analyses on three factors: first, stromal scores through the upregulation of 1507 genes and downregulation of 10 genes in the high stromal score group; second, immune scores through the upregulation of 1235 genes and downregulation of 61 genes in the high and low groups; and third, tumor purity group through the upregulation of 1824 genes and downregulation of 68 genes in the low score group. Furthermore, Fig. 2 illustrates the IRGs were the upregulated or downregulated intersection genes in the low tumor purity group and high stromal or immune groups (944 upregulated genes and 4 downregulated genes).
Functional analysis of IRGs
Using the STRING tool that included 2834 edges and 471 nodes, we generated PPI networks to study the interactions among the identified immune-related genes (Fig. 3A). For further analysis, we chose three modules with at least 20 nodes. As shown in module A (Fig. 3B), the network had formations of 903 edges and 43 nodes. Each gene in this model was associated with 42 other genes, indicating that this module might be the core sub-network of PPI. In module B (Fig. 3C), there were 395 edges and 43 nodes that had numerous genes in the middle that were critical to immune response. These included OLR1, ITGB2, SLC2A3, ITGAM, and ATP8B4. In module C (Fig. 3D), there were 198 edges and 29 nodes where the higher degree values of connectivity in FBN1, COL3A1, and COL1A2 proved their existence as the core genes in the module.
The clustering of these enriched functional-related genes is associated with the immune response, which is in line with the PPI network analysis. Furthermore, we identified statistically significant GO terms, such as the total of 1325 for biological processes, 83 for molecular function, and 77 for the cellular component (FDR < 0.05). GO terms (Table S1) included regulation of leukocyte activation, leukocyte migration and T cell activation (Fig. 4A), extracellular matrix and side of membrane (Fig. 4B), and receptor regulator activity (Fig. 4C). Additionally, the KEGG analysis (Fig. 4D) generated pathways that were all linked with the immune response.
Construction and validation of the IRGs-based signature
In the context of this study’s TCGA CRC patients, it utilized the univariate Cox models to investigate the association between the expression levels of their IRGs and their OS, 171 IRGs were found to have significant relationship with OS. Then, based on those 171 IRGs, we utilized the LASSO Cox regression model to build a prognostic signature in the TCGA dataset (Fig. 5A and B). Finally, a prognostic signature including thirteen IRGs was constructed. Using the coefficients derived from the LASSO Cox regression model, we constructed a formula to calculate risk score for each patient. This score is based on their personalized expression levels of the thirteen IRGs. The formula is as follows: risk score = (0.1217 × expression of A2ML1) + (0.03442 × expression of CALB2) + (− 0.6693 × expression of CD1B) + (0.04806 × expression of COL22A1) + (0.4471 × expression of FCRL2) + (0.00069 × expression of GPX3) + (0.05368 × expression of HAND1) + (0.0023 × expression of IDO1) + (0.006 × expression of LAMP5) + (0.07625 × expression of MAP2) + (0.02431 × expression of MMRN1) + (0.1085 × expression of NKAIN4) + (0.3541 × expression of VAX2). Patients were classified into low- and high-risk groups according to a cutoff risk score of 1.096 (Fig. 5C and D). Figure 5E demonstrates that the high-risk group had substantially lower levels of OS than in the low-risk at p < 0.0001. For 1-, 3-, and 5-year OS, the area under the ROC curve (AUC) was 0.713, 0.724, and 0.689, respectively (Fig. 5F). The GSE39582 cohort verified the prognostic model (N = 531). Patients were classified into low- and high-risk groups according to an ideal cutoff risk score of 0.973 (Fig. 5G and H). Figure 5I illustrates that the high-risk group had substantially lower levels of OS than in the low-risk at p = 5.97e-04. For 1-, 3-, and 5-year OS, the AUC was 0.725, 0.643, and 0.606, respectively (Fig. 5J).
Independent prognostic role of the thirteen-IRGs signature
The univariate and multivariate Cox regression analyses were executed in both GEO and TCGA datasets through the adjustment of clinicopathological features like the T stage, N stage, M stage, and tumor stage, as well as the age and gender. This process aimed to assess if the signature-based risk score of the thirteen-IRGs was an independent prognosis factor for OS, which the results confirmed as true (Fig. 6A, B, D and E). We verified the clinical significance of the thirteen-IRGs signature using the Chi-square test to ascertain the signature’s association with the clinical parameters. In the TCGA cohort, significant correlations were found between the higher-risk score and tumor stage (p < 0.001), M stage (p < 0.01), N stage (p < 0.001), and T stage (p < 0.001) (Fig. 6C). However, no significant difference was found in gender and age. Comparable outcomes were observed in the validation cohort of GSE39582 (Fig. 6F).
To further analyze the hierarchical efficacy of the thirteen-IRGs signature, stage I, II and stage III, IV colorectal cancer patients in the TCGA data set were divided into high-risk and low-risk groups according to the risk score of the signature, and it was found that the survival time of patients in the high-risk group was significantly shorter than that in the low-risk group (Fig. S1A and B). We then downloaded the MSI status data from the TCIA database (https://tcia.at/home). MSI-H patients had significantly higher risk scores than MSS patients in the TCGA dataset (Fig. S1C and D). Moreover, we discovered that the high-risk group was more likely to respond to immunotherapy than the low-risk group, because the expression of PD-1 and PD-L1 was significantly higher in the high-risk group than in the low-risk group (Fig. S1E and F). We also calculated TMB scores based on the TGCA somatic mutation data. The TMB in the low-risk group and high-risk group had no significant difference (Fig. S1G).
Establishment and assessment of the nomogram based on the thirteen-IRGs signature
A nomogram was established in terms of the smallest Akaike Information Criterion (AIC) value occurred from the multivariate Cox regression (AIC = 1132.81), which includes age, T stage, N stage, M stage and the thirteen-IRGs signature (Fig. 7A). The calibration plot demonstrated that the evaluating capability of the nomogram was best in forecasting 3-year OS (Fig. 7B). The C-index of the nomogram was 0.770 (95% CI = 0.718–0.823). As Fig. 7C displays, the nomogram illustrated a greater net benefit that had a wider range of threshold probability on the DCA for predicting 1-, 3-, and 5- year OS. Combining thirteen-IRGs signature with TNM stage showed better net benefit for predicting 3-year OS (Fig. 7D). The AUC for 1-, 3-, and 5- year OS of the nomogram in TCGA dataset was 0.789, 0.783 and 0.790, respectively (Fig. 7E). Based on Fig. 7F, CRC patients from the high-risk group, stratified by the median of nomogram ‘s risk score, had significantly lower OS rates compared to the low-risk group (P < 0.001). Using the GSE39582 cohort for validation, we obtained similar results, as displayed by Fig. S2.
Survival analysis of IRGs
For additional confirmation of their prognostic value and expression, we applied the Kaplan-Meier survival analysis for the thirteen IRGs in signature in patients with CRC from TCGA (Fig. 8). We found that A2ML1, CALB2, COL22A1, FCRL2, GPX3, HAND1, IDO1, LAMP5, MAP2, MMRN1, NKAIN4 and VAX2 were identified as cancer-promoting factors given their high expression correlation with shorter OS in patients with CRC. On the other hand, the high expression of CD1B exhibited a significant correlation with longer OS. This association implies the possible protective role of RNAs in CRC. Additionally, we verified ten IRGs have significant correlation with OS in the GSE39582 cohort, including A2ML1, CALB2, COL22A1, GPX3, HAND1, IDO1, LAMP5, MAP2, MMRN1 and NKAIN4. (Fig. S3).
Correlation between TIICs and prognostic IRGs
The independent predicted factors of sentinel lymph node status and OS among cancer patients are the Tumor-infiltrating immune cells (TIICs) . As a result, we explored the correlation of those thirteen prognostic IRGs with the immune infiltration levels in CRC using TIMER (Table 1). The outcomes reveal the positive correlation of the CD1B expression level with the infiltrating levels of neutrophils (r = 0.468, p = 3.31E-23) and Dendritic Cells (r = 0.505, P = 2.05E-27) in colon adenocarcinoma (COAD) (Fig. 9A). Moreover, GPX3 has the highest significant correlations with infiltrating levels of macrophages (r = 0.54, p = 5.93E-32), neutrophils (r = 0.408, p = 4.45E-17) and Dendritic Cells (r = 0.434, P = 7.01E-20) (Fig. 9B). In addition, IDO1 expression was significantly associated with infiltrating levels of CD8+ T cells (r = 0.4, P = 4.66E-17), neutrophils (r = 0.638, p = 3.97E-47) and Dendritic Cells (r = 0.564, P = 3.47E-35) (Fig. 9C).
We also found that CD1B, FCRL2, IDO1 and MMRN1 were correlated with the infiltration of B cells. Moreover, CD1B, FCRL2, GPX3, IDO1, MAP2 and MMRN1 were correlated with the infiltration of CD8+ T cells. The outcomes reveal the positive correlation of the CD1B, CALB2, COL22A1, FCRL2, GPX3, HAND1, IDO1, LAMP5, MAP2, MMRN1, NKAIN4 and VAX2 expression level with the infiltrating levels of CD4+ T cells, macrophages and Dendritic Cells. In addition, CD1B, CALB2, COL22A1, FCRL2, GPX3, IDO1, LAMP5, MAP2, MMRN1, NKAIN4 and VAX2 expression was significantly associated with infiltrating levels of neutrophils.
The TME affects the progression and growth of a tumor . Moreover, it is comprised of the tumor and non-tumor cells like fibroblasts and immune cells . Tumor-infiltrating immune cells are significantly linked to oncogenesis and angiogenesis, as well as to the metastasis and growth of tumor cells, which could regulate the abundance and differentiation of immune cells . Recent literature highlights how the discrepancies between tumor progression and host’s immune response could result in tumor’s growth . Thus, it is vital to understand the stromal and immune status in the TME to develop strategies that could enhance the patient’s response rate to immunotherapy.
The cutting-edge therapeutic methods include immunotherapies and molecularly targeted therapies, as well as radiotherapies and chemotherapies. However, limitations remain prevalent and include toxicity, immunity of patients to treatments, low response rates, and financial burdens caused by high costs [39, 40]. Also, modern cancer treatments still need biomarkers for prognosis. Hence, we attempted to find prognostic IRGs that impact the patients’ OS by analyzing the TME.
Some studies applied the ESTIMATE to some types of cancers [19, 20], which signify the benefits of using algorithms in large data sets. For example, Vincent et al. used this algorithm to compute for the breast cancer tumor purity and prove that the immune and stromal components are nonexistent in vitro . To comprehend the TME of CRC, we applied the ESTIMATE to generate tumor purity, as well as the stromal and immune scores. In recent studies, [42, 43] our analysis deduced the positive correlation of stromal scores and immune scores with the stages of a tumor, and tumor purity was inversely correlated with tumor stages. Moreover, longer survival times highlighted the presence of low tumor purity and high stromal and immune scores amongst the CRC patients. This suggests the key role of TME in patient outcomes.
We classified the TCGA patients into high and low stromal scores, immune scores and tumor purity groups to detect IRGs. Then, we applied the GO analysis to confirm the involvement of those genes in the TME, such as leukocyte activation, leukocyte migration, T cell activation, extracellular matrix, side of membrane and receptor regulator activity. The outcomes have been consistent with the results of previous studies, where immune cells and ECM molecules influenced the creation of the interdependence within the CRC’s tumor microenvironment [14, 35]. Also, we established the PPI modules to unveil the association and purpose of IRGs. The nodes with high degree of connectivity among modules, such as C3, FBN1 and ITGB2, were linked to apoptosis, migration, angiogenesis, proliferation, and immune response [44,45,46].
Then, we performed LASSO Cox analyses to generate a thirteen-IRGs signature for predicting the prognosis of OS patients in TCGA dataset, and cross-validated from the GEO database GSE39582. The AUC values for the signature in predicting 1, 3, and 5-year survival were 0.703, 0.711, and 0.676, which indicates a good capability for predicting survival in CRC patients. We further evaluated the association between the signature and clinical parameters to figure out the clinical value of this thirteen-IRGs signature. It was found that high-risk patients were associated significantly with TNM stages. DCA and ROC analysis indicated that the nomogram consists of the thirteen-IRGs signature and conventional clinical parameters exhibited high performance in prognostic sensitivity for patients with CRC. These results demonstrated great applicability and stability of the thirteen-IRGs signature for predicting prognosis of patients with CRC.
To ascertain the association of IRGs with immune infiltration, we used a marker called CD1B. Guo et al. found that miR-582/CD1B, which are involved in resting and activated dendritic cells, may be potential novel biomarkers for immunotherapy . Similar to our study, in prostate cancer, the decreased levels of CD1B expression could be linked to the poorer disease-free survival . Finally, using the TIMER algorithm, we evaluated how prognosis-related genes are correlated with immune-infiltrating levels in CRC. Our results suggest that in both READ and COAD, the expression level of CD1B is significantly positive correlation with the tumor-infiltrating levels of neutrophils and Dendritic Cells.
Previous studies indicated that IDO1 suppressed the CD8+ T cell response in colon cancer, which may provide a theoretical basis for the development of new immunotherapy for the treatment of colon cancer . The positive correlation of the IDO1 expression with PD-L1 pathways on T cells is also evident . Furthermore, during the PD-L1 therapy, both CD8+ and CD4+ T cells had higher abundance levels in the TME. As a result, this activates the IDO1 on the T cells, which improves this process and causes improvements in tumor immunity.
To our understanding, this is a pioneering study on CRC that investigates IRGs in the TME. Thus, we took advantage of the opportunities. First, we used the GEO and TCGA databases to study the TME. Both sources gave us an extensive amount of CRC samples. We crosschecked our results using independent cohorts. Then, we acknowledged the complexities associated with the TME, which include genetic factors among many others. Furthermore, this study extensively evaluated how the TME (immune score, stromal scores and tumor purity) interacted with prognostic IRGs.
In conclusion, our current study established a thirteen-gene signature based on ESTIMATE algorithm-derived immune/stromal scores, which could serve as a favorable prognostic factor. Our model would be of clinical value and provide additional data for better understanding of the TME. Finally, the results of functional analysis provide a novel insight for revealing the molecular mechanism in tumor initiation and progression.
Availability of data and materials
The Cancer Genome Atlas
Gene Expression Omnibus
Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data
Differentially expressed genes
Least absolute shrinkage and selection operator
Tumor-infiltrating immune cells
False discovery rate
Kyoto Encyclopedia of Genes and Genomes
Receiver operating characteristic
Decision curve analysis
Akaike Information Criterion
Schreuders EH, Ruco A, Rabeneck L, Schoen RE, Sung JJ, Young GP, et al. Colorectal cancer screening: a global overview of existing programmes. Gut. 2015;64(10):1637–49. https://doi.org/10.1136/gutjnl-2014-309086.
Bahrami A, Shahidsales S, Khazaei M, Ghayour-Mobarhan M, Maftouh M, Hassanian SM, et al. C-met as a potential target for the treatment of gastrointestinal cancer: current status and future perspectives. J Cell Physiol. 2017;232(10):2657–73. https://doi.org/10.1002/jcp.25794.
Gholamin S, Fiuji H, Maftouh M, Mirhafez R, Shandiz FH, Avan A. Targeting c-MET/HGF signaling pathway in upper gastrointestinal cancers: rationale and progress. Curr Drug Targets. 2014;15(14):1302–11. https://doi.org/10.2174/1389450115666141107105456.
Bahrami A, Khazaei M, Hasanzadeh M, ShahidSales S, Joudi Mashhad M, Farazestanian M, et al. Therapeutic potential of targeting PI3K/AKT pathway in treatment of colorectal Cancer: rational and Progress. J Cell Biochem. 2018;119(3):2460–9. https://doi.org/10.1002/jcb.25950.
Bahrami A, Hesari A, Khazaei M, Hassanian SM, Ferns GA, Avan A. The therapeutic potential of targeting the BRAF mutation in patients with colorectal cancer. J Cell Physiol. 2018;233(3):2162–9. https://doi.org/10.1002/jcp.25952.
Bahrami A, Amerizadeh F, ShahidSales S, Khazaei M, Ghayour-Mobarhan M, Sadeghnia HR, et al. Therapeutic potential of targeting Wnt/beta-catenin pathway in treatment of colorectal Cancer: rational and Progress. J Cell Biochem. 2017;118(8):1979–83. https://doi.org/10.1002/jcb.25903.
Petersen VC, Baxter KJ, Love SB, Shepherd NA. Identification of objective pathological prognostic determinants and models of prognosis in Dukes' B colon cancer. Gut. 2002;51(1):65–9. https://doi.org/10.1136/gut.51.1.65.
Roxburgh CS, McMillan DC, Richards CH, Atwan M, Anderson JH, Harvey T, et al. The clinical utility of the combination of T stage and venous invasion to predict survival in patients undergoing surgery for colorectal cancer. Ann Surg. 2014;259(6):1156–65. https://doi.org/10.1097/SLA.0000000000000229.
Roxburgh CS, McMillan DC. The role of the in situ local inflammatory response in predicting recurrence and survival in patients with primary operable colorectal cancer. Cancer Treat Rev. 2012;38(5):451–66. https://doi.org/10.1016/j.ctrv.2011.09.001.
Freeman MR, Li Q, Chung LW. Can stroma reaction predict cancer lethality? Clinical Cancer Res. 2013;19(18):4905–7. https://doi.org/10.1158/1078-0432.CCR-13-1694.
Kalluri R, Weinberg RA. The basics of epithelial-mesenchymal transition. J Clin Invest. 2009;119(6):1420–8. https://doi.org/10.1172/JCI39104.
Senbabaoglu Y, Gejman RS, Winer AG, Liu M, Van Allen EM, de Velasco G, et al. Tumor immune microenvironment characterization in clear cell renal cell carcinoma identifies prognostic and immunotherapeutically relevant messenger RNA signatures. Genome Biol. 2016;17(1):231. https://doi.org/10.1186/s13059-016-1092-z.
Winslow S, Lindquist KE, Edsjo A, Larsson C. The expression pattern of matrix-producing tumor stroma is of prognostic importance in breast cancer. BMC Cancer. 2016;16(1):841. https://doi.org/10.1186/s12885-016-2864-2.
Peddareddigari VG, Wang D, Dubois RN. The tumor microenvironment in colorectal carcinogenesis. Cancer Microenviron. 2010;3(1):149–66. https://doi.org/10.1007/s12307-010-0038-3.
Li B, Severson E, Pignon JC, Zhao H, Li T, Novak J, et al. Comprehensive analyses of tumor immunity: implications for cancer immunotherapy. Genome Biol. 2016;17(1):174. https://doi.org/10.1186/s13059-016-1028-7.
Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–7. https://doi.org/10.1038/nmeth.3337.
Yoshihara K, Shahmoradgoli M, Martinez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun. 2013;4(1):2612. https://doi.org/10.1038/ncomms3612.
Shah N, Wang P, Wongvipat J, Karthaus WR, Abida W, Armenia J, et al. Regulation of the glucocorticoid receptor via a BET-dependent enhancer drives antiandrogen resistance in prostate cancer. eLife. 2017;11(6):e27861. https://doi.org/10.7554/eLife.27861.
Jia D, Li S, Li D, Xue H, Yang D, Liu Y. Mining TCGA database for genes of prognostic value in glioblastoma microenvironment. Aging (Albany NY). 2018;10(4):592–605. https://doi.org/10.18632/aging.101415.
Yang S, Liu T, Nan H, Wang Y, Chen H, Zhang X, et al. Comprehensive analysis of prognostic immune-related genes in the tumor microenvironment of cutaneous melanoma. J Cell Physiol. 2020;235(2):1025-35. https://doi.org/10.1002/jcp.29018.
Staub E, Groene J, Heinze M, Mennerich D, Roepcke S, Klaman I, et al. An expression module of WIPF1-coexpressed genes identifies patients with favorable prognosis in three tumor types. J Mole Med. 2009;87(6):633–44.
Marisa L, de Reynies A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 2013;10(5):e1001453. https://doi.org/10.1371/journal.pmed.1001453.
Sheffer M, Bacolod MD, Zuk O, Giardina SF, Pincas H, Barany F, et al. Association of survival and disease progression with chromosomal instability: a genomic exploration of colorectal cancer. Proc Natl Acad Sci U S A. 2009;106(17):7131–6. https://doi.org/10.1073/pnas.0902232106.
Del Rio M, Mollevi C, Bibeau F, Vie N, Selves J, Emile JF, et al. Molecular subtypes of metastatic colorectal cancer are associated with patient response to irinotecan-based therapies. Eur J Cancer. 2017;76:68–75.
Allen WL, Dunne PD, McDade S, Scanlon E, Loughrey M, Coleman H, et al. Transcriptional subtyping and CD8 immunohistochemistry identifies poor prognosis stage II/III colorectal cancer patients who benefit from adjuvant chemotherapy. JCO Precision Oncol. 2018;6(2018):PO.17.00241. https://doi.org/10.1200/PO.17.00241.
Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487(7407):330–7. https://doi.org/10.1038/nature11252.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7. https://doi.org/10.1089/omi.2011.0118.
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447–52. https://doi.org/10.1093/nar/gku1003.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504. https://doi.org/10.1101/gr.1239303.
Li T, Fan J, Wang B, Traugh N, Chen Q, Liu JS, et al. TIMER: a web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res. 2017;77(21):e108–e10. https://doi.org/10.1158/0008-5472.CAN-17-0307.
Aran D, Sirota M, Butte AJ. Systematic pan-cancer analysis of tumour purity. Nat Commun. 2015;6(1):8971. https://doi.org/10.1038/ncomms9971.
Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res. 2004;10(21):7252–9. https://doi.org/10.1158/1078-0432.CCR-04-0713.
Loi S, Drubay D, Adams S, Pruneri G, Francis PA, Lacroix-Triki M, et al. Tumor-infiltrating lymphocytes and prognosis: a pooled individual patient analysis of early-stage triple-negative breast cancers. J Clin Oncol. 2019;37(7):559–69. https://doi.org/10.1200/JCO.18.01010.
Quail DF, Joyce JA. Microenvironmental regulation of tumor progression and metastasis. Nat Med. 2013;19(11):1423–37. https://doi.org/10.1038/nm.3394.
Lim YZ, South AP. Tumour-stroma crosstalk in the development of squamous cell carcinoma. Int J Biochem Cell Biol. 2014;53:450–8. https://doi.org/10.1016/j.biocel.2014.06.012.
Taddei ML, Giannoni E, Comito G, Chiarugi P. Microenvironment and tumor cell plasticity: an easy way out. Cancer Lett. 2013;341(1):80–96. https://doi.org/10.1016/j.canlet.2013.01.042.
Galon J, Angell HK, Bedognetti D, Marincola FM. The continuum of cancer immunosurveillance: prognostic, predictive, and mechanistic signatures. Immunity. 2013;39(1):11–26. https://doi.org/10.1016/j.immuni.2013.07.008.
Zhang Y, Xu J, Zhang N, Chen M, Wang H, Zhu D. Targeting the tumour immune microenvironment for cancer therapy in human gastrointestinal malignancies. Cancer Lett. 2019;458:123–35. https://doi.org/10.1016/j.canlet.2019.05.017.
Fong W, To KKW. Drug repurposing to overcome resistance to various therapies for colorectal cancer. Cell Mole Life Sci. 2019;76(17):3383–406. https://doi.org/10.1007/s00018-019-03134-0.
Vincent KM, Findlay SD, Postovit LM. Assessing breast cancer cell lines as tumour models by comparison of mRNA expression profiles. Breast Cancer Res. 2015;17(1):114. https://doi.org/10.1186/s13058-015-0613-0.
Huijbers A, Tollenaar RA, GW VP, Zeestraten EC, Dutton S, CC MC, et al. The proportion of tumor-stroma as a strong prognosticator for stage II and III colon cancer patients: validation in the VICTOR trial. Ann Oncol. 2013;24(1):179–85. https://doi.org/10.1093/annonc/mds246.
Danielsen HE, Hveem TS, Domingo E, Pradhan M, Kleppe A, Syvertsen RA, et al. Prognostic markers for colorectal cancer: estimating ploidy and stroma. Ann Oncol. 2018;29(3):616–23. https://doi.org/10.1093/annonc/mdx794.
Lind GE, Danielsen SA, Ahlquist T, Merok MA, Andresen K, Skotheim RI, et al. Identification of an epigenetic biomarker panel with high sensitivity and specificity for colorectal cancer and adenomas. Mol Cancer. 2011;10(1):85. https://doi.org/10.1186/1476-4598-10-85.
Wang H, Luo C, Zhu S, Fang H, Gao Q, Ge S, et al. Serum peptidome profiling for the diagnosis of colorectal cancer: discovery and validation in two independent cohorts. Oncotarget. 2017;8(35):59376–86. https://doi.org/10.18632/oncotarget.19587.
Chang CM, Chia VM, Gunter MJ, Zanetti KA, Ryan BM, Goodman JE, et al. Innate immunity gene polymorphisms and the risk of colorectal neoplasia. Carcinogenesis. 2013;34(11):2512–20. https://doi.org/10.1093/carcin/bgt228.
Guo J, Jin H, Xi Y, Guo J, Jin Y, Jiang D. The miR-582/CD1B Axis is involved in regulation of dendritic cells and is associated with clinical outcomes in advanced lung adenocarcinoma. Biomed Res Int. 2020;2020:4360930.
Lee CH, Chen LC, Yu CC, Lin WH, Lin VC, Huang CY, et al. Prognostic value of CD1B in localised prostate cancer. Int J Environ Res Public Health. 2019;16(23):4723. https://doi.org/10.3390/ijerph16234723.
Lou Q, Liu R, Yang X, Li W, Huang L, Wei L, et al. miR-448 targets IDO1 and regulates CD8(+) T cell response in human colon cancer. J Immunother Cancer. 2019;7(1):210. https://doi.org/10.1186/s40425-019-0691-0.
Takada K, Kohashi K, Shimokawa M, Haro A, Osoegawa A, Tagawa T, et al. Co-expression of IDO1 and PD-L1 in lung squamous cell carcinoma: potential targets of novel combination therapy. Lung Cancer. 2019;128:26–32. https://doi.org/10.1016/j.lungcan.2018.12.008.
The results published here are in whole or part based upon data generated by The Cancer Genome Atlas managed by the NCI and NHGRI. Information about TCGA can be found at http://cancergenome.nih.gov.
Ethics approval and consent to participate
Not applicable. All data in this study are publicly available.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Top 20 KEEG pathways and GO terms enriched by the DEmRNAs.
Estimation of the MSI status and cancer immunotherapy response using the thirteen-IRGs signature in the TCGA cohort.
Validation of the nomogram predicting overall survival for CRC patients in the GSE39582 cohort.
Validation of correlation of IRGs extracted from TCGA database with overall survival in GEO cohort. Kaplan-Meier survival curves were generated for selected immune-related genes extracted from the comparison of groups of high (red line) and low (blue line) gene expression. p < 0.05 in Log-rank test.
About this article
Cite this article
Wang, Y., Li, W., Jin, X. et al. Identification of prognostic immune-related gene signature associated with tumor microenvironment of colorectal cancer. BMC Cancer 21, 905 (2021). https://doi.org/10.1186/s12885-021-08629-3
- Colorectal cancer
- Immune-related gene signature
- Tumor microenvironment