- Research article
- Open Access
Interconnectivity between molecular subtypes and tumor stage in colorectal cancer
BMC Cancer volume 20, Article number: 850 (2020)
There are profound individual differences in clinical outcomes between colorectal cancers (CRCs) presenting with an identical stage of disease. Molecular stratification, in conjunction with traditional TNM staging, is a promising way to predict patient outcomes. We investigated the interconnectivity between tumor stage and tumor biology, as reflected by the Consensus Molecular Subtypes (CMSs) in CRC, and explored the possible value of these insights in patients with stage II colon cancer.
We performed a retrospective analysis using clinical records and gene expression profiling in a meta-cohort of 1040 CRC patients. The interconnectivity of tumor biology and disease stage was assessed by investigating the association between CMSs and TNM classification. In order to validate the clinical applicability of our findings we employed a meta-cohort of 197 stage II colon cancers.
CMS4 was significantly more prevalent in advanced stages of disease (stage I 9.8% versus stage IV 38.5%, p < 0.001). The observed differential gene expression between cancer stages is at least partly explained by the biological differences reflected by the CMSs. Gene signatures for stage III-IV and CMS4 were highly correlated (r = 0.77, p < 0.001). CMS4 cancers showed an increased progression rate to more advanced stages (risk ratio for CMS4 compared to CMS2: 1.25, 95% CI: 1.08–1.46). Patients with a CMS4 cancer had worse survival in the high-risk stage II tumors compared to the total stage II cohort (5-year DFS 41.7% versus 100.0%, p = 0.008).
Considerable interconnectivity between tumor biology and tumor stage in CRC exists. This implies that the TNM stage, in addition to the stage of progression, might also reflect distinct biological disease entities. These insights can potentially be utilized to optimize identification of high-risk stage II colon cancers.
Colorectal cancer (CRC) is the fourth most common cancer worldwide and the second leading cause of cancer mortality. Clinical decision making in CRC is mainly driven by clinical and traditional pathological features including TNM staging. Although these features hold considerable prognostic, and even predictive, value, there are profound individual differences in clinical outcome within a single tumor stage, especially for stages II and III. Also, there is compelling evidence that not all cancers follow the linear-progression model associated with the TNM stages. For example, in CRC the majority of lymphatic and distant metastases arise from independent subclones, and 40–63% of metachronous metastases develop in patients without lymph node metastasis. The consensus molecular subtype (CMS) classification is a widely studied transcriptome-based stratification system for CRC defining four disease entities (CMS1–4) with distinct clinical, biological and molecular features. Hence, the CMS taxonomy could offer a framework to elucidate whether TNM solely reflects disease progression or also biologically different entities that preferentially present with a specific stage of disease at diagnosis. This study was conducted to investigate the interconnectivity between tumor stage and tumor biology in CRC patients. Subsequently, we demonstrate the added value of this knowledge in patients with high-risk stage II colon cancer, a subgroup in which accurate prognostication and selection for adjuvant treatment is still an unmet need.
Patients and data aggregation
Patients for whom information on staging and microsatellite instability (MSI) status was available were selected from the previously reported meta-cohort of Guinney et al., resulting in 1040 individual patients (accession number GSE39582 and TCGA). For validation of our findings, the chemotherapy-naive stage II CRC patients of the MATCH Cohort and the AMC-AJCCII-90 Cohort (accession number GSE33113) were used. In the validation cohort, high-risk was defined as either T4 or inadequate lymph node assessment (< 10 nodes assessed).
The R2: Genomic Analysis and Visualization Platform was used to extract the aggregated and normalized data (http://r2.amc.nl).
Differential gene expression analysis
The limma package was used to identify differentially expressed genes (DEG) between the different tumor stages and CMS groups, using the ANOVA test for overall DEG and a limma-test for individual groups. P-values were FDR corrected. For comparing the number of DEG between the overall cohort and CMSs, a random set of 200 patients was sampled 1000 times to correct for the effect of group size on the number of DEG.
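The two ideas in this paragraph, FDR correction and repeated subsampling to a fixed group size, can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for the FDR correction (the text only says "FDR corrected") and uses hypothetical helper names; it is not the authors' R/limma code, and the per-gene p-values are expected to come from whatever test is run on each subsample.

```python
import random


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is the FDR method; the source does not name one.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_deg(p_values, alpha=0.05):
    """Count genes whose adjusted p-value falls below alpha."""
    return sum(q < alpha for q in benjamini_hochberg(p_values))


def subsample_indices(n_patients, size=200, repeats=1000, seed=0):
    """Yield `repeats` random draws of `size` distinct patient indices.

    Fixing the draw size removes the effect of group size on the
    number of genes that reach significance.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        yield rng.sample(range(n_patients), size)
```

In use, each draw from `subsample_indices` would select the patients for one round of testing, and `count_deg` would be applied to that round's p-values; the distribution of counts over the 1000 rounds is then comparable between the overall cohort and a single CMS.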
Gene signatures for advanced stage and CMS4 were built using the top 100 DEG (with the lowest FDR-corrected p-value) between early (stage I-II) and advanced stage (stage III-IV), and between CMS1/2/3 and CMS4. Gene signature scores were built using the weighted matched z-score of both the up- and downregulated genes of the gene signatures.
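As a rough illustration of a z-score based signature score, the sketch below standardizes each gene across samples and subtracts the mean z-score of the downregulated genes from that of the upregulated genes. The exact weighting used for the "weighted matched z-score" is not specified in the text, so the equal weighting here is an assumption, and the function names and data layout are hypothetical.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Standardize each gene across samples.

    expr: dict mapping gene name -> list of expression values,
    one value per sample, in a fixed sample order.
    """
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        # Genes with no variation carry no information; score them as 0.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def signature_scores(expr, up_genes, down_genes):
    """Per-sample score: mean z of up genes minus mean z of down genes.

    Assumption: up and down genes are weighted equally. Both gene
    lists must be non-empty and present in `expr`.
    """
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    scores = []
    for s in range(n_samples):
        up = mean(z[g][s] for g in up_genes)
        down = mean(z[g][s] for g in down_genes)
        scores.append(up - down)
    return scores
```

Two such scores (one for advanced stage, one for CMS4) computed on the same samples can then be compared with a correlation coefficient.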
The Chi-square test was used to assess associations between CMS classification and tumor stage. The Kaplan-Meier method was used to estimate survival. Survival curves were compared using the log-rank test. Disease-free survival (DFS) times of > 60 months were censored at 60 months. We performed a multivariate analysis using a Cox proportional-hazards model with CMS, gender, age, tumor location, T-stage and MSI status as covariates. All statistical tests were 2-sided and considered significant at a P-value lower than 0.05. All analyses were performed using R version 3.6.1.
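The censoring rule and the Kaplan-Meier estimate described above can be sketched as follows. This is a minimal product-limit implementation for illustration only, with hypothetical function names; the analyses in the paper were run in R.

```python
def censor_at(times, events, horizon=60.0):
    """Administratively censor follow-up beyond `horizon` months.

    times: follow-up in months; events: 1 = event, 0 = censored.
    Any observation longer than the horizon is truncated to the
    horizon and marked as censored.
    """
    out_t, out_e = [], []
    for t, e in zip(times, events):
        if t > horizon:
            out_t.append(horizon)
            out_e.append(0)
        else:
            out_t.append(t)
            out_e.append(e)
    return out_t, out_e


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_removed
    return curve
```

For example, `kaplan_meier(*censor_at([10, 20, 30, 70, 80], [1, 0, 1, 1, 0]))` treats the event at 70 months as censored at 60 months, so only the events at 10 and 30 months lower the estimated curve.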
Distinct TNM stages present with different distributions of molecular subtypes
We analyzed the association between CMS subtypes and tumor stage in a meta-cohort comprising 1040 patients (Table 1). An increase in prevalence of the poor-prognosis mesenchymal subtype (CMS4) was detected in advanced stages of disease (stage I 12 (9.8%), stage II 89 (22.9%), stage III 94 (29.4%) and stage IV 45 (38.5%), p < 0.001) (Fig. 1 and Additional file 1: Table S1). The same increase was observed for the individual cohorts separately (Additional file 1: Table S1 and Additional file 1: Fig. S1).
Tumor stage reflects tumor biology
We tested the hypothesis that tumor stage, as defined by TNM, does not only represent disease progression but also reflects different biological entities. Counting the number of differentially expressed genes revealed considerable gene expression differences between TNM stages. These differences decreased significantly when stratified for CMS2 and CMS4, the most common CMSs (Fig. 2a). This was confirmed when stratifying for all subtypes (CMS1–4) (Additional file 1: Fig. S2). Furthermore, visualization of the genes that displayed significant differences between tumor stages (ANOVA p < 0.05, n = 2764) shows a clear separation of the immune (CMS1), epithelial (CMS2/3) and mesenchymal (CMS4) subtypes in both a t-SNE plot and a gene expression heatmap (Fig. 2b and Additional file 1: Fig. S3).
CMS4 correlates with more advanced stages and has a higher progression rate
In order to specifically investigate the association between CMS4 and more advanced tumor stages, we built two gene signatures to discriminate disseminated disease (stage III-IV) from local disease (stage I-II), and to separate CMS4 cancers from CMS1/2/3 tumors (see methods). Remarkably, the two scores were highly correlated (r = 0.77, p < 0.001) (Fig. 2c), with only a few overlapping genes (13/200), suggesting that overrepresentation of CMS4 cancers in stage III-IV cancers is responsible for gene expression differences between early and advanced malignancies.
Subsequently, we assessed the rate of progression from early (stage I-II) to advanced (stage III-IV) tumor stage for each of the subtypes by calculating the risk ratios. This shows a markedly increased progression rate towards more advanced stages for CMS4 cancers as compared to CMS1 tumors (RR 1.64, 95% CI: 1.29–2.09), CMS2 (RR 1.25, 95% CI: 1.08–1.46) and CMS3 (RR 1.57, 95% CI: 1.23–2.01) (Fig. 2d).
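A risk ratio with its 95% confidence interval can be computed from a 2x2 table as sketched below. The text does not state how the intervals were derived, so the Wald interval on the log scale used here is an assumption, and the counts in the usage note are hypothetical rather than taken from the cohort.

```python
import math


def risk_ratio(a, n1, b, n2, z=1.96):
    """Risk ratio of group 1 versus group 2 with a log-scale Wald CI.

    a / n1: advanced-stage cases / total in group 1 (e.g. CMS4)
    b / n2: advanced-stage cases / total in group 2 (e.g. CMS2)
    Assumes a > 0 and b > 0. Returns (rr, lower, upper).
    """
    rr = (a / n1) / (b / n2)
    # Standard error of log(RR) for two independent proportions.
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper
```

With hypothetical counts of 30 of 50 advanced cases in one subtype and 20 of 50 in another, `risk_ratio(30, 50, 20, 50)` gives a ratio of 1.5 with an interval that just includes 1, which shows how group size limits the precision of such estimates.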
CMS4 holds prognostic value in high-risk stage II colon cancer
In an effort to validate our findings and provide clinical utility to the insight obtained, we evaluated chemotherapy-naive high-risk stage II colon cancers (Table 2). Based on the association between CMSs and tumor stage, we hypothesized that CMS4 cancers are overrepresented in high-risk stage II cancers. Indeed, in the combined stage II cohorts, MATCH and GSE33113 (n = 197), CMS4 cancers were more prevalent in high-risk than in low-risk stage II patients (21.7% vs 7.7%, p = 0.02) (Table 2, Fig. 3a and Additional file 1: Table S2). DFS for these patients confirmed the poor disease outcome of CMS4 cancers (Fig. 3b). This effect was explained by the poor outcome for patients with a CMS4 cancer in the subgroup with high-risk tumors (5-year DFS 41.7% versus 100.0%, p = 0.008) (Fig. 3c and Additional file 1: Fig. S4). These findings were substantiated by a multivariate analysis, which showed a significant association of CMS with DFS in the subgroup with high-risk tumors but not in the total stage II cohort (Table 3 and Additional file 1: Table S3). The extended GSE33113 cohort, comprising both stage II and stage III tumors, revealed possible under-staging of high-risk stage II patients: with a rising number of assessed lymph nodes, the percentage of stage III colon cancers increased (Fig. 3d and Additional file 1: Table S4).
At present we are moving towards a more personalized medicine approach for the treatment of cancer. However, TNM staging is still the single most important feature guiding treatment decisions for CRC. The CMS classification is a promising classification system for CRC, identifying four subtypes with distinguishing biological features. CMS classification might be a relevant addition to TNM staging in order to provide an optimal treatment strategy for individual patients. Our findings support the hypothesis that tumor stage as defined by TNM, in addition to disease progression, reflects different biological entities. This adds to the argument that the CMS taxonomy is a potential framework to further tailor the prognostication and treatment of patients with CRC.
We observed a difference in the distribution of the CMSs across the TNM stages, mainly a decrease in CMS1 and a profound increase in CMS4 patients with advancing stages of disease. This is in line with the overall good prognosis of CMS1 tumors, which are mainly MSI tumors, and the poor prognosis of the mesenchymal CMS4 subtype. This may suggest that the poor prognosis for more advanced stages of disease is (in part) explained by the aggressive tumor biology of CMS4, given the poor disease outcome of CMS4 compared to CMS1–3 cancers. The aggressive nature of the mesenchymal subtype was also demonstrated by a higher progression rate for CMS4 compared to the other subtypes (Fig. 2d).
When stratified for CMSs, we observed a marked decrease in differentially expressed genes between the different tumor stages. Furthermore, a high correlation between the two gene signature scores for stage III/IV and CMS4 was demonstrated. This indicates that at least part of the biological differences between tumor stages is explained by the CMSs, which in turn supports the hypothesis that different tumor stages are largely driven by tumor biology rather than disease progression.
Furthermore, we showed a possible and valuable clinical implication of the molecular subtypes for high-risk stage II patients. Current guidelines recommend considering adjuvant chemotherapy for these patients, which is based on literature showing (limited) prognostic value but no predictive value for the high-risk variables [9,10,11,12,13]. The overt difference in DFS for the CMS4 subtype in the subgroup of high-risk stage II patients suggests that CMS subtyping may be of added value to identify patients that have a high-risk, lymph node negative colon cancer. This effect might partly be explained by stage migration, due to under-staging as a result of a low number of assessed lymph nodes; i.e. high-risk stage II tumors contain unrecognized stage III tumors. Another possible explanation for the marked difference in DFS within the CMS4 population is that these tumors behave more like the early-dissemination model [3, 14], instead of the classical linear-progression model in CRC. In agreement, the existence of early disseminating cancer cells which evolve independently at the metastatic site has been demonstrated in breast cancer. Therefore, patients with CMS4 tumors may benefit from treatment with chemotherapy at an apparently early stage of progression (stage II).
Several clinical studies found that patients with synchronous and metachronous liver metastases had a similar overall survival upon diagnosis of metastatic disease [16,17,18]. This supports our hypothesis that tumor biology is installed at an early moment in tumor development and that this, rather than the progression over time, is the main determinant for prognosis in these patients. Also, determining the CMS may not only be helpful to identify high-risk stage II patients, but may also be used to select patients for specific treatments. Patients with an MSI tumor (mostly CMS1) are known to have very limited benefit from chemotherapy [19, 20]. However, these patients may very well benefit from immunotherapy or the addition of Bevacizumab instead of Cetuximab to chemotherapy in metastasized CRC [21, 22]. For epithelial-like tumors (CMS2/3) there is a predictive value for anti-EGFR therapy [7, 23]. Patients with a CMS2 tumor were shown to be responsive to Oxaliplatin-containing chemotherapy while mesenchymal tumors (CMS4) seemed refractory to 5FU-based chemotherapy. These results suggest that the CMS taxonomy may also be used to select patients for conventional chemotherapy [24, 25]. Future prospective studies should be conducted to confirm these hints on CMS-specific drug sensitivity, as these findings originate from retrospective studies.
The current study has several limitations. First, the survival analysis in the subset of stage II colon cancer may be subject to selection bias. Patients with high-risk stage II colon cancer were excluded from the current analysis if they received adjuvant chemotherapy. However, an estimated 10–15% of these patients actually receive adjuvant chemotherapy, and patients with T4 tumors and inadequate lymph node assessment (both high-risk factors) were present in the aggregated cohorts. Second, the additive value of the CMS for high-risk stage II patients should be validated in larger series given the relatively small number of patients in the high-risk stage II cohort.
In conclusion, this study provides evidence to support the hypothesis that tumor stage and the corresponding prognosis are at least partly driven by tumor biology rather than the time of diagnosis. The CMS classification system has the potential to be a major contributor to clinical decision making. Therefore, future efforts should focus on further substantiating these findings and the development of a clinically applicable CMS test.
Availability of data and materials
The GSE39582, GSE33113 and TCGA datasets are publicly available in the Gene Expression Omnibus repository (https://www.ncbi.nlm.nih.gov/geo/) and the TCGA repository (https://cancergenome.nih.gov/). The sequencing data and a restricted clinical data set of the MATCH cohort can be accessed through the European Genome Phenome Archive (https://www.ebi.ac.uk/ega/home) under accession number EGAS00001002197 and as supplemental data of Kloosterman et al. Detailed clinical data of the MATCH Cohort and the data set of the extended AMC-AJCCII-90 Cohort will be provided upon reasonable request to the corresponding author.
Consensus Molecular Subtype
1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424.
2. Dienstmann R, Mason MJ, Sinicrope FA, Phipps AI, Tejpar S, Nesbakken A, et al. Prediction of overall survival in stage II and III colon cancer beyond TNM system: a retrospective, pooled biomarker study. Ann Oncol. 2017;28(5):1023–31.
3. Naxerova K, Reiter JG, Brachtel E, Lennerz JK, van de Wetering M, Rowan A, et al. Origins of lymphatic and distant metastases in human colorectal cancer. Science (New York, NY). 2017;357(6346):55–60.
4. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med. 2015;21(11):1350–6.
5. Marisa L, de Reynies A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 2013;10(5):e1001453.
6. Kloosterman WP, van den Braak RRJ C, Pieterse M, van Roosmalen MJ, Sieuwerts AM, Stangl C, et al. A systematic analysis of oncogenic gene fusions in primary Colon Cancer. Cancer Res. 2017;77(14):3814–22.
7. De Sousa EMF, Wang X, Jansen M, Fessler E, Trinh A, de Rooij LP, et al. Poor-prognosis colon cancer is defined by a molecularly distinct subtype and develops from serrated precursor lesions. Nat Med. 2013;19(5):614–8.
8. Benson AB 3rd, Schrag D, Somerfield MR, Cohen AM, Figueredo AT, Flynn PJ, et al. American Society of Clinical Oncology recommendations on adjuvant chemotherapy for stage II colon cancer. J Clin Oncol. 2004;22(16):3408–19.
9. Andre T, de Gramont A, Vernerey D, Chibaudel B, Bonnetain F, Tijeras-Raballand A, et al. Adjuvant fluorouracil, Leucovorin, and Oxaliplatin in stage II to III Colon Cancer: updated 10-year survival and outcomes according to BRAF mutation and mismatch repair status of the MOSAIC study. J Clin Oncol. 2015;33(35):4176–87.
10. Cakar B, Varol U, Junushova B, Muslu U, Gursoy Oner P, Gokhan Surmeli Z, et al. Evaluation of the efficacy of adjuvant chemotherapy in patients with high-risk stage II colon cancer. J BUON. 2013;18(2):372–6.
11. Jalaeikhoo H, Zokaasadi M, Khajeh-Mehrizi A, Rajaeinejad M, Mousavi SA, Vaezi M, et al. Effectiveness of adjuvant chemotherapy in patients with stage II colorectal cancer: a multicenter retrospective study. J Res Med Sci. 2019;24:39.
12. Kumar A, Kennecke HF, Renouf DJ, Lim HJ, Gill S, Woods R, et al. Adjuvant chemotherapy use and outcomes of patients with high-risk versus low-risk stage II colon cancer. Cancer. 2015;121(4):527–34.
13. O'Connor ES, Greenblatt DY, LoConte NK, Gangnon RE, Liou JI, Heise CP, et al. Adjuvant chemotherapy for stage II colon cancer with poor prognostic features. J Clin Oncol. 2011;29(25):3381–8.
14. Nagtegaal ID, Schmoll HJ. Colorectal cancer: what is the role of lymph node metastases in the progression of colorectal cancer? Nat Rev Gastroenterol Hepatol. 2017;14(11):633–4.
15. Ghajar CM, Bissell MJ. Metastasis: Pathways of parallel progression. Nature. 2016;540(7634):528–9.
16. Mekenkamp LJ, Koopman M, Teerenstra S, van Krieken JH, Mol L, Nagtegaal ID, et al. Clinicopathological features and outcome in advanced colorectal cancer patients with synchronous vs metachronous metastases. Br J Cancer. 2010;103(2):159–64.
17. Rahbari NN, Carr PR, Jansen L, Chang-Claude J, Weitz J, Hoffmeister M, et al. Time of metastasis and outcome in colorectal cancer. Ann Surg. 2019;269(3):494–502.
18. van der Pool AE, Lalmahomed ZS, Ozbay Y, de Wilt JH, Eggermont AM, IJzermans JN, et al. ‘Staged’ liver resection in synchronous and metachronous colorectal hepatic metastases: differences in clinicopathological features and outcome. Colorectal Dis. 2010;12(10 Online):e229–35.
19. Des Guetz G, Schischmanoff O, Nicolas P, Perret GY, Morere JF, Uzzan B. Does microsatellite instability predict the efficacy of adjuvant chemotherapy in colorectal cancer? A systematic review with meta-analysis. Eur J Cancer. 2009;45(10):1890–6.
20. Sargent DJ, Marsoni S, Monges G, Thibodeau SN, Labianca R, Hamilton SR, et al. Defective mismatch repair as a predictive marker for lack of efficacy of fluorouracil-based adjuvant therapy in colon cancer. J Clin Oncol. 2010;28(20):3219–26.
21. Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, et al. PD-1 blockade in tumors with mismatch-repair deficiency. N Engl J Med. 2015;372(26):2509–20.
22. Lenz HJ, Ou FS, Venook AP, Hochster HS, Niedzwiecki D, Goldberg RM, et al. Impact of consensus molecular subtype on survival in patients with metastatic colorectal Cancer: results from CALGB/SWOG 80405 (Alliance). J Clin Oncol. 2019;37(22):1876–85.
23. Trinh A, Trumpi K, De Sousa EMF, Wang X, de Jong JH, Fessler E, et al. Practical and robust identification of molecular subtypes in colorectal Cancer by immunohistochemistry. Clin Cancer Res. 2017;23(2):387–98.
24. Roepman P, Schlicker A, Tabernero J, Majewski I, Tian S, Moreno V, et al. Colorectal cancer intrinsic subtypes predict chemotherapy benefit, deficient mismatch repair and epithelial-to-mesenchymal transition. Int J Cancer. 2014;134(3):552–62.
25. Song N, Pogue-Geile KL, Gavin PG, Yothers G, Kim SR, Johnson NL, et al. Clinical outcome from Oxaliplatin treatment in stage II/III Colon Cancer according to intrinsic subtypes: secondary analysis of NSABP C-07/NRG oncology randomized clinical trial. JAMA Oncol. 2016;2(9):1162–9.
This work is supported by The New York Stem Cell Foundation (NYSCF-I-R43) and grants from KWF (UVA2014–7245 and UVA2013–6331), Worldwide Cancer Research (14–1164), the Maag Lever Darm Stichting (MLDS-CDG 14–03), the European Research Council (ERG-StG 638193) and ZonMw (Vidi 016.156.308) to LV. LV is a New York Stem Cell Foundation – Robertson Investigator. The RNA sequencing of the MATCH cohort was supported by a grant from NutsOhra (grant number 0903–011) to J.N.M. IJzermans; this funding body only had a role in RNA sequencing data collection of the MATCH cohort. The other funding bodies had no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.
Ethics approval and consent to participate
For the MATCH Cohort ethics approval was obtained from the institutional research ethics review board of the Erasmus MC University Medical Center (number MEC-2007-088). All patients gave written informed consent for the storage and use of tissue samples for research purposes, and the collection of clinical data. For the AMC-AJCCII-90 Cohort the ethics approval was obtained and the need for consent was waived by the institutional research ethics review board of the Amsterdam University Medical Center, location AMC (number W12_011 # 12.17.0020).
Consent for publication
The authors declare that they have no competing interests. LV received consultancy fees from Bayer, MSD, Genentech, Servier and Pierre Fabre but these had no relation with the content of this publication.
Supplementary Table S1. Distribution of CMS per tumor stage in the total and individual cohorts. Supplementary Figure S1. Distribution of the molecular subtypes per tumor stage in the individual cohorts. Supplementary Figure S2. Random sampling all subtypes n = 130. Supplementary Figure S3. Heatmap of the differentially expressed genes between tumor stages. Supplementary Table S2. Distribution of the molecular subtypes in high and low risk stage II CRC patients. Supplementary Figure S4. Disease-free survival in patients with ≥ 10 lymph nodes assessed. Supplementary Table S3. Multivariate analysis of CMS and disease free survival for total stage II cohort. Supplementary Table S4. Characteristics extended GSE33113 cohort.
Cite this article
Coebergh van den Braak, R.R.J., ten Hoorn, S., Sieuwerts, A.M. et al. Interconnectivity between molecular subtypes and tumor stage in colorectal cancer. BMC Cancer 20, 850 (2020). https://doi.org/10.1186/s12885-020-07316-z
- Colorectal cancer
- Molecular subtype
- Tumor biology