We used a high-stringency MBD-protein-based enrichment protocol to obtain methylated DNA for deep sequencing from CRCs (primary and liver metastases) and normal colon mucosa. The portion of the genome sequenced (7.8%) included ~27% (~7.6 × 10^6) of the ~28 million CpGs in the genome. Small fractions of the MBD-isolated regions were differentially methylated in primary (2155 regions including 279,441 CpGs) or metastatic CRC (3223 regions containing 312,723 CpGs) samples relative to the unmatched samples of normal colon mucosa we tested. Importantly, the hypermethylation phenotype of the liver metastases closely resembled that of primary CRCs, in terms of both the identity and location of the hypermethylated DMRs. We have previously shown that this phenotype is already evident in precancerous colorectal lesions (i.e., sessile serrated lesions and, to a lesser extent, adenomatous polyps), and it becomes increasingly obvious in CRCs [13]. Our present findings suggest that this progression reaches a plateau before CRC cells seed the liver. In contrast, the spread of CRC-associated hypomethylation continues after the tumor cells metastasize to the liver. The extent of this progression requires further investigation with whole-genome analysis of the methylome (see below), but it is tempting to speculate that late increases in hypomethylation might contribute to metastasis-specific alterations in the gene expression, genomic stability, and/or drug susceptibility of CRCs [32,33,34,35,36].
Our study is the first to use genome-wide deep sequencing to compare the methylomes of primary CRCs and CRC liver metastases. To our knowledge, only three previous studies [37,38,39] have analyzed this issue at the genome-wide level. In two studies, freshly collected, patient-matched tissue pairs (nine in one case [37], three in the other [38]) were analyzed using a methylated CpG island amplification microarray approach involving methylation-specific, restriction-enzyme digestion of defined CpGs in approximately 6000 gene promoters [40, 41]. Both found that the hypermethylation phenotypes of CRC liver metastases closely resembled those of their primary cancer counterparts, leading the investigators to conclude, as we have, that most of the DNA hypermethylation associated with colorectal tumorigenesis probably occurs before the disease spreads to the liver. Our in silico analysis of previously published Illumina Infinium 450 microarray data on six patient-matched tissue pairs [39] also confirmed that the methylomes of primary and metastatic colorectal cancers are similar (Additional Files, Supplementary Fig. 4). This conclusion is also supported by findings of a recent analysis of 70 pairs of formalin-fixed, paraffin-embedded tissues, which revealed concordance between primary CRCs and matched metastases taken from different organs in the CpG island hypermethylation phenotype at five gene promoters [42]. The tissues we tested were prospectively collected to obtain, from fresh samples, high-quality DNA for MBD capture, and this markedly reduced our chances of obtaining matched samples of colorectal cancers and their corresponding liver metastases in the timeframe of the study. (The clinical management of patients with colorectal cancer, both before and after detection of liver metastases, is usually a fairly long process marked by multiple surgical and chemotherapeutic interventions, often carried out in different hospitals.)
Although this is undeniably a limitation, the strikingly similar hypermethylation phenotypes observed in the unmatched primary and metastatic tumors suggest that similar or even greater concordance would probably be evident in matched samples. Our use of laser-capture tissue microdissection probably played a key role in reducing the variability between primary and metastatic epithelial tumors by eliminating normal and tumor-related stromal cells from their respective microenvironments, an important aspect that recently emerged in a gene-expression study of primary and metastatic colorectal cancers [43].
Our findings on hypomethylation are also in agreement with the previous observations of Hur et al. [35], who found hypomethylation of long interspersed nuclear element-1 (LINE-1) sequences in 77 formalin-fixed, paraffin-embedded samples of primary CRCs (vs. normal mucosa), and this alteration was even more evident in matched samples of liver metastases. The fact that some of the hypomethylated LINE-1 sequences were found to be located within the intronic regions of proto-oncogenes whose expression was increased in liver metastases points intriguingly to possible functional consequences of the late increase of hypomethylation in cancer cells seeding in the liver [35].
The present study was conducted exclusively on microsatellite stable/non-CIMP CRCs, which are far more common than CRCs with the MSI/CIMP-H phenotype. The decision to focus our work on the more frequently encountered phenotype was motivated by the difficulties we encountered in the prospective collection of samples (see above) and the costs of the genome-wide analysis of the DNA methylome. In addition, our previous work [13] has shown that the DNA methylome of primary microsatellite stable/non-CIMP CRCs differs from that of MSI/CIMP-H primaries. Data in the literature are lacking on the possible evolution of the MSI/CIMP-H CRC methylome during metastasis. Genome-wide analysis of the methylome should therefore be extended to this molecular type of CRC to determine whether hyper-/hypomethylation changes between primary and metastatic tumors are CRC-type specific.
We found no evidence that chemotherapy significantly alters the methylomes of CRC liver metastases. This is an important issue in view of the emergence of drug-resistant clones that might exhibit clinically relevant epigenomic changes [44, 45], and our finding requires further, more in-depth investigation. The timing of chemotherapy relative to primary and metastatic tumor resections varied widely in our study, as did the drugs administered. All, however, were cytotoxic agents, and it is important to extend the investigation to include the possible effects of more recently introduced targeted approaches, such as anti-EGFR antibodies.
Bisulfite sequencing is still considered the gold standard technique for analyzing DNA methylomes. However, bisulfite conversion of unmethylated cytosines causes substantial DNA damage [46, 47], which can be a major concern when the amount of input DNA extracted from clinical samples is limited (e.g., the laser-microdissected sections of frozen tissues used in our study). Using MBDE, we obtained high-quality methylome data with only 100 ng of input DNA per sample, but reliable results with this enrichment method can reportedly be obtained with amounts as small as 15 ng [48]. Furthermore, owing to cost considerations and computational constraints, bisulfite sequencing analysis is usually limited to a genome-wide selection of regions, such as that obtained with the targeted enrichment step we used for our bisulfite-sequencing analysis of the methylomes of normal, precancerous, and cancerous colorectal tissues [13]. Metastatic CRCs were not included in that study, but comparison of the data it generated on primary cancers and normal mucosa with those obtained here provided insights into the pros and cons of the two pre-sequencing enrichment protocols (Fig. 3).
As expected, MBDE covered a larger portion of the genome and more CpG dinucleotides than TE (Fig. 3a). The probes used for TE, which were designed a priori, target specific genomic loci consisting mainly of CpG islands in regulatory regions [13]. In contrast, MBDE relies on the binding of the MBD polypeptide to any of the numerous methylated regions in the genome—extragenic as well as intragenic, and CpG shores and shelves as well as the islands they flank (Fig. 1b-c). Therefore, MBDE allowed us to recover more genomic information than TE.
Our analysis also confirmed that both MBDE and TE preferentially cover CpG-dense regions (Fig. 3b), but the mean O:E CpG ratio of those covered by TE was slightly higher than that of MBDE-covered regions (difference between means = 0.075). This small increase is consistent with the fact that TE detected 25,291 (96%) of the 26,361 “canonical” CpG islands located in non-sex chromosomes, as opposed to only 15,423 (59%) detected with MBDE. Most CpG islands are unmethylated in human tissues [49, 50] and will therefore be missed by the MBD polypeptide, which binds to methylated regions [22]. In contrast, however, the shores and shelves flanking canonical CpG islands are not missed by MBDE since they are usually methylated. Indeed, by considering a broader window that includes shelves and shores, as well as the island they flank (i.e., sSISs), MBDE actually covered 93.5% of these regions (24,644, Fig. 1f).
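The quantities discussed above can be illustrated with a minimal sketch. It assumes the commonly used definition of the observed-to-expected (O:E) CpG ratio, (number of CpGs × sequence length) / (number of Cs × number of Gs), which may differ in detail from the implementation used for Fig. 3b. The function names are hypothetical, and the use of 26,361 as the denominator for the island-plus-shores/shelves count is an assumption based on the percentages quoted in the text.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed-to-expected CpG ratio of a DNA sequence.

    Assumed definition: (#CpG * N) / (#C * #G), where N is the sequence
    length. Returns 0.0 when the sequence contains no C or no G.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return s.count("CG") * len(s) / (n_c * n_g)


def percent_detected(detected: int, total: int) -> float:
    """Share of reference regions detected, expressed as a percentage."""
    return 100.0 * detected / total


# Counts quoted in the text for autosomal "canonical" CpG islands.
# Assumption: the same total applies to the islands-plus-shores/shelves count.
TOTAL_ISLANDS = 26_361
for label, detected in [
    ("TE, islands", 25_291),
    ("MBDE, islands", 15_423),
    ("MBDE, islands plus shores/shelves", 24_644),
]:
    print(f"{label}: {percent_detected(detected, TOTAL_ISLANDS):.1f}%")
```

Run on the counts above, the percentages agree, after rounding, with the values of approximately 96%, 59%, and 93.5% reported in the text.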
As for the CpG sites covered by both enrichment technologies, previous comparisons have shown high concordance in the methylation levels identified, in non-colorectal cells or tissues, by MBDE and bisulfite-sequencing approaches (reduced representation bisulfite sequencing [51] and whole genome bisulfite sequencing [52, 53]). Unlike these studies, our work included both diseased tissues (i.e., cancers) and normal colorectal tissues, with the latter as a reference. We could thus focus on methylation level alterations in primary CRCs, as compared with basal levels observed in normal colorectal mucosa, and we also assessed the performance of MBDE vs. TE in regions differing in CpG density (Fig. 3c). On the whole, the methylation changes identified with the two methods were concordant, with correlation values ranging from 0.59 to 0.65 (Fig. 3d). Significant changes in methylation were observed in high-, intermediate-, and low-CpG-density regions. Interestingly, however, MBDE identified hypermethylation in high-density regions that was not detected by TE, and TE consistently detected more hypomethylation, regardless of the regions’ CpG density (Fig. 3c-d). The latter difference was likely due to the limited magnitude of the changes involving hypomethylation and their tendency to occur in relatively short stretches of DNA. Both factors reduce the odds that these alterations will be detected by MBDE, which cannot quantify methylation levels at single CpGs [51, 52].
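As an illustration of how concordance between two enrichment methods can be summarized, the sketch below computes a Pearson correlation and the fraction of regions whose methylation change has the same sign in both methods. The text does not state which correlation coefficient underlies Fig. 3d, so the choice of Pearson is an assumption, and the function names and per-region values are hypothetical toy data rather than results from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def sign_agreement(x, y):
    """Fraction of paired changes that share the same (nonzero) sign."""
    same = sum(1 for a, b in zip(x, y) if a * b > 0)
    return same / len(x)


# Hypothetical per-region methylation changes (tumor minus normal)
# as estimated by two methods; positive values denote hypermethylation.
delta_mbde = [0.42, 0.10, -0.05, 0.31, -0.02, 0.55]
delta_te = [0.35, 0.02, -0.20, 0.40, -0.15, 0.30]

print(f"Pearson r = {pearson(delta_mbde, delta_te):.2f}")
print(f"Same-direction fraction = {sign_agreement(delta_mbde, delta_te):.2f}")
```

Reporting the same-direction fraction alongside the correlation helps separate disagreements in magnitude (for example, weaker hypomethylation signal in one method) from disagreements in direction.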
Despite the evidence for concordance between MBDE and TE, their differences (e.g., those related to cost, coverage achieved, and resolution) should also be considered when choosing between the two types of enrichment protocols (see [51, 54] for formal comparisons). In general, for the study of regional methylation, single-CpG resolution might not be necessary, and a lower-cost method of enrichment like MBDE can be used. Targeted enrichment, however, is more suitable when there is a need for single-CpG resolution of methylation levels (e.g., identification of methylation at transcription-factor binding sites).