Genome-wide expression profiling in colorectal cancer focusing on lncRNAs in the adenoma-carcinoma transition

Background: Long non-coding RNAs (lncRNAs) play a fundamental role in colorectal cancer (CRC) development; however, lncRNA expression profiles in CRC and its precancerous stages remain to be explored. We aimed to study genome-wide lncRNA expression patterns in the colorectal adenoma–carcinoma transition and to analyze the underlying functional interactions of aberrantly expressed lncRNAs.
Methods: LncRNA expression levels of colonic biopsy samples (20 CRCs, 20 adenomas (Ad), 20 healthy controls (N)) were analyzed with Human Transcriptome Array (HTA) 2.0. Expression of a subset of candidates was verified by qRT-PCR and in situ hybridization (ISH) analyses. Furthermore, in silico validation was performed on an independent HTA 2.0 dataset, on HGU133 Plus 2.0 array data and on the TCGA COAD dataset. MiRNA targets of lncRNAs were predicted with the miRCODE and lncBase v2 algorithms, and miRNA expression was analyzed on miRNA 3.0 Array data. MiRNA-mRNA target prediction was performed using miRWALK, and c-Met protein levels were analyzed by immunohistochemistry. Comprehensive lncRNA-mRNA-miRNA co-expression pattern analysis was also performed.
Results: Based on our HTA results, a subset of literature-based CRC-associated lncRNAs showed remarkable expression changes already in precancerous colonic lesions. In both the Ad vs. normal and CRC vs. normal comparisons, 16 lncRNAs, including downregulated LINC02023, MEG8 and AC092834.1 and upregulated CCAT1 and CASC19, showed differential expression during early carcinogenesis that persisted until CRC formation (FDR-adjusted p < 0.05). The intersection of the CRC vs. N and CRC vs. Ad comparisons defines lncRNAs characteristic of malignancy in colonic tumors, where significant downregulation of LINC01752 and overexpression of UCA1 and PCAT1 were found. Two candidates with the greatest increase in expression in the adenoma-carcinoma transition were further confirmed by qRT-PCR (UCA1, CCAT1) and by ISH (UCA1). In line with the aberrant expression of certain lncRNAs in tumors, the expression of miRNA and mRNA targets showed systematic alterations. For example, UCA1 upregulation in CRC samples occurred in parallel with hsa-miR-1 downregulation, accompanied by overexpression of its target c-Met mRNA (p < 0.05).
Conclusion: The defined lncRNA sets may have a regulatory role in the colorectal adenoma-carcinoma transition. A subset of CRC-associated lncRNAs showed significantly differential expression in precancerous samples, raising the possibility of developing adenoma-specific markers for early detection of colonic lesions.


Background
The incidence and mortality of colorectal cancer (CRC) continue to increase, with approximately 1.4 million new CRC cases and 700,000 registered deaths worldwide [1]. Therefore, identification of molecular markers of CRC that might enhance the objective classification or the early detection of the disease remains highly relevant, as CRC is one of the most curable cancers if detected early [2]. Besides the commonly investigated molecular markers, such as DNA mutations, DNA methylation or mRNA expression alterations, interest is growing in an emerging class of non-coding RNAs, the long non-coding RNAs (lncRNAs) [3][4][5].
LncRNAs are defined as transcripts longer than 200 nucleotides without an open reading frame [6]. This class of non-coding RNAs represents a diverse group with known and predicted functions in the regulation of gene expression [7][8][9]. According to experimental data, lncRNAs can interact with DNA, RNA and proteins and can either promote or inhibit transcription [10]. In contrast to miRNA-mediated regulation, the function and mechanism of action of lncRNAs can be diverse: lncRNAs are involved in genomic imprinting, transcriptional regulation, protein scaffolding and maintenance of the hetero-euchromatin balance, can function as miRNA sponges, and also mediate disease-derived alterations of mRNAs, miRNAs and proteins [9,11]. Dysregulated lncRNAs are known to contribute to CRC formation through the disruption of various signaling cascades, including the Wnt/β-catenin, EGFR/IGF-IR (KRAS and PI3K pathways), TGF-β, p53 and Akt signaling pathways, and also by influencing the epithelial-mesenchymal transition program [12]. To date, 172,216 human lncRNA transcripts have been identified according to the NONCODEv5 database [13], and their number continues to increase. Recent studies have demonstrated that several lncRNAs have a key regulatory role in various diseases including CRC [14]. During carcinogenesis, lncRNA expression alterations affect major biological processes, and therefore lncRNAs are considered powerful molecular markers and also potential therapeutic targets in various cancers [3,15].
In the present study, we aimed to determine the differentially expressed lncRNAs at the whole genome level focusing on the colorectal adenoma-carcinoma transition to identify lncRNAs showing specific alterations only in CRC tissue and common lncRNA patterns characteristic both in benign and malignant colonic neoplasms. Furthermore, we validated the lncRNA expression alterations by qRT-PCR, in situ hybridization, on an independent HTA 2.0 dataset, HGU133 Plus2.0, and The Cancer Genome Atlas (TCGA) Colon adenocarcinoma (COAD) datasets. We also report an association between the dysregulated lncRNAs and mRNA, miRNA and protein expression.

Sample collection
During routine screening endoscopy examinations, biopsy samples were collected from patients with untreated colorectal cancer (n = 20; Astler-Coller modified Dukes B-D), from patients with colorectal adenomas (n = 20; tubulovillous: n = 9, tubular: n = 11; with low-grade dysplasia: n = 18, with high-grade dysplasia: n = 2), and from healthy donors (n = 20). Healthy donors had been referred to the outpatient clinic with constipation, rectal bleeding or chronic abdominal pain. Ileocolonoscopy showed normal macroscopic appearance, and no abnormal histologic changes were detected in their biopsy samples. None of the healthy donors had a family history of CRC. Biopsies were immediately placed in RNALater stabilization reagent (Qiagen GmbH, Hilden, Germany) and stored at −80 °C. Written informed consent was provided by all patients. The study was approved by the local ethics committee (Semmelweis University Regional and Institutional Committee of Science and Research Ethics; Nr.: ETT TUKEB 23970/2011). The clinicopathological data for the analyzed sample set are reported in Table 1.

RNA isolation, quality and quantity analyses
Total RNA including the microRNA (miRNA) fraction was isolated with High Pure miRNA isolation kit (Cat no: 05080576001, Roche, Penzberg, Germany) using the one-column protocol according to the manufacturer's recommendation. RNA quantity was measured on a Qubit fluorometer with the Qubit™ RNA Assay Kit (Life Technologies, Eugene, OR, USA) and also on the NanoDrop-1000 instrument (Thermo Fisher Scientific Inc., Waltham, USA) to determine the purity values (OD260/280, OD260/230). RNA quality analysis was performed on an Agilent Bioanalyzer microcapillary electrophoresis system with the RNA 6000 Pico Kit (Agilent, Santa Clara, CA, USA).

Microarray experiment
For lncRNA expression profiling Human Transcriptome Array 2.0 (HTA 2.0) experiments were performed with 100 ng total RNA sample input according to the manufacturer's instructions. For single-stranded complementary DNA (sscDNA) synthesis 15 μg complementary RNA (cRNA) was used and 5.5 μg fragmented and labeled sscDNA sample was hybridized to Human Transcriptome Array 2.0 microarrays (Affymetrix, Santa Clara, CA, USA) for 16 h at 45°C with 60 rpm rotation in the Hybridization Oven (Affymetrix). Microarrays were washed and stained with GeneChip® Hybridization, Wash, and Stain Kit reagents according to the FS450_0001 protocol using the Fluidics Station 450 instrument (Affymetrix). Scanning was performed with GeneChip Scanner 3000 (Affymetrix).   [18]. Alignment of the probesets of the different platforms was performed using the BioMart data mining tool based on the current Ensembl database (Ensembl release 93 -July 2018 using GRCh38.p12 human genome version). Among the significant lncRNA expression alterations identified on HTA 2.0 arrays, 11 associated probesets could be found on the HGU133-Plus2.0 arrays representing 10 lncRNAs. Linear correlation between the two microarray platforms was also analyzed.

Statistical analysis
For data distribution analysis, the Kolmogorov-Smirnov test was applied. As the data were normally distributed, Student's t-test was applied for the pairwise comparisons with Benjamini-Hochberg correction. FDR-adjusted p-values lower than 0.05 were considered significant.
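The FDR adjustment mentioned above is commonly computed with the Benjamini-Hochberg step-up procedure. The following is a minimal sketch of that procedure in plain Python, not the authors' actual pipeline; the function name `benjamini_hochberg` and the example p-values are hypothetical and serve only to illustrate how raw p-values from the pairwise tests would be converted to FDR-adjusted values and compared with the 0.05 cut-off.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the rank decreases (monotonicity step).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one per lncRNA probe set (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
# Apply the significance threshold used in the text (adjusted p < 0.05).
significant = [p < 0.05 for p in adj_p]
```

Note that raw p-values just below 0.05 (0.039 and 0.041 in this toy input) no longer pass the threshold after adjustment, which is the intended effect of controlling the false discovery rate across many probe sets.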
Pearson correlation was calculated and the lncRNA-mRNA-miRNA co-expression network was constructed with the igraph package in the R environment.
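The co-expression step can be illustrated with a short sketch. The study used the igraph package in R; the Python code below is only an assumed, simplified equivalent that computes Pearson correlation between expression profiles and keeps strongly correlated lncRNA-mRNA pairs as network edges. The identifiers (`lncA`, `geneX`, and so on), the expression values and the 0.7 threshold are hypothetical, since the text does not state the cut-off that was applied.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(lnc_profiles, mrna_profiles, threshold):
    """List (lncRNA, mRNA, r) edges whose |r| reaches the threshold."""
    edges = []
    for lnc, lnc_values in lnc_profiles.items():
        for gene, gene_values in mrna_profiles.items():
            r = pearson(lnc_values, gene_values)
            if abs(r) >= threshold:
                edges.append((lnc, gene, round(r, 3)))
    return edges


# Hypothetical expression profiles across five samples (illustration only).
lnc_profiles = {"lncA": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna_profiles = {
    "geneX": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with lncA
    "geneY": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as lncA rises
    "geneZ": [1.0, 3.0, 2.0, 3.0, 1.0],   # unrelated to lncA
}
edges = coexpression_edges(lnc_profiles, mrna_profiles, threshold=0.7)
```

The resulting edge list keeps both positively and negatively correlated partners, which mirrors the way positively and negatively co-expressed mRNAs are reported separately for each lncRNA.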

Results
Expression of known CRC-associated lncRNAs in the colorectal adenoma and carcinoma samples
As the expression of known CRC-associated lncRNAs has not yet been studied in precancerous adenoma samples, in the present study we aimed to analyze the adenoma-specific alterations with a special focus on the adenoma-carcinoma transition. Previous comprehensive studies analyzing healthy colon and CRC tissue samples revealed differentially expressed lncRNAs, so-called CRC-associated lncRNA markers [24][25][26]. First, the expression of these literature-based lncRNA markers was studied on adenoma and CRC samples as part of our whole transcriptome analysis. A subset of lncRNAs from our Human Transcriptome Array 2.0 study (CCAT1, PVT1, CRNDE, LINC01021, LINC-ROR, UCA1, FTX, MEG3, LOC100289019) showed remarkable expression changes already in the precancerous colonic lesions (p < 0.01) (Fig. 1).

Absolute quantification
By the use of absolute quantification, we were able to confirm the lncRNA expression alterations observed by HTA 2.0 analyses. CCAT1 and UCA1 showed upregulation in tumor samples compared to normal tissues (Fig. 3a). In the case of LINC00261, we observed upregulation in adenomas compared to normal controls and downregulation in CRC samples compared to adenomas, but these expression alterations were not significant (data not shown).

In situ hybridization
In normal colonic FFPE tissue samples, no UCA1 ISH signal was detected, whereas UCA1 ISH signal was observed in adenoma tissue and an even stronger signal was detected in colorectal carcinoma samples, in accordance with our qRT-PCR results. The UCA1 ISH signal was localized predominantly in the epithelial cells in adenoma and carcinoma tissue samples (Fig. 3b).

In silico validation on an independent HTA 2.0 dataset
Our HTA 2.0 results were compared with an independent HTA 2.0 dataset [17]. The significantly altered lncRNA set in the CRC vs. normal comparison showed the same tendency: a high positive correlation was found with the independent GSE73360 dataset in the CRC vs. NAT comparison (R² = 0.7076) (Fig. 4a, Additional file 3: Table S2).

In silico validation on TCGA COAD RNASeq dataset
Out of the 37 lncRNAs identified in our cohort showing significantly differential expression in the CRC vs. N comparison, 16 were detected in the TCGA COAD dataset [19]. A very high positive correlation (R² = 0.9029) was observed between the datasets (Fig. 4c, Additional file 3: Table S2).
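A cross-dataset comparison of this kind involves two steps: restricting the analysis to the lncRNAs present in both datasets, then computing the squared linear correlation (R²) on the matched values. The sketch below illustrates this under the assumption that log fold changes are the compared quantity, which the text does not specify; the function name `r_squared_on_shared`, the identifiers and the values are hypothetical.

```python
def r_squared_on_shared(values_a, values_b):
    """Match identifiers present in both datasets and return (shared_ids, R^2)."""
    # Only identifiers detected in both datasets can be compared.
    shared = sorted(set(values_a) & set(values_b))
    x = [values_a[name] for name in shared]
    y = [values_b[name] for name in shared]
    n = len(shared)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Squared Pearson correlation equals R^2 of a simple linear fit.
    return shared, (sxy * sxy) / (sxx * syy)


# Hypothetical log fold changes (CRC vs. normal) in two datasets.
own_cohort = {"lnc1": 2.0, "lnc2": -1.0, "lnc3": 0.5, "lnc4": 1.5}
independent = {"lnc1": 1.8, "lnc2": -1.2, "lnc3": 0.9, "lnc5": 3.0}
shared_ids, r2 = r_squared_on_shared(own_cohort, independent)
```

In this toy input, `lnc4` and `lnc5` are dropped because each appears in only one dataset, in the same way that only a subset of the significant lncRNAs could be matched to the independent data.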

Co-expression analysis of differentially expressed lncRNAs-mRNAs-miRNAs
The mRNAs negatively correlated with CCAT1 are involved in the G1/S and G2/M transitions of the mitotic cell cycle, while the positively correlated mRNAs play a role, for example, in cell division and cell cycle regulation. The mRNAs negatively co-expressed with UCA1 are involved in negative regulation of transcription from RNA polymerase II promoter, angiogenesis and DNA methylation, while the positively correlated mRNAs are involved in mitotic cytokinesis and apoptotic processes. The co-expression network of lncRNAs and mRNAs showed a certain overlap between the mRNA targets of the selected lncRNAs (Fig. 5, Additional file 4: Table S3).
UCA1 was upregulated in adenoma and also in CRC samples (Fig. 6a), and a potential interaction was predicted between UCA1 and hsa-miR-1 based on independent algorithms. In the GSE83924 dataset [21], hsa-miR-1 was downregulated in adenoma and CRC samples compared to normal colonic samples (p < 0.01) (Fig. 6b). miRWALK validated target prediction revealed that c-Met mRNA is one of the targets of hsa-miR-1. In addition, co-expression analysis showed that c-Met was among the most positively co-expressed mRNAs with both probe set IDs of UCA1 (TC19002012: rho = 0.7811; TC19000279.hg.1: rho = 0.7717).
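The reasoning behind the UCA1, hsa-miR-1 and c-Met association is a direction check: under a miRNA sponge hypothesis, the lncRNA and the target mRNA are expected to change in the same direction while the miRNA changes in the opposite direction. The sketch below expresses only that check; the function name and the numeric changes are hypothetical, and a consistent pattern is compatible with the hypothesis but does not demonstrate a functional interaction.

```python
def consistent_with_sponge_model(lnc_change, mirna_change, mrna_change):
    """Check the expected direction pattern for a lncRNA-miRNA-mRNA triplet.

    Arguments are signed expression changes (e.g. log fold changes,
    tumor vs. normal). The pattern holds when the lncRNA and the mRNA
    change in the same direction and the miRNA changes in the opposite one.
    """
    same_direction = lnc_change * mrna_change > 0
    opposite_direction = lnc_change * mirna_change < 0
    return same_direction and opposite_direction


# Hypothetical signed changes (illustration only, not measured values).
upregulated_lnc_case = consistent_with_sponge_model(1.2, -0.8, 0.9)
inconsistent_case = consistent_with_sponge_model(1.2, 0.5, 0.9)
```

Such a filter can narrow down predicted triplets before experimental follow-up, which remains necessary to confirm any regulatory relationship.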

Discussion
The present study aimed to identify differentially expressed lncRNAs at the whole genome level characteristic of colorectal adenoma and carcinoma tissue samples, principally focusing on the adenoma-carcinoma transition. Besides the detection of single lncRNA alterations (by PCR-based methods, Northern blot, RNA immunoprecipitation and in situ hybridization [27]), genome-wide analysis can be achieved by RNA-Seq [28] and also by microarrays, e.g. by Human Transcriptome Array 2.0, a high-resolution microarray platform detecting lncRNA, miRNA and mRNA levels in parallel from the same sample [29].
Certain lncRNAs were previously analyzed in colorectal diseases, but the focus of most of the studies was limited to one or a small set of lncRNAs [30][31][32]. The strength of our study was the inclusion of colorectal adenoma tissue samples along with the CRC cases in a genome-wide lncRNA expression analysis, in order to identify markers for the early detection of colorectal tumors. Based on the present study, 54 lncRNAs were found to be differentially expressed along the colorectal adenoma-carcinoma sequence that could be validated on independent microarray results and also on the TCGA COAD dataset. A subset of lncRNAs already associated with CRC development showed remarkable expression changes in the precancerous colonic adenoma lesions.
The lncRNAs that play a key role during the malignant transformation in CRC development remain to be identified, and therefore we further focused on the colorectal adenoma-carcinoma transition. According to our microarray analysis, 12 lncRNAs were downregulated and 5 were upregulated both in adenoma and CRC samples compared to the healthy controls. These lncRNAs might have a key regulatory role during CRC formation, as their dysregulation could be detected already in adenomas and persisted until the malignant transition. Among these possible key factors, CASC19 overexpression was reported in CRC tissue samples and was associated with metastasis formation; its knockdown resulted in reduced migration of RKO and Caco2 cells [33]. Upregulation of LINC02163 was also observed in CRC tissue samples by NGS technology [34]. In rectal adenocarcinoma cases, Zhang et al. found an association between LINC02163 expression and overall survival [35]. LINC02163 overexpression was documented in gastric cancer tissues and cell lines; knockdown of LINC02163 resulted in reduced cell growth and invasion [36]. Upregulation of AC021218.2 (CRCAL-1) was detected in 3 pairs of CRC samples compared to their matched normal mucosa by RNA sequencing [37]. According to the current literature, within this subset AL365226.2, LINC02023, AC092834.1, LINC02441, B3GALT5-AS1, THRB-IT1, LINC02535, AC140658.1, and AC142086 have not been reported to be associated with CRC formation, and these may serve as novel lncRNA markers of neoplastic processes of the colon. For certain presumably 'driver' lncRNAs (e.g. MEG8), a distinct expression tendency could be detected in adenoma and carcinoma samples in the present study, suggesting that their dysregulation may not be a gradual event during cancer formation.
It is well known that up- or downregulation of lncRNAs can be restricted to a certain stage during cancer formation, and therefore it is not evident that all differentially expressed RNAs show the same expression tendency in adenoma and carcinoma samples [21]. It is known that lncRNAs show cell-specific expression in certain tumors [38], and the overall resultant expression found in biopsy samples can be influenced by the epithelial-stromal cell ratio, which may differ between colorectal adenomas and adenocarcinomas. On the other hand, similarly to the findings on miRNA [21] and DNA methylation levels [39], certain lncRNAs showed distinct expression alterations in adenomas and carcinomas, which might be due to counteracting cell-proliferation control pathways in adenomas that are dysregulated in carcinomas.
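The lncRNA sets discussed in this section are defined by intersecting pairwise comparisons: lncRNAs altered in both Ad vs. N and CRC vs. N form the early, persistent set, while those altered in both CRC vs. N and CRC vs. Ad characterize malignancy. A minimal sketch of that selection follows, assuming that an lncRNA counts as shared only when it changes in the same direction in both comparisons (an assumption, as the exact rule is not spelled out in the text); all identifiers and directions are hypothetical.

```python
def shared_same_direction(comparison_a, comparison_b):
    """Keep lncRNAs significant in both comparisons with the same direction.

    Each comparison maps an lncRNA identifier to 'up' or 'down' and
    contains only lncRNAs that passed the significance threshold.
    """
    return {
        name: direction
        for name, direction in comparison_a.items()
        if comparison_b.get(name) == direction
    }


# Hypothetical significant results per comparison (illustration only).
ad_vs_n = {"lncA": "up", "lncB": "down", "lncC": "up"}
crc_vs_n = {"lncA": "up", "lncB": "down", "lncC": "down", "lncD": "up"}
crc_vs_ad = {"lncC": "down", "lncD": "up"}

# Altered already in adenomas and still altered in CRC.
early_persistent = shared_same_direction(ad_vs_n, crc_vs_n)
# Altered in CRC relative to both normal and adenoma tissue.
malignancy_specific = shared_same_direction(crc_vs_n, crc_vs_ad)
```

In this toy input, `lncC` is upregulated in adenomas but downregulated in CRC, so it is excluded from the early, persistent set; this corresponds to the lncRNAs described above as having a distinct expression tendency in adenoma and carcinoma samples.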
LncRNAs characteristic only of the malignant state could be identified in the intersection of the CRC vs. N and CRC vs. Ad comparisons, where significant downregulation of LINC01752 and overexpression of PCAT1 and UCA1 were observed. Although the altered expression of these lncRNAs is hypothetically associated with malignancy in colonic tumors, due to the relatively low number of CRCs analyzed in the present study, no reliable further conclusion can be drawn about the correlation of their expression levels with survival, metastasis or tumor stage data in our cohort. However, on the basis of literature data, overexpression of PCAT1 was suggested to be an independent prognostic factor for CRC [40], and its downregulation inhibited proliferation, blocked cell cycle transition, suppressed cyclin and c-myc expression and induced apoptosis [41]. Silencing of PCAT1 in Caco-2 and HT-29 cell lines suppressed cell motility and invasiveness and sensitized the cells to 5-FU [42]. Human urothelial carcinoma associated 1 (UCA1) lncRNA has a role in the regulation of cell proliferation, apoptosis and cell cycle progression; an increased UCA1 expression level was first reported in urothelial cancer [43] and later in breast cancer and CRC patients [44][45][46]. According to Ni et al., CRC patients with high UCA1 expression had a poor prognosis, and UCA1 overexpression was found to be an independent predictor of CRC [47]. Exogenous expression of UCA1 enhanced tumorigenicity, invasive potential, and drug resistance in human bladder TCC BLS-211 cells [43]. Silencing of UCA1 in HCT116, HT29, SW480 and LoVo cell lines significantly decreased cell proliferation, while UCA1 overexpression promoted tumorigenicity in LoVo cells [48]. Silencing of UCA1 suppressed proliferation and metastasis and induced apoptosis of oral squamous cell carcinoma cell lines, which may be related to the activation of the WNT/β-catenin signaling pathway [49].
In a recent study, UCA1 expression was also detected in liquid biopsy samples: Tao et al. reported that the elevated expression level found in tissue samples can also be detected in plasma samples of CRC patients [50]. The above-mentioned functions of UCA1 illustrate the complexity of its mechanisms of action and its diverse role in CRC development [51].
To date, no data have been reported about UCA1 expression in colorectal adenoma patients. According to our results, UCA1 shows a gradual upregulation in the normal-adenoma-cancer sequence; thus, besides its well-established malignancy-associated functions, it could potentially be utilized as an early detection marker for precancerous lesions of the colon, which may contribute to reducing CRC-related deaths in the future.
Interestingly, a slight discrepancy was observed in UCA1 expression between the microarray and qRT-PCR methods. The two probe sets representing UCA1 on the HTA 2.0 microarray can detect all 36 transcripts, whereas the qRT-PCR primers used in the present study detect 4 transcripts (namely UCA1-213, UCA1-207, UCA1-201, and UCA1-228). This difference in transcript coverage probably contributes to the discrepant expression results observed between the two platforms. Nevertheless, the in situ hybridization probes could detect all UCA1 transcripts, and as an added advantage, this approach also provides information regarding lncRNA subcellular localization in colonic tissue samples. Tripathi et al. recently reported moderate UCA1 expression using ISH on colorectal cancer tissue [52]. To the best of our knowledge, our study is the first to report upregulation of UCA1 lncRNA in colorectal adenomas and cancers, predominantly observed in the epithelial cells, adding further evidence for UCA1 overexpression along colorectal cancer formation.
UCA1 has been reported to regulate miRNAs, acting as a miRNA sponge for miR-216b and hsa-miR-1 and as a ceRNA of miR-204-5p, among others, thereby influencing growth promotion, apoptosis inhibition and 5-FU resistance [48,53,54]. The mutual inhibitory effect and the inverse expression of UCA1 and hsa-miR-1 have already been demonstrated in bladder cancer, and one functional interaction site between hsa-miR-1 and UCA1 was experimentally confirmed [54]. Furthermore, after transfection with pre-miR-1 or treatment with UCA1 shRNA, cell proliferation and motility decreased in bladder cancer cell lines in an AGO2-mediated manner [54]. Downregulation of miR-1 in tumors compared to the corresponding normal tissue samples is associated with worse prognosis, and its lower expression has been reported to reprogram cancer metabolism via the regulation of tumor glycolysis in colorectal cancer [55]. The interaction between hsa-miR-1 and HGFR (c-Met) is well known in CRC [56]. On the basis of the comprehensive microarray analysis of Nagy et al. [21] and our present results, the downregulation of hsa-miR-1 along the adenoma-carcinoma sequence was accompanied by c-Met overexpression. Upregulated UCA1 might exert its effect on c-Met protein and could therefore be associated with metastasis formation and CRC progression. Continuous overexpression of c-Met protein was observed along with the development of CRC [57,58], which could be confirmed in our cohort as well. On the basis of our co-expression analysis, c-Met was among the mRNAs most positively co-expressed with UCA1. Taken together, the association of UCA1, hsa-miR-1 and c-Met was predicted by independent algorithms in our study. However, further functional experimental verification is needed to prove the hypothesized interactions in CRC.
Besides the studies aiming at the identification of a signature marker group of lncRNA candidates differentially expressed between CRC and normal samples either on the basis of tissue or plasma samples (e.g. circulating XLOC_006844, LOC152578 and XLOC_000303) [59] or between high-risk and low-risk CRC patients with different overall survival [60], our study identified a subset of lncRNAs potentially playing a key role during the adenoma-carcinoma transition complementing the present knowledge about CRC formation.

Conclusion
In summary, the defined lncRNA sets may have a regulatory role in the adenoma-carcinoma transition. A subset of lncRNAs showed significant differential expression already in precancerous samples that persisted until CRC formation, raising the possibility of developing adenoma-specific markers in order to achieve early detection of colonic lesions.
Additional file 1: Figure S1. Differentially expressed lncRNAs in the adenoma-carcinoma sequence. A) Adenoma vs. Normal samples, B) CRC vs. Adenoma samples, C) CRC vs. Normal samples. Intensity values on the color scale are as follows: red, high intensity; black, intermediate intensity; green, low intensity.
Additional file 3: Table S2. In silico validation on independent datasets.
Additional file 4: Table S3. lncRNA-mRNA co-expression network data, GO analysis.
Additional file 5: Figure S2. Specific UCA1 in situ hybridization signals using RNAScope probes (red stain) on CRC tissue with merged and single channel captures from ISH of UCA1 (Urothelial cancer associated 1), dapB (a Bacillus subtilis gene, negative control probe), and PPIB (Cyclophilin B, positive control probe). Tissue sections were counterstained with DAPI. Digital microscope samples, scale bar: 100 μm.