Genome instability-related long non-coding RNA in clear renal cell carcinoma determined using computational biology

Abstract

Background

There is evidence that long non-coding RNA (lncRNA) is related to genetic stability. However, the complex biological functions of these lncRNAs are unclear.

Method

The TCGA-KIRC lncRNA expression matrix and somatic mutation data were obtained from the TCGA database. The “GSVA” package was applied to evaluate genome-related pathways in each sample. GO and KEGG analyses were performed to show the biological functions of the lncRNA-mRNA network. The “Survival” package was applied to determine the prognostic significance of the lncRNAs. Multivariate Cox proportional hazard regression analysis was applied to construct the lncRNA prognostic model.

Results

In the present study, we applied computational biology to identify genome instability-related long non-coding RNAs and identified 26 novel genomic instability-associated lncRNAs in clear cell renal cell carcinoma. We identified a genome instability-derived six-lncRNA signature that significantly divided clear cell renal cell carcinoma samples into high- and low-risk groups, and we validated it in test cohorts. To further elucidate the role of the six lncRNAs in genome stability, we performed gene set variation analysis (GSVA) on the expression matrix and then performed Pearson correlation analysis between the GSVA scores of genomic stability-related pathways and lncRNA expression. LINC00460 and LINC01234 emerged as the critical factors in this study. They may influence the genome stability of clear cell renal cell carcinoma by participating in mediating critical targets in the base excision repair pathway, the DNA replication pathway, homologous recombination, the mismatch repair pathway, and the p53 signaling pathway.

Conclusion

These data suggest that LINC00460 and LINC01234 are crucial for the stability of the clear cell renal cell carcinoma genome.

Introduction

Clear cell renal cell carcinoma (ccRCC) is the most common subtype of renal cell carcinoma, accounting for 80 to 90% of all cases. ccRCC is a potentially invasive tumor with an overall progression-free survival rate of 70% and a cancer-specific mortality rate of 24% [1]. It is 1.5–2.0 times more common in men than in women. Advanced RCC has a five-year survival rate of 11.7% [2]. Risk factors include smoking, obesity, high blood pressure, chronic kidney disease, and exposure to certain chemicals and heavy metals [3]. The number of ccRCC diagnoses has been increasing over the past few years. Although surgery is the most common treatment option, early diagnosis is difficult, and many patients already have metastatic disease at the time of diagnosis [4]. For patients with advanced or relapsed ccRCC, many molecular-targeted drugs have been used as first-line therapies. Nevertheless, outcomes are poor due to the side effects of these agents and individual differences in drug sensitivity [5].

Copying genetic material accurately for daughter cells is a fundamental challenge for cells. When this process goes wrong, genomic instability occurs [6]. The level of genomic instability is reflected in nucleotide instability, microsatellite instability, and chromosome instability [7]. DNA damage can result from replication errors induced by genotoxic compounds or by ultraviolet and ionizing radiation. Incorrect DNA replication can lead to mutations or blocked replication, and in turn to chromosome breakage, rearrangement, and translocation [8]. Genomic instability is an essential source of genetic diversity within tumors. Oncogene expression drives proliferation by interfering with regulatory pathways that control cell cycle progression. Genomic instability not only produces large-scale genetic aberrations but also increases point mutations in protein-coding genes. The estimated mutation rate in tumors is an order of magnitude higher than that of typical healthy tissue. Genomic instability also changes as tumors develop, and this trait could become a target for treatment [9].

Recent advances in sequencing technology have revealed that only 2% of the human genome codes for proteins [10]. Non-coding RNAs are classified into small non-coding RNAs and long non-coding RNAs according to their size, and long non-coding RNAs (lncRNAs) predominate. LncRNAs play central roles in many cellular mechanisms, including regulation of cell processes [11]. They also regulate pathophysiological processes through gene imprinting, histone modification, chromatin remodeling, and other mechanisms [12, 13]. LncRNAs also play essential roles in cancer. They are involved in chromatin remodeling and in transcriptional and post-transcriptional regulation through various chromatin-based mechanisms and interactions with other RNA species [14, 15]. LncRNA imbalances can alter functions such as cell proliferation, anti-apoptosis, angiogenesis, metastasis, and tumor suppression [16]. Depending on their positions and distribution in the genome, lncRNAs directly or indirectly affect the expression of various proteins through transcriptional and post-transcriptional changes, some of which may mediate tumor inhibition or promotion [17].

Because chemotherapy, radiation therapy, targeted therapeutic agents, and immune checkpoint inhibitors do not work well in many ccRCC patients, investigators need to develop new treatment options and identify further prognostic biomarkers and therapeutic targets for ccRCC. Screening lncRNAs and building models based on genomic instability in ccRCC may represent an important research strategy.

Materials and methods

Data collection

We downloaded clinical information, protein-coding RNA expression data, lncRNA expression data, and somatic mutation information for clear cell renal cell carcinomas from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) [18]. We considered 507 ccRCC samples with paired lncRNA and mRNA expression profiles, survival information, and clinical information.

We divided all ccRCC samples into a training set and a test set. The training set included 254 samples and was used to build the lncRNA risk model for clinical outcome. The test set included 253 samples and was used to validate the predictive ability of the prognostic risk model. Detailed data on the TCGA ccRCC samples are provided in Supplementary Table 1. We also calculated the tumor mutation burden (TMB) of each sample to estimate the average number of mutations in the tumor genome [19].
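A split of this kind can be sketched in Python as follows. This is only an illustration: the original analysis was run in R, and the seed, the sample identifiers, and the size of the covered region used for TMB are assumptions that the text does not specify.

```python
import random


def split_cohort(sample_ids, n_train, seed=0):
    """Shuffle sample IDs reproducibly and split them into training and test lists."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seed is an assumption, chosen for reproducibility
    return ids[:n_train], ids[n_train:]


def tumor_mutation_burden(n_mutations, covered_megabases):
    """Mutations per megabase; the covered region size must be supplied by the caller."""
    return n_mutations / covered_megabases


# Hypothetical identifiers standing in for the 507 TCGA-KIRC samples.
sample_ids = [f"sample_{i:03d}" for i in range(507)]
train_ids, test_ids = split_cohort(sample_ids, n_train=254)
print(len(train_ids), len(test_ids))
```

Fixing the seed makes the partition reproducible, which matters because the reported model coefficients depend on which samples land in the training set.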

Mining lncRNAs related to genetic instability

First, we calculated the number of somatic mutations in each sample. Samples with somatic mutation counts in the top 25% were defined as the genomically unstable (GU)-like group, and samples with counts in the bottom 25% were defined as the genomically stable (GS)-like group. We combined the lncRNA expression matrix of TCGA-KIRC with the GU and GS group labels to obtain each group’s lncRNA expression matrix. We then conducted a differential expression analysis between these two matrices; lncRNAs with |fold change| > 1 and a false discovery rate-adjusted P < 0.05 were defined as genome instability-associated lncRNAs. The results of this analysis are displayed in Table 1.
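The quartile-based grouping described above can be sketched as follows. The function name, the sample identifiers, and the mutation counts are hypothetical, and ties at the quartile boundary are resolved simply by input order, which is an assumption rather than the authors' stated rule.

```python
def assign_stability_groups(mutation_counts, fraction=0.25):
    """Return (GU-like, GS-like) sample lists from a {sample_id: mutation_count} mapping."""
    # Rank samples from the highest to the lowest somatic mutation count.
    ranked = sorted(mutation_counts, key=mutation_counts.get, reverse=True)
    n_group = int(len(ranked) * fraction)
    gu_like = ranked[:n_group]                       # top 25%: genomically unstable-like
    gs_like = ranked[-n_group:] if n_group else []   # bottom 25%: genomically stable-like
    return gu_like, gs_like


# Toy mutation counts for eight hypothetical samples.
counts = {"s1": 120, "s2": 15, "s3": 64, "s4": 33, "s5": 8, "s6": 71, "s7": 49, "s8": 27}
gu, gs = assign_stability_groups(counts)
print(gu, gs)
```

Samples in the middle 50% are left unassigned, so the differential analysis contrasts only the two extremes of the mutation-count distribution.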

Table 1 lncRNAs related to genetic instability

Functional enrichment analysis and GSVA

We calculated the correlation between each protein-coding gene and each of the lncRNAs obtained as described above using the Pearson correlation coefficient [20]. We ranked the protein-coding genes in descending order of correlation and selected the 10 mRNAs with the highest coefficients as the co-expressed coding genes of each lncRNA. By functionally analyzing these co-expressed coding genes, we inferred the biological functions of the genome instability-associated lncRNAs. Gene Ontology (GO) enrichment was performed using the clusterProfiler package in R, version 3.6.3 [21]. GSVA, which estimates pathway activity in an unsupervised manner, has a higher ability to detect pathway changes across a sample population [22]. We downloaded gene sets from the Molecular Signatures Database (http://software.broadinstitute.org/gsea/msigdb) and then calculated a GSVA score for each gene set in each sample using the GSVA R package.
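The co-expression ranking step can be illustrated with a small, self-contained sketch. The gene names and expression values are placeholders, and the helper names are not from the paper; ranking uses the signed coefficient in descending order, as the text describes.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant vector has zero variance and would raise ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


def top_coexpressed(lncrna_expr, mrna_exprs, k=10):
    """Rank mRNAs by correlation with one lncRNA (descending) and keep the top k."""
    scored = [(gene, pearson(lncrna_expr, values)) for gene, values in mrna_exprs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy expression values across five samples; gene names are placeholders.
lnc = [1.0, 2.0, 3.0, 4.0, 5.0]
mrnas = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "GENE_C": [1.0, 3.0, 2.0, 5.0, 4.0],   # partially correlated
}
top = top_coexpressed(lnc, mrnas, k=2)
print(top)
```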

Statistical analysis

We used Euclidean distances and Ward’s linkage method to perform hierarchical cluster analyses [23]. We used univariate Cox proportional hazard regression analysis to calculate the associations between the expression levels of genome instability-associated lncRNAs and overall survival. We performed multivariate Cox proportional hazard regression analysis to evaluate the weighting coefficients in the risk signature. The genome instability-related lncRNA signature (GILncSig) for overall survival was as follows: $\log\left[h(t_i)/h_0(t_i)\right] = a_1X_1 + a_2X_2 + a_3X_3 + \cdots + a_kX_k$, where $h(t_i)$ is the hazard function, $h_0(t_i)$ is the baseline hazard, $X_1, X_2, X_3, \ldots, X_k$ are covariates, and $a_1, a_2, a_3, \ldots, a_k$ are the corresponding multivariate Cox proportional hazard regression coefficients. A detailed introduction can be found in our previous article [24]. We used the same optimal cut-off point throughout, defined as the point with the maximum sensitivity and specificity on the time-dependent receiver operating characteristic (ROC) curve. Hazard ratios (HR) and 95% confidence intervals (CI) were calculated using Cox analysis. The Kyoto Encyclopedia of Genes and Genomes (KEGG) [25] pathways associated with genome instability-related lncRNAs were identified using gene set variation analysis [22]. All statistical analyses were performed using R version 3.6.3.
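Choosing a cut-off that jointly maximises sensitivity and specificity is commonly done with the Youden index, J = sensitivity + specificity − 1. The sketch below shows the idea for a plain binary outcome; it is a simplification that ignores censoring and follow-up time, so it is not the time-dependent ROC procedure used in the study, and all names and values are hypothetical.

```python
def youden_cutoff(scores, events):
    """Return (cutoff, J) maximising sensitivity + specificity - 1.

    A sample is called high-risk when score >= cutoff; events are 1 (event) or 0.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cutoff, best_j = None, float("-inf")
    # Every observed score is tried as a candidate cut-off.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if e == 1 and s >= cutoff)
        tn = sum(1 for s, e in zip(scores, events) if e == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy risk scores and outcomes for five hypothetical patients.
scores = [0.1, 0.2, 0.3, 0.8, 0.9]
events = [0, 0, 0, 1, 1]
cutoff, j = youden_cutoff(scores, events)
print(cutoff, j)
```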

Results

Differences in long non-coding RNAs

The design flow chart of this study is shown in Fig. 1.

Fig. 1

The design flow chart of this study. Clinical follow-up information for clear cell renal cell carcinoma, protein-coding RNA expression data, long non-coding RNA expression data, and somatic mutation information were downloaded from the TCGA database, and the samples were divided into a training set and a test set. The samples were then divided into two groups for differential analysis according to somatic mutation count. Based on the results of the differential analysis, all samples were divided into a genomically stable group and a genomically unstable group by cluster analysis. A lncRNA-mRNA co-expression network was then constructed, and pathway analysis and GSVA scoring were performed for this network. Finally, a Cox regression prognostic model was established, and model verification steps such as survival analysis, clinical subgroup analysis, tumor mutation burden analysis, and model comparison were carried out.

To identify non-coding genes related to genome instability, we grouped the samples according to their number of somatic mutations. We assigned the samples with somatic mutation counts in the top 25% (84 samples) to the genomically unstable group and those with counts in the bottom 25% (84 samples) to the genomically stable group. We screened for differentially expressed non-coding RNAs using the limma package and obtained a total of 26, of which 17 were down-regulated and nine were up-regulated (Table 1). The expression levels of these differential non-coding RNAs in both groups are shown in Fig. 2a.

Fig. 2

(A) Differential expression analysis between samples with somatic mutation counts in the top 25% and those in the bottom 25% in ccRCC. (B) Unsupervised clustering of the GS group and GU group. (C) Difference in the number of somatic mutations between the GS group and GU group. (D) Differential expression of UBQLN4 between the GS group and GU group.

Genome instability-related lncRNA

We performed unsupervised clustering of all samples in KIRC based on the expression levels of these 26 lncRNAs (Fig. 2b). We obtained two clusters, and the number of somatic mutations differed significantly between them (Fig. 2c, P = 5.3e-13, Mann–Whitney U-test). Next, we compared the expression levels of the genomic instability driver ubiquilin 4 (UBQLN4) in the GS-like and GU-like groups (Fig. 2d) [26]. We found that the expression of UBQLN4 was significantly up-regulated in the genomically unstable group. The correlation coefficients between UBQLN4 and the lncRNAs are provided in Supplementary Table 2. These results show that samples with different mutation levels can be distinguished based on the expression levels of the 26 differential lncRNAs, and they indirectly demonstrate that these lncRNAs may be related to genome stability.

LncRNA-mRNA co-expression network

Based on Pearson correlation coefficients, we determined the top 10 mRNAs that correlated with each lncRNA. We created a co-expression network of lncRNAs and mRNAs (Fig. 3a). We then analyzed the functions of the mRNAs in the co-expression module to determine the associated biological processes. GO enrichment demonstrated that these protein-coding genes are related to biological processes such as homologous recombination (Fig. 3b). This analysis suggests that the 26 genome instability-associated non-coding RNAs may affect genome stability by regulating their co-expression networks. We found that these co-expressed protein-coding genes might regulate homologous recombination, thereby disrupting cell stability. In total, we identified 26 non-coding RNAs related to genome instability.

Fig. 3

(A) The co-expression network of lncRNAs and mRNAs. Green represents lncRNAs and red represents mRNAs. The closer the relationship, the closer the connection. (B) GO analysis of the lncRNA-mRNA network. In the biological process category, the network is mainly enriched in monovalent inorganic cation homeostasis. In the cellular component category, the network is mainly enriched in apical part of cell and apical plasma membrane. In the molecular function category, the network is mainly enriched in monovalent inorganic cation transmembrane transporter activity and receptor ligand activity.

The genome instability-related lncRNA risk model

Having identified the lncRNAs and biological processes related to genome stability, we next calculated the correlations between these lncRNAs and clinical survival phenotypes. We randomly divided the 507 ccRCC samples with detailed follow-up information into a training group and a validation group. We constructed a multivariate Cox proportional hazard regression model for ccRCC in the training set based on the 26 genome instability-related lncRNAs. The coefficients of the risk factors in the model are shown in Table 2. Risk model (GILncSig) = 0.095 × LINC00460 + 0.165 × LINC01234 + 0.152 × AL139351.1 + 0.177 × MIR222HG + 0.123 × AC087636.1 − 0.027 × LINC02471. We found that LINC00460, LINC01234, AL139351.1, MIR222HG, and AC087636.1 were clear risk factors: the higher their expression, the worse the overall survival of patients with renal cancer. LINC02471 is a protective factor for ccRCC: the higher its expression, the better the overall survival. The univariate Cox regression coefficients of the clinical features and risk scores are provided in Supplementary Table 3; the risk score acted as an independent prognostic factor. The Pearson correlation coefficients between LINC01234 and tumor mutation burden in other types of cancer are shown in Supplementary Figure 1. LncRNA expression patterns, the distribution of somatic mutation counts, and UBQLN4 expression for patients in the high- and low-risk groups are shown in Supplementary Figure 2.
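As an illustration, the risk score is a weighted sum of the six lncRNA expression values using the coefficients quoted above. The sketch below assumes each expression value is on the same scale used to fit the model, which the text does not specify; the function names and the example profile are hypothetical.

```python
import math

# Coefficients as reported for the GILncSig risk model in the text.
GILNCSIG_COEFFICIENTS = {
    "LINC00460": 0.095,
    "LINC01234": 0.165,
    "AL139351.1": 0.152,
    "MIR222HG": 0.177,
    "AC087636.1": 0.123,
    "LINC02471": -0.027,
}


def gilncsig_score(expression):
    """Weighted sum of lncRNA expression values (the Cox linear predictor)."""
    return sum(coef * expression[name] for name, coef in GILNCSIG_COEFFICIENTS.items())


def hazard_ratio(coefficient):
    """Hazard ratio per unit increase in expression, exp(coefficient)."""
    return math.exp(coefficient)


# Hypothetical expression profile: every lncRNA set to 1.0 for illustration only.
example = {name: 1.0 for name in GILNCSIG_COEFFICIENTS}
score = gilncsig_score(example)
print(round(score, 3))
```

A positive coefficient corresponds to a hazard ratio above 1 (risk factor), and the single negative coefficient, for LINC02471, corresponds to a hazard ratio below 1 (protective factor).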

Table 2 Multivariate Cox proportional hazard regression analysis results

The verification and evaluation of lncRNA model performance

Risk scores for each sample in the training and test sets were calculated using the GILncSig. Patients were divided into groups according to the median risk score (0.853); patients in the high-risk group had a risk score > 0.853. We then compared survival between the high- and low-risk groups using survival analysis. In the whole TCGA-KIRC cohort, patients in the low-risk group had better clinical outcomes (Fig. 4a, P < 0.001). Patients in the low-risk group in the training set (Fig. 4b, P < 0.001) and validation set (Fig. 4c, P < 0.001) also had better survival outcomes. The area under the time-dependent ROC curve was 0.681 for the whole TCGA-KIRC cohort (Fig. 4d), 0.726 for the training set (Fig. 4e), and 0.642 for the validation set (Fig. 4f). MutS homolog 2 (MSH2) and replication factor C subunit 1 (RFC1) are involved in the process of mismatch recognition [27]. Comparison analysis showed significant differences in MSH2 and RFC1 expression patterns between the samples in the high- and low-risk groups (Fig. 5). Expression levels of MSH2 in the low-risk group were significantly higher than those in the high-risk group (P < 0.001, Mann–Whitney U-test; Fig. 5). RFC1 also showed higher expression levels in low-risk patients than in high-risk patients (P < 0.001, Mann–Whitney U-test).
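The grouping rule and the idea behind the area under the curve can be sketched as follows. The AUC here is the plain binary version (the probability that a patient with an event has a higher score than one without), which is a simplification of the time-dependent ROC analysis reported above; patient identifiers and scores are hypothetical.

```python
def stratify(risk_scores, cutoff=0.853):
    """Label each sample 'high' when its risk score exceeds the cutoff, else 'low'."""
    return {sid: ("high" if score > cutoff else "low") for sid, score in risk_scores.items()}


def auc(scores, events):
    """Probability that a sample with an event outscores one without (ties count 0.5)."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for four patients; 0.853 is the median reported in the text.
risk_scores = {"p1": 0.40, "p2": 0.853, "p3": 1.20, "p4": 0.90}
groups = stratify(risk_scores)
print(groups)
print(auc([0.9, 0.8, 0.1, 0.85], [1, 1, 0, 0]))
```

A score exactly equal to the cut-off falls in the low-risk group, matching the "> 0.853" rule stated for the high-risk group.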

Fig. 4

Survival analysis and ROC curves. (A-C) A Cox prognostic regression model was established to calculate the scoring threshold, and survival analysis was performed to assess the difference between the high-risk and low-risk groups. In the whole cohort, the training set, and the test set, patients in the low-risk group had a better prognosis than those in the high-risk group (P < 0.01). (D-F) The area under the ROC curve was 0.681 for the whole cohort, 0.726 for the training set, and 0.642 for the test set. The model shows good predictive ability.

Fig. 5

(A-C) The previously reported genetic instability-related factor MSH2 showed significant differences in expression between the high-risk and low-risk groups in the whole cohort (P = 9.1e-05), the training set (P = 0.0059), and the test set (P = 0.0057). (D-F) The previously reported genetic instability-related factor RFC1 showed significant differences in expression between the high-risk and low-risk groups in the whole cohort (P = 6.8e-07), the training set (P = 0.0066), and the test set (P = 1.8e-05).

Subgroups of the lncRNA model

We thus obtained a stable genomic stability-related lncRNA prognostic model. To further analyze its performance in various clinical subgroups, we conducted survival analysis within each subgroup. We found that, within each subgroup, patients in the low-risk group achieved better outcomes (Fig. 6; Supplementary Figure 3).

Fig. 6

Subgroup analysis. The samples were divided into multiple clinical subgroups according to age, sex, stage, metastasis, and infiltration of lymph nodes. The results showed that in all clinical subgroups, the low-risk group had a better prognosis

Tumor mutation landscapes in high- and low-risk groups

To compare mutations in the high- and low-risk groups, we drew the mutation landscape of the two groups (Fig. 7). A total of 88.24% of the samples in the low-risk group had mutations. The top 10 mutated genes included VHL, PBRM1, TTN, SETD2, BAP1, and MUC16. The high-risk group’s mutation frequency (84.62%) was lower than that of the low-risk group (88.24%). The top 10 mutated genes in the high-risk group were the same as those in the low-risk group.

Fig. 7

Waterfall map of gene mutation burden. (A) In the low-risk group, the mutation rate was 88.24%. The top three mutated genes were VHL, PBRM1 and TTN. (B) In the high-risk group, the mutation rate was 84.62%. The top five mutated genes were VHL, PBRM1, SETD2, TTN and BAP1

Performance comparison in terms of AUC

To determine the accuracy of the clinical predictive model related to genome stability, we compared its diagnostic performance with that of published signatures. Three recently published lncRNA signatures were included in the comparison, all evaluated in the same TCGA patient cohort: the three-lncRNA signature of Zhang et al. [28], the four-lncRNA signature of Liu et al. (LiulncSig) [29], and an immune signature of Sun et al. (SunlncSig) [30]. As shown in Fig. 8, the AUC for overall survival of the GILncSig was 0.681, which was higher than those of SunlncSig (AUC = 0.657) and LiulncSig (AUC = 0.656). Although our model’s AUC was lower than that of the model of Zhang et al., its AUC in the training set was 0.726.

Fig. 8

Model comparison. The model proposed in this paper is compared with the model of Liu et al., Sun et al., and Zhang et al., and the model presented in this paper has the highest ROC value, indicating the best evaluation ability

GSVA pathway correlation analysis

We obtained genome stability-related lncRNAs by comparing groups with different somatic mutation counts; however, differential analysis alone is insufficient to conclude that these lncRNAs are related to genome stability. Therefore, in this section, we obtained genomic stability-related pathway scores for each sample using the GSVA method. We calculated the Pearson correlation coefficients between these pathway scores and the expression of the differential lncRNAs, which directly indicates the pathways through which these factors may regulate genomic stability. Figure 9 shows that the base excision repair pathway, the DNA replication pathway, homologous recombination, the mismatch repair pathway, the p53 signaling pathway, and ubiquitin-mediated proteolysis were related to LINC00460 and LINC01234. The interaction of these pathways appears to ensure the stability of the genome (Fig. 9). For these reasons, we believe LINC00460 and LINC01234 affect the stability of the genome by regulating these pathways. The correlation coefficients between the genomic stability-related pathways and the lncRNAs are provided in Supplementary Table 4.

Fig. 9

Correlation analysis of lncRNA and genomic instability related pathways. Red represents positive correlation and blue represents negative correlation. The selected pathways are: P53 signaling pathway, mismatch repair, homologous recombination, DNA replication and base excision repair

Discussion

The relative stability of genome structure is a prerequisite for the maintenance and continuation of the biological germline. Cells therefore rely on a set of effective mechanisms that ensure the stable and accurate transmission of genetic information from generation to generation. Chromosome instability refers to the increased probability of acquiring chromosomal aberrations due to defects in processes such as DNA repair, replication, or chromosome segregation. Genome stability is closely related to the occurrence and progression of cancer [31,32,33]. Common DNA damage types include DNA base modification, DNA inter-strand and intra-strand cross-links, and DNA single-strand and double-strand breaks [34]. Such DNA damage often leads to genome instability. Proteins related to DNA damage repair, DNA replication, and cell cycle checkpoints work together to ensure the integrity of the genome and of the DNA structure. However, mutations in the genes encoding these proteins can lead to the accumulation of mutations in chromosomes; as these mutations accumulate, they cause cancer and premature aging [32, 33, 35]. There is no accurate quantitative way to describe genome instability. Various efforts are underway to identify protein-coding genes and microRNAs related to genomic instability that predict outcomes [36,37,38].

Although we have made substantial efforts to identify lncRNAs related to genomic instability, whole-genome identification of lncRNA and its clinical research are still in their early stages.

Based on the TCGA ccRCC cohort and the corresponding somatic mutation counts, we identified 26 differentially expressed lncRNAs related to the number of somatic mutations at the computational level. Because computational analysis alone is insufficient, we also incorporated clinical prognostic phenotypes and constructed a clinical predictive lncRNA model. We found that six lncRNAs in the model could be used as independent prognostic markers for renal cancer. To our understanding, genome stability is closely related to p53 mutations, DNA repair, and base mismatch repair. Through the cumulative effect of these factors, normal cells gradually become cancer cells. If so, the six lncRNAs in the model should be closely related to these processes. To verify this, we performed GSVA to obtain the KEGG pathway scores for each sample and then calculated Pearson correlation coefficients between these pathway scores and lncRNA expression. LINC00460 and LINC01234 were the most relevant to these genomic stability pathways. We demonstrated that this method can screen candidate genome stability-related lncRNAs and identify the pathways in which these lncRNAs are involved through GSVA.

After a careful literature search, we found that the biological processes of LINC00460 and LINC01234 in the GILncSig have not been reported to date. The lncRNA LINC00460 is located on chromosome 13q33.2 and is a prognostic biomarker for esophageal squamous cell carcinoma [39] and renal carcinoma [28]. Another lncRNA, LINC01234, is located on chromosome 12q24.13. LINC01234 was found to regulate proliferation, migration, and invasion of ccRCC cells via the HIF-2α pathway [40]. Although studies have demonstrated the relationship between these two factors and outcomes of RCC, they do not explain the specific mechanisms involved. Finally, through GSVA pathway analysis, we found that these two lncRNAs have the strongest correlation with the p53 pathway and may thereby affect the stability of the genome.

The transcription of a lncRNA can affect the expression of neighboring genes [41]. Ephrin B2 (EFNB2), the neighboring gene of LINC00460, encodes a member of the ephrin family. Overexpression of EFNB2 is associated with malignant progression of tumors. It is expressed at high levels in head and neck squamous cell carcinoma and colorectal cancer [42], and it also promotes the growth of pancreatic ductal adenocarcinoma [43]. Knocking down EFNB2 can block tumorigenesis and may provide a basis for tumor therapy [44].

RNA binding motif protein 19 (RBM19) is the neighboring gene of LINC01234. Its function may be to participate in the regulation of ribosome biogenesis [45, 46]. Although there have been no specific studies linking RBM19 to cancer, other scientists have found that RBM19 is expressed in the intestinal epithelium and is critical for intestinal morphogenesis [47].

There are some limitations to our study. First, we did not conduct cell or animal experiments. Second, we only identified 26 genomic stability-related lncRNAs; nevertheless, computational biology techniques demonstrated the connection between LINC00460 and LINC01234 and the genome stability pathway. Underlying regulatory mechanisms require further exploration.

In conclusion, we constructed a screening system for genome stability-related lncRNAs and identified 26 such lncRNAs; a detailed description of the 26 lncRNAs is provided in Supplementary Table 5. We used these lncRNAs to predict outcomes in patients with ccRCC and found that they can be used as independent predictors. Finally, using GSVA pathway correlation analysis, we found that LINC00460 and LINC01234 are related to genome stability, which indirectly demonstrates the appropriateness of this strategy.

Availability of data and materials

The datasets analysed during the current study are available in the TCGA repository (https://portal.gdc.cancer.gov/).

References

  1. Dagher J, Delahunt B, Rioux-Leclercq N, Egevad L, Srigley JR, Coughlin G, et al. Clear cell renal cell carcinoma: validation of World Health Organization/International Society of Urological Pathology grading. Histopathology. 2017;71(6):918–25. https://doi.org/10.1111/his.13311.

  2. Makhov P, Joshi S, Ghatalia P, Kutikov A, Uzzo RG, Kolenko VM. Resistance to systemic therapies in clear cell renal cell carcinoma: mechanisms and management strategies. Mol Cancer Ther. 2018;17(7):1355–64. https://doi.org/10.1158/1535-7163.MCT-17-1299.

  3. Mehdi A, Riazalhosseini Y. Epigenome aberrations: emerging driving factors of the clear cell renal cell carcinoma. Int J Mol Sci. 2017;18(8). https://doi.org/10.3390/ijms18081774.

  4. Woo S, Suh CH, Kim SY, Cho JY, Kim SH. Diagnostic Performance of DWI for Differentiating High- From Low-Grade Clear Cell Renal Cell Carcinoma: A Systematic Review and Meta-Analysis. AJR Am J Roentgenol. 2017;209:W374–W381.

  5. Vera-Badillo FE, Templeton AJ, Duran I, Ocana A, de Gouveia P, Aneja P, et al. Systemic therapy for non-clear cell renal cell carcinomas: a systematic review and meta-analysis. Eur Urol. 2015;67(4):740–9. https://doi.org/10.1016/j.eururo.2014.05.010.

  6. Abbas T, Keaton MA, Dutta A. Genomic instability in cancer. Cold Spring Harb Perspect Biol. 2013;5:a012914.

  7. Pikor L, Thu K, Vucic E, Lam W. The detection and implication of genome instability in cancer. Cancer Metastasis Rev. 2013;32(3-4):341–52. https://doi.org/10.1007/s10555-013-9429-5.

  8. Gaillard H, García-Muse T, Aguilera A. Replication stress and cancer. Nat Rev Cancer. 2015;15(5):276–89. https://doi.org/10.1038/nrc3916.

  9. Reis AH, Vargas FR, Lemos B. Biomarkers of genome instability and cancer epigenetics. Tumour Biol. 2016;37(10):13029–38. https://doi.org/10.1007/s13277-016-5278-5.

  10. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, et al. Landscape of transcription in human cells. Nature. 2012;489:101–8.

  11. Renganathan A, Felley-Bosco E. Long noncoding RNAs in Cancer and therapeutic potential. Adv Exp Med Biol. 2017;1008:199–222. https://doi.org/10.1007/978-981-10-5203-3_7.

  12. Thin KZ, Liu X, Feng X, Raveendran S, Tu JC. LncRNA-DANCR: a valuable cancer related long non-coding RNA for human cancers. Pathol Res Pract. 2018;214(6):801–5. https://doi.org/10.1016/j.prp.2018.04.003.

  13. Ransohoff JD, Wei Y, Khavari PA. The functions and unique features of long intergenic non-coding RNA. Nat Rev Mol Cell Biol. 2018;19(3):143–57. https://doi.org/10.1038/nrm.2017.104.

  14. Lam MT, Li W, Rosenfeld MG, Glass CK. Enhancer RNAs and regulated transcriptional programs. Trends Biochem Sci. 2014;39(4):170–82. https://doi.org/10.1016/j.tibs.2014.02.007.

  15. Li W, Notani D, Ma Q, Tanasa B, Nunez E, Chen AY, et al. Functional roles of enhancer RNAs for oestrogen-dependent transcriptional activation. Nature. 2013;498(7455):516–20. https://doi.org/10.1038/nature12210.

  16. Fang Y, Fullwood MJ. Roles, functions, and mechanisms of long non-coding RNAs in Cancer. Genomics Proteomics Bioinformatics. 2016;14(1):42–54. https://doi.org/10.1016/j.gpb.2015.09.006.

  17. Bhan A, Soleimani M, Mandal SS. Long noncoding RNA and Cancer: a new paradigm. Cancer Res. 2017;77(15):3965–81. https://doi.org/10.1158/0008-5472.CAN-16-2634.

  18. Wang Z, Jensen MA, Zenklusen JC. A practical guide to the Cancer genome atlas (TCGA). Methods Mol Biol. 2016;1418:111–41. https://doi.org/10.1007/978-1-4939-3578-9_6.

  19. Schumacher TN, Kesmir C, van Buuren MM. Biomarkers in cancer immunotherapy. Cancer Cell. 2015;27(1):12–4. https://doi.org/10.1016/j.ccell.2014.12.004.

  20. Pripp AH. Pearson's or Spearman's correlation coefficients. Tidsskr Nor Laegeforen. 2018;138(8). https://doi.org/10.4045/tidsskr.18.0042.

  21. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25(1):25–9. https://doi.org/10.1038/75556.

  22. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.

  23. Vogt W, Nagel D. Cluster analysis in diagnosis. Clin Chem. 1992;38(2):182–98. https://doi.org/10.1093/clinchem/38.2.182.

  24. Wang Y, Lin J, Yan K, Wang J. Identification of a robust five-gene risk model in prostate Cancer: a robust likelihood-based survival analysis. Int J Genomics. 2020;2020:1097602.

  25. Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45:D353–D361.

  26. Jachimowicz RD, Beleggia F, Isensee J, Velpula BB, Goergens J, Bustos MA, et al. UBQLN4 Represses Homologous Recombination and Is Overexpressed in Aggressive Tumors. Cell. 2019;176:505–19.e22.

  27. Surtees JA, Alani E. Replication factors license exonuclease I in mismatch repair. Mol Cell. 2004;15(2):164–6. https://doi.org/10.1016/j.molcel.2004.07.004.

  28. Zhang D, Zeng S, Hu X. Identification of a three-long noncoding RNA prognostic model involved competitive endogenous RNA in kidney renal clear cell carcinoma. Cancer Cell Int. 2020;20(1):319. https://doi.org/10.1186/s12935-020-01423-4.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  29. 29.

    Liu Y, Gou X, Wei Z, Yu H, Zhou X, Li X. Bioinformatics profiling integrating a four immune-related long non-coding RNAs signature as a prognostic model for papillary renal cell carcinoma. Aging (Albany NY). 2020;12:15359–73.

    CAS  Article  Google Scholar 

  30. 30.

    Sun Z, Jing C, Xiao C, Li T. Long non-coding RNA profile study identifies an immune-related lncRNA prognostic signature for kidney renal clear cell carcinoma. Front Oncol. 2020;10:1430. https://doi.org/10.3389/fonc.2020.01430.

    Article  PubMed  PubMed Central  Google Scholar 

  31. 31.

    Aguilera A, García-Muse T. Causes of genome instability. Annu Rev Genet. 2013;47(1):1–32. https://doi.org/10.1146/annurev-genet-111212-133232.

    CAS  Article  PubMed  Google Scholar 

  32. 32.

    Cha HJ, Yim H. The accumulation of DNA repair defects is the molecular origin of carcinogenesis. Tumour Biol. 2013;34(6):3293–302. https://doi.org/10.1007/s13277-013-1038-y.

    CAS  Article  PubMed  Google Scholar 

  33. 33.

    Basu AK. DNA damage, mutagenesis and Cancer. Int J Mol Sci. 2018;19(4). https://doi.org/10.3390/ijms19040970.

  34. 34.

    Kramara J, Osia B, Malkova A. Break-induced replication: the where, the why, and the how. Trends Genet. 2018;34(7):518–31. https://doi.org/10.1016/j.tig.2018.04.002.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  35. 35.

    Cotterill S. Diseases associated with mutation of replication and repair proteins. Adv Exp Med Biol. 2018;1076:215–34. https://doi.org/10.1007/978-981-13-0529-0_12.

    CAS  Article  PubMed  Google Scholar 

  36. 36.

    Habermann JK, Doering J, Hautaniemi S, Roblick UJ, Bündgen NK, Nicorici D, et al. The gene expression signature of genomic instability in breast cancer is an independent predictor of clinical outcome. Int J Cancer. 2009;124(7):1552–64. https://doi.org/10.1002/ijc.24017.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  37. 37.

    Mettu RK, Wan YW, Habermann JK, Ried T, Guo NL. A 12-gene genomic instability signature predicts clinical outcomes in multiple cancer types. Int J Biol Markers. 2010;25(4):219–28. https://doi.org/10.5301/JBM.2010.6079.

    Article  PubMed  PubMed Central  Google Scholar 

  38. 38.

    Ferguson LR, Chen H, Collins AR, Connell M, Damia G, Dasgupta S, et al. Genomic instability in human cancer: Molecular insights and opportunities for therapeutic attack and prevention through diet and nutrition. Semin Cancer Biol. 2015;35:Suppl:S5–5S24.

    Article  Google Scholar 

  39. 39.

    Liang Y, Wu Y, Chen X, Zhang S, Wang K, Guan X, et al. A novel long noncoding RNA linc00460 up-regulated by CBP/P300 promotes carcinogenesis in esophageal squamous cell carcinoma. Biosci Rep. 2017;37(5). https://doi.org/10.1042/BSR20171019.

  40. 40.

    Yang F, Liu C, Zhao G, Ge L, Song Y, Chen Z, et al. Long non-coding RNA LINC01234 regulates proliferation, migration and invasion via HIF-2α pathways in clear cell renal cell carcinoma cells. PeerJ. 2020;8:e10149. https://doi.org/10.7717/peerj.10149.

    Article  PubMed  PubMed Central  Google Scholar 

  41. 41.

    Engreitz JM, Haines JE, Perez EM, Munson G, Chen J, Kane M, et al. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature. 2016;539(7629):452–5. https://doi.org/10.1038/nature20149.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  42. 42.

    Scherer D, Deutelmoser H, Balavarca Y, Toth R, Habermann N, Buck K, et al. Polymorphisms in the angiogenesis-related genes EFNB2, MMP2 and JAG1 are associated with survival of colorectal Cancer patients. Int J Mol Sci. 2020;21(15). https://doi.org/10.3390/ijms21155395.

  43. 43.

    Zhu F, Dai SN, Xu DL, Hou CQ, Liu TT, Chen QY, et al. EFNB2 facilitates cell proliferation, migration, and invasion in pancreatic ductal adenocarcinoma via the p53/p21 pathway and EMT. Biomed Pharmacother. 2020;125:109972. https://doi.org/10.1016/j.biopha.2020.109972.

    CAS  Article  PubMed  Google Scholar 

  44. 44.

    Krusche B, Ottone C, Clements MP, Johnstone ER, Goetsch K, Lieven H, et al. EphrinB2 drives perivascular invasion and proliferation of glioblastoma stem-like cells. Elife. 2016;5. https://doi.org/10.7554/eLife.14845.

  45. 45.

    Kallberg Y, Segerstolpe Å, Lackmann F, Persson B, Wieslander L. Evolutionary conservation of the ribosomal biogenesis factor Rbm19/Mrd1: implications for function. PLoS One. 2012;7(9):e43786. https://doi.org/10.1371/journal.pone.0043786.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  46. 46.

    Zhang J, Tomasini AJ, Mayer AN. RBM19 is essential for preimplantation development in the mouse. BMC Dev Biol. 2008;8(1):115. https://doi.org/10.1186/1471-213X-8-115.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  47. 47.

    Lorenzen JA, Bonacci BB, Palmer RE, Wells C, Zhang J, Haber DA, et al. Rbm19 is a nucleolar protein expressed in crypt/progenitor cells of the intestinal epithelium. Gene Expr Patterns. 2005;6(1):45–56. https://doi.org/10.1016/j.modgep.2005.05.001.

    CAS  Article  PubMed  Google Scholar 

  48. 48.

    Yang W, Zhang K, Li L, Xu Y, Ma K, Xie H, et al. Downregulation of lncRNA ZNF582-AS1 due to DNA hypermethylation promotes clear cell renal cell carcinoma growth and metastasis by regulating the N(6)-methyladenosine modification of MT-RNR1. J Exp Clin Cancer Res. 2021;40(1):92. https://doi.org/10.1186/s13046-021-01889-8.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  49. 49.

    Kumegawa K, Maruyama R, Yamamoto E, Ashida M, Kitajima H, Tsuyada A, et al. A genomic screen for long noncoding RNA genes epigenetically silenced by aberrant DNA methylation in colorectal cancer. Sci Rep. 2016;6(1):26699. https://doi.org/10.1038/srep26699.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  50. 50.

    Li Z, Jiang M, Zhang T, Liu S. GAS6-AS2 Promotes Hepatocellular Carcinoma via miR-3619-5p/ARL2 Axis Under Insufficient Radiofrequency Ablation Condition. Cancer Biother Radiopharm. 2020. https://doi.org/10.1089/cbr.2019.3541.

  51. 51.

    Ren S, Xu Y. AC016405.3, a novel long noncoding RNA, acts as a tumor suppressor through modulation of TET2 by microRNA-19a-5p sponging in glioblastoma. Cancer Sci. 2019;110(5):1621–32. https://doi.org/10.1111/cas.14002.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  52. 52.

    Skala SL, Wang X, Zhang Y, Mannan R, Wang L, Narayanan SP, et al. Next-generation RNA sequencing-based biomarker characterization of Chromophobe renal cell carcinoma and related Oncocytic neoplasms. Eur Urol. 2020;78(1):63–74. https://doi.org/10.1016/j.eururo.2020.03.003.

    CAS  Article  PubMed  Google Scholar 

  53. 53.

    Chen D, Huang Z, Ning Y, Lou C. Knockdown of LINC02471 inhibits papillary thyroid carcinoma cell invasion and metastasis by targeting miR-375. Cancer Manag Res. 2020;12:8757–71. https://doi.org/10.2147/CMAR.S243767.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  54. 54.

    Li W, Chen QF, Huang T, Wu P, Shen L, Huang ZL. Identification and validation of a prognostic lncRNA signature for hepatocellular carcinoma. Front Oncol. 2020;10:780. https://doi.org/10.3389/fonc.2020.00780.

    Article  PubMed  PubMed Central  Google Scholar 

  55. 55.

    Luo Y, Tan W, Jia W, Liu Z, Ye P, Fu Z, et al. The long non-coding RNA LINC01606 contributes to the metastasis and invasion of human gastric cancer and is associated with Wnt/β-catenin signaling. Int J Biochem Cell Biol. 2018;103:125–34. https://doi.org/10.1016/j.biocel.2018.08.012.

    CAS  Article  PubMed  Google Scholar 

  56. 56.

    Liu L, Wang Z, Jia J, Shi Y, Lian T, Han X. Linc01230, transcriptionally regulated by PPARγ, is identified as a novel modifier in endothelial function. Biochem Biophys Res Commun. 2018;507(1-4):369–76. https://doi.org/10.1016/j.bbrc.2018.11.045.

    CAS  Article  PubMed  Google Scholar 

  57. 57.

    Chen Y, Mao ZD, Shi YJ, Qian Y, Liu ZG, Yin XW, et al. Comprehensive analysis of miRNA-mRNA-lncRNA networks in severe asthma. Epigenomics. 2019;11(2):115–31. https://doi.org/10.2217/epi-2018-0132.

    CAS  Article  PubMed  Google Scholar 

  58. 58.

    Wang S, Zhang L, Tao L, Pang L, Fu R, Fu Y, et al. Construction and investigation of an LINC00284-associated regulatory network in serous ovarian carcinoma. Dis Markers. 2020;2020:9696285.

    PubMed  PubMed Central  Google Scholar 

  59. 59.

    Vidovic D, Huynh TT, Konda P, Dean C, Cruickshank BM, Sultan M, et al. ALDH1A3-regulated long non-coding RNA NRAD1 is a potential novel target for triple-negative breast tumors and cancer stem cells. Cell Death Differ. 2020;27(1):363–78. https://doi.org/10.1038/s41418-019-0362-1.

    CAS  Article  PubMed  Google Scholar 

  60. 60.

    Xing C, Cai Z, Gong J, Zhou J, Xu J, Guo F. Identification of potential biomarkers involved in gastric Cancer through integrated analysis of non-coding RNA associated competing endogenous RNAs network. Clin Lab. 2018;64(10):1661–9. https://doi.org/10.7754/Clin.Lab.2018.180419.

    CAS  Article  PubMed  Google Scholar 

  61. 61.

    Zhao Y, Wang H, Wu C, Yan M, Wu H, Wang J, et al. Construction and investigation of lncRNA-associated ceRNA regulatory network in papillary thyroid cancer. Oncol Rep. 2018;39(3):1197–206. https://doi.org/10.3892/or.2018.6207.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  62. 62.

    Chen Z, Chen X, Lei T, Gu Y, Gu J, Huang J, et al. Integrative analysis of NSCLC identifies LINC01234 as an oncogenic lncRNA that interacts with HNRNPA2B1 and regulates miR-106b biogenesis. Mol Ther. 2020;28(6):1479–93. https://doi.org/10.1016/j.ymthe.2020.03.010.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  63. 63.

    Liu D, Jian X, Xu P, Zhu R, Wang Y. Linc01234 promotes cell proliferation and metastasis in oral squamous cell carcinoma via miR-433/PAK4 axis. BMC Cancer. 2020;20(1):107. https://doi.org/10.1186/s12885-020-6541-0.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  64. 64.

    Ma J, Han LN, Song JR, Bai XM, Wang JZ, Meng LF, et al. Long noncoding RNA LINC01234 silencing exerts an anti-oncogenic effect in esophageal cancer cells through microRNA-193a-5p-mediated CCNE1 downregulation. Cell Oncol (Dordr). 2020;43:377–94.

    CAS  Article  Google Scholar 

  65. 65.

    Zhu Y, Luo C, Korakkandan AA, Fatma Y, Tao Y, Yi T, et al. Function and regulation annotation of up-regulated long non-coding RNA LINC01234 in gastric cancer. J Clin Lab Anal. 2020;34:e23210.

    CAS  PubMed  PubMed Central  Google Scholar 

  66. 66.

    Xu W, Li K, Song C, Wang X, Li Y, Xu B, et al. Knockdown of lncRNA LINC01234 suppresses the tumorigenesis of liver Cancer via sponging miR-513a-5p. Front Oncol. 2020;10:571565. https://doi.org/10.3389/fonc.2020.571565.

    Article  PubMed  PubMed Central  Google Scholar 

  67. 67.

    Chaudhary R, Wang X, Cao B, De La Iglesia J, Masannat J, Song F, et al. Long noncoding RNA, LINC00460, as a prognostic biomarker in head and neck squamous cell carcinoma (HNSCC). Am J Transl Res. 2020;12(2):684–96.

    CAS  PubMed  PubMed Central  Google Scholar 

  68. 68.

    Li X, Sun L, Wang X, Wang N, Xu K, Jiang X, et al. A five immune-related lncRNA signature as a prognostic target for Glioblastoma. Front Mol Biosci. 2021;8:632837. https://doi.org/10.3389/fmolb.2021.632837.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  69. 69.

    Fan CN, Ma L, Liu N. Comprehensive analysis of novel three-long noncoding RNA signatures as a diagnostic and prognostic biomarkers of human triple-negative breast cancer. J Cell Biochem. 2019;120(3):3185–96. https://doi.org/10.1002/jcb.27584.

    CAS  Article  PubMed  Google Scholar 

  70. 70.

    Li C, Zheng H, Hou W, Bao H, Xiong J, Che W, et al. Long non-coding RNA linc00645 promotes TGF-β-induced epithelial-mesenchymal transition by regulating miR-205-3p-ZEB1 axis in glioma. Cell Death Dis. 2019;10(10):717. https://doi.org/10.1038/s41419-019-1948-8.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

We thank the TCGA project for free access to its databases.

Funding

This work was supported by the China Shenyang Science and Technology Plan (20-205-4-015).

Author information

Contributions

Yutao Wang, Kexin Yan and Jianbin Bi designed the study; Yutao Wang, Kexin Yan and Linhui Wang analyzed the data and wrote the manuscript. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Jianbin Bi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wang, Y., Yan, K., Wang, L. et al. Genome instability-related long non-coding RNA in clear renal cell carcinoma determined using computational biology. BMC Cancer 21, 727 (2021). https://doi.org/10.1186/s12885-021-08356-9

Keywords

  • Genome instability
  • Long non-coding RNA
  • Computational biology
  • Gene set variation analysis
  • Risk signature