Targeted genomic profiling identifies frequent deleterious mutations in FAT4 and TP53 genes in HBV-associated hepatocellular carcinoma

Background Hepatitis B virus (HBV) is the major risk factor for hepatocellular carcinoma (HCC). The molecular mechanisms underlying HBV-associated HCC pathogenesis remain unclear. Genetic alterations in cancer-related genes have been linked to many human cancers. Here, we aimed to explore genetic alterations in selected cancer-related genes in patients with HBV-associated HCC. Methods Targeted sequencing was used to analyze six cancer-related genes (PIK3CA, TP53, FAT4, IRF2, HNF4α and ARID1A) in eight pairs of HBV-associated HCC tumors and their adjacent non-tumor tissues. Sanger sequencing, quantitative PCR, Western blotting and RNAi-mediated gene knockdown were used to further validate the findings. Results Targeted sequencing revealed 13 non-synonymous mutations, of which 9 (69%) were found in FAT4 and 4 (31%) were found in TP53. Non-synonymous mutations were not found in PIK3CA, IRF2, HNF4α or ARID1A. Among these 13 non-synonymous mutations, 12 (8 in FAT4 and 4 in TP53) were predicted by in silico analysis to have a deleterious effect on protein function. For TP53, the Y220S, R249S and P250R non-synonymous mutations were identified solely in tumor tissues. Further expression profiling of FAT4 and TP53 in 28 pairs of HCC tumor and non-tumor tissues confirmed significant downregulation of both genes in HCC tumors compared with their non-tumor counterparts (P < 0.001 and P < 0.01, respectively). Functional analysis using RNAi-mediated knockdown of FAT4 revealed increased cancer cell growth and proliferation, suggesting a putative tumor suppressor role of FAT4 in HCC. Conclusions This study highlights the importance of FAT4 and TP53 in HCC pathogenesis and identifies new genetic variants that may have potential for the development of precision therapy for HCC. Electronic supplementary material The online version of this article (10.1186/s12885-019-6002-9) contains supplementary material, which is available to authorized users.


Background
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors worldwide. With an incidence of over 700,000 new cases per year, it ranks as the sixth most common cancer and the third leading cause of cancer-related death worldwide [1]. China alone accounts for about 50% of the total number of cases and deaths [2]. Most cases of HCC are associated with chronic infection with hepatitis B virus (HBV) and/or hepatitis C virus (HCV). Other factors such as alcohol consumption, smoking, aflatoxin B exposure, diabetes, obesity, and non-alcoholic fatty liver disease (NAFLD) may act either as amplifiers of the effects of viral hepatitis or as independent risk factors for HCC [3]. Despite advances in HCC diagnosis and treatment in recent decades, most cases of HCC remain asymptomatic until a late stage, resulting in a poor long-term prognosis and limited therapeutic options [4]. Therefore, it is necessary to identify the genomic alterations underlying the pathogenesis of HCC in order to pinpoint effective targets for the early diagnosis and treatment of this deadly disease and to improve the prognosis of affected patients [5].
Accumulation of genetic alterations in oncogenes, tumor-suppressor genes, cell adhesion molecules and DNA repair genes is a characteristic feature of many human cancers including HCC [6]. Over the past few years, next-generation sequencing (NGS) has profoundly advanced our understanding of cancer genomics. The identification of disease driver genes in some solid tumors holds promise for precision medicine, such as ALK inhibitors in non-small cell lung cancer with an ALK rearrangement or BRAF inhibitors in melanoma with a BRAF mutation [7,8]. Unfortunately, liver cancer has not yet reached the point of molecular-based treatment stratification, mainly because of an incomplete understanding of the molecular landscape of HCC, in particular the genomic alterations caused by different etiological factors [9]. Systematic international efforts to elucidate the comprehensive somatic changes in a large group of viral-associated (both HBV and HCV) HCC tumor samples are still underway (http://cancergenome.nih.gov/).
Although the genomic alterations underlying HCC are not yet clearly understood, a broad variety of pathways activated in HCC have been reported, including the Wnt/β-catenin, p53/cell cycle, chromatin remodeling complex, PI3K/Ras, and oxidative stress signaling pathways [10]. Genetic alterations in key genes involved in these pathways generally occur at different frequencies in different cancer types and etiological backgrounds [10,11]. For example, the incidence of mutation in the well-known tumor suppressor gene TP53 varies from 5 to 70% depending on cancer type and stage [12].
In HCC, the rates of TP53 mutation vary significantly between African or Asian (10-60%) and Western countries (10-20%) [13]. The reported prevalence of PIK3CA mutation is inconsistent, at approximately 35.6% of HCC cases in Korea [14], 28% in Italy [15] and 0% in Japan [16]. By using NGS technologies, somatic mutations in several novel cancer-related genes such as ARID1A (7.53%), HNF4α (0.88%), FAT4 (4.71%) and IRF2 (1.06%) have been identified and suggested to be associated with HCC [17]. However, these studies were performed in patients with HCC of heterogeneous etiologies, and the role of genetic changes in these genes in the development of HBV-associated HCC is largely unknown.
To explore whether genetic changes in cancer-related genes can be identified in chronic hepatitis B patients with HCC, we performed targeted sequencing to detect mutations in six selected cancer-related genes: ARID1A, TP53, FAT4, HNF4α, PIK3CA and IRF2. These genes have been suggested to play functional roles in chromatin remodeling (ARID1A), tumor suppression (TP53 and FAT4), transcription activation (HNF4α and IRF2), and oncogenic development (PIK3CA) (Additional file 1: Table S1) [18][19][20]. Identification of the key genes and the related mechanisms could provide a better understanding of HCC pathogenesis and help to develop effective therapeutic strategies. Hence, we aimed to identify genetic changes in cancer-related genes in HBV-associated HCC and to explore whether they play roles in HCC pathogenesis.

Sample preparation and nucleic acids extraction
Eight pairs of tumor and adjacent non-tumor tissues were collected from Asian patients who had HBV-related HCC and had undergone surgical resection at the Queen Mary Hospital, Hong Kong. Patients with other risk factors, such as HCV infection, heavy alcohol consumption, nonalcoholic steatohepatitis (NASH) and smoking, were excluded from this study. The tissues were snap-frozen in liquid nitrogen and stored at -80°C for future analysis. Written informed consent was obtained from all patients. This study was approved by the Institutional Review Board (UW 17-312), University of Hong Kong. Nucleic acids were extracted from about 30 mg of liver tissue using the QIAamp Allprep Kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions. This extraction kit allows simultaneous extraction of DNA, RNA, and protein from the same piece of liver tissue. During RNA isolation, on-column DNase digestion was performed using RNase-free DNase (Qiagen) to remove DNA contamination. The quantity and quality of the nucleic acids were determined using the NanoDrop and the Qubit fluorometer (Thermo Fisher Scientific, MA, USA). For the validation of gene expression levels, RNA extracted from an additional 20 pairs of tumor and non-tumor tissues from HBV-associated HCC patients was used.

Library preparation and targeted sequencing
Briefly, 100 ng of DNA from tumor or non-tumor tissues was fragmented with a Covaris M220 instrument (Covaris, Woburn, USA). Library preparation and custom target enrichment were performed with the KAPA Library Preparation kit for Illumina platforms (Kapa Biosystems, Wilmington, USA) and the NimbleGen SeqCap EZ Library kit (Roche, Madison, WI, USA), respectively, following the manufacturers' protocols. The captured library was then amplified and sequenced on a HiSeq 2000 (Illumina, San Diego, USA). Library preparation and targeted sequencing were performed by the Centre for Genomic Sciences, The University of Hong Kong.

Targeted sequencing data analysis
Base calling and sequence alignment were performed using the Illumina pipeline (version 1.4) with default parameters [21]. The sequence reads were mapped to the reference human genome (hg19) using the Burrows-Wheeler Aligner (BWA) version 0.6.2 [22]. Optimization of the sequence alignment, variant calling and annotation were performed using the Genome Analysis Toolkit (GATK) version 3.2 [23]. The effects of missense variants and amino acid substitutions on protein function were predicted with four algorithms [SIFT [24], PolyPhen2 [25], Mutation Taster [26] and LRT [27]].

Mutation verification by Sanger sequencing
All significant non-synonymous mutations were validated by Sanger sequencing. Primer pairs were designed to amplify the target sites using Primer 3 software (http://bioinfo.ut.ee/primer3/) (Additional file 2: Table S2). Purified PCR products containing the potential variants were sequenced using the ABI 3730 DNA Analyzer (Applied Biosystems, Foster City, CA) to confirm the variants identified by targeted sequencing.

Cell culture
The human liver cancer cell lines (SNU-387, Huh7, HepG2, HepG2.2.15 and Hep3B) were obtained from the American Type Culture Collection (Manassas, VA, USA). The normal liver cell line L02 was obtained from the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences. All cell lines were kept within 10 passages and were tested for mycoplasma contamination using a PCR method [28]. Cells were maintained in RPMI-1640 medium with 10% fetal bovine serum (Thermo Fisher Scientific) in a humidified incubator with 5% CO2 at 37°C.

siRNA knockdown of FAT4
Transfection was performed with Lipofectamine 3000 reagent (Invitrogen) following the manufacturer's protocol. Briefly, SNU-387 cells were seeded in plates one day before transfection to ensure suitable cell confluency on the day of transfection. Ambion® Silencer® Select pre-designed siRNAs targeting FAT4 (Invitrogen) were used at a final concentration of 50 nM, and siRNA with non-specific sequences was used as a control. Cells were harvested at day 2 post-transfection, or as indicated.

Cell growth and proliferation analysis
SNU-387 cells were cultured in 12-well plates at about 5 × 10^4 cells per well for the cell growth assay and in 96-well plates at about 5 × 10^3 cells per well for the cell proliferation assay. Cells were transfected with siRNA targeting FAT4 or control siRNA for 24, 48 or 72 h. Cells were observed under a phase contrast microscope for changes in morphology and cell number at the designated times. For cell growth analysis, cells were trypsinized and diluted 1:1 with 0.4% trypan blue (Sigma), and viable cells were counted with a hemocytometer (Sigma). For the cell proliferation assay, 10 μl of Cell Counting Kit-8 solution was added to each well containing 100 μl of culture medium and incubated for 2 h at 37°C. The optical density of each well was measured as absorbance at 450 nm in a microplate reader. Experiments were performed in duplicate.
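As an illustration of the viable cell count described above, the following minimal sketch converts hemocytometer counts into a cell concentration. It assumes a standard chamber in which each large square holds 0.1 μl (hence the factor of 10^4 per ml) and a dilution factor of 2 for the 1:1 trypan blue mix; the function name and the example counts are hypothetical and are not taken from the study.

```python
def viable_cells_per_ml(counts_per_square, dilution_factor=2.0):
    """Estimate viable cells per ml from hemocytometer counts.

    counts_per_square: unstained (viable) cell counts, one per large square.
    dilution_factor: 2.0 for a 1:1 mix with trypan blue (assumed here).
    Assumes each large square holds 0.1 microliter, i.e. 1e4 squares per ml.
    """
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor * 1e4


# Hypothetical counts from four large squares
concentration = viable_cells_per_ml([48, 52, 50, 50])
print(f"{concentration:.0f} viable cells per ml")
```

Multiplying the result by the volume of the cell suspension gives the total number of viable cells per well.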

Real-time PCR analysis of gene expression
RNA was extracted from liver cell lines using TRIzol reagent (Thermo Fisher Scientific), following the manufacturer's protocol. RNA concentration and integrity were determined using the NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific). Gene expression was measured by qRT-PCR using SYBR Green PCR master mix (Bio-Rad, Hercules, CA). Gene expression levels were normalized to GAPDH as an internal control gene and expressed relative to adjacent non-tumor samples using the 2^-ΔΔCT method. Primer sequences used for gene amplification are listed in Additional file 2: Table S2.
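The 2^-ΔΔCT calculation can be sketched as follows. This is a minimal illustration, assuming equal and near-100% amplification efficiency for the target and reference genes, which the method requires; the function name and the CT values are hypothetical and do not come from the study data.

```python
def relative_expression(ct_target_tumor, ct_ref_tumor,
                        ct_target_normal, ct_ref_normal):
    """Fold change of a target gene (tumor vs. non-tumor) by 2^-ddCT."""
    # Step 1: normalize the target gene to the reference gene (e.g. GAPDH)
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor
    delta_ct_normal = ct_target_normal - ct_ref_normal
    # Step 2: compare the tumor sample with its non-tumor counterpart
    delta_delta_ct = delta_ct_tumor - delta_ct_normal
    # Step 3: convert the cycle difference into a fold change
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target amplifies 2 cycles later in tumor
fold_change = relative_expression(27.0, 18.0, 25.0, 18.0)
print(fold_change)  # 0.25, i.e. four-fold lower expression in tumor
```

A value below 1 indicates lower expression in the tumor than in the paired non-tumor tissue, and a value above 1 indicates higher expression.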

Western blot analysis
Protein extraction from cultured cells was performed using the Mammalian Cell Lysis Reagent (Thermo Fisher Scientific). Protein concentration was determined by Bradford protein assay (Thermo Fisher Scientific), following the manufacturer's protocol. Equal amounts of total protein were loaded on 7% Tris-acetate polyacrylamide gels, transferred to a PVDF membrane, blocked with 5% milk, and then probed with primary antibodies against FAT4 and p53 (Santa Cruz, CA) or α-tubulin and β-actin (Cell Signaling, MA) overnight at 4°C. Protein expression was assessed using an ECL detection system (GE Healthcare, NJ), and band intensities were quantified using ImageJ software (NIH, Bethesda, MD).

Statistical analysis
Continuous variables were expressed as mean ± standard error of the mean (SEM) and analyzed using Student's t-test. All statistical analyses were performed using GraphPad Prism 5.0 (GraphPad Software, Inc., San Diego, CA). A P value of less than 0.05 was considered statistically significant.
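For readers who wish to reproduce these summary statistics outside GraphPad Prism, a minimal sketch is given below. It assumes an unpaired, two-sample Student's t-test with pooled (equal) variance; the text does not state whether paired or unpaired tests were used, so this choice, the function names and the example values are illustrative assumptions only.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def student_t(group_a, group_b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_stat, df


# Hypothetical relative expression values for two groups
tumor = [1.0, 2.0, 3.0]
non_tumor = [4.0, 5.0, 6.0]
print(mean_sem(tumor))
print(student_t(tumor, non_tumor))
```

The P value is then obtained from the t distribution with the returned degrees of freedom, for example with a statistics library such as SciPy (scipy.stats.ttest_ind).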

Patient characteristics and sequencing quality
The clinical characteristics of the HBV patients with HCC are shown in Additional file 3: Table S3. Coding regions and selected regulatory and intronic regions of the six cancer-related genes (Additional file 1: Table S1) were successfully sequenced using the targeted sequencing method. After filtering out reads with low sequence quality or sequencing adaptor contamination, we obtained a total of > 1.0 Mb of high-quality reads per sample (Additional file 4: Table S4). More than 99.9% of the clean reads could be uniquely mapped to the human reference genome hg19, achieving a mean on-target coverage of 600x in all 16 samples. This high coverage (> 98.6% of targeted regions at ≥8x) allows highly reliable detection of variants in the targeted regions (Additional file 4: Table S4).
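The two coverage metrics quoted above (mean on-target depth and the fraction of targeted bases covered at ≥8x) can be computed from per-base depths as in the following minimal sketch. The function name and the toy depth values are hypothetical; in practice the depths would come from the aligned reads over the capture targets.

```python
def coverage_summary(depths, min_depth=8):
    """Return (mean depth, fraction of bases with depth >= min_depth)."""
    n_bases = len(depths)
    mean_depth = sum(depths) / n_bases
    fraction_covered = sum(1 for d in depths if d >= min_depth) / n_bases
    return mean_depth, fraction_covered


# Toy per-base depths for a four-base target region
mean_depth, fraction_covered = coverage_summary([0, 8, 600, 1192])
print(mean_depth)        # 450.0
print(fraction_covered)  # 0.75
```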

Non-synonymous mutations
Thirteen non-synonymous mutations were identified, of which 9 (69%) were found in FAT4 and 4 (31%) were found in TP53 (Figs. 1 and 2). Non-synonymous mutations were not found in PIK3CA, IRF2, HNF4α or ARID1A. Further in silico analysis using four algorithms [SIFT [24], PolyPhen2 [25], Mutation Taster [26] and LRT [27]] predicted that 12 of the 13 non-synonymous mutations might cause significant changes in protein structure and hence were potentially deleterious or damaging to protein function (Additional file 5: Table S5). Of the 9 non-synonymous mutations identified in FAT4, 8 (all except S3873N) were predicted to be deleterious to protein function (Additional file 5: Table S5). Six of these non-synonymous mutations were located in the cadherin domains (6/9, 66.7%) and annotated in the COSMIC database, while the other three were located in the C-terminal end (3/9, 33.3%) (Fig. 2b). The remaining four non-synonymous mutations were present in the TP53 gene (4/13, 31%) (Fig. 1). P72R is located in the proline-rich region, and the other three mutations, Y220S, R249S and P250R, are located in the DNA binding domain (Fig. 2a). These latter 3 mutations were predicted to be "deleterious" by all four prediction algorithms and were detected only in tumor tissues and not in their adjacent non-tumor controls (Fig. 1b and Additional file 5: Table S5). Notably, the hot spot R249S mutation was detected in 3/8 (37.5%) patients. Additionally, some genetic variants in FAT4 and TP53 were present in both tumor and adjacent non-tumor tissues, with tumor tissues generally harboring a higher allelic frequency of mutations (Fig. 1b).
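The proportions reported in this section follow directly from the mutation counts, as the short check below illustrates. It uses only the counts stated in the text; the helper name is hypothetical.

```python
def percent(part, whole, ndigits=0):
    """Percentage of `part` in `whole`, rounded to `ndigits` decimals."""
    return round(100 * part / whole, ndigits)


# Non-synonymous mutation counts per gene, as stated in the text
counts = {"FAT4": 9, "TP53": 4}
total = sum(counts.values())  # 13 non-synonymous mutations in total

for gene, n in counts.items():
    print(gene, n, percent(n, total))  # FAT4 9 69.0, TP53 4 31.0

print(percent(6, 9, 1))  # FAT4 mutations in cadherin domains: 66.7
print(percent(3, 9, 1))  # FAT4 mutations in the C-terminal end: 33.3
print(percent(3, 8, 1))  # patients carrying TP53 R249S: 37.5
```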

Confirmation of the identified variants by Sanger sequencing
Sanger sequencing was performed to confirm the 13 non-synonymous genetic variants in FAT4 and TP53 identified by targeted sequencing, 12 of which were predicted to have disease-causing potential (Fig. 1b). All of these variants were validated by Sanger sequencing in the same samples in which they had been identified by targeted sequencing (Fig. 2 and Additional file 8: Figure S1). The results showed complete consistency between the two methods, suggesting that the targeted sequencing method used in this study is highly accurate, with no false positives among the variants tested.

Downregulation of FAT4 and TP53 in liver tumor tissues and cell-lines
To further examine the biological significance of FAT4 and TP53 in HCC, we studied FAT4 and TP53 expression in a total of 28 pairs of tumor and non-tumor tissues and in six liver cell lines. As shown in Fig. 3a, the mRNA expression levels of FAT4 and TP53 were significantly downregulated in tumor tissues compared with their adjacent non-tumor counterparts (P < 0.001 and P < 0.01, respectively). Western blot analysis also revealed that FAT4 and p53 protein levels were lower in tumor tissues than in the corresponding non-tumor tissues (Fig. 3b). FAT4 mRNA expression levels were also significantly reduced in four liver cancer cell lines (Hep3B, HepG2, HepG2.2.15 and Huh7) compared with the normal cell line L02 (P < 0.001) (Fig. 3c). Similar results were obtained by Western blot analysis of FAT4 protein expression (Fig. 3d). These data suggest that FAT4 is repressed during hepatocarcinogenesis.

siRNA knockdown of FAT4 promotes cell growth and proliferation
The tumor suppressor role of TP53 is well characterized in HCC, but that of FAT4 is not. We therefore explored the functional role of FAT4 in liver cancer cells using siRNA-mediated knockdown of FAT4 expression. The HBV-associated HCC cell line SNU-387 was chosen for the knockdown experiment because FAT4 expression in SNU-387 cells was higher than in the other 4 cancer cell lines, which makes it a better candidate for studying the effect of FAT4 knockdown. The efficiency of siRNA-mediated FAT4 knockdown was confirmed by a reduction in both mRNA and protein levels compared with cells transfected with control siRNA (both P < 0.0001) (Fig. 4a). We next studied the effect of FAT4 knockdown on cell growth and proliferation. As shown in Fig. 4b, cell proliferation was significantly enhanced after 48 and 72 h in cells transfected with FAT4 siRNA compared with cells transfected with control siRNA (both P < 0.0001). Similarly, cell growth was significantly increased by 33 and 24% in cells transfected with FAT4 siRNA at 48 h (P < 0.0001) and 72 h (P < 0.01), respectively, compared with cells transfected with control siRNA (Fig. 4c). Morphological changes were also observed: FAT4 siRNA-transfected cells, but not control siRNA-transfected cells, showed rapid growth without contact inhibition and formed clonal populations at all three time points (Fig. 4d). Taken together, knockdown of FAT4 promotes cell growth and proliferation, indicating a putative tumor suppressor role of FAT4 in HCC.

Discussion
In this study, we applied targeted sequencing to screen for genetic variants in HBV-related HCC samples. As expected, genetic variants were identified in all six cancer-related genes. We identified several previously established, high-likelihood genetic variants, with either known or unknown biological significance. We focused on the FAT4 and TP53 genes, as both showed frequent non-synonymous mutations in our targeted sequencing cohort. Our in silico analysis also predicted that most of these mutations were likely to have deleterious effects on protein function, implying their involvement in HCC development in chronic hepatitis B.
FAT4 belongs to the cadherin gene superfamily and encodes a transmembrane protein homologous to the tumor suppressor fat in Drosophila [29]. The highest numbers of synonymous and non-synonymous mutations were found in FAT4, suggesting its likely involvement in hepatocarcinogenesis. Non-synonymous mutations that result in amino acid changes in FAT4 have been reported in several cancers including colon, gastric, esophageal and liver cancers [18,[30][31][32]. In a study investigating somatic mutations in an individual patient with multifocal HCC, Shi et al. also identified consistent FAT4 mutations in different tumor loci within the same patient [31]. In our study, six non-synonymous FAT4 mutations with predicted deleterious effects on protein function were already annotated in the COSMIC database, indicating the importance of these 6 somatic mutations in cancer development. The P4972S mutation identified in this study, although not annotated in the COSMIC database, has been predicted to influence an exonic splicing enhancer or silencer and to result in disequilibrium between different isoforms of FAT4 [30]. Our study also identified a potentially novel FAT4 mutation, A4977T, which has not been reported in HCC; its significance for HCC development deserves further investigation.
Unlike non-synonymous mutations, synonymous mutations change the sequence of a gene without altering the sequence of the encoded protein and are thus generally termed silent mutations. However, the prevalent view that synonymous mutations are silent is changing, with recent evidence indicating that synonymous mutations frequently alter exonic splicing motifs and affect mRNA splicing [33]. Moreover, genome-wide association studies (GWAS) of genetic variants and disease have revealed a substantial contribution of synonymous SNPs to human disease risk and other complex traits [34]. This implies that the higher number of synonymous mutations identified in FAT4 might also contribute to HCC risk. Taken together, our data reiterate the likely involvement of frequent FAT4 mutations in HBV-associated HCC. We believe that further functional characterization of both synonymous and non-synonymous mutations in FAT4 will provide a better understanding of its biological relevance in hepatocellular carcinogenesis.
Expression and functional analyses indicated that FAT4 was downregulated in tumor tissues and that loss of FAT4 induced HCC cell growth and proliferation. These findings are consistent with previous reports suggesting a tumor suppressor role of FAT4 in human cancers [18,35]. However, knowledge about the exact functional role of FAT4 in HCC and its involvement in downstream signaling activation is still scarce. Thus, further delineation of the functional role of FAT4 as an HCC candidate gene, especially using in vivo animal models, is warranted.
There is a strong association between TP53 mutations and HCC [36]. Our findings also revealed frequent non-synonymous TP53 mutations with predicted disease-causing effects in HCC. The P72R mutation in the proline-rich region was reported to affect the structure of the putative SH3-binding domain [37]. The Y220S and R249S mutations have been shown to disrupt the transactivation activity of TP53 according to the International Agency for Research on Cancer (IARC) TP53 database. Notably, we detected a high frequency of the hot spot R249S mutation in tumor tissues. This finding is consistent with the reported occurrence of R249S in > 30% of HCC cases in geographical areas of high HCC incidence [38]. The R249S mutation is induced by aflatoxin metabolites, and this mutant TP53 can interact with HBx, leading to cell proliferation, suggesting that the R249S mutation is an early mutational event in hepatocarcinogenesis [39,40]. Of note, P250R is a novel genetic variant that was predicted to be deleterious by all four prediction algorithms and was not reported in any reference database. It resides in the DNA recognition region, in which an amino acid change could affect the DNA binding ability of TP53 and therefore its associated transcriptional function. Our data further emphasize the importance of TP53 mutation in HBV-related HCC. The pathological link between genetic alterations leading to the loss of TP53 function and the initiation and progression of HCC of different etiologies warrants further confirmation in larger studies in order to customize treatment with targeted therapies.
In this study, the PIK3CA, IRF2, ARID1A and HNF4α genes harbored mainly indel mutations in non-coding regions. According to previous studies, the mutation rate of PIK3CA in HCC is controversial, with no mutations detected in a study from Japan, whereas a high mutation rate of 35.6% was reported in a study from Korea [14][15][16]. In the present study, we detected only a high frequency of indel mutations in non-coding regions of the PIK3CA gene. The discrepancies in the rates of PIK3CA mutation are likely due to a number of factors, including the specific exons that were sequenced, geographical variation and the methods used for sample storage and DNA extraction. Thus, the importance of PIK3CA mutation and its implications in HCC tumorigenesis need further investigation. In addition, we observed a low frequency of indel mutations in IRF2 and a relatively high mutation rate in the TP53 gene. Regarding mutations in the IRF2, ARID1A and HNF4α genes, our findings are in line with other reports showing low mutation rates in HBV-related HCC [41,42], suggesting that these genes may not be involved in HBV-related HCC tumorigenesis.
One limitation of this study is the small sample size. Therefore, future studies with a larger cohort that includes healthy controls as well as HCC patients without HBV infection should be performed. With a larger cohort, analysis of the co-mutational profile of the six chosen genes together with other well-known HCC-related genes such as β-catenin would further delineate the role of these genes in HCC. Finally, to understand the functional role of FAT4 in HCC tumorigenesis, an in-depth analysis of gene expression in a larger cohort and in animal models could provide deeper insight into the biological significance of FAT4 in HCC.