Comprehensive mutation detection of BRCA1/2 genes reveals large genomic rearrangements contribute to hereditary breast and ovarian cancer in Chinese women

Background: Mutations in the BRCA1/2 genes are associated with hereditary breast and ovarian cancer (HBOC). Most BRCA1/2 pathogenic variants identified so far are single nucleotide variants (SNVs) or insertions/deletions (Indels). However, large genomic rearrangements (LGRs), such as copy number variants (CNVs), also play an important role in HBOC predisposition. Their frequency and spectrum have been well studied in western populations but remain largely unknown in the Chinese population. Methods: Peripheral blood samples were collected from 218 unrelated familial breast and/or ovarian cancer (FBOC) patients living in Eastern China. PCR-based Sanger sequencing and panel-based next-generation sequencing (NGS) were performed to detect pathogenic SNVs and Indels in the BRCA1/2 genes. For patients lacking small pathogenic variants, a multiplex ligation-dependent probe amplification (MLPA) assay was conducted to screen for LGRs. Results: In total, we identified 44 samples (20.2%) carrying small pathogenic variants (26 in BRCA1 and 18 in BRCA2). Among the remaining 174 samples, five carried novel deleterious LGRs in BRCA1: exon5-7dup (1 patient), exon13-14dup (2 patients), and exon1-22del (2 patients). No LGR was found in BRCA2. Overall, LGRs accounted for 16.1% (5/31) of BRCA1 pathogenic variants and were detected in 2.3% (5/218) of all FBOC patients. Conclusions: LGR variants in the BRCA1 gene play a significant role in Chinese HBOC patients. MLPA or similar LGR-detecting methods should be recommended along with nucleotide sequencing as the initial screening approach for Chinese women with HBOC. Electronic supplementary material: The online version of this article (10.1186/s12885-019-5765-3) contains supplementary material, which is available to authorized users.


Background
According to the National Central Cancer Registry of China, breast cancer ranks first in cancer incidence and sixth in cancer-associated death among Chinese women, with over 250,000 newly diagnosed cases and 70,000 breast cancer-associated deaths in 2015 [1]. The average age of onset of breast cancer is 45-55 years in Chinese women, which is younger than that observed in Caucasian women [2]. While the majority of breast cancer cases are sporadic, patients with a family history or other risk factors such as early age of onset are frequently observed in the clinic, suggesting an important role for genetic factors in disease development. Indeed, germline pathogenic variants in the two major breast cancer susceptibility genes, BRCA1/2, have been detected in Chinese patients [3][4][5][6][7][8][9].
Studying BRCA1/2 pathogenic variants requires accurate and comprehensive testing methods. Short-read DNA sequencing methods, including both Sanger and next-generation sequencing (NGS), reliably detect only small variants such as single nucleotide variants (SNVs) and insertions/deletions (Indels); they are not suitable for detecting large genomic rearrangements (LGRs), which involve deletions or duplications of multiple exons [i.e. copy number variants (CNVs)]. Therefore, sequencing alone may underestimate the frequency of pathogenic variants. Southern blotting can be used to detect LGRs [10], but it is labor intensive and generally low-throughput. SNP or CGH arrays can detect copy number variants, but their unit cost is high and their resolution is usually no finer than hundreds of kb. Several multiplex PCR-based techniques have recently been developed to achieve higher throughput and cost efficiency. For instance, the multiplex ligation-dependent probe amplification (MLPA) assay and multiplex amplicon quantification (MAQ) have been developed as fast and reproducible methods for CNV detection [11]. At present, MLPA remains the most commonly used method for LGR detection, and has detected 82.7% and 53% of LGRs in BRCA1 and BRCA2, respectively [12].
To date, BRCA1/2 LGR studies have mostly been conducted in western countries and show that prevalence varies with ethnicity and geography. For example, no BRCA1/2 LGR variants were detected in Ashkenazi Jewish familial breast cancer patients [13,14], whereas in non-Ashkenazi Jewish patients the frequency of LGRs was 6% [14]. Very limited research has been conducted in Chinese populations, and only 12 BRCA1/2 LGRs have been reported so far, in studies conducted in Hong Kong [15], Singapore [16] and Malaysia [17]. The frequency and spectrum of BRCA1/2 LGRs in familial breast cancer patients from mainland China remain largely unknown.

Patient subjects
A total of 218 unrelated familial breast cancer patients were enrolled in this study between 2008 and 2017. All patients were diagnosed at Zhejiang Cancer Hospital in Eastern China and had a family history of at least one first- or second-degree relative affected with breast and/or ovarian cancer, regardless of age. Peripheral blood samples from the patients were collected in EDTA tubes and stored at −80 °C. SNVs and Indels in BRCA1/2 were first determined for all patients by sequencing (PCR-based Sanger sequencing and panel-based NGS). Patients with negative sequencing findings were further screened for LGRs by MLPA. Written informed consent was obtained from all participating patients prior to clinical data and peripheral blood collection. This study was approved by the Research and Ethical Committee of Zhejiang Cancer Hospital, China. All experiments were performed in accordance with the approved guidelines.

DNA extraction
Genomic DNA was extracted from peripheral blood samples using the QIAamp DNA Blood Mini kit (Qiagen, Hilden, Germany) following the manufacturer's manual. DNA purity and concentration were measured with a NanoDrop 2000 Spectrophotometer and a Qubit 3.0 (Thermo Fisher Scientific, Waltham, USA), and DNA integrity was determined by agarose gel electrophoresis.

Nucleotide sequencing
The present study used both PCR-based Sanger sequencing and panel-based NGS to interrogate small nucleotide variants, including SNVs and Indels. In the first phase of the project, Sanger sequencing was performed on 133 unrelated FBOC cases using a total of 72 primer pairs covering all coding exons and intron-exon boundaries of BRCA1/2. The primer sequences are listed in Additional file 1: Table S1 and Additional file 2: Table S2. In the second phase of the project, to achieve high-throughput and cost-effective sequencing, we designed an NGS panel based on the NEBNext Direct sequencing technology developed by New England Biolabs (Ipswich, MA). The panel contains the BRCA1/2 genes as well as 96 other known cancer risk-associated genes. We performed panel NGS on all 133 Sanger-sequenced cases along with 85 newly collected cases. Individually prepared libraries were pooled for HiSeq X sequencing (Illumina, CA, USA) to achieve a minimum 500x mean coverage for the genes included in the panel. Raw FASTQ data were run through an in-house bioinformatic pipeline, and variant calls were generated for the BRCA1/2 genes. Variant filtering and final interpretation followed the ACMG Standards and Guidelines for the Interpretation of Sequence Variants [18] and were based on a set of criteria such as allele frequency, as well as information from clinical genome databases including ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), Online Mendelian Inheritance in Man (OMIM) (http://www.omim.org/) and the Human Gene Mutation Database (HGMD) (http://www.hgmd.cf.ac.uk/ac/index.php).

BRCA1/2 LGR screening
LGRs were screened by multiplex ligation-dependent probe amplification (MLPA) using the SALSA P002 kit and P045 kit for the BRCA1 and BRCA2 genes, respectively (MRC-Holland, Amsterdam, the Netherlands). The MLPA reactions were performed according to the manufacturer's instructions. Five normal control samples were included as references in each MLPA run. Fragment analysis of the PCR products was performed on an ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA). The data were analyzed using the Coffalyser software v.9 (MRC-Holland, Amsterdam, the Netherlands). All peak heights were normalized, and a ratio between 0.7 and 1.3 was considered normal. A ratio ≤0.7 or ≥1.3 was taken as the threshold suggestive of a deletion or duplication, respectively. All patients with a ratio ≤0.7 or ≥1.3 were confirmed by independent experiments. Two primers were designed to validate the BRCA1 exon 5-7 duplication. The forward primer sequence was CCGTGCCAAAAGACTTCTACA (exon 7) and the reverse primer sequence was TTGCTTCCAACCTAGCATCA (exon 5). Long-range PCR amplification was performed with Takara LA Taq DNA polymerase (Takara Bio, USA) following the manufacturer's manual. The amplified product was run on a 0.8% agarose gel with ethidium bromide (EB) and visualized under UV light. The purified amplicons were subjected to Sanger sequencing to confirm amplification fidelity.
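The ratio thresholds described above can be illustrated with a minimal Python sketch. It is a simplified stand-in and not the algorithm used by Coffalyser: the function names, the use of control probes for within-sample normalization, and the peak heights are all hypothetical. The source lists 0.7 and 1.3 both as limits of the normal range and as abnormal cut-offs, so the sketch assumes the boundary values are flagged as abnormal.

```python
from statistics import mean

def dosage_ratios(sample_peaks, reference_runs, control_probes):
    """Per-probe dosage ratios of one sample against normal reference samples.

    sample_peaks: dict mapping probe name -> peak height (hypothetical input)
    reference_runs: list of such dicts, one per normal control sample
    control_probes: probes assumed copy-neutral, used to scale each sample
    """
    def normalise(peaks):
        # Scale every peak by the mean height of the control probes.
        scale = mean(peaks[p] for p in control_probes)
        return {probe: height / scale for probe, height in peaks.items()}

    sample = normalise(sample_peaks)
    refs = [normalise(run) for run in reference_runs]
    # Divide the sample's normalized peak by the mean reference peak.
    return {p: sample[p] / mean(r[p] for r in refs) for p in sample}

def classify(ratio, low=0.7, high=1.3):
    """Apply the thresholds from the text (boundaries treated as abnormal)."""
    if ratio <= low:
        return "deletion"
    if ratio >= high:
        return "duplication"
    return "normal"

# Hypothetical peak heights: five normal controls and one test sample.
normal = {"exon13": 1000, "exon14": 1000, "ctrl1": 1000, "ctrl2": 1000}
references = [dict(normal) for _ in range(5)]
patient = {"exon13": 1500, "exon14": 1500, "ctrl1": 1000, "ctrl2": 1000}

ratios = dosage_ratios(patient, references, ["ctrl1", "ctrl2"])
calls = {probe: classify(r) for probe, r in ratios.items()}
print(calls)
```

In this made-up example the exon 13 and exon 14 probes give a ratio of 1.5 and are called as duplications, while the control probes stay at 1.0 and are called normal.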

Small pathogenic variants in BRCA1/2 genes
Overall, we identified a total of 31 BRCA1 or BRCA2 pathogenic SNVs and Indels in 44 unrelated patients by combining Sanger sequencing and the 98-gene panel NGS assay. Table 1 lists all of these small variants. In summary, about 59% (26 of 44) of these patients had BRCA1 pathogenic variants and 41% (18 of 44) had BRCA2 pathogenic variants. Two recurrent pathogenic variants in BRCA1 (c.5154G>A and c.5468-1_5474delGCAATTGG) have been reported as putative founder mutations [19]. In total, the frequency of BRCA1/2 small pathogenic variants was 20.2% (44/218) in the studied cohort.
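As a quick check, the proportions above can be recomputed from the reported counts. The short Python sketch below uses only the numbers given in the text; the helper name `pct` is an illustrative choice.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

total_patients = 218   # all enrolled FBOC patients
carriers = 44          # patients with a small pathogenic variant
brca1_carriers = 26
brca2_carriers = 18

print(pct(carriers, total_patients))   # 20.2
print(pct(brca1_carriers, carriers))   # 59.1
print(pct(brca2_carriers, carriers))   # 40.9
```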

Novel LGRs identified in BRCA1
Among the 174 patients lacking BRCA1/2 small pathogenic variants, three unique BRCA1 LGRs were detected in 5 cases (2.9%) by the MLPA assay (Fig. 1a-d). These include one case with exon5-7dup, two cases with exon13-14dup, and two cases with exon1-22del (Table 2). To our knowledge, these three LGRs have not previously been reported in Chinese HBOC patients. To confirm the MLPA results, we validated exon5-7dup by designing primers flanking the putative junction. We obtained a clear and strong 6-8 kb PCR amplicon (Fig. 1e), whose sequence identity was confirmed by Sanger sequencing (data not shown), supporting a tandem duplication event.
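The logic behind this validation can be sketched in Python. With the forward primer in exon 7 and the reverse primer in exon 5, the primers point away from each other on a normal allele, so a product is expected only when a copy of exon 5 lies downstream of exon 7, as in a tandem duplication. This is an illustrative model, not the authors' analysis: the function name and exon lists are hypothetical, and it assumes the forward primer extends downstream and the reverse primer upstream, ignoring amplicon length limits and inverted duplications.

```python
def amplifiable(exon_order, forward_exon, reverse_exon):
    """Return True if some forward-primer site lies upstream of a
    reverse-primer site, so the two primers face each other."""
    forward_sites = [i for i, e in enumerate(exon_order) if e == forward_exon]
    reverse_sites = [i for i, e in enumerate(exon_order) if e == reverse_exon]
    return any(f < r for f in forward_sites for r in reverse_sites)

# Simplified exon order around the region of interest (hypothetical model).
wild_type = [4, 5, 6, 7, 8]
tandem_dup = [4, 5, 6, 7, 5, 6, 7, 8]   # exons 5-7 duplicated in tandem

print(amplifiable(wild_type, forward_exon=7, reverse_exon=5))   # False
print(amplifiable(tandem_dup, forward_exon=7, reverse_exon=5))  # True
```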