Characterization of BRCA1 and BRCA2 variants in multi-ethnic Asian cohort from a Malaysian case-control study

Background: Genetic testing for BRCA1 and BRCA2 has led to the accurate identification of individuals at higher risk of cancer and the development of new therapies. Approximately 10-20% of genetic tests for BRCA1 and BRCA2 identify variants of uncertain significance (VUS), with higher proportions in Asians. We investigated the functional significance of 7 BRCA1 and 25 BRCA2 variants in a multi-ethnic Asian cohort using a case-control approach. Methods: MassARRAY genotyping was conducted in 1,394 Chinese, 406 Malay and 310 Indian breast cancer cases and 1,071 Chinese, 167 Malay and 255 Indian healthy controls. The association of each variant with breast cancer risk was analyzed using a logistic regression model adjusted for ethnicity, age and family history. Results: Our study confirmed that BRCA2 p.Ile3412Val is present in >2% of unaffected women and is likely benign, and that BRCA2 p.Ala1996Thr, which is predicted to be likely pathogenic by in-silico models, is present in 2% of healthy Indian women, suggesting that it may not be associated with breast cancer risk. Single-variant analysis suggests that BRCA1 p.Arg762Ser may be associated with breast cancer risk (OR = 7.4; 95% CI, 0.9–62.3; p = 0.06). Conclusions: Our study shows that BRCA2 p.Ile3412Val and p.Ala1996Thr are likely benign and highlights the need for population-specific studies to determine the likely functional significance of population-specific variants. Our study also suggests that BRCA1 p.Arg762Ser may be associated with increased risk of breast cancer, but other methods or larger studies are required to determine a more precise estimate of breast cancer risk. Electronic supplementary material: The online version of this article (doi:10.1186/s12885-017-3099-6) contains supplementary material, which is available to authorized users.


Background
Germline mutations in BRCA1 (MIM 113705) and BRCA2 (MIM 600185) are associated with increased risk of breast and ovarian cancer. The discovery of these germline mutations has led to the accurate identification of individuals who are at risk of cancer and the development of new therapies for the disease. In many countries, individuals with an a priori risk of 10-20% of having inherited a germline mutation in BRCA1 or BRCA2 are offered genetic counselling and testing [1], and on average 17-20% of genetic tests for BRCA1 and BRCA2 detect pathogenic mutations [2,3]. However, approximately 10-20% of tests identify variants of uncertain significance (VUS), which comprise missense variants, intronic variants, synonymous variants and in-frame insertions or deletions [4,5] and for which the clinical relevance remains equivocal.
The frequency of VUS varies by ancestry, with lower frequencies in well-studied populations such as the Caucasian population in North America and Europe, and higher frequencies in populations such as Asian, African and Middle Eastern, where there has been little study and limited availability of genetic counselling and testing [6]. Over time, VUS are systematically reclassified as additional information or evidence becomes available [7,8]. According to a study reported by Myriad Genetics in the United States based on 20 years of experience, the frequency of VUS in BRCA1 and BRCA2 declined from 12.8 to 2.1% [6]. Notably, the frequency of VUS remains highest in Asians, with 7% of tests reported as VUS. Although the pathogenicity of VUS has been studied using different approaches such as multifactorial likelihood models, population frequency, and functional or mRNA splicing assays, the majority of these studies have focused on VUS in the Caucasian population [9][10][11]. In Asia, studies have reported novel variants but the clinical significance of these variants has not been further investigated [12][13][14][15].
In this study, we analyzed the frequency of 7 BRCA1 and 25 BRCA2 variants in an Asian cohort of 2,110 breast cancer patients and 1,493 healthy women, and evaluated the pathogenicity of these variants using a case-control approach.

Study description
The recruitment of breast cancer patients into the Malaysian Breast Cancer Genetic Study (MyBrCa) started in January 2003 at University Malaya Medical Centre, and in September 2012 at Subang Jaya Medical Centre in Kuala Lumpur, Malaysia. All cases were histopathologically confirmed breast carcinoma. Blood samples and demographic and family history data were collected from breast cancer patients who consented to participate in this study.
From January 2003 to March 2014, 2,323 breast cancer patients were recruited into the study. A total of 467 individuals were selected for germline analysis on the basis of age of onset and family history of breast and/or ovarian cancer; 402 of these cases have been previously described [15][16][17][18] and 65 additional cases were tested using the same selection criteria. Detection of germline mutations in the BRCA1 and BRCA2 genes was conducted using direct DNA sequencing and multiplex ligation-dependent probe amplification (MLPA).
Of the 2,323 breast cancer patients, cases were excluded if they were of mixed parentage or of ethnicities other than Chinese, Malay or Indian (n = 87), or had insufficient or low quality genomic DNA (n = 126), leaving a cohort of 2,110 individuals (1,394 Chinese, 406 Malay and 310 Indian) for genotyping. Controls were selected from 1,530 women with no personal history of breast cancer attending an opportunistic mammography screening program from October 2011 to April 2013 [19]. Controls who were of mixed parentage or of ethnicities other than Chinese, Malay or Indian (n = 37) were excluded. The remaining 1,493 individuals (1,071 Chinese, 167 Malay and 255 Indian), all with sufficient genomic DNA, were genotyped.

Selection and genotyping of variants
The germline analysis identified 69 missense and intronic variants of BRCA1 and BRCA2 from 109 individuals. All variants identified from the germline analysis, regardless of their predicted clinical importance, were selected for genotyping to evaluate their frequency using a case-control approach. The variants were annotated according to Human Genome Variation Society (HGVS) recommendations based on the transcript sequences of BRCA1 (NM_007294.3) and BRCA2 (NM_000059.3). Two variants (BRCA1 p.Asp345Tyr and BRCA2 c.632-10dupT) failed assay design because of repetitive nucleotides in the sequence neighboring each variant. The remaining 23 BRCA1 and 44 BRCA2 variants were included in the genotyping assay (Additional file 1: Table S1a and S1b).
Genotyping was conducted using SEQUENOM iPlex multiplex single-base extension assays and analyzed by MALDI-TOF mass spectrometry (SEQUENOM Inc., San Diego, USA). Individuals who were previously analyzed by direct DNA sequencing and MLPA were used as positive and negative controls for genotyping. The genotyping process involved two phases. The Phase 1 assay included 27 variants (9 BRCA1 and 18 BRCA2) and was tested on 879 breast cancer cases recruited from January 2003 to July 2010. In Phase 2, the remaining 40 variants (14 BRCA1 and 26 BRCA2) were added to the assay, which was tested on a non-overlapping cohort of 1,231 breast cancer cases and 1,493 healthy controls recruited from July 2010 to March 2014 (Additional file 1: Table S1a and S1b). Where possible, all cases and controls participating in the research study were included in the genotyping assay. The subsequent variant analyses were performed according to the number of genotyped cases and controls for each variant. Variants with a genotyping call rate of <95% were excluded. Approximately 5% of samples, selected at random, were duplicated in the experiment, and samples that failed genotyping in >20% of the assays were excluded.
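The two quality-control thresholds described above (variant call rate of at least 95%, and sample failure in no more than 20% of assays) can be illustrated with a minimal sketch. The data layout, the function names and the toy genotype table are hypothetical, and the sketch assumes both filters are evaluated on the unfiltered genotype table, since the text does not state the order in which they were applied.

```python
def missing_fraction(calls):
    """Fraction of failed genotype calls; None marks a failed call."""
    return sum(call is None for call in calls) / len(calls)


def qc_filter(genotypes, max_variant_missing=0.05, max_sample_missing=0.20):
    """Apply the variant and sample thresholds to a genotype table.

    genotypes: dict mapping sample id -> dict mapping variant id -> call.
    Returns (kept_samples, kept_variants).
    """
    samples = list(genotypes)
    variants = list(next(iter(genotypes.values())))
    # Exclude samples that failed in more than 20% of the assays.
    kept_samples = [
        s for s in samples
        if missing_fraction([genotypes[s][v] for v in variants]) <= max_sample_missing
    ]
    # Exclude variants whose call rate is below 95% (more than 5% missing).
    kept_variants = [
        v for v in variants
        if missing_fraction([genotypes[s][v] for s in samples]) <= max_variant_missing
    ]
    return kept_samples, kept_variants


# Hypothetical toy table: 20 samples, 5 variants, all calls successful at first.
toy_samples = [f"s{i}" for i in range(20)]
toy_variants = ["v0", "v1", "v2", "v3", "v4"]
toy = {s: {v: "AA" for v in toy_variants} for s in toy_samples}
toy["s0"]["v0"] = None  # s0 fails 2 of 5 assays (40%)
toy["s0"]["v1"] = None
toy["s1"]["v0"] = None  # v0 fails in 2 of 20 samples (call rate 90%)

kept_samples, kept_variants = qc_filter(toy)
```

In this toy table, sample s0 and variant v0 are removed while fully genotyped samples and variants are retained.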

In-silico prediction
The effect of missense variants on protein function was predicted using AGVGD (http://agvgd.hci.utah.edu/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/) and SIFT (http://sift.jcvi.org/). AGVGD is an evolutionary sequence conservation model in which the algorithm evaluates the physicochemical properties of amino acids and multiple sequence alignments in a substituted protein sequence [20]. PolyPhen-2 uses sequence-based and structure-based predictive features to evaluate the damaging effects of missense variants [21]. SIFT predicts the effects of all possible substitutions at each position in the protein sequence by using sequence homology [22].

Statistical analyses
Information on ethnicity, age at diagnosis or consent, and family history of breast or ovarian cancer in first and second degree relatives was obtained from the questionnaires. The association between breast cancer risk and these baseline characteristics was investigated using the t-test for age and the chi-square test for ethnicity and family history. We assessed the relationship between each variant and the risk of breast cancer using a logistic regression model adjusted for ethnicity, age and family history. All statistical analyses for single-variant association testing were performed using Statistical Package for the Social Sciences (SPSS) version 16.0. The R package 'rmeta' (version 2.16; https://cran.r-project.org/web/packages/rmeta/index.html) was used to generate the forest plot of single-variant association in Additional file 2.
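The adjusted single-variant model described above can be written out as follows. This is a sketch under stated assumptions: the text does not specify how the covariates were coded, so ethnicity is assumed here to enter as indicator variables, age as a continuous term and family history as a binary term, and the confidence interval is assumed to be of the Wald type.

```latex
% Y_i: case status (1 = breast cancer case, 0 = control)
% G_i: carrier status for the variant under test (1 = carrier)
% E_i: ethnicity indicators, A_i: age, F_i: family history (assumed coding)
\[
\log\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}
  = \beta_0 + \beta_G\,G_i + \boldsymbol{\beta}_E^{\top}\mathbf{E}_i
    + \beta_A\,A_i + \beta_F\,F_i
\]
\[
\mathrm{OR} = \exp\bigl(\hat\beta_G\bigr), \qquad
95\%\ \mathrm{CI} = \exp\bigl(\hat\beta_G \pm 1.96\,\mathrm{SE}(\hat\beta_G)\bigr)
\]
```

Under this formulation, the odds ratio reported for each variant is the exponentiated coefficient of carrier status, holding the other covariates fixed.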

Results
We determined the spectrum of BRCA1 and BRCA2 deleterious mutations and variants in 467 breast cancer patients by full sequence analysis and large genomic rearrangement analysis. Of these, 69 (14.8%) had germline deleterious mutations and 125 (26.8%) had variants in BRCA1 and BRCA2 genes. In total, 24 BRCA1 and 45 BRCA2 missense and intronic variants were identified from 109 individuals. Of these 69 variants, 67 variants that could be designed for the MassARRAY platform were included in a multiplex genotyping assay (Additional file 1: Table S1a and S1b).
The genotyping assay was tested on 2,110 breast cancer cases and 1,493 healthy controls (Table 1). The majority of the individuals recruited for this study were Chinese (66.1% in cases and 71.7% in controls), followed by Malay (19.2% in cases and 11.2% in controls) and Indian (14.7% in cases and 17.1% in controls). Breast cancer cases were on average slightly younger (49.5 years) than healthy controls (50.3 years). Notably, there was no difference in age between cases and controls for Chinese and Indian women, but among Malay women healthy controls were on average 2 years older than cases (Additional file 1: Table S2). Ethnicity, age and family history of breast or ovarian cancer were significantly associated with breast cancer risk and hence were included as covariates in single-variant association testing.
Genotyping of 2 BRCA1 (c.-19-3A>G and c.-19-10T>C) and 2 BRCA2 (p.His523Arg and p.Arg2502His) variants resulted in genotyping call rates of <95%, and these variants were therefore excluded from the analysis. We also excluded 19 cases and 29 controls from analysis because these samples failed genotyping in >20% of the assays. As a result, 2,091 breast cancer patients and 1,464 healthy controls were analyzed using a case-control approach. The genotyping cohort also included 120 individuals who were previously analyzed by germline analysis. Genotyping was concordant with sequencing in 119 of these 120 individuals (99.2%). Approximately 5% of samples, selected at random (88 cases and 67 controls), were duplicated in the genotyping assay and the concordance rate among the duplicated samples was 98.7% (153/155).
We identified BRCA2 p.Ile3412Val with variant frequency >2% in unaffected women and an additional four variants (BRCA2 p.Cys315Ser, p.Ile1929Val, p.Arg2108Cys and p.Lys2729Asn) with variant frequency >1% in unaffected women (Table 3). All of these are unlikely to be associated with increased risk of breast cancer.
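The cohort composition and quality-control figures reported in the two paragraphs above can be reproduced from the stated counts. The short sketch below uses only numbers given in the text; the variable and function names are illustrative.

```python
# Genotyped cohort by ethnicity, as reported in the text.
cases = {"Chinese": 1394, "Malay": 406, "Indian": 310}
controls = {"Chinese": 1071, "Malay": 167, "Indian": 255}


def percent(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


total_cases = sum(cases.values())        # 2,110 genotyped cases
total_controls = sum(controls.values())  # 1,493 genotyped controls

case_shares = {k: percent(v, total_cases) for k, v in cases.items()}
control_shares = {k: percent(v, total_controls) for k, v in controls.items()}

# Samples removed for failing genotyping in >20% of the assays.
analyzed_cases = total_cases - 19
analyzed_controls = total_controls - 29

# Variants removed for a call rate below 95% (2 BRCA1 and 2 BRCA2).
variants_analyzed = (23 + 44) - (2 + 2)

# Concordance with prior sequencing and among duplicated samples.
sequencing_concordance = percent(119, 120)
duplicate_concordance = percent(153, 88 + 67)
```

The computed shares and concordance rates agree with the percentages quoted in the text to one decimal place.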
Of the 63 variants that were included in the analysis, 31 variants (14 BRCA1 and 17 BRCA2) could not be evaluated because carriers were present only in breast cancer cases or only in healthy controls. The remaining 32 variants (7 BRCA1 and 25 BRCA2) were present in both cases and controls and were analyzed for association with breast cancer risk (Tables 2 and 3). In the single-variant association testing using logistic regression, BRCA1 p.Arg762Ser was associated with breast cancer risk with marginal significance (OR = 7.4; 95% CI, 0.9-62.3; p = 0.06) (Additional file 2: Figure S1). This variant was found in 5 out of 858 Chinese breast cancer patients (0.6%) and 1 out of 1,054 Chinese controls (0.1%) (Additional file 1: Table S3), and was marginally associated with breast cancer risk in Chinese women (OR = 6.7; 95% CI, 0.8-57.6; p = 0.08) (Additional file 2: Figure S2). The average age at diagnosis was 39 years and 25% of women had estrogen receptor (ER) negative breast cancer, but none reported any family history of breast or ovarian cancer. BRCA1 p.Pro346Ser was more frequent in cases than in controls among Chinese women [5 out of 859 Chinese breast cancer patients (0.6%) and 2 out of 1,055 Chinese controls (0.2%); OR = 3.3; 95% CI, 0.6-17.3; p = 0.15] (Additional file 1: Table S3a and Additional file 2: Figure S2), but the association was not statistically significant. The average age at diagnosis was 62 years and 20% of women had ER negative breast cancer, but none reported any family history of breast or ovarian cancer. None of the BRCA2 variants was significantly associated with breast cancer risk, either in the overall cohort or when stratified by ethnicity (Additional file 2: Figure S1 and S2).
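As a rough cross-check of the carrier counts quoted above for Chinese women, a crude (unadjusted) odds ratio with a Wald 95% confidence interval can be computed from the 2x2 table. This is not the adjusted logistic regression used in the study, so the values differ somewhat from the reported estimates; the function name is illustrative.

```python
import math


def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from carrier counts."""
    a = case_carriers                      # carriers among cases
    b = case_total - case_carriers         # non-carriers among cases
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Carrier counts in Chinese cases and controls, taken from the text.
arg762ser = crude_odds_ratio(5, 858, 1, 1054)   # BRCA1 p.Arg762Ser
pro346ser = crude_odds_ratio(5, 859, 2, 1055)   # BRCA1 p.Pro346Ser

for name, (or_, lo, hi) in [("p.Arg762Ser", arg762ser), ("p.Pro346Ser", pro346ser)]:
    print(f"{name}: crude OR = {or_:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```

For both variants the crude interval spans 1, which is consistent with the marginal or non-significant p-values reported for the adjusted analysis.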
The probability that missense variants were deleterious to protein function was assessed by three in-silico models, namely AGVGD, PolyPhen-2 and SIFT. Of the 29 missense variants that have been analyzed, three (BRCA2 p.Ala1996Thr, p.Gly2901Asp and p.Tyr3035Cys) were predicted to be likely pathogenic (Tables 2 and 3).

Discussion
In this study, we analyzed the frequency of 7 BRCA1 and 25 BRCA2 variants from exonic and intronic regions identified previously in Malaysian breast cancer patients by germline analysis. The genotyping of variants was conducted using a high-throughput mass spectrometry platform [18].
The variant frequency suggested that one of the 63 tested variants has a minor allele frequency of >2% in unaffected women and is likely to be benign [6]. Four variants that had a variant frequency of more than 1% in unaffected women could be potentially benign. Of these variants, three (BRCA2 p.Ile1929Val, p.Lys2729Asn and p.Ile3412Val) have been classified as Class 1 (not pathogenic) and one (BRCA2 p.Arg2108Cys) has been classified as Class 2 (likely not pathogenic) in either the Breast Cancer Information Core Database (http://research.nhgri.nih.gov/bic/) or the ARUP Laboratories BRCA Mutation Database (http://arup.utah.edu/database/BRCA/) using different approaches [20,23]. Moreover, BRCA2 p.Ile3412Val was also reported by ENIGMA (http://enigmaconsortium.org/) as a benign variant occurring in a non-founder African control reference group at an allele frequency ≥1%. Although BRCA2 p.Lys2729Asn was predicted to have a damaging effect by PolyPhen-2 and SIFT, prediction models have limited ability to predict the actual consequences of a missense mutation accurately [24]; therefore, the population frequency supersedes the prediction models [8]. These findings are in accordance with the results of our study, which concluded that these variants are benign variants or polymorphisms.
The clinical significance of BRCA2 p.Cys315Ser and p.Arg2108Cys is currently listed as uncertain, but our study suggests that these variants are likely to be benign. This is consistent with a study in Chinese women from Shanghai in which BRCA2 p.Cys315Ser was detected in 1.4% of cases and 0.9% of controls, compared with 0.9% of cases and 1.2% of controls in our study [12]. Notably, BRCA2 p.Arg2108Cys was evaluated as pathogenic in a spontaneous homologous recombination assay [25], but our study suggests that this variant is unlikely to be associated with high risk of breast cancer.
It has been estimated that rare variants with a relative risk above 2 and above 4 confer moderate and high risk of breast cancer, respectively [26,27]. Our study suggests that one BRCA1 variant, p.Arg762Ser, may be associated with increased risk of breast cancer, with marginal significance (p = 0.06). This variant was previously found in Chinese and Malay women and its clinical significance is currently unknown [12,28,29]. Although the in-silico analyses predicted that the amino acid substitution of this variant is unlikely to have a damaging effect on protein function, our study suggests that this variant warrants further analysis in Asian women.
Notably, three BRCA2 variants (p.Ala1996Thr, p.Gly2901Asp and p.Tyr3035Cys) that were predicted to be pathogenic by in-silico prediction models were found in both cases and healthy controls. The clinical significance of BRCA2 p.Ala1996Thr is currently listed as uncertain, and the significance of a substitution to valine at the same codon (p.Ala1996Val), reported in a Western European woman, is also uncertain [7]. BRCA2 p.Gly2901Asp was suggested to be neutral in a mouse embryonic stem cell-based functional assay [30], and was classified as uncertain on the basis of protein likelihood ratios [31] and homology-directed repair activity [32]. Although BRCA2 p.Tyr3035Cys was predicted to be likely deleterious on the basis of protein likelihood ratios [31], this variant did not show any significant association with breast cancer risk in our study.
There are several limitations to this study. The breast cancer cases were not age- and ethnicity-matched with controls, but these variables were adjusted for in all variant analyses. Another limitation is that the majority of the variants selected for this study are rare in our population. These rare variants were detected at very low frequency, which decreases the statistical power of a case-control study. Analyses in larger groups are necessary to confirm these findings.
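The frequency-based reasoning in the two paragraphs above can be summarized as a simple rule of thumb. The sketch below only encodes the >2% and >1% thresholds used in this discussion; the function name and the wording of the labels are illustrative, and such a rule is not a substitute for formal variant classification.

```python
def frequency_flag(control_frequency):
    """Label a variant by its frequency in unaffected women (given as a fraction)."""
    if control_frequency > 0.02:
        return "likely benign"
    if control_frequency > 0.01:
        return "potentially benign"
    return "not classifiable by frequency alone"


# BRCA2 p.Cys315Ser was detected in 1.2% of controls in this study.
cys315ser_flag = frequency_flag(0.012)
print(cys315ser_flag)
```

Applied to the 1.2% control frequency quoted for BRCA2 p.Cys315Ser, the rule returns the "potentially benign" label.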

Conclusions
Our data suggest that BRCA2 p.Ile3412Val is likely to be benign and that BRCA1 p.Arg762Ser may be associated with breast cancer risk. This study contributes evidence to support the characterization of variants of uncertain significance in BRCA1 and BRCA2.

Additional files
Additional file 1: Table S1a. BRCA1 variants included in genotyping assay design. A total of 23 BRCA1 variants were included in the genotyping assay. Of these, two variants were excluded due to genotyping call rate <95%. Table S1b. BRCA2 variants included in genotyping assay design. A total of 44 BRCA2 variants were included in the genotyping assay. Of these, two variants were excluded due to genotyping call rate <95%. Table S2. Characteristics of Malaysian breast cancer cases and healthy controls in ethnicity subgroups: (a) Chinese, (b) Malay and (c) Indian. There was no difference in age for cases and controls for Chinese and Indian women, but healthy women were on average 2 years older than the cases for Malay women. Additional file 2: Figure S1. Association of BRCA1 and BRCA2 variants with breast cancer risk in all breast cancer cases and healthy controls. The forest plot illustrates the association of BRCA1 and BRCA2 variants with breast cancer risk in all breast cancer cases and healthy controls. Figure S2.