Analysis of the pathogenic variants of BRCA1 and BRCA2 using next-generation sequencing in women with familial breast cancer: a case–control study

Background Pathogenic variants (PVs) of BRCA genes entail a lifetime risk of developing breast cancer in 50–85% of carriers. Their prevalence in different populations has been previously reported. However, there is scarce information regarding the most common PVs of these genes in Latin Americans. This study identified BRCA1 and BRCA2 PV frequency in a high-risk female population from Northeastern Mexico and determined the association of these mutations with the patients’ clinical and pathological characteristics.
Methods Women were divided into three groups: aged ≤ 40 years at diagnosis and/or with risk factors for hereditary breast cancer (n = 101), aged > 50 years with sporadic breast cancer (n = 22), and healthy women (n = 72). Their DNA was obtained from peripheral blood samples, and the variants were examined by next-generation sequencing with the Ion AmpliSeq BRCA1 and BRCA2 Panel.
Results PVs were detected in 13.8% of group 1 patients (BRCA1, 12 patients; BRCA2, 2 patients). Only two patients in group 2 and none in group 3 exhibited BRCA1 PVs. Variants of uncertain significance were reported in 15.8% of patients (n = 16). Among group 1 patients with the triple-negative subtype, PV frequency was 40% (12/30). Breast cancer prevalence in young women examined in this study was higher than that reported by the National Cancer Institute Surveillance, Epidemiology, and End Results program (15.5% vs. 5.5%, respectively).
Conclusions The detected BRCA1 and BRCA2 PV frequency was similar to that reported in other populations. Our results indicate that clinical data should be evaluated before genetic testing, and we highly recommend genetic testing in patients with the triple-negative subtype and other clinical risk features.
Electronic supplementary material The online version of this article (10.1186/s12885-019-5950-4) contains supplementary material, which is available to authorized users.


Background
Breast cancer is the most common type of cancer among women worldwide and is the main cause of cancer death among women in developing countries. In 2012, 1.67 million cases were reported worldwide by GLOBOCAN. Hereditary and familial cancers represent approximately 10% of the cases, indicating that 167,000 cases may be attributed to a genetic cause [1].
Approximately 15-40% of hereditary breast cancers occur due to pathogenic variants (PVs) of BRCA1 (17q21) and BRCA2 (13q12-13) [2][3][4][5]. BRCA PVs may be present in one of eight breast cancer patients who are aged < 40 years and have two affected relatives [3]. Carriers of PVs of BRCA genes have a 60% risk of developing breast cancer by the age of 70 years and an 83% risk of developing contralateral breast cancer [6]. Ovarian cancer is also strongly associated with BRCA PVs and shows high penetrance in carriers. Several other malignancies, such as pancreatic cancer, prostate cancer, and melanoma, have also been associated with mutations in BRCA genes; hence, the patients' family history should be considered.
The prevalence of BRCA1/2 germline mutations varies among ethnic groups and geographical zones. Clear variability across Latin American countries has been described, which is explained by the mixture of European, African, and Amerindian ancestors [7]. A founder mutation, ex9-12del, has been described in the Hispanic population from the south of the United States [8] and in an unselected study population from the center of Mexico that was assessed for a family history of cancer and exhibited a mutation frequency of 29% [9]. Mexico is a genetically heterogeneous country, and BRCA PV-related information obtained using next-generation sequencing (NGS) is scarce. PVs should be identified for better disease characterization among different populations and for appropriate genetic counseling.
This study established the frequency and type of mutations of BRCA1 and BRCA2 in a female population from Northeastern Mexico and determined the correlation of mutations with the patients' clinical and pathologic characteristics.

Methods
We performed a case-control study comprising patients from the Centro Universitario Contra el Cáncer at the Hospital Universitario Dr. Jose E. Gonzalez of the Universidad Autónoma de Nuevo León. Eligible subjects were from Northeastern Mexico (Nuevo León, Tamaulipas, and Coahuila), as were their parents and grandparents, and had high-risk factors for hereditary breast cancer. The enrollment strategy included searching the local database from January 2005 to August 2015. Women with breast cancer at an early age (≤ 40 y) were invited to participate in our study.
The sample size calculation for the case-control design, considering an alpha error of 0.05 and a power of 0.8 (beta error of 0.2), resulted in 25 persons per group. Although this was the calculated minimum, enrollment was pre-planned to recruit 200 people.
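A per-group sample size of this kind can be sketched with the standard normal-approximation formula for comparing two independent proportions. The source does not state which formula or which expected carrier frequencies were used, so the function name and the two proportions below are hypothetical and chosen only for illustration; the stated value of 0.8 is read here as statistical power.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p_cases, p_controls, alpha=0.05, power=0.80):
    """Per-group n for a two-sided comparison of two independent proportions.

    Uses the unpooled normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if p_cases == p_controls:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_cases * (1 - p_cases) + p_controls * (1 - p_controls)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_cases - p_controls) ** 2)


# Hypothetical carrier frequencies (NOT taken from the study).
n_per_group = sample_size_two_proportions(p_cases=0.30, p_controls=0.02)
print(f"Required participants per group: {n_per_group}")
```

Smaller expected differences between groups increase the required sample size sharply, which is one reason to enroll beyond the calculated minimum.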
In addition to having a pathological diagnosis of breast cancer, patients were required to meet at least one of the following criteria: age ≤ 40 years at diagnosis [10]; bilateral breast cancer; or three or more relatives with breast, ovarian, pancreatic, or prostate cancer or melanoma. The latter two were independent criteria that applied regardless of age at diagnosis.
We included two control groups. Patients with a diagnosis of sporadic breast cancer were termed "positive controls," and healthy women without a personal or family history of cancer were termed "negative controls." For the latter group, which was used to detect local variants, an open invitation was made to medical students and workers. Inclusion criteria for the healthy group were as follows: age > 18 y, a pedigree with no personal or family history of any cancer, and birth in the Northeast of Mexico. Informed consent was required for all participants. Patients meeting the inclusion criteria were selected from the daily hospital outpatient attendance register or from the electronic database of the center and invited to participate by phone. Healthy controls were selected from the general population. An oncologist conducted an interview to obtain the medical history. Clinical data were verified from the electronic medical files of the patients and recorded as baseline data. A peripheral blood sample was taken and analyzed at the molecular laboratory of the genetics department in the university hospital (College of American Pathologists accredited).

Pathology and mutation analyses
All patients (cases and positive controls) received a diagnosis of invasive breast cancer that was confirmed by anatomopathological analysis at the pathology department of the university hospital. The histologic type of the cancer was determined according to the World Health Organization system [11]. Tumor grade was defined using the Scarff-Bloom-Richardson system. Estrogen and progesterone receptors and HER2 were identified using standard immunohistochemical techniques; hormone receptors were considered positive when at least 1% staining was detected [12]; HER2 was considered positive when "+++" was detected; if "++" was observed, fluorescence in situ hybridization analysis was used for confirmation [13].
DNA extraction was performed using the Qiagen QIAamp DNA Mini Kit (QIAGEN GmbH, Hilden, Germany), according to the manufacturer's instructions. Elution was into 100 μL of water.
The entire coding regions of the BRCA1 and BRCA2 genes were amplified using the Ion AmpliSeq BRCA1 and BRCA2 Panel. For cases with negative findings, multiplex ligation-dependent probe amplification (MLPA) was performed to search for large genomic alterations, duplications, or deletions of one or more exons, as per guideline recommendations [14]. The SALSA MLPA P087-C1 BRCA1 and SALSA MLPA P077-A3 BRCA2 test kits (MRC-Holland, Amsterdam, Netherlands) were used in accordance with the manufacturer's instructions.

Data analysis
The raw data were analyzed using Torrent Suite software v5.0.4 (Life Technologies). Coverage analysis was performed using the Coverage Analysis plugin v5.0.2.0. Mutations were detected using the Variant Caller plugin v5.0.2.1 (Life Technologies). To eliminate erroneous base calling, two filtering steps were used to generate the final variant calls. The first filter required an average total coverage depth of > 80, a coverage of > 20 for each variant, a variant frequency of > 5% in each sample, and a p-value of < 0.01. The second filter consisted of visually examining mutations using Integrative Genomics Viewer software (https://software.broadinstitute.org/software/igv/). Ion Reporter 5.0 was used for variant annotation and classification.
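The first filtering step can be sketched as a simple threshold check. This is a minimal illustration and not the Torrent Suite implementation: the `VariantCall` record, its field names, and the example call are hypothetical, and the variant frequency threshold is assumed to be expressed as a percentage.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical record for one called variant (field names are assumptions)."""
    name: str
    mean_depth: float        # average total coverage depth
    variant_depth: int       # reads covering the variant
    allele_frequency: float  # variant frequency in the sample, in percent (assumed unit)
    p_value: float


def passes_first_filter(call, min_mean_depth=80, min_variant_depth=20,
                        min_frequency_percent=5.0, max_p_value=0.01):
    """Apply the four strict thresholds described in the text."""
    return (
        call.mean_depth > min_mean_depth
        and call.variant_depth > min_variant_depth
        and call.allele_frequency > min_frequency_percent
        and call.p_value < max_p_value
    )


# Hypothetical call, for illustration only.
example = VariantCall("example_variant", mean_depth=330.0, variant_depth=150,
                      allele_frequency=48.0, p_value=1e-6)
print(example.name, "kept" if passes_first_filter(example) else "discarded")
```

Calls that pass this step would still go through the second, manual step of visual inspection, which is not modeled here.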
After filtering, all variants identified through NGS (silent, missense, nonsense, frameshift, and splicing variants) were compared with variants reported for different ethnic populations in the 1000 Genomes Project (http://www.1000genomes.org/) and ExAC (http://exac.broadinstitute.org/about), and with 72 in-house controls. All mutations were also checked against the UMD, LOVD, kConFab, HGMD, and ClinVar databases, and were regarded as "pathogenic" if classified as such in these databases. The missense variants were annotated using the wANNOVAR web site (http://wannovar.wglab.org), which provides tools such as SIFT, PolyPhen-II HDIV, PolyPhen-II HVAR, LRT, Mutation Taster, Mutation Assessor, FATHMM, PROVEAN, VEST3, MetaLR, and M-CAP to predict the effect of the amino acid substitution for each missense mutation. Every missense mutation was scored as damaging or benign using the 11 prediction tools. If the missense mutation was scored as damaging by five or more of the prediction tools, the mutation was classified as a "damaging" mutation, and if it was scored as damaging by fewer than three, the mutation was classified as "benign". The detected variants were classified based on the criteria of the ENIGMA (Evidence-based Network for the Interpretation of Germline Mutant Alleles) consortium (https://enigmaconsortium.org) and described as recommended by the Human Genome Variation Society (https://www.hgvs.org/) using the RefSeq transcripts NM_007294.3 (BRCA1) and NM_000059.3 (BRCA2). To verify whether the PVs identified were true variants or sequencing artifacts, point mutations classified as PVs were confirmed by Sanger sequencing, using the BigDye Terminator v3.1 sequencing kit and the ABI PRISM 3130 Genetic Analyzer (Life Technologies).
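The consensus rule for missense variants can be sketched as a vote count over the 11 prediction tools. The function name, the input format, and the "unclassified" label are assumptions made for this illustration; in particular, the text does not say how variants scored as damaging by three or four tools were handled, so that case is returned explicitly rather than guessed.

```python
PREDICTION_TOOLS = (
    "SIFT", "PolyPhen-II HDIV", "PolyPhen-II HVAR", "LRT", "Mutation Taster",
    "Mutation Assessor", "FATHMM", "PROVEAN", "VEST3", "MetaLR", "M-CAP",
)


def classify_missense(damaging_by_tool):
    """Classify a missense variant from per-tool calls.

    damaging_by_tool maps each tool name to True (scored damaging)
    or False (scored benign). All 11 tools must be present.
    """
    missing = set(PREDICTION_TOOLS) - set(damaging_by_tool)
    if missing:
        raise ValueError(f"missing predictions for: {sorted(missing)}")
    n_damaging = sum(bool(damaging_by_tool[tool]) for tool in PREDICTION_TOOLS)
    if n_damaging >= 5:
        return "damaging"
    if n_damaging < 3:
        return "benign"
    # 3 or 4 damaging calls: no rule is given in the text (assumed label).
    return "unclassified"


# Hypothetical variant scored as damaging by the first six tools only.
example_calls = {tool: index < 6 for index, tool in enumerate(PREDICTION_TOOLS)}
print(classify_missense(example_calls))
```

Keeping the intermediate case separate makes the gap in the rule visible instead of silently assigning those variants to one class.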

Statistical analysis
Patient characteristics were tabulated, and descriptive data are presented as means with standard deviations and as proportions. Comparisons between groups (familial hereditary vs. sporadic and carriers vs. noncarriers) were performed using a t-test for two independent means and a chi-squared test for two proportions expressed as percentages. Odds ratios (ORs) were calculated for age, bilateral cancer, family history, and triple-negative variables. SPSS version 20 (IBM, Armonk, NY) for Windows 7 was used for statistical analysis.
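For a 2×2 table, the odds ratio and the chi-squared comparison of two proportions can be sketched in plain Python. This is not the SPSS procedure used in the study: the confidence interval uses the Woolf (log) method, the chi-squared statistic is the Pearson form without continuity correction, and the counts in the example are hypothetical.

```python
from math import erfc, exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio with a Woolf confidence interval.

    a, b: carriers with / without the characteristic
    c, d: noncarriers with / without the characteristic
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf method")
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return odds_ratio, exp(log(odds_ratio) - z * se_log), exp(log(odds_ratio) + z * se_log)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 degree of freedom) and its p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-squared with 1 df
    return chi2, p_value


# Hypothetical counts (NOT the study data): 10 of 15 carriers and
# 20 of 80 noncarriers have the characteristic of interest.
or_value, ci_low, ci_high = odds_ratio_ci(10, 5, 20, 60)
chi2, p = chi_squared_2x2(10, 5, 20, 60)
print(f"OR = {or_value:.1f} (95% CI, {ci_low:.2f}-{ci_high:.2f}); "
      f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

With small cell counts, an exact test would usually be preferred over the chi-squared approximation.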

Results
All subjects were born in Northeastern Mexico. From January 2005 to August 2015, 3,065 patients were registered in the hospital database. We eliminated 265 patients because the reported age was not reliable. There were 436 patients (15.5%) aged ≤ 40 years at diagnosis, among whom 335 were either not located or did not agree to participate. A total of 101 patients with early-age breast cancer and/or familial/hereditary breast cancer, 22 patients with sporadic cancer (positive controls), and 72 healthy women (negative controls) were included. The clinical characteristics of the patient and positive control groups are shown in Table 1. As expected, the mean age of the familial breast cancer group was significantly lower (36.9 ± 5.2 years). No other statistically significant differences in clinical characteristics were noted between the groups. Regarding tumor histopathology, 53% of patients in the hereditary cancer group exhibited nuclear grade 3 compared with only 10% in the sporadic cancer group (p < 0.001).
PGM sequencing of these 195 participants yielded an average of 60,463 reads per participant, with a mean read length of 113 bp. The average read depth per sample was 330X, with a mean percentage of reads on target of 92% and a uniformity of base coverage of 96.3%. PV analysis of BRCA1 and BRCA2 revealed 16 mutation carriers. Fourteen carriers (13.8%), presenting 10 different PVs, were in group 1 (Table 2). Overall, 12 different PVs were detected; most were in BRCA1 (81%, 13/16), whereas only 19% (3/16) were in BRCA2. Among these, 11 variants were classified as pathogenic and one as likely pathogenic. Sixteen variants were identified, eight (50%) through NGS and eight (50%) using MLPA. PVs identified with NGS were re-sequenced by Sanger sequencing, and all were true variants, for a validation rate of 100%. Two deletions, ex9-12del and ex16-17del, together accounted for 42.8% of carriers in the familial-hereditary group (21.4% each, 3/14). Two PVs (one in BRCA1 and one in BRCA2) were detected in the positive control group. No PVs were detected in the 72 healthy women. Results for all variants are reported in Additional file 1: Table S1.
A comparison of demographic and clinical characteristics between the mutation and non-mutation groups only revealed a difference in the frequency of breastfeeding (35.2% of patients with PVs had breastfed compared with 59.3% of patients without PVs; p = 0.04). Regarding tumor characteristics, the triple-negative subtype was more frequently observed in patients with BRCA PVs than in those without PVs (65% vs. 22.6%; p < 0.001). The association of the triple-negative subtype with PVs of BRCA exhibited an OR of 6.4 (95% CI, 2.22-18.70). Other clinical and tumor characteristics did not statistically differ between the mutation and non-mutation groups (Table 3).

Discussion
The university oncology center serves the northeast region of Mexico. At least 30% of its patients come from other states, and they are mostly low-income individuals who live in rural areas. Thus, the need for phone contact and travel-related costs led to a low rate of participation relative to the population found in the local database; however, the previously estimated sample size was reached.
Due to the lack of genetic characterization of BRCA genes in Mexico, 72 healthy women were included as negative controls. Most previous studies are of Hispanics from diverse origins [7,8], and information on BRCA variants in the healthy Mexican population is scarce. Local variants were not detected among healthy controls. The frequency of PVs in BRCA1/2 genes reported by clinics that attend to high-genetic-risk populations in North America is approximately 9.3% [15]; by contrast, the frequency of PVs reported in the Hispanic population from the Southwestern United States is as high as 25% [16]. In the present case-control study, a PV frequency of 13.8% was observed in a population from Northeastern Mexico, which is similar to that previously reported [17]. In Mexico, the frequency of PVs of these genes has been reported to range from 4 to 27%, depending on the studied population (for example, cases with risk factors and sporadic cases) and tumor characteristics [17][18][19][20]. In particular, among populations with familial/hereditary characteristics, the frequency was 10.2% in Mexico, which is not statistically different from that in our study (p = 0.14) [17].
Over 1,500 clinically significant PVs have been described for each BRCA gene [21,22]. Among studies published on the Mexican population, 53 pathogenic genomic variants of BRCA1/2 have been reported (24 in patients with early onset or a family history of breast cancer, 28 in unselected populations, and one in both types of population). Only one PV, a large genomic rearrangement (c.548-?_4185 +?del), which is considered a founder mutation in Mexicans, was recurrent in different studies [9,20]. Torres-Mejía et al. and Villarreal et al. reported that the frequency of this PV was 1% overall (22% among carriers) and 9.4% overall (42% among carriers), respectively. In our study, we detected this PV in 2.9% of group 1 patients (21.4% of group 1 carriers). Inclusion criteria differ among these studies, ranging from an unselected population, to triple-negative patients younger than 50 y, to early-onset breast cancer and/or a family history in this study. These data are noteworthy because of the wide spectrum of PVs in our population.
We discovered one PV, predicted to be deleterious, that has not been previously reported: c.682_683insAGCCATGTGG; p.Gly228Glufs*15. This PV was detected in a patient with early-onset breast cancer, aged 33 y at the time of diagnosis, with bilateral cancer and the triple-negative subtype. Two variants, p.Ser186Tyr and p.Thr1561Ile, are currently classified in several databases as benign. The two patients carrying them had early-onset breast cancer, one at the age of 39 y with HER2 overexpression and the other at 37 y with the luminal subtype; neither had a family history. Nevertheless, according to the pathogenicity predictors used in this study and considering the low frequency of these variants reported in the 1000 Genomes Project, gnomAD, and ExAC, we suggest further research for proper classification.
Some laboratories have been introducing multiplex assays, which analyze only the most common genetic variants. In the present study, apart from carriers of the founder genomic variant, all patients analyzed to date were carriers of different PVs. From these data, we can infer that such panels may miss relevant variants, at least in Mexican populations.
New technologies such as NGS are currently being used for gene testing because they save time, are cost-effective, and have a higher sensitivity and specificity [23]. Nevertheless, it is important to use at least two different genomic technologies to rule out genomic variants, because as observed in this study, the use of MLPA enabled the identification of 38% of the PVs.
Because these technologies are not available in all clinical settings, clinical criteria should be considered to select patients for genetic testing. Recently, the criteria for hereditary breast cancer have been changing, with an expansion in the risk-related age range, family history, and pathologic characteristics [23]. Patients with the triple-negative phenotype may even be of older age (> 50 years) [24]. In a previous study in Australia and Poland comprising patients unselected by age or family history of cancer, the prevalence was between 9.3 and 9.9% [25]. In a similar study of a Mexican population with a median age of 43 years (range, 23-50 years) and the triple-negative phenotype, the prevalence of PVs of BRCA was 23% [26]. In this study, the frequency was as high as 43.3%, representing 65% (OR, 6.4; 95% CI, 2.2-18.7) of the patients with PVs, as mentioned previously. This finding indicates the importance of clinical aspects in decision making with regard to the need for gene testing. Personal and environmental data are important for counseling and, to some extent, for decreasing the risk of breast cancer and other malignancies. Breastfeeding is considered an important protective factor against cancer development. In the present study, a smaller proportion of women with PVs had breastfed. Accordingly, it is important to recommend breastfeeding to carriers of BRCA PVs. This modifiable risk factor has been described to be significant in decreasing the risk for breast cancer, with a relative risk of 0.63 (95% CI, 0.46-0.86) in BRCA1-mutated populations [27]. To the best of our knowledge, this is the first study in Mexico to compare breastfeeding between carriers and noncarriers of BRCA PVs among young women with breast cancer.
BRCA gene status is important for the selection of treatment. The use of platinum analogs has shown more benefit in metastatic cases, with a favorable response of 54% compared with 19% for other therapies [28]. Novel therapies that involve poly (ADP-ribose) polymerase inhibitors have shown advantages when used in combination with chemotherapy for BRCA-positive cases [29]. This highlights the need for gene testing not only for genetic counseling but also for treatment. In this study, therapy was not decided on the basis of the BRCA gene status.

Conclusions
In the present study, BRCA PVs were detected with a frequency of 20% in a high-risk population, using the Ion AmpliSeq BRCA1 and BRCA2 Panel together with MLPA. Because there is high variability in the type and frequency of BRCA gene variants in the Mexican population, we propose the use of these technologies. We also note that clinical aspects can facilitate decision making regarding the need for BRCA analysis. The triple-negative subtype correlates well with BRCA mutations, so this population should not be excluded from analysis. Strategies to promote a healthier environment must be included in the medical advice to patients.
Breastfeeding as a modifiable risk factor should be part of the analyses in future studies to determine the impact in high-risk groups of not only breast cancer, but also ovarian cancer.

Additional file
Additional file 1: Table S1. Genetic Database. Excel file with total data about genetic variants in BRCA1/2 about the three groups; Healthy, Sporadic