Novel polymorphisms in caspase-8 are associated with breast cancer risk in the California Teachers Study

Background The ability of tamoxifen and raloxifene to decrease breast cancer risk varies among different breast cancer subtypes. It is important to determine one’s subtype-specific breast cancer risk when considering chemoprevention. A number of single nucleotide polymorphisms (SNPs), including one in caspase-8 (CASP8), have been previously associated with risk of developing breast cancer. Because caspase-8 is an important protein involved in receptor-mediated apoptosis whose activity is affected by estrogen, we hypothesized that additional SNPs in CASP8 could be associated with breast cancer risk, perhaps in a subtype-specific manner. Methods Twelve tagging SNPs of CASP8 were analyzed in a nested case control study (1,353 cases and 1,384 controls) of non-Hispanic white women participating in the California Teachers Study. Odds ratios (ORs) and 95 % confidence intervals (CIs) were calculated for each SNP using all, estrogen receptor (ER)-positive, ER-negative, human epidermal growth factor receptor 2 (HER2)-positive, and HER2-negative breast cancers as separate outcomes. Results Several SNPs were associated with all, ER-positive, and HER2-positive breast cancers; however, after correcting for multiple comparisons (i.e., p < 0.0008), only rs2293554 was statistically significantly associated with HER2-positive breast cancer (OR = 1.98, 95 % CI 1.34-2.92, uncorrected p = 0.0005). Conclusions While our results for CASP8 SNPs should be validated in other cohorts with subtype-specific information, we conclude that some SNPs in CASP8 are associated with subtype-specific breast cancer risk. This study contributes to our understanding of CASP8 SNPs and breast cancer risk by subtype. Electronic supplementary material The online version of this article (doi:10.1186/s12885-015-2036-9) contains supplementary material, which is available to authorized users.


Background
Breast cancer risk factors include a woman's age, family history, reproductive and gynecologic factors, and lifestyle factors including alcohol consumption and lack of physical activity [1]. When treating women at high risk for breast cancer, clinicians may recommend that women undergo increased screening, genetic testing, or chemoprevention [2][3][4]. Phase III breast cancer chemoprevention trials have now demonstrated the efficacy of selective estrogen receptor (ER) modulators (SERMs) (e.g., tamoxifen and raloxifene) and aromatase inhibitors in reducing the incidence of breast cancer. However, these drugs were significantly more effective at reducing the incidence of ER-positive breast cancer than ER-negative breast cancer [5][6][7][8][9][10][11][12][13]. ER-positivity is also associated with better prognosis after breast cancer diagnosis than ER-negativity [14,15], while human epidermal growth factor receptor 2 (HER2)-positivity [16] and triple negativity (ER-negative, progesterone receptor (PR)-negative, and HER2-negative) [17] are each associated with worse prognosis. Drugs to prevent HER2-positive and triple-negative breast cancers are also currently being studied [18]. Because the chemopreventive medications developed thus far have known undesirable side effects, knowledge of one's risk not only for any breast cancer but also for specific subtypes of breast cancer would help a woman and her physician when considering chemopreventive therapy options.
Breast cancer risk models currently used by clinicians to identify women at high risk of developing breast cancer exhibit limited sensitivities and specificities [1], and many studies have focused on identifying genetic variation associated with breast cancer risk in the hope that single nucleotide polymorphism (SNP) genotyping can be used to better stratify breast cancer risk and inform clinical management. While it is known that mutations in BRCA1 and BRCA2 markedly increase one's risk of developing breast cancer [19,20], a number of additional low- and moderate-risk susceptibility variants have been identified, including one in caspase-8 (CASP8), which encodes an enzyme involved in apoptosis [21].
Caspase-8 is activated in response to extrinsic apoptotic signals, including chemotherapy agents [22]. In vitro, estrogen inhibits the activity of caspase-8 and of other caspases [23]. The Breast Cancer Association Consortium (BCAC) has identified three SNPs in CASP8, namely rs1045485, rs17468277, and rs1830298, that are associated with breast cancer risk [24][25][26]. Other CASP8 SNPs have been shown to be associated with increased breast cancer risk [27][28][29]. Few studies have described associations between CASP8 polymorphisms and subtype-specific breast cancer risk. The exceptions are two BCAC studies, which found that rs1045485 was associated with a lower risk of PR-positive breast cancer [25], that rs1830298 was associated with a higher risk of ER-positive and triple-negative breast cancer [26], and that rs36043647 was associated with a lower risk of overall, ER-positive, ER-negative, and triple-negative breast cancer [26]. Given the important role of caspase-8 in apoptosis, we hypothesized that additional CASP8 polymorphisms would be associated with breast cancer risk and that the associations might be specific to some breast cancer subtypes. The aim of this study was to examine potential associations between 12 CASP8 polymorphisms and breast cancer risk, overall and by subtype, using case and control samples nested within the California Teachers Study (CTS).

Ethics statement
This study was carried out in compliance with the Helsinki Declaration and approved by the Institutional Review Boards at each study center, namely, the City of Hope (COH), the University of Southern California (USC), the Cancer Prevention Institute of California (CPIC), the University of California at Irvine (UCI), and by the California State Committee for the Protection of Human Subjects, in accordance with assurances filed with and approved by the US Department of Health and Human Services. All study participants provided written informed consent to participate in the study.

Participants
The CTS is a well-established prospective cohort study of 133,479 female California public school teachers and administrators who were enrolled in the California State Teachers Retirement System. A detailed account of the methods employed by the CTS has been published previously [30]. Briefly, participants completed a baseline questionnaire and returned it by mail in 1995-1996. The baseline survey, which collected information on demographics, personal and family cancer history, height, weight, history of hormone use, and behavioral factors including physical activity and alcohol consumption, is available on the CTS website (www.calteachersstudy.org). New diagnoses of first primary invasive breast cancer among cohort members were identified through annual linkages with the California Cancer Registry (CCR), a legally mandated statewide population-based cancer reporting system in which cancer data are obtained from cancer patients' pathology reports at the hospital in which the patient was initially diagnosed. CCR ascertainment of newly diagnosed cancers is estimated to be 99 % complete [31].
For this nested breast cancer case-control study, biospecimens were collected between 2005 and 2009 from breast cancer cases diagnosed under age 80 years and unaffected controls in the cohort, all of whom had continued residence in California during the study period (1995 to time of blood draw). Cases were women who had a histologically confirmed invasive first primary carcinoma of the breast (International Classification of Disease for Oncology code C50 restricted to morphology codes under 8590) after 1998. Unaffected control participants were selected from the cohort and frequency matched to the cases based on age at baseline (within 5-year age groups), self-reported race/ethnicity (white, African American, Latina, Asian, and other), and three broad geographic regions in California (surrounding the three CTS specimen collection centers: CPIC, USC/COH, and UCI).

Collection of biological specimens and DNA extraction
The collection of specimens has been described previously [32]. Briefly, cases and controls provided a blood sample and completed a brief questionnaire at the time of blood draw, which updated breast, reproductive, and gynecologic history and several lifestyle factors. Women who declined to provide blood provided saliva in Oragene DNA self-collection kits (DNA Genotek, Kanata, ON, Canada). All biological specimens were sent overnight to the UCI laboratory. DNA was extracted from blood clots using Qiagen Clotspin Baskets and QIAamp DNA Blood Maxi Kits (Qiagen, Inc., Valencia, CA, USA) in accordance with Qiagen protocols. DNA was extracted from saliva samples using the Oragene protocol (DNA Genotek).

Genotyping
The 12 tagging SNPs included in this analysis were selected to capture all common linkage disequilibrium tagging SNPs [minor allele frequency (MAF) of at least 5 %], 20 kb upstream of the 5' untranslated region (UTR) and 10 kb downstream of the 3' UTR, in individuals of European ancestry with minimum pairwise r² of at least 0.80, using data from the International HapMap Project for the white CEPH (Utah residents with ancestry from northern and western Europe) population [HapMap release 21, July 2006, genotype build 36 (http://hapmap.ncbi.nlm.nih.gov)] [32].
DNA samples from 1,751 cases and 1,697 controls were plated for genotyping. A random sample of 193 duplicates (105 cases and 88 controls) was included for quality control. The samples were genotyped using the Illumina GoldenGate Assay (Illumina, Inc., San Diego, CA, USA) at the University of Southern California Core Facility. Twelve haplotype-tagging SNPs in CASP8 were included and genotyped. Samples with genotype call rates <90 % were excluded. Among the remaining samples, 160 randomly selected duplicates exhibited a genotype concordance rate of 99.9 %. Additional details were described previously [32]. Because the majority of participants were non-Hispanic whites, we restricted analyses to 2,737 non-Hispanic white women (1,353 cases and 1,384 controls).

Statistical analyses
All statistical tests were two-sided. We used unconditional logistic regression models to estimate the odds ratios (ORs), 95 % confidence intervals (CIs), and p-values for the association of invasive breast cancer with each SNP, using log-additive models. Allele frequencies are shown in Additional file 1: Table S1. We adjusted for potential confounding by study center and other known risk factors, namely, age at baseline, family history (having a first-degree relative with history of breast cancer), body mass index (<25, 25.0-29.9, ≥30 kg/m²), alcohol consumption in the past year (none, <20 g/day, ≥20 g/day), physical activity in the past 3 years (0-0.5 hr/wk/yr, 0.51-4.0 hr/wk/yr, >4.0 hr/wk/yr), and menopausal and hormone therapy (HT) status (premenopausal, postmenopausal and never used HT, postmenopausal and used HT in the past, postmenopausal and using estrogen only at baseline, postmenopausal and using estrogen and progesterone at baseline, and unknown) at baseline. To potentially improve power by increasing subgroup homogeneity, we stratified our analysis by estrogen receptor (ER) and human epidermal growth factor receptor 2 (HER2) status of breast cancer. We evaluated the association for the ER-positive (n = 1,046), ER-negative (n = 155), HER2-positive (n = 159), and HER2-negative (n = 662) subtypes. Some breast cancers were not included in any specific receptor (ER or HER2) subtype analysis because they were missing either ER or HER2 status. PR status was not included because PR expression usually follows ER expression [33] and because the clinical rationale for determining associations with PR-specific breast cancer risk was lacking: no chemotherapies or preventive therapies are being studied for PR status-specific subtypes. While therapies targeting triple-negative breast cancer are being considered, the number of triple-negative cancers in our subset of cases and controls was too small for analysis (n = 60).
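As an illustration of the log-additive coding described above, the following minimal sketch fits an unadjusted logistic model in which genotype enters as the number of minor alleles (0, 1, or 2), so that each additional allele multiplies the odds by the same per-allele OR. It is not the SAS procedure used in the study and omits all covariate adjustment; the function name `fit_log_additive` and the genotype counts are hypothetical and chosen only so that the expected per-allele OR is easy to verify by hand.

```python
import math

def fit_log_additive(cases, controls, iterations=25):
    """Fit logit(P(case)) = b0 + b1 * g by Newton-Raphson.

    cases[g] and controls[g] are counts of cases and controls carrying
    g copies of the minor allele (g = 0, 1, 2). Returns (b0, b1, se_b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        s0 = s1 = 0.0            # score vector (gradient of log-likelihood)
        h00 = h01 = h11 = 0.0    # Fisher information matrix entries
        for g, (a, c) in enumerate(zip(cases, controls)):
            n = a + c
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * g)))
            resid = a - n * p
            w = n * p * (1.0 - p)
            s0 += resid
            s1 += resid * g
            h00 += w
            h01 += w * g
            h11 += w * g * g
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information * delta = score)
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (h00 * s1 - h01 * s0) / det
    se_b1 = math.sqrt(h00 / det)  # from the inverse information matrix
    return b0, b1, se_b1

# Hypothetical counts (NOT study data): the case/control odds are
# 0.5, 0.75, and 1.125 for 0, 1, and 2 minor alleles respectively.
hyp_cases = [400, 300, 90]
hyp_controls = [800, 400, 80]
b0, b1, se_b1 = fit_log_additive(hyp_cases, hyp_controls)
per_allele_or = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
print(f"per-allele OR = {per_allele_or:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Because the hypothetical odds rise by a constant factor of 1.5 per allele, the fitted per-allele OR should be 1.5, and the implied OR for carrying two minor alleles is its square.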
We used the conservative Bonferroni correction to correct for multiple testing (n = 60, 12 SNPs x 5 outcomes). Statistical significance was set to p < 0.0008. All analyses were done using SAS software version 9.2.
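The Bonferroni threshold stated above follows from dividing the family-wise error rate by the number of tests. The short sketch below reproduces that arithmetic and compares it with two uncorrected p-values reported in this paper; the overall significance level of 0.05 is an assumption, since the text gives only the resulting threshold, and the variable names are illustrative.

```python
# Bonferroni correction: per-test threshold = alpha / number of tests.
alpha = 0.05          # assumed family-wise significance level
n_snps = 12           # tagging SNPs analyzed
n_outcomes = 5        # all, ER+, ER-, HER2+, HER2- breast cancer
n_tests = n_snps * n_outcomes
threshold = alpha / n_tests

# Uncorrected p-values reported in the text for two SNP-outcome pairs.
reported = {
    "rs2293554 / HER2-positive": 0.0005,
    "rs6736233 / overall": 0.0028,
}

print(f"{n_tests} tests, per-test threshold = {threshold:.5f}")
for label, p in reported.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{label}: p = {p} -> {verdict} after correction")
```

The computed threshold, about 0.00083, corresponds to the cutoff of p < 0.0008 used in the analysis.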
Recombination rates and linkage disequilibrium across the CASP8 gene were evaluated using the HapMap database (http://hapmap.ncbi.nlm.nih.gov), and r² values were computed from the pairwise SNP genotype counts of the generated genotype data.

Results
Baseline characteristics of the cases and controls are provided in Table 1. Consistent with other studies, family history of breast cancer, menopause and hormone therapy (HT) use, physical inactivity, and alcohol use were associated with breast cancer risk. Genotype distributions are provided in Additional file 1: Table S1.

CASP8 polymorphisms and invasive breast cancer risk
The adjusted ORs and 95 % CIs of overall invasive breast cancer with CASP8 polymorphisms are shown in Table 2. Four SNPs had a p-value < 0.05 for positive associations with overall breast cancer (rs11899004, rs3769825, rs6723097 and rs6736233). The SNP most strongly associated with overall breast cancer risk was rs6736233, which conferred an OR of 1.38 (95 % CI 1.12-1.71, p = 0.0028) (Table 2). After correcting for multiple comparisons, none of the SNPs tested remained statistically significant at p < 0.0008.
When ER-positive and ER-negative breast cancer outcomes were analyzed separately, the trends of increased risk with rs3769825, rs6723097 and rs6736233 as seen for overall breast cancer remained for ER-positive breast cancers (Table 3). However, after correcting for multiple comparisons, none of the associations remained statistically significant. None of the SNPs tested were associated with ER-negative breast cancer risk.
Three of the four SNPs that were associated with overall invasive breast cancer (p-value < 0.05) were also associated with HER2-positive invasive breast cancer (rs11899004, rs6723097, and rs6736233). In addition, rs2293554 was associated with HER2-positive invasive breast cancer (OR = 1.98, 95 % CI 1.34-2.92, uncorrected p = 0.0005). After correcting for multiple comparisons, rs2293554 was the only SNP that remained statistically significant. Two of the four SNPs that were associated with overall invasive breast cancer (p-value < 0.05) were associated with HER2-negative invasive breast cancer (rs3769825 and rs6723097). However, after correcting for multiple comparisons, neither remained statistically significant (Table 4).
In summary, after correction for multiple testing, one of the twelve CASP8 SNPs tested in our study remained nominally statistically significantly associated with invasive breast cancer, specifically, HER2-positive breast cancer.
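The reported p-values can be sanity-checked against the reported ORs and confidence intervals. The sketch below assumes that the intervals are Wald intervals, symmetric on the log-odds scale, and back-calculates the standard error, z statistic, and two-sided p-value; the function name `p_from_or_ci` is illustrative. Because the published ORs and CIs are rounded to two decimals, the recovered p-values are only approximate.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE, z, and two-sided p from an OR and its 95% Wald CI."""
    # On the log scale the CI spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Values reported in the text.
checks = {
    "rs6736233, overall (reported p = 0.0028)": (1.38, 1.12, 1.71),
    "rs2293554, HER2-positive (reported p = 0.0005)": (1.98, 1.34, 2.92),
}
for label, (or_, lo, hi) in checks.items():
    se, z, p = p_from_or_ci(or_, lo, hi)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, recomputed p = {p:.4f}")
```

Under this assumption, both recomputed p-values fall close to the reported ones, and only the HER2-positive result for rs2293554 lies below the corrected threshold of 0.0008.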

Linkage disequilibrium
An analysis of data from the HapMap database indicated that very low historical genetic recombination exists across the entire CASP8 gene in individuals of European descent, with pairwise D' values near 1.0 for all SNP pairs spanning the gene in the database. The alleles at the five markers that were associated with breast cancer risk in this study before correcting for multiple comparisons were not strongly correlated, as measured by the linkage disequilibrium measure r². This low correlation (r²) in the context of high linkage disequilibrium (D') was expected given that the SNPs were selected as tagging markers. Three pairs of SNPs showed r² values greater than 0.4: r² = 0.44 for rs11899004 and rs2293554; r² = 0.52 for rs11899004 and rs6736233; and r² = 0.45 for rs3769825 and rs6723097. The remaining pairwise r² values were all less than 0.2. rs6723097 and rs6736233 were the two SNPs most significantly associated with breast cancer risk overall, with uncorrected p-values of 0.0053 and 0.0028, respectively. These two SNPs are uncorrelated (r² = 0.07) and likely represent independent associations.
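The coexistence of D' near 1.0 and low r² can be seen from the standard definitions of the two measures. The sketch below computes both from haplotype frequencies for a pair of biallelic SNPs; the frequencies are hypothetical, chosen so that a rare allele at one SNP always occurs on the background of a common allele at the other, and the function name `ld_measures` is illustrative. It works from haplotype frequencies for simplicity, which is not the same procedure as the genotype-count calculation used in this study.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    p_ab:     frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies (NOT study data): allele A (10 %) is found
# only on haplotypes that also carry allele B (50 %).
d, d_prime, r2 = ld_measures(p_a=0.10, p_b=0.50, p_ab=0.10)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

In this example no recombinant haplotype is present, so D' equals 1, yet r² is only about 0.11 because the allele frequencies differ substantially; this is the pattern expected among tagging SNPs selected to be non-redundant.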

Discussion
This study is the first to identify the CASP8 SNP rs2293554 as statistically significantly associated with HER2-positive breast cancer risk in non-Hispanic white women. In our study, the observed OR of 1.98 (95 % CI 1.34-2.92) for HER2-positive breast cancer risk was surprisingly high, especially given the small number of HER2-positive breast cancers in our study. It is possible that the observation was due to chance. A previous study reported that rs2293554 was not associated with breast cancer risk overall [34], similar to what we observed here; however, subtype-specific breast cancers were not evaluated in that study. The most recent BCAC paper on CASP8 [26] covered the analysis of 501 typed and 1232 imputed SNPs, and, while some CTS samples were included in the BCAC study, the overlap between the BCAC study and our present analysis was limited to 57 triple-negative cases and 49 controls. rs2293554 was not included on the panel of CASP8 SNPs analyzed in the BCAC paper [26]; however, using the SNP lookup function on the BCAC website (http://apps.ccge.medschl.cam.ac.uk/consortia/bcac), we found that rs2293554 was not associated with overall, ER-positive, or ER-negative breast cancer risk. Data for HER2-specific breast cancer risk were not available on the website, but through personal email communication with the BCAC Data Manager, we learned that the BCAC data indicated no association between rs2293554 and HER2-positive breast cancer risk. rs2293554 was in strong LD (r² > 0.50) with 16 of the 109 SNPs identified in the BCAC paper as associated with overall breast cancer risk at FDR < 0.05 [26], according to the Linkage Disequilibrium Calculator (https://caprica.genetics.kcl.ac.uk/~ilori/ld_calculator.php), using the European panel in the 1000 Genomes Project; however, their effects were in the opposite direction (Additional file 2: Table S2).
While our observation was not consistent with those in the BCAC study, our data demonstrate that SNPs can have different associations with breast cancer risk according to subtype and that rs2293554, with its nominally significant association with HER2-positive breast cancer risk in the CTS cohort, warrants further investigation.
Our study confirmed results from a meta-analysis, in which rs6723097 was associated with increased breast cancer risk [OR = 1.16 (95 % CI 1.07-1.25)] [34], and from a separate study [OR = 1.15 (95 % CI 1.01-1.30)] [27]. Here, the observed OR was 1.17 (95 % CI 1.05-1.31). Also consistent with previous studies, no associations with breast cancer risk were found for rs1035140 [34] and rs1861270 [27]. Eleven of the 12 SNPs analyzed in our study were included in a recent fine-mapping analysis by the BCAC [26]. Their findings were consistent with ours in that the 11 SNPs were not statistically significant after adjusting for multiple comparisons in our study or, in the BCAC analysis, at the genome-wide significance level of P = 5 × 10⁻⁸. The BCAC results for these SNPs were not shown by receptor subtype.
To correct for multiple testing, we used Bonferroni adjustment, which is very conservative, since the SNPs and phenotypes we tested were somewhat correlated. Given the importance of replicating genetic associations [35], our study, conducted in a well-established, well-characterized prospective cohort [30] contributes important information on the relationship between CASP8 polymorphisms and breast cancer risk.
Our results for rs1045485 were not consistent with those from two meta-analyses, which reported inverse associations with breast cancer, with pooled ORs of 0.87 (95 % CI 0.83-0.92) [28] and 0.79 (95 % CI 0.69-0.92) [29]. Our findings are consistent with a number of independent studies on the same SNP, some of which were included in the meta-analyses [28,29], and with a separate study [34], in which no association was found between this SNP and breast cancer risk. The MAF (10.5 %) we observed in this study (all non-Hispanic whites) is similar to that seen in women of European ancestry [10,35]. One of the BCAC studies on CASP8, which involved >30,000 invasive breast tumors, showed that rs1045485 was most strongly associated with the risk of PR-negative tumors [25], but the association was not replicated in a later BCAC study [26]. Because no reports of development of PR status-specific chemoprevention were found at the time of the study, PR-specific subtypes were not included as outcomes in this study.
While the polymorphic CASP8 sites identified in this study are all intronic, it is possible that they affect expression of the protein or RNA splicing, which may in turn affect protein-protein interactions and other functions. rs6723097 and rs6736233 were found to have features consistent with involvement in gene transcription regulation according to the Variant Effect Predictor (VEP) on the Ensembl website (http://uswest.ensembl.org/Homo_sapiens/Tools/VEP) [36]. The other SNPs we found to be associated with breast cancer risk did not have such features. However, rs12693932 and rs6745051 are in strong LD with each other, and they are also in strong LD with the SNP rs13006529, which is a missense variant, according to the University of Washington Genome Variation Server (http://gvs.gs.washington.edu/GVS144/). Also, rs1861270 is in strong LD with the SNP rs3769823, which is also a missense variant. Neither rs13006529 nor rs3769823 has been reported to be associated with breast cancer risk. The remaining SNPs on our panel are not in LD with other SNPs with known functions.