Low-risk susceptibility alleles in 40 human breast cancer cell lines

Background: Low-risk breast cancer susceptibility alleles or SNPs confer only modest breast cancer risks, ranging from just over 1.0 to 1.3 fold. Yet they are common in most populations and are therefore involved in the development of essentially all breast cancers. The mechanism by which the low-risk SNPs confer breast cancer risks is currently unclear. The Breast Cancer Association Consortium (BCAC) has hypothesized that the low-risk SNPs modulate expression levels of nearby genes. Methods: Genotypes of five low-risk SNPs were determined for 40 human breast cancer cell lines by direct sequencing of PCR-amplified genomic templates. We analyzed expression of the four genes located near the low-risk SNPs by real-time RT-PCR and Human Exon microarrays. Results: The SNP genotypes and additional phenotypic data on the breast cancer cell lines are presented. We did not detect any effect of the SNP genotypes on expression levels of the nearby genes MAP3K1, FGFR2, TNRC9 and LSP1. Conclusion: The SNP genotypes provide a baseline for functional studies in a well-characterized cohort of 40 human breast cancer cell lines. Our expression analyses suggest that a putative disease mechanism through gene expression modulation is not operative in breast cancer cell lines.


Background
About ten percent of breast cancer patients have a history of multiple breast cancer cases in their family, suggesting the inheritance of breast cancer susceptibility alleles in these families. Germline mutations in the BRCA1 and BRCA2 genes are identified in about one quarter of the families with breast cancer. Female carriers of BRCA1 and BRCA2 mutations have an estimated 50-90% lifetime risk of developing breast cancer, classifying both genes as high-risk susceptibility genes [1,2]. Other high-risk breast cancer genes include the p53, PTEN and STK11 genes, but mutations in these genes account for only a few familial breast cancers. CHEK2 was the first moderate-risk breast cancer gene to be identified [3][4][5]. Germline mutations in CHEK2 are identified in up to 5% of breast cancer families, although their prevalence varies widely among populations. Female carriers of CHEK2 mutations have a moderate, two- to three-fold increased risk of developing breast cancer. By now, several other moderate-risk breast cancer genes have been identified, including ATM, BRIP1 and PALB2 [6][7][8][9]. Mutations in these genes all confer two- to three-fold increased breast cancer risks, and mutations in each of these genes are identified in about 1% of the familial breast cancers. Recently, the international Breast Cancer Association Consortium (BCAC) conducted a large genome-wide association study and identified five single nucleotide polymorphisms (SNPs) that are associated with breast cancer [10]. Four of these SNPs were within haplotype blocks that contained genes: SNP rs2981582 is located in intron 2 of the FGFR2 gene at chromosome 10q; SNP rs889312 is located near MAP3K1 at 5q; SNP rs3803662 is located between TNRC9 and the LOC643714 gene at 16q; and SNP rs3817198 is located in an intron of LSP1 at 11p. SNP rs13281615 is located at 8q24 in a region without any annotated genes. Importantly, independent genome-wide association studies have associated other SNPs in FGFR2 with breast cancer [11,12].
As FGFR2 had already been implicated in breast cancer [13][14][15][16][17][18][19][20], the significance of the FGFR2 SNPs as susceptibility alleles seemed evident. The TNRC9 SNP had also been associated with breast cancer in another study [21]. Lastly, the 8q24 SNP was of particular interest because other SNPs at 8q24 had been associated with increased risks of prostate cancer and colorectal cancer [22][23][24][25][26]. BCAC estimated that each of the five identified SNPs is associated with a rather small increase in breast cancer risk, ranging from just over 1.0 to 1.3 fold, classifying them as low-risk susceptibility alleles [10]. However, these low-risk SNPs are very common and their impact is therefore still substantial, together accounting for almost 5% of the familial breast cancers.
The mechanism by which the low-risk susceptibility alleles confer breast cancer risks was obscure [10]. In analogy with the high-risk and moderate-risk breast cancer genes, it had been anticipated that the identified SNPs would be associated with disease-causing alleles in the coding sequences of nearby genes. However, extensive sequencing efforts have not identified such alleles in the SNP-associated haplotype blocks, suggesting that the SNPs themselves might be the disease-causing susceptibility alleles [10]. BCAC therefore proposed an alternative disease mechanism in which the SNPs modulate the expression of genes located in their vicinity, thereby conferring low breast cancer risks. Here, we have evaluated expression modulation in a well-characterized cohort of 40 human breast cancer cell lines, allowing us to specifically address whether this mechanism might operate in breast cancer cells.

Breast cancer cell lines
The 40 human breast cancer cell lines used in this study are listed in Table 1 and have been described in detail elsewhere [27]. Microsatellite analysis with nearly 150 polymorphic markers had shown that all cell lines are unique and monoclonal [28].

Genotyping
Genotypes of five low-risk susceptibility alleles were determined: rs889312 (A>C) near the MAP3K1 gene; rs2981582 (C>T) in the FGFR2 gene; rs3803662 (C>T) near the TNRC9 gene; rs3817198 (T>C) in the LSP1 gene; and rs13281615 (A>G), located in a gene desert at chromosome 8q24 [10]. Genotyping was performed by direct sequencing of PCR-amplified genomic templates, using the BigDye Terminator V3.1 Cycle Sequencing Kit (Applied Biosystems) and an ABI 3130xL Genetic Analyzer. Primer sequences are available upon request.
Allele frequencies of the cases and controls reported by BCAC were inferred from their reported odds ratios [10], assuming that each odds ratio reflects the ratio of minor allele carriers to major allele carriers among the cases divided by the ratio of minor allele carriers to major allele carriers among the controls.
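The inference described above can be illustrated with a minimal sketch. It assumes that the odds ratio is applied at the allele level and that the minor allele frequency among controls is known; the function name and the numeric values are hypothetical and are not taken from the BCAC data.

```python
def case_maf_from_odds_ratio(control_maf: float, odds_ratio: float) -> float:
    """Infer the minor allele frequency among cases.

    Assumption (not from the source): the odds ratio equals
    (minor/major among cases) / (minor/major among controls).
    """
    if not 0.0 < control_maf < 1.0:
        raise ValueError("control_maf must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Odds of the minor allele among controls.
    control_odds = control_maf / (1.0 - control_maf)
    # Scale the control odds by the odds ratio to obtain the case odds.
    case_odds = odds_ratio * control_odds
    # Convert the odds back to a frequency.
    return case_odds / (1.0 + case_odds)


# Hypothetical illustration: control frequency 0.25, odds ratio 1.2.
example_case_maf = case_maf_from_odds_ratio(0.25, 1.2)
print(f"Inferred case minor allele frequency: {example_case_maf:.3f}")
```

An odds ratio of exactly 1.0 returns the control frequency unchanged, which is a quick sanity check on the conversion.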

Expression analysis
Transcript expression levels of four genes were determined: MAP3K1, FGFR2, TNRC9 and LSP1. Quantitative real-time PCR (qPCR) was performed on cDNA templates that had been generated with oligo-dT and random hexamer primers from total RNA isolates, using Power SYBR Green PCR Master Mix (Applied Biosystems) and an ABI Prism 7700. Ct values were normalized according to the Ct values of the HPRT and HMBS housekeeper genes. Transcript expression was also determined with Human Exon 1.0 ST microarrays (Affymetrix), as described elsewhere [29]. The exon array data have been deposited in NCBI's Gene Expression Omnibus [30] and are accessible through GEO Series accession number GSE16732.
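The exact normalization formula is not given here. As a minimal sketch of one common approach, the following assumes that the target Ct is normalized against the arithmetic mean of the two housekeeper Ct values (a delta-Ct) and that amplification efficiency is close to 100%, so that relative expression is 2 to the power of minus delta-Ct. The function names and example Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def delta_ct(target_ct: float, housekeeper_cts: Sequence[float]) -> float:
    """Target Ct minus the mean housekeeper Ct (assumed normalization)."""
    if not housekeeper_cts:
        raise ValueError("at least one housekeeper Ct value is required")
    return target_ct - mean(housekeeper_cts)


def relative_expression(delta_ct_value: float) -> float:
    """Relative expression, assuming a doubling of product per cycle."""
    return 2.0 ** (-delta_ct_value)


# Hypothetical Ct values for one cell line: target gene, HPRT and HMBS.
d_ct = delta_ct(26.0, [24.0, 22.0])
print(f"delta-Ct = {d_ct:.1f}, relative expression = {relative_expression(d_ct):.3f}")
```

A higher delta-Ct corresponds to lower transcript abundance, which is why Ct-based values run in the opposite direction to microarray intensities.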

Statistical analysis
Statistical analyses were performed with the Statistical Package for the Social Sciences (SPSS) version 11.5, considering P-values of less than 0.05 significant. Fisher's exact test was used to determine association of the SNP genotypes with the breast cancer cell lines. The Kruskal-Wallis test was used to compare gene expression levels among three SNP genotype groups (major homozygotes, heterozygotes, and minor homozygotes).

Frequencies of homozygote genotypes typically were higher than anticipated, likely related to allelic losses in the cell line samples (Figure 1a; [10]). For four SNPs (8q24, MAP3K1, FGFR2 and TNRC9), the minor allele frequencies among the cell lines were higher than among the 21,860 BCAC breast cancer cases and 22,578 population controls (Figure 1b; [10]). Fisher's exact testing indicated that the minor allele frequencies among the cell lines were significantly higher than among the BCAC population controls for two SNPs: MAP3K1 and TNRC9 (Figure 1b). In Tables 1 and 2, we also included previously-determined phenotypic and

Expression levels of nearby located genes in breast cancer cell lines do not correlate with their SNP genotype
Surprisingly, BCAC had not identified disease-causing gene variants within the haplotype blocks of the five low-risk SNPs [10]. They proposed an alternative disease mechanism, in which SNP genotypes modulate expression levels of nearby genes. Such a disease mechanism was conceivable because the minor SNP alleles confer only low risks for breast cancer. Here, we have evaluated whether gene expression modulation is operative in breast cancer cell lines, by associating the SNP genotypes of the breast cancer cell lines with the expression levels of nearby genes.
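The association of expression levels with SNP genotype relies on the three-group Kruskal-Wallis comparison named under Statistical analysis. The analyses themselves were run in SPSS; the following is only a standard-library sketch of the test, using the chi-square approximation with tie correction. With exactly three groups there are two degrees of freedom, for which the chi-square upper tail is exp(-H/2). The function names and the example Ct values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def kruskal_wallis_three_groups(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (H, p) for exactly three non-empty groups."""
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("exactly three non-empty groups are required")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    # Sum of (rank sum squared / group size) over the three groups.
    total = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all values are identical; the test is undefined")
    h /= correction
    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2.0)


# Hypothetical normalized Ct values grouped by genotype
# (major homozygotes, heterozygotes, minor homozygotes).
h_stat, p_value = kruskal_wallis_three_groups(
    [[3.1, 2.8, 3.5, 4.0], [3.3, 2.9, 3.8], [3.0, 3.6, 4.2]]
)
print(f"H = {h_stat:.3f}, P = {p_value:.3f}")
```

The chi-square approximation is coarse for very small genotype groups, so an exact or permutation-based P-value may be preferable when a group contains only a handful of cell lines.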
Basal gene expression data of the four genes physically nearest to the SNPs were obtained by Affymetrix Human Exon 1.0 ST microarray profiling and by qPCR analysis. Both transcript expression analysis methods revealed similar expression levels for each of the four genes, MAP3K1, FGFR2, TNRC9 and LSP1, with Spearman correlation coefficients of -0.6, -0.7, -0.8 and -0.4, respectively, among the 40 breast cancer cell lines (Table 3 and Figures 2 and 3). The negative sign is expected if the coefficients were computed on qPCR Ct values, which decrease as transcript abundance increases. Because BCAC had shown that the low-risk SNPs confer breast cancer risks in a dose-dependent manner, with the highest risks for the minor homozygotes [10], association between gene expression levels and SNP genotypes was assessed by three-group comparisons. Exon array data are shown in Figure 2, with cell lines from each genotype group depicted in a different color. Unique outliers typically represented decreased expression of one or more probe sets, such as exon 17 of MAP3K1 or exons 3-5 of TNRC9, possibly related to the presence of SNPs in probe sequences, alternative splicing or genomic deletions [29]. Expression of recurrent isoforms as reported by NCBI was detected only for the FGFR2 gene, with two cell lines expressing the isoform that lacked exon 9. Both cell lines were minor homozygotes for the FGFR2 SNP. Overall, there was no apparent association between the exon array expression level of each of the four genes and their SNP genotypes (Figure 2). The qPCR Ct values are detailed in Table 3 and the three-group comparisons are shown in Figure 3. Again, we did not detect any association between gene expression levels and SNP genotypes for the four genes. It is possible that gene expression levels are affected by allelic loss of the gene loci. We therefore also compared gene expression levels in major and minor homozygotes with allelic loss to the gene expression levels in cell lines without allelic loss, but gene expression levels did not correlate with allelic losses either.

Table 3 (fragment). qPCR Ct values per cell line; the column headers, the preceding rows and the last value of the final row are missing.

...-3         25   32   26   37
SK-BR-5       23   37   21   43
SK-BR-7       26   39   45   35
SUM102PT      25   35   32   29
SUM1315M02    26   43   44   35
SUM149PT      27   38   45   37
SUM159PT      27   45   43   34
SUM185PE      23   38   23   45
SUM190PT      24   45   23   45
SUM225CWN     24   36   23   39
SUM229PE      25   39   33   36
SUM44PE       22   41   26   36
SUM52PE       24   24   24   37
T47D          20   36   45   35
UACC812       24   38   24   45
UACC893       21   36   25

... these genes was operative in invasive breast cancer cells but was lost upon in vitro propagation of the cell lines. Expression analysis of carefully dissected tumor cells and non-neoplastic epithelial and stromal cells from clinical breast cancer samples should resolve this issue and may determine the precise mechanism of expression modulation by low-risk breast cancer susceptibility alleles.

Conclusion
We present the genotypes of five low-risk susceptibility alleles or SNPs in 40 human breast cancer cell lines. Using this cell line model, we have evaluated the BCAC hypothesis that low-risk SNPs confer breast cancer risks by modulating the expression levels of nearby genes. We found no evidence for expression modulation in the breast cancer cell lines, suggesting that such a disease mechanism is more likely to operate in non-neoplastic epithelial or stromal cells or has been lost during in vitro propagation of the cell lines.