Prostate cancer cell line models contain potential deleterious mutations in the DNA repair machinery
We set out to define the state of the DNA repair machinery in prostate cancer cell lines, including how it is influenced by androgen signaling. A potential difficulty of comparing androgen signaling across prostate cancer cells is - depending on the line - the AR is mutated, alternatively spliced, and expressed at different levels, all of which could affect the transcriptional output measured in response to androgen [44, 45]. For example, the most widely used prostate cancer cell line, LNCaP, carries a mutation in the ligand binding domain that affects activity. To help address this issue, we generated a line using PC3 cells, in which WT AR was re-introduced. PC3 cells are a metastatic, AR-negative prostate cancer line that is arguably the most aggressive prostate cancer line used in the laboratory. AR protein expression in PC3-AR cells is similar to the level in VCaP and LNCaP and shows efficient translocation into the nucleus 15 min after the addition of the synthetic androgen, R1881 (Fig. 1a-b). Like other prostate cancer lines, AR re-expressed in PC3 cells displays R1881- and DHT-induced release of Hsp90 and Hsp70, which reflects conformational changes induced by androgen binding to the AR ligand binding domain (Fig. 1c). Androgen-induced transcription and translation of the FKBP5 gene, which is directly regulated by AR [46] and widely used as a readout of AR activity, generates comparable levels of FKBP51 protein detected by immunoblotting in PC3-AR and LNCaP cells (Fig. 1d). These data indicate that AR stably reintroduced into PC3 cells responds to androgen, activates endogenous gene expression, and can be used as a model to study WT AR function in prostate cancer cells. We also determined that R1881 treatment of PC3-AR cells increases the fraction of cells in G1 from 39 to 65% (Fig. 1e). This property is not unique to PC3-AR cells, as LNCaP show a biphasic growth response and undergo senescence in response to 1 nM R1881 [47, 48].
To define genomic alterations in DNA repair genes in prostate cancer cells, we performed whole-genome sequencing of PC3-AR, LNCaP, and VCaP cells to obtain an average coverage of ~ 30-fold. We also used a dataset generated previously from RWPE-1 cells. We focused on the mutational status of 18 DNA repair genes (PARP1, PARP2, ERCC3, ATR, ATM, RAD50, RAD51, MRE11, NBN, CHEK1, CHEK2, MLH3, PALB2, FANCA, BRCA1, BRCA2, HDAC2, PRKDC). Twelve of these DNA repair genes display genome-level alterations in prostate cancer that are associated with positive outcome in response to Olaparib treatment [6]. Six additional genes encoding other core components of the DNA repair machinery were also included [49]. We aligned the sequences to the b37 + decoy reference sequence, removed the putative PCR duplicates, and used the GATK Haplotyper pipeline [34] to identify single-nucleotide variants (SNV) in the DNA repair gene set across the four cell lines. From our analyses of the 18 DNA repair genes, we detected 12 missense mutations in the RWPE-1 cell line, 9 missense mutations in the VCaP cell line, 2 missense mutations in the PC3-AR cell line, and 22 missense mutations in the LNCaP cell line (Additional file 1: Table S1).
To address whether these mutations could be common alterations within the human population, we annotated the variants with observed allele-frequencies in European populations from the 1000 Genomes Project [35] and the NHLBI GO Exome Sequencing Project [36]. We filtered the missense mutations to identify those that have a reported allele frequency < 0.1. Four mutations, BRCA1 (D693N), BRCA2 (R2034C), PARP-1 (P377S), and PARP-1 (V762A) in the RWPE1 cell line passed the threshold. Three mutations in the VCaP cell line, ATM (F858 L), ATM (P1054R), and DNA-PK (R2899C) had lower frequencies than 0.1 while only PARP-1 (S383Y) was found to have a lower allelic frequency than 0.1 in the PC3-AR cell line. In our analysis, the LNCaP cell line contained the most missense and frameshift mutations including ATR (K297 N), ATR (K1482R), CHEK2 (E239*), CHEK2 (T387 N), DNA-PK (N2669 V), DNA-PK (A3417T), ERCC3 (A740T), ERCC3 (R391W), FANCA (L684P), HDAC2 (A62V), MLH3 (K274I), MLH3 (I541V), NBS1 (D95N), PARP-1 (E547G), RAD50 (K608 N), and RAD50 (L719 fs*15). A literature and COSMIC search revealed that several of these mutations have been detected in other model systems and other cancer subtypes, e.g. PARP-1 (V377S), ATR (K297 N), FANCA (G809 N), and MLH3 (I541V) (Additional file 1: Table S1).
To determine whether the variants are expressed, we performed RNA-seq on the respective prostate cancer lines. We aligned the RNA-seq reads using STAR aligner, removed the putative PCR duplicates, used the SplitNCigarReads tool to partition the reads into smaller sequences representing segments beside/between splicing events, and then used the GATK Haplotyper to identify SNVs validating the DNA variant calls (Additional file 2: Table S2, “RNA-seq SNV detected”). Of note, the RNA-seq reads are limited by the expression of the gene and the location within the exon. In particular, reads that fall on the 3′ and 5′ ends of the exon will not be detected by this approach. We determined whether the SNV was detected by RNA-seq and if the corresponding exon/intron was expressed by RNA-seq (Additional file 2: Table S2, “SNV expressed” and “Exon/Intron expressed”, respectively). Using these criteria, we confirmed the following missense mutations are encoded at the mRNA level: (1) BRCA1 (D693N), BRCA2 (R2034C), PARP-1 (P377S), and PARP-1 (V762A) in RWPE-1 cells; (2) ATM (F858 L), ATM (P1054R), and DNA-PK (R2899C) in VCaP cells; (3) PARP-1 (S383Y) in PC3-AR cells; (4) ATR (K297 N), CHEK2 (T387 N), ERCC3 (A740T), ERCC3 (R391W), HDAC2 (A62V), MLH3 (I541V), and NBS1 (D95N) in LNCaP cells.
To determine if any of the genetic alterations are potentially deleterious, we used the following in silico–based methods: SIFT [40], fitCons [41], CADD [42], and PolyPhen [43]. These methods use algorithms to determine the likely impact of the mutation on the protein and are based on a number of criteria including sequence homology and relationships of a given alteration to essential protein domains. Two potential deleterious candidates - CHEK2 (E239*) and RAD50 (L719 fs*15) - were both found in LNCaP cells (Additional file 2: Table S2, Fig. 2). These mutations would result in a premature termination and in C-terminal truncations. In silico analyses also detected missense mutations in PARP1, CHEK2, RAD50, and ERCC3 that would likely influence protein function based on where these alterations fall within the respective protein domain (Additional file 2: Table S2, Fig. 2). These results reveal that similar to the genetic alterations reported in CRPC patient samples, the cell line models contain deleterious mutations in DNA repair genes associated with the response to PARP inhibition. Thus, our data from these models, and particularly LNCaP cells, provides a baseline for exploring the interplay between DNA repair genes and therapeutics.
Activation of AR generates shared and cell line-specific transcription programs
Androgen signaling has been shown to alter the expression of DNA repair genes [9, 50]. To address whether this might be a general feature of prostate cancer cells, RNA-seq data that we generated from the PC3-AR, LNCaP, and VCaP lines was used to characterize the effects of androgen. Our initial focus was on the PC3-AR line. Using RT-qPCR, we found that androgen treatment (R1881, 2 nM) of PC3-AR cells for 12 h resulted in activation of AR target genes, including FKBP5, ABCC4, EAF2, and PIAS1 (Fig. 3a). Transcripts for PSA and TMPRSS2, which are commonly used as readouts for androgen activation of transcription, were not detected in the presence or absence of androgen. To further characterize the androgen response in PC3-AR cells, we treated the cells with R1881 and harvested cells at 6 and 12 h time points for RNA-seq. Using the RNA-seq data, we employed co-expression networks to understand the broad patterns of expression changes in this cell line in response to androgen treatment. We used the WGCNA package [17, 18] which uses correlation-based inference methods to define gene-gene relationships, followed by construction of a network where each node represents a gene and each edge represents the presence and the strength of the co-expression relationship between the genes it connects. “Modules” in this case refers to genes that show a similar pattern of change with time and are identified using hierarchical clustering. Each gene is assigned a kME value that is a measure of the strength of connection of the gene with the module, and each module can be summarized using one representative gene called the “eigengene”.
We applied WGCNA to the PC3-AR RNA-seq dataset requiring a minimum module size of 30 genes. We initially identified 79 modules, but after filtering to limit the modules to (a) those with a membership greater than 100 genes and (b) those where the three replicates from the same time-point exhibit similar expression profiles, we were left with 6 modules (Fig. 3b). For each of the modules, we ranked the genes using their kME values. We then used Gene Set Enrichment Analysis (GSEA) to determine the hallmark gene-sets from MSigDB that were over-represented at the top of the list using an FDR threshold of 0.25 (Additional file 3: Table S3). The genes from modules I, IV and VI all demonstrated positive associations with the 12 h androgen time point relative to the untreated sample and were all significantly enriched for the “HALLMARK_ANDROGEN_RESPONSE”, a collection of genes from the MSigDB collection that define the androgen response [19].
We next compared the RNA-seq data from PC3-AR cells (12 h R1881) to RNA-seq data from VCaP and LNCaP cells (24 h R1881). We used DESeq2 [16] to identify differentially expressed genes (adjusted p-value < 0.05) in response to androgen treatment within and across the three cell lines. In response to R1881 treatment, PC3-AR cells had 2,190 transcripts that were upregulated and 2,423 transcripts that were downregulated, LNCaP cells had 2,633 transcripts that were upregulated and 3,185 transcripts that were downregulated, and VCaP cells had 3,183 transcripts that were upregulated and 4,096 transcripts that were downregulated (Fig. 3c). Notably, the androgen-controlled gene expression patterns from all three cell lines, including the PC3-AR cell line, was found to be strongly enriched for the “HALLMARK_ANDROGEN_RESPONSE” and the “HALLMARK_E2F_TARGETS”, in a positive and negative manner, respectively (Fig. 3d, Additional file 4: Figure S1). From principal components analysis, we found that the replicates within each group were highly similar and that most of the variability in the data is explained by the differences in the cell lines, which dominate over the effects of androgen (Fig. 3e). Overall, these results are in agreement with other reports showing that ligand activation of nuclear hormone receptors generates shared and cell line-specific transcription programs [51]. Multiple factors and pathways likely contribute to the differences observed in the cell lines, and these may include specific forms and expression levels of AR.
Expression signatures of the DNA repair and DNA damage response genes in prostate cancer cell lines
The Sawyers group explored how ADT increases the radiosensitivity of CRPC, and reported that AR transcriptional output is associated with the expression of DNA repair genes [50]. GSEA was used to define an AR-associated DNA repair gene set in human tumors (144 genes) of which a subset (32 genes) are AR target genes. In light of these observations, we set out to understand if androgen regulation of DNA repair genes is a common feature of prostate cancer cells.
We used the 32 DNA repair gene set published by Polkinghorn et al. (2013) [50] to query our RNA-seq data and found that most of these genes were not significantly altered by androgen treatment in the three cell models (adjusted p-value < 0.05) (Fig. 4a). HUS1 and RAD51C were two genes that were increased in LNCaP and PC3-AR, respectively. We found a total of seven genes underwent a significant reduction in response to androgen treatment (MRE11, FANCI, RAD18, MAD2L1, MCM7, TDP1, MSH6), though this effect did not extend to all three cell lines. From a heatmap of the regularized-log (rlog) transformed mRNA counts and androgen-mediated fold changes of 28 characterized well-characterized androgen-regulated genes [52] (Fig. 4a-b), it is clear that androgen induces gene expression changes in all three cell lines under our growth conditions.
The hierarchical clustering of the 32 DNA repair genes (Fig. 4a) based on a correlation distance and complete linkage identified 16 genes that showed a weak decrease in expression in response to androgen across all three cell lines. We repeated the DESeq2 analysis, treating the cell lines as biological replicates adjusting for the cell-line specific effects. In this analysis, we detected differential expression in response to androgen in 21 of the 32 genes (adjusted p-value < 0.05) - however, 17 of the 21 genes showed a decrease in expression (data not shown). This finding was supported by a GSEA, where we found a significant enrichment for genes that undergo an androgen-dependent decrease in expression (Fig. 4c). We found that 18 of the 32 genes formed the leading edge and account for the majority of the gene set enrichment signal (Fig. 4c).
Our analysis using the 32-gene set confirms that androgen signaling has a role in regulating DNA repair gene expression, with both positive and negative changes, depending on the gene. This prompted us to expand our analysis, which we did using two approaches. In the first approach, we used the list of differentially expressed genes from all three cell lines and filtered the list to include only those which are part of a set of 450 expert-curated DNA damage response (DDR) [21]. Using this strategy, we found 34 DDR genes that were significantly upregulated (adjusted p-value < 0.05) and 87 DDR genes that were significantly downregulated (adjusted p-value < 0.05) in response to androgen treatment (Fig. 5a, Additional file 5: Table S4). Only seven DDR genes showed androgen-mediated differential expression in all three prostate cancer cell lines. We observed that three DNA repair genes (CDKN1A, NCAPD3, and PER1) were significantly upregulated by androgen treatment in each of the three cell line models, whereas four DDR genes (EXO1, XRCC2, PER3, and TERT) were significantly decreased (Fig. 5a). Considering the cell lines as biological replicates and accounting for the differences in the cell lines in the design of the differential expression analysis, we found that 291 of the 450 DDR genes display differential expression in response to androgen (adjusted p-value < 0.05), with 192 showing a decrease in expression in response to the treatment (data not shown).
In the second approach, we set out to identify sets of genes (gene modules) that are co-regulated in response to androgen by constructing differential co-expression networks and performing topological analysis. To this end, we employed WGCNA [17, 18] to our RNA-seq dataset requiring a minimum module size of 30 genes and identified 15 modules (Fig. 5b-c, Additional file 6: Figure S2). Of note, we found four modules that were significantly related to the androgen treatment status (Point-biserial correlation, adjusted p-value < 0.05); two modules were associated with an increase in expression and two modules were associated with a decrease in expression (Additional file 7: Table S5, Fig. 5b-c, red & blue, respectively).
Using the four modules, we selected the genes with kME value > 0.8 and ranked them based on the direction of the fold-change and p-value. We then used Enrichr [53] to identify enriched annotations for each of the four modules. The modules that showed an increase in expression after androgen treatment were found to be enriched for several important pathways related to prostate cancer (Additional file 8: Table S6). Module XV was found to be enriched for cholesterol biosynthesis, steroid biosynthesis, various transport pathways, with genes found to be targeted by several ETS factors GABP, ERG, and ETS1 (p-adjusted < 0.05). Module XIV was also enriched for similar pathways to Module XV. Moreover, this gene set was significantly enriched for genes reported to be highly expressed in the prostate (BioGPS, p-adjusted 0.002591). Module I and Module II were each associated with a decrease in expression after androgen treatment. Module I was not significantly enriched for a particular pathway, while Module II was significantly enriched for ribosomal genes. But because the enrichment analyses did reveal well-established pathways associated androgen signaling in prostate cells, we believed the approach could be used to determine the presence of DNA repair genes in the modules.
We compared the genes from the modules that were significantly related to androgen treatment (Modules I, II, XIV, XV) to the expert-curated list of 450 human DDR genes [21]. We found 25 genes in modules that were affected by androgen. There were 17 genes in modules that showed an increase in expression after androgen treatment, and eight genes were in modules that showed a decrease after androgen treatment (Additional file 9: Table S7). By examining publicly-available AR ChIP-sequencing data from LNCaP and VCaP cells (GSE28126), we found that 18/25 of these genes have proximal AR binding sites (Additional file 9: Table S7, indicated in red). Interestingly, the only gene shared between the 25-gene set described here and the 32-gene set reported in Polkinghorn et al. (2013) [50] was HUS1 while five genes in our 25-gene set were identified within a 144-gene set as significantly associated with AR output in human prostate cancer samples. Within our 25 gene set are EYA3 and MTOR, which have been identified as putative AR targets in an integrative study of TCGA-PRAD samples [54]. Our module analysis expands the number of DNA repair genes impacted by androgen signaling, but it also underscores the difficulties of predicting how DNA repair activity is affecting the biological outcome since there are both positive and negative effects on gene expression.
Advanced prostate cancer exhibits aberrant expression of the DNA repair machinery
Our analyses identified a set of 25 DNA repair genes affected by androgen treatment across three cell lines and other studies have recently identified the importance of this signaling pathway in large cohorts of advanced prostate cancer [55,56,57]. We examined if our DDR gene set, or any of the 450 expert-curated DDR genes, were associated with prostate cancer disease progression. We analyzed microarray data from cases of primary and metastatic prostate cancer as well xenograft models of castration resistance. We downloaded a Z-score normalized (with respect to the normal prostate samples) expression dataset from the MSKCC study [25] provided by cBioPortal [23, 24]. We restricted the analysis to the 450 DDR genes, and performed a Kruskal-Wallis rank sum test to identify genes that showed significant association (p-value <1e-5) with the provided clinical tumor status (primary tumor or metastatic tumor). In total, 42 DDR genes passed the threshold (Additional file 10: Table S8), and the expression levels of these genes help distinguish metastatic versus primary prostate cancer. It has been shown in other studies that DNA repair pathway gene alterations are more frequent in the metastatic samples compared to primary tumors [58], but further analyses are needed to confirm if those alterations lead to expression differences of specific genes and gene sets.
Because we observed that DNA repair gene expression is increased in prostate cancer metastases (Fig. 6a), we tested whether DNA repair genes are altered in models of hormone-resistant (HR) prostate cancer. We used publicly-available gene expression microarray data from seven prostate cancer xenografts (CWR22, LAPC4, LAPC9, LNCaP, LuCaP23, LuCaP35, and LuCaP41) for which there are hormone sensitive (HS) and hormone resistant (HR) isogenic lines [20]. To determine changes in expression levels, we calculated expression fold changes between xenograft tumor samples developed in androgen-deprived conditions and the corresponding control tumor sample. When we considered all 450 DDR genes, we found 19 genes were differentially expressed in at least four of the seven prostate cancer xenograft models (Fig. 6b). More specifically, we found the expression of 11 genes were increased and eight genes were decreased in the HR lines grown in castrated mice (Additional file 10: Table S8).
Integrating the findings from prostate cancer cell lines, patient tumor samples, and xenografts, we found a total of 81 DDR genes display expression changes associated with androgen signaling and disease progression (Fig. 6c). There was very limited overlap between these gene sets. The PER1, NFRKB, and CCNC genes were altered in prostate cell lines and in the HR xenograft models. POLE was differentially expressed in the HR xenograft models and in patient metastases. NCAPD3 was differentially regulated in patient metastases and was regulated by androgen treatment in the cell line models. These 81 genes play important roles in multiple DNA repair pathways through checkpoint signaling, chromatin remodeling, chromatin segregation, and base excision repair.
With the goal of understanding how the 81 DDR genes (Fig. 6c) contribute to specific repair pathways, we used a Fisher’s exact test to compare the 81 DDR gene set to the 125 DNA repair ontologies organized and described by Pearl et al. (2015) [21]. Using a p-value cut-off of 0.01, we found there are no ontologies that are common to the cell line models, one ontology that is common to the xenograft models, and four ontologies that are common to the patient model (Fig. 6d). Thus, it appears that the expression alterations to the DNA repair pathway are highly specific to the model and possibly the disease stage. This further underscores the importance of understanding the genomics and epigenomics of the model systems that are frequently used to study prostate cancer.
Treating prostate cancer cells with the MRE11 inhibitor
Studies from several groups have shown that androgen signaling through AR induces a low level of transcription-associated DNA damage [59, 60]. This might be a characteristic of steroid hormone receptor signaling given that similar observations have been made in cells treated with estrogen and glucocorticoid [61,62,63]. Thus, it is plausible that androgen-induced expression of DNA repair genes is part of a mechanism to restore the integrity of chromatin that is damaged during transcription. Importantly, the androgen effects on DNA repair also includes androgen and AR-dependent recruitment of DNA repair factors to sites of damage, which have been shown to occur at AR-regulated enhancers and promoters [59, 60, 63]. ATM, DNA-PK, DNA Ligase IV, MRE11, and other repair factors undergo transient recruitment to AR binding sites in an androgen-dependent manner [60, 64, 65]. The application of drugs targeting DNA repair enzymes has helped reveal the importance of DNA factors to AR-dependent transcription [60, 64, 66]. We used a drug approach to test whether the small molecule inhibitor mirin, which targets MRE11 [67], can be used to inhibit AR-dependent transcription. MRE11, an enzyme that has both 3′-5′ exonuclease and endonuclease activity, is a component of the DNA damage sensing MRN complex. Mirin has been characterized extensively in a variety of cellular and biochemical contexts where it inhibits events associated with homologous recombination [67, 68], but to our knowledge, mirin has not been characterized in prostate cancer cells. Thus, we set out to answer two simple questions. (1) Does mirin inhibit AR-dependent transcription in prostate cancer cells? (2) Does mirin inhibit prostate cancer cell growth?
We tested a broad range of mirin concentrations for effects on androgen-induction of FKBP5 in PC3-AR, LNCaP, and VCaP cells. We found that 50 μM mirin reduced the androgen-induction of FKBP5 by 50% (Fig. 7a). LNCaP cells appeared less sensitive to mirin, where 100 μM mirin was required to reduce FKBP5 by 50% (Fig. 7a). Mirin treatment reduced androgen induction of the SOCS2, HOMER2, and ABCC4 genes in PC3-AR cells (Fig. 7b). Partial knockdown of MRE11 by siRNA generated a modest but statistically significant reduction in the TMPRSS2 and PSA genes in LNCaP cells (Additional file 11: Figure S3A-B). These data, which indicate that mirin can be used to reduce androgen-induced transcription in prostate cancer cells expressing WT and mutant forms of AR, support the model proposed by other groups that DNA repair is important for AR-dependent transcription [64,65,66]. As we alluded to earlier, ATM is also recruited to sites of androgen-induced damage [60, 65], and is, in part, regulated in part by MRE11 [69]. Application of the ATM inhibitor KU55933 [70] caused a striking reduction in androgen induction of the PSA, FKBP5, and TMPRSS2 genes in LNCaP cells (Fig. 7c). These show that inhibition of the DNA repair machinery can be used as a strategy to reduce androgen signaling in prostate cancer cells, and support the view that DNA repair makes an important contribution to AR-dependent transcription.
To examine the effect of mirin on prostate cancer cell growth and survival, we treated cells with drug for 72 h and performed an Alamar blue assay. We determined that PC3-AR, VCaP, RWPE-1, and LNCaP cells all display IC50 values in the range of ~ 40–70 μM (Fig. 7d-e). LNCaP cells appeared to be somewhat less sensitive to mirin than the other cell types, a result noted in the transcription assays (Fig. 7a). Importantly, mirin antagonism of androgen signaling does not result in the accumulation of damaged DNA, at least assessed by immunoblotting for global changes in activated ATM and γH2AX (Additional file 11: Figure S3C).