Exome sequencing of primary breast cancers with paired metastatic lesions reveals metastasis-enriched mutations in the A-kinase anchoring protein family (AKAPs)

Background Tumor heterogeneity in breast cancer tumors is today widely recognized. Most of the available knowledge in genetic variation however, relates to the primary tumor while metastatic lesions are much less studied. Many studies have revealed marked alterations of standard prognostic and predictive factors during tumor progression. Characterization of paired primary- and metastatic tissues should therefore be fundamental in order to understand mechanisms of tumor progression, clonal relationship to tumor evolution as well as the therapeutic aspects of systemic disease. Methods We performed full exome sequencing of primary breast cancers and their metastases in a cohort of ten patients and further confirmed our findings in an additional cohort of 20 patients with paired primary and metastatic tumors. Furthermore, we used gene expression from the metastatic lesions and a primary breast cancer data set to study the gene expression of the AKAP gene family. Results We report that somatic mutations in A-kinase anchoring proteins are enriched in metastatic lesions. The frequency of mutation in the AKAP gene family was 10% in the primary tumors and 40% in metastatic lesions. Several copy number variations, including deletions in regions containing AKAP genes were detected and showed consistent patterns in both investigated cohorts. In a second cohort containing 20 patients with paired primary and metastatic lesions, AKAP mutations showed an increasing variant allele frequency after multiple relapses. Furthermore, gene expression profiles from the metastatic lesions (n = 120) revealed differential expression patterns of AKAPs relative to the tumor PAM50 intrinsic subtype, which were most apparent in the basal-like subtype. This pattern was confirmed in primary tumors from TCGA (n = 522) and in a third independent cohort (n = 182). Conclusion Several studies from primary cancers have reported individual AKAP genes to be associated with cancer risk and metastatic relapses as well as direct involvement in cellular invasion and migration processes. Our findings reveal an enrichment of mutations in AKAP genes in metastatic breast cancers and suggest the involvement of AKAPs in the metastatic process. In addition, we report an AKAP gene expression pattern that consistently follows the tumor intrinsic subtype, further suggesting AKAP family members as relevant players in breast cancer biology. Electronic supplementary material The online version of this article (10.1186/s12885-018-4021-6) contains supplementary material, which is available to authorized users.


Results:
We report that somatic mutations in A-kinase anchoring proteins are enriched in metastatic lesions. The frequency of mutation in the AKAP gene family was 10% in the primary tumors and 40% in metastatic lesions. Several copy number variations, including deletions in regions containing AKAP genes were detected and showed consistent patterns in both investigated cohorts. In a second cohort containing 20 patients with paired primary and metastatic lesions, AKAP mutations showed an increasing variant allele frequency after multiple relapses. Furthermore, gene expression profiles from the metastatic lesions (n = 120) revealed differential expression patterns of AKAPs relative to the tumor PAM50 intrinsic subtype, which were most apparent in the basal-like subtype. This pattern was confirmed in primary tumors from TCGA (n = 522) and in a third independent cohort (n = 182).
(Continued on next page) (Continued from previous page) Conclusion: Several studies from primary cancers have reported individual AKAP genes to be associated with cancer risk and metastatic relapses as well as direct involvement in cellular invasion and migration processes. Our findings reveal an enrichment of mutations in AKAP genes in metastatic breast cancers and suggest the involvement of AKAPs in the metastatic process. In addition, we report an AKAP gene expression pattern that consistently follows the tumor intrinsic subtype, further suggesting AKAP family members as relevant players in breast cancer biology.

Background
The mutational landscape of primary breast cancer tumors has been extensively studied in recent years and a large number of exomes and full genomes have become available [1][2][3]. Most studies to date have focused on the primary breast tumor whilst mutational profiles of metastatic lesions and their relationship to the primary tumor have largely been lacking. This has important clinical implications as altered receptor status in the metastatic lesion has been shown to occur at high rates ranging from 14.5-40% for ER 40% for PgR and 0-37% for HER2-receptors during cancer progression [4][5][6] and is additionally affected by adjuvant therapy with major implications for management of the metastatic disease. The genetic evolution and accumulation of genetic aberrations in metastatic malignancies has been described in [7][8][9][10][11][12][13]. Most importantly, accumulation of activating estrogen receptor 1 (ESR1) mutations in metastatic lesions has demonstrated a mechanism of acquired hormone resistance in metastatic breast cancer [14][15][16].
In order to characterize the mutational landscape in primary tumors and metastases, we performed exome sequencing in ten patients for which paired samples were available. We found a pronounced heterogeneity both between and within patients with a general enrichment in number of somatic mutations in metastatic tissues. As a consequence of finding a number of both silent and nonsilent mutations in the AKAP9 gene, one of the recurrent cancer genes in the COSMIC data base, we investigated all 14 AKAP (A-kinase anchoring proteins) family members and found somatic mutations in seven of them, the majority of which were in metastatic lesions only. In addition, several copy number changes were present in AKAP loci, mostly deletions. The different members of the A-kinase anchoring proteins (AKAPs) directs and orchestrates the activity of Protein Kinase A (PKA), which is an important factor in cell motility and proliferation [17]. In line with the cellular function of PKA, several of the AKAP members have been associated with cancer development and metastatic spread. TCGA data confirms the presence of AKAP somatic point mutations in primary breast cancer with a prevalence of 7% and with a higher rate including copy number variations.

Patient population and ethics
Ten patients representing different clinical subtypes and metastatic lesion sites were included in the study (Table 1). Patients P1 to P7 were part of the Swedish multicenter trial TEX, ClinicalTrials.gov identifier NCT01433614 that enrolled 287 patients with locoregional or distant breast cancer relapse from 2002 until 2007 [18]. Patients P8 to P10 were part of a prospective biopsy study at Radiumhemmet Karolinska Hospital during 2007-2010. The information on primary and relapse ER, PR and HER2 status were retrospectively collected from pathology reports and reassessed by IHC/ICC. All patients were given informed consent.
Patients for cohort 2 (Additional file 1: Table S1) were identified using search criteria in the digital patient record system which allowed collection of formalin-fixated paraffin-embedded (FFPE) material from primary breast cancer, local recurrence, axillary-and distant metastases as well as clinical information as described in Ullah et al. (submitted). Samples were obtained only after approval by the Ethical committee at Karolinska Institutet, Stockholm. Samples were further investigated and assessed by a certified surgical pathologist. All studies were approved by the Regional Ethical Review Board in Stockholm, Sweden.
Cohort 1: DNA extraction and exome capture DNA was extracted from fresh frozen fine needle aspiration (fnac) biopsies (metastases), fresh frozen surgical material (primary tumors) and blood (germline control), concentrations were determined using Qubit dsDNA HS assay (Life Technologies). Due to low concentrations obtained from aspirations, we used WGA in order to meet the required amounts of DNA needed for exome enrichment. All paired samples including germline DNA were subjected to WGA using GenomiPhi V2 DNA Amplification Kit (GE Healthcare) according to manufacturer's instructions, a summary of DNA concentrations can be found in Additional file 2: Table S2. The fragment length after ultrasonic DNA fragmentation by Covaris (LGC Genomics) was measured using 2200 TapeStation Genomic DNA screen tape (Agilent Technologies), amplified DNA did not differ from non-amplified DNA (data not shown).
Exome capture from amplified tumor and normal DNA was performed using Sure Select Human All Exon V4 (Agilent Technologies) according to the SureSelect XT Target Enrichment System for Illumina Paired-End Sequencing manual version 1.3. Sample preparation, hybridization and post-hybridization amplification were performed according to the manufacturers instructions, with five cycles of PCR in amplification of adaptorligated library, and 12 cycles in the post-hybridization captured library amplification step. The quality of final libraries was evaluated using TapeStation High Sensitivity DNA kit (Agilent) and the libraries were quantified using KAPA SYBR® FAST ABI Prism qPCR Kit (KAPA Biosystems). Enriched exome fragments were pooled and paired-end sequenced on a HiSeq 2000 platform (Illumina).

Cohort 1: Variant detection
Demultiplexing, alignment and variant detection were carried out using the Casava 1.8.2 pipeline from Illumina with default parameters and duplicate removal. Samtools v0.1.9 (r783) was used to create mpilups from the bam files and variant identification was performed using the VarScan (v2.3.2) [19] somatic function with default parameters, but a minimum variant frequency of 5% and a somatic P value of 0.2. The VarScan-identified positions were examined using the Ensembl variant prediction tool (v 2.6 and database ver 69) and known SNPs were labeled using the SnpEff database (SnpSift version 1.9c).
The VarScan, Variant effect prediction, SnpEff and mpileup base count outputs were joined on position using inhouse scripts to facilitate downstream filtering and comparison. The output data was filtered by retaining only those positions with variant frequency in germline sample < 1%, variant reads in tumor ≥ 5, and total coverage > 30. Average coverage per exome was 47.3 Mb (range 18.8-51.1 M) (Additional file 3: Figure S2a). SNVs, CNVs and LOH were called by comparative analysis of sequence variants between tumor and germline samples. To account for potential false positive mutation calls, the output data was manually filtered and positions with variant frequency in germline sample < 1%, variant reads in tumor < 5 and a total coverage of < 30 reads were excluded, leaving only exome wide mutations with an VAF > 16%.
To focus the analysis, a set of 505 recurrently mutated genes was used to select somatic variant positions of interest. For these genes, all variant positions occurring in the pairs of primary and metastatic lesions were selected, including those with less than five tumor variant reads, not to exclude true shared low frequency mutations. The selected positions were manually investigated, visualizing the unfiltered reads in Integrative Genomics Viewer (Broad Institute) with respect to alignment quality, base quality of variant base, read direction and total coverage on the position. Variant positions with the following conditions were manually removed from the data set: variant reads in normal sample, poor alignment or highly variable sequences and variant reads in one direction only.

Cohort 2: Variant detection
Raw sequencing reads were quality and adapter trimmed using trim_galore, where the first 13 bases of Illumina adapter were used and stringency parameter was set to 2. Reads having lengths less than 70 after trimming were filtered out with their paired mates. Trimmed reads were aligned to the human reference genome build hg19 using bwa-mem with default parameters. The aligned reads were marked for duplicates by Picard (25%), realigned around known indels and base-quality recalibrated by GATK. Somatic single nucleotide variants (SNVs) were detected by Mutect [20] with the high-confidence mode. Copy number alterations were detected by ADTEx [21]. All pipelines and analyses were run using Anduril [22], a workflow framework for scientific data analysis. Mean coverage per sample is given in (Additional file 3: Figure S2b).

CNV calculations
Copy number variations (CNVs) were determined by calculating the normalized read coverage at each SNP position for paired tumor and germline samples, then taking the log 2 of the ratio of these numbers and recentering to zero. In Additional file 4: Figure S1a-b, these measures were plotted as a moving average across ten adjacent SNPs.

LOH calculations
Estimating LOH fraction and tumor content using a Beta-Normal mixture model Additional file 5: Figure  S3a-b. Histograms show the distribution of major allele frequencies (range: 50-100%) and red curves show the mixture model estimated from the data. The set of SNPs called in the germline sample was considered in the tumor sample (primary and metastasis independently). When genomes contain regions of LOH, the distribution will be bimodal, as illustrated in the inset (top right). The first component (closer to 50%; red in inset) represents heterozygous SNPs in regions without LOH and was modeled as a normal distribution with a mean close to 50% in a perfect sample, but will increase towards 100% as allelic dropout increases.
The second component (closer to 100%; blue in inset) represents homozygous SNPs in regions with LOH, was modeled as a beta distribution with two parameters (the mean of this component should be close to 100% and increase as LOH fraction increases and decrease as with non-cancer cell contamination. The mixture distribution thus had four free parameters plus the mixture proportion, the latter representing the estimated LOH fraction of the sample. Fitting this mixture to the observations yielded estimates for LOH fraction, tumor fraction and allelic dropout rate for each sample (Additional file 5: Figure S3). Model fitting was performed using the EstimatedDistribution function of Mathematica 9.0 (Wolfram Research Inc.).

Microarray
Microarray experiment was performed as follows; total RNA from frozen metastatic tumors was extracted using the Qiagen RNeasy Mini Kit (Qiagen, Germany). All patient tumor samples were profiled using NuGEN amplification protocol and hybridized using the HRSTA-2.0 custom human Affymetrix array GPL10379. The microarray gene expression analyses were performed in R using the aroma.affymetrix package. The PAM50 intrinsic subtypes (PAM50) for cohort 1 relapses have been previously published [23] and publicly available GSE56493. The case-control cohort have been previously published [24] GSE48091. The normalized expression array and intrinsic subtype calls for the TCGA data were taken from the original manuscript [1].

Control experiment
To control for amplification-induced artefacts i.e. false positive and false negative mutation rates, we designed a control experiment as follows: Genomic DNA from a healthy individual was extracted from whole blood using the PAXgene Blood DNA kit (Qiagen). The DNA was of good quality (> 58 kB fragment length) and of high concentration (> 300 ng/uL). This DNA was diluted 1:100 and 1:1000 and 6 ng respectively 0.6 ng was subjected to WGA, exome enriched, sequenced and analyzed as described above using an unamplified sample as 'germline' control, (Additional file 6: Table S3).

Results
We first analyzed somatic mutations in a set of 505 previously recognized recurrent cancer genes from COSMIC census database and The Cancer Genome Atlas [1] (henceforth called 'recurrent cancer genes') where an enrichment for true driver mutations may be expected. We detected a total of 654 somatic mutations, of which 301 were nonsynonymous, distributed over 190 different genes. Of nonsynonymous mutations, an average of 43 and 47% were predicted to be deleterious and tolerated, respectively.
The number of somatic mutations was significantly different between primary tumor and metastatic lesion The mutational load was significantly greater in metastases, both in recurrently mutated genes, (average: 6, range: 0-20 vs average: 24, range: 2-58; p < 0.001; Student's t test) (Fig. 1a) and in all exomic regions (average: 222, range: 11-825 vs average: 706, range: 51-1411; p < 0.05; Student's t-test) (Fig. 1b). In eight of the patients, the metastatic lesions showed a higher number of somatic point mutations than the primary tumor (range: 2-19 fold higher). Interestingly, in two of the patients (patients P6 and P7) the relationship was inverted, with fewer mutations in recurrently mutated genes in the metastatic tissue ( Fig. 1a; 2.3 and 7 fold fewer) and ( Fig. 1b; 3 and 10 fold fewer). The lower number of mutations in these metastases coincided with a reduced level of CNV and LOH in these two patients (Additional file 4: Figure S1b). This unexpected pattern was also seen in five of 20 patients in cohort 2 (data not shown). The number of shared mutations varied between 0 and 47 (average 15.4) in whole exomes and included potential driver genes PDGFR, TP53, DAXX, ERBB2, FBOXO11 and JAK3 (Fig. 1c). Despite the small cohort we found several mutations in previously described genes, frequently mutated in breast cancer [1][2][3] (Table 2). Importantly, no recurrent cancer genes were found mutated in the control experiment.

Ti/Tv
The average Ti/Tv ratio for the heterozygous SNPs for each exome in cohort 1 was 2.7 (2.6-2.8) and for exome wide variant positions 3.37 and 7.7 in the primary tumor samples and the metastatic samples respectively (p < 0.05). The high Ti/Tv ratios in especially metastatic samples is likely a result of the predominance of C > T/G > A, which can be seen among the AKAP mutations (Tables 1 and 2). These substitutions are enriched in a majority of cancers and could be associated with the age at cancer diagnosis as well as adjuvant chemotherapy using alkylating drugs or adjuvant radiotherapy [25]. The Ti/Tv ratio are further known to be elevated in breast cancer, were a bias towards TpC mutation hotspots as well as a gene expression related bias towards A > G transitions compared to T > C transitions in coding strand and have been reported [26]. Specifically in breast cancer, mutational signatures clearly shows a predominance of C > T mutations, with possible associations to the ageing processes and APOBEC-induced mutagenesis [25,27]. Interestingly, we have discovered a significant enrichment of the APOBEC related mutational signature among the metastatic tumors in cohort 2 (Ullah et al. submitted), which is in support of the high level of Ti/Tv ratio in metastatic lesions of cohort 1. A similar, enrichment of the APOBEC signature in metastatic breast cancer has also been reported by others [28].

Mutations in the AKAP gene family are enriched in metastases
While investigating the list of recurrently mutated cancer genes we found that AKAP9 carried an unexpected number of both silent and nonsilent mutations in the metastatic lesions. Because of the previously reported implication of AKAP9 in breast-and other cancers, the analysis was extended to include all members of the AKAP gene family. We found a striking enrichment of mutations in the family of A-kinase anchoring proteins (AKAPs) in metastatic samples (p < 0.02; Fisher's exact test) compared with primary lesions. Altogether nonsynonymous coding mutations were found in seven of the AKAP family members AKAP5, AKAP6, AKAP8, AKAP9, AKAP10, AKAP12, AKAP13 ( Table 3). Out of the ten patients in the first cohort, five (50%) carried mutations in AKAP genes in only metastatic tissue (four) or in only the primary lesion (one). Out of the total nine AKAP mutations recorded, eight were found uniquely in the metastatic lesions, indicating that these mutations have emerged in a subclone during late primary tumor evolution. None of the "metastatic-unique" mutations could be found in the corresponding primary tumors, even at low allele frequency levels.
We next wanted to investigate this interesting mutation pattern in an independent cohort available in our lab. This cohort consists of 20 patients with paired primary breast tumor and multiple metastatic sites and is comparable regarding tumor grade, treatment and survival (Additional file 1: Table S1). Out of the 20 additional patients, four (20%) carried mutations in AKAP genes (AKAP3, AKAP4, AKAP9, AKAP11) in both primary tumor and metastatic tissue or in only the metastatic lesion (Table 4). In total four nonsynonymous mutations were recorded, two of these were only found in the corresponding metastatic lesion. The enrichment of mutations in metastases was not significant in cohort 2 (Fisher's exact test), possibly because of the small size of the cohort. However, in the two patients (patients 7 was an increase of allele frequency in the metastatic lesions. The same increase was seen in two of the patients having multiple relapses from different time points (patient 19 and 8), indicating clonal selection. In all four patients the tumor cell fraction was over 80% in the metastatic samples. In patient 19 however, tumor cell content was as low as 20% in the primary tumor, possibly leading to an underestimation of variant allele frequency. This highlights the possible role of AKAP deregulation in the oncogenic and metastatic setting discussed in [29][30][31].
The protein structures of mutated AKAPs are summarized in Fig. 2 including the functional protein kinase interaction domains (PKA-RI/II, PKC) from the UniprotKB database and [32]. None of the reported mutations were located within the PKA-binding domains although metastatic mutations in AKAP9 (S2518 N) and AKAP13 (D477N) were located within a 20 a.a from the domain. AKAP10 mutation L182P is located within one of the regulator of G-protein signaling (RGS1) motif [33]. Interestingly, the AKAP8 (Q169X) mutation is located within the MCM2 binding region (see discussion), and the two AKAP12 mutations were found in a proposed EGFR interacting domain [34].

AKAP cytogenetics
Several copy number variations (CNV) in regions containing AKAP genes were detected. AKAP7 and AKAP12, both located on chromosome 6 at 130.5 and 151 Mbp respectively, were deleted in both primary tumor and metastasis in two of the patients (P1, P5). In patient P2 the AKAP7 and AKAP12 deletion was aquired in the metastasis (Table 5). One patient carried deletions in both primary tumor and metastasis in a chromosome 13 region containing AKAP11 (P1) and two additional patients seemed to have aquired the deletion in the metastasis (P3 and P8). In patient P4 AKAP8 (chr 19) was amplified in the metastasis but also had aquired a deleterious nonsense mutation. The same patient also showed a focal amplification in chromosome 17 involving AKAP1.
The second cohort showed similar CNV patterns for AKAP1 and AKAP12. AKAP1 was amplified in both primary and metastasis in ten of the 20 patients. Interestingly, AKAP1 amplifications was the most frequent mutation in the Breast Invasive Carcinoma TCGA data set altered in 8% of the tumors (n = 1098), data from cbioportal.org (Additional file 7: Figure S4b).
AKAP12 was deleted in both primary and corresponding metastasis in two patients and in one patient the 6q25.1 deletion was aquired in the metastatic lesion. Loss of AKAP12 has been shown to increase cancer incidence and metastatic spread in several cancers including prostate [35,36] and breast cancer [37].

AKAP gene expression
As some of the AKAP members are reported with altered gene expression levels in the literature [38,39] we    wanted to further explore the gene expression of AKAP gene family in the setting of breast cancer using The Cancer Genome Atlas (TCGA) dataset (n = 522). Interestingly, a significant gene expression pattern of AKAPs was noted relative to tumor PAM50 subtypes whereby AKAP1, 3, 7 and 8 were highly expressed in basal-like subtypes and lower expressed in the other subtypes (p < 0.001 Basal vs. other subgroups, ANOVA with post-hoc Tukey, Additional file 8: Figure S5a and Fig. 3a). Conversely, AKAP5, 9, 10, 11 and 12 all showed lower expression in the basal-like subgroup (p < 0.001, ANOVA with post-hoc Tukey Additional file 8: Figure S5b and Fig. 3a). We further investigated the gene expression of the same AKAP gene sets in primary tumors from an independent nested case-control cohort, consisting of 182 patients stratified by metastatic risk with. A similar AKAP expression pattern was found (p < 0.001, Basal vs. other subgroups, ANOVA with post-hoc Tukey, Additional file 8: Figure S5c, d and Fig. 3b). Finally, in the full-sized TEX cohort 1 with metastatic lesions (n = 120), the pattern was less distinct, however, both AKAP1 and AKAP8 as well as AKAP5 ad AKAP9 showed a relative expression in the basal-like group consistent with the other cohorts (Additional file 8: Figure S5e, f and Fig. 3c).

Discussion
In this paper we report data from the exomes of primary and metastatic lesions in breast cancer. Initially we used DNA from a small subset of patients that were part of a larger study cohort. In order to obtain sufficient DNA for exome capture we used WGA for all samples including germline. The risk of introducing false positive mutations using WGA prior sequence capture was investigated by Hasmats et al. [40] who described that the majority of false mutations were found in low coverage areas (< 10 reads), indicating that caution should be taken, both of allelic drop-out as well as low coverage regions due to problematic sequences. We addressed the issues of both false positive mutations and false negative discovery rates (Additional file 6: Table S3) and found low frequences of both, except for three of the metastatic samples in cohort 1 with high levels of false negative rates, presumably due to low starting concentrations and allelic drop out during amplification. We could not detect any other sources of variability due to different sample sources in cohort 1. We have used two different cohorts obtained at different time points as well as different sources of tumor material using two different but comparable bioinformatic pipelines. While cohort 1 was used for initial findings, cohort 2 was used for validation. Using two different cohorts, separate statistics as well as separate analytic pipelines is a methodological strength and supports our findings regarding AKAP mutations. In cohort 1, we report a large increase in number of mutations for some patients. We also found a low proportion of shared mutations between primary and metastatic tumor tissue. While the average increase in mutations in metastatic tissue was less pronounced in cohort 2 we confirm a higher mutational load in metastatic tumors. The contribution of APOBEC related mutagenesis has been revealed to be increased in metastases and might impact the mutational load in metastatic tissues. Low percentage of overlapping mutations is affected by tumor heterogeneity. Up to 70% of tumor heterogeneity both within a primary tumor but also between primary and metastasis have been reported in breast cancer and, with even higher discordance in other cancers [41]. Changes in the mutational processes and the inherent tumor heterogeneity could both lead to a low number of shared mutations.
Here, we report for the first time that somatic mutations in the AKAP gene family occur in breast cancer and are enriched in metastatic lesions. Studies including both primary breast cancer tumors and their corresponding metastatic lesions are still rare but Lefebvre et al. report the mutational profiles from metastases from over 200 patients using TCGA data as primary tumor reference [28]. In this study, mutations in the AKAP gene family are well represented with one or more AKAP mutations being present in 30 out of 216 patients (14%) being compared. In total 42 nonsynonymous mutations are present in 9 different AKAP genes, supporting our findings.
A-kinase anchoring proteins (AKAPs) are members of a protein family acting as anchors for Protein Kinase A (PKA) by specifically associating PKA regulatory subunits RI and RII to cellular organelles and directing its active signal transduction spatially and temporally via cyclic AMP [17]. This functionality has implications for both cell (See figure on previous page.) Fig. 2 Gene view of AKAP mutations and AKAP protein regions and domains. Drop symbols indicating location of amino acid exchange light green: mutation present only in metastasis; dark green: mutation present in both primary tumor and corresponding metastasis; blue: SNP reported association with familiar breast cancer [46,48,66]. PKA-RI/II: protein kinase A regulatory subunit I/II; WSK: short conserved WXSXK motif in protein kinase A binding proteins (AKAPs); ZF: Zinc finger repeats; MCM2: minichromosome maintenance complex 2 [33]; DH: Dbl homology domain (RhoGEF); PH: Pleckstrin homology domain; Protein kinase interacting domains are provided from uniprotKB with additional domains from [32]**. EGFR interacting domain from [63] Table 5 Copy number variations (CNVs) in cohorts 1 and 2 at AKAP containing loci CNV in cohort 1 were determined by calculating the normalized read coverage at each SNP position for paired tumor and germline samples, then taking the log 2 of the ratio of these numbers and recentering to zero (see also Additional file 4: Figure S1 and Additional file 3: Figure S2) motility and cell proliferation. Studies show that an altered balance between subunit RI and RII has implications for tumorgenesis in prostate [42], colon [43,44] and in addition a prognostic value in breast cancers [45]. Overexpression of subunit RI tends to increase proliferative mechanisms and is elevated in malignant lesions, whereas the RII counterpart tends to upregulate apoptotic pathways, downregulate proliferative genes and shows a decreased expression in malignant cells.
In line with the function of PKA in tumorgenesis, several of the AKAP members have been associated with cancer development and metastatic spread. Although only AKAP9 has been previously reported as a recurrently mutated gene in COSMIC (Catalogue of Somatic Mutations in Cancer) database, several members of the AKAP family have been found to be associated with both oncogenic and tumor suppressing functions in several cancers including breast cancer. Specific polymorphisms in AKAP10 and in a b c Fig. 3 Expression profiles of the AKAP gene family in three different cohorts, tumors ordered by PAM50 subtype, genes are orderad by hierarchial clustering. a TCGA RNAseq expression profiles from primary tumors (n = 522). Genes are ordered by hierarchical clustering p < 0.0001 by ANOVA; q < 0.000072. Intrinsic subtype calls for the TCGA data were taken from the original manuscript [1]. b Risk Cohort microarray expression profiles from primary breast cancer tumors (n = 182). c Cohort 1 microarray expression profiles from metastatic lesions (n = 120) particular in combination with a SNP in AKAP13 were shown to be associated with increased risk for familial breast cancer [46]. The same polymorphism in AKAP10 (Ile646Val) (Fig. 2) was also found to be associated with colorectal cancer risk [47]. Additionally, specific polymorphisms in AKAP9 are associated with increased risk in familial breast cancers [48]. AKAP12 expression is downregulated in various cancers summarized in [29] including breast cancer [37] and have been suggested as a suppressor of invasiveness in prostate cancer through inhibition of the PKC-Raff/MEK/ERK pathway as well as inhibition of VEGF-mediated neovascularization [49]. Here, two mutations in AKAP12 were present in the EGFR-binding domain (Fig. 3), indicating a possible function in the deregulation of EGFR signaling pathways present in malignant cells [50]. Furthermore, the down regulation of AKAP12 is often associated with promoter hypermethylation or loss of its locus 6q24-25.2 [51]. Based on this, AKAP12 has been suggested to have a tumor suppressing function [49]. AKAP12 and AKAP11 was also shown to be essential for endothelial barrier function, the latter by linking cAMP signalling to adherens proteins such as VEcadherin and β-catenin [52]. Similarly, AKAP5 have been shown to localize with E-cadherin and β-catenin in epithelial adherens junctions stabilizing F-actin [53]. Furthermore, overexpression of AKAP5 was shown to reduce proliferation and hyperplasia in smooth muscle cells [54]. AKAP9 has been shown to be essential for microtubuli dynamics at adherens junctions [55] and recently [56], demonstrated that in mice with AKAP9 deficient T-cells, antigen-dependent activation of the T-cell by TCR recycling is impaired, suggesting a possible role in tumor cell immune surveillance (authors comment).
Taken together, AKAP 5, 9, 11 and 12 all seem to be involved in various aspects of epithelial integrity, presenting them as potential players in cell migration and metastatic mechanisms. In our data, loss of the AKAP12 locus occurs in six out of 30 patients. In four of these, the loss is detected in metastasis only. AKAP5 was deleted in two of 20 patients, both in which the lesion was found uniquely in the metastasis (Table 4). Interestingly, AKAP5, AKAP9, AKAP11 and AKAP12 is expressed at lower levels in the basal-like and, less prominently in HER2-enriched subtypes (Fig. 3a-b), the breast cancer subtypes with highest risk of recurrence [57].
AKAP13, also named lymphoblast crisis oncogene (AKAP-LBC) binds, and is negatively regulated by the metastasis suppressor protein NDKA encoded by the NM23 gene [31]. Disruption of AKAP13 mediated anchoring of PKA decreased carcinoma cell line migration in vitro [58]. Importantly, overexpression of the NM23-H2 isoform was shown to downregulate nuclear receptor peroxisome proliferator-activated receptor δ (PPAR-δ) in human cholangiocarcinoma cells [59]. Recently it was shown [60] that AKAP13 is essential for tamoxifen resistence induced by PKA dependent phosphorylation of ERαS305 through its direct association via ERα-binding domain and PKA. One of the mutations in AKAP13 (D477N) found in this study, is indeed located in the PKA-RII binding subunit of AKAP13 (Fig. 2).
AKAP4 expression is detected at high rate in various breast cancer tumors and has been suggested as a biomarker for breast and prostate cancer [38]. Specifically AKAP3 have been shown to play a role in cell migration and invasion in ovarian cancer [39]. AKAP1 was shown to be required for cAMP-dependent PKA mediated apoptosis in colorectal cancer cells upon IGF1R inhibition [61]. AKAP8 is a nuclear localized protein suggested to play a role in both DNA replication, cell cycle progression and chromatin condensation [29,62] and also interacts with minichromosome maintenance protein 2 (MCM2). Disruption of the AKAP8-MCM2 interaction abolished DNA replication in HeLa cells [63]. In our study the AKAP8 mutation in the metastatic lesion was located within the MCM2-binding domain (Fig. 2).
Interestingly, in line with the findings in our study, somatic mutations in AKAP8 and AKAP9 were found in multiple metastases in a patient with renal cell carcinoma but were both absent in the primary tumor [64]. Similarly, an AKAP9 mutation was detected in circulating tumor DNA upon relapse, but not in the primary tumor or in blood samples prior disease progression [65]. These findings further underscore the metastasis-specific enrichment of AKAP mutations. AKAP mutations are highly prevalent in most primary cancers available in The Cancer Genome Atlas data accessed through CBioportal including CNV and point mutations (Additional file 7: Figure S4a). Lung squamous carcinoma has the highest mutation frequency with over 50% of the samples having alterations in any of AKAP genes 1-14, including 20% point mutations. In primary breast cancer almost 30% have alterations including 10% point mutations. Among these, AKAP1 and AKAP11 are altered in > 10%, including gene expression alterations. AKAP3, 7, 8, 9, 10 and 13 are all altered in > 5% (Additional file 7: Figure S4b).
Our findings reveal that although present in 10% (3 of 30) primary tumors, AKAP mutations are much more prevalent (40%) in the metastatic lesions (12 of 30), indicating a clonal selection of certain malignant subpopulations during the metastatic process.
In summary, we report that several members of the Akinase anchoring protein family (AKAPs) are mutated in metastatic breast cancer lesions. We further describe that the AKAP gene family is differently expressed in both primary and metastatic tumors and that basal-like tumors show an altered AKAP expression profile dividing the AKAPs in one highly expressed group; AKAP1, AKAP3, AKAP7, AKAP8 and one low group; AKAP5, AKAP9, AKAP10, AKAP11, AKAP12. Much of previously reported functions of these individual AKAPs points towards that this AKAP clusters indeed could reflect true metastatic potential. The reduced expression of genes involved in stabilization of adherence proteins (AKAP5, 9, 11 and 12) as well as increase expression of genes associated with poor prognosis and proliferation (AKAP3 and AKAP8) strongly underscores the involvement of AKAP genes during tumor growth and dissemination.

Conclusions
Several studies have reported individual AKAP genes to be associated with cancer risk and specifically metastatic relapses but no study has so far demonstrated the presence of somatic genetic alterations in clinical metastatic samples. Here, we demonstrate that somatic mutations occurs in a multitude of the members of the AKAP family in breast cancer tissue, and that several of the mutations are acquired or enriched in the metastatic corresponding metastatic tumors. Several of these mutations do occur in functionally relevant regions. Further, we demonstrate that AKAP gene expression follows a subtype specific pattern that was confirmed in three independent cohorts, pinpointing a subset of the AKAP members to be interesting targets for further investigation. While conclusions regarding functionality of mutations or significance of differential AKAP gene expression, cannot be drawn here, our findings presents a novel example of convergent mutations towards a gene family with many reported implications in cancer disease. Further studies will be needed to investigate the functional role of individual AKAP genes in cancer and the implication of differential expression in breast cancer.

Additional files
Additional file 1: Additional file 3: Figure S2. Mean coverage of whole-exome sequencing in a) cohort 1 and b) cohort 2. b) Exome capture and sequencing was performed by SeqWright Genomic Services (GE Healthcare, Houston, USA). Briefly, exome capture was performed using Sure Select XT2 Human All ExonV5 (Agilent Technologies) according to manufacturers instructions. Paired end sequencing was performed on Illumina HiSeq 2500. Raw sequencing reads were quality and adapter trimmed using trim_galore, where the first 13 bases of Illumina adapter were used and stringency parameter was set to 2. Reads having lengths less than 70 after trimming were filtered out with their paired mates. Trimmed reads were aligned to the human reference genome build hg19 using bwa-mem with default parameters. The aligned reads were marked for duplicates by Picard, realigned around known indels and base-quality recalibrated by GATK. Somatic single nucleotide variants (SNVs) were detected by Mutect with the high-confidence mode. Somatic short indels were detected by Varscan2 with minimum variant allele frequency of 0.05. Copy number alterations were detected by ADTEx/ Ascat. All pipelines and analyses were run using Anduril, a workflow framework for scientific data analysis. (PDF 915 kb) Additional file 4: Figures S1a and S1b. Exome-wide view of mutations in metastatic breast cancer. Each panel (numbered by patient ID) shows whole-exome data from one patient in several aligned tracks, as indicated on the right. SNV: single-nucleotide variations, shown as tick marks for COSMIC recurrent cancer genes (outer tracks) and as density for all mutations (inner tracks). The central colored density gradient shows major allele concordance P value between primary and metastasis (50 SNP moving windows; blue: concordant; red: discordant), indicating the presence of concordant loss of heterozygosity. CNV: copy-number variation (red: deletion, blue: amplification); LOH: lossof-heterozygosity. The Tumorscape track shows known hotspots for copy-number variation in breast cancer using data from Tumorscape Copy Number Alterations Across Multiple Cancer Types Release 1.6 (Broad Institute). CNV was determined by calculating the normalized read coverage at each SNP position for paired tumor and germline samples, then taking the log 2 of the ratio of these numbers and recentering to zero. In Fig. 1, these measures were plotted as a moving average across ten adjacent SNPs. LOH: loss of heterozygousity) as major allele frequency for heterozygous SNPs on the vertical axis, with manually called LOH regions indicated by horizontal lines (vertical axis range 50-100%; red: significant SNP phase concordance with the paired sample, p < 0.01 by the binomial distribution; black: not significant (see also Additional file 3: Figure S2 a-b). Note that LOH could not be called for some samples (metastases of P1, P2, P4 and P10) because of low data quality. SNP phase concordance was determined using all SNPs in the LOH region in A and recorded the major allele of each (i.e. the 'phase' of the LOH region). We then tested the null hypothesis that the alleles of the corresponding SNPs in sample B was randomly distributed relative to A, with a 50% chance of concordance at each position. This would be expected if there were in fact no LOH in B, as the major allele would then be determined by the random fluctuations of read coverage. Under this model, the probability of observing k concordant calls among n SNPs is distributed according to the Binomial distribution (k; n, p) with p = 0.5. Thus we calculated the P value as the cumulative density of this distribution from k to n (that is, the probability of observing as many as k concordant calls, or more). A low P value (e.g. P < 0.01) on this test suggest that the LOH region in A was in fact also present on B, whereas a high P (e.g. P > 0.99) value suggests that there was LOH in B, but it was derived from the opposite allele compared with A. Intermediate values are expected whenever there was no LOH in sample B. The central track for each patient in Additional file 4: Figure S1 shows this P value. (ZIP 1670 kb) When genomes contain regions of LOH, the distribution will be bimodal, as illustrated in the inset (top right). The first component (closer to 50%; red in inset) represents heterozygous SNPs in regions without LOH and was modeled as a normal distribution with a mean close to 50% in a perfect sample, but will increase towards 100% as allelic dropout increases. The second component (closer to 100%; blue in inset) represents homozygous SNPs in regions with LOH, was modeled as a beta distribution with two parameters (the mean of this component should be close to 100% and increase as LOH fraction increases and decrease as with non-cancer cell contamination. The mixture distribution thus had four free parameters plus the mixture