Pathway activity profiling of growth factor receptor network and stemness pathways differentiates metaplastic breast cancer histological subtypes

Background Gene expression profiling of rare cancers has proven challenging due to limited access to patient materials and requirement of intact, non-degraded RNA for next-generation sequencing. We customized a gene expression panel compatible with degraded RNA from formalin-fixed, paraffin-embedded (FFPE) patient cancer samples and investigated its utility in pathway activity profiling in patients with metaplastic breast cancer (MpBC). Methods Activity of various biological pathways was profiled in samples from nineteen patients with MpBC and 8 patients with invasive ductal carcinoma with triple negative breast cancer (TNBC) phenotype using a custom gene expression-based assay of 345 genes. Results MpBC samples of mesenchymal (chondroid and/or osteoid) histology demonstrated increased SNAI1 and BCL2L11 pathway activity compared to samples with non-mesenchymal histology. Additionally, late cornified envelope and keratinization genes were downregulated in MpBC compared to TNBC, and epithelial-to-mesenchymal transition (EMT) and collagen genes were upregulated in MpBC. Patients with high activity of an invasiveness gene expression signature, as well as high expression of the mesenchymal marker and extracellular matrix glycoprotein gene SPARC, experienced worse outcomes than those with low invasiveness activity and low SPARC expression. Conclusions This study demonstrates the utility of gene expression profiling of metaplastic breast cancer FFPE samples with a custom counts-based assay. Gene expression patterns identified by this assay suggest that, although often histologically triple negative, patients with MpBC have distinct pathway activation compared to patients with invasive ductal TNBC. Incorporation of targeted therapies may lead to improved outcome for MpBC patients, especially in those patients expressing increased activity of invasiveness pathways. Electronic supplementary material The online version of this article (10.1186/s12885-019-6052-z) contains supplementary material, which is available to authorized users.


Background
Metaplastic breast cancer (MpBC) is a rare and aggressive histological subtype comprising 1 % or less of all breast cancer cases [1][2][3]. MpBCs are often negative for estrogen/progesterone receptor expression and HER2 amplification, yet this subtype differs in histology from invasive ductal triple negative breast cancer (TNBC) by the presence of mesenchymal (chondroid, osteoid), spindle cell, and/or squamous neoplastic cell populations [1]. Indeed, this histologically complex cancer often presents with multiple cell populations of mixed histologies. Patients with MpBC suffer from a worse outcome than those with invasive ductal TNBC, and MpBC patients demonstrate a poor response to chemotherapy [3][4][5]. Due to its rarity, the MpBC genome and transcriptome have only recently been studied with limited sample size [6,7]. Comprehensive molecular profiling of MpBC and its histological subtypes is urgently needed.
Formalin-fixed, paraffin-embedded (FFPE) samples are commonly archived from breast cancer patients' primary tumors and could prove a valuable resource with which to study MpBC omics. However, nucleic acids obtained from such samples are often degraded, thus impeding high quality profiling of transcriptomics via next generation sequencing. The NanoString nCounter platform has shown compatibility and reliability with gene expression profiling using RNA obtained from FFPE samples [8][9][10]. Here, we leverage the use of a custom NanoString Technologies nCounter-based assay to overcome sample degradation and to quickly and cost-effectively profile and compare pathway activity for various gene expression signatures across a set of 19 MpBC and 8 invasive ductal TNBC patient samples (Fig. 1). Fig. 1 Overview of NanoString pathway activity profiling in metaplastic and triple negative breast cancer samples. a) Growth factor receptor network (GFRN), stemness, or apoptosis genes were individually overexpressed in normal human mammary epithelial cells (HMECs) using adenovirus delivery. b) The gene expression changes most correlated with induction of expression of these genes were identified. c) Gene lists were pared down to the fewest number of genes able to accurately predict that gene's signature activity. These genes plus genes from other pathways relevant to breast cancer were placed on a custom NanoString panel. d) RNA from patient breast cancer samples was assayed using the custom NanoString panel.

Patient samples
Through a City of Hope IRB-approved retrospective analysis protocol, 18 FFPE and 1 fresh frozen sample from patients with MpBC, and 8 FFPE samples from patients with invasive ductal TNBC were collected for profiling (Additional file 1: Table S1). Written informed consent was obtained from all patients who participated in the study. Clinical records including demographics, treatment histories, recurrence free and overall survival, and cancer-associated mutation profiling data were reviewed and recorded for MpBC patients. MpBC samples were reviewed by a designated breast pathologist and assigned to histological subtypes including squamous, spindle cell, mesenchymal (chondroid and/or osteoid) or mixed subtype according to World Health Organization classification [11].

Activated pathway and GFP control samples
Activated pathway or control samples were generated in normal human mammary epithelial cells (HMECs) overexpressing genes of interest or GFP, respectively, as described previously [12]. Briefly, HMECs were cultured in basal Mammary Epithelial Cell Growth Medium plus a bullet kit (MEGM, Lonza, MD, USA). HMECs were starved of bullet kit additives 36 h prior to infection with adenovirus expressing either AKT1, BAD, BCL2L11, HER2, IGF1R, RAF1, or SNAI1 for 18 h or KRAS (G12 V mutation) for 36 h at MOI of 200. Overexpression of these genes was chosen to capture core cell growth, death/survival, and stemness phenotypes.

NanoString custom codeset
Probe gene targets for the custom gene expression panel were selected from previously published gene expression signatures (AKT1, BAD, HER2, IGF1R, KRAS G12 V, and RAF1, from Rahman et al. [12]; multi-cancer invasiveness from Anastassiou et al. [13]; stem cell signature from Boquest et al. [14]; TNF alpha signature from Phong et al. [15]) and two novel signatures (BCL2L11 and SNAI1) generated using the adenovirus infection protocol described above.. Signature gene sets from previously published AKT1, BAD, HER2, IGF1R, KRAS G12 V, and RAF1 signatures were reduced from the previously optimized RNA-sequencing-based signature lengths. Gene lists were sequentially-reduced in five gene increments down to a minimum size of five genes and each reduced gene list was used to profile cell lines from the International Cancer Benchmarking Partnership (ICBP) and breast cancer patient samples from The Cancer Genome Atlas (TCGA) using the Adaptive Signature Selection and InteGratioN toolkit (ASSIGN, [16], available from BioConductor, https://doi.org/10.1 8129/B9.bioc.ASSIGN) as described in Rahman et al. [12]. The ASSIGN pathway signature prediction scores were correlated with proteomics data for genes known to be associated with each signature as described previously [12]. Gene lists were selected to minimize the reduction of overall ASSIGN score vs. proteomics data correlation in TCGA while using a maximum of 150 genes across all six signatures (Additional file 2: Fig. S1). The reduced signature lengths for AKT1, BAD, HER2, IGF1R, KRAS G12 V, and RAF1 were 20 genes, 15 genes, 10 genes, 20 genes, 75 genes, and 50 genes, respectively.
Genes from the BCL2L11 and SNAI1 signatures were selected similarly to the method described in Rahman et al. [12]. Briefly, signature gene lists of various lengths were derived using ASSIGN to compare RNA expression from HMECs overexpressing either BCL2L11 or SNAI1 against those overexpressing GFP. For BCL2L11, candidate gene lists were subsequently used to predict pathway activity in small cell lung cancer cell lines from the Tse et al. [17] dataset (GSE10841). The BCL2L11 activity predictions from ASSIGN for these cell lines were correlated with the cell lines' average EC50 in response to ABT-263, a Bcl-2 family inhibitor. The signature which resulted in the largest negative Spearman correlation was selected for further development. SNAI1 signature candidate gene lists were used to predict pathway activity in an immortalized normal mammary epithelial cell line (HMLE) from the Taube et al. [18] dataset (GSE24202). The signature that best separated the ASSIGN prediction scores in HMLE cells overexpressing SNAI1 from HMLE expressing empty-vector control was chosen for further development. Following selection of the BCL2L11 and SNAI1 signature gene lists, we manually screened for and removed heat shock proteins (HSP) frequently appearing in the gene lists generated by ASSIGN across pathways. Seventy-nine genes were identified as HSP genes and removed from the signatures, resulting in final signature lists containing 54 genes for BCL2L11 and 103 genes for SNAI1.
Analysis scripts for the AKT1, BAD, BCL2L11, HER2, IGF1R, KRAS G12 V, RAF1, and SNAI1 pathway signatures are available at: https://github.com/dfjenkins3/ MpBC_genomics_paper. The Anastassiou multi-cancer invasiveness, Boquest stem cell, and Phong TNF alpha signatures were reduced to 25 genes each, based on those genes with highest expression in post-treatment breast cancer patient samples profiled in Brady et al. [19]. Additional genes of interest relevant to breast cancer were also added to the panel. In total, 345 genes (336 query genes and 9 housekeeping genes) were incorporated into the custom assay (Additional file 3: Table S2).

Patient and HMEC sample RNA extraction
RNA was extracted from patient breast cancer samples using the RNeasy FFPE kit, and from the HMEC controls using the RNeasy mini kit (both from Qiagen, CA, USA). RNA concentration was assessed with Nanodrop spectrophotometer ND-1000 and Qubit 3.0 Fluorometer (both from Thermo Scientific, CA, USA). RNA fragmentation and quality were determined by 2100 Bioanalyzer (Agilent, CA, USA).

NanoString nCounter profiling system
The NanoString nCounter platform gene expression assay has been described previously [20]. Briefly, the NanoString nCounter platform assays gene expression directly from RNA samples via hybridization of samples with a set of multiplexed nucleotide probes. Probes for each gene target are uniquely barcoded with a series of fluorophores. Fluorescence microscopy imaging of sample-hybridized fluorophore-labeled probes generates quantitative counts data for each gene in each sample.
For gene expression profiling on the nCounter system, patient sample or HMEC control RNA was first hybridized with the custom 345-gene codeset (NanoString Technologies, WA, USA) at 65°C for 16 h. Post-hybridization probe:target mixture was then purified and quantified via nCounter MAX Digital Analyzer (NanoString Technologies, WA, USA).

Pathway activity profiling in patient samples
Raw NanoString counts data were normalized to internal positive control probes and housekeeping genes using nSolver Software (NanoString Technologies, WA, USA) version 4.0, according to default parameters, except for background threshold count value was set to 20. Pathway probabilities for AKT1, BAD, BCL2L11, KRAS G12 V, HER2, IGF1R, RAF1, and SNAI1 signatures were calculated using ASSIGN, according to the same parameters as in Rahman et al. [12], with adaptive signature selection set to false. Pathway scores for Anastassiou multi-cancer invasiveness, Phong TNF alpha, and Boquest stem cell signatures were calculated using ASSIGN as above, with adaptive signature selection set to true.

Differential gene expression and biological pathway enrichment analysis
Differential gene expression analysis was performed using the NanoStringDiff package, version 1.10.0 for R (available from BioConductor, https://doi.org/10.18129/ B9.bioc.NanoStringDiff) using default settings [21]. This package uses a negative binomial-based model appropriate for discrete counts data, and employs a normalization step incorporating data from the internal nCounter positive and negative controls and the panel housekeeping controls to identify differentially-expressed genes across groups.
The package adjusts for false discovery using the Benjamini-Hochberg method. Genes passing the q < 0.05 false discovery cutoff were considered for pathway enrichment analysis using Ingenuity Pathway Analysis (IPA) software (Qiagen Silicon Valley, CA, USA). Analyses in IPA were run with the "reference set" parameter set to the input list of genes assayed on the NanoString panel to account for sampling bias of genes chosen for the panel. IPA uses a right-tailed Fisher's exact test to calculate the probability that genes belonging to particular biological pathways from its curated knowledge base are enriched in input datasets due to chance. IPA canonical pathways with p < 0.05 are reported herein.

Statistics
Statistical tests were performed using Prism version 6.0 (GraphPad, CA, USA). Comparison of ASSIGN pathway activity scores across groups was performed using oneway ANOVA followed by Tukey's post hoc test. Survival analyses were performed using the Kaplan-Meier log-rank method, with hazard ratios (HR) and 95% confidence intervals (CI) reported. For survival analyses, patients were grouped by median pathway activity score and the sample with median value was included in the group containing

Performance of RNA-seq based signatures on NanoString platform
We converted gene expression signatures originally created using RNA-sequencing data for use with the Nano-String gene expression profiling platform. To re-optimize the signatures to best capture pathway activity via Nano-String, RNA from control HMEC samples overexpressing each gene of interest and from HMECs overexpressing GFP was assayed on the NanoString platform using the custom codeset, and the top gene expression changes between groups were identified using ASSIGN (Fig. 2). These changes in gene expression identified in the control samples were then used to profile pathway activity in patient samples.
Metaplastic breast cancer histological subtypes demonstrate differential pathway activation Unsupervised hierarchical clustering of pathway activity scores for growth factor receptor network (GFRN), stemness, and apoptosis pathways revealed several broad clusters of pathway activity across MpBC and TNBC patients ( Fig. 3a-b). Notably, MpBC and TNBC patient samples did not cluster exclusively; rather, these samples were interleaved across clusters. Further, MpBC patient samples did not group distinctly by subtype; however, patient samples with a mesenchymal cell population (chondroid and/or osteoid) grouped in high SNAI1/ BCL2L11 pathway activity clades (left side of heatmap; Fig. 3a), while all uniformly squamous samples grouped in low SNAI1/BCL2L11 pathway activity clades (right side of heatmap, Fig. 3a). Indeed, samples with any mesenchymal cell population had significantly higher SNAI1 pathway activity scores than patients of the spindle and squamous subtypes (ANOVA, p = 0.0131; Fig. 3c). Similarly, mesenchymal samples demonstrated significantly increased BCL2L11 and marginally significantly increased AKT1  Fig. 3c). Interestingly, HER2 pathway activity was significantly higher in TNBC samples than in MpBC samples (Student's t-test, p < 0.001; Fig. 3d). Specifically, spindle cell, squamous and mixed spindle/squamous subtype samples had significantly lower HER2 pathway activity than TNBC samples (ANOVA, p < 0.001; Fig. 3c). All patient MpBC and TNBC samples were clinically categorized as negative for HER2 amplification or HER2 status unknown; however, all samples expressed ERBB2, with TNBC samples demonstrating significantly increased ERBB2 expression compared to MpBC samples (Additional file 5: Fig. S3). Differences in expression of the other 9 genes in the HER2 gene expression signature also contributed to differential pathway activity between MpBC and TNBC samples. No differences were seen in pathway activity across subtypes for the other pathways profiled, including BAD, KRASG12 V, IGF1R, RAF1, Anastassiou invasiveness, Boquest stem cell and Phong TNF alpha (Fig. 3c-d).

Differences in gene expression across subtypes
We examined gene expression differences across the panel of NanoString genes using NanoStringDiff, an R package designed to identify gene expression differences from the discrete counts data generated by the NanoString platform [21]. Gene expression profiling revealed differences between MpBC and TNBC samples as well as between samples of different MpBC histological subtypes. Fifty-seven genes were differentially expressed between MpBC and TNBC samples ( Table 2). Genes down-regulated in MpBC included, among others, CD24, keratinocyte-related genes such as CALML5 and KRT81 and late cornified envelope genes, LCE1F, LCE3D, and LCE3E, which were largely not expressed in MpBC samples, but were expressed in the majority of TNBC samples. Genes up-regulated in MpBC included cytokine genes IL6 and IL8, EMT-related genes FN1 and CTGF, and genes involved in extracellular matrix synthesis and adhesion: COL1A1, COL5A1, COL5A2, ICAM1, and HAS2 (Table 2).
Further, to explore subtype-specific gene expression, we identified genes differentially expressed in each MpBC subtype. Twenty-four panel genes were significantly differentially expressed between spindle cell MpBCs and all other MpBCs (Benjamini-Hochberg adjusted p < 0.05, Table 3). Squamous subtype samples had 36 differentially expressed genes and mesenchymal subtype samples had 24 genes differentially expressed compared to all other MpBC samples (Table 3).
Next, we interrogated non-GFRN pathway dysregulation at the subtype level by assessing the differentially expressed genes identified by NanoStringDiff for enrichment of genes belonging to the same pathway in the canonical pathways database curated by IPA. Genes differentially expressed between MpBC and TNBC samples were enriched for genes in the hepatic fibrosis and atherosclerosis pathways ( Table 4). Differentially expressed genes from the mesenchymal subtype were enriched for interferon signaling, IL-17 signaling, (a) granulocyte adhesion, and helper T cell differentiation pathway members. Similarly, IL-17 signaling and (a) granulocyte adhesion pathways were identified as enriched in spindle cell differentially expressed genes, as several genes up-regulated in mesenchymal samples were down-regulated in spindle cell samples. No pathways were significantly enriched in genes differentially expressed in squamous subtype samples.

Invasiveness markers and patient survival
To examine the relationship between pathway activity and survival, we stratified patients by median ASSIGN pathway activity score for all pathways assayed, and assessed patient recurrence-free survival (RFS) and overall survival (OS) within each group. Patients with above-median Anastassiou invasiveness pathway activity experienced shorter RFS and OS than those with equal to or below-median pathway activity (RFS: p = 0.021, HR = 5.82, 95% CI = 1.31-25.84; OS p = 0.02, HR = 5.77, 95% CI = 1.32-25.24; Fig. 4a). Patients with below-median KRAS G12 V pathway activity experienced a worse outcome compared to those patients with equal-to or above-median KRAS G12 V pathway activity (RFS: p = 0.0145, HR = 6.55, 95% CI = 1.45-29.55; OS: p < 0.001, HR = 14.14, CI = 3.10-64.40; Fig. 4c). There was no significant difference in outcome identified between patients stratified by median pathway activity for the remaining pathways assessed with the NanoString panel. Previous studies have identified that expression of mesenchymal markers including SPARC, VIM, and TWIST negatively correlate with MpBC patient survival [22,23]. In the present study, patients with above-median SPARC expression experienced shorter recurrence-free and overall survival times than patients with equal-to or below-median SPARC expression (RFS p = 0.023, HR = 5.52, 95% CI = 1.26-24.1; OS p = 0.023, HR = 5.41, 95% CI = 1.26-23.2). Conversely, patients bifurcated by median VIM expression or by median SNAI1 pathway activity did not experience differences in outcome (Additional file 6: Fig. S4).

Discussion
Elucidation of the omics underlying rare cancer types such as MpBC requires methods to accurately profile limited samples available from these cancers. Our results demonstrate the utility of RNA collected from FFPE samples and profiled with the NanoString platform to obtain interpretable gene expression and pathway activity data for patients with MpBC. Using this platform, we identified differences in gene expression and pathway activity between MpBC and invasive ductal TNBC samples, as well as between samples from different MpBC subtypes.
Several genes with potential implications on patient treatment were found significantly differentially expressed between MpBC and TNBC samples. One such gene, CD24, was down-regulated in MpBC. Interestingly, low expression or lack of expression of the CD24 protein has long been considered a marker of breast cancer stem cells, and  various clinical trials are underway to target the cancer stem cell population in breast cancer [24][25][26]. Additionally, we identified COL1A1 up-regulation in MpBC samples. The protein product of the COL1A1 gene forms part of the type I collagen protein complex, which has previously been identified as up-regulated in mesenchymal MpBCs when compared to adjacent normal tissue [27]. Further, high expression of the COL1A1 gene and protein has been associated with shorter recurrence free and overall survival in breast cancer, as well as with response to cisplatin [28,29]. Additionally, we identified increased HAS2 in MpBC samples. A previous study found expression of this enzyme involved in hyaluronan synthesis in 72.7% of patients with MpBC, compared to only 56% of patients with invasive ductal TNBC, and 25.2% of patients with invasive ductal carcinoma of ER, PR, or HER2-positive phenotypes [30]. Clinical trials investigating treatment of patients with high hyaluronan levels with recombinant hyaluronidase are currently underway in multiple cancer types [31][32][33].
At the pathway activity level, profiling results demonstrated increased BCL2L11, SNAI1, and AKT1 pathway activity in patient samples with a histologic mesenchymal (chondroid or osteoid) component. This finding supports that of Gwin et al. [34], who identified increased SNAI1 gene expression in chondroid MpBC tumors, and that of Taube et al. [18], who found high SNAI1 expression in a set of 12 metaplastic patient samples. Based on these findings, inhibition of SNAI1 pathway components may be a viable strategy for improving outcomes for patients with mesenchymal MpBC. While there are currently no FDA-approved SNAI1 inhibitors, the histone deacetylase (HDAC) inhibitors panobinostat and entinostat have been shown to reduce expression of SNAI1 and other EMT markers [35][36][37]. HDAC inhibitors are currently FDA-approved for use in some cancers, and thus may be an implementable strategy for treatment of MpBC tumors with high SNAI1 activity.
Similarly, we identified increased BCL2L11 pathway activity in patients with mesenchymal MpBC. Increased SNAI2-driven BCL2L11-encoded protein BIM expression was identified by Merino et al. [38] at the proliferating edge of two metaplastic breast cancer patient-derived xenografts, and it was speculated that this expression may play a role in tumor cell dissemination and metastasis. This same leading-edge expression of BIM was not present in TNBC and ER+ xenografts. Future experiments are needed to clarify the role of increased BIM in MpBC tumors, and to determine whether modulation of MAPK pathway activity upstream of BIM improves outcomes for patients with mesenchymal MpBC.
In the present cohort, patient samples with high Anastassiou invasiveness pathway activity and high expression of the extracellular matrix glycoprotein SPARC experienced worse outcomes. SPARC expression has been associated with invasiveness phenotype in patients  with ductal carcinoma in situ, as well as with poor survival in patients with TNBC [39,40]. Thus, a treatment strategy capable of reducing the invasiveness potential of metaplastic cancer cells may benefit MpBC patient outcome. Lack of KRAS activity to drive poor outcome in the present patient cohort may reflect the extent to which aggressive MpBCs are driven by stemness/invasiveness pathways not related to MAPK pathway activity. MpBC tumors are notorious for their failure to respond to chemotherapy; however, chemotherapy remains the standard of care for TNBC, including triple-negative MpBC [5,41]. Thus, identification of targetable pathways altered in MpBC is necessary for improving patient outcomes. Multiple ongoing trials including ARTEMIS and I-SPY2 are testing a precision medicine approach for TNBC treatment [42][43][44]. Patients with MpBC may similarly benefit from a precision medicine approach, which may be further tailored to the patient's specific MpBC subtype. Such an approach might leverage tumor transcriptomic profiling at time of patient diagnosis to determine if MpBC patients would benefit from specific targeted therapies.
MpBC is a remarkably rare cancer, and it is important to note the limitations in our conclusions due to the limited sample size from a single institution. However, data from the current study corroborate findings from other MpBC studies published to date. One such study examined gene expression differences across MpBC subtypes via RNA sequencing [6]. As in the present study, Piscuoglio et al. [6] also identified genes ALDH3B2, CDRT1, ELF3, EXTL1, GLYATL2, PI3, PPL, and PRSS22 as differentially expressed in the squamous subtype and genes AQP5, EXTL1, MMP9, NEFM, and VIPR1 in the spindle subtype. Further, our identification of increased IL8, IL6, HAS2, and ICAM1, as well as decreased ERBB2 in MpBC samples matches findings from a microarray comparison of gene expression between metaplastic breast cancers and ductal carcinomas of the breast [22]. At the pathway activity level, high SNAI1 activity and increased expression of stemness and EMT markers have been identified in the present cohort as well as in other MpBC patient cohorts [18,34].