African-American esophageal squamous cell carcinoma expression profile reveals dysregulation of stress response and detox networks

Background: Esophageal carcinoma is the third most common gastrointestinal malignancy worldwide and is largely unresponsive to therapy. African-Americans have an increased risk for esophageal squamous cell carcinoma (ESCC), the subtype that shows marked variation in geographic frequency. The molecular architecture of African-American ESCC is still poorly understood. It is unclear why African-American ESCC is more aggressive and why the survival rate in these patients is worse than that of other ethnic groups.

Methods: To begin to define genetic alterations that occur in African-American ESCC we conducted microarray expression profiling in pairs of esophageal squamous cell tumors and matched control tissues.

Results: We found significant dysregulation of genes encoding drug-metabolizing enzymes and stress response components of the NRF2-mediated oxidative damage pathway, potentially representing key genes in African-American esophageal squamous carcinogenesis. Loss of activity of drug-metabolizing enzymes would confer increased sensitivity of esophageal cells to xenobiotics, such as alcohol and tobacco smoke, and may account for the high incidence and aggressiveness of ESCC in this ethnic group. To determine whether certain genes are uniquely altered in African-American ESCC we performed a meta-analysis of ESCC expression profiles in our African-American samples and those of several Asian samples. Down-regulation of TP53 pathway components represented the most common feature in ESCC of all ethnic groups. Importantly, this analysis revealed a potential distinctive molecular underpinning of African-American ESCC, that is, a widespread and prominent involvement of the NRF2 pathway.

Conclusion: Taken together, these findings highlight the remarkable interplay of genetic and environmental factors in the pathogenesis of African-American ESCC.
Electronic supplementary material The online version of this article (doi:10.1186/s12885-017-3423-1) contains supplementary material, which is available to authorized users.


Background
Esophageal cancer is the third leading gastrointestinal malignancy worldwide with greater incidence in males than in females. Patients with esophageal cancer (EC) show limited response to multimodal treatments with an overall five-year survival rate of only about 20% [1]. Due to lack of effective screening for early detection, EC is usually diagnosed at an advanced stage or when metastasis has already occurred. Consistently reliable molecular markers to monitor outcomes remain to be developed [2].
Esophageal cancer has two main histologic subtypes and they arise in two distinct areas of the esophagus. Adenocarcinoma of the esophagus (EAC) is mostly seen in Western countries [3] while esophageal squamous cell carcinoma (ESCC) is predominant in Eastern countries and the eastern part of Africa [3]. Geographical and genomic differences play a significant role in ESCC [4]. In African-Americans, ESCC is the predominant subtype, and the survival rate is worse than in patients of other ethnic groups [5].
The combined action of genetic and environmental factors is believed to underlie the etiology of esophageal cancer. Recent genome-wide association studies, gene expression profiling, DNA methylation and proteomic studies conducted in Japanese and Chinese ESCCs (reviewed in [6]) have identified multiple risk variants and gene signatures associated with ESCC. These studies presented additional evidence for the effect of environmental exposures such as alcohol intake, smoking, opium abuse, hot food and beverage consumption, and diet as risk factors for ESCC [3][7][8][9][10][11].
Genetic and transcriptome analyses on African-American ESCC have been particularly limited which highlights the lack of understanding of the genetic architecture of ESCC in this ethnic group. In an earlier study of black male ESCC samples, we detected loss of heterozygosity that spanned a significant portion of chromosome 18 [12]. To explore the entire anatomy of the neoplastic genome in black ESCC, we performed comparative genomic hybridization (CGH) on a panel of 17 matched pairs of tumor and control esophageal tissues [13]. Multiple chromosomal gains, amplifications and losses that represent regions potentially involved in etiology defined the pattern of abnormalities in the tumor genome [13]. We noted genomic imbalances that were represented disproportionately in African-American ESCC compared to those reported in ESCC of other ethnic groups including Japanese [14][15][16][17][18], South African black and mixed-race individuals [19], Taiwan Chinese [20], Hong Kong Chinese [21], Chinese in Henan province [22], and Swedes [23].
The preponderance of chromosomal aberrations in African-American ESCC predicts concomitant changes in gene activity during carcinogenesis. We sought to identify dysregulated genes and pathways that could define the expression signature in African-American ESCC by conducting microarray expression profiling in paired squamous esophageal tumors and normal tissue specimens. Here, we report significant differential expression of a wide array of genes involved in multiple pathways that may be crucial to causation and/or progression. Particularly noteworthy is the dysregulation of NRF2-mediated oxidative stress genes and of genes that encode enzymes metabolizing drugs and xenobiotics, which may, in part, contribute to the aggressive nature of ESCC among blacks.

Samples
Seven paired specimens of the esophagus (tumor and matching non-tumor tissues), each pair derived from the same patient, were collected endoscopically or surgically at the time of diagnosis, frozen and stored at -80°C until use. Staging indicated that all tumors included in this study were at Stage IV. This study was done under a protocol approved by the Washington D.C. VAMC Institutional Review Board, and written informed consent was obtained from each patient prior to biopsy or surgery. The demographics and risk factors of the patients are listed in Additional file 1.

RNA extraction
Total RNA was extracted from tissue samples using TRIzol reagent (Invitrogen, Carlsbad, CA) and purified with the RNeasy Mini kit (Qiagen), according to the manufacturers' guidelines. The concentration of each RNA sample was determined with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE). RNA quality was assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA).

cRNA preparation and expression profiling
An aliquot of 5 μg of high-quality total RNA from each sample was used to synthesize cDNA and biotinylated cRNA utilizing the Affymetrix GeneChip® Expression 3' Amplification One-Cycle Target Labeling and Control Reagent kit according to the manufacturer's instructions. Biotinylated cRNA was hybridized to Affymetrix GeneChip HG U133 Plus 2 arrays (Affymetrix, Santa Clara, CA), washed and stained on the Affymetrix Fluidics station 400, and scanned with a Hewlett Packard G2500A Gene Array Scanner following Affymetrix instructions. All arrays used in the study passed the quality control criteria set by the Tumor Analysis Best Practices Working Group [24].

Microarray data analysis
The Affymetrix scanner 3000 was used in conjunction with Affymetrix GeneChip Operation Software to generate one .CEL file per hybridized cRNA. These files have been deposited in NCBI Gene Expression Omnibus (GEO) (www.ncbi.nlm.nih.gov/geo/) under the GEO accession number GSE77861 and are freely available for download.
The Affymetrix Expression Console was used to summarize the data contained across all .CEL files and generate 54,675 RMA-normalized gene fragment expression values per file. Quality of the resulting values was challenged and assured via Tukey box plot, covariance-based PCA scatter plot, and correlation-based heat map using functions supported in "R" (www.cran.r-project.org). Lowess modeling of the data (CV ~ mean expression) was performed to characterize noise for the system and define the low-end expression value at which the linear relationship between CV and mean was grossly lost (expression value = 8). Gene fragments not having at least one sample with an expression value greater than this low-end value were discarded as noise-biased. For gene fragments not discarded, differential expression was tested between Tumor and Non-tumor biopsies via paired t-test under Benjamini-Hochberg multiple comparison correction (alpha = 0.05). Gene fragments having a corrected P < 0.05 by this test and an absolute difference of means ≥ 1.5X were subset as those having differential expression between Tumor and Non-Tumor. Gene annotations for these subset fragments were obtained from IPA (www.ingenuity.com) along with the corresponding enriched functions, enriched pathways, and significant predicted upstream regulators. The analysis pipeline is summarized in Additional file 2.
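The selection steps described above (noise-floor filter, Benjamini-Hochberg correction, fold-change cutoff) can be illustrated with a minimal sketch. It is not the code used in this study, which relied on Expression Console and R. The function names are hypothetical, the per-fragment p-values are assumed to come from a paired t-test computed elsewhere, and the 1.5X cutoff is assumed to apply to the difference of log2-scale RMA means, i.e. |difference| ≥ log2(1.5).

```python
import math

def passes_noise_floor(values, floor=8.0):
    """Keep a fragment if at least one sample exceeds the low-end value."""
    return any(v > floor for v in values)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_fragments(pvalues, log2_mean_diffs, alpha=0.05, fold=1.5):
    """Indices with corrected P < alpha and |log2 mean difference| >= log2(fold).

    Assumption: expression values are on a log2 scale, so a 1.5X change
    corresponds to a difference of log2(1.5) between group means.
    """
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(fold)
    return [
        i
        for i, (q, d) in enumerate(zip(adjusted, log2_mean_diffs))
        if q < alpha and abs(d) >= cutoff
    ]
```

In this sketch the noise-floor filter would be applied first, so that the multiple-testing correction is computed only over the retained fragments, which matches the order of steps given in the text.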

Validation of results by real-time PCR
RT-PCR was performed for the KRT17, PRDCSH, TNFRSF6B, SELK, RAB5B, ALD, and RAF genes. The delta-delta Ct calculation method was used for the quantification of the RT-PCR results.
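For readers unfamiliar with the delta-delta Ct calculation, a minimal sketch follows. The reference (housekeeping) gene is not named in this section, so the reference Ct arguments are hypothetical placeholders, and the calculation assumes roughly 100% amplification efficiency (a doubling per cycle).

```python
def delta_delta_ct_fold_change(ct_target_tumor, ct_ref_tumor,
                               ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. non-tumor) by the delta-delta Ct method.

    Assumes a doubling of product per cycle; the reference gene is a
    hypothetical housekeeping gene, not one specified in the text.
    """
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor      # normalize tumor
    delta_ct_normal = ct_target_normal - ct_ref_normal   # normalize non-tumor
    delta_delta_ct = delta_ct_tumor - delta_ct_normal
    return 2.0 ** (-delta_delta_ct)
```

A value above 1 indicates higher expression in tumor than in the matched non-tumor tissue, and a value below 1 indicates lower expression.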

Results
Transcriptome profiling of African-American ESCC tumors versus adjacent normal esophageal tissues revealed significant differential expression of 756 genes comprising 340 over-expressed and 416 under-expressed loci that were detected by 460 and 558 gene probes, respectively (Additional file 4). A volcano plot displayed genes that underwent the highest alteration in expression (Fig. 1a). Among the most strongly up-regulated genes are keratin 17 (KRT17), immunoglobulin genes including IGHG1, and ornithine decarboxylase 1 (ODC1). Genes that showed a marked loss of expression included cysteine-rich secretory protein 3 (CRISP3) and sciellin (SCEL). Experimental validation of microarray results through a real-time PCR assay on RNA derived from the same original samples for selected up-regulated (KRT17, PRDCSH, TNFRSF6B) and down-regulated (SELK, RAB5B, ALD, RAF) genes supported the microarray data (data not shown).
Principal component analysis of differentially expressed genes indicated the magnitude of the co-variance between paired tumor and non-tumor samples of each patient (Fig. 1b). The first principal component contributed 57.9% of the variance among the samples. Correlation-based clustering of all differentially expressed genes clearly distinguished tumor from the corresponding non-tumor tissues (Fig. 1c).

Perturbed pathways and networks in African-American ESCC
To determine the overall biological impact of the widespread transcriptional aberration in African-American ESCC, we performed pathway and network analysis on significantly dysregulated genes using Ingenuity Pathway Analysis (IPA). The majority of differentially expressed genes encoded a diversity of enzymes (Fig. 1d). Genes that coded for transporters, transcription regulators, phosphatases, translation regulators, ion channels and transmembrane receptors were among those that were most prominently down-regulated (Fig. 1d).
IPA detected the enrichment of 25 networks. Fifteen canonical pathways were significantly enriched in African-American ESCC and the top three included NRF2-mediated oxidative stress pathway, integrin signaling and protein ubiquitination, in that order (Fig. 2b, Additional file 6). The gene constituents of these pathways are presented in Additional file 7. These results suggest that African-American ESCC is underpinned by a dysregulation of genes that play an important role in oxidative stress and xenobiotic metabolic responses.

Activation of NRF2 perturbs stress response and detoxification pathways in ESCC
Enriched pathways involving stress response, xenobiotic metabolism, and toxic response are noteworthy because smoking and alcohol consumption have been consistently shown to be strong contributing factors in ESCC etiology. It was therefore important to focus on pathways involved in detox networks.

Fig. 1 Gene expression differences observed between paired Esophageal Tumor and Non-Tumor biopsies for seven patients. a Volcano plot depicting the differential expression testing results for 10,734 gene fragments. Gene fragments having significant difference in expression between Tumor and Non-Tumor where the magnitude of difference is also ≥ 1.5X are represented as triangles (n = 756). b Covariance-based Principal Component Analysis (PCA) scatter plot depicting the paired sample relationships when the 756 gene fragments identified to have significant difference in expression between Tumor and Non-Tumor are used. c Correlation-based clustered heat map depicting the sample relationships (x-axis) when the 756 gene fragments identified to have significant difference in expression between Tumor and Non-Tumor (y-axis) are used. d Bar plot describing the breakdown of the 756 gene fragments identified to have significant difference in expression between Tumor and Non-Tumor by protein type (where known).
The NRF2-mediated oxidative stress response pathway showed the highest enrichment (with a -log(p) of 6.25), both in general and in the toxicology panel (Fig. 3). The NRF2 pathway is one of the primary mediators of detoxification and metabolism responses. Transcriptional targets of NRF2 include genes involved in alcohol metabolism such as ADH7, AKR1B1, ALDH3A1, and ALDH7A1, all of which are differentially expressed in our dataset (Additional file 8). Other targets that showed altered expression in African-American ESCC include genes with a wide range of functions: MGST2, ABCC1, ABCC5, GCLC, GPX4, ACOX1, BLVRA, FTL1, CEBPB, ACLY, ELOVL5, FABP5, and ACAA1B.
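To make the -log(p) enrichment score concrete, the sketch below computes it for a generic gene-set overlap. It assumes a right-tailed Fisher's exact (hypergeometric) test with a base-10 logarithm; this is an assumption about how such scores are commonly derived, not a description of the IPA computation used here, and the function names and example counts are hypothetical.

```python
import math

def hypergeom_right_tail(k, K, n, N):
    """P(X >= k): n genes drawn from N, of which K belong to the pathway.

    k is the number of differentially expressed genes observed in the pathway.
    """
    upper = min(K, n)
    total = math.comb(N, n)
    tail = sum(
        math.comb(K, x) * math.comb(N - K, n - x)
        for x in range(k, upper + 1)
    )
    return tail / total

def neg_log10_p(k, K, n, N):
    """Enrichment score reported as -log10 of the right-tailed p-value."""
    return -math.log10(hypergeom_right_tail(k, K, n, N))

# Hypothetical illustration: 2 of 2 selected genes fall in a 2-gene pathway
# out of a 4-gene universe, giving p = 1/6.
score = neg_log10_p(2, 2, 2, 4)
```

Larger scores correspond to overlaps that are less likely to arise by chance under this null model.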
IPA predicted that 19 upstream regulators are activated in our dataset (Table 1 and Additional file 9). The nuclear factor-erythroid 2 p45-related factor 2 gene, NFE2L2, a known upstream regulator of the NRF2 pathway, was predicted to have the highest activation z-score of 3.796, followed by the MEK, LDL, and CTNNB1 pathways, with decreasing z-scores. In addition, MYC was predicted to be an activated upstream regulator (Additional file 9).
The TP53 regulatory pathway was predicted to be the most inhibited with a z-score of −3.113 and a p-value of 4.05E-19 (Table 1). In our sample, 99 differentially expressed genes were downstream of the TP53 pathway (Additional file 10). Inhibition of the TP53 pathway is a hallmark of carcinogenesis and is predicted in our ESCC dataset, as well.

Functional meta-analysis of gene expression of ESCC in diverse ethnic groups
To determine whether African-American ESCC implicates genes that are unique or shared by ESCC of other ethnic groups, we performed a meta-analysis that included our African-American ESCC expression data and data from seven studies published in publicly available datasets in the GEO database. We note that our expression profiling data is the first such study in African-American ESCC to be deposited in the GEO repository. ESCC expression profiles in GEO included those generated in Japan (GSE17351) [25], Hong Kong, China (GSE33810) [29] and from various parts of China (GSE23400 [27], GSE20347 [26], GSE45670 [30], GSE33426 [28], and GSE29001 [28]).
Analysis of the functional outcome of expression profiles from all microarray studies showed that the NRF2-mediated oxidative stress pathway was significantly enriched only in our dataset (Fig. 4). Likewise, the significant enrichment of ubiquitination, androgen, and B-cell receptor signaling pathways was revealed only in our dataset. Integrin, ephrin receptor and protein kinase A signaling pathways were shared by two or more studies at or above the significance threshold.
It was important to examine the dysregulation of genetic components of the detox networks in the ESCC microarray expression datasets. All studies showed greater enrichment of toxicology pathways than of other signaling pathways (Fig. 5). Interestingly, our dataset contained the highest number of genes in the NRF2-mediated oxidative stress response pathway while in other studies this number was either at or below the significance threshold. Aryl hydrocarbon receptor, fatty acid metabolism, xenobiotic metabolism signaling, G2/M DNA damage checkpoint regulation and cell death genes were significantly perturbed in all studies. In our dataset (GSE77861) and in GSE23400 [27], the number of genes in retinoic acid receptor signaling was above the significance threshold.

Meta-analysis of the upstream regulatory pathways of ESCC in various ethnic groups
Meta-analysis of all available ESCC gene expression profile datasets showed a distinctive upstream regulatory pathway in African-Americans that highlighted a significant enrichment of the NRF2-mediated oxidative stress response pathway (Table 1). The activated pathways such as CBX5, insulin, MEK, NFE2L2, ANXA7, HSF2, NFE2L1, and PLIN5 were either uniquely represented in our study or shared with only one other study. Six out of eight datasets predicted the activation of upstream pathways of E2F and RABL6 although the rankings of z-score of these pathways were diverse (Table 1 and Additional file 9). FOXM1 was also projected as one of the common activated upstream pathways. Regardless of the z-score rankings, the activation of the angiopoietin 2 pathway is the third most highly represented upstream pathway in five of the studies (Additional file 9). The activation of fibronectin and beta-catenin pathways as upstream regulators was revealed in five studies that included ours.
The predicted inhibited upstream pathways were divergent among the studies. While the TP53 pathway was predicted to be the top inhibited pathway in our study, the most common inhibited pathways including CDKN1A, IRF4, KDM5B, ACKR2, BNIP3L, DYRK1A were found in all datasets except in our study. In contrast, our dataset exclusively demonstrated the inhibition of FGFR1, ESRRA, EHF, and IL13 pathways.

Discussion
ESCC is the predominant esophageal carcinoma subtype worldwide, occurring in specific geographic areas and in various countries including China, Japan, Iran, Italy and France [8,31]. In the United States, a high incidence of ESCC has been reported in the District of Columbia and coastal areas of the southern states [32]. ESCC occurs at a 5-fold greater frequency among African-Americans than among white Americans while the converse has been observed for EAC [7,33]. Even though five-year survival rates increased in both whites and blacks between 2004 and 2010, the mortality rate for esophageal carcinoma is still far greater among blacks than among whites [33][34][35]. Notably, in recent years, an increased incidence of EAC has been observed, particularly among whites [1,34]. Altogether, these distinctive features indicate geographic and racial disparities in esophageal cancer [31].

Fig. 3 The toxicology chart summarizes the detoxification pathways enriched in our dataset by IPA. Ingenuity Pathway Analysis (IPA) identified the NRF2-mediated oxidative stress response pathway as the most enriched toxicology pathway. Blue bars represent -log(p-value) and the ratio is the number of genes characterized in the dataset compared to the total number of genes belonging to that pathway.

The upstream regulatory pathways represented in more than one study in the meta-analysis are indicated in bold.
We conducted a transcriptome analysis to identify the molecular repertoire involved in esophageal squamous cell carcinoma in African-American males. To our knowledge, this study is the first to investigate and analyze the global gene expression pattern of stage IV ESCC in African-Americans.
Heavy alcohol consumption, cigarette smoking, and poor diet are environmental risk factors for ESCC. Our findings in African-American ESCC reveal dysregulation of genes involved in detox networks, including the NRF2 pathway, which is a primary mediator of detoxification and metabolism responses (Additional file 5) [36]. The nuclear factor-erythroid 2 p45-related factor 2 (NFE2L2) gene encodes the transcription factor NRF2, which regulates the transcription of antioxidant/electrophile response element (ARE)-containing target genes in response to oxidative and/or toxic environmental changes. The NRF2 pathway also regulates wound healing, resolution of inflammation, autophagy, ER stress response and unfolded protein response [37], apoptosis, differentiation of keratinocytes [38] and the embryonic development of the esophagus in response to growth factor-induced ROS production [39,40].
The role of NRF2 pathway is cancer-type dependent. NRF2 protects against chemical carcinogen-induced carcinogenesis in the stomach, bladder and skin [41]. However, NRF2 activation plays an oncogenic role in lung, head and neck, ovarian and endometrial cancers [41]. Previous studies conducted in Asian samples demonstrated that higher expression of NRF2 is positively correlated with lymph node metastasis and drug resistance in ESCC [42]. Mutations in NFE2L2 confer malignant potential and resistance to therapy in advanced ESCC [43]. However, only 10% of Asian ESCC carry mutations in the NFE2L2 gene or its negative regulator KEAP1 [44]. Consistent with this data, our meta-analysis of gene expression profiles only showed a modest involvement of NRF2 in toxicology pathways in Asian ESCC datasets. IPA demonstrated the enrichment of NRF2 pathway in ESCC with high confidence in our dataset, suggesting a unique molecular signature of African-American ESCC. The significance of NRF2 pathway in African-American ESCC merits further functional evaluation.
In our CGH data, we previously found a loss of 7q in >50% of ESCC from African-American males [13]. Transcriptome mapping identified four genes located in the 7q21.1-22.3 region among which is the cytochrome P450 gene cluster that includes CYP3A5, CYP3A7, CYP3A4, and CYP3A43. It is noteworthy that our analysis indicates a significant loss of expression of CYP3A5 in addition to the down-regulation of three other genes that encode cytochrome P450 enzymes. It is well established that CYP3A enzymes metabolize more than half of the drugs used clinically [45]. Cytochrome P450 enzymes are also active in metabolizing toxic compounds thus their loss potentially contributes to carcinogenesis.
The persistent metabolic imbalance and tumor promoters found in cigarette smoke activate growth-promoting, cancerous conditions. Thus, the continual activation of the NRF2 pathway could provide an adaptation mechanism to environmental toxicants, especially in cancers [37]. Aryl hydrocarbon signaling, fatty acid, and xenobiotic metabolism also share some of the proteins that function in the NRF2 pathway. Therefore, the effect of the dysregulated NRF2 pathway may amplify the impairment of the dynamics of these pathways. In addition to the response to toxins, NRF2 might promote proliferation of cancer cells by reprogramming metabolism to anabolic pathways [46]. However, further studies are required to investigate the causal association of the NRF2 pathway with esophageal tumor development in African-Americans. Future genomic studies are important to evaluate the mutational spectra of NFE2L2 or KEAP1 in African-American ESCC.
Recent studies that outlined the genomic and molecular characterization of esophageal carcinoma in the Asian population suggested the dysregulation of the receptor tyrosine kinase (RTK)-MAPK-PI3K, NOTCH, Hippo, cell cycle, and epigenetic pathways as the primary molecular mechanism of ESCC [44,47]. The amplification or over-expression of FGFR1, MET, EGFR, ERBB2, ERBB4, and IL7R was observed in the majority of the patients and has been suggested as the main driver of ESCC tumorigenesis [47]. Our meta-analysis of ESCC expression datasets indicated that the activation of growth factors and/or their receptors, RABL6, FOXM1, CCND1, and CTNNB1 are upstream signaling drivers of the cellular growth of ESCC.
The upstream regulatory role of RABL6 was predicted in six out of eight ESCC datasets. The RABL6 gene encodes a member of the Ras superfamily of small GTPases. The encoded protein RABL6, also known as RBEL or PARF, binds to both GTP and GDP and may play a role in cell growth and survival. Overexpression of this gene may play a role in breast and pancreatic cancer tumorigenesis [48][49][50]. Functional analysis of RABL6 in ESCC warrants further study.
The most common inhibited upstream regulatory pathways are TP53 and KDM5B across most of the ESCC datasets. Studies have shown that TP53 negatively regulates NRF2-mediated gene expression [51]. The down-regulation of TP53 could synergistically sustain the activation of NRF2 seen in African-American ESCC. We previously identified a single nucleotide mutation of SCEL gene in both normal and squamous cell carcinoma of esophagus in African-Americans [52]. In our present study, SCEL is significantly under-expressed in African-American ESCC, and thus could play a role in squamous cell carcinogenesis as suggested by the down-regulation of this gene in larynx and hypopharynx [53], and in tongue squamous cell carcinoma [54].
The diversity among the inhibited upstream pathways implies that a variety of susceptibility loci remain to be discovered in ESCC tumorigenesis, particularly regarding the contribution of the deregulation of immune components. Given the differences in enriched pathways displayed by ESCC in various ethnic groups, it is possible that different genetic backgrounds have dissimilar responses to various environmental exposures [55,56].
Conceivably, our findings unmasked only a restricted view of the processes that are compromised in ESCC given the inherent limitations of microarray-based transcriptome profiling, the small sample size that was analyzed and incomplete modeling of biological reactions due to lack of functional data. However, the present study uncovered salient mechanistic aspects of the squamous esophageal cellular system in African-Americans, which to our knowledge, have not been described previously.

Role of the funding body: The Elsa U. Pardee Foundation approved the study design, the plans for sample collection, and data analysis before releasing the funds. The foundation also received a progress report during the study term and a final report at the end of the study term. The funding body did not contribute to the preparation and the revision of the manuscript.

The Robert Leet and Clara Guthrie Patterson Trust. Recipient: Robert Wadleigh MD, MS.
Role of the funding body: The Robert Leet and Clara Guthrie Patterson Trust approved the study design, the plans for sample collection, and data analysis before releasing the funds. The foundation also received a progress report during the study term and a final report at the end of the study term. The funding body did not contribute to the preparation and the revision of the manuscript.

Availability of data and materials
The data were deposited in the NCBI GEO database and are available at: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=glstesispfoztcd&acc=GSE77861.

Authors' contributions
HVE: Analyzed microarray data, performed pathway analysis, performed meta-analysis of expression profiling data, wrote and revised the manuscript. KJ: Analyzed the microarray raw data, and contributed to the interpretation of findings and intellectual content of the manuscript. SG: Performed RNA extraction and microarray experiments. DK: Coordinated patient sample collection, and contributed to the intellectual content of the manuscript. GT: Provided patient samples, and assisted in the revision of the manuscript. HA: Provided patient samples, and assisted in the revision of the manuscript. EPH: Supervised microarray experiments and contributed to the intellectual content of the manuscript. RGW: Designed the experiments, contributed to the intellectual content, and co-wrote and assisted in the revision of the manuscript. All authors have read and approved the final version of this manuscript.

Competing interests
None of the authors have any competing interests in the manuscript. This manuscript does not reflect the views of the U.S. Federal Government or any of its agencies.

Consent for publication
The participants gave permission to publish the results.

Ethics approval and consent to participate
This study was done under a protocol approved by the Washington DC VAMC Institutional Review Board, and written informed consent was obtained from patients prior to biopsy or surgery. The IRB ID for this study is 00707.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.