Prognostic Impact of Array-based Genomic Profiles in Esophageal Squamous Cell Cancer

Background Esophageal squamous cell carcinoma (ESCC) is a genetically complex tumor type and a major cause of cancer-related mortality. Although distinct genetic alterations have been linked to ESCC development and prognosis, these alterations have not gained clinical applicability. We applied array-based comparative genomic hybridization (aCGH) to obtain a whole genome copy number profile relevant for identifying deranged pathways and clinically applicable markers. Methods A 32 k aCGH platform was used for high resolution mapping of copy number changes in 30 stage I-IV ESCC. Potential interdependent alterations and deranged pathways were identified and copy number changes were correlated to stage, differentiation and survival. Results Copy number alterations affected a median of 19% of the genome and included recurrent gains of chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q and losses of 3p, 5q, 8p, 9p and 11q. High-level amplifications were observed in 30 regions and recurrently involved 7p11 (EGFR), 11q13 (MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTN and SHANK2) and 11q22 (PDGF). Gain of 7p22.3 predicted nodal metastases and gains of 1p36.32 and 19p13.3 independently predicted poor survival in multivariate analysis. Conclusion aCGH profiling verified genetic complexity in ESCC and herein identified imbalances of multiple central tumorigenic pathways. Distinct gains correlate with clinicopathological variables and independently predict survival, suggesting clinical applicability of genomic profiling in ESCC.


Background
Esophageal squamous cell carcinoma (ESCC) is a major cause of cancer-related mortality worldwide. Despite advances in diagnostic methods and combined treatment modalities, the majority of the tumors are diagnosed at advanced stages and the overall 5-year survival rate remains 40%. ESCC develops through a multistep process from dysplasia, through carcinoma in situ to invasive carcinoma, and the acquisition of genetic alterations is tightly related to the dysplasia-carcinoma sequence [1]. The characterization of genetic alterations inherently linked to ESCC development and an in-depth understanding of the molecular mechanisms underlying carcinogenesis and growth control may therefore provide information relevant for early tumor detection, refined prognosis and development of novel targeted therapeutics.

Tumor tissue
All 30 patients, 24 men and 6 women with a mean age of 64 (range 54-78) years, were recruited from the southern Sweden health care region and had undergone primary esophagectomy at the Department of Surgery, Lund University Hospital. None of the patients had received neoadjuvant radiotherapy or chemotherapy. Tumor tissue collected at surgery was stored at -80°C until DNA extraction. Stage (according to the International Union Against Cancer) was I in 3 cases, II in 9, III in 7, and IV in 11 cases (see Additional file 1). Presence of ≥ 50% tumor cells in the tissue was verified by touch imprints, which were stained with hematoxylin-eosin and evaluated by a gastrointestinal pathologist (B.H.). All deaths were ESCC related, and the median follow-up was 21 (range 1-46) months for the survivors. Written informed consent was provided by all patients and the study was approved by the Lund University ethics committee.

BAC Array Platform
We used the 32 k human genome high-resolution BAC rearrayed clone set, Version 1.0, from the BACPAC Resource Center at Children's Hospital Oakland Research Institute, Oakland (CA, US). The arrays were produced at the Swegene DNA Microarray Resource Center, Department of Oncology, Lund University (GEO platform repository accession GPL4723) [17], with a resolution >80 kb.

DNA isolation, labelling and hybridization
DNA was extracted, labelled and hybridized as previously described [17]. Commercially obtained DNA, derived from a pool of normal human males (Promega, Madison, WI, USA), was used as reference. Scanning was performed using an Agilent microarray scanner (Agilent Technologies, Palo Alto, CA, USA).

Image processing and data analysis
Scanned arrays were analyzed using GenePix Pro 4.1 (Axon Instruments, MDS Analytical Technologies, Ontario, Canada) and bad spots were "flagged" during manual inspection. The quantified data matrix was loaded into BioArray Software Environment (BASE) [18], in which filtering and normalization were performed [19]. Background-corrected Cy3 and Cy5 intensities were calculated from the median feature and median local background intensities of the uploaded file. Intensity ratios were calculated as the background-corrected tumor channel (ch1) divided by the background-corrected reference channel (ch2). Spots flagged as bad were filtered out from further analysis and were regarded as missing values. A signal-to-noise ratio (SNR) threshold of 5 was set for both channels. Data were normalized using an implementation of a pin-based Lowess algorithm [20] in BASE, excluding the X chromosome. A moving average smoothing algorithm with a 200 kbp sliding window was applied, and a BASE-adapted CGH-plotter software was used to identify regions of gains and losses, excluding the X and Y chromosomes [21]. A region of gain or loss was defined as two or more consecutive clones showing an absolute log2 ratio ≥ 0.2, and a high-level amplification as a log2 ratio ≥ 1.5.
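The thresholds described above can be illustrated with a minimal Python sketch. It is not the BASE or CGH-plotter implementation: the function names are hypothetical, Lowess normalization and moving average smoothing are omitted, and treating a missing clone as a break in a run is an assumption made here for simplicity. Only the cut-offs (absolute log2 ratio ≥ 0.2 in at least two consecutive clones, log2 ratio ≥ 1.5 for high-level amplification) are taken from the text.

```python
import math

GAIN_LOSS_THRESHOLD = 0.2      # absolute log2 ratio for gain/loss (from the text)
AMPLIFICATION_THRESHOLD = 1.5  # log2 ratio for high-level amplification (from the text)
MIN_CONSECUTIVE = 2            # minimum number of consecutive clones per region

def log2_ratio(tumor_fg, tumor_bg, ref_fg, ref_bg):
    """Background-corrected log2(tumor / reference); None if a channel is not positive."""
    tumor = tumor_fg - tumor_bg
    ref = ref_fg - ref_bg
    if tumor <= 0 or ref <= 0:
        return None
    return math.log2(tumor / ref)

def classify(value):
    """Label one clone as 'gain', 'loss', 'neutral' or 'missing'."""
    if value is None:
        return "missing"
    if value >= GAIN_LOSS_THRESHOLD:
        return "gain"
    if value <= -GAIN_LOSS_THRESHOLD:
        return "loss"
    return "neutral"

def is_high_level_amplification(value):
    """True if the clone's log2 ratio reaches the amplification cut-off."""
    return value is not None and value >= AMPLIFICATION_THRESHOLD

def call_regions(ratios):
    """Return (first_index, last_index, state) for runs of at least
    MIN_CONSECUTIVE clones sharing the same gain or loss state.
    Assumption: clones are ordered by genomic position on one chromosome."""
    regions = []
    run_start, run_state = 0, None
    states = [classify(r) for r in ratios] + ["end"]  # sentinel closes the last run
    for i, state in enumerate(states):
        if state != run_state:
            if run_state in ("gain", "loss") and i - run_start >= MIN_CONSECUTIVE:
                regions.append((run_start, i - 1, run_state))
            run_start, run_state = i, state
    return regions

# Hypothetical log2 ratios for eleven consecutive clones (None = flagged spot)
example = [0.0, 0.25, 0.3, 0.1, -0.4, -0.21, -0.5, 0.9, None, 1.6, 1.7]
print(call_regions(example))
# [(1, 2, 'gain'), (4, 6, 'loss'), (9, 10, 'gain')]
```

In this toy input the single clone at index 7 exceeds the gain threshold but is not called, because it is not part of a run of two or more clones.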

Statistical Analysis
A Chi2 test was used to identify associations between copy number changes and clinicopathological characteristics. For survival analysis, the Kaplan-Meier method was used to estimate relevant event variables, and the log-rank test was used to compare survival between two strata. The Cox proportional hazards model was used for univariate and multivariate survival analyses. Cox analysis was first applied to all clones; thereafter, two or more consecutive significant (P < 0.05) clones (interrupted only by one or two clones) were re-analysed as regions affected by gains and losses. Regions significantly (P < 0.05) linked to stage/outcome in univariate analysis were included in the multivariate analysis. All covariates were evaluated for adherence to the assumption of proportional hazards by calculating Schoenfeld residuals. Correlations between changes within the same pathway were analysed using pairwise correlation. CGH profiles were classified as gains or losses according to the dominating variable in the region corresponding to the gene locus. After Bonferroni correction, correlations with P < 0.05 were considered significant. Stata 9.2 (StataCorp LP, College Station, TX 77845, USA) was used for the statistical calculations.
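The grouping of significant clones into regions and the Bonferroni step can be sketched as follows. This is an illustrative Python outline only; the analyses in the study were run in Stata, and the function names, the interpretation of "interrupted only by one or two clones" as a maximum gap of two non-significant clones, and the example P values are all assumptions.

```python
def group_significant_clones(p_values, alpha=0.05, max_gap=2, min_clones=2):
    """Return (first_index, last_index) spans that contain at least
    `min_clones` significant clones (P < alpha), where neighbouring
    significant clones are separated by at most `max_gap` other clones.
    Assumption: p_values are ordered by genomic position; None = missing."""
    significant = [i for i, p in enumerate(p_values) if p is not None and p < alpha]
    regions = []
    current = []
    for idx in significant:
        # Close the current span when the gap to the previous hit is too wide
        if current and idx - current[-1] - 1 > max_gap:
            if len(current) >= min_clones:
                regions.append((current[0], current[-1]))
            current = []
        current.append(idx)
    if len(current) >= min_clones:
        regions.append((current[0], current[-1]))
    return regions

def bonferroni(p_values):
    """Bonferroni-adjusted P values: each P multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical per-clone P values from a univariate Cox analysis
clone_p = [0.01, 0.2, 0.03, 0.04, 0.5, 0.6, 0.7, 0.02, 0.9, 0.9, 0.9, 0.01, 0.04]
print(group_significant_clones(clone_p))
# [(0, 3), (11, 12)]

# Hypothetical pairwise-correlation P values; significant if adjusted P < 0.05
adjusted = bonferroni([0.005, 0.04, 0.5])
print([p < 0.05 for p in adjusted])
# [True, False, False]
```

The isolated significant clone at index 7 is dropped because it has no significant neighbour within two clones, which mirrors the requirement of two or more significant clones per region.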

Immunohistochemical Staining
Serial 4-μm sections from one representative paraffin-embedded tumor block were used for immunostaining using a monoclonal antibody against human EGFR at a dilution of 1:50 (DAKO, Glostrup, Denmark). The slides were evaluated as no staining; 1+ (cytoplasmic staining or discontinuous membrane staining); 2+ (membrane staining with moderate intensity); and 3+ (intense staining with retained membranous staining). Interpretation of the staining was performed by two of the authors (AC and MN), who were blinded to the copy number changes and the clinical data.

Results
The median fraction of clones lost per tumor was 6.9% (range 0.1-24%) and the median fraction of clones gained was 9.8% (range 1.1-21%). Recurrent gains identified in at least 60% of the tumors involved chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q. Likely target genes include TERT, EGFR, MYC, MYEOV, CCND1, FGF4, FGF3, CTTN and AKT1. Loss of genetic material in at least 40% of the samples affected 3p, 5q, 8p, 9p and 11q (Table 1). A homozygous deletion of 9p21.3, corresponding to the CDKN2A locus, was identified in one case and was verified with PCR using CDKN2A-specific primers (sequences available from the authors upon request, data not shown). High-level amplifications were observed in 33 regions and recurrently involved 11q22 (harbouring PDGF, in 3 tumors), 11q13 (in 11 tumors, encompassing MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTN and SHANK2) and 7p11 (including EGFR, in 4 tumors). Overexpression of EGFR was validated by immunohistochemistry, which revealed a highly positive (3+) staining in 19 tumors, 12 of which had copy number gain of 7p11. All 4 tumors with high-level amplification of the EGFR locus showed 2+ or 3+ EGFR staining.
Despite genetic complexity (Figure 1), the genomic profiles were found to correlate with tumor stage, differentiation, and development of metastases, with fewer changes (mean 13% versus 25%) in highly differentiated ESCC than in poorly differentiated tumors (see Additional file 2). A Chi2 test identified ~400 clones that mapped to 6 genomic regions (2p11.
Univariate analysis verified that stage and tumor size were associated with prognosis; stage HR 1.6, P = 0.04 and pT HR 1.8, P = 0.05. When copy number gains and losses were correlated to prognosis, Cox proportional hazards analysis identified 1284 clones with a P value < 0.05, mapping to 30 regions (see Additional file 7). When gains and losses were considered separately, 7 regions remained significantly associated with prognosis in univariate analysis. When these regions were entered with stage into multivariate analysis, gain of 1p36.32 and gain of 19p13.3 independently predicted poor prognosis (Table 2 and Figure 2).

Discussion
Array-based genomic profiling of ESCC confirms the genetic complexity suggested by earlier studies that have applied other means of genetic profiling, e.g. cytogenetics, conventional CGH, and LOH analysis. We found copy number gains and losses affecting a median of 19% of the genome, identified multiple high-level amplifications, and demonstrated an association between copy number alterations and stage, differentiation and prognosis, suggesting clinical applicability of genomic profiling in ESCC.
The 11q13 region is central in ESCC development and alterations herein were identified in 20/31 tumors. CCND1 is a likely target, but several other candidate genes, e.g. FGF4, FGF3, CTTN and SHANK2, showed high-level amplification. This amplicon harboured MYEOV, which has previously been associated with ESCC and described to be co-amplified with CCND1 [22]. The RB pathway is frequently targeted in ESCC carcinogenesis [23][24][25] and its activation seems to be dependent mainly on CCND1 amplification. In our sample set no significant correlations were observed between the gains/losses observed in the members of the RB pathway (CCND1, CCNE1, E2F3 and CDKN2A). Gain of 14q32.3, which includes the AKT1 oncogene, was identified in half of the samples. The PTEN-PIK3CA-AKT signalling cascade is frequently deregulated in several types of cancers and expression of PIK3CA has been strongly associated with elevated AKT activity. An increased copy number of PIK3CA is primarily detected in tumors with retained PTEN expression [26], and indeed, none of the 11 tumors with PIK3CA gain showed loss of the PTEN locus at 10q23.3, whereas 7 tumors showed PTEN loss without change at the PIK3CA locus. The pairwise analysis showed a negative correlation (P = 0.005) between these two alterations. Expression data from oligonucleotide arrays were available from 8 samples (unpublished data) and verified overexpression of PIK3CA in 7 of these tumors, which further supports PIK3CA and PTEN acting as mutually exclusive tumorigenic events [27]. Gain of 7p11.2 was identified in half of the tumors and included high-level amplifications in 4 tumors. The most likely target gene herein, EGFR, is overexpressed in a multitude of malignancies, including ESCC [7,28,29,30].
Immunostaining for EGFR was highly positive (3+) in 12 out of 14 tumors with copy number gain of EGFR, thus suggesting, as previously reported [28,31,32], that copy number gain leads to high protein expression in a significant fraction of tumors (86% in our sample set).
In our cohort, loss of PTEN was observed in 23% of the samples and was significantly correlated (P = 0.04) to EGFR gain, which may be relevant for resistance to EGFR inhibitors, since PTEN loss correlates with treatment resistance. Furthermore, gain of 17q12, harbouring ERBB2, was observed in 9 tumors and 6 of these showed concomitant gain of 7p11.2 (EGFR), which suggests that co-overexpression of ERBB2 and EGFR may apply also to ESCC [33]. High-level amplification of ERBB2 correlated to overexpression (data not shown). Copy number gain of 5p15 was among the most frequent changes and the minimal region of overlap harbours some 20 identified genes, among them the telomerase regulator TERT, which has previously been shown to be overexpressed in ESCC and has been associated with prognosis in other tumor types [12,34,35,36]. Gains of 7p22.3, 8q22.3-qter and 20q11.21 were also frequently found and include the target genes MAD1L1, involved in TERT transcription; LRP12 and WISP1, linked to cell survival and p53-mediated apoptosis; and TPX2, known to activate Aurora-A kinase [37][38][39][40][41]. High-level amplifications affected 33 loci, among which recurrent high-level amplification peaks were detected at 7p11 (EGFR), 11q22 (cIAP1, MMP3 and PDGF), 11q13 (which harbours e.g. CCND1, FGF4, FGF3, CTTN and SHANK2), and 10q21 with unknown targets.
The most frequent recurrent copy number losses affected 3p, 5q, 8p, 9p, and 11q, which is consistent with other studies; these loci also contain several tumor suppressors linked to ESCC [2,3,8,42]. Losses affecting the 9p21-p24 region, which contains CDKN2A and CDKN2B, were identified in 13/30 tumors. CDKN2A deletions have been associated with an invasive and metastatic phenotype and a homozygous CDKN2A deletion was identified in one sample [43][44][45]. Frequent losses were also observed at 3p26-p14, which harbours THRB, RARB, TOP2B and FHIT. Pairwise correlations between frequently observed gains and losses identified 5 regions that were significantly more often affected by concurrent aberrations. Four of these were located on the same chromosome, whereas loss of 3p24 and 5q12 occurred at an increased incidence. These regions contain targets such as TOP2B, RARB, and TGFBR2 on 3p and PIK3R1 and RAD17 on 5q, and the association identified may indicate cross-talk between genes in these regions.

Conclusion
The accumulation of genetic changes is central in ESCC development and progression. Our study is the first to apply high-resolution aCGH to clinical prognostication in ESCC. Studies that have applied traditional CGH have suggested an independent prognostic role for genes located on 8q, 11q, 12p, 14q, and 20q [2,16,46], and a recent study using a 4k BAC array platform (with 1 Mb resolution) identified 4 clones, mapping to 3q29, 4q21.21, 8q24 and 8q24.3, linked to survival [47]. The high-resolution data presented here demonstrate extensive genetic complexity already in early stage tumors, support the involvement of several key genes in ESCC, link gain of 7p22.3 to presence of nodal metastases and demonstrate that gains of 1p36.32 and 19p13.3 provide independent prognostic information (HR = 19.6 and 7.0, respectively). The different prognostic regions identified may be related to the inherent genetic complexity of ESCC, to differences in materials (e.g. study populations from different geographic areas with disparities in dietary and environmental exposures) and use of different genetic profiling technologies [48,49]. Nevertheless, these results hold promise for the application as genetic classifiers and refined prognostic markers. Moreover, the recognition of recurrent rearrangements in central signaling pathways provides a basis for the development of selected and individualized targeted therapeutics in ESCC.

Figure 2 Kaplan-Meier survival plots of the two prognostic regions in multivariate analysis. A) Highly significant difference in survival between patients without gain and with gain of 1p36.32 (P = 0.005). B) Difference in survival between patients without gain and with gain of 19p13.3 (P = 0.01).