Clinical relevance of nine transcriptional molecular markers for the diagnosis of head and neck squamous cell carcinoma in tissue and saliva rinse

Background: Analysis of 23 published transcriptome studies allowed us to identify nine genes displaying frequent alterations in HNSCC (FN1, MMP1, PLAU, SPARC, IL1RN, KRT4, KRT13, MAL, and TGM3). We aimed to independently confirm these dysregulations and to identify potential relationships with clinical data for diagnostic, staging and prognostic purposes, either at the tissue level or in saliva rinse.
Methods: Over a period of two years, we systematically collected tumor tissue, normal matched mucosa and saliva from patients diagnosed with primary untreated HNSCC. Expression levels of the nine genes of interest were measured by RT-qPCR in tumor and healthy matched mucosa from 46 patients. MMP1 expression level was measured by RT-qPCR in the salivary rinse of 51 HNSCC patients and 18 control cases.
Results: Dysregulation of the nine genes was confirmed by the Wilcoxon test. IL1RN, MAL and MMP1 were the most efficient diagnostic markers of HNSCC, with ROC AUC > 0.95 and both sensitivity and specificity above 91%. No clinically relevant correlation was found between gene expression level in tumor and T stage, N stage, tumor grade, overall survival or disease-free survival. Our preliminary results suggest that, with 100% specificity, MMP1 detection in saliva rinse is potentially useful for non-invasive diagnosis of HNSCC of the oral cavity or oropharynx, but technical improvement is needed since sensitivity was only 20%.
Conclusion: IL1RN, MAL and MMP1 are prospective tumor diagnostic markers for HNSCC. MMP1 overexpression is the most promising marker, and its detection could help identify tumor cells in tissue or saliva.

Background
Head and neck squamous cell carcinoma (HNSCC) is the sixth most common malignancy worldwide, and every year an estimated 600,000 people are newly diagnosed. HNSCC accounts for about 10% of the total cancer burden in men [1]. It is often diagnosed at a late stage in heavy alcohol and tobacco users, and patients presenting with advanced disease have a short overall survival with major post-therapeutic side effects. Despite ongoing efforts, no molecular markers have yet been validated as useful clinical tools for the early detection and management of this disease. Recently, the transcriptome of HNSCC was extensively probed by several groups using expression microarray technology. The goals were to identify the genes potentially involved in this pathology and to identify diagnostic and/or prognostic gene-expression signatures. Although the heterogeneous designs of these studies (number of samples, tumor sites, tumor stages, microdissection of tissue samples, RNA extraction and RT protocols, microarray technologies or biostatistical analyses) probably explain the diversity of the results, there is nevertheless strong motivation to develop robust molecular diagnostic tests [28].
We reviewed 23 studies published between March 2000 and November 2007 in which expression profiles of HNSCC tumors were compared with those of their normal matched mucosa. Nine genes appeared to be repeatedly dysregulated, each in at least nine of the 23 studies. Four genes were up-regulated in HNSCC tissues (FN1, MMP1, PLAU and SPARC) and five were down-regulated (IL1RN, KRT4, KRT13, MAL and TGM3). Because these genes were recurrently found to be dysregulated whatever the tumor location and despite the considerable methodological heterogeneity of the gene expression studies, we hypothesized that these nine genes could be candidate HNSCC-specific molecular markers. We therefore evaluated their expression level by RT-qPCR in 46 HNSCC samples and 46 healthy matched mucosa samples taken from an independent cohort of 74 patients consecutively diagnosed with primary untreated HNSCC in our institution. We assessed their tumor diagnostic values and looked for possible correlations with clinical data. The need for early detection in high-risk patients has led several groups to develop techniques for identifying molecular markers in bodily fluids rather than in an invasive biopsy, which is unsuitable for cancer screening. Saliva is an easily accessible diagnostic fluid for screening HNSCC [29]. In this study, we assessed the capacity of MMP1, one of the most relevant markers, to detect HNSCC tumor cells among salivary cells; this provided information on its utility as a non-invasive diagnostic marker.

Patients and sample collection
All normal and tumor tissues, as well as saliva samples, were obtained from the Institutional Tumor Bank of the University Hospital of Nîmes, France. All samples were collected with the informed consent of the patients. This study was approved by the local ethics committee and the scientific board of the Tumor Bank.
Tumor tissue, normal matched mucosa and saliva were obtained from 74 patients consecutively diagnosed with primary untreated HNSCC between April 2005 and April 2007 in our institution. All HNSCC patients were Caucasian heavy smokers and drinkers. Additional saliva samples from 18 healthy controls were collected between February and April 2008. This cohort of control cases was composed of two populations: 9 consecutive patients referred to our institution by their physician for HNSCC screening, motivated by heavy drinking and smoking addictions; and 9 consecutive patients with no history of alcohol or tobacco addiction who consulted for otologic or rhinologic mechanical disorders. All patients underwent a head and neck examination to determine the absence or presence of HNSCC and to rule out any significant inflammatory lesion of the upper aero-digestive tract. Patients in the control group were on average slightly younger and included more women than the HNSCC group (average age = 52.8 years for controls vs 58.8 years for HNSCC [p = 0.037, Student's t-test]; sex ratio = 0.38 for controls vs 0.11 for HNSCC [p = 0.029, 2-sample test for equality of proportions]).
Tissue samples were collected by biopsy during diagnostic endoscopy and were immediately snap frozen and stored in liquid nitrogen. The matched non-malignant tissue was collected from the same anatomical site, as far as possible from the primary lesion for tumors crossing the midline or on the opposite side for well-lateralized tumors.
For saliva collection, both HNSCC patients and control cases followed the same protocol. Subjects were asked to refrain from eating, drinking, smoking, or oral hygiene procedures for at least 1 hour before collection. They then carried out a 30-second oral rinse with 50 ml of NaCl 0.9% solution, which was immediately centrifuged at 2600 rpm for 15 minutes at 4°C. The supernatant was discarded and the pellet was resuspended in 1 ml of NaCl 0.9% and stored at -80°C before RNA extraction. The clinical and histopathological features of the populations are presented in Table 1.

RNA isolation, quality control and cDNA synthesis
To obtain homogeneous and histologically well-characterized samples for RNA analyses, tissue samples were cut with a cryo-microtome into 50-200 slices of 9-μm thickness in RNase-free conditions. At least three frozen slices taken from the sample core were mounted on glass slides and briefly stained with eosin-hematoxylin for histopathological examination by an experienced pathologist (H.C.). Tumor tissue versus normal surrounding tissue percentage (T/N %) was determined for malignant samples. HNSCC samples with less than 30% tumor cells were excluded from the study. Tissue samples were not microdissected, so that the qPCR analysis included not only the tumor cells but also the surrounding stromal cells, which are known to have altered transcriptional activity during the carcinogenetic process [30]. To be included in the study, normal tissues had to be composed of both stroma and its surrounding normal epithelial layer, without any tumor cells. Total RNA was extracted from the remaining tissue slices using the RNeasy Mini Kit (Qiagen, Courtaboeuf, France). Saliva rinse RNA extraction was carried out using the RNA Isolation Kit on a […]. RNA quality control and quantification were carried out on an Agilent 2100 Bioanalyser using Total RNA Nano II Chips (Agilent Technologies, Massy, France). To limit the impact of possible RNA degradation on our results, we selected for further analysis only tissue samples with total RNA concentrations >85 ng/μl and RNA integrity number (RIN) values >6, and saliva samples with total RNA concentrations >40 ng/μl and RIN values >4. These criteria never led to sample exclusion, as satisfactory quality could be obtained by a second or third extraction procedure when required.
From our initial cohort of 74 HNSCC patients, 46 matched pairs of tumor tissue and normal mucosa and 51 saliva samples passed quality control and were usable. The amount of tissue, either tumor or normal mucosa, was insufficient for pathological analysis and mRNA extraction in 28 patients. Saliva collection was missed for 18 patients and the saliva collection procedure was inadequate for 5 others. In total, 23 patients were analyzed at both the tissue and saliva level, 23 at the tissue level only and 28 at the saliva level only. No relation was found between sample exclusion and tumor stage.
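The sample accounting above can be checked with a few lines of arithmetic. The following Python sketch uses only the counts stated in this section; the variable names are illustrative and not part of the study.

```python
# Sample accounting for the 74-patient cohort, using only counts given in the text.
total_patients = 74
insufficient_tissue = 28   # too little tissue for pathology and mRNA extraction
saliva_missed = 18         # saliva not collected
saliva_inadequate = 5      # saliva collection procedure inadequate

tissue_pairs = total_patients - insufficient_tissue
saliva_samples = total_patients - saliva_missed - saliva_inadequate

both = 23                  # patients analyzed in tissue and saliva
tissue_only = tissue_pairs - both
saliva_only = saliva_samples - both

print(tissue_pairs, saliva_samples)       # 46 51
print(tissue_only, saliva_only)           # 23 28
print(both + tissue_only + saliva_only)   # 74
```

The three groups (both, tissue only, saliva only) add back up to the initial cohort, so the reported numbers are internally consistent.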
After quality control, 1 μg of total RNA was reverse transcribed using M-MLV reverse transcriptase and oligo dT [14][15][16] primers (Applied Biosystems, Courtaboeuf, France). Samples were incubated for 10 minutes at 65°C, cooled on ice for 5 minutes, and incubated with reverse transcriptase for 1 hour at 37°C. Reverse transcriptase was then inactivated by heating at 95°C for 5 minutes. The resulting cDNA was diluted 1:10 for tissue samples and 1:4 for saliva samples before being used as PCR template.

Quantitative real-time PCR (qPCR)
We quantified the mRNA expression of three housekeeping genes and the nine genes of interest by real-time RT-qPCR using the Light Cycler Fast DNA Master plus SYBR green kit on a LightCycler 480 (Roche, Meylan, France). Stringent primer sets were designed using Oligo 6 Software (MBI, Cascade, CO, USA). Although DNase was used in the extraction procedure, amplification was done across spliced regions of the genes to avoid false detection of genomic DNA. Gene references and primer characteristics are listed in Table 2. For each qPCR reaction we used 2 μl of the diluted cDNA, 1 μl of 10 μmol·l⁻¹ forward primer, 1 μl of 10 μmol·l⁻¹ reverse primer, 5 μl Light Cycler Fast DNA Master plus SYBR green I and 11 μl PCR water, for a final volume of 20 μl. The PCR cycle conditions were set as follows: a preincubation step for 10 minutes at 95°C followed by 40 cycles for tissue cDNA and 53 cycles for saliva cell cDNA; each cycle included 15 seconds at 95°C, 15 seconds at 60°C, and 15 seconds at 72°C. The temperature transition rate was 20°C/second. A melting curve was generated by linear heating from 50°C to 95°C in 20 minutes with 10 fluorescence measurements every 1°C. A negative control, with no template, and a positive inter-run control were included for each gene in each qPCR run. All measurements were performed in triplicate. Standard curve assays showed an amplification efficiency >1.8 for all genes, and specificity was shown by a single peak at the expected temperature on melting curve analyses (Table 2). For each gene the inter-assay coefficient of variation of crossing point values was <10% (data not shown).
For the salivary assay, 50 cycles of the above-described qPCR procedure were carried out in order to detect very low concentrations of mRNA. Because this number of cycles is very high, PCR specificity was carefully controlled by melting curve analysis of the PCR products and by negative controls. In any case, the highest Cp value observed was 44.2 cycles. The whole RT-qPCR procedure was carried out with and without RT enzyme, confirming the absence of contamination by genomic DNA during the automated saliva rinse RNA extraction process (data not shown).

qPCR data analysis
Crossing point (Cp) values were automatically calculated by LightCycler 480® Software using the second derivative method and were imported into qBase, version 1.3.5, a free software program for the management and automated analysis of qPCR data, for quantitative analyses [31]. Normal tissue cDNA of patient 5 was arbitrarily chosen as the calibrator for each gene, and for this sample the expression level was set at 1 for each gene. For each gene, qPCR amplification efficiencies were calculated by qBase from standard curves and were applied in the quantification algorithm. Relative quantities were normalized by qBase to the geometric mean of the three housekeeping genes ACT, B2M and RPS18.
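The quantification scheme described here (calibrator set to 1, gene-specific efficiencies, normalization to the geometric mean of three reference genes) can be sketched in Python as follows. This is a minimal illustration, not the qBase implementation: it assumes the common efficiency-corrected form RQ = E^(Cp_calibrator - Cp_sample), and all function names and Cp values are hypothetical.

```python
from math import prod

def relative_quantity(efficiency, cp_calibrator, cp_sample):
    """Efficiency-corrected quantity relative to the calibrator (calibrator = 1)."""
    return efficiency ** (cp_calibrator - cp_sample)

def geometric_mean(values):
    values = list(values)
    return prod(values) ** (1.0 / len(values))

def normalized_expression(target, references):
    """target and each reference are (efficiency, cp_calibrator, cp_sample) tuples."""
    rq_target = relative_quantity(*target)
    # Normalization factor: geometric mean of the reference-gene relative quantities.
    norm_factor = geometric_mean(relative_quantity(*ref) for ref in references)
    return rq_target / norm_factor

# Hypothetical Cp values for one sample (efficiency 2.0 = perfect doubling per cycle).
target_gene = (2.0, 30.0, 25.0)            # 2**5 = 32-fold more than the calibrator
reference_genes = [(2.0, 20.0, 20.0),      # relative quantity 1
                   (2.0, 22.0, 21.0),      # relative quantity 2
                   (2.0, 18.0, 19.0)]      # relative quantity 0.5

print(normalized_expression(target_gene, reference_genes))  # 32.0
```

Using the geometric rather than the arithmetic mean keeps one strongly varying reference gene from dominating the normalization factor.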
To assess the stability of the housekeeping gene set, raw relative quantities were tested with geNorm software, version 3.4, for the combination of the three genes. The geNorm algorithm calculates a gene expression stability measure M for a reference gene, based on the average pairwise variation of that gene with all other tested reference genes [32].
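The stability measure M can be illustrated with a short Python sketch. It assumes the usual definition of pairwise variation as the standard deviation of the log2-transformed expression ratios of two genes across samples; the function name and the relative quantities below are hypothetical and this is not the geNorm code itself.

```python
import math
from statistics import mean, stdev

def genorm_m(quantities):
    """quantities: dict mapping gene -> relative quantities (same sample order).
    Returns a dict mapping each gene to its stability measure M."""
    m_values = {}
    for gene, values in quantities.items():
        pairwise_variation = []
        for other, other_values in quantities.items():
            if other == gene:
                continue
            # Pairwise variation: SD of the log2 ratios across all samples.
            log_ratios = [math.log2(a / b) for a, b in zip(values, other_values)]
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_variation)
    return m_values

# Hypothetical relative quantities for three reference genes in three samples.
example = {
    "ACT":   [1.0, 2.0, 4.0],
    "B2M":   [2.0, 4.0, 8.0],   # perfectly proportional to ACT
    "RPS18": [1.0, 1.0, 1.0],   # does not follow the other two
}
print(genorm_m(example))  # {'ACT': 0.5, 'B2M': 0.5, 'RPS18': 1.0}
```

A lower M indicates a more stably expressed gene; in this toy example the gene that does not co-vary with the others receives the highest M.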

Statistical analysis
S-Plus 2000 software (TIBCO Software, Inc., Palo Alto, CA, USA) was used to perform the statistical analyses. The quantitative variables were described by median, minimum and maximum values, and the qualitative variables were described by frequencies and percentages. Genetic markers were compared between HNSCC tissues and matched normal mucosa using a Wilcoxon test for paired data. Comparisons were considered significant when p < 0.05.
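For readers unfamiliar with the Wilcoxon test for paired data, the following Python sketch shows the principle. It is not the S-Plus routine used in the study: it assumes a two-sided normal approximation without continuity or tie correction, drops zero differences, and runs on hypothetical expression values.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test (normal approximation, two-sided).
    Returns (W_plus, W_minus, p_value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # zero differences are dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1                # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))        # two-sided tail probability
    return w_plus, w_minus, p_value

# Hypothetical paired expression levels (tumor vs matched normal mucosa).
tumor = [5, 7, 9, 11, 13, 15, 17, 19, 21, 23]
normal = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
w_plus, w_minus, p = wilcoxon_signed_rank(tumor, normal)
print(w_plus, w_minus, p < 0.05)  # 55.0 0 True
```

Because the test uses only the ranks of the paired differences, it makes no assumption that expression levels are normally distributed.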
The ability of each gene to diagnose the tumor tissue (HNSCC diagnostic ability) was represented by a Receiver Operating Characteristic (ROC) curve and the corresponding Area Under the Curve (AUC). The ROC curves and AUC were assessed by a non-parametric method [33]. The optimal cut-off used to calculate both sensitivity and specificity was defined as the cut-off minimizing the number of misclassified tissues. Since no data were available on the diagnostic value of our candidate markers, it was not possible to determine a statistically based sample size. We therefore decided to include all patients with usable samples available at our Institutional Tumor Bank at the time of analysis.
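The two quantities defined above, a non-parametric AUC and the cut-off minimizing the number of misclassified tissues, can be sketched in Python. This is an illustration under stated assumptions, not the authors' S-Plus analysis: the AUC is computed as the Mann-Whitney estimate, the marker is assumed to be overexpressed in tumor (for a down-regulated marker the comparison would be reversed), and the expression values are hypothetical.

```python
def roc_auc(tumor, normal):
    """Non-parametric AUC: probability that a tumor value exceeds a normal value
    (ties count for one half)."""
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))

def best_cutoff(tumor, normal):
    """Cut-off minimizing misclassified samples; a sample is called 'tumor'
    when its value is >= cut-off. Returns (cutoff, errors, sensitivity, specificity)."""
    best = None
    for cutoff in sorted(set(tumor) | set(normal)):
        false_negatives = sum(1 for t in tumor if t < cutoff)
        false_positives = sum(1 for n in normal if n >= cutoff)
        errors = false_negatives + false_positives
        if best is None or errors < best[1]:
            sensitivity = 1 - false_negatives / len(tumor)
            specificity = 1 - false_positives / len(normal)
            best = (cutoff, errors, sensitivity, specificity)
    return best

# Hypothetical normalized expression levels of an overexpressed marker.
tumor_values = [8.0, 9.5, 7.2, 6.1, 3.0]
normal_values = [1.0, 2.5, 3.5, 0.8, 1.9]

print(roc_auc(tumor_values, normal_values))      # 0.96
print(best_cutoff(tumor_values, normal_values))  # (3.0, 1, 1.0, 0.8)
```

When several cut-offs give the same number of errors, this sketch keeps the lowest one, which favors sensitivity over specificity; another tie-breaking rule would be equally defensible.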

Housekeeping gene stability
To date there is no published evidence to guide the selection of suitable housekeeping genes for the normalization of HNSCC RT-qPCR studies. Hence, we chose to normalize our qPCR relative ratio by the geometric mean of three commonly used housekeeping genes in cancer studies: ACT, B2M and RPS18. We validated this approach using geNorm software, version 3.4, which gave an expression stability measure based on the average pairwise variation of the three genes [32]. In our assay the stability value (M) of the association of ACT, B2M and RPS18 was M = 1.2. This M value, below the 1.5 arbitrary cut-off recom-

Differential gene expression between HNSCC and normal matched mucosa
Normalized relative expression levels of the nine selected genes were calculated by qBase for each sample. These expression levels were compared between HNSCC and normal matched tissues for the 46 patients (92 samples). As presented in Table 3, relative mRNA expression levels were significantly higher in tumor than in healthy samples for FN1, MMP1, PLAU and SPARC (Wilcoxon test for paired data, p ≤ 0.002). These levels were lower in tumor than in normal samples for KRT4, KRT13, IL1RN, MAL and TGM3 (Wilcoxon test for paired data, p < 0.001). For each gene, the median expression level ratio between tumor tissues and their matched normal mucosa is presented in Figure 1. Among the genes overexpressed in tumor, MMP1 showed the largest change (510-fold) and FN1 the smallest (2.6-fold). Among the genes downregulated in tumor, KRT4 showed the largest change (530-fold) and IL1RN the smallest (9-fold).
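The fold changes reported here are medians of per-patient ratios between tumor and matched normal tissue. A minimal Python sketch of that calculation follows; the helper names and expression values are hypothetical, and a down-regulation is expressed as the inverse of the median ratio.

```python
from statistics import median

def paired_ratios(tumor, normal):
    """Per-patient expression ratio R = tumor / matched normal."""
    return [t / n for t, n in zip(tumor, normal)]

def median_fold_change(ratios):
    """Median ratio reported as ('up', fold) when >= 1, else ('down', 1 / median)."""
    m = median(ratios)
    return ("up", m) if m >= 1 else ("down", 1 / m)

# Hypothetical normalized expression levels for three matched pairs.
overexpressed = paired_ratios([4.0, 8.0, 16.0], [1.0, 1.0, 2.0])   # ratios 4, 8, 8
repressed = paired_ratios([1.0, 2.0, 1.0], [10.0, 10.0, 20.0])     # ratios 0.1, 0.2, 0.05

print(median_fold_change(overexpressed))  # ('up', 8.0)
print(median_fold_change(repressed))      # ('down', 10.0)
```

Working with the median of paired ratios rather than the ratio of group means limits the influence of a few patients with extreme expression levels.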

HNSCC diagnostic ability of the nine genes
The ability of the nine selected genes to diagnose HNSCC was assessed by generating ROC curves. Corresponding AUC were calculated for each gene in order to find the best single marker to differentiate between normal mucosa and HNSCC tissue.

MMP1 mRNA detection in salivary rinse
Among the three best markers for HNSCC at the diagnostic level, MMP1 was the only one to be overexpressed in tumors. We hypothesized that its expression could be detected in the saliva rinses of patients with HNSCC, where tumor cells are known to desquamate. We carried out a preliminary assay to test this gene in saliva, but not the other eight candidate genes. Among the 51 HNSCC patients, only 10 (20%) exhibited MMP1 mRNA at a

Discussion
Transcriptome profiling of tumors is a promising approach for identifying gene dysregulations that are potentially useful at the clinical level to detect or diagnose tumors and predict outcome, as well as for identifying the gene pathways involved in carcinogenic processes. In the field of HNSCC, numerous studies have been performed, but none has led to clinical applications, essentially because of the great diversity in their designs and the lack of a confirmatory step in an independent cohort of patients. When we looked closely at the HNSCC transcriptome analyses, some genes emerged as frequently dysregulated and therefore as specific candidate HNSCC molecular markers.
This original study is the first to validate independently the gene dysregulation observed in various HNSCC transcriptome assays. Our study, based on RT-qPCR analysis, is not a definitive transcriptome validation assay, but it clearly proves that reliable biological information with potential clinical applications, can be obtained by pooling the results of several transcriptome data despite heterogeneous designs and small patient cohorts.
Figure 1 Differential mRNA expression of the nine genes of interest in macroscopically healthy mucosa and HNSCC tissue. For estimation of the individual expression of each gene, the expression ratios of paired tissue specimens were calculated as R = HNSCC/normal. The distribution of the log of these ratios is represented for each gene by a box-plot. The central box represents the interquartile interval, the white line inside the box is the median value, and the minimum and the maximum values are indicated with square brackets.
Using this gene selection and validation approach, we identified eight efficient transcriptional markers to predict the presence of HNSCC cells in tissue samples, with three remarkable markers: IL1RN, MAL and MMP1. When tested individually, these three markers presented specificity above 91% and sensitivity above 93% in a cohort of 46 patients with various stages, grades and sites of the disease. Surprisingly, there was no additional gain when these three top markers were evaluated together in a multivariate model rather than separately. This finding implies that once both specificity and sensitivity exceed 90%, little additional information can be expected from combining several transcriptional markers. It also suggests that the biological information carried by these three dysregulated genes is broadly similar as far as diagnosis is concerned. It is worth mentioning that in clinical routine the use of a single well-characterized diagnostic marker has the advantage of being simple.
No clinically relevant correlations were identified between gene expression level measured in tumor and clinical or pathological parameters. This absence of correlation could signify that these dysregulations are common early events in HNSCC carcinogenesis, making these genes useful as diagnostic markers but not as staging or prognostic markers. The proportion of early-stage tumors was low in our population, with 12 stage T2 tumors (26%); nonetheless, these nine dysregulations remained statistically significant in this subgroup of patients. This finding indicates that these markers could be used for the detection of early-stage as well as later-stage tumors.
Among the three most remarkable dysregulated genes for HNSCC diagnosis, MMP1 was highly overexpressed. We thus hypothesized that it could be detected in the saliva rinses of patients with HNSCC for diagnosis or screening purposes. Indeed, several recent studies have focused on the detection of tumor cells in saliva, mainly by studying DNA alterations (e.g., mutation, hypermethylation) but also by RNA detection [34]. Through a transcriptome salivary approach, Li et al. identified a set of seven overexpressed genes in saliva from patients presenting with oral squamous cell carcinoma; their association in a multivariate model yielded a sensitivity of 91% and a specificity of 91% [35][36][37][38][39][40]. Li et al. used cell-free saliva for their transcriptome study. In contrast, we chose to extract RNA from salivary floating cells. In our opinion, mRNA overexpression due to tumor alterations is more likely to be detected in pellets of salivary floating cells than in cell-free saliva, in which RNA is more dilute. Kim et al. reported an elevation of MMP1 related to refractory periodontitis in a microarray study of oral subepithelial connective tissues [41]. We do not think that changes in periodontal health were responsible for the MMP1 elevation in the saliva of our patients, since no clear differences in periodontal status were noticed between HNSCC patients and control cases or between MMP1-positive and MMP1-negative HNSCC patients.
Table: Calculation of AUC values for the relative expression levels of the nine genes of interest by ROC analyses and the corresponding diagnostic values with optimum cut-off value.
As most of the patients likely to develop HNSCC are reluctant to consult physicians when the first symptoms occur, a non-invasive screening method based on the detection of tumor cells among epithelial oral cells could be of interest. In this preliminary salivary study, MMP1 expression appeared to be confined to patients with HNSCC, as no expression was detected within the healthy population. Unfortunately, the sensitivity was low, as only 20% of the patients presenting with the disease were detected. Given that the salivary donors in our study presented advanced tumors, mostly symptomatic or easily detectable by standard clinical examination, this salivary approach could be considered inefficient in its current form. On the other hand, the 100% specificity seems encouraging, and HNSCC screening by a non-invasive salivary/oral test remains a promising field of research. In our study, the lack of sensitivity in detecting salivary expression of MMP1 could have been due to suboptimal procedures for saliva collection, RNA extraction and reverse transcription. Optimized procedures could increase the quantity of tumor cells and mRNA recovered in samples and thereby improve the sensitivity of this technique [29].
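The sensitivity and specificity quoted for the salivary assay follow directly from the counts reported in this study: 10 of the 51 HNSCC patients and none of the 18 controls were MMP1-positive. A short Python check, with illustrative function names, is given below.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects correctly detected."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Fraction of healthy subjects correctly classified as negative."""
    return true_negatives / (true_negatives + false_positives)

# Counts from the salivary MMP1 assay reported in the text.
hnscc_positive, hnscc_total = 10, 51
control_positive, control_total = 0, 18

sens = sensitivity(hnscc_positive, hnscc_total - hnscc_positive)
spec = specificity(control_total - control_positive, control_positive)

print(f"sensitivity = {sens:.1%}")  # sensitivity = 19.6%
print(f"specificity = {spec:.1%}")  # specificity = 100.0%
```

With only 18 controls, the 100% specificity estimate rests on a small sample and would need confirmation in a larger control population.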
The nine dysregulated genes have very different biological functions, and some of them are already known to be implicated in cancer. From a clinical point of view, IL1RN, MAL and TGM3 had not previously been clearly identified as potential HNSCC markers, while few data have been published concerning the implication of FN1, KRT, MMP1, PLAU and SPARC in this type of cancer. From a fundamental point of view, further studies are required to assess the functional mechanisms implicated in the dysregulations we observed.

Conclusion
We confirmed in an independent study the dysregulation of FN1, MMP1, PLAU, SPARC, IL1RN, KRT4, KRT13, MAL and TGM3 in HNSCC. Three of them, MMP1, MAL and IL1RN, were remarkable at distinguishing HNSCC from normal mucosa. They thus have interesting potential as screening/diagnostic markers, which remains to be evaluated at the clinical level. MMP1 was the most promising gene of this study because it was overexpressed in tumor, highly dysregulated (500-fold) and tumor-specific. In addition to salivary detection, MMP1 could be tested at the RNA level for its ability to identify small tumor clusters in diagnostic biopsies, surgical margins or lymph nodes in the setting of primary tumors. This would be a valuable complement to histopathological analyses, whose efficacy for the diagnosis of microscopic tumors is low.