Selection of suitable reference genes for accurate normalization of gene expression profile studies in non-small cell lung cancer



In real-time RT quantitative PCR (qPCR) the accuracy of normalized data is highly dependent on the reliability of the reference genes (RGs). Failure to use an appropriate control gene for normalization of qPCR data may result in biased gene expression profiles, as well as low precision, so that only gross changes in expression level are declared statistically significant or patterns of expression are erroneously characterized. Therefore, it is essential to determine whether potential RGs are appropriate for specific experimental purposes. The aim of this study was to identify and validate RGs for use in the differentiation of normal and tumor lung expression profiles.


A meta-analysis of lung cancer transcription profiles generated with the GeneChip technology was used to identify five putative RGs. Their consistency and that of seven commonly used RGs were tested by using TaqMan probes on 18 paired normal-tumor lung snap-frozen specimens obtained from non-small-cell lung cancer (NSCLC) patients during primary curative resection.


The 12 RGs displayed a wide range of Ct values: except for rRNA18S (mean 9.8), the mean values of all the commercial RGs and ESD ranged from 19 to 26, whereas those of the other microarray-selected RGs (BTF3, YAP1, HIST1H2BC, RPL30) exceeded 26. RG expression stability within sample populations and under the experimental conditions (tumor versus normal lung specimens) was evaluated by: (1) descriptive statistics; (2) equivalence test; (3) GeNorm applet. All these approaches indicated that the most stable RGs were POLR2A, rRNA18S, YAP1 and ESD.


These data suggest that POLR2A, rRNA18S, YAP1 and ESD are the most suitable RGs for gene expression profile studies in NSCLC. Furthermore, they highlight the limitations of commercial RGs and indicate that meta-data analysis of genome-wide transcription profiling studies may identify new RGs.


Lung cancer remains the leading cause of cancer-related death in Europe and North America and accounts for nearly 30% of the total [1, 2]. Despite advances in treatment, the prognosis of non-small cell lung cancer (NSCLC) is poor: only 5–15% of patients survive 5 years after diagnosis, depending mainly on the initial stage of the disease [3]. Real-time quantitative PCR (qPCR) is the method of choice for accurate determination of transcript amounts. It has recently been applied in lung cancer studies to quantify the expression level of predictive and/or prognostic target genes [4, 5].

In gene expression studies, qPCR data need to be normalized to remove nonspecific variability associated with differences in RNA quantity and quality. This is usually done by relative quantification, whereby the expression level of the target gene is determined relative to another gene transcript, the so-called reference gene (RG), and the results are expressed as a target to reference ratio [6]. The reliability of normalized data is highly dependent on RG robustness. An ideal RG should be constitutively expressed and characterized by stable expression in different samples/experimental conditions.

Genes considered to be valid RGs in semi-quantitative techniques (e.g. Northern blot) may be less reliable when highly sensitive qPCR is used [7]. Identification of more sensitive RGs and their validation within specific biological conditions are thus critical issues in qPCR studies [8].

Microarray data can be used to identify potential RGs by modeling the expression data to select those with the most stable expression [9–11]. Most lung cancer qPCR studies use commercial RGs, such as glyceraldehyde-3-phosphate dehydrogenase (GAPDH) [12], β-actin (ACTB) [13], TATA-binding protein [14], β2-microglobulin [15], cytochrome oxidase II [16] and 18S ribosomal RNA (rRNA18S) [4]. Their reliability in this context, however, has not been demonstrated. Furthermore, it is increasingly evident that in many experimental situations the use of GAPDH or ACTB is inappropriate due to their high variability [10, 17–20]. Thus, the aim of the present study was the selection and validation of new RGs for the differentiation of normal and tumor lung specimens. Five putative RGs (ESD, BTF3, HIST1H2BC, RPL30, YAP1) selected from a meta-analysis of GeneChip experiments [21] were validated along with ACTB, GAPDH, rRNA18S, PPIA, PGK1, RPLP0 and POLR2A by qPCR assessment of their expression levels in 18 paired normal lung and NSCLC samples. Expression changes between and within these two classes were investigated to define the most stable and comprehensive RGs.



Primary tumour samples and paired normal lung tissues were obtained from 18 NSCLC patients (11M, 7F) aged 41–79 (mean 62) during primary curative resection (not preceded by radiotherapy or chemotherapy) at the San Luigi Hospital, Orbassano (Italy) between December 2003 and March 2004. Samples were immediately snap-frozen in liquid nitrogen and stored at -80°C until RNA extraction. The histological subtypes were: 7 adenocarcinomas (ADCA, 38.9%), 10 squamous cell carcinomas (SQCA, 55.6%) and 1 bronchioloalveolar carcinoma (BAC, 5.5%). Prior written and informed consent was obtained from each patient. The study was approved by the appropriate ethical review board. The guidelines for good clinical practice and the recommendations of the Declaration of Helsinki for biomedical research involving human subjects were followed.

RNA extraction and cDNA synthesis

Total RNA (totRNA) was isolated from lung specimens with the RNeasy 96 Kit and Biorobot 8000 (Qiagen, Hilden, Germany) according to the manufacturer's instructions. RNA was extracted from 15–25 mg of tumor and 60–80 mg of normal lung tissue specimens. To take account of the expression variability due to cellular heterogeneity, totRNA was extracted from a biological duplicate of both normal and tumour specimens. Genomic DNA contamination was removed by on-column DNase I treatment. totRNA was then quantified with an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) and stored at -80°C. Finally, 2 μg of totRNA were reverse-transcribed with random hexamer primers and MultiScribe Reverse Transcriptase (MMLV) contained in the High Capacity cDNA Archive Kit (Applied Biosystems, Foster City, CA), in accordance with the manufacturer's instructions.

Real-time PCR

Expression levels of the 5 putative and 7 commercial RGs (ACTB, β-actin; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; PGK1, phosphoglycerate kinase 1; POLR2A, polymerase RNA II polypeptide A, 220kDa; PPIA, cyclophilin A; rRNA 18S, 18S ribosomal RNA; RPLP0, ribosomal protein, large, P0; ESD, esterase D/formylglutathione hydrolase; BTF3, basic transcription factor 3; HIST1H2BC, histone 1 H2bc; RPL30, ribosomal protein L30; YAP1, Yes-associated protein 1, 65kDa) and 3 target genes (BRCA1, breast cancer 1 early onset; ERCC2, excision repair cross-complementing rodent repair deficiency, complementation group 2; RRM2, ribonucleotide reductase M2 polypeptide) were evaluated with TaqMan Probes commercially available as "Assay on Demand" (Applied Biosystems, Foster City, CA, USA) with optimised primer and probe concentrations (Table 1). Quantitative PCR (qPCR) was performed on an ABI PRISM 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA, USA) in 384 well plates assembled by Biorobot 8000 (Qiagen, Hilden, Germany) and the reaction was performed in a final volume of 20 μl. All qPCR mixtures contained 1 μl of cDNA template (corresponding approximately to 20 ng retro-transcribed totRNA), 1× TaqMan Universal PCR Master Mix (2×) (Applied Biosystems, Foster City, CA, USA) and 1× Assay-on-Demand Gene Expression Assay Mix (20×). Cycle conditions were as follows: after an initial 2-min hold at 50°C to allow AmpErase-UNG activity, and 10 minutes at 95°C, the samples were cycled 40 times at 95°C for 15 sec and 60°C for 1 minute. Baseline and threshold for Ct calculation were set manually with the ABI Prism SDS 2.1 software. Automation allowed negligible intra-assay variation (≤5% CV) and low inter-assay variation (≤10% CV) when evaluated on raw linear expression quantities.

Table 1 Gene expression assays.

Data analysis

Meta-analysis of expression data was performed by using the affy, MergeMaid and genefilter libraries implemented in Bioconductor [22]. Statistical analyses were performed with R [23]. Differential gene expression was considered significant at α = 0.05 (p-value < 0.05). PCA (Principal Component Analysis, [24]) and agglomerative hierarchical clustering were performed with TMEV software [25]. The PCR efficiencies of the Assays on Demand, evaluated by cDNA dilution curves, ranged between 91% (rRNA18S, RPL30) and 100% (POLR2A, ESD) with an average value of 96%. Since PCR efficiencies were high and comparable, we used the theoretical value of 2 (100%) for gene expression quantification. Fold changes in RG expression levels between the normal and tumor paired samples were therefore evaluated as 2^(-Δ'Ct) [7]. The expression of each target gene was reported as 2^(-ΔΔCt) [6].
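As an illustration of the quantification just described, the following minimal Python sketch computes the RG fold change 2^(-Δ'Ct) and the target fold change 2^(-ΔΔCt). The function names and the optional `efficiency` argument are assumptions introduced here for illustration; the original analysis was performed in R with the amplification base fixed at 2.

```python
def amplification_base(efficiency=1.0):
    """Per-cycle amplification factor: 2.0 at 100% efficiency, 1.91 at 91%."""
    return 1.0 + efficiency


def rg_fold_change(ct_tumor, ct_normal, efficiency=1.0):
    """Fold change of a reference gene, tumor vs. paired normal: base^(-Δ'Ct)."""
    delta_prime_ct = ct_tumor - ct_normal
    return amplification_base(efficiency) ** (-delta_prime_ct)


def target_fold_change(ct_target_tumor, ct_ref_tumor,
                       ct_target_normal, ct_ref_normal, efficiency=1.0):
    """Fold change of a target gene normalized to one reference: base^(-ΔΔCt).

    The paired normal sample acts as calibrator.
    """
    delta_ct_tumor = ct_target_tumor - ct_ref_tumor      # ΔCt in tumor
    delta_ct_normal = ct_target_normal - ct_ref_normal   # ΔCt in normal
    delta_delta_ct = delta_ct_tumor - delta_ct_normal    # ΔΔCt
    return amplification_base(efficiency) ** (-delta_delta_ct)
```

With the default base of 2, a gene that crosses the threshold one cycle earlier in the tumor than in the paired normal sample is reported as two-fold more expressed. With an efficiency of 91% the same one-cycle difference would correspond to a 1.91-fold change, which gives a sense of the approximation made by using the theoretical value.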

The GeNorm Software [26] was obtained from the Gene Quantification Home Page [27].

Biological replicates were treated as independent samples in the analysis of RG raw Ct distributions and in the GeNorm analysis. Their average Ct value was used in the Δ'Ct calculation (i.e. for each sample pair, mean Ct_tumor − mean Ct_normal) applied to evaluate RG expression stability with the equivalence test. We also averaged the ΔCt values (Ct_target − Ct_reference) of biological replicates to generate the mean ΔΔCt values used in PCA and in agglomerative hierarchical clustering, which was performed with the Euclidean distance metric and the average linkage method.


Arrays meta-analysis

Five putative RGs were selected from a meta-analysis of a subset of lung tissue transcription profiles [21]. To secure a balanced set of tissues, the analysis focused on 17 normal samples, 18 ADCA (randomly selected from a total of 127), 21 SQCA, 20 pulmonary carcinoids and 6 small-cell lung carcinomas. After RMA intensity calculation [28] and quantile normalization [29], an intensity filter was applied to remove all probe sets with intensity ≤100 in all experiments.

Tumour/normal gene expression similarity was measured with the "integrative correlation" (IC) estimation developed by Parmigiani et al. [30]. This reflects the general consistency of gene expression among the different tissues, since it is generated from the gene-to-gene correlations obtained by computing all pairwise correlations for all genes across the tissue groups. A variance/IC filter between each tumor group and the normal tissues was applied to select probe sets with an inter-sample standard deviation (SD) ≤ 0.5 and IC ≥ 0.2. All the 28 top-ranked putative RGs displayed a nearly normal distribution of the residuals representing the expression differences between each tumor sample and the normal samples (data not shown). Within this set we selected five candidate RGs (ESD, BTF3, HIST1H2BC, RPL30, YAP1) involved in different biological processes and for which commercial gene expression assays were available.

RGs expression ranges in lung tissues

The 12 RGs investigated displayed a wide range of Ct values (Figure 1) in separate evaluation of the paired samples. Except for rRNA18S (mean 9.8), the mean values fell into two groups: group A (ESD and the other six commercial RGs): 19–26; group B (the other four microarray-selected RGs): over 26. The microarray-selected RGs (apart from ESD) thus lay in a Ct range not usually covered by commercial RGs and closer to that of our three target genes (i.e. BRCA1, ERCC2 and RRM2, whose mean Ct values were 31, 29 and 30 respectively). All genes showed a normal distribution pattern in both normal and tumor tissue samples, as assessed by the Shapiro-Francia test (with α = 0.05) in the Bioconductor Nortest package.

Figure 1

Expression levels of the 12 candidate RGs in normal and tumor lung tissue samples. Ct values for each RG are shown as medians (lines), 25th to 75th percentile (boxes) and range (whiskers). Hatched and open boxes represent tumor and normal samples respectively.

RGs expression stability

To allow accurate quantification an RG should be stably expressed among samples, and to warrant trustworthy quantification it should not be modulated by the experimental conditions. Ideally, to fulfill both conditions, the raw Ct distributions of an RG in the normal and tumor populations should be characterized by a narrow range of expression with comparable mean values. With reference to stability of expression in both normal and tumor samples, the lowest RNA transcription ranges (i.e. Ct 75th percentile − Ct 25th percentile) were associated with POLR2A (normal = 0.73; tumor = 0.69) and rRNA18S (normal = 0.69; tumor = 0.93), whereas the highest ranges were detected for GAPDH (normal = 1.8; tumor = 2.2) and RPLP0 (normal = 1.71; tumor = 1.84). Unfortunately, even when equal amounts of RNA input are used, reverse transcription reactions are responsible for most of the variation in the experimental determination of mRNA quantities [31, 32]. We are thus unable to estimate how much the technical (i.e. RT efficiency) and the biological (i.e. inter-sample) variability contribute to the measured difference in RNA quantities. Descriptive analysis of raw Ct distributions provides only a rough estimate of RG stability that must be confirmed by other approaches.
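The transcription range used above is an interquartile range of raw Ct values. A minimal sketch of that descriptive statistic, written in Python with the standard library only, is given here. The function name is hypothetical, and the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the text does not state which quantile definition was used.

```python
from statistics import quantiles


def ct_interquartile_range(ct_values):
    """Return Ct(75th percentile) - Ct(25th percentile) for one gene.

    ct_values: raw Ct values of one RG in one population (normal or tumor).
    The 'inclusive' quartile method is an assumption made for this sketch.
    """
    q1, _median, q3 = quantiles(ct_values, n=4, method="inclusive")
    return q3 - q1
```

A narrow range in both the normal and the tumor population, as reported for POLR2A and rRNA18S, points to stable expression within each population, but says nothing about a shift between the two populations, which is addressed by the equivalence test.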

To investigate the experimental stability of RG transcription we evaluated the significance of differences in RNA transcription levels between the tumor and normal paired samples, where the latter were used as calibrators in the Δ'Ct quantification approach [7, 33]. Since the two-sided t-test does not control type II errors (false negatives), we performed an equivalence test for dependent samples [34] to test the null hypothesis of inequivalence H0: μTN ≤ θL or μTN ≥ θU [35], where μTN is the mean tumor-versus-normal difference in Ct (Δ'Ct) and θL and θU are the lower and upper equivalence limits.

The equivalence test showed that only POLR2A, rRNA18S, ESD and YAP1 had a differential expression, between the paired samples, within the equivalence interval [θL = -1; θU = 1], which represents a fold change variation interval [0.5; 2]. All these four genes, unlike the others, were indeed characterized by the lowest average fold change (<1.8) and maximum fold change (<8). Conversely, the greatest variability was associated with GAPDH, RPLP0 and HIST1H2BC (average fold change >6 and maximum fold change ranging from 27 fold for HIST1H2BC to 80 fold for GAPDH) (Figure 2).
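One common way to carry out an equivalence test on paired data is the two one-sided tests procedure, expressed as a confidence-interval inclusion rule: equivalence is declared when the (1 − 2α) confidence interval of the mean paired difference lies entirely inside [θL; θU]. The Python sketch below follows that formulation; it is not necessarily identical to the procedure of [34]. The function name is hypothetical, and the critical value of Student's t (1.740 for 17 degrees of freedom and a one-sided α of 0.05, i.e. 18 pairs) is hard-coded as a default and must be changed for other sample sizes.

```python
from math import sqrt
from statistics import mean, stdev

# One-sided 5% critical value of Student's t with 17 degrees of freedom
# (18 paired samples). Assumption: replace it for a different n or alpha.
T_CRIT_DF17 = 1.740


def paired_equivalence(delta_prime_ct, theta_low=-1.0, theta_high=1.0,
                       t_crit=T_CRIT_DF17):
    """Confidence-interval form of the two one-sided tests on paired data.

    delta_prime_ct: one Δ'Ct (Ct_tumor - Ct_normal) per patient.
    Returns (equivalent, (lower, upper)), where (lower, upper) is the
    confidence interval of the mean Δ'Ct.
    """
    n = len(delta_prime_ct)
    d_bar = mean(delta_prime_ct)
    std_err = stdev(delta_prime_ct) / sqrt(n)
    lower = d_bar - t_crit * std_err
    upper = d_bar + t_crit * std_err
    equivalent = theta_low < lower and upper < theta_high
    return equivalent, (lower, upper)
```

With the limits θL = -1 and θU = 1, a gene is declared stably expressed only if the data rule out, at the chosen α, a mean change of one cycle or more in either direction, that is a mean fold change outside [0.5; 2].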

Figure 2

Fold change in expression levels. Differences in gene expression levels between tumor and normal sample pairs were represented as average fold change variation (plot) and maximum fold change (bar). Filled columns refer to statistically stable expressed RGs.

Gene expression stability was also investigated with GeNorm software [26], which calculates the measure of gene expression stability (M) of a putative RG based on the average pairwise variation between all investigated RGs. As required by the software, Ct values were converted to linear expression quantities by 2^(-ΔCt) using the highest expressed sample as calibrator. GeNorm confirmed our results by identifying rRNA18S and POLR2A as the most stable genes (M = 0.436), followed by ESD and YAP1 (Figure 3a). Results were independent of the histological classification of the lung tumors. The GeNorm program also calculates a normalization factor (NF) from two or more genes and uses it to define the optimal number of RGs required for an accurate normalization strategy: the variable V is the pairwise variation (Vn/n+1) between two sequential normalization factors (NFn and NFn+1). The use of at least the three most stable RGs (POLR2A, rRNA18S and ESD) is recommended to improve the accuracy of normalized data, as suggested by a V value below the cut-off of 0.15 (Figure 3b), which was indicated by the authors as the limit beneath which it would not be necessary to include additional reference genes [36].
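The stability measure M can be illustrated with a short Python sketch based on the definition in [26]: for each pair of genes, the standard deviation of the log2-transformed expression ratios across samples is computed, and M for a gene is the arithmetic mean of these pairwise values over all other genes. This is a simplified re-implementation for illustration, not the GeNorm applet itself; the function names are hypothetical and the stepwise exclusion of the least stable gene is omitted.

```python
from math import log2
from statistics import mean, stdev


def relative_quantities(ct_values):
    """Convert the Ct values of one gene to linear quantities, 2^(-ΔCt).

    The sample with the lowest Ct (highest expression) is the calibrator.
    """
    min_ct = min(ct_values)
    return [2.0 ** -(ct - min_ct) for ct in ct_values]


def genorm_m(quantities):
    """Return the stability measure M for every gene.

    quantities: dict mapping gene name -> list of relative quantities,
    with samples in the same order for every gene.
    """
    m_values = {}
    for gene_j, a_j in quantities.items():
        pairwise_sd = []
        for gene_k, a_k in quantities.items():
            if gene_k == gene_j:
                continue
            log_ratios = [log2(x / y) for x, y in zip(a_j, a_k)]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values
```

Genes whose expression ratio to the other candidates stays constant across samples obtain a low M, so the ranking depends on the whole candidate set rather than on any single gene taken alone.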

Figure 3

GeNorm analysis of the 12 candidate RGs. Selection of the RGs most suitable for normalization in lung cancer gene profiling studies by GeNorm analysis. The results are presented according to the output file of the program. (a) stepwise exclusion of the least stable genes. The x-axis from left to right indicates the ranking of the genes according to their expression stability, while the Y-axis indicates the stability parameter M. (b) determination of the optimal number of RGs for normalization.

Significance of suitable RGs for normalization

Since normalization of the target genes versus a set of reliable RGs should result in the same fold change variation with minimal fluctuations, Principal Component Analysis [24] and agglomerative hierarchical clustering were used to describe the homogeneity degree of target fold change variation as a function of the reference used (Figure 4). Gene expression levels of three target genes (RRM2, BRCA1, ERCC2) were normalized with respect to each of the available RGs and expressed as -ΔΔCt using normal paired samples as calibrator [6]. Both approaches showed that the most homogeneous group of target fold change evaluations was obtained when the most stable RGs were used for normalization: POLR2A, rRNA18S, YAP1 and ESD (Figure 4A–B filled dots). By contrast, the commonly used RGs ACTB and GAPDH led to inconsistent estimation of fold change (Figure 4A–B open triangles).
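The idea behind this comparison can be sketched in Python as follows: the three targets are normalized against each candidate RG, giving one -ΔΔCt profile per RG, and the Euclidean distance between two profiles measures how differently the two RGs report the same target changes. The function names, the data layout and the use of a single sample pair are assumptions made for illustration; the original analysis used mean ΔΔCt values and TMEV.

```python
from math import dist


def neg_ddct_profile(target_ct, ref_ct):
    """-ΔΔCt of each target, normalized to one reference gene.

    target_ct: dict target name -> (ct_tumor, ct_normal)
    ref_ct:    (ct_tumor, ct_normal) of the reference gene
    The paired normal sample is the calibrator.
    """
    ref_tumor, ref_normal = ref_ct
    profile = []
    for ct_tumor, ct_normal in target_ct.values():
        ddct = (ct_tumor - ref_tumor) - (ct_normal - ref_normal)
        profile.append(-ddct)
    return profile


def profile_distance(target_ct, ref_a, ref_b):
    """Euclidean distance between the -ΔΔCt profiles from two references."""
    return dist(neg_ddct_profile(target_ct, ref_a),
                neg_ddct_profile(target_ct, ref_b))
```

Two RGs that are both unaffected by the tumor state yield nearly identical profiles and hence a small distance, whereas an RG whose own Ct shifts between tumor and normal displaces every target estimate by the same amount and moves away from the stable group.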

Figure 4

Targets fold change homogeneity. PCA and agglomerative hierarchical clustering were used to describe the degree of homogeneity of target fold change variation as a function of the reference used. Expression levels of three target genes (RRM2, BRCA1, ERCC2) were normalized with respect to each of the RGs and expressed as -ΔΔCt using normal paired samples as calibrator. A) Small fluctuations in target fold change detection using reliable RGs result in a very limited spread over the PCA space. As expected, POLR2A, rRNA18S, ESD and YAP1 produce the most homogeneous cluster (filled dots). B) Similar results are obtained by hierarchical clustering, where the smallest Euclidean distance is associated with the previously indicated set of genes.


This paper describes the first systematic comparison of several conventional and potential RGs and their effectiveness as internal controls for relative quantification in lung cancer gene expression profiling. To increase data reliability, we sought to control several variables by (1) using matched pairs of normal and tumor lung samples instead of unpaired samples, to minimize inter-individual variations that could arise from genotype profile; (2) analysing biological duplicates of each sample, to take into account the expression variability due to cellular heterogeneity in both normal and tumor lung tissues; (3) employing a liquid handling robot for most of the technical procedures (i.e. totRNA extraction and qPCR plate assembly), to avoid operator variability; (4) using high-quality certified qPCR reagents and (5) using commercial RG probes (TaqMan Gene Expression Assays) designed to guarantee maximum PCR efficiency and specificity [37], to allow trustworthy comparison in gene expression stability evaluation. Furthermore, in view of the unsuitability of conventional RGs, such as GAPDH or ACTB, for most experimental situations [10, 17–20], we chose a microarray data meta-analysis to identify new potential RGs. Since 2 of the 4 best performing RGs (ESD and YAP1) identified in the present study belong to the meta-analysis group, we can confirm the reliability of this strategy in the identification of new candidate RGs within a specific experimental condition. A further advantage is the easier identification of candidate RGs expressed at a comparable order of magnitude to target genes. This might result in an improvement of data normalization since, as highlighted by some researchers [38, 39], at the same Ct level, reference and target experience the same conditions and real-time RT-PCR kinetics with respect to polymerase activation, reaction inactivation, stochastic relation between target and primer concentration, and reaction end product inhibition by the generated RT-PCR product. However, the filtering procedures used to search microarray data for stably expressed genes are biased towards genes expressed at low levels.

RG expression stability within sample populations and under the experimental conditions (tumor versus normal lung specimens) was evaluated by: (1) descriptive statistics; (2) equivalence test; (3) GeNorm applet. All approaches agreed and show that for gene expression profiling in NSCLC the most suitable reference genes for normalization are rRNA18S, POLR2A, ESD and YAP1. The effectiveness of POLR2A, the main enzyme in mRNA transcription, is further supported by others [40]. It is characterized by stable expression among different tissues and it is not modulated by treatment with TPA and ionomycin, indicating resistance to cellular activation. The efficacy of rRNA18S, however, has been criticized because: (1) it is highly expressed, representing up to 80% of cellular RNA and (2) it is transcribed by a specific RNA polymerase [41]. Nevertheless, the results of the present study clearly show that there are no significant differences between rRNA18S and the other best performing RGs identified. In addition, rRNA18S has already been recommended for gene expression analysis in non-microdissected kidney biopsies [10]. A recent assessment of RGs for lung cancer studies by Liu et al. [42] has indicated that GAPDH is the most suitable RG for qPCR studies in NSCLC tissue samples. In our study, however, all the statistical approaches used suggested that GAPDH is far from being a valid RG due to its evident modulation by neoplastic transformation. Our data are strongly in agreement with previous indications that GAPDH is regulated in response to various stimuli (e.g. hypoxia, insulin, mitogen, EGF) and that its transcript is elevated in various cancer tissues [43–45], including lung cancer [46]. The discrepancy with Liu's data could be related to the different ethnic origin of the specimens. More probably, however, it reflects the use of a different data analysis approach. Liu et al., in fact, defined GAPDH as the optimal RG because its expression was almost equal to the mean expression of the other six endogenous control genes they analyzed. This is not a satisfactory parameter of RG stability because it is closely dependent on the variability of the remaining RGs and cannot be used to assess their constancy within the investigated samples. Furthermore, all their results refer to pooled tumor and normal lung tissue samples and do not consider whether and to what extent RG expression levels were affected by malignant transformation. The latter assessment is mandatory to warrant the validity of quantification data [47].


Careful validation of RGs has been repeatedly advocated as mandatory to ensure the accuracy of normalized data in qPCR studies adopting relative quantification. Our results support the conclusion that for gene expression profiling in NSCLC the RGs most suitable for normalization are rRNA18S, POLR2A, ESD and YAP1. For accurate normalization we recommend their concurrent use, since this results in more reliable quantification. Furthermore, our results pinpoint the limits of conventional RGs and suggest that relatively simple meta-data analysis of genome-wide transcription profiling studies is a useful way of identifying putative RGs.


  1. Jemal A, Tiwari RC, Murray T, Ghafoor A, Samuels A, Ward E, Feuer EJ, Thun MJ, American Cancer S: Cancer statistics, 2004. CA: a Cancer Journal for Clinicians. 2004, 54 (1): 8-29.

  2. Mather D, Sullivan SD, Parasuraman TV: Beyond survival: economic analyses of chemotherapy in advanced inoperable NSCLC. Oncology (Huntingt). 1998, 12: 199-209.

  3. Schiller JH, Harrington D, Belani CP, Langer C, Sandler A, Krook J, Zhu J, Johnson DH, Eastern Cooperative Oncology G: Comparison of four chemotherapy regimens for advanced non-small-cell lung cancer. New England Journal of Medicine. 2002, 346 (2): 92-98. 10.1056/NEJMoa011954.

  4. Bepler G, Sharma S, Cantor A, Gautam A, Haura E, Simon G, Sharma A, Sommers E, Robinson L: RRM1 and PTEN as prognostic parameters for overall and disease-free survival in patients with non-small-cell lung cancer. Journal of Clinical Oncology. 2004, 22 (10): 1878-1885. 10.1200/JCO.2004.12.002.

  5. Rosell R, Felip E, Taron M, Majo J, Mendez P, Sanchez-Ronco M, Queralt C, Sanchez JJ, Maestre J: Gene expression as a predictive marker of outcome in stage IIB-IIIA-IIIB non-small cell lung cancer after induction gemcitabine-based chemotherapy followed by resectional surgery. Clinical Cancer Research. 2004, 10 (12 Pt 2): 4215s-4219s. 10.1158/1078-0432.CCR-040006.

  6. Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods (Duluth). 2001, 25 (4): 402-408.

  7. Schmittgen TD, Zakrajsek BA: Effect of experimental treatment on housekeeping gene expression: validation by real-time, quantitative RT-PCR. Journal of Biochemical & Biophysical Methods. 2000, 46 (1–2): 69-81. 10.1016/S0165-022X(00)00129-9.

  8. Huggett J, Dheda K, Bustin S, Zumla A: Real-time RT-PCR normalisation; strategies and considerations. Genes & Immunity. 2005, 6 (4): 279-284. 10.1038/sj.gene.6364190.

  9. Andersen CL, Jensen JL, Orntoft TF: Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Research. 2004, 64 (15): 5245-5250. 10.1158/0008-5472.CAN-04-0496.

  10. Schmid H, Cohen CD, Henger A, Irrgang S, Schlondorff D, Kretzler M: Validation of endogenous controls for gene expression analysis in microdissected human renal biopsies. Kidney International. 2003, 64 (1): 356-360. 10.1046/j.1523-1755.2003.00074.x.

  11. Warrington JA, Nair A, Mahadevappa M, Tsyganskaya M: Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiological Genomics. 2000, 2 (3): 143-147.

  12. Muller-Tidow C, Diederichs S, Bulk E, Pohle T, Steffen B, Schwable J, Plewka S, Thomas M, Metzger R, Schneider PM, et al: Identification of metastasis-associated receptor tyrosine kinases in non-small cell lung cancer. Cancer Research. 2005, 65 (5): 1778-1782. 10.1158/0008-5472.CAN-04-3388.

  13. Brabender J, Metzger R, Salonga D, Danenberg KD, Danenberg PV, Holscher AH, Schneider PM: Comprehensive expression analysis of retinoic acid receptors and retinoid X receptors in non-small cell lung cancer: implications for tumor development and prognosis. Carcinogenesis. 2005, 26 (3): 525-530. 10.1093/carcin/bgi006.

  14. Yuan A, Yu CJ, Shun CT, Luh KT, Kuo SH, Lee YC, Yang PC: Total cyclooxygenase-2 mRNA levels correlate with vascular endothelial growth factor mRNA levels, tumor angiogenesis and prognosis in non-small cell lung cancer patients. International Journal of Cancer. 2005, 115 (4): 545-555. 10.1002/ijc.20898.

  15. Falleni M, Pellegrini C, Marchetti A, Roncalli M, Nosotti M, Palleschi A, Santambrogio L, Coggi G, Bosari S: Quantitative evaluation of the apoptosis regulating genes Survivin, Bcl-2 and Bax in inflammatory and malignant pleural lesions. Lung Cancer. 2005, 48 (2): 211-216. 10.1016/j.lungcan.2004.10.003.

  16. Parekh K, Ramachandran S, Cooper J, Bigner D, Patterson A, Mohanakumar T: Tenascin-C, over expressed in lung cancer down regulates effector functions of tumor infiltrating lymphocytes. Lung Cancer. 2005, 47 (1): 17-29. 10.1016/j.lungcan.2004.05.016.

  17. Bas A, Forsberg G, Hammarstrom S, Hammarstrom ML: Utility of the housekeeping genes 18S rRNA, beta-actin and glyceraldehyde-3-phosphate-dehydrogenase for normalization in real-time quantitative reverse transcriptase-polymerase chain reaction analysis of gene expression in human T lymphocytes. Scandinavian Journal of Immunology. 2004, 59 (6): 566-573. 10.1111/j.0300-9475.2004.01440.x.

  18. Goidin D, Mamessier A, Staquet MJ, Schmitt D, Berthier-Vergnes O: Ribosomal 18S RNA prevails over glyceraldehyde-3-phosphate dehydrogenase and beta-actin genes as internal standard for quantitative comparison of mRNA levels in invasive and noninvasive human melanoma cell subpopulations. Analytical Biochemistry. 2001, 295 (1): 17-21. 10.1006/abio.2001.5171.

  19. Zhong H, Simons JW: Direct comparison of GAPDH, beta-actin, cyclophilin, and 28S rRNA as internal standards for quantifying RNA levels under hypoxia. Biochemical & Biophysical Research Communications. 1999, 259 (3): 523-526. 10.1006/bbrc.1999.0815.

  20. Bustin SA: Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. Journal of Molecular Endocrinology. 2000, 25 (2): 169-193. 10.1677/jme.0.0250169.

  21. Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, et al: Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proceedings of the National Academy of Sciences of the United States of America. 2001, 98 (24): 13790-13795. 10.1073/pnas.191502998.

  22. Bioconductor. []

  23. R-project. []

  24. Raychaudhuri S, Stuart JM, Altman RB: Principal components analysis to summarize microarray experiments: application to sporulation time series. Pacific Symposium on Biocomputing. 2000, 455-466.

  25. Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, et al: TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003, 34 (2): 374-378.

  26. Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F: Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biology. 2002, 3 (7): RESEARCH0034-10.1186/gb-2002-3-7-research0034.

  27. Gene Quantification Home Page. []

  28. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4: 249-264. 10.1093/biostatistics/4.2.249.

  29. Bolstad BM, Irizarry RA, Astrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003, 19 (2): 185-193. 10.1093/bioinformatics/19.2.185.

  30. Parmigiani G, Garrett-Mayer ES, Anbazhagan R, Gabrielson E: A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clinical Cancer Research. 2004, 10 (9): 2922-2927. 10.1158/1078-0432.CCR-03-0490.

  31. Stahlberg A, Hakansson J, Xian X, Semb H, Kubista M: Properties of the reverse transcription reaction in mRNA quantification. Clinical Chemistry. 2004, 50 (3): 509-515. 10.1373/clinchem.2003.026161.

  32. Curry J, McHale C, Smith MT: Low efficiency of the Moloney murine leukemia virus reverse transcriptase during reverse transcription of rare t(8;21) fusion gene transcripts. Biotechniques. 2002, 32 (4): 768-775.

  33. Dheda K, Huggett JF, Bustin SA, Johnson MA, Rook G, Zumla A: Validation of housekeeping genes for normalizing RNA expression in real-time PCR. Biotechniques. 2004, 37 (1): 112-114.

  34. Haller F, Kulle B, Schwager S, Gunawan B, von Heydebreck A, Sultmann H, Fuzesi L: Equivalence test in quantitative reverse transcription polymerase chain reaction: confirmation of reference genes suitable for normalization. Analytical Biochemistry. 2004, 335 (1): 1-9. 10.1016/j.ab.2004.08.024.

  35. McBride GB: Equivalence tests can enhance environmental science and management. Australian & New Zealand Journal of Statistics. 1999, 4 (1): 19-29. 10.1111/1467-842X.00058.

  36. Vandesompele J, De Preter K, Pattyn F, Poppe B, Van Roy N, De Paepe A, Speleman F: GeNorm software manual, update 6 Sept 2004. [http://medgen.ugent.be/~jvdesomp/genorm/]

  37. Amplification efficiency of TaqMan Gene Expression assays. Application Note Applied Biosystems.

  38. Bustin SA: Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J Mol Endocrinol. 2000, 25: 169-193. 10.1677/jme.0.0250169.

  39. Quantification strategies in real-time PCR. Chapter 3 of "A-Z of quantitative PCR".

  40. Radonic A, Thulke S, Mackay IM, Landt O, Siegert W, Nitsche A: Guideline to reference gene selection for quantitative real-time PCR. Biochemical & Biophysical Research Communications. 2004, 313 (4): 856-862. 10.1016/j.bbrc.2003.11.177.

  41. Thellin O, Zorzi W, Lakaye B, De Borman B, Coumans B, Hennen G, Grisar T, Igout A, Heinen E: Housekeeping genes as internal standards: use and limits. Journal of Biotechnology. 1999, 75 (2–3): 291-295. 10.1016/S0168-1656(99)00163-7.

  42. Liu DW, Chen ST, Liu HP: Choice of endogenous control for gene expression in nonsmall cell lung cancer. Eur Respir J. 2005, 26 (6): 1002-1008. 10.1183/09031936.05.00050205.

  43. Revillion F, Pawlowski V, Hornez L, Peyrat JP: Glyceraldehyde-3-phosphate dehydrogenase gene expression in human breast cancer. European Journal of Cancer. 2000, 36 (8): 1038-1042. 10.1016/S0959-8049(00)00051-4.

  44. Rondinelli RH, Epner DE, Tricoli JV: Increased glyceraldehyde-3-phosphate dehydrogenase gene expression in late pathological stage human prostate cancer. Prostate Cancer Prostatic Dis. 1997, 1 (2): 66-72. 10.1038/sj.pcan.4500208.

  45. Schek N, Hall BL, Finn OJ: Increased glyceraldehyde-3-phosphate dehydrogenase gene expression in human pancreatic adenocarcinoma. Cancer Research. 1988, 48 (22): 6354-6359.

  46. Tokunaga K, Nakamura Y, Sakata K, Fujimori K, Ohkubo M, Sawada K, Sakiyama S: Enhanced expression of a glyceraldehyde-3-phosphate dehydrogenase gene in human lung cancers. Cancer Research. 1987, 47 (21): 5616-5619.

  47. Dheda K, Huggett JF, Chang JS, Kim LU, Bustin SA, Johnson MA, Rook GA, Zumla A: The implications of using an inappropriate reference gene for real-time reverse transcription PCR data normalization. Analytical Biochemistry. 2005, 344 (1): 141-143. 10.1016/j.ab.2005.05.022.

Acknowledgements

This work was supported by the "Oncongenomic centers" AIRC grant and by Fondazione San Paolo. The fellowship for SS was granted by "Programma per la Ricerca Sanitaria 2003", Ministero della Salute.

Author information

Corresponding author

Correspondence to Silvia Saviozzi.

Additional information

Competing interests

The author(s) declare that they have no competing interests.

Authors' contributions

SS designed the experiments, carried out the qPCR analysis and drafted the manuscript. LIM extracted and quantified RNA samples. CAR performed the meta-analysis of microarray data. CF performed the statistical analysis. NS collected samples and performed clinical characterization. GVS reviewed the manuscript and gave final approval of the version for publication. All authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Saviozzi, S., Cordero, F., Iacono, M.L. et al. Selection of suitable reference genes for accurate normalization of gene expression profile studies in non-small cell lung cancer. BMC Cancer 6, 200 (2006).
