Tumour material and RNA extraction
In total, 27 breast tumour samples and 4 normal tissue samples were analyzed in this study. The tumour samples included 20 biopsies from early breast carcinomas, of which ten had previously been sub-classified as luminal subtypes and the other ten as non-luminal subtypes using traditional two-dimensional microarray platforms such as Stanford cDNA arrays, Agilent Human Whole Genome Arrays and Applied Biosystems Human Genome Survey Microarrays [10]. Tumour tissues from 7 locally advanced breast cancers were also included. These samples are part of a cohort of thirty-five patients and have been described previously [11]. Three were classified as luminal subtypes and four as non-luminal subtypes [9]. In addition, control samples were taken from mastectomy specimens from four breast cancer patients. We selected tissue distant from the tumour and verified by haematoxylin-eosin (HE) staining of frozen sections that it consisted of unaffected breast tissue. The scientific protocols (tissue sampling and laboratory analysis) were approved by the Regional Committee for Medical Ethics (health region II) for the M-samples (reference S- 97103) and by the Regional Committee for Medical Ethics (health region III) for the F-samples (reference 39/92–69.91).
Total RNA was extracted from fresh frozen tissue samples using TRIzol® (Invitrogen, Carlsbad, CA, USA) as described by the manufacturer. RNA quality was evaluated by microcapillary electrophoresis on an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), and the concentration was measured using a NanoDrop instrument (NanoDrop Technologies, Wilmington, DE, USA).
Selection of genes immobilized on the MetriGenix-Chip
We selected 269 genes that best represented the classification scheme in breast cancer to be synthesized and immobilized on the MetriGenix-Chip (see Additional file 1 for a complete listing of probes). The genes were selected by a semi-supervised method from the intrinsic gene list defined in Perou et al. 2000 [6] and Sorlie et al. 2001/2003 [8, 9]. A nearest shrunken centroid analysis using PAM was performed to reduce the number of genes in the classification scheme, and the top 226 genes from this list were included among the 269 selected for synthesis. In addition, genes that distinguish lobular from ductal carcinomas [12] and cell cycle associated genes were added to the chip to enable other types of classification.
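The shrinkage step at the core of a nearest shrunken centroid analysis can be illustrated with a minimal sketch. It assumes that the per-class standardized differences between class centroids and the overall centroid have already been computed, and it omits the standardization by the pooled within-class standard deviation that PAM performs. The function names, gene labels, scores and threshold value are hypothetical and are not taken from this study.

```python
def soft_threshold(d, delta):
    """Shrink a standardized difference d towards zero by delta."""
    shrunk = abs(d) - delta
    if shrunk <= 0:
        return 0.0
    return shrunk if d > 0 else -shrunk


def surviving_genes(scores, delta):
    """Return genes whose shrunken difference is non-zero in at least one class.

    scores maps a gene label to its list of per-class standardized
    differences (assumed to be precomputed).
    """
    return [
        gene
        for gene, diffs in scores.items()
        if any(soft_threshold(d, delta) != 0.0 for d in diffs)
    ]


# Hypothetical scores for two classes (illustrative values only).
example_scores = {"geneA": [2.5, -2.5], "geneB": [0.3, -0.3]}
selected = surviving_genes(example_scores, delta=1.0)
```

Raising the threshold removes more genes from the classifier, which is how a list can be reduced to a chosen number of top-ranked genes.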
MetriGenix-Chip-Preparation and hybridization
Total RNA was amplified using a two-step cRNA synthesis scheme typical for microarray experiments. First-strand cDNA was synthesized by annealing T7-(T)24 primer (100 pmol/μl) with 5 μg total RNA in a final volume of 12 μl at 70°C for 10 min, followed by addition of first-strand master mix (5× First Strand Buffer, 0.1 M DTT, dNTP mix at 10 mM each, 25 U/μl RNaseOUT and 200 U/μl SuperScript II) to a final volume of 20 μl. The reaction was incubated at 42°C for one hour. Second-strand synthesis followed immediately by adding 5× Second Strand Buffer, dNTP mix at 10 mM each, 10 U/μl E. coli DNA Ligase, 10 U/μl E. coli DNA Polymerase I and 2 U/μl RNaseH to a final volume of 150 μl, and incubating at 16°C for two hours. To complete the reaction, T4 DNA Polymerase (5 U/μl) was added and the mixture was incubated at 16°C for a further five min (all reagents supplied by Invitrogen). Double-stranded cDNA was purified in Phase-Lock Gel Tubes (Eppendorf AG, Hamburg, Germany) and transcribed in vitro with Ambion's MEGAscript™ T7 High Yield Transcription kit (Ambion, Inc., Austin, TX, USA), followed by cleanup with RNeasy® RNA isolation kit columns (Qiagen). Amplified cRNA was evaluated on the Agilent 2100 Bioanalyzer (Agilent Technologies). The cRNA was biotin-labelled using MetriGenix Bio ULS (universal linkage system) (0.5 Units/μl) in a one-step chemical coupling reaction at 85°C for 30 min (MetriGenix and KreaTech Biotechnology, Amsterdam, The Netherlands).
Prior to hybridization, the biotin-labelled cRNA was mixed with spike-in controls (for hybridization quality), Sample Dilution Buffer 2 (MetriGenix) and herring sperm DNA (Invitrogen), and denatured for 5 min at 90°C. The sample was then injected into the sample compartment of the 4D array, and the blocking and staining reagents were injected into their respective compartments.
Custom 4D arrays to monitor the genes of interest were supplied by MetriGenix (Baltimore, Maryland). For each gene, a 50- to 60-mer probe was designed from publicly available sequences to have a GC content of 45 to 55 percent and a melting temperature between 64 and 68°C. For product quality control (QC), the following steps were performed. First, hybridization was performed with only the complements to the control probes to confirm that there was no cross-contamination of probes on the chip. Second, a test cRNA was hybridized to the chip in the absence of the control targets; since the controls were bacterial and the test cRNA mammalian, no hybridization should occur at the control probes, and chips that showed any failed QC. Third, control targets were added to every cRNA that was hybridized to the chip, and the intensity of the spots was used qualitatively to confirm the hybridization results. Probes were synthesized with a 5' amino modification and printed on MetriGenix arrays using a Gene Machine Omnigrid arrayer. The arrays are housed in a 4D cartridge that includes reagent reservoirs and interfaces with the MGX2000 and MGX1200CL array processing stations.
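The length and GC content criteria stated above can be checked with a short sketch such as the following. It is an illustration only: the function names and test sequences are hypothetical, and the melting temperature criterion is left out because the text does not state which formula was used to calculate it.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_design_rules(seq, min_len=50, max_len=60, min_gc=45.0, max_gc=55.0):
    """Check probe length (50 to 60 nt) and GC content (45 to 55 percent)."""
    return min_len <= len(seq) <= max_len and min_gc <= gc_percent(seq) <= max_gc


# Hypothetical sequences, not probes from the chip.
balanced_probe = "ACGT" * 13   # 52 nt, 50 percent GC
at_rich_probe = "AT" * 26      # 52 nt, 0 percent GC
short_probe = "GC" * 10        # 20 nt, too short

results = [passes_design_rules(p) for p in (balanced_probe, at_rich_probe, short_probe)]
```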
4D array hybridizations were performed on the MGX™ 2000 hybridization station, which controlled all subsequent steps (blocking and buffer flushes, hybridization time and temperature). After four hours of processing (3 h for hybridization to the corresponding probes and 1 h for blocking, washing and staining of the reactive spots with HRP-streptavidin), the chip was placed in the MGX 1200 CL detection unit for chemiluminescence (CL) detection, with exposure times usually ranging from 2 to 5 s. Image analysis was then performed with the MetriSoft software (MetriGenix), which generated an Excel file containing the experiment data for further analysis.
Data analysis
The MetriSoft software used two concepts, the noise floor and a stringent threshold value, to filter spots in the analysis of each chip. The noise floor was calculated by the software for each chip from the amount of noise in the chip image and was then used to determine the threshold value. The stringent threshold value was calculated as 3 times the noise floor, an empirical estimate of an 'absent' spot based on the image noise. Any signal below this value was not considered significant and was set to the threshold value. For intra-chip normalization, the signal intensity of each spot was divided by the threshold to produce the normalized values within each chip. Data from the three successfully hybridized controls (one control with a poor chip image was excluded from further analysis) were averaged for each gene to obtain a mean expression value. Next, to create pseudo ratios, the value of each sample was divided by the mean of the three controls for every gene, and the ratios were log-transformed (base 2).
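The thresholding, intra-chip normalization and pseudo ratio steps described above can be summarized in a minimal sketch. It assumes that the noise floor is already available as a number for each chip (the way MetriSoft derives it is not described here) and that control averaging is done on the normalized values. The function names, gene labels and all numeric values are hypothetical.

```python
import math


def normalize_chip(signals, noise_floor, factor=3.0):
    """Raise sub-threshold signals to the threshold, then divide by the threshold.

    The threshold is factor * noise_floor (3 times the noise floor in the text).
    """
    threshold = factor * noise_floor
    return {gene: max(value, threshold) / threshold for gene, value in signals.items()}


def log2_pseudo_ratios(sample, controls):
    """log2 of each normalized sample value over the mean of the control chips."""
    ratios = {}
    for gene, value in sample.items():
        control_mean = sum(chip[gene] for chip in controls) / len(controls)
        ratios[gene] = math.log2(value / control_mean)
    return ratios


# Hypothetical raw signals for one tumour chip with a noise floor of 10.
tumour = normalize_chip({"geneA": 240.0, "geneB": 12.0}, noise_floor=10.0)

# Three hypothetical control chips, already normalized.
controls = [{"geneA": 2.0, "geneB": 1.0} for _ in range(3)]

ratios = log2_pseudo_ratios(tumour, controls)
```

Because sub-threshold signals are set to the threshold, every normalized value is at least 1, so the ratio is always defined and the logarithm never receives zero.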
Principal Component Analysis (PCA), hierarchical clustering and ANOVA were performed using Avadis Prophetic software (Strand Genomics, Bangalore, India). Data were mean-centered, clustered using Euclidean distance measures and visualized using a heat map in which numeric values are represented by colour intensities (high levels in red, low levels in green). For ANOVA, samples were assigned class designations and gene expression data were analyzed assuming equal variance. Data were ranked based on p-values and F-statistics. In addition, a set of genes that best discriminated the two identified main subtypes of breast cancer was determined using the "Nearest Shrunken Centroid classifier" and the PAM software [13]. For this analysis, pseudo ratios were generated using an average of all tumour samples as the denominator, to prevent the normal tissue samples from driving the analysis. PAM analysis was also performed with the pseudo ratios used for PCA, ANOVA and clustering analysis (see above), with similar results (data not shown).
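The quantities underlying the clustering and ANOVA steps can be written out in a few lines. This sketch is not the Avadis implementation: it assumes mean-centering is applied per gene across samples, it computes only the one-way ANOVA F-statistic under the equal variance assumption (p-values would additionally require the F distribution), and all names and values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def mean_center(values):
    """Subtract the mean so that the values are centred on zero."""
    m = mean(values)
    return [v - m for v in values]


def euclidean(a, b):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anova_f(groups):
    """One-way ANOVA F-statistic for one gene, assuming equal variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical log2 pseudo ratios of one gene in two sample classes.
class_one = [1.0, 2.0, 3.0]
class_two = [4.0, 5.0, 6.0]

f_stat = anova_f([class_one, class_two])
centered = mean_center(class_one)
distance = euclidean([0.0, 0.0], [3.0, 4.0])
```

Genes with a large F-statistic differ more between the assigned classes than within them, which is the basis for the ranking described above.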