Clinical cases and data
Sixteen patients who had developed an infiltrating breast carcinoma and a subsequent ovarian tumor, and for whom frozen tissues were available for both tumors, were included in the study. Breast tumors were classified according to the American Joint Committee on Cancer (AJCC)/Union Internationale Contre le Cancer (UICC) staging system and usual pathological parameters [14]. Ovarian tumors were classified according to the Fédération Internationale de Gynécologie et d'Obstétrique (FIGO) staging system.
Information about surgical and systemic treatments and the occurrence of other metastatic sites was collected. The clinical data were reviewed by two medical oncologists (PHC, LM). Pathological data were blindly reviewed by two pathologists (NW, XSG) and, in case of discrepancy, a consensus was reached. To ensure independent data extraction, the reviewers conducted all procedures separately, and samples were analyzed without knowledge of their supposed status (i.e. primary tumor or metastasis).
This study was approved by the Institutional Review Board and Ethics committee. Patients were informed that their biological samples could be used for research purposes and that they had the right to refuse if they so wished.
Immunohistochemical analysis
Immunohistochemical analysis was performed on formalin-fixed, paraffin-embedded tissue sections (thickness: 3 μm). The samples were deparaffinized and pretreated in EDTA buffer at pH 9 (40 minutes at 97°C), and then hydrated in PBS solution for 5 minutes. The rabbit polyclonal anti-PAX8 antibody (Protein Tech Group Inc., Chicago, IL, USA) was then applied (dilution: 1/200), and samples were incubated overnight at 4°C. The endogenous peroxidase activity was blocked with hydrogen peroxide. A secondary antibody directed against the primary anti-PAX8 antibody and coupled with a peroxidase polymer, Envision+ (Dako, Trappes, France), was applied for 30 minutes. The peroxidase was then revealed during a 10-minute incubation with a diaminobenzidine solution (DAB Dako K3468). Finally, samples were counterstained with haematoxylin (2 minutes) and mounted with permanent mounting medium.
DNA and RNA extraction and preparation for microarray experiment
Tumor DNA and RNA were provided by the Biological Resource Center of the Institut Curie. Prior to DNA and RNA isolation, a tissue section was cut from each tumor fragment and stained with hematoxylin and eosin to evaluate tumor cellularity. All analyzed tumors contained more than 50% tumor cells on the tissue section. DNA was extracted from frozen tumor samples using a standard phenol/chloroform procedure. Total RNA was isolated using TRIzol reagent (Invitrogen, Cergy-Pontoise, France) in accordance with the manufacturer's instructions. The concentration of RNA was measured by absorbance at 260 nm. The quality of each RNA sample was determined with an Agilent 2100 bioanalyzer. RNAs were processed on chips only if the following criteria were fulfilled: RNA integrity number (RIN, a measure of RNA quality) ≥ 7.6, (28S/18S) ≥ 1.8, (260 nm/230 nm) ≥ 1.8, and (260 nm/280 nm) ≥ 1.8. Targets were prepared according to the Affymetrix (Affymetrix Inc., Santa Clara, CA, USA) One Cycle Synthesis protocol, starting from 2 μg of total RNA. Targets were hybridized to GeneChip® Human Genome U133 plus 2.0 Arrays only if the required yield and size were reached; twenty micrograms of complementary RNA with a specific size distribution were used for hybridization.
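The RNA acceptance criteria above amount to a simple conjunction of thresholds. A minimal sketch of such a filter is given below; the class name, field names and example values are hypothetical and serve only to illustrate the rule, not to reproduce the laboratory's actual workflow.

```python
from dataclasses import dataclass

# Thresholds quoted in the text above.
MIN_RIN = 7.6
MIN_RATIO = 1.8  # applies to 28S/18S, 260 nm/230 nm and 260 nm/280 nm


@dataclass
class RnaSample:
    """Quality metrics for one RNA sample (field names are hypothetical)."""
    name: str
    rin: float
    ratio_28s_18s: float
    a260_a230: float
    a260_a280: float


def passes_rna_qc(sample: RnaSample) -> bool:
    """Return True only if every criterion listed in the text is met."""
    return (
        sample.rin >= MIN_RIN
        and sample.ratio_28s_18s >= MIN_RATIO
        and sample.a260_a230 >= MIN_RATIO
        and sample.a260_a280 >= MIN_RATIO
    )


# Made-up values, for illustration only.
samples = [
    RnaSample("sample_A", rin=8.2, ratio_28s_18s=1.9, a260_a230=2.0, a260_a280=2.0),
    RnaSample("sample_B", rin=7.5, ratio_28s_18s=2.0, a260_a230=2.1, a260_a280=1.9),
]
accepted = [s.name for s in samples if passes_rna_qc(s)]
print(accepted)  # sample_B is rejected because its RIN is below 7.6
```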
DNA quality was assessed on agarose gel; if a smear was observed instead of a band, the sample was discarded. Targets were generated from 250 ng of genomic DNA according to the GeneChip® Mapping 50 K Xba protocol or the Genome-Wide Human SNP Array 6.0 protocol. Targets were prepared if 45 μg of amplified DNA were available and if their size was between 250 and 2,000 bp, and were then hybridized according to the manufacturer's recommendations.
50 K SNP Array and SNP 6.0 Array data analysis
Intensity signal data from the GeneChip® Human Mapping 50 K Xba Array or the Genome-Wide Human SNP Array 6.0 were normalized and analyzed using the ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Single nucleotide polymorphism [SNP] arrays) algorithm [15] or Partek Genomic Suite (Partek Inc., St Louis, MO, USA), respectively. Genomic events (gains, losses, amplifications and breakpoints) were detected using the GLAD (Gain and Loss Analysis of DNA) software [16] for the GeneChip® Human Mapping 50 K Xba Array, and the Genomic Segmentation algorithm of Partek Genomic Suite for the Genome-Wide Human SNP Array 6.0. SNPs with a smoothed value below 2 − 0.28 or above 2 + 0.28 were considered as losses and gains, respectively. The profiles were visualized with the VAMP software [17] or Partek Genomic Suite.
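The gain/loss rule can be illustrated with a short sketch. It assumes that the "2 ± 0.28" criterion means a loss below 1.72 and a gain above 2.28, with values in between treated as unchanged; the function name and the "normal" label are hypothetical, and amplifications and breakpoints (handled by GLAD or Partek) are not covered.

```python
NORMAL_COPY_NUMBER = 2.0  # expected diploid value
TOLERANCE = 0.28          # half-width of the "no change" band, from the text


def call_copy_number_status(smoothed_value: float) -> str:
    """Classify one smoothed SNP value as 'loss', 'gain' or 'normal'."""
    if smoothed_value < NORMAL_COPY_NUMBER - TOLERANCE:
        return "loss"
    if smoothed_value > NORMAL_COPY_NUMBER + TOLERANCE:
        return "gain"
    return "normal"


# Made-up smoothed values, for illustration only.
for value in (1.5, 2.0, 2.6):
    print(value, call_copy_number_status(value))
```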
Gene expression data analysis
A series of 89 ovarian primary tumors and 36 ovarian metastases from breast cancer with a clear pathological diagnosis was used to establish a reference hierarchical tree. All these samples were provided by the Biological Resource Center of the Institut Curie, and the chips were processed and hybridized in our laboratory (Department of Translational Research). The dataset is publicly available on GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE20565. RNAs were prepared according to the manufacturer's instructions and were hybridized onto Affymetrix GeneChip® Human Genome U133 plus 2.0 Arrays. Transcriptomic data were normalized with the gc-Robust Multi-array Average (gcRMA) algorithm [18], using Partek Genomic Suite (Partek Inc., St Louis, MO, USA). Unsupervised hierarchical clustering of tumor samples was done using Partek Genomic Suite software with standard Pearson's correlation as the similarity measure and Ward's method as the linkage criterion. The threshold on the interquartile range (IQR, a measure of the dispersion of each probe set intensity value across all samples) was set so as to retain 2,000 probe sets. The clustering was first performed on this set of reference samples (89 primary tumors and 36 ovarian metastases); the 16 ovarian samples with ambiguous diagnosis were then introduced into the dataset and the clustering was performed again.
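The probe set selection and the similarity measure can be sketched as follows. This is only an illustration under stated assumptions: the function names and toy data are hypothetical, the IQR is computed with Python's inclusive quantile method (Partek's exact definition may differ), the dissimilarity is taken as 1 − r (a common convention, not confirmed by the text), and the Ward linkage step itself is not reproduced.

```python
import math
import statistics


def interquartile_range(values):
    """IQR of one probe set across samples (inclusive quantile method)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def select_most_variable(expression, n_keep):
    """Return the n_keep probe set IDs with the largest IQR.

    expression maps a probe set ID to its intensities across all samples.
    """
    ranked = sorted(
        expression,
        key=lambda probe: interquartile_range(expression[probe]),
        reverse=True,
    )
    return ranked[:n_keep]


def pearson_dissimilarity(x, y):
    """1 - Pearson correlation between two sample profiles."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / math.sqrt(var_x * var_y)


# Toy data: 3 probe sets measured on 4 samples (made-up values).
expression = {
    "probe_1": [5.0, 5.1, 5.0, 5.1],
    "probe_2": [2.0, 8.0, 3.0, 9.0],
    "probe_3": [4.0, 6.0, 4.5, 6.5],
}
print(select_most_variable(expression, n_keep=2))
```

In the study the same selection would retain 2,000 probe sets, and the dissimilarity would be computed between every pair of tumor samples over those probe sets before linkage.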
Clustering
Validation of the reference hierarchical tree was performed in the R environment with the clusterStab package [19]. This package assesses the number of reliable clusters and the stability of the hierarchical clustering with a re-sampling approach in which randomly selected subsets of samples (70% in each round) are repeatedly clustered. The similarity between the resulting clusters was measured by the Jaccard coefficient, which ranges from zero (no similarity) to one (identical clustering). We applied this strategy for a number of clusters ranging from 2 to 8 and compared the resulting distributions of Jaccard coefficients. Enrichment of values equal to or close to one indicated adequate choices of metric, agglomeration method and number of clusters. The algorithm was run with the commonly used metrics (Euclidean distance and Pearson correlation) and the commonly used agglomeration methods (average linkage and Ward's method). The script of the function was adapted to use the Pearson correlation coefficient as the metric (not implemented in the Bioconductor package). We used the Hmisc R package to calculate this correlation between samples.
The clusterStab analysis showed that the clustering was most reliable when using the Pearson correlation coefficient as the metric, Ward's method as the agglomeration method, and k = 2 clusters (Figure 1).
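To make the stability measure concrete, the sketch below computes a Jaccard coefficient between two clusterings obtained on different subsets of samples, by comparing which pairs of shared samples fall in the same cluster. This is one common way to score agreement between partitions and is offered as an assumption; it does not reproduce the clusterStab code, and the function names, sample IDs and labels are hypothetical.

```python
import random
from itertools import combinations


def draw_subset(sample_ids, fraction=0.7, seed=None):
    """Randomly select a fraction of the samples (70% per round in the text)."""
    rng = random.Random(seed)
    ids = list(sample_ids)
    return rng.sample(ids, round(fraction * len(ids)))


def co_clustered_pairs(labels):
    """Set of unordered sample pairs that share a cluster label."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def partition_jaccard(labels_1, labels_2):
    """Jaccard coefficient between two clusterings, restricted to shared samples.

    Each argument maps a sample ID to its cluster label in one resampling round.
    Returns 1.0 for identical groupings and 0.0 when no co-clustered pair agrees.
    """
    shared = set(labels_1) & set(labels_2)
    pairs_1 = co_clustered_pairs({s: labels_1[s] for s in shared})
    pairs_2 = co_clustered_pairs({s: labels_2[s] for s in shared})
    union = pairs_1 | pairs_2
    if not union:
        return 1.0  # no co-clustered pair in either run: trivially identical
    return len(pairs_1 & pairs_2) / len(union)


# Made-up cluster labels from three resampling rounds.
run_1 = {"s1": 0, "s2": 0, "s3": 1, "s4": 1, "s5": 1}
run_2 = {"s2": "a", "s3": "b", "s4": "b", "s5": "b", "s6": "a"}
run_3 = {"s2": 0, "s3": 0, "s4": 1, "s5": 1}
print(partition_jaccard(run_1, run_2))  # same grouping of the shared samples
print(partition_jaccard(run_1, run_3))  # only one co-clustered pair in common
```

Repeating this over many pairs of subsampled clusterings, for each candidate number of clusters, gives the distribution of Jaccard values whose concentration near one signals a stable partition.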