Information on patients having a resection for colorectal cancer performed by members of the Concord Hospital Department of Colorectal Surgery has been entered into a prospective computer database since 1971 [31, 32]. The data set contains information on patient characteristics, co-morbidity, presentation, investigations, surgical management, complications, adjuvant therapy, pathology and follow-up, and has the approval of the South Western Sydney Health Area Ethics Committee. All patients gave written consent personally or through their guardian for pathology specimens and anonymous clinical data to be used for research purposes. From 1981, all resections were performed by specialist colorectal surgeons according to a standardized procedure, and acquisition of clinical data has been conducted by a single surgeon (P.C.). Patients reported here had a resection for clinicopathological stage C colon cancer between 1979 and 2004 inclusive.
The chemotherapy regimens used were, for the most part, 6-month courses of bolus injections of 5-FU and folinic acid administered daily for 5 days every 28 days for a total of 6 cycles (Mayo Clinic regimen) or 5-FU and leucovorin repeated weekly for 6 doses with a 2-week rest between (Roswell Park regimen). These regimens were used because they were supported by results from randomized controlled trials in patients with stage C colon cancer.
Selection of the historical control group
Patients who received adjuvant chemotherapy were matched individually by sex and age with controls selected from among patients who had a resection for stage C colon cancer between 1979 and April 1992, before radiotherapy and chemotherapy were introduced at this hospital. Matching was by sex because numerous studies have shown sex differences in many epidemiological, clinical and pathological characteristics of colorectal cancer, and by age because of the demographic association between advancing age and diminishing survival in the population at large.
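The individual matching described above can be illustrated with a minimal sketch. The text states only that matching was by sex and age; the greedy nearest-age rule, the `max_age_gap` tolerance and the `Patient` record below are assumptions made for illustration and are not the procedure actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    sex: str   # "F" or "M"
    age: int   # age at resection, in years


def match_controls(treated, controls, max_age_gap=5):
    """Pair each treated patient with one unused same-sex control of closest age.

    max_age_gap is a hypothetical tolerance; the source does not state one.
    Returns a dict mapping treated patient_id to control patient_id (or None).
    """
    available = list(controls)
    pairs = {}
    for case in sorted(treated, key=lambda p: p.patient_id):
        candidates = [
            c for c in available
            if c.sex == case.sex and abs(c.age - case.age) <= max_age_gap
        ]
        if not candidates:
            pairs[case.patient_id] = None  # no eligible control remains
            continue
        # Closest age wins; ties are broken by identifier for reproducibility.
        best = min(candidates, key=lambda c: (abs(c.age - case.age), c.patient_id))
        available.remove(best)  # each control is used at most once
        pairs[case.patient_id] = best.patient_id
    return pairs
```

A greedy pass is simple but depends on processing order; an optimal assignment method could give different pairs when controls are scarce.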
Pathological examination of the resected specimen followed a standard protocol as described previously. Only adenocarcinomas (including mucinous and signet ring cell carcinomas) were included in the data set. Where multiple tumours were present, only the lesion with the most advanced stage was included. Tumour size was measured as the greatest surface dimension. Blocks were taken to demonstrate maximum direct tumour penetration of the bowel wall. Additional blocks were taken specifically to demonstrate the relationship between tumour and any adherent structure or tissue, as well as lines of resection and the free serosal surface. Venous invasion by tumour referred to involvement of thick or thin walled veins, either within or beyond the bowel wall. When doubt existed as to whether a structure involved was a vein, a negative finding was recorded. Tumour grade was assessed taking into account the degree of differentiation and anaplasia, the nature of the tumour margin (pushing or infiltrating) and the presence and prominence of vascular invasion. An apical lymph node was defined as the most proximal of any nodes found within 1 cm of the ligation of a named vessel as the apex of a pedicle. All pathological characteristics analyzed were looked for in every specimen and their presence or absence recorded explicitly. There were no missing data on any original database variable. Tumours were staged according to the Australian Clinicopathological Staging System for colorectal cancer, which accommodates sub-stages compatible with other clinicopathological staging systems such as Tumour Nodes Metastases. A stage C tumour was defined as one with lymph node metastasis but no systemic metastasis and no tumour present in the proximal, distal or deep lines of resection histologically.
Tissue microarray construction
Tissue microarrays (TMAs) for the assessment of GST Pi were constructed using an Advanced Tissue Arrayer ATA-100 (Chemicon, Temecula, CA). Cores (1.0 mm) were taken from carefully selected, morphologically representative areas of the original paraffin blocks and arrayed into freshly made recipient paraffin blocks. Because colorectal cancers are known to be heterogeneous, we took cores from (a) the central part of the tumour, avoiding the luminal surface, the tumour edge and areas of necrosis, (b) the deep invasive tumour front at the interface between the tumour and non-neoplastic tissue, and (c) adjacent normal mucosa.
GST Pi (1:20, Abcam, ab17088, Cambridge, U.K.) immunohistochemistry was carried out using a DAKO Autostainer (DAKO, Glostrup, Denmark). Following dewaxing and rehydration, antigen retrieval was performed in a water bath (95°C) for 30 minutes using sodium citrate (pH 6.0) Target Retrieval Solution S1699 (DAKO, Glostrup, Denmark). Endogenous peroxidases were blocked with 3% hydrogen peroxide for 5 minutes. Non-specific binding sites were blocked with Protein Block (DAKO, Glostrup, Denmark) for 10 minutes. The sections were incubated with diluted GST Pi antibody for 1 hour at room temperature, followed by secondary reagent EnVision+ Dual Link System-HRP (DAB+) K4065 (DAKO, Glostrup, Denmark) for 30 minutes. Staining was completed by a 10-minute incubation with 3,3′-diaminobenzidine (DAB+) substrate-chromogen. After a buffer wash, the slides were counterstained with haematoxylin, dehydrated and mounted.
Immunoreactivity for GST Pi was assessed independently by three experienced pathologists (K.T., C.F., C.C.) who were unaware of the patients’ clinical characteristics, other histopathological data and survival. Tissue cores from the central part of the tumour and the invasive front were assessed separately in each sample, as was the presence of nuclear and cytoplasmic staining in the tumour epithelial cells. The intensity of staining was graded as 0 (no staining), 1 (weak staining), 2 (intermediate staining) or 3 (strong staining). The percentage of stained cells (hereinafter termed “percentage stained”) was recorded as a quasi-continuous variable coded 0%, 1%, 10%…90%, 100%. When there were discrepancies between the observers, the slides were reviewed and a consensus reached. To find the optimum dichotomy for percentage stained in relation to survival, the distribution of percentage stained was first dichotomized at 0% versus 1–100%, and survival curves with the associated p value were obtained. The cutting point was then raised in steps of 10% (0–9% vs. 10–100%, 0–20% vs. 30–100% … 0–90% vs. 100%), and the separation of curves and p value were recorded at each step. This process yielded the optimum cutting point, defined as the one giving the greatest separation of survival curves.
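The cutting-point search can be sketched as a loop that dichotomizes percentage stained at each candidate value and records a log-rank statistic and p value. This is an illustrative reimplementation in plain Python, not the SPSS procedure used in the study; the function names are hypothetical, and the log-rank chi-squared statistic is used here as a stand-in for "separation of curves".

```python
import math


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-squared statistic (1 degree of freedom).

    times  : follow-up times
    events : True/1 if death observed, False/0 if censored
    groups : True for the 'high' group, False for the 'low' group
    """
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        n = sum(1 for ti in times if ti >= t)                        # at risk, both groups
        n1 = sum(1 for ti, gi in zip(times, groups) if ti >= t and gi)  # at risk, high group
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        d1 = sum(1 for ti, ei, gi in zip(times, events, groups)
                 if ti == t and ei and gi)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of d1 at this event time
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def scan_cut_points(percent_stained, times, events,
                    cut_points=(1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)):
    """Return (cut, chi2, p) for each cut; 'high' means percent_stained >= cut."""
    results = []
    for cut in cut_points:
        high = [p >= cut for p in percent_stained]
        if all(high) or not any(high):
            continue  # one group would be empty, so no comparison is possible
        chi2 = logrank_chi2(times, events, high)
        p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-squared, 1 df
        results.append((cut, chi2, p_value))
    return results
```

Choosing the cut with the largest statistic, for example `max(results, key=lambda r: r[1])`, mirrors the search described in the text. The p value at a cut selected this way is not adjusted for the multiple cuts examined.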
Archival paraffin block sections of all the lymph nodes resected from each patient were first reviewed by a pathologist (C.C.), who selected one normal cancer-free lymph node from each patient for subsequent analysis. A core biopsy was taken from the tissue block and DNA was extracted with the Puregene DNA Isolation Kit (Gentra, Minneapolis, MN) as previously described [44, 45]. A custom TaqMan SNP Genotyping Assay (Applied Biosystems, Foster City, CA) was used for genotyping. The primer and probe sequences for GSTP1 were as follows:
Forward primer 5′-CCTGGTGGACATGGTGAATG-3′;
Reverse primer 5′-TGGTGCAGATGCTCACATAGTTG-3′;
Probe 1 (VIC-labelled) 5′-TGCAAATACATCTCC-3′;
Probe 2 (FAM-labelled) 5′-CTGCAAATACGTCTCC-3′.
The DNA samples were diluted to ~5 ng/μl and tested in triplicate. Each 10 μl reaction mix contained 5 μl of TaqMan Universal PCR Master Mix (Applied Biosystems), 2 μl of 5× SNP Genotyping Assay, and 3 μl (~15 ng) of DNA. The PCR reactions and SNP analysis were carried out on the ABI 7900 (Applied Biosystems), with PCR conditions as follows: 50°C (2 min); 95°C (10 min); 40 cycles of 95°C (15 s) followed by 60°C (1 min).
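As a quick arithmetic check on the reaction set-up, the sketch below adds up the per-well volumes and scales the shared reagents for a batch run in triplicate. The DNA concentration is taken as about 5 ng/μl, the value implied by 3 μl containing about 15 ng. The `overage` allowance for pipetting loss and the function name are hypothetical and not part of the published protocol.

```python
# Per-well volumes stated in the text (microlitres).
MASTER_MIX_UL = 5.0
ASSAY_UL = 2.0
DNA_UL = 3.0
DNA_CONC_NG_PER_UL = 5.0  # approximate; implied by 3 ul containing ~15 ng

PER_WELL_TOTAL_UL = MASTER_MIX_UL + ASSAY_UL + DNA_UL  # 10 ul reaction
DNA_PER_WELL_NG = DNA_UL * DNA_CONC_NG_PER_UL          # ~15 ng per well


def reaction_volumes(n_samples, replicates=3, overage=0.10):
    """Volumes of shared reagents for a batch of samples.

    overage is a hypothetical allowance for pipetting loss
    (0.10 means 10% extra); the source does not mention one.
    """
    wells = n_samples * replicates
    scale = wells * (1 + overage)
    return {
        "wells": wells,
        "master_mix_ul": MASTER_MIX_UL * scale,
        "assay_ul": ASSAY_UL * scale,
    }
```

For example, `reaction_volumes(8, overage=0)` covers 24 wells and needs 120 μl of master mix and 48 μl of assay.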
Follow-up and survival
Apart from patients lost to follow-up, all patients were followed annually until death or for up to 14 years or to December 31, 2009. Overall survival time was measured from resection until the date of death due to any cause, the censoring date being the date of last follow-up for those surviving or the date of last contact for those lost to follow-up.
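The definition of overall survival time and censoring can be expressed as a small function. This is a sketch under stated assumptions: follow-up is taken to stop at whichever comes first of 14 years after resection or December 31, 2009, deaths after that point are treated as censored, and 14 years is approximated as 5113 days. These are one reading of the text, and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta

STUDY_END = date(2009, 12, 31)
MAX_FOLLOW_UP = timedelta(days=5113)  # roughly 14 years (assumed approximation)
DAYS_PER_YEAR = 365.25


def overall_survival(resection, death=None, last_contact=None):
    """Return (time_in_years, event_observed) for one patient.

    death        : date of death from any cause, if known
    last_contact : date of last follow-up or last contact, if the patient
                   was not followed to the administrative cutoff
    """
    cutoff = min(resection + MAX_FOLLOW_UP, STUDY_END)
    if death is not None and death <= cutoff:
        end, event = death, True
    else:
        # Censored: alive at last follow-up, lost to follow-up,
        # or still alive at the administrative cutoff.
        end = min(last_contact, cutoff) if last_contact is not None else cutoff
        event = False
    return (end - resection).days / DAYS_PER_YEAR, event
```

For instance, `overall_survival(date(2000, 1, 1), death=date(2003, 1, 1))` gives a time of about 3 years with the event observed.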
The chi-squared test or Fisher’s exact test was used to examine the statistical significance of differences in proportions. Comparisons of survival time between strata of binary variables were made with the Kaplan-Meier method and log-rank test. Proportional hazards regression and the Wald test were used in multivariable modeling, with product terms to identify potential interactions. The assumption of proportional hazards was assessed by examining plots of log cumulative hazard for parallelism, and in no case was it materially violated for any variable included in a regression model. The level for two-tailed statistical significance was p ≤ 0.05, with confidence intervals (CI) at the 95% level. Analyses were performed with SPSS 15.0 for Windows (SPSS Inc., Chicago, IL, USA).
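For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in a few lines. This is an illustrative plain-Python version, not the SPSS routine used for the analyses, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : True/1 if death observed, False/0 if censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = sum(1 for ti in times if ti >= t)  # still under observation at t
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve
```

With times `[1, 2, 3, 4]` and events `[1, 0, 1, 1]`, the estimate steps to 0.75 at time 1, 0.375 at time 3 and 0 at time 4; the patient censored at time 2 leaves the risk set without producing a step.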