Patients
A total of 208 serum samples, from 158 patients with pathologically confirmed lung cancer and 50 healthy subjects, were collected at the Department of Respiratory and Thoracic Surgery of the Second Hospital of Xi'an Jiaotong University. Informed consent was obtained from every subject prior to the study. None of the patients with lung cancer had evidence of other disease. The distribution of clinical stages (UICC, 1997) was as follows: 13 cases were at stage I, 41 at stage II, 58 at stage III and 46 at stage IV. Among these patients, 68 had squamous cell carcinoma, 53 adenocarcinoma, 35 small cell cancer and 2 bronchioloalveolar carcinoma. The patients (101 males and 57 females, aged 28 to 79 years) had an average age of 56.8 years. The healthy controls (31 males and 19 females, aged 30 to 72 years) were recruited from general physical examinations and had an average age of 54.5 years. The two groups were matched for age, sex and smoking history. Two milliliters of whole blood were collected from each fasting subject and stored at 4°C within one hour. The blood was later centrifuged for 20 min at 4000 rpm, and the serum was distributed into 100 μL aliquots and stored at -80°C until use.
SELDI protein profiling
Five μL of 10 mM HCl was applied to a weak cation exchange (WCX2) chip, which was left at room temperature for 10 min. Chips were rinsed with deionized water in a conical tube, placed in a bioprocessor and washed twice with binding buffer (100 mM NaAc, pH 4) for 5 min with gentle shaking. Five μL of each serum sample and 10 μL of 9 mol/L urea were combined and vortexed on ice. Five μL of this mixture was added to 60 μL of binding buffer. Fifty μL of the diluted serum mixture was applied to each spot and incubated on a shaker for 60 min. Chips were then washed 3 times with binding buffer with gentle shaking. Next, 200 μL of 1 mM HEPES (pH 7.0) was added to each well; the wells were rinsed quickly, and the chips were removed and allowed to dry. Once dry, 0.5 μL of sinapinic acid (SPA) was applied to each spot twice. The arrays were allowed to air-dry and then stored in the dark at room temperature until SELDI analysis.
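The dilution implied by the volumes above can be worked through as a short calculation. This is only an arithmetic sketch of the stated volumes; the function name `dilution_factor` is illustrative and not part of the protocol.

```python
from fractions import Fraction


def dilution_factor(sample_ul, diluent_ul):
    """Fold-dilution after mixing a sample volume with a diluent volume."""
    return Fraction(sample_ul + diluent_ul, sample_ul)


# Step 1: 5 uL serum + 10 uL of 9 mol/L urea.
urea_step = dilution_factor(5, 10)
# Step 2: 5 uL of that mixture + 60 uL binding buffer.
buffer_step = dilution_factor(5, 60)

# Overall fold-dilution of serum in the mixture applied to the chip.
total_dilution = urea_step * buffer_step

# Volume of neat serum contained in the 50 uL applied to each spot.
serum_per_spot_ul = Fraction(50) / total_dilution

print(f"total dilution: 1:{total_dilution}")
print(f"serum per spot: {float(serum_per_spot_ul):.2f} uL")
```

Under these volumes each spot receives roughly 1.3 μL of undiluted serum equivalent.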
Data analysis
Before analysis, the data were randomly divided into two sets. The training set consisted of 11 patients with stage I/II lung cancer, 63 patients with stage III/IV lung cancer and 20 healthy controls. The blinded test set (in which disease status was not revealed) consisted of 43 patients with stage I/II lung cancer, 41 patients with stage III/IV lung cancer and 30 healthy controls. The chips were placed in the Protein Biology System II-C mass spectrometer reader (Ciphergen Biosystems, Inc.), and TOF spectra were generated by averaging 128 laser shots at a laser intensity of 215 and a detector sensitivity of 9. The optimization range was 3,000 to 50,000 Da, with a maximum mass of 200,000 Da. External calibration of the instrument was performed using the All-in-one peptide molecular mass standard (Ciphergen Biosystems, Inc.). A mass accuracy of 0.1% was achieved with this system.
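The random division into training and blinded test sets described above can be sketched as a per-group shuffle. The group sizes come from the text; the sample identifiers, the seed and the function name `split_samples` are hypothetical, and the source does not state how the randomization was actually carried out.

```python
import random

# (training, test) counts per group, as given in the text.
SPLIT = {
    "stage_I_II": (11, 43),
    "stage_III_IV": (63, 41),
    "healthy": (20, 30),
}


def split_samples(split, seed=0):
    """Shuffle each group independently and cut it at the training size."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    training, test = [], []
    for group, (n_train, n_test) in split.items():
        # Hypothetical sample identifiers; real IDs would come from the study.
        ids = [f"{group}_{i:03d}" for i in range(n_train + n_test)]
        rng.shuffle(ids)
        training.extend(ids[:n_train])
        test.extend(ids[n_train:])
    return training, test


training, test = split_samples(SPLIT)
print(len(training), len(test))
```

The group sizes add up to 94 training and 114 test samples, which together account for all 208 sera.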
Peak detection
Peak detection using Ciphergen Biomarker Wizard software 3.0.2 identified an average of 72 peaks per spectrum. Of these 72 peaks, 64 common peaks (clusters) were generated from the training set. Eighteen of these protein peaks showed statistically significant differential expression between lung cancer and normal control sera (P < 10⁻⁴). Peak detection involved baseline subtraction, mass accuracy calibration and automatic peak detection. The settings were as follows: for peak detection, the signal-to-noise ratio was 3 and the minimum peak threshold was 10%; for cluster completion, the cluster mass window was 0.5% and the signal-to-noise ratio for the second pass was 1.
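To illustrate how the two signal-to-noise passes and the 0.5% mass window interact, the following is a minimal sketch. It is not the Biomarker Wizard algorithm, whose internals are not described in the source; the function `build_clusters`, the greedy seeding rule and the example peaks are all assumptions, and the 10% minimum peak threshold is omitted.

```python
FIRST_PASS_SNR = 3.0     # peaks at or above this S/N seed a cluster
SECOND_PASS_SNR = 1.0    # weaker peaks may still join an existing cluster
CLUSTER_WINDOW = 0.005   # 0.5% relative mass window


def build_clusters(peaks):
    """Group (mz, snr) peaks into clusters keyed by their seed m/z."""
    # First pass: strong peaks define cluster centres (greedy, in m/z order).
    seeds = sorted(mz for mz, snr in peaks if snr >= FIRST_PASS_SNR)
    centers = []
    for mz in seeds:
        if not centers or mz - centers[-1] > CLUSTER_WINDOW * centers[-1]:
            centers.append(mz)

    # Second pass: any peak with S/N >= 1 inside a window joins that cluster.
    clusters = {c: [] for c in centers}
    for mz, snr in peaks:
        if snr < SECOND_PASS_SNR:
            continue
        for c in centers:
            if abs(mz - c) <= CLUSTER_WINDOW * c:
                clusters[c].append((mz, snr))
                break
    return clusters


# Hypothetical peaks (m/z in Da, S/N), for illustration only.
example_peaks = [
    (5000.0, 5.2), (5012.0, 1.4), (5100.0, 0.8),
    (8000.0, 3.1), (8030.0, 2.0), (9000.0, 2.5),
]
clusters = build_clusters(example_peaks)
for center, members in clusters.items():
    print(center, members)
```

In this example the peak at 9000 Da is dropped because no first-pass peak seeds a cluster near it, while the weak peak at 5012 Da is recovered in the second pass.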
Decision tree classification
The decision tree classification algorithm was constructed with Ciphergen Biomarker Pattern software version 5.0. With Gini selected as the splitting criterion, each classification tree split the data into two nodes using one rule at a time, expressed as a threshold on peak intensity. The splitting decisions were based on the normalized intensity levels of peaks from the SELDI protein expression profile. Splitting continued until terminal nodes were created. After V-fold cross validation 50, the accuracy of each classification tree was challenged with the blinded test set.
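A single Gini-based split on one peak intensity, as described above, can be sketched as follows. This is a generic illustration of the criterion and not the Biomarker Pattern software's implementation; the function names and the example intensities are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = cancer, 0 = control)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)  # equals 1 - p**2 - (1 - p)**2


def best_split(intensities, labels):
    """Return (threshold, weighted impurity) of the best single-peak rule."""
    pairs = sorted(zip(intensities, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical intensities
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical normalized intensities of one peak, for illustration only.
example_intensity = [1.0, 1.5, 2.0, 6.0, 7.5, 8.0]
example_label = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(example_intensity, example_label)
print(threshold, impurity)
```

A full tree would repeat this search over every peak at every node and recurse on the two child nodes until a stopping rule is met.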
Detection of serum Cyfra21-1 and NSE
The two markers, Cyfra21-1 and NSE, were measured in the 208 sera included in this study using an electrochemiluminescent immunoassay (ECLIA, Elecsys 2010 system, Roche Diagnostics, Switzerland). The cutoff values for Cyfra21-1 and NSE recommended by the manufacturer were 3.3 ng ml⁻¹ and 16.3 ng ml⁻¹, respectively.
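Applying the cutoffs above to measured concentrations amounts to a simple threshold rule, sketched below. Whether a value exactly at the cutoff counts as positive is not stated in the source; treating only values strictly above it as positive is an assumption, and the example concentrations are hypothetical.

```python
# Manufacturer-recommended cutoffs quoted in the text (ng/mL).
CUTOFF_NG_PER_ML = {"Cyfra21-1": 3.3, "NSE": 16.3}


def is_positive(marker, concentration_ng_per_ml):
    """Assumption: only values strictly above the cutoff are positive."""
    return concentration_ng_per_ml > CUTOFF_NG_PER_ML[marker]


def positive_rate(marker, concentrations):
    """Fraction of samples classified as positive for one marker."""
    flags = [is_positive(marker, c) for c in concentrations]
    return sum(flags) / len(flags)


# Hypothetical NSE values for illustration only, not study data.
example_nse = [9.8, 16.3, 21.0, 35.4]
rate = positive_rate("NSE", example_nse)
print(rate)
```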
Statistical analysis
Comparison of relative peak intensity levels between groups was made using the Student's t test and in all cases P < 10⁻⁴ was considered statistically significant. Comparison of rates between groups was conducted using the χ² test and P < 0.05 was regarded as a significant difference.
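The two tests named above can be sketched with the standard library alone. The sketch assumes a pooled-variance (equal variance) Student's t statistic and a Pearson χ² test on a 2×2 table without continuity correction; the source does not specify either detail, and all input numbers are hypothetical.

```python
import math
import statistics


def student_t(x, y):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / nx + 1 / ny)
    )
    return t, nx + ny - 2


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical inputs for illustration only, not study data.
t_stat, dof = student_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
chi2, p_value = chi2_2x2(40, 10, 20, 30)
print(t_stat, dof)
print(chi2, p_value)
```

Converting the t statistic to a P value requires the t distribution, which the standard library does not provide; a statistics package would normally be used for that step.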