We found several proteins that showed different intensities in pre-diagnostic serum samples of breast cancer cases, who did not yet show clinical symptoms, compared to samples of healthy controls. Two proteins detected with SELDI-TOF MS were found to be related to pre-diagnostic breast cancer: one with m/z 3323, which is likely to be a doubly charged ion of apolipoprotein C-I, and another with m/z 8938, which is likely to be C3adesArg. Of the proteins detected with 2D-nanoLC-MS/MS, afamin, apolipoprotein E and an isoform of ITIH4 were slightly but significantly higher, and alpha-2-macroglobulin and ceruloplasmin slightly but significantly lower, in pre-diagnostic breast cancer samples compared to control samples. Although correction for multiple testing revealed that only ITIH4 had a less than 10% chance of being a false positive finding, several of the other proteins have previously been found in relation to symptomatic breast cancer.
M/z 3323, which probably represents the doubly charged ion of apolipoprotein C-I, showed the largest difference between cases and controls. Apolipoprotein C-I itself, detected with both SELDI-TOF MS (m/z 6637) and 2D-nanoLC-MS/MS, showed results in the same direction, i.e. higher in cases, but the difference was not statistically significant. In a study by Engwegen et al. examining serum samples taken after diagnosis, the doubly charged ion of apolipoprotein C-I was lower in breast cancer cases, but not statistically significantly so. Apolipoprotein C-I itself (6631 Da) was statistically significantly lower in breast cancer cases in that study. It is striking that the same protein was found to be related to breast cancer in both studies, but in opposite directions. This may be due to differences in sample collection, processing and storage, but also to differences in the stage of disease of the two study populations. We included samples collected up to three years before diagnosis, while in the study by Engwegen et al. samples were collected after diagnosis. Apolipoprotein C-I may be differently expressed in pre-diagnostic stages of breast cancer compared to stages visible on a mammogram and/or leading to clinical symptoms. It is also possible that the result is a chance finding.
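As a rough illustration of why a peak near m/z 3323 can be read as the doubly charged counterpart of the apolipoprotein C-I peak at m/z 6637, the following minimal sketch assumes that both ions are protonated species of the form [M + zH]^z+. This is a common assumption for this type of ionization, but it is an assumption of the sketch and not something taken from the measurements. The function names are illustrative only.

```python
# Mass of a proton in Da (physical constant).
PROTON_MASS = 1.007276


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass M implied by an observed m/z, assuming an [M + zH]^z+ ion."""
    return charge * (mz - PROTON_MASS)


def expected_mz(mass: float, charge: int) -> float:
    """Expected m/z of an [M + zH]^z+ ion for a given neutral mass M."""
    return (mass + charge * PROTON_MASS) / charge


# Peak positions quoted in the text.
singly_charged_mz = 6637.0   # apolipoprotein C-I
observed_doubly_mz = 3323.0  # candidate doubly charged ion

mass = neutral_mass(singly_charged_mz, charge=1)
predicted_doubly_mz = expected_mz(mass, charge=2)

# Relative deviation between the observed and the predicted peak position.
relative_deviation = abs(observed_doubly_mz - predicted_doubly_mz) / predicted_doubly_mz

print(f"predicted m/z for z=2: {predicted_doubly_mz:.1f}")
print(f"relative deviation from observed peak: {relative_deviation:.2%}")
```

Under these assumptions the doubly charged ion is expected close to m/z 3319, roughly 0.1% away from the observed peak at m/z 3323. Whether a deviation of that size is acceptable depends on the calibration and mass accuracy of the instrument, which this sketch does not model.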
M/z 8938, probably representing C3adesArg, which we found to be higher in pre-diagnostic breast cancer samples, has been found to be related to breast cancer in several previous SELDI-TOF MS studies [2, 3, 6–8, 28]. In the majority of these studies the protein was higher in patients than in controls [3, 6–8], but in two studies it was lower [2, 9].
ITIH4 was higher in our pre-diagnostic breast cancer samples than in the control samples. Fragments of this protein have frequently been described in relation to symptomatic and/or mammographically detectable breast cancer [6–9, 29–31]. In these studies levels of a 4.3 kDa ITIH4 fragment were found to be either significantly higher [7, 30] or significantly lower [6, 8, 9] in breast cancer. Levels of other fragments of ITIH4, which were investigated by Villanueva et al., Song et al., and our own group, were usually found to be higher in breast cancer or were not related at all [29, 30].
To our knowledge, afamin, apolipoprotein E, alpha-2-macroglobulin and ceruloplasmin have not previously been found to differ between breast cancer serum samples and control serum samples in studies using SELDI-TOF MS or other profiling methods. In the 1980s, however, the acute phase proteins alpha-2-macroglobulin and ceruloplasmin were already studied in relation to breast cancer using immunoassay methods [32, 33]. Serum levels of alpha-2-macroglobulin did not differ between breast cancer patients and women with benign breast disease. In our study, alpha-2-macroglobulin and ceruloplasmin were both lower in pre-diagnostic breast cancer samples than in the control samples.
A possible limitation is that we did not structurally identify the two discriminative proteins detected with SELDI-TOF MS, nor validate their discriminative power in an independent validation set. However, it is very likely that these proteins are acute phase reactants, which are not cancer specific, let alone breast cancer specific. Therefore, we decided not to invest in structural identification and validation. Moreover, another similar study population was not available for validation. Nevertheless, it is very interesting that proteins of this kind are already discriminative up to three years before the diagnosis of breast cancer. Our results should therefore draw attention not to these specific proteins and their potential as breast cancer biomarkers, but rather to the fact that an inflammatory process is already measurable up to three years before diagnosis, at a time when only a few tumor cells or a very small tumor may be present.
The most important strength of our study is that we investigated proteomic profiles in serum of patients with asymptomatic breast cancer (diagnosed a median of 21.3 months (IQR: 0.7-26.6) after enrollment). Our study population is therefore more appropriate for finding early breast cancer biomarkers than those of previous studies, which mostly included symptomatic cases. The case-control design nested in a cohort of apparently healthy screening participants also ensures that all serum samples were collected, processed and stored uniformly under strictly defined conditions, at a time when none of the participants had yet been diagnosed with breast cancer. These factors have been shown to be important in protein profiling studies [10–16]. In this way, systematic errors due to differences in these factors between cases and controls were prevented in our study. Moreover, we were able to control for many (possible) confounding variables by including only post-menopausal women who had never had cancer before, were not diabetic, were not current smokers, and did not currently use oral contraceptives or menopausal hormone therapy (HT). Furthermore, we could correct the results for age, BMI, past oral contraceptive and HT use, number of children, past smoking habits, alcohol intake, and several serum sample characteristics.
A limitation of our study is that, due to the strict selection criteria and the limited availability of pre-diagnostic serum samples of breast cancer cases, we were only able to include 68 case-control pairs. Due to time and cost restrictions, for the 2D-nanoLC-MS/MS analysis we included only the 20 cases that were diagnosed with breast cancer within the first 14 months after enrollment in the study, and their matched controls. These sample sizes are limited, but the strict selection criteria also prevented bias and confounding.
By measuring the protein profiles with both SELDI-TOF MS and 2D-nanoLC-MS/MS, we benefited from the advantages of two complementary methods. SELDI-TOF MS has the advantage of simultaneously measuring parts of the serum proteome in a high-throughput fashion with relatively simple sample preparation, high analytical sensitivity and high speed of data acquisition [34, 35]. Although fewer samples can be measured simultaneously with 2D-nanoLC-MS/MS, this method has the advantage that it can identify the detected proteins immediately. Moreover, the protein detection by these two methods is complementary. Because SELDI-TOF MS mainly measures proteins in the 2 to 10 kDa mass range, many breakdown products can be detected. Additionally, by measuring exact mass-to-charge ratios with SELDI-TOF MS, it is also possible to detect post-translationally modified forms of proteins, for example proteins with additional amino acids or truncated forms. With 2D-nanoLC-MS/MS in combination with iTRAQ labeling, a higher selectivity is reached because tryptic peptides are analyzed and protein identification is based on sequence information. This allows identification of proteins of higher mass, which cannot be detected with high sensitivity by SELDI-TOF MS.
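To illustrate how an exact mass-to-charge ratio can reveal a truncated form, the minimal sketch below uses C3adesArg, which is C3a lacking its C-terminal arginine. It assumes a singly charged ion and uses the average mass of an arginine residue. The function name and the calculation are illustrative assumptions, not a description of the peak assignment procedure used in this study.

```python
# Average mass in Da of an arginine residue within a peptide chain
# (free arginine minus one water molecule, C6H12N4O).
ARG_RESIDUE_AVG_MASS = 156.1875


def mz_before_truncation(observed_mz: float, residue_mass: float, charge: int = 1) -> float:
    """Expected m/z of the untruncated form, given the m/z of a form that lost one residue.

    Removing a residue lowers the neutral mass by the residue mass, so the
    m/z shifts by residue_mass / charge.
    """
    return observed_mz + residue_mass / charge


# Peak position quoted in the text for the putative C3adesArg.
c3a_desarg_mz = 8938.0

# Where the intact C3a would be expected under the assumptions above.
c3a_mz = mz_before_truncation(c3a_desarg_mz, ARG_RESIDUE_AVG_MASS)

print(f"expected m/z of intact C3a: {c3a_mz:.1f}")
```

Under these assumptions, intact C3a would be expected roughly 156 m/z units above the truncated form, a shift that is readily distinguishable in the 2 to 10 kDa range.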