- Research article
- Open Access
- Open Peer Review
Differential proteomic comparison of breast cancer secretome using a quantitative paired analysis workflow
BMC Cancer volume 19, Article number: 365 (2019)
Worldwide, breast cancer is the main cause of cancer mortality in women. Most cases originate in mammary ductal cells that produce the nipple aspirate fluid (NAF). In cancer patients, this secretome contains proteins associated with the tumor microenvironment. NAF studies are challenging because of inter-individual variability. We introduced a paired-proteomic shotgun strategy that relies on NAF analysis from both breasts of patients with unilateral breast cancer and extended PatternLab for Proteomics software to take advantage of this setup.
The software is based on a peptide-centric approach and uses the binomial distribution to attribute a probability for each peptide as being linked to the disease; these probabilities are propagated to a final protein p-value according to the Stouffer’s Z-score method.
A total of 1227 proteins were identified and quantified, of which 87 were differentially abundant, being mainly involved in glycolysis (Warburg effect) and immune system activation (activated stroma). Additionally, in the estrogen receptor-positive subgroup, proteins related to the regulation of insulin-like growth factor transport and platelet degranulation displayed higher abundance, confirming the presence of a proliferative microenvironment.
We debuted a differential bioinformatics workflow for the proteomic analysis of NAF, validating this secretome as a treasure-trove for studying a paired-organ cancer type.
Breast cancer is one of the most common human neoplasms, accounting for approximately one quarter of all cancers in females. Invasive breast cancer is the most common carcinoma in women [1, 2]. Most cases arise from epithelial cells of the mammary ductal system. In non-pregnancy and non-lactating periods, these epithelial cells produce a secretion that, when collected, is called the nipple aspirate fluid (NAF) . As a protein-rich breast-proximal fluid closely related to the tumor microenvironment in cancer patients, NAF constitutes a valuable biological sample to study secreted proteins from tumor cells without contamination by other interstitial fluids or cells [4, 5]. Proteomic studies of human body fluids and tissues are challenging, especially due to the high biological variability. Since breast is a “paired” organ, in unilateral breast cancers, the contralateral non-diseased breast from the same individual can be used as an ideal negative control of the cancerous breast , ultimately increasing the statistical power.
In 2014, we evaluated for the first time 14 paired NAF samples from seven patients with unilateral breast cancer by PAGE, zymography, and DIGE strategies. Our results have revealed the existence of very distinct proteomic profiles among patients (i.e., individual differences). However, NAF profiles from both breasts of the same woman were very similar in qualitative terms, although important quantitative differences in protein spot intensities could be observed. Patients with less aggressive tumors shared a similar homogeneous profile, with a typical set of proteins identified. In contrast, patients with more aggressive tumors presented very unique profiles (i.e., heterogeneous) .
DIGE poses as the state of the art method for sample comparisons by two-dimensional gel electrophoresis. When using this technique, to perform a statistical analysis of spot intensity differences between two conditions (Cy3- and Cy5-labeled), the images are overlaid using the Cy2-labeled internal standard as the reference image. This standard is made up of equal amounts of all samples in the study . Due to the substantial individual heterogeneity found in NAF samples, it was not possible to confidently overlay their gel images, therefore hampering the use of statistical tests for pinpointing differentially abundant candidate markers between the cancer and the control samples. Although the study failed to provide valuable candidate markers, it was fundamental to demonstrate that, even though substantial qualitative individual differences were observed, when comparing NAF samples from both breast within the same patient, the electrophoretic patterns were very similar, regardless of their cancer status .
To overcome the limitations imposed by our previous gel-based analytical strategy, a shotgun label-free proteomic approach was applied to further advance the proteomic characterization of NAF samples. However, attempts to use classical data analysis tools (e.g. Proteome Discoverer and Progenesis) [9, 10] were not successful in providing differential results. The significant inter-individual variability of NAF samples confuses chromatogram alignment, which constitutes an important first step of many algorithms. Most importantly, such traditional shotgun proteomic statistical algorithms do not capitalize on the sample pairing. As normalization is not trivial across the patients, applying data analysis strategies that rely on statistically finding differential abundance by considering the average values of each group (cancer vs control) for each protein is simply not applicable for the task at hand; these tools work considerably better for models with lower biological variation, such as cell cultures or mouse models [11, 12].
Taken together, our cumulative experience on various studies made clear that the substantial individual heterogeneity of these clinical samples required further development of proteomic data analysis tools. The pairing of the NAF samples constitutes the core of our strategy and capitalizes on the subtle variations within the same patient. Therefore, we developed an extension to the PatternLab for Proteomics suite that was tailored for the data analysis challenges at hand, which finally enabled us to confidently perform a differential proteomic comparison of breast cancer secretome samples (NAF) from patients with unilateral breast cancer. In summary, here we propose a consistent quantitative analysis workflow for the evaluation of a heterogeneous biological fluid that constitutes a valuable source of information with potential applications in clinical evaluation of breast cancer patients.
NAF samples (10 cancerous and 10 control) were collected from both breasts of 10 patients with biopsy-proven unilateral ductal invasive carcinoma, yielding a total of 20 biological samples, plus three individuals with no positive diagnosis of breast disease on either breast (providing six more biological samples). All samples were collected at the Mastology Service of the Fernandes Figueira Institute (IFF) of Fiocruz or at the Gynecology Ambulatory of Lagoa Federal Hospital (Table 1). Eligibility criteria for all subjects were: a) to be post-menopausal; b) no intake of exogenous hormones during the previous six months; c) no breast surgery or chemotherapy; d) no previous clinical evidence of breast disease or cancer. After obtaining the written informed consent (IFF Research Ethics Committee, license 0083/10) and the clinical and imaging confirmation of the diagnosis status, NAF collection and protein quantification were performed as previously described . Briefly, the breast was gently massaged from the chest wall toward the nipple for 5 min followed by warm compress for equal time. The nipple fluids were then aspirated using breast pumps and the fluid droplets were collected using a 10 μL micropipetter (Gilson, Inc., Middleton, WI, USA). Immediately, the diluted NAF samples (10 times in phosphate buffered saline pH 7.4) were centrifuged at 250 x g for 10 min at 6 °C and the supernatant was collected and stored at − 80 °C. The NAF protein concentrations were determined using the bicinchoninic acid protein assay kit (Sigma-Aldrich, St. Louis, MO, USA).
One hundred micrograms of lyophilized NAF proteins were dissolved in 20 μL of 400 mM ammonium bicarbonate/ 8 M urea followed digestion as described elsewhere . The digested peptide mixture was desalted by using homemade tip columns packed with Poros R2 resin (Applied Biosystems, USA). Samples were finally dried in a vacuum centrifuge .
Mass spectrometry data acquisition
Desalted tryptic peptides were resuspended in 100 μL of 0.1% (v/v) trifluoroacetic acid. Samples were then analyzed by nLC-MS/MS using an UltiMate 3000 RSLC system (Dionex, USA) coupled to an Orbitrap Elite mass spectrometer (ThermoFisher Scientific, Germany). Initially, peptides were loaded (normalized TIC values between 5 × 108 – 1 × 109, corresponding to 1–4 μL) with 0.1% TFA at 20 μL/min to a 2-cm long (100 μm i.d.) Acclaim® PepMap100 NanoViper Trap column packed with 5 μm silica particles, 100 Å pore size, followed by separation at 250 nL/min on a 50 cm × 75 μm i.d. Acclaim® PepMap100 NanoViper column, both at 60 °C. Peptides were eluted with a gradient of 3 to 45% of 0.1% (v/v) formic acid and 84% (v/v) acetonitrile over 187 min. The spray voltage was set to 1.8 kV with capillary temperature of 275 °C and no sheath or auxiliary gas flow. Full MS spectra were acquired with 1 microscan on the Orbitrap analyzer at a 60,000 resolution (FWHM at m/z 400) with a target AGC value set to 1 × 106. For each survey scan (300 to 1500 m/z range), up to 10 most abundant precursor ions were sequentially submitted to CID fragmentation and MS2 analysis in the LTQ using the following parameters: MSn AGC target value of 1 × 104, normalized collision energy of 35%, minimum signal threshold of 2000 counts and dynamic exclusion time of 30 s.
Peptide-spectrum matching (PSM) was performed using the Comet  search engine (version 2016.01), which is embedded in PatternLab for Proteomics (version 4.1, http://patternlabforproteomics.org) . Sequences from Homo sapiens were downloaded from UniProtKB/Swiss-Prot (containing target 42,402 entries, on September 17, 2018, http://www.uniprot.org/). The final search database, generated using PatternLab’s Search Database Generator tool, included a reverse decoy for each target sequence plus sequences from 127 common contaminants, such as BSA, keratin, and trypsin. The search parameters applied included: fully tryptic and semi-tryptic peptide candidates with masses between 550 and 5500 Da, up to two missed cleavages, 40 ppm for precursor mass and bins of 1.0005 m/z for MS/MS. The modifications were carbamidomethylation of cysteine and oxidation of methionine as fixed and variable, respectively. The validity of the PSMs was assessed using the Search Engine Processor (SEPro) . Identifications were grouped by tryptic status, resulting in two distinct subgroups. For each result, XCorr, DeltaCN, and Comet’s secondary score values were used to generate a Bayesian discriminator. A cutoff score was established to accept a false-discovery rate (FDR) of 1%. A minimum sequence length of 6 amino acid residues was required and the results were further filtered to only accept PSMs with precursor mass error of less than 6 ppm. Proteins identified by only one spectrum (i.e. 1-hit-wonders) having an XCorr below 2.0 were excluded from the identification list. The post-processing filter resulted in a global FDR, at the protein level, of less than 1% and was independent of the tryptic status .
Experimental design and statistical rationale
For breast cancer patients, NAF samples were collected (up to three attempts) in a brief time window between the diagnosis and surgery. Even though proteomic differences in NAF due to the activity of ovarian hormones are believed to be negligible [18,19,20], we were cautious to only include post-menopausal individuals. Through a workflow of only a few steps, a high-resolution and sensitive nLC-MS/MS analysis [21, 22] was carried out for shotgun evaluation of NAF samples.
PatternLab’s XIC extraction tool was used for obtaining the XICs of peptides confidently identified according to SEPro. The XIC extraction of precursor intensity measurements was performed under a tolerance of 9 ppm and acceptable charge states + 2 and + 3. PatternLab’s XIC Explorer was then used to visually assess the distribution of intensities of the label-free quantitations, label each run as control or disease, and tag which samples were from the same patient for further paired analysis. Additionally, PatternLab’s TFold module was used to demonstrate a standard comparison of mean values between two groups: NAF from diseased breasts versus non-diseased ones.
The .xic file provided by PatternLab served as input to a tool named Paired Analyzer (PA), specifically developed for this study. PA begins by normalizing the XICs from each peptide according to the total ion current from each run. The paired analysis of each unique (i.e. proteotypic) peptide required six or more sequential precursor intensity measurements and a minimum fold change of 1.5. Then, for each peptide, the software extracts a list of values according to one of four possibilities: i) when a peptide’s XIC is obtained from data originating from both breasts, an XIC ratio (cancer:control) is recorded; ii) when an XIC is not obtained from either breast, a “0” (zero) is recorded; iii) when an XIC is obtained only from the diseased sample, a “+” (plus) is recorded; iv) when an XIC is obtained only from the control sample, a “–” (minus) is recorded (Table 2). In what follows, PA relies on a peptide-centric approach to assign a p-value to each peptide as being differentially abundant. For this, we follow a paired binomial approach. Our model assumes a 50% chance for a randomly selected peptide to be a success relative to each individual patient for which an XIC was obtained from at least one breast, where success is to be understood as that peptide having a ratio greater than 1 or a “+” for the patient in question. A peptide’s number of successes is the random variable X, and we calculate its p-value as the probability P(X > x), given by a sum of binomials, where x is the number of patients for which success was observed. Thus, for the peptide Pa (Table 2), the number of successes (x) is 3, the number of trials (n, number of columns not having 0 as a value) is 5, which yields P(X > 3) = 0.5. For Pb, x = 2 and n = 10, yielding P(X > 2) = 0.99. For Pc, x = 6 and n = 6, yielding P(X > 6) = 0.02. In summary, low p-values (e.g. p < 0.025) link a peptide to the cancer condition; on the other hand, high values (e.g. p > 0.975) would link the aforementioned peptide to the control condition.
Finally, multiple p-values originating from the peptides mapped to a protein are used to perform a meta-analysis to help determine whether that protein can be considered differentially abundant. This analysis is the determination of Stouffer’s Z-score  for the data at hand, denoted by Z and given as a function of the various peptide p-values as
In this expression, k is the number of peptides; Zi = Φ−1(1 − pi), where Φ is the standard normal cumulative distribution and pi the i-th peptide’s p-value; wi is the square root of the count of individuals in which that peptide was identified.
Average fold-changes were calculated considering the logarithms of the ratios to the base 2, allowing for symmetry in the expression rates of more (positive values, Table 2) and less (negative values) abundant proteins in cancer.
The differentially abundant proteins were categorized in pathways according to the Reactome v60 (https://www.reactome.org/) database. The distribution of those proteins was plotted in a graph from PatternLab’s showing the mean of normalized parent ion intensity abundance factor (NIAF)  of each identified protein from the NAF samples of the ten patients.
Selected reaction monitoring (SRM)
From the differentially abundant list of proteins, 12 of them related to glycolysis, complement cascade and platelet activation pathways were selected for further validation. The spectral library was built from the shotgun analysis described in sections 2.3 and 2.4 and loaded at Skyline software (https://skyline.ms/project/home/software/Skyline/begin.view, version 4.1). A total of 87 transitions were selected for SRM according to peptide uniqueness in the human genome; presence in the spectral library with relatively high intensity of signal; without ragged ends (KK, RR, KR or RK); minimum and maximum size of 8 and 25 aminoacids, respectively; only “y” ion types. Six pairs of samples were prepared as described above and the dessalted tryptic peptide mixtures were quantified by Pierce Quantitative Colorimetric Peptide Assay (ThermoFisher Scientific, USA). A total of 0.5 μg of peptides for each sample spiked in 32 fmol of Pierce Retention Time Calibration Mixture (ThermoFisher Scientific, USA) in 1% formic acid (FA) were loaded to a 2 cm precolumn of 75 μm i.d. with 3 μm silica particles and 100 Å pore size (Acclaim PepMapTM 100, Thermo) in 12 μL of 0,1% (v/v) FA and 5% (v/v) acetonitrile in water, using an EASY II (Proxeon, USA). Then, separation was performed at 320 nL/min in a PicoChip column, 75 μm i.d. × 15 μm tip × 10.5 cm of H354 ReproSil-Pur C18-AQ 120 Å (New Objective, USA) using an elution gradient of 5 to 45% of 0.1% (v/v) FA and 5% (v/v) water in acetonitrile over 40 min followed by 45–95% over 10 min. The nLC was coupled to a TSQ Quantiva mass spectrometer (ThermoFisher Scientific, Germany). The spray voltage was set to 2.6 kV with capillary temperature of 280 °C, the 60 min acquisition was done with 2 s cycle time, 0.7 Q1 and Q3 resolution (FWHM 508.2 m/z), 1,5 mTorr for collision induced dissociation (CID) fragmentation, and collision energy adjusted according to the theoretical equation of this mass spectrometer. After manual refinement of each transition for each sample, the areas of 48 transitions which refer to 9 proteins were exported from Skyline and imported at Paired Analyzer tool for statistical analysis as described above.
PatternLab’s TFold comparison of the mean values of protein abundance between the cancerous group versus the non-cancerous one showed no protein as being differentially abundant (Fig. 1).
The shotgun approach disclosed a total of 1227 protein entries (Additional file 1: Table S1), of which 87 proteins (Table 3) were differentially abundant between cancerous and non-diseased breasts from unilateral breast cancer patients, according to our paired statistical approach. From these 87 differentially abundant proteins, all of them were quantified with more than 6 peptides and are included in the Plasma Proteome Database (http://www.plasmaproteomedatabase.org/), proteins except for three immunoglobulins forms (Ig heavy constant gamma 2, Ig kappa variable 3–20, and Ig heavy variable 2–5). Nine differentially abundant proteins were detected in lower levels in NAF samples originating from the cancerous breast (Stouffer p-values ≥0.975).
We also performed a differential analysis between samples from right and left breasts of three women without breast disease (Table 1, individuals 1–3). From the list of 578 statistically evaluated proteins (Additional file 2: Table S2), Ig heavy constant alpha-1 (Stouffer’s p-value = 0.0054), alpha-1-antichymotrypsin (Stouffer’s p-value = 0.9888), alpha-1-antitrypsin (Stouffer’s p-value = 0.9943) were pointed as differentially abundant. Since alpha-1-antichymotrypsin and alpha-1-antitrypsin were also found in the comparison between the breasts of cancer patients, they were excluded from the following analyses (Additional file 1: Table S1). Our motivation was to reduce the chance of false positive identifications as, in principle, there should be no reason for having differentially abundant proteins between the NAF samples originating from normal right and left breasts.
Among 87 differentially abundant proteins observed between cancerous and non-diseased paired breasts, it is worth mentioning the frequent identification of proteins associated with the glycolysis pathway, the complement cascade and the platelet activation/degranulation systems (Additional file 3: Table S3). Furthermore, having as reference the average value of NIAF plotted for each protein ordered by abundance, the 87 differentially abundant proteins were among the more abundant ones (Fig. 2).
We also performed an additional differential analysis between NAF samples from a subgroup of patients bearing estrogen receptor-positive tumors (ERpos) (Table 1). From the 873 statistically evaluated proteins (Additional file 4: Table S4), 14 proteins (Table 4) were classified as differentially abundant; 10 of them (alpha-1B-glycoprotein, ceruloplasmin, alpha-2-macroglobulin, serotransferrin, immunoglobulin heavy constant mu, alpha-1-acid glycoprotein 1, ferritin heavy chain, proteinS100-A8, and serum albumin) were also found as differentially abundant in the total set of breast cancer patients. All differentially abundant proteins found in both datasets, ERpos cancer NAF and total cancer NAF, presented the same abundance tendencies. Among the proteins found as more abundant in ERpos samples, representatives of the protein metabolism and the platelet degranulation system were frequently identified (Additional file 3: Table S3).
To validate the results obtained from the cancer versus nondiseased comparison, 12 differentially abundant proteins were selected according its overall high MS signal and their presence in the well represented Reactome pathways here described. By Selected Reaction Monitoring (SRM), 4 proteins of glycolysis (pyruvate kinase, glyceraldehyde-3-phosphate dehydrogenase, triosephosphate isomerase, and fructose-bisphosphate aldolase A), 4 proteins of complement cascade (complement C5, complement C3, complement factor B, and complement factor H), and 4 proteins of platelet activation and signaling (alpha-2-macroglobulin, apolipoprotein A-I, fibronectin, and annexin A5). From the initial 87 inicial transitions, 48 were successfully monitored with a CV lower then 15% among replicates, after normalization using global standards. The normalized areas were statistically analyzed by our paired setup and the higher abundance in cancer samples were confirmed to pyruvate kinase, alpha-2-macroglobulin, and complement factor B (p-value < 0.05). Although the proteins fructose-bisphosphate aldolase A, complement C5, complement C3, complement factor H, apolipoprotein A-I, and annexin A5 did not reach lower p-values, fold changes were corroborated with the higher abundance in cancer samples (Additional file 5: Table S5).
Differential analysis performed with individually paired NAF samples from unilateral breast cancer patients (using the contralateral non-diseased breast sample as negative control) is a powerful strategy for discrimination of which proteins are related to the disease as it helps overcoming the challenge of individual heterogeneity observed between patients [7, 25]. We applied PatternLab’s TFold analysis as a representative of a widely adopted proteomic approach to demonstrate the effectiveness of these methods on datasets with a high biological variation; no proteins were found as differentially abundant. Thus, to quantitatively analyze proteins by label-free shotgun, individual pair-by-pair analysis was performed by the new Paired Analyzer tool of the PatternLab for Proteomics software . As aforementioned, our software performs a quantitative peptide-centric approach that relies on the binomial distribution to attribute a p-value to each peptide as being related to the disease or not. In what follows, these p-values are rolled up to the protein level, converging to the Stouffer’s Z-score via a widely adopted meta-analysis procedure which enables combining independent statistical tests bearing upon the same hypothesis to establish a single score. The tool also allows quickly verifying whether individual peptides belonging to the same protein followed the same trend in differential abundance across the breast cancer NAF samples (Additional file 6: Graphical abstract).
Our shotgun approach revealed proteins presenting higher abundances in breast cancer samples that were previously known as related to cancer progression. Among these proteins are members of the glycolysis pathway, components of the platelet activation/degranulation systems, and proteins associated with the complement cascade . By SRM, at least 2 proteins per pathway corroborated the higher abundance in cancer samples. Increased levels of glycolytic enzymes have been previously related to higher glucose consumption, oncogene activation and loss of function of tumor suppressor genes, promotion of metastasis, angiogenesis stimulation, chemotherapy resistance, and immune evasion [28, 29]. Another known pathway related to tumor progression is the complement cascade that mediates the innate immune system activation, resulting in inflammatory cell and fibroblast recruitment to the tumor microenvironment, which sustains the extracellular matrix remodeling and, consequently, supports cancer progression [30, 31].
In this work we were able to identify 20 representatives of platelet degranulation, activation, signaling or aggregation as more abundant in NAF cancer samples, showing a typically coagulant tumor microenvironment. Several studies report that cancer progression and metastasis (specifically angiogenesis promotion, apoptosis suppression, and extracellular matrix degradation) can be supported by elements of the hemostatic system, such as platelets, coagulation, and fibrinolysis . Therefore, our approach seems to be suitable for more detailed analysis of coagulation cascade proteins, eventually providing further information on its mechanistic relation with breast cancer progression.
Although proteins related to glycolysis and platelet function are commonly found in cancer differential proteomic data [32, 33, 34], their putative roles as moonlighting proteins  have been largely overlooked. According to the MultitaskProtDB , glucose-6-phosphate isomerase, triosephosphate isomerase, and phosphoglycerate kinase 1, glycolytic proteins which we have found more abundant in the cancer samples, may show moonlighting functions in differentiation/ stimulation of cell migration (as a cytokine or a growth factor) , thrombosis/homeostasis , and angiogenesis (as a disulphide reductase) , respectively. Additionally, the peptidyl-prolyl cis-trans isomerase, an enzyme related to platelet degranulation found more abundant in cancer NAF, presents a proinflammatory cytokine function when located in the extracellular space . Interestingly, our comparative evaluation of the paired breast secretion in unilateral cancer cases showed the presence of these cancer-related proteins extracellularly, which may add important new information to the understanding of the human functional proteome.
Characteristically, ERpos breast tumors are a well differentiated type of cancer and present better treatment response and overall survival . This group of samples showed increased levels of proteins related to the regulation of IGF transport and to the platelet degranulation system. Although some findings about the role of IGF system in breast cancer are conflicting, many components of this system are known to be altered during breast cancer establishment and progression regardless the expression patterns of receptors (ER, PR and HER2) . Overall, proliferation mechanisms which are present in these tumors were observed in this work.
In NAF cancer samples, the higher abundances of proteins involved in cell-stroma communication, glycolysis (Warburg effect), and immune system activation (to maintain a stimulated stroma) corroborate previous breast cancer data from the literature. Additionally, this paired comparative proteomic strategy of analysis presents valuable information on the mechanisms described above that are known to be related to the disease, even with the inter-individual heterogeneity characteristic of NAF samples. Although, we performed an SRM experiment and confirmed the higher abundance of 3 proteins in cancer samples, further verification/confirmation of higher levels of glycolytic enzymes, complement components, and platelets activators in a larger cohort (> 20 NAF paired samples per cancer subtype, throughout the entire pathways) using the targeted proteomic strategy may contribute to new advances in breast cancer evaluation. Taken together, these results demonstrate that protein analysis of NAF, a clinical sample easily obtained, could compose a pillar in precision medicine, guiding a protein-based prognosis.
automatic gain control
estrogen receptor positive
false discovery rate
full width at half maximum
human epidermal growth factor receptor 2
linear trap quadrupole
nipple aspirate fluid
nano liquid chromatography
parallel reaction monitoring
peptide spectrum match
Search Engine Processor
Selected Reaction Monitoring
total ion current
triple negative breast cancer
extracted ion chromatogram/current
Bray F, Jemal A, Grey N, Ferlay J, Forman D. Global cancer transitions according to the human development index (2008-2030): a population-based study. Lancet Oncol. 2012;13:790–801.
Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer J Int Cancer. 2015;136:E359–86.
Djuric Z, Visscher DW, Heilbrun LK, Chen G, Atkins M, Covington CY. Influence of lactation history on breast nipple aspirate fluid yields and fluid composition. Breast J. 2005;11:92–9.
Alexander H, Stegner AL, Wagner-Mann C, Du Bois GC, Alexander S, Sauter ER. Proteomic analysis to identify breast cancer biomarkers in nipple aspirate fluid. Clin Cancer Res Off J Am Assoc Cancer Res. 2004;10:7500–10.
Varnum SM, Covington CC, Woodbury RL, Petritis K, Kangas LJ, Abdullah MS, et al. Proteomic characterization of nipple aspirate fluid: identification of potential biomarkers of breast cancer. Breast Cancer Res Treat. 2003;80:87–97.
Kuerer HM, Coombes KR, Chen J-N, Xiao L, Clarke C, Fritsche H, et al. Association between ductal fluid proteomic expression profiles and the presence of lymph node metastases in women with breast cancer. Surgery. 2004;136:1061–9.
Brunoro GVF, Ferreira AT da S, Trugilho MR de O, de OTS, Amêndola LCB, Perales J, et al. Potential correlation between tumor aggressiveness and protein expression patterns of nipple aspirate fluid (NAF) revealed by gel-based proteomic analysis. Curr Top Med Chem. 2014;14:359–68.
Alban A, David SO, Bjorkesten L, Andersson C, Sloge E, Lewis S, et al. A novel experimental design for comparative two-dimensional gel analysis: two-dimensional difference gel electrophoresis incorporating a pooled internal standard. Proteomics. 2003;3:36–44.
Gonzalez-Galarza FF, Lawless C, Hubbard SJ, Fan J, Bessant C, Hermjakob H, et al. A critical appraisal of techniques, software packages, and standards for quantitative proteomic analysis. Omics J Integr Biol. 2012;16:431–42.
Qi D, Brownridge P, Xia D, Mackay K, Gonzalez-Galarza FF, Kenyani J, et al. A software toolkit and interface for performing stable isotope labeling and top3 quantification using Progenesis LC-MS. Omics J Integr Biol. 2012;16:489–95.
Nie S, McDermott SP, Deol Y, Tan Z, Wicha MS, Lubman DM. A quantitative proteomics analysis of MCF7 breast cancer stem and progenitor cell populations. Proteomics. 2015;15:3772–83.
Di Luca A, Henry M, Meleady P, O’Connor R. Label-free LC-MS analysis of HER2+ breast cancer cell line response to HER2 inhibitor treatment. Daru J Fac Pharm Tehran Univ Med Sci. 2015;23:40.
Brunoro GVF, Carvalho PC, Ferreira AT da S, Perales J, Valente RH, de Moura Gallo CV, et al. Proteomic profiling of nipple aspirate fluid (NAF): exploring the complementarity of different peptide fractionation strategies. J Proteome. 2015;117:86–94.
Larsen MR, Trelle MB, Thingholm TE, Jensen ON. Analysis of posttranslational modifications of proteins by tandem mass spectrometry. BioTechniques. 2006;40:790–8.
Eng JK, Jahan TA, Hoopmann MR. Comet: an open-source MS/MS sequence database search tool. PROTEOMICS. 2013;13:22–4.
Carvalho PC, Lima DB, Leprevost FV, Santos MDM, Fischer JSG, Aquino PF, et al. Integrated analysis of shotgun proteomic data with PatternLab for proteomics 4.0. Nat Protoc. 2015;11:102–17.
Barboza R, Cociorva D, Xu T, Barbosa VC, Perales J, Valente RH, et al. Can the false-discovery rate be misleading? Proteomics. 2011;11:4105–8.
Chatterton RT, Geiger AS, Khan SA, Helenowski IB, Jovanovic BD, Gann PH. Variation in estradiol, estradiol precursors, and estrogen-related products in nipple aspirate fluid from normal premenopausal women. Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol. 2004;13:928–35.
Huang Y, Nagamani M, Anderson KE, Kurosky A, Haag AM, Grady JJ, et al. A strong association between body fat mass and protein profiles in nipple aspirate fluid of healthy premenopausal non-lactating women. Breast Cancer Res Treat. 2007;104:57–66.
Noble J, Dua RS, Locke I, Eeles R, Gui GPH, Isacke CM. Proteomic analysis of nipple aspirate fluid throughout the menstrual cycle in healthy pre-menopausal women. Breast Cancer Res Treat. 2007;104:191–6.
Gilar M, Olivova P, Daly AE, Gebler JC. Orthogonality of separation in two-dimensional liquid chromatography. Anal Chem. 2005;77:6426–34.
Michalski A, Damoc E, Lange O, Denisov E, Nolting D, Müller M, et al. Ultra high resolution linear ion trap Orbitrap mass spectrometer (Orbitrap elite) facilitates top down LC MS/MS and versatile peptide fragmentation modes. Mol Cell Proteomics MCP. 2012;11:O111.013698.
Whitlock MC. Combining probability from independent tests: the weighted Z-method is superior to Fisher’s approach: combining probabilities from many tests. J Evol Biol. 2005;18:1368–73.
Zhang Y, Wen Z, Washburn MP, Florens L. Improving label-free quantitative proteomics strategies by distributing shared peptides and stabilizing variance. Anal Chem. 2015;87:4749–56.
Kuerer HM, Goldknopf IL, Fritsche H, Krishnamurthy S, Sheta EA, Hunt KK. Identification of distinct protein expression patterns in bilateral matched pair breast ductal fluid specimens from women with unilateral invasive breast carcinoma. High-throughput biomarker discovery. Cancer. 2002;95:2276–82.
Carvalho PC, Fischer JSG, Xu T, Yates JR, Barbosa VC. PatternLab: from mass spectra to label-free differential shotgun proteomics. Curr Protoc Bioinforma Ed Board Andreas Baxevanis Al. 2012;Chapter 13:Unit13.19.
Dvorak HF. Tumors: wounds that do not heal. Similarities between tumor stroma generation and wound healing. N Engl J Med. 1986;315:1650–9.
Jang M, Kim SS, Lee J. Cancer cell metabolism: implications for therapeutic targets. Exp Mol Med. 2013;45:e45.
El Sayed SM, Mohamed WG, Seddik M-AH, Ahmed A-SA, Mahmoud AG, Amer WH, et al. Safety and outcome of treatment of metastatic melanoma using 3-bromopyruvate: a concise literature review and case study. Chin J Cancer. 2014;33:356–64.
Byun JS, Gardner K. Wounds that will not heal: pervasive cellular reprogramming in cancer. Am J Pathol. 2013;182:1055–64.
Mueller MM, Fusenig NE. Friends or foes - bipolar effects of the tumour stroma in cancer. Nat Rev Cancer. 2004;4:839–49.
Lal I, Dittus K, Holmes CE. Platelets, coagulation and fibrinolysis in breast cancer progression. Breast Cancer Res BCR. 2013;15:207.
Martinez-Outschoorn U, Sotgia F, Lisanti MP. Tumor microenvironment and metabolic synergy in breast cancers: critical importance of mitochondrial fuels and function. Semin Oncol. 2014;41:195–216.
Calderón-González KG, Valero Rustarazo ML, Labra-Barrios ML, Bazán-Méndez CI, Tavera-Tapia A, Herrera-Aguirre ME, et al. Determination of the protein expression profiles of breast cancer cell lines by quantitative proteomics using iTRAQ labelling and tandem mass spectrometry. J Proteome. 2015;124:50–78.
Jeffery CJ. Moonlighting proteins. Trends Biochem Sci. 1999;24:8–11.
Hernandez S, Ferragut G, Amela I, Perez-Pons J, Piñol J, Mozo-Villarias A, et al. MultitaskProtDB: a database of multitasking proteins. Nucleic Acids Res. 2014;42:D517–20.
Chaput M, Claes V, Portetelle D, Cludts I, Cravador A, Burny A, et al. The neurotrophic factor neuroleukin is 90% homologous with phosphohexose isomerase. Nature. 1988;332:454–5.
Liu Q-Y, Corjay M, Feuerstein GZ, Nambi P. Identification and characterization of triosephosphate isomerase that specifically interacts with the integrin αIIb cytoplasmic domain. Biochem Pharmacol. 2006;72:551–7.
Lay AJ, Jiang XM, Kisker O, Flynn E, Underwood A, Condron R, et al. Phosphoglycerate kinase acts in tumour angiogenesis as a disulphide reductase. Nature. 2000;408:869–73.
Jin Z-G, Lungu AO, Xie L, Wang M, Wong C, Berk BC. Cyclophilin a is a proinflammatory cytokine that activates endothelial cells. Arterioscler Thromb Vasc Biol. 2004;24:1186–91.
Barcellos-Hoff MH. Does microenvironment contribute to the etiology of estrogen receptor-negative breast cancer? Clin Cancer Res Off J Am Assoc Cancer Res. 2013;19:541–8.
Christopoulos PF, Msaouel P, Koutsilieris M. The role of the insulin-like growth factor-1 system in breast cancer. Mol Cancer. 2015;14:43.
We would like to acknowledge Julio César da Paixão, Luis Claudio Belo Amêndola, Rafael Henrique Szymanski Machado and Napoleão Leão Jr. for their assistance during sample collection, and the Laboratory of Proteomics (LabProt) - LADETEC, Institute of Chemistry, Federal University of Rio de Janeiro for their assistance in the SRM analysis. This study was supported by Fundação do Câncer, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Rede Proteômica do Rio de Janeiro, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Fundação Oswaldo Cruz/ Programa de Desenvolvimento Tecnológico em Insumos para Saúde, the Ministerium für Innovation, Wissenschaft und Forschung des Landes Nordrhein-Westfalen, the Senatsverwaltung für Wirtschaft, Technologie und Forschung des Landes Berlin, and the Bundesministerium für Bildung und Forschung.
GVFB received a Ph.D. fellowship from CAPES (2010) and Programa de Doutorado Sanduíche no Exterior (PDSE, grant 8259-13-5) (2013). AGCNF and JP are CNPq fellows (grants 311539/2015–7 and 312311/2013–3, respectively). VCB and JP are FAPERJ-CNE fellows (grants E-26/201.444/2014 and E-26/202.960/2015, respectively). RPZ acknowledges funding support from the Ministerium für Innovation, Wissenschaft und Forschung des Landes Nordrhein-Westfalen, the Senatsverwaltung für Wirtschaft, Technologie und Forschung des Landes Berlin, and the Bundesministerium für Bildung und Forschung.
The funding sources played no role in the design of the study and collection, analysis, and interpretation of data nor in the writing of the manuscript.
Availability of data and materials
The PatternLab for Proteomics, Paired Analyzer data generated are made available at www.proteomics.fiocruz.br/supplementaryfiles/Brunoro2018. Mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE  partner repository with the dataset identifier PXD005157.
Ethics approval and consent to participate
All procedures performed in this study were in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. A written informed consent was obtained for each individual under the license 0083/10 of the Fernandes Figueira Institute Research Ethics Committee.
Consent for publication
A written informed consent for publication was also obtained for each individual under the license 0083/10 of the Fernandes Figueira Institute Research Ethics Committee.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table S1. Result summary of the protein and peptide identifications from the NAF samples from breast cancer patients. a) Protein Report: List of all 1227 proteins statistically evaluated after paired comparisons of NAF samples from breast cancer patients; b) Peptide Report: List of all 1227 proteins and the respective peptides statistically evaluated after paired comparisons of NAF samples from breast cancer patients. (XLSX 1245 kb)
Table S2. Result summary of protein and peptide identifications (NAF samples from control women). a) Protein Report: Complete list of 578 proteins statistically evaluated after paired comparisons of non diseased breast NAF samples from control women; b) Peptide Report: Complete list of 578 proteins and the respective peptides statistically evaluated after paired comparisons of non diseased breast NAF samples from control women. (XLSX 364 kb)
Table S3. Functional analysis of differentially abundant proteins (NAF samples from breast cancer patients). a) Cancer_vs_Nondiseased: Reactome pathways categorization of the proteins found as differentially abundant after paired comparison of NAF samples from breast cancer patients; b) ERpos_vs_Nondiseased: Reactome pathways categorization for differentially abundant proteins according to the paired comparison of NAF samples from ERpos breast cancer patients. (XLSX 26 kb)
Table S4. Summary result of protein and peptide identifications (NAF samples from ERpos breast cancer patients). a) Protein Report: List of all 873 proteins statistically evaluated after paired comparisons of NAF samples from ERpos breast cancer patients; b) Peptide Report: List of all 873 proteins and the respective peptides statistically evaluated according to the paired comparisons of NAF samples from ERpos breast cancer patients. (XLSX 639 kb)
Table S5. Summary result of the proteins monitored by SRM (NAF samples from breast cancer patients). a) Protein Report: List of the 9 proteins monitored by SRM and statistically evaluated after paired comparisons of NAF samples from breast cancer patients; b) Peptide Report: List of the 9 proteins and the respective peptides monitored by SRM and statistically evaluated according to the paired comparisons of NAF samples from breast cancer patients. (XLSX 17 kb)
Graphical abstract. This study introduced a paired-proteomic shotgun strategy that relies on NAF analysis from both breasts of patients with unilateral breast cancer. The differential analysis of the quantitative data was performed by the “Paired Analyzer”, a newly developed module that works together with the “PatternLab for Proteomics” software. Using a peptide-centric approach, the software applied the binomial distribution to attribute a probability for each peptide as being linked to the disease; these probabilities were propagated to a final protein p-value, according to the Stouffer’s Z-score method. (TIF 195 kb)