Library diversity and preparation
The M13KE phage and its host, Escherichia coli ER2738, were obtained from New England Biolabs (NEB). Two different M13KE libraries were used, namely a home-made 7-mer library and a commercial 12-mer library from NEB (E8110S). The 7-mer library was constructed as described in [19], using primers 5′–CATGCCCGGGTACCTTTCTATTCTC–3′ and 5′–(NNN)7AGAGTGAGAATAGAAAGGTACCCGGG–3′, and digestion was carried out as in the protocol for M13KE DNA insertion (7.2 kb).
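As a rough guide to the sequence space covered by a fully randomized (NNN)7 insert, the theoretical diversity can be computed directly. The sketch below is only illustrative: the function name is hypothetical, and it assumes equal codon usage and that all three stop codons terminate translation, ignoring any bias in oligonucleotide synthesis or in the host.

```python
def library_diversity(n_codons, amino_acids=20, codons=64, stop_codons=3):
    """Theoretical diversity of an NNN-randomized insert (idealized assumptions)."""
    peptides = amino_acids ** n_codons      # distinct peptide sequences
    dna_sequences = codons ** n_codons      # distinct DNA inserts
    # Fraction of inserts with no stop codon at any randomized position.
    stop_free = ((codons - stop_codons) / codons) ** n_codons
    return peptides, dna_sequences, stop_free


peptides, dna_sequences, stop_free = library_diversity(7)
print(f"{peptides:.2e} peptides, {dna_sequences:.2e} DNA inserts, "
      f"{stop_free:.1%} of inserts free of stop codons")
```

Under these assumptions a 7-mer library has 20^7 = 1.28 × 10^9 possible peptides, which is of the same order as the number of clones typically quoted for such libraries.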
Cell line and culture
The human cancer cell lines MDA-MB-231 (claudin-low subtype), SK-BR-3 (HER2 subtype), Hs 578 T (basal-like subtype) and MDA-MB-435 (melanoma [20]) were kindly provided by the Institute of Molecular Pathology and Immunology at the University of Porto (IPATIMUP). The human mammary cell line MCF-10-2A (ATCC CRL-10781) is non-tumorigenic and was used as a control. MDA-MB-231, SK-BR-3, Hs 578 T, and MDA-MB-435 cells were routinely cultured in Dulbecco’s Modified Eagle Medium (DMEM, Biochrom) supplemented with 10% (v/v) fetal bovine serum (FBS, Biochrom) and 1% (v/v) penicillin-streptomycin (Biochrom). MCF-10-2A cells were grown in a 1:1 mixture of DMEM and Ham’s F-12 medium supplemented with 5% horse serum (Merck Millipore), 20 ng.mL−1 epidermal growth factor (Merck Millipore), 100 ng.mL−1 cholera toxin (Sigma-Aldrich), 0.01 mg.mL−1 insulin (Sigma-Aldrich), 500 ng.mL−1 hydrocortisone, 95% (Sigma-Aldrich) and 1% penicillin-streptomycin. All cell lines were cultured at 37 °C and 5% CO2. Subculturing was performed at 80% confluence by washing the monolayer with sterile phosphate-buffered saline (PBS), pH 7.4, without Ca2+ and Mg2+, and detaching the cells with Trypsin/EDTA solution 0.05%/0.2% (w/v) (Biochrom). The cell suspension was centrifuged at 250 × g for 7–10 min and the cell pellet was resuspended in fresh growth medium, counted and split according to the experimental needs.
Panning experiments – conventional selection versus BRASIL
Both the conventional phage display and the BRASIL [21] methods were used, in order to compare their performance in the selection of a peptide specific to MDA-MB-231 cells. The BRASIL method is in principle faster than conventional panning, and its use of counter-selection reduces the number of false positives. However, this methodology uses cells in suspension, which may hide surface receptors that are only available in the adherent state. The panning experiments with both methodologies were performed in the same way for the 7-mer and the 12-mer libraries. The experimental setting can be seen in Additional file 1: Table S1.
Conventional selection (surface panning procedure – direct target coating)
One mL of MDA-MB-231 cell suspension at a concentration of 10^6 cells.mL−1 was added to a 6-well microtiter plate and incubated overnight at 37 °C in a 5% CO2 humidified incubator. The medium was then removed and the wells were completely filled with blocking buffer (0.1 M NaHCO3, pH 8.6 (Sigma), with 5 mg.mL−1 bovine serum albumin (BSA; IgG-free, low endotoxin, suitable for cell culture; Sigma)). After an incubation of 1 h at 4 °C, the blocking solution was discarded and the wells were washed 6 times with Tris Buffered Saline with Tween-20 (TBST, TBS with 0.1% (v/v) Tween-20) (Sigma-Aldrich). One mL of a 100-fold dilution in TBST of the library (7-mer or 12-mer) (1×10^11 pfu for a library with 2×10^9 clones) was added to the coated wells and rocked gently for 60 min at 4 °C (to limit phage internalization). The non-binding phage was discarded and the wells were washed 10 times with TBST. The bound phage was then eluted by adding 750 μL of 1× PBS (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4 and 1.8 mM KH2PO4) and rocking gently for 60 min at 4 °C. The eluate was transferred to a microcentrifuge tube and the titer was determined using the double layer agar technique [22] on LB plates containing 100 μM IPTG and 20 μg.mL−1 X-gal, counting the blue plaques. The remaining eluate was amplified by adding it to 20 mL of an early-log ER2738 culture and incubating with vigorous shaking for 4.5 h at 37 °C. The culture was spun at 12,000 × g for 10 min at 4 °C, and the supernatant was transferred to a fresh tube and re-spun. The upper 80% of the supernatant was transferred to a new tube and the phage was precipitated with 1/6 volume of 20% polyethylene glycol (PEG) 8000/2.5 M NaCl for at least 2 h at 4 °C. This solution was centrifuged at 12,000 × g for 15 min at 4 °C, the supernatant was discarded and the phage pellet was suspended in 1 mL TBS. The PEG/NaCl precipitation was repeated and the final pellet was suspended in 200 μL TBS. The titer was determined as previously described.
The whole process was repeated for a total of 8 rounds of panning.
A control panning experiment was carried out using streptavidin as the target, including 0.1 μg.mL−1 streptavidin in the blocking solution. The bound phage was eluted with 0.1 mM biotin in TBS for at least 30 min. After 3 rounds of enrichment/amplification, the consensus sequence for streptavidin-binding peptides was assessed to confirm the inclusion of the motif His-Pro-Gln.
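The titer determinations and the library input used in the conventional panning can be illustrated with a short calculation. This is a minimal sketch, not part of the protocol: the function names, the plaque count, the dilution and the plated volume are hypothetical values chosen for illustration, and only the input of 1×10^11 pfu for a library with 2×10^9 clones comes from the procedure described above.

```python
import math


def titer_pfu_per_ml(plaques, dilution_factor, plated_volume_ml):
    """Titer of the undiluted stock from a plaque count on one plate."""
    return plaques * dilution_factor / plated_volume_ml


def copies_per_clone(input_pfu, library_clones):
    """Average number of copies of each clone present in the panning input."""
    return input_pfu / library_clones


# Hypothetical plate: 120 blue plaques from 10 uL of a 10^6-fold dilution.
example_titer = titer_pfu_per_ml(plaques=120, dilution_factor=1e6,
                                 plated_volume_ml=0.01)
print(f"titer: {example_titer:.2e} pfu/mL")

# Input stated in the protocol: 1e11 pfu for a library with 2e9 clones.
print(f"copies per clone: {copies_per_clone(1e11, 2e9):.0f}")
```

With the stated input, each clone of the library is represented about 50 times on average in the first round.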
BRASIL
A biopanning protocol was used as described in [21]. Briefly, MDA-MB-231 cells.mL−1 were collected, centrifuged (250 × g, 10 min) and the pellet was suspended in 1 mL of complete DMEM medium containing 1% (w/v) BSA. The suspension was centrifuged and this step was repeated 3 times; the cells were then re-suspended in complete growth medium containing 3% (w/v) BSA and kept on ice. Ten μL of the phage library (7-mer or 12-mer) were added to this cell suspension and incubated on ice for 4 h. A bubble of 300 μL PBS was formed on a non-miscible organic phase (cyclohexane:dibutyl phthalate (1:9, v/v, Sigma)), and 200 μL of the cell suspension incubated with the phage library were gently inserted into the bubble. After centrifuging at 10,000 × g for 10 min, the pellet was recovered and washed with 50 μL Tris–HCl (10 mM, pH 9.5). Eluted phages were amplified between rounds using E. coli ER2738, then purified and concentrated with 20% PEG 8000/2.5 M NaCl. Phage titer was determined as described above. The amplified phages were used for additional rounds of biopanning, for a total of eight rounds. A final round of counter-selection with MCF-10-2A cells (non-tumorigenic) was performed, differing from the previous rounds in the fraction collected, which in this case was the aqueous phase containing the phages that did not bind to the control cells.
Preliminary analysis of the specificity and selectivity of a phage pool
Flow cytometry analysis
To characterize pool specificity and selectivity, the last round of the 12-mer phage pool from conventional panning was conjugated with Alexa 488 and analyzed using flow cytometry to evaluate its binding to the MCF-10-2A (control, non-tumorigenic cells), MDA-MB-231, MDA-MB-435, SK-BR-3 and Hs 578 T cell lines. Briefly, 1×10^5 cells were harvested, washed in PBS and blocked using PBS with 3% BSA at 4 °C for 1 h. Subsequently, the cells were washed with 1× PBST (PBS with 0.1% (v/v) Tween-20) and incubated with 100 μL of fluorescent phage particles. The cells were rinsed again with 1× PBST and finally resuspended in 200 μL of PBS for flow cytometry analysis using an EC800™ flow cytometer analyzer (Sony Biotechnology Inc.), counting 20,000 events.
Tissue section analysis
For immunohistochemical analysis, serial sections of paraffin-embedded 231 mammary cancer tissue, kindly provided by Dr. João Nuno Moreira (CNC, Coimbra, Portugal), were treated as described in [23]. To maximize antibody binding, antigen retrieval was performed by heating the slides in 10 mM sodium citrate buffer (pH 6.0) at 95 °C for 20 min and then cooling them slowly at room temperature in the same buffer for about 20 min. Tissues were kept moist at all times. Tissue sections were blocked with a 5% BSA solution for 30 min at room temperature. Immunostaining was performed by adding 100 μL of the last round of the 12-mer phage pool (10^9 PFU.mL−1) to the tissue overnight at 4 °C [24, 25]. Sections were washed 4 times in 1× TBST for 5 min, and 100 μL of the primary antibody, rabbit anti-fd bacteriophage (working dilution of 1:5000 in 1% BSA), was added and incubated at 4 °C overnight. Sections were washed several times with 1× TBST and were challenged with the fluorescein isothiocyanate (FITC)-labelled goat anti-rabbit IgG secondary antibody (working dilution of 1:40 in 1% BSA) for 2 h at room temperature. After additional washing with TBST buffer, sections were counterstained with 4′,6-diamidino-2-phenylindole (DAPI, Vector Laboratories) for nuclear labelling and were mounted with Vectashield® mounting medium (Vector Laboratories). The tissue sections were allowed to dry for 1 h at room temperature in the dark and were sealed with nail polish. Images of the slides were captured at 60× magnification using an Olympus BX51 microscope equipped with a high-sensitivity Olympus DP71 camera.
Selection and screening of cell-specific peptides
Preparation of individual clones for peptide analysis
Single-stranded DNA (ssDNA) was prepared according to the standard protocol described in [19], using iodide buffer (10 mM Tris–HCl, 1 mM EDTA and 4 M NaI (Sigma-Aldrich), pH 8.0) and ethanol precipitation. The DNA pellet was suspended in 30 μL TE buffer (10 mM Tris–HCl, 1 mM EDTA, pH 8.0), quantified using Nanodrop 1000 and confirmed by 2% gel electrophoresis in SGTB (GRISP) buffer 1× at 200 V for 30 min.
PCR and confirmation electrophoresis
The insert sizes of the individual clones, as well as of the complete library, were assessed by PCR using the forward primer 5′-TTAACTCCCTGCAAGCCTCA-3′ and the reverse primer 5′-CCCTCATAGTTAGCGTAACG-3′. PCR reactions were carried out using KAPA Taq polymerase in a 20 μL reaction volume containing 2 μL of phage DNA. The PCR conditions were the following: 25 cycles of denaturation at 95 °C for 30 s; annealing over a temperature range from 45 to 70 °C for 30 s; and extension at 72 °C for 30 s. Amplification was confirmed by 2% gel electrophoresis in 1× SGTB buffer at 200 V for 30 min.
DNA sequencing and insert analysis
The DNA products obtained were prepared for sequencing using Illustra ExoProStar 1-Step (GE Healthcare) and sent to Macrogen Inc. for sequencing with the M13-PIII sequencing primer 5′-TTAACTCCCTGCAAGCCTCA-3′, provided with the Ph.D.12-mer library kit, for forward reading and the primer 5′-CCCTCATAGTTAGCGTAACG-3′ for reverse reading. The Vector NTI Advance 11.5.0 software (Invitrogen – Life Technologies) was used to verify the correct insertion of the peptides, taking into account that the displayed peptides are expressed at the N-terminus of pIII, followed by a short spacer (Gly-Gly-Gly-Ser) and then the wild-type pIII sequence.
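The insert check described above (displayed peptide, then the Gly-Gly-Gly-Ser spacer, then pIII) can be sketched in a few lines. This is not the Vector NTI workflow used in the study, only an illustration of the same logic. It assumes a coding-strand read that is already in frame and uses the standard genetic code; the function names and the example read are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame coding-strand sequence; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def displayed_peptide(dna, length, spacer="GGGS"):
    """Return the `length` residues immediately upstream of the first spacer."""
    protein = translate(dna)
    end = protein.find(spacer)
    if end < length:  # spacer missing, or too few residues before it
        return None
    return protein[end - length:end]


# Hypothetical in-frame read: a 7-residue insert followed by GGGS.
read = "GCTCATCCGCAGTTTGAAAAAGGTGGAGGTTCGGCC"
print(displayed_peptide(read, 7))
```

A real read would first have to be oriented (the reverse read needs reverse-complementing) and placed in the correct frame, which this sketch leaves out.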
Binding assays
Binding assay with counting of blue plaque-forming units (pfu)
The binding of the peptides displayed on M13KE phage was evaluated following a procedure similar to the conventional panning. First, the individual clones were amplified and centrifuged at 12,000 × g for 10 min at 4 °C, and the supernatant was used for phage concentration with 20% PEG 8000/2.5 M NaCl. Phages were suspended in 50 μL TBS and the titer was determined using the double layer agar technique. Then, 1 mL of MDA-MB-231 cells at a concentration of 10^6 cells.mL−1 was added to a 6-well microtiter plate and incubated overnight at 37 °C and 5% CO2. MDA-MB-435 cells were used as a negative control under the same conditions. The cell medium was removed and the wells were washed 6 times with TBST. Then, 1 mL of each M13KE-peptide suspension, at a concentration of 1×10^11 PFU.mL−1, was added to the wells and incubated for 60 min at 4 °C. The non-binding phage was discarded and the wells were washed 10 times with TBST. The bound phages were then eluted by adding 750 μL of 1× PBS and rocking gently for 60 min at 4 °C. The eluate was collected and the titer was determined using the double layer agar technique on IPTG/X-gal plates.
ELISA with direct target coating
ELISA was performed to rapidly determine whether a selected phage clone binds the target, using the protocol described in the NEB Phage Display manual [19]. For each clone to be characterized, one row of coated (with target cells) and uncoated wells were used. Plates were read at 405 to 415 nm (Promega Glomax 20/20 luminometer) and the signals (RLUs) obtained with and without target protein (cells) were compared.
Bioinformatics analysis
Library analysis
Sequence similarities between the peptides obtained in this work and peptides reported in the literature targeting cancer cells (see Additional file 2: Table S3) were scored using BLOSUM45 matrices and the Needleman-Wunsch algorithm as implemented by the pairwise alignment function from the R Biostrings package version 2.38.2 [26]. The symmetric matrix containing the scores for the pairwise sequence alignments, SC(i,j), was converted into a similarity matrix taking into account the background values for each sequence, following a procedure similar to the Context Likelihood of Relatedness (CLR) algorithm used to detect spurious associations in transcriptional or metabolite association networks [27, 28]. Briefly, the likelihood of SC(i,j) is estimated using a null model built from all the alignment scores involving sequences i and j independently, SC_i and SC_j, respectively. The background score is approximated as a joint normal distribution with SC_i and SC_j treated as independent variables. The final form of the likelihood estimate is:
$$ f\left(z_i, z_j\right) = \sqrt{z_i^2 + z_j^2} $$
(1)
where
$$ z_i = \max\left(0,\ \frac{SC\left(i,j\right) - \mu_i}{\sigma_i}\right) $$
(2)
and μ_i and σ_i are, respectively, the mean and the standard deviation of the empirical distribution of SC(i,k) with k = 1,…,n, and n the total number of considered sequences; z_j is defined analogously from μ_j and σ_j. The similarity estimate is then a matrix with entries f(z_i, z_j). The similarity estimate was normalized, by dividing by its highest value, for use in multidimensional scaling (MDS) plots, clustering and heatmap reconstruction using the R gplots library [29].
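The conversion from alignment scores to a normalized similarity matrix can be sketched as follows. This is an illustrative Python re-implementation, not the R code used in the study. It assumes the CLR form in which the two clipped z-scores are combined as the square root of the sum of their squares, that each row background includes every score in that row (the self-alignment included), and that the population standard deviation is used; the function name and the toy score matrix are hypothetical.

```python
import math
import statistics


def clr_similarity(scores):
    """Turn a symmetric alignment-score matrix into a CLR-style similarity matrix."""
    n = len(scores)
    # Background of sequence i: mean and SD of all scores in row i.
    means = [statistics.mean(row) for row in scores]
    sds = [statistics.pstdev(row) for row in scores]

    def z(i, j):
        # Score of pair (i, j) standardized against the background of i,
        # clipped at zero; a flat background (SD = 0) contributes nothing.
        if sds[i] == 0:
            return 0.0
        return max(0.0, (scores[i][j] - means[i]) / sds[i])

    sim = [[math.hypot(z(i, j), z(j, i)) for j in range(n)] for i in range(n)]
    top = max(max(row) for row in sim)
    if top == 0:
        return sim
    # Normalize by the largest entry so that all values fall in [0, 1].
    return [[value / top for value in row] for row in sim]


# Toy symmetric score matrix for three sequences (made-up values).
toy_scores = [[30, 10, 2],
              [10, 28, 4],
              [2, 4, 25]]
for row in clr_similarity(toy_scores):
    print([round(value, 3) for value in row])
```

Clipping negative z-scores at zero means that only pairs scoring above the background of both sequences receive a non-zero similarity.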
Docking studies
Known biomarkers of breast cancer were selected from a search of the literature and databases (see Additional file 3: Table S4). Information on the biomarkers found was retrieved from the Kyoto Encyclopedia of Genes and Genomes (KEGG) for pathway and function analysis, Uniprot for protein characterization and amino acid sequences, GenBank for gene sequences, and the Protein Data Bank (PDB) for three-dimensional protein structures [30]. When protein structures were not available, they were predicted using the PHYRE2 software [31], and the peptide structures were predicted using PEPstrMOD [32, 33]. The resulting pdb files were used for protein-peptide docking with ClusPro 2.0 [34, 35], in all available models, of the peptide sequences identified by phage display against the three-dimensional structures of the breast cancer biomarkers. The weighted score (E) was obtained by:
$$ E = 0.40\,E_{\mathrm{rep}} - 0.40\,E_{\mathrm{att}} + 600\,E_{\mathrm{elec}} + 1.00\,E_{\mathrm{DARS}} \quad \left(\text{Balanced coefficients}\right) $$
(3)
$$ E = 0.40\,E_{\mathrm{rep}} - 0.40\,E_{\mathrm{att}} + 1200\,E_{\mathrm{elec}} + 1.00\,E_{\mathrm{DARS}} \quad \left(\text{Electrostatic-favored coefficients}\right) $$
(4)
$$ E = 0.40\,E_{\mathrm{rep}} - 0.40\,E_{\mathrm{att}} + 600\,E_{\mathrm{elec}} + 2.00\,E_{\mathrm{DARS}} \quad \left(\text{Hydrophobic-favored coefficients}\right) $$
(5)
$$ E = 0.40\,E_{\mathrm{rep}} - 0.10\,E_{\mathrm{att}} + 600\,E_{\mathrm{elec}} + 0.00\,E_{\mathrm{DARS}} \quad \left(\text{Van der Waals and Electrostatic coefficients}\right) $$
(6)
where the lowest energy corresponds to the strongest predicted binding. The three-dimensional model structures obtained were visualized using UCSF Chimera version 1.10.2 [36]. Alignments were scored using BLOSUM45, 50 and 62 matrices.
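The four coefficient sets of Eqs. 3 to 6 differ only in their weights, so the weighted score can be written as a single function. The sketch below is not ClusPro code: ClusPro computes these scores itself, and the function names, dictionary keys and energy values here are hypothetical, with the energy terms assumed to be already available as numbers.

```python
# Weights (rep, att, elec, DARS) as listed in Eqs. 3-6.
COEFFICIENTS = {
    "balanced": (0.40, -0.40, 600.0, 1.00),
    "electrostatic_favored": (0.40, -0.40, 1200.0, 1.00),
    "hydrophobic_favored": (0.40, -0.40, 600.0, 2.00),
    "vdw_elec": (0.40, -0.10, 600.0, 0.00),
}


def weighted_score(e_rep, e_att, e_elec, e_dars, scheme="balanced"):
    """Weighted score E for one docked model under the chosen coefficient set."""
    w_rep, w_att, w_elec, w_dars = COEFFICIENTS[scheme]
    return w_rep * e_rep + w_att * e_att + w_elec * e_elec + w_dars * e_dars


def best_model(models, scheme="balanced"):
    """Name of the model with the lowest (most favorable) weighted score."""
    return min(models, key=lambda name: weighted_score(*models[name], scheme=scheme))


# Made-up energy terms (rep, att, elec, DARS) for two docked models.
example_models = {
    "model_0": (10.0, 50.0, -0.5, -400.0),
    "model_1": (12.0, 40.0, -0.2, -350.0),
}
print(best_model(example_models, "balanced"))
```

Because the lowest score is taken as the strongest predicted binding, ranking reduces to taking the minimum over the models for a given coefficient set.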
Statistical analysis
GraphPad Prism 5.03 (GraphPad Software, Inc.) was used for statistical analysis of the data. The significance of differences was evaluated using one-way ANOVA with Tukey’s multiple comparison test, considering a confidence level of 95% (significance level of 0.05).