Development of novel real-time PCR methodology for quantification of COL11A1 mRNA variants and evaluation in breast cancer tissue specimens
BMC Cancer volume 15, Article number: 694 (2015)
Collagen XI is a key structural component of the extracellular matrix and consists of three alpha chains. One of these chains, the α1 (XI), is encoded by the COL11A1 gene and is transcribed into at least four different variants (A, B, C and E) that differ in their propensity for N-terminal domain proteolysis and potentially in the way the extracellular matrix is arranged. This could affect the ability of tumor cells to invade the remodeled stroma and metastasize. No study in the literature has so far investigated the expression of these four variants in breast cancer, nor does a method for their accurate quantitative detection exist.
We developed a conventional PCR for the detection of the general COL11A1 transcript and real-time qPCR methodologies with dual hybridization probes on the LightCycler platform for the quantitative determination of the variants. Data from 90 breast cancer tissues with known histopathological features were collected.
The general COL11A1 transcript was detected in all samples. The developed methodologies for each variant were rapid as well as reproducible, sensitive and specific. Variant A was detected in 30 samples (33 %) and variant E in 62 samples (69 %). Variants B and C were not detected at all. A statistically significant correlation was observed between the presence of variant E and lymph nodes involvement (p = 0.037) and metastasis (p = 0.041).
With the newly developed tools, the possibility of inclusion of COL11A1 variants as prognostic biomarkers in emerging multiparameter technologies examining tissue RNA expression should be further explored.
Breast cancer is the most frequent cancer among women in both more and less developed world regions and the second most commonly occurring form of cancer globally when both sexes are counted. The search for new prognostic and predictive tissue biomarkers is considered imperative for improving classification of this common type of cancer and for avoiding excessive and unnecessary exposure to toxic and ineffective treatments.
One such biomarker could be collagen, as it is a key structural component of the extracellular matrix (ECM) that also serves as a modulator of diverse signaling pathways. Collagen XI belongs to the minor fibrillar subcategory of the collagen family and is responsible for the proper conformation of collagen II and the formation of thin fibrils in developing or remodeling tissues. Its highest expression values have been found in the articular cartilage and vitreous humor [2, 3]. It is a heterotrimeric protein, consisting of three alpha chains (a1, a2 and a3) that are organized into a triple helix formation. Both a1(XI) and a2(XI) chains are unique gene products; however, a3(XI) is a hyperglycosylated version of the collagen a1(II) chain [4, 5]. The a1(XI) chain is encoded by the gene COL11A1 located at genomic locus 1p21.1. It is initially synthesized as procollagen XI and then its C and N termini may be cleaved by proteolysis as soon as it is secreted from the cell. The molecule of the a1(XI) chain has a characteristic globular N-terminal domain (NTD) consisting of a variable region and an amino-propeptide (Npp) that seems responsible for the steric hindrance exerted by collagen XI on other molecules in the ECM [7, 8]. Therefore, when collagen a1(XI) protein is overexpressed, as has been shown in human ascending thoracic aortic aneurysms, it leads to thinner collagen fibers and decreased tensile strength in the tissue.
It has also been demonstrated that the expression of collagens is altered in neoplasms, a fact that could affect the ability of tumor cells to break through the basal membrane and initiate local or distant metastases [10–12]. COL11A1 upregulation in tumor tissue versus normal tissue has been demonstrated in gastric cancer, non-small cell lung cancer [14, 15] and pancreatic cancer, and this expression has been associated with metastasis in oral cavity and oropharynx, ovarian and lung cancer. In ovarian cancer, it leads to a stromal desmoplastic reaction in cancer-associated fibroblasts, a feature that is associated with the epithelial-to-mesenchymal transition (EMT) phenotype. In a significant study for breast cancer, COL11A1 is shown to be significantly upregulated in infiltrating tumor lesions compared to their in situ compartments and adjacent stroma. In another study, though, collagen a1(XI) appears to be downregulated in stroma surrounding breast cancer and also in metastasized tumors. In addition, COL11A1 is differentially expressed between primary breast cancers that metastasize and their corresponding lymph node sites, where its expression seems to be no longer needed [22, 23]. The detection of such quantitative changes in COL11A1 expression could lead to novel approaches regarding prognostic and/or predictive tools for breast cancer.
The COL11A1 gene consists of 67 exons and, due to alternative splicing of four exons (6, 7, 8 and 9), at least eight different variants can potentially be produced during its transcription [24–26]. Four different splicing variants of COL11A1 mRNA, termed A, B, C and E (Fig. 1), have been deposited in GenBank (Table 1) and are known to differ in their propensity for NTD proteolysis and potentially in the way the extracellular matrix is arranged. No study in the literature has so far investigated the expression of the four known variants in breast cancer (or in cancer in general), nor does a method for their accurate quantitative detection exist.
In our study we validated novel, specific and sensitive real-time qPCR (quantitative Polymerase Chain Reaction) methodologies for COL11A1 mRNA variants on the LightCycler platform and obtained quantitative data for their distribution in breast tumors. Furthermore, we sought to determine whether differential expression of these COL11A1 splice variants correlates with tumor histopathological parameters and patient follow-up data, in order to explore the possibility of their inclusion as prognostic biomarkers in emerging multiparameter technologies examining tissue RNA expression (analogous to Oncotype, MammaPrint, HOXB13: IL17BR and molecular grade index 8-gene panel, Endopredict and PAM50) [28–32].
Ninety tissue specimens were collected from the Pathologic Anatomy Laboratory of Evgenidio Hospital from consecutive female breast cancer patients residing mostly in the Athens Metropolitan area during the period 2007–2011. Main criteria were the availability of the material, the presence of >70 % of tumor cells in the frozen section and the written informed consent of the patients (family history was not used as a criterion for inclusion in the study). The study was approved by both bioethics and scientific committees of the Evgenidio Hospital. Most of the specimens originated from lumpectomies and the mean size was 2.0 cm (range: 1.0–5.5 cm). A small part of the resected specimens at surgery was immediately stored in RNAlater (Life Technologies Ambion, USA) for 1–2 days at 4 °C and then stored at −80 °C until total RNA extraction for molecular collagen analysis. The larger part of the resected specimens was embedded in formalin-fixed paraffin blocks and used for histopathological examinations. The majority of the tumors (80 %) were ductal infiltrating carcinomas (the rest lobular mostly, papillary and mucinous) and were classified according to the Bloom-Richardson grading system as grade 1 (3 samples), grade 2 (57 samples) and grade 3 (22 samples). Grades 1 and 2 were grouped together because of the small number of grade 1 tumors. The presence or absence of estrogen and progesterone hormone receptors was investigated with routine immunohistochemistry (IHC) and positivity was defined as a score >1 in IHC. Oncogene HER2 overexpression was examined with IHC and when the score was 2 in the 0–3 scale, it was further examined with chromogenic in situ hybridization (CISH). Therefore, we were able to dichotomize all samples as being either HER2 negative or positive. Classification into the triple negative breast cancer (TNBC) category was assigned if a tumor was negative for estrogen and progesterone hormone receptors and HER2 overexpression. 
Lymph node involvement was also noted and the presence of any recurrences or metastasis was recorded for those patients with follow-up data. The characteristics of the 90 tissues and patients with breast cancer are summarized in Table 2.
Total RNA Isolation
Total RNA was extracted with the use of the NucleoSpin RNA kit (Macherey-Nagel, Germany) after passing the liquid N2-snap frozen tissues through special filter columns (shredders) in order to homogenize them and to reduce viscosity. DNA was removed by an in-column recombinant DNase treatment. Total RNA was eluted in RNase-free water and stored at −80 °C until further use. RNA concentration was measured with the Quant-iT RNA Assay kit in the Qubit 1.0 fluorometer (Life Technologies Invitrogen, USA), which employs a dye specific for RNA and not for DNA.
Complementary DNA Synthesis
cDNA was synthesized from 1 μg of total RNA and random hexamers in a 20 μL total volume, according to the Transcriptor First Strand cDNA Synthesis kit (Roche Applied Science, Switzerland) instructions. Synthesis was organized in large batches and appropriate controls were added: a no-RNA blank (RNA−) control, a Reverse Transcriptase-negative (RT−) control and a 100 ng RNA-positive (RNA+) control for the Porphobilinogen deaminase (PBGD) gene provided by the kit. The cDNA samples were then stored at −20 °C. In order to test the quality and purity of RNA samples, the resulting cDNA was amplified in a control PCR method of the actin reference gene as previously described. cDNA samples that are free of genomic DNA produce a unique fragment of 587 base pairs (bp) (and not the additional fragment of 1122 bp that appears if genomic DNA is present). The efficiency of cDNA synthesis was also examined with conventional PCR for the PBGD gene with primers provided by the kit: the same intensity of a 151 bp band was obtained each time for the RNA+ control (many tumor cDNA samples were also run alongside as an additional control of quality and purity of the RNA samples).
Conventional PCR for the general COL11A1 transcript
In order to detect the presence or absence of the general COL11A1 transcript, a simple conventional PCR was developed. Suitable primers, common to all splice variants of the COL11A1 gene and located in a well conserved region, were designed by using the CLC Free Workbench version 4 software (Qiagen Bioinformatics, Aarhus, Denmark). The primers shown in Table 3 are located in the junction of exons 48/49 and in exon 51, respectively. For each reaction, 1.5 μL of cDNA was placed in a 23.5 μL reaction mixture containing 12.5 μL of BioMix Red DNA polymerase (Bioline, Germany), 1.5 μL of the supplied MgCl2 (50 mM), 1 μL of the primers (final concentration: 0.04 pmol/μL) and ddH2O. The cycling protocol consisted of an initial 4-min denaturation step at 94 °C, followed by 40 cycles of denaturation at 94 °C for 30 s, annealing at 57 °C for 30 s, extension at 72 °C for 30 s and a final 5 min extension step at 72 °C. The proper product size of 132 bp was checked by electrophoresis of 10 μL of PCR product on a 2 % w/v agarose gel along with a molecular weight (MW) marker (PCR Marker, New England Biolabs, USA), staining with ethidium bromide and visualization under ultraviolet (UV) light.
Real-time quantitative PCR methodology for the COL11A1 variants detection
For the quantification of COL11A1 transcript variants, suitable pairs of primers and hybridization sets of dual probes (labeled with fluorescein donor and LC-Red 640 acceptor dyes) were designed by aligning the mRNA sequences of all four variants in the CLC Free Workbench version 4 program in order to select non-homologous regions for their binding. The choice of the primers was based on the presence or absence of exons 6, 7, 8 and 9, a combination that is unique to each variant. Transcripts A and C employ a common set of dual probes for their detection but different primers; the same strategy is used for B and E transcripts (Fig. 1). The sequences of primers and probes synthesized by TIB MOLBIOL (Germany) are shown in Table 3.
Real-time quantitative PCR was performed with the LightCycler 1.5 platform (Roche Applied Science) in glass capillaries in a total volume of 10 μL. For transcript variant A, 1 μL of the sample cDNA was added to 0.3 μL of the forward primer VARAC F (final concentration: 0.6 pmol/μL), 0.1 μL of the reverse primer VARAEB R (final concentration: 0.2 pmol/μL), 0.6 μL of the probe VARAC FL (final concentration: 0.18 μM), 0.6 μL of the probe VARAC LC (final concentration: 0.18 μM), 2 μL of 25 mM MgCl2 (Roche, final concentration: 5 mM), 1 μL of the LightCycler FastStart DNA Master HybProbe 10× reagent (Roche Applied Science) and ddH2O to the final volume (for variant C, the VARC R primer is used instead of VARAEB R). For transcript variant E, 1 μL of the sample cDNA was added to 0.3 μL of the forward primer VARE F (final concentration: 0.6 pmol/μL), 0.1 μL of the reverse primer VARAEB R (final concentration: 0.2 pmol/μL), 0.5 μL of the probe VAREB FL (final concentration: 0.15 μM), 0.5 μL of the probe VAREB LC (final concentration: 0.15 μM), 1.2 μL of 25 mM MgCl2 (Roche, final concentration: 3 mM), 0.6 μL of DMSO, 1 μL of the LightCycler FastStart DNA Master HybProbe 10× reagent and ddH2O to the final volume (for variant B, the VARB F primer is used instead of VARE F). All reactions were initiated with a 10-min denaturation at 95 °C and terminated with a 30 s cooling step at 40 °C. The cycling protocol consisted of a denaturation step at 95 °C for 10 s, annealing for 30 s at 52 °C (variant A) or 50 °C (variant E) and extension at 72 °C for 30 s, repeated for 42 cycles. In each run, standards, blank samples and positive control samples (confirmed by DNA sequencing analysis) were included alongside the unknown samples. Fluorescence detection was performed at the end of each extension step for 0 s at the F1 channel.
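The final concentrations quoted for the reaction components follow from simple dilution arithmetic (final concentration = stock concentration × volume added / total volume). The short Python sketch below illustrates this for the two MgCl2 volumes, using only the 25 mM stock and the 10 μL reaction volume stated in the text; the function name is an illustrative choice and not part of the original protocol.

```python
def final_concentration(stock_conc, added_volume_ul, total_volume_ul=10.0):
    """Return the concentration after dilution, in the same units as stock_conc."""
    if total_volume_ul <= 0 or added_volume_ul < 0:
        raise ValueError("volumes must be non-negative and the total must be positive")
    return stock_conc * added_volume_ul / total_volume_ul

# MgCl2 stock is 25 mM; 2 uL is added for variant A and 1.2 uL for variant E.
mgcl2_variant_a = final_concentration(25.0, 2.0)   # mM
mgcl2_variant_e = final_concentration(25.0, 1.2)   # mM
print(f"{mgcl2_variant_a:.1f} mM, {mgcl2_variant_e:.1f} mM")  # 5.0 mM, 3.0 mM
```

The same helper can be used to cross-check any other component of the mix once its stock concentration is known.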
For quantification, an external standard curve was obtained by using the transcript variant PCR amplicon standards (prepared as described below) and plotting the log number of copies corresponding to each standard versus the value of their corresponding quantification cycle (Cq). Real-time qPCR products were additionally checked: i) for size and purity by inversion of the glass capillaries and electrophoresis on 2 % w/v agarose gels (the expected PCR product sizes are provided in the last column of Table 1) and ii) for nucleotide composition. Sanger DNA sequencing was performed after a PCR product column clean-up (NucleoSpin Gel and PCR Clean-up kit, Macherey-Nagel, Germany) and a cycle sequencing reaction employing the Big Dye 1.1 reagent (Life Technologies Applied Biosystems, USA). The electropherograms from the ABI Prism 310 Genetic Analyzer were manually base-called with the Chromas Lite 2.01 software (Technelysium Pty, Tewantin, Australia) and compared with the expected sequence with the BLAST tool of PubMed. The Tm values of the amplicons were also determined immediately after amplification, by melting curve analysis performed in the LightCycler. The melting curve protocol included raising the temperature to 95 °C, cooling to 55 °C for 15 s and slow heating to 95 °C at a rate of 0.1 °C/s, during which time fluorescence measurements were continuously collected in the F2 channel and their first derivative (−d(F2)/dT vs. T) was used for the determination of Tm.
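The external standard curve approach can be sketched in a few lines of Python: Cq is regressed on log10(copies) for the standards, and the fitted line is inverted to estimate the copies in an unknown sample. This is only an illustration under stated assumptions; the function names are hypothetical, and the Cq values are generated from an idealized line rather than taken from the study (the LightCycler software performs its own fit).

```python
import math

def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies for an unknown sample."""
    return 10 ** ((cq - intercept) / slope)

# Hypothetical standards: a tenfold dilution series whose Cq values are
# generated from an ideal line (slope -3.32, intercept 37.0). Not study data.
standards = [5e5, 5e4, 5e3, 5e2, 5e1]
cqs = [37.0 - 3.32 * math.log10(c) for c in standards]

slope, intercept = fit_standard_curve(standards, cqs)
unknown = copies_from_cq(25.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={unknown:.0f}")
```

The estimate is expressed in the same units as the standards (here copies/μL), so any further conversion, for example to copies per μg of total RNA, has to be applied separately.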
To establish specific, sensitive and reproducible real-time quantitative assays, we performed extensive optimization of primers, probes and MgCl2 concentrations as well as of the reaction temperatures and cycles. The analytical evaluation of the assays was performed with the prepared standards. For each splice variant detected in our samples, a calibration curve was generated from serial dilutions ranging from 5 × 10⁵ to 5 × 10¹ copies/μL for variant A and from 5 × 10⁶ to 5 × 10¹ copies/μL for variant E. The reproducibility (calculated as coefficients of variation, CVs), the efficiency of the PCR reaction (expressed as E = 10^(−1/slope)) and the limit of detection for our assays (defined as the concentration detected in 95 % of trials) were also determined in order to complete the validation file of the novel methodologies in accordance with the established MIQE guidelines.
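As a worked illustration of the efficiency expression, the sketch below computes the per-cycle amplification factor from a standard-curve slope. The function name is an illustrative choice; the formula itself is the one stated in the text, with an ideal value of 2 corresponding to perfect doubling in each cycle.

```python
def amplification_efficiency(slope):
    """Per-cycle amplification factor from the slope of Cq versus log10(copies)."""
    if slope >= 0:
        raise ValueError("a standard-curve slope is expected to be negative")
    return 10 ** (-1.0 / slope)

ideal_slope = -3.3219  # approximately -1/log10(2): perfect doubling each cycle
print(round(amplification_efficiency(ideal_slope), 2))  # 2.0
```

Applied to the mean slopes reported in the Results (−3.22 for variant A and −3.66 for variant E), the same formula gives roughly 2.04 and 1.88, close to the reported mean efficiencies, which were averaged over individual runs.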
Preparation of the standards
For the development and analytical evaluation of our assays, we generated and used as standards PCR amplicons corresponding to the COL11A1 splice variants studied. For this purpose, a sufficient amount of amplicon was produced by running many PCR reactions on the same cDNA preparation from a sample positive for each variant. The amplicons were pooled, purified by columns and quantitated by the Quant-iT dsDNA Broad-Range Assay kit (Life Technologies Invitrogen, USA) in the Qubit 1.0 fluorometer. The concentration was converted to copies per microliter by using the Avogadro constant and the product’s molecular weight (number of base pairs of the PCR product multiplied by the average molecular weight of a base pair, which is 660), as described elsewhere. Then, serial dilutions of the above-quantified stock amplicon solutions were prepared for each variant and kept in aliquots at −20 °C; they were used throughout the study as external standards for the absolute quantification of COL11A1 transcript variants.
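The conversion from a measured dsDNA concentration to copies per microliter can be written out as a short calculation. The Python sketch below is a minimal illustration of that arithmetic; the function name and the 10 ng/μL stock concentration are hypothetical, while the 660 g/mol per base pair and the 439 bp product size of variant A come from the text.

```python
AVOGADRO = 6.022e23      # molecules per mole
BP_MOLAR_MASS = 660.0    # g/mol per base pair of dsDNA (value used in the text)

def copies_per_ul(conc_ng_per_ul, amplicon_length_bp):
    """Convert a dsDNA concentration (ng/uL) to amplicon copies per microliter."""
    if amplicon_length_bp <= 0:
        raise ValueError("amplicon length must be positive")
    grams_per_ul = conc_ng_per_ul * 1e-9
    molar_mass = amplicon_length_bp * BP_MOLAR_MASS  # g/mol for one amplicon
    return grams_per_ul / molar_mass * AVOGADRO

# 439 bp is the variant A product size reported in the Results;
# the 10 ng/uL stock concentration is a hypothetical value for illustration.
print(f"{copies_per_ul(10.0, 439):.2e} copies/uL")
```

A stock quantified in this way would then be diluted serially to cover the calibration range of the assay.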
Normalization mitigates experimental problems concerning the inherent variability of RNA expression levels, the variability of extraction protocols and the presence of inhibitors. In our assay, we ensured that the starting tissue material for RNA extraction had similar initial size and weight (approximately 30 mg) and we performed normalization against the same amount of total RNA (1 μg) that was used for cDNA synthesis in all samples, as suggested by previous studies [36–38].
The COL11A1 variants were analyzed statistically both in a qualitative way (presence or absence of the variant), with either the Pearson χ² or Fisher’s exact test, and in a quantitative way: the positive samples were divided into two categories (high or low) depending on whether their copies were above or below a certain percentile value of copies (e.g. the 25th, the 50th or median, the 75th) and 2 × 2 cross-tabulations were performed. The median copy values of the low and high categories were also compared within each category of the clinicopathological characteristics examined (all divided into two categories as well) with the Mann–Whitney U test for continuous variables that are non-normally distributed (as determined with the Kolmogorov-Smirnov test). The Spearman correlation coefficient was used as a measure of correlation for continuous non-normally distributed variables. Probit statistical analysis was used for estimation of the limit of detection of our novel assays. The association of COL11A1 transcript variants with long-term metastasis was analyzed with the Kaplan-Meier method and survival curves were compared with the log-rank test. For all tests performed, a two-sided p value of <0.05 was considered significant. Data analysis was carried out with the SPSS version 21.0 statistical software package for Windows (IBM - SPSS Inc., USA).
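To make the qualitative 2 × 2 analysis concrete, the following Python sketch computes a two-sided Fisher's exact p value from the hypergeometric distribution. It is a minimal standard-library illustration, not the SPSS procedure used in the study, and the example counts are hypothetical rather than taken from the study tables.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(k) for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts (not study data): rows = variant present/absent,
# columns = lymph nodes involved/not involved.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

With small expected cell counts, as in this toy table, the exact test is generally preferred over the χ² approximation.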
Conventional PCR for the general COL11A1 transcript
All extracted RNAs were of adequate quantity -as measured in the fluorometer- and quality as they produced a single pure actin band in the gels. The general COL11A1 transcript was detected in all samples (Additional file 1: Figure S1) as revealed from a distinct 132 bp band in all PCR products.
Development, analytical and clinical evaluation of the real-time qPCR methodology for the COL11A1 variants detection
The real-time qPCR methodologies were rapid and specific, as can be seen in Additional file 2: Figure S2 and Additional file 3: Figure S3, where the real-time PCR products from positive cDNA samples were extracted and run on a 2 % w/v agarose gel: variants A and E produced the expected bands at sizes of 439 and 259 bp, respectively. Portions of the Sanger DNA sequencing electropherograms of transcripts A and E are shown in Additional file 4: Figure S4 and Additional file 5: Figure S5 and align fully with the variant sequences deposited in GenBank. Variants B and C were not detected in any tumor cDNA sample; therefore, no further validation procedures were performed for these two transcripts.
The analytical sensitivity and linearity of the proposed COL11A1 A and E transcript real-time qPCR assays were determined by using the external standards of each variant with known concentrations that were prepared as described above. Our standard curves showed linearity over the entire quantification range (5 × 10⁵ to 5 × 10¹ variant A copies/μL and 5 × 10⁶ to 5 × 10¹ variant E copies/μL) while the correlation coefficients were about 0.99 in all cases, indicating a precise log–linear relationship (Figs. 2 and 3). The mean slope and intercept of the standard curve of variant A were −3.22 ± 0.19 and 36.81 ± 0.52 respectively (n = 5), while the PCR reaction efficiency was 2.05 ± 0.04 (CV % = 1.97), very close to the ideal value of 2.00. For variant E, the mean slope and intercept of the standard curve were −3.66 ± 0.34 and 41.80 ± 2.49 respectively (n = 5), while the efficiency was 1.88 ± 0.10 (CV % = 5.39). The between-run CVs for the Cq values of the standards, analyzed in five different experiments over a period of 1 month, ranged from 0.78 to 1.84 % for variant A and from 2.62 to 3.88 % for variant E. The analytical limit of detection as determined from probit statistical analysis was 19 copies/μL for variant A and 16 copies/μL for variant E. The Tm of all positive variant A amplicons was calculated to be 69.9 (±1.0) °C, while the corresponding value for variant E was 65.3 (±1.2) °C (representative samples in Figs. 4 and 5).
Among the 90 breast cancer tissues investigated, variant A was detected in 30 tumor cDNA samples (33 %) and variant E in 62 (69 %). Characteristic amplification plots of tumor cDNA samples for COL11A1 variants A and E are shown in Figs. 6 and 7. In 28 samples, both A and E variants were detected (31 %) while in 26 samples, no variant was detected (29 %). For variant A, the mean value of copies for the positive samples was 7.58 × 10⁴ copies/μg of total RNA, while the median value was 3.28 × 10⁵ copies/μg of total RNA (range 2.36 × 10² to 6.85 × 10⁵ copies/μg of total RNA). For variant E, the mean value of copies for the positive samples was 3.56 × 10⁵ copies/μg of total RNA, while the median value was 4.97 × 10⁴ copies/μg of total RNA (range 3.51 × 10² to 3.86 × 10⁶ copies/μg of total RNA).
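The detection counts above are mutually consistent, which can be checked with a short inclusion-exclusion calculation. The Python snippet below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
total = 90
variant_a = 30   # samples positive for variant A
variant_e = 62   # samples positive for variant E
both = 28        # samples positive for both variants

either = variant_a + variant_e - both   # inclusion-exclusion
neither = total - either

for label, n in [("A", variant_a), ("E", variant_e), ("both", both), ("neither", neither)]:
    print(f"{label}: {n} ({round(100 * n / total)} %)")
```

The calculation gives 64 samples positive for at least one variant and 26 with neither, matching the reported 29 %.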
COL11A1 transcript variants expression in relation to clinicopathological features
Statistical results are shown in Tables 4, 5 and 6. In the qualitative analysis, a statistically significant correlation was observed between the presence of variant E and lymph node involvement (p = 0.037) and metastasis (p = 0.041) (Table 5). No association was detected with the other classical prognostic factors in breast cancer. Tumors that were positive for variant A and fell in the higher-copy number group at the 50th percentile showed a correlation with the better-prognosis lobular histopathological type (p = 0.042, Table 4). The two main findings of the qualitative analysis for variant E, lymph node involvement and metastasis, showed a trend when examined in the 25th percentile subcategories (p = 0.058 and p = 0.081 respectively, data not shown).
The simultaneous expression of variant A and variant E was significantly correlated with the older age group (p = 0.036, Table 6 left). Furthermore, the qualitative presence of either variant A or variant E presented a significant correlation with metastasis (p = 0.043, Table 6 right). There was also a statistically significant positive correlation between copies of variant A and copies of variant E (rho = 0.368, p = 0.050). We also examined the association of COL11A1 transcript variants with metastasis in the 55 patients for whom follow-up data were available by using Kaplan-Meier survival analysis. Patients with variant E present in their tumor showed a reduced disease-free interval compared to those not expressing it (p = 0.060, log-rank test, Fig. 8).
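For readers unfamiliar with the Kaplan-Meier method, the sketch below shows how the product-limit estimate of the disease-free fraction is built from follow-up times and event indicators. It is a minimal standard-library illustration with hypothetical follow-up data, not the SPSS analysis or the patient data of this study, and it does not include the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: returns [(time, survival)] at each event time.

    events[i] is True if the event (e.g. metastasis) occurred at times[i],
    and False if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve

# Hypothetical follow-up times in months and event flags (not study data).
times = [6, 12, 12, 20, 30]
events = [True, False, True, True, False]
for t, s in kaplan_meier(times, events):
    print(f"t={t}: S={s:.2f}")
```

In this toy example the estimated disease-free fraction drops to 0.80, 0.60 and 0.30 at 6, 12 and 20 months; censored patients reduce the number at risk without producing a step.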
The first goal of this study was the development and validation of new and reliable quantitative assays for all reported COL11A1 mRNA splice variants (A, B, C and E) by using real-time qPCR methods. With an additional simple conventional PCR technique, targeting a region common to all transcripts, we were still able to determine the presence or absence of the COL11A1 gene transcript in general. Furthermore, we applied these techniques to breast cancer tissues in order to use the obtained quantitative data to determine any existing significant correlation between the differential expression of COL11A1 variants and the clinicopathological features of these patients.
When 90 breast cancer tissues were studied, only A and E variants were encountered while the general COL11A1 transcript was present in all samples. Variant A was detected in 30 samples (33 %) and variant E in 62 (69 %). In 28 samples, both A and E variants were detected (31 %) while in 26 samples, no variant was detected (29 %). Variants B and C were not detected in our series of samples and hence, we were not able to validate the methodologies with the proposed combination of primers and probes. The quantification of variants A and E was performed with a real-time qPCR methodology on the LightCycler 1.5 thermocycler using dual hybridization probes and melting curve analysis at the end of each reaction. We performed optimization experiments by using isolated and quantified amplicons as external standards of the developed real-time qPCR assays for the A and E variants. The assays were rapid and reliable, demonstrating excellent efficiencies (2.05 ± 0.04 for variant A and 1.88 ± 0.10 for variant E), very good reproducibilities (CV ≤1.3 % for variant A and CV ≤3.2 % for variant E) and low detection limits (~19 copies/μL for variant A and ~16 copies/μL for variant E). The specificity of the real-time qPCR assays was tested by melting curve analysis (Tm of variant A amplicon was 69.9 (±1.0) °C while that of variant E was 65.3 (±1.0) °C), by the presence of specific bands of the proper size during electrophoresis of the real-time PCR products and finally, by DNA sequencing of the amplicons obtained. The determination was easy and rapid (within ~50 min) after the synthesis of the cDNA and it was possible to analyze up to 32 samples simultaneously. However, higher throughput is possible on larger platforms such as the LightCycler 480/1630, where the determinations are performed in microtiter plates and a much greater number of samples can be processed together.
Statistical analysis of the data was carried out in order to detect any existing significant correlation between the differential expression of the variants A and E (presence or absence, low or high number of copies) and the clinicopathological characteristics of the samples and the patients (such as age group, tumor size, histopathological type of tumor, lymph node involvement, grade, metastasis, hormone receptor status, HER2 oncogene overexpression, TNBC status). The copy numbers of variants A and E showed some positive correlation between them (rho = 0.368, p = 0.050) and their simultaneous expression was significantly correlated with the older age group (p = 0.036). We cannot exclude that this might reflect a more generalized defect in the splicing machinery with increased aging. The most important finding was the observed statistically significant correlation between the presence of variant E and lymph node involvement (p = 0.037) and metastasis (p = 0.041), which was corroborated by a trend in Kaplan-Meier analysis where the patients with variant E in their tissue showed a reduced disease-free interval (p = 0.060). Furthermore, the qualitative presence of either variant A or variant E showed a significant correlation with metastasis (p = 0.043). Results could probably be reinforced if follow-up data were available for all 90 patients with quantitative data on variants A and E and not only for 55 patients. No other association with established histopathological prognostic parameters was detected in our results. A working hypothesis, therefore, would be that the shorter isoform, produced from the translation of variant E mRNA, is more resistant to proteolysis by enzymes such as BMP-1 and could retain the bulky NTD domain for a longer time.
This could lead to a “thinner” collagenous stroma, more attractive to adhesion molecules and metalloproteinases (as NTD contains thrombospondin-1 like and heparin binding regions) and thus could pave the way for tumor cell motility and metastasis.
A limitation of our study is that we could not investigate quantitatively whether the breast tumor cells showed upregulation of the expression of variants compared to normal epithelial breast tissues. Also, we could not assign the expression to either the epithelial or the stromal compartment, as the specimens obtained were a mixture of these. Finally, regarding the group of breast tumor tissues examined, the tumors studied were relatively small (~2.0 cm) because they originated from well-monitored patients in a metropolitan area. During the total RNA isolation procedure, although the samples were placed directly into a reagent that preserves RNA stability (RNAlater), the presence of inhibitors in our fresh-frozen biopsy RNA preparations and their integrity were not assessed by assays such as the SPUD and the 5:3 ratio GAPDH (GlycerAldehyde 3-Phosphate DeHydrogenase) mRNA integrity tests. However, the RNA quality was tested with the actin reference gene and the RNA concentration was measured with the RNA-specific Quant-iT RNA Assay kit on the Qubit. Differences in cDNA synthesis efficiency due to tumor variability could not be assessed, since absolute quantification with normalization to total RNA was selected for data analysis (and not relative quantification with normalization to the expression of one reference gene or the average of three, as is the current trend).
This study was the first to assess the differential expression of COL11A1 A and E splice variants in breast cancer tissues and in cancer in general. We also attempted to detect the B and C variants, but there was no clear indication whether our assays failed or these transcripts were not present, since we did not possess any positive control. The existence of other variants is speculated: the fact that in 29 % of the cDNA samples no COL11A1 variant was detected, despite the presence of the general transcript, warrants future research aimed at identifying novel variants. Additionally, the general COL11A1 transcript could also be quantitated in a novel assay (e.g. multiplexed with A and/or E variants) in order to identify samples in which the A and/or E variant copies do not add up to the total COL11A1 transcript and which could therefore be hypothesized to contain additional aberrant transcripts.
The study could also be extended to a larger number of breast cancer tissues and a significant number of normal tissues so that it could verify the results of earlier studies in relation to increased or no expression of COL11A1 mRNA and its variants in breast cancer. In this case, it may be possible to include the COL11A1 gene and/or its variants in new improved prognostic multiparameter expression arrays for predicting metastasis. This information would be useful for the 20–30 % of lymph node positive breast cancer patients that remain free of distant metastasis for 15–30 years but still receive toxic chemotherapy. It is expected that new tools such as deep RNA Sequencing with Next Generation Sequencing (NGS) platforms could assist in the discovery of such new aberrant transcripts in tumor RNA samples.
By employing polyclonal antibodies against various epitopes in the NTD, which are now available at a research level [21, 41], it should be possible to further validate our assays of COL11A1 RNA variants and to evaluate findings on the differential proteolysis of the N-terminal regions of the collagen α1(XI) chain in breast cancer and their involvement in tissue remodeling through stereochemistry. The combined use of laboratory tools such as qPCR and Western blot would lead to the validation of antibodies suitable for routine IHC in paraffin-embedded tissues. It would also be useful to evaluate the expression of COL11A1 variants in other cancers such as oropharyngeal, ovarian and lung cancer, in which the expression of COL11A1 has been shown to be associated with disease progression.
CISH: Chromogenic in situ hybridization
CV: Coefficient of variation
qPCR: Quantitative polymerase chain reaction
Tm: Melting point temperature
TNBC: Triple negative breast cancer
Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer. 2015;136(5):E359–86.
Mendler M, Eich-Bender SG, Vaughan L, Winterhalter KH, Bruckner P. Cartilage contains mixed fibrils of collagen types II, IX, and XI. J Cell Biol. 1989;108(1):191–7.
Bernard M, Yoshioka H, Rodriguez E, Van der Rest M, Kimura T, Ninomiya Y, et al. Cloning and sequencing of pro-alpha 1 (XI) collagen cDNA demonstrates that type XI belongs to the fibrillar class of collagens and reveals that the expression of the gene is not restricted to cartilagenous tissue. J Biol Chem. 1988;263(32):17159–66.
Morris NP, Bachinger HP. Type XI collagen is a heterotrimer with the composition (1 alpha, 2 alpha, 3 alpha) retaining non-triple-helical domains. J Biol Chem. 1987;262(23):11345–50.
Burgeson RE, Hollister DW. Collagen heterogeneity in human cartilage: identification of several new collagen chains. Biochem Biophys Res Commun. 1979;87(4):1124–31.
Thom JR, Morris NP. Biosynthesis and proteolytic processing of type XI collagen in embryonic chick sterna. J Biol Chem. 1991;266(11):7262–9.
Fallahi A, Kroll B, Warner LR, Oxford RJ, Irwin KM, Mercer LM, et al. Structural model of the amino propeptide of collagen XI alpha1 chain with similarity to the LNS domains. Protein Sci. 2005;14(6):1526–37.
Oxford JT, DeScala J, Morris N, Gregory K, Medeck R, Irwin K, et al. Interaction between amino propeptides of type XI procollagen alpha1 chains. J Biol Chem. 2004;279(12):10939–45.
Toumpoulis IK, Oxford JT, Cowan DB, Anagnostopoulos CE, Rokkas CK, Chamogeorgakis TP, et al. Differential expression of collagen type V and XI alpha-1 in human ascending thoracic aortic aneurysms. Ann Thorac Surg. 2009;88(2):506–13.
Fischer H, Stenling R, Rubio C, Lindblom A. Colorectal carcinogenesis is associated with stromal expression of COL11A1 and COL5A2. Carcinogenesis. 2001;22(6):875–8.
Banyard J, Bao L, Hofer MD, Zurakowski D, Spivey KA, Feldman AS, et al. Collagen XXIII expression is associated with prostate cancer recurrence and distant metastases. Clin Cancer Res. 2007;13(9):2634–42.
Misawa K, Kanazawa T, Imai A, Endo S, Mochizuki D, Fukushima H, et al. Prognostic value of type XXII and XXIV collagen mRNA expression in head and neck cancer patients. Mol Clin Oncol. 2014;2(2):285–91.
Zhao Y, Zhou T, Li A, Yao H, He F, Wang L, et al. A potential role of collagens expression in distinguishing between premalignant and malignant lesions in stomach. Anat Rec. 2009;292(5):692–700.
Wang KK, Liu N, Radulovich N, Wigle DA, Johnston MR, Shepherd FA, et al. Novel candidate tumor marker genes for lung adenocarcinoma. Oncogene. 2002;21(49):7598–604.
Chong IW, Chang MY, Chang HC, Yu YP, Sheu CC, Tsai JR, et al. Great potential of a panel of multiple hMTH1, SPD, ITGA11 and COL11A1 markers for diagnosis of patients with non-small cell lung cancer. Oncol Rep. 2006;16(5):981–8.
Garcia-Pravia C, Galvan JA, Gutierrez-Corral N, Solar-Garcia L, Garcia-Perez E, Garcia-Ocana M, et al. Overexpression of COL11A1 by cancer-associated fibroblasts: clinical relevance of a stromal marker in pancreatic cancer. PLoS One. 2013;8(10):e78327.
Schmalbach CE, Chepeha DB, Giordano TJ, Rubin MA, Teknos TN, Bradford CR, et al. Molecular profiling and the identification of genes associated with metastatic oral cavity/pharynx squamous cell carcinoma. Arch Otolaryngol Head Neck Surg. 2004;130(3):295–302.
Wu YH, Chang TH, Huang YF, Huang HD, Chou CY. COL11A1 promotes tumor progression and predicts poor clinical outcome in ovarian cancer. Oncogene. 2013;33(26):3432–40.
Kim H, Watkinson J, Varadan V, Anastassiou D. Multi-cancer computational analysis reveals invasion-associated variant of desmoplastic reaction involving INHBA, THBS2 and COL11A1. BMC Med Genomics. 2010;3:51.
Vargas AC, McCart Reed AE, Waddell N, Lane A, Reid LE, Smart CE, et al. Gene expression profiling of tumour epithelial and stromal compartments during breast cancer progression. Breast Cancer Res Treat. 2012;135(1):153–65.
Halsted KC, Bowen KB, Bond L, Luman SE, Jorcyk CL, Fyffe WE, et al. Collagen alpha1(XI) in normal and malignant breast tissue. Mod Pathol. 2008;21(10):1246–54.
Feng Y, Sun B, Li X, Zhang L, Niu Y, Xiao C, et al. Differentially expressed genes between primary cancer and paired lymph node metastases predict clinical outcome of node-positive breast cancer patients. Breast Cancer Res Treat. 2007;103(3):319–29.
Ellsworth RE, Seebach J, Field LA, Heckman C, Kane J, Hooke JA, et al. A gene expression signature that defines breast cancer metastases. Clin Exp Metastasis. 2009;26(3):205–13.
Yoshioka H, Inoguchi K, Khaleduzzaman M, Ninomiya Y, Andrikopoulos K, Ramirez F. Coding sequence and alternative splicing of the mouse alpha 1(XI) collagen gene (Col11a1). Genomics. 1995;28(2):337–40.
Zhidkova NI, Justice SK, Mayne R. Alternative mRNA processing occurs in the variable region of the pro-alpha 1(XI) and pro-alpha 2(XI) collagen chains. J Biol Chem. 1995;270(16):9486–93.
Oxford JT, Doege KJ, Morris NP. Alternative exon splicing within the amino-terminal nontriple-helical domain of the rat pro-alpha 1(XI) collagen chain generates multiple forms of the mRNA transcript which exhibit tissue-dependent variation. J Biol Chem. 1995;270(16):9478–85.
Medeck RJ, Sosa S, Morris N, Oxford JT. BMP-1-mediated proteolytic processing of alternatively spliced isoforms of collagen type XI. Biochem J. 2003;376(Pt 2):361–8.
Sinn P, Aulmann S, Wirtz R, Schott S, Marme F, Varga Z, et al. Multigene Assays for Classification, Prognosis, and Prediction in Breast Cancer: a Critical Review on the Background and Clinical Utility. Geburtshilfe Frauenheilkd. 2013;73(9):932–40.
Habel LA, Sakoda LC, Achacoso N, Ma XJ, Erlander MG, Sgroi DC, et al. HOXB13:IL17BR and molecular grade index and risk of breast cancer death among patients with lymph node-negative invasive disease. Breast Cancer Res. 2013;15(2):R24.
Caan BJ, Sweeney C, Habel LA, Kwan ML, Kroenke CH, Weltzien EK, et al. Intrinsic subtypes from the PAM50 gene expression assay in a population-based breast cancer survivor cohort: prognostication of short- and long-term outcomes. Cancer Epidemiol Biomarkers Prev. 2014;23(5):725–34.
Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351(27):2817–26.
Poumpouridou N, Kroupis C. Hereditary breast cancer: beyond BRCA genetic analysis; PALB2 emerges. Clin Chem Lab Med. 2012;50(3):423–34.
Pavlidou A, Kroupis C, Goutas N, Dalamaga M, Dimas K. Validation of a Real-Time Quantitative Polymerase Chain Reaction Method for the Quantification of 3 Survivin Transcripts and Evaluation in Breast Cancer Tissues. Clin Breast Cancer. 2014;14(2):122–31.
Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, et al. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem. 2009;55(4):611–22.
Kroupis C, Stathopoulou A, Zygalaki E, Ferekidou L, Talieri M, Lianidou ES. Development and applications of a real-time quantitative RT-PCR method (QRT-PCR) for BRCA1 mRNA. Clin Biochem. 2005;38(1):50–7.
Nolan T, Hands RE, Bustin SA. Quantification of mRNA using real-time RT-PCR. Nat Protoc. 2006;1(3):1559–82.
Tricarico C, Pinzani P, Bianchi S, Paglierani M, Distante V, Pazzagli M, et al. Quantitative real-time reverse transcription polymerase chain reaction: normalization to rRNA or single housekeeping genes is inappropriate for human tissue biopsies. Anal Biochem. 2002;309(2):293–300.
Zygalaki E, Tsaroucha EG, Kaklamanis L, Lianidou ES. Quantitative real-time reverse transcription PCR study of the expression of vascular endothelial growth factor (VEGF) splice variants and VEGF receptors (VEGFR-1 and VEGFR-2) in non small cell lung cancer. Clin Chem. 2007;53(8):1433–9.
Warner LR, Brown RJ, Yingst SMC, Oxford JT. Isoform-specific Heparan Sulfate Binding within the Amino-terminal Noncollagenous Domain of Collagen α1(XI). J Biol Chem. 2006;281(51):39507–16.
Nolan T, Hands RE, Ogunkolade W, Bustin SA. SPUD: a quantitative PCR assay for the detection of inhibitors in nucleic acid preparations. Anal Biochem. 2006;351(2):308–10.
Bowen KB, Reimers AP, Luman S, Kronz JD, Fyffe WE, Oxford JT. Immunohistochemical localization of collagen type XI alpha1 and alpha2 chains in human colon tissue. J Histochem Cytochem. 2008;56(3):275–83.
We would like to express our gratitude to Ms. Tatiana Rizou for reading and commenting on our manuscript, Assoc. Prof. Kleanthi Dima for equipment provision, Prof. Evi Lianidou for critically reviewing the manuscript and for the decision to submit to BMC Cancer and, finally, all the patients who participated in the study. NP is supported by Grant NSRF HRAKLEITOS 70/3/10973 from the European Social Fund 2007–2013 (the funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript).
The authors declare that they have no competing interests.
MK participated in the conception and design of the study, carried out the assays, collected and assembled the data, performed the statistical analysis and drafted the manuscript. IT participated in the conception and design of the study, provided study material and edited the manuscript. NG, DV and SV provided study patients and material. NP participated in the assays and the collection of the data. IR provided study material. CK participated in the conception and design of the study, provided study material, performed the statistical analysis, interpreted the data, drafted and edited the manuscript. All authors have read and approved the final manuscript.
Conventional PCR products for the general COL11A1 transcript run on a 2 % w/v agarose gel: in lane 1 PCR MW Marker (50-150-300-500-766 bp), lanes 2–5 positive cDNA samples for the general transcript (132 bp), lane 6 blank. (JPEG 43 kb)
PCR products from inverted capillaries of positive tumor samples for COL11A1 splice variant A run on a 2 % w/v agarose gel: in lane 1 PCR MW Marker (50-150-300-500-766 bp), lane 2 blank, lanes 3–7 positive cDNA samples (439 bp). (JPEG 13 kb)
PCR products from inverted capillaries of positive tumor samples for COL11A1 splice variant E run on a 2 % w/v agarose gel: in lane 1 PCR MW Marker (50-150-300-500-766 bp), lane 2 blank, lanes 3–7 positive cDNA samples (259 bp). (JPEG 15 kb)
Sanger DNA Sequencing electropherogram from a positive amplicon for COL11A1 transcript variant A in a tumor cDNA sample. (JPEG 133 kb)
Sanger DNA Sequencing electropherogram from a positive amplicon for COL11A1 transcript variant E in a tumor cDNA sample. (JPEG 125 kb)