Association of rs2282679 A>C polymorphism in vitamin D binding protein gene with colorectal cancer risk and survival: effect modification by dietary vitamin D intake

Background The rs2282679 A>C polymorphism in the vitamin D binding protein gene is associated with lower circulating levels of vitamin D. We investigated associations of this SNP with colorectal cancer (CRC) risk and survival and whether the associations vary by dietary vitamin D intake and tumor molecular phenotype. Methods A population-based case-control study identified 637 incident CRC cases (including 489 participants with follow-up data on mortality end-points) and 489 matched controls. Germline DNA samples were genotyped with the Illumina Omni-Quad 1 Million chip in cases and the Affymetrix Axiom® myDesign™ Array in controls. Logistic regression examined the association between the rs2282679 polymorphism and CRC risk with inclusion of potential confounders. Kaplan-Meier curves and multivariable Cox models assessed the polymorphism relative to overall survival (OS) and disease-free survival (DFS). Results The rs2282679 polymorphism was not associated with overall CRC risk; there was evidence, however, of effect modification by total vitamin D intake (P interaction = 0.019). Survival analyses showed that the C allele was associated with poorer DFS (per-allele HR, 1.36; 95%CI, 1.05–1.77). The association of rs2282679 with DFS was limited to BRAF wild-type tumors (HR, 1.58; 95%CI, 1.12–2.23). For OS, the C allele was associated with higher all-cause mortality among patients with higher levels of dietary vitamin D (HR, 2.11; 95%CI, 1.29–3.74), calcium (HR, 1.93; 95%CI, 1.08–3.46), milk (HR, 2.36; 95%CI, 1.26–4.44), and total dairy product intakes (HR, 2.03; 95%CI, 1.11–3.72). Conclusion The rs2282679 SNP was not associated with overall CRC risk, but may be associated with survival after cancer diagnosis. The association of this SNP with survival among CRC patients may differ according to dietary vitamin D and calcium intakes and according to tumor BRAF mutation status.


Background
Colorectal cancer (CRC) is a complex, multifactorial disease resulting from multiple genetic and environmental factors [1]. Vitamin D from diet, supplements, and cutaneous synthesis from sunlight is associated with lower risks of CRC incidence [2][3][4][5] and mortality [6][7][8][9][10][11]. The anti-carcinogenic effects of vitamin D might vary by the vitamin D-binding protein (DBP) [12]. As the major carrier protein in systemic circulation, DBP reversibly binds and transports vitamin D metabolites to different target organs, including the colorectum, thereby influencing the bioavailability of active 25-hydroxyvitamin D (25(OH)D) [12,13]. Additionally, DBP is the precursor molecule of a potent macrophage-activating factor (GcMAF) [14], which is highly tumoricidal against various malignancies through its ability to inhibit endothelial angiogenesis [15,16] and stimulate the inflammation-primed phagocytic activity of tumoricidal macrophages [17]. Therefore, DBP is hypothesized to play an important role in CRC initiation and progression, either alone or in combination with vitamin D [18].
The gene encoding DBP, GC gene, is highly polymorphic. The single nucleotide polymorphism (SNP) rs2282679 A>C is one of the most commonly studied variants in this gene, which has been shown to be robustly correlated with serum levels of 25(OH)D in recent genome-wide association studies (GWAS) [19,20]; specifically, the C allele of this SNP is associated with lower levels of 25(OH)D [20]. Prior studies on GC variants have been performed on melanoma [21], prostate [22] and breast cancers [23,24]; however, we found only two studies that evaluated the specific association of the GC rs2282679 polymorphism with CRC risk, with both studies reporting no evidence of association [25,26]. Another study reported that the GC rs2282679 SNP was associated with prognosis for patients diagnosed with stages II and III colon cancer [27].
Microsatellite instability (MSI) and BRAF V600E hotspot mutation are important molecular classifiers in CRC, which define distinct CRC subgroups arising from different oncogenic pathways. Microsatellite unstable (MSI-H) tumors are generally associated with superior prognosis [28] whereas BRAF-mutated cancers are related to inferior survival [29,30]. Therefore, it is plausible that factors associated with CRC risk and survival differ across tumor molecular subtypes defined by MSI and BRAF mutation status. Prior studies have reported that the associations between genetic variations in the vitamin D and calcium metabolic pathways and CRC vary according to MSI status, with significant associations for microsatellite unstable CRC only [31,32]. However, no study has yet evaluated the relationship between the GC rs2282679 polymorphism and CRC by these tumor molecular alterations.
In this analysis, we assessed the associations of the GC rs2282679 variant with CRC risk and survival. We additionally evaluated the potential influence of this SNP according to dietary vitamin D, calcium, milk, and total dairy product intake and whether associations varied by tumor microsatellite instability (MSI) or BRAF Val600Glu mutation status.

Study participants
Study data and biologic specimens were drawn from the Newfoundland and Ontario Familial Colorectal Cancer Study (NFCCS), a large population-based case-control study designed to identify genetic and environmental risk and prognostic factors for CRC [33,34]. The detailed rationale and methodology of the NFCCS have been described elsewhere [4,33,35–37]. For the current study, only the participants from the Newfoundland and Labrador (NL) portion were analyzed. Briefly, men and women with pathologically confirmed CRC were identified through the Newfoundland Familial Colorectal Cancer Registry (NFCCR). Eligible patients were newly diagnosed with CRC from 1999 to 2003 and aged 20-75 years at the time of diagnosis. Controls were selected by random digit dialing and matched on age (±5 years) and sex with cases at baseline [38]. All consenting participants were sent self-administered risk factor questionnaires (a Food Frequency Questionnaire (FFQ), a Family History Questionnaire (FHQ), and a Personal History Questionnaire (PHQ)), and were asked to provide blood samples and, for cases, permission to access their tumor specimens and medical records. A total of 656 cases and 696 controls completed detailed questionnaires and donated a blood sample. Of the 656 cases, 490 were followed for mortality and recurrence from the date of cancer diagnosis to April 2010. Vital status (i.e., death, recurrence, and metastasis) was ascertained through periodic follow-up questionnaires (e.g., FHQ), local newspapers, death certificates, pathology records, autopsy records, physicians' notes, surgical reports, and from records at the Dr. H. Bliss Murphy Cancer Care Foundation. The main study survival outcomes were overall survival (OS), with death from all causes as the event, and disease-free survival (DFS), with death, recurrence, or metastasis (whichever came first) as the event.
Follow-up time began at CRC diagnosis, and individuals who were lost to follow-up or who had not died or experienced a recurrence or metastasis were censored at the time of their last contact.
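The DFS end point and censoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' SAS code; the function name `dfs_endpoint` and its argument names are hypothetical, and follow-up is expressed in days purely for convenience.

```python
from datetime import date
from typing import Optional, Tuple


def dfs_endpoint(diagnosis: date,
                 last_contact: date,
                 death: Optional[date] = None,
                 recurrence: Optional[date] = None,
                 metastasis: Optional[date] = None) -> Tuple[int, bool]:
    """Return (follow-up in days, event observed) for disease-free survival.

    The DFS event is the earliest of death, recurrence, or metastasis.
    Patients with none of these events are censored at last contact.
    """
    event_dates = [d for d in (death, recurrence, metastasis) if d is not None]
    if event_dates:
        first_event = min(event_dates)
        return (first_event - diagnosis).days, True
    # No event recorded: censor at the date of last contact.
    return (last_contact - diagnosis).days, False
```

For OS the same logic applies with death as the only event, so a patient who had a recurrence and later died contributes an earlier event time to DFS than to OS.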
Exclusions were made if patients had equivocal genotype or clinical outcome, or failed to provide sufficient information on other critical predictors. Thus, 637 cases and 489 controls for risk analyses and 489 patients for survival analysis were included in the final study.

Diet assessment and baseline information collection
Information on diet and other lifestyle, medical and demographic characteristics was gathered with self-administered questionnaires. The dietary questionnaire was an adaptation of the Hawaii semi-quantitative FFQ, validated in a prior study [39], and assessed the dietary habits of participants one year prior to disease diagnosis (cases) or interview (controls). The FFQ contained questions regarding the brand and frequency of consumption of 170 foods and beverages plus multivitamin and individual vitamin supplements [40]. The nutrient intakes from diet were calculated by multiplying the frequency of consumption of each food item by the nutrient content per average unit [4]. Total daily nutrient intakes were computed by incorporating supplement use in addition to intakes from diet. The PHQ collected information from each participant on sociodemographics (e.g., age, gender, ethnicity, and education attainment), medical conditions, bowel screening history, aspirin use, physical activity, and recent or prior alcohol and tobacco use. The FHQ gathered baseline and follow-up family history data from the participants.
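As a minimal sketch of the intake calculation described above (frequency of consumption multiplied by nutrient content per unit, summed over foods, with supplements added for the total), consider the following Python function. The function name, the food items, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the FFQ or its nutrient database.

```python
def daily_nutrient_intake(servings_per_day, nutrient_per_serving, supplement=0.0):
    """Return (dietary intake, total intake) for one nutrient.

    servings_per_day: food -> average servings consumed per day
    nutrient_per_serving: food -> nutrient content of one serving
    supplement: daily amount of the nutrient from supplements
    """
    dietary = sum(freq * nutrient_per_serving[food]
                  for food, freq in servings_per_day.items())
    return dietary, dietary + supplement


# Hypothetical example: vitamin D in IU (values are made up for illustration).
servings = {"milk": 1.5, "salmon": 0.25}
vitamin_d_iu = {"milk": 100.0, "salmon": 400.0}
dietary, total = daily_nutrient_intake(servings, vitamin_d_iu, supplement=400.0)
```

Here the dietary figure corresponds to "intake from diet" and the second figure to "total intake" in the sense used in the text.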

Genotyping
In cases, genotyping for the GC rs2282679 allele was conducted at Centrillion Biosciences (USA) using the Illumina Human Omni-Quad Beadchip, which contains about 1.1 million SNPs. Control individuals were genotyped in the Laboratory of Dr. Stephen Gruber (Director, USC Norris Comprehensive Cancer Center, Los Angeles) using the Affymetrix Axiom® myDesign™ GW Array Plate, which contains 1.3 million probes. To monitor quality and consistency between the two platforms, DNA samples from 200 CRC patients were typed on both platforms. Because the DNA from cases and controls was genotyped on different platforms, a genotype imputation strategy was implemented to integrate the two datasets using IMPUTE2 [41] with multi-population reference panels from 1000 Genomes (Phase 1). The imputation approach was validated based on the overlapping SNPs between the two platforms and the genotypes from the 200 CRC samples that were typed on both platforms. SNPs with genotype concordance < 97% across the two platforms were removed from further analysis. For the purpose of the current study, directly measured data from both arrays on rs2282679 were retrieved from the genome-wide SNP genotype database of the NFCCR.
Our protocol for MSI and BRAF V600E mutation analyses in tumor DNA has been described previously [42][43][44]. MSI status was evaluated with 5 to 10 microsatellite markers. Tumors were deemed MSI-high if ≥30% of the repeats were unstable and MS-stable/MSI-low if < 30% of the repeats were unstable. The c.1799 T > A variant (Val600Glu mutation) region of the BRAF gene was amplified by BRAF allele-specific polymerase chain reaction technique [44].
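The MSI classification rule stated above (MSI-high if ≥30% of the tested markers are unstable, MS-stable/MSI-low otherwise) can be written as a small Python function. This is a sketch of the threshold logic only; the function name and labels are hypothetical, and the laboratory protocol itself is described in the cited references.

```python
def classify_msi(unstable_markers: int, markers_tested: int) -> str:
    """Classify a tumor by the fraction of unstable microsatellite markers."""
    if markers_tested <= 0:
        raise ValueError("at least one marker must be tested")
    if not 0 <= unstable_markers <= markers_tested:
        raise ValueError("unstable_markers must be between 0 and markers_tested")
    # Integer form of unstable/tested >= 0.30, avoiding floating-point rounding.
    if unstable_markers * 100 >= 30 * markers_tested:
        return "MSI-high"
    return "MS-stable/MSI-low"
```

With a 10-marker panel, 3 unstable markers sit exactly on the threshold and are classified as MSI-high, whereas 2 unstable markers are not.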

Statistical analysis
Group comparisons between cases and controls were performed with the two-sample t test for continuous variables and the chi-square (χ²) test for categorical variables. Hardy-Weinberg equilibrium for the rs2282679 genotype was evaluated using the χ² goodness-of-fit test. Unconditional logistic regression was used to estimate the association between the rs2282679 GC SNP and risk for CRC as odds ratio (OR) with 95% confidence interval (CI). Initially, logistic regression models only included genotype, age and sex. More complex models also included family history of CRC, screening procedure, multivitamin use, folic acid intake, smoking history, and education attainment. These covariates were retained in the final models because they entered the model at P < 0.1, altered the parameter estimates by > 10%, and/or improved the model fit.
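To illustrate the Hardy-Weinberg goodness-of-fit test mentioned above, the following Python sketch computes the minor allele frequency, the χ² statistic, and a P value on 1 degree of freedom from three genotype counts. The function name is hypothetical, and the counts in the example (245 AA, 208 AC, 36 CC) are an assumption: they are illustrative values chosen to approximate the genotype percentages reported for the 489 controls, not counts taken from the paper's tables.

```python
import math


def hwe_chi_square(n_aa: int, n_ac: int, n_cc: int):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (minor allele frequency, chi-square statistic, P value).
    """
    n = n_aa + n_ac + n_cc
    p = (2 * n_aa + n_ac) / (2 * n)   # frequency of the A allele
    q = 1.0 - p                       # frequency of the C allele
    observed = (n_aa, n_ac, n_cc)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


# Illustrative counts (an assumption, see the note above).
maf, chi2, p_value = hwe_chi_square(245, 208, 36)
```

The test has 1 degree of freedom because three genotype classes are compared after estimating one allele frequency from the data; for these illustrative counts the P value is above 0.05, which would be read as no evidence of departure from equilibrium.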
In survival analysis, survival curves were constructed with the Kaplan-Meier method. The log-rank test and Cox regression models were used for univariable and multivariable survival analyses to assess the association between the SNP of interest and OS and DFS of CRC. The assumption of proportional hazards for each Cox model was verified by testing the statistical significance of time-dependent covariates in the model. The hazard rate ratio (HR) and 95% CI were calculated from the Cox models. As the true inheritance mode of the rs2282679 polymorphism has not yet been established, the SNP was analyzed for risk and survival under dominant, additive, and recessive models. Given the limited sample size in some subgroups, we combined those who carried at least one minor C allele in analyses stratified by selected tumor molecular phenotype. Linear trend for gene dose effect was tested by modeling allele dose (0, 1, and 2) as a continuous variable. Gene-environment interactions were tested by introducing a multiplicative interaction term into the model and assessing its significance with the Wald method. Two-sided exact P < 0.05 was considered statistically significant. We did not adjust for multiple comparisons because the sub-tests in the current study are not independent of each other, as the stratification variables (i.e., vitamin D, calcium, and dairy products) are highly correlated. Although adjustment for multiple testing reduces type I error, it increases type II error and errors of interpretation [45]. All data management and analyses were performed using SAS software, Version 9.3.
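The three inheritance models and the multiplicative interaction term described above differ only in how the genotype is coded before it enters the regression. The Python sketch below shows one conventional coding; the function names are hypothetical, and the analyses in this study were run in SAS, so this is an illustration of the coding scheme rather than the authors' code.

```python
def code_genotype(genotype: str, model: str, minor: str = "C") -> int:
    """Code a two-letter genotype (e.g. "AA", "AC", "CC") for regression."""
    if len(genotype) != 2:
        raise ValueError("genotype must have exactly two alleles")
    dose = genotype.count(minor)      # number of minor alleles: 0, 1, or 2
    if model == "additive":
        return dose                   # per-allele effect, used for trend tests
    if model == "dominant":
        return int(dose >= 1)         # AC + CC versus AA
    if model == "recessive":
        return int(dose == 2)         # CC versus AA + AC
    raise ValueError(f"unknown model: {model}")


def interaction_term(genotype_code: int, exposure_code: int) -> int:
    """Multiplicative gene-environment term entered alongside both main effects."""
    return genotype_code * exposure_code
```

Under the dominant coding, for example, AC and CC carriers both receive the value 1, which corresponds to the combined AC + CC group used in the stratified analyses.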

Results
The rs2282679 polymorphism was in Hardy-Weinberg equilibrium (P > 0.05). Among controls, the genotype frequency was 7.4% homozygous (CC), 42.5% heterozygous (AC) and 50.1% wild-type homozygous (AA); the observed minor allele frequency in the controls was comparable to that previously reported [25]. During a maximum follow-up of 10.9 years (mean: 6.3 years), 150 deaths occurred among the 489 patients included in the survival analysis. The cause of death defined by ICD codes was obtained for 105 of the 150 deceased patients; of these deaths, the majority (90.5%) were due to CRC.

Baseline characteristics of cases and controls
Cases and controls had similar sex and ethnicity distributions, and the majority reported their race as White (Table 1). Relative to cases, controls were slightly younger, leaner (lower body mass index), better educated, less likely to smoke, and more likely to have had a colorectal cancer screening/early detection procedure. Among those who completed the FFQ, total vitamin D and calcium intakes were significantly higher in controls than in cases (P = 0.001). Family history of CRC (first-degree relatives affected only) was reported by 9.8% of the patients and 7.5% of the controls. MSI-high was identified in 61 of 507 (12.0%) tumors, and BRAF Val600Glu mutation was detected in 10.8% of tumors.

Association of rs2282679 genotype with CRC risk
The rs2282679 SNP was not associated with risk of CRC overall or when stratified by MSI or BRAF-mutation subtypes (Table 2). Specifically, the odds ratio was 1.10 (95% CI, 0.88-1.37) per variant C allele and 1.21 (95% CI, 0.70-2.09) in CC homozygotes compared with AA homozygotes. The ORs were similar for men and women and did not differ according to tumor anatomical sub-site (data available upon request).
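As a rough consistency check on an estimate such as the per-allele OR above, the standard error on the log scale and an approximate two-sided P value can be recovered from the reported 95% CI. The Python sketch below assumes the interval is a Wald interval that is symmetric on the log scale, and it works from rounded published values, so the result is approximate; the function name is hypothetical.

```python
import math


def wald_from_ci(estimate: float, lower: float, upper: float,
                 z_crit: float = 1.959964):
    """Approximate (log-scale SE, z statistic, two-sided P) from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Per-allele OR for CRC risk reported in the text: 1.10 (95% CI, 0.88-1.37).
se, z, p_value = wald_from_ci(1.10, 0.88, 1.37)
```

Because the interval 0.88-1.37 includes 1, the recovered P value is above 0.05, in line with the null finding for overall CRC risk.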

Interactions of rs2282679 genotype with dietary characteristics in relation to CRC risk
A previous NFCCS study demonstrated that total vitamin D intake was inversely associated with CRC incidence [4]. We therefore cross-classified subjects on total vitamin D […] MSS + MSI-low tumors (Log-rank P = 0.0013, Fig. 2). Our results did not confirm a prognostic relevance of rs2282679 in OS of CRC.

Interactions of rs2282679 genotype with dietary characteristics in relation to CRC survival
The GC rs2282679 genotype interacted with dietary factors to influence OS after CRC diagnosis (Table 5). Specifically, the positive association between carriage of the C allele and poor OS seemed limited to patients in higher categories of dietary vitamin D, calcium, milk, and total dairy product intakes; the HRs associated with the AC + CC genotypes were 2.11 (95% CI, 1.29-3.74; P interaction = 0.

Discussion
In this study, we observed no clear association of the rs2282679 SNP with overall CRC risk, but noted a suggestive association of the CC genotype with DFS. A previous multicenter case-control study of 10,061 CRC cases and 12,768 controls of European ancestry found no evidence for associations between GC rs2282679 and the risk of CRC overall or for colon or rectal tumors separately [26]. In a Mendelian randomization study in Scotland, Theodoratou et al. [25] reported a nonsignificant association of the rs2282679 wild-type A allele with CRC risk (per A allele: OR, 0.97; 95% CI, 0.90-1.06). The only available previous study investigating the prognostic effect of the SNP on CRC reported consistent results: the GC rs2282679 polymorphism was significantly associated with reduced time to recurrence (HR, 3.30; 95% CI, 1.09-9.97, P = 0.034) in stage II and III colon cancer patients treated with surgery alone [27]. In two recent GWAS [19,20], GC rs2282679 was identified as the strongest genomic predictor of serum vitamin D level (P = 2.0 × 10⁻³⁰). Each copy of the risk C allele was associated with an approximately 50% elevated risk for hypovitaminosis among Caucasians [20]. In another study by Zhang et al. [46], the C allele of this SNP was associated with lower circulating DBP concentrations and thus lower 25(OH)D bioavailability to target organs. Together with the fact that vitamin D has been shown to reduce the growth of CRC xenografts by influencing cell growth, differentiation, apoptosis, and immune modulation, the elevated risk of C allele carriers may be attributed to their low circulating DBP and 25(OH)D concentrations relative to noncarriers [47][48][49].
Alternatively, DBP can be converted to GcMAF, an activator of macrophages, by stepwise incubation of β-galactosidase and sialidase [17]. GcMAF could activate phagocytosis of macrophages during inflammation, reduce tumor growth and stimulate cell apoptosis [15,16]. In addition, GcMAF has been demonstrated to have the potential utility as an antitumorigenic drug for metastatic breast cancer [50]. Therefore, genetic variation in GC may alternatively influence cancer outcome via GcMAF, a biological mechanism independent of vitamin D levels.
We found that carriage of the risk C allele was associated with an increased likelihood of CRC incidence in individuals with high vitamin D intake (P interaction = 0.019). No prior studies were found that specifically examined the interaction between the rs2282679 polymorphism and vitamin D on CRC; yet, two other GC SNPs, rs17467825 and rs7041, which are in strong linkage disequilibrium with GC rs2282679 (r² = 1.0 and 0.6, respectively) [20], have been associated with a slightly greater risk of CRC among individuals who consumed total vitamin D above the median in a multicenter case-unaffected sibling control study, though the interactions were not significant [32]. In addition, low vitamin D intake conferred a higher risk of CRC among wild-type AA carriers but had a less obvious effect among AC or CC carriers; therefore, subjects with the AC/CC genotype might derive little benefit from high vitamin D intake, which may be due to the low affinity and abundance of their DBP, which might influence the function of vitamin D. Previous research [51] found that serum 25(OH)D level had a greater effect on colorectal adenoma among patients with high total calcium intake; it is therefore unsurprising that we also observed a particularly strong association between the variant and all-cause mortality in patients in the higher calcium category.
Based on these observations, we may speculate that the influence of rs2282679 polymorphism on either carcinogenesis or progression of CRC was strengthened by a metabolically permissive environmental condition characterized by high levels of dietary vitamin D, calcium, or foods rich in vitamin D and calcium [52].
In this study, the associations between rs2282679 SNP and DFS and OS were of similar patterns but stronger with DFS than OS. The difference in results may be explained by several reasons. It is plausible that many deaths among CRC patients are preceded by tumor metastasis or recurrence. Thus, the DFS end point may be dominated by metastasis and recurrence rather than deaths from all causes [53]; the difference in outcomes may have affected the results. A second possible explanation is that the power to detect an association for OS is less than that for DFS as the OS end point requires extended follow-up [53]. Therefore, the nonsignificant P values for OS might reflect inadequate power rather than a true lack of effect.
Our data suggest that the GC rs2282679 variation may be associated with poor DFS among patients with BRAF wild-type tumors, but not among those with BRAF mutant tumors (P interaction = 0.043). Although intriguing, the interaction of the SNP with tumor BRAF mutation status should be interpreted with caution because of the limited statistical power caused by the low number of patients with BRAF mutant tumors, and because no mechanism has yet been identified that would explain why the prognostic value of this gene is confined to BRAF wild-type CRC. Additionally, we observed an additive effect of the rs2282679 genotype combined with MSI status; unsurprisingly, the most favorable prognosis as determined by DFS was seen among patients with AA/MSI-high tumors (vs. AC + CC/MSS + MSI-low). MSI has been established as a prognostic biomarker that confers a survival advantage in CRC due to an increased apoptosis rate and high lymphocytic infiltration [54][55][56][57]. Our observations suggest that prediction models of CRC outcome should additionally integrate the rs2282679 genotype. These results may provide relevant information for identifying patients with increased susceptibility to CRC incidence and mortality and for assigning patients to interventions that are tailored to the individual. Additional studies should investigate the role of the rs2282679/MSI classification in predicting the response to therapeutic lifestyle interventions.
Fig. 1 Survival curves for (a) disease-free survival and (b) overall survival by GC rs2282679 genotypes. Adjusted for sex, age at diagnosis, tumor stage at diagnosis, marital status, race, reported chemoradiotherapy, MSI status, BRAF mutation status, tumor location, fruit intake, and body mass index.
One limitation of our study is that only one genetic variant of the GC gene was evaluated, thereby providing incomplete coverage of this gene; we also cannot exclude that genetic polymorphisms in other genes in the vitamin D metabolism pathway (e.g., vitamin D receptor) may influence CRC initiation and progression. It is also possible that rs2282679 is not itself the true causal variant but acts as a proxy through linkage disequilibrium (LD). Moreover, plasma 25(OH)D levels were not measured in this study. The lack of 25(OH)D measurements prevented us from testing the relations of the GC rs2282679 polymorphism with plasma vitamin D concentration and from evaluating the extent to which the high risk of CRC mortality associated with the C allele is mediated through low 25(OH)D levels. Furthermore, dietary vitamin D intake may not accurately reflect each participant's vitamin D status since dietary history as measured by the FFQ is imprecise, and neither cutaneous synthesis of vitamin D from sun exposure nor long-term dietary vitamin D intake was taken into account. Additionally, individuals were asked to report dietary exposures from one year prior to diagnosis for cases and one year prior to recruitment for controls; therefore, cases recalled dietary intakes from years earlier than controls. The longer recall period increases the rate of recall error, resulting in a higher likelihood of exposure misclassification in the case group.
Among the strengths of the study is the careful data collection, with a combination of results from genotyping and epidemiologic questionnaires. The availability of information on known environmental and genetic risk factors of CRC allowed us to investigate potential gene-gene or gene-environment interactions. The relatively large sample size with up to 10 years of follow-up provided sufficient power to discern the significant gene-gene and gene-environment interactions in modifying CRC risk and survival using stratified analyses, which could be missed in smaller investigations. Finally, we were able to link the GC rs2282679 genotype to both risk and survival of CRC to recapitulate the entire spectrum of the disease from initiation through progression [58].

Conclusions
Our data demonstrate that the GC rs2282679 polymorphism is not associated with CRC risk overall, but suggest a possible reduced DFS after CRC diagnosis. These results identified an association between the GC SNP rs2282679 and DFS of CRC and effect modifications by vitamin D intake and BRAF mutation status. The genotype at the GC rs2282679 locus, along with vitamin D and BRAF mutation status, has potential utility as a susceptibility and prognostic biomarker of CRC. Future studies should verify these findings in other populations as well as clarify the molecular mechanisms behind the differential effects of the SNP on CRC outcomes according to vitamin D and BRAF mutation status.
Fig. 2 Kaplan-Meier survival curves for disease-free survival according to GC rs2282679 genotypes and MSI status.