A case-control study of a combination of single nucleotide polymorphisms and clinical parameters to predict clinically relevant toxicity associated with fluoropyrimidine and platinum-based chemotherapy in gastric cancer

Background Fluoropyrimidine plus platinum chemotherapy remains the standard first-line treatment for gastric cancer (GC). Guidelines exist for the clinical interpretation of four DPYD genotypes related to severe fluoropyrimidine toxicity within European populations. However, the frequency of these single nucleotide polymorphisms (SNPs) in the Latin American population is low (< 0.7%). No guidelines have been developed for platinum. Herein, we present associations between clinical factors and common SNPs in the development of grade 3–4 toxicity. Methods Retrospectively, 224 clinical records of GC patients were screened, of which 93 patients were incorporated into the study. Eleven SNPs with minor allelic frequency above 5% in GSTP1, ERCC2, ERCC1, TP53, UMPS, SHMT1, MTHFR, ABCC2 and DPYD were assessed. Association between patient clinical characteristics and toxicity was estimated using logistic regression models and classification algorithms. Results Reported grade ≤ 2 and 3–4 toxicities were 65.6% (61/93) and 34.4% (32/93) respectively. Selected DPYD SNPs were associated with higher toxicity (rs1801265; OR = 4.20; 95% CI = 1.70–10.95, p = 0.002), while others displayed a trend towards lower toxicity (rs1801159; OR = 0.45; 95% CI = 0.19–1.08; p = 0.071). Combination of paired SNPs demonstrated significant associations in DPYD (rs1801265), UMPS (rs1801019), ABCC2 (rs717620) and SHMT1 (rs1979277). A multivariate logistic regression model that combined age, sex, peri-operative chemotherapy, 5-FU regimen, the binary combination of the SNPs DPYD (rs1801265) + ABCC2 (rs717620), and DPYD (rs1801159) displayed the best predictive performance. A nomogram was constructed to assess the risk of developing overall toxicity. Conclusion Pending further validation, this model could predict chemotherapy-associated toxicity and improve GC patient quality of life. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-08745-0.


Introduction
Globally, gastric cancer (GC) is the sixth most common malignancy and the third leading cause of cancer death [1][2][3][4]. Current standard first-line treatment for GC patients consists of chemotherapy regimens that combine fluoropyrimidines and platinum compounds. Therapeutic responses and associated toxicity with these regimens can vary significantly among patients ranging from moderate to severe [5][6][7]. Indeed, gastrointestinal, hematological and neurological toxicities are commonly observed under these regimens, often leading to treatment discontinuation, a reduction in quality of life, and in some extreme cases to death [8,9]. Hence, severe toxicity becomes an essential obstacle to treatment completion and predictive models of toxicity may improve patient quality of life by avoiding severe toxicity.
Previous studies have demonstrated that single nucleotide polymorphisms (SNPs) are associated with chemotherapy-associated toxicity [10][11][12]. This can be explained by gene variations that alter the enzymatic activity of key proteins affecting pharmacokinetic and pharmacodynamic processes [13,14]. In this regard, platinum-based compounds can trigger cell arrest or apoptosis by forming Pt-DNA adducts [15]. In the body, the kidneys can excrete these compounds, without biotransformation, via B1/C2/G2-type ABC (ATP Binding Cassette) transporters [16,17]. Within cells, metabolizing enzymes including GSTP1, GSTM1, NQO1 and SOD1 decrease intracellular levels of platinum compounds [18][19][20][21]. Intracellularly, platinum compounds target the DNA, forming DNA-Pt complexes. Damaged DNA is recognized by HMGB, an enzyme that coordinates DNA repair by nucleotide excision repair enzymes [19,22]. On the other hand, 5-fluorouracil (5-FU) and its pro-drug capecitabine undergo a series of enzymatic transformations prior to exerting their effects [23]. Although the precise mechanism is still unclear, 5-FU is known to inhibit thymidylate synthase (TYMS), suppressing the conversion of uracil into thymidylate and leading to the inhibition of DNA/RNA synthesis and eventually to cell death [24]. The metabolism of 5-FU occurs mainly in the liver, where DPYD metabolizes 80% of the drug, producing 5,6-dihydro-5-FU (an inactive metabolite) [25]. It is widely documented that decreased DPYD activity is associated with severe toxicity [26][27][28]. Previous reports have also associated TYMS and MTHFR gene variations with 5-FU toxicity; however, their clinical relevance is undetermined [29].
To date, the most reliable markers of fluoropyrimidine toxicity are DPYD*2A (rs3918290), DPYD-c.2846A > T (rs67376798), DPYD-Hap-B3 (rs56038477) and DPYD*13 (rs55886062). These variants have a well-documented association with severe fluoropyrimidine toxicity, and a Clinical Pharmacogenetics Implementation Consortium (CPIC) guideline recommends avoiding or reducing the dose of fluoropyrimidines if a patient carries any of them [30]. Unfortunately, given their low frequencies in the general population, the use of these variants identifies only a small fraction of potentially at-risk patients. For example, their frequencies in the 1000-Genome (1000-G) project and gnomAD databases for the American or Latino/Admixed American population are much lower than those published for European cohorts. According to these databases, the Latin-American frequency of the DPYD*2A risk allele is 0.1 and 0.2% (1000-G Project and gnomAD, respectively), yet it is an order of magnitude higher in a Finnish cohort (2.5%) [31,32]. In Latin America, the DPYD-c.2846A > T frequency is 0.3 and 0.2% (1000-G Project and gnomAD, respectively), that of DPYD-Hap-B3 is 0.6 and 0.7%, and that of DPYD*13 is 0 and 0.007%. Therefore, we hypothesize that common SNPs in the Latin American population can potentially explain the clinically relevant toxicity in patients receiving fluoropyrimidine and platinum-based treatment.
As mentioned above, SNP variants have been correlated with chemotherapy toxicity [33][34][35]. Other factors such as chemotherapy scheme, dosage, sex, and age have also been implicated in the development and severity of toxicity [7,[36][37][38]. However, only a few studies have developed comprehensive models that incorporate genetic and non-genetic factors to predict toxicity [39][40][41]. Herein, we developed and tested several models based on clinical factors, treatment regimens and candidate SNPs. Our best-performing model was used to construct a nomogram.

Patients and study design
A retrospective, observational case-control study was carried out. A total of 224 gastric cancer patients diagnosed between April 2005 and March 2018 were registered at the UC-CHRISTUS Cancer Center in the Pontificia Universidad Católica de Chile (PUC). This group of patients has previously been clinically and molecularly characterized [42]. After applying the inclusion and exclusion criteria, a total of 93 GC patients were analyzed in this study. Eligibility criteria were: a) histologically confirmed GC, b) chemotherapy regimen based on fluoropyrimidines and/or platinum compounds, c) adequate patient renal, hepatic and bone marrow function, determined by the treating physician at the time of starting chemotherapy, d) at least 2 cycles of chemotherapy, e) availability of a biological sample for extraction of genetic material and f) adults (> 18 yr-old). Patients with neuropathies or hematological damage caused by other diseases were excluded. Clinical-pathological characteristics of patients included: age, sex, stage, ECOG, histological classification, treatment schemes used and comorbidities. The Ethics Committee at "PUC" approved this study (#16-046, April 21st, 2016) [4]. All participants signed an informed consent to participate in this study. A waiver of consent was granted to include deceased patients. All data were anonymized to protect patients' privacy. This study strictly adhered to the Code of Ethics of the World Medical Association (Declaration of Helsinki, 1964).

Toxicity grading
Toxicity grades were determined following the National Cancer Institute Common Toxicity Criteria 4.0 (NCI-CTC 4.0). Data on anemia, neutropenia, febrile neutropenia, thrombocytopenia, nausea, vomiting, diarrhea, stomatitis, hand-foot syndrome, and peripheral neuropathy were collected. These events were then categorized into hematological, gastrointestinal, and neurological toxicity; a patient presenting any of the above was classified under "overall toxicity". All association analyses compared grade 0-2 vs grade 3-4 for hematological, gastrointestinal, neurological and overall toxicity. Treatment schemes and supportive care are shown in Supplementary Data.
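The categorization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the dictionary and the example patient are hypothetical, and the handling of hand-foot syndrome (counted only toward overall toxicity) is an assumption, since the text does not assign it to one of the three named categories.

```python
# Illustrative mapping of recorded adverse events to toxicity categories.
# Assumption: hand-foot syndrome is not placed in any of the three named
# categories and therefore contributes only to "overall" toxicity.
CATEGORY = {
    "anemia": "hematological",
    "neutropenia": "hematological",
    "febrile neutropenia": "hematological",
    "thrombocytopenia": "hematological",
    "nausea": "gastrointestinal",
    "vomiting": "gastrointestinal",
    "diarrhea": "gastrointestinal",
    "stomatitis": "gastrointestinal",
    "peripheral neuropathy": "neurological",
    "hand-foot syndrome": None,
}


def severe_toxicity_flags(events):
    """Dichotomize a patient's events into grade 0-2 (False) vs grade 3-4 (True).

    events: dict mapping event name -> worst recorded grade (0-4).
    Returns one boolean per category plus "overall".
    """
    flags = {
        "hematological": False,
        "gastrointestinal": False,
        "neurological": False,
        "overall": False,
    }
    for name, grade in events.items():
        if grade >= 3:
            flags["overall"] = True
            category = CATEGORY[name]
            if category is not None:
                flags[category] = True
    return flags


# Hypothetical patient: grade 3 neutropenia, grade 2 nausea, grade 1 neuropathy.
patient = {"neutropenia": 3, "nausea": 2, "peripheral neuropathy": 1}
print(severe_toxicity_flags(patient))
```

The resulting booleans correspond to the binary outcomes (grade 0-2 vs grade 3-4) used in the association analyses.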

SNP selection, DNA extraction and genotyping
A total of 11 SNPs were assessed, and a detailed description of this process is provided in Supplementary Data. Genotypic/allelic frequencies of analyzed SNPs are shown in Supplementary Table S8. Nucleic acids were extracted from paraffin-embedded tumor tissues using the "AllPrep DNA/RNA Mini Kit®" (Cat#AM1975, Thermo Fisher). DNA was quantified by "Qubit® dsDNA HS Assay" (Thermo Fisher). Candidate SNPs were genotyped by TaqMan® SNP Genotyping Assay technology on an Applied Biosystems® 7500 Fast Real-Time PCR System (Thermo Fisher). Samples were randomly reanalyzed for confirmation. TaqMan® probes are shown in Supplementary Table S9.

Statistical analysis
Association analysis of SNPs
The association between SNPs and grade 3-4 toxicity was analyzed using univariate logistic regression models, reporting odds ratios (OR) with 95% confidence intervals (95% CI). Each SNP was tested under three inheritance models (dominant, codominant and recessive), and the best inheritance model for each SNP was chosen on the basis of the AIC and BIC [43]. To choose the SNP combinations, we first performed a multivariate logistic regression analysis using the 11 SNPs. To reduce the number of combinations and avoid overfitting, we applied an AIC-based stepwise algorithm. Binary combinations of the selected SNPs were then formed, and their association with grade 3-4 toxicity was established using their respective inheritance models.
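To make the inheritance models concrete, the sketch below shows how a genotype could be encoded under each model and how AIC and BIC could be used to pick among fitted models. It is only an illustration: the function names are hypothetical, and the log-likelihoods in the example are invented values, not results from this study.

```python
import math


def encode_genotype(genotype, variant_allele, model):
    """Encode a diploid genotype string (e.g. "AG") under an inheritance model.

    dominant:   1 if at least one copy of the variant allele, else 0
    recessive:  1 only if two copies, else 0
    codominant: number of copies (0, 1, 2), to be treated as a
                three-level categorical factor in the regression
    """
    copies = genotype.count(variant_allele)
    if model == "dominant":
        return int(copies >= 1)
    if model == "recessive":
        return int(copies == 2)
    if model == "codominant":
        return copies
    raise ValueError("unknown inheritance model: " + model)


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for one SNP fitted
# under each model; these numbers are made up for illustration.
fits = {
    "dominant": (-55.0, 2),
    "codominant": (-54.8, 3),
    "recessive": (-58.1, 2),
}
best = min(fits, key=lambda m: aic(*fits[m]))
print(best, {m: round(aic(*fits[m]), 1) for m in fits})
```

The codominant model spends one extra parameter, so it is preferred only when its gain in likelihood outweighs the penalty.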

Model evaluation and nomogram construction
The resulting models were evaluated using classification algorithms [45,46], including Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), Random Forest (RF) and Decision Tree (DT) (details in Supplementary Data). Using the coefficients of the multivariate analysis of model 4 as a basis, we constructed a nomogram with the "rms" package [47]. In addition, to assess discriminatory capacity, 1000 bootstrap replications served as internal validation subsets to estimate the bias-corrected c-index and calibration.
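The bootstrap step can be illustrated with a generic sketch of optimism correction for the c-index. This is an assumption about the general technique (optimism bootstrap), not a reproduction of the "rms" implementation; the function names and the `fit` callback interface are hypothetical.

```python
import random


def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equivalent to the AUC).

    Fraction of (case, control) pairs in which the case has the higher
    predicted score; ties count as one half.
    """
    concordant = 0.0
    pairs = 0
    for yi, si in zip(y_true, y_score):
        if yi != 1:
            continue
        for yj, sj in zip(y_true, y_score):
            if yj != 0:
                continue
            pairs += 1
            if si > sj:
                concordant += 1.0
            elif si == sj:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one case and one control")
    return concordant / pairs


def optimism_corrected_c(X, y, fit, n_boot=1000, seed=0):
    """Optimism-corrected c-index by bootstrap resampling.

    fit(X, y) must return a function mapping one row of X to a risk score.
    """
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = c_index(y, [model(row) for row in X])
    optimism_sum = 0.0
    used = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample contains a single class; skip it
        boot_model = fit(Xb, yb)
        c_boot = c_index(yb, [boot_model(row) for row in Xb])
        c_orig = c_index(y, [boot_model(row) for row in X])
        optimism_sum += c_boot - c_orig
        used += 1
    if used == 0:
        return apparent
    return apparent - optimism_sum / used
```

Each resample refits the model, scores it on both the resample and the original data, and the average difference (the optimism) is subtracted from the apparent c-index.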

General statistical analysis
Continuous variables were compared using ANOVA. The Kaplan-Meier method was used for survival analysis and log-rank tests for comparison. Significance was set at p < 0.05. According to the number of cases and controls (31 and 62 patients respectively), assuming a power of 80%, an α error of 5% and a frequency of common polymorphisms (i.e. DPYD rs1801265, GSTP1 rs1695) of 30%, odds ratios of 3.9 and 0.1 (high and low) could be detected. Association analyses of SNPs and overall toxicity were performed with the SNPstat program [43]. Uni/multivariate logistic regression models were built using the "stats" and "DescTools" packages. Classification algorithms were constructed using the "caret" and "ROCR" packages. For survival analysis, the "survival" and "survminer" packages were used. All analyses were performed in R software v3.5.1 (The R Foundation, Vienna, Austria). Full datasets used in this study can be found in Supplementary Data File 1.

SNPs selection
SNPs were selected based on: (1) scientific evidence regarding the SNP/toxicity relationship, using the PharmGKB database [48]; (2) allelic and genotypic frequency of the SNPs in the American population [32]; (3) relationship of the SNPs with the toxicities recorded in our patients, according to literature-based criteria; (4) functional impact of SNPs at the protein level according to PolyPhen [49] and SIFT [50]. Briefly, in a first approximation, 27 SNPs in 11 genes were reviewed for fluoropyrimidines and 7 SNPs in 6 genes for platinum compounds. Then, based on a score system, 14 SNPs in 7 different genes were candidates for fluoropyrimidines, while 4 SNPs in 4 different genes were candidates for platinum compounds. Finally, eleven SNPs with an allelic frequency greater than 5% were genotyped (Table 1). A detailed description of this process is provided in Supplementary Data.

General characterization of patients
Main clinicopathological characteristics of the patients are summarized in Supplementary Table S1. Briefly, patients were predominantly male (62.4%) and had advanced-stage disease. Registered toxicity events are listed in Supplementary Table S3. Data were grouped according to type of toxicity. Peripheral neuropathy was the most common grade 1 toxicity (34.1%). Nausea was the most predominant grade 2 toxicity (31.5%). Among grade 3 toxicities, neutropenia was dominant (22.58%), followed by diarrhea (20%). Finally, we registered a total of 5 patients with grade 4 events; 3 of these 5 (60%) corresponded to neuropathy. No toxicity-related deaths were registered. Clinically relevant toxicities (grade ≥ 3) were most frequently associated with digestive problems such as diarrhea and stomatitis, with a total of 19 registered events. Among hematological toxicities, a total of 17 grade ≥ 3 events, principally neutropenia or febrile neutropenia, were registered (Supplementary Table S3).

Genetic variants associated with overall toxicity
Table 3 Univariate logistic regression analysis for association between treatment variables and overall grade ≥ 3 toxicity
Binary associations between overall toxicity and SNPs were assessed using three inheritance models (codominant, dominant and recessive) and are summarized in Table 4. Using a dominant model, the AG/GG genotypes of DPYD (rs1801265) were associated with higher toxicity (OR = 4.20; 95% CI = 1.70-10.95; p = 0.002). We also found a borderline association between DPYD (rs1801159) and lower toxicity (OR = 0.45; 95% CI = 0.19-1.08; p = 0.071). Potential associations with DPYD (rs2297595), ERCC2 (rs13181) and GSTP1 (rs1695) SNPs were also analyzed; however, no significant association was found by univariate analysis.
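For a single binary predictor such as a dominant-coded genotype, the odds ratio from a univariate logistic regression equals the cross-product ratio of the 2x2 table. The sketch below computes it with a Wald-type 95% interval. The function name and the counts in the example are hypothetical, and the interval method is an assumption: the study's software may compute intervals differently, so this will not necessarily reproduce the reported bounds.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers with grade 3-4 toxicity      b: carriers with grade 0-2
    c: non-carriers with grade 3-4 toxicity  d: non-carriers with grade 0-2
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
estimate, lower, upper = odds_ratio_ci(a=20, b=18, c=12, d=43)
print("OR = %.2f (95%% CI %.2f-%.2f)" % (estimate, lower, upper))
```

An interval that excludes 1 corresponds to a significant association at the 5% level; small cell counts widen the interval considerably.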

Combination of genetic variants associated with overall toxicity
Next, we performed a multivariate logistic regression analysis incorporating the 11 SNPs to establish potential associations between combined SNPs and overall toxicity. We applied a stepwise algorithm based on the Akaike information criterion (AIC) [51] to reduce the number of combinations and avoid overfitting. On this basis, we selected 5 SNPs and their respective inheritance models to test binary combinations between SNPs.

Evaluation of toxicity models
To assess the predictive power of our models we employed a variety of classification algorithms. Figure 1 shows the sensitivity, specificity, accuracy and AUC of each model. For example, model 1 had a high specificity (range = 0.82-1.0), but low sensitivity (range = 0-0.17), in most tested algorithms. Accuracy reached a maximum value of 0.71 using the LR method, and the AUC reached 0.74 with the LR and ANN classification algorithms (Fig. 1A).
Comparing different algorithms for model 2, we found a promising specificity (range = 0.61-0.87) but low sensitivity (range = 0.17-0.42). In this model, maximum accuracy (0.69) was achieved with the KNN method, and the most favorable AUC was 0.68 with the DT method (Fig. 1B). Model 3 showed a relatively high specificity (range = 0.70-0.91) and a moderately low sensitivity (range = 0.17-0.41), reaching its maximum value with the LR method. On the other hand, the maximum accuracy was 0.69 with the SVM and KNN methods, and the maximum AUC of 0.68 was achieved with the DT method (Fig. 1C). Model 4 showed a relatively high specificity (range = 0.74-1.0), with the sensitivity ranging between 0 and 0.66 among the classification algorithms. Maximum accuracy (0.80) was achieved with the LR method, and an AUC of 0.82 was achieved with the same classification algorithm (Fig. 1D). In summary, our data suggest that model 4 was the best predictor of grade 3-4 toxicity (using the LR method). This supports the notion that combined models provide better predictive power than individual-variable models.
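The metrics compared above follow directly from the confusion matrix. A minimal sketch (the function name and the example labels are hypothetical) also shows why a specificity of 1.0 can coexist with a sensitivity of 0: a classifier that assigns every patient to the low-toxicity group produces exactly that pattern.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels.

    1 = grade 3-4 toxicity (case), 0 = grade 0-2 toxicity (control).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


# Hypothetical test set: 2 cases and 4 controls, all predicted as low toxicity.
truth = [1, 1, 0, 0, 0, 0]
always_low = [0, 0, 0, 0, 0, 0]
print(classification_metrics(truth, always_low))
```

Because controls outnumber cases, accuracy alone can look acceptable for such a classifier, which is why sensitivity and AUC are reported alongside it.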

Multivariate analysis for type of toxicity
In line with the methodology employed for overall toxicity, multivariate analysis was performed on clinical and genetic factors in the development of severe hematological, gastrointestinal or neurological toxicity, each considered independently. For hematological toxicity, integrating the clinical/treatment variables with SNPs gave a better fit of the logistic regression model. For example, model 1 and model 2 returned Pseudo R² values of 0.07 and 0.14, respectively, while the values for the integrated models ranged from 0.23 upward (Supplementary Table S7). Taken together, these results suggest that the combined models provide higher Pseudo R² values. However, given the reduced number of observations per variable (low SNP incidence) when individual toxicities are analyzed, the associations may be relatively imprecise (as reflected in the confidence intervals) and must be interpreted with caution until a larger cohort is studied.

Nomogram for predicting general toxicity
As an approximation for future validation of our results, we developed a multivariate logistic regression-based nomogram that estimates the probability that a given patient will experience grade 3-4 overall toxicity (Fig. 2A). This model is well calibrated (Supplementary Fig. S2) and has an acceptable discriminatory capacity, with an optimism-corrected c-index of 0.72 (95% CI, 0.72-0.92). Figure 2B shows the distribution of nomogram values for each patient. As expected, the median values for the low- and high-toxicity groups were significantly different (p < 0.0001). In addition, we established different cut-offs according to the points delivered by the nomogram. Patients with ≤ 45 points have a 10% probability of developing toxicity. Encouragingly, 93% of patients in this lower range correspond to the low-toxicity group. In contrast, patients with > 136 accumulated points are at a higher risk of developing toxicity, and 73% of them are in the high-toxicity group.
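The link between nomogram points and predicted probability can be sketched as follows. The scaling convention (the predictor with the largest effect range spans 100 points) is the usual one for logistic nomograms, but the coefficients, intercept, variable names and ranges below are hypothetical placeholders, not the fitted values of model 4.

```python
import math

# Hypothetical coefficients (log-odds scale) and predictor ranges.
INTERCEPT = -3.0
COEFS = {"age": 0.03, "dpyd_rs1801265": 1.4, "dpyd_rs1801159": -0.8}
RANGES = {"age": (30, 80), "dpyd_rs1801265": (0, 1), "dpyd_rs1801159": (0, 1)}


def logistic(lp):
    return 1.0 / (1.0 + math.exp(-lp))


def points_scale(coefs, ranges):
    """Points per unit of linear predictor: the predictor with the
    largest |beta| * range is assigned a span of 100 points."""
    widest = max(abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs)
    return 100.0 / widest


def total_points(x, coefs, ranges, scale):
    """Sum of points; the lowest-risk value of each predictor scores 0."""
    points = 0.0
    for k, beta in coefs.items():
        low, high = ranges[k]
        reference = low if beta >= 0 else high
        points += abs(beta) * abs(x[k] - reference) * scale
    return points


def probability_from_points(points, intercept, coefs, ranges, scale):
    """Convert total points back to a predicted probability."""
    lp_at_zero = intercept + sum(
        beta * (ranges[k][0] if beta >= 0 else ranges[k][1])
        for k, beta in coefs.items()
    )
    return logistic(lp_at_zero + points / scale)


SCALE = points_scale(COEFS, RANGES)
patient = {"age": 65, "dpyd_rs1801265": 1, "dpyd_rs1801159": 0}
pts = total_points(patient, COEFS, RANGES, SCALE)
prob = probability_from_points(pts, INTERCEPT, COEFS, RANGES, SCALE)
print(round(pts, 1), round(prob, 3))
```

Because points are a linear rescaling of the model's linear predictor, a point cut-off is equivalent to a probability cut-off on the underlying logistic regression.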

Discussion
Chemotherapy treatment-related toxicity remains a critical problem in GC patients. Unfortunately, the relative benefit in terms of patient survival associated with targeted therapies is rather modest [52][53][54][55]. Therefore, optimizing chemotherapy regimens becomes crucial to improve GC patient survival. Within this context, the elaboration of reliable models that predict treatment-related toxicities might ensure patient safety in interventional studies. While guidelines exist for genotypes relating to fluoropyrimidine toxicity, our study also demonstrates associations related to platinum compounds. The Clinical Pharmacogenetics Implementation Consortium has delivered guidelines for the clinical interpretation of four DPYD genotypes related to severe fluoropyrimidine toxicity within European populations [56]. However, while the frequency of these single nucleotide polymorphisms (SNPs) could reach ~10% in some European populations, the reported frequency in Latin-American populations is below 0.7% [57] (Suarez-Kurtz 2020; Nugent et al. 2019). This may be primarily due to the underrepresentation of the Latin-American population in genetic studies [58]. Accordingly, neither homozygous nor heterozygous variant genotypes of DPYD (rs55886062), which is included in the CPIC guide, were detected in the Chilean patients in this study (data not shown). Given the proportion of the world population that is not of European ancestry, the identification of new associations between the genome and fluoropyrimidine and platinum toxicity is of the utmost importance and may complement the current CPIC guidelines once further validation has been completed. Our findings are among the first to frame toxicity pharmacogenetics in an underexplored Latin American population and, given the inherent global differences in SNP distribution, it is conceivable that future pharmacogenetic tests will be applied on a regional or population-specific basis.
In line with previous GC reports, patients in our cohort were predominantly males [59,60] and advanced stage [42,60,61]. Similarly, median overall survival, histological type and overall toxicity were in agreement with the current literature (Supplementary Table S1 and Supplementary Fig. S1). Regarding age at diagnosis, the association with combined chemotherapy toxicity is probably explained by the age of recruited participants in most GC-trials that range between 50 and 60 years [62]. Interestingly, the inclusion of age in our final model increased the predictive power. A study reported no significant differences in the incidence of grade 3-4 toxicities in gastro-esophageal cancer patients comparing ≥70 vs < 70 year-old participants [63]. However, in many cases a higher prevalence of toxicity in older patients leads to chemotherapy discontinuation [64]. In contrast, a pooled analysis concluded that chemotherapy-related serious adverse events were significantly higher in > 65 year-old patients [65]. Accordingly, a recent study demonstrated that older GC patients (≥ 70) experience more severe toxicities versus younger patients [66]. A number of studies, including meta-analyses, have shown an increased risk of severe toxicity associated to fluoropyrimidine/platinum-based chemotherapy in female gastric and colorectal cancer patients [7,36,37,67]. In accordance, we found a trend towards higher toxicity among females in our study (Table 2).
Compared to intravenous 5-FU, oral capecitabine (5-FU pro-drug) increases OS and response rates in combination with platinum compounds. Also, 5-FU/cisplatin is associated with greater toxicity [68][69][70]. In line with these findings, we observed that incorporation of certain regimens improved the predictive power of our models (Table 3). A recent study in colorectal cancer patients demonstrated that FOLFOX was associated with a significant increase in stomatitis and neutropenia, but decreased diarrhea and hand-foot syndrome versus CAPEOX [71].
Our model 2 includes the most relevant associations between selected SNPs and overall toxicity (Table 4). Several reports confirm that DPYD is a rate-limiting enzyme for 5-FU catabolism. In fact, DPYD deficiency is commonly associated with lower drug clearance and increased toxicity [72]. In our analysis, AG/GG DPYD (rs1801265) genotypes were associated with higher grade 3-4 toxicity (OR = 4.20, p = 0.002). This variant causes a Cys 29 to Arg substitution that reduces DPYD enzymatic activity and increases 5-FU-related toxicity [73]. Functional studies demonstrate that the AG and GG genotypes of DPYD (rs1801265) have significantly lower 5-FU degradation rates (5-FUDR) compared to AA, with a profound effect for GG [73]. Likewise, the CT DPYD (rs2297595) genotype was associated with grade 3-4 toxicity versus TT, although individually this association was not significant (OR = 2.71, p = 0.21). The same study reported that the CT DPYD (rs2297595) genotype had a significantly lower 5-FUDR versus TT [73]. On the other hand, the TC/CC DPYD (rs1801159) genotypes were associated with a lower probability of grade 3-4 overall toxicity versus the TT genotype (OR = 0.45, p = 0.071). However, studies on this polymorphism are somewhat inconsistent: some have reported an association with increased severe toxicity [28,74,75] and others no association [76][77][78]. A potential explanation for our finding is the high frequency of the C allele in this subset, reaching 34% (Supplementary Table S8). In sharp contrast, European cohorts report 18.3% for the C allele (n = 157) [73]. Similarly, an Asian study reports a 27% frequency (n = 362) [75]. Notably, the C allele frequency in the American population is 27%, whereas in East Asian, European, African and South Asian populations it is 27, 19, 15 and 8%, respectively [32]. Therefore, differences can be attributed to specific geographical/ethnic factors. Our study also found an association between GSTP1/ERCC2 SNPs and overall toxicity.
These are linked to the formation of DNA-adducts. The AG/GG GSTP1 (rs1695) genotypes were associated with a lower probability of grade 3-4 toxicity compared to AA. This "protective" role of the G allele has been previously reported in gastro-esophageal [79], colorectal [78], ovarian [80], testicular [41] and lung cancer [81]. A potential mechanism to explain this protective role could be the activation of the JNK pathway [79,82] that increases cell defense mechanisms. Conversely, GT/GG genotypes in ERCC2 (rs13181) were associated with a higher probability of grade 3-4 overall toxicity (Table 4). These polymorphic variants decrease repair efficacy and may thus increase DNA adducts [83,84]. This suggests that increased toxicity may be mediated by platinum damage to normal cells [85].
In line with previous publications [7], our study found a significant association between DPYD (rs1801265) SNPs and grade 3-4 overall toxicity only in male patients (Supplementary Table S4). Our paired-SNP analysis found a strong association between DPYD (rs1801265) and ABCC2 (rs717620) SNPs and overall toxicity (see Table 5). In particular, AG/GG (DPYD) and CT/TT (ABCC2) patients had a high probability of developing grade 3-4 overall toxicity (OR = 11.25, 95% CI = 1.25-245.45). The ABCC2 (rs717620) polymorphism is located in the promoter region of the gene and has been previously associated with decreased protein expression in vitro [86]. ABCC2 also mediates the export/elimination of glutathione-oxaliplatin conjugates [87]; therefore, an impaired function could decrease export of the drug, leading to toxicity. Previous studies in colorectal and lung cancer [88,89] have associated this polymorphism with severe fluoropyrimidine/platinum-related hematological toxicity. Thus, the DPYD (rs1801265)/ABCC2 (rs717620) SNP combination could potentiate fluoropyrimidine and/or oxaliplatin-derived toxicities. Again, given the SNP frequency in our analysis, this finding requires further validation in a larger cohort.
Using a classification algorithm that involved 28 SNPs and 1 clinical variable (histology), Yin et al. reported the best prediction of toxicity in lung cancer patients who received platinum-based therapies [39]. Moreover, these authors demonstrated that the ABCG2 rs2231142-CES5A rs3859104 SNP combination was strongly associated with grade 3-4 platinum toxicity (adjusted OR = 8.044, p = 4.350 × 10⁻⁵) [90]. In this regard, our models displayed better fits (based on Pseudo R²) after adding clinical/treatment factors and SNPs (models 3 and 4, Table 6). Our model 4 was the best-fitted model in terms of sensitivity, specificity, accuracy and AUC (Fig. 1D). Previous studies have also used this strategy with consistent results [7,40,41,91].
Fig. 2 Nomogram for estimating overall toxicity risk based on the multifactorial model 4. A The nomogram was developed on the basis of the final multivariate logistic regression model. B The total sum of points of low and high-toxicity groups is shown in the scatter plot on the left. Broken lines represent the probability of developing severe toxicity according to the points. On the right side, the percentage of low and high-toxicity groups is shown according to the probability estimated in the nomogram. Low Tox. Low-toxicity group; High Tox. High-toxicity group. DPYD 6, rs1801265; DPYD 3, rs1801159 (in nomogram). Significance: P < 0.05
Grade 3-4 toxicity is observed in only 10-15% of gastric cancer patients, as medical oncologists often alter treatment protocols when lower-grade toxicities start to manifest. The number of cases and controls incorporated into this study allowed statistically significant differences to be observed; however, although 224 medical records were screened, we recognize that the number of patients limited further interpretation of our data, and validation in a larger cohort is therefore required before these models can be considered in a clinical setting. A future cohort will permit more precise analysis of the combinatorial SNPs. Certain SNPs that did not reach individual significance (p < 0.05) were included in the models because their effect on the development of severe toxicity has been previously reported or because their inclusion improved the Pseudo R² values. Interestingly, according to the CPIC guide, the polymorphisms DPYD rs1801265 (also known as DPYD*9A) and DPYD rs1801159 (also known as DPYD*5), which were both incorporated into our model, are classified as not affecting DPD function in a clinically relevant manner in the context of 5-fluorouracil related toxicity [30].
Interestingly, in accordance with our results, recent reports have shown an association with severe toxicity and high levels of 5-FU post-treatment for DPYD-rs1801265 [92][93][94].
A further option is to compare our model and nomogram with those of similar studies. Schwab et al. [7] and Botticelli et al. [95] previously proposed nomograms to predict 5-FU toxicity, and these could be incorporated as a reference in future validation studies. A potential criticism of this study is the heterogeneity of treatments received by patients; a prospective study considering a limited number of regimens would allow analysis of the actual doses and duration of chemotherapy as potential variables associated with toxicity. While this is true, our gastric cancer patients and their treatments are a reflection of standard clinical practice. This heterogeneity in treatment was present despite the patients being part of the same recruiting clinical center and being treated by the same group of medical oncologists. Thus, the heterogeneity observed among our fluoropyrimidine and platinum-based treatments will always exist in the oncology clinic, and any predictive model or algorithm will need to be effective in the face of this variable. The real-world treatment and clinical outcomes in this study have allowed a "proof of concept" of a model which integrates clinical and pharmacogenetic (three SNPs in two different genes) variables to improve the prediction of toxicity associated with fluoropyrimidine and platinum-based chemotherapy. Furthermore, recent studies have demonstrated that genetic elements outside the coding region of genes are potential regulators of pharmacokinetic and pharmacodynamic processes. In the TYMS gene, the 5'UTR 28 bp-repeat VNTR (rs45445694) and the 3'UTR 6 bp-indel (rs11280056) have been associated with severe toxicity in patients receiving fluoropyrimidine-based treatments [77].
Furthermore, regulatory non-coding RNAs, such as circular RNAs (circRNAs) and long non-coding RNAs (lncRNAs), have been correlated with clinical variables (TNM stage, presence of metastasis and diagnosis) in patients with GC [96]. Interestingly, polymorphisms in the lncRNA genes ANRIL (rs1333049) and MEG3 (rs116907618) were associated with severe overall and gastrointestinal toxicity in patients with lung cancer treated with platinum-based chemotherapy [97]. MicroRNAs (miRNAs) are another regulator of gene expression, and genetic variations in miRNA binding sites are associated with an altered drug response [98]. In a similar vein, a recent publication by Powell et al. reported the mapping of miRNA-mRNA interactions in several pharmacogenes. The authors identified an hsa-mir-27b-DPYD interaction at a previously validated binding site, which may suggest the existence of additional elements that contribute to individual variation in drug response [99].
Given that the purpose of this study is to predict chemotherapy toxicity, it would be interesting in further validation studies to test our proposed clinical variables and the individual SNPs and SNP combinations together with emerging variables such as changes in the expression, sequence, and binding sites of non-coding RNAs. This may give us a more complete picture of how to predict severe toxicity associated with chemotherapy and thus improve patient quality of life and survival.

Conclusions
In summary, in the absence of reliable markers and clinically relevant models to predict patient toxicity derived from fluoropyrimidine/platinum-based chemotherapy, we present, for future validation, a logistic regression-based model that integrates clinical and treatment variables with common SNPs.
Additional file 1: Supplementary Material S1. Supplementary Fig.  S1. Overall survival rates in the study cohort, Supplementary Fig. S2. Calibration plot for the prognostic model, Supplementary Table S1. Demographic and clinic-pathological characteristics of study population (N = 93), Supplementary