Polygenic risk prediction models for colorectal cancer: a systematic review

Sassano, Michele; Mariani, Marco; Quaranta, Gianluigi; Pastorino, Roberta; Boccia, Stefania

doi:10.1186/s12885-021-09143-2

Research
Open access
Published: 15 January 2022

BMC Cancer volume 22, Article number: 65 (2022)

Abstract

Background

Risk prediction models incorporating single nucleotide polymorphisms (SNPs) could lead to individualized prevention of colorectal cancer (CRC). However, the added value of incorporating SNPs into models with only traditional risk factors is still not clear. Hence, our primary aim was to summarize literature on risk prediction models including genetic variants for CRC, while our secondary aim was to evaluate the improvement of discriminatory accuracy when adding SNPs to a prediction model with only traditional risk factors.

Methods

We conducted a systematic review on prediction models incorporating multiple SNPs for CRC risk prediction. We tested whether a significant trend in the increase of Area Under Curve (AUC) according to the number of SNPs could be observed, and estimated the correlation between AUC improvement and number of SNPs. We estimated pooled AUC improvement for SNP-enhanced models compared with non-SNP-enhanced models using random effects meta-analysis, and conducted meta-regression to investigate the association of specific factors with AUC improvement.

Results

We included 33 studies, 78.79% using genetic risk scores to combine genetic data. We found no significant trend in AUC improvement according to the number of SNPs (p for trend = 0.774), and no correlation between the number of SNPs and AUC improvement (p = 0.695). Pooled AUC improvement was 0.040 (95% CI: 0.035, 0.045), and the number of cases in the study and the AUC of the starting model were inversely associated with AUC improvement obtained when adding SNPs to a prediction model. In addition, models constructed in Asian individuals achieved better AUC improvement with the incorporation of SNPs compared with those developed among individuals of European ancestry.

Conclusions

Though not conclusive, our results provide insights on factors influencing discriminatory accuracy of SNP-enhanced models. Genetic variants might be useful to inform stratified CRC screening in the future, but further research is needed.

Introduction

Colorectal cancer (CRC) is currently the third most commonly diagnosed type of cancer and the second leading cause of cancer death worldwide, with an estimated 1.8 million new cases and 880,000 deaths in 2018, and a greater burden among males than among females [1]. CRC can typically be considered a disease related to wealth. National levels of both CRC incidence and mortality are closely related to the income and development level of the country, with a cumulative risk of CRC or CRC death three times higher in countries with a high Human Development Index (HDI) than in countries with a medium or low HDI [1].

Over the last decade, the majority of countries in Europe, Oceania and North America witnessed a decrease in CRC mortality [2]. One of the main reasons for this reduction in mortality rates in Western or developed countries is likely the adoption of screening programs for CRC. Different screening methods and strategies are effective at reducing CRC mortality and have been implemented in countries worldwide, most commonly fecal occult blood testing and the fecal immunochemical test [3,4,5,6]. However, in recent years researchers have explored the possibilities of stratified screening, through the use of prediction models that could guide CRC risk assessment for asymptomatic patients [7]. In particular, the most recent research in this field has focused on the inclusion of genetic factors into prediction models, particularly through the use of a genetic risk score (GRS) or a polygenic risk score (PRS) [8]. Furthermore, the increasing number of genome-wide association studies (GWASs) being conducted, with more than 70 GWASs currently published for CRC [9], is progressively improving our knowledge of the impact of common genetic variants, or single nucleotide polymorphisms (SNPs), on the risk of CRC. It should be noted that up to 35% of inter-individual variability in CRC risk has been attributed to genetic factors [10, 11], which makes the importance of this field for public health evident. Genetic factors could guide CRC risk assessment, thus improving the effectiveness of currently available screening strategies.

However, the methods currently used by researchers to incorporate genetic factors into prediction models for CRC, and the characteristics of these models, are highly heterogeneous [8]. In addition, the potential improvement in discriminatory accuracy yielded by the addition of genetic factors to CRC prediction models including only traditional risk factors is still unclear, as it is not certain whether the number of genetic variants included in the models is related to such improvement.

For these reasons, the primary aim of the present study is to perform a systematic review of polygenic risk prediction models for CRC in order to identify which prediction models including genetic risk variants for CRC have been reported in the scientific literature.

The secondary aim is to assess the impact, in terms of improvement in discriminatory accuracy, of the addition of SNPs into prediction models with only traditional risk factors, and to test whether there is any relation between the number of SNPs included in the models and the improvement of their discriminatory accuracy. In addition, we aimed to evaluate which factors, besides the number of SNPs, influence the improvement of discriminatory accuracy.

Methods and materials

We registered a protocol for this review on PROSPERO (Record ID: CRD42019135304), the international prospective register of systematic reviews. We uploaded on the PROSPERO register, prior to completing data extraction, the review title, timescale, team details, methods, and general information.

Search strategy and study selection

We queried the PubMed, Web of Knowledge, Embase and CINAHL Complete electronic databases up to February 2020 using the elements of the Population, Intervention, Comparator, Outcome (PICO) model (P, population/patient; I, intervention/indicator; C, comparator/control; and O, outcome) [12]. In detail, our study population was represented by colorectal cancer; the intervention by SNPs; the comparator was none, and the outcome was represented by risk prediction models. Accordingly, we built the following search string: (“Colorectal Neoplasms”[Mesh] OR “colorectal cancer” OR “colon cancer”) AND (“genetic variant” OR “genetic variants” OR “genetic variation” OR “genetic data” OR polymorphism OR SNP OR SNPs OR polygenic) AND (“risk stratification” OR “risk model” OR “risk profile” OR “risk profiling” OR “risk prediction” OR “risk determination” OR “risk discrimination” OR “risk score” OR “predictive model” OR “prediction model” OR “prediction models” OR “stratified screening”). The search was refined by hand searching and analysis of bibliographic citations in order to identify missing articles. No publication time limits were applied.

The manuscript was written following the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Supplementary material) [13].

We systematically searched databases to retrieve all eligible scientific studies that developed, compared or validated a prediction model (or clinical prediction rule based on a model) using multiple (at least two) SNPs to predict the risk of CRC.

Two independent investigators (M.M. and M.S.) screened titles and abstracts of all potentially pertinent articles to identify eligible studies. Full papers were then obtained, read and, if relevant, included following the same procedure. At all levels, any discrepancies and disagreements were resolved by consensus or by involving a third investigator (R.P.).

We included English-written peer-reviewed papers focusing on sporadic CRC reporting primary data and that evaluated the combined effect of two or more genes on CRC risk (e.g. GRS or PRS) or that reported a formal prediction model using genetic factors.

We excluded all studies that tested a model on simulated or pediatric populations, or that dealt with inherited forms of colorectal cancer (e.g. Lynch syndrome). Furthermore, we did not include commentaries, editorials, review papers, case reports, case series, book chapters, or articles with no primary data. Lastly, for articles updating previous ones, we included only the most recent study.

Data extraction

Data extraction was conducted independently by two researchers (M.M. and M.S.), for articles deemed relevant, using an in-depth piloted data extraction form and following an adapted version of the “CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies” (CHARMS) checklist [14]. Disagreements were solved through discussion or referral to a third reviewer (R.P.).

Extracted data include information regarding: author details; year of publication; study design; study population; sample size; genetic factors analyzed; GRS and related methods used to calculate it; factors other than genetic included in the model; internal and external validation; Area Under Curve (AUC) of non-SNP-enhanced models; AUC of SNP-enhanced models; Integrated discrimination improvement (IDI); and net reclassification improvement (NRI). In particular, NRI and IDI are measures used to compare the performances of two models, specifically an old model and a new model resulting from the addition of one or more predictors to the old one. The AUC is a measure of discriminatory accuracy and quantifies the ability of the model to discriminate between individuals with and without the outcome of interest [15], while NRI quantifies the ability of the new model to reclassify individuals compared to the previous one [16, 17], and IDI represents the difference in discrimination slopes of the new and the previous models, with the discrimination slope being the absolute difference in the averages of estimated probabilities of the event between those who experienced the event and those who did not [17,18,19].
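As an illustration of the three measures described above, the following minimal Python sketch computes the AUC, the IDI, and a category-free (continuous) NRI from predicted probabilities. The function names and the toy data are hypothetical and are not taken from any included study; the continuous NRI shown is one common formulation, and studies that define risk categories would compute the NRI differently.

```python
from statistics import mean

def auc(probs, labels):
    """Share of case/non-case pairs in which the case has the higher
    predicted probability (ties count as one half)."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def discrimination_slope(probs, labels):
    """Mean predicted probability in cases minus mean in non-cases."""
    cases = [p for p, y in zip(probs, labels) if y == 1]
    controls = [p for p, y in zip(probs, labels) if y == 0]
    return mean(cases) - mean(controls)

def idi(old_probs, new_probs, labels):
    """Difference in discrimination slopes: new model minus old model."""
    return (discrimination_slope(new_probs, labels)
            - discrimination_slope(old_probs, labels))

def continuous_nri(old_probs, new_probs, labels):
    """Net share of cases moved up plus net share of non-cases moved down."""
    def net_up(outcome):
        moves = [n - o for o, n, y in zip(old_probs, new_probs, labels)
                 if y == outcome]
        up = sum(m > 0 for m in moves)
        down = sum(m < 0 for m in moves)
        return (up - down) / len(moves)
    return net_up(1) - net_up(0)

# Hypothetical toy data: 1 = developed CRC, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
old_model = [0.6, 0.4, 0.5, 0.5, 0.3, 0.4]   # traditional risk factors only
new_model = [0.7, 0.6, 0.5, 0.4, 0.3, 0.5]   # traditional factors plus SNPs

auc_gain = auc(new_model, labels) - auc(old_model, labels)
```

In this toy example the SNP-enhanced model raises the predicted risk of two of the three cases, which is what drives both the IDI and the NRI above zero.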

For studies including both individuals with adenomas and CRC, we only extracted information about results related to CRC.

Quality assessment

The risk of bias of included studies was assessed by two investigators (M.M. and M.S.) using the Prediction model Risk Of Bias ASsessment Tool (PROBAST) [20]. PROBAST is a tool developed to assess the risk of bias and applicability of prediction model studies and contains a total of 20 signaling questions divided into 4 key domains that regard: participants, predictors, outcome, and analysis. Each domain is rated for risk of bias (low, high or unclear risk of bias). The signaling questions can be rated as “yes”, “probably yes”, “probably no”, “no” or “no information”. Every signaling question is phrased so that “yes” or “probably yes” mean absence of bias, while “no” or “probably no” warn for potential risk of bias. The first three domains that regard participants, predictors and outcome are also assessed for concerns for applicability (high, low, or unclear) to the defined review question.

Statistical analysis

Statistical analysis was carried out including only studies that reported both a model with only traditional risk factors and one also incorporating genetic factors. For studies that calculated the AUCs of the same model constructed in different ways (e.g. counted GRS and weighted GRS), only the model showing the best performance or, for those showing the same values of AUC, the simplest one was included in the analysis. Stratification according to the number of SNPs was conducted using tertiles based on the distribution of the number of SNPs included in the models across included studies, with the lowest, mid, and highest tertiles being represented by ≤22, 23–47, and ≥48 SNPs, respectively. We calculated standard errors of AUCs using the Hanley and McNeil method [15].
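For readers unfamiliar with the Hanley and McNeil approximation, a minimal Python sketch is given below, together with a helper that assigns a model to the SNP tertiles defined above. The function and argument names are hypothetical, and the example inputs are illustrative rather than values from an included study.

```python
import math

def hanley_mcneil_se(auc, n_cases, n_controls):
    """Approximate standard error of an AUC (Hanley and McNeil)."""
    q1 = auc / (2 - auc)            # P(two random cases both outrank one non-case)
    q2 = 2 * auc ** 2 / (1 + auc)   # P(one case outranks two random non-cases)
    variance = (
        auc * (1 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)

def snp_tertile(n_snps):
    """Tertile labels used in this review: <=22, 23-47, >=48 SNPs."""
    if n_snps <= 22:
        return "lowest"
    if n_snps <= 47:
        return "mid"
    return "highest"

# Illustrative input: AUC of 0.70 estimated from 100 cases and 100 controls.
example_se = hanley_mcneil_se(0.70, 100, 100)
```

The standard error shrinks as the number of cases and controls grows, which is why larger studies carry more weight in an inverse-variance meta-analysis.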

First, we tested whether a significant trend in the increase of the AUC of the SNP-enhanced models according to the number of SNPs included in the models could be observed. Secondly, we estimated Pearson’s correlation coefficient between AUC improvement and number of SNPs. Finally, we investigated whether the increasing number of SNPs added to the baseline models determined an observable trend in the improvement of the AUC by drawing a forest plot. In order to calculate a pooled AUC improvement for SNP-enhanced models compared with non-SNP-enhanced models, we conducted a meta-analysis using the random effects model, based on the assumption that clinical and methodological heterogeneity was very likely to occur and to have an effect on the results. We quantified statistical inconsistency using the I² statistic. Moreover, we assessed whether specific factors (number of cases, number of SNPs, publication year, AUC of non-SNP-enhanced model, ethnicity of study participants, number of traditional risk factors in the model, and inclusion of gender in the model either as a covariate or by stratification) were significantly associated with AUC improvement and explained statistical heterogeneity by conducting meta-regression, with p-values adjusted for multiple testing computed using 1000 Monte-Carlo permutations.
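The pooling step can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator of between-study variance, which is a common choice for random effects meta-analysis; the text above does not state which estimator was used. The function name and the input values are hypothetical.

```python
import math

def random_effects_pool(effects, standard_errors):
    """DerSimonian-Laird random effects pooling with the I² statistic.

    Returns (pooled effect, 95% CI as a tuple, tau², I² in percent).
    """
    k = len(effects)
    w = [1 / se ** 2 for se in standard_errors]           # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    w_re = [1 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, ci, tau2, i2

# Hypothetical AUC improvements and their standard errors for three models.
improvements = [0.01, 0.03, 0.05]
ses = [0.01, 0.01, 0.01]
pooled, ci, tau2, i2 = random_effects_pool(improvements, ses)
```

Because the random effects weights add the between-study variance to each study's own variance, very precise studies dominate the pooled estimate less than they would under a fixed-effect model.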

All statistical analyses were conducted using the Stata software version 13.0 [21].

Results

Study selection

The results of abstract and full-text screening, with reasons for exclusion, are shown in the PRISMA flow diagram [13] in Fig. 1. The database search resulted in 749 records, and a further 6 articles were retrieved through hand search. After removal of duplicates, 566 articles were assessed for eligibility and 472 were excluded after title and abstract screening. The remaining 94 articles were selected for full-text review, resulting in 33 articles included in the qualitative synthesis, 10 of which were also included in the meta-analysis. The main reasons for exclusion were: articles with no primary data or with simulated populations (35%); non-pertinent articles (30%); articles on populations with inherited forms of colorectal cancer (20%); studies that were later updated and published (10%); and studies that pooled individuals with CRC and individuals with benign colorectal polyps without distinguishing between the two populations (5%).
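The record counts in this paragraph can be checked with simple arithmetic. The short Python sketch below only recombines the numbers reported above; the variable names are illustrative, and the two derived counts (duplicates removed, full texts excluded) are computed here rather than stated in the text.

```python
# Counts reported in the study selection paragraph.
database_records = 749
hand_search_records = 6
after_duplicates = 566
excluded_on_title_abstract = 472
included_in_review = 33
included_in_meta_analysis = 10

# Derived counts (computed, not reported in the text).
identified = database_records + hand_search_records
duplicates_removed = identified - after_duplicates
full_text_reviewed = after_duplicates - excluded_on_title_abstract
excluded_on_full_text = full_text_reviewed - included_in_review
```

The full-text count derived this way matches the 94 articles reported for full-text review.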

Study and population characteristics

The main characteristics of the articles included in the systematic review are summarized in Table 1. Studies included in this review were published between 2008 and 2019. Most of them were case-control studies (26; 78.79%) [22, 23, 25, 27,28,29,30,31,32,33,34,35,36, 39, 41,42,43, 45,46,47, 49,50,51,52,53,54], followed by 5 cohort studies (15.15%) [24, 38, 40, 44, 48] and 2 case-cohort studies (6.06%) [26, 37]. No sample overlap was found across studies. Twenty-one studies (63.64%) evaluated risk prediction models among individuals of European ancestry [23, 24, 26,27,28, 30,31,32, 34, 35, 38,39,40,41,42,43,44,45,46, 49, 50], and 12 (36.36%) among populations of Asian ancestry [22, 25, 29, 33, 36, 37, 47, 48, 51,52,53,54]. Population sizes ranged from 603 [47] to 361,543 [44] individuals.

Table 1 Main characteristics of the included studies in the systematic review

Risk prediction models characteristics

The number of genetic variants evaluated in the risk prediction model ranged from 4 [54] to 696 SNPs [45]. A complete list of SNPs included in each study is provided in Table S1.

The included studies used different methodologies to incorporate genetic factors into prediction models. In particular, 26 studies (78.79%) used a GRS. Of these 26 studies, 11 (42.31%) used a weighted GRS [31, 33,34,35, 40, 42,43,44,45,46, 52], 6 (23.08%) used an unweighted GRS [22, 24, 26,27,28,29], and 9 (34.62%) used both unweighted and weighted methods to develop risk scores [23, 25, 30, 32, 36, 37, 49,50,51].
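The difference between the two scoring approaches can be sketched in a few lines of Python. The sketch assumes the usual definitions, in which an unweighted (counted) GRS sums the risk alleles carried at each SNP and a weighted GRS multiplies each count by the log odds ratio of that SNP; individual studies may rescale or normalize their scores differently. The function names, genotypes and odds ratios are hypothetical.

```python
import math

def unweighted_grs(risk_allele_counts):
    """Counted GRS: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(risk_allele_counts)

def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted GRS: risk allele counts weighted by per-allele log odds ratios."""
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per SNP")
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))

# Hypothetical individual genotyped at three SNPs.
genotype = [0, 1, 2]                 # risk alleles carried at each SNP
per_allele_or = [1.10, 1.20, 1.05]   # illustrative per-allele odds ratios

counted = unweighted_grs(genotype)
weighted = weighted_grs(genotype, per_allele_or)
```

The counted score treats every risk allele as equally important, whereas the weighted score lets SNPs with larger effect sizes contribute more, at the cost of depending on effect estimates from an external source such as a GWAS.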

Of the remaining 7 studies (21.21%), which did not use a GRS, one [39] derived 7 genes from a larger set: after gene profiling and cluster analysis, specific genes were selected, further validated and evaluated for predictive performance. A second study performed a Mendelian randomization analysis to assess the association between hyperlipidemia and CRC using Burgess statistics [55] and a fixed-effects meta-analysis to derive final odds ratios [41], while another [47] applied logistic regression, Jackknife feature selection and ANOVA testing to construct the prediction model. Other authors [53] applied a stepwise selection procedure to determine the inclusion or exclusion of putative risk factors from the models, and assessed the combined effect of genes on colorectal cancer risk by multivariate unconditional logistic regression. Two studies used machine learning approaches [38, 54], and the last one evaluated the predictive accuracy of genetically corrected serum levels of specific biomarkers compared with uncorrected ones [48].

Difference in discriminatory accuracy between SNP-enhanced and traditional risk factor models

Using the Swets classification [56], i.e. low accuracy when the AUC is between 0.5 and 0.7 and moderate accuracy when it is between 0.7 and 0.9, only two of the studies that included both a model with traditional risk factors only and one also incorporating genetic factors found a moderate discriminatory accuracy. The first study [36] showed that, only among males, AUC values for models including counted GRS and weighted GRS reached 0.729 (95% CI: 0.682, 0.767) and 0.719 (95% CI: 0.677, 0.761), respectively, while models without SNPs showed low accuracy (i.e. AUC lower than 0.7). The second study [37] found moderate discriminatory accuracy for both SNP-enhanced and non-SNP-enhanced models. In particular, when overall colon and rectal cancer risk, colon cancer risk only, and rectal cancer risk only were considered separately, SNP-enhanced models yielded AUC values of 0.74 (95% CI: 0.70, 0.78), 0.75 (95% CI: 0.69, 0.81), and 0.74 (95% CI: 0.68, 0.79), respectively, while non-SNP-enhanced models yielded AUC values of 0.73 (95% CI: 0.69, 0.78), 0.76 (95% CI: 0.70, 0.83), and 0.71 (95% CI: 0.65, 0.77), respectively.

A total of 4 articles [33, 37, 49, 51] used the NRI and/or the IDI to compare the performances of two models (traditional-only vs genetic-enhanced model). In the first article [37], the NRI for a prediction model with GRS compared with the traditional risk score model was 0.17 (95% CI: −0.05, 0.37) for CRC, −0.17 (95% CI: −0.33, 0.21) for colon cancer only, and 0.41 (95% CI: 0.10, 0.68) for rectal cancer only. The second one [33] found an improvement of the inclusive model over the non-genetic model for both the mean IDI (0.015) and the mean continuous NRI (0.39). After defining risk categories for the NRI by arbitrary cut-off values of 1.5 and 3% of 10-year absolute risk of developing colorectal cancer, the mean NRI value was equal to 0.12 when the non-genetic and inclusive models were compared. The third [49] showed an increase in the NRI in all the models when different variables were included in the model (Table 1). Finally, the last one [51] found that the traditional model with smoking status performed worse than the combined model that included genetic (simple count GRS) and smoking factors, with an NRI of 0.317 (95% CI: 0.225, 0.408) and an IDI of 0.031 (95% CI: 0.023, 0.039).

AUC analysis

A total of 14 risk prediction models from 10 studies were included in the AUC analysis [23, 30, 32, 33, 35,36,37, 44, 49, 51]. We found no significant trend in the increase of the AUC of the SNP-enhanced risk prediction models according to the number of SNPs included in the models (p for trend = 0.774). Likewise, Pearson’s correlation coefficient showed no correlation between the number of SNPs and AUC improvement: r = −0.0993 (95% CI: −0.541, 0.385; p = 0.6951).

The meta-analysis resulted in a pooled estimate of AUC improvement for SNP-enhanced prediction models compared with non-SNP-enhanced models of 0.040 (95% CI: 0.035, 0.045) for all 14 models (Fig. 2). Heterogeneity was high (I² = 98.5%, p < 0.001).

A stratified analysis by number of SNPs included across models was performed (Fig. 3). For the lowest tertile of SNPs added to the model (22 SNPs or fewer), the AUC of SNP-enhanced models improved over that of non-SNP-enhanced models by 0.044 (95% CI: 0.022, 0.067). For the mid (23–47 SNPs) and highest (48 SNPs or more) tertiles, the estimated improvement in the AUC was 0.018 (95% CI: 0.014, 0.022) and 0.045 (95% CI: 0.031, 0.058), respectively.

The results of the meta-regression (Table 2) showed that the factor most strongly, and inversely, associated with AUC improvement after the addition of SNPs to a model with only traditional risk factors was the AUC of the non-SNP-enhanced model (p < 0.001). Furthermore, a significant inverse association was also found between the number of cases included in the study and AUC improvement (p = 0.002). Finally, ethnicity was also associated with AUC improvement (p = 0.023), with better AUC improvements achieved by models constructed among Asians compared with those constructed among individuals of European ancestry. No significant associations were found for the other investigated factors. Overall, the factors included in the meta-regression explained almost half of the statistical heterogeneity, with a residual I² equal to 54.18%.

Table 2 Results of the meta-regression assessing which factors are associated with AUC improvement of SNP-enhanced models compared with non-SNP enhanced models

Quality assessment

Results of the overall risk of bias and applicability assessment can be found in Table 3.

Table 3 Results of the risk of bias for each domain of the PROBAST tool

The majority of the studies (93.94%) were scored as having high risk of bias [22,23,24,25,26,27,28,29,30, 32,33,34,35,36,37,38,39,40,41,42, 44,45,46,47,48,49,50,51,52,53,54, 57], while 2 (6.06%) studies were rated as having an overall unclear risk of bias [31, 43].

A total of 22 (66.67%) studies were assessed only for the development of the model, 8 (24.24%) studies were assessed for both model development and validation, and 3 (9.09%) only for model validation.

As to model development, 66.67, 36.67, 20.00 and 70.00% of the studies were assessed as having high risk of bias with respect to participants, predictors, outcome and statistical analysis, respectively; 33.33, 20.00, 63.33 and 3.33% were deemed as having a low risk of bias; and 0.00, 43.33, 16.67 and 26.67% were assessed as having unclear risk of bias for the same four domains.

As to validation models, 27.27, 36.36, 45.45, 9.09% of the included studies were assessed as having low risk of bias for participants, predictors, outcome and statistical analysis, respectively; while 72.73, 63.64, 54.55 and 90.91% were rated as high or unclear risk of bias.

Regarding the applicability of prediction models, 30.00, 3.33, and 0.00% of model development studies and 18.18, 0.00, and 9.09% of validation studies were rated as raising high or unclear concern with respect to participants, predictors and outcome, respectively.

Discussion

Overall, from the 33 studies that we included in our systematic review we identified prediction models for CRC incorporating genetic factors, with extreme heterogeneity regarding the number of genetic factors included. As for the methods used to include genetic factors in the prediction model, the most common approach was a weighted GRS, while fewer studies used either the count model or both the weighted and count methods.

As for studies reporting the AUC value of the model, most of them could not find a satisfactory discriminatory accuracy (e.g. AUC > 0.7 [56]) for their models, even though the addition of genetic factors to traditional risk factors improved it, with an improvement in the AUC ranging from 0.010 [37, 44] to 0.084 [51]. Nonetheless, similarly to what was previously reported for breast cancer [58], we found no evidence of association or correlation between the number of SNPs included in the model and the improvement in the AUC value. However, among studies comparing two or more models, only a minority reported data on NRI or IDI, which underscores the need to better quantify and report the improvement in accuracy of a model when adding new biomarkers or genetic data [59]. According to the interpretation suggested by Pencina et al. for NRI values, all four of these studies showed a weak or intermediate strength of SNPs (for all of them in the form of a GRS), in terms of discriminatory potential, when added to models with only traditional risk factors [17].

Regarding the pooled improvement in AUC, a clear trend in the improvement of AUC related to the number of SNPs could not be found. The best results were achieved in the lowest (≤22 SNPs) and highest (≥48 SNPs) tertiles of SNPs incorporated into the models, which led to a larger improvement in AUC compared with the mid tertile (23–47 SNPs). As expected, our meta-analysis results show significant statistical heterogeneity, as indicated by the high values of the I² obtained. This reflects the extremely high heterogeneity in the SNPs and environmental factors included in the retrieved prediction models, and in the statistical methods used to incorporate such variables in the models. For this reason, the results of our study should be interpreted cautiously and cannot be considered conclusive.

Similarly to our results, Fung et al. reported that the addition of genetic information improved discriminatory accuracy of the identified prediction models for breast cancer, even though AUC improvement was found to be not correlated or associated with the number of SNPs that were included in the model [58].

It should be noted that the improvement of AUC values with the addition of biomarkers, such as SNPs, to a model depends on the starting AUC value: the higher the AUC value of the model including only traditional risk factors, the smaller the improvement in AUC after adding genetic information into the model [17, 60, 61]. This was further confirmed by the results of our meta-regression. In addition, an inverse relation with AUC improvement was also found for the number of cases included in the study, which could actually be linked to the AUC of the non-SNP-enhanced model. Likely, the higher the number of cases in the study, the larger the AUC of the non-SNP-enhanced model and, hence, the smaller the AUC improvement.

Furthermore, the ethnicity of study participants was found to significantly affect AUC improvement, suggesting possible differences in the role of genetic factors between different populations, and underscoring the need to foster research in the field of genetic prediction models for all ethnicities [62]. The distribution of genetic factors associated with a specific cancer may vary between different ethnicities even more than traditional risk factors; thus, ethnicity-specific genome-wide association studies (GWAS) are crucial to inform the development of specific prediction models for different ethnicities [22, 63]. Furthermore, the importance of the chosen population in the construction of predictive models should be properly taken into account, as a model is applicable only to the specific population it was designed for [60].

Finally, results of the meta-regression showed that the number of SNPs, publication year, the number of traditional risk factors in the model, and inclusion of gender in the model were not associated with AUC improvement. However, they partially explained statistical heterogeneity between included studies.

As far as we know, previous systematic reviews on prediction models for CRC including genetic factors were limited to a qualitative synthesis [8]. Hence, to our knowledge, our study is the first to investigate, through a quantitative approach, the improvement in discriminatory accuracy that can be obtained through the incorporation of SNPs into prediction models for CRC in addition to traditional risk factors. We also assessed which factors affect such improvement.

However, our study has some limitations. As previously mentioned, we identified extremely different prediction models, both in terms of genetic factors included in the models and in the methods used to include them, which range from weighted and unweighted GRS to machine learning methods. The accuracy of a model, in terms of AUC values, depends not only on the predictors that were used, but also on the method used for its construction [64]. Hence, as expected, this led to high heterogeneity of the results of our meta-analysis, which parallels what was previously described by Fung et al. regarding breast cancer [58]. Even though we showed that some factors partially explain such heterogeneity, our results should be considered exploratory and not conclusive due to the differences shown by included studies regarding chosen SNPs and traditional risk factors, as well as GRS computation methods.

Moreover, we found very limited high-quality evidence, with only one study having an overall low risk of bias [65], while the majority had a high risk of bias. This not only limits the strength of our results, but also strongly suggests the need for better reporting, using as guidance the GRIPS Statement [66] or its updates, such as the Polygenic Risk Score Reporting Standards (PRS-RS) [67], and for higher quality research in the field of prediction models, which applies to CRC and to other chronic conditions, e.g. cardiovascular diseases [68]. Notably, all these factors affecting heterogeneity might also have had an impact on other estimates we reported in the analysis. Indeed, discriminatory accuracy of prediction models is expected to improve with the addition of newly discovered SNPs [60], partially in contrast with our results. However, Khera et al. recently constructed 30 PRSs using millions of SNPs for five common diseases, obtaining PRSs with lower AUC values than those based on genome-wide significant SNPs only [69, 70]. This underlines the striking importance of an appropriate choice of SNPs to include in the models [58]. In addition, it should be noted that some SNPs used for risk prediction models by studies included in our analysis might not have been confirmed as risk loci by subsequent larger GWASs.

Furthermore, while recent research efforts in the field of PRS modelling are moving towards the inclusion of thousands or even millions of SNPs into prediction models through the use of sophisticated methods [70], such as LDpred2, lassosum, PRS-CS, and others [71,72,73], the highest number of SNPs in the models included in our analyses was less than one hundred, thus limiting the applicability of our findings.

To further advance knowledge in the field in the near future, the adequate application of existing guidelines to improve the quality of prediction model studies, especially regarding study design and/or standardization of the methodology used to conduct these types of study, will be essential [20]. We showed that the addition of genetic factors into a prediction model with only traditional risk factors improves its performance, even if slightly. However, it is debatable whether such an improvement could really have an impact on population health. In particular, in the field of disease prediction, great attention should be paid not only to prediction performance, but also to the clinical utility of the models [60]. As for CRC, disease prediction might play a key role in the personalization of screening programs, which could start earlier for individuals proven to be at higher risk compared with the average population. Hence, the use of a prediction model, especially if also incorporating genetic factors, might greatly impact the starting age of screening [35, 74]. In addition, knowing one's own personal risk of cancer could also be a useful trigger for individuals to improve their adherence to screening programs, which is known to be far from target levels [75].

The addition of genetic information may offer greater benefit when the models are used for risk prediction among specific subgroups of the population [8, 58]. This might imply that, in the future, these kinds of screening interventions could be implemented as a multi-step process: the first step would be the stratification of individuals according to their level of risk, followed by personalization of the interventions to carry out [58].

Finally, as recently reported by Naber et al. [76], if a prediction model having an AUC of at least 0.65 is adopted, stratified screening for CRC becomes cost-effective compared with the current uniform screening [77]. This underlines the importance of carrying out further research in this field to improve the performance of developed prediction models.
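To make the quantities discussed above concrete, the sketch below computes the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, the AUC improvement of a SNP-enhanced model over a traditional model, and a comparison with the 0.65 threshold reported by Naber et al. [76]. The function names and predicted risks are hypothetical and purely illustrative; this is not the analysis code used in this review.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case scores higher than a random control (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def auc_improvement(base_cases, base_controls, snp_cases, snp_controls) -> float:
    """AUC of the SNP-enhanced model minus AUC of the traditional model."""
    return auc(snp_cases, snp_controls) - auc(base_cases, base_controls)


COST_EFFECTIVENESS_AUC = 0.65  # threshold reported by Naber et al. [76]

# Hypothetical predicted risks for three cases and three controls, for illustration only.
base_cases, base_controls = [0.30, 0.20, 0.40], [0.25, 0.35, 0.10]
snp_cases, snp_controls = [0.35, 0.30, 0.45], [0.25, 0.32, 0.10]

delta = auc_improvement(base_cases, base_controls, snp_cases, snp_controls)
meets_threshold = auc(snp_cases, snp_controls) >= COST_EFFECTIVENESS_AUC
```

The pairwise comparison is quadratic in sample size and is meant only to show the definition; for real cohorts a rank-based implementation from a statistics library would be the usual choice.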

Conclusions

The integration of genetic information into traditional risk prediction models improves discriminatory accuracy for CRC. However, we could not find any association or correlation between the number of SNPs added to the model and AUC improvement. High heterogeneity in the choice of baseline model, method of incorporating genetic information, and studied population suggests that standardization in the conduct of this kind of study is needed. Further research is needed to improve knowledge and to target the people who would benefit most from this intervention. It is also crucial to consider how to apply the studied models in clinical and real-life settings. In fact, the implementation of prediction models into practice will require a better understanding of potential economic benefits and organizational effects, as well as patient safety, ethical, social, and legal implications, which will make the impact of polygenic prediction models on health systems clearer.

Availability of data and materials

All data relevant to the study are included in the published article, and can also be found in the original articles included in our study.

Abbreviations

CRC: Colorectal cancer
HDI: Human Development Index
GRS: Genetic risk score
PRS: Polygenic risk score
GWAS: Genome-wide association study
SNP: Single nucleotide polymorphism
PICO: Population, Intervention, Comparator, Outcome
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
CHARMS: CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies
AUC: Area Under Curve
IDI: Integrated discrimination improvement
NRI: Net reclassification improvement
PROBAST: Prediction model Risk Of Bias ASsessment Tool

References

Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424 [cited 2020 Aug 26]. https://acsjournals.onlinelibrary.wiley.com/doi/full/10.3322/caac.21492.
Wong MCS, Huang J, Lok V, Wang J, Fung F, Ding H, et al. Differences in incidence and mortality trends of colorectal cancer worldwide based on sex, age, and anatomic location. Clin Gastroenterol Hepatol. 2020;0(0) [cited 2020 Sep 1]. https://doi.org/10.1016/j.cgh.2020.02.026.
Gini A, Jansen EEL, Zielonke N, Meester RGS, Senore C, Anttila A, et al. Impact of colorectal cancer screening on cancer-specific mortality in Europe: a systematic review. Eur J Cancer. 2020;127:224–35 Elsevier Ltd. [cited 2020 Sep 1]. www.sciencedirect.com.
Zhang J, Cheng Z, Ma Y, He C, Lu Y, Zhao Y, et al. Effectiveness of screening modalities in colorectal cancer: a network meta-analysis. Clin Colorectal Cancer. 2017;16:252–63 Elsevier Inc.
Fitzpatrick-Lewis D, Ali MU, Warren R, Kenny M, Sherifali D, Raina P. Screening for colorectal cancer: a systematic review and meta-analysis. Clin Colorectal Cancer. 2016;15:298–313 Elsevier Inc.
Navarro M, Nicolas A, Ferrandez A, Lanas A. Colorectal cancer population screening programs worldwide in 2016: an update. World J Gastroenterol. 2017;23(20):3632 [cited 2020 Sep 1]. http://www.wjgnet.com/1007-9327/full/v23/i20/3632.htm.
Usher-Smith JA, Walter FM, Emery JD, Win AK, Griffin SJ. Risk prediction models for colorectal cancer: a systematic review. Cancer Prev Res. 2016;9:13–26 American Association for Cancer Research Inc. [cited 2020 Aug 26]. http://cancerprevres.aacrjournals.org/.
McGeoch L, Saunders CL, Griffin SJ, Emery JD, Walter FM, Thompson DJ, et al. Risk prediction models for colorectal cancer incorporating common genetic variants: a systematic review. Cancer Epidemiol Biomark Prev. 2019;28:1580–93 American Association for Cancer Research Inc.; [cited 2020 Sep 3]. https://cebp.aacrjournals.org/content/28/10/1580.
GWAS Catalog. Colorectal cancer. [cited 2020 Sep 3]. https://www.ebi.ac.uk/gwas/efotraits/EFO_0005842
Czene K, Lichtenstein P, Hemminki K. Environmental and heritable causes of cancer among 9.6 million individuals in the Swedish family-cancer database. Int J Cancer. 2002;99(2):260–6 [cited 2020 Sep 3]. https://onlinelibrary.wiley.com/doi/full/10.1002/ijc.10332.
Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, et al. Environmental and heritable factors in the causation of cancer — analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343(2):78–85 [cited 2020 Sep 3]. http://www.nejm.org/doi/abs/10.1056/NEJM200007133430201.
Richardson WS, Wilson MC, Nishikawa J, Hayward RS. The well-built clinical question: a key to evidence-based decisions. ACP J Club. 1995;123(3):A12–3.
Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339(7716):332–6.
Moons KGM, de Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction Modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744. https://doi.org/10.1371/journal.pmed.1001744 [cited 2020 Aug 26].
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36 [cited 2020 Aug 26]. https://pubmed.ncbi.nlm.nih.gov/7063747/.
Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27(2):157–72 [cited 2020 Oct 6]. https://onlinelibrary.wiley.com/doi/full/10.1002/sim.2929.
Pencina MJ, D’Agostino RB, Pencina KM, Janssens ACJW, Greenland P. Interpreting incremental value of markers added to risk prediction models. Am J Epidemiol. 2012;176(6):473–81 [cited 2020 Oct 6]. https://academic.oup.com/aje/article/176/6/473/118184.
Goldman N, Glei DA. Quantifying the value of biomarkers for predicting mortality. Ann Epidemiol. 2015;25(12):901–906.e4.
Yates JF. External correspondence: decompositions of the mean probability score. Organ Behav Hum Perform. 1982;30(1):132–56.
Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170(1):51 [cited 2020 Aug 26]. http://annals.org/article.aspx?doi=10.7326/M18-1376.
StataCorp. Stata statistical software: release 13. College Station: StataCorp LP; 2013.
Abe M, Ito H, Oze I, Nomura M, Ogawa Y, Matsuo K. The more from east-Asian, the better: risk prediction of colorectal cancer risk by GWAS-identified SNPs among Japanese. J Cancer Res Clin Oncol. 2017;143(12):2481–92 [cited 2020 Aug 26]. https://link.springer.com/article/10.1007/s00432-017-2505-4.
Balavarca Y, Weigl K, Thomsen H, Brenner H. Performance of individual and joint risk stratification by an environmental risk score and a genetic risk score in a colorectal cancer screening setting. Int J Cancer. 2020;146(3):627–34 [cited 2020 Aug 26]. https://onlinelibrary.wiley.com/doi/abs/10.1002/ijc.32272.
Chandler P, Tobias D, Wang L, Smith-Warner S, Chasman D, Rose L, et al. Association between vitamin D genetic risk score and cancer risk in a large cohort of U.S. women. Nutrients. 2018;10(1):55 [cited 2020 Aug 26]. http://www.mdpi.com/2072-6643/10/1/55.
Cho YA, Lee J, Oh JH, Chang HJ, Sohn DK, Shin A, et al. Genetic risk score, combined lifestyle factors and risk of colorectal cancer. Cancer Res Treat. 2019;51(3):1033–40 [cited 2020 Aug 26]. http://e-crt.org/journal/view.php?doi=10.4143/crt.2018.447.
de Kort S, Simons CCJM, van den Brandt PA, Janssen-Heijnen MLG, Sanduleanu S, Masclee AAM, et al. Diabetes mellitus, genetic variants in the insulin-like growth factor pathway and colorectal cancer risk. Int J Cancer. 2019;145(7):ijc.32365 [cited 2020 Aug 26]. https://onlinelibrary.wiley.com/doi/abs/10.1002/ijc.32365.
Dunlop MG, Tenesa A, Farrington SM, Ballereau S, Brewster DH, Koessler T, et al. Cumulative impact of common genetic variants and other risk factors on colorectal cancer risk in 42 103 individuals. Gut. 2013;62(6):871–81 [cited 2020 Aug 26]. http://gut.bmj.com/.
Hiraki LT, Qu C, Hutter CM, Baron JA, Berndt SI, Bézieau S, et al. Genetic predictors of circulating 25-hydroxyvitamin D and risk of colorectal cancer. Cancer Epidemiol Biomark Prev. 2013;22(11):2037–46 [cited 2020 Aug 26]. http://cebp.aacrjournals.org/.
Hosono S, Ito H, Oze I, Watanabe M, Komori K, Yatabe Y, et al. A risk prediction model for colorectal cancer using genome-wide association study-identified polymorphisms and established risk factors among Japanese. Eur J Cancer Prev. 2016;25(6):500–7 [cited 2020 Aug 26]. http://journals.lww.com/00008469-201611000-00003.
Hsu L, Jeon J, Brenner H, Gruber SB, Schoen RE, Berndt SI, et al. A model to determine colorectal cancer risk using common genetic susceptibility loci. Gastroenterology. 2015;148(7):1330–1339.e14.
Huyghe JR, Bien SA, Harrison TA, Kang HM, Chen S, Schmit SL, et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat Genet. 2019;51(1):76–87 [cited 2020 Aug 26]. Available from: www.nature.com/naturegenetics76.
Ibáñez-Sanz G, Diéz-Villanueva A, Alonso MH, Rodríguez-Moranta F, Pérez-Gómez B, Bustamante M, et al. Risk model for colorectal cancer in Spanish population using environmental and genetic factors: results from the MCC-Spain study. Sci Rep. 2017;7(6):19 [cited 2020 Aug 26]. www.nature.com/scientificreports.
Iwasaki M, Tanaka-Mizuno S, Kuchiba A, Yamaji T, Sawada N, Goto A, et al. Inclusion of a genetic risk score into a validated risk prediction model for colorectal cancer in Japanese men improves performance. Cancer Prev Res. 2017;10(9):535–41 [cited 2020 Aug 26]. www.broadinstitute.org/mpg/snap/.
Jenkins MA, Win AK, Dowty JG, MacInnis RJ, Makalic E, Schmidt DF, et al. Ability of known susceptibility SNPs to predict colorectal cancer risk for persons with and without a family history. Familial Cancer. 2019;18(4):389–97. https://doi.org/10.1007/s10689-019-00136-6 [cited 2020 Aug 26].
Jeon J, Du M, Schoen RE, Hoffmeister M, Newcomb PA, Berndt SI, et al. Determining risk of colorectal cancer and starting age of screening based on lifestyle, environmental, and genetic factors. Gastroenterology. 2018;154(8):2152–2164.e19.
Jo J, Nam CM, Sull JW, Yun JE, Kim SY, Lee SJ, et al. Prediction of colorectal cancer risk using a genetic risk score: the Korean cancer prevention study-II (KCPS-II). Genomics Inform. 2012;10(3):175. https://doi.org/10.5808/GI.2012.10.3.175 [cited 2020 Aug 26].
Jung KJ, Won D, Jeon C, Kim S, Il KT, Jee SH, et al. A colorectal cancer prediction model using traditional and genetic risk scores in Koreans. BMC Genet. 2015;16(1):49 [cited 2020 Aug 26]. http://www.biomedcentral.com/1471-2156/16/49.
Jung SY, Zhang Z-F. The effects of genetic variants related to insulin metabolism pathways and the interactions with lifestyles on colorectal cancer risk. Menopause. 2019;26(7):771–80 [cited 2020 Aug 26]. http://journals.lww.com/00042192-201907000-00013.
Marshall KW, Mohr S, Khettabi F, El Nossova N, Chao S, Bao W, et al. A blood-based biomarker panel for stratifying current risk for colorectal cancer. Int J Cancer. 2010;126(5):1177–86. https://doi.org/10.1002/ijc.24910 [cited 2020 Aug 26].
Prizment AE, Folsom AR, Dreyfus J, Anderson KE, Visvanathan K, Joshu CE, et al. Plasma C-reactive protein, genetic risk score, and risk of common cancers in the atherosclerosis risk in communities study. Cancer Causes Control. 2013;24(12):2077–87 [cited 2020 Aug 26]. https://link.springer.com/article/10.1007/s10552-013-0285-y.
Rodriguez-Broadbent H, Law PJ, Sud A, Palin K, Tuupanen S, Gylfe A, et al. Mendelian randomisation implicates hyperlipidaemia as a risk factor for colorectal cancer. Int J Cancer. 2017;140(12):2701–8 [cited 2020 Aug 26]. https://onlinelibrary.wiley.com/doi/abs/10.1002/ijc.30709.
Schmit SL, Edlund CK, Schumacher FR, Gong J, Harrison TA, Huyghe JR, et al. Novel common genetic susceptibility loci for colorectal cancer. J Natl Cancer Inst. 2019;111(2):146–57 [cited 2020 Aug 26]. https://academic.oup.com/jnci/article/111/2/146/5039592.
Shi Z, Yu H, Wu Y, Lin X, Bao Q, Jia H, et al. Systematic evaluation of cancer-specific genetic risk score for 11 types of cancer in the cancer genome atlas and electronic medical records and genomics cohorts. Cancer Med. 2019;8(6):cam4.2143 [cited 2020 Aug 26]. https://onlinelibrary.wiley.com/doi/abs/10.1002/cam4.2143.
Smith T, Gunter MJ, Tzoulaki I, Muller DC. The added value of genetic information in colorectal cancer risk prediction models: development and evaluation in the UK biobank prospective cohort study. Br J Cancer. 2018;119(8):1036–9. https://doi.org/10.1038/s41416-018-0282-8 [cited 2020 Aug 26].
Thrift AP, Gong J, Peters U, Chang-Claude J, Rudolph A, Slattery ML, et al. Mendelian randomization study of height and risk of colorectal cancer. Int J Epidemiol. 2015;44(2):662–72 [cited 2020 Aug 26]. https://academic.oup.com/ije/article/44/2/662/754872.
Thrift AP, Gong J, Peters U, Chang-Claude J, Rudolph A, Slattery ML, et al. Mendelian randomization study of body mass index and colorectal cancer risk. Cancer Epidemiol Biomark Prev. 2015;24(7):1024–31 [cited 2020 Aug 26]. http://cebp.aacrjournals.org/.
Wang HM, Chang TH, Lin FM, Chao TH, Huang WC, Liang C, et al. A new method for post genome-wide association study (GWAS) analysis of colorectal cancer in Taiwan. Gene. 2013;518(1):107–13.
Wang K, Bai Y, Chen S, Huang J, Yuan J, Chen W, et al. Genetic correction improves prediction efficiency of serum tumor biomarkers on digestive cancer risk in the elderly Chinese cohort study. Oncotarget. 2018;9(7):7389–97 [cited 2020 Aug 26]. www.impactjournals.com/oncotarget.
Weigl K, Thomsen H, Balavarca Y, Hellwege JN, Shrubsole MJ, Brenner H. Genetic risk score is associated with prevalence of advanced neoplasms in a colorectal cancer screening population. Gastroenterology. 2018;155(1):88–98.e10.
Weigl K, Chang-Claude J, Knebel P, Hsu L, Hoffmeister M, Brenner H. Strongly enhanced colorectal cancer risk stratification by combining family history and genetic risk score. Clin Epidemiol. 2018;10:143–52 [cited 2020 Aug 26]. https://www.dovepress.com/strongly-enhanced-colorectal-cancer-risk-stratification-by-combining-f-peer-reviewed-article-CLEP.
Xin J, Chu H, Ben S, Ge Y, Shao W, Zhao Y, et al. Evaluating the effect of multiple genetic risk score models on colorectal cancer risk prediction. Gene. 2018;673:174–80.
Xin J, Du M, Gu D, Ge Y, Li S, Chu H, et al. Combinations of single nucleotide polymorphisms identified in genome-wide association studies determine risk for colorectal cancer. Int J Cancer. 2019;145(10):2661–9 [cited 2020 Aug 26]. https://onlinelibrary.wiley.com/doi/abs/10.1002/ijc.32267.
Yeh CC, Sung FC, Tang R, Chang-Chieh CR, Hsieh LL. Association between polymorphisms of biotransformation and DNA-repair genes and risk of colorectal cancer in Taiwan. J Biomed Sci. 2007;14(2):183–93 [cited 2020 Aug 26]. https://link.springer.com/article/10.1007/s11373-006-9139-x.
Zhang L, Zheng C, Li T, Xing L, Zeng H, Li T, et al. Building up a robust risk mathematical platform to predict colorectal cancer. Complexity. 2017;2017.
Burgess S, Scott RA, Timpson NJ, Smith GD, Thompson SG. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors. Eur J Epidemiol. 2015;30(7):543–52 [cited 2020 Aug 26]. https://link.springer.com/article/10.1007/s10654-015-0011-z.
Swets JA. Measuring the accuracy of diagnostic systems. Science. 1988;240(4857):1285–93 [cited 2020 Aug 26]. https://pubmed.ncbi.nlm.nih.gov/3287615/.
Han M, Choong TL, Hong WZ, Chao S, Zheng R, Kok TY, et al. Novel blood-based, five-gene biomarker set for the detection of colorectal cancer. Clin Cancer Res. 2008;14(2):455–60 [cited 2020 Aug 26]. www.aacrjournals.org.
Fung SM, Wong XY, Lee SX, Miao H, Hartman M, Wee HL. Performance of single-nucleotide polymorphisms in breast cancer risk prediction models: a systematic review and meta-analysis. Cancer Epidemiol Biomark Prev. 2019;28(3):506–21 [cited 2020 Aug 26]. http://cebp.aacrjournals.org/.
Cook NR. Quantifying the added value of new biomarkers: how and how not. Diagnostic Progn Res. 2018;2(1):14 [cited 2020 Nov 17]. https://diagnprognres.biomedcentral.com/articles/10.1186/s41512-018-0037-2.
Janssens ACJW, Joyner MJ. Polygenic risk scores that predict common diseases using millions of single nucleotide polymorphisms: is more, better? Clin Chem. 2019;65(5):609–11 [cited 2020 Aug 26]. https://academic.oup.com/clinchem/article/65/5/609/5608048.
Tzoulaki I, Liberopoulos G, Ioannidis JPA. Assessment of claims of improved prediction beyond the Framingham risk score. JAMA - J Am Med Assoc. 2009;302:2345–52 American Medical Association; [cited 2020 Nov 18]. https://jamanetwork.com/journals/jama/fullarticle/184992.
Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51(4):584–91. https://doi.org/10.1038/s41588-019-0379-x [cited 2021 Mar 18].
Marigorta UM, Rodríguez JA, Gibson G, Navarro A. Replicability and prediction: lessons and challenges from GWAS. Trends Genet. 2018;34:504–17 Elsevier Ltd; [cited 2020 Nov 17]. http://www.cell.com/article/S016895251830060X/fulltext.
Kundu S, Mihaescu R, Meijer CMC, Bakker R, Janssens ACJW. Estimating the predictive ability of genetic risk models in simulated data based on published results from genome-wide association studies. Front Genet. 2014;5(JUN) [cited 2020 Nov 17]. https://pubmed.ncbi.nlm.nih.gov/24982668/.
Luo H, Zhao Q, Wei W, Zheng L, Yi S, Li G, et al. Circulating tumor DNA methylation profiles enable early diagnosis, prognosis prediction, and screening for colorectal cancer. Sci Transl Med. 2020;12(524) [cited 2020 Aug 26]. http://stm.sciencemag.org/.
Janssens ACJW, Ioannidis JPA, van Duijn CM, Little J, Khoury MJ. Strengthening the reporting of genetic risk prediction studies: the GRIPS statement. PLoS Med. 2011;8(3):e1000420. https://doi.org/10.1371/journal.pmed.1000420 [cited 2020 Aug 26].
Wand H, Lambert SA, Tamburro C, Iacocca MA, O’Sullivan JW, Sillari C, et al. Improving reporting standards for polygenic scores in risk prediction studies. Nature. 2021;591(7849):211–9 [cited 2021 Mar 18]. http://www.nature.com/articles/s41586-021-03243-6.
Fiatal S, Ádány R. Application of single-nucleotide polymorphism-related risk estimates in identification of increased genetic susceptibility to cardiovascular diseases: a literature review. Front Public Health. 2018;5:358 [cited 2020 Aug 26]. www.frontiersin.org.
Janssens ACJW. Validity of polygenic risk scores: are we measuring what we think we are? Hum Mol Genet. 2019;28(R2):R143–R150 [cited 2020 Nov 17]. https://academic.oup.com/hmg/article/28/R2/R143/5555564.
Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet. 2018;50:1219–24 Nature Publishing Group; [cited 2020 Nov 17]. https://pubmed.ncbi.nlm.nih.gov/30104762/.
Privé F, Arbel J, Vilhjálmsson BJ. LDpred2: better, faster, stronger. Schwartz R, editor. Bioinformatics. 2020; [cited 2021 Mar 18]; https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa1029/6039173.
Mak TSH, Porsch RM, Choi SW, Zhou X, Sham PC. Polygenic scores via penalized regression on summary statistics. Genet Epidemiol. 2017;41(6):469–80. https://doi.org/10.1002/gepi.22050 [cited 2021 Mar 18].
Ge T, Chen CY, Ni Y, Feng YCA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun. 2019;10(1):1–10. https://doi.org/10.1038/s41467-019-09718-5 [cited 2021 Mar 18].
Kuipers EJ, Spaander MC. Personalized screening for colorectal cancer. Nat Rev Gastroenterol Hepatol. 2018;15(7):391–2 [cited 2020 Aug 26]. https://www.nature.com/articles/s41575-018-0015-8.
Robertson DJ, Ladabaum U. Opportunities and challenges in moving from current guidelines to personalized colorectal cancer screening. Gastroenterology. 2019;156:904–17. https://doi.org/10.1053/j.gastro.2018.12.012 W.B. Saunders; [cited 2020 Aug 26].
Naber SK, Kundu S, Kuntz KM, Dotson WD, Williams MS, Zauber AG, et al. Cost-effectiveness of risk-stratified colorectal cancer screening based on polygenic risk: current status and future potential. JNCI Cancer Spectr. 2020;4(1) [cited 2020 Aug 26]. https://academic.oup.com/jncics/article/4/1/pkz086/5586982.
Bibbins-Domingo K, Grossman DC, Curry SJ, Davidson KW, Epling JW, García FAR, et al. Screening for colorectal cancer: US preventive services task force recommendation statement. JAMA-J Am Med Assoc. 2016;315(23):2564–75.

Acknowledgements

Not applicable.

Funding

SB received funding from Università Cattolica del Sacro Cuore (funds line D.3.1) to cover the journal fee of the publication. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Michele Sassano and Marco Mariani contributed equally to this work.

Authors and Affiliations

Section of Hygiene, University Department of Life Sciences and Public Health, Università Cattolica del Sacro Cuore, 00168, Roma, Italy
Michele Sassano, Marco Mariani, Gianluigi Quaranta & Stefania Boccia
Department of Woman and Child Health and Public Health - Public Health Area, Fondazione Policlinico Universitario A. Gemelli IRCCS, Roma, Italy
Gianluigi Quaranta, Roberta Pastorino & Stefania Boccia

Contributions

SB conceptualized the research questions and the search strategy, contributed to the final version of the manuscript and supervised the research project. MM and MS performed the search in the electronic databases and independently conducted the screening and study selection phase and the quality assessment of the included studies. MS performed the statistical analysis. GQ, MM, and MS drafted the manuscript, and RP supervised the search strategy, the statistical analysis, and the interpretation of results, and critically revised the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Roberta Pastorino.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Additional file 2: Table S1.

Details of single nucleotide polymorphisms investigated by the studies included in the systematic review.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sassano, M., Mariani, M., Quaranta, G. et al. Polygenic risk prediction models for colorectal cancer: a systematic review. BMC Cancer 22, 65 (2022). https://doi.org/10.1186/s12885-021-09143-2

Received: 30 March 2021
Accepted: 02 December 2021
Published: 15 January 2022
DOI: https://doi.org/10.1186/s12885-021-09143-2
