A clinical model to predict the risk of synchronous bone metastasis in newly diagnosed colorectal cancer: a population-based study

Background The early detection of synchronous bone metastasis (BM) in newly diagnosed colorectal cancer (CRC) affects initial management and prognosis. A clinical model that predicts an individual patient's risk of BM would be useful in current clinical practice. Methods A total of 55,869 CRC patients were identified from the Surveillance, Epidemiology, and End Results (SEER) database, of whom 317 were diagnosed with synchronous BM. Risk factors for BM in CRC patients were identified using multivariable logistic regression. A weighted scoring system was built from the beta coefficients of the significant factors (P < 0.05). A random sample of 75% of the CRC patients was used to establish the risk model, and the remaining 25% was used to validate its accuracy. The performance of the risk model was assessed by receiver operating characteristic (ROC) curve analysis. Results The risk model consisted of 8 risk factors: rectal cancer, poorly differentiated or undifferentiated grade, signet-ring cell carcinoma, positive CEA, lymph node metastasis, brain metastasis, liver metastasis and lung metastasis. The areas under the ROC curve (AUROC) were 0.903 and 0.889 in the development and validation cohorts, respectively. Patients with scores from 0 to 4 points had about a 0.1% risk of BM, and the risk increased to about 30% in patients with scores ≥15 points. Conclusions This clinical risk model is accurate enough to identify CRC patients at high risk of synchronous BM and to support more individualized clinical decisions.


Background
Colorectal cancer (CRC) is the third most common cancer and the second most common cause of cancer death worldwide, with more than 1.8 million new cases and 881 thousand deaths in 2018 [1]. CRC most often metastasizes to the liver, followed by the lungs and peritoneal cavity, and only rarely to bone [2]. The incidence of bone metastasis (BM) among CRC patients has been reported to be 6.0-10.4%, with a higher rate of 8.6-23.7% in autopsy results [3]. Nevertheless, the incidence of BM from CRC has increased in recent years [4], and BM is usually diagnosed at an advanced stage, with a 5-year survival rate of less than 5% [5].
In newly diagnosed CRC patients, systematic body screening, including liver and lung examinations, is routinely recommended by current guidelines [6]. However, because of the low incidence of BM, imaging for synchronous BM is mostly omitted at the primary diagnosis of CRC. Patients are often advised to undergo radionuclide bone imaging or PET-CT only when they have symptoms suggestive of skeletal-related events (SREs), which are complications of BM that occur at a high rate within 1 year after BM [7]. By this time, the CRC is likely to have reached an advanced stage or multiple metastases have occurred, so the best opportunity for treatment has been missed [8]. In addition, SREs, which include bone pain, pathological fracture, the need for radiotherapy, spinal cord compression and potentially fatal hypercalcemia, can further reduce patients' quality of life and survival [9].
Identifying the clinical and tumor factors related to synchronous BM will play an important role in its early detection among newly diagnosed CRC patients. A clinical risk model for predicting BM would help to quantify how likely a patient is to have BM. Several studies have identified risk factors associated with BM [10,11]; however, most models predict the risk of metachronous BM, which is diagnosed after curative resection of CRC [12,13]. To date, no statistical risk model has been developed to predict the probability of synchronous BM at primary CRC diagnosis.
Thus, the aim of our study was to use population-based data to identify the risk factors associated with a high risk of synchronous BM and to develop a risk model that predicts the likelihood of synchronous BM in newly diagnosed CRC patients. Such a model could alert clinicians to detect BM promptly in patients with CRC and to take appropriate measures to prevent SREs.

Study population
Newly diagnosed CRC cases between January 2010 and December 2014 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. The SEER database includes information on cancer incidence, survival outcomes and treatment strategies from 17 population-based cancer registries, representing 28% of the US population [14]. All CRC patients included in this study were definitively diagnosed by pathological examination, and BM was diagnosed by imaging and/or pathological examination. Initially, 141,773 patients diagnosed with CRC were identified in the database. After excluding 85,904 cases that were not eligible, we finally collected 55,869 patients at

Statistical analysis
The patients were randomly divided into a development cohort and a validation cohort at a ratio of 3:1. In the development cohort, clinical and tumor variables were compared between patients with and without BM using Spearman's rank correlation coefficient. Variables that were associated with BM (P < 0.05) were then included in a multivariable logistic regression model. Risk factors for BM in CRC patients were identified from the statistically significant variables in the multivariable model. Model discrimination was assessed by calculating the area under the receiver operating characteristic curve (AUROC). All statistical analyses were performed with SPSS version 25.0 for Mac. P < 0.05 was considered statistically significant.
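The random 3:1 split and the AUROC calculation can be illustrated with a minimal sketch. The analysis in this study was run in SPSS, so the code below is not the authors' implementation; the function names `split_3_to_1` and `auroc`, the fixed seed, and the rank-based (Mann-Whitney) formulation of the AUROC are assumptions made for illustration only.

```python
import random


def split_3_to_1(records, seed=0):
    """Randomly split records into development (75%) and validation (25%) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen patient with BM (label 1)
    has a higher score than one without BM (label 0); ties count as 0.5.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")

    # Assign 1-based ranks, averaging the rank within groups of tied scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)
```

For example, `auroc([1, 2, 3, 4], [0, 0, 1, 1])` describes perfect separation, while two patients with identical scores and different outcomes give 0.5.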

Prediction of synchronous BM
The beta coefficients (β) from the multivariable model were used to develop a weighted point system that predicts the individual risk of synchronous BM for newly diagnosed CRC patients. Each patient's score was calculated by summing the points for each significant risk factor. Non-significant risk factors (P ≥ 0.05) received 0 points.
Variables with β > 0 and < 0.5 received 1 point; those with β ≥ 0.5 and < 1 received 2 points; those with β ≥ 1 and < 1.5 received 3 points; those with β ≥ 1.5 and < 2 received 4 points; those with β ≥ 2 and < 2.5 received 5 points; and those with β ≥ 2.5 and < 3 received 6 points. The risk model was built on the development cohort (75% of the CRC patients) and validated on the validation cohort (the remaining 25%). Observed and predicted rates of BM were calculated for each point value. Risk stratification of BM was then performed based on the risk scores.
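The point-assignment rule above amounts to one point per 0.5-wide band of β. A minimal sketch follows; the function names and the example coefficients are hypothetical and are not the values in Table 3. Because the text does not say how β ≤ 0 or β ≥ 3 would be handled, the sketch raises an error in those cases rather than guess.

```python
import math


def beta_to_points(beta, p_value, alpha=0.05):
    """Map a regression coefficient to points using the bands stated in the text."""
    if p_value >= alpha:
        return 0  # non-significant factors receive 0 points
    if not (0 < beta < 3):
        # Assumption: the source defines bands only for 0 < beta < 3.
        raise ValueError("point bands are only defined for 0 < beta < 3")
    # (0, 0.5) -> 1, [0.5, 1) -> 2, ..., [2.5, 3) -> 6
    return math.floor(beta / 0.5) + 1


def total_score(points_by_factor, present_factors):
    """Sum the points of the risk factors that a patient actually has."""
    return sum(points_by_factor[name] for name in present_factors)


# Hypothetical (beta, P) pairs for illustration only.
example_model = {
    "factor_a": (0.42, 0.010),
    "factor_b": (1.10, 0.001),
    "factor_c": (2.60, 0.0001),
    "factor_d": (0.80, 0.200),  # not significant, so it scores 0
}
example_points = {
    name: beta_to_points(beta, p) for name, (beta, p) in example_model.items()
}
```

With these made-up inputs, a patient with `factor_a` and `factor_c` would score `total_score(example_points, ["factor_a", "factor_c"])`, that is, 1 + 6 = 7 points.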

Patient characteristics
A total of 55,869 patients diagnosed with CRC were included in the final analysis, of whom 317 were diagnosed with BM, accounting for 0.57% of all patients. Among these 317 patients, 61 were diagnosed with solitary BM, accounting for 19.2% of BM patients. The clinical and tumor characteristics of patients with and without BM are compared in Table 1. The development cohort (n = 39,237) included 208 patients with BM (0.53%) and the validation cohort (n = 16,632) included 109 (0.66%); this difference was not significant (P = 0.072). In the development cohort, age at diagnosis, tumor location, tumor grade, tumor size, histological type, CEA level, T stage, N stage, brain metastasis, liver metastasis and lung metastasis were associated with synchronous BM at the primary diagnosis of CRC (P < 0.05). The details are shown in Table 2.
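The comparison of BM rates between the two cohorts can be checked from the counts given above. The text does not name the test behind P = 0.072; the sketch below assumes a Pearson chi-square test on the 2x2 table without continuity correction, which is only one plausible choice. The function name is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the text: BM cases and cohort sizes.
dev_bm, dev_n = 208, 39237
val_bm, val_n = 109, 16632

chi2, p_value = pearson_chi2_2x2(dev_bm, dev_n - dev_bm, val_bm, val_n - val_bm)
print(f"development: {dev_bm / dev_n:.2%}, validation: {val_bm / val_n:.2%}")
print(f"chi-square = {chi2:.3f}, P = {p_value:.3f}")
```

Under this assumption the P value comes out close to the reported 0.072, which is consistent with, but does not prove, the use of an uncorrected chi-square test.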

Risk model predictors
In the multivariable logistic regression in the development cohort, rectal cancer, poorly differentiated or undifferentiated grade, signet-ring cell carcinoma, positive CEA, N1/N2 stage, brain metastasis, liver metastasis and lung metastasis were significantly associated with a higher risk of BM, and these factors were used to develop the risk model for predicting BM. The details are shown in Table 3.

Risk scores
Using the point-assignment criteria described above, the score for each significant variable in the final risk model was calculated and is presented in Table 3. The maximum possible total score for a single patient was 26. In our study, total scores ranged from 0 to 20 in both the development and validation cohorts.

Model accuracy and validity
The AUROC for the risk model was 0.903 in the development cohort and 0.889 in the validation cohort, indicating very good discrimination in identifying the risk of BM in CRC patients (Fig. 2). We then used the Euclidean index to select the optimal cut-off, which gave a sensitivity of 0.851 and a specificity of 0.845 in the development cohort, and 0.817 and 0.873, respectively, in the validation cohort. Risk stratification of BM was performed based on the risk scores. Patients with total scores of 0-4, 5-9, 10-14 and ≥15 points were classified into very low, low, medium and high risk groups, respectively. The observed and predicted rates of BM were then compared across the four groups in the development and validation cohorts. The predicted rate of synchronous BM for each patient was calculated from the risk model estimates. In the very low risk group, the predicted rates of BM were 0.13% in the development cohort and 0.19% in the validation cohort, and the observed rates were 0.10% and 0.13%, respectively. In the high risk group, the predicted rates of BM rose to 30.53% in the development cohort and 31.87% in the validation cohort, and the corresponding observed rates rose to 28.57% and 30.77%, respectively. Figure 3 displays the predicted and observed rates of BM in the four groups by total risk score.
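The risk grouping and the cut-off selection can be sketched as follows. The score bands come from the text; the cut-off function assumes that the Euclidean index means choosing the threshold whose ROC point lies closest to the top-left corner (sensitivity = 1, specificity = 1), which the text does not define. All function names and the toy data are hypothetical.

```python
import math


def risk_group(score):
    """Classify a total risk score into the four groups used in the text."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= 4:
        return "very low"
    if score <= 9:
        return "low"
    if score <= 14:
        return "medium"
    return "high"


def observed_rates(scores, labels):
    """Observed proportion of BM (label 1) within each risk group."""
    counts = {}
    for score, label in zip(scores, labels):
        group = risk_group(score)
        n, events = counts.get(group, (0, 0))
        counts[group] = (n + 1, events + label)
    return {group: events / n for group, (n, events) in counts.items()}


def euclidean_cutoff(scores, labels):
    """Return (threshold, sensitivity, specificity) nearest the ROC corner (0, 1).

    A patient is predicted positive when score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        sens, spec = tp / n_pos, tn / n_neg
        dist = math.hypot(1 - sens, 1 - spec)
        if best is None or dist < best[0]:
            best = (dist, threshold, sens, spec)
    return best[1], best[2], best[3]


# Toy data for illustration only; not patient data from this study.
toy_scores = [0, 2, 3, 6, 7, 12, 16, 18]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
```

On the toy data, `euclidean_cutoff(toy_scores, toy_labels)` selects a threshold of 7 points, and `observed_rates` returns the proportion of positive labels within each of the four groups.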

Discussion
Currently, CRC is a major cause of morbidity and mortality globally [1]. Liver and lung metastases frequently occur in newly diagnosed CRC, and systematic body examination is routinely recommended to identify them at the primary diagnosis of CRC [6]. The proportion of CRC patients with BM in our study was much lower than the nationally reported proportion [3]. Apart from the national findings being drawn from a sample that was not representative of the overall population, this difference may arise because the SEER database records only synchronous BM, which is rare, whereas the national study captured metachronous BM observed over several decades. Because synchronous BM in CRC patients is very rare, this clinical entity tends to be overlooked [15]. However, with the increased overall survival of CRC patients, the incidence of BM during the course of CRC has been rising [4]. Therefore, early identification of synchronous BM at the primary diagnosis of CRC will contribute to accurate staging and improved prognosis for CRC patients. In current clinical practice, the diagnosis of synchronous BM is mainly based on clinical SREs, but the optimal therapeutic opportunity for BM has already been missed by the time SREs occur [8]. Early identification of synchronous BM at the primary diagnosis of CRC will therefore also help to delay the progression of SREs and improve patients' quality of life. To address this issue, we used a population-based database to generate a risk model, based on clinical and tumor characteristics, that predicts the risk of synchronous BM in newly diagnosed CRC patients. The results showed that this risk model could identify synchronous BM in CRC patients with considerably high accuracy. Previous studies have identified risk factors for BM and developed risk models to predict BM after curative resection of CRC. Sun et al. evaluated 516 patients who received curative resection for CRC and found two independent risk factors for BM during follow-up: lymph node involvement and tumor location [12]. Ang et al. found three independent risk factors for BM: rectal cancer, lymph node metastasis and lung metastasis. A scoring system based on these three risk factors was then developed to predict BM, and CRC patients were divided into three groups with different risks of developing BM (1.5% vs. 6.6% and 10.5%, P < 0.001) [13].
In our study, we developed the first risk model to predict the probability of synchronous BM at primary CRC diagnosis. We found that rectal cancer was an independent risk factor for synchronous BM, which is consistent with previous research [12,13]. Rectal and colon cancers differ in anatomical location and tissue origin from the beginning of embryonic development. There are communicating branches between the rectal veins and the vertebral venous system, which is probably one of the mechanisms of BM [3]. We also found that patients with poorly differentiated or undifferentiated tumors and signet-ring cell carcinoma were more prone to BM. This might be because such cancer cells are more capable of invading surrounding tissues, capillaries and lymphatics, with stronger growth potential to develop early metastasis [16]. Serum CEA and lymph node metastasis may promote the infiltration and metastasis of CRC, leading to a higher risk of synchronous BM in CRC patients [17,18]. In addition, extraosseous metastasis could also increase the risk of synchronous BM. We therefore used these significant risk factors to develop the risk model for predicting BM. The risk model showed good discrimination, with an AUROC of 0.903 in the development cohort and 0.889 in the validation cohort, and could accurately identify newly diagnosed CRC patients who require a more comprehensive assessment to detect potential BM.
The strengths of this study include the large sample size from the SEER database and the good accuracy in predicting synchronous BM in CRC patients. Furthermore, the prediction model consists of readily accessible clinical and tumor characteristics, which suggests that it should be clinically acceptable. However, because all patients in this study represent only the US population, the model is limited in its widespread use, and external validation in populations from other countries is needed to confirm its accuracy. Another limitation of this study is the lack of biological markers, which could be incorporated into the statistical model to improve its accuracy and usefulness. Despite these limitations, this risk model remains an accurate and valuable clinical tool for predicting the risk of synchronous BM in newly diagnosed CRC patients.

Conclusions
Although patients with BM account for only a small proportion of CRC patients, early detection of synchronous BM through routine screening at the primary diagnosis of CRC would be beneficial for high-risk patients. This clinical prediction model shows high accuracy in identifying newly diagnosed CRC patients at high risk of BM and can support more individualized decision making for this group of patients.