Classification tree analysis to enhance targeting for follow-up exam of colorectal cancer screening

Background: The follow-up rate after a fecal occult blood test (FOBT) is low worldwide. To increase the follow-up rate, segmentation of the target population has been proposed as a promising strategy, because an intervention can then be tailored toward specific subgroups of the population rather than using one type of intervention for all groups. The aim of this study is to identify subgroups that share the same patterns of characteristics related to follow-up exams after FOBT.
Methods: The study sample consisted of 143 patients aged 50–69 years who were requested to undergo follow-up exams after FOBT. A classification tree analysis was performed, using the follow-up rate as a dependent variable and sociodemographic variables, psychological variables, past FOBT and follow-up exam, family history of colorectal cancer (CRC), and history of bowel disease as predictive variables.
Results: The follow-up rate in the 143 participants was 74.1% (n = 106). A classification tree analysis identified four subgroups as follows: (1) subgroup with a high degree of fear of CRC, unemployed and with a history of bowel disease (n = 24, 100.0% follow-up rate), (2) subgroup with a high degree of fear of CRC, unemployed and with no history of bowel disease (n = 17, 82.4% follow-up rate), (3) subgroup with a high degree of fear of CRC and employed (n = 24, 66.7% follow-up rate), and (4) subgroup with a low degree of fear of CRC (n = 78, 66.7% follow-up rate).
Conclusion: The identification of four subgroups with a diverse range of follow-up rates for CRC screening indicates the direction to take in future development of an effective tailored intervention strategy.


Background
Colorectal cancer (CRC) is the second leading cause of cancer mortality in developed countries, with 727,400 new cancer cases and 320,100 deaths estimated to occur worldwide in 2008 [1]. As five-year CRC mortality rates vary according to the extent of tumor spread at the time of diagnosis, early detection is important.
Audience segmentation, which involves the identification of population subgroups that share particular characteristics, has been proposed as a promising strategy because interventions can be tailored toward particular subgroups [25][26][27]. Thus, segmenting the population could better guide the development of effective intervention strategies to increase follow-up compliance after screening tests. Specifically, segmentation can assist in the development of tailored interventions for high-risk subgroups with low follow-up rates, which tend to go undetected in existing mass screening programs.
Our study had two primary objectives: 1) to identify subgroups of individuals who share the same patterns of characteristics related to the follow-up exam after FOBT and 2) to examine the variance among identified subgroups in order to develop effective tailored interventions.

Setting
The study was conducted in the Omiya district of Saitama city in Saitama Prefecture, adjacent to Tokyo, Japan. The population was 108,585 as of January 1st, 2010. During the period of the study, it was the local government's policy to recommend an annual 2-day immunochemical FOBT screening for those aged 40 years and over. The FOBT was provided through a local medical association network of 170 clinics authorized by the local government. The local government informed eligible inhabitants about the screening once every year in April through pamphlets that were mailed to each household. Applicants then visited one of the 170 clinics to get the FOBT kit containing printed instructions for specimen collection and applicator sticks. Screening participants were required to conduct the specimen collection at home and to return the completed kits to the clinics. Participants were asked to visit the clinic again two weeks after undertaking the test to receive their diagnostic results. In the case of a positive result, participants were instructed by their physician to undertake additional tests.

Procedure
Participants in this study were CRC screening participants recruited at the time they visited the clinic to get the FOBT kits. We handed letters requesting participation in the study to participants aged in their 50s and 60s. After obtaining oral consent to participate in the study, willing participants were asked to complete an anonymous questionnaire at home. The questionnaires were returned by the participants when they returned their FOBT kits to the clinic. The data collection period was from September 2009 to March 2010. The total number of CRC screening participants during the study period was 12,009. Figure 1 shows the participation flow. Of the 3,536 participants who received the mail survey, 2,222 (response rate: 62.8%) replied. Following the baseline survey, 143 participants, who were asked to undergo follow-up examinations, were analyzed for the current study.

Survey measures
Survey measures included a follow-up exam after FOBT as a dependent variable and sociodemographic variables, psychological variables, past FOBT and follow-up exam, family history of CRC, and history of bowel disease as predictive variables.

Dependent variable
A follow-up exam after FOBT was employed as a dependent variable in this study. The number of follow-up exams was collected as a part of standard record-keeping of participating facilities. Each facility sent written notifications to the local government when a follow-up exam had been performed. This information was used to determine the number of follow-up exams.

Predictive variables
Sociodemographic variables included age, sex, marital status, education, employment status, and subjective economic status.
The psychological variables used in this study were derived from the constructs of the Health Belief Model [28] and the Theory of Planned Behavior [29]. According to the Health Belief Model, a person's behavior is determined by the following four beliefs: (a) perceived susceptibility, (b) perceived severity, (c) perceived barriers, and (d) perceived benefits. A previous systematic review suggested that the Health Belief Model is the most consistent model to predict CRC screening behavior [30]. Also, according to the Theory of Planned Behavior, a person's behavior is driven by his/her intention to perform the behavior. For example, intention to undergo CRC screening has remained one of the strongest factors in past studies [31,32]. Accordingly, the psychological variables we measured in this study were the perceived susceptibility and severity of CRC, perceived benefits and barriers of follow-up exam after FOBT, and intention to undergo a follow-up exam. The measurements for these psychological variables were derived from a past study (see Zheng et al. [33] for detailed questionnaire).
Family history of CRC was assessed as a dichotomous (yes/no) variable with the statement "Have any of your first-degree blood relatives had CRC?" Past CRC screening was assessed as a dichotomous (yes/no) variable with the statement "Have you ever undertaken an FOBT?" In addition, participants were asked whether they had ever received positive FOBT results and undergone follow-up exams.

Statistical analysis
First, frequencies and percentages of measured variables were reported. Next, a classification tree analysis was performed in order to identify the best combination of the measured variables that predict compliance with follow-up exam after FOBT. Among multivariate statistical analyses, the classification tree analysis is suggested to be superior to cluster analysis or logistic regression analysis in identifying distinctive homogeneous subgroups for further development of tailored intervention [34]. In the current analysis, we adopted chi-square values as a criterion for variable selection, and each group was divided into two groups until one of the following criteria was met: (1) 10% or less of all participants after grouping or (2) no significant explanatory variables at p < 0.001. The outcome variable was follow-up exam after FOBT and the explanatory variables were sociodemographic variables, psychological variables, past FOBT and follow-up exam, family history of CRC, and history of bowel disease. Finally, in order to test differences between subgroups identified by classification tree analysis, ANOVA was performed on continuous variables and a chi-square test on categorical variables. Measured variables were statistically tested and p < 0.002 was adopted as the significance level by a Bonferroni correction. All analyses were performed using SAS 9.1.3 (SAS Institute, Cary, NC). Participants with missing data were excluded from the analysis.
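The split-selection step described above can be illustrated with a minimal Python sketch. It is not the SAS procedure used in the study: it assumes binary predictors, a Pearson chi-square without continuity correction, and it reads stopping criterion (1) as a minimum size for each child group. The function names, the `min_size` parameter, and the toy records are all hypothetical and serve only to show the mechanics.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p

def best_split(records, outcome, predictors, alpha, min_size):
    """Return (predictor, chi2, p) for the strongest admissible split, or None."""
    best = None
    for var in predictors:
        yes = [r for r in records if r[var]]
        no = [r for r in records if not r[var]]
        if len(yes) < min_size or len(no) < min_size:
            continue  # assumed reading of criterion (1): child group too small
        a = sum(r[outcome] for r in yes)
        c = sum(r[outcome] for r in no)
        chi2, p = chi_square_2x2(a, len(yes) - a, c, len(no) - c)
        if p < alpha and (best is None or chi2 > best[1]):
            best = (var, chi2, p)
    return best

# Hypothetical toy data (NOT the study data): 40 records in four cells.
cells = [(True, True, 19), (True, False, 1), (False, True, 6), (False, False, 14)]
records = [
    {"high_fear": fear, "followed_up": done, "employed": i % 2 == 0}
    for fear, done, count in cells
    for i in range(count)
]

result = best_split(records, "followed_up", ["high_fear", "employed"],
                    alpha=0.001, min_size=0.1 * len(records))
print(result)
```

In this toy sample only `high_fear` passes the significance threshold, so it is returned as the splitting variable; applying `best_split` again to each child group would grow the tree until no admissible split remains.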

Ethical issues
This study was approved by the Institutional Review Board (IRB) of the National Cancer Center in Japan and followed the principles of the Declaration of Helsinki.

Baseline characteristics of respondents
Table 1 presents the characteristics of the study participants. The follow-up rate after FOBT was 74.1% (n = 106).

Classification tree analysis
Figure 2 shows the result of the classification tree analysis. For all participants, the most appropriate explanatory variable that predicts compliance with follow-up exam after FOBT was fear of CRC. The sample was classified into two groups: one with a high degree of fear of CRC (n = 65, 83.1% follow-up rate) and one with a low degree of fear of CRC (n = 78, 66.7% follow-up rate). The next most appropriate explanatory variable detected in the subgroup with a high degree of fear of CRC was employment status. This subgroup was further divided into two subgroups of unemployed (n = 41, 92.7% follow-up rate) and employed individuals (n = 24, 66.7% follow-up rate). On the other hand, for the subgroup with a low degree of fear of CRC, no appropriate explanatory variable meeting the criteria was detected. Finally, the unemployed subgroup was divided into two subgroups of individuals with a history of bowel disease (n = 24, 100.0% follow-up rate) and those without a history of bowel disease (n = 17, 82.4% follow-up rate). At that point, the stopping criteria for the analysis were reached.

Comparison of characteristics in each subgroup
Table 2 shows the characteristics of each subgroup identified by the classification tree analysis. There were statistically significant differences between subgroups in the following variables: sociodemographic variables such as education (p = 0.001) and employment status (p < 0.001); history of bowel disease (p < 0.001); and perceived severity (p < 0.001).
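As a consistency check on the tree, the reported subgroup sizes and follow-up rates can be recombined into the intermediate and overall rates. The short Python sketch below does this arithmetic; the per-subgroup counts of participants who followed up are not stated in the text and are derived here by rounding n × rate, which is an assumption.

```python
# (label, n, reported follow-up rate in %) for the four terminal subgroups
subgroups = [
    ("high fear, unemployed, bowel disease history", 24, 100.0),
    ("high fear, unemployed, no bowel disease history", 17, 82.4),
    ("high fear, employed", 24, 66.7),
    ("low fear", 78, 66.7),
]

# Assumed counts of follow-up exams, recovered by rounding n * rate
followed = [round(n * rate / 100) for _, n, rate in subgroups]
sizes = [n for _, n, _ in subgroups]

def pooled_rate(indices):
    """Follow-up rate (%) after pooling the subgroups at the given positions."""
    return 100 * sum(followed[i] for i in indices) / sum(sizes[i] for i in indices)

total_n = sum(sizes)
total_followed = sum(followed)
overall = pooled_rate([0, 1, 2, 3])       # all participants
unemployed_high_fear = pooled_rate([0, 1])  # subgroups 1 and 2
high_fear = pooled_rate([0, 1, 2])          # subgroups 1 to 3

print(total_n, total_followed, round(overall, 1))
print(round(unemployed_high_fear, 1), round(high_fear, 1))
```

Under that rounding assumption, the pooled values reproduce the figures reported for the intermediate nodes (92.7% and 83.1%) and for the whole sample (106 of 143, 74.1%).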

Discussion
In order to achieve the goal of reducing colorectal cancer morbidity and mortality by mass screening, it is imperative that patients receive timely and appropriate follow-up exams for detected abnormalities. However, low follow-up rates after FOBT limit the potential benefit of mass CRC screening. Therefore, specifically from a public health perspective, targeting high-risk subgroups with low follow-up rates (i.e. people who are more likely to have CRC than the general public) is particularly important. This study is, to our knowledge, the first to identify subgroups that share the same patterns of characteristics in terms of follow-up examinations after FOBT. The most important finding of the present study is the identification of four subgroups with diverse follow-up rates (ranging from 66.7% to 100.0%) using classification tree analysis. This method has been shown to be a powerful medical decision-making tool [35]. Compared with cluster analysis or logistic regression analysis, the visual image of a hierarchical tree structure provides a benefit to clinical practitioners, because the choice of a tailored message depends on only three questions: fear of CRC, employment status, and past history of bowel disease.
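The three-question rule can be written down as a small lookup, sketched here in Python. The function name and boolean inputs are hypothetical; in particular, the cut-off that separates a high from a low degree of fear of CRC is not given in the text, so the sketch takes that classification as an input rather than computing it.

```python
# Follow-up rates (%) reported for the four subgroups in this study
REPORTED_RATE = {1: 100.0, 2: 82.4, 3: 66.7, 4: 66.7}

def assign_subgroup(high_fear: bool, employed: bool, bowel_disease_history: bool) -> int:
    """Map the three answers onto subgroups (1) to (4) of the classification tree."""
    if not high_fear:
        return 4  # low fear of CRC: no further split was found
    if employed:
        return 3  # high fear, employed
    # high fear, unemployed: split on history of bowel disease
    return 1 if bowel_disease_history else 2

# Example: a participant with high fear of CRC who is employed
group = assign_subgroup(high_fear=True, employed=True, bowel_disease_history=False)
print(group, REPORTED_RATE[group])
```

Because the tree stops splitting once a branch is terminal, employment status is only consulted for participants with high fear of CRC, and bowel disease history only for those who are also unemployed.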
A second implication is that fear of CRC, one of the psychological variables of perceived severity based on the Health Belief Model [28], was shown to have the closest association with follow-up examinations. When selecting a combination of antecedent behavioral variables, the value of behavioral theories should be considered, as they could guide the development of effective intervention strategies [36]. The limited research to date on theory-based variables related to follow-up behavior after FOBT therefore calls for further focused and prospective research.
This study has several limitations. First, the sample size (n = 143) was small, and therefore the statistical power might be insufficient. Second, selection bias should be considered in light of a relatively low response rate of 62.8%. Third, because the participants were recruited from a single urban community, generalization of the findings should be treated with caution. Fourth, not all confounders have been accounted for. However, efforts were made to reduce the chance of bias when segmenting the respondents: major confounders identified in previous studies were controlled statistically.

Conclusions
We identified four subgroups of individuals who share the same patterns of characteristics related to their degree of compliance with the follow-up exam after FOBT. The unique characteristics of each identified subgroup can inform future efforts to design an effective tailored intervention strategy.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
YI was involved in design, interpretation of the data and drafting the manuscript. HS supervised the entire project and participated in the discussions on manuscript writing and finalization. YFZ assisted with the study design, literature review and questionnaire development. HN performed analysis of the data. TS and TH contributed to the development of the questionnaire and collecting data. All authors have read and approved the final manuscript.