Risks of developing breast and colorectal cancer in association with incomes and geographic locations in Texas: a retrospective cohort study

Background No study has been conducted to investigate the spatial pattern and association of socioeconomic status (such as income) with breast and colorectal cancer incidence in Texas, United States. This study aimed to determine whether median household income was associated with the risk of developing breast and colorectal cancer in Texas and to identify higher cancer risks by race/ethnicity and geographic areas. Methods This was a retrospective cohort study with an ecological component in using aggregated measures at the county level. We identified 243,677 women with breast cancer and 155,534 men and women with colorectal cancer residing in 254 counties in Texas in 1995–2011 from the public-use dataset of Texas Cancer Registry. The denominator population and median household income at the county level was obtained from the U.S. Bureau of the Census. Cancer incidence rates were calculated as number of cases per 100,000 persons and age-adjusted using the 2000 US population data. We used the ArcGIS v10.1 (geographic information system software) to identify multiple clustered counties with high and low cancer incidences in Texas. Results Age-adjusted breast cancer incidence rate in the highest median income quintile group was 151.51 cases per 100,000 in 2008–2011 as compared to 98.95 cases per 100,000 in the lowest median income quintile group. The risk of colorectal cancer appeared to decrease with increasing median income in racial/ethnic population. Spatial analysis revealed the significant low breast cancer incidence cluster regions located in southwest US-Mexico border counties in Texas. Conclusions This study demonstrated that higher income was associated with an increased risk of breast cancer and a decreased risk of colorectal cancer in Texas. There were geographic variations with cancer incidence clustered in high risk areas in Texas. 
Future studies may need to explore more factors that might explain income and cancer risk associations and their geographic variations. Electronic supplementary material The online version of this article (doi:10.1186/s12885-016-2324-z) contains supplementary material, which is available to authorized users.


Background
Breast cancer has the highest incidence rate in women and colorectal cancer is the third most common cancer in both men and women in the United States [1][2][3][4]. Previous studies have shown that breast cancer is associated with higher socioeconomic status (SES, such as higher income) and colorectal cancer is associated with lower SES in the U.S. [5][6][7][8][9][10][11][12][13]. A study of cancer prevention using data from the California Cancer Registry showed that breast cancer incidence increased substantially with increasing SES [12]. A comprehensive review of SES related to breast and colorectal cancer in 11 registries of the Surveillance, Epidemiology, and End Results (SEER) Program showed similar findings of an increased breast cancer incidence and a decreased colorectal cancer incidence with higher SES [5,11,14]. Previous studies have used spatial pattern analysis to identify areas with high breast [15][16][17][18] and colorectal [19,20] cancer incidence associated with socioeconomic factors. A study on the geographic distribution of late stage breast cancer cases showed that higher breast cancer incidence rates were significantly associated with higher SES level in Florida between 1998 and 2002 [16]. However, no study has been conducted to investigate the spatial pattern and association of SES with breast and colorectal cancer incidence in Texas. Previous studies have suggested that education, income, and occupation may represent different aspects of SES and that one of these indicators should be used in epidemiologic studies involving SES [21]. Therefore, this study used the Texas Cancer Registry (TCR) database to determine the association of median household income with breast and colorectal cancer incidence rates from 1995 to 2011 in Texas [22]. Furthermore, we conducted cluster analysis to identify the counties with unusually high or low breast and colorectal cancer incidence.
The median household income at the county level in Texas was obtained from the U.S. census data [23]. Because individual level SES data were not available, group level SES data have frequently been used to examine the association of SES with cancer risk in the U.S. and in Europe [16,20,24,25,26].
Here, we classified counties into five median household income categories by calendar year to examine whether median household income was correlated with the risk of breast and colorectal cancer in Texas [7,27,28]. Additionally, geographic maps were utilized to highlight the spatial differences in particular regions with excess disease rate. Moreover, we examined whether the relationship between median household income and the risk of breast and colorectal cancer interacted with race/ethnicity and metro/urban/rural status. The findings from this study of both breast and colorectal cancer can help identify high risk populations and regions with respect to breast and colorectal cancer, which can enhance cancer prevention and control.

Study design and data sources
This was a retrospective cohort study with an ecological component, in that it used aggregated measures at the county level. The Texas Cancer Registry (TCR) granted permission to access the public-use dataset, which was used to identify incident breast and colorectal cancer cases. The TCR is a statewide, population-based cancer registry with gold certification from the North American Association of Central Cancer Registries [22]. The TCR was determined to cover at least 95 % of statewide cancer cases diagnosed from 1995 to 2011 in Texas. Information on county population estimates, median household income, and population age groups was obtained from the U.S. Census data in 2000 and 2010, which did not require permission [23,29]. County median household income data represented gross income from all sources, including government transfers but excluding non-cash benefits. The Institutional Review Board of the Texas Department of State Health Services and the Committee for the Protection of Human Subjects at the University of Texas Health Science Center granted ethics approval for our study. Informed consent was waived because the study was retrospective in design and used public datasets.

Measure of median household income
Because individual income information was not available in the TCR dataset, the aggregated median household income at the county level was analyzed as the measure of SES. County median household income was chosen because it was more homogeneous with respect to SES, more accessible, and widely representative of individual income [30]. Previous studies have frequently used county-level socioeconomic indicators (e.g., county poverty and county median household income) to study temporal trends in breast and colorectal cancer incidence rates in the U.S. [12,16,31,32,33,34].
Median household income at the county level was obtained from the U.S. Census Bureau [23]. It was calculated for 4 time periods according to calendar year (1995-1999, 2000-2003, 2004-2007, and 2008-2011), and the income value in each time period was calculated as the mean of the incomes in all calendar years in the period. Because median household income at the county level was not available for 1996, income in the 1995-1999 period was the mean of incomes in 1995, 1997, 1998 and 1999. Median household incomes in all 254 counties were then classified into quintiles with an approximately equal number of counties in each of the 5 income categories, ranging from the highest median household income (5th quintile) to the lowest (1st quintile) in Texas.
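The period averaging and quintile assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the tie-handling rule (ties keep input order) are assumptions, and the income values in the usage note are hypothetical.

```python
from statistics import mean


def period_mean_income(income_by_year, years):
    """Mean of the yearly median incomes available in a period.

    Years with no estimate (e.g. 1996 in the source data) are skipped.
    """
    values = [income_by_year[y] for y in years if y in income_by_year]
    return mean(values)


def assign_quintiles(income_by_county):
    """Split counties into 5 groups of near-equal size by income rank.

    Returns {county: quintile}, where 1 is the lowest-income group and
    5 is the highest. Ties keep their input order (an assumption).
    """
    ranked = sorted(income_by_county, key=income_by_county.get)
    n = len(ranked)
    return {county: (i * 5) // n + 1 for i, county in enumerate(ranked)}
```

For example, `period_mean_income({1995: 30000, 1997: 32000, 1998: 33000, 1999: 35000}, range(1995, 2000))` averages the four available years, and `assign_quintiles` applied to 254 counties yields groups of 50 or 51 counties each.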

Breast and colorectal cancer cases
Incident breast and colorectal cancer cases were identified from TCR data using the following criteria: breast cancer among women and colorectal cancer among both men and women, diagnosed between 1995 and 2011, with no missing record of county at diagnosis. Breast cancer cases were identified using the "Primary Site" variable in TCR, coded as C500-C509 according to the "International Classification of Diseases for Oncology, Third Edition (ICD-O-3), Topography Section" [35]. Colorectal cancer cases were coded as C180-C189, C199, C209, and C260. Following the methods of Wu et al. [36] for counting total colorectal cancer cases, the colon included the cecum (C180), appendix (C181), ascending colon (C182), hepatic flexure (C183), transverse colon (C184), splenic flexure (C185), descending colon (C186), sigmoid colon (C187), and large intestine, NOS (C188-C189, C260). The rectum included the rectosigmoid junction (C199) and the rectum, not otherwise specified (C209). In Texas, 243,677 women with breast cancer and 155,534 men and women with colorectal cancer residing in 254 counties from 1995 to 2011 were identified. Breast and colorectal cancer cases with an unknown county record were excluded (n = 35). Using the TCR dataset, we obtained age, sex, and race/ethnicity for breast and colorectal cancer cases at the individual level [22]. Cases were separated into five age groups and four race/ethnicity categories. The five age groups were defined as <50, 50-59, 60-69, 70-79, and >79 years old. The four race/ethnicity categories were defined as non-Hispanic white, non-Hispanic black, Hispanic, and other. The other category included Asian, Pacific Islander, American Indian, and unspecified race/ethnicity in the TCR dataset. Definitions of metro/urban/rural status for Texas counties were obtained from the 2003 version of the U.S. Department of Agriculture (USDA) Rural-Urban Continuum Codes (RUCC).
The USDA RUCC categorized counties as metropolitan (RUCC 1-3), nonmetropolitan with urban populations (RUCC 4-7), or rural (RUCC 8-9) [37].
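The case definitions and groupings above amount to a few lookup rules, sketched here. The function names and the string form of the site codes are assumptions for illustration; the code ranges, age cut-points, and RUCC groupings are the ones stated in the text.

```python
# ICD-O-3 topography codes as listed in the text (written without the dot).
BREAST_CODES = {f"C50{d}" for d in range(10)}                # C500-C509
COLON_CODES = {f"C18{d}" for d in range(10)} | {"C260"}      # C180-C189, C260
RECTUM_CODES = {"C199", "C209"}


def classify_site(primary_site):
    """Map a primary-site code to 'breast', 'colon', 'rectum', or None."""
    if primary_site in BREAST_CODES:
        return "breast"
    if primary_site in COLON_CODES:
        return "colon"
    if primary_site in RECTUM_CODES:
        return "rectum"
    return None


def age_group(age):
    """Five age groups used in the study: <50, 50-59, 60-69, 70-79, >79."""
    if age < 50:
        return "<50"
    if age > 79:
        return ">79"
    decade = (age // 10) * 10
    return f"{decade}-{decade + 9}"


def rucc_category(code):
    """Collapse a 2003 Rural-Urban Continuum Code (1-9) into three levels."""
    if 1 <= code <= 3:
        return "metro"
    if 4 <= code <= 7:
        return "urban"
    if 8 <= code <= 9:
        return "rural"
    raise ValueError(f"RUCC must be between 1 and 9, got {code}")
```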

Statistical analyses
Spatial analyses have become an important tool in public health research for identifying potential disease cluster regions [15][16][17][18][19][20]. In this study, we first calculated the adjusted incidence rates at the county level for breast and colorectal cancer separately after controlling for age and median household income, and then evaluated whether incidence rate clusters existed using the Getis-Ord Gi* statistic tool in ArcGIS 10.1 (ESRI, Redlands, CA) (Additional file 1) [38]. We also generated all maps in the figures and in supplemental materials using the ArcGIS 10.1 software [38].
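To make the clustering step concrete, the following is a minimal sketch of a Getis-Ord hot spot z-score, assuming the statistic used was the local Gi* form (the study ran the analysis in ArcGIS, not with this code). The weights matrix, its binary "neighbor or self" definition, and the rates in the usage note are hypothetical.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return a Gi* z-score for each area.

    values  : list of n area-level rates
    weights : n x n list of spatial weights; weights[i][i] should be
              non-zero because Gi* includes the focal area itself.
    Large positive scores suggest hot spots, large negative scores
    suggest cold spots.
    """
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - x_bar * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores
```

With five hypothetical counties in a row, rates `[10, 12, 11, 30, 32]`, and each county weighted 1 for itself and its immediate neighbors, the last county receives a positive score and the first a negative one, which is the hot spot versus cold spot contrast the maps display.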
The denominator population data used to calculate incidence rates were acquired from the U.S. Census Bureau's Population Estimates Program [29]. Because age is a strong confounder and failure to use age-adjusted incidence rates in cancer studies may lead to an underestimation or overestimation of incidence rates, we presented age-adjusted incidence rates as the number of new cases per 100,000 persons, standardized to the 2000 US population by five age groups and 4-year periods from 1995 to 2011 [12,39,40]. One assumption was that the median household income and population size of given counties would not change dramatically in each study period. Other cancer studies supported this assumption and showed no appreciable changes in aggregated median household income measured at the county level over each study period [41,42].
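Age adjustment by direct standardization, as described above, weights each age-specific rate by that age group's share of a standard population. The sketch below assumes this direct method; the function name and all numbers in the usage note are hypothetical and are not the 2000 US standard weights.

```python
def age_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly standardized incidence rate per `per` persons.

    cases, population, standard_population: dicts keyed by age group.
    Each age-specific rate (cases / population) is weighted by the
    age group's share of the standard population.
    """
    total_standard = sum(standard_population.values())
    rate = 0.0
    for group, standard_count in standard_population.items():
        age_specific_rate = cases[group] / population[group]
        rate += age_specific_rate * standard_count / total_standard
    return rate * per
```

For instance, with 10 cases among 10,000 people under 50 and 30 cases among 10,000 people aged 50 or older, the crude rate is 200 per 100,000, but a standard population that is 70 % under 50 gives an adjusted rate of 160 per 100,000.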
Furthermore, cancer incidence rates were stratified by tumor stage for each median household income category. Tumor stage at diagnosis included in-situ, localized, regional, distant, and unstaged, which were defined according to the staging manual of the National Cancer Institute [35]. The in-situ stage was defined as "the presence of malignant cells within the cell group from which they arose". Localized stage was defined as "a malignancy limited to the organ of origin; it has spread no farther than the organ in which it started". Regional stage was defined as "tumor extension beyond the limits of the organ of origin". Distant stage was defined as "tumor cells that have broken away from the primary tumor, have travelled to other parts of the body, and have begun to grow at the new location". In this study, we combined the in-situ and localized categories as early cancer stage, and the regional and distant categories as late cancer stage. This allowed an assessment of whether early or late stage breast and colorectal cancer incidence rates were positively associated with median household income. A Poisson regression model, which is often used to model rare diseases, was chosen to model the number of cases in each county [43]. The model, with the population size specific to each demographic group as an offset variable, was used to estimate the incidence rate ratio (IRR) associated with median household income, adjusting for age, gender, race/ethnicity, degree of urbanization/population, and all two-way interaction terms (Additional file 1). In order to determine the temporal relationship, incidence rate ratios were calculated separately and adjusted for potential confounders in each time period. The assumptions of the Poisson regression model were examined with respect to linearity, constant variance, and independence of observations.
The examination showed only a minor degree of overdispersion, supporting that the Poisson regression model assumption was acceptable. The SAS 9.3 statistical software (SAS Institute Inc., Cary, NC) was used for all analyses.
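As a hedged sketch of the kind of model described above, a Poisson regression with a population offset can be written as follows. The exact covariate coding is given in Additional file 1 and is not reproduced here, so the symbols and the choice of the lowest quintile as reference are illustrative assumptions.

```latex
\log \mathbb{E}[Y_g] \;=\; \log N_g \;+\; \beta_0
  \;+\; \sum_{q=2}^{5} \beta_q \, I(\text{income quintile}_g = q)
  \;+\; \mathbf{z}_g^{\top} \boldsymbol{\gamma},
\qquad
\mathrm{IRR}_q = \exp(\beta_q)

\hat{\phi} \;=\; \frac{1}{n - p} \sum_{g=1}^{n}
  \frac{\left(Y_g - \hat{\mu}_g\right)^2}{\hat{\mu}_g}
```

Here $Y_g$ is the number of cases and $N_g$ the population in demographic stratum $g$ (the offset), $\mathbf{z}_g$ collects age, gender, race/ethnicity, urbanization, and the interaction terms, and $\mathrm{IRR}_q$ compares quintile $q$ with the lowest quintile. The second expression is one common overdispersion check (the Pearson dispersion, with $n$ strata, $p$ parameters, and fitted means $\hat{\mu}_g$); values near 1 are consistent with the Poisson variance assumption, although the text does not state which diagnostic was actually used.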

Results
Trends in breast cancer incidence rates
Table 1 presents the distribution of age-adjusted breast cancer incidence rates stratified by median household income and tumor stage in Texas from 1995 to 2011. Overall age-adjusted breast cancer incidence rates were 153.87, 157.58, 142.81, and 141.07 cases per 100,000, respectively, for the 4 time periods (1995-1999, 2000-2003, 2004-2007, and 2008-2011). Breast cancer incidence increased from 1995 to 2003 and decreased from 2004 to 2011. The increasing breast cancer incidence in 1995-2003 was consistent with the time period when widespread screening programs were implemented [44,45]. Within each time period, there was a significant association between breast cancer incidence and median household income level. For example, the age-adjusted breast cancer incidence rate in the highest median income quintile group was 151.51 cases per 100,000 in 2008-2011 as compared to 98.95 cases per 100,000 in the lowest median income quintile group. After the results were stratified by tumor stage (last 2 columns in Table 1), the association between higher income and an increased breast cancer incidence was largely limited to women with early stage breast cancer, while there was no clear pattern of an association between high income and late stage breast cancer incidence. Figure 1 provides the geographic distribution of age-adjusted breast cancer incidence rates associated with median income at the county level, stratified by four time periods: (a) 1995-1999, (b) 2000-2003, (c) 2004-2007, and (d) 2008-2011. Counties with higher median income were likely to have higher breast cancer incidence rates. The effect of increasing median income quintiles on the age-adjusted breast cancer incidence rates was most pronounced in 2000-2003. The lowest median income counties were located around US-Mexico border areas, where age-adjusted breast cancer incidence rates were low.
Spatial analysis revealed significant low breast cancer incidence cluster regions located in southwest US-Mexico border counties in every study time period (P < 0.001). In other areas of Texas, only a few isolated counties were identified as cold spot regions. Cold spot maps are provided in the supplemental materials (Additional file 1: Figure S1).

Trends in colorectal cancer incidence rates
Table 2 presents the distribution of age-adjusted colorectal cancer incidence rates stratified by median income from 1995 to 2011. Overall age-adjusted colorectal cancer incidence rates were 59.17, 58.11, 52.49, and 45.76 cases per 100,000, respectively, for the 4 time periods (1995-1999, 2000-2003, 2004-2007, and 2008-2011). Unlike the trends over time for breast cancer, overall age-adjusted colorectal cancer incidence rates decreased consistently from 1995 to 2011. Colorectal cancer incidence rates were not consistently associated with higher income levels. A small increase in colorectal cancer incidence was observed from the lowest to the third quintile, and no increase was observed from the third to the highest income quintile. When the results were stratified by tumor stage (last 2 columns in Table 2), unlike what was found for breast cancer in Table 1, tumor stage for colorectal cancer did not seem to modify the association between income and overall colorectal cancer incidence. In other words, both early and late stage colorectal cancer incidence rates slightly decreased with higher income. Although counties in the US-Mexico border area had lower age-adjusted colorectal cancer incidence rates, there was no clear pattern in the association between median income and colorectal cancer incidence rates. Spatial analysis detected significant low colorectal cancer incidence cluster regions near US-Mexico border counties in the 1995-1999 and 2008-2011 periods (P < 0.001, Additional file 1: Figure S2).

Breast and colorectal cancer incidence risk ratios
Table 3 presents the breast and colorectal cancer incidence rate ratios (IRR) by median household income quintiles for the overall population and also stratified by urbanization (metro/urban/rural) using Poisson regression models that were adjusted for age, gender, race, and all possible two-way interactions. Compared to those in the lowest median income quintile counties in 2008-2011, the overall age-adjusted breast cancer incidence rate was 69 % higher in counties in the highest income quintile (IRR = 1.69, 95 % CI: 1.56-1.82) and 22 % higher in counties in the 2nd lowest income quintile (IRR = 1.22, 95 % CI: 1.10-1.34). The association between breast cancer and income was stronger in metro and urban areas. For example, the breast cancer incidence rate in metro areas was significantly higher (by 66 %) in counties in the highest income quintile (IRR = 1.66, 95 % CI: 1.52-1.82) compared to the lowest median income quintile counties. In rural areas, breast cancer incidence rates appeared to be elevated with higher income quintile, but the differences were not statistically significant and had wide confidence intervals, partly due to small numbers. Breast cancer risk increased with increasing median income in all 4 time periods. On the contrary, colorectal cancer risk did not increase with increasing median income. Table 4 presents the risk of breast and colorectal cancer in association with median income and race/ethnicity by 4 time periods. Because of statistically significant interactions between median income and race/ethnicity, the association between cancer risk and median income was stratified by race/ethnicity. In non-Hispanic white women, the risk of breast cancer significantly increased with increasing median income in all time periods except 2000-2003. However, in women of other racial/ethnic groups, breast cancer risk did not appear to increase with increasing median income.
On the contrary, in men and women with colorectal cancer, we did not observe any pattern of increased risk of colorectal cancer with increasing median income in non-Hispanic whites. The risk of colorectal cancer appeared to decrease with increasing median income in racial/ethnic populations.
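To show how an IRR and its confidence interval are read, here is a minimal sketch of a crude (unadjusted) rate ratio with a Wald interval on the log scale. The IRRs reported above come from adjusted Poisson models, so this simplified calculation would not reproduce them; the function names and the counts in the usage note are hypothetical.

```python
import math


def incidence_rate_ratio(cases_a, pop_a, cases_ref, pop_ref, z=1.96):
    """Crude IRR of group A versus a reference group, with a Wald CI.

    The standard error of log(IRR) is approximated by
    sqrt(1/cases_a + 1/cases_ref), which assumes Poisson counts.
    Returns (irr, lower, upper).
    """
    irr = (cases_a / pop_a) / (cases_ref / pop_ref)
    se_log = math.sqrt(1 / cases_a + 1 / cases_ref)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


def percent_difference(irr):
    """Express an IRR as a percent difference from the reference group."""
    return (irr - 1) * 100
```

With 300 cases among 200,000 people in one group and 200 cases among 200,000 in the reference group, the crude IRR is 1.5 and the interval excludes 1. In the same way, `percent_difference(1.69)` corresponds to the "69 % higher" wording used for an IRR of 1.69.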

Discussion
This study demonstrated that breast cancer risk increased with increasing median income, whereas colorectal cancer risk slightly decreased with increasing median income. In addition, the study examined the risks of breast and colorectal cancer by race/ethnicity and degree of urbanization and highlighted the spatial variations in Texas, where breast and colorectal cancer risks were lower in particular regions than in other geographic areas. This appears to be the first study in Texas to assess the relationship between median household income and age-adjusted breast and colorectal cancer incidence rates. The findings of this study were consistent with those of other studies outside Texas. Numerous studies have previously shown that breast cancer risk was associated with higher income and colorectal cancer risk was associated with lower income in the U.S. [5][6][7][8][9][10][11][12][13]. For example, Clegg and colleagues studied the risk of cancer in association with SES in the 11 SEER areas in 1973-2001 and found that age-adjusted breast cancer incidence increased from 136.35 cases (per 100,000 population) in those with family income of < $12,500 per year to 158.15 cases in those with family income of ≥ $50,000, whereas the age-adjusted colorectal cancer incidence (in both men and women combined) decreased from 69.55 to 64.09 [11]. Klassen and Smith reviewed 90 studies from around the world that were published between 1978 and 2009 on breast cancer and social class, concluding that breast cancer incidence continued to be higher in high social class populations than in low social class populations [46]. On the contrary, Aarts and colleagues reviewed 62 studies published between 1995 and 2009 on colorectal cancer incidence and socioeconomic status and concluded that a lower SES was associated with higher colorectal cancer incidence in the U.S. and Canada, although the findings on SES and colorectal cancer risk in Europe were different [26]. Also, the gap in colorectal cancer incidence between high and low socioeconomic status was narrowing over time [47,48].
The relationships between cancer risks and socioeconomic status are complex, especially so for breast cancer. Multiple reasons and explanations have been discussed previously, including social class as a marker for biological and behavioral differences, differential access to medical facilities, different health awareness for disease screening or early detection, and different exposures to environmental pollution particularly in metro and urban areas, lifestyle, stress, and work factors [49][50][51][52][53][54]. High social class and income have been documented to influence mothers and daughters in their reproductive life and related factors, for example, earlier onset of menarche, delayed age for first birth and menopause, fewer children, and perhaps more use of hormone replacement therapies, all of which were associated with a prolonged exposure to hormones and an increased risk of breast cancer [55][56][57][58]. These hormone-related factors may be one of the reasons why there was no such association between social class and increased risk of colorectal cancer, because colorectal cancer is not a hormone-associated tumor. Furthermore, when mammography was implemented as an effective screening tool, breast cancer incidence increased sharply due to screening. For example, breast cancer incidence in the U.S. increased from 112 cases per 100,000 women in the early 1980s to 234 cases per 100,000 women in the late 1990s [59]. This increase was particularly evident in women with higher income, better health insurance coverage, and greater access to screening facilities. For this reason, we stratified the results by tumor stage and found that higher income was significantly associated with an increased early stage breast cancer incidence in 4 different time periods. On the contrary, no sharp increase in colorectal cancer incidence was observed after cancer screening in the U.S. [2,60,61].
In this study we also found that early stage colorectal cancer incidence was not associated with income and time periods. Moreover, genetic and environmental factors were associated with an increased risk of breast and colorectal cancer. For example, a diet that is high in red meats or processed meats has been well documented to increase cancer risk overall and colorectal cancer risk in particular [62][63][64][65]. This type of diet was associated with lower income, which was consistent with what we found in this study on the higher risk of colorectal cancer in men and women with lower income [66,67]. Finally, we observed significant clusters of counties with low breast cancer incidence along the southwest US-Mexico border in all four time periods. This border area consisted of low income counties, and a majority of people in these counties were Hispanics of Mexican origin. The finding of a low breast cancer incidence was consistent with previous studies [6-9, 11, 46]. It was reported that the breast cancer incidence rate in Hispanic women was 26 % lower than in non-Hispanic white women and that these risk differences were likely attributable to differences in some potential risk factors for breast cancer, such as younger age at first birth and more children in Hispanic women [68]. However, we also observed significant clusters of counties with low colorectal cancer incidence in this border area in the 1995-1999 and 2008-2011 periods but not in the 2000-2007 period. Although numerous studies concluded an association between low income and high colorectal cancer incidence, the low colorectal cancer incidence in these US-Mexico border counties might be partially explained by the fact that Hispanics of Mexican origin had a lower risk than those of other Hispanic origins in Puerto Rico, South and Central America [68].
Because the above specific risk factors (such as number of children) and screening patterns for breast and colorectal cancer were not measured in our study, this report cannot address how these factors might explain the observed associations between income and cancer risk; it only demonstrated that higher income was associated with an increased risk of breast cancer and a lower risk of colorectal cancer in Texas. In particular, further research may be needed on why the trend of an increased risk of breast cancer with higher median income was statistically significant only in non-Hispanic white women after adjusting for age. There were some other limitations to be noted in this study. First, the income variable was the median household income at the county level rather than at the individual level. Hence, there was a potential ecological fallacy, in which the income at the group level may not necessarily represent the income at the individual level. Second, median household income alone may not be a good proxy for socioeconomic status. Ideally, education, occupation, and health knowledge, which are strongly associated with higher cancer screening rates and healthy lifestyles, should all be taken into consideration in the analyses. Third, the denominator populations by county in the non-census years were estimated by the U.S. Bureau of the Census. We were unable to verify specific populations by year and age, particularly given that Texas has had one of the largest rates of population growth and demographic change in the U.S. The lack of accuracy in population estimates might have led to biased calculations of cancer incidence rates.

Conclusion
In conclusion, this statewide and population-based study demonstrated that higher income was associated with an increased risk of breast cancer and a decreased risk of colorectal cancer in Texas. There were also geographic variations with cancer cases clustered in high and low risk areas in Texas. Future studies may need to explore more factors that might explain these income and cancer risk associations and geographic variations.

Ethics approval and consent to participate
The Institutional Review Board of the Texas Department of State Health Services and the Committee for the Protection of Human Subjects at the University of Texas Health Science Center granted ethics approval to our study. The informed consent was waived because the study was retrospective in design and from public datasets.

Consent for publication
Not applicable because this manuscript does not contain any individual person's data.

Availability of data and materials
The de-identified datasets on incident breast and colorectal cancer cases in this study cannot be shared due to the release policy requirement by the Texas Cancer Registry (TCR). However, these public-use datasets can be obtained from TCR by following a few appropriate procedures at http://www.cprit.state.tx.us/texas-cancerregistry [22]. Information on the county population estimates, median household income, and population age groups at the county level was obtainable from the U.S. Census Bureau without needing approval at http://www.census.gov/did/www/saipe/index.html [23,29].

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
ZL contributed to the design of the study, data analyses, interpretation of results, and draft and approval of manuscript; KZ contributed to the data analyses, interpretation of results, draft and approval of manuscript; XLD contributed to the concept and design of the study, data analyses, interpretation of results, and draft and approval of manuscript. XLD is responsible for the overall contents. All authors have read and approved the final manuscript.