Downregulated SPINK4 is associated with poor survival in colorectal cancer

Background: SPINK4 is a gastrointestinal peptide that is abundantly expressed in human goblet cells. The clinical significance of SPINK4 in colorectal cancer (CRC) is largely unknown. Methods: We retrieved the expression data of 1168 CRC patients from 3 Gene Expression Omnibus (GEO) datasets (GSE24551, GSE39582, GSE32323) and The Cancer Genome Atlas (TCGA) to compare the expression level of SPINK4 between CRC tissues and normal colorectal tissues and to evaluate its value in predicting the survival of CRC patients. At the protein level, these results were further confirmed by data mining in the Human Protein Atlas and by immunohistochemical staining of samples from 81 CRC cases in our own center. Results: SPINK4 expression was downregulated in CRC compared with that in normal tissues, and decreased SPINK4 expression at both the mRNA and protein levels was associated with poor prognosis in CRC patients from all 3 GEO datasets, the TCGA database and our cohort. Additionally, lower SPINK4 expression was significantly related to higher TNM stage. Moreover, in multivariate regression, SPINK4 was confirmed as an independent indicator of poor survival in CRC patients in all databases and in our own cohort. Conclusions: Reduced expression of SPINK4 is associated with poor survival in CRC and may serve as a novel prognostic indicator.


Background
Despite significant progress in surgery, radiotherapy, chemotherapy, and targeted therapy, colorectal cancer (CRC) remains one of the leading cancer types in terms of incidence and cancer-related death worldwide [1]. This burden is partly due to a lack of diagnostic markers for the detection of CRC and inefficient treatment of late-stage colorectal cancer [1]. Currently, the prediction of survival or relapse and the determination of therapeutic strategies are mostly based on the tumor-node-metastasis (TNM) system [2]. However, the long-term outcome varies widely, even in patients within the same TNM stage [3]. Moreover, this pathological prognostic prediction method alone may not accurately predict prognosis without incorporating molecular data of the tumor [4]. Hence, an increasing number of studies in this era of genomic medicine have focused on molecularly based prognostic markers, which are complementary to the pathological TNM system [5,6].
Serine protease inhibitors function as central regulators of many vital processes in the mammalian body; when serine protease activity or serpin-mediated regulation becomes unbalanced or dysfunctional, severe disease states, such as cancer and sepsis, can ensue [7]. One branch of the family of serine protease inhibitors is named Kazal type (SPINK) and originally consisted of four members in humans (SPINK1, SPINK2, SPINK4, and SPINK5) [8]. Although the major site of expression of all four SPINK members may differ, all are thought to be involved in protection against proteolytic degradation of epithelial and mucosal tissues [8]. SPINK4 is abundantly expressed in human goblet cells but was also reported to be formed, stored, and secreted from monocytes and might function as a gastrointestinal peptide [9]. A previous study found that serum SPINK4 levels were increased in CRC and had high diagnostic value but were not associated with the survival of CRC patients [10]. The expression status of SPINK4 in tissue samples and its clinical significance in CRC is largely unknown. Therefore, the present study aimed to measure SPINK4 expression in CRC tissues and to investigate its relationship with clinicopathological features and survival.

Database analysis
A total of four microarray data sets were retrieved from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The microarray data set GSE39582 included mRNA expression profiles of a large series of 443 CRC and 19 nontumor colorectal mucosa and was submitted by Nabila Elarouci et al. [11]. The microarray data set GSE24551 comprised two independent series including exon level expression profiling data for a total of 160 CRC tissue samples and was submitted by Anita Sveen et al. [12]. The microarray data set GSE32323 included mRNA expression profiles of 17 pairs of cancer and noncancerous tissues from colorectal cancer patients and was submitted by Kaoru Mogushi et al. [13]. The majority of colorectal cancers develop as tubular adenomas through multistage carcinogenesis. To address the sequential expression changes in normal colonic mucosa, adenoma, and carcinoma tissues, the mRNA expression profiles of 4 pairs of normal colonic mucosa and adenoma tissues and 4 pairs of adenoma and carcinoma tissues from GSE3880 were also downloaded.
The expression level of the SPINK4 gene in other cell lines, organs and cancers was identified in the MediSapiens IST Online database (http://ist.medisapiens.com/) and the online database Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/index.html) [14].
RNA sequencing (RNA-Seq) data and the full clinicopathological dataset from 438 colon cancer patients from COAD were obtained from the TCGA data portal (https://portal.gdc.cancer.gov/). We excluded cases without sufficient survival data (n = 2), leaving 436 colon cancer patients selected for further survival analysis.
To further address the change in SPINK4 protein expression in CRC tissues, SPINK4 expression in CRC tissues and normal colon tissues was first reviewed by using the immunohistochemical (IHC) staining data provided in the Human Protein Atlas (http://www.proteinatlas.org/) [15].
Single-cell sequencing data and corresponding single-cell functional states from GSE81861 [16], which included the RNA expression profiles of 44 single CRC cells, were downloaded from CancerSEA [17].
A workflow of this study is shown in Fig. 1. Tissue specimens were fixed in formalin and embedded in paraffin. All patients were followed up until May 2018. Detailed information on the clinical characteristics of all patients, including gender, age, body mass index (BMI), TNM stage, tumor size, histological type, pretreatment CEA level, pretreatment CA199 level, perineural invasion status, venous invasion status, tumor location, and tumor differentiation, was retrieved. Diagnosis and TNM staging were performed according to the 7th edition of the AJCC Cancer Staging Manual [18].

IHC staining and interpretation of results
The differential protein expression levels of SPINK4 in 81 colorectal cancer and paired normal tissues were measured using IHC staining. Anti-SPINK4 monoclonal antibody (ab121257, Abcam, UK) was used at a working concentration of 1:200. The scores were evaluated based on staining intensity and the percentage of positive cells for each of the sections. The staining intensity was scored as follows: 0, no staining; 1, light yellow staining; 2, yellow-brown staining; and 3, deep brown staining. The percentage of positive cells was scored as follows: 0, 0-5%; 1, 6-25%; 2, 26-50%; 3, 51-75%; and 4, >75%. The final score was calculated as follows: positive cell score × staining intensity score. The total scores were condensed into four categories: 0 for negative (−); 1-3 for weakly positive (+); 4-7 for positive (++); and 8-12 for strongly positive (+++). All patients were sorted into two groups according to the total score. High expression of SPINK4 was defined as a detectable immunoreaction with a total score category of ≥1+.
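The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the software used in the study; the function names are hypothetical, and the handling of percentages that fall between the stated bins (for example 5.5%) is an assumption.

```python
def percent_score(pct_positive):
    """Map the percentage of positive cells (0-100) to the 0-4 score.

    Assumption: values between the stated bins (e.g. 5.5%) are assigned
    to the higher bin; the text only lists integer bin edges.
    """
    if not 0 <= pct_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_positive <= 5:
        return 0
    if pct_positive <= 25:
        return 1
    if pct_positive <= 50:
        return 2
    if pct_positive <= 75:
        return 3
    return 4


def ihc_total_score(intensity, pct_positive):
    """Total score = positive cell score x staining intensity score (0-12)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * percent_score(pct_positive)


def ihc_category(total):
    """Condense the total score into the four reported categories."""
    if total == 0:
        return "-"
    if total <= 3:
        return "+"
    if total <= 7:
        return "++"
    return "+++"


def is_high_expression(total):
    """High expression is read here as category '+' or above (total >= 1)."""
    return total >= 1
```

For instance, a section with deep brown staining (intensity 3) in 80% of cells receives a total score of 12 and falls into the strongly positive category, whereas any section with 5% or fewer positive cells scores 0 regardless of intensity.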

Gene set enrichment analysis (GSEA)
To determine the function of SPINK4, GSEA was conducted in patients with the top 25% and with the bottom 25% of the expression in the GSE24551 dataset and GSE39582 dataset, respectively. The annotated gene set c2.cp.kegg.v5.2.symbols.gmt from the pathway database was selected as the reference gene set. P < 0.05, |enrichment score (ES) | > 0.3 and gene size ≥30 were set as the cutoff criteria. The overlapping enriched hallmark signatures in the GSE24551 dataset and GSE39582 dataset are illustrated by a Venn diagram.
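The cutoff criteria and the overlap step can be illustrated with a short Python sketch. It assumes that each GSEA result has already been exported as a record with a gene set name, nominal P value, enrichment score and gene set size; the field names and the toy records are hypothetical placeholders and do not come from the study data.

```python
def passes_cutoff(result, p_max=0.05, es_min=0.3, size_min=30):
    """Apply the stated criteria: P < 0.05, |ES| > 0.3, gene size >= 30."""
    return (
        result["p"] < p_max
        and abs(result["es"]) > es_min
        and result["size"] >= size_min
    )


def enriched_sets(results):
    """Names of the gene sets that pass all three cutoffs in one dataset."""
    return {r["name"] for r in results if passes_cutoff(r)}


def overlapping_sets(results_a, results_b):
    """Gene sets enriched in both datasets (the Venn diagram intersection)."""
    return enriched_sets(results_a) & enriched_sets(results_b)


# Toy records with made-up values, for illustration only.
dataset_a = [
    {"name": "SET_1", "p": 0.01, "es": 0.45, "size": 120},
    {"name": "SET_2", "p": 0.20, "es": 0.50, "size": 80},   # fails on P
    {"name": "SET_3", "p": 0.03, "es": -0.35, "size": 40},
]
dataset_b = [
    {"name": "SET_1", "p": 0.04, "es": 0.31, "size": 120},
    {"name": "SET_3", "p": 0.02, "es": -0.40, "size": 25},  # fails on size
]

shared = overlapping_sets(dataset_a, dataset_b)
```

With these toy inputs only `SET_1` passes the cutoffs in both lists, so it is the only member of `shared`.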

Statistical analysis
Categorical variables were compared using the χ2 test and Fisher's exact test. Continuous variables were compared using Student's t-test. The cutoff value for SPINK4 expression was assessed using X-tile 3.6.1 software (Yale University, New Haven, CT, USA) [19]. Survival curves were computed with the Kaplan-Meier method and compared using the log-rank test. Univariate Cox proportional hazards regression was applied to estimate the individual hazard ratios (HRs) for the survival rates. Significant variables in the univariate analysis (P < 0.05) were then retained in multivariate analysis using Cox proportional hazards regression models to explore the independent indicators. To comprehensively explore which functional states are associated with SPINK4 at the single-cell level, a linear model was used to evaluate linear correlations between SPINK4 expression and 14 cancer-related functional states (stemness, invasion, metastasis, proliferation, EMT, angiogenesis, apoptosis, cell cycle, differentiation, DNA damage, DNA repair, hypoxia, inflammation and quiescence). A P value of < 0.05 was set as the level of significance. All statistical analyses were performed using SPSS software (ver. 17, SPSS Inc., Chicago, IL, USA) and R (ver. 3.4.1).
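As an illustration of how a Kaplan-Meier survival estimate such as a 5-year survival rate is obtained, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the SPSS or R procedure used in the study; the function names and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function lookup)."""
    prob = 1.0
    for time, surv in curve:
        if time > t:
            break
        prob = surv
    return prob


# Toy follow-up data in years (made up for illustration).
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

Censored patients contribute to the risk set until they leave follow-up but never lower the curve themselves, which is why the estimate at a fixed time point such as 5 years can differ from the crude proportion of patients alive.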

The mRNA expression of SPINK4 is downregulated in CRC tissues
To examine the levels of SPINK4 mRNA in CRC samples, we first analyzed SPINK4 mRNA expression by comparing 17 CRC tissues and paired adjacent normal tissues from the GSE32323 dataset, and the results indicated that the relative SPINK4 expression level was significantly decreased in CRC tissues compared with that in adjacent normal tissues (8.5 ± 2.2 vs. 10.5 ± 2.6, P = 0.016, Fig. 2a). These results were further confirmed in the GSE39582 dataset, in which the relative SPINK4 mRNA expression levels in CRC tissues and normal tissues were 8.5 ± 2.6 and 10.1 ± 2.3, respectively (P = 0.010, Fig. 2b). To address the sequential expression changes from normal colonic mucosa to adenoma to carcinoma, we analyzed SPINK4 mRNA expression by comparing 4 pairs of normal colonic mucosa and adenoma tissues and 4 pairs of adenoma and carcinoma tissues from the GSE3880 dataset. The results indicated that the relative SPINK4 expression level was decreased in adenoma compared to that in adjacent normal mucosa (4.1 ± 0.1 vs. 4.2 ± 0.1, P = 0.007, Fig. 2d). However, SPINK4 expression was similar between adenoma and carcinoma (4.1 ± 0.0 vs. 4.2 ± 0.1, P = 0.206, Fig. 2e).

The mRNA expression of SPINK4 in other cell lines, organs and cancers
The SPINK4 mRNA levels in various other cell lines and normal organ tissues were analyzed via the IST Online database. This analysis revealed that SPINK4 was highly expressed in normal colorectal, small intestinal and stomach tissues as well as in gastrointestinal (GI) system cell lines (Fig. 3). In addition, the differences in SPINK4 expression in other tumor and normal tissues of multiple cancer types were analyzed in the GEPIA database. The results revealed that SPINK4 expression was higher in pancreatic adenocarcinoma (PAAD) and gastric adenocarcinoma (STAD) tissues than in the corresponding normal tissues (Fig. 4).

The protein expression of SPINK4 was downregulated in CRC tissues
To further address the change in SPINK4 protein expression in CRC tissues, data mining in the Human Protein Atlas was first performed. In all 3 normal colon tissues, SPINK4 expression was strongly positive, as determined by IHC staining, and was mainly located in the cytoplasm and membrane (Fig. 2f). However, among the 12 CRC tissues examined, 7 were negative for SPINK4 staining (negative SPINK4 staining rate: CRC tissues vs normal colon tissues: 58.3% vs 0%, Fig. 2g). Then, IHC staining was used to assess the levels of the SPINK4 protein in 81 colorectal cancer tissues compared with adjacent normal tissues. Positive expression of the SPINK4 protein was found in 98.8% (80/81) of the normal colorectal tissues and 30.9% (25/81) of the CRC tissues (P < 0.001). Moreover, the SPINK4 protein was expressed at significantly lower levels in CRC tissues (total score: 0.3 ± 0.5) than in normal tissues (total score: 2.7 ± 0.6; P = 0.016, Fig. 2c). SPINK4 was mainly located in the cytoplasm and membrane of normal mucosal epithelial cells and primary cancer cells. Images illustrating different SPINK4 expression levels in CRC tissues and paired adjacent normal tissues are shown in Fig. 5.

Correlations between SPINK4 expression and clinicopathological characteristics
Subsequently, the correlations between SPINK4 expression and the clinicopathological characteristics of patients with CRC were investigated. Low SPINK4 expression was more frequently observed in patients with more advanced TNM stage (stage III-IV: 33/56, 58.9%) than in patients with lower TNM stage (stage I-II: 8/25, 32.0%, P = 0.025). In addition, low SPINK4 expression was significantly related to lower BMI (22.0 ± 3.4 vs. 23.9 ± 2.5, P = 0.031). SPINK4 expression was not associated with tumor grade, since the percentage of well to moderately differentiated tumors was similar between the low SPINK4 expression and high SPINK4 expression groups (96.4% vs. 92.0%, P = 0.583). No correlations were observed for gender, age, tumor size, histological type, pretreatment CEA level, pretreatment CA199 level, perineural invasion status, venous invasion status, or tumor location (Table 1).
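The comparison of low SPINK4 expression between TNM stage groups can be reproduced approximately from the counts given above (33 of 56 stage III-IV patients vs. 8 of 25 stage I-II patients). The sketch below assumes a Pearson χ2 test on the 2 × 2 table without continuity correction; the text does not state which variant was used, so this is an assumption, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]].

    Returns (chi-square statistic, two-sided P value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: stage III-IV, stage I-II; columns: low SPINK4, high SPINK4.
chi2, p_value = chi_square_2x2(33, 56 - 33, 8, 25 - 8)
```

Under this assumption the statistic is about 5.01 with a P value of about 0.025, which agrees with the reported value.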

Correlations between SPINK4 expression and CRC patient survival
The prognostic significance of SPINK4 in CRC patients was explored first by a data mining approach in the GEO and TCGA databases at the mRNA level. The patient characteristics of each study cohort are summarized in Table 2. The study cohorts were divided into two groups according to the cutoff points, which were assessed using X-tile. CRC patients from the GSE24551 dataset with low SPINK4 mRNA levels had significantly lower 5-year overall survival rates (5Y-OS) than those with high SPINK4 mRNA levels (56.2% vs. 78.9%, P = 0.022, Fig. 6a). In the GSE39582 dataset, low SPINK4 mRNA expression was also significantly associated with  Fig. 6b). These results were also confirmed in patients from the TCGA database (5Y-OS of low expression vs. high expression: 38.5% vs. 76.9%, P<0.001, Fig. 6c). At the protein level, IHC staining was carried out in our own cohort, and survival analysis revealed that CRC patients with low levels of SPINK4 protein expression had significantly worse disease-free survival (DFS) and overall survival than those with high levels of SPINK4 protein expression (5Y-DFS of low expression vs. high expression: 58.5% vs. 83.6%, P = 0.007, Fig. 6d; 5Y-OS of low expression vs. high expression: 58.4% vs. 83.6%, P = 0.014, Fig. 6e). Furthermore, SPINK4 was confirmed as an independent indicator of poor survival in patients with CRC in multivariate Cox proportional hazard regression in all 3 databases (GSE24551, HR = 0.462, P = 0.056; GSE39582, HR = 0.636, P = 0.014; TCGA database, HR = 0.301, P = 0.014) and in our own cohort (HR = 0.299, P = 0.027 for OS; HR = 0.264, P = 0.014 for DFS) (Tables 3-4). In addition, other independent factors included TNM stage (GSE24551, GSE39582, TCGA, our cohort), age (GSE39582, TCGA, our cohort), microsatellite status (GSE24551), KRAS status (GSE39582), perineural invasion status (TCGA) and venous invasion status (TCGA).

Discussion
The family of SPINK protease inhibitors originally consisted of four members in humans: SPINK1, SPINK2, SPINK4, and SPINK5 [8]. SPINK1 is mainly produced in pancreatic acinar cells and is expressed in various cancers and inflammatory states. In addition to being a protease inhibitor, SPINK1 also acts as an acute-phase reactant and a growth factor. Furthermore, it has been shown to modulate apoptosis [20]. Ozaki et al. [21] suggested that SPINK1 stimulates the proliferation of pancreatic cancer cells through the EGFR/mitogen-activated protein kinase cascade. Ida et al. [22] demonstrated that SPINK1 stimulates the proliferation of colon cancer cells and is involved in colorectal cancer progression. Moreover, overexpression of SPINK1 is associated with adverse prognosis in other cancers, including prostate cancer [23], hepatocellular cancer [24] and breast cancer [25]. Thus, SPINK1 can be used as a prognostic tumor marker. However, there have been only a few studies on the gene encoding SPINK4, another member of the SPINK family, in tumors. The analysis in the present study revealed that SPINK4 was highly expressed in normal colorectal, small intestinal and stomach tissues as well as in GI system cell lines. We first investigated SPINK4 mRNA expression in tumors in data from the TCGA and two GEO datasets (GSE32323 and GSE39582), and the results showed that SPINK4 mRNA expression was significantly decreased in CRC tissues compared with that in paired adjacent normal tissues. In contrast to its downregulation in CRC, SPINK4 expression was higher in pancreatic adenocarcinoma (PAAD) and gastric adenocarcinoma (STAD) than in the corresponding normal tissues at the RNA level. The change in SPINK4 protein expression in CRC tissues was then validated by data mining of the Human Protein Atlas and by IHC staining in our own samples.
Consistent with the predictive results in the database analysis, the SPINK4 protein was expressed at significantly lower levels in the 81 CRC tissues than in the paired normal tissues. Furthermore, SPINK4 mRNA expression was decreased in adenoma compared to that in adjacent normal mucosa in the present study. However, although the expression of SPINK4 in carcinoma tended to be further decreased compared to that in adenoma, the difference was not statistically significant. These results suggest that the decrease in SPINK4 expression is an early event in colon carcinogenesis. Due to the limited sample size of adenoma cases in the present study, whether SPINK4 can be used as a predictor of CRC formation requires further study. Interestingly, the serum SPINK4 level was increased in patients with CRC compared with that in healthy controls in previous research [10]. Since proteins [...] [26]. With regard to the value of serum SPINK4 for predicting survival, Xie et al. [10] did not find that serum SPINK4 was associated with OS or DFS in CRC patients. The short follow-up time (less than 10 months in half of the patients) as well as the small sample size in that study might partially explain the negative result. At the tissue level in the present study, IHC staining in our own cohort indicated that the expression level of the SPINK4 protein was significantly associated with reduced survival rates in patients with CRC. To obtain a reliable conclusion, these results were further externally validated in 3 other independent databases (TCGA and GSE24551, GSE39582).

Fig. 6 Low SPINK4 levels were associated with significantly decreased overall survival in CRC patients from (a) GSE24551, (b) GSE39582, (c) TCGA and (d) our study cohort. Low SPINK4 levels were associated with significantly decreased disease-free survival rates in CRC patients from (e) our study cohort

Additionally, in our multivariate Cox proportional hazards regression model, SPINK4 was confirmed as an independent indicator of poor survival in CRC patients in all 3 databases (GSE24551, GSE39582, TCGA) and in our own cohort. These findings suggest that SPINK4 may be exploited as a potential novel indicator of poor survival in patients with CRC. The functional and pathway enrichment analysis of SPINK4 in CRC showed that pathways such as oxidative phosphorylation, the metabolism of several components, and the Alzheimer's disease pathway were significantly enriched. In cancer cells, there is enhanced glucose use, with the rate of the tricarboxylic acid cycle and oxidative phosphorylation slowed and glycolysis increased, as a way to generate energy [27]. This metabolic switch provides substrates for cell growth and division and free energy. Blocking these metabolic pathways could lead to a new approach in cancer treatment [27]. The gene sets associated with oxidative phosphorylation-related pathways were enriched in the samples with high SPINK4 expression in the current study. Furthermore, in the single-cell functional analysis in our study, SPINK4 was deregulated in cancer stem cells, which were demonstrated to show a distinct metabolic phenotype that can be highly glycolytic or oxidative phosphorylation-dependent [28]. Metabolic pathways, including inositol phosphate metabolism [29], fructose and mannose metabolism [30], and butanoate metabolism [31], were reported to be associated with cancer development. In the present study, the SPINK4 expression level was decreased in CRC. However, the Alzheimer's disease pathway was significantly related to high expression of SPINK4. Both Alzheimer's disease and cancer are prevalent in the elderly. Some epidemiological studies have reported a negative association between Alzheimer's disease and cancer.
The results of a meta-analysis suggested that individuals diagnosed with Alzheimer's disease had a decreased risk for incident cancer by 42%, and patients with a history of cancer had a 37% decreased risk of Alzheimer's disease [32]. However, the underlying mechanism is still unclear. Several basic studies have indicated that neurodegenerative disorders and cancer share several biological pathways that may contribute to this negative association [33]. For instance, deletion or mutation of Pin1 can induce Alzheimer's disease-like pathological changes in mice [34]. However, Pin1 is overexpressed and/or activated by multiple mechanisms in many common human cancers and acts on multiple signaling pathways to promote tumorigenesis. Inhibition of Pin1 in animal models has profound antitumor effects [33]. Moreover, lower SPINK4 expression was found to be associated with CRC cell stemness properties and undifferentiated states in the present study. It was reported that HMGA1 promotes cancer stem cell properties and plays a role in the pathogenesis of Alzheimer's disease [35]. In addition, higher DNA repair ability was related to lower SPINK4 levels in the single-cell analysis. This relationship could be explained by the observation that cancer stem cells have increased DNA repair abilities and demonstrate resistance to DNA-damaging treatment approaches [36]. These GSEA data and single-cell functional analyses provide directions for further research on the mechanism of SPINK4 in the development of CRC.
Our study has certain limitations. First, only a small number of CRC tissue samples were collected, and these were collected retrospectively, to investigate the impact of SPINK4 on the long-term survival of CRC patients, although the results were further externally validated in 3 other independent databases (TCGA, GSE24551, and GSE39582). More studies are needed to confirm the results. Second, the exact biological function of SPINK4 in CRC and its detailed molecular regulation mechanisms were not assessed in the present study. The hypotheses drawn from GSEA and the single-cell functional and sequencing data analyses need to be further confirmed by in vitro and in vivo experiments.

Conclusions
Using multiple public datasets and our own cohort, this preliminary study found that reduced expression of SPINK4 is associated with poor survival in CRC and may serve as a novel prognostic indicator.