Appraising growth differentiation factor 15 as a promising biomarker in digestive system tumors: a meta-analysis

Background: Previous studies have highlighted cytokine growth differentiation factor 15 (GDF-15) as a potential biomarker for digestive system tumors (DST). This study sought to assess the feasibility of using GDF-15 as a diagnostic and prognostic biomarker in DST.
Methods: Eligible studies from multiple online databases were reviewed. Meta-analyses of diagnostic parameters were carried out using standard statistical methods. Study-specific hazard ratios (HRs) with 95% confidence intervals (CIs) were calculated to estimate the strength of the relationship between GDF-15 levels and clinical prognosis.
Results: We identified 17 eligible studies comprising 3966 patients with DST. The sensitivity, specificity, and area under the curve (AUC) for the discriminative performance of GDF-15 as a diagnostic biomarker were 0.74 (95% CI: 0.68–0.80), 0.83 (95% CI: 0.75–0.89), and 0.84, respectively. Moreover, increased GDF-15 expression levels were markedly associated with unfavorable overall survival (OS) in patients with DST (HR = 2.34, 95% CI: 2.03–2.70, P < 0.001; I² = 0.0%) and colorectal cancer (CRC) (HR = 2.27, 95% CI: 1.96–2.63, P < 0.001; I² = 0.0%). Stratification by cancer type, test matrix, ethnicity, and cut-off setting also illustrated the robustness of the diagnostic value of GDF-15 in DST.
Conclusion: Collectively, our data suggest that GDF-15 expression level may have value as a diagnostic and prognostic biomarker, independent of other, traditional biomarkers.
Electronic supplementary material: The online version of this article (10.1186/s12885-019-5385-y) contains supplementary material, which is available to authorized users.


Background
Over the past decade, digestive system tumors (DST) have become major causes of cancer-related mortality worldwide [1]. According to global cancer statistics compiled in 2016, death rates have increased for patients with DST, including those with liver cancer and pancreatic cancer [2]. Because sensitive diagnostic tests are lacking, many patients with DST are diagnosed at advanced stages, resulting in poor 5-year survival rates [2]. It is therefore necessary to identify novel, reliable biomarkers that can support early diagnosis and/or predict the prognosis of patients with DST.

Literature search
We searched the PubMed, EMBASE, EBSCO, Wiley Online Library, and Ovid databases for eligible studies from their inception to June 20, 2018. We used the following search terms or Medical Subject Headings (MeSH) words to identify eligible studies: "macrophage inhibitory cytokine-1/MIC-1/growth differentiation factor 15/GDF-15" AND "oesophageal cancer/oesophageal neoplasm/colorectal cancer/colorectal carcinoma/colon cancer/colon carcinoma/CRC/gastrointestinal cancer/gastric carcinoma/gastric cancer/stomach cancer/hepatocellular carcinoma/liver cancer/pancreatic carcinoma/pancreatic neoplasms/pancreatic ductal adenocarcinoma/pancreatic mass/digestive system tumor/digestive system neoplasm" AND "survival/prognosis/outcome/hazard ratio/HR" OR "diagnosis/sensitivity/specificity/ROC/AUC/area under the curve". Reference lists of the included articles and relevant reviews were also screened for studies that the database search might have missed.

Inclusion and exclusion criteria
Studies meeting the following criteria were included: (1) clinical trials reporting the diagnostic and/or prognostic features of GDF-15 in DST; (2) studies where the diagnostic parameters or survival outcomes included sensitivity, specificity, area under the curve (AUC), overall survival (OS), disease-free survival (DFS), progression-free survival (PFS), recurrence-free survival (RFS), tumor-specific survival (TSS), or cancer-specific survival (CSS); and (3) studies where the estimated hazard ratios (HRs) or odds ratios (ORs) with corresponding 95% confidence intervals (CIs) were available or could be calculated from published data. Accordingly, the exclusion criteria were: (1) reviews, basic studies, animal studies, letters, or conference abstracts; (2) studies for which data for statistical analyses were unavailable and the authors could not be contacted; (3) studies with a high risk of bias in quality assessment; and (4) articles written in a language other than English.

Data extraction and quality assessment
Data extraction was performed for study sensitivity, specificity, sample numbers, as well as HRs and their corresponding 95% CIs. Where such data were unavailable, the values were calculated indirectly using Engauge Digitizer 4.1 software. Other information included the first author's name, article date, patient ethnicity, specimen type, test method, cut-off value settings, survival points, follow-up time, quantiles of GDF-15, and other relevant clinicopathological characteristics.
The quality of diagnostic studies was judged according to the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) criteria, which are based on a 14-item list [25]. The quality of all retrospective cohort studies was assessed using the Newcastle-Ottawa Scale (NOS) checklist, wherein potential bias due to cohort selection, comparability, and outcome ascertainment is judged on a score ranging from 0 to 9 [26]. Studies were excluded if they were scored as low quality (i.e., a final score of less than 5 for NOS or less than 8 for QUADAS).

Statistical analysis
Statistical analyses were conducted using STATA 12.0 software (Stata Corporation, College Station, TX, USA). The primary outcomes (pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and AUC with corresponding 95% CIs) were obtained in the diagnostic meta-analysis; a pooled HR with 95% CI was calculated to measure the association between GDF-15 expression (high vs. low) and the clinical outcomes of patients with DST. A combined HR > 1 implied that high GDF-15 expression was associated with a worse survival outcome. Heterogeneity of each effect size was assessed using Cochran's Q and I² statistics, with statistical significance defined as P < 0.05 or I² > 50%. Fixed- or random-effects meta-analysis models were selected depending on the degree of study heterogeneity. Influence analysis was undertaken to ascertain the effects of outlier studies on the overall results. Publication bias was examined using Deeks' funnel plot asymmetry test, as well as Egger's and Begg's tests, with statistical significance defined as P < 0.05.
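The pooling and heterogeneity steps described above can be illustrated with a minimal Python sketch. It is not the STATA procedure used in this study: it assumes inverse-variance weighting, a DerSimonian-Laird estimate of the between-study variance for the random-effects model, and model selection by the I² > 50% rule alone (the Q-test P-value is omitted). The function names and the three example studies are hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_hr, se

def pool(log_effects, ses, i2_threshold=50.0):
    """Inverse-variance pooling with Cochran's Q and I^2.

    Assumption: a random-effects (DerSimonian-Laird) model is used when
    I^2 exceeds the threshold; otherwise a fixed-effect model is used.
    """
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        # DerSimonian-Laird between-study variance (tau^2)
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        w = [1.0 / (se ** 2 + tau2) for se in ses]
        model = "random"

    pooled = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    se_pooled = math.sqrt(1.0 / sum(w))
    return {
        "model": model,
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "q": q,
        "i2": i2,
    }

# Hypothetical study-level HRs with 95% CIs (illustrative values only)
studies = [(2.10, 1.50, 2.94), (2.50, 1.80, 3.47), (2.30, 1.70, 3.11)]
effects, errors = zip(*(log_hr_and_se(*s) for s in studies))
result = pool(list(effects), list(errors))
print(result["model"], round(result["hr"], 2), round(result["i2"], 1))
```

Working on the log scale keeps the confidence interval symmetric around the pooled estimate before it is transformed back to the HR scale.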

Results
Search results and study quality
Figure 1 schematically displays the selection procedure for eligible articles. According to our search criteria, a total of 3281 records remained after the elimination of duplicates among databases. Of these, 3217 records were excluded after reading the titles and abstracts because of irrelevant content or non-original data. The remaining 64 studies were assessed by full-text evaluation, and another 47 were excluded. Finally, 17 articles (12 relating to diagnosis, and 9 relating to prognosis) were included in the final meta-analysis.
Study bias, as judged by the 14-item QUADAS list or the NOS checklist, was low: all of the diagnostic and prognostic studies had QUADAS scores of ≥10 or NOS scores of ≥6 (Table 1, Additional file 1 and Additional file 2), indicating that these data were suitable for our final statistical analysis.

Heterogeneity
In the diagnostic meta-analysis, heterogeneity was observed in the overall pooled data, with an estimated I² value of 99.38% (P < 0.001). Heterogeneity was also detected among 6 groups in our collected diagnostic data (Table 3), with I² values ranging from 78.4 to 93.7% (P < 0.0001). Thus, random-effects models were used for these studies. In our pooled data for prognosis, no significant heterogeneity was detected.

Diagnostic meta-analyses
The overall pooled sensitivity, specificity, diagnostic odds ratio (DOR), and area under the curve (AUC) for GDF-15, used to distinguish DST from non-cancerous tumors, were 0.74 (95% CI: 0.68–0.80), 0.83 (95% CI: 0.75–0.89), 14.07 (95% CI: 9.12–21.71), and 0.84, respectively (Fig. 2 and Table 3), corresponding to a positive likelihood ratio (PLR) of 4.38 (95% CI: 3.00–6.39) and a negative likelihood ratio (NLR) of 0.31 (95% CI: 0.25–0.38). These results suggest that GDF-15 level is a useful diagnostic biomarker for DST.
Stratified analyses were performed in the diagnostic meta-analysis based on cancer type, sample type, cut-off setting, and ethnicity. As summarized in Table 3, the pooled AUCs of GDF-15 for pancreatic cancer (PC), esophageal cancer (EC), gastric cancer (GC), and liver cancer were estimated to be 0.82, 0.84, 0.90, and 0.85, respectively. Moreover, GDF-15 had an AUC of 0.82 for its ability to distinguish PC from pancreatitis, which was higher than the AUC for its ability to distinguish PC from healthy individuals (AUC = 0.73). When meta-analyzed based on sample type, serum-based GDF-15 testing achieved a specificity of 0.80 (95% CI: 0.78–0.81) and an AUC of 0.87, both superior to those of plasma-based testing. We found differences in diagnostic efficacy based on cut-off value: a cut-off setting <2000 pg/mL showed an AUC of 0.85 for PC (PC vs. non-cancerous tumors), and 0.87 for all cancers (all cancers vs. non-cancerous tumors). In the meta-analysis based on ethnicity, GDF-15 testing in Caucasian and Asian patients each yielded an AUC of 0.83, although testing in Asian patients showed a higher specificity of 0.81 (95% CI: 0.79–0.83). The raw data used for the diagnostic meta-analysis are provided as Additional file 3.
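The relationship between sensitivity, specificity, the likelihood ratios, and the DOR can be checked with a short Python sketch. It applies the textbook definitions to the pooled point estimates of sensitivity (0.74) and specificity (0.83); the function name is hypothetical. Because the reported PLR, NLR, and DOR were pooled by the meta-analytic model rather than derived from the pooled sensitivity and specificity, the values computed this way are close to, but not identical to, the reported 4.38, 0.31, and 14.07.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) from a sensitivity/specificity pair."""
    # PLR: true-positive rate divided by false-positive rate
    plr = sensitivity / (1.0 - specificity)
    # NLR: false-negative rate divided by true-negative rate
    nlr = (1.0 - sensitivity) / specificity
    # DOR: odds of a positive test in cases relative to controls
    dor = plr / nlr
    return plr, nlr, dor

plr, nlr, dor = diagnostic_ratios(0.74, 0.83)
print(f"PLR = {plr:.2f}, NLR = {nlr:.2f}, DOR = {dor:.2f}")
```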

Influence analysis and meta-regression
Influence analysis was conducted for both diagnostic and prognostic meta-analyses using STATA 12.0 software. One individual study [16] was identified as an outlier in the overall pooled diagnostic dataset for DST (Fig. 4a) and PC (Fig. 4b). However, no outlier studies were found at the upper or lower CI limit of the prognostic studies, indicating that the selected studies had relatively high homogeneity (Fig. 4c, d, and e). Meta-regression was performed to trace the causes of heterogeneity, with seven predefined covariates: ethnicity, sample size, control size, cancer type, test matrix, cut-off setting, and QUADAS score. As displayed in Additional file 5, QUADAS score yielded the lowest P-value (0.0349) among the covariates, suggesting that QUADAS score is the likely source of heterogeneity among diagnostic studies.

Publication bias
Publication bias analysis, assessed by Deeks' funnel plot asymmetry test, demonstrated no clear bias in the overall diagnostic meta-analyses of DST and PC (Fig. 5a and b; n = 22 and 12, P = 0.375 and 0.479, respectively). Additionally, no significant publication bias, as assessed using Egger's and Begg's tests, was detected in the meta-analyzed prognostic data (all with P > 0.05) (Fig. 5c, d and e).

Discussion
As expected, GDF-15 performed well as a diagnostic biomarker in DST: the pooled sensitivity, specificity, and AUC for the discriminative performance of GDF-15 were 0.74, 0.83, and 0.84, respectively. Although the combined sensitivity was only moderate, the specificity and AUC were relatively high, indicating acceptable diagnostic performance for GDF-15. The diagnostic odds ratio (DOR) is another measure of diagnostic effectiveness, with a value higher than 1.0 representing diagnostic validity [30]. Herein, we obtained a DOR of 14.07, further suggesting that GDF-15 testing can be used to diagnose DST. The pooled PLR of 4.38, which is the ratio of the true-positive rate to the false-positive rate, also indicated that a positive GDF-15 result was about four times more likely in patients with DST than in controls.
Several groups have demonstrated that GDF-15 may be used as a biomarker to assist in the detection of PC, EC, GC, and HCC [12, 13, 15–19, 21–24]. In our stratified analysis, four cancer types had been evaluated in multiple studies: the pooled AUCs of GDF-15 for PC, EC, GC, and HCC were estimated to be 0.82, 0.84, 0.90, and 0.85, respectively, showing that GDF-15 testing performed best in GC. In PC, GDF-15 testing had an AUC of 0.82 for its ability to differentiate PC from pancreatitis, which was higher than the AUC for distinguishing PC from healthy individuals (0.73). These data indicate that GDF-15 may also be a useful indicator for the differential diagnosis of PC and pancreatitis. Additionally, we observed matrix effects on test performance: serum-based GDF-15 testing yielded a better AUC than plasma-based testing, suggesting that serum samples may be more suitable than plasma samples for GDF-15 testing. We also found differences in diagnostic efficacy based on cut-off value: a cut-off setting of less than 2000 pg/mL exhibited better performance for all cancer types. Lastly, for data stratified by ethnicity, we found equal diagnostic efficacy of GDF-15 testing between Caucasians and Asians. However, these findings need to be confirmed with additional data.
Fig. 3 Forest plots of pooled HRs (95% CI) for GDF-15 levels in the prognostic datasets. a Pooled HR (95% CI) of OS data for DST; b pooled HR (95% CI) of OS data for CRC; c pooled HR (95% CI) of CSS/TSS data for CRC
We found that an increased level of GDF-15 was an independent prognostic marker for DST [8–11, 13–15, 21, 22]. Previously, whether GDF-15 could serve as a prognostic marker for OS, DFS, or RFS in cancer was considered controversial. In our prognostic analysis, 2106 patients with complete follow-up data were included. A clear association between increased GDF-15 levels and shorter OS was observed in patients with DST (HR = 2.34), as well as in those with CRC (HR = 2.27). We also included 11 individual studies that measured CSS or TSS in CRC, with results that showed a correlation between GDF-15 expression and poor CSS and TSS (HR = 2.33). These data suggest that GDF-15 could be used as an independent prognostic biomarker in DST. Previous studies have hypothesized that GDF-15 could be used to assist the prediction of cancer recurrence and metastasis in CRC [31, 32]. However, the data available on CRC recurrence and metastasis were insufficient and were therefore not analyzed.
Study heterogeneity and bias are very common in meta-analyses [33]. We observed significant heterogeneity in our diagnostic meta-analyses and therefore attempted to identify its cause. First, the included studies covered varying patient populations. Second, patients participating in these studies had different types of cancer and received a wide range of treatments. Moreover, the primary method of detecting GDF-15 expression (ELISA) used a different cut-off value in each study; in particular, the cut-off points were markedly higher in gastric and liver cancers than in other malignancies. Whether these differences in cut-off points reflect cancer type or the limited number of studies warrants further investigation. Collectively, these factors may have produced non-homogeneous conditions. We therefore conducted sensitivity (influence) analysis and meta-regression. Our sensitivity analysis identified one outlier study, and the degree of heterogeneity decreased after the outlier data were excluded from the analysis. The univariate meta-regression showed that, among the covariates tested, only study quality (QUADAS score) appeared to be a source of heterogeneity.
This study has several limitations. First, sample sizes were small for some cancer types, and few articles were available. Second, significant heterogeneity was observed in the diagnostic meta-analysis, compromising the overall study accuracy. Lastly, GDF-15 expression was detected primarily by ELISA, which might not be the optimal method for detecting GDF-15.