PODXL might be a new prognostic biomarker in various cancers: a meta-analysis and sequential verification with TCGA datasets

ABSTRACT Background Several studies have investigated the associations between the expression level or location of podocalyxin-like protein (PODXL) and cancer survival, but the results are far from conclusive. We therefore conducted a meta-analysis of PODXL in various human cancers to assess its prognostic value, followed by confirmation using the TCGA datasets. Methods We performed a systematic search, and 18 citations including 5705 patients were pooled in the meta-analysis. The results were verified with TCGA datasets. Results The eligible studies comprised 5705 patients with 10 types of cancer. High expression or membrane expression of PODXL was significantly related to poor overall survival (OS). However, subgroup analysis showed a significant association between high PODXL expression and poor OS only in colorectal cancer, pancreatic cancer, urothelial bladder cancer, renal cell carcinoma and glioblastoma multiforme. We then validated this inference using TCGA datasets, and consistent results were demonstrated in patients with pancreatic cancer, glioblastoma multiforme, gastric cancer, esophageal cancer and lung adenocarcinoma. Conclusion The meta-analysis showed that high PODXL expression was significantly linked with poor OS in pancreatic cancer and glioblastoma multiforme, but not in gastric cancer, esophageal cancer or lung adenocarcinoma. Membrane expression of PODXL might also be associated with poor OS. PODXL may act as a tumor promoter and may serve as a potential target for antitumor therapy.

In addition, the prognostic role of PODXL protein expression was analyzed in a systematic review and meta-analysis in 2017 [32]. As new studies have since emerged, we performed a new meta-analysis of pooled data to estimate the potential prognostic value of PODXL in more depth. We explored the relationship between the expression level or site of PODXL and the prognosis of multiple cancers, and added validation with The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov) datasets for further analysis.

Publication search
Our meta-analysis followed the guidance of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [33]. We performed a systematic search of the PubMed, Web of Science, Embase and Cochrane Library databases from January 1, 2000 to October 31, 2018, using both MeSH terms and full-text keywords. Our search terms were: ("cancer" OR "tumor" OR "neoplasm" OR "carcinoma") AND ("Podocalyxin like protein" OR "Podocalyxin" OR "PODXL") AND ("prognosis" OR "prognostic" OR "outcome"). Additionally, the reference lists and other related studies were reviewed to identify further potential articles.

Inclusion and exclusion criteria
Eligible articles were selected by two authors (Siying He and Menglan Li). The inclusion criteria were as follows: (1) the study examined the correlation between PODXL expression and survival data of cancer patients; (2) it provided the relevant clinicopathological parameters; (3) it included more than 50 patients.
The exclusion criteria were as follows: (1) studies not based on humans; (2) insufficient hazard ratios (HRs) or other data; (3) patients duplicated in other studies; (4) reviews, case reports or meta-analyses.

Data collection and quality detection
Two researchers independently evaluated and extracted data from the eligible articles according to a predefined standard. The following information was recorded: (1) first author's name; (2) publication year; (3) country; (4) type of cancer; (5) number of patients; (6) detection method; (7) cut-off criteria; (8) clinical parameters; (9) data on overall survival (OS), disease-free survival (DFS) or cancer-specific survival (CSS). The Engauge 4.1 software was used to extract data from Kaplan-Meier (K-M) plots when HRs and their 95% confidence intervals (CIs) were not reported directly [34]. In addition, the included studies were evaluated with the Newcastle-Ottawa Scale (NOS) [35].

Data collection and analysis in TCGA
Data on PODXL expression and clinicopathological parameters in TCGA were obtained from the Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn) [36] and UALCAN (http://ualcan.path.uab.edu) [37]. There were 31 types of cancer, comprising 9040 subjects with both PODXL expression and cancer survival data. To perform the K-M survival analysis and generate overall survival plots, PODXL expression levels were divided into a low/median and a high expression group according to the TPM value. The difference between the two groups was assessed with the log-rank test.
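As an illustration of the grouping and comparison described above, the following Python sketch splits a cohort by TPM and applies a two-group log-rank test. It is a minimal sketch under stated assumptions, not the GEPIA or UALCAN implementation: the function names `split_by_median` and `logrank_test` and all toy values are hypothetical, samples strictly above the median TPM are treated as the high group, and the P-value uses the chi-square approximation with one degree of freedom.

```python
import math
from statistics import median


def split_by_median(tpm):
    """Return True for samples whose TPM is strictly above the cohort median."""
    cutoff = median(tpm)
    return [value > cutoff for value in tpm]


def logrank_test(times, events, high):
    """Two-group log-rank test.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    high   : True if the patient belongs to the high-expression group
    Returns (chi-square statistic, P-value).
    """
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Patients still at risk just before time t (overall and high group).
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, h in zip(times, high) if ti >= t and h)
        # Deaths occurring exactly at time t (overall and high group).
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, h in zip(times, events, high)
                 if ti == t and e and h)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value


# Hypothetical toy cohort: TPM values, follow-up in months, death indicator.
tpm = [30.1, 28.4, 25.0, 22.7, 21.3, 19.8, 9.5, 8.1, 7.7, 6.2, 5.0, 4.4]
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
died = [1] * 12

groups = split_by_median(tpm)
chi2, p = logrank_test(months, died, groups)
print(f"log-rank chi-square = {chi2:.2f}, P = {p:.4f}")
```

In this toy cohort every high-expression patient dies before any low-expression patient, so the test returns a small P-value; real TCGA cohorts contain censored observations, which the `events` argument accounts for.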

Mechanism prediction of PODXL
We used the STRING database (http://string-db.org/) [38], a widely used online tool, to find PODXL-related genes and to provide a critical assessment and integration of protein-protein interactions (PPI) between PODXL and PODXL-related genes. Functional enrichment analysis of these PODXL-related genes was performed with the DAVID database (http://david.abcc.ncifcrf.gov/), a common bioinformatics database for annotation, visualization and integrated discovery [39].

Statistical analysis
Our meta-analysis was performed with Stata 12.0 software (Stata Corporation, College Station, TX, United States). The prognostic value of PODXL for OS, DFS and CSS was calculated as pooled HRs with 95% CIs, whereas odds ratios (ORs) with corresponding 95% CIs were used to assess the relation between PODXL and clinicopathological features. The chi-square-based Cochran Q test and the I² test were used to determine the heterogeneity among the eligible articles. I² > 50% or P-value < 0.05 was considered to indicate significant heterogeneity, in which case a random-effect model was adopted; otherwise, a fixed-effect model was chosen. The effect of covariates was evaluated with regression analysis, and the sources of heterogeneity were dissected with subgroup analysis. In addition, sensitivity analysis and publication bias assessment were performed. A two-sided P < 0.05 was considered statistically significant.
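The pooling and model-selection rule described above can be sketched as follows. This is a minimal illustration rather than the Stata routine used in the study: the source does not name the estimators, so inverse-variance weighting for the fixed-effect model and the DerSimonian-Laird estimator for the random-effect model are assumptions, as are the function names `chi2_sf` and `pool_hazard_ratios` and the example values. Standard errors are recovered from the reported 95% CIs on the log scale.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        a, sf = 1.0, math.exp(-half)                  # closed form for df = 2
    else:
        a, sf = 0.5, math.erfc(math.sqrt(half))       # closed form for df = 1
    while a < df / 2:                                 # step df up by 2 each pass
        sf += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return sf


def pool_hazard_ratios(studies, z_crit=1.96):
    """Pool (HR, CI lower, CI upper) tuples; choose the model by I2 and Q."""
    if len(studies) < 2:
        raise ValueError("at least two studies are required")
    y = [math.log(hr) for hr, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * z_crit) for _, lo, hi in studies]
    w = [1 / s ** 2 for s in se]

    # Fixed-effect (inverse-variance) estimate and Cochran Q.
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Rule stated in the text: I2 > 50% or P < 0.05 -> random-effect model.
    use_random = i2 > 50 or p_het < 0.05
    if use_random:
        w = [1 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    pooled_se = math.sqrt(1 / sum(w))
    return {
        "model": "random" if use_random else "fixed",
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z_crit * pooled_se),
        "ci_high": math.exp(pooled + z_crit * pooled_se),
        "q": q, "p_het": p_het, "i2": i2, "tau2": tau2,
    }


# Hypothetical study-level inputs (HR, lower 95% CI, upper 95% CI).
example = [(1.8, 1.1, 2.9), (3.5, 2.0, 6.1), (1.2, 0.8, 1.8), (2.6, 1.5, 4.5)]
result = pool_hazard_ratios(example)
print(result["model"], round(result["hr"], 2), round(result["i2"], 1))
```

Working on the log scale keeps the confidence intervals symmetric, which is why the standard error can be recovered from the interval width divided by 2 × 1.96.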

Search results and research characteristics
In total, 436 records were identified and 87 duplicates were excluded. After screening of the titles and abstracts, 39 articles remained. Of these, 7 were excluded because they were not based on humans, 9 for insufficient HRs or other data, 3 because the included patients were duplicated in other studies, and 1 because it was a meta-analysis; the flow diagram is shown in Fig.1. Finally, 18 eligible studies were included in this meta-analysis [1, 5, 13-21, 23-27, 30, 31]. These studies contained 5705 patients and involved 10 types of cancer: breast cancer (n = 2), renal cell carcinoma (n = 1), colorectal cancer (n = 4), ovarian cancer (n = 1), glioblastoma multiforme (n = 1), urothelial bladder cancer (n = 2), pancreatic adenocarcinoma (n = 4), esophageal cancer (n = 1), gastric cancer (n = 3) and lung adenocarcinoma (n = 1). In these studies, PODXL expression levels were evaluated by immunohistochemistry (IHC). The characteristics of the eligible articles are listed in Table 1.

Meta-analysis of PODXL expression levels and locations on OS/DFS/CSS
A total of 11 eligible studies, comprising 13 cohorts and 2272 patients, were used to evaluate the association between PODXL expression level and OS. The pooled HR and 95% CI indicated that high PODXL expression was significantly related to poor OS in patients with various cancers (HR = 2.33, 95% CI = 1.76-3.09, P < 0.0001), with significant heterogeneity across these studies (I² = 63.4%, P = 0.001) (Fig.2a). In addition, 6 studies examined the relationship between PODXL expression level and DFS, and 8 studies investigated the association between PODXL expression level and CSS. Heterogeneity tests indicated that both DFS (I² = 73.4%, P = 0.002) and CSS (I² = 70.0%, P = 0.002) should be analyzed using the random-effect model. The results indicated that high PODXL expression was associated with shorter DFS (HR = 1.76, 95% CI = 1.20-2.58, P = 0.004) and shorter CSS (HR = 2.84, 95% CI = 1.85-4.38, P < 0.0001) (Fig.2b-c). On the other hand, among the 18 eligible papers, 5 studies addressed the expression location of PODXL and cancer prognosis, and only 2 studies, comprising 4 cohorts, showed an association between membrane-expressed PODXL and poor OS (HR = 2.98, 95% CI = 1.29-6.90, P = 0.011), again under the random-effect model (I² = 84.7%, P < 0.0001) (Fig.2d).
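The relation between a pooled HR, its 95% CI and the reported P-value can be checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale and a two-sided normal test; the function name `p_from_hr` is hypothetical, and the inputs are the pooled estimates quoted above.

```python
import math


def p_from_hr(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the z statistic and two-sided P-value from an HR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Pooled estimates as reported in the text.
reported = {
    "OS": (2.33, 1.76, 3.09),
    "DFS": (1.76, 1.20, 2.58),
    "CSS": (2.84, 1.85, 4.38),
    "membrane OS": (2.98, 1.29, 6.90),
}
for name, (hr, low, high) in reported.items():
    z, p = p_from_hr(hr, low, high)
    print(f"{name}: z = {z:.2f}, P = {p:.4f}")
```

Under these assumptions the recomputed values agree with the reported ones: the P-values for DFS and membrane expression round to 0.004 and 0.011, and those for OS and CSS fall below 0.0001.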

Subgroup analysis for OS
To find the source of heterogeneity, subgroup analysis of OS was performed, and all 2272 patients were classified by cancer type, analysis type, antibody type, ethnicity and sample size [...]. Regarding the analysis type, high PODXL expression was also significantly associated with much shorter OS when the studies were assessed with K-M curves. In the subgroups based on ethnicity, antibody type and sample size, high PODXL expression was likewise related to poor OS, except in patients from Asia and in studies with a sample size ≥150.

PODXL overexpression and relative clinical parameters
To obtain more clinical value from PODXL, we investigated the associations between PODXL expression levels and clinical parameters in several cancers (Table 3). The expression level of PODXL was related to TNM stage (HR = 1.63, 95% CI = 1.19-2.23, P = 0.002, fixed-effects), tumor grade (HR = 4.29, 95% CI = 1.84-9.99, P = 0.001, random-effects), differentiation (HR = 2.84, 95% CI = 1.82-4.42, P < 0.0001, fixed-effects), distant metastasis (HR = 5.46, 95% CI = 2.55-11.66, P < 0.0001, fixed-effects), lymph [...]. These correlations indicated that high PODXL expression was associated with advanced biological behavior in various cancers. No covariate analyzed in this study had a statistically significant effect on the degree of tumor malignancy or survival.

Sensitivity analysis and publication bias
We performed sensitivity analysis to determine whether any individual study could affect the overall result. For the associations between PODXL expression and OS and CSS, no single study influenced the result of the meta-analysis (Fig.3). Funnel plots and Begg's test showed no publication bias in the studies on associations between PODXL overexpression and OS (P = 0.502), DFS (P = 0.133) and CSS (P = 0.266). Likewise, no publication bias was found in our meta-analysis of the association between PODXL membrane expression and OS (P = 1.000) (Fig.4).
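For orientation, Begg's test can be sketched as a rank correlation between standardized effect sizes and their variances. The version below is a simplified illustration, not the Stata routine used in the study: it assumes no tied values and applies no continuity correction, so its P-values need not match those reported above. The function name `begg_test` and the input values are hypothetical.

```python
import math


def begg_test(log_effects, variances):
    """Rank-correlation test for funnel-plot asymmetry (Begg and Mazumdar).

    log_effects : log hazard ratio per study
    variances   : variance of each log hazard ratio
    Returns (Kendall S, z statistic, two-sided P-value).
    """
    n = len(log_effects)
    if n < 2:
        raise ValueError("at least two studies are required")
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    pooled_var = 1 / sum(weights)
    # Standardized deviation of each study from the fixed-effect estimate.
    deviates = [(y - pooled) / math.sqrt(v - pooled_var)
                for y, v in zip(log_effects, variances)]
    # Kendall's S: concordant minus discordant pairs.
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (deviates[i] - deviates[j]) * (variances[i] - variances[j])
            s += (product > 0) - (product < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance of S without ties
    z = s / math.sqrt(var_s)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value


# Hypothetical inputs in which less precise studies report larger effects.
log_hr = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
var = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
s, z, p = begg_test(log_hr, var)
print(f"S = {s}, z = {z:.2f}, P = {p:.4f}")
```

A small P-value suggests that effect size and precision are correlated, which is the asymmetry a funnel plot shows visually; with few studies, as for the membrane expression analysis, the test has little power.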

The expression data of PODXL extracted from TCGA datasets
The differences in PODXL expression level between various tumor tissues and the corresponding normal tissues were obtained with GEPIA, a widely used web-based tool that provides quick and customizable analyses based on TCGA and GTEx data [36]. PODXL was detected in 23 types of cancer, and its expression was significantly higher than in the corresponding normal tissues in 9 types of cancer, including esophageal cancer, glioblastoma multiforme, acute myeloid leukemia, liver hepatocellular carcinoma, ovarian serous cystadenocarcinoma, pancreatic [...] (Table 4).

Validation of prognostic correlation by TCGA datasets
To validate the prognostic value of PODXL, we explored TCGA datasets using UALCAN, an interactive online tool for analyzing gene expression data in TCGA [37]. Among the 31 types of cancer (9040 patients), a significant association between high PODXL expression and poor OS was found in 3 types of cancer: glioblastoma multiforme, kidney renal papillary cell carcinoma and pancreatic adenocarcinoma (Table 5). In contrast, opposite results were found in kidney renal clear cell carcinoma and uterine corpus endometrial carcinoma, which showed a significant correlation between low PODXL expression and poor OS (Fig.5). The same results were also verified with KM Plotter, whose data sources are not completely consistent with TCGA datasets (Supplementary Fig. 1, SF.1).
Combining our meta-analysis with the TCGA validation, consistent results on the correlation between PODXL expression level and prognosis were identified in glioblastoma multiforme, pancreatic adenocarcinoma, esophageal cancer, gastric cancer and lung adenocarcinoma.

PPI network construction and functional enrichment analysis
The PPI network of PODXL-related genes was obtained using STRING and included 11 nodes and 23 edges (Fig.6a). The PODXL-related genes were then collected for functional enrichment analysis (Fig.6b). The top GO terms, covering biological processes, cellular components and molecular functions, were selected by significance. The PODXL-related genes were significantly enriched in cell development and differentiation and played a significant role in cell-cell adhesion. These significant GO terms are consistent with processes in the pathogenesis of cancers, such as decreased intercellular adhesion, epithelial-mesenchymal transition (EMT), cell migration and invasion.

Discussion
Recently, increasing evidence has suggested that PODXL is involved in multiple steps of tumor development, such as cell adhesion and morphology [40], lymphatic metastasis [41], tumor cell motility and invasiveness [26], tumor angiogenesis [42] and prognosis. Recent studies indicated that the expression level and location of PODXL could be a new biomarker for assessing the prognosis of various types of cancer. However, a single study is limited by insufficient data and a single experimental model, so a meta-analysis of pooled studies is necessary to explore the potential clinical value of PODXL. The published studies covered 10 types of cancer and 5705 patients. Our meta-analysis not only indicated that high PODXL expression was associated with poor OS, DFS or CSS in patients with cancer, but also showed that membrane expression was correlated with poor OS. Analysis of clinicopathological features showed that PODXL overexpression was linked with poor stage and differentiation and with high incidences of metastasis and invasion, which indicated that there might be a significant association between PODXL expression level and advanced features of cancer. Subgroup analysis showed that the association between PODXL overexpression and poor OS was significant only in glioblastoma multiforme, pancreatic cancer, renal cell carcinoma, colorectal cancer and urothelial bladder cancer, but not in esophageal cancer, gastric cancer or lung adenocarcinoma. We then used GEPIA and UALCAN to explore TCGA datasets, comparing PODXL expression between tumor tissues and the corresponding normal tissues and examining the survival curves. Consistent results between the meta-analysis and the TCGA validation were found in 5 types of cancer.
Besides the TCGA datasets, Oncomine was used to further verify the differences in PODXL expression level between various tumor tissues and the corresponding normal tissues, and KM Plotter was used to validate the prognostic value of PODXL. The results from these databases also supported the findings from the TCGA datasets.
The prognostic value of PODXL was indicated by a meta-analysis in 2017 [32], and the conclusion put forward by Wang et al. was broadly consistent with our results. We revisited and gathered the relevant research for another meta-analysis in order to further explore its clinical significance. Compared with the 2017 meta-analysis, our study contained more studies and patients, which reinforced the conclusion. In addition, both the expression level and the site of PODXL were found to be associated with the prognosis of various cancers. The results of the meta-analysis were also screened by validation with TCGA datasets, which makes our conclusion more convincing.
Among the 18 eligible studies, only 2, containing 4 cohorts, addressed the expression location of PODXL and cancer prognosis. The studies showed a significant association between [...]. On the premise of an appropriate number of included studies, samples that might introduce heterogeneity were removed, but the sensitivity remained high, so this result can only be regarded as a descriptive hypothesis and needs more included studies. PODXL is a transmembrane glycoprotein whose high expression and membrane expression increase cell motility, and over-activated tumor cell migration promotes tumor progression. Combined with the existing results, the expression site of PODXL is a promising marker for predicting the prognosis of cancers. Although PODXL has been found to be highly expressed in various malignancies and related to a more aggressive phenotype and poor prognosis, the exact role that PODXL plays in tumorigenesis remains unclear [43]. The functional enrichment analysis of PODXL-related genes showed that PODXL was a key gene in cell development and differentiation and played an important role in cell-cell adhesion. Some recent studies showed that PODXL promoted the gelsolin-actin interaction in cell protrusions to enhance motility and invasiveness [26], and others showed that the PODXL-ezrin signaling axis could rearrange the dynamic cytoskeleton for transendothelial migration [44]. From these reports, it can be deduced that highly expressed PODXL promotes tumor progression by enhancing a series of cell changes such as EMT, cell migration and invasion. In addition, the finding that membrane-expressed PODXL was associated with poor survival further supports the deduction that PODXL promotes tumor progression by enhancing the motility and invasiveness of tumor cells.
PODXL also takes part in the NF-κB, PI3K/AKT, Hippo and MAPK/ERK signaling pathways, and facilitates tumor progression by increasing cell proliferation, migration and invasion as well as suppressing apoptosis [21,45,46].
PODXL is expected to be a novel therapeutic and monitoring biomarker in certain cancers, because high PODXL expression might be a potential indicator of poor prognosis. Overexpressed PODXL could be detected in peripheral blood and used as a noninvasive diagnostic biomarker for the detection of pancreatic cancer [47]. ATF3 could activate PODXL transcription, which suggested that the ATF3 pathway might be beneficial for anticancer therapy [48]. High expression of miR-509-3-5p and miR-5100 inhibited the invasion and metastasis of gastric and pancreatic cancers by directly targeting PODXL, with these miRNAs functioning as tumor suppressors [27,41]. A core fucose-deficient monoclonal antibody (mAb) against PODXL might be a new antibody-based therapy for oral squamous cell carcinoma with high PODXL expression [49]. In addition, patients with gastric or esophageal adenocarcinoma would have a much better prognosis after treatment with neoadjuvant ± adjuvant fluoropyrimidine- and oxaliplatin-based chemotherapy if the expression level of PODXL is high [50].
However, there are still some limitations. First, many unavoidable factors, such as different types of cancer, analysis methods, ethnicities and sample sizes, might contribute to the heterogeneity. Second, we extracted HRs and 95% CIs from the K-M plots when they could not be obtained from the paper directly, and this process might decrease the accuracy of the results. Third, the sensitivity analysis only showed that no individual study influenced the association between high PODXL expression and poor OS or CSS; the association between membrane-expressed PODXL and poor OS can therefore only be regarded as a descriptive hypothesis, which might be due to the insufficient number of studies or the small sample size. Fourth, our meta-analysis seemed to have no publication bias, but because negative results are rarely published, more studies are needed to verify our results.

Conclusion
PODXL is a significant clinical indicator for tumor prognosis and detection: its expression level and location in tumor tissues, and even its serum concentration, could be significantly associated with tumor progression [47]. Our meta-analysis showed that PODXL plays a significant role in cancer progression, and that highly expressed PODXL could be linked to an aggressive biological phenotype and poor prognosis. Specifically, high PODXL expression was significantly correlated with poor prognosis in glioblastoma multiforme and pancreatic cancer, but not in esophageal adenocarcinoma, gastric cancer or lung adenocarcinoma.