Untargeted serum metabolomics reveals potential biomarkers and metabolic pathways associated with the progression of gastroesophageal cancer

Background: Previous metabolic studies of upper digestive cancers have mostly been limited to cross-sectional designs, which hinders effective prediction of outcomes in the early stage of cancer. This study aims to identify key metabolites and metabolic pathways associated with the multistage progression of epithelial cancer and to explore their predictive value for gastroesophageal cancer (GEC) formation and for the early screening of esophageal squamous cell carcinoma (ESCC).
Methods: A case-cohort study within the 7-year prospective Esophageal Cancer Screening Cohort of Shandong Province included 77 GEC cases and 77 sub-cohort individuals. Untargeted metabolomic analysis was performed on serum samples. Metabolites with an FDR q value < 0.05 and a variable importance in projection (VIP) > 1 were selected as differential metabolites and used to predict GEC formation with Random Forest (RF) models. Subsequently, we evaluated the predictive performance of these differential metabolites for the early screening of ESCC.
Results: We found a distinct alteration of the metabolic profile in GEC cases compared with the sub-cohort, and identified eight differential metabolites. Pathway analyses showed dysregulation of D-glutamine and D-glutamate metabolism, nitrogen metabolism, primary bile acid biosynthesis, and steroid hormone biosynthesis in GEC patients. A panel of the eight differential metabolites showed good predictive performance for GEC formation, with an area under the receiver operating characteristic curve (AUC) of 0.893 (95% CI = 0.816–0.951). Furthermore, four of the metabolites related to GEC pathological progression were validated in the early screening of ESCC, with an AUC of 0.761 (95% CI = 0.716–0.805).
Conclusions: These findings indicate that a panel of metabolites might offer an alternative approach to predicting GEC formation and might therefore help mitigate the risk of cancer progression at the early stage of GEC.
Supplementary Information The online version contains supplementary material available at 10.1186/s12885-023-11744-y.
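The selection rule stated in the Methods (FDR q value < 0.05 and VIP > 1) can be illustrated with a minimal sketch. It assumes the q values come from the Benjamini-Hochberg procedure, which the abstract does not specify, and it takes VIP scores as given rather than deriving them from a fitted model. The function names, metabolite names, and numbers are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def select_differential(names, p_values, vip_scores, q_cut=0.05, vip_cut=1.0):
    """Keep metabolites that pass both the q-value and the VIP threshold."""
    q_values = benjamini_hochberg(p_values)
    return [
        name
        for name, q, vip in zip(names, q_values, vip_scores)
        if q < q_cut and vip > vip_cut
    ]


# Hypothetical toy input, not values from the study.
names = ["met_A", "met_B", "met_C", "met_D"]
p_values = [0.001, 0.012, 0.030, 0.700]
vip_scores = [1.8, 0.6, 1.2, 2.1]

selected = select_differential(names, p_values, vip_scores)
print(selected)  # ['met_A', 'met_C']
```

In this toy input, met_B passes the q-value threshold but fails on VIP, and met_D has a high VIP but fails on q value, so both thresholds are needed to reproduce the rule.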


Additional file
Table S1. Baseline characteristics of the discovery and validation datasets in the screening study.
Table S2. Detailed information on metabolites that were significantly altered between GEC cases and sub-cohort individuals in the case-cohort study.
Table S3. Detailed information on differential metabolites identified in the subgroup case-cohort study (TIS).
Table S4. Detailed information on differential metabolites identified in the subgroup case-cohort study (ESCC).
Table S5. Detailed information on differential metabolites identified in the subgroup case-cohort study (GC).
Table S6. ROC analysis of the prediction model to discriminate GEC.
Table S7. Detailed information on metabolites that were significantly altered between the ESCC screening-positive and ESCC screening-negative groups in the screening study.
Table S8. ROC analysis of the prediction model to predict ESCC screening-positive subjects.
Table S9. NRI and IDI values for RF models composed of clinical markers and selected metabolites.
Table S10. Pathway enrichment analysis of differential metabolites in the case-cohort study and in the screening study.
Table S11. Pathway enrichment analysis of differential metabolites in the subgroup case-cohort study.
Fold change was calculated as the ratio of the mean values of the screening-positive group to the screening-negative group.
The criteria for differential metabolites were P value < 0.05 and VIP > 1.
I: Common differential metabolites in the case-cohort study and the subgroup case-cohort study (TIS).
Fold change was calculated as the ratio of the mean values of the screening-positive group to the screening-negative group.
The criteria for differential metabolites were FDR q value < 0.05 and VIP > 1.
I: Common differential metabolites in the case-cohort study and the subgroup case-cohort study (ESCC).
Fold change was calculated as the ratio of the mean values of the screening-positive group to the screening-negative group.
The criteria for differential metabolites were FDR q value < 0.05 and VIP > 1.
I: Common differential metabolites in the case-cohort study and the subgroup case-cohort study (GC).
Area under the curve, sensitivity and specificity of the models were assessed using leave-one-out cross validation (LOOCV).
GC = Gastric cancer, m/z = mass-to-charge ratio, RT = Retention time, FDR = P value adjusted using False Discovery Rate, FC = Fold change, VIP = Variable importance in projection.
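The leave-one-out assessment mentioned in the notes above can be sketched as follows: each subject is scored by a model fitted on all remaining subjects, and the pooled held-out scores yield a single AUC. To stay dependency-free, this sketch swaps the study's Random Forest for a simple nearest-centroid scorer, so it illustrates the validation scheme only and not the authors' model. All names and data are hypothetical.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: chance that a random case outscores a random control."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loocv_scores(X, y):
    """Score every sample with centroids fitted on all other samples."""
    scores = []
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        case_c = centroid([x for x, lab in train if lab == 1])
        ctrl_c = centroid([x for x, lab in train if lab == 0])
        # Higher score = closer to the case centroid than to the control one.
        scores.append(sq_dist(X[i], ctrl_c) - sq_dist(X[i], case_c))
    return scores


# Hypothetical, cleanly separated toy intensities (two metabolites per subject).
X = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0], [1.0, 0.9], [0.8, 1.1], [1.1, 1.0]]
y = [1, 1, 1, 0, 0, 0]

held_out_scores = loocv_scores(X, y)
print(auc_from_scores(y, held_out_scores))  # 1.0 for this separable toy set
```

Pooling the held-out scores before computing the AUC is one common way to combine LOOCV with ROC analysis; whether the study pooled scores in this way is an assumption.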

Table S1.
Baseline characteristics of the discovery and validation datasets in the screening study.

Table S2.
Detailed information on metabolites that were significantly altered between GEC cases and sub-cohort individuals in the case-cohort study.
m/z = mass-to-charge ratio, RT = Retention time, FC = Fold change, VIP = Variable importance in projection. Fold change was calculated as the ratio of the mean values of the case group to the sub-cohort group.
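As a worked illustration of the fold-change definition in the Table S2 note, the short sketch below divides the mean intensity in the case group by the mean intensity in the sub-cohort group. The log2 transform is a common reporting convention added here as an assumption. The function name and intensity values are hypothetical.

```python
from math import log2


def fold_change(case_values, subcohort_values):
    """Ratio of the case-group mean to the sub-cohort-group mean."""
    case_mean = sum(case_values) / len(case_values)
    subcohort_mean = sum(subcohort_values) / len(subcohort_values)
    return case_mean / subcohort_mean


# Hypothetical intensities for one metabolite, not values from the study.
case_intensity = [12.0, 15.0, 9.0]      # mean 12.0
subcohort_intensity = [6.0, 5.0, 7.0]   # mean 6.0

fc = fold_change(case_intensity, subcohort_intensity)
print(fc)        # 2.0
print(log2(fc))  # 1.0
```

An FC above 1 means the metabolite is on average higher in cases than in the sub-cohort, and an FC below 1 means it is lower.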

Table S3.
Detailed information on 14 differential metabolites identified in the subgroup case-cohort study (TIS).
TIS = Tumor in situ, m/z = mass-to-charge ratio, RT = Retention time, FDR = P value adjusted using False Discovery Rate, FC = Fold change, VIP = Variable importance in projection.

Table S4.
Detailed information on 7 differential metabolites identified in the subgroup case-cohort study (ESCC).
ESCC = Esophageal squamous cell carcinoma, m/z = mass-to-charge ratio, RT = Retention time, FDR = P value adjusted using False Discovery Rate, FC = Fold change, VIP = Variable importance in projection.

Table S5.
Detailed information on 4 differential metabolites identified in the subgroup case-cohort study (GC).

Table S6.
ROC analysis of the prediction model to discriminate GEC.

Table S7.
Detailed information on 17 metabolites that were significantly altered between the ESCC screening-positive and ESCC screening-negative groups in the screening study.

Table S10.
Pathway enrichment analysis of differential metabolites in the case-cohort study and in the screening study. Hits means the number of differential metabolites enriched in the specific pathway.

Table S11.
Pathway enrichment analysis of differential metabolites in the subgroup case-cohort study. Total means the total number of metabolites involved in the specific pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) dataset; Expected means the expected enrichment value of KEGG metabolites in the specific pathway; Hits means the number of differential metabolites enriched in the specific pathway.
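To make the Total, Expected and Hits columns concrete, the sketch below follows one common over-representation convention. Expected is taken as Total scaled by the fraction of the reference metabolite set that appears in the query list, and significance is taken from the upper tail of a hypergeometric distribution. Whether the study used exactly this test and this reference set is an assumption. The function names and numbers are hypothetical.

```python
from math import comb


def expected_hits(pathway_total, query_size, universe_size):
    """Hits expected by chance if query metabolites were drawn at random."""
    return pathway_total * query_size / universe_size


def hypergeom_p_value(hits, pathway_total, query_size, universe_size):
    """P(X >= hits) when query_size metabolites are drawn without replacement."""
    upper = min(pathway_total, query_size)
    tail = sum(
        comb(pathway_total, k) * comb(universe_size - pathway_total, query_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(universe_size, query_size)


# Hypothetical numbers, not taken from the supplementary tables.
universe_size = 100   # metabolites in the reference set
pathway_total = 10    # "Total": metabolites annotated to the pathway
query_size = 8        # differential metabolites submitted
hits = 2              # "Hits": differential metabolites found in the pathway

print(expected_hits(pathway_total, query_size, universe_size))  # 0.8
print(hypergeom_p_value(hits, pathway_total, query_size, universe_size))
```

A pathway is considered enriched when Hits clearly exceeds Expected, which corresponds to a small upper-tail probability.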