A genomic-clinicopathologic Nomogram for the preoperative prediction of lymph node metastasis in gastric cancer



Preoperative evaluation of lymph node (LN) status is pivotal for informing therapeutic decisions in gastric cancer (GC) patients. However, no non-invasive method is currently available to identify LN status preoperatively. We aimed to develop a genomic biosignature-based model to predict the probability of LN metastasis in GC patients.


We used an RNA profile retrieval strategy and analyzed RNA expression profiles in a large GC cohort (GSE62254, n = 300) from the Gene Expression Omnibus (GEO). In the exploratory stage, all 300 GC patients from GSE62254 were included and the differentially expressed RNAs (DERs) by LN status were determined using R software. GC samples in GSE62254 were randomly allocated into a learning set (n = 210) and a verification set (n = 90). Using least absolute shrinkage and selection operator (LASSO) regression, a 23-RNA signature was established, and a signature-based nomogram was subsequently built to distinguish LN status. The diagnostic efficiency of the model was assessed and its clinical performance was evaluated using decision curve analysis (DCA). Metascape was used for bioinformatic analysis of the DERs.


Based on the genomic signature, we established a nomogram that robustly distinguished LN status in the learning (AUC = 0.916, 95% CI 0.833–0.999) and verification sets (AUC = 0.775, 95% CI 0.647–0.903). DCA demonstrated the clinical value of this nomogram. Functional enrichment analysis revealed that the DERs were involved in several lymphangiogenesis-correlated cascades.


In this study, we present a genomic signature-based nomogram that integrates the 23-RNA biosignature-based risk score and Lauren classification. This model can be used to estimate the probability of LN metastasis in GC with good performance. Functional analysis of the DERs points to potential biological mechanisms of LN metastasis in GC.


Globally, gastric cancer (GC) is the 5th most prevalent cancer type and the 3rd leading cause of cancer-associated mortality [1]. Several studies have demonstrated that lymph node (LN) metastasis is an independent risk factor for poor prognosis in GC [2, 3]. Accurate preoperative identification of LN involvement is important for informing therapeutic decisions for GC patients [4, 5]. Clinicopathologic factors such as lymphatic invasion or pathological differentiation are associated with LN metastasis; however, they can hardly be obtained preoperatively [6, 7]. Current preoperative prediction of LN metastasis relies primarily on morphological features of the lymph nodes as revealed by computed tomography (CT), which has unfavorable sensitivity [8]. Tumor markers, including carcinoembryonic antigen (CEA) and carbohydrate antigen 19-9 (CA19-9), have been shown to be poor predictors of LN metastasis in GC [9, 10]. Therefore, novel diagnostic biomarkers are needed to improve on the current strategies for predicting LN metastasis in GC patients. Gene expression studies have been performed to elucidate the distinct molecular biosignatures of LN metastasis. Izumi et al. proposed a 15-gene signature for identifying LN metastasis in GC [9]. Song et al. developed a co-expression network of RNAs for assessing LN metastasis in GC patients [11]. These studies show that gene expression has high predictive power for detecting LN metastasis. However, clinicopathologic factors associated with LN status were not included in these studies [12,13,14]. A nomogram is a visual predictive tool that has been used to quantify risk factors for LN metastasis in several carcinomas [15, 16], including early GC [17]. However, current nomograms integrate only clinical and postsurgical factors, which restricts their clinical value. Therefore, we aimed to establish and verify a nomogram that integrates both gene biosignatures and clinicopathologic parameters for the preoperative prediction of LN metastasis in GC.


Data preparation and differential expression analysis

Gene expression information and sample data for the GSE62254 dataset were retrieved from GEO in processed format using the R package ‘GEOquery’. An overview of the screening strategy used in this study is shown in Fig. 1. The clinical data for these samples were downloaded from the authors’ website on May 20th, 2020. The dataset obtained from the GEO database had been anonymized and, therefore, ethical approval was waived. The samples in GSE62254 were randomly divided into a learning set and a verification set.

Fig. 1
The main flowchart of this study. The flowchart of analyses to establish the nomogram model and test its predictive value. Abbreviations: LN: lymph node, DERs: differentially expressed RNAs, LASSO: least absolute shrinkage and selection operator

Human gene annotation files (GRCh38.p12) were obtained from the Ensembl repository for RNA annotation on May 20th, 2020. Samples in the GSE62254 dataset were divided into LN-negative and LN-positive groups according to the source information. The differentially expressed RNAs (DERs) between the two groups were identified using the package limma [18], with a false discovery rate (FDR) < 0.05 as the threshold. Hierarchical clustering analysis was performed with the R package heatmap [19]. A volcano plot was generated with the ggplot2 package [19].
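The FDR threshold above can be illustrated with a minimal Python sketch of the Benjamini-Hochberg adjustment. The text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the p-values in `toy_p` are invented for illustration; this is not the authors' R code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five RNAs (illustration only).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(toy_p)
# RNAs whose adjusted p-value passes the FDR < 0.05 threshold.
selected = [i for i, q in enumerate(adj) if q < 0.05]
```

In this toy case only the first two RNAs pass the threshold, even though four raw p-values are below 0.05, which shows why the adjusted value rather than the raw p-value defines a DER.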

Development of the 23-RNA signature

The least absolute shrinkage and selection operator (LASSO) regression approach, which is suited to regression analysis of high-dimensional data, was performed using the R package “glmnet” [20]. For high-dimensional data with few true predictors and many noise variables, the LASSO shrinkage penalty forces the weights of uninformative features to exactly zero, thereby reducing the number of variables. This is an advantage over ridge regression, which shrinks coefficients without eliminating them, as it greatly improves model interpretability [20]. According to the optimal lambda value obtained using cv.glmnet, candidate genes with corresponding coefficients (βi) were selected from the DERs. For each gene, univariate analysis was performed to investigate the association between gene expression level and lymph node metastasis status. A risk score was calculated for each patient as the linear combination of the expression values (Expi) of the selected genes, weighted by their corresponding coefficients (βi), plus the intercept. Based on the above process, the risk-score formula was defined as:
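For reference, the objective minimized by an L1-penalized logistic regression can be written in its standard form as below. The symbols N (number of learning-set patients), p (number of candidate DERs), x_i (expression vector), y_i (LN status coded 0/1) and β_0 (intercept) are notation introduced here for illustration and are not taken from the original text.

```latex
\[
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
- \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
+ \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
```

Larger values of λ drive more coefficients β_j to exactly zero, which is how the candidate list shrinks to a small signature.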

Risk score (RS) = \( \sum \limits_{i=1}^{23} \left( \beta_i \times \mathrm{Exp}_i \right) \) + Intercept

The R package “OptimalCutpoints” was used to determine the optimum cutoff point for the risk score. The optimum cutoff was defined as the value at which the Youden index of the receiver operating characteristic (ROC) curve for predicting LN metastasis reached its maximum in the learning set. Samples were then divided into high- and low-risk groups according to this cutoff.
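The cutoff selection described above can be sketched in a few lines of Python. This is a minimal illustration of maximizing the Youden index (sensitivity + specificity − 1), not the “OptimalCutpoints” implementation; the function name `youden_cutoff` and the toy scores and labels are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its score is strictly greater
    than the cutoff. Labels are 1 (LN-positive) or 0 (LN-negative).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    # Every observed score is tried as a candidate cutoff.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical risk scores and LN status (illustration only).
toy_scores = [0.2, 0.9, 1.1, 1.5, 1.8, 2.4, 2.9, 3.3]
toy_labels = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(toy_scores, toy_labels)
```

Because the cutoff is chosen on the learning set, the sensitivity and specificity measured there are optimistic, so the same fixed cutoff should be applied unchanged to the verification set.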

Construction and assessment of genomic signature based model

Candidate predictors including age, sex, Borrmann classification, Lauren classification, tumor location and the risk score were entered into a logistic regression analysis to build a diagnostic model for predicting LN metastasis in the learning set [21, 22]. To provide a quantitative tool for predicting the individual likelihood of LN metastasis, a nomogram prediction model was constructed from the independent risk factors using the R package rms [23]. Receiver operating characteristic (ROC) analysis was performed to assess the sensitivity and specificity of the nomogram using the R package “pROC” [24]. A calibration curve was then used to examine the agreement between predicted and observed probabilities, with 1000 bootstrap resamples to reduce overfitting bias. Decision curve analysis (DCA) was applied to assess the clinical usefulness of the gene signature-based model [25].
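The quantity plotted in a decision curve is the net benefit at a given threshold probability. The sketch below uses the standard net-benefit definition (true positives per patient minus false positives per patient weighted by the odds of the threshold); the function names and the toy predictions are hypothetical and are not taken from the study.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    # A false positive is weighted by the odds of the threshold probability.
    weight = threshold / (1 - threshold)
    return tp / n - fp / n * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted probabilities and LN status (illustration only).
toy_prob = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
toy_y = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
nb_model = net_benefit(toy_prob, toy_y, 0.5)
nb_all = net_benefit_treat_all(toy_y, 0.5)
nb_none = 0.0  # treating nobody yields zero net benefit by definition
```

A model is considered useful at a threshold when its net benefit exceeds both the treat-all and the treat-none values, which is the comparison a decision curve displays across a range of thresholds.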

Functional enrichment analysis

Metascape was used to predict the potential biological functions of the differentially expressed genes [26].

Statistical analyses

A chi-square test was used to compare categorical variables between the two sets. Student’s t test was used to compare continuous variables. Statistical analyses were performed using SPSS software (version 24) or R software (version 3.5.3). All tests were two-sided, and a P-value below 0.05 was considered statistically significant.


Patient characteristics

Samples in the GSE62254 dataset were randomly divided into a learning set (n = 210, Additional file 1) and a verification set (n = 90, Additional file 2). The baseline features of all patients are shown in Table 1. The incidence of LN metastasis was 88.1% in the learning set and 85.6% in the verification set, with no significant difference between the two.
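As a rough check of the comparison above, the reported incidences correspond to about 185 of 210 and 77 of 90 LN-positive patients (counts inferred here from the percentages, which is an assumption; they sum to the 262 LN-positive samples in the dataset). A minimal Python sketch of a Pearson chi-square test without continuity correction on these counts follows; the statistical software used in the study may apply a different variant.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row = [a + b, c + d]
    col = [a + c, b + d]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: learning set, verification set. Columns: LN-positive, LN-negative.
# Counts are inferred from the reported percentages (assumption).
stat, p_value = chi_square_2x2(185, 25, 77, 13)
```

With these inferred counts the p-value is well above 0.05, in line with the statement that the two sets do not differ significantly in LN metastasis incidence.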

Table 1 Baseline features of all subjects

Differential expression analysis

Overall, 14,651 mRNAs, 840 lncRNAs, and 111 miRNAs were annotated from the GSE62254 dataset. The 300 GC samples in the GSE62254 dataset were allocated into LN-negative (38 samples) and LN-positive (262 samples) groups. A total of 186 DERs (Additional file 3) between the LN-positive and LN-negative groups were identified under the defined threshold. Among the 186 DERs, 70 were upregulated and 116 were downregulated. The heatmap and volcano plot based on expression of the DERs are shown in Fig. 2 and Fig. 3, respectively.

Fig. 2
Heatmap: The hierarchical clustering heatmap (pink and blue represent lymph node positive and the lymph node negative samples, respectively in sample strip)

Fig. 3
Volcano plot: The volcano plot (the red and blue dots represent up- and down-regulation of differentially expressed RNAs respectively, false discovery rate < 0.05)

Construction of 23-RNA signature based risk score

Using the LASSO logistic regression model fitted on the 210 patients in the learning set, the 186 DERs were reduced to 23 RNAs with non-zero coefficients (Additional file 5) (Fig. 4a, b). The risk score formula was subsequently established based on the 23 RNAs and their corresponding coefficients (Additional file 4 / Table 1s). The developed formula is:

Fig. 4
Selection of the genes through the LASSO approach and distribution of risk score. (a) Selection of the tuning parameter (λ) via 10-fold cross-validation with the minimum criteria. The area under the curve was plotted versus log (λ). Dotted vertical lines were drawn at the optimal values using the minimum criteria and the 1 standard error of the minimum criteria (the 1-SE criteria). The optimal λ value of 0.033, with log (λ) = −3.411, was chosen based on the minimum criteria. (b) LASSO coefficient profiles of the 186 differentially expressed RNAs. A coefficient profile plot was generated against the log (λ) sequence. The vertical line was drawn at the optimal λ, which led to 23 nonzero coefficients. (c) Distribution of risk score in the learning set. (d) Distribution of risk score in the verification set

RS = 0.3370*ExpTRAPPC10 + (− 0.6895)*ExpRHOA + 0.0452*ExpIGFBP2 + 1.4984*ExpC11orf80 + (− 0.0937)*ExpZNF74 + (− 0.9888)*ExpFOXN2 + 0.6580*ExpGOLGA8A + 0.9803*ExpRSRP1 + (− 0.4094)*ExpUSP10 + 0.3896*ExpCLTB + (− 1.2924)*ExpPIK3R1 + 1.5335*ExpPABPN1 + (− 0.3669)*ExpCLCN4 + (− 1.4978)*ExpPARD6B + 0.0329*ExpTRPA1 + (− 0.0174)*ExpBAG3 + 0.4511*ExpZNF26 + 0.0381*ExpGDPD3 + 1.1286*ExpSPTBN5 + 2.3647*ExpKLHL28 + 1.0420*ExpGTPBP8 + 2.5667*ExpTXNDC11 + 0.1489*ExpTMEM163 + Intercept.
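The formula above can be evaluated as in the following Python sketch. The coefficients are copied from the formula and the cutoff of 1.3806 is the one reported for the learning set; the intercept is not given in the text, so it is left as a required argument, and the expression profile `flat_profile` is a hypothetical input used only to exercise the function.

```python
# Coefficients of the 23-RNA signature, copied from the risk-score formula.
COEFFICIENTS = {
    "TRAPPC10": 0.3370, "RHOA": -0.6895, "IGFBP2": 0.0452, "C11orf80": 1.4984,
    "ZNF74": -0.0937, "FOXN2": -0.9888, "GOLGA8A": 0.6580, "RSRP1": 0.9803,
    "USP10": -0.4094, "CLTB": 0.3896, "PIK3R1": -1.2924, "PABPN1": 1.5335,
    "CLCN4": -0.3669, "PARD6B": -1.4978, "TRPA1": 0.0329, "BAG3": -0.0174,
    "ZNF26": 0.4511, "GDPD3": 0.0381, "SPTBN5": 1.1286, "KLHL28": 2.3647,
    "GTPBP8": 1.0420, "TXNDC11": 2.5667, "TMEM163": 0.1489,
}
CUTOFF = 1.3806  # optimum cutoff reported for the learning set


def risk_score(expression, intercept):
    """Weighted sum of expression values plus the (unreported) intercept."""
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values: {sorted(missing)}")
    return intercept + sum(COEFFICIENTS[g] * expression[g] for g in COEFFICIENTS)


def risk_group(score, cutoff=CUTOFF):
    """Scores strictly above the cutoff are assigned to the high-risk group."""
    return "high" if score > cutoff else "low"


# Hypothetical profile with every gene at expression 1.0 and intercept 0.0,
# chosen only so the score equals the sum of the coefficients.
flat_profile = {gene: 1.0 for gene in COEFFICIENTS}
example_score = risk_score(flat_profile, intercept=0.0)
example_group = risk_group(example_score)
```

Real scores depend on the normalized expression scale of the dataset and on the fitted intercept, so the numbers produced by this toy profile have no clinical meaning.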

We also compared the expression of each of the 23 genes between the LN-positive and LN-negative groups. Most of the genes were significantly associated with LN metastasis (p < 0.05; Additional file 4/ Table 1s).

The distributions of risk scores in the LN-negative and LN-positive groups, which differed significantly (p < 0.05), are shown in Fig. 4c and d for the learning and verification sets, respectively. The cutoff value of the risk score was calculated, and the samples were divided into high- and low-risk groups in both the learning and verification sets. The cutoff value (1.3806) was obtained where the ROC curve reached optimum sensitivity (94.05%) and specificity (88.00%) for predicting LN metastasis (Additional file 5/ Fig. S1a). The positive predictive value (PPV) reached 98% (Additional file 5/ Fig. S1b). Patients in the learning set with a risk score higher than 1.3806 were assigned to the high-risk group (n = 177) while the rest (n = 33) were assigned to the low-risk group (Additional file 6). Patients in the verification set with a risk score higher than 1.3806 were assigned to the high-risk group (n = 60) while the rest (n = 30) were assigned to the low-risk group (Additional file 7).
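The learning-set figures above are mutually consistent, as the following Python sketch shows. The confusion-matrix counts are not stated in the text; they are inferred here from the reported group sizes, sensitivity and specificity (about 185 LN-positive and 25 LN-negative patients, 177 high-risk and 33 low-risk), so they should be read as an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Basic diagnostic metrics from the cells of a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred learning-set counts (assumption):
# high-risk = tp + fp = 177, low-risk = fn + tn = 33.
learning = classification_metrics(tp=174, fp=3, fn=11, tn=22)
sensitivity_pct = round(learning["sensitivity"] * 100, 2)
specificity_pct = round(learning["specificity"] * 100, 2)
ppv_pct = round(learning["ppv"] * 100)
```

Under these counts the sensitivity, specificity and PPV reproduce the reported 94.05%, 88.00% and 98%. The same counts give a negative predictive value of about two thirds, a quantity the text does not report.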

Construction and verification of genomic signature based model

Logistic regression analysis confirmed Lauren classification (odds ratio [OR] = 2.126, 95% CI 1.070–4.223, p < 0.05) and risk score (OR = 126.126, 95% CI 30.466–522.148, p < 0.05) as independent risk factors for LN metastasis (Table 2). Based on these two independent predictors, a nomogram model was built (Fig. 5a), from which the probability of LN metastasis can be read for a given patient using the Lauren classification and risk score. ROC analysis was used to examine the sensitivity and specificity of the nomogram. The nomogram had an optimum sensitivity of 94.1% and specificity of 88.0% for predicting LN metastasis in the learning set, and an optimum sensitivity of 74% and specificity of 76.9% in the verification set. The area under the curve (AUC) was 0.916 (95% CI: 0.833–0.999) for the learning set and 0.775 (95% CI: 0.647–0.903) for the verification set, which implied that the nomogram had good utility (Fig. 5b). In addition, the predicted probability of LN metastasis was compared with the observed probability using calibration curves in the learning and verification sets. Apart from a deviation in the verification set when the probability was below 75%, the bias-corrected calibration plot of the nomogram corresponded closely with the observed probability in both sets. The mean absolute errors were 0.021 and 0.039 in the learning and verification sets, respectively (Fig. 5c, d). The DCA for the genomic-clinicopathologic nomogram demonstrated that, for threshold probabilities ranging from 0.20 to 0.95, the nomogram model provided more net benefit than either the treat-all or the treat-none scheme (Additional file 8/Fig. S2).
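A nomogram of this kind is a graphical form of the fitted logistic model, so the mapping from predictors to probability can be sketched as below. The odds ratios are those reported above and are converted to coefficients by taking natural logarithms. Several things are assumptions: the intercept is not reported and must be supplied, the coding of Lauren classification as 0/1 with 1 for the diffuse type is a guess, and the text does not state whether the risk score entered the model as a continuous value or as a high/low indicator, so `risk_value` is left generic.

```python
import math

OR_LAUREN = 2.126    # reported odds ratio for Lauren classification
OR_RISK = 126.126    # reported odds ratio for the risk score


def predicted_probability(lauren_diffuse, risk_value, intercept):
    """Logistic-model probability of LN metastasis.

    lauren_diffuse: 1 for diffuse type, 0 otherwise (coding assumed).
    risk_value: risk-score term on the scale used in the model (assumed).
    intercept: not reported in the text; supplied by the caller.
    """
    linear = (intercept
              + math.log(OR_LAUREN) * lauren_diffuse
              + math.log(OR_RISK) * risk_value)
    return 1.0 / (1.0 + math.exp(-linear))


# With a hypothetical intercept of 0, a one-unit change in a predictor
# multiplies the odds by that predictor's odds ratio.
p_reference = predicted_probability(0, 0, intercept=0.0)
p_diffuse = predicted_probability(1, 0, intercept=0.0)
odds_ratio_check = (p_diffuse / (1 - p_diffuse)) / (p_reference / (1 - p_reference))
```

The check at the end recovers the Lauren odds ratio from the two predicted probabilities, which is the relationship the nomogram's point scales encode.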

Table 2 Multivariate evaluations to evaluate potential predictive factors for LN metastasis
Fig. 5
Development and performance of the genomic signature-based nomogram. (a) The nomogram was designed in the learning set, integrating the 23-RNA biosignature-based risk score and Lauren classification. (b) The area under the curve of the nomogram was 0.916 (95% CI: 0.833–0.999) in the learning set and 0.775 (95% CI: 0.647–0.903) in the verification set. (c) Calibration plot in the learning set (mean absolute error = 0.021). After 1000 bootstrap repetitions, the bias-corrected calibration curve (solid line) was close to the ideal curve (dashed line). (d) Calibration plot in the verification set (mean absolute error = 0.039)

Functional enrichment analysis

Metascape was used for pathway and process enrichment analysis of the DERs (Additional file 9). The top 15 clusters with their representative enriched terms are shown in Fig. 6. A subset of the enriched terms was selected and rendered as a network plot (Additional file 10/Fig. S3). Specifically, the DERs were enriched in several pathways, such as Signaling by platelet derived growth factor (PDGF) and Intrinsic Pathway for Apoptosis.

Fig. 6
Bar graph of enriched terms across the differentially expressed RNAs, colored by p-values


LN metastasis affects prognostic outcomes in GC [2, 3]. Precise preoperative determination of LN involvement in GC is pivotal for clinical decision-making. Less invasive therapeutic options such as endoscopic submucosal dissection (ESD) can be effectively performed for LN-negative patients with early GC. However, ESD should be avoided for early GC patients with a high risk of LN metastasis [27, 28]. For localized LN-negative GC patients, limited LN resection is recommended to reduce postoperative complications. Surgical resection with extensive lymphadenectomy is necessary for advanced GC patients with LN metastasis [5]. Therefore, it is important to accurately determine the extent and degree of LN metastasis in order to inform therapeutic decisions. With the development of high-throughput sequencing (HTS) technologies, the molecular portrait of GC has been comprehensively analyzed by gene-expression profiling [29, 30]. As RNA-sequencing technology provides molecular insights into tumor biology, we focused on building a genomic signature-based nomogram for predicting LN metastasis in GC. Using cDNA microarrays, several studies have reported gene expression-based biomarkers for predicting LN metastasis in GC [31,32,33]. However, these studies did not take into account the clinical characteristics associated with LN status in GC [12,13,14].

Based on the Lauren classification, GC can be grouped into intestinal and diffuse types [34]. The intestinal type of GC stems from premalignant lesions that develop from an initial Helicobacter pylori (H. pylori)-triggered chronic gastritis followed by atrophic and metaplastic gastritis [35]. The diffuse type of GC is triggered by active inflammation of the gastric mucosa [36, 37]. Diffuse-type tumors are more prevalent in younger patients and carry an elevated risk of LN metastasis compared to the intestinal type [38,39,40]. Our study established that Lauren classification was an independent risk factor for LN metastasis, with the diffuse type associated with an elevated risk of LN metastasis relative to the intestinal type.

We constructed and verified a diagnostic, genomic biosignature-based nomogram as a noninvasive strategy for preoperative estimation of LN metastasis in GC. This nomogram incorporates two items: the genomic signature-based risk score and Lauren classification. Although a clear deviation was found in the verification set when the probability was below 75%, the nomogram agreed closely with the observed probability in the learning set. A possible reason for the deviation in the verification set is overfitting, as the predictive model was built on data from the learning set; it therefore did not perform as well in the verification set as in the learning set when predicting LN metastasis. The areas under the ROC curve for the learning and verification sets implied that the nomogram had good utility. DCA is a simple method for evaluating the clinical performance of a prediction model. It can quantify the net benefit of different strategies and determine an optimal threshold range. This LN metastasis prediction model can help surgeons balance quality of life against aggressive lymphadenectomy.

To provide insights into the potential biological processes, Metascape was used for functional enrichment analysis of the DERs. The DERs were enriched in three signaling pathways: PDGF signaling, Interleukin-7 signaling and the Intrinsic Pathway for Apoptosis. The PDGF receptor cascade constitutes a signaling network that is essential for the growth of cells of mesenchymal origin. Dysregulation of this pathway can lead to tumor-promoting remodeling of the extracellular matrix that favors migration, infiltration, angiogenesis, and lymphangiogenesis [41, 42]. Among the enriched genes in this pathway, STAT3 is activated after cytokines interact with cell surface receptors and regulates downstream gene expression that promotes cell proliferation and growth [43]. PLAT encodes a plasminogen activator that promotes degradation of the extracellular matrix, especially its collagen fiber components, mediating cell migration and tissue remodeling [44]. As for the Interleukin-7 signaling pathway, the Interleukin-7 (IL-7) gene is involved in both B-cell and T-cell proliferation, and its absence leads to arrest of immature immune cells. IL-7 modulates cell growth, apoptosis and cancer lymphangiogenesis [45, 46]. RAG1 encodes the RAG1 protein, which is involved in regulating the initiation of V(D)J recombination so that rearrangement of antigen receptor genes proceeds strictly in line with tissue type and cell growth phase [47]. Low RAG1 gene expression is correlated with poor survival of gastric cancer patients [47]. Apoptosis is a form of programmed cell death. Insufficient apoptosis is associated with neoplastic diseases [48]. In the Intrinsic Pathway for Apoptosis, enriched genes such as complement C1q binding protein (C1QBP), also referred to as p32, are expressed in various cancer types [49,50,51,52,53,54]. Protein phosphatase 3 regulatory subunit 1 (PPP3R1) is a member of the β-regulatory subunit family of calcineurin that codes for apoptosis-stimulating protein of p53 (ASPP) in the p53 integrin family [55]. ASPP enhances p53-mediated apoptosis by binding to the p53 core domain [56]. However, the specific molecular mechanisms of the differentially expressed genes in these pathways have not been established. Elucidation of these mechanisms may provide new clues and molecular targets for the identification and specific treatment of GC with LN metastasis.

Compared to previous nomograms [15,16,17], our model incorporates Lauren classification and a genomic signature-based risk score. This model exhibited high accuracy for predicting LN metastasis. However, this study has some limitations. First, we did not perform external verification of the model using data from another institution. Second, clinicopathological factors such as CEA level and CT-reported LN status were not available in the GSE62254 dataset. Therefore, these important clinical features could not be examined in this study. More studies should be performed to elucidate the functions of the DERs in the pathogenesis of LN metastasis.


In conclusion, this nomogram incorporates both a genomic signature-based risk score and Lauren classification to estimate the probability of LN metastasis preoperatively in GC.

Availability of data and materials

The datasets generated and analysed during the current study are available in the Gene Expression Omnibus (GEO) repository.



GC: Gastric cancer

LN: Lymph node

GEO: Gene Expression Omnibus

DERs: Differentially expressed RNAs

LASSO: Least absolute shrinkage and selection operator

DCA: Decision curve analysis

AUC: Area under the curve

CT: Computed tomography

ROC: Receiver operating characteristic

FDR: False discovery rate

OR: Odds ratio

CI: Confidence interval

PDGF: Platelet derived growth factor

ESD: Endoscopic submucosal dissection




  1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424.

  2. Lo SS, Wu CW, Chen JH, Li AFY, Hsieh MC, Shen KH, et al. Surgical results of early gastric cancer and proposing a treatment strategy. Ann Surg Oncol. 2007;14(2):340–7.

  3. Mu GC, Huang Y, Liu ZM, Wu XH, Qin XG, Chen ZB. Application value of nomogram and prognostic factors of gastric cancer patients who underwent D2 radical lymphadenectomy. BMC Gastroenterol. 2019;19(1):188.

  4. Lou N, Zhang L, Chen XD, Pang WY, Arvine C, Huang YP, et al. A novel scoring system associating with preoperative platelet/lymphocyte and clinicopathologic features to predict lymph node metastasis in early gastric cancer. J Surg Res. 2017;209:153–61.

  5. Japanese Gastric Cancer Association. Japanese gastric cancer treatment guidelines 2010 (ver. 3). Gastric Cancer. 2011;14(2):113–23.

  6. Chu YN, Yu YN, Jing X, Mao T, Chen YQ, Zhou XB, et al. Feasibility of endoscopic treatment and predictors of lymph node metastasis in early gastric cancer. World J Gastroenterol. 2019;25(35):5344–55.

  7. Huang Q, Cheng Y, Chen L, et al. Low risk of lymph node metastasis in 495 early gastric cardiac carcinomas: a multicenter clinicopathologic study of 2101 radical gastrectomies for early gastric carcinoma. Mod Pathol. 2018;31(10):1599–607.

  8. Kim AY, Kim HJ, Ha HK. Gastric cancer by multidetector row CT: preoperative staging. Abdom Imaging. 2005;30(4):465–72.

  9. Izumi D, Gao F, Toden S, Sonohara F, Kanda M, Ishimoto T, et al. A genomewide transcriptomic approach identifies a novel gene expression signature for the detection of lymph node metastasis in patients with early stage gastric cancer. EBioMedicine. 2019;41:268–75.

  10. Okada Y, Fujiwara Y, Yamamoto H, Sugita Y, Yasuda T, Doki Y, et al. Genetic detection of lymph node micrometastases in patients with gastric carcinoma by multiple-marker reverse transcriptase-polymerase chain reaction assay. Cancer. 2001;92(8):2056–64.

  11. Song Z, Zhao W, Cao D, Zhang J, Chen S. Elementary screening of lymph node metastatic-related genes in gastric cancer based on the co-expression network of messenger RNA, microRNA and long non-coding RNA. Braz J Med Biol Res. 2018;51(4):e6685.

  12. Oka S, Tanaka S, Kaneko I, Mouri R, Hirata M, Kawamura T, et al. Advantage of endoscopic submucosal dissection compared with EMR for early gastric cancer. Gastrointest Endosc. 2006;64(6):877–83.

  13. Ajani JA, Bentrem DJ, Besh S, D'Amico TA, Das P, Denlinger C, et al. Gastric cancer, version 2.2013: featured updates to the NCCN guidelines. J Natl Compr Cancer Netw. 2013;11(5):531–46.

  14. Hyung WJ, Cheong JH, Kim J, Chen J, Choi SH, Noh SH. Application of minimally invasive treatment for early gastric cancer. J Surg Oncol. 2004;85(4):181–6.

  15. Klar M, Jochmann A, Foeldi M, Stumpf M, Gitsch G, Stickeler E, et al. The MSKCC nomogram for prediction the likelihood of non-sentinel node involvement in a German breast cancer population. Breast Cancer Res Treat. 2008;112(3):523–31.

  16. Briganti A, Larcher A, Abdollah F, Capitanio U, Gallina A, Suardi N, et al. Updated nomogram predicting lymph node invasion in patients with prostate cancer undergoing extended pelvic lymph node dissection: the essential importance of percentage of positive cores. Eur Urol. 2012;61(3):480–7.

  17. Zheng Z, Zhang Y, Zhang L, Li Z, Wu X, Liu Y, et al. A nomogram for predicting the likelihood of lymph node metastasis in early gastric patients. BMC Cancer. 2016;16(1):92.

  18. Smyth GK. limma: Linear models for microarray data. In: Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S, editors. Bioinformatics & Computational Biology Solutions Using R & Bioconductor. New York: Springer; 2011. p. 397–420. doi: 10.1007/0-387-29362-0_23.

  19. Wang L, Cao C, Ma Q, Zeng Q, Wang H, Cheng Z, et al. RNA-seq analyses of multiple meristems of soybean: novel and alternative transcripts, evolutionary and functional implications. BMC Plant Biol. 2014;14(1):169.

  20. Sauerbrei W, Royston P, Binder H. Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Stat Med. 2007;26(30):5512–28.

  21. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594.

  22. Sauerbrei W, Boulesteix AL, Binder H. Stability investigations of multivariable regression models derived from low- and high-dimensional data. J Biopharm Stat. 2011;21(6):1206–31.

  23. Eng KH, Schiller E, Morrell K. On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve. Oncotarget. 2015;6(34):36308–18.

  24. Zhao J, Qin R, Chen H, Yang Y, Qin W, Han J, et al. A nomogram based on glycomic biomarkers in serum and clinicopathological characteristics for evaluating the risk of peritoneal metastasis in gastric cancer. Clin Proteomics. 2020;17(1):34.

  25. Vickers AJ, Cronin AM, Elkin EB, Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak. 2008;8(1):53.

  26. Zhou Y, Zhou B, Pache L, Chang M, Khodabakhshi AH, Tanaseichuk O, et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun. 2019;10(1):1523.

  27. ASGE Technology Committee, Kantsevoy SV, Adler DG, et al. Endoscopic mucosal resection and endoscopic submucosal dissection. Gastrointest Endosc. 2008;68(1):11–8.

  28. ASGE Standards of Practice Committee, Gan SI, Rajan E, et al. Role of EUS. Gastrointest Endosc. 2007;66(3):425–34.

  29. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature. 2014;513(7517):202–9.

  30. Cristescu R, Lee J, Nebozhyn M, Kim KM, Ting JC, Wong SS, et al. Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med. 2015;21(5):449–56.

  31. Weiss MM, Kuipers EJ, Postma C, Snijders AM, Siccama I, Pinkel D, et al. Genomic profiling of gastric cancer predicts lymph node status and survival. Oncogene. 2003;22(12):1872–9.

  32. Teramoto K, Tada M, Tamoto E, Abe M, Kawakami A, Komuro K, et al. Prediction of lymphatic invasion/lymph node metastasis, recurrence, and survival in patients with gastric cancer by cDNA array-based expression profiling. J Surg Res. 2005;124(2):225–36.

  33. Marchet A, Mocellin S, Belluco C, Ambrosi A, de Marchi F, Mammano E, et al. Gene expression profile of primary gastric cancer: towards the prediction of lymph node status. Ann Surg Oncol. 2007;14(3):1058–64.

  34. Lauren P. The two histological main types of gastric carcinoma: diffuse and so-called intestinal-type carcinoma. An attempt at a histo-clinical classification. Acta Pathol Microbiol Scand. 1965;64(1):31–49.

  35. Correa P. Human gastric carcinogenesis: a multistep and multifactorial process-first American Cancer Society award lecture on Cancer epidemiology and prevention. Cancer Res. 1992;52(24):6735–40.

  36. Watanabe M, Kato J, Inoue I, Yoshimura N, Yoshida T, Mukoubayashi C, et al. Development of gastric cancer in nonatrophic stomach with highly active inflammation identified by serum levels of pepsinogen and helicobacter pylori antibody together with endoscopic rugal hyperplastic gastritis. Int J Cancer. 2012;131(11):2632–42.

  37. Nardone G, Rocco A, Malfertheiner P. Review article: Helicobacter pylori and molecular events in precancerous gastric lesions. Aliment Pharmacol Ther. 2004;20(3):261–70.

  38. Adachi Y, Yasuda K, Inomata M, Sato K, Shiraishi N, Kitano S. Pathology and prognosis of gastric carcinoma: well versus poorly differentiated type. Cancer. 2000;89(7):1418–24.

  39. Ribeiro MM, Sarmento JA, Sobrinho Simões MA, et al. Prognostic significance of Lauren and Ming classifications and other pathologic parameters in gastric carcinoma. Cancer. 1981;47(4):780–4.

  40. Lee T, Tanaka H, Ohira M, Okita Y, Yoshii M, Sakurai K, et al. Clinical impact of the extent of lymph node micrometastasis in undifferentiated-type early gastric cancer. Oncology. 2014;86(4):244–52.

  41. Ehnman M, Östman A. Therapeutic targeting of platelet-derived growth factor receptors in solid tumors. Expert Opin Investig Drugs. 2014;23(2):211–26.

  42. Andrae J, Gallini R, Betsholtz C. Role of platelet-derived growth factors in physiology and medicine. Genes Dev. 2008;22(10):1276–312.

  43. Mehine M, Kaasinen E, Heinonen HR, Mäkinen N, Kämpjärvi K, Sarvilinna N, Aavikko M, Vähärautio A, Pasanen A, Bützow R, Heikinheimo O, Sjöberg J, Pitkänen E, Vahteristo P, Aaltonen LA. Integrated data analysis reveals uterine leiomyoma subtypes with distinct driver pathways and biomarkers. Proc Natl Acad Sci. 2016;113(5):1315–20.

  44. Sun F, Zhuo R, Ma W, Yang D, Su T, Ye L, et al. From clinic to mechanism: proteomics-based assessment of angiogenesis in adrenal pheochromocytoma. J Cell Physiol. 2019;234(12):22057–70.

  45. Lin J, Zhu Z, Xiao H, Wakefield MR, Ding VA, Bai Q, et al. The role of IL-7 in immunity and cancer. Anticancer Res. 2017;37(3):963–7.

  46. Jian M, Yunjia Z, Zhiying D, Yanduo J, Guocheng J. Interleukin 7 receptor activates PI3K/Akt/mTOR signaling pathway via downregulation of Beclin-1 in lung cancer. Mol Carcinog. 2019;58(3):358–65.

  47. Kang T, Ge M, Wang R, et al. Arsenic sulfide induces RAG1-dependent DNA damage for cell killing by inhibiting NFATc3 in gastric cancer cells. J Exp Clin Cancer Res. 2019;38(1):487. doi: 10.1186/s13046-019-1471-x. PMID: 31822296; PMCID: PMC6902349.

  48. Matsuura K, Canfield K, Feng W, et al. Metabolic regulation of apoptosis in cancer. Int Rev Cell Mol Biol. 2016;327:43–87.

  49. Amamoto R, Yagi M, Song Y, Oda Y, Tsuneyoshi M, Naito S, et al. Mitochondrial p32/C1QBP is highly expressed in prostate cancer and is associated with shorter prostate-specific antigen relapse time after radical prostatectomy. Cancer Sci. 2011;102(3):639–47.

  50. Gao LJ, Gu PQ, Fan WM, Liu Z, Qiu F, Peng YZ, et al. The role of gC1qR in regulating survival of human papillomavirus 16 oncogene-transfected cervical cancer cells. Int J Oncol. 2011;39(5):1265–72. doi: 10.3892/ijo.2011.1108.

  51. Yu H, Liu Q, Xin T, Xing L, Dong G, Jiang Q, et al. Elevated expression of hyaluronic acid binding protein 1 (HABP1)/P32/C1QBP is a novel indicator for lymph node and peritoneal metastasis of epithelial ovarian cancer patients. Tumour Biol. 2013;34(6):3981–7.

  52. Wang J, Song Y, Liu T, Shi Q, Zhong Z, Wei W, et al. Elevated expression of HABP1 is a novel prognostic indicator in triple-negative breast cancers. Tumour Biol. 2015;36(6):4793–9.

  53. Kim K, Kim MJ, Kim KH, Ahn SA, Kim JH, Cho JY, et al. C1QBP is upregulated in colon cancer and binds to apolipoprotein A-I. Exp Ther Med. 2017;13(5):2493–500.

  54. Saha SK, Kim KE, Islam SMR, et al. Systematic multiomics analysis of alterations in C1QBP mRNA expression and relevance for clinical outcomes in cancers. J Clin Med. 2019;8(4):513.

  55. Wu J, Zheng C, Wang X, Yun S, Zhao Y, Liu L, et al. MicroRNA-30 family members regulate calcium/calcineurin signaling in podocytes. J Clin Invest. 2015;125(11):4091–106.

  56. Schittenhelm MM, Walter B, Tsintari V, et al. Alternative splicing of the tumor suppressor ASPP2 results in a stress-inducible, oncogenic isoform prevalent in acute leukemia. EBioMedicine. 2019;42:340–51.



We thank the GEO database for providing the original study data. We also thank QWZ for her invaluable contribution to the biostatistical analysis.


This work was funded by the National Natural Science Foundation of China (Grant Nos. 81272493 and 81472213) and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ19H160044).

Author information


XZ, XFW and GYW designed the study. XZ analyzed and interpreted the data. XZ, FCX and YQ drafted the manuscript. JHP, SHW, WCC, TYL and HPZ helped to revise the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Xin Zhong, Xianfa Wang or Guanyu Wang.

Ethics declarations

Ethics approval and consent to participate

Ethics approval was not applicable because all data are publicly available and carry no specific identifiers.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Learning set.

Additional file 2.

Verification set.

Additional file 3.

186 DERs were reduced to 23 RNAs on the basis of 210 patients in the learning set.

Additional file 4.

Table S1. The 23 RNAs with their corresponding coefficients and the results of univariate analysis between gene expression level and lymph node metastasis status. P-values are based on t-tests.

Additional file 5.

Figure S1. The optimum cutoff point of the risk score. (a) The cutoff value (1.3806) was obtained where the ROC curve reached optimum sensitivity (94.05%) and specificity (88.00%) for predicting LN metastasis. (b) The cutoff value (1.3806) was obtained where the positive predictive value (PPV) reached 98%.

Additional file 6.

Risk score and risk status for each sample in the learning set.

Additional file 7.

Risk score and risk status for each sample in the verification set.

Additional file 8.

Figure S2. Decision curve analysis for the genomic signature based nomogram.

Additional file 9.

Metascape analysis results.

Additional file 10.

Figure S3. Network of enriched terms. (a) Colored by cluster ID; nodes that share a cluster ID typically lie close to each other. (b) Colored by p-value; terms containing more genes tend to have a more significant p-value.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhong, X., Xuan, F., Qian, Y. et al. A genomic-clinicopathologic Nomogram for the preoperative prediction of lymph node metastasis in gastric cancer. BMC Cancer 21, 455 (2021).


  • Gastric cancer
  • Gene signature
  • Nomogram
  • Lymph node metastasis