
Artificial intelligence-based prognostic model accurately predicts the survival of patients with diffuse large B-cell lymphomas: analysis of a large cohort in China

Abstract

Background

Diffuse large B-cell lymphomas (DLBCLs) display high molecular heterogeneity, but the International Prognostic Index (IPI) considers only clinical indicators and has not been updated to include molecular data. Therefore, we developed a widely applicable novel scoring system with molecular indicators screened by artificial intelligence (AI) that achieves accurate prognostic stratification and promotes individualized treatments.

Methods

We retrospectively enrolled a cohort of 401 patients with DLBCL from our hospital, covering the period from January 2015 to January 2019. We included 22 variables in our analysis and assigned them weights using the random survival forest method to establish a new predictive model combining bidirectional long short-term memory (Bi-LSTM) and logistic hazard techniques. We compared the predictive performance of our “molecular-contained prognostic model” (McPM) and the IPI. In addition, we developed a simplified version of the McPM (sMcPM) to enhance its practical applicability in clinical settings. We also demonstrated the improved risk stratification capabilities of the sMcPM.

Results

Our McPM showed superior predictive accuracy, as indicated by its high C-index and low integrated Brier score (IBS), for both overall survival (OS) and progression-free survival (PFS). The overall performance of the McPM was also better than that of the IPI based on receiver operating characteristic (ROC) curve fitting. We selected five key indicators, including extranodal involvement sites, lactate dehydrogenase (LDH), MYC gene status, absolute monocyte count (AMC), and platelet count (PLT) to establish the sMcPM, which is more suitable for clinical applications. The sMcPM showed similar OS results (P < 0.0001 for both) to the IPI and significantly better PFS stratification results (P < 0.0001 for sMcPM vs. P = 0.44 for IPI).

Conclusions

Our new McPM, including both clinical and molecular variables, showed superior overall stratification performance to the IPI, rendering it more suitable for the molecular era. Moreover, our sMcPM may become a widely used and effective stratification tool to guide individual precision treatments and drive new drug development.


Background

Diffuse large B-cell lymphomas (DLBCLs) are the most common subtype of non-Hodgkin lymphoma (NHL), and they display clinical and biological heterogeneity. Immunochemotherapy with rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone (RCHOP) is a widely accepted, standardized treatment for patients with newly diagnosed DLBCL [1]. Despite a high response rate, approximately 10–15% of patients are resistant to first-line immunochemotherapy [2,3,4], and disease relapse occurs in up to 30–40% of patients after treatment [4, 5]. Therefore, individualized treatments based on risk stratification are urgently needed.

The International Prognostic Index (IPI) is the most widely used tool for risk stratification in clinical practice; it was established before the era of immunochemotherapy [6]. The revised IPI (R-IPI) and National Comprehensive Cancer Network IPI (NCCN-IPI) are often used as prognostic models during rituximab treatments. They offer advantages in predicting prognosis after immunochemotherapy [7]. A variety of gene signatures and microenvironmental biomarkers have emerged. Tumour-associated macrophages (TAMs) are important components of the DLBCL microenvironment, and upregulation of CD163+ M2 TAMs has been found to be associated with inferior chemotherapy effects [8] and an unfavourable prognosis [9]. Carreras et al. also found that high-level infiltration of (CD163+, PTX3+, IL10+) M2c-like TAMs and low infiltration of FOXP3+ Tregs were associated with a poor prognosis [10]. In addition, DLBCL patients expressing PD-L1 also have a poor prognosis [11]. Therefore, several studies have attempted to incorporate molecular markers into various models [4, 7, 12,13,14,15]. Of these, the five-gene risk model (CD163, CLEC4A, COL15A1, GABRB2, IFIT3) reliably predicts the overall survival (OS) of DLBCL patients [15]. However, even in the current molecular era, there is no consensus among research groups on the optimal molecular technique for stratifying DLBCL patients. Thus, the models have not been widely used in routine medical practice. The IPI, R-IPI, and NCCN-IPI are still widely used to predict prognosis. However, all three scoring systems share the limitation of only considering clinical indicators; molecular heterogeneity is not sufficiently taken into account [7, 16].

Gene abnormalities in MYC, BCL2, and BCL6 are strong prognostic predictors in patients with DLBCL [17,18,19]. DLBCL with a MYC rearrangement (MYC-R) but not a BCL2 rearrangement (BCL2-R) nor a BCL6 rearrangement (BCL6-R) is termed single-hit lymphoma (SHL). DLBCL with MYC-R and BCL2-R and/or BCL6-R is termed double- or triple-hit lymphoma (DHL and THL, respectively). All three types were reported to be associated with a poor prognosis [20]. However, to the best of our knowledge, these markers have not been used in prognostic scoring systems.
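The single-, double-, and triple-hit definitions above can be read as a small decision rule. The following is a minimal sketch in Python, not part of the study's pipeline; the function name and the "none" label for cases without a MYC rearrangement are assumptions introduced for illustration.

```python
def classify_hit_status(myc_r: bool, bcl2_r: bool, bcl6_r: bool) -> str:
    """Map FISH rearrangement flags to 'SHL', 'DHL', 'THL', or 'none'.

    Follows the definitions in the text: every category requires a MYC
    rearrangement; the label 'none' is a hypothetical placeholder.
    """
    if not myc_r:
        return "none"
    partners = int(bcl2_r) + int(bcl6_r)  # number of co-occurring rearrangements
    if partners == 2:
        return "THL"  # MYC-R with both BCL2-R and BCL6-R
    if partners == 1:
        return "DHL"  # MYC-R with either BCL2-R or BCL6-R
    return "SHL"      # MYC-R alone


if __name__ == "__main__":
    print(classify_hit_status(True, False, True))
```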

Mathematical modelling methods with digitized clinical data have gradually been introduced to identify the most important prognostic factors of diseases and predict the incidence of events [21,22,23,24]. The use of artificial intelligence to screen for DLBCL-prognostic genes is now common [25]. One study used artificial intelligence to reveal the prognostic impact of the MYC and BCL2 genes [26]. However, AI-based classification methods and models are not appropriate for routine clinical practice. In addition, few models combine clinical and genetic screening factors as prognostic indicators.

The aim of our research was to establish a new prognostic model with diverse indicators, including both clinical and molecular variables, by using an intelligent screening method. Our new model represents an advancement in both methodology and indicator selection. It is well positioned to provide improved guidance for clinical treatment, especially in the context of new drugs, and aligns with the demands of the molecular era.

Materials and methods

Patients and clinical features

Case selection

We retrospectively collected clinical information and the results of MYC, BCL2, and BCL6 fluorescence in situ hybridization (FISH) tests from 401 patients with DLBCL newly diagnosed in our hospital during the period January 2015 to January 2019. We excluded patients with incomplete clinical data and those who did not receive chemotherapy. All patients were diagnosed with DLBCL by pathologists at our hospital or after external consultations. The study was conducted in accordance with the Helsinki Declaration. All patients signed informed consent forms before inclusion.

Fluorescence in situ hybridization

FISH experiments were performed using paraffin-embedded specimens after the pathological diagnoses. All probes and 4’,6-diamidino-2-phenylindole (DAPI) counterstains used in this study were purchased from Abbott (USA), including the MYC dual-color separation (01N63-020), BCL2 dual-color separation (05N51-020), and BCL6 dual-color separation (01N23-020) probes. According to the results of hematoxylin-eosin (HE) staining, we selected regions rich in tumor cells as hybridization regions. We counted 200 tumor cells in each specimen and assessed genetic features based on specific signal patterns. We identified gene rearrangements based on signal separation, where one fused yellow signal and two separated red and green signals were observed in > 10% of cells for the MYC gene, > 10% of cells for the BCL2 gene, and > 10% of cells for the BCL6 gene. We identified gene amplifications in the presence of more than three yellow signals (or adjacent red and green signals) in the same nucleus. Two yellow signals (or adjacent red and green signals) in the same nucleus characterized normal MYC, BCL2, and BCL6 genes.
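The rearrangement call described above is a simple threshold on the fraction of counted cells showing a split signal. The sketch below illustrates it under stated assumptions: the function and constant names are hypothetical, and only the 200-cell count and the > 10% cut-off come from the text.

```python
CELLS_COUNTED = 200       # tumour cells counted per specimen (from the text)
CUTOFF_NUMERATOR = 1      # cut-off expressed as the fraction 1/10, i.e. > 10%
CUTOFF_DENOMINATOR = 10


def is_rearranged(split_signal_cells: int, cells_counted: int = CELLS_COUNTED) -> bool:
    """Return True when strictly more than 10% of counted cells show a split signal."""
    if cells_counted <= 0 or not 0 <= split_signal_cells <= cells_counted:
        raise ValueError("cell counts are inconsistent")
    # Integer arithmetic avoids floating-point ambiguity exactly at the cut-off.
    return split_signal_cells * CUTOFF_DENOMINATOR > cells_counted * CUTOFF_NUMERATOR


if __name__ == "__main__":
    print(is_rearranged(21), is_rearranged(20))
```

Because the criterion is a strict inequality, a specimen with exactly 10% split-signal cells is not called rearranged in this sketch.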

Follow-up

Patients were followed up via phone calls or the Hospital Information System. We excluded patients for whom we lacked survival outcome information. The last follow-up date was 1 January 2022. We defined OS as the time from DLBCL diagnosis to death or the last follow-up, whichever came first. Progression-free survival (PFS) was the time from DLBCL diagnosis to the first disease progression, death, or last follow-up, whichever came first.
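The OS and PFS definitions above translate into a (time, event indicator) pair per patient. The following sketch shows one way to compute them; the function names and the days-to-months conversion factor are assumptions, and only the follow-up cut-off date comes from the text.

```python
from datetime import date
from typing import Optional, Tuple

LAST_FOLLOW_UP = date(2022, 1, 1)  # last follow-up date stated in the text
DAYS_PER_MONTH = 30.4375           # assumed conversion factor, not from the paper


def overall_survival(diagnosis: date, death: Optional[date] = None,
                     last_follow_up: date = LAST_FOLLOW_UP) -> Tuple[float, int]:
    """Months from diagnosis to death (event=1) or last follow-up (event=0)."""
    if death is not None and death <= last_follow_up:
        return (death - diagnosis).days / DAYS_PER_MONTH, 1
    return (last_follow_up - diagnosis).days / DAYS_PER_MONTH, 0


def progression_free_survival(diagnosis: date, progression: Optional[date] = None,
                              death: Optional[date] = None,
                              last_follow_up: date = LAST_FOLLOW_UP) -> Tuple[float, int]:
    """Months from diagnosis to first progression or death (event=1),
    or to last follow-up if neither occurred (event=0)."""
    observed = [d for d in (progression, death) if d is not None and d <= last_follow_up]
    if observed:
        return (min(observed) - diagnosis).days / DAYS_PER_MONTH, 1
    return (last_follow_up - diagnosis).days / DAYS_PER_MONTH, 0


if __name__ == "__main__":
    print(progression_free_survival(date(2016, 1, 1), progression=date(2016, 7, 1)))
```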

Modelling

Modelling process

Figure 1 shows the modelling process.

Fig. 1

The modelling process of McPM. EIS, extranodal involvement sites; LDH, lactate dehydrogenase; AMC, absolute monocyte count; PLT, platelet count; Bi-LSTM, bidirectional long short-term memory; MLP, multi-layer perceptron; MSELoss, mean squared error loss; OS, overall survival; PFS, progression-free survival

Feature selection

We evaluated the significance of variables using the random survival forest (RSF) R package (website: https://cran.rproject.org/web/packages/randomForestSRC/index.html) [27]. Two methods were used to determine the contributions of variables to a stochastic survival model: the variable importance (VIMP) [28] method and the minimum depth method.

Data set segmentation

We used 90% of the data for model training and 10% for model testing. We further split the training portion into training and validation sets at a ratio of 9:1, keeping the proportion of positive and negative samples consistent across the test, training, and validation sets (Fig. 2).

Fig. 2

Data set segmentation. The complete data set is divided into training and test data sets in a ratio of 9:1, where the training data is further divided into training and validation data in a ratio of 9:1
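The nested 9:1 split described above can be sketched as two stratified splits applied in sequence. This is an illustrative sketch only: the function names, the seed handling, and the rounding rule are assumptions, and the event counts in the usage example are hypothetical rather than taken from the study.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int], held_out_fraction: float = 0.1,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (kept, held_out) index lists, preserving label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    kept, held_out = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_out = round(len(indices) * held_out_fraction)  # assumed rounding rule
        held_out.extend(indices[:n_out])
        kept.extend(indices[n_out:])
    return sorted(kept), sorted(held_out)


def three_way_split(labels: Sequence[int], seed: int = 0):
    """Split 9:1 into (train+validation, test), then 9:1 into (train, validation)."""
    train_val, test = stratified_split(labels, 0.1, seed)
    sub_labels = [labels[i] for i in train_val]
    train_pos, val_pos = stratified_split(sub_labels, 0.1, seed + 1)
    train = [train_val[p] for p in train_pos]  # map positions back to original indices
    val = [train_val[p] for p in val_pos]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical labels: 100 positive and 301 negative samples (401 in total).
    labels = [1] * 100 + [0] * 301
    train, val, test = three_way_split(labels)
    print(len(train), len(val), len(test))
```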

Training

We generated a survival prediction model (“molecular-contained prognostic model”, McPM) based on entity embedding, encoded and decoded layers, bidirectional long short-term memory (Bi-LSTM), and logistic hazard techniques [29]. We classified variables into categorical, binary, and numerical categories and used them as model inputs. The encoded data were produced using the encoded layer. Future and past information was discovered and retained by the Bi-LSTM, and the final survival predictions were then generated using the logistic hazard model. The logistic hazard model is a discrete-time method and requires discretization of event times for data that are originally continuous. The model’s loss function consists of an encoder loss component (mean squared error loss, MSELoss) and a survival prediction loss component (negative log-likelihood loss of the logistic hazard model, NLLLogistiHazardLoss), where the encoder loss was calculated on the basis of differences between the model’s output and input features.
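To make the discrete-time step concrete, the sketch below shows how a continuous follow-up time can be mapped to an interval index, how per-interval hazards give a survival curve, and how the logistic hazard negative log-likelihood is formed for one patient. It is a plain-Python illustration under stated assumptions, not the authors' network: the time grid, function names, and logits are hypothetical, and the encoder (MSE) part of the loss is omitted.

```python
import math
from bisect import bisect_left
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def discretize(duration: float, cuts: Sequence[float]) -> int:
    """Index of the first grid point >= duration, clipped to the last interval."""
    return min(bisect_left(cuts, duration), len(cuts) - 1)


def survival_from_logits(logits: Sequence[float]) -> List[float]:
    """S(t_j) = prod_{k <= j} (1 - h_k), with hazard h_k = sigmoid(logit_k)."""
    surv, s = [], 1.0
    for z in logits:
        s *= 1.0 - sigmoid(z)
        surv.append(s)
    return surv


def nll_logistic_hazard(logits: Sequence[float], interval_idx: int, event: int) -> float:
    """Binary cross-entropy over intervals 0..interval_idx: the target is 1 only
    in the final interval of a patient who had the event, and 0 otherwise."""
    loss = 0.0
    for k in range(interval_idx + 1):
        h = sigmoid(logits[k])
        target = 1.0 if (k == interval_idx and event == 1) else 0.0
        loss -= target * math.log(h) + (1.0 - target) * math.log(1.0 - h)
    return loss


if __name__ == "__main__":
    cuts = [0.0, 12.0, 24.0, 36.0]  # hypothetical grid in months
    idx = discretize(14.0, cuts)
    print(idx, survival_from_logits([0.0, 0.0, 0.0]), nll_logistic_hazard([0.0] * 4, idx, 1))
```

A censored patient contributes only "survived this interval" terms, which is why the same function handles both event indicators.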

Model evaluation

We used the concordance index (C-index) and Brier score (BS) as evaluation indexes of the survival predictions. The C-index, a commonly used indicator of discrimination in survival prediction, is the proportion of comparable patient pairs in which the patient predicted to be at higher risk experiences the event earlier. The BS [30, 31] is used to evaluate the accuracy of a predicted survival function at a given time (t). It represents the average squared distance between the actual survival status and the predicted survival probability, and its value is always a number between 0 and 1, where 0 is the best possible value. Given a dataset of \(\text{N}\) samples, \(\forall \text{i} \in \left[1, \text{N}\right], (\overrightarrow{{\text{x}}_{\text{i}}},{{\updelta }}_{\text{i}},{\text{T}}_{\text{i}})\) is the format of a datapoint, and the predicted survival function is \(\widehat{\text{S}}\left(\text{t}, \overrightarrow{{\text{x}}_{\text{i}}}\right), \forall \text{t} \in {\mathbb{R}}^{+}\). In the absence of right censoring, the BS can be calculated as:

$$\text{B}\text{S}\left(\text{t}\right)=\frac{1}{\text{N}}\sum\limits _{\text{i}=1}^{\text{N}}{({1}_{{\text{T}}_{\text{i}}>\text{t}}-\widehat{\text{S}}\left(\text{t}, \overrightarrow{{\text{x}}_{\text{i}}}\right))}^{2}$$

However, if the dataset contains right-censored samples, the score needs to be adjusted by weighting the squared distances using the inverse probability of censoring weights method. Let \(\widehat{\text{G}}\left(\text{t}\right)=\text{P}[\text{C}>\text{t}]\) be the estimator of the conditional survival function of the censoring times calculated using the Kaplan-Meier method, where \(\text{C}\) is the censoring time.

$$\text{B}\text{S}\left(\text{t}\right)=\frac{1}{\text{N}}\sum\limits_{\text{i}=1}^{\text{N}}\left(\frac{{(0-\widehat{\text{S}}\left(\text{t}, \overrightarrow{{\text{x}}_{\text{i}}}\right))}^{2}\cdot {1}_{{\text{T}}_{\text{i}}\le \text{t},{{\updelta }}_{\text{i}}=1}}{\widehat{\text{G}}\left({\text{T}}_{\text{i}}^{-}\right)}+\frac{{(1-\widehat{\text{S}}\left(\text{t}, \overrightarrow{{\text{x}}_{\text{i}}}\right))}^{2}\cdot {1}_{{\text{T}}_{\text{i}}>\text{t}}}{\widehat{\text{G}}\left(\text{t}\right)}\right)$$

In terms of benchmarks, a useful model will have a BS below 0.25. Indeed, it is easy to see that if \(\forall \text{i} \in \left[1, \text{N}\right], \widehat{\text{S}}\left(\text{t}, \overrightarrow{{\text{x}}_{\text{i}}}\right)=0.5\) then \(\text{B}\text{S}\left(\text{t}\right)=0.25\).
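The censoring-weighted Brier score can be sketched in a few lines of plain Python. The code below is an illustration under stated assumptions rather than the software used in the study: function names are hypothetical, the censoring distribution is estimated with a basic Kaplan-Meier routine, and ties between events and censoring at the same time are handled naively. A patient who had the event by time t is weighted by the censoring survival just before the event time, a patient still under observation after t is weighted by the censoring survival at t, and a patient censored before t contributes nothing.

```python
from typing import Callable, List, Sequence, Tuple


def km_censoring(times: Sequence[float], events: Sequence[int]) -> Callable[..., float]:
    """Kaplan-Meier estimate of G(t) = P(C > t), treating censoring (event == 0)
    as the outcome of interest. Ties are handled naively (assumption)."""
    pairs = sorted(zip(times, events))
    n_at_risk, g, i = len(pairs), 1.0, 0
    steps: List[Tuple[float, float]] = []
    while i < len(pairs):
        t, n_tied, n_censored = pairs[i][0], 0, 0
        while i < len(pairs) and pairs[i][0] == t:
            n_tied += 1
            n_censored += 1 - pairs[i][1]
            i += 1
        g *= 1.0 - n_censored / n_at_risk
        steps.append((t, g))
        n_at_risk -= n_tied

    def G(t: float, left_limit: bool = False) -> float:
        value = 1.0
        for step_time, step_value in steps:
            if step_time < t or (step_time == t and not left_limit):
                value = step_value
            else:
                break
        return value

    return G


def brier_score(t: float, times: Sequence[float], events: Sequence[int],
                surv_probs: Sequence[float]) -> float:
    """IPCW Brier score at time t; surv_probs[i] is the predicted S(t | x_i)."""
    G = km_censoring(times, events)
    total = 0.0
    for T, delta, s in zip(times, events, surv_probs):
        if T <= t and delta == 1:      # event observed by time t
            total += (0.0 - s) ** 2 / G(T, left_limit=True)
        elif T > t:                    # still event-free at time t
            total += (1.0 - s) ** 2 / G(t)
        # patients censored before t contribute zero
    return total / len(times)


if __name__ == "__main__":
    print(brier_score(12.0, [5, 10, 15, 20], [1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5]))
```

With no censoring and a constant prediction of 0.5, the sketch reproduces the 0.25 benchmark mentioned above.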

We performed other statistical analyses using GraphPad 9.0 software. We considered P values < 0.05 statistically significant.

Results

Clinical and molecular characteristics of patients with DLBCL

Table 1 lists the clinical and molecular characteristics of the 401 patients in this study. The median age of onset was 58 years and the proportion of male participants was slightly higher than that of female participants. Nearly half of patients (49.6%) had an Ann Arbor stage of III–IV. One fifth of patients had B symptoms. Most patients had the non-germinal center B cell (non-GCB) subtype according to the cell of origin of the lymphoma. We determined the status of the MYC, BCL2, and BCL6 genes using FISH tests in all patients, and we found that 296 patients (73.8%) had at least one genetic abnormality. We identified 16 cases of double-hit lymphoma/triple-hit lymphoma (DHL/THL), including 1 case of THL, 10 of MYC and BCL6 DHL, and 5 of MYC and BCL2 DHL. In total, 381 patients received RCHOP-like regimens and only 20 (5.0%) received CHOP-like regimens. We also performed univariate and multivariate analyses using the log-rank test and Cox regression (see Table S1 and Table S2).

Table 1 Clinical and molecular characteristics of the 401 patients

Importance of variables

We used the RSF method to assess the importance of 22 clinical and pathological variables, as shown in Fig. 3. We found that the prediction error rate decreased significantly with an increase in the number of survival trees (Fig. 3a). When survival trees increase to a certain number, the error rate curve flattens out. We found that the error rate was lowest near 300 trees (within a range of 0–500 trees).

Fig. 3

Error rate curve and feature weights. a Plot of error rate according to the number of survival trees; (b) Comparison of the importance of all 22 factors. The error rate of the model stabilizes at around 0.31. Based on the feature weights, the top five most important indicators can be identified as extranodal involvement sites, LDH, MYC gene status, AMC, and PLT. LDH, lactate dehydrogenase; ALC, absolute lymphocyte count; AMC, absolute monocyte count; ECOG, Eastern Cooperative Oncology Group score; β2M, β2 microglobulin; PLT, platelet count; ANC, absolute neutrophil count; CRP, C-reactive protein; WBC, white blood cells; Hb, haemoglobin; COO, cell of origin

Figure 3b lists the weights of the different variables. The top five variables ranked by weight were extranodal involvement sites, lactate dehydrogenase (LDH), MYC gene status, absolute monocyte count (AMC), and platelet count (PLT).

Both the VIMP and minimum depth methods are commonly used for variable screening when employing the RSF model. VIMP represents the difference between the new prediction error rate, obtained after the information carried by a variable is perturbed (for example, by permuting its values), and the original error rate. A VIMP value < 0 indicates that the variable reduces the prediction accuracy, while a VIMP value > 0 indicates that the variable improves the prediction accuracy. The minimum depth method assesses the importance of each variable by the depth, measured from the root node, at which the variable first splits a tree; smaller depths indicate greater importance.
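The idea behind VIMP can be illustrated with a generic permutation-importance sketch. This is not the randomForestSRC implementation (which works on out-of-bag data within the forest); the function names, the toy predictor, and the use of mean squared error as the error measure are all assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def mean_squared_error(predictions: Sequence[float], outcomes: Sequence[float]) -> float:
    return sum((p - y) ** 2 for p, y in zip(predictions, outcomes)) / len(outcomes)


def permutation_vimp(predict: Callable[[Row], float],
                     error_fn: Callable[[Sequence[float], Sequence[float]], float],
                     rows: List[Row], outcomes: Sequence[float],
                     column: str, seed: int = 0) -> float:
    """Error after permuting one column minus the original error.

    Positive values mean the variable helps prediction; values near zero or
    below mean it does not."""
    rng = random.Random(seed)
    baseline = error_fn([predict(r) for r in rows], outcomes)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    permuted_rows = [{**r, column: v} for r, v in zip(rows, shuffled)]
    permuted = error_fn([predict(r) for r in permuted_rows], outcomes)
    return permuted - baseline


if __name__ == "__main__":
    # Toy data: the hypothetical predictor uses only "ldh" and ignores "plt".
    rows = [{"ldh": 1.0, "plt": 5.0}, {"ldh": 2.0, "plt": 3.0},
            {"ldh": 3.0, "plt": 9.0}, {"ldh": 4.0, "plt": 1.0}]
    outcomes = [1.0, 2.0, 3.0, 4.0]
    model = lambda r: r["ldh"]
    print(permutation_vimp(model, mean_squared_error, rows, outcomes, "plt"))
```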

Table 2 lists the VIMP values and depth of different variables. Figure 4 shows a scatter plot comparing the two methods. In the plot, blue points represent VIMP values > 0, and red points represent VIMP values < 0. Points on the red diagonal dotted line indicate that the two methods rank a given variable similarly. Points above the diagonal dotted line indicate a higher VIMP ranking, and points below the diagonal dotted line indicate a higher minimum depth ranking. We obtained similar results with both variable selection methods; extranodal involvement sites, LDH, MYC gene status, AMC, and PLT seem to be important variables for prognosis.

Fig. 4

Scatter plot of the VIMP and minimum depth methods. Blue points represent VIMP values > 0, and red points represent VIMP values < 0. Points on the red diagonal dotted line indicate that the two methods rank a given variable similarly. Points above the diagonal dotted line indicate a higher VIMP ranking, and points below the diagonal dotted line indicate a higher minimum depth ranking (e.g. AMC and PLT are above the red diagonal dotted line, indicating that these two variables have a higher VIMP ranking). LDH, lactate dehydrogenase; ALC, absolute lymphocyte count; AMC, absolute monocyte count; ECOG, Eastern Cooperative Oncology Group; β2M, β2 microglobulin; PLT, platelet count; ANC, absolute neutrophil count; CRP, C-reactive protein; WBC, white blood cells; Hb, haemoglobin; COO, cell of origin

Table 2 Importance of variables according to the minimum depth and VIMP methods

Comparison of the new McPM and the IPI for OS prediction

We assessed the goodness of fit of the two models using the C-index. A higher C-index value indicates a better model fit. The Brier score, defined as the mean square of the difference between the predicted and actual values, allowed us to calculate the integrated Brier score (IBS), an overall measure of model prediction performance. The lower the IBS, the higher the prediction accuracy of the model. Compared with the IPI, the McPM had a superior OS prediction accuracy due to its higher C-index (McPM, 0.8672 vs. IPI, 0.8025; Table 3) and lower IBS (McPM, 0.1296 vs. IPI, 0.2159; Table 3).
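For readers unfamiliar with the C-index, the sketch below computes Harrell's version from observed times, event indicators, and model risk scores. Which C-index variant the study software used is not stated here, so this should be read as an illustration of the concept; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C-index: among comparable pairs, the fraction in which the
    patient with the earlier observed event has the higher predicted risk.
    Tied risk scores count as half-concordant."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event before
            # patient j's observed (event or censoring) time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [2.0, 4.0, 6.0, 8.0]
    events = [1, 1, 0, 1]
    print(concordance_index(times, events, [4.0, 3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the C-index values in Table 3 are reported.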

Table 3 Comparison of OS between the IPI and McPM

The receiver operating characteristic (ROC) curve and area under curve (AUC) are important for evaluating prognostic model discrimination. Figure 5 shows the change in AUC values for OS or PFS predictions as survival time increases. For OS prediction (Fig. 5a), the new score outperformed the IPI model over a continuous 80-month period, maintaining stable AUC values ranging from 0.8 to 0.9. In contrast, the AUC values of the IPI were less stable, with predicted AUC values ranging from 0.4 to 0.8. The predicted AUC values within 1 year ranged from 0.4 to 0.7 with large fluctuations, while the predicted AUC values after 1 year ranged from 0.7 to 0.8 with small fluctuations. The gap between the two models widened over time. Figure 6 shows the ROC curves of the new score and the IPI models for OS predictions. Compared with the 1-, 3-, and 5-year survival AUCs of the IPI, the values of the new model increased by approximately 6%, 4%, and 15%, respectively. Overall, the new model showed better OS predictive performance than the IPI.

Fig. 5

Comparison of the AUC values at various time points of the two models (McPM and IPI). a The relationship between survival time and the AUC value in OS prediction; (b) The relationship between survival time and the AUC value in PFS prediction. AUC stands for area under the curve, which is a metric used to evaluate the performance of a binary classification model. A higher AUC value suggests better discriminatory power. AUC, area under the curve; McPM, molecular-contained prognostic model; IPI, International Prognostic Index; OS, overall survival; PFS, progression-free survival

Fig. 6

Comparison of the 1- (a), 3- (b), and 5-year (c) ROC curves of different models (McPM and IPI for OS prediction). It is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The larger the area under the curve, the better the classification performance of the model. ROC, Receiver operating characteristic; OS, Overall survival; McPM, Molecular- contained prognostic model; IPI, International Prognostic Index

Comparison of the new McPM and the IPI for PFS prediction

We further explored the predictive performance of the two scoring systems for PFS. The McPM provided better PFS discrimination than the IPI, as indicated by its higher C-index value (0.8394 vs. 0.7308; Table 4) and lower IBS (0.1619 vs. 0.1712; Table 4).

Table 4 Comparison of progression-free survival values between the IPI and the McPM

The AUC curve values for PFS (Fig. 5b) also differed significantly between the two models. During the continuous 80-month period, the AUC values of the new model remained stable between 0.75 and 0.9, and they began to decrease gradually after 60 months. By contrast, the predicted IPI AUC values ranged from 0.7 to 0.8 and showed an overall decreasing trend with time. Figure 7 shows the 1-, 3‐, and 5‐year ROC curves for PFS predictions, for both the McPM and the IPI. While the 1-year AUC value for the McPM was slightly lower than that of the IPI (0.80 vs. 0.82), the new model had better AUC values for 3- and 5-year survival, by approximately 12% in both cases. Overall, despite a slightly lower 1-year AUC, the new model demonstrated better PFS predictive performance than the IPI.

Fig. 7

Comparison of the 1- (a), 3- (b), and 5-year (c) ROC curves of different models (McPM and IPI) of PFS. It is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The larger the area under the curve, the better the classification performance of the model. ROC, receiver operating characteristic; PFS, progression-free survival; McPM, molecular-contained prognostic model; IPI, International Prognostic Index

Comparison of two models with actual outcomes

To evaluate how closely each prediction model matched real-world outcomes, we compared the predicted survival curves with the observed ones (Fig. 8). In terms of OS, the alignment of the IPI with the actual probability (global true) was slightly stronger than that of the new model within the first 40 months. Thereafter, however, the IPI deviated from the global true considerably more than the new model did, and the disparity increased gradually over time. In general, the new model exhibited better alignment with the actual outcomes than the IPI (Fig. 8a).

Fig. 8

Model fit for OS (a) and PFS (b). The closer the fitted curve is to the true curve, the higher the degree of fit of the model. OS, overall survival; PFS, progression-free survival; IPI, International Prognostic Index

In terms of PFS, both the new model and the IPI aligned well with the actual probability within the first 12 months (Fig. 8b). We found slight deviation between the new model’s predictions and the actual PFS between 12 and 30 months, and the new model’s predictions largely aligned with the global true after 30 months. By contrast, the deviation between the IPI and the actual PFS increased gradually over time. Overall, the predictive advantage of the new model for PFS was significant.

Clinical application of the new sMcPM

The calculations required to operate the McPM are too complex to be widely applicable in clinical practice. Therefore, we designed a simplified version, the sMcPM. According to the weights of the variables in Table 2, we selected the top five contributors to patient survival (extranodal involvement sites, LDH, MYC gene status, AMC, and PLT). We integrated the five variables considering their relative weights; Table 5 shows the scoring criteria. Our sMcPM classified patients into three risk subgroups: a score of 0–2 indicates a low risk of poor outcomes, a score of 3–4 indicates intermediate risk, and a score of 5–7 indicates high risk.
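The mapping from a total sMcPM score to a risk subgroup can be written as a short lookup. The sketch below covers only the cut-offs stated above; the function name is hypothetical, and the per-variable points that make up the total score are defined in Table 5 and are not reproduced here.

```python
def smcpm_risk_group(total_score: int) -> str:
    """Map a total sMcPM score (0-7) to the risk subgroup named in the text."""
    if not 0 <= total_score <= 7:
        raise ValueError("sMcPM total score must be between 0 and 7")
    if total_score <= 2:
        return "low"
    if total_score <= 4:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    for score in range(8):
        print(score, smcpm_risk_group(score))
```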

Table 5 Comparison of sMcPM and IPI for stratifying patients into risk groups according to survival outcomes

We further compared the OS and PFS of the different risk groups according to the two models (Fig. 9) to visualize the stratification by prognosis. The stratification performance of OS based on the IPI and sMcPM was similar and significant in both cases (both P < 0.0001). However, stratification by IPI of the PFS was inferior to that of the sMcPM, and the former could not distinguish intermediate- from high-risk patients (P = 0.44 vs. P < 0.0001). The 1-year PFS rates of the four IPI groups were 91.10%, 71.20%, 73.30%, and 63.00% (Table 5). The 1-year PFS rates of the three sMcPM groups (low-, intermediate-, and high-risk groups) were 82.70%, 73.90%, and 58.8%. Thus, sMcPM was superior at identifying high-risk patients, stratifying patients, and predicting their prognoses. In more detail, the Cox regression results for OS and PFS for the IPI, the sMcPM, and both variables together are provided in Table S3. In conclusion, the new simplified model fully considers the important influence of molecular factors, transforms numerical variables into binary variables, and constitutes a more stable, reliable, and applicable stratification tool for patients with DLBCL. These results suggest that our sMcPM could serve as a reliable prognostic tool.

Fig. 9

Performance of the IPI (a) and the new score (sMcPM) (b) in terms of stratifying patients according to OS and PFS [(c) and (d), respectively]. The IPI and the new score are similar in prognostic stratification of OS, but the new score has an advantage in PFS. OS, overall survival; PFS, progression-free survival; IPI, International Prognostic Index

Discussion

DLBCL is the most common subtype of NHL, accounting for approximately 30–40% of all cases [32]. More than 50% of patients are cured with the first-line RCHOP regimen, but a small proportion still have a poor prognosis [33, 34]; identifying these high-risk patients is important. IPI, R-IPI [35], and NCCN-IPI [36] are widely used indexes for prognostic assessment, but none of them can clearly identify individuals with a long-term survival < 50% during rituximab treatment [16, 37]. In 2016, given the prognostic importance of MYC, BCL2, and BCL6 rearrangements in patients with lymphoma [34, 38,39,40], the World Health Organization (WHO) defined a new entity with an aggressive clinical course and poor clinical outcome, high-grade B-cell lymphomas with MYC and BCL2 and/or BCL6 rearrangements (HGBL-R), which was previously known as DHL/THL [41,42,43,44]. In other words, molecular heterogeneity is an important factor in the prognosis of patients with DLBCL, but existing scoring systems do not meet the demands of the molecular era. There are two main limitations of the IPI model. First, it only includes clinical indicators and ignores the important impact of molecular genetic heterogeneity on prognosis. Second, the use of Cox survival analysis alone to screen for independent prognostic factors is relatively outdated and cannot account for the non-linear associations between complex and diverse variables [45, 46]. Relevant studies showed that even with the same IPI score, the prognoses of patients with DLBCL may differ significantly [5, 17]. Thus, we created the McPM for patients with DLBCL to integrate the complex prognostic factors in the molecular era.

The incorporation of prognostic indicators linked to molecular genetics is one of the two main strengths of our McPM. Domestic and international studies have confirmed the prognostic significance of MYC, BCL2, and BCL6 gene abnormalities in patients with DLBCL [17, 19, 45]. Tzankov et al. analysed 432 patients and found that those with MYC gene rearrangements had a worse prognosis than those without (median OS rate, 42% vs. 86%, P = 0.038; median PFS, 42 vs. 75 months, P = 0.049) [47]. A similar study reported that patients with MYC amplification also had poorer OS than those with normal MYC gene status, as did those with rearrangements (both P < 0.01) [48]. Obermann et al. performed FISH testing of BCL2 gene status in 224 patients with newly diagnosed DLBCL. In patients with the non-GCB subtype, the presence of any BCL2 gene abnormality correlated with a shorter median OS (12 vs. 109 months, P = 0.003) [49]. Similarly, Huang et al. showed that the prognosis of patients with BCL2 gene rearrangement or amplification was significantly worse than that of patients with normal gene status [50]. Akyurek et al. analysed the gene rearrangement status of 239 DLBCL cases and found that patients with BCL6 rearrangements, particularly those with the non-GCB subtype, had a worse prognosis, suggesting that BCL6 rearrangement may be a biomarker of an aggressive disease course in non-GCB subgroups [17]. In another study, a trend toward inferior OS was observed in patients with BCL6 rearrangements who received immunochemotherapy (P = 0.08) [51]. It is worth noting that we previously found that patients with MYC and BCL2 and/or BCL6 gene rearrangements exhibited an aggressive clinical course and a poor response to a first-line RCHOP regimen, with 5-year OS and PFS rates of 27% and 18%, respectively [52]. In conclusion, MYC, BCL2, and BCL6 are associated with poor prognosis, and their inclusion in our scoring system facilitates stratification.

The screening method for prognostic indicators is another advantage of our McPM. We used artificial intelligence (AI) to create a new prognostic model with potential to aid accurate diagnosis, treatment, and prognostic assessment of tumours [53]. After collecting 22 variables, including clinical and pathological features and laboratory and molecular genetic test results, from 401 patients with DLBCL in our center, we used the RSF method to demonstrate the importance of each variable. We obtained identical results using the VIMP and minimum depth methods in terms of the variables selected. A large VIMP means that a variable has a large impact on model accuracy and is therefore important. By contrast, the minimum depth method assigns smaller values to more important variables. The selection of the same variables by the two methods confirmed the prognostic value thereof. We constructed the McPM based on Bi-LSTM and logistic hazard techniques. Similar models used for other cancers, including lung cancer [54] and nasopharyngeal cancer [55], achieved early successes. We believe that our AI-based approach can deal with complex non-linear associations between variables and has unique advantages over traditional Cox regression analyses when dealing with survival data [27].

We compared the OS and PFS predictions between the IPI and McPM in several respects. Compared with the IPI, the McPM had a higher C-index and a lower IBS for OS and PFS. According to the 1-, 3-, and 5-year ROC curve analyses for OS, the area under the curve of the McPM was larger than that of the IPI. Notably, our results supported an association between survival time and the AUC value. With longer survival times, the gap between the McPM and IPI results gradually widened; the AUC value of the new model was consistently high, while that of the IPI decreased gradually. This implies that the McPM has a clear advantage for predicting long-term survival. The McPM can identify a high-risk subgroup of patients with long-term survival < 50%, which many other models have failed to achieve [16]. In the comparison of the ROC curves for PFS, the area under the curve of the McPM was similar to or larger than that of the IPI, which suggests that the McPM can help identify patients with poor treatment responses and predict disease progression, which are important factors for appropriate, individualized treatment. According to the AUC values (Fig. 5b) for PFS, the two models in this study differed within and after 1 year of diagnosis, which may further reflect the heterogeneity of DLBCL. Future clinical assessment could be improved by combining the two scoring systems. By comparing the predictions and actual outcomes, we found that the results of the McPM were more accurate than those of the IPI. In conclusion, the McPM had better prediction accuracy and stability, achieved by the integration of comprehensive prognostic indicators. The McPM could predict individual events, and we believe it is a suitable stratification tool for the molecular era.

The McPM calculation process is too complicated to be used directly in clinical practice. Therefore, we established a simplified model, the sMcPM, by selecting 5 of the 22 variables according to their weights (extranodal involvement sites, LDH, MYC gene status, AMC, and PLT). The number of extranodal involvement sites was the most important factor in our study. Moreover, we explored the impact of different numbers of extranodal sites on patient survival. Patients with more sites of extranodal involvement seemed to have a worse prognosis, similar to the findings of El-Galaly et al. [56]. Elevated LDH, a meaningful tumour biomarker, is often associated with a worse prognosis [57]. A relevant study suggested that LDH may reflect the severity of disease [58]. MYC gene status is the only molecular genetic factor included in our scoring system. Studies have shown that MYC alterations are often associated with a poor prognosis [48, 52, 59]. Monocyte counts are also associated with the survival and prognosis of patients with lymphoma [60, 61]. Platelets play an important role in tumour immunity and angiogenesis, both of which are closely related to prognosis [62,63,64]. Our new model comprises these five indicators and is easy for clinicians to use. The sMcPM performs similarly to the IPI in predicting OS, but it has a significant advantage in PFS prognostic stratification. Studies have shown that neither the IPI nor the R-IPI can define a high-risk group with a 3-year PFS < 50% [56], while the sMcPM can identify a group of patients with a 3-year PFS of 30.8%. This suggests that the indicators included in our new model have an important impact on disease progression, and our model may help guide subsequent clinical treatment with new drugs. We believe the sMcPM may become a widely used stratification tool in the molecular age.
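A statement such as "a 3-year PFS of 30.8%" for a risk group is typically read off a Kaplan-Meier curve. The sketch below shows how such an estimate can be computed at a fixed horizon. It is only an illustration under stated assumptions: the function name `kaplan_meier_at` and the toy follow-up data are hypothetical and are not taken from the study cohort.

```python
def kaplan_meier_at(times, events, horizon):
    """Kaplan-Meier estimate of the survival probability at `horizon`.

    times  : follow-up time for each patient (e.g. in months)
    events : 1 if progression or death was observed, 0 if censored
    """
    survival = 1.0
    # Distinct observed event times up to and including the horizon.
    event_times = sorted(
        {t for t, e in zip(times, events) if e == 1 and t <= horizon}
    )
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - failed / at_risk
    return survival


# Toy follow-up data in months (not from the study); 0 marks censoring.
toy_times = [6, 12, 18, 24, 40, 48]
toy_events = [1, 1, 0, 1, 0, 0]
pfs_36 = kaplan_meier_at(toy_times, toy_events, horizon=36)
print(round(pfs_36, 3))
```

For these six hypothetical patients the estimate is (5/6) × (4/5) × (2/3) = 4/9, or about 44%. The patient censored at 18 months leaves the risk set without counting as an event, which is what distinguishes this estimate from a simple proportion.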

FISH is not a routine test in clinical practice because of its high cost [34]. Therefore, few large-sample studies have been able to report complete gene rearrangement data for patients with DLBCL, which limits the assessment of their prognostic impact. In response, we incorporated 22 variables from 401 patients, including both clinical and molecular factors, to construct a prognostic scoring system that considers molecular heterogeneity. While the establishment of the McPM is a step forward, our study has some limitations. The McPM includes too many factors, which makes it difficult to calculate and therefore unsuitable for routine clinical use. The simplified sMcPM retains the most important information; however, it did not outperform the IPI in terms of OS prediction. In addition, as this is a retrospective study, validation in a large cohort is needed.

Conclusions

We established a new prognostic model that is superior to the IPI in terms of prognostic prediction accuracy and stability. Moreover, the sMcPM can better meet the demand for prognostic stratification in the molecular era, and we expect it will become a widely used stratification tool with the ability to guide personalized clinical treatments.

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

NHL:

Non-Hodgkin lymphoma

DLBCL:

Diffuse large B-cell lymphoma

RCHOP:

Rituximab plus cyclophosphamide, doxorubicin, vincristine and prednisone

IPI:

International Prognostic Index

R-IPI:

Revised International Prognostic Index

NCCN-IPI:

National Comprehensive Cancer Network IPI

TAMs:

Tumour-associated macrophages

MYC-R:

MYC rearrangement

BCL2-R:

BCL2 rearrangement

BCL6-R:

BCL6 rearrangement

SHL:

Single-hit lymphoma

FISH:

Fluorescence in situ hybridization

DAPI:

4',6-Diamidino-2-phenylindole

HE:

Hematoxylin-eosin

OS:

Overall survival

PFS:

Progression-free survival

VIMP:

Variable importance

RSF:

Random Survival Forest

McPM:

Molecular-contained Prognostic Model

Bi-LSTM:

Bidirectional long short-term memory

AMC:

Absolute monocyte count

PLT:

Platelet

LDH:

Lactate dehydrogenase

ECOG:

Eastern Cooperative Oncology Group

C-index:

Concordance index

BS:

Brier Score

DHL/THL:

Double-hit/Triple-hit lymphoma

IBS:

Integrated Brier Score

ROC:

Receiver operating characteristic

AUC:

Area under the curve

sMcPM:

Simplified McPM

WHO:

World Health Organization

(HGBL,R):

High Grade B-cell Lymphomas with MYC and BCL2 and/or BCL6 rearrangements

AI:

Artificial Intelligence

References

  1. Maurer MJ, Ghesquières H, Jais JP, Witzig TE, Haioun C, Thompson CA, et al. Event-free survival at 24 months is a robust end point for disease-related outcome in diffuse large B-cell lymphoma treated with immunochemotherapy. J Clin Oncol. 2014;32(10):1066–73.

  2. Flowers CR, Odejide OO. Sequencing therapy in relapsed DLBCL. Hematol Am Soc Hematol Educ Program. 2022;2022(1):146–54.

  3. Patriarca A, Gaidano G. Investigational drugs for the treatment of diffuse large B-cell lymphoma. Expert Opin Investig Drugs. 2021;30(1):25–38.

  4. He J, Chen Z, Xue Q, Sun P, Wang Y, Zhu C, et al. Identification of molecular subtypes and a novel prognostic model of diffuse large B-cell lymphoma based on a metabolism-associated gene signature. J Transl Med. 2022;20(1):186.

  5. Coiffier B, Lepage E, Briere J, Herbrecht R, Tilly H, Bouabdallah R, et al. CHOP chemotherapy plus rituximab compared with CHOP alone in elderly patients with diffuse large-B-cell lymphoma. N Engl J Med. 2002;346(4):235–42.

  6. International Non-Hodgkin’s Lymphoma Prognostic Factors Project. A predictive model for aggressive non-hodgkin’s lymphoma. N Engl J Med. 1993;329(14):987–94.

  7. Gao H, Wu B, Jin H, Yang W. A 6-lncRNA signature predicts prognosis of diffuse large B-cell lymphoma. J Biochem Mol Toxicol. 2021;35(6):1–12.

  8. Nam SJ, Go H, Paik JH, Kim TM, Heo DS, Kim CW, et al. An increase of M2 macrophages predicts poor prognosis in patients with diffuse large B-cell lymphoma treated with rituximab, cyclophosphamide, doxorubicin, vincristine and prednisone. Leuk Lymphoma. 2014;55(11):2466–76.

  9. Marchesi F, Cirillo M, Bianchi A, Gately M, Olimpieri OM, Cerchiara E, et al. High density of CD68+/CD163+ tumour-associated macrophages (M2-TAM) at diagnosis is significantly correlated to unfavorable prognostic factors and to poor clinical outcomes in patients with diffuse large B-cell lymphoma. Hematol Oncol. 2015;33(2):110–2.

  10. Carreras J, Kikuti YY, Hiraiwa S, Miyaoka M, Tomita S, Ikoma H, et al. High PTX3 expression is associated with a poor prognosis in diffuse large B-cell lymphoma. Cancer Sci. 2021;113(1):334–48.

  11. Takahara T, Nakamura S, Tsuzuki T, Satou A. The Immunology of DLBCL. Cancers (Basel). 2023;15(3):835.

  12. Yan J, Yuan W, Zhang J, Li L, Zhang L, Zhang X, et al. Identification and validation of a prognostic prediction model in diffuse large B-Cell lymphoma. Front Endocrinol (Lausanne). 2022;13:846357.

  13. Sun F, Zhu J, Lu S, Zhen Z, Wang J, Huang J, et al. An inflammation-based cumulative prognostic score system in patients with diffuse large B cell lymphoma in Rituximab era. BMC Cancer. 2018;18(1):5.

  14. Zhang W, Yang L, Guan YQ, Shen KF, Zhang ML, Cai HD, et al. Novel bioinformatic classification system for genetic signatures identification in diffuse large B-cell lymphoma. BMC Cancer. 2020;20(1):714.

  15. Wang G, Qiu C, Zhang C, Hou S, Zhang Q. Construction of a DLBCL Prognostic signature based on Tumor Microenvironment. Expert Rev Hematol. 2021;14(7):679–86.

  16. Ruppert AS, Dixon JG, Salles G, Wall A, Cunningham D, Poeschel V, et al. International prognostic indices in diffuse large B-cell lymphoma: a comparison of IPI, R-IPI, and NCCN-IPI. Blood. 2020;135(23):2041–8.

  17. Akyurek N, Uner A, Benekli M, Barista I. Prognostic significance of MYC, BCL2, and BCL6 rearrangements in patients with diffuse large B-cell lymphoma treated with cyclophosphamide, doxorubicin, vincristine, and prednisone plus rituximab. Cancer. 2012;118(17):4173–83.

  18. Cunningham D, Hawkes EA, Jack A, Qian W, Smith P, Mouncey P, et al. Rituximab plus Cyclophosphamide, doxorubicin, vincristine, and prednisolone in patients with newly diagnosed diffuse large B-cell non-hodgkin lymphoma: a phase 3 comparison of dose intensification with 14-day versus 21-day cycles. Lancet. 2013;381(9880):1817–26.

  19. Horn H, Ziepert M, Becher C, Barth TF, Bernd HW, Feller AC, et al. MYC status in concert with BCL2 and BCL6 expression predicts outcome in diffuse large B-cell lymphoma. Blood. 2013;121(12):2253–63.

  20. Miyaoka M, Kikuti YY, Carreras J, Ito A, Ikoma H, Tomita S, et al. Copy Number Alteration and Mutational Profile of High-Grade B-Cell lymphoma with MYC and BCL2 and/or BCL6 rearrangements, diffuse large B-Cell lymphoma with MYC-Rearrangement, and diffuse large B-Cell lymphoma with MYC-Cluster amplification. Cancers (Basel). 2022;14(23):5849.

  21. García R, Hussain A, Chen W, Wilson K, Koduru P. An artificial intelligence system applied to recurrent cytogenetic aberrations and genetic progression scores predicts MYC rearrangements in large B-cell lymphoma. EJHaem. 2022;3(3):707–21.

  22. Carreras J, Roncador G, Hamoudi R. Artificial Intelligence Predicted overall survival and classified mature B-Cell neoplasms based on immuno-oncology and immune checkpoint panels. Cancers (Basel). 2022;14(21):5318.

  23. Viswanathan A, Kundal K, Sengupta A, Kumar A, Kumar KV, Holmes AB, et al. Deep learning-based classifier of diffuse large B-cell lymphoma cell-of-origin with clinical outcome. Brief Funct Genomics. 2023;22(1):42–8.

  24. Zaccaria GM, Altini N, Mezzolla G, Vegliante MC, Stranieri M, Pappagallo SA, et al. SurvIAE: survival prediction with interpretable autoencoders from diffuse large B-Cells lymphoma gene expression data. Comput Methods Programs Biomed. 2024;244:107966.

  25. Carreras J, Hiraiwa S, Kikuti YY, Miyaoka M, Tomita S, Ikoma H, et al. Artificial neural networks predicted the overall survival and molecular subtypes of diffuse large B-Cell lymphoma using a Pancancer Immune-Oncology Panel. Cancers. 2021;13(24):6384.

  26. Carreras J, Hamoudi R, Nakamura N. Artificial intelligence analysis of gene expression data predicted the prognosis of patients with diffuse large B-Cell lymphoma. Tokai J Exp Clin Med. 2020;45(1):37–48.

  27. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Stat. 2008;2(3):841–60.

  28. Taylor JM. Random survival forests. J Thorac Oncol. 2011;6(12):1974–5.

  29. Brown CC. On the use of indicator variables for studying the time-dependence of parameters in a response-time model. Biometrics. 1975;31(4):863–72.

  30. Graf E, Schmoor C, Sauerbrei W, Schumacher M. Assessment and comparison of prognostic classification schemes for survival data. Stat Med. 1999;18(17–18):2529–45.

  31. Gerds TA, Schumacher M. Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biom J. 2006;48(6):1029–40.

  32. Smith A, Howell D, Patmore R, Jack A, Roman E. Incidence of haematological malignancy by sub-type: a report from the haematological malignancy research network. Br J Cancer. 2011;105(11):1684–92.

  33. Chapuy B, Stewart C, Dunford AJ, Kim J, Kamburov A, Redd RA, et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat Med. 2018;24(5):679–90.

  34. Liu Y, Barta SK. Diffuse large B-cell lymphoma: 2019 update on diagnosis, risk stratification, and treatment. Am J Hematol. 2019;94(5):604–16.

  35. Sehn LH, Berry B, Chhanabhai M, Fitzgerald C, Gill K, Hoskins P, et al. The revised International Prognostic Index (R-IPI) is a better predictor of outcome than the standard IPI for patients with diffuse large B-cell lymphoma treated with R-CHOP. Blood. 2007;109(5):1857–61.

  36. Zhou Z, Sehn LH, Rademaker AW, Gordon LI, Lacasce AS, Crosby-Thompson A, et al. An enhanced International Prognostic Index (NCCN-IPI) for patients with diffuse large B-cell lymphoma treated in the Rituximab era. Blood. 2014;123(6):837–42.

  37. Montalban C, Diaz-Lopez A, Dlouhy I, Rovira J, Lopez-Guillermo A, Alonso S, et al. Validation of the NCCN-IPI for diffuse large B-cell lymphoma (DLBCL): the addition of beta(2)-microglobulin yields a more accurate GELTAMO-IPI. Br J Haematol. 2017;176(6):918–28.

  38. Li S, Desai P, Lin P, Yin CC, Tang G, Wang XJ, et al. MYC/BCL6 double-hit lymphoma (DHL): a tumour associated with an aggressive clinical course and poor prognosis. Histopathology. 2016;68(7):1090–8.

  39. Akay OM, Aras BD, Isiksoy S, Toprak C, Mutlu FS, Artan S, et al. BCL2, BCL6, IGH, TP53, and MYC protein expression and gene rearrangements as prognostic markers in diffuse large B-cell lymphoma: a study of 44 Turkish patients. Cancer Genet. 2014;207(3):87–93.

  40. Wang W, Hu S, Lu X, Young KH, Medeiros LJ. Triple-hit B-cell lymphoma with MYC, BCL2, and BCL6 Translocations/Rearrangements: clinicopathologic features of 11 cases. Am J Surg Pathol. 2015;39(8):1132–9.

  41. Novo M, Castellino A, Nicolosi M, Santambrogio E, Vassallo F, Chiappella A, et al. High-grade B-cell lymphoma: how to diagnose and treat. Expert Rev Hematol. 2019;12(7):497–506.

  42. Rosenthal A, Younes A. High grade B-cell lymphoma with rearrangements of MYC and BCL2 and/or BCL6: double hit and triple hit lymphomas and double expressing lymphoma. Blood Rev. 2017;31(2):37–42.

  43. Stengel A, Kern W, Meggendorfer M, Haferlach T, Haferlach C. Detailed molecular analysis and evaluation of prognosis in cases with high grade B-cell lymphoma with MYC and BCL2 and/or BCL6 rearrangements. Br J Haematol. 2019;185(5):951–4.

  44. Dunleavy K. Double-hit lymphoma: optimizing therapy. Hematol Am Soc Hematol Educ Program. 2021;2021(1):157–63.

  45. Wight JC, Chong G, Grigg AP, Hawkes EA. Prognostication of diffuse large B-cell lymphoma in the molecular era: moving beyond the IPI. Blood Rev. 2018;32(5):400–15.

  46. Ma SY, Tian XP, Cai J, Su N, Fang Y, Zhang YC, et al. A prognostic immune risk score for diffuse large B-cell lymphoma. Br J Haematol. 2021;194(1):111–9.

  47. Tzankov A, Xu-Monette ZY, Gerhard M, Visco C, Dirnhofer S, Gisin N, et al. Rearrangements of MYC gene facilitate risk stratification in diffuse large B-cell lymphoma patients treated with rituximab-CHOP. Mod Pathol. 2014;27(7):958–71.

  48. Quesada AE, Medeiros LJ, Desai PA, Lin P, Westin JR, Hawsawi HM, et al. Increased MYC copy number is an independent prognostic factor in patients with diffuse large B-cell lymphoma. Mod Pathol. 2017;30(12):1688–97.

  49. Obermann EC, Csato M, Dirnhofer S, Tzankov A. BCL2 gene aberration as an IPI-independent marker for poor outcome in non-germinal-centre diffuse large B cell lymphoma. J Clin Pathol. 2009;62(10):903–7.

  50. Huang S, Nong L, Wang W, Liang L, Zheng Y, Liu J, et al. Prognostic impact of diffuse large B-cell lymphoma with extra copies of MYC, BCL2 and/or BCL6: comparison with double/triple hit lymphoma and double expressor lymphoma. Diagn Pathol. 2019;14(1):81.

  51. Shustik J, Han G, Farinha P, Johnson NA, Ben Neriah S, Connors JM, et al. Correlations between BCL6 rearrangement and outcome in patients with diffuse large B-cell lymphoma treated with CHOP or R-CHOP. Haematologica. 2009;95(1):96–101.

  52. Zhuang Y, Che J, Wu M, Guo Y, Xu Y, Dong X, et al. Altered pathways and targeted therapy in double hit lymphoma. J Hematol Oncol. 2022;15(1):26.

  53. Dlamini Z, Francies FZ, Hull R, Marima R. Artificial intelligence (AI) and big data in cancer and precision oncology. Comput Struct Biotechnol J. 2020;18:2300–11.

  54. She Y, Jin Z, Wu J, Deng J, Zhang L, Su H, et al. Development and validation of a deep learning model for non-small cell lung cancer survival. JAMA Netw Open. 2020;3(6):e205842.

  55. Qiang M, Li C, Sun Y, Sun Y, Ke L, Xie C, et al. A prognostic predictive system based on deep learning for Locoregionally Advanced Nasopharyngeal Carcinoma. J Natl Cancer Inst. 2021;113(5):606–15.

  56. El-Galaly TC, Villa D, Alzahrani M, Hansen JW, Sehn LH, Wilson D, et al. Outcome prediction by extranodal involvement, IPI, R-IPI, and NCCN-IPI in the PET/CT and rituximab era: a danish-canadian study of 443 patients with diffuse-large B-cell lymphoma. Am J Hematol. 2015;90(11):1041–6.

  57. Ferraris AM, Giuntini P, Gaetani GF. Serum lactic dehydrogenase as a prognostic tool for non-hodgkin lymphomas. Blood. 1979;54(4):928–32.

  58. Qi J, Gu C, Wang W, Xiang M, Chen X, Fu J. Elevated lactate dehydrogenase levels display a poor prognostic factor for Non-hodgkin’s lymphoma in intensive care unit: an analysis of the MIMIC-III database combined with external validation. Front Oncol. 2021;11:753712.

  59. Barrans S, Crouch S, Smith A, Turner K, Owen R, Patmore R, et al. Rearrangement of MYC is associated with poor prognosis in patients with diffuse large B-cell lymphoma treated in the era of Rituximab. J Clin Oncol. 2010;28(20):3360–5.

  60. Marcheselli R, Franceschetto A, Sacchi S, Bari A, Levy I, Pizzichini P, et al. The prognostic role of end of treatment FDG-PET-CT in patients with diffuse large B cell lymphoma can be improved by considering it with absolute monocyte count at diagnosis. Leuk Lymphoma. 2019;60(8):1958–64.

  61. Markovic O, Popovic L, Marisavljevic D, Jovanovic D, Filipovic B, Stanisavljevic D, et al. Comparison of prognostic impact of absolute lymphocyte count, absolute monocyte count, absolute lymphocyte count/absolute monocyte count prognostic score and ratio in patients with diffuse large B cell lymphoma. Eur J Intern Med. 2014;25(3):296–302.

  62. Li M, Xia H, Zheng H, Li Y, Liu J, Hu L, et al. Red blood cell distribution width and platelet counts are independent prognostic factors and improve the predictive ability of IPI score in diffuse large B-cell lymphoma patients. BMC Cancer. 2019;19(1):1084.

  63. Zhao P, Zang L, Zhang X, Chen Y, Yue Z, Yang H, et al. Novel prognostic scoring system for diffuse large B-cell lymphoma. Oncol Lett. 2018;15(4):5325–32.

  64. Ochi Y, Kazuma Y, Hiramoto N, Ono Y, Yoshioka S, Yonetani N, et al. Utility of a simple prognostic stratification based on platelet counts and serum albumin levels in elderly patients with diffuse large B cell lymphoma. Ann Hematol. 2017;96(1):1–8.

Acknowledgements

The authors would like to thank all the doctors, nurses and patients as well as their families for their contribution to this study.

Funding

This work was financially supported by the Special Project for the Modernization of Traditional Chinese Medicine in Zhejiang Province (No. 2020ZX007).

Author information

Contributions

XC, LS and XP contributed to the study conception and design. TL, HY, JX, and XC provided study material or patients in this study. HP and XC reviewed the literature and collected the data. XG, LS, MS, XP and HP performed the statistical data analysis. XC, LS, MS and HP contributed to the interpretation of data. XC, MS and HP drafted and revised the manuscript. All authors contributed to the development of the manuscript and approved the final version. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Xiaohua Pan or Xi Chen.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Institutional Review Board of Zhejiang Cancer Hospital and conducted in accordance with the Good Clinical Practice guidelines and the Declaration of Helsinki. Informed consent was waived because of the retrospective nature of the study.

Consent for publication

All the authors have read and approved the manuscript in all respects for publication.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Peng, H., Su, M., Guo, X. et al. Artificial intelligence-based prognostic model accurately predicts the survival of patients with diffuse large B-cell lymphomas: analysis of a large cohort in China. BMC Cancer 24, 621 (2024). https://doi.org/10.1186/s12885-024-12337-z

  • DOI: https://doi.org/10.1186/s12885-024-12337-z

Keywords