Graph Neural Network for representation learning of lung cancer
BMC Cancer volume 23, Article number: 1037 (2023)
Abstract
The emergence of image-based systems to improve the precision of diagnostic pathology, which requires labeling sets or bags of instances, hinges largely on Multiple Instance Learning (MIL) for Whole Slide Images (WSIs). Contemporary works have shown excellent performance for neural networks in MIL settings. Here, we examine a graph-based model to facilitate end-to-end learning and sample suitable patches using a tile-based approach. We propose MILGNN, which employs a graph-based Variational Autoencoder with a Gaussian mixture model to discover relations between sampled patches in order to aggregate patch details into a single vector representation. We demonstrate the efficacy of our technique on the classical MIL dataset MUSK and on distinguishing two lung cancer subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC). We achieved 97.42% accuracy on the MUSK dataset and a 94.3% AUC on the classification of lung cancer subtypes using the extracted features.
Introduction
Histopathology today produces voluminous electronic image records. These records hold a wealth of information, as exemplified by the study in [1]. Yet methods for accessing and harnessing this accrued knowledge for examination, research, and training remain largely undeveloped. The lack of suitable methods to represent Whole Slide Images (WSIs) compounds this challenge and calls for a deeper exploration of efficient WSI representation techniques. Such work is all the more important given the intricacies of depicting WSIs, including sharpness, features, hues, and pathological clarity.
The advent of deep neural networks has transformed digital pathology, prompting AI specialists and pathologists to collaborate on new diagnostic approaches. As digital pathology gains wider acceptance, the demand for effective WSI evaluation grows. Deep learning has moved to the forefront of visual computing, surpassing conventional image interpretation techniques. Nevertheless, the sheer number of pixels in each WSI makes it impractical to feed a whole slide directly to a deep neural network. Recent research has therefore turned to patch-level analysis of WSIs, which requires manual annotation by experts and becomes impractical for large WSI datasets. Additionally, labels often pertain to the entire WSI rather than to individual patches, which makes it important to use information from all patches when representing a WSI.
In response, Multiple Instance Learning (MIL) has emerged as a promising approach for supervised learning on WSIs. MIL introduces the concept of “bag training”: the model is trained on a collection of bags of instances, each bag carrying a single label, which offers a pathway to WSI representation. MIL-based techniques first extract feature embeddings from image tiles with a neural network. The feature embeddings are then fed into an aggregation network to generate slide-level information. MIL techniques, applied extensively in WSI analysis, can be categorized into instance-level and embedding-level paradigms. Instance-level approaches emphasize local information, while embedding-level methods focus on global aspects. Previously, SVM-based models such as MISVM [2] were common in MIL, but more complex models such as Deep MIML [3], miNet [4], and TransMIL [5], along with attention-based methods such as ABMIL [6] and CLAM [7], have since gained prominence. Recent research has also highlighted the application of Graph Neural Networks (GNNs) within the MIL framework. Graphs are effective for modeling histopathology data extracted from WSIs because they capture spatial relationships among entities as nodes or subgraphs, and so can capture both fine-grained and global information from patches. Several studies in digital pathology have employed GNN-based approaches to address different tasks. GNNs are particularly suited to MIL problems because they are permutation invariant when each instance is represented as a node in a graph. Several methods have been developed along these lines, such as HIPT [8], a ViT architecture designed for learning from WSI image topology, and H2Graph [9], which constructs heterogeneous networks for training dense layers. ProtoMIL [10], inspired by by-example reasoning and based on graphical representations, is another approach to MIL.
This research proposes a GMMConv-based Variational Graph Autoencoder model tailored for MIL on WSIs, which compresses each WSI into a compact graph. The approach uses the maximum magnification setting of the WSIs and incorporates patch-level annotations to highlight individual WSI labels. Representing WSIs as dense graphs improves the interpretability of the final representation. In this framework, each instance is modeled as a node within a network, which makes it possible to discover the interconnections between instances. The patches are collected and organized into bags using the sliding window tiling method. A graph structure is then constructed from the node features of the stacked patches, and the interaction between patches is learned through the Gaussian mixture model representation. In this way, the method captures the interconnections among regions while efficiently learning the representation of a given WSI.
To illustrate the efficacy of the model, we conducted classification experiments on two common subtypes of lung cancer, adenocarcinoma and squamous cell carcinoma. Distinguishing between LUAD and LUSC normally requires expert review. In this article, we used MILGNN to perform subtype classification on WSIs from The Cancer Genome Atlas (TCGA), a freely accessible dataset. Our approach employs adjacency matrices to capture interactions between patches, offering a new paradigm for WSI learning with GNNs. The proposed method achieved an F1-score of 82.24% and an area under the curve (AUC) of 0.943.
In summary, our article’s key contributions are as follows:

1. A graph-based MILGNN technique for learning WSI representations.
2. An intra-node adjacency layer that provides end-to-end connectivity among learning nodes.
3. The use of MILGNN to identify and predict the most significant patches within WSIs, which improves our understanding of these complex images.
Related work
Whole-slide histopathology images can be as large as 100,000 pixels across. Annotating such large images by hand is a time-consuming and labor-intensive process. Recent advancements in machine learning, particularly deep learning [11, 12], have contributed significantly to WSI analysis. These methods have enabled notable progress in areas such as disease categorization [13], tissue segmentation, mutation prediction, and spatial profiling of immune infiltration [4, 14,15,16,17]. We discuss the relevant literature on WSI representation learning in detail below.
Multiple instance learning (MIL): There are two main approaches to representing WSIs. The first is subsetting, where a small subset is extracted from a large pathology image. Although it requires professional expertise and accurate subset extraction, most of the literature employs this method because of its speed and accuracy. The second technique is tiling, which divides an image into smaller, manageable tiles and processes them against one another [18]. Tiling is particularly useful for MIL approaches, which require more automation. Unlike standard supervised learning, where each training instance has its own label, MIL assigns labels to sets (bags) of instances rather than to individual instances [2, 19]. MIL techniques are frequently employed for WSI analysis and for learning representations of histopathology images. They can be categorized into instance-level and embedding-level paradigms: instance-level approaches primarily focus on local information, while embedding-level approaches concentrate on global information. Before the advent of deep learning, SVM-based models such as MISVM [2] were commonly used to address MIL problems; more complex models are now employed. Deep MIML [3] learns instance-level representations that are subsequently pooled to produce a bag representation. miNet [4] combines instance-level predictions to generate bag-level predictions. TransMIL [5] handles both balanced and unbalanced data while capturing morphological and spatial information. Attention-based methods such as ABMIL [6] and CLAM [7] can weigh the contribution of individual instances during global aggregation. Given the inherent ambiguity and difficulty of self-labeling, MIL techniques have the distinct advantage of reducing manual annotation effort.
Graph based approaches: Recent attempts to perform WSI-level analysis have yielded promising results in assessing the microenvironment of the entire tissue. Graph-based methods, namely graph convolutional networks, have drawn significant attention in recent years, mostly because they can capture an entire WSI and analyze the patterns within it, enabling accurate predictions of different outcomes of interest. Recent methods have proposed pooling algorithms for learning hierarchical representations for graph embeddings. AttPool [20], for example, uses an attention pooling layer to identify discriminative nodes and construct a coarser graph from the resulting attention values. Its use of hierarchical structure simplified model learning, and it outperformed state-of-the-art methods on multiple benchmark datasets for graph classification. A number of studies have used graph-based methods to analyze WSIs for survival analysis [21,22,23,24], lymph node metastasis prediction [25], mutation prediction [26], cell classification [27], and retrieval of significant sections [28]. In digital pathology imaging, Ilse et al. (2018) [6] developed a permutation-invariant operator for MIL. Because GNNs are also permutation invariant, they have been applied to MIL problems: Tu et al. (2019) [29] demonstrated the applicability of GNNs to MIL by representing each instance as a node in a graph. GNN-based methods have also been devised to classify WSIs expressed in terms of their constituent pixels [30, 31].
HIPT [8] introduced a ViT architecture that learns from the intrinsic WSI image topology, whereas H2Graph [9] built a heterogeneous network over higher scales of the WSI to train a dense layer. ProtoMIL, as defined by Rymarczyk et al. (2021) [10], is an approach to MIL that is based on graphical representations and inspired by the by-example style of reasoning.
Driven by these recent advancements, we use a set-learning, or MIL, strategy to tackle the problem, disregarding the interdependencies within the sets. Our methodology differs from prior research in its use of Gaussian mixture model convolution to depict the connections between bags. In this study, we combine a neural architecture with a graph network to analyze the bag structure comprehensively, and we then train the resulting structure end to end.
Materials and methods
This section presents our proposed framework for learning WSI representations. We first briefly describe the proposed method, which is based on the Variational Graph Autoencoder (VGAE). The method is memory efficient during training and learns a non-Euclidean representation. It is trained end to end on a bag of instances to obtain a representation for each patch. The basic concept is a fully connected graph defined by nodes V and an adjacency matrix A. A graph can represent a wide variety of relationships. Any two nodes \(V_{i}\) and \(V_{j}\) are connected by a weighted edge \(a_{i,j}\). Figure 1 illustrates the overall proposed method. Patches are sampled from a WSI using a tiling method and passed through feature extraction. The features of all selected regions are extracted using a pretrained convolutional neural network and are then used to build a fully connected graph, so that the WSI is represented as a dense graph. In this graph, each node is trained to interact with all other nodes. After the graph has been pooled, it is passed through a Graph Variational Autoencoder to produce the final representation of the WSI. Efficient use of memory when processing a WSI is a fundamental aspect of the process. The final WSI representation is then used for classification.
This work centers on using GNNs to learn graph representations in the context of MIL and introduces a new method for addressing the MIL problem. The approach for representing a WSI comprises two phases: a) sampling critical patches and arranging them in a fully connected graph; and b) converting the fully connected graph into a vector representation for classification. Both phases can be trained in a single training loop. The key element of our technique is the generation of the adjacency matrix, a structure that describes the relationships between nodes. The technique can be summarized as follows.

1. A tiling-based approach extracts noteworthy patches from a WSI, and a pretrained CNN extracts features from each sampled patch.
2. The WSI is then represented as a fully connected graph. The adjacency matrix connects every node to every other node, and it is learned in the Adjacency Layer training block.
3. The graph is processed by a GCN, followed by a graph pooling layer, yielding the final vector representation of the WSI.
Deep graph convolution layers: We began by building the VGAE component for the WSI. We tried two different GCNs, the spectral approach ChebNet and the spatial method GMMConv. Each hidden layer of the GCN models the interaction between nodes and maps the features into a new feature space. The next step is a pooling layer, which converts the features of all nodes into a single vector representation. A WSI can thus be represented by a compressed vector, which can also be used for other purposes such as image retrieval and classification.
MIL training approach: Our method is applicable to MIL problems in general. The steps are as follows. Each instance is modeled as a vertex, and its characteristics are treated as node features. The bag of instances is processed within a global context to compute the features used for the adjacency matrix. With each instance as a vertex and each entry \(a_{ij}\) of the adjacency matrix A giving the edge weight between \(v_i\) and \(v_j\), the bag forms a dense graph. A deep graph convolution network then learns the representation of the graph, which is passed through a graph pooling layer to obtain a feature vector representing the bag of instances. This feature vector can be used for classification.
Feature extraction: We employed a sliding window technique, as described in [32], to generate small patches from the entire slide. These patches were then classified using a residual neural network. The patch predictions were pooled, and a heuristic was used to determine the predominant and minor histological patterns for the slide as a whole. Each patch was evaluated independently of its neighbors and of its position in the slide.
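As an illustration of the sliding window step, the sketch below enumerates the top-left corners of the tiles that fit inside a slide. It is a minimal sketch and not the authors' code: the function name `tile_coordinates` and its parameters are hypothetical, and the default tile size of 512 simply mirrors the patch size reported in the experiment settings. Setting `stride` equal to `tile` gives non-overlapping patches; a smaller stride gives overlapping windows.

```python
def tile_coordinates(width, height, tile=512, stride=512):
    """Return (x, y) top-left corners of all tiles that fit fully inside the slide.

    Hypothetical helper: `width` and `height` are the slide dimensions in pixels
    at the chosen magnification.
    """
    coords = []
    # Walk the slide row by row; tiles that would cross the border are skipped.
    for y in range(0, height - tile + 1, stride):
        for x in range(0, width - tile + 1, stride):
            coords.append((x, y))
    return coords


# Example: a 1024 x 1024 region yields four non-overlapping 512 x 512 tiles.
corners = tile_coordinates(1024, 1024)
```

In practice each tile would also be checked for tissue content before feature extraction; that filter is omitted here.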
Graph building: We propose a new approach to learning WSI representations with GNNs. Each WSI is transformed into a dense graph with two components, the nodes v and the adjacency matrix A. Each node is a feature vector extracted from one patch, while the relationships among the nodes are given by the adjacency matrix A. The adjacency matrix is learned from the patch features in a convolution layer. In our training methodology the adjacency matrix is learned end to end, using the L2 distance as a threshold on the precomputed features. We use context information because the connection between the same two nodes or patches may differ from one WSI to another. Consequently, the value of an element of the adjacency matrix depends on the relation between the two patches as well as on the context of those patches. Let S be a WSI and \(s_1, s_2, \dots , s_n\) its sampled patches. Each patch is passed through a feature extraction layer, giving the feature representation \(x_i\). These features are then used to obtain the context following the theorem of Zaheer et al. [33]: the context vector is obtained by applying a pooling operator \(\phi\) to the feature vectors of all patches,
\[ c = \phi (x_1, x_2, \dots , x_n). \]
The context vector c is then concatenated with each feature vector \(x_i\), giving the concatenated feature vector \(x_{i}^{\prime }\), which is passed through MLP layers to obtain a new feature vector \(x_i^{*}\) that conveys patch information together with context. The features \(x_{i}^{*}\) are stacked to form the feature matrix \(X^{*}\). Passing these features through the correlation layer yields the adjacency matrix A, where each element \(a_{ij}\) indicates the degree of correlation between patches \(s_i\) and \(s_j\). In the dense graph representation of the WSI, \(a_{ij}\) is the weight of the edge connecting nodes i and j.
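The graph building steps above can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the pooling operator \(\phi\) is taken to be an element-wise mean, the MLP is reduced to one dense layer with ReLU, and the correlation layer is assumed to be a sigmoid of the inner product between fused features. All function names and the weights passed in are hypothetical.

```python
import math


def mean_pool(features):
    """Assumed pooling operator phi: element-wise mean over all patch features."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


def linear_relu(vec, weights, bias):
    """One dense layer with ReLU, standing in for the MLP layers."""
    return [max(0.0, sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def build_adjacency(features, weights, bias):
    """Return a dense adjacency matrix learned from patch features plus context."""
    context = mean_pool(features)                      # c = phi(x_1, ..., x_n)
    fused = [linear_relu(x + context, weights, bias)   # x_i* = MLP([x_i ; c])
             for x in features]
    # Assumed correlation layer: a_ij = sigmoid(<x_i*, x_j*>)
    return [[sigmoid(sum(p * q for p, q in zip(xi, xj))) for xj in fused]
            for xi in fused]


# Two patches with 2-dimensional features; the dense layer maps 4 inputs to 2 outputs.
patches = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
b = [0.0, 0.0]
A = build_adjacency(patches, W, b)
```

Because the context vector enters every fused feature, the same pair of patches can receive a different edge weight in a different slide, which is the behavior the text asks of the adjacency matrix.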
Deep graph convolution network: We experimented with two types of GCN to implement the graph representation of the WSI: ChebNet, which uses spectral graph convolution, and GMMConv, which uses a Gaussian mixture model convolution operator. Within GCN models, every hidden layer models the connections between nodes and maps the features into a distinct latent space. Finally, a pooling layer combines the node features into a single vector representation.
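For reference, the Gaussian mixture model convolution operator can be written as below. This is the general form documented for the `GMMConv` layer in PyTorch Geometric, given here as a reading aid rather than as the exact configuration used in this work. The symbols are notational assumptions: \(e_{ij}\) are the pseudo-coordinates (edge attributes) of the edge between nodes i and j, K is the number of Gaussian kernels, \(\Theta_k\) is a learnable weight matrix, and \(\mu_k\), \(\Sigma_k\) are the learnable mean and covariance of kernel k (unrelated to the VGAE mean \(\mu_i\)).

```latex
\[
x_i' = \frac{1}{|\mathcal{N}(i)|} \sum_{j \in \mathcal{N}(i)} \frac{1}{K} \sum_{k=1}^{K}
       w_k(e_{ij}) \odot \Theta_k x_j,
\qquad
w_k(e) = \exp\!\left( -\tfrac{1}{2} (e - \mu_k)^{\top} \Sigma_k^{-1} (e - \mu_k) \right)
\]
```

Each kernel weights a neighbor by how close the edge's pseudo-coordinates lie to the kernel mean, so the layer can learn which kinds of patch-to-patch relations matter for aggregation.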
VGAE: We employ a graph convolution network (GCN) [34] encoder and a simple inner product decoder. The inference model is parameterized by a two-layer GCN:
\[ q(Z \mid X, A) = \prod _{i=1}^{N} q(z_i \mid X, A), \qquad q(z_i \mid X, A) = \mathcal {N}(z_i \mid \mu _i, \textrm{diag}(\sigma _i^2)), \]
where \(\mu = GCN_{\mu }(X,A)\) is the matrix of mean vectors \(\mu _i\) and, similarly, \(\log \sigma = GCN_{\sigma }(X,A)\). The two-layer GCN is defined as \(GCN(X,A) = \tilde{A}\,ReLU (\tilde{A}XW_0)W_1\) with weight matrices \(W_i\); \(GCN_{\mu }\) and \(GCN_{\sigma }\) share the first-layer parameters \(W_0\). Here \(ReLU(\cdot ) = \max (0, \cdot )\), and \(\tilde{A} = D^{-1/2} A D^{-1/2}\) is the symmetrically normalized adjacency matrix. The generative model is given by an inner product between latent variables:
\[ p(A \mid Z) = \prod _{i=1}^{N} \prod _{j=1}^{N} p(A_{ij} \mid z_i, z_j), \qquad p(A_{ij} = 1 \mid z_i, z_j) = \sigma (z_i^{\top } z_j), \]
where \(A_{ij}\) are the elements of A and \(\sigma (\cdot )\) is the logistic sigmoid function.
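The encoder and decoder described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the helper names are hypothetical, the weights are fixed by hand rather than learned, and the normalization is taken to be \(D^{-1/2} A D^{-1/2}\) as an assumption about what "symmetrically normalized" means here.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (assumed form)."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def gcn_two_layer(x, adj_norm, w0, w1):
    """GCN(X, A) = A~ ReLU(A~ X W0) W1."""
    hidden = matmul(matmul(adj_norm, x), w0)
    hidden = [[max(0.0, v) for v in row] for row in hidden]
    return matmul(matmul(adj_norm, hidden), w1)


def inner_product_decoder(z):
    """Edge probabilities sigmoid(z_i . z_j) for every pair of nodes."""
    return [[1.0 / (1.0 + math.exp(-sum(p * q for p, q in zip(zi, zj))))
             for zj in z] for zi in z]


# Two fully connected nodes (self-loops included) with 1-dimensional features.
A_norm = normalize_adjacency([[1.0, 1.0], [1.0, 1.0]])
mu = gcn_two_layer([[1.0], [3.0]], A_norm, w0=[[1.0]], w1=[[2.0]])
A_rec = inner_product_decoder([[0.0], [0.0]])
```

In a full VGAE, two such encoders sharing `w0` would produce \(\mu\) and \(\log \sigma\), and the latent codes fed to the decoder would be sampled from the resulting Gaussian.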
Loss function: To learn the encoder and decoder parameters of the VAE, \(\phi\) and \(\theta\), which model the distribution of x, we place a prior distribution on the latent variable z and maximize the variational lower bound
\[ \mathcal {L} = \mathbb {E}_{q(Z \mid X, A)}\left[ \log p(A \mid Z)\right] - KL\left[ q(Z \mid X, A) \,\Vert \, p(Z)\right] , \]
where \(KL[q(\cdot ) \Vert p(\cdot )]\) is the Kullback-Leibler divergence between q(.) and p(.), and the prior is the Gaussian \(p(Z) = \prod _i p(z_i) = \prod _i \mathcal {N}(z_i \mid 0, I)\). For sparse A, it can be useful to re-weight the terms with \(A_{ij} = 1\) or, alternatively, to subsample the terms with \(A_{ij} = 0\). For training, we use full-batch gradient descent and the reparameterization trick. When no node features are used, the identity matrix replaces the input feature matrix X in the GCN.
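For a diagonal Gaussian posterior and a standard normal prior, which is the usual VGAE setup and is assumed here, the KL term of the lower bound has a closed form per node. In the expression below, D denotes the latent dimension and the second subscript d indexes latent coordinates; both are notational assumptions introduced for this illustration.

```latex
\[
\mathrm{KL}\!\left[ \mathcal{N}\!\left(\mu_i, \mathrm{diag}(\sigma_i^2)\right) \,\Big\|\, \mathcal{N}(0, I) \right]
= -\frac{1}{2} \sum_{d=1}^{D} \left( 1 + \log \sigma_{id}^{2} - \mu_{id}^{2} - \sigma_{id}^{2} \right)
\]
```

The term vanishes when \(\mu_i = 0\) and \(\sigma_i = 1\), so it acts as a regularizer that pulls each node's posterior toward the prior while the reconstruction term rewards accurate edge prediction.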
Experiment
We analyzed the effectiveness of the method using two real datasets: the public MIL dataset MUSK and the TCGA lung cancer slide dataset. We conducted multiple trials to train and evaluate our model. The TCGA data, with the same hyperparameters, served as a separate dataset for model testing. On MUSK1, the proposed method achieved a state-of-the-art accuracy of 93%. Our model was also used to differentiate between two subtypes of lung cancer: Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC).
Experiment settings: We first pretrain a classification network using 10-fold cross-validation, in which each of the 10 splits is divided into 60% training, 20% validation, and 20% test sets. We then evaluate the network using the area under the receiver operating characteristic curve (AUC/AUROC) and the weighted F1-score. Using a ResNet network pretrained on ImageNet, with a distinct model for each fold, we extracted features from the WSIs. To construct a graph, we computed the spatial adjacency among the patches and, after extracting the image features, stored the node-level embeddings in the attribute matrix. To incorporate graph convolution networks, we used the PyTorch Geometric library [35] and trained on Nvidia RTX3090 GPUs. Each WSI was cropped into a set of \(512 \times 512\) non-overlapping patches at \(20\times\) magnification, and background patches whose non-tissue area exceeded 50% were discarded. The CNN backbone used for the feature extractor is ResNet18. We use a mini-batch size of 512, the Adam optimizer, and a cosine annealing learning rate schedule. The trained feature extractor was kept and used to generate graphs. The model parameters were \(L=3\), MLP size = 123, \(D=64\), and \(k=8\); we used a graph convolution layer (GCN). The model was trained with eight samples at a time for 150 iterations. Starting at \(10^{-3}\), the learning rate was decreased at steps 30 and 100, reaching a final value of \(10^{-5}\). For node-level classification on the training slides, we minimized the cross-entropy loss with stochastic gradient descent and evaluated the ability of our GNN to generalize to the test WSI graphs using the F1-score for each cross-validation fold.
For each cross-validation fold, we saved the predicted embeddings, the predictions on the validation and test sets, and the training epoch with the highest validation F1-score.
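The learning rate schedule for the graph model can be read as a step decay. The sketch below is one interpretation and not the authors' code: it assumes the reported rates are \(10^{-3}\) at the start and \(10^{-5}\) at the end, and that the rate is multiplied by 0.1 at each of the two steps (30 and 100). The function name and the `gamma` factor are hypothetical.

```python
def step_learning_rate(epoch, base_lr=1e-3, milestones=(30, 100), gamma=0.1):
    """Return the learning rate in effect at `epoch` under step decay.

    Assumed reading of the schedule: start at `base_lr` and multiply by
    `gamma` each time a milestone is reached.
    """
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr *= gamma
    return lr


# Rates at the start, after the first step, and at the end of 150 iterations.
schedule = [step_learning_rate(e) for e in (0, 30, 149)]
```

With these assumed values the rate passes through \(10^{-4}\) between steps 30 and 100, which is the only intermediate value consistent with two equal decay factors.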
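Evaluation: To compare our approach with state-of-the-art techniques, we use the Area Under the Receiver Operating Characteristic Curve (ROC-AUC), accuracy, precision, recall, and F1-score. Specifically,
\[ \textrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \textrm{Precision} = \frac{TP}{TP + FP}, \quad \textrm{Recall} = \frac{TP}{TP + FN}, \quad \textrm{F1} = \frac{2 \cdot \textrm{Precision} \cdot \textrm{Recall}}{\textrm{Precision} + \textrm{Recall}}, \]
where TP, FP, TN, and FN stand for true positive, false positive, true negative, and false negative, respectively. When comparing the performance of various methods, ROC-AUC is the most comprehensive of these.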
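The count-based metrics can be computed directly from the four confusion-matrix entries using their standard definitions. The sketch below is illustrative only; the function name and the convention of returning 0.0 when a denominator is zero are assumptions made here, and the example counts are invented rather than taken from the experiments.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented example counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, tn=9, fn=1)
```

ROC-AUC is not included because it depends on the ranking of predicted scores rather than on a single set of thresholded counts.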
Results
MUSK dataset: The MUSK dataset comprises 92 bags, 47 positive and 45 negative. Each instance in a bag describes a particular shape or set of characteristics of a molecule, and the task is to determine, at the bag level, whether a new molecule has a musky quality. A 10-fold cross-validation was conducted using predetermined random seeds. Tables 1 and 3 compare our proposed method against various state-of-the-art techniques. In miGraph [36], a graph built with a kernel represents the items contained within a bag. The miNet [4] and Attention MIL [6] models are implemented as deep neural networks and use either pooling or attention mechanisms to obtain the bag representation.
LUAD & LUSC: Non-small cell lung cancer (NSCLC) is a prevalent form of lung cancer, with Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC) being two significant subtypes. Collectively, these two subtypes constitute approximately one-third of all lung malignancies. Building automated systems requires a number of steps, one of which is the automatic classification of the two primary subtypes of NSCLC. We obtained 1026 diagnostic WSIs stained with hematoxylin and eosin (H&E) from the TCGA archive, covering LUAD and LUSC. We chose relevant patches from all of the WSIs using a tile-based patch selection system. Using a network pretrained on ImageNet [37], we extracted image features from these patches, so that each bag is a set of features labeled as LUAD or LUSC. Our method classifies each bag into one of the two lung cancer subtypes. The highest AUC obtained across the 10 folds was 0.932, and the average AUC across all folds was 0.91.
Cross-validation was performed across subjects, so that the WSIs used for training came from a different set of subjects than those used for testing. The results are presented in Tables 2 and 3. Using a transfer learning approach, we achieved state-of-the-art precision. We extracted patch features from an existing pretrained network; the feature extractor can be further improved during training. Figure 2 shows the training loss over epochs for each of the folds.
The black-box character of deep neural networks is among the barriers to deploying advanced neural network models in computational diagnostics. Because our proposed technique uses VGAE, we can observe the weight that the network assigns to each patch when making a prediction. This can give the pathologist greater insight into the model’s internal decision-making process. Figure 3 visualizes important high-score patches. The global attention pooling layer learns to assign each patch an attention score, and a higher score implies that the model gives the patch greater consideration. When cases are queued for analysis, a CAD system could use these scores to identify regions of interest and to select cases according to analytic requirements. We find that regions with a higher attention score contain more nuclei. Because the morphological characteristics of nuclei are essential for diagnostic decisions [40], the network learns this property.
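The way a global attention pooling layer turns per-patch scores into weights and a slide-level embedding can be sketched as follows. This is a minimal illustration and not the authors' code: in a trained model the raw gate scores would come from a learned gating network, whereas here they are passed in directly, and the function names are hypothetical.

```python
import math


def attention_pool(node_features, gate_scores):
    """Softmax-normalize the scores and return (weights, weighted-sum embedding)."""
    peak = max(gate_scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in gate_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(node_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, node_features))
              for d in range(dim)]
    return weights, pooled


def top_patches(weights, k):
    """Indices of the k patches with the highest attention weight."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]


# Three patches with equal raw scores receive equal attention.
weights, embedding = attention_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                                    [0.0, 0.0, 0.0])
```

Ranking patches with `top_patches` is one simple way to select the high-score regions that would be shown to a pathologist.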
Comparison with state-of-the-art methods: Table 3 presents comparisons with ABMIL (2018), GatedABMIL (2018), MIGraph (2019), CLAMMIL, DeepAttention MIL (2020), GTMIL (2021), TransMIL (2022), graph message passing (GCN), and our MILGNN framework. Across all evaluation metrics, our approach consistently outperforms the others on both the MUSK1 and TCGA datasets. Among the competing methods, GTMIL (2021) is the most effective, which highlights the impact of the GraphTransformer architecture in WSI analysis. Unlike our technique, GTMIL employs MinCut pooling and forms a GraphTransformer network in the task setting.
Figure 4 shows a t-SNE plot of the WSI feature vectors. It reveals clear separation between the cancer subtypes and illustrates the strength of our proposed method.
Our framework improves on GTMIL by 1.50% in AUC, 2.54% in accuracy (ACC), and 1.33% in F1-score when applied to both the MUSK dataset and the TCGA lung cancer dataset. This supports the validity of the models within our framework. In particular, our framework significantly improves the classification of cancer subtypes, which may be attributed to the robust and adaptable representations learned within the MIL-based graph variational autoencoder paradigm.
Ablation study: We evaluated the effectiveness of our proposed approach on various configurations of the TCGA dataset. For our Deep Graph Convolution Network, we examined the following layers and configurations: A) MILGNN, B) VGAE + GCE, C) VGAE without GCE, D) with Graph, and E) without Graph. The experiments demonstrated a significant performance advantage of GMMConv over ChebNet. Additionally, we analyzed different encoder layer settings. The results, along with parameter settings, are presented in Table 4. We also compared the results obtained with and without the creation of a graph.
In terms of ROC-AUC, MILGNN reaches 90.6%, a significant improvement over VGAEGNN. The assessment also shows the benefit of feeding the graph to the model: ROC-AUC improves from 60.1 without the graph to 68.6 with the graph, with fewer false negatives. This indicates that supplying structural information improves the performance of the model.
Discussion
We have developed a graph-based MILGNN approach that integrates graph structures to construct a systematic classifier for distinguishing LUAD from LUSC in WSIs. Our method compares favorably with recent state-of-the-art architectures across various model performance metrics. Our graph neural network technique identifies regions within WSIs that correlate strongly with the predicted outcomes. These findings contribute to interpretable deep learning and advance both machine learning and digital pathology.
Despite substantial progress, digital pathology remains a work in progress, primarily because of the sheer volume of high-resolution images. While existing models can make accurate predictions, they often fall short in capturing temporal connectivity insights effectively. As a result, attributing significant image-level features when deploying such methods may yield mixed results. Our GNN approach addresses this challenge by aggregating WSI-level data into a comprehensive graph structure. The development of a graph that emphasizes the WSI regions associated with class labels is one of the distinctive contributions of our research.
Nevertheless, our study has certain limitations. We assumed that GNNs can capture patch-level data and their spatial layout. We also acknowledge the potential for bias in specific cross-validation folds. Since we stratified patches based on whether they exhibited elevated or decreased prevalence, WSIs may exhibit a range of characteristics, rendering their presence unpredictable. The performance of the GNN suffered from the presence of numerous unknown parameters. To mitigate this, we used deep learning to generate patch-level feature vectors before the design and construction phase, a task that proved computationally intensive.
Conclusion
We present MILGNN, a deep learning method based on GMMConv and the Variational Graph Autoencoder (VGAE), consisting of detection and classification stages. First, we convert each whole slide into patch features, treating each slide as a bag of patches, and construct a graph by learning the adjacency matrix. Next, we employ a GNN-based embedding design to train the VGAE graph model. Finally, the resulting representation is fed into an MLP-based classifier to estimate the bag-level label. The outcomes highlight the superior performance of the proposed technique. The adjacency matrix can be used to visualize relevant patches, which makes the approach straightforward and interpretable. Future work will explore the effects of deeper GNN models with multiple layers. We will also investigate automatic training and the identification of pertinent histopathological architectural characteristics to obtain semantic features.
Availability of data and materials
The Musk (Version 1) Data Set is available at https://archive.ics.uci.edu/ml/datasets/Musk+(Version+1). The Cancer Genome Atlas (TCGA) dataset is publicly available at https://portal.gdc.cancer.gov/.
References
He L, Long LR, Antani S, Thoma GR. Histology image analysis for carcinoma detection and grading. Comput Methods Programs Biomed. 2012;107(3):538–56. https://doi.org/10.1016/j.cmpb.2011.12.007.
Andrews S, Tsochantaridis I, Hofmann T. Support vector machines for multiple-instance learning. Adv Neural Inf Process Syst. 2002;15.
Feng J, Zhou ZH. Deep MIML network. In: Proceedings of the AAAI conference on artificial intelligence (Vol. 31, No. 1). 2017.
Wang X, Yan Y, Tang P, Bai X, Liu W. Revisiting multiple instance neural networks. Pattern Recog. 2018;74:15–24.
Shao Z, Bian H, Chen Y, Wang Y, Zhang J, Ji X. Transmil: transformer based correlated multiple instance learning for whole slide image classification. Adv Neural Inf Process Syst. 2021;34:2136–47.
Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. In: International conference on machine learning. PMLR; 2018. p. 2127–36.
Lu MY, Williamson DF, Chen TY, Chen RJ, Barbieri M, Mahmood F. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat Biomed Eng. 2021;5(6):555–70.
Chen RJ, Chen C, Li Y, Chen TY, Trister AD, Krishnan RG, et al. Scaling vision transformers to gigapixel images via hierarchical self-supervised learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2022. p. 16144–55.
Hou W, Yu L, Lin C, Huang H, Yu R, Qin J, et al. H^2-MIL: Exploring Hierarchical Representation with Heterogeneous Multiple Instance Learning for Whole Slide Image Analysis. In: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelveth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022 Virtual Event, February 22 - March 1, 2022. AAAI Press; 2022. p. 933–41. https://ojs.aaai.org/index.php/AAAI/article/view/19976.
Rymarczyk D, Pardyl A, Kraus J, Kaczyńska A, Skomorowski M, Zieliński B. Protomil: multiple instance learning with prototypical parts for whole-slide image classification. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Cham: Springer International Publishing; 2022. p. 421–36.
Sharma M, Mandloi A, Bhattacharya M. A novel DeepML framework for multi-classification of breast cancer based on transfer learning. Int J Imaging Syst Technol. 2022;32(6):1963–77.
Rane C, Mehrotra R, Bhattacharyya S, Sharma M, Bhattacharya M. A novel attention fusion network-based framework to ensemble the predictions of CNNs for lymph node metastasis detection. J Supercomput. 2021;77:4201–20.
Sharma M, Bhattacharya M. Discrimination and quantification of live/dead rat brain cells using a nonlinear segmentation model. Med Biol Eng Comput. 2020;58:1127–46.
Wang X, Chen H, Gan C, Lin H, Dou Q, Tsougenis E, et al. Weakly supervised deep learning for whole slide lung cancer image analysis. IEEE Trans Cybern. 2019;50(9):3950–62.
Wang S, Yang DM, Rong R, Zhan X, Xiao G. Pathology image analysis using segmentation deep learning algorithms. Am J Pathol. 2019;189(9):1686–98.
Coudray N, Ocampo PS, Sakellaropoulos T, Narula N, Snuderl M, Fenyö D, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559–67.
Sharma M, Goudar VS, Koduri MP, Tseng FG, Bhattacharya M. Quantitative and Qualitative Image Analysis of In Vitro Co-Culture 3D Tumor Spheroid Model by Employing Image-Processing Techniques. Appl Sci. 2021;11(10):4636.
Levy J, Haudenschild C, Barwick C, Christensen B, Vaickus L. Topological feature extraction and visualization of whole slide images using graph neural networks. In: BIOCOMPUTING 2021: Proceedings of the Pacific Symposium; 2020. p. 285–96.
Dietterich TG, Lathrop RH, Lozano-Pérez T. Solving the Multiple Instance Problem with Axis-Parallel Rectangles. Artif Intell. 1997;89(1–2):31–71. https://doi.org/10.1016/S0004-3702(96)00034-3.
Huang J, Li Z, Li N, Liu S, Li G. Attpool: towards hierarchical feature representation in graph convolutional networks via attention mechanism. In: Proceedings of the IEEE/CVF international conference on computer vision; 2019. p. 6480–9.
Li R, Yao J, Zhu X, Li Y, Huang J. Graph CNN for survival analysis on whole slide pathological images. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing; 2018. p. 174–82.
Di D, Li S, Zhang J, Gao Y. Ranking-based survival prediction on histopathological whole-slide images. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing; 2020. p. 428–38.
Chen RJ, Lu MY, Shaban M, Chen C, Chen TY, Williamson DF, et al. Whole slide images are 2d point clouds: Context-aware survival prediction using patch-based graph convolutional networks. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, September 27–October 1, 2021, Proceedings, Part VIII 24. Springer International Publishing; 2021. p. 339–49.
Di D, Zhang J, Lei F, Tian Q, Gao Y. Big-hypergraph factorization neural network for survival prediction from whole slide image. IEEE Trans Image Process. 2022;31:1149–60.
Zhao Y, Yang F, Fang Y, Liu H, Zhou N, Zhang J, et al. Predicting lymph node metastasis using histopathological images based on multiple instance learning with deep graph convolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. p. 4837–46.
Ding K, Liu Q, Lee E, Zhou M, Lu A, Zhang S. Feature-enhanced graph networks for genetic mutational prediction using histopathological images in colon cancer. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part II 23. Springer International Publishing; 2020. p. 294–304.
Shi J, Wang R, Zheng Y, Jiang Z, Zhang H, Yu L. Cervical cell classification with graph convolutional network. Comput Methods Programs Biomed. 2021;198:105807.
Zheng Y, Jiang Z, Shi J, Xie F, Zhang H, Luo W, et al. Encoding histopathology whole slide images with location-aware graphs for diagnostically relevant regions retrieval. Med Image Anal. 2022;76:102308.
Tu M, Huang J, He X, Zhou B. Multiple instance learning with graph neural networks. arXiv preprint arXiv:1906.04881. 2019.
Anand D, Gadiya S, Sethi A. Histographs: graphs in histopathology. In: Medical Imaging 2020: Digital Pathology (Vol. 11320). SPIE; 2020. p. 150–5.
Jaume G, Pati P, Bozorgtabar B, Foncubierta A, Anniciello AM, Feroce F, et al. Quantifying explainers of graph neural networks in computational pathology. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2021. p. 8106–16.
Berman AG, Orchard WR, Gehrung M, Markowetz F. PathML: a unified framework for whole-slide image analysis with deep learning. medRxiv. 2021:2021–07.
Zaheer M, Kottur S, Ravanbakhsh S, Poczos B, Salakhutdinov RR, Smola AJ. Deep sets. Adv Neural Inf Process Syst. 2017;30.
Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. 2016.
Fey M, Lenssen JE. Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428. 2019.
Zhou ZH, Sun YY, Li YF. Multi-instance learning by treating instances as non-i.i.d. samples. In: Proceedings of the 26th annual international conference on machine learning; 2009. p. 1249–56.
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2017. p. 4700–8.
Khosravi P, Kazemi E, Imielinski M, Elemento O, Hajirasouliha I. Deep convolutional neural networks enable discrimination of heterogeneous digital pathology images. EBioMedicine. 2018;27:317–28.
Yu KH, Zhang C, Berry GJ, Altman RB, Ré C, Rubin DL, et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features. Nat Commun. 2016;7(1):1–10.
Naik S, Doyle S, Agner S, Madabhushi A, Feldman M, Tomaszewski J. Automated gland and nuclei segmentation for grading of prostate and breast cancer histopathology. In: 2008 5th IEEE International Symposium on Biomedical Imaging: from Nano to Macro. IEEE; 2008. p. 284–7.
Acknowledgements
We thank reviewers and editors for constructive and valuable advice for improving this article.
Funding
National Natural Science Foundation of China (Grant No. 61972274), Major project of National Natural Science Foundation of China (Grant No. U21A20469).
Author information
Contributions
Writing (original draft preparation): Rukhma Aftab; review and editing: Juanjuan Zhao, Zia urrehman and Zijuan Zhao; supervision: Yan Qiang. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Aftab, R., Qiang, Y., Zhao, J. et al. Graph Neural Network for representation learning of lung cancer. BMC Cancer 23, 1037 (2023). https://doi.org/10.1186/s12885-023-11516-8
DOI: https://doi.org/10.1186/s12885-023-11516-8