Texture analysis of MR images to identify the differentiated degree in hepatocellular carcinoma: a retrospective study

Background To explore the clinical value of texture analysis of MR images (multiphase Gd-EOB-DTPA-enhanced MRI and T2-weighted imaging (T2WI)) in identifying the degree of differentiation of hepatocellular carcinoma (HCC). Method One hundred four participants were enrolled in this retrospective study. Each participant underwent preoperative Gd-EOB-DTPA-enhanced MR scanning. Texture features were extracted with MaZda, and its B11 program was used for data analysis and classification. The diagnostic efficiencies of texture features and conventional imaging features in identifying the degree of differentiation of HCC were assessed by receiver operating characteristic analysis. The relationship between texture features and the degree of differentiation of HCC was evaluated by Spearman’s correlation coefficient. Results The grey-level co-occurrence matrix-based texture features were the most frequently extracted, and nonlinear discriminant analysis performed excellently, with misclassification rates ranging from 3.33 to 14.93%. The area under the curve (AUC) of the combined texture features for poorly- versus well-differentiated HCC, poorly- versus moderately-differentiated HCC, and moderately- versus well-differentiated HCC was 0.812, 0.879 and 0.808, respectively, while the AUC of tumor size was 0.649, 0.660 and 0.517, respectively. Tumor size was significantly different between poorly- and moderately-differentiated HCC (p = 0.014). Combining tumor size with the texture features did not significantly increase the COMBINE AUC values. Conclusions Texture analysis of Gd-EOB-DTPA-enhanced MRI and T2WI was valuable and might be a promising method for identifying the degree of differentiation of HCC. Poorly-differentiated HCC was more heterogeneous than well- and moderately-differentiated HCC.


Background
Hepatocellular carcinoma (HCC) is a malignant tumor arising from hepatocytes and is the second most common cause of cancer death worldwide. HCC accounts for a larger proportion of tumors in developing countries [1]. The high prevalence of hepatitis B virus infection is the most common cause of HCC in developing countries, while alcohol and hepatitis C virus are more frequent causes in developed countries. Although there are many treatments for HCC, including surgery, radiofrequency ablation and transcatheter arterial chemoembolization, the mortality of HCC remains high because of recurrence [2].
Many reports have suggested that tumor size, number of lesions, vascular invasion, status of the tumor capsule and liver function can affect the prognosis and the choice of therapy for HCC [3][4][5][6]. Nevertheless, the most important factor is the differentiation grade, which is considered an independent factor affecting recurrence of HCC [7]. According to the degree of differentiation of the tumor cells, HCC is grouped into poorly-differentiated, moderately-differentiated and well-differentiated HCC. According to these reports, the overall survival rate of patients with moderately-differentiated and well-differentiated HCC is higher than that of patients with poorly-differentiated HCC, while the recurrence rate is lower [8,9].
A precise pre-surgical evaluation of the degree of differentiation of HCC might affect the individual treatment schedule [10]. Currently, aspiration biopsy is the most common method of obtaining histopathological information before surgery. However, it has been criticized by many researchers because of its invasiveness and the risk of seeding metastasis [11,12]. Recently, many studies have suggested that the imaging characteristics of a tumor might predict the degree of differentiation of HCC. For example, some reports found that the low density/intensity of HCC on the portal phase of CT and on the hepatobiliary phase of Gd-EOB-DTPA-enhanced MRI might help to identify the degree of differentiation of HCC [13,14].
Texture analysis is an established technique that aids diagnosis by extracting a large amount of texture information from medical images [15]. It has been used to identify the degree of differentiation and other characteristics of tumors and to evaluate therapeutic effect [16][17][18]. However, texture analysis has not yet been used to identify the degree of differentiation of HCC. Thus, the aim of the present study was to evaluate the accuracy of texture analysis of MR images in discriminating the degree of differentiation of HCC, and to compare the diagnostic efficiencies of conventional imaging features and texture features.

Patients
The present study received ethical approval from the Medical Ethics Review Committee of our institution, and informed consent was obtained in accordance with the Declaration of Helsinki. One hundred four participants were enrolled from 2015 to 2019 according to the following criteria: 1) HCC pathologically proven after hepatectomy; 2) inpatients with comprehensive clinical records; 3) preoperative Gd-EOB-DTPA-enhanced MRI performed. The clinical data of the 104 participants are recorded in Table 1, including age, gender, alpha fetoprotein (AFP), alanine aminotransferase (ALT), aspartate transaminase (AST), ALT/AST, total bilirubin (TBIL), direct bilirubin and indirect bilirubin.
Exclusion criteria included: 1) treatment (transplantation, resection, ablation or embolization) before the MR examination; 2) incomplete clinical data (AFP, ALT, AST, TBIL, direct bilirubin and indirect bilirubin) or pathological results; 3) lesions not clearly displayed on the images because of artifacts.

Image analysis
The MR images were reviewed in the picture archiving and communication system (PACS). Experienced radiologists, who were blinded to the pathological results, evaluated the MRI features of the HCC. The MRI features (arterial enhancement, capsule appearance, intensity in the hepatobiliary phase (HBP), the margin and diameter of the tumor, intralesional fat, intratumoral vessels, etc.) were selected with reference to the Liver Imaging-Reporting and Data System (LI-RADS 2017) (https://www.acr.org/Clinical-Resources/Reporting-and-Data-Systems/LI-RADS) [19].

Texture analyses and features selection
MaZda software (version 4.6, quantitative texture analysis software, available from http://www.eletel.p.lodz.pl/mazda/) was used for texture analysis. All images were converted into bitmap (BMP) format for compatibility with MaZda. An experienced radiologist manually drew the region of interest (ROI) of the lesion on the slice that contained the maximum proportion of tumor. One hundred four ROIs (one ROI for each patient) on HBP images were analyzed first. Subsequently, the ROIs were copied onto the T2, arterial phase (AP) and equilibrium phase (EP) images. Then, texture features were extracted and analyzed. The texture features could be grouped into grey-level histogram, grey-level co-occurrence matrix (GLCOM), grey-level run-length matrix (GLRLM) and wavelet transform features. A grey-level histogram indicates how many pixels of an image share the same grey level. The GLCOM is a statistical method of examining image texture that considers the spatial relationship of pixels by calculating how often pairs of pixels with specific values occur in a specified spatial relationship; it does not provide information about shape. The GLRLM gives the size of homogeneous runs for each grey level. The wavelet transform is a mathematical means of analyzing signals whose frequency varies over time, and the energies of the wavelet transform coefficients can be computed as features. More detailed texture features are listed in Table 2. Feature selection algorithms included the Fisher coefficient, mutual information (MI), and classification error probability combined with average correlation coefficients (POE + ACC). Ten texture features were selected by each of these algorithms. To enhance discriminability, the three methods were combined (called "FPM"), by which 30 texture features were selected in total.
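As an illustration of how a co-occurrence matrix and a few of the features named in Table 2 can be derived from it, the following is a minimal sketch in plain Python. It is not MaZda's implementation: the function names (glcm, glcm_features), the toy 4 × 4 image with four grey levels, and the choice to count each pixel pair in both orders (a symmetric matrix) are assumptions made for the example.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalised, symmetric co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # the pair (i, j)
                counts[j][i] += 1  # and its mirror (assumed symmetric counting)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Three classic co-occurrence features computed from probabilities p[i][j]."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    asm = sum(p[i][j] ** 2 for i, j in cells)              # angular second moment
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in cells)
    entropy = -sum(p[i][j] * math.log2(p[i][j]) for i, j in cells if p[i][j] > 0)
    return {"asm": asm, "contrast": contrast, "entropy": entropy}


# Hypothetical ROI with grey levels 0..3 (illustrative values only).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy, dx=1, dy=0, levels=4)  # horizontal neighbours, distance 1
features = glcm_features(p)
print(features)
```

Changing dx and dy gives other directions and interpixel distances, which is how a single matrix definition can yield a large number of GLCOM features per image.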

Histopathological analysis
Histopathological evaluation was available for the lesions after hepatectomy. The specimens were routinely prepared with 4% formaldehyde and evaluated by two experienced pathologists who were blinded to the MRI information. Eight slices of each lesion were analyzed and evaluated, with slices ranging from 0.3 cm to 2.0 cm depending on the size of the lesion. The Edmondson-Steiner grade was used to categorize all the specimens. According to the degree of differentiation of the tumor cells, HCC was categorized into grades I to IV. Edmondson grade I and part of grade II corresponded to well-differentiated HCC, grade II and part of grade III to moderately-differentiated HCC, grade III and part of grade IV to poorly-differentiated HCC, and grade IV to undifferentiated HCC. The specimens were stained with Glypican-3 (GPC-3) antibodies. Immunohistochemical staining was considered positive if more than 10% of the tumor cells showed cytoplasmic staining; otherwise it was considered negative.
Table 1 The clinical data of each subtype group and inter-group differences
Table 2 (texture feature categories and features)
Grey-level co-occurrence matrix (GLCOM): Angular second moment, contrast, correlation, entropy, sum entropy, sum of squares, sum average, sum variance, inverse difference moment, difference entropy, difference variance (for four directions and five interpixel distances (offsets; n = 1-5))
Grey-level run-length matrix (GLRLM): Run-length non-uniformity, grey-level non-uniformity, long run emphasis, short run emphasis, fraction of image in runs (for four angles)
Wavelet transform: Energies of wavelet transform coefficients in sub-bands LL, LH, HL, HH

Statistical analysis and misclassification rate
Statistical analysis was performed using Statistical Product and Service Solutions (SPSS ver. 20.0, Chicago, IL). In the present study, group differences in continuous variables with non-normal distribution, such as age, ALT, AST, ALT/AST and texture features, were analyzed by the Mann-Whitney U test. Differences in texture features among poorly-, moderately- and well-differentiated HCC were analyzed by the Kruskal-Wallis H test. Group differences in categorical variables were analyzed by Pearson's chi-squared test when the sample size was over 40 and the minimal expected frequency was over 5; otherwise, the corrected chi-squared test was used. An R × C table was used when a variable had more than two categories. To evaluate the diagnostic accuracy of texture features derived from T2, HBP, AP and EP images, receiver operating characteristic (ROC) analysis was performed and the area under the curve (AUC) was calculated with MedCalc (MedCalc statistical software, ver. 15.8). The correlation between texture features and the degree of differentiation of HCC was analyzed by Spearman's correlation coefficient. A p value less than 0.05 was considered statistically significant, and Bonferroni correction was used to adjust p values in multiple comparisons. B11, a module of MaZda (version 4.6), provides four analysis methods, principal component analysis (PCA), linear discriminant analysis (LDA), nonlinear discriminant analysis (NDA) and raw data analysis (RDA), to classify and analyze the texture features. B11 implements a 1-nearest-neighbour (1-NN) classifier for non-linear supervised classification [20]. The misclassification rate was defined as the number of misclassified samples divided by the total number of samples, where a sample is misclassified when its estimated group differs from its observed group.
According to the misclassification rate, the classification results were separated into five levels: excellent (misclassification rate ≤ 10%), good (10% < misclassification rate ≤ 20%), moderate (20% < misclassification rate ≤ 30%), fair (30% < misclassification rate ≤ 40%), and poor (misclassification rate > 40%) [21].
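The definition of the misclassification rate and the grading thresholds above can be expressed as a short sketch. The function names and the toy group labels are hypothetical and serve only to illustrate the arithmetic; they are not part of B11.

```python
def misclassification_rate(observed, predicted):
    """Percentage of samples whose predicted group differs from the observed group."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equally long")
    wrong = sum(o != p for o, p in zip(observed, predicted))
    return 100.0 * wrong / len(observed)


def classification_level(rate_percent):
    """Map a misclassification rate (in percent) to the five levels used in the text."""
    if rate_percent <= 10:
        return "excellent"
    if rate_percent <= 20:
        return "good"
    if rate_percent <= 30:
        return "moderate"
    if rate_percent <= 40:
        return "fair"
    return "poor"


# Hypothetical labels: A = well-, C = poorly-differentiated (illustrative only).
observed = ["A", "A", "A", "A", "C", "C", "C", "C", "C", "C"]
predicted = ["A", "A", "A", "C", "C", "C", "C", "C", "C", "C"]
rate = misclassification_rate(observed, predicted)
level = classification_level(rate)
print(rate, level)
```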

Clinical data
There were 37 patients with poorly-differentiated HCC, 43 with moderately-differentiated HCC, and 24 with well-differentiated HCC in the present study. As shown in Table 1, there were no significant differences in age or gender among the groups (p > 0.05). There were significant differences in AFP and ALT values between poorly- and well-differentiated HCC (p = 0.001 and 0.006, respectively). ALT was significantly different between well- and moderately-differentiated HCC (p = 0.008). Fifty-one participants were GPC-3 positive, of whom 20 had poorly-differentiated, 20 moderately-differentiated and 11 well-differentiated HCC. There was no significant difference in GPC-3 expression among poorly-, moderately- and well-differentiated HCC, as shown in Table 1 (p > 0.05).

MRI feature evaluation
The MRI features of the 104 patients are shown in Table 3. Tumor size was significantly different between poorly- and moderately-differentiated HCC (p = 0.014). However, no significant differences were found in the margin and capsule status of the tumor, liver cirrhosis, HBP hypointensity, intratumoral vessels, intralesional fat, rim enhancement in the AP or lymphadenectasis among poorly-, moderately- and well-differentiated HCC. A typical case of poorly-differentiated HCC is shown in Fig. 1.

Texture analysis and tissue classification
As shown in Table 4, 262 texture features derived from T2, HBP, AP and EP images were obtained and categorized into histogram (n = 10), GLCOM (n = 220), GLRLM (n = 20) and wavelet transform (n = 12) features. Table 4 shows the frequency of each feature category extracted by FPM from the T2-weighted images and from each phase of the Gd-EOB-DTPA-enhanced images among poorly-, well- and moderately-differentiated HCC. The GLCOM-based texture features were the most frequently extracted in the three phases for poorly- versus well-differentiated HCC, poorly- versus moderately-differentiated HCC and well- versus moderately-differentiated HCC.
The tissue classification results for T2, AP, EP and HBP are shown in Table 5. The misclassification rate of NDA was excellent for each phase in the three group comparisons, ranging from 3.33 to 14.93%. LDA ranked second to NDA, with misclassification rates ranging from 4.92 to 33.75%. The misclassification results of both RDA and PCA were fair or poor.

ROC-analysis
The AUC of each texture feature was calculated. The ROC curves of the best combined diagnoses are shown in Figs. 2, 3 and 4. As shown in Fig. 2, the combined AUC value (combining texture features from T2, AP and EP) for differentiating poorly- from well-differentiated HCC was 0.812 (accuracy = 0.77), higher than that of any single texture feature from each phase. As shown in Fig. 3, the combined AUC value for differentiating poorly- from moderately-differentiated HCC was 0.879 (accuracy = 0.85), and as shown in Fig. 4, the combined AUC value for differentiating moderately- from well-differentiated HCC was 0.808 (accuracy = 0.746).
The ROC analyses of tumor size combined with texture features are shown in Table 6. "COMBINE" represents the combination of texture features derived from different phases. As shown in Table 6, the AUC of tumor size was the lowest and the COMBINE AUC value was the highest. When tumor size was combined with the texture features, the COMBINE AUC values were the same as those without tumor size for poorly- versus moderately-differentiated HCC and poorly- versus well-differentiated HCC, while the COMBINE AUC value increased from 0.808 to 0.833 for moderately- versus well-differentiated HCC, although this increase was not significant (p = 0.314).
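For readers less familiar with the AUC, the sketch below computes it in plain Python as the probability that a randomly chosen case from one group has a higher feature value than a randomly chosen case from the other group, with ties counted as one half. This equals the area under the empirical ROC curve, but it is not the MedCalc procedure used in the study; the function name and the toy values are assumptions for illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly (ties = 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical feature values for two groups (not data from the study).
group_a = [0.9, 0.8, 0.4]
group_b = [0.7, 0.3, 0.2]
auc = roc_auc(group_a, group_b)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation, which is the scale on which the values in Table 6 are read.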

Correlation between texture features and differentiated degree of HCC
Perc.10% was positively correlated with the differentiated degree of HCC in AP (r = 0.276, p = 0.005), while 135dr_

Discussion
As previous studies showed, the diameter of HCC is an important factor in predicting the pathological grade of HCC. Lee et al. [22] and Martins et al. [23] suggested that the diameter of most moderately-differentiated HCC was larger than that of well-differentiated HCC. The present study found that the diameter of poorly-differentiated HCC was larger than that of moderately-differentiated and well-differentiated HCC. However, there was no significant difference in diameter between poorly- and well-differentiated HCC in the present study, which was not consistent with the findings of Martins et al. This may be due to the heterogeneity of the tumor cells and individual differences in tumor growth patterns, as well as the limited sample size. Additionally, the diagnostic efficiency of tumor size was lower than that of the texture features in the present study, which was consistent with a previous study [24] and suggests an important role for texture analysis in identifying the degree of differentiation of HCC.
The degree of differentiation of HCC is the most important factor affecting the prognosis of patients. In this study, the patients were grouped into poorly-, moderately- and well-differentiated groups based on the histopathological outcomes, and we explored whether the texture features could successfully differentiate these subtypes of HCC. Texture analysis is a method that can quantify the information provided by images. Some studies have verified that texture analysis has the potential to identify the histopathological type of neoplasms such as breast cancer and renal tumors [21,25]. However, no studies have yet explored the value of texture features derived from multiphase Gd-EOB-DTPA-enhanced MRI and T2WI in predicting the histopathological grade of HCC.
In recent years, researchers have gradually realized that quantitative features are increasingly important in tumor diagnosis, in addition to qualitative features such as the margin, signal intensity and capsule of the tumor [26]. MaZda is a software package that provides a complete path for quantitative analysis of image texture, including image analysis, texture feature extraction, data classification, analysis automation and other functions [20]. The substantial information obtained by MaZda might differentiate the pathological grade of a tumor. A previous study analyzed texture features to predict the overall survival (OS) of patients with advanced HCC [27]. Our study attempted to identify the histopathological grade by texture analysis.
The B11 module provides four procedures, RDA, PCA, LDA and NDA, to analyze the 30 selected features. In the present study, the classification performance of NDA was excellent, suggesting that texture analysis is a reliable method to identify poorly-, moderately- and well-differentiated HCC. Although LDA has been recommended as an optimal method, NDA performed better than LDA in the present study, which was consistent with the study by Li Y [28]. This might be due to the non-linearity of the clinical data, which were obtained in a random way. The inconsistency of the misclassification rates from the texture analysis of different image sequences might result from the different histological components and enhancement patterns among the subtypes of HCC [21]. The GLCOM-based features, which describe the spatial dependence of grey values in an image, were extracted more frequently than texture features of other categories regardless of the MRI phase and group [28,29]. This implies that different pathological grades might affect the grey values of the image. Additionally, the large number of texture features included in the GLCOM category (n = 220) might contribute to the high frequency with which these texture features were extracted [21]. The GLRLM features, which describe pixel runs with the same grey-level value in a given direction and depict intensity homogeneity in that direction, were the second most frequently selected [28]. This result might suggest that intensity homogeneity differs among poorly-, moderately- and well-differentiated HCC. The GLCOM-based features generated from the AP were noticeably different between groups.
Table 4 The frequency of each feature category extracted by FPM from AP, EP, HBP and T2 images among poorly-differentiated, well-differentiated and moderately-differentiated HCC
Note: RDA raw data analysis, PCA principal component analysis, LDA linear discriminant analysis, NDA nonlinear discriminant analysis; A: well-differentiated HCC, B: moderately-differentiated HCC, C: poorly-differentiated HCC; AP arterial phase, EP equilibrium phase, HBP hepatobiliary phase
In the present study, the histogram-derived parameter Perc.10% of the AP was positively correlated with the degree of differentiation of HCC, suggesting that the signal intensity on AP imaging was higher with a higher degree of differentiation. However, as a previous study showed, HCC with a higher degree of differentiation tends to have a lower arterial supply. Individual differences in HCC arterial supply might lead to this discrepancy [30]. 135dr_ShrtREmp is a GLRLM-based texture feature that measures heterogeneity, and SumEntrp is a parameter that measures the randomness and heterogeneity of the studied region. 135dr_ShrtREmp of the EP and SumEntrp of T2 were negatively correlated with the degree of differentiation of HCC, suggesting that poorly-differentiated HCC was the most heterogeneous among the differentiation grades on both EP and T2 images [25,31]. However, there was no significant difference in signal (a routine MR feature) among the differentiation grades of HCC, as shown in Table 3. Therefore, texture analysis may be a more precise method than traditional MRI characteristics for evaluating the degree of differentiation of HCC.
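The rank correlation between a texture feature and the differentiation grade can be sketched as follows. The numeric coding of the grades (1 = poorly, 2 = moderately, 3 = well differentiated), the function names and the toy feature values are assumptions for illustration only; the study used statistical software rather than this code.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: grade coded 1..3 and a made-up feature value per lesion.
grade = [1, 1, 2, 2, 3, 3]
feature = [10, 12, 11, 15, 14, 18]
rho = spearman(grade, feature)
print(rho)
```

A positive rho means the feature tends to rise with the coded grade, and a negative rho means it tends to fall; the sign therefore depends on how the grades are coded.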
As shown in Table 6, the COMBINE AUC value (combining S(0,2)SumAverg of AP, Perc.10% of T2, Perc.10%-EP and S(0,5)SumEntrp-HBP) was the highest for moderately- versus poorly-differentiated HCC. S(0,2)SumAverg and Perc.10% reflect the signal intensity of the lesion, and S(0,5)SumEntrp reflects the randomness and heterogeneity of the studied region. Therefore, the signal intensity on T2, AP and EP images and the heterogeneity on HBP images appear to be important in predicting the degree of differentiation of HCC. The COMBINE AUC value (combining S(4,0)Correlat-AP, 135dr_ShrtREmp-EP and WavEnLH_s-2-T2) was the highest for well- versus poorly-differentiated HCC, while the COMBINE AUC value (combining S(5,5)DifVarnc-AP, S(2,2)DifVarnc-HBP and WavEnLH_s-1-T2) was the highest for well- versus moderately-differentiated HCC. All of these features reflect the heterogeneity of the lesion. Both the signal intensity and the heterogeneity of HCC were therefore valuable in identifying the degree of differentiation of HCC. In addition, the AUC of tumor size was the lowest, suggesting that texture analysis was more precise than tumor size in identifying the degree of differentiation of HCC. These results suggest that radiologists should pay attention to the signal intensity and heterogeneity of lesions in clinical diagnosis.
GPC-3 is a member of the glypican family, which influences cell growth, differentiation and migration [32]. Previous studies demonstrated that a higher GPC-3 expression level in HCC was a risk factor for shorter overall survival and that the GPC-3 expression level in poorly-differentiated tumor cells was higher than that in moderately- and well-differentiated HCC [32][33][34]. However, there was no significant difference in GPC-3 expression among poorly-, moderately- and well-differentiated HCC in the present study. The small sample size may be the reason for this discrepancy.
There were some limitations in our study. First, although we adopted strict inclusion and exclusion criteria in this retrospective study, selection bias was still inevitable. Second, the sample size was relatively small and needs to be enlarged in future studies. Third, the ROI (tumor contour) was manually delineated on the slice containing the maximum diameter, which led to a lack of three-dimensional information about the tumor.

Conclusions
In conclusion, texture analysis of multiphase Gd-EOB-DTPA-enhanced MRI and T2WI is a noninvasive and reliable quantitative technique for predicting the differentiation grade of HCC. Texture analysis performed better than tumor size in discriminating the differentiation grade of HCC. The signal intensity and heterogeneity of HCC were valuable in identifying the degree of differentiation of HCC.