The development and validation of a radiomic nomogram for the preoperative prediction of lung adenocarcinoma
BMC Cancer volume 20, Article number: 533 (2020)
Accurate diagnosis of early lung cancer from small pulmonary nodules (SPN) is challenging in the clinical setting. We aimed to develop a radiomic nomogram to differentiate lung adenocarcinoma from benign SPN.
This retrospective study included a total of 210 pathologically confirmed SPN (≤ 10 mm) from 197 patients, which were randomly divided into a training dataset (n = 147; malignant nodules, n = 94) and a validation dataset (n = 63; malignant nodules, n = 39). Radiomic features were extracted from the cancerous volumes of interest on contrast-enhanced CT images. The least absolute shrinkage and selection operator (LASSO) regression was used for data dimension reduction, feature selection, and radiomic signature building. Using multivariable logistic regression analysis, a radiomic nomogram was developed incorporating the radiomic signature and the conventional CT signs observed by radiologists. Discrimination and calibration of the radiomic nomogram were evaluated.
The radiomic signature consisting of five radiomic features achieved an AUC of 0.853 (95% confidence interval [CI]: 0.735–0.970), accuracy of 81.0%, sensitivity of 82.9%, and specificity of 77.3%. The two conventional CT signs achieved an AUC of 0.833 (95% CI: 0.707–0.958), accuracy of 65.1%, sensitivity of 53.7%, and specificity of 86.4%. The radiomic nomogram incorporating the radiomic signature and conventional CT signs showed an improved AUC of 0.857 (95% CI: 0.723–0.991), accuracy of 84.1%, sensitivity of 85.4%, and specificity of 81.8%. The radiomic nomogram had good calibration power.
The radiomic nomogram has the potential to be used as a non-invasive tool for individual preoperative prediction of SPN. It might facilitate decision-making and improve the management of SPN in the clinical setting.
According to the 2017 cancer statistics, cancer of the lung and bronchus is the most common cause of cancer death worldwide [1,2,3]. Patients with lung cancer usually have a poor prognosis because most of them are diagnosed at an advanced stage (III or IV), early-stage disease having no distinguishing symptoms. In clinical practice, accurate diagnosis of early lung cancer from small pulmonary nodules (SPN) is challenging. The detection of SPN is increasing year by year worldwide, mainly because of the wide use of low-dose chest computed tomography (CT) screening. In the Early Lung Cancer Action Project performed by Henschke et al., the detection rate of SPN was as high as 23%, which increased to 39.5% in patients who underwent lung surgery. According to the international guidelines for the management of SPN, indeterminate solid and ground-glass nodules should be followed with CT for at least 2 and 3 years, respectively [7, 8]. Therefore, accurate diagnosis of SPN using advanced tools would reduce health costs and avoid extensive CT examinations that offer no additional benefit. Clinicians also need a non-invasive imaging tool to determine whether a patient needs surgery or long-term follow-up.
Recently, radiomics, the high-throughput extraction of quantitative imaging features from standard-of-care medical images, has provided a promising and non-invasive tool for cancer research [9, 10]. Radiomic features mined by sophisticated bioinformatics tools may contribute to diagnosis, prognosis, and prediction. Radiomic signatures constructed from significant features have been applied to the precision diagnosis and treatment of cancer, which will promote the development of precision medicine. Currently, radiomics has been used to decode tumor phenotypes, histological subtypes, and pathological response of lung cancer [12,13,14].
Therefore, the aim of this study was to develop and validate a radiomic nomogram for the individual preoperative prediction of lung adenocarcinoma from benign SPN, which would improve decision-making for SPN in clinical practice.
Patients and nodules
Our institutional review board approved this retrospective study and waived the need for informed consent from patients. A total of 197 patients with 210 SPN treated with surgical resection between January 2011 and March 2017 were included. Inclusion criteria were as follows: (1) histopathologically confirmed SPN ≤10 mm; (2) available clinical data; (3) baseline lung CT scan with the same imaging parameters and reconstruction slice thickness; and (4) lung CT performed within 1 month before surgery. Patients were excluded if (1) they received surgery before CT scans or (2) their lung CT images had breathing artifacts. The patients were randomly divided into training and validation sets by a computer algorithm at a ratio of 7:3. Figure 1 illustrates the study inclusion pathway.
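The 7:3 random division described above can be sketched as follows. This is only an illustration: the source does not state which algorithm or seed was used, or whether the split was stratified, so the function name `split_train_validation`, the seed, and the integer identifiers are all assumptions. The sketch splits 210 units, which reproduces the reported dataset sizes of 147 and 63.

```python
import random

def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly partition ids into a training list and a validation list."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)      # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

nodule_ids = list(range(210))                  # hypothetical identifiers, one per nodule
train_ids, validation_ids = split_train_validation(nodule_ids)
print(len(train_ids), len(validation_ids))     # 147 63
```

A fixed seed makes the partition repeatable, which matters when model performance is later compared across the two sets.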
A total of 11 CT findings of each nodule were collected from the last CT scan before surgery, including the maximum diameter, location, involvement of pleura (pleural indentation with or without pleural thickness, absence), nodule consistency (ground-glass nodule [GGN], solid, part-solid GGN), shape (regular [e.g., round, oval] or irregular), margins (lobulation, spiculation, both, absence), cavity (presence or absence), calcification (presence or absence), intranodular changes (necrosis, consolidation, vacuoles, air bronchogram, absence), bronchial disruption (presence, absence, unclear), and vessel convergence sign (presence or absence). Two radiologists with 13 years and 18 years of clinical experience in lung cancer reviewed all of the CT images and reached a consensus.
Contrast-enhanced CT images were obtained by a 64-slice CT scanner (Siemens Definition AS+ 128, Forchheim, Germany). The imaging parameters were as follows: 120 kV; 120 mA; rotation time = 0.5 s; detector collimation = 64 × 0.625 mm; field of view = 500 mm; and matrix size = 512 × 512. All patients received intravenous administration of iodinated contrast agent (1–1.1 mL/kg, Ultravist 370, Bayer Pharma AG, Berlin, Germany). The CT images were obtained after a 30 s delay and reconstructed with a slice thickness of 2 mm.
CT-based radiomic feature extraction and selection
Figure 2 shows the radiomic workflow of this study. The regions of interest (ROIs) of pulmonary nodules were delineated by a junior radiologist using open-source ITK-SNAP software (www.itk-snap.org) and validated by a senior radiologist. Radiomic features were extracted from contrast-enhanced CT images using an in-house feature extraction algorithm implemented in the Artificial Intelligence Kit software developed by GE Healthcare Life Sciences, which can be combined with ITK-SNAP to obtain three-dimensional images. A total of 385 radiomic features were extracted, consisting of form factor features, histogram features, and textural features (such as Gray Level Size Zone Matrix [GLSZM], Gray Level Run Length Matrix [GLRLM], and Gray Level Cooccurrence Matrix [GLCM]). The feature extraction algorithms are described in the Supplementary Material.
We applied the least absolute shrinkage and selection operator (LASSO) regression to select the most significant features suggestive of malignancy. We performed 100 iterations of 10-fold cross-validation with minimal binomial deviance to select the optimal parameters in LASSO regression.
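To illustrate why LASSO performs feature selection, the sketch below fits an L1-penalized logistic regression by proximal gradient descent on synthetic data. It is not the procedure used in the study, which relied on the R package glmnet (a coordinate-descent implementation) with cross-validated penalty selection; here the penalty values, step size, data, and function names are all assumptions chosen for illustration, and cross-validation is omitted.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, snapping small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=300):
    """Minimise mean negative log-likelihood + lam * sum(|beta_j|) by proximal gradient descent."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        residuals = [sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - target
                     for row, target in zip(X, y)]
        intercept -= step * sum(residuals) / n           # the intercept is not penalised
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(residuals, X)) / n
            beta[j] = soft_threshold(beta[j] - step * grad_j, step * lam)
    return intercept, beta

# Synthetic data (assumption): six features, only the first two carry signal.
rng = random.Random(0)
X, y = [], []
for _ in range(120):
    row = [rng.gauss(0.0, 1.0) for _ in range(6)]
    signal = 2.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 1.0)
    X.append(row)
    y.append(1 if signal > 0 else 0)

_, beta_moderate = fit_lasso_logistic(X, y, lam=0.05)
_, beta_heavy = fit_lasso_logistic(X, y, lam=10.0)
print([round(b, 3) for b in beta_moderate])   # informative features keep non-zero weights
print(beta_heavy)                             # a very large penalty zeroes every coefficient
```

The soft-thresholding step is what sets coefficients exactly to zero; in practice the penalty would be chosen by cross-validation rather than fixed by hand as it is here.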
Training and validation of the conventional CT signature, radiomic signature and radiomic nomogram
To determine the additional value of the radiomic signature to conventional CT features, we developed and compared three models (i.e., conventional CT signature, radiomic signature, and radiomic nomogram). The conventional CT signature was built based on the results of multivariate logistic regression analysis of the 11 conventional CT features. The radiomic signature, or radiomic score (Rad-score), was calculated as a linear combination of the selected radiomic features weighted by their respective coefficients. Finally, the radiomic nomogram was constructed by multivariable logistic regression using the selected conventional CT features and the Rad-score.
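The two modelling steps above (a weighted linear Rad-score, then a logistic model that combines it with CT signs) can be sketched as follows. All feature names and coefficient values are hypothetical placeholders, not the fitted values from this study, and coding each CT sign as a single binary indicator is a simplifying assumption.

```python
import math

def rad_score(features, coefficients, intercept):
    """Linear combination of the selected features weighted by their coefficients."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)

def nomogram_probability(score, ct_signs, model):
    """Logistic model combining the Rad-score with binary conventional CT signs."""
    logit = model["intercept"] + model["rad_score"] * score
    logit += sum(model[name] * value for name, value in ct_signs.items())
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients and feature values, for illustration only.
lasso_coefficients = {"feature_a": 0.5, "feature_b": -0.25}
nodule_features = {"feature_a": 1.0, "feature_b": 2.0}
score = rad_score(nodule_features, lasso_coefficients, intercept=0.25)

nomogram_model = {"intercept": -1.0, "rad_score": 2.0,
                  "ground_glass": 1.0, "lobulated_or_spiculated": 0.5}
probability = nomogram_probability(
    score, {"ground_glass": 1, "lobulated_or_spiculated": 0}, nomogram_model)
print(score)                    # 0.25
print(round(probability, 4))    # 0.6225
```

A printed nomogram is a graphical rendering of the same logistic model: each predictor contributes points proportional to its coefficient, and the total maps to a predicted probability.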
The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity were used to evaluate the performance of the three models in the validation dataset. A calibration curve and the Hosmer-Lemeshow test were used to assess the calibration and goodness-of-fit of the radiomic nomogram.
All statistical analyses were performed using R software (version 3.4.2). The following packages were used: “glmnet” for LASSO logistic regression, “rms” for nomogram and calibration plots, and “vcdExtra” for the Hosmer-Lemeshow test. Differences in patient and nodule characteristics between the training and validation datasets were compared using the Chi-square test, Fisher’s exact test, or the Mann–Whitney U test, as appropriate. The AUCs of different models were compared using the DeLong test. P < 0.05 was considered statistically significant.
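The discrimination metrics named above can be computed as in the following sketch. The labels, scores, and the 0.5 threshold are made-up toy values; the source does not report which cut-off was used to derive accuracy, sensitivity, and specificity.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def classification_metrics(labels, scores, threshold):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for l, p in zip(labels, predicted) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, predicted) if l == 0 and p == 0)
    fp = sum(1 for l, p in zip(labels, predicted) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, predicted) if l == 1 and p == 0)
    return {"accuracy": (tp + tn) / len(labels),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

# Toy example (assumption): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
auc = auc_from_scores(labels, scores)
metrics = classification_metrics(labels, scores, threshold=0.5)
print(auc)       # 0.8125
print(metrics)   # {'accuracy': 0.75, 'sensitivity': 0.75, 'specificity': 0.75}
```

The AUC does not depend on the threshold, whereas the other three metrics change with it, which is why a model can have a high AUC and still show low sensitivity at a particular cut-off.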
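For readers unfamiliar with the Hosmer-Lemeshow test, the sketch below shows the standard grouped form of the statistic. It is not guaranteed to match the grouping rules of the R implementation used in the study; the ten equal-sized groups, the g − 2 degrees of freedom, and the simulated data are assumptions. The p-value helper uses a closed form that holds only for even degrees of freedom.

```python
import math
import random

def hosmer_lemeshow(labels, probabilities, n_groups=10):
    """Return the Hosmer-Lemeshow chi-square statistic and its degrees of freedom."""
    ordered = sorted(zip(probabilities, labels))       # sort cases by predicted risk
    n = len(ordered)
    statistic = 0.0
    for g in range(n_groups):
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        expected = sum(p for p, _ in group)            # expected number of events
        observed = sum(label for _, label in group)    # observed number of events
        statistic += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return statistic, n_groups - 2

def chi_square_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Simulated, well-calibrated predictions (assumption): outcomes drawn from the stated risks.
rng = random.Random(1)
probabilities = [rng.uniform(0.05, 0.95) for _ in range(200)]
labels = [1 if rng.random() < p else 0 for p in probabilities]

statistic, df = hosmer_lemeshow(labels, probabilities)
p_value = chi_square_sf_even_df(statistic, df)
print(round(statistic, 3), df, round(p_value, 3))
```

A large p-value means the observed event counts do not differ detectably from the predicted ones, which is how a non-significant result is read as acceptable calibration.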
Patient and nodule characteristics
Table 1 shows patient and nodule characteristics. The mean age of 197 patients was 51.0 years. Of the 210 nodules, 87 (41.4%) were classified as benign, including tuberculomas (15/87, 17.2%), fibrous nodules (13/87, 14.9%), lymph nodes (11/87, 12.6%), hamartomas (13/87, 14.9%), pulmonary cryptococcosis (10/87, 11.5%), inflammatory nodules (8/87, 9.2%), inflammatory granuloma (4/87, 4.6%), aspergillosis (3/87, 3.4%), sclerosing hemangiomas (2/87, 2.3%), and atypical adenomatous hyperplasia (8/87, 9.2%); 123 (58.6%) were malignant, composed of invasive adenocarcinomas (44/123, 35.8%), minimally invasive adenocarcinoma (59/123, 48.0%), and adenocarcinoma in situ (20/123, 16.3%). No significant difference was found between the training and validation datasets in regard to most clinical and imaging features (Table 1).
Feature selection and radiomic signature construction
A total of 385 radiomic features were extracted from each volume of interest of the nodules on contrast-enhanced CT images. Five features with non-zero coefficients were selected by LASSO (Fig. 3a-b). The radiomic score calculation formula:
The five radiomic features were significantly different between the benign and malignant SPN (for all, p < 0.001) (Fig. 4).
Training and validation of the conventional CT signature, radiomic signature and radiomic nomogram
The radiomic signature achieved an AUC of 0.878 (95%CI: 0.813 to 0.943), accuracy of 85.0%, sensitivity of 90.1%, and specificity of 76.8% in the training dataset (Table 2) and an AUC of 0.853 (95%CI: 0.735 to 0.970), accuracy of 81.0%, sensitivity of 82.9%, and specificity of 77.3% in the validation dataset (Table 2). There was a significant difference between benign and malignant SPN in regard to Rad-score in the training dataset (median [interquartile range], 1.295 [0.880 to 1.631] vs. −0.525 [−0.964 to 0.106], respectively, P < 0.001, Fig. 5a), which was confirmed in the validation dataset (median [interquartile range], 1.027 [0.444 to 1.841] vs. −0.541 [−1.208 to −0.078], respectively, P < 0.001, Fig. 5b).
After multivariate analysis, only two CT findings (nodule consistency and margins) remained (P < 0.001 and P = 0.026, respectively). The two CT features attained an AUC of 0.842 (95%CI: 0.779 to 0.906), accuracy of 73.5%, sensitivity of 62.6%, and specificity of 91.1% in the training dataset and an AUC of 0.833 (95%CI: 0.707 to 0.958), accuracy of 65.1%, sensitivity of 53.7%, and specificity of 86.4% in the validation dataset (Table 2). The AUCs of conventional CT signature and radiomic signature were not significantly different (P = 0.292 and 0.586 in the training and validation datasets, respectively).
A radiomic nomogram incorporating radiomic signature, internal composition and margins of nodule was constructed (Fig. 6a). The radiomic nomogram yielded an AUC of 0.911 (95%CI: 0.858 to 0.965), accuracy of 87.1%, sensitivity of 87.9%, and specificity of 85.7% in the training dataset and an AUC of 0.857 (95%CI: 0.723 to 0.991), accuracy of 84.1%, sensitivity of 85.4%, and specificity of 81.8% in the validation dataset (Table 2), which indicated that the radiomic signature provides added value to the conventional CT features in terms of discriminatory efficacy. The AUC of the radiomic nomogram was not significantly different from that of the conventional CT features and the radiomic signature in the validation dataset (P = 0.304 and 0.864, respectively). The calibration curve of the radiomic nomogram is shown in Fig. 6b. The Hosmer-Lemeshow test yielded P values of 0.738 and 0.111 in the training and validation datasets, respectively, which indicated good calibration power.
We trained and tested a radiomic nomogram based on the radiomic signature and the anatomical CT features for individualized preoperative prediction of lung adenocarcinoma, which showed good discriminative power and calibration. This study indicates that CT-derived radiomic features supplement the CT findings reported by radiologists in the prediction process. Notably, this study provides a non-invasive and effective prediction tool to identify patients with a high probability of lung adenocarcinoma.
Early diagnosis of cancer is associated with prolonged survival. For instance, the 5-year overall survival of breast cancer was 74.8% between 1975 and 1977 and increased significantly to 90.3% between 2003 and 2009. This increase is mainly due to earlier detection through the extensive application of mammography for cancer screening. Currently, small pulmonary nodules are still a common and challenging clinical problem. The classification performance of CT is limited, especially in small nodules (≤10 mm in diameter). A more accurate and reliable non-invasive diagnostic tool is urgently needed for precise treatment. Early diagnosis of malignant pulmonary nodules is crucial for improving patients’ long-term overall survival.
To date, radiologists diagnose lung cancer largely on the basis of qualitative features of CT images, such as nodule diameter, evidence of spiculation, upper lobe location, and pleural indentation. Low-dose CT screening for pulmonary nodules may reduce mortality; however, it also carries a risk of overdiagnosis through the detection of indolent tumors. Some radiologists have advocated serial examinations for all incidental SPN on CT so that timely curative lung surgery can be offered, which may be too aggressive. Excessive detection of SPN might have adverse implications for the current medical system and clinical practice, such as inefficient use of limited resources, higher health care costs, and increased radiation exposure and risk of morbidity and mortality for patients. CT-guided percutaneous biopsy has commonly been used to obtain tumor histological results because most pulmonary nodules are located peripherally. However, in actual clinical practice, the sensitivity of percutaneous biopsy decreases as nodules become smaller [21, 22], and other factors, including nodule morphology and length of the needle path, also influence the accuracy of biopsy. In addition, percutaneous biopsy has several limitations, such as its invasive nature and high risk of complications. Therefore, non-invasive imaging-based biomarkers are needed to provide additional diagnostic information.
Recently, the growth of medical image analysis methods and tools has driven additional studies investigating the radiomics of lung cancer. Radiomic signatures may help to mine the information behind lung cancer on medical images, for instance, tumor staging, gene expression patterns, treatment response [26, 27], and patient survival [28, 29]. Whether radiomic features can improve the prediction of malignancy in pulmonary nodules compared with conventional visual assessment on CT is currently a hot topic [30, 31], but most studies have examined nodules smaller than 30 mm in diameter. In this study, 210 SPN no larger than 10 mm, with surgically proven malignant or benign status, were included for radiomic analysis. All radiomic features were extracted from images acquired on the same CT scanner, with the same imaging parameters and reconstruction slice thickness. As Wu et al. indicated, without control of variable factors such as imaging scanners and scanning parameters, the performance of radiomic features could be degraded. A larger number of radiomic features has the potential to quantify intra-tumoral heterogeneity. However, most high-dimensional features are redundant, which can cause poor classification performance. We aimed to select the radiomic features most strongly associated with lung adenocarcinoma. Only five useful features were selected from 385 features by the LASSO algorithm. Unlike previous studies, this study describes some important CT findings that contribute to the differential diagnosis of lung adenocarcinoma. After multivariate analysis, internal composition and margins were two independent clinical features of lung adenocarcinoma. Nodules with GGN, lobulation, and/or signs of spiculation had a higher risk of malignancy, which was consistent with the radiologists’ experience. The conventional CT signature attained an accuracy of 0.735 and 0.651 in the training and validation datasets, respectively.
We hypothesized that radiomic features could further improve the diagnostic accuracy of a CT signature. Our study demonstrated the predictive performance of conventional CT features was improved by adding radiomic features, attaining accuracy of 0.871 and 0.841 in the training and validation datasets, respectively.
A number of risk models of varying complexity have been developed to identify the risk of incident lung cancer among patients with visible lung nodules [33,34,35,36,37,38]. The models were based on significant patient and nodule characteristics. The accuracy and clinical utility of a predictive model depend on the case mix of the population in which it was derived and the prevalence of malignancy in that population. Risk prediction models should be externally validated before they are used in a different clinical setting and population. The four validated models were the Mayo Clinic, Veterans Administration, Herder, and Brock models. Studies have shown an AUC of 0.89 for the Mayo Clinic model, 0.74 for the Veterans Administration model, 0.92 for the Herder model, and 0.90 for the Brock model. Our radiomic model achieved similar performance, with an AUC of 0.857. Compared with previous models, our model did not consider patient data but included radiomic features extracted from CT images that could reflect intratumoral heterogeneity. However, our model lacks external validation. We hope to explore the added value of radiomics to the existing risk prediction models.
In summary, this study showed the potential of radiomic features extracted from contrast-enhanced CT images for predicting lung cancer before surgery. Radiomic features added value to the conventional CT features in differentiating lung adenocarcinoma from benign SPN. This study provides doctors with a radiomic nomogram as a non-invasive tool for individualized preoperative prediction of lung cancer. However, more studies are needed to validate the performance of the radiomic nomogram before it is applied in real-world settings.
Availability of data and materials
All data generated or analysed are included in this article.
Abbreviations
SPN: Small pulmonary nodules
VOI: Volume of interest
LASSO: Least absolute shrinkage and selection operator
AUC: Area under the receiver operating characteristic curve
GGN: Ground-glass nodule
GLCM: Gray-level co-occurrence matrix
GLRLM: Gray-level run-length matrix
GLSZM: Gray-level size zone matrix
Siegel RL, Miller KD, Jemal A. Cancer statistics, 2017. CA Cancer J Clin. 2017;67:7–30.
Torre LA, Bray F, Siegel RL, et al. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65:87–108.
Chen W, Zheng R, Baade PD, et al. Cancer statistics in China, 2015. CA Cancer J Clin. 2016;66:115–32.
Ruparel M, Quaife SL, Navani N, et al. Pulmonary nodules and CT screening: the past, present and future. Thorax. 2016;71:367–75.
Patz EF, Pinsky P, Gatsonis C, et al. Overdiagnosis in low-dose computed tomography screening for lung cancer. JAMA Intern Med. 2014;174:269–74.
Hazelrigg SR, Boley TM, Weber D, et al. Incidence of lung nodules found in patients undergoing lung volume reduction. Ann Thorac Surg. 1997;64:303–6.
MacMahon H, Austin JH, Gamsu G, et al. Guidelines for management of small pulmonary nodules detected on CT scans: a statement from the Fleischner society. Radiology. 2005;237:395–400.
Sayyouh M, Vummidi DR, Kazerooni EA. Evaluation and management of pulmonary nodules: state-of-the-art and future perspectives. Expert Opin Med Diag. 2013;7:629–44.
Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012;48:441–6.
Kumar V, Gu Y, Basu S, et al. Radiomics: the process and the challenges. Magn Reson Imaging. 2012;30:1234–48.
Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–77.
Choi ER, Lee HY, Jeong JY, et al. Quantitative image variables reflect the intratumoral pathologic heterogeneity of lung adenocarcinoma. Oncotarget. 2016;7:67302–13.
Yoon HJ, Sohn I, Cho JH, et al. Decoding tumor phenotypes for ALK, ROS1, and RET fusions in lung adenocarcinoma using a Radiomics approach. Medicine. 2015;94:e1753.
Coroller TP, Agrawal V, Narayan V, et al. Radiomic phenotype features predict pathological response in non-small cell lung cancer. Radiother Oncol. 2016;119:480–6.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B. 1996;58:267–88.
Wu S, Zheng J, Li Y, et al. Development and validation of an MRI-based Radiomics signature for the preoperative prediction of lymph node metastasis in bladder Cancer. EBioMedicine. 2018;34:76–84.
Kramer AA, Zimmerman JE. Assessing the calibration of mortality benchmarks in critical care: the Hosmer-Lemeshow test revisited. Crit Care Med. 2007;35(9):2052–6.
Ott JJ, Ullrich A, Miller AB. The importance of early symptom recognition in the context of early detection and cancer survival. Eur J Cancer. 2009;45:2743–8.
DeSantis CE, Lin CC, Mariotto AB, et al. Cancer treatment and survivorship statistics, 2014. CA Cancer J Clin. 2014;64:252–71.
Callister MEJ, Baldwin DR. How should pulmonary nodules be optimally investigated and managed? Lung Cancer. 2016;91:48–55.
Kothary N, Lock L, Sze DY, et al. Computed tomography-guided percutaneous needle biopsy of pulmonary nodules: impact of nodule size on diagnostic accuracy. Clin Lung Cancer. 2009;10:360–3.
Wallace MJ, Krishnamurthy S, Broemeling LD, et al. CT-guided percutaneous fine-needle aspiration biopsy of small (< or =1-cm) pulmonary lesions. Radiology. 2002;225:823–8.
Wiener RS, Schwartz LM, Woloshin S, et al. Population-based risk of complications following transthoracic needle lung biopsy of a pulmonary nodule. Ann Intern Med. 2011;155:137–44.
Hawkins S, Wang H, Liu Y, et al. Predicting malignant nodules from screening CT scans. J Thorac Oncol. 2016;11:2120–8.
Aerts HJ, Velazquez ER, Leijenaar RT, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun. 2014;5:4006.
Aerts HJ, Grossmann P, Tan Y, et al. Defining a Radiomic response phenotype: a pilot study using targeted therapy in NSCLC. Sci Rep. 2016;6:33860.
Huynh E, Coroller TP, Narayan V, et al. CT-based radiomic analysis of stereotactic body radiation therapy patients with lung cancer. Radiother Oncol. 2016;120:258–66.
Grove O, Berglund AE, Schabath MB, et al. Quantitative computed tomographic descriptors associate tumor shape complexity and intratumor heterogeneity with prognosis in lung adenocarcinoma. PLoS One. 2015;10:e0118261.
Huang Y, Liu Z, He L, et al. Radiomics signature: a potential biomarker for the prediction of disease-free survival in early-stage (I or II) non-small cell lung cancer. Radiology. 2016;281:947–57.
He L, Huang Y, Ma Z, et al. Effects of contrast-enhancement, reconstruction slice thickness and convolution kernel on the diagnostic performance of radiomics signature in solitary pulmonary nodule. Sci Rep. 2016;6:34921.
Bae JM, Jeong JY, Lee HY, et al. Pathologic stratification of operable lung adenocarcinoma using radiomics features extracted from dual energy ct images. Oncotarget. 2017;8:523–35.
Wu W, Parmar C, Grossmann P, et al. Exploratory study to identify radiomics classifiers for lung cancer histology. Front Oncol. 2016;6:71.
Swensen SJ, Silverstein MD, Ilstrup DM, et al. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med. 1997;157(8):849–55.
Gould MK, Ananth L, Barnett PG, et al. A clinical model to estimate the pretest probability of lung cancer in patients with solitary pulmonary nodules. Chest. 2007;131(2):383–8.
Li Y, Chen K-Z, Wang J. Development and validation of a clinical prediction model to estimate the probability of malignancy in solitary pulmonary nodules in Chinese people. Clin Lung Cancer. 2011;12(5):313–9.
Gurney JW. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part I. Theory. Radiology. 1993;186(2):405–13.
Yonemori K, Tateishi U, Uno H, et al. Development and validation of diagnostic prediction model for solitary pulmonary nodules. Respirology. 2007;12(6):856–62.
McWilliams A, Tammemagi MC, Mayo JR, et al. Probability of cancer in pulmonary nodules detected on first screening CT. N Engl J Med. 2013;369(10):910–9.
Herder GJ, van Tinteren H, Golding RP, et al. Clinical prediction model to characterize pulmonary nodules: validation and added value of 18F-fluorodeoxyglucose positron emission tomography. Chest. 2005;128(4):2490–6.
This work was funded by the National Health and Family Planning Commission of the People’s Republic of China (201402013).
Ethics approval and consent to participate
This study was approved by the ethics review board of the First Affiliated Hospital of Guangzhou Medical University. The need for informed patient consent for inclusion was waived.
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Liu, Q., Huang, Y., Chen, H. et al. The development and validation of a radiomic nomogram for the preoperative prediction of lung adenocarcinoma. BMC Cancer 20, 533 (2020). https://doi.org/10.1186/s12885-020-07017-7
- Lung adenocarcinoma
- Computed tomography