Article

Radiomics-Clinical AI Model with Probability Weighted Strategy for Prognosis Prediction in Non-Small Cell Lung Cancer

School of Medical and Health Sciences, Tung Wah College, Hong Kong, China
*
Author to whom correspondence should be addressed.
Biomedicines 2023, 11(8), 2093; https://doi.org/10.3390/biomedicines11082093
Submission received: 1 June 2023 / Revised: 29 June 2023 / Accepted: 19 July 2023 / Published: 25 July 2023

Abstract

In this study, we propose a radiomics-clinical probability-weighted model for predicting the prognosis of non-small cell lung cancer (NSCLC). The model combines radiomic features extracted from radiotherapy (RT) planning images with clinical factors such as age, gender, histology, and tumor stage. CT images with radiotherapy structures of 422 NSCLC patients were retrieved from The Cancer Imaging Archive (TCIA). Radiomic features were extracted from gross tumor volumes (GTVs). Five machine learning algorithms, namely decision trees (DT), random forests (RF), extreme boost (EB), support vector machine (SVM), and generalized linear model (GLM), were optimized by a voted ensemble machine learning (VEML) model. A probability-weighted approach was used to incorporate the uncertainty associated with both radiomic and clinical features and to generate a probabilistic risk score for each patient. The performance of the model was evaluated using receiver operating characteristic (ROC) analysis. The radiomic model, clinical factor model, and combined radiomic-clinical probability-weighted model demonstrated good performance in predicting NSCLC survival, with AUCs of 0.941, 0.856, and 0.949, respectively. The combined radiomics-clinical probability-weighted enhanced model achieved significantly better performance than the radiomic model in 1-year survival prediction (chi-square test, p < 0.05). The proposed model has the potential to improve NSCLC prognosis and facilitate personalized treatment decisions.

1. Introduction

Lung cancer is one of the most commonly diagnosed cancers, accounting for 11.6% of cancer cases. It has the highest mortality among all malignancies worldwide, comprising approximately 25% of all cancer deaths. Non-small cell lung cancer (NSCLC) accounts for the majority of lung cancer incidence, almost 85% of cases [1]. The primary treatment modalities for NSCLC are surgery, radiation therapy, and chemotherapy. Recent research indicated that patients with specific biomarkers may benefit from immunotherapy for NSCLC [2]. Moreover, targeted therapy is favorable for NSCLC with specific genes or proteins [3]. Formulating a treatment plan and patient management is crucial to the prognosis of NSCLC. Traditionally, TNM staging is a widely used system for prognosis stratification and treatment decisions for NSCLC based on tumor size (‘T’), lymph node involvement (‘N’), and distant metastasis (‘M’). However, the TNM staging system only provides a stratified prognosis prediction based on the characteristics of the tumor, which is not personalized for each patient. Furthermore, other prognostic factors that influence patient outcomes, such as age and histology, are not taken into consideration by the TNM staging system. Given these limitations, there is a need to incorporate other factors that can provide more comprehensive and individualized predictions.
Radiomics is a rapidly growing field that uses quantitative data extracted from medical images, such as computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), to provide a more detailed characterization of tumors [4]. These data, which include textural and morphological information, can be used to identify subtle differences in tumor heterogeneity that are significant to treatment outcome [5] and personalized medicine [6].
Machine learning has been used in radiomics to predict treatment outcomes of patients with colorectal cancer [7], head and neck cancer [8], hepatocellular carcinoma [9], and NSCLC [10]. Common machine learning algorithms include decision tree (DT), random forest (RF), extreme boost (EB), support vector machine (SVM), and generalized linear model (GLM) [11].
Chaddad et al. (2017) investigated the use of radiomics in predicting the survival time of patients with NSCLC based on the shape and the textural radiomic features [12]. The subjects were classified according to their histology and TNM staging information. Twenty-four radiomic features were used. The study suggested that these radiomic features have the potential ability to predict the survival time of patients with area under the curve (AUC) from 0.70 to 0.76. Le et al. (2021) performed another study to evaluate the predictive ability of radiomics in one-year, three-year, and five-year survival of NSCLC patients. A risk score was developed from ten radiomic models with AUC of 0.696, 0.705 and 0.657 for one-year, three-year, and five-year survival, respectively [13].
Ching et al. (2023) used a combined radiomic model with clinical features (RC combined model) for prostate cancer to predict five-year progression-free survival and obtained an AUC of 0.797 [14]. Their model combined radiomic factors with clinical factors using ridge regression. The best accuracy obtained by the RC combined model was 0.729, which leaves room for improvement.
It appears that radiomics is helpful in the early prediction of survival for NSCLC patients [15]. In this study, we present a radiomics-clinical probability-weighted enhanced model for predicting the prognosis of NSCLC. The model combines radiomic features extracted from computed tomography (CT) images with clinical factors to predict the overall survival of NSCLC patients. It is built on a combination of machine learning algorithms that integrate radiomic features and clinical information using a probability-weighted strategy.

2. Materials and Methods

2.1. Data Acquisition

Pre-treatment planning CT images were acquired from The Cancer Imaging Archive (TCIA). TCIA is an open-access database managed by the Frederick National Laboratory for Cancer Research. It is funded by the Cancer Imaging Program (CIP) of the National Cancer Institute (NCI) in the United States [16]. The images were reviewed and approved by the TCIA Advisory Group, which is formed by experts in cancer imaging, informatics, and related technology to ensure the reliability of the database. TCIA contains medical images of different types of cancer. Supporting information for the images, such as age, gender, and outcomes of the patients, is also provided if available.
Four hundred and twenty-two NSCLC patients’ information was retrieved from TCIA. All patients received radiotherapy with curative intent. The dataset contains pre-treatment planning CT images with radiotherapy structures. Gross tumor volume (GTV) was segmented manually by experienced oncologists. Patients’ demographics and tumor information, including age, gender, TNM staging, and histology, were also acquired from the database.

2.2. Case Selection

Convenience sampling was used, drawing on all samples available in the target database. Among the 422 cases collected, 5 cases with distant metastasis or with GTV outside the lung were excluded from the study. A total of 8 cases were excluded due to errors in acquiring the DICOM images. A total of 57 cases with missing data in age, histology, T stage, or overall staging were also excluded. Finally, 352 cases were included. According to G*Power analysis (version 3.1.9.7, Statistical Consulting Group, UCLA, Los Angeles, CA, USA), the required sample size is 52 for a chi-square test with a power of 0.8, an effect size of 0.5, and α at 0.05. To give more accurate machine learning results, we used all 352 samples in this study.
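The exclusion counts above can be checked with a short tally. The following Python sketch is illustrative only (the study itself was carried out in R); the variable names are hypothetical and the numbers are those reported in this section.

```python
# Case-selection tally using the counts reported in Section 2.2.
total_cases = 422
exclusions = {
    "distant metastasis or GTV outside the lung": 5,
    "errors in acquiring the DICOM images": 8,
    "missing age, histology, T stage, or overall staging": 57,
}

included = total_cases - sum(exclusions.values())

for reason, count in exclusions.items():
    print(f"excluded {count:>3}: {reason}")
print(f"included: {included}")
```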

2.3. Feature Extraction

For cases with multiple GTVs, the GTVs were combined into a single GTV for feature extraction using the Eclipse treatment planning system, version 15.6 (Varian, Palo Alto, CA, USA). The GTV was used for radiomic feature extraction performed in 3D Slicer (v. 5.2.1, slicer.org) with the PyRadiomics extension (Computational Imaging and Bioinformatics Lab, Harvard Medical). A total of 107 radiomic features were extracted from each sample and imported into the machine learning algorithms. These radiomic features can be classified into seven groups: shape, first-order features, gray level co-occurrence matrix (GLCM), gray level dependence matrix (GLDM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM) (Table 1).
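To give a feel for what a first-order radiomic feature measures, the sketch below computes a few intensity statistics from the voxels inside a region of interest. It is a simplified illustration in plain Python, not the PyRadiomics implementation used in the study; the function name, the example intensities, and the bin width of 25 are assumptions made for the example.

```python
import math
from collections import Counter


def first_order_features(intensities, bin_width=25):
    """Compute a few first-order statistics from ROI voxel intensities."""
    n = len(intensities)
    mean = sum(intensities) / n
    energy = sum(v * v for v in intensities)
    variance = sum((v - mean) ** 2 for v in intensities) / n
    # Discretize into fixed-width bins, then take the histogram entropy.
    bins = Counter(math.floor(v / bin_width) for v in intensities)
    entropy = -sum((c / n) * math.log2(c / n) for c in bins.values())
    return {"mean": mean, "energy": energy,
            "variance": variance, "entropy": entropy}


# Hypothetical CT numbers (HU) inside a tiny GTV mask.
roi = [-20, 10, 35, 40, 60, 80, 80, 120]
features = first_order_features(roi)
print(features)
```

Texture groups such as GLCM or GLRLM follow the same idea but summarize the spatial arrangement of discretized intensities rather than their histogram alone.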

2.4. Study Endpoints

Overall survival (OS) was defined as the time from radiotherapy treatment to death, irrespective of the cause of death. Luna et al. (2022) evaluated the prediction of OS using radiomics in patients with stage III lung adenocarcinoma treated with chemoradiation. Their study showed that integrating radiomic features into a baseline Cox model based on age and ECOG performance status improved the OS predictive ability of the model [15]. In our study, we divided the study endpoints into one-year, three-year, and five-year OS to obtain a more precise prediction model.
To avoid overfitting and bias due to uneven data, a balanced sample with an equal sample size in each treatment outcome was randomly selected at each endpoint for validation and testing of the models (Table 2).
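One way to draw such a balanced sample is random undersampling of the larger outcome group. The sketch below assumes that approach; the text does not state the exact procedure, and the function name, record layout, and cohort sizes are hypothetical.

```python
import random


def balanced_sample(cases, label_key="dead", seed=0):
    """Randomly undersample the larger outcome group to match the smaller one."""
    rng = random.Random(seed)
    alive = [c for c in cases if not c[label_key]]
    dead = [c for c in cases if c[label_key]]
    n = min(len(alive), len(dead))
    sample = rng.sample(alive, n) + rng.sample(dead, n)
    rng.shuffle(sample)
    return sample


# Hypothetical cohort: 30 patients alive and 70 dead at a given endpoint.
cohort = [{"id": i, "dead": i >= 30} for i in range(100)]
balanced = balanced_sample(cohort)
print(len(balanced), sum(c["dead"] for c in balanced))
```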

2.5. Machine Learning for Data Processing

The extracted radiomic features were imported into the machine learning algorithms using R (Ihaka and Gentleman; v. 4.1.3, Switzerland) [17] with the Rattle package [18]. The machine learning algorithms used in the study include decision trees (DT), random forests (RF), extreme boost (EB), support vector machine (SVM), and generalized linear model (GLM). We built our AI model by randomly splitting the sample into three independent cohorts: 70% of the sample in the training cohort to identify patterns, 15% in the validation cohort to measure progress, and 15% in the testing cohort to evaluate the performance of the model on unobserved data. The predicted treatment outcome was treated as a binary classification: a score of less than 0.5 indicated that the model predicted the patient survived at the given endpoint, while a score of greater than 0.5 indicated that the model predicted the patient did not survive.
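The 70/15/15 split and the 0.5 decision threshold can be sketched as follows. This is an illustrative Python version, not the Rattle workflow used in the study; the function names and the seed are hypothetical, and treating a score of exactly 0.5 as "dead" is an assumption, since the paragraph above leaves that case open.

```python
import random


def split_cohorts(case_ids, train_frac=0.70, valid_frac=0.15, seed=42):
    """Shuffle case IDs and split them into training, validation, and testing cohorts."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_frac)
    n_valid = round(len(ids) * valid_frac)
    return ids[:n_train], ids[n_train:n_train + n_valid], ids[n_train + n_valid:]


def predicted_outcome(score, threshold=0.5):
    """Map a model score to a binary outcome (assumption: exactly 0.5 counts as dead)."""
    return "dead" if score >= threshold else "alive"


train, valid, test = split_cohorts(range(200))
print(len(train), len(valid), len(test))
print(predicted_outcome(0.31), predicted_outcome(0.74))
```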
The above machine learning algorithms were optimized by the voted ensemble machine learning (VEML) model we proposed earlier [18]. Because machine learning algorithms differ in their properties, each algorithm has its own limitations. VEML has been reported to improve predictive performance compared with a single machine learning algorithm [19]. Hence, the ensemble method was introduced to compensate for the weaknesses of the individual models in order to achieve a higher prediction accuracy. This method incorporates results from the five machine learning algorithms, namely decision tree (DT), random forest (RF), extreme boost (EB), support vector machine (SVM), and generalized linear model (GLM), by averaging the scores of the algorithms that predicted the majority outcome (alive or dead); the resulting score was then evaluated using ROC analysis [20] (Figure 1).
Prediction of prognosis using the radiomic model or the clinical factors model has its own strengths and weaknesses. The radiomic model is a non-invasive tool that predicts cancer prognosis by mathematical analysis of radiomic features. The clinical factors model provides a subjective measurement based on clinical elements, such as age and histology, that may significantly influence the prediction results. On the other hand, the TNM staging system only stratifies patients according to tumor size, lymphatic involvement, and the extent of metastasis, and is not personalized for each patient [21]. Hence, a weighted method was proposed to construct a combined probability-enhanced model, which is a weighted combination of the radiomic model and the clinical factors model (Figure 2). Combining the two models draws on the strengths of each and can potentially improve the accuracy of the predicted outcome.
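A minimal sketch of the voting step, under one reading of the description above: each algorithm casts a vote by thresholding its score at 0.5, and the ensemble score is the mean score of the algorithms on the majority side. The function name and the per-algorithm scores are hypothetical, and the actual VEML implementation in [18] may differ in detail.

```python
def veml_score(scores, threshold=0.5):
    """Average the scores of the algorithms that voted with the majority."""
    dead_votes = {name: s for name, s in scores.items() if s >= threshold}
    alive_votes = {name: s for name, s in scores.items() if s < threshold}
    # With five algorithms there is always a strict majority.
    majority = dead_votes if len(dead_votes) > len(alive_votes) else alive_votes
    return sum(majority.values()) / len(majority)


# Hypothetical per-algorithm scores for one patient (higher means "dead").
scores = {"DT": 0.80, "RF": 0.70, "EB": 0.65, "SVM": 0.40, "GLM": 0.30}
ensemble = veml_score(scores)
print(round(ensemble, 4))
```

Averaging only the majority side keeps the ensemble score on the same side of the threshold as the vote, so the score and the predicted class cannot disagree.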

2.6. Probability Weighted Enhanced Model (PWEM)

Combining patient demographics and clinical factors with radiomic features has been shown to add further value to the predictive power of machine learning models [22].
Advanced age and AJCC TNM staging have been found to correlate significantly with patient survival [23,24]. This suggests that combining patient clinical factors with radiomic features in machine learning warrants further exploration.
The Probability Weighted Enhanced Model (PWEM) is a multi-algorithm model proposed in this study to facilitate collaborative voting between the radiomics model and the clinical factor model (Figure 2). The rationale is to account for crucial and high-risk clinical factors as references in order to produce a more realistic prediction. It uses hard-voting and soft-voting techniques for decision-making, considering the numerical outcomes of radiomic features and categorical clinical factors. Hard voting consists of performing VEML on the radiomics features model and the clinical factors model separately, so that each model has a VEML score indicating the probability and prediction for the outcome. For soft voting, a classifier known as predictive weighting assigns the weighting of the radiomics model and the clinical factor model based on probability.
Predictive weighting reflects each model’s probability of making a correct prediction in a conflicting situation. When the radiomics model and the clinical factor model give different predictions of the patient outcome, the correct predictions by each model are counted, and each model’s weighting is set according to its probability of being correct.
The weighted score of the PWEM reflects the collective survival prediction of the radiomics model and the clinical factor model. It is obtained by multiplying the radiomics model score and the clinical model score by their corresponding predictive weighting factors and summing the results. The weighted score is a numerical value between 0 and 1: a value less than 0.5 indicates that the PWEM has predicted the patient to survive, while a value of 0.5 or larger indicates that the PWEM has predicted the patient to be dead (Figure 2). It is calculated by the following equation:
Weighted Score = Radiomic VEML Score × Radiomics Weighting + Clinical VEML Score × Clinical Weighting
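The weighting scheme can be sketched in Python as follows. How the predictive weights are normalized is not spelled out in the text, so this example assumes each weight is the share of conflicting cases in which that model was correct; the function names, the equal-weight fallback, and the validation records are all hypothetical.

```python
def predictive_weights(records, threshold=0.5):
    """Estimate weights from cases where the two models disagree.

    Each record is (radiomic_score, clinical_score, died), with died True/False.
    """
    radiomic_correct = clinical_correct = 0
    for radiomic, clinical, died in records:
        radiomic_pred = radiomic >= threshold
        clinical_pred = clinical >= threshold
        if radiomic_pred == clinical_pred:
            continue  # only conflicting predictions inform the weighting
        if radiomic_pred == died:
            radiomic_correct += 1
        else:
            clinical_correct += 1
    conflicts = radiomic_correct + clinical_correct
    if conflicts == 0:
        return 0.5, 0.5  # assumed fallback: equal weighting
    return radiomic_correct / conflicts, clinical_correct / conflicts


def weighted_score(radiomic, clinical, w_radiomic, w_clinical):
    """Weighted Score = Radiomic VEML Score x weight + Clinical VEML Score x weight."""
    return radiomic * w_radiomic + clinical * w_clinical


# Hypothetical records: (radiomic VEML score, clinical VEML score, died).
validation = [(0.8, 0.3, True), (0.2, 0.7, False), (0.6, 0.4, True),
              (0.3, 0.6, True), (0.9, 0.8, True)]
w_rad, w_clin = predictive_weights(validation)
score = weighted_score(0.7, 0.4, w_rad, w_clin)
print(w_rad, w_clin, round(score, 3))
```

Because the two weights sum to 1 under this assumption, the weighted score stays between 0 and 1 and can be thresholded at 0.5 in the same way as the individual VEML scores.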

3. Results

3.1. Patient Demographics and Tumor Characteristics

A total of 352 patients with NSCLC were included in the study. The overall staging was classified according to the TNM system of the American Joint Committee on Cancer (AJCC). Among the patients, 67% were male, while 33% were female. A total of 44% of the patients were diagnosed with stage IIIB NSCLC. For histology, the highest proportion of patients were diagnosed with squamous cell carcinoma, which was equivalent to 40% of the sample (Table 3).

3.2. Prognosis Prediction Performance of the Models at Different Endpoints

Receiver operating characteristic (ROC) curves were used to evaluate the prognosis prediction performance of the radiomic model, the clinical factors model, and the combined probability-weighted enhanced model at the endpoints of one-year, three-year, and five-year survival. The area under the curve (AUC) of the ROC curves at each endpoint was generated by Rattle in R.
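For readers unfamiliar with the metric, the AUC equals the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison in plain Python; it illustrates the definition only and is not the Rattle routine used in the study. The labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = dead) and model scores for eight patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```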

3.3. Performance Analysis for Machine Learning Models

Averaged over the one-year, three-year, and five-year endpoints, the radiomics model (RAT), the clinical model (CF), and the Probability Weighted Enhanced (PWE) model obtained AUCs of 0.941, 0.856, and 0.949, respectively. The RAT and PWE models had similar performance for survival prediction, and both outperformed the CF model (Figure 3, Figure 4 and Figure 5).
The best performance was achieved by the PWE model for one-year survival prediction, with an AUC of 0.955 (95% CI [0.9264–0.9742]). For five-year survival prediction, the RAT model achieved an AUC of 0.942 (95% CI [0.8923–0.9714]), while the CF model had the lowest AUC of 0.846 (95% CI [0.7697–0.9027]) (Table 4).
The PWE model had significantly better performance than the RAT model for one-year survival prediction (p < 0.01, chi-square test). For three-year and five-year survival prediction, the performance of the PWE and RAT models was similar, with no significant difference (Table 5). Nevertheless, both RAT and PWE had good performance in terms of accuracy. PWE obtained the best accuracy of 0.9107 for three-year survival. Both RAT and PWE performed better than CF, with accuracy ranging from 0.8594 (RAT, five-year survival) to 0.9107 (PWE, three-year survival) (Table 6).

4. Discussion

Our radiomics-clinical model demonstrates the value of combining radiomic features with clinical factors through probability weighting for predicting the prognosis of NSCLC. The model achieved a predictive accuracy of 0.9107, compared with a highest accuracy of 0.8281 for clinical factors alone, indicating that the combined PWE model can provide valuable information that is not captured by clinical factors alone.
Previous attempts to combine clinical information with radiomic features to predict cancer treatment prognosis used methods such as ridge regression [14,25], logistic regression [26], and Cox regression [24], and obtained AUCs ranging from 0.733 [24] to 0.868 [25,26]. In our model, the probability-weighted method takes into consideration that radiomic features and clinical factors are two distinct factors of different natures and should not be put together as inputs for machine learning. By using a probability-weighted strategy, we obtained a better AUC of 0.955 and an accuracy of 0.9107.
Our study illustrated that prognosis prediction of cancer, in particular NSCLC, can be achieved by machine learning models with radiomic features or clinical factors. The advantage of clinical data is the convenience of data collection, for example, patient demographics such as age and gender. Radiomics offers a non-invasive method to predict prognosis based on radiomic features extracted from medical images. However, radiomics fails to consider the deterministic factors that significantly influence the prognosis of the patients, which may jeopardize the predictive ability of the model. In our study, age was found to be an influential clinical factor affecting the prognosis of the patients. The probability-weighted enhanced model proposed in this study can incorporate clinical data with radiomic features, taking each set of data into consideration to achieve better predictive power than either factor alone.
Radiomics is a promising approach not only for the prognosis of cancer but also for the diagnosis of diseases. By using image analysis techniques, image features, as another form of radiomics, can identify subtle differences in tissue properties that may not be visible to the naked eye and can potentially improve diagnostic accuracy, for example in the automatic detection of ischemic stroke in the brain using CT images [27], lung cancer diagnosis [28,29], prostate cancer detection [30], and brain tumor assessment and classification [31]. In addition, AI models using radiomics can be applied to histology image classification [32]. Overall, radiomics represents a promising approach for disease diagnosis and prognosis prediction, and as the field advances, it will play an increasingly important role in medical science.
There are factors that may affect the outcomes and the accuracy of the model. These include:
Image quality: types of imaging modality such as CT or ultrasound, image resolution, and image noise can affect the accuracy of the radiomic model [33].
Feature extraction: in this study, feature extraction is based on gross tumor volume delineation. The quality and reproducibility of the radiomic features extracted from the images are dependent on the experience of the oncologists and technologists [34].
Sample size: The size and composition of the dataset used to train and validate the model can impact its accuracy [35].
Treatment effects: When validating a radiomic model, it is important to carefully consider the potential effect of treatment on the accuracy of the model. This may involve including treatment-related variables in the model or stratifying the dataset based on treatment status to ensure that the model is accurate and generalizable to the target population. In our case, we divided the sample into one-, three-, and five-year endpoints [36].
One limitation of our study was that some clinical data, such as smoking status and family history, were not included in the data source. Including this information could potentially improve the accuracy of the prediction models.
Another limitation of our study is the lack of clinical validation. Clinical validation is important to confirm the generalizability of our model to other patient populations and healthcare settings. Future studies should aim to validate our model externally using independent datasets.
Despite these limitations, our radiomics-clinical model has important implications for the prognosis of NSCLC patients. The model can provide more accurate and individualized predictions of patient outcomes, which can aid in treatment planning and improve patient survival.

5. Conclusions

In this study, we presented a radiomics-clinical probabilistic model for the prognosis of NSCLC. The model combines radiomic features extracted from CT images with clinical factors such as age, histology, and tumor stage to predict overall survival. Our results demonstrate the potential of combining radiomic and clinical factors with probability weighting for improving prognosis prediction in NSCLC patients. Future studies with larger datasets and external validation are needed to confirm the robustness and generalizability of our model.

Author Contributions

Conceptualization, F.-H.T.; methodology, F.-H.T., Y.-W.F. and S.-H.Y.; software, F.-H.T.; validation, C.-L.T. and M.-T.C.; formal analysis, S.-H.Y. and C.-K.W.; investigation, Y.-W.F., S.-H.Y. and C.-L.T.; resources, S.-H.Y.; data curation, C.-K.W.; writing—original draft preparation, Y.-W.F., S.-H.Y., C.-K.W., C.-L.T. and M.-T.C.; writing: F.-H.T.; visualization, F.-H.T.; supervision, F.-H.T.; project administration, F.-H.T.; funding acquisition, F.-H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by UGC Faculty Development Grant UGC/FDS17/M10/19. The APC was partly funded by the Staff Development Fund of the School of Medical and Health Sciences and the UGC Faculty Development Grant.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Research Ethics Committee of Tung Wah College (REC2019031).

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analysed in this study. These data can be found at: https://wiki.cancerimagingarchive.net/display/Public/NSCLC-Radiomics (accessed on 22 January 2023). Data used in the preparation of this article were obtained from The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. American Cancer Society. Lung Cancer. 2023. Available online: https://www.cancer.org/cancer/lung-cancer.html (accessed on 2 March 2023).
  2. Huang, D.; Zhang, F.; Tao, H.; Zhang, S.; Ma, J.; Wang, J.; Liu, Z.; Cui, P.; Chen, S.; Huang, Z.; et al. Tumor Mutation Burden as a Potential Biomarker for PD-1/PD-L1 Inhibition in Advanced Non-small Cell Lung Cancer. Target. Oncol. 2020, 15, 93–100.
  3. Yuan, M.; Huang, L.L.; Chen, J.H.; Wu, J.; Xu, Q. The emerging treatment landscape of targeted therapy in non-small-cell lung cancer. Signal Transduct. Target. Ther. 2019, 4, 61.
  4. van Timmeren, J.E.; Cester, D.; Tanadini-Lang, S.; Alkadhi, H.; Baessler, B. Radiomics in medical imaging—“How-to” guide and critical reflection. Insights Imaging 2020, 11, 91.
  5. Marusyk, A.; Janiszewska, M.; Polyak, K. Intratumor Heterogeneity: The Rosetta Stone of Therapy Resistance. Cancer Cell 2020, 37, 471–484.
  6. Aerts, H.J.; Velazquez, E.R.; Leijenaar, R.T.; Parmar, C.; Grossmann, P.; Carvalho, S.; Cavalho, S.; Bussink, J.; Monshouwer, R.; Haibe-Kains, B.; et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat. Commun. 2014, 5, 4006.
  7. Staal, F.C.; Van Der Reijd, D.J.; Taghavi, M.; Lambregts, D.M.J.; Beets-Tan, R.G.H.; Maas, M. Radiomics for the Prediction of Treatment Outcome and Survival in Patients with Colorectal Cancer: A Systematic Review. Clin. Color. Cancer 2021, 20, 52–71.
  8. Giraud, P.; Giraud, P.; Gasnier, A.; Ayachy, R.E.; Kreps, S.E.; Foy, J.; Durdux, C.; Huguet, F.; Burgun, A.; Bibault, J. Radiomics and Machine Learning for Radiotherapy in Head and Neck Cancers. Front. Oncol. 2019, 9, 174.
  9. Santos, J.A.C.; Oliveira, B.C.; De Arimateia Batista Araujo-Filho, J.; Assuncao, A.N., Jr.; De MMachado, F.A.; Rocha, C.; Horvat, J.V.; Menezes, M.G.; Horvat, N. State-of-the-art in radiomics of hepatocellular carcinoma: A review of basic principles, applications, and limitations. Abdom. Imaging 2020, 45, 342–353.
  10. Walls, G.; Osman, S.O.; Brown, K.K.; Butterworth, K.T.; Hanna, G.B.; Hounsell, A.R.; McGarry, C.K.; Leijenaar, R.T.; Lambin, P.; Cole, A.A.; et al. Radiomics for Predicting Lung Cancer Outcomes Following Radiotherapy: A Systematic Review. Clin. Oncol. 2022, 34, e107–e122.
  11. Parmar, C.; Grossmann, P.; Bussink, J.; Lambin, P.; Aerts, H.J. Machine Learning methods for Quantitative Radiomic Biomarkers. Sci. Rep. 2015, 5, 13087.
  12. Chaddad, A.; Desrosiers, C.; Toews, M.; Abdulkarim, B. Predicting survival time of lung cancer patients using radiomic analysis. Oncotarget 2017, 8, 104393–104407.
  13. Le, V.; Kha, Q.H.; Hung, T.N.K.; Le, N.Q.K. Risk Score Generated from CT-Based Radiomics Signatures for Overall Survival Prediction in Non-Small Cell Lung Cancer. Cancers 2021, 13, 3616.
  14. Ching, J.C.F.; Lam, S.; Lam, C.C.H.; Lui, A.O.Y.; Kwong, J.C.K.; Lo, A.Y.H.; Chan, J.W.H.; Cai, J.; Leung, W.S.; Lee, S.W.Y. Integrating CT-based radiomic model with clinical features improves long-term prognostication in high-risk prostate cancer. Front. Oncol. 2023, 13, 1060687.
  15. Luna, J.; Barsky, A.; Shinohara, R.; Roshkovan, L.; Hershman, M.; Dreyfuss, A.; Horng, H.; Lou, C.; Noël, P.; Cengel, K.; et al. Radiomic Phenotypes for Improving Early Prediction of Survival in Stage III Non-Small Cell Lung Cancer Adenocarcinoma after Chemoradiation. Cancers 2022, 14, 700.
  16. The Cancer Imaging Archive (TCIA). 2020. Available online: https://www.cancerimagingarchive.net/ (accessed on 1 January 2023).
  17. R: The R Project for Statistical Computing. (n.d.). Available online: https://www.r-project.org/ (accessed on 5 January 2023).
  18. Tang, F.H.; Cheung, E.Y.W.; Wong, H.L.; Yuen, C.M.; Yu, M.H.; Ho, P.C. Radiomics from Various Tumour Volume Sizes for Prognosis Prediction of Head and Neck Squamous Cell Carcinoma: A Voted Ensemble Machine Learning Approach. Life 2022, 12, 1380.
  19. Shin, T. Ensemble Learning, Bagging, and Boosting Explained in 3 Minutes. Medium. 2021. Available online: https://towardsdatascience.com/ensemble-learning-bagging-and-boosting-explained-in-3-minutes-2e6d2240ae21 (accessed on 6 April 2023).
  20. Mandrekar, J.N. Receiver Operating Characteristic Curve in Diagnostic Test Assessment. J. Thorac. Oncol. 2010, 5, 1315–1316.
  21. Liu, L.; Shi, M.; Wang, Z.; Lu, H.; Li, C.M.; Tao, Y.; Chen, X.; Zhao, J. A molecular and staging model predicts survival in patients with resected non-small cell lung cancer. BMC Cancer 2018, 18, 966.
  22. Wang, T.; She, Y.; Yang, Y.; Liu, X.; Chen, S.; Zhong, Y.; Deng, J.; Zhao, M.; Sun, X.; Xie, D.; et al. Radiomics for Survival Risk Stratification of Clinical and Pathologic Stage IA Pure-Solid Non–Small Cell Lung Cancer. Radiology 2021, 302, 425–434.
  23. Zhang, L.; Lv, L.; Li, L.; Wang, Y.; Zhao, S.G.; Miao, L.; Gao, Y.; Liu, L.; Wu, N. Radiomics Signature to Predict Prognosis in Early-Stage Lung Adenocarcinoma (≤3 cm) Patients with No Lymph Node Metastasis. Diagnostics 2022, 12, 1907.
  24. Hong, D.; Zhang, L.; Xu, K.; Wan, X.; Guo, Y. Prognostic Value of Pre-Treatment CT Radiomics and Clinical Factors for the Overall Survival of Advanced (IIIB–IV) Lung Adenocarcinoma Patients. Front. Oncol. 2021, 11, 628982.
  25. Chen, W.; Wang, L.; Hou, Y.; Li, L.; Chang, L.; Li, Y.; Xie, K.; Qiu, L.; Mao, D.; Li, W.; et al. Combined Radiomics-Clinical Model to Predict Radiotherapy Response in Inoperable Stage III and IV Non-Small-Cell Lung Cancer. Technol. Cancer Res. Treat. 2022, 21, 15330338221142400.
  26. Ai, Y.; Zhang, J.; Jin, J.; Zhang, J.; Zhu, H.; Jin, X. Preoperative Prediction of Metastasis for Ovarian Cancer Based on Computed Tomography Radiomic Features and Clinical Factor. Front. Oncol. 2021, 11, 610742.
  27. Tang, F.H.; Ng, D.K.; Chow, D.H. An image feature approach for computer-aided detection of ischemic stroke. Comput. Biol. Med. 2011, 41, 529–536.
  28. Avanzo, M.; Stancanello, J.; Pirrone, G.; Sartor, G. Radiomics and deep learning in lung cancer. Strahlenther Onkol. 2020, 196, 879–887.
  29. Binczyk, F.; Prazuch, W.; Bozek, P.; Polanska, J. Radiomics and artificial intelligence in lung cancer screening. Transl. Lung Cancer Res. 2021, 10, 1186–1199.
  30. Cameron, A.; Khalvati, F.; Haider, M.A.; Wong, A. MAPS: A Quantitative Radiomics Approach for Prostate Cancer Detection. IEEE Trans. Biomed. Eng. 2016, 63, 1145–1156.
  31. Zhou, M.; Scott, J.; Chaudhury, B.; Hall, L.; Goldgof, D.; Yeom, K.W.; Iv, M.; Ou, Y.; Kalpathy-Cramer, J.; Napel, S.; et al. Radiomics in Brain Tumor: Image Assessment, Quantitative Feature Descriptors, and Machine-Learning Approaches. AJNR Am. J. Neuroradiol. 2018, 39, 208–216.
  32. Obiols, M.H.; Jiao, Y.; Wang, Q. Can radiomics features boost the performance of deep learning upon histology images? In Proceedings of the 2019 International Conference on Medical Imaging Physics and Engineering (ICMIPE), Shenzhen, China, 22–24 November 2019; pp. 1–6.
  33. Cui, Y.; Yin, F.F. Impact of image quality on radiomics applications. Phys. Med. Biol. 2022, 67, 15TR03.
  34. Dai, H.; Lu, M.; Huang, B.; Tang, M.; Pang, T.; Liao, B.; Cai, H.; Huang, M.; Zhou, Y.; Chen, X.; et al. Considerable effects of imaging sequences, feature extraction, feature selection, and classifiers on radiomics-based prediction of microvascular invasion in hepatocellular carcinoma using magnetic resonance imaging. Quant. Imaging Med. Surg. 2021, 11, 1836–1853.
  35. An, C.; Park, Y.W.; Ahn, S.S.; Han, K.; Kim, H.; Lee, S.-K. Radiomics machine learning study with a small sample size: Single random training-test set split may lead to unreliable results. PLoS ONE 2021, 16, e0256152.
  36. Chetan, M.R.; Gleeson, F.V. Radiomics in predicting treatment response in non-small-cell lung cancer: Current status, challenges and future perspectives. Eur. Radiol. 2021, 31, 1049–1058.
Figure 1. Schematic diagram of voted ensemble machine learning model.
Figure 2. Schematic diagram for the probability-weighted enhanced model (PWEM).
Figure 3. Prediction of 1-year survival using RAT, CF, and PWE models.
Figure 4. Prediction of 3-year survival using RAT, CF, and PWE models.
Figure 5. Prediction of 5-year survival using RAT, CF, and PWE models.
Table 1. Radiomic features summary.
| Feature Group | Number of Features |
| --- | --- |
| Shape | 14 |
| First-order feature | 18 |
| Gray level co-occurrence matrix | 24 |
| Gray level dependence matrix | 14 |
| Gray level run length matrix | 16 |
| Gray level size zone matrix | 16 |
| Neighborhood gray tone difference matrix | 5 |
| Total | 107 |
Table 2. Balanced sample size at various endpoints.
Sample size: 224
| Endpoint | 1-Year Survival | 3-Year Survival | 5-Year Survival |
| --- | --- | --- | --- |
| Balanced sample | 119 alive, 119 dead | 112 alive, 112 dead | 64 alive, 64 dead |
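One common way to obtain equal class counts like those in Table 2 is random undersampling of the larger class. This section does not say how the balanced samples were actually drawn, so the sketch below is an assumption for illustration only: the function name `undersample`, the fixed seed, and the synthetic labels are all hypothetical and are not taken from the study.

```python
import random

def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset (hypothetical helper).

    Assumes exactly two classes, labelled "alive" and "dead".
    """
    rng = random.Random(seed)
    alive = [i for i, y in enumerate(labels) if y == "alive"]
    dead = [i for i, y in enumerate(labels) if y == "dead"]
    # Keep as many cases per class as the smaller class provides.
    n = min(len(alive), len(dead))
    return sorted(rng.sample(alive, n) + rng.sample(dead, n))

# Synthetic labels only; these are not patient data from the study.
labels = ["alive"] * 10 + ["dead"] * 4
idx = undersample(labels)
balanced = [labels[i] for i in idx]
print(balanced.count("alive"), balanced.count("dead"))
```

With equal class counts, accuracy weights both outcomes equally, which makes it easier to compare across endpoints.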
Table 3. Patient demographics and tumor characteristics.
| Patient Demographics | No. of Patients (%) | | No. of Patients (%) |
| --- | --- | --- | --- |
| Gender | | Age | |
| Male | 237 (67%) | ≤65 y/o | 135 (38%) |
| Female | 115 (33%) | >65 y/o | 217 (62%) |
| Overall Stage | | T Stage | |
| I | 60 (17%) | T1 | 63 (18%) |
| II | 35 (10%) | T2 | 135 (38%) |
| IIIa | 103 (29%) | T3 | 49 (14%) |
| IIIb | 154 (44%) | T4 | 105 (30%) |
| Histology | | N Stage | |
| Adenocarcinoma | 48 (14%) | N0 | 131 (37%) |
| Large Cell Carcinoma | 105 (30%) | N1 | 20 (5%) |
| Squamous Cell Carcinoma | 142 (40%) | N2 | 125 (36%) |
| Not Otherwise Specified | 57 (16%) | N3 | 73 (21%) |
| | | N4 | 3 (1%) |
Table 4. Summary of predictive performance of ML models.
| Endpoint | Machine Learning Model | AUC [95% Confidence Interval] |
| --- | --- | --- |
| 1-year survival | Radiomic model | 0.931 [0.894, 0.956] |
| | Clinical factors model | 0.869 [0.817, 0.909] |
| | Probability weighted enhanced model | 0.955 [0.926, 0.974] |
| 3-year survival | Radiomic model | 0.952 [0.921, 0.973] |
| | Clinical factors model | 0.855 [0.801, 0.898] |
| | Probability weighted enhanced model | 0.950 [0.919, 0.971] |
| 5-year survival | Radiomic model | 0.942 [0.892, 0.971] |
| | Clinical factors model | 0.846 [0.770, 0.903] |
| | Probability weighted enhanced model | 0.941 [0.891, 0.971] |
Table 5. Summary of significant difference between models (Chi-square test value and p value).
| Survival Year(s) | RAT vs. CF | RAT vs. PWE | CF vs. PWE |
| --- | --- | --- | --- |
| 1 | 8.0667 (p < 0.05) | 10.5986 (p < 0.05) | 21.708 (p < 0.05) |
| 3 | 18.2596 (p < 0.05) | 2.2314 (p > 0.05) | 21.9264 (p < 0.05) |
| 5 | 10.1110 (p < 0.05) | 0.381 (p > 0.05) | 7.8133 (p < 0.05) |
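As a rough check on the p-value thresholds in Table 5, the sketch below converts a chi-square statistic into an upper-tail p-value. It assumes one degree of freedom, which the table does not state. `chi2_sf_df1` is a hypothetical helper name, and only the 1-year and 3-year statistics are used.

```python
import math

def chi2_sf_df1(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic, assuming 1 degree of freedom."""
    # With 1 df the statistic is the square of a standard normal Z, so
    # P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

# Chi-square values as reported in Table 5 (1- and 3-year endpoints).
reported = {
    (1, "RAT vs CF"): 8.0667,
    (1, "RAT vs PWE"): 10.5986,
    (1, "CF vs PWE"): 21.708,
    (3, "RAT vs CF"): 18.2596,
    (3, "RAT vs PWE"): 2.2314,
    (3, "CF vs PWE"): 21.9264,
}

p_values = {key: chi2_sf_df1(stat) for key, stat in reported.items()}
significant = {key: p < 0.05 for key, p in p_values.items()}

for (year, pair), p in p_values.items():
    print(f"{year}-year {pair}: p = {p:.4g}")
```

Under this assumption, any statistic above about 3.84 falls below p = 0.05, which agrees with the significance labels reported for these two endpoints.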
Table 6. Summary of predictive performance of machine learning models in sensitivity, specificity, and accuracy.
| Metric | Survival Year(s) | RAT | CF | PWE |
| --- | --- | --- | --- | --- |
| Sensitivity | 1 | 0.9244 | 0.9076 | 0.9244 |
| | 3 | 0.9107 | 0.8661 | 0.9196 |
| | 5 | 0.7656 | 0.7969 | 0.7813 |
| Specificity | 1 | 0.8487 | 0.6723 | 0.8487 |
| | 3 | 0.9018 | 0.7232 | 0.9018 |
| | 5 | 0.9531 | 0.8594 | 0.9531 |
| Accuracy | 1 | 0.8866 | 0.7899 | 0.8866 |
| | 3 | 0.9063 | 0.7946 | 0.9107 |
| | 5 | 0.8594 | 0.8281 | 0.8672 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tang, F.-H.; Fong, Y.-W.; Yung, S.-H.; Wong, C.-K.; Tu, C.-L.; Chan, M.-T. Radiomics-Clinical AI Model with Probability Weighted Strategy for Prognosis Prediction in Non-Small Cell Lung Cancer. Biomedicines 2023, 11, 2093. https://doi.org/10.3390/biomedicines11082093
