Vertebral Column Pathology Diagnosis Using Ensemble Strategies Based on Supervised Machine Learning Techniques
Abstract
1. Introduction
- The first step is to build the dataset of interest. To accomplish this, it is necessary to consider the different outputs (classes) into which the information will be divided. It is also essential to determine the inputs and characteristics (attributes) from which the classes are to be evaluated [11].
- Once the information is gathered, it must be analyzed and filtered. This step is essential for assessing the adequacy of the dataset in terms of the number of samples, the representativeness of the characteristics, and the accuracy of the element values, among other factors [12].
- Once the dataset is deemed reliable for analysis, a specific pattern recognition approach can be trained or directly executed, depending on its behavior, to obtain its optimal settings (hyperparameters). The resulting model then categorizes future data based on the chosen attributes [13,14].
2. Description and Analysis of the Dataset
- (i) Pelvic incidence (PI): The angle between a line perpendicular to the sacral plate at its center and a line connecting the same point to the center of the bicoxofemoral axis [48].
- (ii) Pelvic tilt (PT): The angle estimated between two reference lines: a vertical line through the center of the femoral head, and a line from the center of the femoral head to the midpoint of the sacral endplate [49].
- (iii) Lumbar lordosis angle (LLA): The angle measured in the sagittal plane between the two ends of the lumbar curve [50].
- (iv) Sacral slope (SS): The angle produced by a line parallel to the sacral endplate and a horizontal reference line [49].
- (v) Pelvic radius (PR): The distance from the hip axis to the posterior–superior corner of the S1 endplate [51].
- (vi) Grade of spondylolisthesis (GS): The grades are defined as follows: grade I represents 0–25%, grade II 25–50%, grade III 50–75%, and grade IV 75–100%. These percentages express how far the cephalad vertebra has slipped anteriorly relative to the caudal vertebra. The superior endplate of the caudal vertebral body is divided into four equal quadrants, and the magnitude of the slip is based on the percentage of endplate uncovered as a result of the slip [52] (a loading sketch based on these attributes follows this list).
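The paper does not include source code, so the following is only a minimal loading sketch under assumptions: the raw UCI file is `column_3C.dat` (space-separated, no header row), and the short column names are abbreviations taken from the list above.

```python
# Hypothetical loading sketch for the UCI vertebral column dataset.
# Assumptions: file name column_3C.dat, space-separated values, and class
# labels DH (disc hernia), SL (spondylolisthesis), NO (normal).
import pandas as pd

COLUMNS = ["PI", "PT", "LLA", "SS", "PR", "GS", "class"]

data = pd.read_csv("column_3C.dat", sep=" ", header=None, names=COLUMNS)

X = data[COLUMNS[:-1]]   # the six biomechanical attributes described above
y = data["class"]        # diagnosis label per element
print(y.value_counts())  # reveals the class imbalance of the original data
```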
3. Classifier Methods
3.1. Basic Classifier Methods
3.1.1. k-Nearest Neighbors (kNN)
- It is necessary to find the correct k value, i.e., the number of elements to be considered in the distance comparison.
- If a new element to be classified falls in a region where its k neighbors are evenly distributed among two or more classes, the classification is ambiguous (without additional tie-breaking implementations) because there is no dominant class.
- If the attributes of the elements are discrete (categorical), the distances cannot be measured directly without additional assumptions based on prior knowledge of the dataset (a minimal usage sketch follows this list).
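As a point of reference, here is a minimal kNN sketch under scikit-learn's API; the k value and the train/test split are illustrative assumptions, not the tuned settings of Section 4.2, and the sketch reuses X and y from the loading sketch in Section 2.

```python
# Minimal kNN sketch (scikit-learn API); k and the split are illustrative.
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Reuses X and y from the loading sketch in Section 2.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)         # stores the training elements
print(knn.score(X_test, y_test))  # mean accuracy on held-out elements
```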
3.1.2. Naïve Bayes (NB)
3.1.3. Logistic Regression (LR)
- Gradient-based:
  - Library for Large Linear (liblinear).
  - Stochastic Average Gradient (sag).
  - Stochastic Average Gradient Accelerated (saga).
- Hessian-based:
  - Limited-memory Broyden–Fletcher–Goldfarb–Shanno (lbfgs).
  - Newton–Cholesky algorithm (newton-cholesky).
  - Newton conjugate gradient (newton-cg) (a solver-comparison sketch follows this list).
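These solver names match scikit-learn's LogisticRegression options (newton-cholesky requires scikit-learn ≥ 1.2). The sketch below compares them under assumptions: the fold count and max_iter value are illustrative, and X, y come from the earlier loading sketch.

```python
# Sketch comparing the LR solvers listed above (scikit-learn names).
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

SOLVERS = ["liblinear", "sag", "saga", "lbfgs", "newton-cholesky", "newton-cg"]

for solver in SOLVERS:
    # max_iter is raised because sag/saga converge slowly on unscaled data.
    clf = LogisticRegression(solver=solver, max_iter=5000)
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{solver:>15}: {acc:.4f}")
```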
3.1.4. Linear and Quadratic Discriminant Analysis (LDA and QDA)
- Singular value decomposition (svd): This approach decomposes a matrix into three matrices, helping to reduce dimensionality, which aids attribute extraction tasks.
- Least squares solution by QR decomposition (lsqr): An iterative method that approximates the eigenvectors and eigenvalues used in the function regression. It helps reduce the computational burden in large-scale classification problems.
- Eigenvalue decomposition (eigen): This approach directly computes the exact eigenvalue decomposition of the covariance matrix. It is the most computationally intensive but achieves an exact solution (a short sketch of these solver choices follows this list).
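A short sketch of the three solver choices, again under scikit-learn's API and reusing the earlier train/test split; QDA exposes no solver choice, so it is included for contrast only.

```python
# Sketch of the LDA solver choices described above; QDA has no solver option.
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

for solver in ["svd", "lsqr", "eigen"]:
    lda = LinearDiscriminantAnalysis(solver=solver).fit(X_train, y_train)
    print(f"LDA ({solver}): {lda.score(X_test, y_test):.4f}")

qda = QuadraticDiscriminantAnalysis().fit(X_train, y_train)
print(f"QDA: {qda.score(X_test, y_test):.4f}")
```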
3.1.5. Support Vector Machine (SVM)
3.1.6. Artificial Neural Network (ANN)
3.1.7. Decision Tree (DT)
- The criterion to split the tree into a new branch. Some of the most common criteria are [70]:
  - Gini: This measures how often a randomly chosen element from a set would be incorrectly labeled if it were randomly labeled according to the distribution of labels in a subset.
  - Entropy: This measures the randomness in the dataset split.
  - Logarithmic loss (log_loss): This measures the performance of a classification model whose predicted output is a probability value between 0 and 1.
- The minimum number of elements required to halt further divisions.
- The number of splits based on certain criteria.
- The maximum depth (number of levels) of the tree (a sketch mapping these hyperparameters to an implementation follows this list).
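These hyperparameters map directly onto scikit-learn's DecisionTreeClassifier. The sketch below instantiates them with the tuned values later reported in Table 4; the fixed random_state is an assumption added for reproducibility.

```python
# Sketch mapping the hyperparameters above to scikit-learn's decision tree,
# using the tuned values reported in Table 4.
from sklearn.tree import DecisionTreeClassifier

dt = DecisionTreeClassifier(
    criterion="gini",      # split criterion: "gini", "entropy", or "log_loss"
    splitter="best",       # optimization-based ("best") or stochastic ("random")
    max_depth=10,          # maximum number of levels in the tree
    min_samples_leaf=1,    # minimum elements required to form a leaf
    min_samples_split=3,   # minimum elements required to attempt a split
    random_state=0,        # assumption: fixed seed for reproducibility
)
dt.fit(X_train, y_train)
print(dt.score(X_test, y_test))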
3.2. Ensemble Classifier Strategies
- Voting classifier (VC) [74]: As its name suggests, this strategy performs a voting process among the implemented baseline classifiers, where the final forecast is the majority of the individual predictions. A variant of the voting process assigns weights to each individual prediction according to the classifiers' reliability.
- Stacking classifier (SC) [75]: This ensemble strategy uses an additional classifier (meta-classifier) to compute the final prediction. The meta-classifier is usually an LR classifier because of its simplicity, providing a smooth interpretation of the predictions made by the baseline classifiers.
- Random Forest (RF) [76]: This method is similar to the voting classifier, since the final prediction is the majority of the individual predictions of the baseline classifiers. Its most representative characteristic is that it uses only Decision Trees as baseline classifiers, each different from the others. Another noteworthy point is that each baseline classifier in the RF evaluates only a subset of the attributes of the dataset elements, which makes it less complex than a single decision tree performing the whole classification process (a sketch of the three strategies follows this list).
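A compact sketch of the three strategies under scikit-learn's API is shown below; the member classifiers and their settings are placeholders for illustration, not the tuned ensembles evaluated in Section 4.4.

```python
# Sketch of the VC, SC, and RF strategies (scikit-learn API).
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder baseline classifiers (not the paper's tuned versions).
members = [
    ("knn", KNeighborsClassifier()),
    ("svm", SVC()),
    ("dt", DecisionTreeClassifier()),
]

vc = VotingClassifier(estimators=members, voting="hard")       # majority vote
sc = StackingClassifier(estimators=members,
                        final_estimator=LogisticRegression())  # LR meta-classifier
rf = RandomForestClassifier(n_estimators=100)  # trees on random attribute subsets

for name, ens in [("VC", vc), ("SC", sc), ("RF", rf)]:
    ens.fit(X_train, y_train)
    print(f"{name}: {ens.score(X_test, y_test):.4f}")
```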
4. Results
4.1. Experimentation Methodology
- Firstly, the hyperparameters of the baseline classifiers presented in this work are tuned through a grid-search K-Fold cross-validated strategy. This process aims to find the most suitable hyperparameters per classifier, hence the best version of the baseline classifiers. A thorough description of this process is presented in Section 4.2.
- An analysis of the results of the tuned baseline classifiers is presented in Section 4.3. This analysis aims to provide insight into the baseline classifiers and find the best for the particular problem of the vertebral column disease classification.
- Section 4.4 encompasses the description and analysis of the results of different ensemble strategies (with subvariant proposals). This analysis highlights the behavior and performance of the employed ensemble strategies when different baseline classifiers are used, providing a better understanding of their advantages and limitations.
- Finally, a general discussion is presented in Section 4.5. This discussion not only highlights the results of the baseline classifiers and the ensemble strategies but also discusses their trustworthiness in this particular medical problem for vertebral column disease classification.
4.2. Baseline Classifiers’ Hyperparameter Tuning
- kNN classifier: The grid-search process varies the distance metric between Euclidean and Manhattan. Also, the number of neighbors is evaluated from 1 to 30, i.e., $k \in \{1, 2, \ldots, 30\}$, considering the maximum permissible by the K-Fold process.
- LR classifier: In this classifier, the optimization algorithm (solver) used to perform the training is selected among the liblinear, sag, saga, lbfgs, newton-cholesky, and newton-cg algorithms.
- LDA classifier: In this classifier, the solver employed to compute the eigenvectors and eigenvalues used for the function regression is chosen among the svd, lsqr, and eigen strategies.
- SVM classifier: The grid-search strategy evaluates the kernel function employed to create the hyperplane separating the dataset elements, varying among linear, polynomial, radial basis, and sigmoid functions. In the case of the polynomial kernel, its degree varies between 2 and 10, i.e., $d \in \{2, 3, \ldots, 10\}$. This limit was set through empirical observation: higher polynomial degrees increased the computational burden without enhancing the classification task. Such effects of increasing the polynomial degree have been analyzed in [81] with similar outcomes.
- ANN classifier: The grid-search strategy evaluates multiple hyperparameters, such as the number of hidden layers (between 1 and 10) and the number of neurons per hidden layer (between 1 and 10). Also, the activation function used in the neurons is chosen among the identity, logistic, tanh, and relu functions (see Table 3). Moreover, the optimization algorithm (solver) varies among lbfgs, sgd, and adam. Finally, the learning rate is varied between 0.001 and 0.9 with a step size of 0.005, i.e., $\eta \in \{0.001, 0.006, 0.011, \ldots, 0.896\}$. The ranges for the learning rate, step size, number of hidden layers, and neurons per layer are established so that the computational burden (in terms of time) stays within limits where other classifiers achieved acceptable outcomes without further endeavor.
- DT classifier: The grid search varies multiple hyperparameters: the criterion to split the elements of the dataset (gini, entropy, or log_loss); the splitter technique, which can be optimization-based (best) or stochastic-based (random); the maximum depth of the tree, between 5 and 10; the minimum number of samples to consider a node a leaf, between 1 and 10; and the minimum number of samples to split a node, between 1 and 10. Similarly, these ranges are selected so that the computational burden limits are not surpassed (a tuning sketch for the kNN case follows this list).
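As an illustration of this tuning loop, the sketch below reproduces the kNN case with scikit-learn's GridSearchCV; the number of folds is an assumption, since the text specifies a K-Fold process without fixing K at this point.

```python
# Sketch of the grid-search K-Fold tuning, shown for the kNN case.
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

param_grid = {
    "n_neighbors": list(range(1, 31)),     # k in {1, 2, ..., 30}
    "metric": ["euclidean", "manhattan"],  # the two distance metrics evaluated
}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),  # assumed K
    scoring="accuracy",
)
search.fit(X_train, y_train)
print(search.best_params_)  # Table 4 reports k = 1 with the Euclidean metric
```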
4.3. Baseline Classifiers' Results
- The kNN classifier yields the most outstanding results, being the only one whose accuracy, precision, recall, and F1-score are all above 0.9. This is reflected in its CCE evaluation, where it correctly classified 427 elements. Also, the kNN classifier is the most trustworthy approach for this particular medical application since it misclassified only four disease elements as normal elements, i.e., MDE = 4 (a sketch of how CCE and MDE are computed follows this list).
- It is noteworthy that the SVM classifier achieves the second-best outcomes. This is interesting since both the SVM and kNN classifiers use geometrical strategies. Nonetheless, the SVM classifier loses competitiveness against the kNN classifier, with a CCE evaluation 43 elements lower. The SVM's MDE evaluation (MDE = 8) is twice as high (worse) as that reported by the best classifier.
- The DT classifier takes third place, with a CCE evaluation 54 elements below that of the best classifier. However, even if its CCE evaluation is worse than the SVM classifier's outcome, the DT classifier yields more reliable results since its MDE evaluation is lower (MDE = 6).
- The ANN classifier reaches fourth place, followed by the LR classifier, with both obtaining similar outcomes. Interestingly, even though the ANN classifier is based on a more complex process, it does not obtain better results.
- The classifiers based on probabilistic strategies (NB, LDA, and QDA) report the worst outcomes. Their poor performance is related to the dataset characteristics, where there are high correlations among the attributes, as observed in Figure 1. This characteristic conflicts with the operating principle of the probabilistic classifiers, which require statistical independence among the evaluated attributes.
- Particularly, regarding the MDE, it is observed that the baseline classifiers tend to misclassify SL elements as NO elements. Notably, the SL elements were the majority class in the original imbalanced data.
- It is observed that the baseline classifiers based on probabilistic strategies require less time to be implemented, but their classification performance is lower than that of the other classifiers. On the other hand, it is interesting that the best baseline classifier (kNN) is 26.6 times faster than the second-best baseline classifier (SVM), indicating that the vertebral column disease classification task does not demand excessive computational effort.
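CCE and MDE are the paper's element-level metrics. The sketch below shows how both can be computed from a confusion matrix, under the assumption that the class order is (DH, SL, NO), which is consistent with the matrices reported in Table 5; it reuses the illustrative kNN model and test split from the earlier sketches.

```python
# Sketch of the CCE and MDE metrics from a confusion matrix.
# Assumption: class order (DH, SL, NO), consistent with Table 5.
import numpy as np
from sklearn.metrics import confusion_matrix

LABELS = ["DH", "SL", "NO"]
cm = confusion_matrix(y_test, knn.predict(X_test), labels=LABELS)

cce = int(np.trace(cm))  # correctly classified elements (diagonal sum)

# MDE: disease elements (DH, SL rows) predicted as normal (NO column).
no_col = LABELS.index("NO")
mde = int(cm[:no_col, no_col].sum())

print(f"CCE = {cce}, MDE = {mde}")
```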
4.4. Ensemble Classifiers' Results
- Regarding the ensemble strategies that use all the baseline classifiers, the SCall ensemble yields the best outcomes, being the only ensemble strategy able to achieve results above 0.95 in all the evaluated metrics. These results make the SCall ensemble the most outstanding option among all the ensembles and baseline classifiers. On the other hand, the VCall ensemble cannot boost its performance beyond the best baseline classifier (kNN).
- Concerning the ensemble strategies that use only the three best baseline classifiers, the VCtop and SCtop ensembles have outcomes similar to the best baseline classifier (kNN). Particularly, the VCtop ensemble matches kNN's outcomes in all the evaluated metrics, while the SCtop ensemble loses kNN's competitiveness by increasing its MDE valuation, misclassifying six disease elements as normal elements.
- The RF ensemble is the least promising ensemble strategy. Nonetheless, the RF ensemble improves the CCE valuation of the DT classifier (its baseline classifier) by 24 elements. These results show that an ensemble strategy that uses only one kind of baseline classifier has limited performance, reaching only small improvements.
- Regarding the MDE metric, it is observed that, as with the baseline classifiers, the ensemble strategies tend to miscategorize SL elements as NO ones. This can result from the overlap between the PT and GS attributes of the dataset, as observed in Figure 2. This kind of error might be mitigated by removing the overlapping attributes from the dataset, yet the best strategy is to consider additional attributes that clarify the differences between these classes. Therefore, in clinical problems, machine learning results must be ratified by an expert in the field [82].
- Since the ensemble strategies work with already-tuned baseline classifiers (except for the RF ensemble strategy), the time measurement considers only the time required to complete the classification task, for a fair comparison. Particularly regarding the VC and SC ensemble strategies, it is clear that the variants using all the baseline classifiers require more computational time than the variants using only the top three, which is an expected result according to related works like [83,84].
4.5. General Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Marengo, L. Is this time different? A note on automation and labour in the fourth industrial revolution. J. Ind. Bus. Econ. 2019, 46, 323–331. [Google Scholar] [CrossRef]
- Avril, E.; Cegarra, J.; Wiol, L.; Navarro, J. Automation Type and Reliability Impact on Visual Automation Monitoring and Human Performance. Int. J. Hum.–Comput. Interact. 2022, 38, 64–77. [Google Scholar] [CrossRef]
- Hutchinson, K.; Sloutsky, R.; Collimore, A.; Adams, B.; Harris, B.; Ellis, T.D.; Awad, L.N. A Music-Based Digital Therapeutic: Proof-of-Concept Automation of a Progressive and Individualized Rhythm-Based Walking Training Program after Stroke. Neurorehabilit. Neural Repair 2020, 34, 986–996. [Google Scholar] [CrossRef] [PubMed]
- Shen, Y.; Borowski, J.E.; Hardy, M.A.; Sarpong, R.; Doyle, A.G.; Cernak, T. Automation and computer-assisted planning for chemical synthesis. Nat. Rev. Methods Prim. 2021, 1, 23. [Google Scholar] [CrossRef]
- Kothamachu, V.B.; Zaini, S.; Muffatto, F. Role of Digital Microfluidics in Enabling Access to Laboratory Automation and Making Biology Programmable. SLAS Technol. Transl. Life Sci. Innov. 2020, 25, 411–426. [Google Scholar] [CrossRef] [PubMed]
- Morton, S.E.; Knopp, J.L.; Chase, J.G.; Docherty, P.; Howe, S.L.; Möller, K.; Shaw, G.M.; Tawhai, M. Optimising mechanical ventilation through model-based methods and automation. Annu. Rev. Control 2019, 48, 369–382. [Google Scholar] [CrossRef] [PubMed]
- Bahrin, M.A.K.; Othman, M.F.; Azli, N.H.N.; Talib, M.F. Industry 4.0: A review on industrial automation and robotic. J. Teknol. 2016, 78, 137–143. [Google Scholar] [CrossRef]
- Bishop, C.M.; Nasrabadi, N.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
- De Sa, J.M. Pattern Recognition: Concepts, Methods, and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2001. [Google Scholar] [CrossRef]
- Pal, S.K.; Pal, A. Pattern Recognition: From Classical to Modern Approaches; World Scientific: Singapore, 2001. [Google Scholar]
- Abraham, A.; Falcón, R.; Bello, R. Rough Set Theory: A True Landmark in Data Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009; Volume 174. [Google Scholar] [CrossRef]
- García, S.; Luengo, J.; Herrera, F. Data Preprocessing in Data Mining; Springer: Berlin/Heidelberg, Germany, 2015; Volume 71. [Google Scholar] [CrossRef]
- Hart, P.E.; Stork, D.G.; Duda, R.O. Pattern Classification; Wiley Hoboken: Hoboken, NJ, USA, 2000. [Google Scholar]
- Kuncheva, L.I. Combining Pattern Classifiers: Methods and Algorithms; John Wiley & Sons: Hoboken, NJ, USA, 2014. [Google Scholar] [CrossRef]
- Mullin, M.D.; Sukthankar, R. Complete Cross-Validation for Nearest Neighbor Classifiers. In Proceedings of the ICML, Stanford, CA, USA, 29 June–2 July 2000; pp. 639–646. [Google Scholar]
- Purushotham, S.; Tripathy, B. Evaluation of classifier models using stratified tenfold cross validation techniques. In International Conference on Computing and Communication Systems; Springer: Berlin/Heidelberg, Germany, 2011; pp. 680–690. [Google Scholar] [CrossRef]
- Chiang, L.H.; Russell, E.L.; Braatz, R.D. Pattern Classification. In Fault Detection and Diagnosis in Industrial Systems; Springer: Berlin/Heidelberg, Germany, 2001; pp. 27–31. [Google Scholar] [CrossRef]
- O’Sullivan, C.M.; Ghahramani, A.; Deo, R.C.; Pembleton, K.G. Pattern recognition describing spatio-temporal drivers of catchment classification for water quality. Sci. Total Environ. 2023, 861, 160240. [Google Scholar] [CrossRef]
- Esteki, M.; Memarbashi, N.; Simal-Gandara, J. Classification and authentication of tea according to their harvest season based on FT-IR fingerprinting using pattern recognition methods. J. Food Compos. Anal. 2023, 115, 104995. [Google Scholar] [CrossRef]
- Tuncer, T.; Dogan, S.; Subasi, A. Surface EMG signal classification using ternary pattern and discrete wavelet transform based feature extraction for hand movement recognition. Biomed. Signal Process. Control 2020, 58, 101872. [Google Scholar] [CrossRef]
- Fernandez, N.; Lorenzo, A.J.; Rickard, M.; Chua, M.; Pippi-Salle, J.L.; Perez, J.; Braga, L.H.; Matava, C. Digital Pattern Recognition for the Identification and Classification of Hypospadias Using Artificial Intelligence vs. Experienced Pediatric Urologist. Urology 2021, 147, 264–269. [Google Scholar] [CrossRef]
- Kazmierska, J.; Malicki, J. Application of the Naïve Bayesian Classifier to optimize treatment decisions. Radiother. Oncol. 2008, 86, 211–216. [Google Scholar] [CrossRef] [PubMed]
- Wolpert, D.H. The Supervised Learning No-Free-Lunch Theorems. In Soft Computing and Industry: Recent Applications; Springer: London, UK, 2002; pp. 25–42. [Google Scholar] [CrossRef]
- Duarte, E.; Wainer, J. Empirical comparison of cross-validation and internal metrics for tuning SVM hyperparameters. Pattern Recognit. Lett. 2017, 88, 6–11. [Google Scholar] [CrossRef]
- Shankar, K.; Zhang, Y.; Liu, Y.; Wu, L.; Chen, C.H. Hyperparameter Tuning Deep Learning for Diabetic Retinopathy Fundus Image Classification. IEEE Access 2020, 8, 118164–118173. [Google Scholar] [CrossRef]
- Sun, J.; Zheng, C.; Li, X.; Zhou, Y. Analysis of the Distance Between Two Classes for Tuning SVM Hyperparameters. IEEE Trans. Neural Netw. 2010, 21, 305–318. [Google Scholar] [CrossRef] [PubMed]
- Alawad, W.; Zohdy, M.; Debnath, D. Tuning Hyperparameters of Decision Tree Classifiers Using Computationally Efficient Schemes. In Proceedings of the 2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), Laguna Hills, CA, USA, 26–28 September 2018; pp. 168–169. [Google Scholar] [CrossRef]
- Akinsola, J.E.T. Supervised Machine Learning Algorithms: Classification and Comparison. Int. J. Comput. Trends Technol. (IJCTT) 2017, 48, 128–138. [Google Scholar] [CrossRef]
- Sen, P.C.; Hajra, M.; Ghosh, M. Supervised Classification Algorithms in Machine Learning: A Survey and Review. In Emerging Technology in Modelling and Graphics; Mandal, J.K., Bhattacharya, D., Eds.; Springer: Singapore, 2020; pp. 99–111. [Google Scholar]
- Zhu, M.; Li, Y.; Wang, Y. Design and experiment verification of a novel analysis framework for recognition of driver injury patterns: From a multi-class classification perspective. Accid. Anal. Prev. 2018, 120, 152–164. [Google Scholar] [CrossRef] [PubMed]
- Unal, Y.; Polat, K.; Kocer, H.E. Classification of vertebral column disorders and lumbar discs disease using attribute weighting algorithm with mean shift clustering. Measurement 2016, 77, 278–291. [Google Scholar] [CrossRef]
- Kadhim, A.I. Survey on supervised machine learning techniques for automatic text classification. Artif. Intell. Rev. 2019, 52, 273–292. [Google Scholar] [CrossRef]
- Panigrahi, K.P.; Das, H.; Sahoo, A.K.; Moharana, S.C. Maize Leaf Disease Detection and Classification Using Machine Learning Algorithms. In Progress in Computing, Analytics and Networking; Das, H., Pattnaik, P.K., Rautaray, S.S., Li, K.C., Eds.; Springer: Singapore, 2020; pp. 659–669. [Google Scholar]
- Erdem, E.; Bozkurt, F. A comparison of various supervised machine learning techniques for prostate cancer prediction. Avrupa Bilim ve Teknoloji Dergisi 2021, 21, 610–620. [Google Scholar] [CrossRef]
- Soni, K.M.; Gupta, A.; Jain, T. Supervised Machine Learning Approaches for Breast Cancer Classification and a high performance Recurrent Neural Network. In Proceedings of the 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2–4 September 2021; pp. 1–7. [Google Scholar] [CrossRef]
- Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inform. Decis. Mak. 2019, 19, 281. [Google Scholar] [CrossRef] [PubMed]
- Rojas-López, A.G.; Uriarte-Arcia, A.V.; Rodríguez-Molina, A.; Villarreal-Cervantes, M.G. Comparative Study of Pattern Recognition Techniques in the Classification of Vertebral Column Diseases. In Telematics and Computing; Mata-Rivera, M.F., Zagal-Flores, R., Barria-Huidobro, C., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 395–417. [Google Scholar] [CrossRef]
- Pu, L.; Shamir, R. 3CAC: Improving the classification of phages and plasmids in metagenomic assemblies using assembly graphs. Bioinformatics 2022, 38, ii56–ii61. [Google Scholar] [CrossRef] [PubMed]
- Singh, P.D.; Kaur, R.; Singh, K.D.; Dhiman, G. A Novel Ensemble-based Classifier for Detecting the COVID-19 Disease for Infected Patients. Inf. Syst. Front. 2021, 23, 1385–1401. [Google Scholar] [CrossRef] [PubMed]
- Velusamy, D.; Ramasamy, K. Ensemble of heterogeneous classifiers for diagnosis and prediction of coronary artery disease with reduced feature subset. Comput. Methods Programs Biomed. 2021, 198, 105770. [Google Scholar] [CrossRef]
- Rustam, F.; Ishaq, A.; Munir, K.; Almutairi, M.; Aslam, N.; Ashraf, I. Incorporating CNN Features for Optimizing Performance of Ensemble Classifier for Cardiovascular Disease Prediction. Diagnostics 2022, 12, 1474. [Google Scholar] [CrossRef] [PubMed]
- Tanveer, M.; Ganaie, M.; Suganthan, P. Ensemble of classification models with weighted functional link network. Appl. Soft Comput. 2021, 107, 107322. [Google Scholar] [CrossRef]
- Ganaie, M.; Tanveer, M.; Suganthan, P. Oblique Decision Tree Ensemble via Twin Bounded SVM. Expert Syst. Appl. 2020, 143, 113072. [Google Scholar] [CrossRef]
- Weng, C.H.; Huang, T.C.K.; Han, R.P. Disease prediction with different types of neural network classifiers. Telemat. Inform. 2016, 33, 277–292. [Google Scholar] [CrossRef]
- Saravanan, R.; Sujatha, P. A State of Art Techniques on Machine Learning Algorithms: A Perspective of Supervised Learning Approaches in Data Classification. In Proceedings of the 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 14–15 June 2018. [Google Scholar] [CrossRef]
- Jiang, T.; Gradus, J.L.; Rosellini, A.J. Supervised Machine Learning: A Brief Primer. Behav. Ther. 2020, 51, 675–687. [Google Scholar] [CrossRef]
- Barreto, G.; Neto, A. Vertebral Column Dataset. UCI Machine Learning Repository, 2005. Available online: https://archive.ics.uci.edu/dataset/212/vertebral+column (accessed on 1 January 2024). [CrossRef]
- Errico, T.J.; Petrizzo, A. CHAPTER 1—Introduction to Spinal Deformity. In Surgical Management of Spinal Deformities; Errico, T.J., Lonner, B.S., Moulton, A.W., Eds.; W.B. Saunders: Philadelphia, PA, USA, 2009; pp. 3–12. [Google Scholar] [CrossRef]
- Ghobrial, G.M.; Al-Saiegh, F.; Heller, J. Procedure 31—Spinopelvic Balance: Preoperative Planning and Calculation. In Operative Techniques: Spine Surgery, 3rd ed.; Baron, E.M., Vaccaro, A.R., Eds.; Operative Techniques; Elsevier: Philadelphia, PA, USA, 2018; pp. 281–287. [Google Scholar] [CrossRef]
- Whittle, M.W.; Levine, D. Measurement of lumbar lordosis as a component of clinical gait analysis. Gait Posture 1997, 5, 101–107. [Google Scholar] [CrossRef]
- Zhao Yang, S.; Cai-Liang, Z.; Ren-Jie, C.; Da-Wei, D.; Fu-Long, W.J. Sagittal Pelvic Radius in Low-Grade Isthmic Lumbar Spondylolisthesis of Chinese Population. J. Korean Neurosurg. Soc. 2016, 59, 292–295. [Google Scholar] [CrossRef] [PubMed]
- Gallagher, B.; Moatz, B.; Tortolani, P. Classifications in Spondylolisthesis. Semin. Spine Surg. 2020, 32, 100802. [Google Scholar] [CrossRef]
- Haixiang, G.; Yijing, L.; Shang, J.; Mingyun, G.; Yuanyue, H.; Bing, G. Learning from class-imbalanced data: Review of methods and applications. Expert Syst. Appl. 2017, 73, 220–239. [Google Scholar] [CrossRef]
- Gosain, A.; Sardana, S. Handling class imbalance problem using oversampling techniques: A review. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 79–85. [Google Scholar] [CrossRef]
- He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, 1–8 June 2008. [Google Scholar] [CrossRef]
- He, H.; Ma, Y. Imbalanced Learning: Foundations, Algorithms, and Applications; Wiley: Hoboken, NJ, USA, 2013. [Google Scholar] [CrossRef]
- Mamdouh Farghaly, H.; Abd El-Hafeez, T. A high-quality feature selection method based on frequent and correlated items for text classification. Soft Comput. 2023, 27, 11259–11274. [Google Scholar] [CrossRef]
- Althnian, A.; AlSaeed, D.; Al-Baity, H.; Samha, A.; Dris, A.B.; Alzakari, N.; Abou Elwafa, A.; Kurdi, H. Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain. Appl. Sci. 2021, 11, 796. [Google Scholar] [CrossRef]
- Lehr, J.; Philipps, J.; Hoang, V.; Wrangel, D.; Krüger, J. Supervised learning vs. unsupervised learning: A comparison for optical inspection applications in quality control. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1140, 012049. [Google Scholar] [CrossRef]
- Lo Vercio, L.; Amador, K.; Bannister, J.J.; Crites, S.; Gutierrez, A.; MacDonald, M.E.; Moore, J.; Mouches, P.; Rajashekar, D.; Schimert, S.; et al. Supervised machine learning tools: A tutorial for clinicians. J. Neural Eng. 2020, 17, 062001. [Google Scholar] [CrossRef] [PubMed]
- Suwanda, R.; Syahputra, Z.; Zamzami, E.M. Analysis of Euclidean Distance and Manhattan Distance in the K-Means Algorithm for Variations Number of Centroid K. J. Phys. Conf. Ser. 2020, 1566, 012058. [Google Scholar] [CrossRef]
- Hidayati, N.; Hermawan, A. K-Nearest Neighbor (K-NN) algorithm with Euclidean and Manhattan in classification of student graduation. J. Eng. Appl. Technol. 2021, 2, 86–91. [Google Scholar] [CrossRef]
- Nguyen, T.T.S. Model-based book recommender systems using Naïve Bayes enhanced with optimal feature selection. In Proceedings of the 2019 8th International Conference on Software and Computer Applications, Penang, Malaysia, 19–21 February 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 217–222. [Google Scholar] [CrossRef]
- Géron, A. Hands-On machine learning with Scikit-Learn, Keras, and TensorFlow; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2022. [Google Scholar]
- Tharwat, A. Linear vs. quadratic discriminant analysis classifier: A tutorial. Int. J. Appl. Pattern Recognit. 2016, 3, 145. [Google Scholar] [CrossRef]
- Tharwat, A.; Gaber, T.; Ibrahim, A.; Hassanien, A.E. Linear discriminant analysis: A detailed tutorial. AI Commun. 2017, 30, 169–190. [Google Scholar] [CrossRef]
- Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods; Cambridge University Press: Cambridge, UK, 2000. [Google Scholar] [CrossRef]
- Jiang, P.; Zhou, Q.; Shao, X. Surrogate Model-Based Engineering Design and Optimization; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar] [CrossRef]
- Safavian, S.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man, Cybern. 1991, 21, 660–674. [Google Scholar] [CrossRef]
- Müller, A.C.; Guido, S. Introduction to Machine Learning with Python: A Guide for Data Scientists; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2016. [Google Scholar]
- Sagi, O.; Rokach, L. Ensemble learning: A survey. WIREs Data Min. Knowl. Discov. 2018, 8, e1249. [Google Scholar] [CrossRef]
- Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2019, 14, 241–258. [Google Scholar] [CrossRef]
- Jurek, A.; Bi, Y.; Wu, S.; Nugent, C. A survey of commonly used ensemble-based classification techniques. Knowl. Eng. Rev. 2013, 29, 551–581. [Google Scholar] [CrossRef]
- Mohandes, M.; Deriche, M.; Aliyu, S.O. Classifiers Combination Techniques: A Comprehensive Review. IEEE Access 2018, 6, 19626–19639. [Google Scholar] [CrossRef]
- Džeroski, S.; Ženko, B. Is Combining Classifiers with Stacking Better than Selecting the Best One? Mach. Learn. 2004, 54, 255–273. [Google Scholar] [CrossRef]
- Resende, P.A.A.; Drummond, A.C. A Survey of Random Forest Based Methods for Intrusion Detection Systems. ACM Comput. Surv. 2018, 51, 1–36. [Google Scholar] [CrossRef]
- Raschka, S.; Mirjalili, V. Python Machine Learning: Machine Learning and Deep Learning with Python, Scikit-Learn, and TensorFlow 2; Packt Publishing Ltd.: Birmingham, UK, 2019. [Google Scholar]
- Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 2019, 8, 832. [Google Scholar] [CrossRef]
- Zhou, J.; Gandomi, A.H.; Chen, F.; Holzinger, A. Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics. Electronics 2021, 10, 593. [Google Scholar] [CrossRef]
- Anguita, D.; Ghelardoni, L.; Ghio, A.; Oneto, L.; Ridella, S. The 'K' in K-fold Cross Validation. In Proceedings of the ESANN, Bruges, Belgium, 25–27 April 2012; pp. 441–446. [Google Scholar]
- Rojas-López, A.G.; Villarreal-Cervantes, M.G.; Rodríguez-Molina, A. Surrogate indirect adaptive controller tuning based on polynomial response surface method and bioinspired optimization: Application to the brushless direct current motor controller. Expert Syst. Appl. 2024, 245, 123070. [Google Scholar] [CrossRef]
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56. [Google Scholar] [CrossRef] [PubMed]
- Prieto, O.J.; Alonso-González, C.J.; Rodríguez, J.J. Stacking for multivariate time series classification. Pattern Anal. Appl. 2013, 18, 297–312. [Google Scholar] [CrossRef]
- Kuncheva, L.I.; Rodríguez, J.J. A weighted voting framework for classifiers ensembles. Knowl. Inf. Syst. 2012, 38, 259–275. [Google Scholar] [CrossRef]
Name | Distance Function |
---|---|
Euclidean | $d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ |
Manhattan | $d(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert$ |
Type of Kernel | Kernel's Function |
---|---|
Linear | $K(\mathbf{x}, \mathbf{x}') = \mathbf{x}^{\top}\mathbf{x}'$ |
Polynomial | $K(\mathbf{x}, \mathbf{x}') = (\gamma\,\mathbf{x}^{\top}\mathbf{x}' + r)^{d}$ |
Radial Basis Function (RBF) | $K(\mathbf{x}, \mathbf{x}') = \exp(-\gamma\,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2})$ |
Sigmoid | $K(\mathbf{x}, \mathbf{x}') = \tanh(\gamma\,\mathbf{x}^{\top}\mathbf{x}' + r)$ |

($\gamma$: kernel coefficient; $r$: independent term; $d$: polynomial degree.)
Name | Activation Function |
---|---|
Linear (identity) | $f(x) = x$ |
Sigmoid (logistic) | $f(x) = \dfrac{1}{1 + e^{-x}}$ |
Hyperbolic (tanh) | $f(x) = \tanh(x)$ |
Rectified linear unit (relu) | $f(x) = \max(0, x)$ |
Classifier | Hyperparameters |
---|---|
kNN | k-neighbors: 1; distance metric: Euclidean |
NB | - |
LR | solver: newton-cholesky |
LDA | solver: svd |
QDA | - |
SVM | kernel: polynomial; degree: 7 |
ANN | hidden structure: 6 layers, 6 neurons; activation function: identity; learning rate: 0.08; solver: lbfgs |
DT | criterion: gini; max. depth: 10; min. samples per leaf: 1; min. samples to split: 3; splitter: best |
Classifier | Accuracy | Precision | Recall | F1-Score | Confusion Matrix | CCE | MDE | Time (s) |
---|---|---|---|---|---|---|---|---|
kNN | 0.9488 | 0.9520 | 0.9488 | 0.9490 | [146, 4, 0; 6, 140, 4; 1, 8, 141] | 427 | 4 | 1.27 |
NB | 0.7444 | 0.7508 | 0.7444 | 0.7324 | [123, 23, 4; 64, 69, 17; 0, 7, 143] | 335 | 21 | 0.253 |
LR | 0.8200 | 0.8295 | 0.8200 | 0.8198 | [114, 35, 1; 30, 114, 6; 3, 6, 141] | 369 | 7 | 1.057 |
LDA | 0.7844 | 0.8069 | 0.7844 | 0.7876 | [119, 31, 0; 34, 111, 5; 12, 15, 123] | 353 | 5 | 0.1312 |
QDA | 0.7866 | 0.7957 | 0.7866 | 0.7868 | [108, 38, 4; 33, 106, 11; 1, 9, 140] | 354 | 15 | 0.1112 |
SVM | 0.8533 | 0.8720 | 0.8533 | 0.8492 | [138, 10, 2; 41, 103, 6; 2, 5, 143] | 384 | 8 | 33.86 |
ANN | 0.8155 | 0.8249 | 0.8155 | 0.8154 | [113, 36, 1; 32, 112, 6; 2, 6, 142] | 367 | 7 | 210.84 |
DT | 0.8288 | 0.8409 | 0.8288 | 0.8299 | [120, 29, 1; 33, 112, 5; 1, 8, 141] | 373 | 6 | 69.76 |

Confusion matrices are given row-wise (actual class; semicolons separate rows) in the class order DH, SL, NO, which is consistent with the reported CCE (diagonal sum) and MDE (disease rows in the NO column) values.
Classifier | Accuracy | Precision | Recall | F1-Score | Confusion Matrix | CCE | MDE | Time (s) |
---|---|---|---|---|---|---|---|---|
RF | 0.8822 | 0.8889 | 0.8822 | 0.8811 | [137, 12, 1; 29, 116, 5; 0, 6, 144] | 397 | 6 | 1.278 |
VCtop | 0.9488 | 0.9520 | 0.9488 | 0.9490 | [146, 4, 0; 6, 140, 4; 1, 8, 141] | 427 | 4 | 0.878 |
SCtop | 0.9488 | 0.9510 | 0.9488 | 0.9487 | [146, 4, 0; 6, 138, 6; 1, 6, 143] | 427 | 6 | 0.914 |
VCall | 0.9488 | 0.9520 | 0.9488 | 0.9490 | [146, 4, 0; 6, 140, 4; 1, 8, 141] | 427 | 4 | 1.365 |
SCall | 0.9511 | 0.9534 | 0.9511 | 0.9509 | [146, 4, 0; 6, 139, 5; 1, 6, 143] | 428 | 5 | 1.449 |

Confusion matrices follow the same row-wise convention and class order (DH, SL, NO) as the baseline results table.