Effect of Missing Data Types and Imputation Methods on Supervised Classifiers: An Evaluation Study
Abstract
1. Introduction
2. Related Work
3. Missing Values Imputation Techniques
3.1. KNN Imputation Technique
3.2. Mean Imputation Technique
3.3. Mode Imputation Technique
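As a minimal sketch of the three imputation techniques named above, the following uses scikit-learn's `KNNImputer` and `SimpleImputer`; the toy matrices, variable names, and imputer settings are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Toy numeric matrix; np.nan marks a missing value.
X_num = np.array([[1.0, 2.0, np.nan],
                  [3.0, np.nan, 6.0],
                  [5.0, 4.0, 9.0],
                  [np.nan, 8.0, 3.0]])

# 3.1 KNN imputation: fill each missing entry with the mean of that feature
# over the k nearest neighbours that have the feature observed (here k = 2).
X_knn = KNNImputer(n_neighbors=2).fit_transform(X_num)

# 3.2 Mean imputation: replace each missing entry with its column mean.
X_mean = SimpleImputer(strategy="mean").fit_transform(X_num)

# 3.3 Mode imputation: for categorical features, replace each missing
# entry with the most frequent value in that column.
X_cat = np.array([["red"], ["blue"], [np.nan], ["red"]], dtype=object)
X_mode = SimpleImputer(strategy="most_frequent").fit_transform(X_cat)
```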
4. Classification Models Evaluation Metrics
4.1. Accuracy
4.2. F1-Score
4.3. The Matthews Correlation Coefficient (MCC)
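For quick reference, all three metrics are available in scikit-learn; the toy labels below are ours, and the weighted average for the multi-class F1-score is an assumption about the paper's averaging choice:

```python
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 2]

acc = accuracy_score(y_true, y_pred)                # fraction of correct predictions
f1 = f1_score(y_true, y_pred, average="weighted")   # per-class F1, support-weighted
mcc = matthews_corrcoef(y_true, y_pred)             # in [-1, 1]; 0 is chance level

print(f"ACC = {acc:.2%}, F1 = {f1:.2%}, MCC = {mcc:.3f}")
```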
5. Experimental Analysis
5.1. Datasets
5.2. Experiments Framework
5.3. Experimental Results
5.3.1. Baseline Performance
- Although the impact of imbalanced data on the performance of all classifiers is clear, performance drops most dramatically on small, mixed, imbalanced data, such as the Bank dataset.
- The performance of the naïve Bayes classifier decreased dramatically on the categorical, imbalanced datasets because Gaussian naïve Bayes treats categorical features as if they were normally distributed, whereas the data are made up of 0s and 1s, making the Gaussian model a poor fit for categorical datasets (see the sketch below). Compared with the other datasets, NB reported low performance, ranging from 12% to 26% on the Car and Nursery datasets across the different evaluation metrics. NB was therefore excluded from the experiments on categorical datasets.
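The mismatch described above can be illustrated with a small experiment (our own construction, not the paper's code): `BernoulliNB` models 0/1 indicators directly, whereas `GaussianNB` imposes a normal density on them.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB, GaussianNB

rng = np.random.default_rng(42)
# Synthetic one-hot-style data: 1000 rows of ten 0/1 features,
# with the label driven by a few of the bits.
X = rng.integers(0, 2, size=(1000, 10))
y = (X[:, 0] & X[:, 3]) | X[:, 7]

# GaussianNB assumes each feature is normally distributed within a class,
# which is misspecified for binary indicators; BernoulliNB is the natural model.
for clf in (GaussianNB(), BernoulliNB()):
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(type(clf).__name__, f"{score:.3f}")
```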
5.3.2. Missing Values with MCAR Pattern
MCAR Imputed with KNN
MCAR Imputed with Mean (and Mode)
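For context, MCAR means that each entry's probability of going missing is independent of both the observed and the unobserved values. Below is a minimal sketch of how such a pattern can be injected before imputation; the `inject_mcar` name and the 10% rate are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def inject_mcar(X, frac=0.10, seed=0):
    """Set a uniformly random fraction of entries to NaN; under MCAR the
    chance of being missing does not depend on any value in the data."""
    X = X.astype(float).copy()
    mask = np.random.default_rng(seed).random(X.shape) < frac
    X[mask] = np.nan
    return X

# Usage: X_miss = inject_mcar(X_complete), then impute with KNN or mean/mode.
```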
5.3.3. Missing Values with MNAR Pattern
MNAR Imputed with KNN
MNAR Imputed with Mean (and Mode)
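Under MNAR, by contrast, the probability that a value is missing depends on the value itself. One common way to simulate this is to delete the largest values of each feature; the sketch below is our assumption about such a mechanism, not necessarily the paper's exact one:

```python
import numpy as np

def inject_mnar(X, frac=0.10):
    """Blank out roughly the top `frac` of each feature's values, so that
    missingness depends on the (now unobserved) value itself."""
    X = X.astype(float).copy()
    for j in range(X.shape[1]):
        threshold = np.quantile(X[:, j], 1.0 - frac)
        X[X[:, j] >= threshold, j] = np.nan
    return X
```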
5.4. Discussion of the Results
- In all cases, the performance of all classifiers was affected by the MVs on all datasets. The sensitivity of the classifiers to MVs was better reflected by the MCC than by the accuracy and the F1-score; a toy example after this list reproduces the effect. This is most obvious for the small datasets with mixed features: the accuracy and the F1-score suggest no variation in the sensitivity of the classifiers (Figure 6c,d for the F1-score and Figure 6e,f for the accuracy), whereas the MCC shows that the SVM is the most affected, followed by the RF and then the LDA (Figure 6a,b).
- For the effect of the MMs, there was a noticeable difference in the sensitivity of the classifiers under each mechanism: the classifiers were more sensitive under MNAR (Figure 7) than under MCAR (Figure 6). This is most clearly observed for the large numeric and the small mixed datasets.
- For the effect of the IMs, the sensitivity pattern under MCAR is very similar for the two methods, although the classifiers show higher sensitivity with the KNN imputation (Figure 6a,c,e) than with the mean/mode imputation (Figure 6b,d,f). This implies that for a dataset with MVs under MCAR, the mean/mode IM is advisable to keep performance close to the baseline. Under MNAR, the two IMs show different sensitivities, which implies a strong dependency on the size and the type of the dataset.
- For the type of dataset, all classifiers showed high sensitivity on categorical datasets: on the Car dataset imputed with KNN, DT, SVM, LDA, and RF degraded by around 49%, 55%, 42%, and 52% in terms of the MCC, and by around 18%, 21%, 16%, and 20% in terms of the F1-score, respectively. Sensitivity ranged from moderate to low on numeric and mixed datasets, depending on the evaluation metric; for example, DT, SVM, NB, LDA, and RF degraded by around 0.31%, 10.55%, 4.29%, 7.89%, and 8.92%, respectively, on the Bank dataset when evaluated by the MCC. The exception is MNAR data imputed with the mean/mode, where the numeric datasets showed the strongest effect (Figure 7b,d,f).
- Under MCAR, DT was the most sensitive of the five tested classifiers to the MVs on all datasets, except the categorical ones, where the SVM was more sensitive, showing the greatest degradation from the baseline. The least sensitive classifiers were NB, LDA, and RF (Figure 6).
- Under MNAR, NB and LDA were the two most sensitive classifiers, especially when the mean/mode IMs were used (Figure 7b,d,f).
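The first bullet's observation that the MCC exposes degradation masked by the accuracy and the F1-score can be reproduced on a toy imbalanced problem (our construction, with a 90/10 class split loosely mimicking the Bank dataset; the values in the comments are exact for this toy data, not the paper's results):

```python
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

# 90 negatives, 10 positives; the classifier recovers only 2 of the 10 positives.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [0] * 8 + [1] * 2

print(accuracy_score(y_true, y_pred))                # 0.92 -- looks healthy
print(f1_score(y_true, y_pred, average="weighted"))  # ~0.90 -- still looks healthy
print(matthews_corrcoef(y_true, y_pred))             # ~0.43 -- flags the minority-class collapse
```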
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Gabr, M.I.; Mostafa, Y.; Elzanfaly, D.S. Data Quality Dimensions, Metrics, and Improvement Techniques. Future Comput. Inform. J. 2021, 6, 3.
- Pedersen, A.B.; Mikkelsen, E.M.; Cronin-Fenton, D.; Kristensen, N.R.; Pham, T.M.; Pedersen, L.; Petersen, I. Missing data and multiple imputation in clinical epidemiological research. Clin. Epidemiol. 2017, 9, 157.
- Aleryani, A.; Wang, W.; De La Iglesia, B. Multiple imputation ensembles (MIE) for dealing with missing data. SN Comput. Sci. 2020, 1, 134.
- Blomberg, L.C.; Ruiz, D.D.A. Evaluating the influence of missing data on classification algorithms in data mining applications. In Proceedings of the Anais do IX Simpósio Brasileiro de Sistemas de Informação, SBC, Porto Alegre, Brazil, 22 May 2013; pp. 734–743.
- Acuna, E.; Rodriguez, C. The treatment of missing values and its effect on classifier accuracy. In Classification, Clustering, and Data Mining Applications; Springer: Berlin/Heidelberg, Germany, 2004; pp. 639–647.
- Jäger, S.; Allhorn, A.; Bießmann, F. A benchmark for data imputation methods. Front. Big Data 2021, 4, 693674.
- Gimpy, M. Missing value imputation in multi attribute data set. Int. J. Comput. Sci. Inf. Technol. 2014, 5, 5315–5321.
- You, J.; Ma, X.; Ding, Y.; Kochenderfer, M.J.; Leskovec, J. Handling missing data with graph representation learning. Adv. Neural Inf. Process. Syst. 2020, 33, 19075–19087.
- Samant, R.; Rao, S. Effects of missing data imputation on classifier accuracy. Int. J. Eng. Res. Technol. IJERT 2013, 2, 264–266.
- Christopher, S.Z.; Siswantining, T.; Sarwinda, D.; Bustaman, A. Missing value analysis of numerical data using fractional hot deck imputation. In Proceedings of the 2019 3rd International Conference on Informatics and Computational Sciences (ICICoS), Semarang, Indonesia, 29–30 October 2019; pp. 1–6.
- Aljuaid, T.; Sasi, S. Proper imputation techniques for missing values in data sets. In Proceedings of the 2016 International Conference on Data Science and Engineering (ICDSE), Cochin, India, 23–25 August 2016; pp. 1–5.
- Thirukumaran, S.; Sumathi, A. Missing value imputation techniques depth survey and an imputation algorithm to improve the efficiency of imputation. In Proceedings of the 2012 Fourth International Conference on Advanced Computing (ICoAC), Chennai, India, 13–15 December 2012; pp. 1–5.
- Hossin, M.; Sulaiman, M.; Mustapha, A.; Mustapha, N.; Rahmat, R. A hybrid evaluation metric for optimizing classifier. In Proceedings of the 2011 3rd Conference on Data Mining and Optimization (DMO), Kuala Lumpur, Malaysia, 28–29 June 2011; pp. 165–170.
- Bekkar, M.; Djemaa, H.K.; Alitouche, T.A. Evaluation measures for models assessment over imbalanced data sets. J. Inf. Eng. Appl. 2013, 3, 27–29.
- Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2021, 17, 168–192.
- Chicco, D.; Warrens, M.J.; Jurman, G. The Matthews correlation coefficient (MCC) is more informative than Cohen’s Kappa and Brier score in binary classification assessment. IEEE Access 2021, 9, 78368–78381.
- Chicco, D.; Tötsch, N.; Jurman, G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Min. 2021, 14, 13.
- Warrens, M.J. Five ways to look at Cohen’s kappa. J. Psychol. Psychother. 2015, 5, 1000197.
- Jeni, L.A.; Cohn, J.F.; De La Torre, F. Facing imbalanced data–recommendations for the use of performance metrics. In Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, Geneva, Switzerland, 2–5 September 2013; pp. 245–251.
- Narkhede, S. Understanding AUC-ROC curve. Towards Data Sci. 2018, 26, 220–227.
- Nanmaran, R.; Srimathi, S.; Yamuna, G.; Thanigaivel, S.; Vickram, A.; Priya, A.; Karthick, A.; Karpagam, J.; Mohanavel, V.; Muhibbullah, M. Investigating the role of image fusion in brain tumor classification models based on machine learning algorithm for personalized medicine. Comput. Math. Methods Med. 2022, 2022, 7137524.
- Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
- Jadhav, A.S. A novel weighted TPR-TNR measure to assess performance of the classifiers. Expert Syst. Appl. 2020, 152, 113391.
- Liu, P.; Lei, L.; Wu, N. A quantitative study of the effect of missing data in classifiers. In Proceedings of the Fifth International Conference on Computer and Information Technology (CIT’05), Shanghai, China, 21–23 September 2005; pp. 28–33.
- Hunt, L.A. Missing data imputation and its effect on the accuracy of classification. In Data Science; Springer: Berlin/Heidelberg, Germany, 2017; pp. 3–14.
- Purwar, A.; Singh, S.K. Hybrid prediction model with missing value imputation for medical data. Expert Syst. Appl. 2015, 42, 5621–5631.
- Su, X.; Khoshgoftaar, T.M.; Greiner, R. Using imputation techniques to help learn accurate classifiers. In Proceedings of the 2008 20th IEEE International Conference on Tools with Artificial Intelligence, Dayton, OH, USA, 3–5 November 2008; Volume 1, pp. 437–444.
- Jordanov, I.; Petrov, N.; Petrozziello, A. Classifiers accuracy improvement based on missing data imputation. J. Artif. Intell. Soft Comput. Res. 2018, 8, 31–48.
- Luengo, J.; García, S.; Herrera, F. On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowl. Inf. Syst. 2012, 32, 77–108.
- Garciarena, U.; Santana, R. An extensive analysis of the interaction between missing data types, imputation methods, and supervised classifiers. Expert Syst. Appl. 2017, 89, 52–65.
- Aggarwal, U.; Popescu, A.; Hudelot, C. Active learning for imbalanced datasets. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–7 October 2020; pp. 1428–1437.
- García, V.; Mollineda, R.A.; Sánchez, J.S. Theoretical analysis of a performance measure for imbalanced data. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Washington, DC, USA, 23–26 August 2010; pp. 617–620.
- Lei, L.; Wu, N.; Liu, P. Applying sensitivity analysis to missing data in classifiers. In Proceedings of the ICSSSM’05, 2005 International Conference on Services Systems and Services Management, Chongqing, China, 13–15 June 2005; Volume 2, pp. 1051–1056.
- de la Vega de León, A.; Chen, B.; Gillet, V.J. Effect of missing data on multitask prediction methods. J. Cheminform. 2018, 10, 26.
- Hossain, T.; Inoue, S. A comparative study on missing data handling using machine learning for human activity recognition. In Proceedings of the 2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), Spokane, WA, USA, 30 May–2 June 2019; pp. 124–129.
- Wang, G.; Lu, J.; Choi, K.S.; Zhang, G. A transfer-based additive LS-SVM classifier for handling missing data. IEEE Trans. Cybern. 2018, 50, 739–752.
- Makaba, T.; Dogo, E. A comparison of strategies for missing values in data on machine learning classification algorithms. In Proceedings of the 2019 International Multidisciplinary Information Technology and Engineering Conference (IMITEC), Vanderbijlpark, South Africa, 21–22 November 2019; pp. 1–7.
- Liu, Q.; Hauswirth, M. A provenance meta learning framework for missing data handling methods selection. In Proceedings of the 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), Virtual Conference, 28–31 October 2020; pp. 0349–0358.
- Izonin, I.; Tkachenko, R.; Verhun, V.; Zub, K. An approach towards missing data management using improved GRNN-SGTM ensemble method. Eng. Sci. Technol. Int. J. 2021, 24, 749–759.
- Han, J.; Kamber, M.; Pei, J. Data mining: Concepts and techniques. Morgan Kaufmann 2006, 10, 88–89.
- Malarvizhi, M.; Thanamani, A. K-NN classifier performs better than K-means clustering in missing value imputation. IOSR J. Comput. Eng. 2012, 6, 12–15.
- Singhai, R. Comparative analysis of different imputation methods to treat missing values in data mining environment. Int. J. Comput. Appl. 2013, 82, 34–42.
- Golino, H.F.; Gomes, C.M. Random forest as an imputation method for education and psychology research: Its impact on item fit and difficulty of the Rasch model. Int. J. Res. Method Educ. 2016, 39, 401–421.
- Nishanth, K.J.; Ravi, V. Probabilistic neural network based categorical data imputation. Neurocomputing 2016, 218, 17–25.
- Grandini, M.; Bagli, E.; Visani, G. Metrics for multi-class classification: An overview. arXiv 2020, arXiv:2008.05756.
- Luque, A.; Carrasco, A.; Martín, A.; de Las Heras, A. The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognit. 2019, 91, 216–231.
- Branco, P.; Torgo, L.; Ribeiro, R.P. A survey of predictive modeling on imbalanced domains. ACM Comput. Surv. CSUR 2016, 49, 1–50.
- Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
- Sa’id, A.A.; Rustam, Z.; Wibowo, V.V.P.; Setiawan, Q.S.; Laeli, A.R. Linear support vector machine and logistic regression for cerebral infarction classification. In Proceedings of the 2020 International Conference on Decision Aid Sciences and Application (DASA), Online, 8–9 November 2020; pp. 827–831.
- Cao, C.; Chicco, D.; Hoffman, M.M. The MCC-F1 curve: A performance evaluation technique for binary classification. arXiv 2020, arXiv:2006.11278.
- AlBeladi, A.A.; Muqaibel, A.H. Evaluating compressive sensing algorithms in through-the-wall radar via F1-score. Int. J. Signal Imaging Syst. Eng. 2018, 11, 164–171.
- Glazkova, A. A comparison of synthetic oversampling methods for multi-class text classification. arXiv 2020, arXiv:2008.04636.
- Toupas, P.; Chamou, D.; Giannoutakis, K.M.; Drosou, A.; Tzovaras, D. An intrusion detection system for multi-class classification based on deep neural networks. In Proceedings of the 2019 18th IEEE International Conference on Machine Learning And Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 1253–1258.
- Wang, R.; Fan, J.; Li, Y. Deep multi-scale fusion neural network for multi-class arrhythmia detection. IEEE J. Biomed. Health Inform. 2020, 24, 2461–2472.
- Bouazizi, M.; Ohtsuki, T. Multi-class sentiment analysis in Twitter: What if classification is not the answer. IEEE Access 2018, 6, 64486–64502.
- Baker, C.; Deng, L.; Chakraborty, S.; Dehlinger, J. Automatic multi-class non-functional software requirements classification using neural networks. In Proceedings of the 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, WI, USA, 15–19 July 2019; Volume 2, pp. 610–615.
- Liu, C.; Osama, M.; De Andrade, A. DENS: A dataset for multi-class emotion analysis. arXiv 2019, arXiv:1910.11769.
- Opitz, J.; Burst, S. Macro F1 and macro F1. arXiv 2019, arXiv:1911.03347.
- Josephine, S.A. Predictive Accuracy: A Misleading Performance Measure for Highly Imbalanced Data. In Proceedings of the SAS Global Forum, Orlando, FL, USA, 2–5 April 2017.
- Boughorbel, S.; Jarray, F.; El-Anbari, M. Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. PLoS ONE 2017, 12, e0177678.
- Fisher, R. UCI Iris Data Set. 1988. Available online: https://archive.ics.uci.edu/ml/datasets/iris (accessed on 18 April 2022).
- Moro, S.; Paulo, C.; Paulo, R. UCI Bank Marketing Data Set. 2014. Available online: https://archive.ics.uci.edu/ml/ (accessed on 21 April 2022).
- Bohanec, M.; Zupan, B. UCI Nursery Data Set. 1997. Available online: https://archive.ics.uci.edu/ml/datasets/nursery (accessed on 21 April 2022).
- Bohanec, M. Car Evaluation Data Set. 1997. Available online: https://www.kaggle.com/datasets/elikplim/car-evaluation-data-setl (accessed on 21 April 2022).
- Mehmet, A. Churn for Bank Customers. 2020. Available online: https://www.kaggle.com/datasets/mathchi/churn-for-bank-customers (accessed on 21 April 2022).
- Elawady, A.; Iskander, G. Dry Beans Classification. 2021. Available online: https://kaggle.com/competitions/dry-beans-classification-iti-ai-pro-intake01 (accessed on 21 April 2022).
- Gong, M. A novel performance measure for machine learning classification. Int. J. Manag. Inf. Technol. IJMIT 2021, 13, 1–19.
- Chicco, D.; Jurman, G. An invitation to greater use of Matthews correlation coefficient (MCC) in robotics and artificial intelligence. Front. Robot. AI 2022, 9, 78.
| Dataset Name | Size | Features | Data Type | Class Type | No. of Classes | Balanced | % of Dominant Class |
|---|---|---|---|---|---|---|---|
| Iris [61] | 150 | 5 | Numeric | Multi-class | 3 | Balanced | – |
| Car Evaluation [64] | 1728 | 7 | Categorical | Multi-class | 4 | Imbalanced | 68% |
| Bank Advertising [62] | 4521 | 17 | Mixed | Binary class | 2 | Imbalanced | 88.5% |
| Bank Churners [65] | 10,127 | 21 | Mixed | Binary class | 2 | Imbalanced | 84% |
| Nursery [63] | 12,960 | 9 | Categorical | Multi-class | 5 | Imbalanced | 26% |
| Dry Beans [66] | 13,611 | 17 | Numeric | Multi-class | 7 | Imbalanced | 33% |
| Datasets | Metrics | DT | SVM | NB | LDA | RF |
|---|---|---|---|---|---|---|
| Iris | MCC | 96.73% | 93.61% | 96.73% | 93.32% | 93.61% |
| Iris | F1 | 97.80% | 95.60% | 97.80% | 95.60% | 95.60% |
| Iris | ACC | 97.78% | 95.56% | 95.56% | 97.78% | 95.56% |
| Dry Beans | MCC | 89.51% | 91.00% | 87.45% | 87.86% | 90.72% |
| Dry Beans | F1 | 91.30% | 92.50% | 89.50% | 89.70% | 92.30% |
| Dry Beans | ACC | 91.31% | 92.53% | 89.54% | 89.72% | 92.31% |
| Bank | MCC | 30.92% | 33.53% | 31.32% | 45.04% | 35.91% |
| Bank | F1 | 94.57% | 94.74% | 91.12% | 94.57% | 94.51% |
| Bank | ACC | 89.90% | 90.19% | 84.52% | 90.19% | 89.90% |
| Churners | MCC | 74.99% | 64.34% | 52.33% | 60.17% | 83.33% |
| Churners | F1 | 78.91% | 67.00% | 58.94% | 64.86% | 85.36% |
| Churners | ACC | 93.35% | 91.31% | 87.99% | 90.16% | 95.69% |
| Car | MCC | 83.68% | 90.90% | 12.62% | 79.86% | 88.27% |
| Car | F1 | 92.30% | 95.60% | 20.60% | 90.00% | 94.40% |
| Car | ACC | 92.29% | 95.57% | 20.62% | 89.98% | 94.41% |
| Nursery | MCC | 95.89% | 99.96% | 26.95% | 92.72% | 97.39% |
| Nursery | F1 | 97.20% | 99.98% | 25.90% | 94.40% | 98.20% |
| Nursery | ACC | 97.19% | 99.97% | 25.87% | 94.39% | 98.23% |
| Imputation Method | Pattern | Criteria | Mixed Dataset | Numeric Dataset | Categorical Dataset |
|---|---|---|---|---|---|
| KNN | MCAR | Dataset Size | No | Yes | Yes (with MCC only) |
| KNN | MCAR | Dataset Type | No general observation | No general observation | Highly sensitive |
| KNN | MNAR | Dataset Size | Yes (with MCC only) | Yes | No |
| KNN | MNAR | Dataset Type | Differs based on the metric | Low to high sensitivity | Moderate to high sensitivity |
| Mean | MCAR | Dataset Size | No | Yes | Not applicable |
| Mean | MCAR | Dataset Type | No general observation | No general observation | Not applicable |
| Mean | MNAR | Dataset Size | Yes (with MCC only) | Yes | Not applicable |
| Mean | MNAR | Dataset Type | Low to moderate sensitivity | Low to high sensitivity | Not applicable |