Novel Ensemble Model Recommendation Approach for the Detection of Dyslexia
Abstract
1. Introduction
- An efficient ensemble model is presented and developed for the early detection of dyslexia as a learning disability.
- An experimental model is developed over a pool of ensembles, and each candidate is validated rigorously.
- The recommended ensemble model for the detection of dyslexia is evaluated using the metrics of accuracy, precision, recall, F-measure, and AUC.
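The evaluation metrics named above can be computed directly with scikit-learn. The following is a minimal sketch on hypothetical labels and scores chosen for illustration, not the paper's data:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Hypothetical ground-truth labels and classifier scores (illustration only).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.9, 0.2, 0.8, 0.6, 0.4, 0.7, 0.55, 0.3])
y_pred = (y_score >= 0.5).astype(int)  # threshold scores into class labels

print(f"accuracy  {accuracy_score(y_true, y_pred):.2f}")
print(f"precision {precision_score(y_true, y_pred):.2f}")
print(f"recall    {recall_score(y_true, y_pred):.2f}")
print(f"f1        {f1_score(y_true, y_pred):.2f}")
# AUC is computed from the continuous scores, not the thresholded labels.
print(f"auc       {roc_auc_score(y_true, y_score):.2f}")
```

Note that AUC is threshold-independent, which is why it is passed the raw scores rather than the binarized predictions.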
2. Related Work
3. Novelty of Research
4. Dataset
5. Problem Formulation and Methodology
- (a) A new training dataset is generated by random sampling with replacement from the original dataset.
- (b) A base learner is trained on the training dataset obtained in step (a).
- (c) Steps (a) and (b) are repeated, training base learners on different subsets of the data, until a pre-defined stopping criterion is met.
- (d) The final classifier is formed by aggregating all of the base learner classifiers.
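The bagging steps above can be sketched as follows. This is a minimal illustration on a synthetic stand-in dataset (not the paper's dyslexia data), using decision trees as the base learner, as in the Bagged Trees ensemble:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

# Synthetic stand-in for the training data (illustration only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

rng = np.random.default_rng(0)
n_learners = 10                                   # stopping criterion: fixed pool size
learners = []
for _ in range(n_learners):
    idx = rng.integers(0, len(X), size=len(X))    # (a) bootstrap sample with replacement
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X[idx], y[idx])                      # (b) train a base learner on the sample
    learners.append(tree)                         # (c) repeat over different subsets

# (d) aggregate: majority vote across all base learner classifiers
votes = np.stack([t.predict(X) for t in learners])
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)
```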
- (a) Equal weights are assigned to each sample in the training data.
- (b) The first base learner is trained on the weighted training data.
- (c) Higher weights are assigned to the training instances misclassified by the previous base learner, and the updated sample weights are used to train the next base learner.
- (d) The training error decreases with each succeeding step as the weights of the training instances are updated.
- (e) The final classifier results from the weighted aggregation of the sequentially trained base learners.
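Steps (a)–(e) can be sketched as a minimal AdaBoost loop, again on synthetic stand-in data, with decision stumps as the base learners; labels are mapped to {−1, +1} as the weight-update rule assumes:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=1)
y_pm = np.where(y == 1, 1, -1)            # AdaBoost uses labels in {-1, +1}

n = len(X)
w = np.full(n, 1 / n)                     # (a) equal initial sample weights
stumps, alphas = [], []
for _ in range(20):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y_pm, sample_weight=w)   # (b) train on the weighted data
    pred = stump.predict(X)
    err = w[pred != y_pm].sum()           # weighted training error
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    w *= np.exp(-alpha * y_pm * pred)     # (c) up-weight misclassified instances
    w /= w.sum()                          # (d) renormalized weights drive error down
    stumps.append(stump)
    alphas.append(alpha)

# (e) final classifier: weighted vote of the sequentially trained learners
score = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
y_pred = np.where(score >= 0, 1, 0)
```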
- (a) N subsets are chosen, each containing M features selected at random from the D available features.
- (b) Each random subset is used to train one of the N weak learners.
- (c) Majority voting over the N weak learners produces the final prediction.
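The random subspace steps can be sketched as follows, here with k-nearest-neighbor weak learners (as in the Subspace KNN ensemble) on synthetic stand-in data; the values of N, M, and D are chosen arbitrarily for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=12,
                           n_informative=6, random_state=2)

rng = np.random.default_rng(2)
D, M, N = X.shape[1], 6, 9                      # D features, M per subset, N learners
subsets, learners = [], []
for _ in range(N):
    feats = rng.choice(D, size=M, replace=False)  # (a) M random features out of D
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X[:, feats], y)                       # (b) one weak learner per subset
    subsets.append(feats)
    learners.append(knn)

# (c) majority vote over the N subspace learners
votes = np.stack([m.predict(X[:, f]) for m, f in zip(learners, subsets)])
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)
```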
6. Experimental Results and Discussion
7. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rello, L.; Baeza-Yates, R.; Ali, A.; Bigham, J.P.; Serra, M. Predicting risk of dyslexia with an online gamified test. PLoS ONE 2020, 15, e0241687. [Google Scholar]
- Abed, M.G.; Shackelford, T.K. Saudi public primary school teachers’ knowledge and beliefs about developmental dyslexia. Dyslexia 2021, 28, 244–251. [Google Scholar] [CrossRef] [PubMed]
- Ahmad, N.; Rehman, M.B.; El Hassan, H.M.; Ahmad, I.; Rashid, M. An Efficient Machine Learning-Based Feature Optimization Model for the Detection of Dyslexia. Comput. Intell. Neurosci. 2022, 2022, 8491753. [Google Scholar] [CrossRef] [PubMed]
- Rajapakse, S.; Polwattage, D.; Guruge, U.; Jayathilaka, I.; Edirisinghe, T.; Thelijjagoda, S. ALEXZA: A Mobile Application for Dyslexics Utilizing Artificial Intelligence and Machine Learning Concepts. In Proceedings of the 2018 3rd International Conference on Information Technology Research (ICITR), Moratuwa, Sri Lanka, 5–7 December 2018; IEEE: New York, NY, USA, 2018; pp. 1–6. [Google Scholar]
- Zingoni, A.; Taborri, J.; Panetti, V.; Bonechi, S.; Aparicio-Martínez, P.; Pinzi, S.; Calabrò, G. Investigating Issues and Needs of Dyslexic Students at University: Proof of Concept of an Artificial Intelligence and Virtual Reality-Based Supporting Platform and Preliminary Results. Appl. Sci. 2021, 11, 4624. [Google Scholar] [CrossRef]
- Available online: https://machinelearningmastery.com/overfitting-and-underfitting-with-machine-learning-algorithms (accessed on 10 December 2021).
- Kotu, V.; Deshpande, B. Data Mining Process. In Predictive Analytics and Data Mining; Morgan Kaufmann Boston: Burlington, MA, USA, 2015; pp. 17–36. [Google Scholar] [CrossRef]
- Araque, O.; Corcuera-Platas, I.; Sánchez-Rada, J.F.; Iglesias, C.A. Enhancing deep learning sentiment analysis with ensemble techniques in social applications. Expert Syst. Appl. 2017, 77, 236–246. [Google Scholar] [CrossRef]
- Shaywitz, S.E.; Shaywitz, B.A. Dyslexia (Specific Reading Disability). Biol. Psychiatry 2005, 57, 1301–1309. [Google Scholar] [CrossRef]
- Parmar, S.K.; Ramwala, O.A.; Paunwala, C.N. Performance Evaluation of SVM with Non-Linear Kernels for EEG-based Dyslexia Detection. In Proceedings of the 2021 IEEE 9th Region 10 Humanitarian Technology Conference (R10-HTC), Bangalore, India, 30 September–2 October 2021; IEEE: New York, NY, USA, 2021; pp. 1–6. [Google Scholar]
- Falahati, F.; Westman, E.; Simmons, A. Multivariate Data Analysis and Machine Learning in Alzheimer’s Disease with a Focus on Structural Magnetic Resonance Imaging. J. Alzheimer’s Dis. 2014, 41, 685–708. [Google Scholar] [CrossRef]
- Usman, O.L.; Muniyandi, R.C.; Omar, K.; Mohamad, M. Advance Machine Learning Methods for Dyslexia Biomarker Detection: A Review of Implementation Details and Challenges. IEEE Access 2021, 9, 36879–36897. [Google Scholar] [CrossRef]
- Dasarathy, B.; Sheela, B. A composite classifier system design: Concepts and methodology. Proc. IEEE 1979, 67, 708–713. [Google Scholar] [CrossRef]
- Hansen, L.K.; Salamon, P. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 993–1001. [Google Scholar] [CrossRef]
- Schapire, R.E. The strength of weak learnability. Mach. Learn. 1990, 5, 197–227. [Google Scholar] [CrossRef]
- Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef]
- Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
- Jacobs, R.A.; Jordan, M.I.; Nowlan, S.J.; Hinton, G.E. Adaptive Mixtures of Local Experts. Neural Comput. 1991, 3, 79–87. [Google Scholar] [CrossRef] [PubMed]
- Jordan, M.I.; Jacobs, R.A. Hierarchical mixtures of experts and the EM algorithm. Neural Comput. 1994, 6, 181–214. [Google Scholar] [CrossRef]
- Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
- Benediktsson, J.; Swain, P. Consensus theoretic classification methods. IEEE Trans. Syst. Man Cybern. 1992, 22, 688–704. [Google Scholar] [CrossRef]
- Xu, L.; Krzyzak, A.; Suen, C. Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans. Syst. Man Cybern. 1992, 22, 418–435. [Google Scholar] [CrossRef]
- Ho, T.K.; Hull, J.J.; Srihari, S.N. Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 1994, 16, 66–75. [Google Scholar]
- Rogova, G. Combining the results of several neural network classifiers. Neural Netw. 1994, 7, 777–781. [Google Scholar] [CrossRef]
- Lam, L.; Suen, C.Y. Optimal combinations of pattern classifiers. Pattern Recognit. Lett. 1995, 16, 945–954. [Google Scholar] [CrossRef]
- Woods, K.; Bowyer, K.; Kegelmeyer, W. Combination of multiple classifiers using local accuracy estimates. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 405–410. [Google Scholar] [CrossRef]
- Bloch, I. Information combination operators for data fusion: A comparative review with classification. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 1996, 26, 52–67. [Google Scholar] [CrossRef]
- Cho, S.-B.; Kim, J. Combining multiple neural networks by fuzzy integral for robust classification. IEEE Trans. Syst. Man Cybern. 1995, 25, 380–384. [Google Scholar]
- Kuncheva, L.I.; Bezdek, J.C.; Duin, R.P. Decision templates for multiple classifier fusion: An experimental comparison. Pattern Recognit. 2001, 34, 299–314. [Google Scholar] [CrossRef]
- Drucker, H.; Cortes, C.; Jackel, L.D.; LeCun, Y.; Vapnik, V. Boosting and Other Ensemble Methods. Neural Comput. 1994, 6, 1289–1301. [Google Scholar] [CrossRef]
- Kuncheva, L.I. Classifier ensembles for changing environments. In Proceedings of the 5th International Workshop on Multiple Classifier Systems in Lecture Notes in Computer Science, Cagliari, Italy, 9–11 June 2004; Roli, F., Kittler, J., Windeatt, T., Eds.; Volume 3077, pp. 1–15. [Google Scholar]
- Khaliq, S.; Ramzan, I. Study about awareness of dyslexia among elementary school teachers regarding Pakistani elementary educational institutes. Int. J. Res. Bus. Stud. Manag. 2017, 4, 18–23. [Google Scholar]
- Shrestha, S.; Murano, P. An algorithm for automatically detecting dyslexia on the fly. Int. J. Comput. Sci. Inf. Technol. 2018, 10. [Google Scholar] [CrossRef]
- Kaisar, S. Developmental dyslexia detection using machine learning techniques: A survey. ICT Express 2020, 6, 181–184. [Google Scholar] [CrossRef]
- Bannick, M.S.; McGaughey, M.; Flaxman, A.D. Ensemble modelling in descriptive epidemiology: Burden of disease estimation. Int. J. Epidemiol. 2019, 49, 2065–2073. [Google Scholar] [CrossRef]
- Xiao, Y.; Wu, J.; Lin, Z.; Zhao, X. A deep learning-based multi-model ensemble method for cancer prediction. Comput. Methods Programs Biomed. 2018, 153, 1–9. [Google Scholar] [CrossRef] [PubMed]
- Somasundaram, S.K.; Alli, P. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy. J. Med. Syst. 2017, 41, 201. [Google Scholar] [CrossRef]
- Mienye, I.D.; Sun, Y.; Wang, Z. An improved ensemble learning approach for the prediction of heart disease risk. Informatics Med. Unlocked 2020, 20, 100402. [Google Scholar] [CrossRef]
- Daud, S.M.; Abas, H. ‘Dyslexia Baca’ Mobile App—The Learning Ecosystem for Dyslexic Children. In Proceedings of the 2013 International Conference on Advanced Computer Science Applications and Technologies, Kuching, Malaysia, 23–24 December 2013; pp. 412–416. [Google Scholar] [CrossRef]
- Jincy, M.J.; Subha, H.D. A Study on Dyslexia Using Machine Learning; EasyChair Preprint: Manchester, UK, 2021; p. 5490. [Google Scholar]
- Perera, H.; Shiratuddin, M.F.; Wong, K.W.; Fullarton, K. Eeg signal analysis of writing and typing between adults with dyslexia and normal controls. Int. J. Interact. Multimedia Artif. Intell. 2018, 5, 62. [Google Scholar] [CrossRef]
- Frid, A.; Manevitz, L.M. Features and machine learning for correlating and classifying between brain areas and dyslexia. arXiv 2018, arXiv:1812.10622v2. [Google Scholar]
- Rezvani, Z.; Zare, M.; Žarić, G.; Bonte, M.; Tijms, J.; Van der Molen, M.W.; González, G.F. Machine Learning Classification of Dyslexic Children Based on Eeg Local Network Features. bioRxiv 2019, 1–23. [Google Scholar] [CrossRef]
- Rello, L.; Ballesteros, M. Detecting readers with dyslexia using machine learning with eye tracking measures. In Proceedings of the 12th International Web for All Conference, Florence, Italy, 18–20 May 2015. [Google Scholar]
- Benfatto, M.N.; Seimyr, G.; Ygge, J.; Pansell, T.; Rydberg, A.; Jacobson, C. Screening for Dyslexia Using Eye Tracking during Reading. PLoS ONE 2016, 11, e0165508. [Google Scholar]
- Asvestopoulou, T.; Manousaki, V.; Psistakis, A.; Smyrnakis, I.; Andreadakis, V.; Aslanides, I.M.; Papadopouli, M. Dyslexml: Screening tool for dyslexia using machine learning. arXiv 2019, arXiv:1903.06274. [Google Scholar]
- Cui, Z.; Xia, Z.; Su, M.; Shu, H.; Gong, G. Disrupted white matter connectivity underlying developmental dyslexia: A machine learning approach. Hum. Brain Mapp. 2016, 37, 1443–1458. [Google Scholar] [CrossRef]
Reference | Age Group | Baseline Language | Dataset Size | ML Technique Used | Modality Used |
---|---|---|---|---|---|
[40] | 18 and above | English | 32 | SVM | EEG |
[41] | 6–7 | Hebrew | 32 | SVM, NN | EEG |
[42] | 8 | Dutch | 44 | SVM, KNN | EEG |
[43] | 11–55 | Spanish | 97 | SVM | Eye Tracking |
[44] | 9–10 | Swedish | 185 | SVM | Eye Tracking |
[45] | 8–13 | Greek | 69 | SVM, NB, K-Means | Eye Tracking |
[46] | 10–15 | Mandarin | 61 | SVM, LR | MRI |
[47] | 8–14 | German, Polish, French | 236 | SVM, RF, LR | MRI |
Operating Ensemble | Ensemble Method | Learning Method |
---|---|---|
Ensemble 1 | Adaboost | Decision Trees |
Ensemble 2 | Bagging | Decision Trees |
Ensemble 3 | Subspace | Discriminant |
Ensemble 4 | Subspace | Nearest Neighbors |
Ensemble 5 | RUSBoost | Decision Trees |
Model | Boosted Trees | Bagged Trees | Subspace Discriminant | Subspace KNN | RUS Boosted Trees | Validation |
---|---|---|---|---|---|---|
Accuracy | 87.90% | 89.60% | 89.80% | 88.80% | 78.00% | 5-fold cross validation |
Accuracy | 90.50% | 89.80% | 89.70% | 88.60% | 77.90% | 10-fold cross validation |
Accuracy | 90.30% | 89.70% | 89.70% | 88.70% | 78.90% | 15-fold cross validation |
Accuracy | 90.20% | 89.80% | 89.80% | 88.90% | 78.40% | 20-fold cross validation |
Accuracy | 90.90% | 89.70% | 89.40% | 88.30% | 77.90% | Holdout (30%) |
Accuracy | 92.60% | 99.70% | 90.10% | 99.90% | 82.40% | Without CV |
Average Accuracy | 90.40% | 91.38% | 89.75% | 90.55% | 78.92% | - |
Model | Boosted Trees | Bagged Trees | Subspace Discriminant | Subspace KNN | RUS Boosted Trees | Validation |
---|---|---|---|---|---|---|
Accuracy | 89.10% | 89.10% | 89.20% | 84.60% | 66.70% | 5-fold cross validation |
Accuracy | 89.20% | 88.90% | 89.20% | 83.90% | 68.10% | 10-fold cross validation |
Accuracy | 89.30% | 89.30% | 89.20% | 84.30% | 67.10% | 15-fold cross validation |
Accuracy | 89.10% | 88.90% | 89.20% | 84.40% | 68.60% | 20-fold cross validation |
Accuracy | 88.80% | 89.00% | 89.20% | 83.90% | 61.50% | Holdout (30%) |
Accuracy | 90.10% | 99.70% | 89.20% | 99.90% | 72.60% | Without CV |
Average Accuracy | 89.27% | 90.82% | 89.20% | 86.85% | 67.43% | - |
Ensemble Method | Precision | Recall | F1 Score | AUC | Accuracy (%) |
---|---|---|---|---|---|
Adaboost | 0.98 | 0.91 | 0.94 | 0.84 | 90.3 |
Bagging | 0.99 | 0.90 | 0.94 | 0.80 | 89.8 |
Subspace Discriminant | 0.98 | 0.90 | 0.94 | 0.82 | 89.7 |
Subspace kNN | 0.97 | 0.90 | 0.93 | 0.74 | 88.9 |
RUSBoost | 0.90 | 0.97 | 0.93 | 0.74 | 78.7 |
No. of Learners | Precision | Recall | F1 Score | AUC | Accuracy (%) |
---|---|---|---|---|---|
5 | 0.98 | 0.90 | 0.94 | 0.73 | 89.6 |
10 | 0.98 | 0.90 | 0.94 | 0.79 | 90.3 |
15 | 0.98 | 0.90 | 0.94 | 0.82 | 90.1 |
20 | 0.98 | 0.90 | 0.94 | 0.83 | 90.3 |
25 | 0.98 | 0.91 | 0.94 | 0.84 | 90.4 |
30 | 0.98 | 0.91 | 0.94 | 0.84 | 90.3 |
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
AlGhamdi, A.S. Novel Ensemble Model Recommendation Approach for the Detection of Dyslexia. Children 2022, 9, 1337. https://doi.org/10.3390/children9091337