The Exploration of Predictors for Peruvian Teachers’ Life Satisfaction through an Ensemble of Feature Selection Methods and Machine Learning
Abstract
1. Introduction
2. Literature Review
2.1. Concept of Life Satisfaction and Influencing Factors
2.2. Machine Learning Techniques and Their Application in the Study of Life Satisfaction
2.3. Ensemble of Feature Selection Methods
3. Materials and Methods
3.1. Data Extraction
3.2. Data Cleaning and Preprocessing
3.2.1. Initial Data Exploration
3.2.2. Missing Data Handling
3.2.3. Data Transformation
3.2.4. Split Dataset
3.3. Feature Selection
3.3.1. Feature Selection by Filtering Methods
- Mutual information (MI) is a metric that quantifies the dependence between two variables, indicating to what extent knowledge of a feature helps predict the target variable [112]. Its value is non-negative, where 0 indicates no dependence and MI > 0 indicates some relationship between the feature and the target variable [113]. For the selection of the k best features using the mutual information filter, we set the parameters score_func = mutual_info_classif and k = ‘all’ in the SelectKBest class of the Python module sklearn.feature_selection. Equation (1) allows us to obtain these scores.
- $MI(X;Y) = \int_{Y}\int_{X} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy$ (1), where $p(x,y)$ is the joint probability density function of $x$ and $y$, and $p(x)$ and $p(y)$ are the marginal density functions. In Figure 4a, we show the fifteen most relevant features obtained with this technique.
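As a minimal sketch of this step, the snippet below scores every feature with the mutual information filter using the parameters named above (score_func = mutual_info_classif, k = ‘all’); the synthetic dataset is hypothetical and merely stands in for the preprocessed survey features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Hypothetical stand-in for the preprocessed teacher-survey data.
X, y = make_classification(n_samples=200, n_features=6, n_informative=3,
                           random_state=0)

# Score all features (k='all') with the mutual information filter.
selector = SelectKBest(score_func=mutual_info_classif, k='all').fit(X, y)
scores = selector.scores_           # one non-negative MI score per feature
ranking = np.argsort(scores)[::-1]  # feature indices, most to least relevant
```

Sorting the scores in descending order is what yields a "top-k features" plot of the kind shown in Figure 4a.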
- Analysis of variance (ANOVA F-test) is used to compare the means of different groups and determine whether at least one of the means is significantly different from the others [114]. In the context of feature selection, it is used to assess the relevance of a feature in terms of predicting the target variable [115,116]. In this study, since our target variable is categorical, we use this technique to select numerical features. To do so, we employ the SelectKBest class, with the parameters score_func = f_classif and k = ‘all’ from the Python module sklearn.feature_selection. Equation (2) allows us to obtain the score of this technique.
- $F = \frac{MS_B}{MS_W}$ (2), with $MS_B = \frac{1}{k-1}\sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2$ and $MS_W = \frac{1}{N-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$, where $MS_B$ is the mean of squares between groups, $n_i$ is the number of samples in group $i$, $\bar{x}_i$ is the mean of group $i$, $\bar{x}$ is the overall mean of all groups, and $k$ is the number of groups; $MS_W$ is the mean of squares within groups, $x_{ij}$ is the value of sample $j$ in group $i$, and $N$ is the total number of samples.
- We show the ANOVA F-test filter scores for the prediction of teachers’ life satisfaction in Figure 4b.
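A brief sketch of the ANOVA F-test filter under the same parameters (score_func = f_classif, k = ‘all’); again, the synthetic data here is hypothetical:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Hypothetical numerical features and a categorical target.
X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           random_state=42)

selector = SelectKBest(score_func=f_classif, k='all').fit(X, y)
f_scores = selector.scores_   # one F statistic per feature, as in Equation (2)
p_values = selector.pvalues_  # a small p-value suggests group means differ
```

Larger F scores (smaller p-values) mark the features whose means separate the target classes most clearly, which is what Figure 4b ranks.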
- Chi-square analysis is used to determine whether there is a significant association between two categorical variables [114]. In feature selection, this test is used to assess the relevance of a feature in predicting a target variable [117]. In this study, since our target variable is categorical, we applied this technique to select categorical features. We use the SelectKBest class, with parameters score_func = chi2 and k = ‘all’, from the Python module sklearn.feature_selection. Equation (3) shows how the score is calculated for each feature with this filter.
- $\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$ (3), where $O_i$ is the observed frequency and $E_i$ is the expected frequency.
- In Figure 4c, we show the scores obtained with this filter.
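The chi-square filter can be sketched as follows; note that sklearn's chi2 requires non-negative inputs, so the (hypothetical) categorical features below are integer-coded:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
# Three hypothetical categorical features, integer-coded 0-3.
X = rng.integers(0, 4, size=(200, 3))
# Hypothetical binary target that depends mostly on the first feature.
y = (X[:, 0] + rng.integers(0, 2, 200) > 2)

selector = SelectKBest(score_func=chi2, k='all').fit(X, y)
chi2_scores = selector.scores_  # Equation (3) applied feature by feature
```

Features with the largest chi-square statistics are the ones most associated with the target, as ranked in Figure 4c.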
- Spearman correlation coefficient is a nonparametric measure that evaluates the monotonic relationship between two variables based on the ranks of the data rather than their exact values. It is useful in feature selection in ML to evaluate ordinal or monotonic dependencies between features and the target variable, without requiring assumptions about the distribution of the data [118,119]. We use the spearmanr() function of the Python module scipy.stats to determine the value of the coefficients. Equation (4) allows the calculation of these values.
- $\rho = 1 - \frac{6\sum_{i} d_i^2}{n(n^2 - 1)}$ (4), where $d_i = R(x_i) - R(y_i)$, $R(x_i)$ and $R(y_i)$ are the ranks of the $x$ and $y$ variables, respectively, and $n$ is the number of observations.
- In Figure 5, we show the Spearman correlation matrix between the fifteen most important variables and the variable “teacher life satisfaction”.
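A minimal sketch of the Spearman step with scipy.stats.spearmanr(), here on a single hypothetical feature-target pair rather than the full correlation matrix of Figure 5:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
# Hypothetical feature and a target that depends on it monotonically.
feature = rng.normal(size=100)
target = 2 * feature + rng.normal(scale=0.5, size=100)

# rho near +1 (or -1) indicates a strong monotonic association.
rho, p_value = spearmanr(feature, target)
```

Applying spearmanr() pairwise over the fifteen selected features and the target yields the correlation matrix shown in Figure 5.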
3.3.2. Feature Selection by Integrated Methods
3.3.3. Feature Selection through Ensemble of Methods
3.3.4. Data Subset with the Features Most Relevant to Teachers’ Life Satisfaction
3.4. Training and Model Evaluation
3.4.1. Training and Hyperparameter Tuning
3.4.2. Model Evaluation
4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Feature | Description |
---|---|
P1_24_B | Satisfaction with your health |
P1_24_C | Satisfaction with the living conditions you can provide for your children/family |
P1_24_E | Satisfaction with your job at the educational institution |
P1_24_F | Satisfaction with the conditions for carrying out their teaching duties |
P1_22_A | Degree of trust with the Ministry of Education |
P1_2 | Age |
P1_22_D | Degree of trust with the Local Management Unit (UGEL) |
P1_26_C | Reflection on the results of their pedagogical practice |
P1_26_E | Participation in continuing education programs |
P1_9_D_LV_HORA | Hours dedicated to household chores and childcare/parental care from Monday to Friday |
P1_9_A_SD_HORA | Hours spent on class preparation and administrative tasks on Saturdays and Sundays |
P1_6 | Number of students under your care |
P1_27_E | Difficulty in planning activities under the competency-based approach of the National Basic Education Curriculum |
P1_22_C | Level of trust in the Regional Education Directorate or Management |
P1_26_B | Difficulty in systematizing pedagogical practice |
P1_9_D_SD_HORA | Hours spent on housework and child/parent care on a Saturday and Sunday |
P1_9_E_SD_HORA | Hours devoted to leisure or sports (excluding sleep) on a Saturday and Sunday |
P1_9_A_LV_HORA | Hours spent preparing classes and administrative tasks from Monday to Friday |
References
- Dagli, A.; Baysal, N. Investigating Teachers’ Life Satisfaction. Univers. J. Educ. Res. 2017, 5, 1250–1256. [Google Scholar] [CrossRef]
- Diener, E.; Emmons, R.A.; Larsen, R.J.; Griffin, S. The Satisfaction with Life Scale. J. Personal. Assess. 2010, 49, 71–75. [Google Scholar] [CrossRef] [PubMed]
- Diener, E.; Sapyta, J.J.; Suh, E. Subjective Well-Being Is Essential to Well-Being. Psychol. Inq. 2009, 9, 33–37. [Google Scholar] [CrossRef]
- Lind, N. Better Life Index. In Encyclopedia of Quality of Life and Well-Being Research; Springer: Dordrecht, The Netherlands, 2014; pp. 381–382. [Google Scholar] [CrossRef]
- Helliwell, J.F.; Huang, H.; Shiplett, H.; Wang, S. Happiness of the Younger, the Older, and Those in Between. World Happiness Rep. 2024, 2024, 9–60. [Google Scholar]
- UNDP. Human Development Index (HDI) by Country 2024. Available online: https://worldpopulationreview.com/country-rankings/hdi-by-country (accessed on 11 August 2024).
- Malvaso, A.; Kang, W. The Relationship between Areas of Life Satisfaction, Personality, and Overall Life Satisfaction: An Integrated Account. Front. Psychol. 2022, 13, 894610. [Google Scholar] [CrossRef]
- Angelini, V.; Cavapozzi, D.; Corazzini, L.; Paccagnella, O. Age, Health and Life Satisfaction Among Older Europeans. Soc. Indic. Res. 2012, 105, 293–308. [Google Scholar] [CrossRef]
- Hong, Y.Z.; Su, Y.J.; Chang, H.H. Analyzing the Relationship between Income and Life Satisfaction of Forest Farm Households—a Behavioral Economics Approach. For. Policy Econ. 2023, 148, 102916. [Google Scholar] [CrossRef]
- Joshanloo, M.; Jovanović, V. The Relationship between Gender and Life Satisfaction: Analysis across Demographic Groups and Global Regions. Arch. Women’s Ment. Health 2020, 23, 331–338. [Google Scholar] [CrossRef]
- Rogowska, A.M.; Meres, H. The Mediating Role of Job Satisfaction in the Relationship between Emotional Intelligence and Life Satisfaction among Teachers during the COVID-19 Pandemic. Eur. J. Investig. Health Psychol. Educ. 2022, 12, 666–676. [Google Scholar] [CrossRef]
- Kida, H.; Niimura, H.; Eguchi, Y.; Suzuki, K.; Shikimoto, R.; Bun, S.; Takayama, M.; Mimura, M. Relationship Between Life Satisfaction and Psychological Characteristics Among Community-Dwelling Oldest-Old: Focusing on Erikson’s Developmental Stages and the Big Five Personality Traits. Am. J. Geriatr. Psychiatry 2024, 32, 724–735. [Google Scholar] [CrossRef]
- Kuykendall, L.; Tay, L.; Ng, V. Leisure Engagement and Subjective Well-Being: A Meta-Analysis. Psychol. Bull. 2015, 141, 364–403. [Google Scholar] [CrossRef] [PubMed]
- Znidaršič, J.; Marič, M. Relationships between Work-Family Balance, Job Satisfaction, Life Satisfaction and Work Engagement among Higher Education Lecturers. Organizacija 2021, 54, 227–237. [Google Scholar] [CrossRef]
- Liu, Y.S.; Lu, C.W.; Chung, H.T.; Wang, J.K.; Su, W.J.; Chen, C.W. Health-Promoting Lifestyle and Life Satisfaction in Full-Time Employed Adults with Congenital Heart Disease: Grit as a Mediator. Eur. J. Cardiovasc. Nurs. 2024, 23, 348–357. [Google Scholar] [CrossRef]
- Kim, E.-J.; Kang, H.-W.; Sala, A.; Kim, E.-J.; Kang, H.-W.; Park, S.-M. Leisure and Happiness of the Elderly: A Machine Learning Approach. Sustainability 2024, 16, 2730. [Google Scholar] [CrossRef]
- Phulkerd, S.; Thapsuwan, S.; Chamratrithirong, A.; Gray, R.S. Influence of Healthy Lifestyle Behaviors on Life Satisfaction in the Aging Population of Thailand: A National Population-Based Survey. BMC Public Health 2021, 21, 43. [Google Scholar] [CrossRef]
- Zagkas, D.G.; Chrousos, G.P.; Bacopoulou, F.; Kanaka-Gantenbein, C.; Vlachakis, D.; Tzelepi, I.; Darviri, C. Stress and Well-Being of Greek Primary School Educators: A Cross-Sectional Study. Int. J. Environ. Res. Public Health 2023, 20, 5390. [Google Scholar] [CrossRef]
- Pagán-Castaño, E.; Sánchez-García, J.; Garrigos-Simon, F.J.; Guijarro-García, M. The Influence of Management on Teacher Well-Being and the Development of Sustainable Schools. Sustainability 2021, 13, 2909. [Google Scholar] [CrossRef]
- Ao, N.; Zhang, S.; Tian, G.; Zhu, X.; Kang, X. Exploring Teacher Wellbeing in Educational Reforms: A Chinese Perspective. Front. Psychol. 2023, 14, 1265536. [Google Scholar] [CrossRef] [PubMed]
- Natha, P.; RajaRajeswari, P. Advancing Skin Cancer Prediction Using Ensemble Models. Computers 2024, 13, 157. [Google Scholar] [CrossRef]
- Conte, L.; De Nunzio, G.; Giombi, F.; Lupo, R.; Arigliani, C.; Leone, F.; Salamanca, F.; Petrelli, C.; Angelelli, P.; De Benedetto, L.; et al. Machine Learning Models to Enhance the Berlin Questionnaire Detection of Obstructive Sleep Apnea in At-Risk Patients. Appl. Sci. 2024, 14, 5959. [Google Scholar] [CrossRef]
- Ghassemi, M.; Naumann, T.; Schulam, P.; Beam, A.L.; Chen, I.Y.; Ranganath, R. A Review of Challenges and Opportunities in Machine Learning for Health. AMIA Summits Transl. Sci. Proc. 2020, 2020, 191. [Google Scholar] [PubMed]
- Stephen, O.; Sain, M.; Maduh, U.J.; Jeong, D.U. An Efficient Deep Learning Approach to Pneumonia Classification in Healthcare. J. Healthc. Eng. 2019, 2019, 4180949. [Google Scholar] [CrossRef]
- Spencer, R.; Thabtah, F.; Abdelhamid, N.; Thompson, M. Exploring Feature Selection and Classification Methods for Predicting Heart Disease. Digit. Health 2020, 6, 2055207620914777. [Google Scholar] [CrossRef] [PubMed]
- Hamdia, K.M.; Zhuang, X.; Rabczuk, T. An Efficient Optimization Approach for Designing Machine Learning Models Based on Genetic Algorithm. Neural Comput. Appl. 2021, 33, 1923–1933. [Google Scholar] [CrossRef]
- Bhosekar, A.; Ierapetritou, M. Modular Design Optimization Using Machine Learning-Based Flexibility Analysis. J. Process Control 2020, 90, 18–34. [Google Scholar] [CrossRef]
- Yogesh, I.; Suresh Kumar, K.R.; Candrashekaran, N.; Reddy, D.; Sampath, H. Predicting Job Satisfaction and Employee Turnover Using Machine Learning. J. Comput. Theor. Nanosci. 2020, 17, 4092–4097. [Google Scholar] [CrossRef]
- Celbiş, M.G.; Wong, P.H.; Kourtit, K.; Nijkamp, P. Job Satisfaction and the ‘Great Resignation’: An Exploratory Machine Learning Analysis. Soc. Indic. Res. 2023, 170, 1097–1118. [Google Scholar] [CrossRef]
- Gupta, A.; Chadha, A.; Tiwari, V.; Varma, A.; Pereira, V. Sustainable Training Practices: Predicting Job Satisfaction and Employee Behavior Using Machine Learning Techniques. Asian Bus. Manag. 2023, 22, 1913–1936. [Google Scholar] [CrossRef]
- Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine Learning in Agriculture: A Review. Sensors 2018, 18, 2674. [Google Scholar] [CrossRef]
- Sharma, A.; Jain, A.; Gupta, P.; Chowdary, V. Machine Learning Applications for Precision Agriculture: A Comprehensive Review. IEEE Access 2021, 9, 4843–4873. [Google Scholar] [CrossRef]
- Benos, L.; Tagarakis, A.C.; Dolias, G.; Berruto, R.; Kateris, D.; Bochtis, D. Machine Learning in Agriculture: A Comprehensive Updated Review. Sensors 2021, 21, 3758. [Google Scholar] [CrossRef] [PubMed]
- Pallathadka, H.; Mustafa, M.; Sanchez, D.T.; Sekhar Sajja, G.; Gour, S.; Naved, M. Impact of Machine Learning on Management, Healthcare and Agriculture. Mater. Today Proc. 2023, 80, 2803–2806. [Google Scholar] [CrossRef]
- McQueen, R.J.; Garner, S.R.; Nevill-Manning, C.G.; Witten, I.H. Applying Machine Learning to Agricultural Data. Comput. Electron. Agric. 1995, 12, 275–293. [Google Scholar] [CrossRef]
- Leo, M.; Sharma, S.; Maddulety, K. Machine Learning in Banking Risk Management: A Literature Review. Risks 2019, 7, 29. [Google Scholar] [CrossRef]
- Mashrur, A.; Luo, W.; Zaidi, N.A.; Robles-Kelly, A. Machine Learning for Financial Risk Management: A Survey. IEEE Access 2020, 8, 203203–203223. [Google Scholar] [CrossRef]
- Aziz, S.; Dowling, M.M. AI and Machine Learning for Risk Management. SSRN Electron. J. 2018, 33–50. [Google Scholar] [CrossRef]
- Mandapuram, M.; Mandapuram, M.; Gutlapalli, S.S.; Reddy, M.; Bodepudi, A. Application of Artificial Intelligence (AI) Technologies to Accelerate Market Segmentation. Glob. Discl. Econ. Bus. 2020, 9, 141–150. [Google Scholar] [CrossRef]
- Ngai, E.W.T.; Wu, Y. Machine Learning in Marketing: A Literature Review, Conceptual Framework, and Research Agenda. J. Bus. Res. 2022, 145, 35–48. [Google Scholar] [CrossRef]
- Yoganarasimhan, H. Search Personalization Using Machine Learning. Manag. Sci. 2019, 66, 1045–1070. [Google Scholar] [CrossRef]
- Greene, T.; Shmueli, G. How Personal Is Machine Learning Personalization? arXiv 2019, arXiv:1912.07938. [Google Scholar]
- Lovera, F.A.; Cardinale, Y. Sentiment Analysis in Twitter: A Comparative Study. Rev. Cient. Sist. E Informática 2023, 3, e418. [Google Scholar] [CrossRef]
- Sentieiro, D.H. Machine Learning for Autonomous Vehicle Route Planning and Optimization. J. AI-Assist. Sci. Discov. 2022, 2, 1–20. [Google Scholar]
- Lazar, D.A.; Bıyık, E.; Sadigh, D.; Pedarsani, R. Learning How to Dynamically Route Autonomous Vehicles on Shared Roads. Transp. Res. Part C: Emerg. Technol. 2021, 130, 103258. [Google Scholar] [CrossRef]
- Lee, S.; Kim, Y.; Kahng, H.; Lee, S.K.; Chung, S.; Cheong, T.; Shin, K.; Park, J.; Kim, S.B. Intelligent Traffic Control for Autonomous Vehicle Systems Based on Machine Learning. Expert Syst. Appl. 2020, 144, 113074. [Google Scholar] [CrossRef]
- Liu, Y.; Fan, S.; Xu, S.; Sajjanhar, A.; Yeom, S.; Wei, Y. Predicting Student Performance Using Clickstream Data and Machine Learning. Educ. Sci. 2022, 13, 17. [Google Scholar] [CrossRef]
- Alghamdi, A.S.; Rahman, A. Data Mining Approach to Predict Success of Secondary School Students: A Saudi Arabian Case Study. Educ. Sci. 2023, 13, 293. [Google Scholar] [CrossRef]
- Bayazit, A.; Apaydin, N.; Gonullu, I. Predicting At-Risk Students in an Online Flipped Anatomy Course Using Learning Analytics. Educ. Sci. 2022, 12, 581. [Google Scholar] [CrossRef]
- Zhang, C.; Ahn, H. E-Learning at-Risk Group Prediction Considering the Semester and Realistic Factors. Educ. Sci. 2023, 13, 1130. [Google Scholar] [CrossRef]
- MINEDU. Ministerio de Educación del Perú|MINEDU. Available online: http://www.minedu.gob.pe/politicas/docencia/encuesta-nacional-a-docentes-endo.php (accessed on 8 May 2021).
- Diener, E.; Diener, M. Cross-Cultural Correlates of Life Satisfaction and Self-Esteem. J. Personal. Soc. Psychol. 1995, 68, 653–663. [Google Scholar] [CrossRef]
- Karataş, Z.; Uzun, K.; Tagay, Ö. Relationships Between the Life Satisfaction, Meaning in Life, Hope and COVID-19 Fear for Turkish Adults During the COVID-19 Outbreak. Front. Psychol. 2021, 12, 633384. [Google Scholar] [CrossRef]
- Szcześniak, M.; Tułecka, M. Family Functioning and Life Satisfaction: The Mediatory Role of Emotional Intelligence. Psychol. Res. Behav. Manag. 2020, 13, 223–232. [Google Scholar] [CrossRef]
- Wang, Y.; Zhang, D. Economic Income and Life Satisfaction of Rural Chinese Older Adults: The Effects of Physical Health and Ostracism. Research Square 2022. [Google Scholar] [CrossRef]
- Judge, T.A.; Piccolo, R.F.; Podsakoff, N.P.; Shaw, J.C.; Rich, B.L. The Relationship between Pay and Job Satisfaction: A Meta-Analysis of the Literature. J. Vocat. Behav. 2010, 77, 157–167. [Google Scholar] [CrossRef]
- Haar, J.M.; Russo, M.; Suñe, A.; Ollier-Malaterre, A. Outcomes of Work–Life Balance on Job Satisfaction, Life Satisfaction and Mental Health: A Study across Seven Cultures. J. Vocat. Behav. 2014, 85, 361–373. [Google Scholar] [CrossRef]
- Noda, H. Work–Life Balance and Life Satisfaction in OECD Countries: A Cross-Sectional Analysis. J. Happiness Stud. 2020, 21, 1325–1348. [Google Scholar] [CrossRef]
- Author, C.; Hee Park, K. The Relationships between Well-Being Lifestyle, Well-Being Attitude, Life Satisfaction, and Demographic Characteristics. J. Korean Home Econ. Assoc. 2011, 49, 39–49. [Google Scholar] [CrossRef]
- Luque-Reca, O.; García-Martínez, I.; Pulido-Martos, M.; Lorenzo Burguera, J.; Augusto-Landa, J.M. Teachers’ Life Satisfaction: A Structural Equation Model Analyzing the Role of Trait Emotion Regulation, Intrinsic Job Satisfaction and Affect. Teach. Teach. Educ. 2022, 113, 103668. [Google Scholar] [CrossRef]
- Lent, R.W.; Nota, L.; Soresi, S.; Ginevra, M.C.; Duffy, R.D.; Brown, S.D. Predicting the Job and Life Satisfaction of Italian Teachers: Test of a Social Cognitive Model. J. Vocat. Behav. 2011, 79, 91–97. [Google Scholar] [CrossRef]
- Cayupe, J.C.; Bernedo-Moreira, D.H.; Morales-García, W.C.; Alcaraz, F.L.; Peña, K.B.C.; Saintila, J.; Flores-Paredes, A. Self-Efficacy, Organizational Commitment, Workload as Predictors of Life Satisfaction in Elementary School Teachers: The Mediating Role of Job Satisfaction. Front. Psychol. 2023, 14, 1066321. [Google Scholar] [CrossRef]
- Marcionetti, J.; Castelli, L. The Job and Life Satisfaction of Teachers: A Social Cognitive Model Integrating Teachers’ Burnout, Self-Efficacy, Dispositional Optimism, and Social Support. Int. J. Educ. Vocat. Guid. 2023, 23, 441–463. [Google Scholar] [CrossRef]
- Bano, S.; Malik, S.; Sadia, M. Effect of Occupational Stress on Life Satisfaction among Private and Public School Teachers. JISR Manag. Soc. Sci. Econ. 2014, 12, 61–72. [Google Scholar] [CrossRef]
- Quinteros-Durand, R.; Almanza-Cabe, R.B.; Morales-García, W.C.; Mamani-Benito, O.; Sairitupa-Sanchez, L.Z.; Puño-Quispe, L.; Saintila, J.; Saavedra-Sandoval, R.; Paredes, A.F.; Ramírez-Coronel, A.A. Influence of Servant Leadership on the Life Satisfaction of Basic Education Teachers: The Mediating Role of Satisfaction with Job Resources. Front. Psychol. 2023, 14, 1167074. [Google Scholar] [CrossRef]
- Sanchez-Martinez, S.; Camara, O.; Piella, G.; Cikes, M.; González-Ballester, M.Á.; Miron, M.; Vellido, A.; Gómez, E.; Fraser, A.G.; Bijnens, B. Machine Learning for Clinical Decision-Making: Challenges and Opportunities in Cardiovascular Imaging. Front. Cardiovasc. Med. 2021, 8, 765693. [Google Scholar] [CrossRef] [PubMed]
- Byeon, H. Application of Artificial Neural Network Analysis and Decision Tree Analysis to Develop a Model for Predicting Life Satisfaction of the Elderly in South Korea. Int. J. Eng. Technol. 2018, 7, 161–166. [Google Scholar] [CrossRef]
- Zhang, J.; Li, L. A Study on Life Satisfaction Prediction of the Elderly Based on SVM; Association for Computing Machinery: New York, NY, USA, 2023; pp. 16–21. [Google Scholar] [CrossRef]
- Pan, Z.; Cutumisu, M. Using Machine Learning to Predict UK and Japanese Secondary Students’ Life Satisfaction in PISA 2018. Br. J. Educ. Psychol. 2024, 94, 474–498. [Google Scholar] [CrossRef]
- Khan, A.E.; Hasan, M.J.; Anjum, H.; Mohammed, N.; Momen, S. Predicting Life Satisfaction Using Machine Learning and Explainable AI. Heliyon 2024, 10, e31158. [Google Scholar] [CrossRef]
- Jaiswal, R.; Gupta, S. Money Talks, Happiness Walks: Dissecting the Secrets of Global Bliss with Machine Learning. J. Chin. Econ. Bus. Stud. 2024, 22, 111–158. [Google Scholar] [CrossRef]
- Morrone, A.; Piscitelli, A.; D’Ambrosio, A. How Disadvantages Shape Life Satisfaction: An Alternative Methodological Approach. Soc. Indic. Res. 2019, 141, 477–502. [Google Scholar] [CrossRef]
- Lee, S. Exploring Factors Influencing Life Satisfaction of Youth Using Random Forests. J. Ind. Converg. 2023, 21, 9–17. [Google Scholar] [CrossRef]
- Shen, X.; Yin, F.; Jiao, C. Predictive Models of Life Satisfaction in Older People: A Machine Learning Approach. Int. J. Environ. Res. Public Health 2023, 20, 2445. [Google Scholar] [CrossRef]
- Jang, J.H.; Masatsuku, N. A Study of Factors Influencing Happiness in Korea: Topic Modelling and Neural Network Analysis [Estudio de Los Factores Que Influyen En La Felicidad En Corea: Modelización de Temas y Análisis de Redes Neuronales]. Data Metadata 2024, 3, 238. [Google Scholar] [CrossRef]
- Pudjihartono, N.; Fadason, T.; Kempa-Liehr, A.W.; O’Sullivan, J.M. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Front. Bioinform. 2022, 2, 927312. [Google Scholar] [CrossRef] [PubMed]
- Yang, P.; Zhou, B.B.; Yang, J.Y.H.; Zomaya, A.Y. Stability of Feature Selection Algorithms and Ensemble Feature Selection Methods in Bioinformatics. In Biological Knowledge Discovery Handbook: Preprocessing, Mining and Postprocessing of Biological Data; John Wiley & Sons, Inc.: New York, NY, USA, 2014; pp. 333–352. [Google Scholar] [CrossRef]
- Abeel, T.; Helleputte, T.; Van de Peer, Y.; Dupont, P.; Saeys, Y. Robust Biomarker Identification for Cancer Diagnosis with Ensemble Feature Selection Methods. Bioinformatics 2010, 26, 392–398. [Google Scholar] [CrossRef] [PubMed]
- Wang, J.; Xu, J.; Zhao, C.; Peng, Y.; Wang, H. An Ensemble Feature Selection Method for High-Dimensional Data Based on Sort Aggregation. Syst. Sci. Control. Eng. 2019, 7, 32–39. [Google Scholar] [CrossRef]
- Tsai, C.F.; Sung, Y.T. Ensemble Feature Selection in High Dimension, Low Sample Size Datasets: Parallel and Serial Combination Approaches. Knowl.-Based Syst. 2020, 203, 106097. [Google Scholar] [CrossRef]
- Hoque, N.; Singh, M.; Bhattacharyya, D.K. EFS-MI: An Ensemble Feature Selection Method for Classification. Complex Intell. Syst. 2017, 4, 105–118. [Google Scholar] [CrossRef]
- Seijo-Pardo, B.; Porto-Díaz, I.; Bolón-Canedo, V.; Alonso-Betanzos, A. Ensemble Feature Selection: Homogeneous and Heterogeneous Approaches. Knowl.-Based Syst. 2017, 118, 124–139. [Google Scholar] [CrossRef]
- Ben Brahim, A.; Limam, M. Ensemble Feature Selection for High Dimensional Data: A New Method and a Comparative Study. Adv. Data Anal. Classif. 2018, 12, 937–952. [Google Scholar] [CrossRef]
- Neumann, U.; Genze, N.; Heider, D. EFS: An Ensemble Feature Selection Tool Implemented as R-Package and Web-Application. BioData Min. 2017, 10, 21. [Google Scholar] [CrossRef]
- Werner de Vargas, V.; Schneider Aranda, J.A.; dos Santos Costa, R.; da Silva Pereira, P.R.; Victória Barbosa, J.L. Imbalanced Data Preprocessing Techniques for Machine Learning: A Systematic Mapping Study. Knowl. Inf. Syst. 2023, 65, 31–57. [Google Scholar] [CrossRef]
- Gardner, W.; Winkler, D.A.; Alexander, D.L.J.; Ballabio, D.; Muir, B.W.; Pigram, P.J. Effect of Data Preprocessing and Machine Learning Hyperparameters on Mass Spectrometry Imaging Models. J. Vac. Sci. Technol. A 2023, 41, 63204. [Google Scholar] [CrossRef]
- Frye, M.; Mohren, J.; Schmitt, R.H. Benchmarking of Data Preprocessing Methods for Machine Learning-Applications in Production. Procedia CIRP 2021, 104, 50–55. [Google Scholar] [CrossRef]
- Maharana, K.; Mondal, S.; Nemade, B. A Review: Data Pre-Processing and Data Augmentation Techniques. Glob. Transit. Proc. 2022, 3, 91–99. [Google Scholar] [CrossRef]
- Dina Diatta, I.; Berchtold, A. Impact of Missing Information on Day-to-Day Research Based on Secondary Data. Int. J. Soc. Res. Methodol. 2023, 26, 759–772. [Google Scholar] [CrossRef]
- Austin, P.C.; White, I.R.; Lee, D.S.; van Buuren, S. Missing Data in Clinical Research: A Tutorial on Multiple Imputation. Can. J. Cardiol. 2021, 37, 1322–1331. [Google Scholar] [CrossRef]
- Emmanuel, T.; Maupong, T.; Mpoeleng, D.; Semong, T.; Mphago, B.; Tabona, O. A Survey on Missing Data in Machine Learning. J. Big Data 2021, 8, 140. [Google Scholar] [CrossRef] [PubMed]
- Memon, S.M.; Wamala, R.; Kabano, I.H. A Comparison of Imputation Methods for Categorical Data. Inform. Med. Unlocked 2023, 42, 101382. [Google Scholar] [CrossRef]
- Kosaraju, N.; Sankepally, S.R.; Mallikharjuna Rao, K. Categorical Data: Need, Encoding, Selection of Encoding Method and Its Emergence in Machine Learning Models—A Practical Review Study on Heart Disease Prediction Dataset Using Pearson Correlation. Lect. Notes Networks Syst. 2023, 1, 369–382. [Google Scholar] [CrossRef]
- Mallikharjuna Rao, K.; Saikrishna, G.; Supriya, K. Data Preprocessing Techniques: Emergence and Selection towards Machine Learning Models—A Practical Review Using HPA Dataset. Multimed. Tools Appl. 2023, 82, 37177–37196. [Google Scholar] [CrossRef]
- Vowels, L.M.; Vowels, M.J.; Mark, K.P. Identifying the Strongest Self-Report Predictors of Sexual Satisfaction Using Machine Learning. J. Soc. Pers. Relat. 2022, 39, 1191–1212. [Google Scholar] [CrossRef]
- Zhang, H.; Zheng, G.; Xu, J.; Yao, X. Research on the Construction and Realization of Data Pipeline in Machine Learning Regression Prediction. Math. Probl. Eng. 2022, 2022, 7924335. [Google Scholar] [CrossRef]
- Md, A.Q.; Kulkarni, S.; Joshua, C.J.; Vaichole, T.; Mohan, S.; Iwendi, C. Enhanced Preprocessing Approach Using Ensemble Machine Learning Algorithms for Detecting Liver Disease. Biomedicines 2023, 11, 581. [Google Scholar] [CrossRef] [PubMed]
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Daza Vergaray, A.; Miranda, J.C.H.; Cornelio, J.B.; López Carranza, A.R.; Ponce Sánchez, C.F. Predicting the Depression in University Students Using Stacking Ensemble Techniques over Oversampling Method. Inform. Med. Unlocked 2023, 41, 101295. [Google Scholar] [CrossRef]
- Lemaître, G.; Nogueira, F.; Aridas, C.K. Imbalanced-Learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning. J. Mach. Learn. Res. 2017, 18, 1–5. [Google Scholar]
- Wang, L.; Han, M.; Li, X.; Zhang, N.; Cheng, H. Review of Classification Methods on Unbalanced Data Sets. IEEE Access 2021, 9, 64606–64628. [Google Scholar] [CrossRef]
- Viloria, A.; Lezama, O.B.P.; Mercado-Caruzo, N. Unbalanced Data Processing Using Oversampling: Machine Learning. Procedia Comput. Sci. 2020, 175, 108–113. [Google Scholar] [CrossRef]
- Xu, Y.; Park, Y.; Park, J.D.; Sun, B. Predicting Nurse Turnover for Highly Imbalanced Data Using the Synthetic Minority Over-Sampling Technique and Machine Learning Algorithms. Healthcare 2023, 11, 3173. [Google Scholar] [CrossRef]
- Kalimuthan, C.; Arokia Renjit, J. Review on Intrusion Detection Using Feature Selection with Machine Learning Techniques. Mater. Today Proc. 2020, 33, 3794–3802. [Google Scholar] [CrossRef]
- Abubakar, S.M.; Sufyanu, Z.; Abubakar, M.M. A survey of feature selection methods for software defect prediction models. FUDMA J. Sci. 2020, 4, 62–68. [Google Scholar]
- Chandrashekar, G.; Sahin, F. A Survey on Feature Selection Methods. Comput. Electr. Eng. 2014, 40, 16–28. [Google Scholar] [CrossRef]
- Miao, J.; Niu, L. A Survey on Feature Selection. Procedia Comput. Sci. 2016, 91, 919–926. [Google Scholar] [CrossRef]
- Jia, W.; Sun, M.; Lian, J.; Hou, S. Feature Dimensionality Reduction: A Review. Complex Intell. Syst. 2022, 8, 2663–2693. [Google Scholar] [CrossRef]
- Altae, A.A.; Rad, A.E.; Tati, R. Comparative Study on Effective Feature Selection Methods. Int. J. Innov. Eng. Manag. Res. Forthcoming. 2023. [Google Scholar]
- Tang, J.; Alelyani, S.; Liu, H. Feature Selection for Classification: A Review. Data Classif. Algorithms Appl. 2014, 37–64. [Google Scholar] [CrossRef]
- Bommert, A.; Sun, X.; Bischl, B.; Rahnenführer, J.; Lang, M. Benchmark for Filter Methods for Feature Selection in High-Dimensional Classification Data. Comput. Stat. Data Anal. 2020, 143, 106839. [Google Scholar] [CrossRef]
- Nguyen, H.B.; Xue, B.; Andreae, P. Mutual Information Estimation for Filter Based Feature Selection Using Particle Swarm Optimization. In Applications of Evolutionary Computation, Proceedings of the 19th European Conference, Porto, Portugal, 30 March–1 April 2016; pp. 719–736. [CrossRef]
- Vergara, J.R.; Estévez, P.A. A Review of Feature Selection Methods Based on Mutual Information. Neural Comput. Appl. 2014, 24, 175–186. [Google Scholar] [CrossRef]
- Dissanayake, K.; Johar, M.G.M. Comparative Study on Heart Disease Prediction Using Feature Selection Techniques on Classification Algorithms. Appl. Comput. Intell. Soft Comput. 2021, 2021, 5581806. [Google Scholar] [CrossRef]
- Tripathy, G.; Sharaff, A. AEGA: Enhanced Feature Selection Based on ANOVA and Extended Genetic Algorithm for Online Customer Review Analysis. J. Supercomput. 2023, 79, 13180–13209. [Google Scholar] [CrossRef]
- Raufi, B.; Longo, L. Comparing ANOVA and PowerShap Feature Selection Methods via Shapley Additive Explanations of Models of Mental Workload Built with the Theta and Alpha EEG Band Ratios. BioMedInformatics 2024, 4, 853–876. [Google Scholar] [CrossRef]
- Laborda, J.; Ryoo, S. Feature Selection in a Credit Scoring Model. Mathematics 2021, 9, 746. [Google Scholar] [CrossRef]
- Jiang, J.; Zhang, X.; Yuan, Z. Feature Selection for Classification with Spearman’s Rank Correlation Coefficient-Based Self-Information in Divergence-Based Fuzzy Rough Sets. Expert Syst. Appl. 2024, 249, 123633. [Google Scholar] [CrossRef]
- Tang, M.; Zhao, Q.; Wu, H.; Wang, Z. Cost-Sensitive LightGBM-Based Online Fault Detection Method for Wind Turbine Gearboxes. Front. Energy Res. 2021, 9, 701574. [Google Scholar] [CrossRef]
- Liu, H.; Zhou, M.; Liu, Q. An Embedded Feature Selection Method for Imbalanced Data Classification. IEEE/CAA J. Autom. Sin. 2019, 6, 703–715. [Google Scholar] [CrossRef]
- Elgeldawi, E.; Sayed, A.; Galal, A.R.; Zaki, A.M. Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis. Informatics 2021, 8, 79. [Google Scholar] [CrossRef]
- Papernot, N.; Steinke, T. Hyperparameter Tuning with Renyi Differential Privacy. arXiv 2021, arXiv:2110.03620. [Google Scholar]
- Bacanin, N.; Stoean, C.; Zivkovic, M.; Rakic, M.; Strulak-Wójcikiewicz, R.; Stoean, R. On the Benefits of Using Metaheuristics in the Hyperparameter Tuning of Deep Learning Models for Energy Load Forecasting. Energies 2023, 16, 1434. [Google Scholar] [CrossRef]
- Ali, Y.A.; Awwad, E.M.; Al-Razgan, M.; Maarouf, A. Hyperparameter Search for Machine Learning Algorithms for Optimizing the Computational Complexity. Processes 2023, 11, 349. [Google Scholar] [CrossRef]
- Rajendran, S.; Chamundeswari, S.; Sinha, A.A. Predicting the Academic Performance of Middle- and High-School Students Using Machine Learning Algorithms. Soc. Sci. Humanit. Open 2022, 6, 100357. [Google Scholar] [CrossRef]
- Passos, D.; Mishra, P. A Tutorial on Automatic Hyperparameter Tuning of Deep Spectral Modelling for Regression and Classification Tasks. Chemom. Intell. Lab. Syst. 2022, 223, 104520. [Google Scholar] [CrossRef]
- Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
- Heydarian, M.; Doyle, T.E.; Samavi, R. MLCM: Multi-Label Confusion Matrix. IEEE Access 2022, 10, 19083–19095. [Google Scholar] [CrossRef]
- Prusty, S.; Patnaik, S.; Dash, S.K. SKCV: Stratified K-Fold Cross-Validation on ML Classifiers for Predicting Cervical Cancer. Front. Nanotechnol. 2022, 4, 972421. [Google Scholar] [CrossRef]
- Li, X.; Lin, X.; Zhang, F.; Tian, Y. Playing Roles in Work and Family: Effects of Work/Family Conflicts on Job and Life Satisfaction Among Junior High School Teachers. Front. Psychol. 2021, 12, 772025. [Google Scholar] [CrossRef]
- Judge, T.A.; Bono, J.E. Relationship of Core Self-Evaluations Traits—Self-Esteem, Generalized Self-Efficacy, Locus of Control, and Emotional Stability—With Job Satisfaction and Job Performance: A Meta-Analysis. J. Appl. Psychol. 2001, 86, 80–92. [Google Scholar] [CrossRef] [PubMed]
- Holgado-Apaza, L.A.; Carpio-Vargas, E.E.; Calderon-Vilca, H.D.; Maquera-Ramirez, J.; Ulloa-Gallardo, N.J.; Acosta-Navarrete, M.S.; Barrón-Adame, J.M.; Quispe-Layme, M.; Hidalgo-Pozzi, R.; Valles-Coral, M. Modeling Job Satisfaction of Peruvian Basic Education Teachers Using Machine Learning Techniques. Appl. Sci. 2023, 13, 3945. [Google Scholar] [CrossRef]
- Cole, C.; Hinchcliff, E.; Carling, R. Reflection as Teachers: Our Critical Developments. Front. Educ. 2022, 7, 1037280. [Google Scholar] [CrossRef]
- Shandomo, H.M. The Role of Critical Reflection in Teacher Education. Sch.-Univ. Partnersh. 2010, 4, 101–113. [Google Scholar]
- Shiri, R.; El-Metwally, A.; Sallinen, M.; Pöyry, M.; Härmä, M.; Toppinen-Tanner, S. The Role of Continuing Professional Training or Development in Maintaining Current Employment: A Systematic Review. Healthcare 2023, 11, 2900. [Google Scholar] [CrossRef]
- Law, S.F.; Le, A.T. A Systematic Review of Empirical Studies on Trust between Universities and Society. J. High. Educ. Policy Manag. 2023, 45, 393–408. [Google Scholar] [CrossRef]
- OECD. OECD Survey on Drivers of Trust in Public Institutions—2024 Results: Building Trust in a Complex Policy Environment; OECD: Paris, France, 2024. [Google Scholar] [CrossRef]
- Helliwell, J.F.; Huang, H. New Measures of the Costs of Unemployment: Evidence from the Subjective Well-Being of 3.3 Million Americans. Econ. Inq. 2014, 52, 1485–1502. [Google Scholar] [CrossRef]
- Helliwell, J.; Layard, R.; Sachs, J.; De Neve, J.-E.; Aknin, L. Happiness and Age: Summary. The World Happiness Report. Available online: https://worldhappiness.report/ed/2024/happiness-and-age-summary/ (accessed on 15 August 2024).
- Cho, H.; Pyun, D.Y.; Wang, C.K.J. Teachers’ Work-Life Balance: The Effect of Work-Leisure Conflict on Work-Related Outcomes. Asia Pac. J. Educ. 2023, 1–16. [Google Scholar] [CrossRef]
- Ertürk, R. The Effect of Teachers’ Quality of Work Life on Job Satisfaction and Turnover Intentions. Int. J. Contemp. Educ. Res. 2022, 9, 191–203. [Google Scholar] [CrossRef]
- Lee, K.O.; Lee, K.S. Effects of Emotional Labor, Anger, and Work Engagement on Work-Life Balance of Mental Health Specialists Working in Mental Health Welfare Centers. Int. J. Environ. Res. Public Health 2023, 20, 2353. [Google Scholar] [CrossRef]
Attribute | Value Obtained |
---|---|
Variables | 150 |
Rows | 28,216 |
Missing cells | 1.5265 × 10⁶ |
Missing cells (%) | 36.1% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Variable types | Categorical: 125; Numerical: 25 |
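The profile in the table above can be reproduced with a short pandas sketch. The DataFrame below is a toy stand-in (the column names echo the survey items, but the values are invented), not the actual survey data:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey data; column names and values are illustrative.
df = pd.DataFrame({
    "P1_2": [1.0, np.nan, 3.0, 4.0],
    "P1_24_B": ["a", "b", None, "c"],
})

n_cells = df.shape[0] * df.shape[1]        # total number of cells
n_missing = int(df.isna().sum().sum())     # "Missing cells" in the profile
pct_missing = 100 * n_missing / n_cells    # "Missing cells (%)"
n_duplicates = int(df.duplicated().sum())  # "Duplicate rows"

print(n_missing, pct_missing, n_duplicates)  # prints: 2 25.0 0
```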
P1_24_B | P1_24_E | P1_24_C | P1_24_F | P1_22_A | P1_2 | P1_22_D | P1_26_E | P1_26_C | P1_9_A_SD_HORA | Satisfied |
---|---|---|---|---|---|---|---|---|---|---|
3.0 | 3.0 | 3.0 | 3.0 | 2.0 | 1.022093 | 3.0 | 1.0 | 1.0 | −0.550070 | 2 |
2.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.869359 | 1.0 | 1.0 | 1.0 | −0.970063 | 1 |
2.0 | 2.0 | 2.0 | 1.0 | 2.0 | 1.869359 | 2.0 | 2.0 | 1.0 | 1.969885 | 1 |
… | … | … | … | … | … | … | … | … | … | … |
3.0 | 3.0 | 3.0 | 3.0 | 2.0 | −0.460624 | 3.0 | 1.0 | 1.0 | 1.129900 | 2 |
3.0 | 3.0 | 2.0 | 3.0 | 2.0 | −0.460624 | 3.0 | 1.0 | 1.0 | 0.289915 | 2 |
3.0 | 3.0 | 2.0 | 2.0 | 2.0 | 0.068918 | 2.0 | 1.0 | 1.0 | −0.550070 | 2 |
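In the sample above, the numeric columns (e.g., P1_2 and P1_9_A_SD_HORA) appear z-score standardized, while the ordinal items keep their original codes. A minimal sketch of that transformation, on invented values:

```python
import numpy as np

# Invented values for a numeric survey variable (e.g., hours per week).
hours = np.array([10.0, 14.0, 8.0, 20.0, 12.0])

# z-score standardization: subtract the mean, divide by the standard deviation,
# leaving a variable with mean 0 and unit standard deviation.
z = (hours - hours.mean()) / hours.std()

print(z.round(3))
```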
| Model | Hyperparameter | Search Space | Description | Default Values | Optimal Values |
|---|---|---|---|---|---|
| Random Forest | n_estimators | [10:100] step 1 | Number of trees | 100 | 85 |
| | criterion | [“gini”, “entropy”] | Split quality criterion | “gini” | “entropy” |
| | max_depth | [2:20] step 1 | Maximum depth | None | None |
| | min_samples_split | [2:10] step 1 | Minimum number of samples to split a node | 2 | 2 |
| | min_samples_leaf | [1:10] step 1 | Minimum samples to form a leaf node | 1 | 1 |
| | max_features | [“auto”, “sqrt”, “log2”, None] | Number of features considered for the best split | “sqrt” | “sqrt” |
| | bootstrap | [True, False] | Whether input samples are bootstrapped | True | True |
| XGBoost | n_estimators | [10, 17, 25, 33, 41, 48, 56, 64, 72, 80] | Number of trees | None | 80 |
| | max_depth | [3, 5, 7] | Maximum depth | None | 3 |
| | learning_rate | [0.01:0.1] step 0.03 | Learning rate | None | 0.01 |
| | subsample | [0.6:0.9] step 0.1 | Proportion of samples used to train each tree | None | 0.8 |
| | colsample_bytree | [0.6:0.9] step 0.1 | Proportion of features per tree | None | 0.8 |
| Gradient Boosting | loss | [“log_loss”] | Loss function | “log_loss” | “log_loss” |
| | learning_rate | [0.001, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2] | Learning rate | 0.1 | 0.025 |
| | min_samples_split | [500:595] step 5 + [601:696] step 5 + [702:797] step 5 + [803:898] step 5 + [904:1000] step 5 | Minimum samples to split a node | 2 | 606 |
| | min_samples_leaf | [20, 28, 37, 46, 55, 64, 73, 82, 91, 100] | Minimum samples in a leaf node | 1 | 100 |
| | max_depth | [2:10] step 1 | Maximum tree depth | 3 | 8 |
| | max_features | [“log2”, “sqrt”] | Number of features considered for the best split | None | “sqrt” |
| | criterion | [“friedman_mse”, “squared_error”] | Split quality criterion | “friedman_mse” | “squared_error” |
| | subsample | [0.5, 0.618, 0.8, 0.85, 0.9, 0.95, 1.0] | Proportion of samples used to train each tree | 1.0 | 0.618 |
| | n_estimators | [100:1000] step 100 | Number of sequential trees | 100 | 200 |
| Decision Trees-CART | max_depth | [10, 20, 30, 40, 50, None] | Maximum tree depth | None | None |
| | criterion | [“gini”, “entropy”] | Split quality criterion | “gini” | “entropy” |
| | min_samples_split | [2, 3, 4, 5, 7, 10, 15] | Minimum samples to split a node | 2 | 2 |
| | min_samples_leaf | [1, 2, 3, 4, 5, 7] | Minimum samples in a leaf node | 1 | 3 |
| | max_features | [“sqrt”, “log2”] | Maximum number of features considered for a split | None | “sqrt” |
| CatBoost | iterations | [100:500] step 100 | Number of iterations (trees) | 1000 | 400 |
| | depth | [3:10] step 1 | Maximum tree depth | 6 | 10 |
| | learning_rate | [0.01, 0.05, 0.1, 0.2] | Learning rate | 0.093 | 0.2 |
| | l2_leaf_reg | [1:9] step 2 | L2 regularization on leaf values | 3.0 | 1 |
| | border_count | [32, 50, 100, 200] | Number of split borders for numerical features | 254 | 32 |
| | bagging_temperature | [0.5, 1, 2, 3] | Intensity of random (Bayesian) sampling | 1.0 | 3 |
| | random_strength | [1, 2, 5, 10] | Intensity of random noise used to break ties between splits | 1.0 | 5 |
| | one_hot_max_size | [2, 10, 20] | Maximum category count for one-hot encoding | 215 | 2 |
| LightGBM | num_leaves | [20:140] step 10 | Maximum number of leaves per tree | 31 | 80 |
| | max_depth | [3, 5, 7, 9, 11, 13] | Maximum depth | −1 | 9 |
| | learning_rate | [0.0001, 0.001, 0.01, 0.1, 1.0] | Learning rate | 0.1 | 0.1 |
| | n_estimators | [100, 300, 500, 700, 900] | Number of trees | 100 | 300 |
| | min_child_samples | [5, 15, 25, 35, 45] | Minimum samples in leaf nodes | 20 | 35 |
| | subsample | [0.6, 0.7, 0.8, 0.9, 1.0] | Proportion of data used to train each tree | 1.0 | 0.7 |
| | colsample_bytree | [0.6, 0.7, 0.8, 0.9, 1.0] | Proportion of features per tree | 1.0 | 0.8 |
| | reg_alpha | [1.0 × 10⁻⁴, 1.78 × 10⁻³, 3.16 × 10⁻², 5.62 × 10⁻¹, 1.0 × 10¹] | L1 regularization | 0.0 | 3.16 × 10⁻² |
| | reg_lambda | [1.0 × 10⁻⁴, 1.78 × 10⁻³, 3.16 × 10⁻², 5.62 × 10⁻¹, 1.0 × 10¹] | L2 regularization | 0.0 | 1.78 × 10⁻³ |
| | min_split_gain | [0.0, 0.25, 0.5, 0.75, 1.0] | Minimum gain required to split a node | 0.0 | 0.0 |
| | scale_pos_weight | [1, 10, 25, 50, 75, 99] | Class weighting for unbalanced classes | 1.0 | 10 |
| Support Vector Machine | C | [0.1, 1, 10, 100, 1000] | Regularization parameter | 1.0 | 0.1 |
| | gamma | [1, 0.1, 0.01, 0.001, 0.0001] | Kernel coefficient | “scale” | 1 |
| | kernel | [“linear”, “rbf”] | Kernel function | “rbf” | “linear” |
| Multilayer Perceptron | hidden_layer_sizes | [50, 100, 150] | Number of neurons in the hidden layer | 100 | 150 |
| | activation | [“tanh”, “relu”] | Activation function | “relu” | “relu” |
| | solver | [“adam”, “sgd”] | Optimization method | “adam” | “adam” |
| | alpha | [0.0001, 0.001, 0.01] | L2 regularization parameter | 0.0001 | 0.0001 |
| | learning_rate | [“constant”, “adaptive”] | Learning rate schedule | “constant” | “constant” |
| | max_iter | Random integer between 100 and 1000 | Number of training iterations | 200 | 848 |
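The search behind the optimal-values column can be sketched using the Random Forest search space from the table. The stdlib-only sketch below shows only the sampling of candidate configurations; in practice each candidate would be scored by cross-validation (e.g., with scikit-learn's RandomizedSearchCV), which is omitted here:

```python
import random

random.seed(42)

# Random Forest search space transcribed from the table above.
rf_space = {
    "n_estimators": list(range(10, 101)),      # [10:100] step 1
    "criterion": ["gini", "entropy"],
    "max_depth": list(range(2, 21)) + [None],  # [2:20] step 1 (None = unlimited)
    "min_samples_split": list(range(2, 11)),   # [2:10] step 1
    "min_samples_leaf": list(range(1, 11)),    # [1:10] step 1
    "max_features": ["sqrt", "log2", None],
    "bootstrap": [True, False],
}

def sample_candidate(space):
    """Draw one hyperparameter configuration uniformly from the grid."""
    return {name: random.choice(values) for name, values in space.items()}

# Random search evaluates a fixed budget of sampled configurations (here, 10);
# the best-scoring one under cross-validation becomes the "optimal" setting.
candidates = [sample_candidate(rf_space) for _ in range(10)]
print(candidates[0])
```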
| Model | Accuracy | Balanced Accuracy | Recall | Precision | F1 Score | Cohen Kappa Coefficient | Jaccard Score |
|---|---|---|---|---|---|---|---|
| CatBoost | 0.824 ± 0.026 | 0.824 ± 0.026 | 0.824 ± 0.026 | 0.823 ± 0.027 | 0.822 ± 0.027 | 0.737 ± 0.039 | 0.714 ± 0.036 |
| CART | 0.762 ± 0.026 | 0.762 ± 0.026 | 0.762 ± 0.026 | 0.756 ± 0.028 | 0.755 ± 0.028 | 0.642 ± 0.039 | 0.622 ± 0.034 |
| Gradient Boosting | 0.677 ± 0.029 | 0.677 ± 0.029 | 0.677 ± 0.029 | 0.677 ± 0.029 | 0.676 ± 0.029 | 0.515 ± 0.043 | 0.516 ± 0.033 |
| LightGBM | 0.814 ± 0.024 | 0.814 ± 0.024 | 0.814 ± 0.024 | 0.811 ± 0.025 | 0.811 ± 0.025 | 0.721 ± 0.036 | 0.698 ± 0.033 |
| MLP classifier | 0.735 ± 0.026 | 0.735 ± 0.026 | 0.735 ± 0.026 | 0.735 ± 0.027 | 0.732 ± 0.026 | 0.603 ± 0.039 | 0.586 ± 0.032 |
| Random Forest | 0.791 ± 0.024 | 0.791 ± 0.024 | 0.791 ± 0.024 | 0.787 ± 0.025 | 0.787 ± 0.025 | 0.687 ± 0.036 | 0.661 ± 0.032 |
| SVM | 0.615 ± 0.032 | 0.615 ± 0.032 | 0.615 ± 0.032 | 0.644 ± 0.031 | 0.619 ± 0.031 | 0.422 ± 0.048 | 0.451 ± 0.033 |
| XGBoost | 0.633 ± 0.032 | 0.633 ± 0.032 | 0.633 ± 0.032 | 0.634 ± 0.032 | 0.631 ± 0.032 | 0.449 ± 0.048 | 0.466 ± 0.034 |
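Two of the less common metrics in the table, balanced accuracy and Cohen's kappa, can be computed from first principles. The labels below are invented; the functions follow the standard definitions (mean per-class recall, and chance-corrected agreement):

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (the 'Balanced Accuracy' column)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: kappa = (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
    t_counts, p_counts = Counter(y_true), Counter(y_pred)
    # expected agreement under independent marginal label distributions
    p_e = sum(t_counts[c] * p_counts.get(c, 0) for c in t_counts) / n ** 2
    return (p_o - p_e) / (1 - p_e)

y_true = [1, 1, 2, 2, 2, 3]  # invented labels (1 = low, 2 = medium, 3 = high)
y_pred = [1, 2, 2, 2, 3, 3]
print(round(balanced_accuracy(y_true, y_pred), 3),
      round(cohen_kappa(y_true, y_pred), 3))  # prints: 0.722 0.478
```

The same values are produced by scikit-learn's `balanced_accuracy_score` and `cohen_kappa_score`, which the study presumably used.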
| Metric | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| Balanced accuracy | Between groups | 4.633 | 7 | 0.662 | 880.466 | 0.000 |
| | Within groups | 0.595 | 792 | 0.001 | | |
| | Total | 5.229 | 799 | | | |
| Sensitivity | Between groups | 4.633 | 7 | 0.662 | 881.058 | 0.000 |
| | Within groups | 0.595 | 792 | 0.001 | | |
| | Total | 5.228 | 799 | | | |
| F1 Score | Between groups | 4.414 | 7 | 0.631 | 804.628 | 0.000 |
| | Within groups | 0.621 | 792 | 0.001 | | |
| | Total | 5.035 | 799 | | | |
| Cohen kappa coefficient | Between groups | 10.425 | 7 | 1.489 | 881.251 | 0.000 |
| | Within groups | 1.338 | 792 | 0.002 | | |
| | Total | 11.763 | 799 | | | |
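The quantities in each ANOVA row (sums of squares, degrees of freedom, mean squares, F) follow directly from the one-way ANOVA decomposition. A self-contained sketch on toy data (three hypothetical models with three scores each, rather than the study's eight models with 100 cross-validation scores):

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F),
    the columns of the ANOVA table above."""
    all_vals = [x for g in groups for x in g]
    grand_mean = sum(all_vals) / len(all_vals)
    # between-groups sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # within-groups sum of squares: spread of scores around their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Invented metric samples for three hypothetical models.
groups = [[0.61, 0.62, 0.60], [0.68, 0.67, 0.69], [0.82, 0.83, 0.81]]
ssb, ssw, dfb, dfw, F = one_way_anova(groups)
print(dfb, dfw, F)
```

A large F with a small p-value (Sig.), as in the table, indicates that at least one model's mean metric differs from the others; `scipy.stats.f_oneway` gives the same statistic together with its p-value.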
**Balanced Accuracy** (HSD Tukey a; subsets for alpha = 0.05)

| Model | N | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Support Vector Machine | 100 | 0.615 | | | | | | |
| XGBoost | 100 | | 0.633 | | | | | |
| Gradient Boosting | 100 | | | 0.677 | | | | |
| MLP Classifier | 100 | | | | 0.735 | | | |
| Decision Trees—CART | 100 | | | | | 0.762 | | |
| Random Forest | 100 | | | | | | 0.791 | |
| LightGBM | 100 | | | | | | | 0.814 |
| CatBoost | 100 | | | | | | | 0.824 |
| Sig. | | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.117 |

**Sensitivity** (HSD Tukey a; subsets for alpha = 0.05)

| Model | N | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Support Vector Machine | 100 | 0.615 | | | | | | |
| XGBoost | 100 | | 0.633 | | | | | |
| Gradient Boosting | 100 | | | 0.677 | | | | |
| MLP Classifier | 100 | | | | 0.735 | | | |
| Decision Trees—CART | 100 | | | | | 0.762 | | |
| Random Forest | 100 | | | | | | 0.791 | |
| LightGBM | 100 | | | | | | | 0.814 |
| CatBoost | 100 | | | | | | | 0.824 |
| Sig. | | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.117 |

**F1 Score** (HSD Tukey a; subsets for alpha = 0.05)

| Model | N | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Support Vector Machine | 100 | 0.619 | | | | | | |
| XGBoost | 100 | | 0.631 | | | | | |
| Gradient Boosting | 100 | | | 0.676 | | | | |
| MLP Classifier | 100 | | | | 0.732 | | | |
| Decision Trees—CART | 100 | | | | | 0.755 | | |
| Random Forest | 100 | | | | | | 0.787 | |
| LightGBM | 100 | | | | | | | 0.811 |
| CatBoost | 100 | | | | | | | 0.822 |
| Sig. | | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.125 |

**Cohen Kappa Coefficient** (HSD Tukey a; subsets for alpha = 0.05)

| Model | N | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Support Vector Machine | 100 | 0.422 | | | | | | |
| XGBoost | 100 | | 0.449 | | | | | |
| Gradient Boosting | 100 | | | 0.515 | | | | |
| MLP Classifier | 100 | | | | 0.603 | | | |
| Decision Trees—CART | 100 | | | | | 0.642 | | |
| Random Forest | 100 | | | | | | 0.687 | |
| LightGBM | 100 | | | | | | | 0.721 |
| CatBoost | 100 | | | | | | | 0.737 |
| Sig. | | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.117 |
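The homogeneous subsets above can be illustrated with a simplified greedy grouping: models whose means differ by less than the Tukey HSD criterion land in the same subset. This is a sketch, not the exact Tukey procedure, and the threshold 0.012 is hypothetical, chosen only so the toy grouping reproduces the table's subsets; it is not the study's computed HSD value:

```python
def homogeneous_subsets(means, hsd):
    """Greedy grouping: walk models in ascending order of mean metric and
    start a new subset whenever the gap to the subset's smallest mean
    reaches the (hypothetical) HSD threshold."""
    ordered = sorted(means.items(), key=lambda kv: kv[1])
    subsets, current = [], [ordered[0]]
    for name, m in ordered[1:]:
        if m - current[0][1] < hsd:
            current.append((name, m))
        else:
            subsets.append(current)
            current = [(name, m)]
    subsets.append(current)
    return [[name for name, _ in s] for s in subsets]

# Mean balanced accuracies from the table; 0.012 is a hypothetical threshold.
means = {"SVM": 0.615, "XGBoost": 0.633, "GBoost": 0.677, "MLP": 0.735,
         "CART": 0.762, "RF": 0.791, "LightGBM": 0.814, "CatBoost": 0.824}
print(homogeneous_subsets(means, 0.012))
```

With this threshold, every model lands in its own subset except LightGBM and CatBoost, which share the last one, matching the table's seven subsets.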
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Holgado-Apaza, L.A.; Ulloa-Gallardo, N.J.; Aragon-Navarrete, R.N.; Riva-Ruiz, R.; Odagawa-Aragon, N.K.; Castellon-Apaza, D.D.; Carpio-Vargas, E.E.; Villasante-Saravia, F.H.; Alvarez-Rozas, T.P.; Quispe-Layme, M. The Exploration of Predictors for Peruvian Teachers’ Life Satisfaction through an Ensemble of Feature Selection Methods and Machine Learning. Sustainability 2024, 16, 7532. https://doi.org/10.3390/su16177532