Article

A Human-Centered Approach to Academic Performance Prediction Using Personality Factors in Educational AI

by
Muhammad Adnan Aslam
1,†,
Fiza Murtaza
1,†,
Muhammad Ehatisham Ul Haq
1,†,
Amanullah Yasin
1,† and
Muhammad Awais Azam
2,*,†
1
Department of Creative Technologies, Faculty of Computing and Artificial Intelligence (FCAI), Air University, Islamabad 44000, Pakistan
2
Technology and Innovation Research Group, School of Information Technology, Whitecliffe, Wellington 6145, New Zealand
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Information 2024, 15(12), 777; https://doi.org/10.3390/info15120777
Submission received: 4 November 2024 / Revised: 24 November 2024 / Accepted: 26 November 2024 / Published: 5 December 2024
(This article belongs to the Special Issue Advances in Human-Centered Artificial Intelligence)

Abstract

As artificial intelligence (AI) becomes increasingly integrated into educational environments, adopting a human-centered approach is essential for enhancing student outcomes. This study investigates the role of personality factors in predicting academic performance, emphasizing the need for explainable and ethical AI systems. Utilizing the SAPEx-D (Student Academic Performance Exploration) dataset from Air University, Islamabad, which comprises 494 records, we explore how individual personality traits can impact academic success. We employed advanced regression models, including Gradient Boosting Regressor, K-Nearest Neighbors Regressor, Linear Regression, and Support Vector Regression, to predict students’ Cumulative Grade Point Average (CGPA). Our findings reveal that the Gradient Boosting Regressor achieved an R-squared value of 0.63 with the lowest Mean Squared Error (MSE); incorporating personality factors elevated the R-squared to 0.83, significantly improving predictive accuracy. For letter grade classification, the incorporation of personality factors improved the accuracy for distinct classes to 0.67 and to 0.85 for broader class categories. The integration of the Shapley Additive Explanations (SHAPs) technique further allowed for the interpretation of how personality traits interact with other factors, underscoring their role in shaping academic outcomes. This research highlights the importance of designing AI systems that are not only accurate but also interpretable and aligned with human values, thereby fostering a more equitable educational landscape. Future work will expand on these findings by exploring the interaction effects of personality traits and applying more sophisticated machine learning techniques.

Graphical Abstract

1. Introduction

In the modern competitive academic environment, universities are no different from business organizations undergoing constant change. Globalization has opened up enormous opportunities for students to pursue the high-quality education offered by institutions across the world. Consequently, these universities find themselves in stiff competition to attract and retain students capable of navigating such processes successfully [1]. University management must make decisive and timely decisions related to Student Performance Analysis (SPA) to improve the retention and performance of students. Predicting student performance is crucial for educators to offer early feedback and implement timely interventions to support students’ academic development [2]. Contemporary universities generate large amounts of data associated with students, educational processes, and administrative aspects. This rich information is rarely put to good use, with simple queries and customary reports failing to reach decision-makers on time. As a result, a large portion of student data goes to waste because of its size and complexity. Data-driven methods such as analytics and predictive algorithms are needed to turn these raw data into meaningful insights that support decision-making. Some of the structural barriers to students’ learning can be overcome by establishing the root causes of their problems, which may relate to extracurricular activities or family and health issues, among others. Many students have been reported to perform poorly in school due to challenges unrelated to their academic potential, as shown in [3]. According to [4], many students report stress, anxiety, and an inability to balance studies with family obligations, social life, and personal responsibilities as the reasons for their poor grades. Students increasingly seek individual support and guidance but find that the usual approaches fail to meet their exact needs.
The uniqueness of students, bringing with them all sorts of diverse perspectives and backgrounds, calls for a personalization of education and support. This means that, in this respect, personalized interventions are very important since the early identification of struggling students helps them to receive effective support that will improve their academic outcomes.
These findings highlight a critical gap: while academic data alone provide limited insights, integrating broader factors such as family background, personal well-being, and personality traits can enable more comprehensive student assessments and improve the support provided.
Data mining has emerged as a powerful tool to address this gap, enabling institutions to derive valuable insights from complex datasets. Data mining may be defined as the process of exploring meaningful patterns in large databases [5]. Therefore, it serves an important role in addressing the problems encountered by universities and students. As institutions of higher learning strive towards sustaining their competitive advantage and providing better support to increase student retention, data mining provides a means to understand and predict the behaviors of students more effectively. It gives very important insights into making sense out of large volumes of data to unravel hidden patterns and trends that traditional methods might miss. Data mining can identify students’ academic struggles early for timely interventions that support their educational journey. Elucidating the individual characteristics and needs of the students, it lets the university offer personalized guidance and resources for improved overall student success and satisfaction. Besides this, data mining optimizes recruitment strategies, helps improve operational efficiency, and supports alumni management to enable universities to maintain strong connections with graduates and leverage those relationships for future growth. However, existing research lacks a comprehensive framework that integrates personal, family, academic, and personality factors to provide a holistic view of student performance.
To address the gaps identified in prior studies and advance our understanding of student performance prediction, this study is guided by the following research questions:
  • How do personality traits influence academic performance?
  • What is the impact of incorporating personality traits into predictive models for student performance?
Based on these questions, the study hypothesizes the following:
H1. 
Incorporating personality traits into predictive models significantly improves their accuracy in predicting academic performance.
H2. 
The influence of personality traits on predictive outcomes varies across distinct academic performance categories, such as letter grade classifications and CGPA.
In response to these gaps, this study aims to leverage data mining techniques for a more integrated approach to student performance prediction. Our primary research objectives are as follows:
  • Integration of Personality Traits with Traditional Factors: Unlike prior studies that predominantly focus on academic and demographic data, we incorporate personality traits (based on the Big Five model) alongside personal, family, and academic factors. This integration allows for a more holistic analysis of the determinants of student performance, enabling personalized and actionable recommendations.
  • Development of a Robust Predictive Framework: Our proposed framework uniquely combines regression and classification tasks, achieving an enhanced predictive accuracy for both continuous (CGPA) and categorical (letter grades) performance metrics. This dual capability fills a gap in prior studies that typically focus on a single predictive objective.
  • Comprehensive Comparative Analysis of Predictive Models: We evaluate and compare a range of machine learning models tailored to the SAPEx-D dataset, including traditional and ensemble approaches. This analysis identifies the most effective methods for student performance prediction, offering valuable guidance for researchers and practitioners in selecting appropriate techniques.
  • Utilization of Explainable AI (XAI) for Causal Analysis: By applying SHAPs (Shapley Additive Explanations) to our predictive models, we provide interpretable insights into the causal relationships among factors influencing student performance. This interpretability not only enhances trust in the predictions but also informs educators about the most impactful areas for intervention, advancing the practical application of machine learning in education.
  • Advancing Tailored Educational Strategies: By integrating advanced predictive techniques and explainable AI, this study provides actionable insights to help educators and policymakers design targeted interventions, particularly for underperforming students. These contributions address limitations in prior work, which often lacked interpretability, holistic analysis, or actionable outcomes.
Proper assessment and prediction of educational outcomes require a deep understanding of the many factors (personal and family background, academic, and personality) that interact to affect a student’s academic performance [6]. We therefore delve deeper into how these factors interact to shape students’ academic success. Such insight will help in the formulation of targeted interventions and policies. With this view, the study seeks to make several leading contributions to the field of educational psychology by harnessing advanced predictive modeling tools and the information available in the SAPEx-D (Student Academic Performance Exploration) dataset. Ultimately, it aims to promote healthy academic development and explore potential ways to achieve better educational outcomes.
The motivation behind the research comes from the vast potential of data mining methods in effectively exploiting university data for the analysis of student performance. Based on discussions with numerous top-level managers and administrators at Air University, Pakistan, the consensus is that there is a dire need to gain deeper insights into student performances and to come up with more effective strategies for supporting students and improving the institution. These discussions motivated this research, which looks at the application of data mining to address these needs.
The remainder of the paper is organized as follows. Section 2 provides a comprehensive overview of previous research in the field. Section 3 outlines the dataset (SAPEx-D) in detail; it also gives a detailed account of the data pre-processing strategies used and discusses the machine learning techniques adopted to predict student performance. Section 4 presents the results derived from the machine learning algorithms applied to the proposed SAPEx-D dataset and compares it with previous datasets. Finally, Section 5 concludes the paper with a summary of the outcomes and a discussion of potential future research directions.

2. Related Work

In this section, we review previous studies on student performance prediction models using data mining. This area has been extensively studied: researchers have explored the development and application of data mining techniques at different levels of education [7]. A number of studies have compared the different methods adopted, indicating a growing interest and continuous investigation in this area.

2.1. Evolution of Algorithms for Student Performance Prediction

Ren et al. [8] proposed a Learning Ability Self-Adaptive Algorithm (LASA) for long-term student performance prediction. LASA adapts to changing data distributions and provides improved accuracy over short-term models. The results indicated that LASA performed better than other models, with a 7.9% increase in prediction accuracy overall, outperforming the ProbSAP and SFERNN models by 6.8% and 6.4%, respectively. It optimizes the algorithm’s parameters separately, which may be suboptimal, but it remains a robust tool for predicting student performance in a dynamic academic environment. Fazil et al. [9] proposed a new deep learning model (ASIST) that predicts student performance based on data from academic registries, VLE clickstreams, and assessments. ASIST achieved AUC scores ranging between 0.86 and 0.90 on three datasets, outperforming baseline and traditional classification models. The attention-aware convolutional stacked BiLSTM was robust overall but performed worst on the 2020 and 2021 datasets. Anisa et al. [10] created a combined SVM-RBF method that increased student performance prediction accuracy to 88%. The approach used student assessment data, and model optimization was carried out with GridSearchCV. These fairly accurate predictions showed how well the combination of an SVM with an RBF kernel works for predictive modeling.

2.2. Hybrid and Ensemble Methods for Student Performance Analysis

Sawalkar et al. [11] applied fuzzy relational calculus and Dempster–Shafer theory to rank the factors that influence student academic performance. This supports predictive analysis by clarifying the degree to which different factors influence academic outcomes. The research focused on teachers’ beliefs in the ranking process, and the application of fuzzy relational calculus and Dempster–Shafer theory established the viability of these methods for factor ranking. Sirait et al. [12] utilized SVM-SMOTE and Deep Feature Synthesis combined with the Random Forest algorithm for student performance prediction. Their results showed an accuracy of 69%, a precision of 47%, a recall of 52%, and an F1-score of 47%. The problems addressed in this research concerned class imbalance in the dataset and a skewed data distribution; SVM-SMOTE was used to overcome the imbalance. The application of the Random Forest algorithm together with Deep Feature Synthesis markedly increased the model’s predictive ability. Manzali et al. [13] proposed a hybrid algorithm combining Random Forests with Naive Bayes for the efficient prediction of students’ performance. This hybrid approach outperformed seven other machine learning algorithms and achieved strong performance with a very small number of branches, but the ensemble’s complexity brought additional computational cost and reduced the interpretability of the results.

2.3. Context-Specific Models and Pedagogical Applications for Student Performance

Khan et al. [14] tailored the choice of their machine learning models to specific pedagogical objectives, including the identification of struggling or excellent students. The importance of tailored model selection was underlined for avoiding the misclassification of the minority class, which is key for struggling students, and for amplifying the correct classification of the majority class, which is beneficial for excellent students. Their study indicated that accuracy alone cannot be used to assess models, and a new framework was introduced for model selection. Abuchar et al. [15] developed a risk-based student performance prediction model for engineering courses using fragility curves based on student metadata such as GPA. The results showed that the risk of failure in Solid Mechanics was higher than in Statics, highlighting that a strong foundation in prerequisite courses is essential. However, fragility curves may not capture all the factors that affect student performance.

2.4. Data-Driven Insights and Implications from Traditional and Baseline Models for Student Performance

Mustapha et al. [16] proposed a model that makes use of RapidMiner, decision trees, and data analysis to predict students’ academic performance. The accuracy attained was 80%, with a precision of 89.47%. In the study, students with fewer than seven absences performed better than those with more than seven. The project used the CRISP-DM methodology for data mining and analytics, and the algorithms were applied using Tableau and RapidMiner. The model was very accurate and highlighted attendance as a major influencer of students’ success. Ahmed et al. [17] adopted machine learning algorithms such as SVM, decision tree, Naïve Bayes, and KNN to predict students’ performance. The results revealed that the performance of all algorithms improved after parameter tuning, particularly the SVM algorithm, which had the best accuracy rate at 96%, while the Naïve Bayes model performed worst, since it assumes strong independence between features.
Hongli et al. [18] put forward an intelligent model that used ASHO for feature selection and the XGBoost algorithm for predicting student performance. This model outperformed the traditional methods in terms of precision, accuracy, recall, and F1-score, showing that combining advanced feature selection techniques with robust predictive algorithms is very effective. Li et al. [19] employed logistic regression for student academic performance prediction, which proved quite accurate at 95.8%. Its high precision, recall, and F1-score, at 96.7%, 95.1%, and 95.8%, respectively, made it very effective in enabling schools to quickly recognize students lagging behind in performance and offer timely support. Hairy et al. [20] applied the Random Forest and decision tree machine learning algorithms, which correctly classified the exam performances of students in 253 instances, with only three misclassifications during testing. The study was, however, limited by dataset constraints due to time, security, and privacy concerns. In [21], Duan et al. created student performance prediction models using blended learning data combined with SMOTE and Bayesian optimization. The models achieved 86%, 84%, 85%, and 86% accuracy, while the Root Mean Square Error varied from 0.17 to 0.35. This method focused on incorporating SMOTE oversampling and Bayesian optimization into the models. In [22], Rolly et al. created a predictive model for the academic performance of freshman students, in which high school GWA, strand, admission general ability, course, and sex turned out to be highly significant. The study obtained an accuracy of 67.30% using Multiple Linear Regression. The results recommend that these variables be considered during admission and propose a model in which more attention could be given to students who require further assistance. Ardiyansyah et al. [23] investigated the factors that impact students’ academic performance at UiTM, covering three components: teaching methods, family and peer influences, and financial aspects. In a survey of 497 students, the results indicated that family and peer influences were most important, with r = 0.584 and p < 0.05. This study therefore implies that external factors concerning family and peers, as well as teaching approaches, are very important in enhancing students’ performance. Ni et al. [24] proposed a method that integrates SGNNs with LLM embeddings for student performance prediction using learner-sourced questions. This improved prediction accuracy and provided increased robustness against noise and cold-start conditions. The method outperformed the baselines in predictive accuracy and robustness by using a signed bipartite graph and a contrastive learning framework for noise resilience. Shou et al. [25] suggested a model for multidimensional time series data analysis in the prediction of student performance in an online learning environment. The model achieved an accuracy and F1-score of 74% and 73%, respectively, on a four-category prediction task and an even higher accuracy of 99.08% on the early risk prediction task. It emphasized that difficulties with effective communication and feedback in online learning increase the likelihood of failure or dropout. Wang et al. [26] used the graph regularization non-negative matrix factorization (GNMF) algorithm for student performance prediction in programming courses. Compared to existing algorithms, GNMF included supplementary student information, such as learning background, to decrease prediction errors and increase prediction accuracy. This research is helpful for personalizing teaching in higher education.
Oppong [27] reviewed the literature on the application of machine learning algorithms for predicting students’ performance, in which neural networks were the most used classifiers and provided better accuracy. Overall, the study indicated that 87% of the algorithms used were supervised learning; unsupervised learning algorithms and non-machine learning methods attracted minimal attention for prediction purposes. Alamgir et al. [28] investigated how the inclusion of historical academic data from 15 years ago can complement student performance prediction models under a relative grading scheme. Historical data contributed substantially to predictions of future performance, particularly in advanced courses. Nevertheless, it remained relatively difficult to predict students’ grades in advanced courses compared to prerequisite courses because of the relative grading scheme and because behavioral factors are hard to model. Resmi et al. [29] applied statistical and machine learning methods to identify examination performance differences by gender. In this study, the Support Vector Machine was found to be the best model for predicting students’ performance. The Brown–Forsythe test assessed the homogeneity of variance, and the explanatory capacity of the Linear Regression models was evaluated.
While the reviewed literature presents a clear picture of student performance prediction, significant gaps remain. Many studies focus purely on specific methodologies and do not fully integrate contextual factors or the interactions among different predictors. Most existing models also lack adaptability to changing educational landscapes.
Addressing these gaps, this research presents a comprehensive model that takes into account academic factors as well as personal, family, and personality factors. This multi-faceted approach improves the accuracy and robustness of student performance predictions, contributing to more personalized educational interventions.

3. Proposed Methodology

We propose a machine learning-based methodology for student performance prediction using the SAPEx-D dataset, as shown in Figure 1. After data preprocessing and feature selection, the dataset is split into training and test sets. The approach applies regression for CGPA prediction and classification for letter grade prediction with eight distinct classes and three broader categories. We employed regression and classification models due to the nature of the target variables. Regression was chosen to predict CGPA, a continuous numerical value that requires models capable of capturing its variability. Classification, on the other hand, was used to predict letter grades, which are categorical and require algorithms that can assign discrete class labels effectively. This dual approach aligns with the characteristics of the dataset and ensures a comprehensive analysis of student performance outcomes. To address the complexity of the dataset and the prediction tasks, we selected models according to their theoretical suitability and alignment with the characteristics of the dataset.
Gradient Boosting (GB) was selected because it can model the non-linear relationships among the different features. Its iterative boosting mechanism provides high predictive accuracy, and its feature importance analysis offers useful insights into the relative contributions of different factors, such as academic, personal, family, and personality-related traits. This makes GB particularly effective for datasets such as SAPEx-D, where complex, non-linear interactions are prevalent. K-Nearest Neighbors (KNN) was included for its capacity to capture local data patterns without assuming a specific parametric form of the data distribution. Although it is sensitive to dataset size and incurs computational overhead on larger datasets, KNN’s ability to identify localized trends makes it an ideal complement to tree-based models. Additionally, Support Vector Regression (SVR) was chosen because it handles high-dimensional feature spaces effectively and captures complex relationships through the kernel trick. This attribute is especially useful because SAPEx-D is multi-faceted, encompassing various data types such as personality traits and academic records.
The choice of models was influenced by the features of the SAPEx-D dataset, which consists of heterogeneous categorical and continuous variables. Models such as Gradient Boosting and KNN are effective at handling this heterogeneity, whereas SVR offers a substantial advantage in high-dimensional settings. Additionally, the non-linear relationships and feature interactions within the dataset require models able to capture these complexities. Although the dataset is of moderate size, its multifaceted nature demands models such as Gradient Boosting and SVR, which balance computational efficiency with strong predictive power. By carefully aligning model capabilities with the attributes of the dataset, this methodology ensures accurate and interpretable predictions for both CGPA and letter grades.
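As a sketch of this model line-up, the four regressor families discussed above can be compared side by side with scikit-learn. The synthetic data, feature count, and hyperparameters below are illustrative assumptions standing in for SAPEx-D, not the study’s actual setup:

```python
# Illustrative sketch (not the authors' code): comparing the four regressor
# families named in the text on synthetic data standing in for SAPEx-D.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(42)
# Synthetic stand-in: 494 students, 10 mixed features, CGPA-like target in [0, 4]
X = rng.random((494, 10))
y = np.clip(2.75 + X @ rng.normal(0, 0.5, 10) - 1.0 + rng.normal(0, 0.2, 494), 0, 4)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "GradientBoosting": GradientBoostingRegressor(random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "Linear": LinearRegression(),
    "SVR": SVR(kernel="rbf"),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # R^2 and MSE on the held-out split, as reported in the paper
    results[name] = (r2_score(y_test, pred), mean_squared_error(y_test, pred))

for name, (r2, mse) in results.items():
    print(f"{name}: R^2={r2:.2f}, MSE={mse:.3f}")
```

Fixing `random_state` keeps the split and ensemble reproducible, so the model ranking can be compared across runs.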

3.1. Dataset

In this paper, we utilized the SAPEx-D (Student Academic Performance Exploration) dataset that was collected at Air University in Islamabad, Pakistan, from September 2022 to April 2024, for the prediction of student performance based on a variety of factors. One of the key characteristics of this dataset is that it includes 494 instances, which consist of personal and family information as well as academic and personality factors. Our work utilizes these features to predict student performance with and without personality factors. This research was conducted as a collaboration between the departments of Academics and Psychology, with invaluable inputs from Air University.
Table 1 summarizes the broader SAPEx-D dataset factors in detail, consisting of category, attribute names, types, and values of the variables for each factor:
Table 1 provides a clear overview of the different categories, attribute names, types, and the possible values that were considered in this research. By organizing the information in an effective manner, we aimed to facilitate a thorough and structured analysis of the factors influencing the performance of the students.
Personal Factors: These are characteristics reflecting various aspects of the students’ individual conditions and their daily routines.
Family Factors: These features describe family background and the support network that a student has.
Academic Factors: These measures relate to the environment and behaviors of the students with respect to their academic performance.
Personality Factors: These features describe the personality of students based on the Big Five personality traits model [30] and provide a psychological dimension to the assessment of the students in terms of their behaviors and attitude.
  • Openness: A student’s level of creativity and curiosity. Students high in Openness are often more imaginative and willing to explore new ideas.
  • Conscientiousness: This entails a student’s level of organization, dependability, and discipline. The highly conscientious student tends to be more responsible and goal-oriented.
  • Extraversion: Measures a student’s sociability and assertiveness. Extraverted students are generally more outgoing and energetic.
  • Agreeableness: Reflects a student’s tendency towards compassion and cooperation. A high Agreeableness will, therefore, mean more sensitive and cooperative students.
  • Neuroticism: This refers to the emotional stability of a student and a person’s tendency to experience negative emotions. High Neuroticism would thus relate to high stress and anxiety levels.
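To make the trait representation concrete, one hypothetical way to hold a student’s Big Five scores in code is shown below. The field names, score range, and values are illustrative assumptions, not the actual SAPEx-D schema:

```python
# Hypothetical sketch of a Big Five trait record; not the dataset's real schema.
from dataclasses import dataclass

@dataclass
class BigFiveScores:
    openness: float           # creativity and curiosity
    conscientiousness: float  # organization, dependability, discipline
    extraversion: float       # sociability and assertiveness
    agreeableness: float      # compassion and cooperation
    neuroticism: float        # tendency toward negative emotions / stress

# Assumed scores normalized to [0, 1] for illustration
student_traits = BigFiveScores(0.72, 0.65, 0.40, 0.81, 0.33)
print(student_traits.conscientiousness)
```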
In this paper, we examine how personal, family, and academic factors, along with personality traits, contribute to predicting a student’s academic performance. In this regard, we aim to explore how these Big Five personality traits interact with other factors to influence students’ academic outcomes. This holistic approach allows us to recognize those students who would benefit most from special interventions or support systems. By considering all the aspects that affect achievement, our ultimate goal is to increase student success.

3.2. Data Preprocessing

3.2.1. Removal of Duplication and Missing Values

The first step in preprocessing was to check for duplicate records to ensure the dataset’s integrity [31,32]. There were no duplicate entries in the SAPEx-D dataset; we verified this using the unique identifier and a thorough check of the other features. The second potential problem was missing values. Upon inspection, the dataset turned out to be well maintained, with no missing data in any feature. This detailed inspection ensured the dataset was complete and ready for further analysis and model training.

3.2.2. Data Encoding

SAPEx-D consisted of both ordinal and nominal attributes [33], and these had already been appropriately encoded for analysis. Ordinal attributes [34] have a meaningful order or ranking among their categories, while nominal attributes [35] have no natural order or ranking; their categories are simply different from each other, without any inherent hierarchy [36]. For instance, age ranges are encoded as ‘1’ for 18–21 and ‘2’ for 22–25, and scholarship types are encoded from 0 for “None” up to 4 for “Full”, maintaining their ordinal nature. Nominal attributes, which lack an inherent order [37], have been one-hot encoded. For example, the transportation means attribute has been converted into separate binary columns, with categories like “Bus”, “Private car/taxi”, “Bicycle”, and “Other” each represented by their own column.
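This encoding scheme can be sketched with pandas as follows. The column names and the age/transport categories mirror the examples in the text, while the intermediate scholarship levels are assumed for illustration:

```python
# Illustrative sketch of the ordinal and one-hot encoding described above;
# column names and intermediate scholarship levels are assumptions.
import pandas as pd

df = pd.DataFrame({
    "age_range": ["18-21", "22-25", "18-21"],
    "scholarship": ["None", "Half", "Full"],
    "transport": ["Bus", "Private car/taxi", "Bicycle"],
})

# Ordinal attributes: map categories to integers that preserve their order
df["age_range"] = df["age_range"].map({"18-21": 1, "22-25": 2})
df["scholarship"] = df["scholarship"].map(
    {"None": 0, "Quarter": 1, "Half": 2, "ThreeQuarter": 3, "Full": 4}
)

# Nominal attributes: one-hot encode into separate binary columns
df = pd.get_dummies(df, columns=["transport"], prefix="transport")
print(df.columns.tolist())
```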

3.2.3. Feature Scaling

The features in SAPEx-D vary in scale. Features like “Age”, “Total Salary if Available”, and “Exact CGPA” were measured in different units and ranges, which could result in biased models. We therefore carried out feature scaling to ensure that all features shared a common scale and were weighted equally within the analysis. Scaling changes the range and distribution of the data to put all features on a common scale; we chose normalization as the primary method.
Normalization scales [38] the data to a range of [0, 1], which is particularly useful for algorithms that are sensitive to the magnitude of feature values, like gradient descent-based methods. Each of the rescaled features would contribute proportionately to the analysis without the presence of any dominating features because of their scale.
X_normalized = (X − Min(X)) / (Max(X) − Min(X))
where
  • X represents the original value of the variable;
  • X_normalized represents the normalized value of the variable;
  • Min(X) represents the minimum value of the variable across all instances;
  • Max(X) represents the maximum value of the variable across all instances.
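A minimal NumPy sketch of this min-max normalization, using made-up age values:

```python
import numpy as np

def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Rescale a feature to the [0, 1] range per the formula above."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

# Example: ages span a very different range than CGPA values do.
ages = np.array([18.0, 21.0, 25.0, 30.0])
print(min_max_normalize(ages))
```

After scaling, the smallest value maps to 0 and the largest to 1, so no feature dominates the analysis because of its units.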

3.2.4. Output (CGPA) Distribution

We measured student academic performance using the CGPA and letter grades obtained from the SAPEx-D, which has continuous values up to 4.00 for CGPA and letter grades A, A−, B+, B, B−, C+, C, and C−, following Air University’s grading policy.
Figure 2 shows the distribution of the CGPA for the students in our dataset. The histogram in front shows us the frequency of different CGPAs, and on top of that is the Kernel Density Estimate (KDE), which gives a clearer view of the pattern of the overall distribution. The minimum CGPA is 0.0, while the maximum is 4.0. However, most of the students’ scores are between 2.0 and 3.5. This can be seen in the peak near the 3.0 mark, which indicates a high percentage of students obtaining grades near the mean CGPA of 2.75.
This peak shows some symmetry on either side, indicating that the distribution is relatively normal, even though it is slightly skewed towards higher values. This observation is further supported by the KDE curve, which emphasizes the central tendency of the results. The standard deviation of about 0.5 indicates a moderate spread in the students’ academic performance. This means that, although most of the students are closer to the average CGPA, some of them scored further away from it and attained very high or even very low averages.
CGPA is a crucial feature for predicting academic performance because it provides a comprehensive measure of a student’s overall academic achievement. By reflecting cumulative grades across various courses and semesters, CGPA offers a consistent and quantifiable indicator of a student’s academic capabilities and progress. This metric allows us to identify patterns and correlations between different influencing factors—such as academic, family, personality, and personal factors—and their impact on student performance. Understanding these relationships helps us to better predict and support student success, making CGPA an indispensable component of our analysis.
SAPEx-D includes a range of features related to academic, personal, and family backgrounds, such as study habits, attendance, and parental education levels. We aimed to predict student letter grades across eight distinct levels (A, A−, B+, B, B−, C+, C, and C−), following Air University’s grading policy. To gain deeper insights, we further categorized these grades into three broader classes: high (A, A−), average (B+, B, and B−), and low (C+, C, and C−). This categorization helps in understanding overall student performance and aligns with grading criteria common to many institutions. By doing this, we can better analyze and compare the performance of students across different academic environments. The results from both the detailed and broader categorizations provide valuable insights into predicting student success and identifying key factors that influence academic outcomes. In this regard, we considered all features that can predict student performance, including personal, family, and academic factors, but excluded personality traits from this baseline analysis. By focusing on these features, we aimed to construct a predictive model that accurately reflects the factors influencing student performance. Furthermore, a careful selection process ensured that our model was both efficient and effective, utilizing the most relevant data to predict student letter grades.
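The grouping of the eight letter grades into the three broader classes can be expressed as a simple lookup; the dictionary below mirrors the categorization described above:

```python
# Mapping from the eight letter grades to the three broader classes
# (high: A, A-; average: B+, B, B-; low: C+, C, C-).
GRADE_TO_CLASS = {
    "A": "high", "A-": "high",
    "B+": "average", "B": "average", "B-": "average",
    "C+": "low", "C": "low", "C-": "low",
}

def broaden(grades):
    """Collapse a list of letter grades into the three broad categories."""
    return [GRADE_TO_CLASS[g] for g in grades]

print(broaden(["A", "B-", "C+", "A-", "B"]))
```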

3.2.5. Evaluation Measures

In this study, we used the following metrics to evaluate the performance of regression models for CGPA, which express the accuracy and reliability of the predictions:
Mean Squared Error (MSE): Mean Squared Error [39] is simply the average of the squares of the differences between the predicted and observed values. It quantifies the accuracy of the model in making its predictions, where larger errors are more heavily penalized.
MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)²
R-squared (R2): R-squared [40] is a measure of the amount of variance explained by the regression model. It provides an indication of the goodness of fit. The R2 value varies from 0 to 1, with higher values indicating a better fit.
R² = 1 − [Σ_{i=1}^{n} (y_i − ŷ_i)²] / [Σ_{i=1}^{n} (y_i − ȳ)²]
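Both regression metrics can be computed directly from their definitions; here is a small NumPy sketch with toy CGPA values (not taken from the dataset):

```python
import numpy as np

def mse(y_true, y_pred):
    # Average of the squared differences between observed and predicted values.
    return np.mean((y_true - y_pred) ** 2)

def r_squared(y_true, y_pred):
    # One minus the ratio of residual to total sum of squares.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([3.0, 2.5, 3.5, 2.0])  # toy observed CGPAs
y_pred = np.array([2.8, 2.6, 3.4, 2.2])  # toy predicted CGPAs
print(mse(y_true, y_pred), r_squared(y_true, y_pred))
```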
For the letter grade classification experiments, we employed four distinct evaluation metrics [41] to assess the models’ performance: accuracy, precision, recall, and F1-score. Among these measures [42], accuracy has been widely utilized in numerous previous research studies. The F1-score [43] offers a comprehensive assessment of classifier performance within individual classes; it combines [44] precision and recall, proving particularly valuable when class distributions are imbalanced. Throughout our study [45], predictions were categorized into four groups: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). In summary, the evaluation measures employed in our investigation encompass the following:
Accuracy: This is the proportion of correctly predicted instances against the total number of instances in the SAPEx-D.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision: This is the ratio of true positive predictions against the sum of true positive and false positive predictions.
Precision = TP / (TP + FP)
Recall: This is the ratio of true positive predictions to the sum of true positive and false negative predictions.
Recall = TP / (TP + FN)
F1 Score: This is the harmonic mean of precision and recall; it provides a balance between the two metrics.
F1 Score = (2 × Precision × Recall) / (Precision + Recall)
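All four measures follow directly from the confusion-matrix counts; a small sketch with made-up counts:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts only, not results from the study.
acc, prec, rec, f1 = classification_metrics(tp=40, tn=35, fp=10, fn=15)
print(acc, prec, rec, f1)
```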

4. Experimental Results

This section presents the results of our experimental analyses, structured into three distinct subsections. In Section 4.1, we focus on regression-based predictions of CGPA, examining models that incorporate and exclude personality factors. This analysis allows us to evaluate the influence of these factors on predictive accuracy. Section 4.2 delves into the classification of letter grades, where we compare the model’s performance across two methodologies: one that categorizes students into eight distinct classes and another that consolidates them into three broader categories. Both approaches are analyzed with and without the consideration of personality factors to assess their impact on classification outcomes. Finally, Section 4.3 employs SHAPs (Shapley Additive Explanations) to explore the importance of various features, providing insights into the causal relationships among predictors and identifying the most significant factors influencing student performance predictions. This structured approach facilitates a comprehensive evaluation of model effectiveness and the role of diverse predictors in enhancing predictive accuracy.

4.1. Analysis of Student Performance Based on CGPA Using Machine Learning Techniques

To study the effectiveness of the proposed methodology, we evaluated various regression models for predicting students' academic performance on the SAPEx-D dataset. For each model, we used the same 80–20 split ratio for the training and testing sets [31] so that the results are directly comparable. The models were evaluated using Mean Squared Error and R-squared.
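This evaluation protocol can be sketched with scikit-learn as follows. The feature matrix and target below are synthetic stand-ins for the preprocessed SAPEx-D data, so the scores will not match the reported results:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic stand-in for 494 preprocessed records with a CGPA-like target.
rng = np.random.default_rng(0)
X = rng.random((494, 10))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(0.0, 0.1, 494)

# Same 80-20 split for every model so the scores are comparable.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "KNN": KNeighborsRegressor(),
    "Linear Regression": LinearRegression(),
    "SVR": SVR(),
}
results = {}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    results[name] = (mean_squared_error(y_te, pred), r2_score(y_te, pred))
    print(f"{name}: MSE={results[name][0]:.4f}, R2={results[name][1]:.4f}")
```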

4.1.1. CGPA Prediction Excluding Personality Factors

Initially, as shown in Table 2, we used only academic factors (e.g., attendance and grades), personal factors (e.g., age and gender), and family factors (e.g., parental education and household income) to predict CGPA. This provided a baseline for our predictions without considering personality traits.
The Gradient Boosting Regressor returned the lowest Mean Squared Error of 0.1162, with a quite high R-squared of 0.6905. This indicates that the model has grasped the relationship between the academic, family, and personal input features and CGPA, making it the most accurate among all the tested models. A smaller MSE indicates that the predictions are relatively close to the actual values, while the higher R2 indicates that the model explains a large portion of the CGPA variance. Figure 3 illustrates the residual distribution for the Gradient Boosting Regressor, showing a relatively normal distribution with most residuals clustered around zero, indicating good model performance. The KNNs model has a slightly higher Mean Squared Error than the Gradient Boosting Regressor, approximately 0.1465, and a lower R-squared of approximately 0.6095. Even though the KNNs model performed relatively well, it proved less effective at capturing the complexities of the data. This may be because of the very basic nature of KNNs, which relies on the proximity of data points and generalizes much worse for complex relationships. The residual distribution for the KNNs Regressor highlights a broader spread of residuals and less clustering around zero compared to the Gradient Boosting model.
The result from the Linear Regression model yielded an MSE of 0.2111 and a relatively medium R-squared value of 0.4376. This might be interpreted to mean that although the linear model has the advantage of simplicity and interpretability, it does a very poor job of capturing non-linear relationships embedded in this dataset. The higher error shows that the predictions are less accurate; the lower R2 indicates that less variance in CGPA is explained by the model. The residual distribution for the Linear Regression model, demonstrating a wider spread and a less pronounced peak around zero, indicates higher prediction errors.
Support Vector Regression gave the highest Mean Squared Error of 0.3037 and the lowest R-squared, which was 0.1907. This shows that SVR had the least power in predicting CGPA, possibly because of its high sensitivity to parameter settings and the complexity of the data. The high MSE thus reflects larger prediction errors, and the low R2 suggests that only a small portion of the variance in CGPA is explained by the model. Figure 3 illustrates the residual distribution for the SVR model, showing the widest spread of residuals and the least clustering around zero, confirming the model’s lower performance.
Subsequently, we included personality traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) along with academic, personal, and family factors. This allowed us to evaluate whether incorporating personality traits enhances the predictive power of the models.

4.1.2. CGPA Prediction Including Personality Factors

As shown in Table 3, we repeated the experiment with the Big Five personality traits included alongside the academic factors (e.g., attendance, grades), personal factors (e.g., age, gender), and family factors (e.g., parental education, household income) used to predict CGPA.
The results from the experiment show a huge influence of personality factors (Big Five traits) in the prediction of students’ academic performance in the form of CGPA. In a comparison between the models’ performances, the Gradient Boosting Regressor performed very well, returning an average Mean Squared Error of 0.0618 and an R-squared value of 0.8352. This gives an indication of the high degree of accuracy of the model’s fit with the data, thus implying that personality is important in determining the performance of a student. As Figure 4 shows, the Gradient Boosting Regressor had the most balanced and normal distribution, indicating well-fitted predictions. The Nearest Neighbors Regressor had a medium effectiveness with an MSE = 0.1434 and an R2 = 0.6179. This explained that personality traits are very important in determining the performance of a student along with other factors. Figure 4 illustrates the K-Nearest Neighbors Regressor’s residuals, revealing more variance compared to the Gradient Boosting.
Linear Regression produced a reasonable fit, yielding an MSE of 0.1907 with an R2 of 0.4919. Support Vector Regression gave an MSE of 0.3035 and the lowest R2 of 0.1914, making it the least effective model. As shown in Figure 4, the Linear Regression residuals showed an adequate fit, but the data contain complexity beyond simple linear relationships. In addition, as shown in Figure 4, the Support Vector Regression residuals showed the largest variance and the poorest fit according to standard measures.
The inclusion of the Big Five trait variables in all of the prediction models increased their predictive performance, thereby underlining that psychological factors should be taken into consideration together with academic, personal, and family factors. In particular, the Gradient Boosting Regressor turned out to be the best performing model in capturing the complex interaction among the different factors affecting students’ CGPAs.

4.2. Analysis of Student Performance Based on Letter Grade Using Machine Learning Techniques

The effectiveness of the proposed methodology has been assessed using the SAPEx-D dataset, which is divided into training and test sets in an 80:20 ratio. A rigorous analysis has been carried out by computing a variety of metrics: prediction accuracy, precision, recall, and the F1 score. Furthermore, the confusion matrix details, involving vital performance indicators like true positives, true negatives, false positives, and false negatives, have been carefully calculated for each of these techniques.
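A sketch of this classification protocol with scikit-learn, again on synthetic stand-in data (the class labels and scores are illustrative only and do not reproduce the study's results):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# Synthetic stand-in: three "grade" classes driven by one feature threshold.
rng = np.random.default_rng(1)
X = rng.random((494, 10))
y = np.digitize(X[:, 0], bins=[0.33, 0.66])  # 0: low, 1: average, 2: high

# 80:20 split, stratified so all classes appear in the test set.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)
pred = clf.predict(X_te)

cm = confusion_matrix(y_te, pred)
print("accuracy :", accuracy_score(y_te, pred))
print("precision:", precision_score(y_te, pred, average="weighted"))
print("recall   :", recall_score(y_te, pred, average="weighted"))
print("f1       :", f1_score(y_te, pred, average="weighted"))
print(cm)
```

The confusion matrix's diagonal holds the correctly classified counts per class, which is exactly what the per-class discussion below reads off the figures.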

4.2.1. Letter Grade Classification with Exclusion of Personality Factors

The extensive evaluation process facilitated a comprehensive cross-technique comparison of the results obtained using the four distinct methods employed.

Letter Grade Classification with Eight Distinct Classes

The confusion matrices for the Gradient Boosting, KNN, Naive Bayes, and Random Forest classifiers provide a comprehensive comparison of their predictive performances for the letter grade (1:A, 2:A−, 3:B+, 4:B, 5:B−, 6:C+, 7:C, 8:C−) classification task. Each model’s ability to correctly classify the grades and their respective shortcomings can be seen in these matrices.
In Figure 5, Gradient Boosting shows very strong performance with a high number of correct predictions, such as 12 in class B− and 10 in class C−. It differentiates the classes very well, which proves its ability to handle complex patterns in the data. Although there is still room for improvement in class B+, the balance across most of the evaluated classes underscores the robustness and reliability of this model. The KNNs classifier also worked well, as shown in Figure 5 by its high accuracy in class A, where it correctly predicted 11 cases. Since it performed well for most of the classes, it is clearly able to capture the local structure of the data. Even though there are a couple of misclassifications, the ability of the KNNs model to handle a feature space rich in diversity is evident from its confusion matrix.
Although in its raw form Naive Bayes performed well on class C−, correctly predicting 10 instances as shown in Figure 5, it also performed fairly well for other classes, such as class A and class A−. Thus, this model is good at providing correct predictions for class C−, which makes it promising for certain feature distributions and a valuable model for subsets of the data. For classes B− and C−, the Random Forest classifier performs exceedingly well, with 12 and 10 correct predictions, respectively. This model shows fewer misclassifications compared to the others. Its use of ensemble learning allows the Random Forest approach to capture intricate patterns, making it reliable for multi-class classification.
We concentrated our study on sorting student performances into eight different letter grades: 1:A; 2:A−; 3:B+; 4:B; 5:B−; 6:C+; 7:C; and 8:C−, respectively. We tried several machine-learning models to find out the most efficient one for the task of multi-class classification.
As shown in Table 4, we started with Gradient Boosting, which turned out to be the best for the eight distinct classes: 1:A, 2:A−, 3:B+, 4:B, 5:B−, 6:C+, 7:C, 8:C−. Its accuracy was 0.686869. An F1 score of 0.678227, a precision of 0.703067, and a recall of 0.686869 proved it to be quite robust in identifying the different letter grades with excellent accuracy. The ability of Gradient Boosting to handle complex interactions within the data made it especially well suited for our classification task, returning nuanced and precise predictions. We then evaluated the Random Forest model, which had a decent accuracy of 0.626263. Its F1-score was 0.638761, with a precision of 0.679717 and a recall of 0.626263, shedding some light on its effectiveness regarding the multifaceted nature of student performance data. Due to its ensemble approach, Random Forest was able to capture diverse dimensions of the data and turned out to be a reliable model for predicting student grades. Another approach that we used was the Naive Bayes model, which had an accuracy of 0.373737. The model’s precision was 0.496006, but its recall came out lower at 0.373737, which translated into an F1-score of 0.316316. In this regard, Naive Bayes offered an easy and fast way to make baseline predictions, which was useful as a comparison point. Finally, we evaluated the KNNs model, which had an accuracy of 0.424242. With an F1-score of 0.406810 and precision and recall values of 0.413786 and 0.424242, respectively, KNNs could make sense of the local patterns in the data.

Letter Grade Classification with Three Distinct Classes

We have presented a full comparison of the classifiers and their predictive performance regarding the broader letter grades (1: High, 2: Average, 3: Low) classification task.
As seen from Figure 6, Gradient Boosting performs very well, especially in predicting high performers. It got 35 of the 50 high performers right, indicating high precision, though it misclassified eight high performers as average and seven as low. It also identified most of the low performers correctly, predicting 30 out of 31. Its overall balance in prediction makes this a robust model for this task. The K-Nearest Neighbors model also shows some competency, particularly in identifying high and low performers. It correctly classified 30 of the 50 high-performing students and 22 of the 31 low-performing students. It was a little spotty in differentiating between high and average performers, mis-categorizing 10 high performers as average. Its consistency in picking out the low performers is notable, however, and its success in detecting high-performing students shows how it could be useful in programs concerning academic excellence. In contrast, although the Naive Bayes model showed some strong points, the high variance within its predictions caused it to mislabel many students as low performers. This model was more accurate in predicting the low-performing students, with all 31 being correctly identified. While it may not be the best model for differentiating between high and average performers, its high precision in identifying low performers makes it very reliable at picking out the students who struggle the most. Moreover, Random Forest demonstrates robust performance in general, especially at the high and low ends of performers. It placed 45 of the 50 high performers accurately, which is very good accuracy.
As shown in Table 5, our analysis gives some interesting insights into how well the models predicted student outcomes. Most notably, Gradient Boosting had a quite strong performance with an accuracy of 0.737374 and an F1-score of 0.731799, with a closely matching precision of 0.730782 and recall of 0.737374. This means that the model withstood the variance in our SAPEx data quite sturdily, making it one of the stronger contenders within our analysis. The top-performing model was the Random Forest model with an accuracy of 0.808081. Its F1-score was 0.783302, showing a balanced performance with a precision of 0.799743 and a recall of 0.808081, proving its strength in identifying students who were at risk and being very strong in predicting their performances. Thus, the ensemble approach made Random Forest better placed to deal with the very different factors influencing student success, confirming its place as our most dependable model. The Naive Bayes model, although simple and fast, obtained an accuracy of 0.363636; it turned out to have quite a high precision of 0.650914 but a very poor recall of 0.363636, with an F1-score of 0.250237. We also assessed the K-Nearest Neighbors algorithm. With an accuracy of 0.585859 and an F1-score of 0.582228, the KNNs model displayed a balanced approach toward the target variables in predicting student outcomes. Both precision and recall, at about 0.580011 and 0.585859, respectively, indicated its effectiveness in capturing local patterns within the student data.
A dual analysis with eight distinct classes and three broader categorizations showed the flexibility and strengths of each model in different classification scenarios. Gradient Boosting and Random Forest were very strong and versatile and could be applied for both detailed and broader performance predictions. Though Naive Bayes and KNNs were rapid and provided useful baseline predictions, the stronger models added the depth needed to gain comprehensive insight into the dynamics of student performance.

4.2.2. Letter Grades Classification with Inclusion of Personality Factors

We present the classification of letter grades with personality factors included alongside academic, personal, and family factors.

Letter Grade Classification with Eight Distinct Classes

Four classifiers, Gradient Boosting, KNNs, Naive Bayes, and Random Forest, were comprehensively evaluated for predicting letter grades (1:A, 2:A−, 3:B+, 4:B, 5:B−, 6:C+, 7:C, 8:C−) with personality factors included alongside academic, personal, and family factors. The confusion matrix for each classifier presents its accuracy as well as its limitations in grade classification, showing the effect of including personality factors on predictive performance.
The Gradient Boosting classifier in Figure 7 showed a generally quite high level of performance for most of the grade categories in the dataset. In particular, it performed very well for grades C and C−. Nevertheless, it still made some misclassifications, especially with respect to the B and B+ grades. As shown in Figure 7, the K-Nearest Neighbors classifier performed well on the test set but suffered when detecting the finer distinctions between similar classes like B (class 4) and B− (class 5). This classifier tended to misclassify students whose grades were approximately in the middle of the grade scale. Analyzing the performance of the Naive Bayes classifier in Figure 7, it can be seen that, while computationally efficient, it struggles with overlap between certain grades, leading to some serious misclassifications. It performed particularly poorly on grades B and B−, often confusing them with C+ and C. In contrast, the Random Forest classifier was very strong and correctly predicted most of the grade categories with a higher degree of precision, as indicated in Figure 7. The model showed high performance over the entire dataset and discriminated between class 4 (B) and class 6 (C+) well.
The models also showed increased performance with the addition of the personality factors, as shown in Table 6; on the eight classes, the Random Forest classifier achieved an accuracy of 0.6768 and an F1 score of 0.6741. This strengthens its case, as it handles the complexity of this dataset and reports well balanced precision and recall values. Its ensemble approach reduces variance and improves generalization, and hence fits this task very well.
As shown in Table 6, the performance of the Gradient Boosting algorithm came close, with an accuracy of 0.6667 and an F1 score of 0.6634. It performed better in terms of precision than Random Forest, at 0.6903, but its recall was lower. Because of its iterative boosting method, Gradient Boosting can correct errors from its previous iterations, which elevates its performance on datasets with complex patterns and varied distributions. Naive Bayes was the most computationally efficient but had the lowest metrics, with an accuracy of 0.3737 and an F1 score of 0.3163. Most likely, this is due to the model’s assumption of feature independence, which became a drawback when working with overlapping grade categories. Although its precision of 0.4960 showed that it could make some correct classifications, it significantly lacked the ability to reduce the number of false positives and false negatives. K-Nearest Neighbors performed moderately well, with an accuracy of 0.4545 and an F1 score of 0.4198. Its precision of 0.5107 was greater than that of Naive Bayes, meaning it produced fewer false positives. However, the KNNs classifier struggled overall because its reliance on local neighbor information led it to misclassify similar grades.

Letter Grade Classification with Three Distinct Classes

We have presented a full comparison of the classifiers and their predictive performance regarding the broader letter grade (1: high, 2: average, 3: low) classification task including personality factors alongside the other factors (personal, academic and family).
The confusion matrix for the Gradient Boosting classifier is shown in Figure 8. It performs well in differentiating the high and low categories, correctly predicting 38 in the high category and 31 in the low, but it fared a little worse on the average category, misclassifying many of its instances as high or low.
The K-Nearest Neighbors classifier was relatively balanced but less accurate than Gradient Boosting. It correctly classified 34 as high (class 1) and 22 as low (class 3), though it struggled with the average category, misclassifying a good number of its instances into the high and low categories.
Naive Bayes had a number of problems; it failed for the high category, where it predicted only three grades correctly. It misclassified most high cases as low (class 3), showing itself rather incapable of handling class overlap. The average category was equally misclassified, which further indicates the deficiencies of Naive Bayes.
The Random Forest classifier performed very well, correctly classifying 49 observations into class 1 (high), and it was very good in class 3 (low), with 30 correct observations. The classifier showed a little misclassification in class 2, the average category. In general, though, Random Forest displayed the best balance in terms of accuracy among the models.
As shown in Table 7, Gradient Boosting performed very robustly with an accuracy of 0.777778 and an F1 score of 0.766878. Iterative boosting refines its predictive power by focusing on the errors of previous iterations. Most likely, the addition of personality factors gave the model some extra discriminative power, allowing it to capture fine-grained heterogeneity in the students’ performances. Random Forest turned out to be the most robust; its accuracy equaled 0.858586, and its F1 score equaled 0.835859. It gave very good precision and recall and was able to differentiate the three grade categories significantly well. This high performance is due to the fact that Random Forests are based on ensemble learning, meaning many decision trees are aggregated to improve accuracy and avoid overfitting. Naive Bayes still had major problems in predicting the high category, with an accuracy of 0.363636 and an F1 score of 0.250237. Naive Bayes is very computationally efficient, but its assumption of independent features likely made it perform unsatisfactorily on overlapping classes. Adding personality factors did little to boost its performance, reflecting the intrinsic deficiencies of the model. K-Nearest Neighbors performed fairly well, with an accuracy of 0.636364 and an F1 score of 0.632793. Despite performing poorly on class 2, the average category, the inclusion of personality factors increased KNNs’ performance in classifying the high and low categories.
In fact, the comparison indicated that, with the incorporation of the personality factors in Table 7 and without considering the personality factors in Table 5 for the three classes of letter grades, the model performances showed a quite remarkable improvement.
Gradient Boosting improved from 0.737374 to 0.777778, and the F1 score increased from 0.731799 to 0.766878, representing an improved performance with the addition of personality data.
For Random Forest, accuracy increased from 0.808081 to 0.858586, and the F1 score improved from 0.783302 to 0.835859. This strongly suggests that personality factors are strong drivers of the model’s predictive power. Naive Bayes, with the same accuracy, showed only marginal improvements in precision, indicating that personality factors have a limited positive effect on its ability to reduce false positives.
For K-Nearest Neighbors, this addition did increase its accuracy from 0.585859 to 0.636364, and increased its F1 score from 0.582228 to 0.632793. These results show that personality gives valuable context to help KNNs differentiate between classes.
The integration of personality factors (Big Five) into the predictive models significantly improved their performance. Personal characteristics can affect learning styles, motivation, and academic behavior in general. For example, Conscientiousness and Openness to experience are usually related to better academic results. Including them helped the models understand the different influences that affect a student’s performance and hence make better predictions. These findings underscore the value of adopting a holistic approach to student performance prediction. Using these insights, an institution can offer students more tailored support, not only in academic areas but also by attending to the personality traits that influence learning. Such comprehensive insight can lead to effective interventions and hence improved educational outcomes.

4.3. Comparison of Prediction Models Effect on Students’ Academic Performance with and Without Personality Factors

This section compares the predictive performance of the regression and classification models with and without the personality factors used in this study. The evaluation metrics are MSE and R2 for the regression models, and accuracy, precision, recall, and F1-score for the classification models.
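For reference, all of these metrics are available in scikit-learn. The sketch below computes them on small illustrative arrays (the values are made up for demonstration, not taken from SAPEx-D):

```python
from sklearn.metrics import (mean_squared_error, r2_score, accuracy_score,
                             precision_score, recall_score, f1_score)

# Illustrative regression outputs (not real SAPEx-D values).
y_true_reg = [3.2, 2.8, 3.6, 3.0]
y_pred_reg = [3.1, 2.9, 3.5, 3.2]
print("MSE:", mean_squared_error(y_true_reg, y_pred_reg))
print("R2:", r2_score(y_true_reg, y_pred_reg))

# Illustrative classification outputs (labels 1=high, 2=average, 3=low).
y_true_cls = [1, 2, 3, 2, 1, 3]
y_pred_cls = [1, 2, 2, 2, 1, 3]
print("Accuracy:", accuracy_score(y_true_cls, y_pred_cls))
print("Macro precision:", precision_score(y_true_cls, y_pred_cls, average="macro"))
print("Macro recall:", recall_score(y_true_cls, y_pred_cls, average="macro"))
print("Macro F1:", f1_score(y_true_cls, y_pred_cls, average="macro"))
```

Macro averaging weights each class equally, which is appropriate when grade classes are imbalanced.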

4.3.1. Regression-Based Performance Analysis with and Without Personality Factors

Table 8 and Figure 9 present the outcome of including personality factors in the regression models used to predict students' academic performance. The findings indicate that adding personality factors improves prediction accuracy for all models. For GBR, a substantial improvement was observed: R2 increased from 0.6905 to 0.8352, while MSE decreased from 0.1162 to 0.0618, likely because GBR captures non-linear interactions among features and is less susceptible to overfitting.
K-Nearest Neighbors (KNNs) and Support Vector Regression (SVR) showed only marginal improvements. KNNs performed poorly even with personality traits because of its susceptibility to the curse of dimensionality; the mixed feature types and high dimensionality of the SAPEx-D dataset make it particularly hard for KNNs to uncover meaningful patterns. SVR likewise struggled to improve significantly because its linear kernel limits its ability to fully capture complex, non-linear relationships in the data. Linear Regression showed only modest improvement, hampered by its linearity assumption, which can only approximate the subtle interplay of personal, family, academic, and personality factors. These findings indicate that models such as Gradient Boosting, which handle non-linear interactions, are well suited to predicting students' academic performance when personality factors are included.
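The with/without comparison can be reproduced in outline as follows. This is a sketch on synthetic data: the feature names and the assumed dependence of CGPA on a conscientiousness score are illustrative stand-ins, not the SAPEx-D pipeline:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-ins: academic/family features plus one personality score.
study_hours = rng.uniform(0, 30, n)
family_income = rng.uniform(0, 1, n)
conscientiousness = rng.uniform(1, 5, n)  # assumed Big Five-style scale
cgpa = (1.5 + 0.04 * study_hours + 0.5 * family_income
        + 0.35 * conscientiousness + rng.normal(0, 0.2, n))

X_base = np.column_stack([study_hours, family_income])
X_full = np.column_stack([study_hours, family_income, conscientiousness])

def evaluate(X, y):
    """Fit GBR on a train split and score it on the held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)
    pred = GradientBoostingRegressor(random_state=42).fit(X_tr, y_tr).predict(X_te)
    return r2_score(y_te, pred), mean_squared_error(y_te, pred)

r2_without, mse_without = evaluate(X_base, cgpa)
r2_with, mse_with = evaluate(X_full, cgpa)
print(f"without personality: R2={r2_without:.3f}, MSE={mse_without:.4f}")
print(f"with personality:    R2={r2_with:.3f}, MSE={mse_with:.4f}")
```

Because the synthetic target genuinely depends on the personality column, dropping it acts as extra noise, mirroring the direction of the improvement reported in Table 8.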

4.3.2. Classification-Based Performance Analysis with and Without Personality Factors

Here, we present a comparison between the classifiers Gradient Boosting (GB), Random Forest (RF), Naive Bayes (NB), and K-Nearest Neighbors (KNNs) across eight distinct classes (1:A, 2:A−, 3:B+, 4:B, 5:B−, 6:C+, 7:C, 8:C−) and three distinct classes (1: high, 2: average, 3: low) with and without the inclusion of personality factors as shown in Table 9.
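The two label granularities are related by grouping the eight letter grades into three broad bands. The grouping sketched below is an assumption for illustration, since the paper does not spell out which letter grades form the high/average/low bands:

```python
# Map the eight letter-grade classes onto the three broad classes.
# NOTE: this high/average/low grouping is an illustrative assumption,
# not a mapping stated in the paper.
EIGHT_CLASSES = ["A", "A-", "B+", "B", "B-", "C+", "C", "C-"]

def letter_to_broad(letter: str) -> str:
    if letter in ("A", "A-"):
        return "high"
    if letter in ("B+", "B", "B-"):
        return "average"
    if letter in ("C+", "C", "C-"):
        return "low"
    raise ValueError(f"unknown grade: {letter}")

print([letter_to_broad(g) for g in EIGHT_CLASSES])
```

Coarser bands give the classifiers fewer, better-separated targets, which is one reason the three-class accuracies below are uniformly higher than the eight-class ones.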
The analysis in Table 9 and Figure 10 and Figure 11 clearly shows that using personality features markedly enhances the prediction performance of classification models for student outcomes. For both the eight-class and three-class categorizations, including personality features consistently improved predictive accuracy, precision, recall, and F1 score.
This is evident in the eight-class classification, where model performance improves consistently with the inclusion of personality factors. Accuracy increases from 62.63% to 67.68%, while the F1-score grows from 63.87% to 67.41%, meaning the model's capacity to correctly identify and classify student performance improves substantially when personality traits are considered. Gradient Boosting shows similar gains in precision and recall, making the model more reliable and balanced.
The effect is even more striking in the three-class classification. Here, the Random Forest model's accuracy rises from 80.81% without personality traits to 85.86% with them, with an F1-score of 83.59%. This suggests that personality traits are particularly useful for telling apart broad categories of student achievement, such as high, average, and low achievers. They make the model better at picking out the underlying behavioral patterns that influence academic success, yielding more fine-grained, actionable predictions.
Gradient Boosting and Random Forest performed well because they can accommodate complex, non-linear interactions between features. Gradient Boosting is particularly effective because it combines weak learners iteratively, each step targeting the errors of the previous one, thus fine-tuning the delicate relationships within the data. Its robustness to overfitting and its ability to handle mixed data types make it especially apt for a dataset like SAPEx-D, where personal, family, academic, and personality factors interact in intricate ways. Random Forest, on the other hand, relies on an ensemble of decision trees that capture diverse perspectives on the data through random feature selection and bagging. This diversity reduces variance while maintaining high accuracy, making it a sound choice for both regression and classification. Both models are also flexible and scalable, allowing them to adapt to the complexities of the dataset.
By contrast, the KNNs and Naive Bayes models struggled because of inherent limitations in handling high-dimensional, heterogeneous data. KNNs depends on a distance metric such as Euclidean distance, which is problematic for datasets with mixed feature types because it implicitly assumes that all features have equal importance and scale. Combined with the curse of dimensionality, this diminishes its ability to extract meaningful patterns from sparse, diverse data. Naive Bayes assumes feature independence, which grossly over-simplifies the complex interdependencies among factors in the SAPEx-D dataset; the model therefore misses vital interactions between academic, personal, and personality features and is far less predictive. Although both models are computationally efficient and easy to apply, their inability to adapt to the complex data structure holds back their overall performance.
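The four-classifier comparison can be sketched as follows on a synthetic three-class problem standing in for the high/average/low grade bands (a sketch of the experimental setup, not the SAPEx-D experiment itself):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

# Synthetic 3-class problem with interacting features.
X, y = make_classification(n_samples=600, n_features=15, n_informative=8,
                           n_classes=3, n_clusters_per_class=2, random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

models = {
    "Gradient Boosting": GradientBoostingClassifier(random_state=7),
    "Random Forest": RandomForestClassifier(random_state=7),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f}, "
          f"F1={f1_score(y_te, pred, average='macro'):.3f}")
```

On data with correlated, interacting features, the ensemble methods typically lead, consistent with the ranking observed on SAPEx-D.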

4.4. Interpreting Model Causality Through Explainable AI (XAI) in Student Performance Prediction

In our research on predicting students' academic performance from personal, academic, and family factors and personality traits, we utilized Explainable AI (XAI) techniques, specifically the SHAPs (Shapley Additive Explanations) method, to interpret the causal relationships among these factors.
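The core idea behind SHAPs can be illustrated with an exact Shapley computation on a toy model. The pure-NumPy sketch below (not the shap library, and not our actual pipeline; the toy linear "CGPA model" and its weights are assumptions) enumerates feature coalitions and demonstrates the additivity property that makes the attributions interpretable:

```python
import numpy as np
from itertools import combinations
from math import factorial

# Toy linear "model": predicted CGPA from three standardized features,
# e.g. study hours, income, neuroticism. Weights are illustrative.
weights = np.array([0.5, 0.3, -0.4])

def model(x):
    return 2.8 + weights @ x

# Background expectation: missing features are replaced by their mean (0 here).
background = np.zeros(3)

def value(subset, x):
    """Model output with features outside `subset` set to the background."""
    z = background.copy()
    z[list(subset)] = x[list(subset)]
    return model(z)

def shapley_values(x):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                phi[i] += w * (value(S + (i,), x) - value(S, x))
    return phi

x = np.array([1.2, -0.5, 0.8])
phi = shapley_values(x)
print("attributions:", phi)
# Additivity: baseline + sum of attributions recovers the prediction.
print("baseline + sum =", model(background) + phi.sum(), "model(x) =", model(x))
```

For a linear model with an independent-feature background, each attribution reduces to the weight times the feature's deviation from its mean; the shap library's TreeExplainer applies the same principle efficiently to tree ensembles like our Gradient Boosting and Random Forest models.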

4.4.1. Regression-Based Causality Through Explainable AI (SHAPs) with and Without Personality Factors

Among the models we tested (Gradient Boosting Regressor, K-Nearest Neighbors Regressor, Linear Regression, and Support Vector Regression), the Gradient Boosting Regressor performed best. Therefore, we use XAI to uncover the causal factors driving this model's predictions.
Figure 12 shows the SHAPs values for the Gradient Boosting Regressor without personality trait features. The most influential factors are “Total Salary if Available”, “Degree Major_IT”, and “Interest in Major Course”, which are strongly linked to the model's predictions, as reflected by their large SHAPs values. “Total Salary if Available” shows a positive correlation with CGPA, suggesting that students from higher-income families tend to perform better academically. Likewise, “Degree Major_IT” and “Interest in Major Course” are important predictors, indicating that IT-major students who are interested in their major courses perform well academically. Interestingly, an interaction was found between “Parental Education Level” and “Economic Status”: students with highly educated parents from high-income families consistently achieved higher CGPAs, with intellectual resources compounded by economic resources. Conversely, students with less-educated parents but higher family income also performed above average, suggesting that economic resources partly mitigate the influence of educational background.
Figure 13 illustrates the SHAPs values of the model with the Big Five personality traits included. In this case, personality traits such as “Agreeableness”, “Neuroticism”, “Openness”, and “Conscientiousness” become substantive predictors alongside conventional factors. “Agreeableness” and “Conscientiousness” contribute positively to CGPA, implying that students who are cooperative, dependable, and disciplined have better chances of attaining high academic performance. “Neuroticism”, on the other hand, has a negative influence, indicating that students who are more sensitive to stress and emotional instability may fare poorly academically. Students with higher parental education, high income, and strong “Conscientiousness” exhibited the highest predicted CGPAs, demonstrating the synergistic effects of external support and internal traits. For students with moderate economic resources, “Agreeableness” and “Conscientiousness” emerged as compensatory factors, offsetting the lack of financial advantage.
In contrast, “Neuroticism” had a greater adverse effect among students from low-income families, underlining the need for targeted support for emotionally vulnerable students among the economically disadvantaged.
Comparing the SHAPs values with and without personality factors shows that personality traits provide additional explanatory power, enhancing the model's accuracy. Where the earlier model relied mainly on family income and major choice as key variables, the augmented model also gives weight to “Agreeableness” and “Conscientiousness”, shifting importance toward students' internal psychological attributes. This shift underscores the necessity of a holistic approach to predicting academic performance, one that considers both external and internal variables.

4.4.2. Classification-Based Causality Through Explainable AI (SHAPs) with and Without Personality Factors

In this section, we examine how our classification models explain the prediction of student letter grades for the eight distinct classes and the three distinct classes, both without and with personality factors. The main interest is how including personality traits changes the model's ability to predict academic outcomes, as reflected by SHAPs values. As Random Forest performed best among the classifiers, we consider only Random Forest when investigating causality between the factors.

Eight Distinct Classes Causality Using SHAPs

A SHAPs analysis was performed using only the personal, family, and academic factors, without personality factors, to provide insights into the variables that affect student performance, as captured in Figure 14. The plot identifies “Weekly Study Hours”, “Total Salary if Available”, and “Scholarship Type” among the most important predictors. Notably, “Weekly Study Hours” is consistently positively influential and predicts higher grades, supporting the expectation that more study time correlates with improved academic outcomes. In contrast, variables like “No. of Siblings” and “any additional Work” showed stronger interaction effects with grade predictions; that is, their influence varied across grade categories. Without personality factors, the model could only draw on observable demographic and academic data, and hence focused on direct indicators.
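Feature rankings like those in Figure 14 can be sanity-checked with permutation importance, an alternative attribution technique (not the SHAPs method used in the paper). The sketch below uses synthetic stand-ins for three of the plotted predictors; the data-generating process is assumed for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 600
# Synthetic stand-ins for the plotted predictors (names are illustrative).
weekly_study_hours = rng.uniform(0, 30, n)
total_salary = rng.uniform(0, 1, n)
n_siblings = rng.integers(0, 6, n)

# Assumed process: grade band driven mainly by study hours, mildly by salary.
score = 0.1 * weekly_study_hours + 0.8 * total_salary + rng.normal(0, 0.3, n)
y = np.digitize(score, np.quantile(score, [1/3, 2/3]))  # 0=low, 1=avg, 2=high

X = np.column_stack([weekly_study_hours, total_salary, n_siblings])
names = ["Weekly Study Hours", "Total Salary if Available", "No. of Siblings"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)
imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=1)
for name, m in sorted(zip(names, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {m:.3f}")
```

Because permutation importance measures the drop in held-out accuracy when a feature is shuffled, irrelevant features such as the synthetic sibling count score near zero.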
More insights are revealed by interactions between personal and family factors: parental education level and economic status are jointly influential, so that students from families with higher education levels and income tend to cluster in higher-grade categories.
Those whose parents had lower education levels but a higher income demonstrated moderate improvement in grades, suggesting that economic resources may compensate for a lack of academic mentorship at home.
As shown in Figure 15, traits such as “Agreeableness” allow for a better understanding of student behaviors and their relationship with academic outcomes. For instance, the presence of “Agreeableness” alone suggests that more cooperative and empathetic students obtain higher grades, even when other variables such as study hours are less favorable.
The interplay between personality factors and the traditional variables produces similarly complex effects. For example, the effect of “Total Salary if Available” is moderated by “Agreeableness”, with high agreeableness possibly buffering the stress caused by financial constraints. Similarly, the interaction of “Weekly Study Hours” with personality factors like “Agreeableness” has synergistic effects, making the prediction of academic success more accurate.

Three Distinct Classes Causality Using SHAPs

As Figure 16 shows, the SHAPs values for the classification model reveal how factors like “Scholarship Type”, “Taking Notes in Class”, and “Weekly Study Hours” contribute to the prediction of student grades. The distribution for “Scholarship Type” is rather flat, so its impact on grade prediction is nearly the same for all students, making it a weak predictor on its own. The distribution for “Taking Notes in Class” is compactly arranged around a central SHAPs value, indicating that although taking notes generally contributes positively to academic achievement, its impact is at best moderate compared with other factors. “Weekly Study Hours”, on the other hand, has the widest distribution of SHAPs values and hence a more substantial effect on grades. This implies that longer study hours lead to better grades, underscoring the importance of consistent academic effort.
As shown in Figure 17, including personality traits in the model increases the complexity of the SHAPs values, reflecting the more nuanced interactions of those traits with the remaining variables. The inclusion of “Agreeableness” as a personality trait introduces a new dimension to the analysis: the SHAPs values indicate that higher Agreeableness corresponds to better grades. This example shows that personality traits can increase a model's predictive power by adding behavioral elements that neither educational nor family factors can capture on their own. Interactions with other variables also become more complex once personality traits are considered; for example, “Weekly Study Hours” interacts with dispositions such as Agreeableness, suggesting that the effect of study time is modified by personality, which gives a clearer picture of the predictors of academic performance.
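The kind of moderation described here can be made concrete with a toy check: if study time's payoff is modulated by agreeableness, a model with a product term fits better than one with main effects only. The data-generating process below is an assumption for illustration, not a result from SAPEx-D:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(3)
n = 400
study = rng.uniform(0, 30, n)
agree = rng.uniform(1, 5, n)
# Assumed process: study time pays off more for more agreeable students.
grade = (1.0 + 0.02 * study + 0.1 * agree
         + 0.015 * study * agree + rng.normal(0, 0.3, n))

X_main = np.column_stack([study, agree])                 # main effects only
X_int = np.column_stack([study, agree, study * agree])   # plus interaction

r2_main = r2_score(grade, LinearRegression().fit(X_main, grade).predict(X_main))
r2_int = r2_score(grade, LinearRegression().fit(X_int, grade).predict(X_int))
print(f"main effects only: R2={r2_main:.3f}; with interaction: R2={r2_int:.3f}")
```

Tree ensembles such as Gradient Boosting pick up this kind of interaction automatically, which is why the SHAPs interaction patterns emerge without explicit product features.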
The SHAPs analysis provides a framework that translates personality and other influencing factors into actionable educational interventions, highlighting areas for self-improvement and for institutional support. For example, students characterized by high “Neuroticism”, which negatively predicts CGPA, can be offered stress management through mindfulness programs, journaling, yoga, and access to mental health counseling; such interventions could alleviate the negative impact of emotional instability on performance. Similarly, students with low “Conscientiousness”, a strong positive predictor, can be assisted with structured study planners, steady mentoring, and accountability groups that build disciplined habits and improve time management. Educational institutions could also use SHAPs information to design holistic interventions that address both academic and psychological needs. Programs designed to enhance characteristics like “Agreeableness” and “Conscientiousness” are likely to promote teamwork, discipline, and engagement among students. The analysis also points out that features like “Weekly Study Hours” and “Parent Education Level” are crucial, recommending initiatives such as extended library hours, organized study groups, and family engagement programs. Interventions could be tailored separately to undergraduates and graduates, addressing each group's needs while ensuring equal access to resources. This benefits not only individual achievement but also drives systemic improvement in educational performance.
Overall, the SHAPs analysis not only confirms the importance of traditional academic and personal factors but also highlights the significant role of personality traits in predicting student performance. By leveraging XAI techniques like SHAPs, we can better understand the underlying causality in our predictive models, leading to more informed and effective educational strategies.

4.5. Discussion and Comparative Analysis

Previous studies reviewed in the literature section, by Rolly et al. [22] and Ardiyansyah et al. [23], based their prediction models mainly on academic and personal factors. Rolly et al.'s model achieved an R-squared value of 0.6730, focusing on factors such as student demographics, program enrollment, and high school GPA. Similarly, Ardiyansyah et al. [23] included gender, age, family size, parental education, extracurricular activities, and health status; their study applied both Random Forest and Linear Regression models, achieving an R-squared value of 0.7214 with the Random Forest model.
A direct comparison with existing studies is difficult because they considered smaller sets of factors; nevertheless, the gains from including personality traits in our models clearly reaffirm the importance of these traits. In comparison to the existing studies shown in Figure 18, our proposed model, which includes personality traits alongside academic, personal, and family factors, shows a significant improvement in predicting CGPA. Specifically, our Gradient Boosting model achieved an R-squared value of 0.8352, indicating that it explains a higher proportion of the variance in student CGPA than the models in existing studies.
The inclusion of personality traits in our model demonstrates the need to consider personality when modeling academic performance. Despite considering a host of factors, existing studies such as [22,23] did not include personality traits, which this research identified as a key determinant. The improved R-squared value of our proposed model thus points to the crucial role of these traits in improving the accuracy of academic performance prediction.

5. Conclusions

In this study, we examined the various factors that affect student achievement, personal, family, and academic factors, with the added dimension of personality, using the SAPEx-D dataset collected at Air University, Islamabad. Employing a range of models, we aimed to predict students' Cumulative Grade Point Average (CGPA), classify letter grades into eight distinct classes and three broader classes, and evaluate the impact of incorporating personality traits into these predictions. The SAPEx-D dataset contains 494 instances with a well-rounded background of each student, categorized into personal, family, academic, and personality factors, giving a holistic perspective on the variables that influence academic outcomes. The dataset was carefully pre-processed to remove inaccuracies and inconsistencies: duplicates and missing values were handled appropriately, encoding methods for ordinal and nominal attributes were cross-verified, and feature scaling normalized the data for analysis. Four regression models, Gradient Boosting Regressor, K-Nearest Neighbors Regressor, Linear Regression, and Support Vector Regression, were trained and tested for predicting students' CGPAs, and four classifiers, Gradient Boosting, Random Forest, Naive Bayes, and K-Nearest Neighbors, were used for letter grade classification. Our findings indicated that the inclusion of all factors (personal, family, and academic) yielded meaningful predictions of student performance: the Gradient Boosting Regressor achieved the highest R-squared, approximately 0.69, and the lowest Mean Squared Error, indicating its robustness in predicting CGPA.
Prediction accuracy improved further with the addition of personality traits: the R-squared value increased to 0.83 and the MSE decreased. For classification, Random Forest's accuracy without personality factors was 0.62 for the eight distinct classes and 0.80 for the three classes; with personality factors incorporated, these rose to 0.67 and 0.85, respectively, indicating the strong influence of personality traits on academic performance. To understand the causality and importance of these factors, we utilized the SHAPs technique, which showed that, together with personal, family, and academic factors, personality factors play a principal role in predicting academic performance, giving deeper insight into the relationships between the different factors and student performance. The results are therefore in line with studies that have underscored the importance of taking a holistic approach to predicting academic outcomes, since personality traits, in addition to academic and personal variables, are very important in shaping a student's success. This study has several limitations. The analysis is static, ignoring changes over time in variables such as personality traits or study habits. It focuses exclusively on the Big Five model of personality, excluding other psychological constructs such as resilience. It also does not consider external influences, such as socioeconomic or institutional differences. In the future, we would like to consider interaction effects among these variables and to create more sophisticated models to capture these complexities.
Furthermore, we intend to explore more sophisticated machine learning techniques to capture complicated, non-linear interdependencies in the student performance data. This will be augmented by incorporating temporal dynamics: tracking changes in personality traits and academic behaviors over time. The analysis could further be expanded to include other psychological constructs and external socio-economic factors, which can enhance the robustness of the models. Integrating alternative personality frameworks and combining machine learning models can also improve predictive accuracy. Finally, these models can be developed into user-friendly tools that educators and policymakers can use to improve student outcomes.

Author Contributions

Conceptualization, F.M., M.E.U.H. and M.A.A. (Muhammad Adnan Aslam); Methodology M.A.A. (Muhammad Adnan Aslam); Software, M.A.A. (Muhammad Adnan Aslam), A.Y. and F.M.; Formal analysis, F.M.; Resources, A.Y. and M.A.A. (Muhammad Awais Azam); Data curation, M.A.A. (Muhammad Adnan Aslam); Writing—original draft, M.A.A. (Muhammad Adnan Aslam); Writing—review and editing, M.A.A. (Muhammad Adnan Aslam), M.A.A. (Muhammad Awais Azam) and M.E.U.H.; Visualization, F.M.; Supervision, M.A.A. (Muhammad Awais Azam); Funding acquisition, M.A.A. (Muhammad Awais Azam). All authors have read and agreed to the published version of the manuscript.

Funding

This research work is supported by the School of Information Technology, Whitecliffe, Wellington, New Zealand, and Air University, Islamabad, Pakistan.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on reasonable request.

Acknowledgments

The authors express their gratitude to the commenters for their insightful and constructive feedback.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Albreiki, B.; Zaki, N.; Alashwal, H. A Systematic Literature Review of Student’ Performance Prediction Using Machine Learning Techniques. Educ. Sci. 2021, 11, 552. [Google Scholar] [CrossRef]
  2. Kumar, M.; Singh, A.J.; Handa, D. Literature Survey on Student’s Performance Prediction in Education using Data Mining Techniques. Int. J. Educ. Manag. Eng. 2017, 7, 40–49. [Google Scholar] [CrossRef]
  3. Smith, C. A Holistic Approach to Assessment for Students with Severe Learning Difficulties. EdD Thesis, The Open University, Milton Keynes, UK, 2023. [Google Scholar] [CrossRef]
  4. Hijazi, S.T.; Naqvi, S.M.M.R. Factors affecting students’ performance. Bangladesh E-J. Sociol. 2006, 3, 1–10. [Google Scholar]
  5. Zhang, Y.; Yun, Y.; An, R.; Cui, J.; Dai, H.; Shang, X. Educational Data Mining Techniques for Student Performance Prediction: Method Review and Comparison Analysis. Front. Psychol. 2021, 12, 698490. [Google Scholar] [CrossRef]
  6. Misopoulos, F.; Argyropoulou, M.; Tzavara, D. Exploring the Factors Affecting Student Academic Performance in Online Programs: A Literature Review. In On the Line; Springer International Publishing: Cham, Switzerland, 2018; pp. 235–250. [Google Scholar] [CrossRef]
  7. Shahiri, A.M.; Husain, W.; Rashid, N.A. A Review on Predicting Student’s Performance Using Data Mining Techniques. Procedia Comput. Sci. 2015, 72, 414–422. [Google Scholar] [CrossRef]
  8. Ren, Y.; Yu, X. Long-term student performance prediction using learning ability self-adaptive algorithm. Complex Intell. Syst. 2024, 10, 6379–6408. [Google Scholar] [CrossRef]
  9. Fazil, M.; Rísquez, A.; Halpin, C. A Novel Deep Learning Model for Student Performance Prediction Using Engagement Data. J. Learn. Anal. 2024, 11, 23–41. [Google Scholar] [CrossRef]
  10. Anisa, Y.; Erika, W.; Azmi, F. Enhancing Student Performance Prediction Using a Combined SVM-Radial Basis Function Approach. Int. J. Innov. Res. Comput. Sci. Technol. 2024, 12, 1–5. [Google Scholar] [CrossRef]
  11. Sawalkar, P.M.S.; Bhore, S.; Doiphode, S.; Sonawane, N.; Sunkewar, V. Student Performance Prediction. Int. J. Res. Appl. Sci. Eng. Technol. 2024, 12, 3855–3862. [Google Scholar] [CrossRef]
  12. Bonar Sirait, J.C.; Togatorop, P.R.; Tambunan, P.M.L.; Yanti Marpaung, Y.F.; Situmeang, S.I.; Simanjuntak, H.T. Predicting Students Performance Using Data Mining Approach (Case Study: IT Del). In Proceedings of the 2023 IEEE International Conference on Data and Software Engineering (ICoDSE), Toba, Indonesia, 7–8 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 202–207. [Google Scholar] [CrossRef]
  13. Manzali, Y.; Akhiat, Y.; Abdoulaye Barry, K.; Akachar, E.; El Far, M. Prediction of Student Performance Using Random Forest Combined With Naïve Bayes. Comput. J. 2024, 67, 2677–2689. [Google Scholar] [CrossRef]
  14. Khan, I.; Zabil, M.H.M.; Ahmad, A.R.; Jabeur, N. Selecting Machine Learning Models for Student Performance Prediction Aligned with Pedagogical Objectives. In Proceedings of the 2023 IEEE International Conference on Computing (ICOCO), Langkawi, Malaysia, 9–12 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 402–407. [Google Scholar] [CrossRef]
  15. Abuchar, V.J.; Arteta, C.A.; De La Hoz, J.L.; Vieira, C. Risk-based student performance prediction model for engineering courses. Comput. Appl. Eng. Educ. 2024, 32, e22757. [Google Scholar] [CrossRef]
  16. Firdaus Mustapha, M.; Izzah Zulkifli, A.N.; Kairan, O.; Sofea Mat Zizi, N.N.; Naim Yahya, N.; Maisarah Mohamad, N. The prediction of student’s academic performance using RapidMiner. Indones. J. Electr. Eng. Comput. Sci. 2023, 32, 363. [Google Scholar] [CrossRef]
  17. Ahmed, E. Student Performance Prediction Using Machine Learning Algorithms. Appl. Comput. Intell. Soft Comput. 2024, 2024, 4067721. [Google Scholar] [CrossRef]
  18. Tao, H. Educational data mining for student performance prediction: Feature selection and model evaluation. J. Electr. Syst. 2024, 20, 1063–1074. [Google Scholar] [CrossRef]
  19. Li, Y. Data Analysis of Student Academic Performance and Prediction of Student Academic Performance Based on Machine Learning Algorithms. Commun. Humanit. Res. 2024, 32, 65–71. [Google Scholar] [CrossRef]
  20. Khairy, D.; Alharbi, N.; Amasha, M.A.; Areed, M.F.; Alkhalaf, S.; Abougalala, R.A. Prediction of student exam performance using data mining classification algorithms. Educ. Inf. Technol. 2024, 29, 21621–21645. [Google Scholar] [CrossRef]
  21. Zhang, T.; Yao, H.; Duan, Y.; Zhang, S.; Xie, Y.; An, X. Research on Student Performance Prediction Model Based on Blended Learning Data. In Proceedings of the 2023 4th International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), Hangzhou, China, 25–27 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 98–101. [Google Scholar] [CrossRef]
  22. Dagdagui, R.T. Predicting Students’ Academic Performance Using Regression Analysis. Am. J. Educ. Res. 2022, 10, 640–646. [Google Scholar] [CrossRef]
  23. Ardiyansyah, A.M.M. Covariance Structure Analysis of Academic Performance Indicators in Relation to Family, Peer Influence, and Financial Factors. In StatPearls [Internet]; StatPearls Publishing: Treasure Island, FL, USA, 2023; Volume 15, pp. 1–14. [Google Scholar]
  24. Ni, L.; Wang, S.; Zhang, Z.; Li, X.; Zheng, X.; Denny, P.; Liu, J. Enhancing Student Performance Prediction on Learnersourced Questions with SGNN-LLM Synergy. arXiv 2024, arXiv:2309.13500. [Google Scholar] [CrossRef]
  25. Shou, Z.; Xie, M.; Mo, J.; Zhang, H. Predicting Student Performance in Online Learning: A Multidimensional Time-Series Data Analysis Approach. Appl. Sci. 2024, 14, 2522. [Google Scholar] [CrossRef]
  26. Wang, J.; Tang, G.; Wang, Y. Application in Student Performance Prediction Using Graph Regularization Nonnegative Matrix Factorization. In Proceedings of the 2023 10th International Conference on Dependable Systems and Their Applications (DSA), Tokyo, Japan, 10–11 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 816–820. [Google Scholar] [CrossRef]
  27. Oppong, S.O. Predicting Students’ Performance Using Machine Learning Algorithms: A Review. Asian J. Res. Comput. Sci. 2023, 16, 128–148. [Google Scholar] [CrossRef]
  28. Alamgir, Z.; Akram, H.; Karim, S.; Wali, A. Enhancing Student Performance Prediction via Educational Data Mining on Academic data. Inform. Educ. 2023, 23, 1–24. [Google Scholar] [CrossRef]
  29. Resmi, T.J.; Mathews, M.K.; Padmanabhan, S. Statistical Analysis of Student Data and Machine Learning Models for Performance Prediction. In Proceedings of the 2024 4th International Conference on Data Engineering and Communication Systems (ICDECS), Bangalore, India, 22–23 March 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–5. [Google Scholar] [CrossRef]
  30. Mammadov, S. Big Five personality traits and academic performance: A meta-analysis. J. Pers. 2022, 90, 222–255. [Google Scholar] [CrossRef] [PubMed]
  31. Birba, D.E. A Comparative Study of Data Splitting Algorithms for Machine Learning Model Selection. Ph.D. Thesis, KTH, School of Electrical Engineering and Computer Science (EECS), Stockholm, Sweden, 2020. Available online: https://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A1506870&dswid=2578 (accessed on 12 April 2024).
  32. Obaid, H.S.; Dheyab, S.A.; Sabry, S.S. The Impact of Data Pre-Processing Techniques and Dimensionality Reduction on the Accuracy of Machine Learning. In Proceedings of the 2019 9th Annual Information Technology, Electromechanical Engineering and Microelectronics Conference (IEMECON), Jaipur, India, 13–15 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 279–283. [Google Scholar] [CrossRef]
  33. Hedeker, D. Multilevel Models for Ordinal and Nominal Variables. In Handbook of Multilevel Analysis; Springer: New York, NY, USA, 2008; pp. 237–274. [Google Scholar] [CrossRef]
  34. Zhang, Y.; Cheung, Y. Learnable Weighting of Intra-attribute Distances for Categorical Data Clustering with Nominal and Ordinal Attributes. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3560–3576. [Google Scholar] [CrossRef] [PubMed]
  35. Zhang, Y.; Cheung, Y.-M. A New Distance Metric Exploiting Heterogeneous Interattribute Relationship for Ordinal-and-Nominal-Attribute Data Clustering. IEEE Trans. Cybern. 2022, 52, 758–771. [Google Scholar] [CrossRef]
  36. Jia, B.B.; Zhang, M.L. Multi-Dimensional Classification via Sparse Label Encoding. Proc. Mach. Learn. Res. 2021, 139, 4917–4926. [Google Scholar]
  37. Yu, L.; Zhou, R.; Chen, R.; Lai, K.K. Missing Data Preprocessing in Credit Classification: One-Hot Encoding or Imputation? Emerg. Mark. Financ. Trade 2022, 58, 472–482. [Google Scholar] [CrossRef]
  38. Singh, D.; Singh, B. Investigating the impact of data normalization on classification performance. Appl. Soft Comput. 2020, 97, 105524. [Google Scholar] [CrossRef]
  39. Plevris, V.; Solorzano, G.; Bakas, N.; Ben Seghier, M. Investigation of performance metrics in regression analysis and machine learning-based prediction models. In Proceedings of the 8th European Congress on Computational Methods in Applied Sciences and Engineering, Oslo, Norway, 5–9 June 2022; CIMNE: Barcelona, Spain, 2022. [Google Scholar] [CrossRef]
  40. Chicco, D.; Warrens, M.J.; Jurman, G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef]
  41. Yağcı, M. Educational data mining: Prediction of students’ academic performance using machine learning algorithms. Smart Learn. Environ. 2022, 9, 11. [Google Scholar] [CrossRef]
  42. Tomasevic, N.; Gvozdenovic, N.; Vranes, S. An overview and comparison of supervised data mining techniques for student exam performance prediction. Comput. Educ. 2020, 143, 103676. [Google Scholar] [CrossRef]
  43. Naidu, G.; Zuva, T.; Sibanda, E.M. A Review of Evaluation Metrics in Machine Learning Algorithms. In Artificial Intelligence Application in Networks and Systems; Springer: Cham, Switzerland, 2023; pp. 15–25. [Google Scholar] [CrossRef]
  44. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1. [Google Scholar] [CrossRef]
  45. Vujovic, Ž.Ð. Classification Model Evaluation Metrics. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 599–606. [Google Scholar] [CrossRef]
Figure 1. Proposed methodology for student performance prediction.
Figure 2. Distribution of Cumulative Grade Point Average (CGPA) in SAPEx-D.
Figure 3. Classifiers’ (Gradient Boosting, KNN, LR, SVR) residual distributions without personality factors.
Figure 4. Classifiers’ (Gradient Boosting, KNN, LR, SVR) residual distributions with personality factors.
Figure 5. Classifiers’ (Gradient Boosting, KNN, NB, RF) confusion matrices for eight classes without personality factors.
Figure 6. Classifiers’ confusion matrices for three classes (high, average, low) without personality factors.
Figure 7. Classifiers’ (Gradient Boosting, KNN, NB, RF) confusion matrices for eight classes with personality factors.
Figure 8. Classifiers’ confusion matrices for three classes (high, average, low) with personality factors.
Figure 9. Comparison of regression model performance with and without personality factors.
Figure 10. Comparison of classifiers on eight classes with and without personality factors.
Figure 11. Comparison of classifiers on three classes with and without personality factors.
Figure 12. Causality between features without personality factors.
Figure 13. Causality between features with personality factors.
Figure 14. Causality between features without personality factors for eight distinct classes.
Figure 15. Causality between features with personality factors for eight distinct classes.
Figure 16. Causality between features without personality factors for three distinct classes.
Figure 17. Causality between features with personality factors for three distinct classes.
Figure 18. Comparison of R-squared between existing studies [22,23].
Table 1. SAPEx-D dataset factors description.

| Category | Attribute Name | Attribute Type | Values |
|---|---|---|---|
| Personal Factors | Age | Categorical (Ordinal) | (18–21), (22–25) |
| | Gender | Categorical (Nominal) | Male, Female |
| | Additional work | Categorical (Ordinal) | Yes, No |
| | Sports/Activities | Categorical (Ordinal) | Yes, No |
| | Compensation | Categorical (Ordinal) | None, USD 135–200, USD 201–270, USD 271–340, USD 341–410, above 410 |
| | Means of transportation | Categorical (Nominal) | Bus, Private car/taxi, Bicycle, Other |
| | Lodging | Categorical (Nominal) | Rental, Dormitory, With family, Other |
| Family Factors | Marital status | Categorical (Nominal) | Yes, No |
| | Mother’s education | Categorical (Ordinal) | Primary School, Secondary School, High School, Bachelor, MSc., Ph.D. |
| | Father’s education | Categorical (Ordinal) | Primary School, Secondary School, High School, Bachelor, MSc., Ph.D. |
| | Siblings | Numeric (Discrete) | 0, 1, 2, 3, 4, 5 or above |
| | Mother’s occupation | Categorical (Nominal) | Retired, Housewife, Government Officer, Private Sector Employee, Self-Employment, Other |
| | Father’s occupation | Categorical (Nominal) | Retired, Housewife, Government Officer, Private Sector Employee, Self-Employment, Other |
| | Parental status | Categorical (Nominal) | Married, Divorced, Died—one of them or both |
| Academic Factors | College graduation type | Categorical (Nominal) | Private, State, Other |
| | Scholarship | Categorical (Ordinal) | None, 25%, 50%, 75%, Full |
| | Weekly study hours | Categorical (Ordinal) | None, <5 h, 6–10 h, 11–20 h, More than 20 h |
| | Reading (non-scientific) | Categorical (Nominal) | None, Sometimes, Often |
| | Reading (scientific) | Categorical (Nominal) | None, Sometimes, Often |
| | Attendance at seminars | Categorical (Ordinal) | 1: Yes, 2: No |
| | Impact of your projects | Categorical (Nominal) | Positive, Negative, Neutral |
| | Attendance | Categorical (Nominal) | Always, Sometimes, Never |
| | Preparation for mid-term (group) | Categorical (Nominal) | Alone, With friends, Not applicable |
| | Preparation for mid-term (time before) | Categorical (Nominal) | Closest date to the exam, Regularly during the semester, Never |
| | Taking notes | Categorical (Nominal) | Never, Sometimes, Always |
| | Listening in class | Categorical (Nominal) | Never, Sometimes, Always |
| | Improvement by discussion | Categorical (Nominal) | Never, Sometimes, Always |
| | Flip classroom | Categorical (Nominal) | Not Useful, Useful, Not applicable |
| | Cumulative GPA | Numeric (Continuous) | 1: <2.00, 2: 2.00–2.49, 3: 2.50–2.99, 4: 3.00–3.49, 5: above 3.49 |
| | Expected GPA | Numeric (Continuous) | 1: <2.00, 2: 2.00–2.49, 3: 2.50–2.99, 4: 3.00–3.49, 5: above 3.49 |
| | Rate your interest in major degree | Categorical (Ordinal) | Scale 1–10 (1 = low, 10 = high) |
| | Output grade | Categorical (Ordinal) | A, A−, B+, B, B−, C+, C, C− |
| | Degree major | Categorical (Nominal) | Electrical Engineering, Business Administration, Accounting and Islamic Finance, Bachelor of Medicine and Bachelor of Surgery, Psychology, Software Engineering, Computer Science, Cyber Security, Data Science, Computer Game Development, Information Technology, Bachelor of Science in Pharmacy, Artificial Intelligence |
| Personality Factors | Openness to Experience (O) (Creativity) | Numeric (Continuous) | 0–1 (continuous values) |
| | Conscientiousness (C) (Organization) | Numeric (Continuous) | 0–1 (continuous values) |
| | Extraversion (E) (Sociability) | Numeric (Continuous) | 0–1 (continuous values) |
| | Agreeableness (A) (Compassion) | Numeric (Continuous) | 0–1 (continuous values) |
| | Neuroticism (N) (Emotional stability) | Numeric (Continuous) | 0–1 (continuous values) |
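Table 1 mixes ordinal attributes (whose levels are ordered, e.g., scholarship percentage) with nominal ones (unordered categories, e.g., lodging). A minimal preprocessing sketch of the two encodings, assuming pandas and illustrative column names (the actual SAPEx-D field names may differ):

```python
import pandas as pd

# Toy rows mimicking two SAPEx-D attributes (illustrative values only).
df = pd.DataFrame({
    "scholarship": ["None", "50%", "Full", "25%"],                 # ordinal
    "lodging": ["Rental", "Dormitory", "With family", "Rental"],   # nominal
})

# Ordinal attributes keep their natural order via an explicit integer mapping.
scholarship_order = {"None": 0, "25%": 1, "50%": 2, "75%": 3, "Full": 4}
df["scholarship"] = df["scholarship"].map(scholarship_order)

# Nominal attributes get one-hot columns so no artificial order is imposed.
df = pd.get_dummies(df, columns=["lodging"])

print(df.columns.tolist())
```

One-hot encoding a genuinely ordinal attribute would discard its ordering, while integer-coding a nominal attribute would invent one; keeping the two cases separate is the point of the Attribute Type column above.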
Table 2. Comparison between models without personality factors.

| Models | Mean Squared Error (MSE) | R-Squared (R2) |
|---|---|---|
| Gradient Boosting Regressor | 0.1162 | 0.6905 |
| K-Nearest Neighbors Regressor | 0.1465 | 0.6096 |
| Linear Regression | 0.2111 | 0.4376 |
| Support Vector Regression | 0.3037 | 0.1907 |
Table 3. Comparison between models considering personality factors.

| Models | Mean Squared Error (MSE) | R-Squared (R2) |
|---|---|---|
| Gradient Boosting Regressor | 0.0618 | 0.8352 |
| K-Nearest Neighbors Regressor | 0.1434 | 0.6179 |
| Linear Regression | 0.1907 | 0.4919 |
| Support Vector Regression | 0.3035 | 0.1914 |
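The regression comparison in Tables 2 and 3 can be sketched end to end with scikit-learn. The snippet below uses synthetic data as a stand-in for the SAPEx-D feature matrix, so its numbers will not match the tables; it only illustrates how MSE and R2 are computed for the competing regressors:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 494-record feature matrix (not SAPEx-D).
rng = np.random.default_rng(0)
X = rng.normal(size=(494, 10))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.3, size=494)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for model in (GradientBoostingRegressor(random_state=0), LinearRegression()):
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(type(model).__name__,
          f"MSE={mean_squared_error(y_te, pred):.4f}",
          f"R2={r2_score(y_te, pred):.4f}")
```

The same held-out split is reused for every model, which is what makes the MSE and R2 columns directly comparable across rows.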
Table 4. Comparison between models for eight classes without personality factors.

| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Gradient Boosting | 0.686869 | 0.703067 | 0.686869 | 0.678227 |
| Random Forest | 0.626263 | 0.679717 | 0.626263 | 0.638761 |
| Naive Bayes | 0.373737 | 0.496006 | 0.373737 | 0.316316 |
| KNN | 0.424242 | 0.413786 | 0.424242 | 0.406810 |
Table 5. Comparison between models for three classes without personality factors.

| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Gradient Boosting | 0.737374 | 0.730782 | 0.737374 | 0.731799 |
| Random Forest | 0.808081 | 0.799743 | 0.808081 | 0.783302 |
| Naive Bayes | 0.363636 | 0.650914 | 0.363636 | 0.250237 |
| KNN | 0.585859 | 0.580011 | 0.585859 | 0.582228 |
Table 6. Comparison between models for eight classes with personality factors.

| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Gradient Boosting | 0.666667 | 0.690326 | 0.666667 | 0.663416 |
| Random Forest | 0.676768 | 0.689742 | 0.676768 | 0.674057 |
| Naive Bayes | 0.373737 | 0.496006 | 0.373737 | 0.316316 |
| K-Nearest Neighbors | 0.454545 | 0.510702 | 0.454545 | 0.419838 |
Table 7. Comparison between models for three classes with personality factors.

| Models | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Gradient Boosting | 0.777778 | 0.767169 | 0.777778 | 0.766878 |
| Random Forest | 0.858586 | 0.884001 | 0.858586 | 0.835859 |
| Naive Bayes | 0.363636 | 0.650914 | 0.363636 | 0.250237 |
| K-Nearest Neighbors | 0.636364 | 0.632117 | 0.636364 | 0.632793 |
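In Tables 4–7 the Recall column always equals the Accuracy column, which is consistent with support-weighted averaging of the per-class metrics (weighted recall is identical to accuracy by construction). A short sketch with toy letter-grade labels, assuming scikit-learn's `average="weighted"` convention rather than the paper's stated configuration:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

# Toy letter-grade predictions; the study's labels come from CGPA binning.
y_true = ["A", "B+", "B", "A", "C", "B", "A-", "B"]
y_pred = ["A", "B",  "B", "A", "C", "B", "A",  "B+"]

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="weighted", zero_division=0)
rec = recall_score(y_true, y_pred, average="weighted", zero_division=0)
f1 = f1_score(y_true, y_pred, average="weighted", zero_division=0)
print(f"acc={acc} prec={prec:.4f} rec={rec:.4f} f1={f1:.4f}")

# Weighted recall coincides with accuracy, matching the tables' pattern.
assert abs(acc - rec) < 1e-12
```

`zero_division=0` guards against classes that receive no predictions, which is common with eight narrow grade classes and a small test split.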
Table 8. Performance of regression models with and without personality factors.

| Models | MSE (without) | R2 (without) | MSE (with) | R2 (with) |
|---|---|---|---|---|
| Gradient Boosting Regressor | 0.1162 | 0.6905 | 0.0618 | 0.8352 |
| K-Nearest Neighbors Regressor | 0.1465 | 0.6096 | 0.1434 | 0.6179 |
| Linear Regression | 0.2111 | 0.4376 | 0.1907 | 0.4919 |
| Support Vector Regression | 0.3037 | 0.1907 | 0.3035 | 0.1914 |
Table 9. Performance of classifiers with eight and three classes of letter grade, with and without personality factors.

| Models | Classes | Accuracy (without) | Precision (without) | Recall (without) | F1 (without) | Accuracy (with) | Precision (with) | Recall (with) | F1 (with) |
|---|---|---|---|---|---|---|---|---|---|
| GB | Eight | 0.686869 | 0.703067 | 0.686869 | 0.678227 | 0.666667 | 0.690326 | 0.666667 | 0.663416 |
| GB | Three | 0.737374 | 0.730782 | 0.737374 | 0.731799 | 0.777778 | 0.767169 | 0.777778 | 0.766878 |
| RF | Eight | 0.626263 | 0.679717 | 0.626263 | 0.638761 | 0.676768 | 0.689742 | 0.676768 | 0.674057 |
| RF | Three | 0.808081 | 0.799743 | 0.808081 | 0.783302 | 0.858586 | 0.884001 | 0.858586 | 0.835859 |
| NB | Eight | 0.373737 | 0.496006 | 0.373737 | 0.316316 | 0.373737 | 0.496006 | 0.373737 | 0.316316 |
| NB | Three | 0.363636 | 0.650914 | 0.363636 | 0.250237 | 0.363636 | 0.650914 | 0.363636 | 0.250237 |
| KNN | Eight | 0.424242 | 0.413786 | 0.424242 | 0.406810 | 0.454545 | 0.510702 | 0.454545 | 0.419838 |
| KNN | Three | 0.585859 | 0.580011 | 0.585859 | 0.582228 | 0.636364 | 0.632117 | 0.636364 | 0.632793 |
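The eight-to-three class collapse underlying the "Three" rows above can be sketched as a plain lookup from the eight letter grades of Table 1 to the broad high/average/low bands. The grouping boundaries here are an assumption for illustration; the study's exact cut points may differ:

```python
# Hypothetical grouping of the eight letter grades into three broad bands
# (ASCII hyphens stand in for the minus signs used in the grade labels).
GROUPS = {
    "A": "high", "A-": "high", "B+": "high",
    "B": "average", "B-": "average", "C+": "average",
    "C": "low", "C-": "low",
}

def to_broad_class(letter_grade: str) -> str:
    """Map a letter grade to its high/average/low band."""
    return GROUPS[letter_grade]

print([to_broad_class(g) for g in ["A", "B-", "C"]])
```

Coarser classes pool the sparse grade categories, which is why every classifier in Table 9 scores higher on three classes than on eight.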
Share and Cite

Aslam, M.A.; Murtaza, F.; Haq, M.E.U.; Yasin, A.; Azam, M.A. A Human-Centered Approach to Academic Performance Prediction Using Personality Factors in Educational AI. Information 2024, 15, 777. https://doi.org/10.3390/info15120777