Article

A New Machine-Learning-Driven Grade-Point Average Prediction Approach for College Students Incorporating Psychological Evaluations in the Post-COVID-19 Era

1 School of Mechanical and Electrical Engineering, Henan University of Science and Technology, Luoyang 471003, China
2 School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1928; https://doi.org/10.3390/electronics13101928
Submission received: 19 April 2024 / Revised: 9 May 2024 / Accepted: 12 May 2024 / Published: 15 May 2024

Abstract: With the rapid development of artificial intelligence in recent years, the intelligent evaluation of college students’ growth by means of monitoring data from training processes is becoming a promising technique in the field of intelligent education. Current studies, however, tend to utilize course grades, which are objective, to predict students’ grade-point averages (GPAs), but usually neglect subjective factors like psychological resilience. To solve this problem, this paper takes mechanical engineering as the research object and proposes a new machine-learning-driven GPA prediction approach to evaluate the academic performance of engineering students by incorporating psychological evaluation data into basic course scores. Specifically, this paper adopts SCL-90 psychological assessment data collected in the freshman year, including key mental health indicators such as somatization, depression, hostility, and interpersonal sensitivity, as well as professional basic course scores, including mechanical principles, mechanical design, advanced mathematics, and engineering drawing. Four representative machine learning algorithms covering both deep and shallow models, Support Vector Machine (SVM), CNN-CBAM, Extreme Gradient Boosting (XGBoost), and Classification and Regression Tree (CART), are then employed to build a classification model for GPA prediction. This paper designs a validation experiment by tracking 229 students from the 2020 class of the School of Mechanical and Electrical Engineering of Henan University of Science and Technology, China. The students’ academic performance in senior grades is divided into five classes to use as the prediction labels. It is verified that psychological data and course data can be effectively integrated into GPA prediction for college students, with an accuracy rate of 83.64%.
Meanwhile, this paper also reveals that anxiety indicators in the psychological assessment data have the greatest impact on college students’ academic performance, followed by interpersonal sensitivity. The experimental results also show that psychological factors play a more important role in predicting junior year GPAs than in predicting sophomore year GPAs. A suggestion is therefore given: the current practice in existing undergraduate teaching, i.e., only conducting psychological assessments in the initial freshman year, should be updated by introducing follow-up psychological assessments in each academic year.

1. Introduction

Smart education is regarded as an inevitable trend in the development of global education. Grade-point average (GPA) prediction [1] is becoming one of the critical technologies for realizing smart education. By predicting the future growth quality of students, especially college students, it can help educators determine students’ learning difficulties in time so that effective intervention and personalized teaching can be carried out for different students. Weariness arising from learning difficulties, as well as psychological problems, can then be prevented for such students. With the rapid development of artificial intelligence theory in recent years [2], using machine learning techniques to predict students’ academic performance has become a research hotspot in smart education [3]. Machine learning can reveal potential learning patterns and trends through the analysis of students’ family backgrounds, learning habits, course performance and other data. With such pattern information, a classification model can be established to predict the future growth quality of students. How to build a machine learning model for specific problems [4] and explore its key influencing factors has become a key challenge of intelligent academic performance prediction.
According to our literature research, there have been some preliminary studies on student achievement prediction. Jiang et al. [5] utilized several typical behavioral characteristics of learners selected from MOOCs to predict whether learners can successfully complete learning tasks. Pandey et al. [6] selected 8 important attributes from a total of 18 attributes that affect student achievement by calculating the information gain rate of each attribute and built a decision tree model to predict student achievement. Regarding the influencing factors of student achievement, Thiele et al. [7] believe that students’ demographic characteristics, such as race, gender and economic status, and academic characteristics, such as school type and school performance, are closely linked with their academic performance. Bhardwaj et al. [8] conducted a study on the performance of 300 students at an Indian university and found that their home address, family annual income, living habits, mother’s education level and historical performance had a greater impact on their academic performance. It is clear that current studies generally only use objective data such as course scores and environmental factors to predict GPA trends. Although good prediction results have been achieved, most of these methods are only applicable to a stable growth environment. Once crisis events such as the COVID-19 epidemic, family changes, or learning environment changes occur, GPA trends become hard to predict. As a sudden stress event, COVID-19 not only aggravated the deterioration of interpersonal relationships, moral relativism and value confusion, but also had a profound impact on youth lifestyles, including their attitudes and value orientation towards life [9]. Meanwhile, factors related to life quality, such as food, exercise and lifestyle, affect the happiness of college students to a certain extent [10].
The relationships between life quality and depression, anxiety and interpersonal relationships are complex and interactive, and have an important impact on students’ mental health. The aforementioned objective data are incapable of reflecting the effect of students’ learning behavior, thus reducing the prediction effect.
Generally speaking, students’ internal psychological status and other subjective factors, especially when facing crisis events, will have a direct impact on students’ coping strategies. At present, correlation analyses of students’ mental health and GPA have been widely conducted. Sebastian Freyhofer et al. [11] employed exploratory and confirmatory factor analysis and determined three higher-order factors for college student loneliness during the COVID-19 epidemic. A structural equation model was used to test the sequence mediation model in order to explore the structural relationship between environment, mental health and academic performance. It was found that academic performance could be improved by increasing social activities and reducing procrastination-type negative emotions. By means of a regression technique and a path analysis, Wang et al. [12] found a significant correlation between mobile phone addiction tendency, time management ability and academic performance. Su et al. [13] found that college students’ confidence levels can affect their academic performance using exploratory and confirmatory factor analyses. Luo et al. [14] adopted a two-factor variance F-test and found that personality factors affect college students’ academic performance. A willingness to cooperate and enthusiasm have been proven helpful to improve academic performance. Monzonis-Carda et al. [15] evaluated a total of 266 secondary school students (baseline age: 13.9 ± 0.3 years) using cross-lag modeling to examine the bidirectional longitudinal association between mental health and academic achievement. However, this kind of research mainly uses psychological characteristics data for association analyses and mediation effect tests. This kind of research, being essentially a qualitative analysis, can achieve coarse-grained relationship mining but fails to accurately quantitatively evaluate GPAs, as well as failing to directly build a prediction model. 
Especially for the COVID-19 epidemic, which had a great impact on students’ psychological status, existing studies have only analyzed the correlation between various psychological factors and academic performance, rather than performed in-depth quantitative analyses of the influence of the epidemic on academic performance. Moreover, there are no studies about intelligent prediction of academic performance by means of psychological status. After all, the key to advancing practical applications is to evaluate the future performance based on available data, which facilitates decision-making in educational management.
This paper has the following hypothesis: course scores, especially scores in specialized basic courses, can only reflect students’ ability to refine and learn professional knowledge, and are not enough to fully express students’ long-term learning ability and knowledge acquisition ability. Psychological characteristics data [16] can potentially reflect students’ ability to cope with pressure, emotional regulation, interpersonal communication, self-cognition and adaptation. Such abilities can be used to evaluate students’ mental health status and promote students’ self-cognition. As a result, potential psychological problems can be effectively prevented, which in turn improves student learning. In view of this, both course results and psychological assessments play a key supporting role in the long-term academic training of college students [17]. Therefore, this paper combines the scores of basic courses with psychological data and uses machine learning techniques to realize the intelligent prediction of GPAs during the training process of students.
Specifically, this paper chooses mechanical engineering [18], a typical engineering discipline, as the research object. It is worth noting that the basic courses of mechanical engineering have strongly innovative and comprehensive characteristics, while the training process for students focuses on practical operation, communication and teamwork. In this paper, 540 students from the 2020 class of the School of Mechanical and Electrical Engineering of Henan University of Science and Technology (WEBOMETRICS World Ranking 1510), China, are selected, and their psychological assessment data and professional basic course scores, including the symptom self-assessment scores, advanced mathematics and other basic courses, are studied. Four different types of machine learning algorithms were used for modeling to alleviate the effect of algorithmic uncertainty on the prediction results [19]. These students were all enrolled in school during the epidemic period, and their learning process was discontinuous, with online learning and home learning interspersed. We believe that students’ self-discipline and mental toughness have had a great impact on their academic performance after the epidemic. Finally, a five-level GPA grade is used as the model label for GPA prediction. The results show that psychological data are an effective supplement to course score data, and the prediction effect is significantly improved compared to using only course data. It is also found that two psychological characteristics, anxiety and interpersonal sensitivity, have a great impact on the prediction of students’ GPAs.
The main contribution of this paper is introducing psychological assessment data into the prediction of students’ GPAs and performing a comprehensive evaluation of college students’ growth quality from both subjective and objective perspectives. The merit of this work is that it evaluates students not only from the perspective of their learning abilities, but also from the perspective of their own psychological qualities and individual differences. Psychological characteristics are verified to supplement the deficiencies of course data. According to the authors’ literature research, this paper is the first to apply psychological assessment data to intelligent GPA prediction.

2. Methodology

This section elaborates on the GPA prediction model with four different classification algorithms. The flowchart of the model is shown in Figure 1, including four modules: data acquisition, data preprocessing, GPA level (adding labels), and classification model construction. The classification methods include four algorithms: XGBoost [20], the CART decision tree [21], 1DCNN [22]-CBAM [23] (Convolutional Block Attention Module), and SVM [24] (Support Vector Machine). These algorithms cover both shallow and deep models. Specifically, XGBoost is an ensemble learning model and is currently considered to be among the best for processing tabular data. The CART decision tree can extract classification rules and determine the importance ranking of features. SVM is a typical classification algorithm for small-scale samples. 1DCNN-CBAM is a deep learning model with an attention mechanism based on a convolutional neural network. The attention mechanism is regarded as a useful technique for adaptively obtaining the most relevant features from training data. We believe these four algorithms can comprehensively evaluate the performance of the proposed approach by alleviating the uncertainty introduced by the algorithms themselves.

2.1. Data Acquisition

In this study, the psychological assessment data, as well as the basic professional course scores, of students are collected as a dataset. In universities in China, college students are required to complete the SCL-90 [25] form when they enter university as freshmen. Psychological assessment data can be directly collected from the SCL-90 form. Certainly, other psychological assessments could also be introduced.
Here, we provide a brief introduction to the SCL-90 form. As a symptom self-rating form, SCL-90 is currently a commonly used tool to assess the mental health status of college and middle school students, covering 24 indicators such as somatization, obsessive-compulsive symptoms, interpersonal sensitivity, depression, anxiety, hostility, terror, paranoia, and psychosis. Please refer to the Experiments section for more details. At Chinese universities, the SCL-90 test is usually organized by the school within the first week of the freshman year; students complete their own assessments based on the severity of their psychological symptoms. In the subsequent training process, only the psychological status of students requiring particular attention is assessed. Demographic attributes include the student’s name, age, gender, major class, etc.
Basic course score data include student numbers, course names, scores in each subject, etc. It should be emphasized that the freshman curriculum covers a relatively wide range of subjects, so only the basic professional courses are chosen for modeling. For example, for the discipline of mechanical engineering (the subject of this article’s experiment), only seven subjects are selected, including advanced mathematics, engineering drawing, college English, computer basics, and philosophy. At the same time, the student’s GPA scores for each academic year are also added as input.

2.2. Data Filtering and Preprocessing

Data preprocessing in this study includes data alignment, time filtering, GPA calculation, and a second data alignment according to GPA, as shown in Figure 2. The following steps are required:
(1) Data alignment [26]. Determine and screen out the students from the original data who have both psychological assessment scores and course scores for the first semester of freshman year, as well as average grade point for sophomore year [27].
(2) Time filtering. Retain only the data of students who took more than 200 s to answer the questions in the SCL-90 test, to ensure that they completed their answers carefully.
(3) Course screening. Screen out the professional basic courses from students’ scores and credit data in various subjects. Generally speaking, basic professional courses have higher credits and significance. Representative subjects should be selected for the GPA prediction by calculating the GPA of the basic courses. The calculation formula is as follows:
$\mathrm{GPA} = \dfrac{\sum_{i=1}^{n} (\mathrm{score}_i \times \mathrm{credit}_i)}{\sum_{i=1}^{n} \mathrm{credit}_i}$.
where $\mathrm{score}_i$ and $\mathrm{credit}_i$ represent the grade and credits of the $i$-th course, respectively, and $n$ represents the total number of professional basic courses.
(4) Data alignment: Align the screened psychological assessment scores and professional basic course score to ensure that each student’s psychological assessment scores correspond to the grade point of the professional basic courses.
At this point, the dataset contains 24 attributes of psychological assessment and 11 attributes of professional courses.
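The credit-weighted GPA calculation described in step (3) can be sketched as follows. This is a minimal illustration; the course scores and credit values below are invented for the example and are not taken from the study’s dataset.

```python
# Credit-weighted GPA: sum(score_i * credit_i) / sum(credit_i).

def weighted_gpa(scores, credits):
    """Compute the credit-weighted grade-point average over n courses."""
    if len(scores) != len(credits) or not scores:
        raise ValueError("scores and credits must be non-empty and equal-length")
    total_credits = sum(credits)
    return sum(s * c for s, c in zip(scores, credits)) / total_credits

# Example: three basic courses worth 5, 4, and 3 credits.
scores = [4.0, 3.0, 2.0]
credits = [5, 4, 3]
print(round(weighted_gpa(scores, credits), 4))  # (20 + 12 + 6) / 12 = 3.1667
```

Higher-credit courses thus contribute proportionally more to the GPA, matching the observation that basic professional courses carry higher credits and significance.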

2.3. GPA Label Definition

For the GPA prediction problem, the labels of student instances are mainly calibrated according to the student’s GPA levels [28], which are used as supervision information for training the classifiers. This article evenly divides students’ average grade points into five levels according to the 5-point GPA score. For example, students with a GPA of 0–1 are labeled as 0, while students with a GPA of 4–5 are labeled as 4. Therefore, according to the GPA scores of the sophomore and junior years, each is divided into five grades, which are used as the labels for instances in different experimental settings, for instance, using psychological assessment data and freshman course grades to predict the sophomore GPA and junior GPA, respectively.
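The even division of the 5-point GPA into five labels can be sketched as below. The handling of the boundary value (a GPA of exactly 5.0 mapping to the top label) is our assumption, since the text does not specify it.

```python
# Map a GPA on the 5-point scale onto five integer labels:
# [0, 1) -> 0, [1, 2) -> 1, ..., [4, 5] -> 4.

def gpa_to_label(gpa):
    if not 0.0 <= gpa <= 5.0:
        raise ValueError("GPA must lie in [0, 5]")
    return min(int(gpa), 4)  # floor to the level index; clamp 5.0 into the top level

print([gpa_to_label(g) for g in (0.3, 1.0, 3.7, 5.0)])  # [0, 1, 3, 4]
```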

2.4. Classification Algorithm

This article employs four classification algorithms. The independent variable parameters involved include the evaluation results of each item of the SCL-90 psychological assessment (such as somatization score, obsessive-compulsive score, etc.), the GPA of all courses in the first semester of freshman year, and scores in seven professional basic courses (A1 to A7). The task is to predict a student’s GPA during their sophomore or junior year.
The four classification algorithms include:
(1)
The SVM algorithm.
(2)
The CART algorithm.
(3)
The XGBoost (Extreme Gradient Boosting) algorithm.
(4)
1DCNN-CBAM.
These algorithms use students’ SCL-90 psychological assessment results and course grades as features and use existing data to predict students’ GPA levels. Among them, SVM is a classic small-sample classification algorithm. 1DCNN-CBAM combines a convolutional neural network with a dual-channel attention mechanism to better capture the relationship and importance between features. XGBoost and CART decision tree algorithms can rank the importance of features from the perspective of information entropy and form explicit classification rules. XGBoost is considered to be the best model for processing tabular data at present.

2.4.1. SVM

SVM is a classic supervised learning algorithm, mainly used for classification and regression problems [29]. Following a max-margin strategy that seeks the optimal hyperplane between two classes, SVM is able to achieve good classification performance, especially on small-scale training data. By minimizing the structural risk via convex quadratic programming, SVM has good generalization capability and is robust to noise interference. SVM has proven promising in solving classification problems like network intrusion detection [29]. Since GPA prediction is also a small-scale classification problem, SVM is believed to be suitable for addressing it.
The objective function of SVM can be formalized as follows:
$\min_{w, b} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i$.
where $w$ is the model weight, $b$ is the bias, $\xi_i$ is the slack variable, and $C$ is the regularization parameter that controls the tradeoff between the generalization capability and the empirical risk. The core idea of SVM is to find an optimal hyperplane that separates samples of different categories and maximizes the distance from the nearest sample point to the hyperplane. The goal is to find a decision boundary that maximizes the margin between the sample points of the two categories while meeting the accuracy requirements of classification. SVM has a solid mathematical grounding in seeking the optimal model weights and maintaining an appropriate generalization capability.
With SVM as the classifier for student GPA predictions, we set students’ psychometric assessment results and professional basic course scores, as well as their freshman GPA, as the features. The students’ sophomore GPA level is set as the prediction label [30].
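A minimal sketch of this SVM setup using scikit-learn’s `SVC` is given below. The 33-dimensional feature layout (24 SCL-90 indicators plus course and GPA attributes) mirrors the paper’s description, but the data here are randomly generated placeholders, not the study’s dataset, and the kernel and `C` value are illustrative choices.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(229, 33))    # 229 students x 33 features (placeholder data)
y = rng.integers(0, 5, size=229)  # five GPA levels as labels

# Standardize features, then fit an RBF-kernel SVM on ~75% of the instances,
# as in the paper's train/test split.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X[:171], y[:171])
preds = model.predict(X[171:])
print(preds.shape)  # (58,)
```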

2.4.2. CART

CART is a classic decision tree algorithm that recursively divides a dataset into two subsets until a certain termination condition is met [31]. For classification problems, CART divides the dataset into multiple categories according to the importance values of features; the output category is the classification result. Since it does not require a large amount of training data, CART is highly effective in handling small-scale data. Furthermore, the classification rules are easy to understand and interpret because the decision tree has a clear structure. Even for limited, small-scale data, users can directly observe the tree’s structure and the partitioning rules of its nodes to comprehend the classification process and results. CART has also shown superior performance on multi-classification problems like study program selection for students [31]. These advantages make the CART model an effective tool for handling the GPA prediction problem with small-scale data. The algorithmic flowchart of CART is shown in Figure 3.
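A CART-style tree can be sketched with scikit-learn’s `DecisionTreeClassifier` (Gini criterion), which also exposes the feature-importance ranking the paper relies on. The synthetic data below are constructed so that one feature is informative, purely to illustrate the ranking mechanism.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(229, 5))
# Make feature 2 fully informative so it should dominate the importance ranking.
y = (X[:, 2] > 0).astype(int)

tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X, y)

# Rank features from most to least important, as done for the GPA factors.
ranking = np.argsort(tree.feature_importances_)[::-1]
print(int(ranking[0]))  # feature 2 ranks first
```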

2.4.3. XGBoost

Also built on decision trees, XGBoost is an ensemble learning algorithm that iteratively trains multiple decision trees and combines their prediction results [32]. XGBoost performs well on large-scale datasets and high-dimensional features, is robust to outliers and noisy data, and offers interpretable classification rules. For small-scale datasets, XGBoost also performs well by means of an optimized distributed gradient boosting library. XGBoost provides parallel tree boosting that solves multi-classification problems, e.g., self-admitted technical debt classification [32], in a fast and accurate way. Thus, XGBoost is believed to be suitable for solving the GPA prediction problem.
The objective function of the XGBoost algorithm is:
$L^{(t)} = \sum_{i=1}^{n} \frac{1}{2} h_i \left( f_t(x_i) - \left( -\frac{g_i}{h_i} \right) \right)^2 + \Omega(f_t) + C$.
where $g_i = \frac{\partial l(y_i, \hat{y}_i^{(t-1)})}{\partial \hat{y}_i^{(t-1)}}$ and $h_i = \frac{\partial^2 l(y_i, \hat{y}_i^{(t-1)})}{\partial (\hat{y}_i^{(t-1)})^2}$ represent the first and second derivatives of the loss, respectively, $\hat{y}_i^{(t-1)}$ indicates the prediction for the $i$-th sample after the $(t-1)$-th iteration, and $l(\cdot, \cdot)$ is the loss function, which is assumed to be differentiable. The regression target for the $t$-th tree is $-g_i/h_i$, while $f_t(x_i)$ represents the value predicted by the $t$-th decision tree for sample $i$. $\Omega(f_t)$ represents the model complexity of the $t$-th tree, serving as a regularization term, and $C$ is a constant. The contribution of a sample to the objective value is weighted by $h_i$, so the importance of features can be ranked accordingly. This article uses XGBoost to rank the importance of factors that affect students’ GPA levels in their sophomore year.
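Gradient-boosted trees in the spirit of the objective above can be sketched as follows. Because the `xgboost` package may not be installed everywhere, scikit-learn’s `GradientBoostingClassifier` is used here as a stand-in; for the paper’s actual setup, `xgboost.XGBClassifier` would be substituted. The data and hyperparameters are illustrative placeholders.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(229, 10))    # placeholder features
y = rng.integers(0, 5, size=229)  # five GPA levels

# Boosted ensemble of shallow trees; each stage fits the gradient residuals.
gbm = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
gbm.fit(X, y)

# Per-feature importance scores, analogous to ranking features by their
# contribution to the boosting objective.
print(gbm.feature_importances_.shape)  # (10,)
```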

2.4.4. 1DCNN-CBAM

Convolutional neural networks (CNNs) are renowned for their powerful feature extraction capability, which enables them to effectively learn pivotal features and provide strong support for classification tasks [33]. CBAM is an attention module that combines channel attention and spatial attention [34]. Compared with attention mechanisms that only focus on channels, it can capture the correlations between the spatial and channel dimensions of features more comprehensively, thereby helping the model focus on more important features. 1DCNN-CBAM, employing a one-dimensional CNN, can concentrate on critical information, enhancing the accuracy and robustness of classification [23]. CNNs and CBAM have been successfully applied to antenna selection [33] and vegetable detection [34]. The 1DCNN-CBAM model is therefore believed to be able to provide an effective solution for GPA prediction as a multi-class classification problem.
As shown in Figure 4, the channel attention mechanism in CBAM is used to adjust the importance of different channels to enhance the network’s ability to express channel-related features. The spatial attention mechanism is used to adjust the importance of different spatial locations to enhance the network’s ability to perceive spatial relationships between features. By combining these two attention mechanisms, CBAM can extract features more effectively and improve the performance and generalization ability of the network.
This article uses a CNN structure with wide convolution, as shown in Figure 5. In this structure, the filter width (kernel size) of the convolutional layer is relatively large so more input features can be covered in one convolution operation, thereby extracting richer feature information. The wide convolutional [35] structure helps to increase the perceptual capabilities of the network, allowing it to better capture complex feature patterns in the input data.
In Figure 5, a convolutional layer is used to extract local features from the input data. The pooling layer is used to reduce the dimensionality of features and to retain key information as well. The fully connected layer is used to map the extracted features to the output space. The activation function, usually a rectified linear unit (ReLU) function, is dedicated to nonlinear transformations [36].
The GPA prediction algorithm using a 1DCNN combined with CBAM can be briefly introduced as follows.
Denote by X the inputted student psychological assessment scores. After feature extraction with the 1D CNN and CBAM modules, the feature representation is
$F(X) = \mathrm{CBAM}(\mathrm{CNN}(X))$.
In the CBAM module, the process of calculating the channel attention weight and spatial attention weight can be expressed as follows.
(1) Channel attention weight calculation:
The channel attention weight can be obtained by:
$W_c = \sigma\big(\mathrm{MLP}_{\mathrm{channel}}(\mathrm{AVG}(\mathrm{CNN}(X))) + \mathrm{MLP}_{\mathrm{channel}}(\mathrm{MAX}(\mathrm{CNN}(X)))\big)$.
where $\sigma$ represents the Sigmoid activation function, $\mathrm{MLP}_{\mathrm{channel}}$ represents a shared multi-layer perceptron (MLP) network, $\mathrm{AVG}$ and $\mathrm{MAX}$ represent the global average pooling and global maximum pooling operations, and $+$ denotes element-wise addition of the two MLP outputs.
(2) Spatial attention weight calculation
The spatial attention weight can be obtained by:
$W_s = \sigma\big(\mathrm{MLP}_{\mathrm{spatial}}(\mathrm{AVG}(\mathrm{CNN}(X))) + \mathrm{MLP}_{\mathrm{spatial}}(\mathrm{MAX}(\mathrm{CNN}(X)))\big)$.
where $\mathrm{MLP}_{\mathrm{spatial}}$ represents an MLP network applied along the spatial dimension.
Finally, $W_c$ and $W_s$ are fused with the original features to obtain the CBAM-adjusted features $F(X)$.
The CBAM-adjusted feature representation can be obtained by:
$Y = \mathrm{FC}(F(X))$.
where $\mathrm{FC}$ represents the fully connected layer and $Y$ represents the predicted student performance. With an appropriate loss function and optimization algorithm, the network parameters can be optimized to minimize prediction errors and achieve accurate predictions of student performance.
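The channel attention step above can be sketched in NumPy for 1-D features. Following the original CBAM formulation, the two pooled descriptors pass through a shared MLP and are combined by element-wise addition before the sigmoid. The shapes, the tiny two-layer MLP, and the random weights are illustrative assumptions, not the paper’s trained network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w1, w2):
    """feat: (channels, length) 1-D feature map; w1, w2: shared-MLP weights."""
    avg = feat.mean(axis=1)                       # global average pooling -> (C,)
    mx = feat.max(axis=1)                         # global max pooling -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # shared two-layer MLP with ReLU
    weights = sigmoid(mlp(avg) + mlp(mx))         # channel weights in (0, 1)
    return feat * weights[:, None]                # rescale each channel

rng = np.random.default_rng(3)
C, L, H = 8, 16, 4                                # channels, length, hidden size
feat = rng.normal(size=(C, L))
w1, w2 = rng.normal(size=(H, C)), rng.normal(size=(C, H))
out = channel_attention(feat, w1, w2)
print(out.shape)  # (8, 16)
```

The spatial attention branch is analogous, pooling across channels instead of positions before rescaling.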

3. Experiment

3.1. Task Setting

This article uses the data of a total of 540 students enrolled in 2020 at the School of Mechanical and Electrical Engineering at Henan University of Science and Technology, China, to verify the rationality of this method. Following the data preprocessing in Section 2.2, only 229 students’ data were finally used for the validation. The data for each student included a total of 24 indicators from the SCL-90 form, seven freshman (G1) professional basic courses (college English, engineering drawing, advanced mathematics, philosophy, college computer basics, physical education, general law), the GPA for the first semester of freshman year (G1-1), and the GPA for the second semester of freshman year (G1-2). A total of 33 characteristics were used for modeling. In addition, the sophomore (G2) year GPA and the junior year GPA were used as the prediction targets of separate tasks. A total of 75% of the 229 student instances were chosen for training, while the rest were used for testing. Nine prediction tasks were then constructed, as shown in Table 1 [37]. Table 1 shows the four methods used as prediction models, SVM, CART decision tree, XGBoost, and 1DCNN-CBAM, referred to as Tasks 1 to 4, respectively. Tasks 1 to 4 were mainly used to verify the predictability of GPAs under different methods. In addition, in order to verify the importance of the psychological data, we conducted further experiments in Tasks 5 to 9. In Task 5, only the freshman academic performance was used as input to predict the sophomore year GPA. In Task 6, only psychological data were used to predict GPAs. In Task 7, the psychological data and freshman academic performance were used to predict junior year GPAs. In Task 8, the psychological data and sophomore academic performance were used to predict junior year GPAs. In Task 9, the psychological data plus freshman and sophomore academic performance were used to predict junior year GPAs.
Tasks 5 to 9 all use the CART decision tree as the prediction model. Tasks 5 to 9 mainly test the impact of different training data when using the same prediction algorithm. Among them, the training data used in Task 9 cover the whole education process.
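The nine task configurations described above can be summarized in a small lookup structure like the one below. The mapping paraphrases Section 3.1; the feature-group names are our own shorthand, not column names from the study’s dataset.

```python
# Shorthand feature groups (illustrative names).
PSY = ["scl90"]        # 24 SCL-90 psychological indicators
G1 = ["g1_courses"]    # freshman course scores and semester GPAs
G2 = ["g2_gpa"]        # sophomore academic performance

tasks = {
    1: {"model": "SVM",        "features": PSY + G1,      "target": "G2 GPA"},
    2: {"model": "CART",       "features": PSY + G1,      "target": "G2 GPA"},
    3: {"model": "XGBoost",    "features": PSY + G1,      "target": "G2 GPA"},
    4: {"model": "1DCNN-CBAM", "features": PSY + G1,      "target": "G2 GPA"},
    5: {"model": "CART",       "features": G1,            "target": "G2 GPA"},
    6: {"model": "CART",       "features": PSY,           "target": "G2 GPA"},
    7: {"model": "CART",       "features": PSY + G1,      "target": "G3 GPA"},
    8: {"model": "CART",       "features": PSY + G2,      "target": "G3 GPA"},
    9: {"model": "CART",       "features": PSY + G1 + G2, "target": "G3 GPA"},
}
print(len(tasks))  # 9
```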

3.2. Parameter Setting

Table 2 lists the self-rating attributes from the SCL-90 form. For the complete survey data, please refer to the following URL: https://github.com/study490/Learning-and-Mental-Scores (accessed on 11 May 2024). The data are split into two sections. The first section contains measures from the mental health scale, including scores on somatization, obsessive-compulsive symptoms, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation, and psychoticism. The second section contains data on academic performance, including GPAs from multiple semesters and the scores from different courses. The data collected from the SCL-90 form and data on academic performance can present a synthesized picture of a student’s mental health and learning ability, providing a solid basis for GPA prediction [38].
This article employs the following evaluation metrics.
Accuracy: used to measure the proportion of samples that the classification model predicts correctly over the entire dataset, as follows:
$\mathrm{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$.
Recall: used to measure how many of all real positive category samples are correctly predicted as positive categories by the model. In other words, Recall tells us how complete and comprehensive the model is in identifying positive class samples, as follows:
Recall = TP / (TP + FN),
where TP refers to the number of positive data predicted to be positive, TN refers to the number of negative data predicted to be negative, FP refers to the number of negative data predicted to be positive, and FN refers to the number of positive data predicted to be negative [39].
F-score: used to comprehensively evaluate classification performance by taking both Precision and Recall into account, as follows:
F-score = (2 × Precision × Recall) / (Precision + Recall),
where Precision is the fraction of true positive examples among the examples that the model classified as positive, i.e., the number of true positives divided by the sum of true positives and false positives. The F-score ranges from 0 to 1; the higher the value, the better the performance of the model. When both Precision and Recall are high, the F-score is also high.
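The three metrics above can be computed directly from the TP/TN/FP/FN counts; the following minimal implementation with toy labels is a sketch, not the authors' evaluation code:

```python
import numpy as np

def metrics(y_true, y_pred, positive=1):
    # Count the four confusion-matrix cells for the chosen positive class.
    tp = np.sum((y_pred == positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, recall, precision, f_score

y_true = np.array([1, 1, 1, 0, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1])
acc, rec, prec, f1 = metrics(y_true, y_pred)
```

Here TP = 2, TN = 2, FP = 1, and FN = 1, so Accuracy = 4/6 and Recall, Precision, and F-score all equal 2/3.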

3.3. Result Analysis

Figure 6 shows the results of all prediction tasks. From Task 1 to Task 4, the prediction performance is relatively stable, indicating that the choice of prediction algorithm makes little difference. It also proves that psychological data and academic data can be used to predict GPA levels. When only subject scores (Task 5) or only psychological data (Task 6) are used, the prediction performance decreases, which again verifies that neither kind of data alone provides enough evaluation information for GPA prediction. We also find an interesting phenomenon: the numerical results of all tasks are not very high, reaching no more than 40%. The Accuracy values of Tasks 1 to 6 are obviously lower than those of the other tasks, and Tasks 1 to 6 also obtain Recall scores below 50%. The reason is the significant class imbalance in the training data. Specifically, training samples with a GPA of 3 account for approximately 50% of the whole dataset, which biases the classification model towards these samples and, not surprisingly, inflates the Accuracy value, while the model's discriminative performance becomes inferior for samples of the other GPA grades. A further inspection of the test set for Tasks 1 to 6 in Figure 6 reveals a starkly divergent distribution: the test samples are primarily concentrated at GPA grade 1, with only one sample at GPA grade 3. Such a discrepancy in class distribution between the test and training sets significantly reduces the test classification performance, also leading to lower Accuracy values for Tasks 1 to 6. Moreover, the imbalanced class distribution in the training data leads to poor performance in predicting minority classes, with their Recall values even approaching zero. This weak discriminative capability regarding minority classes drags down the overall Recall value.
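One standard counter to the class imbalance described here is to reweight classes during CART training; the sketch below uses scikit-learn's class_weight="balanced" option on synthetic labels that mimic the roughly 50% share of GPA grade 3 (the data are invented for illustration, not the authors' method):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced labels: grade 3 dominates, as in the training data.
y = np.array([3] * 50 + [1] * 20 + [2] * 15 + [4] * 10 + [5] * 5)
X = np.random.default_rng(1).normal(size=(len(y), 33))

# class_weight="balanced" reweights each class by n / (k * count),
# discouraging the tree from collapsing onto the majority class.
clf = DecisionTreeClassifier(class_weight="balanced", random_state=0).fit(X, y)

counts = dict(zip(*np.unique(y, return_counts=True)))
print(counts[3] / len(y))  # grade 3 is 50% of the data
```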
Figure 7 shows the confusion matrices for Task 1 to Task 9, used to evaluate the performance of the classification models. Figure 7 provides detailed insights into the models' classification accuracy by summarizing the relationship between the predictions and the actual GPA grades. From Figure 7, we can clearly identify the numbers of TP, FN, FP and TN samples introduced above. The prediction accuracy of Task 5 and Task 6 in the confusion matrices is significantly reduced, while the prediction accuracy of Task 2, Task 7, and Task 8 gradually increases, which is in line with the observations in Figure 6.
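A confusion matrix of the kind shown in Figure 7 can be produced as follows; the labels here are toy values over the five GPA grades, not the paper's results:

```python
from sklearn.metrics import confusion_matrix

# Toy actual vs. predicted GPA grades; rows = actual, columns = predicted.
y_true = [1, 2, 3, 3, 4, 5, 3, 2]
y_pred = [1, 2, 3, 4, 4, 5, 3, 3]
cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3, 4, 5])

# Diagonal entries count correct predictions; off-diagonals are errors.
print(cm.trace(), cm.sum())  # 6 correct out of 8 samples
```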
As Figure 8, Figure 9 and Figure 10 show, the psychological data were collected only when the freshmen began school [40]. The psychological factors in Task 2 (predicting sophomore year grades) therefore still have a certain impact on the final prediction result, but in Tasks 7 and 8 (predicting junior year grades), psychological factors have only a small impact on the final prediction results. This shows that psychological factors play a greater role in predicting sophomore year grades than junior year grades. We therefore hope to conduct follow-up psychological assessments in each school year in the future.
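The per-feature importances behind Figures 8–10 can be read from a fitted CART; the sketch below uses synthetic data in which the first feature drives the label (the setup is illustrative, not the authors' pipeline):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# The label depends mainly on feature 0, weakly on feature 1.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
importances = tree.feature_importances_  # Gini-based, one value per feature
print(importances.argmax())  # feature 0 ranks highest
```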
The loss box plot [41] in Figure 11 shows the distribution of loss values during the training of the convolutional-neural-network-based prediction model. Each box summarizes the loss values of one epoch across 10 repetitions of the experiment, for a total of 100 epochs. Outliers fall below the boxes at the beginning of training and slightly above them at the end, suggesting occasional unusually low loss values in the initial stage and occasional unusually high loss values in the later stage. Both the boxes and the outliers show an overall decreasing trend, indicating that the loss decreases as the number of training epochs increases, i.e., the training effect gradually improves. The model fits the data well in the initial stage and achieves a good training effect in the middle stage, but the occasional loss increases in the later stage may require further analysis and adjustment of the model or the training strategy.
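The statistics behind a per-epoch loss box plot can be computed as follows; the sketch simulates 10 repeated runs of 100 epochs with an exponentially decaying loss (purely synthetic) and applies the usual 1.5×IQR rule under which points outside the whiskers are drawn as outliers:

```python
import numpy as np

rng = np.random.default_rng(0)
# 10 repetitions x 100 epochs: decaying mean loss plus small noise.
losses = np.exp(-np.arange(100) / 30)[None, :] \
    + 0.01 * rng.normal(size=(10, 100))

# Per-epoch quartiles define each box; the 1.5*IQR rule flags outliers.
q1, q3 = np.percentile(losses, [25, 75], axis=0)
iqr = q3 - q1
is_outlier = (losses < q1 - 1.5 * iqr) | (losses > q3 + 1.5 * iqr)
median = np.median(losses, axis=0)
```

A matplotlib `boxplot` call over the columns of `losses` would render exactly these statistics.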

4. Discussion

The experimental results listed above reveal the following observations:
(1)
Only using course scores or only using psychological assessment data cannot guarantee a good prediction performance.
(2)
Utilizing similar learning patterns (e.g., major courses with only minor differences in learning approaches) and the course scores of the sophomore year to predict the following year's GPA (junior year) results in better prediction performance. When learning patterns with significant differences (freshman courses) are used to predict the following year's grades (sophomore year), the model's prediction accuracy is rather low.
(3)
Incorporating course scores with substantial differences can even have a negative impact on the prediction results. For instance, the Accuracy value of Task 9, in which the freshman course scores are added to predict the junior year GPA, decreases to some extent.
(4)
With an increase in students’ grades, the influence of psychological assessment data on GPA decreases year on year.
(5)
Paranoia, hostility, interpersonal sensitivity, and anxiety are the pivotal psychological factors that have a larger impact on students’ GPA, and these factors can be significantly influenced by the epidemic.
(6)
Annual assessments of psychological health are important for university freshmen in the post-COVID-19 era, and they can also enhance the prediction effectiveness of machine learning models.
These points are elaborated in the following.
Figure 6 and Figure 7 demonstrate the prediction results for all prediction tasks. It is evident that, from Task 1 to Task 4, the prediction performance remains relatively stable, indicating that the choice of prediction algorithm makes little difference. They also prove that psychological assessment data and course scores can be used to predict GPA. Additionally, Tasks 1 to 4 utilize psychological assessments and freshman course scores to predict the sophomore year GPA. Since students learn freshman course material in a learning pattern that differs significantly from high school, Tasks 1 to 4 achieve a lower classification accuracy than Task 8 and Task 9. When only subject scores (Task 5) or only psychological assessment data (Task 6) are used, the prediction performance decreases; clearly, neither kind of data alone provides comprehensive enough information for college student GPA prediction. Task 8 achieves the best prediction results, which indicates that the courses and study habits in the sophomore year are nearly identical to those in the junior year. Identical learning patterns allow the junior year GPA to be easily predicted by means of sophomore course results. We also find that adding freshman courses to the inputs of Task 8 (yielding Task 9) decreases the prediction results to some extent. It is interesting that freshman courses have a negative impact on the prediction of the junior year GPA. We speculate that the reason is that freshman courses are linked to students' learning patterns before entering university; the learning pattern generally changes greatly at university, thus reducing the prediction performance for the junior year GPA.
To better identify the critical factors influencing GPA prediction, we have quantified the importance of psychological factors in three representative tasks (Task 2, Task 7, and Task 8). From Figure 8, it is clear that psychological data play a certain auxiliary role in predicting the sophomore year GPA. For predicting the junior year GPA, however, the impact of psychological assessment data decreases significantly, as shown in Figure 9 and Figure 10. This is because the psychological data were gathered at freshman enrollment; over time, students' psychological states change, gradually reducing the impact of psychological factors on GPA prediction. Moreover, Figure 8 illustrates that, among the psychological factors, paranoia, hostility, interpersonal sensitivity, and anxiety exert the largest influence on students' GPAs. These four factors are closely related to individuals' emotional states: paranoia and hostility lead to interpersonal tensions, thereby increasing sensitivity and anxiety in interpersonal relationships, while long-term anxiety and stress trigger paranoid and hostile emotional responses. These factors are also easily influenced by the epidemic. Psychological studies [42,43] have demonstrated that, during major emergencies, humans are prone to experiencing adverse emotional responses, including unease, fear, and panic, which often result in behaviors such as withdrawal and avoidance. The unexpected outbreak of COVID-19 caused widespread psychological and social ripple effects. In particular, university students, as a distinct group, have undergone significant shifts in their study habits and living arrangements due to the influence of the epidemic [44]. In light of the virus's infectivity and transmission rate, it was imperative for university students to minimize the chances of infection by refraining from close contact.
A closed-off home environment leads to a sense of distrust among college students, resulting in varying degrees of psychological problems and obstacles, including interpersonal sensitivity, hostility, and paranoia [45]. Certain studies have revealed that, in the aftermath of the pandemic, the prevalence of psychological conditions such as depression and anxiety among student populations remained notably elevated [46]. The learning patterns of university students also underwent substantial changes during the epidemic. Following the outbreak of COVID-19, the closure of campuses necessitated a shift towards online learning, presenting a novel challenge in terms of educational modalities. Unsupervised online learning requires students to cultivate excellent study habits and proficient time management skills, while also addressing technical challenges such as intermittent internet connectivity and equipment malfunctions. Additionally, the absence of interaction and face-to-face communication gives rise to feelings of isolation and stress, subsequently resulting in diminished interest in learning. The epidemic also compelled educational departments to adjust and change examination systems. The decision to postpone or cancel significant exams, including college and university entrance examinations, caused significant stress and unpredictability for university students, leading to negative emotional responses such as anxiety and depression. Furthermore, the financial constraints imposed on families by the epidemic have a negative impact on students' psychological well-being and academic motivation, subsequently affecting their academic performance. Therefore, it is imperative to promptly identify and address students' psychological statuses and enhance their academic performance.

5. Conclusions

In this paper, a new machine-learning-driven GPA prediction approach is proposed to evaluate the academic performance of upper-year college students [47]. The essential idea is to incorporate psychological assessment data into GPA prediction using course scores. Four different types of machine learning algorithms are employed to build a classification model for GPA prediction. From the experimental results on a real-world dataset from Henan University of Science and Technology, the following conclusions can be drawn:
(1)
Psychological assessment data are proven to be promising for supplementing subjective information to predict junior year GPAs. Utilizing psychological data helps develop the discriminative capability of machine learning models beyond the objective information contained in basic course scores.
(2)
Using only course scores or only psychological data decreases the GPA prediction effect. Even with both kinds of data, the overall prediction accuracy reaches only around 80%, due to the inevitable noise in the data, which come directly from the educational administration system.
(3)
Psychological data play a greater role in predicting GPAs in the sophomore year than in the junior year. Clearly, follow-up psychological assessments in subsequent academic years would be beneficial for GPA prediction.
The most interesting finding of this study is the quantification of the effect of four psychological factors, i.e., paranoia, hostility, interpersonal sensitivity and anxiety, on GPA prediction. What is particularly concerning is that these psychological factors were significantly impacted during the COVID-19 epidemic. In order to promote the all-round development of students, some educational operations are suggested, such as heterogeneous grouping, peer support and teacher guidance, which are able to provide university students with timely psychological counselling and interventions. These measures are designed to help students cope with psychological challenges and improve their academic performance and overall quality. Moreover, more effective machine learning techniques will be introduced in future work to solve the potential class imbalance problem. A robust and reliable prediction model is vital to this study. The authors are also planning to analyze and quantify the potential relationship between courses and psychological factors by using a graph neural network. Once such a relationship is found, a bridge between university education and psychological studies will be established.

Author Contributions

Conceptualization, T.Z. and Z.Z. (Zhidan Zhong); methodology, W.M.; software, W.M.; validation, T.Z., Z.Z. (Zhidan Zhong), W.M., Z.Z. (Zhihui Zhang) and Z.L.; formal analysis, T.Z.; investigation, Z.Z. (Zhidan Zhong); resources, W.M.; data curation, T.Z.; writing—original draft preparation, Z.Z. (Zhihui Zhang) and Z.L.; writing—review and editing, T.Z., Z.Z. (Zhidan Zhong) and W.M.; visualization, Z.Z. (Zhihui Zhang) and Z.L.; supervision, T.Z., Z.Z. (Zhidan Zhong) and W.M.; project administration, T.Z., Z.Z. (Zhidan Zhong) and W.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study was partially funded by the Henan Province Undergraduate College Research Teaching Reform Research and Practice Project 2022SYJXLX026 (Research Teaching Reform and Practice of Mechanical Major Courses in the Collaboration of Industry, Science and Education) and the Henan Province Postgraduate Ideological and Political Demonstration Course Project YJS2023SZ11 ("Machine Learning and Signal Processing").

Data Availability Statement

Part of the dataset created by the authors and used in this paper to train the machine learning prediction models is available online at [25].

Acknowledgments

We thank the reviewers for reviewing this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GPA      Grade-Point Average
SVM      Support Vector Machine
XGBoost  Extreme Gradient Boosting
CART     Classification and Regression Tree
CNN      Convolutional Neural Network
CBAM     Convolutional Block Attention Module
SCL-90   Symptom Checklist-90 (Self-Reporting Inventory)

References

  1. Tatar, A.E.; Düştegör, D. Prediction of academic performance at undergraduate graduation: Course grades or grade point average? Appl. Sci. 2020, 10, 4967. [Google Scholar] [CrossRef]
  2. Lu, Y. Artificial intelligence: A survey on evolution, models, applications and future trends. Expert Syst. Appl. 2019, 136, 252–263. [Google Scholar] [CrossRef]
  3. Wu, L.; Jiang, M.; Du, Q.; Jiang, Z.; Zhang, Y. Dynamics, Hot Spots and Future Directions of Educational Research in the Intelligent Era. In Artificial Intelligence in Education and Teaching Assessment; Springer: Singapore, 2021; pp. 261–272. [Google Scholar]
  4. Sullivan, E. Understanding from machine learning models. Br. J. Philos. Sci. 2022, 73, axz035. [Google Scholar] [CrossRef]
  5. Jiang, Z.; Zhang, Y.; Li, X. Analysis and Prediction of Learning Behavior based on MOOC data. Comput. Res. Dev. 2015, 52, 614–628. [Google Scholar]
  6. Pandey, M.; Sharma, V.K. A decision tree algorithm pertaining to the student performance analysis and prediction. Int. J. Comput. Appl. 2013, 21, 15. [Google Scholar] [CrossRef]
  7. Thiele, T.; Singleton, A.; Pope, D. Predicting students' academic performance based on school and socio-demographic characteristics. Stud. High. Educ. 2016, 41, 1424–1446. [Google Scholar] [CrossRef]
  8. Bhardwaj, B.K.; Pal, S. Data Mining: A prediction for performance improvement using classification. arXiv 2012, arXiv:1201.3418. [Google Scholar]
  9. Kralik, R. The Influence of Family and School in Shaping the Values of Children and Young People in the Theory of Free Time and Pedagogy. J. Educ. Cult. Soc. 2023, 14, 249–268. [Google Scholar] [CrossRef]
  10. Petrovič, F.; Murgaš, F.; Králik, R. Food, exercise and lifestyle are predictors of a hedonic or eudaimonic quality of life in university students? Acta Missiologica 2023, 17, 99–114. [Google Scholar]
  11. Sebastian, F.; Niklas, Z.; Elisabeth, M.J.; Michaela, C.S. Depression and Anxiety in Times of COVID-19: How Coping Strategies and Loneliness Relate to Mental Health Outcomes and Academic Performance. Front. Psychol. 2021, 12, 682684. [Google Scholar]
  12. Wang, C. A study on the relationship between Mobile phone addiction, Time Management and academic performance of Vocational College students. J. Yellow River Conserv. Tech. Coll. 2019, 31. [Google Scholar] [CrossRef]
  13. Su, B.; Cui, W. A study on the relationship between college students’ confidence level and their academic performance. Sci. Theory 2013, 18, 314–315. [Google Scholar]
  14. Luo, B.; Chen, W. A study on the correlation between college students’ academic performance and Personality Characteristics. J. Wuhan Univ. Sci. Technol. 2001, 1, 77–79. [Google Scholar]
  15. Monzonís-Carda, I.; Rodriguez-Ayllon, M.; Adelantado-Renau, M. Bidirectional longitudinal associations of mental health with academic performance in adolescents: DADOS study. Pediatr. Res. 2023; ahead of print. [Google Scholar]
  16. Pace, C.R.; Stern, G.G. An approach to the measurement of psychological characteristics of college environments. J. Educ. Psychol. 1958, 49, 269. [Google Scholar] [CrossRef]
  17. Soares, A.P.; Guisande, A.M.; Almeida, L.S. Academic achievement in first-year Portuguese college students: The role of academic preparation and learning strategies. Int. J. Psychol. 2009, 44, 204–212. [Google Scholar] [CrossRef] [PubMed]
  18. Enelund, M.; Knutson Wedel, M.; Lundqvist, U. Integration of education for sustainable development in the mechanical engineering curriculum. Australas. J. Eng. Educ. 2013, 19, 51–62. [Google Scholar] [CrossRef]
  19. Dwyer, D.B.; Falkai, P.; Koutsouleris, N. Machine learning approaches for clinical psychology and psychiatry. Annu. Rev. Clin. Psychol. 2018, 14, 91–118. [Google Scholar] [CrossRef] [PubMed]
  20. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  21. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef]
  22. Abdoli, S.; Cardinal, P.; Koerich, A.L. End-to-end environmental sound Classification using a 1D convolutional neural network. Expert Syst. Appl. 2019, 136, 252–263. [Google Scholar] [CrossRef]
  23. Woo, S.; Park, J.; Lee, J.Y. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  24. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  25. Tiantian, Z. Learning-and-Mental-Scores Data. Available online: https://github.com/study490/Learning-and-Mental-Scores (accessed on 1 April 2024).
  26. Villena-Martinez, V.; Oprea, S.; Saval-Calvo, M. When deep learning meets data alignment: A review on deep registration networks (drns). Appl. Sci. 2020, 10, 7524. [Google Scholar] [CrossRef]
  27. Gerdes, H.; Mallinckrodt, B. Emotional, social, and academic adjustment of college students: A longitudinal study of retention. J. Couns. Dev. 1994, 72, 281–288. [Google Scholar] [CrossRef]
  28. Franz, D.J.; Richter, T.; Lenhard, W. The influence of diagnostic labels on the evaluation of students: A multilevel meta-analysis. Educ. Psychol. Rev. 2023, 35, 17. [Google Scholar] [CrossRef]
  29. Aburomman, A.A.; Reaz, M.B.I. A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems. Inf. Sci. 2017, 414, 225–246. [Google Scholar] [CrossRef]
  30. Taha, Z.; Musa, R.M.; Koerich, A.L. The identification of high potential archers based on relative psychological coping skills variables: A support vector machine approach. IOP Conf. Ser. Mater. Sci. Eng. 2018, 319, 012027. [Google Scholar] [CrossRef]
  31. Subarkah, P.; Ikhsan, A.N.; Setyanto, A. The effect of the number of attributes on the selection of study program using classification and regression trees algorithms. In Proceedings of the 2018 3rd International Conference on Information Technology, Information System and Electrical Engineering (ICITISEE), Yogyakarta, Indonesia, 13–14 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–5. [Google Scholar]
  32. Chen, X.; Yu, D.; Fan, X.; Wang, L.; Chen, J. Multiclass classification for self-admitted technical debt based on XGBoost. IEEE Trans. Reliab. 2021, 71, 1309–1324. [Google Scholar] [CrossRef]
  33. Kim, J.; Joung, J.; Jeong, E.R. Transmit Antenna Selection Using CNN-Based Multiclass Classification with Linear Interpolation of Wideband Channels. In Proceedings of the 2023 Fourteenth International Conference on Ubiquitous and Future Networks (ICUFN), Paris, France, 4–7 July 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 222–224. [Google Scholar]
  34. Luo, Q.; Zhang, Z.; Yang, C.; Lin, J. An Improved Soft-CBAM-YoloV5 Algorithm for Fruits and Vegetables Detection and Counting. In Proceedings of the 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Haikou, China, 18–20 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 187–192. [Google Scholar]
  35. Liu, G.; Dang, M.; Liu, J. True wide convolutional neural network for image denoising. Inf. Sci. 2022, 610, 171–184. [Google Scholar] [CrossRef]
  36. Liu, Y.; Zhang, J.; Gao, C. Natural-logarithm-rectified activation function in convolutional neural networks. In Proceedings of the 2019 IEEE 5th International Conference on Computer and Communications, Chengdu, China, 6–9 December 2019; pp. 2000–2008. [Google Scholar]
  37. Ceruti, M.G.; Kamel, M.N.d. Preprocessing and integration of data from multiple sources for knowledge discovery. Int. J. Artif. Intell. Tools 1999, 8, 157–177. [Google Scholar] [CrossRef]
  38. Holi, M. Assessment of Psychiatric Symptoms Using the SCL-90. Ph.D. Thesis, Medical Faculty of the University of Helsinki, Helsinki, Finland, 2003. [Google Scholar]
  39. Vujović, Ž. Classification model evaluation metrics. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 599–606. [Google Scholar] [CrossRef]
  40. Lewis, R.J. An introduction to classification and regression tree (CART) analysis. In Annual Meeting of the Society for Academic Emergency Medicine in San Francisco, California; Department of Emergency Medicine Harbor-UCLA Medical Center Torrance: San Francisco, CA, USA, 2000; Volume 14. [Google Scholar]
  41. Lem, S.; Cardinal, P.; Verschaffel, L. The heuristic interpretation of box plots. Learn. Instr. 2013, 26, 22–35. [Google Scholar] [CrossRef]
  42. Chang, B.P. Are there long-term consequences to psychological stress during a medical event? Acad. Emerg. Med. 2020, 27, 173–175. [Google Scholar] [CrossRef] [PubMed]
  43. Kumar, P.; Kamal, S.; Tuli, S.; Gupta, N. COVID-19 and manifest psychological morbidity: A case series. Indian J. Psychiatry 2021, 63, 294–296. [Google Scholar] [PubMed]
  44. Kecojevic, A.; Basch, C.H.; Sullivan, M.; Davi, N.K. The impact of the COVID-19 epidemic on mental health of undergraduate students in New Jersey, cross-sectional study. PLoS ONE 2020, 15, e0239696. [Google Scholar] [CrossRef] [PubMed]
  45. Rajkumar, E.; Rajan, A.M.; Daniel, M. The psychological impact of quarantine due to COVID-19: A systematic review of risk, protective factors and interventions using socio-ecological model framework. Heliyon 2022, 8, e09765. [Google Scholar] [CrossRef]
  46. Farfán-Latorre, M.; Estrada-Araoz, E.G.; Lavilla-Condori, W.G. Mental health in the post-pandemic period: Depression, anxiety, and stress in Peruvian university students upon return to face-to-face classes. Sustainability 2023, 15, 11924. [Google Scholar] [CrossRef]
  47. Kishor, K.; Sharma, R.; Chhabra, M. Student performance prediction using technology of machine learning. In International Conference on Micro-Electronics and Telecommunication Engineering; Springer Nature: Singapore, 2021; pp. 541–551. [Google Scholar]
Figure 1. Flowchart of GPA prediction with machine learning techniques.
Figure 2. Sketch map of data filtering and preprocessing.
Figure 3. Algorithmic flowchart of CART.
Figure 4. Introduction of the CBAM module.
Figure 5. Introduction to a classical CNN model.
Figure 6. Numerical prediction results from Task 1 to Task 9.
Figure 7. Confusion matrices of different tasks. The subfigures are the results from Task 1 to Task 9.
Figure 8. The importance of different independent variable factors in Task 2.
Figure 9. The importance of different independent variable factors in Task 7.
Figure 10. The importance of different independent variable factors in Task 8.
Figure 11. Task 4 iterative loss box plot.
Table 1. Settings of Task 1 to Task 9.
Serial Number | Task | Input Data | Output
Task 1 | SVM | Freshman course and psychological assessment | Sophomore year GPA
Task 2 | CART | Freshman course and psychological assessment | Sophomore year GPA
Task 3 | XGBoost | Freshman course and psychological assessment | Sophomore year GPA
Task 4 | CNN-CBAM | Freshman course and psychological assessment | Sophomore year GPA
Task 5 | CART | Freshman course | Sophomore year GPA
Task 6 | CART | Psychological assessment | Sophomore year GPA
Task 7 | CART | Freshman course and psychological assessment | Junior year GPA
Task 8 | CART | Sophomore course and psychological assessment | Junior year GPA
Task 9 | CART | Freshman + sophomore courses and psychological assessment | Junior year GPA
Table 2. SCL-90 attribute form.
Measurement Items | Score | Measurement Items | Score
Somatization | 17 | Obsessive Compulsivity | 17
Interpersonal Sensitivity | 16 | Depression | 21
Anxiety | 15 | Hostility | 12
Phobic Anxiety | 15 | Paranoid Ideation | 13
Psychoticism | 16 | Other Total | 15
Total | 157 | Positive Items | 53
Somatization Mean | 1.42 | Obsessive Compulsivity Mean | 1.7
Interpersonal Sensitivity Mean | 1.78 | Depression Mean | 1.62
Anxiety Mean | 1.5 | Hostility Mean | 2
Phobic Anxiety Mean | 2.14 | Paranoid Ideation Mean | 2.17
Psychoticism Mean | 1.6 | Other Mean | 2.14
Total Mean | 1.74 | Positive Items Mean | 2.26
GPA1-1 | 4.476 | GPA1-2 | 4.9
A1 | 85 | A2 | 86
A3 | 81 | A4 | 90
A5 | 82 | A6 | 84
A7 | 96 | GPA2 | 4.53
GPA3 | 4.57 | |

Share and Cite

MDPI and ACS Style

Zhang, T.; Zhong, Z.; Mao, W.; Zhang, Z.; Li, Z. A New Machine-Learning-Driven Grade-Point Average Prediction Approach for College Students Incorporating Psychological Evaluations in the Post-COVID-19 Era. Electronics 2024, 13, 1928. https://doi.org/10.3390/electronics13101928
