Proceeding Paper

Design of a Prediction Model to Predict Students’ Performance Using Educational Data Mining and Machine Learning †

Department of Computer Science, Sri Krishna Arts and Science College, Coimbatore 641008, Tamilnadu, India
* Author to whom correspondence should be addressed.
Presented at the International Conference on Recent Advances on Science and Engineering, Dubai, United Arab Emirates, 4–5 October 2023.
Eng. Proc. 2023, 59(1), 25; https://doi.org/10.3390/engproc2023059025
Published: 12 December 2023
(This article belongs to the Proceedings of Eng. Proc., 2023, RAiSE-2023)

Abstract
Higher education can aid the development of a knowledge- and information-based society. Through research and extension efforts, higher education institutions must perform a variety of functions, including building an intelligent human resource pool, developing new skills, and creating new knowledge. As a result, the primary focus of higher education institutions is the development of skilled workers with the ability to think critically, creatively, and logically. However, there are significant obstacles to offering quality education, such as identifying low-performing students and the causes of their underperformance. Predicting student performance has become challenging as a result of the vast quantity of data in educational databases, and many institutions also lack a developed system for assessing and monitoring student achievement. There are primarily two causes for this situation. The first is inadequate study of the various prediction techniques, making it difficult to select the ones that best predict students’ success in educational environments. The second is the lack of investigation into the courses themselves. In this research work, efforts have been made to identify low-performing students through the proposed Back Propagation Neural Network for Student Performance Analysis (BPNN-SPA) model, which generates more accurate, efficient, and dependable results than some existing techniques and models. The performance of the proposed model is compared with the Support Vector Machine and Random Decision algorithms and evaluated using four significant performance metrics, namely sensitivity, specificity, accuracy, and the F-measure. Based on these measures, the proposed BPNN-SPA achieved better accuracy than the existing algorithms.

1. Introduction

Technology evolves swiftly, and this advancement has created massive volumes of data, which are now everywhere; educational institutions are no exception [1]. Higher education institutions (HEIs) have faced two major issues recently. The first is the exploration of educational big data. The second is analysing massive educational data sets to find crucial patterns, facts, and linkages for education and decision-making. Data mining and machine learning can now extract such knowledge and hidden patterns [2]. The analysis of student performance has been a major focus of Educational Data Mining (EDM) and Machine Learning Analytics (MLA) [3]. Educational institutions use EDM and MLA to forecast student performance, which helps students improve academically and allows instructors and decision-makers to monitor individual students, identify at-risk students, and take prompt corrective action [4]. Many academics utilize EDM and MLA to predict student performance, engagement, and dropout or retention risk. Given the significance of anticipating student success in today’s educational environment, researchers are motivated to construct reliable and useful models [5].
The biggest challenge for educational institutions is turning massive volumes of data from numerous sources into information that can assist students, professors, and administrators in making decisions. Academic Analysis (AA), Machine Learning Analytics (MLA), and Educational Data Mining (EDM) evaluate educational data to address this complex challenge [6]. AA, MLA, and EDM all aim to improve education, but each domain targets a different set of stakeholders [7,8]. The majority of student performance prediction research has focused on demographics, academic achievement, and course passing rates [9]. However, student behavioural data may enhance performance prediction [10]. Student performance prediction algorithms struggle with poor classification rates, few prediction components, and low accuracy on large datasets. Thus, a comprehensive prediction model is needed to determine the correlations between traits that can correctly predict student performance and assist instructors in identifying underachievers [11,12].

2. Methodology

This proposed system evaluates student performance and behaviour using neural network training. The integrated model reduces overfitting. The model workflow is shown in Figure 1.

2.1. Existing Techniques

Random Decision (RD) Classification: The Random Decision (RD) classifier is an extension of Random Forest and decision tree classification. The classifier uses geometrical categorization and supervised learning. To meaningfully incorporate multinomials in the multiclass case, the model presupposes a probabilistic model and calculates the likelihoods of alternative outcomes, addressing both diagnostic and prognostic issues. The RD classification method starts with a decision tree built over the whole dataset. After finding the tree nodes, it calculates entropy. Efficacy, outcomes, and resource costs are simulated. The RD is generated by an ensemble of bagging-trained decision trees: individual trees are built and merged to increase forecast accuracy and stability.
Support Vector Machine: The Support Vector Machine (SVM) is a frequently used supervised learning algorithm. Supervised learning requires training the classifier on labelled classes before testing it. SVMs are quite comparable to discriminant classifiers, but what sets the SVM apart is that it constructs the largest-margin separator, which leads to better generalization than discriminant classifiers.
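As an illustrative sketch (not the paper's implementation), the two baseline classifiers can be instantiated with scikit-learn, using `RandomForestClassifier` as a stand-in for the Random Decision ensemble and `SVC` for the SVM; the synthetic data and all parameter values here are assumptions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for the student dataset (33 features in the real data).
X, y = make_classification(n_samples=600, n_features=33, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Ensemble of bagging-trained decision trees (stand-in for the RD classifier).
rd = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
# Largest-margin separator.
svm = SVC(kernel="rbf").fit(X_tr, y_tr)

print(rd.score(X_te, y_te), svm.score(X_te, y_te))
```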

2.2. Proposed Technique

Back Propagation Neural Network for Student Performance Analysis.
Neural Networks are utilized for effective data mining, converting raw data into useful information. The model can process a large amount of data, which increases its applicability and effectiveness [13]. It involves an Artificial Neural Network (ANN) for processing and data mining. When the data are massive, automated processing becomes essential [14]. With this dual nature, data mining is effectively used in many ways; its most common application is classification, in which patterns are detected based on their groups. Neural Networks work on the basis of artificial neurons [15,16]. Each node in the network, called a neuron, is the basic processing unit [17].
Here, the components of the Neural Network (NN) are the input data set $\{x_1, x_2, \ldots, x_n\}$, the bias factor $x_0$ (a constant value used in the activation operation), and the weights $\{w_0, w_1, w_2, \ldots, w_n\}$. Based on these, the output is derived as in Equation (1):

$$y = a\left(\sum_{i=0}^{n} w_i \times x_i\right) \tag{1}$$
where ‘a’ is the activation function of the nodes. The activation function allows the network to compute complex non-linear connections between the data and the neurons, and it also provides flexibility.
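Equation (1) can be sketched in a few lines; the tanh activation and the example values below are illustrative assumptions:

```python
import numpy as np

def neuron_output(x, w, activation=np.tanh):
    """Compute a(sum_i w_i * x_i) as in Equation (1).

    x and w include the bias term: x[0] is the constant bias input x_0
    and w[0] its weight, so the sum runs over i = 0..n.
    """
    return activation(np.dot(w, x))

x = np.array([1.0, 0.5, -0.2])   # x_0 = 1 (bias), then two features
w = np.array([0.1, 0.4, 0.3])
print(neuron_output(x, w))       # tanh(0.1 + 0.2 - 0.06) = tanh(0.24)
```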
After performing feature selection on the student data set, the Back Propagation NN is used to measure student behaviours. Here, the student data samples collected from questionnaires are given as $SS' \in \mathbb{R}^{m \times k}$, where ‘m’ denotes the number of samples remaining after removing the outliers from SS and ‘k’ is the number of retained features. The dataset is $DS = \{(u_1, v_1), (u_2, v_2), \ldots, (u_m, v_m)\}$, where $u_i = SS'_i$ is a row vector of $SS'$ and $v_i$ is the class label of the i-th student sample. The NN contains three layers: the input layer comprises ‘k’ nodes that receive the inputs $u_i$; the output layer provides the prediction results $v_i$; and the middle (hidden) layer has ‘n’ adaptive nodes, chosen based on requirements. The node thresholds make the neural network non-linear, and the equivalent function is computed as in Equation (2):
$$\hat{v}_i = f(u_i) \tag{2}$$
The optimization of the NN model is based on the MSE rate between the actual and predicted results. The error rate is computed as in Equation (3),
$$\mathrm{Error\ Rate} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{v}_i - v_i\right)^2 \tag{3}$$
Additionally, the BP-NN uses the process of parameter adjustment to reduce the mean square error. The weight parameter is computed as in Equation (4):
$$\Delta W_{fm} = -\,l \cdot \frac{\partial\,\mathrm{Error\ Rate}}{\partial W_{fm}} \tag{4}$$
where ‘l’ is the learning rate representing the training speed, ‘f’ is the factor of the first layer, and ‘m’ is the factor of the middle layer [18]. The function is defined between the student features ‘ui’ and their behaviour ‘vi’ as F(u). The model is effectively used to minimize the difference between F(u) and f(u). Hence, the trained network model is used as a good prediction model for classifying students. The Back Propagation NN in training with the student dataset is provided in Figure 2.
The neural network trains at the first layer and feeds information to the next. The second layer collects basic data and combines it with more complicated data for the third layer; each layer processes the patterns produced by the preceding layer. The input to a layer determines that layer’s weights [19]. Training a neural network involves determining weights for each neuron in the network. The process may have several components, and calculating the right rates for all inputs is complex; the relevance of each input to the output is crucial for controlling the quality of the NN output [20]. The procedure also uses a network loss function, which compares the values predicted by the model with the intended target values. Here, the common loss function over the input samples is the MSE.
The approach is to find the weights that minimize the error function. This method propagates the error back through the NN, hence the name back propagation model. The error information is used to alter the NN link weights to reduce mistakes. The loss is computed and the weights are adjusted as the error decreases using gradient descent, a standard non-linear optimization methodology, with the learning rate controlling the size of each weight update. After several training epochs and iterations, the model converges, the error is small, and the network is said to have been trained on the goal function.
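The training procedure described above (forward pass, MSE, and gradient-descent weight updates per Equations (2)–(4)) can be sketched as follows. The layer sizes, sigmoid activation, learning rate, and synthetic data are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only: k input features, n hidden nodes, m samples.
k, n, m = 4, 8, 200
U = rng.normal(size=(m, k))                            # feature rows u_i
v = (U.sum(axis=1) > 0).astype(float).reshape(-1, 1)   # class labels v_i

W1 = rng.normal(scale=0.5, size=(k, n))  # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(n, 1))  # hidden -> output weights
l = 0.1                                  # learning rate 'l'

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

errors = []
for epoch in range(500):
    # Forward pass: v_hat = f(u), Equation (2).
    H = sigmoid(U @ W1)
    v_hat = sigmoid(H @ W2)
    # Mean square error, Equation (3).
    errors.append(np.mean((v_hat - v) ** 2))
    # Back propagation: Delta W = -l * d(Error Rate)/dW, Equation (4),
    # applied layer by layer via gradient descent.
    d_out = 2.0 / m * (v_hat - v) * v_hat * (1 - v_hat)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    W2 -= l * (H.T @ d_out)
    W1 -= l * (U.T @ d_hid)

print(f"MSE: {errors[0]:.4f} -> {errors[-1]:.4f}")
```

After enough epochs, the MSE falls from its initial value, which is the convergence behaviour the text describes.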

3. Experimental Results

3.1. Dataset Description

Each record in this dataset describes secondary school student achievement at two Portuguese schools using 33 attributes. The attributes include students’ grades, demographics, social status, and education-related features. The data came from school reports and questionnaires. Two datasets cover performance in Portuguese (por) and mathematics (mat). The grade attributes are the first-period, second-period, and final grades.
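As a sketch of how such data are read: the UCI Student Performance files are semicolon-separated. A two-row in-memory excerpt stands in for the real `student-por.csv`/`student-mat.csv` files here, and the column subset is abbreviated (the full files have 33 columns):

```python
import io
import pandas as pd

# Two-row excerpt mimicking the semicolon-separated UCI file layout.
sample = io.StringIO(
    "school;sex;age;G1;G2;G3\n"
    "GP;F;18;5;6;6\n"
    "GP;F;17;5;5;6\n"
)
df = pd.read_csv(sample, sep=";")
print(df[["G1", "G2", "G3"]].mean())  # first, second, and final grades
```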

3.2. Results and Discussion

The experiment was conducted with some fine-tuning of the parameters. The models are developed using the classification algorithms SVM, Random Decision, and BPNN-SPA and evaluated with the performance metrics sensitivity, specificity, accuracy, and the F-measure. In binary classification, G3 ≥ 10 is a pass; otherwise, it is a fail. Five-level grading is based on the Erasmus grade conversion system; Erasmus coordinators allocate the available exchange places based on the student’s entire application.
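The two labelling schemes can be sketched as simple threshold functions on the final grade G3 (0–20 scale). The binary cut-off follows the text; the five-level band boundaries follow the commonly used Erasmus-style mapping and are an assumption here, since the paper does not list them:

```python
def binary_label(g3):
    """Pass if the final grade G3 (0-20 scale) is at least 10."""
    return "pass" if g3 >= 10 else "fail"

def five_level_label(g3):
    """Five-level band per the Erasmus grade conversion.
    The cut-offs below are an assumed common mapping, not taken
    from the paper."""
    if g3 >= 16:
        return "very good"
    if g3 >= 14:
        return "good"
    if g3 >= 12:
        return "satisfactory"
    if g3 >= 10:
        return "sufficient"
    return "fail"

print(binary_label(12), five_level_label(12))  # pass satisfactory
```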
Figure 3 represents the heat map of the feature importance analysis for the student performance dataset. Table 1 illustrates the experimental results for binary classification, i.e., pass or fail, on the Portuguese dataset. Here, the proposed BPNN-SPA achieves 1.8% higher accuracy than Random Decision and 11.9% higher accuracy than SVM, and takes 2920 milliseconds less than Random Decision and 4740 milliseconds less than SVM.
Table 2 illustrates the experimental results for binary classification on the mathematics dataset. Here, the proposed BPNN-SPA achieves 2.9% higher accuracy than Random Decision and 11.5% higher accuracy than SVM, and takes 3720 milliseconds less than Random Decision and 5440 milliseconds less than SVM.
Figure 4a,b show the performance analysis of binary classification. It is obvious that the proposed BPNN-SPA performs better than other algorithms. Figure 5 depicts the accuracy of the proposed algorithms, and Figure 6 illustrates the time taken for each algorithm for binary classification.
Table 3 illustrates the experimental results for five-level grading, i.e., very good, good, satisfactory, sufficient, and fail, on the Portuguese dataset. Here, the proposed BPNN-SPA achieves 2.9% higher accuracy than Random Decision and 8.5% higher accuracy than SVM, and takes 2580 milliseconds less than Random Decision and 5050 milliseconds less than SVM.
Table 4 illustrates the experimental results for five-level grading on the mathematics dataset. Here, the proposed BPNN-SPA achieves 2.7% higher accuracy than Random Decision and 12.1% higher accuracy than SVM, and takes 3650 milliseconds less than Random Decision and 6190 milliseconds less than SVM.
Figure 7 shows the performance analysis of five-level grading. It is evident that the proposed BPNN-SPA performs better than the other algorithms. Figure 8 depicts the accuracy of the algorithms, and Figure 9 illustrates the time taken by each algorithm for five-level grading.

4. Conclusions

The experiment presented in this paper focused on creating and assessing a predictive model for classifying non-performing students. We applied frequently used classification techniques to the student dataset to find an optimal solution for student classification. These classification algorithms were selected on the basis of the results of current research on developing predictive models for student classification. From the analysis, it is evident that the proposed Back Propagation Neural Network for Student Performance Analysis achieves good accuracy for both binary and five-level grading. The main limitation of the proposed technique is that it can be sensitive to noisy data. In future work, the proposed algorithm is to be extended with further fine-tuned parameters and larger datasets.

Author Contributions

Conceptualization, J.R. and S.S.; methodology, J.R.; software, J.R.; validation, S.S.; formal analysis, J.R. and S.S.; investigation, S.S.; resources, S.S.; data curation, S.S.; writing—original draft preparation, J.R.; writing—review and editing, J.R.; visualization, S.S.; supervision, S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data can be obtained from the corresponding author on request.

Acknowledgments

We acknowledge the institutional management and family members for their immense support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Usama, M.; Qadir, J.; Raza, A.; Arif, H.; Yau, K.L.A.; Elkhatib, Y.; Al-Fuqaha, A. Unsupervised machine learning for networking: Techniques, applications and research challenges. IEEE Access 2019, 7, 65579–65615. [Google Scholar] [CrossRef]
  2. Aljohani, N.R.; Fayoumi, A.; Hassan, S.U. A comparative study of feature selection methods for dialectal Arabic sentiment classification using support vector machine. Int. J. Comput. Sci. Netw. Secur. 2019, 19, 167–176. [Google Scholar]
  3. Anjum, N.; Badugu, S.; Satapathy, S.; Raju, K.; Shyamala, K.; Krishna, D.; Favorskaya, M. A Study of Different Techniques in Educational Data Mining. In Advances in Decision Sciences, Image Processing, Security and Computer Vision: International Conference on Emerging Trends in Engineering (ICETE); Springer International Publishing: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
  4. Aljohani, N.R.; Fayoumi, A.; Hassan, S.U. Predicting At-Risk Students Using Clickstream Data in the Virtual Learning Environment. Sustainability 2019, 11, 7238. [Google Scholar] [CrossRef]
  5. Alhassan, A.; Zafar, B.; Mueen, A. Predict Students’ Academic Performance based on their Assessment Grades and Online Activity Data. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 185–194. [Google Scholar] [CrossRef]
  6. Toivonen, T.; Jormanainen, I.; Tukiainen, M. Augmented intelligence in educational data mining. Smart Learn. Environ. 2019, 6, 10. [Google Scholar] [CrossRef]
  7. Buenaño-Fernández, D.; Gil, D.; Luján-Mora, S. Application of machine learning in predicting performance for computer engineering students: A case study. Sustainability 2019, 11, 2833. [Google Scholar] [CrossRef]
  8. Tomasevic, N.; Gvozdenovic, N.; Vranes, S. An overview and comparison of supervised data mining techniques for student exam performance prediction. Comput. Educ. 2020, 143, 103676. [Google Scholar] [CrossRef]
  9. Shrestha, S.; Pokharel, M. Machine Learning algorithm in educational data. Artif. Intell. Transform. Bus. Soc. 2019, 1, 1–11. [Google Scholar]
  10. López, M.B.V.; García, M.Y.A.; Jaico, J.L.B.; Ruiz-Pico, A.A.; Hernández, R.M. Application of a Data Mining Model to Predict Customer Defection. Case of a Telecommunications Company in Peru. J. Wirel. Mob. Netw. Ubiquitous Comput. Dependable Appl. 2023, 14, 144–158. [Google Scholar]
  11. Rawat, K.S.; Malhan, I.V. A hybrid classification method based on machine learning classifiers to predict performance in educational data mining. In Proceedings of the 2nd International Conference on Communication, Computing and Networking: ICCCN 2018, NITTTR, Chandigarh, India, 8 September 2018. [Google Scholar]
  12. Waheed, H.; Hassan, S.U.; Aljohani, N.R.; Hardman, J.; Alelyani, S.; Nawaz, R. Predicting academic performance of students from VLE big data using deep learning models. Comput. Hum. Behav. 2020, 104, 106189. [Google Scholar] [CrossRef]
  13. Yahya, A.A. Swarm intelligence-based approach for educational data classification. J. King Saud Univ. Comput. Inf. 2019, 31, 35–51. [Google Scholar] [CrossRef]
  14. Zhang, W.; Qin, S. A brief analysis of the key technologies and applications of educational data mining on online learning platform. In Proceedings of the IEEE 3rd International Conference on Big Data Analysis (ICBDA), Shanghai, China, 9–12 March 2018. [Google Scholar]
  15. Liloja; Ranjana, P. An Intrusion Detection System Using a Machine Learning Approach in IOT-based Smart Cities. J. Internet Serv. Inf. Secur. 2023, 13, 11–21. [Google Scholar]
  16. Jayasree, R.; Sheela Selvakumari, N.A. Student Performance Prediction Using Random decision (RD) Classification Algorithm. Int. J. Mech. Eng. 2022, 7, 1725–1735. [Google Scholar]
  17. Alhakami, H.; Alsubait, T.; Jarallah, A.S. Data mining for student advising. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 526–532. [Google Scholar] [CrossRef]
  18. Yağcı, M. Educational data mining: Prediction of students’ academic performance using machine learning algorithms. Smart Learn. Environ. 2022, 9, 11. [Google Scholar] [CrossRef]
  19. Zhang, Y.; Yun, Y.; An, R.; Cui, J.; Dai, H.; Shang, X. Educational data mining techniques for student performance prediction: Method review and comparison analysis. Front. Psychol. 2021, 12, 698490. [Google Scholar] [CrossRef] [PubMed]
  20. Dhilipan, J.; Vijayalakshmi, N.; Suriya, S.; Christopher, A. Prediction of student’s performance using machine learning. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1055, 12122. [Google Scholar] [CrossRef]
Figure 1. Proposed system architecture.
Figure 2. Back Propagation Neural Network with error rate in training.
Figure 3. Heat map for feature importance analysis.
Figure 4. (a) Performance analysis for binary classification for Portuguese lessons. (b) Performance analysis for binary classification for mathematics lessons.
Figure 5. Accuracy for binary classification.
Figure 6. Time taken for binary classification.
Figure 7. Performance analysis for five-level grading.
Figure 8. Accuracy for five-level grading.
Figure 9. Time taken for five-level grading.
Table 1. Binary grading for Portuguese lessons.

| Algorithms | Precision | Recall | F-Measure | Accuracy | Time Taken (ms) |
|---|---|---|---|---|---|
| SVM | 84.6 | 85.2 | 83.6 | 84.6 | 6540 |
| Random Decision | 93.5 | 94.2 | 93.5 | 94.7 | 4720 |
| BPNN-SPA | 96.8 | 95.8 | 96.2 | 96.5 | 1800 |

Table 2. Binary grading for mathematics lessons.

| Algorithms | Precision | Recall | F-Measure | Accuracy | Time Taken (ms) |
|---|---|---|---|---|---|
| SVM | 86.2 | 92.1 | 87.1 | 86.5 | 7540 |
| Random Decision | 96.8 | 96.2 | 95.3 | 95.1 | 5820 |
| BPNN-SPA | 98.8 | 96.1 | 97.5 | 98.0 | 2100 |

Table 3. Five-level grading for Portuguese lessons.

| Algorithms | Precision | Recall | F-Measure | Accuracy | Time Taken (ms) |
|---|---|---|---|---|---|
| SVM | 88.5 | 89.15 | 87.41 | 88.7 | 6750 |
| Random Decision | 94.4 | 95.7 | 93.2 | 94.3 | 4280 |
| BPNN-SPA | 96.5 | 97.8 | 96.7 | 97.2 | 1700 |

Table 4. Five-level grading for mathematics lessons.

| Algorithms | Precision | Recall | F-Measure | Accuracy | Time Taken (ms) |
|---|---|---|---|---|---|
| SVM | 85.3 | 86.4 | 85.7 | 84.9 | 7950 |
| Random Decision | 93.6 | 94.8 | 93.7 | 94.3 | 5410 |
| BPNN-SPA | 96.8 | 97.2 | 96.8 | 97.0 | 1760 |