Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction

Zub, Khrystyna; Zhezhnych, Pavlo; Strauss, Christine

doi:10.3390/bdcc7020083

Open AccessArticle

Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction

by

Khrystyna Zub

¹

,

Pavlo Zhezhnych

¹ and

Christine Strauss

^2,*

¹

Department of Social Communication and Information Activities, Lviv Polytechnic National University, 79000 Lviv, Ukraine

²

Faculty of Business, Economics and Statistics, University of Vienna, 1090 Vienna, Austria

^*

Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2023, 7(2), 83; https://doi.org/10.3390/bdcc7020083

Submission received: 13 March 2023 / Revised: 18 April 2023 / Accepted: 20 April 2023 / Published: 23 April 2023

(This article belongs to the Special Issue Quality and Security of Critical Infrastructure Systems)

Download

Browse Figures

Versions Notes

Abstract

:

In this paper, we investigate the methods used to evaluate the admission chances of higher education institutions’ (HEI) entrants as a crucial factor that directly influences the admission efficiency, quality of education results, and future students’ life-long trajectories. Due to the conditions of uncertainty surrounding the decision-making process that determines the admission of entrants and the inability to independently assess the probability of potential outcomes, we propose the application of the machine learning (ML) model as an algorithm that provides decision-making support. The proposed model includes the support vector machine (SVM) stacking ensemble, which expands the input data set obtained using the Probabilistic Neural Network (PNN). The basic algorithms include four SVM ensemble methods with different kernel functions and Logistic Regression (LR) as a meta-algorithm. We evaluate the accuracy of the developed model in three stages: comparison with existing ML methods; comparison with a single-based model that comprises it; and comparison with a similar stacking model and with other types of ensembles (boosting, begging). The results of the designed two-stage PNN–SVM ensemble model provided an accuracy of 94% and possessed acquired superiority in the comparison stages. The obtained results enable the use of the presented model in the subsequent stages of the development of an intellectual support system for decision making regarding entrants’ admission.

Keywords:

higher education institution; admission prediction; support vector machine; ensemble stacking; linear regression; probabilistic neural network

1. Introduction

Today, the admission process is a critical component of educational institutions’ activity. In order to strengthen HEIs’ autonomy and social responsibility, one of their priorities is searching for and implementing information technology (IT) innovations that provide effective results. The rapid development of IT, particularly in the field of artificial intelligence (AI), increases the need to constantly investigate the most effective technological solutions to be utilized in different stages of education. It should be emphasized that in order to ensure the quality of higher education, the results of the admission process must meet the needs of both the HEI itself and the other stakeholders, entrants, and society.

One of the critical moments during the admission process is the entrant’s choice of a specific program; the method by which a student selects the HEI and, alternately, the institution selects a student to admit significantly affects the success of both sides. However, today, many factors could make this task more complicated, particularly those that follow:

an increasing amount of information coming from various HEI or government institutions (complex admission standards for entrants, and a wide range of educational institutions and programs) could complicate how the HEI is perceived and processed by entrants;
the specific rules of separate HEIs, the complexity of the admission process and annual updates to the admission requirements often confront entrants;
the risk of entrants getting stuck in the “area of limited information” that is caused by the popularity of certain fields in the labor market (it is difficult to expertly assess the demand for specialists and the labor market trends that change every year); entrants’ uncertainty associated with the inability to independently and accurately assess the success of admission;
the consequences of final decisions that directly affect future students and their self-realization—education success, employment, and HEI ranking.

As a result, in most cases, entrants are not fully knowledgeable, so they are not ready to choose a program expertly and consciously. HEI entrants cannot use their previous experience, because of its insufficiency, so they rely on their perception and understanding of the general information that is available regarding the admission process. Since all entrants are direct subjects in the education market, the results of their choices are significant and need to be supported in order to achieve effective admission process results.

The current methods and tools used by leading HEIs that aim to support entrants with admission-related information nowadays involve a few main aspects: official websites (programs lists and descriptions, admission requirements, promotional materials); online applications to HEIs (provides all stages, including payment); online/offline specialist consultations (a human impact factor makes them less effective); and commercial software solutions designed to aid the admission management process (problems of privacy and data security, high purchase cost, and difficulties in regard to supporting them makes them unsuitable in many cases). As we can see, these tools do not provide enough support for entrants’ decision making.

A proper evaluation of the chances of entrants’ admission success could support entrant’s decision-making process regarding their choice of education program and, at the same time, save their time and financial costs. Such a prediction of an entrant’s admission will also have a positive impact on the planning and management processes of HEIs.

One possible way to solve this task is to build a specific intellectual information system to support the decision-making process. The advantages of ML include its ability to naturally extend traditional statistical approaches that focus primarily on predictions, to automatically identify patterns within data, and perform the tasks that we aim to investigate in this paper. Based on our previous research results and an analysis of the existing studies, with the aim of achieving high accuracy, low time consumption, and the ability to implement it in future studies, this paper evaluates a new hybrid model that can be used to predict the admission chances of HEI entrants.

2. Related Work

ML and AI more broadly have great current and future potential for implementation in almost all aspects of the education process. However, an analysis of the latest research on ways in which to support decision-making using ML methods shows us that most of them are designed for existing students. This could be explained by the fact that HEIs have flexible curricula and offer a wide list of elective courses, so they use educational data to build possible recommendations and decision-making algorithms.

In addition, there is much ML-based research on decision making and its application in program selection, but it generally aims to support HEI managers. In this case, prediction primarily aids in the understanding of enrollment trends and could influence future strategy and resource decisions. For example, the purpose of such studies was to define the future enrollment trends [1], investigate a statistical model that can be used to better forecast international undergraduate student enrollment at the institutional level [2], identify applicants with a higher tendency to enroll [3], predict student enrollment behavior and predict the students who are at high risk of dropping out [4].

Some papers have focused on predicting the entrant’s admission success. These authors have emphasized the relevance of admission prediction because it will help entrants save their time and money, promote successful study and employment, and reduce the risk of exclusion [5,6,7,8,9,10]. Various methods of intellectual data analysis have been used to extrapolate patterns and new knowledge from the collected data. The authors have also used common ML methods to compare different regression algorithms [5,7], including Association Rule Mining [6] and gradient boosting regressor [9]. These studies were performed in accordance with the needs of undergraduate [6,10] or graduate students [5,7,8,9]. The researchers used data sets that contained the previous year’s admission results [5,6], students’ data [7,8,9,10], or current user’s data. The authors also emphasized the importance of cleaning, preparing data, and the feature selection process.

Intending to improve research results, the authors emphasized the relevance of such tasks in the future, including expanding datasets and using neural networks as another plausible model, instead of single regression models.

However, the existing works in most cases do not provide comparative results with other methods or the authors’ previous studies. In addition, the authors do not consider the application of the researched models in building a software solution that supports applicants. As an essential key component in many data mining applications, neural networks have emerged as being particularly effective in higher education-related tasks. Researchers have used neural networks in the study of engineering student retention prediction [11], have determined the education quality by classifying the students’ achievements [12] and have predicted entrants’ academic performance at the university in order to support them in the admission decision-making process [13]. Other work has investigated classification patterns using the k-nearest neighbors algorithm (k-NNN), artificial neural networks (ANNs), and the Naïve Bayes method. The experiment showed that the ANNs demonstrated the best accuracy and precision [14]. The methods used in these works were defined as practical result detection algorithms with a high accuracy and efficiency.

The results of modern research using ML algorithms testify to the effectiveness of ensemble methods. This approach has become a primary development direction in ML in various subject areas [15].

In one study, the authors applied a new ensemble technique for classifying the multi-class hyperspectral honey dataset, and mitigated a vital drawback of these tree-based methods: they cannot correct for misclassification. This ensemble performed better than the benchmark one-vs-rest SVM across all the feature reduction techniques. However, the authors defined some disadvantages, such as a slow training and testing time [16].

Another paper presented the effective results of applying an ensemble approach in a deep learning-based method for dynamic security assessment. The SVM ensemble was used in the base of the stacked de-noising auto-encoder. The ensemble-boosting learning increased the SVM classifier’s accuracy [17].

To provide high accuracy, another paper presented a new, stacking-based General Regression Neural Network ensemble model with a Successive Geometric Transformations Model and a neural-like structure as a meta-algorithm. A comparison with several known predictors showed the best result of the proposed model [18].

The existing research supports the use of an ensemble approach when building ML models for different tasks and data subject areas. However, existing studies in the educational domain still have certain disadvantages, particularly the lack of comparison with other methods, low accuracy, and missing indicators that affect the admission outcome (incomplete data or their low quality). Therefore, finding and researching the most effective methods and models for solving this task is still necessary. In addition, an analysis of the literature demonstrates that most available research does not explore the combination of different methods. Therefore, this work investigates the possibility of improving the accuracy of predicting the success of admission to an HEI by using a two-stage proposed model.

3. Materials and Methods

This section describes the data set, methods, and modeling process used in more detail.

3.1. Base Models

PNNs are a group of ANNs obtained from Bayesian calculations and the nucleus density method. PNNs often learn more quickly than many neural network models. Its training time, resistance to noise, simple structure, and training manner make this type widely used in modern research and various applications [19].

According to the PNN topology, there is no need for massive back-propagation training computations. Instead, each data pattern is represented with a unit that measures the similarity of the input patterns to the data pattern. The point is that the PNN learns to estimate the probability density function. Its output is considered the model’s expected value at a given point in the input space. The output layer of the PNN estimates the probability of an element belonging to a particular class. The PNN topology is presented in Figure 1. A complete analysis and mathematical explanation are presented in [20].

Today, PNNs are widely used in order to solve various issues that researchers have attempted to address. PNNs facilitate the smooth and accurate classification of complex models with arbitrary solution boundary shapes [20].

Another ML method that we used in this research is SVM. It is one of the most common ML methods used for classification, regression, or the detection of outliers. This method shows high efficiency in binary classification (when the data have exactly two classes) [21]. Compared with other statistical learning algorithms, SVM implementation often provides a high level of performance, especially when using high-dimensionality data or data with a small number of training examples [16,17].

SVM is a powerful algorithm that finds the best possible decision boundary and separates the data points of different classes in a high-dimensional feature space. To implement SVM, the data points are first mapped to a higher dimensional feature space using a kernel function. In the feature space, SVM finds the best decision boundary that maximizes the margin between the classes. Once the decision boundary is found, SVM can predict the class of new data points by determining which side of the decision boundary they lie on. SVM is a powerful algorithm that can handle large datasets and works well with both linearly separable and non-linearly separable data. A classical SVM application mathematical theory is explained in [22].

The advantage of this method is also universality. Different kernel functions provide processing data through a set of mathematical functions. The kernel plays an essential role in classification and is used to analyze some patterns in a particular data set. Different algorithmic implementations of SVM can use different types of kernel functions. The most popular are Linear, Sigmoid, Polynomial, Radial Basis Function (RBF), Gaussian kernel, and Bessel function kernel. It is worth noting that the accuracy of the classifier, to some extent, depends on the choice of specific kernels [23]. Since this method shows good results in binary classification, we chose it as the basic algorithm of the ensemble model. To build the ensemble method, we used SVM and LR, also considering previous studies’ results. After investigating the effectiveness of different ML-based classifiers in solving the prediction task, these methods showed the highest accuracy [24]. The LR assumes that the relationship between the input features and the output variable is linear on the logit scale and estimates the coefficients of the linear equation using maximum likelihood estimation. LR is a statistical model commonly used for binary classification tasks to predict whether an observation belongs to one of two classes. Despite its simplicity, LR can achieve high performance on a wide range of classification problems.

3.2. The Proposed Ensemble Model

In this work, the stacking ensemble of the SVM, with the expansion of the input data set via PNN, was implemented. The main idea of such a hybrid approach is to use the summation layer of the PNN topology as a tool for pre-processing data for further use. The processed data set is added to each relevant vector of the initial training and test data samples. This scheme ensures an expansion of the input data space via the probability of the data belonging to each class, which theoretically should increase the probability of the correct result in the classification tasks based on the ML algorithm. In addition, the ensemble method is used as one of the ways to increase the accuracy of the prediction results. We propose Logistic Regression for a meta-algorithm as it is effective and relatively fast compared to other algorithms, making it suitable for real-time applications. We use SVM, as it is one of the most commonly used algorithms in current research and was the most accurate in our previous investigation classification method [24].

Given the possibilities of the methods described above, in this study, we presented a heterogeneous ensemble of SVM with four different kernels, with the expansion of the input data set using PNN and LR as a meta-algorithm.

The essence of the ensemble method is the simultaneous application of several basic algorithms. In this method, algorithms can be taught in parallel, which provides an opportunity to improve the results’ accuracy. Several approaches to building ensemble algorithms exist, such as stacking, boosting, and bagging. Boosting is the continuous application of each model, with each subsequent one correcting the errors of the previous one. The basic algorithms are learned in parallel using the bagging ensemble method, and the final results are aggregated. When using stacking, several different algorithms are trained, and at their outputs, the meta-algorithm gives the final result. The stacking ensemble method is used in this work. The scheme of the proposed model is shown in Figure 2.

Therefore, the model works as follows. In the first stage, we use PNN on the initial data set. As a result, we obtain an additional set of probabilities from the PNN’s summation layer, which increases the features of the initial data set. Next, the extended data set is processed using a heterogeneous stacking ensemble.

3.3. Dataset Description

The effectiveness of the chosen method was evaluated based on a real dataset of admissions to HEIs. It includes information about the admission results of 29 universities. This data set is freely available, which is the main reason for its use in this work. It can be downloaded from [25]. Because of the importance of data preprocessing, we chose this well-prepared dataset, which included all the necessary procedures for data processing, selecting essential attributes, and preparing them for further intellectual analysis. Some dataset statistics are presented in Table 1. The following are mentioned in the table: mean—the average or the most common value in a collection of numbers; median—the middle value of a set of numbers; dispersion—the extent to which a distribution is stretched or squeezed; and minimum and maximum values of the featured dataset.

In total, 1653 observations in this data set did not contain gaps. That is why the specified data set was used to test the developed method in its current form.

3.4. Performance Indicators

Performance evaluation of the designed two-stage PNN–SVM ensemble model was conducted using the following indicators:

Accuracy = \frac{TP + TN}{TP + FP + TN + FN},

(1)

Precision = \frac{TP}{TP + FP},

(2)

Recall = \frac{TP}{TP + FN},

(3)

F 1 - score = 2 \times \frac{(Recall \times Precision)}{(Recall + Precision)},

(4)

where TP represents true positive observations, TN represents true negatives observations, FP represents false positive observations, and FN represents false negatives observations.

The basic and commonly described intuitive metric used for model evaluation is accuracy, which defines the number of correct predictions divided by the total number of predictions multiplied by 100. It is defined as the ratio of the number of correct predictions to the total number of predictions made by the model. It evaluates how accurately the model can predict the class according to a problem statement.

Precision provides a measure of accuracy for positive predictions, and it could be applicable to a wide range of ML tasks, including binary classification, multi-class classification, and even regression. This makes it a flexible metric that can be used in various contexts.

Recall is a commonly used performance metric for classification models. It is a crucial performance metric that measures how well a model identifies positive instance recall measures and the proportion of actual positive instances that the model correctly identifies, i.e., the ratio of true positives to the sum of true positives and false negatives [26].

The F1-score metric combines precision and recall metrics to provide a single score that reflects the model’s overall performance. The main advantage is that the F1-score combines other metrics into a single one, capturing many aspects simultaneously.

3.5. Optimal Parameters of Ensemble Members

The smooth factor is the only parameter of PNN selection that is necessary for the exact work of a neural network of this type. Therefore, in Figure 3, the dependence of the PNN accuracy values on the change in this parameter is shown (in the range from 0.01 to 0.3 with a step of 0.01). As shown in Figure 3, the highest accuracy value of the PNN is obtained at a smooth factor of 0.03. A further increase in the value of this indicator does not change the accuracy of the PNN. Therefore, a value of 0.03 was used during the pre-processing of a given data set in order to obtain a set of probabilities belonging to one of the two classes of the original attribute of the task.

All the SVMs (SVM with different kernels) of the stacking ensemble worked on the following parameters: cost = 1, regression loss epsilon = 0.1, numerical tolerance = 0.001, and the number of iterations = 10,000. Only the core of the method changed. It should be noted that when comparing the work of the developed ensemble with single-based SVMs, the same parameters of their work were used.

4. Results

The modeling of the stacking ensemble was based on an extended data set. It is formed by adding to the initial data set the probabilities of belonging to each of the two classes of the task (whether the entrant entered or not) due to the outputs of the summation layer PNN. Thus, the new data set contained not 7, but 9 input attributes. This is the first stage of the developed method. The grid-search method was used to select the optimal operating parameters of each ML method (details are presented in Appendix A).

A stacking ensemble of four SVMs with different kernels was used in the second stage. To objectively assess the behavior of the developed ensemble on the independent data, we used ten-fold cross-validation. This procedure provides more reliable results and allows us to conclude a specific generalization of the studied model.

The results of the ensemble based on Equations (1)–(4) are summarized in Table 2.

In addition to the numerical results of the accuracy of the developed method, Figure 4 shows the quality of classification in visual form; this is in the form of ROC curves for two classes (get admission or not).

An error curve, the so-called ROC curve, was used to visualize the research results (Figure 4). The ROC curve plots the true positive rate (sensitivity) against the false positive rate (specificity) for each individual class. Accordingly, the studied classifier, whose ROC curve is located above and to the left of the graph, demonstrates greater accuracy.

5. Comparisons and Discussion

The comparison of the model efficiency in this work was carried out in three stages. In the first stage, there was a comparison with the existing ML methods, which are most widely used in the existing literature. The second experiment compared the developed ensemble with a single-based model that formed it. Finally, the third experiment demonstrated the comparison results with a similar stacking model and other types of ensembles (boosting, begging).

5.1. Comparison with Existing Machine Learning Algorithms

The aim of this experiment was to compare the designed two-stage PNN–SVM ensemble model with the neural network and classical ML algorithms:

classical PNN;
Neural Network;
SVM with RBF kernel;
LR;
Stochastic gradient descent (SGD);
Decision Tree;
Naïve Bayes;
kNN.

Existing methods were simulated using the “Orange” software package [27]. “Orange” is a tool for data visualization, ML, and open-source data analysis. It is equipped with a visual programming interface for fast and high-quality data analysis and interactive data visualization. Software “Orange” includes a wide array of tools that are suitable for demonstrating such an investigation.

It should be noted that all of these methods used the original data set (without extension using PNN). The results of the experiment are summarized in Table 3.

5.2. Comparison with All Single-Based Ensemble Members

The purpose of this phase was to compare the designed two-stage PNN–SVM ensemble model with its single-based members: PNN [28], which is the basis of the first stage of the developed method; and SVM with RBF, sigmoid, polynomial and linear kernels separately, which used the extended dataset. In Table 4, the results of such a comparison are summarized.

5.3. Comparison with Other Ensemble-Based Approaches

This experiment demonstrates the advantages of the developed ensemble in comparison with other ensemble methods:

stacking of SVMs with different kernels;
Random Forest algorithm;
AdaBoost classifier.

All three existing ensemble methods belong to different groups of ensemble construction: stacking, begging, and boosting. It should be noted that the simulation of their work took place on the initial data set. The simulation results are summarized in Table 5.

5.4. Discussion

According to the research goal, the proposed model results provide high accuracy. The three-stage experiment proves that our model is more accurate than other ML methods, a single-based model that comprises the proposed model and the other ensemble methods. In addition to the numerical results, Figure 5 presents the classification accuracy using the ROC curves of the two classes (get admission or not) of the accuracy of the last experimental stage.

In addition, these results are more accurate in comparison with the previous authors’ investigations. The numerical results show a higher accuracy than our previous implementation of ML-based classifiers for HEI entrants’ admission prediction.

The high accuracy of the model is a prerequisite for building an effective information system. This is also facilitated by the greater reliability of intellectual analysis due to using an ensemble of ML methods, compared to single-based ML algorithms.

The need to make a choice appears every year before entrants of each HEI. At the same time, many variables can affect students’ decision success. As previous studies show, additional research requires a data collection and analysis stage to be used in such a system. The sources of data that were used in the research of other authors are the results of surveys, information about student performance, and (or) historical information about applicants from previous years of admission. However, there is still the problem of considering the crucial parameters of the data set when using specific methods.

Thus, studying a set of independent traits that may affect the results of prediction in each case is also extremely important. This direction will be a subject of a future work.

6. Conclusions

In this research, we considered the need to support the program decision-making process of HEI entrants during admission. As one of the ways to increase the effectiveness of admission results, we investigated the possibility of evaluating entrants’ chances of being admitted to an HEI. Reviewing previous research allowed us to highlight some tendencies in implementing IT systems that aim to support the entrant’s needs. However, existing studies still have certain disadvantages, so we aimed to investigate another model that could accurately predict the entrant’s success more accurately.

A heterogeneous stacking ensemble of the SVM with the expansion of the input data set via PNN was used to investigate the prediction of the entrant’s chances of being admitted through a binary classification task. The basic algorithms of the stacking ensemble were SVM with four different kernels: linear, sigmoid, polynomial, and RBF. We chose Logistic Regression as a meta-algorithm. In the first stage, we used PNN to obtain two additional data vectors. Then, the extended data set was processed using a heterogeneous stacking ensemble.

The results of the designed two-stage PNN–SVM ensemble model provide an accuracy = 0.940, which is the highest value when compared with other studied methods. The obtained results show that the proposed model could be used at the subsequent stages of building an information system that supports HEI entrants’ decision-making process. In addition, the contribution of this study is that it demonstrates the possibility of implementing such a model for the prediction task. This could help other experts who aim to implement IT systems in higher education or other areas.

For a wider and more effective evaluation of the proposed model, our future work will involve testing other datasets that include HEI admission results. We aim to perform all necessary intellectual analysis process stages in order to evaluate the proposed model on another dataset. Further research on a feature selection model is especially important regarding the outcome of prediction and recommendation in each case. These studies will contribute to the effectiveness of building an intelligent system to support entrants’ decision making.

Author Contributions

Conceptualization, K.Z. and P.Z.; methodology, K.Z.; validation, C.S. and P.Z.; formal analysis, C.S.; investigation, K.Z.; resources, K.Z.; writing—original draft preparation, K.Z.; writing—review and editing, C.S. and P.Z.; visualization, K.Z.; supervision, P.Z. and C.S.; project administration, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are fully available without restriction. They may be found at: https://github.com/satwik2663/Machine-Learning-Graduate-Studuent-Admission-Predictor (accessed on 15 October 2022).

Acknowledgments

The authors would like to thank the Armed Forces of Ukraine for providing security to perform this work. This work has become possible only because of the resilience and courage of the Ukrainian Army.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Grid search parameters.

Model	Optimal Parameters
AdaBoost	exponential loss function, 100 estimators, learning rate: 1,
Random Forest	subsets smaller than 3 did not split, 5 attributes at each split, 100 trees
SVM (RBF-kernel)	cost = 1, regression loss epsilon = 0.1, numerical tolerance = 0.001, and a number of iterations = 10,000
LR	C = 1, regularization: L2
Neural Network	7 neurons in the input layer, 100 neurons in the hidden layer, 200 iterations, Adam solver, ReLu activation function
SGD	Hinge loss function, regularization: L2, 1000 iterations
Decision Tree	subsets smaller than 3 did not split, 14 number of instances in leaves
Naïve Bayes	basic parameters
kNN	5 neighbors, metric: Euclidian
Classical PNN	Sigma = 0.03

References

Dela Cruz, A.; Basallo, M.; Jerome, A. Higher Education Institution (HEI) Enrollment Forecasting Using Data Mining Technique. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 2060–2064. [Google Scholar]
Chen, Y.; Li, R.; Hagedorn, L.S. Undergraduate International Student Enrollment Forecasting Model: An Application of Time Series Analysis. J. Int. Stud. 2020, 9, 242–261. [Google Scholar] [CrossRef]
Slim, A.; Hush, D.; Ojah, T.; Babbitt, T. Predicting Student Enrollment Based on Student and College Characteristics. In Proceedings of the 11th International Educational Data Mining Society, Raleigh, NC, USA, 16–20 July 2018. [Google Scholar]
Shilbayeh, S.; Abonamah, A. Predicting Student Enrolments and Attrition Patterns in Higher Educational Institutions using Machine Learning Intern. Int. Arab. J. Inf. Technol. 2021, 18, 562–567. [Google Scholar]
Acharya, M.S.; Armaan, A.; Anton, A.S. A Comparison of Regression Models for Prediction of Graduate Admissions. In Proceedings of the Second International Conference on Computational Intelligence in Data Science, Chennai, India, 21–23 February 2019; pp. 1–5. [Google Scholar]
Mane, R.V.; Ghorpade, V.R. Predicting student admission decisions by association rule mining with pattern growth approach. In Proceedings of the International Conference on Electrical, Electronics, Communication, Computer and Optimization Techniques, Mysuru, India, 9–10 December 2016; pp. 202–207. [Google Scholar] [CrossRef]
AlGhamdi, A.; Barsheed, A.; AlMshjary, H.; AlGhamdi, Y.A. Machine Learning Approach for Graduate Admission Prediction. In Proceedings of the 2nd International Conference on Image, Video and Signal Processing, New York, NY, USA, 20–22 March 2020; pp. 155–158. [Google Scholar] [CrossRef]
Sujay, S. Supervised Machine Learning Modelling & Analysis For Graduate Admission Prediction. Int. J. Trend Res. Dev. 2020, 7, 5–7. [Google Scholar]
Chakrabarty, N.; Chowdhury, S.; Rana, S.A. Statistical Approach to Graduate Admissions’ Chance Prediction. In Innovations in Computer Science and Engineering; Springer: Singapore, 2020; Volume 103, pp. 333–340. [Google Scholar] [CrossRef]
Singhal, S.I.; Sharma, A. Prediction of Admission Process for Gradational Studies using Al Algorithm. Eur. J. Mol. Clin. Med. 2020, 7, 116–120. [Google Scholar]
Mason, C.; Twomey, J.; Wright, D.; Whitman, L. Predicting Engineering Student Attrition Risk Using a Probabilistic Neural Network and Comparing Results with a Backpropagation Neural Network and Logistic Regression. Res. High. Educ. 2020, 59, 382–400. [Google Scholar] [CrossRef]
Wu, C.; Jiang, H.; Wang, P. Education Quality Detection Method Based on the Probabilistic Neural Network Algorithm. Diagnostyka 2020, 21, 79–86. [Google Scholar] [CrossRef]
Dewantoro, G.; Ardisa, N. A Decision Support System for Undergraduate Students Admissions using Educational Data Mining. In Proceedings of the 7th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE), Semarang, Indonesia, 24–25 September 2020; pp. 105–109. [Google Scholar] [CrossRef]
Mengash, H.A. Using Data Mining Techniques to Predict Student Performance to Support Decision Making in University Admission Systems. IEEE Access 2020, 8, 462–470. [Google Scholar] [CrossRef]
Philipp, R.; Mladenow, A.; Strauss, C.; Voelz, A. Machine Learning as a Service: Challenges in Research and Applications. In Proceedings of the 22nd International Conference on Information Integration and Web-Based Applications & Services, IiWAS ’20, Chiang Mai, Thailand, 30 November–2 December 2020; Association for Computing Machinery: New York, NY, USA; pp. 396–406. [Google Scholar]
Phillips, T.; Abdulla, W. Developing a new ensemble approach with multi-class SVMs for Manuka honey quality classification. Appl. Soft Comput. 2021, 111, 107710. [Google Scholar] [CrossRef]
Rizwan-ul-Hassan, L.; Changgang Liu, Y. Online dynamic security assessment of wind integrated power system using SDAE with SVM ensemble boosting learner. Int. J. Electr. Power Energy Syst. 2021, 125, 106429. [Google Scholar] [CrossRef]
Izonin, I.; Tkachenko, R.; Vitynskyi, P.; Zub, K.; Tkachenko, P.; Dronyuk, I. Stacking-based GRNN-SGTM Ensemble Model for Prediction Tasks. In Proceedings of the International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 8–9 November 2020; pp. 326–330. [Google Scholar] [CrossRef]
Fan, H.; Pei, J.; Zhao, Y. An optimized probabilistic neural network with unit hyperspherical crown mapping and adaptive kernel coverage. Neurocomputing 2019, 373, 24–34. [Google Scholar] [CrossRef]
Mohebali, B.; Tahmassebi, A.; Meyer-Baese, A.; Gandomi, A.H. Chapter 14–Probabilistic neural networks: A brief overview of theory, implementation, and application. In Handbook of Probabilistic Models; Pijush, S., Dieu, T.B., Subrata, C., Ravinesh, C.D., Eds.; Butterworth-Heinemann: Oxford, UK, 2020; pp. 347–367. [Google Scholar]
Izonin, I.; Trostianchyn, A.; Duriagina, Z.; Tkachenko, R.; Tepla, T.; Lotoshynska, N. The Combined Use of the Wiener Polynomial and SVM for Material Classification Task in Medical Implants Production. Intell. Syst. Appl. 2018, 9, 40–47. [Google Scholar] [CrossRef]
Valero-Carreras, D.; Alcaraz, J.; Landete, M. Comparing two SVM models through different metrics based on the confusion matrix. Comput. Oper. Res. 2023, 152, 106131. [Google Scholar] [CrossRef]
Kernel Functions-Introduction to SVM Kernel & Examples. Available online: https://data-flair.training/blogs/svm-kernel-functions (accessed on 15 October 2021).
Zub, K.; Zhezhnych, P. Performance Evaluation of ML-based Classifiers for HEI Graduate Entrants’. In Proceedings of the International Workshop of IT-professionals on Artificial Intelligence, Kharkiv, Ukraine, 20–21 September 2021. [Google Scholar]
Singh, A.; Dhar, A.; Jami, N.; Kashyap, S. Machine Learning Graduate Student Admission Predictor: A Machine Learning Model Build to Help Student and Universities after GRE Exam. 2020. Available online: https://github.com/satwik2663/Machine-Learning-Graduate-Studuent-Admission-Predictor (accessed on 15 October 2022).
C3 AI. Glossary: Definition of Enterprise AI and Data Science Terms. Available online: https://c3.ai/glossary/data-science/recall/ (accessed on 10 January 2023).
Demšar, J.; Curk, T.; Erjavec, A.; Gorup, Č.; Hočevar, T.; Milutinovič, M.; Možina, M.; Polajnar, M.; Toplak, M.; Starič, A.; et al. Orange: Data Mining Toolbox in Python. J. Mach. Learn. Res. 2013, 14, 2349–2353. [Google Scholar]
Izonin, I.; Tkachenko, R.; Ryvak, L.; Zub, K.; Rashkevych, M.; Pavliuk, O. Addressing Medical Diagnostics Issues: Essential Aspects of the PNN-based Approach. In Proceedings of the 3rd International Conference on Informatics & Data-Driven Medicine, Växjö, Sweden, 19–21 November 2020. [Google Scholar]

Figure 1. PNN topology.

Figure 2. Two-stage PNN–SVM Ensemble Model.

Figure 3. PNN optimal parameter identification.

Figure 4. ROC-curves for the designed method: (a) class 1 (admitted); (b) class 2 (not admitted).

Figure 5. ROC curves for the designed method and other ensemble-based classifiers: (a) class 1 (admitted); (b) class 2 (not admitted).

Table 1. Features statistic.

Feature/Statistic Indicator	Mean	Median	Dispersion	Min	Max
GRE score	313.70	314	0.02	287	338
GRE score quant	162.43	163	0.03	142	170
GRE score verbal	151.27	151	0.03	135	170
English test score (TOELF)	97.344	103	0.245	0	120
Undergraduate score	2.9418	3	0.1607	10.6	3.94
Working experience	17.16	15	0.96	0	132
Paper published	0.74	0	1.65	0	3

Table 2. The results of the designed two-stage PNN–SVM ensemble model.

Model	Accuracy	F1-Score	Precision	Recall
Designed two-stage PNN–SVM ensemble model	0.940	0.939	0.940	0.940

Table 3. Comparison of the accuracy of the model with classical ML methods.

Model	Accuracy	F1-Score	Precision	Recall
Designed two-stage PNN–SVM ensemble model	0.940	0.939	0.940	0.940
SVM (RBF-kernel)	0.699	0.676	0.683	0.699
LR	0.689	0.649	0.674	0.689
Neural Network	0.681	0.667	0.665	0.681
SGD	0.678	0.643	0.656	0.678
Decision Tree	0.673	0.666	0.662	0.673
Naïve Bayes	0.666	0.67	0.676	0.666
kNN	0.657	0.617	0.626	0.657
Classical PNN	0.664	0.677	0.692	0.664

Table 4. Comparison of the accuracy of the designed two-stage PNN–SVM ensemble model with its single-based members.

Model	Accuracy	F1-Score	Precision	Recall
Designed two-stage PNN–SVM ensemble model	0.940	0.939	0.940	0.940
SVM (RBF-kernel)	0.938	0.937	0.938	0.938
SVM (linear kernel)	0.938	0.938	0.938	0.938
SVM (polynomial kernel)	0.935	0.935	0.936	0.935
SVM (sigmoid kernel)	0.914	0.914	0.914	0.914
PNN from [28] *	0.68	0.68	0.67	0.68

* a PNN in this case used the initial dataset, but all other methods used the extended dataset.

Table 5. Comparison of the accuracy of the designed two-stage PNN–SVM ensemble model with other ensemble-based classifiers.

Model	Accuracy	F1-Score	Precision	Recall
Designed two-stage PNN–SVM ensemble model	0.940	0.939	0.940	0.940
Random Forest	0.703	0.695	0.693	0.703
AdaBoost	0.655	0.654	0.654	0.655
Stacking of SVMs with different kernels	0.706	0.680	0.693	0.706

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zub, K.; Zhezhnych, P.; Strauss, C. Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction. Big Data Cogn. Comput. 2023, 7, 83. https://doi.org/10.3390/bdcc7020083

AMA Style

Zub K, Zhezhnych P, Strauss C. Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction. Big Data and Cognitive Computing. 2023; 7(2):83. https://doi.org/10.3390/bdcc7020083

Chicago/Turabian Style

Zub, Khrystyna, Pavlo Zhezhnych, and Christine Strauss. 2023. "Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction" Big Data and Cognitive Computing 7, no. 2: 83. https://doi.org/10.3390/bdcc7020083

Article Menu

Two-Stage PNN–SVM Ensemble for Higher Education Admission Prediction

Abstract

1. Introduction

2. Related Work

3. Materials and Methods

3.1. Base Models

3.2. The Proposed Ensemble Model

3.3. Dataset Description

3.4. Performance Indicators

3.5. Optimal Parameters of Ensemble Members

4. Results

5. Comparisons and Discussion

5.1. Comparison with Existing Machine Learning Algorithms

5.2. Comparison with All Single-Based Ensemble Members

5.3. Comparison with Other Ensemble-Based Approaches

5.4. Discussion

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI