Article

A New Student Performance Prediction Method Based on Belief Rule Base with Automated Construction

School of Computer Science and Information Engineering, Harbin Normal University, Harbin 150025, China
*
Author to whom correspondence should be addressed.
Mathematics 2024, 12(15), 2418; https://doi.org/10.3390/math12152418
Submission received: 10 July 2024 / Revised: 31 July 2024 / Accepted: 2 August 2024 / Published: 3 August 2024
(This article belongs to the Section Fuzzy Sets, Systems and Decision Making)

Abstract

Student performance prediction (SPP) is a pivotal task in educational analytics, enabling proactive interventions and optimized resource allocation by educators. Traditional SPP models are often hindered by their complexity and lack of interpretability. This study introduces a novel SPP framework, the Belief Rule Base with automated construction (Auto–BRB), designed to address these issues. Firstly, reference values are derived through data mining techniques. The model employs an IF–THEN rule-based system integrated with evidential reasoning to ensure both transparency and interpretability. Secondly, parameter optimization is achieved using the Projected Covariance Matrix Adaptive Evolution Strategy (P–CMA–ES), significantly enhancing model accuracy. Thirdly, the Akaike Information Criterion (AIC) is applied to fine-tune the balance between model accuracy and complexity. Finally, case studies on SPP have shown that the Auto–BRB model has an advantage over traditional models in terms of accuracy, while maintaining good interpretability. Therefore, Auto–BRB has excellent application effects in educational data analysis.

1. Introduction

Student Performance Prediction (SPP) [1,2,3,4] is the task of predicting how students will perform on certain tests. SPP has long been of interest to educators because of its educational benefits, such as providing high-quality, individualized instruction for each student [5] and the early identification of at-risk students based on their academic performance [6]. Motivated by this interest, many attempts have been made to develop scalable automated SPP systems based on Machine Learning (ML) approaches. However, unreliable and unexplainable results may heighten the ethical risks in educational practices. Therefore, accurate and interpretable SPP models are crucial, especially with the rapid advancement of intelligent technologies [7,8,9].
Previous research includes a series of works using traditional ML approaches. Kabakchieva et al. [2] applied three representative ML methods (Decision Tree, Nearest Neighbor, and Neural Networks) for SPP in college students, demonstrating the potential of ML in this area. Similarly, Al-Shehri et al. [1] utilized Support Vector Machines and Nearest Neighbors for SPP on a university education dataset, highlighting the strong predictive performance of these approaches. Additionally, significant advances in Deep Learning (DL) have led to active research on DL-based SPP methods in recent years [3,10,11,12]. Furthermore, there have been attempts to convert the SPP problem into a recommender system (RecSys) [13,14] in pursuit of highly personalized SPP. The above research is mainly based on data-driven approaches. However, data-driven approaches rely heavily on large sample sizes to train models and are not suitable for small sample sizes.
Data-driven models can create more accurate predictive models with large datasets. However, they often struggle to balance accuracy and interpretability, with higher-accuracy models usually being less interpretable. This lack of interpretability complicates understanding the final results from a pedagogical perspective. Furthermore, algorithmic opacity creates information asymmetry between developers and users, exacerbating educational inequity. On the other hand, knowledge-driven approaches do not significantly enhance predictive accuracy and are less effective in addressing uncertainty. When data collection technology is underdeveloped, learners are more influenced by environmental and psychological factors, causing significant variations in measurements [15]. In such cases, it is difficult to obtain accurate results with knowledge-driven methods. The hybrid-driven approach combines the knowledge and experience of domain experts with historical data to make the model output more accurate, while maintaining a good balance between accuracy and interpretability.
This article introduces the Belief Rule Base (BRB) to achieve a balance between accuracy and interpretability. The BRB model is a hybrid-driven approach proposed by Yang et al. [16]. BRB is based on fuzzy inference, IF–THEN rules, and evidential reasoning (ER), which can address the uncertainty and fuzziness of the information-gathering process [17,18]. Chen et al. [19] applied BRB to the field of education, developing a learning affective assessment method based on BRB, which demonstrated excellent performance compared to traditional ML models. Additionally, BRB is a rule-based modeling approach with good transparency and interpretability. Accordingly, this research focuses on a BRB-based SPP method.
The following problems still exist in modeling real SPPs using BRBs. First, in some complex real-world situations, limited expert knowledge restricts the structure and parameters of the BRB, which may not fully cover the possible states and dynamic changes. Second, real teaching problems have complex features and diverse data distributions, which may change over time, space, or other factors. Traditional BRB models may not fully capture all the complex patterns and adapt to these dynamic changes, leading to poor performance in dealing with complex problems. Additionally, in dynamic environments, the parameters of the model need continuous optimization to adapt to changes, and traditional optimization methods may not effectively address such dynamics. Therefore, an effective optimization method and strategy need to be considered. Finally, there is still a lack of research on balancing model accuracy and complexity, as fixed-structure BRB models usually represent only bias-variance tradeoffs. Real-world engineering problems may require more comprehensive considerations, necessitating attention to both model accuracy and complexity.
To this end, a SPP model based on Belief Rule Base with automated construction (Auto–BRB) is proposed in this article. First, a collection of reference values is constructed based on data mining. Then, transparent and interpretable reasoning is performed using ER rules as the inference mechanism of the model. In addition, the parameter optimization of the model is performed using the P–CMA–ES algorithm to further improve the model’s accuracy. Finally, the model is comprehensively evaluated based on the AIC introduced into BRB to obtain the model with optimal accuracy and complexity. The primary contributions of this article are as follows:
(1) Based on data mining methods, knowledge is extracted from the data. Simultaneously, the AIC is introduced to comprehensively evaluate the model, ensuring the optimal balance of accuracy and complexity while maintaining interpretability. This approach facilitates the automated modeling of the model.
(2) A SPP model based on Auto–BRB is constructed. The model effectively leverages the efficient reasoning ability of Auto–BRB, enabling the accurate prediction of student performance in an educational setting. Additionally, the model balances accuracy and interpretability, thereby reducing potential ethical risks in educational decision-making.
The article is organized as follows. In Section 2, three problems in SPP are analyzed. In Section 3, an Auto–BRB-based SPP model is developed. A case study on the prediction of student performance is described in Section 4. The conclusions and future perspectives of this article are presented in Section 5.

2. Problem Formulation

This section identifies and analyzes key challenges in SPP. Despite the variety of methods available, traditional SPP models often struggle with complex metrics, lacking transparency and interpretability. Understanding the rationale and processes behind model results is crucial. Therefore, constructing accurate and interpretable predictive models is paramount. This study proposes an Auto–BRB-based SPP model, addressing three critical problems to enhance the accuracy and reliability of SPP models.
Problem 1.
How to obtain the initial modeling parameters for Auto–BRB.
In a BRB model, each rule contains several key parameters, such as attribute reference points, attribute reference values, attribute weights, and rule weights. Determining these reference points is the initial phase of constructing a BRB model. Traditionally, the reference points are set by experts and decision makers based on experience. However, in the absence of expert knowledge, model development can become very difficult.
To address this challenge, data mining can be used to extract meaningful reference points from historical data. Data mining is able to analyze data from different perspectives and reveal important patterns and trends. Therefore, it is important to develop an algorithm that automatically extracts and identifies these reference points from historical data. In addition, the number of reference points can significantly affect the accuracy and complexity of BRB models. The process of obtaining the initial parameters can be defined as follows:
$$X = f(\alpha, \gamma)$$
where X represents the initial parameters of the BRB model, f ( α , γ ) represents the data mining process, α represents the historical data, and γ represents the data mining parameters.
Problem 2.
How to optimize Auto–BRB parameters to enhance accuracy.
BRB has superior nonlinear modeling capabilities while ensuring model interpretability. Initial parameters are typically set by experts. However, due to fuzziness in knowledge representation, these parameters may be inaccurate and require optimization. An optimization model based on the P–CMA–ES has been developed to refine these parameters. The optimization process of the model is described as follows:
$$X_{best} = g(O, X)$$
where X b e s t represents the optimized parameters of the BRB model, g ( O , X ) represents the model optimization process, and O represents the parameters of the P–CMA–ES.
Problem 3.
How to comprehensively assess the Auto–BRB.
A comprehensive assessment of the complexity and accuracy of the model is essential. Accuracy ensures that the model can reliably predict student performance, while interpretability and simplicity guarantee that the model is feasible and easy to understand in practical applications. Therefore, it is necessary to establish the evaluation process of the model, which is described as follows:
$$M_{best} = h(a, c)$$
where $M_{best}$ represents the evaluated optimal model, $h(a, c)$ represents the model evaluation process, and the AIC is used as the indicator for the integrated assessment. $a$ represents the accuracy of the model; in this paper, the MSE is used as the accuracy metric. $c$ represents the complexity of the model; the number of rules is used to represent model complexity.

3. Construction of SPP Model Based on Auto–BRB

In this section, the initial parameters of the Auto–BRB model are established in Section 3.1. The inference process of the Auto–BRB model is described in Section 3.2. The parameter optimization process of the Auto–BRB model is given in Section 3.3. The Auto–BRB model synthesis assessment process is described in Section 3.4. Finally, the methodology steps of the Auto–BRB are summarized in Section 3.5.

3.1. Construction of Model Reference Value Sets

Reference points are the basis for constructing SPP models based on BRBs [20]. Normally, experts provide these benchmarks in advance [21]. In practical applications, situations often arise where only historical data are available or there is insufficient expert knowledge, making the development of effective models a complex challenge. Determining the best BRB framework involves identifying the ideal count and set of reference points that most effectively match the data collected from samples and the system’s architecture. Given that BRBs operate on belief rules, positioning the reference points in areas with a high concentration of historical data of input parameters allows these rules to more accurately capture the patterns in the data. Therefore, the choice of reference points ought to be linked to the data’s distribution, ensuring that the data can be accurately depicted within a specified range [22].
In artificial intelligence, data mining entails exploring databases to uncover concealed patterns and has found application across various decision-making support systems [23]. Data mining is essential for developing smart decision support systems that analyze data to reveal hidden patterns and rules. Clustering is a powerful data mining method for discovering the underlying structure in datasets [23]. Therefore, an enhanced algorithm, known as SSE–KPP, has been utilized to extract the set of reference points in BRB models [24]. This algorithm employs the analysis and mining of historical data to iteratively construct a collection of reference points, informed by a logical sequence of decision cycles and the specified count of reference points. The process of the algorithm is described in Figure 1. SSE–KPP enhances the K-means++ algorithm by incorporating a minimum sum of squares error (SSE) constraint, aiming for more precise clustering outcomes [24]. The objective of SSE-KPP is to develop a suitable set of reference points for BRB models to predict student performance through the analysis and mining of historical data.
The quantity of reference points needs to be established within a practical range. Based on the research by Sun et al. [25], the minimum number of reference points should be set to 2. When a rule incorporates an excessive number of reference points, it becomes overly complex and elongated, rendering it difficult to comprehend. Thus, the span of reference points should align closely with human cognitive capabilities. Miller’s research demonstrated that the human brain has a cognitive limit, often referred to as the “magic number 7 (±2)”, for processing information simultaneously [26]. This finding implies that 7 ± 2 represents the upper bound of the human brain’s capacity to process information [27]. Therefore, in this research, the cap for the number of reference points is set to 9, aligning with the cognitive limits identified by Miller. For both attributes and outcomes, the maximum and minimum reference points are set to match the respective highest and lowest values found in the sample data. Therefore, utilizing the SSE-KPP for mining yields 9 − 2 = 7 reference points, setting the range of K to be between 2 and 7. The constraints linked to the reference points derived through the SSE–KPP are articulated as follows:
$$
\begin{aligned}
A_{rv}^{T} &= \left\{ \mathrm{Min}(D_A),\ S_A^{a},\ \mathrm{Max}(D_A) \right\} = \{A_1, A_2, \ldots, A_M\}\\
R_{rv}^{T} &= \left\{ \mathrm{Min}(D_R),\ S_R^{r},\ \mathrm{Max}(D_R) \right\} = \{V_1, V_2, \ldots, V_M\}\\
\text{s.t.}\quad & 2 \le m \le 9\\
& S_A^{a} = \{s_{a,0}, s_{a,1}, \ldots, s_{a,K}\},\quad S_R^{r} = \{s_{r,0}, s_{r,1}, \ldots, s_{r,K}\}\\
& 0 \le a \le 7,\quad 0 \le K \le 7,\quad s_{a,0} = \varnothing\\
& 0 \le r \le 7,\quad s_{r,0} = \varnothing
\end{aligned}
$$
where M i n ( D A ) and M a x ( D A ) denote the minimum and maximum values of the attribute reference value, respectively. M i n ( D R ) and M a x ( D R ) denote the minimum and maximum values of the resulting reference points, respectively. S A a and S R r denote the sets of attribute reference points and result reference points generated by SSE-KPP, which are denoted as the sets of a and r reference points, respectively. Notably, the maximum values of m and K can be lower than 9 and 7, respectively, based on real data.
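As an illustration of how such a reference set can be mined, the sketch below pairs the sample extrema with interior points found by a plain 1-D k-means using k-means++-style seeding. It is only a stand-in for the SSE–KPP algorithm cited above, which additionally imposes a minimum-SSE constraint; all function and variable names are illustrative, not from the paper.

```python
import random

def kmeans_1d(data, k, max_iter=1000, seed=0):
    """Plain 1-D k-means with k-means++-style seeding: a simplified
    stand-in for SSE-KPP (the real algorithm adds an SSE constraint)."""
    rng = random.Random(seed)
    centers = [float(rng.choice(data))]
    while len(centers) < k:
        # Seeding: pick points with probability ~ squared distance to centers.
        d2 = [min((x - c) ** 2 for c in centers) for x in data]
        r, acc = rng.uniform(0, sum(d2)), 0.0
        for x, w in zip(data, d2):
            acc += w
            if acc >= r:
                centers.append(float(x))
                break
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            clusters[nearest].append(x)
        new = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
        if new == centers:  # converged
            break
        centers = new
    return sorted(centers)

def reference_points(data, k):
    """Sample minimum and maximum plus k mined interior reference values."""
    return sorted({float(min(data)), *kmeans_1d(data, k), float(max(data))})

grades = [4, 5, 5, 6, 9, 10, 10, 11, 14, 15, 15, 16, 19, 20]
print(reference_points(grades, 3))
```

Running `reference_points` on a toy grade list returns the data extrema together with the mined interior values, giving a reference set of the $A_{rv}^{T}$ form above.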
In the modeling of SPP, it becomes apparent that relying solely on limited expert knowledge is insufficient for determining the model structure. Minor variations in the model configuration can substantially influence the precision of the assessments [28]. Prior to choosing an effective model structure, constructing a holistic model integration is crucial. This ensures that the selection of a suitable model structure for predicting student performance can be achieved under various conditions.
To meet this objective, this study merges the acquired set of attribute reference points A r v m T and the resultant reference value set R r v m T to create a diversified and integrated BRB structure. This integrated evaluation of various structures enhances both the adaptability and precision of the model, effectively catering to diverse assessment demands.
A new form of BRB ensemble construction is proposed as follows:
$$
\begin{aligned}
BRB_{sets} &= \left\{ BRB^{1}, BRB^{2}, \ldots, BRB^{m_a} \right\}\\
BRB^{m_a} &= \left\{ BRB_{1}^{m_a}, BRB_{2}^{m_a}, \ldots, BRB_{m_r}^{m_a} \right\}
\end{aligned}
$$
where B R B s e t s denotes the full set of models. m a is defined as the number of attribute reference points, and m r is defined as the number of outcome reference points. Building a set of BRB models with diverse structures is crucial for choosing an effective framework for predicting student performance. This method offers a systematic approach to investigate and select the model structure that optimally aligns with particular requirements. The process of dynamically constructing a model set for SPP based on different numbers of reference points is shown in Figure 2.
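A minimal sketch of the ensemble construction: enumerating every combination of attribute and result reference-point counts yields the candidate structures in the model set. The two-attribute rule count $m_a^2$ reflects the two feature attributes (G1, G2) used later in the case study; the ranges and names here are illustrative, not the paper's exact loop.

```python
from itertools import product

# Candidate counts of reference points, spanning the cognitive
# range 2..9 discussed in Section 3.1 (illustrative assumption).
M_A = range(2, 10)  # attribute reference points per attribute
M_R = range(2, 10)  # result reference points

def build_model_set():
    """Enumerate the (m_a, m_r) structures making up the BRB model set."""
    return [{"m_a": m_a, "m_r": m_r, "n_rules": m_a ** 2}
            for m_a, m_r in product(M_A, M_R)]

models = build_model_set()
print(len(models))  # 8 x 8 = 64 candidate structures
```

Each entry of the resulting list fixes one BRB structure, which can then be trained and scored independently before the AIC-based comparison of Section 3.4.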

3.2. Reasoning Process of Auto–BRB

After establishing models for SPP, a reasoning process can be employed for each of the models. The BRB approach utilizes evidential reasoning (ER) to accomplish this objective [16]. ER enables the integration and processing of evidence from diverse sources and types while also managing the conflicts and synergies among the pieces of evidence. This approach allows for a comprehensive analysis that considers the varying degrees of reliability and relevance of each piece of evidence, facilitating more nuanced and accurate decision–making processes [29]. In the proposed new modeling approach for SPP, ER serves as the foundational mechanism for amalgamating rules during the final stage of the reasoning process [30]. The theoretical inference process of BRB is usually divided into five steps: input conversion, rule activation weight calculation, matching degree normalization, rule aggregation, and utility calculation.
  • Step 1: Input Conversion
First, the input information is converted into a reference value distribution.
$$S(x_i) = \left\{\left(A_{i,j}, a_{i,j}\right),\ i = 1, 2, \ldots, M;\ j = 1, 2, \ldots, J_m\right\}$$
where A i , j is defined as the jth reference value of the ith attribute. a i , j is defined as the distribution level of the reference value.
  • Step 2: Matching Degree Calculation
The reference value distribution is calculated as the match degree, and the calculation process is given as follows:
$$\alpha_k = \prod_{i=1}^{T} \left(a_i^{k}\right)^{\bar{\delta}_i}$$
$$\bar{\delta}_i = \delta_i \Big/ \max_{i=1,2,\ldots,T}\left\{\delta_i\right\}$$
where δ ¯ i is defined as the attribute normalized weights. α k is defined as the match degree of the kth rule.
  • Step 3: Rule Activation Weight Calculation
The rule needs to calculate whether it is activated by the activation weight, and the calculation process is given as follows:
$$w_k = \frac{\theta_k \alpha_k}{\sum_{l=1}^{L} \theta_l \alpha_l}$$
where w k is defined as the weight of the activation of the rule.
  • Step 4: Rule aggregation
The rule aggregation process is the process of fusing all active rules to generate the final belief distribution, and its calculation process is given as follows:
$$\beta_n = \frac{\prod_{k=1}^{L}\left(w_k \beta_{n}^{k} + \gamma_{n,i}^{k}\right) - \prod_{k=1}^{L} \gamma_{n,i}^{k}}{\sum_{n=1}^{N} \prod_{k=1}^{L}\left(w_k \beta_{n}^{k} + \gamma_{n,i}^{k}\right) - (N-1)\prod_{k=1}^{L} \gamma_{n,i}^{k} - \prod_{k=1}^{L}\left(1 - w_k\right)}$$
$$\gamma_{n,i}^{k} = 1 - w_k \sum_{i=1}^{N} \beta_{i}^{k}$$
where β n denotes the belief level of the result. γ denotes an intermediate parameter.
The resulting belief levels are distributed as follows:
$$S\left(A^{*}\right) = \left\{\left(D_n, \beta_n\right);\ n = 1, 2, \ldots, N\right\}$$
where A * is defined as the vector of inputs.
  • Step 5: Utility Calculation
The final utility of S A * is calculated as follows:
$$u\left(S\left(A^{*}\right)\right) = \sum_{n=1}^{N} u\left(D_n\right)\beta_n$$
where u D n denotes the utility value.
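The five inference steps above can be condensed into a short sketch. For brevity it handles a single attribute, so the matching degrees of Step 2 reduce to the transformed input itself (all attribute weights equal to 1); the aggregation follows the analytic ER formula of Step 4. All names are illustrative.

```python
def transform(x, refs):
    """Step 1: distribute a crisp input over adjacent reference values."""
    if x <= refs[0]:
        return [1.0] + [0.0] * (len(refs) - 1)
    for j in range(len(refs) - 1):
        lo, hi = refs[j], refs[j + 1]
        if lo <= x <= hi:
            a = [0.0] * len(refs)
            a[j] = (hi - x) / (hi - lo)
            a[j + 1] = 1.0 - a[j]
            return a
    return [0.0] * (len(refs) - 1) + [1.0]

def er_infer(match, rule_weights, beliefs, utilities):
    """Steps 3-5: activation weights, analytic ER aggregation, utility.

    match[k]      -- matching degree alpha_k of rule k (Step 2)
    beliefs[k][n] -- belief degree beta_n^k of rule k in outcome D_n
    """
    L, N = len(beliefs), len(utilities)
    total = sum(t * a for t, a in zip(rule_weights, match))
    w = [t * a / total for t, a in zip(rule_weights, match)]  # Step 3
    gamma = [1.0 - w[k] * sum(beliefs[k]) for k in range(L)]
    prod_n = [1.0] * N
    for n in range(N):
        for k in range(L):
            prod_n[n] *= w[k] * beliefs[k][n] + gamma[k]
    prod_gamma, prod_rest = 1.0, 1.0
    for k in range(L):
        prod_gamma *= gamma[k]
        prod_rest *= 1.0 - w[k]
    denom = sum(prod_n) - (N - 1) * prod_gamma - prod_rest
    beta = [(prod_n[n] - prod_gamma) / denom for n in range(N)]  # Step 4
    return sum(u * b for u, b in zip(utilities, beta))           # Step 5

# Two rules anchored at grades 0 and 20; an input of 15 activates both.
a = transform(15.0, [0.0, 20.0])  # [0.25, 0.75]
score = er_infer(a, [1.0, 1.0],
                 beliefs=[[1.0, 0.0], [0.0, 1.0]],
                 utilities=[0.0, 20.0])
print(score)
```

Note that the ER fusion is nonlinear: the predicted grade here is not the linear interpolation 15, because the dominant rule is reinforced during aggregation.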

3.3. Optimization Process of Auto–BRB

Model accuracy is a crucial metric for gauging model capability. To further enhance model accuracy, this paper explores model optimization strategies. Within the context of a BRB-based prediction model for student performance, the accuracy of the BRB model is influenced not only by the reference points but also by changes in other parameters. For example, if the model’s prediction accuracy is suboptimal, this could be attributed to the presence of residual belief within the rule’s belief distribution [31]. In this case, adjusting the belief distribution becomes essential for enhancing the accuracy of the model. Thus, the focus of this paper during model optimization is to refine the belief distribution, rule weights, and attribute weights with the aim of minimizing the mean squared error (MSE). This approach targets the precise calibration of these elements to improve the predictive performance and reliability of the model.
In the present stage of research, many optimization algorithms are used as optimization models. A notable optimization algorithm frequently applied to the BRB model is the P–CMA–ES. This method is recognized for its effectiveness in addressing complex optimization problems, particularly in adjusting parameters within BRB models to achieve enhanced accuracy and efficiency [22]. It is suitable for addressing complex optimization issues that are nonlinear and nonconvex in continuous settings. P–CMA–ES offers the following advantages [24]: (1) robust and rapid convergence, (2) ability to simulate biological evolution principles, and (3) strong performance. Considering these benefits, this study employs the P–CMA–ES to fine-tune the parameters of the model created by Auto–BRB, aiming to further enhance the model’s accuracy.
The P–CMA–ES unfolds through the following steps:
  • Step 1: Determine the initial parameters to be optimized w 0 = Ω 0 and the initial parameters of the P–CMA–ES.
    $$\Omega_0 = \left\{\theta_1, \ldots, \theta_L,\ \beta_{1,1}, \ldots, \beta_{L,N},\ \delta_1, \ldots, \delta_M\right\}$$
    where Ω 0 is defined as the initial parameter vector to be optimized and w 0 is defined as the initial mean value.
  • Step 2: Determine the objective function and the constraints. The MSE denotes the modeling accuracy of BRB and is calculated by:
    $$MSE\left(\theta_k, \beta_{n,k}, \delta_i\right) = \frac{1}{T}\sum_{t=1}^{T}\left(result_{actual} - result_{predict}\right)^{2}$$
    where $T$ is the amount of observation data, $result_{actual}$ is the actual output of the system, and $result_{predict}$ is the predicted result of the BRB.
Based on the above definition, the objective function and constraints are given as follows:
$$
\begin{aligned}
\min\ & MSE\left(\theta_k, \beta_{n,k}, \delta_i\right)\\
\text{s.t.}\ & 0 \le \theta_k \le 1,\quad 0 \le \beta_{n,k} \le 1,\quad 0 \le \delta_i \le 1,\quad \sum_{n=1}^{N}\beta_{n,k} \le 1\\
& k = 1, 2, \ldots, L,\quad n = 1, 2, \ldots, N,\quad i = 1, 2, \ldots, M
\end{aligned}
$$
  • Step 3: Execute a sampling operation to generate the population:
    $$\Omega_i^{g+1} \sim w^{g} + \varepsilon^{g}\,\mathcal{N}\left(0, C^{g}\right), \qquad i = 1, \ldots, \lambda$$
    where $\Omega_i^{g+1}$ denotes the $i$th solution in the $(g+1)$th generation, $\varepsilon$ denotes the step size, $\mathcal{N}(0, C^{g})$ denotes the normal distribution, and $C^{g}$ denotes the covariance matrix in the $g$th generation.
  • Step 4: Execute the projection operations to satisfy the constraints:
    $$\Omega_i^{g+1}\left(1 + n_e \times (j-1) : n_e \times j\right) = \Omega_i^{g+1}\left(1 + n_e \times (j-1) : n_e \times j\right) - A_e^{T} \times \left(A_e \times A_e^{T}\right)^{-1} \times \Omega_i^{g+1}\left(1 + n_e \times (j-1) : n_e \times j\right) \times A_e$$
    where the hyperplane can be denoted by $A_e \Omega_i^{g}\left(1 + n_e \times (j-1) : n_e \times j\right) = 1$; $n_e$ represents the count of variables involved in the equality constraint of the solution $\Omega_i^{g}$; $j = 1, \ldots, N+1$ indicates the quantity of equality constraints present in the solution $\Omega_i^{g}$; and $A_e = [1 \ \cdots \ 1]_{1 \times N}$ denotes the parameter vector.
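With $A_e = [1 \ \cdots \ 1]$, the projection in Step 4 has a simple closed form: subtract the average constraint violation from every coordinate of the belief segment. A minimal sketch (variable names illustrative):

```python
def project_segment(beta):
    """Project one belief-degree segment onto the hyperplane sum(beta) = 1.

    For A_e = [1, ..., 1], the general projection
    x - A_e^T (A_e A_e^T)^-1 (A_e x - 1) reduces to subtracting the
    mean violation (sum(x) - 1) / n from every coordinate.
    """
    n = len(beta)
    shift = (sum(beta) - 1.0) / n
    return [b - shift for b in beta]

projected = project_segment([0.5, 0.4, 0.3])  # input sums to 1.2
print(sum(projected))                          # sums to 1 (up to floating point)
```

Note that this hyperplane projection alone can produce slightly negative belief degrees; a full implementation would additionally clip or re-project to satisfy the bound constraints of the optimization problem.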
  • Step 5: Perform selection to revise the mean based on $w^{g+1} = \sum_{i=1}^{\tau} h_i \Omega_{i:\lambda}^{g+1}$, where $h_i$ denotes the weight coefficient of the $i$th solution, $\Omega_{i:\lambda}^{g+1}$ denotes the $i$th best of the $\lambda$ solutions in the $(g+1)$th generation, and $\tau$ denotes the offspring population size.
  • Step 6: Carry out adaptation to refine the covariance matrix.
    $$C^{g+1} = (1 - c_1 - c_2)\,C^{g} + c_1\, p_c^{g+1}\left(p_c^{g+1}\right)^{T} + c_2 \sum_{i=1}^{\tau} h_i \left(\frac{\Omega_{i:\lambda}^{g+1} - w^{g}}{\varepsilon^{g}}\right) \left(\frac{\Omega_{i:\lambda}^{g+1} - w^{g}}{\varepsilon^{g}}\right)^{T}$$
    $$p_c^{g+1} = (1 - c_c)\, p_c^{g} + \sqrt{c_c (2 - c_c)}\,\left(\sum_{i=1}^{\tau} h_i^{2}\right)^{-0.5} \frac{w^{g+1} - w^{g}}{\varepsilon^{g}}$$
    where the step size is adjusted according to the following equation:
    $$\varepsilon^{g+1} = \varepsilon^{g} \exp\left(\frac{c_\sigma}{d_\sigma}\left(\frac{\left\| p_\sigma^{g+1} \right\|}{E\left\| \mathcal{N}(0, I) \right\|} - 1\right)\right)$$
    $$p_\sigma^{g+1} = (1 - c_c)\, p_\sigma^{g} + \sqrt{c_c (2 - c_c)}\,\left(\sum_{i=1}^{\tau} h_i^{2}\right)^{-0.5} \left(C^{g}\right)^{-0.5} \frac{w^{g+1} - w^{g}}{\varepsilon^{g}}$$
    where c 1 and c 2 are defined as the learning rates, p c is defined as the evolution path, and c c is identified as the retrospective duration of the evolutionary path.
  • Step 7: Repeat the process iteratively until the optimal solution $\Omega_{optimal}$ is reached. The optimal BRB is then modeled.
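To make the loop concrete, here is a deliberately simplified evolution-strategy sketch covering Steps 3, 5 and 7 only: sample a population around the mean, keep the best $\tau$ solutions, and recombine the mean. The full P–CMA–ES additionally performs the projection (Step 4) and adapts the covariance matrix, evolution paths and step size (Step 6); all names and the toy objective are illustrative assumptions.

```python
import math
import random

def simplified_es(objective, x0, sigma=0.3, lam=12, tau=4,
                  generations=60, seed=1):
    """Sample lam candidates per generation (Step 3), select the best
    tau and recombine the mean (Step 5), then iterate (Step 7)."""
    rng = random.Random(seed)
    mean = list(x0)
    # Log-rank recombination weights, normalized to sum to 1.
    weights = [math.log(tau + 0.5) - math.log(i + 1) for i in range(tau)]
    s = sum(weights)
    weights = [w / s for w in weights]
    for _ in range(generations):
        pop = [[m + sigma * rng.gauss(0, 1) for m in mean]
               for _ in range(lam)]
        pop.sort(key=objective)  # best first
        mean = [sum(w * ind[d] for w, ind in zip(weights, pop))
                for d in range(len(mean))]
        sigma *= 0.97  # crude stand-in for real step-size adaptation
    return mean

# Toy quadratic standing in for the BRB MSE objective of Step 2.
target = [0.2, 0.5, 0.3]
mse = lambda x: sum((a - b) ** 2 for a, b in zip(x, target)) / len(x)
best = simplified_es(mse, [0.9, 0.1, 0.0])
print(round(mse(best), 6))
```

Even this stripped-down variant drives the toy MSE down by orders of magnitude, which is the behavior the covariance and step-size adaptation of the full algorithm accelerate on harder, ill-conditioned objectives.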

3.4. Model Assessment of Auto–BRB

In SPP, accuracy is a critical measure that significantly influences the practicality of the BRB model [20]. Moreover, the model’s complexity plays a vital role in its comprehension and acceptance by decision-makers, marking a primary advantage of the BRB model over other nonlinear models and securing its usage in SPP. Metrics concerning the model’s structure, including the number of rules and prior attributes, are deemed appropriate for gauging model complexity [20]. In practical scenarios, models that are understandable to decision-makers are necessary. An excessive number of reference points and rules can obscure the decision-making procedure, complicating comprehension and consequently diminishing the decision-maker’s trust in the model. Conversely, too few rules may compromise the accuracy of the model. Thus, conducting a thorough assessment of both model complexity and accuracy is vital. The Akaike Information Criterion (AIC) [32,33] has proven to be effective, and since its proposal, it has been applied in a wide variety of fields [32,34]. The AIC based on the BRB has been derived [35].
The AIC was initially introduced by Akaike. For any specified linear model, there exists:
$$z = h_0 + h_1 \chi_1 + h_2 \chi_2 + \cdots + h_N \chi_N + e$$
where z represents the output matrix, h N represents the input matrix, χ N represents the nth model parameter, and e represents the model inaccuracies.
Akaike defined new criteria to assess models:
$$AIC = -2\log L\left(\chi_{ML}\right) + 2N$$
where χ M L represents the maximum likelihood estimate (MLE) of the parameters χ = [ χ 1 , χ 2 , , χ N ] . L ( χ M L ) is denoted as the likelihood function under χ M L , and N is denoted as the number of model parameters.
Suppose $(X, Y)$ refers to the training dataset, where $X$ is the input matrix and $Y$ is the output matrix. $X$ has two dimensions: one is the quantity of training datasets, $P$, and the other is the quantity of independent parameters, $N$. Therefore, $X = [X_n^{p},\ n = 1, 2, \ldots, N;\ p = 1, 2, \ldots, P]$. $Y$ is one-dimensional, with length equal to the number of training datasets. Therefore, $Y = [y_1, y_2, \ldots, y_P]^{T}$. The output of the BRB can be modeled by
$$f(X_p) = \omega_0 + \sum_{n=1}^{N}\omega_n \phi_n(X_p)$$
where ω n represents the weight of the nth independent parameter and n = 1 , 2 , , N and ϕ n ( X p ) indicate the relationships between the input matrix and the predicted output matrix of the BRB model in relation to the nth independent variable and the pth set of training data, respectively. f ( X p ) represents the output matrix of the BRB with the input matrix X p .
Let $\varepsilon_p$ be the error between $f(X_p)$ and $y_p$, and assume that $\varepsilon_p$ follows the normal distribution $\varepsilon_p \sim \mathcal{N}\left(0, \sigma^2\right)$. Equation (25) can then be calculated as:
$$y_p = f(X_p) + \varepsilon_p = \omega_0 + \sum_{n=1}^{N}\omega_n \phi_n(X_p) + \varepsilon_p$$
Based on Equation (26), $y_p \sim \mathcal{N}\left(\omega_0 + \sum_{n=1}^{N}\omega_n \phi_n(X_p),\ \sigma^2\right)$. The likelihood function of Equation (27) can be written as:
$$L\left(Y, W, \sigma^2\right) = \prod_{p=1}^{P} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2\sigma^2}\left(y_p - \omega_0 - \sum_{n=1}^{N}\omega_n \phi_n(X_p)\right)^{2}\right\} = \left(2\pi\sigma^2\right)^{-\frac{P}{2}} \exp\left\{-\frac{1}{2\sigma^2}\sum_{p=1}^{P}\left(y_p - \omega_0 - \sum_{n=1}^{N}\omega_n \phi_n(X_p)\right)^{2}\right\}$$
where W = [ ω 1 , ω 2 , , ω n ] T .
By applying a logarithmic transformation, Equation (27) can be expressed as
$$\ln L = -\frac{P}{2}\ln(2\pi) - \frac{P}{2}\ln\left(\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{p=1}^{P}\left(y_p - \omega_0 - \sum_{n=1}^{N}\omega_n \phi_n(X_p)\right)^{2}$$
By calculating the partial derivatives of $\ln L$ with respect to $\omega_n$ and $\sigma^2$ based on Equation (28) and setting them to 0, the following equations are obtained, as shown in Equation (29):
$$\frac{\partial \ln L}{\partial \omega_n} = 0, \qquad \frac{\partial \ln L}{\partial \sigma^2} = 0$$
where $n = 1, 2, \ldots, N$.
By solving Equation (29), the following solution can be obtained:
$$W = \left[\omega_1, \omega_2, \ldots, \omega_N\right]^{T}$$
The maximum likelihood estimation for W and σ 2 can be calculated by
$$W = \left(G^{T} G\right)^{-1} G^{T} Y$$
$$\sigma^2 = \frac{1}{P}\left(Y - G W\right)^{T}\left(Y - G W\right)$$
where
$$G = \begin{bmatrix} 1 & \phi_1\left(X_1^{1}\right) & \cdots & \phi_N\left(X_N^{1}\right) \\ 1 & \phi_1\left(X_1^{2}\right) & \cdots & \phi_N\left(X_N^{2}\right) \\ \vdots & \vdots & & \vdots \\ 1 & \phi_1\left(X_1^{P}\right) & \cdots & \phi_N\left(X_N^{P}\right) \end{bmatrix}$$
Equations (31) and (32) are in matrix form. Equation (28) can be calculated as
$$\ln L\left(Y, W, \sigma^2\right) = -\frac{P}{2}\ln(2\pi) - \frac{P}{2}\ln\left(\sigma^2\right) - \frac{P}{2}$$
With Equation (33), Equation (24) can be presented as
$$AIC_{BRB} = -2\left[-\frac{P}{2}\ln(2\pi) - \frac{P}{2}\ln\left(\sigma^2\right) - \frac{P}{2}\right] + 2N$$
Equation (34) can be rewritten as:
$$AIC_{BRB} = P\ln\left(\sigma^2\right) + 2N + C$$
where $C = P\ln(2\pi) + P$ represents a constant and is independent of $N$.
The constant, C, can be disregarded when comparing various models. Thus, Equation (35) can be reformulated as
$$AIC_{BRB} = P\ln\left(\sigma^2\right) + 2N$$
In Equation (36), σ 2 can be calculated by
$$\sigma^2 = P \cdot MSE$$
Based on Equations (36) and (37), Equation (36) can be calculated as
$$AIC_{BRB} = P\ln\left(P \cdot MSE\right) + 2N$$
Equation (38) reveals that the first component represents the MSE, signifying the model’s accuracy, while the second component indicates the number of parameters, reflecting the model’s complexity.
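Equation (38) makes model comparison a one-line computation. The sketch below (with made-up MSE values and parameter counts, not numbers from the paper) shows how a smaller rule base can win on AIC even with a slightly worse MSE:

```python
import math

def aic_brb(P, mse, n_params):
    """AIC of a BRB model per Eq. (38): P * ln(P * MSE) + 2 * N."""
    return P * math.log(P * mse) + 2 * n_params

# 130 training samples, as in the case study of Section 4.2;
# MSE and parameter counts are illustrative assumptions.
small = aic_brb(P=130, mse=0.50, n_params=20)   # few rules
large = aic_brb(P=130, mse=0.45, n_params=120)  # many rules, better MSE
print(small < large)  # True: the simpler model is preferred
```

The $2N$ penalty grows linearly in the parameter count, so adding rules pays off only when the accuracy gain outweighs that penalty.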

3.5. Summary of the Process of Auto–BRB

The steps of the Auto–BRB methodology are summarized in this section.
Step 1: This step utilizes the approach detailed in Section 3.1 to derive a comprehensive set of reference points from historical data analysis. This foundational setup is crucial for the belief rule framework, enabling the precise prediction of student performance in the Auto–BRB.
Step 2: Leveraging the previously established reference value set, diverse BRB models are developed by integrating the various combinations of attribute and result reference points. This process leads to the formation of models with unique structures, each characterized by a specific set of these reference points, to establish belief rules. By constructing a collection of BRB models, the methodology allows for a comparative analysis to identify the most effective structure for accurately predicting student performance based on empirical data and expert insights.
Step 3: Within the constructed collection of BRB models, each model undergoes a reasoning process using the ER inference method. This crucial step involves the computation of activation weights for the rules within each model, facilitating the inference process based on the available evidence. Following initial reasoning, the model parameters are then fine-tuned using the P–CMA–ES optimization algorithm. This sophisticated optimization technique is employed to enhance the model’s accuracy by iteratively adjusting its parameters.
Step 4: Comprehensive assessment. According to the assessment method, a comprehensive assessment of the model’s accuracy versus complexity is conducted, aiming to construct a model that integrates both complexity and accuracy.
To summarize, the Auto–BRB methodology involves several critical stages: the creation of a reference value set, the assembly of an integrated BRB model, the application of reasoning and optimization techniques, and a thorough assessment. Through these steps, a BRB model can be constructed that is compatible with the assessment objectives and can ensure a trade–off between accuracy and complexity to improve the prediction accuracy. The overall process of the method is shown in Figure 3.

4. Case Study

In this section, the background of the case study is described in Section 4.1. The Auto–BRB model is developed in Section 4.2. The analysis of the results is described in Section 4.3. A comparative study is carried out in Section 4.4. The research on the generalization ability of the model is described in Section 4.5. Finally, a discussion is presented in Section 4.6.

4.1. Background Description

The data for this case study relate to student performance in secondary education at two schools in Portugal. Data attributes include student grades and demographic, social, and school-related characteristics, collected using school reports and questionnaires. The dataset was obtained from https://archive.ics.uci.edu/dataset/320/student+performance (accessed on 27 November 2014), which provides two datasets on performance in two different subjects, namely Portuguese (por) and mathematics (mat); Portuguese performance was used for this case study and mathematics performance for the research on the generalization ability of the model. According to the research of Paulo Cortez et al. [36], G1 (first period grade) and G2 (second period grade) are strongly correlated with the target attribute G3 (final grade). Therefore, G1 and G2 were chosen as feature attributes, and G3 was chosen as the outcome attribute. The data distribution is shown in Figure 4. The relationship between input and output variables is shown in Figure 5. The statistical analysis of the dataset is shown in Table 1.

4.2. Construction of Auto–BRB

To demonstrate the enhanced accuracy of the Auto–BRB model, an SPP model is developed using Auto–BRB. In this experimental setup, 130 samples from the dataset are chosen as the training dataset, while the remaining 519 samples serve as the test dataset. The first step involves constructing the reference value set, where each set of reference points should encompass both the maximum and minimum values of the data. To complement these boundary values, the remaining reference points are generated using the SSE–KPP algorithm. The experiment sets the number of iterations for the SSE–KPP (MaxIter KPP) and the number of updates for the cluster center (MaxIter) both to 1000, ensuring a robust optimization process. The constructed reference value sets for G1 are detailed in Table 2, with specific reference value sets for G2 and G3 outcomes illustrated in Table 3 and Table 4, respectively. This structured approach enables a comprehensive evaluation of the Auto–BRB model’s ability to accurately predict student performance, emphasizing the importance of meticulous reference value selection and algorithmic optimization to enhance model performance.
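The paper's SSE–KPP routine is not reproduced here, but the reference-value construction it performs can be sketched as follows: keep the data minimum and maximum as fixed endpoints and fill the interior with cluster centres from a k-means++-seeded one-dimensional k-means. The clustering implementation, iteration caps, and helper names below are illustrative assumptions, not the authors' exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans_centers(x, k, max_iter=1000):
    """1-D k-means with k-means++ seeding (an illustrative stand-in for SSE-KPP)."""
    x = np.asarray(x, dtype=float)
    centers = [x[rng.integers(x.size)]]
    while len(centers) < k:                       # k-means++ seeding
        d2 = np.min((x[:, None] - np.array(centers)[None, :]) ** 2, axis=1)
        centers.append(x[rng.choice(x.size, p=d2 / d2.sum())])
    centers = np.array(centers)
    for _ in range(max_iter):                     # Lloyd updates
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new = np.array([x[labels == j].mean() if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers

def reference_set(x, n_refs):
    """Reference values: data min/max as fixed endpoints, interior points
    from clustering, as in the construction of Tables 2-4."""
    lo, hi = float(np.min(x)), float(np.max(x))
    if n_refs == 2:
        return [lo, hi]
    interior = kmeans_centers(np.asarray(x, dtype=float), n_refs - 2)
    return sorted([lo, hi] + [float(c) for c in interior])
```

Calling `reference_set(grades, n)` for n = 2, …, 8 would yield one candidate row per table, always containing the data boundaries as the outermost reference points.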

4.3. Analysis of Results

The initial model library contains a total of 392 models. The belief degree, rule weights, and attribute weights of the initial BRB models are determined by a stochastic algorithm. Parameters such as the belief distribution, attribute weights, and rule weights significantly influence the model’s accuracy. Once the BRB model library has been established, the focus shifts to refining the remaining parameters, a critical step for enhancing the precision and structure of the model. This optimization process is essential for fine-tuning the model to achieve higher accuracy in its predictions and assessments. Strategically segmenting belief intervals plays a crucial role in minimizing variance in model evaluations and elevating the accuracy of the model. In this experiment, the P–CMA–ES algorithm was used to refine the remaining parameters, with the goal of significantly boosting the model’s precision. By comprehensively evaluating the models in the model library, the optimal reference value structures were determined as follows: G1 {0, 9.82, 14.39, 20}, G2 {0, 8.63, 13.16, 20}, and G3 {0, 7.94, 20}.
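The library size of 392 follows directly from the candidate reference sets: seven options for G1 (Table 2), seven for G2 (Table 3), and eight for G3 (Table 4), giving 7 × 7 × 8 = 392 combinations. Represented by their reference-point counts:

```python
from itertools import product

# Candidate reference-set sizes per attribute (see Tables 2-4).
g1_options = range(2, 9)    # 7 candidate reference sets for G1
g2_options = range(2, 9)    # 7 candidate reference sets for G2
g3_options = range(2, 10)   # 8 candidate reference sets for G3

# One initial BRB model per combination of reference sets.
model_library = list(product(g1_options, g2_options, g3_options))
print(len(model_library))   # 392
```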
In this study, the MAE, RMSE, MSE, R² (coefficient of determination), VAF (variance accounted for), and a₁₀ (percentage of predictions within a 10% relative error) are used as the evaluation indexes of the model. They are calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{Q}\sum_{t=1}^{Q}\left(\mathrm{predict}_t - \mathrm{actual}_t\right)^2}$$

$$\mathrm{MAE} = \frac{1}{Q}\sum_{t=1}^{Q}\left|\mathrm{predict}_t - \mathrm{actual}_t\right|$$

$$R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}$$

$$\mathrm{VAF} = \left(1 - \frac{\mathrm{var}(e)}{\mathrm{var}(y)}\right)\times 100$$

$$a_{10} = \frac{1}{Q}\sum_{t=1}^{Q}\mathbb{I}\!\left(\frac{\left|\mathrm{predict}_t - \mathrm{actual}_t\right|}{\mathrm{actual}_t} < 0.1\right)\times 100$$

where Q is the number of data points, predict_t is the predicted value for the tth data point, actual_t is the true value for the tth data point, SS_res is the residual sum of squares, SS_tot is the total sum of squares, var(e) is the variance of the prediction error, var(y) is the variance of the actual data, and I(·) is the indicator function. MSE is the square of RMSE.
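For concreteness, these metrics translate directly into NumPy (the function names are ours; `a10` assumes no actual value is zero):

```python
import numpy as np

def mse(pred, actual):
    return float(np.mean((pred - actual) ** 2))

def rmse(pred, actual):
    return float(np.sqrt(mse(pred, actual)))

def mae(pred, actual):
    return float(np.mean(np.abs(pred - actual)))

def r2(pred, actual):
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def vaf(pred, actual):
    return float((1.0 - np.var(actual - pred) / np.var(actual)) * 100)

def a10(pred, actual):
    rel = np.abs(pred - actual) / actual   # assumes actual != 0
    return float(np.mean(rel < 0.1) * 100)
```

Note that VAF can be 100% even for a biased predictor (a constant error has zero variance), which is why it is reported alongside MAE and RMSE rather than alone.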
The final optimized Auto–BRB model has an MAE of 0.8452, an RMSE of 1.3603, an MSE of 1.8505, an R² of 0.8198, a VAF of 82.33%, and an a₁₀ of 83.24%; the fit of the model predictions to the real data is shown in Figure 6. It can be seen that the optimized model has high accuracy. Crucially, the Auto–BRB leverages a rule-based modeling framework that promotes transparency throughout the modeling and reasoning processes. This transparency helps decision-makers understand the model structure, thereby enhancing the acceptability of the model's decision outcomes. Moreover, the experiments illustrate Auto–BRB's ability to model effectively even in scenarios where expert knowledge is lacking, further attesting to its versatility and efficacy in evaluation.

4.4. Comparative Study of Auto–BRB

In order to analyze the accuracy of Auto–BRB, we compared it with other models: Backpropagation Neural Network (BPNN), Long Short-Term Memory network (LSTM), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN). As shown in Table 5, the experiments compare the MAE, RMSE, MSE, R², VAF, and a₁₀ of these models, and the results show that Auto–BRB has good accuracy compared to the other models. The model tests were implemented in MATLAB. The outputs of the comparative models are shown in Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11, and the MAE values for the six models are 0.8452 (Auto–BRB), 0.9314 (BPNN), 1.0169 (LSTM), 0.9190 (DT), 0.9012 (RF), and 0.8709 (KNN). The hyperparameters for the comparison models are presented in Table 6.
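As a sketch of how the Table 6 settings map onto code, the scikit-learn baselines can be configured as follows. The synthetic (G1, G2) → G3 data and the 130/519 split mirror Section 4.2, but the data itself is fabricated for illustration, and the BPNN/LSTM baselines are omitted for brevity:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the (G1, G2) -> G3 data; 130 training / 519 test
# samples, mirroring the split in Section 4.2.
X = rng.uniform(0, 20, (649, 2))
y = 0.3 * X[:, 0] + 0.6 * X[:, 1] + rng.normal(0.0, 1.0, 649)
X_train, y_train, X_test, y_test = X[:130], y[:130], X[130:], y[130:]

# Hyperparameters taken from Table 6.
models = {
    "DT": DecisionTreeRegressor(splitter="best", min_samples_leaf=15),
    "RF": RandomForestRegressor(n_estimators=85, oob_score=True, random_state=12),
    "KNN": KNeighborsRegressor(n_neighbors=25, algorithm="auto",
                               weights="uniform", leaf_size=25, p=2),
}
mae_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae_scores[name] = float(np.mean(np.abs(model.predict(X_test) - y_test)))
```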
To increase modeling accuracy, many optimization algorithms have been used to optimize BRBs, for example, the particle swarm optimization algorithm (PSO) and the differential evolution algorithm (DE). P–CMA–ES is suited to complex optimization problems that are nonlinear and nonconvex in continuous settings. In this research, the PSO and DE algorithms were introduced to optimize Auto–BRB for comparison. According to Table 7, the P–CMA–ES algorithm achieves better optimization results, and Auto–BRB with P–CMA–ES attains the highest accuracy on every metric relative to PSO and DE. Ten rounds of experiments were conducted for each algorithm, and the robustness analysis of each optimization algorithm is shown in Table 8. The standard deviation of the MAE for the P–CMA–ES algorithm is 0.0182, the lowest of the three, showing that P–CMA–ES is the most robust. In summary, P–CMA–ES has better performance and higher robustness; therefore, P–CMA–ES was selected as the optimization algorithm for Auto–BRB.
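The "projection" in P–CMA–ES restores the equality constraint that each rule's belief degrees are non-negative and sum to one after CMA–ES samples new candidate solutions. One common way to implement this step is a Euclidean projection onto the probability simplex (the sorting-based algorithm shown below; the exact operator used in the paper may differ):

```python
import numpy as np

def project_beliefs(beta):
    """Euclidean projection of a candidate belief vector onto the simplex
    {beta >= 0, sum(beta) = 1}, using the sorting-based algorithm."""
    beta = np.asarray(beta, dtype=float)
    u = np.sort(beta)[::-1]                        # sort descending
    css = np.cumsum(u)
    idx = np.arange(1, beta.size + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0.0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(beta + theta, 0.0)           # shift and clip
```

After each CMA–ES generation, each rule's belief-degree segment of the solution vector would be passed through such a projection before the fitness (e.g., MSE) is evaluated, so the optimizer only ever scores valid belief distributions.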
To further verify the robustness of the Auto–BRB model, ten rounds of experiments were conducted with the same parameters. Figure 12, Figure 13 and Figure 14 display the MAE, RMSE, and MSE values of the comparison test. The average values of MAE, RMSE, and MSE for Auto–BRB are 0.8627, 1.4604, and 1.8750, respectively. It can be seen that Auto–BRB has a high level of stability compared to the other models.

4.5. Generalizability Study

To validate the generalizability of the proposed Auto–BRB-based SPP model, two case studies were conducted and are outlined in this section. First, mathematics course grades from the same schools are used, with the same feature and target attributes as in Section 4.1. Second, a case study predicting the performance of high school students in a particular location is used, with the dataset publicly available at Kaggle (doi: 10.34740/kaggle/ds/5195702, accessed in August 2024); the feature attributes are the weekly study time (WST) and the number of absences (NA), and the target attribute is the student's performance grade (G).
First, the comparison models from Section 4.4 are again introduced for comparative research. The final SPP model structure produced by Auto–BRB is G1 {0, 10.90886076, 20}, G2 {0, 10.71392405, 20}, and G3 {0, 4.392156863, 12.51194539, 20}. The experimental results are presented in Table 9. According to Table 9, the MAE, RMSE, MSE, R², VAF, and a₁₀ of the Auto–BRB model are 1.0757, 2.0321, 4.1293, 0.8189, 81.90%, and 69.04%, respectively, which are better than those of all the other models.
Second, the final SPP model structure produced by Auto–BRB is WST {0, 4.99, 16.48, 20}, NA {0, 6.64, 21.28, 30}, and G {0, 1, 2, 3, 4}. According to Table 10, Auto–BRB has the lowest MSE and RMSE and the highest R², at 0.0429, 0.2071, and 0.9293, respectively; KNN has the lowest MAE (0.0644) and the highest a₁₀ (93.56%); and LSTM has the highest VAF (93.82%). The results show that Auto–BRB is superior on several metrics. Although Auto–BRB does not show the best performance in terms of MAE, VAF, and a₁₀, it still achieves a high level of accuracy. Meanwhile, Auto–BRB retains good interpretability while guaranteeing accuracy, which can provide trusted and effective decision support for decision makers.
In summary, the Auto–BRB has shown superior performance through two different case studies. Therefore, the generalization ability of the proposed model is verified.

4.6. Discussion

As shown in Figure 12, Figure 13 and Figure 14, the trained Auto–BRB model achieves better metrics than the other models. Compared with the five comparison models, the MSE of the Auto–BRB model is improved by 34.25%, 34.22%, 34.56%, 29.71%, and 33.26%, respectively; the RMSE by 19.92%, 20.45%, 16.45%, 16.62%, and 14.49%; and the MAE by 30.25%, 30.28%, 13.91%, 12.01%, and 11.41%. Auto–BRB also outperforms the other models on metrics such as R², VAF, and a₁₀.
BPNN, LSTM, RF, and KNN are some of the commonly used tools in learning achievement prediction, and they are all data-driven models. Data-driven approaches have their advantages because they do not need to know the specific relationships between the models and output results. Nonetheless, the performance of these models varies greatly from one training round to another, even when using the same dataset. The performance of models heavily reliant on data is determined by the training set. According to experimental results, BPNN and RF show strong performance but lack interpretability due to their black-box nature. In contrast, the Auto–BRB model integrates both expert knowledge and data. It starts with a model based on expert knowledge, then refines it using historical data and optimization techniques, providing a clearer representation of input–output relationships. Auto–BRB’s processes are transparent, with clear inference and optimization steps. Although DT offers some interpretability, it does not match the performance of Auto–BRB.
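To make the "clear representation of input–output relationships" concrete, the two core BRB operations can be sketched directly: transforming an input into matching degrees over its reference values, and computing normalized rule activation weights w_k ∝ θ_k · Π_i α_{k,i}^{δ_i} (as in the standard ER-based BRB inference of Yang et al. [16]). The function names and the sample reference values (from Table 2) are illustrative:

```python
import numpy as np

def matching_degrees(x, refs):
    """Transform an input into matching degrees over its reference values:
    x activates its two neighbouring reference points with degrees summing to 1."""
    refs = np.asarray(refs, dtype=float)
    alpha = np.zeros(refs.size)
    if x <= refs[0]:
        alpha[0] = 1.0
    elif x >= refs[-1]:
        alpha[-1] = 1.0
    else:
        j = int(np.searchsorted(refs, x)) - 1
        alpha[j + 1] = (x - refs[j]) / (refs[j + 1] - refs[j])
        alpha[j] = 1.0 - alpha[j + 1]
    return alpha

def activation_weights(alphas, rule_weights, attr_weights):
    """w_k proportional to theta_k * prod_i alpha_{k,i}^{delta_i}, normalised
    over all rules; alphas has shape (num_rules, num_attributes)."""
    raw = rule_weights * np.prod(alphas ** attr_weights, axis=1)
    return raw / raw.sum()
```

Because every intermediate quantity (matching degree, activation weight, belief distribution) is inspectable, a teacher can trace exactly which rules fired for a given student and with what strength, which is the transparency the black-box baselines lack.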
By discussing the results of the above experiments, the following conclusions can be drawn:
1. The optimization algorithm can train and optimize the parameters of the Auto–BRB model, resulting in higher accuracy compared to other methods. The average results from 10 repeated experiments demonstrate that the Auto–BRB model exhibits better robustness and higher accuracy.
2. The Auto–BRB model uses the ER algorithm as an inference mechanism with a transparent process where the causal relationship between inputs and outputs is clearly represented. Therefore, the Auto–BRB model has excellent interpretability. Compared with other data-driven methods, the Auto–BRB model demonstrates better accuracy and interpretability.

5. Conclusions

This article proposed an Auto–BRB-based SPP model that aids teachers in managing students and provides better decision support when learning interventions based on student performance or learning behaviors encounter difficulties. The advantages of the Auto–BRB-based SPP model are as follows: (1) Auto–BRB has stronger modeling capabilities; (2) Auto–BRB offers better transparency and interpretability. The experimental results demonstrate that the Auto–BRB model has superior accuracy and stability. The reasoning process of Auto–BRB is intuitive and traceable. This model can predict student performance, effectively assisting schools and teachers in student management, resource allocation, and decision support.
The case studies on predicting the performance of secondary school students in Portugal show that Auto–BRB achieves MAEs of 0.8452 and 1.0757 on the Portuguese and mathematics courses, respectively, demonstrating higher accuracy and better interpretability relative to other models; the validity of the proposed model is thus confirmed. Meanwhile, Auto–BRB also performs well in the generalization ability research.
Future research could provide further insights into the interpretability of the model, making fuller use of expert knowledge and historical data to ensure interpretability throughout the modeling and optimization process.

Author Contributions

M.L.: conceptualization, formal analysis, software, writing—review and editing. W.H.: funding acquisition, supervision, writing—review and editing. G.Z.: data curation, validation. H.Z.: data curation, validation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Teaching Reform Project of Higher Education in Heilongjiang Province under grant no. SJGY20210456; in part by the Foreign Expert Projects in Heilongjiang Province under grant no. GZ20220131; in part by the Shandong Provincial Natural Science Foundation under grant no. ZR2023QF010; in part by the Social Science Planning Foundation of Liaoning Province under grant no. L23BTQ005; and in part by the Scientific Research Project of Liaoning Provincial Education Department under grant no. JYTMS20230555.

Data Availability Statement

The datasets used in this paper can be found at https://archive.ics.uci.edu/dataset/320/student+performance (accessed on 27 November 2014) and https://www.kaggle.com/datasets/rabieelkharoua/students-performance-dataset (accessed in August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Al-Shehri, H.; Al-Qarni, A.; Al-Saati, L.; Batoaq, A.; Badukhen, H.; Alrashed, S.; Alhiyafi, J.; Olatunji, S.O. Student performance prediction using support vector machine and k-nearest neighbor. In Proceedings of the 2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering, Windsor, ON, Canada, 30 April–3 May 2017; pp. 1–4. [Google Scholar]
  2. Kabakchieva, D. Student performance prediction by using data mining classification algorithms. Int. J. Comput. Sci. Manag. Res. 2012, 1, 686–690. [Google Scholar]
  3. Kim, B.-H.; Vizitei, E.; Ganapathi, V. GritNet: Student performance prediction with deep learning. arXiv 2018, arXiv:1804.07405. [Google Scholar]
  4. Sweeney, M.; Rangwala, H.; Lester, J.; Johri, A. Next-term student performance prediction: A recommender systems approach. J. Educ. Data Min. 2016, 8, 22–50. [Google Scholar]
  5. Connor, C.M. Using technology and assessment to personalize instruction: Preventing reading problems. Prev. Sci. 2019, 20, 89–99. [Google Scholar] [CrossRef] [PubMed]
  6. Grayson, A.; Miller, H.; Clarke, D.D. Identifying barriers to help-seeking: A qualitative analysis of students’ preparedness to seek help from tutors. Br. J. Guid. Couns. 1998, 26, 237–253. [Google Scholar] [CrossRef]
  7. Yue, L.; Hu, P.; Chu, S.-C.; Pan, J.-S. Multi-Objective Gray Wolf Optimizer with Cost-Sensitive Feature Selection for Predicting Students’ Academic Performance in College English. Mathematics 2023, 11, 3396. [Google Scholar] [CrossRef]
  8. Gaftandzhieva, S.; Talukder, A.; Gohain, N.; Hussain, S.; Theodorou, P.; Salal, Y.K.; Doneva, R. Exploring Online Activities to Predict the Final Grade of Student. Mathematics 2022, 10, 3758. [Google Scholar] [CrossRef]
  9. Liu, C.; Wang, H.; Yuan, Z. A Method for Predicting the Academic Performances of College Students Based on Education System Data. Mathematics 2022, 10, 3737. [Google Scholar] [CrossRef]
  10. Hu, Q.; Rangwala, H. Academic performance estimation with attention-based graph convolutional networks. arXiv 2019, arXiv:2001.00632. [Google Scholar]
  11. Niu, K.; Cao, X.; Yu, Y. Explainable student performance prediction with personalized attention for explaining why a student fails. arXiv 2021, arXiv:2110.08268. [Google Scholar]
  12. Ren, Z.; Ning, X.; Lan, A.S.; Rangwala, H. Grade prediction with neural collaborative filtering. In Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Washington, DC, USA, 5–8 October 2019; pp. 1–10. [Google Scholar] [CrossRef]
  13. Isinkaye, F.O.; Folajimi, Y.O.; Ojokoh, B.A. Recommendation systems: Principles, methods and evaluation. Egypt. Inform. J. 2015, 16, 261–273. [Google Scholar] [CrossRef]
  14. Resnick, P.; Varian, H.R. Recommender systems. Commun. ACM 1997, 40, 56–58. [Google Scholar] [CrossRef]
  15. Saganowski, S. Bringing Emotion Recognition Out of the Lab into Real Life: Recent Advances in Sensors and Machine Learning. Electronics 2022, 11, 496. [Google Scholar] [CrossRef]
  16. Yang, J.-B.; Liu, J.; Wang, J.; Sii, H.-S.; Wang, H.-W. Belief rule-base inference methodology using the evidential reasoning approach-RIMER. IEEE Trans. Syst. Man Cybern. Syst. 2006, 36, 266–285. [Google Scholar] [CrossRef]
  17. Chen, M.; Zhou, Z.; Han, X.; Feng, Z. A Text-Oriented Fault Diagnosis Method for Electromechanical Device Based on Belief Rule Base. Mathematics 2023, 11, 1814. [Google Scholar] [CrossRef]
  18. Cheng, X.; Qian, G.; He, W.; Zhou, G. A Liquid Launch Vehicle Safety Assessment Model Based on Semi-Quantitative Interval Belief Rule Base. Mathematics 2022, 10, 4772. [Google Scholar] [CrossRef]
  19. Chen, H.; Zhou, G.; Zhang, X.; Zhu, H.; He, W. Learning Emotion Assessment Method Based on Belief Rule Base and Evidential Reasoning. Mathematics 2023, 11, 1152. [Google Scholar] [CrossRef]
  20. You, Y.; Sun, J.; Chen, Y.-W.; Niu, C.; Jiang, J. Ensemble belief rule-based model for complex system classification and prediction. Expert Syst. Appl. 2021, 164, 113952. [Google Scholar] [CrossRef]
  21. Antonelli, M.; Ducange, P.; Lazzerini, B.; Marcelloni, F. Learning knowledge bases of multi-objective evolutionary fuzzy systems by simultaneously optimizing accuracy, complexity and partition integrity. Soft Comput. 2011, 15, 2335–2354. [Google Scholar] [CrossRef]
  22. Cao, Y.; Zhou, Z.; Hu, C.; He, W.; Tang, S. On the interpretability of belief rule-based expert systems. IEEE Trans. Fuzzy Syst. 2021, 29, 3489–3503. [Google Scholar] [CrossRef]
  23. Yang, Y.; Tan, W.; Li, T.; Ruan, D. Consensus clustering based on constrained self-organizing map and improved Cop-Kmeans ensemble in intelligent decision support systems. Knowl. Based Syst. 2012, 32, 101–115. [Google Scholar] [CrossRef]
  24. Zhang, Q.; Zhao, B.; He, W.; Zhu, H.; Zhou, G. A behavior prediction method for complex system based on belief rule base with structural adaptive. Appl. Soft Comput. 2024, 151, 111118. [Google Scholar] [CrossRef]
  25. Sun, C.; Yang, R.; He, W.; Zhu, H. A novel belief rule base expert system with interval-valued references. Sci. Rep. 2022, 12, 6786. [Google Scholar] [CrossRef] [PubMed]
  26. Miller, G.A. The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychol. Rev. 1956, 63, 81–97. [Google Scholar] [CrossRef]
  27. Alonso, J.M.; Magdalena, L.; González-Rodríguez, G. Looking for a good fuzzy system interpretability index: An experimental approach. Int. J. Approx. Reason. 2009, 51, 115–134. [Google Scholar] [CrossRef]
  28. Gao, F.; Zhang, A.; Bi, W.; Ma, J. A greedy belief rule base generation and learning method for classification problem. Appl. Soft Comput. 2021, 98, 106856. [Google Scholar] [CrossRef]
  29. Wang, Y.-M.; Yang, L.-H.; Fu, Y.-G.; Chang, L.-L.; Chin, K.-S. Dynamic rule adjustment approach for optimizing belief rule- base expert system. Knowl. Based Syst. 2016, 96, 40–60. [Google Scholar] [CrossRef]
  30. Xu, D.-L.; Liu, J.; Yang, J.-B.; Liu, G.-P.; Wang, J.; Jenkinson, I.; Ren, J. Inference and learning methodology of belief-rule-based expert system for pipeline leak detection. Expert Syst. Appl. 2007, 32, 103–113. [Google Scholar] [CrossRef]
  31. Chang, L.; Xu, X.; Liu, Z.-G.; Qian, B.; Xu, X.; Chen, Y.-W. BRB prediction with customized attributes weights and tradeoff analysis for concurrent fault diagnosis. IEEE Syst. J. 2021, 15, 1179–1190. [Google Scholar] [CrossRef]
  32. Akaike, H. Akaike’s Information Criterion. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar]
  33. Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach; Springer: London, UK, 2002. [Google Scholar]
  34. Choudhari, V.G.; Dhoble, A.S.; Sathe, T.M. A review on effect of heat generation and various thermal management systems for lithium-ion battery used for electric vehicle. J. Energy Storage 2020, 32, 101729. [Google Scholar] [CrossRef]
  35. Chang, L.; Zhou, Z.; Chen, Y.; Xu, X.; Sun, J.; Liao, T.; Tan, X. Akaike Information Criterion-based conjunctive belief rule base learning for complex system modeling. Knowl.-Based Syst. 2018, 161, 47–64. [Google Scholar] [CrossRef]
  36. Cortez, P.; Silva, A.M.G. Using data mining to predict secondary school student performance. In DSI—Engenharia da Programação e dos Sistemas Informáticos, Proceedings of 5th Annual Future Business Technology Conference, Porto, Portugal, 9–11 April 2008; Brito, A., Teixeira, J., Eds.; EUROSIS-ETI: Ostend, Belgium, 2008. [Google Scholar]
Figure 1. The progress of SSE–KPP.
Figure 2. Modeling process of Auto–BRB.
Figure 3. The progress of Auto–BRB.
Figure 4. Data distribution.
Figure 5. A 3D scatter plot of input and output variables.
Figure 6. Predictive fitting plot for Auto–BRB.
Figure 7. Predictive fitting plot for BPNN.
Figure 8. Predictive fitting plot for LSTM.
Figure 9. Predictive fitting plot for DT.
Figure 10. Predictive fitting plot for RF.
Figure 11. Predictive fitting plot for KNN.
Figure 12. Comparison of MAE for different models.
Figure 13. Comparison of RMSE for different models.
Figure 14. Comparison of MSE for different models.
Table 1. Statistical analysis of datasets.

| Attributes | Min | Max | Ave | Std |
| --- | --- | --- | --- | --- |
| G1 | 0.00 | 19.00 | 11.39 | 2.75 |
| G2 | 0.00 | 19.00 | 11.57 | 2.91 |
| G3 | 0.00 | 19.00 | 11.91 | 3.23 |
Table 2. Reference point sets for G1.

| No. | G1 |
| --- | --- |
| 2 | {0.00, 20.00} |
| 3 | {0.00, 11.39, 20.00} |
| 4 | {0.00, 9.82, 14.39, 20.00} |
| 5 | {0.00, 7.11, 10.10, 13.75, 20.00} |
| 6 | {0.00, 7.11, 9.59, 11.47, 14.39, 20.00} |
| 7 | {0.00, 7.11, 9.00, 10.49, 12.47, 15.05, 20.00} |
| 8 | {0.00, 6.36, 9.26, 11.00, 12.47, 14.33, 16.72, 20.00} |
Table 3. Reference point sets for G2.

| No. | G2 |
| --- | --- |
| 2 | {0.00, 20.00} |
| 3 | {0.00, 11.57, 20.00} |
| 4 | {0.00, 8.63, 13.16, 20.00} |
| 5 | {0.00, 7.84, 11.46, 15.38, 20.00} |
| 6 | {0.00, 6.50, 9.22, 12.26, 16.13, 20.00} |
| 7 | {0.00, 8.24, 10.55, 12.48, 14.75, 17.46, 20.00} |
| 8 | {0.00, 7.41, 10.12, 12.48, 14.00, 15.39, 17.46, 20.00} |
Table 4. Reference point sets for G3.

| No. | G3 |
| --- | --- |
| 2 | {0.00, 20.00} |
| 3 | {0.00, 7.94, 20.00} |
| 4 | {0.00, 9.83, 14.71, 20.00} |
| 5 | {0.00, 9.31, 12.96, 16.12, 20.00} |
| 6 | {0.00, 0.35, 10.28, 13.83, 16.79, 20.00} |
| 7 | {0.00, 0.35, 9.85, 12.53, 14.44, 16.79, 20.00} |
| 8 | {0.00, 0.06, 8.19, 10.52, 12.53, 14.44, 16.79, 20.00} |
| 9 | {0.00, 0.06, 8.19, 10.00, 11.00, 12.96, 15.42, 17.41, 20.00} |
Table 5. Comparison of accuracy of Auto–BRB with other models for Portuguese (por).

| Model | MAE | RMSE | MSE | R² | VAF | a₁₀ |
| --- | --- | --- | --- | --- | --- | --- |
| Auto–BRB | 0.8452 | 1.3603 | 1.8505 | 0.8198 | 82.33% | 83.24% |
| BPNN | 0.9314 | 1.6364 | 2.6777 | 0.7392 | 74.93% | 78.99% |
| LSTM | 1.0169 | 1.5850 | 2.5124 | 0.7553 | 76.50% | 72.06% |
| DT | 0.9190 | 1.5877 | 2.5802 | 0.7545 | 75.76% | 74.95% |
| RF | 0.9012 | 1.4657 | 2.1483 | 0.7908 | 79.52% | 77.65% |
| KNN | 0.8709 | 1.5180 | 2.3044 | 0.7756 | 77.56% | 71.87% |
Table 6. Hyperparameter settings for comparison models.

| Model | Parameter Settings |
| --- | --- |
| BPNN | hidden_layer_sizes = 5, learning_rate_init = 0.01, max_iter = 500 |
| LSTM | MaxEpochs = 500, MiniBatchSize = 5, InitialLearnRate = 0.08, GradientThreshold = 1 |
| DT | splitter = 'best', min_samples_leaf = 15 |
| RF | n_estimators = 85, oob_score = True, random_state = 12 |
| KNN | n_neighbors = 25, algorithm = 'auto', weights = 'uniform', leaf_size = 25, p = 2, metric = 'minkowski', metric_params = None, n_jobs = 1 |
Table 7. Comparison of accuracy of different optimization algorithms for Auto–BRB.

| Methods | MAE | RMSE | MSE | R² | VAF | a₁₀ |
| --- | --- | --- | --- | --- | --- | --- |
| P–CMA–ES | 0.8452 | 1.3603 | 1.8505 | 0.8198 | 82.33% | 83.24% |
| PSO | 0.8955 | 1.4572 | 2.1235 | 0.7825 | 80.12% | 81.16% |
| DE | 0.9053 | 1.5283 | 2.3356 | 0.7752 | 79.93% | 80.98% |
Table 8. Robustness analysis of different optimization algorithms for Auto–BRB.

| Methods | Min MAE | Max MAE | Average MAE | Standard Deviation of MAE |
| --- | --- | --- | --- | --- |
| P–CMA–ES | 0.8326 | 0.8912 | 0.8659 | 0.0182 |
| PSO | 0.8836 | 0.9466 | 0.9098 | 0.0217 |
| DE | 0.8946 | 0.9592 | 0.9343 | 0.0216 |
Table 9. Comparison of accuracy of Auto–BRB with other models for mat.

| Model | MAE | RMSE | MSE | R² | VAF | a₁₀ |
| --- | --- | --- | --- | --- | --- | --- |
| Auto–BRB | 1.0757 | 2.0321 | 4.1293 | 0.8189 | 81.90% | 69.04% |
| BPNN | 1.1373 | 2.1531 | 4.6358 | 0.7805 | 78.24% | 64.65% |
| LSTM | 1.6255 | 2.6510 | 7.0279 | 0.7265 | 72.23% | 60.62% |
| DT | 1.2638 | 2.1702 | 4.7099 | 0.7705 | 77.05% | 62.94% |
| RF | 1.3365 | 2.2439 | 5.0351 | 0.7590 | 75.92% | 66.49% |
| KNN | 1.5584 | 2.9281 | 8.5736 | 0.5823 | 58.53% | 57.36% |
Table 10. Comparison of Auto–BRB and other models predicting student performance for high school students.

| Model | MAE | RMSE | MSE | R² | VAF | a₁₀ |
| --- | --- | --- | --- | --- | --- | --- |
| Auto–BRB | 0.1231 | 0.2071 | 0.0429 | 0.9293 | 92.09% | 84.75% |
| BPNN | 0.1338 | 0.2221 | 0.0493 | 0.9072 | 90.92% | 85.42% |
| LSTM | 0.1805 | 0.2301 | 0.0529 | 0.9004 | 93.82% | 78.98% |
| DT | 0.0740 | 0.2425 | 0.0588 | 0.8894 | 89.13% | 93.22% |
| RF | 0.1071 | 0.2269 | 0.0515 | 0.9032 | 90.32% | 86.10% |
| KNN | 0.0644 | 0.2538 | 0.0644 | 0.8788 | 88.06% | 93.56% |

Share and Cite

MDPI and ACS Style

Liu, M.; He, W.; Zhou, G.; Zhu, H. A New Student Performance Prediction Method Based on Belief Rule Base with Automated Construction. Mathematics 2024, 12, 2418. https://doi.org/10.3390/math12152418


