Article

Memory-Based Dynamic Bayesian Networks for Learner Modeling: Towards Early Prediction of Learners’ Performance in Computational Thinking

by
Danial Hooshyar
1,* and
Marek J. Druzdzel
2
1
School of Digital Technologies, Tallinn University, 10120 Tallinn, Estonia
2
Faculty of Computer Science, Bialystok University of Technology, 15-351 Bialystok, Poland
*
Author to whom correspondence should be addressed.
Educ. Sci. 2024, 14(8), 917; https://doi.org/10.3390/educsci14080917
Submission received: 21 May 2024 / Revised: 7 August 2024 / Accepted: 18 August 2024 / Published: 21 August 2024
(This article belongs to the Topic Artificial Intelligence for Education)

Abstract:
Artificial intelligence (AI) has demonstrated significant potential in addressing educational challenges in digital learning. Despite this potential, there are still concerns about the interpretability and trustworthiness of AI methods. Dynamic Bayesian networks (DBNs) not only provide interpretability and the ability to integrate data-driven insights with expert judgment for enhanced trustworthiness but also effectively process temporal dynamics and relationships in data, crucial for early predictive modeling tasks. This research introduces an approach for the temporal modeling of learners’ computational thinking abilities that incorporates higher-order influences of latent variables (hereafter referred to as memory of the model) and accordingly predicts learners’ performance early. Our findings on educational data from the AutoThinking game indicate that when using only first-order influences, our proposed model can predict learners’ performance early, with an 86% overall accuracy (i.e., time stamps 0, 5, and 9) and a 94% AUC (at the last time stamp) during cross-validation and 91% accuracy and 98% AUC (at the last time stamp) in a holdout test. The introduction of higher-order influences improves model accuracy in both cross-validation and holdout tests by roughly 4% and improves the AUC at timestamp 0 by roughly 2%. This suggests that integrating higher-order influences into a DBN not only potentially improves the model’s predictive accuracy during the cross-validation phase but also enhances its overall and time stamp-specific generalizability. DBNs with higher-order influences offer a trustworthy and interpretable tool for educators to foresee and support learning progression.

1. Introduction

Artificial intelligence (AI) methodologies have become instrumental in addressing a wide array of educational challenges, such as the early prediction of learners’ performance, dropout rates, procrastination, grades, etc. (e.g., [1,2,3,4]). These computational methods meticulously construct structured representations of cognitive and non-cognitive learner characteristics, facilitating an estimation and timely prediction of knowledge, skill, and performance and thus enabling effective support for learners when necessary. This support can manifest in various forms, including individualized feedback, scaffolding, hints, personalized learning materials, and tailored learning paths [5,6,7].
The wide array of AI approaches to learner modeling can be divided into two primary families: symbolic and sub-symbolic methods [8,9,10]. Sub-symbolic methods, applicable only when large quantities of data are available, are adept at handling noisy and incomplete data while requiring minimal human input. However, their educational utility is restricted by a lack of interpretability and difficulty in integrating existing educational causal relationships [11]. The challenge in incorporating these relationships arises because these methods operate primarily on numerical data, making it difficult to embed educational causality directly [12]. This limitation can introduce learning biases, as tutoring systems may learn to associate variables in ways that do not accurately reflect educational realities [11,13,14].
Symbolic methods, such as Bayesian networks, Bayesian knowledge tracing, and dynamic Bayesian networks (DBNs), are suitable when data are available, but they also work when data are limited or absent and models must be built from expert knowledge. Expert-based model building necessitates domain knowledge to delineate relationships among variables, which may be resource-intensive due to the high degree of human involvement required. Despite this, symbolic methods offer interpretability, giving insights into the predictions and decisions made by the learner models [15]. Bayesian knowledge tracing, a special case of DBNs, focuses on tracking a learner’s mastery of specific concepts over time [16]. It is best suited for linear learning progressions and falls short in representing more complex, multidirectional learning scenarios [17]. In contrast, DBNs provide a more versatile and general framework capable of capturing complex interactions and dependencies among various concepts or variables, offering a broader and more intricate representation of learning behaviors and patterns [18,19].
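To make the Bayesian knowledge tracing special case concrete, the update step can be written in a few lines. The sketch below uses the four classic BKT parameters (prior mastery, slip, guess, and transition probabilities); the parameter values are illustrative, not taken from any study cited here.

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_transit=0.15):
    """One step of Bayesian knowledge tracing.

    p_know: prior probability the learner has mastered the skill.
    correct: whether the observed response was correct.
    Returns the posterior mastery probability after this opportunity.
    """
    if correct:
        evidence = p_know * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        evidence = p_know * p_slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    # Account for the chance of learning between opportunities.
    return posterior + (1 - posterior) * p_transit

p = 0.3  # illustrative prior mastery
for obs in [True, True, False, True]:
    p = bkt_update(p, obs)
```

Because the hidden state is a single binary mastery variable updated slice by slice, this is exactly the linear progression that BKT captures well and that richer DBN structures generalize.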
Interpretability is particularly crucial in educational contexts, as it underpins accountability to learners, parents, and regulatory bodies by transparently explaining predictions and the rationale behind specific feedback or interventions [20,21,22]. DBNs offer the potential to harmonize data-driven insights with expert judgment, adeptly navigating the temporal dynamics and relationships inherent to educational data. This capability is indispensable for early predictive modeling tasks, ensuring a more nuanced and contextually informed approach to learner modeling [23,24,25].
Given the critical importance of interpretability and trustworthiness in educational AI applications and in light of the regulatory framework set forth by the EU AI Act [13,21], DBNs have maintained their status as a widely adopted and trusted method [17,26,27,28]. For example, Rowe and Lester [26] designed a DBN to tailor narrative elements in response to user interactions, enhancing the adaptation of knowledge. This approach allows for more personalized learning experiences based on the actions of the user within the educational framework. Cui et al. [17] developed a DBN that processes in-game data from learners to provide formative feedback. This method supports real-time adaptation and personalized learning by analyzing interactions and performance continuously. Levy [27] proposed a scoring model based on a DBN for an educational game named Save Patch, which enhances math learning through game play. The model uses a polytomous variable to assess multiple proficiencies in different categories, demonstrating a sophisticated approach to measuring educational outcomes in game-based learning environments.
Despite the extensive application of DBNs, almost all applied models are based on only first-order influences, meaning that they consider only interactions between two neighboring time steps (time step t−1 and time step t). Yet many naturally occurring processes involve higher-order influences. For example, a DBN model of the woman’s monthly cycle developed in [29,30] is based on influences of up to the 9th order. This is necessary because the monthly cycle is a complex system whose memory spans roughly the 28 days of the cycle. Even though the body goes through daily changes, it is roughly known on the first day of the cycle (usually the first day of monthly bleeding) when ovulation will take place. Without higher-order influences, modeling such a system would be daunting: a first-order model assumes that the only thing that matters for how the system will develop is its current state. Educational models show a clear need for higher-order influences, as both students’ and teachers’ actions depend on a long history of the students’ activities. Incorporating higher-order influences, i.e., adding memory to DBNs, has the potential to enhance the temporal modeling of learning progression, yet this idea has seen limited use in education. Designing DBNs with higher-order influences could more accurately reflect the complexities inherent to educational settings. Additionally, most related works do not investigate how different initialization strategies for the Expectation–Maximization (EM) algorithm, used for parameter estimation in DBNs, affect the learning process and overall model performance. The EM algorithm iteratively refines parameter estimates to maximize the likelihood of the observed data, and its effectiveness can vary depending on the initial parameter values chosen.
Understanding how these initial strategies impact the learning outcomes and performance of DBNs could offer valuable insights for improving model accuracy and robustness. This study introduces a novel approach to learner modeling that leverages the concept of memory in DBNs for the temporal modeling of learners’ computational thinking abilities. The primary goal is to provide early predictions of learner performance by integrating higher-order influences of latent variables. We address this objective through the following research questions:
  • What is the effect of EM’s initial strategy on parameter learning and the overall performance of the DBN?
    • To assess the impact of different initial strategies for the EM algorithm on parameter learning and the overall accuracy of the DBN.
  • How does the proposed DBN-based learner modeling approach perform in cross-validation and holdout tests?
    • To evaluate the effectiveness of the proposed DBN-based learner modeling approach through cross-validation and holdout tests.
  • What are the effects of using higher-order memory on the predictive power of the proposed learner modeling approach?
    • To investigate how incorporating higher-order memory influences the predictive power of the learner modeling approach.
To address these research questions, we implement our approach using the educational game AutoThinking [31]. AutoThinking is designed to enhance computational thinking by providing an adaptive learning environment where students tackle complex problems through engaging in gameplay. Computational thinking, as defined by Wing [32], provides individuals with the cognitive tools to devise computational solutions for problems using principles and reasoning from computer science. This skill—characterized by systematic and algorithmic reasoning—is increasingly recognized as a crucial component of modern education [33]. Computational thinking empowers students to approach complex challenges in a structured manner, facilitating the breakdown of problems into manageable components, identifying patterns, and generating effective solutions. By cultivating computational thinking skills, students not only enhance their problem-solving capabilities but also prepare for success in various academic fields and professional domains.
The remainder of this paper is structured as follows. Section 2 reviews related works on DBNs in education and the early prediction of performance. Section 3 focuses on model development and analysis. Section 4 describes our experiments and results. Finally, Section 5 discusses the conclusions drawn from the study.

2. Related Works

2.1. DBNs in Education

Bayesian networks have gained recognition for their effectiveness in modeling uncertain learner characteristics in educational contexts. These networks, which are essentially acyclic directed graphs, utilize nodes to represent different variables and arcs to represent probabilistic dependencies among them. Through an explicit representation of independencies, they simplify the representation of complex relationships and offer a highly efficient representation of joint probability distributions [34]. Their graphical structure not only makes them visually intuitive but also highly capable of incorporating uncertainty using probabilistic methods. Within the realm of education, Bayesian networks have been widely applied to assess various learner attributes, including knowledge [35,36], affective states [37], skills [31,38], and engagement [39]. Almond et al. [15] and Abyaa et al. [8] provide comprehensive insights into Bayesian networks in educational assessment and learner modeling.
DBNs extend the conventional Bayesian network framework to better accommodate the evolving and dynamic characteristics of learners. While Bayesian networks capture static relationships, DBNs enhance this approach by introducing a temporal dimension, integrating time-varying variables and transitions. This allows for a more detailed and nuanced understanding of how learner attributes and interactions change over time. The application of DBNs in educational contexts provides a more robust framework for capturing the complex, fluctuating nature of learner characteristics, offering insights into the temporal dynamics of learning processes. There have been many studies revolving around the use of DBN in learner modeling. For instance, Grawemeyer et al. [40] utilized DBNs to adapt feedback according to students’ affective states. Their model, an affective state reasoner, tailors feedback to evoke positive emotions, thereby enhancing the learning experience. This study underscores the potential of DBNs to foster an emotionally supportive learning environment.
In an innovative approach, Abbasi et al. [41] developed a DBN that infers students’ mental states from their body gestures during lectures. This model translates sensory data into semantic descriptions of mental states, demonstrating high accuracy in recognizing affective states, like confusion or interest, which can significantly enrich interactive educational settings. Sabourin et al. [37] applied DBNs to enhance learner emotion modeling in game-based environments. By incorporating appraisal theory within the Crystal Island game, their study shows that DBNs can effectively improve emotion prediction over static models, essential for developing affect-sensitive educational systems. Further advancing the application of DBNs in education, Seffrin et al. [42] focused on modeling students’ algebraic knowledge within intelligent tutoring systems. Their DBN model goes beyond mere solution correctness to consider the algebraic operations employed by students, offering a more detailed understanding of learners’ knowledge and misconceptions. Building on these insights, Käser et al. [18] demonstrated how DBNs can predict and understand students’ problem-solving skills over time. Their research highlights the utility of dynamic modeling in real-time educational settings, enhancing the prediction of learning outcomes. Expanding the scope of DBN applications, Choi and Mislevy [43] explored student engagement in online learning environments. Their study illustrates how DBNs can utilize temporal interaction data to dynamically predict and enhance student engagement, contributing to more engaging online educational experiences. Lastly, Han et al. [44] introduced a sequential response model that analyzes responses in technology-based problem-solving tasks through a DBN. This model treats students’ response sequences as discrete time stochastic processes, enabling nuanced inferences about problem-solving abilities.
In summary, these studies collectively demonstrate the significant potential of DBNs in various educational applications, from customizing learning environments to deepening our understanding of student behaviors and emotional states. However, despite their widespread use, these models often overlook temporal influences beyond the first order. As a result, DBNs frequently underutilize the potential for incorporating memory, which limits their effectiveness in modeling learning progressions. Furthermore, there is limited research on how different initial strategies in the expectation maximization (EM) parameter learning algorithm affect the accuracy of such models and their overall efficacy.

2.2. Early Prediction of Performance

This section reviews the recent developments in the field of educational technology, specifically focusing on the early prediction of student performance. Recently, numerous studies have concentrated on leveraging advanced (sub-symbolic) predictive models to predict student performance at early stages. López Zambrano et al. [23] provide a systematic review of early predictive modeling methods for student performance.
In the realm of educational computer games, Barata et al. [45] developed a predictive modeling approach for identifying student types based on performance and behavior. Using supervised and unsupervised machine learning, their approach predicts student types across terms, showing that combining performance with gaming data improves accuracy. Geden et al. [46] introduced a predictive student modeling framework for game-based learning environments that uses natural language reflections to forecast student outcomes. Using word embeddings, the framework achieves notably improved accuracy, especially when employing an ensemble of predictive models. Results from studies with 118 middle school students underscore the effectiveness of this approach, offering insights for enhancing game-based learning design. Min et al. [47] proposed a deep learning-based framework for stealth assessment in game-based learning environments, reducing the need for extensive feature engineering. The LSTM-based models that they employ outperform traditional approaches in predictive accuracy and early prediction capacity, showing the effectiveness of deep learning in enhancing assessments in educational settings. Hooshyar et al. [48] presented an early predictive model for educational games, accurately estimating learners’ final scores early in gameplay using deep learning. Outperforming traditional methods with a squared correlation above 0.8 and a relative error below 8%, this approach demonstrates robustness across different datasets, showcasing significant improvements in predictive power over conventional approaches.
Despite their success in terms of prediction accuracy, the majority of the early predictive modeling approaches discussed above lack interpretability. Interpretability involves elucidating model decisions in a manner comprehensible to humans, revealing the underlying reasoning process using methods like SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) [49]. Furthermore, these approaches often fail to incorporate existing domain knowledge, such as causal relationships among features, which is essential for ensuring that models adhere to educational restrictions and avoid relying on spurious correlations. Additionally, there is a lack of predictive modeling approaches for the early prediction of learners in computational thinking.

3. Model Development and Analysis

3.1. Dynamic Bayesian Networks

Our modeling efforts are based on probabilistic graphical models, in particular their two prominent members: Bayesian networks [34] and dynamic Bayesian networks (DBNs) [50]. Bayesian networks are widely used practical tools for knowledge representation and reasoning under uncertainty in equilibrium systems. DBNs are an extension of Bayesian networks for modeling dynamic systems. The term dynamic means that we model the state of a system over time, not that the model structure and its parameters change over time (even though the latter is theoretically possible). In a DBN, the state of a system at time t is represented by a set of random variables Xt = (X1,t, …, Xn,t). The state of the system at time t generally depends on the states at previous time steps. It is common to assume that each state depends only on the immediately preceding state (i.e., the system is first-order Markov) and thus to represent the transition distribution by P(Xt|Xt−1), on the assumption that first-order dependence is sufficient. A first-order model is a convenient approximation, supported by the existence of efficient algorithms and encouraged by the limitations of available tools. After all, introducing higher-order temporal influences, which account for interactions over multiple time steps, requires more parameters and increases computational complexity, which may make inference in such models intractable. However, nothing in the theory underlying DBNs prevents us from specifying influences that span from time t−n to time t, and many real-world systems have memory that extends beyond the immediately preceding state, necessitating higher-order models to capture these extended temporal dependencies effectively.
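The parameter cost of higher-order influences can be made concrete with a small sketch (our own illustration, not from the paper): a k-th-order transition distribution over m states needs a conditional probability row for each of the m^k possible histories, which is why the number of parameters grows exponentially with the order.

```python
import itertools

states = ["low", "high"]

def make_transition_table(order):
    """Build a placeholder transition table P(X_t | X_{t-1}, ..., X_{t-order}).

    Each possible history of `order` past states gets one row; here every row
    is initialized to a uniform distribution purely for illustration.
    """
    histories = itertools.product(states, repeat=order)
    return {h: {s: 1.0 / len(states) for s in states} for h in histories}

first_order = make_transition_table(1)  # 2 histories to parameterize
third_order = make_transition_table(3)  # 8 histories to parameterize
```

For binary nodes the growth is modest, but for nodes with many states, or many higher-order parents, the table size quickly dominates both learning and inference cost.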
A first-order model resembles a “drunkard’s walk”, in which the next step and position depend only on the current position. A trajectory that is not a drunkard’s walk implies some form of intention and can be interpreted as memory in the underlying system. The idea of increasing modeling accuracy by increasing the time order of a dynamic model was beautifully illustrated by Shannon. In his influential paper [51] outlining the principles of information theory, he shows sentences in the English language generated by a series of Markov chain models of increasing time order, trained on the same corpus of text. The following sentence was generated by a first-order model:
OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH
EEI ALHENHTTPA OOBTTVA NAH BRL.
Compare this with the following sentence generated by a sixth-order model:
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH
WRITER THAT THE CHARACTER OF THIS POINT IS
THEREFORE ANOTHER METHOD FOR THE LETTERS
THAT THE TIME OF WHO EVER TOLD THE PROBLEM
FOR AN UNEXPECTED.
The resemblance of the latter sentence to ordinary English text, an informal measure of the model’s accuracy, increases dramatically between the first and the sixth order. A first-order model was essentially powerless to learn and model the language. Łupińska-Dubicka and Druzdzel [30] studied whether introducing higher-order influences improves a model’s accuracy in the context of a system with undeniable memory spanning many steps, notably a woman’s monthly cycle. They showed that adding influences that span multiple steps is worth the effort, in the sense of significantly improving the model’s accuracy. In the context of intelligent tutoring systems, a student clearly has goals and intentions, and looking back at their previous interactions with the system should reveal these better than looking only at the most recent interaction. We thus expect higher-order DBNs to yield better quality predictions.
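Shannon’s experiment is easy to reproduce with a character-level n-gram model. The following minimal sketch (our own illustration, using a toy corpus) trains a k-th-order Markov chain over characters and samples from it; with a large English corpus, raising `order` produces the increasingly English-like output described above.

```python
import random
from collections import Counter, defaultdict

def train_char_model(text, order):
    """Count character continuations after each length-`order` history."""
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        history, nxt = text[i:i + order], text[i + order]
        model[history][nxt] += 1
    return model

def generate(model, order, length, seed=0):
    """Sample `length` characters from the trained chain."""
    rng = random.Random(seed)
    out = rng.choice(list(model))  # start from a random observed history
    for _ in range(length):
        counter = model.get(out[-order:])
        if not counter:  # dead end: restart from a random history
            counter = model[rng.choice(list(model))]
        chars, weights = zip(*counter.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

corpus = "the head and in frontal attack on an english writer " * 20
sample = generate(train_char_model(corpus, 3), 3, 80)
```

The only difference between Shannon’s first-order and sixth-order generators is the length of the conditioning history, which is precisely the “memory” that higher-order DBNs add.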

3.2. Data Collection and Pre-Processing

Context on the AutoThinking game. AutoThinking (http://m.youtube.com/watch?v=O3K6G0i1jYU, (accessed on 1 July 2024)) is an educational computer game designed to enhance computational thinking (CT) skills and knowledge. In this game, players assume the roles of mice navigating through a maze to collect cheese while evading cats, represented as non-player characters (NPCs). This setup provides a dynamic environment where players can develop up to 20 unique solutions to various maze scenarios. The game encourages players to devise solutions that integrate CT concepts and skills to achieve more efficient outcomes and higher scores. For example, as shown in Figure 1, there are two ways to traverse the maze: (1) a basic approach using single directional commands (arrows) for each move, or (2) a more sophisticated method that employs CT principles. By creating and saving sub-solutions, such as a function labeled ‘1’ that instructs the player’s character to ‘go left and continue until a third junction is encountered, then stop’, players can apply this pattern multiple times throughout different segments of the maze. This technique allows for code reuse and a reduction in the number of steps and ultimately leads to higher scores by achieving the goal with greater efficiency. Hooshyar [52] provides further details on the CT framework applied within AutoThinking, elaborating on its relevance to the game’s objectives.
The game distinguishes itself through its adaptivity in both gameplay and educational aspects. It offers tailored feedback and hints to support the learning process. From a gameplay perspective, it modifies NPC behavior based on the player’s performance, introducing a variety of challenges. Specifically, one of the NPCs during the gameplay adaptively changes its behavior to random, lenient, aggressive, and provocative movements. For instance, an NPC might adopt a provocative stance, strategically positioning itself near but not capturing the mouse. This approach stimulates the player to explore new strategies and move beyond their comfort zone. Such behavior is designed to encourage players to escape from being cornered in one area of the maze without making optimal progress. Hooshyar et al. [31] offer further insights into the game’s dynamics and adaptive features. Through its blend of educational content and engaging gameplay, AutoThinking offers a unique platform for developing CT competencies in an interactive and adaptive setting.
Dataset and data pre-processing. In this study, we analyzed gameplay log data from 442 participants from various countries, including Estonia, France, South Korea, Taiwan, and South Africa. These participants, recruited through a combination of purposive sampling and voluntary engagement, represent diverse backgrounds and age groups, including both school students and adult learners. All actively interacted with the third level of the game, contributing to a dataset comprising up to 6436 examples, or solutions. Data collection spanned December 2019 to January 2024, combining data from the experimental studies reported in [52,53] with data from learners who engaged with the game independently.
When observing the learners’ interaction with the game, we recorded a diverse array of activities. These included the positioning of mice and NPCs, consumption of various sizes of cheese pieces, and the application of loops, conditionals, arrows, functions, debugging, and simulation. Additionally, the dataset tracks the frequency of requests for help, the provision of feedback and hints, and instances of colliding with walls, among other metrics. The primary objective of this research is to develop a learner modeling approach that enables the early prediction of CT performance. To achieve this, we focus on analyzing the usage of loops and conditionals to gauge a conceptual understanding of CT and the application of functions, simulation, and debugging to assess CT skills. We derived performance labels for learners from player scores recorded at each game episode (solution).
To prepare the data, we first filtered out records from learners with fewer than 10 solutions, ensuring a dataset suitable for analysis across 10 game episodes (these correspond to our time stamps). We chose 10 time steps because most players had at least 10 solutions, minimizing missing or padded values while ensuring sufficient data. This reduced the dataset to 380 learners measured over 10 or more time steps. Subsequently, we established a time window of 10 for the six performance-related variables: (1) conditional, (2) loop, (3) simulation, (4) debug, (5) function, and (6) the learners’ median score in each game episode. This process entailed considering learners’ sequences of actions, filtering out missing records, and generating an additional attribute counting the number of action sequences for each learner to eliminate incomplete solutions.
We discretized numerical variables, including categorizing learners’ performance as low or high based on the median of their scores. Thereafter, we transformed the time series data into a windowed example set. Specifically, for the six aforementioned variables, we created a time window of 10 game episodes, ensuring that each segment reflects a complete narrative of player interaction and decision making. Briefly, the averages for the low-category variables over the 10 time stamps are as follows: conditional 323.00, loop 257.67, simulation 299.11, debug 356.89, function 368.56, and performance 191.44. Finally, the dataset underwent stratified sampling, with performance as the label, dividing it into validation (90%) and test (10%) sets. The 90/10 split was chosen to provide a larger validation set for better fine-tuning of model parameters and improved generalizability. Records for the test set were selected at random, with every player given an equal chance of being selected, ensuring that the test set mirrors the distribution of values in the training set. Figure 2 shows that the two sets are similar with respect to the true values.
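The filtering, windowing, discretization, and splitting steps can be sketched as follows. This is an illustrative reconstruction, not the authors’ code: the input format is a hypothetical mapping from learner to per-episode scores, the split is a plain random holdout rather than true stratified sampling, and only the score variable is shown.

```python
import random
from statistics import median

def preprocess(logs, n_steps=10, test_frac=0.1, seed=42):
    """Sketch of the pre-processing pipeline described above.

    logs: {learner_id: [score_per_episode, ...]} (hypothetical format).
    Returns (validation, test) dicts of {learner_id: (label, windowed_scores)}.
    """
    # 1. Keep only learners with at least n_steps solutions; truncate the rest.
    kept = {lid: s[:n_steps] for lid, s in logs.items() if len(s) >= n_steps}
    # 2. Discretize: label each learner low/high against the overall median.
    cutoff = median(score for seq in kept.values() for score in seq)
    labeled = {lid: ("high" if median(seq) > cutoff else "low", seq)
               for lid, seq in kept.items()}
    # 3. Random holdout of test_frac of learners (stratification omitted here).
    ids = sorted(labeled)
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(len(ids) * test_frac))
    test_ids = set(ids[:n_test])
    validation = {lid: labeled[lid] for lid in ids if lid not in test_ids}
    test = {lid: labeled[lid] for lid in test_ids}
    return validation, test
```

Splitting by learner, rather than by individual windowed example, keeps all of one player’s episodes on the same side of the split, which avoids leaking a learner’s behavior from training into testing.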

3.3. Structure of the Network and Setting Initial Parameters

We developed the structure of the network manually (Figure 3), using common sense knowledge. The directed graph shows observable nodes of conditional and loop as having a direct connection to the unobservable node of concept, indicating that students with a good understanding of CT concepts are likely to use conditionals or loops in their problem solving. Likewise, there are direct links from the hidden node CT skill to simulation, debug, and function. This expresses the assumption that as students get better at CT skills, they tend to use more complex solutions, including functions, engage in simulation, and apply debugging strategies in their solutions.
The network also includes direct influences from the concept and skill nodes to the performance node, highlighting that a solid understanding of CT concepts and proficiency in CT skills are influential in determining a student’s overall performance. The network includes temporal influences of the skill and concept nodes on themselves. This indicates that the nodes propagate their values over time, i.e., the value of the variables in the next time step depends on the values of these variables at the current step. We have decided to consider a time span of 10 steps. This, we believe, should be sufficient for making early predictions about a student’s performance by tracking the progression of their conceptual knowledge and skill application over time.
Model parameters for a DBN, given the structure of a directed graph, can be obtained from domain experts or learned from a time series. Utilizing our training data to inform the parameters is beneficial, but blending this with expert insights for initial parameter estimation provides a better foundation, especially when datasets are small. This hybrid approach leverages learning algorithms, such as EM, effectively blending data-driven insights with expert judgment. We generated the initial conditional probability tables (CPTs) for the nodes concept and performance from expert judgment (see Table 1 and Table 2).
The CPT for the node concept shows the probabilities of transitions between various states of the concept node within a single step. For example, if the concept was low at t-1, there is a 70% chance that it will remain low and a 30% chance that it will switch to high at time t. The CPT for performance gives the probabilities of performance being low or high given the initial states of concept and skill. It suggests that performance is most likely to be high when both concept and skill are high.
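Rolling the concept node’s transition row forward shows how such a CPT shapes beliefs over time. In this minimal sketch, the 0.70/0.30 low-state row comes from the text above, while the high-state row is an assumed placeholder, since the text does not quote it.

```python
def propagate(p_high, t_low_to_high=0.3, t_high_to_high=0.7, steps=1):
    """Roll the marginal P(concept = high) forward through the transition CPT.

    t_low_to_high is the quoted P(high_t | low_{t-1}) = 0.30;
    t_high_to_high = 0.70 is an assumed value for the unquoted row.
    """
    for _ in range(steps):
        # Law of total probability over the previous step's two states.
        p_high = (1 - p_high) * t_low_to_high + p_high * t_high_to_high
    return p_high
```

With these (assumed) numbers the marginal converges to a stationary value of 0.5 regardless of the starting belief, illustrating how, absent evidence, a first-order transition CPT gradually washes out the initial state.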
Thereafter, we applied the expectation–maximization (EM) algorithm [54,55] to determine the model’s probability distributions. Using this approach, we start with the probability distributions obtained from experts, assign them confidence values, and refine them through the EM algorithm. The confidence value, also known as the equivalent sample size (ESS), expresses the number of cases/records that the expert has effectively seen before providing the numerical parameters. For example, ESS = 20 states that the expert has seen roughly 20 cases, and the EM algorithm weighs the existing parameters accordingly when refining them based on a dataset of n records. Table 3 and Table 4 show the refined CPTs for the nodes concept and performance, respectively. As we can see in the tables, the refined CPTs show an important shift. In the refined concept table, there is a strong tendency for the state to remain the same between time steps, indicated by the high probabilities of 0.90 for remaining low and 0.91 for remaining high.
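The ESS weighting can be illustrated with the standard Dirichlet-prior update applied to a single CPT row, which is the general mechanism behind EM parameter learning with expert priors. This is a sketch of that mechanism, not the authors’ implementation, and the counts are invented.

```python
def refine_cpt_row(prior_row, counts, ess):
    """Blend an expert prior with (expected) data counts for one CPT row.

    The prior acts as if the expert had already seen `ess` cases distributed
    according to prior_row; observed counts then pull the estimate toward
    the data, with influence proportional to the amount of data.
    """
    total = ess + sum(counts.values())
    return {state: (ess * prior_row[state] + counts.get(state, 0)) / total
            for state in prior_row}

prior = {"low": 0.7, "high": 0.3}   # expert row, as in the initial concept CPT
counts = {"low": 90, "high": 10}    # invented counts from 100 data cases
refined = refine_cpt_row(prior, counts, ess=20)
```

With 100 data cases against ESS = 20, the data dominates; with no data at all, the row stays at the expert’s values, which is exactly the graceful degradation that makes the hybrid approach attractive for small datasets.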
In the initial CPT for performance, the probability distribution was close to uniform for most parent configurations, suggesting an equal likelihood of performance being low or high. Specifically, when both concept and skill were low, the probability of performance being high was set at 0.01, reflecting a strong initial belief that performance would be low under these conditions. The refined CPT, however, shows a shift towards more nuanced probabilities. While the learning process did not revise the probability of performance being high when both concept and skill were low (or of being low when both were high), it substantially adjusted the probability of performance being low given a low concept but high skill, from 0.5 to 0.84. More drastically, the probability of performance being high given a high concept but low skill rose from 0.5 to 0.92. This significant change underscores the importance of a solid conceptual understanding in enhancing performance, irrespective of lower skill levels.
This comparison illustrates the influence of learning from data: the EM algorithm adjusts the parameters to reflect the observed relationships in the data more accurately, moving away from the equal or non-committal probabilities initially set by expert knowledge.
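As an illustration of how the ESS weighs expert-provided parameters against observed data, the following Python sketch blends an expert CPT column with transition counts. The counts and the helper function are hypothetical, not GeNIe's implementation; with fully observed data, this Dirichlet-style update is the closed form that EM-style refinement converges to.

```python
import numpy as np

def refine_cpt_column(expert_probs, data_counts, ess):
    """Blend an expert-provided probability column with observed counts.

    The expert column acts like `ess` pseudo-counts: with a small ESS the
    data quickly dominate, with a large ESS the result stays close to the
    expert's original values.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    data_counts = np.asarray(data_counts, dtype=float)
    posterior = ess * expert_probs + data_counts
    return posterior / posterior.sum()

# Expert: P(concept_t | concept_{t-1} = low) = [0.70 low, 0.30 high].
# Suppose the data contain 90 low->low and 10 low->high transitions.
refined = refine_cpt_column([0.70, 0.30], [90, 10], ess=20)
print(refined.round(3))  # → [0.867 0.133]
```

With ESS = 20 and 100 observed transitions, the data dominate and the refined column moves from the expert's 0.70/0.30 towards the empirical 0.90/0.10.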

3.4. Exploring Temporal Beliefs

To investigate the impact of observing some of the variables on the marginal probability distributions over other variables, we set temporal evidence as illustrated in Table 5. Figure 4 displays the model predictions (posterior probabilities) in light of the provided evidence. It is apparent that exposing the model to data related to conditional and simulation use results in a clear shift in the model’s predicted probabilities for conceptual knowledge and skill mastery, as well as performance. The graphs indicate a trend in which high probabilities of concept, skill, and performance correspond with the introduction of high evidence of conditional and simulation use at specific time stamps. Conversely, the introduction of low evidence correlates with lower probabilities. This trend underscores the significance of conditional and simulation engagement in influencing the model’s assessment of a learner’s progress over time.
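The mechanics of this temporal belief updating can be sketched with a minimal forward-filtering routine over a two-state latent node. All numbers below are illustrative, not the learned CPTs of our model; the observed child stands in for an evidence variable such as conditional use.

```python
import numpy as np

T = np.array([[0.90, 0.10],   # P(concept_t | concept_{t-1}): rows = previous state
              [0.09, 0.91]])  # states: 0 = low, 1 = high
E = np.array([[0.80, 0.20],   # P(evidence | concept): rows = concept state
              [0.30, 0.70]])  # columns: 0 = low evidence, 1 = high evidence

def filter_posteriors(evidence, prior=(0.5, 0.5)):
    """Return P(concept_t | evidence_{0..t}) for each time stamp t."""
    belief = np.asarray(prior, dtype=float)
    posteriors = []
    for obs in evidence:
        belief = belief @ T          # predict: propagate through the transition CPT
        belief = belief * E[:, obs]  # update: weight by the evidence likelihood
        belief /= belief.sum()       # normalise to a proper distribution
        posteriors.append(belief.copy())
    return np.array(posteriors)

post = filter_posteriors([1, 1, 0, 1])  # high, high, low, high evidence
print(post[:, 1].round(2))  # → [0.78 0.91 0.65 0.86]
```

As in Figure 4, the posterior probability of the high state rises while high evidence accumulates and dips when low evidence is introduced.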

3.5. Strengths of Individual Influences in the Model

A probabilistic model is a representation of the joint probability distribution over its variables, and it allows for a variety of calculations that increase our insight into the relationships among the variables. One such calculation is the strength of individual influences in the model. The strength of influence expresses the distance between the marginal probability distributions over the child variable under the various possible values of the parent variable.
The strength of influence analysis within this DBN (Figure 5 and Table 6) offers important insights into the dynamics of computational thinking (CT) understanding and performance in the AutoThinking game context. It reveals that conceptual understanding is paramount for CT performance, with a high influence score of 0.86. Simulation skill is also a critical factor, scoring 0.66, suggesting that skilled learners are better at problem solving through game simulations. Meanwhile, loop understanding and debugging skills present moderate influences, with scores of 0.23 and 0.17, respectively, indicating their supportive but less critical roles. Factors such as conditional (0.09) and functions (0.06) have a relatively low influence on CT conceptual understanding and skills, suggesting that they are less critical in the context of our game. This evidence underscores that while both conceptual and skill-based competencies are key to achieving high CT performance, conceptual knowledge in CT is more crucial than CT skills for superior performance.
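For readers interested in the underlying computation, one simple distance-based variant of strength of influence can be sketched as follows. GeNIe offers several distance measures; this sketch uses the maximum Euclidean distance between conditional distributions, and the CPTs below are hypothetical.

```python
import numpy as np

def influence_strength(cpt):
    """Maximum Euclidean distance between the child's conditional
    distributions across any pair of parent states.

    `cpt` has one row per parent state and one column per child state.
    """
    cpt = np.asarray(cpt, dtype=float)
    n = cpt.shape[0]
    dists = [np.linalg.norm(cpt[i] - cpt[j])
             for i in range(n) for j in range(i + 1, n)]
    return max(dists)

# Hypothetical CPT P(performance | parent): a parent whose states shift the
# child's distribution strongly yields a large strength value.
strong = influence_strength([[0.95, 0.05],   # parent = low
                             [0.10, 0.90]])  # parent = high
weak = influence_strength([[0.55, 0.45],
                           [0.45, 0.55]])
print(round(strong, 3), round(weak, 3))  # → 1.202 0.141
```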
Figure 4. Marginal probabilities of (a) knowledge mastery, (b) skill mastery, and (c) performance as a function of time given the evidence listed in Table 7. Green and blue lines represent low and high categories, respectively.
Figure 5. Strength of influence in the DBN model.

4. Experiment and Results

4.1. The Effect of the EM Algorithm’s Initial Strategy on Parameter Learning

Learning the model parameters with priors obtained from an expert (the Keep original setting in the parameter learning dialog of GeNIe software, version 4.1) allows us to express our confidence in the expert-obtained parameters. This confidence value, known as the equivalent sample size (ESS), amounts to the number of data records that the expert has theoretically seen before giving us the parameters. A low ESS means that we are uncertain about the exact values of the parameters and prefer the data whenever they are available. When there are few or no data records, the original parameters ensure that the model contains parameters of reasonable quality. To investigate the EM algorithm’s sensitivity to the ESS parameter, we performed learning with four different values: 5, 10, 20, and 30. Additionally, we compared these to learning with uniform and random priors (the Uniform and Random settings in GeNIe).
Table 7 shows the log-likelihood value of the learned parameters for the different parameter learning strategies. The log-likelihood value is the logarithm of the probability of data given the model. Because the probability of the data is a very small fraction, typically, its logarithm is a large negative number. The smaller its absolute value, the better the model fits the data.
The Uniform strategy showed the poorest fit, with a log-likelihood value of −8143, indicating its relative inefficiency. The Random strategy’s performance, marked by a log-likelihood value of −6668, suggests the potential benefit of incorporating randomness in initial parameter selection or utilizing prior knowledge to enhance the EM algorithm’s capability to identify good parameter sets. The Keep original strategy with the confidence level set at 5 demonstrated the best fit, with a log-likelihood value of −6584, followed by slight decreases in fit as the confidence level increased, evidenced by log-likelihood values from −6586 to −6603 for ESS levels from 10 to 30.
This pattern underscores that while a model with strong priors can be advantageous, it does not inherently guarantee a better fit to the data. The emphasis should instead be placed on the quality of prediction as a function of the ESS. Moreover, using the original parameters set by experts and adjusting the confidence level allows for the nuanced integration of new data, balancing between the preservation of existing model knowledge and the integration of new information. As the confidence level increases, the model adopts a more conservative approach in updating its parameters, which is reflected in the gradual decline in log-likelihood values, indicating a marginally less precise fit to the data with increasing ESS levels. This trend emphasizes the nuanced interplay between model rigidity and adaptability in the context of parameter updating.
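To make the log-likelihood criterion concrete, the following sketch scores toy state sequences under two transition models; a model that matches the data's regularities yields a log-likelihood closer to zero. All numbers are illustrative and unrelated to the values in Table 7.

```python
import math

def log_likelihood(sequences, trans):
    """Log-probability of observed state sequences under a first-order
    transition model trans[i][j] = P(state_t = j | state_{t-1} = i).

    The closer this (negative) number is to zero, the better the fit.
    """
    ll = 0.0
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            ll += math.log(trans[prev][cur])
    return ll

data = [[0, 0, 0, 1, 1], [1, 1, 1, 1, 0]]  # toy state sequences
fitted = [[0.8, 0.2], [0.2, 0.8]]          # matches the data's inertia
uniform = [[0.5, 0.5], [0.5, 0.5]]         # analogue of the Uniform strategy
print(log_likelihood(data, fitted) > log_likelihood(data, uniform))  # → True
```

The fitted model assigns the data a higher probability than the uniform one, mirroring the ranking of strategies by log-likelihood reported above.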

4.2. Performance of the DBN Using Cross-Validation and Holdout Test

Table 8 outlines the performance of various learning strategies under 10-fold cross-validation and a holdout test, assessing the generalizability of the DBNs using accuracy, sensitivity, and specificity. In 10-fold cross-validation, the dataset is divided into 10 equal parts; one part is used as the test set and the remaining nine parts are used for training. This process is repeated 10 times, each time using a different part for validation, and the results are averaged to provide a robust estimate of model performance. Changing the number of folds may affect the estimates because it alters the size and variability of the training and validation sets: more folds generally provide a more reliable estimate at a higher computational cost, while fewer folds may lead to higher variance in performance estimates. The low class is treated as the positive class, as identifying underperforming learners is of primary interest.
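The cross-validation procedure described above can be sketched as a generic harness. The majority-class "model" in the usage example is a stand-in to exercise the harness, not our DBN.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_validate(X, y, fit, predict, k=10):
    """Average accuracy over k rotations: each fold serves once as the
    test set while the remaining folds train the model."""
    folds = kfold_indices(len(y), k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        accuracies.append(np.mean(predict(model, X[test]) == y[test]))
    return float(np.mean(accuracies))

# Usage with a trivial majority-class 'model' (a stand-in for the DBN):
X = np.zeros((100, 1))
y = np.array([0] * 70 + [1] * 30)
fit = lambda X_tr, y_tr: int(np.bincount(y_tr).argmax())
predict = lambda model, X_te: np.full(len(X_te), model)
print(round(cross_validate(X, y, fit, predict), 2))  # → 0.7
```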
From the cross-validation results, a trend of improved accuracy is evident from the baseline (timestamp 0) to the final timestamp (9) across all strategies except Uniform. The Uniform strategy yields an overall accuracy of around 48%, slightly worse than random guessing. This strategy predominantly predicts the low class while proving ineffective for the high class. Performance further declines at timestamps 5 and 9, with both sensitivity and specificity dropping to around 50% or below. In the holdout test, the Uniform strategy also underperforms severely in specificity, with overall accuracies not exceeding 55% at any time stamp, indicating consistently inadequate predictive reliability for the high class.
The Random strategy shows a marked improvement over the Uniform strategy, with overall accuracies of 86% in 10-fold cross-validation and 90% in holdout tests. It maintains a more balanced performance between low and high predictions, with particularly high sensitivity rates at timestamps 5 and 9, reaching up to 97%, suggesting that it is more effective than the Uniform strategy. The Keep original strategy demonstrates the best performance among the outlined strategies. With an overall accuracy of 87% in 10-fold cross-validation and 91% in holdout tests at ESS levels of 5 and 10, it displays consistent and reliable predictive ability for both low and high classes across all time stamps.
Considering the overall accuracy of the strategies, Keep original (with an ESS of 5 or 10) appears to have slightly better accuracy at all time stamps, suggesting that allowing the model to gradually adjust to new data from a position of low initial confidence may lead to the most accurate classification of learners in both the low and high categories. The high performance of the Keep original strategy suggests that integrating expert knowledge with moderate confidence could enhance learning. Therefore, it is crucial to recognize that the suitability of a model cannot be determined solely by its log-likelihood values; the incorporation of prior knowledge and the confident application of such information during the learning process are equally important considerations.
To enhance our analysis of how effectively the strategies classify learners at various time points in the 10-fold cross-validation and holdout test, we compiled the AUC (Area Under the ROC Curve) values in Table 9. In both the cross-validation and the holdout test, the Keep original and Random strategies show a marked improvement over the Uniform strategy at all time stamps. More importantly, regardless of the confidence level, the Keep original strategy performs better than the Random strategy, especially at time stamp 0. A consistent AUC value across both classes suggests that the model’s ability to identify low performers (which is critical in the educational context for early interventions) is as good as its ability to identify high performers.
We found that upon assessing strategies through accuracy, sensitivity, and specificity, Keep original with ESSs of 5 and 10 initially showcased robust precision in classifying learners. However, our in-depth examination utilizing AUC metrics, which account for various thresholds for a comprehensive evaluation, identified Keep original with ESSs of 20 and 30 as distinctly superior. Their elevated AUC scores confirm their efficacy in early prediction tasks, revealing a nuanced and effective method for accurately discerning learner outcomes.
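For reference, the AUC corresponds to the probability that a randomly chosen positive case is ranked above a randomly chosen negative one, which can be computed directly from predicted scores. The labels and scores below are illustrative, not taken from our dataset.

```python
def auc(labels, scores):
    """Probability that a randomly chosen positive receives a higher score
    than a randomly chosen negative, counting ties as half: the AUC."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Scores that mostly rank positives above negatives give a high AUC.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc(labels, scores))  # 8 of 9 positive-negative pairs ranked correctly: 8/9
```

Unlike accuracy, this quantity is threshold-free, which is why it can reorder strategies that look similar under a single 0.5 cutoff.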
There is one more aspect of the performance of a probabilistic model that is worth examining. The main output of a probabilistic system is a marginal probability distribution of a variable of interest. In that distribution, we are usually interested in the probability of an outcome of interest. Because this probability is typically used in making a decision (e.g., what tutoring action to undertake next, depending on whether the student has mastered a concept), we would like it to be as precise as possible. Calibration curves express how well a model is calibrated, i.e., how precise are the probabilities that it produces. Figure 6 shows a set of calibration curves for the models based on expert parameters with ESS = 20 and ESS = 30 developed in this study predicting performance = low at time steps 0, 5, and 9. The horizontal axes show the probability of performance being low, as calculated using each model. The vertical axes show the actual frequency of performance being low in the data. For a model that is perfectly precise, i.e., that outputs probabilities that correspond directly to the frequencies observed in the data, the calibration curve is a diagonal line connecting points (0,0) and (1,1). As we can see, none of the curves are perfect, but the calibration curves for t = 0 and t = 5 seem to be significantly better than the curves for t = 9. This makes sense; predicting the state of a variable closer in the future is easier than predicting it further in the future. There seem to be no noticeable differences between the curves for ESS = 20 and ESS = 30. Consequently, we select Keep original (20) as the preferred strategy for parameter learning.
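A calibration curve of the kind shown in Figure 6 can be computed by binning predicted probabilities and comparing each bin's mean prediction with the observed frequency of the positive class. This is a minimal sketch with illustrative data; plotting is omitted.

```python
import numpy as np

def calibration_curve(y_true, y_prob, n_bins=5):
    """Bin predicted probabilities and compare each bin's mean prediction
    with the observed frequency of the positive class in that bin.

    Points near the diagonal indicate a well-calibrated model.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(y_prob, edges) - 1, 0, n_bins - 1)
    mean_pred, frac_pos = [], []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            mean_pred.append(y_prob[mask].mean())
            frac_pos.append(y_true[mask].mean())
    return np.array(mean_pred), np.array(frac_pos)

# A perfectly calibrated prediction of 0.8 means the event occurs ~80% of the time.
probs = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
labels = [0, 0, 1, 1, 1, 1, 0]
pred, obs = calibration_curve(labels, probs, n_bins=2)
print(pred, obs)  # each bin: mean prediction vs. observed positive rate
```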

4.3. The Effect of Using Higher-Order Influences on the Model’s Performance

In order to investigate the effect of using higher-order influences (we will refer to this somewhat informally as memory of the model or, in short, memory) on the model’s performance, we explore the integration of advanced temporal dependencies into the DBNs and their impact on predictive accuracy. The adoption of 2nd-, 3rd-, and 5th-order temporal arcs allows for the nuanced modeling of learners’ performance over time, capturing complex patterns of learning and identifying underperforming students. This approach challenges the traditional reliance on the single immediately preceding time step, embracing a broader temporal perspective that reflects the cumulative effect of past learning experiences on current performance.
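The difference between first-order and higher-order influences can be made concrete with a small sketch: a k-th-order temporal arc makes the CPT of a node depend on its last k states rather than only the immediately preceding one. The parameters below are made up for illustration and are not the learned CPTs of our model.

```python
# First-order memory conditions only on t-1; a higher-order model conditions
# on the last k states. Illustrative parameters for a 2-state node.
P1 = {0: [0.8, 0.2], 1: [0.2, 0.8]}            # P(s_t | s_{t-1})
P2 = {(0, 0): [0.9, 0.1], (0, 1): [0.6, 0.4],  # P(s_t | s_{t-2}, s_{t-1})
      (1, 0): [0.4, 0.6], (1, 1): [0.1, 0.9]}

history = [1, 0]                   # s_{t-2} = 1 (high), s_{t-1} = 0 (low)
first_order = P1[history[-1]]      # ignores s_{t-2}
second_order = P2[tuple(history)]  # the 2nd-order arc changes the prediction
print(first_order, second_order)   # → [0.8, 0.2] [0.4, 0.6]
```

Here the memory of an earlier high state raises the predicted probability of returning to the high state, exactly the kind of cumulative effect that a single preceding time step cannot capture.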
Figure 7 shows the structure of the model that includes 2nd-, 3rd-, and 5th-order influences in our original network structure. Learning the model parameters with the EM algorithm under the Keep original (ESS = 20) strategy yielded a log-likelihood value of −8143, an even worse fit of the model to the data than the previously reported results.
Table 10 demonstrates the efficacy of models with higher-order temporal influences during 10-fold cross-validation and holdout tests for the Keep original (ESS = 20) initial strategy (as we found it to result in models of higher accuracy) using various metrics. The introduction of higher-order temporal influences into the model increases its performance in both cross-validation and holdout tests. Specifically, in cross-validation, the AUC for all three timestamps improved by up to 2%, while in the holdout test, the accuracy at timestamp 0 saw an increase of up to 3%, with the overall accuracy reaching 92%. This improvement, when compared to the first-order results detailed in Section 4.2, suggests that the integration of higher-order temporal influences into the DBN model not only potentially enhances the model’s predictive accuracy during the cross-validation phase but also boosts its overall and timestamp-specific generalizability (demonstrated by a 4% increase in accuracy and a 2% improvement in AUC at timestamp 0).
The enhanced performance observed can be linked to the model’s expanded memory capacity, enabling a nuanced grasp of temporal dynamics and relationships in the data. Incorporating higher-order temporal influences allows the DBN to more accurately model the sequence and effects of historical events, thus enhancing prediction accuracy. This improvement is pivotal for both cross-validation, which measures predictive success against known data, and generalizability, the model’s ability to predict outcomes for new data. Essentially, adding higher-order influences to the DBN refines its predictive capabilities through a deeper temporal analysis, thereby improving its adaptability and reliability across different testing environments.

5. Discussion and Conclusions

The research presented in this paper offers an innovative approach to learner modeling, utilizing DBNs enhanced by higher-order temporal influences, to predict learner performance in real-time scenarios, such as during gameplay. We applied this methodology to analyze data from the AutoThinking game, which provided a rich dataset for examining the nuances of learner interactions and decision-making processes.
Our findings highlight the effectiveness of the EM parameter learning algorithm when applied with different initial strategies, particularly the Keep original strategy, which consistently outperformed the others in terms of log-likelihood values and model fit. This suggests that starting with expert-informed priors can significantly influence the learning outcomes, affirming the value of incorporating domain knowledge into the modeling process. It is important to note that the concept of equivalent sample size (ESS) in Bayesian network parameter learning is pivotal for calibrating the influence of prior knowledge against observed data. A lower ESS generally leads to a model that is more responsive to new data, whereas a higher ESS tends to preserve the influence of prior assumptions, providing stability but possibly at the cost of adaptability. This delicate balance between data responsiveness and prior reliability is crucial in settings where prior knowledge is robust, yet evolving insights are essential. This finding is in line with the study by Cano et al. [56], who indicated that adjustments to the ESS can significantly impact the accuracy of the inferred network models.
In terms of predictive accuracy, our model demonstrated substantial efficacy. The cross-validation and holdout tests underscored the robust generalizability of the DBN, with the Keep original strategy performing particularly well. This aligns with the studies conducted by Abbasi et al. [41], Seffrin et al. [42], and Käser et al. [18], which indicate that DBNs, when correctly tuned, offer powerful tools for predicting outcomes in complex, time-dependent data environments.
The inclusion of higher-order temporal influences enhanced model performance, supporting the hypothesis that more complex temporal dependencies can be crucial for accurate learner modeling. This is consistent with the recent study by Choi and Mislevy [43], which reported that designing a DBN with a higher-order Markov property can reflect a more realistic educational setting, thus emphasizing the benefits of incorporating longer temporal scopes into predictive models to capture deeper, more nuanced learner behaviors and trends over time. By extending the memory capacity beyond immediate prior states, our model effectively utilized historical performance data to predict future learner actions and performance with greater accuracy.
In conclusion, this study demonstrates the effectiveness of DBNs enhanced by higher-order temporal influences in accurately predicting learner performance across complex, time-dependent scenarios. By carefully tuning the initial conditions of the EM algorithm and adjusting the equivalent sample size, our models not only achieved a superior fit but also exhibited robust generalizability in various testing environments. The incorporation of higher-order temporal influences improved the models’ predictive accuracy, highlighting the potential of these advanced techniques in creating adaptive learning environments tailored to individual educational needs. This approach not only aligns with but also enhances current educational practices by offering clearer insights into learner behaviors and enabling more effective interventions for both learners and teachers. For instance, Tripon [57] emphasizes the importance of integrating computational thinking into teacher training and educational activities. By applying advanced DBN techniques, our research contributes to this area by offering a method for more accurately tracking and enhancing learner performance, thereby supporting educators in adapting their teaching strategies to meet students’ evolving needs. Moreover, by improving the ability to monitor and support the development of computational thinking skills, our study takes an important step towards preparing students for future challenges and reinforces the importance of these skills in contemporary educational frameworks.

Limitations and Future Work

While this research provides valuable insights into learner modeling using DBNs with higher-order temporal influences, a few limitations warrant consideration for future investigations. Firstly, the reliance on data solely from the AutoThinking game might restrict the generalizability of the findings to other educational contexts. Expanding the dataset to include diverse educational games or scenarios could enhance the robustness and applicability of the proposed modeling approach.
Secondly, the current model constrains the analysis to fixed sequences of actions, specifically 10 time stamps. Extending the temporal scope or implementing dynamic sequence lengths could capture more nuanced patterns in learner behavior and decision-making processes. Additionally, deploying the proposed model in other real-world educational settings would provide valuable insights into its effectiveness and practical utility, potentially informing instructional design and intervention strategies. Overall, addressing these limitations and advancing the proposed model could further enhance its efficacy and relevance in supporting learner-centered educational practices.

Author Contributions

Conceptualization, D.H. and M.J.D.; methodology, D.H. and M.J.D.; validation, D.H. and M.J.D.; formal analysis and data curation, D.H. and M.J.D.; writing—original draft preparation, D.H. and M.J.D.; writing—review and editing, D.H. and M.J.D.; funding acquisition, D.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Estonian Research Council grant (PRG2215).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated and analyzed during the current study are not publicly available due to privacy or ethical restrictions. The corresponding author can provide a sample of the dataset on reasonable request.

Acknowledgments

The model development, as well as the analyses described in this paper, was conducted using GeNIe Modeler, available free of charge for academic research and teaching use from BayesFusion, LLC, https://www.bayesfusion.com/ (accessed on 21 June 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Aulck, L.; Velagapudi, N.; Blumenstock, J.; West, J. Predicting Student Dropout in Higher Education. arXiv 2016, arXiv:1606.06364. [Google Scholar]
  2. Hooshyar, D.; Huang, Y.-M.; Yang, Y. A Three-Layered Student Learning Model for Prediction of Failure Risk in Online Learning. Hum.-Centric Comput. Inf. Sci. 2022, 12, 28. [Google Scholar]
  3. Sweeney, M.; Lester, J.; Rangwala, H. Next-Term Student Grade Prediction. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October 2015–1 November 2015; pp. 970–975. [Google Scholar]
  4. Yang, Y.; Hooshyar, D.; Pedaste, M.; Wang, M.; Huang, Y.-M.; Lim, H. Prediction of Students’ Procrastination Behaviour through Their Submission Behavioural Pattern in Online Learning. J. Ambient. Intell. Humaniz. Comput. 2020, 1–18. [Google Scholar]
  5. Chang, T.-W.; Kurcz, J.; El-Bishouty, M.M.; Kinshuk; Graf, S. Adaptive and Personalized Learning Based on Students’ Cognitive Characteristics. In Ubiquitous Learning Environments and Technologies; Springer: Berlin/Heidelberg, Germany, 2015; pp. 77–97. [Google Scholar]
  6. Munshi, A.; Biswas, G.; Baker, R.; Ocumpaugh, J.; Hutt, S.; Paquette, L. Analysing Adaptive Scaffolds That Help Students Develop Self-regulated Learning Behaviours. J. Comput. Assist. Learn. 2023, 39, 351–368. [Google Scholar]
  7. Raj, N.S.; Renumol, V. A Systematic Literature Review on Adaptive Content Recommenders in Personalized Learning Environments from 2015 to 2020. J. Comput. Educ. 2022, 9, 113–148. [Google Scholar]
  8. Abyaa, A.; Khalidi Idrissi, M.; Bennani, S. Learner Modelling: Systematic Review of the Literature from the Last 5 Years. Educ. Technol. Res. Dev. 2019, 67, 1105–1143. [Google Scholar]
  9. d’Avila Garcez, A.; Bader, S.; Bowman, H.; Lamb, L.C.; de Penning, L.; Poon, H.; Zaverucha, G. Neural-Symbolic Learning and Reasoning: A Survey and Interpretation. Neuro-Symb. Artif. Intell. State Art 2022, 342, 327. [Google Scholar]
  10. Hooshyar, D.; Yang, Y. Neural-Symbolic Computing: A Step toward Interpretable AI in Education. Bull. Tech. Comm. Learn. Technol. 2021, 21, 2–6. [Google Scholar]
  11. Hooshyar, D.; Azevedo, R.; Yang, Y. Augmenting Deep Neural Networks with Symbolic Educational Knowledge: Towards Trustworthy and Interpretable AI for Education. Mach. Learn. Knowl. Extr. 2024, 6, 593–618. [Google Scholar] [CrossRef]
  12. Nielsen, M.A. Neural Networks and Deep Learning; Determination Press: San Francisco, CA, USA, 2015; Volume 25. [Google Scholar]
  13. Vincent-Lancrin, S.; Van der Vlies, R. Trustworthy Artificial Intelligence (AI) in Education: Promises and Challenges; OECD Publishing: Paris, France, 2020. [Google Scholar]
  14. Ye, W.; Zheng, G.; Cao, X.; Ma, Y.; Hu, X.; Zhang, A. Spurious Correlations in Machine Learning: A Survey. arXiv 2024, arXiv:2402.12715. [Google Scholar]
  15. Almond, R.G.; Mislevy, R.J.; Steinberg, L.S.; Yan, D.; Williamson, D.M. Bayesian Networks in Educational Assessment; Springer: New York, NY, USA, 2015; ISBN 1-4939-2125-8. [Google Scholar]
  16. Pelánek, R. Bayesian Knowledge Tracing, Logistic Models, and beyond: An Overview of Learner Modeling Techniques. User Model. User-Adapt. Interact. 2017, 27, 313–350. [Google Scholar]
  17. Cui, Y.; Chu, M.-W.; Chen, F. Analyzing Student Process Data in Game-Based Assessments with Bayesian Knowledge Tracing and Dynamic Bayesian Networks. J. Educ. Data Min. 2019, 11, 80–100. [Google Scholar]
  18. Käser, T.; Klingler, S.; Schwing, A.G.; Gross, M. Dynamic Bayesian Networks for Student Modeling. IEEE Trans. Learn. Technol. 2017, 10, 450–462. [Google Scholar]
  19. Reichenberg, R. Dynamic Bayesian Networks in Educational Measurement: Reviewing and Advancing the State of the Field. Appl. Meas. Educ. 2018, 31, 335–350. [Google Scholar]
  20. Conati, C.; Porayska-Pomsta, K.; Mavrikis, M. AI in Education Needs Interpretable Machine Learning: Lessons from Open Learner Modelling. arXiv 2018, arXiv:1807.00154. [Google Scholar]
  21. Meltzer, J.P.; Tielemans, A. The European Union AI Act: Next Steps and Issues for Building International Cooperation in AI; Brookings Institution: Washington, DC, USA, 2022. [Google Scholar]
  22. Rosé, C.P.; McLaughlin, E.A.; Liu, R.; Koedinger, K.R. Explanatory Learner Models: Why Machine Learning (Alone) Is Not the Answer. Br. J. Educ. Technol. 2019, 50, 2943–2958. [Google Scholar]
  23. López Zambrano, J.; Lara Torralbo, J.A.; Romero Morales, C. Early Prediction of Student Learning Performance through Data Mining: A Systematic Review. Psicothema 2021, 33, 456–465. [Google Scholar]
  24. Molenaar, I.; Järvelä, S. Sequential and Temporal Characteristics of Self and Socially Regulated Learning. Metacognition Learn. 2014, 9, 75–85. [Google Scholar]
  25. Saqr, M.; López-Pernas, S. The Temporal Dynamics of Online Problem-Based Learning: Why and When Sequence Matters. Int. J. Comput.-Support. Collab. Learn. 2023, 18, 11–37. [Google Scholar]
  26. Rowe, J.; Lester, J. Modeling User Knowledge with Dynamic Bayesian Networks in Interactive Narrative Environments. Proc. AAAI Conf. Artif. Intell. Interact. Digit. Entertain. 2010, 6, 57–62. [Google Scholar]
  27. Levy, R. Dynamic Bayesian Network Modeling of Game-Based Diagnostic Assessments. Multivar. Behav. Res. 2019, 54, 771–794. [Google Scholar]
  28. Hooshyar, D. Temporal Learner Modelling through Integration of Neural and Symbolic Architectures. Educ. Inf. Technol. 2024, 29, 1119–1146. [Google Scholar]
  29. Łupińska-Dubicka, A. Probabilistic Graphical Models of Time-Dependent Domains with Memory: Application to Monitoring Woman’s Monthly Cycle. Doctoral Dissertation, Politechnika Białostocka, Białystok, Poland, 2014. [Google Scholar]
  30. Łupińska-Dubicka, A.; Druzdzel, M.J. Modeling Dynamic Processes with Memory by Higher Order Temporal Models. In Foundations of Biomedical Knowledge Representation: Methods and Applications; Springer: Cham, Switzerland, 2015; pp. 219–232. [Google Scholar]
  31. Hooshyar, D.; Lim, H.; Pedaste, M.; Yang, K.; Fathi, M.; Yang, Y. AutoThinking: An Adaptive Computational Thinking Game; Springer: Cham, Switzerland, 2019; pp. 381–391. [Google Scholar]
  32. Wing, J.M. Computational Thinking. Commun. ACM 2006, 49, 33–35. [Google Scholar]
  33. Denning, P.J.; Tedre, M. Computational Thinking; MIT Press: Cambridge, MA, USA, 2019; ISBN 0-262-53656-0. [Google Scholar]
  34. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann: Burlington, MA, USA, 1988; ISBN 1-55860-479-0. [Google Scholar]
  35. Conati, C.; Gertner, A.S.; VanLehn, K.; Druzdzel, M.J. On-Line Student Modeling for Coached Problem Solving Using Bayesian Networks; Springer: Vienna, Austria, 1997; pp. 231–242. [Google Scholar]
  36. Millán, E.; Descalço, L.; Castillo, G.; Oliveira, P.; Diogo, S. Using Bayesian Networks to Improve Knowledge Assessment. Comput. Educ. 2013, 60, 436–447. [Google Scholar]
  37. Sabourin, J.; Mott, B.; Lester, J.C. Modeling Learner Affect with Theoretically Grounded Dynamic Bayesian Networks; Springer: Berlin/Heidelberg, Germany, 2011; pp. 286–295. [Google Scholar]
  38. Käser, T.; Klingler, S.; Schwing, A.G.; Gross, M. Beyond Knowledge Tracing: Modeling Skill Topologies with Bayesian Networks; Springer: Cham, Switzerland, 2014; pp. 188–198. [Google Scholar]
  39. Ting, C.-Y.; Cheah, W.-N.; Ho, C.C. Student Engagement Modeling Using Bayesian Networks. In Proceedings of the 2013 IEEE International Conference on Systems, Man, and Cybernetics, Manchester, UK, 13–16 October 2013; pp. 2939–2944. [Google Scholar]
  40. Grawemeyer, B.; Mavrikis, M.; Holmes, W.; Gutierrez-Santos, S. Adapting Feedback Types According to Students’ Affective States; Springer: Cham, Switzerland, 2015; pp. 586–590. [Google Scholar]
  41. Abbasi, A.R.; Dailey, M.N.; Afzulpurkar, N.V.; Uno, T. Student Mental State Inference from Unintentional Body Gestures Using Dynamic Bayesian Networks. J. Multimodal User Interfaces 2010, 3, 21–31. [Google Scholar]
  42. Seffrin, H.; Bittencourt, I.I.; Isotani, S.; Jaques, P.A. Modelling Students’ Algebraic Knowledge with Dynamic Bayesian Networks. In Proceedings of the 2016 IEEE 16th International Conference on Advanced Learning Technologies (ICALT), Austin, TX, USA, 25–28 July 2016; pp. 44–48. [Google Scholar]
  43. Choi, Y.; Mislevy, R.J. Evidence Centered Design Framework and Dynamic Bayesian Network for Modeling Learning Progression in Online Assessment System. Front. Psychol. 2022, 13, 742956. [Google Scholar]
  44. Han, Y.; Liu, H.; Ji, F. A Sequential Response Model for Analyzing Process Data on Technology-Based Problem-Solving Tasks. Multivar. Behav. Res. 2022, 57, 960–977. [Google Scholar]
  45. Barata, G.; Gama, S.; Jorge, J.; Gonçalves, D. Early Prediction of Student Profiles Based on Performance and Gaming Preferences. IEEE Trans. Learn. Technol. 2016, 9, 272–284. [Google Scholar]
  46. Geden, M.; Emerson, A.; Carpenter, D.; Rowe, J.; Azevedo, R.; Lester, J. Predictive Student Modeling in Game-Based Learning Environments with Word Embedding Representations of Reflection. Int. J. Artif. Intell. Educ. 2020, 31, 1–23. [Google Scholar]
  47. Min, W.; Frankosky, M.H.; Mott, B.W.; Rowe, J.P.; Smith, A.; Wiebe, E.; Boyer, K.E.; Lester, J.C. DeepStealth: Game-Based Learning Stealth Assessment with Deep Neural Networks. IEEE Trans. Learn. Technol. 2019, 13, 312–325. [Google Scholar]
  48. Hooshyar, D.; El Mawas, N.; Milrad, M.; Yang, Y. Modeling Learners to Early Predict Their Performance in Educational Computer Games. IEEE Access 2023, 11, 20399–20417. [Google Scholar]
  49. Miller, T. Explanation in Artificial Intelligence: Insights from the Social Sciences. Artif. Intell. 2019, 267, 1–38. [Google Scholar]
  50. Murphy, K.P. Dynamic Bayesian Networks. Probabilistic Graph. Models M. Jordan 2002, 7, 431. [Google Scholar]
  51. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar]
  52. Hooshyar, D. Effects of Technology-enhanced Learning Approaches on Learners with Different Prior Learning Attitudes and Knowledge in Computational Thinking. Comput. Appl. Eng. Educ. 2022, 30, 64–76. [Google Scholar]
  53. El Mawas, N.; Hooshyar, D.; Yang, Y. Investigating the Learning Impact of Autothinking Educational Game on Adults: A Case Study of France. CSEDU 2020, 2, 188–196. [Google Scholar]
  54. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. R. Stat. Soc. Ser. B (Methodol.) 1977, 39, 1–22. [Google Scholar]
  55. Lauritzen, S.L. The EM Algorithm for Graphical Association Models with Missing Data. Comput. Stat. Data Anal. 1995, 19, 191–201. [Google Scholar]
  56. Cano, A.; Gómez-Olmedo, M.; Masegosa, A.R.; Moral, S. Locally Averaged Bayesian Dirichlet Metrics for Learning the Structure and the Parameters of Bayesian Networks. Int. J. Approx. Reason. 2013, 54, 526–540. [Google Scholar]
  57. Tripon, C. Supporting Future Teachers to Promote Computational Thinking Skills in Teaching STEM—A Case Study. Sustainability 2022, 14, 12663. [Google Scholar] [CrossRef]
Figure 1. Comparative navigation strategies in maze traversal by a learner: (a) a basic approach using arrows for each step; (b) an advanced strategy incorporating CT principles, such as loops and conditionals, alongside skills, like pattern recognition and generalization, demonstrated by the creation of a reusable function.
Figure 2. Percentage of true values in the training and the test datasets.
Figure 3. The structure of the proposed DBN for the early prediction of student performance.
Figure 6. Calibration curves of the EM algorithm’s Keep original initial strategy for identifying low performers for ESS = 20 and ESS = 30, shown at three different timestamps: (a) ESS = 20 at t = 0, (b) ESS = 30 at t = 0, (c) ESS = 20 at t = 5, (d) ESS = 30 at t = 5, (e) ESS = 20 at t = 9, and (f) ESS = 30 at t = 9.
Figure 7. The proposed DBN with higher-order influences for the early prediction of performance.
Table 1. CPT for the node concept at (t ≥ 1).
| (Self) (t−1) | Low | High |
|---|---|---|
| low | 0.7 | 0.3 |
| high | 0.3 | 0.7 |
Table 2. CPT for the node performance.
| Concept | Low | Low | High | High |
| Skill | Low | High | Low | High |
|---|---|---|---|---|
| low | 0.99 | 0.5 | 0.5 | 0.01 |
| high | 0.01 | 0.5 | 0.5 | 0.99 |
Table 3. Refined CPT for the node concept at (t ≥ 1).
| (Self) (t−1) | Low | High |
|---|---|---|
| low | 0.90 | 0.09 |
| high | 0.10 | 0.91 |
Table 4. Refined CPT for the node performance.
| Concept | Low | Low | High | High |
| Skill | Low | High | Low | High |
|---|---|---|---|---|
| low | 0.98 | 0.84 | 0.08 | 0.02 |
| high | 0.02 | 0.16 | 0.92 | 0.98 |
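Read as conditional probability tables, the refined parameters above fully specify the first-order model. As a minimal illustration (a sketch in plain Python, not the authors' GeNIe/SMILE implementation), the following forward-propagates the Table 3 transition for the latent concept node across the model's ten time slices:

```python
# Transition CPT from Table 3: P(concept_t | concept_{t-1}).
# Outer key = child state at t, inner key = parent state at t-1.
T = {
    "low":  {"low": 0.90, "high": 0.09},
    "high": {"low": 0.10, "high": 0.91},
}

def propagate(prior, steps):
    """Forward-propagate the marginal P(concept_t) for `steps` transitions."""
    belief = dict(prior)
    history = [dict(belief)]
    for _ in range(steps):
        belief = {child: sum(T[child][parent] * belief[parent]
                             for parent in belief)
                  for child in T}
        history.append(dict(belief))
    return history

# Uniform prior at t = 0, rolled forward to t = 9.
beliefs = propagate({"low": 0.5, "high": 0.5}, 9)
```

With no evidence entered, the marginals drift toward the chain's stationary distribution, which is why beliefs about distant slices become progressively less informative.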
Table 5. Introducing evidence to explore the temporal beliefs.
| Time | Conditional | Simulation |
|---|---|---|
| 0 | - | High (100%) |
| 1 | Low (100%) | - |
| 2 | - ¹ | - |
| 3 | - | High (100%) |
| 4 | - | - |
| 5 | Low (100%) | High (100%) |
| 6 | - | - |
| 7 | - | - |
| 8 | - | - |
| 9 | - | - |
¹ Missing evidence.
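Entering hard evidence at some slices while leaving others unobserved, as in Table 5, is standard forward filtering. The sketch below mirrors that pattern; the transition matrix `T` and the emission CPT `E` are assumed toy values for illustration, not the paper's learned parameters:

```python
# Forward filtering with intermittent hard evidence (Table 5 pattern).
# T and E are illustrative values, not the learned model.
T = {"low": {"low": 0.9, "high": 0.1},    # keyed prev_state -> next_state
     "high": {"low": 0.1, "high": 0.9}}
E = {"low": {"incorrect": 0.8, "correct": 0.2},   # P(answer | latent state)
     "high": {"incorrect": 0.2, "correct": 0.8}}

def filter_beliefs(evidence, steps, prior=(0.5, 0.5)):
    """evidence: dict time -> observation; absent keys = missing evidence."""
    belief = {"low": prior[0], "high": prior[1]}
    out = []
    for t in range(steps):
        if t > 0:  # transition from the previous slice
            belief = {s: sum(T[p][s] * belief[p] for p in belief) for s in T}
        obs = evidence.get(t)
        if obs is not None:  # condition on the observation and renormalize
            belief = {s: belief[s] * E[s][obs] for s in belief}
            z = sum(belief.values())
            belief = {s: v / z for s, v in belief.items()}
        out.append(dict(belief))
    return out

# Evidence at t = 0, 3, and 5 only, echoing the Simulation column of Table 5.
beliefs = filter_beliefs({0: "correct", 3: "correct", 5: "correct"}, 10)
```

After the last observation at t = 5, the belief relaxes back toward the stationary distribution, reproducing the "fading" of temporal beliefs that the evidence exploration illustrates.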
Table 6. Strength of influence values, listed in descending order.
| Parent | Child | Maximum Strength |
|---|---|---|
| concept | performance | 0.86 |
| skill | simulation | 0.66 |
| concept | loop | 0.23 |
| skill | debug | 0.17 |
| skill | performance | 0.10 |
| concept | conditional | 0.09 |
| skill | function | 0.06 |
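Strength of influence, as reported by tools such as GeNIe, summarizes how much changing a parent's state shifts the child's conditional distribution. A plausible sketch is below, assuming the maximum total-variation distance between CPT columns as the distance measure (the exact metric the tool uses may differ):

```python
def strength_of_influence(cpt):
    """Maximum pairwise distance between the child's conditional
    distributions across parent states.

    cpt: dict parent_state -> dict child_state -> probability.
    Total-variation distance is an assumption about the metric.
    """
    states = list(cpt)
    best = 0.0
    for i, a in enumerate(states):
        for b in states[i + 1:]:
            d = 0.5 * sum(abs(cpt[a][c] - cpt[b][c]) for c in cpt[a])
            best = max(best, d)
    return best

# Example: the concept transition of Table 3, read column-wise
# (parent state -> distribution over child states).
cpt = {"low": {"low": 0.90, "high": 0.10},
       "high": {"low": 0.09, "high": 0.91}}
s = strength_of_influence(cpt)
```

Under this reading, a strong parent (e.g., concept on performance, 0.86) is one whose state nearly determines the child's distribution, while a weak one (skill on function, 0.06) barely moves it.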
Table 7. Log-likelihood (logarithm of the probability of data given the model) values for different parameter learning strategies.
| Strategy | Log-Likelihood |
|---|---|
| Uniform | −8143 |
| Random | −6668 |
| Keep original (ESS = 5) | −6584 |
| Keep original (ESS = 10) | −6586 |
| Keep original (ESS = 20) | −6593 |
| Keep original (ESS = 30) | −6603 |
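Log-likelihood values like those in Table 7 accumulate the log of the forward-pass normalizing constants, i.e., the log-probability of each observation given the history, summed over all sequences in the data. A toy sketch (the `T` and `E` parameters are assumed, not the learned ones):

```python
import math

# Log-likelihood of one observation sequence under a two-state chain,
# accumulated from the forward-pass normalizers. Toy parameters.
T = {"low": {"low": 0.9, "high": 0.1},    # keyed prev_state -> next_state
     "high": {"low": 0.1, "high": 0.9}}
E = {"low": {"incorrect": 0.8, "correct": 0.2},
     "high": {"incorrect": 0.2, "correct": 0.8}}

def log_likelihood(obs_seq, prior=(0.5, 0.5)):
    belief = {"low": prior[0], "high": prior[1]}
    ll = 0.0
    for t, obs in enumerate(obs_seq):
        if t > 0:
            belief = {s: sum(T[p][s] * belief[p] for p in belief) for s in T}
        weighted = {s: belief[s] * E[s][obs] for s in belief}
        z = sum(weighted.values())   # P(obs_t | obs_{<t})
        ll += math.log(z)
        belief = {s: v / z for s, v in weighted.items()}
    return ll

ll = log_likelihood(["correct", "correct", "incorrect"])
```

Summing this quantity over every learner's sequence yields a dataset-level figure comparable across parameter-learning strategies; a less negative total (as for Keep original, ESS = 5) indicates a better fit to the data.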
Table 8. Performance of the model in cross-validation and holdout test (accuracy, sensitivity, and specificity).
| Validation | Strategy | 0 ¹ (Low/High) | 5 (Low/High) | 9 (Low/High) | Overall |
|---|---|---|---|---|---|
| Cross-validation | Uniform | 56% ² (1 ³/0 ⁴) | 43% (43/43) | 45% (35/56) | 48% |
| | Random | 79% (79/78) | 87% (87/87) | 91% (90/92) | 86% |
| | Original (5) | 79% (83/73) | 89% (89/88) | 92% (91/93) | 87% |
| | Original (10) | 78% (83/72) | 89% (90/88) | 92% (91/93) | 87% |
| | Original (20) | 77% (79/73) | 89% (91/88) | 92% (91/93) | 86% |
| | Original (30) | 77% (76/78) | 89% (91/88) | 92% (91/93) | 86% |
| Holdout test | Uniform | 55% (1/0) | 51% (1/0) | 54% (1/0) | 54% |
| | Random | 78% (81/74) | 97% (97/97) | 96% (95/97) | 90% |
| | Original (5) | 79% (81/76) | 97% (97/97) | 96% (95/97) | 91% |
| | Original (10) | 79% (81/76) | 97% (97/97) | 96% (95/97) | 91% |
| | Original (20) | 79% (81/76) | 97% (97/97) | 96% (95/97) | 91% |
| | Original (30) | 79% (81/76) | 97% (97/97) | 96% (95/97) | 91% |
¹ Time stamp, ² accuracy, ³ sensitivity, ⁴ specificity.
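The accuracy, sensitivity, and specificity figures in Table 8 follow directly from confusion counts. A minimal sketch, assuming low performers are treated as the positive class (an assumption; the paper's early-warning framing suggests it but does not state it):

```python
def classification_metrics(y_true, y_pred, positive="low"):
    """Accuracy, sensitivity, and specificity from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```

Note how the Uniform strategy's degenerate rows (sensitivity 1, specificity 0) fall out of these formulas: a model that predicts every learner as low-performing catches all positives while rejecting no negatives.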
Table 9. Performance of the model in cross-validation and holdout test (AUC).
| Validation | Strategy | 0 (Low/High) | 5 (Low/High) | 9 (Low/High) |
|---|---|---|---|---|
| Cross-validation | Uniform | 41/41 | 40/41 | 42/43 |
| | Random | 83/83 | 95/95 | 94/94 |
| | Original (5) | 84/84 | 95/95 | 94/94 |
| | Original (10) | 84/84 | 95/95 | 94/94 |
| | Original (20) | 85/85 | 95/95 | 94/94 |
| | Original (30) | 85/85 | 95/95 | 94/94 |
| Holdout test | Uniform | 43/35 | 46/41 | 51/35 |
| | Random | 84/84 | 99/99 | 98/98 |
| | Original (5) | 88/88 | 99/99 | 97/97 |
| | Original (10) | 88/88 | 99/99 | 97/97 |
| | Original (20) | 88/88 | 99/99 | 98/98 |
| | Original (30) | 88/88 | 99/99 | 98/98 |
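AUC values such as those in Table 9 can be computed without tracing an explicit ROC curve, via the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative one, with ties counted as half. A self-contained sketch:

```python
def auc(scores, labels, positive="low"):
    """Rank-based AUC: P(score_pos > score_neg), ties counted as 0.5.
    `scores` are predicted probabilities of the positive class."""
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated toy scores give AUC = 1.0.
value = auc([0.9, 0.8, 0.3, 0.2], ["low", "low", "high", "high"])
```

Because the measure depends only on ranking, it explains why the Uniform strategy hovers near-chance AUC (35-51) even when its accuracy looks moderate: its scores carry no discriminative ordering.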
Table 10. Performance of models with higher-order influences in cross-validation and holdout test for the Keep original (ESS = 20) initial strategy.
| Validation | Order of Influences | Acc ¹ (0) | Sen ² (0) | Spe ³ (0) | AUC ⁴ (0) | Acc (5) | Sen (5) | Spe (5) | AUC (5) | Acc (9) | Sen (9) | Spe (9) | AUC (9) | Overall Acc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cross-validation | 2 | 77 | 84 | 69 | 83/83 | 88 | 90 | 86 | 94/94 | 91 | 92 | 91 | 93/93 | 85 |
| | 3 | 77 | 84 | 69 | 83/83 | 88 | 91 | 86 | 94/94 | 91 | 92 | 91 | 93/93 | 86 |
| | 5 | 78 | 81 | 74 | 85/85 | 88 | 91 | 86 | 95/95 | 92 | 91 | 93 | 95/95 | 86 |
| Holdout test | 2 | 80 | 81 | 79 | 88/88 | 97 | 100 | 95 | 99/99 | 96 | 95 | 97 | 97/97 | 91 |
| | 3 | 80 | 81 | 79 | 88/88 | 97 | 100 | 95 | 99/99 | 96 | 95 | 97 | 98/98 | 91 |
| | 5 | 83 | 86 | 79 | 90/90 | 97 | 97 | 97 | 99/99 | 96 | 95 | 97 | 97/97 | 92 |
¹ Accuracy, ² sensitivity, ³ specificity, ⁴ AUC (low/high).
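A higher-order ("memory") influence means the latent node at slice t conditions on more than one earlier slice. The second-order sketch below shows the mechanics with assumed CPT values (the paper's models reach fifth order; its learned parameters are not reproduced here):

```python
# Second-order transition: concept_t depends on concept_{t-1} AND
# concept_{t-2}. CPT values are assumed, for illustration only.
T2 = {  # P(concept_t = "high" | (state_{t-1}, state_{t-2}))
    ("low", "low"): 0.05, ("low", "high"): 0.20,
    ("high", "low"): 0.60, ("high", "high"): 0.95,
}

def step(joint):
    """Advance the joint belief over the last two states by one slice."""
    new = {pair: 0.0 for pair in T2}
    for (s1, s2), p in joint.items():      # s1 = state at t-1, s2 at t-2
        p_high = T2[(s1, s2)]
        new[("high", s1)] += p * p_high    # new pair = (state_t, state_{t-1})
        new[("low", s1)] += p * (1 - p_high)
    return new

joint = {pair: 0.25 for pair in T2}        # uniform over the last two slices
for _ in range(8):                          # roll forward to slice t = 9
    joint = step(joint)
p_high = joint[("high", "low")] + joint[("high", "high")]
```

Tracking the joint over the last k states is exactly how a k-th-order chain reduces to a first-order one over an enlarged state space; the cost is a CPT that grows exponentially in k, which is why the accuracy gains from order 2 to order 5 in Table 10 come with heavier parameter-learning demands.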
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
