Article

Knowledge Graph and Personalized Answer Sequences for Programming Knowledge Tracing

The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7952; https://doi.org/10.3390/app14177952
Submission received: 23 July 2024 / Revised: 16 August 2024 / Accepted: 27 August 2024 / Published: 6 September 2024

Abstract:
Knowledge tracing is a significant research area in educational data mining, aiming to predict future performance based on students’ historical learning data. In the field of programming, several challenges are faced in knowledge tracing, including inaccurate exercise representation and limited student information. These issues can lead to biased models and inaccurate predictions of students’ knowledge states. To effectively address these issues, we propose a novel programming knowledge tracing model named GPPKT (Knowledge Graph and Personalized Answer Sequences for Programming Knowledge Tracing), which enhances performance by using knowledge graphs and personalized answer sequences. Specifically, we establish the associations between well-defined knowledge concepts and exercises, incorporating student learning abilities and latent representations generated from personalized answer sequences using Variational Autoencoders (VAE) in the model. This deep knowledge tracing model employs Long Short-Term Memory (LSTM) networks and attention mechanisms to integrate the embedding vectors, such as exercises and student information. Extensive experiments are conducted on two real-world programming datasets. The results indicate that GPPKT outperforms state-of-the-art methods, achieving an AUC of 0.8840 and an accuracy of 0.8472 on the Luogu dataset, and an AUC of 0.7770 and an accuracy of 0.8799 on the Codeforces dataset. This demonstrates the superiority of the proposed model, with an average improvement of 9.03% in AUC and 2.02% in accuracy across both datasets.

1. Introduction

With the continuous development of Massive Open Online Courses (MOOC), a large amount of online learning data can be used to accurately and promptly trace students’ learning states. Knowledge tracing (KT) is a data-driven method in education technology that uses a series of student interactions with exercises to predict their mastery of the Knowledge Concepts (KCs) corresponding to those exercises. However, many existing KT models, such as DKT [1], DKT+ [2], DKVMN [3], SAKT [4], IEKT [5], AKT [6], and GKT [7], suffer from limitations, including insufficient exercise representation and a lack of personalized knowledge state modeling. Details of these works are reviewed in the subsequent section.
Nowadays, programming has become a fundamental skill for solving real-world problems. However, despite the success of KT in MOOC systems, research on KT within programming education remains noticeably scarce, and the domain presents distinct challenges. Programming KT is mainly based on subjective exercises, and students usually answer questions by submitting code. The evaluation system judges the performance of students’ code using test data under different situations. In this context, we provide an example of students’ answers to programming exercises, as shown in Figure 1, where exercise $p_1$ corresponds to the KC “Struct”, $p_2$ includes the KCs “Struct” and “Pointer”, $p_3$ corresponds to the KC “LinkList”, and $p_4$ includes the KCs “LinkList” and “Bi-LinkList”. In the process of answering exercises in the field of programming, two aspects are worth considering:
(1)
Exercises are not independent: clear predecessor and successor relationships exist between KCs. For example, student $S_1$ attempted $p_4$ right after answering $p_2$ incorrectly; due to insufficient mastery of the prerequisites of $p_4$, the attempt resulted in a timeout. However, after completing exercises related to the prerequisite KCs, $p_2$ and $p_3$, student $S_1$ answered $p_4$ correctly.
(2)
Multiple attempts reflect individualized knowledge mastery. For example, student $S_2$ made two failed attempts even after answering exercise $p_2$ correctly; these submissions may have been made for testing purposes and should not be interpreted as a lack of mastery of the KCs involved. As another example, when facing exercise $p_1$, students $S_1$ and $S_2$ answered correctly on the first attempt, while student $S_3$ required three attempts to get it right. Therefore, the mastery of the knowledge concept “Struct” clearly differs between $S_1$ and $S_2$ on the one hand and $S_3$ on the other.
Current studies on programming KT have explored various methods. For example, Bayesian Knowledge Tracing (BKT) has been employed for code quality assessment, and code representation learning has been conducted using abstract syntax trees (AST) and token-based methods. While these methods have shown promise in tracing students’ knowledge states on individual exercises, they often overlook the broader relationships between exercises and KCs and lack personalization in modeling students’ overall knowledge states.
To address these challenges, we propose a programming KT model based on a knowledge graph and personalized answer sequences, designed for the diversity of programming exercises and named GPPKT. A knowledge graph provides a structured way to represent relationships between entities, making it especially suitable for organizing complex information. In the context of programming KT, the knowledge graph captures the hierarchical and relational structure of knowledge concepts and models the connections between these concepts more accurately. The Variational Autoencoder (VAE) [8] is a powerful generative model for learning latent representations of data; these latent variables capture the underlying structure and variability in the data. In programming KT, the VAE models personalized knowledge states, capturing the nuances of students’ answer sequences. Long Short-Term Memory (LSTM) networks [9] are employed for modeling sequential data because they capture long-term dependencies, which makes them an effective choice for KT, where a student’s learning process is influenced by a series of past interactions and answers. By integrating these components, GPPKT provides a powerful framework for programming knowledge tracing that improves both the representation of programming exercises and the modeling of students’ individual learning processes. In summary, the main contributions of this work are as follows:
  • We construct a knowledge graph in the programming field to constrain the embedding of knowledge concepts by perceiving their types. The resulting embedding vectors can effectively connect the KCs in the exercises.
  • In response to the phenomenon of students answering consecutive programming exercises, we explore students’ learning behavior and learning ability, and introduce a gating mechanism to balance their historical and current knowledge states. This approach better reflects the personalized knowledge mastery of students.
  • We conduct extensive experiments on two real-world programming datasets, showing that GPPKT outperforms state-of-the-art models, with an average AUC improvement of 9.0%. Ablation experiments are also performed to verify the reliability and effectiveness of each component of the study.
The rest of this paper is organized as follows: In Section 2, related work is introduced, and the improvements made by our method on these approaches are highlighted. In Section 3, the problem addressed by programming knowledge tracing is explained, and the components and implementation of the GPPKT model are detailed. Section 4 presents experimental comparisons of GPPKT with baseline methods on two programming datasets, along with ablation studies. Finally, in Section 5, the paper is concluded, and future work and potential applications are discussed.

2. Related Work

In recent years, significant advancements have been made in knowledge tracing through the application of deep learning techniques. Deep Knowledge Tracing (DKT) [1], a model based on Recurrent Neural Networks (RNNs), marked a pivotal shift by eliminating the need for manual feature extraction and achieving higher accuracy in predicting student performance. Following DKT, various deep learning models have been proposed to further refine KT. Notable among these are RNN-based models, such as EERNN [10], DKT-DSC [11], and KQN [12], as well as memory network-based models like Deep-IRT [13] and DKVMN [14]. The integration of attention mechanisms has also been explored, leading to models like AKT [6], ATKT [15], and RKT [16], which offer improved performance by focusing on relevant parts of the input data. In addition, convolutional neural networks have been employed in KT models, such as CKT [17] and CRKT [18], to capture spatial dependencies within the data. Further enhancements in KT have been achieved by incorporating exercise text models (EKT) [19] and memory curves [20,21] to better reflect student learning behavior. Moreover, graph-based approaches have gained traction, with Nakagawa [7] introducing the Graph-based Knowledge Tracing (GKT) model, which represents relationships between KCs as a graph and frames the KT task as a time-series node classification problem within a Graph Neural Network (GNN). Yang [22] expanded on this concept with the Graph-based Interactive Knowledge Tracing (GIKT) model, which leverages the relationships between exercises and KCs for embedding learning. Additionally, the SFBKT model [23] was introduced, incorporating a synthetically forgetting behavior method that utilizes both individual and group forgetting factors to enhance predictions of student performance.
In programming KT, a BKT-based method for evaluating code quality and tracing knowledge states was introduced by Kasurinen [24], relying on manual rules and statistical analysis of code syntax. A programming KT model integrating code features with LSTM networks was developed by Wang [25], using student-submitted code as input. Piech’s [26] recurrent neural network model was also employed, where ASTs are decomposed into subtrees, and a non-parametric model is used for matrix representation of each subtree’s root node. Building on this, a token-based method was employed by Swamy [27], in which code tokens form a vocabulary, with vector elements representing the TF-IDF values of tokens. The code2vec model was used by Shi [28] with DKT to trace student progress, representing code as abstract syntax trees divided into code paths, guided by learned weights. KT models using logistic regression and RNN based on the AST edit distance were proposed by Jiang [29], quantifying the distance between a student’s intermediate solution and the optimal one. Lastly, student code was transformed into vector representations by Liang [30], and error classification was used as a concept indicator, adding a cognitive layer to DKT for inferring students’ conceptual skill levels.
Our proposed method builds upon these works, specifically addressing the limitations of existing programming KT models. Unlike prior models that often focus on single code features or specific exercises, our method integrates a knowledge graph to capture the relationships between diverse exercises and KCs. Additionally, the VAE is employed to model personalized learning trajectories, better reflecting students’ individualized learning progress and abilities. This approach combines a knowledge graph with personalized learning trajectory modeling; it not only improves the predictive accuracy of existing methods but also offers a more holistic understanding of students’ knowledge states across different exercises and KCs.

3. Method

In this section, a detailed introduction to our GPPKT model is provided. First, the formal definition of KT is presented. Then, the method for constructing the knowledge graph is described. Next, the use of the VAE to handle student answer sequences and the IRT for modeling students’ personalized learning abilities is introduced. Finally, these features are incorporated into the KT framework to enhance its generality in the programming field, thereby improving KT’s performance.

3.1. Presentation of the Problem

In online judge (OJ) systems, suppose there are $|E|$ programming exercises. The answer sequence for a particular student is defined as $s = \{(e_t, r_t) \mid t = 1, \ldots, n\}$, where $e_t \in E$ represents the $t$-th programming exercise attempted by the student, $r_t \in \{0, 1\}$ indicates the correctness of the answer, and $n$ represents the number of different exercises the student answered. Typically, $r_t = 1$ indicates that the student passed all test cases; otherwise $r_t = 0$. Because programming exercises involve students repeatedly attempting the same exercise, the student’s answers to exercise $e_t$ are detailed as $(e_t, r_t) = \{(e_t, r_t^i) \mid i = 1, \ldots, m\}$, where $m$ represents the number of consecutive attempts on exercise $e_t$, and $r_t^i \in \{0, 1\}$ indicates the answer of the $i$-th consecutive attempt on exercise $e_t$.

3.2. Overall Architecture

We propose a deep KT model based on a knowledge graph and students’ personalized answer sequences, namely GPPKT, to solve programming KT. The model architecture is shown in Figure 2 and mainly consists of four parts. The knowledge graph module is responsible for linking the various KCs to obtain the representation of the exercise $e_t$. In the learning ability module, IRT is used to obtain the student’s personalized learning ability value $\theta_t^i$. In the answer sequence module, the student’s answer sequence $r_t^{seq}$ on the same exercise is modeled by the VAE, and the resulting personalized answer state $z_t$ effectively reflects the student’s learning behavior. The LSTM module introduces an attention mechanism for adaptive aggregation of the current state $ks_t$, and a gating mechanism is proposed to balance the student’s historical and current knowledge states, where $ks_t^{out}$ denotes the student’s final knowledge state.

3.3. Knowledge Graph to Represent Exercise Vectors

3.3.1. Constructing Knowledge Graph

Many kinds of programming exercises exist where the predecessor and successor relationships between KCs are obvious, and these relationships can be reflected by the knowledge graph, as shown in the left part of Figure 2. By using graph embedding methods, embedding vector representations that contain rich semantic and topological information are obtained from the graph. A knowledge graph suitable for the field of programming exercises is constructed in three steps: knowledge acquisition, graph design, and storage.
  • Knowledge concept acquisition: We gather information about KCs from the OI-Wiki website. Specifically, hierarchical and relational information about programming concepts, such as “Dynamic Programming” and its associated subtopics, like “Depth First Search” and “Recurrence”, is extracted. To ensure the completeness and correctness of these concepts, the obtained KCs are categorized and organized with the help of information from CSDN and Wikipedia. Additionally, three authoritative textbooks, Data Structure Programming Practice, Introduction to Competitive Programming (2nd Edition), and Introduction to Algorithms, are used for validation. This process results in a knowledge graph with 11 topics and 204 KCs.
  • Knowledge graph design: Introductory KCs in the field of programming are first identified based on various knowledge sources. For instance, under the “Fundamentals” category, concepts such as “Divide and Conquer” and “Function” are categorized as basic KCs. Next, predecessor–successor relationships are established, linking more advanced KCs like “Dynamic Programming” to foundational concepts. These relationships are labeled with references to content from Data Structure Programming Practice and OI-Wiki. Further validation of the initial knowledge graph is carried out using Introduction to Competitive Programming (2nd Edition) and Introduction to Algorithms to minimize potential subjective biases. The graph is then optimized by retrieving additional hierarchical relationships for KCs from Wikipedia, ensuring a more comprehensive coverage.
  • Knowledge storage: After KC acquisition and knowledge graph design, we obtain the nodes and relationships required for the knowledge graph and finally use Neo4j for knowledge storage.
We extract a portion of the constructed knowledge graph, as shown in Figure 3; the knowledge graph represents different KCs and their interrelationships. For instance, under the category “Dynamic Programming”, concepts such as “Depth First Search” and “Recurrence” are connected, reflecting their related nature. Nodes with the same color are assigned to the same knowledge category, indicating their thematic connections. The lighter, dashed edges are used to illustrate connections between different categories, emphasizing the interconnectedness of programming concepts across various topics.

3.3.2. Knowledge Graph Embedding Model: RotatE-TA

To improve the representation of exercises in our model, we build on the RotatE model, which is a knowledge graph embedding method that uses relational rotation in complex vector space to model entities and relations [31]. RotatE effectively captures the relationships between entities by treating relations as rotations in the complex plane. Building on this foundation, we propose a type-aware extension called RotatE-TA. The knowledge graph structure is incorporated into our model, while type information of KCs is also accounted for to create more accurate embeddings.
  • Entity and Relation Definition: We define the predecessor KC as the head entity $c_i$ and the successor KC as the tail entity $c_j$, with the relation denoted as $r$. These two KCs, along with the relation, are represented as a triple $(c_i, r, c_j)$.
  • Scoring Function: We define a scoring function $d_r(c_i, c_j)$ for the triple, which evaluates the importance of a candidate triple.
  • Type Comparison: The knowledge topics for all KCs in the knowledge graph are defined as $T = \{t_1, t_2, \ldots, t_m\}$, where each KC $c_i$ is associated with a knowledge topic $c_i^t \in T$, and each KC belongs to only one knowledge topic. The comparison of two knowledge topic types is calculated as follows:
    $$\mathrm{type}(c_i, c_j) = \begin{cases} 1, & \text{if } c_i^t = c_j^t \\ 0, & \text{otherwise.} \end{cases}$$
  • Distance Scoring with Type-Awareness: Based on the type comparison function, we introduce type-aware weights into the distance scoring function as follows:
    $$d_r(c_i, c_j) = \lVert c_i \circ r - c_j \rVert + \lambda \, \lVert m_{c_i} - m_{c_j} \rVert \cdot \mathrm{type}(c_i, c_j).$$
    In Equation (2), $\circ$ represents the Hadamard product, $m_{c_i}$ and $m_{c_j}$ represent the modular lengths of the head entity and tail entity, respectively, and $\lambda$ is a hyperparameter representing the magnitude of the type-aware weight.
After embedding the knowledge graph, we obtain the vector representation of an exercise by simply adding the embedding vectors of its KCs, as shown in Equation (3):
$$e_t = \sum_{k=1}^{m} c_k,$$
where $m$ represents the number of KCs involved in the exercise, $c_k$ is the embedding vector of each KC, and $e_t$ is the final exercise vector, which captures the semantic relationships of the exercise in the knowledge graph.
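A minimal sketch of the type-aware scoring of Equations (1)–(2) and the exercise-vector construction of Equation (3). All embedding values, topic labels, and the $\lambda = 0.5$ default are illustrative assumptions; in the actual model these are learned in complex vector space.

```python
import cmath
import math

def rotate_ta_score(c_i, r_phase, c_j, topic_i, topic_j, lam=0.5):
    """Type-aware RotatE distance (Equation (2)): the rotation distance
    ||c_i o r - c_j|| plus a modulus penalty weighted by lam whenever the
    two KCs share the same knowledge topic (Equation (1))."""
    r = [cmath.exp(1j * p) for p in r_phase]          # unit-modulus rotations
    base = math.sqrt(sum(abs(ci * ri - cj) ** 2
                         for ci, ri, cj in zip(c_i, r, c_j)))
    same_type = 1.0 if topic_i == topic_j else 0.0    # type(c_i, c_j)
    mod_term = math.sqrt(sum((abs(ci) - abs(cj)) ** 2
                             for ci, cj in zip(c_i, c_j)))
    return base + lam * mod_term * same_type

def exercise_vector(kc_embeddings):
    """Exercise representation e_t as the element-wise sum of its KC
    embedding vectors (Equation (3))."""
    return [sum(col) for col in zip(*kc_embeddings)]
```

A lower score indicates a more plausible $(c_i, r, c_j)$ triple; note that the type term only activates for KC pairs within the same topic.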

3.4. Modeling Answer Sequences with VAE

In addressing the issue of lacking personalized knowledge states during the answer sequence in the programming field, we propose a personalized modeling method based on the VAE [8], as shown in the middle right part of Figure 2, inspired by the Exercise-Enhanced Recurrent Neural Network (EERNN) [10]. VAE is a generative model that combines the concepts of an autoencoder [32] with variational inference to learn latent representations of data.

3.4.1. Input Representation

In this study, we represent a student’s multiple answers to a specific exercise as a sequence denoted by $r_t^{seq} = \{r_t^1, r_t^2, \ldots, r_t^n\}$, where $r_t^i$ represents the $i$-th answer. Since the number of answers a student provides for a particular exercise is uncertain, $r_t^{seq}$ is adjusted to a fixed length $N$ by padding it with $r_t$.
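This length normalization can be sketched as follows (a minimal illustration; here the padding value $r_t$ is taken to be the last recorded answer of the exercise):

```python
def pad_answer_sequence(r_seq, N):
    """Adjust r_t^seq to a fixed length N: truncate long sequences and
    pad short ones by repeating the exercise's final answer r_t."""
    if len(r_seq) >= N:
        return r_seq[:N]
    return r_seq + [r_seq[-1]] * (N - len(r_seq))
```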

3.4.2. Model Architecture

The model’s architecture is depicted in Figure 4. The student’s answer sequence to the same exercise is used as the VAE input, with hidden layer information z t being learned and extracted as the personalized response state, thereby enabling the modeling and tracing of the student’s knowledge state. The encoder–decoder structure of the VAE is defined by the following equations:
$$z_t = \mathrm{Encoder}(r_t^{seq}), \qquad y = \mathrm{Decoder}(z_t).$$

3.4.3. Objective Function

The objective function L t o t a l of the VAE consists of two components: the reconstruction loss L r e c o n and the KL divergence loss L K L . These components are calculated as follows:
  • Reconstruction Loss: This loss measures the difference between the original input r t s e q and the output of the decoder y. It is defined as:
    $$L_{recon} = -\sum_{i=1}^{N} \left[ r_{t,i}^{seq} \log(y_i) + (1 - r_{t,i}^{seq}) \log(1 - y_i) \right],$$
    where $N$ represents the dimension of the student’s answer sequence, and $r_{t,i}^{seq}$ and $y_i$ represent the $i$-th elements of the input sample $r_t^{seq}$ and the decoder output $y$, respectively.
  • KL Divergence Loss: This loss encourages the learned latent variable distribution to be close to a standard normal distribution. It is defined as:
    $$L_{KL} = -\frac{1}{2} \sum_{j=1}^{D} \left( 1 + \log(\sigma_j^2) - \mu_j^2 - \sigma_j^2 \right),$$
    where $D$ represents the dimension of the latent variable $z_t$, and $\mu_j$ and $\sigma_j$ represent the mean and standard deviation of the $j$-th dimension of $z_t$, respectively.
The final objective function is a combination of these two losses:
$$L_{total} = L_{recon} + L_{KL}.$$
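The two loss terms can be computed directly from the encoder outputs. A minimal sketch, assuming a Bernoulli decoder and a diagonal Gaussian posterior as the equations above imply:

```python
import math

def vae_losses(r_seq, y, mu, sigma):
    """VAE objective of Equations (5)-(7): binary cross-entropy
    reconstruction loss plus KL divergence to a standard normal."""
    recon = -sum(r * math.log(p) + (1 - r) * math.log(1 - p)
                 for r, p in zip(r_seq, y))
    kl = -0.5 * sum(1 + math.log(s ** 2) - m ** 2 - s ** 2
                    for m, s in zip(mu, sigma))
    return recon + kl, recon, kl
```

When the posterior matches the prior ($\mu = 0$, $\sigma = 1$), the KL term vanishes and only the reconstruction loss remains.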

3.5. Students’ Personalized Learning Abilities

3.5.1. First-Order IRT for Ability Assessment

Different students have varying levels of mastery over the same exercise, and a single student may have different levels of mastery over different exercises. As shown in the middle left part of Figure 2, this study aims to assess and represent students’ personalized learning abilities using first-order IRT, thereby reflecting students’ different abilities for each exercise. The equation for first-order IRT is as follows:
$$p_i(\theta) = \frac{1}{1 + e^{-d(\theta - b_i)}},$$
where $p_i(\theta)$ represents the probability of a student answering a particular exercise correctly, $\theta$ represents the student’s personalized learning ability, $b_i$ represents the difficulty parameter of the exercise, and $d$ is a constant set to 1.702 to scale the logistic function to approximate a normal ogive curve.

3.5.2. Inverting Ability Parameters

We utilize students’ accuracy in answering exercises as an indicator of their personalized learning abilities, defined as the ratio of correct answers to the total number of attempts. Additionally, our datasets in the programming field include information about the difficulty of each exercise. Based on this information, the original IRT equation is used to invert the individual ability parameter θ , achieving an accurate measurement of student learning ability:
$$\theta_t^i = b_i + \frac{1}{d} \ln \frac{p_i(\theta)}{1 - p_i(\theta)}.$$
In this study, we obtain the difficulty parameter $b_i$ of the exercise and the student’s answer accuracy $p_i(\theta)$. The resulting $\theta_t^i$ represents the student’s personalized ability parameter.
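The forward model of Equation (8) and its inversion in Equation (9) amount to a two-line computation; a small sketch (the example values in the check below are illustrative):

```python
import math

D_CONST = 1.702  # scaling constant d from Equation (8)

def irt_probability(theta, b):
    """First-order IRT (Equation (8)): probability of a correct answer
    given ability theta and exercise difficulty b."""
    return 1.0 / (1.0 + math.exp(-D_CONST * (theta - b)))

def invert_ability(p, b):
    """Invert Equation (8) to recover theta from observed accuracy p
    and difficulty b (Equation (9))."""
    return b + math.log(p / (1.0 - p)) / D_CONST
```

Substituting the inverted ability back into the forward model recovers the observed accuracy, which is a quick sanity check on the two equations.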

3.6. LSTM Framework

3.6.1. Modeling Knowledge States with LSTM

Due to the unique mechanism in the field of programming, we should consider not only the exercise information and students’ results, but also the previously mentioned personalized response state $z_t$ and the student’s learning ability $\theta_t^i$. In this study, a student’s personalized learning state is composed of the features mentioned above. Subsequently, the interaction projection technique [15] is applied to differentiate whether a student answers an exercise correctly, and $r_t$ is extended into a $K$-dimensional all-0 or all-1 vector $\hat{r}_t$. The calculation method for a student’s personalized learning state is as follows:
$$x_t = \begin{cases} [\,e_t, z_t, \hat{r}_t, \theta_t^i\,] & \text{if } r_t = 1, \\ [\,\hat{r}_t, e_t, z_t, \theta_t^i\,] & \text{if } r_t = 0. \end{cases}$$
The above equation represents the concatenation of four features, where $x_t$ signifies the student’s personalized learning state. From this, we obtain the sequence of a student’s personalized knowledge states, $X = \{x_1, x_2, x_3, \ldots, x_t\}$.
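With vectors represented as plain lists, the interaction projection of Equation (10) reduces to an order-dependent concatenation (a minimal sketch; the feature values below are placeholders):

```python
def learning_state(e_t, z_t, r_hat, theta, r_t):
    """Interaction projection (Equation (10)): the concatenation order of
    the four features encodes whether the answer was correct (r_t = 1)
    or incorrect (r_t = 0)."""
    if r_t == 1:
        return e_t + z_t + r_hat + theta
    return r_hat + e_t + z_t + theta
```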
Next, an LSTM is used to model the student’s knowledge states, as shown in the upper part of Figure 2. The student’s current knowledge state $ks_t$ depends only on the previous time step’s knowledge state $ks_{t-1}$ and the current learning state $x_t$. Therefore, the LSTM computation can be simplified as follows:
$$ks_t = \mathrm{LSTM}(ks_{t-1}, x_t).$$

3.6.2. Attention Mechanism

Although it is possible to directly use $ks_t$ to predict a student’s performance on the next exercise, programming exercises cover many knowledge topics, various KCs, and a wide range of difficulty levels. Therefore, recent exercises may not have an equal impact on a student’s performance. To address this issue, we introduce an attention mechanism to adaptively determine the importance coefficient $a_i$ of each exercise in the early knowledge state sequence, $KS^{seq} = \{ks_1, ks_2, \ldots, ks_{t-1}\}$. The computation process is as follows:
$$u_i = \tanh(W_w \cdot ks_i + b_w), \qquad a_i = \frac{\exp(u_i^{\top} u_w)}{\sum_j \exp(u_j^{\top} u_w)}.$$
In this process, $ks_i$ is first fed into a single-layer neural network, and a softmax operation is then applied to obtain the importance coefficient $a_i$, where $W_w$, $b_w$, and $u_w$ represent the weight matrix, bias, and weight vector, respectively.
Due to the differential impact of historical and current knowledge states on prediction, this study divides students’ knowledge states into two stages: historical and current. The historical knowledge state $ks^{his}$ is defined as the weighted sum of the early knowledge states $\{ks_1, ks_2, \ldots, ks_{t-1}\}$. The specific calculation equation is as follows:
$$ks^{his} = \sum_{j=1}^{t-1} a_j \, ks_j.$$
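Equations (12) and (13) amount to a one-layer additive attention followed by a weighted sum. A minimal plain-Python sketch (the toy weights in the check below are illustrative):

```python
import math

def attention_weights(ks_seq, W_w, b_w, u_w):
    """Score each early knowledge state with a single tanh layer, then
    softmax over the sequence (Equation (12))."""
    def score(ks):
        u = [math.tanh(sum(w * k for w, k in zip(row, ks)) + b)
             for row, b in zip(W_w, b_w)]
        return sum(ui * uw for ui, uw in zip(u, u_w))
    scores = [score(ks) for ks in ks_seq]
    m = max(scores)                       # stabilized softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def historical_state(ks_seq, a):
    """Weighted sum of early knowledge states (Equation (13))."""
    dim = len(ks_seq[0])
    return [sum(a[j] * ks_seq[j][d] for j in range(len(ks_seq)))
            for d in range(dim)]
```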

3.6.3. Gating Mechanism

In this study, $ks_t$ represents the student’s current level of knowledge mastery, reflecting their present knowledge state. Meanwhile, the student’s historical knowledge state $ks^{his}$ is also considered a significant indicator of their knowledge mastery. Therefore, we introduce a learnable gating mechanism, denoted as $g$, to balance the importance of $ks_t$ and $ks^{his}$. The final knowledge state acquisition process is illustrated in Figure 5, and the detailed computation is as follows:
$$g = \sigma(W_t \cdot ks_t + W_{his} \cdot ks^{his} + b_g), \qquad ks_t^{out} = g \odot ks_t + (\mathbf{1}_D - g) \odot ks^{his},$$
where $g$ represents the learnable gate, $W_t$ and $W_{his}$ denote the weight matrices, and $b_g$ represents the bias term. $\odot$ signifies element-wise multiplication, $\mathbf{1}_D$ represents a $D$-dimensional all-ones vector, and $ks_t^{out}$ represents the student’s final knowledge state.
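Equation (14) is a convex element-wise blend of the two states; a minimal sketch (the zero weights in the check below are illustrative placeholders, not learned values):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_state(ks_t, ks_his, W_t, W_his, b_g):
    """Learnable gate of Equation (14): g decides, per dimension, how much
    of the current state versus the historical state is kept."""
    g = [sigmoid(sum(wt * kt for wt, kt in zip(row_t, ks_t)) +
                 sum(wh * kh for wh, kh in zip(row_h, ks_his)) + b)
         for row_t, row_h, b in zip(W_t, W_his, b_g)]
    return [gi * kt + (1.0 - gi) * kh
            for gi, kt, kh in zip(g, ks_t, ks_his)]
```

Because each gate value lies in (0, 1), every dimension of $ks_t^{out}$ stays between the corresponding entries of $ks_t$ and $ks^{his}$.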

3.7. Prediction

Once the student’s final knowledge state is obtained, a single-layer neural network can be used to predict the student’s performance on the next exercise. Assuming the one-hot encoding of the next exercise is denoted as $\hat{e}_{t+1} \in \mathbb{R}^{1 \times K}$, the predicted probability for the next exercise is calculated as follows:
$$\hat{y}_{t+1} = \sigma(ks_t^{out} \cdot W_{out} + b_{out}) \cdot \hat{e}_{t+1}^{\top},$$
where $W_{out}$ represents the weight matrix, $b_{out}$ is the bias term, $\hat{y}_{t+1}$ signifies the predicted probability of the student answering exercise $t+1$ correctly, and $y_{t+1}$ is the student’s actual answer to exercise $t+1$. We use the cross-entropy between $y_{t+1}$ and $\hat{y}_{t+1}$ as the loss, calculated as follows:
$$L = -\sum_{t} \left( y_{t+1} \log \hat{y}_{t+1} + (1 - y_{t+1}) \log(1 - \hat{y}_{t+1}) \right).$$
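The prediction head and loss can be sketched in a few lines (here $W_{out}$ is passed as a list of per-exercise columns, and all numeric values in the check below are illustrative):

```python
import math

def predict_next(ks_out, W_out, b_out, e_next_onehot):
    """Equation (15): project the final knowledge state through a single
    sigmoid layer, then select the probability of the next exercise via
    its one-hot vector."""
    logits = [sum(k * w for k, w in zip(ks_out, col)) + b
              for col, b in zip(W_out, b_out)]
    probs = [1.0 / (1.0 + math.exp(-l)) for l in logits]
    return sum(p * e for p, e in zip(probs, e_next_onehot))

def bce_loss(y_true, y_pred):
    """Cross-entropy loss of Equation (16) over a prediction sequence."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_pred))
```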

4. Experiments

4.1. Dataset

4.1.1. Data Source

Traditional KT models often use open datasets, such as Statics [33] or ASSISTments [34]. However, there are significant differences between the programming domain and traditional fields, particularly because most programming exercises are subjective in nature. To ensure the effectiveness of the model proposed in this study, we utilized datasets from two widely used programming practice platforms, Luogu and Codeforces. These platforms offer a diverse range of exercise types and a comprehensive record of student responses, which align well with the unique requirements of programming education. This approach also allows the performance of our model to be evaluated in a more realistic and practical context.
Luogu (https://www.luogu.com.cn/, accessed on 20 December 2022), established by Chinese programming enthusiasts, is a robust platform dedicated to providing a refreshing and efficient programming experience for OIers/ACMers. With a current repository of over ten thousand exercises and several hundred thousand users, Luogu has gained popularity in the programming community. Codeforces (https://codeforces.com/, accessed on 27 December 2022) is a Russian website that serves as an online judging system and hosts regular programming contests for global programming enthusiasts. Its globally renowned weekly contests attract nearly 30,000 participants, and the platform boasts a collection of over 8000 exercises, earning recognition from top programming students both domestically and internationally.

4.1.2. Data Preprocess

Given that the datasets were sourced from programming practice platforms, the raw data included various types of information, such as knowledge concepts, exercises, and student answer records, all represented as discrete character-type data. To prepare the data for model training, we applied a series of preprocessing steps:
  • Filtering Invalid Submissions: We filtered out invalid submission records, such as those labeled “Time limit exceeded”, “Runtime error”, or “Unanswered”, to retain only valid and meaningful submission data. These invalid records were considered noise and could skew the analysis if included.
  • Removing Irrelevant Users: We excluded users with an insufficient number of answer sequences, as these would not provide enough data to model effectively. Additionally, non-student users, including administrators and virtual users, were removed to ensure that the dataset reflected genuine student learning behavior.
  • Normalizing Sequence Lengths: Since the number of exercises answered by each student varied significantly, directly modeling answer sequences of different lengths would be challenging. To address this, we standardized the sequence length to a fixed value, ensuring that all input sequences were of uniform length for the model. This involved truncating longer sequences and padding shorter ones to the specified length.
The resulting preprocessed subset of the datasets is summarized in Table 1.
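The three preprocessing steps above can be sketched as follows. The verdict strings, record layout, and thresholds here are illustrative assumptions, not the exact values used for the Luogu and Codeforces data:

```python
def preprocess(records, min_len=3, max_len=200):
    """Sketch of the preprocessing pipeline: keep only valid verdicts,
    drop users with too few answers, and normalize sequence lengths.
    Each record is assumed to be a (user, exercise, verdict) tuple."""
    valid = {"Accepted": 1, "Wrong answer": 0}   # drop TLE / RE / unanswered
    seqs = {}
    for user, exercise, verdict in records:
        if verdict in valid:                      # step 1: filter submissions
            seqs.setdefault(user, []).append((exercise, valid[verdict]))
    out = {}
    for user, seq in seqs.items():
        if len(seq) < min_len:                    # step 2: remove sparse users
            continue
        seq = seq[:max_len]                       # step 3a: truncate long ones
        seq = seq + [(None, 0)] * (max_len - len(seq))  # step 3b: pad short ones
        out[user] = seq
    return out
```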

4.2. Environment

The success of a model analysis experiment heavily relies on having the right hardware and software environment available. Table 2 presents an overview of the key hardware and software environment configurations utilized in this study.

4.3. Evaluation Metrics and Baseline Methods

4.3.1. Evaluation Metrics

To validate the GPPKT model proposed in this research, we choose the area under the curve (AUC) and accuracy (ACC) as evaluation metrics. AUC offers a more precise evaluation of the model’s ability to differentiate positive and negative samples and is independent of threshold values. ACC is a fundamental performance metric in classification models. It quantifies the proportion of correctly classified samples relative to the total number of samples. In the context of KT, ACC is used to measure the accuracy of a model in predicting student learning progress.
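Both metrics are straightforward to compute from predicted probabilities; a minimal sketch (AUC via its rank-statistic formulation, with ties counted as half):

```python
def auc_score(y_true, y_score):
    """AUC as the probability that a randomly chosen positive sample is
    ranked above a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def acc_score(y_true, y_score, threshold=0.5):
    """ACC: fraction of predictions on the correct side of the threshold."""
    correct = sum(1 for y, s in zip(y_true, y_score)
                  if (s >= threshold) == (y == 1))
    return correct / len(y_true)
```

This rank-based view is why AUC, unlike ACC, is independent of any particular threshold.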

4.3.2. Baseline Methods

To thoroughly verify the effectiveness of the GPPKT model, several classical and state-of-the-art KT models were selected as baseline methods. These baselines were chosen as they represent the evolution of knowledge tracing models over time, from the initial models that incorporated deep learning to their improved versions and the latest advances in attention mechanisms and interpretability. The baseline methods include:
  • DKT [1]: A pioneering knowledge tracing method using recurrent neural networks.
  • DKT+ [2]: An improved DKT model addressing input reconstruction and state fluctuations.
  • DKVMN [3]: A memory network-based model with interpretable knowledge and student state representation.
  • Deep-IRT [13]: Combines IRT with DKVMN for an interpretable deep knowledge tracing model.
  • AKT [6]: Introduces the Rasch model and self-attention for better exercise and interaction modeling.
  • ATKT [15]: Adds adversarial perturbations to LSTM-based sequences to reduce overfitting.
  • IEKT [5]: Combines individual cognition and learning processes for accurate knowledge state tracing.
  • SAKT [4]: Uses self-attention mechanisms in Transformer to handle sparse data in knowledge tracing.
  • SAINT [35]: Extends SAKT with additional attention modules for deeper exercise–answer relationship modeling.

4.4. Model Training and Parameter Selection

We randomly split the dataset, using 80% for training and 20% for testing. All model parameters are optimized with the Adam optimizer, using β1 = 0.9, β2 = 0.999, ε = 1 × 10⁻⁸, and a learning rate of 0.001. Following common practice in KT, the sequence length is set to l = 200, based on typical student answer sequence lengths and the average number of answers per student. Each model is trained for 30 epochs.
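For reference, a single Adam update with these hyperparameters can be sketched for one scalar parameter. This is an illustrative re-derivation of the standard Adam rule, not the framework code used in the experiments:

```python
def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # exponential moving averages of the gradient and its square
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    # bias-corrected estimates (t is the 1-based step count)
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + eps)
    return param, m, v
```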

4.5. Experiment Results

4.5.1. Comparative Experiment Analysis

After multiple experiments, the final performance of all models on the two datasets is shown in Table 3, with the best results highlighted in bold. The baseline KT models are the implementations provided by the pykt [36] library. Table 3 compares the models on AUC and ACC, the two metrics introduced in Section 4.3.1, which together capture how well each model predicts students' future performance. Overall, the proposed GPPKT model performs well on both datasets once the knowledge graph and the student answer sequences are fully considered.
On the Luogu dataset, GPPKT improves AUC by 7.75% on average over the other methods, a relative improvement of approximately 5.49% over the second-best method, ATKT. Its ACC of 0.8472 is 1.51% above the average of the other methods. On the Codeforces dataset, GPPKT improves AUC by 10.30% on average, about 3.01% over the second-best method, ATKT, and its ACC of 0.8799 is 2.53% above the average of the other methods.
Additionally, DKT and DKT+, which were among the first deep knowledge tracing models, perform similarly. Programming exercises primarily consist of subjective questions, and students' understanding of KCs is relatively stable, so the predicted state fluctuates less. DKT+ was developed specifically to reduce fluctuations in the predicted state; although it helps prevent overfitting, it does not account for the inherent relationships between KCs.
Furthermore, Deep-IRT and DKVMN perform similarly, with AUC values of 0.8020 and 0.8058 on the Luogu dataset and 0.6702 and 0.6828 on the Codeforces dataset, respectively. DKVMN dynamically updates the mastery levels of relevant KCs and performs well on objective exercises. However, in programming KT, where subjective exercises prevail and students often answer continuously, the effect of KC forgetting is small. Deep-IRT improves on DKVMN by also considering students' learning abilities, yet its results indirectly reveal the limitations of traditional KT methods for the personalized answers typical of subjective exercises. SAKT and SAINT are both Transformer-based and achieve similar AUC values, but neither considers the relationships between exercises or students' learning abilities.
A clear gap also separates IEKT from GPPKT: although IEKT integrates individual cognition and learning processes, it does not consider the relationships between KCs. Aside from GPPKT, the AKT and ATKT models perform best; both incorporate attention mechanisms, which indirectly highlights the importance of the attention mechanism introduced in this study for accurately modeling student answer sequences.
Several key factors explain the superior performance of GPPKT. First, GPPKT integrates knowledge graph structures into the modeling process, capturing the intricate relationships between exercises and KCs; this improves the accuracy of tracing students' knowledge states, particularly in domains with complex, interconnected knowledge. Moreover, the attention mechanism in GPPKT lets the model focus on the relevant parts of the answer sequences, yielding more accurate predictions. Together, these factors account for the significant improvements observed in both AUC and ACC.

4.5.2. Ablation Experiment Analysis

To verify the contribution of each component of the proposed model to answer prediction under otherwise identical conditions, we conduct an ablation analysis on the Luogu dataset to confirm the rationality of GPPKT. Table 4 presents the results for each combination of components, where VAE indicates that student answer sequences processed by a Variational Autoencoder are used, KG indicates that the knowledge graph is used, and ABI indicates that students' learning abilities (Ability Index) are used. Analyzing these AUC and ACC values reveals the significance of each component in enhancing the model's capability.
Through the ablation experiments in Table 4, we draw the following conclusions: Comparing methods (2) and (5), it is observed that the consideration of students’ personalized learning abilities has a positive impact on the model’s prediction because different students have varying levels of mastery over different KCs. When methods (3) and (6) are compared, it is found that the incorporation of a VAE to process students’ answer sequences has a positive impact on the model’s predictive capabilities. This is because each student’s understanding of the exercises gradually improves during the learning process. Finally, when methods (4) and (7) are compared, it is seen that the introduction of a knowledge graph also benefits the model’s predictions because KCs are interconnected, and the use of a knowledge graph allows exercise information to be better considered.
In summary, the inclusion of a knowledge graph, the use of a VAE to represent answer sequences, and the consideration of students’ personalized learning abilities all have a positive impact on the model’s performance.

4.6. Visualization

4.6.1. Knowledge Graph Visualization

We extract the changes in a student's knowledge state during the process of answering an exercise, as shown in Figure 6, which reflects how the student's mastery of related KCs changes when answering exercises on the "Tree" KC.
In the heatmap, the color intensity represents the depth of the student’s mastery: lighter colors indicate lower mastery levels, while darker colors signify a deeper understanding. As the student progresses through time steps 0 to 7, the heatmap block corresponding to the KC “Tree” gradually deepens in color. This deepening suggests that the student’s understanding of the “Tree” concept is improving steadily. Simultaneously, the heatmap blocks for related KCs such as “Tree Structures”, “Huffman Tree”, “Minimum Spanning Tree”, and “Applications of Trees” also show deepening colors. This indicates that the student’s mastery of these interconnected concepts is strengthening alongside their understanding of “Tree”.
Notably, for the more complex concept “Huffman Tree”, the heatmap shows a slower change in color intensity compared to “Tree”. This slower deepening reflects the greater difficulty the student experiences in mastering “Huffman Tree”, likely due to the complexity of its construction algorithm.
As the student continues answering exercises during time steps 8 to 15, 16 to 23, and 24 to 31, the heatmap reveals further changes in their understanding of various KCs. During time steps 16 to 23, the heatmap block for “Huffman Tree” initially deepens but then lightens, indicating that the student’s grasp of this KC is unstable and errors might occur in subsequent answers. This pattern is mirrored in the heatmap blocks for other related KCs, which also show a cycle of deepening and lightening colors. These fluctuations in color illustrate the dynamic nature of the student’s knowledge state, which shifts in response to their answering performance.
Overall, the evolution of colors in the heatmap provides a visual representation of how the student’s mastery levels for specific KCs, as well as related concepts, change over time. The comprehensive nature of these changes underscores the effectiveness of incorporating a knowledge graph in tracing and visualizing the student’s learning process.

4.6.2. Student Learning Abilities Visualization

We visualize the learning abilities of three students across six exercises in Figure 7a. In this heatmap, darker colors represent higher learning ability on the respective exercise, while lighter colors indicate lower learning ability. Among the three students selected, we observe different overall levels of learning ability: excellent, moderate, and lower.
Students with excellent learning abilities demonstrate consistently high ability values on almost all exercises, mostly around 0.9. Students with moderate learning abilities exhibit moderate values on some exercises, some around 0.7 and one exercise at 0.27. Students with lower learning abilities generally display relatively low values on most exercises, with half around 0.27 and a smaller portion around 0.7.
To further analyze the learning process, we examine how the knowledge states of these students evolve over time on two different exercises, as shown in Figure 7b. Here, the heatmap traces changes in their mastery of KCs. The darkening of heatmap blocks over time indicates an increase in the student’s mastery level.
For students with higher learning abilities, the color intensifies quickly, indicating significant and rapid progress in mastering the relevant KCs. Students with moderate learning abilities also experience a darkening of color, though this process is more gradual. Despite the slower progression, these students still achieve noticeable improvements in their knowledge states within a given period. In contrast, students with lower learning abilities show only slight darkening of the heatmap blocks, reflecting limited progress. Even though their knowledge states improve somewhat during the learning process, the final values, around 0.31 and 0.29, suggest they are still far from fully mastering the relevant KCs.
From another perspective, students with higher learning ability need a shorter time to reach a relatively higher knowledge state, while students with lower learning ability need more time to reach a similar level. These findings further highlight the close relationship between learning ability and knowledge state.
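The paper derives these learning abilities through IRT. As an assumption-laden sketch, a one-parameter (Rasch) IRT model relates an ability estimate and an exercise difficulty to the probability of a correct answer; the actual IRT variant used in GPPKT may differ:

```python
import math

def irt_correct_prob(ability, difficulty):
    # Rasch (1PL) model: P(correct) = sigmoid(ability - difficulty);
    # higher ability or lower difficulty raises the success probability
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))
```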

4.6.3. Personalized Answer Sequence Visualization

We extract a partial answer sequence from a student, as shown in Figure 8a, where each row represents the student’s responses to different programming exercises. In this figure, lighter colors represent incorrect answers (coded as 0), while darker colors represent correct answers (coded as 1). It is evident that the second and eighth exercises were correctly answered by the student on their first attempt. However, for the third, fifth, and seventh exercises, the student consistently answered incorrectly. There are also cases where the student initially answered incorrectly but eventually arrived at the correct answer after multiple attempts. For example, in the sixth exercise, there was one incorrect attempt before the correct answer was given. The fourth exercise required five attempts to arrive at the correct answer, while the first exercise took 15 attempts before a correct answer was achieved.
The left half of Figure 8b shows the latent variables obtained by passing the student's answer sequence through the VAE's encoder; these form part of the model's input and provide a personalized, refined representation of the student's continuous responses to each programming exercise. A higher value indicates faster mastery of the problem-solving process for that exercise. We also visualize the hidden-layer output after the VAE's decoder, shown in the right half of Figure 8b, where the VAE can be observed to reconstruct the latent variables effectively. For exercises the student initially answered incorrectly and then correctly after multiple attempts, the gradient of the values reflects the student's gradually improving mastery over the course of the answer sequence, and differing numbers of attempts are reflected distinctly in the sequence.
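The sampling step at the heart of the VAE encoding can be sketched as follows. This is a minimal one-dimensional illustration of the standard reparameterization trick, not the model's actual network:

```python
import math
import random

def reparameterize(mu, logvar, rng=random):
    # reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    # keeping the sampled latent differentiable w.r.t. the encoder outputs
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * logvar) * eps
```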
This detailed visualization and analysis underscore the effectiveness of the VAE in capturing and representing the complexity of students’ learning processes, providing valuable insights into their knowledge states.

5. Conclusions

In this paper, GPPKT is proposed to address two significant challenges in programming knowledge tracing: the inaccurate representation of exercises and the neglect of student answer sequence information. A programming knowledge graph was constructed based on authoritative learning resources, and the RotatE-TA knowledge graph embedding method was introduced to vectorize exercise representations. The VAE was employed to model students’ answer sequences on the same exercise, effectively capturing students’ learning behavior. Additionally, students’ learning ability differences were obtained through the IRT to enhance the information provided by students’ answer sequences. LSTM and an attention mechanism were used to adaptively aggregate hidden states, while a gating mechanism was introduced to balance the student’s historical and current knowledge state for performance prediction. Extensive experiments were conducted on two real-world programming datasets to evaluate GPPKT. The results demonstrate that GPPKT outperforms state-of-the-art methods in the field of programming knowledge tracing, with an average improvement of 9.03% in AUC and 2.02% in ACC across both datasets.
The GPPKT model is expected to stimulate the development of KT research within the programming field. Key contributions of this work include the integration of knowledge graphs to enhance exercise representation, the application of VAE for modeling student behavior, and the utilization of IRT to differentiate student learning abilities. These innovations have been shown to significantly improve the accuracy and effectiveness of programming knowledge tracing.
In future work, we will integrate recommendation algorithms to suggest exercises and learning materials to students more accurately, aiming to achieve intelligent and adaptive programming learning. This has important implications for personalized learning and could enhance the effectiveness of programming education. Additionally, we plan to incorporate features such as code submissions and other relevant data into our model to further refine the understanding of student learning processes and outcomes. These advancements in programming knowledge tracing have the potential to transform educational practices by providing educators with deeper insights into student learning trajectories and enabling more targeted interventions. Our work not only contributes to the academic field but also offers practical applications for improving the quality of programming education.

Author Contributions

Conceptualization, J.P. and X.C.; methodology, Z.D. and X.C.; software, Z.D.; validation, J.P. and Z.D.; formal analysis, L.Y. and X.C.; investigation, L.Y.; resources, J.P. and L.Y.; data curation, L.Y.; writing—original draft preparation, Z.D.; writing—review and editing, Z.D. and X.C.; visualization, Z.D. and X.C.; supervision, J.P.; project administration, J.P.; funding acquisition, J.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 61802258, in part by the Natural Science Foundation of Shanghai under Grant 20ZR1455600, and the National Key Research and Development Program of China under Grant 2022YFB4501704.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors would like to express sincere appreciation for the support provided by the Shanghai Engineering Research Center of Intelligent Education and Bigdata as well as the Research Base of Online Education for Shanghai Middle and Primary Schools. The authors also acknowledge the contributions of Hao Wang from Shanghai Newtouch Software Co., Ltd., Shanghai, China, for his support and assistance in this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Piech, C.; Bassen, J.; Huang, J.; Ganguli, S.; Sahami, M.; Guibas, L.J.; Sohl-Dickstein, J. Deep knowledge tracing. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; Volume 28. [Google Scholar]
  2. Yeung, C.K.; Yeung, D.Y. Addressing two problems in deep knowledge tracing via prediction-consistent regularization. In Proceedings of the Fifth Annual ACM Conference on Learning at Scale, London, UK, 26–28 June 2018; pp. 1–10. [Google Scholar]
  3. Zhang, J.; Shi, X.; King, I.; Yeung, D.Y. Dynamic key-value memory networks for knowledge tracing. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 765–774. [Google Scholar]
  4. Pandey, S.; Karypis, G. A self-attentive model for knowledge tracing. arXiv 2019, arXiv:1907.06837. [Google Scholar]
  5. Long, T.; Liu, Y.; Shen, J.; Zhang, W.; Yu, Y. Tracing knowledge state with individual cognition and acquisition estimation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 11–15 July 2021; pp. 173–182. [Google Scholar]
  6. Ghosh, A.; Heffernan, N.; Lan, A.S. Context-aware attentive knowledge tracing. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 23–27 August 2020; pp. 2330–2339. [Google Scholar]
  7. Nakagawa, H.; Iwasawa, Y.; Matsuo, Y. Graph-based knowledge tracing: Modeling student proficiency using graph neural network. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, Thessaloniki, Greece, 14–17 October 2019; pp. 156–163. [Google Scholar]
  8. Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar]
  9. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  10. Su, Y.; Liu, Q.; Liu, Q.; Huang, Z.; Yin, Y.; Chen, E.; Ding, C.; Wei, S.; Hu, G. Exercise-enhanced sequential modeling for student performance prediction. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  11. Minn, S.; Yu, Y.; Desmarais, M.C.; Zhu, F.; Vie, J.J. Deep knowledge tracing and dynamic student classification for knowledge tracing. In Proceedings of the 2018 IEEE International conference on data mining (ICDM), Singapore, 17–20 November 2018; pp. 1182–1187. [Google Scholar]
  12. Lee, J.; Yeung, D.Y. Knowledge query network for knowledge tracing: How knowledge interacts with skills. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge, Tempe, AZ, USA, 4–8 March 2019; pp. 491–500. [Google Scholar]
  13. Yeung, C.K. Deep-IRT: Make deep learning based knowledge tracing explainable using item response theory. arXiv 2019, arXiv:1904.11738. [Google Scholar]
  14. Abdelrahman, G.; Wang, Q. Knowledge tracing with sequential key-value memory networks. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 175–184. [Google Scholar]
  15. Guo, X.; Huang, Z.; Gao, J.; Shang, M.; Shu, M.; Sun, J. Enhancing knowledge tracing via adversarial training. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 20–24 October 2021; pp. 367–375. [Google Scholar]
  16. Pandey, S.; Srivastava, J. RKT: Relation-aware self-attention for knowledge tracing. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual, 19–23 October 2020; pp. 1205–1214. [Google Scholar]
  17. Shen, S.; Liu, Q.; Chen, E.; Wu, H.; Huang, Z.; Zhao, W.; Su, Y.; Ma, H.; Wang, S. Convolutional knowledge tracing: Modeling individualization in student learning process. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 1857–1860. [Google Scholar]
  18. Wang, W.; Liu, T.; Chang, L.; Gu, T.; Zhao, X. Convolutional recurrent neural networks for knowledge tracing. In Proceedings of the 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chongqing, China, 29–30 October 2020; pp. 287–290. [Google Scholar]
  19. Liu, Q.; Huang, Z.; Yin, Y.; Chen, E.; Xiong, H.; Su, Y.; Hu, G. Ekt: Exercise-aware knowledge tracing for student performance prediction. IEEE Trans. Knowl. Data Eng. 2019, 33, 100–115. [Google Scholar] [CrossRef]
  20. Ebbinghaus, H. Memory: A contribution to experimental psychology. Ann. Neurosci. 2013, 20, 155–156. [Google Scholar] [CrossRef] [PubMed]
  21. Nagatani, K.; Zhang, Q.; Sato, M.; Chen, Y.Y.; Chen, F.; Ohkuma, T. Augmenting knowledge tracing by considering forgetting behavior. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 3101–3107. [Google Scholar]
  22. Yang, Y.; Shen, J.; Qu, Y.; Liu, Y.; Wang, K.; Zhu, Y.; Zhang, W.; Yu, Y. GIKT: A graph-based interaction model for knowledge tracing. In Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020, Ghent, Belgium, 14–18 September 2020; Proceedings, Part I. Springer: Berlin/Heidelberg, Germany, 2021; pp. 299–315. [Google Scholar]
  23. Song, Q.; Luo, W. SFBKT: A Synthetically Forgetting Behavior Method for Knowledge Tracing. Appl. Sci. 2023, 13, 7704. [Google Scholar] [CrossRef]
  24. Kasurinen, J.; Nikula, U. Estimating programming knowledge with Bayesian knowledge tracing. ACM SIGCSE Bull. 2009, 41, 313–317. [Google Scholar] [CrossRef]
  25. Wang, L.; Sy, A.; Liu, L.; Piech, C. Learning to Represent Student Knowledge on Programming Exercises Using Deep Learning. In Proceedings of the International Educational Data Mining Society, Paper Presented at the International Conference on Educational Data Mining (EDM), Wuhan, China, 25–28 June 2017. [Google Scholar]
  26. Piech, C.; Huang, J.; Nguyen, A.; Phulsuksombati, M.; Sahami, M.; Guibas, L. Learning program embeddings to propagate feedback on student code. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 6–11 July 2015; pp. 1093–1102. [Google Scholar]
  27. Swamy, V.; Guo, A.; Lau, S.; Wu, W.; Wu, M.; Pardos, Z.; Culler, D. Deep knowledge tracing for free-form student code progression. In Proceedings of the Artificial Intelligence in Education: 19th International Conference, AIED 2018, London, UK, 27–30 June 2018; Proceedings, Part II 19. Springer: Berlin/Heidelberg, Germany, 2018; pp. 348–352. [Google Scholar]
  28. Shi, Y.; Chi, M.; Barnes, T.; Price, T. Code-dkt: A code-based knowledge tracing model for programming tasks. arXiv 2022, arXiv:2206.03545. [Google Scholar]
  29. Jiang, B.; Wu, S.; Yin, C.; Zhang, H. Knowledge tracing within single programming practice using problem-solving process data. IEEE Trans. Learn. Technol. 2020, 13, 822–832. [Google Scholar] [CrossRef]
  30. Liang, Y.; Peng, T.; Pu, Y.; Wu, W. HELP-DKT: An interpretable cognitive model of how students learn programming based on deep knowledge tracing. Sci. Rep. 2022, 12, 4012. [Google Scholar] [CrossRef] [PubMed]
  31. Sun, Z.; Deng, Z.H.; Nie, J.Y.; Tang, J. Rotate: Knowledge graph embedding by relational rotation in complex space. arXiv 2019, arXiv:1902.10197. [Google Scholar]
  32. Le Cun, Y.; Fogelman-Soulié, F. Modèles connexionnistes de l’apprentissage. Intellectica 1987, 2, 114–143. [Google Scholar] [CrossRef]
  33. Koedinger, K.R.; Baker, R.S.; Cunningham, K.; Skogsholm, A.; Leber, B.; Stamper, J. A data repository for the EDM community: The PSLC DataShop. Handb. Educ. Data Min. 2010, 43, 43–56. [Google Scholar]
  34. Feng, M.; Heffernan, N.; Koedinger, K. Addressing the assessment challenge with an online system that tutors as it assesses. User Model. User-Adapt. Interact. 2009, 19, 243–266. [Google Scholar] [CrossRef]
  35. Choi, Y.; Lee, Y.; Cho, J.; Baek, J.; Kim, B.; Cha, Y.; Shin, D.; Bae, C.; Heo, J. Towards an appropriate query, key, and value computation for knowledge tracing. In Proceedings of the Seventh ACM Conference on Learning@ Scale, Virtual, 12–14 August 2020; pp. 341–344. [Google Scholar]
  36. Liu, Z.; Liu, Q.; Chen, J.; Huang, S.; Tang, J.; Luo, W. pyKT: A python library to benchmark deep learning based knowledge tracing models. Adv. Neural Inf. Process. Syst. 2022, 35, 18542–18555. [Google Scholar]
Figure 1. The answer sequences of three students. Different colors represent different results, where green, yellow, and red indicate correct, time-out, and incorrect results, respectively.
Figure 2. The overall architecture of the GPPKT model, which consists of four functional modules: Knowledge Graph, Learning Ability, Answer Sequence, and LSTM Framework.
Figure 3. A section of the constructed knowledge graph, illustrating the relationships between various KCs.
Figure 4. Personalized learning state modeling based on VAE.
Figure 5. Acquisition of the final knowledge state.
Figure 6. Student mastery of relevant KCs when answering an exercise. On the left are five associated KCs, and students’ mastery of related KCs changes when answering the exercise corresponding to one of the concepts.
Figure 7. (a) Ability scores of three students on different exercises, (b) The knowledge state changes of three students with different abilities on two exercises.
Figure 8. (a) Partial answer sequence of a student, (b) Answer sequence after VAE processing.
Table 1. Partial information extracted from two datasets.
| Data Sources | Students | Exercises | Interactions | Knowledge Concepts |
|---|---|---|---|---|
| Luogu | 2081 | 3329 | 299,168 | 190 |
| Codeforces | 1685 | 8237 | 630,724 | 204 |
Table 2. Description of the experimental environment.
| Configuration Environment | Configuration Parameters |
|---|---|
| Operating System | Windows 10 64-bit |
| GPU | RTX 3070 Ti |
| CPU | i7-13700H |
| Memory | 16 GB |
| Programming language | Python 3.7 |
| Deep learning framework | TensorFlow 2.1.0 |
| Python libraries | Scikit-learn, NumPy, Pandas |
Table 3. Comparative experiments.
| Methods | Luogu AUC | Luogu ACC | Codeforces AUC | Codeforces ACC |
|---|---|---|---|---|
| DKT | 0.8114 | 0.8397 | 0.6804 | 0.8574 |
| DKT+ | 0.8104 | 0.8388 | 0.7037 | 0.8589 |
| Deep-IRT | 0.8020 | 0.8271 | 0.6702 | 0.8526 |
| DKVMN | 0.8058 | 0.8300 | 0.6828 | 0.8573 |
| IEKT | 0.8230 | 0.8309 | 0.7308 | 0.8567 |
| SAKT | 0.8301 | 0.8307 | 0.6708 | 0.8550 |
| SAINT | 0.8263 | 0.8254 | 0.6763 | 0.8577 |
| AKT | 0.8313 | 0.8394 | 0.7491 | 0.8595 |
| ATKT | 0.8380 | 0.8418 | 0.7543 | 0.8614 |
| GPPKT (ours) | **0.8840** | **0.8472** | **0.7770** | **0.8799** |
Table 4. Ablation experiments.
| Methods | AUC | ACC |
|---|---|---|
| (1) LSTM | 0.8097 | 0.7954 |
| (2) LSTM + VAE | 0.8576 | 0.8343 |
| (3) LSTM + KG | 0.8654 | 0.8361 |
| (4) LSTM + ABI | 0.8679 | 0.8380 |
| (5) LSTM + VAE + ABI | 0.8762 | 0.8425 |
| (6) LSTM + VAE + KG | 0.8749 | 0.8392 |
| (7) LSTM + KG + ABI | 0.8703 | 0.8411 |
| (8) LSTM + VAE + KG + ABI | **0.8840** | **0.8472** |

Share and Cite

Pan, J.; Dong, Z.; Yan, L.; Cai, X. Knowledge Graph and Personalized Answer Sequences for Programming Knowledge Tracing. Appl. Sci. 2024, 14, 7952. https://doi.org/10.3390/app14177952