#### *3.2. Motion Identification Results*

We evaluate how accurately the four aforementioned models classify classroom behavior on the self-constructed dataset (SCB-13), in order to measure the influence of the classification model. A deep neural network (DNN) serves as a simple baseline against which the other algorithms are compared. For the back sensor and the shoulder sensor separately, we test three input configurations: accelerometer data (acc), gyroscope data (ypr), and the combination of both (acc + ypr). Because the data are time series, we also evaluate LSTM and BiLSTM networks. In addition, from the perspective of one-dimensional signal feature extraction, we use a 1DCNN to extract and classify data features in a more "intelligent" manner. The experimental results are listed in Table 3.
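The three input configurations can be illustrated with a minimal sketch. It assumes that each time step holds six values, three accelerometer axes followed by three gyroscope-derived values, and that ypr stands for yaw, pitch, and roll. The column order, the name `CHANNELS`, and the helper `select_channels` are hypothetical and are not taken from the SCB-13 data format.

```python
# Hypothetical column layout for one time step (an assumption, not the
# documented SCB-13 format): [acc_x, acc_y, acc_z, yaw, pitch, roll].
CHANNELS = {
    "acc": (0, 1, 2),                 # accelerometer only, 3 channels
    "ypr": (3, 4, 5),                 # gyroscope only, 3 channels
    "acc+ypr": (0, 1, 2, 3, 4, 5),    # combined input, 6 channels
}


def select_channels(window, config):
    """Return a copy of `window` restricted to the columns used by `config`."""
    cols = CHANNELS[config]
    return [[row[c] for c in cols] for row in window]


# Toy window with two time steps; the values are made up for illustration.
window = [
    [0.1, 0.2, 9.8, 10.0, -2.0, 0.5],
    [0.0, 0.3, 9.7, 11.0, -2.5, 0.4],
]

for name in CHANNELS:
    sub = select_channels(window, name)
    print(name, "->", len(sub[0]), "channels per time step")
```

Each network would then be trained once per configuration and per sensor, which gives the grid of results that Table 3 reports.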


**Table 3.** Main results of the four networks for the back sensor and the shoulder sensor separately. Here, acc denotes accelerometer data, ypr denotes gyroscope data, and acc + ypr denotes the combination of accelerometer and gyroscope data.

Across the experimental results, both the DNN and the LSTM network are generally able to distinguish classroom behaviors from three-channel data, whether from the accelerometer or from the gyroscope. When accelerometer and gyroscope data are combined as the network input, however, the classification performance of both networks improves significantly, indicating that additional data channels help to express and differentiate features.

The main experimental results show that, compared with the DNN and LSTM networks, the BiLSTM network significantly improves the identification accuracy of classroom behavior. The BiLSTM also yields a more robust feature representation for both three-channel data (accelerometer or gyroscope) and six-channel data (accelerometer and gyroscope), indicating that combining forward and backward LSTM passes significantly improves the learned feature representation.

Compared with the other three networks, the 1DCNN stands out for its strong feature extraction capability on sequence data. With combined accelerometer and gyroscope data, the 1DCNN achieves classification accuracies of 100% and 98.8% for the back and shoulder sensors, respectively. In terms of model complexity and computing speed, the 1DCNN is also considerably superior to the LSTM and BiLSTM.
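To give a rough sense of the complexity comparison, the sketch below counts trainable parameters for a single layer of each type using the standard formulas (one bias vector per LSTM gate). The layer sizes, namely 6 input channels, 64 hidden units or filters, and a kernel size of 3, are illustrative assumptions and do not describe the architectures evaluated in this paper.

```python
def lstm_params(input_dim, hidden):
    """Parameters of one LSTM layer: four gates, each with input weights,
    recurrent weights, and a single bias vector."""
    return 4 * (hidden * (input_dim + hidden) + hidden)


def bilstm_params(input_dim, hidden):
    """A bidirectional layer holds one forward and one backward LSTM."""
    return 2 * lstm_params(input_dim, hidden)


def conv1d_params(in_channels, filters, kernel_size):
    """Parameters of one 1-D convolution layer: kernel weights plus biases."""
    return kernel_size * in_channels * filters + filters


# Illustrative sizes only (assumptions, not the paper's configuration).
print("LSTM  :", lstm_params(6, 64))       # 18176
print("BiLSTM:", bilstm_params(6, 64))     # 36352
print("Conv1D:", conv1d_params(6, 64, 3))  # 1216
```

Under these assumed sizes the convolution layer is more than an order of magnitude smaller, and its outputs can be computed in parallel across time steps, whereas a recurrent layer processes the sequence step by step. The actual gap depends on the depth and width of the full networks.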

In general, the data collected by the back sensor are more stable than those collected by the shoulder sensor, allowing classroom activities to be differentiated on a wider scale. For motion classification, the gyroscope is superior to the accelerometer, although neither alone is as accurate as the combination of accelerometer and gyroscope data.

#### **4. Discussion**

#### *4.1. Ablation Study*

#### 4.1.1. Effect of VB-DTW Valid Segment Extraction

To evaluate the effectiveness of the proposed VB-DTW algorithm for valid segment extraction, we chose the input with the best classification performance (the combination of acc and ypr data) and investigated how valid segment extraction affected the action classification results. Table 4 displays the test results. Results with VB-DTW valid segment extraction generally have higher accuracy than those without it. With valid segment extraction, the 1DCNN model outperforms the other algorithms in classification accuracy.
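As background, and assuming that the DTW in VB-DTW refers to dynamic time warping, the following is a minimal sketch of the classic DTW distance between two one-dimensional sequences. It is not the VB-DTW procedure, whose details are not restated in this section; the function name `dtw_distance` and the absolute-difference cost are illustrative choices.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using the absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# The same shape performed at a different speed has zero DTW distance.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

Because DTW tolerates differences in timing, it is a natural building block for comparing repetitions of the same motion that were performed at different speeds.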

**Table 4.** Test results for the effectiveness of VB-DTW valid segment extraction.


#### 4.1.2. Effect of VB-DTW Augmentation

To compare the accuracy of the models with and without data augmentation, we again select the input with the highest classification accuracy (the combination of acc and ypr data). Table 5 displays the test results. Classification accuracy differs significantly with and without data augmentation, and without augmentation the particular advantages of the 1DCNN in classifying time-series data are not reflected. These results may be caused by overfitting due to the insufficient amount of data we gathered. Consequently, for small datasets, data augmentation needs to be applied before training with the proposed algorithm.
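For readers unfamiliar with time-series augmentation, the sketch below shows two generic techniques, random amplitude scaling and additive Gaussian jitter. It is not the VB-DTW augmentation evaluated above; the function `augment`, its default parameters, and the toy window are hypothetical and serve only to show how extra training windows can be derived from a small dataset.

```python
import random


def augment(window, rng, noise_std=0.01, scale_range=(0.9, 1.1)):
    """Return a perturbed copy of `window` (a list of per-time-step rows).

    One scale factor is drawn per window, and independent Gaussian noise is
    added to every value. The defaults are illustrative assumptions.
    """
    scale = rng.uniform(*scale_range)
    return [[v * scale + rng.gauss(0.0, noise_std) for v in row]
            for row in window]


# Toy six-channel window with two time steps (made-up values).
window = [
    [0.1, 0.2, 9.8, 10.0, -2.0, 0.5],
    [0.0, 0.3, 9.7, 11.0, -2.5, 0.4],
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
extra = [augment(window, rng) for _ in range(3)]
print(len(extra), "augmented windows, each with",
      len(extra[0]), "time steps and", len(extra[0][0]), "channels")
```

Each augmented window keeps the label of the window it was derived from, so the training set grows without additional data collection.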


**Table 5.** Test results for the effectiveness of data augmentation.

According to the results and discussion, the proposed VB-DTW algorithm, combined with wearable sensors and artificial intelligence technology, achieves intelligent perception and identification of school-aged students' classroom behaviors. Furthermore, effective valid segment extraction and data augmentation are essential to the networks' superior performance. Intelligent recognition of school-aged children's classroom behavior can provide timely feedback, allowing the children, particularly those with special education needs, to understand their classroom behavior in real time and to obtain assistance in the classroom in a way that is not labor-intensive.
