*1.3. Contributions and Structure*

The contributions of this paper include:

1. Artificial-intelligence-based behavior recognition is applied to the classroom environment for the first time, and an intelligent system that uses motion sensors to perceive and identify classroom behavior is built.
2. Based on the sensor hardware devices, a classroom behavior database (SCB-13) containing 14 common classroom behaviors collected from 13 participants is constructed.
3. A method for extracting valid sensor data segments based on an improved Voting-Based Dynamic Time Warping algorithm (VB-DTW) is proposed.
4. An intelligent identification method is proposed to recognize the 14 common classroom behaviors from valid behavior segments combined with a 1DCNN algorithm; the proposed method achieved 100% recognition accuracy on the self-constructed dataset (SCB-13).

The remainder of this paper is organized as follows: the second part describes the hardware data acquisition system and the relevant characteristics of the data; the third part gives a brief overview of the basic principles of the algorithms; the fourth part presents the experimental results and a comparative analysis; finally, a conclusion is provided.

#### **2. Materials and Methods**

#### *2.1. Participants*

In this study, we recruited 13 participants to carry out a feasibility study on the possibility of accurately identifying students' classroom behavior. The participants, aged from 20 to 26 years, were invited to participate in a classroom behavioral simulation experiment. This population consisted of 6 males and 7 females without special educational needs or developmental problems. They were culturally literate and able to comprehend, imitate, and model classroom behaviors accurately. Participants signed consent forms approved by the Ethics Committee of The Education University of Hong Kong (Approval Number: 2021-2022-0417) before data collection.

#### *2.2. Experimental Design*

For each participant, 5 sets of experimental data were gathered, for a total of 65 sets. In each trial, participants were asked to simulate 14 common classroom behaviors; Table 1 shows the design of each motion. Each motion slot lasts 20 s and can be divided into the valid duration, during which the motion is performed, and the sitting-still time: except for when the motion happens, the rest of the period is referred to as sitting-still time.
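Given the fixed timing above (a 20-s calibration period followed by 14 motion slots of 20 s each, 5 min in total), one trial can be cut into per-motion windows with simple index arithmetic. The sketch below assumes a hypothetical sampling rate `FS = 50` Hz, which is not stated in the text, purely to make the arithmetic concrete.

```python
import numpy as np

FS = 50              # assumed sampling rate in Hz (not given in the text)
CALIBRATION_S = 20   # calibration period at the start of each trial
MOTION_LEN_S = 20    # each motion slot lasts 20 s
N_MOTIONS = 14       # 14 classroom behaviors per trial

def split_trial(trial, fs=FS):
    """Cut one 5-min trial (samples x channels) into 14 fixed 20-s motion
    windows, skipping the initial 20-s system calibration period."""
    start = CALIBRATION_S * fs
    win = MOTION_LEN_S * fs
    return [trial[start + i * win : start + (i + 1) * win] for i in range(N_MOTIONS)]
```

With these assumptions, a 5-min trial of 7-channel data (300 s x 50 Hz = 15,000 samples) yields exactly 14 windows of 1000 samples each, matching the 20-s-per-motion design.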

**Table 1.** Motion mode design. To simulate classroom behaviors for the participants, we selected 14 typical classroom behaviors. The table lists each motion's name as well as the order in which it took place.


The motion sensor used in the acquisition system is the MPU6050, the main processing chip is the ESP-8266, and the serial data rate is 115,200 bps. The Arduino platform is used for programming control, and the sensor data are stored in .CSV format via the computer's USB port using a Python program. Figure 1a illustrates the schematic diagram of the 3D acquisition system: 3 cameras are installed on the participant's left side, front side, and diagonal rear to record visual motion data. As depicted in Figure 1b, in order to investigate the effect of sensor placement, the sensors are positioned in the middle of the spine and on the right shoulder of the participant. In addition to the 14 motions, there is a 20-s system calibration period at the beginning of the experiment to reduce the initial error introduced during data acquisition, so the total duration of each experiment is 5 min. The sensors generate 7 channels of data: accelerometer (*x*-, *y*-, and *z*-axis) data, gyroscope (*x*-, *y*-, and *z*-axis) data, and temperature data. The participants' motion information can be measured using the accelerometer data in various directions, while the gyroscope data monitor angular velocity to determine an object's position and rotational orientation. Because of their susceptibility to environmental factors, the temperature data are not suitable for use in a motion recognition system.
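The text notes that the 7-channel sensor stream is saved to .CSV via a Python program. A minimal sketch of that logging step is shown below; the channel names, the comma-separated line format, and the `log_to_csv` helper are assumptions for illustration, not the authors' actual acquisition script.

```python
import csv

# Assumed channel order: 3-axis accelerometer, 3-axis gyroscope, temperature.
CHANNELS = ["acc_x", "acc_y", "acc_z", "gyro_x", "gyro_y", "gyro_z", "temp"]

def parse_sample(line):
    """Parse one comma-separated 7-channel sample into a channel->value dict."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != len(CHANNELS):
        raise ValueError(f"expected {len(CHANNELS)} values, got {len(values)}")
    return dict(zip(CHANNELS, values))

def log_to_csv(lines, path):
    """Write parsed samples to a .CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=CHANNELS)
        writer.writeheader()
        for line in lines:
            writer.writerow(parse_sample(line))
```

In practice the raw lines would arrive over the USB serial link (e.g. read with a serial library at 115,200 bps) before being parsed and appended to the file.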

**Figure 1.** The acquisition system of the experiment. (**a**) The schematic diagram of the motion acquisition system in the classroom scene; (**b**) the location of the sensors. The vision information of the participants' motions is collected through cameras from three perspectives to assist in the classification. One sensor was placed in the center of the participant's spine and another on the right shoulder to collect data on the participant's motions.

