1. Introduction
Behavioral skills training (BST) is a systematic teaching method that aims to improve individual behavior and skill levels [1,2,3], providing a safe training environment for professional fields such as experimental teaching, industrial demonstration, and special training [4,5,6,7]. Currently, BST is gradually transitioning from traditional teaching methods to more information-based and immersive learning experiences [8]. The integration of human–computer interaction (HCI) and extended reality (XR) technologies plays a crucial role in this process [9,10]. HCI in XR optimizes user interfaces and interaction methods, enabling learners to engage with training systems in a more natural and intuitive way. XR technologies create immersive virtual environments for BST that simulate complex real-world scenarios and offer learners a controlled and safe platform [11].
Despite these significant advantages, the application of HCI and XR in BST still faces several challenges. First, head-mounted displays (HMDs), such as VR headsets and AR glasses, are currently the most widely used XR devices, but long-term use of HMDs can lead to motion sickness [12], eye fatigue [13], and other health problems [14,15,16]. Moreover, the design of HMDs (such as their weight and wearability) can also cause discomfort for users [17]. Second, traditional BST methods still depend heavily on video tutorials and live demonstrations, which are constrained by limited teaching resources, operational risks, and an inefficient teaching process [18]. Third, current BST evaluations focus mainly on improvements in behavior or skills, with insufficient research on subjective user experience. In comparison, gamified education is more attractive to students [19]. To fully explore the benefits of HCI and XR technologies in BST, more attention must be paid to users’ higher-level cognitive, memory, and emotional processes.
To address these challenges, we propose an innovative BST approach based on the spatial reality display (SRD):
- 1.
Spatial reality display (SRD) technology has been introduced into educational methods. This cutting-edge XR technology allows users to experience realistic 3D virtual scenes and objects without the need for head-mounted devices, effectively avoiding the discomfort associated with HMDs. A simple and intuitive gesture-based control scheme for SRD is designed to make it easier for users to learn, with virtual objects matching their real-world counterparts. This helps users become more familiar with the operations and properties of various objects.
- 2.
BST content in the form of serious games (SGs) is designed by integrating knowledge and skills into specific scenarios and gamified tasks. We also provide guidance at key points to enhance users’ psychological acceptance and reduce unfamiliarity with new technologies. As shown in Figure 1, BST keeps students actively engaged in the learning environment and comprises four steps: instruction, modeling, rehearsal, and feedback [20,21]. Based on these technologies and design principles, we selected the thermite reaction experiment as a case study in experimental teaching, one of the most typical BST application areas. We developed a BST application for chemistry labs in which users can learn and perform the thermite reaction experiment in a 3D virtual environment.
- 3.
The effectiveness of the SRD approach is assessed through written exams and simulated operations, with users’ skill acquisition and behavior data compared with those obtained under traditional chemical experiment BST. The evaluation process is shown in Figure 2 and includes a pre-test, behavioral skills training, simulated operations, a post-test, exposure to the other BST method, and interviews. A more detailed description is provided in Section 4.
The following sections of this paper detail our research. Section 2 reviews related studies, covering various immersive XR technologies, common interaction methods, and concepts like BST and SGs. Section 3 introduces the hardware requirements, interaction methods, and design and implementation of the BST application in the chemistry laboratory. Section 4 and Section 5 present the evaluation, including user experiment design, evaluation metrics, and data analysis. In Section 6, we discuss the strengths and weaknesses of SRD compared to traditional BST methods, based on data and user feedback, and identify areas for future improvement. Finally, Section 7 concludes the paper.
3. Method
3.1. Hardware
The experimental system employs a Leap Motion gesture sensor operating in desktop mode as the input device, which captures the positional information of hand nodes in space as well as the orientation and position of the hand. The display device of the experimental system is a Sony ELF-SR2 autostereoscopic 3D display, with a screen size of 27 inches and a resolution of 3840 × 2160. It is capable of detecting the position of the pupils and rendering spatial images in real-time for each eye separately, allowing users to perceive stereoscopic images with the naked eye.
The audio parameters are equally important. The ELF-SR2 is equipped with dual 1 W built-in speakers and supports 3D surround sound field transformation technology. The auditory experience complements the visual experience. To achieve a better overall experience, this device can simulate a spatial sound field to enhance immersion. Additionally, based on Sony’s audio processing technology, the audio signals can be intelligently analyzed to determine the spatial positions of different sound sources, thereby reproducing a more realistic 3D surround sound effect. With this design, the direction and intensity of the sound dynamically adjust according to changes in the visual scene, allowing the audio to be closely integrated with the visual content and providing an immersive audio experience.
3.2. Interaction Methods
To enhance the realism of the experiment, the virtual environment used in this study employed Leap Motion to capture the operator’s hand gestures, allowing the operator to directly manipulate virtual instruments for skill training. In terms of gesture interaction, the Leap Motion Unity3D plugin was utilized to record gesture data within the virtual environment. Additionally, the study explored methods for determining whether the virtual hand has successfully grasped the equipment.
During the initial implementation and research testing, we experimented with several approaches. These included concealing the virtual hand and substituting it with a real human hand, utilizing the physical properties of fingers to interact with objects, and employing bounding boxes with gesture triggers. Our research findings indicate that using a real human hand to interact with the virtual environment, considering the imaging principles of the SRD screen, makes it difficult to align the human hand with the geometric relationships of objects in the virtual scene. Moreover, the hand often obstructs the view of the virtual scene, affecting the operator’s actions. Interacting with the virtual hand in the scene sacrifices some sense of immersion but simplifies the operator’s tasks to a certain extent. This approach gives rise to two types of operations: physical grasping of objects with fingers and gesture-triggered interactions based on bounding boxes. Physical grasping of objects with fingers involves all operations based on real physical collisions. Although this method enhances realism, it significantly increases the interaction difficulty under the hardware conditions of Leap Motion. The implementation of gesture-triggered interactions based on bounding boxes involves binding a bounding box to an object, allowing the virtual hand to grasp and interact with the object by performing the corresponding gestures within the bounding box. After preliminary testing, we found that this interaction method has the highest execution efficiency and thus adopted it as the gesture interaction method for this study.
In this study, we used a cubic bounding box in the application to enclose each chemical experiment instrument model. As shown in Figure 3, when the virtual hand enters the bounding box, pinching the thumb and index finger together triggers the gesture interaction. To make the interaction more coherent and less difficult, we disabled the physical effects of some experimental instruments. This ensures that when gesture recognition is accidentally interrupted and then restored, the operator does not need to re-grasp the object.
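The bounding-box trigger described above can be illustrated with a minimal Python sketch. It is not the Unity3D implementation used in the application: the `Box` class, the `update_grab` function, the 3 cm pinch threshold, and the release-on-open rule are all assumptions made for illustration. Keeping the previous state while tracking is lost is a simplified stand-in for the effect of disabling physics on the instruments.

```python
from dataclasses import dataclass
from math import dist

PINCH_THRESHOLD = 0.03  # metres; assumed value, not taken from the paper


@dataclass
class Box:
    """Cubic, axis-aligned bounding box around an instrument (hypothetical)."""
    center: tuple
    half_size: float

    def contains(self, point):
        # A point is inside if it lies within half_size of the center on every axis.
        return all(abs(point[i] - self.center[i]) <= self.half_size for i in range(3))


def is_pinching(thumb_tip, index_tip, threshold=PINCH_THRESHOLD):
    """Thumb and index fingertips count as pinched when they are close together."""
    return dist(thumb_tip, index_tip) <= threshold


def update_grab(held, box, thumb_tip, index_tip, tracked=True):
    """Return whether the object is held after this frame."""
    if not tracked:
        # Tracking lost: keep the previous state so no re-grasp is needed.
        return held
    pinching = is_pinching(thumb_tip, index_tip)
    if held:
        # Assumed release rule: let go only when the fingers open.
        return pinching
    pinch_point = tuple((t + i) / 2 for t, i in zip(thumb_tip, index_tip))
    return pinching and box.contains(pinch_point)
```

Enlarging `half_size` makes an object easier to grab, which is the kind of adjustment a bounding-box scheme allows without touching the gesture recognition itself.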
3.3. Application
The thermite reaction is an important topic in China’s college entrance examination (Gaokao). It is a chemical reaction in which aluminum powder reduces metal oxides to the corresponding metals. The reaction is highly exothermic, with temperatures exceeding 1250 °C. Since the molten aluminum oxide generated during the reaction is prone to splashing, it can easily cause severe consequences such as burns. Given its high level of danger, teachers generally require students to learn about the thermite reaction through video demonstrations rather than conducting the experiment themselves.
The existing teaching of the thermite reaction mainly relies on watching videos, with a few teachers conducting live demonstrations. However, in the effectiveness evaluation studies of these two methods, Xian et al. [67] found that while watching videos is safe, it results in poor memory retention and lack of focus among students. Moreover, the live demonstration method is highly dependent on the teacher’s skill level, involves higher risks, and has a low tolerance for errors. There are also issues with unclear experimental phenomena and the inability to observe repeatedly. These problems can be addressed to a certain extent by the SRD teaching method.
Considering the above, the knowledge and procedural steps of the thermite reaction experiment are highly suitable for learning through a serious game-based BST application using the SRD method. To replicate the real thermite reaction experiment, we developed a BST application using the Unity3D (version 2021.3.6) graphics engine. The engine’s physics and lighting effects were utilized to construct a virtual experimental environment, simulating a real-world laboratory scene. During the experimental operations, corresponding information prompts will also appear in the scene to guide users in learning the relevant knowledge and operational methods of the experiment as shown in
Figure 4.
In this application, we have implemented simulations of all the key operational steps of the thermite reaction experiment. Users can personally experience and complete the following operations using gesture interaction technology on a naked-eye stereoscopic reality device.
Wetting the funnel: In the thermite reaction experiment, to ensure that the molten material generated can smoothly drop from the funnel into the crucible, it is necessary to wet the inner funnel with water in advance. Therefore, we consider it crucial to design the simulation of this step in the application. As shown in
Figure 5, users need to grasp a dropper with their hand and drop water onto the funnel. When the system recognizes that the step is completed, the UI will change, directing the user to the next experimental step.
Figure 5.
Dropping water onto the funnel. The prompt information on the left includes the name of this step, a schematic diagram, and its purpose: moistening the inner funnel with water is to prevent the paper funnel from burning due to high temperatures.
Adding thermite and oxidizer: In this step, users will learn how to add thermite and oxidizer. As shown in
Figure 6, users will learn from the UI that thermite is composed of aluminum powder and iron (III) oxide powder mixed in a 1:3 ratio. After using a spoon to take a measured amount of thermite, the operator needs to place it into the wetted funnel from the previous step, compact it, and then add a potassium chlorate compound on top of the thermite to aid combustion.
Figure 6.
Adding thermite and oxidizer. (a) The prompt information on the left includes the names, ratios, and methods of adding the reagents for this step: In the thermite mixture, the ratio of iron oxide powder to aluminum powder is 3:1, and they should be thoroughly mixed to ensure a complete reaction; (b) Add the thermite mixture; (c) Add the oxidizer (potassium chlorate) to aid combustion.
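As a quick plausibility check of the 1:3 ratio, the sketch below computes the stoichiometric mass ratio for the reaction 2Al + Fe2O3 → Al2O3 + 2Fe. It assumes the ratio is by mass, which the text does not state explicitly, and uses rounded standard atomic masses. The function name is illustrative.

```python
# Rounded standard atomic masses in g/mol.
M_AL, M_FE, M_O = 26.982, 55.845, 15.999


def fe2o3_to_al_mass_ratio():
    """Mass of Fe2O3 per unit mass of Al for 2Al + Fe2O3 -> Al2O3 + 2Fe."""
    mass_fe2o3 = 2 * M_FE + 3 * M_O  # one mole of iron(III) oxide
    mass_al = 2 * M_AL               # two moles of aluminum
    return mass_fe2o3 / mass_al


print(round(fe2o3_to_al_mass_ratio(), 2))  # 2.96
```

Under this assumption the stoichiometric value is close to 3, so the 3:1 iron oxide to aluminum ratio shown in the prompt is consistent with a near-complete reaction.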
Igniting and placing the magnesium strip: This step involves operations such as ignition, which are very dangerous in real experiments. In this step, the operator needs to turn on the alcohol burner and use crucible tongs to hold the magnesium strip and ignite it over the burner. As shown in
Figure 7, this step includes many details, such as using crucible tongs to hold the magnesium strip and insert it upside down into the thermite. This is an important detail involving safety hazards in real experiments, and thus it is emphasized in this application.
Figure 7.
Igniting and placing the magnesium strip. (a) The information prompt on the left includes the name of this step, a schematic diagram, and the method of operation; (b) Ignite the magnesium ribbon and insert it upside down into the mixture.
Reaction occurrence: As shown in Figure 8, after the magnesium strip is fully ignited and reacts within the thermite, the thermite burns and generates a large amount of heat, producing high-temperature, bright yellow molten iron that cools into iron balls and emits intense flames. This is the hallmark phenomenon of the thermite reaction. In this BST application, we reproduced the visual effects of the thermite reaction using Unity3D particle effects, allowing users to observe and understand the experimental phenomena in an immersive way.
Figure 8.
The experimental phenomena. The information on the left describes the phenomena of the reaction: 1. It emits a dazzling light and generates a large amount of heat; 2. The paper funnel is burned through, and molten droplets in a red-hot state fall onto the fine sand in the evaporating dish. After cooling, the droplets turn into a black solid. At the bottom is the button to restart the experiment.
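The guided flow through the four steps above, in which the UI advances once the system recognizes a completed step and a button restarts the experiment, can be sketched as a small sequential state machine. This is an illustrative Python sketch, not the application's Unity3D code. The `ExperimentFlow` class and its method names are hypothetical, and rejecting out-of-order steps is an assumption.

```python
STEPS = [
    "Wetting the funnel",
    "Adding thermite and oxidizer",
    "Igniting and placing the magnesium strip",
    "Reaction occurrence",
]


class ExperimentFlow:
    """Tracks which experimental step the user is currently on."""

    def __init__(self, steps=STEPS):
        self.steps = list(steps)
        self.index = 0

    @property
    def current(self):
        # None once every step has been completed.
        return self.steps[self.index] if self.index < len(self.steps) else None

    def complete(self, step):
        """Advance only if the reported step is the one currently expected."""
        if self.current is None or step != self.current:
            return False
        self.index += 1
        return True

    def restart(self):
        """Mirror the restart button shown at the end of the experiment."""
        self.index = 0
```

In such a design the UI prompt would simply render whatever `current` returns, so the prompt and the expected operation cannot drift apart.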
3.4. Content Optimization
While adjusting the experimental scene content and testing it on the holographic display device, we found that in some cases the stereoscopic effect of the displayed content was less pronounced than expected. Through exploration, we identified the factors that affect the holographic display effect; the most significant are scene brightness, background depth, object material, and reference objects in the foreground and background. Object color, object proximity, object dynamics, and the aspect ratio of the scene also influence the display effect. For instance, darkening the scene, using a dark background, and employing low-transparency materials with a frosted texture can enhance the overall stereoscopic and realistic feel of the displayed content. The specific influencing factors and the corresponding adjustments are listed in Table 1.
4. User Study
4.1. Design of User Study
To accurately evaluate the differences between the SRD method and the commonly used BST method in terms of objective effectiveness and subjective user experience, we designed a controlled experiment based on the thermite reaction experiment. One group of participants learned using the SRD method, while the other group learned using the most commonly used BST method, watching videos. The experiment included multiple components: examinations, evaluation questionnaires, simulated experimental operations in virtual and real environments, and user interviews. This allowed us to assess the SRD method comprehensively and from multiple perspectives.
The BST assets used in the experiment included an application and a teaching video. The assessment materials consisted of two tests—pre-test and post-test. The pre-test contained seven questions, which were used to understand the participants’ prior knowledge before the BST. The post-test included 15 questions, which were used to evaluate the learning outcomes after the BST. In addition, some necessary experimental instruments were used to allow participants to replicate the steps of the thermite reaction in a real-world environment. Furthermore, a subjective experience evaluation questionnaire with 42 items was provided for the participants to complete.
4.2. Procedure
To accurately assess the differences in effectiveness between the SRD method and the traditional BST method of watching videos, participants were divided into two groups, the traditional group and the SRD group. To control for variables, each group was initially exposed to only one of the two BST methods. However, the subjective evaluation of user experience required that each participant be familiar with both BST methods to minimize the impact of their cognitive biases on the experimental data. To resolve this contradiction, we designed the experimental procedure shown in Figure 2.
Before the experiment began, we introduced the background and procedure of the experiment, recorded the participants’ information, and conducted a pre-test to assess their current level of understanding of the training content. After that, participants in the traditional group would first learn using the traditional BST method, which involved watching detailed instructional videos on a PC monitor. Additionally, the video links are available in the
Supplementary Materials section of the paper. In contrast, participants in the SRD group would first learn using the SRD method. After the learning phase, all participants were required to complete a simulation task and a post-test. The simulation task involved using chemical laboratory instruments and some substitute props to simulate the steps of the thermite reaction experiment in a real-world environment. This process was recorded to capture the participants’ behavioral data. At this point, all objective evaluation aspects related to the effectiveness of the BST methods had been completed. Subsequently, there was no need to control for variables anymore. Participants were then exposed to the other BST method they had not previously experienced and were asked to complete a subjective experience evaluation questionnaire. Following the experimental procedure described above, a semi-structured interview was conducted to gain insights from the participants regarding their views on traditional education methods and SRD education methods.
4.3. Evaluation Indicators
Objective Indicators: Objective indicators are used to assess the differences in effectiveness between the SRD method and the traditional BST method. These differences are mainly reflected in the participants’ level of knowledge about the thermite reaction experiment and their proficiency in operating it. The relevant indicators can be derived from pre-tests and post-tests, such as the accuracy rate of questions. Additionally, key steps from the simulated operation phase can be extracted as objective indicators. By analyzing the participants’ behavioral data and calculating the accuracy rate of completing these steps, an assessment can be made. We have extracted the following 11 indicators:
(1) Place the crucible below the iron stand;
(2) Fill the crucible with sand;
(3) Use double-layer filter paper to hold the reactants;
(4) Cut a small hole in the filter paper;
(5) Moisten the filter paper with water;
(6) Add the reactants using a spatula;
(7) Perform steps 5 and 6 in sequence;
(8) Add the thermite and potassium chlorate successively;
(9) Stir the thermite and potassium chlorate to mix them evenly using a glass rod or spatula;
(10) Use tweezers to hold the magnesium strip and ignite it;
(11) Insert the ignited magnesium strip into the reactants upside down.
Subjective Indicators: Subjective indicators are used to assess the differences between the SRD method and the traditional BST method in terms of users’ cognitive, emotional, and memory-related subjective experiences. These differences are quantitatively evaluated by analyzing the ratings of scale items in the subjective experience assessment questionnaire. The subjective indicators are primarily derived from authoritative experience assessment scales in the field.
In this experiment, the Presence Questionnaire (PQ) [68], Intrinsic Motivation Inventory (IMI), and System Usability Scale (SUS) were used. Based on the research content, items from the original scales were selected to form a subjective experience assessment questionnaire. Each subjective indicator can be quantified by several questions in this questionnaire. The following 10 subjective indicators were extracted:
Involvement;
Sensory fidelity;
Adaptation/immersion;
Interface quality;
Interest/enjoyment;
Perceived competence;
Effort/importance;
Pressure/tension;
Value/usefulness;
SUS.
When conducting the evaluation, a comparative analysis can be performed between the performance of the SRD method and the traditional BST method in terms of the metrics on the PQ and IMI scales. Additionally, the user ratings of the two BST methods based on the SUS scale can be analyzed to assess their impact on user subjective experience.
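For readers unfamiliar with how a SUS rating is turned into a single number, the following sketch implements the standard SUS scoring rule for the full ten-item scale. It is an assumption that the full scale with standard scoring applies here: the questionnaire in this study was assembled from selected items, and the exact scoring used is not stated. The function name is illustrative.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten Likert responses on a 1-5 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS expects ten responses, each between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered negatively.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


print(sus_score([3] * 10))    # 50.0
print(sus_score([5, 1] * 5))  # 100.0
```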
4.4. Data Collection and Analysis
The collection of experimental result data originates from multiple stages of the experimental process, including pre-test scores, participants’ reactions during training, behaviors and performance in simulated operations, post-test scores, and interviews. This encompasses both subjective and objective data.
For data analysis, we employed difference analysis to explore the differences in effectiveness and subjective experience between the two BST methods, using the t-test and the Mann–Whitney U test. The t-test was applied to data that followed a normal distribution, while the U test was used for data that did not. Given the small sample size, we used the Shapiro–Wilk test to assess whether the data conform to a normal distribution.
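The selection rule can be written down as a short decision function. The sketch below is illustrative only: it takes already-computed p-values as input rather than raw data, the function and parameter names are hypothetical, and falling back to the U test when variances are not homogeneous is an assumption based on how non-homogeneous indicators are treated in Section 5.3.

```python
def choose_test(p_normal_a, p_normal_b, p_homogeneity=None, alpha=0.05):
    """Pick a difference test from normality and variance-homogeneity p-values.

    p_normal_a, p_normal_b: Shapiro-Wilk p-values of the two samples.
    p_homogeneity: p-value of the homogeneity-of-variance test, if it was run.
    """
    both_normal = p_normal_a > alpha and p_normal_b > alpha
    if not both_normal:
        return "Mann-Whitney U test"
    if p_homogeneity is not None and p_homogeneity <= alpha:
        # Assumed fallback when the equal-variance assumption is rejected.
        return "Mann-Whitney U test"
    return "t-test"


# p-values reported in Section 5.1 (pre-test, post-test) and Section 5.2.
print(choose_test(0.0612, 0.3621, 0.7311))  # t-test
print(choose_test(0.0194, 0.0165))          # Mann-Whitney U test
print(choose_test(0.1706, 0.5863, 0.0735))  # t-test
```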
We also considered the potential impacts of the small sample size. First, regarding extrapolation, the results of a study with a small sample size may not be easily generalizable to a broader population. Due to the limited sample size, the study may not fully capture the variability among different individuals, thereby restricting the universality and applicability of the findings. Second, in terms of effect size, a small sample size may lead to inaccurate estimates of effect size, which in turn affects the interpretation of the results. To mitigate these negative impacts, we rigorously controlled the selection of experimental participants and chose appropriate analytical methods. The specific strategies are detailed in
Section 5. The analysis results are at least somewhat representative of the target user group. However, to widely promote the SRD method to more scenarios and user groups, it is necessary to introduce more diverse experimental subjects for further research to explore the performance of the SRD method in a broader population.
5. Results
The target users of our study were first-year college students majoring in chemistry, who had just begun their professional chemistry studies but were required to conduct potentially hazardous experiments. To obtain accurate experimental results, we selected participants with the following attributes: aged 18–25, a balanced gender ratio, having taken high school chemistry courses, and possessing only very basic knowledge of chemistry experiments. These participants also had little to no experience with SRD-related devices. This ensures that their cognitive level and knowledge base align closely with our target users. Given the participants’ lack of experimental experience, we increased the experiment’s tolerance for errors. For example, in the SRD method, we enlarged the bounding boxes of interactive objects to reduce the difficulty of gesture interactions for selecting and moving virtual objects; when watching videos, participants were allowed to rewind by dragging the progress bar; and in simulated operations, we used salt, sugar, and sand instead of the reactants in the thermite reaction, without providing any ignition sources to avoid potential dangers caused by accidental operations. The actual experimental conditions are as follows.
We used a selection of thermite reaction-related questions to pre-test the candidate participants, ensuring that the participants were not familiar with the operation process of the thermite experiment. This step reduced the influence of the participants’ prior knowledge on the experimental test. After this screening, we recruited 32 participants (17 females and 15 males), aged between 18 and 24 years old. Among them, 30 had never used an SRD display, while 2 had used it before. Prior to the experiment, we provided a detailed description of the experimental procedure and the matters that the participants needed to be aware of. We also had the participants sign an informed consent form to ensure that they were fully informed about the experiment and the potential risks involved, and we checked that they did not experience any discomfort from the hardware during the experiment. Upon completion of the experiment, each participant received a remuneration of USD 10.
5.1. Analysis of Pre-Test and Post-Test
Firstly, the normality of the correct answer rate data for the SRD group and the traditional group in the pre-test and post-test was tested. Since this is a small sample with a size of less than 50, the Shapiro–Wilk test was employed.
As shown in Table 2, the p-values for the accuracy of answers in the pre-test for the two groups of participants were 0.0612 and 0.3621, respectively. Since both p-values are greater than 0.05, it is concluded that the pre-test data for both groups conform to a normal distribution. Additionally, the p-values for the accuracy of answers in the post-test for the two groups were 0.0194 and 0.0165, respectively. Since both p-values are less than 0.05, it is concluded that the post-test data for both groups do not conform to a normal distribution.
Since the pre-test data of the two groups conform to a normal distribution, a t-test can be conducted on the data before training. As shown in Table 3, the homogeneity of variance was tested first, yielding a p-value of 0.7311, which is greater than 0.05, so the test for homogeneity of variance was passed. The independent samples t-test then gave a p-value of 0.7233, which is also greater than 0.05. Therefore, there was no statistically significant difference between the pre-test data of the SRD group and the traditional group. This result suggests that before the participants received BST, there was no significant difference in their understanding of the training content.
Since the post-test data of the two groups did not conform to a normal distribution, the Mann–Whitney U test was used to compare the pre-test and post-test data within each group. As shown in Table 4, the p-value for the SRD group was 0.0025, which is less than 0.01, indicating a highly significant difference between the pre-test and post-test data. In contrast, the p-value for the traditional group was 0.0669, which is greater than 0.05, indicating no significant difference between the pre-test and post-test data. This result suggests that the SRD method has a more significant effect on improving users’ theoretical knowledge of behavioral skills compared to the traditional method.
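To make the U test concrete, the sketch below computes the Mann–Whitney U statistic and a two-sided p-value using the normal approximation with tie and continuity corrections. It is a pure-Python illustration under those stated choices, not the statistical software used in the study, and the function names are hypothetical. Exact small-sample p-values can differ from this approximation.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Tie correction: each group of t tied values reduces the variance.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return min(u1, u2), 1.0  # all values identical: no evidence of a difference
    # Continuity correction of 0.5, never pushing the deviation below zero.
    z = max(abs(u1 - n1 * n2 / 2) - 0.5, 0) / sqrt(variance)
    return min(u1, u2), erfc(z / sqrt(2))
```

For example, `mann_whitney_u([1, 2, 3], [4, 5, 6])` gives U = 0 because every value in the first sample ranks below every value in the second.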
This conclusion can also be corroborated by the comparison of histograms in Figure 9. Before BST, the correct answer rate of the SRD group was slightly lower than that of the traditional group. However, after BST, the correct answer rate of the SRD group exceeded that of the traditional group.
5.2. Analysis of Simulation of the Operation
Firstly, a normality test was conducted on the data of operational accuracy for the SRD group and the traditional group in the simulation of the operation. Since this is a small sample with a size of less than 50, the Shapiro–Wilk test was employed.
Table 5 shows that the p-values of the two groups of participants’ operation data in the simulation of the operation are 0.1706 and 0.5863, respectively. Since both p-values are greater than 0.05, it is concluded that the simulation operation data of the two groups conform to a normal distribution.
Given that the simulation operation data of the two groups conform to a normal distribution, a t-test can be performed on the data, and the results are shown in Table 6. First, the homogeneity of variance was tested, with a p-value of 0.0735, which is greater than 0.05, indicating that the variance homogeneity test was passed. Further analysis revealed that the p-value in the independent samples t-test was 0.0106, which is less than 0.05. Therefore, there is a statistically significant difference between the simulation operation data of the SRD group and the traditional group. This suggests that the SRD method and the traditional method have distinct effects on the improvement of participants’ behavioral skills.
The histogram in Figure 10 corroborates this conclusion more intuitively. Its horizontal axis represents the key operation steps, which are the 11 objective indicators mentioned earlier. The green data are obtained by subtracting the operation accuracy rate of the traditional group from that of the SRD group for each indicator. The accuracy rate of the SRD group is higher than that of the traditional group for the majority of indicators. In conjunction with the results of the t-test, it can be concluded that the SRD method has a more significant effect on improving users’ operational behavioral skills compared to the traditional method.
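The per-indicator difference plotted in Figure 10 amounts to a simple calculation, sketched below. The data layout (one list of pass/fail flags per participant) and the function names are assumptions, and the values in the usage example are made up to show the arithmetic; they are not the study's data.

```python
def accuracy_rates(records):
    """Per-indicator accuracy; records holds one list of booleans per participant."""
    n_participants = len(records)
    n_indicators = len(records[0])
    return [sum(p[i] for p in records) / n_participants for i in range(n_indicators)]


def rate_differences(srd_records, traditional_records):
    """SRD accuracy minus traditional accuracy for each indicator."""
    srd = accuracy_rates(srd_records)
    traditional = accuracy_rates(traditional_records)
    return [a - b for a, b in zip(srd, traditional)]


# Hypothetical example with two participants per group and three indicators.
srd_demo = [[True, True, False], [True, False, True]]
traditional_demo = [[True, False, False], [False, False, True]]
print(rate_differences(srd_demo, traditional_demo))  # [0.5, 0.5, 0.0]
```

A positive difference means the SRD group completed that step correctly more often than the traditional group.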
In the simulated operation phase, the SRD group took a total of 34 min and 48 s, while the traditional group took a total of 36 min and 50 s. Overall, the SRD group demonstrated higher proficiency. At the level of individual participants, Figure 11 shows that the majority of participants in the SRD group completed the simulated experiment both accurately and quickly. In contrast, participants in the traditional group did not stand out in either accuracy or speed. These results suggest that the SRD method is more effective in transferring skills from virtual to real scenarios.
5.3. Analysis of Subjective Indicators
Firstly, based on the data from the subjective experience evaluation questionnaires completed by users, we conducted normality tests for the 10 subjective indicators in both the SRD method and the traditional BST method.
The results in
Table 7 show that, in the questionnaire data of the SRD group, involvement, interest/enjoyment, perceived competence, effort/importance, pressure/tension, and value/usefulness did not exhibit normality characteristics. In contrast, sensory fidelity, adaptation/immersion, interface quality, and SUS did exhibit normality characteristics.
As shown in
Table 8, in the questionnaire data of the traditional group, sensory fidelity did not exhibit normality characteristics. However, involvement, adaptation/immersion, interface quality, interest/enjoyment, perceived competence, effort/importance, pressure/tension, value/usefulness, and SUS did exhibit normality characteristics.
We then conducted homogeneity of variance tests for the subjective indicators. According to the results in
Table 9, the samples of the two groups did not exhibit homogeneity of variance for adaptation/immersion, effort/importance, and value/usefulness; that is, the spread of the data differed significantly between the groups. Therefore, we concluded that the following indicators could be analyzed using the
t-test: interface quality, interest/enjoyment, perceived competence, pressure/tension, and SUS. The indicators that could not be analyzed using the
t-test included involvement, sensory fidelity, adaptation/immersion, effort/importance, and value/usefulness. For these indicators, we employed non-parametric tests.
5.4. Analysis of Subjective Scale
5.4.1. Presence Questionnaire
After conducting normality tests and homogeneity of variance tests, we found that the interface quality in the Presence Questionnaire (PQ), interest/enjoyment, perceived competence, and pressure/tension in the Intrinsic Motivation Inventory (IMI), as well as the System Usability Scale (SUS) score, all approximately followed a normal distribution and exhibited homogeneity of variance. Therefore, we used the t-test to analyze these attributes. For the other attributes, we used the Mann–Whitney U test.
When analyzing the data from the PQ, the Mann–Whitney U test was conducted on involvement, sensory fidelity, and adaptation/immersion, with the results shown in
Figure 12 and
Table 10.
The results indicate significant differences between the two experimental teaching methods for all three factors. For the involvement factor, the SRD method had a score distribution of 35.000 (33.0, 36.0) (where the median is 35.000, the 25th percentile is 33.0, and the 75th percentile is 36.0; this notation is used consistently for subsequent data), which was significantly higher than the score of 16.000 (13.3, 19.0) obtained with the traditional multimedia teaching method. For the sensory fidelity factor, the SRD method scored 17.000 (15.0, 19.0), which was significantly higher than the traditional multimedia teaching method’s score of 6.000 (4.0, 7.0). For adaptation/immersion, the SRD method scored 28.000 (26.0, 30.0), which was also significantly higher than the score of 18.500 (14.3, 21.0) obtained with the traditional multimedia teaching method. In short, the SRD teaching method significantly outperformed the traditional multimedia teaching method on all three factors.
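To make the "median (25th percentile, 75th percentile)" notation and the Mann–Whitney U statistic concrete, the following sketch computes both with the Python standard library. The scores are hypothetical and are not the study data; the helper names are illustrative, and the quartile convention (`method="inclusive"`) is an assumption, since the text does not state which one was used. The sketch stops at the U statistic; a p-value would additionally require the null distribution of U, as provided by, for example, `scipy.stats.mannwhitneyu`.

```python
from statistics import median, quantiles


def median_iqr(sample):
    """Return (median, Q1, Q3); the inclusive quartile method is an assumption."""
    q1, _, q3 = quantiles(sample, n=4, method="inclusive")
    return median(sample), q1, q3


def mann_whitney_u(sample_a, sample_b):
    """U statistic of sample_a: each pair with a > b counts 1, each tie 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical involvement scores, for illustration only.
srd_scores = [33, 34, 35, 36, 36]
traditional_scores = [13, 15, 16, 18, 19]

srd_summary = median_iqr(srd_scores)
u_srd = mann_whitney_u(srd_scores, traditional_scores)
u_traditional = mann_whitney_u(traditional_scores, srd_scores)
print(srd_summary, u_srd, u_traditional)
```

The two U values always sum to the product of the sample sizes, which is a quick sanity check on the pair counting.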
When analyzing interface quality, we used the t-test, and the results showed that t = −0.387, p = 0.700 > 0.05, indicating no significant difference between the two methods.
5.4.2. The Intrinsic Motivation Inventory
The
t-test was conducted on interest/enjoyment, perceived competence, and pressure/tension, with detailed results shown in
Table 11.
It can be observed that there were significant differences among different experimental teaching methods in terms of interest/enjoyment and perceived competence. For interest/enjoyment, the SRD method had a score distribution of 6.15 ± 0.82 (mean = 6.15, standard deviation = 0.82; the same notation applies to subsequent data), which was significantly higher than the traditional method’s score of 3.55 ± 0.97. In terms of perceived competence, the SRD method scored 5.83 ± 0.97, which was significantly higher than the traditional method’s score of 4.51 ± 1.36. The differences and comparisons are shown in
Figure 13. However, no significant difference was observed between the SRD method and traditional multimedia methods in terms of pressure/tension.
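For readers who want to see how an independent samples t statistic is formed from two groups of scores, the sketch below uses the pooled-variance (Student) form, which is the usual choice when homogeneity of variance holds. The ratings are hypothetical, the function name is illustrative, and the sketch returns only the statistic and its degrees of freedom; the p-value would come from the t distribution (for instance via `scipy.stats.ttest_ind`).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for the equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / dof
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, dof


# Hypothetical 7-point ratings, for illustration only.
srd_ratings = [6.0, 7.0, 5.5, 6.5, 6.0]
traditional_ratings = [3.5, 4.0, 2.5, 4.5, 3.0]

t_value, dof = pooled_t_statistic(srd_ratings, traditional_ratings)
print(round(t_value, 3), dof)
```

A positive t here means the first group's mean is higher, which is the direction reported for interest/enjoyment and perceived competence.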
For the effort/importance and value/usefulness factors in the IMI, the Mann–Whitney U test was used, and the results are shown in
Table 12.
Both factors exhibited significant differences. For the effort/importance factor, the SRD method’s score distribution of 6.167 (5.0, 7.0) was significantly higher than the traditional method’s score distribution of 5.000 (3.8, 6.3). For the value/usefulness factor, the SRD method’s score distribution of 6.667 (6.1, 7.0) was significantly higher than the traditional method’s score distribution of 4.667 (4.3, 5.7). The boxplot distributions are shown in
Figure 14.
5.4.3. System Usability Scale
We applied the t-test to the total SUS score, and the results showed no significant difference between the two experimental teaching methods (t = 1.385, p = 0.171). The SRD method had a score distribution of 74.77 ± 12.43, while the traditional teaching method had a score distribution of 69.61 ± 16.99. The SRD method was rated “B” on the SUS grading scale, corresponding to the usability adjective “good”. In contrast, the traditional teaching method was rated closer to “C+”, with its usability adjective falling between “good” and “OK”.
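For reference, a total SUS score on the 0 to 100 scale is obtained from the ten questionnaire items with the standard SUS scoring rule: odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5. The sketch below implements that rule; the function name and the example responses are hypothetical and are not taken from the study data.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten ratings given on a 1-5 scale.

    responses: ratings in questionnaire order (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # positively worded (odd) items
        else:
            total += 5 - rating   # negatively worded (even) items
    return total * 2.5


# Hypothetical single participant, for illustration only.
example_score = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
print(example_score)
```

Group-level values such as 74.77 ± 12.43 are then the mean and standard deviation of these per-participant scores.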
6. Discussion
After the experimental section was completed, we conducted interviews with each participant, lasting 5–10 min. The interviews were guided by an outline, focusing primarily on the topic of “the differences between the two experimental teaching methods and their respective advantages and disadvantages”. In the following sections, we discuss the strengths and weaknesses of the SRD method and the traditional BST method by integrating the results of the data analysis and the content of the participant interviews. We also summarize the typical views of the participants regarding the two experimental methods, along with the original statements supporting these views, with the participants’ numbers indicated. For each viewpoint, we offer some speculation and explanations regarding its formation.
6.1. Advantages of the SRD Method
The advantages of the SRD method are mainly reflected in three aspects. First, the immersive experience provides users with a more realistic experimental experience, effectively enhancing memory and understanding, and increasing motivation and interest in learning. Some participants described this aspect as follows: “In SRD, I feel like I’m actually conducting the experiment, able to see the details of chemical reactions and even operate the experimental equipment” (P1, P4, P7, P20, P21; here P stands for participant, so P1 refers to Participant No. 1, and the same notation is used below). “When I perform the operations myself, I can remember each step, rather than just watching videos and forgetting” (P2, P4, P7, P13, P14). “SRD feels like playing an interesting game, making me more interested in learning and exploring” (P6, P12, P18, P21, P25). Based on these descriptions, the greatest advantage of SRD is that it provides a virtual experimental environment close to reality. Students can operate and observe as if they were in a real laboratory, enhancing their sense of realism and practicality. Through SRD, students can repeatedly perform experimental steps and experience the chemical reaction process firsthand. Combined with previous research data, this active participation helps deepen their memory and understanding. Moreover, the use of novel virtual reality and interactive technologies makes the learning process more engaging, thereby increasing students’ interest and proactivity in learning.
Second, the interaction and feedback mechanism offer a rich interactive experience, allowing users to enjoy immediate feedback and flexible operability. Participants described this during the interviews as follows: “I can use gestures to operate test tubes and instruments, and this interaction makes me feel very involved” (P4, P8, P10, P17). “Whenever I make a mistake in the steps, the system immediately provides feedback, letting me know where I went wrong” (P12, P17, P30). “I can freely control the pace of the experiment, pausing and trying different steps at any time” (P12, P13, P20). Thanks to the innovative gesture-based interaction, this high level of interactivity enhances the richness and engagement of the learning experience. The virtual environment provides an immediate feedback mechanism, helping students correct their mistakes promptly during operations, thus facilitating faster learning and mastery of experimental skills. Additionally, the SRD teaching method allows students to control the pace of the experiment autonomously. They can repeat and adjust their operations according to their learning needs, increasing opportunities for self-directed learning.
Third, the SRD method offers safety and cost-effectiveness. Some participants mentioned: “In SRD, I don’t have to worry about the dangers of chemical reagents and can conduct experiments with peace of mind” (P7, P24, P31). “Through virtual experiments, we save a lot of costs associated with purchasing experimental materials and equipment” (P1, P3, P26). During the interviews, we also found that participants recognized one of the original intentions behind the design of the SRD experiment, which was to eliminate the safety hazards of real chemical experiments. This allows students to conduct potentially dangerous experiments in a risk-free environment. Moreover, using SRD for experimental teaching reduces the need for actual experimental materials and equipment, thereby lowering the overall cost of experimental instruction.
6.2. Disadvantages of the SRD Method
Despite its many advantages, the SRD method still has limitations in certain aspects, which are mainly reflected in three areas:
Firstly, the SRD method is limited by technological constraints, resulting in less satisfactory experiences in terms of gesture recognition and sensory aspects beyond vision. For example, several participants reported that “sometimes the gesture recognition is not sensitive enough, leading to unsmooth operations and affecting the experimental experience” (P4, P8, P17, P18), or “although the SRD is very realistic, I still feel there is a gap compared with real experiments, especially in the sense of touch” (P1, P8, P28). In experiments and interviews, we found a relatively serious issue: despite the advanced interaction technology provided by SRD, the accuracy of gesture recognition remains a problem, which may affect students’ operational experience and the smooth progress of experiments. Although the SRD offers a highly realistic virtual environment, some students still believe that it cannot fully replace the physical realism of real experiments, especially due to the lack of tactile feedback. This is because participants can only perform grasping gestures in the air during operations and cannot actually hold physical objects. To improve the accuracy of gesture recognition in the SRD method, several strategies can be adopted. (i) Integrating multiple sensors (such as cameras and infrared sensors) can enhance the accuracy and robustness of gesture recognition. For example, combining visual and infrared signals can capture gesture movements more comprehensively. (ii) Utilizing deep learning algorithms, particularly convolutional neural networks (CNN) and long short-term memory networks (LSTM), can automatically extract gesture features and improve recognition accuracy. (iii) Through environmental light adjustment and background separation techniques, the impact of lighting conditions and complex backgrounds on recognition can be minimized.
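As one possible reading of strategy (i), the sketch below combines per-gesture confidence scores from two sensors by a weighted average (often called late fusion) and picks the most likely gesture. Everything here is an illustrative assumption: the function name, the gesture labels, the weights, and the scores are hypothetical, and the SRD system described in this paper is not known to work this way.

```python
def fuse_gesture_scores(camera_scores, infrared_scores, camera_weight=0.5):
    """Late fusion of two sensors' per-gesture confidences (hypothetical).

    camera_scores / infrared_scores: dicts mapping gesture label -> confidence.
    camera_weight: weight of the camera; the infrared sensor gets the rest.
    Returns (best_gesture, fused_confidence).
    """
    if not 0.0 <= camera_weight <= 1.0:
        raise ValueError("camera_weight must lie in [0, 1]")
    shared = camera_scores.keys() & infrared_scores.keys()
    if not shared:
        raise ValueError("the two sensors share no gesture labels")
    fused = {
        gesture: camera_weight * camera_scores[gesture]
        + (1.0 - camera_weight) * infrared_scores[gesture]
        for gesture in shared
    }
    best = max(fused, key=fused.get)
    return best, fused[best]


# Hypothetical confidences: the camera is unsure, the infrared sensor is not.
camera = {"grasp": 0.55, "release": 0.45}
infrared = {"grasp": 0.30, "release": 0.70}
gesture, confidence = fuse_gesture_scores(camera, infrared)
print(gesture, round(confidence, 3))
```

In this example the camera alone would pick "grasp", while the fused result is "release", which illustrates how a second sensor can override an ambiguous reading from the first.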
Secondly, the learning and economic costs associated with the use of new technologies are significant. “To use SRD, specific equipment is required, which is a bit difficult for some students” (P1, P6, P22), and “using SRD at the beginning is a bit complicated, not as convenient and intuitive as watching videos” (P15, P20, P26). During interviews, some participants also expressed concerns about the costs. The hardware for SRD requires high-performance support, which may increase the burden on students and schools, especially in situations with limited resources. The complexity of SRD also results in a steep learning curve for first-time users, meaning that students and educators may need to spend more time and effort to become familiar with and master the usage methods.
Thirdly, the interaction and feedback provided during the use of SRD need to be further enhanced; otherwise, they may exacerbate users’ feelings of frustration when mistakes occur. This point was also confirmed by some participants: “During the experiment, I hope there are more prompts and guidance to prevent me from making mistakes” (P7, P14, P17, P30), and “when I repeatedly make mistakes, the system’s feedback is not effective enough, which makes me feel frustrated” (P2, P17, P19). Although SRD provides interactive and feedback functions, some students hope to receive more real-time guidance during operations to reduce errors caused by unfamiliarity. If students frequently make mistakes during experiments and do not receive effective help and feedback, it may lead to frustration and negatively impact their motivation to learn.
6.3. Skill Transfer from Virtual to Real
In this study, a simulated experimental operation segment was designed to compare the skill transfer effect from virtual environments to the real world between the SRD method and the traditional method. In this segment, participants’ behavior and accuracy in simulating an aluminum thermite reaction experiment were recorded and analyzed to compare the differences in effectiveness between the two methods. If the SRD method is to be applied at scale, the skill transfer evaluation method and experimental design used in this study should be supplemented.
Regarding experimental design, environmental differences, perceptual differences, and task complexity are the primary factors influencing the transfer effect. To enhance the skill transfer effect from virtual environments to the real world, the following strategies can be adopted: First, by employing more precise environmental modeling and simulation techniques, the differences between virtual and real environments can be reduced. Second, by integrating various feedback methods such as visual, auditory, and tactile feedback, learners’ perceptual experiences in virtual environments can be enhanced to make them closer to real-world operations. Third, complex tasks can be decomposed into multiple subtasks and gradually trained in the virtual environment. Through this approach, learners can progressively master the skills of each subtask and eventually integrate them into a complete skill set.
Regarding evaluation methods, skill transfer effects can still be evaluated based on real-world task performance. However, a comprehensive and standardized evaluation scheme needs to be developed, including four stages: pre-testing, virtual environment learning, real-world task testing, and data analysis. The pre-testing should be related to the real-world task testing to serve as baseline data. The evaluation process should combine quantitative and qualitative assessments, analyze differences in metrics such as completion time and accuracy, record operators’ behaviors during the learning and testing processes, and analyze learning styles and psychological changes through interview results.
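The four-stage scheme above can be made concrete with a small record type that stores the pre-test baseline and the real-world test result per participant, from which a transfer gain is derived. This is only a sketch of one possible bookkeeping structure: the class name, field names, and values are hypothetical, and defining "gain" as the difference in accuracy is an assumption rather than a measure proposed in the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TransferRecord:
    """One participant's results across the evaluation stages (hypothetical)."""
    participant: str
    pre_accuracy: float      # pre-test accuracy on the real-world task, 0..1
    post_accuracy: float     # real-world test accuracy after virtual learning
    completion_seconds: float

    @property
    def gain(self):
        # Assumed transfer measure: improvement over the pre-test baseline.
        return self.post_accuracy - self.pre_accuracy


def mean_gain(records):
    """Average accuracy gain over a group of participants."""
    return mean(record.gain for record in records)


# Hypothetical records, for illustration only.
records = [
    TransferRecord("P1", pre_accuracy=0.50, post_accuracy=0.80, completion_seconds=140.0),
    TransferRecord("P2", pre_accuracy=0.60, post_accuracy=0.70, completion_seconds=155.0),
]
print(round(mean_gain(records), 3))
```

Qualitative material such as interview notes would sit alongside these records rather than inside them.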
6.4. Promotion of the SRD Method
This study holds practical significance for the large-scale promotion of the SRD method in terms of theoretical basis, practical reference, and model optimization:
1. Initial validation of technical potential. Although the sample size is small, the study results can preliminarily validate the potential of the SRD method in behavioral skills training. Compared with traditional video training methods, the SRD method demonstrates stronger immersion, interactivity, and better skill transfer capabilities, providing a theoretical basis and practical reference for subsequent large-scale studies.
2. Promoting technological application and innovation. The study results can attract more attention from the industry and researchers to the application of the SRD method. The positive outcomes from small-sample studies can inspire further exploration of innovative applications and promote the implementation of BST applications based on the SRD method in more fields.
3. Optimizing training models. Small-sample studies can provide direction for optimizing the training models of the SRD method. For example, the study can reveal which design elements (such as immersion and interactivity) in the SRD method have a greater impact on learning outcomes, thereby providing references for training design when promoting on a large scale. Additionally, by analyzing small-sample data, potential areas for improvement can be identified to further enhance the effectiveness of the SRD method.
To promote the application of the SRD method in different scenarios, several measures need to be considered. First, various interaction methods, such as gesture recognition, speech recognition, and eye tracking, can be integrated to form a multimodal interactive system. This allows users to choose the most natural interaction method according to their needs in complex scenarios, for example, prioritizing gesture recognition in noisy environments and adding eye tracking for high-precision operations. Second, machine learning algorithms can enable the system to automatically adjust the gesture recognition model based on user habits. By recording commonly used gesture patterns and optimizing the recognition algorithm, recognition accuracy and interaction efficiency can be improved for individual users. Third, a modular architecture can be adopted for the systems of the SRD method, facilitating the rapid integration or replacement of functional modules according to the requirements of different scenarios, for instance, quickly integrating virtual experiment modules in educational scenarios or high-precision virtual assembly modules in industrial scenarios.
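The multimodal idea in the first measure can be illustrated with a simple rule that picks interaction modalities from the scenario. The sketch is hypothetical throughout: the function name, the modality labels, and especially the 70 dB noise limit are assumptions chosen for illustration, not parameters of the SRD system.

```python
def select_modalities(ambient_noise_db, needs_high_precision, noise_limit_db=70.0):
    """Choose interaction modalities for a scenario (illustrative rule only).

    ambient_noise_db: measured background noise level.
    needs_high_precision: whether the task involves fine-grained operations.
    noise_limit_db: assumed level above which speech input is disabled.
    """
    modalities = ["gesture"]              # gesture input is always available
    if ambient_noise_db < noise_limit_db:
        modalities.append("speech")       # speech only when it is quiet enough
    if needs_high_precision:
        modalities.append("eye_tracking") # gaze assists fine operations
    return modalities


# A noisy workshop with a precise assembly task versus a quiet classroom.
workshop = select_modalities(ambient_noise_db=85.0, needs_high_precision=True)
classroom = select_modalities(ambient_noise_db=40.0, needs_high_precision=False)
print(workshop, classroom)
```

Keeping the rule in one function also fits the modular design mentioned above, since a scenario-specific module could replace it without touching the recognizers themselves.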
7. Conclusions
This study proposes a new method of behavioral skills training (BST) based on spatial reality display (SRD), aiming to overcome the limitations of traditional BST methods and existing applications of human–computer interaction (HCI) and extended reality (XR) technologies. By introducing autostereoscopic technology and natural gesture interaction, the SRD method provides users with an immersive experience without the need for head-mounted devices, effectively avoiding the discomfort associated with HMDs. Combined with the design philosophy of serious games (SGs), this method integrates knowledge and skills into specific contexts and engaging tasks, significantly enhancing users’ psychological acceptance and learning efficiency.
In the method evaluation section, we developed a virtual experiment application using Unity3D, taking the thermite reaction experiment as an example. Users can operate within a three-dimensional virtual environment, experiencing and completing the experiment firsthand. Through simulated operations, written examinations, and subjective experience evaluations based on the Presence Questionnaire (PQ), Intrinsic Motivation Inventory (IMI), and System Usability Scale (SUS), we compared the effectiveness and user experience of the SRD method with that of traditional BST methods. The results indicate that both the SRD immersive teaching method and the traditional video teaching method have their respective strengths and weaknesses. SRD provides a more realistic and interactive learning experience, suitable for hands-on and in-depth learning, and has significant advantages in enhancing users’ behavioral skills and intrinsic motivation. However, it faces challenges in terms of equipment requirements and usage costs. Traditional BST methods, on the other hand, are suitable for large-scale applications due to their convenience and low cost, but lack interactivity and a sense of operation, making it easy for users to passively receive knowledge. By combining the strengths of these two methods, a hybrid teaching model can be explored, leveraging the immersion and interactivity of SRD, combined with the convenience and low cost of video, to provide users requiring BST with a richer and more efficient experience.
In the future, we anticipate that SRD-related technologies, such as display technology, smart interactive devices, AI technology, and communication technology, will develop further. In terms of display devices, technologies offering higher resolution, a larger field of view, and lower latency will become mainstream. The combination of gesture recognition and speech recognition will provide users with more natural and convenient ways of interaction. AI technology will enable SRD applications to achieve more intelligent and precise interactions. AI-driven real-time rendering and content generation technologies will significantly lower the barrier to content creation for SRD methods, driving the prosperity of the content ecosystem. 5G networks and future 6G network technologies will provide SRD applications with faster, low-latency data transmission services, supporting smoother cloud rendering and real-time interactions.
In addition to chemical experiment training, the SRD method is also likely to be developed and applied at scale in training and education in other fields. Firstly, in medical training, SRD is expected to play an important role in surgical simulation, rehabilitation training, and psychological therapy. It can also be used for remote operation of medical equipment and surgical guidance, enhancing medical efficiency and quality. Secondly, in industrial manufacturing training, virtualization of product design, production process simulation, and equipment maintenance can be realized, improving training and production efficiency. Furthermore, cloud technology offers the potential to share training resources and to reduce the hardware requirements on end-user devices, thereby lowering hardware costs and enabling remote collaborative training to be conducted anytime and anywhere. Combining shared hardware or leasing options can further increase the economic feasibility of expanding the SRD method to other scenarios.