1. Introduction
Miguel [1] contended that exertion games (exergames), such as Wii, PlayStation, and Xbox console games based on hand gesture recognition, have become a trend and are applicable to learning and rehabilitation. DePriest and Barilovits [2] and Hsu [3] indicated that physical interactions in the new generation of video games (e.g., Xbox Kinect digital games) transform conventional learning approaches. Hamzah et al. [4] introduced Kinect as an innovative learning tool that enhances teaching effects and learning experiences. Teaching through games enables learners to learn effectively and facilitates their comprehension and absorption of knowledge.
As a global common language, English has long been a popular research topic, and numerous pertinent studies have been conducted. Furthermore, advancements in technology are conducive to diverse English-learning approaches and research. Katzlinger [5] and Kim [6] noted that most game-based English learning systems, such as large-scale online games and emerging video games, emphasize the enhancement of learning motivation among children. The present study also examined the learning approaches of individual online and mobile English learning games and websites; some examples follow. Animal Safari [7] and Bon Appetit [8] are simple Flash-based games that offer users simple vocabulary to learn, with relatively little learning content. The website Kizclub [9] provides multiple animations and pictures. Kids Dailies [10] is a digital newspaper that offers news videos and English news reading material suitable for children. Kizclub and Kids Dailies provide relatively abundant and robust learning content centered on the educational aspects of text and videos. After investigating existing games, this study found that they rarely provide situated learning and lack the intuitive operation and sense of presence provided by Kinect, which can stimulate learners’ motivation.
The attention, relevance, confidence, and satisfaction (ARCS) model of motivation was proposed by Keller [11] as a framework for stimulating and sustaining students’ learning motivation in learning environments. Students who are motivated in this way can exhibit relatively superior learning outcomes.
Therefore, this study aimed to introduce the Kinect somatosensory interaction technique, situated learning theory, and the ARCS model of motivation into learning vocabulary and acquiring knowledge related to supermarkets. The proposed somatosensory English learning system (SELS) enables learners to engage in learning through physical interactions with virtual characters, events, and objects, thereby facilitating their motivation and learning outcomes. Accordingly, this study proposed the following research questions:
4. System Design
Based on the design concept, the learning content was divided into four categories (see Figure 2). Each of the three categories of fresh produce, food, and fruits and vegetables had three parts, namely animations, vocabulary learning, and quizzes. The fourth category, a 3 × 3 square puzzle learning section, consisted of two parts, namely quizzes and system feedback. We first explain the three parts shared by the first three categories:
4.1. Animation
Animation consisted of the following four parts.
- (1) Design of learners’ characters: When learners marched in place, their animated character moved in a corresponding manner, as shown in Figure 3. When learners wished to learn or review vocabulary, they could raise either their left or right hand, and the animated character turned to face the corresponding direction.
- (2) Design of learning scenarios: Figure 4 shows the opening scene of a learning unit, which comprised three learning sections, namely fresh produce, fruits and vegetables, and food. The aforementioned 3 × 3 square puzzle was only available for selection after learners completed all three sections.
- (3) Design of each learning section: The designated vocabulary was presented in conjunction with the corresponding picture, text, and video in each learning scenario, which was presented through two-dimensional (2D) scrolling. Learners were provided with game instructions. After receiving text or pronunciation hints, learners could use either hand to touch the on-screen image of the grocery item they wished to learn (Figure 5).
- (4) Design of learning sections’ ending scenes: When learners reached the far-right corner of the system, an animated cashier appeared. Learners could enter the somatosensory quiz on grocery-related vocabulary (Figure 6) by touching the animated cashier.
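The hand-raising interaction in item (1) can be illustrated with a minimal sketch. The paper does not describe how a raised hand was detected, so the rule used here (a hand counts as raised when it is higher than the head by a small margin) is an assumption, and the `Joints` structure, the `facing_direction` function, and the `margin` parameter are hypothetical names that do not come from the SELS implementation or the Kinect SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Joints:
    """Hypothetical per-frame joint heights in metres (y grows upward)."""
    head_y: float
    left_hand_y: float
    right_hand_y: float


def facing_direction(joints: Joints, margin: float = 0.05) -> Optional[str]:
    """Return "left" or "right" when exactly one hand is raised above the head.

    The "hand above head by a margin" rule is an assumption made for this
    sketch; the source only states that raising a hand turns the character.
    """
    left_up = joints.left_hand_y > joints.head_y + margin
    right_up = joints.right_hand_y > joints.head_y + margin
    if left_up == right_up:  # neither hand or both hands raised: no turn
        return None
    return "left" if left_up else "right"
```

Returning no direction when both hands are raised is a design choice of the sketch that avoids ambiguous turns; the actual system may have handled that case differently.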
4.2. Vocabulary Learning
The vocabulary items used as learning material were divided into three departments: fresh produce, fruits and vegetables, and food. Each vocabulary item was presented with a corresponding picture. When learners touched the picture with their hands, the corresponding Chinese term, English word, and pronunciation appeared.
Table 1 shows the vocabulary items for the fruits and vegetables department.
4.3. Quizzes
A small quiz after each unit allowed learners to review their acquired knowledge. Five grocery items were randomly selected from a particular department. Each question displayed a picture of the grocery item along with four answer options. Learners selected the answers by touching the option button. Sound effects provided feedback to learners to indicate whether each answer was correct or incorrect. Learners had to provide three or more correct answers to proceed to other learning units; otherwise, they had to return to the learning section to practice and retake the quiz.
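The unit quiz rules described above (five randomly selected items, four answer options each, and at least three correct answers to proceed) can be sketched as follows. This is an illustrative sketch rather than the authors’ implementation: the function names are hypothetical, and drawing the three wrong options from the same department is an assumption, because the paper does not say where the distractors came from.

```python
import random
from typing import Dict, List, Optional

QUESTIONS_PER_QUIZ = 5    # five grocery items per quiz (from the text)
OPTIONS_PER_QUESTION = 4  # four answer options per question (from the text)
PASS_THRESHOLD = 3        # three or more correct answers to proceed (from the text)


def build_quiz(vocabulary: List[str],
               rng: Optional[random.Random] = None) -> List[Dict]:
    """Pick five distinct target items and give each one four shuffled options.

    Assumption: distractors are drawn from the same department vocabulary,
    which must therefore contain at least five distinct items.
    """
    rng = rng or random.Random()
    quiz = []
    for target in rng.sample(vocabulary, QUESTIONS_PER_QUIZ):
        others = [word for word in vocabulary if word != target]
        options = rng.sample(others, OPTIONS_PER_QUESTION - 1) + [target]
        rng.shuffle(options)
        quiz.append({"answer": target, "options": options})
    return quiz


def passed(num_correct: int) -> bool:
    """A learner proceeds to other learning units with three or more correct answers."""
    return num_correct >= PASS_THRESHOLD
```

For example, `build_quiz(["apple", "banana", "carrot", "grape", "onion", "tomato"])` returns five questions; the word list is a placeholder and is not taken from Table 1.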
The 3 × 3 square puzzle contained animation-based quiz questions. After completing all three learning sections, learners could enter this game-based learning section to assess their learning outcomes. Learners who completed game levels using the Kinect somatosensory system gained a sense of accomplishment.
- (1) Quiz: This quiz game was presented in a 3 × 3 square puzzle format (Figure 7). A question appeared in the middle cell and the answer options appeared in the remaining eight cells for learners to select. The question automatically changed every 5 s and the system calculated the number of questions a learner answered correctly within the time limit of 1.5 min. During the quiz, learners were provided with text or pronunciation hints. Then, they could move either of their hands to touch the item button on the main screen that corresponded to the English vocabulary presented in the upper-left corner. Answering 15 or more questions correctly within the time limit indicated that a learner had successfully completed the level. The quiz question presented in Figure 7 is “Chocolate”.
- (2) System feedback: During the SELS 3 × 3 square puzzle game, both correct and incorrect answers triggered corresponding sound effects. Furthermore, the animated character raised signs that corresponded to correct (O) or incorrect (X) answers (Figure 8).
Learners who correctly answered 15 or more questions successfully completed the level, marking the end of the SELS 3 × 3 square puzzle learning activity (Figure 9).
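The timing and layout rules of the 3 × 3 square puzzle can be made concrete with a short sketch. The constants come from the description above; the function names, the row-major cell indexing, and the shuffling of options are assumptions made for illustration. The sketch also assumes that a question changes only on the 5 s timer, in which case at most 18 questions fit into the 1.5 min limit and a learner can miss at most three of them and still complete the level.

```python
import random
from typing import List, Optional

QUESTION_INTERVAL_S = 5  # the question changes every 5 s (from the text)
TIME_LIMIT_S = 90        # time limit of 1.5 min (from the text)
PASS_THRESHOLD = 15      # 15 or more correct answers complete the level
GRID_CELLS = 9           # 3 x 3 square puzzle
CENTER_CELL = 4          # row-major index of the middle cell (assumed indexing)


def questions_shown() -> int:
    """Number of questions shown if they change only on the 5 s timer."""
    return TIME_LIMIT_S // QUESTION_INTERVAL_S


def layout_grid(question: str, options: List[str],
                rng: Optional[random.Random] = None) -> List[str]:
    """Place the question in the middle cell and the eight options around it."""
    if len(options) != GRID_CELLS - 1:
        raise ValueError("exactly eight answer options are required")
    rng = rng or random.Random()
    shuffled = options[:]  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:CENTER_CELL] + [question] + shuffled[CENTER_CELL:]


def level_completed(num_correct: int) -> bool:
    """The level is completed with 15 or more correct answers."""
    return num_correct >= PASS_THRESHOLD
```

Under these assumptions `questions_shown()` evaluates to 18, which shows how demanding the pass threshold of 15 is relative to the number of questions available.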
6. Discussion and Conclusions
The analysis results indicated the following findings. In terms of attention, items A1 and A2 received the highest mean values among all ARCS-A items. Item A2 implied that learners enjoyed the aesthetics and scenarios of the game and agreed that it increased their interest in learning. The elementary school students exhibited a high level of acceptance of the system’s operational design and theme, both of which enabled them to concentrate on learning. Item A1 revealed that because over half of the learners were relatively inexperienced with somatosensory interfaces (controlling an animated character through body movements was a novel concept for them), they were extremely focused when using Kinect. In terms of relevance, items R1 and R3 received the highest mean scores among ARCS-R items, indicating that the learners were immersed in the supermarket scenario when learning with Kinect. That is, the integration of Kinect with situated learning stimulated students’ interest in learning. Seeing these grocery items in subsequent real-world scenarios may remind the learners of the corresponding English vocabulary, which would demonstrate learning outcomes. In terms of confidence, items C2 and C1 reached the highest mean scores among ARCS-C items. Item C1 implied that the adoption of Kinect enhanced learners’ motivation to learn; item C2 implied that learners preferred Kinect to books or mobile games. Therefore, the learners’ confidence was boosted when using SELS, which is conducive to learning. Regarding satisfaction, items S2 and S5 received the highest mean scores among ARCS-S items, implying that learners could memorize these vocabulary items in order to play the games. Learners gained deep satisfaction from successfully completing the game; the system therefore became acceptable to them and generated an intention to continue playing. Thus, the game designed for the learning system enhanced learners’ motivation to learn and increased their learning outcomes.
In sum, the proposed system improved learners’ motivation to learn English and strengthened their attention and concentration, enabling them to create associations between vocabulary and grocery items for effective memorization. Moreover, boosting learners’ self-confidence may have facilitated their learning and created a sense of satisfaction.
This study nevertheless had some limitations. Regarding system development, we discovered during the experiment that the system occasionally ceased operating. Furthermore, several learners responded that the font was too small and not clear enough, and that the descriptions were excessively long. Therefore, increasing the stability of the game and revising the instructions could provide learners with a clearer and smoother user experience. This study employed the Kinect for Windows software development kit (SDK) 1.8 to develop the proposed system. Considering that Microsoft has released Kinect for Windows SDK 2.0 (or higher versions), subsequent developments of Kinect-based learning systems are advised to employ the latest version to benefit from its new features, allowing superior material design and system smoothness. In addition, this study only utilized 2D space for developing the proposed system (i.e., learners used actual body movements to interact with virtual 2D characters, events, and objects). Three-dimensional (3D) space should be developed for somatosensory learning in the future: 3D animated graphics can be integrated into the system design to cultivate a more realistic and vivid situated learning scenario, thereby allowing learners to feel as if they are learning in an actual supermarket. Regarding learning materials, nouns and verbs are equally important parts of the vocabulary that competent language users need to acquire, but this study targeted only the learning of nouns. Future work can therefore examine the learning effectiveness of verbs. Regarding methodology, future work could explore different research methodologies and designs, such as design-based research [24], that are difficult or impossible to replicate with current practice strategies, with the aim of improving student learning outcomes.
In summary, the study results revealed the following findings. In terms of learning outcomes, the pre- and posttest scores of the experimental and control groups were analyzed by performing paired and independent sample t-tests. The results indicated that the difference between the pretest scores of the two groups was nonsignificant, implying that learners from both groups had similar levels of English proficiency. The posttest scores of both groups differed significantly from their pretest scores, which shows that both the somatosensory and the conventional learning approaches increased learning outcomes. However, comparing the extent of progress indicated that the proposed system improved learning outcomes more effectively. Therefore, the developed system improved the learners’ learning outcomes. In addition, in terms of learning motivation, the analysis of the questionnaire showed that hypotheses H1 to H4 were all supported, indicating that the SELS enhanced the learners’ learning motivation.
Therefore, introducing Kinect sensor-based human–machine interactions and situated learning to learning English supermarket-related vocabulary allowed learners to immerse themselves in the scenarios through interactions with virtual characters, events, and objects, using actual physical movements. As a result, the learners became more motivated to learn and developed a desire for learning English vocabulary related to supermarkets, thereby transforming their understanding of grocery items into knowledge. Thus, the Kinect-based SELS increased learners’ motivation and understanding of teaching materials, stimulating their interest in studying English vocabulary to enhance learning outcomes.