*Article* **Usability and Engagement Study for a Serious Virtual Reality Game of Lunar Exploration Missions**

#### **Lizhou Cao <sup>1,</sup>\*, Chao Peng <sup>1</sup> and Jeffrey T. Hansberger <sup>2</sup>**


Received: 29 June 2019; Accepted: 29 September 2019; Published: 3 October 2019

**Abstract:** Virtual reality (VR) technologies have opened new possibilities for creating engaging educational games. This paper presents a serious VR game that immerses players in the activities of lunar exploration missions in a virtual environment. We designed and implemented the VR game with the goal of increasing players' interest in space science. The game motivates players to learn more about historical facts of space missions that astronauts performed on the Moon in the 1970s. We studied the usability and engagement of the game through user experience in both VR and non-VR versions of the game. The experimental results show that the VR version improved players' engagement and enhanced their interest in learning more about the events of lunar exploration.

**Keywords:** serious game; usability; game engagement; virtual reality

#### **1. Introduction**

Since the advent of head-mounted displays (HMDs), virtual reality (VR) technologies have been widely used in digital media and entertainment. The low cost of HMDs has made VR accessible to the public, and many VR games using HMDs and handheld motion controllers have been released in recent years. Games developed with VR technologies can provide a highly immersive experience [1]. Serious games are "(digital) games used for purposes other than mere entertainment" [2]. They are used in many areas such as education, healthcare, ecology, and scientific research [3]. Adopting VR technologies in serious games can simulate a learning or training environment that would otherwise be impossible for people to access in the physical world.

In this work, we design and implement a serious VR game that immerses players in the activities of lunar exploration missions in a virtual environment. The game is based on the historical events of the Apollo 16 mission in the 1970s, in which astronauts landed on the Moon, drove the lunar rover, and eventually returned to Earth. In this game, we implement the planning, preparing, and driving activities of the Apollo 16 mission by utilizing VR technologies and handheld motion controllers. In our previous work, we developed a similar serious game [4], but it was a non-VR version playable with a single-screen display and keyboard–mouse/joystick inputs. Beyond the contributions in game design and development, we investigate how VR technology affects usability and game engagement compared to the non-VR version. Our study utilized a between-subjects design, where half of the participants played the non-VR version of the game and the other half played the VR version. We used the Game Engagement Questionnaire (GEQ) [5] and an interview questionnaire to measure their levels of engagement. The results show that the VR version of the lunar roving game took longer for participants to finish but enhanced game engagement and participants' motivation to learn about the events of lunar exploration.

The paper is organized as follows. Section 2 describes existing work on VR-based serious games. Section 3 describes the background of lunar exploration missions by The National Aeronautics and Space Administration (NASA) in the United States in the late 1960s and early 1970s and their influence on education. Section 4 discusses the design details and gameplay mechanics of the VR game. Section 5 describes the design of the usability and engagement study. Section 6 presents the analysis results, and we conclude the work in Section 7.

#### **2. Existing Work in Serious VR Games**

Our Lunar Exploration Game is a serious VR game with the goal of motivating players to learn historical events of the Apollo 16 mission. Serious games have been applied in many domains such as education, healthcare, training, and scientific exploration. In Section 2.1, we review papers that discuss serious games used in different domains. In Section 2.2, we review studies evaluating the effectiveness of serious VR games, with a focus on engagement, usability, and learning outcomes. In Section 2.3, we review the theories underpinning usability and engagement studies in serious VR games, which have impacted the design of our experiment.

#### *2.1. Applied Domains of Serious VR Games*

Susi et al. [2] gave a review of serious games, covering their definition, domains, history, and market status. Djaouti et al. [3] classified serious games with the Gameplay/Purpose/Scope (G/P/S) model, which captures both the "serious" and the "game" dimensions. Mikropoulos and Natsis [6] reviewed educational virtual environments from 1999 to 2009. They identified several features of VR for learning and discussed why VR can be useful in learning situations; the most important reason they identified is that VR allows users to experience situations that are inaccessible or dangerous in the real world. Potkonjak et al. [7] presented a review of virtual laboratories for science, technology, and engineering education. They argued that virtual labs can offer advantages that real labs may not, including cost-efficiency, flexibility, multi-user support, and access to dangerous or otherwise inaccessible situations. Alhalabi [8] discussed the effectiveness of VR systems for engineering education by evaluating traditional education approaches, Corner Cave systems, and HMD systems. They found that both Corner Cave and HMD systems improved students' performance over traditional approaches, with HMD systems showing the larger effect.

However, researchers have shown that VR technology does not always play a positive role. Jensen and Konradsen [9] presented a review of the use of HMDs in education and training. They found that using an HMD led to better results in only a small number of cases; beyond those, using an HMD offered no advantage or even led to worse performance than a non-VR setting. Freina and Ott [10] reviewed the use of immersive virtual environments in education and discussed the advantages and drawbacks of using VR in children's education and in rehabilitation for cognitive disabilities. Virvou and Katsionis [11] discussed the usability of VR games for education in the classroom. They argued that learning outcomes differ among students because students usually have different game-playing backgrounds, but in most cases these differences do not influence their motivation.

#### *2.2. Use Case Studies of Serious VR Games*

Mei and Sheng [12] presented a system for virtual hospital-situated learning, used primarily for learning human organ anatomy. They found that situated learning can improve users' motivation. Cheng et al. [13] presented a VR game to teach language and culture. Their results showed that the VR game could increase user engagement but did not show a significant improvement in language learning outcomes. Adamo-Villani and Wilbur [14] presented a study of a virtual learning environment for deaf and hard-of-hearing children. They compared an immersive version and a non-immersive version; however, the results of the study did not show a significant difference between the versions. Parmar et al. [15] compared an HMD-based metaphor viewer with a desktop-based display for training users in electrical circuitry. Their results showed a significant learning benefit when using the HMD. They argued that participants may perform better and enjoy using the HMD version, but they did not provide an evaluation to support this argument. Greenwald et al. [16] presented a usability study comparing learning in VR with learning on a traditional 2D screen. The comparison did not show a significant difference in quantitative learning measurements, though the completion time of the VR version was always longer than that of the non-VR version. Olmos et al. [17] provided an educational platform to study the influence of emotional induction and level of immersion on learning motivation. They presented a usability study comparing an immersive version with an HMD and a low-immersive version with a tablet. The results indicated that the immersive condition and positive emotional induction can influence knowledge retention. Zizza et al. [18] designed and implemented a networked multi-module VR learning environment. They evaluated user experience in the VR version by comparing it to a desktop version. Although they stated that the feedback on the VR version was positive, the interview questions used to obtain that feedback could have been more comprehensive. Buttussi and Chittaro [19] conducted a two-week study of three display settings for learning: a desktop VR platform, an HMD with a narrow field of view and a 3-DOF tracker, and an HMD with a wide field of view and a 6-DOF tracker. They found that HMDs can increase user engagement; however, in their study, the HMDs did not lead to a significant increase in knowledge or self-efficacy.

In summary, the use of VR technologies for serious games is a young area. In different use cases, people may have different experiences and opinions about VR, so there is no consensus on its benefits. Our literature search did not find any studies on the adoption of VR technologies for learning the historical events of lunar exploration or other events in space science.

#### *2.3. Usability and Engagement Studies in Serious VR Games*

Bowman et al. [20] discussed several evaluation methods according to three key characteristics: involvement of representative users, context of evaluation, and types of results produced. Their results suggest that an evaluation should focus on what actually changes between the VR system and the non-VR system. Based on their findings, the usability parameters recorded in our experiment focus on the differences between the VR and non-VR versions of the game. Sutcliffe and Kaur [21] studied the walkthrough method for evaluating VR systems, whose key points are a goal-oriented task, exploration and navigation, and system initiative. In our experiment, the tasks that participants completed were designed based on the walkthrough method.

When measuring usability differences between a VR system and a non-VR system, researchers have employed the between-subjects study design [22,23]. McMahan et al. [22] noted that this design increases experimental control and reduces confounding variables. In our experiment, we used a between-subjects design so that each participant experienced only a single version of the game.

Game engagement is one of the factors that affect learning outcomes and the motivation to play games [24]. Brockmyer et al. [5] developed the Game Engagement Questionnaire (GEQ) to measure the engagement level of players playing a video game. The GEQ has been used in many serious game studies [25–28]. Researchers have also used the GEQ to study VR games [28–30]. In our experiment, we used the GEQ to measure and analyze the engagement level of participants in both VR and non-VR versions of the lunar roving game.

#### **3. Background of NASA-Inspired Serious Games**

In the 1960s, students were inspired by the success of the lunar exploration missions performed by NASA astronauts and engineers. Photographs and videos of the missions and NASA-developed technologies motivated students to pursue science, technology, engineering, or math (STEM) degrees. Through the 1960s, the number of students enrolled in science and engineering increased significantly. According to a report by the National Science Foundation [31] and the article by Markovich [32], the percentage of bachelor's degrees awarded in science and engineering fields peaked in the late 1960s for the period of 1966–2010.

The experience of learning about lunar exploration missions is usually through a passive setting where information is presented in the form of reading material, speaker talks, videos, museum lectures, or exhibitions. For example, people may explore NASA's website to find a wealth of materials that explain the past and future of lunar exploration. Those materials provide both scientific facts and public narratives for science engagement. In such a setting, learners passively receive what the assigned material says. In contrast, an active setting gives learners an opportunity to take a participatory role as they absorb the knowledge [33]. For example, some summer camp programs promote active learning experiences during a scheduled period of time (e.g., the Space Camp program at the U.S. Space & Rocket Center in Huntsville, Alabama, U.S.; website: https://www.spacecamp.com). However, those summer camp programs usually require a substantial cost for travel or lodging.

In contrast to attending Space Camp, a game of lunar exploration is easier to set up and provides an active learning environment at a lower cost. The complex content of lunar exploration missions can be represented graphically in the game, and the game allows the learner to interact with the content using precisely simulated navigation systems and driving controls based on the operations in real missions. NASA's Eyes [34] is an interactive application developed at NASA's Jet Propulsion Laboratory. It is an educational tool that allows the user to interact with the Earth, solar system, and spacecraft. To maintain a low computational cost on geometry rendering, the Earth and Moon in NASA's Eyes are represented as spheres wrapped with textures, so there is a lack of visual fidelity on geographical features. Another game-based example is NASA Space Place [35], which is a NASA science webpage for kids. NASA Space Place has a few games related to the Moon, rovers, and missions. The games are 2D, cartoonish, and provide only introductory or conceptual information.

NASA supports game developers in making NASA-inspired 3D games and has released a variety of 3D models and textures of spacecraft, landing sites, asteroids, etc. [36]. One 3D game is the Station Spacewalk Game [37]. It contains a 3D model of the International Space Station that allows the user to experience NASA repair work on the station and learn how the station is assembled. The game runs on a standalone PC or on the web with a Flash plug-in, but it does not work with VR devices. NASA also provides a web-based interactive tool [38] that shows the historical landing sites on the Moon and briefly explains the activities that astronauts conducted there. It helps users understand the history of lunar exploration, but it does not offer participatory experiences. A few years ago, the Immersive VR Education company [39] released an Apollo 11 VR game, a documentary-style VR application that presents historical lunar exploration events through a mix of original audio and video. The National Naval Aviation Museum adopted a similar concept to exhibit the Apollo 11 journey using VR technologies [40] along with a physical Apollo command module. Audiences are immersed in activities such as boarding the rocket, experiencing the launch, witnessing travel through space, and landing on the Moon. Peng et al. [4] gamified the Moon mission and created three playing phases. They presented the lunar roving game on a PC platform with a standard screen. They performed a usability study with 30 participants and provided a descriptive analysis of the effectiveness of the game for learning about lunar exploration activities. The results showed that the game enhanced user engagement in NASA-inspired events and increased users' interest in space science. However, their work did not provide a statistical comparison between VR-based and non-VR gameplay.

#### **4. Game Design Methodology**

The game design concept in this work is similar to that of the lunar roving game presented by Peng et al. [4]. In this work, we develop a VR version of the lunar roving game, which is compared to the non-VR version presented by Peng et al. [4] in terms of usability and game engagement. The game is composed of three playing phases: planning, preparing, and driving. In the planning phase, the player is guided to create the driving route by placing markers on a virtual lunar terrain map. In the preparing phase, the player selects a subset of devices from the inventory and then loads them onto the lunar rover for later use on the route. In the driving phase, the player drives the rover to the end of the route. During the drive, the player operates a navigation system to determine the driving direction, control the rover's speed, and avoid overheating. The game presented by Peng et al. [4] is a non-VR video game running on a PC with a single screen. The player uses a keyboard/mouse to place markers and select scientific instruments and uses a joystick to drive the rover. In this work, we convert the non-VR version of the game into a VR game. The VR game runs on a PC with the content rendered to a head-mounted display so that the player feels immersed in a virtual lunar environment. The player plays the VR game using handheld motion controllers.

In general, VR games have much in common with non-VR video games. For example, both produce interactive experiences, require real-time rendering, and follow largely the same development process. In contrast to the keyboard and mouse used in non-VR games, VR head trackers and handheld controllers give the player greater freedom of movement for in-game actions. In this section, we discuss the design differences between the non-VR and VR versions of the lunar roving game: the ways the player relates to the environment and interface, the input modalities for gameplay, and the engagement level of each playing phase.

#### *4.1. Planning Phase*

The lunar roving game requires the player to explore a region of lunar terrain on a planned driving route. In the planning phase, the player is asked to identify station stops on the lunar terrain map based on the longitude and latitude values. Both the non-VR and VR versions of the game allow the player to place markers/tokens on the map.

In the non-VR version of the game, the player places markers on the map through a 2D user interface. As shown in Figure 1a, the interface contains a cartoon-style commander who instructs the player how to use the interface. The commander's instructions and longitude and latitude values are displayed as rolling texts in a dialogue box. The player places a marker by mouse-clicking on the map. If the player does not place the marker at the correct location, he or she will be asked to repeat the placement operation for the same marker until it is placed correctly on the map. When all markers are placed correctly, a calculator appears in the interface. The commander will ask the player to calculate the travel time based on the given distance and maximum speed. To do the calculation, the player can enter numbers by mouse-clicking the number buttons on the calculator.
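The travel-time calculation asked of the player is simple arithmetic (time = distance / maximum speed). The following sketch illustrates it with hypothetical values, not the game's actual mission data:

```python
def travel_time_minutes(distance_km: float, max_speed_kmh: float) -> float:
    """Estimated driving time in minutes at the rover's maximum speed."""
    return distance_km / max_speed_kmh * 60.0

# Hypothetical example: a 4.5 km leg at a 13 km/h top speed
print(round(travel_time_minutes(4.5, 13.0), 1))  # about 20.8 minutes
```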

**Figure 1.** Screenshots of the planning phase. (**a**) is the user interface for the non-virtual reality (VR) version of the game, and (**b**) is the 3D environment for the VR version of the game.

In the VR version of the game, the player wears an Oculus Rift headset, which provides a 360-degree interior view of the spacecraft. As shown in Figure 1b, the player can see the lunar terrain map on a desk, with four 3D tokens next to the map. The instructions from the commander and the longitude and latitude values are displayed on a virtual monitor in front of the player. In the VR game, the player holds a pair of Oculus Touch controllers, which provide hand presence in the virtual environment. The player can press a button on the Touch controllers to trigger a gameplay event without any interruption from physical hand movements. To place a token on the map, the player points at the token using the controller and then holds a specific button on the controller to pick up the token. The token can then be moved across the desk by moving the controller while holding the button, and it is dropped on the map when the button is released. If the token is not placed correctly, the player is asked to pick it up again and repeat the placement operation until it is placed correctly. When placing a token, the player can lean forward to have a closer look at the map or lean back to regain the full view of the objects on the desk. After all tokens are placed correctly, more instructions from the commander roll onto the virtual monitor; at the same time, a 3D calculator appears on the left side of the desk. To use the calculator, the player points at the number buttons of the calculator using the Touch controller and then presses a specific button on the controller to push the number button.
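The place-and-verify interaction in both versions amounts to a retry loop that counts placement attempts. The following is a minimal sketch with hypothetical helper names and a hypothetical placement tolerance; the game's actual implementation is not detailed in this paper:

```python
def within_tolerance(placed, target, tol=0.05):
    """Check whether a dropped token is close enough to its target map position."""
    return abs(placed[0] - target[0]) <= tol and abs(placed[1] - target[1]) <= tol

def place_token(target, drops):
    """Consume simulated drop positions until one lands within tolerance.

    Returns the number of attempts, mirroring the 'plmn' usability parameter.
    """
    attempts = 0
    for pos in drops:
        attempts += 1
        if within_tolerance(pos, target):
            break
    return attempts

# Two misses followed by a correct placement -> 3 attempts
print(place_token((0.5, 0.5), [(0.7, 0.5), (0.56, 0.5), (0.51, 0.52)]))  # 3
```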

#### *4.2. Preparing Phase*

In the original Apollo mission, astronauts needed to load communication and scientific instruments onto the lunar rover and use them at station stops on the route. In the preparing phase of this game, the commander describes the related scientific tasks. Based on the description, the player determines what instruments to choose from the inventory and loads them onto the rover. The participant is required to finish a total of five tasks. The first task is to put two astronauts onto the lunar rover. The second task is to load two communication devices. The third task is to load a pallet frame that will hold devices and tools. The fourth task is to load a monitoring device that can be controlled remotely by Houston. The fifth task is to load a camera device to record documentary photos and videos.

As shown in Figure 2a, the non-VR version of the game consists of a 2D user interface. The commander appears in the top-left corner of the screen. The description of related scientific tasks from the commander appears in a dialog box. The player can explore the inventory in the middle panel of the interface. When mouse-clicking on an item in the inventory, the 3D model of the corresponding device will appear in the top part of the right panel. At the same time, the usage of the device will appear in the bottom part of the right panel. When the player checks the checkbox of the item in the inventory, the device will show up on the rover at the location it should be loaded. A 3D view of the lunar rover is shown in the left panel of the interface. The player can rotate the 3D view to look at the rover from different angles.

**Figure 2.** Screenshots of the preparing phase. (**a**) is the user interface for the non-VR version of the game, and (**b**) is the 3D user interface for the VR version of the game.

The VR version of the game contains a 3D user interface for the preparing phase. Different from the non-VR version, the description of science tasks appears on a virtual display device held in the player's left hand. The player can move the display device closer or farther away from the HMD's viewpoint by moving the Touch controller in the left hand. The device icons and usages are shown in a large curved virtual display. As discussed in Cao et al. [41], a large curved display allows the player to interact with the content in a wide field of view and supports a high level of perceived immersion. The Touch controller in the player's right hand is used to scroll the list of device icons up and down. To view the details of a device, the player points to the device icon and presses a specific button on the controller so that the 3D model of the device and its usage will appear in the right section of the curved display. The player can hold a specific button on the controller to pop the 3D model of the device out of the curved display and move it in 3D space using the controller. The player can drop the device onto the lunar rover, as shown in Figure 2b.

#### *4.3. Driving Phase*

In the driving phase, the player drives the lunar rover to explore a geographical region of the Moon. Both the non-VR and VR versions of the game use the same 3D model of the lunar terrain, which is converted from Lunar Reconnaissance Orbiter (LRO) data. The player needs to operate the rover's navigation system to check the driving direction, monitor the speed, and avoid overheating. In particular, the game requires the player to use three navigation devices. The first is a heading device that displays the driving direction based on the shadow cast on the heading dial. The second is a speed meter that displays the rover's current driving speed. The third is a temperature meter that displays the heat levels of the batteries and motors. The rover has to slow down or stop if the temperature meter indicates that the batteries or motors are overheating.
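The overheating rule (slow down or stop when the temperature meter reads too high) can be sketched as follows; the warning and maximum temperature thresholds below are hypothetical, not values from the game:

```python
def throttle_limit(battery_temp_c: float, motor_temp_c: float,
                   warn_c: float = 60.0, max_c: float = 80.0) -> float:
    """Return the allowed throttle fraction given battery and motor temperatures.

    Below warn_c: full throttle. Between warn_c and max_c: linearly reduced.
    At or above max_c: the rover must stop.
    """
    hottest = max(battery_temp_c, motor_temp_c)
    if hottest >= max_c:
        return 0.0          # overheated: stop
    if hottest <= warn_c:
        return 1.0          # normal operation
    return (max_c - hottest) / (max_c - warn_c)  # slow down proportionally

print(throttle_limit(50.0, 55.0))  # 1.0
print(throttle_limit(70.0, 40.0))  # 0.5
print(throttle_limit(85.0, 40.0))  # 0.0
```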

The interaction with the navigation system is different between the non-VR and VR versions of the game. In the non-VR version, as shown in Figure 3a, a vertical panel is created on the right side of the screen to host the heading device, speed meter, and temperature meter. The 3D model next to this panel is the control console, on which the navigation devices were installed originally. In the VR version, as shown in Figure 3b, the player operates the navigation devices only through the control console on the rover. With the immersive view provided by the HMD, the player can move his or her head closer to the control console to check readings on the devices, just like what astronauts would do when driving the lunar rover.

The real lunar rover uses a T-shaped steering controller for driving. This controller, as a mechanical part of the control console, is located between two seats. It was operated by the astronaut sitting on the left seat. Moving the controller left or right turns the rover to the left or right. Moving it forward increases the speed, and moving it backward decreases the speed. To mimic such special steering control, both versions of the game use a flight stick as the input device, as depicted in Figure 3d.

The camera control is implemented differently in the non-VR and VR versions of the game. In the non-VR version, the default position of the camera is behind the rover, and the camera follows the rover's movement while the player is driving. The player can press a button on the flight stick to switch among a few camera views, such as a side view of the rover and a front view. In the VR version, the camera is controlled by the player's head movement, as shown in Figure 3c. The player has only the first-person view of the astronaut sitting on the driver's side, and the VR version does not allow view switching.

Both the non-VR and VR versions of the game contain a mini map that shows the location of the Lunar Module (the landing location), locations of the station stops, and geological features of the Moon. In the era of the Apollo missions, there was no GPS system to track where the rover was on the Moon. Therefore, the mini map in the game does not show the location of the lunar rover. However, in the case that the player feels lost, he or she can press a specific button on the flight stick to show the rover's location and its current moving direction only for a few seconds. This can be used a maximum of five times. In the non-VR version of the game, the mini map is at the bottom-left corner, and it is always visible to the player. In the VR version of the game, as shown in Figure 3d, the mini map becomes visible when a button is pressed.
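The limited location reveal described above (visible for a few seconds, at most five uses) can be sketched with a counter and a timer. The 3 s duration below is a hypothetical stand-in for "a few seconds":

```python
class LocationReveal:
    """Track the mini map's limited rover-location reveals (max five uses)."""

    def __init__(self, max_uses: int = 5, duration_s: float = 3.0):
        self.remaining = max_uses
        self.duration_s = duration_s      # how long the location stays visible
        self.visible_until = 0.0

    def request(self, now_s: float) -> bool:
        """Player pressed the reveal button; return True if a reveal starts."""
        if self.remaining <= 0:
            return False                  # all five uses spent
        self.remaining -= 1
        self.visible_until = now_s + self.duration_s
        return True

    def is_visible(self, now_s: float) -> bool:
        return now_s < self.visible_until

reveal = LocationReveal()
assert reveal.request(now_s=10.0)         # first use succeeds
assert reveal.is_visible(now_s=12.0)      # still within the 3 s window
assert not reveal.is_visible(now_s=13.5)  # window has elapsed
```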

**Figure 3.** Screenshots and control examples of the driving phase. (**a**) is the 3D environment for the non-VR version of the game, with the navigation system on the right side of the screen, (**b**) is the 3D environment of the first-person view for the VR version of the game, (**c**) shows the head movement with the head-mounted display (HMD) for the readings of the navigation devices in the control console, and (**d**) shows the operation with the flight stick to bring up the mini map in the VR version of the game.

#### **5. Usability and Engagement Study**

We designed and performed a study to understand the usability and game engagement of the non-VR and the VR versions of the game. This section describes the design of our study.

We set up the usability and engagement study in a research laboratory at the university. The study took about 30 min for each participant. After arrival, participants first signed the consent form, and each of them was then given a unique user ID. Before playing the game, participants answered a demographic questionnaire about their experience with gaming, virtual reality, space science, and serious games. We utilized a between-subjects design in which participants were randomly assigned to play either the VR or the non-VR version of the game. Each participant was asked to play the same version twice (two trials). After playing the game, they were asked to answer the GEQ [5] (shown in Table 1) and the interview questionnaire (shown in Table 2). We recruited a total of 30 participants for the study, 20 males and 10 females: 10 males and 5 females played the non-VR version, and the other 10 males and 5 females played the VR version. Their ages ranged from 20 to 42 years, with an average age of 29.2 years. Twenty-six participants were from STEM majors. Figure 4 shows pictures of participants playing the game during the study. We defined several usability parameters and recorded their values during the study to evaluate user performance. Table 3 lists the parameters used in the planning, preparing, and driving phases.
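Balanced random assignment of this kind (10 males and 5 females per condition) is typically implemented by shuffling within each gender stratum and splitting each in half; a minimal sketch with hypothetical participant IDs:

```python
import random

def assign_conditions(males, females, seed=0):
    """Randomly split participants so each condition gets half of each stratum."""
    rng = random.Random(seed)             # fixed seed for a reproducible example
    groups = {"VR": [], "non-VR": []}
    for stratum in (males, females):
        order = stratum[:]
        rng.shuffle(order)
        half = len(order) // 2
        groups["VR"].extend(order[:half])
        groups["non-VR"].extend(order[half:])
    return groups

males = [f"M{i}" for i in range(1, 21)]    # 20 male participants
females = [f"F{i}" for i in range(1, 11)]  # 10 female participants
groups = assign_conditions(males, females)
print(len(groups["VR"]), len(groups["non-VR"]))  # 15 15
```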


**Table 1.** Game Engagement Questionnaire (GEQ) items [5] and the results of the game engagement evaluation. The rating scale for each GEQ item is 1 (strongly disagree) to 5 (strongly agree).

**Table 2.** Interview questions and the results.


**Table 3.** Usability parameters used in our study.


**Figure 4.** Pictures of participants playing the game during the usability study. The left column shows pictures of participants playing the planning phase. The middle column shows pictures of participants playing the preparing phase. The right column shows pictures of participants playing the driving phase. The first row shows participants playing the non-VR version of the game. The second row shows participants playing the VR version of the game.

#### **6. Evaluation Results**

#### *6.1. User Performance*

To evaluate user gameplay performance, we applied the analysis of variance (ANOVA) statistical model (with a significance level of 0.05) to the values of the usability parameters. Time parameters are reported as minutes and seconds, denoted mm:ss. Table 4 shows the statistical analysis results between the VR and non-VR versions of the game and between the first and second trials in each version. The mean values (denoted *M*) and standard deviation values (denoted *SD*) of the parameters are shown in the "Mean" and "Standard Deviation" columns of Table 4. The "Between-Trial Comparison" and "Between-Version Comparison" columns contain the results produced by the ANOVA model. The following subsections discuss the analysis details for the planning, preparing, and driving phases between the two versions of the game and between the two trials in each version.
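With 15 participants per version, a one-way ANOVA comparing the two versions has degrees of freedom (1, 28), matching the *F*(1, 28) values reported in this section. The following sketch computes the F statistic in plain Python on hypothetical attempt counts (chosen only so that the group means roughly match the reported 7.60 and 4.67; the resulting F and p values are not those of the study):

```python
def one_way_anova(group_a, group_b):
    """One-way ANOVA for two groups; returns (F, df_between, df_within)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    grand = (sum(group_a) + sum(group_b)) / (n_a + n_b)
    # Between-group and within-group sums of squares
    ss_between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
    ss_within = (sum((x - mean_a) ** 2 for x in group_a)
                 + sum((x - mean_b) ** 2 for x in group_b))
    df_between, df_within = 1, n_a + n_b - 2
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical placement-attempt counts for 15 VR and 15 non-VR participants
vr = [9, 7, 8, 6, 10, 7, 8, 9, 6, 7, 8, 7, 9, 8, 5]
non_vr = [5, 4, 6, 3, 5, 4, 5, 6, 4, 5, 3, 5, 4, 6, 5]
f, dfb, dfw = one_way_anova(vr, non_vr)
print(dfb, dfw)  # 1 28
```

In practice this computation is usually delegated to a statistics package rather than written by hand; the manual version here only makes the degrees of freedom explicit.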

#### 6.1.1. Evaluation of the Planning Phase

We analyzed user performance on two tasks in the planning phase. The first task requires the participant to place 4 markers (in the non-VR version) or 4 tokens (in the VR version) on the lunar terrain map based on given longitude and latitude values. The second task requires the participant to use the calculator to calculate the driving time based on the given distance and maximum speed.

The parameter *plmn* recorded the number of attempts that the participant made to place all markers or tokens correctly. As shown in Table 4, in the first trial, the mean number of attempts was 7.60 in VR and 4.67 in non-VR, indicating that placing tokens on the map in the VR environment is more challenging than placing markers on the non-VR 2D interface. The between-version comparison for *plmn* in the first trial shows a significant difference (*F*(1, 28) = 6.04, *p* = 0.02). In the second trial, the mean number of attempts dropped more sharply in the VR version than in the non-VR version (to 4.87 in VR and 4.07 in non-VR), but the difference in *plmn* remained significant. We observed that when placing tokens in VR, participants tended to make repeated small adjustments until reaching the correct position, whereas in non-VR, participants usually moved markers directly to the position they thought was correct. Between the two trials, user performance in the second trial was better than in the first for both versions of the game. As shown in the "Between-Trial Comparison" columns of Table 4, the ANOVA results are *F*(1, 28) = 5.10, *p* = 0.03 in VR and *F*(1, 28) = 4.13, *p* = 0.05 in non-VR, indicating that user performance differed significantly between the two trials.


**Table 4.** Analysis results of user performance.

The parameter *plmt* recorded the total time the participant spent placing markers or tokens. In the first trial, the mean was 03:05 in VR and 01:33 in non-VR; the VR time was thus 1.98 times the non-VR time. In the second trial, the mean was 01:04 in the VR version and 00:51 in non-VR, making the VR time 1.25 times the non-VR time. In each trial, the time differed significantly between the two versions: participants who played the VR version spent longer in the planning phase than those who played the non-VR version. As shown in the "Between-Trial Comparison" columns of Table 4, the value of *plmt* decreased significantly from the first trial to the second in both versions, with *F*(1, 28) = 29.21, *p* < 0.01 in the VR version and *F*(1, 28) = 19.70, *p* < 0.01 in the non-VR version.

The parameter *plmt*<sup>1</sup> recorded the time that the participant spent placing the first marker or token; its value indicates how long the participant took to learn and understand the task requirements and basic operations. In the first trial, the mean value of *plmt*<sup>1</sup> was 01:39 in the VR version and 00:40 in the non-VR version, a significant difference (*F*(1, 28) = 6.04, *p* = 0.02 in the "Between-Version Comparison" column). This indicates that participants who played the VR version spent longer learning how to operate the game than those who played the non-VR version. In the second trial, there was no significant difference in *plmt*<sup>1</sup> between the two versions (00:29 in VR and 00:31 in non-VR). The value of *plmt*<sup>1</sup> decreased significantly from the first trial to the second in both versions; we observed that after the first trial, participants clearly understood the task requirements and basic operations, so there was no learning difficulty in the second trial.

The parameter *plct* recorded the time spent calculating the driving time from the given distance and maximum speed. The results show no significant difference for this parameter between the two versions or between the trials, although we observed that its mean value decreased in both versions from the first trial to the second.

#### 6.1.2. Evaluation of the Preparing Phase

In the preparing phase, participants assembled the lunar rover according to the requirements presented for each of the five tasks, as described in Section 4.2. After adding all the devices that they thought a task required, participants clicked the submission button. If the devices were added correctly, they moved on to the next task; otherwise, they had to select appropriate devices again and resubmit. Note that the order of the five tasks differed between the first and second trials, and the order in which devices were displayed was random, so it most likely differed between the two trials as well. The parameter *prn* recorded the number of submissions the participant made. The results show no significant difference in *prn* between the two versions or between the two trials, although the number of submissions decreased in both versions from the first trial to the second. This is because participants became familiar with the devices in the inventory by the second trial, so they could determine the correct answer more quickly.

For the first trial, the total time (*prt*) that participants spent finishing this phase in the VR version (*M* = 04:11, *SD* = 01:15) was not significantly different from the time spent in the non-VR version (*M* = 03:33, *SD* = 01:22). In the second trial, however, there was a significant difference: the mean value of *prt* was 01:40 in the VR version and 01:05 in the non-VR version (*F*(1, 28) = 12.03, *p* < 0.01). In each version, the value of *prt* decreased significantly from the first trial to the second.

The parameter *prt*<sup>1</sup> recorded the time that the participant spent on the first submission. In each trial, there was no significant difference between the two versions; in each version, the value of *prt*<sup>1</sup> decreased significantly from the first trial to the second.

#### 6.1.3. Evaluation of the Driving Phase

Participants drove through all station stops and returned to the Lunar Module (the landing location). In each trial, the total time that the participant spent finishing this phase (*drt*) was not significantly different between the two versions, as shown in the "Between-Version Comparison" column in Table 4. For example, in the first trial, the mean value of *drt* was 05:36 in the VR version, almost the same as the 05:05 mean in the non-VR version. In each version, the decrease in the value from the first trial to the second was significant, as shown in the "Between-Trial Comparison" columns of Table 4.

In the driving phase, participants had to monitor the battery temperature and motor temperature to avoid overheating. If the rover started overheating, the participant was expected to reduce the speed or stop the rover completely to let it cool down. We defined the parameter *drn* to record the number of times the rover overheated. In the first trial, the mean value of *drn* was 5.53 in the VR version, higher than the 2.93 in the non-VR version, and the difference was significant (*F*(1, 28) = 5.71, *p* = 0.02). This indicates that in the first trial, participants tended to drive more aggressively in VR than in the non-VR version of the game. In the second trial, however, participants who played the VR version drove more calmly than in the first trial, and there was no significant difference in *drn* between the two versions (*F*(1, 28) = 0.07, *p* = 0.79). We observed that in the second trial of the VR version, participants paid more attention to the prompted messages than in their first trial.
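The overheating mechanic described above can be sketched as a simple simulation. Everything below (names, constants, the linear heating/cooling model) is an illustrative assumption on our part; the game's actual thermal model is not specified in the paper.

```python
# Illustrative constants: not the game's actual values.
OVERHEAT_LIMIT = 100.0   # temperature at which an overheat event is counted
HEAT_RATE = 0.8          # heating per unit of speed per simulation tick
COOL_RATE = 5.0          # cooling per tick while (nearly) stopped
AMBIENT = 20.0           # resting temperature

def step_temperature(temp, speed):
    """Advance the motor temperature by one simulation tick."""
    if speed > 1.0:
        return temp + HEAT_RATE * speed       # faster driving heats faster
    return max(AMBIENT, temp - COOL_RATE)     # cool down toward ambient

def count_overheats(speed_profile, temp=AMBIENT):
    """Count the times the temperature crosses the overheat limit (drn)."""
    overheats = 0
    hot = False
    for speed in speed_profile:
        temp = step_temperature(temp, speed)
        if temp >= OVERHEAT_LIMIT and not hot:
            overheats += 1   # count each crossing once, like the drn parameter
            hot = True
        elif temp < OVERHEAT_LIMIT:
            hot = False
    return overheats
```

Under this toy model, an aggressive constant-speed run crosses the limit while a cautious one never heats up, mirroring the first-trial VR versus non-VR contrast in *drn*.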

#### *6.2. Game Engagement*

We employed the GEQ and a post-game interview questionnaire to evaluate game engagement. Table 1 shows the results of the GEQ, which comprises 19 questions. The VR version of the game received better results than the non-VR version on 11 of them. For example, for the question "I feel spaced out", the VR version received a score 47% higher than the non-VR version of the game, and for the question "I lose track of where I am", the VR version scored 52% higher. The VR version of the game also received higher scores on the questions "I feel scared" and "I get wound up"; since these are negative questions, higher scores indicate weaker engagement in the game.

The results of the interview questionnaire are shown in Table 2. The average rating was 8.26/10 for the non-VR version and 8.60/10 for the VR version. The overall rating for the VR version was higher than for the non-VR version, but the difference was not significant. Participants rated the difficulty level of the VR version (4.07/10) higher than that of the non-VR version (3.13/10).

The interview questionnaire asked participants whether they wanted to learn more about the historical events of lunar exploration missions. The VR version of the game received a rating of 8.47/10, significantly higher than the non-VR version's 7.67/10 (*F*(1, 28) = 4.97, *p* = 0.03). Moreover, for the question asking whether they had gained interest in space science, the VR version received a rating of 8.33/10, significantly higher than the 7.60/10 for the non-VR version (*F*(1, 28) = 4.52, *p* = 0.04).

Motion sickness could be an issue in the VR version of the game. The interview questionnaire asked participants whether they felt motion-sick during play; the ratings were 2.60/10 for the non-VR version and 3.90/10 for the VR version. Thus, the VR version is more likely to make participants feel motion-sick. We observed that some participants started feeling motion-sick when they entered the driving phase because of the bumpy lunar terrain surface, which had a negative impact on VR-based game engagement.

In the game, there is a crater near the landing site that is visible on the terrain map in the planning phase and approachable in the driving phase. The interview questionnaire asked participants whether they had noticed the crater and, if so, to describe its location. A total of 16 participants answered correctly: 6 who played the non-VR version and 10 who played the VR version. We observed that participants who played the VR version gained a better understanding of the nearby environment and performed better on tasks of identifying locations and directions.

As we observed in the literature, astronauts in the real lunar exploration missions received mission information either during pre-launch training or through voice communications with HQ. In the game, participants had to read on-screen text to learn the functionality of the scientific instruments and to follow the mission tasks. Reading screen displays is not an intuitive method for this gameplay and may consequently decrease the level of engagement.

#### *6.3. Discussion*

In our experiment, participants playing the VR version of the game needed extra time to become familiar with the VR environment and learn how to operate the handheld controllers, so it took them longer to finish the game. In contrast, participants did not need a learning or tutorial session for the keyboard and mouse in the non-VR version. We noticed that participants took about 22 min to finish the VR version of the game and 17.5 min to finish the non-VR version. We also found that performing an operation in VR usually takes longer than performing the same operation in the non-VR version: in the non-VR version, participants moved the mouse and clicked a button to complete an in-game event, whereas in the VR version, they had to point at an object, grab it, and drag it to a certain place. This may cause participants to spend longer completing the game in VR. In the second trial of the study, participants usually performed better than in the first trial, in both the VR and non-VR versions. The performance improvement was more pronounced in the VR version, and the performance gap between the two versions was largely reduced in the second trial.

We noticed that even though some participants experienced motion sickness, they still provided positive feedback on their overall gameplay experience. Participants who played the VR version also gave higher scores than the participants who played the non-VR version for the interview question about how well the game promotes interest in space science.

#### **7. Conclusions and Future Work**

In this paper, we designed, implemented, and studied a serious VR game of lunar exploration missions. We demonstrated the design differences between the VR and non-VR versions and discussed the usability and engagement study comparing them. The results of our experiment indicate that, in our game, the non-VR version leads to better user performance than the VR version: users usually needed extra time to become familiar with the VR environment and the handheld controllers. After practicing in VR, however, the performance gap with the non-VR version narrowed significantly. More importantly, the VR version improved game engagement and enhanced players' interest in learning more about space science and the historical events of lunar exploration.

In the future, we plan to add a multiplayer mode to the game. We want to study the usability and game engagement through multiplayer cooperation on lunar exploration missions. We also plan to improve the naturalness of the interaction with the virtual reality environment. For example, we could employ a hand-gesture-based input modality for gaming controls instead of using handheld controllers.

**Author Contributions:** Conceptualization, L.C., C.P., and J.T.H.; methodology, L.C., C.P., and J.T.H.; software, L.C. and C.P.; validation, L.C., C.P., and J.T.H.; formal analysis, L.C.; investigation, L.C. and C.P.; resources, C.P. and J.T.H.; data curation, L.C.; writing—original draft preparation, L.C. and C.P.; writing—review and editing, L.C., C.P., and J.T.H.; visualization, L.C.

**Funding:** This research received no external funding.

**Acknowledgments:** We gratefully acknowledge the support of the NVIDIA Corporation with the donation of a GPU card used in this project. We thank anonymous reviewers for their comments. The study was performed at the University of Alabama in Huntsville (UAH), USA. The Institutional Review Board (IRB) application (E201922) for the study was submitted and approved by the UAH Institutional Review Board of Human Subjects Committee. We thank the participants for their participation in the study. We also thank the Center for Media, Arts, Games, Interaction & Creativity (MAGIC) at the Rochester Institute of Technology. The lunar rover model was obtained from Hameed (https://www.deviantart.com/hameed/art/Lunar-Rover-Downloadable-3D-Model-With-Textures-439471995) for non-commercial use.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **A Guide for Game-Design-Based Gamification**

#### **Francisco J. Gallego-Durán \*, Carlos J. Villagrá-Arnedo, Rosana Satorre-Cuerda, Patricia Compañ-Rosique, Rafael Molina-Carmona and Faraón Llorens-Largo**

Cátedra Santander-UA de Transformación Digital, Universidad de Alicante, 03690 San Vicente del Raspeig, Spain; villagra@ua.es (C.J.V.-A.); rosana.satorre@ua.es (R.S.-C.); patricia.company@ua.es (P.C.-R.); rmolina@ua.es (R.M.-C.); Faraon.Llorens@ua.es (F.L.-L.)

**\*** Correspondence: fjgallego@ua.es

Received: 8 July 2019; Accepted: 24 October 2019; Published: 5 November 2019

**Abstract:** Many researchers consider Gamification a powerful way to improve education. Many studies show improvements with respect to traditional methodologies, and several educational strategies have been combined with Gamification with interesting results. Interest is growing, and the evidence suggests that Gamification has a promising future. However, there is a barrier preventing many researchers from properly understanding Gamification principles. Gamification aims to engage trainees in learning with the same intensity with which games engage players. Yet only some very well-designed games achieve this level of engagement. Designing truly entertaining games is a difficult task with a great artistic component. Although some studies have tried to clarify how Game Design produces fun, there is no scientific consensus. Well-established knowledge on Game Design resides in sets of rules of thumb and good practices based on empirical experience, which game industry professionals acquire through practice. Most educators and researchers overlook the need for such experience to successfully design Gamification. As a result, many research papers focus on single game elements like points, present non-gaming activities like questionnaires, design non-engaging activities, or fail to understand why their designs do not yield the expected results. This work presents a rubric for educators and researchers to start working in Gamification without previous experience in Game Design. The rubric decomposes the continuous space of Game Design into a set of ten discrete characteristics. It aims to lower the entry barrier and help practitioners acquire initial experience with Game Design fundamentals. Its two main proposed uses are to analyse existing games or gamified activities, gaining a better understanding of their strengths and weaknesses, and to help in the design or improvement of activities. The focus is on Game Design characteristics rather than game elements, as with professional game designers. The goal is to help practitioners gain experience towards designing successful Gamification environments. The presented rubric is based on our previous design experience, compared and contrasted with the literature, and empirically tested with some example games and gamified activities.

**Keywords:** gamification; game design; rubric

#### **1. Introduction**

In recent years, Gamification [1,2] has come to be considered a magic solution for most educational problems. Many researchers and practitioners chase it, and many studies try to unveil its secrets and details. In one form or another, the term and the field acknowledge the power of games to engage players and induce states of flow in them. Gamification pursues this power to apply it to environments that are not originally ludic, with the aim of getting people engaged in serious or important work with the same intrinsic motivation as in games.

This enterprise is noble but extremely complicated. As more and more research is carried out, results remain unclear [3–8]. Hundreds of research experiences have been undertaken with mixed results: many studies find benefits when applying Gamification, but many others do not, and some even report harm. The overall tendency seems to be small but measurable benefits. These results are quite unexpected given the exponential rise in game sales and gaming culture in general.

The problem with most Gamification research seems to lie in its focus, which differs from that of actual Game Design. Many studies pursue scientific isolation of statistical variables, leading them to consider the isolated influence of individual game elements like points, badges, and leaderboards on motivation and behaviour change. The problem with this approach is that a game is not an unrelated set of game elements. Metaphorically, a game is similar to a grand-cuisine dish: testing its isolated ingredients in other contexts does not convey useful information for learning to cook the dish.

This view is supported by relevant Gamification practitioners like Kevin Werbach, Yu-kai Chou, and Sebastian Deterding [9–11], as well as Game Design experts like Raph Koster and Jesse Schell [12,13]. In Werbach's words [9]: "Clearly not everything that includes a game element constitutes gamification. Examinations in schools, for example, give out points and are non-game contexts." Deterding goes further in Reference [11]: "The main task of rethinking Gamification is to rescue it from the gamifiers." For Deterding, the majority of gamifiers are confused: they simply try to add points, badges, and leaderboards to everything, with great disregard for the complexities of Game Design.

Games are complex environments that deliver experiences to players [13]. They are made of game elements, just as a dish is made of ingredients, but the process, interactions, uses, and objectives are key to the final result:

"*Gamification should be understood as a process. Specifically, it is the process of making activities more game-like. Conceiving of Gamification as a process creates a better fit between academic and practitioner perspectives. Even more important, it focuses attention on the creation of game-like experiences, pushing against shallow approaches that can easily become manipulative. A final benefit of this approach is that it connects Gamification to persuasive design.*" Kevin Werbach [9]

These reasons could explain why there is no scientific consensus on a formal approach to Gamification. There are analyses of the characteristics of the good games that Gamification pursues [14,15]. There are also methodological approaches, design frameworks, and even descriptions of design patterns based on Game Design principles, good practices, and experience [3,16–21]. However, all of these approaches rely on subjective interpretation and creative design. In fact, many professional Game Designers and researchers express the view that games cannot be formally specified at all [22,23].

Even if games cannot be formally specified and research on individual game elements does not yield complete information, there are useful approaches [16,24]. If Game and Gamification Design are artistic in essence, it makes sense to focus on acquiring design experience; there is no need to settle the "what-is-a-game" philosophical debate. Game-like designs that become engaging voluntary experiences for players can be successful: such willingness places experiences on the persuasive or seductive sides of Tromp, Hekkert, and Verbeek's matrix of ways in which design can influence behaviour [25]. Similar to References [9,11–13], this work adopts this practical approach.

The main goal of this work is to help practitioners acquire Game Design experience for Gamification. Experienced practitioners may find the methods, frameworks, or models cited in the literature more suitable to their needs, particularly References [3,17,18,20,26]. These works have great value but require previous Game-Design-Based Gamification experience to be fully comprehended and put into practice. To build this required experience, a practical and simple approach is proposed: a measurement tool, a rubric, with a strong focus on Game Design aspects rather than on game elements.

#### *Acquiring Design Experience*

As previous design experience seems key [9,12,13], our proposal for new practitioners is to create and test their own designs. In our experience, iterating over one's own designs leads to solid game-design-based Gamification design skills. However, analysing and improving designs is an almost impossible task for inexperienced designers. In the absence of personal experience to rely on, the only valid source of feedback is testing. Testing with trainees is essential, but doing so with no previous design guidance can result in an extremely slow and frustrating discovery process. This entry barrier can produce two important problems: too many failures on initial attempts, and abandonment due to frustration. Moreover, when initial failures are not identified as a consequence of lack of experience, they can result in research papers blaming the field itself.

During fifteen years of teaching Game Development and Gamification [27,28], we have perceived great difficulty in passing on design experience to new practitioners. The problem, as discussed, seems to lie in the artistic nature of Game Design. Novice practitioners often underestimate the complexity of creating a design that can be put into practice, let alone a successful one. This is problematic, as their initial experiences will probably fail and be frustrating. There are design frameworks, methods, and guidelines proposed for games and Gamification [3,16,18–21,29] that could help in creating better first designs. However, these proposals are either general or aimed specifically at experts: they are not designed with novices in mind and can easily overwhelm them. For instance, Kreimeier's patterns [16] condense many designers' experiences; this is highly valuable but almost impossible to understand properly without previous experience of pitfalls and failures. Tondello et al. [20] explicitly state "Our set of heuristics is aimed at enabling experts to identify gaps in a gameful system's design", which clearly leaves novices out. Linehan et al. [3] propose using Applied Behaviour Analysis from the field of psychology, with many interesting theoretical explanations; this is too much theoretical information for novices, who will probably require several testing iterations to relate it to actual practice. Similarly, Self-Determination Theory (SDT) [29] is the most widely cited theoretical framework; in essence, SDT is easy to understand but too generic, and novices need more specific, game-related descriptions, as SDT is purely psychological. Hunicke et al.'s proposal [18] splits Game Design into three blocks: Mechanics, Dynamics, and Aesthetics (MDA). This simple classification helps organize designs, which is very useful for novices, but it does not help in measuring their value, comparing them with others, or giving hints on how to improve them.

To help in this process, this work proposes a game-design-based rubric that measures how well designed a game or activity is from a game-design-based Gamification perspective. The measure is formalized as a score from 0 to 20 points; the greater the score, the better. The rubric is based on a set of ten characteristics related to successful designs, selected from our previous experience in Game Design and Gamification, partially in accordance with the previously discussed works, and with an aim to simplify analysis. The goal is to be useful enough to serve as an analysis and design tool while remaining simple enough to help novices.
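The scoring scheme can be sketched in code. Since ten characteristics yield a 0–20 total, we assume each characteristic is scored from 0 to 2; the per-item criteria are defined by the rubric itself, and the placeholder keys below (beyond the two characteristics described in Section 2) are hypothetical, not the paper's actual names.

```python
def rubric_score(scores):
    """Total rubric score from ten characteristics.

    Assumes each characteristic is scored 0-2, giving a 0-20 total."""
    if len(scores) != 10:
        raise ValueError("expected scores for exactly ten characteristics")
    if any(not (0 <= s <= 2) for s in scores.values()):
        raise ValueError("each characteristic is scored from 0 to 2")
    return sum(scores.values())

# Illustrative use: "open_decision_space" and "challenge" are described in
# Section 2; the remaining keys are placeholders, not the rubric's names.
example = {"open_decision_space": 2, "challenge": 1,
           **{f"characteristic_{i}": 1 for i in range(3, 11)}}
print(rubric_score(example))  # → 11 out of 20
```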

It is important to remember that there is no known way to perform an objective assessment of a given design. The assessment obtained with the proposed rubric should be considered simplified initial guidance, targeted at inexperienced designers to help them overcome the entry barrier. In this sense, the rubric helps discretize designs and move them from the artistic to the analytic dimension. It also helps identify potential areas for improvement by pointing to the underperforming characteristics of a given design. These values complement the previously analysed formal tools and frameworks, which makes the rubric interesting to combine with them.

Section 2 describes the ten selected characteristics for the rubric in detail, explaining their design implications. Section 3 presents the rubric and explains its design constraints and criteria. Section 4 shows some initial evidence on the validity of the rubric by applying it to four activity samples: a commercial game, a gamified course from literature, a learning activity and a gamified version of the learning activity. Finally, Section 5 sums up conclusions and limitations of this work.

#### **2. Ten Relevant Characteristics for Game-Design-Based Gamification**

This section describes in detail the ten characteristics selected for the proposed rubric. The importance of each one is discussed by highlighting the relevant psychological arguments considered and by comparing classical educational environments with successful commercial games. Moreover, considerations from prestigious game designers are also cited and analysed from their published works.

This set of characteristics greatly overlaps those described by previous works [3,16,17,19–21], especially the learning principles that Gee finds good games incorporate [14,15]. Most works are generalist, and some are specifically focused on experts, which can be overwhelming for novices. Our proposed set aims to fill this gap: its intended value comes from its use-case goal of being simple and easy to understand for new practitioners.

We acknowledge that this set is subjective in nature, despite the arguments presented and discussed. We propose it on the basis of previous Game Design and Gamification experience, and we expect future work to help refine it as appropriate usage evidence is gathered. Section 4 gives an initial piece of evidence on the validity of this set as guidance and a basis for gathering further evidence.

The ten characteristics are described below, in no particular order.

#### *2.1. Open Decision Space*

Autonomy is one of the central components of intrinsic motivation. For a trainee to be truly autonomous, there must be different possible decisions to take; in fact, the greater the space of possible decisions over time, the better. However, there are some common misconceptions whose analysis is relevant.

• Correct decisions ill-form decision spaces.

Many Gamification designs rely on questions or alternative paths, only one of which is correct. This represents an ill-formed decision space because there is no true decision to take: trainees are not being asked to decide and progress but are being tested instead. For an environment to foster autonomy and provide a truly open decision space, decisions should not be designed as correct/incorrect. Instead, decisions must produce consequences, and trainees should be free to play with situations, environments, and consequences, experimenting and learning from the results.


These points are clearly addressed by great games. Decision spaces are usually continuous, as players can move freely over time and experience the consequences of their interaction decisions. Take, for instance, *Super Mario Bros* for the *Nintendo Entertainment System* (NES) (see Figure 1) [30]. When facing the first enemy, there are virtually infinite ways to deal with it. We often simplify this by thinking that you can jump on it, jump over it, or collide with it and die. However, there are virtually infinite alternatives: jumping earlier or later, higher or lower, faster or slower, and so forth. Players can even jump several times back and forth, advance and retreat, or do anything they can imagine within the free will granted by the rules of the world. In fact, Demaine et al. prove that *Super Mario Bros* is PSPACE-complete [30], a class of problems solvable in polynomial space and generally believed to be harder than NP. Of course, this great complexity comes exactly from the openness of its decision space. Generally speaking, broader decision spaces that are not ill-formed produce more complex problems, and the greater the complexity, the more options there are for creative behaviour, which fosters player autonomy.

**Figure 1.** Start of the first level of *Super Mario Bros* game. There are virtually infinite possible decisions, in the form of movement sequences. Players can press up to 60 combinations of inputs per second.

#### *2.2. Challenge*

Designing challenging activities is a key point in Gamification and a difficult task to accomplish. An activity is considered challenging when it tests the limits of our ability in subtle ways. Oversimplifying, the design space can be reduced to two dimensions: the difficulty of the task and the ability of the trainee. When difficulty and ability match, trainees face activities that they are able to solve [31]. However, when difficulty is much higher than ability, trainees usually get frustrated. On the contrary, if ability is much higher than difficulty, trainees probably become bored. There is a narrow space between both extremes where difficulty and ability are evenly matched. This simple analysis of the activity design space for challenge is the basis for the theory of the flow channel [32] (see Figure 2).

**Figure 2.** (**Left**) The flow channel. (**Center**) Linear incremental difficulty design that perfectly matches abilities. (**Right**) Rhythmic incremental difficulty design.

In essence, challenging trainees consists in assigning them interesting tasks that lie on the verge of their abilities. Although simple in concept, challenging trainees in an educational environment is difficult. A trainee can fail at a challenging task several times while learning. Educational environments tend to punish failure by diminishing trainees' marks. This works against using challenge as a driver for learning and motivation. In the presence of punishments, trainees avoid difficulty even at the cost of boredom and diminished learning. Marks are the most important outcome pursued by trainees, and that must be taken into account.

The flow channel shown in Figure 2 (left) represents a dynamic space. Trainees' abilities evolve over time. As abilities increase, previous challenges become boring and new ones are required. Tasks have to be designed with incremental difficulty in mind. Figure 2 (center) represents the general concept of incremental difficulty with a seemingly ideal linear progression. However, this progression should not actually be considered ideal. Since we humans are not machines, our brains usually dislike flat linear progressions. Hollywood movie makers usually design movies following a sinusoidal pattern of fast action events followed by relaxed moments. Figure 2 (right) shows this same concept applied to incremental difficulty. This approach generates stress spikes followed by more relaxed moments. Stress spikes on the verge of the flow channel force trainees to push their limits, adapt and learn. Easier activities let trainees reinforce their sense of progress while they release previous stress and prepare for the next spike. Moreover, relaxed moments also represent a psychological reward, as trainees subconsciously acknowledge their new abilities. This pattern closely resembles the General Adaptation Syndrome described by Hans Selye [33] and summarized in Figure 3.
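The two progression designs in Figure 2 can be sketched numerically. The following is a minimal, hypothetical illustration (the function names and the slope/amplitude/period parameters are our own choices, not from the cited works): the rhythmic curve adds a sinusoidal swing to the linear trend, producing stress spikes above the line followed by relaxed moments below it.

```python
import math

def linear_difficulty(t, slope=1.0):
    """Flat linear progression that exactly tracks growing ability (Figure 2, center)."""
    return slope * t

def rhythmic_difficulty(t, slope=1.0, amplitude=0.3, period=4.0):
    """Linear trend plus a sinusoidal swing (Figure 2, right): spikes push
    trainees to the verge of the flow channel, dips let them consolidate."""
    return slope * t + amplitude * math.sin(2 * math.pi * t / period)
```

With these toy parameters, `t = 1` lies on a stress spike (1.3 vs. 1.0 for the linear design) and `t = 3` on a relaxed dip (2.7 vs. 3.0), while the long-term trend still matches growing ability.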

**Figure 3.** General Adaptation Syndrome. Challenging tasks produce stress, test available abilities and yield failures. Present abilities are improved as resistance, then new ones are developed as super-compensation. Relaxing helps consolidate acquired abilities, while continued stress ends in exhaustion.

#### *2.3. Learning by Trial and Error*

This is arguably one of the key differences between great computer games and traditional teaching environments. As discussed before, great computer games appropriately challenge players. Proper challenges put players on an edge where they can narrowly succeed or fail depending on their abilities [15,26]. This produces engagement with the most relevant thing a game conveys: learning. Players fail many times and try again, learning from their failures. They continue trying as long as they feel competent to learn, and eventually they succeed. This guiding force comes from the natural human desire to learn, and so it reinforces autonomy and will. This cycle happens so naturally because great computer games are safe environments for failure. Players do not want to fail, but they are not afraid to do so either. They accept failure as part of the learning process and then try different things to improve their abilities.

In contrast, traditional learning environments are designed to prevent, prosecute and punish failure. This situation often arises from assessing learning on the basis of task results. Two trainees solving a given task can obtain different marks depending on their failures during their respective solving processes. This drives trainees to focus on preventing failure at all costs to preserve their marks. All learning through challenge and experimentation gets removed from the environment.

In order to mimic computer games and get the benefits of challenge and experimentation, trial and error must be considered a central way to learn. A proper Gamification design should focus trainees on goals and let them freely experiment and fail without punishment. Trainees need to know that failing is safe in order to be confident enough to experiment. In fact, it would be even better if the environment encouraged them to fail and analyse: learning from failure is extremely valuable and often forgotten due to too much focus on task results.

Moreover, solving problems by trial and error, creating solutions, failing, redoing and refining produces "professional experience". In fact, professionals usually say that the greatest expert is the person who has made all possible mistakes. A great Gamification design understands the importance of experience and designs situations for trainees to learn by trial and error.

#### *2.4. Progress Assessment*

Computer games generate virtual environments that evolve with player interactions over time. This evolution immediately informs players about their progress inside the game. Many computer games also include progress measures and feedback systems that constantly inform players about their statistics, achievements, awards, status and, in a general sense, progress. Part of players' engagement in games comes from their sense of progress [3]. Players build upon their own progress, as their achievements encourage them to pursue the next steps.

Many learning environments feel very different. Trainees attend lessons and then have to practice or study the contents on their own. There is little or no feedback on their progress. Occasionally they can check whether they manage to solve some exercises. However, this is radically different from having constant feedback on progress and clear goals leading to the next levels. One key element in this feedback is the measurement of already achieved success. Whenever players obtain an award or finish a level, they move on in their gaming adventure and their previous success is acknowledged. Their achievements are never removed, even if they fail afterwards [14]. Compare this with lessons in which a trainee can solve all proposed exercises but fail the final exam. There is no progress at all because there is no assessment and acknowledgement. All that matters is the result of the final exam. And that is the reason why many trainees do not care about solving exercises during lessons. They only need to prepare at the end and do well on their exam: progress does not matter at all.
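The "achievements are never removed" property can be captured in a few lines of code. This is a hypothetical sketch (the class and method names are ours, not from any real system): earned achievements only accumulate, so a later failure never erases acknowledged progress.

```python
class ProgressTracker:
    """Monotonic progress record: achievements accumulate, never shrink."""

    def __init__(self):
        self.achievements = set()

    def award(self, name):
        # Acknowledge a success permanently.
        self.achievements.add(name)

    def fail(self, task):
        # Failure triggers a retry but touches no earned achievement.
        return f"retry: {task}"

    def progress(self):
        return len(self.achievements)
```

Contrast this with a final-exam model, where one failure at the end effectively discards everything solved before it.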

In order to generate engagement and maintain interest, Gamification designs should include one or several forms of progress assessment. Moreover, designing for progress assessment also helps in designing incremental difficulty for challenge, as progress and difficulty are closely related [26]. Ideally, progress assessment should not be based on extrinsic rewards like points or badges. Intrinsic motivation requires trainees to be focused on the learning goals per se. Too much emphasis on extrinsic rewards can change trainees' focus, which would be detrimental to learning. For example, some trainees focus on passing subjects to get a degree without worrying about learning. Getting a degree is an extrinsic reward that eclipses their interest in learning and acquiring abilities. Consequently, using extrinsic rewards to assess progress should be done with care, ensuring that the main focus is always placed on learning goals.

#### *2.5. Feedback*

The most relevant difference between computer games and traditional learning environments lies in the quantity, quality and speed of feedback. Many good computer games act as simulations, which gives them properties similar to reality: players get immediate feedback in response to any of their actions. This is a key point both for learning and for engagement: immediate feedback. It is better understood with an example: imagine a child learning to play soccer. Every time the child kicks the ball, it reacts and moves depending on the kick. This feedback lets the child learn applied physics: the child learns to control the movement, spin, momentum and force transmitted to the ball through kicking. Now imagine a delayed ball that reacts 24 h after being kicked. Learning how to kick the ball and get a desired reaction would require great patience and effort. Many trainees would rapidly give up, demotivated by such slowness and unable to learn effectively. Appropriate, on-time feedback is crucial for both learning and engagement [3,12,13,26].

This is an important problem in many traditional learning environments. Many learning activities require assessment by a teacher. For instance, trainees solving math exercises wait until they receive the teacher's corrections. This is similar to the 24-h delayed ball of our previous example. A computer game designed this way would probably be played by no one. A designer would probably envision something more dynamic, like the "Sum Totaled" mini-game inside "Brain Age Express: Math" for the *Nintendo DSi* [34] (see Figure 4). In this mini-game, monsters attack the player, who can destroy them by adding the numbers on their bodies. The player has three lives, one of which is lost every time a monster hits the player's avatar. The activity is based on adding numbers, but its rules make it dynamic and the player gets constant, immediate feedback: enemies explode when the player writes down a correct answer and the action continues uninterrupted otherwise.

**Figure 4.** (**Left**) Brain Age 2. Sum Totaled. The player writes '12' (7 + 5) to destroy the monster on top before it falls down and takes one of the three lives. (**Right**) Traditional addition practice.

The comparison between the *Sum Totaled* game and the traditional addition practice in Figure 4 shows the importance of appropriate feedback. Both activities are mathematically the same (apart from their difference in difficulty), but trainees performing the traditional version will have no feedback stimuli to learn from. They will need to wait for the teacher's corrections. Moreover, the game dynamics encourage trainees to stop fearing failure and produce more answers, because being quick is crucial. This promotes learning from failure and has the potential to make learning more time- and cost-efficient.
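The immediate-feedback loop of a Sum-Totaled-like mini-game can be sketched deterministically. This is our own toy reconstruction, not the actual game's code: each monster carries numbers, writing their sum destroys it instantly, and a wrong answer immediately costs a life.

```python
def play_round(monsters, answers, lives=3):
    """Toy feedback loop: each monster is a list of numbers; answering
    with the matching sum explodes it, a miss costs a life, and feedback
    is produced immediately after every single input."""
    feedback = []
    for monster, answer in zip(monsters, answers):
        if answer == sum(monster):
            feedback.append("explode")   # instant positive feedback
        else:
            lives -= 1                   # instant consequence of failure
            feedback.append("hit")
        if lives == 0:
            break
    return feedback, lives
```

For the Figure 4 example, answering 12 to a monster carrying 7 and 5 produces an immediate "explode"; a paper worksheet corrected a week later provides none of this per-input signal.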

#### *2.6. Randomness*

Randomness is a relevant factor for learning and engagement and links both of them together. In its most fundamental definition, learning is about discovering and modelling patterns and testing the constructed models against reality through experimentation. This describes an iterative process for learning, in which engagement arises naturally as trainees constantly find new ways to refine and validate their mental models. An appropriate degree of randomness can keep trainees iterating longer, as their minds will continuously try to refine their models based on unexpected observations. Human minds are not well suited to dealing with probabilities, as they tend to model in terms of strong cause-effect relationships. This broadly explains phenomena like the gambler's fallacy or the hot-hand fallacy [35]. These fallacies show us that randomness itself can be used to produce engagement. Therefore, it is quite relevant for games and Gamification designs.

Well-designed randomness can provide a useful consequence: surprise. Surprise is one of the most desirable feelings both in learning and in playing. Schell describes it as "so basic that we can easily forget about it" in Reference [13]. Schell also describes fun as "pleasure with surprises" and reminds us that "Surprise is a crucial part of all entertainment - it is the root of humour, strategy and problem solving. Our brains are hard-wired to enjoy surprises". In fact, surprise happens when observations are radically different from our mental models. The more unexpected an event is, the more information it carries: this is a natural consequence of the definition of entropy in information theory by Shannon [36]. This means that surprises are great sources of new information, which can potentially push trainees to revise their mental models, that is, to learn.
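Shannon's link between unexpectedness and information can be made concrete through the self-information (surprisal) of an event, I(x) = -log2 p(x): the lower the probability our mental model assigns to an observation, the more bits of information its occurrence carries. A minimal sketch (the function name is our own):

```python
import math

def surprisal_bits(p):
    """Self-information of an event with probability p (0 < p <= 1), in bits."""
    return -math.log2(p)
```

A fair coin flip carries 1 bit, a one-in-four outcome 2 bits, and a fully expected event (p = 1) carries none, which is why surprises, not confirmations, push mental models to update.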

#### *Informatics* **2019**, *6*, 49

Consequently, it is key to consider appropriate uses of randomness in our Gamification designs to foster engagement and learning through surprise.

#### *2.7. Discovery*

As Koster states in Reference [12], a good game is one that "keeps the player learning". One of the most important ways to keep players learning is to present them with new content at an adequate rate. This renews interest in the game and keeps players eager to continue discovering more. It can also trigger surprise, depending on the nature of the new content and the way it is presented. However, discovery is as difficult to design as challenge. New content has to build upon previous content to balance novelty and familiarity. Similar to the flow channel (Figure 2), if content is radically new it can easily be difficult to understand or accept. New information that cannot link to pre-existing mental models becomes similar to noise: no pattern can be found in it, so it cannot be modelled and learnt. Some degree of familiarity is needed to help players understand, accept and enjoy new content, but too much would eclipse novelty, making new content not feel new at all.

Games offer two basic ways to deliver new content: discovery and unlocking. Unlocking works by asking players to perform some achievements to unlock new content. Typically, this means finishing some levels before being able to play new ones. The other way is to place new content in such a way that players will discover it while playing. Discovery can be made equivalent to unlocking by delivering the same content at the same rate. However, well-designed discovery can produce better feelings in players, like surprise, reaffirmation and raised self-esteem. Moreover, discovery can also be designed non-linearly. Games can have secret content, not required to succeed, present only for players that go beyond normal play. This is also an indirect way to reward players for their attention to detail, research or clever play. It is also an interesting way to convey rewards, as the discovery would not be perceived as a reward but as a personal achievement. This is more likely to foster intrinsic motivation.
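The two delivery modes can be contrasted in a few lines of code. This is a hypothetical sketch (class, method and content names are ours): unlocking checks an explicit gate before serving content, while discovery simply rewards being at the right place during free play.

```python
class ContentDelivery:
    """Toy model of the two content-delivery modes: unlocking and discovery."""

    def __init__(self, unlock_rules, hidden_content):
        self.unlock_rules = unlock_rules      # level -> required achievement
        self.hidden_content = hidden_content  # location -> secret content
        self.done = set()

    def complete(self, achievement):
        self.done.add(achievement)

    def unlocked(self, level):
        # Unlocking: content is gated behind an explicit achievement.
        return self.unlock_rules[level] in self.done

    def explore(self, location):
        # Discovery: content surfaces through play; no gate is checked.
        return self.hidden_content.get(location)
```

Both modes can deliver the same content at the same rate, but only `explore` lets a curious player stumble onto something no rule required them to find.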

Discovery is not commonly used in Gamification. This is probably due to the difficulty of content and activity design. Educational contexts tend to be linear, and content is usually known beforehand. Trainees expect contents to be introduced first, then explained linearly. This relates to what we stated in Section 2.1: activities and contents are usually designed with a single correct path, expecting a concrete answer. To add discovery, designs require open spaces in which trainees' decisions are relevant. Otherwise, discovery has no meaning at all. And this is the root of the difficulty of including discovery: it is difficult to design activities or content with properly open decision spaces. So, when willing to introduce discovery, it is advisable to think first about activities with open decision spaces.

#### *2.8. Emotional Entailment*

Emotions have generally been ignored in education. Educational contexts usually focus on factual content, learning and methodologies for better learning. Everything tends to be concentrated on the effectiveness of factual learning. Emotions are seldom considered. However, it is quite common for trainees to define their teachers in terms of their feelings. Usual comments include "I like lessons from this teacher because they are X", where X is a qualifier like "fun", "entertaining", "approachable" or "kind". Emotions are a key factor in all human relations, and they also play a key role in learning and engagement. It is known that high-intensity emotions produce long-lasting memories. In fact, even before the term Gamification was coined, many studies targeted fun as a catalyst for learning [37–39].

Similar to movies, games cannot be successful without paying close attention to emotions. At the very least, a game is always expected to be fun. But fun, like any other emotion, is created inside the player's mind. In Schell's words, "When people play games, they have an experience. It is this experience that the designer cares about. Without the experience, the game is worthless" [13]. The game itself is not the experience but rather the tool that enables it. That is what makes anything we create different for each person: the experience always happens in the mind of the player. And that makes it so important for games to be emotionally entailing, because they will attach to emotions in the player's mind, producing a much better and more personal experience.

Gamification tends to use the same main tools games use to construct emotional entailment: characters, stories and aesthetics [13]. The problem usually lies in the complexity of these tools. All of them require great ability and long periods of time when trying to mimic what commercial games do. This is too expensive and usually not cost-effective in educational environments. Simple approaches are preferred in this case: simple stories like "escape from the enchanted house" or "disarm the bomb" could be enough for an emotionally entailing environment. Trainees could be given the freedom to create their own characters (as in role-playing games), and aesthetics could be imaginary. Moreover, direct interaction between trainees could also help create emotional entailment. Forming groups, sharing challenges and achieving common goals are preferred approaches in educational environments.

#### *2.9. Playfulness Enabled*

A playfulness-enabled game refers to a game's versatility to be used as a toy. Games have goals; toys do not. Some games can be played without focusing on goals, in a playful way. Examples include Minecraft, the Grand Theft Auto series and Goat Simulator (see Figure 5) [40]. These kinds of games are classified as *sandbox* or *open-world*.

**Figure 5.** (**Left**) An example world constructed in Minecraft. Similar to Lego™ blocks, no rule forces players to build anything specific. Creations come out of personal will, just because the game allows them. (**Right**) In Goat Simulator there is no specific reward for jumping over an ultralight, but players do it because they can and it is fun: it is a way of experimenting, just like in the real world.

In recent years, game designers have been paying more attention to playfulness, as sandbox and open-world games are increasingly in demand. The reasons for this demand have already been pointed out: players have complete autonomy to develop their own creativity in vast open decision spaces; they can pursue their own goals, experiment by trial and error, create their own personal challenges and constantly discover what happens as a result of their actions. Clearly, these kinds of games properly address many of the items in this rubric, including playfulness. Some of these games also have goals, but they let and encourage players to do anything they like, pursue goals in any order or even forget about goals and just explore and experiment. This is how these games become toys: they can be played in almost any imaginable way, similar to children playing invented games with a ball. The ball is just a toy that can be used for any kind of play.

Playfulness is absent most of the time in educational environments. However, it is present in research and development environments. In fact, many present-day discoveries in many branches of knowledge came out of experimental approaches. These approaches closely resemble the playful nature of toys. Research has no intrinsic goals in general: it emerges from raw questions. Researchers are presented with current evidence and they ask questions about why or how things happen. That leads to experiments that seek answers, and these yield new evidence. The evidence then gives developers ideas to create. And this whole cycle is driven by curiosity. Therefore, it can be considered a playful approach.

This characteristic is highly desirable in our educational environments. Curiosity is the most important driver of knowledge, and a playfulness-enabled environment fosters curiosity and experimentation. However, there is a great challenge involved. As teachers, we usually design from knowledge to activities. This implies that activities are designed to practice and acquire some concrete knowledge or abilities. So activities tend to be the opposite of sandboxes: they are usually focused on a finite set of goals and give trainees little or no space for experimentation or play. Great Gamification designs should change this focus and seek to produce sandbox-like activities.

#### *2.10. Automation*

One of the main differences in Figure 4 is due to automation. Characteristics like feedback, challenge and randomness are greatly affected by the level of automation. A game like Brain Age cannot exist without automation. Similar games can be made even in manual contexts, but they will be different from Brain Age.

Interaction based on the immediate feedback of a computer game generates great amounts of information per second. Players' brains subconsciously analyse the cause/effect relations between this information and their input interactions. This fosters adaptations in players' brains as they practice and master the game. This practical learning also happens in sports, which are real-life games without computer support.

When referring to a gamified activity, automation defines the level of human intervention required to produce responses to trainees' inputs. It also refers to the need for human intervention to enforce the rules. Computer games automatically process all inputs from players, give immediate responses and enforce the rules without any human intervention. By contrast, a group of players of a board game have to do all this processing themselves: throwing dice, counting, interpreting rules, changing the status of the game and so forth. Exactly the same happens with tests, exams and manual classroom activities.

Therefore, there are two relevant differences between manual and automated activities with respect to learning: the stream of information generated and the immediacy of the responses to input interactions. Both have been discussed under previous characteristics like feedback, challenge and learning by trial and error, and both have a great impact on learning outcomes. For Gamification this means that automation should always be sought when possible. However, not all contexts are easy or viable to automate, nor does every automation have to involve computers. If we consider soccer, most of the game is automated. A referee is required to enforce the rules, but most of the interactions are handled by the field, the ball, the goals and the players. In fact, an informal soccer match can be played without a referee. The same happens with other games and sports. This shows that a great level of automation can be achieved with appropriate real-life designs.

#### **3. The Rubric**

Table 1 shows the game-design-based rubric with the ten selected characteristics, their assessment criteria and the assigned scores. The rubric has ten rows, one for each characteristic. Each row is divided into three columns that hold the criteria for assigning scores from zero to two. For each characteristic, the given score is the one at the top of the column containing the criteria that most accurately describe the design. For simplicity, only an integer score is assignable to each characteristic.

The rubric has been designed as an instrument and so it meets the requirement of fitting on a single page. To accomplish this, the criteria have been written in a few simple words. This makes them simpler and more direct, but less detailed and specific. It is advisable to understand the written criteria as general contextual descriptions. They are meant to be complemented by the more detailed descriptions in Section 2. Also, the criteria are written with three or four sentences per cell. For an appropriate application, they should not be treated as a checklist: depending on the design being assessed, they could even be inapplicable as written. These sentences are better considered as descriptions of general observable symptoms of designs that meet the criteria. This is a consequence of designs being artistic in nature: strictly objective descriptions would not be applicable most of the time.


**Table 1.** Ten-characteristic game-design-based gamification rubric.

Two main outcomes arise from the use of the rubric as proposed: first, the rubric is an easy-to-use instrument for assessing the strengths and weaknesses of designs with respect to their game-like characteristics. Second, knowing the strengths and weaknesses helps in thinking of ways to improve designs, creating a feedback cycle of analysis and improvement. These benefits are limited by the subjective nature of the rubric and the discretization it imposes on the analysis space: only ten characteristics and three scores. However, these limitations are acceptable and even desirable in the selected context of helping new practitioners overcome the lack-of-experience entry barrier.

#### **4. Rubric Application Samples**

As an initial piece of evidence, we show four samples of application of the rubric to four different activities in two blocks: Sections 4.1 and 4.2 analyse the *Super Mario Bros* game and an unsuccessfully gamified 16-week course described in the literature. Sections 4.3 and 4.4 analyse the activity of solving a single system of linear equations and a gamified activity designed around solving systems of linear equations. The first pair of examples shows how the rubric compares a successful game with an unsuccessful gamification, expecting a great difference in their scores. The second pair shows how the rubric measures the difference between a single classic learning activity and a gamification design produced with the items of the rubric in mind. This gives an idea of how the rubric could be used to help practitioners create and improve their initial designs.

This small piece of evidence does not prove the general validity of the rubric but yields an initial hint. In any case, more supporting evidence is required to validate or discard the rubric.

#### *4.1. Super Mario Bros (NES)*

Next we will assess the game *Super Mario Bros* [30] (see Figure 1). The player controls Mario, a plumber in an imaginary world who has to save a princess from an enemy. To do so, Mario has to overcome many perils and enemies across a series of levels. Mario's abilities include running and jumping, getting inside pipes, breaking blocks with a punch from below and firing. *Super Mario Bros* is classified as an action-platformer game: most of the time, Mario has to jump from platform to platform to overcome the perils.

*Super Mario Bros* is considered one of the most played games in videogame history. Millions of people have played either the original game or one of its successors. Let us apply the rubric and check whether this popularity correlates with its score:


According to the rubric, *Super Mario Bros* has a score of 2 + 2 + 2 + 1 + 2 + 1 + 2 + 2 + 1 + 2 = 17 points, which is quite reasonable for such a well-known and widely played game.
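The scoring rule is simple enough to state as code. The helper below is hypothetical (not part of the rubric itself): ten integer scores between 0 and 2, summed into a 0-20 total, using the per-characteristic scores listed above for *Super Mario Bros*.

```python
def rubric_total(scores):
    """Sum ten per-characteristic rubric scores, each an integer from 0 to 2."""
    assert len(scores) == 10 and all(s in (0, 1, 2) for s in scores)
    return sum(scores)

# Scores in the order given in the text for Super Mario Bros.
super_mario = [2, 2, 2, 1, 2, 1, 2, 2, 1, 2]
```

`rubric_total(super_mario)` gives 17 out of a possible 20; the same helper reproduces the 13 points (1 + 1 + 2 + 2 + 1 + 2 + 1 + 1 + 0 + 2) of the gamified activity in Section 4.4.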

#### *4.2. Unsuccessful 16-Week Gamified Semester*

As a more elaborate example, we will apply the rubric to the Gamification study by D. Hanus et al. [41]. A class was divided into two groups. The control group received normal lessons, materials, assignments and exams. The experimental group was given the same content as the control group but in a standard gamified fashion including badges, leaderboards and incentive systems. Badges were given as a reward for positive behaviours like interacting with class materials, studying in pairs in the library or handing in assignments early. There was also a badge for attending lessons dressed up as a videogame character. In addition to badges, students also earned coins for small contributions to class discussions or for sharing interesting information. Students could use coins to obtain class benefits, like an extension on a paper.

Students were required to obtain some mandatory badges, but coins were optional. The leaderboard was ordered by the number of badges obtained, with students using pseudonyms, and was updated weekly.

The description of the system is quite broad, which introduces a great deal of noise. Therefore, the final score from the rubric should be taken with care. It is advisable to add a large error bar to the final result to compensate for the noise. A raw application of the rubric yields these scores:


The rubric gives a final score of 3 points for this Gamification design. Even allowing for a large error bar of up to 100%, the maximum value would be 6 points, really far from the 17 points obtained by *Super Mario Bros*. The design clearly appears insufficient to induce important motivational changes in students.

This analysis supports the results obtained by D. Hanus et al. [41], who concluded that the Gamification methods they used had no positive impact on learning and could even harm student motivation. As the 3 points obtained are far lower than the score of *Super Mario Bros*, a much lower motivational level could be expected, which is consistent with what D. Hanus et al. found in their study.

#### *4.3. Single Learning Activity: Solving a Linear-Equations System*

To establish a comparison with a common learning activity, we will now apply the rubric to a linear-equations system exercise. Let us consider a trainee solving the system on paper and handing it to the teacher. A week later, the trainee receives the assessed exercise. These would be the rubric scores:


This gives a final score of 2 points for the learning activity in isolation, which clearly contrasts with *Super Mario Bros*. A similarly strong difference is usually perceived in student motivation for these two activities, so both scores seem intuitively correlated with this general perception.

#### *4.4. Gamified Version of the Linear-Equations Solving Activity*

Now we consider an explicit gamification design created for the activity of solving linear equations. This activity was presented by Llorens et al. [42]; here we present a summary of the design along with its evaluation using the rubric. For complete details on the design, please refer to Reference [42]. Basically, Llorens et al. propose these changes to the activity:


Let us now compare the evaluation of this gamified version with that of the single linear-equations system solving activity:


might not be detailed or complete. However, this last issue is unlikely with newer versions of the app.

This gamified version of the activity gets a score of 1 + 1 + 2 + 2 + 1 + 2 + 1 + 1 + 0 + 2 = 13 points in total. The improvement over the 2 points obtained by the single activity is clear, even considering possible variability in interpreting the rubric criteria that could shave off some points. In this sense, the rubric also shows potential for helping practitioners improve their designs: the proposed design could have been made by looking at the items of the rubric and including ideas to improve each one of them. An experienced designer would probably consider the design as a whole product instead of a separate set of characteristics. However, the set of separate characteristics is much more easily manageable for a novice practitioner and can easily lead to initial designs like the one proposed here.

#### **5. Conclusions**

In this paper we have presented a rubric as an instrument to help new Gamification practitioners assess designs. Its aim is to lower the entry barrier to gaining experience in game-design-based Gamification. Experience is obtained through practice, but practice without guidance is much more difficult and frustrating; the rubric provides this guidance.

The rubric can be used to assess a given design, to analyse its strengths and weaknesses, and to highlight areas for improvement. These uses focus on helping practitioners learn and develop experience in game-design-based Gamification.

Due to the artistic nature of game design, the rubric is built on the authors' previous experience and on the work of experts. Its criteria are deliberately flexible and open to interpretation, so that they can accommodate the subtle perceptual details of games and Gamification experiences. This is both a limitation and a strength: the rubric cannot provide objective assessment, but it allows emotions and player experiences to be considered, which are key to a successful Gamification design.

As practical experience is the main basis for successful game-design-based Gamification designs, the rubric is probably far too simplified for experienced practitioners. This is an intended limitation, as its focus is on helping new practitioners.

The rubric has been applied to four samples, in two pairs: to the *Super Mario Bros* game and the 16-week gamified course presented by D. Hanus et al. [41], and to a linear-equations solving exercise and the gamified version of that exercise proposed by Llorens et al. [42]. All four applications yield results consistent with previous evidence and general perception. Although much more evidence is required to assess the general validity of the rubric, this piece of evidence encourages further testing and analysis.

**Author Contributions:** Conceptualization, F.J.G.-D., C.J.V.-A., R.M.-C. and F.L.-L.; Investigation, R.S.-C. and P.C.-R.; Writing—original draft, F.J.G.-D. and C.J.V.-A.; Writing—review & editing, F.J.G.-D., C.J.V.-A., R.M.-C. and F.L.-L.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
