#### 2.2.2. Control Group

After filling out a pre-test questionnaire, students were taught three chapters of the school textbook, as the curriculum indicates. Finally, all control group students were tested again (post-test) and filled out the IMI and lesson evaluation questionnaires (see Figure 5).

**Figure 5.** Control group's process.

#### *2.3. Digital Game*

We wanted to further explore Geopoly's impact on students, so after the completion of the first intervention we designed a digital version of Geopoly that follows the same principles and mechanisms as the board game. Each side of the board represented a European region; countries were grouped by color, and their placement on the board conveyed their relative position. Influenced by current affairs (the COVID-19 pandemic), we replaced jail with "quarantine" and taxes with "corona bonds" to make the game more relevant to students' everyday lives. Up to seven players can play against each other or against the computer. The digital game retains mechanisms such as "roll dice", "buy a property", and "exchange properties", which promote skills such as strategic and critical thinking.

Students played the game at home three to four times and then evaluated it by answering an online questionnaire.

#### *2.4. Analysis*

#### 2.4.1. Research Instruments

Pre-test: Before the first game session, students took a written test consisting of 5 closed-format exercises on European geography (e.g., Choose the correct answer: One of the most prominent monuments of France is the (a) Colosseum, (b) Eiffel Tower, (c) Parthenon, (d) Brandenburg Gate). The maximum pre-test score was 100 points.

Post-test: After the last game session, students took a written test consisting of 5 closed-format exercises on European geography (e.g., Choose South Europe's biggest country: (a) Greece, (b) Italy, (c) Portugal, (d) Spain). The maximum post-test score was 100 points. The scores of both tests were used to measure students' academic performance.

Post-experiment questionnaire: After playing Geopoly, students were given a five-point Likert scale questionnaire to evaluate their experience (e.g., Did you enjoy playing Geopoly? 1. Strongly disagree, 2. Disagree, 3. Neither agree nor disagree, 4. Agree, 5. Strongly agree). We used Cronbach's α to test the reliability of the chosen scale. The result (α = 0.813) showed that the scale of the questionnaire is reliable (Table 1).
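As an illustration of the reliability check, Cronbach's α for a set of k items is k/(k − 1) · (1 − Σ item variances / variance of the total score). The sketch below is a minimal standard-library implementation of that formula; the function name `cronbach_alpha` and the response matrix `demo` are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of questionnaire items
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*responses))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical five-point Likert answers: 5 respondents x 3 items.
demo = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(demo)
print(f"alpha = {alpha:.3f}")
```

Values of α above roughly 0.7 to 0.8 are conventionally read as acceptable reliability, which is the sense in which the reported 0.813 is interpreted.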


#### 2.4.2. IMI Questionnaire

The Intrinsic Motivation Inventory (IMI) is a seven-point Likert scale questionnaire [47], which was used to assess students' subjective experience. It includes nine questions in total, three for each subscale: Interest/Enjoyment, Pressure/Tension, and Perceived Competence.

The interest/enjoyment subscale is the self-report measure of intrinsic motivation. Perceived competence is considered to be a positive predictor of intrinsic motivation, whereas the pressure/tension subscale is considered to be a negative predictor (e.g., "While I was playing the game I was thinking about how much I enjoyed geography", scale: 1–7, 1: "not at all true", 4: "somewhat true", 7: "very true"). All participants filled out the IMI questionnaire, and the data from the two questionnaires were used to measure students' interest in geography.

#### 2.4.3. Online Post-Experiment Questionnaire

After playing digital Geopoly, students were given a five-point Likert scale questionnaire to evaluate the digital game.

#### **3. Results**

#### *3.1. Performance*

Overall performance: Results on the dependent variable "performance" (see Table 2) showed that both groups (control and experimental) achieved similar scores in the pre-test (~45/100).


**Table 2.** Performance statistics before the intervention.

To check for a statistically significant difference between the scores of the two groups, we performed a parametric *t*-test for equality of means (see Table 3); the resulting value (*p* = 0.958) showed that there is no statistically significant difference in the pre-test scores.

This test was followed by a statistical check on the mean scores, which showed that performance improved for both groups. However, the scores in the experimental group were higher (Table 4). In addition, the *t*-test (Table 5) indicated that the game improved students' performance.

There is a statistically significant difference between the means of the two groups (*p* = 0.02 < 0.05), which supports our assumption that the game improved student performance.
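For readers who want to see what the comparison of means computes, the following is a minimal sketch of the pooled-variance (Student's) *t* statistic for two independent groups. The function name `pooled_t` and the score lists are hypothetical; the study's raw scores are not reproduced here. The sketch returns only the statistic and its degrees of freedom, since the Python standard library has no *t* distribution; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would supply the *p*-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance, assuming equal population variances.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical post-test scores out of 100 (not the study's data).
experimental = [70, 85, 65, 90, 75, 80]
control = [55, 60, 70, 50, 65, 60]
t_stat, dof = pooled_t(experimental, control)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The pooled form is only appropriate when the two groups have similar variances, which is why an equal-variance check normally precedes it.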

Finally, an independent-samples *t*-test for each group (Table 6 for the experimental group and Table 7 for the control group) showed the impact of the game and of the conventional lesson on students' academic performance.

Using Levene's test, we analyzed pre-test and post-test scores for each group. Both groups' performance improved, but the experimental group showed greater improvement. Hence, Geopoly did have an effect on students' performance.

Gender difference: We tested for a difference in mean scores between genders (Tables 8 and 9). The significance levels in the control group (*p* = 0.728 > 0.05) and the experimental group (*p* = 0.574 > 0.05) gave no grounds to reject the null hypothesis that boys' scores do not differ from girls'. Based on Table 10, the null hypothesis of Levene's test is retained in both cases (experimental and control), which indicates that the variances are equal. Therefore, Student's *t*-test can be used to compare the two populations' means (the assumption of equal variances is satisfied). The *t*-test result (Table 10) indicates that the null hypothesis is retained; hence, the performance of the two groups shows no significant difference.
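The equal-variance check described above can be sketched as follows. Levene's statistic W is a one-way ANOVA F ratio computed on the absolute deviations of each score from its group center. The text does not say which centering was used, so the classic mean-centered form is assumed here; `levene_w` and the score lists are hypothetical. Only W is computed; the *p*-value comes from an F distribution with (k − 1, N − k) degrees of freedom, which the standard library does not provide.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W (mean-centered) for two or more groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = mean([x for zi in z for x in zi])
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((x - zm) ** 2 for zi, zm in zip(z, z_means) for x in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical scores for two groups (not the study's data).
boys = [40, 55, 60, 45, 70]
girls = [50, 65, 55, 45, 60]
w = levene_w(boys, girls)
print(f"W = {w:.3f}")
```

A small W (large *p*-value) means the hypothesis of equal variances is not rejected, which is the condition under which the pooled Student's *t*-test is used.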


**Table 3.** Independent samples test (pre-test).



**Table 8.** Performance across genders.

**Table 9.** Performance statistics per gender.


**Table 10.** Statistical analysis of student performance.


#### *3.2. Interest*

We wanted to examine whether Geopoly affected students' interest in geography. To this end, we used the data from the IMI questionnaire which explores intrinsic motivation and tested for correlation between interest, a positive indicator of intrinsic motivation, and academic performance. Since the variable "interest" was not distributed normally, we used Spearman's rank-correlation (see Table 11), which indicated a strong positive correlation (ρ = 0.844) between the two variables (academic performance and interest).
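To make the rank-based measure concrete, the sketch below computes Spearman's ρ as the Pearson correlation of the two variables' ranks, with tied values receiving the average of their ranks. The helper names (`midranks`, `pearson`, `spearman_rho`) and the sample values are hypothetical and are not the study's data.

```python
from math import sqrt
from statistics import mean


def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(midranks(x), midranks(y))


# Hypothetical interest scores (1-7) and test scores (0-100).
interest = [2.0, 3.3, 4.0, 5.7, 6.0, 6.7]
performance = [40, 35, 60, 75, 70, 95]
rho = spearman_rho(interest, performance)
print(f"rho = {rho:.3f}")
```

Because only ranks enter the calculation, ρ captures any monotonic relationship and does not require either variable to be normally distributed.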

**Table 11.** Spearman's rank correlation.


The scatter plot in Figure 6 shows a monotonic, direct relationship between the two variables. This correlation signifies that as students' interest increases, their performance increases as well. Anxiety and perceived competence, the other subscales of the Intrinsic Motivation Inventory, were also tested but did not show any significant relation to academic performance.

**Figure 6.** Correlation between interest and performance.

Finally, we explored whether students who played the game showed as much interest in geography as students who attended the traditional class. The variable "interest" was not normally distributed in the experimental group, so we used the Mann–Whitney U test to compare the two groups. The mean rank of interest was higher in the experimental group (30.30) than in the control group (13.31) (results in Table 12).
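The mean ranks reported for this test come from ranking all scores of both groups together and then averaging the ranks within each group. The sketch below shows that calculation together with the U statistic; `midranks`, `mann_whitney`, and the score lists are hypothetical and not the study's data. The significance level is not computed here; a statistics package (for example `scipy.stats.mannwhitneyu`) would be used for that.

```python
def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def mann_whitney(a, b):
    """Mean rank of each group and the Mann-Whitney U statistic."""
    ranks = midranks(list(a) + list(b))  # rank both groups jointly
    ranks_a, ranks_b = ranks[:len(a)], ranks[len(a):]
    u_a = sum(ranks_a) - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return {
        "mean_rank_a": sum(ranks_a) / len(a),
        "mean_rank_b": sum(ranks_b) / len(b),
        "U": min(u_a, u_b),
    }


# Hypothetical interest scores on the 1-7 scale.
experimental = [6.0, 6.7, 5.3, 7.0, 5.7]
control = [3.0, 4.3, 2.7, 5.0, 3.7]
result = mann_whitney(experimental, control)
print(result)
```

A group whose mean rank is clearly higher tends to hold the larger values, and a U close to zero indicates almost no overlap between the two groups.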

**Table 12.** Results from the non-parametric Mann–Whitney U test.


As shown in Table 13, the significance level (*p* = 0 < 0.05) leads us to reject the null hypothesis that the two groups show equal interest, and confirms that the students in the experimental group showed more interest in the subject of geography than those in the control group.



#### 3.2.1. Digital Game

The results of the digital game questionnaire shed further light on how the game affects students' interest and strategic skills. Forty-seven sixth-grade students (27 girls and 20 boys) played digital Geopoly. Students rated the game, evaluated their performance, and compared the game to the traditional method of teaching.
