Article
Peer-Review Record

Extending the UTAUT Model of Gamified English Vocabulary Applications by Adding New Personality Constructs

Sustainability 2022, 14(10), 6259; https://doi.org/10.3390/su14106259
by Kexin Zhang and Zhonggen Yu *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 8 April 2022 / Revised: 8 May 2022 / Accepted: 18 May 2022 / Published: 20 May 2022
(This article belongs to the Special Issue Language Education in the Age of AI and Emerging Technologies)

Round 1

Reviewer 1 Report

Some examples of vocabulary learning apps could be provided, and the questionnaire appended.

Author Response

Reviewer comments: Some examples of vocabulary learning apps could be provided, and the questionnaire appended.

Response of authors: Many thanks for your careful suggestions. We have added an example of a gamified vocabulary learning app in the literature review (See 2.1.). The questionnaire we used has been attached to the Appendix document.

Special thanks to you for your valuable and constructive comments.

Reviewer 2 Report

UTAUT

The purpose of the study is to determine use of a language learning app through an extension of the UTAUT with personality traits, which itself seems to be a variant of the Theory of Planned Behaviour applied to technology use. I thank the authors for their work. It is obvious that quite an extensive intervention and data collection were involved. I have the following comments to improve your manuscript.

Methods:

I won’t question the model itself and the application of the study. However, did participants actually use a language learning app? Or was that part of the recruitment criteria? If it was mentioned, then it may not have been made clear. The focus of the study seemed to be around the UTAUT constructs and survey, and I was not sure whether a language learning app was actually involved. I understood the surveys were administered on wjx.cn, but I was not sure whether the survey included language learning tools or just the survey items.

 

How were your participants recruited? They came from a broad set of participants. What compensation was provided? Or was this through participation in your MOOC, i.e., your intervention (language learning app) was your MOOC, and this was a supplementary study evaluating use of the learning tool? I only connected the MOOC by looking at the funding; it would have been nice to have it included in the methods.

 

Results: 

But you didn’t have sphericity? 

If the results of your personality questions did not contribute substantially to your model, does this imply the original UTAUT may provide better fit? 

Was there any R2 for your path analysis? 

Would you consider presenting your covariance table, as that can be used to verify your outcomes?

Conclusion: I was not sure how this study contributed to the literature. Personally, I think the application of the UTAUT was nice. The lack of description of the intervention involved may have affected your modelling estimates. Taking away that context makes interpreting your results difficult. 

While I appreciated the traditional presentation of the hypotheses and results surrounding the research questions, it resulted in a study that skipped over the discussion, the use of the app, and many other interesting details. Perhaps revising to explain the added factors (personality) while leaving the UTAUT explanation intact would have been nice.

Analysis of just the UTAUT without the personality factors would be ideal. I don’t know why the personality was even considered as an effect. 

Editorial:

Please review the hypothesis number references (H14 is listed multiple times; there may also be other issues, such as the font for H1).

Sample size should be listed with N, not M for Male. I do not understand the standard deviation presented in Table 1, given that the values are all presented in a frequency table; please remove it.

Figure numbers are all off by 1. 

Could use a round of language edits. 

Author Response

Reviewer comment 1: I won’t question the model itself and the application of the study. However, did participants actually use a language learning app? Or was that part of the recruitment criteria? If it was mentioned, then it may not have been cleared. The focus of the study seemed to be around the UTAUT construct and survey, I was not sure if there was actually any involvement of a language learning app involved. I understood the surveys were administered on wjx.cn, I was not sure if the survey included language learning tools or just the survey.

Response of authors: Many thanks for your careful suggestions. We have added the criterion for recruiting our participants in the method part: “EFL learners who had used at least one gamified vocabulary learning app beforehand took part in our questionnaire survey (see 3.1). The survey did not include language learning tools; participants answered the questions based on their previous experience of using such apps.”

Reviewer comment 2: How were your participants recruited? They came from a broad set of participants. What compensation was provided? Or was this through the participation of your mooc, your intervention (language learning app) was your mooc, and this was a supplement study toward evaluating the learning tool use? I only connected the Mooc through looking at the funding, it would have been nice to have it included in the methods.

Response of authors: Many thanks for your careful suggestions. We have added how we recruited participants in the participant part (3.1): “The authors recruited participants both online and offline, through an online social media platform and at a university. The participants could obtain an adequate amount of money after finishing the questionnaire.” We have also provided a more specific description of how we recruited our participants in the research procedure part (3.3.2. The Distribution and Data Collection of the Questionnaire).

Reviewer comment 3: But you didn’t have sphericity?

Response of authors: We are truly sorry that we did not make the table clear enough. We made some adjustments and placed the sphericity test in Table 2: KMO and Bartlett's Test.
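For readers who wish to reproduce this kind of check outside SPSS/AMOS, the following is a minimal sketch of a KMO and Bartlett's test of sphericity in Python using the factor_analyzer package; the file name questionnaire_items.csv is a hypothetical stand-in for the item-level Likert responses, not the authors' actual data file.

```python
# Minimal sketch: KMO measure of sampling adequacy and Bartlett's test of sphericity
# on item-level questionnaire data (file name is hypothetical).
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

items = pd.read_csv("questionnaire_items.csv")  # one column per Likert item

chi_square, p_value = calculate_bartlett_sphericity(items)
kmo_per_item, kmo_overall = calculate_kmo(items)

print(f"Bartlett's test: chi-square = {chi_square:.2f}, p = {p_value:.4f}")
print(f"Overall KMO = {kmo_overall:.3f}")  # values above ~0.8 are usually considered adequate
```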

Reviewer comment 4: If the results of your personality questions did not contribute substantially to your model, does this imply the original UTAUT may provide better fit?

Response of authors: Many thanks for your insightful and critical thinking concerning our study. The purpose of our study is to explore what factors could influence the adoption of gamified vocabulary applications. Since many studies have extended the original UTAUT by including other relevant factors according to specific research purposes and targets, we extended the model with constructs covering personality factors, given their significant roles in language learning and in the acceptance of new technologies. Even though some of them did not show significant correlations with the intention and actual behavior of using the apps, some were found to exert influences that could not be ignored. More importantly, we tried removing the new constructs and re-analyzed the data, but the model fit was not better than that of the extended model. However, we are still truly grateful for your question, as it is rather thought-provoking.
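The comparison described above (original UTAUT versus the extended model fitted on the same data) can be illustrated with a hedged sketch. The authors worked in AMOS, so the snippet below only approximates the idea with the Python semopy package; all construct and item names, and the file survey_responses.csv, are illustrative rather than the authors' actual variable labels or data.

```python
# Hedged sketch: fit a baseline UTAUT structural model and an extended model
# (here with two illustrative personality constructs) on the same data and
# compare fit indices. Construct/item names are assumptions for illustration.
import pandas as pd
from semopy import Model, calc_stats

df = pd.read_csv("survey_responses.csv")  # hypothetical item-level data

original_utaut = """
PE =~ PE1 + PE2 + PE3
EE =~ EE1 + EE2 + EE3
SI =~ SI1 + SI2 + SI3
FC =~ FC1 + FC2 + FC3
BI =~ BI1 + BI2 + BI3
AU =~ AU1 + AU2 + AU3
BI ~ PE + EE + SI
AU ~ BI + FC
"""

extended_utaut = """
PE =~ PE1 + PE2 + PE3
EE =~ EE1 + EE2 + EE3
SI =~ SI1 + SI2 + SI3
FC =~ FC1 + FC2 + FC3
OP =~ OP1 + OP2 + OP3
ES =~ ES1 + ES2 + ES3
BI =~ BI1 + BI2 + BI3
AU =~ AU1 + AU2 + AU3
BI ~ PE + EE + SI + OP + ES
AU ~ BI + FC
"""

for name, desc in [("original", original_utaut), ("extended", extended_utaut)]:
    model = Model(desc)
    model.fit(df)
    stats = calc_stats(model)
    print(name, stats[["CFI", "TLI", "RMSEA", "AIC"]])
```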

Reviewer comments 5 and 6: Was there any R2 for your path analysis? Would you consider presenting your covariance table as that can be used to verify your outcome?

Response of authors: Thank you for your suggestion. We have added a new part reporting R² and the covariance table in the results section.
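As a rough illustration of what such an addition involves, the sketch below computes a covariance table of the observed items and an approximate R² for one endogenous variable from composite (mean) scores using OLS. This is not the AMOS squared-multiple-correlation output the authors report; the file and column names are hypothetical.

```python
# Hedged sketch: covariance table of observed indicators and an approximate
# R-squared for BI from composite scores (names are illustrative assumptions).
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("survey_responses.csv")  # hypothetical item-level data

# Sample covariance matrix of the observed items
cov_table = df.cov()
print(cov_table.round(3))

# Composite scores: mean of each construct's items
composites = pd.DataFrame({
    "BI": df[["BI1", "BI2", "BI3"]].mean(axis=1),
    "PE": df[["PE1", "PE2", "PE3"]].mean(axis=1),
    "EE": df[["EE1", "EE2", "EE3"]].mean(axis=1),
    "SI": df[["SI1", "SI2", "SI3"]].mean(axis=1),
})

# Approximate R-squared for BI regressed on its predictors
X = sm.add_constant(composites[["PE", "EE", "SI"]])
print(sm.OLS(composites["BI"], X).fit().rsquared)
```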

Reviewer comment 7: I was not sure how this study contributed to the literature. Personally, I think the application of the UTAUT was nice. The lack of description of the intervention involved may have affected your modelling estimates. Taking away that context makes interpreting your results difficult.

Response of authors: We are grateful for this piece of advice. To clearly address the contribution of this study to previous literature, we added the following description in the discussion section: “Although numerous studies explored the role of vocabulary learning applications in enhancing learners’ vocabulary learning and acquisition, few studies intended to investigate elements in these applications that could influence users’ acceptance of the technology. Knowing what factors influence users’ use intention and behavior can assist application developers in improving the vocabulary software and thus promote the use of vocabulary apps. To complement this missing link in the literature, this study innovatively employed the UTAUT model to examine the proposed relationships.”

Reviewer comments 8 and 9: While I appreciated the traditional presentation of the hypothesis and results surrounding the research questions, it resulted in a study that skipped over the discussion, the use of the app, and many other interesting details. Perhaps revising to explain the added factors(personality) and leaving the UTAUT explanation intact would have been nice. Analysis of just the UTAUT without the personality factors would be ideal. I don’t know why the personality was even considered as an effect.

Response of authors: We are truly sorry that we did not make it clear why we extended the UTAUT model by introducing some of the personality factors as new constructs. Thus, we stressed this part by adding more descriptions in the literature review: “Although the UTAUT model has fixed constructs, researchers can also extend it by including new ones. Performance expectancy (PE), effort expectancy (EE), social influence (SI), and facilitating conditions (FC) were the four main determinants of users’ intention to utilize technology and actual use in the UTAUT [23]. Moreover, gender, age, experience, and voluntariness of use were taken into account to moderate users’ individual differences in their technology acceptance. Most studies extended the original UTAUT by including other relevant factors according to specific research purposes and targets, e.g., student-centric learning [16], teacher feedback and compatibility [20], and management learning [23]. This research extended the UTAUT initially proposed by Venkatesh et al. by including attitudes towards behavior (ATB), openness (OP), emotional stability (ES), positive competition (PC), and perseverance of effort (POE). The reason the authors included some of the personality factors was that personalities could influence users’ intention and actual behavior of using a certain system (e.g., people with a high level of openness were willing to try out new things and experiences, and they would be likely to have a strong intention to use a gamified vocabulary application).”

In addition, we have also explained more of the added personality factors as suggested in the discussion part: “The newly introduced latent variables have supplemented the relevant literature. Attitudes towards behavior were implemented in previous studies which adopted the UTAUT model (e.g., Embi and Altalhi) [26,31]. This research also included it as a factor to sustain the original model and had new discoveries. Openness, emotional stability, positive competition, and perseverance of effort were predicted to exert a positive influence on behavioral intention given that they played a vital role in learning new things and knowledge; however, some of the findings contradicted the hypotheses. Specific details and discussions are as follows.”

Reviewer comments on editorial issues: Please review the hypothesis number references (H14 is listed multiple times; there may also be other issues, such as the font for H1). Sample size should be listed with N, not M for Male. I do not understand the standard deviation presented in Table 1, given that the values are all presented in a frequency table; please remove it. Figure numbers are all off by 1. Could use a round of language edits.

Response of authors: We are sincerely grateful for the reviewer’s carefulness. We have made the corrections, e.g., the hypothesis number references and fonts, and we have removed the standard deviation presented in Table 1 as suggested.  

Special thanks to you for your valuable and constructive comments.

Reviewer 3 Report

The study investigates factors that may affect language learners’ use of gamified English vocabulary applications. Thus, the study sheds light on what factors might motivate or demotivate the use of such applications.

Overall, the design of the study appears to be solid in that it involves a large number of participants who were presented with measures tapping the intended constructs (these had mostly been designed based on existing measures of the constructs, which was thus likely to increase their validity). The reporting of the results is mostly appropriate but needs to be elaborated in at least two ways detailed below. In addition, the discussion of the findings and the limitations of the study need to be somewhat expanded.

The description of the need and aim of the study is clear. The literature review appears sufficient (I am not an expert on research on language learning applications, however) and the selection of the UTAUT model for the study is well justified. The presentation of the application of the UTAUT model to the study (in Figure 2) is clear and informative, and so are the explicit hypotheses of the study on pages 4-6. It is also good to see that the construct selection is justified and the measures (Likert scale statements) are mostly based on previous research and operationalisation of those constructs. There were 3 measures per construct (apart from one that had more), which is sufficient for CFA (even if for one construct one of the measures had to be dropped).

Appendix 1 allows the reader to see the actual statements. In some cases, the statements tapping the same construct seem quite heterogeneous but apart from the Social Influence construct they turned out to work sufficiently well according to the analyses. However, it would be useful if the authors elaborated a bit on the selection process of at least some of the measures / statements. Presumably the original scales (used in previous research) from which the statements were selected (and often modified to fit this study of vocabulary apps) were longer, that is, they contained more than three statements. How did the authors go about choosing those 3 statements and what kind of criteria did they use in their selection? There may not be enough space for a detailed account of this but some examples probably suffice.

Reporting of the results and analyses. The scales worked surprisingly well considering their short nature (only 3 items in most cases), as the quite high Cronbach’s alphas indicate, so the measures were robust in this sense. The model fit of the measurement part of the model was reported in Table 4, and was sufficient. However, it is important to see the descriptive statistics of the measures listed in Table 3, e.g. in an Appendix (e.g. means and standard deviations), as the degree of normality of these variables can affect the results and/or influence the specification of the estimation procedures.

The results of the SEM model are presented on pages 11-12 (the path analysis). However, the indices of the model fit of this structural model (which also contains the measurement part of the model, not shown as a figure in the manuscript but reported in Table 3) are not reported in the text. How well did this (fuller) model fit the data? And were any modifications needed to achieve satisfactory fit?

It seems that the authors collected more information from the participants than what is reported in this manuscript, and some note would be useful regarding the nature of that additional information even if it was apparently not relevant for this article.

Some notes on the Discussion of the findings. It seems to me that one factor that might have affected the findings could relate to the fact that the participants were mostly Chinese. Therefore, it might be useful to consider how they view Effort Expectancy in foreign language learning in general, not just related to learning vocabulary with apps. Can the expectancy of learning English as easy vs difficult in general play a role in the findings? Could there be cultural differences in such expectancies (or differences related to the educational culture of the countries where e.g. Effort Expectancy has been studied)? Furthermore, how might the learners’ familiarity with (particular) vocabulary applications relate and possibly explain some of the findings?

Openness and Behavioral Intention were negatively related but it could be noted that while the relationship was significant, it was still quite small compared to most of the others found in the study.

One limitation of the study appears to be the nature of data underlying Actual Use of gamified applications. The data are based on participants’ self-reports / estimates rather than collection of behavioral information about their use of such apps (e.g. longitudinally or in relation to the use of a particular vocabulary app). Such self-reporting involves an amount of uncertainty and/or error, which should be acknowledged.

 

Some minor points:

  • page 3: the references to Figures 2 and 3 seem wrong (the figure is Figure 1 but it is referred to as Figure 2 in the text; similarly, Figure 2 is referred to as Figure 3)

Author Response

Reviewer comment 1: Appendix 1 allows the reader to see the actual statements. In some cases, the statements tapping the same construct seem quite heterogeneous but apart from the Social Influence construct they turned out to work sufficiently well according to the analyses. However, it would be useful if the authors elaborated a bit on the selection process of at least some of the measures / statements. Presumably the original scales (used in previous research) from which the statements were selected (and often modified to fit this study of vocabulary apps) were longer, that is, they contained more than three statements. How did the authors go about choosing those 3 statements and what kind of criteria did they use in their selection? There may not be enough space for a detailed account of this but some examples probably suffice.

Response of authors: Many thanks for your careful suggestions. We have added examples and explanations of how we selected the items from the original scales in 3.2 (research instrument): “In scales with more than three items for a construct, the authors selected the items that were not close in meaning. For instance, there were 6 items for POE in the original scale [50], but the item ‘even when I can do something more fun, I give language learning tasks my best effort’ had a similar meaning to the item ‘I am committed to the investment of my best effort in language learning tasks’. Thus, the authors selected only one of them to be adapted in this research.”

Reviewer comment 2: However, it is important to see the descriptive statistics of the measures listed in Table 3, e.g. in an Appendix (e.g. means and standard deviations), as the degree of normality of these variables can affect the results and/or influence the specification of the estimation procedures.

Response of authors: Many thanks for your careful suggestions. The output of estimates from AMOS presented in the manuscript includes standardized regression weights, S.E. (standard error), and p-values. We did not see means and standard deviations in the automatic output; however, it did include an assessment of normality, which met the requirement. As suggested, we have put the normality table in the appendix.
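As an illustration only, the sketch below reproduces the kind of per-item normality assessment AMOS provides (means, standard deviations, skewness, and kurtosis per indicator); it is not the authors' actual output, and the file name is hypothetical.

```python
# Hedged sketch: per-item descriptive statistics and normality indicators,
# analogous to AMOS's "Assessment of normality" table (file name is hypothetical).
import pandas as pd
from scipy.stats import skew, kurtosis

items = pd.read_csv("questionnaire_items.csv")

normality = pd.DataFrame({
    "mean": items.mean(),
    "sd": items.std(),
    "skewness": items.apply(skew),
    "kurtosis": items.apply(kurtosis),  # Fisher definition: 0 for a normal distribution
})
print(normality.round(3))
```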

Reviewer comment 3: The results of the SEM model are presented on pages 11-12 (the path analysis). However, the indices of the model fit of this structural model (which also contains the measurement part of the model, not shown as a figure in the manuscript but reported in Table 3) are not reported in the text. How well did this (fuller) model fit the data? And were any modifications needed to achieve satisfactory fit?

Response of authors: Many thanks for your thoughtful questions. We used the modification indices (MI) function in AMOS to make some modifications. We have clarified this in our manuscript to avoid ambiguity: “To achieve a satisfactory level of model fit, we used the modification indices (MI) function in AMOS to make some modifications by building new relationships between variables as suggested by the MI. Table 4 presents the results of the modified model…”

Reviewer comment 4: It seems that the authors collected more information from the participants than what is reported in this manuscript, and some note would be useful regarding the nature of that additional information even if it was apparently not relevant for this article.

Response of authors: Many thanks for your advice. We have added more notes on the participants’ information in our manuscript: “In addition, more than half of the participants (54.7%) were between the ages of 21 and 30, accounting for the largest proportion. Although both liberal arts students and science and engineering students participated in this survey, there were considerably more liberal arts students (82.9%) than students in the other majors.”

Reviewer comment 5: Some notes on the Discussion of the findings. It seems to me that one factor that might have affected the findings could relate to the fact that the participants were mostly Chinese. Therefore, it might be useful to consider how they view Effort Expectancy in foreign language learning in general, not just related to learning vocabulary with apps. Can the expectancy of learning English as easy vs difficult in general play a role in the findings? Could there be cultural differences in such expectancies (or differences related to the educational culture of the countries where e.g. Effort Expectancy has been studied)? Furthermore, how might the learners’ familiarity with (particular) vocabulary applications relate and possibly explain some of the findings?

Response of authors: Thank you for your thoughtful questions. Effort expectancy in the UTAUT model refers to the expected degree of ease of using a particular technological system, so we mainly discussed users’ expectancy of using gamified vocabulary apps. However, your question about the influence of participants’ nationality raises an important factor to consider. Thus, we added this participant bias to the limitations part: “…although this study includes participants from different countries, most of them (over 90%) are Chinese, so the results cannot be generalized because of possible cultural differences and varying perceptions of and attitudes toward learning English.” Furthermore, learners’ familiarity with certain vocabulary apps belongs to the construct of effort expectancy according to its definition, which can affect users’ behavioral intention and actual use of the apps. Thus, in our study, we also explored the relationships between effort expectancy, behavioral intention, and actual use, and presented our findings in the results and discussion sections.

Reviewer comment 6: Openness and Behavioral Intention were negatively related but it could be noted that while the relationship was significant, it was still quite small compared to most of the others found in the study.

Response of authors: Many thanks for your carefulness in reading our manuscript. We added this description in the results part: “Therefore, only hypotheses 1, 4b, and 10 are supported, although the correlation between OP and BI is smaller than those of the other relationships.”

Reviewer comment 7: One limitation of the study appears to be the nature of data underlying Actual Use of gamified applications. The data are based on participants’ self-reports / estimates rather than collection of behavioral information about their use of such apps (e.g. longitudinally or in relation to the use of a particular vocabulary app). Such self-reporting involves an amount of uncertainty and/or error, which should be acknowledged.

Response of authors: Many thanks for your suggestion. We have added this limitation to our manuscript: “It should be acknowledged that the data are based on participants’ self-reports rather than the collection of behavioral information about their actual use of such apps. Such self-reporting involves an amount of uncertainty.”

Reviewer comment 8: page 3: reference to the Figures 2 and 3 seems wrong (the Figure is 1 but it is referred to as Figure 2 in the text; similarly, Figure 2 is referred to as 3)

Response of authors: We are sincerely grateful for the reviewer’s carefulness. We have made the corrections in our manuscript.

Special thanks to you for your valuable and constructive comments.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Issues Addressed

Reviewer 3 Report

Thank you for the revisions to the manuscript and the accompanying explanations. I find the revisions to address the points I raised in my review sufficiently well so I'm happy to recommend that the study be published.
