Article
Peer-Review Record

Writing with AI: What College Students Learned from Utilizing ChatGPT for a Writing Assignment

Educ. Sci. 2024, 14(9), 976; https://doi.org/10.3390/educsci14090976
by Changzhao Wang 1,*, Stephen J. Aguilar 1, Jennifer S. Bankard 2, Eric Bui 1 and Benjamin Nye 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 18 July 2024 / Revised: 29 August 2024 / Accepted: 1 September 2024 / Published: 4 September 2024
(This article belongs to the Topic Artificial Intelligence for Education)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This article presents the results of a survey of university students on their perception of how the use of ChatGPT influences the improvement of writing. It is based on the responses of 47 students in a university writing course who, after completing a task, compare their writing with and without ChatGPT. Among other things, in the results, students point out both strengths and weaknesses of ChatGPT for writing, as well as positive attitudes towards the use of this tool.

This is a highly topical work, which fills a gap in empirical work regarding ChatGPT and its contribution to the improvement of writing, in this case taking into account student perception.

Below are more detailed comments on each of the sections of the work:

SUMMARY AND KEYWORDS

The summary is adequate, as it includes information about the objective, method and results. However, authors are encouraged to include some elements of the discussion and conclusions so that the summary complies with the IMRaD format.

Keywords represent and focus the content of the work.

 

INTRODUCTION

Although, as the authors indicate, there is little empirical evidence on the topic of ChatGPT and learning to write from the students' perspective, the contextualization and argumentation can be reinforced with other types of studies or more theoretical works that analyze the opinion of students regarding the role of ChatGPT in their teaching-learning process.

One of the most controversial and worrying aspects of the use of tools such as ChatGPT is not taken into account: plagiarism and the difficulty in detecting it. It may be interesting, if the authors consider it so, to refer to this problem in the Introduction.

 

RESEARCH QUESTIONS

If the authors consider it appropriate, it might be interesting to establish a general objective for the work, which would reflect the purpose of the study, which is none other than to analyze the students' perception of the use of ChatGPT for the development of writing.

In this way, this general objective would encompass and centralize the research questions indicated by the authors, which would be transformed into specific objectives.

 

METHOD

The study refers to junior and senior students, but it might be interesting to indicate the age range in which these categories fall, as well as to indicate the average and standard deviation of the age of the students in both categories.

The type of research design is not indicated in the paper. It also does not indicate why it was decided to use an open survey to collect information or what tool was used to analyse the information in questions 1-4. The tool used for the sentiment analysis of question 5 is indicated.

The authors explain the type of task to be carried out by the students, as well as the content of the survey used to collect information, but there is no reference to whether the students received training on generative AI and on how to make proper and efficient use of these tools to carry out the assigned task. Nor is it indicated whether the type of previous experience of the students in the use of this type of tools is known (or if they were asked about this before preparing the task). Previous use, familiarity with generative AI or training in its use are important factors when analyzing and interpreting the results obtained.

To facilitate the replicability of the study, it would be interesting for the authors to indicate whether any clearer and more precise instructions were provided for the elaboration of the task to be carried out by the students, beyond the information on the objective and content of the products that were requested from the students, already indicated in the work.

 

RESULTS

The results of the study are correctly and clearly described. Furthermore, these results are consistent with the research questions posed and, in turn, with the questions of the designed questionnaire. The structure and order of presentation of the results coincides with the order of the research questions.

 

DISCUSSION AND CONCLUSIONS

Although the empirical background on this specific topic (ChatGPT and writing), as the authors point out, is limited, the discussion could be strengthened, for example, by sharing the results with other studies that analyze the strengths and disadvantages of the ChatGPT tool for students, or that analyze the students' attitudes/motivation towards the tool. Reference could also be made to this in the Introduction of the work.

The conclusions focus on providing an answer to the research questions raised; however, limitations of the study or future research proposals are not included. On the other hand, perhaps greater emphasis could be placed on the implications of the study for research and the improvement of the teaching-learning process in the area of learning to write.

 

CITATIONS AND REFERENCES

Citations and references do not present the format required by the journal. The template for preparing the article contains instructions for preparing citations and references.

Author Response

Response to Reviewer 1 Comments

Thank you so much for your time to review this manuscript and provide helpful feedback. We also appreciate your positive view on the significance and contribution of this study. Please find our responses to your detailed comments below and the corresponding revisions in track changes in the revised manuscript.

Comment 1: SUMMARY AND KEYWORDS

The summary is adequate, as it includes information about the objective, method and results. However, authors are encouraged to include some elements of the discussion and conclusions so that the summary complies with the IMRaD format.

Keywords represent and focus the content of the work.

Response 1: Thank you for pointing out that our abstract lacked closing material from the discussion and conclusions. We have now shortened the results portion and added closing sentences on the implications of the study (see Page 1, Line 21-22 in the revised manuscript).

Comment 2: INTRODUCTION

Although, as the authors indicate, there is little empirical evidence on the topic of ChatGPT and learning to write from the students' perspective, the contextualization and argumentation can be reinforced with other types of studies or more theoretical works that analyze the opinion of students regarding the role of ChatGPT in their teaching-learning process.

One of the most controversial and worrying aspects of the use of tools such as ChatGPT is not taken into account: plagiarism and the difficulty in detecting it. It may be interesting, if the authors consider it so, to refer to this problem in the Introduction.

Response 2: We appreciate your helpful suggestions on the introduction. Now we have added more literature regarding students’ views/attitudes toward the use of ChatGPT (Page 2, Line 63-77), including the investigation of the correlation between students’ attitude toward traditional plagiarism and their attitude toward plagiarism using ChatGPT, as well as research on 200 Vietnamese students’ perceptions of using ChatGPT for their learning in general.

We also added more literature about the difficulty in detecting the use of ChatGPT for plagiarism in the first paragraph of Introduction (Page 2, Line 29-33), to provide readers a more comprehensive understanding of the necessity and significance of research on the use of ChatGPT in education.

Comment 3: RESEARCH QUESTIONS

If the authors consider it appropriate, it might be interesting to establish a general objective for the work, which would reflect the purpose of the study, which is none other than to analyze the students' perception of the use of ChatGPT for the development of writing.

In this way, this general objective would encompass and centralize the research questions indicated by the authors, which would be transformed into specific objectives.

Response 3: We agree with you that it’s great to have a general objective for the study, which we already have but in a slightly different phrasing. Since our research questions are not only about perceptions (RQ 2 – 5 are about perceptions, but RQ1 is about their writing process with AI), we phrased our general objective as “this study seeks to better understand how undergraduate students make use of AI technologies when they are encouraged, rather than discouraged, from doing so” (Page 1, Line 43–45). It was restated before the research questions, “Therefore, we conducted this study to examine how students in advanced writing courses used AI (i.e., ChatGPT) to complete an assignment, guided by the following research questions” (Page 3, Line 101–103). Hopefully, this addressed the reviewer’s comment.

Comment 4: METHOD

The study refers to junior and senior students, but it might be interesting to indicate the age range in which these categories fall, as well as to indicate the average and standard deviation of the age of the students in both categories.

Response 4: We agree that adding the mean and standard deviation of student age can provide readers with a more concrete idea of their age range. However, the window for collecting this information has passed. Considering that this study does not address any research questions related to students’ age, we believe the reported numbers of juniors and seniors are sufficient for this study.

Comment 5: METHOD

The type of research design is not indicated in the paper. It also does not indicate why it was decided to use an open survey to collect information or what tool was used to analyse the information in questions 1-4. The tool used for the sentiment analysis of question 5 is indicated.

Response 5: Thank you for identifying the missing but necessary information. We have addressed the issues accordingly. (1) We added a Research Design section to clarify that it is a pre-experimental design with a single group and post-assessment only (Page 4, Line 135–139), a suggestion for which we thank Reviewer 3. (2) We explained why we decided to use an open-ended survey on Page 4, Line 150–157. The main reason is that the open-ended format allowed us to collect detailed information about students’ perceptions regarding their experience of the writing assignments. (3) For the coding of students’ responses to Q1 – Q4, we (the two coders) coded manually in Excel spreadsheets. We added this information in the Data Analysis section (Page 5, Line 176–179). We also added the final version of our codebook as Supplementary Material, to provide readers with more details of our coding.

Comment 6: METHOD

The authors explain the type of task to be carried out by the students, as well as the content of the survey used to collect information, but there is no reference to whether the students received training on generative AI and on how to make proper and efficient use of these tools to carry out the assigned task. Nor is it indicated whether the type of previous experience of the students in the use of this type of tools is known (or if they were asked about this before preparing the task). Previous use, familiarity with generative AI or training in its use are important factors when analyzing and interpreting the results obtained.

To facilitate the replicability of the study, it would be interesting for the authors to indicate whether any clearer and more precise instructions were provided for the elaboration of the task to be carried out by the students, beyond the information on the objective and content of the products that were requested from the students, already indicated in the work.

Response 6: We appreciate your suggestion and agree that students’ prior experience/training is important for the interpretation of results and the replicability of the study. Now we have added that information under Data Collection section (Page 3-4, Line 145–149): “Prior to this writing assignment, the instructor did not give any instructions on how to use AI tools for writing. In addition, to the best of the instructor’s knowledge, students did not have any prior experience of using AI tools for course writing assignments or receive any training on using AI tools for formal writing.”

Comment 7: RESULTS

The results of the study are correctly and clearly described. Furthermore, these results are consistent with the research questions posed and, in turn, with the questions of the designed questionnaire. The structure and order of presentation of the results coincides with the order of the research questions.

Response 7: We appreciate your positive comment.

Comment 8: DISCUSSION

Although the empirical background on this specific topic (ChatGPT and writing), as the authors point out, is limited, the discussion could be strengthened, for example, by sharing the results with other studies that analyze the strengths and disadvantages of the ChatGPT tool for students, or that analyze the students' attitudes/motivation towards the tool. Reference could also be made to this in the Introduction of the work.

Response 8: Thank you for the suggestions, and we have now added a paragraph in the Discussion section to discuss the comparison of our results and the results in prior studies (see Page 13, Line 479-493). The Introduction section has also fully incorporated your comments (please see our Response 2).

Comment 9: CONCLUSIONS

The conclusions focus on providing an answer to the research questions raised; however, limitations of the study or future research proposals are not included. On the other hand, perhaps greater emphasis could be placed on the implications of the study for research and the improvement of the teaching-learning process in the area of learning to write.

Response 9: Thank you for the suggestions. We have now added a Limitations section on the limitations of the study and the corresponding proposals for future research (Page 14, Line 535–546).

Comment 10: CITATIONS AND REFERENCES

Citations and references do not present the format required by the journal. The template for preparing the article contains instructions for preparing citations and references.

Response 10: Thank you for pointing this out. We have updated the format of all in-text citations and the reference list to meet the requirements of the journal.

Reviewer 2 Report

Comments and Suggestions for Authors

Well done! As you noted, students' perceptions and actual use of ChatGPT haven't been as prevalent as papers about the actual technology. It was very interesting to read about how the students responded in terms of the strengths and limitations in aiding their work.

A few comments:

1. Could students' perceptions toward ChatGPT vary based on the university used in the study? It might be valuable to highlight that the results of this study could change depending on the specific university involved. For instance, at the university where I work, some students refuse to use any form of Generative AI due to copyright dilemmas faced by artists and creators. It may be worth noting that replicating this study at different universities could yield varying results.

2. Since one of the students used Bard instead of ChatGPT, it might be interesting to include that ChatGPT was the expected form of Generative AI, as inferred in the introduction discussing its history.

3. I recommend clearly stating your hypothesis. I think it would be interesting to discuss, maybe based on prior research studies, how the students were expected to respond. You could circle back to it in the conclusion and note how it differed or was as expected. 

4. A substantial number of articles have been published on the ethical dilemmas and usage of ChatGPT within the university context. Including a few more references from these articles could further strengthen this manuscript, especially within the academic integrity/ethical and history of ChatGPT sections. 

Author Response

Response to Reviewer 2 Comments

Thank you so much for your time to review this manuscript and provide helpful feedback. We appreciate your positive view on the significance and contribution of this study. Please find our responses below and the corresponding revisions in track changes in the revised manuscript.

Comment 1: Could students' perceptions toward ChatGPT vary based on the university used in the study? It might be valuable to highlight that the results of this study could change depending on the specific university involved. For instance, at the university where I work, some students refuse to use any form of Generative AI due to copyright dilemmas faced by artists and creators. It may be worth noting that replicating this study at different universities could yield varying results.

Response 1: Thank you for sharing your experience and perspective. We have incorporated it in the Limitations section (see Page 14, Line 535–546 in the revised manuscript) as follows:

“Second, the convenience sampling limits the generalizability of the study. The student participants were recruited from a single university, and they cannot represent all college students since the student population in each higher-education institution may vary depending on its country, location, culture, etc. It will be interesting for future research to investigate if the study will yield similar results when replicating it at different universities.”

Comment 2: Since one of the students used Bard instead of ChatGPT, it might be interesting to include that ChatGPT was the expected form of Generative AI, as inferred in the introduction discussing its history.

Response 2: Thank you for this suggestion. We have incorporated it in our Discussion section (see Page 13, Line 476-478).

Comment 3: I recommend clearly stating your hypothesis. I think it would be interesting to discuss, maybe based on prior research studies, how the students were expected to respond. You could circle back to it in the conclusion and note how it differed or was as expected.

Response 3: Thank you for the suggestions. Regarding the suggestion of stating hypotheses, our concern is that, for open-ended questions and qualitative analysis, stating hypotheses in advance may give the impression that we held strong assumptions while doing the research. That may harm the objectivity of the research, which is different from stating hypotheses for statistical analysis. But we agree that it is interesting to discuss how our findings are similar to or different from prior research studies. We have enriched our Discussion section by incorporating this comment in a new paragraph (Page 13, Line 479-493). Hopefully, our response and revision make sense to the reviewer.

Comment 4: A substantial number of articles have been published on the ethical dilemmas and usage of ChatGPT within the university context. Including a few more references from these articles could further strengthen this manuscript, especially within the academic integrity/ethical and history of ChatGPT sections.

Response 4: Thank you for your helpful suggestion. We added more literature on the ethical dilemmas and the usage of ChatGPT in the university context: (1) the difficulty in detecting the use of ChatGPT for plagiarism in the first paragraph of Introduction (Page 1, Line 29-33), to provide readers a more comprehensive understanding of the necessity and significance of research on the use of ChatGPT in education; (2) students’ views/attitudes toward the use of ChatGPT (Page 2, Line 63-77), including the investigation of the correlation between students’ attitude toward traditional plagiarism and their attitude toward plagiarism using ChatGPT, as well as research on 200 Vietnamese students’ perceptions of using ChatGPT for their learning in general.

Reviewer 3 Report

Comments and Suggestions for Authors

This paper consists of a pre-experimental design with convenience sampling which consists of the students of the researcher(s). The main purpose of the paper is to delve into the way in which college students make use of LLMs to enhance the quality of their writing. In this very case, the research focuses on ChatGPT.

Even though pre-experimental designs with post-test have their limitations – which should be acknowledged in the paper – they are of interest to the community because they provide insights on emerging topics like the use of AI. So, from my point of view, the design and the sample are appropriate, but they should be properly justified.

The references are poor, and this aspect can be improved since authors should elaborate a state of the art in which the latest findings on the topic are clearly explained. In this regard, I think they should develop a more thorough review of the existing literature. I would also like to mention that it is important to stick to the guidelines of the journal, and they have used a different citing system. So, this aspect can be improved.

The data treatment is the area in which I have major concerns, since we are not to forget that Education (MDPI) is a high-impact journal, and I consider that the statistical methods which have been implemented are not strong enough for the journal. The use of descriptive data is illustrative, but inferential statistics should also be taken into account, particularly when trying to publish a piece of research in such an important journal. That is why I strongly recommend revisiting this point. The use of tests like Chi-square could be of great help to compare the frequencies and to determine the significance of the differences. I would also like to mention that the author(s) should consider whether Table 1 is necessary or not, for the length of the answers is not a valuable source of information. The inclusion of the sentiment analysis for the qualitative section is something that I have valued positively because I think it provides the analysis with a stronger ground.

The section devoted to strategies to regulate the use of AI is quite interesting, but I consider it should be included in the conclusions to stick to the IMRD structure of research papers in education.

Comments on the Quality of English Language

The quality of the English language in this article is good.

Author Response

Response to Reviewer 3 Comments

Thank you so much for your time to review this manuscript and provide helpful feedback. Please find our responses below and the corresponding revisions in track changes in the re-submitted file.

Comment 1: Even though pre-experimental designs with post-test have their limitations – which should be acknowledged in the paper – they are of interest to the community because they provide insights on emerging topics like the use of AI. So, from my point of view, the design and the sample are appropriate, but they should be properly justified.

Response 1: Thank you for bringing up “pre-experimental design” as the research design for our study. We have added it in the Methods (Page 4, Line 135–139 in the revised manuscript) and acknowledged the limitations of this research design and convenience sampling in the Limitations section (Page 14, Line 535–546).

Comment 2: The references are poor, and this aspect can be improved since authors should elaborate a state of the art in which the latest findings on the topic are clearly explained. In this regard, I think they should develop a more thorough review of the existing literature.

Response 2: Thank you for the suggestion. We have incorporated more relevant literature in the Introduction and Discussion. In the Introduction, we added more literature about the difficulty in detecting the use of ChatGPT for plagiarism in the first paragraph of Introduction (Page 2, Line 29-33), to provide readers a more comprehensive understanding of the necessity and significance of research on the use of ChatGPT in education; we also added more literature regarding students’ views/attitudes toward the use of ChatGPT (Page 2, Line 63-77), including the investigation of the correlation between students’ attitude toward traditional plagiarism and their attitude toward plagiarism using ChatGPT, as well as research on 200 Vietnamese students’ perceptions of using ChatGPT for their learning in general. In the Discussion, we have now added a new paragraph to discuss the comparison of our results and the results in prior studies (see Page 13, Line 479-493).

Comment 3: In this regard, I would like to mention that it is important to stick to the guidelines of the journal and they have used a different citing system. So, this aspect can be improved.

Response 3: Thank you for pointing this out. We have updated the format of all in-text citations and the reference list to meet the requirements of the journal.

Comment 4: The data treatment is the area in which I have major concerns, since we are not to forget that Education (MDPI) is a high-impact journal, and I consider that the statistical methods which have been implemented are not strong enough for the journal. The use of descriptive data is illustrative, but inferential statistics should also be taken into account, particularly when trying to publish a piece of research in such an important journal. That is why I strongly recommend revisiting this point. The use of tests like Chi-square could be of great help to compare the frequencies and to determine the significance of the differences.

Response 4: We appreciate your suggestion of adding a Chi-squared test. We have carefully considered the suitability of a Chi-squared test for our study by reviewing our data, research questions, analysis, and results again. However, we believe that using a Chi-squared test to examine potential associations between categorical variables goes beyond the scope of this study. We will consider using it for further analysis in a follow-up study, and thank you again for the suggestion.

For the current study, the qualitative thematic analysis method is appropriate and sufficient to answer our Research Questions 1 – 4. Our qualitative analysis has gone through rigorous scientific procedures (Page 5, Line 173-199) to ensure the validity and reliability of the study. We also added the final version of our codebook as Supplementary Material, to provide readers with more details of our coding. But we agree that qualitative analysis has its limitations, which we have added in the Limitations section (Page 14, Line 535–546) to propose future research with “experimental design and statistical methods”. Hopefully, these revisions address your concern.
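For readers unfamiliar with the test discussed in this exchange, the following is a minimal sketch of a Pearson Chi-squared test of independence on a 2x2 table of coded-response frequencies. It is not part of the authors' analysis. The function name `chi_square_2x2`, the grouping (class standing by attitude), and the counts are all hypothetical and are not taken from the study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, for which the chi-square
    # survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# HYPOTHETICAL counts, invented for illustration only (not study data):
# rows = juniors, seniors; columns = positive attitude, other attitude.
hypothetical_counts = [[12, 8], [9, 11]]
stat, p = chi_square_2x2(hypothetical_counts)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

With samples as small as the one described in this record, some expected cell counts may fall below about 5, in which case an exact test such as Fisher's is usually preferred over the Chi-squared approximation.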

Comment 5: I would also like to mention that the author(s) should consider whether Table 1 is necessary or not, for the length of the answers is not a valuable source of information. The inclusion of the sentiment analysis for the qualitative section is something that I have valued positively because I think it provides the analysis with a stronger ground.

Response 5: Thank you for sharing your view on Table 1. For qualitative analysis of students’ open-ended responses, we believe it’s helpful to provide the length of responses, which shows that our research data is of high quality. Readers can expect more information shared by students to help us understand the research questions if each student answered in 100 words rather than 10 words. Therefore, we would like to keep this table.

Comment 6: The section devoted to strategies to regulate the use of AI is quite interesting, but I consider it should be included in the conclusions to stick to the IMRD structure of research papers in education.

Response 6: We agree with the reviewer’s comment, and we have included the discussions on strategies to regulate the use of AI in the conclusions (Page 14, Line 564-566). We have also added that as a brief summary sentence at the end of the abstract (Page 1, Line 21-22).

Round 2

Reviewer 3 Report

Comments and Suggestions for Authors

As I mentioned in the former report, this paper consists of a pre-experimental design with convenience sampling which consists of the students of the researcher(s). The main purpose of the paper is to delve into the way in which college students make use of LLMs to enhance the quality of their writing. In this very case, the research focuses on ChatGPT.

Even though pre-experimental designs with post-test have their limitations – which have now been acknowledged in the paper – they are of interest to the community because they provide insights on emerging topics like the use of AI. So, from my point of view, the design and the sample are appropriate, and they have been justified properly.

The references are now richer, and this aspect has been improved. Another positive point is that authors have stuck to the guidelines for citation.

I still have concerns with the data treatment. Even though the authors justify their approach, I’m afraid I still consider it vague for such a high-impact journal. From my point of view, the implementation of statistical methods is the weakest point of the study. Nonetheless, if the Editor considers it appropriate, I have no objections to the publication of the paper.

Therefore, my decision is to accept it in its present form as long as the Editor agrees with the data analysis.

Comments on the Quality of English Language

The language quality in the article is good, and there are only a few typos that can be corrected with a simple proofreading.
