Article
Peer-Review Record

Exploring Bi-Directional Context for Improved Chatbot Response Generation Using Deep Reinforcement Learning

Appl. Sci. 2023, 13(8), 5041; https://doi.org/10.3390/app13085041
by Quoc-Dai Luong Tran *,† and Anh-Cuong Le
Reviewer 1:
Reviewer 2:
Reviewer 3:
Submission received: 8 March 2023 / Revised: 8 April 2023 / Accepted: 13 April 2023 / Published: 17 April 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Point 1: It would be clearer and more intuitive to provide specific numerical values in the Abstract and Conclusions sections.

 

Point 2: I would like to request a diagram and an explanation of the structure of BERT.
- Providing a brief description of BERT and how it is used in the proposed model could improve clarity for readers who are not familiar with it.

 

Point 3: As some sentences appear to require a few modifications, please check the whole text once again.
- e.g., line 228: "We splitted the training dataset into n pairs {(s_t, s_{t+1})}_{t=1}^{n}, where (s_t, s_{t+1}) represents the t-th pair consisting of an input and its corresponding target." The past tense (and past participle) of the verb "split" is "split," not "splitted." Therefore, the corrected sentence would be: "We split the training dataset into n pairs {(s_t, s_{t+1})}_{t=1}^{n}, where (s_t, s_{t+1}) represents the t-th pair consisting of an input and its corresponding target."

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The main aim of the paper was to use a bi-directional context, where the historical direction helps the model remember information from the past in the conversation, while the future direction enables the model to anticipate its subsequent impact. A transformer-based sequence-to-sequence model and a reinforcement learning algorithm were used to attain the study's objective of improved chatbot response generation using deep reinforcement learning, and to overcome limitations of prior studies that did not consider the relationship between utterances in conversations when generating responses. The authors showed the effect of the proposed model via qualitative evaluation of generated samples, which exhibited significant improvements in BLEU and ROUGE scores in comparison to the baseline model and other current related studies. Hence, this study makes an original contribution in terms of bi-directional context for improved chatbot response generation using deep reinforcement learning.

The Abstract was generally well written in that it adequately covers the background, main research aim, theoretical and model findings, but a couple of lines on the contributions/implications (theoretical and practical) should be included.

The Introduction section provided a solid argument that justifies the selection of the explanatory variables considered in the empirical analysis. Sufficient discourse on the research, theoretical explanations, and empirical evidence was provided, justifying why the research was conducted. The theoretical foundation on which the research was based was robust and adequately introduced in the Introduction and then suitably expanded in the literature review.

The paper does explore some relevant literature in that it included some recent academic sources, for example two from 2020, five from 2022 and two from 2023. However, I believe it is of great importance to increase the ratio of new sources, since much has been published on related topics in recent years. There is therefore room to conduct a fresh literature search and to incorporate a number of more recent academic sources throughout the paper to support the research.

The Proposed model section was adequate.

The Experiment and discussion section was also satisfactory, except that it is important to substantiate the “discussion” component with a greater number of relevant and recent sources to support your findings.

The Conclusions section was acceptable, but there was neither a theoretical contributions/implications subsection nor a practical contributions/implications subsection; each should be included as a separate subsection (i.e., with its own sub-heading) under the Conclusions section:

- Theoretical implications: What contribution has the research made to theory, based on the theoretical foundation/model that you developed?

- Practical implications: Explain the contribution that the research has made in terms of practical (managerial) implications.

Both of the aforementioned sections should be comprehensive and take the form of substantial sub-sections.

The Limitations and future research section was also not apparent. Please discuss the limitations associated with the study and avenues for future research as a separate section.

Referencing: The in-text referencing and reference list were adequate, except that the reference list should be supplemented by a number of recent academic sources.

There are also a number of minor grammatical/language issues throughout the paper, so the paper needs a language expert or grammarian to edit it and eradicate these errors.

Overall, an interesting paper that makes an original contribution! I suggest relatively minor revisions, but collectively these equate to a major revision before the paper can be accepted for publication!

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

After reading this paper, we found the following:

1- The abstract does not show any results.

2- Where is the contribution in the abstract?

3- ". Experimental results demonstrate the effectiveness of the proposed model through 15 qualitative evaluations of some generated samples, showing significant improvements in BLEU and 16 ROUGE scores compared to the baseline model and other current related studies." must more explain in detail 

4-"we aim to use a bi-directional context in which the 9 historical direction helps the model remember information from the past in the conversation what means the bi-direction,can do multi agent"

5-"n. This study proposes a Deep Reinforcement Learning model for 28 analyzing the influence of different contextual information on responses in a conversation 29 and how to combine them to improve the coherence, and consistency of a conversation." not exactly with this work more show with why 

6-", Seq2Seq models tend to generate generic responses regardless of the input 57 [5,8]." used another reference more quality 

7-"Table 1. A conversation example ", used more statement 

8-"o long-term goals gathered by the Reinforcement 105 Learning (RL) algorithm" has many problems who do you solve?

9- This is not clear: "In recent years, Deep reinforcement learning has attracted a lot of attention in NLP because many NLP tasks can be formulated as RL problems."

10-"Figure 2. The architecture of Decoder. In which, each Bert block has cross-attention layers added between the self-attention layer and the two feed-forward layers." not perfect; must redraws 

11- In line 332, the question is not correct.

12-"where γ is the discount factor which adjusts the importance of rewards over time in 362 reinforcement learning algorithm" in think the values between [0..1]

13-"Table 2. Summarization results of different models." can compare with another model 

14- In the conclusion: "Experimental results have shown that our proposals are effective in improving the quality of generated responses. We conducted several experiments to demonstrate that utilizing both left and right contexts is superior to using only one type, and certainly more beneficial than not utilizing context at all. The experimental results demonstrate that our proposed model has significantly improved when compared to current relevant studies." We need more results.
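Regarding point 10, a minimal PyTorch sketch of the block ordering described in the quoted Figure 2 caption (self-attention, then cross-attention, then a two-layer feed-forward network). This is an illustrative reconstruction under stated assumptions, not the authors' actual implementation; the dimensions, layer norms, and the omission of masking and dropout are simplifications:

```python
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Illustrative decoder block: self-attention -> cross-attention -> two-layer feed-forward."""
    def __init__(self, d_model=768, n_heads=12, d_ff=3072):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, enc_out):
        # Self-attention over the decoder states (causal masking omitted for brevity).
        x = self.norm1(x + self.self_attn(x, x, x)[0])
        # Cross-attention sits between the self-attention layer and the feed-forward layers.
        x = self.norm2(x + self.cross_attn(x, enc_out, enc_out)[0])
        # Position-wise feed-forward network built from two linear layers.
        return self.norm3(x + self.ff(x))
```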
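Regarding point 12, the standard discounted-return definition that the γ comment refers to, with the usual convention γ ∈ [0, 1] (a textbook formula, not one quoted from the paper):

```latex
R_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k}, \qquad 0 \le \gamma \le 1
```

Values of γ close to 0 weight immediate rewards more heavily, while values close to 1 give long-term rewards more influence.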

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

I do not see any areas to review in the revised version of your paper. Well done, and thank you for your hard work.

Reviewer 2 Report

My initial review comment: The main aim of the paper was to use a bi-directional context, where the historical direction helps the model remember information from the past in the conversation, while the future direction enables the model to anticipate its subsequent impact. A transformer-based sequence-to-sequence model and a reinforcement learning algorithm were used to attain the study's objective of improved chatbot response generation using deep reinforcement learning, and to overcome limitations of prior studies that did not consider the relationship between utterances in conversations when generating responses. The authors showed the effect of the proposed model via qualitative evaluation of generated samples, which exhibited significant improvements in BLEU and ROUGE scores in comparison to the baseline model and other current related studies. Hence, this study makes an original contribution in terms of bi-directional context for improved chatbot response generation using deep reinforcement learning. The Abstract was generally well written in that it adequately covers the background, main research aim, theoretical and model findings, but a couple of lines on the contributions/implications (theoretical and practical) should be included.

Author response: We would like to thank you very much for providing us with detailed feedback. We agree entirely with your first point that it is necessary to make the abstract clearer. We have added our proposed model's theoretical contributions and experimental results to the Abstract section. To provide a more comprehensive view of our work, we have revised some sentences to enhance their clarity and added a description of the specific numerical improvements of our proposed model compared to the baseline model and previous studies. The changes we made have been highlighted in the abstract. Please review them.

My comment to R1: The authors added some contributions/implications to the abstract as requested.

 

My initial review comment: The paper does explore some relevant literature in that it included some recent academic sources, for example two from 2020, five from 2022 and two from 2023. However, I believe it is of great importance to increase the ratio of new sources, since much has been published on related topics in recent years. There is therefore room to conduct a fresh literature search and to incorporate a number of more recent academic sources throughout the paper to support the research.

Author response: Thank you for providing such valuable feedback. It will undoubtedly help to improve the quality of our work. We added 5 more articles published in 2022 and 2 in 2023. Specifically, to clarify the limitations of current models, we have added two recent references ([9] and [10]) that also faced the same situation when examining traditional seq2seq systems. We have included six studies ([38-41, 45, 46]) in the proposed model, which are relevant to the research problem and have helped strengthen the theoretical framework. We highlighted these changes in sections 1 (Introduction) and 3 (The Proposed Model). Please review them.

My comment to R1: Some recent and relevant additional academic sources were included to update the paper, which were incorporated throughout the paper to support the research.

 

My initial review comment: The Experiment and discussion section was also satisfactory, except that it is important to substantiate the “discussion” component with a greater number of relevant and recent sources to support your findings.

Author response: Thank you very much for your valuable comments. As you suggested, we updated the manuscript by providing more discussion and comparison. First of all, we selected four recent studies for comparison based on the following criteria: they share the same approach of using context information as a model-improvement factor, they all use the Seq2Seq model as a foundation for improvement, and all of them were tested on the same data set to facilitate comparison. In the Experiment and Discussion section, we presented comparative results with previous studies and discussed the contributions and advantages of our proposed model in more detail. At the end of this section, we have also included a comprehensive summary of our study results. Please review them in Table 5.

My comment to R1: Some additional academic sources were included to substantiate the discussion and support the findings as suggested.

 

My initial review comment: The Conclusions section was acceptable, but there was neither a theoretical contributions/implications subsection nor a practical contributions/implications subsection; each should be included as a separate subsection (i.e., with its own sub-heading) under the Conclusions section:

Theoretical implications: What contribution has the research made to theory, based on the theoretical foundation/model that you developed?

Practical implications: Explain the contribution that the research has made in terms of practical (managerial) implications.

Both of the aforementioned sections should be comprehensive and take the form of substantial sub-sections.

Author response: Thank you very much for this suggestion. We have added two new subsections, 5.1 and 5.2, under the Conclusion section to discuss the theoretical and practical implications of our research findings.

My comment to R1: The authors included suitable theoretical and practical implications as recommended.

 

My initial review comment: The Limitations and future research section was also not apparent. Please discuss the limitations associated with the study and avenues for future research as a separate section.

Author response: Thank you very much for this suggestion. We have updated the manuscript by discussing the drawbacks of the proposed model in more detail. We have also included some future work that we plan to undertake. Please review it in Section 5.3.

My comment to R1: Some limitations and future research avenues were provided.

 

My initial review comment: There are also a number of minor grammatical/language issues throughout the paper, so the paper needs a language expert or grammarian to edit it and eradicate these errors.

Author response: We would like to thank you very much for your suggestions. We have corrected the grammar errors and improved our writing. We used a professional tool that checks for errors in our English. We also asked a native speaker to help us improve our writing.

My comment to R1: There was some improvement in the language, but I will leave this to MDPI's highly effective proofing team to conduct the final language/grammar review and make recommendations accordingly (since language is not my forte).

 

Overall, an interesting paper that makes an original contribution! I suggest that the paper can now be accepted for publication!

Reviewer 3 Report

The paper could still be improved in terms of the presentation of the idea, the references, and the discussions.
