An Automatic Question Generator for Chinese Comprehension
Round 1
Reviewer 1 Report
In this paper, the authors propose using question generation (QG) to generate sets of frequently asked questions (FAQs) for companies. Furthermore, question answering (QA) can benefit from QG: the QA training dataset can be enriched with generated questions to improve the learning and performance of QA algorithms.
Most existing QG works and tools have been designed for English texts. The authors propose a question generator for Chinese. The generator provides a user-friendly web interface that allows users to generate a set of Wh-questions (i.e., what, who, when, where, why, and how) from a Chinese text, conditioned on a corresponding set of answer sentences. In addition, the web interface allows users to quickly refine the automatically generated answer sentences.
Question generation is based on a Transformer model trained on a dataset combining three publicly available Chinese reading comprehension datasets, namely DRCD, CMRC2017, and CMRC2018. Linguistic features such as part-of-speech (POS) tags and named entities (NER) are extracted from the text and, together with the original text and answer sentences, fed into a machine learning algorithm based on a pre-trained mT5 model.
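As a hedged illustration of this pipeline (a minimal sketch, not the authors' exact implementation), answer-conditioned generation with a pre-trained mT5 checkpoint might be scripted as follows; the checkpoint name, input template, and decoding settings are assumptions, and the POS/NER feature injection described above is omitted for brevity.

```python
# Minimal sketch of answer-conditioned question generation with mT5.
# The checkpoint name, input template, and decoding settings are
# illustrative assumptions, not the paper's exact configuration; the
# POS/NER feature injection described above is also omitted here.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

model_name = "google/mt5-small"  # hypothetical choice of mT5 size
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

context = "李白是唐代著名的诗人。"  # a Chinese passage (placeholder)
answer = "李白"                     # the answer selected from the passage

# One common encoder input format: expose the answer alongside the context.
source = f"answer: {answer} context: {context}"
inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)

# Beam-search decoding; a model fine-tuned on the combined MRC data would
# emit a question here (the base checkpoint is not fine-tuned for QG).
output_ids = model.generate(**inputs, max_length=64, num_beams=4,
                            early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```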
The generated questions and answers are displayed in an easy-to-use format, integrated with the source text sentences used to generate each question.
We expect that the design of this web tool will provide insights into how question generation in Chinese can be made easily accessible to users with limited computer skills.
The paper begins with a clear introduction, including a subsection describing how the work is organized.
The related works are described very well, with a detailed breakdown of semantic and syntactic-structural approaches.
Section three describes the datasets and methods and provides summary images of the system.
These figures are beneficial and help the reader understand what is going on.
Some minor weaknesses need to be put right.
There are some typos (for example, in the final discussion at lines 234 and 246), but these do not compromise the work, which is good. I also strongly advise the authors to include the following work, https://www.mdpi.com/1999-5903/14/1/10, in the syntactic-structural background.
Author Response
The authors would like to thank the reviewer for the insightful and constructive comments. Below are our point-by-point responses.
[1] There are some typos (for example, in the final discussion at lines 234 and 246), but these do not compromise the work, which is good.
Our response: The authors have carefully proofread the paper.
[2] I also strongly advise the authors to include the following work, https://www.mdpi.com/1999-5903/14/1/10, in the syntactic-structural background.
Our response: The authors agree with the reviewer. We have cited this reference in Section 2.
Reviewer 2 Report
This work proposes a question generator for Chinese based on the mT5 model. A user-friendly interface is provided in the form of a web application. The research is sound and well-motivated, and the article is well-written and easy to follow. Three machine reading comprehension (MRC) datasets are involved, which adds to the comprehensiveness of this work.
Suggestions:
1. As described in the article, both simplified and traditional Chinese are supported. However, it seems that no example in simplified Chinese is presented. I would therefore like to suggest providing such an example in the main text or the appendix.
2. It would be better if the authors could provide a link to the web application, and I suggest open-sourcing this work.
3. For the evaluation, I wonder why prevailing automatic evaluation metrics for NLP, such as BLEU, ROUGE-L, METEOR, and BERTScore, are not applied.
4. To establish the validity of empirical evaluation methods such as a user survey, I would suggest applying statistical power and significance tests to the results. Please consider this as future work.
Minor errors/misspellings:
Line 101: subtask -> subtasks
The cross-referencing is not functional; namely, the citations and numbers are not clickable. Please check whether this is an error in the template.
Author Response
The authors would like to thank the reviewer for the insightful and constructive comments. Below are our point-by-point responses.
[1] As described in the article, both simplified and traditional Chinese are supported. However, it seems that no example in simplified Chinese is presented. I would therefore like to suggest providing such an example in the main text or the appendix.
Our response: The authors have replaced Text 5 of the test data with a Simplified Chinese text. The result tables have been updated accordingly.
[2] It would be better if the authors could provide a link to the web application, and I suggest open-sourcing this work.
Our response: Our application server is hosted on a PC. To avoid security threats, we have not made the app public. We will try our best to launch the app publicly on a trusted host within our budget. Once the app is made public, we will open-source our work.
[3] For the evaluation, I wonder why prevailing automatic evaluation metrics for NLP, such as BLEU, ROUGE-L, METEOR, and BERTScore, are not applied.
Our response: Since this work focuses on the application design rather than the machine learning component, we evaluated user feedback and runtime performance instead of the prevailing automatic NLP metrics. In future work on the machine learning component, we will apply these metrics; a sketch of how they could be computed is given below.
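For illustration only, a minimal sketch of scoring a generated question against a reference with the HuggingFace `evaluate` library is shown below (METEOR would be analogous). The example strings are fabricated placeholders, and the Chinese text is pre-segmented with spaces because the default tokenizers of BLEU and ROUGE are designed for English.

```python
# Hedged sketch: scoring a generated question against a reference with
# common automatic metrics via the HuggingFace `evaluate` library.
# The strings below are fabricated placeholders, not data from the paper.
import evaluate

# Pre-segmented Chinese (space-separated tokens), since the default
# tokenizers of BLEU/ROUGE do not segment Chinese characters.
predictions = ["李白 是 哪 个 朝代 的 诗人 ？"]
references = [["李白 是 哪 个 朝代 的 著名 诗人 ？"]]

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=references))

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions,
                    references=[r[0] for r in references],
                    tokenizer=str.split))  # keep the whitespace tokens

# BERTScore embeds raw (unsegmented) text; lang="zh" selects a
# Chinese-capable backbone model.
bertscore = evaluate.load("bertscore")
print(bertscore.compute(predictions=["李白是哪个朝代的诗人？"],
                        references=["李白是哪个朝代的著名诗人？"],
                        lang="zh"))
```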
[4] To establish the validity of empirical evaluation methods such as a user survey, I would suggest applying statistical power and significance tests to the results. Please consider this as future work.
Our response: We have included this as future work in Section 5; an illustrative sketch of such a test is given below.
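One way such an analysis could be run on paired survey ratings is sketched below with SciPy and statsmodels; the ratings are fabricated placeholders, and the Wilcoxon signed-rank test is one reasonable choice for ordinal Likert-scale data, not the method prescribed by the paper.

```python
# Hedged sketch of a paired significance test and a power calculation on
# user-survey ratings. The rating arrays are fabricated placeholders.
from scipy import stats
from statsmodels.stats.power import TTestPower

# Per-participant Likert ratings for two conditions
# (e.g., generated questions vs. human-written questions).
ratings_a = [4, 5, 3, 4, 4, 5, 3, 4]
ratings_b = [3, 4, 3, 3, 4, 4, 2, 3]

# Wilcoxon signed-rank test: non-parametric, suited to paired ordinal data.
stat, p_value = stats.wilcoxon(ratings_a, ratings_b)
print(f"W = {stat:.2f}, p = {p_value:.4f}")

# Statistical power: sample size needed to detect a medium effect
# (d = 0.5) at alpha = 0.05 with 80% power (paired t-test analogue).
n_needed = TTestPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"participants needed: {n_needed:.1f}")
```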
[5] The cross-referencing is not functional; namely, the citations and numbers are not clickable. Please check whether this is an error in the template.
Our response: We have fixed the cross-referencing.