Chinese Lip-Reading Research Based on ShuffleNet and CBAM
Round 1
Reviewer 1 Report
This paper proposes a lip-reading recognition model that combines a lightweight CNN (ShuffleNet with an added CBAM module) with a Temporal Convolutional Network. The proposed method has room for improvement before the manuscript can be accepted.
1. The abstract should reflect the contributions of the manuscript. I suggest rewriting it.
2. The keywords must reflect the core of the study, as the abstract should.
3. The Introduction should be clearly presented to highlight the main ideas and motivation behind the proposed research. Please clearly state the research question and motivation of the proposed study in the Introduction.
4. The authors need to add a table of the symbols used in the paper to make it easier to read.
5. The authors should analyze how to set the parameters of the proposed method in the framework. Is there an "optimal" choice?
6. In the experiment section, it would be good to have more information about how the experiments were conducted. What tools/software were used?
7. It would be valuable to provide some analysis or discussion of the computational complexity of the proposed framework.
8. I suggest the authors compare their method with some advanced methods.
9. The conclusion section in its present form is relatively weak and should be strengthened with more details and justifications.
10. Figure captions need to be expanded to make them self-explanatory.
11. The following papers on the same topic in machine learning should be cited and discussed:
1. Dual Regularized Unsupervised Feature Selection Based on Matrix Factorization and Minimum Redundancy with application in gene selection (2022)
2. Diagnosis of Alternaria disease and Leafminer pest on Tomato Leaves using Image Processing Techniques (2022)
3. Review of swarm intelligence-based feature selection methods (2021)
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
1. The title of the manuscript should be reframed.
2. Figures are not cited in a uniform format in the main text.
3. The authors are strongly advised to refer to the recent literature on the selected topic.
4. The full forms of abbreviations are missing where they are first used in the manuscript, for example BANCA and LRW.
5. Not all references are cited in the main text.
6. The proposed methodology should be elaborated.
7. The computing cost comparison should be presented and explained.
8. The references are not in the format required by the journal guidelines.
9. Author contributions should be mentioned clearly in Section 1.
10. Provide references for the models presented in Table 2.
11. The outputs/results of the proposed method are inadequate.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
1. Why was ShuffleNet chosen? Please justify.
2. There are many typographical errors. Please proofread the English and check grammar and spelling.
3. It is mentioned that a Chinese dataset called Databox was created as part of this experiment. Have you made this dataset publicly available for further research?
4. How were the words chosen? For a lip-reading experiment, words should be taken from all categories: easy, medium, and hard. Did you categorize them?
5. The results section is weak; the results should be discussed in detail.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 4 Report
The work is interesting and good, but I suggest some additions to improve the overall quality of the manuscript:
1. The introduction needs more elaboration to improve the audience's understanding of the proposed work.
Please add "Viseme Examples" in a visual format.
2. A summary table of the most significant work in the area of lip reading could be included in the related works section.
3. This may include headings such as title, year of publication, techniques applied, dataset used, merits and demerits, etc.
4. Algorithmic steps should also be added for the proposed work.
5. Discuss the reason why the accuracy of "MobiVSR-1 + TCN" decreased after epoch 31, and likewise for the other models.
6. It is suggested to state by what percentage the accuracy of the proposed model improved compared with existing methods.
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
I have gone through the revised paper. All my concerns and requests have been carefully addressed by the authors.
Author Response
Thank you for your time, reviewer.
Reviewer 2 Report
[1] The full form of CNN is missing in the abstract. The authors are asked to give the full forms of all abbreviations. Table 1 presents the full forms and could be placed at the end of the manuscript (before the References section). The full form of CNN is given repeatedly, for example in Section 2.2 and Table 1.
[2] The citation format for the figures needs to be uniform. The manuscript currently mixes forms such as figure 1, FIG. 2, and Fig. 5.
[3] Units need to be given for the x- and y-axes in Fig. 11 and Fig. 12.
[4] The title of Table 6 needs to be reframed so that it contains the model name.
[5] The text after Fig. 12 is not formatted.
[6] The authors are strongly advised to reread the manuscript and correct the grammar and citation formats.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors revised the paper as per the comments.
Author Response
Thank you for your time, reviewer.
Reviewer 4 Report
The suggested changes have been incorporated in the manuscript; it may be considered for publication.
Author Response
Thank you for your time, reviewer.