Article
Peer-Review Record

Scene Text Recognition Based on Improved CRNN

Information 2023, 14(7), 369; https://doi.org/10.3390/info14070369
by Wenhua Yu 1,2, Mayire Ibrayim 1,2,* and Askar Hamdulla 1,3
Submission received: 27 April 2023 / Revised: 17 June 2023 / Accepted: 27 June 2023 / Published: 28 June 2023

Round 1

Reviewer 1 Report

The presented work is not new in the field, but it presents a hybrid model to solve the scene text recognition problem. The work has no potential to be published because there are very serious concerns about its presentation. The paper needs to be restructured so that the description flows logically, which it currently does not. Saad et al. have also presented work on scene text analysis, focusing on Arabic text. It would be good to compare your work with theirs as well (https://ieeexplore.ieee.org/abstract/document/8641268).

Nowadays, problems related to text analysis are considered to be solved following the introduction of transformer/BERT architectures. Please compare your results with the accuracies obtained by these models; some reported works are listed below:

1. https://arxiv.org/abs/2102.02111

2. https://www.semanticscholar.org/paper/I2C2W%3A-Image-to-Character-to-Word-Transformers-for-Xue-Lu/9ec831ce2e101d19456d6540a096106434b7a3ab

3. https://www.mdpi.com/2079-9292/11/3/374

 

 

The quality of the paper would improve if the authors wrote it with the reader's ability to grasp the material in mind. The work may be good, but if the presentation is poor, nobody is going to acknowledge it. This paper cannot be accepted in its current state.

Some additional comments are listed below:

1. The abstract is very poorly written. There is no need to enumerate your contributions in the abstract; they should instead be listed in the last paragraphs of the introduction. The dataset names should not appear in the abstract, since they are given as abbreviations, which are inappropriate there.

2. The full form of CRNN has not been provided in its first appearance in the abstract.

3. On line 68, ‘CRNN models have low text accuracy……..’, please provide more references for this claim. 

4. The main contributions mentioned on line 81 should be written in a new paragraph.

5. On line 86, the word ‘Next’ is written with a capital ‘N’ after a comma; it should be lowercase.

6. On line 92, ‘while the language model itself is independent and CRNN can be separated from the language model..…..’: please explain how this is possible.

7. On line 164, please explain what the Bidirectional Cloze Network is.

8. On line 206, how can this be seen from Equations 1, 2, 3, and 4?

9. On line 214, section 3.1, the ‘Label smoothing’ section should be re-written.

Are the equations used in the paper your own contribution? If so, this needs to be stated in the paper; otherwise, omit them and provide only a reference.

This paper is poorly written and its presentation is unstructured. The explanation lacks continuity: while one idea is being explained, another concept suddenly appears without any clear relation to what came before. I found serious linguistic issues throughout the paper. Some are mentioned in the comments, but many others remain.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The experiments include a comparison with other state-of-the-art methods. It seems that the novel method does not outperform these methods. If so, why should this method be published? Are there other improvements compared to the existing methods?

Moreover, some methods are neither cited nor compared; see:

Zhang XinSheng, Wang Yu: Industrial character recognition based on improved CRNN in complex environments. In Computers in Industry, Volume 142, 2022, ISSN 0166-3615.

Other remarks:

Citations are missing in the first sentences of page 1.

I am not a native speaker; the text is mostly readable for me.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The authors describe interesting results of their research on scene text recognition (STR) based on an improved CRNN. The discussion of the specifics of STR is accompanied by a CRNN-based model proposed to address low text recognition accuracy, poor recognition performance on irregular text, and incomplete information acquisition. In particular, label smoothing is added to ensure the generalization ability of the model and prevent overfitting. In addition, the smoothing loss function from speech recognition is introduced into the field of text recognition, and the connectionist temporal classification (CTC) loss function is redefined. A language model is also used to fuse sequence information with language information, increasing the channels of information acquisition and ultimately improving the accuracy of text recognition. Comparison with the original CRNN model on public datasets shows that the model can accurately perform text recognition. However, there are minor shortcomings in the manuscript that need to be corrected.

1. The paragraph on the organization of the manuscript is missing from the Introduction;

2. The accuracy metric was used to assess the quality of the models. Results for other metrics should also be provided, including F1 score and AUC ROC.

3. Minor errors:

Line 54: RNN => an RNN;

Line 55: RNN => the RNN;

Line 107: Phrase "2016 proposed the R^2AM" is unclear;

Related Work is one paragraph. Break it up into several paragraphs;

Line 181: Variables T and I should be in italics;

Line 352: Processo => Processor;

Line 509: "Patents" is unclear.
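
For readers unfamiliar with the label smoothing mentioned in this report, the following is a minimal sketch of the standard technique, not the manuscript's own formulation, which is not reproduced in this record. The function names `smooth_labels` and `cross_entropy` and the value `epsilon = 0.1` are illustrative assumptions.

```python
import math


def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Return a smoothed target distribution for a single class label.

    Standard label smoothing: the one-hot target is scaled by (1 - epsilon)
    and epsilon is spread uniformly over all classes, true class included.
    """
    if not 0 <= target_index < num_classes:
        raise ValueError("target_index must lie in [0, num_classes)")
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must lie in [0, 1)")
    uniform_share = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform_share if k == target_index else uniform_share
        for k in range(num_classes)
    ]


def cross_entropy(target_distribution, predicted_probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(
        t * math.log(p)
        for t, p in zip(target_distribution, predicted_probs)
        if t > 0.0
    )


# Hypothetical example: 4 character classes, true class is index 2.
smoothed = smooth_labels(target_index=2, num_classes=4, epsilon=0.1)
hard = smooth_labels(target_index=2, num_classes=4, epsilon=0.0)

# An over-confident prediction is penalized more under the smoothed target,
# which is how smoothing discourages overfitting to the hard labels.
confident_prediction = [0.001, 0.001, 0.997, 0.001]
loss_hard = cross_entropy(hard, confident_prediction)
loss_smoothed = cross_entropy(smoothed, confident_prediction)
```

With `epsilon = 0` the function returns the ordinary one-hot target, so the smoothed loss reduces to the usual cross-entropy.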

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The paper has improved and presents significant knowledge in the field of text analysis. Although the approach is not new, the authors adapt the CRNN model to obtain better results on the available datasets. The overall presentation of the content works well, and the results are impressive enough to merit recognition.

The English is good overall, but there is room for improvement.

Reviewer 3 Report

This version of the manuscript should be accepted.
