Review
Peer-Review Record

The Challenges of Recognizing Offline Handwritten Chinese: A Technical Review

by Lu Shen 1,*, Bidong Chen 1, Jianjing Wei 1, Hui Xu 1, Su-Kit Tang 1,* and Silvia Mirri 2
Reviewer 2: Anonymous
Appl. Sci. 2023, 13(6), 3500; https://doi.org/10.3390/app13063500
Submission received: 4 February 2023 / Revised: 4 March 2023 / Accepted: 6 March 2023 / Published: 9 March 2023

Round 1

Reviewer 1 Report

This paper reviews the research conducted from 2016 to 2022 on offline handwritten Chinese recognition using traditional techniques, deep learning methods, methods that combine deep learning with traditional techniques, and knowledge from other fields. The challenges that have been encountered are also discussed. It begins with an overview of the history of research into handwritten Chinese recognition, as well as its current status, standard datasets, and assessment criteria.

The second part of this article provides a detailed description and analysis of the offline HCCR and offline HCTR methods developed over the past seven years, along with an explanation of their principles, particulars, and performance.

The following points need to be discussed.

In Section 2, "Databases and Model Evaluation Metrics," the task is a classification problem. So, the F1 score, recall, and precision metrics should be compared for the machine learning or deep learning approaches.

 

For this, Table 6 needs to be changed.

Figure 8 needs to be modified. Some text content overlaps.

In the discussion (Section 5), tabulate the different text-background scenarios considered.

The AUC / cross-validation of the ANN/ML/DL models needs to be discussed.

The authors discussed various datasets. But as a review article, it is better to provide direct access (links) to the publicly available datasets. Some of the links in Table 1 do not provide direct access.

List the publicly available datasets and their direct links.

Dataset information should include the number of features/attributes.

Workflow diagrams for the deep learning, machine learning, and other methods should be provided.

The performance comparison among them should be discussed in the conclusion part.

Author Response

  1. In Section 2, "Databases and Model Evaluation Metrics," the task is a classification problem. So, the F1 score, recall, and precision metrics should be compared for the machine learning or deep learning approaches.
  2. For this, Table 6 needs to be changed.
  3. The AUC / cross-validation of the ANN/ML/DL models needs to be discussed.

Response to questions 1, 2 and 3:

Different Chinese characters correspond to different classes, and Chinese has over 20,000 characters, so this is a mega-class classification problem. Therefore, it is difficult to find F1 score, recall, precision, or AUC values in the references. In fact, we found only a few references discussing recall values, and these concern the evaluation of detection models, not recognition models. Also, since text recognition involves not only character recognition (substitution) errors but also character insertion and deletion errors, the evaluation metrics (Accuracy, AR, CR, CER) introduced in Section 2.2 are the ones commonly used in this field to compare the recognition effectiveness of different methods. Therefore, we compare and analyze these metrics for the different methods on the same datasets in the paper.
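As a minimal illustration of why these text-level metrics are used instead of per-class precision/recall, the sketch below computes an edit-distance-based character error rate that counts substitution, insertion, and deletion errors together. It is not code from the manuscript; the function names and example strings are assumptions made purely for illustration.

```python
# Minimal sketch (not from the manuscript): an edit-distance-based character
# error rate (CER) for text recognition, so that substitution, insertion, and
# deletion errors are all counted. Names and example strings are illustrative.

def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                       # i deletions
    for j in range(n + 1):
        dp[0][j] = j                       # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + sub)   # match / substitution
    return dp[m][n]


def cer(ref: str, hyp: str) -> float:
    """CER = (substitutions + insertions + deletions) / reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


# One substitution and one insertion against a 4-character reference: CER = 2/4.
print(cer("手写汉字", "手与汉字体"))  # 0.5
```

Metrics such as AR and CR are likewise typically expressed through these error counts rather than through per-class confusion-matrix statistics, which is why we rely on them for the comparisons in the paper.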

 

  4. Figure 8 needs to be modified. Some text content overlaps.

The overlap in Figure 8 has been corrected; please see Figure 11 in the revised manuscript.

 

  5. In the discussion (Section 5), tabulate the different text-background scenarios considered.

We have added the content requested in question 5 in the new Table 8.

 

  6. The authors discussed various datasets. But as a review article, it is better to provide direct access (links) to the publicly available datasets. Some of the links in Table 1 do not provide direct access.
  7. List the publicly available datasets and their direct links.

Response to questions 6 and 7:

We have provided links to the relevant datasets in the new Table 2.

 

  8. Dataset information should include the number of features/attributes.

We have added the content requested in question 8 in the new Table 1.

 

  9. Workflow diagrams for the deep learning, machine learning, and other methods should be provided.

The new Figures 3, 4, 6, and 7 show the workflow diagrams requested in question 9.

 

  10. The performance comparison among them should be discussed in the conclusion part.

Considering the structure of the article, we have added Tables 7 and 8 in the new Section 5.1 to summarize and compare the content requested in question 10.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper focuses on surveying intelligent recognition techniques for handwritten Chinese. It surveys the current research status of this problem in detail. The work is worthy of approval. Please consider the following points for improvement during revision.

1. I think just presenting the challenges is not enough. The paper should tell readers some potential directions that can be explored to deal with the current challenges.

2. The authors introduced both traditional technology-based recognition methods and deep learning-based recognition methods, but a comparison between them is expected. This can help readers know whether it is necessary to use deep learning in future work.

3. More methods published in 2022 and 2021 need to be discussed and surveyed.

4. The technical difficulties in recognizing Chinese characters should be stated. Why is it more worthy of investigation than other languages such as English?

5. Please explain the difference between Section 3 and Section 4. They both seem to describe technical methods.

6. The experimental platforms and simulation methods of existing research should be summarized.

Author Response

  1. I think just presenting the challenges is not enough. The paper should tell readers some potential directions that can be explored to deal with the current challenges.

We have added the content requested in question 1 in the new Table 8.

 

  2. The authors introduced both traditional technology-based recognition methods and deep learning-based recognition methods, but a comparison between them is expected. This can help readers know whether it is necessary to use deep learning in future work.

We have added Table 7 in the new Section 5.1 to summarize and compare the content requested in question 2.

 

  3. More methods published in 2022 and 2021 need to be discussed and surveyed.

We have reviewed the most relevant references in the manuscript. Since the recognition accuracy for regular handwritten Chinese characters has exceeded 98%, more research in the last two years has focused on the recognition of handwritten Chinese texts, which we discuss in Section 4.

 

  4. The technical difficulties in recognizing Chinese characters should be stated. Why is it more worthy of investigation than other languages such as English?

We have added the explanation for question 4 (highlighted in yellow) in Section 1.

 

  5. Please explain the difference between Section 3 and Section 4. They both seem to describe technical methods.

Offline handwritten Chinese recognition is divided into the recognition of individual characters and the recognition of text. Section 3 introduces methods for handwritten Chinese character recognition, while Section 4 introduces methods for handwritten Chinese text recognition.

 

  6. The experimental platforms and simulation methods of existing research should be summarized.

We have added the content requested in question 6 in Tables 3, 4, and 5.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The article can be accepted after making the following changes.

The authors were expected to provide the links to the publicly available datasets, as per the previous comment.

In Table 2, link [12] is repetitive.

Ref. [13] could not be accessed.

Kindly revise this table.

 

Author Response

  1. In Table 2, link [12] is repetitive.

The datasets CASIA-HWDB1.0-1.2 and CASIA-HWDB2.0-2.2 have different contents, but they both belong to the CASIA-HWDB collection, so their information can be found on the same page. Nevertheless, we have updated the link so that the six datasets can be downloaded directly. Please kindly retry link [12] in Table 2.

  2. Ref. [13] could not be accessed.

We have updated the information for question 2. Please kindly retry ref. [13] and link [13] in Table 2.

Author Response File: Author Response.pdf

Reviewer 2 Report

Authors have addressed my concerns.

Author Response

Thank you for your kind reply!

We are glad to have addressed your concern.

Thank you again for your comments on the manuscript!

Author Response File: Author Response.pdf
