Article
Peer-Review Record

MLSL-Spell: Chinese Spelling Check Based on Multi-Label Annotation

Appl. Sci. 2024, 14(6), 2541; https://doi.org/10.3390/app14062541
by Liming Jiang, Xingfa Shen *, Qingbiao Zhao and Jian Yao
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 26 January 2024 / Revised: 8 March 2024 / Accepted: 8 March 2024 / Published: 18 March 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for the opportunity to read this paper. The research introduces MLSL-Spell, a novel Chinese Spelling Check framework that leverages multi-label annotation to enhance spelling detection and correction. The authors suggest that, by integrating character-based context vectors and Pinyin information, their model significantly outperforms existing models in accuracy. Strengths of the study include its innovative approach to spelling error identification and correction, and its superior performance demonstrated through empirical evaluation. However, I think the weaknesses of the study are the complexity of its implementation and potential limitations in adapting to diverse linguistic contexts beyond the datasets tested.

I suggest that the authors revise their work to address the following issues:

1. I suggest they integrate more diverse datasets, including colloquial and domain-specific texts, in order to improve model robustness across different linguistic contexts.

2. I think the authors should explore simplification of their model without compromising accuracy. Why not focus on reducing computational complexity and improving runtime performance?

3. Also, please try to conduct a deeper analysis of model errors to identify specific weaknesses in spelling detection and correction. Can you suggest targeted improvements?

4. The authors could also evaluate the integration of more advanced language models, e.g., newer transformer variants, to potentially capture context and nuances more effectively.

5. Can you also comment on the possibility of implementing or testing techniques for better generalization, such as domain adaptation methods? We need to ensure your model performs well on unseen data and across various linguistic styles.

Thank you.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Review of “MLSL-Spell: Chinese Spelling Check Based on Multi-Label Annotation”

(Applied Sciences)

 

General assessment:

 

In this paper, the authors propose a CSC framework for the extraction of contextual and Pinyin information as part of a method to label misspellings in Mandarin in a more precise and explicit way.

 

I think this article is definitely eligible for publication in Applied Sciences, since it investigates an issue that is not only highly relevant to current discussions of AI-related tools for language recognition/correction/etc., but also generally appropriate for the scope of the journal.

 

The methods and the output of this research are spelled out and the technical sections are exhaustive on the whole, but the text is quite “dry” in some parts, especially when referring to formulas or tables. It would be nice if all references to results, data and formulas were more “reader-friendly”. Ideally, one should not have to be a mathematician to understand the argumentation pursued in this paper. Computational linguists or linguists interested in formal tools for speech recognition or the like, for instance, will not be able to follow without some extra information. This would be a pity, since the framework proposed by the authors is quite intriguing.

 

Therefore, while being persuaded that this paper should appear in Applied Sciences, I suggest the authors make it more accessible to readers who do not have a background in such formal models. Of course, I do not mean that the formal material should be removed - far from it! What I suggest is more of an adaptation of the very high level of the discussion to the needs of more readers potentially interested in reading this.

I propose "Reconsider after major revision (control missing in some experiments)": however, I would like to stress that this is so because the corresponding option is formulated like this in the review report. I am not referring to the general quality of the content, but rather to the fact that making such a paper more reader-friendly probably implies a major revision.

Apart from this, which concerns the paper as a whole, please consider my more specific comments below.

 

Comments:

 

- General remark: Please make the acronym “MLSL” explicit upon its first occurrence in the text. What does it stand for?

- l. 21: Chinese spelling errors often occur in our lives > I find this formulation (the “in our lives” segment) a bit strange. I would phrase this as in the abstract or in a similar way.

- l. 26: s is the ability of the Chinese spelling check > What is meant by “ability”? Maybe effectiveness/efficiency/functionality, …?

- l. 35: it might require multiple Chinese characters to represent the same idea > …multiple Chinese characters might be required to …

- A general remark concerning the notation: when you provide a translation e.g. for a Chinese word, as in l. 38 or l. 44, the translation should appear in single quotation marks (‘__’). Please adapt this throughout the text.

- l. 38: "zhong" which means "medium" > It does not mean ‘medium’, but ‘middle’.

- l. 43: the Chinese phrase "innocent you" > A speaker of any language other than Chinese will not be able to understand this. Please provide some kind of explanation or translation.

- l. 44: "tian/zhen de/ni" (lit. sky/true/you) > Again, someone who does not speak Chinese will not be able to understand what corresponds to what and why only three of four items appear in the translation. This is explained right after this line, but it remains obscure why there is a slash between “tian” and “zhen” on the first occurrence of the phrase.

- l. 56: Because they have similar pronunciations "xiàng" in Chinese > Why is this subordinate clause detached from the rest of the utterance?

- Tab. 1: Why is “elephants” in the plural form? The form “xiang” does not imply - morphologically - that the word refers to a plurality of entities.

- l. 102: We conduct experiments > conducted?

- l. 124: Despite the fact that numerous CSC schemes are available now > Which ones for instance?

- Section 2 is linguistically very redundant: The authors repeat too often the expression “XYZ propose(s)…”. Please try to vary a little bit.

- l. 185: model is shown in Figure ?? > Please add this information.

- Fig. 1: Pinyin is systematically not capitalized here, but in the rest of the text it is. Please unify this.

- l. 209: kequals 5 > Please delete the space to the left of “equals”.

- The calculation formulas in (11) to (16) should be explained in more detail. It is not easy to follow the argumentation if the premises of these formulas are not made explicit.

- l. 452: In this case, we will further expand the confusion set and add grammatical information to improve MLSL-Spell model’s contextual reasoning ability > I fail to understand what is meant by “grammatical information” here. Do the authors mean “information related to the part of speech”? Why is this necessary?

- Tab. 9: detection, correction > detection and correction

- The connection between the characters in Fig. 4 and Tab. 12 is very unclear. Also, please provide the Chinese sentences for the examples in Tab. 12. It would be nice to see the words in their context. For instance, I do not understand the translation “go throwing”.

- References: l. 602: In Proceedings of the Proceedings 2010 > ???

- l. 506 and elsewhere: Why are some of the publication years referring to papers that have appeared in journals in bold (or: why are some of the publication years NOT in bold)?

Comments on the Quality of English Language

(see above)

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I agree with the revisions, but some references need updating.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

I am glad to see that the authors have carefully revised the manuscript taking into account all my suggestions and improved the general quality of the paper.

The formal issues have also been taken care of. All revisions have been thoroughly explained. 

The paper can be published in the present form.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
