Article
Peer-Review Record

A Local Information Perception Enhancement–Based Method for Chinese NER

Appl. Sci. 2023, 13(17), 9948; https://doi.org/10.3390/app13179948
by Miao Zhang and Ling Lu *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 14 August 2023 / Revised: 25 August 2023 / Accepted: 28 August 2023 / Published: 3 September 2023
(This article belongs to the Special Issue Evolutionary Computation Meets Deep Learning)

Round 1

Reviewer 1 Report

The paper addresses an important problem in Chinese natural language processing - named entity recognition (NER). Identifying named entities like people, organizations, locations, etc. in unstructured Chinese text is challenging due to the lack of word boundaries. 

 

The main contribution is proposing a novel neural network architecture that effectively incorporates lexical knowledge and local context modeling for Chinese NER. The key innovations that enable improved performance are: (1) Using a graph attention network to model semantic relationships between characters and matched lexicon words. This provides informative features for determining entity types.

(2) A Short-sequence CNN and LSTM encoder that combines local and global contexts when encoding the character sequence. The CNN can detect semantic changes around entity boundaries.
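For illustration, the sliding-window idea behind a Short-sequence encoder can be sketched as follows. This is a minimal toy sketch, assuming a symmetric window of k characters per side with padding at the boundaries; the paper's actual Short-sequence CNN design may differ.

```python
def short_subsequences(chars, k=2):
    """For each character, build the local window of k neighbours on each
    side (padded at sequence boundaries), mimicking a sliding-window
    subsequence generator for a local CNN encoder."""
    pad = ["<pad>"] * k
    seq = pad + list(chars) + pad
    # one window of length 2k+1 centred on each original character
    return [seq[i:i + 2 * k + 1] for i in range(len(chars))]

# toy example on the classic "南京市长江大桥" sentence
windows = short_subsequences("南京市长江大桥", k=2)
```

Each window would then be encoded independently by a small CNN, so the representation of a character depends only on its immediate neighbourhood rather than the whole sentence.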

 

Compared to prior approaches that rely on complex linguistic feature engineering or syntactic parsing, a strength is the model simplicity and effectiveness just using lexicon matches and dual encoders. The experiments on 4 datasets from different domains demonstrate consistent and significant improvements in NER F1 scores over previous state-of-the-art methods. The results are fairly convincing.

 

Although this article is of high quality, it still has the following limitations, which I hope the authors can address and revise in turn.

 

Using graph neural networks like GAT to incorporate lexical knowledge has been explored in previous works for NER. The application of GAT specifically is an incremental innovation. Highlight the novel contextual graph attention mechanism used in the GAT architecture for capturing character-word semantic relationships, rather than just using GAT generically.
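As background for this comment, the generic single-head GAT update the reviewer contrasts against can be sketched as follows. This is a minimal NumPy illustration of standard graph attention (projected features, pairwise scores, neighbourhood softmax), not the authors' specific contextual mechanism.

```python
import numpy as np

def graph_attention(h, adj, W, a):
    """One single-head GAT layer: project node features, score every
    (node, neighbour) pair, mask by adjacency, softmax, aggregate."""
    z = h @ W                                   # (N, d') projected features
    N, d = z.shape
    # all pairwise concatenations [z_i || z_j] -> (N, N, 2d')
    pairs = np.concatenate(
        [np.repeat(z, N, axis=0), np.tile(z, (N, 1))], axis=1
    ).reshape(N, N, 2 * d)
    scores = pairs @ a                          # raw attention logits
    scores = np.maximum(0.2 * scores, scores)   # LeakyReLU(0.2)
    scores = np.where(adj > 0, scores, -1e9)    # attend to neighbours only
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z                            # (N, d') updated features
```

In a character-word graph for Chinese NER, nodes would be characters and matched lexicon words, and `adj` would connect each character to the words it participates in.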

 

Combining CNNs for local context and LSTM for global context has also been investigated for sequence modeling tasks. The use for Chinese NER is valid but not highly original. Emphasize the specific innovations in how the CNN and LSTM encoders are designed and combined, rather than just applying conventional CNN+LSTM.

 

In the literature review, some good technologies in text embedding for NLP tasks should be further highlighted. Some recent works need to be considered.

 

- A Simple but Effective Method for Chinese Named Entity Recognition (NAACL 2022). This paper is the latest SOTA in the field of Chinese NER. It proposes a simple and effective method to study the regularity of entity spans in Chinese NER, called the Regularity-Inspired reCOgnition Network (RICON), but the related applications or comparisons are not considered in this paper.

 

- Chinese Named Entity Recognition Method in History and Culture Field Based on BERT, Int. J. Comput. Intell. Syst., 2021. This paper proposes a named entity recognition model based on BERT for extracting entities more accurately and efficiently from a large amount of historical and cultural information. The model uses the BERT pre-trained language model in place of static word vectors trained in the traditional way, dynamically generating semantic vectors according to the context of words and thus improving the representation ability of word vectors. Experimental results show that the model achieves excellent results on named entity recognition in the field of history and culture.

 

- MAUIL: Multilevel attribute embedding for semisupervised user identity linkage (2022). This work presents a method combining character-level and word-level feature extraction for entities, which could be well applied to Chinese NER, but the manuscript does not mention it.

 

There are a few long and complex sentences in the paper body which could be split for clarity, for example: "Moreover, this paper proposes a Short-sequence CNN that utilizes the generation of shorter subsequences for encoding with a sliding window module to enhance the perception of local information of characters." (Line 10)

 

Some abbreviations like GAT are used before being defined which can be confusing. (Line 166)

 

The conclusion focuses on summarizing the proposed techniques but could provide more analysis of key takeaways, limitations, and future work.

 

I recommend accepting this paper after minor revision. It proposes novel techniques for effectively combining neural networks and lexical knowledge to push the state of the art in Chinese NER. The innovations and experimental results are promising.

Good.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

I have thoroughly reviewed your manuscript and find your proposed approach for enhancing Chinese Named Entity Recognition (NER) both innovative and promising. The use of graph attention networks, local text features through short-sequence CNNs, and an LSTM demonstrates strong potential for improving NER performance. The well-organized and clear manuscript comprehensively analyzes the approach's effectiveness. I suggest a few minor improvements:

 

  1. In the abstract, the authors could briefly mention how much their proposed method outperforms existing models. For instance, they could highlight percentage improvements or other relevant metrics to showcase the superiority of their approach.
  2. To discuss the proposed CNN-LSTM hybrid model, the authors should briefly reference Chiu and Nichols' work [15] and explain how their approach builds upon or differs from that method.
  3. While the authors touch upon the limitations of existing methods, they could explicitly state how their proposed approach bridges the identified gaps. Clearly articulating how the method addresses these gaps would provide a more substantial justification for the research.
  4. For more intricate processes like the SoftLexicon method, the authors may consider providing a step-by-step example illustration to help readers grasp the concept more easily.
  5. While the approach section offers a comprehensive overview, the authors might consider including a brief paragraph or subsection that discusses implementation details, such as hyperparameters or training procedures.
  6. After presenting the ablation study results, the authors should briefly interpret them, discussing how removing specific components impacts the model's performance.
  7. In the "Overall Results" section, the authors could briefly discuss how their proposed method compares to the baseline methods regarding F1 scores and performance trends across different datasets.
  8. The authors should consider adding a brief paragraph discussing any limitations of their approach or potential areas for improvement in future work.
  9. The last section (Discussion) could be replaced by a conclusion, since it summarizes the main contributions, discusses the significance of the results, and highlights areas for further investigation.
  10. The formatting of some references may need adjustment to follow a consistent style. For example, the capitalization of titles and authors' initials varies across references.
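The step-by-step illustration requested in point 4 could, for instance, take the following form. This is a minimal sketch of the SoftLexicon matching step as published by Ma et al., with a small hypothetical lexicon; the manuscript's exact variant may differ.

```python
def soft_lexicon_sets(sentence, lexicon):
    """For each character, collect all matched lexicon words into four
    sets (B/M/E/S) according to the character's position in the match:
    Begin, Middle, End, or Single-character word."""
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()}
            for _ in sentence]
    n = len(sentence)
    for i in range(n):
        for j in range(i, n):
            word = sentence[i:j + 1]
            if word not in lexicon:
                continue
            if i == j:
                sets[i]["S"].add(word)       # single-character match
            else:
                sets[i]["B"].add(word)       # word begins here
                sets[j]["E"].add(word)       # word ends here
                for m in range(i + 1, j):
                    sets[m]["M"].add(word)   # word covers the middle

    return sets

# hypothetical lexicon over the classic ambiguous sentence
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sets = soft_lexicon_sets("南京市长江大桥", lexicon)
```

Each character's four sets are then pooled into fixed-size vectors and concatenated with the character embedding, which is what makes the method attractive as a lightweight alternative to lattice architectures.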

The quality of the English language is good. The text is well-structured, and the writing is clear and coherent. No apparent grammatical or syntactical errors would hinder the understanding of the content. 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Dear Authors

 

Firstly, the manuscript clearly presents research on an enhanced method for Chinese Named Entity Recognition (NER) using local information perception. Below, I have provided feedback which I believe will help in refining and enhancing the quality of the manuscript.

 

General information

 

The article identifies a common problem in Chinese NER, specifically the challenge of integrating lexical information and considering the context surrounding Chinese characters.

The paper offers a concrete solution by suggesting the use of graph attention networks and introduces a "Short-sequence CNN" which seems to be novel in this context. The use of the "南京市江大" and “高雄(Kaohsiung)” examples offers concrete illustrations of the challenges faced in Chinese NER, making complex concepts more understandable. The statement regarding "entity ambiguity because of polysemous words in Chinese texts" could benefit from a brief elaboration for clarity.

The article's core contributions and methodologies are introduced clearly, with specific mentions of graph attention networks, the concept of Zone-K, CNNs, and LSTMs.

Some sentences are quite long, which can make it challenging to follow. Breaking them into shorter, more digestible pieces would improve readability.

 

The Zone-K approach, based on the provided text, appears to be a promising strategy for addressing challenges in Chinese NER. It incorporates contextual information in a nuanced manner, bridging the gap between local lexical semantics and global entity recognition.

 

Potential Concerns: 1) One potential concern might be the selection of the size of Zone-K; the article doesn't elaborate on how the optimal size is determined, which could influence the model's performance. 2) Introducing concepts like semantic mutation zones and integrating multiple methodologies (like graph attention networks and CNNs) could complicate the model. The complexity might pose challenges in implementation, optimization, or scaling.

 

Detailed analysis:

 

Part 3

The authors introduce a model that combines four distinct network modules. The architecture aims to balance the need for global and local sequence information, the latter of which is essential for capturing character-based nuances in Chinese text. The inclusion of lexicon information adds another layer of contextual understanding.

The approach seems comprehensive, blending both character and word information for Chinese NER. Chinese, being a language where a single character can carry significant meaning, can benefit from such a mixed approach.

The use of BERT, a high-performing pre-trained model, suggests the potential for high accuracy in the NER task. BERT's capability to understand context can be particularly valuable in addressing polysemy in Chinese.

Combining local details from CNNs and global sequence details from Bi-LSTMs gives the model a balanced perspective. While CNNs are adept at capturing local features (e.g., n-grams or short sequences), Bi-LSTMs can consider long-term dependencies in a sequence, offering a broader context.

The use of CRF for decoding is apt for sequence labeling tasks like NER. CRFs consider the entire sequence when predicting each label, thus ensuring that the label sequence is optimal, not just individual labels.
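The sequence-level optimality the CRF provides is typically realised with Viterbi search at decoding time; a minimal NumPy sketch follows, illustrative only and not the authors' implementation.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Best label path under a linear-chain CRF score:
    sum_t emissions[t, y_t] + sum_t transitions[y_{t-1}, y_t]."""
    T, L = emissions.shape
    score = emissions[0].copy()          # best score ending in each label
    back = np.zeros((T, L), dtype=int)   # argmax back-pointers
    for t in range(1, T):
        # total[prev, cur]: extend each previous label with each current one
        total = score[:, None] + transitions + emissions[t]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):        # follow back-pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Because the transition matrix scores whole label pairs, impossible label bigrams (for instance O followed by I) can be penalised so that the decoded sequence stays globally consistent, which is exactly the property the review highlights.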

The described approach is robust and demonstrates a well-thought-out structure for the challenges of Chinese NER.

 

Strengths

The authors have designed a holistic model that seeks to capture both global and local features in text. The combination of LSTMs, CNNs, and GATs is both innovative and theoretically robust.

Inclusion of Lexicon: By integrating lexicon-based features, the model can leverage external knowledge, possibly improving accuracy on real-world Chinese NER tasks.

Visualization: The inclusion of figures like Figure 6 and Figure 7 provides visual insights into the discussed concepts, aiding reader comprehension.


 

Areas for Discussion

 

While comprehensive, the model is quite complex. The practicality of such a model could be in question, especially regarding computational resources and training time.

Validation Needed: The efficacy of the model will largely depend on its performance on real-world datasets. Empirical validation, in the form of experimental results, would strengthen the paper.

Clarification on "BMES": A more detailed explanation of the "BMES" positional labels would benefit readers unfamiliar with this concept.
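As a pointer for that clarification, BMES is the standard positional labelling scheme from Chinese word segmentation; a generic sketch (not the manuscript's code) is:

```python
def bmes_tags(word):
    """BMES positional labels: B(egin), M(iddle), E(nd) for words of two
    or more characters; S(ingle) for one-character words."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]

def segmentation_to_bmes(words):
    """Character-level BMES tag sequence for a segmented sentence."""
    return [tag for word in words for tag in bmes_tags(word)]
```

For example, a segmentation into "南京市" and "长江大桥" yields one tag per character, so lexicon matches can be aligned with character positions.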

 

Part 4

 

The research utilized four prominent Chinese datasets encompassing different domains. This broad scope ensures the model's robustness across various use cases.

The proposed model outperforms multiple renowned models in terms of F1 scores across multiple datasets. This highlights its effectiveness and superiority over existing methods.

 

Area for Discussion

 

1. Given the model's performance on Chinese datasets, how well would it transfer to other languages, especially those with complex morphologies or different scripts?

2. With the complexities introduced by the proposed method, how scalable is this approach for even larger datasets or in real-world, high-throughput scenarios?

3. With the blending of various techniques and the intricacies of neural networks, how interpretable is the model's decision-making process? Interpretability is often crucial for understanding model predictions and ensuring trustworthiness in practical applications.

 

The research showcases the utilization of the GAT (Graph Attention Network) layer to extract and process rich local information, which enhances entity recognition in Chinese texts. Two case studies demonstrate this improvement.

 

Area for Discussion

 

While the GAT module improves recall and overall detection of entities, there's a trade-off as it occasionally introduces false positives. This can be acceptable in scenarios where ensuring every entity is captured is crucial, even if it means dealing with some noise.

Despite the improvements with the GAT layer, challenges persist, such as mispredicting continuous single entity labels. This suggests there's room for further refinement.

The manual examination of 37 sentence samples offers insights into the model's behavior. However, the selection criteria and the representativeness of these samples should be scrutinized.

The study's findings are valuable for any application relying on accurate entity recognition in Chinese texts, such as information extraction, recommendation systems, or search engines.

Considering the challenges faced by the current model, exploring additional mechanisms, architectures, or integrating other sources of information (like external knowledge graphs) might enhance the model's performance further.

 

Part 5

 

The research introduces an enhanced method for Chinese Named Entity Recognition (NER) by leveraging a local information perception technique. The Graph Attention Network (GAT) is used to fuse character-level details with matching words, as well as the contextual data of adjacent matching words. Further, the method employs Convolutional Neural Network (CNN) sliding windows to capture local textual attributes and an LSTM to understand global characteristics. Experiments were conducted using four distinct Chinese datasets, and the results indicated superior performance of the proposed method compared to baseline models, especially in leveraging surrounding context for entity recognition.

 

Discussion Points

 

1. The approach innovatively uses GAT for character fusion and CNNs for local text feature extraction. How do these different layers and components complement each other?

2. The study uses four distinct datasets from different domains, offering a holistic evaluation. How does the performance vary across these datasets, and what insights can we derive from this?

3. The focus on enhancing the quality of Chinese NER performance using single-stream networks and handling informal text is promising. What challenges do these specific areas present?

4. The method's proficiency in understanding local contexts can revolutionize many NER applications. In what sectors or scenarios might this methodology be most impactful?

5. The study's reliance on publicly available datasets promotes reproducibility and transparency. How might this openness influence further developments in the field?

6. With multiple authors contributing to different aspects of the research (algorithms, coding, data management), how does this interdisciplinary collaboration enhance the quality and breadth of the study?

 

Best regards

Reviewer

Author Response

Please see the attachment.

Author Response File: Author Response.docx
