Peer-Review Record

Coreference Resolution for Improving Performance Measures of Classification Tasks

Appl. Sci. 2023, 13(16), 9272; https://doi.org/10.3390/app13169272
by Kirsten Šteflovič 1,* and Jozef Kapusta 1,2
Reviewer 1:
Reviewer 3: Anonymous
Submission received: 11 May 2023 / Revised: 5 August 2023 / Accepted: 8 August 2023 / Published: 15 August 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The aim of the paper is to identify fake news using coreference resolution, together with natural language processing, machine learning, and neural network algorithms. The authors propose a methodology to determine whether pre-processing the text with coreference resolution can improve performance measures in classification tasks.

The paper brings new results on the highly topical subject of fake news detection. It is written as an original scientific article, with goals and methodology indicated and the research contribution pointed out. Sentences and paragraphs are interrelated in a logical order.

The paper is suggested for acceptance, with some suggestions for corrections to improve readability and understanding:

Give examples of fake news: what does it look like? Is the fake news collected from the web, from messages, or…?

Provide a few examples of what coreference annotation looks like in the corpus used.

In the related work there are examples of ML techniques used for the detection of phishing messages, together with corpus descriptions and evaluation results, which might be useful to compare: Seljan et al. (2023), Information Extraction from Security-Related Datasets; Kovač et al. (2022), An overview of machine learning algorithms for detecting phishing attacks on electronic messaging services.

Provide a paragraph describing the dataset in more detail and show what it looks like. Make a distinction between fake and legitimate news by marking concrete examples.

Present clearly what the system will classify: correct or incorrect coreference resolutions, or fake news? Give examples.

Explain what weak declarative value is and the role of pronouns within it.

The sentence in rows 61-63 is too long and unclear – please rewrite it.

It is advised to use a neutral mode of expression instead of personal constructions (Our results show -> Results show…; In the research, we used the XY method -> In the research, the XY method was used…), which is more suitable for research writing.

Conclusion – restate the most important concrete results in numbers.

The paper is suggested for publishing after corrections.

Light language editing needed.

Author Response

Reviewer#1, Concern # 1:  Give examples of fake news: what does it look like? Is the fake news collected from the web, from messages, or…?

Author response: 

Thank you for the suggestion. We have added a sample of the KaiDMML dataset to the article.

 

Reviewer#1, Concern # 2:  Provide a few examples of what coreference annotation looks like in the corpus used.

Author response: 

Thank you for the suggestion. We have added an example of coreference annotation to the article in Figure 3.

 

Reviewer#1, Concern # 3:  In the related work there are examples of ML techniques used for the detection of phishing messages, together with corpus descriptions and evaluation results, which might be useful to compare: Seljan et al. (2023), Information Extraction from Security-Related Datasets; Kovač et al. (2022), An overview of machine learning algorithms for detecting phishing attacks on electronic messaging services.

Author response: 

We have added these works to Section 2: Related Work.

 

Reviewer#1, Concern # 4:  Provide a paragraph describing the dataset in more detail and show what it looks like. Make a distinction between fake and legitimate news by marking concrete examples.

Author response: 

Thank you for the suggestion. We have added a sample of the KaiDMML dataset to the article. The article also contains links to publications with a detailed description of the dataset by its original authors, so we consider our description sufficient.

 

Reviewer#1, Concern # 5:  Present clearly what the system will classify: correct or incorrect coreference resolutions, or fake news? Give examples.

Author response: 

Thank you for the suggestion. We realized the information was not formulated clearly in the article. We have added explanatory notes on the methodology to the Introduction chapter, together with a paragraph that explains the methodology. At the same time, we added information about the classification to the Conclusion.

 

Reviewer#1, Concern # 6:  Explain what weak declarative value is and the role of pronouns within it.

Author response: 

"Weak declarative value" refers to the replacement of pronouns with an explicit representation of the objects they refer to in order to provide additional information to the text. This means that instead of using pronouns like "it," "he," or "she," the specific entities or objects they represent are stated explicitly. This approach of replacing pronouns with their referents helps to improve the representation of the text by providing clearer and more specific information.

Reviewer#1, Concern # 7:  The sentence in rows 61-63 is too long and unclear – please rewrite it.

Author response: 

Thank you for the suggestion. We have shortened the sentence by half so that it covers only the most important part.

 

Reviewer#1, Concern # 8:  It is advised to use a neutral mode of expression instead of personal constructions (Our results show -> Results show…; In the research, we used the XY method -> In the research, the XY method was used…), which is more suitable for research writing.

Author response: 

Thank you, we have corrected the mistakes.

 

Reviewer#1, Concern # 9:  Conclusion – restate the most important concrete results in numbers.

Author response: 

Thank you for your comments. We have added some important numbers to the Conclusion section to quantify our most important results.

Reviewer 2 Report

The work presented is interesting and very well presented, and I believe it demonstrates a very useful hypothesis. However, there are some issues that need to be improved.

1. Although Section 2 references a number of papers on coreference resolution, it only provides descriptive elements and does not highlight any limitations they may have. On the other hand, it is not clear whether a paper with this same objective has been reported before, since the analysis of the related papers focuses on coreference resolution itself, but not on its impact on classification problems.

2. Considering that there are several approaches to coreference resolution, the experiments should not have considered only one approach; the performance of other approaches should also be evaluated.

3. Several references are incomplete and should be completed with all their metadata.

English is generally good

Author Response

Reviewer#2, Concern # 1:  Although Section 2 references a number of papers on coreference resolution, it only provides descriptive elements and does not highlight any limitations they may have. On the other hand, it is not clear whether a paper with this same objective has been reported before, since the analysis of the related papers focuses on coreference resolution itself, but not on its impact on classification problems.

Author response: 

Unfortunately, while writing Section 2, we did not find articles that verify coreference resolution through its application to classification tasks. Most of the articles listed in Section 2 focus only on projects for coreference resolution itself. This part is a classic "Related work" overview, which is also why we did not focus on discussing the limitations of these approaches. Likewise, we did not find a work with the same goal as the article presented by us.

 

 

Reviewer#2, Concern # 2:  Considering that there are several approaches to coreference resolution, the experiments should not have considered only one approach; the performance of other approaches should also be evaluated.

Author response: 

Of course, there are multiple libraries and approaches to coreference resolution. To solve our problem, we chose the library with which we have the best previous experience. Our goal was to determine the effect of coreference resolution on performance measures for classification tasks. The remark is certainly valid, and our article could be extended with many more comparisons; however, this could make the article confusing and come at the expense of its main objective.

 

 

Reviewer#2, Concern # 3:  Several references are incomplete and should be completed with all their metadata.

Author response: 

Thank you. We had not noticed the incomplete references; they have been fixed.

Reviewer 3 Report

Comments

The authors reported an evaluation of coreference resolution (CR) to determine whether preprocessing (using CR) can help to improve machine learning (ML) classification. They used TF-IDF and Doc2Vec techniques with DT, RF, KNN, and MNB machine learning classifiers and reported their performance on only one dataset. Although the topic is attractive, the authors must make extensive changes to the manuscript's experiments.

 

Reviews:

 

  1. In the abstract, emphasize the problem statement in the early sentences. Mention how much accuracy (%) is improved after this preprocessing.
  2. Do we still need other preprocessing techniques (i.e., tokenization, POS tagging, stemming, etc.) besides CR for the ML classification improvement?
  3. Figure 1 is very simple. Draw a systematic diagram to show the relationship or process of CR for NLP.
  4. Section 2: briefly state prior studies' gaps and limitations and summarize the major and minor challenges in the last paragraph. Table 1: also add the best performance results (%) and the limitations/gaps of those studies.
  5. There are many typos and grammar mistakes.
  6. Section 3: the name should be "Methodology" or similar, but not "Materials and Methods". Add a mathematical foundation to the methods.
  7. The font size is inappropriate for Figures 5, 6, and 7. Briefly mention (a), (b), (c), and (d) in the captions.
  8. Provide a comparison with SOTA studies from the latest publications, and also add your code link to let the reviewer confirm the results' integrity.
  9. Since the authors evaluate whether CR can improve the ML results, it is strongly suggested to make comparisons with at least five ML classifiers.
  10. How do the authors solve the issues of anaphora and ellipsis resolution for understanding word or phrase context and identifying missing elements, respectively?
  11. Since the authors provide comparisons with TF-IDF and Doc2Vec, why did they not consider other word embedding techniques, i.e., GloVe (https://www.mdpi.com/2076-3417/11/23/11255) and Word2Vec (10.1109/RCAR52367.2021.9517671)?

Author Response

Reviewer#3, Concern # 1:  In the abstract, emphasize the problem statement in the early sentences. Mention how much accuracy (%) is improved after this preprocessing.

Author response: 

Thank you for the suggestion. The last sentences of the abstract were misleading. We have corrected them and stated how much the accuracy increased.

 

Reviewer#3, Concern # 2:  Do we still need other preprocessing techniques (i.e., tokenization, POS tagging, stemming, etc.) besides CR for the ML classification improvement?

Author response: 

Certainly yes. Of the given examples, we mainly need tokenization. However, tokenization is an elementary part of TF-IDF and Doc2Vec (if you create the vectors through the available libraries). Based on your comment, we have added this information to the beginning of the article.
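As a minimal sketch of this point, scikit-learn's TfidfVectorizer tokenizes internally, so no separate tokenization step is needed before vectorization; the toy documents below are hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "The senator denied the report.",
    "The report was denied by the senator.",
]

# TfidfVectorizer applies its own regex-based tokenization and lowercasing,
# so explicit tokenization is not required before building the vectors.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.shape)                             # (number of documents, vocabulary size)
```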

 

Reviewer#3, Concern # 3:  Figure 1 is very simple. Draw a systematic diagram to show the relationship or process of CR for NLP.

Author response: 

The whole process consists of corpus preparation and model training for CR identification. Presenting it in full would unnecessarily complicate the readability of the article, so we did not include such a diagram. Figure 1 is presented only to clarify the methodology of the article and is, in our opinion, sufficient.

 

Reviewer#3, Concern # 4:  There are many typos and grammar mistakes.

Author response: 

Thank you for the reminder. We have checked the article again and believe that we have corrected all the typos and grammar mistakes.

 

Reviewer#3, Concern # 5:  Section 3: the name should be "Methodology" or similar, but not "Materials and Methods". Add a mathematical foundation to the methods.

Author response: 

Section 3 was renamed to “Methodology”.

 

Reviewer#3, Concern # 6:  The font size is inappropriate for Figures 5, 6, and 7. Briefly mention (a), (b), (c), and (d) in the captions.

Author response: 

The size of the images, and thus the font size, will be adjusted according to the journal's instructions before final publication. We added (a), (b), (c), and (d) to the captions. Thank you for the suggestion.

 

Reviewer#3, Concern # 7:  Provide a comparison with SOTA studies from the latest publications, and also add your code link to let the reviewer confirm the results' integrity.

Author response: 

We have uploaded the source code and the results to GitHub. The dataset is not included because of its size.

https://github.com/ksteflovic/coreference-resolution_word-vectors

 

Reviewer#3, Concern # 8:  Since the authors evaluate whether CR can improve the ML results, it is strongly suggested to make comparisons with at least five ML classifiers.

Author response: 

Thank you for the suggestion. In addition to the DecisionTree, RandomForest, MultinomialNB, and K-Neighbors classifiers mentioned in the article, we also calculated performance measures for the LogisticRegression, SGD, LinearSVC, and GradientBoosting classifiers. Including all of them in the article would, in our opinion, make it confusing. However, we present their results, together with a visualization, in a table on GitHub:

https://github.com/ksteflovic/coreference-resolution_word-vectors
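A minimal sketch of how such a multi-classifier comparison can be produced with scikit-learn follows; the synthetic feature matrix stands in for the TF-IDF or Doc2Vec document vectors, and the classifier settings are illustrative, not the exact configuration in the repository:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Stand-in for the TF-IDF/Doc2Vec document vectors and fake/real labels.
X, y = make_classification(n_samples=400, n_features=50, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

classifiers = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "SGD": SGDClassifier(random_state=42),
    "LinearSVC": LinearSVC(),
    "GradientBoosting": GradientBoostingClassifier(random_state=42),
}

# Fit each classifier and report the same performance measures side by side.
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    prec, rec, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary"
    )
    print(f"{name}: acc={accuracy_score(y_test, y_pred):.3f} "
          f"P={prec:.3f} R={rec:.3f} F1={f1:.3f}")
```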

 

Reviewer#3, Concern # 9:  How do the authors solve the issues of anaphora and ellipsis resolution for understanding word or phrase context and identifying missing elements, respectively?

Author response: 

The NeuralCoref algorithm already includes anaphora resolution, which focuses on resolving references to previously mentioned entities or concepts, specifically involving pronouns. Coreference resolution involves identifying and linking all mentions that refer to the same entity or concept, encompassing a broader range of reference resolution. We did not address ellipsis resolution, as it lies outside the focus of our problem.
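As a small illustration of the anaphora handling described above, NeuralCoref exposes the detected mention clusters directly; the example sentence is hypothetical:

```python
import spacy
import neuralcoref

nlp = spacy.load("en_core_web_sm")
neuralcoref.add_to_pipe(nlp)

doc = nlp("Anna lost her phone. She thinks it was stolen.")

# Each cluster links a main mention to the anaphoric pronouns that refer
# back to it (e.g. "her"/"She" -> "Anna", "it" -> "her phone").
if doc._.has_coref:
    for cluster in doc._.coref_clusters:
        print(cluster.main, "<-", cluster.mentions)
```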

 

Reviewer#3, Concern # 10:  Since the authors provide comparisons with TF-IDF and Doc2Vec, why did they not consider other word embedding techniques, i.e., GloVe (https://www.mdpi.com/2076-3417/11/23/11255) and Word2Vec (10.1109/RCAR52367.2021.9517671)?

Author response: 

Our goal in the article was to present one "traditional" word embedding method, i.e., TF-IDF, and one embedding from the group of methods focused on the semantic proximity of words (e.g., Doc2Vec, Word2Vec, GloVe, BERT, etc.). Doc2Vec is in fact a variation of Word2Vec, so by choosing Doc2Vec we effectively covered the Word2Vec approach among the semantic word embeddings.
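For completeness, a minimal sketch of the Doc2Vec side with gensim follows; the corpus contents and hyperparameters are illustrative only, not the settings used in the experiments:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from gensim.utils import simple_preprocess

texts = [
    "The senator denied the report.",
    "The report was denied by the senator.",
    "Scientists announced a new vaccine trial.",
]

# Doc2Vec, a document-level extension of Word2Vec, learns one dense vector
# per document; simple_preprocess performs the tokenization.
corpus = [TaggedDocument(simple_preprocess(t), [i]) for i, t in enumerate(texts)]
model = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=40)

# Infer a vector for an unseen document; these vectors feed the classifiers.
vec = model.infer_vector(simple_preprocess("The senator denied it."))
print(vec.shape)  # (50,)
```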

Round 2

Reviewer 3 Report

Briefly examining the revised version and the corresponding responses from the authors, it has come to my attention that certain questions and suggestions, clearly highlighted in the attached document, have not been adequately addressed. Furthermore, the responses provided by the authors are deemed unsatisfactory. In adherence to academic norms, authors must respond to reviewer inquiries with scholarly rigor, following established protocols for addressing feedback. It is imperative that the authors conscientiously incorporate the reviewer's feedback and suggestions into the revised manuscripts. By doing so, they can uphold the standards of academic discourse and ensure the integrity of their research contribution.

I have not reviewed the revised manuscript because the authors avoid responding to all queries.

Author Response

Author response: 

In response to the reviewer's previous comments, we have diligently addressed each of them, and we assure you that we have conscientiously incorporated them into the article. To be certain, we have rechecked all the responses to the comments as well as the changes made in the article. Based on the previous comments, we have made the following additional improvements:

 

Reviewer#3, Concern # 6:  Section 3: the name should be "Methodology" or similar, but not "Materials and Methods". Add a mathematical foundation to the methods.

Author response: 

Section 3 was renamed to "Methodology". In addition, we have reconsidered our previous decision and added the formulas used for calculating the performance measures of the individual models to Chapter 3.3.3.
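For reference, the standard definitions behind such performance measures, expressed with true/false positives and negatives (TP, TN, FP, FN), are as follows; the exact notation in Chapter 3.3.3 may differ:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```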

 

Reviewer#3, Concern # 9:  Since the authors evaluate whether CR can improve the ML results, it is strongly suggested to make comparisons with at least five ML classifiers.

Author response: 

In addition to the DecisionTree, RandomForest, MultinomialNB, and K-Neighbors classifiers mentioned in the article, we also calculated performance measures for the LogisticRegression, SGD, LinearSVC, and GradientBoosting classifiers. We present their results, together with a visualization, in a table on GitHub (https://github.com/ksteflovic/coreference-resolution_word-vectors). Furthermore, we selected the results of the Logistic Regression classifier and incorporated them into the article in Figures 7e to 9e. Similarly, we integrated these results into the evaluation of outcomes in Chapter 4.

 

We believe that we did not respond sufficiently to these two comments in our previous reply. If the reviewer is dissatisfied with any of the other previous responses, please mark them again; we will also be grateful for additional comments.

Round 3

Reviewer 3 Report

Regrettably, the provided comments and suggestions have not been fully implemented, and the work's novelty remains lacking despite the authors' contributions.

  1. In the abstract, emphasize the problem statement in the early sentences.
  2. Section 2: briefly state prior studies' gaps and limitations and summarize the major and minor challenges in the last paragraph. Table 1: also add the best performance results (%) and the limitations/gaps of those studies.
  3. Add a mathematical foundation for the proposed methodology in Section 3. In the previous version, the added equations belonged only to well-known performance metrics.
  4. The font size is inappropriate for Figures 5, 6, and 7. Briefly mention (a), (b), (c), and (d) in the captions.
  5. Important - Provide a comparison with SOTA studies from the latest publications, and also add your code link to let the reviewer confirm the results' integrity.

Almost fine.

Author Response

Reviewer#3, Concern # 1:  

In the abstract, emphasize the problem statement in the early sentences.

Author response: 

According to the recommendations of the reviewer and the editor, the abstract was completely revised.

 

Reviewer#3, Concern # 2:  

Section 2: briefly state prior studies' gaps and limitations and summarize the major and minor challenges in the last paragraph. Table 1: also add the best performance results (%) and the limitations/gaps of those studies.

Author response: 

We sincerely apologize, but adding the best performance results (%) and the limitations/gaps of the cited studies to Table 1 made it appear rather inconsistent. However, this information was added to the text of Section 2: Related Work. At the same time, we added further publications to Section 2.

 

Reviewer#3, Concern # 3:  

Add a mathematical foundation for the proposed methodology in Section 3. In the previous version, the added equations belonged only to well-known performance metrics.

Author response: 

Coreference resolution is based on classification methods, mainly neural networks, as is Doc2Vec. Adding the mathematical foundations of neural networks would make the article significantly less clear, since it would mostly restate the well-known mathematical theory of neural networks. For this reason, we did not add a further mathematical foundation to the chapter.

 

Reviewer#3, Concern # 4:  

The font size is inappropriate for Figures 5, 6, and 7. Briefly mention (a), (b), (c), and (d) in the captions.

Author response: 

Thank you for the suggestions. In the new version of the article, we adjusted the size of the images.

 

Reviewer#3, Concern # 5:  

Important - Provide a comparison with SOTA studies from the latest publications, and also add your code link to let the reviewer confirm the results' integrity.

Author response: 

Thank you for the suggestion. We additionally found publications that deal with data augmentation techniques. We have added these to Section 2: Related Work, and we have also provided a broader discussion of the results of those experiments and their comparison with ours in Section 5: Discussion. The reviewer can find the results as well as the source code at: https://github.com/ksteflovic/coreference-resolution_word-vectors
