Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification
Round 1
Reviewer 1 Report
Title:
****
Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification
Overview:
**********
This paper provides a brief overview and an empirical evaluation and comparison of 35 deep learning methods for Aspect Level Sentiment Classification (ALSC). The empirical evaluation is comprehensive and provides a nice contribution to the literature. The paper’s English needs to be carefully revised.
Positive points:
****************
The paper is well motivated and well organized. Experimental results seem satisfactory and promising.
Negative points and improvements:
**************************************
- The paper’s abstract is somewhat confusing. This might be due to the usage of the English language. I suggest the authors carefully refine it to better highlight the main contributions of the paper.
- The authors need to briefly introduce the different kinds of approaches for ALSC before focusing on DL-based methods (supervised versus unsupervised, knowledge-based versus corpus-based, graph-based versus deep learning). The authors can refer to the recent publications below in their literature review:
- Sumaia Mohammed Al-Ghuribi, Shahrul Azman Mohd. Noah, Sabrina Tiun: Unsupervised Semantic Approach of Aspect-Based Sentiment Analysis for Large-Scale User Reviews. IEEE Access 8: 218592-218613 (2020)
- Mireille Fares, Angela Moufarrej, Eliane Jreij, Joe Tekli, William I. Grosky: Unsupervised word-level affect analysis and propagation in a lexical knowledge graph. Knowl. Based Syst. 165: 432-459 (2019)
- The authors need to carefully revise the paper’s English. The paper includes many typos.
Author Response
- The paper’s abstract is somewhat confusing. This might be due to the usage of the English language. I suggest the authors carefully refine it to better highlight the main contributions of the paper.
Response: Thank you for your comment. We have revised our abstract to make it clear and more understandable. We have also mentioned the key contributions of our work in the abstract.
- The authors need to briefly introduce the different kinds of approaches for ALSC before focusing on DL-based methods (supervised versus unsupervised, knowledge-based versus corpus-based, graph-based versus deep learning). The authors can refer to the recent publications below in their literature review:
- Sumaia Mohammed Al-Ghuribi, Shahrul Azman Mohd. Noah, Sabrina Tiun: Unsupervised Semantic Approach of Aspect-Based Sentiment Analysis for Large-Scale User Reviews. IEEE Access 8: 218592-218613 (2020)
- Mireille Fares, Angela Moufarrej, Eliane Jreij, Joe Tekli, William I. Grosky: Unsupervised word-level affect analysis and propagation in a lexical knowledge graph. Knowl. Based Syst. 165: 432-459 (2019)
Response: Thank you for your suggestions. Following your suggestion, we have discussed the different kinds of approaches as well. Please refer to paragraphs 1 and 2 of Section 3.1 on page 4.
We have also updated the references as suggested.
- The authors need to carefully revise the paper’s English. The paper includes many typos.
Response: Thank you for raising this concern. We have now thoroughly rechecked our paper for English grammar and typos, and the sentences have been restructured. To ensure better quality, we have double-checked our work with Grammarly’s paid premium service.
The updated text has been highlighted in the manuscript.
Reviewer 2 Report
Congratulations to the Authors - good and novel methodology, excellent experimental setup, very useful results.
The only potential methodological issue is related to the tuning of hyperparameters. According to "4.2. Experimental design", the Authors preselected hyperparameter values for all models of interest. The performance of modern models is highly dependent on proper hyperparameter selection, so this issue should be addressed (at least discussed and mentioned in the limitations of the study).
Author Response
The only potential methodological issue is related to the tuning of hyperparameters. According to "4.2. Experimental design", the Authors preselected hyperparameter values for all models of interest. The performance of modern models is highly dependent on proper hyperparameter selection, so this issue should be addressed (at least discussed and mentioned in the limitations of the study).
Response: Thank you for your comment. We agree that model performance may vary with different hyperparameter values. However, tuning the hyperparameters of all 35 deep learning methods would require substantial computational cost and time; we will therefore try to address this issue in future work. Meanwhile, we have added this as a limitation of our study.
Kindly refer to the second-to-last paragraph of Section 6 on page 22.
The updated text has been highlighted in the manuscript.