SocialTERM-Extractor: Identifying and Predicting Social-Problem-Specific Key Noun Terms from a Large Number of Online News Articles Using Text Mining and Machine Learning Techniques
Round 1
Reviewer 1 Report
I found this paper technically sound. It is quite extensive, but I think it is clear and well structured. In my opinion, it is publishable as is.
Author Response
Author Response File: Author Response.docx
Reviewer 2 Report
This is a well-written and well-presented paper, performing a thorough comparison of text classification methods for predicting social problem-specific key noun terms for the Korean language.
The author's motivation lies in the domain of social problems. In particular, the author attempts to exploit the deluge of data residing and continuously produced on the web in order to assist in identifying social problems and the locations where they emerge.
An adequate state-of-the-art review is performed, where representative papers of the various techniques and approaches are presented. The evaluation is also adequate: various algorithms and ensemble techniques are applied and compared on a large dataset of Korean news articles. In this context, the work is interesting.
However, the approach taken is not novel. Similar approaches have been presented in other domains (not the social one), and the particularity of this domain is not evident (at least to my eyes). Additionally, the current state of the art in text mining (embeddings/vector space models), as well as topic extraction techniques (LDA), is not adequately considered.
Additionally, the tables are difficult to read. The best-performing algorithm has to be clearly highlighted. All in all, I believe that this is an interesting applied paper and could be considered for publication; however, its novelty is limited.
Author Response
Author Response File: Author Response.docx
Reviewer 3 Report
1. The key information that the experimental research was performed on texts in the Korean language is missing in the abstract, introduction and conclusions.
2. The introduction should be reconsidered for shortening, as it goes quite far away from the actual research topic of the paper, elaborating too much on social problems in South Korea.
3. There are a number of minor editing errors to be corrected:
- technological knowledge shares (...) has been => have been
- Almeida, et al. => Almeida et al.
- vs => vs.
- Table 4. 18 complex network structural features... => Table 4. Complex network structural features...
- Table 5. 20 classification techniques... => Table 5. Classification techniques...
- SoicalTERMs => SocialTERMs
4. Two suggestions for consideration:
- it is unclear why a special notation is used for the term skewedness:
|skewedness|
- it would be good to provide proper references to the data sources (e.g. SentiWordNet) and tools (e.g. JUNG) that are mentioned
Author Response
Author Response File: Author Response.docx