Article
Peer-Review Record

Automatic Taxonomy Classification by Pretrained Language Model

Electronics 2021, 10(21), 2656; https://doi.org/10.3390/electronics10212656
by Ayato Kuwana, Atsushi Oba, Ranto Sawai and Incheon Paik *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 September 2021 / Revised: 24 October 2021 / Accepted: 26 October 2021 / Published: 29 October 2021
(This article belongs to the Section Computer Science & Engineering)

Round 1

Reviewer 1 Report

The paper is well written. You have presented the effect of the batch size on the computation time with your BERT-based model. Could you please also report the effect of the batch size on the accuracy of the model? The comparison currently covers batch size versus computational time only; I would like to see how the batch sizes compare in terms of accuracy as well.
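As context for this request: with a fixed number of epochs, the batch size determines how many gradient updates the model receives, which is why it affects both training time and the accuracy reached. A minimal sketch (the dataset size of 10,000 is illustrative, not taken from the paper):

```python
import math

def steps_per_epoch(num_examples, batch_size):
    """Number of gradient updates performed in one pass over the data."""
    return math.ceil(num_examples / batch_size)

# With a fixed number of epochs, halving the batch size doubles the
# number of parameter updates, which changes both the wall-clock time
# and the accuracy the model can reach.
for bs in (8, 16, 32, 64):
    print(bs, steps_per_epoch(10_000, bs))
```

Halving the batch size doubles the updates per epoch; whether those extra updates translate into higher accuracy is exactly what the reviewer asks the authors to measure.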

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Interesting work which tackles a classic issue in ontology engineering: generating taxonomy elements from text. Important advances in NLP, here a pretrained language model with fine-tuning, invite a fresh look at this problem. Relating it to the QA problem is also very interesting.

 

Nevertheless, an ontology is more than a taxonomy (see Ian Horrocks. 2008. Ontologies and the semantic web. Commun. ACM 51, 12 (December 2008), 58–67. DOI: https://doi.org/10.1145/1409360.1409377). I would therefore suggest replacing the term "Ontology" with "Taxonomy" in the title.

 

Also, the paper must be improved at several levels; my suggestions are below:

- motivation: explain how the produced ontology can actually be used to populate a domain-specific taxonomy. Tables 7 and 8 show some interesting examples, but noise and the lack of coverage and exhaustiveness with respect to a specific domain are problems that should be discussed.

- application to the QA problem: clarify how the produced hypernym–hyponym relationship pairs relate specifically to the QA dataset. Further explanation of this dataset would help the reader understand.

- related work: ontology generation from text using NLP is a well-known technique and has been widely used to generate candidate ontology elements. New-generation NLP (pretrained language models, neural networks, etc.) invites a fresh look at this problem. A more exhaustive state of the art should therefore be included.

- error analysis: an error analysis of the results is completely absent from the paper and should be added immediately after the experiments.
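For readers unfamiliar with the technique behind the hypernym–hyponym pairs discussed above: such pairs are classically harvested with lexico-syntactic (Hearst-style) patterns. A minimal sketch, using a deliberately naive single-word pattern and an invented example sentence (real systems rely on a parser for noun-phrase boundaries):

```python
import re

# One classic Hearst pattern: "<hypernym> such as <hyponym>{, <hyponym>}".
# The single-word \w+ matcher is deliberately naive; real systems use a
# parser or chunker to find noun-phrase boundaries.
PATTERN = re.compile(r"(?P<hyper>\w+) such as (?P<hypos>\w+(?:, \w+)*)")

def extract_pairs(sentence):
    """Return (hyponym, hypernym) pairs matched by the pattern."""
    pairs = []
    for m in PATTERN.finditer(sentence):
        for hypo in m.group("hypos").split(", "):
            pairs.append((hypo, m.group("hyper")))
    return pairs

print(extract_pairs("instruments such as guitars, violins"))
# [('guitars', 'instruments'), ('violins', 'instruments')]
```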

 

Minor remarks:

- some sentences are phrased so loosely that their meaning is unclear. For instance, "As steps 2 and 3 are specific" in line 53, and "The first step is a general problem in natural language processing" in line 51.

- Line 63: add bibliographic references giving examples of some of the "existing methods".

- Line 74: which method? Add a bibliographic reference.




Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The research is interesting and important, but the description is confusing. My remarks:

  1. Please clarify whether you are training the BERT-based models yourself. From lines 79–80, 162, and 187–188 it seems you are, using the Wikipedia and BookCorpus data; from lines 195–196 it seems you are only reusing/fine-tuning existing ones. If you are training the BERT-based models yourself, please clarify the purpose of doing so and provide more details about the training corpora used. Many BERT models already exist for English (https://huggingface.co/transformers/pretrained_models.html): they are trained on huge corpora and are very accurate. If you are reusing existing BERT models, please clarify which ones. Does Table 3 present models trained by you, or only reused ones?
  2. Can you provide more details about the lexico-syntactic patterns used to extract nouns (lines 269–270)? Please also clarify which deep learning-based model you are using to extract noun phrases (lines 275–276).
  3. Does Table 6 present test results obtained with WordNet only? Since you do not provide the distribution of relations in the test split (only statistics for the whole dataset in Table 2), the results in Table 6 are difficult to interpret. Are the differences between the results statistically significant? Do they exceed random and majority-class baselines?
  4. I do not understand the purpose of the experiment presented in Figure 8. A smaller batch size always increases the training accuracy here. Does the training duration matter to you? Training a DNN is always a long process, but the main goal is an accurate model, not a shorter training time. However long the model is trained, the testing time remains almost the same.
  5. It is not clear how successful the extraction of nouns and noun phrases from the SQuAD dataset was, since you present only selected relations in Table 7 and Table 8. Without a deeper error analysis, it is impossible to draw interpretations or conclusions from these specific cases alone.
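Regarding point 3, one simple way to check whether an accuracy difference is statistically significant is a paired bootstrap over per-example correctness. A minimal sketch with synthetic 0/1 correctness flags (the function name and data are illustrative, not from the paper):

```python
import random

def bootstrap_pvalue(correct_a, correct_b, n_resamples=2000, seed=0):
    """Paired bootstrap test for the accuracy difference of two models.

    correct_a / correct_b: per-example 0/1 correctness flags on the same
    test set. Returns the fraction of resamples in which the observed
    accuracy difference is erased or reversed (a rough p-value).
    """
    rng = random.Random(seed)
    n = len(correct_a)
    observed = sum(correct_a) - sum(correct_b)
    erased = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        diff = sum(correct_a[i] - correct_b[i] for i in idx)
        if diff * observed <= 0:  # difference vanished or flipped sign
            erased += 1
    return erased / n_resamples

# Model A is correct on 90/100 examples, model B on 50/100.
model_a = [1] * 90 + [0] * 10
model_b = [1] * 50 + [0] * 50
print(bootstrap_pvalue(model_a, model_b))
```

A small returned value indicates the difference survives resampling; identical per-example behaviour yields 1.0.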

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Thank you for taking into consideration my suggestions and for your work.
