Article
Peer-Review Record

Improving Deep Mutual Learning via Knowledge Distillation

Appl. Sci. 2022, 12(15), 7916; https://doi.org/10.3390/app12157916
by Achmad Lukman * and Chuan-Kai Yang
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 24 July 2022 / Revised: 1 August 2022 / Accepted: 4 August 2022 / Published: 7 August 2022
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The authors have explained "Improving Deep Mutual Learning via Knowledge Distillation" very well. The figures are well explained, but the quality of the text within the figures could be improved.


The contributions of the paper are not well presented. Please give a precise and clear point-by-point list of contributions.


What is the existing work to date? Compare against it with the help of a table at the end of the related work section.


I am not able to find Section 4 in the paper. Also, it seems that a section before the results section is still missing.


The introduction and related work sections demand more works to be included, such as: a) A hybrid convolutional neural network model for diagnosis of COVID-19 using chest X-ray images; b) Enhanced convolutional neural network model for cassava leaf disease identification and classification; c) Visualization of Customized Convolutional Neural Network for Natural Language Recognition.


Explain the concept with an application area, such as automatic speech recognition.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript titled "Improving Deep Mutual Learning via Knowledge Distillation" deals with knowledge transfer, and the authors propose two new approaches for this purpose, named Full Deep Distillation Mutual Learning and Half Deep Distillation Mutual Learning. The authors are correct that the new approaches have a significant effect on improving the performance of convolutional neural networks. These approaches work with three losses, using variations of existing network architectures, and the experiments have been conducted on three public benchmark datasets. Although the manuscript contains novel ideas and has good prospects, it still has some serious flaws that should be addressed before any decision is reached. Hence I recommend the following major revisions:

1. The manuscript needs comprehensive language revision; in addition, many sentences are too long to understand. This part needs complete attention.

2. The abstract and introduction should be rewritten, keeping in view point 1 and the historical background.

3. The manuscript relies on too many preprints, so there are serious concerns about the validity of the proposed results.

4. As an example, consider Eq. 1. It is not clear where it comes from: whether it is taken from another source or introduced by the authors. If it is taken from another source, proper references should be cited; if it is introduced by the authors, this should be stated. Handle this issue throughout the manuscript.

5. It is mentioned that Figure 1 is from [5], but it is not given there.

6. Regarding the comparative analysis, the entries attributed to [5] in Table 1 and to [17] in Table 3 are not given in the respective references, so either delete these entries or cite the correct source.

7. The presentation and methodology need improvement.

9. The conclusion should be supported by the presented results.

10. All references should be complete and must be written in a consistent style.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

From a methodological point of view, the article meets the expectations of a scientific article. The authors establish clear categories of analysis and comprehensively explain the procedure used for the machine learning and deep learning models. The methodology is also very clear and well worked out from a practical point of view.

Minor.

Lines 182-186 and lines 193-194 state the contribution of this paper:

Lines 182-186: "Inspired by the concepts of DML [22] and KD [5], we developed a new approach that combines the two methods into a formula to improve the performance of DML. If the concept used by DML is to pair two or more networks in the form of a cohort that aims to conduct training simultaneously by utilizing KL divergence loss to guide another network to increase the posterior entropy of each student"

Lines 193-194: "Our proposed method adopts two KD divergence to improve the network performance."
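
For context, the following is a minimal sketch of the kind of objective the quoted passage describes: a two-network mutual-learning cohort where each student is trained with a supervised cross-entropy loss plus KL divergence guidance from its peer, with an additional KD-style, temperature-softened KL term. The framework (PyTorch), the function and variable names, the temperature, and the weighting below are illustrative assumptions, not taken from the paper.

```python
import torch.nn.functional as F

def dml_kd_loss(logits_a, logits_b, targets, T=3.0, alpha=1.0, beta=1.0):
    """Illustrative (assumed) loss for network A in a two-network cohort:
    supervised CE + mutual-learning KL from peer B + a temperature-softened,
    KD-style KL from the same peer. Weights and temperature are placeholders."""
    # Supervised cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(logits_a, targets)

    # DML-style term: KL(p_B || p_A), with the peer's prediction treated as a
    # fixed target for this network's update.
    kl_mutual = F.kl_div(F.log_softmax(logits_a, dim=1),
                         F.softmax(logits_b.detach(), dim=1),
                         reduction="batchmean")

    # KD-style term: the same KL computed on temperature-softened distributions,
    # scaled by T^2 as in Hinton et al.'s knowledge distillation.
    kl_soft = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                       F.softmax(logits_b.detach() / T, dim=1),
                       reduction="batchmean") * (T * T)

    return ce + alpha * kl_mutual + beta * kl_soft
```

The symmetric loss for network B is obtained by swapping logits_a and logits_b; both networks are updated simultaneously, as in standard DML.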


Minor.

As many existing sources show, KL(P||Q) = CE(P,Q) - H(P), where KL is the KL divergence, CE is the cross-entropy, and H is the entropy. In general, minimizing the cross-entropy and minimizing the KL divergence yield the same result. What we want to do is optimize Q to be close to P, but in the KL divergence H(P) is not a function of Q. That is, since H(P) has no effect on the optimization, it does not matter whether we use the cross-entropy or the KL divergence, and KL(P||Q) can be treated as CE(P,Q) for this purpose. In other words, it is common to use KL as a deep learning loss function.

Therefore, there is a big question about whether pairing two networks and using two KL divergences is a contribution. In other words, if you connect two deep learning networks, you will naturally use two KL divergences.
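
For reference, the identity the reviewer invokes can be written out as follows; because H(P) does not depend on Q, the two objectives differ only by a constant and have the same gradient with respect to Q:

\[
\mathrm{KL}(P\,\|\,Q) \;=\; \sum_i p_i \log\frac{p_i}{q_i}
\;=\; \underbrace{-\sum_i p_i \log q_i}_{\mathrm{CE}(P,Q)} \;-\; \underbrace{\Big(-\sum_i p_i \log p_i\Big)}_{H(P)},
\qquad
\nabla_Q\,\mathrm{KL}(P\,\|\,Q) \;=\; \nabla_Q\,\mathrm{CE}(P,Q).
\]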

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The manuscript has been revised well and carefully, and I am now fully satisfied with the revised version. In my opinion the manuscript is now ready for publication, and I recommend that it be accepted in its current form.
