Article
Peer-Review Record

Self-Training with Entropy-Based Mixup for Low-Resource Chest X-ray Classification

Appl. Sci. 2023, 13(12), 7198; https://doi.org/10.3390/app13127198
by Minkyu Park and Juntae Kim *
Submission received: 19 April 2023 / Revised: 10 June 2023 / Accepted: 10 June 2023 / Published: 16 June 2023
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

Introduction:

The authors assert that the number of radiologists is limited. In many settings that is not the case. Radiologists can still benefit from automated tools, or improve their performance with them.

Entropy-based mixup:

It seems to work, but it is unclear what it means. It is a simple linear combination of two x-rays, which generates an image that is not anatomically correct. The labels are a weighted average of the two label vectors. A mixed image of a case with a hernia and a case with fibrosis, for instance, will result in an image where the hernia is only partially visible, probably mixed with other anatomical structures, and whose reference standard for hernia would be \lambda.
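To make the concern concrete, the blending described above can be sketched as follows. This is a minimal illustration of plain mixup, not the authors' entropy-based variant; the function name `mixup`, the flattened 4-pixel "images", and the label order [hernia, fibrosis] are all assumptions made for the example.

```python
def mixup(image_a, label_a, image_b, label_b, lam):
    """Blend two examples with weight lam in [0, 1] (plain mixup)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Pixel-wise convex combination of the two images.
    mixed_image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    # The same weights are applied to the label vectors.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_image, mixed_label


# Hypothetical toy data: flattened 4-pixel images, labels as [hernia, fibrosis].
hernia_image, hernia_label = [1.0, 0.0, 0.5, 0.25], [1.0, 0.0]
fibrosis_image, fibrosis_label = [0.0, 1.0, 0.5, 0.75], [0.0, 1.0]

mixed_image, mixed_label = mixup(
    hernia_image, hernia_label, fibrosis_image, fibrosis_label, lam=0.75
)
print(mixed_label)  # the hernia target equals lam, the fibrosis target 1 - lam
```

The mixed target for hernia is exactly \lambda, while the mixed image shows neither finding at full contrast, which is the interpretability issue raised here.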

The improvements in Table 1 are minor, except for the classes with few labels (hernia, fibrosis, cardiomegaly). The authors used only 5% of the training data for this test. How many cases of those diseases are in the training data? And in the test data? Figure 5 shows only 27 cases of hernia for approximately 15k cases. 5% of ChestX-ray 15 is approximately 6k cases. The risk here is that the AUCs are heavily biased due to very small test sets. Please specify how many cases the AUCs are based on.
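As a rough illustration of why the number of positive cases matters, the sketch below uses the Hanley-McNeil approximation for the standard error of an AUC. The function name `auc_standard_error` and the AUC value of 0.9 are assumptions for the example; only the counts of 27 positives and roughly 15k cases come from the comment above.

```python
import math


def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) approximation of the standard error of an AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Hypothetical AUC of 0.9, evaluated with few and with many positive cases.
se_few = auc_standard_error(0.9, n_pos=27, n_neg=15000)
se_many = auc_standard_error(0.9, n_pos=270, n_neg=15000)
print(f"27 positives:  SE = {se_few:.3f}")
print(f"270 positives: SE = {se_many:.3f}")
```

Under these assumptions, the uncertainty with 27 positives is of the same order as the improvements being reported, which is why the per-class case counts are needed to interpret the tables.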

Effect of self training:

I am not convinced that "teacher with model mixup" improves over "teacher without model mixup" in Table 2. Sure, there is a 1.8% increase, but it is mainly driven by the hernia and edema classes. Please see the previous comment regarding AUCs of small classes.

I have an issue with the parameter K, which indicates "how many data are selected for each class". For hernia, one would expect at most 200 cases in the unlabeled dataset. What is the point of selecting the 2000 highest-ranked hernia predictions of the teacher? Wouldn't it make more sense to select a number of cases in proportion to the prevalence of the disease?
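The alternative suggested here can be sketched as follows: rank the teacher's scores per class, but let K depend on the class prevalence instead of using one fixed value. This is a hypothetical sketch, not the authors' selection procedure; `top_k_per_class`, `prevalence_scaled_k`, the prevalence values, and the unlabeled-set size of 100,000 are assumptions.

```python
def top_k_per_class(scores, k_per_class):
    """Select, for each class, the indices of the k highest-scoring images.

    scores: one list of class probabilities per image (teacher predictions).
    k_per_class: number of images to select for each class.
    """
    n_classes = len(scores[0])
    selected = {}
    for c in range(n_classes):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i][c], reverse=True)
        selected[c] = ranked[: k_per_class[c]]
    return selected


def prevalence_scaled_k(prevalence, n_unlabeled, cap):
    """Scale K by the expected number of positives, bounded by [1, cap]."""
    return [min(cap, max(1, round(p * n_unlabeled))) for p in prevalence]


# Hypothetical prevalences for [hernia, cardiomegaly] in 100,000 unlabeled images.
k_values = prevalence_scaled_k([0.002, 0.02], n_unlabeled=100000, cap=2000)
print(k_values)

# Toy teacher scores for four images and two classes.
toy_scores = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.6]]
print(top_k_per_class(toy_scores, k_per_class=[1, 2]))
```

With a fixed K of 2000 and about 200 true hernia cases, most selected pseudo-labels for hernia would necessarily be false positives; a prevalence-scaled K avoids that by construction.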

Table 3 is interesting, but the authors have not discussed why high values (P=14) or low values (P=5) decrease performance. Why is there such a sweet spot for this parameter?

Table 4 is really interesting. What would happen with n=4? One would assume that at some point there would be no further improvement, or even a decrease in performance, but that point is not reached in the work presented.

In general, the English used in the paper is of good quality.

Minor English comments:

- "Data" in the abstract should not be capitalized.

- "Mixup" should not be capitalized.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

1. According to the final results, the AUC only increased by slightly over 2%. The effectiveness of this research method is not outstanding, possibly due to entropy not being a good indicator for estimating sample imbalance.

2. Although the method may need further revision, its ideas can serve as a reference for other researchers.

3. I suggest adding the basic principles of Mixup-related algorithms to the methods section. The introduction only mentions that this article uses the Mixup algorithm; the purpose and the steps of the algorithm itself are not described clearly enough.

4. There is no mention of how Mixup was implemented or how entropy was calculated in this study. Was the program self-written, or built with a framework library or packaged software, and what programming language was used?

The content of the article is expressed clearly and the structure is complete. 

Author Response

Please see the attachment.

Author Response File: Author Response.docx
