Article
Peer-Review Record

Classification and Predictions of Lung Diseases from Chest X-rays Using MobileNet V2

Appl. Sci. 2021, 11(6), 2751; https://doi.org/10.3390/app11062751
by Abdelbaki Souid 1, Nizar Sakli 2 and Hedi Sakli 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 4 February 2021 / Revised: 15 March 2021 / Accepted: 16 March 2021 / Published: 18 March 2021
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The paper proposes using a modified MobileNet V2 deep learning model for recognition of lung pathologies from frontal thoracic X-ray images. The manuscript must be revised following the comments and suggestions presented below before it can be considered for acceptance.

Comments:

  1. Abstract: summarize the main results of the paper (including the numerical results) in 1-2 sentences.
  2. Introduction section: improve the presentation of the research context. What are the main challenges in lung disease recognition? What knowledge gap are you trying to bridge?
  3. The novelty is not clear. The innovation over existing methods/models seems small. Explicitly formulate the novelty and contribution of this paper at the end of the first section.
  4. The overview of the state-of-the-art related works is weak. Only a few works are briefly discussed. Present a more comprehensive analysis and discussion of current methods for pulmonary disease identification from chest X-ray images. How about nature-inspired, heuristic and hybrid methods? Recently, hybrid neuro-heuristic and neuro-fuzzy methods have demonstrated excellent results. Discuss these and other works to improve the overview: doi:10.1016/j.patrec.2020.12.010, doi:10.1007/s10044-020-00950-0, doi:10.1109/TII.2020.3022912, doi:10.3390/sym12071146. Summarize the discussed works in a table.
  5. Provide a more detailed explanation of the limitations of the current methods for detecting pneumonia and similar lung diseases, i.e. how do they deal with specific problems such as overfitting, class imbalance, low-quality input images, etc.?
  6. Why have you selected MobileNetV2? Did you consider any alternatives such as SqueezeNet?
  7. How did you select the number of epochs to train? It seems that you stop training too early, as the networks could have been trained longer to improve the accuracy.
  8. How did you address the overfitting problems? What method do you employ to counter overfitting?
  9. Explain the hyper-parameter tuning in more detail. What hyper-parameter search / optimization methods do you use?
  10. Present and discuss the confusion matrix.
  11. Did you cross-validate your results?
  12. The study uses only a single metric (AUC) to evaluate performance. However, relying on a single metric is not recommended. I suggest adding the F1-score as well.
  13. Add a critical discussion section. Discuss the limitations of your method in the Discussion section.
  14. Conclusions section: use main experimental results to support your claims; add more in-depth insights into the possible impact of your research; discuss future work.
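
As an illustration of comment 11, the following is a minimal sketch of how k-fold cross-validation indices can be generated. It is not taken from the manuscript: the function name `k_fold_indices`, the fold count and the seed are hypothetical choices made for the example. If several images belong to the same patient, the folds would normally be built over patients rather than over individual images.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Each sample appears in exactly one test fold.
for train_idx, test_idx in k_fold_indices(10, k=3):
    print(len(train_idx), len(test_idx))
```

Reporting the mean and spread of each metric over the folds gives a more reliable picture than a single train/test split.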

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This is a very interesting article investigating the application of the MobileNet V2 model to classify lung diseases using chest X-rays. However, more referencing is required to substantiate the authors' statements, and extensive English editing is needed. Please ensure all your images are easy for the reader to view (e.g. the resolution of Figure 3 is poor and Figure 4 is too small). This manuscript requires clearer delineation of the content to make it easier for the reader to follow: Introduction (background literature and aims of the research), Methods (including performance measures), Results (Tables, Figures and reporting of performance measures), Discussion (impact of this work, strengths and limitations), and Conclusion. Below are specific comments/questions for the authors:

INTRODUCTION:

Lines 46-47: Please provide references to substantiate your comment “There is a lot of work done using AI technology to detect and diagnose lung disease.”

Lines 47-48: Please provide references to substantiate your comment “As a kind of probabilistic neural network, multilayer perceptron, learning vector quantization, and recurrent neural network (RNN) have been used to diagnose lung disease.”

Lines 61-62: Please expand on your statement “However, the performance of these algorithms has not reached the level of radiologists.” Please provide references and quantify the metrics of performance you are referring to.

Lines 64-65: You introduce the MobileNet V2 architecture in your introduction but do not explain what distinguishes it from other architectures, in what areas it has previously been used (with references), or what its benefits are. There is more detail in the methods section, but the Introduction needs a few sentences explaining these issues to the reader.

Lines 67-70: You indicate the work is divided into “five parts” but then commence at the “second part”. This is confusing to the reader; I am guessing you counted the Introduction as the first part. I suggest removing Lines 67-70 and having this paragraph clearly articulate the aims of your research.

DATA:

Line 78: Please list the “14 binary values” for the reader.

Lines 79-80: The division of your database into training, configuration and testing should be included in your Methods section. Please explain what ‘configuration’ is. ML data are usually split into training, validation and testing sets.

Figure 2: As you correctly point out, your classes are extremely unbalanced. Please explain why you did not include some ML resampling techniques to attempt to handle this imbalance (e.g. SMOTE, RandomSampling)?
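
To make the resampling suggestion concrete, here is a minimal sketch of random oversampling, one of the simplest resampling techniques. It is an illustration only and not the authors' method: the function name `random_oversample` is hypothetical, and it assumes a single class label per image, whereas the manuscript describes 14 binary values per image, so a multi-label dataset would need a different scheme. Resampling should be applied to the training split only, never to validation or test data.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest.

    Hypothetical single-label helper for illustration only.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy data: five images of class "a", one image of class "b".
xs, ys = random_oversample(list(range(6)), ["a"] * 5 + ["b"])
print(Counter(ys))
```

SMOTE differs in that it synthesizes new minority samples by interpolating between neighbours in feature space instead of duplicating existing ones.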

Figure 2: Please label your x-axis and y-axis.

Line 95: You combine “validation/testing” together but these are used for different purposes in ML. Please clarify why you have done this or amend accordingly.
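
The distinction between the three subsets can be sketched as follows. This is a minimal illustration, not the authors' data generator: the function name `split_dataset` is hypothetical, and the 80/10/10 proportions are an assumption, since the manuscript only reports 80% for training and 20% for validation and testing combined.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into train, validation and test lists.

    The fractions are assumed values for illustration.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test share")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                 # used to fit the weights
    val = shuffled[n_train:n_train + n_val]    # used to tune hyper-parameters
    test = shuffled[n_train + n_val:]          # used once, for the final report
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Keeping the test subset untouched until the final evaluation is what allows the reported metrics to be read as an estimate of performance on unseen data.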

RELATED WORK:

This section relates to past work and seems out of place after the section on data. Please review if this section should be included in either the Introduction or the Methodology sections.

Lines 115-117: Please provide more detail relating to “The dataset was trained using the ImageNet competition, and mainly the descriptions DeCAF and PiCoDes (image codes) are extracted using the implementation of the Convolution Neural Network.” Please provide references for the “ImageNet competition”.

Line 118: Please clarify your statement: “Unfortunately, a faithful comparison of approaches remains very difficult” and provide references.

Lines 118-120: Please expand and provide references for your statement: “Most of the reported results was obtained with different setups. These include, among others, the architecture of the neural network used, and the function loss.”

METHODOLOGY

Please provide a table describing the cohort used in this analysis (e.g. age, sex, class); this is usually Table 1.

Lines 123-124: Please explain your statement “classification and prediction of 14 different lung diseases in Chest X-rays using by Keras framework.” This is the first time the Keras framework has been introduced in your article. Please provide a sentence describing this framework and provide the appropriate references.

Figure 3: Please improve the image resolution. Please explain in more detail the following statement attached to this figure: “After the dataset preparation, later in this paper, the model composed from the MobileNet V2 itself with one Global pooling layer and a Fully Connected Layer.”

Line 146: Please correct this sub-heading as it appears to be incorrect (i.e. V2)?

Line 157: Please correct the gender reference “Her role”.

Lines 163-167: Please expand on the benefits of MobileNet V2 compared with other architectures and include the appropriate references.

Table 1: Please define ‘t’ in your table footer.

Lines 182-184: Please expand on and provide a reference to substantiate your comment “MobileNet V2 focuses on optimizing latency, but at the same time, it also enables small networks to operate efficiently and support any size input, which can provide better performance.” Please outline the performance measure(s).

Line 186: Please explain the difference between “MobileNet” and “MobileNet V2”?

Lines 187-188: You state: “We have used a data generator function to divide the entire dataset into three groups, i.e. training (80%), validation and testing (20%).” However, you only show percentages for training and testing, and only report on training and validation (see Figures 6 and 7). This is very confusing for the reader. Please explain in detail exactly what was performed.

Line 193: Please provide the version of Python used (and a reference) and reference the TensorFlow and Keras packages.

Lines 199-200: Please correct the gender reference “Her robustness”.

Lines 205-206: In this paragraph, as part of your methods, please outline the tuning parameters used (e.g. maximum number of epochs) and the metrics used to select the parameters of the final model.

EXPERIMENTAL RESULTS

Line 225: Please confirm “The difference decries to 1.18%.” should read “The difference decreases to 1.18%.”

AUC: You report on the C-Statistic but fail to outline this metric in your methods. Please make sure you define all your performance measures in your Methods section. Please include accuracy, sensitivity/Recall, specificity, F1 score and AUC.
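
For reference, the requested measures can be computed from the confusion-matrix counts of a single disease class as in the sketch below. It is an illustration only: the function names are hypothetical, and the counts and scores in the usage lines are made-up toy values, not results from the manuscript. The AUC (C-statistic) is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, precision and F1."""
    def safe_div(num, den):
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # recall, true-positive rate
    specificity = safe_div(tn, tn + fp)   # true-negative rate
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only.
print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
print(auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With strongly imbalanced classes, accuracy alone can be high even when sensitivity is poor, which is why the measures are best reported together.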

COMPARISON TO OTHER APPROACHES:

This section is interesting and it would be beneficial if the authors clarified this research methodology in their Methods section (i.e. search criteria of their literature review). This section seems to be potentially misleading as you do not document the details of the other comparisons. Your statement

Lines 266-267: Please outline for the reader the performance measures you are using when you state “With a different split of our re-sampling, we found a wide diversity in performance.”

Lines 267-268: Please clarify for the reader your statement “Therefore, as shown in Table 2, direct comparisons with other the groups may fail in the sense of leading results.”

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The article has been revised, but several presentation issues remain.

  • Equations (1)-(5) are well known and could be omitted from the paper.
  • Figure 6 is confusing as the links between the layers of the neural network are not shown. Please redraw.
  • The sensitivity of the proposed method is rather low, which should be discussed as a limitation of the proposed method. Low sensitivity means that many diseased subjects will not be classified correctly.
  • The language should be improved.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Thank you for addressing my comments; the edits and additions have greatly improved your manuscript.

However, there are some minor edits required and further English checking is strongly recommended:

  • Line 67: The English "Table 1. Summarize some of the work...." is incorrect. Please amend.
  • Lines 70-71: This paragraph contains only one sentence and it is hard to understand where it fits. Please expand it or merge it into the paragraph above.
  • Lines 109-110: Unfortunately, I am still not clear on the statements: "Unfortunately, a faithful comparison of approaches remains very difficult. Most of the reported results was obtained with different setups." What do you mean by "faithful comparison of approaches"? Please provide examples of the "different setups".
  • Line 152: "we show ..." Please start your sentences with an uppercase letter: "We show ...".
  • Lines 272 and 321: Please do not refer to any sex in your text.
  • Line 288: Please define YOLOv2
  • Line 290: Please define mIOU
  • Figures 2b and 3b: The image labels are hard to read and seem to be over-written. Please correct.
  • Lines 203-207: Unfortunately, your method for dealing with imbalance is still unclear; it appears to be a form of propensity score weighting, but without the necessary metrics. This is an important stage because your data are imbalanced. Please provide more detail on this process and provide before/after metrics in the Supplementary material.
  • Table 4: Please define "Acc" - did you mean 'Accuracy'?
  • Table 4: Please include your Precision metrics
  • Line 326: Please explain to the reader "stair divide"
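
Regarding the comment on Lines 203-207, one widely used way to weight imbalanced classes is inverse-frequency weighting, sketched below. This is an assumption made for illustration and is not confirmed to be what the authors did: the function name `balanced_class_weights` is hypothetical, and the formula w_c = N / (K * n_c), with N samples, K classes and n_c samples in class c, is the common "balanced" heuristic.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Illustrative heuristic: w_c = N / (K * n_c).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Toy labels: 90 negative and 10 positive cases.
weights = balanced_class_weights([0] * 90 + [1] * 10)
print(weights)
```

Reporting per-class sensitivity and precision with and without such weights would provide the before/after metrics requested above.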

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
