Automated Classification of the Tympanic Membrane Using a Convolutional Neural Network
Round 1
Reviewer 1 Report
1. Could you elaborate on how the model identifies the perforation?
2. Could you explain more about the inclusion and exclusion criteria, since the clinical pathology is very complicated in COM cases?
Author Response
Thank you for your review and invaluable comments.
1. We do not know exactly how the model identifies the perforation. As the class activation map shows, the perforation margin and the middle ear may be the cues used for detection.
2. As described in the Methods, TM photographs taken in the clinic were collected. Among them, postoperative, retracted, and attic-destruction cases, as well as inadequate images, were excluded. Such problematic TM cases will be addressed in a follow-up study.
Reviewer 2 Report
The paper entitled “Automated classification of the tympanic membrane using a convolutional neural network” provides an interesting example of automated image recognition conducted by a convolutional neural network. The research presented in the paper is sound both in methodology and in concept. I have no doubt that it could be published once the authors address a few minor questions provided below.
1. The statistics of the network should provide not only the percentage of correct predictions but also true positive, true negative, false positive, and false negative values. This is due to the unequal costs of misclassification in such a screening model (i.e., a false positive classification of a perforated TM would have severe consequences both for the patient and for the medical practitioner using the CNN tool). It is also customary to select the best decision threshold for the output neuron based on the ROC curve. If possible, I would suggest presenting how the true positive rate vs. false positive rate changes with respect to different threshold values in the output neuron, and whether the selected value is optimal (or how different costs of diagnosis can be achieved).
2. The purpose of predicting the side from which the image was collected completely eluded me. Surely an otolaryngologist, or even a pediatrician, who collects the image for analysis would be able to tell whether the ear was right or left? Such a prediction makes no sense (beyond demonstrating that the CNN can indeed do it). However, I understand that, due to the common right-handedness of the doctors collecting the pictures, images from the left and right ears differ. I would suggest developing two processing networks: one for classifying left-ear images and one for classifying right-ear images. Such an approach could facilitate perforation recognition.
3. In image recognition, it is possible to train a neural network to recognize features irrespective of their position in the image. To achieve this, one can provide data with rotated versions of the original images, thus expanding the dataset. Such an approach was used with a DNN in the paper:
Cireşan D.C., Giusti A., Gambardella L.M., Schmidhuber J. (2013) Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks. In: Mori K., Sakuma I., Sato Y., Barillot C., Navab N. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013. MICCAI 2013. Lecture Notes in Computer Science, vol 8150. Springer, Berlin, Heidelberg
4. The perforations presented in the figures are very easy to recognize. The authors did not provide information on whether their models could also classify cases that would indeed be problematic for an inexperienced practitioner (small linear perforations, perforations in a retracted tympanic membrane, perforations filled with cholesteatoma or discharge). The latter are more common in chronic otitis media than the clear, well-defined perforations presented in the figures. From the clinical point of view, it would be valuable to differentiate between a perforation and a retraction pocket, or between a true perforation and a perforation covered by a pseudomembrane. In my opinion, differentiating between a normal membrane and a membrane with a well-defined perforation is only a first step; further research would be necessary to create models that could analyze the whole spectrum of otoscopic pictures in chronic otitis media in clinical practice.
5. The authors use the term “corn of light”. They probably mean “cone of light”?
Author Response
Thank you for your review and invaluable comments.
1. As you pointed out, we have added further performance results. In this study, we tried various combinations of optimizers and hyperparameters for training. Our final hyperparameters are as follows: batch size 32, 400 epochs, SGD optimizer, learning rate 0.0001, momentum 0.9 with Nesterov momentum.
However, this time we did not study how these values (true positive rate, false negative rate, or cut-off value) change according to the parameters. We are trying to increase the dataset and will show these changes in the next study.
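For illustration only, the threshold sweep the reviewer describes can be sketched as follows. The scores and labels below are made up for the example and are not outputs of the study's model:

```python
import numpy as np

# Hypothetical predicted probabilities for "perforation" and true labels
# (1 = perforation, 0 = normal); illustrative values, not our data.
scores = np.array([0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10])
labels = np.array([1,    1,    0,    1,    0,    1,    0,    0])

for thr in (0.25, 0.50, 0.75):
    pred = scores >= thr                 # classify at this cut-off
    tp = np.sum(pred & (labels == 1))    # true positives
    fp = np.sum(pred & (labels == 0))    # false positives
    fn = np.sum(~pred & (labels == 1))   # false negatives
    tn = np.sum(~pred & (labels == 0))   # true negatives
    tpr = tp / (tp + fn)                 # sensitivity
    fpr = fp / (fp + tn)                 # 1 - specificity
    print(f"thr={thr:.2f}  TP={tp} FP={fp} FN={fn} TN={tn}  TPR={tpr:.2f} FPR={fpr:.2f}")
```

Plotting TPR against FPR over many such thresholds traces the ROC curve, from which a cut-off matching the clinical costs of each error type can be chosen.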
2. We agree with your opinion. The physician can recognize the side of the tympanic membrane at the time the image is captured. However, the study was conducted to find out whether the algorithm can identify the side of the tympanic membrane even in otitis media with deformation, rather than to recognize the side of a normal tympanic membrane.
3. We agree with you. Mirroring, rotations, and even generative adversarial networks (GANs) can expand the dataset, and the detection accuracy of this study could be further improved by these techniques. We have added this point to the Discussion.
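As a concrete sketch of this kind of augmentation (illustrative only; the array sizes and the use of scipy are our assumptions, not the study's code), mirrored and slightly rotated variants of an image can be generated like so:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # stand-in for one endoscopic TM photograph

augmented = [
    image[:, ::-1, :],                                   # horizontal flip (mirroring)
    rotate(image, angle=5, axes=(0, 1), reshape=False),  # small clockwise rotation
    rotate(image, angle=-5, axes=(0, 1), reshape=False), # small counter-rotation
]

# Each variant keeps the original shape, so it can feed the same CNN input layer.
assert all(a.shape == image.shape for a in augmented)
```

Each augmented variant is a plausible view of the same eardrum, so the network learns position- and orientation-invariant features from an effectively larger dataset.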
4. We agree very much with your opinion. This study demonstrates the effectiveness of a machine-learning algorithm in detecting perforation of the tympanic membrane. Further research is necessary to create a better model that can recognize different kinds of otologic diseases. We have added more description to the Discussion.
5. This was our mistake. We will check and correct the spelling and English throughout. Thank you for your correction.
Reviewer 3 Report
# Good work.
# The problem is well addressed.
# The paper is relatively well written and easy to follow.
# Very good figures and presentation (format).
# A significance test could be added to the experiments.
# For the CNN implementation, we understand that the parameters used in a CNN model can vary from one problem to another; CNN parameters are application dependent. The authors may wish to convey exactly this message to avoid confusion. For example, in a document analysis problem, different parameters are used (Ukil, S., Ghosh, S., Obaidullah, S.M. et al. Neural Comput & Applic (2019). https://doi.org/10.1007/s00521-019-04111-1).
# Also, finding abnormalities based on symmetric features can help in understanding the problem. I suggest the authors mention this concept (Automated Chest X-Ray Screening: Can Lung Region Symmetry Help Detect Pulmonary Abnormalities? IEEE Trans. Med. Imaging 37(5): 1168-1177 (2018)).
Author Response
Thank you for your review and invaluable comments.
1. Regarding the detailed hyperparameters used in this study, we tried many combinations of hyperparameters to optimize the CNN training. However, this time we did not study how the results change according to the hyperparameters. The final hyperparameters are as follows: batch size 32, 400 epochs, SGD optimizer, learning rate 0.0001, momentum 0.9 with Nesterov momentum. The training dataset was augmented with a shear range of 0.2, a rotation range of 5, and horizontal flips for the normal vs. perforation training. According to your comments, we have added more description to the Methods.
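For readers less familiar with these optimizer settings, the SGD update with Nesterov momentum that they configure can be sketched on a toy one-parameter problem. This is a didactic example, not the study's training code, and the quadratic loss is our assumption for illustration:

```python
def grad(w):
    # Gradient of a toy loss f(w) = (w - 3)^2, whose minimum is at w = 3
    return 2.0 * (w - 3.0)

lr, momentum = 0.0001, 0.9  # the learning rate and momentum reported above
w, v = 0.0, 0.0             # parameter and velocity, both starting at zero
for _ in range(400):        # same count as the reported number of epochs
    # Nesterov momentum: evaluate the gradient at the look-ahead point
    # w + momentum * v, then take a momentum-damped step
    v = momentum * v - lr * grad(w + momentum * v)
    w = w + v

# w has moved from 0 toward the minimum at 3
```

The look-ahead gradient is what distinguishes Nesterov momentum from classical momentum, which evaluates the gradient at the current point `w` instead.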
2. We agree with you that symmetry can be a useful clue for detecting an abnormality in certain circumstances. However, symmetry could not be exploited in this study because the shape of the tympanic membrane in our images is not symmetric (owing to the varying angles at which photographs are taken with an endoscope), and explicit extraction of individual features is not necessary in deep learning with a CNN. According to your comments, we have added more description to the Discussion.
Round 2
Reviewer 2 Report
The corrections are ok.