Peer-Review Record

Advancing Early Detection of Breast Cancer: A User-Friendly Convolutional Neural Network Automation System

BioMedInformatics 2024, 4(2), 992-1005; https://doi.org/10.3390/biomedinformatics4020055
by Annie Dequit and Fatema Nafa *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 8 December 2023 / Revised: 6 March 2024 / Accepted: 13 March 2024 / Published: 1 April 2024
(This article belongs to the Special Issue Feature Papers in Clinical Informatics Section)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper presents a deep learning model based on a Convolutional Neural Network (CNN) architecture to predict Invasive Ductal Carcinoma (IDC), a form of breast cancer, from microscopic images of biopsy samples. The following are the specific review comments:

(1) The authors didn’t provide sufficient details on the dataset used, such as the source, size, distribution, and preprocessing of the images. It is unclear how representative and diverse the dataset is, and how it affects the generalization of the model.

(2) The authors didn’t explain the rationale behind the choice of the CNN architecture, the hyperparameters, and the activation functions. It is unclear how these design decisions affect the performance and interpretability of the model.

(3) They didn’t conduct any ablation studies or sensitivity analysis to evaluate the impact of different components and parameters of the model. It is unclear how robust and stable the model is, and how it handles noise, outliers, or variations in the input data.

(4) Some related references can be discussed, such as:

Synchronous Medical Image Augmentation Framework for Deep Learning-based Image Segmentation. Computerized Medical Imaging and Graphics

A hybrid two-stage teaching-learning-based optimization algorithm for feature selection in bioinformatics. IEEE/ACM Transactions on Computational Biology and Bioinformatics

A Configurable Deep Learning Framework for Medical Image Analysis. Neural Computing and Applications

Region-to-boundary deep learning model with multi-scale feature fusion for medical image segmentation. Biomedical Signal Processing and Control

An ultrasound standard plane detection model of fetal head based on multi-task learning and hybrid knowledge graph. Future Generation Computer Systems

(5) The paper does not provide any qualitative results or visualizations of the model’s predictions, such as heat maps, saliency maps, or attention maps. It is unclear how the model identifies and localizes the cancerous regions in the images, and what features it learns from the data.

 

(6) They didn’t discuss the limitations and challenges of the proposed method, such as the computational cost, the scalability, the ethical issues, or the potential risks. It is unclear how the method can be improved or extended for future research.

Comments on the Quality of English Language

N/A

Author Response

Dear Reviewer,

Thank you for your valuable feedback; it greatly enhanced the quality of the paper. Please review the attached file where the responses are highlighted in yellow and have been incorporated into the document.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

In this manuscript by Dequit and Nafa, the authors develop and test a CNN model for the identification of the most common histologic form of breast cancer, the most frequent cancer in women. The authors leverage over 78,000 histologic images derived from 279 patients with breast cancer, and an even larger cohort of images without cancer, for their model development and testing. The literature overview in section 2 is quite comprehensive, the authors aim for open science, and they state that their code is accessible on GitHub. Importantly, the authors also developed a user interface model to facilitate the spread of their research. Here, the use of binary target visualization may indeed accelerate the detection of pathologies. This research is interesting and potentially very relevant, yet it has limitations which need to be addressed.

 

 

Major:

-       It is not clear where the images are derived from. I assume that it is the Wisconsin image database, but this should be explicitly stated. The number of healthy individuals providing images should also be stated. It is also not clear whether the pathological images are all derived from invasive ductal carcinoma or whether other histological types are present. I would also have appreciated more information on the images: their resolution, the way these digital images were made, etc.

-       An ethics section is missing; depending on the data source, this will be relevant.

-       Formalities of scientific writing: The references at the end of the manuscript are not adequately formatted to identify the manuscript of interest (e.g., no. 8).

-       Open science: The authors state that their code is publicly available, yet no access link is provided.

-       I invite the authors to use the discussion section to put their results in the context of existing research. E.g., I would like to see a comparative discussion of the authors' results with the XGBoost model results of Pawar et al., as both models yield similar accuracy. Also, some elements from section 2 can be leveraged here. I would also discuss that the histological images which form the basis for this model have usually been obtained following suspicious mammography screening results or clinical suspicion of cancer, which is a distinct precision challenge from identifying the sparse patients with pathological findings among the many healthy ones in a mammography screening program.

-       Please provide a confusion matrix of your model (ideally with a graphical representation; example code is publicly available, e.g. https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html).
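For illustration, a minimal sketch using the current scikit-learn plotting API (the linked example covers the same idea); `y_true` and `y_pred` are placeholders for the model's test labels and binarized predictions:

```python
# Minimal sketch of the requested confusion-matrix plot (scikit-learn >= 1.0).
# y_true / y_pred are placeholders for the model's test labels and predictions.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

y_true = [0, 0, 1, 1, 0, 1]  # placeholder ground truth (0 = non-IDC, 1 = IDC)
y_pred = [0, 1, 1, 1, 0, 0]  # placeholder model predictions

disp = ConfusionMatrixDisplay.from_predictions(
    y_true, y_pred, display_labels=["Non-IDC", "IDC"]
)
disp.ax_.set_title("Confusion matrix of the IDC classifier")
plt.show()
```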

 

 

Minor:

 

-       My review package did not include access to the user interface model 

-       Formalities of scientific writing: The introduction section is, in my view, too lengthy and could be combined with section 2 and the beginning of section 3. The second part of section 3 should be the “methods” section.

-       Line 70: these are not “fine needle aspirate (FNA) images of the breast” but images of histology samples obtained from breast tissue.

 

-       Lines 131 ff. are comprehensive but use overly informal language.

Author Response

Dear Reviewer,

Thank you for your valuable feedback; it greatly enhanced the quality of the paper. Please review the attached file where the responses are highlighted in yellow and have been incorporated into the document.

 

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Annie Dequit and Fatema Nafa presented a convolutional neural network that classifies Invasive Ductal Carcinoma using biopsy images. The manuscript is well-written, and the logic is straightforward. I have the following suggestions to improve the manuscript before I can recommend publication.

1. In section 2.3.1, the first few sentences describe neural networks in general but miss the most important aspect of CNNs, which is the convolutional layers that help extract local spatial features in a translation-invariant way. Although the mathematical details are explained in the next section, I believe such a high-level description would highlight the advantage of CNNs over fully connected neural networks.

2. At lines 158-160, the depth of a convolution kernel does not need to be the same as that of the image. In fact, at lines 174-176 the authors use a kernel of depth 64 while the image has 3 channels (see the first sketch after this list).

3. The calculation at line 176 is only for the size and not the depth, so I suggest keeping only the number 32 on the right side of the equation.

4. The receptive field is not a function of the depth (D), which is indeed not present in Equation 1; however, it does appear in the descriptions of the notation below Equation 1. I suggest removing it and only introducing notation when necessary. I also suggest not using bold font for the variables, as that would indicate they are vectors or matrices, which they are not.

5. Notations that are defined in Equation 1 do not need to be redefined in Equation 2, unless they have different meanings.

6. In the case of Equation 3, W and b should be in bold, and I suggest renaming “input” and “output” to x and y, both in bold, i.e. writing the layer as $\mathbf{y} = \mathbf{W} \cdot \mathbf{x} + \mathbf{b}$. Replace the star symbol with a dot for the dot product, if possible.

7. At line 214, “These are the basic layers of CNN…” does not belong to the description of the softmax layer, so I suggest moving it to a new line. Line 226 should be merged with line 225. The description of the Tanh activation at line 228 should follow the same format as that of the sigmoid, preferably on a new line.

8. The methods section nicely outlines the mathematical foundations of CNNs. However, I cannot find the architectural details of the CNN model presented in this paper, e.g. the dimensions of the input, the number of convolutional and fully connected layers, the kernel size and depth of each layer, the receptive field, and the total number of parameters.

9. Section 3.2.3 is about the training of the CNN model, which should be clearly stated. Also, in the pseudocode it seems that all the underscores were compiled by LaTeX, turning the characters that follow them into subscripts.

10. At line 249, the authors mentioned that “The ROC curve shows that the model has a high true positive rate and a low false positive rate…” However, I cannot find the plot of the ROC curve anywhere in the manuscript.

11. The quality of the figures is very poor. Please provide versions with higher DPI.

12. In the discussion, the authors mention that the quality of the images will impact the performance of the model. It would be beneficial to perform an analysis in which the quality of the test images is artificially degraded, to see how the classification metrics (accuracy, precision, recall, etc.) are affected (see the second sketch after this list). This would help clinicians determine the acceptable image quality for the presented model.
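Regarding point 2, a minimal demonstration that the output depth of a convolutional layer is set by the number of kernels, not by the input depth; Keras is used here only for illustration, as the authors' actual framework is not specified:

```python
# 64 kernels over a 3-channel input: each kernel spans all 3 input channels,
# and the output depth (64) equals the number of kernels, not the input depth.
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Conv2D(filters=64, kernel_size=3, padding="same")
x = np.random.rand(1, 50, 50, 3).astype("float32")  # one RGB patch, e.g. 50x50
print(layer(x).shape)  # (1, 50, 50, 64)
```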
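Regarding point 12, a minimal sketch of the suggested robustness analysis, degrading test images via JPEG recompression at decreasing quality; `model`, `load_test_set`, the uint8 RGB image format, and any input normalization the model expects are hypothetical stand-ins for the authors' actual pipeline:

```python
# Degrade test images via JPEG recompression and track classification metrics.
import io

import numpy as np
from PIL import Image
from sklearn.metrics import accuracy_score, precision_score, recall_score

def jpeg_degrade(img, quality):
    """Re-encode one uint8 RGB image at the given JPEG quality (1-100)."""
    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))

images, labels = load_test_set()  # hypothetical loader: uint8 arrays + 0/1 labels
for quality in (95, 75, 50, 25, 10):
    degraded = np.stack([jpeg_degrade(im, quality) for im in images])
    # hypothetical classifier; any preprocessing it expects is elided here
    preds = (model.predict(degraded) > 0.5).astype(int).ravel()  # assumes sigmoid output
    print(f"JPEG q={quality:3d}  acc={accuracy_score(labels, preds):.3f}  "
          f"prec={precision_score(labels, preds):.3f}  rec={recall_score(labels, preds):.3f}")
```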

Author Response

Dear Reviewer,

Thank you for your valuable feedback; it greatly enhanced the quality of the paper. Please review the attached file where the responses are highlighted in yellow and have been incorporated into the document.

 

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

Dear Authors,

I read with great interest your article about the use of AI in breast cancer detection.

However, there are some aspects that require your attention.

Figure 2 has very low quality; please insert a higher-quality image so that at least the text in the figure is legible.

For Figure 3, you need to number each column and increase the quality of the image.

For Figure 4, you need to number each column and increase the quality of the image.

For Figure 5, you need to number each column and increase the quality of the image.

Figure 7 seems to show the same image at different magnifications, not different patches; in the caption, you also need to describe the type of staining and the magnification of the microscope.

At the end of the manuscript you need to insert the author contributions, the ethical statements, and the acknowledgements according to the MDPI instructions for authors.

The references need to be formatted according to MDPI style; please also increase the number of references.

Author Response

Dear Reviewer,

Thank you for your valuable feedback; it greatly enhanced the quality of the paper. Please review the attached file where the responses are highlighted in yellow and have been incorporated into the document.

 

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

In this manuscript by Dequit and Nafa, the authors develop and test a CNN model for the identification of the most common histologic form of breast cancer, the most frequent cancer in women. The authors leverage over 78,000 histologic images derived from 279 patients with breast cancer, and an even larger cohort of images without cancer, for their model development and testing. The literature overview in section 2 is quite comprehensive, the authors aim for open science, and they state that their code is accessible on GitHub. Importantly, the authors also developed a user interface model to facilitate the spread of their research. Here, the use of binary target visualization may indeed accelerate the detection of pathologies. This research is interesting and potentially very relevant, yet it has limitations which need to be addressed.

 

 

Major:

-       Unfortunately, the answers provided by the authors in their point-by-point response to my main concerns are not satisfactory. The authors refer to Cruz-Roa et al. 2014, who included pathology slides of 162 patients, while Dequit and Nafa used histologic images derived from 279 patients.

Major:

1. Please do not only refer to the external paper but state clearly in your paper and in the point-by-point response, which source the images are derived from. If it's a public database, additionally refer directly to it and also provide the link in the methods.

2. Also, the number of healthy individuals providing images should be clearly stated. 

3. It is also not clear whether the pathological images are all derived from invasive ductal carcinoma or whether other histological types are present.

4. Also, I would have appreciated more information on the images: their resolution, the way these digital images were made, etc.

5. An ethics section is missing; depending on the data source, this will still be relevant.

6. The results from the new images should not appear in the discussion but in the results section. Place the limitations before the conclusion.

7. My comment has not been addressed in the discussion, although the authors claimed so. I invite the authors to use the discussion section to put their results in the context of existing research. E.g., I would like to see a comparative discussion of the authors' results with the XGBoost model results of Pawar et al., as both models yield similar accuracy. Also, some elements from section 2 can be leveraged here. I would also discuss that the histological images which form the basis for this model have usually been obtained following suspicious mammography screening results or clinical suspicion of cancer, which is a distinct precision challenge from identifying the sparse patients with pathological findings among the many healthy ones in a mammography screening program.

 

-       Formalities of scientific writing and open-science issues have been correctly resolved, as the code is provided on GitHub and the confusion matrix informs on the precision and recall.

Author Response

In submitting the updated version of our paper, titled "Advancing Early Detection of Breast Cancer: A User-Friendly CNN Automation System," we extend our deepest thanks to the reviewers for their thorough assessments and feedback. The collaborative nature of this work is fundamental to our mission of serving the community by enhancing early breast cancer detection methodologies. We welcome and are more than happy to receive further input from our reviewers, as we believe that continued collaboration and dialogue are key to refining our research and achieving our shared goal of improving public health outcomes. Your insights and expertise are invaluable to us in this endeavor, and we look forward to any additional suggestions or comments you may have.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

I am satisfied with the revised manuscript.

Author Response

Thank you very much for taking the time to review the revised manuscript and for your positive feedback. I am delighted to hear that the revisions have met your expectations. If there are any further steps or information needed from my side, please do not hesitate to let me know. I look forward to any future opportunities to collaborate or receive your valuable insights again.

Reviewer 4 Report

Comments and Suggestions for Authors

I had very few observations, but you did not follow them.

Author Response

We extend our deepest thanks to the reviewers for their thorough assessments and feedback. The collaborative nature of this work is fundamental to our mission of serving the community by enhancing early breast cancer detection methodologies. We welcome and are more than happy to receive further input from our reviewers, as we believe that continued collaboration and dialogue are key to refining our research and achieving our shared goal of improving public health outcomes. Your insights and expertise are invaluable to us in this endeavor, and we look forward to any additional suggestions or comments you may have.

Author Response File: Author Response.docx

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript has been substantially improved, which I welcome.

Minor issues:

-       The number of patients is not yet clear: on page 8, line 282, the table states 279 patients, while the point-by-point response claims that the cohort includes 162 patients, which I think is correct.

-       The results (Figure 8 + confusion matrix) are still ill-placed after the discussion section on page 13.

-       The discussion is much better and has improved the manuscript. I very much like how you discussed this in your point-by-point response; it made things even clearer. Optionally, consider using some of these sentences in the main text. In your discussion, the references of your comparators should be cited whenever you refer to them.

-       I would put the conclusions after the discussion.

Author Response

Thank you for your valuable feedback on our manuscript. We have carefully considered your comments and suggestions and are pleased to inform you that we have made the necessary revisions to address the concerns raised. Below is a summary of the key changes implemented in our revised manuscript:

Clarification of Patient Numbers: We confirmed that the correct number is indeed 162 patients. The table on page 8, line 282, has been amended to accurately reflect this number, ensuring consistency throughout the document. It was a really nice catch.

Repositioning of Results Section: In alignment with your suggestion, we have moved Figure 8 and the confusion matrix to a separate section, 4.2.2 Model Performance. This adjustment follows the conventional structure of academic papers.

Enhancement of the Discussion Section: We have incorporated relevant sentences from our point-by-point response into the Discussion section of the main text. This enrichment has undoubtedly improved the clarity and depth of our analysis. 

Adjustment of Conclusion Placement: As recommended, the Conclusions section has been relocated to immediately follow the Discussion.

We believe these revisions have significantly strengthened our manuscript, making the arguments clearer and the overall paper more coherent.

We are grateful for your insightful feedback, which has been instrumental in enhancing the quality of our work.

We hope that the revised manuscript now meets your expectations and the high standards of BioMedInformatics.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

Dear Authors,

You managed to follow my instructions.

Hope you will solve the suggestions from the other reviewers.

Author Response

I am grateful for your thoughtful and comprehensive review of our work. Your feedback has provided invaluable insights and opened our eyes to new perspectives. We deeply appreciate the time and effort you dedicated to reviewing our work.
