Systematic Review
Peer-Review Record

Ultrasound-Based Deep Learning Models Performance versus Expert Subjective Assessment for Discriminating Adnexal Masses: A Head-to-Head Systematic Review and Meta-Analysis

Appl. Sci. 2024, 14(7), 2998; https://doi.org/10.3390/app14072998
by Mariana Lourenço 1,†, Teresa Arrufat 2,†, Elena Satorres 3,†, Sara Maderuelo 3,†, Blanca Novillo-Del Álamo 3, Stefano Guerriero 4, Rodrigo Orozco 5 and Juan Luis Alcázar 5,6,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 5 March 2024 / Revised: 25 March 2024 / Accepted: 29 March 2024 / Published: 3 April 2024
(This article belongs to the Special Issue Computational Approaches for Cancer Research)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In the present publication the authors present a meta-analysis regarding deep learning assisted classification of adnexal masses. Congratulations on your work; I honestly read it with great interest!

Regarding the methodology, the study is very well conducted!

However, several points should be addressed before the paper is considered for publication:

- Deep machine learning is a bit odd. I am aware that, on rare occasions, authors refer to deep learning as deep machine learning, but I would suggest changing this throughout the manuscript to simply deep learning. Also, in Table 1 it should state "Deep Learning Architecture" and not "Program".

- Ultrasound is referred to as the gold standard in the Introduction. However, even the Methods list histological analysis as the reference standard and, as in most diagnostic evaluations of tumors, biopsy is the non plus ultra. I believe you want to refer to transvaginal ultrasound as the primary diagnostic tool or reference standard imaging tool. That should be clarified throughout the manuscript.

- Please rephrase lines 381-384 to a style more suited for a scientific publication.

- The discussion would benefit from a section dedicated to the differences in performance between the "conventional" machine learning methods listed in the paragraph at lines 299-305 and deep learning architectures, because the main question is whether computationally resource-intensive DL architectures are needed when simpler regression models could achieve the same.

- The results of the subsection Quantitative synthesis should be explained a bit more in depth in the Discussion section for readers inexperienced with the methodology.

Comments on the Quality of English Language

Line 46 "repsriductive age"

Line 124 "IA", AI?

Table 1 "Bening" and Country "Suecia"

Author Response

Dear Reviewer 

Thanks for all your constructive comments, which clearly contributed to improving our manuscript.

The following amendments have been made in the manuscript

  1. Comment: In the present publication the authors present a meta-analysis regarding deep learning assisted classification of adnexal masses. Congratulations on your work; I honestly read it with great interest!
    1. Answer: Thanks for this comment.

 

  1. Regarding the methodology, the study is very well conducted!
    1. Answer: Thanks for this comment

 

  1. Deep machine learning is a bit odd. I am aware that seldomly authors refer to deep learning as deep machine learning but I would suggest to change this throughout the manuscript to simply deep learning. Also, in table 1 it should state "Deep Learning Architecture" and not "Program".
    1. Answer: Modified as suggested.

 

  1. Ultrasound is referred to as the gold standard in the Introduction. However, even the Methods list histological analysis as the reference standard and as in most diagnostic evaluation of tumor biopsy is the non-plus-ultra. I believe you want to refer to transvaginal ultrasound as the primary diagnostic tool or reference standard imaging tool. That should be clarified throughout the manuscript.
    1. Answer: Thanks for this comment. Reviewer is right. We have modified throughout the text.

 

  1. Please rephrase lines 381-384 to a style more suited for a scientific publication.
    1. Answer: We agree. Lines rephrased

 

  1. The discussion would benefit from a section dedicated to the differences in performance of "conventional" machine learning methods listed in paragraph 299-305 and deep learning structures. Because the main questions is whether computationally resource intensive DL architectures are needed when more simple regression models could achieve the same.
    1. Answer: Thanks for this comment. A paragraph added in the Discussion.

 

  1. The results of the subsection Quantitative synthesis should be explained a bit more in-depth in the Discussion section for readers inexperienced with the methodology.
    1. Answer: Thanks for this comment. A paragraph added in the Discussion

 

  1. Line 46 "repsriductive age"
    1. Answer: Sorry for this mistake. Amended

 

  1. Line 124 "IA", AI?
    1. Answer: Sorry for this mistake. Amended.

 

  1. Table 1 "Bening" and Country "Suecia"
    1. Answer: Sorry for these mistakes. Amended

 

Reviewer 2 Report

Comments and Suggestions for Authors

Ultrasound based Deep Machine Learning models performance versus Expert Subjective Assessment for discriminating adnexal masses. A head-to-head systematic review and meta-analysis.

Summary:

In this review, the authors perform a meta-analysis on the use of deep machine learning as a diagnostic tool for adnexal masses in comparison with expert evaluation. The authors screened the literature for studies fitting their selection criteria, in brief, studies evaluating the diagnosis of adnexal masses with a deep machine learning model versus expert evaluation. Out of thousands of results, only four studies matched the authors' criteria. When comparing the best deep machine learning model from each publication with expert evaluation, the authors did not observe significant differences between the two methods in differentiating malignant from benign adnexal masses.

 

General comments:

The introduction could benefit from a clarification. Emphasis on the aim of this meta-analysis and where this systematic review found its ground should be considered. As mentioned in the discussion other reviews already looked at AI diagnosis therefore providing more context would help the reader understand the scope of the review. The methodological reasoning is clearly disclosed and well described in the manuscript.

Results could be improved and clarified. Figure 1 has a low resolution, the font size of the text and the use of bullet points do not seem to be standardised, and the text is not always aligned.

Table 1 might benefit from a lower font size to improve its readability; the “age” column should specify the measure used, probably the median. The column “time until surgery” should specify the unit of time. There is a typo in the column header benign, spelled “bening”. In the second line, from the Christiansen study, the country should read “Sweden” not “suecia”.

Paragraphs from line 213 to 234 would greatly benefit from the addition of a figure encompassing all the numbers quoted, to allow better visualisation of the number of cases/images/training data sets. All these paragraphs need in-text interpretation.

The paragraph at lines 260 to 262 is too vague; it needs to be more specific, with an expanded explanation, and the cited numbers need to refer to a figure number/panel. The use of the word “heterogeneity” is too broad. The prevalence of malignant adnexal masses per study could be included in Table 1 and referred to in this section. The reasons for the discrepancies in prevalence between the selected studies could be further explored in the discussion.

Overall, the results section lacks interpretation and is currently only descriptive. Figure 4 A and B is not discussed at all. Figures throughout the manuscript are of subpar quality and definitely need to be improved.

The discussion lacks substance. The first section (4.2 Interpretation of findings in clinical context), from line 291 to 331, reads as an introduction and not a discussion. Merging the entire section with the actual introduction of the manuscript should be considered. The paragraph at lines 331-337 is the ground for the discussion and should be expanded. The introduction leading to the discussion should be shortened.

The strengths and limitations section brings up some interesting points of discussion. Although only mentioned, a crucial part of these studies is the model used to perform the analysis. A succinct explanation of the differences between the DML models' methodologies used in these four studies would help the reader understand the challenges of the field.

Lastly, of all the references used in this manuscript, 9 out of the 72 are from the author. Having more than 12% self-citations might be considered a high number. Could the authors justify the absolute necessity of each of their own citations in the manuscript?

Overall, this manuscript needs improvement before being published; reconsideration after major revision is recommended.

 

Minor comments:

Line 46: “repsriductive” should be changed to reproductive

Line 63: “trilal” should be changed to trial

Line 71: a space is missing between masses and for

Line 95: reference is not correct, it should be changed to ref 45

Line 124: could the author clarify whether AI is misspelled as IA? Otherwise this new abbreviation would need to be spelled out.

Line 149: reference is not correct, it should be changed to ref 46

Line 171: reference is not correct, it should be changed to ref 47

Line 334: “drowned” should be changed to drawn

Line 381-382: This sentence is not grammatically correct

Comments on the Quality of English Language

The overall English used in the manuscript could be improved to facilitate readability. Some sentences/paragraphs might benefit from editing to improve the flow, and typos are present throughout the manuscript. See comments to authors.

Author Response

Dear Reviewer

Thanks for your constructive comments.

The following amendments have been made.

  1. The introduction could benefit from a clarification. Emphasis on the aim of this meta-analysis and where this systematic review found its ground should be considered. As mentioned in the discussion other reviews already looked at AI diagnosis therefore providing more context would help the reader understand the scope of the review. The methodological reasoning is clearly disclosed and well described in the manuscript.
    1. Answer: Thanks for this comment. A sentence has been added in the Introduction.

 

  1. Results could be improved and clarified. Figure 1 has a low resolution, font size of the text as well as use of bullet point seems to not be standardized, and text is not always aligned.
    1. Answer: Thanks for this comment. This figure has been modified according to suggestion

 

  1. Table 1 might benefit from a lower font size to improve its readability, the “age” column should specify the methods use, probably median. The column “time until surgery” should specify the unit of time. There is a typo for the column benign spelled “bening”. In the second line from the Christiansen study the country should read “Sweden” not “suecia”.
    1. Answer: Thanks for these comments. Table amended. However, the font size is kept at the smallest that remains readable.

 

  1. Paragraphs from line 213 to 234 would greatly benefit from the addition of a figure encompassing all number quoted, to allow a better visualization of the number of case/images/training data set. All these paragraphs need in text interpretation.
    1. Answer: We added a table incorporating these data. It is somewhat complex, as the data themselves are.

 

  1. Paragraph line 260 to 262 is too vague it needs to be more specific with an expanded explanation and cited number needs to be refereeing to a figure number/panel. The use of word “heterogeneity” is too broad. Prevalence of malignant adnexal masses per study could be included in table 1 and referred in this section. Reason of the discrepancies in prevalence between the selected studies could be further explore in the discussion.
    1. Answer: “Heterogeneity” is the term used in statistics, meaning the presence of variation in the true effect sizes underlying the different studies. We added this definition to the text. We already cite Figures 2A and 2B, where this heterogeneity (I²) and the corresponding p values are displayed. The prevalence of malignancy can be easily calculated from the data already reported in Table 1. The reasons for the different prevalences are discussed in the Discussion.
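
For readers unfamiliar with the statistic, the I² mentioned in this answer is commonly computed from Cochran's Q as sketched below. The notation (k studies, effect estimates with within-study variances) is the standard textbook form and is an assumption here; the record does not state which estimator the manuscript used.

```latex
% Standard notation, not taken from the manuscript:
% k studies, effect estimates \hat{\theta}_i, within-study variances v_i
w_i = \frac{1}{v_i}, \qquad
\bar{\theta} = \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}, \qquad
Q = \sum_{i=1}^{k} w_i \left(\hat{\theta}_i - \bar{\theta}\right)^2

% Share of the total variation attributable to between-study differences
I^2 = \max\!\left(0,\; \frac{Q - (k - 1)}{Q}\right) \times 100\%
```

The p value usually reported alongside I² comes from comparing Q with a chi-squared distribution on k − 1 degrees of freedom.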

 

  1. Overall, the results section lack of interpretation and is currently only descriptive Figure 4 A and B is not discussed at all. Figures throughout the manuscript are of a subpar quality and definitely needs to be improved.
    1. Answer: Thanks for this comment. We added a paragraph for explaining these figures

 

  1. The discussion lacks in substance. The first section (4.2 Interpretation of findings in clinical context) from line 291 to 331, echoes as an introduction and not discussion. The entire section should be considered to be merge with the actual introduction of the manuscript. The paragraphs line 331-337 is the ground for the discussion and should be expanded. The introduction leading to the discussion should be shorten.
    1. Answer: Thanks for this comment. We have structured the Discussion according to the PRISMA checklist, which includes the following items:
      1. Provide a summary of findings (Subheading 4.1).
      2. Provide a general interpretation of the results in the context of other evidence (Subheading 4.2).
      3. Discuss any limitations of the evidence included in the review (Subheading 4.3).
      4. Discuss any limitations of the review processes used (Subheading 4.3).
      5. Discuss implications of the results for practice, policy, and future research (stated at the end of the Discussion).

 

  1. The strengths and limitations section bring some interesting point of discussion. While only mentioned a crucial part of these studies is the model use to perform the analysis. A succinct explanation of the differences of each DML models’ methodologies used in these four studies would help the reader understand the challenges of the field.
    1. Answer: Thanks for this comment. We added this point in the Discussion. Three new references have been also added.

 

  1. Lastly, within all the references used in this manuscript, 9 out of the 72 are from the author. Having more than 12% of self-citation might be considered a high number. Could the authors justify the absolute necessity of using each of their own citation in the manuscript.
    1. Answer: We are not aware that a self-citation rate of > 12% is excessive, nor where this threshold comes from. The self-citation rate in this study is 12% after the new references were added. In any case, we consider all citations justified, as they are directly related to the topic under research.
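
The two percentages in this exchange can be reconciled with a short calculation. The sketch below uses the counts quoted by Reviewer 2 (9 of 72) and assumes, hypothetically, that the revision added exactly the three references mentioned in the previous answer and that none of them is a self-citation; the record does not give the final reference count. The function name `self_citation_rate` is illustrative only.

```python
def self_citation_rate(own_refs: int, total_refs: int) -> float:
    """Return self-citations as a percentage of all references."""
    if total_refs <= 0:
        raise ValueError("total_refs must be positive")
    return 100 * own_refs / total_refs


# Counts quoted by Reviewer 2: 9 self-citations among 72 references.
before = self_citation_rate(9, 72)

# Assumption (not confirmed by the record): exactly three references were
# added in revision and none of them is a self-citation.
after = self_citation_rate(9, 72 + 3)

print(f"before revision: {before:.1f}%")
print(f"after revision:  {after:.1f}%")
```

Under these assumptions, the rate is 12.5% before revision, consistent with the reviewer's "more than 12%", and 12.0% after, consistent with the authors' figure.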

 

  1. Line 46: “repsriductive” should be changed to reproductive
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 63: “trilal” should be changed to trial
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 71: a space is missing between masses and for
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 95: reference is not correct, it should be change to ref 45
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 124: could the author clarify if AI is misspelled IA, otherwise this new abbreviation would need to be spell out.
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 149: reference is not correct, it should be change to ref 46
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 171: reference is not correct, it should be change to ref 47
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 334: “drowned” should be changed to drawn
    1. Answer: Sorry for this mistake. Corrected

 

  1. Line 381-382: This sentence is not grammatically correct
    1. Answer: Sorry for this mistake. Corrected

 

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript benefited a lot from the changes. Especially Table 2 is a great addition.

Please change the abbreviation DML to DL before publication.

 

Author Response

Comment:

Please change the abbreviation DML to DL before publication

Answer: 

Done as suggested

Reviewer 2 Report

Comments and Suggestions for Authors

The authors worked on the manuscript to address the comments raised. The manuscript is now improved; however, figure quality is still subpar and would probably benefit from improvement to facilitate readability.

Acceptance with minor revision is recommended.

Comments on the Quality of English Language

The overall comprehensibility of the manuscript could be improved. Some sentences/paragraphs might benefit from editing to improve the flow throughout the manuscript.

Author Response

Thanks for your comments.

We have tried to improve the English language.

 
