Article
Peer-Review Record

Artificial Intelligence for Image-Based Breast Cancer Risk Prediction Using Attention

Tomography 2023, 9(6), 2103-2115; https://doi.org/10.3390/tomography9060165
by Stepan Romanov 1,*, Sacha Howell 2,3,4, Elaine Harkness 1, Megan Bydder 4, D. Gareth Evans 4,5, Steven Squires 6, Martin Fergie 1,† and Sue Astley 1,*,†
Reviewer 1: Anonymous
Reviewer 3: Anonymous
Submission received: 31 October 2023 / Revised: 17 November 2023 / Accepted: 21 November 2023 / Published: 24 November 2023
(This article belongs to the Special Issue Artificial Intelligence in Breast Cancer Screening)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,
Thank you for submitting your manuscript to Tomography.
It concerns the detection of breast cancer using artificial intelligence applied to mammography images, in combination with other clinical information.
I consider the topic to be extremely relevant, as breast cancer is the most common tumor in women and affects many women worldwide, so early detection is essential. In addition, the current screening recommendations are standardized rather than personalized, even though more intensive, personalized screening would greatly benefit some patients; this makes the work highly important.

Title: good
Introduction: Good introduction. In lines 18/19 the references are missing. Please define the term "short-term risk" (e.g., line 40).
Methods: Please state how many patients were included in this study. You mention the number of mammograms but not the number of patients.
In line 92, 'over a period scanning several years': please define 'several years'.
Results: good.
Discussion: Please summarize your results in a little more detail. The benefit of your result is that more women with breast cancer might be detected earlier, but the "negative" side is that more diagnostic testing sometimes leads to unnecessary interventions and burdens the women.
Limitations of your study are missing.

Good luck!

Author Response

Comment (Introduction)

- The reference location has been amended to properly cite the full statement (line 19).
- We define short-term risk as the period of around 5 to 55 months, which spans from the first prior mammogram to the formation of cancer. A more specific definition of short-term risk and the rationale for this choice have been added to the introduction (line 41).

Comment (Methods: number of patients)

The number of women in the study is stated in the data section; however, the sentence was phrased poorly and made it appear that we were referring to images. It has been amended to state the number of women in the study, as opposed to the number of images, and to clear up any confusion (line 82).

Comment (Methods: missing definition)

The period is from 2009 to 2015; this has been added to the manuscript (line 94).

Comment (Discussion)

We have significantly expanded the discussion section and added a conclusion.

Comment (Limitations missing)

A limitations section has now been added as its own section (starting at line 252).

 

Reviewer 2 Report

Comments and Suggestions for Authors
  1. The study emphasizes the importance of detailed image-based features for breast cancer risk prediction. However, the potential influence of interobserver variability among radiologists in the dataset, particularly when incorporating prior mammograms, is not explicitly addressed. A thorough analysis of the steps taken to mitigate or account for this variability would provide valuable insights into the robustness of the proposed model. Considering the complex nature of mammographic images and the potential variability in interpretation, how did the study address or control for interobserver variability among radiologists in the dataset, especially in the cases involving prior mammograms?
  2. While the article introduces the concept of mixing screen-detected cancers with priors for model development, it does not delve into the potential challenges or advantages of this approach from a radiologist's perspective. Understanding how the model copes with distinguishing features related to risk and those indicative of cancer formation, especially in the context of mixed cases, would provide insights into the clinical relevance and specificity of the proposed model. The article mentions the incorporation of screen-detected cancers mixed with priors during model development. Could you elaborate on how the inclusion of these different types of cases influences the model's ability to differentiate features associated with risk and features specific to cancer formation in the context of radiological interpretation?
  3. The article underscores the significance of short-term risk prediction, but it does not explicitly address the potential variation in the model's performance across different breast cancer subtypes. An exploration of the model's effectiveness in predicting short-term risk within distinct molecular subtypes would enhance our understanding of its clinical applicability and generalizability. The study primarily focuses on short-term risk prediction. How does the model's performance vary when applied to different subtypes of breast cancer, considering the heterogeneity within the disease? Is there evidence of the model's efficacy in predicting short-term risk across diverse molecular subtypes?
  4. The article mentions the use of attention-based Multiple Instance Learning (MIL) for accurate risk predictions. While the study introduces MIL for risk prediction, it does not explicitly discuss the model's adaptability to evolving insights in breast cancer biology. Exploring how the proposed model accommodates emerging biomarkers and molecular profiling advancements would shed light on its potential to integrate with contemporary oncological research and clinical practices. However, how does the model account for the evolving landscape of breast cancer biology, especially considering advancements in molecular profiling and the identification of novel biomarkers? Is there evidence supporting the model's adaptability to emerging biological insights?
  5. The article uses a pre-trained ResNet-18 feature extractor for image processing. Still, the article lacks a detailed discussion on the rationale behind selecting this architecture and its suitability for processing mammographic images. A more comprehensive explanation of why ResNet-18 was chosen and how its features contribute to accurate risk prediction would provide a deeper understanding of the model's design choices. Can you provide insights into the choice of this specific architecture and how its characteristics align with the requirements of processing high-dimensional mammographic images?
  6. The study discusses the use of attention-based pooling in the proposed model. Although attention-based pooling is introduced, the article does not delve into the interpretability aspects of the model's predictions and how attention contributes to this. A more thorough exploration of how attention mechanisms enhance interpretability, especially in the context of mammographic images, would provide valuable insights into the transparency and trustworthiness of the proposed model. How does this attention mechanism contribute to the interpretability of the model's predictions, and what considerations were taken to ensure that the attention-based approach aligns with the specific challenges posed by mammographic images?
  7. The study utilizes the Predicting Risk of Cancer At Screening (PROCAS) dataset. While PROCAS is utilized, the article lacks a comprehensive discussion on the representativeness of the dataset in terms of demographic characteristics. Examining the demographic alignment and efforts taken to ensure generalizability across diverse populations would strengthen the external validity and applicability of the proposed model. How does the demographic composition of this dataset align with broader population demographics, and what efforts were made to ensure that the model's performance is generalizable across diverse demographic groups?
  8. The article mentions the exclusion of certain cases based on criteria such as missing Full-Field Digital Mammography (FFDM) or prior breast cancer diagnosis. The exclusion criteria are outlined, but the article does not explicitly address the potential impact of these exclusions on the external validity of the model. A more thorough exploration of how these criteria influence the model's applicability in broader screening contexts would enhance our understanding of its real-world utility and limitations. How might these exclusion criteria impact the external validity of the model, particularly in real-world screening scenarios where such exclusions may be less stringent?
  9. The article emphasizes the potential for personalized prevention and early detection. While the potential for personalized prevention is highlighted, the article does not extensively discuss the practical integration of the model into primary care settings. Exploring the feasibility, challenges, and potential benefits of incorporating this risk prediction tool into routine primary care would offer valuable insights for physicians involved in frontline patient care. How could the proposed model be integrated into routine primary care settings to facilitate individualized screening protocols, and what challenges might primary care physicians anticipate in implementing such risk prediction tools?
  10. The study mentions the importance of adjusting individual screening protocols based on model outputs in real-time, but it does not delve into the communication strategies or considerations for diverse patient populations. Understanding how primary care physicians interpret and convey risk predictions, especially in culturally diverse settings, would provide essential insights into the model's clinical implementation and patient acceptance. How might primary care physicians interpret and communicate these risk predictions to patients, and what measures were taken to ensure that the model outputs align with the communication needs of diverse patient populations?
  11. The article mentions the patching of mammograms into smaller, equally sized tiles. The patching process is discussed, but the article lacks a detailed explanation of how the specific patch size was chosen and its impact on preserving relevant image features. Elaborating on the considerations taken to determine an optimal patch size would contribute to a deeper understanding of the model's image-processing techniques. How was the choice of patch size determined, and what considerations were taken to ensure that this patching process preserves relevant image features without introducing artifacts?
  12. Affine transforms, including rotations and translations, are applied during training. While affine transforms are mentioned, the article does not extensively discuss their role in enhancing the model's robustness, particularly in the context of variations in breast positioning. A more detailed exploration of how these transformations address challenges related to breast positioning and image acquisition would provide valuable insights into the model's adaptability to real-world scenarios. How do these transformations contribute to the model's robustness in handling variations in breast positioning and image acquisition, and were there specific challenges encountered in applying these transformations to mammographic images?
  13. The study employs 5-fold cross-validation for model evaluation, but the article lacks a comprehensive discussion on the rationale behind choosing this specific approach and its implications for model evaluation. Exploring the considerations for potential biases or confounding factors would enhance our understanding of the robustness of the model. Can you provide insights into why this specific cross-validation approach was chosen, and were there considerations for potential biases or confounding factors that may affect the validity of the model evaluation results?
  14. The logistic regression analysis incorporates additional predictors such as family history and ethnic origin. The article mentions additional predictors, but it does not provide a detailed explanation of the criteria for their selection or the statistical methods used to prevent multicollinearity or overfitting. A thorough exploration of the variable selection process and statistical precautions would contribute to a more comprehensive understanding of the logistic regression analysis. How were these predictors selected, and what statistical methods were employed to ensure that the inclusion of these variables does not introduce multicollinearity or overfitting in the model?
  15. The study discusses the potential impact of the model on personalized screening. While the potential impact of personalized screening is highlighted, the article does not extensively discuss the alignment with broader public health initiatives or considerations for disparities in access to screening. Exploring the potential implications for population-level health outcomes and addressing equity concerns in screening access would provide valuable insights for public health professionals. How might the integration of this risk prediction tool align with broader public health initiatives aimed at reducing breast cancer morbidity and mortality, and were there considerations for disparities in access to screening resources?
  16. The article introduces variables associated with breast cancer risk, but it does not explicitly discuss how these variables were identified or whether there were efforts to encompass a comprehensive range, including social determinants of health. A more thorough exploration of the variable selection process and considerations for social determinants would enhance the study's relevance to broader public health perspectives. How were these variables identified, and were there efforts to ensure that the selected variables encompass a comprehensive range of determinants, including those relevant to social determinants of health?
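
For context on the attention-based multiple instance learning (MIL) pipeline raised in comments 4, 5, 6 and 11 above, a minimal sketch is given below. It assumes PyTorch with an ImageNet-pretrained ResNet-18 backbone; the 224-pixel tiles, 512-dimensional embeddings and single-logit head are illustrative assumptions rather than the authors' actual implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AttentionMIL(nn.Module):
    """Illustrative attention-based MIL risk model over mammogram tiles.

    The image is cut into equally sized tiles (a "bag" of instances), a
    pre-trained ResNet-18 embeds each tile, an attention layer weights the
    tiles, and a linear head scores the attention-pooled embedding.
    """

    def __init__(self, patch=224, embed_dim=512, attn_dim=128):
        super().__init__()
        self.patch = patch
        backbone = resnet18(weights="IMAGENET1K_V1")
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1)
        )
        self.classifier = nn.Linear(embed_dim, 1)

    def forward(self, image):
        # image: (1, 3, H, W) with H and W multiples of the patch size
        p = self.patch
        tiles = image.unfold(2, p, p).unfold(3, p, p)          # (1, 3, rows, cols, p, p)
        tiles = tiles.permute(0, 2, 3, 1, 4, 5).reshape(-1, 3, p, p)
        feats = self.encoder(tiles).flatten(1)                  # (n_tiles, embed_dim)
        weights = torch.softmax(self.attention(feats), dim=0)   # one weight per tile
        pooled = (weights * feats).sum(dim=0)                   # attention-weighted pooling
        return torch.sigmoid(self.classifier(pooled)), weights

model = AttentionMIL().eval()
with torch.no_grad():
    risk, tile_weights = model(torch.rand(1, 3, 896, 672))      # toy image, 12 tiles
```

The per-tile attention weights returned by such a model can be visualised over the mammogram, which is one way this kind of architecture can address the interpretability point raised in comment 6.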

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

The manuscript titled "Artificial Intelligence for Image-Based Breast Cancer Risk Prediction Using Attention..." is well-written and presents an engaging read. The proposed model shows promise for early cancer detection upon additional testing, validation, and refinement, and thus has substantial clinical relevance.

Please find my comments below.

Major Comments: There are no major comments.

Minor Comments:

  1. The results are presented as a comparison of AUC values, but there is no context provided about the significance of the improvement. How does this increase in AUC impact clinical outcomes?

  2. Considering the model is specifically designed for early detection, before cancer is detectable at full resolution, should the focus be more on true positives (TP) even if this increases false positives (FP)?
  3. In Table 2, are the odds ratios (OR) statistically significant?
  4. Does a significant demographic difference exist between the PRC-mixed and BCR-priors groups, as shown in Table A1?

 

Comments on the Quality of English Language

English language fine. No issues detected

Author Response

 


Comment 1 (Results)

We are not assessing our method for a specific clinical intervention, as that is outside the scope of this paper. We compare the results of MAI-risk against VAS and IBIS to highlight the improvement in discrimination of image-based methods over existing clinical prediction models. As such, it is not possible to discuss clinical outcomes without a specific intervention to assess. We hope to address this in future work.
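
To make "improvement in discrimination" concrete, the sketch below shows one way a paired AUC comparison between an image-based score and a clinical risk score could be carried out, using a percentile bootstrap for the AUC difference. The data and score names are hypothetical and do not reproduce the paper's analysis.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical paired scores on the same women: 1 = developed cancer, 0 = did not.
y = rng.integers(0, 2, size=1000)
score_image = y * 0.6 + rng.normal(0, 0.4, size=1000)     # stand-in for an image-based score
score_clinical = y * 0.3 + rng.normal(0, 0.4, size=1000)  # stand-in for a clinical risk score

def bootstrap_auc_diff(y, s1, s2, n_boot=2000, seed=1):
    """Percentile bootstrap CI for AUC(s1) - AUC(s2) on paired predictions."""
    rs = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rs.integers(0, len(y), len(y))
        if len(np.unique(y[idx])) < 2:                     # resample must contain both classes
            continue
        diffs.append(roc_auc_score(y[idx], s1[idx]) - roc_auc_score(y[idx], s2[idx]))
    return np.percentile(diffs, [2.5, 97.5])

print("AUC image-based:", roc_auc_score(y, score_image))
print("AUC clinical:   ", roc_auc_score(y, score_clinical))
print("95% CI for AUC difference:", bootstrap_auc_diff(y, score_image, score_clinical))
```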

Comment 2 (FP vs TP)

The model is intended for general populations at average risk in national screening programmes. Our intuition is that even small increases in the FP rate can lead to an unmanageable number of women requiring further intervention, putting even more strain on screening programmes. This is why we focus on the TP rate rather than the FP rate, so that the women for whom we recommend additional intervention are highly likely to benefit from it. This has been made clearer in the manuscript.

Comment 3 (OR Significance)

MAI-risk shows a statistically significant improvement in OR over VAS and IBIS for the combined model. We have now ensured that this is stated explicitly in the results (line 209).
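
As an illustration of the kind of check behind such a statement, the sketch below fits a logistic regression on hypothetical data and reports the odds ratio per standard deviation of a risk score together with its 95% confidence interval; an interval that excludes 1 corresponds to significance at the 5% level. It assumes numpy and statsmodels and is not the authors' analysis.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Hypothetical standardized risk scores and case/control labels.
n = 2000
risk_score = rng.normal(0, 1, n)
p_case = 1 / (1 + np.exp(-(-2.0 + 0.7 * risk_score)))
cancer = rng.binomial(1, p_case)

X = sm.add_constant(risk_score)          # intercept + score
fit = sm.Logit(cancer, X).fit(disp=0)

or_per_sd = np.exp(fit.params[1])        # odds ratio per 1 SD of the score
ci_low, ci_high = np.exp(fit.conf_int()[1])
print(f"OR per SD: {or_per_sd:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```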

Comment 4 (Data demographics)

There are some demographic differences between the sets. A short paragraph detailing these differences, as well as explaining their origin, has been added to Appendix A1 (starting at line 268).

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,
thank you for including my suggestions for improvement.

Introduction: the reference in line 20 is missing.
Limitations: there is a double 'a' in line 280.

Good luck!

Comments on the Quality of English Language

Limitations: there is a double 'a' in line 280.

Author Response

Dear Reviewer,

Thank you for taking the time to read and critique our manuscript.

As per your suggestions, the missing reference has been added, and a more thorough spelling check has been carried out.

Kind regards,

Stepan

Reviewer 2 Report

Comments and Suggestions for Authors

Accept in present form. 

Author Response

Dear Reviewer,

 

Thank you for taking the time to review and critique our manuscript.

 

Kind regards,

Stepan

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

All the comments are addressed.

Best.

Author Response

Dear Reviewer,

Thank you for taking the time to read and critique our manuscript.

 

Kind regards,

Stepan
