A Deep Learning Model for Preoperative Differentiation of Glioblastoma, Brain Metastasis, and Primary Central Nervous System Lymphoma: An External Validation Study
Round 1
Reviewer 1 Report
The paper seems interesting but is somewhat difficult to follow. Clinical considerations are weak, and further explanation of the analysis process is required. A plot of the radiological feature selection process could help clarify the formulas.
Author Response
Reviewer 1:
“The paper seems interesting but is somewhat difficult to follow. Clinical considerations are weak, and further explanation of the analysis process is required. A plot of the radiological feature selection process could help clarify the formulas.”
# We thank Reviewer 1 for reviewing our manuscript. We appreciate the intention to improve the quality of our work in its revised form. Our comments on the revisions follow:
The Discussion has been extensively revised to strengthen the key clinical points where implementation of ML models like ours is suggested and warranted. We understand that our study is of preliminary value at most, as it validates a pilot study on a small sample, retrospectively investigated with an ML model of low interpretability. However, we wish to underline the importance of such models: the body of literature in this direction is growing, marking a step forward in the implementation of artificial intelligence and the enhancement of human performance in clinical practice.
Given what was explained above, an exact description of the radiological features is not methodologically possible because of the model architecture itself. Deep convolutional networks do not permit exact feature extraction, and only the encoded feature values are known for the replicability of results. Those values, as commonly happens in ML-based literature, are of limited importance to the general medical readership, but are of course available to specialized personnel for further use upon justified request. We refer the reader elsewhere for a more extensive explanation of deep convolutional network architecture, given the word limit of such a publication. However, should Reviewer 1 require additional clarification, we would be glad to meet his/her requests for the sake of the manuscript under revision.
Finally, extensive English revision was performed.
Reviewer 2 Report
The authors presented an interesting manuscript regarding the role of a deep learning model for preoperative differentiation of GBM, BM and PCNSL.
The manuscript has a high potential scientific soundness; nevertheless, it presents several pitfalls that should be clarified by the authors before considering the draft ready for publication.
- Abstract: no elements about the methodology and results about external validation are presented; lines 36/37 and lines 37/38 in the conclusion can refer to Methods and to Results. This summary should be rewritten and balanced among the TrS study and TeS study.
- Methods (inclusion criteria): how did the authors define an inadequate MRI? Does MRI include volumetric sequences? Were anamnestic concerns considered as exclusion criteria, such as history of malignancies or risk factors for CNS lymphomas? Was a total body CT scan performed in the preoperative period? Why did the authors include only IDH1-wt GBM, since their manuscript refers to the 2016 CNS classification and surgical management is superimposable among so-called primary and secondary GBM?
- Does "operated" (l.107) refer only to surgical removal or also to biopsy?
- Please specify which is the expert neuroradiologist among the authors in the text (line 174). Was the neuropathologist team blinded to MRI features?
- In the Discussion the authors should expand the first sentence regarding what is new in this manuscript if compared to the previous experience as recently published (ref.13).
- Furthermore, I suggest adding a paragraph regarding the further perspective of this research, including the feasibility of developing software available for hospitals and, possibly, a didactic tool.
Author Response
Reviewer 2:
# We thank Reviewer 2 for reviewing our manuscript. We appreciate the intention to improve the quality of our work in its revised form. Our comments on the revisions follow:
“The authors presented an interesting manuscript regarding the role of a deep learning model for preoperative differentiation of GBM, BM and PCNSL. The manuscript has a high potential scientific soundness; nevertheless, it presents several pitfalls that should be clarified by the authors before considering the draft ready for publication.”
“- Abstract: no elements about the methodology and results about external validation are presented; lines 36/37 and lines 37/38 in the conclusion can refer to Methods and to Results. This summary should be rewritten and balanced among the TrS study and TeS study.”
# Abstract: The abstract has been reviewed and corrected as suggested. Additional speculation on our findings in light of our previous results in the pilot study is available in the main discussion.
“- Methods (inclusion criteria): how did the authors define an inadequate MRI? Does MRI include volumetric sequences? Were anamnestic concerns considered as exclusion criteria, such as history of malignancies or risk factors for CNS lymphomas? Was a total body CT scan performed in the preoperative period? Why did the authors include only IDH1-wt GBM, since their manuscript refers to the 2016 CNS classification and surgical management is superimposable among so-called primary and secondary GBM?”
# The Methods were revised, and the inclusion of volumetric axial Gd-enhanced scans was specified to support replication of our tests. For the purpose of our retrospective validation analysis, the DL model took into account no clinical history beyond that reported in the manuscript. As stated, no post-hoc exclusion was designed for immunocompetent vs immunosuppressed patients. Furthermore, the study design did not allow us to review the timing of total body CT scans at the TeS. As reported in the manuscript, a main source of bias when interpreting the performance of the neuroradiological gold standard is the effect of the patient's prior history and of collateral imaging studies occasionally performed at the time of diagnosis, information to which the machine learning model had no access. This may look like an unequal comparison when drawing conclusions; however, with this methodological limitation in mind, we adhered to the design of previous machine learning-based studies to test our hypotheses. Finally, as previous studies have extensively reported radiomic and deep learning features that differ between IDH wild-type and IDH-mutated gliomas in classification tasks, we decided to collect data on “primary glioblastomas” (Glioblastoma, WHO grade 4, if revised according to CNS WHO 2021) only, as these are the more common from an epidemiological point of view, and to avoid the aforementioned methodological bias arising from implicit features. Although both were classified within the same group on the basis of common histological features at the time the diagnosis was made, we believe that the current design might favor clinical implementation.
“- Does "operated" (l.107) refer only to surgical removal or also to biopsy?”
# The point highlighted by Reviewer 2 was revised in the main manuscript:
Line 99: The medical records and preoperative imaging of patients who underwent surgical tumour resection or biopsy at […]
“- Please specify which is the expert neuroradiologist among the authors in the text (line 174). Was the neuropathologist team blinded to MRI features?”
# Lines 178-179 were corrected: The tumour radiological assessment was addressed by experienced neuroradiologists [P.R. and G.B.] with at least 10 years of clinical experience[…].
The conditions under which the neuropathological diagnosis was made were beyond the aim of the current study and are hard to assess retrospectively.
“- In the Discussion the authors should expand the first sentence regarding what is new in this manuscript if compared to the previous experience as recently published (ref.13).”
# Lines 316-321 were reformulated as follows: The accuracy returned by our model was equivalent to a senior neuroradiologist's performance in identifying PCNSLs and glioblastomas; moderate diagnostic accuracy was observed for BMs. In light of our previous preliminary findings [13], the evidence of model robustness and generalizability achieved in the current study supports our DNN model being “experimentally not inferior” to senior physicians in classifying brain tumours in an unbiased cohort, endorsing the development and deployment of such models in medical training and clinical practice if cleared by regulatory authorities.
“- Furthermore, I suggest adding a paragraph regarding the further perspective of this research, including the feasibility of developing software available for hospitals and, possibly, a didactic tool.”
# An additional paragraph was added to the Discussion in line with Reviewer 2's suggestion [Lines 911-924].
Finally, the manuscript underwent extensive English revision.
Round 2
Reviewer 1 Report
The manuscript has been modified according to the requests.
Reviewer 2 Report
The authors extensively revised the manuscript; in my opinion, it can be accepted since it presents relevant archival value.