Machine Learning for Rupture Risk Prediction of Intracranial Aneurysms: Challenging the PHASES Score in Geographically Constrained Areas
Round 1
Reviewer 1 Report
The authors did a great job. However, the title and conclusion should be changed, since a similar article was recently published by Chubin Ou and coworkers: "Rupture Risk Assessment for Cerebral Aneurysm Using Interpretable Machine Learning on Multidimensional Data", 2020.
Author Response
Dear reviewer 1,
We want to thank you for the opportunity to improve our manuscript "Rupture Risk Prediction of Intracranial Aneurysms using Machine Learning - A Comparison with the PHASES Score", which we submitted to Symmetry for publication. We appreciate your constructive criticism and all the time and effort that you invested in our manuscript. Your comment was addressed accordingly and the manuscript was subsequently modified.
Reviewer 1:
“The authors did a great job. However, the title and conclusion should be changed, since a similar article was recently published by Chubin Ou and coworkers: "Rupture Risk Assessment for Cerebral Aneurysm Using Interpretable Machine Learning on Multidimensional Data", 2020.”
Thank you for this valuable information!
To emphasize that our study primarily questions the PHASES score under geographical constraints, we have adjusted the title as follows: “Machine Learning for Rupture Risk Prediction of Intracranial Aneurysms: Challenging the PHASES Score in Geographically Constrained Areas”.
The conclusion section was revised accordingly: “This study demonstrated that the machine learning approach is superior to the PHASES score for rupture prediction of UIAs. Since the patient cohort is geographically constrained, the model can enhance risk evaluation and patient counseling in this specific area.”
Please find attached a PDF highlighting the changes between the old and the new version. Thank you again for your constructive feedback.
Author Response File: Author Response.pdf
Reviewer 2 Report
This article is well done.
The topic is very important and innovative, with high significance for the medical community.
I consider this paper ready for full publication.
The introduction is well developed.
The Materials and Methods are correct.
The statistical test is appropriate.
The results are clear and very interesting.
The discussion is very cool. The word choice may be inappropriate, but I mean that it captures the readers' attention.
Well done
Author Response
Dear reviewer 2,
Thank you very much for your support. We appreciate all the time and effort you invested in our manuscript.
Reviewer 3 Report
The paper presents an application of a gradient boosting algorithm to predicting the risk of rupture of intracranial aneurysms. The Authors use a custom dataset of 446 samples (balanced) to train a gradient boosting classifier using a scikit-learn implementation of the algorithm. The training/testing routine is carried out using a five-fold cross validation. Many metrics are computed to showcase the obtained results and compare them with a common clinical scoring system - PHASES (i.e. accuracy, F1-score, confusion matrices and more). The proposed machine learning approach provides significantly better predictions than the PHASES score.
The manuscript is well written and easy to read, while the topic is relevant. Nevertheless, I have two minor comments. Please find the details below.
1. Did you employ any methods to minimize the effect of RNG on the selection of samples for training/testing from the dataset? Cross validation partially takes care of this issue, but in some cases (depending on the structure of the dataset and shuffling) RNG can still affect the results. In some cases it might help to repeat the experiment with different seeds.
2. In the paper it was stated that "A major advantage of this study [...] is our balanced data set.". I would not go this far with the comment. In most medical applications of ML, the algorithms have to deal with imbalanced datasets. It is unclear how the proposed algorithm would behave when faced with an imbalanced dataset and what approach would work best to overcome this issue - this could be studied in the future to make sure that the approach is general and can be effectively trained using different datasets from different facilities.
Author Response
Dear reviewer 3,
We want to thank you for the opportunity to improve our manuscript "Rupture Risk
Prediction of Intracranial Aneurysms using Machine Learning - A Comparison with
the PHASES Score", which we submitted to Symmetry for publication. We appreciate
your constructive criticism and all the time and effort that you invested in our
manuscript. Your comments were addressed accordingly and the manuscript was
subsequently modified.
Reviewer 3:
“The paper presents an application of gradient boosting algorithm to predicting
the risk of rupture of intracranial aneurysms. The Authors use a custom dataset
of 446 samples (balanced) to train a gradient boosting classifier using a scikit-learn implementation of the algorithm. The training/testing routine is carried
out using a five-fold cross validation. Many metrics are computed to showcase
the obtained results and compare them with a common clinical scoring system
- PHASES (i.e. accuracy, F1-score, confusion matrices and more). The
proposed machine learning approach provides significantly better predictions
than PHASES score.
The manuscript is well written and easy to read, while the topic is relevant.
Nevertheless, I have two minor comments. Please find the details below.
1. Did you employ any methods to minimize the effect of RNG on the selection
of samples for training/testing from the dataset? Cross validation partially
takes care of this issue, but in some cases (depending on the structure of the
dataset and shuffling) RNG can still affect the results. In some cases it might
help to repeat the experiment with different seeds.
2. In the paper it was stated that "A major advantage of this study [...] is our
balanced data set.". I would not go this far with the comment. In most medical
applications of ML, the algorithms have to deal with imbalanced datasets. It is
unclear how the proposed algorithm would behave when faced with
an imbalanced dataset and what approach would work best to overcome this
issue - this could be studied in the future to make sure that the approach is
general and can be effectively trained using different datasets from different
facilities.”
We used shuffled stratified 5-fold CV with a predefined fixed seed. Fixing the seed in
advance prevents rerunning the experiment until an overoptimistic model is reached, but
it may not be enough to minimise the effect of RNG. Therefore, we really appreciate
your suggestion on repeated cross validation and have incorporated it into the manuscript:
“Repeating this experiment 100 times with random seeds resulted in a similar mean
AUC (0.8492 ± 0.0085).”
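The repeated cross validation described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the data are synthetic (a roughly balanced set of 446 samples from `make_classification`), the helper name `repeated_cv_auc` and all hyperparameters are hypothetical, and the integer seeds 0, 1, 2, ... stand in for the random seeds mentioned in the manuscript.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def repeated_cv_auc(X, y, n_repeats=100, n_splits=5, n_estimators=100):
    """Run shuffled stratified k-fold CV once per seed; return the mean AUC of each run."""
    per_seed_auc = []
    for seed in range(n_repeats):
        # A new seed changes how samples are shuffled into folds.
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        model = GradientBoostingClassifier(n_estimators=n_estimators, random_state=seed)
        fold_auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        per_seed_auc.append(fold_auc.mean())
    return np.asarray(per_seed_auc)


# Synthetic stand-in for the study data (assumption): 446 roughly balanced samples.
X, y = make_classification(
    n_samples=446, n_features=10, weights=[0.5, 0.5], random_state=0
)

# Small settings keep the demo fast; n_repeats=100 would mirror the experiment above.
aucs = repeated_cv_auc(X, y, n_repeats=5, n_estimators=50)
print(f"Mean AUC over seeds: {aucs.mean():.4f} ± {aucs.std():.4f}")
```

A small standard deviation across seeds, as reported above, suggests that the result does not hinge on one particular fold assignment.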
You are certainly right about point two. The respective statement was revised in this
regard: "Since the model is trained on a balanced data set, patients with and without
ruptured aneurysms are well represented."
Please find attached a PDF highlighting the changes between the old and the new
version. Thank you again for your constructive feedback.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The authors made all the requested changes.