Influence of Sampling Methods on the Accuracy of Machine Learning Predictions Used for Strain-Dependent Slope Stability
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper is well written and clearly described. It focuses on a very specific topic, in which the reviewer is not a top expert. In any case, the paper is interesting, with minor typing errors that should be corrected:
- Line 84: a full stop is missing after "techniques".
- Line 280: there is an additional space at the beginning of the sentence.
- Figure 9: the curve corresponding to the Incremental Driver, which is the comparison curve, is not visible. The figure should be improved.
In the reviewer's opinion, the paper can be accepted.
Author Response
Please find the attached response to the comments of reviewer 1 in the PDF file.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript presents an interesting study investigating the influence of sampling methods on the predictive accuracy of machine learning algorithms for strain-dependent slope stability. In general, a revision is suggested, and some technical issues should be addressed or clarified to improve the quality of the manuscript:
(1) Lines 106-107: ‘Explicit discussion of data sample preparation for training and its effect on the ML model performance is rare in geotechnical applications’. Please point out the potential reasons or challenges underlying this issue. This would help readers grasp the novelty and contribution of this study.
(2) Lines 237-240: ‘Among others, common sampling approaches include grid, stratified, cluster, weighted, Monte Carlo, and Latin Hypercube Sampling (LHS). Out of those, six sampling approaches are described in more detail and applied throughout this study’. It is suggested to briefly summarize the advantages and limitations of these six commonly used sampling techniques in a table.
(3) Lines 281-282: ‘Let us consider 60,000 samples generated using Latin Hypercube sampling (LHS)’. How was the number of samples to be generated with the LHS approach determined? Please provide more details.
(4) Lines 423-425: ‘The lowest R² was obtained using grid sampling with linear/uniform spacing (R² = 0.313), while the other sampling approaches resulted in 0.603 ≤ R² ≤ 0.856’. Please analyze the possible reasons underlying this observation, namely that the predictive accuracy obtained with grid sampling is significantly lower than that of the other sampling techniques in this study.
(5) Although Table 2 lists the parameters of the hypoplastic model for the slope example, the symbols are not explicitly defined. It is suggested to add a row giving the definition of each symbol.
(6) As reported in the introduction section, supervised machine learning (ML) methods have been successfully applied to diverse geotechnical engineering problems. Several recent studies on this topic may give readers a better understanding, such as ‘A comprehensive comparison among metaheuristics (MHs) for geohazard modeling using machine learning’ and ‘Comprehensive review of machine learning in geotechnical reliability analysis: Algorithms, applications and further challenges’.
(7) The section ‘Conclusion and future work’ largely repeats analysis results already described in the main text. It is suggested to pay more attention to the implications of this study for geotechnical practitioners who use machine learning techniques in slope stability prediction.
Comments on the Quality of English Language
Fine.
Author Response
Please find the attached response to the comments of reviewer 2 in the PDF file.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript has been carefully revised, and acceptance is suggested.
Comments on the Quality of English Language
Minor editing of English language required.