Article
Peer-Review Record

Enhancing Seismic Landslide Susceptibility Analysis for Sustainable Disaster Risk Management through Machine Learning

Sustainability 2024, 16(9), 3828; https://doi.org/10.3390/su16093828
by Hailang He 1,2,3, Weiwei Wang 2,4, Zhengxing Wang 2, Shu Li 2,5 and Jianguo Chen 1,*
Reviewer 1: Anonymous
Reviewer 3: Anonymous
Submission received: 24 March 2024 / Revised: 25 April 2024 / Accepted: 29 April 2024 / Published: 2 May 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript integrates nine machine learning models with GeoDetector to evaluate the screening factors in earthquake-induced landslide-prone areas in Minxian. Ten reliable and effective evaluation factors were identified and redundant factors were removed, enhancing the predictive ability of machine learning models in the region. Ultimately, a seismic landslide susceptibility map (SLSM) was generated, laying the foundation for establishing main data on effective evaluation factors in the area and providing new directions and guidance for formulating earthquake disaster prevention and mitigation plans. However, there are some details in the manuscript that need to be revised and adjusted. The specific recommendations are listed as follows:

1.      The description of recursive feature elimination in the manuscript is simplistic. Has the risk of overfitting been considered? And was there any assessment of feature importance conducted?

2.      Was model validation considered during the machine learning modeling process? Specifically, was the model's generalization ability validated using a test set to ensure it performs well on unseen data?

3.      The presentation of figures in this manuscript is subpar. It is advisable to readjust their clarity and captions to enhance readability for the readers.

4.      In Table 3, various evaluation metrics are mentioned. However, landslides are the result of both natural and triggering conditions. Was sensitivity analysis conducted on triggering conditions when selecting evaluation metrics to assess their impact on the chosen metrics and determine whether they are influenced by triggering conditions?

5.      While increasing the sample size of the training set helps reduce overfitting, in some cases, augmenting the sample size may not always be effective. For small datasets, 10-fold cross-validation might result in excessively small sample sizes for each subset, potentially affecting the stability of the model and the reliability of performance assessment.

6.      It is not a standard practice to limit the absolute difference between validation and test set values within 10%. Typically, suitable evaluation metrics and thresholds are determined based on the specific problem and characteristics of the data.

7.      While cross-validation can reduce random errors due to sample selection, different machine learning models may exhibit varying degrees of sensitivity to the data, thus, there may still be some level of random error.

8.      This manuscript should consider comprehensive evaluations, taking into account factors such as rainfall patterns, freeze-thaw cycles, seismic activity, and topography when selecting evaluation metrics.

9.      This manuscript mentions both seismic landslides and non-seismic landslides. In cases of multi-factor coupling leading to landslides, which evaluation metrics should be used?

10.   The manuscript describes normalizing and summing the importance of each factor obtained from the machine learning model, but does not detail the specific normalization and summation methods. These methods may vary depending on the model type and the problem, so more specific information should be provided.

11.  The manuscript mentions the improvement of prediction values and enhancement of model stability without elaborating on how the interpretability and stability of the results are evaluated and compared. More information on this aspect should be provided.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Moderate editing of English language required

Author Response

Thank you very much for your thorough review and insightful comments regarding our manuscript.  We appreciate your positive remarks on the integration of machine learning models with GeoDetector to assess factors in earthquake-induced landslide-prone areas in Minxian, and the development of a seismic landslide susceptibility map (SLSM).  We are committed to addressing your concerns and making the necessary revisions to enhance the manuscript.

Please find below our responses to each of the specific recommendations:

1. The description of recursive feature elimination in the manuscript is simplistic. Has the risk of overfitting been considered? And was there any assessment of feature importance conducted?

Thank you for your insightful comments regarding our manuscript's section on Recursive Feature Elimination (RFE). To address your concerns regarding the simplicity of our description and the potential risks of overfitting, we provide additional details on our methodology and the safeguards in place to ensure robust model performance.

Firstly, RFE is indeed based on a greedy algorithm that iteratively builds a model and eliminates features based on their ranking, which is calculated from the model's performance metrics. To mitigate the risk of overfitting, cross-validation was employed during the RFE process, allowing us to assess the generalizability of the model across different subsets of the dataset. This step ensures that the feature selection is not overly tailored to the nuances of the training data.

Moreover, regarding the assessment of feature importance, it was conducted through the model's intrinsic evaluation metrics, such as information gain in decision trees or coefficient magnitudes in support vector machines. The elimination sequence of each feature provided a quantitative measure of its impact on the model's predictive accuracy.

We hope these clarifications address your concerns. The manuscript has been updated accordingly to enhance understanding and transparency of our feature selection process. The section that has been revised is Section 3.3, titled 'Recursive feature elimination', specifically at lines 262-272 of the manuscript.
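To illustrate the procedure outlined in this response, the following is a minimal sketch assuming a scikit-learn workflow; the synthetic dataset, the RandomForest estimator, and all variable names are illustrative placeholders, not the study's actual implementation.

```python
# A minimal sketch of cross-validated recursive feature elimination.
# The synthetic dataset and RandomForest estimator are stand-ins for
# the study's actual data and models.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

X, y = make_classification(n_samples=500, n_features=15, n_informative=6,
                           random_state=0)

# RFECV ranks features by the estimator's importances, drops the weakest
# feature each iteration, and uses cross-validation so the selection is
# not tailored to one particular train/validation split.
selector = RFECV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    step=1, cv=5, scoring="roc_auc",
)
selector.fit(X, y)

print("features retained:", selector.n_features_)
print("elimination ranking:", selector.ranking_)  # rank 1 = retained
```

The elimination order recorded in `ranking_` is the quantitative measure of feature impact referred to above.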

 

2. Was model validation considered during the machine learning modeling process? Specifically, was the model's generalization ability validated using a test set to ensure it performs well on unseen data?

Thank you for your query regarding model validation. We indeed considered the generalization ability of the machine learning models as a crucial aspect of our study. To ensure robust performance on unseen data, we employed a structured validation approach using a designated test set.

To clarify, the dataset was partitioned into a training set and a test set, with the latter comprising 15% of the entire dataset (N=695 cases). This separation was executed randomly to avoid any sampling bias. Furthermore, to enhance the reliability of our model performance evaluations, we implemented 10-fold cross-validation within the training process. This method involves dividing the training set into 10 equal subsets, where each subset is used once as a validation set while the remaining subsets are used for training. This procedure is repeated 10 times with different combinations, ensuring that each instance of the dataset is used for both training and validation.

The consistency between the validation and test results was stringently monitored, with a threshold set for the absolute difference to be less than 10% of the test set values, indicating a successful model fit. This validation strategy not only reduces random errors that may arise from the use of various machine learning models but also allows for the comparative analysis of accuracy across different models.

We trust this explanation confirms the rigor of our validation approach and its effectiveness in assessing the generalization capability of the models used in our study. The section that has been revised is Section 4.2, titled 'Sample design', specifically at lines 337-356 of the manuscript.
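The validation scheme described in this response can be sketched as follows; this is a hedged illustration assuming scikit-learn, with a synthetic dataset and logistic model standing in for the study's data and models.

```python
# A minimal sketch of the described scheme: an 85/15 train-test split,
# 10-fold cross-validation on the training portion, and a consistency
# check that validation and test AUC differ by less than 10% of the
# test value. Dataset and model are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=4634, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.15, random_state=0)  # 15% held out as the test set

model = LogisticRegression(max_iter=1000)
cv_auc = cross_val_score(model, X_tr, y_tr, cv=10, scoring="roc_auc").mean()

model.fit(X_tr, y_tr)
test_auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# Consistency criterion: |validation - test| within 10% of the test value.
consistent = abs(cv_auc - test_auc) < 0.10 * test_auc
print(f"cv AUC={cv_auc:.3f}  test AUC={test_auc:.3f}  consistent={consistent}")
```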

 

 

3. The presentation of figures in this manuscript is subpar. It is advisable to readjust their clarity and captions to enhance readability for the readers.

Thank you for your constructive feedback regarding the presentation of figures in our manuscript. We appreciate your advice on enhancing the clarity and readability of these visual elements, which are crucial for conveying complex information effectively.

In response to your comments, we have thoroughly reviewed and revised all figures within the manuscript. We have increased the resolution of all figures to ensure that they are clear and easily interpretable, even when zoomed in. Fine details are now more visible, and graphical elements are sharply defined.

 

4. In Table 3, various evaluation metrics are mentioned. However, landslides are the result of both natural and triggering conditions. Was sensitivity analysis conducted on triggering conditions when selecting evaluation metrics to assess their impact on the chosen metrics and determine whether they are influenced by triggering conditions?

Thank you for your thoughtful inquiry regarding the sensitivity analysis of triggering conditions in our evaluation metrics. We acknowledge the importance of understanding how various natural and induced factors influence landslide susceptibility, which indeed constitutes a significant aspect of landslide risk assessment.

Our study builds upon established research that has extensively explored these factors. The evaluation metrics selected for our analysis were informed by prior studies which have detailed the sensitivity of landslide occurrences to such conditions. Throughout our research, we have carefully considered the impact of these well-documented factors, ensuring that our models and the corresponding evaluation metrics adequately reflect the complex interplay of these conditions.

While a detailed, model-specific sensitivity analysis was not explicitly described in our manuscript, the selection and validation of our evaluation metrics inherently involved considerations regarding the robustness and relevance of these metrics under varying conditions, guided by established literature. Our methodology section has been updated to clarify this point, explicitly stating that the influences of triggering conditions on landslide susceptibility, as documented in previous studies, were integral to our approach in selecting and assessing the reliability of our metrics.

We appreciate your suggestion and believe that this clarification will enhance the manuscript's transparency and provide the readers with a better understanding of how our findings align with and build upon existing knowledge.

The section that has been revised is Section 2.2, titled 'Data', specifically at lines 129-139 of the manuscript.

 

5. While increasing the sample size of the training set helps reduce overfitting, in some cases, augmenting the sample size may not always be effective. For small datasets, 10-fold cross-validation might result in excessively small sample sizes for each subset, potentially affecting the stability of the model and the reliability of performance assessment.

Thank you for highlighting the potential concerns regarding the use of 10-fold cross-validation with our dataset size. We understand the importance of ensuring that the subsets used in cross-validation are sufficiently large to prevent any compromise in model stability and performance assessment reliability.

In response to your concerns, we have revisited our cross-validation procedure and reaffirmed that with a total of 4634 data points, the size of each fold is adequate to maintain the stability and reliability of our models. The size allows each subset to be representative of the larger dataset, which is crucial for the robust training and validation of our machine learning models.

We acknowledge that the potential issues you have outlined could arise under different circumstances, particularly with smaller datasets. However, given our dataset's size and the partitioning strategy employed, we believe our approach is appropriate and robust for the scope and needs of our study. The section that has been revised is Section 4.2, titled 'Sample design', specifically at lines 337-356 of the manuscript.

 

6. It is not a standard practice to limit the absolute difference between validation and test set values within 10%. Typically, suitable evaluation metrics and thresholds are determined based on the specific problem and characteristics of the data.

Thank you for your valuable feedback concerning our method for assessing the consistency between validation and test set outcomes. We appreciate your expertise and understand the importance of tailoring evaluation metrics specifically to the data and research question at hand.

In our study, we opted for a 10% threshold as the criterion for success based on a detailed preliminary analysis tailored to our dataset's characteristics and the specific challenges of our research domain. This choice was informed by both empirical evidence and theoretical considerations within the context of seismic landslide analysis, where maintaining a balance between sensitivity and specificity is crucial.

We acknowledge that while standard practices may vary, our decision was guided by an intention to adopt a rigorously quantitative approach while also considering practical constraints and the specific nature of binary classification problems such as ours. The 10% threshold was shown to be effective in our preliminary tests for minimizing overfitting and ensuring a robust measure of model performance.

We respect the perspectives brought forward in your comments and believe that our approach, while perhaps unconventional, is justified given the unique aspects of our analysis. We aim to provide a clear and scientifically sound rationale for our methodologies in the manuscript, ensuring that our choices are transparent and well-supported by our data and its analysis.

Thank you once again for enhancing the scholarly discourse with your critique, which has provided us with an opportunity to further clarify and substantiate our research approach. The section that has been revised is Section 4.2, titled 'Sample design', specifically at lines 337-356 of the manuscript.

 

7. While cross-validation can reduce random errors due to sample selection, different machine learning models may exhibit varying degrees of sensitivity to the data; thus, there may still be some level of random error.

Thank you for your insightful comments on the potential for random errors in our model evaluation due to the sensitivity of different machine learning models to the data, despite the use of cross-validation. We appreciate your expertise in highlighting this critical aspect of machine learning model validation.

We acknowledge that no cross-validation strategy can completely eliminate random errors, particularly given the inherent variability in model sensitivity. In our study, while 10-fold cross-validation was employed to maximize the utilization of our data and minimize the risk of overfitting, we recognize that variations between models could still introduce some degree of error.

However, our methodology was designed to balance these concerns effectively. By employing multiple models and monitoring the consistency of their performance across various cross-validation folds, we aimed to mitigate the potential bias and variance errors that any single model might introduce. The criterion for success — that the absolute difference between validation and test set results does not exceed 10% — was set to ensure that only models demonstrating stable and reliable performance across different subsets and conditions were considered successful.

This approach, while not completely eliminating random errors, significantly reduces their impact and provides a robust basis for the comparative analysis of the accuracy between test and training results. We believe that our methodological choices are appropriate given the scope of our study and the data characteristics, and they provide a strong foundation for the generalizability of our models to unseen data.

Thank you once again for your constructive critique. It has provided us with an opportunity to clarify the robustness of our analysis and the thoughtfulness behind our methodological choices. The section that has been revised is Section 4.2, titled 'Sample design', specifically at lines 337-356 of the manuscript.

 

8. This manuscript should consider comprehensive evaluations, taking into account factors such as rainfall patterns, freeze-thaw cycles, seismic activity, and topography when selecting evaluation metrics.

Thank you for your insightful suggestions regarding the inclusion of comprehensive evaluations in our manuscript, particularly concerning factors such as rainfall patterns, freeze-thaw cycles, seismic activity, and topography. We greatly appreciate your expertise in identifying these aspects, which undoubtedly enrich the analysis of landslide susceptibility.

In our current research, which focuses on a county-level region particularly susceptible to seismic-induced landslides, we made a considered decision to focus on the most impactful factors documented in previous studies and pertinent to the seismic context of our specific study area. Given the relatively uniform climate and geological conditions within this localized area, variations in rainfall patterns and freeze-thaw cycles were determined to be less significant compared to other regions where these factors might vary more broadly and significantly influence landslide occurrence.

However, we recognize the importance of these factors in a broader geographical context and agree that they could provide valuable insights in studies covering larger or more diverse areas. In this study, we concentrated on seismic activity and topography, along with geological and human factors, as our primary variables based on their critical relevance and the availability of robust data.

We appreciate your suggestion and plan to incorporate a more extensive range of environmental factors, including rainfall and freeze-thaw cycles, in future studies that cover larger regions or where climatic conditions show significant variability. This would indeed allow for a more comprehensive understanding of the factors influencing landslide susceptibility and enhance the applicability of our findings.

Your feedback has been invaluable in highlighting potential areas for further research, and we look forward to exploring these aspects in subsequent projects, as they would undoubtedly contribute to a more nuanced understanding of landslide dynamics. The sections that have been revised are in Section 4.1, titled 'Spatial Database', specifically at lines 309-311 and 317-318 of the manuscript.

 

9. This manuscript mentions both seismic landslides and non-seismic landslides. In cases of multi-factor coupling leading to landslides, which evaluation metrics should be used?

Thank you for your thoughtful question regarding the evaluation metrics appropriate for studies involving both seismic and non-seismic landslides, particularly in contexts where multi-factor coupling influences landslide occurrences. We appreciate your expertise and the importance of this consideration in landslide susceptibility research.

In our manuscript, while we mention both seismic and non-seismic landslides, our primary focus has been on seismic-induced landslides, given the specific objectives and scope of our study. The data utilized for seismic landslides was meticulously compiled from prior research, ensuring a robust basis for our analysis. For the purposes of this study, we concentrated on metrics that are particularly effective in assessing the impact of seismic activity on landslide susceptibility. These include factors such as peak ground acceleration (PGA), distance from seismic faults, and geological and topographical vulnerabilities that are known to interact significantly with seismic triggers.

Regarding non-seismic landslides, while they are acknowledged within the scope of our research, a detailed study involving the diverse triggers and metrics specifically applicable to such landslides was beyond the immediate scope of this paper. Our evaluation metrics were selected to optimally address the research questions pertaining to seismic impacts, based on the availability and reliability of the data we had.

We recognize the importance of considering a wider array of factors in comprehensive landslide studies, especially where multi-factor coupling plays a significant role. In future work, as we expand the breadth of our research, we will certainly consider incorporating a broader suite of evaluation metrics tailored to different types of landslide triggers, including non-seismic ones, to provide a more holistic view of landslide dynamics.

Thank you again for your constructive feedback, which not only deepens the discussion within our study but also helps outline potential directions for future research. We look forward to exploring these aspects more comprehensively in subsequent studies.

 

10. The manuscript describes normalizing and summing the importance of each factor obtained from the machine learning model, but does not detail the specific normalization and summation methods. These methods may vary depending on the model type and the problem, so more specific information should be provided.

Thank you for your insightful comment on the need for more specific details regarding the normalization and summation methods used for the importance scores derived from the different machine learning models utilized in our study. We recognize the importance of detailing these methods to ensure transparency and reproducibility in our research.

To address your concerns, we would like to clarify that the normalization process applied to the importance scores from each of the nine machine learning models—Logistic Regression, XGBoost, LightGBM, RandomForest, AdaBoost, GaussianNB, ComplementNB, Multilayer Perceptron, and Support Vector Machine—was standardized to allow for a fair comparison across these diverse models. Each model's feature importance scores were normalized by dividing by the sum of all scores within that model, ensuring that the sum of normalized scores for each model equaled one. This method helps to standardize the contribution of each factor across models that may inherently produce importance scores on different scales.

The summation of these normalized scores across all models provided a cumulative importance score for each evaluation factor. This cumulative score represents a comprehensive measure of each factor's importance across all models, offering a robust perspective on the factors most impactful in predicting landslides. The section that has been revised is Section 4.3.2, titled 'Evaluation factors for machine learning modeling', specifically at lines 382-391 of the manuscript.
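The per-model normalization and cross-model summation described in this response can be sketched as follows; the factor names and score arrays below are illustrative placeholders, not the study's actual importance values.

```python
# A minimal sketch of per-model normalization and cross-model summation.
# Factor names and raw score arrays are illustrative only.
import numpy as np

factors = ["slope", "PGA", "lithology", "distance_to_fault"]
raw_importances = {                       # one raw score array per model
    "RandomForest": np.array([120.0, 300.0, 60.0, 95.0]),
    "XGBoost":      np.array([0.22, 0.41, 0.12, 0.25]),
    "LogisticReg":  np.array([1.3, 2.8, 0.6, 1.1]),  # |coefficients|
}

cumulative = np.zeros(len(factors))
for scores in raw_importances.values():
    cumulative += scores / scores.sum()   # each model now sums to one

# Cumulative score: total normalized importance of each factor across models.
for name, score in sorted(zip(factors, cumulative), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

Dividing by the per-model total puts models that emit importances on very different scales (tree split gains versus coefficient magnitudes) onto a common footing before summation.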

11. The manuscript mentions the improvement of prediction values and enhancement of model stability without elaborating on how the interpretability and stability of the results are evaluated and compared. More information on this aspect should be provided.

Thank you for your constructive feedback regarding the need for a clearer elaboration on how the interpretability and stability of the results are evaluated and compared in our manuscript. We appreciate your attention to detail and your expertise in highlighting this essential aspect of our analysis.

In our manuscript, we aimed to demonstrate the improvement in prediction values and the enhancement of model stability, primarily through the use of AUC values, model calibration curves, and decision curve analysis (DCA). However, we acknowledge that the description of how these measures contribute to the interpretability and stability of the models could have been more detailed.

To address this, we have revised the relevant sections to include a more thorough explanation of how these metrics reflect the interpretability and stability of the machine learning models used in our study. We believe this enhancement will provide readers with a clearer understanding of why these particular metrics were chosen and how they contribute to validating the effectiveness of the evaluated factors in our models. The section that has been revised is Section 5.1, titled 'Model Validation', specifically at lines 546-560 of the manuscript.
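Two of the measures named in this response, AUC and the calibration curve, can be computed as in the following hedged sketch; decision curve analysis (DCA) has no standard scikit-learn helper and is omitted, and the synthetic dataset is illustrative only.

```python
# A minimal sketch computing AUC (discrimination) and a calibration
# curve (agreement between predicted probabilities and observed
# frequencies) on a synthetic binary problem.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=8, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

proba = (LogisticRegression(max_iter=1000)
         .fit(X_tr, y_tr)
         .predict_proba(X_te)[:, 1])

auc = roc_auc_score(y_te, proba)
prob_true, prob_pred = calibration_curve(y_te, proba, n_bins=10)

print(f"AUC = {auc:.3f}")
for p, o in zip(prob_pred, prob_true):
    print(f"predicted {p:.2f} -> observed {o:.2f}")
```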

 

We believe that these revisions will address your concerns and significantly improve the quality and clarity of our manuscript.  We are dedicated to advancing the research in this critical area and appreciate the role your feedback has played in refining our work.

Best regards,

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

General comments

 

The paper sustainability-2955587 implements several machine learning algorithms and employs several comparison methods and graphs. The relationship between the presented ML information and the earthquake data is ambiguous: given the term "earthquake" in the title, one would expect a focus on earthquakes. The paper is 32 pages, and the presented data do not justify this length, which distracts the reader.

 

The paper should be shortened to the important findings and methods; some material could be moved to an appendix. The same applies to the many figures. The paper can be enhanced, and it should be.

 

At this phase, major revision is recommended.

 

 

Specific Comments

 

 

1.The title should be changed relatively to the main content.

2. The Introduction is unnecessarily long. The aims and the research questions should be clearly presented.

3. Kindly use Ms = 6.6 everywhere.

4. The URL data should be presented as references.

5. The enumerated equations are written in fonts that are inconsistent across the document (including the equation numbers). This should be fixed.

6. Figures 4 and 5 are blurred when zoomed. Please increase the dpi.

7. Please choose other colours in Figure 6.

8. For all figures with curves: the curves should be statistically based. Simple point-to-point connection is not adequate. Please take care of this.

9. The inclined text in Figure 7 is taken from the image and should be fixed. The small circles cannot be visualised even at high zoom. Please consider remaking this figure.

10. Figure 9 has the same problem as Figure 7: text is taken from the image and everything is blurred. Subfigures d and e are very mixed, the colours are not well discriminated, the curves are simple line connections, the legends are part of the image, and the legend in d sits on the points.

11. The Results section is only a few lines! Please consider that all the preceding data refer to results.

12. Discussion and comparison with literature findings are necessary.

13. The references are not in MDPI's format.

Author Response

Thank you for your thorough review and the insightful observations regarding our manuscript (sustainability-2955587). We greatly appreciate the time and effort you have invested in evaluating our work and providing detailed feedback. Your comments have given us valuable direction on how to improve the clarity and focus of our paper, particularly concerning the integration of machine learning algorithms with earthquake data. We recognize the issues you have raised about the length of the paper and the clarity of its focus on earthquake-related analyses, which you rightly noted should be more pronounced given the title of our manuscript.

Please find below our responses to each of the specific recommendations:

1.The title should be changed relatively to the main content.

Thank you very much for your careful reading of our manuscript and for your insightful comments. We appreciate your suggestion regarding the alignment of the title with the main content of our paper.

Upon reflection, we agree that the title could more precisely reflect the core of our research and its implications. Accordingly, we have revised the title to better capture the essence of our study and its contribution to the field. The new title is: "Enhancing Seismic Landslide Susceptibility Analysis for Sustainable Disaster Risk Management through Machine Learning". We believe this adjustment not only clarifies the focus of our research but also aligns more closely with the objectives and outcomes discussed within the paper.

Thank you once again for your constructive feedback, which has undoubtedly helped in refining our work.

 

2. The Introduction is unnecessarily long. The aims and the research questions should be clearly presented.

Thank you for your constructive feedback regarding the length of the Introduction and the presentation of the research aims and questions. We acknowledge the need for brevity and clarity in setting the stage for our research. To address your concerns, we have condensed the Introduction and explicitly defined the aims and research questions early in the text to improve readability and focus.

We have streamlined the content by integrating detailed discussions of specific landslide factors and modeling approaches into subsequent sections, such as the "Research Area and Data Collection" and "Methods" sections. This reorganization helps maintain the necessary depth of information without overburdening the Introduction.

Thank you once again for your guidance, which has significantly enhanced the manuscript. The section that has been revised is Section 1, titled 'Introduction', specifically at lines 61-65 and 78-90 of the manuscript.

 

3. Kindly use Ms = 6.6 everywhere.

Thank you for your attentive reading of our manuscript and for your constructive comment regarding the consistency of earthquake magnitude notation. We agree that standardizing the earthquake magnitude notation throughout our paper enhances clarity and conforms to scientific communication norms.

 

 

4. The URL data should be presented as references.

Thank you very much for your insightful comments. Regarding your suggestion in point 4, concerning the presentation of URL data as references, we fully agree and appreciate the importance of this aspect. Indeed, a detailed and standardized reference format not only enhances the academic rigor of our article but also facilitates readers' access to original data sources.

ASTER Global Digital Elevation Model. Available online: https://www.gscloud.cn (accessed on 12 February 2024).

China Geological Archives. Available online: http://www.ngac.cn (accessed on 2 February 2024).

Landsat 8 Satellite Data. Available online: https://www.gscloud.cn (accessed on 8 January 2024).

Resource and Environmental Science Data Center of the Chinese Academy of Sciences. Available online: http://www.resdc.cn (accessed on 17 December 2023).

GLOBELAND30. Available online: http://www.globallandcover.com (accessed on 22 February 2024).

 

5. All enumerated equations are written in fonts that are not consistent everywhere (also for the equation numbers). This should be fixed.

Thank you for your insightful comments regarding the formatting of the enumerated equations in our manuscript. We appreciate your attention to detail and agree that maintaining consistency in font style and size across all elements of the document enhances readability and overall presentation.

To address your concern, we have reviewed all the enumerated equations and adjusted the font settings to ensure uniformity with the rest of the document. Specifically, we have standardized the font type and size for both the equations and their corresponding numbers, ensuring they match the document’s main text.

 

6. Figures 4 and 5 are blurred when zoomed. Please increase the dpi.

Thank you for your constructive feedback concerning the clarity of Figures 4 and 5 in our manuscript. We understand the importance of providing high-quality figures for both clarity and detailed examination.

In response to your comments, we have revised Figures 4 and 5 by increasing the resolution to a higher dpi setting. This adjustment ensures that these figures remain clear and detailed, even when significantly zoomed in. We hope that these improvements address your concerns and enhance the visual support for our findings.

We appreciate your attention to this detail and your help in improving the presentation of our work.

 

7. Please choose other colours in Figure 6.

Thank you for your feedback on Figure 6. We have taken steps to enhance the figure by increasing its clarity and optimizing the layout and size to ensure that the attributes are more distinguishable. We believe these enhancements will greatly improve the readability and effectiveness of the visual presentation. Please review the updated version of Figure 6.

 

8. For all figures with curves: the curves should be statistically based. The simple connection of points is not adequate. Please take care of this.

Thank you for your comments and suggestions regarding the graphical representations in our manuscript. Regarding your point about ensuring that curves are statistically based, we have carefully considered and implemented this in our presentation.

For clarity, Figure 3 in our manuscript is designed as a stacked line graph because it illustrates categorical comparisons across different groups. The use of a line graph here aids in effectively visualizing how each category contributes to the total across different conditions. Similarly, Figure 6 employs a line graph to represent the quantities corresponding to different partitions. This format was chosen to clearly demonstrate trends and distributions across the partitions, enhancing the interpretability of the data. For all other figures involving curves, we have ensured that these are based on rigorous statistical methods to accurately reflect the underlying data trends. These curves are not merely connections of points but are derived from appropriate statistical analyses ensuring their reliability and validity.

We appreciate your attention to the details and your valuable feedback which has undoubtedly helped in improving the quality and accuracy of our figures.

 

 

9. The inclined text in Figure 7 is taken from the image. It should be fixed. The small circles cannot be visualised even at great zoom. Please consider remaking this one.

Thank you for your constructive feedback regarding Figure 7 in our manuscript. We appreciate your attention to detail and agree with your observations.

Regarding the inclined text in Figure 7, we acknowledge that this has compromised the clarity of the presentation. We will correct the orientation of the text to ensure it is horizontally aligned and easily readable. We also note your concern about the visibility of the small circles within the figure. To address this, we will increase the size of these circles and enhance their contrast against the background to ensure they are visible without requiring excessive zooming.

We will implement these modifications and provide a revised version of Figure 7 that meets the standards for clarity and visibility. Thank you for helping us improve the quality of our figures.

 

 

10. In Figure 9 there is the same problem as in Figure 7: text taken from the image. Everything is blurred. Subfigures d and e are very mixed. The colours are not discriminated. The curves are line-connecting. The legends are from the image. The legend in d is on points.

Thank you for your insights regarding Figure 9 in our manuscript. We appreciate your guidance and have implemented several revisions to address the issues you raised.

We have enhanced the resolution of the figure to improve clarity, ensuring that all elements, including text and subfigures, are sharp and not blurred. Upon reviewing the relevance and impact of subfigures d and e in Figures 8 and 9, we determined that they did not significantly contribute to the manuscript's findings. Consequently, we have removed these subfigures to streamline our presentation and focus on more impactful data. In addressing the challenge of color differentiation among nine treatments, we initially selected distinctly different colors. For those lines that were still similar in hue, we have made further adjustments to their shades to ensure clear discrimination between them. All line charts now utilize curved lines to enhance the visual distinction and readability of the data.

Legends have been revised and clearly positioned to avoid overlapping with data points, ensuring they are legible and do not obstruct any crucial information. These modifications enhance the visual quality and interpretability of Figure 9, making the data presentation more effective and accessible. Thank you for helping us enhance the quality of our figures.

 

11. The Results section is only a few lines! Please consider that all the previous data refer to results.

Thank you for your valuable feedback regarding the Results section of our manuscript. Upon reflection, we recognize that our presentation may have segmented the results throughout different sections of the paper, which might have led to the perception of an understated Results section. We agree with your suggestion and have revised the manuscript to explicitly incorporate all relevant findings as part of the Results section. This adjustment ensures a more coherent presentation of all data that contribute to the understanding of our study's outcomes. The revised section is Section 5, titled 'Results', specifically at lines 555-560 and 571-574 of the manuscript.

 

12. Discussion and comparisons to literature findings are necessary.

Thank you for your insightful suggestion to include a discussion and comparison with literature findings in our manuscript. We appreciate the importance of situating our research within the broader scientific context and have revised the manuscript accordingly. In the updated version, we have added a detailed discussion section that compares our results with existing literature, highlighting both consistencies and discrepancies. We believe this enhancement strengthens the manuscript and provides a clearer contribution to the field. The revised section is Section 6, titled 'Discussion', specifically at lines 608-612 and 617-625 of the manuscript.

 

13. The references are not in MDPI's format.

Thank you for your constructive feedback regarding the format of the references in our manuscript. We acknowledge the oversight in aligning the reference style with MDPI's guidelines. We will thoroughly revise all citations to comply with the MDPI reference format.

 

We have carefully considered each of your points and have undertaken a significant revision to better align the manuscript with the expectations of the journal and its readers. We have condensed the text to focus sharply on the key findings and methods, relocating supplementary material to an appendix and streamlining the number of figures to enhance readability and coherence.

We believe these changes address your concerns effectively and improve the overall quality and impact of our paper. We are eager to receive further feedback and hope that our revised manuscript will now meet the journal’s standards for publication.

Thank you once again for your constructive feedback and guidance. We look forward to your further evaluation and are optimistic about the improvements made.

Best regards,

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper deals with an important issue of  the earthquake-induced landslide vulnerability analysis and it's implementation in the process of machine learning.  I have some questions and recommendations to the authors.

1) First of all, I should note that any  statistical analysis is to be based on serious scientific background. It means that all of the factors which are filtered by  GeoDetector should be previously analyzed by a researcher.

Arbitrary noise is to be removed to a significant extent before any analyzing procedure, in my opinion.

2) I do not understand the following statement (Lines 105-106): "... GeoDetector reveals the driving forces behind spatial characteristics by analyzing quantitative types". I think that physical phenomena (or processes) should be investigated and only then one can fulfill any analyzing procedures.

3) Figure 1(c) should be clarified (the topography, lithology, etc. of this region should be briefly described), in my opinion.

4) I think that Relation (1) should be justified. Such a detector cannot ensure a highly reliable estimate for the prediction of landslides.

5) Relation (2) should be corrected: "In" must be changed to "Ln". Moreover, I think that a linear relationship between dependent and independent variables cannot provide high accuracy of the approximate solution to an optimization problem.

6) Formula (5) should be retyped: the symbols are too small and poorly readable.

7) Table 2 should be accompanied, in my opinion, by a more detailed description of the most important factors which are involved in the confusion matrix's calculation.

8) Fig. 2 contains too many maps, in my opinion. That is why this Figure is poorly perceived by a reader.

9) The stacking line graph (Fig. 3) should be clarified and smoothed if possible.

10) Please give a more detailed comment on Fig. 5.

11) Line 178 should be corrected: "..Figure 6m. mainly...".

12) Fig. 6 contains too many images. The diagrams drawn in this Figure are poorly readable. I recommend that the authors reduce the total number of images and keep the most significant drawings for understanding the results of the mathematical experiment.

13) Fig. 8-9 should be redrawn. The number of graphs given in these Figures should be reduced dramatically. Otherwise it will be difficult to draw any conclusions from the tests described by the authors.

14) I think that the methods used in the scope of inverse and ill-posed problems in geophysics could contribute to the machine learning addressed to predict seismic-induced landslides.


Comments on the Quality of English Language

The English language is fine.

Author Response

Dear Reviewer,

We would like to express our sincere gratitude for your thoughtful and insightful feedback on our manuscript regarding earthquake-induced landslide vulnerability analysis and its integration with machine learning methodologies. Your expertise in this field is evident, and we greatly appreciate the time and effort you have dedicated to reviewing our work.

Your comments serve as valuable guidance for further improving the quality and rigor of our research. We have carefully considered each of your points and have made revisions accordingly to enhance the clarity and robustness of our findings.

Please find below our responses to each of the specific recommendations:

1) First of all, I should note that any statistical analysis is to be based on serious scientific background. It means that all of the factors which are filtered by GeoDetector should be previously analyzed by a researcher.

Arbitrary noise is to be removed to a significant extent before any analyzing procedure, in my opinion.

Thank you for your insightful feedback regarding the necessity of grounding our statistical analysis in a solid scientific framework. We concur with your assertion about the importance of pre-analyzing factors utilized in GeoDetector to ensure their scientific validity and relevance to our study on landslides.

In our manuscript, we have undertaken a thorough literature review to identify and select these factors based on their documented influence on landslide occurrences. These factors have been chosen to maximize relevance and minimize extraneous noise. We agree with your suggestion and have made slight revisions to our manuscript to explicitly clarify this rigorous selection process. We trust these changes address your concerns effectively. The revised section is Section 2.2, titled 'Data', specifically at lines 116-123 of the manuscript.

 

2) I do not understand the following statement (Lines 105-106): "... GeoDetector reveals the driving forces behind spatial characteristics by analyzing quantitative types". I think that physical phenomena (or processes) should be investigated and only then one can fulfill any analyzing procedures.

Thank you for your insightful comments and for highlighting the need for clarity in our description of the GeoDetector's methodology. We appreciate your suggestion to emphasize the investigation of physical phenomena or processes before analyzing the spatial characteristics.

In response to your comment, we agree that it is essential to underline that the GeoDetector method not only analyzes but also incorporates the investigation of physical phenomena as part of its framework. This is indeed crucial in understanding the underlying spatial disparities and driving forces. We have revised the relevant section of our manuscript to better reflect this process and to ensure the methodology is articulated more transparently.

We hope that our revisions adequately address your concerns and provide a clearer understanding of the GeoDetector's capabilities and applications. The revised section is Section 1, titled 'Introduction', specifically at lines 67-70 of the manuscript.

 

3) Figure 1(c) should be clarified (the topography, lithology, etc. of this region should be briefly described), in my opinion.

Thank you very much for your thoughtful feedback regarding Figure 1(c). We highly value your recommendation to enhance the clarity of the topography and lithology depicted in this figure. Accordingly, we have updated Section 2.1, "Research Area," to include a more comprehensive discussion of the geological and topographical characteristics of the region. We would like to clarify that the intricate interplay between lithology and terrain in the region complicates their simultaneous representation in a single figure. Therefore, Figure 1(c) is dedicated to depicting the topography, to maintain visual clarity and focus. Detailed discussions on lithology and other geological factors are subsequently presented in Figure 2. We trust these revisions will address your concerns and improve the manuscript's overall clarity and precision. The revised section is Section 2.1, titled 'Research Area', specifically at lines 78-101 of the manuscript.

 

4) I think that Relation (1) should be justified. Such a detector cannot ensure a highly reliable estimate for the prediction of landslides.

Thank you for your valuable feedback concerning the justification of Relation (1) and its relevance in predicting landslides with high reliability. Your insight is crucial for enhancing the clarity and rigor of our manuscript.

We acknowledge your concern regarding the efficacy of the GeoDetector method in ensuring a high-reliability estimate for the prediction of landslides. As you correctly pointed out, the equation (Relation 1) detailing the relationship between independent variables and the dependent variable (landslide occurrence) needs further clarification to substantiate its reliability and relevance.

We have revised the relevant section to provide a more comprehensive explanation of how the GeoDetector statistically assesses the impact of independent variables on landslide susceptibility, emphasizing its capabilities and limitations in detecting spatial correlations and their implications on predictive reliability.

We hope our revisions will address your concerns effectively, and we appreciate your guidance in refining our work. The revised section is Section 3.1, titled 'GeoDetector', specifically at lines 153-158 of the manuscript.
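For readers unfamiliar with Relation (1), the factor detector at the core of GeoDetector compares within-stratum variance to total variance of the response. The following is a minimal sketch of the standard q-statistic formulation, not the authors' actual implementation; the function name and the toy data are illustrative only.

```python
from statistics import pvariance

def geodetector_q(values, strata):
    """q = 1 - sum_h(N_h * var_h) / (N * var): the share of the total
    spatial variance of the response explained by stratifying on one
    candidate factor. q lies in [0, 1]; higher q = stronger factor."""
    n, total_var = len(values), pvariance(values)
    groups = {}
    for v, s in zip(values, strata):
        groups.setdefault(s, []).append(v)
    # Within-stratum variance, weighted by stratum size N_h.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (n * total_var)
```

If a factor's strata perfectly separate landslide from non-landslide cells, the within-stratum variance vanishes and q reaches 1; a factor uncorrelated with occurrence yields q near 0.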

 

5) Relation (2) should be corrected: "In" must be changed to "Ln". Moreover, I think that a linear relationship between dependent and independent variables cannot provide high accuracy of the approximate solution to an optimization problem.

Thank you for your insightful feedback regarding the use of logistic regression in our study and for pointing out the necessary correction in the notation of Relation (2) from "In" to "Ln". We greatly appreciate your detailed attention to our methodology and the mathematical accuracy of our expressions.

In response to your concern about the linear relationship modeled by logistic regression, we have chosen this statistical method due to its robust applicability in binary classification tasks, which is central to our study's objective—assessing the probability of seismically induced landslides. Logistic regression is particularly advantageous for its simplicity and effectiveness in deriving initial insights from dichotomous outcomes based on a set of predictor variables. This model facilitates an understanding of how various independent variables might contribute to the likelihood of landslide occurrences.

However, we fully acknowledge that logistic regression constructs a linear relationship between the log odds of the dependent variable and the independent variables, which may not adequately capture the complex, non-linear interactions typical of geological phenomena. Geological processes involved in landslides can exhibit intricate dynamics that are not always linearly correlated, posing challenges in achieving high accuracy in predictive models. Therefore, although logistic regression offers a structured framework for preliminary analysis, it is crucial to recognize that the accuracy of predictions might vary based on the specific characteristics and interdependencies of the involved variables. The choice of this model was driven by the need for a methodologically sound approach that can be systematically applied to the binary classification of landslide occurrence and non-occurrence, providing a foundation for further, more complex analyses.

We have revised our manuscript to clarify these points and better explain the rationale behind the methodological choice, as well as to correct the mathematical notation error. Thank you again for your guidance, which is invaluable in enhancing both the scientific rigor and clarity of our research. The revised section is Section 3.2, titled 'Machine Learning Methods', specifically at lines 189-191 of the manuscript.
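As a small illustration of the log-odds relationship discussed above (the corrected "Ln" in Relation 2), the logistic model maps a linear combination of predictors to a probability of landslide occurrence. This sketch is purely illustrative; the function name, coefficients, and feature values are hypothetical, not the fitted values from the manuscript.

```python
import math

def logistic_probability(intercept, coeffs, features):
    """Logistic model: ln(p / (1 - p)) = b0 + b1*x1 + ... + bn*xn,
    so p = 1 / (1 + exp(-(b0 + sum(bi * xi)))). All b's hypothetical."""
    log_odds = intercept + sum(b * x for b, x in zip(coeffs, features))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients for, say, slope and distance-to-fault factors.
p = logistic_probability(-1.0, [0.8, -0.3], [2.0, 1.0])
```

The linearity the reviewer notes lives in the log-odds term; the sigmoid only rescales it to [0, 1], which is why interactions among factors are not captured unless they are added as explicit terms.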

 

6) Formula (5) should be retyped: too small symbols are poorly readable.

Thank you for your careful examination of our manuscript and for pointing out the readability issue with the symbols in Formula (5). We understand the importance of clear and accessible presentation of mathematical expressions to ensure that all readers can follow the analytical procedures and findings without difficulty.

We will resize the symbols in Formula (5) to enhance their readability in the revised manuscript. We appreciate your attention to detail, which undoubtedly improves the quality and clarity of our work.

 

7) Table 2 should be accompanied, in my opinion, by a more detailed description of the most important factors which are involved in the confusion matrix's calculation.

Thank you for your constructive feedback regarding the presentation of Table 2 in our manuscript. We appreciate your suggestion to provide a more detailed description of the factors involved in the computation of the confusion matrix.

We acknowledge that a clearer exposition of how each factor contributes to the calculations within the confusion matrix will enhance the reader's understanding and the transparency of our methodological approach. We will revise the corresponding section to include a more comprehensive explanation of the confusion matrix components and their implications for the performance evaluation of the machine learning models used in our study.

Thank you for helping us improve the clarity and detail of our work. The revised section is Section 3.4, titled 'Confusion Matrix', specifically at line 294 of the manuscript.
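To make the confusion-matrix components concrete, the four counts and the metrics derived from them for a binary landslide / no-landslide classification can be sketched as follows (illustrative only; the function name and toy labels are hypothetical, and each denominator is assumed non-zero):

```python
def confusion_metrics(y_true, y_pred):
    """Counts TP/TN/FP/FN for binary labels (1 = landslide) and derives
    the metrics typically reported alongside a confusion matrix."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true-positive rate (recall)
        "specificity": tn / (tn + fp),  # true-negative rate
    }
```

Note that sensitivity is computed from the positive (landslide) row of the matrix and specificity from the negative row, which is why the two are distinct quantities even when their values happen to coincide.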

 

 

8) Fig. 2 contains too many maps, in my opinion. That is why this Figure is poorly perceived by a reader.

Thank you very much for your constructive feedback regarding Figure 2. We completely agree that the current arrangement of multiple maps may compromise the figure’s clarity and hinder effective communication of the data. We appreciate your expertise in highlighting this issue, and in response, we are committed to revising the layout of Figure 2. Our aim is to enhance its readability by simplifying and clearly separating the maps, which we believe will facilitate a better understanding of the information presented.

We value your suggestions immensely as they assist in elevating the quality of our manuscript.

 

9) The stacking line graph (Fig. 3) should be clarified and smoothed if possible.

Thank you for your valuable feedback regarding the stacking line graph in Figure 3. We acknowledge your point about the need for greater clarity and a smoother presentation in this figure. We agree that enhancing these aspects will improve the overall effectiveness and readability of the graph. We appreciate your suggestion as it highlights a crucial area for improvement.

In response, we plan to revise Figure 3 by replotting the stacking line graph to ensure that it is both clearer and smoother. The choice of a line graph format was intentional to effectively display the results across various categories over time, facilitating a more straightforward interpretation of trends. We will carefully adjust the visualization to address your concerns while preserving the integrity and clarity of the data.

Thank you once again for your insightful comments, which significantly contribute to the refinement of our manuscript.

 

10) Please give a more detailed comment on Fig. 5.

Thank you for requesting a more detailed commentary on Figure 5. We recognize the importance of clarity in presenting our results and appreciate the opportunity to elaborate on the insights obtained from this figure.

In response to your feedback, we have thoroughly revised the manuscript to enhance the description of Figure 5. This revision aims to clarify the differences in AUC values and the implications of these findings for our analysis. We believe that the additional details will facilitate a better understanding of the comparative effectiveness of the assessment factors used in our study.

We value your suggestions and believe they significantly contribute to improving the quality of our manuscript. The revised section is Section 4.4, titled 'Factor Screening', specifically at lines 461-466 of the manuscript.
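For readers comparing the AUC values discussed above, the statistic has a simple rank-based interpretation: the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen non-landslide sample. A minimal sketch of that computation (function name and sample scores are hypothetical, not the study's data):

```python
def auc_score(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs where the
    positive sample is scored higher; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to a model that ranks samples no better than chance, which is why differences between factor sets are read off the gap above 0.5 rather than the raw value alone.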

 

11) Line 178 should be corrected: "..Figure 6m. mainly...".

Thank you for your astute observation regarding the correction needed in Line 178. We appreciate your dedication to ensuring the accuracy and clarity of our manuscript. As part of our revisions based on previous feedback, we had already removed the sentence containing the reference to "Figure 6m. mainly." We have conducted a thorough review of the manuscript to ensure that such inaccuracies have been corrected throughout the document.

We are grateful for your guidance in helping us maintain high standards of precision and scholarly rigor. Your feedback is instrumental in enhancing the quality of our work.

Thank you once again for your meticulous attention to detail and valuable suggestions.

 

12) Fig. 6 contains too many images. The diagrams drawn in this Figure are poorly readable. I recommend that the authors reduce the total number of images and keep the most significant drawings for understanding the results of the mathematical experiment.

Thank you for your constructive feedback regarding the readability of the figures presented in our manuscript. We appreciate your suggestion to streamline the number of images to enhance clarity and focus on the most significant results of our mathematical experiment.

In response to your recommendation, we have conducted a thorough analysis and decided to retain figures 6a, 6b, 6c, and 6d in the main text. We believe this revision improves the manuscript’s readability and allows the main results to be communicated more effectively, without losing the supportive evidence that the additional figures provide.

Thank you once again for your valuable insights, which have greatly contributed to the refinement of our presentation. The revised section is Section 4.5.1, titled 'Single Factor Analysis', specifically at lines 486-500 of the manuscript.

 

13) Fig. 8-9 should be redrawn. The number of graphs given in these Figures should be reduced dramatically. Otherwise it will be difficult to draw any conclusions from the tests described by the authors.

Thank you for your critical feedback concerning Figures 8 and 9 in our manuscript. We appreciate your expertise and agree that the effectiveness of our presentation can be significantly enhanced by simplifying these figures.

In response to your suggestion, we have reassessed the content of Figures 8 and 9 and decided to remove panels c and d from each figure, as our further analysis confirmed that the differences illustrated in these panels were not substantial enough to warrant their inclusion. This change not only addresses your concern about the number of graphs but also focuses the reader's attention on the most relevant and discernible results.

We believe that this revision will facilitate a clearer understanding of the conclusions drawn from our tests and enhance the overall coherence of our findings. Thank you once again for your invaluable guidance, which has greatly assisted us in improving the clarity and impact of our work.

 

14) I think that the methods used in the scope of inverse and ill-posed problems in geophysics could contribute to the machine learning addressed to predict seismic-induced landslides.

Thank you for your insightful suggestion regarding the application of methods from inverse and ill-posed problems in geophysics to enhance our machine learning approach for predicting seismic-induced landslides. We value your expertise and recognize the potential of integrating these advanced geophysical methods to refine our predictive models.

In response to your recommendation, we will explore how these methods can be adapted and implemented within our existing framework to potentially improve the accuracy and robustness of our predictions. This exploration will aim to leverage the strengths of both fields to address the complexities associated with seismic-induced landslide forecasting more effectively.

We appreciate your constructive feedback, which opens new avenues for enriching our research and advancing the application of machine learning in geophysical problem-solving. Thank you once again for your valuable contribution to enhancing the depth and scope of our study.

Once again, we extend our heartfelt appreciation for your constructive criticism and scholarly insights. Your contributions have undoubtedly strengthened the scholarly integrity of our work, and we are confident that the revised manuscript will make a meaningful contribution to the field of earthquake-induced landslide vulnerability analysis and machine learning integration.

Best regards,

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have made comprehensive revisions according to each editing suggestion, resulting in a clear logical flow and in-depth content. Therefore, I agree to its acceptance.

Comments on the Quality of English Language

Minor editing of English language required

Reviewer 2 Report

Comments and Suggestions for Authors

The paper sustainability-2955587 is very good research and deserves publication. It presents significant issues of general interest. I suggest its publication.

 

Since there are some minor problems, the authors may solve them during proof correction.

 

Line 113: Investigating

Line 165: refer

Line 189: Needs a space between sentences.

Line 201: I denotes

Line 263: By Jennifer [33]

Table 1: Specificity and Sensitivity have the same meaning, which should be fixed. Specificity is, I think, the probability of the true negatives, that is, the true negatives that were found to be such.

 
