Article
Peer-Review Record

A GIS-Based Landslide Susceptibility Mapping and Variable Importance Analysis Using Artificial Intelligent Training-Based Methods

Remote Sens. 2022, 14(1), 211; https://doi.org/10.3390/rs14010211
by Pengxiang Zhao 1, Zohreh Masoumi 2,3, Maryam Kalantari 2, Mahtab Aflaki 2 and Ali Mansourian 1,4,*
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 17 November 2021 / Revised: 16 December 2021 / Accepted: 31 December 2021 / Published: 4 January 2022

Round 1

Reviewer 1 Report

General Comments:

Firstly, this research investigated GIS-based LSM by comparing a CNN model with four conventional models: RF, ANN, SVM and LR. The results illustrate that the five machine learning (ML) methods performed well, especially the RF model. Secondly, this research applied the permutation-based variable accuracy importance (PVAI) method to assess the importance of each feature for the five ML models, taking the decrease in sensitivity as the measure of feature importance. Thirdly, the LSMs generated by the five ML models can help decision makers in landslide management and risk analysis. In general, the structure of this paper is relatively complete and the expression is clear.

However, there are also some specific problems in this paper, as shown below.

Specific Comments:

  1. In Section 2.3, please explain how to obtain non-landslide samples in this research? And please specify how to handle landslide samples before ML processing?
  2. Please explain the principle of machine learning methods used in this research in detail.
  3. Please state the difference between CNN method and conventional ML methods. And what are the advantages of DL methods applied in the landslide susceptibility mapping?
  4. In this research, sixteen causative factors of landslides were prepared. How to ensure that these factors are not multicollinearity?
  5. In addition to calculating the feature importance, what else was done to the factors? Please specify it.
  6. In Figure 7, what is the meaning of the width of the box in the boxplot? In Figure 7 (a), (b) and (c), apart from the factors of slope and curvature, the value of the factor importance is close to 0. Does this phenomenon indicate that these factors are useless in modelling? How to explain this phenomenon?
  7. Please specify how to calculate the value of PVAI?

Author Response

Ref: remotesensing-1490914 

Title: A GIS-based landslide susceptibility mapping and variable importance analysis using artificial intelligent training-based methods  
Journal: Remote Sensing

Response to Editor and Reviewers’ Comments

 

Dear Editors

We are grateful for your consideration of this manuscript. We also thank the reviewers for their careful reading of our text. All the comments we received on this study have been taken into account in improving the quality of the article, and we present our reply to each of them separately.

We attach our detailed report where we have answered each point from the reviewers, one by one, to show how we have revised and improved the manuscript. We have also submitted the revised, updated, and restructured version of the manuscript via the online system. 

The report below is rather detailed because, as the reviewers have made great efforts to advise us on how to improve the manuscript, we feel obliged to address each of their comments in depth. We greatly appreciate the valuable comments made by the reviewers. We use blue font for our responses, and highlight the revisions in our revised manuscript in red.

Sincerely,

The corresponding author

 

 

Reviewer #1:

Firstly, this research investigated GIS-based LSM by comparing a CNN model with four conventional models: RF, ANN, SVM and LR. The results illustrate that the five machine learning (ML) methods performed well, especially the RF model. Secondly, this research applied the permutation-based variable accuracy importance (PVAI) method to assess the importance of each feature for the five ML models, taking the decrease in sensitivity as the measure of feature importance. Thirdly, the LSMs generated by the five ML models can help decision makers in landslide management and risk analysis. In general, the structure of this paper is relatively complete and the expression is clear.

However, there are also some specific problems in this paper, as shown below.

Specific Comments:

  1. In Section 2.3, please explain how to obtain non-landslide samples in this research? And please specify how to handle landslide samples before ML processing?

Response: Thanks for your comment. We have added the description about this issue in the first paragraph of page 10, in section 3.3 as below:

“Here, the landslides in the inventory map are recorded as points along with their area. Then, landslides with large areas were selected as landslide points for training. In our study, we verified this map using accessible aerial photos and Landsat images, and also the morphological shape of the area. So, the landslide points and their positions have been checked visually to eliminate suspect points. For instance, points located in low-slope areas such as agricultural lands or plains were eliminated from the landslide point set. Non-landslide points were prepared assuming that landslides would not occur on slopes of less than 5 degrees. So, non-landslide points were selected randomly from these areas.”
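The sampling rule quoted above (random non-landslide points drawn from cells with slope below 5 degrees) can be sketched as follows. This is only an illustration under stated assumptions: the slope raster is a toy array of random values, and the function name `sample_non_landslide_points`, the grid size, and the number of points are hypothetical rather than taken from the manuscript.

```python
import numpy as np

def sample_non_landslide_points(slope_deg, n_points, max_slope=5.0, seed=0):
    """Pick n_points random (row, col) cells whose slope is below max_slope degrees."""
    rng = np.random.default_rng(seed)
    # Candidate cells: everything flatter than the threshold
    rows, cols = np.nonzero(slope_deg < max_slope)
    if len(rows) < n_points:
        raise ValueError("not enough low-slope cells to sample from")
    # Sample candidate indices without replacement so no cell is picked twice
    chosen = rng.choice(len(rows), size=n_points, replace=False)
    return np.column_stack([rows[chosen], cols[chosen]])

# Toy slope raster in degrees (random values, purely illustrative)
slope_deg = np.random.default_rng(42).uniform(0.0, 40.0, size=(100, 100))
points = sample_non_landslide_points(slope_deg, n_points=50)
```

Each row of `points` is a (row, column) index into the raster; converting these to map coordinates would depend on the georeferencing of the actual slope layer.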

  2. Please explain the principle of machine learning methods used in this research in detail.

Response: Thanks for the suggestion. We have added a section that introduces the principles of the selected machine learning methods (Section 4.1).

  3. Please state the difference between CNN method and conventional ML methods. And what are the advantages of DL methods applied in the landslide susceptibility mapping?

Response: Thanks for the comment. As a class of artificial neural networks, CNN is designed to automatically and adaptively learn features through backpropagation by using convolution layers, pooling layers, and fully connected layers. The major difference between the CNN method and conventional ML methods is the convolution operation, whose aim is to extract different features of the input layer. Hence, compared with conventional ML methods, which classify the input data directly and cannot uncover more representative features from these data, CNN can automatically and adaptively learn features from the input data to further improve classification accuracy. This difference has been introduced in Section 4.1.5 Convolutional neural network (CNN).

  4. In this research, sixteen causative factors of landslides were prepared. How to ensure that these factors are not multicollinearity?

Response: Thanks for pointing it out. We examined the multicollinearity of the factors by calculating the variance inflation factor (VIF). The VIF of one independent variable represents how well the variable is explained by the other independent variables. The VIF values of all the factors are within the range between 1 and 2, which implies that multicollinearity doesn’t exist between these independent variables or factors. The following text has been added to the manuscript:

 

“After calculating all the causative factors, we implement multicollinearity analysis to examine the correlations between these causative factors by calculating the variance inflation factor (VIF). The VIF of one variable represents how well the variable is explained by the other variables, and it has been widely used for multicollinearity analysis in different applications (e.g., Zhao et al., 2018; Wang et al., 2019). The VIF values of all factors are within the range between one and two, far less than 10. This suggests that there is no multicollinearity among these causative factors.”
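As a rough illustration of the VIF check described above: for factor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing factor j on the remaining factors, and the same values appear on the diagonal of the inverse of the factors' correlation matrix. The sketch below uses that identity with NumPy only. The data are synthetic, and the variable names (`slope`, `curvature`, `near_copy`) are assumptions for the example, not the authors' layers or code.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = samples, columns = factors)."""
    X = np.asarray(X, dtype=float)
    # The VIFs equal the diagonal of the inverse of the correlation matrix
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(0)
slope = rng.normal(size=500)
curvature = rng.normal(size=500)
# Hypothetical third factor built to be almost a copy of slope
near_copy = slope + 0.05 * rng.normal(size=500)

# Two unrelated factors: VIFs stay close to 1
vif_independent = variance_inflation_factors(np.column_stack([slope, curvature]))
# Adding the near-duplicate inflates the VIF of slope far beyond 10
vif_collinear = variance_inflation_factors(
    np.column_stack([slope, curvature, near_copy])
)
```

Values between 1 and 2, as reported in the response, correspond to the first case; the second case shows what a violation of the "less than 10" rule of thumb looks like.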

  5. In addition to calculating the feature importance, what else was done to the factors? Please specify it.

Response: Thanks for pointing it out. We also implemented the multicollinearity analysis for the factors, as mentioned in the previous comment.

  6. In Figure 7, what is the meaning of the width of the box in the boxplot? In Figure 7 (a), (b) and (c), apart from the factors of slope and curvature, the value of the factor importance is close to 0. Does this phenomenon indicate that these factors are useless in modelling? How to explain this phenomenon?

Response: Thanks for the very good comment. Since each feature is permuted 10 times in the test dataset, we obtain 10 PVAI values per feature. The 10 PVAI values of each feature are presented as a boxplot. The width of each box represents the range of PVAI values. If the value of the factor importance is close to 0, it means the corresponding factor is less important in the classification of the ML model.

 

However, it doesn’t mean these factors are useless in the modeling. Here we take ANN as an example. If we remove the factor topographic curvature from the input data and estimate the PVAI values of the remaining factors, the result is like the following figure. It can be observed that the PVAI value/importance of each factor has changed, which implies that they are also related to landslides. Therefore, the PVAI value of one factor only reflects its degree of importance relative to all the factors. If we remove one important factor (e.g., topographic curvature) from the input, some less important factors could appear as important factors.

 

  7. Please specify how to calculate the value of PVAI?

Response: Thanks for pointing it out. We introduced how to calculate permutation-based variable accuracy importance (PVAI) in Section 4.3. The rationale of the method is that the importance of a feature is calculated by comparing the variation in the performance (i.e., accuracy in this study) of a classifier when the feature is randomly permuted in the test dataset. The more the performance decreases when a feature is permuted, the higher its degree of importance. For example, suppose the estimated classification accuracy A_original is 90%. Then, given a feature, we permute it in the feature matrix and estimate the classification accuracy based on the permuted feature matrix. Let’s say the classification accuracy A_permutation is 88% after the feature permutation. The PVAI value of this feature can be calculated as A_permutation - A_original, namely -2%. In this paper, we adopt the absolute values of PVAI to measure the feature importance. The higher the PVAI value is, the more important the variable/feature is. Since each feature is permuted 10 times in the test dataset, the PVAI values of each feature are presented in a boxplot, as shown in Figure 7.
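The PVAI procedure described in this response can be sketched in a few lines. The code below is a minimal illustration, assuming a generic `predict` callable in place of the trained classifiers; the toy test set and the threshold "model" are invented for the example and are not the authors' data or implementation.

```python
import numpy as np

def pvai(predict, X, y, n_repeats=10, seed=0):
    """Absolute accuracy change after permuting each feature n_repeats times.

    Returns an array of shape (n_features, n_repeats).
    """
    rng = np.random.default_rng(seed)
    a_original = np.mean(predict(X) == y)
    scores = np.empty((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j; every other feature keeps its values
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            a_permutation = np.mean(predict(X_perm) == y)
            scores[j, r] = abs(a_permutation - a_original)
    return scores

# Toy stand-in for a trained classifier: it looks at feature 0 only
def predict(X):
    return (X[:, 0] > 0).astype(int)

rng = np.random.default_rng(1)
X_test = rng.normal(size=(400, 2))
y_test = (X_test[:, 0] > 0).astype(int)

scores = pvai(predict, X_test, y_test)
```

Each row of `scores` holds the 10 values that would feed one box in a boxplot such as Figure 7. In this toy case the second feature scores exactly 0 because the stand-in model never reads it.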

Reviewer 2 Report

1. Line 11

The abstract is not clearly written. You have to fix it.

What is the purpose of the study? What method is proposed for research? What is the research result: (1) Study GIS-based LSM in Zanjan comparing different machine learning methods?  (2) study the performance of machine learning algorithms using LSM data as an example? (3) calculating LSM? 

What is the reason for using 5 algorithms to solve the LSM problem? For what reason does the RF algorithm give better results: (1) Does it take into account the physics of the LSM process? (2) Is it the best for any forecasting task? (3) Is the result random for a given set of samples selected for analysis?

2. Line 378

What is meant by "the natural breaks method"? Your classification into 5 subclasses is questionable. Why don't you solve the problem of classification into 5 subclasses using ML methods and all 16 factors?

3. Line 437

In Figure 6, I can see that the classification results are almost independent of the machine learning methods you use. The first conclusion is based on a comparison of random variables obtained for a specific classification problem based on data from one region and one sample. Therefore, the first conclusion that RF is the best machine learning method is not substantiated either physically or statistically.

4. Line 445

Second conclusion.  Using statistical machine learning techniques, you name the first two factors from Table 3 as the most important variables. Their main contribution to the classification decision rule is confirmed in Figure 7. Both factors carry information about the instability of creeping soil. But from a physical point of view, the decision rule requires additional information about the presence and properties of the soil on the slope. I think lines 366-368 should be corrected.

Author Response

Reviewer #2:

1. Line 11

The abstract is not clearly written. You have to fix it.

What is the purpose of the study? What method is proposed for research? What is the research result: (1) Study GIS-based LSM in Zanjan comparing different machine learning methods?  (2) study the performance of machine learning algorithms using LSM data as an example? (3) calculating LSM? 

Response: Thank you so much for your attention. We revised the abstract in the new version of the manuscript and added details about the purpose of the study, as below:

“The main target of this study is investigating a GIS-based LSM in Zanjan, Iran and exploring the most important causative factors of landslides in case study area. To do so, different machine learning (ML) methods have been employed and compared to select the best results in the case study area.”

Response: Moreover, the methods are described as below. Note that, because of the word-count limit for the abstract, we could not add more details.

“The CNN is compared with four ML algorithms, including random forest (RF), artificial neural network (ANN), support vector machine (SVM), and logistic regression (LR). To do so, sixteen landslide causative factors have been extracted and their related spatial layers have been prepared. Then, the algorithms were trained with the related landslide and non-landslide points.”

Response: The research results are also described as below:

“The results illustrate that the five ML algorithms performed suitably (precision = 82.43%-85.6%, AUC = 0.934-0.967). The RF algorithm reaches the best result; the CNN, the SVM, the ANN, and the LR follow RF, in that order. Moreover, variable importance analysis results indicate that slope and topographic curvature contribute more to the prediction. The results would be beneficial to planning strategies for landslide risk management.”

 

What is the reason for using 5 algorithms to solve the LSM problem? For what reason does the RF algorithm give better results: (1) Does it take into account the physics of the LSM process? (2) Is it the best for any forecasting task? (3) Is the result random for a given set of samples selected for analysis?

Response: Thanks for the very good comment. We selected the 5 algorithms through an extensive literature review. Previous studies indicated that conventional machine learning methods, such as logistic regression, support vector machine, artificial neural network and random forest, have been widely used for landslide susceptibility mapping. Besides, the potential of CNNs in this field based on causative factors has not been fully explored yet, especially by comparing its performance with the conventional machine learning algorithms. Herein, the performance of the convolutional neural network (CNN), random forest (RF), artificial neural network (ANN), support vector machine (SVM), and logistic regression (LR) is compared in producing LSM due to the complexity of influencing factors. The following text has been added to the manuscript:

“In particular, the potential of CNNs in this field based on causative factors has not been fully explored yet, especially by comparing its performance with the conventional machine learning algorithms.”

That the random forest algorithm achieves better results could be due to its nature and characteristics. RF is based on bagging, an ensemble learning technique. It builds many trees on subsets of the input data and combines the outputs of all the trees. In this way it reduces the overfitting problem of individual decision trees and also reduces the variance, and therefore improves the accuracy. We have not considered the physical process of LSM while modeling with RF, but have taken into account the related causative factors.

Besides, RF cannot achieve the best results for every prediction task. No single algorithm dominates when choosing a machine learning model. For example, some ML methods perform better with large data sets and some perform better with high-dimensional data. Thus, it is important to assess a model’s effectiveness for the particular application problem and data set.

Given a set of samples, we tried to obtain the optimal classification result by hyperparameter tuning for each ML model. So the result is not random. All the results are reproducible.

2. Line 378

What is meant by "the natural breaks method"? Your classification into 5 subclasses is questionable. Why don't you solve the problem of classification into 5 subclasses using ML methods and all 16 factors?

Response: Thanks for the very good question. The Jenks Natural Breaks Classification is a data classification method designed to optimize the arrangement of a set of values into "natural" classes. A Natural class is the most optimal class range found "naturally" in a data set. This classification method attempts to minimize the average deviation from the class mean while maximizing the deviation from the means of the other groups.
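To make the idea of "natural" classes concrete, the sketch below finds the class boundaries that minimize the total within-class squared deviation for one-dimensional values, using a small dynamic program in the spirit of Fisher-Jenks optimization. It is an illustrative assumption, not the routine of the GIS software used in the study; the function name `natural_breaks` and the sample scores are hypothetical, and three classes are used instead of five to keep the example short.

```python
import numpy as np

def natural_breaks(values, n_classes):
    """Upper bound of each class, minimising within-class squared deviation."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    # Prefix sums give the squared deviation of any slice x[i:j] in O(1)
    s1 = np.concatenate([[0.0], np.cumsum(x)])
    s2 = np.concatenate([[0.0], np.cumsum(x ** 2)])

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # cost[k, j]: best cost of splitting the first j sorted values into k classes
    cost = np.full((n_classes + 1, n + 1), np.inf)
    split = np.zeros((n_classes + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1, i] + ssd(i, j)
                if c < cost[k, j]:
                    cost[k, j] = c
                    split[k, j] = i

    # Walk back through the stored split points to recover the class bounds
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(float(x[j - 1]))
        j = split[k, j]
    return bounds[::-1]

# Hypothetical susceptibility scores forming three visible groups
scores = [0.02, 0.05, 0.04, 0.48, 0.52, 0.50, 0.95, 0.97, 0.99]
breaks = natural_breaks(scores, 3)
```

For these scores the boundaries fall in the gaps between the three groups, which is the behaviour the response describes as finding classes "naturally" present in the data.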

The landslide susceptibility mapping is implemented by training ML models based on 16 factors and the landslide inventory map. The landslide inventory map was created by the forest and watershed management organization (FWMO) of Zanjan province, which includes 2513 landslide points and 3287 non-landslide points in the study area. It is impossible to generate the landslide points based on 5 different susceptibility levels. That’s why we conduct binary classification first, and then implement landslide susceptibility mapping.

3. Line 437

In Figure 6, I can see that the classification results are almost independent of the machine learning methods you use. The first conclusion is based on a comparison of random variables obtained for a specific classification problem based on data from one region and one sample. Therefore, the first conclusion that RF is the best machine learning method is not substantiated either physically or statistically.

Response: Thanks for the very good comment. Figure 6 displays the model performance comparison in terms of ROC curves.

As mentioned in the Introduction, Zanjan province of Iran experiences a high number of landslides annually due to its mainly mountainous topography, diverse geological and morphological structures, and different climatic conditions, which cause considerable damage to the country. However, little effort has been made to assess or predict these landslides. Comparing the performance of different methods in landslide susceptibility mapping can provide analysts with guidance for selecting an appropriate one for the study area in Zanjan. Our conclusion doesn’t mean RF is the best machine learning method for all prediction/classification tasks. The conclusion mainly indicates that RF can achieve promising results for landslide susceptibility mapping in this study. To address your comment, we emphasize in the Abstract and Results sections that the results show RF performed as the best algorithm in this case study area. Also, the text below has been added just before Table 2 in the revised version.

“It is worth noting that the results here don’t mean RF is the best machine learning method for all prediction/classification tasks. The conclusion mainly indicates that RF can achieve promising results for landslide susceptibility mapping in this study.”

4. Line 445

Second conclusion.  Using statistical machine learning techniques, you name the first two factors from Table 3 as the most important variables. Their main contribution to the classification decision rule is confirmed in Figure 7. Both factors carry information about the instability of creeping soil. But from a physical point of view, the decision rule requires additional information about the presence and properties of the soil on the slope. I think lines 366-368 should be corrected.

Response: Thanks for the insightful comment. Here we explain a bit more about Table 3 and Figure 7 for better understanding. In this paper, the absolute value of PVAI is adopted to measure the feature importance. We introduced how to calculate permutation-based variable accuracy importance (PVAI) in Section 4.3. The rationale of the method is that the importance of a feature is calculated by comparing the variation in the performance (i.e., accuracy in this study) of a classifier when the feature is randomly permuted in the test dataset. The more the performance decreases when a feature is permuted, the higher its degree of importance. For example, suppose the estimated classification accuracy A_original is 90%. Then, given a feature, we permute it in the feature matrix and estimate the classification accuracy based on the permuted feature matrix. Let’s say the classification accuracy A_permutation is 88% after the feature permutation. The PVAI value of this feature can be calculated as A_permutation - A_original, namely -2%. The higher the PVAI value is, the more important the variable/feature is. Since each feature is permuted 10 times in the test dataset, the PVAI values of each feature are presented in a boxplot, as shown in Figure 7.

Figure 7 and Table 3 mainly demonstrate which factors are important in landslide classification. However, it doesn’t mean other less important factors are useless in the modeling. Here we take ANN as an example. If we remove the factor topographic curvature from the input data and estimate the PVAI values of the remaining factors, the result is like the following figure. It can be observed that the PVAI value/importance of each factor has changed, which implies that they are also related to landslides. Therefore, the PVAI value of one factor only reflects its degree of importance relative to all the factors. If we remove one important factor (e.g., topographic curvature) from the input, some less important factors could appear as important factors.

As can be seen in the results, slope and curvature, which indicate the instability of the terrain, are the most important factors in all ML methods in this study, which covers an area of nearly 22100 square kilometers. However, datasets such as soil type, soil moisture, subsidence, and so on, which are essential in studies at smaller scales, are not available for the whole case study area here. Accordingly, it is necessary to pay attention to the geological characteristics of soil type, soil moisture, and other issues that arise at smaller scales for more accurate studies. Besides, we do agree with the reviewer’s comment that, from a physical point of view, the decision requires additional information. To address this comment and describe the matter, we added the text below in the revised version before Figure 7.

“It is worth noting that the results about the importance of landslide causative factors have been obtained computationally here. As can be seen, slope and curvature, which indicate the instability of the terrain, are the most important factors in all ML methods in this study with the area near km2. However, datasets such as soil type, soil moisture, subsidence, and so on, which are essential in studies at smaller scales, are not available for the whole case study area here. Accordingly, it is necessary to pay attention to the geological characteristics of soil type, soil moisture, and other issues that arise at smaller scales for more accurate studies. Besides, from a physical point of view, the decision requires additional information.”

 

 

Round 2

Reviewer 2 Report

I accept the corrections of the article and the comments of the authors.

At the same time, I do not quite agree with the authors' conclusion regarding the ROC curves (Fig. 6). I think it would be useful (in a new work) to compare not only ROC curves, but also landslide maps obtained using different algorithms. (Each map is a vector with coordinates equal to pixel values). The algorithms are independent and apply to the same data. Maps obtained with "bad" algorithms must be very different (each bad algorithm is "bad in its own way"). Maps created with "good" algorithms should create a compact cluster (there is only one true map, so "good" algorithms should look like a real map).
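One possible reading of this suggestion is to flatten each susceptibility map into a vector of pixel values and inspect the pairwise distances between maps. The sketch below is an assumption about how such a comparison could be set up, not something proposed in the record itself: the maps are synthetic, the names `good_a`, `good_b` and `bad` are hypothetical, and root-mean-square pixel difference is just one reasonable choice of distance.

```python
import numpy as np

def pairwise_map_distances(maps):
    """maps: dict of name -> 2-D susceptibility array, all on the same grid.

    Returns the names and a symmetric matrix of RMS pixel differences.
    """
    names = list(maps)
    # Each map becomes one vector whose coordinates are its pixel values
    vectors = np.stack([np.asarray(maps[n], dtype=float).ravel() for n in names])
    diff = vectors[:, None, :] - vectors[None, :, :]
    return names, np.sqrt((diff ** 2).mean(axis=2))

rng = np.random.default_rng(0)
truth = rng.random((50, 50))  # synthetic stand-in for the unknown true map
maps = {
    "good_a": truth + 0.02 * rng.normal(size=truth.shape),
    "good_b": truth + 0.02 * rng.normal(size=truth.shape),
    "bad": rng.random((50, 50)),  # unrelated to the true map
}
names, dist = pairwise_map_distances(maps)
```

In this toy setup the two "good" maps sit close to each other while the "bad" map is far from both, which is the compact-cluster pattern the reviewer describes.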
