**4. Results**

The performance on the test dataset was first evaluated using the ROC approach to identify the best-performing model. The analysis was carried out with values of k ranging from 2 to 10 for all algorithms, using both the point and polygon datasets, and the ROC curves are plotted in Figure 6.

The minimum and maximum accuracies of the model with the NB algorithm and point data are 82.70% and 83.30%, respectively, and the corresponding AUC values are nearly the same, i.e., 0.903 and 0.904. The accuracy values remained the same, while the AUC values decreased when the polygon data is used with the NB algorithm. The trend is nearly the same for the LR algorithm as well; its AUC values are slightly better than those of NB, with a maximum value of 0.920 with point data. The pattern is different for the other three algorithms, and the performance is significantly improved with polygon data in all three cases. With the point data, the maximum accuracy values are 84.71%, 88.12%, and 86.63% for KNN, RF, and SVM, respectively, while the maximum AUC values are 0.911, 0.954, and 0.930. With the use of polygon data, the maximum accuracy of KNN increased to 95.22%, while that of RF became 98.14% and that of SVM became 91.61%. The AUC values also increased, to 0.981, 0.993, and 0.963 for KNN, RF, and SVM, respectively. Another important observation is that SVM performs better than KNN when using point data, with a difference of 1.92% in accuracy, whereas, when polygon data is used, KNN performs better than SVM, with a difference of 3.61% in accuracy (Table 2). In both cases, the RF model outperforms the other models, with the highest values of accuracy and AUC.

From Figure 6, it can be observed that the AUC values of KNN, RF, and SVM improved significantly by using polygon inventory data, while the variation is minimal in the case of NB and LR. Moreover, the effect of varying the value of k in k-fold cross validation is insignificant when using polygon data for the NB, LR, and SVM algorithms, whereas, in the case of KNN and RF, varying the number of folds can change the accuracy by approximately 2% with polygon data. Even though the variation is not significant, the best performance of all models was obtained at k = 8 using point data. A summary of the quantitative comparison is provided in Table 2, with the k values corresponding to the minimum and maximum performances given in brackets.
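The cross-validation sweep described above can be sketched as follows. The feature matrix, labels, and the classifier shown here are illustrative stand-ins, and `StratifiedKFold` with accuracy and AUC scoring is one plausible way to realize the k = 2 to 10 evaluation; the exact implementation is not specified in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in for the landslide feature matrix and binary labels
# (1 = landslide sample, 0 = non-landslide sample).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

results = {}
for k in range(2, 11):  # k-fold cross validation, k = 2 .. 10
    accs, aucs = [], []
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for train_idx, test_idx in cv.split(X, y):
        model = RandomForestClassifier(n_estimators=50, random_state=0)
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        accs.append(accuracy_score(y[test_idx], proba >= 0.5))
        aucs.append(roc_auc_score(y[test_idx], proba))
    # Mean accuracy and AUC across the k folds for this value of k.
    results[k] = (np.mean(accs), np.mean(aucs))

# The value of k giving the highest mean AUC.
best_k = max(results, key=lambda k: results[k][1])
```

Repeating this loop for each of the five algorithms yields the per-k accuracy and AUC values that Table 2 summarizes.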

**Figure 6.** ROC curves, AUC, and accuracy of different models: (**a**) Naïve Bayes, (**b**) Logistic Regression, (**c**) K Nearest Neighbors, (**d**) Random Forest, (**e**) SVM, and (**f**) comparison of AUC of all five algorithms.


**Table 2.** Quantitative comparison of different algorithms, sampling strategies and data splitting using accuracy and AUC values.

From the comparison of statistical performance in Figure 6 and Table 2, it can be observed that the RF algorithm with polygon inventory data performs better than all the other models. The performance of KNN and RF is comparable when using polygon data, and the scores of RF and SVM are comparable when using point data. Still, the best-suited model cannot be selected on the basis of statistical scores alone. The choice requires a detailed understanding of the distribution of susceptibility classes and a detailed evaluation from a practical perspective. The purpose of landslide susceptibility maps is to help planners and authorities in making strategic decisions for future development. Hence, it is important to provide clear information about the susceptibility classes. Based on the value of the probability of the occurrence of landslides, the district is divided into five susceptibility classes: very low, low, medium, high, and very high. The statistical attributes describe the prediction performance on the test data only [49]. From a practical perspective, a landslide susceptibility map with an acceptable performance should classify all the landslides correctly within the very high, high, or medium classes. At the same time, the model cannot be too conservative, as this may restrict developmental activities within a larger area. The landslide susceptibility maps prepared using both point and polygon data with the best-performing model of each algorithm are evaluated in detail, along with the H-index map, for a better understanding of spatial agreement.
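The division of predicted probabilities into the five susceptibility classes can be sketched as below. The class boundaries used here are hypothetical equal-width thresholds; the text states that the district is divided into five classes but does not give the exact cut-offs.

```python
import numpy as np

# Hypothetical class boundaries on the predicted landslide probability;
# the actual thresholds used in the study are not stated in the text.
bounds = [0.2, 0.4, 0.6, 0.8]
labels = ["very low", "low", "medium", "high", "very high"]

def susceptibility_class(p):
    """Map a probability of landslide occurrence to a susceptibility class."""
    # np.digitize returns 0 for p < 0.2, 1 for 0.2 <= p < 0.4, ..., 4 for p >= 0.8
    return labels[np.digitize(p, bounds)]
```

Applying this mapping pixel-wise to the probability raster produces the categorical susceptibility maps shown in Figures 7 through 11.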

The number of pixels in each category and the number of landslides that occurred in each class are also important concerns. With a reliable landslide susceptibility map, the landslides should occur within the medium, high, and very high susceptible zones. Landslides that occur outside these zones are missed events, which should be considered with utmost care. A model with an increased number of missed alerts fails to predict the possible occurrence of landslides.
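The missed-event check described above can be sketched as follows; the class map and landslide locations are illustrative stand-ins for the real rasters and inventory.

```python
import numpy as np

# Susceptibility class per pixel (0 = very low, 1 = low, 2 = medium,
# 3 = high, 4 = very high) and the pixel indices of observed landslides;
# both arrays are illustrative stand-ins for the real data.
class_map = np.array([4, 4, 3, 2, 1, 0, 0, 2, 4, 3])
landslide_pixels = np.array([0, 1, 2, 4, 8])

classes_at_slides = class_map[landslide_pixels]
hits = np.sum(classes_at_slides >= 2)    # landslides in medium/high/very high
missed = np.sum(classes_at_slides < 2)   # landslides in low/very low: missed events
missed_rate = missed / len(landslide_pixels)
```

A lower `missed_rate` indicates a map that captures more of the observed landslides within the medium-to-very-high zones.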

The landslide susceptibility map prepared using the NB algorithm classifies 15.07% of the total area in the very high category with point data and 18.29% with polygon data (Figure 7). It can also be seen from Figure 7 that, among the 388 landslides considered, 72.64% occurred in areas classified as very high with point data, and this percentage increased to 80.64% with polygon data. A total of 74.49% of the total area is classified as very low using point data and 73.27% using polygon data. The performance of the model is slightly reduced when using polygon data, owing to the increased number of false alarms within the larger area covered by the very high and high categories. Considering the mutual agreement between the predictions made with the two sampling strategies, 86.42% of the total predictions are in perfect agreement with each other (Figure 7c), while the susceptibility classes predicted by the two methods differ in the remaining area.

The LR algorithm classifies 6.90% of the total area as very high, 9.04% as high, 10.55% as medium, 22.21% as low, and 51.30% as very low using point data (Figure 8). The number of landslides that occurred in locations classified as very high is reduced to 58.60% when compared with NB, but, at the same time, the number of landslides that occurred in the very low category is also reduced, to 6.78%, which in turn slightly improved the performance of LR. While using polygon data, the LR algorithm classifies 8.63% of the total area as very high, 7.82% as high, 9.03% as medium, 14.58% as low, and 59.95% as very low. Even though the missed alarms are reduced in this case, the increased number of false alarms resulted in a marginal decrease in the accuracy and AUC values. For 72% of the total area, the susceptibility class predicted using point data and that predicted using polygon data perfectly agree, with an H-index of 0.

From the AUC values (Figure 6), it is evident that the performance of KNN is comparable with the NB and LR algorithms when using point data, but it increased significantly when using polygon data. The reason for this is the drop in the areas classified into the very high, high, and medium classes to 3.48%, 3.27%, and 3.44% when using polygon data, compared to 7.15%, 7.60%, and 7.82% when using point data (Figure 9). This reduction resulted in a considerable reduction of false alarms and in an improvement of the accuracy and AUC values. The variation is also reflected in the H-index map, as only 68.70% of the total area agrees between the predictions made using the two sampling methods.

Similar to KNN, RF also shows a significant improvement in performance when using polygon data compared to the point data. The reason is also very similar, as the percentages of pixels classified as very high, high, and medium are reduced when using the polygon data. With the use of point data, 7.86% of the total area was classified under the very high category, which comprises 61.26% of the total landslide occurrences (Figure 10). However, with polygon data, 97.90% of the total landslides occur within the 1.06% of the total area that is classified into the very high category. The number of missed events is also reduced by using polygon data, as only 0.13% and 0.06% of landslides occur in the areas classified as low and very low, respectively. The mutual agreement between the landslide susceptibility maps produced by point and polygon data is also the lowest in the case of the RF algorithm, as 71.20% of the total area has been classified into different categories by the two sampling strategies.

Similar to NB and LR, SVM also shows an increase in the percentage of area classified into the very high category with the use of polygon data, compared with the landslide susceptibility map prepared using point data (Figure 11). However, the percentage increase in this category does not result in false alarms, as in the case of NB and LR, because most pixels classified into the high and medium categories using point data were classified into the very high category when using polygon data. Thus, the true positives have increased and the false negatives have been reduced by using polygon data, which in turn resulted in an increase in performance with polygon data. For 75.28% of the area, the categorization is the same when using both point and polygon data, as depicted by the H-index plot.

Comparing the performance of the different models, RF provides better performance with both point and polygon data. Moreover, when using polygon data, the performance of KNN and RF is comparable and, when using point data, the performance of SVM and RF is comparable. Apart from the statistical comparison, a better understanding of the pixel-wise distribution of susceptibility classes and the mutual agreement between the landslide susceptibility maps can help in deciding the best-suited landslide susceptibility map for a region.
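The pixel-wise mutual agreement between the two susceptibility maps can be sketched as below. This is a simplified agreement fraction, not the exact H-index computation, whose definition is given elsewhere in the paper; the class arrays are illustrative stand-ins for the real rasters.

```python
import numpy as np

# Susceptibility class per pixel from the two sampling strategies
# (illustrative arrays; the real maps are rasters over the study area).
point_map = np.array([4, 3, 3, 2, 1, 0, 0, 4])
polygon_map = np.array([4, 4, 3, 2, 0, 0, 1, 4])

# Percentage of pixels where the two strategies assign the same class.
agreement = np.mean(point_map == polygon_map) * 100

# Pixels where the maps disagree, i.e., the areas highlighted in the H-index plots.
disagreement_mask = point_map != polygon_map
```

Higher agreement indicates that the choice of sampling strategy matters little for that algorithm, as observed for NB, while lower agreement flags areas where the two maps must be compared carefully.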

**Figure 7.** Details of landslide susceptibility maps prepared using NB algorithm: (**a**) using point data, (**b**) using polygon data, (**c**) H-index plot, (**d**) percentage distribution of pixels using point data, and (**e**) percentage distribution of pixels using polygon data.

**Figure 8.** Details of landslide susceptibility maps prepared using LR algorithm: (**a**) using point data, (**b**) using polygon data, (**c**) H-index plot, (**d**) percentage distribution of pixels using point data, and (**e**) percentage distribution of pixels using polygon data.

**Figure 9.** Details of landslide susceptibility maps prepared using KNN algorithm: (**a**) using point data, (**b**) using polygon data, (**c**) H-index plot, (**d**) percentage distribution of pixels using point data, and (**e**) percentage distribution of pixels using polygon data.

**Figure 10.** Details of landslide susceptibility map prepared using RF algorithm: (**a**) using point data, (**b**) using polygon data, (**c**) H-index plot, (**d**) percentage distribution of pixels using point data, and (**e**) percentage distribution of pixels using polygon data.

**Figure 11.** Details of landslide susceptibility maps prepared using SVM algorithm: (**a**) using point data, (**b**) using polygon data, (**c**) H-index plot, (**d**) percentage distribution of pixels using point data, and (**e**) percentage distribution of pixels using polygon data.
