*Article* **Susceptibility Analysis of Glacier Debris Flow Based on Remote Sensing Imagery and Deep Learning: A Case Study along the G318 Linzhi Section**

**Jiaqing Chen <sup>1</sup>, Hong Gao <sup>1</sup>, Le Han <sup>1</sup>, Ruilin Yu <sup>1</sup> and Gang Mei <sup>1,2,</sup>\***


**Abstract:** Glacial debris flow is a common natural disaster, and its frequency has been increasing in recent years as global warming accelerates glacier retreat. To reduce the damage that glacial debris flows cause to human life and property, glacial debris flow susceptibility must be assessed. Most existing studies consider the effect of the current glacier area but ignore the change in glacier ablation volume. In this paper, we account for the change in glacier ablation volume when investigating the susceptibility of glacial debris flow, taking the glacial debris-flow-prone gullies along the G318 Linzhi section of the Sichuan-Tibet Highway as the research object. First, using a simple band ratio method with manual correction, we produced a glacial debris flow remote sensing image dataset. Second, we propose a deep-learning-based approach using a weight-optimised semantic segmentation model for accurately and automatically mapping glacier boundaries in complex debris-covered remote sensing images. Then, we calculated the ablation volume from the changes in glacier elevation and ablation area between 2015 and 2020. Finally, glacial debris flow susceptibility was evaluated with the entropy weight method and the TOPSIS method, taking the glacial melt volume in each watershed as the main factor. The results show that most evaluation indices of the model exceed 90%, indicating that the model performs reasonable glacier boundary extraction, and that remote sensing images and deep learning techniques can effectively assess glacial debris flow susceptibility and support future glacial debris flow disaster prevention.

**Keywords:** geological hazards; glacial debris flow; remote sensing; deep learning

#### **1. Introduction**

Glacier debris flow is a type of mountain natural disaster caused by ice melting, characterized by its suddenness, large scale, and destructiveness. In the context of global warming, glacier retreat is accelerating, and the frequency and scale of glacier debris flows are increasing [1,2]. Therefore, accurate assessment and prediction of glacial debris flow susceptibility is important for protecting people's lives and property and maintaining the ecological environment.

Remote sensing images and deep learning techniques play an important role in the analysis of glacier debris flow susceptibility. Remote sensing images provide large-scale, high-resolution surface information, including topography, vegetation, and hydrology [3], which supplies basic data on the formation and occurrence of glacier debris flows. Deep learning techniques can quickly and accurately identify potential glacier debris flow hazard areas through feature extraction and classification of remotely sensed images [4], providing an effective means for glacier debris flow prediction and early warning. Therefore, glacier debris flow susceptibility analysis based on remote sensing images and deep learning has important research significance and application value [5].

**Citation:** Chen, J.; Gao, H.; Han, L.; Yu, R.; Mei, G. Susceptibility Analysis of Glacier Debris Flow Based on Remote Sensing Imagery and Deep Learning: A Case Study along the G318 Linzhi Section. *Sensors* **2023**, *23*, 6608. https://doi.org/10.3390/s23146608

Academic Editors: Junwei Ma and Jie Dou

Received: 18 June 2023; Revised: 18 July 2023; Accepted: 21 July 2023; Published: 22 July 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Glacier debris flow susceptibility analysis is important research work that can help people better understand the formation mechanism and influencing factors of glacier debris flows, so as to take effective prevention and control measures. In recent years, with the continuous development of remote sensing technology and deep learning algorithms, the analysis of glacier debris flow susceptibility based on remote sensing images and deep learning has also gained wide attention. The occurrence of glacial debris flow has obvious indicators of catastrophic changes, such as increased density of hanging glacier crevasses, enhanced glacier velocity, and rapid increase in glacial lake area.

The influencing factors in glacial debris flow susceptibility analysis mainly include the following aspects: (1) Topographic factors: topography is one of the key factors in the formation of glacial debris flow, including elevation difference, slope, and aspect [6]; topographic information can be obtained from data such as a digital elevation model (DEM). (2) Climatic factors: climate also has an important influence on the formation and development of glacier debris flows, including rainfall, temperature, and humidity; meteorological data such as rainfall can be derived from remote sensing. (3) Geological factors: geology is likewise important in the formation of glacial debris flow, including lithology, faults, and earthquakes; remote sensing images can provide geological information such as lithology and faults. (4) Vegetation factors: vegetation cover also influences the formation and development of glacial debris flow, and remote sensing images can provide vegetation information such as vegetation cover [7]. Glacier debris flow susceptibility analysis based on remote sensing images and deep learning obtains susceptibility information through feature extraction and classification of remote sensing images. Commonly used deep learning algorithms include convolutional neural networks (CNNs) and recurrent neural networks (RNNs), and predictions of glacier debris flow susceptibility are obtained by training and testing on the remotely sensed images. In conclusion, glacier debris flow susceptibility analysis based on remote sensing images and deep learning can provide an important scientific basis for glacier debris flow prevention and control [8].

In recent years, a new research trend has emerged that assesses glacier debris flow susceptibility through remote sensing images and deep learning techniques. For example, Ji et al. used deep learning and remote sensing image analysis to establish a mudslide susceptibility assessment model based on topographic and geomorphological features, validated in the Bijie City area of northwestern Guizhou, showing that the model can assess mudslide susceptibility in the area more accurately [9]. The authors of Ref. [10] used high-resolution remote sensing image data for glacier prediction. Regarding methods of remote sensing image susceptibility analysis, Lin et al. considered the influence of the change in glacier ablation volume and conducted a mudslide susceptibility analysis of the glacial mudslide-prone gullies along the G217 Duku Highway in Xinjiang, showing that accurate prediction of glacier mudslide susceptibility can be achieved using high-resolution remote sensing image data and machine learning algorithms [11].

However, these methods have some limitations, such as low accuracy in conducting glacier debris flow susceptibility assessment. Therefore, we need more accurate models for glacier debris flow susceptibility assessment.

In recent years, deep learning techniques have been widely used in remote sensing image processing [12]. Among them, the DeepLabv3+ model is a deep-convolutional-neural-network-based method with high segmentation accuracy and fast operation. The model improves on the segmentation accuracy of a traditional convolutional neural network through atrous (dilated) convolution and a decoder module. Using this model, we can semantically segment the imagery by labelling glacier ice as the "Ice" category and other terrain as the "Background" category. This method not only yields highly accurate segmentation results but is also fast to evaluate, avoiding the errors of traditional methods. In addition, the method can help reduce the harm caused by glacier debris flow to humans and the natural environment.

We chose the G318 Linzhi section of the Sichuan-Tibet Highway as the study area. Affected by global warming in recent years, glacial melt in the Linzhi area has accelerated, resulting in frequent glacier debris flow disasters and serious hazards. According to incomplete statistics covering only April 2006 to September 2007, mudslide disasters in the Linzhi area endangered the safety of 264 villagers in 17 villages, resulting in one death and seven people missing. A comprehensive analysis of mudslide disasters and environmental factors shows that glacier debris flow disasters in the Linzhi region are currently at a high incidence and seriously affect local economic development. Therefore, studying the development pattern of mudslides under climate change in the Linzhi region is important for the monitoring, early warning, and prevention of mudslide disasters in the region and throughout southeast Tibet.

Our contributions in this paper can be summarized as follows: (1) the ablation zone of the study area was determined by comparing the glacier boundaries in 2015 and 2020; (2) the amount of glacier ablation was calculated based on the changes in glacier elevation and ablation area from 2015 to 2020; and (3) the susceptibility of glacial debris flow was evaluated using the glacier melt volume in different basins.

#### **2. Study Area**

The Linzhi section of National Highway 318 is located in Linzhi City, Tibet Autonomous Region of China, with a total length of about 287 km, connecting Linzhi City with Chengdu City in the southwest of Sichuan Province. The starting point of the Linzhi section is located in the town of Motuo within Linzhi City, and the end point is located in the county of Yajiang. The road traverses several natural scenic areas such as the Hengduan Mountains, the Sichuan-Tibet Plateau, and the Yarlung Tsangpo River Grand Canyon.

The Linzhi section is a typical area prone to glacial debris flow. There are many glaciers and rocks piled together, and the geomorphologic conditions are such that glacial debris flow occurs easily. The study area has typical alpine valley and mountain valley landforms developed due to crustal uplift, strong river undercutting, and strong tectonic activity. In the context of high stress, the potential risk of strong earthquakes is high [13]. The Linzhi section has abundant research resources, such as existing satellite remote sensing images and ground monitoring data, to facilitate the implementation of glacier debris flow analysis.

The average elevation of the study area is about 3000 m; the lowest elevation, 115 m, lies in Motuo County, and the highest, over 7000 m, is Namcha Barwa Peak. This is the zone with the largest vertical landform relief in the world, with relative elevation differences generally around 1000–2000 m. Mountain slopes are generally no less than 30°, and slopes in the canyon areas are mostly around 80°.

The national highway G318 crosses Gongbu Jiangda County, Bayi District, and Bomi County, the details of which are shown in Figure 1. Gongbu Jiangda County is located in the transition zone from the valleys of southern Tibet to the high mountain valley area of eastern Tibet, bounded by the eastern extension of the Gangdis Mountains in the south and the Tanggula Mountains in the north, with mountains and valleys running east–west. The southern part of Bayi District consists of remnants of the Gangdis Mountains, while the north belongs to the alpine section of a branch of the Nianqing Tanggula Mountains. The average elevation of the district is 3000 m; the highest peak, Galabaek Peak, is 7300 m above sea level, while the lowest point, Bayu Village, is 1600 m above sea level, with a relative height difference of 4700 m. Bomi County is located in the eastern section of the Tanggula Mountains and the eastern end of the Himalayas; the terrain is high in the north and low in the south, with continuous high mountains and the Palongzangbu and Egonzangbu river valleys in the central part. The highest elevation in Bomi County is 6648 m and the lowest is 2001.4 m. With changes in topography and geomorphology, the geological processes also change. Glacier debris flow hazards generally develop on the margins of modern glaciers and snowpacks, and the topography and large relative elevation differences in this region provide favourable spatial conditions for the formation and development of glacier debris flow hazards [13].

**Figure 1.** Geographical location of the glacier study area.

The climate of the study area is clearly influenced by crustal uplift and, more prominently, by topography; the average temperature in the southern part of Dongjiu Township is around +12 °C. The climate becomes increasingly cold-temperate moving upstream along the Yarlung Tsangpo River, and Bomi County, with high mountains below 0 °C that hold perennial snow, is known as a centre of modern glaciers. In recent years, the glacier area has been retreating, driven mainly by the continuous rise in temperature; changes in precipitation have little effect on glacier change [14]. Therefore, we chose Bomi County as our typical study area. The average annual temperature in Gongbu Jiangda County is +8.7 °C, with a maximum of +31.5 °C and a minimum of −10.4 °C. The average annual rainfall is 640.1 mm, with a maximum annual rainfall of 808.3 mm and a maximum daily rainfall of 45.2 mm. The seasonal distribution of rainfall is uneven, with 80% concentrated between May and September. Bayi District, influenced by the warm and humid airflow of the Indian Ocean, has a temperate humid monsoon climate with abundant rainfall; its annual average temperature is +8.5 °C, with a maximum of +29 °C and a minimum of −1.8 °C, and the average annual rainfall is 654 mm, mainly concentrated in May–September and accounting for about 90% of the annual total. The average annual temperature in Bomi County is +8.5 °C, with a maximum of +31.1 °C and a minimum of −1.8 °C, and the average annual rainfall is 977 mm. Since the mid-twentieth century, rising temperatures have caused many glaciers to melt at unsustainably high rates, resulting in diminishing ice storage [15].
Temperature and rainfall conditions are important factors in triggering the occurrence of glacier debris flow hazards [16,17], and it is important to fully understand and investigate this aspect.

#### **3. Methods**

#### *3.1. Overview*

In this paper, we investigate the susceptibility of glacier debris flow along the G318 Linzhi section based on remote sensing imagery and deep learning. First, high-quality remotely sensed images are acquired and pre-processed prior to segmentation. Second, the processed images are used to generate a sample set. Third, after the segmentation results are output, post-processing procedures such as boundary extraction, small-polygon removal, and edge smoothing are applied to obtain glacier outlines.

Our research is divided into three parts: (1) remote sensing image processing; (2) calculation of the glacier ablation volume by combining elevation data; and (3) evaluation of glacial debris flow susceptibility using eight influencing factors. A flow chart of the study is shown in Figure 2.

**Figure 2.** Workflow of the susceptibility analysis of glacier debris flow.

#### *3.2. Step 1: Data Acquisition and Pre-Processing*

In this paper, we obtained freely available Landsat 8 data from the US Geological Survey [18], covering the period from 2015 to 2020. In recent years, remote sensing imagery has been widely used in physical geography and environmental research, especially in areas such as glacier monitoring and geological hazard control. Commercial high-resolution imagery, despite its advantages, is often charged for or limited in the number of scenes provided, making long-term monitoring over large areas difficult [19]. Landsat 8, as a high-resolution multispectral satellite, can effectively improve identification accuracy over large areas and has clear advantages for glacier identification. Taking into account the climatic conditions, glacier distribution, and change characteristics of the study area, we selected summer remote sensing images with little cloud and shadow. We also selected two global elevation datasets: the Copernicus DEM, mapped in 2015 and downloaded from the Copernicus Open Access Hub (available online: https://scihub.copernicus.eu/ (accessed on 12 March 2023)), and NASA DEM, released by NASA on 18 February 2020, which is the latest global 30 m resolution DEM and is expected to remain the highest-resolution, best-quality, and widest-coverage DEM product for the foreseeable future [20].

Furthermore, in this paper, we generated the glacier remote sensing dataset from the above sources for constructing the semantic segmentation model. We selected the Band 2 (Blue), Band 3 (Green), and Band 6 (SWIR 1) bands from the Landsat 8 data and synthesised them into pseudo-colour images. In the synthesised image, ice appears blue and bare ground red; on this basis we annotated the glaciers in the remote sensing images, as shown in Figure 3.

**Figure 3.** The dataset of semantic segmentation. Examples show the sample labels of glaciers in different images. (**a**) True-colour composite of the Landsat 8 imagery; (**b**) the white polygon indicates the glacier, and black is the background. (Band2 (Blue), Band3 (Green) and Band6 (SWIR 1)).
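As a minimal sketch of the band composite described above (assuming the three bands have already been read into NumPy arrays; the channel assignment R = SWIR 1, G = Green, B = Blue and the 2–98% contrast stretch are our illustrative assumptions, not the authors' exact recipe):

```python
import numpy as np

def composite_pseudo_colour(blue, green, swir1):
    """Stack Landsat 8 Band 2 / Band 3 / Band 6 into a pseudo-colour
    composite, stretching each band to the 0-1 range for display."""
    def stretch(band):
        lo, hi = np.percentile(band, (2, 98))   # 2-98% linear stretch
        return np.clip((band - lo) / (hi - lo + 1e-9), 0.0, 1.0)
    # With SWIR 1 in the red channel, ice (low SWIR reflectance, high
    # blue/green) renders blue-ish and bare ground renders red-ish.
    return np.dstack([stretch(swir1), stretch(green), stretch(blue)])
```

The output is a (H, W, 3) float array suitable for direct display or for annotating glacier polygons.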

Due to external factors such as topographic relief, solar radiation, and cloud cover, remote sensing images may suffer from distortions and other problems during imaging [21]. Therefore, pre-processing of the raw remote sensing data, including data format conversion, radiometric calibration, atmospheric correction, and geometric correction, is required before classifying the glaciers [22]. Extracting information such as the elevation and area of the study area then enables debris flow hazard assessment and predictive analysis, as well as the final output of the study results: visualisation images, hazard evaluation maps, prediction analysis maps, etc.

#### *3.3. Step 2: Model Architectures and Training*

In this section, we detail the model architecture of a deep-learning-based approach to accurately and automatically map complex debris-covered glaciers from remotely sensed images. First, we generate sample sets for training and testing. Afterwards, we perform model training using a semantic segmentation model with weight optimisation. Figure 4 shows an example of the basic process of processing the remotely sensed images.

**Figure 4.** The processing of remote sensing images (**a**) False-colour images; (**b**) labelled images; (**c**) identified images.

DeepLabv3+ is a convolutional-neural-network-based image segmentation model for segmenting objects in images [23]. We use the deep convolutional neural network Mobilenet as the backbone network for extracting image features. By leveraging the capabilities of DeepLabv3+, we aim to achieve accurate and reliable segmentation of glaciers and surrounding terrain from remote sensing imagery; the model's high accuracy enhances the quality of the susceptibility analysis results. To extend the receptive field of the convolutional kernel, DeepLabv3+ uses atrous (dilated) convolution, which inserts gaps between the taps of the convolution kernel, allowing the kernel to cover larger regions without adding parameters [24]. DeepLabv3+ also uses atrous spatial pyramid pooling to aggregate multi-scale image features: the model samples the feature maps at different rates with separate pooling kernels to capture object regions at different scales. A fully convolutional decoder then restores the feature maps extracted by the backbone network to a segmentation map of the same size as the input image, by upsampling and concatenating high-resolution features with the low-resolution semantic feature maps. Through the Mobilenet backbone combined with atrous convolution, spatial pyramid pooling, and the fully convolutional decoder, DeepLabv3+ achieves efficient extraction of multi-scale image features and accurate segmentation of glacier regions [25]. The DeepLabv3+ architecture is shown in Figure 5.

**Figure 5.** DeepLabv3+ architecture for semantic segmentation (figure adapted from [24]).
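The atrous ("null") convolution that DeepLabv3+ relies on can be illustrated with a minimal 1-D sketch (a hypothetical helper of ours, not the paper's code): a dilation rate d spaces the kernel taps d samples apart, so a kernel of size k covers an effective span of k + (k − 1)(d − 1) inputs without extra parameters.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D convolution with a dilation (atrous) rate.

    A dilation rate d samples the input every d positions under each
    kernel tap, enlarging the receptive field without adding weights.
    """
    k = len(kernel)
    span = k + (k - 1) * (dilation - 1)          # effective receptive field
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(kernel[j] * signal[start + j * dilation]
                       for j in range(k)))
    return out

# A 3-tap kernel with dilation 2 spans 5 input samples.
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], dilation=2))  # → [9, 12]
```

In PyTorch this corresponds to the `dilation` argument of `torch.nn.Conv2d`; DeepLabv3+ applies several such rates in parallel in its spatial pyramid pooling module.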

#### *3.4. Step 3: Glacier Classification and Model Evaluation*

We used the DeepLabv3+ model for glacier remote sensing image classification, which allows us to obtain information for assessing glacier debris flow susceptibility from the remote sensing images. In evaluating the DeepLabv3+ model, we found that the accuracy of glacier identification relies heavily on the quality of the sample set. The samples were divided into training and validation sets in a 9:1 ratio, for training the model and evaluating its accuracy, respectively. The main evaluation metrics for semantic segmentation are mean pixel accuracy (MPA), mean intersection over union (MIoU), and pixel accuracy, all calculated from the confusion matrix. The four basic elements of the confusion matrix are true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The classification performance for the glacier and non-glacier classes can be quantified using MPA, MIoU, and pixel accuracy. The following formulas are quoted from [24].

$$MPA = \frac{1}{k+1} \sum_{i=0}^{k} PA_i \tag{1}$$

$$MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{TP}{FN + FP + TP} \tag{2}$$

$$Pixel\text{-}Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
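Equations (1)–(3) can be checked with a small helper for the binary glacier/background case (our own illustrative code: for two classes, k = 1, and the background-class terms are obtained by swapping TP with TN and FP with FN):

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Binary (glacier / background) metrics from confusion-matrix counts:
    per-class pixel accuracy averaged into MPA, per-class IoU averaged
    into MIoU, and overall pixel accuracy."""
    pa_glacier = tp / (tp + fn)            # recall of the glacier class
    pa_background = tn / (tn + fp)         # recall of the background class
    mpa = (pa_glacier + pa_background) / 2
    iou_glacier = tp / (tp + fp + fn)
    iou_background = tn / (tn + fn + fp)
    miou = (iou_glacier + iou_background) / 2
    pixel_accuracy = (tp + tn) / (tp + tn + fp + fn)
    return mpa, miou, pixel_accuracy
```

Note that MIoU penalises both kinds of boundary error (FP and FN) in the denominator, which is why it is usually the lowest of the three scores.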

We use the DeepLabv3+ model to train on the annotated remote sensing images and optimise the model to improve its predictive performance for evaluating susceptibility. At the same time, there are some limitations and challenges to its application. On the one hand, the model requires a large amount of high-quality data for training, which may be limiting in study areas where data are insufficient. On the other hand, if the amount of training data is too small or unbalanced, overfitting can easily occur. In addition, the model is relatively complex and requires parameter tuning, which demands a certain level of technical skill and practical experience from the user. When using the DeepLabv3+ model for glacial debris flow susceptibility assessment, these issues need to be considered thoroughly to ensure accurate and reliable results.

#### *3.5. Step 4: Glacial Debris Flow Susceptibility Assessment*

In conducting the analysis of glacial debris flow susceptibility, we comprehensively analysed eight factors that affect the occurrence of glacial debris flows: the volume of physical sources, catchment area, maximum daily rainfall, longitudinal slope drop of the main gully, length of the main gully, glacier volume, total glacial lake area, and vegetation area [26]. Compared with previous studies, we replaced glacier area with glacier ablation volume, calculating the meltwater volume from the glacier ablation within each glacial debris flow basin over a five-year period, and combined it with the other factors to obtain a more accurate evaluation method for glacial debris flows.

We first calculated the values of each factor for the glacier debris flow basins in the study area, and then used the entropy weighting method to calculate the weight of each factor for glacier debris flow susceptibility. Finally, we used the TOPSIS method, combining the factor weights, to score the susceptibility of each glacial debris flow, and evaluated the susceptibility of glacial debris flows based on the scores.

#### **4. Results and Analysis**

#### *4.1. Experimental Environment and Settings*

The experiments in this paper use the Python programming language on the Windows 11 operating system, with a 12th Gen Intel® Core™ i7-12700H CPU, an NVIDIA GeForce RTX 3060 Laptop GPU, the PyTorch framework, and Mobilenet as the backbone network. A larger batch size benefits parallel computing but is limited by GPU memory: with the batch size set to 16 or 32, the GPU ran out of memory. Considering convergence speed, stochastic gradient noise, and device performance, the batch size was finally set to 4 for the freezing phase and 2 for the thawing phase. The number of iterations is the number of times the training set is fed through the neural network, and when to stop iterating usually depends on the predictive performance of the model. With 600 iterations, overfitting occurred because the number of iterations was too large, so it had to be reduced. With 500 iterations, the difference between the test error rate and the training error rate was small, so this number was considered appropriate. The learning rate controls the size of the network weight updates in the optimisation algorithm; the maximum learning rate of the model defaults to 0.01 and is adjusted adaptively according to the current batch size during training. The training set was then fed into the DeepLabv3+ network for iteration. As described above, the training parameters were continuously adjusted to obtain a weight-optimised semantic segmentation model, which we used to predict the glacier boundary in the study area and, in combination with the elevation data, to calculate the glacier ablation volume over the five-year period.
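The ablation-volume calculation mentioned above can be sketched by DEM differencing (a minimal illustration under our own assumptions: the two DEMs are co-registered on a 30 m grid and a boolean glacier mask comes from the segmentation model; the function name is hypothetical):

```python
import numpy as np

def ablation_volume(dem_2015, dem_2020, glacier_mask, cell_area_m2=30.0 * 30.0):
    """Estimate glacier ablation volume (m^3) by DEM differencing.

    Sums the elevation loss (2015 minus 2020) over cells flagged as
    glacier in `glacier_mask`, multiplied by the ground area of one
    DEM cell (30 m x 30 m for 30 m resolution DEMs).
    """
    dh = np.asarray(dem_2015, dtype=float) - np.asarray(dem_2020, dtype=float)
    dh = np.where(np.asarray(glacier_mask, dtype=bool), dh, 0.0)
    dh = np.clip(dh, 0.0, None)        # keep elevation loss (ablation) only
    return float(dh.sum() * cell_area_m2)
```

For example, a 2 × 2 glacier patch losing 2 m and 1 m in two cells yields 3 m × 900 m², i.e. 2700 m³ of ablation.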

#### *4.2. Glacier Boundary Identification Model Training and Evaluation*

Before training, the extracted remote sensing images were cropped into 3690 tiles of 128 × 128 pixels. If the whole image were trained directly without cropping, it could consume too much memory during training and interrupt the training process. The dataset was divided 9:1 into a training set and a validation set of 3354 and 336 tiles, respectively. In this training, we used the validation set for evaluation and did not set aside a separate test set; the training set was used to learn the discriminative information between glaciers and non-glaciers.
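The cropping and 9:1 split can be sketched as follows (illustrative helpers, not the authors' exact pipeline; the fixed random seed is our assumption):

```python
import numpy as np

def tile_image(image, tile=128):
    """Crop a (H, W, C) image into non-overlapping tile x tile patches,
    discarding any ragged border."""
    h = image.shape[0] // tile * tile
    w = image.shape[1] // tile * tile
    patches = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patches.append(image[y:y + tile, x:x + tile])
    return patches

def split_dataset(patches, train_fraction=0.9, seed=42):
    """Shuffle the patches and split them 9:1 into training and
    validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(patches))
    cut = int(len(patches) * train_fraction)
    return [patches[i] for i in order[:cut]], [patches[i] for i in order[cut:]]
```

Fixed-size tiles keep GPU memory usage bounded regardless of the original scene size.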

We conducted several training runs of the glacier boundary recognition model with different batch sizes, obtained several model files, and evaluated their training performance. Figures 6 and 7 show the evolution of the relevant evaluation parameters over 200 and 500 epochs of training, respectively.

The analysis of the glacier segmentation results based on different model architectures was performed by continuously adjusting the parameters to obtain weight optimisation. The segmentation results for the test data based on Landsat 8 images are shown in Figure 8 in comparison to the corresponding original false colour images. We can see that the pixel points discriminated as glacier-like are given a green mask during the model prediction process, while the pixel points discriminated as non-glacier-like are left unchanged.

**Figure 6.** Model parameters after 200 epochs of training: (**a**) loss; (**b**) MIoU.

**Figure 7.** Model parameters after 500 epochs of training: (**a**) loss; (**b**) MIoU.

**Figure 8.** Comparison of the original image and the segmentation model results of glacier boundary part.

Judging from these remote sensing images alone is not comprehensive enough to quantitatively evaluate the trained glacier boundary recognition model. Therefore, we assessed the quality of the model training through the values of MPA, MIoU, and pixel accuracy, which reflect the accuracy of the model's discrimination. For the DeepLabv3+ model on the Landsat 8 dataset, MIoU was 90.85%, MPA 96.09%, and pixel accuracy 96.64%, all above 90%, indicating that our trained model can perform reasonable glacier boundary extraction.

In general, the DeepLabv3+ model can extract mudslide susceptibility information from remote sensing images well, but the evaluation needs to consider a variety of indicators and be combined with other factors, such as topography and rainfall, for a comprehensive analysis and judgement. At the same time, when making predictions, the model results need to be revised and validated against the experience of historical mudslide events and actual measurement data.

#### *4.3. Assessment Results of the Susceptibility of Glacial Debris Flow*

In the analysis of the susceptibility of glacier debris flows, we have addressed material source conditions, slope, precipitation, glacial geological conditions, etc.

We summarised eight influencing factors that have a significant impact on the occurrence of glacier debris flows: volume of physical source (*X*1), catchment area (*X*2), maximum daily rainfall (*X*3), longitudinal slope drop of the main gully (*X*4), length of the main gully (*X*5), glacier volume (*X*6), total glacial lake area (*X*7), and vegetation area (*X*8). We compiled a raw data matrix for 132 glacier debris flow gullies in Linzhi City characterised by these influencing factors.

To quantitatively analyse the formation factors of glacier debris flows, we normalised the raw data and used the entropy weighting method to derive the weight of each influencing factor on the susceptibility of glacier debris flows. The weights are shown in Table 1.


**Table 1.** Evaluation metrics based on the entropy weighting method (EWM).
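The entropy weight method used here can be sketched as follows (our own minimal implementation for benefit-type indicators on a normalised factor matrix; the exact normalisation used in the paper may differ):

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method for an (n_samples, n_factors) matrix of
    non-negative, normalised indicator values.

    Factors whose values are spread more unevenly across samples carry
    more information (lower entropy) and receive larger weights."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    P = X / X.sum(axis=0, keepdims=True)           # column-wise proportions
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(P > 0, P * np.log(P), 0.0)
    e = -plogp.sum(axis=0) / np.log(n)             # entropy per factor
    d = 1.0 - e                                    # degree of divergence
    return d / d.sum()                             # weights sum to 1
```

A factor that is identical across all gullies has maximum entropy and thus (near-)zero weight, which matches the intuition that it cannot discriminate between gullies.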

Combining the derived weights, we scored the mudslide gullies in the study area using the TOPSIS method and graded the results, as shown in Table 2.

**Table 2.** Scores for six glacier debris flows based on the TOPSIS method.
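The TOPSIS scoring step can be sketched as follows (a minimal implementation assuming all eight indicators are benefit-type; the paper's handling of any cost-type indicators may differ):

```python
import numpy as np

def topsis_scores(X, weights):
    """Score alternatives (rows of X) by relative closeness to the ideal.

    Each column is vector-normalised and multiplied by its entropy
    weight; each row is then scored by its distance to the per-column
    ideal worst, relative to its distances to the ideal best and worst.
    Scores lie in [0, 1]; higher means more susceptible."""
    X = np.asarray(X, dtype=float)
    V = X / np.linalg.norm(X, axis=0, keepdims=True) * np.asarray(weights)
    best, worst = V.max(axis=0), V.min(axis=0)     # ideal best / worst
    d_best = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    return d_worst / (d_best + d_worst)
```

A gully matching the ideal-best value in every factor scores 1, and one matching the ideal worst scores 0, which supports the graded susceptibility classes in Table 2.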


Figures 9 and 10 show our evaluation of the susceptibility of the six glacier debris flows in the study area alongside the susceptibility evaluation from previous studies.

The direct result of the remote-sensing-based susceptibility assessment of the G318 Linzhi section is an image map with differently coloured areas, each colour representing a mudslide susceptibility level: red represents very high susceptibility, orange high susceptibility, yellow medium susceptibility, and green low susceptibility. These coloured areas provide a visual indication of the mudslide susceptibility of the area and important reference information for the prevention of mudslide disasters. Compared with the mudslide susceptibility assessment of the glaciers obtained from field surveys, our evaluation results are highly accurate.

**Figure 9.** Glacial debris flow susceptibility mapping of the study area.

**Figure 10.** Previous glacial debris flow susceptibility mapping of the study area.

#### *4.4. Different Factors' Influence on the Results*

In conducting the glacial debris flow susceptibility analysis, we considered eight factors, namely material source volume, catchment area, maximum daily rainfall, longitudinal slope drop of the main gully, length of the main gully, glacier volume, total glacial lake area, and vegetation area, for comprehensive analysis and judgement.

#### 4.4.1. Volume of Material Source

Loose material is one of the most important preconditions for glacial debris flows, and it plays an important role in estimating the volume of the material source. The volume of the material source is estimated as the product of the material distribution area and the average thickness. Based on this method, combined with remote sensing images, we conclude that 60.1% of the 145 glacier debris flows in the study area have a material source volume greater than 10 × 10<sup>6</sup> m<sup>3</sup>, 34% are between 1 × 10<sup>6</sup> and 10 × 10<sup>6</sup> m<sup>3</sup>, and 5.9% are less than 1 × 10<sup>6</sup> m<sup>3</sup>. This shows that the glacier debris flows in this study area have abundant reserves of material sources, providing conditions for the initiation of glacier debris flows, as shown in Figure 11.

**Figure 11.** Comparison of the volume of the physical source in the study area.

#### 4.4.2. Catchment Area

The catchment area is the area within a valley or watershed over which water collects, shaped by landform, precipitation, snowmelt, and other factors. Within the catchment area, water from precipitation, snowmelt, and ice melt collects through the gullies and over the surface to become a river or stream, and eventually flows into the convergence point. The convergence point and its upstream area together form the glacier debris flow catchment area. There are 145 glacier debris flows in the study area, of which the smallest catchment is 0.93 km<sup>2</sup> and the largest is 349 km<sup>2</sup>, with 64.7% of the catchments measuring 10 km<sup>2</sup> to 100 km<sup>2</sup>. Glacial debris flows have a strong erosional and depositional effect on the landscape, forming natural catchment areas such as cirques and troughs, so the catchment areas in this study area are large, as seen in Figure 12.

#### 4.4.3. Maximum Daily Precipitation

The maximum daily precipitation is an important measure of rainfall. Among the many triggering factors for glacier debris flows, the role of rainfall is obvious. The study area is rich in rainfall and, according to statistics, the maximum daily rainfall in the Linzhi section is very close to the standard for heavy rainfall, which fully meets the requirements for the formation of glacier debris flows. The amount of rainfall influences the confluence on glacier debris flow slopes and the runoff in the gully. Strong rainfall can lead to erosion and collapse, while weak rainfall manifests itself as liquefaction of the glacier debris flow slope, as seen in Figure 13.

**Figure 12.** Comparison of catchment areas in the study area.

**Figure 13.** Comparison of maximum daily rainfall in the study area.

#### 4.4.4. Longitudinal Slope Drop of the Main Ditch

The longitudinal slope drop of the main ditch is the elevation change along the ditch in the flow direction relative to its length, and its value is determined by two basic parameters: the length of the main ditch and the height difference. The longitudinal slope drop of the main ditch is calculated as follows:

$$W = \frac{H}{L} \tag{4}$$

where *W* is the longitudinal slope drop of the main gully, *H* is the relative height difference of the watershed along the main gully, and *L* is the length of the main gully.
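For a hypothetical gully, Equation (4) reduces to simple arithmetic:

```python
# Longitudinal slope drop W = H / L for a hypothetical gully (illustrative values)
H = 1200.0   # relative height difference along the main gully (m)
W = H / 8000.0  # L = 8000 m main gully length; W is a dimensionless gradient
# W = 0.15, i.e. a 15% average longitudinal gradient
```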

The longitudinal slope drop of the main gully is one of the factors necessary to cause large-scale glacial debris flow hazards, and therefore the analysis of the longitudinal slope drop of the main gully is an integral part of the study of glacial debris flows, as seen in Figure 14.

**Figure 14.** Comparison of the longitudinal slope drop of the main ditch in the study area.

#### 4.4.5. Length of Main Gully

A main gully is a channel with a certain slope, a pronounced channel form, and a high water table or surface flow rate, formed by natural agents such as glaciers, rivers, and wind. The length of the main gully is the length of this channel within the gully belt or valley. It is closely related to the longitudinal slope drop discussed above, as seen in Figure 15.

#### 4.4.6. Glacier Volume

Modern glaciers are necessary for the formation of glacier debris flows, and glacier volume correlates positively with the number of glacier debris flows that occur. Most of the mudslide gullies in the study area contain large volumes of modern glacier ice. The freezing and thawing of glaciers produces loose solid material that can trigger glacier debris flows; the analysis of glacial debris flows therefore requires an analysis of the glacier volume in the study area. Combining two-dimensional remote sensing images with elevation reflects the volume of the glacier, and the water it stores, better than glacier area alone, so this value supports a more accurate susceptibility analysis than glacier area, as seen in Figure 16.

**Figure 15.** Comparison of the length of the main ditch in the study area.

**Figure 16.** Comparison of glacier volumes in the study area.
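The ablation-volume idea used in this study, combining the elevation change between two dates with the ablation area, can be sketched on a toy grid; the DEM values, glacier mask, and 30 m resolution below are assumptions for illustration only, not the study's actual data.

```python
import numpy as np

# Hypothetical DEM snapshots (elevations in m) over the same small footprint
dem_2015 = np.array([[4805.0, 4810.0],
                     [4798.0, 4802.0]])
dem_2020 = np.array([[4801.0, 4807.0],
                     [4796.0, 4800.0]])
glacier_mask = np.array([[True, True],
                         [True, False]])   # pixels mapped as glacier
cell_area = 30.0 * 30.0                    # ground area of one pixel (m^2)

dh = dem_2015 - dem_2020                   # surface lowering (m); positive = ice loss
dh = np.where(glacier_mask, dh, 0.0)       # restrict to the glacier area
ablation_volume = float(dh.sum() * cell_area)   # total ablation volume (m^3)
```

On real data the two DEMs must first be co-registered to a common grid, and the mask would come from the segmentation-derived glacier boundaries.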

#### 4.4.7. Total Glacial Lake Area

A glacial lake is a lake formed by glacial melt, flash floods, etc., with ice and glacial meltwater as its main components. The size of, and variation in, the glacial lake area can reflect the activity and melting rate of the glacier, and is important for the analysis of glacial debris flow susceptibility, as seen in Figure 17.

**Figure 17.** Comparison of the total glacial lake area in the study area.

#### 4.4.8. Vegetation Area

Vegetation plays a suppressive role in the formation of glacier debris flows, with its roots penetrating deep into the soil and interlocking in a web-like pattern, acting similarly to anchors, anchoring the soil against erosion and scouring. The area of vegetation is therefore also an important indicator of the formation and susceptibility of glacier debris flows, and is one of the factors necessary for their analysis, as seen in Figure 18.

**Figure 18.** Comparative map of vegetation area in the study area.

In addition, a number of other factors affect the accuracy of glacial debris flow susceptibility assessment. The quality of the remote sensing image data matters: the higher the quality of the data, the more reliable the prediction results. The granularity of the feature-class scheme matters: classes that are too fine or too coarse will affect the model's predictions. Parameter settings in the model, including multi-scale image cropping, the learning rate, and the number of training epochs, also affect the predictions. The characteristics of the samples themselves matter: if the sample classes are unbalanced, the predicted mudslide susceptibility class can easily deviate from reality. Finally, model structure and performance, including the type of architecture and the size of the convolution kernels, affect the prediction results.

Different combinations of the above factors may produce different forms of impact effects, such as less accurate prediction results, failure to meet accuracy requirements, or weaker generalisation ability. Therefore, when analysing the susceptibility of remote sensing images of glacier debris flows in the Linzhi section of National Highway G318, these factors need to be taken into account and optimised and adjusted in the process of model training and prediction in order to improve the prediction effect of the model. At the same time, a comprehensive analysis of the actual terrain and other important factors is also needed to obtain more scientific and reliable prediction results.

#### **5. Discussion**

In this paper, we investigate the susceptibility of glacier debris flow along the G318 Linzhi section based on remote sensing imagery and deep learning. The segmentation results demonstrate the effectiveness and accuracy of the method. Its strengths and limitations are discussed below, and our future work to address the drawbacks is noted.

#### *5.1. Advantages*

The application of remote sensing imagery and deep learning technology is an important advance for the field of glacier debris flow analysis. This method provides a more effective and accurate means to study and predict the susceptibility of glacial debris flows. In contrast, previous studies often relied on traditional methods, which were time-consuming and imprecise. The study of the Linzhi section of National Highway G318 is a case study, but its impact extends beyond a specific region. Glacial debris flow is a worldwide phenomenon, and the methods adopted in this study can be applied to other glacial regions around the world. This contributes to a global body of knowledge and provides a valuable tool for assessing and managing glacial debris flow risks in different regions.

The DeepLabv3+ model can significantly improve the accuracy and effectiveness of remote sensing image susceptibility analysis of glacier debris flow in the Linzhi section of the G318 National Highway. The model uses its high-resolution image segmentation capability to finely segment remote sensing images at the detail level and accurately extract key features of debris flow susceptibility [27]. In addition, the DeepLabv3+ model produces stable predictions: multi-scale input and atrous (dilated) convolution improve the robustness and noise resistance of the model and reduce errors and outliers in the susceptibility analysis of glacial debris flow. The model has strong adaptability and transferability, and can maintain its segmentation performance and obtain good predictions even in different study areas. In terms of susceptibility prediction, the model provides an accurate and intuitive susceptibility map of glacier debris flow, realizing the visualization of regional debris flow susceptibility information, which is of great significance for preventing and mitigating glacier debris flow disasters.

In addition, replacing the glacier area with the glacier ablation volume allows a more accurate assessment of glacier debris flow susceptibility. Susceptibility assessment based on deep learning reduces the high error rate of manual recognition while improving efficiency, especially for large-scale remote sensing images.

In summary, this study introduced the research methods, extracted the boundaries of glacier debris flows, obtained the susceptibility map of glacier debris flow, identified the risk factors, verified the results, provided guidance for disaster management, and advanced the study of glacier debris flow. These advances help us understand and mitigate the risks associated with glacier debris flow and enhance the safety and resilience of affected areas.

#### *5.2. Limitations*

From the literature, we know that in an evaluation of the changes in elevation and surface velocity of Iran's largest and most dynamic debris-covered glacier (Alamkouh Glacier) during 2018–2020, high-resolution UAV images were obtained and processed into a digital elevation model (spatial resolution of about 15 cm) and an orthophoto (spatial resolution of about 8 cm), from which changes in glacier thickness were derived. In our study, however, we only used satellite images, and the accuracy could not reach the centimetre level [28]. It is extremely difficult to detect glacier surface flow rates in rugged, alpine Himalayan terrain using traditional surface techniques. Karimi et al. used the differential band composite method for the first time to estimate glacier surface velocity in the non-debris-covered and debris-covered areas of a glacier, respectively, with considerable accuracy [29].

The assessment of glacier debris flow susceptibility requires a large amount of fine-grained data, and the acquisition of remote sensing images is critical to the accuracy of the assessment results. Remote sensing elevation data contain the regional topographic information needed for the susceptibility evaluation of glacial debris flow, which is of great significance. In practice, however, it is often difficult to obtain high-quality remote sensing images for glacier debris flow susceptibility assessment. First, to meet the evaluation needs, a large number of remote sensing images must be obtained continuously over a long time scale, and acquiring such continuous high-quality imagery is a formidable challenge. Second, the lack of high-quality remote sensing imagery in many areas requires an extensive review of relevant sources to obtain more comprehensive topographic information. Finally, remote sensing image pre-processing is also very complicated, usually requiring multiple steps to obtain high-quality data. Therefore, the acquisition and processing of high-quality remote sensing image data is an important part of glacier debris flow susceptibility assessment and deserves sufficient attention in the evaluation work.

The integration of remote sensing imagery and deep learning technology is an important advance in the field of glacier debris flow analysis. While this study demonstrates the effectiveness of these methods in the specific context of the G318 Linzhi section, it is necessary to evaluate how these techniques compare with existing methods used in other parts of the globe. Exploring the limitations and potential improvements of these techniques helps to build a more complete understanding of their applicability and effectiveness in different glacial environments.

#### *5.3. Outlook*

Glacial debris flow susceptibility analysis using deep learning and remote sensing imagery has a wide range of applications. To improve the accuracy of the glacier debris flow susceptibility analysis model, the integration of multiple sources of information, including topographic data, land cover data, and meteorological data, can be considered. In addition, integrating multiple remote sensing modalities, such as optical, infrared, and radar imagery, can help improve the robustness and prediction accuracy of the model. Once the model can identify areas of glacier debris flow susceptibility, the scope and size of the training sets, and the use of global or national remote sensing data from international agencies, will need to be expanded to further improve the generalisability and accuracy of the method.

#### **6. Conclusions**

In this research, we explored the susceptibility of glacier debris flow along the G318 Linzhi section using remote sensing imagery and deep learning techniques. We draw the following conclusions: (1) the precursors of glacier debris flows can be monitored and warned of in real time using remote sensing technology; (2) glacier retreat and glacial lake formation are important factors in glacier debris flow susceptibility areas and should be given priority consideration; (3) with the help of remote sensing data, slopes, river valleys, cliffs, and water bodies can be effectively identified; and (4) by obtaining glacier morphological parameters, topographic slope, and elevation data, and using deep learning techniques to construct complex predictive models, we can predict the susceptibility of glacier debris flows more accurately.

In the future, we will consider how to more accurately monitor changes in glacier edges, and use state-of-the-art deep learning methods to address monitoring changes in glacier edges, observing changes in glaciers, and providing data support for applications such as environmental protection.

**Author Contributions:** Conceptualization, J.C., G.M., H.G., L.H. and R.Y.; methodology, J.C., G.M., H.G., L.H. and R.Y.; writing—original draft preparation, J.C. and G.M.; writing—review and editing, J.C., G.M., H.G., L.H. and R.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by China University of Geosciences (Beijing) Student Innovation and Entrepreneurship Training Programme, Category A Project 202311415016.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank the editor and the reviewers for their contributions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

DEM Digital elevation model


#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **A Robust Deep-Learning Model for Landslide Susceptibility Mapping: A Case Study of Kurdistan Province, Iran**

**Bahareh Ghasemian 1, Himan Shahabi 1,\*, Ataollah Shirzadi 2, Nadhir Al-Ansari 3, Abolfazl Jaafari 4, Victoria R. Kress 5, Marten Geertsema 6, Somayeh Renoud <sup>7</sup> and Anuar Ahmad <sup>8</sup>**


**Abstract:** We mapped landslide susceptibility in Kamyaran city of Kurdistan Province, Iran, using a robust deep-learning (DP) model based on a combination of extreme learning machine (ELM), deep belief network (DBN), back propagation (BP), and genetic algorithm (GA). A total of 118 landslide locations were recorded and divided into training and testing datasets. We selected 25 conditioning factors, and of these, we identified the most important ones by an information gain ratio (IGR) technique. We assessed the performance of the DP model using statistical measures including sensitivity, specificity, accuracy, F1-measure, and area under the receiver operating characteristic curve (AUC). Three benchmark algorithms, i.e., support vector machine (SVM), REPTree, and NBTree, were used to check the applicability of the proposed model. The IGR results showed that of the 25 conditioning factors, only 16 were important for our modeling procedure, and of these, distance to road, road density, lithology, and land use were the four most significant. Results based on the testing dataset revealed that the DP model had the highest accuracy (0.926) of the compared algorithms, followed by NBTree (0.917), REPTree (0.903), and SVM (0.894). The landslide susceptibility map prepared from the DP model, with AUC = 0.870, performed the best. We consider the DP model a suitable tool for landslide susceptibility mapping.

**Keywords:** landslide susceptibility; extreme learning machine; deep belief network; genetic algorithm; GIS; Iran

#### **1. Introduction**

Landslides occur in a variety of materials and undergo various styles of movement at different rates [1]. Landslides play an important geomorphological role in the evolution of landscapes, impacting the natural (soils, ecosystems, aquatic habitat, etc.) and built (residential areas, roads, pipelines, etc.) environment [2,3]. Landslide hazards are often exacerbated by land use practices such as road building, and deforestation, and may be made worse by increases in precipitation [4]. Therefore, it is important to identify areas that have a high potential for landslides and mitigate landslide damage.

**Citation:** Ghasemian, B.; Shahabi, H.; Shirzadi, A.; Al-Ansari, N.; Jaafari, A.; Kress, V.R.; Geertsema, M.; Renoud, S.; Ahmad, A. A Robust Deep-Learning Model for Landslide Susceptibility Mapping: A Case Study of Kurdistan Province, Iran. *Sensors* **2022**, *22*, 1573. https:// doi.org/10.3390/s22041573

Academic Editors: Junwei Ma and Jie Dou

Received: 24 December 2021 Accepted: 26 January 2022 Published: 17 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Landslide risk assessment methodologies can be classified into three dominant groups: qualitative, quantitative, and artificial intelligence approaches. Qualitative approaches often rely on air photo and field interpretation and expert judgment (e.g., Schwab and Geertsema [5]). Quantitative methods are based on mathematical rules and expert judgment [6]. Artificial intelligence techniques can use subjective knowledge or pattern recognition techniques to solve a set of mathematical equations. Selection of the most appropriate model is usually based on the type of data available, the scale of the case study and analysis, and the knowledge of the researcher [7].

In recent decades, with the rapid development of geographic information systems (GIS) and remote sensing (RS) techniques and improvements in the computing power of artificial intelligence algorithms, machine learning has played an important role in increasing the accuracy and reliability of landslide predictions [8]. Machine learning methods depend on field observations and statistical calculations [9]. Machine learning uses computer algorithms for analyzing and forecasting information by learning from training datasets [10]. These methods have a high ability to detect landslide occurrence behavior using distribution estimation algorithms, they have a data-driven nature, and they exploit high repetition of the modeling process. In several studies, these methods have proven their comparative advantage over bivariate and multivariate statistical models [11,12].

Several machine-learning methods have been applied in landslide susceptibility assessment, such as logistic regression (LR) [13], naive Bayes (NB) [14,15], fuzzy logic (FL) [16], support vector machines (SVM) [17–19], kernel logistic regression (KLR) [20,21], Bayesian logistic regression (BLR) [17,22], artificial neural network (ANN) [23,24], random forest (RF) [25–28], rotation forest [29,30], random subspace (RS) [31], neuro-fuzzy inference system (ANFIS) [32,33], decision tree (DT) [26], classification and regression tree (CART) [34–36] and many other methods [37].

Despite the logical results and high performance of different models, geoscientists are always looking for new methods to more accurately identify landslide-prone areas and produce reliable maps needed for environmental planning. Therefore, presenting a new approach based on artificial intelligence algorithms, deep learning and GIS-RS techniques for landslide modeling is of high necessity in landslide hazards management [38].

One of the major challenges in mountainous areas is the occurrence of landslides, which are naturally inevitable and cannot be completely prevented, but can be managed. In this study, we look for a technique that combines several methods to achieve an algorithm with higher predictive power than conventional machine-learning algorithms for predicting landslide-prone areas. We applied and developed a deep-learning model based on ELM, DBN, and BP optimized by GA for landslide susceptibility mapping. This model has been used earlier for spatial prediction of floods [39] and has also been applied to predict cancer [40]; however, its ability has not yet been evaluated for landslide susceptibility mapping. The model was validated with statistical measures and compared with some state-of-the-art benchmark machine-learning algorithms including SVM, NBTree, and REPTree. We developed this model in MATLAB 2018a, and all landslide susceptibility maps were produced in ArcGIS 10.5. The purpose of this study is to evaluate a robust deep-learning model that will support landslide susceptibility mapping. Here, we build on previous landslide susceptibility modelling for this study area using landslide data from Asadi et al. [41], but with a different set of algorithms.

#### **2. Study Area and Data**

#### *2.1. Description of the Study Area*

Our study area, around Kamyaran city, is a mountainous area of nearly 150 km<sup>2</sup> in the southwest of the Kurdistan Province, Iran (Figure 1). The elevation ranges from 850 to 2328 m, the mean annual temperature varies between 11.3 °C and 17 °C, and the mean annual precipitation is 528 mm. Geologically, the study area is in the Sirvan drainage basin, located in the structural zones of Sanandaj-Sirjan and Zagros. Bedrock lithologies include outcrops of Cretaceous to Quaternary rocks, the oldest of which include micritic limestone and dark gray shale. Most of the study area is covered by Mesozoic and Cretaceous formations, which include basaltic pillow lava and dark grey shale with intercalations of volcanic rocks. Holocene sediments include alluvial fans and alluvial terraces. The predominant land covers in the study area are semi-dense forest and dry farming. In addition, dense pasture, semi-dense pasture, low-density forest, and woodlands are other types of land cover/land use in the study area. The area is significantly prone to landslides associated with road developments.

**Figure 1.** Geographical location of the study area in (**a**) Iran and (**b**) Kurdistan province.

#### *2.2. Data*

2.2.1. Landslide Inventory Map

It is necessary to prepare a landslide distribution map for landslide modeling because the assumption of the modeling process is that future landslides occur in the same conditions as in the past [42]. That "*the past and the present are key to the future*" is one of the most important principles in earth science. This means landslides that have occurred in the past and present under specific topographic, geological, hydrogeological, and climatic conditions in an area can provide useful information to predict the potential for future landslides in that area [43]. A map showing such information is useful for studying the spatial relationships between landslide distribution and factors affecting landslide occurrence [44]. Galli et al. [45] have mentioned that the quality of a landslide inventory map can lead to reasonable results in landslide modeling. From a total of 118 landslide points detected in the study area, 94 points (~80%) were used as the training dataset, and 24 points (~20%) were considered as the validation dataset. A total of 118 landslide locations used in this study were a part of a total of 175 landslide locations of Asadi et al. [41].
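The ~80/20 split of the 118 landslide points can be reproduced schematically as follows; the random seed is arbitrary, and the study's actual partition is not implied.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # arbitrary seed for reproducibility

# 118 landslide points, split ~80% training / ~20% validation as in the paper
n_points = 118
indices = rng.permutation(n_points)    # shuffle point indices
n_train = int(round(0.8 * n_points))   # 94 points
train_idx = indices[:n_train]
test_idx = indices[n_train:]           # remaining 24 points
```

In practice, an equal number of non-landslide (absence) points would also be sampled and split the same way to form the binary classification datasets.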

#### 2.2.2. Landslide Conditioning Factors

The selection of the factors affecting the occurrence of landslides is one of the most important steps in landslide susceptibility studies [46]. In this study, we selected 25 conditioning factors that were slope angle, aspect, elevation above sea level, curvature, profile curvature, plan curvature, solar radiation, valley depth (VD), terrain ruggedness index (TRI), vector ruggedness measure (VRM), stream power index (SPI), topographic wetness index (TWI), length slope (LS), topographic position index (TPI), land use, normalized difference vegetation index (NDVI), lithology, soil, distance to fault, distance to river, distance to road, fault density, river density, road density, and rainfall (Figure 2). We used some conditioning factors for this study area that were earlier published by Asadi et al. (2022).

**Figure 2.** *Cont*.

**Figure 2.** *Cont*.

**Figure 2.** Landslide conditioning factors used in this study: (**a**) slope angle, (**b**) aspect, (**c**) elevation, (**d**) curvature, (**e**) plan curvature, (**f**) profile curvature, (**g**) solar radiation, (**h**) VRM, (**i**) VD, (**j**) SPI, (**k**) TWI, (**l**) TRI, (**m**) TPI, (**n**) LS, (**o**) land use, (**p**) NDVI, (**q**) rainfall, (**r**) distance to fault, (**s**) distance to road, (**t**) distance to river (**u**), fault density, (**v**) road density, (**w**), river density, (**x**) lithology, and (**y**) soil texture.

#### Slope Angle

Landslide hazard is often linked to slope angle, with shear stresses increasing on steeper slopes [47]. The supply of soil available for sliding often thins dramatically on steeper slopes above 25 degrees [48]. In other words, on steep slopes the material is more often rock and outcrops, such that medium slopes are more prone to landslides. This layer was extracted from the digital elevation model (DEM) and classified into five intervals: 0–13, 14–22, 23–30, 31–42, and >43 degrees (Figure 2a).
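As a sketch of how a slope-angle layer is derived from a DEM (GIS tools use a similar finite-difference scheme, though not necessarily this exact one), on a toy ramp:

```python
import numpy as np

def slope_deg(dem, cell_size):
    """Slope angle (degrees) from a DEM via finite differences."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)          # per-axis elevation gradients
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Hypothetical 3x3 DEM (m) on a 30 m grid: a uniform ramp rising eastward
dem = np.array([[100.0, 130.0, 160.0],
                [100.0, 130.0, 160.0],
                [100.0, 130.0, 160.0]])
slope = slope_deg(dem, 30.0)   # 30 m rise over 30 m run -> 45 degrees everywhere
```

The resulting raster would then be reclassified into the interval bins listed above.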

#### Aspect

Slope direction affects the occurrence of landslides by controlling the parameters related to soil moisture concentration, sunlight, dry winds, rainfall (saturation degree), and discontinuities [49]. This layer was extracted from DEM and categorized into nine classes: flat (−1–39.08), north (39.08–79.16), northeast (79.16–119.24), east (119.246–159.32), southeast (159.32–199.41), south (199.41–239.49), southwest (239.49–279.57), west (279.57–319.65), and northwest (319.65–359.74) (Figure 2b).

#### Elevation

The influences of elevation on landslides are often displayed as indirect relationships or by means of other factors [50]. The altitude of a region is one of the effective layers in creating slope instabilities, as it indirectly controls many causes of landslides such as annual rainfall, heavy rainfall, temperature, frost changes, ice melting, etc. [51]. The maximum elevation of the region is 2328 m and the minimum elevation is 850 m, hence the total relief is 1478 m. The elevation map was extracted from the DEM and then classified into eight classes: (1) 850–1000, (2) 1000–1200, (3) 1200–1400, (4) 1400–1600, (5) 1600–1800, (6) 1800–2000, (7) 2000–2200, and (8) 2200–2400 (Figure 2c).

#### Curvature

Curvature maps show the extent to which the surface deviates from flatness, or in other words, the convexity and concavity of the slope [52]. The curvature of the slope represents the shape of the topography: positive values indicate pixels on convex surfaces, negative values indicate pixels on concave surfaces (coves, hollows, noses), and zero indicates a straight surface with no curvature. These three slope shapes have a great effect on slope instability by controlling the concentration and diffusion of surface and subsurface water on the slopes [53]. The convexity and concavity of the slope curvature map were extracted from the DEM of the region, using distances between consecutive topographic lines in the GIS, and classified into five classes: (1) highly concave (−51.20)–(−3.79), (2) concave (−3.79)–(−1.12), (3) flat (−1.12)–(0.54), (4) convex (0.54)–(3.21), and (5) very convex (3.21)–(33.9) (Figure 2d).

#### Plan Curvature

Plan curvature indicates changes in direction along a curve. This factor affects the divergence and convergence of water and materials composing a landslide along the path of motion. Plan curvature was extracted from DEM and divided into five classes: (1) [(−28.51)–(−1.43)], (2) [(−1.43)–(−0.44)], (3) [(−0.44)–(0.34)], (4) [(0.34)–(1.53)], and (5) [(1.53)–(21.09)] (Figure 2e).

#### Profile Curvature

Profile curvature is an important factor that affects the stress resistance due to landslides in the path and indicates the intensity of water flow and the transport and deposition processes [54]. Positive values of profile curvature indicate concavity (a decrease in flow velocity) and negative values indicate convexity (an increase in flow velocity) [55]. Profile curvature was extracted from DEM and constructed in five categories: (1) [(−23.05)–(−2.29)], (2) [(−2.29)–(−0.519)], (3) [(−0.519)–(0.272)], (4) [(0.272)–(2.05)], and (5) [(2.05)–(27.4)] (Figure 2f).

#### Solar Radiation

The average convergence of solar radiation per pixel over a year is called the intensity of solar radiation, which is expressed in kilowatt hours per square meter [56]. The importance of this index is that a larger value indicates more evaporation from the soil surface in an area. This index also controls the amount of vegetation on the slope. The less solar radiation that reaches a slope, the more vegetation appears on the slope, and as a result, the slope becomes more stable [57,58]. In the present study, the solar radiation layer was extracted from DEM in ArcGIS and categorized into five classes: (1) 380,000–430,000, (2) 440,000–540,000, (3) 550,000–630,000, (4) 640,000–700,000, and (5) 710,000–810,000 (Figure 2g).

#### Vector Ruggedness Measure

The vector ruggedness measure (VRM) was suggested by Hobson et al. [59]. It measures terrain ruggedness as the variation in the three-dimensional orientation of grid cells within a neighborhood: slope and aspect are captured in a single measure, which decouples terrain ruggedness from slope or elevation alone [60]. The VRM map was created from DEM in the SAGA GIS software environment and then divided into five classes: (1) 0–0.0302, (2) 0.0303–0.0795, (3) 0.0796–0.151, (4) 0.152–0.274, and (5) 0.275–0.699 (Figure 2h).

#### Valley Depth

The valley depth (VD) factor can also be considered one of the fundamental layers in assessing landslide susceptibility. This index was prepared based on the DEM in the SAGA GIS software, and after exporting to ArcGIS it was classified into five classes: (1) 0–37.9, (2) 38–87.7, (3) 87.8–149, (4) 150–233, and (5) 234–508 m (Figure 2i).

#### Stream Power Index

Stream Power Index (SPI) is a criterion derived from the DEM that might affect landslide occurrence, and it reflects the erosive power of slope surface run-off [61,62]. It can be formulated as follows:

$$\text{SPI} = \text{A}\_{\text{s}} \times \tan{\beta} \tag{1}$$

where As is the specific basin area and tan β represents the slope angle. In this study, the SPI layer was prepared based on the DEM in the SAGA GIS software and then exported to ArcGIS for mapping. It was classified into five intervals: (1) 0–1510, (2) 1520–1600, (3) 1610–3110, (4) 3120–26,500, and (5) 26,600–390,000 (Figure 2j).
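As a concrete illustration, Equation (1) can be evaluated cell-by-cell on raster grids with NumPy. This is a minimal sketch, not the paper's workflow: the `flow_acc` (specific catchment area) and `slope_deg` (slope in degrees) arrays are hypothetical inputs, since the paper derives both layers from the DEM in SAGA GIS.

```python
import numpy as np

def stream_power_index(flow_acc, slope_deg):
    beta = np.radians(slope_deg)  # convert slope angle to radians for tan
    return flow_acc * np.tan(beta)  # SPI = A_s * tan(beta), Eq. (1)

flow_acc = np.array([[10.0, 50.0], [200.0, 1500.0]])   # specific catchment area
slope_deg = np.array([[5.0, 10.0], [20.0, 30.0]])      # slope angle, degrees
spi = stream_power_index(flow_acc, slope_deg)
```

Cells combining a large contributing area with a steep slope receive the highest SPI, matching the interpretation of SPI as erosive power of run-off.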

#### Topographic Wetness Index

Topographic wetness index (TWI) represents a theoretical component of flow accumulation at any point in a watershed or region that is used to describe the spatial pattern of soil moisture [63]. This index generally quantifies topographic control over hydrological processes, and high values are typically found within landslide bodies. The TWI can be formulated as follows:

$$\text{TWI} = \text{Ln}\left(\frac{\text{A}\_{\text{s}}}{\tan \beta}\right) \tag{2}$$

where As is the cumulative drainage area upstream of a point and tan β is the slope angle at that point. The TWI was prepared in five classes: (1) 0.0895–2.62, (2) 2.63–3.32, (3) 3.33–4.15, (4) 4.16–6.26, and (5) 6.26–10.70 (Figure 2k).
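Equation (2) can be sketched the same way. The small `eps` term, an implementation choice not stated in the paper, keeps flat cells (tan β = 0) from producing infinite values:

```python
import numpy as np

def topographic_wetness_index(flow_acc, slope_deg, eps=1e-6):
    # TWI = ln(A_s / tan(beta)), Eq. (2); eps keeps flat cells finite
    beta = np.radians(slope_deg)
    return np.log(flow_acc / (np.tan(beta) + eps))

twi = topographic_wetness_index(np.array([100.0]), np.array([45.0]))
```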

#### Terrain Ruggedness Index

Terrain Ruggedness Index (TRI) was introduced by Riley [64]; it expresses the difference in height between one pixel and the eight pixels around it. Equation (3) is provided to calculate this index:

$$\text{TRI} = \sqrt{\sum\_{p=1}^{8} \left( Z\_{p} - Z\_{0} \right)^{2}} \tag{3}$$

where Z0 is the height of the central pixel and Zp is the height of each of its eight neighboring pixels. The TRI map was prepared in five classes: (1) 0–2.64, (2) 2.65–4.75, (3) 4.76–7.74, (4) 7.75–13.4, and (5) 13.5–44.9 (Figure 2l).
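Riley's index (the root of the summed squared height differences to the eight neighbors) can be sketched in NumPy. The edge-padding-by-replication choice for border cells is an assumption of this sketch, not stated in the paper:

```python
import numpy as np

def terrain_ruggedness_index(dem):
    # Pad edges by replication so border cells still have 8 neighbors
    z = np.pad(dem.astype(float), 1, mode="edge")
    center = z[1:-1, 1:-1]
    sq_diff = np.zeros_like(center)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the central cell itself
            shifted = z[1 + di:z.shape[0] - 1 + di, 1 + dj:z.shape[1] - 1 + dj]
            sq_diff += (shifted - center) ** 2
    return np.sqrt(sq_diff)

dem = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 1.0]])
tri = terrain_ruggedness_index(dem)  # centre cell: sqrt(8 * 1^2)
```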

#### Topographic Position Index

Topographic position index (TPI) compares the height of each pixel in the digital elevation model with the specified pixel around that pixel [65]. To calculate TPI (Equation (4)), the height of each cell in a digital elevation model compared with the average height of neighboring cells is examined. Finally, the average height decreases from the height value in the center. Areas higher than the surrounding points (hills) are indicated by positive TPI values; negative TPI values denote areas lower than their surroundings (valleys). Zero and near-zero values also illustrate flat areas (where the slope is close to zero) or areas with a fixed slope [66].

$$\text{TPI} = Z\_{0} - \frac{\sum\_{i=1}^{n} Z\_{i}}{n} \tag{4}$$

where Z0 is the height of the point under evaluation, Zi is the height of the *i*-th surrounding grid cell, and *n* is the total number of surrounding points considered in the evaluation. We prepared TPI in five classes: (1) (−75.7)–(−9.77), (2) (−9.77)–(−2.83), (3) (−2.83)–(2.94), (4) (2.94)–(11.03), and (5) (11.03)–(71.7) (Figure 2m).
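The sign interpretation above (positive on hills, negative in valleys) is easy to verify with a minimal sketch of Equation (4) on a 3 × 3 neighborhood (n = 8); edge padding by replication is an assumption of this toy, and a real TPI layer would come from the DEM in GIS:

```python
import numpy as np

def topographic_position_index(dem):
    z = np.pad(dem.astype(float), 1, mode="edge")
    total = np.zeros_like(dem, dtype=float)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the central cell itself
            total += z[1 + di:z.shape[0] - 1 + di, 1 + dj:z.shape[1] - 1 + dj]
    return dem - total / 8.0  # cell height minus mean of its 8 neighbors

dem = np.zeros((3, 3))
dem[1, 1] = 8.0  # an isolated hilltop
tpi = topographic_position_index(dem)
```

The hilltop cell gets a positive TPI, while the cells around it (lower than the hilltop they neighbor) get negative values.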

#### Slope Length

The slope length (LS) factor, which combines the slope angle and the length of the slope, is a fundamental factor in the study of landslides because it refers to the sediment transport capacity of overland (direct) flow. Carrara [67] stated that there is a relationship between landslide density and slope length; therefore, this factor is examined in this study. Mathematically, this equation is expressed as:

$$\text{LS} = \left(\frac{A\_{\text{s}}}{22.13}\right)^{0.4} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \tag{5}$$

where As is the specific catchment area and β is the local slope gradient. This index was prepared based on the DEM in the SAGA GIS software, and after exporting to the ArcGIS environment it was classified into five classes: (1) 0–6.88, (2) 6.89–13.1, (3) 13.2–19.6, (4) 19.7–28.2, and (5) 28.3–87.8 (Figure 2n).
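A quick numeric sanity check of Equation (5): when A_s = 22.13 and sin β = 0.0896, both ratios equal 1 and so does LS. The input values here are hypothetical, chosen only to exercise the formula:

```python
import numpy as np

def ls_factor(area, slope_deg):
    # LS = (A_s / 22.13)^0.4 * (sin(beta) / 0.0896)^1.3, Eq. (5)
    beta = np.radians(slope_deg)
    return (area / 22.13) ** 0.4 * (np.sin(beta) / 0.0896) ** 1.3

# At A_s = 22.13 and sin(beta) = 0.0896 both ratios are 1, so LS = 1
ls = ls_factor(22.13, np.degrees(np.arcsin(0.0896)))
```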

#### Land Use/Land Cover

Land use is one of the important indicators in the instability of slopes, and it affects the characteristics of the land and changes its behavior [53]. In this study, the land use layer was prepared and extracted from an Iranian land use map. Land use/cover classes in the current research are dry-farming, semi-dense forest, low-dense forest, semi-dense pasture, dense pasture, and woodland (Figure 2o).

#### Normalized Difference Vegetation Index

The normalized difference vegetation index (NDVI) factor shows the ability to detect growth and vegetation levels in an area [68,69]. It is obtained from the reflectance values of the red (visible) band (0.6–0.7 μm) and the near-infrared (NIR) band (0.7–1.1 μm). Equation (6) is used to calculate this index:

$$\text{NDVI} = (\text{NIR} - \text{RED}) / (\text{NIR} + \text{RED}) \tag{6}$$

The minimum and maximum values of this index, respectively, are (−1) and (+1). The NDVI map was produced in five classes: (1) (−0.351)–(−0.064), (2) (−0.064)–(0.008), (3) (0.008)–(0.099), (4) (0.099)–(0.260) and (5) (0.260)–(0.759) (Figure 2p).
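Equation (6) is a per-pixel operation that vectorizes directly. The reflectance values and the `eps` guard against zero denominators are assumptions of this sketch; in the paper the bands come from the satellite imagery:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    # NDVI = (NIR - RED) / (NIR + RED), Eq. (6); eps guards zero denominators
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)

values = ndvi(np.array([0.6, 0.3]), np.array([0.2, 0.3]))
```

Dense vegetation (high NIR, low red) approaches +1, while bare ground with equal reflectance in both bands yields 0.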

#### Rainfall

Rainfall intensity and duration play a major role in landslide initiation [70]. Here, we obtained the rainfall data of eight meteorological stations from the Iranian Meteorological Organization. A rainfall map of the area was built with the inverse distance weighting (IDW) method with four classes: (1) 438–440, (2) 440–480, (3) 480–520, and (4) 520–560 mm (Figure 2q).
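The IDW interpolation used for the rainfall map can be sketched as follows. Station coordinates and rainfall values here are hypothetical; the paper interpolates eight real stations in ArcGIS:

```python
import numpy as np

def idw(stations_xy, values, targets_xy, power=2, eps=1e-12):
    # Weight each station by inverse distance: w_i = 1 / d_i^p;
    # eps keeps weights finite when a target coincides with a station
    d = np.linalg.norm(targets_xy[:, None, :] - stations_xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)
    return (w @ values) / w.sum(axis=1)

stations = np.array([[0.0, 0.0], [1.0, 0.0]])
rain = np.array([440.0, 520.0])
est = idw(stations, rain, np.array([[0.5, 0.0], [0.0, 0.0]]))
```

A target halfway between the two stations gets the average of their values, while a target on top of a station essentially reproduces that station's value.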

#### Distance to Fault

Large-scale structures such as faults and thrusts can influence the distribution of landslides [71]. In this study, distance to fault was calculated by the "Euclidean Distance" tool in ArcGIS software, in terms of distance from each pixel from the study area to the nearest fault. Based on these results, buffers were constructed around the fault with distances of 100 m, and this map was extracted into five classes: (1) 0–100, (2) 101–200, (3) 201–300, (4) 301–400, and (5) >400 (Figure 2r).

#### Distance to Roads

Both cut and fill slopes and improper road drainage structures associated with road construction can contribute to slope instability [72]. In this study, distance to road was calculated by the "Euclidean Distance" tool in ArcGIS software, in terms of distance from each pixel from the study area to the nearest road. Distance to roads was mapped with five categories: (1) 0–100, (2) 101–200, (3) 201–300, (4) 301–400, and (5) >400 m (Figure 2s).

#### Distance to Rivers

Another conditioning factor that directly impacts landslide susceptibility is distance to river. Flowing water is one of the factors increasing the potential for instability in the slopes, playing an effective role in mass movements. Distance to river was calculated by the "Euclidean Distance" tool in the ArcGIS software in meters of each pixel from the study area to the nearest stream line. The map was created with five classes: (1) 0–100, (2) 101–200, (3) 201–300, (4) 301–400, and (5) >400 m (Figure 2t).
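The three distance layers (fault, road, river) share the same logic: the distance from each pixel to the nearest feature, then the five buffer classes. A brute-force NumPy sketch with hypothetical coordinates in metres (the paper uses ArcGIS's raster "Euclidean Distance" tool):

```python
import numpy as np

def distance_to_features(pixels_xy, features_xy):
    # Pairwise distances, then the nearest feature per pixel
    d = np.linalg.norm(pixels_xy[:, None, :] - features_xy[None, :, :], axis=2)
    return d.min(axis=1)

def classify_distance(dist_m):
    # Five classes from the paper: (1) 0-100, (2) 101-200, (3) 201-300,
    # (4) 301-400, (5) >400 m; right=True keeps 100 m in class 1, etc.
    return np.digitize(dist_m, bins=[100, 200, 300, 400], right=True) + 1

pixels = np.array([[0.0, 50.0], [0.0, 250.0], [0.0, 500.0]])
features = np.array([[0.0, 0.0]])   # e.g., a single digitized river vertex
classes = classify_distance(distance_to_features(pixels, features))
```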

#### Fault Density

Slope instabilities are more likely to occur in areas where the number of faults is high and particularly when the faults are active [73]. Fault density is the ratio of the total length of faults in a given watershed or a given area to the total area of the watershed or the area surrounding those faults [74]. The higher the density of faults in an area, the greater the fracturing of the rocks and the lower the shear strength of the rocks and slope materials due to weathering. As a result, the risk of slope instability and landslides on the slopes increases [75]. Fault density was extracted with five classes: (1) 0–0.67, (2) 0.671–1.84, (3) 1.85–3.01, (4) 3.02–4.41, and (5) 4.42–7.12 km/km² (Figure 2u).

#### Road Density

Road density is the ratio of the total length of roads in a given watershed or a given area to the total area of the watershed or the area surrounding those roads [76]. Although the quality of roads and drainage control are important, road density can also influence landslide occurrence [77]. Road density was calculated using the "Line density" tool in the ArcGIS software for modeling, and the factor was classified into five classes: (1) 0–0.440, (2) 0.440–1.210, (3) 1.210–1.914, (4) 1.914–2.772, and (5) 2.772–5.610 km/km² (Figure 2v).

#### River Density

Another influence controlling landslides is river density [78]. River density is the ratio of the total length of rivers in a given watershed within a given area to the total area of a watershed or area containing those rivers [79]. We used the "Line density" tool in the ArcGIS software to extract five classes of river density: (1) 0–0.5551, (2) 0.5551–1.4608, (3) 1.4608–2.4542, (4) 2.4542–3.7983, (5) 3.7983–7.4505 (Figure 2w).

#### Lithology

Lithology often strongly influences slope stability [80], in part due to variable strength characteristics of certain bedrock types [81]. Therefore, to determine the susceptibility of various lithological formations to produce landslides, we extracted lithological units of the case study of Kamyaran geology sheet with a scale of 1:100,000. The number of lithological units in the study area was divided into 10 classes (Figure 2x).

#### Soil Texture

Landslides that involve soils are influenced by the type of soil in which they occur [82]. Soil texture influences properties such as permeability and cohesion, which can influence the style of movement [83]. Primarily, landslides change soil features by exposing parent material (the C horizon) through removal of the organic mats and the A horizon [84]. Changes in soil texture occur when a landslide moves or deposits various materials at a specific location [85]. From the study area, 20 soil samples in different lithological units were collected to determine soil texture using the hydrometer method. We used the soil texture triangle to classify textural groups. The soil map was created with five classes: (1) silty loam, (2) clay loam, (3) loam, (4) sandy loam, and (5) silty clay (Figure 2y).

#### **3. Modeling Process**

Figure 3 shows the workflow of our study. In step 1, we collect and interpret landslide-conditioning factors. In step 2, we divide landslide locations into the training and the validating datasets. In step 3, we conduct landslide modeling using the DL (deep learning) model and the three benchmark models (SVM, NBTree, and REPTree). In the DL model, we computed landslide susceptibility indices (LSIs) for each pixel of the study area in five steps: (i) constructing the DBN using restricted Boltzmann machines (RBMs) for pretraining on the dataset; (ii) tuning parameters in the ELM to obtain the weight matrix from the last RBM; (iii) fine-tuning the training of the whole network by BP; (iv) optimizing the obtained weights with the genetic algorithm (GA); and (v) assigning the optimized weights to each pixel of the study area to map the landslide susceptibility. In step 4, we generate the landslide susceptibility maps using the outcomes of step 3. Finally, we compare and validate the performance of the models using a suite of statistical measures.

**Figure 3.** Flowchart of the study.

#### **4. Mathematical Background of the Methods**

#### *4.1. Deep Belief Network*

One of the most common techniques for training deep neural networks is unsupervised pretraining, which initializes the network using only unlabeled data. Network initialization has been shown to provide a good starting point for subsequent supervised fine-tuning, and it greatly reduces the risk of being trapped in a local minimum according to Kustikova and Druzhkov [86]. One of the methods used to train deep networks is the deep belief network (DBN). The DBN [87,88] has become a popular approach in machine learning due to its advantages such as fast inference and the ability to encode richer, higher-order network structures. A DBN stacks several restricted Boltzmann machines in a hierarchical structure and operates through a layer-by-layer learning process [89]. A deep belief network with two restricted Boltzmann machines for a problem with n inputs and one output is shown in (Figure 4).

**Figure 4.** Deep belief network model used in the study.

Usually, with pretraining, the deep belief network training process includes the following steps [90]:

Step 1: Pretraining step: a sequential training of learning modules, greedily, one layer at a time, using unsupervised data;

Step 2: First fine-tuning step: use random weights for the last layer (matrix W3 in Figure 4);

Step 3: Second fine-tuning step: use back propagation to fine-tune the entire network using supervised data.
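The greedy layer-wise pretraining of Step 1 can be sketched with a minimal Bernoulli RBM trained by one-step contrastive divergence (CD-1). This is an illustrative NumPy toy, not the authors' implementation: layer sizes, learning rate, and the random binary data are hypothetical, and the fine-tuning Steps 2 and 3 are omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    # Minimal Bernoulli RBM trained with one-step contrastive divergence
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.1, size=(n_visible, n_hidden))
        self.bv = np.zeros(n_visible)   # visible biases
        self.bh = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.bh)

    def cd1_step(self, v0):
        h0 = self.hidden_probs(v0)
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h_sample @ self.W.T + self.bv)   # reconstruction
        h1 = self.hidden_probs(v1)
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.bv += self.lr * (v0 - v1).mean(axis=0)
        self.bh += self.lr * (h0 - h1).mean(axis=0)

# Step 1 sketch: train each RBM in turn, feeding its activations to the next
X = (rng.random((32, 10)) > 0.5).astype(float)   # toy binary inputs
layers = [RBM(10, 8), RBM(8, 4)]
inp = X
for rbm in layers:
    for _ in range(20):
        rbm.cd1_step(inp)
    inp = rbm.hidden_probs(inp)      # representation passed to the next RBM
```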

#### *4.2. Extreme Learning Machine*

The extreme learning machine (ELM) [91] was first proposed by Huang in 2004 for single hidden-layer feedforward neural networks (SLFNs), with the aim of reducing the cost imposed by the error back-propagation procedure during training, and was later extended to generalized SLFNs in which the hidden-layer neurons need not be alike. Over the past decade, the extreme learning machine has been extensively studied due to its high productivity, effectiveness, and easy implementation [92]. The ELM has the advantage of a fast learning rate and high generalizability [93]. In ELM, the hidden layer does not need to be adjusted; that is, the connection weights from the input layer to the hidden layer, as well as the hidden biases, are generated randomly without further adjustment. The connection weights from the hidden layer to the output layer are then computed analytically with the efficient least-squares method [94].
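The ELM training described above reduces to two linear-algebra steps: fix random input weights, then solve the output weights by least squares. A minimal NumPy sketch on a toy XOR-style dataset (the hidden-layer size, tanh activation, and data are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=20):
    # Random input weights and biases are fixed and never trained
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y      # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])    # XOR-style toy targets
W, b, beta = elm_train(X, y)
pred = elm_predict(X, W, b, beta)
```

With more hidden units than training samples, the least-squares solution interpolates the targets, which illustrates why ELM training is fast: no iterative weight updates are needed.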

#### *4.3. Structure of the Deep-Learning Model*

In the proposed model, for network training, the deep belief network training process mentioned in the DBN section is used; the difference is that in the first fine-tuning step, the ELM is applied to learn the weights between the last hidden layer and the output layer (W3 in Figure 4). The optimal network structure is also derived with the GA. The steps of the genetic algorithm are as follows:

Step 1: Chromosome coding and population initialization. Each chromosome is encoded directly as a string of positive integers, and a population of predetermined size N is generated. The number of genes on each chromosome indicates the number of hidden layers of the deep network, and the value of each gene indicates the number of neurons in that layer. Chromosome genes are initialized randomly.

Step 2: Assessment. Each chromosome is trained by the proposed hybrid model using the training data. Then, the classification accuracy is calculated and taken as the fitness value of that chromosome.

Step 3: Selection. The well-known roulette-wheel selection mechanism is used to choose the parents for crossover and mutation.

Step 4: Combination (crossover). To search the problem space, the one-point crossover operator, which is one of the most common crossover operators in the literature, is used.

Step 5: Mutation. The mutation operator produces a new chromosome by randomly selecting a gene/layer and decreasing or increasing its amount. The purpose of this operator is to prevent the algorithm from being trapped in the local optimization by discovering new solution spaces.

Step 6: Selection of survivors. After ranking the chromosomes of the current population together with the chromosomes produced by crossover and mutation according to their fitness values, the best N chromosomes are selected as the survivors.

Step 7: Stop criteria. When the number of generations reaches the predetermined value, the algorithm stops and the best chromosome returns as the answer; otherwise, it returns to step 3. The flowchart of the deep-learning model used in this study is shown in Figure 5.

**Figure 5.** The flowchart of the deep-learning model [39,40].
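Steps 1–7 can be sketched as a small architecture-search loop. This NumPy toy is not the authors' implementation: the `fitness` function is a hypothetical surrogate (the paper trains the hybrid model and uses its classification accuracy), and the population size, generation budget, and mutation range are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def fitness(chrom):
    # Surrogate stand-in: rewards layer sizes near the (hypothetical) target
    return 1.0 / (1.0 + np.abs(np.array(chrom) - np.array([64, 32, 16])).sum())

def roulette_select(pop, fits):
    p = np.array(fits) / np.sum(fits)          # Step 3: roulette wheel
    return pop[rng.choice(len(pop), p=p)]

def one_point_crossover(a, b):
    cut = int(rng.integers(1, len(a)))         # Step 4: one-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom):
    c = list(chrom)                            # Step 5: grow/shrink one layer
    i = int(rng.integers(len(c)))
    c[i] = max(1, c[i] + int(rng.integers(-8, 9)))
    return c

N, generations = 20, 30                        # Step 7: fixed generation budget
pop = [[int(rng.integers(1, 128)) for _ in range(3)] for _ in range(N)]  # Step 1
for _ in range(generations):
    fits = [fitness(c) for c in pop]           # Step 2: evaluate
    children = [mutate(one_point_crossover(roulette_select(pop, fits),
                                           roulette_select(pop, fits)))
                for _ in range(N)]
    pop = sorted(pop + children, key=fitness, reverse=True)[:N]  # Step 6

best = max(pop, key=fitness)
```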

#### *4.4. Benchmark Methods*

#### 4.4.1. Support Vector Machine

The support vector machine (SVM) algorithm is based on statistical learning theory and uses the structural risk minimization principle, leading to a globally optimal solution [95,96]. In recent years, this algorithm has attracted a lot of attention due to its good classification performance and generalizability. The SVM involves two operations: (i) nonlinear mapping of an input vector into a high-dimensional feature space that is hidden from both the input and the output, and (ii) construction of an optimal hyperplane to separate the features. The structure of this model is explained as follows. Consider the training vectors:

$$X\_i, \quad i = 1, \ 2, \ \dots, n \tag{7}$$

The training vectors belong to two classes with labels *Yi* = ±1; the purpose of this model is to find a separating hyperplane with the maximum margin, which is obtained by minimizing:

$$\frac{1}{2}\|\mathbf{W}\|^2\tag{8}$$

Subject to the following constraints.

$$Y\_{i}\left( (W \cdot X\_{i}) + b \right) \geq 1 \tag{9}$$

where *W* is the normal vector of the hyperplane, (·) denotes the dot product, and *b* is a scalar bias.
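A quick numeric check of Equations (8) and (9): for a toy linearly separable dataset and a hand-picked candidate hyperplane (not a fitted SVM; all values here are hypothetical), every training sample should satisfy the margin constraint.

```python
import numpy as np

# Toy linearly separable data with labels Y_i = +/-1 as in the text
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

W = np.array([0.5, 0.5])   # hand-picked candidate normal vector
b = 0.0                    # scalar bias

margins = y * (X @ W + b)                   # Eq. (9): each must be >= 1
objective = 0.5 * np.linalg.norm(W) ** 2    # Eq. (8): quantity the SVM minimizes
```

An SVM solver searches over all (W, b) satisfying Equation (9) for the pair with the smallest objective, i.e., the widest margin.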

#### 4.4.2. REPTree

The reduced error-pruning tree (REPTree) is a fast decision-tree learning method that combines two algorithms in a hybrid approach: reduced-error pruning (REP) and the decision tree (DT) [97]. The main structure of this method addresses classification and regression problems. The REP minimizes the complexity of the tree structure while keeping the DT's performance high [98]. The REPTree method uses a pruning mechanism to overcome the overfitting problem. Additionally, this technique uses post-pruning to obtain the minimal version of the most accurate tree [99].

#### 4.4.3. NBTree

Naïve Bayes tree (NBTree) was used due to its simplicity and linear runtime; it combines the J48 algorithm and the naïve Bayes algorithm [100]. This method is used for classification problems, especially to evaluate and pick the class that maximizes the posterior class likelihood. Hence, NBTree can alleviate the big-data problems that affect the naïve Bayes algorithm and the data fragmentation of the J48 algorithm. The important distinction of this model from other machine-learning methods is that it is based on a minimal training data structure that uses a classification system to evaluate important parameters [101]. To build a naïve Bayes classifier for detecting landslide occurrence points in the area, NBTree uses information obtained from the root node down the tree to a given leaf node, and then utilizes the training cases that fall into that leaf node [102].

#### *4.5. Information Gain Ratio*

In the present study, the information gain ratio (IGR) was applied as the basis of judgment for factor selection and to determine important comparative factors for modeling. For landslide susceptibility assessment, selecting the most effective factors as input dataset is fundamental. IGR was proposed by Quinlan [103] to define the quantitative predictive strength of the effective parameters and to select important conditioning factors for modeling. The higher the IGR value, the higher the prediction utility of a factor for modeling [19]. This method enhances the power of prediction of landslides, discarding noise factors with lower IGR. Assuming that the training data *T* contain *n* samples, *Ci* (landslide, nonlandslide) is a classification set of sample data, and the information entropy of the factors is calculated as follows:

$$Info\ (T) = -\sum\_{i=1}^{2} \frac{n(C\_{i}, T)}{|T|} \log\_2 \frac{n(C\_{i}, T)}{|T|}\tag{10}$$

Estimating the amount of information after splitting *T* into *m* subsets (*T*1, *T*2, …, *Tm*) according to causal factor *F* takes the form of the following Equation (11):

$$\text{Info}\ (T, F) = \sum\_{i=1}^{m} \frac{|T\_i|}{|T|}\, Info\ (T\_i) \tag{11}$$

Eventually, the *IGR* of the landslide causal factor *F* can be calculated by:

$$IGR\ (T, F) = \frac{Info\ (T) - Info\ (T, F)}{Split\ Info\ (T, F)}\tag{12}$$

where *Split Info* denotes the potential information produced by dividing the training data *T* into *m* subsets. The formula of *Split Info* is shown as:

$$Split\ Info\ (T, F) = -\sum\_{i=1}^{m} \frac{|T\_i|}{|T|} \log\_2 \frac{|T\_i|}{|T|} \tag{13}$$

If *IGR* > 1, the probability of landslide incidence is higher than average; if *IGR* = 0, the probability of landslide is equal to average; and if *IGR* < 0, the probability of landslide incidence is less than average [104].
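Equations (10)–(13) can be computed directly. This minimal NumPy sketch uses a toy label set and a perfectly predictive factor (both hypothetical), for which the gain ratio comes out to exactly 1:

```python
import numpy as np

def entropy(labels):
    # Info(T), Eq. (10): entropy of the landslide / non-landslide labels
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain_ratio(labels, factor):
    n = len(labels)
    cond_info, split_info = 0.0, 0.0
    for cls in np.unique(factor):            # split T into subsets T_i
        w = (factor == cls).sum() / n
        cond_info += w * entropy(labels[factor == cls])   # Eq. (11)
        split_info -= w * np.log2(w)                      # Eq. (13)
    return (entropy(labels) - cond_info) / split_info     # Eq. (12)

labels = np.array([1, 1, 1, 0, 0, 0])   # landslide / non-landslide samples
factor = np.array([1, 1, 1, 2, 2, 2])   # a perfectly predictive factor class
igr = info_gain_ratio(labels, factor)
```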

#### *4.6. Performance Metrics*

To evaluate the performance of all the models, we used a number of statistical index-based metrics: sensitivity (SST), specificity (SPF), accuracy (ACC), F1-measure, and the area under the receiver operating characteristic curve (AUC). All statistical metrics were computed from true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Table 1 shows the mentioned statistical index-based metrics and their descriptions.
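The index-based metrics follow directly from the confusion-matrix counts; the TP/TN/FP/FN values below are hypothetical, chosen only to illustrate the formulas:

```python
def classification_metrics(tp, tn, fp, fn):
    sst = tp / (tp + fn)                    # sensitivity (true positive rate)
    spf = tn / (tn + fp)                    # specificity (true negative rate)
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sst / (precision + sst)   # F1-measure
    return {"SST": sst, "SPF": spf, "ACC": acc, "F1": f1}

# Hypothetical confusion-matrix counts for a validation set of 100 pixels
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```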


#### **5. Results**

#### *5.1. The Most Important Conditioning Factors*

We obtained the relative importance of the factors influencing landslide occurrence based on average merit (AM) as the IGR score through the *k*-fold cross-validation technique (Table 2). Results indicated that in the lower folds (folds 1 and 2), the number of removed factors with low predictive power was higher (13 factors) than in the higher folds (fold 10: 9 factors). According to Table 2, from 1-fold to 10-fold cross-validation, distance to road (AM = 0.177), road density (AM = 0.118), lithology (AM = 0.079), and land use (AM = 0.055) were the four most important factors for landsliding in the study area. These four influencing factors were followed by NDVI (AM = 0.04), elevation (AM = 0.04), soil (AM = 0.031), aspect (AM = 0.025), solar radiation (AM = 0.021), VRM (AM = 0.015), slope angle (AM = 0.014), distance to fault (AM = 0.014), TWI (AM = 0.013), LS (AM = 0.011), TRI (AM = 0.008), and rainfall (AM = 0.006). Further, the results showed that profile curvature, curvature, plan curvature, distance to river, VD, fault density, river density, TPI, and SPI were removed from the modeling process because their AM was 0.




#### *5.2. Performance of the Deep-Learning Model*

Figure 6 shows the results of training the DL model. Figure 6a,b illustrates how well the landslide (target) and nonlandslide (output) values fitted based on the training and testing datasets, respectively. A well-trained model with a high goodness of fit shows high agreement between the target and output on the training dataset, whereas high prediction accuracy of the model is inferred from the agreement between the target and output on the testing dataset. The two statistical quantitative metrics of MSE (mean squared error) and RMSE (root-mean-square error) show the modeling error of the DL model (Figure 6c,e). The values of MSE and RMSE for the training dataset were 0.0435 and 0.208, respectively (Figure 6c); for the testing dataset these values were 0.079 and 0.281 (Figure 6e). The StD (standard deviation) and mean for the training dataset were, respectively, 0.04 and 0.280, and for the testing dataset these values were 0.01 and 0.208, respectively (Figure 6d,f).

#### 5.2.1. Parameter Tuning

The success rate of a model depends on selecting the optimal value of the parameters of that model. The parameter can be tuned by offline and online approaches. In the offline technique, the values of different parameters are fixed, whereas in the online approach the parameters are dynamically or adaptively controlled and updated [114]. In this study, we used the online parameter tuning approach and the results are shown in Tables 3–5.

**Table 3.** The optimal value of the genetic algorithm parameters.


**Table 4.** Optimal parameters of the DBN and BP models.


#: Number of...

**Table 5.** The optimal value of parameter of the benchmark methods.


**Figure 6.** Performance of the DL model: (**a**) target and output for the training dataset, (**b**) target and output for the testing dataset, (**c**) magnitude of the errors for the training dataset, (**d**) distribution of the errors for the training dataset (**e**) magnitude of the errors for the testing dataset, (**f**) distribution of the errors for the testing dataset.

#### 5.2.2. Classification Performance

After selecting the optimal parameter values for each model, we ran the models to obtain the highest evaluation measures with the least error. Our findings for the validation phase using the testing dataset, which are briefly reported below and in Table 6, show that the DL model outperformed the three benchmark models. The sensitivity, specificity, accuracy, F1-measure, and AUC metrics were obtained from the four possibilities of the confusion matrix, i.e., TP, TN, FP, and FN (Table 6).



**Table 6.** The predictive performance of the deep-learning model and the three benchmark models.

#### *5.3. Preparing Landslide Susceptibility Maps*

We ran the DL model and the three benchmark models (SVM, NBTree, and REPTree) and computed landslide susceptibility indices (LSIs) for each pixel of the study area. We then assigned the LSIs from the DL and benchmark machine-learning models to each pixel of the study area to produce landslide susceptibility maps (Figure 7a–d). We classified the maps into five susceptibility classes: very low susceptibility (VLS), low susceptibility (LS), moderate susceptibility (MS), high susceptibility (HS), and very high susceptibility (VHS). In DL, the ranges of the classes were, respectively, 0.0000875–0.0092, 0.0312–0.0446, 0.0447–0.125, 0.126–0.333, and 0.333–0.868 (Figure 7a). These classes in SVM (Figure 7b) for the VLS, LS, MS, HS, and VHS were, respectively, 0.001–0.0092, 0.0093–0.0293, 0.0294–0.0789, 0.079–0.201, and 0.202–0.5. For NBTree (Figure 7c) these classes were VLS (0.001–0.01299), LS (0.013–0.03783), MS (0.03784–0.09404), HS (0.09405–0.2128), and VHS (0.2129–0.468). Finally, in REPTree (Figure 7d) the classes were VLS (0–0.03), LS (0.03001–0.0993), MS (0.09931–0.1067), HS (0.1068–0.2556), and VHS (0.2557–0.406).


**Figure 7.** Landslide susceptibility maps produced by the (**a**) DL, (**b**) SVM, (**c**) NBTree, and (**d**) REPTree models.

#### *5.4. Validation and Comparison of the Models*

The prediction accuracies of the DL model and the benchmark machine-learning algorithms were assessed by AUC using the testing dataset (Figure 8). As shown in Figure 8, the DL model had a high prediction accuracy (AUC = 0.870). In contrast, the AUCs for the SVM, NBTree, and REPTree models were somewhat lower, at 0.861, 0.837, and 0.834, respectively (Figure 8). Overall, the DL model was superior to the other three benchmark machine-learning models in terms of prediction accuracy.

**Figure 8.** AUC of the models based on the testing dataset.

#### **6. Discussion**

In recent years, the demand for accurate prediction of landslides and the production of landslide susceptibility maps has increased, partly due to the improvement of data processing techniques, but also due to the importance of landslide prediction and susceptibility mapping for more effective land-use planning and management. There are numerous approaches and methods for producing landslide susceptibility maps, but machine-learning methods based on GIS-automated techniques offer advantages such as low cost, wide scope, fast analysis, and the option of periodically updating outputs. Each machine-learning method has its specific advantages and disadvantages, and depending on the software capabilities and input data, its outputs may differ from those of other methods. The challenge many researchers face is selecting the most appropriate method. Thus, comparative analysis of the predictive performance of different machine-learning methods is a major topic in the landslide literature [115,116]. With a desire to produce a landslide susceptibility map with high prediction accuracy, we compared the predictive performance of four machine-learning methods. We first investigated the usefulness of the conditioning factors for the modeling using the information gain ratio technique with 10-fold cross-validation. The results revealed that the landslides that have occurred in our study area were significantly associated with road networks, such that the distance to roads and road density factors were identified as the most influential landslide-conditioning factors. Jaafari et al. [117], Schlögl and Matulla [118], and Sultana and Tan [119] have also reported strong associations between road networks and the frequency of landslide occurrences. Therefore, these regions should be the priority targets for landslide mitigation measures [119,120].

We measured the predictive performance of the models using several widely used metrics [115,121,122] and found that the DL model has obtained the first rank in all metrics used, and therefore successfully outperformed the benchmark models (i.e., NBTree, REPTree, and SVM) that have been previously used for landslide susceptibility mapping in many regions around the world [121–125].

DL algorithms are powerful types of machine-learning algorithms, which utilize numerous hidden layers to model complex relationships among data for pattern recognition and classification tasks, such as landslide prediction. In contrast to traditional shallow learning algorithms (e.g., backpropagation neural networks, logistic regression, and decision trees) that generate decision boundaries directly based on the original datasets [126], DL algorithms hierarchically analyze the original datasets to extract the most relevant features for the data classification [127].

Despite the relatively few applications of the DL model to the prediction of natural hazards, its superiority over other machine-learning models has been repeatedly confirmed. For example, Wang et al. [128] reported the first application of DL to landslide prediction and achieved better prediction accuracy than that of SVM. Sameen et al. [123] reported that the DL model outperformed ANN and SVM for landslide prediction. Huang et al. [129] likewise found the DL model superior to ANN and SVM. Dao et al. [130] showed that the DL model could provide more accurate predictions of landslide susceptibility than quadratic discriminant analysis, Fisher's linear discriminant analysis, and ANN. Dou et al. [131] reported that DL yielded greater AUCs than the ANN and LR methods for landslide prediction. In a recent study, Mandal et al. [132] demonstrated improved accuracy for landslide prediction using the DL model compared to RF, ANN, and Bagging. Overall, the DL model has proven efficient for landslide modeling and has been identified as an attractive alternative to traditional machine-learning methods.

#### **7. Conclusions**

We illustrated the robustness of a deep-learning model against three benchmark models (SVM, NBTree, and REPTree) for the prediction of landslide susceptibility in Kamyaran city, Kurdistan Province, Iran. First, the landslide inventory map with 118 past landslides was produced using different sources and randomly divided into two groups: 80% for model training and 20% for model validation. Next, using the models, the past landslides were linked to 25 conditioning factors. The performance of the models was evaluated using sensitivity, accuracy, specificity, F1-measure, and AUROC. The results showed that although all models had acceptable performance, the deep-learning model outperformed the other models. This indicates that the DL model can be considered a promising technique for preparing landslide susceptibility maps in mountainous areas prone to landsliding. Based on the results obtained from the deep-learning model, an accurate landslide susceptibility map was developed to complement previous research. The findings of this study can be used for future planning, land management, land-use allocation, and government policy making, to prevent or reduce landslides in Kamyaran city. In future studies, we suggest integrating the current framework with other individual classifiers and metaclassifiers from machine learning to achieve higher prediction accuracy for landslides, and perhaps other natural hazards.
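The five evaluation metrics named above can be computed from a validation set as sketched below. The labels and predicted probabilities here are illustrative values, not the study's data; the metric definitions are standard.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score

# Hypothetical validation-set outputs: true labels (1 = landslide) and
# model-predicted landslide probabilities.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1, 0.85, 0.35])
y_pred = (y_prob >= 0.5).astype(int)  # threshold probabilities at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # true-positive rate
specificity = tn / (tn + fp)  # true-negative rate

print(f"sensitivity: {sensitivity:.2f}")
print(f"specificity: {specificity:.2f}")
print(f"accuracy:    {accuracy_score(y_true, y_pred):.2f}")
print(f"F1-measure:  {f1_score(y_true, y_pred):.2f}")
print(f"AUROC:       {roc_auc_score(y_true, y_prob):.2f}")
```

Note that AUROC is computed from the continuous probabilities, whereas the other four metrics depend on the chosen classification threshold.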

**Author Contributions:** Contributed equally to the work, B.G., H.S., A.S., N.A.-A., A.J., V.R.K., M.G., S.R. and A.A.; collected field data and conducted the landslide susceptibility analysis, B.G., H.S. and A.S.; wrote the manuscript, B.G., H.S., A.S., N.A.-A., A.J. and S.R.; provided critical comments in planning this paper and edited the manuscript, V.R.K., M.G. and A.A. All the authors discussed the results and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the University of Kurdistan, Iran, under grant numbers 11-99-4469 and 99-11-32657-6.

**Acknowledgments:** The authors would like to thank the Vice Chancellorship of Research and Technology, University of Kurdistan, Sanandaj, Iran for supplying required data, reports, useful maps, and their nationwide geodatabase to the first author (Bahareh Ghasemian) as a postdoctoral fellowship scheme. The authors also greatly appreciate the assistance of anonymous reviewers for their constructive comments that helped us to improve the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

