*Article* **Automatic Remote Sensing Identification of Co-Seismic Landslides Using Deep Learning Methods**

**Dongdong Pang <sup>1</sup>, Gang Liu <sup>1,2,</sup>\*, Jing He <sup>1</sup>, Weile Li <sup>2</sup> and Rao Fu <sup>1</sup>**


**Abstract:** Rapid and accurate extraction of landslide areas triggered by earthquakes is of great significance for geological disaster risk assessment and emergency rescue. At present, visual interpretation and field survey are still the most commonly used methods for landslide identification, but they are often time-consuming and costly. This paper therefore addresses co-seismic landslide identification and the scarcity of landslide samples in existing studies. A landslide sample dataset with 4000 labels was produced. With the YOLOv3 algorithm as the core, a convolutional neural network model with landslide characteristics was established to automatically recognize co-seismic landslides in satellite remote sensing images. Comparison with the visual interpretation results of the remote sensing images showed that the landslide recognition model constructed in this paper achieved high recognition accuracy and fast speed. The F1 value was 0.93, indicating that the constructed model was stable. The results can provide a reference for emergency rescue and disaster investigation of similar co-seismic landslide disasters.

**Keywords:** YOLOv3; deep learning; automatic landslide identification; remote sensing image

**Citation:** Pang, D.; Liu, G.; He, J.; Li, W.; Fu, R. Automatic Remote Sensing Identification of Co-Seismic Landslides Using Deep Learning Methods. *Forests* **2022**, *13*, 1213. https://doi.org/10.3390/f13081213

Academic Editor: Olga Viedma

Received: 29 March 2022 Accepted: 25 July 2022 Published: 1 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### **1. Introduction**

Co-seismic landslides [1] are secondary hazards triggered by earthquakes in mountainous regions and often account for a substantial proportion of earthquake hazards [2]. Landslides are generally catastrophic, resulting in loss of life, destroying infrastructure and houses and exerting a significant impact on the global economy [3]. For example, on 31 May 1970, Peru was struck by a 7.7 magnitude earthquake, triggering an avalanche that buried the city of Yungay and killed an estimated 23,000 people [4]. The 2008 Wenchuan earthquake triggered landslides that directly killed nearly 30,000 people, accounting for roughly 30% of the people killed by the earthquake [5]. Therefore, the ability to quickly and accurately obtain information about the location, degree and scale of co-seismic landslides is of great significance for guiding earthquake emergency rescue and disaster risk assessment [6]. Hence, remote sensing of landslides has become a major research topic for experts and scholars. In 1992, Wang [7] investigated and monitored landslides by drawing on the advantages of remote sensing and proposed establishing a landslide and debris flow information system, including a database and a series of application models, such as a landslide and debris flow identification model, a development model, an engineering impact evaluation model and a disaster relief model. Combining interferometric radar and TM imagery, V. Singhroy et al. from Canada accurately extracted information about landslides in the lower Rocky Mountain watershed and used it to assess the risk of the impacted area in 1995 [8]. In 2003, Wasowski [9] from Canada analyzed the opportunities brought by the development of remote sensing technology for studying landslide hazards and discussed the feasibility of using radar, ETM+, RADARSAT-1, GPS and other multi-source data to observe landslides. In 2013, Wang [10] also proposed that if the relationship between the surface landslide wall and landslide body and the ground slide surface and slide bed could be determined, their spectral characteristics understood and a computational model built, landslide patterns could be recognized by remote sensing technology. In 2016, Casagli et al. [11] used optical remote sensing, InSAR data and target-based means to extract landslide hazards. Gore et al. [12] proposed mapping glacier features using hyperspectral remote sensing images. Based on their spectral reflectance characteristics, glaciers were divided into an accumulation zone and an ablation zone to simplify identification. At present, remote sensing-based landslide information extraction mainly builds extraction models from shallow features such as texture, shape and hue in high-resolution, multispectral images [13]. Although visual interpretation by this means is fairly accurate, it depends heavily on the experience of professionals and is slow, so it is difficult to meet the requirements of emergency rescue and timely disaster assessment. In recent years, with the development of artificial intelligence, deep learning has been introduced into various disciplines [14], and it has been widely used in the field of remote sensing, providing a reliable solution to the identification of co-seismic landslides.

Deep learning has become a mainstream research tool in the fields of speech recognition, image recognition, image classification, target detection, etc. [15,16]. Song et al. proposed a method to transform spectral information into images and extract texture features for recognition and classification using a CNN [17]. Reference [18] recognized and classified spectral features by calculating the value of the cost function through a 5-layer neural network. Reference [13] used the target detection module of mask region-based convolutional networks (Mask R-CNN) for automatic landslide identification. Liu et al. [19] classified frozen landslides on the Tibetan Plateau by applying transfer learning. Arabameri et al. [20] created a landslide inventory map using GPS points, high-resolution satellite images, topographic maps, and historical records obtained from various machine learning methods and field analysis. They then compared the applicability of the FLDA, RF, and ADTree models for establishing LSM in the Gallicash River basin in northern Iran near the Caspian Sea and succeeded in further applying deep learning in the field of landslide identification. Chen et al. [21] obtained a landslide susceptibility map based on the evidence confidence function model by using slope elements from hydrologic analysis and slope elements based on curvature watersheds. They then calculated the success rate and prediction rate of the landslide susceptibility map. More and more advanced algorithms are used in landslide research, such as the C5.0 decision tree (DT) model and the K-means clustering algorithm, which can generate regional landslide susceptibility maps well [22]. With the development of computer vision technology [23], there are numerous algorithms for image recognition using deep learning methods, whose main objective is to pinpoint the class and location of various targets in an image or sequence of images [24]. Girshick et al. [25] proposed the R-CNN convolutional neural network in 2013 for target detection using a deep convolutional neural network. Joseph Redmon, Ali Farhadi et al. [26] proposed the YOLO algorithm in 2015, a target detection system based on a single neural network. By modeling detection as a regression problem, it directly obtains the coordinates of the bounding box, the confidence of the objects contained in the bounding box and the class probability from all pixels of the whole picture, which effectively improves detection and recognition efficiency. At present, many experts and scholars have applied deep learning to the study of co-seismic landslide recognition. However, according to recent research results, few studies achieve automatic recognition of co-seismic landslides by deep learning, and many of the existing studies address co-seismic landslide extraction based on traditional machine learning or deep learning.

For the purpose of landslide recognition, this paper adopted the popular deep learning-based object detection algorithm YOLOv3 to build a convolutional neural network for co-seismic landslide recognition. The TensorFlow [27] platform was used to construct the convolutional neural network, and the YOLOv3 network was trained on the sample set. A convolutional neural network model with landslide characteristics was obtained, and co-seismic landslides were then detected and recognized in satellite remote sensing images. In the experimental area, the recognition accuracy was close to that of visual interpretation, with much faster recognition speed.

### **2. Study Area and Data**

### *2.1. Study Area*

The study area of this paper is in the Hokkaido region of Japan, as shown in Figure 1. Hokkaido, the largest island in Japan other than Honshu, lies between latitudes 40°33′ and 45°33′ N and longitudes 139°20′ and 148°53′ E. It is separated from Honshu in the south by the Tsugaru Strait and bounded in the north by the Soya Strait and the Kuril Islands. It covers a large area, accounting for about 22% of the total area of Japan. At least 70% of the island is covered by forests. The vegetation consists mainly of bright coniferous forests and grasslands. The terrain is high in the center and low in the surroundings, with volcanoes. There are mountains and ranges in the central part of Hokkaido, so the terrain is generally undulating, while the surrounding area is a vast plain. Co-seismic landslides are a major threat to people's lives and development. A 6.7 magnitude earthquake hit Hokkaido, Japan, at 3:08 am local time on 6 September 2018, with an epicenter at 42.671° N, 141.933° E, and a source depth of 35 km. The earthquake, with a maximum observed seismic intensity of 7, was the strongest earthquake observed in Hokkaido, Japan, by the Japan Meteorological Agency since 1923 [28]. During the earthquake, 44 people were killed, 32 houses were completely damaged and 18 were partially damaged. The casualties and house damage were mainly caused by large landslides, liquefied ground settlement and seismic subsidence [29]. The disaster damage caused by some of the co-seismic landslides is shown in Figure 2.

**Figure 1.** Study area (the red square is the study area and the inverted triangle is the earthquake center).

**Figure 2.** Comparison of the landslide before and after earthquake. (**a**,**b**) are the images before earthquake; (**c**,**d**) are the images after earthquake. The red dotted line shows the image of the surface before the landslide. The surface of the newly formed landslides is marked with a solid red line.


### *2.2. Data*

The satellite remote sensing images used in this paper were downloaded from Planet (https://www.planet.com/explorer/ (accessed on 18 September 2018)); they have a resolution of 3 m and were taken on 18 September 2018. Planet's SuperDove imagery, which went online in April 2020, features a 3-m resolution and eight spectral bands (the red, green, blue, near-infrared and red-edge bands are temporarily open), integrating the advantages of the 3-m resolution PL small satellite clusters and the RapidEye red-edge bands. ENVI 5.6 software was used for true-color synthesis of the images to make co-seismic landslides easier to recognize. Finally, orthophotos were generated by combining the red, green and blue bands of the images of the area affected by co-seismic landslides. After mosaicking, the images covered the whole landslide region, which contains many small-scale landslides. To provide better training samples, the images were cut into 800 m × 800 m tiles in ArcGIS, and 120 tiles with dense landslides were selected to build the dataset, including 100 as the training set and 20 as the test set. The satellite remote sensing images were converted into image samples and XML-format label samples using the LabelImg annotation tool and then interpreted by experts in this field from the State Key Laboratory of Geological Hazard Prevention and Geological Environment Protection. In this paper, we use the open YOLOv3 algorithm to recognize co-seismic landslides. YOLOv3 is a publicly available algorithm and one of the most widely used deep learning-based object detection methods; it uses K-means clustering to estimate the initial width and height of the predicted bounding boxes. The parameters of the YOLOv3 algorithm were adjusted to make it more suitable for co-seismic landslides. Figure 3 shows some of the orthophotos of the study area and the co-seismic landslide area mapped by visual interpretation.
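The XML label samples produced by LabelImg can be read back with a few lines of Python. The following is a minimal sketch, assuming the labels follow the Pascal VOC layout that LabelImg writes by default; the file name, the tile size of 267 pixels (roughly 800 m at 3 m resolution) and the box coordinates are hypothetical and are not taken from the dataset of this paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one 800 m x 800 m tile. The file name, image
# size and box coordinates are invented for illustration only.
SAMPLE_XML = """<annotation>
  <filename>tile_001.jpg</filename>
  <size><width>267</width><height>267</height><depth>3</depth></size>
  <object>
    <name>landslide</name>
    <bndbox><xmin>40</xmin><ymin>55</ymin><xmax>120</xmax><ymax>140</ymax></bndbox>
  </object>
  <object>
    <name>landslide</name>
    <bndbox><xmin>150</xmin><ymin>30</ymin><xmax>210</xmax><ymax>90</ymax></bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) tuples from a VOC-style XML string."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.find(key).text) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.find("name").text,) + coords)
    return boxes


boxes = read_boxes(SAMPLE_XML)
for label, xmin, ymin, xmax, ymax in boxes:
    print(label, xmax - xmin, ymax - ymin)  # label, box width, box height in pixels
```

The box widths and heights extracted this way are the kind of input that a K-means step would cluster to obtain initial bounding box sizes.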

**Figure 3.** Orthophotos and visual interpretation maps of landslide in the study area. (**a**,**b**) are the orthophotos and (**c**,**d**) are the corresponding visual interpretation maps. The red regions in (**c**,**d**) are the landslide areas interpreted visually by experts from the State Key Laboratory of Geological Hazard Prevention and Geological Environment Protection.

### **3. Methodology**

### *3.1. Research Design*

The main design of the experimental study of this paper is shown in Figure 4. First, the data were preprocessed as described in Section 2.2 to obtain the data for landslide identification. With the YOLOv3 algorithm as the core, a convolutional neural network was constructed using the TensorFlow framework to train and validate the sample set. The convolutional neural network model with the characteristics of co-seismic landslides was obtained by repeatedly adjusting the parameters. Satellite remote sensing images of landslides were analyzed, tested and identified, and the experimental results were then compared with the visual interpretation results of the experts.

### *3.2. Procedure and Recognition Principle of YOLOv3 Convolutional Neural Network*

Recently, with the rapid development of artificial intelligence, great breakthroughs have been made in target detection algorithms. According to their detection approach, common deep learning-based target detection algorithms can be divided into two categories: two-stage algorithms based on candidate regions and one-stage algorithms based on regression [30]. Typical two-stage algorithms are the R-CNN [31] family based on region proposals, such as R-CNN, Fast R-CNN, and Faster R-CNN. One-stage algorithms include YOLO and SSD, among which YOLO is based on an end-to-end idea. The principle of YOLO is to model detection as a regression problem and directly obtain the bounding box coordinates of the detection target, the confidence that the bounding box contains an object and the class probability of the object from all the pixels of the image.

**Figure 4.** Design of experimental study.

Compared with other deep learning target detection algorithms, YOLO is characterized by fast detection speed. It treats object detection as a regression problem, dividing an image into S × S grid cells. If the center of an object falls in a grid cell, that cell is responsible for predicting the object, and each cell predicts B bounding boxes and C class probabilities. Each bounding box carries five values, four for its position and one confidence value, so the output tensor is S × S × (5B + C); Figure 5 shows a bounding box with scale and position prediction. YOLO detection is very fast: the standard version of YOLO can reach 45 FPS on a Titan X GPU. However, YOLO also has the disadvantages of a low recall rate, low localization accuracy, and poor detection of small objects. To overcome these disadvantages, researchers proposed YOLOv2 and YOLOv3 [32]. YOLOv3 improves both detection accuracy and speed, introducing a new detection and classification network that is superior to other algorithms. According to the literature, in particular reference [31], with running times measured on an M40 or Titan X, which are essentially the same GPU (graphics processing unit), YOLOv3 runs considerably faster than other detection methods with similar performance [33].
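The grid assignment and the S × S × (5B + C) output size described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the values S = 7, B = 2 and C = 20 are example settings rather than the configuration used in this paper.

```python
def responsible_cell(x_center, y_center, grid_size):
    """Return (row, col) of the grid cell containing a normalized object center.

    x_center and y_center are expected in [0, 1]; a center exactly on the
    right or bottom edge is clamped into the last cell.
    """
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col


def output_shape(grid_size, boxes_per_cell, num_classes):
    """Shape of the S x S x (5B + C) prediction tensor."""
    return (grid_size, grid_size, 5 * boxes_per_cell + num_classes)


# Example settings (assumed, not from this paper): S = 7, B = 2, C = 20.
shape = output_shape(7, 2, 20)
cell = responsible_cell(0.5, 0.25, 7)
print(shape, cell)
```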

**Figure 5.** Bounding box with scale grid (scale 13/26/52) and location prediction.

The sum of squared errors was still used in the network training process of YOLOv3, and binary cross entropy was used for the loss function of the parts other than w (the width of the bounding box) and h (the height of the bounding box). If the ground truth for a coordinate prediction is $\hat{t}_*$, the gradient is the ground truth minus our prediction, i.e., $\hat{t}_* - t_*$. This ground truth value can be easily calculated by inverting Equations (1)–(4). The width and height of the bounding box were predicted as offsets from the dimension cluster (anchor box) sizes, and the center coordinates of the box relative to the location of the filter application were predicted using the sigmoid function. Dimension clusters were used as the anchor boxes in the convolutional neural network, and four coordinates of each bounding box were predicted, namely $t_x$, $t_y$, $t_w$, $t_h$. If the grid cell is offset from the upper left corner of the image by $(c_x, c_y)$, and the width and height of the bounding box prior are $p_w$, $p_h$, then the predicted values of the target are as follows:

$$b_x = \sigma(t_x) + c_x \tag{1}$$

$$b_y = \sigma(t_y) + c_y \tag{2}$$

$$b_w = p_w e^{t_w} \tag{3}$$

$$b_h = p_h e^{t_h} \tag{4}$$
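Equations (1)–(4), and their inversion to obtain ground truth targets, can be illustrated with a short sketch. The function names and the numeric values below are hypothetical and chosen only for demonstration; coordinates are expressed in grid-cell units.

```python
import math


def sigmoid(x):
    """Logistic function used in Equations (1) and (2)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell_offset, prior):
    """Apply Equations (1)-(4): raw predictions -> box center and size."""
    tx, ty, tw, th = t
    cx, cy = cell_offset
    pw, ph = prior
    bx = sigmoid(tx) + cx
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh


def encode_box(b, cell_offset, prior):
    """Invert Equations (1)-(4): ground-truth box -> target values.

    Assumes the box center lies strictly inside the cell, so that
    0 < bx - cx < 1 and 0 < by - cy < 1.
    """
    bx, by, bw, bh = b
    cx, cy = cell_offset
    pw, ph = prior

    def logit(p):
        return math.log(p / (1.0 - p))

    return logit(bx - cx), logit(by - cy), math.log(bw / pw), math.log(bh / ph)


# Hypothetical example: cell at column 6, row 4, prior of 3 x 2 cells.
box = decode_box((0.0, 0.0, 0.0, 0.0), (6, 4), (3.0, 2.0))
targets = encode_box(decode_box((0.2, -0.4, 0.1, -0.3), (6, 4), (3.0, 2.0)), (6, 4), (3.0, 2.0))
print(box)
```

With all raw predictions equal to zero, the decoded box sits at the center of its cell and has exactly the prior's width and height, which is a convenient sanity check.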

The YOLOv3 convolutional neural network uses Darknet-53 as the base network for feature extraction. Darknet-53 achieves the highest measured floating point operations per second, which means the network structure makes better use of the GPU and is therefore more efficient and faster to evaluate. The base network has a total of 53 convolutional layers and no fully connected layers. Tensor sizes are changed through the stride of the convolution kernels. The whole network is fully convolutional; it uses the skip connections of residual modules and removes pooling layers to reduce the negative effect of pooling on gradients. In the network structure, convolutions with a stride of 2 are used for downsampling. The sum of squared errors is used for the loss function of w and h, and binary cross entropy for the loss function of the other parts. To enhance detection accuracy, YOLOv3 uses a complementary fusion approach similar to FPN, providing prediction boxes at 3 scales through 3 prediction output branches, i.e., grid scales of 13, 26, and 52. The structure used in the three prediction output branches is also fully convolutional, and the tensor obtained for each branch is as follows:

$$S \times S \times \left[3 \times (4 + 1 + N)\right] \tag{5}$$

where S represents different grid scales, 3 represents three predicted target boxes in each grid cell, 4 represents four coordinate values of each bounding box, 1 is the confidence value of each box, and *N* is the number of categories predicted by training.

### *3.3. Construction of YOLOv3 Convolutional Neural Network Structure for Landslide Satellite Remote Sensing Image Features*

In this paper, landslides were the only identification target, so there was no need to consider multiple categories, which made network construction less difficult. According to the characteristics of the image data, all images involved in training were resized to 416 × 416 × 3 by downsampling before entering the network. Meanwhile, we adopted regularization to prevent overfitting by adjusting the momentum coefficient and the weight decay regularization term. Because landslide data are difficult to obtain and samples are scarce, the saturation and exposure of the images were adjusted to increase sample diversity. Figure 6 shows the structure of the YOLOv3 convolutional neural network for landslide satellite remote sensing image features. Since the number of target categories is 1, the number of convolutional kernels in the last convolutional layer is 18:

$$3 \times (4 + 1 + 1) = 18\tag{6}$$

Taking the scale of 52 × 52 as an example, according to Formulas (5) and (6), the tensor of landslide data set at this scale is 52 × 52 × 18.
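The output sizes given by Formulas (5) and (6) for all three branches can be checked with a short calculation. This is a sketch under stated assumptions: the strides of 32, 16 and 8 are the usual YOLOv3 downsampling factors, which are consistent with the 13, 26 and 52 grids for a 416 × 416 input but are not stated explicitly in this paper, and the function name is hypothetical.

```python
def head_shapes(input_size=416, strides=(32, 16, 8), boxes_per_cell=3, num_classes=1):
    """Return the S x S x [3 x (4 + 1 + N)] shape of each prediction branch.

    strides are assumed downsampling factors; num_classes is N in Formula (5).
    """
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


shapes = head_shapes()
for shape in shapes:
    print(shape)
```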

### *3.4. Accuracy Evaluation Index*

The evaluation metrics of a model generally include precision, recall, mean average precision (mAP), IOU (intersection over union), the F1 score, etc. The F1 score is a measure of the accuracy of a binary classification model that considers both precision and recall; it is the harmonic mean of the two. These metrics can be expressed as follows:

$$\text{Precision} = \frac{T_P}{T_P + F_P} \times 100\% \tag{7}$$

$$\text{Recall} = \frac{T_P}{T_P + F_N} \times 100\% \tag{8}$$

$$\text{mAP} = \frac{1}{C} \sum_{i=1}^{N} \text{Precision}(i) \, \Delta \text{Recall}(i) \tag{9}$$

$$\text{IOU} = \frac{\text{Detection Result} \cap \text{Ground Truth}}{\text{Detection Result} \cup \text{Ground Truth}} \tag{10}$$

$$\text{F1 score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{11}$$

In the above formulas,

Precision: Precision rate

Recall: Recall rate

mAP: Mean average precision

IOU: Intersection over union

F1 score: An indicator of the accuracy of the binary classification model

*TP*: The number of samples correctly classified as targets

*FP*: The number of samples wrongly classified as targets

*FN*: The number of samples incorrectly classified as non-target objects

C: Number of target categories. The only category in this paper is landslide, so C = 1

Detection result: The predicted bounding box

Ground truth: The real bounding box
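Formulas (7), (8), (10) and (11) can be sketched as small Python functions. The counts and boxes in the example are invented for illustration and are not results from this paper; boxes are assumed to be given as (xmin, ymin, xmax, ymax), and precision and recall are returned as fractions rather than percentages.

```python
def precision(tp, fp):
    """Formula (7) as a fraction; returns 0.0 when nothing was detected."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Formula (8) as a fraction; returns 0.0 when there are no targets."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Formula (11): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def iou(box_a, box_b):
    """Formula (10) for two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Invented counts and boxes, for illustration only.
p = precision(tp=90, fp=10)
r = recall(tp=90, fn=30)
f1 = f1_score(p, r)
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
print(p, r, f1, overlap)
```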

**Figure 6.** Structure of YOLOv3 convolutional neural network for landslide satellite remote sensing image features.

### **4. Experiments and Results**

### *4.1. Network Model Training*

The orthophotos after correction and stitching were segmented into 800 m × 800 m tiles in ArcGIS, and 120 images with more landslide coverage were selected to create the sample dataset for network training. The samples were created with LabelImg, a dedicated labeling tool, which converted the satellite remote sensing images into image samples and label samples in XML format. The detailed experimental platform is shown in Tables 1 and 2.

**Table 1.** Hardware configuration for the experiment.

**Table 2.** Software environment for the experiment.

Landslides, characterized by large areas and monotonous features, were the research object in the experiment. Therefore, during network training it was necessary to adjust some parameters and find the most reasonable values by comparing the results of several training runs. In this training, the number of iterations was set to 6000, the learning rate to 0.001, the change point of the learning rate to 500, and the batch size to 64. Each time 64 samples were accumulated, forward propagation was carried out once; the sub-batch setting was 16, that is, each batch of images was divided into 16 sub-batches to complete the forward propagation of the network. The momentum coefficient was set to 0.9 and the weight decay regularization term to 0.0005 to constrain changes in the weights. The saturation and exposure parameters were set to 1.5 to increase sample diversity and generate more training samples. The change in the loss value during model training is shown in Figure 7: the loss decreased sharply during the first 500 iterations, decreased more slowly from 500 to 2500 iterations, stabilized after 3500 iterations, and finally decreased to 0.97 at 6000 iterations. As can be observed from the training log, 64 samples were processed in the first training, and 320,000 samples were processed in the 6000th training. This reduction in training loss was quite ideal and in line with the purpose of the experiment.


**Figure 7.** Plot of loss values with the number of iterations.


### *4.2. Accuracy Evaluation of the Network Model*

Following the accuracy evaluation indexes described in Section 3.3, recall, IOU and F1 were selected as the accuracy evaluation indexes of the model. Recall was used as an indicator to evaluate the accuracy of the trained model. The validation set consisted of 60 images, and the average IOU and recall of the samples were recorded as the trained model was verified on these 60 images. One record was made for each image tested, as shown in Figure 8.
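As an illustration of the two indexes recorded here, the sketch below computes the IOU of a detected box and a ground-truth box, and recall from detection counts. It assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` and the standard definition of recall as true positives over true positives plus false negatives; the function names and the example boxes are illustrative and not taken from the paper.

```python
def box_iou(box_a, box_b):
    """IOU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall(true_positives, false_negatives):
    """Share of ground-truth landslides that were detected."""
    total = true_positives + false_negatives
    return true_positives / total if total > 0 else 0.0


# Illustrative boxes: the detection covers the ground truth plus extra area.
detected = (0, 0, 10, 10)
ground_truth = (0, 0, 10, 8)
print(box_iou(detected, ground_truth))  # 0.8
print(recall(9, 1))                     # 0.9
```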

It can be observed from Figure 8 that, as the number of tested samples increased, the recognition accuracy fluctuated within a fixed range and remained relatively stable. The IOU and recall for the first two images were relatively high because the values are cumulative averages over the samples contained in the images detected so far and are recorded once per image. As the sample size increased, the average intersection over union (IOU) and the average recall tended to stabilize. After 60 images had been tested, the IOU settled at 80% and the recall at 90%, which were ideal results.
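The stabilizing behavior described above is what a cumulative average produces: early values depend on only a few images, while later values change little with each new image. The sketch below assumes that the curves in Figure 8 are running means recorded once per image, which is one reading of the description; the per-image IOU values are invented for illustration and are not data from the experiment.

```python
def running_means(values):
    """Cumulative mean after each new value, one record per tested image."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# Invented per-image IOU values, for illustration only.
per_image_iou = [0.95, 0.90, 0.70, 0.75, 0.80, 0.70]
for index, mean in enumerate(running_means(per_image_iou), start=1):
    print(index, round(mean, 3))
```

With these invented values the running mean starts high, because it reflects only the first one or two images, and then settles as more images are included.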

### *4.3. Network Model Test*

Once the network had been trained and the verification accuracy met the requirements, the network model was tested to demonstrate its usefulness: several satellite remote sensing images of the study area that had not been used in training were randomly selected for landslide recognition. The recognition results are shown in Figure 9, in which the red boxes mark the recognized landslides. The figure shows that good recognition results were achieved.

### *4.4. Analysis of Experimental Results*

The experimental results show that the YOLOv3 convolutional neural network built in this paper performs well in detecting and identifying landslides. The comparison results between the network proposed in this paper and visual interpretation are shown in Figure 10.


**Figure 8.** Average IOU and average recall.


**Figure 9.** Landslide identification results. (**a**,**b**) are the identification results of co-seismic landslides. The red boxes show the landslide areas.


**Figure 10.** Comparison between landslide identification results of YOLOv3 network model and visual interpretation results. (**a**,**b**) are the landslide recognition results of the YOLOv3 network model, and (**c**,**d**) are the corresponding visual interpretation results. The red parts in (**c**,**d**) are the landslide areas interpreted visually by experts from the State Key Laboratory of Geological Hazard Prevention and Geological Environment Protection. The yellow parts in (**a**,**b**) are the unrecognized co-seismic landslide areas.


As can be observed from Figure 10, the overall effect of YOLOv3 recognition of landslides was better than that of visual interpretation: only 11 landslides with small areas and light colors were not recognized, as shown in yellow in Figure 10a,b. The average recall was 0.88 and the average precision was 0.98, calculated with Equations (7) and (8). The F1 score of the YOLOv3 network model, calculated with Equation (11), was 0.93, indicating that the experiment met the research needs.
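The reported F1 value can be checked from the reported precision and recall. The short sketch below applies the standard F1 definition (the harmonic mean of precision and recall) to the two averages given above; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average precision and recall reported for the test.
f1 = f1_score(0.98, 0.88)
print(round(f1, 2))  # 0.93
```

The result, 2 × 0.98 × 0.88 / (0.98 + 0.88) ≈ 0.927, rounds to the reported value of 0.93.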

### **5. Discussion**

The method adopted in this paper transforms the landslide identification problem into an image processing problem. It can therefore reduce the difficulty of co-seismic landslide research and make it easier for researchers to understand the disasters caused by co-seismic landslides. The method belongs to the category of artificial intelligence, and its purpose is to replace tedious manual interpretation, so it is particularly advantageous in large-scale seismic landslide research. Furthermore, the YOLOv3 algorithm has proved to have outstanding advantages in image recognition. Ref. [34] applied a neural network built with the YOLOv3 algorithm to recognize surface objects in remote sensing images and verified the efficiency of the algorithm in remote sensing image identification. In this study, the YOLOv3 algorithm was applied to satellite image recognition and successfully identified co-seismic landslides. The Planet imagery adopted in this paper has a resolution of 3 m. In addition, the method can accommodate different image categories, including satellite images with different resolutions. Because the texture, color and other characteristics of co-seismic landslide areas differ between images, the network model needs to be extended by transfer learning. By adding images of different seismic landslide features as new samples, co-seismic landslides can be identified in different satellite images.

The identification results met the experimental requirements, and the network can identify seismic landslides quickly and accurately. Some small landslides still could not be identified. This failure may be caused by an insufficient number of samples of small landslides, because images with obvious landslide characteristics over large areas were preferentially selected when the sample labels were made. Although YOLOv3 missed some landslides, it could process an image within seconds, whereas visual interpretation requires professionals to label images manually on the basis of their experience and expertise, which is a more time-consuming and costly process. The research in this paper therefore helps to achieve rapid, office-based identification of landslides and to improve identification efficiency, which provides a reference for disaster emergency rescue and the investigation of co-seismic landslides after earthquakes.

This research realizes fast and accurate automatic recognition of co-seismic landslides. Due to the complexity of landslide remote sensing identification, there are still many problems that need to be further studied. The following are some thoughts on the research of deep learning identification of co-seismic landslides.


### **6. Conclusions**

In this paper, the YOLOv3 algorithm was used to build a neural network model oriented to landslide identification from the features of remote sensing satellite images, to train it on samples, and to identify co-seismic landslides automatically. In the model test, the IOU settled at 80% and the recall at 90%, which showed that the model had good stability. In the co-seismic landslide identification experiment in Hokkaido, Japan, the average recall was 0.88, the average precision 0.98, and the F1 0.93. The results also showed that the YOLOv3 network model was relatively simple to build and that its parameters were easy to adjust and optimize; with its high precision and fast speed, it identified landslides well and could play an important role in landslide identification. Compared with expert visual interpretation, the model adopted in this paper identifies co-seismic landslides very quickly, and it effectively overcame the deficiencies of visual interpretation, such as long processing time, high cost and low efficiency. In addition, a landslide sample dataset with 4000 tags was produced, which provides data support for deep learning identification of seismic landslides.

In the future, other databases of landslides caused by geological disasters can be used to train the network model and improve its accuracy. The constructed network model can also be extended to the identification of other geological disasters, such as loess landslides and rainfall-induced landslides.

**Author Contributions:** Conceptualization, D.P. and G.L.; methodology, D.P.; software, R.F.; validation, J.H.; data curation, W.L.; writing—original draft preparation, D.P.; writing—review and editing, D.P. and G.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Key Research and Development Program of China (Grant No. 2021YFC3000401), the State Key Laboratory of Geohazard Prevention and Geoenvironment Protection Independent Research Project (Grant No. SKLGP2018Z010), the National Natural Science Foundation of China (NSFC) (Grant No. 41871303), the Sichuan Provincial Science and Technology Support Project (Grant No. 2021YFG0365), the Department of Natural Resources of Sichuan Province (Grant No. kj-2021-3), and the Chengdu Technology Innovation R&D Project (Grant No. 2022-YF05-01090-SN).

**Data Availability Statement:** The images used in this paper are derived from Planet satellite data (https://www.planet.com/explorer/ (accessed on 18 September 2018)).

**Conflicts of Interest:** The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

### **References**

