Article

The Diverse Mountainous Landslide Dataset (DMLD): A High-Resolution Remote Sensing Landslide Dataset in Diverse Mountainous Regions

1 School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
2 College of Tourism & Geography Science, Yunnan Normal University, Kunming 650500, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(11), 1886; https://doi.org/10.3390/rs16111886
Submission received: 15 April 2024 / Revised: 16 May 2024 / Accepted: 22 May 2024 / Published: 24 May 2024
(This article belongs to the Special Issue AI-Driven Mapping Using Remote Sensing Data)

Abstract

The frequent occurrence of landslides poses a serious threat to people's lives and property. To evaluate landslide hazards from remote sensing images by machine learning, it is essential to establish an image database with manually labeled landslide boundaries. However, existing datasets do not cover diverse types of mountainous landslides. To address this issue, we propose a high-resolution (1 m) diverse mountainous landslide remote sensing dataset (DMLD), containing 990 landslide instances across different terrains in southwestern China. To evaluate the DMLD, seven state-of-the-art deep learning models with different loss functions were implemented on it. The experimental results demonstrate not only that deep learning methods with different characteristics can adapt well to the DMLD, but also that the DMLD has potential adaptability to other geographical regions.

1. Introduction

Landslides caused by natural or human-made factors occur in large numbers all over the world, especially in mountainous areas [1]. Earth observation data are considered the most effective data source for large-scale landslide investigation [2]. Human visual interpretation of remote sensing imagery [3] can delineate the areas affected by landslides by considering the surrounding topography and the geological setting of the disaster area [4]. Undeniably, the efficiency of this approach is limited. With the rapid development of computer image recognition technology, automatic landslide identification from remote sensing images is becoming an important means of rapid, large-area landslide detection. Traditional machine learning methods such as the support vector machine (SVM) [5] and random forest (RF) [6], as well as deep learning methods such as the convolutional neural network (CNN) [7] and Transformer [8], have been demonstrated to be effective. However, the effectiveness of these machine learning methods, especially deep learning methods, relies heavily on datasets with high-quality annotations. These models often perform poorly when labeled data are scarce, and their results are unstable on new samples that were not seen during training [9]. In addition, some existing landslide detection models were trained on a small study area [10], making it difficult to extend them effectively to other regions.
In recent years, several publicly available landslide datasets have been released. According to how they are created, these datasets can be divided into event-based and historical data-based landslide datasets. Event-based datasets are generally created after major natural disasters (e.g., volcanic eruptions, earthquakes): pre- and post-disaster remote sensing images of the affected area are collected first, and the areas where landslides occurred are then determined by comparing the images [11]. This approach allows rapid acquisition of regional landslide data, and most of the landslide areas obtained are exposed, newly formed landslide surfaces. For example, the Landslide4Sense dataset was created from Sentinel-2 imagery acquired at four different times and geographical locations [12], but its limited resolution (10 m) poses a challenge for identifying small landslides. Historical data-based landslide datasets, on the other hand, are constructed by manually annotating remote sensing images of the corresponding locations according to historical landslide reports. This approach gives access to landslides from different periods, including old landslides whose surfaces have already changed. However, such datasets generally target a limited geographic area, such as the dataset for Bijie City, Guizhou Province [13], which contains hundreds of patch-level images without sufficient contextual information. To meet the needs of remote sensing-based landslide mapping in different regions, it is important to identify landslides from higher-resolution imagery with more diverse geographical backgrounds [14].
Therefore, considering that current landslide datasets lack diverse terrains and landslide shapes, this paper proposes a diverse mountainous landslide dataset (DMLD). Based on historical landslide monitoring data, we collected and annotated landslide images at 1 m resolution from 19 counties in Yunnan Province, China, and constructed an open landslide dataset containing 490 images of 1000 × 1000 pixels and more than 990 landslides, covering a variety of terrain conditions with rich contextual information. To evaluate the properties of the constructed dataset, we trained seven state-of-the-art (SOTA) models for landslide recognition, including CNN- and Transformer-based architectures. Additionally, taking into account the data imbalance problem, we examined the impact of four different loss functions on landslide extraction performance. Furthermore, we assessed how well models trained on this dataset generalize to other landslide datasets.
The structure of this paper is as follows: In Section 2, we review the latest research on landslide recognition and landslide datasets. In Section 3, we introduce the dataset production method and the details of the proposed dataset. In Section 4, we evaluate the performance of SOTA models on the DMLD and assess their generalization to other datasets. Section 5 provides a summary and directions for future research.

2. Related Works

Landslide recognition based on remote sensing images is essentially a segmentation task aimed at accurately extracting landslide boundaries. Early methods for landslide recognition can be classified into pixel-based and object-based approaches. The former determines whether a pixel belongs to a landslide based on the value of each pixel in the remote sensing image [15]; it considers only the features of individual pixels and ignores other attributes of landslides such as shape, texture, and spatial relationships, thus losing contextual information. The latter can make full use of the spectral and shape features of landslides, reducing errors caused by relying on individual pixel information alone [16]. However, its results depend on the choice of segmentation scale, which is often determined by the texture features of the image [17]. Moreover, both of them require manually designed features [18].
In recent years, image semantic segmentation methods based on deep learning techniques have demonstrated outstanding performance [19]. Compared to traditional machine learning methods, CNN-based methods not only have excellent image processing capabilities [20] but can also automatically extract effective feature representations from images [21]. For example, Yu et al. utilized a simple CNN for preliminary landslide identification [22], Prakash et al. applied U-Net for landslide detection [23], and Sameen et al. used ResNet to fuse optical imagery and terrain information to detect landslides [24]. The effectiveness of landslide detection from satellite imagery depends strongly on how well the model is adapted to the task. Accordingly, Yi et al. proposed LandsNet, which outperformed ResUNet [25]; Chandra et al. used an improved U-Net that integrates high-level and low-level features to enhance recognition accuracy [26]; and Niu et al. improved network accuracy with attention mechanisms [27], which focus on important parameters and improve computational efficiency [28]. Recently, Transformer-based deep learning models have been applied to computer vision and have achieved excellent performance in many tasks [29]. Some researchers have applied Transformer models to landslide identification, for example using an improved SegFormer to identify coseismic landslides [30] and combining knowledge distillation with the Swin Transformer to improve rapid landslide identification [31].
Well-annotated datasets are becoming increasingly important for the advancement of deep learning models in landslide identification. Because landslides are concentrated in mountainous areas with complex geography, the availability of landslide samples for collection is extremely limited [32]. Moreover, landslide samples are irregular and fragmented in shape and thus require more labor for annotation than other geo-objects such as buildings and crops [33]. Ji et al. [13] proposed an open landslide dataset, but it only provides landslide image patches from Bijie City, Guizhou Province. Using available high-resolution remote sensing images, researchers have created landslide datasets based on regional disaster events, such as the Jiuzhaigou earthquake landslides in Sichuan Province, China [34], and the monsoon rainfall landslides in Kerala, India [11]. Considering the limited terrain diversity in individual regions, and to increase the generalizability of datasets, Ghorbanzadeh et al. created a landslide dataset based on Sentinel-2 data that covers four geographically distinct areas; however, its limited image resolution (10 m) may cause small-scale landslides to be overlooked [12]. Zhang et al. created a VHR landslide identification dataset using pre- and post-disaster imagery from Google Earth, covering 17 different cities around the world [35]. Meena et al. generated a 3 m resolution landslide dataset based on PlanetScope imagery [14], sampling landslide instances from 10 different geographic regions worldwide, including South Asia, Southeast Asia, East Asia, South America, and Central America. These hazard event-based landslide datasets can cover multiple regions and contain a large number of landslides, but they mainly focus on visibly exposed new landslides, resulting in relatively uniform context features. In practical applications, however, landslide detection requires the identification of both new and old landslides.
In summary, there is a lack of an open landslide dataset with high resolution and diverse topographic contexts that includes various forms of landslides. To address these deficiencies, we collected historical landslide data from 19 different counties in the mountainous regions of Yunnan and created a high-resolution (1 m) landslide dataset using publicly available satellite imagery.

3. Landslide Dataset

This study aims to construct a high-resolution remote sensing landslide dataset containing diverse terrain and explore the adaptability of landslide detection methods in complex contexts. Specifically, we collected landslide samples from 19 different counties, in three different topographic contexts (alpine gorges, laterite plateaus, and karst plateaus), including both newly formed landslides and old landslides, which have been triggered by different conditions, such as earthquakes, rainfall, or human-made slope-cutting.

3.1. Study Area

The study area is Yunnan, one of the most landslide-prone provinces in China [36]. By the end of 2021, there were 18,501 recorded landslide hazards in Yunnan [37], accounting for 33.26% of the total number of landslide hazards in Southwest China. Yunnan lies between 21°8′ and 29°15′ north latitude and 97°31′ and 106°11′ east longitude, with a total area of 394,000 square kilometers, of which mountains of all kinds account for 80%. The topography of Yunnan is complex and varied, dominated by plateau mountains; the overall terrain is higher in the northwest and lower in the southeast, with large elevation differences. Complex topography and widespread steep slopes provide favorable conditions for the formation of mountain landslides; frequent earthquakes along fault zones fracture the mountains and provide geological conditions for landslides; moreover, heavy rainfall is the main trigger of landslides in Yunnan. Yunnan Province comprises western, central, and eastern topographic regions, as shown in Figure 1. Western Yunnan is an alpine gorge region characterized by mountain valleys that are high in the north and low in the south, with large relief and intense tectonic activity. Central Yunnan is a laterite plateau with obvious red soil outcrops and significant differences in the softness and hardness of the strata. Eastern Yunnan belongs to the karst plateau, with obvious karst landforms, rugged and broken terrain, and relatively sparse vegetation cover. We sampled landslides from these three regions to ensure the diversity of topographic and geomorphic characteristics in the dataset.

3.2. Data Source and Processing

3.2.1. Data Collection

In order to sample the major terrain areas in Yunnan and increase the diversity of terrain, we selected landslide disaster points in 19 counties based on historical landslide disaster reports, as shown in Figure 2. To ensure image quality, we first checked for and removed areas covered by cloud or shadow. Furthermore, in ArcGIS 10.2, we aligned the image coordinate system with that of the disaster points to ensure positional accuracy. The topographic regions, elevation ranges, and distribution of landslide sites in the 19 selected counties are shown in Table 1.

3.2.2. Data Processing

To ensure the authenticity of the sampled landslide points, we obtained open-source remote sensing images with 1 m resolution and the highest-resolution open-source DEM available from Tianditu. We resampled the DEM uniformly to 8 m and aligned the remote sensing images with the DEM in ArcGIS 10.2. Since landslides typically occur on slopes of 10 to 45 degrees [38], we calculated the slope at each landslide point from the DEM in ArcGIS 10.2 and filtered out points outside the 10-to-45-degree range. Considering the varying scales of landslides, and to retain complete information about each landslide and its surroundings, we cropped the imagery into 1000 × 1000 pixel tiles centered on the landslide points. The context of these images includes various non-landslide features such as water bodies, vegetation, buildings, and roads. Furthermore, 1000 × 1000 tiles can later be cropped into blocks suitable as input to deep learning models, avoiding the information loss introduced by up-sampling or down-sampling.
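A minimal sketch of this slope-based filtering and tile cropping, assuming the imagery and the 8 m DEM are already co-registered; rasterio and a finite-difference slope estimate stand in here for the ArcGIS 10.2 tools actually used.

```python
import numpy as np
import rasterio
from rasterio.windows import Window


def slope_degrees(dem, cell_size=8.0):
    """Approximate slope in degrees from a DEM resampled to 8 m cells."""
    dz_dy, dz_dx = np.gradient(dem.astype(np.float32), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def point_passes_slope_filter(slope, row, col, lo=10.0, hi=45.0):
    """Keep a landslide point only if the local slope lies in the 10-45 degree range."""
    return lo <= slope[row, col] <= hi


def crop_tile(image_path, center_row, center_col, size=1000):
    """Crop a size x size tile from the 1 m imagery, centered on a landslide point."""
    half = size // 2
    with rasterio.open(image_path) as src:
        window = Window(center_col - half, center_row - half, size, size)
        # boundless read pads tiles that fall partly outside the image
        return src.read(window=window, boundless=True, fill_value=0)
```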

3.2.3. Data Annotation

It is crucial to ensure that the annotated landslides show sufficiently distinctive features in the high-resolution remote sensing images. Therefore, several volunteers majoring in remote sensing were invited to annotate the cropped image blocks using LabelMe [39], carefully delineating the boundary of each landslide. During the annotation process, we referred to the location of the landslide hazard site, the topography around the landslide, and the distribution of the DEM. All identifiable landslides in each image were annotated, with particular attention paid to small landslides.
There are many terraced fields in the Yunnan region, and in areas with steep slopes landslides may occur on or near them. Unplanted terraced fields can resemble the exposed surface of a new landslide, which can lead to confusion. In addition, some old, stable landslides are covered by vegetation and not evident in the imagery, which poses a challenge for interpretation, as shown in Figure 3. For landslides with ambiguous image features, we performed visual inspections using Google Earth imagery from multiple time periods; if a case was still ambiguous, we discarded it to avoid negative effects on model training. After completing the annotations, we conducted three rounds of validation to ensure their completeness. In the end, from a total of 3497 images cropped around historical landslide points, we retained 490 finely annotated 1000 × 1000 landslide images, forming the diverse mountainous landslide dataset.

3.3. Dataset Analysis

The final dataset consists of 490 images of 1000 × 1000 pixels at 1 m resolution and contains 990 landslides with a total landslide area of more than 5.79 km². Landslide pixels make up 1.18% of the dataset and background pixels 98.82%. To characterize the imagery statistically, we conducted a spectral analysis of the images. Figure 4a shows the histogram of the spectral distribution of the dataset, i.e., the frequency of different pixel values, reflecting the spectral distribution of remote sensing features in the dataset. Figure 4b shows the mean and variance of the red, green, and blue bands.
To quantitatively analyze the complexity of context information in the dataset, we computed the gray-level co-occurrence matrix (GLCM) to obtain image contrast and energy, as shown in Figure 4c. We calculated the average contrast and energy in four directions, horizontal (0°), vertical (90°), and diagonal (45° and 135°), which cover the major angles and capture the texture information of the image more comprehensively. The average contrast values in the four directions of the images are relatively high, indicating dramatic texture variations in the dataset. The average energy values of the four directions are relatively close to each other, indicating that the gray values in the image are concentrated in a specific range. Overall, the images have intense texture features, with significant grayscale variations between different regions, and these variations are evenly distributed throughout the images.
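A minimal sketch of this GLCM computation, assuming scikit-image as the implementation (the paper does not name its tooling); contrast and energy are averaged over the four offsets (0°, 45°, 90°, and 135°).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops


def glcm_texture_stats(gray_image):
    """Average GLCM contrast and energy over the 0, 45, 90, and 135 degree offsets.

    gray_image is expected to be an 8-bit (uint8) grayscale array.
    """
    angles = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
    glcm = graycomatrix(gray_image, distances=[1], angles=angles,
                        levels=256, symmetric=True, normed=True)
    contrast = graycoprops(glcm, "contrast").mean()  # shape (1, 4) -> scalar
    energy = graycoprops(glcm, "energy").mean()
    return contrast, energy
```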
Shannon entropy is a measure of uncertainty in the grayscale distribution of image pixels: a higher entropy value indicates more information in the image, and a lower value indicates less. To quantitatively evaluate the richness of contextual information in the dataset, we calculated the Shannon entropy of the DMLD and compared it with other publicly available remote sensing landslide datasets, as shown in Figure 4d. Two existing landslide datasets were selected for comparison. The Bijie-landslide-dataset is a historical data-based landslide dataset consisting of hundreds of landslide patches with small image sizes and limited context information. The GVLM dataset is an event-based landslide dataset created by collecting pre-disaster and post-disaster images after natural disasters such as earthquakes and heavy rainfall; most of its landslides are new landslides formed after a disaster, and the context information is relatively simple. As Figure 4d shows, the DMLD has the highest Shannon entropy, indicating a richer information content.
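A corresponding sketch for the Shannon entropy statistic, computed here from a 256-bin grayscale histogram with numpy; the binning choice is an assumption.

```python
import numpy as np


def shannon_entropy(gray_image, bins=256):
    """Shannon entropy (bits) of the grayscale pixel distribution of one image."""
    hist, _ = np.histogram(gray_image, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())
```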
Some sample landslide images from the dataset and their corresponding labels are shown in Figure 5, which illustrates the variety of terrains and landslide shapes.

4. Experiments and Result Analysis

4.1. Experiments Design

Machine learning-based landslide recognition requires models to make full use of information from different scales and contexts. To explore model performance on the DMLD dataset, seven SOTA deep learning models capable of representing multi-scale features and contextual information were selected for landslide recognition: four classical semantic segmentation networks, FCN, PSPNet, DeepLabv3+, and HRNet, and three more recent networks, Swin Transformer, SegFormer, and ConvNeXt, as shown in Table 2.
In this study, semantic segmentation is a binary classification task that distinguishes landslide pixels from background. As described above, landslides make up a relatively small portion of each image, which leads to class imbalance. To better explore model performance on the dataset and address this imbalance, we experimented with four different loss functions, as shown in Table 3.
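As an illustration of how one of these imbalance-oriented losses can be implemented, below is a hedged PyTorch sketch of the BCE Loss + Dice Loss combination from Table 3; the scaling factor gamma and the smoothing constant eps are assumptions, as their exact values are not reported in this paper.

```python
import torch
import torch.nn as nn


class BCEDiceLoss(nn.Module):
    """BCE Loss plus a scaled Dice Loss for binary landslide/background masks."""

    def __init__(self, gamma=1.0, eps=1e-6):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.gamma = gamma  # scaling factor between the two terms (assumed value)
        self.eps = eps      # smoothing constant to avoid division by zero

    def forward(self, logits, target):
        # logits and target have shape (N, 1, H, W); target values are 0 or 1
        target = target.float()
        bce = self.bce(logits, target)
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum(dim=(1, 2, 3))
        denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
        dice = 1.0 - (2.0 * inter + self.eps) / (denom + self.eps)
        return bce + self.gamma * dice.mean()
```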
Finally, to explore the generalization of models trained on the DMLD, we used the seven deep learning models to test two landslide areas selected from the GVLM dataset [35]. The results were analyzed and evaluated to validate the potential adaptability of the DMLD dataset to other mountain environments.

4.2. Experimental Settings

1. Data Augmentation
In order to increase the number and diversity of training samples, we used overlapping sampling for data augmentation. Overlapping sampling generates partially overlapping patches, which improves the model's generalization ability and robustness, reduces the influence of abnormal samples and noise, and lowers the risk of overfitting. Specifically, the images in the dataset were sampled with a 512 × 512 window centered on the landslides in the labels, with a stride of 128 (a sketch of this sampling is given at the end of this subsection). We divided the data into training, validation, and test sets at a ratio of 8:1:1. Subsequently, various data augmentation methods were applied to the training set, including translation, rotation, scaling, random flipping, random lighting enhancement, photometric distortion, and random cropping.
2. Parameter Configuration
To eliminate scale differences in the image pixel values, we normalized them to [0, 1]. To control the experimental variables, the AdamW optimizer was used for training all models, with the learning rate set to 2 × 10⁻⁴ and the weight decay set to 0.01. We used a polynomial learning rate schedule in which the learning rate first increases linearly during a 1500-iteration warmup and then decays according to a polynomial rule with the power set to 1.0; the maximum number of training iterations was 60 k (an equivalent optimizer and scheduler sketch is given at the end of this subsection). To maintain a fair comparison environment, we conducted all experiments with PyTorch (version 1.11.0) [50] and MMSegmentation [51]. All experiments were trained on a single NVIDIA GeForce RTX 2080 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA) with a batch size of 2.
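A minimal sketch of the overlapping sampling described in item 1 above; the rule of keeping only windows that contain labeled landslide pixels is an assumption consistent with sampling "centered on the landslides in the labels".

```python
import numpy as np


def overlapping_patches(image, mask, size=512, stride=128):
    """Yield (image_patch, mask_patch) pairs from a 1000 x 1000 tile and its label.

    Windows without any landslide pixels are skipped (an assumed keep-rule
    consistent with sampling centered on the labeled landslides).
    """
    h, w = mask.shape
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            m = mask[top:top + size, left:left + size]
            if m.any():
                yield image[top:top + size, left:left + size], m
```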
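An equivalent stand-alone PyTorch formulation of the optimizer and learning rate schedule described in item 2 (the actual experiments used MMSegmentation configs); this is a sketch, not the configuration file itself.

```python
import torch


def build_optimizer_and_scheduler(model, max_iters=60_000, warmup_iters=1_500,
                                  base_lr=2e-4, weight_decay=0.01, power=1.0):
    """AdamW with a linear warmup followed by polynomial decay (power = 1.0)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr,
                                  weight_decay=weight_decay)

    def lr_factor(iteration):
        if iteration < warmup_iters:
            return (iteration + 1) / warmup_iters         # linear warmup
        progress = (iteration - warmup_iters) / max(1, max_iters - warmup_iters)
        return (1.0 - min(progress, 1.0)) ** power        # polynomial decay

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_factor)
    return optimizer, scheduler
```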

4.3. Results and Analysis

4.3.1. Recognition Results with SOTA Models

To evaluate the adaptability of advanced deep learning models to the DMLD dataset, seven SOTA models were selected for the experiments. The quantitative results of all seven models and their different versions are shown in Table 4. FCN, PSPNet, and DeepLabv3+ all use ResNet-50 and ResNet-101 backbones, and the loss function used by all models is Binary CrossEntropy. After 60 k training iterations, the base version of ConvNeXt achieved the highest accuracy, with an IoU of 90.59%, a recall of 94.14%, and an F1 score of 95.06%, indicating that this model fitted the dataset well. SegFormer-b5 achieved the second-highest IoU of 89.51% and a 94.47% F1 score while having lower parameter and computational complexity, with only 81.97 M parameters. In terms of precision, PSPNet achieved the highest values, with PSPNet-r50 reaching 98.61%, but its recall was the second lowest at 76.35%, indicating a large number of missed detections. It is worth noting that precision was higher than recall for all models, which indicates better performance in ensuring prediction accuracy but a weaker ability to detect all real landslides, although some false positives remain in the results. The three more recent models, SegFormer, Swin Transformer, and ConvNeXt, balanced recall and precision well; among them, ConvNeXt achieved the best balance through multiple optimizations of its structure. FCN, PSPNet, DeepLabv3+, and HRNet obtained high precision but low recall, and a lower recall indicates a higher number of undetected landslides.
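For reference, a small sketch of how the reported pixel-level metrics (IoU, precision, recall, F1) can be computed from binary prediction and ground-truth masks; this follows the standard definitions and is not the paper's exact evaluation code.

```python
import numpy as np


def landslide_metrics(pred, target, eps=1e-9):
    """Pixel-level IoU, precision, recall, and F1 for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return iou, precision, recall, f1
```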
We selected landslide samples containing a variety of topographic contexts, including landslides of different sizes, to qualitatively evaluate the models. The landslide detection results are shown in Figure 6, with true-positive pixels marked in red, missed landslide pixels (false negatives) in green, and non-landslide pixels incorrectly detected as landslides (false positives) in blue. Although some bare areas in the images resemble landslide features, the models show a low false detection rate, indicating their ability to effectively differentiate landslide from non-landslide areas. In terms of missed detection, most models tended to ignore small landslides, and it is also difficult to recognize a complete landslide boundary when vegetation appears on the landslide body. FCN, PSPNet, DeepLabv3+, and HRNet performed poorly on old landslides that are almost completely covered by vegetation, whereas ConvNeXt, SegFormer, and Swin Transformer have sufficiently robust feature representations to discriminate such landslides. ConvNeXt-base produced good predictions for most samples, making full use of multi-scale context information, but tended to miss small landslides. Overall, the predictions of all models are conservative, with high precision and low recall, which is probably due to data imbalance: despite the data augmentation strategies, the large difference in the proportion of landslide and non-landslide pixels causes the models to classify some difficult-to-identify areas as non-landslide.
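A sketch of the color coding used in Figure 6 (red for true positives, green for missed landslide pixels, blue for false alarms), rendered as a simple RGB map; the exact rendering in the figure is an assumption.

```python
import numpy as np


def error_map(pred, target):
    """RGB error map: red = true positive, green = missed landslide (false
    negative), blue = false alarm (false positive); background stays black."""
    pred, target = pred.astype(bool), target.astype(bool)
    rgb = np.zeros((*pred.shape, 3), dtype=np.uint8)
    rgb[np.logical_and(pred, target)] = (255, 0, 0)
    rgb[np.logical_and(~pred, target)] = (0, 255, 0)
    rgb[np.logical_and(pred, ~target)] = (0, 0, 255)
    return rgb
```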

4.3.2. Recognition Results with Different Loss Functions

In the DMLD, the number of landslide pixels is far smaller than the number of background pixels. Such class imbalance can cause machine learning models to favor the majority class and ignore the minority class. To address this in landslide recognition, we evaluated three loss functions designed for imbalance, BCE Loss combined with Dice Loss, Focal Loss, and Lovasz Loss, against the CrossEntropy baseline. The base network for these experiments is ConvNeXt-base with CrossEntropy Loss, which achieved the best accuracy in Section 4.3.1. The quantitative evaluation of the four loss functions is shown in Table 5: CrossEntropy Loss achieves the best performance on all accuracy metrics, and accuracy is not improved by Focal Loss, BCE Loss combined with Dice Loss, or Lovasz Loss. To investigate this, we next analyze whether the loss functions show unusual behavior during training and then qualitatively examine their effect on landslide extraction results.
In order to track the impacts of different loss functions in the training process, we visualize the training–loss curves and the F1 score change curves under different loss functions, as shown in Figure 7. As the number of iterations increases, all four loss functions decrease during training, and by 60,000 iterations, they are close to convergence. Among them, the Focal Loss decreases very quickly at the beginning of training and can fit the dataset quickly. According to the F1 score curve, the Focal Loss is close to convergence at about 40,000 iterations, and the Lovasz Loss converges slowly at the beginning of the training period, but the convergence rate increases rapidly as the number of iterations increases. Both the training–loss curves and the F1 score curves indicate that the models corresponding to the four loss functions have stabilized by the end of training.
We selected representative landslide samples, including small landslides and easily confused landslides, to qualitatively test the models trained with different loss functions, as shown in Figure 8. The qualitative results show that the Focal Loss-based model distinguishes small and easily confused landslides well. Focal Loss strengthens the focus on hard-to-classify samples by reducing the weights of easy-to-classify ones; however, since most landslides in the dataset are easy to classify and have clear features, this down-weighting may lower the overall accuracy even though it performs well on difficult samples. BCE Loss combined with Dice Loss improves the recognition rate of small landslides and helps address the class imbalance problem. Lovasz Loss performs poorly on the dataset and struggles even with some simple landslide samples, because it focuses more on overall category boundary matching and is therefore insensitive to small landslides and confusing regions. Based on the above analysis, CrossEntropy Loss is suitable for landslide extraction tasks that prioritize overall accuracy, while Focal Loss is preferable for tasks focused on small landslides.
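For the down-weighting behavior of Focal Loss described above, a hedged binary PyTorch sketch; the values of alpha and gamma are assumptions, as the paper does not report them.

```python
import torch


def binary_focal_loss(logits, target, alpha=0.25, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over pixels."""
    target = target.float()
    prob = torch.sigmoid(logits)
    p_t = prob * target + (1.0 - prob) * (1.0 - target)        # probability of the true class
    alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)  # class-balancing weight
    loss = -alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t.clamp(min=1e-8))
    return loss.mean()
```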

4.3.3. Model Generalization Ability

To assess the adaptability of the landslide dataset, we selected two landslide areas from the GVLM dataset [35]: Kodagu, India, and Kaikoura, New Zealand. Both regions are mountainous and share some similar image features with our dataset, but there are differences in image color saturation and contrast. The image comparisons of the DMLD with Kodagu and Kaikoura are shown in Figure 9. The landslides in Kodagu are distributed in long strips and have a rough surface texture, as shown in Figure 9a, while in the DMLD the landslides have a smoother surface and are distributed in small patches (Figure 9b). The landslide in Kaikoura occurred on a slope near a road, as shown in Figure 9c, and similar samples are included in the DMLD. The surface characteristics of the landslide in Kaikoura are similar to those in the DMLD, as shown in Figure 9d.
The two selected landslide regions contain large, continuous landslides, which are rare in our dataset and therefore challenge the completeness of landslide identification. We used the highest-accuracy version of each trained model to test on Kodagu and Kaikoura; the quantitative results are shown in Table 6 and Table 7. FCN achieved the highest IoU and F1 score in both regions, reaching 58.41% and 73.75% in Kodagu and 51.73% and 68.19% in Kaikoura, which indicates that FCN generalizes well and holds a significant advantage over the other models. This good generalization is due to its simple structure, which replaces the fully connected layers of traditional CNNs with convolutional layers, reducing model complexity and the risk of overfitting. HRNet has the highest recall in Kodagu at 65.99% and the second-highest F1 score at 64.24%; in Kaikoura, HRNet performed similarly to FCN, with F1 score and IoU falling behind by less than 1%. DeepLabv3+ had the highest precision in both regions, reaching 93.95% in Kodagu and 95.39% in Kaikoura, but markedly lower recall. Moreover, recall is lower than precision for all models, indicating that the DMLD differs from Kodagu and Kaikoura in landslide features and image background, which makes it difficult for the models to recognize landslides in these regions completely. ConvNeXt, which performed best on the DMLD dataset, did not generalize well, achieving only a 17.95% IoU and 30.43% F1 score in Kodagu and a medium-level performance in Kaikoura. The landslides in Kodagu are long, continuous, bare landslides with features similar to some bare ground in the DMLD, which may have led the model to classify them as background, whereas the landslides in Kaikoura are distributed in patches next to roads, and there are numerous similar roadside landslide samples in the DMLD. ConvNeXt has a strong learning ability and fitted the training set well within 60,000 iterations, with some tendency toward overfitting; as a result, it performed worse in Kodagu, which differs more from the DMLD, and better in Kaikoura, which is more similar to it. Among the Transformer architectures, SegFormer achieved the third-highest IoU and F1 score in Kodagu, while Swin Transformer achieved the third-highest in Kaikoura; their attention mechanisms provide global contextual information, contributing to improved generalization. Overall, the models generalize better to Kaikoura than to Kodagu, mainly because Kaikoura's terrain context is closer to the DMLD.
The prediction results of the seven models in the major landslide areas of Kodagu and Kaikoura are shown in Figure 10. The quantitative and qualitative analyses indicate that HRNet and FCN can effectively learn multi-scale information from the dataset and thus show good generalization. In Kodagu, HRNet and FCN achieve relatively complete predictions. In Kaikoura, all models recognize the major landslide areas in the mountainous region, but miss detections in the exposed areas along the water bodies, probably because of a lack of similar samples in the DMLD, which leads the models to classify shoreline areas as non-landslide exposed surfaces. ConvNeXt identifies landslides in mountainous areas well, but misclassifies water bodies as landslides, suggesting that it has a high learning capacity and may be overfitting the training set.

5. Conclusions

In response to the lack of remote sensing landslide datasets with diverse topographic contexts, this paper establishes a large-scale landslide dataset covering diverse mountainous settings. From this study, we draw the following conclusions: (1) Seven SOTA deep learning models were adopted to evaluate their applicability to the dataset, and ConvNeXt achieved the best result with a well-balanced performance. (2) For the class imbalance caused by the small percentage of landslide pixels in the images, Focal Loss improved the identification of both small landslides and easily confused landslides. (3) The experimental results demonstrate that FCN and HRNet generalize well and that the DMLD dataset has potential adaptability to other mountainous regions.
Although the deep learning models perform well on the DMLD dataset, the adaptability of the models to different geographic regions needs to be further explored. For example, differences in vegetation, soil types, and topography between different geographic regions may affect the accuracy of the models. Future research could improve these models by using domain adaptation techniques to increase their accuracy in extracting landslide boundaries in different geographic regions.

Author Contributions

Conceptualization, J.C. and X.Z.; methodology, J.C. and X.Z.; software, X.Z. and K.C.; validation, J.Z., Y.G. and K.C.; formal analysis, J.Z. and Y.G.; investigation, Y.G.; resources, J.C.; data curation, X.Z.; writing—original draft preparation, X.Z.; writing—review and editing, J.C.; visualization, X.Z.; supervision, M.D.; project administration, J.C.; funding acquisition, L.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the major scientific and technological projects of Yunnan Province (Grant No. 202202AD080010).

Data Availability Statement

The DMLD dataset is available for download at https://github.com/RS-CSU/DMLD-Dataset, accessed on 15 April 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ghorbanzadeh, O.; Shahabi, H.; Crivellari, A.; Homayouni, S.; Blaschke, T.; Ghamisi, P. Landslide Detection Using Deep Learning and Object-Based Image Analysis. Landslides 2022, 19, 929–939. [Google Scholar] [CrossRef]
  2. Shi, W.; Zhang, M.; Ke, H.; Fang, X.; Zhan, Z.; Chen, S. Landslide Recognition by Deep Convolutional Neural Network and Change Detection. IEEE Trans. Geosci. Remote Sens. 2020, 59, 4654–4672. [Google Scholar] [CrossRef]
  3. Samodra, G.; Chen, G.; Sartohadi, J.; Kasama, K. Generating Landslide Inventory by Participatory Mapping: An Example in Purwosari Area, Yogyakarta, Java. Geomorphology 2018, 306, 306–313. [Google Scholar] [CrossRef]
  4. Xu, C. Preparation of Earthquake-Triggered Landslide Inventory Maps Using Remote Sensing and GIS Technologies: Principles and Case Studies. Geosci. Front. 2015, 6, 825–836. [Google Scholar] [CrossRef]
  5. Marjanović, M.; Kovačević, M.; Bajat, B.; Mihalić Arbanas, S.; Abolmasov, B. Landslide Assessment of the Strača Basin (Croatia) Using Machine Learning Algorithms. Acta Geotech. Slov. 2011, 8, 45–55. [Google Scholar]
  6. Goetz, J.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating Machine Learning and Statistical Prediction Techniques for Landslide Susceptibility Modeling. Comput. Geosci. 2015, 81, 1–11. [Google Scholar] [CrossRef]
  7. Ding, A.; Zhang, Q.; Zhou, X.; Dai, B. Automatic Recognition of Landslide Based on CNN and Texture Change Detection. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 444–448. [Google Scholar]
  8. Yang, Z.; Xu, C.; Li, L. Landslide Detection Based on ResU-Net with Transformer and CBAM Embedded: Two Examples with Geologically Different Environments. Remote Sens. 2022, 14, 2885. [Google Scholar] [CrossRef]
  9. Alcorn, M.A.; Li, Q.; Gong, Z.; Wang, C.; Mai, L.; Ku, W.-S.; Nguyen, A. Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–17 June 2019; pp. 4845–4854. [Google Scholar]
  10. Liu, Y.; Yao, X.; Gu, Z.; Zhou, Z.; Liu, X.; Chen, X.; Wei, S. Study of the Automatic Recognition of Landslides by Using InSAR Images and the Improved Mask R-CNN Model in the Eastern Tibet Plateau. Remote Sens. 2022, 14, 3362. [Google Scholar] [CrossRef]
  11. Hao, L.; Rajaneesh, A.; Van Westen, C.; Sajinkumar, K.S.; Martha, T.R.; Jaiswal, P.; McAdoo, B.G. Constructing a Complete Landslide Inventory Dataset for the 2018 Monsoon Disaster in Kerala, India, for Land Use Change Analysis. Earth Syst. Sci. Data 2020, 12, 2899–2918. [Google Scholar] [CrossRef]
  12. Ghorbanzadeh, O.; Xu, Y.; Ghamisi, P.; Kopp, M.; Kreil, D. Landslide4Sense: Reference Benchmark Data and Deep Learning Models for Landslide Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–17. [Google Scholar] [CrossRef]
  13. Ji, S.; Yu, D.; Shen, C.; Li, W.; Xu, Q. Landslide Detection from an Open Satellite Imagery and Digital Elevation Model Dataset Using Attention Boosted Convolutional Neural Networks. Landslides 2020, 17, 1337–1352. [Google Scholar] [CrossRef]
  14. Meena, S.R.; Nava, L.; Bhuyan, K.; Puliero, S.; Soares, L.P.; Dias, H.C.; Floris, M.; Catani, F. HR-GLDD: A Globally Distributed Dataset Using Generalized Deep Learning (DL) for Rapid Landslide Mapping on High-Resolution (HR) Satellite Imagery. Earth Syst. Sci. Data 2023, 15, 3283–3298. [Google Scholar] [CrossRef]
  15. Zhao, W.; Li, A.; Nan, X.; Zhang, Z.; Lei, G. Postearthquake Landslides Mapping from Landsat-8 Data for the 2015 Nepal Earthquake Using a Pixel-Based Change Detection Method. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 1758–1768. [Google Scholar] [CrossRef]
  16. Lu, P.; Stumpf, A.; Kerle, N.; Casagli, N. Object-Oriented Change Detection for Landslide Rapid Mapping. IEEE Geosci. Remote Sens. Lett. 2011, 8, 701–705. [Google Scholar] [CrossRef]
  17. Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2016, 55, 645–657. [Google Scholar] [CrossRef]
  18. Zhu, M.; He, Y.; He, Q. A Review of Researches on Deep Learning in Remote Sensing Application. Int. J. Geosci. 2019, 10, 1–11. [Google Scholar] [CrossRef]
  19. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected Crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [Google Scholar] [CrossRef] [PubMed]
  20. Kang, Q.; Li, K.-Q.; Fu, J.-L.; Liu, Y. Hybrid LBM and Machine Learning Algorithms for Permeability Prediction of Porous Media: A Comparative Study. Comput. Geotech. 2024, 168, 106163. [Google Scholar] [CrossRef]
  21. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  22. Yu, H.; Ma, Y.; Wang, L.; Zhai, Y.; Wang, X. A Landslide Intelligent Detection Method Based on CNN and RSG_R. In Proceedings of the 2017 IEEE International Conference on Mechatronics and Automation (ICMA), Takamatsu, Japan, 6–9 August 2017; pp. 40–44. [Google Scholar]
  23. Prakash, N.; Manconi, A.; Loew, S. Mapping Landslides on EO Data: Performance of Deep Learning Models vs. Traditional Machine Learning Models. Remote Sens. 2020, 12, 346. [Google Scholar] [CrossRef]
  24. Sameen, M.I.; Pradhan, B. Landslide Detection Using Residual Networks and the Fusion of Spectral and Topographic Information. IEEE Access 2019, 7, 114363–114373. [Google Scholar] [CrossRef]
  25. Yi, Y.; Zhang, W. A New Deep-Learning-Based Approach for Earthquake-Triggered Landslide Detection from Single-Temporal RapidEye Satellite Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6166–6176. [Google Scholar] [CrossRef]
  26. Chandra, N.; Sawant, S.; Vaidya, H. An Efficient U-Net Model for Improved Landslide Detection from Satellite Images. PFG—J. Photogramm. Remote Sens. Geoinf. Sci. 2023, 91, 13–28. [Google Scholar] [CrossRef]
  27. Niu, C.; Ma, K.; Shen, X.; Wang, X.; Xie, X.; Tan, L.; Xue, Y. Attention-Enhanced Region Proposal Networks for Multi-Scale Landslide and Mudslide Detection from Optical Remote Sensing Images. Land 2023, 12, 313. [Google Scholar] [CrossRef]
  28. Kang, Q.; Chen, E.J.; Li, Z.-C.; Luo, H.-B.; Liu, Y. Attention-Based LSTM Predictive Model for the Attitude and Position of Shield Machine in Tunneling. Undergr. Space 2023, 13, 335–350. [Google Scholar] [CrossRef]
  29. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar] [CrossRef]
  30. Tang, X.; Tu, Z.; Wang, Y.; Liu, M.; Li, D.; Fan, X. Automatic Detection of Coseismic Landslides Using a New Transformer Method. Remote Sens. 2022, 14, 2884. [Google Scholar] [CrossRef]
  31. Huang, R.; Chen, T. Landslide Recognition from Multi-Feature Remote Sensing Data Based on Improved Transformers. Remote Sens. 2023, 15, 3340. [Google Scholar] [CrossRef]
  32. Qin, S.; Guo, X.; Sun, J.; Qiao, S.; Zhang, L.; Yao, J.; Cheng, Q.; Zhang, Y. Landslide Detection from Open Satellite Imagery Using Distant Domain Transfer Learning. Remote Sens. 2021, 13, 3383. [Google Scholar] [CrossRef]
  33. Mohan, A.; Singh, A.K.; Kumar, B.; Dwivedi, R. Review on Remote Sensing Methods for Landslide Detection Using Machine and Deep Learning. Trans. Emerg. Telecommun. Technol. 2021, 32, e3998. [Google Scholar] [CrossRef]
  34. Yi, Y.; Zhang, Z.; Zhang, W.; Jia, H.; Zhang, J. Landslide Susceptibility Mapping Using Multiscale Sampling Strategy and Convolutional Neural Network: A Case Study in Jiuzhaigou Region. Catena 2020, 195, 104851. [Google Scholar] [CrossRef]
  35. Zhang, X.; Yu, W.; Pun, M.-O.; Shi, W. Cross-Domain Landslide Mapping from Large-Scale Remote Sensing Images Using Prototype-Guided Domain-Aware Progressive Representation Learning. ISPRS J. Photogramm. Remote Sens. 2023, 197, 1–17. [Google Scholar] [CrossRef]
  36. Zhu, Y.; Hong, T.; Wang, J.; Luo, Y.; Yang, K. Landslide and Debris Flow Hazard Risk Analysis and Assessment in Yunnan Province. In Proceedings of the 2018 Eighth International Conference on Instrumentation & Measurement, Computer, Communication and Control (IMCCC), Harbin, China, 19–21 July 2018; pp. 772–777. [Google Scholar]
  37. Tie, Y.; Ge, H.; Gao, Y.; Bai, Y. The Research Progress and Prospect of Geological Hazards in Southwest China since the 20th Century. Sediment. Geol. Tethyan Geol. 2022, 42, 653–665. [Google Scholar]
  38. Çellek, S. Effect of the Slope Angle and Its Classification on Landslide. Nat. Hazards Earth Syst. Sci. Discuss. 2020, 2020, 1–23. [Google Scholar]
  39. Russell, B.C.; Torralba, A.; Murphy, K.P.; Freeman, W.T. LabelMe: A Database and Web-Based Tool for Image Annotation. Int. J. Comput. Vis. 2008, 77, 157–173. [Google Scholar] [CrossRef]
  40. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 1–12 June 2015; pp. 3431–3440. [Google Scholar]
  41. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890. [Google Scholar]
  42. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
  43. Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep High-Resolution Representation Learning for Human Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5693–5703. [Google Scholar]
  44. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
  45. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090. [Google Scholar]
  46. Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A Convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986. [Google Scholar]
  47. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
  48. Li, X.; Sun, X.; Meng, Y.; Liang, J.; Wu, F.; Li, J. Dice Loss for Data-Imbalanced NLP Tasks. arXiv 2019, arXiv:1911.02855. [Google Scholar]
  49. Berman, M.; Triki, A.R.; Blaschko, M.B. The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-over-Union Measure in Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4413–4421. [Google Scholar]
  50. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An Imperative Style, High-Performance Deep Learning Library. Adv. Neural Inf. Process. Syst. 2019, 32, 8024–8035. [Google Scholar] [CrossRef]
  51. MMSegmentation Contributors. MMSegmentation: OpenMMLab Semantic Segmentation Toolbox and Benchmark 2020. Available online: https://github.com/open-mmlab/mmsegmentation (accessed on 15 April 2024).
Figure 1. Remote sensing image and terrain display of Yunnan Province.
Figure 2. Selected regions and distribution of landslide disaster points.
Figure 3. Difficult-to-annotate landslides. Red frames indicate the landslide areas.
Figure 4. Statistical information about the dataset images. (a) Histogram of the spectral distribution, (b) mean and variance of each band, (c) contrast and energy calculated from the GLCM, and (d) comparison of the Shannon entropy with other datasets.
Figure 5. Presentation of landslide samples in the dataset. Red frames indicate the landslide areas.
Figure 6. Visualization of detection results from different models.
Figure 7. Training–loss curve (a) and F1 score curve (b) for different loss functions.
Figure 8. Visualization of detection results for different loss functions.
Figure 9. Image comparisons of the DMLD with Kodagu and Kaikoura. (a) Landslide images in Kodagu and (b) similar landslide images in the DMLD; (c) landslide images in Kaikoura and (d) similar landslide images in the DMLD.
Figure 10. Visualization results of regional generalization.
Table 1. Geographical information statistics of study areas.

| Terrain Region | County | Elevation (m) | Area (km²) | Number of Landslide Points |
|---|---|---|---|---|
| Western Alpine Gorge | Gongshan | 1170–5128 | 4379 | 61 |
| | Deqin | 1840.5–6740 | 7291 | 74 |
| | Tengchong | 930–3870 | 5845 | 143 |
| | Longling | 535–3001.6 | 2884 | 122 |
| | Lianghe | 860–2672.8 | 1136 | 118 |
| | Mangshi | 528–2889.1 | 2901 | 63 |
| | Changning | 608–2875.9 | 3888 | 291 |
| | Fengqing | 919–3098.7 | 3335 | 287 |
| | Zhenyuan | 774–3137 | 4223 | 235 |
| | Luchun | 320–2637 | 3097 | 204 |
| Central Laterite Plateau | Huaping | 1015–3198 | 2200 | 152 |
| | Xinping | 422–3165.9 | 4223 | 288 |
| | Dayao | 1023–3657 | 4146 | 193 |
| | Nanhua | 963–2861 | 2343 | 155 |
| | Lufeng | 1309–2754 | 3536 | 136 |
| | Chuxiong | 556–3657 | 4433 | 156 |
| Eastern Karst Plateau | Yongshan | 340–3199.5 | 2778 | 237 |
| | Huize | 695–4017.3 | 5884 | 371 |
| | Xuanwei | 920–2868 | 6070 | 211 |
Table 2. Models selected in this study.

| Model | Year | Characteristics |
|---|---|---|
| FCN [40] | 2015 | FCN uses fully convolutional layers instead of fully connected layers, which allows it to capture semantic information in images at different scales. |
| PSPNet [41] | 2017 | PSPNet introduces a pyramid pooling module that performs pooling operations on feature maps at different scales to obtain global context information. |
| DeepLabv3+ [42] | 2018 | DeepLabv3+ introduces dilated convolution and residual connections to enlarge the receptive field and efficiently capture contextual information. In addition, a spatial pyramid pooling module fuses features at different scales, improving segmentation accuracy. |
| HRNet [43] | 2019 | HRNet retains rich image details by constructing a high-resolution feature pyramid. It maintains features at different resolutions simultaneously and exchanges and fuses information between resolutions to fully exploit multi-scale information. |
| Swin Transformer [44] | 2021 | Swin Transformer processes large images with shifted windows, dividing the image into multiple windows and applying local self-attention within each window while approximating global modeling through window shifting. It also merges neighboring patches with a multi-scale strategy similar to CNN pooling to enlarge the receptive field and acquire multi-scale features. |
| SegFormer [45] | 2021 | SegFormer treats pixels as a sequence and applies self-attention on the sequence to obtain global contextual information and long-range dependencies. It also employs a hierarchical structure and a multi-scale feature fusion strategy to capture information at different scales. |
| ConvNeXt [46] | 2022 | ConvNeXt is a pure convolutional network inspired by Transformer design principles. It takes the ResNet structure as a backbone and borrows ideas from the Swin Transformer in stage compute ratios, convolution design, and optimization strategy, improving them step by step. |
Table 3. Selected loss functions and their roles.

| Loss Function | Formula | Description |
|---|---|---|
| Binary CrossEntropy Loss | $\mathrm{BCE\ Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right]$ (1), where $N$ is the total number of pixels, $y_i$ is the true label of pixel $i$, and $\hat{y}_i$ is the predicted probability for pixel $i$. | Used in binary classification problems to measure the difference between predicted probabilities and true labels. |
| Focal Loss [47] | $\mathrm{Focal\ Loss} = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$ (2), where $p_t$ is the probability predicted by the model for a positive sample, $\alpha_t$ is a hyperparameter used to balance positive and negative samples, and $\gamma$ is a hyperparameter used to adjust the importance of difficult samples. | Addresses the class imbalance problem; by reducing the weight of samples that are easy to classify, it improves model performance on difficult samples. |
| BCE Loss + Dice Loss [48] | $\mathrm{Dice\ Loss} = 1 - \frac{2\sum_{i=1}^{N} y_i \hat{y}_i + \epsilon}{\sum_{i=1}^{N} y_i + \sum_{i=1}^{N} \hat{y}_i + \epsilon}$ (3) and $\mathrm{BCE\ Loss + Dice\ Loss} = \mathrm{BCE} + \gamma\,\mathrm{DICE}$ (4), where $y_i$ is the true binary label, $\hat{y}_i$ is the predicted output of the model, $N$ is the number of samples, $\epsilon$ is a very small constant used to avoid division by zero, and $\gamma$ is a scaling hyperparameter. | For class imbalance problems, it can mitigate the adverse effects of an oversized foreground area in the sample. When combined with BCE Loss, the instability of Dice Loss is alleviated when the overlap between the prediction and the true label is very small. |
| Lovasz Loss [49] | $\mathrm{Lovasz\ Loss} = \sum_{i=1}^{N} y_i \hat{y}_i$ (5), where $y_i$ is the true label, $\hat{y}_i$ is the model's predicted output, and $N$ is the number of samples. | For class imbalance problems, Lovasz Loss does not rely on the global class distribution; it focuses on the similarity between the prediction and the true label of each sample or pixel, which helps the model learn better boundaries. |
Table 4. Quantitative results of different models in landslide recognition (%).

| Model | IoU | Precision | Recall | F1 Score | Params (M) | GFlops |
|---|---|---|---|---|---|---|
| FCN-r50 | 81.03 | 96.39 | 83.57 | 89.52 | 49.48 | 197.69 |
| FCN-r101 | 84.45 | 96.93 | 86.77 | 91.57 | 68.48 | 275.37 |
| PSPNet-r50 | 75.53 | 98.61 | 76.35 | 86.06 | 48.96 | 178.44 |
| PSPNet-r101 | 83.89 | 98.46 | 85.01 | 91.24 | 67.95 | 256.13 |
| DeepLabv3+-r50 | 79.90 | 97.81 | 81.36 | 88.83 | 43.59 | 176.36 |
| DeepLabv3+-r101 | 84.14 | 98.07 | 85.56 | 91.39 | 62.57 | 253.90 |
| HRNet-18 | 73.38 | 95.37 | 76.10 | 84.65 | 9.64 | 18.42 |
| HRNet-48 | 82.85 | 97.40 | 84.72 | 90.62 | 65.85 | 93.38 |
| SegFormer-b4 | 89.17 | 97.12 | 91.60 | 94.28 | 61.37 | 40.61 |
| SegFormer-b5 | 89.51 | 97.03 | 92.03 | 94.47 | 81.97 | 51.83 |
| Swin-base | 85.59 | 94.20 | 90.35 | 92.24 | 121.17 | 296.04 |
| Swin-large | 82.98 | 94.32 | 87.34 | 90.70 | 233.65 | 654.43 |
| ConvNeXt-base | 90.59 | 96.00 | 94.14 | 95.06 | 121.99 | 291.17 |
| ConvNeXt-large | 88.54 | 95.25 | 92.63 | 93.92 | 234.88 | 651.53 |
Table 5. Quantitative results of different losses in landslide recognition (%).

| Loss Function | IoU | Precision | Recall | F1 Score |
|---|---|---|---|---|
| CrossEntropy Loss | 90.59 | 96.00 | 94.14 | 95.06 |
| BCE Loss + Dice Loss | 88.31 | 95.36 | 92.28 | 93.79 |
| Focal Loss | 87.17 | 95.59 | 90.82 | 93.14 |
| Lovasz Loss | 87.64 | 94.70 | 92.16 | 93.41 |
Table 6. Quantitative evaluation of different models' generalization in Kodagu (%).

| Model | IoU | Precision | Recall | F1 Score |
|---|---|---|---|---|
| FCN | 58.41 | 89.63 | 62.64 | 73.75 |
| PSPNet | 27.42 | 74.97 | 30.18 | 43.03 |
| DeepLabv3+ | 23.44 | 93.95 | 23.80 | 37.98 |
| HRNet | 47.32 | 62.57 | 65.99 | 64.24 |
| SegFormer | 37.08 | 65.23 | 46.22 | 54.10 |
| Swin Transformer | 27.20 | 59.21 | 33.47 | 42.77 |
| ConvNeXt | 17.95 | 72.99 | 19.22 | 30.43 |
Table 7. Quantitative evaluation of different models' generalization in Kaikoura (%).

| Model | IoU | Precision | Recall | F1 Score |
|---|---|---|---|---|
| FCN | 51.73 | 83.01 | 57.86 | 68.19 |
| PSPNet | 28.44 | 94.98 | 28.88 | 44.30 |
| DeepLabv3+ | 26.30 | 95.39 | 26.64 | 41.65 |
| HRNet | 50.75 | 84.09 | 56.14 | 67.33 |
| SegFormer | 36.88 | 87.47 | 38.94 | 53.89 |
| Swin Transformer | 41.28 | 88.23 | 43.68 | 58.43 |
| ConvNeXt | 36.95 | 57.39 | 50.93 | 53.96 |