1. Introduction
Landslides are prevalent geological hazards that pose significant risks to the natural environment, property, and personal safety worldwide [1]. As reported by the International Landslide Center (ILC), landslides cause more than 11,500 fatalities per year globally. They also cause substantial economic damage, with estimated losses exceeding USD 10 billion per year [2]. Ongoing climate change and erratic, high-intensity rainfall are expected to trigger more landslides, which will increase the annual mortality rate in the future [3]. The precise identification of landslides using semantic segmentation methods is therefore important for locating landslides and estimating their area [4]. Deep learning has made remarkable progress across various domains. In bird recognition, deep learning techniques can differentiate between bird species with high accuracy [5]. In autonomous driving, deep learning has been extensively used for tasks including target detection, road recognition, and decision making [6]. In medical imaging, deep learning algorithms enhance the precision and efficiency of medical image diagnoses [7]. In video understanding, deep learning methods can automatically analyze and comprehend video content [8]. However, deep learning techniques have not yet been widely applied in natural disaster domains. Therefore, from the perspective of the rapid assessment of landslide disasters, this paper uses deep learning and artificial intelligence technologies, together with principles from computer vision, to explore advanced and precise methods for the semantic segmentation of landslides.
As image segmentation technology in deep learning continues to advance, researchers have started to investigate landslide analysis and to extract landslide areas with different methods. Ju et al. [9] used Mask R-CNN to identify old landslides in Google Earth images. Although the results confirmed the feasibility of identifying old landslides with Mask R-CNN, the accuracy was still insufficient. Ullo et al. [10] employed Mask R-CNN and transfer learning techniques to train a network designed specifically for landslide detection. Tavakkoli Piralilou et al. [11] focused on the Himalayan region and integrated object-based image analysis with three machine learning methods to detect landslides in this area. In Refs. [12,13,14], the landslide detection capability was enhanced; however, the detection results were limited by the choice of optimal scale parameters. Mandal, Saha, and Mandal [15] used a CNN to identify landslides in the Himalayas and compared it with commonly used machine learning methods [16,17,18]. The results indicated that deep learning outperformed these machine learning methods. Du et al. [19] applied semantic segmentation networks to landslide detection and provided a comprehensive comparison of popular semantic segmentation networks, assessing the suitability of each for landslide segmentation. They also identified key areas in which deep convolutional neural networks require further improvement to achieve more effective landslide detection. Yang et al. [20] proposed a deep learning method for extracting landslides from images and selected three widely recognized semantic segmentation networks (U-Net, Deeplabv3+, and PspNet) to identify landslides automatically; however, the segmentation of landslide boundaries needs further improvement. Wang et al. [21] presented a deep learning semantic segmentation network for landslide identification based on an encoder–decoder architecture. The proposed network reduced the computational complexity while enabling pixel-level prediction of landslide images. Qi et al. [22] introduced the ResU-Net deep learning network model for the automatic segmentation of landslide areas. They compared this method with the baseline model (U-Net) for the automated mapping of landslide areas in Tian Shui city, Gansu Province. The ResU-Net model achieved higher accuracy than the U-Net.
Despite the achievements of the aforementioned studies, the semantic segmentation of landslide areas still presents three significant challenges due to the intricate nature and variability of landslides: (1) landslide datasets that can be used directly for deep learning training are limited; (2) old landslide areas in an image resemble the background landscape features, so general-purpose semantic segmentation networks cannot segment the boundaries of old landslides accurately; and (3) redundant information in landslide images can lead to the omission of small-scale landslide areas.
To address these issues, this paper first applies image augmentation techniques such as rotation and scaling. Second, it introduces fractional-order image enhancement [23]. These methods enrich the dataset and improve the generalization performance of the network. In addition, the images are annotated manually to produce more accurate labels. Finally, an improved landslide segmentation network model, the Deepened High-Resolution–ECA Network (DHRECA-Net), is proposed. The main contributions of this paper are as follows:
(1) This paper introduces ECARes-Net and ECA-ASPP for acquiring broader contextual information, enhancing the network’s ability to detect old landslides and small-scale landslides. ECARes-Net is incorporated into the feature extraction network, while ECA-ASPP is added after the feature extraction network.
(2) This paper presents a deepened high-resolution semantic segmentation network specifically designed for landslide scenes. By deepening the network’s backbone, a stacked feature module is constructed. This module effectively captures deeper semantic information, enhancing the network’s performance in semantic segmentation of landslide scenes.
(3) To address the imbalance between positive and negative samples and improve the segmentation performance, this paper employs a combined loss function comprising BCE loss and Dice loss [24] during network training.
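The channel attention named in contribution (1) can be illustrated with a minimal sketch. It assumes that ECA denotes efficient channel attention in its commonly described form: global average pooling per channel, a one-dimensional convolution across neighbouring channels, a sigmoid, and channel-wise rescaling. The function name `eca` and the fixed `kernel` values are hypothetical; in a trained network the kernel weights are learned, and the way ECARes-Net and ECA-ASPP embed this operation is not specified here.

```python
import math


def eca(feature_map, kernel):
    """Apply a channel-attention step to a feature map.

    feature_map: list of C channels, each a list of rows (H x W floats).
    kernel: odd-length list of 1D convolution weights (hypothetical, fixed here).
    Returns (rescaled feature map, per-channel attention weights).
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # 2) 1D convolution across neighbouring channels (zero padding at the ends).
    padded = [0.0] * pad + pooled + [0.0] * pad
    weights = []
    for c in range(len(pooled)):
        z = sum(kernel[j] * padded[c + j] for j in range(k))
        # 3) Sigmoid maps the response to an attention weight in (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-z)))

    # 4) Rescale every pixel of each channel by that channel's weight.
    rescaled = [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return rescaled, weights
```

Because the convolution spans only a few neighbouring channels, this step adds very few parameters, which is consistent with the emphasis on a lightweight model.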
To assess the effectiveness of the proposed model, ablation experiments were performed on a landslide dataset collected from the Loess Plateau. The results were also compared with those of the U-Net [25], Deeplabv3+ [26], and PspNet [27] semantic segmentation networks. The experimental findings show that the proposed method improves the segmentation accuracy relative to all three networks.
4. Results and Discussion
4.1. Results of Ablation Experiment
The ablation results presented in Table 3 show that the DHRECA-Net substantially improved the segmentation accuracy on the Loess Plateau dataset. Specifically, the MIOU, MPA, and recall improved when ECARes-Net and ECA-ASPP were added to the HR-Net model. Although the precision decreased, the F1-score still improved, which suggests that these two modules enhance the overall performance of the network. The network's performance improved significantly when the three modules were added in pairs. With all three modules, the DHRECA-Net achieved an MIOU of 88.37%, an MPA of 94.55%, and an F1-score of 93.85%. Relative to the baseline HR-Net model, this is an increase of 4.82 percentage points in the MIOU, 4.42 in the MPA, and 3.15 in the F1-score. These findings indicate a substantial enhancement in the segmentation performance.
Figure 9 shows that the baseline HR-Net algorithm struggled to accurately delineate the boundaries of old landslides and tended to miss small-scale landslides in multiple areas of a single test image. In contrast, the DHRECA-Net algorithm exhibited a noticeable improvement in accurately delineating the boundaries between old landslides and the background, as well as providing more complete segmentation of small-scale landslides. The algorithm’s performance was significantly enhanced and demonstrated greater accuracy in capturing details compared to the baseline model.
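For reference, the metrics reported in the ablation study can be computed from a binary confusion matrix. The sketch below assumes the standard two-class definitions (landslide and background), with MIOU and MPA averaged over the two classes; the function names are illustrative, and the exact definitions used for Table 3 are an assumption.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two flat binary masks (1 = landslide)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(pred, truth):
    """Compute MIOU, MPA, precision, recall, and F1 for binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    iou_landslide = _ratio(tp, tp + fp + fn)
    iou_background = _ratio(tn, tn + fp + fn)
    recall = _ratio(tp, tp + fn)          # also the landslide-class pixel accuracy
    background_acc = _ratio(tn, tn + fp)  # background-class pixel accuracy
    precision = _ratio(tp, tp + fp)
    return {
        "miou": (iou_landslide + iou_background) / 2,
        "mpa": (recall + background_acc) / 2,
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }
```

Since F1 is the harmonic mean of precision and recall, a gain in recall can outweigh a small loss in precision, which matches the pattern observed in the ablation results.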
4.2. Performance Comparison Using Different Loss Functions
Table 4 compares network training supervised by a single loss function with training supervised by the combined loss function. Training with the combined loss function improved the performance.
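A combined BCE and Dice loss can be sketched as follows. This is a minimal illustration under stated assumptions: the two terms are summed with equal weights, and the Dice term uses a smoothing constant of 1; neither the weights nor the smoothing value is given in the text, and the function names are hypothetical.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * overlap + smooth) / (sum(probs) + sum(targets) + smooth)


def combined_loss(probs, targets, bce_weight=1.0, dice_weight=1.0):
    """Weighted sum of BCE and Dice losses (equal weights are an assumption)."""
    return (bce_weight * bce_loss(probs, targets)
            + dice_weight * dice_loss(probs, targets))
```

The BCE term scores each pixel independently, whereas the Dice term depends on the overlap between the predicted and labelled landslide regions, so it is less dominated by the large number of background pixels.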
4.3. Comparison of Experimental Results among Different Algorithms
To evaluate the performance of the proposed algorithm, a comparison was conducted with the U-Net, Deeplabv3+, and PspNet algorithms. The experimental results are presented in Table 5. The DHRECA-Net outperformed the other three algorithms in all three metrics. Compared with PspNet, the best-performing of the three, the DHRECA-Net improved the MIOU by 4.28 percentage points, the MPA by 2.76, and the F1-score by 2.78.
Figure 10 indicates that the U-Net, Deeplabv3+, and PspNet segmentation algorithms missed multiple small-scale landslide areas, while the proposed DHRECA-Net detected all landslide areas in the images. Moreover, when segmenting old landslides, the three aforementioned algorithms produced inaccuracies and segmentation errors, which suggests that they have difficulty distinguishing background features from landslide features. In contrast, the proposed algorithm segmented old landslides more accurately. These results underscore the advantage of the proposed DHRECA-Net over the three comparative algorithms.
4.4. Number of Parameters for Different Algorithms
To further confirm the advantage of the DHRECA-Net algorithm, the models were compared in terms of parameter count (Params), computational load (giga floating-point operations, GFLOPs), and inference speed (frames per second, FPS, i.e., the number of images processed per second). The results of this analysis are presented in Table 6.
The data listed in Table 6 reveal that the DHRECA-Net exhibited the lowest parameter count and computational load among the four algorithms (U-Net, Deeplabv3+, PspNet, and DHRECA-Net). This implies that DHRECA-Net is more easily deployable and operable in resource-constrained environments, such as mobile devices or embedded systems. Additionally, the DHRECA-Net exhibited the highest FPS, demonstrating high real-time performance in applications.
4.5. Discussion
First, the test results show that the model proposed in this paper has higher accuracy compared to the baseline model, indicating better performance in landslide segmentation tasks. Moreover, the model demonstrated superior segmentation results when dealing with old landslides and small-scale landslide segmentation.
Secondly, when compared to the other models, the U-Net achieved an MIOU of 82.72%, MPA of 91.19%, and F1-score of 90.18% on the dataset in this paper. Deeplabv3+ attained an MIOU of 83.65%, MPA of 90.68%, and F1-score of 90.75% on the same dataset. PspNet, on the other hand, recorded an MIOU of 84.09%, MPA of 91.79%, and F1-score of 91.07% on this dataset. Specifically, all performance metrics of the model proposed in this paper surpassed those of the other models on the dataset, demonstrating the superior performance of this model in landslide segmentation tasks. Furthermore, both PspNet and the model in this paper utilize a parallel structure, suggesting that this structure may offer advantages in landslide detection.
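The margins between the proposed model and each comparison model follow directly from the reported values. The short script below reproduces this arithmetic; the dictionary and function names are illustrative, and the numbers are those reported above (in %).

```python
# Reported (MIOU, MPA, F1-score) in percent for each comparison model.
REPORTED = {
    "U-Net": (82.72, 91.19, 90.18),
    "Deeplabv3+": (83.65, 90.68, 90.75),
    "PspNet": (84.09, 91.79, 91.07),
}
# Reported (MIOU, MPA, F1-score) in percent for the DHRECA-Net.
PROPOSED = (88.37, 94.55, 93.85)


def margins(proposed, baseline):
    """Difference in percentage points for each metric, rounded to two decimals."""
    return tuple(round(p - b, 2) for p, b in zip(proposed, baseline))


def best_baseline(reported):
    """Name of the comparison model with the highest MIOU."""
    return max(reported, key=lambda name: reported[name][0])


if __name__ == "__main__":
    for name, scores in REPORTED.items():
        print(name, margins(PROPOSED, scores))
```

For PspNet, the differences are 4.28, 2.76, and 2.78 percentage points, which agrees with the improvements stated in Section 4.3.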
Finally, this paper compared the parameter count and computational load of the DHRECA-Net with those of the other models. The results indicate that the parameter count of the DHRECA-Net was only 39.6% of that of U-Net, 18% of that of Deeplabv3+, and 21.1% of that of PspNet. Additionally, the computation required by the DHRECA-Net was only 8.42% of that of U-Net, 22.8% of that of Deeplabv3+, and 32.1% of that of PspNet. These findings show that the proposed model achieved a superior performance while using fewer parameters and less computation. This is a significant advantage in scenarios that require deployment on power-constrained devices, where the model can operate efficiently and offers a more practical solution.