Study on a Landslide Segmentation Algorithm Based on Improved High-Resolution Networks

Sun, Hui; Yang, Shuguang; Wang, Rui; Yang, Kaixin

doi:10.3390/app14156459

¹ College of Information Engineering and Automation, Civil Aviation University of China, Tianjin 300300, China

² Institute of Unmanned Systems Application, University of Science and Technology Beijing Tianjin, Tianjin 301830, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6459; https://doi.org/10.3390/app14156459

Submission received: 20 June 2024 / Revised: 16 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024

(This article belongs to the Special Issue Computer Vision and Pattern Recognition: Advanced Techniques and Applications)

Abstract

Landslides are geological hazards with great destructive potential. When a landslide occurs, a reliable segmentation method is important for assessing the extent of the disaster and preventing secondary disasters. Although deep learning methods have improved the efficiency of landslide segmentation, problems remain, such as poor segmentation caused by the similarity between old landslide areas and background features, and missed detections of small-scale landslides. To tackle these challenges, this paper proposes a high-resolution semantic segmentation algorithm for landslide scenes that improves segmentation accuracy and reduces missed detections of small-scale landslides. The network is based on the high-resolution network (HR-Net) and integrates the efficient channel attention (ECA) mechanism to enhance the representation quality of the feature maps. Moreover, the backbone of the high-resolution network is deepened to extract deeper semantic information. To improve the network’s ability to perceive small-scale landslides, atrous spatial pyramid pooling (ASPP) with ECA modules is introduced. Furthermore, to address the inadequate training and reduced accuracy caused by the unequal distribution of positive and negative samples, the network is trained under the supervision of a combined loss function. Finally, the paper enriches the Loess Plateau landslide dataset using a fractional-order-based image enhancement approach and conducts comparative experiments on this enriched dataset to evaluate the network’s performance. The experimental results show that the proposed method achieves higher segmentation accuracy than the other networks compared.

Keywords:

landslide segmentation; high-resolution networks; stacking modules; attentional mechanisms; fractional order

1. Introduction

Landslides are prevalent geological hazards that pose significant risks to the natural environment, property, and personal safety on a global scale [1]. As reported by the International Landslide Center (ILC), landslides cause more than 11,500 fatalities worldwide each year. Furthermore, landslides result in substantial economic damage, with estimated global losses surpassing USD 10 billion per year [2]. Ongoing climate change and erratic, high-intensity rainfall are expected to trigger more landslides, which will increase the annual mortality rate in the future [3]. Therefore, the precise identification of landslides using semantic segmentation methods is highly significant for locating landslides and estimating their area [4]. Deep learning has made remarkable progress across various domains. In bird recognition, deep learning techniques can differentiate between bird species with high accuracy [5]. In autonomous driving, deep learning has been extensively used for tasks including target detection, road recognition, and decision making [6]. In medical imaging, deep learning algorithms play a crucial role in enhancing the precision and efficiency of medical image diagnoses [7]. In video understanding, deep learning methods can automatically analyze and comprehend video content [8]. However, deep learning techniques have not yet been used extensively in natural disaster domains. Therefore, from the perspective of the rapid assessment of landslide disasters, this paper uses deep learning and artificial intelligence technologies, incorporating principles from computer vision, to explore advanced and precise methods for the semantic segmentation of landslides.

As image segmentation technology in deep learning continues to advance, researchers have started to analyze landslides and extract landslide areas with different methods. Ju et al. [9] used Mask R-CNN to identify old landslides in Google Earth images. Although the results confirmed the feasibility of identifying old landslides with Mask R-CNN, the accuracy was still insufficient. Ullo et al. [10] employed Mask R-CNN and transfer learning to train a network designed specifically for landslide detection. Tavakkoli Piralilou et al. [11] focused on the Himalayan region and integrated object-based image analysis with three machine learning methods to detect landslides in this area. In Refs. [12,13,14], the landslide detection capability was enhanced; however, the detection results were limited by the choice of optimal scale parameters. Mandal, Saha, and Mandal [15] used a CNN to identify landslides in the Himalayas and compared it with commonly used machine learning methods [16,17,18]. The results indicated that deep learning outperforms conventional machine learning. Du et al. [19] applied semantic segmentation networks to landslide detection and provided a comprehensive comparison of popular semantic segmentation networks, analyzing their suitability for landslide segmentation. They also identified key areas in which deep convolutional neural networks require further improvement to achieve more effective landslide detection. Yang et al. [20] proposed a deep learning method for extracting landslides from images and selected three widely recognized semantic segmentation networks (U-Net, Deeplabv3+, and PspNet) to identify landslides automatically; however, the segmentation of landslide boundaries needs further improvement. Wang et al. [21] presented a deep learning semantic segmentation network for landslide identification that uses an encoder–decoder architecture. The proposed network reduced the computational complexity while enabling pixel-level prediction on landslide images. Qi et al. [22] introduced the ResU-Net deep learning model for the automatic segmentation of landslide areas. They compared this method with the baseline model (U-Net) for the automated mapping of landslide areas in Tian Shui city, Gansu Province. The ResU-Net model was more accurate than the U-Net.

Despite the successful achievements in landslide segmentation in the aforementioned studies, the task of semantic segmentation in landslide areas still presents three significant challenges due to the intricate nature and variability of landslides, as follows: (1) the current landslide dataset directly used for deep learning training is limited; (2) the old landslide areas in a landslide image are similar to the background landscape features so that the general semantic segmentation network cannot segment the boundaries of the old landslides accurately; and (3) redundant information in landslide images can lead to the omission of small-scale landslide areas.

To address these issues, this paper first applies conventional data augmentation techniques such as rotation and scaling. Secondly, the paper introduces fractional-order image enhancement [23]. These methods enrich the dataset and improve the generalization performance of the network. In addition, this paper uses manual annotation to produce more accurate image labels. Finally, an improved landslide segmentation network model, the Deepened High-Resolution–ECA Network (DHRECA-Net), is proposed. The main contributions of this paper are listed as follows:

(1) This paper introduces ECARes-Net and ECA-ASPP for acquiring broader contextual information, enhancing the network’s ability to detect old landslides and small-scale landslides. ECARes-Net is incorporated into the feature extraction network, while ECA-ASPP is added after the feature extraction network.

(2) This paper presents a deepened high-resolution semantic segmentation network specifically designed for landslide scenes. By deepening the network’s backbone, a stacked feature module is constructed. This module effectively captures deeper semantic information, enhancing the network’s performance in semantic segmentation of landslide scenes.

(3) To address the issue of imbalanced positive and negative samples and improve the segmentation performance, this paper employs a combined loss function that includes BCE loss and Dice loss [24] during network training.

To assess the effectiveness of the proposed model, this paper performed ablation experiments using a landslide dataset collected from the Loess Plateau. For comparative analysis, the results were compared with the performances of the U-Net [25], Deeplabv3+ [26], and PspNet [27] semantic segmentation networks. The experimental findings show that the proposed method improves segmentation accuracy over these networks.

2. Materials and Methods

2.1. DHRECA-Net Structure

This paper proposes an improved segmentation network for landslide scenes. It targets two problems in the landslide dataset: distinguishing old landslides from the background and detecting small-scale landslides. The aim is to improve accuracy and reduce missed detections. The network’s overall structure is depicted in Figure 1. The improvements focus on three aspects, as follows: (1) Adding ECA [28] to the Res-Net [29] makes the network attend more closely to the areas where landslides occur. This counters the poor recognition performance caused by the similarity in pixel values between old landslides and the background. (2) With the ECARes-Net as the backbone, the HR-Net [30] is deepened by repeatedly invoking the Stage 3 and Stage 4 modules, which captures deeper semantic information while preserving the fusion of multiscale information coming from the submodules of Stage 3. (3) ECA-ASPP is used to further process the second- and third-level features extracted from the backbone network. The ECA-ASPP expands the receptive field to obtain broader contextual information, enhances the network’s ability to perceive small-scale landslides, and, through the ECA module, further improves the segmentation of old landslides.

2.1.1. ECARes-Net

The ECA module is introduced into the basic block of the Res-Net to form the ECARes-Net, and the network structure is shown in Figure 2. This enables the network to better distinguish between old landslides and backgrounds, and it reduces the degradation of the segmentation performance caused by the similarity of their features. The ECA module does not require channel dimensionality reduction and adds only a small number of parameters, thus retaining the informational integrity of the original channel features. It also makes better use of the interdependence between the features produced by the convolution operation, which improves the quality of the feature representation generated by the neural network [31,32]. The mechanism uses global information to selectively emphasize information-rich features and suppress less useful ones. The structure of the ECA module is shown in Figure 3.

The ECA first performs global average pooling on the input feature map, changing the input size from [H, W, C] to a vector of [1, 1, C]. Then, the weight of each channel is computed using a one-dimensional convolution kernel (of size k), where the value of k is related to the number of channels of the feature map. The calculation formula is shown as follows:

k = \psi (C) = \left| \frac{\log_{2} (C)}{\gamma} + \frac{b}{\gamma} \right|_{odd},

(1)

where k represents the one-dimensional convolutional kernel size, |t|_odd denotes the closest odd number to t, C represents the channel number size, and γ and b are hyperparameters set to 1.
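As a rough illustration of Equation (1), the sketch below computes the kernel size k for a few channel counts. The function name `eca_kernel_size` is hypothetical, and the rounding rule (truncate, then move an even value up to the next odd number) is an assumption about how "closest odd number" is resolved; the text does not specify it.

```python
import math


def eca_kernel_size(channels: int, gamma: float = 1.0, b: float = 1.0) -> int:
    """Kernel size k = |log2(C)/gamma + b/gamma|_odd, following Equation (1).

    gamma and b default to 1, the values stated in the text.
    """
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    # Assumption: an even value is bumped up to the next odd number.
    return t if t % 2 == 1 else t + 1


if __name__ == "__main__":
    for c in (32, 64, 128, 256):
        print(f"C={c:4d} -> k={eca_kernel_size(c)}")
```

Because k grows only logarithmically with C, wider layers get a slightly larger cross-channel interaction range at almost no parameter cost.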

Finally, the normalized weights are multiplied channel by channel with the original input feature map to obtain a weighted feature map. This enhances landslide features and suppresses background features.
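The three ECA steps (global average pooling, one-dimensional convolution across channels, channel-wise reweighting) can be sketched in plain Python as follows. This is only a toy illustration on nested lists, not the authors' PyTorch implementation: the function name `eca_forward`, the zero padding at the channel boundaries, and the use of a sigmoid for normalization are assumptions, and the kernel weights would be learned in practice.

```python
import math


def eca_forward(feature_map, kernel):
    """Toy ECA pass.

    feature_map: list of C channels, each an H x W nested list of floats.
    kernel: odd-length list of 1D convolution weights (hypothetical values).
    Returns the reweighted feature map and the per-channel weights.
    """
    num_channels = len(feature_map)
    # 1) Global average pooling: [H, W, C] -> one scalar per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2) 1D convolution over neighbouring channels (zero padding assumed).
    half = len(kernel) // 2
    scores = []
    for c in range(num_channels):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - half
            if 0 <= idx < num_channels:
                s += w * pooled[idx]
        scores.append(s)
    # 3) Normalise the scores to (0, 1); a sigmoid is assumed here.
    weights = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    # 4) Multiply every channel by its weight.
    out = [
        [[v * weights[c] for v in row] for row in ch]
        for c, ch in enumerate(feature_map)
    ]
    return out, weights
```

Channels with a high pooled response (and responsive neighbours) receive weights closer to 1, while weak channels are scaled down.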

2.1.2. Stacking Feature Extraction Modules

To enhance the network’s capability to extract image features without overfitting, this paper selected Res-Net as the feature extraction network on top of the HR-Net architecture. The aim was to improve the network’s feature extraction ability while avoiding excessive parameterization, which can hinder backpropagation during training. The improved network structure is shown in Figure 4. Firstly, the Stage 3 and Stage 4 network modules are stacked. The Stage 3 modules extract feature information from the original image and from the image after two downsamplings. The Stage 4 modules extract feature information from the original image and from the image after three downsamplings. After several experiments, this paper ultimately stacked four Stage 3 modules and three Stage 4 modules. Then, fusion layers are introduced between the newly added network modules. These fusion layers combine features from different scales, resulting in more representative feature maps and improving the network’s segmentation performance.

2.1.3. ECA-ASPP

ASPP is mainly used to solve the multi-scale problem in semantic segmentation tasks, in which each pixel in an image must be assigned to a class while different objects and structures may appear at different scales in the image. Traditional convolutional neural networks use a fixed-scale convolutional kernel when extracting semantic information and thus cannot capture contextual information at different scales well.

To address this issue, this paper introduces the ASPP module after the second- and third-level features of the DHRECA-Net to further enhance the network’s ability to detect small-scale landslides. ASPP captures contextual information at different levels through multiple parallel branches, each of which applies an atrous (dilated) convolution or pooling operation at a different scale. Using different dilation rates for the atrous convolutions expands the receptive field and yields a broader range of contextual information. The ASPP consists of a 1 × 1 convolutional layer, a pyramid of atrous convolutional layers, and a pooling layer. In this paper, the pyramid layers use dilation rates of 6, 12, and 18. To enhance the network’s ability to differentiate between landslides and backgrounds, this paper also adds the ECA module to the ASPP structure, constructing an ECA-ASPP module. This module further enhances the network’s ability to extract features from different categories; its structure is shown in Figure 5.
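To make the effect of the dilation rates concrete, the short sketch below computes the spatial extent covered by a single atrous convolution using the standard relation k_eff = k + (k − 1)(r − 1). The 3 × 3 kernel size is an assumption (it is the usual choice in ASPP but is not stated in the text), and `effective_kernel_size` is a hypothetical helper name.

```python
def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Extent (in pixels) spanned by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


if __name__ == "__main__":
    # Dilation rates used for the ASPP pyramid in this paper.
    for rate in (6, 12, 18):
        size = effective_kernel_size(3, rate)
        print(f"rate={rate:2d}: a 3x3 kernel spans {size}x{size} pixels")
```

Under this assumption the three branches span 13, 25, and 37 pixels per side while each still uses only nine weights, which is how ASPP gathers context at several scales cheaply.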

By introducing the ECA-ASPP structure, the DHRECA-Net is able to better capture contextual information at different scales, which improves the network’s ability to detect small-scale landslides and further enhances the segmentation of landslide areas.

2.2. Combined Loss Function

In the context of landslide segmentation, the task can be viewed as a binary classification problem. However, because of the imbalanced ratio between positive samples (i.e., landslides) and negative samples (i.e., backgrounds), the network model often gets stuck in a local optimum, resulting in a suboptimal segmentation performance. To address this issue, this paper utilizes a joint loss function consisting of BCE loss and Dice loss. Furthermore, the BCE loss is well suited for binary classification problems, and the Dice loss can alleviate the imbalance between positive and negative samples.

BCE loss is a loss function commonly used in pixel classification to compare the discrepancy between the model’s learned distribution and the true distribution. The calculation formula for BCE loss is shown as follows:

L_{BCE} = - \frac{1}{N} \sum_{i = 1}^{N} \left[ g_{i} \log (p_{i}) + (1 - g_{i}) \log (1 - p_{i}) \right],

(2)

where N represents the total number of pixels, p_i denotes the predicted value of the i-th pixel by the network, with a range of (0, 1), and g_i represents the value of the i-th pixel in the ground truth label, where positive samples are assigned to g_i with a value of 1; otherwise, negative samples are assigned to g_i with a value of 0.

The Dice loss can alleviate the negative effect of positive and negative sample imbalances. A positive and negative sample imbalance means that most of the region of an image is free of the target, while only a small portion of the region contains the target. The Dice loss focuses on the overlap between the predicted results and the ground truth during the calculation process, rather than simply considering the overall pixel values. It mitigates the issue of sample imbalance to some extent. The calculation formula for Dice loss is depicted as follows:

L_{D} = 1 - \frac{2 \sum_{i = 1}^{N} p_{i} g_{i}}{\sum_{i = 1}^{N} p_{i} + \sum_{i = 1}^{N} g_{i}},

(3)

The combined function is denoted as follows:

L = L_{BCE} + L_{D},

(4)

where L_BCE is the BCE Loss, and L_D is the Dice loss.
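Equations (2) to (4) can be sketched in plain Python on flattened pixel lists as shown below. This is an illustrative version only, not the authors' training code: the function names are hypothetical, and the small `eps` term (used to clamp predictions and to avoid division by zero) is an added assumption that does not appear in the equations.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over N pixels, as in Equation (2)."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # assumption: clamp to keep log finite
        total += g * math.log(p) + (1 - g) * math.log(1 - p)
    return -total / len(pred)


def dice_loss(pred, target, eps=1e-7):
    """Dice loss as in Equation (3); eps guards against an empty denominator."""
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - 2.0 * intersection / (denominator + eps)


def combined_loss(pred, target):
    """Unweighted sum of the two terms, as in Equation (4)."""
    return bce_loss(pred, target) + dice_loss(pred, target)


if __name__ == "__main__":
    predictions = [0.9, 0.1]  # predicted landslide probabilities
    labels = [1, 0]           # ground truth: landslide, background
    print(combined_loss(predictions, labels))
```

The Dice term depends only on the overlap between prediction and label, so a large background region contributes nothing to it, whereas the BCE term averages over every pixel.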

3. Experimental Description

3.1. Landslide Dataset Production

The dataset employed in this study consists of images acquired from areas in the Loess Plateau. The dataset consists of 340 remote sensing images, for which the landslide regions were initially unannotated. To annotate the landslide boundaries, this paper utilized the labelme image annotation tool. Figure 6 depicts the annotated boundaries of the landslide areas within the images.

The final labeling diagram obtained is shown in Figure 7.

To overcome the size limitation for the training data and prevent overfitting, this study employed data augmentation techniques. The objective of employing these techniques was to enhance the diversity of the landslide dataset and optimize the training of the neural network model. The data augmentation methods employed in this study primarily encompassed:

Flip transformation: flip images horizontally or vertically to create mirrored versions;
Rotation transformation: rotate images to a certain angle to simulate variations in the viewpoints;
Perspective transformation: apply perspective distortion to the images to mimic different camera perspectives;
Shear transformation: apply shear distortion to the images to introduce geometric deformations;
Image enhancement based on fractional-order differentiation: the fractional-order differential operator is better at preserving an image’s edge features [33]. This paper uses the Grünwald–Letnikov (G–L) definition of the fractional derivative as applied in image processing [34], which is defined as follows:

{}_{a}^{R}D_{t}^{T} f (t) = \lim_{h \to 0} \frac{1}{h^{v}} \sum_{j = 0}^{\frac{t - a}{h}} {(- 1)}^{j} \frac{\Gamma (v + 1)}{j! \, \Gamma (v - j + 1)} f (t - j h),

(5)

where R is the real number field; a and t are the independent variables of the G–L definition, satisfying a < t and t − a < 1; T is the differential order; v is the fractional order of the derivative; j is the summation index, a natural number; and h is the unit variable (step size). Γ (·) is the gamma function, which is defined as follows:

\Gamma (n) = \int_{0}^{\infty} e^{- t} t^{n - 1} \, d t = (n - 1)! .

(6)
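The G–L sum can be approximated numerically by truncating it to a few terms. The sketch below generates the coefficients (−1)^j Γ(v + 1) / (j! Γ(v − j + 1)) with the recurrence w_0 = 1, w_j = w_{j−1}(j − 1 − v)/j, and applies them to a one-dimensional signal. The step h = 1 (one pixel), the truncation length, the boundary handling, and the function names are all assumptions; an image enhancement pipeline would typically apply such coefficients as masks along several directions, which is not shown here.

```python
def gl_coefficients(v: float, n_terms: int):
    """First n_terms G-L coefficients (-1)^j * Gamma(v+1) / (j! * Gamma(v-j+1))."""
    coeffs = [1.0]
    for j in range(1, n_terms):
        # Recurrence avoids evaluating the gamma function directly.
        coeffs.append(coeffs[-1] * (j - 1 - v) / j)
    return coeffs


def gl_derivative_1d(signal, v: float, n_terms: int = 3):
    """Truncated G-L fractional derivative of a 1D signal with step h = 1."""
    w = gl_coefficients(v, n_terms)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for j, wj in enumerate(w):
            if t - j >= 0:  # assumption: samples before the start are skipped
                acc += wj * signal[t - j]
        out.append(acc)
    return out


if __name__ == "__main__":
    print(gl_coefficients(0.5, 4))
    print(gl_derivative_1d([1.0, 2.0, 4.0, 7.0], v=0.5))
```

For v = 1 the coefficients reduce to [1, −1, 0, ...], that is, the ordinary first difference; for fractional v the coefficients decay slowly and never vanish, so part of the low-frequency content is retained while edges are emphasized.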

Applying fractional-order techniques to enhance gray and poorly textured landslide images can significantly enhance the quality of the dataset. This improvement is visually illustrated in Figure 8.

By incorporating data augmentation techniques, the variability of the dataset is increased, allowing the deep neural network to generalize more effectively and enhance its performance in landslide segmentation.

After applying data augmentation techniques, the final dataset of landslide images consisted of a total of 640 images, and the dataset was divided into training and validation sets using a ratio of 9:1.
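A 9:1 split of the 640 images gives 576 training images and 64 validation images. A minimal sketch of such a split is shown below; the shuffling, the fixed seed, and the helper name `split_dataset` are assumptions, since the text does not describe how the images were assigned to each subset.

```python
import random


def split_dataset(items, train_ratio=0.9, seed=0):
    """Shuffle items reproducibly and split them into training and validation lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(round(len(items) * train_ratio))
    return items[:n_train], items[n_train:]


if __name__ == "__main__":
    # Image indices stand in for file names here.
    train_ids, val_ids = split_dataset(range(640))
    print(len(train_ids), len(val_ids))
```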

3.2. Evaluation Metrics

The evaluation metrics employed in this paper included mean intersection over union (MIOU), mean pixel accuracy (MPA), and F1-score.

1. The MPA computes the percentage of accurately classified pixels for each class, aggregates them, and calculates the average. The calculation formula can be expressed as follows:

MPA = \frac{1}{n} \sum_{i = 1}^{n} \frac{TN + TP}{TN + TP + FN + FP},

(7)

where n is the number of classes, TN (true negative) denotes the count of correctly predicted negative samples, TP (true positive) represents the count of correctly predicted positive samples, FN (false negative) indicates the count of positive samples incorrectly predicted as negative, and FP (false positive) represents the count of negative samples incorrectly predicted as positive.

2. MIOU tends to provide an intuitive evaluation, which is the average intersection over union between the predicted result and the ground truth. The calculation formula can be expressed as follows:

MIOU = \frac{1}{n} \sum_{i = 1}^{n} \frac{TP}{TP + FN + FP},

(8)

3. The calculation of the F1-score depends on two other fundamental metrics, recall and precision, and it is a more comprehensive indicator of the model’s performance. The calculation formula is defined as follows:

Precision = \frac{TP}{TP + FP},

(9)

Recall = \frac{TP}{TP + FN},

(10)

F1\text{-}score = \frac{2 \times Recall \times Precision}{Recall + Precision},

(11)

where Precision is the fraction of pixels predicted as landslide that are truly landslide, and Recall is the fraction of true landslide pixels that the model detects. Precision and Recall tend to trade off against each other, so the F1-score is needed to evaluate the comprehensive performance of the network.
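For the two-class case (landslide and background), the metrics in Equations (7) to (11) can be computed from the four confusion-matrix counts as sketched below. The function name and the example counts are hypothetical. The background class is handled by swapping the roles of TP and TN (and of FP and FN); with Equation (7) as written, the per-class term is then identical for both classes, so the MPA reduces to the overall pixel accuracy.

```python
def segmentation_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Binary segmentation metrics; counts refer to the landslide class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * recall * precision / (recall + precision)
    # IoU per class; for the background class TP and TN swap roles.
    iou_landslide = tp / (tp + fn + fp)
    iou_background = tn / (tn + fp + fn)
    miou = (iou_landslide + iou_background) / 2
    # Equation (7) as written: the same ratio for both classes.
    mpa = (tn + tp) / (tn + tp + fn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": miou,
        "mpa": mpa,
    }


if __name__ == "__main__":
    # Made-up counts for a 1000-pixel image, for illustration only.
    for name, value in segmentation_metrics(tp=80, fp=10, fn=20, tn=890).items():
        print(f"{name}: {value:.4f}")
```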

3.3. Experimental Platform and Training Parameters

This section presents the experimental evaluation of the performance of the proposed DHRECA-Net on the landslide dataset. The parameters of the experimental platform are shown in Table 1.

The DHRECA-Net model was developed using the PyTorch framework and was trained and evaluated on a single NVIDIA TESLA T4 platform with 12 GB RAM (Nvidia, Santa Clara, CA, USA). The landslide dataset was processed with an input image size of 512 × 512. For the optimization, a stochastic gradient descent (SGD) optimizer was utilized with an initial learning rate of 4 × 10⁻³ and a minimum learning rate of 4 × 10⁻⁵. During the network training, a cosine annealing learning rate strategy (COS) was employed to dynamically adjust the learning rate. The training was performed with a batch size of 16 and lasted for 300 epochs. The model’s parameter configuration is displayed in detail in Table 2.
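The cosine annealing strategy can be illustrated with the standard schedule lr(t) = lr_min + ½ (lr_max − lr_min)(1 + cos(πt / T)). The sketch below uses the learning rates and epoch count stated above; that the schedule is updated once per epoch, without warm-up or restarts, is an assumption, and `cosine_lr` is a hypothetical helper name.

```python
import math


def cosine_lr(epoch: int, total_epochs: int = 300,
              lr_max: float = 4e-3, lr_min: float = 4e-5) -> float:
    """Learning rate at a given epoch under plain cosine annealing."""
    cos_term = 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))
    return lr_min + (lr_max - lr_min) * cos_term


if __name__ == "__main__":
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.6f}")
```

The rate starts at 4 × 10⁻³, passes through the midpoint 2.02 × 10⁻³ at epoch 150, and reaches 4 × 10⁻⁵ at epoch 300, decaying slowly at the start and end and fastest in the middle.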

4. Results and Discussion

4.1. Results of Ablation Experiment

The ablation experiments presented in Table 3 show that the DHRECA-Net improved the segmentation accuracy on the Loess Plateau dataset. Specifically, the MIOU, MPA, and recall improved when the ECARes-Net and ECA-ASPP were added to the HR-Net model. Although the precision decreased, the F1-score still improved, which suggests that these two modules enhance the overall performance of the network. The network’s performance improved further when the three modules were added in pairs. With all three modules, the DHRECA-Net achieved an MIOU of 88.37%, MPA of 94.55%, and F1-score of 93.85%. Compared with the baseline HR-Net model, this is an increase of 4.82 percentage points in the MIOU, 4.42 in the MPA, and 3.15 in the F1-score. These findings indicate a substantial enhancement in the segmentation performance.

Figure 9 shows that the baseline HR-Net algorithm struggled to accurately delineate the boundaries of old landslides and tended to miss small-scale landslides in multiple areas of a single test image. In contrast, the DHRECA-Net algorithm exhibited a noticeable improvement in accurately delineating the boundaries between old landslides and the background, as well as providing more complete segmentation of small-scale landslides. The algorithm’s performance was significantly enhanced and demonstrated greater accuracy in capturing details compared to the baseline model.

4.2. Performance Comparison Using Different Loss Functions

Table 4 compares network training supervised by a single loss function with training supervised by the combined loss function. Training with the combined loss function yielded better performance.

4.3. Comparison of Experiments Results among Different Algorithms

To evaluate the performance of the proposed algorithm in this paper, a comparison was conducted with the U-Net, Deeplabv3+, and PspNet algorithms. The experimental results are presented in Table 5. It is evident that the DHRECA-Net outperformed the other three algorithms in all three metrics. In comparison to the highest performing PspNet algorithm, the DHRECA-Net demonstrated improvements of 4.28% in the MIOU, 2.76% in the MPA, and 2.78% in the F1-score.

Figure 10 indicates that the U-Net, Deeplabv3+, and PspNet segmentation algorithms missed multiple small-scale landslide areas, while the proposed DHRECA-Net detected all landslide areas in the images. Moreover, when segmenting old landslides, the three aforementioned algorithms produced inaccuracies and segmentation errors, which suggests difficulty in discerning between background and landslide features. In contrast, the proposed algorithm segmented old landslides more accurately. These results underscore the advantage of the proposed DHRECA-Net over the three comparative algorithms.

4.4. Number of Parameters for Different Algorithms

To further confirm the superiority of the DHRECA-Net algorithm, a comparison was made of the models’ parameter quantity (Params), computational load (in giga floating-point operations, GFLOPs), and inference speed (frames per second, FPS). The results of this analysis are presented in Table 6.

The data listed in Table 6 reveal that the DHRECA-Net exhibited the lowest parameter count and computational load among the four algorithms (U-Net, Deeplabv3+, PspNet, and DHRECA-Net). This implies that DHRECA-Net is more easily deployable and operable in resource-constrained environments, such as mobile devices or embedded systems. Additionally, the DHRECA-Net exhibited the highest FPS, demonstrating high real-time performance in applications.

4.5. Discussion

First, the test results show that the model proposed in this paper has higher accuracy compared to the baseline model, indicating better performance in landslide segmentation tasks. Moreover, the model demonstrated superior segmentation results when dealing with old landslides and small-scale landslide segmentation.

Secondly, when compared to the other models, the U-Net achieved an MIOU of 82.72%, MPA of 91.19%, and F1-score of 90.18% on the dataset in this paper. Deeplabv3+ attained an MIOU of 83.65%, MPA of 90.68%, and F1-score of 90.75% on the same dataset. PspNet, on the other hand, recorded an MIOU of 84.09%, MPA of 91.79%, and F1-score of 91.07% on this dataset. Specifically, all performance metrics of the model proposed in this paper surpassed those of the other models on the dataset, demonstrating the superior performance of this model in landslide segmentation tasks. Furthermore, both PspNet and the model in this paper utilize a parallel structure, suggesting that this structure may offer advantages in landslide detection.
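The reported margins can be checked with simple arithmetic. The sketch below uses the scores quoted in this paragraph together with the DHRECA-Net scores reported in Section 4.1 (MIOU 88.37%, MPA 94.55%, F1-score 93.85%); the dictionary layout and the helper name `gains_over` are only illustrative.

```python
# (MIOU, MPA, F1-score) in percent, as reported in the text.
reported = {
    "U-Net": (82.72, 91.19, 90.18),
    "Deeplabv3+": (83.65, 90.68, 90.75),
    "PspNet": (84.09, 91.79, 91.07),
    "DHRECA-Net": (88.37, 94.55, 93.85),
}


def gains_over(baseline: str, proposed: str = "DHRECA-Net"):
    """Difference in percentage points between the proposed model and a baseline."""
    return tuple(
        round(p - b, 2) for p, b in zip(reported[proposed], reported[baseline])
    )


if __name__ == "__main__":
    for name in ("U-Net", "Deeplabv3+", "PspNet"):
        print(name, gains_over(name))
```

For PspNet this gives gains of 4.28, 2.76, and 2.78 percentage points, matching the improvements stated in Section 4.3.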

Finally, this paper compared the number of parameters and the computation of the DHRECA-Net with the other models. The results indicate that the number of parameters in the DHRECA-Net was only 39.6% that of U-Net, 18% of Deeplabv3+, and 21.1% of PspNet. Additionally, the computation required for the DHRECA-Net was only 8.42% of that of U-Net, 22.8% of Deeplabv3+, and 32.1% of PspNet. These findings demonstrate that the model proposed in this paper achieved a superior performance while utilizing fewer parameters. Particularly for scenarios in which deployment on power-constrained devices is necessary, the algorithm presented in this paper demonstrated significant advantages. Through parameter and computation optimization, the model can efficiently operate in resource-constrained environments, presenting a more effective solution for practical applications.

5. Conclusions

This study developed a landslide scene dataset using images from the Loess Plateau and proposed the DHRECA-Net model to enhance the extraction of deep semantic information. The model uses ECARes-Net as the backbone network to allocate more resources to effective segmentation and deepens the backbone structure to extract deeper semantic information. It also introduces the ECA-ASPP structure to further process the information extracted from the backbone network and obtain more effective semantic information. Ablation experiments were performed on the landslide dataset with the DHRECA-Net model and the baseline model, and the experimental results were compared with three segmentation networks: U-Net, Deeplabv3+, and PspNet. The results indicate that the proposed algorithm accurately delineates old landslide regions and images containing multiple landslide areas. However, as a consequence of the increased number of parameters in the enhanced network, the speed of landslide segmentation is comparatively reduced. To meet the demands of practical applications, it is crucial to design a network model that offers a faster segmentation speed and a lighter structure, so that it can adapt to various application scenarios without compromising performance. Therefore, future work will focus on a lightweight version of the network.

Author Contributions

H.S. and S.Y. designed the research; R.W. and K.Y. processed the data; S.Y. drafted the manuscript; R.W. helped organize the manuscript; H.S. and R.W. revised and finalized the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Tianjin, China, grant number: 22YFZCSN00210.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. The overall structure of the DHRECA-Net network.

Figure 2. Introduction of the ECA into the basic block.

Figure 3. Efficient channel attention structure.

Figure 4. The network structure before and after the improvements: (a) network structure prior to enhancement; (b) network structure following enhancement. Different colors represent various levels of feature maps, with Level 1 at the top and Level 4 at the bottom.

Figure 5. The ECA-ASPP structure.

Figure 6. Labelme annotation tool.

Figure 7. Label diagram.

Figure 8. Sample image enhancement based on fractional order: (a) unprocessed original image; (b) enhanced by fractional-order differentiation.

Figure 9. Comparison of the network model's recognition performance before and after the improvements.

Figure 10. Comparison of the recognition effect of the improved network model with the other models.

Table 1. Experimental platform’s configuration.

Platform Parameter	Value
Operating system	Windows 10
GPU	NVIDIA TESLA T4 (12 GB RAM)
Experimental frameworks	PyTorch 1.11.0 and TorchVision 0.12.0
Compilers	Python 3.7

Table 2. Model parameter configuration.

Hyperparameter	Value
Input image size	[512, 512]
Optimizer	SGD
Initial learning rate	4 × 10⁻³
Minimum learning rate	4 × 10⁻⁵
Learning rate decay strategy	COS
Batch size	16
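
Table 2 specifies a cosine ("COS") decay from an initial learning rate of 4 × 10⁻³ to a minimum of 4 × 10⁻⁵. The sketch below shows one common form of such a schedule, namely standard cosine annealing. The exact variant used in the experiments, any warm-up phase, and the number of training epochs are not stated in the table, so `total_epochs` and the function name `cosine_lr` are assumptions made purely for illustration.

```python
import math

LR_INIT = 4e-3  # initial learning rate (Table 2)
LR_MIN = 4e-5   # minimum learning rate (Table 2)


def cosine_lr(epoch: int, total_epochs: int) -> float:
    """Cosine annealing from LR_INIT at epoch 0 to LR_MIN at the last epoch."""
    if total_epochs < 2:
        return LR_INIT
    progress = epoch / (total_epochs - 1)  # runs from 0.0 to 1.0
    return LR_MIN + 0.5 * (LR_INIT - LR_MIN) * (1.0 + math.cos(math.pi * progress))


total_epochs = 100  # hypothetical value, not taken from the paper
for epoch in (0, 25, 50, 75, 99):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch, total_epochs):.6f}")
```

With this form, the learning rate falls slowly at first, fastest around the midpoint of training, and flattens out again as it approaches the minimum.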

Table 3. Comparison of DHRECA-Net and HR-Net ablation experiments on the Loess Plateau landslide dataset.

	A	B	C	MIOU/%	MPA/%	Precision/%	Recall/%	F1-Score/%
HR-Net				83.55	90.13	91.28	90.13	90.70
	√			83.89	91.43	90.44	91.43	90.93
		√		85.57	91.39	92.56	91.39	91.97
			√	84.49	91.83	90.80	91.83	91.31
	√	√		86.15	92.87	91.82	92.87	92.34
	√		√	84.05	90.82	91.22	90.82	91.01
		√	√	86.18	92.02	92.03	92.02	92.02
	√	√	√	88.37	94.55	93.17	94.55	93.85

A represents ECARes-Net, B the Stacked Feature Module, and C the ECA-ASPP. The best results are obtained with all three modules combined (last row).

Table 4. Comparison of the training results of the supervised networks with different loss functions.

	MIOU/%	MPA/%	Precision/%	Recall/%	F1-Score/%
BCE loss	84.38	90.55	91.94	90.55	91.07
Combined loss	88.37	94.55	93.17	94.55	93.85

The best results are those of the combined loss (last row).
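
The composition of the combined loss is defined in Section 2.2 and is not reproduced in this part of the paper. As a generic illustration of why adding a region-based term to BCE can help when landslide pixels are rare, the sketch below assumes an equally weighted sum of BCE and Dice loss. That composition, the weights `w_bce` and `w_dice`, the smoothing constant, and the toy pixel values are all assumptions for illustration and should not be read as the authors' exact formulation.

```python
import math
from typing import Sequence

EPS = 1e-7  # clamps probabilities away from 0 and 1 (illustrative choice)


def bce_loss(probs: Sequence[float], targets: Sequence[int]) -> float:
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs: Sequence[float], targets: Sequence[int], smooth: float = 1.0) -> float:
    """One minus the soft Dice coefficient of the foreground class."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, w_bce: float = 1.0, w_dice: float = 1.0) -> float:
    """Weighted sum of BCE and Dice loss (assumed form, see the text above)."""
    return w_bce * bce_loss(probs, targets) + w_dice * dice_loss(probs, targets)


# Toy imbalanced example: one landslide pixel among ten.
targets = [1] + [0] * 9
missed = [0.1] * 10          # the landslide pixel is predicted as background
found = [0.9] + [0.1] * 9    # the landslide pixel is predicted correctly

for name, probs in (("missed", missed), ("found", found)):
    print(name, round(bce_loss(probs, targets), 4), round(dice_loss(probs, targets), 4))
```

In this toy case the BCE term is averaged over all ten pixels, so the nine correctly classified background pixels dilute the penalty for the missed landslide pixel, whereas the Dice term depends only on the overlap with the foreground and therefore penalizes the miss more strongly.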

Table 5. Performance comparison between the DHRECA-Net and other models.

	MIOU/%	MPA/%	Precision/%	Recall/%	F1-Score/%
U-Net	82.72	91.19	89.26	91.19	90.18
Deeplabv3+	83.65	90.68	90.86	90.65	90.75
PspNet	84.09	91.79	90.36	91.79	91.07
DHRECA-Net	88.37	94.55	93.17	94.55	93.85

The best results are those of the DHRECA-Net (last row).
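
The metrics reported in Tables 3–5 are defined in Section 3.2. As a reminder of how they relate to one another, the sketch below derives them from a per-class confusion matrix, assuming class-averaged (macro) definitions. Under that assumption MPA and class-averaged recall coincide, which is consistent with the matching MPA and Recall columns in most rows of the tables. The function name and the confusion-matrix values are made up for illustration and are not experimental data.

```python
from typing import Dict, List


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged metrics from a confusion matrix indexed as cm[true][pred].

    Assumes every class occurs at least once in both the labels and the predictions.
    """
    n = len(cm)
    ious, recalls, precisions = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class c predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other classes predicted as class c
        ious.append(tp / (tp + fp + fn))
        recalls.append(tp / (tp + fn))
        precisions.append(tp / (tp + fp))
    precision = sum(precisions) / n
    recall = sum(recalls) / n
    return {
        "MIoU": sum(ious) / n,
        "MPA": recall,  # per-class pixel accuracy is the same quantity as per-class recall
        "Precision": precision,
        "Recall": recall,
        "F1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical pixel counts for two classes: background (0) and landslide (1).
cm = [[900, 20],
      [10, 70]]

for name, value in segmentation_metrics(cm).items():
    print(f"{name}: {100 * value:.2f}%")
```

The same harmonic-mean relation can be checked against the tables: for the HR-Net row of Table 3, a precision of 91.28% and a recall of 90.13% give an F1-score of 90.70%.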

Table 6. Comparison of the model parameters.

	Params	FLOPs	FPS
U-Net	24.891 M	451.672 G	12.883
Deeplabv3+	54.709 M	166.841 G	17.376
PspNet	46.707 M	118.427 G	18.028
DHRECA-Net	9.858 M	38.063 G	22.013

The best results are those of the DHRECA-Net (last row).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
