Article

Study on a Landslide Segmentation Algorithm Based on Improved High-Resolution Networks

1
College of Information Engineering and Automation, Civil Aviation University of China, Tianjin 300300, China
2
Institute of Unmanned Systems Application, University of Science and Technology Beijing Tianjin, Tianjin 301830, China
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6459; https://doi.org/10.3390/app14156459
Submission received: 20 June 2024 / Revised: 16 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024

Abstract

Landslides are a kind of geological hazard with great destructive potential. When a landslide event occurs, a reliable landslide segmentation method is important for assessing the extent of the disaster and preventing secondary disasters. Although deep learning methods have been applied to improve the efficiency of landslide segmentation, some problems remain to be solved, such as poor segmentation caused by the similarity between old landslide areas and background features, and missed detections of small-scale landslides. To tackle these challenges, this paper proposes a high-resolution semantic segmentation algorithm for landslide scenes that improves segmentation accuracy and reduces missed detections of small-scale landslides. The network is based on the high-resolution network (HR-Net) and integrates the efficient channel attention (ECA) mechanism to enhance the representation quality of the feature maps. Moreover, the primary backbone of the high-resolution network is deepened to extract deeper semantic information. To improve the network’s ability to perceive small-scale landslides, atrous spatial pyramid pooling (ASPP) with ECA modules is introduced. Furthermore, to address the inadequate training and reduced accuracy caused by the unequal distribution of positive and negative samples, the network is trained under the supervision of a combined loss function. Finally, the paper enhances the Loess Plateau landslide dataset using a fractional-order-based image enhancement approach and conducts experimental comparisons on this enriched dataset to evaluate the enhanced network’s performance. The experimental findings show that the proposed method achieves higher segmentation accuracy than the other networks compared.

1. Introduction

Landslides are prevalent geological hazards that pose significant risks to the natural environment, property, and personal safety on a global scale [1]. As reported by the International Landslide Center (ILC), the annual global fatality count due to landslides exceeds 11,500 individuals. Furthermore, landslides result in substantial economic damages, with estimated losses surpassing USD 10 billion per year on a global scale [2]. Ongoing climate change and erratic, high-intensity rainfall are expected to trigger more landslides, which will increase the annual mortality rate in the future [3]. So, the precise identification of landslides using semantic segmentation methods is highly significant in addressing the challenges related to locating and estimating the area of landslide occurrences [4]. Deep learning has witnessed remarkable progress across various domains. In the realm of bird recognition, deep learning techniques have exhibited the ability to effectively differentiate between various bird species with high accuracy [5]. Within the field of autonomous driving, deep learning technology has played a vital role in achieving autonomous driving capabilities. It has been extensively utilized for various tasks, including target detection, road recognition, and decision making in autonomous driving systems [6]. In medical imaging, deep learning algorithms play a crucial role in enhancing the precision and efficiency of medical image diagnoses [7]. Within the realm of video understanding, deep learning methods have showcased the capacity to automatically analyze and comprehend video content [8]. However, the utilization of deep learning techniques in various natural disaster domains has not yet been extensive. 
Therefore, in this paper, from the perspective of the rapid assessment of landslide disasters, deep learning and artificial intelligence technologies are utilized to explore advanced and precise methods for the semantic segmentation of landslides, incorporating principles from computer vision.
As image segmentation technology in the field of deep learning continues to advance, researchers have started to investigate landslide analysis and extract landslide areas via different methods. Ju et al. [9] used Mask R-CNN to identify old landslides in Google Earth images. Although the results confirmed the feasibility of identifying old landslides with Mask R-CNN, the accuracy was still insufficient. Ullo et al. [10] employed Mask R-CNN and transfer learning techniques to train their proposed network, which was specifically designed for landslide detection. Tavakkoli Piralilou et al. [11] directed their attention toward the Himalayan region and integrated object-based image analysis with three machine learning methods to effectively detect landslides in this area. In Refs. [12,13,14], the detection capability of landslides was enhanced; however, the detection results were limited by the optimal scale parameters. Mandal, Saha, and Mandal [15] used a CNN to identify landslides in the Himalayas and compared it with commonly used machine learning methods [16,17,18]. The results indicated that deep learning outperformed these conventional machine learning methods. Du et al. [19] applied semantic segmentation networks to landslide detection and provided a comprehensive comparison of various popular semantic segmentation networks. They assessed the suitability of these networks for landslide segmentation and identified key areas in which deep convolutional neural networks require further improvement to achieve more effective landslide detection in future studies. Yang et al. [20] proposed a deep learning method for extracting landslides from images, selecting three widely recognized semantic segmentation networks (U-Net, Deeplabv3+, and PspNet) to automatically identify landslides; however, the segmentation of landslide boundaries needs further improvement. Wang et al. [21] presented a deep learning semantic segmentation network for landslide identification, utilizing an encoder–decoder architecture. The proposed network effectively reduced the computational complexity while enabling pixel-level prediction of landslide images. Qi et al. [22] introduced the ResU-Net deep learning network model for the automatic segmentation of landslide areas. They conducted tests comparing this method with the baseline model (U-Net) for the automated mapping of landslide areas in Tian Shui city, Gansu Province. The ResU-Net model demonstrated an improved accuracy compared to the U-Net.
Despite the successful achievements in landslide segmentation in the aforementioned studies, the task of semantic segmentation in landslide areas still presents three significant challenges due to the intricate nature and variability of landslides, as follows: (1) the current landslide dataset directly used for deep learning training is limited; (2) the old landslide areas in a landslide image are similar to the background landscape features so that the general semantic segmentation network cannot segment the boundaries of the old landslides accurately; and (3) redundant information in landslide images can lead to the omission of small-scale landslide areas.
To address these issues, this paper, firstly, uses image enhancement techniques such as rotation and deflation. Secondly, the paper introduces fractional order image enhancement [23]. These methods can enrich the dataset and enhance the generalization performance of the network. In addition, this paper adopts the method of manual annotation to produce more accurate image labels. Finally, an improved landslide segmentation network model, the Deepened High-Resolution–ECA Network (DHRECA-Net), is proposed. The main contributions of this paper are listed as follows:
(1) This paper introduces ECARes-Net and ECA-ASPP for acquiring broader contextual information, enhancing the network’s ability to detect old landslides and small-scale landslides. ECARes-Net is incorporated into the feature extraction network, while ECA-ASPP is added after the feature extraction network.
(2) This paper presents a deepened high-resolution semantic segmentation network specifically designed for landslide scenes. By deepening the network’s backbone, a stacked feature module is constructed. This module effectively captures deeper semantic information, enhancing the network’s performance in semantic segmentation of landslide scenes.
(3) To address the issue of imbalanced positive and negative samples and improve the segmentation performance, this paper employs a combined loss function that includes BCE loss and Dice loss [24] during network training.
To assess the effectiveness of the proposed model, this paper performed ablation experiments using a landslide dataset collected from the Loess Plateau. For comparative analysis, the results were compared with the performances of the U-Net [25], Deeplabv3+ [26], and PspNet [27] semantic segmentation networks. The experimental findings show that the proposed method achieves higher accuracy than these networks.

2. Materials and Methods

2.1. DHRECA-Net Structure

This paper proposes an improved segmentation network for landslide scenes. It targets two problems in the landslide dataset: distinguishing old landslides from the background and detecting small-scale landslides. The aim is to improve accuracy and reduce missed detections. The network’s overall structure is depicted in Figure 1. The network improvement focuses on three aspects, as follows: (1) Adding ECA [28] to the Res-Net [29] makes the network more attentive to the areas where landslides occur. This mitigates the poor recognition performance that arises because old landslides and the background have similar pixel values and are therefore difficult to distinguish. (2) By incorporating the ECARes-Net as the backbone and deepening the HR-Net [30], this paper repeatedly invokes the Stage 3 and Stage 4 modules to capture deeper semantic information while preserving the fusion of multiscale information coming from the submodules of Stage 3. (3) This paper applies the ECA-ASPP to the second- and third-level features extracted from the backbone network. The ECA-ASPP expands the receptive field to obtain broader contextual information, enhances the network’s ability to perceive small-scale landslides, and, through the ECA module, further enhances the network’s ability to segment old landslides.

2.1.1. ECARes-Net

The ECA module is introduced into the basic block of the Res-Net, and the network structure is shown in Figure 2. This enables the network to better distinguish between old landslides and backgrounds, and it reduces the degradation of the network segmentation performance due to the similarity of features between old landslides and backgrounds. The paper introduces the ECARes-Net module into the network. The ECA module does not require channel dimensionality reduction operations, with only a small number of parameter additions, thus retaining the informational integrity of the original channel features. Meanwhile, it is also able to better utilize the interdependence between the features produced by the convolution operation, which effectively improves the quality of the feature representation generated by the neural network [31,32]. The mechanism can use global information to selectively emphasize information-rich features and suppress less useful ones. The structure of the ECA module is shown in Figure 3.
The ECA first performs global average pooling on the input feature map, changing the input size from [H, W, C] to a vector of [1, 1, C]. Then, the weight of each channel is computed using a one-dimensional convolution kernel (of size k), where the value of k is related to the number of channels of the feature map. The calculation formula is shown as follows:
$$k = \psi(C) = \left| \frac{\log_2(C)}{\gamma} + \frac{b}{\gamma} \right|_{\mathrm{odd}},$$
where k represents the one-dimensional convolutional kernel size, $|t|_{\mathrm{odd}}$ denotes the closest odd number to t, C represents the number of channels, and γ and b are hyperparameters set to 1.
Finally, the normalized weights are multiplied channel by channel with the original input feature map to obtain a weighted feature map. This enhances landslide features and suppresses background features.
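The three ECA steps described above (global average pooling, a one-dimensional convolution across channels, and channel-wise reweighting) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the rounding rule for $|t|_{\mathrm{odd}}$ (truncate, then move even values up to the next odd number) is an assumption borrowed from common ECA implementations, and the kernel weights, which would be learned in a real network, are passed in by hand.

```python
import math


def eca_kernel_size(channels, gamma=1.0, b=1.0):
    """Adaptive 1-D kernel size k = |log2(C)/gamma + b/gamma|_odd."""
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    # Assumed rounding rule: truncate, then bump even values to the next odd number.
    return t if t % 2 == 1 else t + 1


def eca_weights(channel_means, kernel):
    """Slide a 1-D kernel over the pooled channel vector (zero padding), then apply a sigmoid."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        s = sum(w * padded[c + j] for j, w in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


def apply_eca(feature_map, kernel):
    """feature_map is a list of C channels, each an H x W nested list of floats."""
    # Global average pooling: one mean value per channel.
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    weights = eca_weights(means, kernel)
    # Channel-wise reweighting of the original feature map.
    return [[[w * v for v in row] for row in ch] for ch, w in zip(feature_map, weights)]
```

With the values stated in the text (γ = b = 1), a 64-channel feature map gives `eca_kernel_size(64) == 7`. The original ECA-Net paper uses γ = 2 and b = 1, which gives a kernel size of 5 for 256 channels.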

2.1.2. Stacking Feature Extraction Modules

To enhance the network’s capability in extracting image features without overfitting, this paper selected Res-Net as the feature extraction network on top of the HR-Net architecture. This decision aimed to effectively improve the network’s feature extraction abilities while avoiding the issue of excessive parameterization, which can hinder backpropagation during training. The improved network structure is shown in Figure 4. Firstly, the Stage 3 and Stage 4 network modules are stacked. The Stage 3 modules extract feature information from the original image and the image with two downsamplings. The Stage 4 modules extract feature information from the original image and the image with three downsamplings. After several experiments, this paper ultimately used 4 Stage 3 modules and 3 Stage 4 modules stacked. Then, fusion layers are introduced between the newly added network modules. These fusion layers facilitate the combination of features from different scales, resulting in more representative feature maps. This approach effectively improves the network’s segmentation performance.

2.1.3. ECA-ASPP

ASPP is mainly used to solve the scaling problem in semantic segmentation tasks, in which each pixel in an image needs to be categorized into different classes, while different objects and structures may have different scales in the image. Traditional convolutional neural networks can only operate with a fixed-scale convolutional kernel when extracting semantic information and thus cannot capture the contextual information at different scales well.
To address this issue, this paper introduces the ASPP module after the second- and third-level features of the DHRECA-Net to further enhance the network’s ability to detect small-scale landslides. ASPP captures different levels of contextual information by introducing multiple parallel branches in the network, where each branch uses atrous (dilated) convolution at a different scale or a pooling operation. By using different dilation rates for the atrous convolutions, the receptive field can be expanded to obtain a broader range of contextual information. The ASPP consists of a 1 × 1 convolutional layer, a pooling pyramid layer, and a pooling layer. In this paper, the layers of the pooling pyramid use dilation rates of 6, 12, and 18. In order to enhance the network’s ability to differentiate between landslides and backgrounds, this paper also adds the ECA module to the ASPP structure to construct the ECA-ASPP module, the structure of which is shown in Figure 5. This module further enhances the network’s ability to extract features from different categories.
By introducing the ECA-ASPP structure, the DHRECA-Net is able to better capture contextual information at different scales, which improves the network’s ability to detect small-scale landslides and further enhances the segmentation of landslide areas.
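To make the effect of the dilation rates concrete, the following sketch computes the effective extent of a single atrous kernel using the standard relation k_eff = k + (k − 1)(r − 1). The 3 × 3 kernel size is an assumption (the text gives only the rates 6, 12, and 18), and the function name is illustrative.

```python
def effective_kernel_size(kernel_size, rate):
    """Spatial extent covered by one atrous (dilated) convolution kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# Assumed 3 x 3 kernels; rate 1 is an ordinary convolution, shown for reference.
for rate in (1, 6, 12, 18):
    size = effective_kernel_size(3, rate)
    print(f"rate={rate:2d} -> covers {size} x {size} pixels")
```

Under this assumption the three branches cover 13 × 13, 25 × 25, and 37 × 37 pixels while each still uses only nine weights, which is how the parallel branches gather context at several scales without adding many parameters.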

2.2. Combined Loss Function

In the context of landslide segmentation, the task can be viewed as a binary classification problem. However, because of the imbalanced ratio between positive samples (i.e., landslides) and negative samples (i.e., backgrounds), the network model often gets stuck in a local optimum, resulting in a suboptimal segmentation performance. To address this issue, this paper utilizes a joint loss function consisting of BCE loss and Dice loss. Furthermore, the BCE loss is well suited for binary classification problems, and the Dice loss can alleviate the imbalance between positive and negative samples.
BCE loss is a loss function commonly used in pixel classification to compare the discrepancy between the model’s learned distribution and the true distribution. The calculation formula for BCE loss is shown as follows:
$$L_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ g_i \times \log(p_i) + (1 - g_i) \times \log(1 - p_i) \right],$$
where N represents the total number of pixels, $p_i$ denotes the predicted value of the i-th pixel by the network, with a range of (0, 1), and $g_i$ represents the value of the i-th pixel in the ground truth label, with $g_i$ = 1 for positive samples and $g_i$ = 0 for negative samples.
The Dice loss can alleviate the negative effect of positive and negative sample imbalances. A positive and negative sample imbalance means that most of the region of an image is free of the target, while only a small portion of the region contains the target. The Dice loss focuses on the overlap between the predicted results and the ground truth during the calculation process, rather than simply considering the overall pixel values. It mitigates the issue of sample imbalance to some extent. The calculation formula for Dice loss is depicted as follows:
$$L_{D} = 1 - \frac{2 \times \sum_{i=1}^{N} p_i \times g_i}{\sum_{i=1}^{N} p_i + \sum_{i=1}^{N} g_i},$$
The combined function is denoted as follows:
$$L = L_{BCE} + L_{D},$$
where $L_{BCE}$ is the BCE loss, and $L_{D}$ is the Dice loss.
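The combined loss can be written out directly from the two formulas. The sketch below works on flat lists of per-pixel predictions and labels; the function names and the small `eps` terms (added for numerical stability) are assumptions and are not taken from the paper.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy averaged over all pixels."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite (assumed safeguard)
        total += g * math.log(p) + (1 - g) * math.log(1 - p)
    return -total / len(pred)


def dice_loss(pred, target, eps=1e-7):
    """1 - 2 * sum(p * g) / (sum(p) + sum(g))."""
    intersection = sum(p * g for p, g in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - 2.0 * intersection / (denominator + eps)  # eps avoids 0/0 on empty masks


def combined_loss(pred, target):
    """Unweighted sum L = L_BCE + L_D."""
    return bce_loss(pred, target) + dice_loss(pred, target)
```

For a maximally uncertain prediction such as `combined_loss([0.5, 0.5], [1, 0])`, the BCE term equals ln 2 and the Dice term is about 0.5, whereas a perfect prediction drives both terms toward zero.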

3. Experimental Description

3.1. Landslide Dataset Production

The dataset employed in this study consists of images acquired from areas in the Loess Plateau. The dataset consists of 340 remote sensing images, for which the landslide regions were initially unannotated. To annotate the landslide boundaries, this paper utilized the labelme image annotation tool. Figure 6 depicts the annotated boundaries of the landslide areas within the images.
The final labeling diagram obtained is shown in Figure 7.
To overcome the size limitation for the training data and prevent overfitting, this study employed data augmentation techniques. The objective of employing these techniques was to enhance the diversity of the landslide dataset and optimize the training of the neural network model. The data augmentation methods employed in this study primarily encompassed:
  • Flip transformation: flip images horizontally or vertically to create mirrored versions;
  • Rotation transformation: rotate images to a certain angle to simulate variations in the viewpoints;
  • Perspective transformation: apply perspective distortion to the images to mimic different camera perspectives;
  • Shear transformation: apply shear distortion to the images to introduce geometric deformations;
  • Image enhancement based on fractional-order differentiation: The fractional order differential operator offers superior capabilities in preserving the image’s edge features [33]. The fractional derivatives in this paper used the G–L definition in the image processing [34], which is defined as follows:
$${}_{a}^{R}D_{t}^{T} f(t) = \lim_{h \to 0} \frac{1}{h^{v}} \sum_{j=0}^{\left[\frac{t-a}{h}\right]} (-1)^{j} \frac{\Gamma(v+1)}{j!\,\Gamma(v-j+1)} f(t - jh),$$
where R is the real number field; a and t are the independent variables that meet the G–L definition, with a < t and t − a < 1; T is the differential order; v is the fractional order; j is the summation index; and h is the unit variable (step size). Γ (·) is the gamma function, which is defined as follows:
$$\Gamma(n) = \int_{0}^{\infty} e^{-t} t^{n-1} \, dt = (n-1)!.$$
Applying fractional-order techniques to enhance gray and poorly textured landslide images can significantly enhance the quality of the dataset. This improvement is visually illustrated in Figure 8.
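The coefficients in the G–L sum, $(-1)^{j}\,\Gamma(v+1) / (j!\,\Gamma(v-j+1))$, can be generated by a simple recursion, which avoids evaluating the gamma function directly. The sketch below is a one-dimensional illustration with step size h = 1 and a truncated sum. The paper does not describe how its image masks are built, so the function names and the truncation length are assumptions.

```python
def gl_coefficients(v, n_terms):
    """First n_terms G-L weights w_j = (-1)^j * Gamma(v+1) / (j! * Gamma(v-j+1))."""
    coeffs = [1.0]  # w_0 = 1
    for j in range(1, n_terms):
        # Recursion derived from the definition: w_j = w_{j-1} * (j - 1 - v) / j
        coeffs.append(coeffs[-1] * (j - 1 - v) / j)
    return coeffs


def gl_fractional_diff(signal, v, n_terms=3):
    """Truncated G-L difference with h = 1: sum over j of w_j * f(t - j)."""
    w = gl_coefficients(v, n_terms)
    out = []
    for t in range(len(signal)):
        # Near the left border, only the samples that exist are used.
        out.append(sum(w[j] * signal[t - j] for j in range(min(n_terms, t + 1))))
    return out
```

For v = 1 the weights reduce to 1, −1, 0, which is the ordinary first difference. For a fractional order such as v = 0.5 they become 1, −0.5, −0.125, so distant samples still contribute with slowly decaying weights. For images, one plausible approach is to apply such weights along several directions around each pixel.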
By incorporating data augmentation techniques, the variability of the dataset is increased, allowing the deep neural network to generalize more effectively and enhance its performance in landslide segmentation.
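For segmentation data, each geometric transformation has to be applied identically to the image and to its label mask so that the annotated boundary stays aligned with the pixels. The following sketch shows this for flips and a 90° rotation on nested lists. It is only an illustration with hypothetical helper names: arbitrary-angle rotation, perspective, and shear transformations would normally rely on an image-processing library.

```python
def hflip(grid):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Reverse the row order (vertical flip)."""
    return [row[:] for row in grid[::-1]]


def rot90(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask, transform):
    """Apply the same geometric transform to an image and its label mask."""
    return transform(image), transform(mask)


image = [[10, 20], [30, 40]]
mask = [[1, 0], [0, 0]]  # 1 marks a landslide pixel
flipped_image, flipped_mask = augment_pair(image, mask, hflip)
```

After the horizontal flip, the landslide pixel with value 10 moves to the top-right corner in both the image and the mask.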
After applying data augmentation techniques, the final dataset of landslide images consisted of a total of 640 images, and the dataset was divided into training and validation sets using a ratio of 9:1.

3.2. Evaluation Metrics

The evaluation metrics employed in this paper included mean intersection over union (MIOU), mean pixel accuracy (MPA), and F1-score.
1. The MPA computes the percentage of accurately classified pixels for each class, aggregates them, and calculates the average. The calculation formula can be expressed as follows:
$$MPA = \frac{1}{n} \sum_{i=1}^{n} \frac{TN + TP}{TN + TP + FN + FP},$$
where n is the number of classes, TN (true negative) denotes the count of correctly predicted negative samples, TP (true positive) represents the count of correctly predicted positive samples, FN (false negative) indicates the count of incorrectly predicted negative samples, and FP (false positive) represents the count of incorrectly predicted positive samples.
2. MIOU tends to provide an intuitive evaluation, which is the average intersection over union between the predicted result and the ground truth. The calculation formula can be expressed as follows:
$$MIOU = \frac{1}{n} \sum_{i=1}^{n} \frac{TP}{TP + FN + FP},$$
3. The calculation of the F1-score depends on two other fundamental metrics, recall and precision, and it is a more comprehensive indicator of the model’s performance. The calculation formula is defined as follows:
$$Precision = \frac{TP}{TP + FP},$$
$$Recall = \frac{TP}{TP + FN},$$
$$F1\text{-}score = \frac{2 \times Recall \times Precision}{Recall + Precision},$$
where Precision is the fraction of pixels predicted as positive that are truly positive, and Recall is the fraction of truly positive pixels that the model predicts as positive. Precision and Recall often trade off against each other, so the F1-score is needed to evaluate the comprehensive performance of the network.
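The metrics above can be computed from per-class confusion counts. The sketch below follows the formulas as written in this section: each class is treated in turn as the positive class for MPA and MIOU, and the landslide class is used for Precision, Recall, and F1-score. The function names are illustrative, and the code assumes every class appears in the data so that no denominator is zero.

```python
def confusion_counts(pred, truth, positive):
    """Return (TP, FP, FN, TN) with `positive` treated as the positive class."""
    tp = fp = fn = tn = 0
    for p, g in zip(pred, truth):
        if p == positive and g == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif g == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth, classes=(0, 1), landslide=1):
    """MPA, MIOU, Precision, Recall and F1 for flat lists of class labels."""
    acc, iou = [], []
    for c in classes:
        tp, fp, fn, tn = confusion_counts(pred, truth, c)
        acc.append((tn + tp) / (tn + tp + fn + fp))
        iou.append(tp / (tp + fn + fp))
    tp, fp, fn, _ = confusion_counts(pred, truth, landslide)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "MPA": sum(acc) / len(acc),
        "MIOU": sum(iou) / len(iou),
        "Precision": precision,
        "Recall": recall,
        "F1": 2 * recall * precision / (recall + precision),
    }


metrics = segmentation_metrics([1, 1, 0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 1, 1, 0, 0])
```

In this eight-pixel example the landslide class has TP = 2, FP = 1, and FN = 1, so Precision, Recall, and F1-score all equal 2/3, while the landslide IoU is 0.5 and the background IoU is 4/6.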

3.3. Experimental Platform and Training Parameters

This section presents the experimental evaluation of the performance of the proposed DHRECA-Net on the landslide dataset. The parameters of the experimental platform are shown in Table 1.
The DHRECA-Net model was developed using the PyTorch framework and was trained and evaluated on a single NVIDIA TESLA T4 platform with 12 GB RAM (Nvidia, Santa Clara, CA, USA). The landslide dataset was processed with an input image size of 512 × 512. For the optimization, a stochastic gradient descent (SGD) optimizer was utilized with an initial learning rate of $4 \times 10^{-3}$ and a minimum learning rate of $4 \times 10^{-5}$. During the network training, a cosine annealing learning rate strategy (COS) was employed to dynamically adjust the learning rate. The training was performed with a batch size of 16 and lasted for 300 epochs. The model’s parameter configuration is displayed in detail in Table 2.
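A cosine annealing schedule between the stated initial and minimum learning rates can be sketched as follows. The text does not say whether the rate is updated per epoch or per iteration, or whether a warm-up phase is used, so the per-epoch form without warm-up shown here is an assumption, as is the function name.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=300, lr_max=4e-3, lr_min=4e-5):
    """Standard cosine annealing from lr_max (epoch 0) down to lr_min (final epoch)."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cosine)


for epoch in (0, 75, 150, 225, 300):
    print(epoch, f"{cosine_annealing_lr(epoch):.6f}")
```

The rate starts at the initial value, passes through the midpoint of the two bounds halfway through training, and reaches the minimum at the final epoch. The decay is slow at the beginning and end and fastest in the middle.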

4. Results and Discussion

4.1. Results of Ablation Experiment

The DHRECA-Net exhibited a significant improvement in segmentation accuracy on the Loess Plateau dataset, as evidenced by the ablation results presented in Table 3. Specifically, the MIOU, MPA, and recall improved with the ECARes-Net and ECA-ASPP compared to the HR-Net model. Although the precision decreased, the F1-score still improved. This suggests that the inclusion of these two modules enhances the overall performance of the network. The network’s performance improved significantly when the three modules were added in pairs. The DHRECA-Net achieved its best metrics by incorporating all three modules, with an MIOU of 88.37%, MPA of 94.55%, and F1-score of 93.85%. It thus outperformed the baseline HR-Net model, with an increase of 4.82% in the MIOU, 4.42% in the MPA, and 3.15% in the F1-score. These findings highlight a substantial enhancement in the segmentation performance.
Figure 9 shows that the baseline HR-Net algorithm struggled to accurately delineate the boundaries of old landslides and tended to miss small-scale landslides in multiple areas of a single test image. In contrast, the DHRECA-Net algorithm exhibited a noticeable improvement in accurately delineating the boundaries between old landslides and the background, as well as providing more complete segmentation of small-scale landslides. The algorithm’s performance was significantly enhanced and demonstrated greater accuracy in capturing details compared to the baseline model.

4.2. Performance Comparison Using Different Loss Functions

Table 4 presents the results of comparing network training supervised by a single loss function with network training supervised by the combined loss function. Using the combined loss function for network training yielded an improvement in performance.

4.3. Comparison of Experiments Results among Different Algorithms

To evaluate the performance of the proposed algorithm in this paper, a comparison was conducted with the U-Net, Deeplabv3+, and PspNet algorithms. The experimental results are presented in Table 5. It is evident that the DHRECA-Net outperformed the other three algorithms in all three metrics. In comparison to the highest performing PspNet algorithm, the DHRECA-Net demonstrated improvements of 4.28% in the MIOU, 2.76% in the MPA, and 2.78% in the F1-score.
Figure 10 indicates that the U-Net, Deeplabv3+, and PspNet segmentation algorithms exhibited deficiencies in detecting multiple small-scale landslide areas, while the proposed DHRECA-Net detected all landslide areas in the images. Moreover, when segmenting old landslides, the three aforementioned algorithms displayed inaccuracies and segmentation errors, which suggests that they have difficulty discerning between background and landslide features. In contrast, the proposed algorithm demonstrated an improvement in the accuracy of segmenting old landslides. Consequently, these results underscore the advantage of the proposed DHRECA-Net over the three comparative algorithms.

4.4. Number of Parameters for Different Algorithms

In order to further confirm the superiority of the DHRECA-Net algorithm, a comparison was made regarding the model’s parameter quantity (Params), computational load (floating point operations per second, GFLOPS), and inference speed, i.e., the number of images processed per second (frames per second, FPS). The results of this analysis are presented in Table 6.
The data listed in Table 6 reveal that the DHRECA-Net exhibited the lowest parameter count and computational load among the four algorithms (U-Net, Deeplabv3+, PspNet, and DHRECA-Net). This implies that DHRECA-Net is more easily deployable and operable in resource-constrained environments, such as mobile devices or embedded systems. Additionally, the DHRECA-Net exhibited the highest FPS, demonstrating high real-time performance in applications.

4.5. Discussion

First, the test results show that the model proposed in this paper has higher accuracy compared to the baseline model, indicating better performance in landslide segmentation tasks. Moreover, the model demonstrated superior segmentation results when dealing with old landslides and small-scale landslide segmentation.
Secondly, when compared to the other models, the U-Net achieved an MIOU of 82.72%, MPA of 91.19%, and F1-score of 90.18% on the dataset in this paper. Deeplabv3+ attained an MIOU of 83.65%, MPA of 90.68%, and F1-score of 90.75% on the same dataset. PspNet, on the other hand, recorded an MIOU of 84.09%, MPA of 91.79%, and F1-score of 91.07% on this dataset. Specifically, all performance metrics of the model proposed in this paper surpassed those of the other models on the dataset, demonstrating the superior performance of this model in landslide segmentation tasks. Furthermore, both PspNet and the model in this paper utilize a parallel structure, suggesting that this structure may offer advantages in landslide detection.
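The margins over each comparison network follow directly from the scores quoted in this paragraph and the DHRECA-Net scores reported in Section 4.1 (MIOU 88.37%, MPA 94.55%, F1-score 93.85%). The short sketch below recomputes them as differences in percentage points. The variable and function names are illustrative.

```python
# Scores in percent, as (MIOU, MPA, F1-score), copied from the text.
scores = {
    "U-Net": (82.72, 91.19, 90.18),
    "Deeplabv3+": (83.65, 90.68, 90.75),
    "PspNet": (84.09, 91.79, 91.07),
}
dhreca = (88.37, 94.55, 93.85)


def gains(reference, proposed=dhreca):
    """Difference in percentage points between the proposed model and a reference."""
    return tuple(round(p - r, 2) for p, r in zip(proposed, reference))


for name, reference in scores.items():
    print(name, gains(reference))
```

The result for PspNet, (4.28, 2.76, 2.78), matches the improvements reported in Section 4.3.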
Finally, this paper compared the number of parameters and the computation of the DHRECA-Net with the other models. The results indicate that the number of parameters in the DHRECA-Net was only 39.6% that of U-Net, 18% of Deeplabv3+, and 21.1% of PspNet. Additionally, the computation required for the DHRECA-Net was only 8.42% of that of U-Net, 22.8% of Deeplabv3+, and 32.1% of PspNet. These findings demonstrate that the model proposed in this paper achieved a superior performance while utilizing fewer parameters. Particularly for scenarios in which deployment on power-constrained devices is necessary, the algorithm presented in this paper demonstrated significant advantages. Through parameter and computation optimization, the model can efficiently operate in resource-constrained environments, presenting a more effective solution for practical applications.

5. Conclusions

This study developed a landslide scene dataset utilizing images from the Loess Plateau and proposed the DHRECA-Net model to enhance the extraction of deep semantic information. The model uses ECARes-Net as the backbone network to allocate more attention to informative features, and it deepens the backbone structure to extract deeper semantic information. It also introduces the ECA-ASPP structure to further process the information extracted from the backbone network and obtain more effective semantic information. Moreover, this paper performed ablation experiments on the landslide dataset, comparing the DHRECA-Net model with the baseline model. The experimental results were also compared with three segmentation networks: U-Net, Deeplabv3+, and PspNet. The results indicate the capability of the proposed algorithm to accurately delineate old landslide regions and multiple areas affected by landslides. However, because the enhanced network has more parameters, its landslide segmentation speed is comparatively reduced. To meet the demands of practical applications, it is crucial to design a network model that offers a faster segmentation speed and a lighter structure. This enables better adaptability to various application scenarios without compromising performance. Therefore, future work will focus on making the network more lightweight.

Author Contributions

H.S. and S.Y. designed the research; R.W. and K.Y. processed the data; S.Y. drafted the manuscript; R.W. helped organize the manuscript; H.S. and R.W. revised and finalized the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Tianjin, China, grant number: 22YFZCSN00210.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhao, J.; Feng, W.; Yi, X.; Wu, W.; Li, B. Investigation of the mass movement and thermal pressurization effect of rapid and long-runout landslides in Shuicheng, Guizhou, China. Geomorphology 2024, 449, 109051. [Google Scholar] [CrossRef]
  2. Singh, K.; Bhardwaj, V.; Sharma, A.; Thakur, S. A Comprehensive Review on Landslide Susceptibility Zonation Techniques. Quaest. Geogr. 2024, 43, 79–91. [Google Scholar]
  3. Sreelakshmi, S.; Vinod Chandra, S.S.; Shaji, E. Landslide identification using machine learning techniques: Review, motivation, and future prospects. Earth Sci. Inform. 2022, 15, 2063–2090. [Google Scholar]
  4. Sun, C.; Zhao, Y. Meteorological Disaster Fault Prediction for Power Grid Based on Equipment Vulnerability. Shandong Electr. Power Technol. 2020, 47, 9–12. (In Chinese) [Google Scholar]
  5. Wang, R.; Shi, Y.; Sun, H.; Zhang, Y. Lightweight-based high resolution bird flocking recognition deep learning network. J. Huazhong Univ. Sci. Technol. Nat. Sci. Ed. 2023, 51, 81–87. (In Chinese) [Google Scholar]
  6. Wang, R.; Li, J.; Shi, Y.; Sun, H. Vision-based path planning algorithm of unmanned bird-repelling vehicles in airports. J. Beijing Univ. Aeronaut. Astronaut. 2024, 50, 1446–1453. (In Chinese) [Google Scholar]
  7. Pang, T.; Li, P.; Zhao, L. A survey on automatic generation of medical imaging reports based on deep learning. BioMedical Eng. Online 2023, 22, 48. [Google Scholar] [CrossRef]
  8. Bai, J.; Yang, Z.; Peng, B.; Li, W. Research on 3D Convolutional Neural Network and Its Application on Video Understanding. J. Electron. Inf. Technol. 2023, 45, 2273–2283. (In Chinese) [Google Scholar]
  9. Ju, Y.; Xu, Q.; Jin, S.; Li, W.; Su, Y.; Dong, X.; Guo, Q. Loess Landslide Detection Using Object Detection Algorithms in Northwest China. Remote Sens. 2022, 14, 1182. [Google Scholar] [CrossRef]
  10. Ullo, S.L.; Mohan, A.; Sebastianelli, A.; Ahamed, S.E.; Kumar, B.; Dwivedi, R.; Sinha, G.R. A new mask R-CNN-based method for improved landslide detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3799–3810. [Google Scholar] [CrossRef]
  11. Tavakkoli Piralilou, S.; Shahabi, H.; Jarihani, B.; Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Meena, S.R.; Aryal, J. Landslide Detection Using Multi-Scale Image Segmentation and Different Machine Learning Models in the Higher Himalayas. Remote Sens. 2019, 11, 2575. [Google Scholar] [CrossRef]
  12. Purwati, I.; Nur, H.; Susetyo, B.B.; Purwaningsih, E. Determination of landslide hazardous map using logistic regression in Lima Puluh Kota Regency. J. Phys. Conf. Ser. 2023, 2582, 012012. [Google Scholar]
  13. Karpatne, A.; Ebert-Uphoff, I.; Ravela, S.; Babaie, H.A.; Kumar, V. Machine learning for the geosciences: Challenges and opportunities. IEEE Trans. Knowl. Data Eng. 2018, 31, 1544–1554. [Google Scholar] [CrossRef]
  14. Han, W.; Zhang, X.; Wang, Y.; Wang, L.; Huang, X.; Li, J.; Wang, S.; Chen, W.; Li, X.; Feng, R.; et al. A survey of machine learning and deep learning in remote sensing of geological environment: Challenges, advances, and opportunities. ISPRS J. Photogramm. Remote Sens. 2023, 202, 87–113. [Google Scholar] [CrossRef]
  15. Mandal, K.; Saha, S.; Mandal, S. Applying deep learning and benchmark machine learning algorithms for landslide susceptibility modelling in Rorachu river basin of Sikkim Himalaya, India. Geosci. Front. 2021, 12, 101203. [Google Scholar] [CrossRef]
  16. Al-Saleh, A. A balanced communication-avoiding support vector machine decision tree method for smart intrusion detection systems. Sci. Rep. 2023, 13, 9083. [Google Scholar] [CrossRef] [PubMed]
  17. Duarte, D.; Pereira, W.H.; Ribeiro, R.B. A probabilistic model for networks generated by actors’ characteristics. J. Comput. Sci. 2023, 73, 102143. [Google Scholar] [CrossRef]
  18. Latifah, S.; Akhsani, F.; Sofiana, E.I.; Ferdiansah, M.R. Land cover change assessment using random forest and CA markov from remote sensing images in the protected forest of South Malang, Indonesia. Remote Sens. Appl. Soc. Environ. 2023, 32, 101061. [Google Scholar]
  19. Du, B.; Zhao, Z.; Hu, X.; Wu, G.; Han, L.; Sun, L.; Gao, Q. Landslide susceptibility prediction based on image semantic segmentation. Comput. Geosci. 2021, 155, 104860. [Google Scholar] [CrossRef]
  20. Yang, S.; Wang, Y.; Wang, P.; Mu, J.; Jiao, S.; Zhao, X.; Wang, Z. Automatic Identification of Landslides Based on Deep Learning. Appl. Sci. 2022, 12, 8153. [Google Scholar] [CrossRef]
  21. Yang, S.; Wang, Y.; Wang, P.; Mu, J.; Jiao, S.; Zhao, X.; Wang, Z.; Wang, K.; Zhu, Y. A Deep Learning Semantic Segmentation Method for Landslide Scene Based on Transformer Architecture. Sustainability 2022, 14, 8153. [Google Scholar] [CrossRef]
  22. Qi, W.; Wei, M.; Yang, W.; Xu, C.; Ma, C. Automatic Mapping of Landslides by the ResU-Net. Remote Sens. 2020, 12, 2487. [Google Scholar] [CrossRef]
  23. Rahman, Z.; Yi-Fei, P.; Aamir, M.; Wali, S.; Guan, Y. Efficient image enhancement model for correcting uneven illumination images. IEEE Access 2020, 8, 109038–109053. [Google Scholar] [CrossRef]
  24. Li, X.; Sun, X.; Meng, Y.; Liang, J.; Wu, F.; Li, J. Dice loss for data-imbalanced NLP tasks. arXiv 2019, arXiv:1911.02855. [Google Scholar]
  25. Zhang, G.; Roslan, S.N.; Wang, C.; Quan, L. Research on land cover classification of multi-source remote sensing data based on improved U-net network. Sci. Rep. 2023, 13, 16275. [Google Scholar] [CrossRef] [PubMed]
  26. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 801–818. [Google Scholar]
  27. Zhang, Z.; Gao, S.; Huang, Z. An Automatic Glioma Segmentation System Using a Multilevel Attention Pyramid Scene Parsing Network. Curr. Med. Imaging 2021, 17, 751–761. [Google Scholar] [CrossRef] [PubMed]
  28. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542. [Google Scholar]
  29. Shafiq, M.; Gu, Z. Deep Residual Learning for Image Recognition: A Survey. Appl. Sci. 2022, 12, 8972. [Google Scholar] [CrossRef]
  30. Sun, K.; Zhao, Y.; Jiang, B.; Cheng, T.; Xiao, B.; Liu, D.; Mu, Y.; Wang, X.; Liu, W.; Wang, J. High-resolution representations for labeling pixels and regions. arXiv 2019, arXiv:1904.04514. [Google Scholar]
  31. Fang, J.; Wang, X.; Li, Y.; Zhang, X.; Zhang, B.; Gade, M. GLUENet: An Efficient Network for Remote Sensing Image Dehazing with Gated Linear Units and Efficient Channel Attention. Remote Sens. 2024, 16, 1450. [Google Scholar] [CrossRef]
  32. Liu, Q.; Zhang, J.; Zhao, Y.; Bu, X.; Hanajima, N. A YOLOX Object Detection Algorithm Based on Bidirectional Cross-scale Path Aggregation. Neural Process. Lett. 2024, 56, 35. [Google Scholar] [CrossRef]
  33. Pu, Y.F.; Zhang, N.; Wang, Z.N.; Wang, J.; Yi, Z.; Wang, Y.; Zhou, J.L. Fractional-order retinex for adaptive contrast enhancement of under-exposed traffic images. IEEE Intell. Transp. Syst. Mag. 2019, 13, 149–159. [Google Scholar] [CrossRef]
  34. Gamini, S.; Kumar, S.S. Homomorphic filtering for the image enhancement based on fractional-order derivative and genetic algorithm. Comput. Electr. Eng. 2023, 106, 108566. [Google Scholar] [CrossRef]
Figure 1. The overall structure of the DHRECA-Net network.
Figure 2. Introduction of the ECA into the basic block.
Figure 3. Efficient channel attention structure.
Figure 4. The network structure before and after the improvements: (a) network structure prior to enhancement; (b) network structure following enhancement. Different colors represent various levels of feature maps, with Level 1 at the top and Level 4 at the bottom.
Figure 5. The ECA-ASPP structure.
Figure 6. Labelme annotation tool.
Figure 7. Label diagram.
Figure 8. Sample image enhancement based on fractional order: (a) unprocessed original image; (b) enhanced by fractional-order differentiation.
Figure 9. Comparison of the network model's recognition performance before and after the improvements.
Figure 10. Comparison of the recognition effect of the improved network model with the other models.
Table 1. Experimental platform's configuration.
| Platform Parameter | Value |
| --- | --- |
| Operating system | Windows 10 |
| GPU | NVIDIA TESLA T4 (12 GB RAM) |
| Experimental frameworks | PyTorch 1.11.0 and TorchVision 0.12.0 |
| Compilers | Python 3.7 |
Table 2. Model parameter configuration.
| Hyperparameter | Value |
| --- | --- |
| Input image size | [512, 512] |
| Optimizer | SGD |
| Initial learning rate | 4 × 10⁻³ |
| Minimum learning rate | 4 × 10⁻⁵ |
| Learning rate decay strategy | COS |
| Batch size | 16 |
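
The learning rate settings in Table 2 can be illustrated with a short sketch. It assumes that "COS" denotes standard cosine annealing from the initial to the minimum learning rate with no warm-up phase, which the table does not spell out. The function name `cosine_lr` and the epoch count `TOTAL_EPOCHS` are hypothetical, since the number of training epochs is not listed in the table.

```python
import math

INITIAL_LR = 4e-3   # "Initial learning rate" from Table 2
MIN_LR = 4e-5       # "Minimum learning rate" from Table 2
TOTAL_EPOCHS = 100  # hypothetical: the epoch count is not given in Table 2


def cosine_lr(epoch, total_epochs=TOTAL_EPOCHS, lr_max=INITIAL_LR, lr_min=MIN_LR):
    """Cosine decay from lr_max at epoch 0 to lr_min at the final epoch."""
    if total_epochs < 2:
        return lr_max
    # Clamp the epoch index, then map it to a progress value in [0, 1].
    clamped = min(max(epoch, 0), total_epochs - 1)
    progress = clamped / (total_epochs - 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# One learning rate per epoch, starting at 4e-3 and ending at 4e-5.
schedule = [cosine_lr(e) for e in range(TOTAL_EPOCHS)]
```

Under this assumption the rate changes slowly at the start and end of training and fastest in the middle, which is the usual motivation for a cosine schedule over a step schedule.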
Table 3. Comparison of DHRECA-Net and HR-Net ablation experiments on the Loess Plateau landslide dataset.
| Model | A | B | C | MIOU/% | MPA/% | Precision/% | Recall/% | F1-Score/% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HR-Net | | | | 83.55 | 90.13 | 91.28 | 90.13 | 90.70 |
| | | | | 83.89 | 91.43 | 90.44 | 91.43 | 90.93 |
| | | | | 85.57 | 91.39 | 92.56 | 91.39 | 91.97 |
| | | | | 84.49 | 91.83 | 90.80 | 91.83 | 91.31 |
| | | | | 86.15 | 92.87 | 91.82 | 92.87 | 92.34 |
| | | | | 84.05 | 90.82 | 91.22 | 90.82 | 91.01 |
| | | | | 86.18 | 92.02 | 92.03 | 92.02 | 92.02 |
| | | | | **88.37** | **94.55** | **93.17** | **94.55** | **93.85** |
A represents ECARes-Net, B the Stacked Feature Module, and C the ECA-ASPP. The best results are in bold.
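
As a reading aid for the MIOU and MPA columns, the sketch below computes both from a confusion matrix using the standard definitions (per-class intersection over union and per-class pixel accuracy, averaged over classes). It assumes these standard definitions match the ones used for the table. The function name `segmentation_metrics` and the pixel counts in `EXAMPLE` are hypothetical and are not taken from the experiments.

```python
def segmentation_metrics(confusion):
    """Return (mIoU, MPA) for a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
        accs.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(ious) / n, sum(accs) / n


# Hypothetical two-class example (background, landslide); not data from the paper.
EXAMPLE = [[90, 10],
           [5, 95]]
miou, mpa = segmentation_metrics(EXAMPLE)
```

With these definitions MPA is the class-averaged recall, which is consistent with the MPA and Recall columns carrying the same values in the table.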
Table 4. Comparison of the training results of the supervised networks with different loss functions.
| Loss function | MIOU/% | MPA/% | Precision/% | Recall/% | F1-Score/% |
| --- | --- | --- | --- | --- | --- |
| BCE loss | 84.38 | 90.55 | 91.94 | 90.55 | 91.07 |
| Combined loss | **88.37** | **94.55** | **93.17** | **94.55** | **93.85** |
The best results are in bold.
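
To make the idea of a combined loss concrete, the sketch below adds a Dice term to binary cross-entropy. This is an assumption made purely for illustration: the components and weights of the combined loss are not listed in this table, and the names `bce_loss`, `dice_loss`, `combined_loss`, and `dice_weight` are hypothetical. Inputs are flat lists of predicted foreground probabilities and 0/1 labels.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss; `smooth` keeps the ratio defined for empty masks."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def combined_loss(probs, targets, dice_weight=1.0):
    """Assumed combination: BCE plus a weighted Dice term."""
    return bce_loss(probs, targets) + dice_weight * dice_loss(probs, targets)


# Imbalanced toy mask: 1 landslide pixel out of 10, prediction close to "all background".
targets = [1] + [0] * 9
probs_miss = [0.1] * 10
bce_value = bce_loss(probs_miss, targets)
dice_value = dice_loss(probs_miss, targets)
```

In the toy case the Dice term penalizes the missed foreground pixel more heavily than the pixel-averaged BCE does, which is the usual argument for adding an overlap-based term when positive and negative samples are unevenly distributed.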
Table 5. Performance comparison between the DHRECA-Net and other models.
| Model | MIOU/% | MPA/% | Precision/% | Recall/% | F1-Score/% |
| --- | --- | --- | --- | --- | --- |
| U-Net | 82.72 | 91.19 | 89.26 | 91.19 | 90.18 |
| Deeplabv3+ | 83.65 | 90.68 | 90.86 | 90.65 | 90.75 |
| PspNet | 84.09 | 91.79 | 90.36 | 91.79 | 91.07 |
| DHRECA-Net | **88.37** | **94.55** | **93.17** | **94.55** | **93.85** |
The best results are in bold.
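
The relationship between the Precision, Recall, and F1-Score columns can be checked with a short worked example. The sketch assumes the F1-score is the harmonic mean of the reported precision and recall and applies it to the DHRECA-Net row. The function name `f1_score` is hypothetical; if the scores were instead averaged per class, other rows need not satisfy the identity exactly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall of the DHRECA-Net row in Table 5.
dhreca_f1 = f1_score(93.17, 94.55)
print(f"{dhreca_f1:.2f}")  # agrees with the reported F1-score of 93.85
```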
Table 6. Comparison of the model parameters.
| Model | Params | GFLOPS | FPS |
| --- | --- | --- | --- |
| U-Net | 24.891 M | 451.672 G | 12.883 |
| Deeplabv3+ | 54.709 M | 166.841 G | 17.376 |
| PspNet | 46.707 M | 118.427 G | 18.028 |
| DHRECA-Net | **9.858 M** | **38.063 G** | **22.013** |
The best results are in bold.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sun, H.; Yang, S.; Wang, R.; Yang, K. Study on a Landslide Segmentation Algorithm Based on Improved High-Resolution Networks. Appl. Sci. 2024, 14, 6459. https://doi.org/10.3390/app14156459
