1. Introduction
Landslides are destructive phenomena resulting from the combined action of multiple factors [1], and they cause enormous casualties and economic losses. Therefore, the rapid, accurate, and automatic identification and mapping of landslides are of significant value for disaster prevention and mitigation [2]. Traditionally, the identification of historical landslides has relied mainly on field investigation or visual interpretation of remote sensing (RS) images. While these approaches yield accurate results, they are costly, labor-intensive, time-consuming, and subject to human subjectivity [3].
Computer-assisted methods currently employed for historical landslide identification using RS images can be categorized into three main approaches: feature thresholding segmentation [4,5], machine learning [6,7], and deep learning (DL). The first two methods still necessitate the manual determination of thresholds and feature screening, which restricts their efficiency and generalizability. In contrast, DL methods have made significant strides in the intelligent interpretation of RS images [8,9], owing to their ability to automatically extract and classify features.
However, DL-based landslide identification relies heavily on a large amount of labeled data, which increases the cost of model training. Furthermore, supervised learning (SL) (Figure 1a) with insufficient labeled data usually leads to model overfitting, which prevents the model from learning the proper feature distributions embedded in the dataset. To address the reliance on expert-labeled data and mitigate these limitations, semi-supervised learning (SSL) (Figure 1b) has been proposed to train DL models using a small amount of labeled data and a large amount of unlabeled data.
In general, SSL approaches fall into three main categories: consistency regularization (CR), pseudo-labeling self-training (PS), and generative adversarial networks (GANs). CR is based on the principle that a robust model should produce consistent results for the same input under additional perturbations. PS assigns pseudo-labels to unlabeled data and merges these predictions with the original labeled data to form an enlarged labeled set, with the goal of minimizing prediction entropy; however, this process may introduce confirmation bias into the model, ultimately impairing the accuracy of landslide identification [10]. GAN-based methods synthesize landslide images with a generator and train a discriminator to distinguish real from synthetic images, enabling the model to learn a richer feature space and improve landslide identification [11], although the instability of GANs makes GAN-based SSL methods challenging to apply [12].
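To make the first two principles concrete, the sketch below shows minimal PyTorch-style loss terms for CR and PS, assuming a segmentation model that outputs per-pixel class logits of shape (B, C, H, W); the function names, the MSE consistency penalty, and the 0.95 confidence threshold are illustrative assumptions rather than the formulation of any cited method.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, perturb):
    """CR: predictions for the same input should agree under a perturbation."""
    with torch.no_grad():
        p_clean = torch.softmax(model(x_unlabeled), dim=1)        # reference view
    p_perturbed = torch.softmax(model(perturb(x_unlabeled)), dim=1)
    return F.mse_loss(p_perturbed, p_clean)                       # penalize disagreement

def pseudo_label_loss(model, x_unlabeled, threshold=0.95):
    """PS: treat confident predictions as labels, driving entropy minimization."""
    with torch.no_grad():
        probs = torch.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)                            # (B, H, W) pseudo-labels
    logits = model(x_unlabeled)
    loss = F.cross_entropy(logits, pseudo, reduction="none")
    mask = (conf >= threshold).float()                             # keep confident pixels only
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```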
Currently, research on SSL in RS largely focuses on full-scene segmentation [13,14,15] and classification [16,17,18], with limited exploration of the specific task of historical landslide identification. Zhang et al. [11] applied WGAN-GP to extract discriminative deep features from unlabeled landslide images through unsupervised adversarial training, effectively learning pixel-level and object-level deep representations of landslides. He et al. [19] used a generator to synthesize landslide images and trained a discriminator to learn multidimensional low- and high-level semantic multiscale features, thereby reducing dependence on labeled data. Zhou et al. [20] proposed a method combining class activation maps (CAMs) with a cycle generative adversarial network (CycleGAN) for RS landslide semantic segmentation, which reduces the workload of landslide annotation. In addition, U2PL [21], a typical PS-based SSL approach originally proposed for natural image segmentation, achieved a significant improvement in accuracy when applied to Luding landslide identification [22]. These studies demonstrate that SSL has substantial potential for historical landslide identification. However, significant differences between RS images and natural images present various challenges for its application to landslide recognition.
Only limited information can be extracted from the embedded data because unlabeled RS images of landslides are scarce. As illustrated in Figure 2, RS images of landslides exhibit distinct characteristics compared to natural images, such as larger receptive fields, a greater number of objects, richer embedded information, and more complex color and texture. Furthermore, the scale of available landslide datasets is much smaller than that of popular natural image datasets. For instance, the Cityscapes dataset [23] comprises 3475 images and the MS COCO dataset [24] contains 123,000 images, whereas the Bijie landslide dataset [25] consists of a mere 770 images, the Nepal landslide dataset [26] contains 275 images, and the Sichuan dataset [27] includes only 107 images. Consequently, the key to improving the accuracy of landslide identification lies in the effective exploration and utilization of information from these limited yet information-rich RS images. However, existing CR approaches are restricted to single-level [15,28,29] or dual-level [30] perturbations and lack perturbation rules tailored to landslide identification. This constraint prevents the model from exploring a wider perturbation space and gaining valuable knowledge from unlabeled landslide images. By integrating input, feature, and model perturbations, the perturbation space is expanded, enabling the model to learn from a broader range of perspectives. This, in turn, enhances the utilization of the limited yet information-rich RS images of landslides.
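As a rough illustration of what the three perturbation levels can look like, the sketch below uses Gaussian input noise, channel dropout on intermediate features, and stochastic dropout inside the network; these specific choices are assumptions made for exposition, and the perturbation rules actually used by HPM-Match are defined later in the paper.

```python
import torch
import torch.nn.functional as F

def input_perturbation(x):
    """Input level: perturb the image itself, e.g., with small Gaussian noise."""
    return x + 0.05 * torch.randn_like(x)

def feature_perturbation(feats, p=0.5):
    """Feature level: perturb intermediate representations, e.g., channel dropout."""
    return F.dropout2d(feats, p=p, training=True)

def model_perturbation(model, x):
    """Model level: perturb the network, e.g., keep dropout stochastic so that
    repeated forward passes over the same input yield different predictions."""
    model.train()
    return model(x)
```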
Low-quality pseudo-labels may be generated under weak augmentation. There is currently considerable interest in SSL approaches for natural images that rely on weak-to-strong consistency learning (WSCL), in which the strongly augmented view is guided by instant pseudo-labels from the weakly augmented view during each training iteration [15]. The effectiveness of these approaches stems from the ability of weak perturbations to generate accurate predictions, while strong perturbations introduce additional information and mitigate confirmation bias. Although various data augmentation strategies, including RandAugment [31], CTAugment [32], and AugSeg [33], have been proposed for SSL in classification and segmentation tasks on natural images, applying them to landslide identification may produce augmentations that do not reflect real-world landslide scenarios, which negatively impacts identification accuracy. Additionally, RS offers a diverse range of information sources; for example, the digital elevation model (DEM) and its derived topographic data are frequently utilized as auxiliary inputs for landslide identification, with positive results [34]. Unfortunately, existing data augmentation strategies are prone to corrupting this valuable information and impairing model performance, as shown in Figure 3. In this work, we propose a data augmentation strategy specifically designed for the landslide identification task. Through a well-balanced combination of weak and strong data augmentations, the strategy simulates landslides under different viewing angles and lighting conditions, mitigates the influence of color variations on landslide identification, and enhances the delineation of landslide boundaries.
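One plausible realization of such a split, sketched below under the assumption that the DEM is stacked as an extra channel alongside optical bands normalized to [0, 1], applies geometric transforms to all channels while restricting photometric jitter to the optical bands so that the elevation data is never corrupted; the exact augmentations adopted in this work are specified in the Methods.

```python
import random
import torch

def weak_augment(image, dem, mask):
    """Geometric only: simulates different viewing angles and keeps the DEM valid."""
    if random.random() < 0.5:                                 # horizontal flip
        image, dem, mask = image.flip(-1), dem.flip(-1), mask.flip(-1)
    k = random.randint(0, 3)                                  # random 90-degree rotation
    image = torch.rot90(image, k, dims=(-2, -1))
    dem = torch.rot90(dem, k, dims=(-2, -1))
    mask = torch.rot90(mask, k, dims=(-2, -1))
    return image, dem, mask

def strong_augment(image, dem):
    """Photometric only, on optical bands: varies lighting/color; DEM untouched."""
    image = image * random.uniform(0.7, 1.3)                  # brightness jitter
    mean = image.mean(dim=(-2, -1), keepdim=True)
    image = (image - mean) * random.uniform(0.7, 1.3) + mean  # contrast jitter
    return image.clamp(0.0, 1.0), dem
```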
To tackle the aforementioned challenges, we propose a comprehensive, generic DL framework named hybrid perturbation mean match (HPM-Match), specifically designed for the identification of historical landslides. HPM-Match addresses the limitations of existing approaches by integrating input-level, feature-level, and model-level perturbations in an end-to-end manner and updating model parameters through an exponential moving average (EMA). It introduces an independent triple-stream perturbation (ITP) structure that relies on multidimensional CR constraints and defines implementation strategies for the three perturbation types, thereby expanding the range of available perturbations and enabling the extraction of valuable knowledge from unlabeled landslide images from multiple perspectives. Moreover, HPM-Match aims to fully exploit the manually pre-defined perturbation spaces while minimizing the introduction of erroneous information during the WSCL process. To enrich the representation of the input without impairing the integrity of the original data, we propose a novel dual-branch input perturbation (DIP) generation approach: by applying two parallel WSCLs to the input data, DIP creates a more comprehensive and informative representation, thereby facilitating more accurate landslide identification. We conducted experiments on three landslide identification datasets, each with distinct characteristics. In comparison with both state-of-the-art (SOTA) SSL approaches and SL methods, our framework demonstrates superior performance while employing the same model.
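For orientation, the sketch below outlines the generic WSCL pattern with an EMA-updated teacher on which HPM-Match builds; it is a simplified, assumed implementation (the confidence threshold and decay rate are placeholders) and does not reproduce the ITP or DIP components described in the Methods.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    """Teacher weights track an exponential moving average of the student."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.data.mul_(decay).add_(s_p.data, alpha=1.0 - decay)

def unsupervised_step(student, teacher, x_weak, x_strong, threshold=0.95):
    """The weak view (teacher) guides the strong view (student) each iteration."""
    with torch.no_grad():
        probs = torch.softmax(teacher(x_weak), dim=1)
        conf, pseudo = probs.max(dim=1)                       # instant pseudo-labels
    logits = student(x_strong)
    loss = F.cross_entropy(logits, pseudo, reduction="none")
    mask = (conf >= threshold).float()                        # drop low-confidence pixels
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```

In a training loop of this kind, ema_update(teacher, student) would be called after each optimizer step so that the teacher remains a temporally averaged, more stable copy of the student.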
7. Limitations and Future Work
Although HPM-Match demonstrates excellent performance in the experiments, it still faces the risk of failure in complex scenarios and with limited landslide imagery. The feature extraction capability of the U-Net used in the model is limited, which may cause it to struggle under extreme weather conditions (e.g., cloud cover) or in highly complex terrain. However, models with stronger feature extraction capabilities generally have more complex structures, which increases computational cost and training time and thereby reduces the practical applicability of HPM-Match. In regions with scarce imagery, HPM-Match may fail to adequately learn unknown landslide characteristics, reducing both the accuracy and completeness of landslide identification. Additionally, the experiments use relatively small datasets and do not thoroughly explore the generalization ability of HPM-Match on larger, more diverse datasets.
HPM-Match improves the accuracy of historical landslide identification under limited-label conditions but also increases the computational cost of training. We evaluated the training time of HPM-Match, SL, and other SOTA SSL methods on the Bijie dataset, as shown in Figure 15. The results indicate that the training time of the SSL methods is significantly higher than that of SL across all three label ratios, because SL only uses the small amount of labeled data for training and does not learn from the unlabeled data. Among the four SSL methods, Mean Teacher has the shortest training time but typically achieves the lowest F1-Score. While HPM-Match achieves the best performance, its longer training time poses a challenge for practical applications.
In future work, we plan to integrate HPM-Match with lightweight models specialized for landslide identification and to incorporate multimodal or dynamic data to further enhance the accuracy of historical landslide identification. We will also use K-fold cross-validation to rigorously evaluate the accuracy and reliability of the framework. Additionally, the impact of the DEM source on the final results and the ability of HPM-Match to generalize to large datasets need to be examined. Last but not least, we will explore the application of HPM-Match in real-world landslide monitoring and its potential to address other geodynamic processes.