*Article* **Performance Study of Landslide Detection Using Multi-Temporal SAR Images**

**Yunung Nina Lin 1,\* , Yi-Ching Chen <sup>1</sup> , Yu-Ting Kuo <sup>2</sup> and Wei-An Chao <sup>3</sup>**


**Abstract:** This study addresses one of the most commonly-asked questions in synthetic aperture radar (SAR)-based landslide detection: How the choice of datatypes affects the detection performance. In two examples, the 2018 Hokkaido landslides in Japan and the 2017 Putanpunas landslide in Taiwan, we utilize the Growing Split-Based Approach to obtain Bayesian probability maps for such a performance evaluation. Our result shows that the high-resolution, full-polarimetric data offers superior detection capability for landslides in forest areas, followed by single-polarimetric datasets of high spatial resolutions at various radar wavelengths. The medium-resolution singlepolarimetric data have comparable performance if the landslide occupies a large area and occurs on bare surfaces, but the detection capability decays significantly for small landslides in forest areas. Our result also indicates that large local incidence angles may not necessarily hinder landslide detection, while areas of small local incidence angles may coincide with layover zones, making the data unusable for detection. The best area under curve value among all datatypes is 0.77, suggesting that the performance of SAR-based landslide detection is limited. The limitation may result from radar wave's sensitivity to multiple physical factors, including changes in land cover types, local topography, surface roughness and soil moistures.

**Keywords:** SAR-based landslide detection; Growing Split-Based Approach (GSBA); Hokkaido landslide; Putanpunas landslide; SAR polarimetry; model-free 3-component decomposition for full polarimetric data (MF3CF)

## **1. Introduction**

According to the Global Fatal Landslide Database, Asia suffers the greatest impact of fatal landslides among all continents [1,2]. This fact has to do with the physiographical environment of this region, including active tectonics, frequent typhoons/tropical cyclones, as well as socioeconomic factors, such as rapid economic growth, human population increase, habitat expansion and even loose regulations [1,3]. Among all questions associated with landslides, where and when they occur and how big they are remain the first information people demand to know. From a rapid response perspective, landslide sizes and locations are the key information needed by the ground crews to ensure the safety of human lives and the transportation of aids and supplies. From a policy-making perspective, long-term spatiotemporal evolution of landslide hotspots impacts the formulation of management strategies and mitigation plans [4]. From a scientific perspective, landslide volumes and frequencies may provide information about rock strength and denudation rates [5,6], the influence of climate changes [2], and the interaction among the lithosphere, hydrosphere and biosphere [7,8].

Given the large-area imaging capability from the sky, remote sensing has been the most widely-used tool in landslide detection during the past decades. For example, based on

**Citation:** Lin, Y.N.; Chen, Y.-C.; Kuo, Y.-T.; Chao, W.-A. Performance Study of Landslide Detection Using Multi-Temporal SAR Images. *Remote Sens.* **2022**, *14*, 2444. https://doi.org/ 10.3390/rs14102444

Academic Editors: Takeo Tadono and Masato Ohki

Received: 31 March 2022 Accepted: 17 May 2022 Published: 19 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Formosat-2 satellite images, Lin et al. (2017) found that about 70% of the mountainous area in Taiwan has experienced at least 1 landslide within the decade between 2003 and 2012 [4]. In Japan, the National Research Institute for Earth Science and Disaster Prevention (NIED) also conducts regular landslide mappings based on aerial photos (https://www.bosai.go. jp/e/research/database/earth-and-sand.html, accessed on 10 March 2022; a study using the national landslide database can be found in [8]). The optical image-based landslide mapping has high quality and accuracy, although the availability usually depends on the weather condition and the source of sun lights after a landslide occurs. In some cases, the latency between the event and the first usable image can be weeks. This is where the synthetic aperture radar (SAR)-based mapping may serve as an alternative solution particularly during the early phase of the hazards. Some recent examples include the earthquake-triggered landslides after the 1999 Chi-Chi earthquake in Taiwan [9], the 2015 Gorkha earthquake in Nepal [10,11], the 2015 Mt. Kinabalu earthquake in Malaysia [12], the 2018 Lombok earthquakes in Indonesia [10], and the 2018 Hokkaido Eastern Iburi earthquake in Japan [10,13–17]. Examples of rainfall-triggered landslides include the 2009 Typhoon Morakot in Taiwan [18], the 2011 Typhoon Talas in Honshu, Japan [19], the 2015 heavy rain in Chin State, Myanmar [20], and the 2017 heavy rain in Kyushu, Japan [16].

In general, SAR-based change detection can be classified into two categories–the coherent change detection (CCD) and incoherent change detection (ICD)–depending on whether interferometric phase is used [15]. Speaking of change detection for landslides, CCD includes the pre-failure landslide monitoring by using interferometric phase timeseries, as well as the post-failure landslide detection by comparing the change of the interferometric phase qualities (interferometric coherence). In the post-failure landslide detection, a landslide patch is usually depicted by low coherence due to the changes in surface geometry, roughness and dielectric properties. It works particularly well in regions with intermediate-to-high pre-event coherence [10,12,15]. However, in places where phase decorrelation occurs constantly, CCD may fail to provide accurate landslide information. One such a place is the forest, where volumetric decorrelation and temporal decorrelation prevail due to frequent changes in vegetation structures and dielectric properties, as well as the disturbance by winds, rains, water vapors and other atmospheric conditions [21]. It has been shown that coherence-based landslide detection methods may yield unreliable results in the forest areas where the pre-event coherence is too low [10,15].

On the other hand, ICD compares SAR backscattering amplitudes or intensities before and after the landslide event. Similar to interferometric coherence, changes in intensity are also associated with variations in surface geometry, roughness and dielectric properties. Backscattering intensities are, however, less sensitive to atmospheric conditions, and they also appear in a wider range of values that allow a better separation of major changes (such as from vegetation to bare surface) from minor variations (such as tree growth). In addition, most ICD methods are relatively easy to implement as they only involve image-wise calculations (except for the intensity correlation method in [15,19,20]). ICD has therefore been adopted more widely for landslide detection than CCD, especially for forest areas [13,15,17,19,22].

Single-polarimetric (single-pol) SAR data is by far the most commonly-used datatype in ICD. Having said that, multi-polarimetric (multi-pol) datasets and their decomposition parameters can also be used in ICD. SAR polarimetry, or PolSAR, takes into account of both the intensities and phases from the same image epoch acquired at different transmittingreceiving polarizations (HH, HV, VV and VH). Polarimetric decomposition then recombines the complex scattering coefficients to extract parameters that can directly infer physical properties of the scatterers. These decomposition parameters, even from a single image epoch, have been proved to be efficient in differentiating land cover types including landslides [9,16,23,24]. Some studies also use PolSAR datasets and decomposition parameters in the dual-temporal (1 pre-event and 1 post-event image) change detection to improve the result accuracy and to gain physical insights [16,18,25]. In comparison, PolSAR's detection

capability in a multi-temporal context (≥2 pre-event images and 1 post-event image) has not received much attention yet and deserves more exploration.

This study attempts to incorporate the multi-temporal PolSAR decomposition parameters into SAR-based landslide detection in forest areas, and compares the performance with those from single-pol datasets. This comparison is out of a practical consideration: Given more and more public and private players in SAR space missions, we expect the list of SAR sensors to increase and the overpass latency to shorten substantially in the near future. Soon SAR-based landslide detection may be not so much limited by data availability, and hence our knowledge about how different data properties affect the detection performance will be pivotal. Such knowledge will help researchers and government agencies make decisions before choosing the best dataset for landslide mapping. It will also provide information about the uncertainties and limitations especially for a rapid-response product.

The goal of this study is therefore to evaluate the influence of different SAR data properties on landslide detection, including radar wavelengths, spatial resolutions, polarizations and viewing geometry. Different datatypes, or combinations of the properties listed above, are produced from some of the most commonly-used sensors (the L-band ALOS-2, C-band Sentinel-1 and X-band COSMO-SkyMed) to facilitate this performance study. To unify the comparison basis, we carry out change detection using a newly-designed algorithm Growing Split-Based Approach (GSBA). GSBA takes in a SAR-derived value, either a backscattering intensity or decomposition parameter, normalized over its time-series variations and computes the Bayesian probability of landslides given that value. We compare the detection performance in two landslide cases, the earthquake-triggered landslides due to the 2018 Hokkaido Eastern Iburi Earthquake in Japan, and the rainfall-triggered landslide caused by the 2017 heavy rain in the catchment of Putanpunas River, southern Taiwan. With results from these two cases, we discuss how different data properties affect the landslide detection, and what may limit the detection efficacy.

## **2. SAR Data Processing**

To achieve our objectives, we need to produce multiple datatypes from the same SAR data (see Table 1 for the list of SAR data used in this study). Next we describe the processing flows for two major datatype categories and the generation of Z-score maps, which are the input to the GSBA algorithm.


**Table 1.** List of SAR data used in this study.


**Table 1.** *Cont.*

\* <sup>1</sup> S-1 = Sentinel-1; CSK = COSMO-SkyMed; A = ascending; D = descending. \*<sup>2</sup> UHR = ultra-high resolution; HR = high resolution; MR = medium resolution; LR = low resolution. \*<sup>3</sup> Full-pol = full-polarimetric; Dual-pol = dual-polarimetric; Single-pol = single-polarimetric. H = horizontally polarized; V = vertically polarized. The first letter in the combination stands for the polarization of the transmitted wave, and the second is for the received wave.

#### *2.1. Single-Polarization: Backscattering Coefficient (σ* 0 *)*

Backscattering coefficient *σ* 0 is the normalized measure of the radar signal's strength reflected by a distributed target. From the single-look complex (SLC) stack, the general processing steps to obtain *σ* 0 include radiometric calibration [26], speckle noise attenuation [27,28], multilooking and geocoding (Figure 1). For Sentinel-1 (S-1) data, thermal noise removal is carried out concurrently with radiometric calibration before converting the digital numbers to *σ* 0 [29]. We only process the co-polarized *σ* 0 stacks (HH or VV) for their higher sensitivity to surface-related scattering [30]. All data processing is carried out using the graph processing tool (gpt) in SeNtinel Application Platform (SNAP) and the InSAR Scientific Computing Environment (ISCE, for ALOS-2 ScanSAR data only) built in a high-performance computing cluster. Two auxiliary datasets, local incidence angle (LIA) and layover-shadow mask, are also generated during data processing [31]. LIA is the angle between the radar incidence direction and the slope normal vector. Small LIAs indicate slopes facing the satellite, while large LIAs indicate either slopes facing the satellite but significantly deviating from the line-of-sight (LOS) direction, or slopes facing away from the satellite.

**Figure 1.** Processing flows for different SAR datatypes used in this study.

## *2.2. Multi-Polarization: Degree of Polarization (mDP) and Scattering Powers*

For a dual-polarimetric SAR data, we follow the same steps as those in the *σ* <sup>0</sup> processing flow to generate geocoded complex scattering coefficients *Sij* (*i*, *j* = *H* or *V*) (Figure 1). We then construct the 2 × 2 covariance matrix **C<sup>2</sup>** [32],

$$\mathbf{C\_{2}} = \langle \mathbf{k} \cdot \mathbf{k}^{\*T} \rangle = \begin{bmatrix} \mathbf{C\_{11}} & \mathbf{C\_{12}} \\ \mathbf{C\_{21}} & \mathbf{C\_{22}} \end{bmatrix} = \begin{bmatrix} \left< |\mathbf{S\_{ii}}|^{2} \right> & \left< \mathbf{S\_{ii}} \mathbf{S\_{ij}^{\*}} \right> \\ \left< \mathbf{S\_{ij}} \mathbf{S\_{ii}^{\*}} \right> & \left< |\mathbf{S\_{ij}}|^{2} \right> \end{bmatrix} \tag{1}$$

where h·i indicates spatial ensemble averaging (using a 5 × 5 window in our case) and **k** is the target vector,

$$\mathbf{k} = \begin{bmatrix} \mathbf{S}\_{li} \\ \mathbf{S}\_{lj} \end{bmatrix} \tag{2}$$

From **C<sup>2</sup>** we can derive the 2D Barakat degree of polarization *m*DP, which is defined as the ratio between the intensity of the polarized portion to that of the total intensity [33],

$$m\_{\rm DP} = \sqrt{1 - \frac{4 \times \det(\mathbf{C\_2})}{\left(\text{Tr}(\mathbf{C\_2})\right)^2}}\tag{3}$$

*m*DP represents the anisotropy from polarization structures, whereas the scattering randomness, 1 − *m*DP*β* (*β* is a measure of the relative dominance of polarized scattering), is considered as the dual-pol radar vegetation index [34].

For the full-pol SAR data (ALOS-2 A122 in Table 1), we form the 3 × 3 coherency matrix **T<sup>3</sup>** [30]:

$$\begin{aligned} \mathbf{T\_3} = \langle \mathbf{k} \cdot \mathbf{k}^{\*T} \rangle &= \begin{bmatrix} T\_{11} & T\_{12} & T\_{13} \\ T\_{21} & T\_{22} & T\_{23} \\ T\_{31} & T\_{32} & T\_{33} \end{bmatrix} \\ = \frac{1}{2} \begin{bmatrix} \langle \mathbf{S\_{HH}} + \mathbf{S\_{VV}} \rangle^2 \\ \langle (\mathbf{S\_{HH}} + \mathbf{S\_{VV}})^2 \rangle & \langle (\mathbf{S\_{HH}} + \mathbf{S\_{VV}})(\mathbf{S\_{HH}} - \mathbf{S\_{VV}})^\* \rangle & 2 \left\langle (\mathbf{S\_{HH}} + \mathbf{S\_{VV}})\mathbf{S\_{HV}^\*} \right\rangle \\ \langle (\mathbf{S\_{HH}} - \mathbf{S\_{VV}})(\mathbf{S\_{HH}} + \mathbf{S\_{VV}})^\* \rangle & \langle (\mathbf{S\_{HH}} - \mathbf{S\_{VV}})^2 \rangle & 2 \left\langle (\mathbf{S\_{HH}} - \mathbf{S\_{VV}})\mathbf{S\_{HV}^\*} \right\rangle \\ 2 \langle \mathbf{S\_{HV}}(\mathbf{S\_{HH}} + \mathbf{S\_{VV}})^\* \rangle & 2 \langle \mathbf{S\_{HV}}(\mathbf{S\_{HH}} - \mathbf{S\_{VV}})^\* \rangle & 4 \langle |\mathbf{S\_{HV}}|^2 \rangle \end{bmatrix} \end{aligned} (4)$$

where the target vector **k** is defined as [30]

$$\mathbf{k} = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathbf{S}\_{HH} + \mathbf{S}\_{VV} \\ \mathbf{S}\_{HH} - \mathbf{S}\_{VV} \\ \mathbf{2} \mathbf{S}\_{HV} \end{bmatrix} \tag{5}$$

From **T3**, we carry out a Model-Free 3-Component decomposition for Full-pol data (MF3CF) which jointly considers the Barakat degree of polarization and the received wave information to allow the estimation of scattering powers without any assumption of scattering models [35]. The odd-bounce surface scattering power *P<sup>s</sup>* , even-bounce scattering power *P<sup>d</sup>* and the diffused (volumetric) scattering power *P<sup>v</sup>* can be estimated as [35]:

$$\begin{cases} P\_s = \frac{m\_{\rm FP} \text{Span}}{2} (1 + \sin 2\theta\_{\rm FP}) \\\ P\_d = \frac{m\_{\rm FP} \text{Span}}{2} (1 - \sin 2\theta\_{\rm FP}) \\\ P\_v = \text{Span} (1 - m\_{\rm FP}) \end{cases} \tag{6}$$

where *m*FP is the 3D Barakat degree of polarization,

$$m\_{\rm FP} = \sqrt{1 - \frac{27 \times \det(\mathbf{T\_3})}{\left(\text{Tr}(\mathbf{T\_3})\right)^3}}\tag{7}$$

Span = *T*<sup>11</sup> + *T*<sup>22</sup> + *T*<sup>33</sup> , and the scattering type parameter *θ*FP is defined as,

$$\tan \theta\_{\rm FP} = \frac{m\_{\rm FP} \text{Span} (T\_{11} - T\_{22} - T\_{33})}{T\_{11} (T\_{22} + T\_{33}) + m\_{\rm FP}^2 \text{Span}^2} \tag{8}$$

The calculation of 2D degree of polarization and scattering powers is carried out in PolSAR-tools available at https://github.com/Narayana-Rao/PolSAR-tools (accessed on 10 March 2022).

#### *2.3. Generating Z-Score Maps*

To standardize the change detection flow for various data values (*σ* 0 , *m*FP and scattering powers), we adopt a dimensionless Z-score (*Z*) of the post-event value normalized by the statistical information obtained from its pre-event time-series. It is calculated as [36],

$$Z = \frac{\mathcal{Y}\_{post} - \overline{\mathcal{Y}}\_{pre}}{\sigma\_{pre}} \tag{9}$$

where *ypost* is the data value on the post-event epoch, and [*ypre*, *σpre*] are the mean and standard deviation calculated from the pre-event time-series. The Z-score value represents the difference between a pixel's change value and its background mean value in the unit of background standard deviations. The background mean and standard deviation are derived from multiple pre-event epochs, and therefore may avoid the issues associated with a single biased pre-event image, a consideration commonly seen in a dual-temporal change detection scheme [37]. It also allows a more robust detection of minor changes if the pre-event time-series contains relatively stable values [36].

Among different datatypes, however, we caution that the Z-score map for CSK data in the Hokkaido case (Table 1) may not be as accurate due to the insufficient number of pre-event images–no acquisition exists before 16 July 2017. The temporal standard deviations thus derived tend to be too large and yield Z-score values smaller than those generated from other datasets. To work around this problem, we estimate the spatial standard variation within a 21 × 21 window centered at each pixel, and use it as *σpre* in Equation (9) when the value is smaller than the temporal standard deviation. The Z-score values thus generated are visually more comparable to other Z-score maps. The detection results, however, may still be less accurate because of incomplete pre-event information.

In addition to Z-score maps generated from the aforementioned datasets, we further generate a Z-score map that combines the positive Z-score values in *P<sup>s</sup>* (*Z*P*<sup>s</sup>* ) and negative values in *P<sup>v</sup>* (*Z*P*<sup>v</sup>* ),

$$Z\_{\mathcal{P}\_{\mathcal{E}}} = \begin{cases} \begin{array}{c} Z\_{\mathcal{P}\_{\mathcal{V}}} \text{ if } Z\_{\mathcal{P}\_{\mathcal{V}}} < 0 \text{ and } |Z\_{\mathcal{P}\_{\mathcal{V}}}| > |Z\_{\mathcal{P}\_{\mathcal{S}}}|\\ \end{array} \\ Z\_{\mathcal{P}\_{\mathcal{S}}} \text{ otherwise} \end{cases} \tag{10}$$

where *Z*P*<sup>c</sup>* stands for the combined Z-score map. The necessity and performance of such a combined Z-score map will be further demonstrated in Section 4. Next, we describe how these Z-score maps are used in the GSBA change-detection algorithm.

#### **3. Change Detection Method**

The method proposed in this study is a variation of the split-based approach (SBA), which was first proposed for SAR-based flood mapping [38]. SBA is designed for the generalization of a mapping algorithm regardless of the image's spatial resolution, swath size and the target's geospatial distribution. The basic idea is to separate the image into multiple, non-overlapping tiles (also called splits). The tiles are then checked one by one to identify the existence of a certain proxies that signify the changes. Some suggested proxies include standard deviation [38], coefficient of variation [39,40], and the ratio between the tile mean and the global mean [40]. Tiles with proxies above a given threshold are selected to jointly determine the global threshold either via a non-parametric approach such as the

Otsu method [41] or the KI algorithm [42], or through a parameterized fitting to the data histogram in order to determine the best cutoff point [43,44].

One variation of SBA is the hierarchical split-based approach (HSBA) [45]. Different from the conventional SBA which adopts a uniform tile size [38,40,46,47], HSBA adaptively and hierarchically splits the image to variable sizes until a bimodal histogram (representing the change and non-change classes) can be identified in the tile or when a minimal tile size is reached. This way it avoids the need of a pre-defined tile size, which in some cases may compromise the detection if the change area within a tile is extremely large or small. The issue with HSBA is that the splitting process is nonlinear-every split depends on the result of the previous split, and hence the computation time can be long when the image is large.

Instead of the top-down splitting strategy of HSBA, here we propose a bottom-up approach called Growing Split-Based Approach (GSBA) (Figure 2). We initialize the image splitting as in the conventional SBA. Once the tiles with changes are identified, we "grow" a patch within each tile cluster until the maximum patch area with a consistent bimodal histogram is reached. This growing step produces patches of different areas that mimic the variable tile size obtained by HSBA. The second variation in GSBA is that rather than obtaining a global threshold from the patches to generate a binary map, we calculate the Bayesian probability of changes instead [48]. Given the probability map, we can obtain a binary change map at a given cutoff probability (by default 0.5).

**Figure 2.** Growing Split-Based Approach (GSBA) workflow.

The last variation in GSBA is that instead of using a single tile size, the flow described above is repeated at different tile sizes. This practice is to acknowledge the observation that depending on the size and spatial distribution of changes, a single arbitrary tile size may in some cases produce results that fall into a local minimum or maximum of change area. With multiple binary maps generated at different tile sizes, we can select the one with intermediate spatial clustering (using Ripley's K, see step (g) in Appendix A) as the final output. In other cases where different tile sizes produce similar binary maps, this practice also offers reassurance regarding the robustness of the detection output. The processing flow in GSBA is linear and can be fully parallelized, so the increase in computation time can be minor in a multi-processor computing system.

To avoid distraction from the main focus of this paper, details of the GSBA algorithm are given in the Appendix A. The output Bayesian probability map is used in the following Receiver Operating Characteristics (ROC) curve analysis. To generate the ROC curves, we calculate the true positive rate (TPR, or recall) and false positive rate (FPR) at different cutoff probabilities. They are defined as follows:

$$\begin{cases} \text{TPR} = \text{TP/(TP} + \text{FN)}\\ \text{FPR} = \text{FP/(FP} + \text{TN)} \end{cases} \tag{11}$$

where [T, F] stand for true and false, [P, N] stand for positive and negative, and any twoletter combination means the number of such pixels identified through validation. We further analyze the area under curve (AUC) to evaluate the overall detection performance. This continuous tracking of trade-off effects between TPR and FPR allows users to have a complete and visual overview of the detection performance [15]. To compare with other studies using the same landslide cases, we also report the overall accuracy (OA) by using the final binary maps from GSBA. The overall accuracy is defined as:

$$\text{OA} = (\text{TP} + \text{TN}) / (\text{TP} + \text{TN} + \text{FP} + \text{FN}) \tag{12}$$

Note that these metrics are calculated at each datatype's spatial resolutions, which means, the validation dataset is resampled to match the resolution of the SAR images.

#### **4. Results**

#### *4.1. Case Study 1: Earthquake-Triggered Hokkaido Landslides in Japan*

Widely-distributed landslides occurred due to seismic shaking during the 6 September 2018 Mw 6.6 Hokkaido Eastern Iburi Earthquake (Figure 3). After the earthquake, the Geospatial Information Authority of Japan (GSI) acquired aerial photos on 6 and 11 September over the landslide areas and identified more than 3300 landslide patches manually (https://www.gsi.go.jp/BOUSAI/H30-hokkaidoiburi-east-earthquake-index.html#1, accessed on 10 March 2022). Visual comparison between the aerial photos taken on 6 and 11 September shows that only minor changes occurred between these two dates, and hence the majority of the landslides have existed since the earthquake.

**Figure 3.** Distribution of earthquake-triggered landslides in Hokkaido, after the 6 September 2018 Mw. 6.6 Hokkaido Easter Iburi Earthquake. Red star in (**a**) represents the epicenter; black box represents the full extent of AOI in (**b**). Background images are aerial photos taken on 6 and 11 September by GSI. Orange polygons are the manually identified landslide patches based on these aerial photos (source: https://www.gsi.go.jp/common/000204728.zip, accessed on 10 March 2022).

We define a large area of interest (AOI) of 480 km<sup>2</sup> that covers most of the landslide patches (Figure 3). Within the large AOI we further define four test areas, three of which are the same as those chosen by Jung and Yun (2020) [15]. This choice allows us to compare the effect of different spatial resolutions-they adopt the ALOS-2 PALSAR-2 Ultra-Fine HH-pol SAR data of 3-m resolution, while the data used in this study is the 6-m High-Sensitive full-pol SAR. In addition, we define a fourth test area around the Atsuma Reservoir in order to examine the effect of water bodies during landslide detection.

#### 4.1.1. Qualitative Comparison

From the three sensors used in this case study (Table 1), we produce the following seven SAR datatypes and one combined Z-score map to carry out the performance comparison:


In Figure 4 we choose Test Area 2 to demonstrate the differences among some of the selected datatypes. Both the high-res L-band HH-pol *σ* <sup>0</sup> and the ultra-high-res X-band HH-pol *σ* <sup>0</sup> provide sharp outlines for landslides compared to the multi-pol datatypes (*m*DP and *Ps*). This difference is caused by the spatial ensemble averaging carried out on the polarimetric datasets. The C-band VV-pol *σ* 0 , acquired at a lower spatial resolution, does not capture the landslide boundaries as clearly. In addition, many pixels in the landslide patches contain low Z-score values, suggesting that these pixels are indistinguishable from non-landslides. The same phenomenon is also observed more in the X-band *σ* 0 than in the L-band *σ* 0 , implying that shorter radar wavelengths may not perform as efficient in landslide detection within forest areas, possibly due to their higher sensitivity to small-scale changes in vegetation during the pre-event periods.

Next, we examine the effect of viewing geometry on different polarimetric combinations in the L-band datatypes. The *P<sup>s</sup>* datatype shows strong positive Z-scores for the landslides on the slopes with small LIAs. On the slopes with large LIAs, the landslide patches contain low to nearly zero *P<sup>s</sup>* Z-scores (pointed by yellow arrows in Figure 4). The same is also seen on the *m*DP Z-score map. In comparison, the L-band *σ* <sup>0</sup> may still show clear and predominantly negative Z-score values on the slopes with large LIAs. We investigate the Z-score maps from other scattering powers and find that, instead of a positive increase in *P<sup>s</sup>* Z-score, landslide patches with larger LIAs tend to show a stronger decrease in *P<sup>v</sup>* Z-score (yellow arrows in Figure 5). This different dependency on LIA between *P<sup>s</sup>* and *P<sup>v</sup>* is also observed in other landslide cases [23]. Numerical simulation in [49] offers some possible explanation to this phenomenon, in which backscattering energy due to direct-ground reflection and crown-ground interaction decreases with LIA, while the energy due to direct-crown backscattering increases. In other words, landslides on the slopes with larger LIAs are sensed as "loses in tree crowns" instead of "increases in bare surfaces". That means *P<sup>s</sup>* or *P<sup>v</sup>* each only carries half of the information about landslides. It is therefore necessary to consider both values when carrying out landslide detection, such as by combining them into a joint Z-score map *Z*P*<sup>c</sup>* (Equation (10) and Figure 5).

**Figure 4.** A qualitative comparison among different datatypes for Hokkaido Test Area 2 (see Table 1 for data sources). The value 3 m, 6 m and 15 m are the spatial resolutions, with the asterisk (\*) indicating images processed by taking spatial ensemble averaging. DEM: digital elevation model. LIA: local incidence angle. The LIA maps for the multi-pol datatypes are the same as those in the single-pol. Θ: satellite look angle. Yellow arrows on Z-score maps indicate one particular landslide that is well captured by *σ* <sup>0</sup> but not by *m*DP and *Ps*.

**Figure 5.** Comparison of Z-score maps generated from the odd-bounce surface scattering power (*Ps*) and the diffused (volumetric) scattering power (*Pv*), and the combined Z-score map (*P<sup>c</sup>* Zscore). Yellow arrows indicate places with low Z-score amplitudes in *P<sup>s</sup>* but high amplitude in *Pv*, a phenomenon associated with the local incidence angles (LIAs). Yellow arrows indicate landslide patches better depicted by combining the Z-score values from *Ps* and *Pv*.

The aforementioned complementary effect between *P<sup>s</sup>* or *P<sup>v</sup>* is not seen between the dual-pol *m*DP and the dual-pol radar vegetation index [34]. In fact, when comparing the Z-score maps generated from these dual-pol parameters, one appears as a sign-flipped image of the other–pixels in the landslide patches display similar amplitudes but opposite signs on the two Z-score maps. This is probably due to the incomplete information of scattering properties in dual polarization. We therefore do not attempt to combine these two parameters and remain with the *m*DP-only Z-score map. Next we will look into the quantitative comparison of detection performance among the eight listed datatypes.

#### 4.1.2. Quantitative Comparison

Figure 6 shows the Bayesian probability maps calculated from different datatypes. Overall, the full-pol *P<sup>c</sup>* depicts the most complete shape of the landslide bodies, followed by the single-pol *σ* <sup>0</sup> at different radar wavelengths. The dual-pol datatypes detect only few landslides. The X-band *σ* 0 captures an additional high-probability patch in Test Area 1 with sharp outlines. This patch is possibly related to crops, and the false detection is likely associated with the lower number of pre-event images. In areas where water bodies exist (Test Area 4), the single-pol *σ* <sup>0</sup> may detect changes associated with both the landslides and the water bodies, while the multi-pol datatypes are relatively free of such confusions.

**Figure 6.** Bayesian probability maps obtained from different datatypes in the Hokkaido landslide case. The asterisk (\*) indicates the image is processed by taking spatial ensemble averaging using a 5 × 5 window.

The performance of each datatype is shown in the ROC curves (Figure 7). In Test Area 1, the ROC curves for the L-band *σ* <sup>0</sup> and *P<sup>c</sup>* show a steep increase of TPR at low FPR, yielding AUC values as high as 0.83 to 0.86 (Figure 7f). These values are comparable to the detection results based on the ALOS-2 Ultra-Fine HH-pol multi-temporal intensities of 3-m spatial resolution (AUC = 0.79) [15]. The AUC for X-band *σ* <sup>0</sup> and L-band *P<sup>s</sup>* rank the second and third at 0.77 and 0.73. The L-band *m*DP, C-band *m*DP and C-band *σ* <sup>0</sup> perform poorly, yielding AUC values lower than 0.7.

**Figure 7.** Receiver operating characteristics (ROC) curves and area under curve (AUC) for (**a**–**d**) the four test areas and (**e**) the full area of the Hokkaido AOI (Figure 3). The numbers shown in (**f**) the AUC plot are the values from the full-pol *Pc*. Squares of different colors indicate the positions of the GSBA binary maps generated at a cutoff probability of 0.5.

The landslide patches in Test Area 2, 3 and 4 result in slow-rising ROC curves and low AUC values in almost all datatypes (Figure 7b–d,f). Among them, the full-pol *P<sup>c</sup>* still gives the highest and more stable AUC values, above 0.7 for all three test areas. All the other datatypes yield AUC values lower than 0.7. The C-band *σ* <sup>0</sup> yields the lowest AUC in Test Area 2 and 3–lower than 0.6. In Test Area 4, the L-band *σ* <sup>0</sup> gives the lowest AUC values as a result of false detection over the reservoir lake (Figures 3 and 6).

For the full area, the L-band full-pol *P<sup>c</sup>* has superior performance in landslide detection with AUC = 0.77 (Figure 7f). The comparison between *P<sup>c</sup>* and *Ps*-only AUC values again confirms that the combination of *P<sup>s</sup>* and *P<sup>v</sup>* is necessary for landslide detection. The fullarea AUC value for the high-res L-band *σ* <sup>0</sup> and ultra-high-res X-band *σ* 0 ranks the second and third, higher than the medium-res C-band *σ* 0 . We can therefore confirm that spatial resolutions plays a role as important as radar wavelengths in SAR-based landslide detection. The performance of the dual-pol *m*DP is least favored because of its lower full-area AUC and its tendency to detect fewer landslides.

In Figure 7a–e, we also mark the positions of the binary change maps determined by GSBA on each ROC curve. These solutions are located close to the turning points of the curves, but slightly leaning towards the lower-FPR end. These positions suggest that the default cutoff probability of 0.5 tends to create conservative change maps with a higher positive likelihood ratio (TPR-to-FPR ratio).

#### *4.2. Case Study 2: Rainfall-Triggered Putanpunas Landslide in Southern Taiwan*

In the second case, we look into a different scenario–a rainfall-triggered landslide. During the first 4 days in June 2017, a total of nearly 900 mm of precipitation occurred

in the upstream of the Laonong River (according to the records from weather station C0V210, available at https://e-service.cwb.gov.tw/HistoryDataQuery/index.jsp, accessed on 10 March 2022) (Figure 8). Three days later on 7 June 2017, a landslide-associated earthquake (landquake) was detected through the Real-time Landquake Monitoring System (RLAMS, http://collab.cv.nctu.edu.tw/older/catalog\_20171231.html, accessed on 10 March 2022) [50]. Given the order of occurrence between the torrential rain and the landquake, we attribute this event to a rainfall-triggered landslide.

**Figure 8.** Basemap of the Putanpunas River catchment and the landslide on 7 June 2017. Blue rectangle represents the AOI. Orange polygon: the landslide patch from manual mapping. C0V210 is the weather station. Source of image: SPOT-6/7 acquired on 5 July 2017 overlaid on Google Earth Pro Image © 2022 CNES/Airbus.

To determine the landslide area associated with the 7 June 2017 landquake, we use SPOT-6/7 images acquired on 18 April 2017 and 5 July 2017 to carry out manual detection. Within a 15-km radius (approximately the estimation error of landquake relocation) of the epicenter, we locate a new sliding patch in the existing landslide area of the Putanpunas River catchment. Actually, there have been repeating landslides in this catchment since typhoon Morakot in the year of 2009 [18,51], making this catchment one of the most actively evolving landslide area in southern Taiwan. Different from the multiple small-scale landslides triggered by the Hokkaido Eastern Iburi Earthquake, this rainfall-triggered Putanpunas landslide contains a single patch of a relatively large area, up to 400,000 m<sup>2</sup> (Figure 8). We caution that the actual landslide patch may differ from what we map here due to the longer latency between the event date and the post-event optical image.

#### 4.2.1. Qualitative Comparison

In this case study, we produce the following five datatypes for comparison (note that different viewing geometry is involved):


Some datasets allow dual-pol polarimetric combinations, such as HH-HV for ALOS-2 track D27 and VV-VH for both Sentinel-1 tracks (Table 1). However, in the Hokkaido case we have demonstrated that the use of dual-pol *m*DP datatype tends to capture fewer landslides, so we decide not to adopt the dual-pol datatype in this case study. Here we wish to focus on the detection performance among the *σ* <sup>0</sup> datatypes of different wavelengths, spatial resolutions and viewing geometry.

Figure 9 shows how different *σ* <sup>0</sup> datatypes compare visually. The ascending tracks show more prominent landslide signatures on the Z-score maps compared to those from the descending tracks, regardless of the wavelengths and spatial resolutions. The intriguing point is that the viewing geometry from ascending tracks yield larger LIAs compared to those from descending tracks. In addition, we find that the viewing geometry from descending tracks produces large layover zones that tend to overlap with the slopes of small LIAs. Among all sensors, the X-band CSK descending track produces the largest layover area, which also corresponds to the smallest look angle (27◦ ) among all three sensors. Within the layover zones, landslide-related changes may still be recorded but are mixed with energies from multiple ground targets of the same range distance, leading to brighter pixels and stretched patterns after geocoding (Figure 10). We also notice that the current layover-shadow mask does not necessarily mask out all layover areas (Figure 10), which is possibly caused by errors in DEM or limitations in the SAR geometric distortion simulation [52].

**Figure 9.** A qualitative comparison among different datatypes for the rainfall-triggered Putanpunas landslide on 7 June 2017 in southern Taiwan. White and black pixels on the LIA maps are layover and shadow zones, respectively. Refer to Figure 4 captions for more information.

**Figure 10.** (**Left**) Slopes within the layover zones appear as brighter and stretched pixels in the postevent X-band CSK *σ* 0 image. In comparison, slopes in non-layover areas show typical speckle textures. Yellow polygon is the 7 June 2017 landslide patch. (**Right**) After applying the layover-shadow mask (white and red pixels), some severely stretched patterns remain unmasked (blue arrows) possibly due to errors in DEM or incorrect prediction in the SAR geometric distortion simulation.

We have to emphasize that the same layover-shadow masks are also calculated and applied on the SAR images in the previous Hokkaido landslide case. However, given the smaller slope angles (Figure 4), the layover and shadow effects are nearly negligible. In comparison, the Putanpunas River catchment has a large topographic relief and hence greater slope angles (Figure 9), resulting in a larger portion of unusable data within the layover zone.

#### 4.2.2. Quantitative Comparison

Figure 11 shows the Bayesian probability, ROC curves and AUC values for different datatypes. The ascending tracks show better performance than descending tracks, with the highest AUC of 0.78 from the medium-res C-band VV *σ* 0 . The ascending L-band HH *σ* 0 , albeit its low spatial resolution (60 m), still yields an AUC value of 0.71. On the other hand, the descending tracks do not detect as many changes, with an AUC value of 0.65 for the medium-res C-band VV *σ* <sup>0</sup> and 0.64 for the ultra-high-res X-band HH *σ* 0 . No change is detected from the descending L-band HH *σ* 0 . We should point out that despite the similar AUC values from the descending C-band VV *σ* <sup>0</sup> and the X-band HH *σ* 0 , their effective area (unmasked area) is different–more than 50% of the catchment is under the layover-shadow mask of the X-band data.

**Figure 11.** (**a**) Bayesian probability maps obtained from different *σ* <sup>0</sup> datatypes for the Putanpunas landslide on 7 June 2017. (**b**) ROC curves and (**c**) AUC values for different datatypes. Blue numbers in parentheses are the percentage of effective area within the AOI.

#### **5. Discussion**

#### *5.1. How Data Properties Affect Detection Performance*

In Table 2 we summarize the performance of different datatypes, including the fullarea AUC and OA. Note that AUC does not represent the performance of any particular detection outcome but a full spectrum of outcomes, whereas OA is calculated at a specific selection (e.g., at the cutoff probability of 0.5) to mimic the operator's choice. The OA values among different datatypes in the Hokkaido case, however, are all very similar and cannot reflect their actual relative performance. This similarity results from the large number of non-landslide pixels (TN in Equation (12)) when computing the OA values. To allow a better judgement of the relative performance at a specific landslide mapping outcome, we compute the full-area TPR at a fixed FPR of 0.1 (TPRFPR = 0.1). This value indicates the ratio of detectable landslides at the cost of 10% false positive detection. In the Hokkaido case, we further normalize the TPRFPR = 0.1 values by using the best TPRFPR = 0.1 from *P<sup>c</sup>* (Table 2). With the AUC and TPRFPR = 0.1 values, we discuss how the following data properties affect the performance of SAR-based landslide detection.


**Table 2.** Summary of full-area AUC, OA, TPR at PRF = 0.1 and the ratio of effective area (Ae) for the two case studies.

\* <sup>1</sup> UHR: ultra-high resolution; HR: high resolution; MR: medium resolution; LR: low resolution. \*<sup>2</sup> The TPR value at FPR = 0.1. Values in parentheses represent the normalized value by the best performance. \*<sup>3</sup> The ratio between the effective area (area outside the layover-shadow mask) and the full AOI.

**Radar wavelengths and spatial resolutions.** In the Hokkaido case, the L-band datatypes have better AUC and TPRFPR = 0.1 values than the other two radar wavelengths. This result seems to suggest that longer wavelengths work better in landslide detection. However, spatial resolutions can be an equally important factor. This inference is made from the poor performance of the medium-res C-band single-pol datatype–its result is worse than that from the X-band single-pol data at a higher spatial resolution. At the same time, the medium-res C-band single-pol data seems to perform relatively well in the Putanpunas case. This better performance is probably associated with the geometry of the landslide patches (small and distributed landslides in Hokkaido vs. one single large patch in Putanpunas), and the fact that the Putanpunas landslide is a repeated landslide on a bare surface instead of on a forested land (Figure 8).

**Polarizations.** In the Hokkaido case, the high-res L-band full-pol data can offer the best landslide detection capability in forest areas. It can even avoid false detection over water bodies. Even though the landslide boundaries become slightly blurry compared to those detected by using single-pol datatypes, the amount of information contained in a full-pol dataset and the detection performance thereof is indeed unparalleled. The L-band dual-pol datatype, despite a slightly better TPRFPR = 0.1, gives an AUC value lower than those from many single-pol datatypes. It also tends to detect fewer landslide patches. As dual polarization will be a major observation mode for some upcoming missions such as NISAR, more efforts are needed to explore a better utilization strategy for dual-pol data in landslide mapping.

**Viewing Geometry.** In the Hokkaido case, we do not see a notable difference in LIA between the detected and the missed pixels (Figure 12a). This similarity in LIA distributions suggests that large LIAs do not necessarily hinder landslide detection. In the Putanpunas case, landslides are even better detected on datatypes with larger LIAs (Figure 9). On the other hand, slopes with small LIAs may coincide with overlay zones, which are also seen in the Putanpunas case. The smaller (steeper) the satellite look angle (θ), the larger the overlay zone and the smaller the effective area. The percentage of effective area drops from 97% at θ = 44◦ (ALOS-2 D27) to 54% at θ = 27◦ (CSK-D) (Figure 9 and Table 2). As the area of layover and shadow depends on the specific topography of a region, a DEM-based SAR geometric distortion simulation should be executed before planning for a mission or submitting a tasking request in order to determine the best viewing geometry [52].

**Figure 12.** Normalized histogram for (**a**) the LIA and the post-to-pre-event difference in scattering powers (∆Power) for the (**b**) detected and (**c**) missed pixels in the L-band quad-pol datatype (detected and missed pixels are based on the GSBA binary map generated with the *P<sup>c</sup>* Z-score map). The histograms are computed over the entire Hokkaido AOI.

#### *5.2. Limitations in SAR-Based Landslide Detection*

In the Hokkaido case, the best full-area AUC is 0.77 and the best OA is 0.90 (Table 2). These values are close to the results produced by using the ultra-high-res L-band HH-pol multi-temporal intensity with different detection algorithms [15]. The similarity in limited performance from different studies suggests that there may exist some factors that prevent the SAR-based landslide mapping from achieving the high accuracy attained in optical image-based mapping [53]. We plot the histogram of the post-to-pre-event difference in scattering powers (∆Power) for the detected and the missed pixels, respectively, within the Hokkaido AOI (Figure 12b,c). While the histograms for the detected pixels are clearly skewed, the histograms for the missed pixels are centered and symmetric around zero, indicating no significant difference in any of the scattering powers before and after the landslide. This phenomenon is bewildering and needs some explanation.

The first possibility is the change of local topography. From a broader scale, there does not seem to be a systematic difference in LIA between the detected and the missed pixels (Figure 12a). However, the LIA is calculated using the pre-event global 30-m DEM [54]. The topography must have changed locally after the landslide. In fact, according to the post-event DEM obtained by airborne laser survey, the surface morphology within the landslide patches has changed substantially [55]. Features such as scarps and crown cracks can reach meters tall with very steep (nearly vertical) facets [55]. Field photos also reveal that large boulders, huge piles of dead trees and meter-scale surface undulations exist on the ground [55,56]. The chaotic distribution of these features may result in scattering

properties considerably deviating from what is expected from a pure land cover-type change (forest to bare surface).

Another equally important factor to consider is the spatial variation of water contents. Several studies show that the water contents can vary remarkably in the Hokkaido area, from 30% to 280% among different geological materials sampled at the same site [57,58]. As these materials are spread out during the landslide, the randomness in surface soil moisture within the landslide patches increases. To sum up, soil moistures, local topography, surface roughness and land cover changes together form a complicated backscattering field in the landslide patches, leading to clear changes of scattering powers in some places and no clear changes in other places. We may even hypothesize that it is radar wave's sensitivity to multiple physical properties that limits its performance in landslide detection. This hypothesis needs further validations though, such as through numerical simulation of 3D backscattering fields.

The increased randomness in backscattering fields within a landslide patch may explain why intensity correlation method can yield better detection results than those from pixel-by-pixel detection methods [15,19]. Intensity correlation is calculated based on a number of pixels within a moving window, which offers contextual information about the objects and their changes. Most important of all, it can potentially average out the randomness in the complicated scattering field within a landslide patch, leading to a smoother and less noisy detection result.

#### **6. Conclusions**

By applying a newly-designed change detection algorithm Growing Split-Based Approach (GSBA) on two different landslide cases, this study examines how different SAR data properties affect the performance of landslide detection. Our result shows that the high-resolution, full-polarimetric SAR datatype has unparalleled performance in landslide detection over forest areas. Single polarimetric datasets of high or ultra-high spatial resolution rank the next, regardless of their radar wavelengths. This result suggests that high spatial resolution is critical especially for detecting small and distributed landslides in forest areas. Datatypes of medium or low spatial resolution work better in detecting large landslide patches over bare surfaces; their detection performance decays significantly over small landslides in forest areas. Dual polarimetric datatypes have the worst performance among all; a better utilization strategy may be needed for their use in landslide detection. Different viewing geometry mainly impacts the effective detection area by creating layover and shadow zones. This problem is more severe in areas of large slopes (≥40–50◦ ), in which a steep viewing angle (<30◦ ) may render half of the image unusable for landslide detection. SAR geometric distortion simulation is recommended before planning for a mission or submitting a tasking request for landslide mapping purposes. Given the limited performance of SAR-based landslide detection (both in this study and in previous studies) as compared to that from optical image-based landslide detection, we propose that other confounding factors, including but not limited to local topography, surface roughness and soil moistures, are all contributing to the randomness in the backscattering field and hinder the detection of land cover changes. Such limitations need to be properly acknowledged when adopting SAR-based landslide mapping for emergency responses or post-hazard assessments.

**Author Contributions:** Conceptualization, Y.N.L.; data curation, Y.-C.C., Y.-T.K. and W.-A.C.; formal analysis, Y.N.L.; methodology, Y.N.L. and Y.-C.C.; project administration, Y.N.L.; software, Y.N.L. and Y.-C.C.; validation, Y.-C.C. and Y.-T.K.; writing—original draft, Y.N.L. and Y.-C.C.; writing review and editing, Y.-T.K. and W.-A.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by research grant AS-TP-108-M08-1 from Academia Sinica, Taiwan and MOST 110-2119-M-001-006 from the Ministry of Science and Technology, Taiwan to Y.N.L.

**Data Availability Statement:** ALOS-2 data between 2016 and 2018 used in this study are provided by JAXA via the EO-RA2 research program (PI number: ER2A2N006). Copernicus Sentinel data between 2016 and 2018 used in this study are processed by ESA and retrieved from ASF DAAC. The COSMO-SkyMed data are purchased from e-GEOS. The airborne optical images for the Hokkaido landslides are available through the Geospatial Information Authority of Japan.

**Acknowledgments:** We thank Hsin Tung for handling the COSMO-SkyMed dataset. We also thank two anonymous reviewers for their constructive suggestions which improve the quality of this paper. This work is IES paper number 2408.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. GSBA Algorithm**

The following description corresponds to the steps illustrated in Figure 2.

#### **(a) Initialize splitting and histogram fitting**

We first split the image into multiple tiles. After splitting, we fit the data histogram *h*(*z*) within each tile with a model histogram *g*(*z*) of tri-modal Gaussian distribution (modified after the bimodal Gaussian distribution in [48]):

$$\log(z) = \sum\_{i=1}^{3} G\_i = \sum\_{i=1}^{3} A\_i \exp\left[ -\frac{1}{2} \frac{\left(z - m\_i\right)^2}{s\_i^2} \right] \tag{A1}$$

where *g*(*z*) is the modeled histogram discretized at *z* bin centers, [*A<sup>i</sup>* , *m<sup>i</sup>* , *s<sup>i</sup>* ] is the amplitude, mean and standard deviation for each of the *i*-th Gaussian modes. Mode 1 (*G*1) and mode 3 (*G*3) imply negative and positive changes in Z-score values, respectively. Mode 2 (*G*2) has a mean value close to zero for the unchanged class (Figure A1). In the rest of this paper, we use Gaussian parameters to refer to [*A<sup>i</sup>* , *m<sup>i</sup>* , *s<sup>i</sup>* ] for the three modes. The curve-fitting optimization is carried out by using the fast nonlinear solver of the Levenberg-Marquardt algorithm [59].

**Figure A1.** Tri-model Gaussian distribution. Mode 1 (*G*<sup>1</sup> ) and Mode 3 (*G*3) are for the changes with decreased and increased Z-score values, whereas Mode 2 (*G*2) is for the unchanged class. SA is surface area under the colored curve. NA is the non-overlapping area, marked by slash stripes.

#### **(b) Select tiles using thresholds**

The tiles are selected based on the following proxies:

i. Ashman D coefficient (AD). It represents the separation between two modes. The value is defined as [60]:

$$\text{AD}\_{\bar{i}} = \sqrt{2} \frac{|m\_{\bar{i}} - m\_2|}{\sqrt{(s\_{\bar{i}}^2 + s\_2^2)}} \tag{A2}$$

ii. Bhattacharyya coefficient (BC). It represents the goodness of fit in terms of probability. It is defined as [61]:

$$\text{BC} = \sum\_{k} \sqrt{\frac{h(z\_k)}{\sum\_{k} h(z)}} \sqrt{\frac{g(z\_k)}{\sum\_{k} g(z\_k)}} \tag{A3}$$

where *k* stands for the *k*-th histogram bin.

iii. Surface ratio (SR). It represents the significance of the changes in terms of probability compared to the unchanged class. It is defined as [45]:

$$\text{SR}\_{i} = \frac{\min(SA\_{i\prime}, SA\_{2})}{\max(SA\_{i\prime}, SA\_{2})} \tag{A4}$$

where *SA* stands for the surface area for each mode (Figure A1).

iv. Non-overlapping Ratio (NR). It represents the significance of the changes in terms of the cumulative probability that is not overlapped with the unchanged class (*G*2). It is defined as:

$$\text{NR}\_{i} = \frac{NA\_{i}}{SA\_{i}} \tag{A5}$$

The first three proxies are also used by HSBA [45], while the last one (NR) is an additional proxy implemented in GSBA. The main purpose of NR is to weed out the tiles where the *G*<sup>2</sup> mode has a wide distribution that overlap significantly with the other two modes. Except for BC, the other three proxies are calculated for *G*<sup>1</sup> (*i* = 1) and *G*<sup>3</sup> (*i* = 3) mode separately. Table A1 shows the empirical tile selection thresholds used for landslide selection in this study.

**Table A1.** Tile selection thresholds for landslide detection.


#### **(c) Grow patches for consistent statistical distribution**

Next we try to merge the tiles in order to obtain more robust Gaussian parameters from clustered changes. Within each tile cluster, GSBA first chooses a random seed tile and identifies the neighboring tiles around it. The first round of histogram fitting (Equation (A1)) is carried out on the joint histogram between the seed and each of the neighboring tiles. The neighboring tiles with proxies above the thresholds shown in Table A1 are selected as the next round of seeds, and new neighboring tiles are identified around them. This process is repeated until all the tiles in the cluster are touched, or until the growing can no longer propagates onwards.

To avoid the situation where the first seed tile is significantly different from the rest of the tiles in the cluster, this growing process will be repeated a few times from a few randomly-selected seed tiles. The resultant patch with the largest number of merged tiles will be adopted. Through this growing process, the merged tiles (or a patch) contain consistent statistical distribution.

#### **(d) Fill Gaussian parameters**

After tile growing, we could have moved on to estimate the Bayesian probabilities by using the Gaussian parameters averaged over all patches. However, we found that in the case of landslide detection, where changes can be affected by spatially-varying factors such as local incidence angles [23], a global set of Gaussian parameters may not be the best solution. So instead, we keep the Gaussian parameters unchanged in the patches, and fill only the area outside the patches with the global average.

#### **(e) Calculate Bayesian probability**

The Bayesian probability is defined as [48]:

$$p(X|Z) = \frac{p(Z|X)p(X)}{p(Z|X)p(X) + p(Z|\overline{X})p(\overline{X})} \tag{A6}$$

where *X*, *X* stands for changes and non-changes, and *p*(*X*|*Z*) is the Bayesian probability of changes given the *Z*-score value of the pixel. The prior probabilities *p*(*X*) and *p X* are both set to 0.5 following the suggestions in [48]. We assume that the conditional probability for the changes *p*(*Z*|*X*) has a Gaussian probability density function (PDF) with parameters of [*A*e <sup>1</sup>, *m*1,*s*1] for negative Z-scores or [*A*e3, *m*3,*s*3] for positive *Z*-scores:

$$p(Z|X) = \begin{cases} \frac{\tilde{A}\_1}{\sqrt{2\pi}\,s\_1} \exp\left[-\frac{1}{2}\frac{(Z-m\_1)^2}{\left(s\_1\right)^2}\right], \tilde{A}\_1 = \frac{A\_1}{A\_1+A\_2} \text{ if } Z < 0\\ \frac{\tilde{A}\_3}{\sqrt{2\pi}\,s\_3} \exp\left[-\frac{1}{2}\frac{(Z-m\_3)^2}{\left(s\_3\right)^2}\right], \tilde{A}\_3 = \frac{A\_3}{A\_3+A\_2} \text{ if } Z > 0 \end{cases} \tag{A7}$$

and the conditional probability for the non-changes *p*(*Z X*) also has a Gaussian PDF with parameters of <sup>h</sup> *A*e2, *m*2, *s*<sup>2</sup> i :

$$p\left(Z|\overline{X}\right) = \frac{\tilde{A}\_2}{\sqrt{2\pi}\,s\_2} \exp\left[-\frac{1}{2}\frac{\left(Z - m\_2\right)^2}{\left(s\_2\right)^2}\right], \quad \tilde{A}\_2 = \begin{cases} \frac{A\_2}{\overline{A\_1 + A\_2}} \,if\, Z < 0\\ \frac{A\_2}{\overline{A\_3 + A\_2}} \,if\, Z > 0 \end{cases} \tag{A8}$$

#### **(f) Derive binary change maps**

This step is relatively straightforward. By default, we adopt a cutoff probability of 0.5 on the Bayesian probability map in order to obtain the binary map.

#### **(g) Choose the final change map**

Step (a) to (f) will be repeated at multiple tile sizes. The list of tile sizes are determined based on the image dimensions and the preferred number of test sets. Currently we limit the tile size to be between 10 and 500 pixels, and the default number of test sets is between 4 and 8. After step (f), a binary change map will be generated at each tile size. To determine which one is the final output, GSBA calculates Ripley's K (*Kr*) for each map to estimate the spatial randomness of the change points [62,63]:

$$\begin{array}{c} \mathsf{K}\_{\mathsf{r}} = \frac{A}{n^2} \sum\_{i}^{n} \sum\_{j \neq i}^{n} I\_{\mathsf{r}} \left( \left| \boldsymbol{x}\_{i} - \boldsymbol{x}\_{j} \right| \leq r \right) \\\ I\_{\mathsf{r}} = \left\{ \begin{array}{c} 1 \ \boldsymbol{if} \left| \boldsymbol{x}\_{i} - \boldsymbol{x}\_{j} \right| \leq r \\\ 0 \ \boldsymbol{if} \left| \boldsymbol{x}\_{i} - \boldsymbol{x}\_{j} \right| > r \end{array} \right. \end{array} \tag{A9}$$

where *r* is a pre-defined distance of interest, *x<sup>i</sup>* and *x<sup>j</sup>* are the positions of any two change pixels, n is the total number of change pixels, and *A* is the image area. Ripley's K is a geospatial index to tell if points are dispersive or clustered in space. When the point distribution is close to complete spatial randomness, *K<sup>r</sup>* will be close to *πr* 2 . The higher the value, the more clustered the points. For a more efficient calculation of *K<sup>r</sup>* , we down-sample the change map to 100 m × 100 m resolution, and set *r* = 100 m. After computing the *K<sup>r</sup>* value for all binary maps, we choose the one with intermediate *K<sup>r</sup>* value as our final change map.

#### **References**

