**1. Introduction**

Rainfall-triggered landslides pose a severe threat to societies on all continents [1,2]. Rainfall thresholds are therefore essential for characterizing landslide hazard and developing early warning systems [3–5]. Empirical approaches define thresholds on scales ranging from local [6,7] to regional and global [8,9], based on the observed relation between dated landslides and rainfall characteristics such as intensity, accumulation, duration, or antecedent rainfall (*AR*) conditions [10]. However, rainfall is only a proxy for what is regarded as the main trigger of landslides, i.e., the development of high pore-water pressure in the subsurface, constrained by water infiltration [11,12]. Interacting with retention and drainage processes [12], infiltration is a highly complex process affected by a myriad of factors such as soil physical properties (e.g., soil suction head, porosity, hydraulic conductivity) and their variations through the soil column [13–15], presence of cracks [16], hillslope morphology [17], vegetation [11,18,19], antecedent rainfall conditions [15,20–22], and rainfall intensity [23,24]. In contrast to the empirical threshold definitions, process-based approaches incorporate such hydrophysical parameters through a spatially extended infinite-slope stability model [25]. However, the large amount of input data required for well-calibrated process-based thresholds explains why they are currently limited mostly to applications at the hillslope scale or to numerical simulations [4,21,25–27].

The estimation of empirical rainfall thresholds is associated with additional sources of uncertainty. Firstly, landslide inventories are inherently biased towards high-impact landslide events and the most accessible regions, while their accuracy is constrained by the scientific validity of the reporting sources, especially in data-scarce low-capacity environments [1,28–31]. Secondly, rainfall data comprise uncertainties related to the spatial representativeness of rain gauges or biases in satellite-derived estimates [32,33]. Thirdly, the definition of rainfall parameters, with intensity and duration forming the most frequently used parameter couple [3,5], varies strongly across studies [3]. Finally, the interdependence of intensity and duration is problematic, as it obscures the physical processes associated with the calculated thresholds [34].

In order to account for and characterize threshold uncertainties, a growing number of reproducible statistical techniques have been developed [3]. A weakness of such methods is, however, that they are generally tailored to a specific area and available data sets, which often prevents straightforward transferability to other regions and data sets [35]. Nevertheless, transferability is not only essential for evaluating and comparing landslide hazard over different regions of the world [10,36], but also valuable in the context of the increasing availability of ever higher-resolution data relevant for threshold analysis, such as rainfall estimates from global-scale satellite data [32].

The most influential statistical threshold techniques include the probabilistic approach through Bayesian inference [10,37], the use of receiver operating characteristics (ROC) analysis with different optimization metrics [38,39], and the frequentist approach developed by [40]. The Bayesian and ROC approaches compare conditions that did and did not result in landsliding, the former fundamentally relying on prior and marginal probabilities [37] and the latter attempting to balance the true and false positive rates derived from a confusion matrix [39]. When rainfall data are only available for conditions that triggered landslides, the frequentist method provides a quantitative way to exploit them and calculate thresholds. This method, as developed by [40] for the (*intensity*, *duration*) parameter couple of rainfall, calculates the least-squares fit of the log-transformed data and fits a Gaussian function to the probability density function of its residuals. Next, the Gaussian curve is used to adjust the intercept of the best-fit equation to the desired threshold, expressed in terms of exceedance probability [40]. In practice, this means that for a threshold at, e.g., the 5% exceedance probability level, there is a 0.05 probability that any landslide is triggered by rainfall conditions below the threshold. The quality of the thresholds obtained by this method depends on the size of the data set and on how well it covers the whole range of the parameters used [40]. An improvement of the frequentist method lies in the adoption of a bootstrapping statistical technique to assess the uncertainty of the parameters in the power-law threshold model [9]. Here, the bootstrap procedure involves many threshold calibrations (e.g., 5000 [9]), each of which is based on *n* data points randomly sampled (with replacement) from a data set of size *n*. The final threshold parameters and their associated uncertainties are calculated as the means and standard deviations, respectively, of these many estimates. This approach has proved to be transferable over different regions where abundant information on landsliding and rainfall was available [9,41,42].
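The frequentist procedure with bootstrapping described above can be illustrated with a minimal sketch. It is not the implementation of [40] or [9]: the function names (`fit_line`, `frequentist_threshold`, `bootstrap_threshold`) are hypothetical, a power-law threshold *I* = α*D*<sup>β</sup> is assumed, and the Gaussian fitted to the residual distribution is approximated here by a zero-mean Gaussian with the sample standard deviation of the residuals.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx = math.fsum(x) / len(x)
    my = math.fsum(y) / len(y)
    sxx = math.fsum((xi - mx) ** 2 for xi in x)
    sxy = math.fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def frequentist_threshold(duration, intensity, exceedance=0.05):
    """Return (alpha, beta) of I = alpha * D**beta at the given exceedance level."""
    x = [math.log10(d) for d in duration]
    y = [math.log10(i) for i in intensity]
    intercept, beta = fit_line(x, y)
    residuals = [yi - (intercept + beta * xi) for xi, yi in zip(x, y)]
    # Assumption: residuals are modelled as N(0, sigma) with the sample sigma.
    sigma = stdev(residuals)
    # The quantile is negative for exceedance < 0.5, so the line moves downward.
    shift = NormalDist(0.0, sigma).inv_cdf(exceedance)
    return 10 ** (intercept + shift), beta


def bootstrap_threshold(duration, intensity, exceedance=0.05, n_boot=5000, seed=0):
    """Mean and standard deviation of (alpha, beta) over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(duration)
    alphas, betas = [], []
    for _ in range(n_boot):
        # Draw n indices with replacement from a data set of size n.
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = frequentist_threshold(
            [duration[i] for i in idx], [intensity[i] for i in idx], exceedance
        )
        alphas.append(a)
        betas.append(b)
    return (mean(alphas), stdev(alphas)), (mean(betas), stdev(betas))
```

Calling `bootstrap_threshold(D, I, 0.05)` on lists of durations and mean intensities returns `((alpha, d_alpha), (beta, d_beta))`. Only the intercept is shifted; the slope remains that of the best fit through all data, which is the point the *AR-S* modification addresses.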

Recently, this frequentist approach with bootstrapping [32,41,43] has been modified by [35] through coupling a dynamic rainfall variable (*AR*) with a static indicator of the spatially varying predisposing ground conditions (landslide susceptibility, *S*) (further referred to as the *AR-S* approach). The first step in *AR-S* threshold estimation is similar to the frequentist method developed by [40] and [9], calculating the residuals of the least-squares fit on the log-transformed data. Then, it proceeds to select the 2*x*% of the data with the largest negative residuals, on which a new least-squares regression is applied, providing a threshold at the *x*% exceedance probability level. In this way, not only the intercept α but also the slope β of the threshold equation are based on the lowest *AR* values able to cause landsliding. In parallel, the following novel *AR* index, covering a period of 42 days (*n*) of antecedent rainfall [35], was proposed to account for the non-linear decay of the effect of rainfall on soil wetness:

$$AR_i = \sum_{k=i}^{i-n} e^{\frac{-a \times (t_i - t_k)}{r_k^b}} \times r_k, \tag{1}$$

with *t* referring to time (here expressed in days), and the characteristic time τ = *r<sub>k</sub><sup>b</sup>*/*a* varying non-linearly with daily rainfall *r<sub>k</sub>* [35].
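As an illustration only, the *AR* index of Equation (1) and the two-step *AR-S* regression can be sketched as follows. The function names are hypothetical, the default values of *a* and *b* are placeholders rather than calibrated values from [35], and the threshold is assumed to take the power-law form *AR* = α*S*<sup>β</sup>, fitted on log-transformed *S* and *AR*.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    mx = math.fsum(x) / len(x)
    my = math.fsum(y) / len(y)
    sxx = math.fsum((xi - mx) ** 2 for xi in x)
    sxy = math.fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def antecedent_rain(daily_rain, i, n=42, a=1.2, b=1.2):
    """AR index for day i, summing days i-n .. i of a daily rainfall series.

    a and b are illustrative placeholders, not values taken from [35].
    """
    total = 0.0
    for k in range(max(0, i - n), i + 1):
        r = daily_rain[k]
        if r <= 0.0:
            continue  # a dry day adds nothing, and r**b would be zero
        # Days are consecutive, so t_i - t_k equals i - k (in days).
        total += math.exp(-a * (i - k) / r ** b) * r
    return total


def ars_threshold(susceptibility, ar, x_percent=5.0):
    """Return (alpha, beta) of AR = alpha * S**beta at the x% exceedance level."""
    xs = [math.log10(s) for s in susceptibility]
    ys = [math.log10(v) for v in ar]
    # Step 1: residuals of the fit through all landslide-triggering conditions.
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Step 2: keep the 2x% of points with the most negative residuals and refit.
    m = max(2, round(len(xs) * 2.0 * x_percent / 100.0))
    lowest = sorted(range(len(xs)), key=residuals.__getitem__)[:m]
    intercept_x, beta = fit_line([xs[j] for j in lowest], [ys[j] for j in lowest])
    return 10 ** intercept_x, beta
```

Because τ grows with *r<sub>k</sub>*, a day with heavy rainfall keeps a larger share of its contribution on subsequent days than a day with light rainfall. In `ars_threshold`, both α and β come from the refit on the lowest *AR* values, in contrast to the frequentist method, where only the intercept is adjusted.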

Identifying thresholds for rainfall-triggered landsliding in data-scarce environments is challenging with respect to information on landslide occurrence and hydrophysical parameters, resulting in the quasi-absence of research on this topic in regions such as Central Africa [3] despite high hazard potential [29,44–47]. The *AR-S* approach made it possible to define the first regional threshold for landsliding in the western branch of the East African Rift (WEAR) [35]. To the authors' knowledge, it has so far not been used in other regions. Moreover, the cited study relied on the limited data available on landslide occurrence, global satellite-based rainfall estimates [48], and continental susceptibility data [45]. There is hence a strong need to test the method's robustness with other data sets. A regional *S* model is now available for the WEAR [49], which outperforms the global and continental models with regard to prediction accuracy and geomorphological plausibility [49]. Moreover, the landslide event database used in [35] has now grown by about 27%. In this paper, our aim is thus to test the transferability of the *AR-S* threshold method as designed by [35] to these new data.
