**3. Methodology**

ST-CORAbico was developed to analyse the spatiotemporal characteristics of storm events and to bias-correct the main sources of systematic error in satellite rainfall estimates. Figure 2 shows the methodology of ST-CORAbico. This section describes its storm analysis and bias correction components.

**Figure 2.** Diagram of the Spatiotemporal Contiguous Object-based Rainfall Analysis for bias correction (ST-CORAbico) method. Grey boxes represent the input and output products while white boxes describe the methodological process for storm analysis and bias correction components.

#### *3.1. Storm Analysis*

In the storm analysis, ST-CORAbico uses ST-CORA to analyse the spatiotemporal characteristics of the storm events observed and detected by satellites. This process requires the spatial and temporal domain to be defined beforehand, which reduces the computational time of ST-CORA. We applied a spatiotemporal searching algorithm to predetermine the region of analysis in ST-CORA. This algorithm uses the spatial searching concept proposed by Guttman [51] to index areas with rainfall information in both datasets. The indexing is performed in a two-dimensional space by compressing the latitude and longitude dimensions, using the maximum intensity value as a reference. Once the spatiotemporal domain is defined, we apply ST-CORA to the observed and SPP datasets to identify storms in the rainfall data. In this study, ST-CORA incorporates a multivariate kernel density function for storm segmentation.

3.1.1. Storm Segmentation Using the Spatiotemporal Object-Based Rainfall Analysis with Multivariate Kernel Density Segmentation

ST-CORA was applied to analyse the spatiotemporal characteristics of storm events at the catchment scale (duration, spatial extent, magnitude, and centroid). This method enables the feature extraction of different storm event types, classified based on hydrometeorological criteria. ST-CORA uses a multidimensional connected component labelling algorithm to group voxels (a volume generalisation of pixels) that are connected in space and time into disjoint objects, each labelled with a unique identifier. This operation works on a binary field of voxels classified as 'effective rainfall'. Effective rainfall voxels *S*<sub>[*x*,*y*,*t*]</sub> are defined as the rainfall voxels *R*<sub>*x*,*y*,*t*</sub> at or above the rainfall intensity threshold *IT*:

$$S_{[x,y,t]} := \begin{cases} 1, & \text{if } R_{x,y,t} \ge IT \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

where *IT* is defined by the user and *S*<sub>[*x*,*y*,*t*]</sub> takes the value 1 = "true" or 0 = "false". In this study, we used *IT* = 1 mm/h to define effective rainfall [52]. Once the binary voxels are created, the connected component labelling algorithm scans all voxels in a neighbour system (from top to bottom and left to right), assigning preliminary labels to *S*<sub>[*x*,*y*,*t*]</sub>, as follows:

$$\mathcal{L}(S_{[x,y,t]}) = \{N_{[x,y,t]} \in \alpha_s : S_{CR} = S_N\} \tag{2}$$

where ℒ(*S*<sub>[*x*,*y*,*t*]</sub>) is a preliminary label, *S*<sub>*CR*</sub> and *S*<sub>*N*</sub> are properties of the voxel *S*<sub>[*x*,*y*,*t*]</sub> and its neighbours *N*<sub>[*x*,*y*,*t*]</sub>, respectively, and *α*<sub>*s*</sub> is the neighbour system in space and time. The labelling process ℒ(*S*<sub>[*x*,*y*,*t*]</sub>) is repeated to resolve the equivalence classes of the spatiotemporal object.
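As an illustration of Equations (1) and (2), the following minimal sketch thresholds a small rainfall cube and labels the connected voxels. It is not the ST-CORA implementation: the function names, the `[t][y][x]` array layout, and the 26-voxel neighbour system are assumptions made for the example, and a breadth-first flood fill is used in place of the scan-and-resolve labelling described above (both yield the same partition into objects).

```python
from collections import deque
from itertools import product


def effective_rainfall(rain, intensity_threshold=1.0):
    """Eq. (1): 1 where rainfall >= IT, else 0. `rain` is indexed [t][y][x]."""
    return [[[1 if r >= intensity_threshold else 0 for r in row]
             for row in grid] for grid in rain]


def label_objects(mask):
    """Label voxels connected in x, y and t (assumed 26-neighbour system).

    Returns the label cube (0 = background) and the number of objects.
    """
    n_t, n_y, n_x = len(mask), len(mask[0]), len(mask[0][0])
    labels = [[[0] * n_x for _ in range(n_y)] for _ in range(n_t)]
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    current = 0
    for t, y, x in product(range(n_t), range(n_y), range(n_x)):
        if mask[t][y][x] == 0 or labels[t][y][x] != 0:
            continue  # background voxel, or already part of an object
        current += 1
        labels[t][y][x] = current
        queue = deque([(t, y, x)])
        while queue:  # flood fill over the space-time neighbours
            ct, cy, cx = queue.popleft()
            for dt, dy, dx in offsets:
                pt, py, px = ct + dt, cy + dy, cx + dx
                inside = 0 <= pt < n_t and 0 <= py < n_y and 0 <= px < n_x
                if inside and mask[pt][py][px] == 1 and labels[pt][py][px] == 0:
                    labels[pt][py][px] = current
                    queue.append((pt, py, px))
    return labels, current


# Hypothetical rainfall cube (mm/h): two time steps on a 2 x 4 grid.
rain = [
    [[0.0, 2.0, 0.0, 0.0],
     [0.0, 3.0, 0.0, 5.0]],
    [[0.0, 0.0, 0.5, 0.0],
     [1.0, 0.0, 0.0, 4.0]],
]
mask = effective_rainfall(rain, intensity_threshold=1.0)
labels, n_objects = label_objects(mask)
```

In this toy cube the voxel at 0.5 mm/h is discarded by the threshold, and the remaining voxels form two separate space-time objects.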

Bethel et al. [53] found that object segmentation using image thresholding, such as the connected component labelling method, has limitations for edge detection in data with unknown topology. The original ST-CORA incorporates a size-filtering algorithm and a morphological closing method to remove small noisy objects and false merging effects, respectively. However, this process operates on a binary object and does not take into account the intensity value of the voxels. To overcome this limitation, we have incorporated a multivariate Kernel Density Estimation (KDE) approach that segments rainfall objects by considering their four dimensions. KDE is a non-parametric technique for estimating the probability density of *d*-dimensional data and has been widely used in many fields for image detection and object tracking, e.g., [54–58]. The multivariate kernel density at a point **x** is estimated from a random sample **x**<sub>1</sub>, **x**<sub>2</sub>, ..., **x**<sub>*n*</sub> drawn from a density function *f* as:

$$\widehat{f_K}(\mathbf{x}) = \frac{1}{n} \sum_{i=1}^{n} K_h\left(\mathbf{x} - \mathbf{x}_i\right) \tag{3}$$

where *K* is the kernel function and *h* is the bandwidth matrix. The choice of bandwidth matrix can be restricted to the class of positive diagonal matrices [59]. Several bandwidth selection methods for kernel density estimation are available in the literature [59,60]. Here, we use the normal reference rule-of-thumb proposed by Henderson and Parmeter [61], which estimates the bandwidth under the assumption that the density function is Gaussian.
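A minimal sketch of Equation (3) is given below, assuming a product Gaussian kernel and a diagonal bandwidth matrix. The bandwidth formula is the common Gaussian normal-reference form, *h*<sub>*j*</sub> = (4/(*d*+2))<sup>1/(*d*+4)</sup> *σ*<sub>*j*</sub> *n*<sup>−1/(*d*+4)</sup>; the exact constant used in this study may differ. The function names and the sample voxels (*x*, *y*, *t*, intensity) are hypothetical.

```python
import math


def normal_reference_bandwidths(samples):
    """Per-dimension bandwidths from the Gaussian normal-reference rule.

    Assumes every dimension has non-zero variance across the samples.
    """
    n, d = len(samples), len(samples[0])
    const = (4.0 / (d + 2.0)) ** (1.0 / (d + 4.0))
    bandwidths = []
    for j in range(d):
        column = [s[j] for s in samples]
        mean = sum(column) / n
        sigma = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        bandwidths.append(const * sigma * n ** (-1.0 / (d + 4.0)))
    return bandwidths


def kde(x, samples, bandwidths):
    """Eq. (3) with a product Gaussian kernel and diagonal bandwidths."""
    n, d = len(samples), len(samples[0])
    norm = (2.0 * math.pi) ** (d / 2.0) * math.prod(bandwidths)
    total = 0.0
    for s in samples:
        z2 = sum(((x[j] - s[j]) / bandwidths[j]) ** 2 for j in range(d))
        total += math.exp(-0.5 * z2)
    return total / (n * norm)


# Hypothetical voxels of one rainfall object: (x, y, t, intensity in mm/h).
samples = [
    (0, 0, 0, 2.0),
    (1, 0, 0, 6.0),
    (0, 1, 1, 12.0),
    (1, 1, 1, 8.0),
    (2, 1, 2, 3.0),
]
bandwidths = normal_reference_bandwidths(samples)
density_centre = kde((1, 1, 1, 6.0), samples, bandwidths)
density_far = kde((10, 10, 10, 50.0), samples, bandwidths)
```

The density near the middle of the object is higher than at a point far from every sample, which is the property the segmentation step relies on.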

Edge detection with KDE follows the Edge Detection by Density method developed by Pereira et al. [55]. The process evaluates the multivariate density distribution of a four-dimensional (4D) rainfall object (Figure 3) and segments the object using a density threshold, *u*. The threshold identifies storm edges as the parts of the object whose density lies below a given percentile of the distribution. Its value is calibrated by analysing the relationship between threshold delineation and the connected intensity value. We found that setting *u* to the 25th percentile of the density distribution gave good results for storm segmentation over the Lower Mekong Basin, especially for intense storm events, which are characteristic of monsoon environments.
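The thresholding step can be sketched as follows, under the assumption that *u* is taken as the 25th percentile of the voxel densities of an object and that voxels below *u* are treated as edges. The function names, the linear-interpolation percentile, and the density values are hypothetical and serve only to illustrate the idea.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def segment_by_density(densities, q=25.0):
    """Split voxels into core (density >= u) and edge (density < u).

    `densities` maps a voxel index (x, y, t) to its estimated density.
    """
    u = percentile(list(densities.values()), q)
    core = {voxel for voxel, f in densities.items() if f >= u}
    edge = set(densities) - core
    return u, core, edge


# Hypothetical densities for five voxels of one rainfall object.
densities = {
    (0, 0, 0): 0.01,
    (1, 0, 0): 0.05,
    (2, 0, 0): 0.09,
    (3, 0, 0): 0.13,
    (4, 0, 0): 0.02,
}
u, core, edge = segment_by_density(densities, q=25.0)
```

With these values the threshold falls on the second-lowest density, so only the lowest-density voxel is flagged as an edge.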

Rainfall objects are classified as storms based on the Critical Mass Threshold (*CMT*), defined as the minimum volume of rainfall (km<sup>3</sup>) required for an object to be considered an extreme event [62]. The value of *CMT* is calculated locally from the sensitivity between the spatial extent and the total object volume [37,63]. In this analysis, we also incorporated the sensitivity of *CMT* to the maximum intensity of the storm in order to evaluate the response of intense storm events in the study area. Based on the sensitivity analysis of these parameters, we selected a *CMT* of 0.01 km<sup>3</sup> for storm events with a maximum intensity greater than 10 mm/h. In the study area, these events correspond to rainfall objects larger than 2000 km<sup>2</sup>.
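The storm criterion can be illustrated with a short sketch. It assumes that the object volume is the sum over voxels of intensity × time step × cell area, converted from mm·km<sup>2</sup> to km<sup>3</sup>, and that an object qualifies when its volume reaches the *CMT* and its peak intensity exceeds 10 mm/h. The function names, the 100 km<sup>2</sup> cell area, the 1 h time step, and the three objects are hypothetical.

```python
MM_TO_KM = 1e-6  # 1 mm of rainfall depth expressed in km


def object_volume_km3(intensities_mm_h, cell_area_km2, time_step_h):
    """Rainfall volume of one object: accumulated depth times cell area."""
    depth_mm = sum(r * time_step_h for r in intensities_mm_h)
    return depth_mm * MM_TO_KM * cell_area_km2


def is_storm(intensities_mm_h, cell_area_km2, time_step_h,
             cmt_km3=0.01, min_peak_mm_h=10.0):
    """True if the object reaches the CMT and exceeds the peak intensity."""
    volume = object_volume_km3(intensities_mm_h, cell_area_km2, time_step_h)
    return volume >= cmt_km3 and max(intensities_mm_h) > min_peak_mm_h


# Assumed grid: 100 km2 cells and hourly time steps (illustrative values).
CELL_AREA_KM2 = 100.0
TIME_STEP_H = 1.0

object_a = [5.0] * 30 + [12.0]  # large and intense
object_b = [4.0] * 10           # too small and too weak
object_c = [5.0] * 40           # large enough, but peak intensity too low

results = [is_storm(obj, CELL_AREA_KM2, TIME_STEP_H)
           for obj in (object_a, object_b, object_c)]
```

Only the first object satisfies both conditions; the third exceeds the volume threshold but is rejected because its peak intensity does not exceed 10 mm/h.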

**Figure 3.** Multivariate kernel density of a storm object in space and time. Example for the storm event 2014-07.
