*2.2. Data Processing*

#### 2.2.1. Representation of the Spatial Correlation

Our framework represents the GNSS monitoring network as a weighted graph *G* = (*V*, *E*, *W*). The monitoring sites are the nodes *V*, and *E* is a finite set of edges representing the connections between nodes. The number of edges is *N*(*N* − 1)/2, where *N* is the number of monitoring stations in the network, so every pair of stations is connected. *W* ∈ *R*<sup>*N*×*N*</sup> is a weighted adjacency matrix representing the correlation between nodes (Figure 5).

**Figure 5.** The diagram of a weighted graph and adjacency matrix.

Generally, the deformation characteristics of a landslide differ from one part of the slope to another, and therefore with the location of the monitoring site. The spatial correlation between monitoring sites in the GNSS network graph is thus strongly location-dependent. Accordingly, the weighted adjacency matrix is calculated using a Gaussian similarity function based on spatial proximity. The weight *w<sub>ij</sub>* of edge *e<sub>ij</sub>*, written *w*(*i*, *j*) in Equation (1), represents the spatial correlation between nodes (*v<sub>i</sub>*, *v<sub>j</sub>*).

$$w(i,j) = \exp\left(-\frac{\|v_i - v_j\|^2}{2\sigma^2}\right) \tag{1}$$

where ‖*v<sub>i</sub>* − *v<sub>j</sub>*‖<sup>2</sup> denotes the squared distance between nodes (*v<sub>i</sub>*, *v<sub>j</sub>*), which measures their spatial dependence, and *σ* is the standard deviation controlling the width of the neighbourhoods.

The weighted adjacency matrix *A<sub>w</sub>* can be expressed as Equation (2), where a larger weight indicates a higher correlation between two neighbouring nodes.

$$A_w = \begin{pmatrix} 0 & w(1,2) & \cdots & w(1,N) \\ w(2,1) & 0 & \cdots & w(2,N) \\ \vdots & \vdots & \ddots & \vdots \\ w(N,1) & w(N,2) & \cdots & 0 \end{pmatrix} \tag{2}$$
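Equations (1) and (2) can be illustrated with a minimal NumPy sketch. It assumes that each node *v<sub>i</sub>* is described by planar station coordinates and that the distance is Euclidean; neither is specified above. The function name `gaussian_adjacency`, the coordinates, and the value of `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_adjacency(coords, sigma):
    """Weighted adjacency matrix from station coordinates, following Equation (1).

    Assumption: coords has shape (N, d) and distances are Euclidean.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise coordinate differences, shape (N, N, d).
    diff = coords[:, None, :] - coords[None, :, :]
    # Squared distances ||v_i - v_j||^2, shape (N, N).
    sq_dist = np.sum(diff ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Equation (2) has zeros on the diagonal (no self-loops).
    np.fill_diagonal(weights, 0.0)
    return weights


# Hypothetical planar coordinates (in metres) for four stations.
stations = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 40.0], [30.0, 40.0]])
A_w = gaussian_adjacency(stations, sigma=50.0)
```

In this sketch the matrix is symmetric, and closer station pairs receive larger weights. A small `sigma` drives the weights of distant pairs towards zero, whereas a large `sigma` makes all weights approach 1.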

#### 2.2.2. Representation of the Temporal Correlation

The spatial and temporal attributes are two critical elements of landslide displacement prediction. This section describes the node features that represent the temporal correlation. Once the displacement data are collected through the GNSS monitoring system, they must be preprocessed before analysis. Outlier removal and missing-value imputation are carried out first, followed by denoising and normalization. This study applies a wavelet-based denoising method to remove random noise and improve data quality. The monitoring data are then scaled to the range from 0 to 1 by min-max normalization to eliminate dimensional effects.
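The normalization step can be sketched as follows. The example assumes that outlier removal, imputation, and denoising have already been applied, and that scaling is done separately for each station; the text above does not state whether the scaling is per station or global. The function names and the sample values are hypothetical.

```python
import numpy as np


def min_max_normalize(series):
    """Scale each station's series (one row per station) to the range [0, 1].

    Assumption: normalization is applied per station, not across the network.
    Returns the scaled data plus the per-station minima and maxima.
    """
    series = np.asarray(series, dtype=float)
    lo = series.min(axis=1, keepdims=True)
    hi = series.max(axis=1, keepdims=True)
    # Avoid division by zero for stations with a constant series.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (series - lo) / span, lo, hi


def min_max_denormalize(scaled, lo, hi):
    """Map scaled values back to the original displacement units."""
    span = np.where(hi > lo, hi - lo, 1.0)
    return scaled * span + lo


# Hypothetical displacement series for two stations.
displacement = np.array([[0.0, 5.0, 10.0], [2.0, 2.0, 2.0]])
scaled, lo, hi = min_max_normalize(displacement)
restored = min_max_denormalize(scaled, lo, hi)
```

Keeping `lo` and `hi` allows predictions made on the normalized scale to be converted back to physical displacement.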

A feature matrix *X* ∈ *R*<sup>*N*×*P*</sup> is defined, which contains the time-series information of the monitoring stations (nodes), where *N* is the number of monitoring stations in the network and *P* denotes the number of node time-series features, such as the length of the historical time series. *X<sub>t</sub>* ∈ *R*<sup>*N*×*t*</sup> represents the displacement at each monitoring station at time *t*. Thus, the input [*X<sub>t−n</sub>*, ..., *X<sub>t−1</sub>*, *X<sub>t</sub>*] is a sequence of *n* historical displacement observations, and [*X<sub>t+1</sub>*, ..., *X<sub>t+T</sub>*] is the predicted displacement over the following *T* time steps.
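One way to build such input and target sequences is a sliding window over the normalized series. The sketch below is illustrative only: `make_windows`, `n_hist` (the number of historical steps in each input), and `horizon` (the prediction length *T*) are hypothetical names, and the toy matrix stands in for real displacement data.

```python
import numpy as np


def make_windows(X, n_hist, horizon):
    """Split an (N, L) displacement matrix into input/target windows.

    Each sample pairs n_hist consecutive historical steps with the
    horizon steps that immediately follow them.
    """
    X = np.asarray(X, dtype=float)
    n_steps = X.shape[1]
    n_samples = n_steps - n_hist - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    # inputs: (samples, N, n_hist); targets: (samples, N, horizon).
    inputs = np.stack([X[:, s:s + n_hist] for s in range(n_samples)])
    targets = np.stack(
        [X[:, s + n_hist:s + n_hist + horizon] for s in range(n_samples)]
    )
    return inputs, targets


# Toy data: 2 stations, 10 time steps.
X = np.arange(20, dtype=float).reshape(2, 10)
inputs, targets = make_windows(X, n_hist=4, horizon=2)
```

With these settings, the first sample for station 0 uses steps 0 to 3 as input and steps 4 and 5 as target; consecutive samples are shifted by one time step.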
