2.1.1. Geographically and Temporally Weighted Regression Model

The GTWR model is expressed as Equation (1) [19,37]:

$$y_{i} = \beta_{0}(u_{i}, v_{i}, t_{i}) + \sum_{k=1}^{n} \beta_{k}(u_{i}, v_{i}, t_{i})\, x_{ik} + \varepsilon_{i}, \quad i = 1, 2, \ldots, n \tag{1}$$

where $(u_i, v_i, t_i)$ denotes the spatial–temporal coordinates of the observed location $i$, comprising its location $(u_i, v_i)$ and its time $t_i$; $k = 1, 2, \ldots, n$ indexes the explanatory variables $x_{ik}$; $\beta_k(u_i, v_i, t_i)$ represents the set of $n$ parameter values at point $i$; and $\varepsilon_i$ represents the random error of the predicted variable $y_i$. The estimates of $\beta_k(u_i, v_i, t_i)$ are given by Equation (2):

$$
\widehat{\boldsymbol{\beta}}_{k}(u_{i}, v_{i}, t_{i}) = \left(\mathbf{X}^{T}\mathbf{W}(u_{i}, v_{i}, t_{i})\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{W}(u_{i}, v_{i}, t_{i})\,\mathbf{y} \tag{2}
$$
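The estimator in Equation (2) is a weighted least-squares fit repeated at every point $i$. The following is a minimal pure-Python sketch, not the authors' implementation: the names `solve` and `local_wls` are hypothetical, and the weights are assumed to be supplied as the diagonal of $\mathbf{W}(u_i, v_i, t_i)$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def local_wls(X, y, w):
    """Return (X^T W X)^{-1} X^T W y, where w holds the diagonal of W for one point i.

    X is a list of rows (include a leading 1.0 for the intercept beta_0),
    y is the list of responses, and w is the list of weights w_ij.
    """
    p = len(X[0])
    xtwx = [[sum(wj * row[r] * row[c] for row, wj in zip(X, w)) for c in range(p)]
            for r in range(p)]
    xtwy = [sum(wj * row[r] * yj for row, yj, wj in zip(X, y, w)) for r in range(p)]
    return solve(xtwx, xtwy)
```

With equal weights the function reduces to ordinary least squares; unequal weights pull the fitted coefficients toward the observations that are closest to point $i$ in space and time.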

The spatial–temporal weight matrix $\mathbf{W}(u_i, v_i, t_i)$ is built from the spatial–temporal distance and a distance-decay function. Common weight functions include the distance threshold method, the inverse distance method, the Gaussian function method, and the bi-square kernel function method. The Gaussian and bi-square kernel functions are commonly used in the GTWR model [38]. The Gaussian kernel function is given by Equation (3):

$$w_{ij} = \exp\left(-\left(d_{ij}^{ST}/h^{ST}\right)^{2}\right) \tag{3}$$

where the element $w_{ij}$ of the weighting matrix is determined by the spatial–temporal distance $d_{ij}^{ST}$ and the spatial–temporal bandwidth $h^{ST}$. The GTWR model defines the spatial–temporal distance $d_{ij}^{ST}$ as a function of the spatial distance $d_{ij}^{S}$ and the temporal distance $d_{ij}^{T}$, as expressed in Equation (4):

$$
\left( d_{ij}^{ST} \right)^2 = \lambda \left( d_{ij}^{S} \right)^2 + \mu \left( d_{ij}^{T} \right)^2 = \lambda \left[ \left( u_i - u_j \right)^2 + \left( v_i - v_j \right)^2 \right] + \mu \left( t_i - t_j \right)^2 \tag{4}
$$
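As an illustration of Equations (3) and (4), the sketch below computes the Gaussian weights of all observations relative to one point $i$, taking $(d_{ij}^{ST})^2 = \lambda (d_{ij}^{S})^2 + \mu (d_{ij}^{T})^2$. The function names and the tuple layout `(u, v, t)` are assumptions made for this example, and the scale factors `lam` and `mu` and the bandwidth `h_st` are treated as given.

```python
import math


def st_distance_sq(p, q, lam, mu):
    """Squared spatial-temporal distance of Equation (4) between two (u, v, t) tuples."""
    (u_i, v_i, t_i), (u_j, v_j, t_j) = p, q
    d_s_sq = (u_i - u_j) ** 2 + (v_i - v_j) ** 2  # squared spatial distance
    d_t_sq = (t_i - t_j) ** 2                     # squared temporal distance
    return lam * d_s_sq + mu * d_t_sq


def weight_row(coords, i, lam, mu, h_st):
    """Gaussian weights w_ij of Equation (3) for point i against every observation j."""
    return [math.exp(-st_distance_sq(coords[i], q, lam, mu) / h_st ** 2) for q in coords]
```

The returned list is the diagonal of the weight matrix for point $i$: the weight is 1 at the point itself and decays as the spatial or temporal separation grows.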

The factors $\lambda$ and $\mu$ denote the spatial distance factor and the temporal distance factor, respectively, which balance the effects of the space and time dimensions on parameter estimation. Substituting Equation (4) into Equation (3), the element $w_{ij}$ of the spatial–temporal weight matrix is expressed by Equation (5):

$$\begin{split} w_{ij} &= \exp\left\{-\frac{\lambda\left[\left(u_{i}-u_{j}\right)^{2}+\left(v_{i}-v_{j}\right)^{2}\right]+\mu\left(t_{i}-t_{j}\right)^{2}}{\left(h^{ST}\right)^{2}}\right\} \\ &= \exp\left\{-\left[\frac{\left(u_{i}-u_{j}\right)^{2}+\left(v_{i}-v_{j}\right)^{2}}{\left(h^{S}\right)^{2}}+\frac{\left(t_{i}-t_{j}\right)^{2}}{\left(h^{T}\right)^{2}}\right]\right\} \\ &= \exp\left\{-\left[\frac{\left(d_{ij}^{S}\right)^{2}}{\left(h^{S}\right)^{2}}+\frac{\left(d_{ij}^{T}\right)^{2}}{\left(h^{T}\right)^{2}}\right]\right\} \\ &= \exp\left\{-\frac{\left(d_{ij}^{S}\right)^{2}}{\left(h^{S}\right)^{2}}\right\} \times \exp\left\{-\frac{\left(d_{ij}^{T}\right)^{2}}{\left(h^{T}\right)^{2}}\right\} \\ &= w_{ij}^{S} \times w_{ij}^{T} \end{split} \tag{5}$$
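The factorization in Equation (5) can be checked numerically. The sketch below assumes the bandwidth relations $(h^{S})^2 = (h^{ST})^2/\lambda$ and $(h^{T})^2 = (h^{ST})^2/\mu$, which are what make the first and second lines of Equation (5) equal; the function names are hypothetical.

```python
import math


def joint_weight(p, q, lam, mu, h_st):
    """First line of Equation (5): one Gaussian kernel on the spatial-temporal distance."""
    (u_i, v_i, t_i), (u_j, v_j, t_j) = p, q
    d_st_sq = lam * ((u_i - u_j) ** 2 + (v_i - v_j) ** 2) + mu * (t_i - t_j) ** 2
    return math.exp(-d_st_sq / h_st ** 2)


def product_weight(p, q, lam, mu, h_st):
    """Last line of Equation (5): spatial kernel times temporal kernel."""
    (u_i, v_i, t_i), (u_j, v_j, t_j) = p, q
    h_s = h_st / math.sqrt(lam)  # spatial bandwidth implied by lam
    h_t = h_st / math.sqrt(mu)   # temporal bandwidth implied by mu
    w_s = math.exp(-((u_i - u_j) ** 2 + (v_i - v_j) ** 2) / h_s ** 2)
    w_t = math.exp(-((t_i - t_j) ** 2) / h_t ** 2)
    return w_s * w_t
```

For any pair of points and any positive `lam`, `mu`, and `h_st`, the two functions agree up to floating-point rounding, which is the separability that Equation (5) states.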

Equation (5) shows that the spatial–temporal kernel function $w_{ij}$ equals the spatial kernel function $w_{ij}^{S}$ multiplied by the temporal kernel function $w_{ij}^{T}$. Here $h^{S}$ and $h^{T}$ are the spatial and temporal bandwidths, respectively, with $(h^{S})^{2} = (h^{ST})^{2}/\lambda$ and $(h^{T})^{2} = (h^{ST})^{2}/\mu$. The spatial–temporal kernel function thus determines the weight that an observation at a given location and time receives in the regression at point $i$. The spatial bandwidth $b_S$ and the temporal bandwidth $b_T$ (that is, $h^{S}$ and $h^{T}$) are selected by cross-validation (CV), by minimizing the expression on the right-hand side of Equation (6):

$$CV(b_{S}, b_{T}) = \frac{1}{n} \sum_{i=1}^{n} \left[ y_{i} - \widehat{y}_{\neq i}(b_{S}, b_{T}) \right]^{2} \tag{6}$$
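The CV score in Equation (6) can be sketched as a leave-one-out loop. To keep the example short, it assumes a single explanatory variable plus an intercept, so that the local weighted fit has a closed form; the function names and data layout are hypothetical, and the weights use the product form of Equation (5) with bandwidths `b_s` and `b_t`.

```python
import math


def weighted_line(x, y, w):
    """Weighted least-squares intercept and slope for one explanatory variable."""
    sw = sum(w)
    mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
    my = sum(wj * yj for wj, yj in zip(w, y)) / sw
    sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
    sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cv_score(coords, x, y, b_s, b_t):
    """Leave-one-out CV of Equation (6); coords is a list of (u, v, t) tuples."""
    n = len(y)
    total = 0.0
    for i in range(n):
        u_i, v_i, t_i = coords[i]
        w = []
        for j, (u_j, v_j, t_j) in enumerate(coords):
            if j == i:
                w.append(0.0)  # observation i is left out of its own fit
                continue
            d_s_sq = (u_i - u_j) ** 2 + (v_i - v_j) ** 2
            d_t_sq = (t_i - t_j) ** 2
            w.append(math.exp(-d_s_sq / b_s ** 2) * math.exp(-d_t_sq / b_t ** 2))
        intercept, slope = weighted_line(x, y, w)
        total += (y[i] - (intercept + slope * x[i])) ** 2
    return total / n
```

Evaluating `cv_score` over candidate pairs `(b_s, b_t)` and keeping the pair with the smallest score is one simple way to carry out the minimization.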

In Equation (6), $\widehat{y}_{\neq i}(b_S, b_T)$ is the value of $y_i$ predicted with observation $i$ left out of the fit. Minimizing $CV(b_S, b_T)$ yields the optimal spatial and temporal bandwidths of the model. A stepwise method for determining the spatial and temporal bandwidths was proposed by Fotheringham et al. [21]. Its principle is that, within a single time period, GTWR can be regarded as GWR. For the data in each time period, the spatial bandwidth is determined with the GWR model by minimizing $CV(b_S)$; the temporal bandwidth is then chosen to minimize $CV(b_{S1}, b_{S2}, \ldots, b_{Sn}, b_T)$.

This paper relies on a step-by-step approach [8] to calculate the bandwidths that minimize the CV. The approach first considers the data at a specific time or within a single period, in which case the GTWR model becomes similar to the GWR model. One minimizes $CV(b_S)$ for the spatial data of each period and then obtains the appropriate temporal bandwidth by minimizing $CV(b_{S1}, b_{S2}, \ldots, b_{Sn}, b_T)$.
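The control flow of the step-by-step bandwidth search can be sketched as follows. This is an illustration only: the search over a fixed grid of candidate values is an assumption (the text does not specify the optimizer), and `cv_spatial` and `cv_joint` are hypothetical callables standing in for the per-period GWR score $CV(b_S)$ and the joint score $CV(b_{S1}, \ldots, b_{Sn}, b_T)$.

```python
def stepwise_bandwidths(periods, spatial_grid, temporal_grid, cv_spatial, cv_joint):
    """Two-step bandwidth selection.

    periods       : per-period data sets, one entry per time period
    spatial_grid  : candidate spatial bandwidths (assumed grid search)
    temporal_grid : candidate temporal bandwidths (assumed grid search)
    cv_spatial    : callable (period_data, b_s) -> CV(b_S) for that period
    cv_joint      : callable (list_of_b_s, b_t) -> CV(b_S1, ..., b_Sn, b_T)
    """
    # Step 1: treat each period as a GWR problem and pick its spatial bandwidth.
    b_s = [min(spatial_grid, key=lambda b: cv_spatial(data, b)) for data in periods]
    # Step 2: hold the spatial bandwidths fixed and pick the temporal bandwidth.
    b_t = min(temporal_grid, key=lambda b: cv_joint(b_s, b))
    return b_s, b_t
```

Splitting the search this way replaces one joint search over all bandwidths with one one-dimensional search per period followed by a single one-dimensional search over the temporal bandwidth.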
