In this section, both synthetic and real-world datasets are used to evaluate the effectiveness of the proposed MGTWR approach compared to the performances of the MGWR and GTWR methods.
3.1. Simulation Experiments
The spatial simulation layout was designed as a three-dimensional cube with each side 12 units long [7,12]. A Cartesian coordinate system was constructed with its origin at the bottom-lower-left corner of the cube. Observations were located at the m × m × m lattice points, with a distance l = 12/(m − 1) between any two neighboring points along each coordinate axis. In this case m = 13, so a total of n = 2197 observations were collected in the cube region. Note that if the spatial layout were a square region, the sample size would be n = 169 observations. The coordinates (u_i, v_i, t_i) of the locations at which observations were computed can be expressed as follows:
where mod(i − 1, m) is the remainder of i − 1 divided by m and int((i − 1)/m) is the integer part of (i − 1)/m.
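The lattice described above can be sketched in a few lines of NumPy. The indexing used here for the second and third axes is an assumption (one natural extension of the mod/int rule to three dimensions), and the function name `cube_lattice` is hypothetical; only m = 13, the side length of 12 and the spacing l = 12/(m − 1) come from the text.

```python
import numpy as np

def cube_lattice(m=13, side=12.0):
    """Return an (m**3, 3) array of (u, v, t) lattice coordinates."""
    l = side / (m - 1)                  # spacing between neighbouring points
    i = np.arange(1, m**3 + 1)          # 1-based observation index
    u = l * np.mod(i - 1, m)            # fastest-varying axis: mod(i - 1, m)
    v = l * np.mod((i - 1) // m, m)     # assumed indexing for the second axis
    t = l * ((i - 1) // m**2)           # assumed indexing for the third axis
    return np.column_stack([u, v, t])

coords = cube_lattice()
n_cube = coords.shape[0]                          # 13**3 lattice points
n_square = np.sum(coords[:, 2] == 0.0)            # points in a single t-slice
```

With the default arguments, `n_cube` and `n_square` correspond to the cube and square sample sizes quoted in the text.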
The dependent variable in the simulation was generated from the coefficients, the independent variables and the random error as follows:

$$y_i = \beta_0(u_i, v_i, t_i) + \beta_1(u_i, v_i, t_i)\,x_{i1} + \beta_2(u_i, v_i, t_i)\,x_{i2} + \varepsilon_i,$$

where the independent variables x_i1 and x_i2 (i = 1, 2, …, n) are drawn at random from the uniform distribution on (−4, 4). The random errors ε_i (i = 1, 2, …, n) follow the standard normal distribution N(0, 1). The coefficients β0, β1 and β2 depend on the spatial-temporal location (u, v, t). To test the performance of the proposed method, three datasets were used. The first dataset combines a constant coefficient with spatially varying coefficients, the second combines spatially and temporally varying coefficients, and the third combines a constant coefficient with spatially and temporally varying coefficients. The expressions for the three datasets are as follows:
Dataset 1: ;
Dataset 2: and
Dataset 3:
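As an illustration of the data-generating process, the following sketch draws one replication with three coefficients, two regressors sampled uniformly from (−4, 4) and standard normal errors. The three coefficient surfaces (`const_coef`, `spatial_coef`, `st_coef`) are hypothetical placeholders chosen only to give one constant, one spatially varying and one spatial-temporally varying coefficient; they are not the expressions used for Datasets 1 to 3.

```python
import numpy as np

def simulate_dataset(coords, coef_funcs, rng):
    """Generate X (with intercept column), y and the true coefficients."""
    n = coords.shape[0]
    u, v, t = coords.T
    x1 = rng.uniform(-4.0, 4.0, n)          # regressors drawn from (-4, 4)
    x2 = rng.uniform(-4.0, 4.0, n)
    eps = rng.standard_normal(n)            # N(0, 1) errors
    X = np.column_stack([np.ones(n), x1, x2])
    beta = np.column_stack([f(u, v, t) for f in coef_funcs])
    y = np.sum(beta * X, axis=1) + eps
    return X, y, beta

# Hypothetical coefficient surfaces (placeholders, not the paper's).
def const_coef(u, v, t):
    return np.full_like(u, 3.0)

def spatial_coef(u, v, t):
    return 1.0 + (u + v) / 12.0

def st_coef(u, v, t):
    return 1.0 + (u + v + t) / 12.0

axis = np.linspace(0.0, 12.0, 13)
t, v, u = np.meshgrid(axis, axis, axis, indexing="ij")
coords = np.column_stack([u.ravel(), v.ravel(), t.ravel()])
rng = np.random.default_rng(0)
X, y, beta = simulate_dataset(coords, (const_coef, spatial_coef, st_coef), rng)
```

Keeping `coords` and the coefficient functions fixed while re-drawing with a fresh generator state mirrors the idea of repeated replications on the same design.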
Based on the above conditions, three datasets (Dataset 1 to Dataset 3) were generated. For each dataset, the MGWR, GTWR and MGTWR models were fitted using the Gaussian kernel function. In the MGWR model, two of the coefficients are assumed to vary spatially and the remaining one is assumed to be constant. In the GTWR model, all coefficients are assumed to vary with spatial-temporal location. In the MGTWR model, two of the coefficients are assumed to vary with spatial-temporal location and the remaining one is assumed to be constant. Each dataset was generated ten times to reduce the influence of random error in any single run. It should be pointed out that the three coefficients and the spatial-temporal locations (u, v, t) were fixed across all replications of the same dataset.
Table 1 reports descriptive statistics of the optimal bandwidth and the optimal spatial-temporal parameter ratio (τ), both of which were selected by cross-validation (CV). The table lists the minimum (Min), mean (Mean) and maximum (Max) bandwidths of the MGWR, GTWR and MGTWR models over the ten replications, together with the same statistics of τ for the GTWR and MGTWR models.
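To make the roles of the bandwidth and of τ concrete, the sketch below computes Gaussian kernel weights from a spatial-temporal distance. Both the distance form (squared spatial distance plus τ times squared temporal distance) and the kernel form exp(−d²/h²) are assumptions based on a common GTWR formulation; this section does not state them, and the function name is hypothetical.

```python
import numpy as np

def st_gaussian_weights(coords, focal, bandwidth, tau):
    """Gaussian weights of all points relative to one focal point.

    coords: (n, 3) array of (u, v, t); focal: length-3 array.
    Assumed distance: d^2 = du^2 + dv^2 + tau * dt^2.
    Assumed kernel:   w = exp(-d^2 / bandwidth^2).
    """
    diff = np.asarray(coords, dtype=float) - np.asarray(focal, dtype=float)
    d2 = diff[:, 0] ** 2 + diff[:, 1] ** 2 + tau * diff[:, 2] ** 2
    return np.exp(-d2 / bandwidth**2)

pts = np.array([[0.0, 0.0, 0.0],    # focal point itself
                [3.0, 4.0, 0.0],    # 5 units away in space, same time
                [0.0, 0.0, 2.0]])   # same place, 2 units away in time
w_space_only = st_gaussian_weights(pts, pts[0], bandwidth=5.0, tau=0.0)
w_space_time = st_gaussian_weights(pts, pts[0], bandwidth=5.0, tau=4.0)
```

Under these assumptions, τ = 0 ignores temporal separation entirely, while a larger τ down-weights observations that are distant in time.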
The Akaike Information Criterion (AIC) accounts for model parsimony [27] and has been widely used for model selection. In practice, a corrected version of the AIC, which unlike the basic AIC is a function of sample size, was used to address the spatial-temporal non-stationarity in MGWR [6,28]. The associated formula is as follows:

$$\mathrm{AIC_c} = 2n\ln(\hat{\sigma}) + n\ln(2\pi) + n\,\frac{n + \mathrm{tr}(\mathbf{S})}{n - 2 - \mathrm{tr}(\mathbf{S})},$$

where σ̂ is the estimated standard deviation of the error term, n is the sample size, and tr(S) denotes the trace of the hat matrix. As a rule of thumb, when the difference between two AIC values is less than approximately 3, the competition between the models is regarded as “too close to call”, i.e., there is no clear evidence as to which of the two models is better [6,14]. Otherwise, if the difference between the AIC values of the two models is greater than approximately 3, the models differ significantly and the model with the smaller AIC is deemed to provide a better fit to the data. Therefore, the AIC reduction between models can be used as a key model fit diagnostic.
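A minimal sketch of the corrected AIC and of the rule of thumb follows. The formula coded here is the form commonly used for geographically weighted models, AICc = 2n ln(σ̂) + n ln(2π) + n(n + tr(S))/(n − 2 − tr(S)), which is assumed to be the one intended; the function names and the default threshold argument are illustrative.

```python
import math

def aicc(sigma_hat, n, trace_s):
    """Corrected AIC from the error s.d., sample size and tr(S)."""
    return (2.0 * n * math.log(sigma_hat)
            + n * math.log(2.0 * math.pi)
            + n * (n + trace_s) / (n - 2.0 - trace_s))

def compare_aic(aic_a, aic_b, threshold=3.0):
    """Apply the rule of thumb to two AIC values (models a and b)."""
    diff = aic_a - aic_b
    if abs(diff) < threshold:
        return "too close to call"
    return "b better" if diff > 0 else "a better"

# Mean AIC values reported for Dataset 1.
mgwr, gtwr, mgtwr = 592.1566, 635.0128, 589.45
verdict_vs_mgwr = compare_aic(mgwr, mgtwr)
verdict_vs_gtwr = compare_aic(gtwr, mgtwr)
```

Applied to the Dataset 1 values, the difference of 2.7066 between the MGWR and MGTWR models falls below the threshold, whereas the difference of 45.5628 between the GTWR and MGTWR models does not.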
For each dataset, the mean AIC was calculated based on ten replications of three models: the MGWR model, the GTWR model and the MGTWR model. The improvements in mean AIC calculated from the MGWR, GTWR and MGTWR models are shown in
Table 2. In this table, the first three columns list the mean AIC values of the MGWR, GTWR and MGTWR models. The fourth column gives the difference between the MGWR and MGTWR models, and the last column gives the difference between the GTWR and MGTWR models.
To further analyze how well the different models fit the constant and spatial-temporally varying coefficients, the following steps were performed. First, in each replication, we recorded the true value of each coefficient in each dataset together with the values estimated by the MGWR, GTWR and MGTWR models. Second, we averaged the estimates of each of the three coefficients at (u_i, v_i, t_i) over the ten replications and took the averages as the results at (u_i, v_i, t_i) [12]. Finally, because it is inconvenient to plot a coefficient distribution over a three-dimensional spatial-temporal region, whereas coefficient surfaces over the spatial region are intuitive, we plotted the surfaces of the true coefficient values and of the values estimated using the MGWR, GTWR and MGTWR models at a given time, as shown in Figures 1 to 9. In detail, Figure 1 and Figure 2 illustrate the constant coefficient and the spatially varying coefficient in Dataset 1 without considering the temporal information. Figure 3, Figure 4 and Figure 5 present the spatial-temporally varying coefficient at different times (t = 0, 6 and 12) in Dataset 2. Figure 6, Figure 7, Figure 8 and Figure 9 illustrate the constant coefficient and the spatial-temporally varying coefficient in Dataset 3.
For the combination of constant and spatially varying coefficients in Dataset 1, Table 2 shows that the AIC values of the MGWR, GTWR and MGTWR models are 592.1566, 635.0128 and 589.45, respectively. Compared to the MGWR and GTWR models, the improvements in the MGTWR model are 2.7066 and 45.5628, respectively. The MGTWR model therefore achieved the lowest AIC, although its margin over the MGWR model is below the threshold of approximately 3 and is thus too close to call. The estimated constant coefficient surfaces calculated using the MGWR and MGTWR models (Figure 1b,d, respectively) are smooth and similar to the true value, whereas the surface estimated by the GTWR model (Figure 1c) fluctuates greatly around the true value.
For the combination of spatially and spatial-temporally varying coefficients in Dataset 2, the AIC values of the MGWR, GTWR and MGTWR models are 7168.606, 6493.464 and 6532.238, respectively, as shown in Table 2. Compared to the MGWR and GTWR models, the improvements in the MGTWR model are 36.368 and −38.774, respectively. These differences show that the MGTWR model outperformed the MGWR model but not the GTWR model. Figure 3, Figure 4 and Figure 5 present the distributions of the spatial-temporally varying coefficients at t = 0, 6 and 12 in Dataset 2. The coefficient surfaces estimated using the MGWR model lie within (0.5, 2.5) regardless of the temporal coordinate t. In contrast, the coefficient surfaces estimated using the GTWR and MGTWR models lie within (0, 2), (0.5, 2.5) and (1, 3) at t = 0, 6 and 12, respectively. Both the GTWR and MGTWR models therefore capture the temporal variation, whereas the MGWR model does not.
For the combination of constant and spatial-temporally varying coefficients in Dataset 3, the AIC values of the MGWR, GTWR and MGTWR models are 6516.37, 6439.214 and 6403.558, respectively. The improvements in the MGTWR model compared to the MGWR and GTWR models are 112.812 and 35.656, respectively. The estimated coefficient surfaces for the constant coefficient (Figure 6) and the spatial-temporally varying coefficients (Figure 7, Figure 8 and Figure 9) show that the MGTWR model handles global stationarity and local spatial-temporal non-stationarity together more effectively than the other two models.
3.2. The Real Data Experiments
We tested the performance of the MGTWR model on real-world data by establishing a hedonic price model for housing in Beijing. The hedonic model examines the effects of the characteristics of housing commodities on housing prices [29,30,31,32,33]. Such models regard a house as a composite commodity formed by structural attributes, neighborhood attributes, the age of construction and other attributes, and the price of a property is assumed to be a realization of the value of these attributes. The structural attributes include the housing area, the number of bedrooms, the residential plot ratio, the residential greening ratio and other factors. The neighborhood attributes include the influences of supermarkets, shopping centers, primary schools, gas stations and other factors. We obtained 1961 samples with attributes such as house price, house area, residential plot ratio, residential greening ratio, property management fee, the distance to the nearest primary school, the distance to the nearest shopping mall, age of construction and geographical coordinates [34]. The housing commodity data were provided by the National Bureau of Statistics, and Figure 10 illustrates the distribution of the housing commodity samples.
The descriptions and units of the variables are shown in Table 3. In Table 3, the dependent variable (LnPrice) is the logarithmically transformed sales price of the house in RMB. The housing area, in units of m², is logarithmically transformed as LnFArea. The residential property management fee, in units of RMB/m², is logarithmically transformed as LnPFee. The distance to the nearest primary school, in meters, is logarithmically transformed as LnDpriSchool. The distance to the nearest shopping mall, also in meters, is logarithmically transformed as LnDShMall. The residential plot ratio and the residential greening ratio are logarithmically transformed as LnPRatio and LnGRatio, respectively. The temporal variable is the age of the building at the time of sale (Age), in years.
Before constructing the MGWR and MGTWR models for the real data experiment, it was necessary to determine which variables were stationary and which were non-stationary. We implemented a hypothesis test that assumed that all independent variables were non-stationary and used the F statistic of Leung et al. [35] to detect spatial and spatial-temporal variation in the coefficients [35,36]. Both the optimal bandwidth and the spatial-temporal parameters required to calculate the F values were obtained using the CV method with a Gaussian kernel function. The results yielded an optimal spatial bandwidth of 7700 m and an optimal spatial-temporal parameter ratio of 1,500,000.
Table 4 provides the p-values of the non-stationarity hypothesis tests; values that are statistically significant at the 5% level are marked with an asterisk “*”. According to the spatial non-stationarity test, the residential plot ratio (LnPRatio), the property management fee (LnPFee) and the distance to the nearest shopping mall (LnDShMall) had non-significant spatial variations, and the remaining explanatory variables had significant spatial variations. According to the spatial-temporal non-stationarity test, the property management fee (LnPFee) had non-significant spatial-temporal variations, and the other variables had significant spatial-temporal variations.
Based on the results of the non-stationarity hypothesis tests, we established the MGWR, GTWR and MGTWR models using the Gaussian kernel function in the real data experiment. In the MGWR model, the residential plot ratio (LnPRatio), the property management fee (LnPFee) and the distance to the nearest shopping mall (LnDShMall) were taken as constant variables, and the remaining independent variables were taken as spatially varying variables. In the GTWR model, all the independent variables were taken as spatial-temporal variables. In the MGTWR model, the property management fee (LnPFee) was taken as a constant variable, and the remaining independent variables were taken as spatial-temporal variables. The CV criterion was used to select the optimal bandwidths and spatial-temporal parameters. The optimal bandwidths of the MGWR, GTWR and MGTWR models were 8000 m, 7700 m and 5080 m, respectively, and the optimal spatial-temporal parameter ratios of the GTWR and MGTWR models were 1,500,000 and 212,000, respectively. The leave-one-out cross-validation method was used to avoid overly optimistic results.
Table 5 summarizes the estimated coefficients of the MGTWR model, including the minimum (Min), lower quartile (LQ), mean (Mean), median (Median), upper quartile (UQ), maximum (Max) and standard deviation (SD). Additionally, diagnostic indices of the hedonic price model were used to examine its efficiency, following a method similar to that reported by Wo [17]; i.e., we calculated the MSE, R², adjusted R² and AIC values of the MGWR, GTWR and MGTWR models, as shown in Table 6. In general, high R² and adjusted R² values or low MSE and AIC values indicate a good fit between a model and the sample data. An important characteristic of the MGWR, GTWR and MGTWR techniques is that the estimated local coefficients can be mapped and analyzed visually. Taking the estimated housing area coefficients as an example, we divided the values into five intervals and colored each interval to illustrate the spatial-temporal variation patterns, as shown in
Figure 11.
As shown in Table 6, the mean squared errors of the MGWR, GTWR and MGTWR models were 0.0958, 0.078 and 0.0691, respectively. The MGTWR model yielded a 27.87% improvement over the MGWR model and an 11.41% improvement over the GTWR model. Thus, the MGTWR model exhibited the highest precision of the three models. Moreover, the goodness of fit increased from 0.8135 for the MGWR model to 0.8482 for the GTWR model and 0.8654 for the MGTWR model. Additionally, the AIC value of the MGTWR model was lower by 254.2 than that of the MGWR model and by 128.76 than that of the GTWR model.
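The percentage improvements quoted above are relative reductions in MSE and can be reproduced from the three reported MSE values. The helper name below is illustrative.

```python
def relative_improvement(baseline, candidate):
    """Percentage reduction of `candidate` relative to `baseline`."""
    return 100.0 * (baseline - candidate) / baseline

# Mean squared errors reported in Table 6.
mse = {"MGWR": 0.0958, "GTWR": 0.078, "MGTWR": 0.0691}

gain_vs_mgwr = relative_improvement(mse["MGWR"], mse["MGTWR"])
gain_vs_gtwr = relative_improvement(mse["GTWR"], mse["MGTWR"])
print(f"vs MGWR: {gain_vs_mgwr:.2f}%  vs GTWR: {gain_vs_gtwr:.2f}%")
```

Rounded to two decimals, the two values agree with the 27.87% and 11.41% stated in the text.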
3.3. Discussion
This paper proposes the MGTWR model and tests the efficiency of the MGWR, GTWR and MGTWR models under the following three conditions:
Condition 1: global stationarity and spatial non-stationarity;
Condition 2: spatial-temporal non-stationarity;
Condition 3: global stationarity and spatial-temporal non-stationarity.
First, the MGTWR model is most applicable under Conditions 1 and 3. Under Condition 1 (Dataset 1), the AIC value of the MGTWR model is lower than those of the MGWR and GTWR models by 2.7066 and 45.5628, respectively. Under Condition 3, the AIC value of the MGTWR model is lower than that of the MGWR model by 112.812 (Dataset 3) and 254.20 (real data), and lower than that of the GTWR model by 35.656 (Dataset 3) and 128.76 (real data). Under Condition 2 (Dataset 2), the AIC value of the MGTWR model is 36.368 lower than that of the MGWR model but 38.774 higher than that of the GTWR model. The results indicate that, under this condition, the MGTWR model is superior to the MGWR model but does not outperform the GTWR model. This occurs because the MGTWR model treats spatial-temporally varying coefficients as constant, so the result is not consistent with those obtained under the other conditions.
Second, from the perspective of the estimated spatial-temporal coefficients, the coefficients estimated by the MGTWR model are similar to the true values in the simulated data (Figure 3, Figure 4 and Figure 5; Figure 7, Figure 8 and Figure 9). In addition, as Figure 11 illustrates, in the real data experiment the coefficients of the MGTWR model increase sharply and those of the GTWR model decrease sharply in Haidian District compared to those of the MGWR model, owing to the spatial-temporal variations.
Third, from the perspective of the estimated constant coefficients, the coefficients estimated by the MGTWR and MGWR models are similar to the true values in the simulated data (Figure 1 and Figure 6). When the constants are treated as spatial-temporally varying coefficients, the surface estimated by the GTWR model shows a clear deviation from the true value (Figure 1c and Figure 6c). The real data experiment suggests that, although the F statistic can identify which coefficients are stationary and which are spatial-temporally non-stationary, the GTWR model cannot accommodate the stationary coefficients. Therefore, we proposed a method that divides the explanatory variables into two groups, stationary and spatial-temporally non-stationary variables, and formulated a two-stage least squares estimation for the MGTWR model.
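The two-stage idea can be sketched as follows. This is a minimal illustration patterned on the two-stage procedure commonly used for mixed geographically weighted regression, not a reproduction of the estimator derived in the paper: the kernel and distance forms, the function names and the toy data are all assumptions. Stage one removes the locally varying part from y and from the constant-coefficient columns and estimates the constant coefficients by ordinary least squares; stage two fits the local coefficients to the remaining partial residual.

```python
import numpy as np

def gaussian_weights(coords, focal, bandwidth, tau):
    # Assumed kernel: exp(-d^2 / h^2) with d^2 = du^2 + dv^2 + tau * dt^2.
    diff = coords - focal
    d2 = diff[:, 0] ** 2 + diff[:, 1] ** 2 + tau * diff[:, 2] ** 2
    return np.exp(-d2 / bandwidth**2)

def local_hat_matrix(Xb, coords, bandwidth, tau):
    """Hat matrix S of the purely local regression on the columns of Xb."""
    n = Xb.shape[0]
    S = np.empty((n, n))
    for i in range(n):
        w = gaussian_weights(coords, coords[i], bandwidth, tau)
        XtW = Xb.T * w                            # X_b' W_i, shape (q, n)
        S[i] = Xb[i] @ np.linalg.solve(XtW @ Xb, XtW)
    return S

def fit_mixed(Xa, Xb, y, coords, bandwidth, tau):
    """Two-stage fit: constant coefficients for Xa, local ones for Xb."""
    n = len(y)
    R = np.eye(n) - local_hat_matrix(Xb, coords, bandwidth, tau)
    # Stage 1: least squares on the locally filtered data.
    beta_a = np.linalg.lstsq(R @ Xa, R @ y, rcond=None)[0]
    # Stage 2: local regression on the partial residual.
    resid = y - Xa @ beta_a
    beta_b = np.empty(Xb.shape)
    for i in range(n):
        w = gaussian_weights(coords, coords[i], bandwidth, tau)
        XtW = Xb.T * w
        beta_b[i] = np.linalg.solve(XtW @ Xb, XtW @ resid)
    return beta_a, beta_b

# Toy data (illustrative only): constant intercept 3, one varying slope.
rng = np.random.default_rng(1)
axis = np.linspace(0.0, 12.0, 6)
t, v, u = np.meshgrid(axis, axis, axis, indexing="ij")
coords = np.column_stack([u.ravel(), v.ravel(), t.ravel()])
n = coords.shape[0]
x1 = rng.uniform(-4.0, 4.0, n)
true_slope = 1.0 + coords.sum(axis=1) / 12.0
y = 3.0 + true_slope * x1 + rng.standard_normal(n)
beta_a, beta_b = fit_mixed(np.ones((n, 1)), x1[:, None], y, coords,
                           bandwidth=3.0, tau=1.0)
```

In this toy setting `beta_a` estimates the constant intercept and `beta_b` holds one local slope per location; the bandwidth and τ are fixed by hand here, whereas the experiments select them by cross-validation.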
Finally, the real data experiment indicates that not all explanatory variables are spatially or spatial-temporally non-stationary. At the 5% significance level, the property management fee (LnPFee) did not exhibit significant spatial or spatial-temporal variations, potentially because the growth rate of property management fees in the spatial or spatial-temporal dimension is negligible compared to the growth rate of house prices. This finding is evidence that both global stationarity and spatial-temporal non-stationarity exist in the real world. By considering both constant and spatial-temporally varying coefficients, the MGTWR model achieves more accurate estimation than the MGWR or GTWR models.