4.1. Performance Evaluation of the MLP-GTWR Method Evaluated with MODIS LST
Figure 3 compares the MLP-GTWR LST (not bias-corrected) with the 1-km MODIS LST for clear pixels across the entire study area. After the MLP and GTWR processes, the overall mean bias, MAE, RMSE, and R2 between the two LSTs were 1.17 K, 2.15 K, 2.99 K, and 0.91, respectively, and the standard deviations of the bias, MAE, RMSE, and R2 were 0.87 K, 0.72 K, 0.85 K, and 0.05, respectively. An RMSE below 3 K and an R2 above 0.9 in most regions indicate close agreement between the MLP-GTWR LST and the MODIS LST under clear-sky conditions.
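The four agreement metrics reported above could be computed along the following lines. This is a minimal sketch, not the study's code: it assumes that the bias is the mean of estimate minus reference and that R2 is the squared Pearson correlation, and the function name `agreement_metrics` and the sample values are hypothetical.

```python
import numpy as np

def agreement_metrics(estimate, reference):
    """Return bias, MAE, RMSE, and R2 between two LST arrays (in K).

    Assumed definitions (not confirmed by the source):
    bias = mean(estimate - reference), R2 = squared Pearson correlation.
    """
    est = np.asarray(estimate, dtype=float)
    ref = np.asarray(reference, dtype=float)
    # Keep only pixels where both products have a value (e.g. clear-sky pixels).
    valid = np.isfinite(est) & np.isfinite(ref)
    diff = est[valid] - ref[valid]
    r = np.corrcoef(est[valid], ref[valid])[0, 1]
    return {
        "bias": float(diff.mean()),
        "mae": float(np.abs(diff).mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
        "r2": float(r ** 2),
    }

# Hypothetical values; NaN marks a pixel without a MODIS retrieval.
modis = np.array([290.0, 295.0, np.nan, 300.0, 305.0])
mlp_gtwr = np.array([291.0, 297.0, 299.0, 300.0, 306.0])
metrics = agreement_metrics(mlp_gtwr, modis)
```

Applying such a function pixel by pixel over the time dimension would give per-pixel maps like those in Figure 3, whose spatial mean and standard deviation correspond to the summary values quoted above.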
The spatial patterns of the four metrics showed that performance varied in space in a manner that was not strongly related to the clear-day percentage of MODIS LST or to the land cover type (Figure 1 and Figure 3). Although satisfactory performance and high correlation were found in most of South China, three regions showed relatively higher errors: the southwestern Hengduan Mountains, the Taiwan Mountains, and areas near river channels or lakes. Some studies have reported that strong topographic effects on the radiative environment induce large retrieval biases in PMW BTs over high-mountain regions [58,59]. Additionally, the largest uncertainty in the MODIS LST retrieval (i.e., MYD11A1) stems from the use of a fixed surface emissivity determined by the land cover type [48]. In mountainous regions, however, snow and plant cover are quite dynamic and can be strongly seasonal, leading to high uncertainty in estimates of surface emissivity and, in turn, LST.
For land surfaces with a high proportion of water (e.g., lakes and flooded soil), the LST retrieval based on PMW BTs was expected to perform worse because of the lower BTs and higher polarization differences of water [60]. Consequently, large discrepancies between the AMSR2 and MODIS observations would be expected over those three regions, resulting in lower accuracy of the generated MLP-GTWR LST than in other areas. It was therefore necessary to correct the systematic biases of the MLP-GTWR LST using the bias correction method before merging it with the MODIS LST.
4.2. Performance Evaluation of the MLP-GTWR Method Evaluated with In Situ Measurements
The scatterplots in Figure 4 compare the MLP-GTWR LSTcor and the MODIS LST against the in situ LST measurements at four flux towers. The all-weather MLP-GTWR LSTcor was divided into clear-sky and cloudy-sky groups, as determined by the MODIS clear-sky LST. Overall, the in situ LST agreed well with the MODIS LST at the four sites, with MAE, RMSE, and R2 in the ranges of 2.15–2.95 K, 2.72–3.45 K, and 0.88–0.95, respectively. These metrics for the flux-based in situ LST were comparable to those reported in the literature on MODIS LST validation [25,48], indicating that the in situ data have good spatial representativeness for evaluating the 1-km gridded LST. For the MLP-GTWR LSTcor under clear-sky conditions, the MAE, RMSE, and R2 were in the ranges of 2.29–2.85 K, 2.83–3.44 K, and 0.87–0.96, respectively. These results confirm that the performance of the MLP-GTWR LSTcor under clear-sky conditions is comparable to that of the MODIS LST at all flux sites.
For the cloudy-sky groups of the MLP-GTWR LSTcor, the MAE, RMSE, and R2 ranged from 2.29 K to 2.65 K, from 2.92 K to 3.24 K, and from 0.82 to 0.92, respectively. Overall, the performance of the all-weather MLP-GTWR LSTcor against the in situ measurements at the four flux towers was similar under clear-sky and cloudy-sky conditions. The RMSE of the MLP-GTWR LSTcor under cloudy-sky conditions was slightly lower than that under clear-sky conditions at the Dinghushan, Taoyuan, and Xishuangbanna flux stations. This can be attributed to the lower thermal spatial heterogeneity under cloudy skies resulting from lower shortwave radiation, which reduces the uncertainty arising from the scale mismatch between the flux-based observation and a 1-km pixel in validation.
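The clear-sky/cloudy-sky stratification used in this validation can be sketched as follows. The sketch assumes that a sample counts as clear-sky whenever a MODIS LST value is present and as cloudy-sky otherwise; the helper names `rmse` and `split_by_sky` and all sample values are hypothetical and serve only as illustration.

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two equally shaped arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def split_by_sky(all_weather, modis, in_situ):
    """RMSE of the all-weather LST against in situ LST per sky condition.

    Assumption: a sample is 'clear' when a MODIS LST exists (finite value)
    and 'cloudy' when it is missing (NaN).
    """
    all_weather = np.asarray(all_weather, dtype=float)
    modis = np.asarray(modis, dtype=float)
    in_situ = np.asarray(in_situ, dtype=float)
    clear = np.isfinite(modis)
    paired = np.isfinite(all_weather) & np.isfinite(in_situ)
    result = {}
    for name, mask in (("clear", clear & paired), ("cloudy", ~clear & paired)):
        value = rmse(all_weather[mask], in_situ[mask]) if mask.any() else float("nan")
        result[name] = {"n": int(mask.sum()), "rmse": value}
    return result

# Hypothetical time series at one site (K); NaN = no MODIS retrieval.
in_situ = [290.0, 292.0, 294.0, 296.0, 298.0, 300.0]
modis = [291.0, np.nan, 293.0, np.nan, np.nan, 301.0]
all_weather = [292.0, 293.0, 293.0, 298.0, 297.0, 300.0]
by_sky = split_by_sky(all_weather, modis, in_situ)
```

Reporting the sample count alongside each RMSE helps when one group is much smaller than the other, as may happen at sites with persistent cloud cover.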
In addition, separate validations of the MLP-GTWR LSTcor for daytime and nighttime on clear and cloudy days offer a better understanding of the performance of the MLP-GTWR method (Table 2). The statistical metrics for daytime and nighttime are both similar to those shown in Figure 4, supporting the above finding that the accuracy of the MLP-GTWR LSTcor did not differ significantly from that of the MODIS LST during either the day or the night.
In addition to this broad-area evaluation, the MLP-GTWR LSTcor was compared with the bias-corrected 0-cm temperature at ten AMS sites. Statistics are listed in Table 3. Based on the AMS temperature, the performance of the MLP-GTWR LSTcor under cloudy-sky conditions was worse than that under clear-sky conditions, which differed from the results at three of the flux stations. It is worth noting that the AMS instruments sample a single point, and such measurements may fluctuate more than a spatially averaged flux tower footprint or a satellite retrieval if rain, snow, or strong wind is present under cloudy conditions. This effect may exaggerate the scale-mismatch issue between the AMS observations and the 1-km LST, causing higher uncertainty in the validation.
Among the AMS sites, seven had an RMSE difference between clear-sky and cloudy-sky conditions of less than 0.5 K, while one site covered by savannas and both urban sites had larger RMSE differences of 0.68 K, 0.66 K, and 1.09 K, respectively. A likely reason is that microwave radiometry is often affected by radio frequency interference (RFI) over urban areas, which induces large uncertainty in the AMSR2 BTs and hence in the MLP-GTWR LSTcor [61]. Taken together, the evaluations against in situ measurements at the four flux towers and ten AMS sites show that the proposed MLP-GTWR method can generate all-weather LST estimates reasonably well.
As our proposed method consists of two major processes (i.e., MLP and GTWR) and one correction step, it sequentially produces three LST products at two spatial resolutions: the 10-km MLP LST, the 1-km MLP-GTWR LST, and the bias-corrected MLP-GTWR LSTcor. These LST products were compared against the in situ LST measurements to illustrate how the errors change between steps (Figure 5). Note that only the in situ LST from the four flux towers was used as the reference, to ensure better reliability. From the 10-km MODIS LST to the 10-km MLP LST in the MLP process, the MAE, RMSE, and R2 changed from 1.98–2.41 K, 2.43–2.92 K, and 0.90–0.96 to 2.20–2.38 K, 2.77–3.02 K, and 0.88–0.94, respectively. Because the in situ LST may not be suitable for comparison with the 10-km LST owing to the large difference in spatial scale, the 0.1° ERA5-Land LST product was also used as a reference (Table 4). The results against the ERA5-Land LST showed a similar change in the MAE, RMSE, and R2, from 1.64–3.49 K, 2.11–4.00 K, and 0.88–0.96 to 1.61–3.80 K, 2.14–4.54 K, and 0.84–0.94. The MLP LST thus had a larger error than the MODIS LST. However, considering that the MLP LST is available under all-weather conditions, the increase in error introduced by the MLP process remains at an acceptable level.
The GTWR process downscaled the 10-km MLP LST to the 1-km MLP-GTWR LST and filled the swath gaps. The comparison showed that this process made no significant changes in the errors at the Qianyanzhou and Taoyuan sites, whereas the errors increased at the Dinghushan and Xishuangbanna sites according to the MAE, RMSE, and R2 (Figure 5). Using Dinghushan as an example, the MAE increased from 2.21 K to 2.53 K, the RMSE increased from 2.92 K to 3.10 K, and the R2 decreased from 0.88 to 0.83. Dinghushan and Xishuangbanna are both located in tropical areas covered with rainforest, where the FVC and LAI vary little over the year. This may explain why the GTWR model performed worse in predicting the fine-scale LST at these sites. Overall, the error introduced by the GTWR process was small, indicating satisfactory performance of the GTWR method in downscaling and gap-filling.
In the bias correction, the MLP-GTWR LST was corrected toward the MODIS LST rather than the in situ data, which brought the bias of the MLP-GTWR LSTcor closer to that of the MODIS LST (Figure 5). The bias correction yielded no significant improvement at the Qianyanzhou, Dinghushan, and Taoyuan sites, perhaps because of the low bias and deviation between the clear-sky LSTs in Equations (7) and (9) (Figure 3). However, the bias correction was intended to improve the agreement between the MLP-GTWR and MODIS LST over regions with large discrepancies (see Section 4.1 and Figure 3), especially the Hengduan and Taiwan mountains. Evaluating our approach over these regions is very challenging because of the lack of suitable in situ measurements that are representative of the gridded LST. Moreover, applying the bias correction made it more reasonable to subsequently merge the MLP-GTWR LSTcor and the MODIS LST to generate the best 1-km all-weather LST.
4.4. Temporal Variability in the MLP-GTWR LSTcor
As the observation periods at the four flux stations all included the year 2014, the time series of the MLP-GTWR LSTcor and the in situ measurements are shown for 2014 (Figure 8). The temporal variation of the MLP-GTWR LSTcor closely followed that of the in situ measurements and the MODIS LST in both daytime and nighttime at the different stations. In most cases, the MLP-GTWR LSTcor captured the sudden LST changes seen in the in situ LST when the MODIS LST was missing, which were associated with precipitation events. However, in some extreme situations, particularly within a single day (e.g., when short-term precipitation occurs and the AMSR2 data are missing), the LST image could exhibit a pattern different from that of neighboring pixels in space and time. In such cases, there was not enough information to retrieve the missing LST without introducing more uncertainty. Using the LST or radiation output from an LSM may be an alternative to the MLP LST for achieving better LST retrievals over orbit gaps in these cases.
Furthermore, the performance of the MLP-GTWR LSTcor in daytime and nighttime varied among sites (subplots in Figure 8). For example, the MAE was lower for daytime than for nighttime at Qianyanzhou, but the opposite occurred at Xishuangbanna. In terms of the standard deviations, the accuracy showed no evident difference between seasons in most cases. The performance of the MLP-GTWR LSTcor in summer was comparable to, and in some cases better than, that in other seasons, although the MODIS LST had the fewest valid observations during summer. This indicates that the performance of the MLP-GTWR method is not strongly affected by the uneven temporal distribution of the MODIS LST. Some large biases between the MLP-GTWR LSTcor and the in situ measurements exist at some sites and in some seasons, for example, in summer at Qianyanzhou, in autumn at Dinghushan, and in autumn/winter at Xishuangbanna in 2014. However, relative to the clear-sky MODIS LST (the target in this study), our modeled LST had a smaller bias than relative to the ground LST in most cases. When remote sensing-based LST is compared with in situ measurements, the monthly/seasonal mean bias between them varies among sites and over time (Duan et al., 2019). In addition, an obvious bias was found between the MLP-GTWR LSTcor and the in situ measurements from the 230th to the 310th day of 2014, but a similar phenomenon was not found in 2015 (Appendix A Figure A4). Overall, the analysis of the spatio-temporal variability of the all-weather MLP-GTWR LSTcor confirms the stability and reliability of the MLP-GTWR approach over South China.
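A seasonal breakdown of the bias such as the one discussed above could be derived from a daily time series roughly as follows. This is an illustrative sketch only: it assumes meteorological seasons (spring = March to May, and so on), which the text does not specify, and the function name `seasonal_bias` and the sample values are hypothetical.

```python
import numpy as np

# Assumed season definition (meteorological seasons, Northern Hemisphere).
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}

def seasonal_bias(dates, estimate, reference):
    """Mean and standard deviation of (estimate - reference) per season."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    # Months since 1970-01, folded to calendar month numbers 1..12.
    months = dates.astype("datetime64[M]").astype(int) % 12 + 1
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    out = {}
    for name, season_months in SEASONS.items():
        mask = np.isin(months, season_months) & np.isfinite(diff)
        if mask.any():
            out[name] = (float(diff[mask].mean()), float(diff[mask].std()))
        else:
            out[name] = (float("nan"), float("nan"))
    return out

# Hypothetical samples for one site in 2014 (K).
dates = ["2014-01-15", "2014-04-15", "2014-07-15", "2014-07-16", "2014-10-15"]
estimate = [285.0, 295.0, 305.0, 306.0, 297.0]
reference = [284.0, 295.0, 303.0, 302.0, 298.0]
bias_by_season = seasonal_bias(dates, estimate, reference)
```

Running the same function once with the in situ LST and once with the clear-sky MODIS LST as the reference would allow the two sets of seasonal biases to be compared side by side.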