Modelling Future Land Surface Temperature: A Comparative Analysis between Parametric and Non-Parametric Methods

Gao, Yukun; Li, Nan; Gao, Minyi; Hao, Ming; Liu, Xue

doi:10.3390/su16188195


by Yukun Gao ¹, Nan Li ², Minyi Gao ³, Ming Hao ³ and Xue Liu ³,⁴,*

¹ School of Computer Engineering, Suzhou Vocational University, Suzhou 215000, China

² Northeast Asia Ecosystem Carbon Sink Research Center (NACC), Key Laboratory of Sustainable Forest Ecosystem Management-Ministry of Education, School of Ecology, Northeast Forestry University, Harbin 150040, China

³ School of Geographic Sciences, East China Normal University, Shanghai 200241, China

⁴ Department of Land Management, Zhejiang University, Hangzhou 310058, China

* Author to whom correspondence should be addressed.

Sustainability 2024, 16(18), 8195; https://doi.org/10.3390/su16188195

Submission received: 16 August 2024 / Revised: 16 September 2024 / Accepted: 18 September 2024 / Published: 20 September 2024


Abstract:

As urban expansion continues, the intensifying land surface temperature (LST) underscores the critical need for accurate predictions of future thermal environments. However, no study has investigated which method can most effectively and consistently predict the future LST. To address this gap, our study employed four methods, namely the multiple linear regression (MLR), geographically weighted regression (GWR), random forest (RF), and artificial neural network (ANN) approaches, to establish relationships between land use/cover and LST. Subsequently, we utilized the relationships established in 2006 to predict the LST for the years 2012 and 2018, validating these predictions against the observed data. Our results indicate that, in terms of fitting performance (R² and RMSE), the methods rank as follows: RF > GWR > ANN > MLR. However, in terms of temporal stability, we observed a significant variation in predictive accuracy, with MLR > GWR > RF > ANN for the years 2012 and 2018. The predictions using MLR indicate that the future LST in 2050, under the SSP2 and SSP5 scenarios, is expected to increase by 1.8 ± 1.4 K and 2.1 ± 1.6 K, respectively, compared to 2018. This study emphasizes the importance of the MLR method in predicting the future LST and offers guidance for future heat mitigation.

Keywords:

land surface temperature; land use/cover; multiple linear regression; random forest approach; future projections

1. Introduction

Urban heat island (UHI) refers to the phenomenon whereby urban areas experience higher temperatures than their surrounding rural areas, which is mainly reflected by the urban–rural disparity in land surface temperature (LST) [1,2,3,4]. The replacement of natural landscapes (e.g., farmland, forest, and grassland) with urban constructions, including pavements, buildings, concrete, and other impervious surfaces in urbanized areas, is recognized as the main factor in creating a UHI [5,6]. The UHI has multiple adverse effects: it contributes to global warming [7], increases the energy demand of cities [5,6,8], and increases heat-related mortality [9]. As urbanization continues globally, it is anticipated that the LST and UHIs will further intensify in the future. As the capital of China, Beijing has experienced significant urbanization, which has markedly increased UHI intensities [10,11]. It is reported that the average summer UHI intensity in Beijing reached 2–3 K during 2003–2018 [12]. Therefore, accurately forecasting the future LST and UHIs is essential for understanding thermal environments and developing strategies to mitigate heat-related risks.

Previous studies have taken two main approaches to predicting or simulating the future temperature and UHIs. The first approach utilizes urban growth modeling and a meteorological model (usually the weather research and forecasting model, known as WRF) to predict the future regional climate [13,14,15,16,17]. These studies first use an urban growth model or land conversion model to simulate the future land use/cover change [13,14,15,16]. The projected land use data are then input into meteorological models, which integrate other physical parameterizations to simulate future temperatures [13,14,16,17]. This approach is typically employed to estimate future climates on a large scale with a coarse spatial resolution. For instance, Cao et al. [13] employed the WRF model in eastern China to estimate the mean temperature with a spatial resolution of 20 km. Similarly, Wang et al. [16] used a land conversion model and the WRF model in the Beijing–Tianjin–Hebei metropolitan area to predict future air temperature at resolutions of 4 km and 20 km for the inner and outer domain, respectively. The coarse resolution of this approach poses challenges for studies requiring detailed urban information, considering the high spatial variations in urban areas.

To achieve a relatively high spatial resolution in thermal environment predictions, the second approach employs empirical formulae. Many studies combine urban growth models with parametric methods to simulate the future regional climate [18,19,20,21,22,23,24,25,26,27,28]. This approach first establishes the historical relationship between land use/cover percentages or indices and the LST using parametric methods such as Pearson correlation [1,20,27], linear regression [2,3,21,22,24,26], geographically weighted regression (GWR) [18], and Markov models [6,19,23,25,28]. Subsequently, an urban expansion model (e.g., cellular automata) is used to simulate future land use/cover [6,19,24,28]. Finally, the future LST is estimated by combining the projected land use/cover data with the historical relationships, under the assumption that these relationships remain constant over time. This approach is widely used by researchers due to the significantly high correlation between land use/cover and the LST observed historically. The simulated future LST has a relatively high spatial resolution, effectively reflecting the urban–rural disparity in the thermal environment.

Compared to the parametric methods, other studies have tried to use non-parametric methods, including machine learning [29,30,31,32,33,34,35,36,37] and deep learning [38,39,40,41,42,43,44,45,46,47,48,49,50,51], to estimate the future LST. Regarding machine learning, the random forest (RF) algorithm is the most commonly used method to build the relationship between landscape pattern indices and the LST [29,31,35,37]. Regarding deep learning, the artificial neural network (ANN) is a very popular approach for predicting future LST patterns [38,39,44,50,51].

Regardless of whether they use parametric or non-parametric methods, these studies aim to better fit the historical relationship between land use/cover data and the LST to more accurately predict future regional thermal environments. However, the comparative effectiveness and temporal stability of these methods have seldom been investigated. Only one study has compared the multiple linear regression and RF approaches for predicting UHIs in Brazil, suggesting that the random forest model outperformed linear regression [52]. However, they did not investigate the temporal stability of the two approaches, which is essential for forecasting studies with large temporal variations. A more rigorous procedure, as suggested by Deilami and Kamruzzaman [18], involves building the relationship at an early stage (e.g., the year 2004 in their case), validating it at a recent stage (e.g., the year 2013), and then carrying out the prediction of the LST or UHIs for a future year (e.g., the year 2023).

In this study, we aim to identify the most suitable approach for predicting future thermal environments with a high fitting accuracy and temporal robustness. We employed the multiple linear regression (MLR), geographically weighted regression (GWR), artificial neural network (ANN), and random forest (RF) approaches to model the association between land use/cover and the LST across different historical periods, thereby evaluating their respective performances. Moreover, we used the relationships established in 2006 by each method to predict the LST based on land use/cover in 2012 and 2018, and compared these predictions with the observed LST in 2012 and 2018. Finally, the optimal method was selected to predict the future thermal environment in Beijing under various urban expansion scenarios.

2. Materials and Methods

2.1. Study Area

Beijing has experienced rapid urbanization in the 21st century. The city exhibits an elevation gradient, being higher in the northwest and lower in the southeast. Previous studies have shown that the intensive urban expansion of Beijing has significantly increased urban heat island (UHI) intensities [10,53]. However, it remains unclear which algorithm may yield the most accurate results for predicting future land surface temperature (LST). Therefore, a comparative analysis of different modeling algorithms for LST estimation is essential.

This study examines an urban area of 65 km × 63 km in Beijing, encompassing the entire Core Functional Zone and the Urban Function Extended Zone. The study area also includes a large part of the New Urban Development Zone and a small part of the Ecological Conservation Zone (Figure 1).

2.2. Data Collection and Preprocessing

The enhanced Spatial and Temporal Adaptive Reflectance Fusion Model (ESTARFM) is an advanced model designed for spatiotemporal adaptive reflectance fusion [54,55]. The ESTARFM integrates high-spatial-resolution images with low temporal resolution and low-spatial-resolution images with high temporal resolution from two observation periods, along with low-spatial-resolution images from the target date. This fusion process generates simulated high-spatial-resolution images corresponding to the desired time, enhancing the temporal and spatial detail of the resulting imagery. We utilized ESTARFM to generate high-spatiotemporal-resolution reflectance data for the study area by integrating Landsat and MODIS imagery. The datasets used in this research are summarized in Table 1. Meanwhile, all datasets were re-projected to a consistent coordinate system and resampled to a spatial resolution of 30 m.

In this study, surface reflectance data from Landsat 5, Landsat 7, and Landsat 8 were preprocessed using the Google Earth Engine (GEE) cloud platform (https://code.earthengine.google.com/ accessed on 22 September 2023). Initially, clouds, cloud shadows, and snow cover were removed from all images. Subsequently, a median algorithm was applied to merge all images for each year, resulting in a composite image representing the corresponding year.
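The compositing step described above can be illustrated with a minimal sketch in plain Python. It is not the GEE workflow used in the study; the function name `median_composite`, the nested-list image layout, and the sample values are hypothetical and serve only to show a per-pixel median taken over clear observations.

```python
from statistics import median


def median_composite(images, masks):
    """Per-pixel median over a stack of images, skipping masked observations.

    images: list of 2-D lists (rows x cols) of reflectance, one per date.
    masks:  list of 2-D lists of booleans; True marks cloud, shadow, or snow.
    Returns a 2-D list; a pixel with no clear observation is None.
    """
    rows, cols = len(images[0]), len(images[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            clear = [img[r][c] for img, m in zip(images, masks) if not m[r][c]]
            out_row.append(median(clear) if clear else None)
        composite.append(out_row)
    return composite


# Hypothetical 1 x 3 scene observed on three dates (values are made up).
images = [[[0.10, 0.20, 0.90]],
          [[0.12, 0.80, 0.95]],
          [[0.11, 0.22, 0.85]]]
masks = [[[False, False, True]],
         [[False, True, True]],
         [[False, False, True]]]
composite = median_composite(images, masks)
```

The median is less sensitive than the mean to residual cloud or haze that the mask misses, which is one common reason for choosing it for annual composites.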

2.3. Classification of Land Use/Land Cover (LULC)

We used Landsat 5, Landsat 7, and Landsat 8 images in GEE to classify LULC in Beijing. Additional spatial layers, such as the Normalized Difference Vegetation Index (NDVI), seasonal NDVI, Normalized Difference Built-up Index (NDBI), Normalized Difference Bareness Index (NDBaI), Normalized Difference Water Index (NDWI), and Digital Elevation Model (DEM), were mosaicked with the original images to provide textural features for machine learning [56,57].
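As an illustration of the index layers listed above, the sketch below computes three of them from single-pixel reflectances. It assumes the widely used band pairings (NDVI from NIR and red, NDBI from SWIR and NIR, and the green/NIR form of NDWI); the text does not state which NDWI variant was used, and NDBaI is omitted because its band definition varies between sensors. The function names and reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir):
    """Common normalized-difference indices from surface reflectance."""
    return {
        "NDVI": normalized_difference(nir, red),    # vegetation
        "NDBI": normalized_difference(swir, nir),   # built-up
        "NDWI": normalized_difference(green, nir),  # open water (assumed variant)
    }


# Hypothetical reflectances for a vegetated pixel and a built-up pixel.
vegetated = spectral_indices(green=0.08, red=0.05, nir=0.45, swir=0.20)
built_up = spectral_indices(green=0.12, red=0.15, nir=0.20, swir=0.30)
```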

A random forest (RF) classifier was run on the GEE platform to classify land-use types, including impervious surface, water bodies, barren land, and other lands. We selected 1500 training samples for LULC classification and 300 validation samples, verified against high-resolution images in Google Earth, to assess classification accuracy. The overall accuracy of the LULC classification in Beijing for the years 2006, 2012, and 2018 was 85.6%, 84.9%, and 87.1%, respectively, indicating high accuracy and suitability for further analysis.

In addition, the shared socioeconomic pathways (SSP) framework has five scenarios. SSP2 is a middle pathway between the sustainable pathway and the regional rivalry pathway. SSP5 describes a fossil-fueled development pathway characterized by rapid global economic growth, accompanied by significant challenges in mitigation efforts [58,59]. To predict the LST under the SSP2 and SSP5 scenarios in 2050, we used a dataset of urban land expansion simulations that represents the distribution of impervious surfaces under the SSPs, while keeping other land use types unchanged (Figure 2) [60].

2.4. Calculating Land Surface Temperature

We used a 3-step hybrid method to generate the summer monthly LSTs for the years 2006, 2012, and 2018 [12]. We first filled the gaps in the daily MODIS LST data using either mean filter or linear regression. In the second step, we fused the red, NIR, and TIR bands of MODIS and Landsat to generate multi-year Landsat-like red, NIR, and TIR data by using ESTARFM, which was developed by Zhu et al. [55]. The application of ESTARFM for estimating Landsat-like land surface temperature (LST) has demonstrated high accuracy in comparison to observed LST data [54]. In the final step, brightness temperatures were calculated from the estimated Landsat-like TIR radiance data. Then, we calculated the surface emissivity through NDVI based on the estimated Landsat-like red band and NIR band. Finally, the LST was calculated using the brightness temperature and surface emissivity. The estimated Landsat-like LST was validated with the observed LST data. For a detailed description of the method, please refer to Liu et al. [12].
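The final step (brightness temperature plus emissivity to LST) can be sketched as follows. The exact equations are given in Liu et al. [12] and are not reproduced in this section, so the code below assumes a commonly used single-channel correction and an NDVI-threshold emissivity; the coefficients, NDVI thresholds, default wavelength, and function names are all assumptions for illustration, not the authors' implementation.

```python
import math

# Second radiation constant h * c / k_B, in metre kelvin.
RHO = 1.438e-2


def emissivity_from_ndvi(ndvi, ndvi_soil=0.2, ndvi_veg=0.5):
    """NDVI-threshold emissivity (assumed coefficients and thresholds)."""
    fraction = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    # Vegetation proportion, clipped to [0, 1] and squared.
    pv = min(max(fraction, 0.0), 1.0) ** 2
    return 0.004 * pv + 0.986


def lst_from_brightness_temperature(bt_kelvin, emissivity, wavelength_m=10.9e-6):
    """Emissivity-corrected surface temperature in kelvin.

    wavelength_m is the effective thermal-band wavelength; 10.9 micrometres
    is an assumed value and differs between sensors.
    """
    correction = (wavelength_m * bt_kelvin / RHO) * math.log(emissivity)
    return bt_kelvin / (1.0 + correction)


eps = emissivity_from_ndvi(0.35)
lst = lst_from_brightness_temperature(300.0, eps)
```

Because emissivity is below 1 for natural surfaces, the logarithm is negative and the corrected LST is slightly higher than the brightness temperature.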

2.5. Selection of Variables for LST Estimation

The thermal infrared bands of Landsat 5 and Landsat 8 have spatial resolutions of 120 m and 100 m, respectively, with a least common multiple of 600 m. To minimize errors associated with the differing spatial resolutions, we used a 600 m resolution raster to obtain various datasets for the three years. The independent variables included the proportions of impervious surface, barren land, and water bodies, as well as the average elevation from the digital elevation model (DEM). The mean LST within each block served as the dependent variable. Previous studies have shown that all of these factors can significantly affect LST [19,23,61]. Then, the proportions of impervious surfaces, barren land, water bodies, and DEM values were normalized to a scale of 0 to 1. The LST values were not standardized and consistently ranged between 290 K and 317 K. The descriptions of independent and dependent variables are given in Table 2.
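The variable preparation described above can be sketched in plain Python: a class map at 30 m is aggregated to blocks (20 × 20 pixels would give 600 m cells), and a predictor is rescaled to the 0 to 1 range. The function names, the tiny 4 × 4 grid, and the elevation values are hypothetical, and the grid is assumed to divide evenly into blocks.

```python
def block_proportion(class_grid, target_class, block=20):
    """Share of `target_class` pixels in each block x block window.

    With 30 m pixels, block=20 corresponds to 600 m cells. The grid
    dimensions are assumed to be exact multiples of `block`.
    """
    rows, cols = len(class_grid), len(class_grid[0])
    out = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            hits = sum(
                1
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
                if class_grid[r][c] == target_class
            )
            out_row.append(hits / (block * block))
        out.append(out_row)
    return out


def min_max_normalize(values):
    """Rescale values linearly to [0, 1]; constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical 4 x 4 class map (1 = impervious, 0 = other), 2 x 2 blocks.
grid = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
impervious_share = block_proportion(grid, target_class=1, block=2)
dem_scaled = min_max_normalize([35.0, 50.0, 110.0])
```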

2.6. LST Modeling Algorithm

Four algorithms commonly used in relevant studies, namely multiple linear regression (MLR), geographically weighted regression (GWR), artificial neural network (ANN), and random forest (RF), were used to quantify the influences of urban form and landform factors on UHI intensity [18,24,62].

Multiple linear regression (MLR) is a statistical method used to examine cause–effect relationships between dependent and independent variables. In this study, we employed the least squares method to construct an MLR model for estimating LST. Spatially varying relationships in the pattern of LST may exhibit nonstationary characteristics. Therefore, we employed GWR to examine the nonstationary spatial relationships between factors and LST, considering spatial autocorrelation in temperature across the study area [18].
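The difference between the two parametric models can be sketched with one weighted least-squares routine: MLR fits a single set of coefficients with equal weights, while a GWR-style fit re-estimates the coefficients at a focal location using distance-decay weights. The Gaussian kernel, the fixed bandwidth, the function names, and the sample data are assumptions for illustration; the section does not state which kernel or bandwidth selection was used.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(features, targets, weights=None):
    """Return [intercept, b1, ...] minimising sum_i w_i (y_i - x_i . b)^2."""
    if weights is None:
        weights = [1.0] * len(targets)  # equal weights give ordinary MLR
    design = [[1.0] + list(row) for row in features]
    p = len(design[0])
    xtwx = [[sum(w * d[i] * d[j] for d, w in zip(design, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * d[i] * y for d, y, w in zip(design, targets, weights))
            for i in range(p)]
    return solve_linear_system(xtwx, xtwy)


def gaussian_kernel_weights(coords, focal, bandwidth):
    """Distance-decay weights around a focal point (assumed Gaussian kernel)."""
    return [math.exp(-0.5 * (math.dist(c, focal) / bandwidth) ** 2)
            for c in coords]


# Hypothetical blocks: impervious share, LST (K), and block centres (m).
features = [[0.0], [0.2], [0.5], [0.8], [1.0]]
lst = [300.0, 302.0, 305.0, 308.0, 310.0]
coords = [(0.0, 0.0), (600.0, 0.0), (1200.0, 0.0), (1800.0, 0.0), (2400.0, 0.0)]

global_coeffs = weighted_least_squares(features, lst)
local_weights = gaussian_kernel_weights(coords, focal=(1200.0, 0.0), bandwidth=1200.0)
local_coeffs = weighted_least_squares(features, lst, local_weights)
```

In this made-up example the data are exactly linear, so the global and local fits coincide; with spatially nonstationary data the local coefficients would differ from block to block.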

An ANN model was also used for LST estimation. An ANN model typically comprises a learning phase and a prediction phase. The learning phase involves establishing correlations between input and output variables by adjusting the network’s weight matrix through iterative training. This adjustment aims to minimize the disparity between predicted and actual output values. Throughout the process, the weights and thresholds of the network remain deterministic. From a structural perspective, the network weights and thresholds correspond to the coefficients and constants in the model. The learning process parallels the process of determining the coefficients and constants of the model [62,63]. We employed a network architecture consisting of two hidden layers, each comprising 100 neurons. The training process was set with a maximum of 1000 iterations and a learning rate of 0.001, while default values were retained for all other hyperparameters.

The advantages of using discrete or continuous datasets—such as reduced noise susceptibility, high efficiency with large datasets, no requirement for prior probability distribution, flexibility, and robustness—have established random forest (RF) as a crucial tool for land cover classification and land surface temperature (LST) estimation [62,63]. We trained an ensemble of 100 decision trees, using one-third of the available independent variables as features for each split. To ensure the reproducibility of results, the random state was set to 42, while all other parameters were maintained at their default settings.

2.7. Evaluation of LST Estimates

In LST modeling research, the R² and RMSE are often used to assess the prediction performance [12,64]:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_{i} - y_{i})^{2}}{n}} \tag{2}$$

where $\hat{y}_{i}$ and $y_{i}$ are the predicted LST and corresponding LST at the sample plot $i$; $\bar{y}$ is the mean LST of the test sample plots (total number of $n$). In general, the higher R² and smaller RMSE values indicate better LST estimation performance. The scatterplots showing the relationships between LST estimates and observed data were also used to evaluate the model performance. We used a total of 12,000 samples for modeling and validation, with 30% of the sample plots randomly selected as test samples.
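Equations (1) and (2) and the 70/30 split can be written out as a short sketch. The function names, the seed, and the four-value example are hypothetical; the text does not say how the random selection was seeded.

```python
import math
import random


def r_squared(observed, predicted):
    """Coefficient of determination, Equation (1)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, Equation (2)."""
    squared = sum((yh - y) ** 2 for y, yh in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed value is an assumption
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# Hypothetical observed and predicted LST values in kelvin.
observed = [300.0, 302.0, 304.0, 306.0]
predicted = [301.0, 302.0, 303.0, 306.0]
train_idx, test_idx = train_test_split_indices(12000)
```

Note that R² as defined in Equation (1) can be negative when a model predicts worse than the mean of the observations, which matters when a relationship fitted in one year is applied to another.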

3. Results

3.1. Evaluation of LST Fitting Results

The performance of the estimations can be assessed with the scatterplots showing the relationships between the LST estimates and reference data (Figure 3a–d). The comparison indicated that the estimations were reasonable for all of MLR (RMSE = 1.88 K), GWR (RMSE = 1.17 K), ANN (RMSE = 1.92 K), and RF (RMSE = 1.41 K). However, Figure 3a indicates that the overestimation and underestimation problems were obvious for the estimated results by MLR.

Figure 3e also presents the distribution of the LST estimation using a box plot. The results indicate that the MLR model exhibits a higher dispersion of low estimation values. In contrast, the LST estimations produced by the GWR, ANN, and RF models are more reliable and consistent. The accuracy assessment quantitatively indicated that the RF and GWR models provide a better performance for the yearly LST estimates. In contrast, the MLR and ANN models exhibited a slightly worse performance in single-year estimation (Table 3).

3.2. Comparative Analysis of Estimated LST over Time

To evaluate the performance of the MLR, GWR, ANN, and RF models in predicting the LST, we applied each model trained in 2006 to predict LSTs for the years 2012 and 2018. Table 4 presents the prediction error indices derived from the independent validation of the LST using the testing dataset. It shows that the linear regression model had the lowest RMSE (1.47 K in 2012 and 2.40 K in 2018), as well as the highest R² (0.79 in 2012 and 0.75 in 2018). The R² and RMSE results clearly confirmed that MLR produced more accurate LST predictions than the GWR, ANN, and RF models, particularly as the prediction horizon increases. The predicted values of the LST by the four regression models for the testing phases are plotted against the observed values in Figure 4. The density scatterplot illustrates that the MLR model outperforms the other three models in predicting the LST.
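The temporal validation procedure (fit once on the base year, then score against later years) can be sketched with a one-variable linear model. All numbers below are made up to show the mechanics and are not the study's data or results; the function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical impervious shares and LST (K) for the same blocks by year.
data = {
    2006: ([0.1, 0.3, 0.5, 0.7], [299.0, 301.0, 303.0, 305.0]),
    2012: ([0.2, 0.4, 0.6, 0.8], [300.5, 302.5, 304.5, 306.5]),
    2018: ([0.3, 0.5, 0.7, 0.9], [302.0, 304.0, 306.0, 308.0]),
}

# The relationship is fitted once, on the base year only.
intercept, slope = fit_line(*data[2006])

# It is then applied unchanged to later land cover and scored per year.
errors = {}
for year in (2012, 2018):
    share, observed = data[year]
    predicted = [intercept + slope * s for s in share]
    errors[year] = rmse(observed, predicted)
```

Comparing `errors` across years, rather than a single in-sample score, is what distinguishes temporal stability from fitting performance.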

Additionally, the four models gave similar spatiotemporal patterns. The spatial distribution of the LST prediction shows that the estimated LST agrees well with the observed LST for most areas (Figure 5). The predicted LST shows a similar spatial pattern and a more extensive coverage compared to the observed LST. Moreover, the LSTs predicted by the four models have similar ranges: from 294 K to 310 K for MLR, from 295 K to 317 K for GWR, from 294 K to 315 K for ANN, and from 294 K to 314 K for RF. In general, sub-districts with a high LST are located in the urban center, whereas sub-districts with a low LST are located in the surrounding suburban areas.

We also spatially compared the LST predicted by the MLR, GWR, ANN, and RF models with the observed LST in 2018 (Figure 6). The absolute differences between the MLR-predicted LST and the observed LST are small in the urban area, while large differences are mainly located in the rural area (Figure 6a). However, the GWR, ANN, and RF models exhibit a systematic bias in prediction: they tend to overestimate the LST in the urban area while underestimating it in the rural area (Figure 6b–d). Figure 6 further confirms that MLR was the best method with which to predict the LST in future scenarios.

3.3. Future LST Prediction in Beijing

The above results indicate that the MLR model demonstrates a superior performance in predicting future LST. Consequently, we employed the MLR model to predict the LST under the SSP2 and SSP5 scenarios in Beijing. Figure 7 indicates that the LST is projected to increase more significantly under SSP5 compared to SSP2 by 2050. On average, over the period that ranges from 2018 to 2050, the LST under SSP2 increased by 1.8 ± 1.4 K, while, under SSP5, it increased by 2.1 ± 1.6 K. Spatial analysis indicates that 21% of pixels under the SSP2 scenario and 26% of pixels under the SSP5 scenario are projected to experience a temperature increase exceeding 1.5 K. It is evident that the intermediate urban–rural areas experience the highest LST increase, with a maximum rise of 7.6 K. However, the increase in LST in urban centers and remote rural areas was not significant.
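Summary statistics of this kind can be derived from two co-registered LST layers as sketched below. The sketch assumes that the ± values denote one standard deviation across pixels, which the text does not state; the function name and the four sample values are hypothetical.

```python
from statistics import mean, pstdev


def summarize_change(lst_base, lst_future, threshold=1.5):
    """Mean, spread, and exceedance share of pixel-wise LST change (K)."""
    diffs = [f - b for b, f in zip(lst_base, lst_future)]
    share = sum(1 for d in diffs if d > threshold) / len(diffs)
    return {
        "mean": mean(diffs),
        "std": pstdev(diffs),  # population standard deviation (assumed)
        "share_above": share,
        "max": max(diffs),
    }


# Hypothetical flattened LST layers in kelvin for the same four pixels.
lst_2018 = [300.0, 302.0, 305.0, 308.0]
lst_2050 = [300.5, 304.0, 308.0, 308.5]
summary = summarize_change(lst_2018, lst_2050)
```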

4. Discussion

4.1. The Spatial Resolution of Future LST Prediction

Remote sensing data, including MODIS and Landsat series data, offer extensive images for inverting the LST, providing superior spatial information compared to air temperature data derived from meteorological stations [10]. Most current studies (30 of 41) used Landsat images to derive the LST with a spatial resolution of 100–120 m [3,7,23,24,37,38,39,45,47,54]. Typically, these studies use one or two Landsat images to represent the thermal environment of a specific year. However, Landsat images are instantaneous and have long revisit cycles (15–16 days), thereby representing the spatial pattern of thermal environments only over short periods [12]. This temporal limitation can introduce uncertainties in the correlation between land use/cover data and the thermal environment due to the high spatial heterogeneity and temporal variation of the LST [65].

The remaining 11 studies used the LST derived from MODIS sensors [29,33,34,40,41,46,50]. For instance, Mathew et al. [2] utilized the 8-day average LST of MYD11A2 from MODIS to represent historical thermal environments. Although these data benefit from high-frequency revisit cycles, their spatial resolution (approximately 1000 m) is too coarse to effectively monitor urban heat islands (UHIs) within cities [12]. Consequently, both Landsat and MODIS LST data have their own limitations in establishing a stable correlation between land use/cover and the LST and in predicting future UHIs. Therefore, there is a critical need for LST data with both a high spatial and temporal resolution, which can better reflect the historical thermal environment and accurately predict the future LST.

Therefore, this study employs the enhanced spatial and temporal adaptive reflectance fusion model to generate a Landsat-like monthly average LST with a spatial resolution of 30 m. This approach facilitates a more realistic and stable correlation between land use/cover and the thermal environment compared to using only Landsat or MODIS images [12].

4.2. The Performance of Fitting Results

The coefficient of determination (R²) between the simulated land surface temperature (LST) and actual LST is a crucial metric for evaluating the performance of various methods in predicting thermal environments. This coefficient is widely used in current studies employing parametric and non-parametric approaches, with higher R² values suggesting a better explanation of the variation in the observed LST [6,19,24,26,29,30,35,49,52,66,67].

In previous studies, a wide range of algorithms have been employed for LST estimation, including multiple linear regression, random forest, support vector machine, and extreme gradient boosting. In this study, we selected four commonly used models, namely the MLR, GWR, ANN, and RF methods, to estimate the LST [18,24,62,63]. Different methods demonstrate varying levels of fitting performance. Parametric models generally exhibit a relatively low degree of fit for thermal environments [18,19,24,26]. For instance, Deilami and Kamruzzaman [18] reported that using GWR to predict the 2013 LST based on the 2004 relationship resulted in an R² of 0.72. Our results using GWR to predict the 2012 and 2018 LST based on the 2006 and 2012 relationship resulted in an R² of 0.73 and 0.70, respectively (Table 4), which is consistent with the study of Deilami and Kamruzzaman [18]. Other studies using linear regression to fit the thermal environment reported R² values from 0.69 to 0.74 [19,24,26]. Our results using linear regression to fit the LST yielded R² values from 0.75 to 0.79 (Table 3), which are slightly higher than those of the above studies. A likely reason is that previous studies used various normalized indices (such as NDVI and NDWI) as independent variables, whereas we used the proportions of different land uses, which suggests that the parameters we selected better reflect the variations in LST. Only one study, which used a linear time series model to estimate the LST, had a superior fitting performance (R² = 0.95) [2]. This superior performance can be attributed to the use of long time series data (2004–2013), providing more detailed information for generating a more accurate empirical formula [2].

In contrast, studies using a non-parametric approach have shown relatively higher correlations [29,30,35,49,67]. For instance, Arunab and Mathew [29] used an extreme gradient boosting model for fitting the LST with a high R² of 0.871. Similarly, Shen et al. [35] utilized the RF model to fit the surface UHI with a high R² of 0.837. Other studies have reported even higher correlations [30,49,67]. Our results using the RF model also show high correlations with an R² of 0.9 to 0.93 (Table 3), which is consistent with the previous studies.

Most of these studies, however, utilized only one method, precluding a direct comparison between traditional statistical methods and machine/deep learning approaches. There is only one study using the MLR and RF approaches to forecast future UHIs, by splitting the dataset into training (80%) and testing (20%) [52]. The results showed that RF had a higher R² than MLR in fitting the daytime and nighttime air temperature [52]. Our results in Section 3.1 are consistent with Oukawa et al. [52], which showed that machine/deep learning approaches exhibit a superior fitting performance compared to traditional statistical methods when fitting the LST for specific years (e.g., 2006, 2012, and 2018 in Table 3).

4.3. Temporal Stability in Prediction Accuracy

Whether using parametric or non-parametric approaches, the fundamental objective is to establish a relationship between the thermal environment and land use/cover during historical periods using specific methodologies. Subsequently, this relationship is utilized alongside simulated future land use/cover data to predict the future thermal environment. Therefore, it is crucial that we validate the temporal stability of the relationships established through these methods. However, most current studies only construct relationships using data from a single specific historical period, comparing the predicted data with the observed thermal environments from the same year [19,23,24,26,29,35]. Finally, they used the relationships to predict the future LST or UHIs without conducting additional temporal validation.

Only one study using GWR in Brisbane, Australia built the relationship in 2004 and validated this relationship using the derived UHI in 2013. The validation showed a good fit with an R² of 0.72; then, the future UHI in 2023 was predicted using this relationship [18]. However, this study still did not address the question of the temporal stability of the empirical formula in predicting the thermal environment over time.

In this study, we compared the temporal variation of the fitting performances among the four methods (Section 3.2). We used the relationships built in 2006 to predict the LST in 2012 and 2018, then validated the accuracy using the observed LST in 2012 and 2018. The results were unexpected. According to our results in Section 3.1, the RF approach has the highest fitting degree in 2006, followed by GWR and MLR, which is consistent with previous studies. However, when using the relationships in 2006 to predict the LST in 2012 and 2018, MLR has the highest fitting degree, followed by the GWR and RF approaches.

A possible explanation is that the relationship between LULC and LST remains relatively stable and approximately linear over the observed period. Despite its inferior performance in 2006 compared to the other three methods, linear regression was able to capture the linear relationship between land use and the LST, allowing it to stand out in the predictions for 2012 and 2018. For GWR, the geographic information weights were more accurate in 2006, but this performance declined in 2012 and 2018 due to environmental changes or shifts in data distribution, as has been observed in previous studies on different topics [68,69]. Random forest exhibits strong fitting capabilities and can capture highly nonlinear relationships within data. The RF approach performed best on the 2006 dataset due to its complexity and strong fitting capability. However, this propensity can lead to overfitting, particularly in situations where the data distribution changes significantly or the amount of data is small. RF may not only learn the underlying patterns in the data but also adapt to noise or idiosyncratic features specific to the training set. Consequently, prediction accuracy can decrease when the model is applied to future LST estimation [70].

We also noted that the fitting and prediction performance of the ANN method were suboptimal, which is consistent with previous research [45,51]. A potential explanation for this suboptimal performance is the simplicity of the data features and the pronounced linear relationship between land use and the LST. In such cases, the inherent complexity of ANN models may be a disadvantage. Contrarily, one study reported a high goodness-of-fit using the ANN method, which can be attributed to their use of input data limited to the latitude, longitude, and LST for specific years. This approach diverges from our study and fails to account for the fact that the LST is significantly influenced by land use and land cover dynamics [50].

Therefore, our findings suggest that linear regression demonstrates a distinct advantage in capturing future thermal environment changes. This advantage becomes more pronounced as the prediction horizon extends.

4.4. Advantages and Limitations

This study conducts both horizontal and vertical comparisons of the typical methods, including the MLR, GWR, RF, and ANN methods, involved in existing empirical formulae for predicting future thermal environments. The primary objective is to determine the most suitable method for this purpose. Unlike previous studies, this research utilizes high-spatiotemporal-resolution LST data derived from a data fusion model, which better captures the spatial distribution characteristics of a stable thermal environment over the long term. This represents a significant advancement over prior research.

In the horizontal comparison of model fit, this study confirms the ranking RF > GWR > ANN > MLR, consistent with the findings of previous studies. However, the empirical formulae generated by these methods are intended to predict thermal environments several years into the future, and no previous study has adequately tested their temporal stability; this study addresses that gap. In the vertical comparison, the ranking is MLR > GWR > RF > ANN, which challenges the existing understanding. These findings provide evidence supporting the use of multiple linear regression (MLR) for predicting future thermal environments.
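
Both rankings can be reproduced from the R² values reported in Table 3 (2006 fit) and in the 2006–2012 row of Table 4, read here as the 2006 models applied to 2012. The short sketch below only re-sorts those published numbers and computes the change in R² between fitting and prediction; the variable and function names are illustrative.

```python
# R² values copied from Table 3 (row 2006) and Table 4 (row 2006–2012).
fit_r2_2006 = {"MLR": 0.78, "GWR": 0.90, "ANN": 0.81, "RF": 0.93}
pred_r2_2012 = {"MLR": 0.79, "GWR": 0.73, "ANN": 0.71, "RF": 0.72}

def rank_by(scores):
    """Return method names sorted from the highest to the lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

fit_rank = rank_by(fit_r2_2006)
pred_rank = rank_by(pred_r2_2012)

# Change in R² when moving from fitting to out-of-year prediction.
r2_change = {m: round(pred_r2_2012[m] - fit_r2_2006[m], 2) for m in fit_r2_2006}

print("fit:    ", " > ".join(fit_rank))   # fit:     RF > GWR > ANN > MLR
print("predict:", " > ".join(pred_rank))  # predict: MLR > GWR > RF > ANN
print("change: ", r2_change)
```

On these numbers, MLR is the only method whose R² does not fall between fitting and prediction, while RF shows the largest drop (0.93 to 0.72).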

Additionally, our results indicate that, by 2050, urban expansion in Beijing will raise the LST by 1.8 ± 1.4 K and 2.1 ± 1.6 K relative to 2018 under the SSP2 and SSP5 scenarios, respectively. For comparison, projections from other studies estimate a general warming of approximately 2–3 °C above current levels by 2050, potentially exceeding 3 °C under the SSP5-RCP8.5 scenario [59,71,72,73]. Because a temperature difference of 1 K equals a difference of 1 °C, the two sets of figures are expressed on the same scale. This suggests that the localized warming effects of urban expansion are significant and should not be overlooked in the broader context of global warming.

This study also has certain limitations. First, it focuses exclusively on specific areas within Beijing, which may restrict the generalizability of the findings to other regions; future research will extend the analysis to more cities to improve the robustness and applicability of the results. Second, the study considers only the influence of land use/cover change on future LST predictions and does not account for the effects of climate change, which may lead to an underestimation of the future LST. Future research will combine climate change and land use change to assess their joint impact on the urban thermal environment and thereby improve LST predictions under future scenarios. Third, the MLR method has limitations of its own in predicting future LST values, such as an insufficient predictive accuracy for high temperatures, which may affect the accuracy of the predictions. Although more sophisticated methods could improve accuracy in the future, the current findings indicate that the predictive capability of MLR remains adequate for simulating future thermal environments.

5. Conclusions

This study is the first to comprehensively compare four common methods used to predict the future LST: multiple linear regression (MLR), geographically weighted regression (GWR), artificial neural network (ANN), and random forest (RF) models. By evaluating how well these methods fit the current-year LST and predict the LST for other years, this study illustrates the advantages of each method. When the LST was fitted against the independent variables for 2006, 2012, and 2018, the RF and GWR methods fitted the data better than MLR in every year, while ANN achieved a higher R² than MLR only in 2006 (Table 3). The overall fitting ranking is consistent with existing research conclusions. However, when the models fitted for one year were used to predict the LST for other years, the predictive performance varied significantly: MLR yielded the best predictions, followed by the GWR, RF, and ANN models. Previous research has not addressed this point, namely that methods used to predict the future LST must not only fit well but also exhibit temporal stability. In addition to determining the most suitable method, we applied MLR to predict the impact of future urban expansion on the urban heat island effect in Beijing. This analysis reveals the influence of urbanization on the living environment of city residents. Although our findings indicate that MLR is the most suitable of the four methods for predicting future thermal environments, MLR has its limitations, such as an insufficient predictive accuracy for high temperatures. Future research could explore the use of linear time series models to estimate the LST in order to enhance the prediction accuracy.

Author Contributions

X.L. conceived and designed the experiments; Y.G. performed the experiments; M.G. and M.H. analyzed the data; Y.G. and N.L. contributed to data collection, organization, and LST estimation modeling; X.L. wrote the manuscript; and each co-author participated in the editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Humanity and Social Science Youth Foundation of Ministry of Education of China (Grant No. 22YJCZH113), the Shanghai Soft Science Program (Grant No. 23692113500), the National Natural Science Foundation of China (Grant No. 42201309), the Fundamental Research Funds for the Central Universities (Grant No. 2021ECNU-HWCBFBLW002), and the Fundamental Research Funds for Suzhou Vocational University (Grant Nos. KY202304008 and 202305000001).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Liu, J.; Zhang, L.; Zhang, Q.; Zhang, G.; Teng, J. Predicting the surface urban heat island intensity of future urban green space development using a multi-scenario simulation. Sustain. Cities Soc. 2021, 66, 102698.
2. Mathew, A.; Sreekumar, S.; Khandelwal, S.; Kaul, N.; Kumar, R. Prediction of Land-Surface Temperatures of Jaipur City Using Linear Time Series Model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 3546–3552.
3. Ramzan, M.; Saqib, Z.A.; Hussain, E.; Khan, J.A.; Nazir, A.; Dasti, M.Y.S.; Ali, S.; Niazi, N.K. Remote Sensing-Based Prediction of Temporal Changes in Land Surface Temperature and Land Use-Land Cover (LULC) in Urban Environments. Land 2022, 11, 1610.
4. Ward, K.; Lauf, S.; Kleinschmit, B.; Endlicher, W. Heat waves and urban heat islands in Europe: A review of relevant drivers. Sci. Total Environ. 2016, 569–570, 527–539.
5. Alavi Panah, S.K.; Kiavarz Moghaddam, M.; Karimi Firozjaei, M. Monitoring Spatiotemporal Changes of Heat Island in Babol City Due to Land Use Changes. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 17–22.
6. Firozjaei, M.K.; Kiavarz, M.; Alavipanah, S.K.; Lakes, T.; Qureshi, S. Monitoring and forecasting heat island intensity through multi-temporal image analysis and cellular automata-Markov chain modelling: A case of Babol city, Iran. Ecol. Indic. 2018, 91, 155–170.
7. Liu, X.; Ming, Y.; Liu, Y.; Yue, W.; Han, G. Influences of landform and urban form factors on urban heat island: Comparative case study between Chengdu and Chongqing. Sci. Total Environ. 2022, 820, 153395.
8. Santamouris, M. Cooling the cities—A review of reflective and green roof mitigation technologies to fight heat island and improve comfort in urban environments. Sol. Energy 2014, 103, 682–703.
9. Zhu, D.; Zhou, Q.; Liu, M.; Bi, J. Non-optimum temperature-related mortality burden in China: Addressing the dual influences of climate change and urban heat islands. Sci. Total Environ. 2021, 782, 146760.
10. Yue, W.; Liu, X.; Zhou, Y.; Liu, Y. Impacts of urban configuration on urban heat island: An empirical study in China mega-cities. Sci. Total Environ. 2019, 671, 1036–1046.
11. Liu, X.; Yue, W.; Zhou, Y.; Liu, Y.; Xiong, C.; Li, Q. Estimating multi-temporal anthropogenic heat flux based on the top-down method and temporal downscaling methods in Beijing, China. Resour. Conserv. Recycl. 2021, 172, 105682.
12. Liu, X.; Zhou, Y.; Yue, W.; Li, X.; Liu, Y.; Lu, D. Spatiotemporal patterns of summer urban heat island in Beijing, China using an improved land surface temperature. J. Clean. Prod. 2020, 257, 120529.
13. Cao, Q.; Yu, D.; Georgescu, M.; Wu, J.; Wang, W. Impacts of future urban expansion on summer climate and heat-related human health in eastern China. Environ. Int. 2018, 112, 134–146.
14. Fu, P.; Weng, Q. Responses of urban heat island in Atlanta to different land-use scenarios. Theor. Appl. Climatol. 2017, 133, 123–135.
15. Lemonsu, A.; Viguié, V.; Daniel, M.; Masson, V. Vulnerability to heat waves: Impact of urban expansion scenarios on urban heat island and heat stress in Paris (France). Urban Clim. 2015, 14, 586–605.
16. Wang, J.; Huang, B.; Fu, D.; Atkinson, P.M.; Zhang, X. Response of urban heat island to future urban expansion over the Beijing–Tianjin–Hebei metropolitan area. Appl. Geogr. 2016, 70, 26–36.
17. Zhuo, H.; Liu, Y.; Jin, J. Improvement of land surface temperature simulation over the Tibetan Plateau and the associated impact on circulation in East Asia. Atmos. Sci. Lett. 2015, 17, 162–168.
18. Deilami, K.; Kamruzzaman, M. Modelling the urban heat island effect of smart growth policy scenarios in Brisbane. Land Use Policy 2017, 64, 38–55.
19. Feng, Y.; Li, H.; Tong, X.; Chen, L.; Liu, Y. Projection of land surface temperature considering the effects of future land change in the Taihu Lake Basin of China. Glob. Planet. Chang. 2018, 167, 24–34.
20. Kumar, S.; Ghosh, S.; Hooda, R.S.; Singh, S. Monitoring and prediction of land use land cover changes and its impact on land surface temperature in the central part of hisar district, Haryana under semi-arid zone of India. J. Landsc. Ecol. 2019, 12, 117–140.
21. Mathew, A.; Sreekumar, S.; Khandelwal, S.; Kaul, N.; Kumar, R. Prediction of surface temperatures for the assessment of urban heat island effect over Ahmedabad city using linear time series model. Energy Build. 2016, 128, 605–616.
22. Mustafa, E.K.; Liu, G.; Abd El-Hamid, H.T.; Kaloop, M.R. Simulation of land use dynamics and impact on land surface temperature using satellite data. GeoJournal 2019, 86, 1089–1107.
23. Nurwanda, A.; Honjo, T. The prediction of city expansion and land surface temperature in Bogor City, Indonesia. Sustain. Cities Soc. 2020, 52, 101772.
24. Rahman, M.T.; Aldosary, A.S.; Mortoja, M.G. Modeling Future Land Cover Changes and Their Effects on the Land Surface Temperatures in the Saudi Arabian Eastern Coastal City of Dammam. Land 2017, 6, 36.
25. Sekertekin, A.; Zadbagher, E. Simulation of future land surface temperature distribution and evaluating surface urban heat island based on impervious surface area. Ecol. Indic. 2021, 122, 107230.
26. Tian, L.; Tao, Y.; Li, M.; Qian, C.; Li, T.; Wu, Y.; Ren, F. Prediction of Land Surface Temperature Considering Future Land Use Change Effects under Climate Change Scenarios in Nanjing City, China. Remote Sens. 2023, 15, 2914.
27. Yang, Y.; Guangrong, S.; Chen, Z.; Hao, S.; Zhouyiling, Z.; Shan, Y. Quantitative analysis and prediction of urban heat island intensity on urban-rural gradient: A case study of Shanghai. Sci. Total Environ. 2022, 829, 154264.
28. Amir Siddique, M.; Dongyun, L.; Li, P.; Rasool, U.; Ullah Khan, T.; Javaid Aini Farooqi, T.; Wang, L.; Fan, B.; Rasool, M.A. Assessment and simulation of land use and land cover change impacts on the land surface temperature of Chaoyang District in Beijing, China. PeerJ 2020, 8, e9115.
29. Arunab, K.S.; Mathew, A. Exploring spatial machine learning techniques for improving land surface temperature prediction. Kuwait J. Sci. 2024, 51, 100242.
30. Chauhan, S.; Jethoo, A.S.; Mishra, A.; Varshney, V. Duo satellite-based remotely sensed land surface temperature prediction by various methods of machine learning. Int. J. Data Sci. Anal. 2023.
31. Han, L.; Zhao, J.; Gao, Y.; Gu, Z. Prediction and evaluation of spatial distributions of ozone and urban heat island using a machine learning modified land use regression method. Sustain. Cities Soc. 2022, 78, 103643.
32. Li, Q.; Zheng, H. Prediction of summer daytime land surface temperature in urban environments based on machine learning. Sustain. Cities Soc. 2023, 97, 104732.
33. Mathew, A.; Sreekumar, S.; Khandelwal, S.; Kumar, R. Prediction of land surface temperatures for surface urban heat island assessment over Chandigarh city using support vector regression model. Sol. Energy 2019, 186, 404–415.
34. Mohammad, P.; Goswami, A. A Spatio-Temporal Assessment and Prediction of Surface Urban Heat Island Intensity Using Multiple Linear Regression Techniques Over Ahmedabad City, Gujarat. J. Indian Soc. Remote Sens. 2021, 49, 1091–1108.
35. Shen, C.; Hou, H.; Zheng, Y.; Murayama, Y.; Wang, R.; Hu, T. Prediction of the future urban heat island intensity and distribution based on landscape composition and configuration: A case study in Hangzhou. Sustain. Cities Soc. 2022, 83, 103992.
36. Sherafati, S.A.; Saradjian, M.R.; Niazmardi, S. Urban Heat Island Growth Modeling Using Artificial Neural Networks and Support Vector Regression: A case study of Tehran, Iran. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2013, 40, 399–403.
37. Zhang, Y.; Liu, J.; Wen, Z. Predicting Surface Urban Heat Island in Meihekou City, China: A Combination Method of Monte Carlo and Random Forest. Chin. Geogr. Sci. 2021, 31, 659–670.
38. Abdullah, S.; Barua, D.; Abdullah, S.M.A.; Rabby, Y.W. Investigating the Impact of Land Use/Land Cover Change on Present and Future Land Surface Temperature (LST) of Chittagong, Bangladesh. Earth Syst. Environ. 2022, 6, 221–235.
39. Ahmad, M.N.; Zhengfeng, S.; Yaseen, A.; Khalid, M.N.; Javed, A. The Simulation and Prediction of Land Surface Temperature Based on SCP and CA-ANN Models Using Remote Sensing Data: A Case Study of Lahore. Photogramm. Eng. Remote Sens. 2022, 88, 783–790.
40. Choe, Y.-J.; Yom, J.-H. Improving accuracy of land surface temperature prediction model based on deep-learning. Spat. Inf. Res. 2019, 28, 377–382.
41. Chung, J.; Lee, Y.; Jang, W.; Lee, S.; Kim, S. Correlation Analysis between Air Temperature and MODIS Land Surface Temperature and Prediction of Air Temperature Using TensorFlow Long Short-Term Memory for the Period of Occurrence of Cold and Heat Waves. Remote Sens. 2020, 12, 3231.
42. Esha, E.J.; Rahman, M.T.U. Simulation of future land surface temperature under the scenario of climate change using remote sensing & GIS techniques of northwestern Rajshahi district, Bangladesh. Environ. Chall. 2021, 5, 100365.
43. Id, M. Simulation and Prediction of Land Surface Temperature (LST) Dynamics within Ikom City in Nigeria Using Artificial Neural Network (ANN). J. Remote Sens. GIS 2015, 5, 1000158.
44. Kafy, A.; MA, I.; Khan, M.; Sarker, M.; Rahman, M. Prediction of Future Land Surface Temperature And Its Impact On Climate Change: A Remote Sensing Based Approach In Chattogram City. In Proceedings of the 1st International Student Research Conference, Dhaka, Bangladesh, 1 April 2020.
45. Kafy, A.A.; Rahman, M.S.; Faisal, A.-A.; Hasan, M.M.; Islam, M. Modelling future land use land cover changes and their impacts on land surface temperatures in Rajshahi, Bangladesh. Remote Sens. Appl. Soc. Environ. 2020, 18, 100314.
46. Khalil, U.; Aslam, B.; Azam, U.; Khalid, H.M.D. Time Series Analysis of Land Surface Temperature and Drivers of Urban Heat Island Effect Based on Remotely Sensed Data to Develop a Prediction Model. Appl. Artif. Intell. 2021, 35, 1803–1828.
47. Khan, M.; Qasim, M.; Tahir, A.A.; Farooqi, A. Machine learning-based assessment and simulation of land use modification effects on seasonal and annual land surface temperature variations. Heliyon 2023, 9, e23043.
48. Maithani, S.; Nautiyal, G.; Sharma, A.; Sharma, S.K. Simulation of Land Surface Temperature Patterns Over Future Urban Areas—A Machine Learning Approach. J. Indian Soc. Remote Sens. 2022, 50, 2145–2162.
49. Oh, J.W.; Ngarambe, J.; Duhirwe, P.N.; Yun, G.Y.; Santamouris, M. Using deep-learning to forecast the magnitude and characteristics of urban heat island in Seoul Korea. Sci. Rep. 2020, 10, 3559.
50. Ranjan, A.; Anand, A.; Patibandla, S.; Verma, S.; Murmu, L. Prediction of Land Surface Temperature Using Artificial Neural Network in Conjunction with Geoinformatics Technology within Sun City Jodhpur (Rajasthan), India. Asian J. Geoinformatics 2018, 17, 14–23.
51. Ullah, S.; Ahmad, K.; Sajjad, R.U.; Abbasi, A.M.; Nazeer, A.; Tahir, A.A. Analysis and simulation of land cover changes and their impacts on land surface temperature in a lower Himalayan region. J. Environ. Manag. 2019, 245, 348–357.
52. Oukawa, G.Y.; Krecl, P.; Targino, A.C. Fine-scale modeling of the urban heat island: A comparison of multiple linear regression and random forest approaches. Sci. Total Environ. 2022, 815, 152836.
53. Peng, J.; Xie, P.; Liu, Y.; Ma, J. Urban thermal environment dynamics and associated landscape pattern factors: A case study in the Beijing metropolitan region. Remote Sens. Environ. 2016, 173, 145–155.
54. Li, Q.; Ding, F.; Wu, W.; Chen, J. Improvement of ESTARFM and its application to fusion of Landsat-8 and MODIS Land Surface Temperature images. In Proceedings of the 2016 4th International Workshop on Earth Observation and Remote Sensing Applications (EORSA), Guangzhou, China, 4–6 July 2016; pp. 33–37.
55. Zhu, X.; Chen, J.; Gao, F.; Chen, X.; Masek, J.G. An enhanced spatial and temporal adaptive reflectance fusion model for complex heterogeneous regions. Remote Sens. Environ. 2010, 114, 2610–2623.
56. Huang, H.; Chen, Y.; Clinton, N.; Wang, J.; Wang, X.; Liu, C.; Gong, P.; Yang, J.; Bai, Y.; Zheng, Y. Mapping major land cover dynamics in Beijing using all Landsat images in Google Earth Engine. Remote Sens. Environ. 2017, 202, 166–176.
57. Li, X.; Zhou, Y.; Asrar, G.R.; Zhu, Z. Creating a seamless 1 km resolution daily land surface temperature dataset for urban and surrounding areas in the conterminous United States. Remote Sens. Environ. 2018, 206, 84–97.
58. Fricko, O.; Havlik, P.; Rogelj, J.; Klimont, Z.; Gusti, M.; Johnson, N.; Kolp, P.; Strubegger, M.; Valin, H.; Amann, M.; et al. The marker quantification of the Shared Socioeconomic Pathway 2: A middle-of-the-road scenario for the 21st century. Glob. Environ. Chang. 2017, 42, 251–267.
59. Kriegler, E.; Bauer, N.; Popp, A.; Humpenöder, F.; Leimbach, M.; Strefler, J.; Baumstark, L.; Bodirsky, B.L.; Hilaire, J.; Klein, D. Fossil-fueled development (SSP5): An energy and resource intensive scenario for the 21st century. Glob. Environ. Chang. 2017, 42, 297–315.
60. Zhuang, H.; Chen, G.; Yan, Y.; Li, B.; Zeng, L.; Ou, J.; Liu, K.; Liu, X. Simulation of urban land expansion in China at 30 m resolution through 2050 under shared socioeconomic pathways. GIScience Remote Sens. 2022, 59, 1301–1320.
61. Sun, R.; Chen, L. How can urban water bodies be designed for climate adaptation? Landsc. Urban Plan. 2012, 105, 27–33.
62. Zhao, W.; Duan, S.-B.; Li, A.; Yin, G. A practical method for reducing terrain effect on land surface temperature using random forest regression. Remote Sens. Environ. 2019, 221, 635–649.
63. Hutengs, C.; Vohland, M. Downscaling land surface temperatures at regional scales with random forest regression. Remote Sens. Environ. 2016, 178, 127–141.
64. Gao, Y.; Lu, D.; Li, G.; Wang, G.; Chen, Q.; Liu, L.; Li, D. Comparative analysis of modeling algorithms for forest aboveground biomass estimation in a subtropical region. Remote Sens. 2018, 10, 627.
65. Weng, Q.; Fu, P.; Gao, F. Generating daily land surface temperature at Landsat resolution by fusing Landsat and MODIS data. Remote Sens. Environ. 2014, 145, 55–67.
66. Mustafa, E.K.; Co, Y.; Liu, G.; Kaloop, M.R.; Beshr, A.A.; Zarzoura, F.; Sadek, M. Study for Predicting Land Surface Temperature (LST) Using Landsat Data: A Comparison of Four Algorithms. Adv. Civ. Eng. 2020, 2020, 7363546.
67. Yun, G.Y.; Ngarambe, J.; Duhirwe, P.N.; Ulpiani, G.; Paolini, R.; Haddad, S.; Vasilakopoulou, K.; Santamouris, M. Predicting the magnitude and the characteristics of the urban heat island in coastal cities in the proximity of desert landforms. The case of Sydney. Sci. Total Environ. 2020, 709, 136068.
68. Liu, C.; Wu, X.; Wang, L. Analysis on land ecological security change and affect factors using RS and GWR in the Danjiangkou Reservoir area, China. Appl. Geogr. 2019, 105, 1–14.
69. Zhang, H.; Guo, L.; Chen, J.; Fu, P.; Gu, J.; Liao, G. Modeling of spatial distributions of farmland density and its temporal change using geographically weighted regression model. Chin. Geogr. Sci. 2013, 24, 191–204.
70. Zamani Joharestani, M.; Cao, C.; Ni, X.; Bashir, B.; Talebiesfandarani, S. PM2.5 Prediction Based on Random Forest, XGBoost, and Deep Learning Using Multisource Remote Sensing Data. Atmosphere 2019, 10, 373.
71. Rogelj, J.; Popp, A.; Calvin, K.V.; Luderer, G.; Emmerling, J.; Gernaat, D.; Fujimori, S.; Strefler, J.; Hasegawa, T.; Marangoni, G. Scenarios towards limiting global mean temperature increase below 1.5 °C. Nat. Clim. Chang. 2018, 8, 325–332.
72. Riahi, K.; Van Vuuren, D.P.; Kriegler, E.; Edmonds, J.; O'Neill, B.C.; Fujimori, S.; Bauer, N.; Calvin, K.; Dellink, R.; Fricko, O. The Shared Socioeconomic Pathways and their energy, land use, and greenhouse gas emissions implications: An overview. Glob. Environ. Chang. 2017, 42, 153–168.
73. Wu, Y.; Guo, J.; Lin, H.; Bai, J.; Wang, X. Spatiotemporal patterns of future temperature and precipitation over China projected by PRECIS under RCPs. Atmos. Res. 2021, 249, 105303.

Figure 1. (a) Beijing in Northern China; (b) location of the study area; and (c) Landsat 8 true color composition image of the study area.

Figure 2. Spatial distribution of LULC in Beijing in 2050 under (a) SSP2 and (b) SSP5 scenarios.

Figure 3. The relationships between LST estimates for (a) MLR, (b) GWR, (c) ANN, and (d) RF against the reference values. Black lines are linear regression lines, and red dotted lines represent the 1:1 line. (e) Box plots of LST estimates generated by four different models.

Figure 4. The relationships between LST predictions from different models against the reference values. (a,b) represent the prediction of the LST in 2012 and 2018 using 2006 MLR models, respectively. (c,d) represent the prediction of the LST in 2012 and 2018 using 2006 GWR models, respectively. (e,f) represent the prediction of the LST in 2012 and 2018 using 2006 ANN models, respectively. (g,h) represent the prediction of the LST in 2012 and 2018 using 2006 RF models, respectively. Black lines are linear regression lines, and red dotted lines represent the 1:1 line.

Figure 5. The spatial pattern of the reference LST (a), the predicted LST by MLR (b), the predicted LST by GWR (c), the predicted LST by ANN (d), and the predicted LST by RF (e).

Figure 6. The spatial pattern of the absolute difference between the predicted and observed LSTs in 2018 using the model in 2006 for MLR (a), GWR (b), ANN (c), and RF (d).

Figure 7. Spatial distribution of the absolute difference between the predicted LST in 2050 and observed LST in 2018 under SSP2 (a) and SSP5 (b) scenarios. ∆ LST means the LST difference between 2050 and 2018.

Table 1. Characteristics of the three key satellite data inputs.

Year	Landsat and daily MODIS reference data, first pair (t_m)	Landsat and daily MODIS reference data, second pair (t_n)	8-day daytime MODIS at predicted time (t_p), June	8-day daytime MODIS at predicted time (t_p), July	8-day daytime MODIS at predicted time (t_p), August
2006	29 October 2005	28 May 2007	06/10–06/17	07/20–07/27	08/13–08/20
2012	26 July 2011	29 April 2014	06/09–06/16	07/11–07/18	08/20–08/27
2018	04 August 2018	17 October 2018	06/10–06/17	07/28–08/04	08/13–08/20

Note: Landsat 5 was used from 2005 to 2013, and Landsat 8 was used from 2014 to 2018.

Table 2. Description of dependent and independent variables.

Variables	Description	Calculation
Percentage of impervious surface (%)	The ratio of impervious surface area to block area	PIS = A_i/A_b, where A_i is the area of impervious surfaces in the block and A_b is the block area.
Percentage of barren land (%)	The ratio of barren land area to block area	PBL = A_bl/A_b, where A_bl is the area of barren land in the block and A_b is the block area.
Percentage of water (%)	The ratio of water body area to block area	PW = A_w/A_b, where A_w is the area of water in the block and A_b is the block area.
Average DEM	The mean DEM value within the block	$A_{d} = \sum_{i=1}^{n} V_{di} / n$, where V_di is the DEM value of pixel i and n is the number of pixels in the block.
Average LST	The mean LST value within the block	$A_{t} = \sum_{i=1}^{n} V_{ti} / n$, where V_ti is the LST value of pixel i and n is the number of pixels in the block.
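
As a minimal sketch of how the Table 2 variables could be computed for a single block, the example below assumes equal-area pixels, so that each area ratio reduces to a pixel-count ratio, and expresses the ratios as percentages. The class codes, the function name `block_variables`, and the sample pixel values are hypothetical; the paper does not specify its LULC encoding.

```python
from collections import Counter

# Hypothetical LULC class codes (not taken from the paper).
OTHER, IMPERVIOUS, BARREN, WATER = 0, 1, 2, 3

def block_variables(lulc, dem, lst):
    """Compute the Table 2 variables for one block.

    lulc, dem and lst are equal-length sequences with one value per pixel.
    Assumes equal-area pixels, so area ratios equal pixel-count ratios.
    """
    n = len(lulc)
    if n == 0 or not (len(dem) == len(lst) == n):
        raise ValueError("inputs must be non-empty and of equal length")
    counts = Counter(lulc)
    return {
        "PIS": 100.0 * counts[IMPERVIOUS] / n,  # percentage of impervious surface
        "PBL": 100.0 * counts[BARREN] / n,      # percentage of barren land
        "PW": 100.0 * counts[WATER] / n,        # percentage of water
        "mean_DEM": sum(dem) / n,               # A_d in Table 2
        "mean_LST": sum(lst) / n,               # A_t in Table 2
    }

# Made-up block of eight pixels (elevation in m, LST in K).
example = block_variables(
    lulc=[1, 1, 1, 2, 3, 0, 0, 1],
    dem=[40, 42, 41, 43, 39, 40, 44, 41],
    lst=[310.0, 311.0, 309.5, 308.0, 302.5, 305.0, 304.5, 310.5],
)
print(example)
```

For this made-up block, four of the eight pixels are impervious, giving PIS = 50%, and the mean LST is 307.625 K.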

Table 3. Summary of accuracy assessment based on different algorithms.

Year	MLR R²	MLR RMSE	GWR R²	GWR RMSE	ANN R²	ANN RMSE	RF R²	RF RMSE
2006	0.78	1.88	0.90	1.17	0.81	1.9	0.93	1.41
2012	0.79	1.47	0.88	1.18	0.72	1.7	0.92	1.42
2018	0.75	1.49	0.85	1.20	0.70	1.7	0.90	1.46

Table 4. Accuracy assessment of LST prediction based on different algorithms.

Year	MLR R²	MLR RMSE	GWR R²	GWR RMSE	ANN R²	ANN RMSE	RF R²	RF RMSE
2006–2012	0.79	1.47	0.73	1.87	0.71	2.15	0.72	1.85
2012–2018	0.75	2.40	0.70	2.97	0.70	2.58	0.69	2.69

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
