Disentangling Climatic and Surface-Physical Drivers of the Urban Heat Island Using Explainable AI Across U.S. Cities

Aljarrah, Osama A. B.; Goulias, Dimitrios

doi:10.3390/su18083694

by Osama A. B. Aljarrah * and Dimitrios Goulias

Department of Civil and Environmental Engineering, University of Maryland, College Park, MD 20742, USA

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(8), 3694; https://doi.org/10.3390/su18083694

Submission received: 9 March 2026 / Revised: 29 March 2026 / Accepted: 6 April 2026 / Published: 8 April 2026

(This article belongs to the Special Issue Climate-Responsive Strategies for Sustainable Infrastructure)

Abstract

Urban Heat Islands (UHIs) are widely analyzed using Land Surface Temperature (LST), yet most studies remain limited to single cities, rely on a single machine-learning model, analyze LST alone, and use inconsistent Surface Urban Heat Island Intensity (SUHII) definitions, which restrict cross-city comparability and broader generalization. This study introduces an explainable artificial intelligence (XAI) framework implemented in Google Earth Engine (GEE) to analyze census-tract summer surface heat (2018–2024) across eight climatically contrasting U.S. cities. The main novelty is a standardized tract-scale cross-city framework that jointly models LST and SUHII using a consistent SUHII definition, a common physical predictor set, city-held-out nested cross-validation, and SHAP-based interpretation, allowing absolute surface heat to be distinguished from relative within-city heat anomaly; this combination is rarely implemented within a single urban heat study. Multiple machine-learning models were evaluated, with ensemble trees performing best: Extreme Gradient Boosting (XGBoost) best predicted SUHII (R² = 0.879; RMSE = 0.213), while Extra Trees best predicted LST (R² = 0.908; RMSE = 0.745 °C). SHapley Additive exPlanations (SHAP) indicate that SUHII is driven primarily by impervious surface fraction and surface moisture availability, whereas LST is structured by latitude and mean summer air temperature. Overall, the framework provides interpretable multi-city attribution of urban surface heat drivers with demonstrated cross-city generalization.

Keywords: surface urban heat island intensity (SUHII); land surface temperature (LST); explainable artificial intelligence (XAI); Google Earth Engine (GEE); SHapley Additive exPlanations (SHAP); cross-city generalization; machine learning modeling

1. Introduction

Urban heat islands (UHIs) reflect an urban–rural temperature contrast created when urbanization shifts the surface energy balance—most commonly through reduced evaporative cooling, greater heat storage in built materials, changed radiative exchange, and added anthropogenic heat sources (e.g., from buildings and cooling) [1]. Satellite thermal infrared observations quantify the surface urban heat island intensity (SUHII) by measuring land surface temperature (LST), where SUHII is defined as the difference between a local surface temperature and an appropriate background reference, enabling spatially continuous mapping of urban thermal patterns that is difficult to achieve with sparse ground stations [2,3]. However, LST is not air temperature: surface temperatures are typically higher than canopy-layer air temperatures, and their spatiotemporal patterns can differ; the LST–air temperature relationship is empirical and context-dependent, relying on surface properties, atmospheric conditions, and solar geometry, which limits direct interpretation of LST as human exposure without additional modeling or in situ data [2,4]. In this study, this distinction is central: LST is used to represent absolute surface thermal conditions, whereas SUHII is used to represent relative within-city thermal anomaly against a city-specific background.

A growing body of evidence shows that urban heat exposure is structured by social and racial inequality within cities, rather than being randomly distributed. Recent tract- and block-group-scale studies in U.S. cities show that lower income, greater social vulnerability, and racialized disadvantage are often associated with higher LST and lower vegetation cover, although the strength and timing of these relationships vary across cities [5,6]. Despite the consistent direction of evidence, cross-city inference remains limited by weak methodological comparability in UHI measurement. Review evidence shows that SUHII varies with the choice of reference areas, spatial resolution, and analytical approach, and that the lack of consistent definitions and methods across studies constrains meaningful comparison across cities without explicit standardization [7].

In parallel, machine-learning (ML) models are increasingly used to analyze and predict urban thermal patterns by leveraging diverse inputs such as land cover, vegetation, built-environment characteristics, and, in some cases, socio-environmental variables [7,8]. Recent studies show that ML has become a major analytical direction in UHI research, with applications spanning prediction, driver attribution, mitigation assessment, and optimization; however, model transferability, predictor comparability, and interpretability remain recurrent limitations in the literature [9,10,11]. In particular, many ML applications provide limited insight into how individual predictors shape model outcomes. Explainable artificial intelligence (XAI) methods such as SHapley Additive exPlanations (SHAP) address this limitation by attributing model predictions to individual input features, allowing both local and global interpretation of model behavior and supporting transparent decision-making [12]. Recent studies show that SHAP can identify non-linear thresholds, heterogeneous predictor effects, and context-dependent relationships in urban heat analyses; nevertheless, these applications remain concentrated mainly in individual cities or specific climatic settings rather than standardized cross-city designs [13,14,15,16,17].

Although ML is increasingly used to analyze urban heat patterns, clear limitations remain for cross-city comparison and for evaluating neighborhood-scale heat disparities. Most studies apply ML models within single cities or limited regions, which prevents direct comparison of heat drivers across different climatic settings and urban forms [7,10,13]. Cross-city interpretation is further constrained by inconsistent definitions of SUHII. Across large multi-city samples, SUHII is sensitive to the background-reference definition (e.g., buffer size, urban extent, rural vs. suburban references), and these choices can produce material differences in SUHII magnitude and spatial variability, undermining cross-city comparability without standardized methods [18]. While recent multi-city ML studies show that urban heat can be analyzed across many cities at once [19], they usually produce one value per city, most often calculated as the temperature difference between the central business district (CBD) and nearby rural areas. Because these studies work at the city scale and do not use neighborhood-level temperature data, they cannot separate regional climate effects from local within-city processes and cannot explain within-city heat inequality. Recent regional studies also confirm that even when ML is extended beyond a single city, predictor sets, validation strategies, and heat metrics often remain heterogeneous, which limits direct methodological comparison and broader generalization [8,9].

This study addresses these limitations by developing an XAI framework implemented in Google Earth Engine (GEE) to quantify and interpret census-tract-level summer (June–August) LST and SUHII across eight U.S. cities representing contrasting climates and urban forms (see Table 1). A common predictor set is used to represent surface moisture, impervious surface fraction, radiative properties, nighttime activity intensity, proximity to water, spatial context, and background climate, consistent with prior studies [3,16]. The main novelty and methodological contribution of this study is not simply that it applies ML across multiple cities, but that it integrates within one tract-scale framework a standardized SUHII definition, a harmonized predictor set across all cities, city-held-out nested cross-validation to test geographic generalization, and SHAP-based interpretation to compare the drivers of LST and SUHII. Prior studies have addressed some of these elements separately, including multi-city ML modeling, SUHII sensitivity analysis, and SHAP-based urban heat interpretation, but have rarely combined them within one consistent tract-scale design [2,9,10,13,14,18,19]. ML models are trained and evaluated using nested, city-based cross-validation (CV) for two outcomes: LST and SUHII. LST is used here as a measure of absolute surface heat, reflecting the combined influence of background climate and tract-scale surface and urban characteristics. In contrast, SUHII is used as a measure of relative within-city heat anomaly, expressing each tract’s temperature relative to its city’s regional background and enabling comparison independent of regional climate [3]. In this study, SUHII is defined using the mean and standard deviation of LST within a standardized 20-km city buffer, so it represents a normalized tract-scale thermal anomaly rather than absolute surface temperature. SHAP is used to attribute model predictions to individual predictors and quantify the direction and magnitude of their contributions.

Specifically, this study: (i) compares census-tract-level LST distributions across cities to quantify regional differences in absolute surface temperature associated with background climate; (ii) maps the spatial distribution of heat-relevant predictors at the tract scale to characterize within-city spatial variability relevant to SUHII; (iii) trains and evaluates multiple ML models using city-based nested CV to assess geographic generalization across cities; and (iv) applies SHAP to quantify the magnitude and direction of individual predictor effects for both absolute LST and SUHII.

Accordingly, this framework enables tract-scale, cross-city analysis that separates regional climate effects from neighborhood-scale heat disparities and provides interpretable attribution of surface and built-environment drivers relevant to equity-focused heat mitigation across diverse urban and climatic contexts.

2. Materials and Methods

This section describes the data sources, variable construction, modeling approach, and explainability analysis used to examine tract-scale urban surface heat across multiple cities. The overall workflow, spanning data collection, processing in GEE, model training, and SHAP-based interpretation, is summarized in Figure 1. Detailed descriptions of each step are provided in the subsections that follow.

2.1. Study Cities

For this study, eight U.S. cities were selected to enable structured cross-city comparison of tract-level urban heat patterns across contrasting climatic regimes and urban forms (Figure 2; Table 1). The cities were organized into four regional pairs (West, South, Midwest, and Northeast), with each pair contrasting a hotter, denser, or more inland city with a coastal, greener, or otherwise structurally different counterpart. This paired design allows regional climate context to be considered while isolating how differences in urban form, land cover, and development history shape within-city surface temperature patterns and spatial inequality. The selected cities span a range of thermal environments and urban structures, including hot-desert, humid subtropical, and humid continental climates, as well as variation in density, vegetation, and impervious surface coverage (Table 1). Figure 3 summarizes the state-level coverage of the dataset and shows the relative contribution of each state to the tract-level observations used in this study.

2.2. Spatial Unit, Outcomes, and Predictors

Census tracts were used as the unit of analysis because they are a standard spatial unit in urban planning and environmental research and support linkage with demographic datasets [4,5,7]. Tract geometries and attributes, including land area (ALAND), water area (AWATER), and internal point latitude (INTPTLAT), were obtained from the U.S. Census Bureau TIGER/2020/TRACT dataset [28]. Distance to city center (DIST_CITY_CENTER) was calculated as the Euclidean distance from each tract centroid to a consistently defined city-center location for each city. All geospatial processing to derive pixel-level layers and tract-level summaries was implemented in GEE (code.earthengine.google.com), and map layouts were produced in Quantum Geographic Information System (QGIS; version 3.44.7-Solothurn) using GeoTIFF exports from GEE.

Two tract-level outcome variables were derived. Mean summer daytime LST was calculated using Landsat Collection 2 Level-2 thermal products from Landsat 8 and 9 accessed through GEE [30,31]. Images were limited to June–August for the period 2018–2024, and only daytime scenes within the summer window were retained. In GEE, the Landsat scenes were filtered using the QA_PIXEL band to mask fill pixels, dilated cloud, cirrus, cloud, cloud shadow, and snow, and the QA_RADSAT band was used to exclude radiometrically saturated pixels before compositing. Surface temperature was then obtained from the ST_B10 band using the Collection 2 Level-2 scale factor and converted from Kelvin to °C, after which a multi-year summer median composite was generated and aggregated to tract means within each city boundary. This processing sequence ensured that the LST estimates reflected cloud-screened summer daytime surface conditions derived consistently across all study cities.
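The per-pixel screening and unit conversion described above can be sketched in plain Python. This is an illustration only, not the authors' GEE script: the QA_PIXEL bit positions and the ST_B10 scale and offset follow the published Landsat Collection 2 Level-2 conventions, while the function names and the choice to require QA_RADSAT to equal zero as the saturation screen are assumptions made for this sketch.

```python
from statistics import median

# QA_PIXEL bit positions for Landsat 8/9 Collection 2; a set bit rejects the pixel.
QA_PIXEL_BITS = {
    "fill": 0, "dilated_cloud": 1, "cirrus": 2,
    "cloud": 3, "cloud_shadow": 4, "snow": 5,
}
REJECT_MASK = sum(1 << bit for bit in QA_PIXEL_BITS.values())

# Collection 2 Level-2 surface temperature scaling (digital number -> Kelvin).
ST_SCALE, ST_OFFSET = 0.00341802, 149.0
KELVIN_OFFSET = 273.15


def is_usable(qa_pixel, qa_radsat):
    """True when none of the rejected QA_PIXEL bits is set and QA_RADSAT is zero.

    Assumption: QA_RADSAT == 0 is used here as the saturation screen.
    """
    return (qa_pixel & REJECT_MASK) == 0 and qa_radsat == 0


def st_b10_to_celsius(dn):
    """Convert an ST_B10 digital number to degrees Celsius."""
    return dn * ST_SCALE + ST_OFFSET - KELVIN_OFFSET


def summer_median_celsius(observations):
    """Median temperature over usable observations of one pixel.

    observations: iterable of (st_b10_dn, qa_pixel, qa_radsat) tuples.
    Returns None when every observation is screened out.
    """
    temps = [st_b10_to_celsius(dn) for dn, qa, sat in observations if is_usable(qa, sat)]
    return median(temps) if temps else None
```

In GEE the same logic would be applied image-wise before the median reducer; tract means would then be taken over the composite.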

To quantify relative within-city thermal inequality independent of background climate, SUHII was computed as the tract-level LST relative to the regional mean LST, scaled by the regional LST standard deviation within the city buffer [3]. For each city, a circular buffer with a 20 km radius centered on the city reference point defined the regional thermal background. The reference temperature was therefore based not on a single rural pixel or a separate rural station but on the mean summer LST of all valid pixels within that standardized buffer, and the corresponding buffer-wide standard deviation was used to normalize tract departures from that background. SUHII is thus a standardized, dimensionless tract-scale LST anomaly: positive values indicate tracts warmer than the city-region background, and negative values indicate relatively cooler tracts. Compared with traditional urban–rural temperature-difference methods, this standardized definition reduces sensitivity to how the background reference is defined and improves cross-city comparability by expressing each tract’s heat condition relative to a consistently defined local thermal context rather than comparing raw LST values across cities with different climatic baselines [3,18]. The approach supports cross-city comparison of relatively hot and cool tracts and is consistent with prior surface UHI inequity studies [2,3,4].
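As a minimal sketch of this standardization, the function below computes SUHII for a set of tracts from the pixel values inside one city's buffer. The function and argument names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was applied.

```python
from statistics import fmean, pstdev


def suhii(tract_lst, buffer_pixel_lst):
    """Standardized tract-scale LST anomaly for one city.

    tract_lst: dict mapping a tract identifier to its mean summer LST (deg C).
    buffer_pixel_lst: LST values (deg C) of all valid pixels in the 20 km buffer.
    Assumption: the population standard deviation is used for scaling.
    """
    background_mean = fmean(buffer_pixel_lst)
    background_sd = pstdev(buffer_pixel_lst)
    if background_sd == 0:
        raise ValueError("buffer LST has zero variance; SUHII is undefined")
    return {
        tract: (lst - background_mean) / background_sd
        for tract, lst in tract_lst.items()
    }
```

With buffer values of 30, 32, 34, 36, and 38 °C, a tract at 34 °C scores 0 and a tract at 38 °C scores about 1.41, regardless of whether the city is in a hot or a cool climate.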

Predictor variables were selected to represent physical mechanisms known to influence daytime surface temperature, including surface moisture and evapotranspiration potential, impervious surface fraction, radiative properties, built-environment intensity, proximity to water, spatial context, and background climate, consistent with prior urban heat and LST studies [3,8,14]. All predictors were aggregated to census-tract means or derived at the tract scale using publicly available geospatial datasets accessed through GEE and external sources. Detailed descriptions, units, and data sources for all predictors and outcomes are provided in Table 2. Mathematical definitions and equations for each analysis variable are also summarized in Table 3.

2.3. Preprocessing and Transformations

All tract-level predictors and outcomes were compiled into a single dataset (N = 5144). Prior to model estimation, continuous variables were systematically evaluated for missingness, distributional asymmetry, and extreme values. Because missingness was low across variables, missing observations were addressed using median imputation as a simple and robust approach that preserves the full sample while limiting sensitivity to skewed distributions [38]. To limit the influence of extreme observations while preserving the rank structure of the data, all continuous predictors were winsorized using symmetric quantile clipping at the 0.5th and 99.5th percentiles. This step was used to reduce the influence of unusually extreme tract values without compromising the integrity of the observations retained for analysis [39,40]. To reduce skewness and stabilize variance, some variables were subjected to a log1p (i.e., logarithm of 1 + x) transformation in the linear and neural-network modeling because these methods are more sensitive to skewed feature distributions and differences in scale than tree-based methods [41,42]. Following the transformation, predictors in these models were scaled using median and interquartile range normalization to ensure numerical comparability across variables expressed in different units and ranges. Median and interquartile range scaling was selected because it is more robust to remaining outliers than mean-based standardization and is well suited to heavy-tailed predictors [42,43]. Tree-based models were estimated using median-imputed and winsorized predictors without scaling or log transformation, consistent with their insensitivity to predictor scale and monotonic transformations [42,44,45]. Overall, the preprocessing strategy was designed to improve numerical stability and comparability across models while avoiding unnecessary transformation of predictors for algorithms that do not require it.
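The two preprocessing branches can be sketched with NumPy as follows. The helper names are hypothetical, and the sketch computes every statistic on the array it is given; in a cross-validated workflow these statistics would usually be estimated on the training folds only, a detail the text does not specify. The log1p step assumes a non-negative variable.

```python
import numpy as np


def impute_median(x):
    """Replace NaN entries with the median of the observed values."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), np.nanmedian(x), x)


def winsorize(x, lower=0.5, upper=99.5):
    """Clip values to the given lower and upper percentiles."""
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)


def robust_scale(x):
    """Center on the median and scale by the interquartile range."""
    q1, med, q3 = np.percentile(x, [25, 50, 75])
    iqr = q3 - q1
    return (x - med) / iqr if iqr > 0 else x - med


def prepare(x, for_tree_model, log_transform=False):
    """Apply the shared steps, then the extra steps for linear and MLP models."""
    x = winsorize(impute_median(x))
    if for_tree_model:
        return x  # trees use imputed, winsorized values as they are
    if log_transform:
        x = np.log1p(x)  # only for selected skewed, non-negative variables
    return robust_scale(x)
```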

2.4. Overview of ML Models

We evaluated five supervised regression models—Elastic Net, multilayer perceptron (MLP), Random Forest (RF), Extra Trees, and XGBoost—to model census-tract-scale urban heat outcomes using a common predictor set. All model implementations were conducted in Python using PyCharm 2024.3.2. Elastic Net was included as a regularized linear baseline to address multicollinearity (Appendix A.1, Equation (A1)), and to provide a parsimonious linear benchmark under correlated predictors [46,47]. The MLP is a nonlinear feedforward neural network that approximates complex predictor–response relationships through stacked affine transformations and nonlinear activation functions (Appendix A.1, Equation (A2)); the specific architecture implemented in this study is shown in Figure 4 and reflects recent applications in land surface temperature modeling [48].

RF and Extra Trees are ensemble tree methods that capture nonlinearities and higher-order interactions by aggregating multiple decision trees using the same averaging formulation for both methods (Appendix A.1, Equation (A3)). RF employs bootstrap aggregation and randomized feature selection to reduce variance [14], and is implemented in this study as shown in Figure 5, while Extra Trees further increase split-level randomization to reduce variance and sensitivity to noise [49]. XGBoost is a gradient-boosted tree framework that builds an additive ensemble by sequentially fitting trees to residual structure under explicit regularization (Appendix A.1, Equation (A4)), and has shown strong performance in recent urban-climate and environmental prediction studies [7,14,50]. All models were trained and tuned using city-based nested CV to assess geographic generalization while minimizing bias from hyperparameter optimization.

2.5. Model Training, Validation, and Comparison

All models were trained to predict SUHII and LST using a common predictor set and city-based grouping (Table 1 and Table 2), with performance evaluated using nested CV implemented via GroupKFold (eight outer folds for evaluation and four inner folds for hyperparameter tuning), ensuring that census tracts from the same city were never split across training and validation sets. Specifically, the grouping variable supplied to GroupKFold was the city identifier (CITY), so that all census tracts belonging to the same city were assigned to the same fold. Because the dataset comprised eight cities, the outer GroupKFold with eight folds effectively evaluated model performance by holding out one city at a time, while the inner GroupKFold with four folds performed hyperparameter tuning using only the remaining training cities. This design was used to test whether models trained on a subset of cities could generalize to a geographically unseen city, which is the relevant inference target of this study [51,52]. This grouped validation strategy also reduces overly optimistic performance estimates that can arise when spatially clustered observations are randomly split across training and validation sets [51,52,53,54]. Hyperparameters were optimized within the inner loop using RandomizedSearchCV with RMSE as the objective metric, and the selected configurations for each model–target combination are reported in Table 4. As discussed earlier, Elastic Net and MLP models were trained on predictors processed using median imputation, winsorization, log1p transformation of selected skewed variables, and scaling, whereas tree-based and XGBoost models were trained on median-imputed and winsorized predictors without scaling or log transformation. Model comparison was based on out-of-sample performance across outer folds, providing a consistent and geographically independent basis for assessing predictive accuracy and cross-city generalization.
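A compact sketch of this nested, city-grouped design with scikit-learn is shown below. The data are synthetic, and the search space, number of search iterations, and estimator settings are placeholders rather than the configurations reported in Table 4.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GroupKFold, RandomizedSearchCV

# Synthetic stand-in for the tract table: 8 cities, 40 tracts each, 4 predictors.
rng = np.random.default_rng(0)
n_cities, n_per_city = 8, 40
city = np.repeat(np.arange(n_cities), n_per_city)  # grouping variable (CITY)
X = rng.normal(size=(n_cities * n_per_city, 4))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=city.size)

outer_cv = GroupKFold(n_splits=8)  # one held-out city per outer fold
inner_cv = GroupKFold(n_splits=4)  # tuning folds built from the other cities

# Placeholder search space, not the one used in the study.
param_distributions = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 3, 5],
}

outer_rmse, held_out_cities = [], []
for train_idx, test_idx in outer_cv.split(X, y, groups=city):
    search = RandomizedSearchCV(
        ExtraTreesRegressor(random_state=0),
        param_distributions,
        n_iter=4,
        scoring="neg_root_mean_squared_error",
        cv=inner_cv,
        random_state=0,
    )
    # The inner splitter sees only the training cities' group labels.
    search.fit(X[train_idx], y[train_idx], groups=city[train_idx])
    pred = search.predict(X[test_idx])
    outer_rmse.append(float(np.sqrt(mean_squared_error(y[test_idx], pred))))
    held_out_cities.append(np.unique(city[test_idx]).tolist())
```

Because the hyperparameter search is refit inside each outer fold, the held-out city never influences tuning, which is what keeps the outer-fold error an estimate of performance on an unseen city.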

2.6. Model Validation and Performance Assessment

Model performance for both SUHII and LST was evaluated using held-out test data. Prediction accuracy was quantified using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE), which are standard metrics in remote-sensing-based temperature modeling and urban climate analysis [3,7,14,55]. R² represents the proportion of variance explained by the model, while MAE and RMSE quantify prediction error magnitude in the original units of each outcome. The corresponding equations are provided in Appendix A.2.
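For reference, the three metrics can be computed as in the short sketch below, which uses their standard definitions; the function name is hypothetical and the code is not taken from Appendix A.2.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and RMSE for paired observed and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": sqrt(ss_res / n),
    }
```

MAE and RMSE carry the units of the outcome, so they are in °C for LST and dimensionless for the standardized SUHII.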

In addition to these metrics, graphical diagnostics were used to examine model behavior and error structure: actual versus predicted plots were used to evaluate agreement with the 1:1 line and calibration, while residuals versus predicted plots were used to assess error dispersion, identify systematic bias, and detect heteroscedastic or non-linear patterns not captured by scalar metrics alone.

2.7. Explainability (SHAP) and Effect Interpretation

To address the limited interpretability of machine-learning models and mitigate the black-box issue commonly associated with ensemble and non-linear algorithms, this study adopts SHAP as a unified framework for effect interpretation. SHAP is grounded in cooperative game theory and attributes a model’s prediction to individual input features by computing their average marginal contributions across all possible feature coalitions [56,57]. Unlike traditional feature importance metrics, SHAP provides both global explanations—revealing overall feature influence across the dataset—and local explanations—clarifying how specific feature values drive individual predictions. This property is particularly important for urban and energy modeling applications, where complex, non-linear interactions between morphology, socioeconomic variables, and environmental factors are expected. The Shapley value for a given feature is expressed in Equation (1):

$$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ f_{S \cup \{i\}}\left(x_{S \cup \{i\}}\right) - f_S\left(x_S\right) \right] \qquad (1)$$

where $F$ denotes the full set of input features, $S$ represents any subset of features excluding feature $i$, $f_{S \cup \{i\}}$ and $f_S$ are the model outputs with and without feature $i$, respectively, and the weighting term ensures a fair attribution by averaging contributions across all feature orderings [14,57].
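To make Equation (1) concrete, the sketch below evaluates it by brute force for a toy value function over three features. The feature names and numeric payoffs are invented for illustration and are not results from this study. Exhaustive enumeration is feasible only for a handful of features, which is why optimized estimators are used in practice.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, features):
    """Exact Shapley values by enumerating every subset S of F without i.

    value: function mapping a frozenset of feature names to a model output.
    features: list of all feature names (the set F in Equation (1)).
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):  # |S| runs from 0 to |F| - 1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_value(subset):
    """Hypothetical payoff: two main effects plus one interaction term."""
    out = 0.0
    if "IMPERV" in subset:
        out += 2.0
    if "NDMI" in subset:
        out -= 1.0
    if "IMPERV" in subset and "NDMI" in subset:
        out += 0.5  # interaction is split evenly between the two features
    return out


phi = shapley_values(toy_value, ["IMPERV", "NDMI", "ALBEDO"])
```

Here the contributions are 2.25 for IMPERV, -0.75 for NDMI, and 0 for ALBEDO, and they sum to the difference between the full-model output and the empty-set baseline (1.5).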

In practical implementations, SHAP values decompose each prediction into the sum of a baseline output and feature-specific contributions, enabling direct interpretation of positive and negative effects on the predicted variable [13]. Model-agnostic variants such as Kernel SHAP offer flexibility but incur high computational costs, while optimized implementations—including Tree SHAP for tree-based models—exploit the decision-tree structure to achieve exact and efficient attribution [14]. Prior studies have demonstrated the effectiveness of SHAP in urban and energy contexts: Li et al. employed SHAP with a RF model to identify key urban morphological drivers of building energy use in Manhattan [58]; Seyrfar et al. applied SHAP to XGBoost models for residential energy analysis in Chicago [59]; and Zhang et al. combined Light Gradient Boosting Machine (LightGBM) with SHAP to evaluate the influence of urban form on energy demand and emissions [60]. Building on this body of work, SHAP is used in this study to transparently interpret model behavior, quantify feature effects, and support physically meaningful insights.

3. Results

3.1. Cross-City Distributions of Tract-Level LST and SUHII

Comparing summer daytime LST across cities is necessary because regional climate sets the baseline level of surface heating and can dominate differences in observed temperatures. Without this context, it is difficult to interpret whether a hotter tract reflects local urban conditions or simply the city’s broader thermal regime. We therefore summarize tract-level LST distributions across the eight cities to document between-city differences in baseline surface temperature before evaluating within-city heat anomalies and their drivers.

Tract-level LST differs markedly across the cities (Figure 6a). Across the eight cities, median tract-level LST spans roughly from the mid-30s to the upper-50s °C, highlighting the strong role of regional climatic background in shaping absolute surface temperature. Phoenix has the highest median LST and the hottest upper tail, while Boston and Philadelphia show the lowest distributions; Minneapolis is similarly cool, and Detroit is intermediate. Houston and Los Angeles have higher medians than the Midwest and Northeast cities, whereas Miami shows a lower median than Houston and Los Angeles in this dataset, despite its hot–humid climate. Because LST primarily reflects between-city baselines, we also present SUHII, which expresses each tract’s LST relative to its city’s mean and standard deviation (Figure 6b). This allows comparison of within-city thermal anomalies across cities with different climates. In contrast to the wider separation seen in LST, SUHII distributions are more comparable across cities, with medians generally clustered near 0 to 1 and broad within-city spreads in all cases. The SUHII distributions show substantial within-city variability in every city, including upper tails that indicate consistently hotter-than-average tracts relative to the local background. These patterns justify modeling SUHII alongside LST to distinguish drivers of regional heat burden from drivers of within-city heat inequality.

3.2. Spatial Patterns of Heat Drivers (Phoenix as an Illustrative Example)

To clarify the physical meaning of the predictor variables used in the tract-scale models and to show how they vary within a single metropolitan area, Phoenix is used as an illustrative case. This example is included only to provide spatial context for the predictor set and not to imply that Phoenix is uniquely representative of the full study sample. Phoenix was selected because it shows the highest tract-level LST among the study cities and provides a clear high-heat example in which contrasts among built surfaces, vegetation, moisture availability, and urban intensity are readily visible. The cross-city conclusions of this study are based on the full eight-city modeling framework, whereas the Phoenix maps are presented only to visually demonstrate how the predictor variables and thermal patterns co-occur within one metropolitan setting. A true-color image of Phoenix is shown first (Figure 7) to document the spatial distribution of developed areas, transportation corridors, agricultural land, and undeveloped desert terrain within the analysis region. This image provides a direct visual reference for the land-cover and built-environment features that correspond to the tract-level variables used in the modeling framework.

Figure 8 presents the tract-level outcome and predictor layers derived from satellite data, summarized as summer (June–August) medians over 2018–2024. LST (Figure 8a) is highest across the most densely built portions of the metropolitan area and lower in tracts with visible vegetation or irrigated land in Figure 7. NDMI (Figure 8b) shows higher values in areas with greater vegetation or moisture presence and lower values across arid and heavily developed tracts, corresponding spatially to the LST patterns. Impervious surface fraction (Figure 8d) delineates the extent and internal variability of urban development and aligns with areas of elevated LST. VIIRS nighttime radiance (Figure 8e) is concentrated in urban and commercial areas and overlaps with regions of high imperviousness and higher LST, indicating areas of greater built intensity and nighttime activity. Albedo (Figure 8c) varies across the region in association with differences in surface materials and land cover and is included because surface reflectance directly affects the surface energy balance. Distance to water (Figure 8f) measures proximity to mapped surface water features and shows systematic spatial gradients across the region; this variable is included to represent the potential spatial association between surface water and local temperature patterns.

Overall, Figure 7 and Figure 8 show that hotter tracts tend to coincide with higher imperviousness, lower moisture availability, and greater nighttime radiance, whereas cooler tracts are more often associated with vegetation or irrigated land. Because these characteristics co-occur spatially, the Phoenix example mainly illustrates why multivariable tract-level models are needed to separate their relative associations with LST and SUHII.

3.3. Distributional Characteristics and Correlation Structure of Predictors

Summary statistics of the input variables and outcomes are reported in Table 5, and their distributions are shown in Figure 9. Several predictors, most notably AWATER, ALAND, DIST_WATER, VIIRS_RAD, and DIST_CITY_CENTER, show substantial right-skew and heavy upper tails, supporting the preprocessing choices described in Section 2.3. Missingness is low across variables, generally below 1%, with slightly higher but still limited missingness for MEAN_SUMMER_TEMP and SOLAR_RAD (2.07%).

Pearson correlation matrices were used to quantify pairwise linear relationships and evaluate potential multicollinearity among predictors (Figure 10), with correlation patterns consistent with established relationships between impervious cover, surface moisture, and nighttime activity in urban environments. Moisture availability (NDMI) shows moderate negative correlations with impervious surface fraction (IMPERV; r ≈ −0.60) and nighttime radiance (VIIRS_RAD; r ≈ −0.64), whereas IMPERV and VIIRS_RAD are strongly positively correlated (r ≈ 0.74). In contrast, MEAN_SUMMER_TEMP and SOLAR_RAD display weak correlations with land-cover and urban form predictors (generally |r| < 0.35). All pairwise correlations fall below commonly used thresholds for problematic multicollinearity (|r| ≥ 0.8), supporting the use of this predictor set in multivariable ML models [61]. To further assess multicollinearity beyond pairwise correlations, variance inflation factors (VIFs) were calculated for all predictors (Table 6). Under the commonly applied conservative criterion that VIF values below 5 indicate acceptable levels of multicollinearity [62,63], all predictors were within the acceptable range. The highest VIF values were observed for VIIRS_RAD (4.667), IMPERV (4.425), and NDMI (4.092), but all remained below the threshold of concern, indicating that multicollinearity was not problematic in the predictor set.
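A VIF for predictor j is 1 / (1 − R²_j), where R²_j comes from regressing predictor j on all the others. The NumPy sketch below illustrates that calculation under this standard definition; the function name is hypothetical and the code is not the authors' implementation.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        # Design matrix: intercept plus every predictor except column j.
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs.append(1.0 / (1.0 - r_squared))
    return vifs
```

A value near 1 means a predictor is almost unrelated to the rest, while values approaching 5 correspond to an R² of about 0.8 against the other predictors.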

3.4. Comparative Performance of the Models for SUHII and LST

To assess the ability of the models to generalize across cities and accurately capture tract-level thermal variation, predictive performance was evaluated using held-out census tracts under city-based CV. As discussed earlier, model accuracy was quantified using R², MAE, and RMSE (Table 7).
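The city-based hold-out idea can be sketched as follows. This is an illustrative outer loop only, assuming a leave-one-city-out scheme; the exact fold structure and the inner tuning loop of the nested procedure are described in the methods, not here. The function name `leave_one_city_out` and the city labels are hypothetical.

```python
def leave_one_city_out(cities):
    """Yield (held_out_city, train_idx, test_idx), holding out every tract of one city at a time."""
    for held_out in sorted(set(cities)):
        train_idx = [i for i, c in enumerate(cities) if c != held_out]
        test_idx = [i for i, c in enumerate(cities) if c == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical city label for each tract (illustrative, not the study data).
tract_city = ["Phoenix", "Phoenix", "Boston", "Miami", "Boston", "Miami"]

folds = list(leave_one_city_out(tract_city))
for city, train_idx, test_idx in folds:
    # A model would be fitted on train_idx and scored on test_idx here.
    print(city, "train:", train_idx, "test:", test_idx)
```

Because no tract from the held-out city appears in the training indices, the resulting scores measure transfer to an unseen city rather than interpolation among neighboring tracts. In a nested setup, the same grouping would typically be repeated inside each training set for hyperparameter selection.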

For SUHII, XGBoost achieved the strongest out-of-sample performance (R² = 0.879; RMSE = 0.213), outperforming RF, Extra Trees, Elastic Net, and MLP. Relative to Elastic Net, the best SUHII model reduced RMSE from 0.316 to 0.213, a reduction of about 33%. For LST, Extra Trees performed best (R² = 0.908; RMSE = 0.745 °C), closely followed by RF; the top three models all exceeded R² = 0.88, and Extra Trees reduced RMSE relative to MLP from 0.871 °C to 0.745 °C (about 14%). Overall, ensemble tree-based models performed best for both outcomes, with XGBoost leading for SUHII and Extra Trees leading for LST, consistent with prior urban heat and land surface temperature studies showing that RF and gradient-boosted tree models perform well when modeling nonlinear and correlated urban predictors [7,8,14,16].

Figure 11 provides graphical diagnostics for the best-performing models. For LST, predicted values align closely with the 1:1 reference line, and the fitted trend exhibits a slope of 0.87, indicating strong proportional agreement across the observed temperature range. Residuals are centered near zero and display no systematic pattern or change in variance across predicted values, supporting homoscedasticity.

For SUHII, the predicted–observed relationship shows strong agreement, with a fitted slope of 0.75, indicating moderate compression of extreme positive and negative values relative to the 1:1 line. Predictions remain well aligned across the central range, while the largest anomalies are slightly underpredicted in magnitude. This pattern is consistent with known difficulties in generalizing models across cities with different climatic and urban characteristics, where generalized models tend to be less accurate than city-specific ones [64]. Residuals are centered near zero with approximately constant dispersion across the prediction range, indicating no systematic bias or heteroscedasticity.
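The fitted slopes reported for Figure 11 can be read as a compression factor. The sketch below assumes the slope comes from an ordinary least-squares fit of predicted on observed values (the regression direction is an assumption, since it is not stated in this section). The function name and the toy values are hypothetical; the toy predictions are constructed to shrink every anomaly by 25%, which yields a slope of 0.75.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Toy standardized anomalies (illustrative only): predictions shrink each
# observed value toward zero by 25%, mimicking compression of extremes.
observed = [-2.0, -1.0, 0.0, 1.0, 2.0]
predicted = [0.75 * v for v in observed]

slope, intercept = ols_slope_intercept(observed, predicted)
residuals = [p - o for p, o in zip(predicted, observed)]
print("slope:", slope, "intercept:", intercept)
print("residuals:", residuals)
```

A slope below 1 with an intercept near zero means the central range is reproduced well while the largest positive and negative values are pulled toward the mean, which matches the description of slightly underpredicted extremes.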

Overall, these results confirm that the selected best-performing models provide accurate and sufficiently well-calibrated predictions suitable for subsequent explainability analysis.

3.5. SHAP-Based Attribution of Tract-Scale Heat Drivers (SUHII vs. LST)

To interpret the best-performing models beyond predictive accuracy, SHAP was used to quantify the direction and magnitude of each predictor’s contribution to tract-level heat outcomes. Global SHAP feature importance (Figure 12a,c) and SHAP summary (beeswarm) distributions (Figure 12b,d) are shown for the best-performing ML models, XGBoost for SUHII and Extra Trees for LST. SHAP dependence and interaction patterns for the most influential predictors are shown in Figure 13.

For SUHII, IMPERV is the dominant predictor, followed by SOLAR_RAD and NDMI, as indicated by the mean absolute SHAP importance rankings in Figure 12a and the corresponding effect distributions in Figure 12b. The SHAP distributions indicate that higher imperviousness produces positive contributions to SUHII (hotter relative anomalies), while higher NDMI produces negative contributions (cooler relative anomalies). The dependence plots show a clear monotonic increase in SUHII with IMPERV (Figure 13a) and a clear decrease with NDMI (Figure 13c). SOLAR_RAD shows a step-like pattern rather than a smooth gradient (Figure 13b), suggesting broader radiation regimes rather than a strongly continuous tract-scale effect. VIIRS_RAD has a weaker secondary influence and levels off at higher radiance values (Figure 13f).

For LST, INTPTLAT and MEAN_SUMMER_TEMP are the dominant predictors in the global importance ranking (Figure 12c), indicating that LST is structured primarily by geographic position and background climate. Figure 12d shows the same pattern in directional form: higher latitude and warmer climatological summer temperature are associated with higher predicted LST, whereas higher NDMI is associated with lower LST and higher IMPERV is associated with higher LST. NDMI, IMPERV, and SOLAR_RAD therefore act as secondary tract-scale modifiers of this broader climatic structure. The dependence plots further show that NDMI lowers predicted LST, IMPERV increases it, and the contribution of VIIRS_RAD rises at low-to-moderate radiance values before leveling off (Figure 13j–l). In contrast, INTPTLAT and MEAN_SUMMER_TEMP mainly reflect broad cross-city structure rather than a single within-city gradient (Figure 13g,h).

Overall, Figure 12 and Figure 13 distinguish the drivers of relative and absolute heat. Within the fitted models, SUHII is associated more strongly with tract-scale surface characteristics, especially imperviousness and moisture availability, whereas LST is associated more strongly with geographic position and background summer climate.

3.6. Heterogeneity of Driving Factors by City and Climate Zone

To examine whether the cross-city SHAP results masked meaningful spatial heterogeneity, mean absolute SHAP importance from the best-performing models was summarized separately by city and by regional climate setting (Figure 14). This analysis retains the same multi-city modeling framework used throughout the study, while allowing direct comparison of how the relative importance of predictors changes across local and regional contexts.
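One way such a summary could be computed is sketched below. It assumes the reported percentages are each predictor's share of the summed mean absolute SHAP value within a city or region; that normalization is an assumption, and the function name, feature subset, and SHAP values are hypothetical.

```python
from collections import defaultdict

def shap_share_by_group(groups, shap_rows):
    """Percent share of mean |SHAP| per feature within each group.

    groups: one group label (for example a city) per tract.
    shap_rows: one dict per tract mapping feature name -> SHAP value.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for group, row in zip(groups, shap_rows):
        counts[group] += 1
        for feature, value in row.items():
            sums[group][feature] += abs(value)
    shares = {}
    for group, feats in sums.items():
        mean_abs = {f: s / counts[group] for f, s in feats.items()}
        total = sum(mean_abs.values())
        # Guard against an all-zero group to avoid dividing by zero.
        shares[group] = {f: (100.0 * m / total if total else 0.0)
                         for f, m in mean_abs.items()}
    return shares

# Toy SHAP values for three tracts (illustrative only, not the study output).
cities = ["Phoenix", "Phoenix", "Boston"]
rows = [
    {"IMPERV": 0.1, "SOLAR_RAD": -0.3},
    {"IMPERV": -0.1, "SOLAR_RAD": 0.3},
    {"IMPERV": 0.4, "SOLAR_RAD": 0.1},
]
shares = shap_share_by_group(cities, rows)
for city, feats in shares.items():
    print(city, {f: round(v, 1) for f, v in feats.items()})
```

Taking absolute values before averaging matters: in the toy Phoenix rows the signed SHAP values cancel to zero, yet the predictor still carries weight in every tract. The shares within each group sum to 100%, so they describe relative rather than absolute importance.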

For SUHII, the relative importance of predictors varies clearly across cities (Figure 14a). In the West, SOLAR_RAD is the leading predictor in both Phoenix (30.6%) and Los Angeles (31.6%), and NDMI is also high in Phoenix (29.1%), while IMPERV is lower there (13.1%). In contrast, IMPERV is the dominant predictor in Minneapolis (38.8%), Philadelphia (40.4%), and Boston (33.1%), indicating stronger control of relative heat anomaly by built surface conditions in the Midwest and Northeast. The South is more mixed: in Houston, IMPERV ranks first (28.5%), whereas in Miami, INTPTLAT is highest (38.3%), exceeding both IMPERV (19.6%) and SOLAR_RAD (19.1%). At the regional level (Figure 14b), the West is characterized by higher SOLAR_RAD (31.1%) and NDMI (25.9%), the South by elevated INTPTLAT (28.5%) and IMPERV (23.4%), and the Midwest and Northeast by stronger dominance of IMPERV (35.5% and 36.7%, respectively).

For LST, heterogeneity is present, but the broader climatic structure remains more stable (Figure 14c,d). MEAN_SUMMER_TEMP is the dominant predictor in Phoenix (51.0%) and across the West (42.5%), whereas INTPTLAT ranks first in Houston (41.9%), Detroit (34.0%), and Minneapolis (29.8%), and is also the dominant regional predictor in the South (34.0%) and Midwest (31.8%). In the Northeast, the structure is more mixed: INTPTLAT contributes 24.1%, IMPERV 19.8%, MEAN_SUMMER_TEMP 18.4%, and NDMI 16.1%. Local surface predictors therefore influence LST in all cities, but their role is generally secondary to broader climatic and geographic controls.

Overall, Figure 14 refines the cross-city interpretation presented in Section 3.5 rather than changing it. SUHII remains more strongly associated with imperviousness, moisture availability, solar radiation, and, in some cities, latitude, but the relative importance of these predictors varies meaningfully by city and regional climate setting. LST remains more consistently associated with background climate and geographic location, although the contribution of local surface predictors becomes more mixed in some regions, especially the Northeast.

4. Discussion

4.1. Separating Absolute Surface Heat (LST) from Within-City Heat Inequality (SUHII)

The results show that LST and SUHII should not be interpreted as interchangeable outcomes. In the present analysis, LST was explained mainly by INTPTLAT and MEAN_SUMMER_TEMP, whereas SUHII was explained more strongly by tract-scale surface characteristics, especially IMPERV and NDMI. This difference is consistent with the construction of the two metrics: LST retains the broader climatic background of each city, whereas SUHII expresses each tract as a standardized anomaly relative to its city-region background. Recent reviews show that satellite-derived LST is useful for spatially continuous mapping of urban thermal patterns, but it is distinct from near-surface air temperature and should not be interpreted as a direct measure of human heat exposure without additional modeling or observations [2,4]. In addition, cross-city SUHII comparisons are known to depend on how the background reference is defined, which supports the use of a standardized reference framework in this study [18,65]. Relative to the recent literature, the main contribution here is the joint tract-scale modeling of absolute surface heat and relative within-city thermal anomaly within one standardized multi-city framework (Table 8).
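The contrast between the two metrics can be illustrated with a small sketch. It assumes, purely for illustration, that the standardized anomaly is a z-score of tract LST against the mean and population standard deviation of all tracts in the same city; the actual background reference is the one defined in the methods and may differ. The function name and temperatures are hypothetical.

```python
from statistics import mean, pstdev

def standardized_anomaly(city_labels, lst_values):
    """Z-score each tract's LST against the mean and spread of its own city."""
    by_city = {}
    for city, value in zip(city_labels, lst_values):
        by_city.setdefault(city, []).append(value)
    # Assumes every city has at least two distinct LST values (nonzero spread).
    stats = {c: (mean(v), pstdev(v)) for c, v in by_city.items()}
    return [(value - stats[city][0]) / stats[city][1]
            for city, value in zip(city_labels, lst_values)]

# Two toy cities with the same internal pattern but a 10 degree C offset.
labels = ["A", "A", "A", "B", "B", "B"]
lst_c = [30.0, 32.0, 34.0, 40.0, 42.0, 44.0]

z = standardized_anomaly(labels, lst_c)
print([round(v, 3) for v in z])
```

In this toy case the absolute LST of city B is 10 °C higher throughout, yet the standardized anomalies of the two cities are identical. A model of LST must therefore explain the offset between cities, whereas a model of the anomaly only has to explain which tracts are hotter or cooler than their own city background.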

4.2. Impervious Surface and Moisture Availability as the Dominant Drivers of SUHII

Across the fitted models, IMPERV was the strongest positive contributor to SUHII, whereas NDMI was the strongest and most consistent negative contributor; this interpretation is supported jointly by the SHAP importance rankings, summary distributions, and dependence plots. This result is consistent with the recent literature showing that impervious surfaces are major drivers of urban surface heating, while vegetation and moisture-related conditions provide important cooling effects [2,13,67]. Recent reviews show that vegetation cooling is a widely observed urban heat-mitigation mechanism, although its magnitude depends on climatic context, vegetation characteristics, urban form, and water availability [68,69,70]. The present study adds to that literature by showing that the roles of imperviousness and moisture remain strong under city-held-out nested cross-validation, which supports their relevance beyond single-city analyses.

4.3. Solar Radiation Conditions SUHII Magnitude, but Is Not the Clearest Tract-Scale Differentiator

SOLAR_RAD ranked highly in the SUHII model, but its SHAP dependence pattern was more step-like than smoothly monotonic, so the present results support a narrower interpretation than a simple linear effect. Within this dataset, solar radiation appears to act more as a broader radiative context that conditions SUHII magnitude than as the clearest tract-scale discriminator of relatively hotter and cooler neighborhoods. This interpretation is consistent with recent comparative and multi-city studies showing that background climate and broader urban context strongly shape urban thermal patterns and modulate how local land-surface factors relate to surface temperature [8,19,71,72]. In the present models, the sharper tract-scale differentiation was captured more directly by IMPERV and NDMI, so the most accurate conclusion is that solar radiation contributes to SUHII but does not, by itself, explain why some tracts within the same city are relatively hotter than others.

4.4. Nighttime Radiance as a Secondary Proxy for Urban Intensity

VIIRS_RAD contributed to both SUHII and LST, but with lower importance than the dominant predictors, and its SHAP pattern rose at low-to-moderate values before flattening at higher radiance levels. This suggests that nighttime radiance functions mainly as a secondary proxy for urban intensity rather than as a primary physical driver of daytime surface heating. That interpretation is consistent with recent ML and XAI studies showing that both built-environment indicators and natural land-surface variables can strongly influence LST, with SHAP-based analysis helping distinguish their relative roles across settings [13,73,74]. In this study, VIIRS_RAD therefore adds useful information about the degree of urban development, but its explanatory role remains subordinate to IMPERV and NDMI for SUHII and to INTPTLAT and MEAN_SUMMER_TEMP for LST.

4.5. Why Water-Related Metrics Show Limited Explanatory Power

DIST_WATER and AWATER had weak global importance in the fitted models, but this should not be interpreted as evidence that water-related cooling is unimportant. A more accurate interpretation is that the specific variables used here—distance to mapped water and tract water area—captured relatively little additional variance once moisture availability and other predictors were included. This is physically plausible because the thermal influence of blue space depends strongly on morphology, spatial configuration, ventilation, surrounding urban form, and climatic context, which simple static proximity measures do not represent well [75,76,77]. In the present framework, NDMI likely captures part of the local cooling environment more directly than DIST_WATER or AWATER, which helps explain the limited independent role of the water variables once moisture is already included.

4.6. Implications for Heat Mitigation and Equity-Oriented Planning

The mitigation implications follow directly from the distinction between absolute surface heat and relative within-city heat anomaly. Variables such as INTPTLAT and MEAN_SUMMER_TEMP help explain why some cities are hotter in absolute terms, but they are not tract-scale planning levers. By contrast, IMPERV and NDMI correspond more closely to modifiable surface conditions and are therefore more relevant for interventions aimed at reducing relative within-city surface heat inequality. This interpretation is consistent with recent reviews showing that urban cooling is strongly linked to vegetation, evapotranspiration or water availability, shading, and green-blue infrastructure, although the magnitude of cooling varies with climate, scale, spatial configuration, and design [68,70,75,78,79,80]. The present results, therefore, support a focused planning conclusion: where the objective is to reduce relatively hot tract conditions, the most relevant levers are those linked to imperviousness and local cooling capacity rather than those that only describe broad climatic background.

4.7. Scope and Limits of Inference

This study explains spatial variation in summertime daytime satellite-derived LST and standardized SUHII, not canopy-layer air temperature, thermal comfort, or direct human heat exposure, and that distinction should remain explicit. Recent reviews show that LST is highly informative for surface thermal analysis, but it should not be treated as a direct proxy for air-temperature-based exposure or human thermal comfort without additional atmospheric or exposure modeling [2,4]. In addition, census tracts provide a consistent and policy-relevant unit for cross-city comparison, but they smooth finer within-tract heterogeneity. SHAP improves the interpretability of fitted model behavior, yet it still describes model attribution rather than causal effects, and several predictors in the framework remain proxies rather than direct process variables. These limits do not weaken the central contribution of the study, but they define it more precisely: this is an interpretable, tract-scale, multi-city framework for attributing daytime surface thermal patterns, not a direct causal model of human heat exposure or environmental injustice mechanisms.

5. Conclusions

This study developed an XAI framework implemented in GEE to attribute census-tract summer surface heat across eight U.S. cities using a consistent SUHII definition and a harmonized set of physical predictors. By jointly modeling LST and SUHII, the framework separates regional climatic controls from neighborhood-scale surface heat patterns, enabling consistent cross-city comparison within a standardized tract-scale framework.

The main findings are as follows:

Generalizable performance: City-held-out nested CV showed strong geographic transferability, with ensemble tree models clearly outperforming linear and neural-network alternatives.
Drivers of SUHII: Within the fitted models, SUHII is most strongly associated with impervious surface fraction (warmer relative anomalies) and surface moisture availability (cooler relative anomalies), with solar radiation acting as an additional but less spatially differentiating factor.
Drivers of LST: LST is associated mainly with latitude and long-term mean summer air temperature, while local surface properties act as consistent but secondary modifications on this climatic baseline.
Limited role of water-proximity metrics: Distance-to-water and water-area variables contribute relatively little within the fitted models once surface moisture is included, suggesting that static proximity measures do not capture water-related cooling as effectively as moisture-based indicators.

Overall, the study shows that LST alone is not sufficient to diagnose intra-urban heat patterns because it conflates background climate with local surface conditions. Collectively, the results indicate that LST and SUHII should be interpreted as complementary rather than interchangeable metrics: LST captures absolute surface thermal conditions, whereas SUHII better isolates relative within-city thermal anomaly. The proposed framework provides an interpretable basis for multi-city attribution of urban surface heat drivers.

From a sustainability perspective, the proposed framework functions as a practical tool for supporting sustainable urban planning, climate adaptation, and targeted heat-mitigation action. By providing tract-scale evidence on urban surface heat patterns and their physical drivers, the study contributes to the measurement and monitoring of climate-related urban sustainability and offers decision-relevant information for place-based planning and policy under ongoing climate change.

These conclusions are limited to the modeled associations observed in daytime summer satellite-derived surface temperature patterns and should not be interpreted as direct evidence of cause-and-effect relationships, near-surface air temperature effects, or human heat exposure. Future work should incorporate explicit urban form metrics (such as building density, sky view factor, and surface-to-volume ratio) and their influence on urban airflow and ventilation, improve representation of coastal and large-scale weather conditions, and integrate demographic and historical data to better connect physically attributed surface heat patterns with the mechanisms driving urban heat inequality and targeted mitigation [16].

Author Contributions

Conceptualization, O.A.B.A. and D.G.; methodology, O.A.B.A. and D.G.; software, O.A.B.A. and D.G.; validation, O.A.B.A. and D.G.; formal analysis, O.A.B.A. and D.G.; investigation, O.A.B.A. and D.G.; resources, O.A.B.A. and D.G.; data curation, O.A.B.A. and D.G.; writing—original draft preparation, O.A.B.A. and D.G.; writing—review and editing, O.A.B.A. and D.G.; visualization, O.A.B.A. and D.G.; supervision, O.A.B.A. and D.G.; project administration, O.A.B.A. and D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in GEE. Detailed data sources are provided in Table 2.

Acknowledgments

The authors thank the developers and maintainers of GEE and the providers of the publicly available satellite, climate, and census datasets used in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ALAND	Land Area (of census tract)
ALBEDO	Broadband Surface Albedo
AWATER	Water Area (of census tract)
CV	Cross-Validation
DNB	Day/Night Band (Visible Infrared Imaging Radiometer Suite)
DIST_CITY_CENTER	Distance to City Center
DIST_WATER	Distance to Nearest Major Water Body
ERA5-Land	ECMWF Reanalysis v5–Land (land-surface reanalysis product)
GEE	Google Earth Engine
GIS	Geographic Information System
K	Kelvin
LightGBM	Light Gradient Boosting Machine
log1p	Logarithm of (1 + x) transformation
LST	Census-Tract Mean Land Surface Temperature (°C)
MAE	Mean Absolute Error
ML	Machine Learning
MLP	Multilayer Perceptron
N (e.g., N = 5144)	Sample Size (number of observations)
NAIP	National Agriculture Imagery Program
NDMI	Normalized Difference Moisture Index
NIR	Near-Infrared
NLCD	National Land Cover Database
QA	Quality Assurance
QGIS	Quantum Geographic Information System
R²	Coefficient of Determination
RF	Random Forest
RMSE	Root Mean Square Error
SR	Surface Reflectance
SVI	Social Vulnerability Index
SUHI	Surface Urban Heat Island
SUHII	Surface Urban Heat Island Intensity
SWIR1	Shortwave Infrared 1
TIGER	Topologically Integrated Geographic Encoding and Referencing (U.S. Census)
UHI	Urban Heat Island
USGS	United States Geological Survey
VIIRS	Visible Infrared Imaging Radiometer Suite
XAI	Explainable Artificial Intelligence
XGBoost	Extreme Gradient Boosting

Appendix A

Appendix A.1

The standard mathematical expressions for the regression models summarized in Section 2.6 are given below.

\hat{\beta}^{*} = \arg\min_{\beta^{*}} \left| y^{*} - X^{*} \beta^{*} \right|^{2} + \frac{\lambda_{1}}{\sqrt{1 + \lambda_{2}}} \left| \beta^{*} \right|_{1}

(A1)

where $\arg\min$ denotes the coefficient vector that minimizes the naïve elastic net objective; $y^{*}$ and $X^{*}$ are the augmented response vector and design matrix; $\beta^{*}$ is the naïve elastic net coefficient vector; $\lambda_{1}$ and $\lambda_{2}$ are regularization parameters; the first term is the squared residual loss on the augmented data, and the second term is a lasso ($\ell_{1}$) penalty that induces sparsity.
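For readers who want the augmented quantities spelled out, the construction below follows the elastic net paper by Zou and Hastie listed in the References. The symbols $X$ (the $n \times p$ standardized design matrix), $y$ (the response vector), $I$ (the $p \times p$ identity matrix), and $\beta$ (the coefficient vector on the original scale) are taken from that reference and are not otherwise defined in this appendix.

```latex
% Augmented data for the naive elastic net (after Zou and Hastie, 2005).
X^{*}_{(n+p) \times p} = (1 + \lambda_{2})^{-1/2}
  \begin{pmatrix} X \\ \sqrt{\lambda_{2}}\, I \end{pmatrix},
\qquad
y^{*}_{(n+p)} = \begin{pmatrix} y \\ 0 \end{pmatrix},
\qquad
\beta^{*} = \sqrt{1 + \lambda_{2}}\, \beta
```

Stacking the scaled identity block under $X$ turns the ridge part of the penalty into ordinary squared residuals, so the problem becomes a lasso on the augmented data, and the naïve elastic net estimate on the original scale is recovered as $\hat{\beta} = \hat{\beta}^{*} / \sqrt{1 + \lambda_{2}}$.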

\phi(x) = \sigma\left(W_{4}^{T} \sigma\left(W_{3}^{T} \sigma\left(W_{2}^{T} \sigma\left(W_{1}^{T} x + b_{1}\right) + b_{2}\right) + b_{3}\right) + b_{4}\right)

(A2)

where $x$ is the input feature vector; $W_{1}, \dots, W_{4}$ denote the weight matrices of the successive layers; $b_{1}, \dots, b_{4}$ denote the corresponding bias vectors; $\sigma$ denotes the nonlinear activation function applied elementwise; the superscript $T$ denotes matrix transpose; and $\phi(x)$ denotes the network output obtained by forward propagation through four layers.
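The nested form of Equation (A2) is simply a loop over layers. The sketch below follows the equation as written, including the activation on the final layer. ReLU is used as the activation only for illustration, since the activation function is not specified in this appendix, and the function names and tiny weights are hypothetical.

```python
def relu(vector):
    """Elementwise ReLU, used here as an illustrative choice for sigma."""
    return [max(0.0, v) for v in vector]

def affine(W, x, b):
    """Compute W^T x + b, with W stored as an (n_inputs x n_outputs) nested list."""
    return [sum(W[i][j] * x[i] for i in range(len(x))) + b[j]
            for j in range(len(b))]

def mlp_forward(x, weights, biases, sigma=relu):
    """Forward propagation: apply sigma(W_l^T h + b_l) for each layer in turn."""
    h = list(x)
    for W, b in zip(weights, biases):
        h = sigma(affine(W, h, b))
    return h

# Tiny hand-picked four-layer network (illustrative weights only).
weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # W1: 2 inputs -> 2 units
    [[1.0], [1.0]],            # W2: 2 units  -> 1 unit
    [[2.0]],                   # W3: 1 unit   -> 1 unit
    [[1.0]],                   # W4: 1 unit   -> 1 output
]
biases = [[0.0, 0.0], [-1.0], [0.0], [0.5]]

print(mlp_forward([1.0, 2.0], weights, biases))
print(mlp_forward([-1.0, -2.0], weights, biases))
```

Each pass through the loop corresponds to one pair of parentheses in Equation (A2), working from the innermost term outward. In regression practice the last layer is often left linear; the sketch keeps the outer activation only because the equation shows it.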

\hat{y}(x_{i}) = \frac{1}{B} \sum_{b=1}^{B} T_{b}(x_{i})

(A3)

where $x_{i}$ denotes the $i$-th input feature vector; $B$ is the total number of trees in the ensemble; and $T_{b}(x_{i})$ is the prediction produced by the $b$-th tree for input $x_{i}$.

\hat{y}(x_{i}) = \sum_{k=1}^{K} f_{k}(x_{i}), \quad f_{k} \in \mathcal{F}

(A4)

where $f_{k}(x_{i})$ denotes the contribution of the $k$-th regression tree to the prediction for input $x_{i}$; $\mathcal{F}$ represents the space of admissible regression trees; and $K$ is the number of trees included in the ensemble.
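Equations (A3) and (A4) differ only in how the individual tree outputs are combined: averaging for bagged ensembles and summation for boosted ensembles. The sketch below shows both rules using one-split stumps as stand-ins for fitted trees. All function names, thresholds, and leaf values are hypothetical, and no tree fitting is performed.

```python
def stump(feature, threshold, left, right):
    """Return a one-split regression tree: `left` if x[feature] <= threshold, else `right`."""
    def tree(x):
        return left if x[feature] <= threshold else right
    return tree

def bagged_predict(trees, x):
    """Equation (A3): average of the B tree predictions."""
    return sum(tree(x) for tree in trees) / len(trees)

def additive_predict(trees, x):
    """Equation (A4): sum of the K tree contributions."""
    return sum(tree(x) for tree in trees)

# Two illustrative stumps on a two-feature input.
trees = [stump(0, 0.5, 1.0, 3.0), stump(1, 0.2, 2.0, 4.0)]
x = [0.7, 0.1]

print("bagged:", bagged_predict(trees, x))
print("additive:", additive_predict(trees, x))
```

In a bagged ensemble each tree is a full predictor of the target, so averaging is appropriate. In a boosted ensemble each tree is fitted to what the previous trees left unexplained, so contributions are summed; library implementations typically also include a constant base prediction and a learning-rate scaling, which are omitted here.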

Appendix A.2

The evaluation metrics summarized in Section 2.6 are defined as follows:

R^{2} = 1 - \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}

(A5)

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{i} - \hat{y}_{i} \right|

(A6)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}

(A7)

where $n$ is the number of census tracts in the held-out test dataset; $y_{i}$ is the observed value of the target variable for tract $i$ (either SUHII or LST); $\hat{y}_{i}$ is the corresponding model-predicted value; and $\bar{y}$ is the mean of observed target values across the held-out test data.
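The three metrics translate directly into code. The following is a minimal sketch in plain Python that mirrors Equations (A5) to (A7) term by term; the function names and the four toy values are hypothetical.

```python
import math

def r_squared(y_true, y_pred):
    """Equation (A5): 1 minus residual sum of squares over total sum of squares."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Equation (A6): mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Equation (A7): root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

# Toy held-out observations and predictions (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("R2:", round(r_squared(observed, predicted), 4))
print("MAE:", round(mae(observed, predicted), 4))
print("RMSE:", round(rmse(observed, predicted), 4))
```

Because $\bar{y}$ is computed on the held-out data, R² can be negative whenever a model predicts an unseen city worse than that city's own mean would. MAE and RMSE are in the units of the target, so they are directly comparable across models for the same outcome but not between LST (°C) and standardized SUHII.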

References

Zhao, L.; Oppenheimer, M.; Zhu, Q.; Baldwin, J.W.; Ebi, K.L.; Bou-Zeid, E.; Guan, K.; Liu, X. Interactions between Urban Heat Islands and Heat Waves. Environ. Res. Lett. 2018, 13, 034003. [Google Scholar] [CrossRef]
Zhao, L.; Fan, X.; Hong, T. Urban Heat Island Effect: Remote Sensing Monitoring and Assessment—Methods, Applications, and Future Directions. Atmosphere 2025, 16, 791. [Google Scholar] [CrossRef]
Mutani, G.; Scalise, A.; Sufa, X.; Grasso, S. Synergising Machine Learning and Remote Sensing for Urban Heat Island Dynamics: A Comprehensive Modelling Approach. Atmosphere 2024, 15, 1435. [Google Scholar] [CrossRef]
Kim, Y.; Yoo, C.; Im, J. Nighttime Satellite Land Surface Temperature for Urban Applications: Achievements, Challenges, and Future Prospects. GIScience Remote Sens. 2025, 62, 2527990. [Google Scholar] [CrossRef]
Hashemi, F.; Adib, M. Examining Thermal Inequities: Land Surface Temperature, Social Vulnerability, and Historical Redlining in San Antonio, TX. Urban Clim. 2024, 55, 101960. [Google Scholar] [CrossRef]
Chen, S.; Bruhn, S.; Seto, K.C. Trends in Socioeconomic Disparities in Urban Heat Exposure and Adaptation Options in Mid-Sized U.S. Cities. Remote Sens. Appl. Soc. Environ. 2024, 36, 101313. [Google Scholar] [CrossRef]
Mallick, J.; Alqadhi, S. Explainable Artificial Intelligence Models for Proposing Mitigation Strategies to Combat Urbanization Impact on Land Surface Temperature Dynamics in Saudi Arabia. Urban Clim. 2025, 59, 102259. [Google Scholar] [CrossRef]
Mansouri, A.; Erfani, A. Machine Learning Prediction of Urban Heat Island Severity in the Midwestern United States. Sustainability 2025, 17, 6193. [Google Scholar] [CrossRef]
Ahmed, A.N.; AlDahoul, N.; Aziz, N.A.; Huang, Y.F.; Sherif, M.; El-Shafie, A. The Urban Heat Island Effect: A Review on Predictive Approaches Using Artificial Intelligence Models. City Environ. Interact. 2025, 28, 100234. [Google Scholar] [CrossRef]
Snaiki, R.; Merabtine, A. Recent Advances on Machine Learning Techniques for Urban Heat Island Applications: A Review and New Horizons. Sustain. Cities Soc. 2025, 134, 106943. [Google Scholar] [CrossRef]
Gaur, A.; Deb, C. Machine Learning Methods and Approaches for Urban Heat Island (UHI) Assessment: A Comprehensive Review. Renew. Sustain. Energy Rev. 2026, 234, 116903. [Google Scholar] [CrossRef]
Darvishvand, L.; Kamkari, B.; Huang, M.J.; Hewitt, N.J. A Systematic Review of Explainable Artificial Intelligence in Urban Building Energy Modeling: Methods, Applications, and Future Directions. Sustain. Cities Soc. 2025, 128, 106492. [Google Scholar] [CrossRef]
Feng, F.; Ren, Y.; Xu, C.; Jia, B.; Wu, S.; Lafortezza, R. Exploring the Non-Linear Impacts of Urban Features on Land Surface Temperature Using Explainable Artificial Intelligence. Urban Clim. 2024, 56, 102045. [Google Scholar] [CrossRef]
Tahooni, A.; Kakroodi, A.A.; Kiavarz, M.; Mansourian, H. High-Resolution Urban LST Downscaling via Machine Learning and SHAP: A Case Study in a Rapidly Urbanizing Semi-Arid Region. Sustain. Cities Soc. 2025, 134, 106897. [Google Scholar] [CrossRef]
Hong, T.; Yim, S.H.L.; Heo, Y. Interpreting Complex Relationships between Urban and Meteorological Factors and Street-Level Urban Heat Islands: Application of Random Forest and SHAP Method. Sustain. Cities Soc. 2025, 126, 106353. [Google Scholar] [CrossRef]
Liao, S.; Liu, Z. Explaining and Reducing Urban Heat Islands Through Machine Learning: Evidence from New York City. Buildings 2026, 16, 186. [Google Scholar] [CrossRef]
Zhang, Y.; Ge, J.; Bai, X.; Wang, S. Blue-Green Space Seasonal Influence on Land Surface Temperatures across Different Urban Functional Zones: Integrating Random Forest and Geographically Weighted Regression. J. Environ. Manag. 2025, 374, 123975. [Google Scholar] [CrossRef]
Li, K.; Chen, Y.; Gao, S. Uncertainty of City-Based Urban Heat Island Intensity across 1112 Global Cities: Background Reference and Cloud Coverage. Remote Sens. Environ. 2022, 271, 112898. [Google Scholar] [CrossRef]
Kong, G.; Peng, J.; Corcoran, J. Modelling Urban Heat Island Effects: A Global Analysis of 216 Cities Using Machine Learning Techniques. Comput. Urban Sci. 2025, 5, 18. [Google Scholar] [CrossRef]
Azizi, S.; Azizi, T. Urban Climate Dynamics: Analyzing the Impact of Green Cover and Air Pollution on Land Surface Temperature—A Comparative Study Across Chicago, San Francisco, and Phoenix, USA. Atmosphere 2024, 15, 917. [Google Scholar] [CrossRef]
Sheridan, S.; De Guzman, E.B.; Eisenman, D.P.; Sailor, D.J.; Parfrey, J.; Kalkstein, L.S. Increasing Tree Cover and High-Albedo Surfaces Reduces Heat-Related ER Visits in Los Angeles, CA. Int. J. Biometeorol. 2024, 68, 1603–1614. [Google Scholar] [CrossRef]
Mejia, J.F.; Henao, J.J.; Eslami, E. Role of Clouds in the Urban Heat Island and Extreme Heat: Houston-Galveston Metropolitan Area Case. JGR Atmos. 2024, 129, e2024JD041243. [Google Scholar] [CrossRef]
Suraj, K.C.; Chiluwal, A.; Magar, L.P.; Paudel, K. Investigating Urban Heat Islands in Miami, Florida, Utilizing Planet and Landsat Satellite Data. Atmosphere 2025, 16, 880. [Google Scholar] [CrossRef]
De Wit, V.; Forsythe, K.W. Urban Structure Changes in Three Areas of Detroit, Michigan (2014–2018) Utilizing Geographic Object-Based Classification. Land 2023, 12, 763. [Google Scholar] [CrossRef]
Bhatta, D. Grid-Level Spatial and Temporal Analysis of Land Surface Temperature and the Association with Land Use and Land Cover: A Case Study of Minnesota, USA Between 2013–2022. Available online: https://repository.stcloudstate.edu/gp_etds/18/ (accessed on 16 February 2026).
Li, X.; Chakraborty, T.; Wang, G. Comparing Land Surface Temperature and Mean Radiant Temperature for Urban Heat Mapping in Philadelphia. Urban Clim. 2023, 51, 101615. [Google Scholar] [CrossRef]
Gray, L. Remote Sensing-Based Analysis of Urban Heat Islands and Historical Housing Discrimination in Boston, MA. Bachelor’s Thesis, Dartmouth College, Hanover, NH, USA, 2024. Available online: https://digitalcommons.dartmouth.edu/geography_senior_theses/8 (accessed on 16 February 2026).
Google Earth Engine TIGER: U.S. Census Tracts, 2020 (TIGER/2020/TRACT). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/TIGER_2020_TRACT (accessed on 16 February 2026).
Google Earth Engine USGS National Land Cover Database (NLCD) 2019 Release (USGS/NLCD_RELEASES/2019_REL/NLCD). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/USGS_NLCD_RELEASES_2019_REL_NLCD (accessed on 16 February 2026).
Google Earth Engine USGS Landsat 8 Level-2, Collection 2, Tier 1 (LANDSAT/LC08/C02/T1_L2). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2 (accessed on 16 February 2026).
Google Earth Engine. USGS Landsat 9 Level-2, Collection 2, Tier 1 (LANDSAT/LC09/C02/T1_L2). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_L2 (accessed on 16 February 2026).
Google Earth Engine. JRC Global Surface Water Mapping Layers, v1.4 (JRC/GSW1_4/GlobalSurfaceWater). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/JRC_GSW1_4_GlobalSurfaceWater (accessed on 16 February 2026).
Google Earth Engine. VIIRS Stray Light–Corrected Nighttime Day/Night Band Monthly Composites, Version 1 (NOAA/VIIRS/DNB/MONTHLY_V1/VCMSLCFG). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/NOAA_VIIRS_DNB_MONTHLY_V1_VCMSLCFG (accessed on 16 February 2026).
Google Earth Engine. ERA5-Land Hourly - ECMWF Climate Reanalysis (ECMWF/ERA5_LAND/HOURLY). 2024. Available online: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_HOURLY (accessed on 16 February 2026).
Naegeli, K.; Damm, A.; Huss, M.; Wulf, H.; Schaepman, M.; Hoelzle, M. Cross-Comparison of Albedo Products for Glacier Surfaces Derived from Airborne and Satellite (Sentinel-2 and Landsat 8) Optical Data. Remote Sens. 2017, 9, 110.
Jadhav, A.V.; Belange, K.; Gajbhiv, N.; Kumar, V.; Rahul, P.R.C.; Sudeepkumar, B.L.; Bhawar, R.L. Evaluation of the Reanalysis and Satellite Surface Solar Radiation Datasets Using Ground-Based Observations over India. Atmosphere 2025, 16, 957.
U.S. Geological Survey. Landsat Collection 2 Surface Temperature. 2024. Available online: https://www.usgs.gov/landsat-missions/landsat-collection-2-surface-temperature (accessed on 16 February 2026).
Alejo-Sanchez, L.E.; Márquez-Grajales, A.; Salas-Martínez, F.; Franco-Arcega, A.; López-Morales, V.; Acevedo-Sandoval, O.A.; González-Ramírez, C.A.; Villegas-Vega, R. Missing Data Imputation of Climate Time Series: A Review. MethodsX 2025, 15, 103455.
Ayiah-Mensah, F.; Bosson-Amedenu, S.; Baah, E.M.; Addor, J.A. Advancements in Seasonal Rainfall Forecasting: A Seasonal Auto-Regressive Integrated Moving Average Model with Outlier Adjustments for Ghana’s Western Region. Sci. Afr. 2025, 28, e02632.
Dash, C.S.K.; Behera, A.K.; Dehuri, S.; Ghosh, A. An Outliers Detection and Elimination Framework in Classification Task of Data Mining. Decis. Anal. J. 2023, 6, 100164.
West, R.M. Best Practice in Statistics: The Use of Log Transformation. Ann. Clin. Biochem. 2022, 59, 162–165.
Koukaras, P.; Tjortjis, C. Data Preprocessing and Feature Engineering for Data Mining: Techniques, Tools, and Best Practices. AI 2025, 6, 257.
Tawakuli, A.; Havers, B.; Gulisano, V.; Kaiser, D.; Engel, T. Survey: Time-Series Data Preprocessing: A Survey and an Empirical Analysis. J. Eng. Res. 2025, 13, 674–711.
Sattar, M.U.; Dattana, V.; Hasan, R.; Mahmood, S.; Khan, H.W.; Hussain, S. Enhancing Supply Chain Management: A Comparative Study of Machine Learning Techniques with Cost–Accuracy and ESG-Based Evaluation for Forecasting and Risk Mitigation. Sustainability 2025, 17, 5772.
Olyasani, M.; Azimi, H.; Shiri, H. Robust Tree-Based Machine Learning Algorithms for Predicting Drag Anchor Performance. J. Mar. Sci. Eng. 2026, 14, 281.
Zou, H.; Hastie, T. Regularization and Variable Selection Via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
Dormann, C.F.; Elith, J.; Bacher, S.; Buchmann, C.; Carl, G.; Carré, G.; Marquéz, J.R.G.; Gruber, B.; Lafourcade, B.; Leitão, P.J.; et al. Collinearity: A Review of Methods to Deal with It and a Simulation Study Evaluating Their Performance. Ecography 2013, 36, 27–46.
Baggag, A.; Saad, Y. Deep Learning, Transformers and Graph Neural Networks: A Linear Algebra Perspective. Numer. Algorithms 2025, 100, 2095–2134.
Geurts, P.; Ernst, D.; Wehenkel, L. Extremely Randomized Trees. Mach. Learn. 2006, 63, 3–42.
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794.
Ploton, P.; Mortier, F.; Réjou-Méchain, M.; Barbier, N.; Picard, N.; Rossi, V.; Dormann, C.; Cornu, G.; Viennois, G.; Bayol, N.; et al. Spatial Validation Reveals Poor Predictive Performance of Large-Scale Ecological Mapping Models. Nat. Commun. 2020, 11, 4540.
Milà, C.; Ludwig, M.; Pebesma, E.; Tonne, C.; Meyer, H. Random Forests with Spatial Proxies for Environmental Modelling: Opportunities and Pitfalls. Geosci. Model Dev. 2024, 17, 6007–6033.
Koldasbayeva, D.; Tregubova, P.; Gasanov, M.; Zaytsev, A.; Petrovskaia, A.; Burnaev, E. Challenges in Data-Driven Geospatial Modeling for Environmental Research and Practice. Nat. Commun. 2024, 15, 10700.
Linnenbrink, J.; Milà, C.; Ludwig, M.; Meyer, H. kNNDM CV: K-Fold Nearest-Neighbour Distance Matching Cross-Validation for Map Accuracy Estimation. Geosci. Model Dev. 2024, 17, 5897–5912.
Hutengs, C.; Vohland, M. Downscaling Land Surface Temperatures at Regional Scales with Random Forest Regression. Remote Sens. Environ. 2016, 178, 127–141.
Shapley, L.S.; Snow, R.N. Basic Solutions of Discrete Games. Contrib. Theory Games 1952, 1, 27.
Lundberg, S.M.; Lee, S.-I. Consistent Feature Attribution for Tree Ensembles. arXiv 2018, arXiv:1706.06060.
Li, Z.; Ma, J.; Jiang, F.; Zhang, S.; Tan, Y. Assessing the Impacts of Urban Morphological Factors on Urban Building Energy Modeling Based on Spatial Proximity Analysis and Explainable Machine Learning. J. Build. Eng. 2024, 85, 108675.
Seyrfar, A.; Ataei, H.; Movahedi, A.; Derrible, S. Data-Driven Approach for Evaluating the Energy Efficiency in Multifamily Residential Buildings. Pract. Period. Struct. Des. Constr. 2021, 26, 04020074.
Zhang, Y.; Teoh, B.K.; Wu, M.; Chen, J.; Zhang, L. Data-Driven Estimation of Building Energy Consumption and GHG Emissions Using Explainable Artificial Intelligence. Energy 2023, 262, 125468.
Senaviratna, N.A.M.R.; Cooray, T.M.J.A. Diagnosing Multicollinearity of Logistic Regression Model. Asian J. Probab. Stat. 2019, 5, 1–9.
Zaki, A.; Métwalli, A.; Aly, M.H.; Badawi, W.K. 5G and Beyond: Channel Classification Enhancement Using VIF-Driven Preprocessing and Machine Learning. Electronics 2023, 12, 3496.
Xi, W.-F.; Jiang, Q.-W.; Yang, A.-M. Using Stepwise Regression to Address Multicollinearity Is Not Appropriate. Int. J. Surg. 2024, 110, 3122–3123.
Esposito, A.; Pappaccogli, G.; Bozzeda, F.; Buccolieri, R. A Multi-City Statistical Modelling of Surface Urban Heat Island: Application to Italian Cities. Urban Clim. 2025, 64, 102717.
Hsu, A.; Sheriff, G.; Chakraborty, T.; Manya, D. Disproportionate Exposure to Urban Heat Island Intensity across Major US Cities. Nat. Commun. 2021, 12, 2721.
Tanoori, G.; Soltani, A.; Modiri, A. Machine Learning for Urban Heat Island (UHI) Analysis: Predicting Land Surface Temperature (LST) in Urban Environments. Urban Clim. 2024, 55, 101962.
Vahid, R.; Aly, M.H. A Comprehensive Systematic Review of Machine Learning Applications in Assessing Land Use/Cover Dynamics and Their Impact on Land Surface Temperatures. Urban Sci. 2025, 9, 234.
Galalizadeh, S.; Morrison-Saunders, A.; Horwitz, P.; Silberstein, R.; Blake, D. The Cooling Impact of Urban Greening: A Systematic Review of Methodologies and Data Sources. Urban For. Urban Green. 2024, 95, 128157.
Moncada-Morales, G.A.; Verichev, K.; López-Guerrero, R.E.; Carpio, M. A Global Review of Vegetation’s Interaction Effect on Urban Heat Mitigation Across Different Climates. Urban Sci. 2025, 9, 361.
Soltanifard, H.; Amani-Beni, M. The Cooling Effect of Urban Green Spaces as Nature-Based Solutions for Mitigating Urban Heat: Insights from a Decade-Long Systematic Review. Clim. Risk Manag. 2025, 49, 100731.
Shafizadeh-Moghadam, H.; Xu, T.; Murayama, Y. Climate-Specific Trends in Urban Land Surface Temperature: A Global Analysis of 432 Cities (2014–2024). J. Environ. Manag. 2025, 395, 127789.
Liu, S.; Li, X.; Shi, Z.; Geng, M.; Yu, G.; Hu, T. Urbanization Is Projected to Increase Local Surface Temperature by 2100. Commun. Earth Environ. 2025, 6, 988.
Hoang, N.-D.; Tran, V.-D.; Huynh, T.-C. From Data to Insights: Modeling Urban Land Surface Temperature Using Geospatial Analysis and Interpretable Machine Learning. Sensors 2025, 25, 1169.
Li, H.; Yang, J.; Xin, J.; Yu, W.; Ren, J.; Yu, H.; Xiao, X.; Xia, J. (Cecilia) Investigating the Effect of Urban Form on Land Surface Temperature at Block and Grid Scales Based on XGBoost-SHAP. Environ. Model. Softw. 2026, 195, 106738.
Kumar, P.; Debele, S.E.; Khalili, S.; Halios, C.H.; Sahani, J.; Aghamohammadi, N.; Andrade, M.D.F.; Athanassiadou, M.; Bhui, K.; Calvillo, N.; et al. Urban Heat Mitigation by Green and Blue Infrastructure: Drivers, Effectiveness, and Future Needs. Innovation 2024, 5, 100588.
Han, M.; Zhang, T.; Si, Z. Optimizing Urban Blue-Green Space in Climate Adaptive Planning: A Systematic Review of Threshold Value of Efficiency Thresholds. Landsc. Ecol. 2025, 40, 13.
Li, J.; Wang, L.; Xie, X.; Zhang, X. Urban Blue Spaces and Urban Heat Island Mitigation: A Bibliometric and Systematic Review of Spatiotemporal Dynamics, Morphology, and Planning Integration. Buildings 2026, 16, 834.
Li, Y.; Svenning, J.-C.; Zhou, W.; Zhu, K.; Abrams, J.F.; Lenton, T.M.; Ripple, W.J.; Yu, Z.; Teng, S.N.; Dunn, R.R.; et al. Green Spaces Provide Substantial but Unequal Urban Cooling Globally. Nat. Commun. 2024, 15, 7108.
Rao, P.; Torreggiani, D.; Tassinari, P.; Rötzer, T.; Pauleit, S.; Rahman, M.A. Do Urban Green Spaces Cool Cities Differently across Latitudes? Spatial Variability and Climatic Drivers of Vegetation-Induced Cooling. Sustain. Cities Soc. 2025, 130, 106513.
Alonzo, M.; Ibsen, P.C.; Locke, D.H. Urban Trees and Cooling: A Review of the Recent Literature (2018 to 2024). J. Arboric. Urban For. 2025, 51, 420–444.

Figure 1. Workflow for Tract-Scale Urban Heat Modeling. TIGER 2020 and NLCD 2019 refer to [28] and [29], respectively.

Figure 2. Locations of Study Cities Across the United States.

Figure 3. State Coverage of Urban Heat Data.

Figure 4. MLP architecture used in this study.

Figure 5. RF workflow with nested CV used in this study.

Figure 6. Distribution of Tract-Level (a) LST and (b) SUHII Across Major U.S. Cities. Note: The orange lines indicate the median values.

Figure 7. True-color imagery of the Phoenix metropolitan area, showing urban form and land cover patterns that provide physical context for the tract-level heat drivers mapped in Figure 8. Note. National Agriculture Imagery Program (NAIP) true-color imagery (2018–2024) was processed in GEE, clipped to a 20-km buffer around Phoenix, and rendered in QGIS.

Figure 8. Summer (June–August) multi-year median (2018–2024) spatial patterns of urban heat drivers across Phoenix, Arizona, shown at the census-tract scale: (a) LST (°C), (b) NDMI, (c) albedo, (d) impervious surface (%), (e) VIIRS nighttime radiance (nW·cm⁻²·sr⁻¹), and (f) distance to water (m). Note. Data processing was performed in GEE, and maps were rendered in QGIS.

Figure 9. Distributions of input variables and outcome measures (SUHII and LST).

Figure 10. Pearson correlation matrix of input variables after transformations.

Figure 11. Best-performing models for SUHII and LST.

Figure 12. Global SHAP feature importance (a,c) and SHAP summary effects (b,d) for SUHII (XGBoost) and LST (Extra Trees).

Figure 13. SHAP dependence and interaction plots for the most influential predictors of SUHII (a–f) and LST (g–l).

Figure 14. Mean absolute SHAP-based predictor importance for SUHII and LST by city and regional climate setting: (a) SUHII by city, (b) SUHII by region, (c) LST by city, and (d) LST by region.
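
The importance measure summarized in Figures 12 and 14 is the mean absolute SHAP value per predictor, computed over all tracts or within a grouping such as city or region. A minimal sketch of that aggregation is given below. It assumes the SHAP values are already available as one row per tract and one column per predictor; the function names, the toy attribution matrix, and the group labels "A" and "B" are hypothetical and do not come from the study.

```python
from collections import defaultdict


def mean_abs_shap(shap_rows):
    """Mean absolute SHAP value per predictor (one row per tract)."""
    n_rows = len(shap_rows)
    n_features = len(shap_rows[0])
    return [
        sum(abs(row[j]) for row in shap_rows) / n_rows
        for j in range(n_features)
    ]


def mean_abs_shap_by_group(shap_rows, groups):
    """Same aggregation, computed separately for each group label."""
    grouped = defaultdict(list)
    for row, group in zip(shap_rows, groups):
        grouped[group].append(row)
    return {group: mean_abs_shap(rows) for group, rows in grouped.items()}


# Toy attribution matrix: 4 tracts x 2 predictors (illustrative values only).
toy_shap = [[1.0, -2.0], [3.0, 0.0], [-1.0, 4.0], [1.0, 2.0]]
toy_city = ["A", "A", "B", "B"]  # hypothetical city labels

global_importance = mean_abs_shap(toy_shap)
by_city = mean_abs_shap_by_group(toy_shap, toy_city)
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, which is why a predictor can rank highly even when its signed effect averages to nearly zero.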

Table 1. U.S. cities selected for this study.

Region	City A	City B
West	Phoenix, AZ: Located in the Sonoran Desert, this city experiences extremely hot summers. Rapid urban growth and extensive development have created one of the most intense urban heat island effects in the United States [20]	Los Angeles, CA: A large metropolitan region where extreme heat is an increasing public health risk, driven by dense urban development, limited tree cover, and uneven access to cooling, producing strong neighborhood-scale heat impacts [21]
South	Houston, TX: A warm, humid metropolitan region where urbanization raises overall heat exposure, while sea-breeze circulation and urban-enhanced clouds partially limit peak afternoon heat [22]	Miami, FL: A hot, humid city where urban heat islands are strongest in built-up areas, while vegetated zones show consistently lower surface temperatures [23]
Midwest	Detroit, MI: A humid continental city where population loss and widespread demolition have reshaped the urban fabric, leaving extensive vacant land and concentrated impervious surfaces in remaining built areas [24]	Minneapolis, MN: A humid continental city where heat is concentrated in built-up areas, while higher vegetation cover is associated with lower surface temperatures [25]
Northeast	Philadelphia, PA: A dense city where heat exposure is highest in heavily built areas and reduced where tree canopy and street shading are present [26]	Boston, MA: Dense urban areas are warmer than surrounding regions due to impervious surfaces and reduced evapotranspiration, with the urban heat island strongest during summer [27]

Table 2. Description of predictor variables and outcomes used in the analysis.

Category	Variable	Description	Unit	Data Source
Land Cover/Surface Properties	NDMI	Normalized Difference Moisture Index, a proxy for surface moisture and evapotranspiration	Unitless (−1 to 1)	[30,31]
Land Cover/Surface Properties	IMPERV	Mean impervious surface fraction within each census tract	Percent (%)	[29]
Land Cover/Surface Properties	ALBEDO	Surface broadband albedo, representing solar reflectance	Unitless (0–1)	derived from [30,31]
Land Cover/Surface Properties	AWATER	Water area within the census tract	m²	[28]
Land Cover/Surface Properties	DIST_WATER	Distance to the nearest major water body (coast, river, lake)	m	[32]
Urban Geometry/Built Environment	VIIRS_RAD	Mean VIIRS nighttime lights radiance, used as a proxy for human activity intensity	nW·cm⁻²·sr⁻¹	[33]
Urban Geometry/Built Environment	ALAND	Land area of census tract	m²	[28]
Spatial Context/Neighborhood Effects	INTPTLAT	Latitude of census tract internal representative point	Degrees (°)	[28]
Spatial Context/Neighborhood Effects	DIST_CITY_CENTER	Distance from the tract centroid to the city center	km	derived from [28]
Climate	MEAN_SUMMER_TEMP	Long-term mean summer air temperature (climatology)	°C	[34]
Climate	SOLAR_RAD	Mean incoming solar radiation (long-term average)	W·m⁻²	[34]
Outcome Variables	LST	Mean land surface temperature aggregated to the census tract	°C	derived from [30,31]
Outcome Variables	SUHII	Surface Urban Heat Island Intensity	Unitless	Derived from LST

Table 3. Equations of analysis variables.

Variable	Equation/Definition	References
NDMI	$NDMI = \frac{ρ_{NIR} - ρ_{SWIR1}}{ρ_{NIR} + ρ_{SWIR1}}$, where $ρ_{NIR}$ and $ρ_{SWIR1}$ are Landsat 8/9 Collection-2 Level-2 surface reflectance from bands SR_B5 and SR_B6; summer median composite; tract value = mean of pixels within tract.	[3]
IMPERV	$IMPERV = \bar{I}$ , where $I$ is the NLCD 2019 impervious surface fraction (%) at 30 m; tract value = mean impervious fraction within tract.	–
ALBEDO	$α = 0.356 ρ_{B2} + 0.130 ρ_{B4} + 0.373 ρ_{B5} + 0.085 ρ_{B6} + 0.072 ρ_{B7} - 0.0018$, where $α$ is the broadband albedo and $ρ_{Bi}$ are Landsat 8/9 surface reflectance bands (SR_B2, SR_B4, SR_B5, SR_B6, SR_B7); summer median composite; tract mean.	[35]
AWATER	$AWATER = A_{water}$ , where $A_{water}$ is total water area within the census-tract polygon.	–
DIST_WATER	$\mathrm{DIST\_WATER} = d(x_{t}, x_{w})$, where $x_{t}$ is the tract location and $x_{w}$ is the nearest pixel with JRC Global Surface Water occurrence ≥ 50%; distance computed via Euclidean distance transform and converted to meters using analysis scale.	–
VIIRS_RAD	$\mathrm{VIIRS\_RAD} = \bar{R}$, where $R$ is monthly VIIRS DNB average radiance (avg_rad); summer months filtered and median-aggregated across years; tract mean.	–
ALAND	$ALAND = A_{land}$ , where $A_{land}$ is total land area of the census tract.	–
INTPTLAT	$INTPTLAT = ϕ$ , where $ϕ$ is the latitude (degrees) of the tract internal representative point.	–
DIST_CITY_CENTER	$\mathrm{DIST\_CITY\_CENTER} = \frac{1}{1000} d(x_{c}, x_{t})$, where $x_{c}$ is the city-center point and $x_{t}$ is the tract geometry centroid; distance converted to kilometers.	–
MEAN_SUMMER_TEMP	$\bar{T}_{summer} = \overline{(T_{2m} - 273.15)}$, where $T_{2m}$ is ERA5-Land monthly 2-m air temperature (K); averaged over summer months (2018–2024); tract mean.	–
SOLAR_RAD	$\bar{S} = \overline{\left(\frac{E_{↓}}{Δt}\right)}$, where $E_{↓}$ is ERA5-Land surface solar radiation downwards (J m⁻² per month) and $Δt$ is seconds per month; summer mean; tract mean.	[36]
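LST	$LST = (ST_{B10} \cdot 0.00341802 + 149.0) - 273.15$, where $ST_{B10}$ is the Landsat 8/9 Collection 2 Level-2 surface temperature band (ST_B10) digital number, and 0.00341802 and 149.0 are the scale factor and offset that convert it to kelvin before subtracting 273.15 to obtain °C; summer median composite; tract mean.	[37]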
SUHII	$SUHII = \frac{LST_{tract} - \overline{LST}_{region}}{σ_{region}}$, where the regional mean $\overline{LST}_{region}$ and standard deviation $σ_{region}$ are computed over the city buffer.	[14]

Note: All regional statistics and tract-level predictors were computed within a circular buffer of 20 km radius centered on each city’s reference point (latitude–longitude), with summer defined as June–August for the period 2018–2024. For SUHII, this buffer was used to define the reference thermal background for each city, allowing tract temperatures to be expressed as normalized anomalies relative to a consistent local context.
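
As an illustration of the Table 3 definitions, the following minimal sketch applies the NDMI, albedo, LST, and SUHII formulas to single values. It is not the study's GEE workflow: the function names and the numeric inputs are hypothetical, and the regional mean and standard deviation are passed in as plain numbers rather than computed over a 20 km buffer.

```python
def ndmi(nir, swir1):
    """NDMI from NIR (SR_B5) and SWIR1 (SR_B6) surface reflectance."""
    return (nir - swir1) / (nir + swir1)


def broadband_albedo(b2, b4, b5, b6, b7):
    """Broadband albedo from surface reflectance, using the Table 3 weights."""
    return (0.356 * b2 + 0.130 * b4 + 0.373 * b5
            + 0.085 * b6 + 0.072 * b7 - 0.0018)


def lst_celsius(st_b10_dn):
    """Convert an ST_B10 digital number to kelvin, then to degrees Celsius."""
    kelvin = st_b10_dn * 0.00341802 + 149.0
    return kelvin - 273.15


def suhii(lst_tract, lst_region_mean, lst_region_std):
    """Tract LST expressed as a z-score against the city-buffer background."""
    return (lst_tract - lst_region_mean) / lst_region_std


# Illustrative inputs only (not values from the study).
example_ndmi = ndmi(0.30, 0.20)
example_albedo = broadband_albedo(0.2, 0.2, 0.2, 0.2, 0.2)
example_lst = lst_celsius(44000)
example_suhii = suhii(45.0, 42.0, 2.0)
```

Because SUHII divides by the regional standard deviation, a value of 1.5 means the tract is 1.5 standard deviations warmer than its own city's background, which is what makes the measure comparable across cities with very different absolute LST.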

Table 4. Best hyperparameters selected for SUHII and LST predictions.

Model	SUHII—Best Hyperparameters	LST—Best Hyperparameters
ElasticNet	alpha = 0.083; l1_ratio = 0.275; fit_intercept = True; max_iter = 40,000; random_state = 42	alpha = 0.0829; l1_ratio = 0.275; fit_intercept = True; max_iter = 40,000; random_state = 42
MLP	hidden_layer_sizes = (256, 128); alpha = 2.15 × 10⁻⁶; learning_rate_init = 0.0003; activation = relu; solver = adam; early_stopping = True; max_iter = 2000; n_iter_no_change = 60; random_state = 42	hidden_layer_sizes = (128, 128); alpha = 0.00316; learning_rate_init = 0.001; activation = tanh; solver = adam; early_stopping = True; max_iter = 2000; n_iter_no_change = 60; random_state = 42
RF	n_estimators = 600; max_depth = None; min_samples_split = 5; min_samples_leaf = 3; max_features = sqrt; bootstrap = False; criterion = squared_error; n_jobs = 1; random_state = 42	n_estimators = 600; max_depth = None; min_samples_split = 2; min_samples_leaf = 3; max_features = 0.6; bootstrap = True; criterion = squared_error; n_jobs = 1; random_state = 42
Extra Trees	n_estimators = 1500; max_depth = 20; min_samples_split = 2; min_samples_leaf = 1; max_features = sqrt; criterion = squared_error; n_jobs = 1; random_state = 42	n_estimators = 1500; max_depth = 12; min_samples_split = 2; min_samples_leaf = 1; max_features = 0.8; criterion = squared_error; n_jobs = 1; random_state = 42
XGBoost	n_estimators = 1200; learning_rate = 0.01; max_depth = 3; subsample = 1.0; colsample_bytree = 0.7; min_child_weight = 8; reg_lambda = 1.0; reg_alpha = 0.1; objective = reg:squarederror; tree_method = hist; n_jobs = 4; random_state = 42	n_estimators = 600; learning_rate = 0.01; max_depth = 5; subsample = 0.7; colsample_bytree = 0.7; min_child_weight = 5; reg_lambda = 1.0; reg_alpha = 0.5; objective = reg:squarederror; tree_method = hist; n_jobs = 4; random_state = 42

Note: Models were trained and evaluated using city-based nested CV (eight outer and four inner folds), with hyperparameters tuned via RandomizedSearchCV. Missing values were handled by feature-wise median imputation. The ElasticNet and MLP models used winsorization and log1p transformation of selected skewed predictors, whereas the tree-based models used winsorization only. lr denotes learning rate.
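
The city-based nested CV described in the note can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's pipeline: ElasticNet stands in for all five model families, the search space and `n_iter` are hypothetical, and the use of `GroupKFold` to keep the four inner folds grouped by city is an assumption, since the note does not state how the inner folds were formed.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import ElasticNet
from sklearn.metrics import r2_score
from sklearn.model_selection import (GroupKFold, LeaveOneGroupOut,
                                     RandomizedSearchCV)
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data: 8 "cities" with 40 "tracts" each (assumption).
rng = np.random.default_rng(42)
n_cities, n_per_city, n_features = 8, 40, 5
X = rng.normal(size=(n_cities * n_per_city, n_features))
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.0]) + rng.normal(scale=0.1, size=len(X))
city = np.repeat(np.arange(n_cities), n_per_city)

# Hypothetical search space; the study's actual ranges are not reported here.
param_distributions = {
    "elasticnet__alpha": np.logspace(-3, 0, 20).tolist(),
    "elasticnet__l1_ratio": np.linspace(0.1, 0.9, 9).tolist(),
}

held_out_cities, held_out_r2 = [], []
outer = LeaveOneGroupOut()  # eight outer folds: one held-out city per fold
for train_idx, test_idx in outer.split(X, y, groups=city):
    # Median imputation sits inside the pipeline so it is refit on each
    # training split and never sees the held-out city.
    pipe = make_pipeline(
        SimpleImputer(strategy="median"),
        ElasticNet(max_iter=40_000, random_state=42),
    )
    search = RandomizedSearchCV(
        pipe,
        param_distributions,
        n_iter=5,
        cv=GroupKFold(n_splits=4),  # inner folds also split by city (assumption)
        scoring="r2",
        random_state=42,
    )
    search.fit(X[train_idx], y[train_idx], groups=city[train_idx])
    predictions = search.predict(X[test_idx])
    held_out_cities.append(set(city[test_idx].tolist()))
    held_out_r2.append(r2_score(y[test_idx], predictions))
```

Tuning only on the seven training cities in each outer fold means the held-out city influences neither the hyperparameters nor the imputation medians, so the outer-fold scores estimate performance on an unseen city.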

Table 5. Summary statistics of input variables and outcomes.

Variable	Mean	Median	Std	Min	Q1	Q3	Max	Missing (%)
NDMI	0.119	0.106	0.09	−0.412	0.042	0.181	0.422	0.33
IMPERV	59.672	62.746	19.163	0	47.386	73.971	98.608	0.33
ALBEDO	0.17	0.168	0.027	0.003	0.156	0.187	0.278	0.33
AWATER	613,833	0	18,414,222	0	0	16,010	8.2 × 10⁸	0.33
DIST_WATER	3045.614	2458.351	2358.01	0	1134.857	4489.605	12,264.82	0.33
VIIRS_RAD	48.302	42.46	31.239	0.821	27.088	61.25	289.338	0.33
ALAND	2,411,929	1,152,250	23,341,993	0	603,817.5	2,123,429	1.49 × 10⁹	0.33
INTPTLAT	36.415	34.111	5.823	25.473	33.473	42.33	45.174	0.33
DIST_CITY_CENTER	11.83	11.988	5.67	0.071	7.243	16.632	52.657	0.33
MEAN_SUMMER_TEMP	25.124	24.033	3.452	21.321	22.692	27.5	34.826	2.07
SOLAR_RAD	8.895	8.39	1.148	7.778	7.911	10.484	10.673	2.07
SUHII	0.442	0.516	0.671	−2.68	0.065	0.907	2.088	0.33
LST	42.498	42.404	6.465	17.769	37.611	46.634	60.824	0.33

Note: Census-tract-level summary statistics (N = 5144). Std = standard deviation; Q1 and Q3 = the 25th and 75th percentiles.
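
Several predictors in Table 5 are strongly right-skewed; AWATER, for example, has a median of 0 and a maximum of 8.2 × 10⁸ m². The winsorization and log1p steps applied to such variables can be sketched as below. The 1st and 99th percentile limits, the function names, and the synthetic input are assumptions for illustration; the limits actually used in the study are not given in this section.

```python
import numpy as np


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the given percentile limits (limits are assumptions)."""
    values = np.asarray(values, dtype=float)
    low, high = np.nanpercentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


def winsorize_log1p(values, lower_pct=1.0, upper_pct=99.0):
    """Winsorize, then apply log(1 + x) to compress the right tail."""
    return np.log1p(winsorize(values, lower_pct, upper_pct))


# Synthetic, AWATER-like variable: many zeros plus a long right tail.
rng = np.random.default_rng(0)
awater_like = rng.lognormal(mean=8.0, sigma=2.0, size=1000)
awater_like[:300] = 0.0

clipped = winsorize(awater_like)
transformed = winsorize_log1p(awater_like)
```

In a cross-validated setting, the percentile limits would normally be estimated on the training folds only and then applied to the held-out data; log1p is used rather than a plain logarithm because it is defined at zero.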

Table 6. Variance inflation factor (VIF) values for predictor variables.

Predictor	VIF
VIIRS_RAD	4.667
IMPERV	4.425
NDMI	4.092
ALBEDO	3.179
INTPTLAT	2.674
MEAN_SUMMER_TEMP	2.558
ALAND	2.332
DIST_WATER	2.247
AWATER	2.232
SOLAR_RAD	2.176
DIST_CITY_CENTER	1.898
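
For reference, the VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others. A minimal NumPy sketch of that calculation follows. The function name and the synthetic three-column matrix are hypothetical, and the sketch does not reproduce the preprocessing applied to the predictors before the Table 6 values were computed.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF per column: regress each column on the rest (with intercept)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = np.empty(n_cols)
    for j in range(n_cols):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Synthetic demo: column b is built from column a, column c is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + rng.normal(scale=0.5, size=500)
c = rng.normal(size=500)
X_demo = np.column_stack([a, b, c])

vif_values = variance_inflation_factors(X_demo)
```

The same values can be read off the diagonal of the inverse correlation matrix, which provides a quick consistency check on the regression-based calculation.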

Table 7. Model performance for SUHII and LST.

Target	Model	R²	MAE	RMSE
SUHII	XGBoost	0.879	0.162	0.213
	RF	0.867	0.196	0.251
	Extra Trees	0.844	0.212	0.272
	MLP	0.826	0.198	0.255
	Elastic Net	0.788	0.254	0.316
LST	Extra Trees	0.908	0.583	0.745
	RF	0.907	0.570	0.750
	Elastic Net	0.895	0.626	0.795
	XGBoost	0.882	0.659	0.843
	MLP	0.874	0.680	0.871

Note: N = 5144 total census tracts. R² = coefficient of determination; MAE = mean absolute error; RMSE = root mean squared error.
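
The three metrics reported in Table 7 follow their standard definitions, sketched below in plain Python. The function name and the four-point example are hypothetical, and the sketch does not show how the study pooled predictions across the held-out cities.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R², MAE, RMSE) for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r_squared = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r_squared, mae, rmse


# Illustrative values only (not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
r_squared, mae, rmse = regression_metrics(observed, predicted)
```

MAE and RMSE carry the units of the target, so the SUHII errors in Table 7 are in standard-deviation units while the LST errors are in °C; only R² is directly comparable between the two targets.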

Table 8. Positioning the present study relative to recent literature.

Study	Scale	Outcome	Method	Main Relevance to This Study
Li et al. [18]	1112 global cities	SUHII	Comparative methodological analysis	Concluded that SUHII estimates are highly sensitive to background-reference definition, supporting the use of a standardized SUHII framework for cross-city comparison
Feng et al. [13]	Single city (Beijing)	LST	Random forest + SHAP/explainable AI	Indicated that urban thermal drivers are non-linear and can be interpreted with SHAP-based analysis
Tanoori et al. [66]	Single city (Shiraz)	LST	Machine-learning model comparison	Showed that ML models, especially DNN and XGB, can predict urban LST accurately
Kong et al. [19]	216 global cities	UHI intensity	SVR-based machine-learning model	Concluded that harmonized multi-city machine-learning models can predict UHI intensity and identify broad cross-city drivers
Mansouri and Erfani [8]	Midwestern U.S. multi-state regional scale	UHI severity	Ensemble machine learning (Random Forest, XGBoost)	Indicated that ensemble models can be applied effectively across a broad regional dataset, supporting the value of regional generalization beyond single-city studies
Zhao et al. [2]	Systematic review	SUHI/LST	Remote-sensing monitoring and assessment review	Showed that satellite-derived LST is valuable for mapping surface thermal patterns, but it is distinct from near-surface air temperature and should not be interpreted directly as human heat exposure
Present study	8 U.S. cities, tract scale	LST and SUHII	Multiple machine-learning models + nested city-held-out CV + SHAP	Separated absolute surface heat from relative within-city thermal anomaly within a standardized tract-scale cross-city framework

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Aljarrah, O.A.B.; Goulias, D. Disentangling Climatic and Surface-Physical Drivers of the Urban Heat Island Using Explainable AI Across U.S. Cities. Sustainability 2026, 18, 3694. https://doi.org/10.3390/su18083694

