*3.1. Model Validation for the Agglomeration of The Hague*

The diagnostic equation for the UHI applied here (Equation (2)) was designed and tested using observational data from cities of variable sizes in northwestern Europe [20]. Nevertheless, it was crucial to verify the results presented here against citizen weather stations in the agglomeration of The Hague, for three reasons. Firstly, the minimum temperatures evaluated here are a different metric from the maximum UHI used for verification by T17 [20]. Secondly, the retrieval and integration of morphological and meteorological data differed slightly from the procedure of T17 [20]. Finally, spatial differences may occur because the area is bordered by the coast, where seasonality in sea-surface temperatures and the presence of a sea breeze may play a role.

A quality assessment of crowdsourced weather data is indispensable, because weather stations may malfunction and are not always properly installed [40,41]. For the verification, we used data from citizen weather stations in the agglomeration of The Hague, including data obtained from the Weather Underground platform. We selected only Davis Vantage Pro and Oregon Scientific stations, since they show small biases at night [40]. This resulted in nine stations in total. The time series from these stations comprised two years of data (2015–2016), restricted to the summer period (April–October). Minimum temperatures were discarded if more than two hourly values were missing (as in the analysis by Hopkinson et al. [42]). After applying this constraint, the minimum availability was 48% and the average availability was 78% for a single citizen station.
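The missing-data constraint and the availability figure can be illustrated with a minimal sketch. It assumes that each day is passed as a list of hourly temperatures with missing readings marked as `None` or NaN. The function names and the data layout are hypothetical and do not reproduce the processing actually used for the stations.

```python
import math
from typing import Optional, Sequence

MAX_MISSING_HOURS = 2  # threshold stated in the text


def daily_minimum(hourly_temps: Sequence[Optional[float]]) -> Optional[float]:
    """Return the minimum of one day's hourly temperatures.

    Returns None when more than MAX_MISSING_HOURS values are missing
    (None or NaN), so that the day is discarded.
    """
    valid = [t for t in hourly_temps if t is not None and not math.isnan(t)]
    missing = len(hourly_temps) - len(valid)
    if missing > MAX_MISSING_HOURS or not valid:
        return None
    return min(valid)


def availability(daily_minima: Sequence[Optional[float]]) -> float:
    """Fraction of days for which a usable minimum temperature remains."""
    if not daily_minima:
        return 0.0
    usable = sum(1 for t in daily_minima if t is not None)
    return usable / len(daily_minima)
```

Applied per station, `availability` would yield the kind of fraction reported above, provided that the list of days covers the full April–October period of both years.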

The citizen weather stations and the diagnostic equation were compared in a quantile–quantile plot in Figure 5A, and the bias and standard deviation of the stations grouped by percentile are shown in Figure 5B. Only the lower percentiles, which correspond to low minimum temperatures, had a substantial cold bias. This underestimation is probably caused by the absence of anthropogenic heat in the diagnostic equation (Equation (2)). During cold weather in the spring, the anthropogenic heat source is larger than in the summer due to heating of buildings [43]. Furthermore, in the lowest and highest percentiles, there seemed to be more variance in the minimum temperatures of the citizen weather stations than in the modeled minimum temperatures of the diagnostic equation. The other percentiles showed good agreement between the model and the observations, with only slight cold biases, and supported the reliability of the equation.
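As an illustration of this type of comparison, the sketch below computes quantile pairs for a quantile–quantile plot and summarizes the bias per quantile across stations. It rests on several assumptions that are not stated in the text: deciles are used as the percentile grouping, the bias is defined as modeled minus observed (so that a cold bias is negative), and the standard deviation is taken across stations. All function names are hypothetical.

```python
from statistics import mean, pstdev, quantiles
from typing import Dict, List, Sequence, Tuple


def qq_points(
    modeled: Sequence[float], observed: Sequence[float], n: int = 10
) -> List[Tuple[float, float]]:
    """Return (observed, modeled) quantile pairs at the n - 1 cut points."""
    q_mod = quantiles(modeled, n=n, method="inclusive")
    q_obs = quantiles(observed, n=n, method="inclusive")
    return list(zip(q_obs, q_mod))


def bias_by_quantile(
    stations: Dict[str, Tuple[Sequence[float], Sequence[float]]], n: int = 10
) -> List[Tuple[float, float]]:
    """Summarize the bias per quantile over all stations.

    `stations` maps a station name to (modeled, observed) minimum
    temperatures. Returns one (mean bias, standard deviation) pair per
    cut point, with bias assumed to be modeled minus observed.
    """
    per_station = []
    for modeled, observed in stations.values():
        pairs = qq_points(modeled, observed, n)
        per_station.append([q_mod - q_obs for q_obs, q_mod in pairs])

    summary = []
    for k in range(n - 1):
        biases = [station_bias[k] for station_bias in per_station]
        summary.append((mean(biases), pstdev(biases)))
    return summary
```

Under these assumptions, a mean bias that is clearly negative only at the lowest cut points would correspond to the cold bias described for the low minimum temperatures.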

**Figure 5.** (**A**) Quantile–quantile plot for modeled and observed (Davis and Oregon citizen weather stations) night-time minimum temperatures for the years 2015–2016. (**B**) The bias between the data from these citizen weather stations and the diagnostic equation is presented as a function of temperature. Error bars indicate the standard deviation.
