**4. Discussion**

Based on the validation results, we found that some degree of error is inevitable in the three reanalysis datasets when compared with in-situ observations. Surface air temperature in the QTP is controlled not only by regional factors (longitude, latitude) but also by geographical conditions (altitude, aspect, slope) and the underlying surface (such as vegetation and snow cover), and is further complicated by temperature changes [55–63].

Several previous studies have also found differences in the biases between station observations and air temperature reanalysis datasets. For example, Huang et al. [41] found that elevation is not the only factor that causes biases in reanalysis datasets. By assessing the slopes of the in-situ observation sites, they found that the errors and applicability of CLDAS, ERA5L, and GLDAS increase with increasing slope of the observation site. Ding et al. [64] found that temperature changes are significantly correlated with the elevation and slope of the observation site, and that the complexity of the terrain is the main factor leading to large errors in reanalysis data. These results also indicate that topographic correction should be applied to temperature reanalysis data in order to reduce errors and improve the accuracy and applicability of the data. Liu and Long et al. [42,65] evaluated CLDAS, and Huang et al. [41] evaluated CLDAS, ERA5L, and GLDAS in China. They found that the errors of the three reanalysis datasets all increase with increasing monthly average temperature, and that the correlation between the reanalysis data and the in-situ observations gradually decreases, reaching its minimum in July or August. The correlation and bias then gradually increase with decreasing temperature. However, the present study indicates that the monthly variation of errors in the QTP is not significant, although the seasonal variation is essentially the same as that reported by Huang et al. [41]. The discrepancy may be attributed to differences in the time series of the data used and in the division of regions.

Errors arising from evaluation at individual sites can be influenced by several factors: (1) The spatial scales do not match. The in-situ observations at a specific site only reflect temperature changes in a limited area around it, and the influence of topography limits their representativeness further. In contrast, the reanalysis value at a specific grid cell represents the average over that cell. Thus, it is difficult to resolve the spatial mismatch [56,59] between in-situ observations and gridded reanalysis data. (2) The terrain height of the reanalysis grid differs from the elevation of the station [41,42,66]. If the observation site is located in a valley and its altitude is lower than that of the surrounding grid points of the reanalysis dataset, the evaluation result at this site will generally show a cold bias; if the site is located on a mountain top that is higher than the surrounding grid points, the evaluation result will show a warm bias. (3) Systematic errors [67,68] caused by the numerical model or the assimilation method. For example, cold errors occur at 70.6% and 100% of the stations for GLDAS and ERA5L, respectively, which may be caused by systematic errors. In addition, errors in the input data and errors introduced during the interpolation of reanalysis data (e.g., from Gaussian grids to coordinate grids) are also potential sources of error that need further verification. Therefore, the error characteristics and applicability of temperature reanalysis data should be fully considered before the data are used. Next, we resample the three reanalysis datasets to the same resolution and apply different interpolation methods in the evaluation to discuss their influence on the accuracy of the gridded datasets.
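Factor (2) can be illustrated with a minimal sketch of an elevation adjustment. It assumes a constant lapse rate of 6.5 K per km, which is a common textbook value and not one taken from the present study; the function name, argument names, and the example numbers are hypothetical and serve only to show the direction of the bias.

```python
# Assumed constant lapse rate (6.5 K per km); not a value from the paper.
LAPSE_RATE_K_PER_M = 0.0065


def elevation_corrected_bias(t_grid, z_grid, t_station, z_station,
                             lapse_rate=LAPSE_RATE_K_PER_M):
    """Return (raw_bias, corrected_bias) of a grid value against a station.

    Temperatures share one unit (degC or K); elevations are in metres.
    """
    raw_bias = t_grid - t_station
    # Move the grid temperature from the grid elevation to the station
    # elevation: air warms when brought down, cools when brought up.
    t_adjusted = t_grid + lapse_rate * (z_grid - z_station)
    return raw_bias, t_adjusted - t_station


# Hypothetical valley station 500 m below the model terrain height.
raw, corrected = elevation_corrected_bias(
    t_grid=1.75, z_grid=3500.0, t_station=5.0, z_station=3000.0
)
print(f"raw bias: {raw:.2f}, corrected bias: {corrected:.2f}")
```

In this made-up case the raw bias is cold only because the grid cell sits higher than the valley station; after the adjustment the bias vanishes. Real lapse rates vary with season and terrain, so a constant value is a simplification.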

#### *4.1. Impact of Grid Resolutions on the Accuracy of the Reanalysis Datasets*

To explore the temperature variation characteristics and accuracy of the three reanalysis datasets at the same resolution, CLDAS and ERA5L are resampled to the GLDAS grid with a spatial resolution of 0.25° using the mean value algorithm [55], so that the three reanalysis datasets have uniform temporal and spatial resolutions. Figure 15 shows the spatial distributions of annual mean temperature over 2017–2018. Although remapping CLDAS and ERA5L reduces their resolution, they still show more detail than GLDAS. Figure 16 shows the bias of annual mean temperature between in-situ observations and reanalysis datasets at 17 weather stations. At each observation station, CLDAS has a smaller deviation from the in-situ observations than ERA5L and GLDAS.
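The idea of resampling by averaging can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [55]: it assumes a regular grid whose dimensions are an integer multiple of the coarsening factor (as for 0.05° to 0.25°, a factor of 5), and the function name `block_mean` is hypothetical.

```python
import numpy as np


def block_mean(field, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid shape must be divisible by factor")
    # Split each axis into (coarse index, index within block).
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    # nanmean skips missing cells inside a block.
    return np.nanmean(blocks, axis=(1, 3))


# Synthetic 10 x 10 field standing in for a 0.05-degree tile.
fine = np.arange(100, dtype=float).reshape(10, 10)
coarse = block_mean(fine, 5)  # 2 x 2 field on a grid five times coarser
print(coarse)
```

A 0.1° grid does not nest evenly into 0.25° cells (the ratio is 2.5), so a dataset at that resolution would need an area-weighted average over partially overlapping cells, which this sketch does not cover.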

**Figure 15.** Spatial distributions of annual mean temperature over 2017–2018. (**a**) CLDAS (0.05°); (**b**) ERA5L (0.1°); (**c**) GLDAS (0.25°); (**d**) CLDAS (resampled to 0.25°); (**e**) ERA5L (resampled to 0.25°); (**f**) OBS (in-situ observations).

#### *4.2. Impact of Interpolation Methods on the Accuracy of the Reanalysis Datasets*

To analyze the impact of different interpolation methods on the evaluation results, the two most common interpolation methods, the nearest neighbor method and the bilinear interpolation method, are used in the present study [40,42]. The results are shown in Table 4. The choice of interpolation method has some impact on the evaluation results, but the impact is very small. With bilinear interpolation, the deviation of CLDAS from the in-situ observations is likewise lower than that of the other two reanalysis datasets, and GLDAS performs better than ERA5L.
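The difference between the two methods can be sketched as follows. This is a minimal illustration, assuming a regular grid with ascending 1-D latitude and longitude axes and a station inside the grid; the function names and the toy values are hypothetical and are not the code used in the study.

```python
import numpy as np


def nearest_neighbor(lats, lons, field, lat, lon):
    """Value of the grid point closest to the station on each axis."""
    i = int(np.argmin(np.abs(lats - lat)))
    j = int(np.argmin(np.abs(lons - lon)))
    return float(field[i, j])


def bilinear(lats, lons, field, lat, lon):
    """Distance-weighted mean of the four grid points around the station."""
    # Index of the lower-left corner of the enclosing cell.
    i = int(np.clip(np.searchsorted(lats, lat) - 1, 0, len(lats) - 2))
    j = int(np.clip(np.searchsorted(lons, lon) - 1, 0, len(lons) - 2))
    # Fractional position of the station inside the cell (0 to 1).
    wy = (lat - lats[i]) / (lats[i + 1] - lats[i])
    wx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    row_lo = (1 - wx) * field[i, j] + wx * field[i, j + 1]
    row_hi = (1 - wx) * field[i + 1, j] + wx * field[i + 1, j + 1]
    return float((1 - wy) * row_lo + wy * row_hi)


# Toy 0.25-degree cell with made-up temperatures at its four corners.
lats = np.array([30.0, 30.25])
lons = np.array([90.0, 90.25])
field = np.array([[0.0, 1.0],
                  [2.0, 3.0]])

t_nea = nearest_neighbor(lats, lons, field, 30.05, 90.20)
t_bil = bilinear(lats, lons, field, 30.05, 90.20)
print(t_nea, t_bil)
```

Nearest neighbor returns one corner value unchanged, while bilinear interpolation blends all four corners, so the two estimates differ most where the field changes quickly within a grid cell.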

**Figure 16.** The bias of annual mean temperature between in-situ observations (OBS) and three reanalysis datasets over 2017–2018.

**Table 4.** Accuracy evaluation results using two interpolation methods for the evaluation period.


Note: Nearest neighbor interpolation method (Nea), bilinear interpolation method (Bil).

Although all three reanalysis datasets can accurately reflect the distribution characteristics of air temperature in the alpine region of the QTP, CLDAS performs better overall. It is also better at individual stations and on daily, monthly, and seasonal time scales. One important reason is that, in the QTP, CLDAS integrates the observations collected at thousands of surface automatic weather stations [22,35], which is of great benefit to the quality of CLDAS. ERA5L and GLDAS are global reanalysis products, both of which show large deviations from observations, probably due to the lack of observations for assimilation over the QTP. Furthermore, compared with the other two reanalysis datasets, CLDAS has a higher spatial resolution, which can improve its ability to describe temperature [40], especially in areas of complex terrain. Using data assimilation and fusion techniques with in-situ observation data, satellite remote sensing data, and numerical model data, the reanalysis system produces regular gridded data with a certain temporal and spatial resolution. This process introduces some uncertainties [22] into the final data products, which is why evaluating the deviation of reanalysis data from in-situ observations is important.

Based on the aforementioned discussion, we believe that, although the temperature reanalysis datasets deviate to some extent from observations in the QTP, they still have a degree of applicability and credibility in the alpine region of the QTP, where observation sites are unevenly distributed and the density of observations is low. Thus, these reanalysis datasets have reference value. It should be noted that, even though the quality of CLDAS is better than that of the other two datasets according to the evaluation in the present study, ERA5L and GLDAS have longer time series, larger spatial coverage, and better continuity than CLDAS. Because the terrain of the QTP is extremely complex and the in-situ observations at the 17 sites are short and cannot cover the whole QTP, the three datasets have their respective advantages and disadvantages in different areas, and further studies are necessary at the local scale.
