#### 2.1.2. CLDAS Data Processing Methods

The 2 m temperature, 2 m specific humidity, 10 m wind speed, and surface pressure products take ECMWF numerical analysis/forecast products as the background field. Topographic adjustment and multi-grid variational technology (STMAS) are used to integrate the observation data of automatic ground stations in China. The background field outside China is formed by topographic adjustment, variable diagnosis, and interpolation to the analysis grid.

For short-wave radiation, the DISORT radiative transfer model took ozone, atmospheric precipitation, and surface pressure from GFS numerical analysis products as dynamic input parameters. In addition, an inversion of FY2E/G satellite VIS-channel full-disk nominal image data was used to form the short-wave radiation product.

The above information comes from the China Meteorological Data Sharing Network (https://data.cma.cn/, accessed on 1 May 2021). The data used in this study included temperature, global solar radiation, relative humidity, and wind speed. Wind speed was given at a height of 10 m and the other meteorological variables at a height of 2 m. The data spanned from 2017 to 2020.

#### *2.2. Reference Evapotranspiration*

According to the FAO56 PM equation [32], reference evapotranspiration (*ET*<sub>0</sub>; mm d<sup>−1</sup>) can be calculated as:

$$ET\_0 = \frac{0.408 \Delta (R\_n - G) + \gamma \frac{900}{T\_a + 273} U (e\_s - e\_a)}{\Delta + \gamma (1 + 0.34 U)} \tag{1}$$

where *R*<sub>n</sub> is the net radiation at the crop surface, usually calculated from the global solar radiation (*R*<sub>s</sub>); *G* is the soil heat flux density; *T*<sub>a</sub> is the mean daily air temperature at 2 m height; *U* is the wind speed at 2 m height; *e*<sub>s</sub> and *e*<sub>a</sub> are the saturation and actual vapor pressure, respectively; Δ is the slope of the vapor pressure curve; and *γ* is the psychrometric constant. At the daily time step used in this study, *G* can be neglected [33,34].
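As an illustration, the FAO56 PM calculation can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it follows the standard FAO56 form, in which Δ multiplies (*R*<sub>n</sub> − *G*) in the numerator, and the function names, argument order, and sample inputs are hypothetical. The two helper functions use the standard FAO56 relations for saturation vapor pressure and its slope, which may differ from the exact procedure used in this study.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure (kPa) at air temperature t_c (deg C), FAO56 relation."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def slope_vapor_pressure_curve(t_c):
    """Slope of the saturation vapor pressure curve (kPa per deg C) at t_c (deg C)."""
    return 4098.0 * saturation_vapor_pressure(t_c) / (t_c + 237.3) ** 2


def et0_fao56_pm(rn, t_mean, u2, es, ea, delta, gamma, g=0.0):
    """Daily reference evapotranspiration (mm per day), Equation (1).

    rn, g  : net radiation and soil heat flux (MJ m-2 d-1); g defaults to 0
    t_mean : mean daily air temperature at 2 m (deg C)
    u2     : wind speed at 2 m (m s-1)
    es, ea : saturation and actual vapor pressure (kPa)
    delta  : slope of the vapor pressure curve (kPa per deg C)
    gamma  : psychrometric constant (kPa per deg C)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (hypothetical mid-latitude summer day).
et0 = et0_fao56_pm(rn=13.28, t_mean=16.9, u2=2.078, es=1.997, ea=1.409,
                   delta=0.122, gamma=0.0666)
print(round(et0, 2))  # 3.88
```

Setting `g=0.0` by default mirrors the daily time-step simplification stated above.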

#### *2.3. Data Sources*

To examine the performance of this dataset, meteorological data from 689 ground meteorological observation stations of the China Meteorological Administration (CMA) were collected, which included maximum and minimum temperatures at 2 m, global surface radiation or sunshine duration, relative humidity at 2 m, and wind speed at 10 m. Where necessary, sunshine duration was converted into global radiation using a formula from a previous study [35]. The stations were divided into seven climate zones [36,37]. Their distribution is shown in Figure 1, and the zone names are listed in Table 1.

**Figure 1.** Climate zones of China and geographical distribution of 689 meteorological stations. (See Table 1 for the names of climatic zones 1–7).

**Table 1.** Names of the seven climate zones.


To obtain the daily reanalysis variables for Equation (1) (identified by the subscript CLD), the following steps were taken: (a) daily *T*<sub>maxCLD</sub> and *T*<sub>minCLD</sub> were selected as the maximum and minimum of the 24 available 1-h values per day of the *T*<sub>max</sub> and *T*<sub>min</sub> sequences, respectively; (b) daily RH<sub>CLD</sub> was obtained as the 24-h average of the 24 RH values per day; (c) daily *R*<sub>sCLD</sub> was obtained as the 24-h cumulative value of the 12-h *R*<sub>s</sub>; (d) wind speed at 10 m (*U*<sub>10CLD</sub>) was calculated as the 24-h average of the 24 1-h values and then converted to a height of 2 m (*U*<sub>CLD</sub>) using Equation (2):

$$U = U\_z \frac{4.87}{\ln(67.8z - 5.42)}\tag{2}$$

where *U*<sub>z</sub> is the wind speed measured at height *z*, and *z* is the height of the wind speed observation instrument (in this paper, *z* is equal to 10 m). For each meteorological station, data from the four surrounding grid points were selected and interpolated to the station by the inverse distance weight (IDW) method. The formula is as follows:

$$V = \frac{\sum\_{i=1}^{n} \frac{v\_i}{D\_i^2}}{\sum\_{i=1}^{n} \frac{1}{D\_i^2}}\tag{3}$$

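where *V* is the interpolated value at the station, *v*<sub>i</sub> is the value at the *i*-th control (grid) point, *D*<sub>i</sub> is the distance between the *i*-th control point and the station (so that the weight of each point is 1/*D*<sub>i</sub><sup>2</sup>), and *n* is the number of control points (four in this study).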
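The daily aggregation steps (a)–(d), the wind speed height conversion of Equation (2), and the IDW interpolation of Equation (3) can be sketched in Python as follows. This is an illustrative sketch only, not the processing code used in this study. The function names and argument layout are hypothetical. The coefficients 4.87, 67.8, and 5.42 are those of the standard FAO56 logarithmic wind profile conversion. *D*<sub>i</sub> is treated as the distance from each grid point to the station, the zero-distance shortcut is an added assumption, and radiation values are simply summed without any unit conversion.

```python
import math


def wind_to_2m(u_z, z=10.0):
    """Convert wind speed u_z measured at height z (m) to 2 m height, Equation (2)."""
    return u_z * 4.87 / math.log(67.8 * z - 5.42)


def daily_values(tmax_1h, tmin_1h, rh_1h, rs_steps, u10_1h):
    """Aggregate sub-daily series for one day, following steps (a) to (d)."""
    return {
        "tmax": max(tmax_1h),                    # (a) daily maximum temperature
        "tmin": min(tmin_1h),                    # (a) daily minimum temperature
        "rh": sum(rh_1h) / len(rh_1h),           # (b) daily mean relative humidity
        "rs": sum(rs_steps),                     # (c) daily cumulative radiation
        "u2": wind_to_2m(sum(u10_1h) / len(u10_1h), z=10.0),  # (d) mean 10 m wind -> 2 m
    }


def idw(values, distances, power=2):
    """Inverse distance weighted estimate, Equation (3), with weights 1 / D**power."""
    numerator = 0.0
    denominator = 0.0
    for v, d in zip(values, distances):
        if d == 0:
            # Assumption: a grid point located exactly at the station is used directly.
            return v
        weight = 1.0 / d ** power
        numerator += weight * v
        denominator += weight
    return numerator / denominator


# Hypothetical example: four surrounding grid points and their distances to a station.
print(idw([10.0, 20.0, 20.0, 10.0], [1.0, 2.0, 2.0, 1.0]))  # 12.0
```

At *z* = 10 m the conversion factor is about 0.748, so a 10 m wind speed is reduced by roughly a quarter when expressed at 2 m.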

#### *2.4. Statistics Indicators*

Four common statistical indicators, namely the coefficient of determination (*R*<sup>2</sup>), root mean square error (RMSE), mean absolute error (MAE), and percent bias (PB), were chosen to evaluate the accuracy of the CLDAS meteorological variables and *ET*<sub>0</sub> in this study. The corresponding formulas are:

$$\text{MAE} = \frac{1}{n} \sum\_{i=1}^{n} |M\_i - P\_i| \tag{4}$$

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (M\_i - P\_i)^2} \tag{5}$$

$$R^2 = \frac{\left[\sum\_{i=1}^n \left(M\_i - \bar{M}\right)\left(P\_i - \bar{P}\right)\right]^2}{\sum\_{i=1}^n \left(M\_i - \bar{M}\right)^2 \sum\_{i=1}^n \left(P\_i - \bar{P}\right)^2} \tag{6}$$

$$\text{PB} = \frac{\sum\_{i=1}^{n} (P\_i - M\_i)}{\sum\_{i=1}^{n} M\_i} \tag{7}$$

where *M*<sub>i</sub> is *ET*<sub>0</sub> calculated from meteorological station data, *P*<sub>i</sub> is *ET*<sub>0</sub> calculated from the CLDAS gridded data, *M̄* is the average *ET*<sub>0</sub> calculated from meteorological station data, *P̄* is the average *ET*<sub>0</sub> calculated from the CLDAS gridded data, and *n* is the number of *ET*<sub>0</sub> data pairs. The same definitions apply when the indicators are computed for the individual meteorological variables. Higher *R*<sup>2</sup> values (closer to 1) or lower RMSE and MAE values indicate a better estimation performance of the CLDAS dataset. The closer PB is to 0, the better the estimation performance of the CLDAS dataset.
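Equations (4)–(7) can be written compactly in Python. The sketch below is a minimal illustration with hypothetical function names and sample values, where `m` holds the station-based values and `p` the CLDAS-based values. PB is returned as a fraction, as in Equation (7), rather than as a percentage.

```python
import math


def mae(m, p):
    """Mean absolute error, Equation (4)."""
    return sum(abs(mi - pi) for mi, pi in zip(m, p)) / len(m)


def rmse(m, p):
    """Root mean square error, Equation (5)."""
    return math.sqrt(sum((mi - pi) ** 2 for mi, pi in zip(m, p)) / len(m))


def r_squared(m, p):
    """Coefficient of determination (squared Pearson correlation), Equation (6)."""
    m_bar = sum(m) / len(m)
    p_bar = sum(p) / len(p)
    cross = sum((mi - m_bar) * (pi - p_bar) for mi, pi in zip(m, p))
    ss_m = sum((mi - m_bar) ** 2 for mi in m)
    ss_p = sum((pi - p_bar) ** 2 for pi in p)
    return cross ** 2 / (ss_m * ss_p)


def percent_bias(m, p):
    """Percent bias as a fraction, Equation (7); positive means CLDAS overestimates."""
    return sum(pi - mi for mi, pi in zip(m, p)) / sum(m)


# Hypothetical sample: station values m and CLDAS values p.
m = [1.0, 2.0, 3.0, 4.0]
p = [1.5, 2.5, 2.5, 4.5]
print(mae(m, p), rmse(m, p), percent_bias(m, p))  # 0.5 0.5 0.1
```

Note that *R*<sup>2</sup> as defined in Equation (6) measures only linear association, so a dataset with a constant offset can still score close to 1. This is why it is reported together with the error and bias measures.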

### **3. Results**

#### *3.1. Meteorological Factors*

#### 3.1.1. Air Temperature

Table 2 shows the statistical indicators of maximum and minimum temperatures in the CLDAS data for the seven climate zones in China. The accuracy of the maximum and minimum temperatures differed among climate zones. For the maximum temperature, CLDAS data showed a high correlation with data from ground stations in the four northern climate zones (i.e., climate zones 1–4), with *R*<sup>2</sup> larger than 0.9. Climate zone 5 in the humid climate region also yielded a good correlation. In climate zones 6 and 7, the correlations between the two datasets were slightly worse than in the other climate zones. However, climate zone 6 showed the smallest statistical errors, with RMSE and MAE of 2.9 °C and 2.3 °C, respectively. This may be because the range of temperature changes in this region is not as large as that in other regions, and because the area of this climate zone is significantly smaller than that of the other climate zones, so the temperature change in this region is not as dramatic. The RMSE and MAE of the high-altitude climate zone (i.e., climate zone 7) were 6.55 °C and 5.83 °C, respectively. Figures 2–4 show the spatial error distribution of the maximum temperature in CLDAS. Overall, the errors at most stations were within a small range. However, in climate zone 7 and the north-central area of climate zone 1, errors were large, with the RMSE and MAE of many stations greater than 10 °C and *R*<sup>2</sup> lower than 0.5. Such large errors at these stations might result from regional variations in climate model parameters and were unlikely to be caused by an overall overestimation or underestimation problem of the models that may cause significant variation in *ET*<sub>0</sub> estimation.

**Table 2.** Statistical indicators of maximum and minimum temperatures in different climate zones of China.


**Figure 2.** RMSE values of the five meteorological factors at all stations.

**Figure 3.** MAE values of the five meteorological factors at all stations.

The behavior of the minimum temperature in the CLDAS dataset was similar to that of the maximum temperature. However, compared with the maximum temperature, correlations between the CLDAS minimum temperature and the station temperature were higher in all seven climate zones, with all *R*<sup>2</sup> values larger than 0.9. In climate zone 6, the minimum temperature error was the lowest among all climate zones, with RMSE and MAE of 1.87 °C and 1.5 °C, respectively. In contrast, climate zone 7 had the highest error, with RMSE and MAE reaching 5.1 °C and 4.53 °C, respectively. In addition, the minimum temperature error in climate zone 1 was also relatively high, with RMSE and MAE of 4.03 °C and 3.45 °C, respectively, which might affect the accuracy of *ET*<sub>0</sub> estimation. As can be seen from Figures 2–4, the performance of *T*<sub>min</sub> was similar to that of *T*<sub>max</sub>: the accuracy at most stations was within an acceptable range. However, some stations showed large errors, and these were mainly located in the middle of climate zone 1 and in climate zone 7. The stations with high minimum temperature errors largely coincided with the stations with high *T*<sub>max</sub> errors, indicating severe problems in the temperature simulation at these stations.

**Figure 4.** *R*<sup>2</sup> values of the five meteorological factors at all stations.
