**2. Materials and Methods**

#### *2.1. Study Area*

The Yellow River, the second-largest river in China, rises in the Bayankara Mountains in the west, crosses the Qinghai–Tibetan Plateau, Loess Plateau, and Huang-Huai-Hai Plain, and flows into the Bohai Sea in Shandong Province. The river is 5464 km long, and its basin covers approximately 79.5 × 10<sup>4</sup> km<sup>2</sup> (95°53′ E–119°05′ E, 32°10′ N–41°50′ N), accounting for 8% of China's land area (Figure 1). The Yellow River Basin (YRB) lies in the mid-latitude zone, with complex natural conditions and undulating topography, and is influenced by atmospheric and monsoonal circulation, which makes its climate different from that of the other basins in China [33,34]. The average annual precipitation in the basin is 495 mm; precipitation is seasonally concentrated, highly variable between years, and decreases markedly from the southeast to the northwest [35]. The average annual temperature ranges from −4 to 14 °C, varying with latitude and altitude [36]. Evapotranspiration varies markedly across the basin, with an average annual *ET*<sub>0</sub> of 700–1800 mm that increases from the southeast to the northwest. The basin straddles arid, semi-arid, and semi-humid zones and lies in the transition between semi-arid and semi-humid climates, which renders it extremely sensitive to climate change [21]. Climate change has exacerbated the uneven spatial and temporal distribution of water resources in the YRB, and the imbalance between water supply and demand has become evident, seriously affecting production and daily life and restricting the high-quality economic development of the region.

**Figure 1.** (**a**) Location and digital elevation model of the Yellow River Basin in China and (**b**) the distribution of 93 national meteorological stations in the Yellow River Basin.

#### *2.2. Data Collection*

#### 2.2.1. Ground-Based Observation Data

In this study, monthly monitoring data from 93 national meteorological stations in the YRB from 1980 to 2014 were obtained from the National Meteorological Information Centre-China Meteorological Data Network (http://data.cma.cn/ (accessed on 11 March 2022)), including monthly mean temperature (tas), monthly mean maximum temperature (tasmax), monthly mean minimum temperature (tasmin), and monthly pan evaporation. Missing values were filled using the hydrologic analogy method and linear interpolation. The tas, tasmax, and tasmin data were used to assess the accuracy of the climate model simulations, and values converted from the pan evaporation data [4] were used to assess the *ET*<sub>0</sub> estimates obtained from the multi-model ensemble and the Hargreaves formula.
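As an illustration of the linear interpolation step only (the hydrologic analogy method is not covered), the following minimal sketch fills interior gaps in a monthly station series. The function name `fill_gaps_linear` and the use of `None` to mark missing months are assumptions made for this example; the source does not describe the actual implementation.

```python
from typing import List, Optional


def fill_gaps_linear(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between known neighbours.

    Hypothetical helper: leading and trailing gaps have no neighbour on one
    side and are left as None rather than extrapolated.
    """
    filled = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        # Constant monthly increment between two observed values.
        step = (series[right] - series[left]) / (right - left)
        for k in range(left + 1, right):
            filled[k] = series[left] + step * (k - left)
    return filled


# Example monthly temperatures with two missing interior months.
monthly_tas = [1.0, None, None, 4.0, None]
print(fill_gaps_linear(monthly_tas))
```

Leaving edge gaps unfilled is a deliberate choice in this sketch, since a straight line cannot be anchored with only one known neighbour.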

#### 2.2.2. Reference Data on Downscaling

The national 30-year cumulative mean, mean maximum, and mean minimum temperature datasets with a resolution of 1 km from 1971 to 2000 were selected as the regional high-resolution reference data to construct the delta statistical downscaling model in this study. Data were obtained from the National Ecosystem Science Data Center (NESDC) (http://www.nesdc.org.cn/ (accessed on 9 May 2022)).

#### 2.2.3. Future Climate Data

In this study, 24 GCMs were selected from CMIP6 (https://esgf-node.llnl.gov/search/cmip6/ (accessed on 13 May 2022)) for the historical period (1901–2014) and three future periods (near-term 2022–2040, mid-term 2041–2060, and long-term 2081–2100). The tas, tasmax, and tasmin variables were available from 21, 19, and 21 of these GCMs, respectively; the basic details of each model and variable are summarized in Table 1. For future forcing, four recent shared socioeconomic pathway (SSP) scenarios were selected: SSP1-2.6 (low-forcing scenario, SSP126), SSP2-4.5 (medium-forcing scenario, SSP245), SSP3-7.0 (medium to high-forcing scenario, SSP370), and SSP5-8.5 (high-forcing scenario, SSP585) [37]. Notably, the future scenarios of the climate models cover the 2015–2100 period. To ensure the reasonability of the data, the historical period in this study was not extended forward to 2021, and the future period was not extended back to 2015. The periods used to evaluate the downscaling accuracy and to validate the Hargreaves model were selected on the basis of these considerations.


**Table 1.** Introduction to climate models with temperature variables.

Note: In the variable column, tas is the average temperature, tasmax is the average maximum temperature, and tasmin is the average minimum temperature.

#### *2.3. Research Methodology*

#### 2.3.1. Delta Statistical Downscaling

The delta statistical downscaling method is a simple bias correction technique recommended by the U.S. Global Change Research Program (see http://www.nacc.usgcrp.gov (accessed on 6 June 2022)). It is easy to understand and operate, requires few input factors, and is widely used in climate change impact studies [21,38,39]. For the temperature variables used in this study, the delta method proceeds in three steps: (1) the simulated temperature of each model grid cell in a given period is compared with the simulated mean temperature of the base period to obtain the absolute change in temperature; (2) this change is spatially interpolated to the reconstruction grid; and (3) the interpolated change is added to the measured mean temperature of the base period to obtain the temperature scenario for that period on the reconstruction grid [21]. The calculation equation is as follows:

$$T_f = T_0 + \left(T_{Mf} - T_{M0}\right) \tag{1}$$

where *T*<sub>f</sub> is the grid temperature data reconstructed by the delta method, *T*<sub>Mf</sub> is the simulated grid temperature data for a certain period, *T*<sub>M0</sub> is the simulated grid multi-year average temperature data for the base period, and *T*<sub>0</sub> is the measured multi-year average temperature data for the base period. In this study, five interpolation methods were considered: bilinear interpolation (BI), inverse distance weighted (IDW), kriging, natural neighbor interpolation (NNI), and spline. The delta statistical downscaling process is shown in Figure 2.
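To make Equation (1) concrete, the following minimal sketch computes the coarse-grid change, interpolates it bilinearly to a finer grid, and adds it to the observed base-period field. It is an illustration under stated assumptions, not the processing chain used in the study: the function names `bilinear_resize` and `delta_downscale` are hypothetical, grids are plain nested lists, and the coarse and fine grids are assumed to share the same extent with coinciding corner cells.

```python
from typing import List

Grid = List[List[float]]


def bilinear_resize(coarse: Grid, n_rows: int, n_cols: int) -> Grid:
    """Bilinearly interpolate a coarse grid onto an n_rows x n_cols grid.

    Assumption: both grids span the same extent and their corners coincide.
    """
    c_rows, c_cols = len(coarse), len(coarse[0])
    fine: Grid = []
    for i in range(n_rows):
        # Fractional row position of the fine cell in coarse-grid coordinates.
        y = i * (c_rows - 1) / (n_rows - 1) if n_rows > 1 else 0.0
        y0 = min(int(y), c_rows - 2) if c_rows > 1 else 0
        y1 = min(y0 + 1, c_rows - 1)
        wy = y - y0
        row = []
        for j in range(n_cols):
            x = j * (c_cols - 1) / (n_cols - 1) if n_cols > 1 else 0.0
            x0 = min(int(x), c_cols - 2) if c_cols > 1 else 0
            x1 = min(x0 + 1, c_cols - 1)
            wx = x - x0
            top = coarse[y0][x0] * (1 - wx) + coarse[y0][x1] * wx
            bottom = coarse[y1][x0] * (1 - wx) + coarse[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        fine.append(row)
    return fine


def delta_downscale(t_0: Grid, t_mf: Grid, t_m0: Grid) -> Grid:
    """Apply Eq. (1): T_f = T_0 + (T_Mf - T_M0).

    t_0 is the observed base-period mean on the fine grid; t_mf and t_m0 are
    the simulated period and base-period means on the coarse model grid.
    """
    # Step 1: absolute change on the coarse model grid.
    delta = [[f - b for f, b in zip(row_f, row_b)]
             for row_f, row_b in zip(t_mf, t_m0)]
    # Step 2: interpolate the change to the fine reconstruction grid.
    delta_fine = bilinear_resize(delta, len(t_0), len(t_0[0]))
    # Step 3: add the interpolated change to the observed base-period field.
    return [[o + d for o, d in zip(row_o, row_d)]
            for row_o, row_d in zip(t_0, delta_fine)]


# Illustrative values only (degrees Celsius); not data from the study.
t_m0 = [[10.0, 12.0], [14.0, 16.0]]
t_mf = [[11.0, 14.0], [15.0, 19.0]]
t_0 = [[5.0, 6.0, 7.0], [5.5, 6.5, 7.5], [6.0, 7.0, 8.0]]
for row in delta_downscale(t_0, t_mf, t_m0):
    print(row)
```

Only bilinear interpolation is sketched here; the other four methods considered in the study (IDW, kriging, NNI, and spline) would replace the `bilinear_resize` step.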

**Figure 2.** The delta downscaling process of the climate models over the Yellow River Basin.

#### 2.3.2. Climate Model Accuracy Assessment and Multi-Model Ensemble

To assess the accuracy and applicability of the climate model simulations in the YRB, the following evaluation metrics were used: mean absolute error (MAE) [40], Taylor diagram-based quantile S [41], spatial skill score (SS) [42], and temporal skill score (TS) [43]. The closer the MAE and TS are to 0, the better the simulation ability of the model. The closer S and SS are to 1, the better the simulation ability of the model.
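Of these metrics, the MAE has a standard definition, the mean of the absolute differences between simulated and observed values, and can be sketched as below. The function name `mean_absolute_error` and the sample values are assumptions for illustration. S, SS, and TS are not sketched because their formulas are given in the cited references rather than in this section.

```python
from typing import Sequence


def mean_absolute_error(simulated: Sequence[float],
                        observed: Sequence[float]) -> float:
    """Return the mean of |simulated - observed| over paired values."""
    if len(simulated) != len(observed) or len(simulated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(s - o) for s, o in zip(simulated, observed))
    return total / len(simulated)


# Illustrative monthly temperatures (degrees Celsius); not data from the study.
model_tas = [1.0, 2.0, 4.0]
station_tas = [1.5, 2.0, 3.0]
print(mean_absolute_error(model_tas, station_tas))
```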

The downscaling results differ among GCMs, and a multi-model average is generally considered to perform better than individual models [44,45]. In this study, the preferred climate models were combined by equally weighted multi-model ensemble (MME) averaging, a method commonly used in multi-model prediction studies.
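Equally weighted ensemble averaging amounts to taking the arithmetic mean across models at each time step (or grid cell). The sketch below illustrates this for one-dimensional series. The function name `equal_weight_ensemble` and the input layout (one sequence per model, all of equal length) are assumptions for this example.

```python
from typing import List, Sequence


def equal_weight_ensemble(model_series: Sequence[Sequence[float]]) -> List[float]:
    """Average several model series element-wise, weighting each model equally."""
    n_models = len(model_series)
    if n_models == 0:
        raise ValueError("at least one model series is required")
    length = len(model_series[0])
    if any(len(series) != length for series in model_series):
        raise ValueError("all model series must have the same length")
    # zip(*...) groups the values of all models at the same position.
    return [sum(values) / n_models for values in zip(*model_series)]


# Illustrative downscaled temperatures from two hypothetical models.
ensemble = equal_weight_ensemble([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(ensemble)
```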
