*2.2. SWAT Model and Inputs*

The SWAT model is a typical semi-distributed, physically-based hydrological model. It splits the watershed into subbasins connected by a stream network and further delineates each subbasin into hydrologic response units (HRUs), which comprise unique combinations of land use, soil type, and slope. The HRU aggregates water and nutrient fluxes to the subbasin level, and they are then routed through the stream network to the watershed outlet. Therefore, the major model inputs consist of topography, land-cover type, soil property, climate data, and land management practices. The resolution and sources of all data inputs in this study are listed in Table 1.


**Table 1.** Input and validation datasets, sources, and main attributes.

Note: <sup>1</sup> Namely, statistical analyses of the results of a government-led survey conducted in 2005 on the agricultural pollution along the central route of the South-to-North Water Transfer Project.

In this study, we used the ArcSWAT interface (version 2012) for ArcGIS. The simulation was performed with a monthly time step from 1998–2019, in which the first two years were excluded from the analysis as they were used as the warmup period.

#### 2.2.1. Datasets for SWAT Model Building

A digital elevation model (DEM, 90 m × 90 m, Figure 1b) was used to delineate the watershed. We obtained a DEM from the SRTM 90 m Digital Elevation Database (https://bigdata.cgiar.org/, accessed on 1 February 2021) and clipped it into the area that covers the whole TRB. Note that additional subbasin outlets were manually defined at the locations with hydrological gauging stations (Figure 2d) in addition to the autogenerated subbasins, allowing us to compare the simulated results with observations at any location of interest. covers the whole TRB. Note that additional subbasin outlets were manually defined at the locations with hydrological gauging stations (Figure 2d) in addition to the autogenerated subbasins, allowing us to compare the simulated results with observations at any location of interest.

A digital elevation model (DEM, 90 m × 90 m, Figure 1b) was used to delineate the watershed. We obtained a DEM from the SRTM 90 m Digital Elevation Database (https://bigdata.cgiar.org/, accessed on 1 February 2021) and clipped it into the area that

*Water* **2022**, *14*, x FOR PEER REVIEW 5 of 21

2.2.1. Datasets for SWAT Model Building

**Figure 2.** The spatial distribution of the land‐use/land‐cover (LULC) classes in 2015 (**a**), soil types (**b**), slope classes (**c**), and (**d**) gauging stations in the SWAT‐generated stream network, distinguish‐ ing meteorological stations used for building SWAT (triangle), hydrological stations used for model validation (circle), and water quality monitoring sections at the provincial boundary (square). Ab‐ breviations of land‐cover classes and soil types are found in the main text. **Figure 2.** The spatial distribution of the land-use/land-cover (LULC) classes in 2015 (**a**), soil types (**b**), slope classes (**c**), and (**d**) gauging stations in the SWAT-generated stream network, distinguishing meteorological stations used for building SWAT (triangle), hydrological stations used for model validation (circle), and water quality monitoring sections at the provincial boundary (square). Abbreviations of land-cover classes and soil types are found in the main text.

Approximately 66% of the study area is in agricultural use (AGRL and RICE). The remaining area is covered by forest (FRST), grassland (PAST and HAY), residential zones (URLD and URBN), and others. As the area under cultivation changed only slightly

from 2000 to 2015, with the dryland (AGRL) decreasing from 62.4% to 61.6% and the paddy field (RICE) increasing from 4.4% to 4.5%, we therefore categorized land-use/landcover classes based on the conditions in 2015 (Figure 2a). A total of 18.4% of the landcover area was considered to be mixed forest (FRST), and 6.8% was considered to be low density residential zones. Both the land-cover map and soil map hold a resolution of 1 km × 1 km. The soil characteristics for preparing the user-defined soil database include soil component parameters (e.g., soil name, hydrological soil group, and number of soil layers) and soil layer parameters (e.g., soil depth, salinity, organic matter content, and particle size distribution). All characteristics, except for the soil name, were extracted from the China Soil Science Database (http://vdb3.soil.csdb.cn/, accessed on 1 May 2021) for central China, with some further calculations using SPAW software (Table 1). The most dominant soil types are yellow-cinnamon soil (HHT, 37.4%), shajiang black soil (SJHT, 19.3%), and yellow-brown earth (HZR, 15.2%). The remaining soil types include skeletal soil (CGT), fluvo-aquic soil (CT), cinnamon soil (HT), paddy soil (SDT), lime soil (SHT), litho soil (SZT), alluvial soil (XJT), brown earth (ZR), and purplish soil (ZST). Considering the large span of slope in TRB (i.e., 0–163%, 0–58◦ ), it is necessary to categorize the slope into separate classes. Referring to the soil erosion condition on the steepness of cultivated fields [30], we divided the whole TRB into three slope classes: 0–3.5% (i.e., 0–2◦ , flat land with no soil erosion), 3.5%–26.8% (i.e., 2–15◦ , terraced land with a low soil erosion rate), and 26.8–9999% (i.e., 15–58◦ , sloping land with a high erosion rate) (Figure 2c).

The meteorological observations used were obtained from the National Meteorological Information Centre (Table 1) and consist of rainfall, temperature, relative humidity, solar radiation, and wind speed from seven stations (Figure 2d). Solar radiation was calculated based on sunshine observations using the Angstrom–Prescott equation. Stations outside the watershed were also included for more accurate spatial interpolation of climate data, particularly near the border.

#### 2.2.2. Management Practices for the SWAT Model

To obtain a relatively accurate estimation of crop yields and nutrient transformations in the TRB, we investigated crop rotation, fertilization, and livestock breeding practices based on field surveys and literature review (Tables 1 and 2). Specifically, these management activities were established for three types of land cover: AGRL, RICE, and URLD. As listed in Table 2, the main crops in the AGRL field are rainfed grains (winter wheat, summer corn) with rotation [31], while the RICE field is used for growing rice [32]. The nitrogen fertilizer applied was urea, and the phosphorus fertilizer was phosphorus pentoxide. The amount and fertilization period were set following Lan [33] and local crop guidance [34]. The irrigation schedule was set as default. As livestock and poultry breeding is mainly concentrated in Nanyang city and much less in Xiangyang city, we assumed that in the model all animal breeding occurs in Nanyang in low-density residential zones (URLD). The total amount of fresh manure discharge *Fmanure* is then estimated based on the statistics of animal breeding [35], with

$$F\_{\text{manure}} = \mu \sum\_{i} n\_i X\_i \text{Area}\_{TRB} / \text{Area}\_{URLD} \tag{1}$$

where *µ* is the fraction of livestock and poultry manure, which is set to 0.9 according to Cai [32], *n<sup>i</sup>* is the number of the *i*th specific livestock or poultry breeding, and *X<sup>i</sup>* is manure produced per unit area (kg/ha·d); Area*TRB* and Area*URLD* are the area of TRB (ha) and area covered by URLD in TRB (ha).

Three variants of fertilization were considered depending on land-cover classes: (1) for crop fields (i.e., AGRL and RICE) listed in Table 2, fertilizer was applied according to the calendar; (2) for animal breeding areas (i.e., URLD), compound fresh manure was applied once every day in the form of continuous fertilization; (3) for the remaining land-cover types, an auto-fertilization mode was adopted based on plant nitrogen stress.


**Table 2.** Settings on non-point-source pollution from planting in terms of crop cycle and management practices, and livestock breeding in terms of manure discharge in rural areas based on land cover in 2015.

Note: <sup>1</sup> The auto-fertilization based on heat unit in AGRL and RICE fields were removed. <sup>2</sup> P2O<sup>5</sup> is a selfdefined P fertilizer, in which the fraction of mineral P is 0.43. <sup>3</sup> Compound manure is a self-defined compound manure, in which nutrient fractions are proportional to that of the fresh manure produced by different livestock/poultry species.

Data used for validation include recorded time series of streamflow and water quality. The monthly streamflow was available at three stations on two tributaries: Guotan station (32◦320 N, 112◦360 E) on the Tang River, Xindianpu station (32◦250 N, 112◦180 E) on the Bai River, and at the watershed outlet (32◦5 0 N, 112◦120 E) (Figure 2d). The streamflow time series span from 1997 to 2012, with 2001–2006 missing. Regarding water quality data, we did not find any long-term observations. We therefore decided to use data from articles and online reports, e.g., [28,36,37], in which nutrients (N and P) are presented in concentration. Data on TN and TP concentrations were collected on different dates, at eight different sites (Figure 3d), including the two river sections at the provincial boundary (Bukou section on the Tang River and Diwan section on the Bai River, Figure 2d). *Water* **2022**, *14*, x FOR PEER REVIEW 8 of 21

**Figure 3.** Comparison between SWAT simulation and record extracted from articles with respect to monthly streamflow (**a**–**c**) and TN and TP concentrations (mg/L) (**d**). The flood season refers to the months of April to September, and the non‐flood season refers to the remaining months. The distri‐ bution of the subbasin index is found in the following Figure 4. **Figure 3.** Comparison between SWAT simulation and record extracted from articles with respect to monthly streamflow (**a**–**c**) and TN and TP concentrations (mg/L) (**d**). The flood season refers to the months of April to September, and the non-flood season refers to the remaining months. The distribution of the subbasin index is found in the following Figure 4.

**Figure 4.** Spatial distribution of simulated TN (**a**,**b**) and TP (**c**,**d**) averaged over 2000–2019. Panels (**a**,**c**) are the annual mean load into reach from each subbasin and (**b**,**d**) are the mean monthly pol‐ lutant concentrations in the reach. The blue palette in (**b**,**d**) indicates concentrations of water quality of Class I (applicable to source water and national nature reserves, with TN ≤ 0.2 mg/L and TP ≤ 0.02 mg/L), Class II (primary protection zone for centralized drinking water, with TN ≤ 0.5 mg/L and TP ≤ 0.1 mg/L), Class III (secondary protection zone for centralized drinking water, with TN ≤ 1 mg/L and TP ≤ 0.2 mg/L), Class IV (undrinkable, applicable to general industrial water, with TN ≤ 1.5 mg/L and TP ≤ 0.3 mg/L), and Class V (undrinkable, applicable to agricultural water, with TN ≤ 2 mg/L and TP ≤ 0.4 mg/L), respectively, according to Environmental Quality Standards for Surface **Figure 4.** Spatial distribution of simulated TN (**a**,**b**) and TP (**c**,**d**) averaged over 2000–2019. Panels (**a**,**c**) are the annual mean load into reach from each subbasin and (**b**,**d**) are the mean monthly pollutant concentrations in the reach. The blue palette in (**b**,**d**) indicates concentrations of water quality of Class I (applicable to source water and national nature reserves, with TN ≤ 0.2 mg/L and TP ≤ 0.02 mg/L), Class II (primary protection zone for centralized drinking water, with TN ≤ 0.5 mg/L and TP ≤ 0.1 mg/L), Class III (secondary protection zone for centralized drinking water, with TN ≤ 1 mg/L and TP ≤ 0.2 mg/L), Class IV (undrinkable, applicable to general industrial water, with TN ≤ 1.5 mg/L and TP ≤ 0.3 mg/L), and Class V (undrinkable, applicable to agricultural water, with TN ≤ 2 mg/L and TP ≤ 0.4 mg/L), respectively, according to Environmental Quality Standards for Surface Water (GB3838-2002). The red palette indicates water quality inferior to Class V (i.e., TN > 2 mg/L and TP > 0.4 mg/L).

Water (GB3838‐2002). The red palette indicates water quality inferior to Class V (i.e., TN > 2 mg/L and TP > 0.4 mg/L). SWAT simulated nutrient yields are expressed as load. The term was converted from SWAT simulated nutrient yields are expressed as load. The term was converted from load to concentration to facilitate comparison between simulation, record, and Environmental Quality Standards for Surface Water (GB3838-2002). The total N/P load in surface water is *TOT<sup>X</sup>* (*X* represents nitrogen or phosphorus), while that leaching from HRU or subbasin into

load to concentration to facilitate comparison between simulation, record, and Environ‐ mental Quality Standards for Surface Water (GB3838‐2002). The total N/P load in surface

tions (2) and (3). The corresponding concentration (mg/L) in surface water is the total

ே ൌ (2)

ൌ (3)

load divided by the mass flow rate (Equation (4)).

the reach during the time step is the sum of N/P in different forms (Equations (2) and (3). The corresponding concentration *C<sup>X</sup>* (mg/L) in surface water is the total load divided by the mass flow rate (Equation (4)).

$$TOT\_N = ORGN + NUSRQ \tag{2}$$

$$TOT\_P = ORGP + SEDP + SOLP \tag{3}$$

$$\mathcal{C}\_X = \frac{TOT\_X}{FLOW\_{out}} \times \eta \tag{4}$$

where *TOT<sup>N</sup>* and *TOT<sup>P</sup>* are total nitrogen TN (kg) and total phosphorus TP (kg) in the reach. *ORGN* and *NUSRQ* are organic N yield (kg N/ha) and nitrate NO<sup>3</sup> − transport with surface runoff into the reach (kg N/ha). *ORGP*, *SEDP*, and *SOLP* are organic P yield (kg P/ha), mineral P in sediment transported into reach (kg P/ha), and soluble mineral forms of P transport by surface runoff (kg P/ha), respectively. *FLOWout* is the streamflow out of the reach during time step (m3/s), and *η* is the unit conversion factor.

#### *2.3. Ecological Compensation Implementation*

Understanding the upstream–downstream linkage in hydrological processes and nutrient (N and P) pollution is essential. Generally, the economy of the upstream basin is relatively backwards compared to the downstream basin; thus, the conflict between economic development and water ecological protection is more pronounced. Protection of water resources in the upper regions is likely to be much more beneficial for downstream areas, especially in terms of urban landscape and ecology. For TRB, Nanyang city (in upstream Henan province) had a per capita GDP of CNY 38,064 (note: CNY 1 = USD 0.15), while Xiangyang city (in downstream Hubei province) had a GDP of 2.3 times the upstream GDP (CNY 84,815) in 2019. It is necessary to establish water pollution ecological compensation between downstream beneficiaries and upstream water protectors to achieve ecological sustainability for the entire basin.

Common river restoration measures include the collection of pollution taxes for individual polluters [38], payments for environmental services (e.g., agricultural practices) [39], conservation reserve programs [40], and conversion of cropland to forest (i.e., Grain for Green) [41]. Among them, the Grain for Green (hereinafter "GFG") measure is a widely adopted ecological restoration measure in China that is applicable to agricultural non-pointsource pollution. In this study, we established four sets of GFG measures applied to AGRL lands in Henan province, thereby adjusting the economic structure and promoting industrial upgrades using funds from eco-compensation. (1) Converting all sloping drylands (AGRL, slope > 15◦ ) in Henan province, comprising 1.5% of dryland areas in Henan, to forest (FRST); (2) converting all non-flat drylands (AGRL with slope > 2◦ ), constituting 21.8% of drylands in Henan, to FRST; (3) converting all drylands in Henan province to FRST; and (4) performing conversions based on the AGRL area ratio (without distinguishing slopes), 10%, 20%, 40%, 60%, and 80% of the AGRL land in Henan province to FRST. For each area ratio, we generated 10 scenarios by randomly selecting HRUs of unique combinations of SOIL-LULC until the threshold area was reached. A total of 53 AGRL-to-FRST scenarios were generated, and each was used as LULC input to rerun SWAT. The corresponding nutrient (N and P) and crop yield outputs were then simulated.

The eco-compensation quantification, which is the key for implementing compensation, is always measured by monetary values. The ecosystem services were estimated based on the opportunity cost of land use. Specifically, we added up the cost of planting trees and the annual mean value of crops lost because of the GFG activities. The tree planting fees were calculated based on the annual compensation standard for state-owned forests, which is CNY 75 per hectare per year. The value of wheat was assumed to be the average purchase price in Henan in 2020, which was CNY 2.24/kg, while corn was CNY 2.62/kg. This total cost then served as the upper boundary of the compensation standard. Additionally, the lower boundary of the compensation standard was estimated according to the up-to-date

national policy of the GFG program, which is subsiding CNY 4800 per hectare per year, for a total of 5 years.

#### **3. Results and Discussion**

According to the input data, the watershed is discretized into 61 subbasins and divided into 2298 HRUs. The watershed outlet (Zhangwan section on the Tangbai River–Han River intersection) is located in subbasin 56, which drains an area of 24,200 km<sup>2</sup> . The river sections lying at the provincial boundary, the Diwan section on the Bai River and the Bukou section on the Tang River (Figure 2d), are located in subbasins 45 and 46, with drainage areas of 11,780 km<sup>2</sup> and 7834 km<sup>2</sup> , respectively.

## *3.1. Suitability of SWAT for Simulating Streamflow and Nutrients*

SWAT model performance in simulating hydrological processes is typically determined by the fitness of the streamflow time series at the watershed outlet. According to the existing literature, when the Nash–Sutcliffe efficiency (NSE) exceeds 0.5 and the coefficient of determination (R<sup>2</sup> ) exceeds 0.6, the simulation results are deemed satisfactory [42]. Before calibration, the NSE and R<sup>2</sup> values for monthly streamflow at the outlet of TRB are already 0.9 and 0.72, respectively (Figure 3a). The streamflow in high-flow seasons is underestimated at the outlet. This could be related to the interpolation and aggregation of daily rainfall. The NSEs for the reaches on the tributaries also exceed 0.6, while R<sup>2</sup> s exceed 0.8 (Figure 3b,c). According to the statistical indicators, it appears that the SWAT model has satisfactory applicability to TRB. This may be partly due to the relatively uniform distribution of meteorological inputs (Figure 2d). In contrast, simulated streamflow does not improve much after 2000 auto-calibrations using SWATCUP, a phenomenon which was also observed in another Yangtze River subbasin study [43]. Therefore, the rest of this study used the default parameters and settings, instead of the calibrated parameters. Literature discussions are available for the SWAT model calibration method, sensitivity analysis, and the possible range of parameters [44,45].

Water quality comparison in terms of concentrations of TN and TP is shown in Figure 3d. Because only a monthly step was adopted when running SWAT, the impact of daily step storm scour on nutrients cannot be considered; thus, the hydrologic condition when water was sampled (i.e., water quality data obtained from the literature) may differ from that we simulated. Most TN concentrations are greater than 1.0 mg/L and the overestimation of the SWAT model usually occurs during the flood season (April to September). TP concentrations are mostly within 1.0 mg/L but overestimation of TP is more pervasive. These frequent overestimations of N and P may likely be the result of uncalibrated parameters impacting sediment and nutrient transportation simulations. Nevertheless, we failed to calibrate the simulations as calibration requires observations of nutrient loads (kg), whereas the collected measurements are in the form of concentration (kg/L) and, more importantly, include both point and non-point sources of pollution at distinct locations. Due to the difficulty of collecting point-source pollution data, the nutrient emissions caused by industrial waste and residential sewage were not considered in this study.

In general, despite the dispersion of nutrient concentrations, the estimated values mostly fall within acceptable ranges according to available data for the study area. We claim that the SWAT model can essentially convey the hydrological processes and nutrient pollution in TRB.
