**4. Discussions**

#### *4.1. Impact of TWSC on ET Estimates in Local Catchments*

In the UYRB, MYRB, PRB, LRB, and HuRB, ETWB is typically smaller than ETPQ in the wet season from May to July, whereas from September to December ETWB is larger than ETPQ (Figure 3). The water balance equation explains this pattern: during the wet season, TWSA usually increases and TWSC (*ds*/*dt*) is greater than 0, so ETWB falls below ETPQ. In the dry season, with less precipitation, TWSA generally decreases and TWSC (*ds*/*dt*) is smaller than 0, so ETWB is larger than ETPQ (Figure 3).
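The sign argument above can be written out explicitly. This is a minimal sketch, assuming that ETWB is the full water-balance estimate and ETPQ is precipitation minus runoff; the symbols $P$ (precipitation), $Q$ (runoff), and $dS/dt$ (TWSC) are chosen here for illustration:

$$
ET_{WB} = P - Q - \frac{dS}{dt}, \qquad ET_{PQ} = P - Q \quad\Rightarrow\quad ET_{WB} - ET_{PQ} = -\frac{dS}{dt}
$$

Under this reading, a positive TWSC (storage gain) gives ETWB smaller than ETPQ, and a negative TWSC (storage loss) gives ETWB larger than ETPQ.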

It should be noted that the impacts of TWSC on ET estimates are region-specific. On the monthly scale, ETWB is clearly larger than ETPQ from March to May in the HRB (Figure 3g), which is caused by the spring irrigation of wheat [13]. The excess of ETWB over ETPQ in these three months is 14.4, 22.1, and 21.5 mm/month, respectively (sum: 57.9 mm). This result is similar to the human-induced ET (60.0 ± 24.2 mm) estimated by Pan et al. [13] for the same months. For the SRB and LRB, in contrast, ETWB is clearly larger than ETPQ from August to October. Since the main crop in this region is corn, crop water consumption continues through the growth period in these months. Meanwhile, PCMDC drops significantly (−40.3 mm/month) in September relative to August (Figure 3f), which differs from the HRB. Because agricultural water consumption differs among regions, monthly TWSC is region-specific, and so are the deviations between ETWB and ETPQ.

On the other hand, in the MYRB, PRB, and MRB, the RMSEs between the monthly means of ETWB and ETPQ are significantly larger than those for the other catchments. As Figure A2 shows, the amplitudes of monthly mean TWSC in these catchments are larger than in the other catchments. These catchments are all located in South China, with abundant precipitation [57]. During the rainy season, as water is stored and TWSC is positive, the monthly ETWB is smaller than ETPQ in all the catchments. As the months of the rainy season differ among catchments, these catchments also show regional heterogeneity. However, the region-specific impacts of TWSC on ET deserve more research.

On the annual scale, the sizeable differences between annual ETWB and ETPQ can mostly be explained by large precipitation anomalies (Figure A3), such as the first year in Figure 4c (YeRB) and Figure 4i (MRB) and the year 2012 in Figure 4f (LRB). These differences coincide with the lowest or highest precipitation of the whole study period in the corresponding catchment.

Notably, all the STDs of annual ETWB are smaller than those of ETPQ (see Table 3), indicating smaller interannual fluctuations of ETWB. In years with more precipitation, such as 2008 and 2014 in the UYRB and 2012 and 2013 in the SRB, TWSA increases and TWSC is greater than 0, so, based on the water balance equation, ETWB should be less than ETPQ in the given year. Conversely, in years with a precipitation deficit, TWSA usually decreases and ETWB is higher than ETPQ. We therefore consider that TWSA acts as a reservoir in the terrestrial water cycle, impounding water and reducing the amount of water that returns to the atmosphere through evapotranspiration or other forms in wet years, and discharging water in dry years. We conclude that estimating annual ET simply by subtracting runoff from precipitation would overestimate the interannual fluctuations of ET.
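The buffering effect described above can be illustrated with a small numerical sketch. All values below are hypothetical annual totals in mm/yr, invented for illustration rather than taken from the study, and the function names are likewise assumptions. TWSC is set positive in wet years and negative in dry years, which is the behavior the text describes.

```python
import statistics


def et_pq(precip, runoff):
    """ET estimated as precipitation minus runoff (storage change ignored)."""
    return [p - q for p, q in zip(precip, runoff)]


def et_wb(precip, runoff, twsc):
    """ET from the full water balance: P - Q - dS/dt."""
    return [p - q - ds for p, q, ds in zip(precip, runoff, twsc)]


# Hypothetical annual series (mm/yr); not data from the paper.
precip = [900.0, 1100.0, 800.0, 1200.0, 1000.0]
runoff = [400.0, 500.0, 350.0, 550.0, 450.0]
# Storage gains in wet years, losses in dry years (sums to zero).
twsc = [-20.0, 30.0, -40.0, 50.0, -20.0]

pq = et_pq(precip, runoff)
wb = et_wb(precip, runoff, twsc)

std_pq = statistics.stdev(pq)  # interannual spread when storage is ignored
std_wb = statistics.stdev(wb)  # interannual spread with storage included
print(f"STD(ET_PQ) = {std_pq:.1f} mm/yr, STD(ET_WB) = {std_wb:.1f} mm/yr")
```

In this constructed case the mean annual ET is identical for both estimates because TWSC sums to zero, while the spread of ETPQ is larger. That is the overestimated interannual fluctuation the paragraph refers to.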

The difference between mean annual ETWB and ETPQ reflects the long-term rate of change of TWSA in a catchment. ETWB is significantly higher than ETPQ in the HRB, where water depletion (mainly of groundwater) is fast [58]. In the LRB and YeRB, the mean annual ETWB is also larger than ETPQ, which likewise indicates water depletion there [58]. Conversely, TWSA increases in the UYRB, MYRB, SRB, and PRB, and therefore the mean annual ETWB is typically less than ETPQ.

#### *4.2. The Differences between ETWB and other ET Estimates*

On the monthly scale, the RMSEs between ETWB and ET from the different GRACE solutions are relatively small, and ETCSR–DDK4 is closest to ETWB among the three GRACE solutions (Table A1). In the YeRB, SRB, and LRB, the maximum RMSEs are between ETWB and ETGLEAM; all of these catchments are semiarid and located in North China. In the UYRB and MRB, the RMSEs between ETWB and ETGLDAS–1 are the largest. In the MYRB, PRB, HRB, and HuRB, the RMSEs between ETWB and ETPQ are the largest, which indicates that ignoring TWSC has the greatest impact on the ET estimate there. All of these catchments are humid except the HRB, where water consumption is intense.

On the annual scale, the RMSEs between ETWB and ET from the three GRACE solutions are small, and ETCSRT–GSH.sf is closest to ETWB among the three solutions (Table A2). This indicates some differences in the TWSC estimates between the monthly and annual scales. The RMSEs between ETWB and ET from the other products markedly exceed those between ETWB and ET from the GRACE solutions. In the UYRB, MYRB, PRB, and MRB, which are all in humid regions, the RMSEs between ETWB and ETGLDAS–1 are the largest. It should be noted that in the UYRB, the RMSE between ETWB and ETPQ is even smaller than the RMSEs between ETWB and ET from the other GRACE solutions, which indicates that the interannual variations of TWSC are very small in this catchment. Overall, ET estimates from different GRACE solutions show relatively small deviations in all the catchments, whereas ET estimates from different products show relatively large deviations in the humid catchments.

#### *4.3. Impact of Precipitation and Modeled Runoff from a Water Balance Perspective*

The RMSEs between annual ETWB and the other ET results are further analyzed, and the results are shown in Table A3 (WB – GLDAS-1), Table A4 (WB – GLDAS-2.1), and Table A5 (WB – GLEAM). In Table A3, for the UYRB, MYRB, SRB, PRB, and MRB, the RMSEs between ETWB and ETGLDAS–1 are markedly reduced when the deviations of both PGLDAS–1 and QGLDAS–1 are taken into consideration. GLDAS ET outputs are not computed with the water balance method [9,59]. Nevertheless, if the accuracy of PGLDAS–1 and QGLDAS–1 can be improved in China, e.g., by verifying modeled QGLDAS–1 against in situ runoff, the ET estimate would also benefit from the improved runoff outputs through the water balance equation during the simulation process. In the YeRB, LRB, HRB, and HuRB, the RMSEs are also reduced, but by smaller proportions than in the catchments above. In the YeRB and LRB, if only the difference in runoff is considered, the RMSEs even increase, while in the HRB the RMSE is slightly reduced. Since the outflows are much smaller in these three catchments than in the other catchments, the deviation of runoff is itself small (Figure 7a). Unlike in the humid catchments, in the three semiarid catchments (YeRB, LRB, and HRB) the RMSEs of ET–P are reduced relative to those of ET, which indicates that deviations in the precipitation forcing data indeed contribute to the deviations of ET.

In Table A4, the RMSEs between ETWB and ETGLDAS–2.1 are reduced in all the catchments except the HRB when the deviations of precipitation and runoff are considered. In the YRB (UYRB and MYRB), the RMSEs between QRSBC and QGLDAS–2.1 account for most of the deviations. In the HRB and MRB, the deviations between ETWB and ETGLDAS–2.1 do not result from the precipitation difference. It should be noted that in the LRB, when the precipitation inconsistency is considered (Figure A3f), the RMSE between ETWB and ETGLDAS–2.1 is dramatically reduced, which explains the overestimation of annual ET by ETGLDAS–2.1. In the HRB, the RMSE increases when the deviations of precipitation and modeled runoff are considered (Table A4). Since the HRB is heavily influenced by human activities [13,31], the RMSE between mean annual ETWB and the GLDAS ET outputs there is mainly attributable to anthropogenic activities [13].

As for the RMSEs between ETWB and ETGLEAM, we only consider the precipitation difference (Table A5). In the YeRB, LRB, HRB, and HuRB, the RMSEs between ETWB and ETGLEAM are reduced when the precipitation difference is taken into consideration; the YeRB, LRB, and HRB are semiarid catchments. Figure A3 also shows a large deviation between PCMDC and PMSWEP. The proportional reduction of the RMSE in the YeRB reaches 72.8%. In the UYRB, MYRB, SRB, PRB, and MRB, the RMSEs even increase, which indicates that the deviations of precipitation contribute little or nothing to the deviation between ETGLEAM and ETWB. In the MRB, the RMSE rises sharply from 86.9 mm/yr (ET) to 228.7 mm/yr (ET–P), although the difference between the two annual precipitation datasets is actually small (Figure A3i).

Here we explore the deviation between ETWB and GLDAS or GLEAM ET based on the water balance equation. RMSE (ET) generally decreases when the deviations of PGLDAS and modeled QGLDAS in the GLDAS LSM are taken into consideration. In four catchments (YeRB, LRB, HRB, and HuRB), precipitation differences contribute to the deviation between ETWB and ETGLEAM. However, the cases in which RMSE (ET-P), RMSE (ET-Q), and RMSE (ET-(P-Q)) increase relative to RMSE (ET) should be further explored. We do not investigate forcing variables other than precipitation that are used to derive ET, e.g., radiation, air temperature, and snow water equivalent [6,9,60]. Therefore, a future intercomparison could identify the impact of these variables on ET estimates.
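One plausible reading of RMSE (ET), RMSE (ET-P), RMSE (ET-Q), and RMSE (ET-(P-Q)) is sketched below. It assumes that each adjusted RMSE is computed from the annual ET difference after removing the corresponding precipitation and/or runoff difference implied by the water balance equation. The exact definitions behind Tables A3 to A5 are not stated in this section, so the function `adjusted_rmse`, its arguments, and the sample values are hypothetical.

```python
import math


def adjusted_rmse(et_ref, et_model, p_ref=None, p_model=None,
                  q_ref=None, q_model=None):
    """RMSE of (ET_ref - ET_model) after removing P and/or Q differences.

    From ET = P - Q - dS/dt, a difference in P maps to ET with a plus sign
    and a difference in Q with a minus sign. Pass the P pair and/or the Q
    pair to remove that contribution; pass neither for the plain RMSE (ET).
    """
    residuals = []
    for i in range(len(et_ref)):
        d = et_ref[i] - et_model[i]
        if p_ref is not None:
            d -= p_ref[i] - p_model[i]   # remove the precipitation deviation
        if q_ref is not None:
            d += q_ref[i] - q_model[i]   # remove the runoff deviation
        residuals.append(d)
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical annual values (mm/yr); not data from the paper.
et_wb = [500.0, 520.0, 480.0]
et_lsm = [560.0, 570.0, 545.0]
p_obs = [900.0, 950.0, 850.0]
p_lsm = [950.0, 1000.0, 910.0]
q_obs = [400.0, 430.0, 370.0]
q_lsm = [390.0, 425.0, 365.0]

rmse_et = adjusted_rmse(et_wb, et_lsm)
rmse_et_p = adjusted_rmse(et_wb, et_lsm, p_obs, p_lsm)
rmse_et_q = adjusted_rmse(et_wb, et_lsm, q_ref=q_obs, q_model=q_lsm)
rmse_et_pq = adjusted_rmse(et_wb, et_lsm, p_obs, p_lsm, q_obs, q_lsm)
print(rmse_et, rmse_et_p, rmse_et_q, rmse_et_pq)
```

In this constructed example most of the ET deviation comes from precipitation, so the RMSE drops once the precipitation difference is removed. When an adjusted RMSE instead increases, as reported for some catchments, the precipitation or runoff deviation does not explain the ET deviation under this reading.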

#### *4.4. Impact of Groundwater Baseflow and Water Diversion on ET Estimates*

Based on the water balance equation, groundwater inflow and outflow across the basin boundary also affect the ET estimate. For example, in the LRB, the groundwater outflow estimated by Zhang and Li [61] is 0.61 × 10<sup>8</sup> m<sup>3</sup>/yr, and its impact on the annual ET is only ~0.3 mm/yr. It is therefore negligible relative to the annual ET (417.7 ± 46.5 mm/yr).

Water diversion into and out of a basin is also part of the basin water balance. In China, the South-to-North Water Diversion includes the east route, the middle route, and the west route projects (http://nsbd.mwr.gov.cn/). The west route project has not been built yet. The east route starts from the mainstream of the Lower Yangtze River and transports water to Shandong Province, which is not in our study area. The middle route transports water from the MYRB to the HRB, passing through the HuRB and the YeRB. It first transported water to the North in October 2014, with a water volume of 21.67 × 10<sup>8</sup> m<sup>3</sup> in the first year. The impact on the ET estimate is 3.1 mm/yr for the MYRB, which is relatively small compared to the annual ET (689.7 ± 50.3 mm/yr). If the water is entirely supplied to the HRB, the impact on the ET estimate reaches 15.18 mm/yr, which has a noticeable influence on the ET estimate in the HRB (494.2 ± 37.2 mm/yr). Estimating ET in this region after 2015 therefore requires accounting for the water diversion.
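The conversion from a transferred water volume to an equivalent water depth over a basin is simple arithmetic, sketched below. The basin area is not given in this section, so the value used for the HRB (about 14.28 × 10<sup>4</sup> km<sup>2</sup>) is back-calculated from the stated 15.18 mm/yr and serves only as an illustrative assumption. The function name is also hypothetical.

```python
def volume_to_depth_mm(volume_m3, area_km2):
    """Convert a water volume (m^3) to an equivalent depth (mm) over a basin."""
    area_m2 = area_km2 * 1.0e6           # 1 km^2 = 1e6 m^2
    return volume_m3 / area_m2 * 1000.0  # m -> mm


# First-year middle-route diversion volume stated in the text (m^3).
diversion_m3 = 21.67e8

# ASSUMPTION: HRB area back-calculated from the stated 15.18 mm/yr;
# it is not a value reported in this section.
hrb_area_km2 = 14.28e4

depth_hrb = volume_to_depth_mm(diversion_m3, hrb_area_km2)
print(f"Equivalent depth over the HRB: {depth_hrb:.2f} mm/yr")
```

The same function applies to the groundwater outflow in the LRB and to the diversion out of the MYRB once the respective basin areas are supplied.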

#### *4.5. Impact of Spatial Scale on ET Estimate*

The area of the MRB is only 5.45 × 10<sup>4</sup> km<sup>2</sup>, which is less than the typical GRACE footprint (20 × 10<sup>4</sup> km<sup>2</sup>). However, some studies have demonstrated that GRACE is capable of detecting TWSA in local regions with an area smaller than the GRACE resolution if the signal amplitude is large enough [44,62,63]. As the MYRB receives the most abundant precipitation among these catchments (Table 3), TWSA should have a higher SNR (signal-to-noise ratio), and TWSC tends to have higher reliability. On the other hand, the maximum uncertainty of the monthly ET estimate is indeed in the MRB, where the uncertainties of monthly TWSC, precipitation, and runoff are also large (Table 3). Thus, we recommend caution when using TWSA estimates in regions with a small area.
