**2. Cascade Reservoir Impoundment Model**

#### *2.1. Study Area*

The Yangtze River, the longest river in Asia, flows 6300 km to the East China Sea, drains a total area of 1.8 million km<sup>2</sup>, and has abundant hydropower resources. A series of cascade reservoirs has been constructed along the upper Yangtze River, providing a wide range of services including flood control, hydropower generation, water supply, and navigation. There are five cascade reservoirs in the upper Yangtze River: WDD (Wu-Dong-De), BHT (Bai-He-Tan), XLD (Xi-Luo-Du), XJB (Xiang-Jia-Ba), and TGR (Three Gorges Reservoir). These reservoirs, along with their characteristics, are listed in Table 1. There are no main tributaries between the WDD and XJB reservoirs, whereas three main tributaries join the river between XJB and TGR: the Min River, the Jia-Ling River, and the Wu River. The inflows to WDD (*QWDD*) and TGR (*QTGR*) are derived by revivification (i.e., restoration of the natural flow) from the gauge records at the Hua-Tan and Yi-Chang hydrological stations, respectively. Figure 1 shows a sketch map of the cascade reservoirs, hydrological stations, and tributaries in the upper Yangtze River basin.

**Table 1.** Characteristics of the five cascade reservoirs in the upper Yangtze River.


**Figure 1.** Sketch map of the five cascade reservoirs in the upper Yangtze River.

#### *2.2. Impoundment Operation Rules for Cascade Reservoirs*

The impoundment operation rules are employed to refill reservoir storage during the impoundment period. These rules (Figure 2c) delineate trajectories that raise the water level from the annual top of buffer pool at the initial impoundment time to the top of conservation pool by the end of the impoundment period. The standard impoundment operation rules (SIOR), derived from historical flow records, initiate the impoundment operation at fixed, predefined dates. However, the SIOR may fail to refill the reservoirs during the impoundment period in dry years. Hence, under low-flow conditions and in dry years, an early impoundment operation is more desirable for refilling the storage capacity. Table 2 lists the potential dates for initiating early impoundment in the upper Yangtze River, obtained from previous investigations [7,9]. We employ these initial dates along with the inflow conditions for the WDD and TGR reservoirs (*QWDD* and *QTGR*) to evaluate the possibility of an early impoundment for the cascade reservoirs.

**Figure 2.** The flowchart of the parameterization–simulation–optimization (PSO) strategy to derive reservoir impoundment operation rule curves (**a**) input data of PSO, (**b**) optimization strategy of PSO, (**c**) concept of the seasonal top of buffer pool (STBP) and impoundment operation rules, similar to a previous study [9].


**Table 2.** Impounding periods of the standard impoundment operation rules (SIOR) and early impoundment operation rules (EIOR) for the five reservoirs.

Employing EIOR without considering the seasonal top of buffer pool (STBP) could increase the risk of flooding. Figure 2c shows the concept of the STBP, and Table 3 lists the STBP values for the selected reservoirs (the calculation method is adopted from previous studies [7,9]). The STBP serves as the maximum permissible water level during the impoundment period, which mitigates the risk of flooding. We follow an iterative process to find the STBP for the reservoirs by evaluating the most extreme event; for further information on this process, please refer to [7,9]. We use the STBP in our model to control and assess the risk of flooding for the selected reservoirs.

**Table 3.** Seasonal top of buffer pool (STBP) for the five reservoirs in different periods.


According to Table 2, the impoundment process for most reservoirs starts before September. Hence, streamflow forecasts with a 2-month lead time can be used in early August to evaluate the possibility of using EIOR. For this purpose, we define quantile-based thresholds from the historical September monthly inflow to the Wu-Dong-De and Three Gorges reservoirs (*QWDD* and *QTGR*). These thresholds characterize the streamflow condition and help decision-makers choose between EIOR and SIOR. For instance, if the September monthly *QWDD* is forecast to be below the threshold percentile, EIOR is recommended as the suitable impoundment operation rule.

It is worth noting that employing higher percentile thresholds for inflow would increase the possibility of using EIOR. However, it also increases the risk of flooding, so careful consideration should be devoted to the selection of these thresholds. Here, we examine four quantiles of the historical monthly inflow in September, namely the 20-, 30-, 40-, and 50-percentiles (Figure 2a), for *QWDD* and *QTGR* to select the best thresholds. For instance, under the 20-percentile threshold, the observed September inflow of each year falls into either the above-20-percentile or the below-20-percentile category. We then evaluate the potential benefit and risk for each of these thresholds by employing a cascade reservoir impoundment simulation–optimization model under the EIOR and SIOR scenarios. Since each threshold divides the historical streamflow observations into two groups, we use each group separately to derive the impoundment rule curves. Hence, for each inflow series there are eight flow groups (four thresholds × two groups) that need to be evaluated for each impoundment approach by the simulation–optimization model.
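As a minimal sketch of this decision rule, the snippet below computes a percentile threshold from a historical September inflow series and recommends EIOR when a forecast falls below it. The function names, the linear-interpolation percentile, and the inflow values are illustrative assumptions, not the procedure or data used in this study.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    pos = (len(data) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def choose_rule(forecast_inflow, historical_inflows, q):
    """Recommend EIOR when the forecast is below the q-th percentile."""
    threshold = percentile(historical_inflows, q)
    return "EIOR" if forecast_inflow < threshold else "SIOR"


# Hypothetical September mean inflows (m^3/s), for illustration only.
history = [9000, 12000, 15000, 11000, 18000, 14000,
           10000, 16000, 13000, 17000, 20000]

for q in (20, 30, 40, 50):
    print(q, percentile(history, q), choose_rule(10500, history, q))
```

A higher percentile raises the threshold, so more forecasts fall below it and EIOR is recommended more often, which is the trade-off against flood risk described above.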

The reservoir simulation–optimization model is generally used to construct the rule curves by simulating the reservoir responses to predefined operating rules. Due to a large number of policies and constraints, mathematical optimization techniques can be used to identify the optimal operation rules by evaluating all possible alternatives [37]. The parameterization–simulation–optimization (PSO) approach is a popular and effective way of deriving optimal rule curves for cascade reservoirs [38]. Initially, PSO employs a linear rule curve (impoundment operation rule curve shown in Figure 2c), which connects the annual top of the buffer pool to the top of the conservation pool. It then employs a heuristic strategy to find the optimal rule curve according to predefined objective functions under possible inflow scenarios. Figure 2 shows the scheme of the PSO approach [39]. Finally, the objective function values of the PSO are employed to select the best threshold for reservoir impoundment decision-making.
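The initial linear rule curve can be sketched as follows. The function name and the water levels are hypothetical and serve only as an illustration; in the PSO approach, the optimizer would subsequently adjust the parameters of such a curve.

```python
def linear_rule_curve(start_level, end_level, n_days):
    """Daily target levels rising linearly from start_level to end_level."""
    if n_days < 2:
        raise ValueError("n_days must be at least 2")
    step = (end_level - start_level) / (n_days - 1)
    return [start_level + step * day for day in range(n_days)]


# Hypothetical levels (m): annual top of buffer pool and top of
# conservation pool, over a 61-day impoundment period.
curve = linear_rule_curve(145.0, 175.0, 61)
print(curve[0], curve[30], curve[-1])
```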

Here, we employ PSO at a daily timescale to find the optimum rule curve for each threshold and scenario (Figure 2b). By optimizing the parameters of the rule curves for SIOR and EIOR, decision-makers can choose between EIOR and SIOR based on the resulting objective function values. The objective functions and constraints of the impoundment operation employed in the PSO model are discussed in Sections 2.3.1 and 2.3.2, respectively. Section 2.4 describes the NSGA-II algorithm used to optimize the impoundment operation rule curves.

#### *2.3. Objective Functions and Constraints*

#### 2.3.1. Objective Functions

Decision-makers rely on different criteria to make a comprehensive assessment of operation rules and address trade-offs among different users and services. In the Yangtze River, the goal of reservoir impoundment is to enhance water conservation in order to maximize hydropower generation and the fullness storage rate while minimizing the risk of flooding [7,18]. Hence, we employ objective functions that measure the degree to which these goals are achieved. These objectives are adopted from previous studies [7,18] and can be expressed mathematically as:

(1) Maximum hydropower generation (*HG*),

$$\max HG = \max \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \sum_{k=1}^{M} HG_{i,k} \right) \right];\tag{1}$$

(2) Maximum fullness storage rate (*FSR*),

$$\max FSR = \max \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \sum_{k=1}^{M} \alpha_k FSR_{i,k} \right) \right],\tag{2}$$

$$FSR_{i,k} = \frac{V_{high,i}^k - V_{\min}^k}{V_{\max}^k - V_{\min}^k} \times 100\%;\tag{3}$$

(3) Minimum flood control risk (*R*),

$$\min R = \min[\max(R_1, R_2, \dots, R_k, \dots, R_M)],\tag{4a}$$

$$R_k = P(\text{water level} > \text{STBP}) = N_{risk,k} / N;\tag{4b}$$

where
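A minimal sketch of Equations (3), (4a), and (4b) is given below. It assumes, for illustration only, that the storage term in Equation (3) is the highest storage reached in a year and that a year counts toward the risk when the simulated water level exceeds a single STBP value on any day; the function names and all numbers are hypothetical.

```python
def fullness_storage_rate(v_high, v_min, v_max):
    """Equation (3): share of the usable storage that is filled, in percent."""
    return (v_high - v_min) / (v_max - v_min) * 100.0


def flood_risk(levels_by_year, stbp):
    """Equation (4b): fraction of years with a water level above the STBP."""
    n_risk = sum(1 for levels in levels_by_year if max(levels) > stbp)
    return n_risk / len(levels_by_year)


def cascade_risk(risks):
    """Inner term of Equation (4a): the largest risk among the reservoirs."""
    return max(risks)


# Hypothetical storages and daily water levels for one reservoir.
print(fullness_storage_rate(80.0, 20.0, 100.0))
print(flood_risk([[1.0, 2.0], [3.0, 5.0], [2.0, 2.0], [1.0, 4.0]], 3.5))
print(cascade_risk([0.0, 0.5, 0.25]))
```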


#### 2.3.2. Operation Constraints

In addition to the objective functions, the constraints of the reservoir system need to be specified for the optimization process. The following equality and inequality operational constraints need to be satisfied in the cascade reservoirs impoundment operation. Adopted from previous studies [7,18,40], the mathematical formulations of these constraints are as follows:

(1) Water balance equation,

$$V_{i,j+1}^{k} = V_{i,j}^{k} + \left(Q_{\text{in}(i,j)}^{k} - Q_{\text{out}(i,j)}^{k}\right)\Delta t, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{5}$$

(2) Reservoir capacity,

$$V_{\min}^k \le V_{i,j}^k \le V_{\max}^k, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{6}$$

(3) Power generation,

$$P_{\min}^{k} \le A^k Q_{v(i,j)}^k H_{i,j}^k \le P_{\max}^k, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{7}$$

(4) Reservoir discharge,

$$Q_{\min}^k \le Q_{\text{out}(i,j)}^k \le Q_{\text{safe}}^k, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{8}$$

$$\left| Q_{\text{out}(i,j+1)}^k - Q_{\text{out}(i,j)}^k \right| \le \Delta Q^k, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{9}$$

(5) Navigation,

$$Z_{d\min}^{k} \le Z_{d(i,j)}^{k} \le Z_{d\max}^{k}, \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{10}$$

$$Z_{d(i,j)}^k = f\left(Q_{\text{out}(i,j)}^k\right), \quad i = 1, \dots, N, \; j = 1, \dots, T;\tag{11}$$

where
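To illustrate how Equations (5) and (8) interact in a daily simulation, the sketch below advances the storage by one time step and clips the release to the admissible discharge range. The function names, the simple clipping strategy, and all numbers are assumptions made for illustration; the remaining constraints (Equations (6), (7), and (9)–(11)) are omitted for brevity.

```python
SECONDS_PER_DAY = 86400.0


def step_storage(v, q_in, q_out, dt=SECONDS_PER_DAY):
    """Equation (5): next storage (m^3) from inflow and outflow (m^3/s)."""
    return v + (q_in - q_out) * dt


def feasible_release(v, q_in, v_target, q_min, q_safe, dt=SECONDS_PER_DAY):
    """Release that steers storage toward v_target, clipped to Equation (8)."""
    q_needed = q_in - (v_target - v) / dt
    return min(max(q_needed, q_min), q_safe)


# Hypothetical one-day step: storage 1.0e9 m^3, inflow 5000 m^3/s.
storage = 1.0e9
target = storage + 8.64e7  # storage implied by the rule curve for the next day
release = feasible_release(storage, 5000.0, target, q_min=2000.0, q_safe=40000.0)
print(release, step_storage(storage, 5000.0, release))
```

When the target cannot be reached without violating the minimum discharge, the release is held at the lower bound and the storage falls short of the rule curve on that day.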


#### *2.4. NSGA-II Optimization Algorithm*

The nonlinearity of reservoir systems, along with the existing constraints, requires an effective optimization algorithm [41]. Here, we employ the non-dominated sorting genetic algorithm-II (NSGA-II), which is a robust multi-objective optimization algorithm [36], to derive the parameters of the rule curves. The NSGA-II algorithm has been applied to a wide range of complex multi-objective reservoir optimization and water resources management problems [18,38,42–44].

The NSGA-II algorithm has four parameters, including population size, generation number, crossover rate, and mutation rate, that need to be tuned by the user. Population size and generation number determine the effectiveness and efficiency of the algorithm and control the convergence speed to the optimal non-dominated solutions. Crossover and mutation rates control the ability of the algorithm to perform an effective search over the problem space [38,45]. In this study, the population size and the generation number were set to 50 and 200, respectively. These values are selected based on trial and error to obtain reasonable non-dominated solutions with acceptable simulation time. The crossover and mutation rates were empirically set to 0.9 and 0.1, respectively. The non-dominated solutions are used to evaluate the three objective functions for each threshold and each of the EIOR and SIOR scenarios.
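The notion of non-dominated solutions used here can be illustrated with a small filter over candidate objective values. This is only a sketch of the dominance test for the three objectives (maximize *HG* and *FSR*, minimize *R*), not an implementation of NSGA-II itself, which additionally ranks successive fronts, applies crowding-distance selection, and evolves the population through crossover and mutation. The candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    Both tuples are expressed so that smaller values are better.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(solutions):
    """Keep the (HG, FSR, R) tuples that no other solution dominates."""
    # Negate HG and FSR so that all three objectives are minimized.
    keyed = [(-hg, -fsr, r) for hg, fsr, r in solutions]
    front = []
    for i, candidate in enumerate(keyed):
        dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(keyed) if j != i
        )
        if not dominated:
            front.append(solutions[i])
    return front


# Hypothetical (HG, FSR, R) values for four candidate rule curves.
candidates = [(100.0, 90.0, 0.0), (105.0, 88.0, 0.0),
              (99.0, 89.0, 0.02), (105.0, 90.0, 0.05)]
print(pareto_front(candidates))
```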

#### **3. Evaluation of GloFAS-Seasonal Forecasts**

GloFAS-Seasonal forecasts combine the ECMWF's latest seasonal meteorological forecasting system, SEAS5, with a river routing model, Lisflood, to provide streamflow forecasts at the global scale [35]. The dataset provides weekly-averaged river flow with a 4-month lead time. The first component of GloFAS-Seasonal is the meteorological input from SEAS5, which employs a data assimilation system along with a global circulation model. SEAS5 is executed once a month to produce seasonal weather forecasts with a 7-month lead time. The second component is a revised Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (HTESSEL), which computes the land surface response to atmospheric forcing and simulates the evolution of soil temperature, moisture content, and snowpack conditions through the forecast horizon to produce a corresponding forecast of surface and subsurface runoff [46]. The third component is Lisflood, which simulates the groundwater (subsurface water storage and transport) processes and the routing of water through the river network. Although SEAS5 provides forecasts for 7 months ahead, GloFAS-Seasonal uses only the first 4 months and thus produces river flow forecasts for the next 4 months. For more details on the forecast method, please refer to [35].

GloFAS-Seasonal is a real-time forecast dataset that contains data from January 2018 onward and is updated every month, with a total of 51 ensemble members. To evaluate the skill of the dataset, a set of retrospective seasonal forecasts for past dates, called reforecasts (also known as hindcasts), is available for comparison with historical observed streamflow. GloFAS-Seasonal reforecasts are available at http://www.globalfloods.eu/ and have 25 ensemble members from January 1981 to December 2017. In this study, GloFAS-Seasonal reforecasts at the Hua-Tan and Yi-Chang hydrological stations in the Yangtze River are downloaded and analyzed. The original weekly-averaged reforecasts are converted into monthly products for reservoir impoundment operation. Hence, the monthly-averaged streamflow in September is obtained at the beginning of August with a 2-month lead time (LM2).

We evaluate the GloFAS-Seasonal reforecasts to measure the capability of the dataset to predict the streamflow condition, i.e., the ability of the reforecast to predict that the September monthly-averaged flow falls below the selected thresholds defined in Section 2.2. Since seasonal climate is inherently probabilistic, seasonal forecasts should be evaluated probabilistically [47]. Assuming that the 25 ensemble members of the GloFAS-Seasonal reforecasts are equally likely, the forecast probability is calculated as the proportion of ensemble members below each percentile threshold. In addition, the percentile thresholds are calculated separately for the historical observed data and the reforecast data [48]. This approach accounts for the systematic additive error (bias) of the reforecast data; hence, no further bias adjustment of the reforecast data is required [48,49].
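The conversion from ensemble members to a forecast probability can be sketched as follows. The member values and the two thresholds are hypothetical; the example only illustrates why a threshold taken from the reforecast climatology, rather than from the observations, is applied to the ensemble.

```python
def event_probability(members, threshold):
    """Share of equally likely ensemble members below the threshold."""
    if not members:
        raise ValueError("members must not be empty")
    return sum(1 for flow in members if flow < threshold) / len(members)


# Hypothetical 25-member reforecast of September mean flow (m^3/s).
members = [10000.0 + 500.0 * i for i in range(25)]

# Hypothetical 20-percentile thresholds, each from its own climatology:
# the reforecasts are assumed to be biased high relative to observations.
obs_threshold = 11000.0
reforecast_threshold = 13000.0

p_dry = event_probability(members, reforecast_threshold)
p_dry_unadjusted = event_probability(members, obs_threshold)
print(p_dry, p_dry_unadjusted)
```

Applying the observed threshold directly to a high-biased ensemble would understate the probability of the dry condition, which is the effect the separate thresholds are meant to avoid.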

The conversion of raw ensemble members to forecast probabilities enables us to validate GloFAS-Seasonal reforecasts by using probabilistic forecast verification measures. Here, we employ multiple metrics for our evaluation: (I) discrimination, the ability of the forecast to discriminate among observations; (II) skill, the relative accuracy of the forecast over a reference forecast; (III) reliability, the agreement between forecast probability and mean observed frequency; (IV) resolution, the ability of the forecast to resolve the set of sample events into subsets; and (V) sharpness, the tendency to forecast probabilities near 0 or 1. These metrics are briefly discussed below. Interested readers can refer to https://www.cawcr.gov.au/projects/verification/ for further details.

#### *3.1. Discrimination*

To assess the potential application of GloFAS-Seasonal forecasts for predicting the streamflow condition, the relative operating characteristic (ROC) curve, a measure of discrimination [50], is calculated for the selected thresholds. If the forecasts indicate that flow will be below the threshold, which means a dry and unfavorable condition for reservoir impoundment operation, a warning is issued. The forecasts are thus converted into a binary ("yes" or "no") format depending on whether a warning has been issued. The ROC curve is then plotted from the hit rate (HR) and false-alarm rate (FAR) of the forecast for the streamflow condition, which are calculated by Equations (12a) and (12b):

$$\text{HR} = \frac{h}{h+m} \tag{12a}$$

$$\text{FAR} = \frac{f}{f+r} \tag{12b}$$

where *h* is the number of correct warnings (hits), *m* the number of missed events, *f* the number of false warnings, and *r* the number of correct rejections (no warning issued and no event observed).

The area under the ROC curve (referred to as AUC) is then calculated to measure whether the forecast is informative for decision-making. The ROC curve alone often does not give a clear indication of forecast accuracy, so it is more intuitive to use the AUC, a single numerical value, as the evaluation standard. The larger the AUC, the more skillful the forecast. The AUC ranges from 0 to 1. An AUC of 0.5 indicates that the forecasts are no better than a random guess and provide no information. Generally, when the AUC is greater than 0.6, the seasonal forecast can be regarded as useful [25,35].
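A compact sketch of Equations (12a) and (12b) and of the AUC calculation is given below. It assumes that a warning is issued whenever the forecast probability reaches a given level and that the ROC curve is built from a fixed set of such levels and integrated with the trapezoidal rule; these choices, the function names, and the sample data are illustrative assumptions.

```python
def hit_and_false_alarm_rates(probabilities, observed, p_warn):
    """Equations (12a) and (12b) for warnings issued when p >= p_warn."""
    h = m = f = r = 0
    for p, event in zip(probabilities, observed):
        warned = p >= p_warn
        if event and warned:
            h += 1          # hit
        elif event:
            m += 1          # missed event
        elif warned:
            f += 1          # false warning
        else:
            r += 1          # correct rejection
    hr = h / (h + m) if h + m else 0.0
    far = f / (f + r) if f + r else 0.0
    return hr, far


def roc_auc(probabilities, observed, levels=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Area under the ROC curve (FAR on x, HR on y) by the trapezoidal rule."""
    points = {(0.0, 0.0), (1.0, 1.0)}
    for level in levels:
        hr, far = hit_and_false_alarm_rates(probabilities, observed, level)
        points.add((far, hr))
    curve = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical forecast probabilities and observed dry (True) years.
probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
dry = [True, True, False, True, False, False]
print(hit_and_false_alarm_rates(probs, dry, 0.4), roc_auc(probs, dry))
```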

#### *3.2. Skill*

Skill describes the accuracy of a forecast relative to a reference forecast. The reference forecast is generally an unskilled forecast such as random chance, persistence, or climatology. To assess the skill of GloFAS-Seasonal reforecasts, we compare the reforecasts with climatology [51], i.e., an ensemble of observed flows, and use the ROC skill score (ROCSS), which has been used in previous studies for the verification of seasonal forecasts [52]. The ROCSS is computed as follows:

$$\text{ROCSS} = \frac{AUC_{fc} - AUC_{cm}}{1 - AUC_{cm}} \tag{13}$$

where *AUCfc* refers to the AUC value of reforecasts and *AUCcm* refers to the AUC value of climatological forecasts. ROCSS of one means a perfect forecasting system; ROCSS of zero indicates no improvement over the climatology.
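Equation (13) can be evaluated with a one-line function, sketched below. The default climatological AUC of 0.5 is an assumption made for illustration (a forecast without discrimination), under which the score reduces to 2 × AUC − 1.

```python
def rocss(auc_forecast, auc_climatology=0.5):
    """Equation (13): ROC skill score relative to a climatological forecast."""
    if auc_climatology >= 1.0:
        raise ValueError("auc_climatology must be below 1")
    return (auc_forecast - auc_climatology) / (1.0 - auc_climatology)


for auc in (0.5, 0.7, 0.9, 1.0):
    print(auc, rocss(auc))
```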

#### *3.3. Reliability, Resolution, and Sharpness*

The reliability of the forecasts is assessed with the reliability diagram, in which the X and Y axes represent the forecast probability and the observed frequency of streamflow falling below the threshold, respectively. When the forecast probability and the observed frequency are equal, the forecasts are perfectly reliable. For example, if an event is forecast with a probability of 70%, then, on average, the event should occur on 70% of the occasions on which this forecast is made. Reliability is therefore indicated by the proximity of the plotted curve to the diagonal. A curve below the diagonal indicates over-estimation (forecast probabilities are too high), whereas a curve above the diagonal indicates under-estimation (forecast probabilities are too low).

A forecast of the climatological average can be highly reliable, but it carries little information for practice. Ideally, we are interested in probabilistic forecast systems whose forecast probabilities deviate from the climatological average and approach 0% or 100% while maintaining a high level of reliability [35]. The reliability diagram can therefore also be used to assess the resolution of forecasts. Forecasts that discriminate between events and non-events are said to have resolution, whereas a forecast of the climatological average, whose curve lies on or near the horizontal line, has no resolution. To assess the sharpness of forecasts, the reliability diagram is usually accompanied by a histogram of forecast frequency. If the histogram is U-shaped, the forecast probabilities cluster near 0% and 100% and the forecast system is sharp. Forecasts with little or no sharpness show a peak in the forecast frequency near the climatological average.
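The quantities plotted in a reliability diagram and its accompanying histogram can be sketched as follows. The equal-width binning, the function name, and the sample data are assumptions for illustration; the per-bin counts correspond to the sharpness histogram.

```python
def reliability_table(probabilities, observed, n_bins=5):
    """Return (mean forecast probability, observed frequency, count) per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, event in zip(probabilities, observed):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 joins the last bin
        bins[index].append((p, 1.0 if event else 0.0))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(o for _, o in members) / len(members)
            table.append((mean_p, freq, len(members)))
        else:
            table.append((None, None, 0))  # empty bin: nothing to plot
    return table


# Hypothetical forecast probabilities and observed dry (True) years.
probs = [0.1, 0.1, 0.3, 0.5, 0.7, 0.9, 0.9]
dry = [False, False, False, True, True, True, True]
table = reliability_table(probs, dry)
for mean_p, freq, count in table:
    print(mean_p, freq, count)
```

Bins with very few samples, such as those with a count of one in this example, yield observed frequencies of exactly 0 or 1, which is why sparse bins should be interpreted with caution.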

#### **4. Results and Discussion**

#### *4.1. The Selected Thresholds*

As the first step of our evaluation, we select thresholds to evaluate the streamflow condition for impoundment operation. The Changjiang (Yangtze River) Water Resources Commission (CWRC) provides daily inflow and discharge series for the five selected reservoirs and streamflow for adjacent gauges at hydrological stations, covering the whole impoundment operation period from 1 August to 31 October (92 days) for 1950–2015 (66 years). We use the 20-, 30-, 40-, and 50-percentiles of historical inflow as thresholds to determine the inflow condition in September according to the monthly-averaged inflow (*QWDD* and *QTGR*). For example, the 20-percentile of the historical average inflow in September divides the data into two groups: one group (above the 20-percentile in Figure 2a) includes 53 years of data and the other (below the 20-percentile in Figure 2a) includes 13 years. Each threshold therefore yields two scenarios, one above and one below the threshold. In this way, we obtain 16 different flow scenario groups (two inflow series, *QWDD* and *QTGR*, × four quantiles × two groups for each quantile).
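The grouping described above can be reproduced schematically. The sketch below enumerates the 16 flow scenario groups and splits a 66-year series at the 20-percentile; the rank-based split with a floor rule and the synthetic inflow values are assumptions for illustration, chosen only to show how a 13/53 partition arises.

```python
from itertools import product


def split_by_quantile(inflow_by_year, q):
    """Split years into (below, above) groups by rank; floor rule assumed."""
    ranked = sorted(inflow_by_year, key=inflow_by_year.get)
    n_below = int(len(ranked) * q / 100)
    return ranked[:n_below], ranked[n_below:]


# Two inflow series x four quantiles x two groups per quantile.
scenarios = list(product(("QWDD", "QTGR"), (20, 30, 40, 50), ("below", "above")))

# Synthetic, distinct September inflows (m^3/s) for 1950-2015.
inflow_by_year = {1950 + i: 8000.0 + 137.0 * ((i * 29) % 66) for i in range(66)}
below, above = split_by_quantile(inflow_by_year, 20)
print(len(scenarios), len(below), len(above))
```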

These scenarios are evaluated independently under both EIOR and SIOR by the PSO approach. Since the population size of the NSGA-II algorithm is set to 50, the algorithm provides 50 Pareto-optimal (non-dominated) solutions. Because there are three objective functions for each scenario, and considering the multi-purpose nature of these reservoirs, no single solution can be reported as the best answer among these 50 Pareto-optimal solutions. Therefore, we average the objective function values of the 50 Pareto-optimal solutions to represent the potential benefit and risk of each combination of historical flow group and impoundment rule. For the 16 flow groups and two operation rules, the averages of the three objective functions over the 50 Pareto-optimal solutions are shown in Table 4. Comparing these 16 scenarios, we can see that the *HG* and *FSR* values improve with larger thresholds. This improvement is due to the increase in streamflow and reservoir storage in September from the lowest group (below the 20-percentile) to the highest group (above the 50-percentile). The results clearly show that low flow in September has an adverse impact on impoundment operation.


**Table 4.** Benefit and risk results of the cascade reservoir system in response to the combination of historical flow group and impoundment rule.

Note: EIOR: early impoundment operation rules; SIOR: standard impoundment operation rules; *HG:* hydropower generation; *FSR*: fullness storage rate; *R*: flood control risk.

Comparing EIOR with SIOR for the cascade reservoirs, the EIOR improves *HG* and *FSR* for the flow groups from below the 20-percentile to below the 40-percentile for both *QWDD* and *QTGR* without affecting the risk of flooding. We use these results to select the most suitable thresholds among the 16 scenarios for our analysis. Figure 3 shows the relationship between the increased benefit ratio and the different flow groups of *QWDD* and *QTGR*. According to Figure 3, the gain in *HG* is less affected by the selected threshold. In contrast, the gain in *FSR* decreases as the threshold, and hence the inflow, increases. For the groups below the 20-percentile and below the 30-percentile, the *FSR* of the proposed EIOR increases by around or above 3% in comparison to the SIOR, without increasing the risk of flooding. Hence, we select the 20-percentile and 30-percentile as the thresholds for our study, as their performance is superior to the others. We then use these thresholds to evaluate how well GloFAS-Seasonal, issued in early August, predicts the streamflow condition of *QWDD* and *QTGR* in the following month.

**Figure 3.** The relationship between increased benefit ratio of EIOR and different flow groups of (**a**) *QWDD* and (**b**) *QTGR*.

#### *4.2. Evaluation of GloFAS-Seasonal Reforecasts*

GloFAS-Seasonal reforecasts are evaluated against adjusted historical river flow data at the Hua-Tan and Yi-Chang hydrological stations in the Yangtze River. GloFAS-Seasonal reforecasts represent natural flow and do not consider any reservoir routing. The CWRC provides monthly-averaged historical flow records that have been adjusted to represent the natural flow. These adjusted natural streamflow time series span 33 years (1981–2013), so the GloFAS-Seasonal reforecasts are evaluated over the same 33-year period. Since the impoundment operation starts before September, we investigate the GloFAS-Seasonal reforecasts issued on 1 August (2-month lead, LM2) to evaluate the potential for employing EIOR. We also investigate the reforecasts issued on 1 September (1-month lead, LM1) to evaluate the performance of GloFAS-Seasonal for different lead times.

#### 4.2.1. AUC Values

To compare AUC values across stations, lead times, and thresholds, we employ Nightingale rose charts, which are well suited to visualizing differences among categorical data. The results are shown in Figure 4. All AUC values are greater than 0.6, which means that the forecasts can be regarded as informative and are able to predict the streamflow condition (whether streamflow is below the threshold or not). In addition, the AUC values decline from LM1 (around 0.9) to LM2 (below 0.8), as expected.

AUC values vary more with lead time than with station or threshold, so the discrimination of GloFAS-Seasonal reforecasts is relatively stable over space in the upper Yangtze River. Nevertheless, the best-performing threshold differs between the hydrological stations: the 20-percentile performs best at Hua-Tan, whereas the 30-percentile performs best at Yi-Chang. This emphasizes that a spatial evaluation of thresholds is necessary to find the best thresholds for employing GloFAS-Seasonal forecasts in the Yangtze River basin.

**Figure 4.** The area under the ROC curve (AUC) values for GloFAS-Seasonal reforecast at Hua-Tan (left) and Yi-Chang (right) hydrologic stations.

#### 4.2.2. ROCSS Values

Tercile plots are designed to show the performance of a forecast system at different periods [53]. Here, we employ these plots to compare reforecast probabilities (color coded from light to dark for lower to higher probability) for different threshold events with the observed condition (white dots). We define three categories of threshold events for our comparison. Since low-flow conditions lead to employing the EIOR, we only use the 0–20%, 20–40%, and 40–60% quantile ranges of the streamflow data to evaluate the performance of GloFAS-Seasonal reforecasts in predicting the correct flow condition. However, the evaluation can be done for other flow ranges based on the selected thresholds. The ROCSS value for each quantile range is shown on the right axis for comparison, and ROCSS values that are significant at the 95% confidence level are marked with an asterisk.

Results show that, for GloFAS-Seasonal reforecasts of flow below the 20-percentile, the ROCSS declines from LM1 (0.8 and 0.76) to LM2 (0.46 and 0.42) for both the Hua-Tan and Yi-Chang hydrological stations. However, skill (ROCSS greater than 0) still prevails at LM2 and is marked with asterisks, which means that the LM2 forecasts are better than climatology. Forecast sharpness is also evident in the tercile plots: the darker the color of a square, the sharper the probabilistic forecast. Forecasts for both LM1 and LM2 exhibit sharpness, although the sharpness is higher for LM1, as indicated by the colors of the squares in Figure 5.

According to Figure 5, the streamflow at the Hua-Tan hydrological station is below the 20-percentile in seven years, and the reforecasts assign the highest probability to this category in five of these years at LM1 and three at LM2. For the Yi-Chang hydrological station, streamflow is also below the 20-percentile in seven years, of which the GloFAS-Seasonal reforecasts at LM1 and LM2 assign the highest probability to this category in five and three years, respectively. Consistent with the AUC results, LM1 performs better because of its shorter lead time, but the reforecasts with a 2-month lead time (LM2) are still informative for reservoir impoundment operation. Moreover, compared with LM1, LM2 has considerable room for improvement, which depends on advances in seasonal climate prediction. A similar analysis can be performed for the 30-percentile threshold with other ranges.

**Figure 5.** Tercile plots and ROC skill score (ROCSS) for GloFAS-Seasonal reforecasts (**a**) 2-month lead at the Hua-Tan, (**b**) 1-month lead at the Hua-Tan, (**c**) 1-month lead at the Yi-Chang, and (**d**) 2-month lead at the Yi-Chang hydrologic station.

#### 4.2.3. Reliability Diagram

As with the ROC calculations, reliability is assessed for both the 20-percentile and 30-percentile thresholds. Because of the limited number of samples, the range of forecast probabilities is divided into five bins (one for every 20% from 0% to 100%) rather than ten, to avoid sparsely populated probability categories. Since GloFAS-Seasonal reforecasts perform similarly at the two hydrological stations, reliability diagrams are presented only for the Yi-Chang station. Figure 6 shows the effect of (a) the lead time (LM1 and LM2) and (b) the threshold (20-percentile and 30-percentile) on reliability, obtained by combining the contingency tables over thresholds and over lead times, respectively.

**Figure 6.** Reliability diagrams of GloFAS-Seasonal reforecasts for comparing (**a**) two lead times and (**b**) thresholds.

Figure 6a shows that the forecasts are more reliable than climatology, regardless of the lead time. It is worth noting that the observed frequency equals 1 for the 60–80% bin at LM1, which is unrealistic and reflects sampling limitations rather than a true deviation from reliability [54]. Overall, the reliability appears to be slightly better for LM2 than for LM1. The forecasts at both LM2 and LM1 exhibit sharpness, which means that the forecast probabilities are more informative than climatology. Similar behavior is observed for the 60–80% bin in Figure 6b because of the limited number of samples. Figure 6b shows that reliability is better for the 30-percentile threshold than for the 20-percentile threshold, whereas sharpness is better for the 20-percentile threshold. These differences in reliability and sharpness can be explained by the limited number of samples, so the performance of the two selected thresholds is close and hard to distinguish.

Because most points lie below the diagonal, Figure 6 suggests that GloFAS-Seasonal reforecasts generally tend to over-estimate the likelihood of a below-threshold streamflow condition, which is common in seasonal forecasting [55]. This conclusion is consistent with the reliability diagram of GloFAS-Seasonal reforecasts aggregated across all observation stations globally [35] and reflects the characteristics of the GloFAS-Seasonal forecasting system. With respect to reservoir impoundment, however, over-estimating the below-threshold condition is more favorable than under-estimating it. Reservoir operators could employ GloFAS-Seasonal forecasts to decide on early impoundment operation, while controlling the risk of flooding through short-term hydrological forecasting in real-time operation.

#### *4.3. Specific Analysis and Benefits of the EIOR*

The above results demonstrate that GloFAS-Seasonal forecasts have the potential to give water managers the flexibility to employ early impoundment in the upper Yangtze River. Here, we analyze the EIOR in more detail and quantify its benefits. As an example, Figure 7 shows the Pareto-optimal solutions of EIOR (plot a) and SIOR (plot b) for *QTGR* below the 20-percentile threshold. These Pareto-optimal solutions are averaged to derive the corresponding entries of Table 4. Because we employ three objective functions, three subplots would be needed to show the objectives in pairs. However, the flood control risk (*R*) of almost all Pareto-optimal solutions is equal to zero, so we only show two objective functions (*FSR* and *HG*) in Figure 7. Each of the 50 Pareto-optimal solutions obtained from the NSGA-II algorithm represents a set of impoundment rule curves, one for each of the five cascade reservoirs. Figure 7 also shows the EIOR and SIOR rule curves of the WDD and TGR reservoirs for the extreme solution of the *FSR* objective function. The figure shows that the average water level of EIOR is higher than that of SIOR for the selected 20-percentile threshold, while the risk of flooding is zero.

To illustrate the potential maximum benefits of EIOR, Figure 7 also shows a linear operation rule (LOR), which connects the initial water level to the top of STBP at the end of the impoundment period. As a benchmark, the LOR helps to present the maximum benefit of the EIOR, which comes from two parts: the earlier initial impoundment time and the optimized rule curve. For the EIOR and LOR curves in Figure 7, Table 5 shows the benefit and risk of EIOR compared to the LOR. The proposed EIOR improves the *FSR* by 5.63% and increases *HG* by 4.02%. In conclusion, during dry years, our proposed methodology could significantly increase hydropower generation and water utilization by employing GloFAS-Seasonal forecasts and early reservoir impoundment.
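As a minimal sketch of the benchmark, a linear operation rule can be written as a straight-line interpolation between the water level at the start of impoundment and the target level at the end of the period. The function name `linear_operation_rule` and the water levels below are hypothetical placeholders and are not taken from Table 1 or Table 5.

```python
import numpy as np

def linear_operation_rule(z_start, z_end, n_steps):
    """Water level at each time step, rising linearly from z_start to z_end."""
    return np.linspace(z_start, z_end, n_steps)

# Hypothetical levels (m) over four time steps of the impoundment period.
lor = linear_operation_rule(100.0, 130.0, 4)
print(lor)  # [100. 110. 120. 130.]
```

An optimized rule curve sharing the same start and end levels can then be compared with this straight line at each time step.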

**Figure 7.** Optimized EIOR and SIOR rule curves of WDD and TGR reservoirs focusing on fullness storage rate (*FSR*) for flow scenario group (below 20%) of *QTGR*.


**Table 5.** Comparison of the benefit and risk of the EIOR and linear operation rule (LOR) curves for *QTGR* below the 20-percentile threshold.

Note: *QTGR*: inflow to the Three Gorges Reservoir; LOR: linear operation rule; EIOR: early impoundment operation rules; *HG*: hydropower generation; *FSR*: fullness storage rate; *R*: flood control risk.

#### **5. Conclusions**

In this study, we evaluated the potential application of GloFAS-Seasonal forecasts for early reservoir impoundment in the upper Yangtze River. A cascade reservoir impoundment simulation–optimization model was employed to select suitable low flow thresholds for deciding between EIOR and SIOR. These thresholds were selected by analyzing the historical inflow data of the WDD and TGR reservoirs, which were derived from the Hua-Tan and Yi-Chang hydrological stations. The performance of GloFAS-Seasonal reforecasts in predicting the streamflow condition at these two hydrological stations was evaluated using the AUC, ROCSS, and reliability diagrams for two different lead times (LM1 and LM2) and the selected thresholds. The main findings of our study can be summarized as follows:

(1) The low flow condition in September has a very significant impact on reservoir impoundment operation in the upper Yangtze River. The 20-percentile and 30-percentile selected thresholds of inflow at WDD and TGR are suitable for evaluating the possibility of early impoundment. These two selected thresholds can be used as a measure for flow condition and decision-making for early impoundment operation.

(2) All AUC values of the reforecasts are greater than 0.6, which shows that GloFAS-Seasonal forecasts can be used to predict the streamflow condition according to the selected thresholds. However, as expected, AUC decreases from LM1 (around 0.9) to LM2 (below 0.8). The ROCSS reveals that both LM1 and LM2 are significantly better than climatology. The reliability diagrams also show that both LM1 and LM2 forecasts have more reliability and sharpness than climatology. Furthermore, the results indicate that forecasts at both lead times tend to over-estimate the likelihood of below-threshold flow, which is more favorable for water managers.

(3) GloFAS-Seasonal forecasts with 2-month lead time (LM2) are valuable for reservoir impoundment operation. During dry years, the proposed EIOR improves the fullness storage rate by 5.63% and the annual average hydropower generation by 4.02% without increasing the risk of flooding.

This paper demonstrates that GloFAS-Seasonal forecasts have the potential to improve the standard impoundment operation rules in the upper Yangtze River and give water managers the flexibility to employ early impoundment.

**Author Contributions:** Conceptualization and Software, K.C. and S.G.; Data Curation, J.W. and P.Q.; Formal Analysis, K.C., S.H. and S.S.; Writing—Original Draft Preparation, K.C. and M.R.N.; Writing—Review and Editing, S.G.

**Funding:** This study was funded by the National Key Research and Development Project (Grant No. 2016YFC0402206) of China, the National Natural Science Foundation of China (Grant No. 51879192), the "111 Project" Fund of China (B18037), the joint U.S.-China Clean Energy Research Center for Water-Energy Technologies (CERC-WET) project (2018YFE0196000) and the U.S. Department of Energy (DOE Prime Award # DE-IA0000018).

**Acknowledgments:** The authors would like to express their gratitude to anonymous reviewers for their insightful and constructive comments.

**Conflicts of Interest:** The authors declare no conflict of interest.
