#### *2.2. Model Calibration and Evaluation*

The differential split-sample test (DSST) is a common way to evaluate hydrological models across sub-periods with contrasting climate and underlying watershed conditions [37,38]. In practice, usually only two or three contrasting sub-periods can be identified because of the limited availability of hydro-meteorological data. This may limit the potential of the DSST for assessing the transferability of model parameters and for drawing general conclusions. To overcome this limitation, the generalized split-sample test (GSST) proposed by Coron et al. [39,40] was chosen to test the temporal transferability of the parameters of the GR4J and GR4J-T models in this study. The main objective of the GSST procedure is to evaluate model performance under as many and as varied climatic and watershed conditions as possible. To this end, a sliding window of a specific length (e.g., five years) is applied to define the sub-periods (as illustrated in Figure 1). In Figure 1, the blue bars indicate the sub-periods, while the grey bars represent the remaining part of the time series. Between two adjacent sub-periods, the sliding window is moved by one year.

**Figure 1.** Illustration of the generalized split-sample test (GSST) procedure (example with 15 years available and 5-year subperiods). (Adapted from the work of Coron et al. [40]).

Once the sub-periods were created, both the GR4J and GR4J-T models were calibrated in each sub-period. The parameter sets obtained were then used to perform validation tests on independent sub-periods, i.e., the sub-periods that do not overlap with the calibration sub-period. For example, the parameters calibrated during SP1 are not validated in SP2, SP3, SP4, or SP5, as illustrated in Figure 1. For more detail about the GSST, readers can refer to the work of Coron et al. [39,40].
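The sub-period construction and the independence rule can be sketched in a few lines of Python. This is only an illustration of the procedure described above, not the code used in the study; the function names `gsst_subperiods` and `independent_receivers` are hypothetical.

```python
def gsst_subperiods(first_year, last_year, window=5):
    """Return (start, end) year pairs for a window that slides by one year."""
    n_windows = (last_year - first_year + 1) - window + 1
    return [(first_year + i, first_year + i + window - 1) for i in range(n_windows)]


def independent_receivers(subperiods, donor_index):
    """Indices of sub-periods that share no year with the donor sub-period."""
    d_start, d_end = subperiods[donor_index]
    return [
        i for i, (start, end) in enumerate(subperiods)
        if end < d_start or start > d_end
    ]


# Figure 1 example: 15 years of data and 5-year sub-periods.
subperiods = gsst_subperiods(1, 15, window=5)
receivers_of_sp1 = independent_receivers(subperiods, donor_index=0)
print(len(subperiods))                    # number of sub-periods
print([i + 1 for i in receivers_of_sp1])  # SP numbers that can validate SP1
```

With 15 years and a 5-year window this yields 11 sub-periods, and SP1 can only be validated on SP6 to SP11, which matches the exclusion of SP2 to SP5 stated in the text.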

The original GR4J model and the GR4J-T model have different numbers of calibrated parameters. For the original GR4J model, the number of calibrated parameters is 4, i.e., θ = {*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, *x*<sub>4</sub>}. In the case of the GR4J-T model, when the parameter *x*<sub>*i*</sub> (*i* = 1, 2, 3, 4) is treated as time-varying, its constant value *x*<sub>*i*,*c*</sub> and the corresponding regression parameters λ<sub>*i*,*j*</sub> (*j* = 1, ..., *M*) are calibrated together with the other time-invariant parameters. For example, when only the parameter *x*<sub>1</sub> is regarded as time-varying, the parameters that require calibration are θ = {*x*<sub>1,*c*</sub>, λ<sub>1,1</sub>, ..., λ<sub>1,*M*</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, *x*<sub>4</sub>}. Similarly, when multiple parameters are assumed to be time-varying, e.g., *x*<sub>1</sub> and *x*<sub>2</sub>, the calibrated parameters become θ = {*x*<sub>1,*c*</sub>, λ<sub>1,1</sub>, ..., λ<sub>1,*M*</sub>, *x*<sub>2,*c*</sub>, λ<sub>2,1</sub>, ..., λ<sub>2,*M*</sub>, *x*<sub>3</sub>, *x*<sub>4</sub>}.
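As a bookkeeping aid, the composition of θ can be enumerated programmatically. The sketch below is illustrative only; the helper `calibrated_parameters` and the naming scheme of its output strings are assumptions made here, not part of the original model code.

```python
def calibrated_parameters(time_varying, n_covariates):
    """List the calibrated parameter names for GR4J or GR4J-T.

    time_varying : set of parameter indices (1-4) treated as time-varying;
                   an empty set corresponds to the original GR4J model
    n_covariates : M, the number of regression parameters per time-varying parameter
    """
    names = []
    for i in range(1, 5):
        if i in time_varying:
            names.append(f"x{i}_c")
            names.extend(f"lambda_{i}_{j}" for j in range(1, n_covariates + 1))
        else:
            names.append(f"x{i}")
    return names


print(calibrated_parameters(set(), 0))       # original GR4J: 4 parameters
print(calibrated_parameters({1}, 3))         # x1 time-varying with M = 3
print(len(calibrated_parameters({1, 2}, 3))) # x1 and x2 time-varying with M = 3
```

Each time-varying parameter adds *M* regression parameters, so the total is 4 + *kM* for *k* time-varying parameters.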

The widely used Kling-Gupta Efficiency (*KGE*) [41] was applied to assess the overall performance of the GR4J and GR4J-T model. The *KGE* is calculated as follows,

$$KGE = 1 - \sqrt{\left(r(Q_{\rm act}, Q_{\rm sim}) - 1\right)^2 + \left(\frac{\mu(Q_{\rm act})}{\mu(Q_{\rm sim})} - 1\right)^2 + \left(\frac{\sigma(Q_{\rm act})}{\sigma(Q_{\rm sim})} - 1\right)^2} \tag{3}$$

where *Q*<sub>act</sub> and *Q*<sub>sim</sub> are the observed and model-simulated streamflow series, respectively, *r*(*Q*<sub>act</sub>, *Q*<sub>sim</sub>) is Pearson's correlation coefficient between the observed and simulated streamflow series, μ(*Q*<sub>act</sub>) and μ(*Q*<sub>sim</sub>) are the mean values of the observed and simulated streamflow series, respectively, and σ(*Q*<sub>act</sub>) and σ(*Q*<sub>sim</sub>) are the standard deviations of the observed and simulated streamflow series, respectively. The *KGE* value ranges from −∞ to 1, with a value closer to 1 indicating a better simulation performance.
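A minimal Python sketch of Equation (3) is given below for readers who wish to reproduce the metric. It follows the equation as written here, with the mean and standard-deviation ratios taken as observed over simulated; the function names are hypothetical, and the population standard deviation is assumed (the choice cancels in the ratio).

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def kge(q_act, q_sim):
    """Kling-Gupta Efficiency following Equation (3)."""
    r = pearson_r(q_act, q_sim)
    mean_ratio = mean(q_act) / mean(q_sim)
    std_ratio = std(q_act) / std(q_sim)
    return 1.0 - math.sqrt((r - 1) ** 2 + (mean_ratio - 1) ** 2 + (std_ratio - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
print(kge(observed, observed))                   # perfect simulation
print(kge(observed, [2 * q for q in observed]))  # correct timing, doubled volume
```

A simulation that doubles every observed value keeps *r* = 1 but halves both ratios, so the *KGE* drops well below 1 even though the timing is perfect.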

The bias between the observed and simulated streamflow (*BIAS*) was also applied, which is calculated as follows,

$$BIAS = \frac{\sum_{t=1}^{T} Q_{\rm sim}(t)}{\sum_{t=1}^{T} Q_{\rm act}(t)} - 1 \tag{4}$$

where *Q*<sub>act</sub>(*t*) and *Q*<sub>sim</sub>(*t*) are the observed and model-simulated streamflow at time *t*, respectively. A *BIAS* value of 0 indicates no bias, a value above 0 indicates an overestimation of the total streamflow volume, and a value below 0 indicates an underestimation.
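Equation (4) translates directly into code. The sketch below uses made-up numbers purely for illustration; the function name `bias` is hypothetical.

```python
def bias(q_act, q_sim):
    """Relative volume bias following Equation (4)."""
    return sum(q_sim) / sum(q_act) - 1.0


observed = [10.0, 20.0, 30.0, 40.0]   # illustrative values only
simulated = [9.0, 18.0, 27.0, 36.0]   # every value 10% too low
print(f"{bias(observed, simulated):.1%}")
```

Because only the totals enter the formula, *BIAS* measures the water balance error and is insensitive to timing errors that cancel out over the period.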

Both the GR4J and GR4J-T models were calibrated using the SCEM (Shuffled Complex Evolution Metropolis) optimization algorithm [2,42], which has been widely used for the practical assessment of parameter uncertainty in hydrological modeling. The SCEM algorithm, partially inspired by the SCE-UA (Shuffled Complex Evolution - University of Arizona) algorithm [1], merges the strengths of Metropolis-Hastings sampling [43,44], controlled random search, competitive evolution, and complex shuffling to evolve a population of sampled points toward an approximation of the posterior distribution of the parameters. In addition, the SCEM method identifies both the most likely parameter set and its underlying posterior probability distribution in a single run [45]. The likelihood of a parameter set was calculated as the corresponding *KGE* value. Because the likelihood value must be nonnegative and must increase monotonically with improved performance, any parameter set that led to a *KGE* value below 0 was rejected during the evolution procedure within SCEM. The convergence of the algorithm was determined using the Gelman-Rubin statistic [46], which is calculated on the posterior densities to check whether convergence to a stationary target distribution has been reached. For more detail about the SCEM optimization algorithm, the reader can refer to the work of Vrugt et al. [2].
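To make the convergence check concrete, the sketch below computes the basic univariate form of the Gelman-Rubin statistic from several parallel chains, together with the acceptance rule for the *KGE*-based likelihood stated above. It is an illustration under assumptions: the exact variant of the statistic implemented in SCEM is not given in this section, and the function names and synthetic chains are hypothetical.

```python
import math
import random


def gelman_rubin(chains):
    """Basic Gelman-Rubin scale reduction factor for a single parameter.

    chains : list of equally long lists, one list of samples per parallel chain
    """
    m = len(chains)
    n = len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance B and mean within-chain variance W
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = sum(
        sum((v - cm) ** 2 for v in c) / (n - 1)
        for c, cm in zip(chains, chain_means)
    ) / m
    pooled = (n - 1) / n * w + b / n
    return math.sqrt(pooled / w)


def accept_as_likelihood(kge_value):
    """Rule stated in the text: parameter sets with KGE below 0 are rejected."""
    return kge_value >= 0.0


rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(500)] for shift in (0.0, 3.0, 6.0, 9.0)]
print(round(gelman_rubin(mixed), 3))  # chains sampling the same distribution
print(round(gelman_rubin(stuck), 3))  # chains stuck in different regions
```

Values close to 1 indicate that the chains are sampling the same distribution, whereas values well above 1 indicate that the chains have not yet mixed.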

After the SCEM optimization algorithm converged, the final posterior distribution of the parameters was obtained. In total, 5000 parameter sets were sampled from the posterior distribution to account for parameter uncertainty in this study. It should be noted that the 5000 parameter sets can lead to very similar streamflow simulation performance in terms of the *KGE* criterion. The one corresponding to the highest *KGE* value (i.e., the *KGE*-best) was chosen to represent the estimate of the global optimum and was used for model evaluation and selection.

#### *2.3. Temporal Transferability Test of Model Parameters*

When a parameter set θ calibrated on the sub-period *D* (denoted as "donor period" hereafter) is transferred to the sub-period *R* (i.e., the validation period, denoted as "receiver period"), the Kling-Gupta Efficiency (*KGE*) can be rewritten as,

$$KGE_{D \to R} = 1 - \sqrt{\left(r(Q_{{\rm act},R}, Q_{{\rm sim},R}[\theta_D]) - 1\right)^2 + \left(\frac{\mu(Q_{{\rm act},R})}{\mu(Q_{{\rm sim},R}[\theta_D])} - 1\right)^2 + \left(\frac{\sigma(Q_{{\rm act},R})}{\sigma(Q_{{\rm sim},R}[\theta_D])} - 1\right)^2} \tag{5}$$

where *Q*<sub>act,*R*</sub> is the observed streamflow series of sub-period *R*, and *Q*<sub>sim,*R*</sub>[θ<sub>*D*</sub>] is the simulated streamflow series of sub-period *R* using the parameter set obtained from sub-period *D*. In addition, *KGE*<sub>*R*→*R*</sub> is defined to assess the streamflow simulation performance during sub-period *R* using parameter sets obtained from sub-period *R* itself.

Note that a temporal transferability test based on a single parameter set could be less persuasive. To fully evaluate the temporal parameter transferability of both the GR4J and GR4J-T models, the 5000 parameter sets from each sub-period were validated on the other independent sub-periods, and the average performance of the two models was compared. To this end, a parameter transferability criterion (*PTC*) is defined as:

$$PTC = \frac{\sum_{n=1}^{N} KGE_{D \to R}^{t}(n) - \sum_{n=1}^{N} KGE_{D \to R}^{c}(n)}{N} \tag{6}$$

where *KGE*<sup>*c*</sup><sub>*D*→*R*</sub>(*n*) and *KGE*<sup>*t*</sup><sub>*D*→*R*</sub>(*n*) represent the *KGE*<sub>*D*→*R*</sub> value of the GR4J model and the GR4J-T model, respectively, using the *n*th parameter set. Here, *N* takes the value of 5000. Note that a *PTC* value above 0 indicates a better parameter transferability for the GR4J-T model than for the original GR4J model when transferred from sub-period *D* to sub-period *R*.
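Equation (6) amounts to the difference between the mean *KGE* values of the two models for one donor-receiver pair. A minimal sketch, with a hypothetical function name and made-up *KGE* values in place of the 5000 values used in the study:

```python
def ptc(kge_t, kge_c):
    """Equation (6): mean KGE of GR4J-T minus mean KGE of GR4J for one D -> R pair.

    kge_t : KGE values of the GR4J-T model, one per parameter set
    kge_c : KGE values of the original GR4J model, one per parameter set
    """
    if len(kge_t) != len(kge_c):
        raise ValueError("both models need the same number of parameter sets")
    n = len(kge_t)
    return (sum(kge_t) - sum(kge_c)) / n


# Illustrative values only (N = 3 instead of 5000).
print(round(ptc([0.70, 0.72, 0.68], [0.65, 0.66, 0.64]), 3))
```

A positive result means that, on average over the sampled parameter sets, the time-varying model loses less performance in the receiver period.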

#### **3. Study Data and Area**

Weihe Basin and Tuojiang Basin in Western China were chosen as the study areas. Weihe Basin is located between the coordinates 33°40′–37°26′ N and 103°57′–110°27′ E, with a drainage area of 148,000 km<sup>2</sup> above Huaxian hydrological station, as shown in Figure 2. The elevation within Weihe Basin ranges from 3671 m in the western upstream region to 318 m in the eastern downstream region. Dominated by the semi-arid continental monsoon climate, most precipitation and flood events in Weihe Basin occur in late summer and early autumn. In Weihe Basin, climate variability is significant, and human activities have been shown to be the main cause of the alteration of flow regimes. Previous studies have shown that the annual streamflow at the Huaxian gauge has been declining over the past decades [47–49], mainly due to increasing human activities, including agricultural irrigation, the construction of large water control projects, and the implementation of water-soil conservation projects [50,51]. In addition, variations in annual precipitation have also contributed to the reduced annual streamflow in Weihe Basin [52].

**Figure 2.** Location of Weihe Basin and Tuojiang Basin in China and the meteorological and hydrological gauges.

Tuojiang Basin (28°88′–30°29′ N, 105°44′–108°20′ E) is drained by an upstream tributary of the Yangtze River with a total length of 712 km. Tuojiang Basin covers 19,613 km<sup>2</sup>, with Fushun hydrological station as the catchment outlet. The elevation within Tuojiang Basin ranges from 264 to 4741 m and decreases from northwest to southeast. Tuojiang Basin is dominated by the subtropical monsoon climate, with most precipitation and flood events occurring in summer and early autumn. The annual streamflow of Tuojiang Basin has also shown a downtrend. However, few recent studies have focused on Tuojiang Basin, and the reason for the decline in annual streamflow remains unclear.

The input data for the GR4J model include precipitation (*P*) and potential evapotranspiration (*PET*). The precipitation and air temperature data were provided by the National Meteorological Information Center of China [53] on a daily basis for the period from 1981 to 2010, and their quality has been widely validated in previous studies [47–49]. Because of the limited availability of the meteorological dataset, the potential evapotranspiration (*PET*) was calculated using the Blaney-Criddle method [54] from the daily average air temperature and the daily sunshine duration. To calibrate and validate the GR4J model, the observed streamflow (*Q*) series from the Huaxian and Fushun gauges were obtained for the period 1981–2010. The *NDVI* dataset for the period 1981–2010 was obtained from the GIMMS (Global Inventory Modelling and Mapping Studies) *NDVI*–3g product [55], which has been widely validated [55,56]. Since its temporal resolution is 15 days, the monthly *NDVI* is represented by the mean of the two *NDVI* values in each month.

#### **4. Results and Discussion**

#### *4.1. Diagnostics of Hydrological Nonstationarity*

Because the available hydro-meteorological dataset for Weihe Basin and Tuojiang Basin covers the period 1981–2010, a total of 26 sub-periods, each 5 years long, were set up following the method described in Figure 1. The 5-year sub-period length was selected because (1) 5 years is sufficient for continuous hydrological simulation, and (2) a large number of sub-periods (26 in this study) can be set up even though only 30 years of records were available. To identify the long-term variation of the hydro-meteorological data series of both basins, the non-parametric Mann-Kendall method [57,58] was applied to examine the trends in the annual *P*, *PET*, *Q*, and annual runoff ratio (*RR*) series of the period 1981–2010, and the results are shown in Table 3. The results show a significant downtrend for the annual *Q* and *RR* series, a slight downtrend for the annual *P* series, and an uptrend for the *PET* series in Weihe Basin. In the case of Tuojiang Basin, a slight downtrend for the annual *P* series and significant downtrends for the annual *PET*, *Q*, and *RR* series are found. Both basins therefore show clear nonstationarity in the rainfall-runoff relationship. In terms of the annual *NDVI* series, Weihe Basin shows a clear uptrend, while Tuojiang Basin shows no clear change. This may be attributed to the fact that Weihe Basin has experienced continuous construction of water-soil conservation projects (e.g., afforestation) since the early 1980s [49].
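For reference, the standardized Mann-Kendall statistic can be computed as in the sketch below. This is the textbook form without a correction for tied values, which is an assumption; the section does not state whether a tie correction was applied. The function name and the synthetic series are hypothetical.

```python
import math


def mann_kendall_z(series):
    """Standardized Mann-Kendall statistic Z (no correction for tied values)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


# Synthetic 30-value series standing in for an annual record (1981-2010).
declining = [100.0 - 1.5 * k for k in range(30)]
z = mann_kendall_z(declining)
print(round(z, 2), abs(z) > 1.960)  # compare |Z| with the critical value at alpha = 0.05
```

A trend is flagged as significant at α = 0.05 when |*Z*| exceeds 1.960, with the sign of *Z* giving the direction of the trend.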

Variability is also apparent between longer periods, as shown in Figure 3, where the relative mean *P*, *PET*, *Q*, and *RR* values for all sub-periods (5-year sliding window) are plotted for both basins. Figure 3a–d show how the 5-year hydro-meteorological conditions differ from those of the whole period (1981–2010). For each basin, a line corresponds to the series of mean values over a 5-year sliding window (sub-period). Each value is expressed relative to the mean value of the whole period (1981–2010) and plotted at the middle year of the corresponding sub-period. The result is in line with the results of the MK test. For both Weihe Basin and Tuojiang Basin, a significant downtrend in the 5-year total streamflow volume and a decrease in the runoff ratio are evident. This calls into question the parameter transferability of the GR4J model, which is examined in the following sections.
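The relative sliding-window means plotted in Figure 3 can be computed as sketched below. The function name is hypothetical and the input values are invented for illustration.

```python
def relative_window_means(annual_values, window=5):
    """Mean of each sliding window divided by the mean of the whole record."""
    overall = sum(annual_values) / len(annual_values)
    return [
        sum(annual_values[i:i + window]) / window / overall
        for i in range(len(annual_values) - window + 1)
    ]


# Illustrative annual series with a steady decline over 30 years.
annual_q = [200.0 - 3.0 * k for k in range(30)]
relative = relative_window_means(annual_q, window=5)
print(len(relative))                                   # one value per sub-period
print(round(relative[0], 3), round(relative[-1], 3))   # first and last sub-period
```

Values above 1 mark sub-periods that are wetter (or have higher flow) than the long-term mean, and values below 1 mark drier ones.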

**Table 3.** Mann-Kendall (MK) test results of the annual hydro-meteorological data series, the annual runoff ratios, and the annual *NDVI* data series for Weihe Basin and Tuojiang Basin. The critical value of the MK statistic is *Z*<sub>1−α/2</sub> = 1.960 with α = 0.05.


Notes: The symbol "\*\*" indicates a significant uptrend or downtrend.

**Figure 3.** Relative long-term hydro-meteorological variability of (**a**) precipitation (*P*), (**b**) potential evapotranspiration (*PET*), (**c**) runoff (*Q*), and (**d**) runoff ratio (*RR*) over Weihe Basin and Tuojiang Basin.

#### *4.2. Parameter Estimation for the GR4J and GR4J-T Model*

To incorporate time-varying sensitivity analysis, the Sobol' sensitivity analysis was repeated at a monthly temporal resolution rather than over the whole period (1981–2010). The monthly sensitivity indices (both first-order and total-order) of each parameter were therefore obtained with *KGE* as the metric. More detail about the time-varying sensitivity analysis can be found in the work of Herman et al. [36]. Figure 4 provides a simple and direct interpretation of the relationships between the sensitivity of the GR4J parameters and changing precipitation and streamflow conditions, which was achieved by sorting the monthly sensitivity indices (only the total-order indices are presented for brevity) for both basins along ascending gradients of monthly precipitation and streamflow.

**Figure 4.** Monthly total-order sensitivity indices of GR4J parameters for (**a**) Weihe Basin and (**b**) Tuojiang Basin. For each basin, the left panel and the right panel share the same sensitivity indices, but they are sorted by monthly total precipitation (left) and monthly streamflow (right), respectively.

For Weihe Basin, the parameter *x*<sub>1</sub>, which controls the production storage capacity, shows a strong sensitivity during both high-flow and low-flow months. In contrast, the parameters *x*<sub>2</sub>, *x*<sub>3</sub>, and *x*<sub>4</sub> are less sensitive. In the case of Tuojiang Basin, the parameters *x*<sub>1</sub> and *x*<sub>3</sub> dominate the months with low flow and low precipitation, while the parameters *x*<sub>2</sub> and *x*<sub>4</sub> show relatively weaker sensitivity. Thus, the parameter *x*<sub>1</sub> is assumed to be time-varying for Weihe Basin, and the parameters *x*<sub>1</sub> and *x*<sub>3</sub> for Tuojiang Basin.

The correlation between the monthly runoff ratios and the monthly external covariates is summarized in Figure 5 in terms of Spearman's rank correlation coefficient. Note that the monthly runoff ratios were calculated as the ratio of the monthly runoff depth (mm) to the monthly total precipitation (mm). The monthly runoff ratios have a strong correlation with *PET*<sub>1</sub> and *NDVI*<sub>0</sub> for Weihe Basin. In the case of Tuojiang Basin, the covariates with high correlation coefficients are *P*<sub>2</sub> and *NDVI*<sub>0</sub>. This correlation analysis was done to justify incorporating the abovementioned external covariates into the time-varying parameters. However, it would be arbitrary to assert that an external covariate with a weak correlation coefficient is inappropriate or useless. In this study, all 8 external covariates were therefore applied to construct the time-varying parameters for both Weihe Basin and Tuojiang Basin, as detailed in the following section.
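The correlation analysis can be sketched as follows: the monthly runoff ratio is computed first, and Spearman's coefficient is then obtained as the Pearson correlation of the ranks. The function names and the six-month sample series are hypothetical and serve only as an illustration.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def runoff_ratio(runoff_depth_mm, precipitation_mm):
    """Monthly runoff ratio: runoff depth divided by total precipitation."""
    return [q / p for q, p in zip(runoff_depth_mm, precipitation_mm)]


# Invented monthly values for illustration.
runoff = [5.0, 6.0, 12.0, 20.0, 9.0, 4.0]
precip = [40.0, 45.0, 60.0, 80.0, 50.0, 30.0]
covariate = [0.30, 0.35, 0.50, 0.62, 0.45, 0.28]  # e.g., a vegetation index
print(round(spearman_rho(runoff_ratio(runoff, precip), covariate), 3))
```

Because it operates on ranks, the coefficient captures monotonic but nonlinear relationships, which suits covariates whose effect on the runoff ratio is not expected to be linear.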

**Figure 5.** Spearman rank correlation coefficient between monthly covariates and monthly runoff ratio (*RR*) for Weihe Basin and Tuojiang Basin for the whole period (1981–2010).

To reduce the influence of the initial state, a spin-up period of 1 year was used in the parameter calibration procedure for each sub-period, and the simulated streamflow within the spin-up period was excluded from the model assessment. To fully explore the potential of the GR4J-T model in streamflow simulation, all combinations of the eight external covariates were investigated, and their corresponding performances were compared in terms of the *KGE* and *BIAS* criteria. Given the large number of model assessments (e.g., 255 covariate combinations for each sub-period in Weihe Basin, *C*<sub>8</sub><sup>1</sup> + *C*<sub>8</sub><sup>2</sup> + *C*<sub>8</sub><sup>3</sup> + *C*<sub>8</sub><sup>4</sup> + *C*<sub>8</sub><sup>5</sup> + *C*<sub>8</sub><sup>6</sup> + *C*<sub>8</sub><sup>7</sup> + *C*<sub>8</sub><sup>8</sup> = 255), only a small part of the results is presented hereafter for the sake of brevity. For each covariate combination, a total of 5000 parameter sets were obtained from the calibration procedure, and only the *KGE*-best for each covariate combination is shown here. For example, Table 4 presents the streamflow simulation performance for 8 different cases (C0–C7) during sub-period 1 (SP1) in Weihe Basin. Case C0 represents the original GR4J model, where the value of parameter *x*<sub>1</sub> is invariant over time. Under cases C1–C7, the external covariates *P*<sub>1</sub>, *PET*<sub>1</sub>, and *NDVI*<sub>0</sub> were incorporated into the time-varying parameter *x*<sub>1,*t*</sub>. The corresponding equations for the time-varying parameter *x*<sub>1,*t*</sub> are also presented in Table 4. It should be noted that three forms of link function, 'linear', 'exponential', and 'logarithmic', were used to link the external covariates with the time-varying parameter. The three link functions led to similar model performance, while the 'linear' form was always the best and simplest one. Therefore, only the 'linear' form is shown here.
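The count of 255 combinations and the 'linear' link can be illustrated with the sketch below. The covariate names are those used in this study, but the exact form of the linear link is an assumption made here for illustration (constant value plus a weighted sum of covariates), since the equations in Table 4 are not reproduced in this section; the function names and sample numbers are hypothetical.

```python
from itertools import combinations

COVARIATES = ["P1", "P2", "P3", "PET1", "PET2", "PET3", "NDVI0", "NDVI1"]


def covariate_combinations(covariates):
    """All non-empty subsets of the covariates, smallest subsets first."""
    subsets = []
    for size in range(1, len(covariates) + 1):
        subsets.extend(combinations(covariates, size))
    return subsets


def linear_time_varying(x_c, lambdas, covariate_series):
    """Assumed linear link: x_t = x_c + sum_j lambda_j * covariate_j(t)."""
    n_steps = len(covariate_series[0])
    return [
        x_c + sum(lam * series[t] for lam, series in zip(lambdas, covariate_series))
        for t in range(n_steps)
    ]


print(len(covariate_combinations(COVARIATES)))  # number of candidate models per sub-period

# Invented numbers: one covariate over four months.
print(linear_time_varying(300.0, [50.0], [[0.2, 0.4, 0.6, 0.3]]))
```

The number of non-empty subsets of eight covariates is 2<sup>8</sup> − 1 = 255, which is why only the *KGE*-best combination per sub-period is reported.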

The streamflow simulation performance improves significantly when the parameter *x*<sub>1</sub> is allowed to change over time. For example, the *KGE* value is 0.752 when only the covariate *P*<sub>1</sub> is considered, compared with 0.727 for the original GR4J model. The difference in the *BIAS* criterion (from −16.1% to −6.3%) also indicates a better water balance simulation by the GR4J model with a time-varying parameter. Furthermore, the highest *KGE* value (0.761) was achieved when *P*<sub>1</sub>, *PET*<sub>1</sub>, and *NDVI*<sub>0</sub> were all incorporated into the time-varying parameter.


**Table 4.** Comparison of streamflow simulation performance of the GR4J model (C0) and the GR4J-T model (C1–C7) in Weihe Basin during the sub-period 1 (SP1, from 1981 to 1985).

#### *4.3. Streamflow Simulation Performance of the GR4J and GR4J-T Model*

Similarly, the model assessment procedure was repeated for the remaining sub-periods (SP2–SP26). Table 5 presents the streamflow simulation performance of the GR4J model and the GR4J-T model for all 26 sub-periods in Weihe Basin. Note that the equation for the time-varying parameter *x*<sub>1,*t*</sub> corresponds to the covariate combination with the best streamflow simulation performance in terms of the *KGE* criterion during the corresponding sub-period, i.e., the *KGE*-best one. The GR4J-T model significantly outperforms the original GR4J model in terms of the *KGE* criterion and achieves a better water balance simulation in most cases. This is not unexpected, as previous studies have shown the advantage of applying time-varying parameters in streamflow simulation [27,28].


**Table 5.** Comparison of streamflow simulation performance of the GR4J model and the GR4J-T model in Weihe Basin for all sub-periods (SP1–SP26).



The case of Tuojiang Basin is more complicated since two parameters (*x*<sub>1</sub> and *x*<sub>3</sub>) are assumed to be time-varying. Therefore, four scenarios were considered for each sub-period: (1) both parameters are constant, (2) *x*<sub>1</sub> is constant and *x*<sub>3</sub> is time-varying, (3) *x*<sub>1</sub> is time-varying and *x*<sub>3</sub> is constant, and (4) both parameters are time-varying. However, only the results of the first and fourth scenarios are presented here in Table 6, because (1) the calibration procedure is not the main purpose of this study, and (2) the second and third scenarios show less significant improvement in terms of *KGE* than the fourth scenario. For Tuojiang Basin, it is clear that the GR4J-T model outperforms the original GR4J model, though the improvement in terms of *KGE* is not as significant as in the case of Weihe Basin. This may partly be attributed to the fact that the original GR4J model already achieves a satisfactory performance in most sub-periods for Tuojiang Basin.


**Table 6.** Comparison of streamflow simulation performance of the GR4J model and the GR4J-T model in Tuojiang Basin for all sub-periods (SP1–SP26).



The observed streamflow and the uncertainty bounds associated with the model parameters for both models during part of SP1 (1982–1983) are presented in Figure 6 (for Weihe Basin) and Figure 7 (for Tuojiang Basin) for illustration. The uncertainty bounds associated with the model parameters were calculated as follows [59]. For both the GR4J and GR4J-T models, the streamflow simulation for a given sub-period was repeated using the 5000 parameter sets, yielding 5000 simulated hydrographs. The likelihood value of each parameter set was then assigned to the respective simulated hydrograph. At each time step of the simulation, the posterior distribution of simulated streamflow was calculated from the 5000 simulated streamflow values and the corresponding likelihood values. The lower and upper uncertainty bounds are here defined as the 2.5% and 97.5% quantiles of the posterior distribution, respectively, which gives the 95% predictive uncertainty bounds associated with the model parameters for both models. More detail on the calculation of the predictive uncertainty bounds can be found in the work of Blasone et al. [59]. The red and grey shaded areas indicate the prediction uncertainty associated with the GR4J model and the GR4J-T model, respectively. Most of the observed streamflow series lies within the uncertainty bounds for both models, indicating a good simulation performance. In addition, the original GR4J model tends to underestimate the low flows in Weihe Basin, as presented in Figure 6b, where the simulated streamflow series is consistently lower than the observed series in November 1983. However, the average relative bandwidth of the uncertainty bounds associated with the GR4J-T model is wider, which indicates a higher parameter uncertainty. This agrees with the finding of Wallner and Haberlandt [25], who showed that time-varying model parameters can improve model performance at the cost of a possible growth in parameter uncertainty. The growth in parameter uncertainty may also relate to the choice of the external time-varying covariates, which is an interesting topic that deserves further exploration.
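The construction of the uncertainty bounds can be sketched as a likelihood-weighted quantile taken at every time step. The code below is a simplified illustration, not the implementation of Blasone et al. [59]; the function names are hypothetical, and the quantile is taken as the smallest simulated value whose cumulative normalized likelihood reaches the target level, which is one of several possible conventions.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]


def uncertainty_bounds(hydrographs, likelihoods, lower_q=0.025, upper_q=0.975):
    """Per-time-step bounds from likelihood-weighted simulated streamflow.

    hydrographs : list of simulated series, one per parameter set
    likelihoods : one likelihood value (here, KGE) per parameter set
    """
    n_steps = len(hydrographs[0])
    lower, upper = [], []
    for t in range(n_steps):
        flows = [h[t] for h in hydrographs]
        lower.append(weighted_quantile(flows, likelihoods, lower_q))
        upper.append(weighted_quantile(flows, likelihoods, upper_q))
    return lower, upper


# Toy ensemble: 100 one-step "hydrographs" with equal likelihoods.
ensemble = [[float(i)] for i in range(1, 101)]
low, high = uncertainty_bounds(ensemble, [1.0] * 100)
print(low, high)
```

The relative bandwidth discussed above then follows from the difference between the upper and lower bounds at each time step, normalized by the observed flow.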

Figure 6a presents the constant value of the parameter *x*<sub>1</sub> and its temporal dynamics when treated as time-varying for Weihe Basin during part of SP1 (1982–1983). The constant value was obtained from the *KGE*-best parameter set of the GR4J model, while the temporal dynamics of *x*<sub>1,*t*</sub> are presented as boxplots using the 5000 resultant parameter sets of the GR4J-T model. It can be seen that the time-varying parameter *x*<sub>1,*t*</sub> increases in the growing season and decreases as the dormant season begins, with its values varying around the constant value of the parameter *x*<sub>1</sub>. The same holds for the parameters *x*<sub>1</sub> and *x*<sub>3</sub> in Tuojiang Basin, as shown in Figure 7a,b.

**Figure 6.** Temporal comparison of (**a**) the values of the parameter *x*<sub>1</sub> and (**b**) the simulated daily streamflow of the GR4J model and the GR4J-T model for Weihe Basin during SP1 (only the period from 1 January 1982 to 31 December 1983 is presented here). The uncertainty bounds associated with the model parameters are derived from the streamflow simulation results using the corresponding 5000 parameter sets.

**Figure 7.** Temporal comparison of the values of the parameters (**a**) *x*<sub>1</sub> and (**b**) *x*<sub>3</sub>, and (**c**) the simulated daily streamflow of the GR4J model and the GR4J-T model for Tuojiang Basin during SP1 (only the period from 1 January 1982 to 31 December 1983 is presented here). The uncertainty bounds associated with the model parameters are derived from the streamflow simulation results using the corresponding 5000 parameter sets.

The increased value of the parameter *x*<sub>1</sub> during the growing season indicates larger catchment storage (e.g., interception) and, therefore, a weaker response of the watershed to precipitation inputs. As a result, the same precipitation input may lead to less streamflow, and therefore a lower runoff ratio, than in the dormant season. Likewise, the relatively low value of the parameter *x*<sub>1</sub> during the dormant season may lead to a stronger response to precipitation inputs and may help to reduce the underestimation of low flows mentioned above. The parameter *x*<sub>3</sub>, which controls the routing storage of the GR4J model, shows a similar trend to *x*<sub>1</sub> in the case of Tuojiang Basin. Similarly, the relatively low values of the parameter *x*<sub>3</sub> during the dormant season could contribute to a more rapid hydrological response and therefore a better representation of low flows.

The improved streamflow simulation performance can be attributed to (1) the dynamics of the model parameters, which may help to compensate for possible deficiencies in the model structure, and (2) the increased degrees of freedom of the model parameters. For the GR4J-T model applied in Weihe Basin, the number of parameters that require calibration is 7. The trade-off between a simple model structure and good model performance is an interesting topic that deserves further research, especially under changing environments.

#### *4.4. Parameter Transferability of the GR4J and GR4J-T Model*

To test the parameter transferability of the GR4J and GR4J-T models, the abovementioned 5000 parameter sets obtained from each sub-period (donor period) were validated on its independent sub-periods (receiver periods). For example, Figure 8 compares the streamflow simulation performance of the GR4J model and the GR4J-T model under 10 different sub-periods using parameter sets obtained from SP1 in Weihe Basin. The corresponding *PTC* values are also presented. As a reference, the red solid line represents the highest value of *KGE* (*KGE*<sup>*c*</sup><sub>*R*→*R*</sub>) achieved by the original GR4J model using parameters calibrated on each receiver period, while the blue solid line represents that of the GR4J-T model (*KGE*<sup>*t*</sup><sub>*R*→*R*</sub>). The blue boxes represent the *KGE* values (*KGE*<sup>*c*</sup><sub>*D*→*R*</sub>) achieved by the original GR4J model using the 5000 parameter sets obtained from SP1, and the green boxes represent those of the GR4J-T model (*KGE*<sup>*t*</sup><sub>*D*→*R*</sub>). It is clear that when parameter sets were transferred from the donor period to the receiver periods, the model performance decreased significantly for both the GR4J model and the GR4J-T model. However, the maximum value of the green boxes (*KGE*<sup>*t*</sup><sub>*D*→*R*</sub>) is consistently higher than that of the blue boxes (*KGE*<sup>*c*</sup><sub>*D*→*R*</sub>), as shown in Figure 8. In addition, the *PTC* criterion, which is the difference between the average *KGE* values achieved by the two models using the 5000 parameter sets over each receiver period (see Equation (6)), is above 0 in most cases, which indicates a better temporal transferability of the GR4J-T model. The same holds for Tuojiang Basin (see Figure 9), though the corresponding *PTC* values are quite close to 0.

**Figure 8.** Comparison of streamflow simulation performance in terms of *KGE* when parameter sets calibrated during SP1 are transferred to other sub-periods (SP6–SP26) for the GR4J model and the GR4J-T model in Weihe Basin. The boxplots represent the *KGE* values using the 5000 parameter sets obtained from SP1. The *PTC* values indicate the difference in average simulation performance between the GR4J model and the GR4J-T model during the validation procedure.

**Figure 9.** Comparison of streamflow simulation performance in terms of *KGE* when parameter sets calibrated during SP1 are transferred to other sub-periods (SP6–SP26) for the GR4J model and the GR4J-T model in Tuojiang Basin.

To comprehensively compare the temporal transferability of the constant parameters and the time-varying parameters, the *PTC* values for every possible case are presented in Figure 10. The blank area indicates the cases where the transferability test is inapplicable (e.g., between adjacent sub-periods or among sub-periods that overlap each other). The labels on the y-axis indicate the 26 donor periods, while the labels on the x-axis indicate the receiver periods. Each cell in the figure stands for one transferability test, and its color represents the corresponding *PTC* value. The red cells, indicating positive *PTC* values, stand for the cases where the GR4J-T model shows a better temporal transferability than the GR4J model, while the blue cells stand for the opposite cases. It is clear that the *PTC* criterion has a positive value in most cases. This further demonstrates that the GR4J-T model has an advantage in terms of temporal transferability over the original GR4J model.

Note that a loss in model performance when the model parameters are transferred among sub-periods with different climate and watershed conditions is inevitable, even for a hydrological model with time-varying parameters. Here, it was shown that a better transferability can be achieved under changing environments when time-varying parameters are applied. In practice, however, applying time-varying parameters is not easy, as a more complicated calibration procedure is necessary. Nevertheless, it is still worth identifying the sensitive parameters and finding out how they may change over time, which may help to improve the model structure and provide new insights for better understanding the hydrological processes.

**Figure 10.** *PTC* values of the parameter transferability test for (**a**) Weihe Basin and (**b**) Tuojiang Basin.

#### **5. Conclusions**

This study investigated the parameter transferability under changing environments of the original GR4J model and its modified version, in which the parameter was allowed to vary over time. To set up the GR4J model with the time-varying parameter, a sensitivity analysis was applied to identify the most sensitive parameter, which was then treated as time-varying and assumed to be a function of several external covariates. Both models were calibrated and validated in a series of sub-periods with different climatic and watershed conditions. The investigation was carried out for Weihe Basin and Tuojiang Basin in western China. The main findings of this study are as follows:

(1) A better streamflow simulation performance in terms of the *KGE* criterion was achieved by the GR4J model with time-varying parameters (*x*<sub>1</sub> for Weihe Basin, *x*<sub>1</sub> and *x*<sub>3</sub> for Tuojiang Basin) during most calibration sub-periods for both basins, though at the cost of a slight growth in parameter uncertainty;

(2) The GR4J model with time-varying parameter shows a better transferability among sub-periods with different climate and watershed conditions when compared to the original GR4J model for both basins.

Although the time-invariance of model parameters is one of the basic criteria of a high-quality hydrologic model, very few (if any) models can achieve this due to their inherent limitations. Numerous previous studies have proved that hydrological models with time-varying parameters can achieve a better streamflow simulation performance during the calibration period. This study further demonstrates the advantage of the GR4J model with time-varying parameter in terms of temporal transferability among different sub-periods. This may provide new insights for improving the structure of existing hydrological models and for building new models.

Considering the availability of the dataset for both basins, only six climate-related covariates (*P*<sub>1</sub>, *P*<sub>2</sub>, *P*<sub>3</sub>, *PET*<sub>1</sub>, *PET*<sub>2</sub>, and *PET*<sub>3</sub>) and two watershed-related covariates (*NDVI*<sub>0</sub> and *NDVI*<sub>1</sub>) were used to describe the dynamic variation of the selected parameters. More covariates should be investigated in further research. In addition, the key factor that leads to the alteration in flow regimes differs among catchments. Thus, more emphasis should be placed on the careful identification of proper external covariates when applying time-varying parameters.

**Author Contributions:** Conceptualization, L.Z. and L.X.; data curation, L.Z., J.C. and J.-S.K.; formal analysis, L.Z., L.X., and D.L.; funding acquisition, L.X.; investigation, L.Z.; methodology, L.Z. and L.X.; project administration, L.X.; resources, L.X. and J.C.; software, L.Z., D.L., and J.C.; supervision, L.X.; validation, L.Z. and J.-S.K.; visualization, L.Z.; writing – original draft, L.Z.; writing – review and editing, L.X., D.L., J.C., and J.-S.K.

**Funding:** This research is supported by the National Natural Science Foundation of China (Grant Nos. 41890822 and 51525902), the Research Council of Norway (FRINATEK Project 274310), and the Ministry of Education "111 Project" Fund of China (B18037), all of which are greatly appreciated.

**Acknowledgments:** Sincere thanks are due to the editor and two anonymous reviewers for all the remarks and suggestions that are very helpful and constructive for improving our manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
