*4.2. Selection and Estimation of C-Vine Copula*

This section describes how the C-vine structure is defined from the learning data obtained in Section 4.1. Figure 5 shows the pair plots of the learning data set. The histograms on the diagonal represent the marginal distributions discussed in Section 4.1. The panels above the diagonal give Kendall's τ for each pair of variables; the results show that the dependence between St-1 and the other variables is generally stronger than that between the other pairs (i.e., Kendall's τ = 0.65, 0.46, 0.33, 0.40, 0.32, and 0.46). Therefore, we define St-1 as the central variate 1 (e.g., in Figure 1) of the first tree. In detail, because the monthly streamflow (S) is affected by various climatic and hydrological factors, such as temperature and precipitation, the streamflow of the previous month (St-1) is selected as the root of the first tree. Moreover, the predicted variable (St) is placed last because this makes it more convenient to evaluate the probability of St and to predict St. The remaining trees follow the same principle (e.g., as shown in Figure 1). Overall, the order of the variables is 1-St-1, 2-Pt-1, 3-St-2, 4-St-12, 5-Tt, 6-Pt, and 7-St. The panels below the diagonal in Figure 5 show scatter plots for each pair of variables in the learning data and provide a basis for identifying the dependence structure between them. For example, St-1 and St-2 exhibit lower-tail dependence, which suggests that the Clayton copula is a suitable candidate for this pair.
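The root-selection argument above can be illustrated with a minimal sketch. It assumes a common data-driven heuristic (choose the variable with the largest summed absolute Kendall's τ with all other variables), which is not necessarily the exact procedure used in this study, where hydrological reasoning also played a role and St was fixed as the last variable. The function names `kendall_matrix` and `pick_root` and the synthetic data are hypothetical and serve only as an illustration.

```python
import numpy as np
from scipy.stats import kendalltau


def kendall_matrix(data):
    """Pairwise Kendall's tau for the columns of a 2-D array."""
    d = data.shape[1]
    tau = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            t, _ = kendalltau(data[:, i], data[:, j])
            tau[i, j] = tau[j, i] = t
    return tau


def pick_root(tau, candidates):
    """Return the candidate with the largest summed |tau| to all other variables."""
    # Subtract 1.0 to drop the diagonal entry tau[k, k] = 1.
    scores = {k: np.abs(tau[k]).sum() - 1.0 for k in candidates}
    return max(scores, key=scores.get)


# Synthetic stand-in for the learning data (NOT the data of this study):
# column 0 follows a shared driver closely, columns 1-2 more loosely,
# and column 3 is independent noise.
rng = np.random.default_rng(0)
n = 2000
common = rng.normal(size=n)
data = np.column_stack([
    common + 0.2 * rng.normal(size=n),
    common + 1.5 * rng.normal(size=n),
    common + 2.0 * rng.normal(size=n),
    rng.normal(size=n),
])

tau = kendall_matrix(data)
root = pick_root(tau, candidates=range(data.shape[1]))
print("Kendall's tau matrix:\n", np.round(tau, 2))
print("Selected root column:", root)
```

With real data, the predicted variable would be excluded from `candidates` so that it can be kept in the last position, as done in the ordering above.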

Following the construction procedure for bivariate copulas, the vine copula is built from a series of pair-copulas, tree by tree. Table 4 presents the C-vine structure, which consists of 6 trees and 21 edges (one pair-copula per edge), together with the bivariate copula family, its parameters, and the KS test statistic for every edge. As mentioned above, variables 1 to 7 correspond to St-1, Pt-1, St-2, St-12, Tt, Pt, and St, respectively. Because vine structures are flexible, this ordering is only one of many possible structures; it was chosen as the most suitable arrangement for this study by considering the dependence among the variables. In addition, when constructing the pair-copulas, the vine copula is simplified by ignoring the influence of the conditioning variables on the conditional pair-copulas.
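The tree-by-tree layout of a C-vine can be enumerated with a short sketch. It assumes the canonical C-vine layout in which variable j (in the chosen order) is the root of tree j and all earlier roots form the conditioning set; the function name `cvine_edges` is hypothetical. For seven variables this gives 6 trees and 7 × 6 / 2 = 21 pair-copulas.

```python
def cvine_edges(d):
    """List the pair-copulas of a d-dimensional C-vine, tree by tree.

    Each entry is (tree, (root, other), conditioning_set), with the
    variables numbered 1..d in the chosen order.
    """
    edges = []
    for tree in range(1, d):                    # trees 1 .. d-1
        root = tree                             # variable `tree` is the root of this tree
        conditioning = tuple(range(1, tree))    # roots of all earlier trees
        for other in range(tree + 1, d + 1):
            edges.append((tree, (root, other), conditioning))
    return edges


edges = cvine_edges(7)
for tree, (a, b), cond in edges:
    cond_txt = ",".join(str(c) for c in cond)
    label = f"{a},{b}" + (f" | {cond_txt}" if cond else "")
    print(f"Tree {tree}: {label}")
print("Number of pair-copulas:", len(edges))
```

In this enumeration the last tree contains the single pair (6, 7) conditioned on variables 1 to 5, that is, Pt and St given all other predictors.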

The λ-function is used to assess the goodness of fit of the bivariate copulas estimated in the C-vine structure. Figure 6 uses the λ-function to illustrate the dependence between St-1, the main node in tree 1, and the other variables. The results indicate that the selected copulas are consistent with the empirical copulas for all edges of tree 1. As shown in Figure 6a, the empirical λ-function (black) of the observations and the theoretical λ-function (grey) of the fitted copula nearly coincide, which means that the fitted copula is consistent with the empirical values. Together with the KS test results in Table 4, all other selected pair-copulas also fit well, with *p* > 0.05 for the KS test.
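The comparison of empirical and theoretical λ-functions can be sketched as follows. The sketch assumes the usual definition λ(t) = t − K(t), where K is the Kendall distribution function, and uses the Clayton copula purely as an illustrative family, for which λ(t) = t(t^θ − 1)/θ. The sample is simulated, not taken from this study, and all function names are hypothetical.

```python
import numpy as np


def empirical_lambda(u, v, grid):
    """Empirical lambda-function t - K_n(t), K_n being the empirical Kendall distribution."""
    u = np.asarray(u)
    v = np.asarray(v)
    n = len(u)
    # W_i: share of the other points lying strictly below and to the left of point i.
    below = (u[None, :] < u[:, None]) & (v[None, :] < v[:, None])
    w = below.sum(axis=1) / (n - 1)
    k_n = np.array([(w <= t).mean() for t in grid])
    return grid - k_n


def clayton_lambda(grid, theta):
    """Theoretical lambda-function of a Clayton copula with parameter theta > 0."""
    return grid * (grid ** theta - 1.0) / theta


def independence_lambda(grid):
    """Lambda-function under independence; the comonotonicity limit is 0."""
    return grid * np.log(grid)


def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v


rng = np.random.default_rng(1)
theta = 2.0                                  # illustrative value, not a fitted parameter
u, v = sample_clayton(2000, theta, rng)
grid = np.linspace(0.05, 0.95, 19)

lam_emp = empirical_lambda(u, v, grid)
lam_fit = clayton_lambda(grid, theta)
max_gap = np.max(np.abs(lam_emp - lam_fit))
print("Largest gap between empirical and theoretical lambda:", round(max_gap, 3))
```

A small gap between the two curves, with both lying between the independence limit t ln t and the comonotonicity limit 0, corresponds to the visual agreement described for Figure 6.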


**Figure 5.** Pair plots of the learning data set with scatter plots below the diagonal, Kendall's *τ* above the diagonal, and histograms on the diagonal.

**Figure 6.** Correlation diagram of St-1 with other variables of the *λ*-function with the main node in tree 1 (empirical function (black line), theoretical function of a fitted copula with parameters (grey line), as well as independence and comonotonicity limits (dashed lines)).

*4.3. Predicted Monthly Streamflow of MLR, ANN, and C-Vine Models*

Figure 7 compares the predicted and observed streamflow for the MLR, ANN, and CVQR models. For the MLR model, the values of R2, NSE, and RMSE are 0.73, 0.72, and 16.16 in the calibration period and 0.73, 0.66, and 16.72 in the validation period. As shown in Figure 7a, the MLR model slightly underestimates high-flow observations (1980–1986) and, conversely, slightly overestimates the flow during 2004–2009. Due to the inherent characteristics of the algorithm, the predicted values even become negative at some low-flow records (e.g., 1999 and 2000).

The ANN model performs better than the MLR model in the calibration period (Figure 7b), with an R2 of 0.75, an NSE of 0.73, and an RMSE of 15.57. Similar to the MLR model, the ANN model performs worse in the validation period than in the calibration period, with an R2 of 0.72, an NSE of 0.69, and an RMSE of 16.53. Moreover, as presented in Figure 7b, the ANN model also underestimates some streamflow during high-flow periods (e.g., 1963–1964) but overestimates more records during 2004–2009.
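For reference, the three evaluation metrics can be computed as in the sketch below. It assumes that R2 denotes the squared Pearson correlation between observed and predicted values (which would explain why R2 and NSE differ in the results above), that NSE is the Nash-Sutcliffe efficiency, and that RMSE is the root mean square error; the exact definitions used in this study may differ, and the toy numbers are hypothetical.

```python
import numpy as np


def r2(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    r = np.corrcoef(obs, sim)[0, 1]
    return r ** 2


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def rmse(obs, sim):
    """Root mean square error, in the units of the data."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return np.sqrt(np.mean((obs - sim) ** 2))


# Toy values for illustration only (not data from this study).
obs = np.array([10.0, 20.0, 30.0, 40.0])
sim = np.array([12.0, 18.0, 33.0, 37.0])
print("R2 =", round(r2(obs, sim), 3))
print("NSE =", round(nse(obs, sim), 3))
print("RMSE =", round(rmse(obs, sim), 3))

# A constant bias leaves R2 at 1 but lowers NSE, so the two metrics
# are not interchangeable.
biased = obs + 5.0
print("Biased R2 =", round(r2(obs, biased), 3), "NSE =", round(nse(obs, biased), 3))
```

Under these definitions, R2 measures only linear association, whereas NSE and RMSE also penalise systematic over- or underestimation such as that noted for the high-flow and 2004–2009 periods.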


**Table 4.** Estimation of the seven-dimensional C-vine model: bivariate copula family, corresponding parameters, and KS test result for each pair-copula.

Notes: 1–7 represent St-1, Pt-1, St-2, St-12, Tt, Pt, and St, respectively; F—Frank, C—Clayton, G—Gumbel, N—Normal, and T—t-copula.
