3.3.2. Comparisons Between the Hybrid Forecasting Models

Figure 5 shows the errors forecast by the error correction module in the hybrid models. Compared with the water demand data in Figure 4, the errors of the initial forecasting in Figure 5 fluctuate far more; in other words, the error values change more frequently. Figure 5 also shows that the peak values of the error data change in a complex and disorderly way: there is no obvious regularity in when the peaks occur, as with the peaks at time steps 7, 45, 71, and 75 in Figure 5a. The results in Figure 5 can be summarized as follows:


32, 33, and 62 to 64 in DMA1; time steps 32, 46, 55 and 50 in DMA2; time steps 30 to 35, 80 and 86 in DMA3. This kind of misleading correction is much less common with the chaos method.

In general, the chaos method performs better than the FS method in predicting such a complex, strongly fluctuating error time series, and the results confirm that the errors predicted by the chaos method are closer to the initial errors in all three DMAs.

**Figure 5.** Comparison between the errors of the initial forecasting and the predicted errors. (**a**) DMA1 on 26 December; (**b**) DMA2 on 26 December; and (**c**) DMA3 on 11 August.
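The notion of a misleading correction can be made concrete with a small sketch. It assumes, since this section does not state the convention, that the error is defined as observed demand minus the initial forecast, that the corrected forecast is the initial forecast plus the predicted error, and that a correction counts as misleading when it increases the absolute error. The function name `misleading_steps` and the sample values are hypothetical.

```python
def misleading_steps(observed, initial, predicted_error):
    """Return the 0-based time steps where the correction worsens the forecast.

    Assumed conventions (not taken from the paper):
      actual error       = observed - initial forecast
      corrected forecast = initial forecast + predicted error
    """
    flagged = []
    for t, (obs, init, err_hat) in enumerate(zip(observed, initial, predicted_error)):
        error_before = obs - init              # error of the initial forecast
        error_after = obs - (init + err_hat)   # error left after the correction
        if abs(error_after) > abs(error_before):
            flagged.append(t)
    return flagged


# Illustrative values only; they are not data from the three DMAs.
observed = [100.0, 100.0, 100.0]
initial = [95.0, 100.0, 104.0]
predicted_error = [4.0, 3.0, 2.0]
flagged = misleading_steps(observed, initial, predicted_error)
```

In this toy series the first correction helps (the absolute error drops from 5 to 1), while the other two push the forecast further from the observation and are flagged.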

The statistics of the absolute percentage errors (APE) of the single forecasting model S\_LSSVM and the hybrid models are provided in Figure 6. Judging by the mean, median, maximum, and minimum APEs of the predictions for the three DMAs in Beijing, the H\_LSSVM\_Chaos models perform better than the S\_LSSVM models. The hybrid framework combining LSSVM and chaotic time series therefore gives more accurate predictions. The hybrid models using LSSVM and Fourier series did not always perform as well as the H\_LSSVM\_Chaos models. The MAPE of the H\_LSSVM\_FS model for DMA1 is 5.44%, which is better than the 5.68% of the single forecasting model S\_LSSVM. However, other statistics of the H\_LSSVM\_FS model in DMA1, such as the 75th-percentile and maximum APE, are similar to or even worse than those of S\_LSSVM. The reason is that the H\_LSSVM\_FS model makes misleading corrections at the severely fluctuating time steps, as shown in Figure 5a. For DMA2, although the mean and median APEs of the H\_LSSVM\_FS models are similar to those of the H\_LSSVM\_Chaos models, the FS method still notably overestimates the errors during time steps 38 to 58 in Figure 5b. Therefore, the error correction module should be applied with care in short-term water demand forecasting.

**Figure 6.** Statistics of the absolute relative errors for different forecasting models.
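The summary statistics compared in Figure 6 can be computed as in the following minimal sketch, which uses only the Python standard library. The function names and the sample values are hypothetical, and the choice of the "inclusive" quantile method is an assumption; the paper does not say how the 75th percentile was computed.

```python
import statistics


def ape(observed, forecast):
    """Absolute percentage error (in %) at each time step."""
    return [abs(o - f) / abs(o) * 100.0 for o, f in zip(observed, forecast)]


def ape_summary(observed, forecast):
    """Mean (i.e., MAPE), median, 75th percentile, maximum, and minimum APE."""
    errors = ape(observed, forecast)
    # quantiles(..., n=4) returns the three quartile cut points.
    _, _, p75 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(errors),
        "median": statistics.median(errors),
        "p75": p75,
        "max": max(errors),
        "min": min(errors),
    }


# Illustrative values only; they are not data from the three DMAs.
summary = ape_summary([100.0, 200.0, 400.0, 500.0], [110.0, 190.0, 400.0, 450.0])
```

Comparing the full summary rather than the mean alone matters here: a model can have a lower MAPE and still show a worse 75th percentile or maximum, as reported for H\_LSSVM\_FS on DMA1.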

#### *3.4. Discussion*

In both the initial forecasting module and the error correction module of the hybrid forecasting framework, the forecasting models are built with LSSVM. The successful implementation of an LSSVM model depends on the precision of its model parameters (i.e., γ and δ<sup>2</sup>). In this study, the three-level Bayesian evidence inference method is adopted to infer the LSSVM model parameters. To investigate the influence of the model parameters on the performance of LSSVM models, the application of the S\_LSSVM model to DMA2 is taken as an example. With the same model input data, Table 5 shows the model performance for different model parameters obtained by 1-level Bayesian inference, 3-level Bayesian inference, and the grid search algorithm. These parameters are computed with the LS-SVMlab Toolbox [45]. As Table 5 shows, after 3-level inference the Bayesian evidence method finds reasonable model parameters at a moderate computational cost. The grid search algorithm provides the best performance but takes the longest computation time. As shown in Table 4, the hybrid H\_LSSVM\_Chaos model using 3-level Bayesian inferred parameters performs even better than the S\_LSSVM model built with the grid search algorithm. The computation time of the H\_LSSVM\_Chaos model, which covers both initial forecasting and error correction, is about twice that of the 3-level Bayesian S\_LSSVM model, yet much shorter than that of the grid search S\_LSSVM model (Table 5). Therefore, the hybrid framework using 3-level Bayesian LSSVM models for initial forecasting and error time series forecasting is suitable for short-term water demand forecasting.
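To illustrate why grid search is the most expensive option, the sketch below fits a small LSSVM regressor for every (γ, kernel width) pair on a grid and keeps the pair with the lowest validation MAPE. It is a plain-Python illustration under stated assumptions, not the LS-SVMlab implementation: the RBF kernel scaling exp(−‖x − z‖² / (2·sigma2)), the variable name `sigma2` for the kernel parameter, the validation-set scoring, and all function names are assumptions made for this example.

```python
import math


def rbf(x, z, sigma2):
    """RBF kernel; the 1/(2*sigma2) scaling is an assumed convention."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LSSVM regression system [[0, 1'], [1, K + I/gamma]] [b; a] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term from the regularization parameter
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias, support values (alphas)


def lssvm_predict(X_train, bias, alphas, sigma2, x):
    return bias + sum(a * rbf(xi, x, sigma2) for a, xi in zip(alphas, X_train))


def mape(observed, forecast):
    return 100.0 * sum(abs(o - f) / abs(o) for o, f in zip(observed, forecast)) / len(observed)


def grid_search(X_tr, y_tr, X_val, y_val, gammas, sigma2s):
    """Return (best validation MAPE, gamma, sigma2); one full fit per grid point."""
    best = None
    for gamma in gammas:
        for sigma2 in sigma2s:
            bias, alphas = lssvm_fit(X_tr, y_tr, gamma, sigma2)
            preds = [lssvm_predict(X_tr, bias, alphas, sigma2, x) for x in X_val]
            score = mape(y_val, preds)
            if best is None or score < best[0]:
                best = (score, gamma, sigma2)
    return best
```

Each grid point requires solving a linear system whose size grows with the number of training samples, so the cost scales with the number of (γ, kernel width) combinations tried. This is consistent with the longer computation time reported for grid search in Table 5.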

**Table 5.** Performances of the S\_LSSVM model with different parameters with application to DMA2.


The hybrid model (H\_LSSVM\_Chaos) is also compared with the traditional ARIMA model, and Table 6 shows the results for the three DMAs. The development of the ARIMA models follows the procedure described by Adamowski [45]. Different combinations of the ARIMA parameters were trained and tested; the number of autoregressive parameters (p), the number of differences (d), and the number of moving average parameters (q) are set to (3, 1, 1). Note that the same set of historical water demand data is used to build the H\_LSSVM\_Chaos and ARIMA forecasting models; the historical data before the forecasting day are used to establish the forecasting models.


**Table 6.** Performance comparison between the auto regressive integrated moving average (ARIMA) and the hybrid forecasting models.

As shown in Table 6, the H\_LSSVM\_Chaos model performs better than the ARIMA model on DMA1 and DMA2; for example, the MAPEs (DMA1, DMA2) of the H\_LSSVM\_Chaos model and the ARIMA model are (4.84%, 3.15%) and (5.53%, 3.83%), respectively. The results for DMA3, however, vary: (i) on the forecasting day August 11, the H\_LSSVM\_Chaos model gives a result similar to that of the ARIMA model; for example, the R<sup>2</sup> and MAPEs of the two models are (0.9701, 0.9687) and (3.47%, 3.44%), respectively; (ii) on the forecasting days from August 8 to 10, the H\_LSSVM\_Chaos model performs better than the ARIMA model; for example, the three days' MAPEs of the H\_LSSVM\_Chaos and ARIMA models are (3.48%, 2.81%, 2.71%) and (4.03%, 3.10%, 3.35%), respectively. The reason for the variation is that August 11 is a Saturday, while August 8 to 10 are weekdays. As shown in Figures 3c and 4c, for DMA3 the water consumption curve on Saturday differs from and is more complex than that of the weekdays. Because the Saturday curve is distinctive, fewer training samples are available for establishing the forecasting model, which affects the forecasting accuracy for Saturday. Nevertheless, the overall performance of the H\_LSSVM\_Chaos model is still better than that of the ARIMA model, despite the variation in forecasting accuracy on Saturday. These comparisons verify the validity of the H\_LSSVM\_Chaos model.

Generally, a single model cannot identify the underlying patterns in every case, whereas a hybrid framework that includes different models is able to capture different aspects of the available information for prediction [5,46]. The LSSVM method in the initial prediction module captures the nonlinear relationships between the discontinuous feature data (*Q*<sub>*t*</sub>, *Q*<sub>*t*−1</sub>, *Q*<sub>*t*−2</sub>, *Q*<sub>*t*−95</sub>, *Q*<sub>*t*−191</sub>, *Q*<sub>*t*−671</sub>) of the historical water demand data set and the water demand *Q*<sub>*t*+1</sub> on the forecasting day; the chaotic time series method in the error correction module captures the continuous and periodic changes in the errors of the initial forecasting module.
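The discontinuous feature set above can be assembled from a demand series as in the following minimal sketch. The function name `build_samples` and the integer test series are hypothetical. The lags suggest 96 time steps per day (so that lags 95, 191, and 671 relative to *t* correspond to the same time one day, two days, and one week before the target step *t*+1), but that sampling interval is an inference from the lag values rather than something stated in this section.

```python
# Lags (in time steps, relative to t) of the six inputs used to predict Q[t+1].
LAGS = (0, 1, 2, 95, 191, 671)


def build_samples(q, lags=LAGS):
    """Pair each feature vector [Q[t-k] for k in lags] with its target Q[t+1]."""
    max_lag = max(lags)
    samples = []
    # Start where the longest lag is available; stop one step before the end
    # so that the target Q[t+1] exists.
    for t in range(max_lag, len(q) - 1):
        features = [q[t - k] for k in lags]
        samples.append((features, q[t + 1]))
    return samples


# Using the index itself as the "demand" makes the lag structure easy to read.
samples = build_samples(list(range(700)))
first_features, first_target = samples[0]
```

With this placeholder series the first usable sample sits at *t* = 671, so its features are the values at indices 671, 670, 669, 576, 480, and 0, and its target is the value at index 672.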
