**1. Introduction**

One critical factor in the planning, design, operation, and management of a water distribution system (WDS) is meeting water demand with acceptable quality at reasonable pressure [1–3]. An accurate hydraulic model of the WDS helps water utilities improve their operation and management. Because WDS hydraulics are driven by consumer demands, these demands must be estimated before a hydraulic evaluation is performed [4]. Water demand at a given time in the future is usually related to historical water consumption and meteorological factors such as humidity, air temperature, and wind velocity [5]. Water demand forecasting plays an important role in WDS activities such as water production, pump station operation, real-time modeling, and other strategic decisions of water management [1,6].

Water demand forecasting models can be categorized into long-term and short-term models according to the forecast horizon (i.e., the time period over which water demand is forecast) and the forecast frequency (i.e., the time step at which forecasts are made within that period) [7]. Long-term forecasting models (forecast horizon of 1 to 10 years) are mainly concerned with the planning and design of WDSs. Short-term forecasting models (forecast horizon of 1 day to 1 month) target the real-time water demands of existing WDSs and are generally used for the daily operation of water plants and pump stations [8]. In this study we focus on the short-term model. An accurate model for short-term water demand forecasting, with a forecast frequency ranging from daily to sub-hourly, is an essential support for optimal scheduling and better decision making in WDS management [9].

Many studies have proposed models for short-term water demand forecasting, which can generally be classified into traditional methods and learning algorithms [9]. Early works addressed this problem with traditional statistical models, such as linear regression, exponential smoothing, and the autoregressive integrated moving average (ARIMA) model [7]. These models have been widely applied in practice because they are simple to understand and implement. However, traditional models are not always able to accurately predict the nonlinear changes in water demand. Recently, more sophisticated models that use machine learning algorithms and artificial intelligence have been applied to this problem. Models based on machine learning algorithms are typical data-driven nonlinear models, which rely mainly on historical data to establish the relationships between water demand and related variables (e.g., previous water consumption, air humidity, and temperature).

A number of data-driven models that use machine learning algorithms have been developed for short-term water demand forecasting, such as artificial neural network (ANN) models [10–12], support vector machine (SVM) models [13–16], projection pursuit regression models [1,17], and random forests [18]. Herrera et al. [1] compared these models and found that the SVM model gave the most accurate results. Khan and Coulibaly [15] compared SVM, ANN, and a seasonal autoregressive model in forecasting lake water levels, and the results indicated that the SVM model outperformed the other two. The main reason is that the SVM has an inherent advantage in formulating its cost function: it uses the structural risk minimization principle instead of the empirical risk minimization used by ANN [19].

SVM maps the nonlinear trends of the input space to linear trends in a higher-dimensional space and recognizes subtle patterns in complex datasets by using a learning algorithm [20]. The least squares support vector machine (LSSVM) is an extension of SVM that involves equality constraints instead of inequality constraints and works with a least squares cost function [21,22]. Because of the equality constraints, the LSSVM reduces computational complexity by solving a set of linear equations rather than the quadratic programming problem of the standard SVM. Chen and Zhang [14], Herrera et al. [1], and Praveen and Bagavathi [23] established LSSVM-based models to forecast hourly water demand; it was found that the LSSVM model has better generalization ability than ANN. Other examples of LSSVM applications include river flow estimation [24], discharge-suspended sediment estimation [25], and pipeline network failure estimation [26]. When forecasting water demand with the LSSVM-based model, Chen and Zhang [13] used a Bayesian framework to determine the model parameters (namely, the regularization constant and the width of the radial basis function (RBF) kernel). Their case study showed that determining the parameters by the Bayesian method is faster than by cross-validation [26,27].
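
To make the linear-system formulation concrete, the following is a minimal sketch of LSSVM regression with an RBF kernel, written in the standard textbook form. The function names and the parameter names `gamma` (regularization constant) and `sigma` (kernel width) are illustrative assumptions, not taken from the cited works, and no parameter selection (Bayesian or cross-validation) is included.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, gamma, sigma):
    """Fit an LSSVM regressor by solving one linear system (no QP solver).

    The system is
        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    where K is the kernel matrix of the training inputs.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]  # bias b, dual weights alpha


def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

In a demand-forecasting setting, each row of `X` would hold the explanatory variables (for example, lagged demand values) and `y` the demand to be predicted; that mapping is an assumption of this sketch rather than a description of any cited model.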

Both the traditional models and the learning algorithms have achieved promising results in their own linear or nonlinear domains, but none of them is universally suitable for all circumstances. To improve forecasting performance, some studies have developed hybrid models that combine two or more different algorithms or models. Zhang [28] established a hybrid model with ANN and ARIMA to forecast time series, in which the ARIMA model was first used to predict the linear part of the data, and the ANN was then used to model the errors between the linear part and the observed data (i.e., the nonlinear part of the data). Application to three benchmark time series showed that the hybrid model achieved better forecasting accuracy than either model used independently. Odan and Reis [7] coupled the Fourier series (FS) with ANN for hourly water demand forecasting; the ANN was used to model the errors of the FS forecast (i.e., the difference between the FS model and the observed data). Brentan et al. [29] proposed a hybrid model based on SVM and adaptive FS, in which the SVM first provided the initial forecast and the adaptive FS was then used to model the errors between the initial forecast and the observed data. In this way, the nonlinear and periodic behavior of water demand can be captured by the SVM and the FS model, respectively.
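
The two-stage structure shared by these hybrid models (an initial forecast followed by a model of its errors) can be sketched as follows. This is only an illustration under simplifying assumptions: the initial forecast is a mean profile per time-of-day slot and the error model is a least-squares autoregression, both chosen as stand-ins so the sketch stays short. They are not the ARIMA, ANN, FS, or SVM components used in the cited studies, and all function and parameter names are hypothetical.

```python
import numpy as np


def fit_ar(series, order):
    """Least-squares AR(order) coefficients, intercept first."""
    rows = [series[t - order:t] for t in range(order, len(series))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(X, series[order:], rcond=None)
    return coef


def predict_ar(coef, history):
    """One-step-ahead AR prediction from the most recent values."""
    order = len(coef) - 1
    return coef[0] + history[-order:] @ coef[1:]


def hybrid_one_step(demand, period=24, order=3):
    """Return (initial forecast, error-corrected forecast) for the next step.

    Stage 1 (stand-in): mean demand for each slot of the daily cycle.
    Stage 2 (stand-in): AR model fitted to the stage-1 errors.
    """
    demand = np.asarray(demand, dtype=float)
    n = len(demand)
    phase = np.arange(n) % period
    profile = np.array([demand[phase == p].mean() for p in range(period)])

    # Errors of the initial forecast over the historical record.
    errors = demand - profile[phase]
    coef = fit_ar(errors, order)

    initial = profile[n % period]
    correction = predict_ar(coef, errors)
    return initial, initial + correction
```

Calling `hybrid_one_step` on an hourly demand record that covers at least one full period returns both forecasts for the next hour, which makes it easy to compare the initial forecast with its error-corrected counterpart on held-out data.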

In addition to FS, the chaotic time series method makes it possible to detect instabilities hidden behind seemingly random behavior, and it has been widely used for short-term time series forecasting in rainfall, traffic, and other fields. For example, Dhanya et al. [30] examined the chaotic characteristics of daily rainfall data from the Malaprabha basin, India, and established a daily rainfall prediction model based on the theory of chaotic time series. Liu et al. [31] combined chaos theory with SVM to perform short-term prediction of network traffic. Yang et al. [32] proposed an improved fuzzy neural system based on chaotic reconstruction technology for short-term load forecasting of electric power systems, and the application showed that the model based on chaotic technology performed better than the conventional neural network model. So far, chaotic time series methods have rarely been applied to water demand forecasting, and their performance in this field is unknown.
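
Chaotic time series methods commonly start from a phase-space reconstruction by time-delay embedding. The sketch below shows that step together with a simple nearest-neighbor one-step predictor. It is a generic illustration, not the procedure of any cited study: the embedding dimension `m`, the delay `tau`, and the function names are assumptions, and in practice `m` and `tau` would have to be estimated from the data.

```python
import numpy as np


def delay_embed(series, m, tau):
    """Build delay vectors [x_t, x_{t+tau}, ..., x_{t+(m-1)tau}], one per row."""
    series = np.asarray(series, dtype=float)
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors < 1:
        raise ValueError("series too short for this m and tau")
    return np.column_stack(
        [series[i * tau : i * tau + n_vectors] for i in range(m)]
    )


def nn_forecast(series, m, tau):
    """Predict the next value from the successor of the nearest past state."""
    series = np.asarray(series, dtype=float)
    vectors = delay_embed(series, m, tau)
    query, library = vectors[-1], vectors[:-1]

    # Index of the historical delay vector closest to the current state.
    j = np.argmin(np.linalg.norm(library - query, axis=1))

    # library[j] ends at series index j + (m - 1) * tau; return what followed.
    return series[j + (m - 1) * tau + 1]
```

The nearest-neighbor rule is used here only because it is the shortest predictor that operates on the reconstructed phase space; any regression model could be trained on the delay vectors instead.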

As noted above, by correcting the errors of the initial forecast, hybrid models can perform better than any individual model [7,28,29]. Therefore, it is worthwhile to integrate the chaotic time series method into a hybrid forecasting model and investigate its performance. This paper aims to achieve better predictions of short-term water demand by presenting a hybrid forecasting model that couples the chaotic time series method with LSSVM in the error correction module. Specifically, it will:

