Performance Assessment for Short-Term Water Demand Forecasting Models on Distinctive Water Uses in Korea

Koo, Kang-Min; Han, Kuk-Heon; Jun, Kyung-Soo; Lee, Gyumin; Kim, Jung-Sik; Yum, Kyung-Taek

doi:10.3390/su13116056


by Kang-Min Koo ¹, Kuk-Heon Han ², Kyung-Soo Jun ¹, Gyumin Lee ³, Jung-Sik Kim ⁴ and Kyung-Taek Yum ²,*

¹ Graduate School of Water Resources, Sungkyunkwan University, Suwon 16419, Korea

² Smart Water Grid Research Group, Sungkyunkwan University, Suwon 16419, Korea

³ Construction and Environmental Research Center, Sungkyunkwan University, Suwon 16419, Korea

⁴ Techwin Co., Ltd., Cheongju 28580, Korea

* Author to whom correspondence should be addressed.

Sustainability 2021, 13(11), 6056; https://doi.org/10.3390/su13116056

Submission received: 12 April 2021 / Revised: 26 May 2021 / Accepted: 26 May 2021 / Published: 27 May 2021


Abstract:

Accurate water demand forecasting is crucial for supplying water efficiently and stably in a water supply system. In particular, accurate short-term forecasts help save energy and reduce operating costs. With the introduction of the Smart Water Grid (SWG) in a water supply system, water consumption is obtained in real time through smart meters and can be used for forecasting the short-term water demand. The models widely used for water demand forecasting include Autoregressive Integrated Moving Average (ARIMA), Radial Basis Function-Artificial Neural Network (RBF-ANN), Qualitative Multi-Model Predictor Plus (QMMP+), and Long Short-Term Memory (LSTM). However, there is a lack of research assessing the performance of these models for short-term water demand forecasting in an SWG demonstration plant. Therefore, in this study, the short-term water demand was forecasted with each model using the data collected from smart meters, and the performance of each model was assessed. The Smart Water Grid Research Group installed smart meters in block 112 located in YeongJong Island, Incheon, and the actual data used for operating the SWG demonstration plant were adopted. The performance of the models was assessed using the Residual, Root Mean Square Error, Normalized Root Mean Square Error, Nash–Sutcliffe Efficiency, and Pearson Correlation Coefficient as indices. The results show that it is difficult to forecast water demand from time and water consumption alone. Because short-term forecasting models that use only time and the amount of water consumption are limited in reflecting the characteristics of consumers, a water supply system can be managed more precisely if other factors influencing the water demand (weather, customer behavior, etc.) are applied.

Keywords:

smart water grid; advanced metering infrastructure; short-term water demand forecasting; distinctive uses

1. Introduction

The Smart Water Grid Research Group (SWGRG) operated the smart water grid (SWG) demonstration plant in block 112 located in YeongJong Island, Incheon from 2017 to 2019 [1]. The SWG is based on the Internet of Things, which is the core technology of the 4th Industrial Revolution, and aims to provide efficient and economic management of water resources and stable supply [2]. The integrated system monitors and collects the data on the amount of water consumption at the types of water use through real-time remote reading using Advanced Metering Infrastructure (AMI) sensors and a bilateral network having transmission and control devices. The SWGRG installed 527 ultrasonic-wave-type AMI sensors in the customers of block 112 located in YeongJong Island and collected the water consumption data at one-hour intervals in real-time. The collected water consumption data can be used to forecast water demand and to determine abnormal water pressure or leakage in worn-out pipelines throughout the water supply infrastructure [3]. According to Tiwari and Adamowski [4,5], there is no general rule for forecasting water demand, but it can be classified into short-term (hourly, daily, weekly), medium-term (up to 24-month), and long-term (annual, decadal). In general, water supply managers refer to short-term water demand forecast for one day or up to several weeks based on experiences to manage the system, including pumps and valves, efficiently. As water is supplied from a purification plant to a distribution reservoir using pumps, the operating costs can be reduced if the work is performed at nighttime when power rates are low. In particular, if the short-term (24–48 h) water demand is forecasted, efficient pump scheduling can guarantee stable water availability at a distribution reservoir [6,7]. Hence, an accurate forecast of short-term water demand is required for the efficient management of a water supply system and the reduction of operating costs and energy [8]. 
Additionally, a longer measurement time step can significantly reduce the measured peak demand [9], so a higher temporal resolution may help in forecasting water demand. Several studies have been conducted in this regard, but there is a lack of research assessing the performance of short-term water demand forecasting by type of water use at a higher temporal resolution.

Short-term water demand forecasting can be conducted mainly using statistical, Machine Learning (ML), Hybrid, and Deep Neural Network (DNN) models. Firstly, the following studies have applied statistical models. Kofinas et al. [10] forecasted the water demand for cities using the AutoRegressive Integrated Moving Average (ARIMA) model. Zhou et al. [11] forecasted the daily water demand using the AutoRegressive (AR) model and Fourier series by applying the weather parameters (maximum temperature, precipitation, evapotranspiration) of the city of Melbourne, Australia. Wong et al. [12] applied the trend, seasonality, weather regression, and calendar effect (holiday) to analyze the correlation when forecasting the daily water demand of Hong Kong. Hutton and Kapelan [7] diagnosed and reduced the forecasting error using the repeated Bayesian likelihood model, and Do et al. [13] proposed a particle-filter-based model as the statistical model for real-time water demand forecasting [14,15]. In the past, a linear regression model has been widely applied, as it is relatively simple [13]; however, as the changes in water demand are nonlinear and cannot be accurately forecasted with linear regression methods [16], studies have shown that nonlinear regression methods are better than linear regression methods for forecasting the water demand of cities [17]. In addition, Quevedo et al. [18] applied the Seasonal ARIMA (SARIMA) model and the exponential smoothing model using time and daily periods to compare the water demand forecasting results; accordingly, they proved that the exponential smoothing model and the SARIMA model are superior in forecasting the water demand based on time and daily periods, respectively.

Secondly, the following studies have applied ML models. Chang et al. [19] used the Radial Basis Function-Artificial Neural Network (RBF-ANN) model to forecast the water demand, and Braun et al. [16] used the Support Vector Machine (SVM) and SARIMA models. Brentan et al. [20] combined SVM with an adaptive Fourier series to forecast the water demand and showed that this produced better forecasting results than the SVM model alone. Candelieri [21] combined SVM with a clustering technique to forecast the water demand in Milan, Italy. An Artificial Neural Network (ANN) model [22,23,24,25,26,27], Random Forest model [28], Extreme Learning Machine model [17], and Multi Evolutionary ANN model [26], which are all ML models, have been reported to be superior to statistical models.

Thirdly, the studies using Hybrid models that combine the statistical model and the ML model are as follows. Rangel et al. [29] used the concept of daily water consumption pattern predicted based on Nearest Neighbor (NN) node estimation, which is a non-parametric method; Cheifetz et al. [30] estimated the water demand pattern for each day of the week using the hourly water consumption data based on the Fourier regression mixture model. Particularly, Farias et al. [31] applied the NN classification, which is an ML model, and a calendar effect based on quantitative and qualitative information, and proposed the Qualitative Multi-Model Predictor Plus (QMMP+) model for estimating water demand patterns based on the moving average (MA). Here, better results were produced when the total daily water demand was forecasted using the SARIMA model applied with a sliding window [26], whereas the hourly water demand was distributed with the NN model according to a calendar effect for daily patterns and compared with the ANN model.

Lastly, recent studies using the DNN model focused on water demand forecasting [32,33], power demand forecasting [34,35,36], tourism flow forecasting [37], airline demand forecasting [38], and sales demand forecasting [39]. In particular, Li and Cao [37] reported that the forecast accuracy of a Long Short-Term Memory (LSTM) model [40] is higher than that of the ARIMA model for forecasting tourism flows.

Therefore, this study selected ARIMA as a statistical model, RBF-ANN as an ANN model, QMMP+ as a hybrid model, and LSTM as a DNN model for short-term water demand forecasting. The input data were the hourly water consumption data collected by type of water use at the SWG demonstration plant in block 112 located in YeongJong Island, Incheon. To assess the forecasting performance of each model, the forecasted values were compared with the observed values over 24 h (one day). The results of this study can be used as reference data when applying a short-term water demand forecasting model to an actual water supply system equipped with an SWG.

2. Study Process and Methodology

2.1. Study Process

The process of assessing the performance of a short-term water demand forecasting model can be divided into three parts: (a) preprocessing of the data that are collected from the customers' AMI sensors in a District Metered Area (DMA) of a water supply system and transferred to the integrated server, (b) hourly water demand forecasting using the preprocessed water consumption data, and (c) performance assessment of the models. This process is illustrated in Figure 1.

Similar to most time-series data, the water consumption data transferred from AMI sensors contain missing values due to a communication error or AMI measurement error in addition to error values that substantially differ from the observed values or other samples [41]. Particularly, missing values often occur in big data when a large number of variables have correlations with respect to time [42]. However, despite considerable research on the imputation of missing values, only a few methods have been proposed for accurately processing the missing values in the big data received from AMI sensors. Rahman et al. [42] suggested an imputation method for big data in which the lagged k-NN and fast Fourier transform are combined; this method demonstrated a high accuracy when the missing values were imputed in the diabetes datasets. Particularly, this method was applied in this study, as it can impute the data omitted for at least one day and is less biased.

First, to find outliers, local minima in the accumulated water consumption data and values exceeding the maximum discharge capacity per unit time for the pipe diameter are identified and designated as missing. To impute the missing values, the lagged k-NN is applied to find the lag time and, from the training dataset, the k-NN estimate; then, to capture the water consumption pattern within the variable, a Fourier transform approximation is computed, and the estimates of the two methods are averaged.
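The outlier-flagging step can be illustrated with a minimal sketch. It assumes that a "local minimum" means a drop in the cumulative reading, which a working meter should never show, and that the capacity limit is a single hourly value per meter. The function name `hourly_consumption`, the `max_flow_per_hour` parameter, and the sample readings are hypothetical; the imputation itself (lagged k-NN averaged with a Fourier approximation) is not reproduced here.

```python
from typing import List, Optional


def hourly_consumption(cumulative: List[float],
                       max_flow_per_hour: float) -> List[Optional[float]]:
    """Convert cumulative meter readings into hourly consumption.

    Values that look erroneous are returned as None (missing) so that a
    later imputation step can fill them in.
    """
    hourly: List[Optional[float]] = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        diff = curr - prev
        # Assumption: a decreasing cumulative reading or a jump above the
        # pipe's hourly capacity is treated as an outlier.
        if diff < 0 or diff > max_flow_per_hour:
            hourly.append(None)
        else:
            hourly.append(diff)
    return hourly


# Hypothetical cumulative readings (m³) at one-hour intervals.
readings = [100.0, 100.4, 100.9, 100.2, 101.0, 131.0, 131.3]
flags = hourly_consumption(readings, max_flow_per_hour=5.0)
missing_rate = sum(v is None for v in flags) / len(flags)
```

In this toy series, the drop from 100.9 to 100.2 and the 30 m³ jump are both flagged, so two of the six hourly values are marked as missing.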

The water consumption dataset is divided into training and validation datasets for forecasting water demand. In general, the training dataset comprises 70–90% of all observed values; in this study, of the one year of water consumption data measured at one-hour intervals, the first 70% in time order were used as the training dataset and the remaining 30% as the validation dataset [31]. The water consumption data are standardized to reduce bias in demand forecasting. After each water demand forecasting model is configured, its parameters are estimated using the training dataset (70%). Then, if the estimated parameters satisfy the simulation conditions, the water demand 24 h ahead is forecasted using the validation dataset (30%). To assess the performance of each model, the residual, Root Mean Square Error (RMSE), Normalized Root Mean Square Error (NRMSE), Nash–Sutcliffe Efficiency (NSE), and Pearson Correlation Coefficient (PCC) are calculated by comparing the forecasts with the observed data.
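A minimal sketch of the chronological 70/30 split and the standardization step follows. It assumes z-score standardization with the mean and standard deviation taken from the training portion only; the text does not state which statistics were used, so this is an illustrative choice. The function names and the synthetic series are hypothetical.

```python
import statistics


def chronological_split(series, train_fraction=0.7):
    """Split a time series in time order, without shuffling."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def standardize(values, mean, std):
    """Z-score standardization using the supplied mean and standard deviation."""
    return [(v - mean) / std for v in values]


# Synthetic hourly series with a 24 h cycle (10 days of data).
series = [float(h % 24) for h in range(240)]
train, valid = chronological_split(series)

# Assumption: statistics come from the training set so that no information
# from the validation period leaks into model fitting.
mu = statistics.fmean(train)
sigma = statistics.pstdev(train)
train_z = standardize(train, mu, sigma)
valid_z = standardize(valid, mu, sigma)
```

Splitting by position rather than at random keeps the validation period strictly after the training period, which matches how a 24-h-ahead forecast would be used in operation.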

2.2. Methodology

2.2.1. ARIMA

ARIMA, a stochastic model, was first presented by Box and Jenkins [43] in 1976 and is used for forecasting time-series data in various fields. In the ARIMA(p, d, q) model, the observed time series $y = \langle y_t \rangle_{t=1}^{n}$ is differenced $d$ times and described with Auto-Regressive (AR) and Moving Average (MA) terms, as expressed in Equation (1) [43].

$$(1 - \phi_1 L - \dots - \phi_p L^p)(1 - L)^d y_t = c + (1 + \theta_1 L + \dots + \theta_q L^q) e_t, \qquad (1)$$

where $\phi_1, \dots, \phi_p$ are the AR(p) coefficients to be estimated, which weight the observations of past periods; $p$ is the order of the AR terms; $\theta_1, \dots, \theta_q$ are the MA(q) coefficients; $q$ is the order of the MA terms; $c$ is a constant; $L$ is the backshift operator (e.g., $L y_t = y_{t-1}$); $d$ is the order of non-seasonal differencing; and $e_t$ is white noise, an error term distributed as $N(0, \sigma^2)$.

The AutoCorrelation Function (ACF) and the Partial AutoCorrelation Function (PACF) are used to determine the orders p, d, and q for model estimation [44]. To fit the ARIMA model with the optimal orders, the “estimate” function was used, and the water demand was forecasted with the “forecast” function in the code (see Supplementary Materials) (Matlab R2020b, MathWorks Inc., Natick, MA, USA).
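To make the roles of the backshift operator and the differencing order concrete, the following sketch fits an ARIMA(1, 1, 0) model by ordinary least squares and integrates the forecast back to the original scale. This is a simplified stand-in and not the MATLAB estimation used in the study; the function names and the toy series are hypothetical.

```python
def difference(series):
    """Apply (1 - L) once: x_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(x):
    """Least-squares estimate of c and phi in x_t = c + phi * x_{t-1} + e_t."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_arima_110(series, steps):
    """Forecast with ARIMA(1, 1, 0): AR(1) on the differenced series."""
    dx = difference(series)
    c, phi = fit_ar1(dx)
    level, last_dx = series[-1], dx[-1]
    forecasts = []
    for _ in range(steps):
        last_dx = c + phi * last_dx   # AR(1) step on the differences
        level += last_dx              # undo the differencing (integrate)
        forecasts.append(level)
    return forecasts


# Toy series whose differences follow x_t = 1 + 0.5 * x_{t-1} exactly.
toy = [0.0, 4.0, 7.0, 9.5, 11.75, 13.875]
c_hat, phi_hat = fit_ar1(difference(toy))
next_values = forecast_arima_110(toy, steps=3)
```

Because a pure ARIMA(p, d, q) model has no seasonal term, forecasts of this kind settle toward a trend and cannot reproduce a 24 h cycle on their own.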

2.2.2. RBF-ANN

ANNs are used in various fields such as monitoring, control, classification, and forecasting. Among ANN models, the RBF-ANN model is very powerful for numerical calculations and learns quickly because its weights need not be updated through repeated computation; it is therefore more robust and adaptable than other ANN models [45]. In the RBF-ANN model, the value $h_t$ of the output layer is given by Equation (2) [45,46].

$$h_t = \sum_{n=1}^{N} w_{t,n} G_n + b_t = \sum_{n=1}^{N} w_{t,n} \exp\left[-\frac{(y - \mu_n)^2}{2\sigma_n^2}\right] + b_t, \qquad (2)$$

where $N$ is the number of hidden-layer nodes, $y = \langle y_t \rangle_{t=1}^{n}$ is the input data, $w_{t,n}$ is the synaptic weight connecting the $n$th hidden node and the output layer, $G_n$ is the Gaussian function of the hidden layer, $b_t$ is a bias, and $\mu_n$ and $\sigma_n$ are the center vector and radius of the $n$th hidden node, respectively. In this study, the hidden layer of the RBF-ANN model had 92 Gaussian neurons with $\sigma = 1$, and the “train” function of the MATLAB Neural Network Toolbox package (Matlab R2020b, MathWorks Inc., Natick, MA, USA) was used for weight estimation and updating in the code.
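The output-layer computation of an RBF network can be sketched as below, assuming a scalar input and a Gaussian basis with a negative exponent, which is the standard form. The function name `rbf_output` and the example centers and weights are hypothetical; the MATLAB training procedure used in the study is not reproduced.

```python
import math


def rbf_output(y, centers, sigmas, weights, bias):
    """Weighted sum of Gaussian basis functions plus a bias term."""
    total = bias
    for mu, sigma, w in zip(centers, sigmas, weights):
        # Gaussian activation: 1 at the center, decaying with distance.
        g = math.exp(-((y - mu) ** 2) / (2.0 * sigma ** 2))
        total += w * g
    return total


# Two hypothetical hidden nodes with radius 1.
centers = [0.0, 2.0]
sigmas = [1.0, 1.0]
weights = [1.5, -0.5]
value_at_center = rbf_output(0.0, centers, sigmas, weights, bias=0.2)
```

Each hidden node responds most strongly when the input is close to its center, so the output weights only need to combine these local responses linearly.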

2.2.3. QMMP+

Lopez Farias et al. [31,47] proposed the QMMP+ model, a nonlinear time-series model, for water demand forecasting. The time-series data of water consumption typically have a cyclic consumption pattern every 24 h, excluding holidays [47]. Therefore, the time-series data of water consumption can be decomposed as in Equations (3) and (4) into a quantitative component ($Z_t$, the total consumption over a period) and a qualitative component ($X_t$, the normalized consumption pattern within that period) [31].

$$Z_t = \sum_{k = \tau(T-1)+1}^{T\tau} y_k, \qquad (3)$$

$$X_t = \frac{\{y_k\}_{k = \tau(T-1)+1}^{T\tau}}{Z_t}, \qquad (4)$$

where $\tau$ is the length of the accumulation period (e.g., a day, 24 h) and $T = \mathrm{floor}(t/\tau)$. The daily water consumption pattern to be estimated, $X_{t+1}$, is obtained by applying the calendar effect (week, holiday), the Nearest Neighbor Rule Pattern Estimation (NNRPE), and probabilistic selection [31], based on the nonparametric NN learning algorithm proposed by Kantz and Schreiber [48].

On the other hand, the daily water demand $Z_{t+1}$ can be forecasted using the SARIMA model. The SARIMA $(p, d, q)(P, D, Q)_s$ model adds seasonality to the ARIMA model and can be expressed as Equation (5).

$$\phi_p(L)\,\Phi_P(L^s)\,(1 - L)^d (1 - L^s)^D Z_t = c + \theta_q(L)\,\Theta_Q(L^s)\, e_t, \qquad (5)$$

where $\phi_p(L) = 1 - \phi_1 L - \dots - \phi_p L^p$, $\Phi_P(L^s) = 1 - \Phi_1 L^{s} - \dots - \Phi_P L^{Ps}$, $\theta_q(L) = 1 + \theta_1 L + \dots + \theta_q L^q$, and $\Theta_Q(L^s) = 1 + \Theta_1 L^{s} + \dots + \Theta_Q L^{Qs}$. $P$, $D$, and $Q$ refer to the orders of the Seasonal Auto Regressive (SAR) term, the seasonal difference term, and the Seasonal Moving Average (SMA) term, respectively, whereas $s$ refers to the length of the seasonal period.

Considering the cyclical characteristics, $s$ was set to 7 (seven days), and holidays were reflected according to a calendar effect. Water demand can be forecasted by multiplying the estimated 24-h-ahead hourly pattern ($X_{t+1}$) by the daily water consumption $Z_{t+1}$. The QMMP+ model used in this study adopted the MATLAB code proposed by Lopez Farias et al. [31].
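The decomposition into a daily total and a normalized within-day pattern, and the final multiplication that produces the hourly forecast, can be sketched as follows. The functions `decompose` and `recombine` are hypothetical names, a period of 4 is used instead of 24 to keep the example short, and the sketch assumes every period has a positive total. The forecasting of the next total (SARIMA) and the next pattern (nearest-neighbor selection) is not shown.

```python
def decompose(hourly, period=24):
    """Split a series into per-period totals (Z) and unit-sum patterns (X)."""
    totals, patterns = [], []
    for start in range(0, len(hourly) - period + 1, period):
        block = hourly[start:start + period]
        z = sum(block)                           # quantitative component
        totals.append(z)
        patterns.append([v / z for v in block])  # qualitative component
    return totals, patterns


def recombine(z_next, x_next):
    """Hourly forecast = forecasted total times forecasted pattern."""
    return [z_next * share for share in x_next]


# Two toy "days" of four readings each; the second day doubles the first.
hourly = [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0]
totals, patterns = decompose(hourly, period=4)

# Suppose the next total is forecast as 30 and the pattern is reused.
forecast = recombine(30.0, patterns[-1])
```

Both toy days share the same pattern even though their totals differ, which is the property that lets the two components be forecast by separate models.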

2.2.4. LSTM

A Recurrent Neural Network (RNN) model, which is a type of Deep Neural Network (DNN) model, applies multiple hidden layers and the backpropagation algorithm to update its weights [49]. However, RNN models suffer from the vanishing gradient problem [50]: as the input training sequence becomes longer, the gradient shrinks during the backpropagation step, so the weights and biases are barely changed. In 1997, Hochreiter and Schmidhuber [40] introduced the LSTM model, which solved the long-term dependency problem [51], in which the information in the hidden layer is not transferred to the last layer because of the vanishing gradient in an RNN model. The model is used in various fields owing to its excellent performance. Similar to RNN models, the LSTM model is a type of DNN model with a recurrent structure in which hidden nodes are connected by directed edges. It is designed to be trained by reflecting past information in the present, regardless of the length of the sequential data. Unlike in RNNs, however, each module of the repeated chain structure consists of a memory cell state that can select information and three gates (forget, input, and output). Each gate passes the information received from the previous cell and the information input to the current layer through a sigmoid layer, which outputs a value between 0 and 1, and multiplies the result element-wise with the corresponding vector. A gate value near 0 blocks the information, whereas a value near 1 delivers it to the memory cell state.

The weight matrices that can be trained in the LSTM layer are the input weight ($w$), the recurrent weight ($r$), and the bias ($b$). The mathematical form of each gate can be expressed as Equations (6)–(9) [40].

$$f_t = \sigma(w_f y_t + r_f h_{t-1} + b_f), \qquad (6)$$

$$i_t = \sigma(w_i y_t + r_i h_{t-1} + b_i), \qquad (7)$$

$$g_t = \tanh(w_g y_t + r_g h_{t-1} + b_g), \qquad (8)$$

$$o_t = \sigma(w_o y_t + r_o h_{t-1} + b_o), \qquad (9)$$

where $f$, $i$, $g$, and $o$ denote the forget gate, input gate, cell candidate, and output gate, respectively, at time step $t$. $y_t$ represents the original time-series data, $\sigma$ is the sigmoid function $\sigma(y) = (1 + e^{-y})^{-1}$, and $\tanh$ is the hyperbolic tangent function $\tanh(y) = \frac{2}{1 + e^{-2y}} - 1$ (ranging from −1 to 1).

The cell state ($c_t$) and output ($h_t$) updated from each gate can be expressed as Equations (10) and (11) [40].

$$c_t = f_t \odot c_{t-1} + i_t \odot g_t, \qquad (10)$$

$$h_t = o_t \odot \tanh(c_t), \qquad (11)$$

where $\odot$ is the Hadamard product, i.e., element-wise multiplication. The last layer of the training process is the regression layer, in which the loss function $E$ is defined as the half-mean-squared error of the forecasted responses and can be expressed as Equation (12).

$$E = \frac{1}{2} \sum_{t=1}^{T} (h_t^{*} - y_t)^2, \qquad (12)$$

where $h_t^{*}$ is the forecasted value, $y_t$ is the corresponding observed value, and $E$ quantifies the deviation between the forecasted and observed values.
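Equations (6)–(12) can be traced through a single time step with the minimal sketch below. It assumes scalar inputs and states, so the element-wise product reduces to ordinary multiplication. The parameter dictionary keys (`wf`, `rf`, `bf`, and so on) and the function names are hypothetical and do not correspond to the MATLAB layer implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(y_t, h_prev, c_prev, p):
    """One scalar LSTM step: gates, then cell state and output."""
    f = sigmoid(p["wf"] * y_t + p["rf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * y_t + p["ri"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * y_t + p["rg"] * h_prev + p["bg"])  # cell candidate
    o = sigmoid(p["wo"] * y_t + p["ro"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g       # keep part of the old state, add new content
    h = o * math.tanh(c)         # expose a gated view of the cell state
    return h, c


def half_squared_error(forecast, observed):
    """Loss in the form of Equation (12): half the sum of squared deviations."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(forecast, observed))


# With all parameters at zero, every sigmoid gate equals 0.5 and g equals 0.
zero_params = {k: 0.0 for k in
               ["wf", "rf", "bf", "wi", "ri", "bi",
                "wg", "rg", "bg", "wo", "ro", "bo"]}
h1, c1 = lstm_step(1.0, 0.0, 2.0, zero_params)
```

In the zero-parameter case the cell state is simply halved, which shows that the gates take continuous values in (0, 1) rather than acting as hard on/off switches.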

During training, the loss function is reduced by backpropagation: the gradient of the loss with respect to the weights of each gate is computed and used to update them. As the optimizer of the loss function, Adam was used, which combines momentum with adaptive per-parameter learning rates in the style of AdaGrad, and can be expressed as Equation (13) [52].

$$w_l = w_{l-1} - \eta \frac{\hat{m}_l}{\sqrt{\hat{v}_l} + e}, \qquad (13)$$

where $w_l$ is the weight of the model at iteration step $l$, $\eta$ is the learning rate, and $e$ is a small constant for which a value of $10^{-8}$ is commonly used. Furthermore, $\hat{m}_l = \beta_1 m_{l-1} + (1 - \beta_1) \nabla E(w_l)$ and $\hat{v}_l = \beta_2 v_{l-1} + (1 - \beta_2) [\nabla E(w_l)]^2$ are the moving averages of the gradient and of the squared gradient, respectively; the decay rates $\beta_1$ and $\beta_2$ are 0.9 and 0.999, respectively [52].
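The Adam update can be sketched on a one-dimensional problem as follows. The sketch follows the standard formulation of Kingma and Ba [52], including the bias-correction step that divides the moving averages by $1 - \beta^l$; the quadratic objective, the learning rate of 0.1, and the function name `adam_minimize` are illustrative assumptions and are not settings taken from the study.

```python
import math


def adam_minimize(grad, w0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a scalar function given its gradient, using Adam."""
    w, m, v = w0, 0.0, 0.0
    for step in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # moving average of gradient
        v = beta2 * v + (1.0 - beta2) * g * g    # moving average of squared gradient
        m_hat = m / (1.0 - beta1 ** step)        # bias correction
        v_hat = v / (1.0 - beta2 ** step)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy objective E(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Dividing by the root of the squared-gradient average makes the step size roughly independent of the gradient's scale, which is why the learning rate mainly controls how far each update moves.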

To construct the LSTM model, the fully connected layer functions of the MATLAB Deep Learning Toolbox package (MATLAB R2020b, MathWorks Inc., Natick, MA, USA) were used. The initial learning rate ($\eta$), the number of hidden units ($d$), and the maximum number of epochs (iterations, $l$) were designated as the hyperparameters. If these values are too large, the computation time increases and the model may overfit the training data. On the other hand, if the values are too small, training may not capture the pattern of the observed values [53]. Therefore, the number of epochs was set to the point beyond which the loss function $E$ no longer decreased.

2.3. Performance Assessment

To assess the short-term water demand forecasts of each model, the residual, RMSE, NRMSE, NSE, and PCC were selected as the assessment indices in this study (Equations (14)–(18)).

$$\mathrm{Residual} = \sum_{t=1}^{T} (q_t - y_t), \qquad (14)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (q_t - y_t)^2}, \qquad (15)$$

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{y_{t\_max} - y_{t\_min}} \times 100, \qquad (16)$$

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T} (q_t - y_t)^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}, \qquad (17)$$

$$\mathrm{PCC} = \frac{\sum_{t=1}^{T} (q_t - \bar{q})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T} (q_t - \bar{q})^2} \sqrt{\sum_{t=1}^{T} (y_t - \bar{y})^2}}, \qquad (18)$$

where $T$ denotes the length of time (24 h), $q_t$ is the forecasted result, $\bar{q}$ is the mean of the forecasted result, $y_t$ is the observed water consumption, and $\bar{y}$ is the mean of the observed hourly water consumption. $y_{t\_max}$ and $y_{t\_min}$ are the maximum and minimum observed values, respectively.

The residual represents the difference between the forecasted and observed daily water demand and can be used to determine whether the forecast is underestimated or overestimated. The RMSE is a typical measure for comparing the observed and forecasted values and expresses the error in the units of the actual water consumption. The NRMSE is a percentage error suitable for comparing forecasts across series of different magnitudes. However, the RMSE is strongly influenced by the magnitude of the observed values, as it is a scale-dependent error [54]. Therefore, NSE and PCC were used as additional assessment indices. NSE is a coefficient often used in hydrology for verifying the accuracy of forecasted values via a comparison with the observed values. A negative value indicates that the forecast is worse than simply using the mean of the observed values, whereas a value closer to 1 indicates a higher forecasting accuracy [55]. PCC measures the strength of the linear correlation between two variables; a value closer to 1 indicates a stronger linear relationship and thus that the observed and forecasted values are consistent [56].
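The five indices in Equations (14)–(18) can be computed directly, as in the sketch below. The function names are hypothetical, and the four-point series is a toy example rather than study data.

```python
import math


def residual(q, y):
    """Sum of forecast minus observation; positive means overestimation."""
    return sum(qi - yi for qi, yi in zip(q, y))


def rmse(q, y):
    return math.sqrt(sum((qi - yi) ** 2 for qi, yi in zip(q, y)) / len(y))


def nrmse(q, y):
    """RMSE as a percentage of the observed range."""
    return rmse(q, y) / (max(y) - min(y)) * 100.0


def nse(q, y):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    y_bar = sum(y) / len(y)
    num = sum((qi - yi) ** 2 for qi, yi in zip(q, y))
    den = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - num / den


def pcc(q, y):
    """Pearson correlation between forecast and observation."""
    q_bar, y_bar = sum(q) / len(q), sum(y) / len(y)
    num = sum((qi - q_bar) * (yi - y_bar) for qi, yi in zip(q, y))
    den = (math.sqrt(sum((qi - q_bar) ** 2 for qi in q))
           * math.sqrt(sum((yi - y_bar) ** 2 for yi in y)))
    return num / den


observed = [1.0, 2.0, 3.0, 4.0]
forecast = [2.0, 2.0, 2.0, 4.0]
scores = {
    "residual": residual(forecast, observed),
    "rmse": rmse(forecast, observed),
    "nrmse": nrmse(forecast, observed),
    "nse": nse(forecast, observed),
    "pcc": pcc(forecast, observed),
}
```

In this toy case the residual is zero even though individual hours are wrong, which illustrates why the daily residual is reported alongside hourly error measures rather than instead of them.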

3. Study Area and Dataset Description

3.1. Study Area

The study area is block 112 (Unbuk-dong, Unseo-dong) in YeongJong Island, where Incheon International Airport, a hub of Northeast Asia, is located. Water is supplied to this location through a single submarine pipeline, making the site well suited for testing responses to water crises. The total area of the block is 17.41 km², and the population served is approximately 17,000. The block has 958 customers in total, as well as a service reservoir and a pumping station. The SWGRG [1] installed ultrasonic smart water meters for the 527 customers in the block who agreed to the installation, the first such deployment in Korea, and operated the SWG demonstration plant from 2017 to 2019 (Figure 2). The pipe diameters fall into eight types ranging from 15 mm to 100 mm. Specifically, the number of 15-mm diameter household-use pipes is 279, which is approximately 52.9% of the total number of water consumers, whereas that of 20-mm diameter pipes is 12.5%, constituting 68.9% of the total number of water consumers (527). The total length of the pipe conduits is approximately 55 km, and the daily water consumption is approximately 8000 m³. The accumulated water consumption data read by the AMI sensors are transferred at one-hour intervals, in real time, to the server of the Water Operation Center of Incheon Waterworks Headquarters.

3.2. Dataset Description

As the dataset for forecasting water demand, the accumulated water consumption data received from the 527 ultrasonic-wave-type AMI sensors in block 112 of YeongJong Island from 00:00 on 1 January 2018 to 00:00 on 1 January 2019 were used. The accumulated water consumption data are updated every hour and transferred to the central server; as shown in Figure 3, the data exhibit a (a) stair-shaped pattern or a (c) nonlinear pattern and contain missing values or erroneous data. The data in (a) and (c) can be represented as the water consumption at one-hour intervals as shown in (b) and (d), which can be used as the input data for forecasting water demand. Figure 3b,d show seasonality, where the amount of water consumption is the highest in summer (August).

To discern outliers in the annual water consumption data (one-hour intervals) of the 527 AMIs, local minima and values exceeding the maximum discharge capacity for the pipe diameter are designated as missing values. Across individual AMIs, the number of missing values ranged from a minimum of 77 (0.88%) to a maximum of 8119 (92.68%); the overall average missing rate of the 527 AMIs was 5.40% (249,276 values) (Figure 4). Water demand forecasting may be affected if the missing rate is high owing to errors in the data received during the actual operation of the SWG demonstration plant. Therefore, the problem of missing values must be resolved for the proper management of a water supply system.
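The reported missing rates are consistent with one reading per hour over the 365 days of 2018, as the short check below shows. The variable names are illustrative; the counts are those stated in the text.

```python
# One reading per hour for one non-leap year, per AMI sensor.
hours_per_year = 365 * 24
n_sensors = 527

min_rate = 77 / hours_per_year * 100            # least-affected sensor
max_rate = 8119 / hours_per_year * 100          # most-affected sensor
overall_rate = 249_276 / (n_sensors * hours_per_year) * 100

rates = (round(min_rate, 2), round(max_rate, 2), round(overall_rate, 2))
```

Each sensor can contribute at most 8760 hourly values, so the overall rate is taken over 527 × 8760 possible readings.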

To minimize the influence of errors on the simulation results, 10 AMIs with a missing rate of 10% or less were selected for forecasting water demand. Moreover, for comparison with the forecasted values, the 24-h observed values of the validation dataset contained no missing data; the pipe diameters ranged from 15 mm to 32 mm, and the purposes of use included domestic, restaurant, church, supermarket, senior community center, and laundry. Table 1 lists the details of the AMI dataset. The Total Water Demand (TWD) being forecasted ranges from a minimum of 0.634 m³/day to a maximum of 12.206 m³/day. Generally, water consumption increases with the pipe diameter, and restaurants, churches, and laundries consumed more water than households.

Furthermore, the water demand needs to be forecasted per hour within the DMA for the operation of a distribution reservoir of a water supply system. Hence, the missing values in the hourly dataset of the 527 AMIs are imputed using the lagged k-NN and fast Fourier transform, and the summed (a) daily water demand (year) and (b) daily average water demand per hour (24 h) of the distribution reservoir are shown in Figure 5. The water consumption over the 24 h to be forecasted is 5108 m³/day, and the average water demand shown in (b) exhibits a 24-h periodicity.

4. Results and Discussion

4.1. AMIs Water Demand Forecasting

For 10 AMIs selected based on the purpose of use (households, churches, supermarkets, daycare centers, senior community centers, restaurants, etc.) and pipe diameter (15 mm–32 mm), the results of forecasting water demand 24 h ahead by using the ARIMA, RBF-ANN, QMMP+, and LSTM models are illustrated in Figure 6.

The pattern of water consumption reflects the characteristics of use. In particular, restaurants (a) have a higher amount of water consumption at night than during the day, whereas the amount of water consumption at night is extremely low for laundries (e), supermarkets (f), and senior community centers (i). Furthermore, the amount of water consumption in households (b), (c), and (j) is large, especially during breakfast and dinner times. In terms of water consumption patterns, the observed values cannot be accurately forecasted due to various factors (weather, day, consumer characteristics, etc.), whereas the forecasted values are generally similar to the observation patterns when visually inspected, excluding the ARIMA model, which cannot reflect the periodic nature. However, it is difficult to forecast the peak value of the observed water consumption.

Daily water demand forecasting is also important in terms of stably supplying water in addition to water consumption patterns; the differences in the forecasted daily water demand of each AMI are presented in Table 2 by model. Among the 10 AMIs, two ARIMA models (110013044, 110016799), one RBF-ANN model (110013629), two QMMP+ models (110013012, 110016860), and four LSTM models (110013004, 110013074, 110016389, 110018932) produced the results that are most similar to the actual water consumption but the results were underestimated (Table 2). Particularly, the ARIMA model cannot reflect the periodic characteristics but can estimate the daily water demand.

A smaller RMSE indicates a forecast closer to the observed values, and the LSTM model produced better forecasts than the other models (Table 3); that is, its forecasted hourly water demand was the most similar to the observed values. For the NRMSE, which is the RMSE normalized by the range of the observed values (Equation (16)), the LSTM model likewise showed the best forecasting results. The NRMSE was 27.58% for the ARIMA model, and 23.25% and 17.18% for the RBF-ANN and QMMP+ models, respectively (Table 4).

The forecast accuracy is higher as NSE is closer to 1. Except for the LSTM model, negative values were observed in the AMIs of eight ARIMA models, three RBF-ANN models, and five QMMP+ models, thus exhibiting a poor forecast accuracy. On average, the NSE of the ARIMA model was 0.11, whereas the corresponding values of the RBF-ANN, QMMP+, and LSTM models were 0.34, 0.34, and 0.61, respectively, thus exhibiting a low forecast accuracy (Table 5).

According to the PCC calculation results, the average value was 0.37 for the ARIMA model, 0.64 for the RBF-ANN model, 0.68 for the QMMP+ model, and 0.79 for the LSTM model; thus, none of the models exhibited a high level of correlation above 0.9 (Table 6).
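The indices discussed above can be sketched for one pair of observed and forecasted hourly series as follows. This is a minimal illustration, not the authors' implementation: the function name `forecast_metrics` is hypothetical, the residual is assumed to be the forecasted minus the observed daily total, and the NRMSE follows the description in the text (RMSE divided by the standard deviation of the observations, here the population standard deviation, which is an assumption).

```python
import numpy as np


def forecast_metrics(observed, forecast):
    """Return residual, RMSE, NRMSE (%), NSE, and PCC for two equal-length series."""
    obs = np.asarray(observed, dtype=float)
    fc = np.asarray(forecast, dtype=float)
    err = fc - obs

    # Residual: difference between forecasted and observed totals (assumed sign).
    residual = float(fc.sum() - obs.sum())
    # RMSE: root of the mean squared hourly error.
    rmse = float(np.sqrt(np.mean(err ** 2)))
    # NRMSE: RMSE normalized by the standard deviation of the observations.
    nrmse_pct = 100.0 * rmse / float(np.std(obs))
    # NSE: 1 minus the squared error relative to the variance around the mean.
    nse = 1.0 - float(np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2))
    # PCC: linear correlation between observed and forecasted values.
    pcc = float(np.corrcoef(obs, fc)[0, 1])

    return {"residual": residual, "rmse": rmse,
            "nrmse_pct": nrmse_pct, "nse": nse, "pcc": pcc}


perfect = forecast_metrics([1, 2, 3, 4], [1, 2, 3, 4])
reversed_order = forecast_metrics([1, 2, 3, 4], [4, 3, 2, 1])
```

A perfect forecast gives an NSE of 1, whereas a forecast that reverses the observed ordering gives an NSE below zero. This is how the negative entries in Table 5 should be read: the model performs worse than simply predicting the mean of the observations.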

In terms of the short-term water demand forecasting performance of the models for 10 AMIs in block 112 of YeongJong Island, the LSTM model demonstrated the best performance, but there was a limitation in estimating the peak amount of water consumption while the daily water consumption was underestimated. In particular, the NSE had a poor forecasting accuracy due to negative values in most of the models, excluding the LSTM model, and none of the models had a correlation above 0.9 in the PCC analysis. It can be inferred that the life cycle of the consumers could not be reflected accurately as the water demand forecasting is solely based on time and the amount of water consumption. Furthermore, weather conditions and consumer events influenced the results in addition to missing values and erroneous values when collecting the data of water consumption from AMIs.

4.2. Total Water Demand Forecasting

The results of water demand forecasting 24 h ahead for the operation of the DMA distribution reservoir are illustrated in Figure 7. A distribution reservoir supplies water from a purification plant through pumps; thus, the electricity cost can be reduced by controlling the supply time based on the forecasted water demand. Similar to the results for the 10 AMIs, the forecasts reflect the periodic characteristics adequately and resemble the observed values when inspected visually, excluding the ARIMA model. The observed daily water consumption is 5108 m³. The forecasted values were 5275 m³ for the LSTM model, 5607 m³ for the QMMP+ model, 5503 m³ for the RBF-ANN model, and 5433 m³ for the ARIMA model, all of which are overestimated, unlike the forecasts for individual AMIs. However, the LSTM model had the smallest difference of 167.1 m³ from the observed value (Table 7). Moreover, in terms of the assessment factors RMSE, NRMSE, NSE, and PCC, the LSTM model produced better results than the other models. However, the NSE and PCC of the LSTM model were 0.61 and 0.79, respectively, indicating a poor forecasting accuracy and correlation. Based on the above results, it can be inferred that there is a limitation in forecasting water demand based only on time and the amount of water consumption. Recently, water demand forecasting models with probabilistic components have become more important for accurately forecasting water demand [57]. In particular, in the SWG environment, information that can differentiate the characteristics of consumers can be obtained in addition to direct information, such as the amount of water consumption and usage time. Therefore, forecasting methods reflecting the propensity and characteristics of consumers can be considered. Specifically, a water supply system can be managed more precisely if various factors, such as water usage time, the amount of water consumption, purpose, the number of household members, occupation, weather (precipitation, humidity, temperature), and customer behavior, are applied for forecasting the water demand.
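The daily totals quoted above can be checked with a short calculation. The sketch below uses the rounded totals given in the text, so the differences agree with the residuals in Table 7 only to within rounding; the variable and function names are illustrative.

```python
# Observed and forecasted daily totals for block 112 (m³/day), as quoted in the text.
OBSERVED_M3 = 5108
FORECAST_M3 = {"ARIMA": 5433, "RBF-ANN": 5503, "QMMP+": 5607, "LSTM": 5275}


def daily_residuals(observed, forecasts):
    """Forecast minus observation for each model; positive means overestimation."""
    return {model: value - observed for model, value in forecasts.items()}


residuals = daily_residuals(OBSERVED_M3, FORECAST_M3)
# Residual expressed as a percentage of the observed daily total.
relative_pct = {m: 100.0 * r / OBSERVED_M3 for m, r in residuals.items()}
# Model with the smallest absolute difference from the observation.
closest = min(residuals, key=lambda m: abs(residuals[m]))
```

All four residuals are positive, which corresponds to the overestimation noted in the text, and the smallest one belongs to the LSTM model.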

5. Conclusions

The SWG technology, which was introduced for the stable and efficient management of water resources in cities, enables the real-time monitoring and collection of data on water consumption using bilateral communication and AMI sensors. Short-term water demand forecasting is required for reducing energy and operating costs when managing water supply pipe networks. The water consumption data received from the AMIs can be used as basic data for forecasting the water demand. Despite the large number of short-term water demand forecasting models, there is a lack of research on assessing their performance at the customer level for the operation of an SWG. Therefore, this study used the hourly water consumption data collected by installing 527 ultrasonic-wave-type AMI sensors for the customers of block 112 in YeongJong Island, Incheon, and operating the SWG demonstration plant. The water consumption data of 10 customers' AMIs, categorized according to the purpose and pipe diameter, and the total water consumption data of the 527 AMIs, which are used for operating a distribution reservoir, were used for short-term water demand forecasting. As the water consumption data are big data containing missing values and errors, which require correction, the lagged k-NN and fast Fourier transform were applied for the imputation of missing values.
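To illustrate the idea behind lag-based nearest-neighbor imputation, the following is a deliberately simplified sketch. It is not the authors' procedure and omits the Fourier component entirely: each missing value is filled with the mean of the values that followed the `k` most similar lag windows elsewhere in the series. The function name `lagged_knn_impute` and the parameters `n_lags` and `k` are assumptions made for this example.

```python
import numpy as np


def lagged_knn_impute(series, n_lags=3, k=3):
    """Fill NaNs using the k positions whose preceding n_lags values are most similar."""
    x = np.asarray(series, dtype=float)
    filled = x.copy()
    n = len(x)

    # Candidate positions: the value and its n_lags predecessors are all observed.
    candidates = [s for s in range(n_lags, n)
                  if not np.isnan(x[s - n_lags:s + 1]).any()]

    for t in np.flatnonzero(np.isnan(x)):
        # Skip gaps without a complete lag window; they stay missing.
        if t < n_lags or np.isnan(x[t - n_lags:t]).any() or not candidates:
            continue
        target = x[t - n_lags:t]
        # Euclidean distance between the target window and each candidate window.
        dists = np.array([np.linalg.norm(x[s - n_lags:s] - target)
                          for s in candidates])
        nearest = np.argsort(dists)[:k]
        filled[t] = np.mean([x[candidates[i]] for i in nearest])
    return filled


# Usage: a synthetic series with a 24 h period and one missing hour.
hours = np.arange(240)
clean = np.sin(2 * np.pi * hours / 24)
gappy = clean.copy()
gappy[100] = np.nan
restored = lagged_knn_impute(gappy, n_lags=3, k=3)
```

On a strictly periodic series, the nearest windows come from the same hour on other days, so the gap is recovered almost exactly; real AMI data with noise and consecutive gaps would require a more careful treatment.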

The ARIMA, RBF-ANN, QMMP+, and LSTM models were selected for the short-term water demand forecasting, and the performance of each model was assessed using the residual, RMSE, NRMSE, NSE, and PCC as indices. Only usage time and water consumption data were used in the water demand forecasting models; other factors that would affect the forecasting results were not included. The simulation was then repeated under the same conditions.

Consequently, the simulation results of each model are similar to those reported in many previous studies. However, the ARIMA, RBF-ANN, QMMP+, and LSTM models are limited in terms of application for the management of a water supply system. The residuals showed that the water demand was underestimated or overestimated, the NSE had negative values, which indicated a low forecasting accuracy, and the PCC was below 0.9, which indicated a low correlation between the forecasted and observed values. In particular, water demand forecasting based only on usage time and the amount of water consumption underestimates or overestimates the forecasted values and fails to reflect the peak water consumption.

Additionally, since only data from 2018 to 2019 were used to forecast water demand, the results may differ if the modeling includes data from 2019 or later. In this study, the ratio of missing values and outliers in the smart water meter data is about 5%, and the forecasting results are expected to improve as the precision of smart meters improves in the future.

However, in an SWG environment, information that can determine the characteristics of consumers can be obtained in addition to the hourly water consumption. Therefore, a water supply system can be managed more precisely if various factors, such as water usage time, the amount of water consumption, purpose, the number of household members, occupation, weather (precipitation, humidity, temperature), and customer behavior, are considered.

Supplementary Materials

The code as supplementary material is found at https://github.com/koo00v/water-demand-forecasting.

Author Contributions

K.-M.K. proposed the methodologies of this research work, simulated and validated the models, and wrote the original draft of the manuscript; K.-H.H. managed the funding; K.-S.J., G.L., J.-S.K., and K.-T.Y. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This Research has been performed as Development of Water and Sewage Innovation Technology Program of ARQ202001302001 supported by Korea Ministry of Environment and the Basic Science Research Program of NRF-2020R1A2C1005554 and NRF-2018R1D1A1B07049352 supported by the National Research Foundation of Korea.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets are restricted and not publicly available.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. MOLIT. Smart Water Grid: Global Leader Korea’s Water Management Technology; SWGRG, MOLIT: Sejong, Korea, 2017.
2. Fikejz, J.; Roleček, J. Proposal of a smart water meter for detecting sudden water leakage. In Proceedings of the 2018 ELEKTRO, Mikulov, Czech Republic, 21–23 May 2018; pp. 1–4.
3. Koo, D.; Piratla, K.; Matthews, C.J. Towards sustainable water supply: Schematic development of big data collection using internet of things (IoT). Procedia Eng. 2015, 118, 489–497.
4. Tiwari, M.K.; Adamowski, J. Urban water demand forecasting and uncertainty assessment using ensemble wavelet-bootstrap-neural network models. Water Resour. Res. 2013, 49, 6486–6507.
5. Seo, Y.; Kwon, S.; Choi, Y. Short-term water demand forecasting model combining variational mode decomposition and extreme learning machine. Hydrology 2018, 5, 54.
6. da Coelho Costa, B. Energy Efficiency of Water Supply Systems Using Optimisation Techniques and Micro-Hydroturbines. Ph.D. Thesis, Universidade de Aveiro, Aveiro, Portugal, 2016.
7. Hutton, C.J.; Kapelan, Z. A probabilistic methodology for quantifying, diagnosing and reducing model structural and predictive errors in short term water demand forecasting. Environ. Model. Softw. 2015, 66, 87–97.
8. Yu, M.-J.; Gu, J.-Y.; Gu, Y.-H.; Kim, S.-G. Forecasting hourly water demand using linear and non-linear model. J. Korean Soc. Environ. Eng. 2004, 26, 277–283.
9. Gargano, R.; Tricarico, C.; Granata, F.; Santopietro, S.; De Marinis, G. Probabilistic Models for the Peak Residential Water Demand. Water 2017, 9, 417.
10. Kofinas, D.; Mellios, N.; Papageorgiou, E.; Laspidou, C. Urban water demand forecasting for the island of Skiathos. Procedia Eng. 2014, 89, 1023–1030.
11. Zhou, S.L.; McMahon, T.A.; Walton, A.; Lewis, J. Forecasting daily urban water demand: A case study of Melbourne. J. Hydrol. 2000, 236, 153–164.
12. Wong, J.S.; Zhang, Q.; Chen, Y.D. Statistical modeling of daily urban water consumption in Hong Kong: Trend, changing patterns, and forecast. Water Resour. Res. 2010, 46, W03506.
13. Do, N.C.; Simpson, A.R.; Deuerlein, J.W.; Piller, O. Particle filter–based model for online estimation of demand multipliers in water distribution systems under uncertainty. J. Water Resour. Plan. Manag. 2017, 143, 04017065.
14. Arandia, E.; Ba, A.; Eck, B.; McKenna, S. Tailoring seasonal time series models to forecast short-term water demand. J. Water Resour. Plan. Manag. 2016, 142, 04015067.
15. Bakker, M.; Vreeburg, J.; Van Schagen, K.; Rietveld, L. A fully adaptive forecasting model for short-term drinking water demand. Environ. Model. Softw. 2013, 48, 141–151.
16. Braun, M.; Bernard, T.; Piller, O.; Sedehizade, F. 24-hours demand forecasting based on SARIMA and support vector machines. Procedia Eng. 2014, 89, 926–933.
17. Mouatadid, S.; Adamowski, J. Using extreme learning machines for short-term urban water demand forecasting. Urban Water J. 2017, 14, 630–638.
18. Quevedo, J.; Saludes, J.; Puig, V.; Blanch, J. Short-term demand forecasting for real-time operational control of the Barcelona water transport network. In Proceedings of the 2014 22nd Mediterranean Conference on Control and Automation, Palermo, Italy, 16–19 June 2014; pp. 990–995.
19. Chang, M.; Liu, J. Water demand prediction model based on radial basis function neural network. In Proceedings of the 2009 First International Conference on Information Science and Engineering, Nanjing, China, 26–28 December 2009; pp. 5295–5298.
20. Brentan, B.M.; Luvizotto, E., Jr.; Herrera, M.; Izquierdo, J.; Pérez-García, R. Hybrid regression model for near real-time urban water demand forecasting. J. Comput. Appl. Math. 2017, 309, 532–541.
21. Candelieri, A. Clustering and support vector regression for water demand forecasting and anomaly detection. Water 2017, 9, 224.
22. Banjac, G.; Vašak, M.; Baotić, M. Adaptable urban water demand prediction system. Water Sci. Technol. Water Supply 2015, 15, 958–964.
23. Bougadis, J.; Adamowski, K.; Diduch, R. Short-term municipal water demand forecasting. Hydrol. Process. Int. J. 2005, 19, 137–148.
24. Cutore, P.; Campisano, A.; Kapelan, Z.; Modica, C.; Savic, D. Probabilistic prediction of urban water consumption using the SCEM-UA algorithm. Urban Water J. 2008, 5, 125–132.
25. Maier, H.R.; Jain, A.; Dandy, G.C.; Sudheer, K.P. Methods used for the development of neural networks for the prediction of water resource variables in river systems: Current status and future directions. Environ. Model. Softw. 2010, 25, 891–909.
26. Romano, M.; Kapelan, Z. Adaptive water demand forecasting for near real-time management of smart water distribution systems. Environ. Model. Softw. 2014, 60, 265–276.
27. Adamowski, J.F. Peak daily water demand forecast modeling using artificial neural networks. J. Water Resour. Plan. Manag. 2008, 134, 119–128.
28. Herrera, M.; Torgo, L.; Izquierdo, J.; Pérez-García, R. Predictive models for forecasting hourly urban water demand. J. Hydrol. 2010, 387, 141–150.
29. Rangel, H.R.; Puig, V.; Farias, R.L.; Flores, J.J. Short-term demand forecast using a bank of neural network models trained using genetic algorithms for the optimal management of drinking water networks. J. Hydroinform. 2017, 19, 1–16.
30. Cheifetz, N.; Noumir, Z.; Samé, A.; Sandraz, A.-C.; Féliers, C.; Heim, V. Modeling and clustering water demand patterns from real-world smart meter data. Drink. Water Eng. Sci. Discuss. 2017, 10, 75–82.
31. Lopez Farias, R.; Puig, V.; Rodriguez Rangel, H.; Flores, J.J. Multi-model prediction for demand forecast in water distribution networks. Energies 2018, 11, 660.
32. Vijai, P.; Sivakumar, P.B. Performance comparison of techniques for water demand forecasting. Procedia Comput. Sci. 2018, 143, 258–266.
33. Xenochristou, M.; Kapelan, Z. An ensemble stacked model with bias correction for improved water demand forecasting. Urban Water J. 2020, 17, 212–223.
34. Bouktif, S.; Fiaz, A.; Ouni, A.; Serhani, M.A. Optimal deep learning lstm model for electric load forecasting using feature selection and genetic algorithm: Comparison with machine learning approaches. Energies 2018, 11, 1636.
35. Kong, W.; Dong, Z.Y.; Jia, Y.; Hill, D.J.; Xu, Y.; Zhang, Y. Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Trans. Smart Grid 2017, 10, 841–851.
36. Wang, Y.; Gan, D.; Sun, M.; Zhang, N.; Lu, Z.; Kang, C. Probabilistic individual load forecasting using pinball loss guided LSTM. Appl. Energy 2019, 235, 10–20.
37. Li, Y.; Cao, H. Prediction for tourism flow based on LSTM neural network. Procedia Comput. Sci. 2018, 129, 277–283.
38. Pan, B.; Yuan, D.; Sun, W.; Liang, C.; Li, D. A Novel LSTM-Based Daily Airline Demand Forecasting Method Using Vertical and Horizontal Time Series. In Proceedings of the 2018 Pacific-Asia Conference on Knowledge Discovery and Data Mining, Melbourne, VIC, Australia, 3–6 June 2018; pp. 168–173.
39. Bandara, K.; Shi, P.; Bergmeir, C.; Hewamalage, H.; Tran, Q.; Seaman, B. Sales demand forecast in e-commerce using a long short-term memory neural network methodology. In Proceedings of the 2019 International Conference on Neural Information Processing, Sydney, NSW, Australia, 12–15 December 2019; pp. 462–474.
40. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
41. Grubbs, F.E. Procedures for detecting outlying observations in samples. Technometrics 1969, 11, 1–21.
42. Rahman, S.A.; Huang, Y.; Claassen, J.; Heintzman, N.; Kleinberg, S. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data. J. Biomed. Inform. 2015, 58, 198–207.
43. Box, G.E.; Jenkins, G.M.; Reinsel, G.C. Time Series Analysis: Forecasting and Control; John Wiley & Sons: Hoboken, NJ, USA, 2015; Volume 734.
44. Chen, P.; Yuan, H.; Shu, X. Forecasting crime using the arima model. In Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Jinan, China, 18–20 October 2008; pp. 627–630.
45. Lee, C.-M.; Ko, C.-N. Time series prediction using RBF neural networks with a nonlinear time-varying evolution PSO algorithm. Neurocomputing 2009, 73, 449–460.
46. Mirbagheri, S.A.; Bagheri, M.; Boudaghpour, S.; Ehteshami, M.; Bagheri, Z. Performance evaluation and modeling of a submerged membrane bioreactor treating combined municipal and industrial wastewater using radial basis function artificial neural networks. J. Environ. Health Sci. Eng. 2015, 13, 17.
47. López Frías, R.; Puig Cayuela, V.; Rodríguez Rangel, H. An implementation of a multi-model predictor based on the qualitative and quantitative decomposition of the time-series. In Proceedings of the 2015 First International work-conference on Time Series, Granada, Spain, 1–3 July 2015; pp. 912–923.
48. Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 2004; Volume 7.
49. Werbos, P.J. Backpropagation through time: What it does and how to do it. Proceedings of the IEEE 1990, 78, 1550–1560.
50. Hochreiter, S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 1998, 6, 107–116.
51. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166.
52. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
53. Reimers, N.; Gurevych, I. Optimal hyperparameters for deep lstm-networks for sequence labeling tasks. arXiv 2017, arXiv:1707.06799.
54. Hyndman, R.J.; Koehler, A.B. Another look at measures of forecast accuracy. Int. J. Forecast. 2006, 22, 679–688.
55. McCuen, R.H.; Knight, Z.; Cutter, A.G. Evaluation of the Nash–Sutcliffe efficiency index. J. Hydrol. Eng. 2006, 11, 597–602.
56. Benesty, J.; Chen, J.; Huang, Y.; Cohen, I. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4.
57. Creaco, E.; Blokker, M.; Buchberger, S. Models for generating household water demand pulses: Literature review and comparison. J. Water Resour. Plan. Manag. 2017, 143, 04017013.

Figure 1. Diagrammatic representation of the water demand forecasting and performance assessment.

Figure 2. Study area of block 112 with smart water meter and water supply system in YeongJong Island.

Figure 3. Water consumption between 1 January 2018 and 1 January 2019 in block 112 (e.g., (a,b) 110012984 and (c,d) 110013004).

Figure 4. Missing rates of each smart water meter between 1 January 2018 and 1 January 2019.

Figure 5. Hourly water consumption of block 112 between 1 January 2018 and 1 January 2019.

Figure 6. Comparison of observed and forecasted hourly water demand of each AMI (a–j) with the methods ARIMA, RBF-ANN, QMMP+, and LSTM.

Figure 7. Difference between forecasted and observed hourly water demand of block 112 with the methods ARIMA, RBF-ANN, QMMP+, LSTM.

Table 1. AMI dataset for forecasting 24 h ahead.

AMI No.	Diameter (mm)	Missing Rate (%)	Types	TWD (m³/Day)
110012984	32	1.438	Restaurant	12.206
110013004	15	1.610	Domestic	1.207
110013012	15	1.826	Domestic	1.052
110013044	25	1.267	Church	3.941
110013074	25	1.027	Laundry	1.793
110013629	25	1.062	Mart	0.950
110016389	15	7.957	Pre-primary	0.634
110016799	15	1.062	Restaurant	2.578
110016860	32	1.062	Senior-citizen center	0.636
110018932	15	1.096	Domestic	1.701

Table 2. Performance assessment in terms of residuals in water demand between observed and forecasted values with the methods ARIMA, RBF-ANN, QMMP+, and LSTM (Bold indicates the best result for each AMI No.).

AMI No.	Observed (m³/day)	ARIMA (m³/day)	RBF-ANN (m³/day)	QMMP+ (m³/day)	LSTM (m³/day)
110012984	12.206	3.276	−0.072	−2.103	−0.178
110013004	1.207	0.517	0.413	0.248	0.144
110013012	1.052	−0.528	−0.264	−0.086	−0.235
110013044	3.941	−0.360	−1.054	−1.369	−0.415
110013074	1.793	0.840	−0.280	0.464	−0.087
110013629	0.950	0.430	−0.028	−0.053	0.168
110016389	0.634	−0.363	−0.114	−0.241	−0.020
110016799	2.578	−0.164	−0.389	0.791	−0.204
110016860	0.636	0.241	0.171	−0.035	0.075
110018932	1.701	−0.566	0.482	−0.161	−0.052

Table 3. Performance assessment in terms of RMSE with the methods ARIMA, RBF-ANN, QMMP+, and LSTM (Bold indicates the best result for each AMI No.).

AMI No.	ARIMA (m³/h)	RBF-ANN (m³/h)	QMMP+ (m³/h)	LSTM (m³/h)
110012984	0.216	0.145	0.142	0.139
110013004	0.037	0.031	0.033	0.025
110013012	0.040	0.031	0.030	0.024
110013044	0.105	0.117	0.110	0.079
110013074	0.102	0.075	0.083	0.060
110013629	0.045	0.035	0.030	0.028
110016389	0.024	0.016	0.020	0.011
110016799	0.077	0.086	0.090	0.058
110016860	0.019	0.015	0.015	0.012
110018932	0.038	0.037	0.031	0.022

Table 4. Performance assessment in terms of NRMSE with the methods ARIMA, RBF-ANN, QMMP+, and LSTM (Bold indicates the best result for each AMI No.).

AMI No.	ARIMA (%)	RBF-ANN (%)	QMMP+ (%)	LSTM (%)
110012984	27.68	18.56	18.18	17.78
110013004	27.92	23.7	25.08	19.00
110013012	29.75	23.34	22.37	17.93
110013044	26.8	29.8	27.9	20.21
110013074	23.32	17.08	18.99	13.80
110013629	24.42	19.28	16.62	15.13
110016389	38.21	25.09	31.23	17.04
110016799	26.43	29.52	31.05	20.05
110016860	23.59	18.99	18.39	14.95
110018932	27.72	27.17	22.7	15.94
Avg.	27.03	23.32	23.25	17.92

Table 5. Performance assessment in terms of NSE with the methods ARIMA, RBF-ANN, QMMP+, and LSTM (Bold indicates the best result for each AMI No.).

AMI No.	ARIMA	RBF-ANN	QMMP+	LSTM
110012984	−0.12	0.50	0.52	0.54
110013004	−0.38	0.00	−0.12	0.36
110013012	−0.43	0.12	0.19	0.48
110013044	−0.01	−0.25	−0.09	0.43
110013074	−0.05	0.43	0.30	0.63
110013629	−0.10	0.32	0.49	0.58
110016389	−0.65	0.29	−0.10	0.67
110016799	0.18	−0.02	−0.13	0.53
110016860	0.24	0.51	0.54	0.69
110018932	−0.61	−0.54	−0.08	0.47
Avg.	0.11	0.34	0.34	0.61

Table 6. Performance assessment in terms of PCC with the methods ARIMA, RBF-ANN, QMMP+, and LSTM (Bold indicates the best result for each AMI No.).

AMI No.	ARIMA	RBF-ANN	QMMP+	LSTM
110012984	0.73	0.73	0.85	0.73
110013004	0.37	0.67	0.56	0.64
110013012	0.17	0.66	0.47	0.84
110013044	0.23	0.02	0.45	0.67
110013074	0.31	0.71	0.63	0.81
110013629	0.34	0.65	0.71	0.78
110016389	0.13	0.60	0.55	0.83
110016799	0.64	0.15	0.63	0.85
110016860	0.67	0.80	0.89	0.85
110018932	0.08	0.05	0.57	0.70
Avg.	0.37	0.64	0.68	0.79

Table 7. Performance assessment in terms of the residual of total water demand, RMSE, NRMSE, NSE, and PCC with the methods ARIMA, RBF-ANN, QMMP+, and LSTM for the block 112 water consumption data (Bold indicates the best result for each index).

Indices	ARIMA	RBF-ANN	QMMP+	LSTM
Residual (m³/day)	324.6	395.5	499.1	167.1
RMSE (m³/day)	85.18	73.47	73.25	56.46
NRMSE (%)	27.03	23.32	23.25	17.92
NSE	0.11	0.34	0.34	0.61
PCC	0.37	0.64	0.68	0.79

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
