Proceeding Paper

Enhancing Water Demand Forecasting: Leveraging LSTM Networks for Accurate Predictions †

by Fatemeh Boloukasli ahmadgourabi *, Melica Khashei Varnamkhasti, Morad Nosrati Habibi, Niuosha Hedaiaty Marzouny and Rebecca Dziedzic
Building, Civil and Environmental Engineering Department, Concordia University, Montreal, QC H3G 2W1, Canada
* Author to whom correspondence should be addressed.
† Presented at the 3rd International Joint Conference on Water Distribution Systems Analysis & Computing and Control for the Water Industry (WDSA/CCWI 2024), Ferrara, Italy, 1–4 July 2024.
Eng. Proc. 2024, 69(1), 120; https://doi.org/10.3390/engproc2024069120
Published: 12 September 2024

Abstract

This study aims to create a reliable water-demand forecasting system using Long Short-Term Memory networks. The model integrates hourly water demands from 10 District Metered Areas of a Water Distribution Network in northeast Italy and weather data, handling missing values with LSTM-based data imputation. It considers temporal aspects like time, weekdays, holidays, and weekend patterns, employing sine and cosine transformations to capture daily cycles. To ensure the model’s robustness, the testing was conducted during the last week of the dataset, specifically week 81, with iterative adjustments to the LSTM’s hyperparameters to optimize prediction accuracy. These tuning efforts focused on learning rate, number of layers, and batch size, tailored to maximize the system’s performance. This method is essential for smart decision-making in water utility management and demonstrates significant potential for improving operational efficiencies.

1. Introduction

Increasing water demands caused by growing urbanization, coupled with water scarcity driven by climate change, make accurate water demand forecasting essential. Predicting demands is crucial for designing water distribution systems, allocating resources effectively, enhancing leakage detection, reducing supply costs, and ensuring that these systems can meet the needs of their users [1]. However, this prediction is not simple; there is an intricate and nonlinear relationship between demand and its numerous driving factors. Previous approaches for demand prediction can generally be categorized as either statistical or machine learning methods.
Although traditional statistical techniques such as Autoregressive Integrated Moving Average (ARIMA) and seasonal ARIMA (SARIMA) mostly have a simple structure and can identify general patterns, they are ineffective at capturing the nonlinear interactions that are important for demand forecasting [2]. Perry (1981) utilized a Fourier series to model the periodic demand component and ARIMA to describe the residual demand for 24 h predictions in the United Kingdom [3].
Data-driven approaches to forecasting water demand have been established to address this limitation and provide more robust predictions. Ghiassi et al. (2008) developed a dynamic artificial neural network model (DAN2) for different time horizons, including hourly, daily, weekly, and monthly. The model produced more accurate results for the daily, weekly, and monthly horizons than for the hourly one. Moreover, a comparison of DAN2 with ARIMA showed superior performance for DAN2 [4]. Another approach in demand forecasting is support vector regression (SVR). Braun et al. (2014) compared SVR and SARIMA for a 24 h forecast of Berlin’s residential water consumption. The SARIMA model did not perform as well as the SVR model. This can be attributed to the fact that the SVR model analyzes nonlinear effects using Gaussian kernels, whereas the SARIMA model relies on linear regression [5]. Chen et al. (2017) introduced a regression model based on random forests aimed at tackling the nonlinear relationship of factors affecting water demand series [6].
Recent advancements in predictive modeling have leveraged Long Short-Term Memory (LSTM) networks [7], a type of Recurrent Neural Network (RNN), to improve time-series analysis, particularly in water demand forecasting using historical data [8]. LSTMs excel in capturing long-range sequences, outperforming traditional models like ARIMA and SVM by effectively capturing temporal dependencies within water usage patterns. This capability to understand seasonal trends and irregular patterns without extensive feature engineering has revolutionized water demand forecasting. However, challenges persist in integrating external factors with seasonal variations and optimizing loss functions for improved accuracy.
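For reference, the gating mechanism that lets an LSTM retain information over long sequences can be written compactly. The block below gives the widely used formulation with a forget gate; it is included as background only and is not necessarily the exact variant used in this study. The symbols are assumptions introduced here: x_t is the input at hour t, h_t the hidden state, c_t the cell state, W, U, and b are learned weights and biases, sigma is the logistic function, and the circled dot denotes element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

Because c_t is updated additively, a forget gate close to 1 carries earlier demand information forward largely unchanged, which is what allows dependencies spanning many hours to be learned.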
Given the promising results of previous LSTM models and the opportunities for improving these models with feature engineering, the present study seeks to develop an LSTM-based hourly water demand prediction model that accounts for weather and user behavior. The input data incorporates feature engineering, particularly sine and cosine functions for modeling seasonal variations [9] along with calendar features [10].

2. Materials and Methods

The proposed procedure in this paper focuses on predicting water demands within a Water Distribution Network (WDN) situated in the northeast of Italy. This entails consumption forecasting for ten District Metered Areas (DMAs) within the WDN to define optimal system operation over both the upcoming day and week. The DMAs (named A to J) demonstrate a diverse array of area characteristics, spanning residential, suburban, commercial, and industrial zones, each presenting distinct challenges and opportunities for demand prediction. Notably, DMA E, situated near the city center, serves the highest number of users with a significant net inflow, while DMA I, positioned by the port, has the fewest users with a comparatively lower net inflow. Historical data on net inflows and weather records were made available for analysis, spanning a total of 81 weeks.
Beginning with importing the raw data, we ensured consistent date–time formatting to facilitate temporal analysis. Additional contextual information, such as holidays and weekends, was integrated to account for temporal variations in water inflow patterns. Addressing missing values was critical, and we employed LSTM neural networks to accurately predict and impute missing data, aligning closely with underlying inflow dynamics. Feature engineering included computing the mean inflow over the preceding four weeks and transforming the day of the week and time of day into sine and cosine representations to capture cyclic patterns effectively. Finally, systematic data reordering enhanced clarity and consistency, preparing the dataset for insightful conclusions regarding water demand prediction.
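The cyclic encoding and the four-week mean described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The function names, the weekend flag, and the interpretation of the four-week mean as a trailing average over all hours of the preceding four weeks are assumptions; the paper does not specify these details.

```python
import math
from datetime import datetime

HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7
HOURS_PER_WEEK = HOURS_PER_DAY * DAYS_PER_WEEK


def cyclic_encode(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(timestamp):
    """Sine/cosine encodings of hour of day and day of week, plus a weekend flag."""
    hour_sin, hour_cos = cyclic_encode(timestamp.hour, HOURS_PER_DAY)
    dow_sin, dow_cos = cyclic_encode(timestamp.weekday(), DAYS_PER_WEEK)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
        # Monday is 0 in datetime.weekday(), so 5 and 6 are Saturday and Sunday.
        "is_weekend": timestamp.weekday() >= 5,
    }


def four_week_mean(inflows, index, weeks=4):
    """Mean of the hourly inflows in the `weeks` weeks before position `index`.

    Assumption: a trailing window clipped at the start of the series.
    Returns None when no history is available.
    """
    start = max(0, index - weeks * HOURS_PER_WEEK)
    history = inflows[start:index]
    return sum(history) / len(history) if history else None
```

With this encoding, 23:00 and 00:00 sit next to each other on the circle, whereas a raw hour value would place them 23 units apart. A holiday flag could be added in the same way from a calendar lookup, which is omitted here.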
Iterative testing was conducted to optimize the performance of LSTM networks through hyperparameter tuning. Different values for hyperparameters such as the number of LSTM units (50, 64, 128), learning rate (0.0025, 0.002, 0.0015, 0.001), batch size (32, 64, 128), and dropout rate (0.1, 0.2, 0.3) were systematically tested. This fine-tuning process aimed to identify the optimal configuration for each DMA. The number of weeks used for training varied for each DMA, based on trial and error, to maximize learning efficacy and predictive accuracy.
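The candidate values listed above can be enumerated as a search space. The sketch below assumes an exhaustive grid, which is a simplification: the paper describes iterative, trial-and-error tuning and does not state that every combination was tried. `score_fn` is a hypothetical placeholder for a routine that would train a model with the given configuration and return its error.

```python
from itertools import product

# Candidate values as listed in the text.
SEARCH_SPACE = {
    "units": [50, 64, 128],
    "learning_rate": [0.0025, 0.002, 0.0015, 0.001],
    "batch_size": [32, 64, 128],
    "dropout": [0.1, 0.2, 0.3],
}


def iter_configs(space):
    """Yield every combination of hyperparameter values as a dict."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def select_best(space, score_fn):
    """Return (config, score) for the configuration with the lowest score.

    score_fn is a hypothetical callable: config dict -> error (lower is better).
    """
    best_config, best_score = None, float("inf")
    for config in iter_configs(space):
        score = score_fn(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

The four lists give 3 × 4 × 3 × 3 = 108 combinations per DMA, which helps explain why a manual, iterative search may be preferred over training every configuration.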
In the evaluation of the models’ effectiveness and accuracy, three Performance Indicators (PIs) were utilized. These indicators include the mean absolute error (MAE) computed for the first 24 h of each evaluation week, the maximum absolute error (Max AE) within the same 24 h period, and the MAE calculated for the period between the second and final days of each evaluation week. To facilitate a fair comparison of the LSTM model’s performance across the varied DMAs, the error metrics were also normalized by the average net inflow of each DMA. Thus, the normalized MAE (N. MAE) and normalized Max AE (N. Max AE) for the first 24 h, along with the normalized MAE for the remaining days, were computed. The final week, week 81, was reserved as the testing set, while the rest was utilized as the training set. Accordingly, performance metrics were computed for hyperparameter tuning as well as identifying the most effective predictive model. Once the best model was selected, predictions were made for week 81.
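The three indicators and their normalized counterparts can be computed as in the sketch below. It assumes a 168-value hourly series for the evaluation week and takes the DMA's average net inflow as an argument, since the averaging period is not stated in the text. The function and key names are illustrative.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 168


def mean_abs_error(actual, predicted):
    """Mean of the absolute differences between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def max_abs_error(actual, predicted):
    """Largest absolute difference between paired values."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


def performance_indicators(actual, predicted, mean_inflow):
    """Compute the three PIs for one evaluation week, raw and normalized.

    actual, predicted: 168 hourly net inflows for the evaluation week.
    mean_inflow: average net inflow of the DMA, used for normalization.
    """
    if len(actual) != HOURS_PER_WEEK or len(predicted) != HOURS_PER_WEEK:
        raise ValueError("expected 168 hourly values for the evaluation week")
    day1 = slice(0, HOURS_PER_DAY)
    rest = slice(HOURS_PER_DAY, HOURS_PER_WEEK)
    raw = {
        "mae_day1": mean_abs_error(actual[day1], predicted[day1]),
        "max_ae_day1": max_abs_error(actual[day1], predicted[day1]),
        "mae_rest": mean_abs_error(actual[rest], predicted[rest]),
    }
    normalized = {f"n_{name}": value / mean_inflow for name, value in raw.items()}
    return {**raw, **normalized}
```

Dividing by the mean inflow makes the errors dimensionless, so a small DMA and a large DMA can be compared on the same scale.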

3. Results

Considering the diverse characteristics of each DMA, unique LSTM models were trained individually for each of them. Details regarding the training period and hyperparameters are presented in Table 1. The models’ effectiveness was assessed by testing them on the last week of available data. For all DMAs, the window size was set at 24 h. The training period and hyperparameters were determined through trial and error to minimize errors. The performance of each model is presented in Table 2. The performance of the LSTM model displayed notable variation across the different DMAs, reflecting the impact of area characteristics on water demand patterns. A particularly notable observation is DMA E, which, despite having the highest average net inflow, shows the lowest normalized errors across all metrics, suggesting that the model’s performance is quite robust for areas with higher demand. Conversely, DMAs with lower average net inflows, such as DMA A, C, and F, exhibit higher normalized errors. This suggests that the model has difficulties in capturing demand patterns in DMAs with lower demand, pointing to opportunities for model refinement or inclusion of more data.
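A 24 h window means that each training sample consists of the 24 most recent hourly observations. The sketch below shows one way to build such samples from a univariate series. Pairing each window with the next hour as its target is an assumption made for illustration; the paper does not state how the forecast horizon was handled.

```python
def make_windows(series, window=24):
    """Split an hourly series into (input window, next-hour target) pairs.

    Assumption: one-step-ahead targets. Returns two parallel lists.
    """
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets
```

For example, a series of 30 hourly values yields 6 samples, the first covering hours 0 to 23 with hour 24 as its target.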

4. Conclusions

This study presents a comprehensive approach to water demand forecasting using LSTM networks within a WDN in northeast Italy. By integrating hourly water demand data from diverse DMAs with weather information and temporal features, the LSTM models demonstrated promising accuracy in predicting demand patterns. The models’ effectiveness was evident across various DMAs, with notable performance in areas with higher average net inflows. The normalized mean absolute error (N. MAE) for the first 24 h ranged from 0.0224 to 0.1537 across the DMAs (Table 2), with DMA E exhibiting the lowest error. Furthermore, the study emphasizes the importance of feature engineering techniques, including sine and cosine transformations, along with the integration of contextual data such as holidays, in enhancing prediction accuracy.

Author Contributions

Methodology, F.B.a., M.K.V., M.N.H., N.H.M. and R.D.; validation, F.B.a., M.K.V., M.N.H., N.H.M. and R.D.; writing—original draft preparation, F.B.a., M.K.V., M.N.H., N.H.M. and R.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council, grant number RGPIN-2022-04664.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are openly available at https://wdsa-ccwi2024.it/battle-of-water-networks (accessed on 1 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Billings, R.B.; Jones, C.V. Forecasting Urban Water Demand, 2nd ed.; American Water Works Association: Washington, DC, USA, 2011.
  2. Pu, Z.; Yan, J.; Chen, L.; Li, Z.; Tian, W.; Tao, T.; Xin, K. A hybrid Wavelet-CNN-LSTM deep learning model for short-term urban water demand forecasting. Front. Environ. Sci. Eng. 2023, 17, 22.
  3. Perry, P.F. Demand Forecasting in Water Supply Networks. J. Hydraul. Div. 1981, 107, 1077–1087.
  4. Ghiassi, M.; Zimbra, D.K.; Saidane, H. Urban Water Demand Forecasting with a Dynamic Artificial Neural Network Model. J. Water Resour. Plan. Manag. 2008, 134, 138–146.
  5. Braun, M.; Bernard, T.; Piller, O.; Sedehizade, F. 24-hours demand forecasting based on SARIMA and support vector machines. Procedia Eng. 2014, 89, 926–933.
  6. Chen, G.; Long, T.; Xiong, J.; Bai, Y. Multiple Random Forests Modelling for Urban Water Consumption Forecasting. Water Resour. Manag. 2017, 31, 4715–4729.
  7. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  8. Mu, L.; Zheng, F.; Tao, R.; Zhang, Q.; Kapelan, Z. Hourly and daily urban water demand predictions using a long short-term memory based model. J. Water Resour. Plan. Manag. 2020, 146, 05020017.
  9. Stolwijk, A.M.; Straatman, H.; Zielhuis, G.A. Studying seasonality by using sine and cosine functions in regression analysis. J. Epidemiol. Community Health 1999, 53, 235–238.
  10. Bakker, M.; Vreeburg, J.H.; Rietveld, L.C.; Blom, T. The use of an adaptive water demand prediction model. In Proceedings of the WDSA 2012: 14th Water Distribution Systems Analysis Conference, Adelaide, SA, Australia, 24–27 September 2012.
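
Table 1. Training period and hyperparameters of each LSTM model.

| DMA | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Training period (weeks) | 40 | 6 | 10 | 20 | 40 | 4 | 10 | 20 | 40 | 20 |
| Number of layers | 4 | 4 | 5 | 4 | 4 | 3 | 4 | 3 | 4 | 4 |
| Number of units | 128 | 128, 64 | 128, 65 | 128 | 128 | 256, 128, 64 | 128, 64 | 128, 65 | 50 | 64 |
| Optimizer | Adam | Adam | Adam | Adam | Adam | Adam | Adamax | Adam | Adam | Adam |
| Learning rate | 0.0015 | 0.001 | 0.0025 | 0.0025 | 0.002 | 0.001 | 0.001 | 0.001 | 0.0025 | 0.001 |
| Dropout rate | 0.1 | 0.2 | 0.2 | 0.1 | 0.2 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| Batch size | 32 | 16 | 32 | 32 | 32 | 32 | 32 | 32 | 64 | 32 |
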
Table 2. Performance metrics of each LSTM model.

| DMA | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| MAE 1st day | 1.29 | 0.48 | 0.56 | 2.38 | 1.75 | 0.69 | 1.39 | 0.91 | 1.25 | 1.46 |
| Max AE 1st day | 3.32 | 1.41 | 1.44 | 6.29 | 3.71 | 2.27 | 3.26 | 2.10 | 3.16 | 4.28 |
| MAE other days | 1.34 | 0.93 | 0.83 | 2.20 | 2.30 | 0.93 | 1.68 | 1.31 | 1.20 | 1.31 |
| N. MAE 1st day | 0.1537 | 0.0500 | 0.1308 | 0.0724 | 0.0224 | 0.0856 | 0.0552 | 0.0438 | 0.0608 | 0.0553 |
| N. Max AE 1st day | 0.3957 | 0.1474 | 0.3358 | 0.1912 | 0.0474 | 0.2807 | 0.1301 | 0.1010 | 0.1533 | 0.1620 |
| N. MAE other days | 0.1601 | 0.0970 | 0.1934 | 0.0669 | 0.0294 | 0.1150 | 0.0669 | 0.0630 | 0.0584 | 0.0496 |
