Proceeding Paper

Urban Water Demand Forecasting Using DeepAR-Models as Part of the Battle of Water Demand Forecasting (BWDF) †

Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, Fraunhoferstrasse 1, 76131 Karlsruhe, Germany
* Author to whom correspondence should be addressed.
Presented at the 3rd International Joint Conference on Water Distribution Systems Analysis & Computing and Control for the Water Industry (WDSA/CCWI 2024), Ferrara, Italy, 1–4 July 2024.
Eng. Proc. 2024, 69(1), 25; https://doi.org/10.3390/engproc2024069025
Published: 2 September 2024

Abstract

The accurate and reliable short-term forecasting of urban water demand plays a crucial role in enabling drinking water utilities to operate sustainably and secure water supplies in the future. Here, we apply state-of-the-art DeepAR models to predict urban water demand in ten district metered areas (DMAs) in a water distribution system in northeastern Italy. DeepAR models are based on long short-term memory networks and can directly provide probabilistic results. For this contribution, we leverage past flow data, current and future weather data, and engineered weather and date features as input to predict flow data one week ahead. A local model for each DMA is prepared and applied after hyperparameter optimization.

1. Introduction

Drinking water utilities worldwide currently face substantial challenges, especially due to the effects of climate change, population growth, and urbanization. Therefore, the sustainable use of water resources is essential to secure future water supplies, ensure water quality, and preserve the ecosystems associated with available water sources. Accurate and reliable short-term forecasting of urban water demand plays a crucial role in enabling drinking water utilities to make informed operational and strategic decisions, e.g., planning water treatment, well operation, or reservoir replenishment at times when energy costs are low. Recent reviews (e.g., [1]) show that artificial neural networks (ANNs) are among the most common approaches for water demand forecasting. ANNs are data-driven, i.e., they learn from past observation data (e.g., flow) and do not need a complex physical representation of the water distribution system, while nevertheless being able to capture nonlinearities and system dynamics. They can incorporate knowledge from relevant exogenous time series such as weather data to produce accurate short-term forecasts. ANNs usually lack interpretability; thus, they are especially useful in cases in which the result is more important than the involved processes. Among the most popular architectures for time series prediction is the long short-term memory network (LSTM) [2], a recurrent model that applies feedback loops and gating mechanisms to remember past time steps and prevent information from vanishing in the memory. Here, we apply a deep autoregressive model (DeepAR) [3], a specific ANN architecture that internally builds upon LSTM cells. DeepAR models have been shown to excel in learning global models from hundreds of time series.
They have been applied to several application domains, including the water domain (e.g., groundwater level prediction in [4]), but, to the best knowledge of the authors, not yet to the domain of water demand forecasting.
This work is a contribution to the Battle of Water Demand Forecasting (BWDF) at the third WDSA/CCWI Joint Conference 2024, which aims to compare the effectiveness of methods for the short-term forecasting of urban water demand in ten real district metered areas (DMAs) in a water distribution system in northeastern Italy. The goal is to predict the hourly net inflow of each DMA, which includes all types of water consumption and leakages. The performance is judged based on the prediction accuracy for four predefined individual weeks (W1–W4). In this contribution, we develop local DeepAR models for each DMA, built on an equal set of input parameters, but with optimized hyperparameters to improve the individual performance for each DMA. All the data are provided by the BWDF organizing committee. We follow the approach developed and applied within the TwinOptPRO project, a digital platform for online pump scheduling optimization. For more details, please refer to the respective paper at this year’s WDSA/CCWI conference.
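To make the gating mechanism mentioned in the Introduction concrete, the following is a minimal, purely illustrative pure-Python sketch of a single LSTM cell step with scalar states. Real LSTMs (and the DeepAR implementation used here) operate on vectors with weight matrices; all names and weight values below are ours, chosen only for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, w):
    """One LSTM time step for scalar input/state (illustrative toy).

    w holds the weights for the four gates: input (i), forget (f),
    output (o), and candidate (g)."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])   # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])   # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])   # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"]) # candidate
    c = f * c_prev + i * g   # cell state: gated memory update
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c

# Toy run over a short sequence with all weights set to 0.5
w = {k: 0.5 for k in ["wi", "ui", "bi", "wf", "uf", "bf",
                      "wo", "uo", "bo", "wg", "ug", "bg"]}
h, c = 0.0, 0.0
for x in [0.1, 0.5, -0.2]:
    h, c = lstm_cell_step(x, h, c, w)
```

The forget gate `f` scaling `c_prev` is what lets the cell retain information over many time steps instead of letting it vanish.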

2. Materials and Methods

We apply DeepAR models [3], a specific type of recurrent ANN based on the popular LSTM architecture. DeepAR models can perform probabilistic forecasting by using Monte Carlo samples to parameterize a Gaussian likelihood function. Further, DeepAR models support numerical and categorical input data out of the box and do not require strict normalization, because scaling (and the provision of the corresponding scaling information to the model) is handled internally. For the BWDF, we perform sequence-to-sequence (or many-to-many) forecasting with an optimized number of input time steps (context length) and a fixed output length of 168 steps (one week in hourly time steps). Further, we train a separate model for each DMA (i.e., a local model), due to the small number of overall available time series (10). This means that the real strength of DeepAR models is not exploited, as they can learn seasonal behavior and dependencies on given covariates across hundreds of time series and need only minimal feature engineering to extract group-dependent behavior [3]. Such a global model would be able to tackle the cold-start problem, i.e., forecasting for a specific location with little historic training data, by deriving knowledge from similar items. We use the DeepAR implementation provided in the Pytorch Forecasting [5] library. To prevent overfitting, we apply dropout (10%) and early stopping, as well as gradient clipping to prevent exploding gradients. Each DMA’s data are separated into three parts: the last 7 days are used for independent testing (‘test-set’), the preceding two weeks for early stopping (‘stop-set’), and the rest for training (‘training-set’). Under the hood, training is performed using the Ranger optimizer [6] with a learning rate of 0.001 for a maximum of 100 training epochs. Training stops after five epochs without error improvement in the stop-set (early stopping patience).
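The chronological split described above can be sketched as follows (the function name and constants are ours; a minimal illustration assuming a strictly hourly, gapless series):

```python
# Chronological split of an hourly series into training / stop / test sets:
# the last 7 days for testing, the preceding 2 weeks for early stopping,
# and everything before that for training.
HOURS_PER_DAY = 24

def chronological_split(series, test_days=7, stop_days=14):
    n_test = test_days * HOURS_PER_DAY
    n_stop = stop_days * HOURS_PER_DAY
    train = series[: len(series) - n_stop - n_test]
    stop = series[len(series) - n_stop - n_test : len(series) - n_test]
    test = series[len(series) - n_test :]
    return train, stop, test

# 90 days of dummy hourly values
series = list(range(90 * HOURS_PER_DAY))
train, stop, test = chronological_split(series)
```

Keeping the split strictly chronological (rather than random) avoids leaking future information into training, which matters for autoregressive models like DeepAR.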
All local models are largely identical and differ only in terms of four hyperparameters, which are optimized using the Optuna toolbox [7], maximizing the coefficient of determination (R2) in the test set as the objective function. We optimize the context length for each DMA from 1 to 21 days, the batch size (2^4, 2^5, 2^6), the number of LSTM layers (1 to 3), and the number of hidden nodes in the LSTM layer (2^4, 2^5, 2^6); however, optimization is performed only on the data available until forecasting week W1. With newly available data for weeks W2–W4, the hyperparameters stay fixed.
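The actual search uses Optuna; as a library-free sketch of the same idea, a plain random search over the described space with R2 as the objective could look as follows. The `evaluate` function here is a hypothetical stand-in for training a DeepAR model and predicting the test set.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination, the optimization objective."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# The four optimized hyperparameters and their ranges as described above
SEARCH_SPACE = {
    "context_length_days": list(range(1, 22)),  # 1 to 21 days
    "batch_size": [16, 32, 64],                 # 2^4 to 2^6
    "lstm_layers": [1, 2, 3],
    "hidden_nodes": [16, 32, 64],               # 2^4 to 2^6
}

def evaluate(params, y_true):
    # Placeholder: a real objective would train a DeepAR model with
    # `params` and predict the test set; here we fake a noisy prediction.
    rng = random.Random(sum(params.values()))
    y_pred = [t + rng.uniform(-1, 1) for t in y_true]
    return r2_score(y_true, y_pred)

random.seed(42)
y_true = [10 + i % 24 for i in range(168)]  # dummy one-week test set
best_params, best_r2 = None, float("-inf")
for _ in range(20):
    params = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
    score = evaluate(params, y_true)
    if score > best_r2:
        best_params, best_r2 = params, score
```

Optuna’s samplers (e.g., TPE) explore such a space more efficiently than random search, but the objective structure is the same.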
For all 10 DMAs, hourly time series of net inflow are provided along with weather data (precipitation, temperature, relative humidity, wind) and a calendar with information on national holidays. Initially, 1.5 years of data are available. The data are extended after each forecasting week, amounting to about 2 years and 3 months in total for W4. We preprocess the provided hourly flow data by resampling them to a strictly hourly interval and closing data gaps of up to 5 h with a piecewise cubic Hermite interpolating polynomial (PCHIP). Larger data gaps are cut out to meet the requirements of the applied model, which needs gapless data. This introduces a discontinuity into the data: successive values can then be separated by more than one hour (even days or weeks). Each DMA is preprocessed separately. Due to the high number of missing values, we do not use the provided data for wind and relative humidity but only those for temperature and precipitation. Here, we close the very few (<5) missing values, using linear interpolation for temperature and zero-padding for precipitation.
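The gap-handling rule can be sketched as follows. For brevity this illustration fills short gaps with linear interpolation, whereas the actual preprocessing uses PCHIP; the function name and toy data are ours.

```python
def split_and_fill(values, max_gap=5):
    """values: hourly series with None marking missing hours.
    Gaps of up to max_gap hours are interpolated; longer gaps (or gaps
    at the series boundary) split the series into gapless segments."""
    segments, current = [], []
    i = 0
    while i < len(values):
        if values[i] is not None:
            current.append(values[i])
            i += 1
            continue
        # Measure the length of the run of missing values
        j = i
        while j < len(values) and values[j] is None:
            j += 1
        gap = j - i
        if gap <= max_gap and current and j < len(values):
            # Short gap: interpolate linearly between the neighbors
            left, right = current[-1], values[j]
            for k in range(1, gap + 1):
                current.append(left + (right - left) * k / (gap + 1))
        else:
            # Long gap: close the current segment and start a new one
            if current:
                segments.append(current)
            current = []
        i = j
    if current:
        segments.append(current)
    return segments

# A 2-hour gap (filled) followed by a 7-hour gap (cut out)
series = [1.0, 2.0, None, None, 5.0, None] + [None] * 6 + [3.0, 4.0]
segments = split_and_fill(series)
```

Returning separate segments makes the discontinuity explicit instead of silently concatenating values that are days apart.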
Additional information is provided by engineering input features from both the (used) weather data and the calendar data. Hence, we calculate (i) the number of consecutive days with no or very low precipitation (<1 mm), (ii) the cumulative precipitation of the last 7 days, and (iii) the mean temperature within the last 7 days. Additional calendar-derived features include information on (i) whether a day is a workday (binary), (ii) the encoded weekday (0–6), (iii) whether today and/or tomorrow is a national holiday (0: no holiday, 1: tomorrow is a holiday, 2: today is a holiday, 3: today and tomorrow are holidays), and (iv) the cyclically encoded hour. Cyclic encoding addresses the problem that, e.g., 23:00 h and 0:00 h are numerically very distant but, in reality, describe two points in time that are rather close. Hence, we encode the hour information in two complementary values based on a sine and a cosine function. Such an encoding is possible for the weekday information, too; however, trial and error showed that standard numerical encoding provided slightly better results. To summarize, the models leverage weather data over a certain context length, past target (flow) data of the same length, the described weather and date features, and additionally scaling information for each input, provided internally.
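The cyclic hour encoding and the dry-day feature can be sketched as follows (function names are ours):

```python
import math

def encode_hour_cyclic(hour):
    """Encode the hour of day (0-23) as a point on the unit circle,
    so that 23:00 and 0:00 end up close in feature space."""
    angle = 2.0 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)

def consecutive_dry_days(daily_precip_mm, threshold=1.0):
    """Number of most recent consecutive days with precipitation
    below the threshold (< 1 mm counts as 'dry')."""
    count = 0
    for p in reversed(daily_precip_mm):
        if p < threshold:
            count += 1
        else:
            break
    return count

# 23:00 and 0:00 are neighbors on the circle; 0:00 and 12:00 opposites
d_near = math.dist(encode_hour_cyclic(23), encode_hour_cyclic(0))
d_far = math.dist(encode_hour_cyclic(0), encode_hour_cyclic(12))
dry = consecutive_dry_days([5.0, 0.2, 0.0, 0.5])
```

The sine/cosine pair is needed because a single periodic value would map different hours to the same feature; two coordinates identify each hour uniquely.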

3. Results and Discussion

Part of the BWDF procedure is the prediction of four different weeks in 2022 and 2023, with additional flow and weather data provided in between. Thus, we re-train the model for each week with all data available up to that point. The test set is the respective preceding week. We evaluate the test set using R2 and the PICP (prediction interval coverage probability). The PICP indicates to what extent the provided prediction interval (PI) (the 80% PI in our case) includes the observed target values (1: all targets within the PI, 0: no targets within the PI). Figure 1 summarizes the performances of all local DMA models (A–J) in all four test sets preceding the respective forecasting weeks (W1–W4).
Except for single weeks in DMAs B, F, and J, the models generally reached good R2 values larger than 0.6 (up to 0.98 in DMA E), based solely on an evaluation of the mean forecast. The performance was not stable in all DMAs after retraining with additional data (e.g., R2 dropped in DMA B after W1, in E for W2, and in J for W4). Tweaking the model architecture, hyperparameters, or inputs might fix this behavior, but this was not done due to the BWDF rules. Thus, for a real-world deployment, automated re-training of DeepAR models (and ML models in general) is not advisable without validation checks and, if necessary, model updates. In Figure 1, the PICP for DMA A indicates that the provided PI can contribute valuable information beyond standard metrics such as R2. Even though the R2 is (only) around 0.6, the PICP reaches about 0.8 for all weeks, showing that the observed values are largely covered by the provided PI. This is also illustrated by Figure 2 (upper part, DMA A, W3), where the PI covers even strong peaks, though with a larger uncertainty in the opposite direction. Nevertheless, a considerable bias can undermine the informative value of the PICP metric, as illustrated by the example of DMA B, W3 in Figure 2 (lower part). Here, the flow is systematically underestimated (but still close to the observed values), resulting in a fairly good R2 of around 0.6 but a very low PICP of 0.1.
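The PICP metric used above can be computed as follows (a minimal sketch; the data shown are illustrative, not BWDF results):

```python
def picp(y_true, lower, upper):
    """Prediction interval coverage probability: the fraction of observed
    values that fall inside the provided prediction interval."""
    covered = sum(1 for t, lo, hi in zip(y_true, lower, upper)
                  if lo <= t <= hi)
    return covered / len(y_true)

# Toy example: an 80% PI around forecasts, with one strong peak missed
y_true = [10.0, 12.0, 11.0, 30.0, 9.0]
lower = [8.0, 9.0, 8.5, 10.0, 7.0]
upper = [12.0, 14.0, 13.0, 15.0, 11.0]
coverage = picp(y_true, lower, upper)
```

For a well-calibrated 80% PI, the PICP should itself be close to 0.8; a systematic bias shifts the interval away from the observations and pushes the PICP toward 0 even when the mean forecast remains accurate.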

Author Contributions

Methodology A.W. and C.K., software and validation A.W., C.K., S.W. and M.Z.; formal analysis and investigation, A.W. and C.K.; resources S.W. and M.Z.; data curation, A.W. and C.K.; writing—original draft preparation, A.W.; writing—review and editing, A.W., C.K., S.W. and M.Z.; visualization, A.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the German Federal Ministry of Education and Research (BMBF), grant number 02WQ1646C.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data will be made publicly available by the BWDF organizers. All code will be publicly available at: https://gitlab.cc-asp.fraunhofer.de/open/iosb/mrd/bwdf2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Niknam, A.; Zare, H.K.; Hosseininasab, H.; Mostafaeipour, A.; Herrera, M. A Critical Review of Short-Term Water Demand Forecasting Tools—What Method Should I Use? Sustainability 2022, 14, 5412. [Google Scholar] [CrossRef]
  2. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  3. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks. Int. J. Forecast. 2020, 36, 1181–1191. [Google Scholar] [CrossRef]
  4. Clark, S.R.; Pagendam, D.; Ryan, L. Forecasting Multiple Groundwater Time Series with Local and Global Deep Learning Networks. Int. J. Environ. Res. Public Health 2022, 19, 5091. [Google Scholar] [CrossRef]
  5. Beitner, J. Jdb78/Pytorch-Forecasting, v1.0.0. 2023. Available online: https://github.com/jdb78/pytorch-forecasting (accessed on 26 February 2024).
  6. Wright, L. Ranger—A Synergistic Optimizer. 2019. Available online: https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer (accessed on 26 January 2024).
  7. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-Generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019. [Google Scholar]
Figure 1. Coefficient of determination (R2) and prediction interval coverage probability (PICP) in all four test sets, for each DMA (A–J).
Figure 2. Exemplary test set results for DMA A, W3 (upper part), and DMA B, W3 (lower part).

Share and Cite

Wunsch, A.; Kühnert, C.; Wallner, S.; Ziebarth, M. Urban Water Demand Forecasting Using DeepAR-Models as Part of the Battle of Water Demand Forecasting (BWDF). Eng. Proc. 2024, 69, 25. https://doi.org/10.3390/engproc2024069025