Proceeding Paper

Water Demand Forecasting Based on Online Aggregation for District Meter Areas-Specific Adaption †

1 Chair of Environmental Economics esp. Economics of Renewable Energy, University of Duisburg-Essen, 45141 Essen, Germany
2 Institute of Hydraulic Engineering and Water Resources Management, University of Duisburg-Essen, 45141 Essen, Germany
* Author to whom correspondence should be addressed.
† Presented at the 3rd International Joint Conference on Water Distribution Systems Analysis & Computing and Control for the Water Industry (WDSA/CCWI 2024), Ferrara, Italy, 1–4 July 2024.
Eng. Proc. 2024, 69(1), 15; https://doi.org/10.3390/engproc2024069015
Published: 29 August 2024

Abstract

Short-term water demand forecasting is critical to enable optimal system operation. For practical purposes, the accuracy of the forecast and the adaptability to changing conditions are paramount. Therefore, for the Battle of Water Demand Forecasting (BWDF), we propose a precise and highly flexible forecasting methodology to allow an excellent adaptation to District Meter Areas (DMA)-specific characteristics. The proposed method consists of data cleaning and pre-processing, the training of individual forecast models and finally of combining the individual forecasts by the smoothed Bernstein Online Aggregation (BOA) algorithm. The ensemble of individual forecasting models includes simple time series, high-dimensional linear, and highly non-linear models such as neural networks.

1. Introduction

A variety of models and approaches for water demand forecasting are currently available; however, selecting a single best technique remains difficult, as noted by [1]. Moreover, as outlined by [2], the results obtained may differ considerably depending on the forecasting purpose, the data used, and the chosen method. The BWDF addresses these issues by comparing the effectiveness of methods for short-term urban water demand forecasting in a controlled forecasting study. The data originate from a set of ten real District Metered Areas (DMAs) in the north-east of Italy. As the individual DMAs differ in characteristics, size and average water demand, we propose a methodology focused on accuracy and adaptability to changing conditions. This paper is structured as follows: first, the Materials and Methods are outlined; second, the results for one DMA and one forecasting task are presented; finally, the Discussion and Conclusions section closes the paper.

2. Materials and Methods

In this section, the data and the competition setting are explained. Building on this, our modeling framework is presented.

2.1. Data and Competition Setting

The provided data set contains raw hourly net-inflow time series data in L/s for each DMA and observed weather data, namely air temperature, rainfall depth, air humidity and wind speed. The available period of the full competition ranges from 4 January 2021 CET to 12 March 2023 CEST.
The competition was separated into four forecasting tasks, named W1, W2, W3 and W4, covering consecutive time windows. With each forecasting task, the data for the respective time window were provided, and the last week of each task had to be forecast for each DMA. Calendar information, such as public holidays and local-event days, was provided in addition to the hourly net-inflow and weather data. The observed weather data were also provided for the week to be forecast, thus acting as a perfect weather forecast.
The following three Performance Indicators (PIs) were introduced for forecast performance evaluation: PI1, the mean absolute error for the initial 24 h; PI2, the maximum absolute error for the initial 24 h; and PI3, the mean absolute error for the last 144 h (6 days) of each evaluation week. As all PIs are based on absolute errors, with the well-known mean absolute error (MAE) underlying PI1 and PI3, and as the expected absolute error is minimized by the median, we concluded that median-optimized estimation procedures are better suited than mean-optimized ones. Moreover, a separate consideration of the initial 24 h and the remaining 144 h seemed worthwhile.
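The three indicators can be computed directly from one evaluation week of hourly values. The following is a minimal sketch of that computation; the function name `performance_indicators` and the toy data are illustrative assumptions and not part of the competition material.

```python
from statistics import mean


def performance_indicators(actual, forecast):
    """Return (PI1, PI2, PI3) for one 168-hour evaluation week."""
    if len(actual) != 168 or len(forecast) != 168:
        raise ValueError("expected 168 hourly values for the evaluation week")
    abs_err = [abs(a - f) for a, f in zip(actual, forecast)]
    pi1 = mean(abs_err[:24])  # mean absolute error, initial 24 h
    pi2 = max(abs_err[:24])   # maximum absolute error, initial 24 h
    pi3 = mean(abs_err[24:])  # mean absolute error, remaining 144 h
    return pi1, pi2, pi3


# Toy week (hypothetical values in L/s): the forecast is off by 1 on day one,
# with a single miss of 3, and off by 2 during the remaining six days.
actual = [10.0] * 168
forecast = [11.0] * 24 + [12.0] * 144
forecast[5] = 13.0
pi1, pi2, pi3 = performance_indicators(actual, forecast)
```

Because PI2 picks out the single worst hour of the first day, one large miss dominates it even when PI1 stays small, which is one reason to treat the first 24 h separately.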

2.2. Modelling Framework

Our methodology consists of three steps: first, data cleaning and pre-processing; second, training of and prediction with the individual models; and third, applying the smoothed Bernstein Online Aggregation (BOA) to the individual model forecasts to provide a weighted ensemble forecast.

2.2.1. Data Cleaning and Pre-Processing

The net-inflow data and the provided weather data had not been post-processed, so gaps and outliers were present. To handle these, we applied the R package tsrobprep, explained in detail in [3]. Moreover, we adjusted the input data for clock changes to allow a more straightforward modeling of seasonal effects.
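The paper does not spell out how the clock-change adjustment was carried out. One simple possibility, shown here purely as an assumption, is to bring every local-time day to exactly 24 hourly values: the hour skipped in spring is filled with the mean of its neighbours, and the hour repeated in autumn is averaged. The function name `adjust_clock_change` and the default `change_hour=2` are hypothetical.

```python
def adjust_clock_change(day_values, change_hour=2):
    """Return 24 hourly values for a local-time day with 23, 24 or 25 readings.

    Assumed scheme (not taken from the paper): the hour skipped in spring is
    filled with the mean of its two neighbours; the hour that occurs twice in
    autumn is replaced by the mean of both readings.
    """
    values = list(day_values)
    n = len(values)
    if n == 24:
        return values
    if n == 23:
        # Reading at index change_hour - 1 precedes the gap, index change_hour follows it.
        filled = (values[change_hour - 1] + values[change_hour]) / 2
        return values[:change_hour] + [filled] + values[change_hour:]
    if n == 25:
        # Readings at change_hour and change_hour + 1 belong to the repeated hour.
        merged = (values[change_hour] + values[change_hour + 1]) / 2
        return values[:change_hour] + [merged] + values[change_hour + 2:]
    raise ValueError("expected 23, 24 or 25 hourly values")


# Toy days where each reading equals its local hour label.
spring_day = [float(h) for h in range(24) if h != 2]              # hour 2 is skipped
autumn_day = [0.0, 1.0, 2.0, 2.0] + [float(h) for h in range(3, 24)]  # hour 2 occurs twice
spring_adjusted = adjust_clock_change(spring_day)
autumn_adjusted = adjust_clock_change(autumn_day)
```

With every day at 24 values, daily and weekly seasonal lags such as 24 h and 168 h always point to the same local hour.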

2.2.2. Individual Forecasting Models

After data cleaning and pre-processing, we trained an ensemble of 53 individual forecasting models separately for each DMA and each forecasting task. As a benchmark, we introduced a naive model, defined as $\hat{Y}_t = Y_{t-168}$, i.e., the value observed one week (168 h) earlier. As the starting point for more sophisticated models, we used the well-known linear model defined by
$$Y_t = \beta_0 + \beta_1 X_{t,1} + \dots + \beta_p X_{t,p} + \varepsilon_t = X_t \beta + \varepsilon_t \qquad (1)$$
where $Y_t$ denotes the hourly net-inflow for the corresponding DMA at time $t$, $X_t$ denotes the regression matrix, $\beta$ the coefficient vector and $\varepsilon_t$ the error term. Based on (1), we introduced linear autoregressive (AR)-based models to account for autoregressive effects. These were further extended by combining them with the Multiple Seasonal-Trend Loess (MSTL) framework, for decomposing time series with multiple seasonal components, and the Augmented Dynamic Adaptive Model (ADAM) framework, for a more flexible way to model Error Trend Seasonality (ETS) components based on exponential smoothing. To account for non-linear effects such as weather effects, for complex linear effects, and for interactions between them, we added more sophisticated linear modeling approaches, namely generalized linear models (GLM) and generalized additive models (GAM).
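The naive benchmark, which repeats the value observed 168 hours earlier, is simple enough to sketch in a few lines. The function name `naive_weekly_forecast` and the toy history are illustrative assumptions.

```python
def naive_weekly_forecast(history, horizon=168, season=168):
    """Forecast each future hour with the value observed `season` hours earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    n = len(history)
    forecast = []
    for h in range(horizon):
        source = n + h - season  # index of the value one season before the target hour
        if source < n:
            forecast.append(history[source])
        else:
            # Beyond one season ahead, reuse the forecast made one season earlier.
            forecast.append(forecast[source - n])
    return forecast


# Toy history (hypothetical values): two weeks of hourly data with a daily pattern
# and a slightly higher level in the second week.
history = [10.0 + (t % 24) + (0.5 if t >= 168 else 0.0) for t in range(336)]
forecast = naive_weekly_forecast(history)
```

For a one-week horizon the forecast is simply a copy of the last observed week, which makes it a natural reference point for the percentage improvements reported later.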
The GLM-based models were applied with automatic feature selection, as already carried out successfully for water demand forecasting in [4], to model weather interactions, holiday effects and autoregressive effects in a high-dimensional feature space. Like the ADAM-based and MSTL-based models, the GAM-based models were used as a pre-processing step for AR-based models.
Moreover, we included non-linear models employing neural networks. The model architecture resembled a sum of a linear model (1) and a neural network's output. This approach leveraged the advantages of both methods: the non-linear effects were modeled by a neural network with minimal capacity to mitigate over-fitting, whereas the linear relationships within the data were learned by the low-dimensional linear model. The implemented models employed either a Multi-Layer Perceptron (MLP) or a Long Short-Term Memory (LSTM) network.
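The additive structure described above can be illustrated with a forward pass only. This is a rough sketch under stated assumptions: a single hidden layer with tanh activation, hand-picked weights, and the hypothetical names `linear_part`, `mlp_part` and `hybrid_forecast`. The actual network sizes, activations and training procedure are not given in the paper.

```python
import math


def linear_part(x, intercept, beta):
    """Linear model output: intercept plus weighted sum of the features."""
    return intercept + sum(b * xi for b, xi in zip(beta, x))


def mlp_part(x, hidden_weights, hidden_bias, output_weights):
    """One-hidden-layer perceptron with tanh activation (assumed architecture)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden))


def hybrid_forecast(x, intercept, beta, hidden_weights, hidden_bias, output_weights):
    """Sum of the linear model and the network output."""
    return linear_part(x, intercept, beta) + mlp_part(
        x, hidden_weights, hidden_bias, output_weights
    )


# Hypothetical feature vector and parameters, chosen only for illustration.
x = [1.0, 2.0]
intercept, beta = 0.5, [1.0, -0.5]
hidden_weights = [[0.1, 0.2], [-0.3, 0.4]]
hidden_bias = [0.0, 0.0]
linear_only = hybrid_forecast(x, intercept, beta, hidden_weights, hidden_bias, [0.0, 0.0])
with_network = hybrid_forecast(x, intercept, beta, hidden_weights, hidden_bias, [0.2, -0.1])
```

When the output weights of the network are zero, the hybrid collapses to the plain linear model, so the network only has to learn whatever the linear part cannot explain.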
All individual models have in common that they include linear structures to model the data. Moreover, to account for the MAE-based evaluation framework, we focused on robust and median-based optimization wherever possible.
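Why a median-based fit suits an MAE-based evaluation can be checked numerically: among all constant predictions, the sample median attains the smallest mean absolute error and is far less affected by an outlier than the mean. The data below are made up for illustration.

```python
from statistics import mean, median


def mae(values, constant):
    """Mean absolute error of predicting `constant` for every value."""
    return mean(abs(v - constant) for v in values)


# Hypothetical hourly demands in L/s; the last value mimics an outlier.
demand = [10.0, 11.0, 9.0, 10.5, 30.0]
med = median(demand)
avg = mean(demand)
mae_median = mae(demand, med)
mae_mean = mae(demand, avg)

# Scan a grid of constant predictions to confirm that none beats the median.
grid = [c / 10 for c in range(0, 400)]
best_on_grid = min(mae(demand, c) for c in grid)
```

Here the mean is pulled towards the outlier and pays for it on the four regular observations, whereas the median stays with the bulk of the data.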

2.2.3. Forecast Combination

The smoothed BOA algorithm as applied by [5] is an online aggregation method that dynamically combines forecasts by adjusting weights based on past performance, updating these weights upon receiving new data to favor better-performing forecasts. Its smoothing procedure ensures weight changes are gradual, enhancing stability over time. The smoothing parameter λ ≥ 0 must be determined.
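The general idea of online aggregation, weights that shift towards the models that performed well on the data seen so far, can be sketched with a plain exponentially weighted average. This is a simplified stand-in and not the BOA update itself; it also omits the smoothing step governed by λ. The function name `ewa_combine` and the learning rate `eta` are assumptions made for illustration.

```python
import math


def ewa_combine(expert_forecasts, observations, eta=0.5):
    """Combine expert forecasts online with exponentially weighted averaging.

    expert_forecasts[t][k] is the forecast of expert k for step t.
    Returns the combined forecasts and the weights after the last update.
    """
    n_experts = len(expert_forecasts[0])
    weights = [1.0 / n_experts] * n_experts  # start from equal weights
    combined = []
    for preds, y in zip(expert_forecasts, observations):
        # Forecast first, using only the weights learned from past data.
        combined.append(sum(w * p for w, p in zip(weights, preds)))
        # Once the observation arrives, down-weight experts by their absolute loss.
        losses = [abs(p - y) for p in preds]
        scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
        total = sum(scaled)
        weights = [s / total for s in scaled]
    return combined, weights


# Toy setting: expert 0 is almost unbiased, expert 1 overshoots by 2 L/s.
observations = [10.0] * 20
expert_forecasts = [[10.1, 12.0] for _ in observations]
combined, final_weights = ewa_combine(expert_forecasts, observations)
```

In this toy run the combined forecast starts at the simple average of both experts and moves towards the more accurate one as evidence accumulates.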
Considering the competition framework, we employed a burn-in period and a test period of 15 weeks, respectively. For each DMA, we utilized a stepwise forward approach to determine the most appropriate subset of models to combine with the BOA method. To obtain more reliable results, the previous forecasting weeks and the week following each of them were removed from the training period of the BOA method. To account for the different evaluation horizons of the PI scores, we trained the smoothed BOA algorithm on two model pools and combined their forecasts. The first model pool was tailored to PI1 and PI2, and the second to PI3.
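A stepwise forward approach of this kind can be sketched as a greedy loop that keeps adding the model that most improves the validation error of the combination. As a simplifying assumption, the sketch scores each candidate subset by the MAE of an equal-weight average rather than by running BOA, and the names `combo_mae` and `forward_select` as well as the toy forecasts are hypothetical.

```python
from statistics import mean


def combo_mae(forecasts, names, actual):
    """MAE of the equal-weight average of the named models (assumed criterion)."""
    combo = [mean(forecasts[n][t] for n in names) for t in range(len(actual))]
    return mean(abs(c - a) for c, a in zip(combo, actual))


def forward_select(forecasts, actual):
    """Greedily add models while the combination error keeps decreasing."""
    selected, best = [], float("inf")
    remaining = set(forecasts)
    while remaining:
        scores = {
            name: combo_mae(forecasts, selected + [name], actual)
            for name in sorted(remaining)
        }
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best:
            break  # no remaining model improves the combination
        selected.append(candidate)
        best = scores[candidate]
        remaining.remove(candidate)
    return selected, best


# Toy validation data: two models with opposite bias and one noisy model.
actual = [10.0, 12.0, 11.0, 13.0]
forecasts = {
    "high": [a + 1.0 for a in actual],
    "low": [a - 1.0 for a in actual],
    "noisy": [a + e for a, e in zip(actual, [3.0, -3.0, 3.0, -3.0])],
}
selected, best = forward_select(forecasts, actual)
```

The toy case shows why a subset can beat every single model: the two biased models cancel each other out, while adding the noisy model would only make the combination worse.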

3. Results

As noted in the introduction, the competition consisted of four forecasting weeks and ten DMAs. For illustration purposes, we concentrate on one forecasting week and on one DMA, namely W4 and DMA E. As shown in Table 1, the proposed BOA forecast combination outperforms all individual models for each score. An improvement of MAE by 20% over the naive benchmark is achieved. For comparison, the improvement of the best individual model for DMA E, the AR-based ar1000_des168, is only 11%. In general, the simple linear models, focused on autoregressive effects and seasonal components on a daily and weekly scale, are comparable in performance to more sophisticated models in the model pool. The results were obtained in a 15-week validation study before forecasting week W4.
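The quoted improvements follow from the MAE values in Table 1 as the relative reduction with respect to the naive benchmark. The short check below reproduces them; the function name `relative_improvement` is an illustrative choice.

```python
def relative_improvement(mae_model, mae_benchmark):
    """Relative MAE reduction of a model with respect to a benchmark."""
    return (mae_benchmark - mae_model) / mae_benchmark


# MAE values for DMA E taken from Table 1.
mae_naive = 1.69
boa_improvement = relative_improvement(1.36, mae_naive)  # BOA forecast combination
ar_improvement = relative_improvement(1.51, mae_naive)   # ar1000_des168

print(f"BOA: {boa_improvement:.2f}, best single model: {ar_improvement:.2f}")
# prints "BOA: 0.20, best single model: 0.11"
```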
Figure 1 visualizes the changing weights of the BOA forecast combination in the two model pools for forecasting week W4 and DMA E. We observe that different models are selected depending on the forecast horizon. Moreover, we see that both simple models and more sophisticated ones, such as complex linear (GLM-based) or purely non-linear (MLP-based) models, are included in both model pools.

4. Discussion and Conclusions

We presented a highly flexible and accurate forecasting procedure for participation in the BWDF. The diversity of the individual models in the BOA ensemble is decisive for its performance. However, the adaptation to structural changes such as pipe bursts or pipe repairs remains challenging. The BOA forecast combination yielded a considerable improvement in overall forecasting performance over the best single model. Regarding the individual models, simple linear models obtained results comparable to those of more sophisticated approaches.

Author Contributions

Conceptualization, J.K.-H., B.S., G.J. and F.Z.; methodology, J.K.-H., B.S., G.J. and F.Z.; software, J.K.-H., B.S., G.J. and F.Z.; validation, J.K.-H., B.S., G.J. and F.Z.; formal analysis, J.K.-H., B.S., G.J. and F.Z.; investigation, J.K.-H., B.S., G.J. and F.Z.; resources, J.K.-H., B.S., G.J. and F.Z.; data curation, J.K.-H., B.S., G.J. and F.Z.; writing—original draft preparation, J.K.-H., B.S., G.J. and F.Z.; writing—review and editing, J.K.-H., B.S., G.J. and F.Z.; visualization, J.K.-H., B.S., G.J. and F.Z.; supervision, F.Z.; project administration, F.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Herrera, M.; Torgo, L.; Izquierdo, J.; Pérez-García, R. Predictive models for forecasting hourly urban water demand. J. Hydrol. 2010, 387, 141–150.
  2. Donkor, E.A.; Mazzuchi, T.A.; Soyer, R.; Alan Roberson, J. Urban water demand forecasting: Review of methods and models. J. Water Resour. Plan. Manag. 2014, 140, 146–159.
  3. Narajewski, M.; Kley-Holsteg, J.; Ziel, F. tsrobprep—An R package for robust preprocessing of time series data. SoftwareX 2021, 16, 100809.
  4. Kley-Holsteg, J.; Ziel, F. Probabilistic multi-step-ahead short-term water demand forecasting with Lasso. J. Water Resour. Plan. Manag. 2020, 146, 04020077.
  5. Ziel, F. Smoothed Bernstein online aggregation for short-term load forecasting in IEEE DataPort competition on day-ahead electricity demand forecasting: Post-COVID paradigm. IEEE Open Access J. Power Energy 2022, 9, 202–212.
Figure 1. Combination weights for forecasting week W4 and DMA E for the corresponding model pools; (a) model pool for forecasting horizon hours one to 24 and (b) model pool for forecasting horizon hours 25 to 168.
Table 1. MAE, PI1, PI2 and PI3 values for DMA E on the validation set (15 weeks) for seven pre-selected individual models and the corresponding BOA forecast combination. Imp. (MAE) is the relative MAE improvement over the naive model.

| Model | MAE | Imp. (MAE) | PI1 | PI2 | PI3 |
|---|---|---|---|---|---|
| BOA_forecast_combination | 1.36 | 0.20 | 1.16 | 4.11 | 1.39 |
| ar1000_des168 | 1.51 | 0.11 | 1.57 | 5.99 | 1.50 |
| glm_lasso_Sdwy_hld_W_lag672 | 1.53 | 0.10 | 1.56 | 5.99 | 1.52 |
| adam_mnm_Sdw-lags_arma_ar336 | 1.54 | 0.09 | 1.51 | 5.47 | 1.54 |
| gam_Sdw_ar1008 | 1.56 | 0.08 | 1.61 | 6.72 | 1.55 |
| mlp_lin_pp_Sdwy_W_smw | 1.60 | 0.05 | 1.40 | 4.49 | 1.63 |
| mstl_s7s | 1.60 | 0.05 | 1.52 | 6.24 | 1.62 |
| naive | 1.69 | 0.00 | 1.64 | 6.97 | 1.70 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kley-Holsteg, J.; Sonnenschein, B.; Johnen, G.; Ziel, F. Water Demand Forecasting Based on Online Aggregation for District Meter Areas-Specific Adaption. Eng. Proc. 2024, 69, 15. https://doi.org/10.3390/engproc2024069015
