Proceeding Paper

A Novel Multi-Step Forecasting-Based Approach for Enhanced Burst Detection in Water Distribution Systems †

1
Centre for Water Systems, Faculty of Environment, Science and Economy, University of Exeter, Exeter EX4 4QB, UK
2
Faculty of Environment, Science and Economy, University of Exeter, Exeter EX4 4QB, UK
3
College of Civil Engineering, Hefei University of Technology, Hefei 230009, China
*
Author to whom correspondence should be addressed.
Presented at the 3rd International Joint Conference on Water Distribution Systems Analysis & Computing and Control for the Water Industry (WDSA/CCWI 2024), Ferrara, Italy, 1–4 July 2024.
Eng. Proc. 2024, 69(1), 146; https://doi.org/10.3390/engproc2024069146
Published: 12 September 2024

Abstract

Burst detection is crucial to the efficient and sustainable operation of water distribution systems. For online burst detection based on flow time series data, one challenge is that the definition of an anomaly varies across datasets, which makes a one-size-fits-all anomaly detection algorithm impossible. In addition, existing prediction-driven anomaly detection schemes rely on single-step prediction and lose accuracy because they are susceptible to contamination of the input data. In this paper, a novel burst detection scheme is proposed to address these limitations. The approach uses a multi-step forecasting model, which provides multiple forecasts for each time point, and aggregates these forecasts to establish a common expectation for the data pattern. A metric termed Local Residual Discrepancy (LRD) is proposed to score the deviation between predictions and observations. The effectiveness of the proposed method is evaluated on both synthetic and real datasets. Experimental results reveal significant improvements in detection accuracy with the LRD metric, irrespective of the underlying prediction model. This research contributes a more robust and versatile burst detection approach that is applicable to varied datasets and prediction models in water distribution systems.

1. Introduction

Burst events have been a critical issue in the management of water distribution networks (WDNs). Hardware-based methods such as listening rods, CCTV inspection [1], and infrared thermography [2] usually require specialized professionals to interpret the results and have a high installation cost. Recent advancements in sensing technologies have made the deployment of cost-effective hydraulic sensors feasible, revolutionizing network monitoring [3]. When integrated with appropriate data analysis algorithms, these sensors provide flow and pressure data from which leak events can be identified efficiently. Data-driven detection methods offer real-time analysis, remove the need for highly calibrated hydraulic models, and reduce costs because their requirements are minimal.
For WDN burst detection based on flow data, the focus is on identifying unexpected increases in the data while mitigating the influence of noise and uncertainties [4]. Traditional methods employ a prediction-classification scheme to perform this task. First, a prediction model is configured and trained on historical data to learn the normal behavior of the flow time series. The model is then deployed for real-time prediction of online monitoring data. Because the model is trained to recognize normal data behavior, its predictions can be regarded as the expected data values. Any observed data points that deviate significantly from these expected values are identified as potential burst events [5].
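The prediction-classification scheme can be illustrated with a minimal sketch. It is not the method of any cited work: the predictor is a hypothetical seasonal-naive rule (the value one period earlier) standing in for a trained model, and the `k`-sigma rule applied to the residuals is an assumed classification step.

```python
from statistics import mean, stdev

def seasonal_naive_forecast(series, period):
    """Hypothetical stand-in for a trained model: predict each value
    as the observation one full period earlier."""
    return [series[t - period] for t in range(period, len(series))]

def flag_bursts(observed, predicted, normal_residuals, k=3.0):
    """Flag points whose residual exceeds mean + k * std of the residuals
    collected during a burst-free reference period (assumed rule)."""
    threshold = mean(normal_residuals) + k * stdev(normal_residuals)
    return [(obs - pred) > threshold for obs, pred in zip(observed, predicted)]

# Toy flow series with a period of 4 samples (values are illustrative).
period = 4
history = [10.0, 12.0, 15.0, 11.0, 10.2, 11.9, 15.1, 10.8]
online = [10.1, 12.1, 20.0, 16.0]  # the last two samples contain a burst

# Residuals on the burst-free history define the normal band.
hist_pred = seasonal_naive_forecast(history, period)
normal_residuals = [o - p for o, p in zip(history[period:], hist_pred)]

# Predict the online samples from the last period of the history.
online_pred = history[-period:]
flags = flag_bursts(online, online_pred, normal_residuals)
print(flags)  # [False, False, True, True]
```

Only positive residuals are flagged here, since a burst appears as an unexpected increase in flow.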
However, employing one-step forecasting strategies, which assess a single time point for prediction at a time, can make the detection process sensitive to noise and uncertainties [4]. Furthermore, as burst events typically manifest as continuous occurrences of outliers, the presence of outliers within the input window can hinder the accurate prediction and detection of future time points [6]. Present methods continue to struggle with high false alarm rates and insufficient detection performance.
Hence, this study introduces a novel framework for burst event detection, leveraging multiple forecasting outcomes generated by a multi-step forecasting approach. The multi-step forecasting model yields several predictions for the same time point, enhancing the accuracy and resilience of the forecasting results. The ensemble of these predictions establishes a common expectation for the data pattern. To quantify the disparity between observed data and this common expectation, we introduce a metric called local residual discrepancy (LRD).
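How several forecasts come to target the same time point can be sketched as follows. The sketch assumes a forecast with horizon H is issued at every time step, so that each time point is covered by up to H forecasts. The function name `aggregate_forecasts` and the use of a simple mean as the aggregation rule are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean

def aggregate_forecasts(forecasts):
    """Combine multi-step forecasts that target the same time index.

    `forecasts` maps a forecast origin t0 to the list of predictions for
    t0 + 1, t0 + 2, ..., t0 + H. Returns {target_time: mean prediction}.
    Averaging is an assumed aggregation rule.
    """
    by_target = defaultdict(list)
    for origin, horizon_values in forecasts.items():
        for step, value in enumerate(horizon_values, start=1):
            by_target[origin + step].append(value)
    return {t: mean(values) for t, values in sorted(by_target.items())}

# Three forecasts with horizon H = 3 issued at consecutive origins.
forecasts = {
    0: [10.0, 11.0, 12.0],  # targets t = 1, 2, 3
    1: [11.2, 12.2, 13.0],  # targets t = 2, 3, 4
    2: [12.4, 13.2, 14.0],  # targets t = 3, 4, 5
}
expectation = aggregate_forecasts(forecasts)
print(expectation[3])  # mean of 12.0, 12.2 and 12.4
```

Time point 3 is covered by all three forecasts, so a single contaminated input window affects only part of the evidence for that point.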

2. Method

First, a multi-step prediction model is configured and trained. To test the robustness of the proposed LRD metric, three prediction models are employed in this study: a multilayer perceptron (MLP), a stacked long short-term memory (LSTM) network, and a sequence-to-sequence (Seq2Seq) model. These models are widely recognized in the field of deep learning for time series forecasting [7]. Within each model category, both single-output and multiple-output architectures were implemented. During parameter tuning, the input setting was kept consistent to allow fair comparison, with every model receiving one week of data. The models generally use two or three hidden layers. The number of neurons was chosen through preliminary analysis to minimize the mean absolute error, and hyperparameters such as the dropout rate and learning rate were determined for each model by grid search.
Then, the LRD is calculated to evaluate the disparity between observed data and the common expectation provided by the multi-step prediction model. LRD is formulated as
$$\mathrm{LRD} = \frac{1}{L} \sum_{i=1}^{L} \mathrm{Dist}\left(y_i, \hat{y}_i\right)$$
$$\mathrm{Dist}\left(y_i, \hat{y}_i\right) = \frac{1}{l_i} \sum_{t=1}^{l_i} \left(y_t - \hat{y}_t\right)$$
where $L$ is the length of the prediction window that is being used for calculation, which can be adjusted to achieve optimal performance. $\mathrm{Dist}\left(y_i, \hat{y}_i\right)$ is the distance between the observed data of time window $i$ and the predicted data of time window $i$. $l_i$ is the length of the $i$th time window. $y_t$ is the $t$th observed data in the $i$th time window. $\hat{y}_t$ is the $t$th predicted data in the $i$th time window.
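The LRD calculation can be sketched in a few lines of Python. This is an illustrative reading of the definitions above, not the authors' code. The function names are hypothetical, and the signed residual (observed minus predicted) is assumed so that flow increases give positive scores. Switching to an absolute residual would be a one-line change.

```python
def window_distance(observed, predicted):
    """Dist(y_i, y_hat_i): mean residual over one time window.
    The signed difference (observed - predicted) is an assumption."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("windows must be non-empty and of equal length")
    return sum(y - y_hat for y, y_hat in zip(observed, predicted)) / len(observed)

def local_residual_discrepancy(windows):
    """LRD: average of the window distances over the L windows supplied.
    `windows` is a list of (observed, predicted) pairs."""
    if not windows:
        raise ValueError("at least one window is required")
    return sum(window_distance(obs, pred) for obs, pred in windows) / len(windows)

# L = 2 windows of different lengths l_i (values are illustrative).
windows = [
    ([10.0, 12.0, 14.0], [9.0, 11.0, 13.0]),  # Dist = 1.0
    ([20.0, 22.0], [17.0, 19.0]),             # Dist = 3.0
]
lrd = local_residual_discrepancy(windows)
print(lrd)  # 2.0
```

Because each window is normalized by its own length before the outer average, short and long windows contribute equally to the score.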

3. Results

Both a synthetic dataset and a real dataset are employed in this study to evaluate the performance of the proposed methodology, as shown in Figure 1 and Figure 2. The synthetic dataset is generated from the L-Town hydraulic model, as presented in the CCWI 2020 conference proceedings. The hydraulic model accounts for various uncertainties, including base demands, demand patterns, and pipe parameters. Even so, the resulting dataset retains the artificial elements inherent in synthetic data. A real flow dataset sourced from a water utility company in the United Kingdom is also employed. For both datasets, three levels of burst events are defined, corresponding to 0–5% (level 1), 5–10% (level 2), and >10% (level 3) of the average inflow.
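The three burst levels amount to a simple ratio test, sketched below. The function name `burst_level` and the flow values are hypothetical, and assigning a burst of exactly 5% or 10% to the lower level is an assumption, since the text does not say how the boundaries are handled.

```python
def burst_level(burst_flow, average_inflow):
    """Return 1, 2 or 3 according to burst size relative to average inflow."""
    if average_inflow <= 0 or burst_flow <= 0:
        raise ValueError("flows must be positive")
    ratio = burst_flow / average_inflow
    if ratio <= 0.05:   # 0-5% of average inflow (boundary handling assumed)
        return 1
    if ratio <= 0.10:   # 5-10% of average inflow
        return 2
    return 3            # more than 10% of average inflow

# Hypothetical average inflow of 40 flow units.
levels = [burst_level(q, 40.0) for q in (1.0, 3.0, 6.0)]
print(levels)  # [1, 2, 3]
```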
The LRD is taken as the signature of burst events. By varying the classification threshold applied to the LRD, an ROC curve can be obtained to show the performance of a burst detection method: for each threshold, the true positive rate and false positive rate are calculated. Usually, values that fall outside two or three standard deviations from the mean are considered abnormal. Figure 3 and Figure 4 show the detection results obtained by the three deep learning models with both the LRD metric and the traditional single-step method. Regardless of the model type used, the multi-step model combined with the LRD metric significantly enhances detection performance, as evidenced by ROC curves that lie closer to the top-left corner of the plot. Furthermore, the results indicate that the MLP model outperforms the other models on both the synthetic and real datasets.
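The threshold sweep behind an ROC curve can be sketched as follows. The scores and labels are hypothetical and do not come from the paper's datasets. A sample is flagged as a burst when its score is at or above the threshold, and the area under the curve is computed with the trapezoidal rule as a convenience summary that the paper itself does not report.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over
    every distinct score; a sample is flagged when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both burst and normal samples are required")
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical LRD scores with ground-truth labels (1 = burst, 0 = normal).
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
labels = [0, 0, 1, 1, 0, 1]
points = roc_points(scores, labels)
print(points)
print(auc(points))  # 8/9, about 0.889
```

A curve that hugs the top-left corner corresponds to an area close to 1, which is the visual criterion used to compare the methods in Figure 3 and Figure 4.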

4. Conclusions

This paper proposes a novel residual calculation metric named LRD, combined with a multi-step forecasting model. The proposed method takes advantage of the multiple forecasting results during time window rolling, aggregating all information to establish a unified expectation for future behavior. The emphasis of the LRD metric is placed on the data pattern itself rather than the value of a single point. The proposed method has been applied to both synthetic and real datasets to demonstrate its efficiency and robustness. The detection results indicate significant improvements when employing all three types of deep learning models (MLP, LSTM, and Seq2Seq) in conjunction with the proposed LRD metric compared to traditional single-step prediction frameworks.

Author Contributions

Conceptualization, X.W. and X.Z.; methodology, X.W.; software, X.W.; validation, X.W.; formal analysis, X.W.; investigation, X.W.; resources, R.F.; data curation, X.W.; writing—original draft preparation, X.W.; writing—review and editing, R.F., E.K. and X.Z.; visualization, X.W.; supervision, R.F. and E.K.; project administration, R.F.; funding acquisition, X.W. and R.F. All authors have read and agreed to the published version of the manuscript.

Funding

The first author is funded by the China Scholarship Council (No. 202006370080), and the work is supported by a Royal Academy of Engineering Industrial Fellowship to resource Raziyeh Farmani’s involvement (IF\192057).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The following data and the model used in this study can be made available by the corresponding author on request: data of synthetic experiments and codes for the proposed method in the Python language.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Halfawy, M.R.; Hengmeechai, J. Automated Defect Detection in Sewer Closed Circuit Television Images Using Histograms of Oriented Gradients and Support Vector Machine. Autom. Constr. 2014, 38, 1–13.
  2. Bach, P.M.; Kodikara, J.K. Reliability of Infrared Thermography in Detecting Leaks in Buried Water Reticulation Pipes. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4210–4224.
  3. Wan, X.; Kuhanestani, P.K.; Farmani, R.; Keedwell, E. Literature Review of Data Analytics for Leak Detection in Water Distribution Networks: A Focus on Pressure and Flow Smart Sensors. J. Water Resour. Plan. Manag. 2022, 148, 03122002.
  4. Wan, X.; Farmani, R.; Keedwell, E. Gradual Leak Detection in Water Distribution Networks Based on Multistep Forecasting Strategy. J. Water Resour. Plan. Manag. 2023, 149, 04023035.
  5. Romano, M.; Kapelan, Z.; Savić, D.A. Automated Detection of Pipe Bursts and Other Events in Water Distribution Systems. J. Water Resour. Plan. Manag. 2014, 140, 457–467.
  6. Wang, X.; Guo, G.; Liu, S.; Wu, Y.; Xu, X.; Smith, K. Burst Detection in District Metering Areas Using Deep Learning Method. J. Water Resour. Plan. Manag. 2020, 146, 04020031.
  7. Sahoo, D.; Sood, N.; Rani, U.; Abraham, G.; Dutt, V.; Dileep, A.D. Comparative Analysis of Multi-Step Time-Series Forecasting for Network Load Dataset. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies, ICCCNT 2020, Kharagpur, India, 1–3 July 2020.
Figure 1. Visualization of the synthetic dataset with five burst events.
Figure 2. Visualization of the real dataset with five burst events.
Figure 3. ROC curve for the synthetic dataset (with LRD duration of 2 h) with burst events at (a) level 1; (b) level 2; (c) level 3.
Figure 4. ROC curve for the real dataset (with LRD duration of 2 h) with burst events at (a) level 1; (b) level 2; (c) level 3.


Wan, X.; Farmani, R.; Keedwell, E.; Zhou, X. A Novel Multi-Step Forecasting-Based Approach for Enhanced Burst Detection in Water Distribution Systems. Eng. Proc. 2024, 69, 146. https://doi.org/10.3390/engproc2024069146
