#### *5.1. Simulation Setup and Large-Scale WDN*

The accuracy and robustness of PEOS in fault detection were further demonstrated on a large-scale drinking WDN under different data collection issues. Figure 6 displays the structure of pipe network B with its various faults. Pipe network B was modified from Reference [64], and the pipe characteristics are listed in Table 7. The pipe network consisted of 74 pipes and 48 nodes, including 11 continuous consumption nodes, 2 water supply nodes, and 2 constant-head reservoirs. All pipes were considered to be cast iron pipes that had been in long-term service. Hence, the initial H–W coefficient *CHW*(0) and wave speed *a0* for all pipes in pipe network B were 130 and 1000 m/s, respectively. The *CHW*(*t*) for each pipe was also calculated by Equations (2)–(4) and is listed in the last column of Table 7.

**Table 7.** The characteristics of the large-scale WDN (pipe network B).




**Figure 6.** Configuration of the large-scale WDN (pipe network B).

N1 was the first reservoir, with a constant head of 138.9 m, and the second reservoir, N2, had a constant head of 91.4 m. The inflow rates at nodes N9 and N31 were both 1620.33 L/s. The consumption rates at nodes N10, N14, N17, N21, N25, N30, N32, N37, N45, N46, and N47 were respectively 23.15, 17.36, 162.04, 74.07, 104.17, 12.73, 92.59, 138.88, 254.63, 196.76, and 16.2 L/s.

Three leaks were located at different pipes. Leak L1 was at the middle of P19, 1150 m away from N36. Leak L2 was located at P32, 0 m away from N22, implying that leak L2 occurred exactly at N22. Leak L3 was located at P41, 960 m away from N12. The *CdLAL* values for L1, L2, and L3 were respectively 2.00 × 10<sup>−4</sup>, 1.00 × 10<sup>−4</sup>, and 1.20 × 10<sup>−4</sup> m<sup>2</sup>. In addition, *QLs* was 2.0, 1.0, and 1.5 L/s for L1, L2, and L3, respectively.

Two partial blockages, B1 and B2, were situated at P23 and P39, respectively. B1 was 200 m away from N33 and blocked 20% of the cross-sectional area of P23, while B2 was 600 m away from N10 and blocked 15% of the cross-sectional area of P39. Hence, the *CdBAB* values of B1 and B2 were 4.0 × 10<sup>−1</sup> m<sup>2</sup> and 6.0 × 10<sup>−1</sup> m<sup>2</sup>, respectively. Moreover, two distributed deterioration reaches, D1 and D2, occurred at P62 and P67, respectively. D1 was 400 m away from N22, while D2 was 600 m away from N19. The length, wave speed, impedance, and cross-sectional area of D1 were respectively 40 m, 800 m/s, 163.2 s/m<sup>2</sup>, and 0.50 m<sup>2</sup>, while those of D2 were 30 m, 600 m/s, 122.4 s/m<sup>2</sup>, and 0.50 m<sup>2</sup>. The properties of the two deterioration reaches are shown in Figure 6 as well.

The outflow node N17 served as both the transient operation point and the data collection point for pipe network B. The Δ*t* was again selected to be 0.01 s. Thus, the initial Δ*x* was again 10 m for the intact pipe reaches and was adjusted according to the different wave speeds in the deterioration reaches. Because the WDN was large and complicated, the transient wave may have needed more time to arrive at the fault points/parts, so the total simulation time was increased to 60 s. A complete simulation therefore used a total of 6001 data points. Two different cases with different data collection issues were considered to test the reliability of the proposed approach for fault detection in a large-scale WDN. *NP* was chosen to be 50, and *Miter* was raised to 20,000 to allow for a possibly very large number of iterations. The transient was excited by the complete closure of the valve over a period of 5 or 10 s.
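The discretization figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the reach length follows Δ*x* = *a*Δ*t* (a Courant number of 1), that the listed impedances correspond to *a*/(*gA*) with *g* = 9.806 m/s<sup>2</sup>, and that both end points of the record are counted. None of these assumptions is stated explicitly in this section, and the function names are hypothetical.

```python
G = 9.806  # gravitational acceleration in m/s^2 (assumed value)


def grid_spacing(wave_speed, dt):
    """Reach length dx = a * dt (Courant number of 1, assumed)."""
    return wave_speed * dt


def sample_count(total_time, dt):
    """Number of recorded heads, counting both t = 0 and t = total_time."""
    return round(total_time / dt) + 1


def impedance(wave_speed, area, g=G):
    """Characteristic impedance a / (g * A) in s/m^2 (assumed formula)."""
    return wave_speed / (g * area)


DT = 0.01  # time step in s, taken from the text

# (wave speed in m/s, length in m, area in m^2) of the deterioration reaches
deteriorated = {"D1": (800.0, 40.0, 0.50), "D2": (600.0, 30.0, 0.50)}

print("intact dx:", grid_spacing(1000.0, DT), "m")
for name, (a, length, area) in deteriorated.items():
    dx = grid_spacing(a, DT)
    print(name, "dx:", dx, "m,", round(length / dx), "reaches,",
          "impedance:", round(impedance(a, area), 1), "s/m^2")
print("samples in 60 s:", sample_count(60.0, DT))
```

Under these assumptions, the intact reach length, the 6001-point record, and the two listed impedances are all reproduced to the precision given in the text.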

#### *5.2. Case Description and Error Criteria*

Three cases were selected to test the capability of PEOS in fault detection in a complex pipe network such as pipe network B, considering the effects of limited observations, measurement errors, and inappropriate transient operation. Case 1 used fewer data, sampled at a longer interval of 0.1 s (i.e., 10% of the original sampling frequency), to represent instrument limitations in a field survey; thus, 601 data points were collected and used in the simulation of case 1. In case 2, measurement errors were added to the 601 low-frequency data points to represent the uncertainty in data collection. The random measurement error added to each data point in case 2 was generated as white noise ε, normally distributed with a zero mean and a standard deviation of 0.01 m. The observation heads with errors were obtained by adding ε to each observed head:

$$H_{\rm ij}^{o} = H_{\rm ij}^{o} + \varepsilon. \tag{15}$$

Case 3 used the same sampling frequency as case 1, but the transient operation time was extended to 10 s to examine the effects of an inappropriate transient operation. The 601 data points collected after 10 s of transient operation were used in the simulations of case 3.
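As a rough illustration of how the data sets of cases 1 and 2 could be prepared from a full-resolution record, the sketch below keeps every tenth sample and then adds zero-mean Gaussian noise as in Equation (15). The flat placeholder record, the function names, and the fixed seed are hypothetical and serve only to make the example reproducible; the noise generator actually used in the study is not specified here.

```python
import random


def downsample(heads, step=10):
    """Keep every `step`-th head, e.g., 0.01 s sampling -> 0.1 s sampling."""
    return heads[::step]


def add_white_noise(heads, sigma=0.01, seed=None):
    """Add zero-mean Gaussian noise (standard deviation sigma, in m) to each head."""
    rng = random.Random(seed)
    return [h + rng.gauss(0.0, sigma) for h in heads]


# Placeholder for a simulated 60 s record at 0.01 s (6001 heads, in m).
# A real record would come from the transient simulation at node N17.
full_record = [100.0] * 6001

case1_heads = downsample(full_record, step=10)                 # limited observations
case2_heads = add_white_noise(case1_heads, sigma=0.01, seed=42)  # plus measurement error

print(len(case1_heads), len(case2_heads))
```

Slicing from the first sample keeps both end points of the record, which is why 6001 heads reduce to 601 rather than 600.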

To evaluate the effects of limited observations and measurement errors on the results predicted by the proposed approach, two error criteria were considered: the standard error of the estimate (SEE) and the mean error (ME). The SEE measures the accuracy of the predictions and is defined as the square root of the sum of squared errors between the observed and predicted heads divided by the number of degrees of freedom, which equals the number of observed data points minus the number of unknowns. The ME is the average of the errors between the observed and simulated heads.
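The two criteria can be written compactly in code. The following is a minimal sketch that follows the verbal definitions above; the function names are hypothetical, the number of unknowns is left as a parameter because it depends on the case, and the sign convention of the error (observed minus simulated) is an assumption.

```python
import math


def standard_error_of_estimate(observed, simulated, n_unknowns):
    """SEE: sqrt(sum of squared errors / (number of data points - number of unknowns))."""
    dof = len(observed) - n_unknowns
    if dof <= 0:
        raise ValueError("need more observations than unknowns")
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(sse / dof)


def mean_error(observed, simulated):
    """ME: average of (observed - simulated); the sign convention is assumed."""
    return sum(o - s for o, s in zip(observed, simulated)) / len(observed)


# Small made-up example, not data from the study.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.1, 1.9, 3.1, 3.9, 5.1]
print(standard_error_of_estimate(obs, sim, n_unknowns=1))
print(mean_error(obs, sim))
```

Because the ME averages signed errors, positive and negative deviations can cancel; it therefore indicates bias, whereas the SEE reflects the overall spread of the errors.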
