**4. Application Results**

The KF-based method was applied to a two-day series of pipe flow rates measured at 14 meter locations to estimate node group demands. The RMSE for each group (Equation (7)) was calculated by comparing the synthetically generated node group demands for the two days with the estimated values (both the pipe flow rates and the node group demands were in 5-min time steps). Erroneous node groupings caused the state estimates to diverge, which mostly originated from a badly scaled or (nearly) singular matrix in Equation (4) (*i.e.*, the determinant of the matrix $(\mathbf{H}_{uk}\mathbf{P}^{-}_{k}\mathbf{H}^{T}_{uk} + \mathbf{R}_{k})$ to be inverted was close to zero). In the optimization of the proposed ONG model, a high penalty value (e.g., 10,000 × 14 = 140,000) was added to the objective function value (*i.e.*, RMSE) in Equation (8) if a matrix was ill-conditioned. To speed up the optimization, the demand estimation was terminated at the time step at which the ill-conditioned matrix was identified.
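The penalty logic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of the condition number as the ill-conditioning test, and the threshold `COND_LIMIT` are all assumptions introduced here, and the text does not state which test was actually used.

```python
import numpy as np

PENALTY = 10_000 * 14   # 140,000, the example penalty value given in the text
COND_LIMIT = 1e12       # assumed threshold; not specified in the paper


def innovation_covariance(H, P_prior, R):
    """Matrix that must be inverted in the KF gain: H P^- H^T + R."""
    return H @ P_prior @ H.T + R


def is_ill_conditioned(S, cond_limit=COND_LIMIT):
    """Flag a matrix that is non-finite or (nearly) singular."""
    if not np.all(np.isfinite(S)):
        return True
    return bool(np.linalg.cond(S) > cond_limit)


def penalized_fitness(rmse_sum, ill_conditioned):
    """Add the penalty to the RMSE-based objective for infeasible solutions."""
    return rmse_sum + PENALTY if ill_conditioned else rmse_sum
```

In an optimization loop, `is_ill_conditioned` would be checked at every time step, and the estimation for that candidate grouping would stop at the first step where it returns `True`.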

Note that the KF-based demand estimation method estimated the nodal group demands using only pipe flow measurements, without being given any information on or values of the true nodal group demands. In this study, the true nodal (group) demands were synthetically generated and entered into the WDS system equations to obtain pipe flow rates at the meter locations, to which measurement error was added to produce the final pipe flow measurements. Because they had passed through the non-linear governing equations and had noise added, the pipe flow measurements did not contain clear clues for tracking the true nodal group demands. Therefore, estimating the nodal group demands was not a circular numerical calculation.
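The measurement-generation step can be illustrated with a short sketch. Everything here is hypothetical: `solve_flows` stands in for the hydraulic solver of the WDS system equations, and zero-mean Gaussian noise is an assumption, since this section does not state the error distribution.

```python
import numpy as np


def make_measurements(true_demands, solve_flows, noise_std, rng):
    """Turn true demands into noisy pipe flow measurements.

    solve_flows: callable mapping demands to pipe flows at the meters
                 (a placeholder for the hydraulic model, assumed here).
    noise_std:   standard deviation of the assumed Gaussian meter error.
    """
    flows = np.asarray(solve_flows(true_demands), dtype=float)
    noise = rng.normal(0.0, noise_std, size=flows.shape)
    return flows + noise


def toy_solver(demands):
    """Toy stand-in for the solver; NOT a hydraulic model."""
    return np.array([demands.sum(), demands[0]])


rng = np.random.default_rng(0)
measured = make_measurements(np.array([1.0, 2.0]), toy_solver, 0.1, rng)
```

The estimator only ever sees `measured`; the true demands are kept aside and used afterwards to compute the RMSE.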

First, the differences between the node groupings of the initial and final optimal solutions were investigated. Then, the spatial distributions of the nodes in a group and node groups were examined along with whether or not there was a one-to-one relationship between a meter and a node group. The RMSE values of the individual node group demand estimates were determined. Conclusions were drawn on the demand estimation error, base demand of the node group and meter locations.

### *4.1. Optimal Node Grouping Results*

Figure 7 shows the trajectory of the best fitness value (*i.e.*, sum of node group RMSEs) over the iterations. Note that the sum of the RMSEs reported by Jung and Lansey [8] was 74.1 L/s. They determined the node grouping based on engineering judgment. The best RMSE with the proposed ONG model was 56.2 L/s, which was about 76% of the RMSE obtained by Jung and Lansey [8] (Figure 7). Jung and Lansey [8] included seven nodes (out of 47 demand nodes) with industrial and commercial demands, which slightly complicated their demand estimation. Other conditions (e.g., meter locations and measurement time interval) were the same as in this study.
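As a quick check of the ratio quoted above, using only the two RMSE sums given in the text:

$$
\frac{56.2\ \text{L/s}}{74.1\ \text{L/s}} \approx 0.758, \qquad 1 - 0.758 = 0.242,
$$

that is, the optimized grouping reduced the summed RMSE by roughly 24% relative to the grouping based on engineering judgment.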

Most solutions found in the early stage of the optimization had diverging demand estimates (*i.e.*, infeasible solutions with a penalty value), which gave those solutions a fitness value of 140,000 (not included in Figure 7). The main reason for the divergence was the scattered distribution of node groups (Figure 8). Generally, the highest accuracy is achieved when a group of gathered nodes has a one-to-one relationship with a meter in close proximity (the opposite of the node grouping in the infeasible solutions). A feasible solution was found after a few iterations, and step decreases in the fitness value were observed until the best fitness of 56.2 L/s was reached. The results for the individual node group demands are discussed in Section 4.2.

**Figure 7.** Trajectory of the fitness value of the best solution.

**Figure 8.** Node groups of a representative initial random solution and meter locations. Each node group is delineated by either a different shape outline or filled by different colors/styles.

Figure 9 shows the optimal node grouping layout found by the proposed node grouping model. In contrast to the representative initial solution shown in Figure 8, the optimal solution spatially gathered the nodes in each group. For example, Node Group (NG) 14 comprised three nodes in proximity at the south end of the study network, whereas the five nodes at the north end composed NG8. NG4 had only one node. Some node groups (e.g., NG3, NG6 and NG10) comprised nodes that were not close to each other (*i.e.*, slightly scattered), which was mostly caused by the existence of node(s) with no external demand between the nodes. For example, there was a zero-demand node between the leftmost and middle nodes in NG3. The three zero-demand nodes near NG11 are also worth noting (Figure 9).

**Figure 9.** Optimal node groups identified by the proposed model. A node that is not grouped (boxed) has no external demand. Therefore, demand estimation is not required.

Figure 9 also shows the locations of the 14 meters installed in the study network. Note that meters were located not only at the transmission mains (*e.g.*, the source pipe linked to Source 1 and the two pipes on the right and left sides of the NG13 box) but also in the distribution pipes within the loops at the northeast and southwest parts of the network. Finding an apparent one-to-one relationship between a node group and a meter was very difficult because of the complex hydraulic relationships in a looped network. Unlike in a simple branched network, the response of the pipe flow rate at a meter location to a perturbation in a node group demand, which is the information required for demand estimation, cannot be easily identified by visual inspection or engineering knowledge/sense in a looped network. Therefore, this result highlights the need for the proposed model when trying to find optimal node groups that result in a highly accurate WDS demand estimation.

Multiple optimal (or near-optimal) solutions for the ONG problem of real large networks could be employed. More preference would be given to the solution with a high tendency of classifying nodes in proximity or of the same user type into the same group.

### *4.2. Accuracy of Individual Nodal Group Demand Estimation*

Table 1 summarizes the RMSE of the individual node group demand estimates. Figure 10 plots the actual and estimated demands of several representative node groups (*i.e.*, NG1, 4, 7, 8, 12 and 13). The non-linear KF was used as the unbiased minimum-variance estimator to minimize the error between the two values. While accurate demand estimation was achieved in most node groups, the highest RMSE was obtained in NG13, followed by NG1 (Table 1; Figure 10f and a, respectively). The error in NG1 mainly originated from the fact that NG1 had the largest base demand among the node groups (see the range of the *y*-axis in Figure 10a compared to the other plots in Figure 10). The node group with the largest base demand should have the largest error value if the proportion of error to the base demand is assumed to stay similar or the same across groups. In contrast, the error in NG13 was caused by the lack of available information/signals for the demand estimation.
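The proportionality argument can be made concrete with a small sketch. It assumes the standard RMSE definition (Equation (7) is not reproduced in this section) and introduces a hypothetical helper, `relative_rmse`, that divides the RMSE by the base demand. The numbers are illustrative only and are not taken from Table 1.

```python
import numpy as np


def rmse(actual, estimated):
    """Root-mean-square error between actual and estimated demand series."""
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((estimated - actual) ** 2)))


def relative_rmse(actual, estimated, base_demand):
    """RMSE expressed as a fraction of the group's base demand."""
    return rmse(actual, estimated) / base_demand


# Two made-up groups with the same 5% proportional error.
large = rmse([100.0, 100.0], [105.0, 95.0])   # base demand 100 L/s
small = rmse([10.0, 10.0], [10.5, 9.5])       # base demand 10 L/s
```

Under equal proportional error, the absolute RMSE of the large group is ten times that of the small one, while `relative_rmse` is identical for both. Comparing groups on the normalized value separates errors explained by demand magnitude from errors caused by a lack of informative measurements.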

**Figure 10.** Actual (circle) and estimated (line) group demands at 5-min time steps for the first 24 h: (**a**) NG1; (**b**) NG4; (**c**) NG7; (**d**) NG8; (**e**) NG12; and (**f**) NG13 (NG = node group).

**Table 1.** Node group demand estimation RMSE (sum of node group demands' RMSE = 56.2 L/s).


The nodes in NG13 were located along the transmission pipes that delivered bulk flows to supply the north, northeast and southwest parts of the study network (Figures 6 and 9). Only the meter located at the source pipe linked to Source 1 could measure NG13's demand (together with other demands), and the proportion of the group demand was very small compared to the total pipe flow measured at that meter (*i.e.*, the total system demand, because Source 2 was not operated). NG13's demand was smaller than the natural randomness of the total system demand (its base value was 726 L/s), which made it difficult to extract useful information from the measurements. This led to a fluctuating demand estimate with large deviations from the actual demand (Figure 10f).

Similarly, fluctuating demand estimates with large deviations from the actual values were observed in node groups with a small base demand (e.g., NG7 and NG12 in Figure 10c and e, respectively). In other words, the proportion of error appeared to be higher in these groups than in the others. This result implies that the demand estimates for such groups are unreliable and difficult to use for management purposes. Such cases can be avoided by placing meters at transmission pipes and also distributing them among small pipes to capture pipe flow information of various magnitudes, which is required for WDS demand estimation. Examples include a conventional meter located at the end of a branched network (delivering demand to one or two pendant nodes) or an AMI installed at the inlet pipe to each household.
