**5. Summary and Conclusions**

State estimation is the process of calculating a state variable of interest that is not directly measured. In a WDS, nodal demands are treated as the state variables (*i.e.*, unknown variables that cannot be directly measured) and can be estimated from nodal pressures and pipe flow rates measured at sensors installed throughout the system. Because the information available for demand estimation is limited, the number of unknown variables should be kept small. One way to do so is to group nodes that are correlated (e.g., nodes that share a demand pattern and are close to each other). Finding the node grouping that yields the best demand estimation accuracy is challenging because it depends on the number and location of meters and on complex hydraulic conditions.

This study proposes an ONG model that minimizes the sum of the RMSEs of the node groups' estimated demands for a given number of node groups. The KF-based demand estimation method is linked with a genetic algorithm for node group optimization. As a demonstration, the proposed model is applied to estimate demands in the modified Austin network. The true demands and field measurements are synthetically generated using a hydraulic model of the study network. Note that the proposed model is the first to combine a WDS demand estimation tool (*i.e.*, the KF-based model) and a node clustering/aggregation optimization model (*i.e.*, eGA).
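The objective described above can be illustrated with a minimal Python sketch. It assumes that a group's demand is the sum of the demands of its member nodes and that the estimated group demands are already available (in the study they come from the KF-based estimator, which is not reproduced here). The function names `aggregate_demand`, `rmse`, and `sum_of_rmses` are hypothetical and are not taken from the authors' Matlab code.

```python
import math


def aggregate_demand(nodal_demand, grouping):
    """Sum nodal demand series into one series per node group.

    nodal_demand: {node_id: [demand at t0, t1, ...]}
    grouping:     {node_id: group_id}
    Assumption: a group's demand is the sum of its nodes' demands.
    """
    grouped = {}
    for node, series in nodal_demand.items():
        group = grouping[node]
        if group not in grouped:
            grouped[group] = [0.0] * len(series)
        grouped[group] = [a + b for a, b in zip(grouped[group], series)]
    return grouped


def rmse(estimated, true):
    """Root-mean-square error between two equally long series."""
    if len(estimated) != len(true) or not true:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimated, true)]
    return math.sqrt(sum(squared) / len(squared))


def sum_of_rmses(estimated_by_group, true_by_group):
    """Objective to minimize: sum of per-group RMSEs."""
    return sum(
        rmse(estimated_by_group[g], true_by_group[g]) for g in true_by_group
    )
```

In an optimization loop, a candidate grouping would be passed to `aggregate_demand` to obtain the true group demands, the estimator would be run for that grouping, and `sum_of_rmses` would return the fitness value that the genetic algorithm tries to reduce.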

The sum of the RMSEs of the best solution found with the proposed model is 56.2 L/s, which is about 76% of the value obtained with the node grouping that Jung and Lansey [8] defined based on engineering judgment. In contrast to the randomly generated initial solutions, the nodes are spatially grouped in the final solution. However, no apparent one-to-one linkage (mapping) is observed between a meter and a node group. This result indicates that the proposed ONG model should be applied to identify the best node grouping for demand estimation in loop-dominated networks. Among individual node groups, the highest RMSE is obtained for the node group with the largest base demand. In addition, the ratio of the error to the true demand is large for node groups with a small base demand. Therefore, micrometers (*i.e.*, meters that can measure a small flow, such as a conventional meter installed on pipes to dead-end nodes or AMIs) should be installed to further increase the demand estimation accuracy.

This study has several limitations that future research must address. First, the model finds the optimal node grouping with the number of node groups predefined as equal to the number of meters. The model can be extended to include the number of node groups as a decision variable. Assuming that as many meters as node groups must be installed for accurate demand estimation, the problem can be formulated as a multi-objective problem that minimizes both the sum of RMSEs and the cost of meters (*i.e.*, meter instrument and installation costs). High demand estimation accuracy requires many meters and therefore a large investment. This tradeoff can be explored by solving the multi-objective demand estimation and meter placement problem.
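The tradeoff between accuracy and meter cost can be made concrete with a small non-dominated filter. The sketch below is only an illustration of the multi-objective idea, not a method used in this study: each candidate is assumed to carry a sum of RMSEs and a meter cost, both to be minimized, and the keys `rmse_sum` and `cost` are hypothetical names.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is a dict with the keys "rmse_sum" and "cost".
    Candidate o dominates c if o is no worse in both objectives
    and strictly better in at least one of them.
    """
    front = []
    for c in candidates:
        dominated = any(
            o["rmse_sum"] <= c["rmse_sum"]
            and o["cost"] <= c["cost"]
            and (o["rmse_sum"] < c["rmse_sum"] or o["cost"] < c["cost"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front
```

Applied to a set of solutions with different numbers of meters, the returned front would show how much additional accuracy each extra unit of meter cost buys.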

Second, the proposed model should be verified using demand data measured in a large real network that is fully equipped with AMI/automatic water meter reading.

Third, an advanced warm-start approach for the initial solution should be developed to shorten the time needed to find a feasible solution in the early optimization phase. The Euclidean distance among nodes can be considered in the initial node grouping to avoid placing spatially scattered nodes in the same group. Fourth, this study uses a loop-dominated network to demonstrate the proposed model. Different network types (e.g., branched or DMA-structured) can be used to check (1) whether the conclusions of this study remain valid and (2) whether such network layouts affect the demand estimation accuracy and the optimal node grouping. Finally, the proposed model can serve as a submodule for WDS operation and management tools (e.g., a real-time WDS operation model that determines the status of pumping units given the estimated future demand).
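The distance-based warm start mentioned above could look like the following sketch, which assigns every node to the nearest of a set of seed locations so that each initial group is spatially compact. The choice of seeds (for example, meter locations) and the function name `initial_grouping` are assumptions made for illustration; the study itself starts from randomly generated groupings.

```python
import math


def initial_grouping(node_xy, seed_xy):
    """Assign each node to the index of its nearest seed location.

    node_xy: {node_id: (x, y)} node coordinates
    seed_xy: [(x, y), ...] one seed per node group (assumed input)
    Returns {node_id: group_index}.
    """
    if not seed_xy:
        raise ValueError("at least one seed location is required")
    grouping = {}
    for node, xy in node_xy.items():
        # Pick the seed with the smallest Euclidean distance to the node.
        grouping[node] = min(
            range(len(seed_xy)), key=lambda k: math.dist(xy, seed_xy[k])
        )
    return grouping
```

Such a grouping could seed part of the initial population, with the remaining individuals generated randomly to preserve diversity.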

**Acknowledgments:** This work was supported by a grant from the National Research Foundation (NRF) of Korea funded by the Korean government (MSIP) (No. 2013R1A2A1A01013886).

**Author Contributions:** Donghwi Jung and Young Hwan Choi surveyed previous studies and wrote the draft of the manuscript. Donghwi Jung wrote the Matlab code of the proposed ONG model and performed the optimization runs. Joong Hoon Kim and Donghwi Jung conceived the original idea of the proposed model and revised the draft into the final manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.
