### *2.5. Elitism-Based GA*

A GA [31] is a metaheuristic optimization algorithm that mimics the process of natural selection (*i.e.*, "the survival of the fittest"). A standard GA (SGA) generally begins with a randomly generated initial population (*i.e.*, a group of potential solutions). A pair of solutions is selected, with high-fitness solutions more likely to be chosen (selection process). The pair is then subjected to crossover, which exchanges genetic traits (*i.e.*, decision variables) between the two solutions, and to mutation, which modifies a few of those traits. This series of processes (selection–crossover–mutation) is called an "iteration" and is repeated until the child population is filled. Solutions with better fitness tend to appear in the population over the iterations.

Various versions of the GA have been released since Holland [31] first introduced the SGA, including the elitism GA, greedy GA, adaptive GA, and refined GA [32–35]. One reason for so many variants is that the SGA cannot carry the current best solution over to the next generation. In other words, the SGA is inefficient at exploiting good new solutions found during the iterations. Therefore, an elitism-based GA (eGA) is proposed to optimize the WDS node grouping problem.

The proposed eGA selects two solutions from the parent population through roulette wheel selection, where the probability that a solution is selected is proportional to its fitness. As in the general GA, a high-fitness solution has a better chance of being selected. The general crossover and mutation processes are then performed at frequencies defined by the crossover and mutation rates. Crossover generally occurs with a probability of 70%–90%, while mutation occurs with a probability of 2%–10%. The resulting two solutions are called children solutions. The two children solutions compete with the selected parent solutions in a fitness-based tournament to determine whether either of them can replace the parent solution(s). Therefore, one iteration of the eGA requires only two function evaluations, whereas the SGA requires *ni* function evaluations, where *ni* is the total number of individuals (*i.e.*, solutions) in the population. The prompt inclusion of newly found good solutions makes the eGA search more efficient than the SGA search because such a solution can be considered as a parent solution for selection in the immediately following iteration. Chromosomes are integer-coded (*i.e.*, decision variables are integers) in the eGA (Figure 3) for the node grouping optimization in the WDS demand estimation.
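The iteration described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: one-point crossover, per-gene mutation that reassigns a node to a random group index, and a higher-is-better fitness are all assumed, and the names `roulette_select`, `ega_iteration`, `evaluate`, and `n_groups` are hypothetical.

```python
import random


def roulette_select(fitness, rng):
    """Return one index, drawn with probability proportional to fitness."""
    return rng.choices(range(len(fitness)), weights=fitness, k=1)[0]


def ega_iteration(population, fitness, evaluate, n_groups, rng,
                  crossover_rate=0.8, mutation_rate=0.05):
    """Run one eGA iteration in place on `population` and `fitness`."""
    # Selection: two roulette-wheel draws from the parent population.
    i = roulette_select(fitness, rng)
    j = roulette_select(fitness, rng)
    parent_a, parent_b = population[i], population[j]
    child_a, child_b = list(parent_a), list(parent_b)

    # Crossover (assumed one-point): swap the tails of the two chromosomes.
    if len(child_a) > 1 and rng.random() < crossover_rate:
        cut = rng.randrange(1, len(child_a))
        child_a[cut:], child_b[cut:] = parent_b[cut:], parent_a[cut:]

    # Mutation (assumed per-gene): reassign a node to a random group index.
    for child in (child_a, child_b):
        for k in range(len(child)):
            if rng.random() < mutation_rate:
                child[k] = rng.randrange(n_groups)

    # The only two function evaluations of this iteration.
    fit_a, fit_b = evaluate(child_a), evaluate(child_b)

    # Tournament: the fittest of parents and children keep the parents' slots.
    pool = [(fitness[i], parent_a), (fitness[j], parent_b),
            (fit_a, child_a), (fit_b, child_b)]
    pool.sort(key=lambda entry: entry[0], reverse=True)
    slots = [i, j] if i != j else [i]  # same parent drawn twice: one slot
    for slot, (fit, solution) in zip(slots, pool):
        population[slot], fitness[slot] = solution, fit
```

Because the parents remain in the tournament pool, the best fitness in the population cannot decrease from one iteration to the next, and a winning child becomes eligible for selection in the very next iteration. For a minimization objective, the objective value would first have to be mapped to a positive, higher-is-better fitness (for example, an assumed transform such as 1/(1 + *F*)) before roulette wheel selection.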

### *2.6. Optimal Node Grouping Model*

The number of unknown variables (*i.e.*, demands) should be reduced because of the lack of available information. Nevertheless, an accurate demand estimation can still be achieved when each pipe flowmeter is linked to an appropriate node group, which is usually the group of nodes near the meter whose demand affects the pipe flow rate. Therefore, given the layout of pipe flowmeters, finding the optimal node groups plays a key role in determining the WDS demand estimation accuracy. Assuming that no spatial proximity information between the meters and the nodes is available, the proposed optimal node grouping model finds the node groups that minimize the total RMSE of the node group demand estimated by the KF-based method, given the total number of node groups:

$$\text{Minimize } F = \sum_{i=1}^{n_g} RMSE_i \tag{8}$$

$$\text{subject to } n_g = m \tag{9}$$

where *RMSEi* is the RMSE of the *i*-th node group; *ng* is the total number of node groups (*i* = 1, 2, ..., *ng*); and *m* is the number of node groups predefined by the user, generally equal to the total number of pipe flowmeters in the WDS.
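As an illustration of how Equations (8) and (9) could be evaluated for one candidate grouping, the following Python sketch sums the per-group RMSE and rejects groupings whose number of groups differs from *m*. It assumes that each *RMSEi* is computed over time between the estimated and the reference (synthetic) group demand series, and that the KF-based estimates are supplied from outside; the function names and the dictionary-based data layout are hypothetical.

```python
import math


def rmse(estimated, reference):
    """Root-mean-square error between two equally long demand series."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(estimated))


def total_rmse(estimated_by_group, reference_by_group, m):
    """Objective F of Eq. (8); raises if the constraint of Eq. (9) fails."""
    n_g = len(estimated_by_group)
    if n_g != m:
        # Assumed handling: treat a violated constraint as an invalid input.
        raise ValueError(f"expected {m} node groups, got {n_g}")
    return sum(rmse(estimated_by_group[g], reference_by_group[g])
               for g in estimated_by_group)


# Toy data (made up for illustration): two groups, short demand series.
estimated = {0: [1.0, 2.0, 3.0], 1: [2.0, 2.0]}
reference = {0: [1.0, 2.0, 3.0], 1: [0.0, 0.0]}
F = total_rmse(estimated, reference, m=2)  # group 0 contributes 0, group 1 contributes 2
```

Raising an error on a violated constraint is only one option; an optimizer could instead repair the chromosome or assign a penalty, and the source does not state which approach is used.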

Note that, unlike other optimization problems developed in the WDS field (e.g., the WDS design problem), the proposed ONG problem does not include pressure constraints. However, the synthetic demand generated by the methodologies described in Sections 2.1 and 2.2 results in nodal pressures within a normal operating range of 21–28 m (30–40 psi).
