## *3.2. Community Structure Algorithm*

The community structure detection algorithm is a bottom-up hierarchical approach based on graph theory, proposed by Newman and Girvan [63] and Clauset et al. [64]. It uses greedy optimization of a quantity known as modularity (*Q*), defined in Equation (1). The underlying quality measure is edge density: a division of the network is considered good if there are many edges within communities (intracluster) and only a few between them (intercluster). The modularity index quantifies the quality of a given division of the graph into communities, and the clustering method seeks the division that maximizes it. Higher values indicate stronger community structure, which is considered significant if *Q* ≥ 0.3 [63,64]:

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(C_i, C_j) \tag{1}$$

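where $A_{ij}$ is the element of the adjacency matrix (1 if vertices *i* and *j* are connected, 0 otherwise), $k_i$ is the degree of vertex *i*, *m* is the total number of edges in the network, and $\delta(C_i, C_j)$ is the Kronecker delta, with $\delta(C_i, C_j) = 1$ if vertices *i* and *j* belong to the same community and $\delta(C_i, C_j) = 0$ otherwise.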
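Equation (1) can be evaluated directly on a small graph. The following is a minimal sketch, not the implementation used in the cited studies; the function name `modularity` and the toy network (two triangles joined by a single pipe) are illustrative assumptions.

```python
from itertools import product

def modularity(adj, community):
    """Modularity Q of Equation (1) for an undirected, unweighted graph.

    adj       -- symmetric 0/1 adjacency matrix as a list of lists (A_ij)
    community -- community[i] is the label C_i of vertex i
    """
    n = len(adj)
    degree = [sum(row) for row in adj]   # k_i
    two_m = sum(degree)                  # 2m: every edge is counted twice
    q = 0.0
    for i, j in product(range(n), repeat=2):
        if community[i] == community[j]:          # Kronecker delta equals 1
            q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_two = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_one = modularity(adj, [0, 0, 0, 0, 0, 0])   # everything in one community
print(q_two, q_one)
```

Assigning each triangle to its own community gives *Q* = 5/14 ≈ 0.357, above the 0.3 threshold mentioned above, whereas placing all nodes in a single community gives *Q* = 0.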

If $e_{ii}$ denotes the fraction of pipes whose start and end nodes both belong to community *i*, and $a_i$ denotes the fraction of pipe ends attached to nodes in community *i*, then the modularity can be formulated as:

$$Q = \sum_{i} \left( e_{ii} - a_i^2 \right). \tag{2}$$

The change in modularity that results from merging two communities *i* and *j* can be computed as [63]:

$$
\Delta Q = 2(e_{ij} - a_i a_j). \tag{3}
$$
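As a worked check of Equations (2) and (3), consider a hypothetical toy network of two triangles joined by a single pipe (*m* = 7), with each triangle taken as one community. The numbers below assume the usual convention that $e_{ij}$ is half the fraction of edges joining communities *i* and *j* (so that $e_{ij} = e_{ji}$) and that $a_i = \sum_j e_{ij}$ is the fraction of edge ends attached to community *i*:

$$
\begin{aligned}
e_{11} &= e_{22} = \tfrac{3}{7}, \qquad e_{12} = e_{21} = \tfrac{1}{14}, \qquad a_1 = a_2 = \tfrac{3}{7} + \tfrac{1}{14} = \tfrac{1}{2}, \\
Q &= 2\left(\tfrac{3}{7} - \tfrac{1}{4}\right) = \tfrac{5}{14} \approx 0.357, \\
\Delta Q &= 2\left(\tfrac{1}{14} - \tfrac{1}{4}\right) = -\tfrac{5}{14}.
\end{aligned}
$$

Merging the two triangles would therefore lower the modularity from 5/14 to 0, so a greedy procedure would keep them as separate communities.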

The community structure algorithm is implemented following the steps listed in Figure 4.

**Figure 4.** Main steps for community structure algorithm clustering.
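The steps of Figure 4 are not reproduced in the text, so the following is only a minimal sketch of a generic greedy agglomeration consistent with Equations (2) and (3): every node starts as its own community, the pair of connected communities with the largest Δ*Q* is merged, and merging stops when no merge increases *Q*. The function name `greedy_modularity` and the toy network are illustrative assumptions, the graph is assumed to have no self-loops, and the efficient data structures of the published fast greedy algorithm [64] are deliberately omitted.

```python
def greedy_modularity(edges, n):
    """Naive greedy agglomeration driven by Equation (3).

    edges -- list of (u, v) pairs of an undirected graph (pipes), no self-loops
    n     -- number of vertices (nodes), labelled 0..n-1
    Returns (communities, Q).
    """
    m = len(edges)
    members = {i: {i} for i in range(n)}   # community id -> set of vertices
    a = {i: 0.0 for i in range(n)}         # a_i: fraction of edge ends in i
    e = {}                                 # e[(i, j)], i < j: half the fraction
                                           # of edges joining i and j
    for u, v in edges:
        a[u] += 1 / (2 * m)
        a[v] += 1 / (2 * m)
        key = (min(u, v), max(u, v))
        e[key] = e.get(key, 0.0) + 1 / (2 * m)
    q = -sum(x * x for x in a.values())    # singletons have e_ii = 0

    while e:
        # Unconnected pairs have e_ij = 0 and cannot raise Q, so scan e only.
        gains = {pair: 2 * (w - a[pair[0]] * a[pair[1]]) for pair, w in e.items()}
        (i, j), dq = max(gains.items(), key=lambda item: item[1])
        if dq <= 0:
            break                          # no merge increases modularity
        # Merge community j into community i.
        members[i] |= members.pop(j)
        a[i] += a.pop(j)
        merged = {}
        for (p, r), w in e.items():
            p, r = (i if p == j else p), (i if r == j else r)
            if p != r:                     # drop edges that became internal
                key = (min(p, r), max(p, r))
                merged[key] = merged.get(key, 0.0) + w
        e = merged
        q += dq
    return list(members.values()), q


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities, q = greedy_modularity(edges, 6)
print(communities, q)
```

On this toy network the sketch recovers the two triangles as communities with *Q* = 5/14. Recording the order of merges would give the dendrogram that the studies below cut to satisfy DMA size constraints.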

Diao et al. [19] were the first to apply a community structure algorithm to detect clusters in a WDN, using the detected communities to create DMA boundaries automatically. The WDN was mapped onto an undirected graph, and community detection was applied to maximize modularity and find the hierarchical community structure representing the DMAs of the WDN. The authors enforced a size constraint of 300–5000 properties [10] per community by applying a heuristic known as oriented dendrogram cutting.

Instead of identifying network communities by maximizing the modularity index, Campbell et al. [34] proposed a procedure based on the idea that feedlines (i.e., the trunk network) should not be included in sectorization schemes. The trunk network was identified from the "betweenness" of edges together with flow and diameter analysis. Edge betweenness is a graph-theoretic measure that identifies the edges (i.e., pipes) lying on paths between many pairs of vertices (i.e., nodes) [63]. A random-walk betweenness [19] can detect the community segmentation with the highest modularity, and a dendrogram can be used to set the size constraint for each community.
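To make the notion of edge betweenness concrete, the sketch below computes the shortest-path variant with a breadth-first, Brandes-style accumulation. It is an illustration under stated assumptions, not the procedure of Campbell et al. [34]: it does not implement the random-walk variant, it ignores flow and diameter, and the function name and toy network are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness of an undirected, unweighted graph.

    adj -- dict mapping each node to a list of its neighbours
    Returns {(u, v): score} with u < v; each unordered node pair counts once.
    """
    score = {}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                key = (min(v, w), max(v, w))
                score[key] = score.get(key, 0.0) + share
                delta[v] += share
    # Every pair (s, t) was visited from both ends, so halve the totals.
    return {edge: value / 2 for edge, value in score.items()}


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
betweenness = edge_betweenness(adj)
trunk = max(betweenness, key=betweenness.get)
print(trunk, betweenness[trunk])
```

In this toy network the connecting pipe (2, 3) carries all nine shortest paths between the two triangles and therefore has the highest betweenness, which is the kind of signal used to single out trunk mains.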

Similarly, Ciaponi et al. [40] offered a different approach that incorporated the practical DMA design criteria proposed by Morrison et al. [4]. DMAs were identified automatically by first locating the prevalent transport service (the main transmission pipes) in the WDN; each DMA was then formed from the remaining distribution pipes and connected directly to the main transmission pipes. Subsystems exceeding the DMA size threshold were decomposed further by means of a modularity-based optimization algorithm. These two approaches brought DMA boundary identification closer to engineering practice and provided feasible alternative solutions to support decision-making.

## *3.3. Modularity-Based Algorithm*

The community structure algorithm uses the modularity index as a metric for the optimal design of DMAs. However, the modularity index may not be representative for a WDN because a WDN is strongly affected by hydraulic properties (e.g., elevation, node demand, pipe diameter). Adopting the classic formulation of the modularity index without considering physical and hydraulic constraints would therefore be artificial and misleading. Motivated by this limitation, Giustolisi and Ridolfi [41] proposed a modularity-based method for WDN segmentation that accounts for hydraulic network properties to define a WDN-oriented modularity. To formulate the modularity index for WDNs, the method places conceptual segmentation cuts close to the ending nodes of pipes, using the topological incidence matrix and the number of pipes separating communities, so as to minimize the number of required pipe cuts. Despite being tailored to WDNs, WDN-oriented modularity inherits a limitation of classic community detection, known as the resolution limit: as shown by Fortunato and Barthelemy [65], modularity maximization may fail to detect small communities whose total number of edges is smaller than $\sqrt{2m}$, where *m* is the total number of edges in the network. To overcome such failures, Giustolisi and Ridolfi [66] proposed an infrastructure modularity index that mitigates this inconsistency of modularity optimization. The new index is obtained by maximizing the classic modularity index within a two-objective optimization framework, which is modified to overcome the resolution limit.

Laucelli et al. [67] took a further step by developing a flexible procedure for DMA planning that builds on the conceptual cuts for segmentation introduced by Giustolisi and Ridolfi. A two-step strategy was adopted for optimal sectorization design. In the first step, the WDN-oriented modularity index is maximized while the number of conceptual cuts is minimized, so that the location of pipe cuts minimizes the number of devices to be installed. In the second step, to determine the location of flow meters and gate valves, the DMA design is optimized for each set of conceptual cuts, returning an optimal solution for each one that accounts for the change in hydraulic behavior of the network and maximizes the reduction of background leakage in each DMA. Using the WDN-oriented modularity index, Simone et al. [68] developed a sampling-oriented modularity index to determine the optimal number and spatial distribution of pressure meters in a network (i.e., sampling design), using multi-objective optimization to trade off pressure-meter cost against sampling-oriented modularity.
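The resolution limit discussed above can be illustrated with the classic ring-of-cliques construction. The sketch below evaluates Equation (2) in closed form for a hypothetical ring of 30 five-node cliques, each joined to its neighbors by one edge; it is not a WDN and is not taken from the cited studies, and the function name is an assumption.

```python
from math import sqrt

def ring_of_cliques_q(n_cliques, clique_size, group):
    """Modularity (Equation (2)) of a ring of identical cliques.

    Adjacent cliques are joined by one edge. `group` consecutive cliques
    form one community; requires group < n_cliques and n_cliques % group == 0.
    Returns (Q, edges inside one clique, total number of edges m).
    """
    inner = clique_size * (clique_size - 1) // 2   # edges inside one clique
    m = n_cliques * (inner + 1)                    # plus one ring edge per clique
    # Edges fully inside a community: its cliques plus the links between them.
    e_ii = (group * inner + (group - 1)) / m
    # Edge ends attached to a community: each clique has degree sum 2*inner + 2.
    a_i = group * (2 * inner + 2) / (2 * m)
    return (n_cliques // group) * (e_ii - a_i ** 2), inner, m


q_single, inner, m = ring_of_cliques_q(30, 5, group=1)
q_pairs, _, _ = ring_of_cliques_q(30, 5, group=2)
print(f"edges per clique = {inner}, sqrt(2m) = {sqrt(2 * m):.1f}")
print(f"Q, one clique per community = {q_single:.4f}")
print(f"Q, cliques merged in pairs  = {q_pairs:.4f}")
```

Each clique has 10 internal edges, well below $\sqrt{2m} \approx 25.7$, and the partition that merges neighboring cliques in pairs scores a higher *Q* than the intuitively correct one-clique-per-community partition. This is the behavior that can hide small DMAs when modularity alone is maximized.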

As mentioned in Section 2, DMAs are designed to detect and actively manage leaks, and pressure management is a fundamental factor in leak management. Zhang et al. [42] developed a hybrid procedure that combines node pressure with modularity-based community detection to segment a network into DMAs that are similar in terms of pressure. To overcome the resolution limit of classical modularity, they used random-walk theory, similar to Campbell et al. [34]. Random-walk theory allows communities of widely differing sizes to be identified precisely and a multiscale community structure to be created automatically [42]. To illustrate the advantages of the method, its results were compared with those of Diao et al. [19]. The comparison showed that different partition schemes arise at different random-walk time scales because the variance of node pressure is integrated into the community detection, whereas in Diao et al. [19] the variance was fixed by a top-down search. In terms of boundary pipes, Zhang et al. [42] showed that the traditional modularity-based community detection of Diao et al. produced more boundary pipes.

Most recently, Perelman et al. [37] compared the performance of methods from three branches of graph theory, namely global clustering, community structure, and graph partitioning, by applying them to two WDNs in Singapore. Global clustering is a bottom-up algorithm that groups points according to a measure of similarity defined for each pair of points. Community structure algorithms detect communities in the network based on the concept of edge betweenness. Graph partitioning divides the graph into a predefined number of groups such that the number of edges crossing between groups is minimal [69]. The authors showed that all three methods were applicable to large networks, but their performance differed markedly and depended on the number of clusters and the parameters selected for evaluation. They proposed multi-criteria metrics based on visual and quantitative performance measures: a better solution minimizes four metrics, namely (a) worst cut size, (b) total cut size, (c) cluster size, and (d) running time, and maximizes a fifth, (e) recurrence of inter-cluster edges [37]. The results showed that graph partitioning generally outperformed global clustering and community structure in terms of (a), (b), and (d), which implies that fewer flow meters are needed to monitor the flow. In contrast, global clustering performed well in terms of (e), while the three methods gave similar results for (c). Community structure and graph partitioning were therefore more flexible and outperformed global clustering under particular budget constraints.
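Cut-based metrics of this kind are straightforward to compute for any candidate partition. The sketch below is illustrative only: the text does not give the exact definitions used by Perelman et al. [37], so it assumes that total cut size is the number of pipes whose end nodes lie in different clusters and that worst cut size is the largest number of boundary pipes touching any single cluster. The function name and toy data are hypothetical.

```python
from collections import Counter

def partition_metrics(edges, cluster):
    """Cut and size statistics for a partition (assumed definitions).

    edges   -- list of (u, v) pipes
    cluster -- dict mapping node -> cluster label
    """
    boundary = Counter()          # boundary pipes touching each cluster
    total_cut = 0
    for u, v in edges:
        if cluster[u] != cluster[v]:
            total_cut += 1
            boundary[cluster[u]] += 1
            boundary[cluster[v]] += 1
    sizes = Counter(cluster.values())
    return {
        "total_cut": total_cut,
        "worst_cut": max(boundary.values(), default=0),
        "largest_cluster": max(sizes.values()),
        "smallest_cluster": min(sizes.values()),
    }


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
by_triangle = partition_metrics(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
by_pairs = partition_metrics(edges, {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"})
print(by_triangle)
print(by_pairs)
```

Since each boundary pipe needs either a flow meter or a closed valve, a smaller total cut translates directly into fewer devices, which is why metrics (a) and (b) matter under budget constraints.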

Similarly, Di Nardo et al. [70] conducted a comprehensive analysis of two popular clustering algorithms: graph partitioning based on multilevel recursive bisection (MLRB) and spectral clustering based on the normalized cut algorithm. Application to a real-life WDN in southern Italy revealed that graph partitioning outperformed spectral clustering in balancing the number of nodes in each DMA. In contrast, spectral clustering was better at minimizing the number of edge cuts and was thus more efficient in both hydraulic and economic terms. A similar study by Liu et al. [71] explored the performance of three partitioning methods, namely fast greedy [64], random walk [63], and MLRB [72], using a spectrum of topology-based indicators.
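For reference, the objective behind normalized-cut spectral clustering can be evaluated for a given bisection as the cut size divided by the total degree of each side, summed over both sides. The sketch below only scores a bisection under that standard definition; it does not perform the spectral relaxation itself and is not the implementation of Di Nardo et al. [70]. The function name and toy network are assumptions.

```python
def normalized_cut(edges, part_a):
    """Normalized cut of the bisection (part_a, rest) of an unweighted graph.

    Both sides must contain at least one node with an incident edge.
    """
    part_a = set(part_a)
    cut = vol_a = vol_b = 0
    for u, v in edges:
        for node in (u, v):            # each edge adds one to both end degrees
            if node in part_a:
                vol_a += 1
            else:
                vol_b += 1
        if (u in part_a) != (v in part_a):
            cut += 1
    return cut / vol_a + cut / vol_b


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
balanced = normalized_cut(edges, {0, 1, 2})   # split along the connecting pipe
lopsided = normalized_cut(edges, {0})         # isolate a single node
print(balanced, lopsided)
```

The balanced split scores 2/7 while isolating one node scores 7/6, which shows how the normalization penalizes small, poorly connected parts in addition to counting cut edges.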

As mentioned earlier, WDNs exhibit dynamic hydraulic behavior that changes in space and time, which distinguishes them from other types of networks. Most partitioning algorithms lack an exhaustive analysis of how similar the resulting DMAs are in hydraulic and physical terms, such as the number of nodes and the balance of water demand and pressure, and are therefore not sufficient to offer a utility a universal O&M solution. DMA segmentation becomes more reliable when physical properties and hydraulic behavior are considered in network partitioning. Recognizing the limitations of Diao et al. [19] and Ciaponi et al. [40], Creaco et al. [73] incorporated engineering aspects (i.e., demand supplied along the pipe and pipe length) into WNP processes. However, unlike Giustolisi and Ridolfi [41,66], they focused on heuristic procedures that improve the original fast greedy partitioning algorithm developed by Clauset et al. [64] to maximize the modularity index. Two heuristic optimization techniques were developed and applied to the modularity formulation to explore different merging combinations. In the first technique, randomness was added to the DMA merging process, which yields numerous probabilistic WDN-partitioning solutions while producing a higher modularity increment during the merging steps and fewer boundary pipes than the traditional deterministic approach. The second technique illustrated the trade-offs between various engineering aspects by embedding the first technique inside a multi-objective genetic algorithm optimization [74].
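One common way to bring pipe attributes into a modularity calculation is to replace the 0/1 adjacency entries of Equation (1) with edge weights, giving the standard weighted modularity. The sketch below illustrates that generic idea only; the text does not specify the formulation used by Creaco et al. [73], so this should not be read as their method. The function name, the choice of weight, and the toy values are assumptions.

```python
def weighted_modularity(weighted_edges, community):
    """Weighted modularity: Equation (1) with A_ij replaced by a weight w_ij.

    weighted_edges -- list of (u, v, w) with w > 0 (e.g. a pipe attribute)
    community      -- dict mapping node -> community label
    """
    total = sum(w for _, _, w in weighted_edges)   # W, weighted analogue of m
    strength = {}                                  # s_i, weighted degree
    internal = {}                                  # weight inside each community
    for u, v, w in weighted_edges:
        strength[u] = strength.get(u, 0.0) + w
        strength[v] = strength.get(v, 0.0) + w
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0.0) + w
    volume = {}                                    # sum of s_i per community
    for node, s in strength.items():
        c = community[node]
        volume[c] = volume.get(c, 0.0) + s
    return sum(
        internal.get(c, 0.0) / total - (vol / (2 * total)) ** 2
        for c, vol in volume.items()
    )


# Hypothetical toy network: triangles 0-1-2 and 3-4-5 joined by pipe 2-3.
pairs = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q_unit = weighted_modularity([(u, v, 1.0) for u, v in pairs], labels)
# Same partition, but the connecting pipe is given a weight of 10.
q_heavy = weighted_modularity(
    [(u, v, 10.0 if (u, v) == (2, 3) else 1.0) for u, v in pairs], labels
)
print(q_unit, q_heavy)
```

With unit weights the result matches the unweighted value of 5/14; giving the connecting pipe a large weight drops *Q* to −0.125 for the same partition, showing how the choice of weight changes which boundaries the index favors.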

The evaluation of DMA scenarios after sectorization must also guarantee that hydraulic indicators remain at or above an acceptable threshold relative to the original network. Because different criteria lead to different DMA layouts, Brentan et al. [43,44] proposed a method that considers the relationship between several technical criteria, such as demand and pipe length, to create different DMA scenarios, using a social community detection algorithm to define the DMAs. To assess the generated DMAs, they proposed a comprehensive analysis based on performance indicators such as resilience index, demand similarity, pressure uniformity, water age, cost, and energy consumption, with the aim of providing decision-makers with an optimal DMA configuration.
