*3.5. Spectral Graph Algorithms*

Spectral graph theory studies the properties of a graph through the eigenvalues and eigenvectors of matrices associated with it, combining linear algebra and graph theory. Spectral clustering uses the spectrum (eigenvalues and eigenvectors) of the adjacency matrix to cluster groups of points into communities [49,54]. In spectral clustering, the rows of the matrix of Laplacian eigenvectors that correspond to two nodes are similar if the nodes belong to the same cluster. Spectral graph clustering has been applied in many fields over the last decade, especially computer science, bioinformatics, and data analysis. Recently, in the field of WDN management, spectral graph theory has been used to define an optimal cluster configuration and to carry out a preliminary analysis of network vulnerability and robustness through the eigenvalues of graph matrices [76,77], and a toolset for WDN management has been proposed [49]. Several studies have applied spectral graph theory to WDN management; this subsection focuses on the effective approaches that have been proposed for water network clustering. Spectral clustering is built on the Laplacian matrix, which is defined in three forms. The first is the non-normalized Laplacian, which is used to solve a relaxed version of the RatioCut problem, as presented by Von Luxburg [78]:

$$L = D - A, \tag{4}$$

where **D** = *diag*(**d**) is the diagonal matrix of nodal degrees *ki*, with **d** = [*k*1, *k*2, ... , *kn*]ᵀ, and **A** is the adjacency matrix.

The other two matrices are normalized graph Laplacians, which are closely related and can be defined as

$$L\_{sym} = D^{-1/2} L D^{-1/2} \text{ and} \tag{5}$$

$$L\_{rw} = D^{-1}L,\tag{6}$$

where *Lsym* is a symmetric matrix proposed to solve the NCut problem [78] and *Lrw* is closely related to a random walk, which can be used to solve the same problem.
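As a minimal sketch of Equations (4)–(6), the three Laplacians can be computed with NumPy for a small graph. The function name `graph_laplacians` and the 4-node path graph are illustrative assumptions, not part of the cited methods, and the sketch assumes that every node has a nonzero degree so that **D** is invertible.

```python
import numpy as np

def graph_laplacians(A):
    """Return (L, L_sym, L_rw) for a symmetric adjacency matrix A.

    Assumes every node has nonzero degree, so that D is invertible.
    """
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)                      # nodal degrees k_i
    if np.any(k == 0):
        raise ValueError("isolated node: D is not invertible")
    L = np.diag(k) - A                     # Equation (4): L = D - A
    d_inv_sqrt = 1.0 / np.sqrt(k)
    # Equation (5): entry (i, j) is L_ij / sqrt(k_i * k_j)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    # Equation (6): row i of L divided by k_i
    L_rw = L / k[:, None]
    return L, L_sym, L_rw

# Hypothetical example: path graph 0-1-2-3 with unit edge weights
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L, L_sym, L_rw = graph_laplacians(A)
```

In this example the rows of `L` sum to zero, `L_sym` is symmetric, and `L_sym` and `L_rw` have the same eigenvalues because the two matrices are similar.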

The Laplacian matrix of an undirected graph has the following properties [78,79]


The aim of spectral graph partitioning is to divide the graph *G* into *p* ≤ *n* subgraphs *G*1, *G*2, ... , *Gp* whose vertex sets are pairwise disjoint and together cover *V*:

$$V = V\_1 \cup V\_2 \cup \dots \cup V\_p \text{ where } V\_i \cap V\_j = \varnothing, \text{ } i \neq j. \tag{7}$$

Let *Gk* = (*Vk*, *Ek*) denote subgraph *k*, where *k* = 1, ... , *p* and *Vk* is the set of vertices of *Gk*. It follows from Equation (7) that an edge with endpoints in different vertex subsets is not contained in any of the subgraphs *Gk*; such an edge is called an intercluster edge. The set of intercluster edges with one endpoint in *Vk* is denoted ∂(*Vk*) and defined as

$$\partial(V\_k) := \{ij : i \in V\_k \text{ and } j \notin V\_k\}.\tag{8}$$

Two different sets of edges can thus be distinguished as follows.


From the point of view of optimal graph bipartitioning, the objective is to minimize the *cut* value. Von Luxburg [78] and Shi and Malik [80] describe two objective functions for this purpose, the ratio cut method and the normalized cut method, given in Equations (9) and (10), respectively.

$$\min\_{V\_1, V\_2, \dots, V\_p} \sum\_{k=1}^p \frac{\text{vol}(\partial(V\_k))}{|V\_k|},\tag{9}$$

$$\min\_{V\_1, V\_2, \dots, V\_p} \sum\_{k=1}^p \frac{\text{vol}(\partial(V\_k))}{\text{vol}(V\_k)},\tag{10}$$

where *vol*(∂(*Vk*)) is the sum of the weights of all intercluster edges in ∂(*Vk*); |*Vk*| is the number of vertices in *Vk*; and *vol*(*Vk*) is the sum of the weights of the vertices in *Vk*.
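The quantities in Equations (8)–(10) can be evaluated directly for a given partition. The following is a minimal sketch in plain Python. The helper names (`boundary`, `weighted_degrees`, `cut_objectives`) and the six-node example are hypothetical, and the sketch assumes that the weight of a vertex is its weighted degree, as in the usual normalized cut formulation.

```python
def boundary(edges, Vk):
    """Intercluster edges of Vk, Equation (8): exactly one endpoint lies in Vk."""
    return {e: w for e, w in edges.items() if (e[0] in Vk) != (e[1] in Vk)}

def weighted_degrees(edges):
    """Weighted degree of every vertex (assumed here to be the vertex weight)."""
    deg = {}
    for (i, j), w in edges.items():
        deg[i] = deg.get(i, 0.0) + w
        deg[j] = deg.get(j, 0.0) + w
    return deg

def cut_objectives(edges, clusters):
    """Objective values of Equations (9) and (10) for a given partition."""
    deg = weighted_degrees(edges)
    ratio_cut = 0.0
    normalized_cut = 0.0
    for Vk in clusters:
        vol_boundary = sum(boundary(edges, Vk).values())  # vol(boundary of Vk)
        vol_Vk = sum(deg.get(v, 0.0) for v in Vk)         # vol(Vk)
        ratio_cut += vol_boundary / len(Vk)
        normalized_cut += vol_boundary / vol_Vk
    return ratio_cut, normalized_cut

# Hypothetical example: two triangles joined by the single edge (2, 3)
edges = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
         (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
         (2, 3): 1.0}
clusters = [{0, 1, 2}, {3, 4, 5}]
ratio_cut, normalized_cut = cut_objectives(edges, clusters)
```

For this partition each cluster has one intercluster edge of weight 1, three vertices, and a total weighted degree of 7, so the ratio cut value is 1/3 + 1/3 = 2/3 and the normalized cut value is 1/7 + 1/7 = 2/7.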

The minimization problems in Equations (9) and (10) are NP-hard; however, as shown by Von Luxburg [78] and Shi and Malik [80], they can be relaxed to obtain approximate solutions, which leads to the eigenvalue problems in Equations (11) and (12):

$$\mathbf{LU} = \mathbf{U}\boldsymbol{\Phi}\tag{11}$$
