1. Introduction
Informally, a cluster in a graph is a subgraph whose vertices are tightly linked to each other and loosely linked to the vertices outside the subgraph. Such an informal concept is useful in the description of several phenomena: walking, searching, and decomposition of graphs [1]. The concept of a cluster is closely related to that of community [2]. Depending on the intended application, the meaning of the concept can be specified either to better reflect the aspects to be modeled or to ease its calculation. In our case, we are interested in the subdivision of the set of vertices into two parts of similar size, in such a way that the number of edges between the two parts is kept to a minimum. For this problem, there are many references in the literature on spectral tools, based on a certain eigenvector of the Laplacian operator (the Fiedler vector) [3]. These references also show how to consider weights on the edges of the graph so that the minimization takes these weights into account. In this work, we extend this tool to the case where there are also weights on the vertices, independent of those on the edges, so that the partition is made into parts of similar total weight, not necessarily a similar number of vertices, minimizing the total edge weight of the cut. Other proposals in the literature deal with this problem using generalized eigenvectors (see [4] and the references therein). In this article, we remain in the context of the usual eigenvectors. Our interest in avoiding generalized eigenvalues is that one of the possible applications is spectral clustering, along the lines of [5]. This method uses several eigenvectors to build a template for each vertex and forms the clusters with these templates. The eigenvectors are orthogonal to each other, but the generalized eigenvectors are not, so the templates formed with them are less effective. In other words, we would need larger templates with generalized eigenvectors than with usual eigenvectors. An example of our work related to spectral clustering is [6].
A similar problem is studied in [7], where a Laplacian that incorporates weights both at the vertices and at the edges is defined. There, the vertex weights are integrated multiplicatively, directly as matrix factors. The novelty of our approach is that our weights (called $\rho$) derive from a potential $p$ that is integrated additively on the diagonal. The relationship between the potential $p$ that must be introduced and the desired weights $\rho$ is specified in Theorem 3. This is significant because it allows us to overcome a technical difficulty of the cited work, obtaining error bounds (Theorem 4) as a function of the potential, not of the weights. A case of application of this method of weights from a potential is our contribution (Chapter 2.4) to the collective work [8]. In that work, we used the Laplacian matrix with potentials to perform spectral graph partitioning for process placement on heterogeneous computing platforms.
In the next section, we review standard notation and concepts about graphs and their matrix representation. In Section 3, we describe known facts about spectral partitioning using the Laplacian matrix. Section 4 contains our contribution: we define Laplacian matrices with potential (p-Laplacian matrices) and show that a certain matrix of this type can be used to partition the set of vertices into parts of similar weight, minimizing the edges cut (Theorem 3). It also contains a Cheeger-style bound on the difference between the true value of the minimum partition and the approximated value obtained using the Fiedler eigenvector of the p-Laplacian matrix (Theorem 4). The value considered in this result is the ratio cut of the partition, instead of the total cut. Our purpose is to show that it is possible to give bounds analogous to those that appear in the literature, but adapted to this approach of weights at the vertices independent of the weights at the edges. We emphasize that this bound is an improvement, analogous to that of Mohar, for the spectral analysis of vertex-and-edge-weighted graphs.
2. Graphs and Transfer Matrices
A graph consists of a set V of vertices and a set E of subsets of two vertices (the edges). That is, $E \subseteq \{\{u, v\} : u, v \in V,\ u \neq v\}$. For $u, v \in V$, the edge $\{u, v\}$, also noted $uv$, is said to go between $u$ and $v$. In this study, we use this definition of a graph, which does not model the direction of the edges or loops.
A weight on edges is a map $a : E \to \mathbb{R}^{+}$. The weight of the edge $uv$ is noted $a_{uv}$. If a weight on the edges is not specified, implicitly, the constant unit weight must be considered (that is, $a_{uv} = 1$ for each $uv \in E$).
A weight on vertices is a map $s : V \to \mathbb{R}$. The set of these maps, that is, the set of all vertex weights, is denoted $\mathbb{R}^{V}$. Clearly, it is a vector space.
To represent graphs using matrices, we choose an ordering of the set of vertices, $V = \{v_1, \dots, v_n\}$. The adjacency matrix of G (for this conventional ordering) is the $n \times n$ matrix $A = (a_{ij})$ of values:
$$a_{ij} = \begin{cases} a_{v_i v_j} & \text{if } v_i v_j \in E, \\ 0 & \text{otherwise.} \end{cases}$$
We represent a vertex weight $s$ as the row vector of its values, $s = (s_1, \dots, s_n)$ with $s_i = s(v_i)$. The adjacency matrix operates on $\mathbb{R}^{V}$, the set of vertex weights, as a right matrix product (postmultiplication):
$$s \mapsto sA.$$
This is the transference (or shift) of the vertex weight $s$ by the graph G.
The postmultiplication of $s$, as a row vector, by $A$ is usual in matrix analysis of finite Markov chains [9]. We can see the shift as the following action: the shift $sA$ represents that each edge $v_i v_j$ takes the content of $v_i$, that is, $s_i$, and transports it to the vertex $v_j$ (modifying its amount by the factor of transference $a_{ij}$). The sum of the values transferred to $v_j$ is $(sA)_j$, with a gain $s_i a_{ij}$ from each adjacent vertex $v_i$. So, the vertex weight $sA$ is:
$$(sA)_j = \sum_{i\,:\,v_i v_j \in E} s_i\, a_{ij}.$$
In the literature, interest is focused on symmetric matrices (because they model undirected graphs), so there is no difference between pre- and postmultiplication by $A$. Besides, note that the main diagonal is zero because the graphs have no loops. In Section 4, we introduce potentials, which can be viewed as a method of using the diagonal entries to carry vertex weights independently of the edge weights.
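As a small numerical illustration of the shift, consider the following sketch (the graph and the placement of the weight are hypothetical examples; numpy is assumed):

```python
import numpy as np

# Hypothetical example: the 4-cycle v1-v2-v3-v4 with unit edge weights.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0   # symmetric: undirected graph, zero diagonal (no loops)

s = np.array([1.0, 0.0, 0.0, 0.0])   # all the "substance" placed at v1
shifted = s @ A                       # the shift sA (postmultiplication by A)
print(shifted)                        # prints [0. 1. 0. 1.]: v1 feeds its neighbours v2 and v4
```

Each entry $(sA)_j$ collects the contributions $s_i a_{ij}$ from the neighbours of $v_j$, as in the displayed formula above.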
3. Laplacian and Partitions
This section summarizes standard notions about graphs, partitions, and the Laplacian operator, which can be seen in [3] or [10], although the notation is adapted to our particular purposes, and Lemma 3 is proved in a novel way, without using summations.
A partition of a set V is a pair $(S, \bar{S})$ of two subsets of V such that $S \cup \bar{S} = V$ and $S \cap \bar{S} = \emptyset$. A partition of a graph is a partition of the underlying set V of vertices. An edge $uv$ in G is cut by a partition if $u \in S$ and $v \in \bar{S}$ or vice versa. If the graph is weighted, the total cut is
$$\mathrm{cut}(S, \bar{S}) = \sum_{u \in S,\ v \in \bar{S},\ uv \in E} a_{uv}.$$
Any subset $U \subseteq V$ defines a cut, $\mathrm{cut}(U) = \mathrm{cut}(U, \bar{U})$, which is called the cut of U.
It is usually preferable, among the several partitions of a graph, the one with a minimum number of cut edges (or minimum total cut, if weighted). The cut of the graph is:
$$\mathrm{cut}(G) = \min_{\emptyset \neq S \subsetneq V} \mathrm{cut}(S, \bar{S}).$$
In this study, we are interested in partitions with minimal cuts but with a balanced number of vertices, that is, $|S| = |\bar{S}|$ for an even number of vertices, or $\bigl||S| - |\bar{S}|\bigr| \le 1$ for any number of vertices (a bipartition). The bipartition width [11] is:
$$\mathrm{bw}(G) = \min_{(S, \bar{S})\ \text{bipartition}} \mathrm{cut}(S, \bar{S}).$$
We also use the cut ratio of U (the quotient $\mathrm{cut}(U)/|U|$) and the isoperimetric number:
$$i(G) = \min_{0 < |U| \le |V|/2} \frac{\mathrm{cut}(U)}{|U|}.$$
We now express, using linear algebra, the combinatorial problem of finding the partitions that realize these minimums. Let us suppose we are given an ordering of the set of vertices, $V = \{v_1, \dots, v_n\}$. A vector $x \in \mathbb{R}^n$ has an entry $x_i$ for each $v_i \in V$. The characteristic vector of a set $S \subseteq V$ is $\chi_S$ with:
$$(\chi_S)_i = \begin{cases} 1 & \text{if } v_i \in S, \\ 0 & \text{otherwise.} \end{cases}$$
Sometimes it is preferable to use values other than 0 or 1 in the vector expression of a combinatorial object like a subset or partition [4]. For two real values $\alpha, \beta$, the $(\alpha, \beta)$-indicator vector of a partition $(S, \bar{S})$ is the vector $x$ with
$$x_i = \begin{cases} \alpha & \text{if } v_i \in S, \\ \beta & \text{if } v_i \in \bar{S}. \end{cases}$$
For example, the (0,1)-indicator is the characteristic vector of the second set of the partition. We mainly use (1,−1)-indicators. We denote $\langle x, y \rangle = x y^T$, which is the standard scalar product in $\mathbb{R}^n$. In this way, a matrix $A$ has associated the bilinear form $x A y^T$. The vector $\mathbf{1}$ has the value 1 in each component. The degree vector is $g = \mathbf{1} A$, where $g_j = \sum_i a_{ij}$.
Lemma 1. Being $\chi_S, \chi_T$ characteristic vectors of $S, T \subseteq V$:
- (i) $\langle \chi_S, \chi_T \rangle = |S \cap T|$. Also $\langle \chi_S, \chi_S \rangle = |S|$.
- (ii) $\chi_S + \chi_{\bar{S}} = \mathbf{1}$.
- (iii) If $A$ is the adjacency matrix of a graph, the vector $\mathbf{1} A$ has, in the i-th entry, the degree $g_i$. That is, $g = \mathbf{1} A$. Also $A \mathbf{1}^T = g^T$.
- (iv) $\chi_S A$ has, in the i-th entry, the number of edges (their total weight, if weighted) to $v_i$ from vertices in S.

Proof. They are straightforward. □
Lemma 2. Being $\chi_S$ and $\chi_{\bar{S}}$ the characteristic vectors of the sets of a partition $(S, \bar{S})$:
$$\chi_S A \chi_{\bar{S}}^T = \mathrm{cut}(S, \bar{S}).$$
Proof. By Lemma 1 (iv), $\chi_S A$ contains in the i-th entry the total weight of the edges between $S$ and $v_i$. Hence, $\chi_S A \chi_{\bar{S}}^T$ is the sum of the weights of the edges in the set $\{uv \in E : u \in S,\ v \in \bar{S}\}$, that is, $\mathrm{cut}(S, \bar{S})$. □
Calling $D$ the matrix with the degree vector $g$ in the diagonal and zero off-diagonal, we have $\mathbf{1} D = g$. Being $x$ the (1,−1)-indicator of any partition, we also have $x D x^T = \mathbf{1} D \mathbf{1}^T$, because the minus signs appear in pairs. Defining the Laplacian as $L = D - A$, we have:
Lemma 3. If $x$ is the (1,−1)-indicator of $(S, \bar{S})$,
$$x L x^T = 4\,\mathrm{cut}(S, \bar{S}).$$
Proof. If $x$ is the (1,−1)-indicator of $(S, \bar{S})$, $x = \chi_S - \chi_{\bar{S}}$ with $\chi_S$ and $\chi_{\bar{S}}$ characteristic of $S$ and $\bar{S}$, respectively, and:
$$x A x^T = \chi_S A \chi_S^T - \chi_S A \chi_{\bar{S}}^T - \chi_{\bar{S}} A \chi_S^T + \chi_{\bar{S}} A \chi_{\bar{S}}^T.$$
Besides, as $\chi_S + \chi_{\bar{S}} = \mathbf{1}$, that is, $\chi_{\bar{S}} = \mathbf{1} - \chi_S$, then $\chi_S A \chi_S^T = \chi_S A \mathbf{1}^T - \chi_S A \chi_{\bar{S}}^T$. Likewise, $\chi_{\bar{S}} A \chi_{\bar{S}}^T = \chi_{\bar{S}} A \mathbf{1}^T - \chi_{\bar{S}} A \chi_S^T$, hence
$$x A x^T = \chi_S A \mathbf{1}^T + \chi_{\bar{S}} A \mathbf{1}^T - 2\, \chi_S A \chi_{\bar{S}}^T - 2\, \chi_{\bar{S}} A \chi_S^T = \mathbf{1} A \mathbf{1}^T - 4\,\mathrm{cut}(S, \bar{S}).$$
That is, $x A x^T = \mathbf{1} A \mathbf{1}^T - 4\,\mathrm{cut}(S, \bar{S})$. Solving for the cut, we have $\mathrm{cut}(S, \bar{S}) = \frac{1}{4}\left(\mathbf{1} A \mathbf{1}^T - x A x^T\right)$. As $\mathbf{1} A \mathbf{1}^T = \mathbf{1} D \mathbf{1}^T$, and $\mathbf{1} D \mathbf{1}^T = x D x^T$, we can express $\mathrm{cut}(S, \bar{S}) = \frac{1}{4}\left(x D x^T - x A x^T\right)$. Therefore,
$$\mathrm{cut}(S, \bar{S}) = \frac{1}{4}\, x L x^T. \qquad \square$$
We deduced this well-known identity in matrix form, instead of the usual summation form. Thus, we not only avoid the index chasing but also make explicit the role of the values $(\alpha, \beta)$ used in the indicators. For example, if $x$ is a (0,1)-indicator of $(S, \bar{S})$, then $\mathrm{cut}(S, \bar{S}) = x L x^T$. In general [4], for $(\alpha, \beta)$-indicators, the cut is $\mathrm{cut}(S, \bar{S}) = \frac{1}{(\alpha - \beta)^2}\, x L x^T$. This deduction also clarifies the role of the diagonal degree matrix $D$.
In addition to the expression of the cost as the bilinear form of $L$, we express the requirement that the partition $(S, \bar{S})$ be balanced as $\langle x, \mathbf{1} \rangle = 0$. So, the problem of finding the bipartition of minimal cost is the following problem of combinatorial optimization:

| minimize $x L x^T$ |
| subject to $x \in \{1, -1\}^n$, $\langle x, \mathbf{1} \rangle = 0$. |
This combinatorial problem is NP-complete [12]. To approximate a solution with polynomial computational cost, it is customary to relax the constraints $x_i \in \{1, -1\}$. This relaxed problem is a numerical one that has several features that ease its resolution: $L$ is symmetric, hence its eigenvalues are real, and there is an orthonormal basis of eigenvectors [13]. Besides, $\mathbf{1}$ is an eigenvector of eigenvalue 0, because $\mathbf{1} L = \mathbf{1} D - \mathbf{1} A = g - g = 0$. Additionally, $L$ is weakly diagonally dominant with positive diagonal, hence (by the theorem of the Geršgorin discs [14]) its eigenvalues are nonnegative: $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$. If G is connected, $\lambda_2 > 0$ [3]. These features of $L$ are generally deduced from its expression as a summation of squares, which we have avoided. The result about diagonal dominance that we use instead is also easy to see.
The Rayleigh quotient of a symmetric matrix $M$ is
$$R_M(x) = \frac{x M x^T}{x x^T},$$
defined for $x \neq 0$ in $\mathbb{R}^n$. It plays a role in the following min–max theorem of Courant–Weyl [13], which we use without proof:

Theorem 1. Let $\mathcal{E}_k$ be the set of subspaces of $\mathbb{R}^n$ of dimension less than or equal to $k$, for $1 \le k \le n$, and $M$ symmetric with eigenvalues $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ and corresponding eigenvectors $u_1, \dots, u_n$. Then
$$\lambda_k = \min_{E \in \mathcal{E}_k}\ \max_{x \in E,\ x \neq 0} R_M(x).$$
Besides, the argument $E$ giving the minimum is $\mathrm{span}(u_1, \dots, u_k)$, and an argument $x$ giving the maximum is $u_k$.

In particular, $\lambda_1 = \min_{x \neq 0} R_M(x)$, because each $x \neq 0$ spans an $E \in \mathcal{E}_1$. In the case that $M = L$, the eigenvector $u_1$ is (a scalar multiple of) $\mathbf{1}$, as commented above, and the others are orthogonal to it: $\langle u_i, \mathbf{1} \rangle = 0$ for $i > 1$. In the case $k = 2$,
$$\lambda_2 = \min_{x \neq 0,\ \langle x, \mathbf{1} \rangle = 0} R_M(x),$$
because each such $x$ spans, together with $u_1$, an $E \in \mathcal{E}_2$. The minimum is reached in $u_2$.
To relate this result with the cut value of partitions, note that if $x$ is an indicator vector of a partition ($x_i \in \{1, -1\}$, so $x x^T = n$), the cut is proportional to the Rayleigh quotient: $\mathrm{cut}(S, \bar{S}) = \frac{1}{4} x L x^T = \frac{n}{4} R_L(x)$. The minimum of $R_L$ among the vectors with $\langle x, \mathbf{1} \rangle = 0$ is reached in a vector $\phi$ that is an eigenvector for $\lambda_2$. That is, $\phi$ is a solution to the relaxed problem, although it may not be an indicator vector. The first non-null eigenvalue $\lambda_2$ is termed the Fiedler value, and its eigenvector $\phi$ is the Fiedler vector.

There are several rounding or truncation methods to obtain an indicator vector (i.e., with integer values for the $x_i$) from $\phi$ (whose entries are not necessarily integers). The most direct rounding is the partition by sign: $v_i \in S$ if $\phi_i > 0$, $v_i \in \bar{S}$ otherwise. This rounding can give a partition that is not a bipartition. Another rounding method uses the median, achieving precisely bipartitions: if $m$ is the median value of the entries of $\phi$, $v_i \in S$ if $\phi_i < m$ and $v_i \in \bar{S}$ otherwise.
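A sketch of the sign and median roundings of the Fiedler vector (the graph, two triangles joined by one edge, is a hypothetical example; numpy is assumed):

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single bridge edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

vals, vecs = np.linalg.eigh(L)    # eigenvalues in ascending order
fiedler_value = vals[1]           # first non-null eigenvalue (the graph is connected)
phi = vecs[:, 1]                  # Fiedler vector

sign_S = {i for i in range(n) if phi[i] > 0}      # partition by sign
m = np.median(phi)
median_S = {i for i in range(n) if phi[i] < m}    # partition by median
print(sorted(sign_S), sorted(median_S))
```

On this symmetric example, both roundings separate the two triangles, cutting only the bridge edge.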
For these rounding methods and others that appear in the literature [15], there is an error bound. The error is the difference between the partition obtained by the rounding and the partition that actually minimizes the cut, which is obtainable by combinatorial methods. These bounds are known as discrete Cheeger bounds, since they involve the first nonzero eigenvalue. In particular, we use the following Cheeger bound developed by Mohar [16]. It compares the cut ratio of the partition induced by the sign rounding and the isoperimetric number $i(G)$.
Theorem 2. If G has more than three vertices and its maximal degree is Δ, then
$$\frac{\lambda_2}{2} \le i(G) \le \sqrt{\lambda_2\,(2\Delta - \lambda_2)}.$$

This bound uses the sign rounding and the cut ratio, instead of the median rounding and the edge cut, which we use to describe our framework. However, we chose it because it makes it easier to present the generalization for vertex-weighted graphs in the next section. The cut ratio has also been studied in the stochastic setting [17]. In principle, it is also possible to generalize for vertex-weighted graphs the similar bounds that exist in the literature for median rounding and edge cut [18].
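The two inequalities can be checked numerically on a small graph, computing the isoperimetric number by brute force (the cycle is a hypothetical example; numpy is assumed):

```python
import numpy as np
from itertools import combinations

# Sanity check of the Cheeger/Mohar-type bounds
#   lambda_2 / 2 <= i(G) <= sqrt(lambda_2 * (2*Delta - lambda_2))
# on the cycle C6 (unit weights), with i(G) computed by exhaustive search.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
lam2 = np.linalg.eigvalsh(L)[1]
Delta = int(A.sum(axis=1).max())

def cut(U):
    return sum(A[i, j] for i in U for j in range(n) if j not in U)

iso = min(cut(U) / len(U)
          for k in range(1, n // 2 + 1)
          for U in map(set, combinations(range(n), k)))

assert lam2 / 2 <= iso <= np.sqrt(lam2 * (2 * Delta - lam2)) + 1e-9
print(lam2, iso)
```

For C6, the Fiedler value is 1 and the isoperimetric number is 2/3, well inside the two bounds.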
4. Laplacians with Potential
To motivate our contribution, we recall here the usual interpretation, for instance in [19], of the values of a weight $s$ on vertices using flows on graphs. The vertex weight $s_i$ corresponds to the amount or magnitude of some physical substance placed at $v_i$. The weights $a_{ij}$ at the edges between different vertices correspond, in this interpretation, to a transmission factor or gain that affects the substance when it flows from one vertex $v_i$ to another $v_j$, increasing or decreasing its amount. Following this interpretation of the weight matrix as a transference or shift, the weights $a_{ii}$ in the loop edges are the gain suffered by the substance that stays in the same vertex $v_i$.

In this diffusion process interpretation, the eigenvectors of a shift matrix are the stationary substance distributions. In particular, the Laplacian matrix has a stationary distribution that is uniform (corresponding to the null eigenvalue) because the degree values on the diagonal make the total gain of substance equal to zero. In this sense, the Laplacian process is conservative. Another example is the shift by the adjacency matrix of a connected graph, which has a positive stationary distribution (the random walk limit distribution, corresponding to the Perron eigenvector [3]), with a substance gain given by the Perron eigenvalue.
Following this vein, we show how to control the weights in the diagonal to obtain any positive distribution $\rho$ as a stationary distribution, that is, as the eigenvector of a matrix similar to the Laplacian.
As commented, a weight on vertices is a function $p : V \to \mathbb{R}$. Its diagonal form is the matrix $P = \mathrm{diag}(p_1, \dots, p_n)$. The generalized Laplacian with potential $p$ (or p-Laplacian) is:
$$L_p = L + P.$$
That is, $L_p = D - A + P$. In particular, with $p = 0$, the 0-Laplacian is the ordinary Laplacian. Some properties of p-Laplacians are similar to those of ordinary Laplacians, as can be viewed in [20] under the name of generalized Laplacian. For our purposes, we highlight the following ones, whose proofs require some technicalities:
Lemma 4. If the graph G is connected and the edge weights are positive (so that the off-diagonal entries of $L_p$ are nonpositive for any potential $p$), then:
- (a) The eigenvalues of $L_p$ are real, and the minimum eigenvalue has multiplicity 1. That is, $\mu_1 < \mu_2 \le \dots \le \mu_n$.
- (b) There is a positive eigenvector corresponding to $\mu_1$, unique up to a scalar multiple.

Proof. Under the hypothesis, $L_p$ is a symmetrical Z-matrix [21]. As G is connected, $L_p$ is irreducible. By Observation 1.4.3 of [21], the claims follow. □
The minimum eigenvalue $\mu_1$ is the Perron eigenvalue of $L_p$. It has multiplicity 1, and there is an eigenvector of the Perron eigenvalue with all positive entries. To fix one such eigenvector, we define the Perron eigenvector $\rho$ as the one with $\lVert \rho \rVert = 1$.

The min–max theorem for the operator $L_p$ gives us that
$$\mu_2 = \min_{x \neq 0,\ \langle x, \rho \rangle = 0} R_{L_p}(x),$$
and the minimum is reached in an eigenvector of $\mu_2$ of norm 1 (the Fiedler vector $\phi$).
With these properties, we can replicate the spectral partition methodology, because the spectral decomposition of $L_p$ assures that $\langle \phi, \rho \rangle = 0$. This can be understood, as in the ordinary Laplacian $L$ above, in the sense that the positive and negative values of the Fiedler vector give us an indicator of two sets of vertices. This indicator cuts $V$ in two parts of equal absolute sum of Perron values. That is, $\langle \phi, \rho \rangle = 0$, and if $S = \{v_i : \phi_i > 0\}$ and $\bar{S} = \{v_i : \phi_i \le 0\}$, then
$$\sum_{v_i \in S} \rho_i\, \phi_i = \sum_{v_i \in \bar{S}} \rho_i\, |\phi_i|.$$
However, in this case, the Perron vector is not the constant distribution $\mathbf{1}$, but a positive distribution $\rho$. We use the distribution $\rho$ as a measure of the relative importance of the vertices in a partition or clustering.
Note that the Perron eigenvector is defined up to a constant factor, in the sense that $c\rho$, for $c > 0$, is also a positive eigenvector of the same eigenvalue. Our choice of $\lVert \rho \rVert = 1$ is conventional.
Note also that, as in the ordinary case commented in Section 3, the Fiedler vector $\phi$ does not necessarily have $\{1, -1\}$ components and should be considered as an approximation to the optimal partition.
To build a potential $p$ such that the Perron distribution of $L_p$ will be a predefined given $\rho$, we apply the formula of the following theorem. Note that it is scale-invariant, in the sense that $\rho$ and $c\rho$, for $c > 0$, will produce the same potential $p$. Remember that for a vector $x$, we denote its i-th component as $x_i$, and a function $s : V \to \mathbb{R}$ is identified with the vector $(s_1, \dots, s_n)$ of its values. We can build a potential $p$ such that the Perron vector of $L_p$ has predetermined positive values $\rho_i$:
Theorem 3. For any vector $\rho$ such that $\rho_i > 0$ for $i = 1, \dots, n$, if
$$p_i = \frac{1}{\rho_i} \sum_{j} a_{ij}\, \rho_j \;-\; g_i,$$
being $A = (a_{ij})$ the adjacency matrix and $g$ the degree vector, then the Perron vector of $L_p$ is $\rho$, and the Perron value is 0.

Proof. As the hypotheses of Lemma 4 hold, $L_p$ has a Perron eigenvalue. Note that for each $i$, $\rho_i\, p_i = \sum_j a_{ij}\, \rho_j - \rho_i\, g_i$. Also
$$(\rho (D - A))_i = \rho_i\, g_i - \sum_j a_{ij}\, \rho_j.$$
Therefore
$$(\rho L_p)_i = (\rho (D - A))_i + \rho_i\, p_i = 0.$$
So, $\rho L_p = 0$. To conclude that 0 and $\rho$ are the Perron eigenvalue and eigenvector, we use the fact that in a symmetrical matrix the eigenvectors of different eigenvalues are orthogonal. Consequently, only one eigenvalue can have associated a positive eigenvector such as $\rho$, and this is 0. □
With this result, we can do spectral partition with preassigned weights on the vertices. By the above discussion, the p-Laplacian for the potential $p$ corresponding to the given $\rho$ has a Fiedler vector orthogonal to the Perron vector (that is, it produces a partition in parts of equal total weight at the vertices). The value of the cut is also well expressed by the p-Laplacian, as the following lemma shows. For a vertex weight $s$ and a vertex set $U \subseteq V$, we denote $s(U) = \sum_{v_i \in U} s_i$.
Lemma 5. If $x$ is the (1,−1)-indicator of $(S, \bar{S})$,
$$x L_p x^T = 4\,\mathrm{cut}(S, \bar{S}) + p(V).$$
Proof. As $x_i^2 = 1$ for each $i$,
$$x P x^T = \sum_i p_i\, x_i^2 = p(V).$$
By Lemma 3, $x L x^T = 4\,\mathrm{cut}(S, \bar{S})$, and $x L_p x^T = x L x^T + x P x^T$, hence the claim. □

Therefore, to find a partition (i.e., an indicator $x$) that minimizes $\mathrm{cut}(S, \bar{S})$ is equivalent to minimizing $x L_p x^T$, because $p(V)$ is a constant, given the potential $p$.
For a predefined weight on vertices $\rho$, we build the potential $p$ by Theorem 3. The p-Laplacian $L_p$ specifies the vertex $\rho$-weighted bipartition problem:

| minimize $x L_p x^T$ |
| subject to $x \in \{1, -1\}^n$, $\langle x, \rho \rangle = 0$. |
This combinatorial problem is at least as hard as the conventional bipartition problem, which it includes as a particular case, so it is also NP-complete. In any case, we consider the relaxed problem (without the restriction $x \in \{1, -1\}^n$), which has as solution, by the min–max Theorem 1, the Fiedler vector $\phi$ of $L_p$.
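The whole weighted pipeline can be sketched as follows (the graph, a cycle with a chord, and the weights $\rho$ are hypothetical examples; the potential follows the form of Theorem 3 used above; numpy is assumed):

```python
import numpy as np

# Vertex-weighted spectral bipartition with the Fiedler vector of L_p.
edges = [(i, (i + 1) % 8) for i in range(8)] + [(0, 4)]
n = 8
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
g = A.sum(axis=1)
rho = np.array([1.0, 2.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0])   # vertex weights

p = (A @ rho) / rho - g             # potential of Theorem 3 (assumed form)
L_p = np.diag(g) - A + np.diag(p)

vals, vecs = np.linalg.eigh(L_p)
phi = vecs[:, 1]                          # Fiedler vector of L_p
S = {i for i in range(n) if phi[i] > 0}   # sign rounding

# phi is orthogonal to the positive Perron vector (proportional to rho),
# so the rounding balances the rho-weighted phi-mass between the two parts.
print(sorted(S), abs(phi @ rho))
```

The orthogonality $\langle \phi, \rho \rangle = 0$ guarantees that both sides of the sign rounding are nonempty, since $\rho$ is positive.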
By taking this Fiedler vector as an approximation to the combinatorial solution, that is, to the unrelaxed problem, the error can be bounded with a Cheeger expression similar to that of Mohar (Theorem 2). To express this bound, in the following Theorem 4, we define the cut ratio of U with respect to p as
$$\frac{\mathrm{cut}(U)}{\rho(U)},$$
and the isoperimetric number with respect to p as
$$i_p(G) = \min_{0 < \rho(U) \le \rho(V)/2} \frac{\mathrm{cut}(U)}{\rho(U)}.$$
Given a vector $x \in \mathbb{R}^n$, $x \neq 0$, we order the set of its values as $t_1 < t_2 < \dots < t_m$. That is, for each $i$ in $\{1, \dots, n\}$ there is one $j$ in $\{1, \dots, m\}$ and only one with $x_i = t_j$. The level sets of $x$ are, for each $j \in \{1, \dots, m\}$, $U_j = \{v_i : x_i \le t_j\}$. The sweep cut is the minimum cut following the level sets of $x$, that is:
$$\mathrm{sweep}(x) = \min_{1 \le j < m} \mathrm{cut}(U_j).$$
It is clear that, for any $x$, $\mathrm{cut}(G) \le \mathrm{sweep}(x)$. We define also $g(x)$, the increment of square values along $x$, as
$$g(x) = \sum_{j=1}^{m-1} \left(t_{j+1}^2 - t_j^2\right).$$
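The sweep cut is straightforward to compute; a sketch (the graph and the vector $x$ are hypothetical examples; numpy is assumed):

```python
import numpy as np
from itertools import combinations

# Sweep cut: order the distinct values t_1 < ... < t_m of x and take the
# minimum cut among the level sets U_j = {v_i : x_i <= t_j}, for j < m.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

def cut(U):
    return sum(A[i, j] for i in U for j in range(n) if j not in U)

def sweep(x):
    ts = sorted(set(x))
    return min(cut({i for i in range(n) if x[i] <= t}) for t in ts[:-1])

x = [0.1, 0.2, 0.3, 0.9, 1.0]    # a hypothetical vector on the vertices
min_cut = min(cut(set(U)) for k in range(1, n) for U in combinations(range(n), k))
print(sweep(x), min_cut)          # cut(G) <= sweep(x), as each level set is a cut
```

Only $m - 1$ cuts are inspected, against the exponentially many subsets of the brute-force minimum.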
Lemma 6. Let $L$ be the ordinary Laplacian; for any $x$:
- (a) $\mathrm{sweep}(x)\, g(x) \le \sum_{uv \in E} a_{uv}\, |x_u^2 - x_v^2|$
- (b) $\sum_{uv \in E} a_{uv}\, |x_u^2 - x_v^2| \le \sqrt{x L x^T}\, \sqrt{2 \Delta\, x x^T}$

Proof. For (a), consider the level sets of $x$, that is, the vertices sorted by their values, being $t_1 < \dots < t_m$ the different values of $x$. For each level set $U_j$, by the definition of sweep cut we have:
$$\mathrm{sweep}(x) \le \mathrm{cut}(U_j).$$
Besides,
$$\mathrm{sweep}(x)\, g(x) = \sum_{j=1}^{m-1} \mathrm{sweep}(x) \left(t_{j+1}^2 - t_j^2\right) \le \sum_{j=1}^{m-1} \mathrm{cut}(U_j) \left(t_{j+1}^2 - t_j^2\right) \le \sum_{uv \in E} a_{uv}\, |x_u^2 - x_v^2|.$$
The last inequality is because, for each edge $uv$, if $x_u = t_j$ and $x_v = t_k$ with $j < k$, the double summation includes a chain of terms that collapses to $t_k^2 - t_j^2$.

For (b), we consider vectors indexed by the edges, that is, vectors of $\mathbb{R}^E$. If we apply the Cauchy–Schwarz inequality
$$\langle y, z \rangle \le \lVert y \rVert\, \lVert z \rVert$$
to the particular vectors of components $y_{uv} = \sqrt{a_{uv}}\, |x_u - x_v|$ and $z_{uv} = \sqrt{a_{uv}}\, (|x_u| + |x_v|)$, as $|x_u^2 - x_v^2| \le |x_u - x_v|\,(|x_u| + |x_v|)$, we have:
$$\sum_{uv \in E} a_{uv}\, |x_u^2 - x_v^2| \le \sqrt{\sum_{uv \in E} a_{uv} (x_u - x_v)^2}\; \sqrt{\sum_{uv \in E} a_{uv} (|x_u| + |x_v|)^2}.$$
For the first root factor, it is known that
$$\sum_{uv \in E} a_{uv} (x_u - x_v)^2 = x L x^T,$$
being $L$ the ordinary Laplacian (see, for example, [15]). For the second root factor, by the trivial fact that $(|x_u| + |x_v|)^2 \le 2 (x_u^2 + x_v^2)$, we have:
$$\sum_{uv \in E} a_{uv} (|x_u| + |x_v|)^2 \le 2 \sum_{uv \in E} a_{uv} (x_u^2 + x_v^2) = 2 \sum_i g_i\, x_i^2 \le 2 \Delta\, x x^T. \qquad \square$$
To prove the following claim, we apply the above lemma to $\phi$, the Fiedler eigenvector of $L_p$.

Theorem 4. Being $\lambda$ and $\phi$ the Fiedler eigenvalue and eigenvector of $L_p$, then:
$$\frac{n \lambda - p(V)}{4} \le \mathrm{bw}(G), \qquad i_p(G) \le \frac{\mathrm{cut}(U_j)}{\rho(U_j)} \ \text{for each level set } U_j \text{ of } \phi \text{ with } \rho(U_j) \le \rho(V)/2, \qquad \mathrm{sweep}(\phi)\, g(\phi) \le \sqrt{2 \Delta \left(\lambda - \phi P \phi^T\right)}.$$
Proof. For the first inequality, calling $\hat{x}$ the (−1,1)-indicator of the minimal bipartition $(\hat{S}, \bar{\hat{S}})$, that is, such that $\mathrm{cut}(\hat{S}, \bar{\hat{S}}) = \mathrm{bw}(G)$, by Lemma 5 we have:
$$\hat{x} L_p \hat{x}^T = 4\, \mathrm{bw}(G) + p(V),$$
that is, $\mathrm{bw}(G) = \frac{1}{4}\left(\hat{x} L_p \hat{x}^T - p(V)\right)$. Besides, being $R_{L_p}(x) = \frac{x L_p x^T}{x x^T}$ for each $x \neq 0$, the min–max theorem gives $\lambda \le R_{L_p}(\hat{x})$. In particular, $\hat{x} \hat{x}^T = n$, hence $n \lambda \le \hat{x} L_p \hat{x}^T$ and:
$$\frac{n \lambda - p(V)}{4} \le \mathrm{bw}(G).$$
For the second inequality, being each such $U_j$ a particular set with $\rho(U_j) \le \rho(V)/2$, the minimum defining $i_p(G)$ is lesser or equal than its cut ratio with respect to $p$.

Finally, for the third inequality, we use the Lemma 6 inequalities for the particular case $x = \phi$, that is:
- (a) $\mathrm{sweep}(\phi)\, g(\phi) \le \sum_{uv \in E} a_{uv}\, |\phi_u^2 - \phi_v^2|$
- (b) $\sum_{uv \in E} a_{uv}\, |\phi_u^2 - \phi_v^2| \le \sqrt{\phi L \phi^T}\, \sqrt{2 \Delta\, \phi \phi^T}$

From (a), we have $\mathrm{sweep}(\phi)\, g(\phi) \le \sum_{uv \in E} a_{uv}\, |\phi_u^2 - \phi_v^2|$, which from (b) is lesser or equal than $\sqrt{\phi L \phi^T}\, \sqrt{2 \Delta\, \phi \phi^T}$. The values of these terms are $\phi \phi^T = 1$ and $\phi L \phi^T = \phi L_p \phi^T - \phi P \phi^T$. Additionally, as $\lVert \phi \rVert = 1$ and $\phi L_p \phi^T = \lambda$, we have
$$\phi L \phi^T = \lambda - \phi P \phi^T.$$
Substituting those values in the inequalities:
$$\mathrm{sweep}(\phi)\, g(\phi) \le \sqrt{2 \Delta \left(\lambda - \phi P \phi^T\right)},$$
which is the third inequality. Note that, by the definition of $i_p(G)$ and $\rho(U_j)$, for each level set $U_j$ with $\rho(U_j) \le \rho(V)/2$ we have $\mathrm{cut}(U_j) \ge i_p(G)\, \rho(U_j)$, so the bound also controls the cut ratio with respect to $p$. □

This result is similar to the above Theorem 2, in this case bounding the cost of the minimal cut and the sweep cut of the Fiedler eigenvector of $L_p$.