Article

A Further Study on the Degree-Corrected Spectral Clustering under Spectral Graph Theory

College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Symmetry 2022, 14(11), 2428; https://doi.org/10.3390/sym14112428
Submission received: 31 October 2022 / Revised: 9 November 2022 / Accepted: 12 November 2022 / Published: 16 November 2022
(This article belongs to the Special Issue Symmetry in Graph and Hypergraph Theory)

Abstract

Spectral clustering algorithms are often used to find clusters in the community detection problem. Recently, a degree-corrected spectral clustering algorithm was proposed. However, it has only been used for partitioning graphs generated from stochastic blockmodels. This paper studies the degree-corrected spectral clustering algorithm based on spectral graph theory and shows that it gives a good approximation of the optimal clustering for a wide class of graphs. Moreover, we also give theoretical support for finding an appropriate degree correction. Several numerical experiments for community detection are conducted in this paper to evaluate our method.

1. Introduction

Due to the growing availability of large-scale network datasets, community detection has attracted significant attention. The community detection problem is to discover a community structure by dividing the network into multiple clusters according to the affinity between nodes. Because the spectral clustering method is easy to implement and can detect non-convex clusters, it is widely used for detecting clusters in networks. Compared to traditional algorithms, spectral clustering performs well and has many fundamental advantages [1,2,3,4].
In the spectral clustering algorithm, the similarity between the data points is reflected by the weights on the edges in the graph. The data points are mapped to a lower-dimensional space through the Laplacian matrix of the graph, and finally, the non-convex datasets in the obtained low-dimensional space are clustered by traditional clustering algorithms.
Let $G=(V,E)$ be an undirected and unweighted simple graph with $n$ nodes, where $V$ and $E$ are the set of nodes and edges, respectively. The adjacency matrix of graph $G$, denoted by $W=(w_{ij})$, is a 0–1 symmetric matrix of order $n$, whose $(i,j)$-th and $(j,i)$-th elements are 1 if there is an edge between nodes $i$ and $j$, and 0 otherwise. Let $d_i=\sum_{j=1}^{n}w_{ij}$, which is defined as the degree of node $i$. Moreover, $d_{\max}=\max_{i\in V}d_i$ and $d_{\min}=\min_{i\in V}d_i$ are called the maximal degree and minimal degree of $G$, respectively. Denote by $\bar d$ the average degree of graph $G$, which equals $\frac{1}{n}\sum_{i=1}^{n}d_i$. The degree matrix is defined by $D=\mathrm{diag}(d_1,\dots,d_n)$. The symmetric matrix $D-W$ is called the unnormalized Laplacian of $G$, each of whose row sums is zero. The normalized Laplacian $L=I-D^{-1/2}WD^{-1/2}$ has zero as its smallest eigenvalue and plays a very important role in the spectral clustering algorithm. It is well defined only when $D^{-1}$ exists, i.e., when there are no isolated nodes.
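As a concrete reference for these definitions, the following minimal NumPy sketch builds the degree vector, the unnormalized Laplacian, and the (optionally degree-corrected) normalized Laplacian from a 0–1 symmetric adjacency matrix; the function name and the $\tau$ argument are ours, added for illustration.

```python
import numpy as np

def graph_matrices(W, tau=0.0):
    """Degree vector, unnormalized Laplacian D - W, and the normalized
    Laplacian I - (D + tau*I)^{-1/2} W (D + tau*I)^{-1/2} (tau = 0 gives
    the ordinary normalized Laplacian and requires every degree > 0)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                        # d_i = sum_j w_ij
    L_unnorm = np.diag(d) - W                # each row sums to zero
    inv_sqrt = np.diag(1.0 / np.sqrt(d + tau))
    L_norm = np.eye(len(d)) - inv_sqrt @ W @ inv_sqrt
    return d, L_unnorm, L_norm
```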
In 2002, Ng et al. [5] proposed a version of spectral clustering (NJW) based on the normalized Laplacian matrix. Moreover, the authors in [5] analyzed their algorithm using matrix perturbation theory and gave conditions under which the algorithm performs well when nodes from different clusters are well separated. However, when dealing with a sparse network with a strong degree of heterogeneity, i.e., when the minimum degree of the graph is low, NJW does not concentrate well. To resolve this issue, Chaudhuri et al. [6] introduced the notion of a degree-corrected random-walk Laplacian $I-(D+\tau I)^{-1}W$ and demonstrated that it outputs the correct partition for a wide range of graphs generated from the extended planted partition (EPP) model. Instead of performing the spectral decomposition on the entire matrix, Chaudhuri et al. [6] divided the nodes into two random subsets and only used the induced subgraph on one of those subsets to compute the spectral decomposition. Qin and Rohe [7] investigated the spectral clustering algorithm using the degree-corrected normalized Laplacian $L_\tau=I-(D+\tau I)^{-1/2}W(D+\tau I)^{-1/2}$ under the degree-corrected stochastic blockmodel, where $\tau=\bar d$. This method extended the previous statistical estimation results to the more canonical spectral clustering algorithm, which is called regularized spectral clustering (RSC). Recently, Qing and Wang [8] also proposed an improved spectral clustering (ISC) under the degree-corrected stochastic blockmodel, where $\tau=0.1\cdot\frac{d_{\min}+d_{\max}}{2}$. Unlike NJW and RSC, which use the top $k$ eigenvectors to construct the mapping matrix, ISC uses the top $k+1$ eigenvectors and the corresponding eigenvalues instead and performs especially well on weak-signal networks, where $k$ is the number of clusters.
Actually, previous works on spectral clustering with the degree-corrected Laplacian were mostly applied to graphs generated from stochastic blockmodels. Moreover, the optimal $\tau$ has a complex dependence on the degree distribution of the graph, and $\tau=\bar d$ provides good results [6,7]. In [7], the authors claimed that $\tau=\bar d$ could be adjusted by a multiplicative constant and that the results are not sensitive to such adjustments. However, some numerical experiments show that an appropriate $\tau$ can be found for a better performance.
This paper investigates the spectral clustering algorithm using the degree-corrected Laplacian in view of spectral graph theory [9] and shows that it also works for a wide class of graphs. Moreover, we provide theoretical guidance on the choice of the parameter $\tau$. Finally, six real-world datasets are used to test the performance of our method for an appropriate $\tau$. The results are roughly equivalent to those of RSC, or even better.
The rest of this paper is organized as follows. In Section 2, we list some relevant definitions and useful lemmas for the analysis of our main results in Section 3. In Section 4, some numerical experiments are conducted on real-world datasets. Moreover, some artificial networks are generated to analyze the effect of our method in terms of some related parameters. The conclusion and future work are provided in Section 5.

2. Preliminaries

Let $G=(V,E)$ be a graph. The symmetric difference of two subsets $S$ and $T$ of $V$ is defined as $S\,\Delta\,T=(S\setminus T)\cup(T\setminus S)$. For a subset $S$ of $V$, $E(S,V\setminus S)=\{(u,v)\in E: u\in S,\ v\in V\setminus S\}$. The symbol $\mu(S)$ denotes the volume of $S$, given by the sum of the degrees of all nodes in $S$, i.e., $\mu(S)=\sum_{v\in S}d_v$. If $k$ disjoint subsets $S_1,\dots,S_k$ of $V$ satisfy $\bigcup_{i=1}^{k}S_i=V$, we call $\{S_1,\dots,S_k\}$ a $k$-way partition of $V$. Kolev and Mehlhorn [10] introduced the minimal average conductance, denoted by
$\bar\phi_k(G)=\min_{\{S_1,\dots,S_k\}\in U}\ \frac{1}{k}\big(\phi(S_1)+\cdots+\phi(S_k)\big),$
where $U$ is the set containing every $k$-way partition of the node set of $G$, and $\phi(S)=\frac{|E(S,V\setminus S)|}{\mu(S)}$. A partition $\{S_1,\dots,S_k\}$ is called optimal if it satisfies $\frac{1}{k}(\phi(S_1)+\cdots+\phi(S_k))=\bar\phi_k(G)$. In this paper, we denote by $A_1,\dots,A_k$ the actual partition returned by the RSC algorithm, where $k$ is the number of classes of the graph.
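For illustration, the sketch below evaluates $\phi(S)$ and the average conductance of a given $k$-way partition; note that $\bar\phi_k(G)$ itself minimizes over all $k$-way partitions, which this helper does not attempt. The function names are ours.

```python
import numpy as np

def conductance(W, S):
    """phi(S) = |E(S, V \\ S)| / mu(S) for a node subset S (iterable of indices)."""
    W = np.asarray(W, dtype=float)
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[list(S)] = True
    cut = W[np.ix_(mask, ~mask)].sum()   # edges leaving S (0-1 adjacency)
    volume = W[mask].sum()               # mu(S): sum of degrees inside S
    return cut / volume

def average_conductance(W, partition):
    """(1/k) * (phi(S_1) + ... + phi(S_k)) for a list of node-index lists."""
    return np.mean([conductance(W, S) for S in partition])
```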
Let $\|\cdot\|_2$ denote the 2-norm of a vector and $\|\cdot\|_F$ denote the Frobenius norm of a matrix.
The $k$-means algorithm aims to find a set of $k$ centers $c_1,\dots,c_k$ that minimizes the sum of the squared distances between each point and the center to which it is assigned.
Let $F$ be a spectral embedding map from $V$ to a vector space. Given any $k$-way partition $\{S_1,\dots,S_k\}$ of $G$ and a set of vectors $w_1,\dots,w_k$, the cost function of the partition $\{S_1,\dots,S_k\}$ of $V$, mentioned in [11], is defined as
$g(S_1,\dots,S_k,w_1,\dots,w_k)=\sum_{i=1}^{k}\sum_{v\in S_i}d_v\,\|F(v)-w_i\|_2^2.$
The main idea of this function is to expand each element $F(v)$ by making $d_v$ copies of $F(v)$, forming a set with $2|E(G)|$ points, and then to obtain a partition by running the $k$-means algorithm on this expanded set. The "trick" is to copy every node $u$ into $d_u$ identical nodes. This method can efficiently deal with networks whose clusters overlap. For convenience, we assume that the output of the $k$-means clustering algorithm on the expanded vertex set satisfies the following condition.
(A)
For every $v\in V$, all $d_v$ copies of $F(v)$ are contained in one part.
Suppose that $\{Y_1,\dots,Y_k\}$ is the partition of $V$ with centers $z_1,\dots,z_k$ output by the $k$-means clustering algorithm. The value of the clustering cost function is denoted by "COST", i.e.,
$\mathrm{COST}=g(Y_1,\dots,Y_k,z_1,\dots,z_k).$
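A small sketch of this cost function is given below. Instead of physically copying each node $d_v$ times, it weights each squared distance by $d_v$, which gives the same value of $g$; the function name and argument layout are ours.

```python
import numpy as np

def clustering_cost(F, degrees, partition, centers):
    """g(S_1,...,S_k, w_1,...,w_k) = sum_i sum_{v in S_i} d_v * ||F(v) - w_i||_2^2.

    F: (n x k) array whose v-th row is the embedding F(v).
    degrees: length-n array of node degrees d_v.
    partition: list of k index arrays; centers: list of k vectors.
    Weighting by d_v is equivalent to making d_v copies of F(v)."""
    cost = 0.0
    for S_i, w_i in zip(partition, centers):
        S_i = np.asarray(S_i)
        diff = F[S_i] - np.asarray(w_i)
        cost += np.sum(degrees[S_i] * np.sum(diff ** 2, axis=1))
    return cost
```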
Next, we introduce the traditional NJW algorithm and the RSC algorithm (Algorithm 1).
Algorithm 1. The traditional NJW and RSC algorithm
Input:
$W$, $k$ ($\tau$ for RSC)
1:
Calculate the normalized Laplacian matrix $L=D^{-1/2}WD^{-1/2}$
($L_\tau=(D+\tau I)^{-1/2}W(D+\tau I)^{-1/2}$ for RSC).
2:
Find the eigenvectors $f_1,\dots,f_k$ corresponding to the $k$ largest eigenvalues of $L$. Form $X=[f_1,\dots,f_k]$ by putting the eigenvectors into the columns.
3:
Normalize each row of $X$ to get the matrix $Y$, i.e., $Y_{ij}=X_{ij}/\big(\sum_{j=1}^{k}X_{ij}^2\big)^{1/2}$, where $i=1,\dots,n$ and $j=1,\dots,k$.
4:
Apply the $k$-means method to the rows of $Y$ to get the label of each node.
Output:
labels for all nodes
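A compact Python sketch of Algorithm 1 follows, assuming NumPy, SciPy, and scikit-learn are available; setting $\tau=0$ recovers NJW and $\tau>0$ (e.g., $\tau=\delta\bar d$) gives RSC. The function name and defaults are ours.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, k, tau=0.0, seed=0):
    """Algorithm 1: NJW when tau = 0, RSC when tau > 0."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    inv_sqrt = np.diag(1.0 / np.sqrt(d + tau))            # (D + tau*I)^{-1/2}
    L = inv_sqrt @ W @ inv_sqrt                            # step 1
    _, vecs = eigh(L)                                      # eigenvalues in ascending order
    X = vecs[:, -k:]                                       # step 2: top-k eigenvectors
    Y = X / np.linalg.norm(X, axis=1, keepdims=True)       # step 3: row-normalize
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Y)  # step 4
```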

3. Analysis of RSC Algorithm

Our method for analyzing the RSC algorithm follows the strategy developed by Peng et al. [11], Kolev and Mehlhorn [10], and Mizutani [12]. Let $\{S_1,\dots,S_k\}$ be a partition of the node set $V$. Define $g_i\in\mathbb{R}^n$ as the indicator vector of $S_i$; that is, the $v$-th element of $g_i$ is one if $v\in S_i$ and zero otherwise. The normalized indicator $\bar g_i$ of $S_i$ is given as
$\bar g_i=\frac{D^{1/2}g_i}{\|D^{1/2}g_i\|_2},\qquad (\bar g_i)_v=\begin{cases}\sqrt{\dfrac{d_v}{\mu(S_i)}} & v\in S_i,\\[4pt] 0 & v\notin S_i.\end{cases}$
It is obvious that $\|\bar g_i\|_2=1$.
The following result is called the structure theorem, which plays a very important role in examining the performance of spectral clustering. It shows that there is a linear combination $\hat f_i$ of $f_1,\dots,f_k$ such that $\hat f_i$ and $\bar g_i$ are close.
Theorem 1
(Structure Theorem). Let
$\Psi=\frac{1}{1-\lambda_{k+1}(\tau)}\left(1-\frac{d_{\min}}{d_{\max}+\tau}+\bar\phi_k(G)\,\frac{d_{\min}}{d_{\max}+\tau}\right),$    (2)
where $\lambda_{k+1}(\tau)$ ($\lambda_{k+1}$ for short) is the $(k+1)$-th largest eigenvalue of $L_\tau$, let $\{S_1,\dots,S_k\}$ be the $\bar\phi_k(G)$-optimal partition of $G$, and let $\bar G=[\bar g_1,\dots,\bar g_k]\in\mathbb{R}^{n\times k}$, $\bar F=[f_1,\dots,f_k]\in\mathbb{R}^{n\times k}$. If $k\Psi<1$, then there exists a $k\times k$ orthogonal matrix $U=[u_1,\dots,u_k]$ such that
$\|\bar F U-\bar G\|_F\le 2\sqrt{k\Psi}.$    (3)
Proof. 
Denote by $\bar g_{iu}$ the element of $\bar g_i$ corresponding to the vertex $u$. Moreover, let
$\bar g_i=\sum_{j=1}^{n}h_{i,j}f_j,\qquad \hat f_i=\sum_{j=1}^{k}h_{i,j}f_j.$
First,
$\bar g_i^{\top}L_\tau\bar g_i=\sum_{\{u,v\}\in E(G)}\left(\frac{1}{d_u}\bar g_{iu}^2-\frac{2}{\sqrt{(d_u+\tau)(d_v+\tau)}}\,\bar g_{iu}\bar g_{iv}+\frac{1}{d_v}\bar g_{iv}^2\right)$
$=\sum_{\{u,v\}\in E,\,u\in S_i,\,v\notin S_i}\frac{1}{\mu(S_i)}+\sum_{\{u,v\}\in E,\,u,v\in S_i}\frac{2}{\mu(S_i)}\left(1-\frac{\sqrt{d_ud_v}}{\sqrt{(d_u+\tau)(d_v+\tau)}}\right)$
$\le\phi(S_i)+\frac{2|E(S_i)|}{\mu(S_i)}\left(1-\frac{d_{\min}}{d_{\max}+\tau}\right)=1-\frac{2|E(S_i)|}{\mu(S_i)}\cdot\frac{d_{\min}}{d_{\max}+\tau}=1-\frac{d_{\min}}{d_{\max}+\tau}+\phi(S_i)\,\frac{d_{\min}}{d_{\max}+\tau}<1.$
On the other hand,
$\bar g_i^{\top}L_\tau\bar g_i=\Big(\sum_{j=1}^{n}h_{i,j}f_j\Big)^{\top}L_\tau\Big(\sum_{j=1}^{n}h_{i,j}f_j\Big)=\Big(\sum_{j=1}^{n}h_{i,j}f_j\Big)^{\top}\sum_{j=1}^{n}h_{i,j}(1-\lambda_j)f_j=\sum_{j=1}^{n}h_{i,j}^2(1-\lambda_j)\ge\sum_{j=k+1}^{n}h_{i,j}^2(1-\lambda_j)\ge(1-\lambda_{k+1})\sum_{j=k+1}^{n}h_{i,j}^2.$
Then,
$\|\hat f_i-\bar g_i\|_2^2=\sum_{j=k+1}^{n}h_{i,j}^2\le\frac{1}{1-\lambda_{k+1}}\left(1-\frac{d_{\min}}{d_{\max}+\tau}+\phi(S_i)\,\frac{d_{\min}}{d_{\max}+\tau}\right),$
and
$\|\hat F-\bar G\|_F^2=\sum_{i=1}^{k}\|\hat f_i-\bar g_i\|_2^2\le\frac{k}{1-\lambda_{k+1}}\left(1-\frac{d_{\min}}{d_{\max}+\tau}+\bar\phi_k(G)\,\frac{d_{\min}}{d_{\max}+\tau}\right)=k\Psi.$
Let $h_i=[h_{i,1},\dots,h_{i,k}]^{\top}$, $i=1,\dots,k$, and $H=[h_1,\dots,h_k]\in\mathbb{R}^{k\times k}$. Consider the singular value decomposition of $H$, given as $H=A\Sigma B^{\top}$, where $A\in\mathbb{R}^{k\times k}$ and $B\in\mathbb{R}^{k\times k}$ are orthogonal matrices and $\Sigma$ is a $k\times k$ diagonal matrix.
Let $U=AB^{\top}$ and $R=U-H\in\mathbb{R}^{k\times k}$. Then, $U$ is an orthogonal matrix. According to the proof of Theorem 4 in [12], we obtain
$\|R\|_F\le\sqrt{k\Psi}\quad\text{and}\quad\|\bar FU-\bar G\|_F\le k\Psi+\sqrt{k\Psi}.$
When $k\Psi<1$, we have
$\|\bar FU-\bar G\|_F\le 2\sqrt{k\Psi}.$
This completes the proof. □
Given $k$ vectors $c_1,\dots,c_k\in\mathbb{R}^k$, suppose that $\|c_i-c_j\|_2^2$ is lower bounded by some real numbers $\zeta_{i,j}\ge 0$ and that $g(S_1,\dots,S_k,c_1,\dots,c_k)$ is upper bounded by a real number $\omega\ge 0$, i.e.,
$\|c_i-c_j\|_2^2\ge\zeta_{i,j}\ (i\ne j)\qquad\text{and}\qquad g(S_1,\dots,S_k,c_1,\dots,c_k)\le\omega.$    (4)
We are now ready to derive the bounds $\zeta_{i,j}$ and $\omega$ shown in (4) for the RSC algorithm. Let $\bar F=[f_1,\dots,f_k]$ and let $p_v$ be the $v$-th row of $\bar F$, corresponding to the node $v$. Since $U$ is an orthogonal matrix, the left-hand side of inequality (3) can be rewritten as
$\|\bar FU-\bar G\|_F^2=\|\bar F-\bar GU^{\top}\|_F^2=\sum_{i=1}^{k}\sum_{v\in S_i}\left\|p_v-\sqrt{\frac{d_v}{\mu(S_i)}}\,u_i\right\|_2^2.$    (5)
The spectral embedding map in the RSC algorithm, denoted by $F_{RSC}(v)$, is given as
$F_{RSC}(v)=\frac{1}{\|p_v\|_2}\,p_v.$
Hence, according to the discussion in [12], it is easy to obtain the upper bound of "COST". The discussion relies on the following inequality.
Lemma 1
([12]). The following inequality holds for a vector $a\in\mathbb{R}^k$ and a vector $u\in\mathbb{R}^k$ with $\|u\|_2=1$:
$\left\|\frac{a}{\|a\|_2}-u\right\|_2\le 2\,\|a-u\|_2.$
Theorem 2.
Let a partition $\{S_1,\dots,S_k\}$ of $G$ be an optimal partition achieving $\bar\phi_k(G)$ and let $F_{RSC}$ be the spectral embedding map in the RSC algorithm. Define the center of $S_i$ as $c_i=u_i$ for $i=1,\dots,k$. Then
  • $\|c_i-c_j\|_2^2=2$;
  • $g(S_1,\dots,S_k,c_1,\dots,c_k)\le 16\,k\,\mu_{\max}\,\Psi$,
where $\mu_{\max}=\max\{\mu(S_i)\mid i=1,2,\dots,k\}$.
Proof. 
First, since $c_i=u_i$, we have
$\|c_i-c_j\|_2^2=(u_i-u_j)^{\top}(u_i-u_j)=2.$
On the other hand, let $F(v)=\sqrt{\mu(S_i)/d_v}\,p_v$. Then,
$g(S_1,\dots,S_k,c_1,\dots,c_k)=\sum_{i=1}^{k}\sum_{v\in S_i}d_v\,\|F_{RSC}(v)-u_i\|_2^2=\sum_{i=1}^{k}\sum_{v\in S_i}d_v\left\|\frac{p_v}{\|p_v\|_2}-u_i\right\|_2^2=\sum_{i=1}^{k}\sum_{v\in S_i}d_v\left\|\frac{F(v)}{\|F(v)\|_2}-u_i\right\|_2^2$
$\le 4\sum_{i=1}^{k}\sum_{v\in S_i}d_v\left\|\sqrt{\frac{\mu(S_i)}{d_v}}\,p_v-u_i\right\|_2^2\quad(\text{by Lemma 1})$
$=4\sum_{i=1}^{k}\sum_{v\in S_i}\mu(S_i)\left\|p_v-\sqrt{\frac{d_v}{\mu(S_i)}}\,u_i\right\|_2^2\le 16\,k\,\mu_{\max}\,\Psi\quad(\text{by Equation (5) and Theorem 1}).$
The result holds. □
Assume that OPT stands for the optimal clustering cost of graph $G$. Then it is obvious that $\mathrm{COST}\le\alpha\cdot\mathrm{OPT}$, where $\alpha$ is the approximation ratio of the $k$-means algorithm. Moreover, $\mathrm{OPT}\le g(S_1,\dots,S_k,c_1,\dots,c_k)$. Therefore, we can obtain the upper bound of COST.
Theorem 3.
Let $\{S_1,\dots,S_k\}$ be a $\bar\phi_k(G)$-optimal partition of $G$. Then
$\mathrm{COST}\le 16\,k\,\alpha\,\mu_{\max}\,\Psi.$
Lemma 2
([12]). Assume that, for every permutation $\pi:\{1,\dots,k\}\to\{1,\dots,k\}$, there is an index $l$ such that $\mu(A_l\,\Delta\,S_{\pi(l)})\ge 2\epsilon\,\mu(S_{\pi(l)})$ for a real number $0\le\epsilon\le 1/2$. Then, the following inequality holds:
$\mathrm{COST}\ge\frac{1}{8}\sum_{i\in H}\xi_i\,\zeta_{i,p}\,\min\{\mu(S_i),\mu(S_l)\}-\omega,$
where $H$ is a subset of $\{1,\dots,k\}$, $p$ is an element of $\{1,\dots,k\}$, $\xi_i\ge 0$ is a non-negative real number satisfying $\sum_{i\in H}\xi_i\ge\epsilon$, and $\omega$ is the upper bound of $g(S_1,\dots,S_k,c_1,\dots,c_k)$ in (4).
By setting $\zeta_{i,j}=2$ and $\omega=16\,k\,\alpha\,\mu_{\max}\,\Psi$, we obtain the following result.
Theorem 4.
Suppose that the assumption of Lemma 2 holds. Then,
$\mathrm{COST}\ge\frac{1}{4}\,\epsilon\,\mu_{\min}-16\,k\,\alpha\,\mu_{\max}\,\Psi,$
where $\mu_{\min}=\min\{\mu(S_i)\mid i=1,2,\dots,k\}$.
Theorem 5
(Main result). Given a graph $G=(V,E)$ and a positive integer $k$, let a partition $\{S_1,\dots,S_k\}$ of $G$ be $\bar\phi_k(G)$-optimal and let $\{A_1,\dots,A_k\}$ be the partition of $G$ returned by the RSC clustering algorithm. Assume that the $k$-means clustering algorithm has an approximation ratio $\alpha$ and satisfies assumption (A). If $\Psi\le\frac{\mu_{\min}}{264\cdot 2\,k\,\alpha\,\mu_{\max}}$, then, after a suitable renumbering of $A_1,\dots,A_k$, the following holds for $i=1,\dots,k$:
$\mu(A_i\,\Delta\,S_i)\le 264\,k\,\alpha\,\frac{\mu_{\max}}{\mu_{\min}}\,\Psi\,\mu(S_i).$
Proof. 
Choose the real number
$\epsilon=132\,k\,\alpha\,\frac{\mu_{\max}}{\mu_{\min}}\,\Psi\le\frac{1}{4}.$
Suppose, for contradiction, that for every permutation $\pi:\{1,\dots,k\}\to\{1,\dots,k\}$ there is an index $l$ such that $\mu(A_l\,\Delta\,S_{\pi(l)})\ge 2\epsilon\,\mu(S_{\pi(l)})$. Hence, applying Theorems 3 and 4, we obtain
$\mathrm{COST}\ge\frac{1}{4}\,\epsilon\,\mu_{\min}-16\,k\,\alpha\,\mu_{\max}\,\Psi=33\,k\,\alpha\,\mu_{\max}\,\Psi-16\,k\,\alpha\,\mu_{\max}\,\Psi=17\,k\,\alpha\,\mu_{\max}\,\Psi>16\,k\,\alpha\,\mu_{\max}\,\Psi,$
which contradicts Theorem 3. That means that, after a suitable renumbering of $A_1,\dots,A_k$, we have
$\mu(A_i\,\Delta\,S_i)\le 2\epsilon\,\mu(S_i)=264\,k\,\alpha\,\frac{\mu_{\max}}{\mu_{\min}}\,\Psi\,\mu(S_i)$
for every $i=1,2,\dots,k$. □

4. Finding an Appropriate τ and Numerical Experiment

The main theorem gives an upper bound on $\mu(A_i\,\Delta\,S_i)$ for the RSC algorithm. It tells us that the performance varies as the term $\Psi$ decreases with increasing $\tau$. In this section, we try to find an appropriate $\tau$ for a good partitioning, guided by this main theorem.
Before our analysis, we make some reasonable assumptions (B) to (D).
(B)
$2|E(S_i)|/\mu(S_i)>1/\bar d$;
(C)
$\tau\le k\,\bar d$;
(D)
$\mu_{\min}/\mu_{\max}\ge\frac{2\bar d}{n}$.
Firstly, $\frac{2|E(S_i)|}{\mu(S_i)}$ stands for the ratio of (twice) the number of edges inside $S_i$ to the degree sum of all nodes in $S_i$. We may assume that $\frac{2|E(S_i)|}{\mu(S_i)}>1/\bar d$, since $S_i$ is one of the clusters in the optimal partitioning. Second, as mentioned in [6,7], the choice of $\tau$ is very important: if $\tau$ is too small, there is insufficient regularization; if $\tau$ is too large, it washes out significant eigenvalues. It is therefore reasonable to assume that $\tau\le k\bar d$. Moreover, $\mu_{\min}$ and $\mu_{\max}$ stand for the volumes of the corresponding clusters, and their ratio stands for the relative density. Hence, we may assume that $\frac{\mu_{\min}}{\mu_{\max}}\ge\frac{2\bar d}{n}$.
Then,
$\Psi\le\frac{1}{1-\lambda_{k+1}(\tau)}\left(1-\frac{d_{\min}}{\bar d\,(d_{\max}+\tau)}\right)\le\frac{\bar d\,(d_{\max}+k\bar d)-d_{\min}}{(1-\lambda_{k+1}(\tau))\,\bar d\,(d_{\max}+\tau)},$
where $\lambda_{k+1}(\tau)$ is the $(k+1)$-th largest eigenvalue of $L_\tau$. Furthermore, the theoretical analysis in [7] shows that $\tau=\bar d$ provides good results and that one could adjust this by a multiplicative constant. For these reasons, we set $\tau=\delta\bar d$ and attempt to find an appropriate $\delta$ to refine the algorithm.
Six real datasets are used to test our method. These datasets can be downloaded directly from http://zke.fas.harvard.edu/software.html, accessed on 10 September 2022. Table 1 shows the detailed information of the six real datasets, including the source of the dataset, the number of data points (n), the number of communities (k), the minimum degree ($d_{\min}$), the maximum degree ($d_{\max}$), and the average degree ($\bar d$).

4.1. Find an Appropriate δ

Let
$UB(\delta)=\frac{1}{\big(1-\lambda_{k+1}(\delta\bar d)\big)\,(d_{\max}+\delta\bar d)}.$
Figure 1 plots the variation of $UB(\delta)$ as $\delta$ varies between 0 and 1 for the six real datasets. It is obvious that $UB(\delta)$ decreases with increasing $\delta$. The following theorem (often called the Geršgorin disc theorem) supports this observation.
Theorem 6
(Geršgorin Disk Theorem). Let $A=(a_{ij})\in\mathbb{R}^{n\times n}$ and let
$R_i(A)=\sum_{\substack{j=1,\ j\ne i}}^{n}|a_{ij}|,\qquad 1\le i\le n,$
denote the deleted absolute row sums of $A$. Then, all the eigenvalues of $A$ are located in the union of the $n$ discs
$\bigcup_{i=1}^{n}\{z\in\mathbb{C}:|z-a_{ii}|\le R_i(A)\}\equiv G(A).$
It tells us that, for all $i=1,2,\dots,n$, $R_i(L_{\delta\bar d})$ decreases with increasing $\delta$. Then, $\lambda_{k+1}(\delta\bar d)$ and the term $UB(\delta)$ decrease as well. It is easy to see that $\lim_{\delta\to\infty}UB(\delta)=0$. Therefore, we would like to find an appropriate $\delta$ such that the upper bound $UB(\delta)$ does not vary too much when $\delta$ varies slightly.
According to Theorem 5, we may assume that
$\Psi\le\frac{\bar d\,(d_{\max}+k\bar d)-d_{\min}}{(1-\lambda_{k+1}(\tau))\,\bar d\,(d_{\max}+\tau)}\le\frac{\mu_{\min}}{264\cdot 2\,k\,\alpha\,\mu_{\max}}.$
Then,
$\frac{1}{(1-\lambda_{k+1}(\tau))\,\bar d\,(d_{\max}+\tau)}\le\frac{\bar d}{264\,k^2\,n},$
which follows from the assumptions $\frac{\mu_{\min}}{\mu_{\max}}\ge\frac{2\bar d}{n}$ and $\bar d\,(d_{\max}+k\bar d)-d_{\min}\ge k$ (treating the approximation ratio $\alpha$ as a constant).
Define $UBD(\delta)$ as the absolute difference of $UB$ when $\delta$ increases by 0.005, i.e., $UBD(\delta)=|UB(\delta+0.005)-UB(\delta)|$. We would like to find the $\delta_0$ that satisfies the following conditions:
$UBD(\delta)\le\frac{\bar d}{264\,k^2\,n}\ \text{ for all }\delta\ge\delta_0,\qquad UBD(\delta)>\frac{\bar d}{264\,k^2\,n}\ \text{ for all }\delta<\delta_0.$    (6)
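The sketch below illustrates this selection rule: it evaluates $UB(\delta)$ on a grid with step 0.005 and returns the first $\delta$ after which the successive differences stay below the threshold $\bar d/(264k^2n)$, which is our reading of condition (6). The function names, the grid range, and the use of a dense eigendecomposition are our own choices for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def ub(W, k, delta):
    """UB(delta) = 1 / ((1 - lambda_{k+1}(delta*dbar)) * (d_max + delta*dbar))."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    tau = delta * d.mean()
    inv_sqrt = np.diag(1.0 / np.sqrt(d + tau))
    lam = np.sort(eigh(inv_sqrt @ W @ inv_sqrt, eigvals_only=True))[::-1]
    return 1.0 / ((1.0 - lam[k]) * (d.max() + tau))   # lam[k] = (k+1)-th largest

def select_delta0(W, k, delta_max=3.0, step=0.005):
    """Pick delta_0 per condition (6): UBD stays below dbar/(264*k^2*n) from delta_0 on."""
    W = np.asarray(W, dtype=float)
    n, dbar = W.shape[0], W.sum(axis=1).mean()
    threshold = dbar / (264 * k**2 * n)
    deltas = np.arange(0.0, delta_max + step, step)
    ubd = np.abs(np.diff([ub(W, k, dlt) for dlt in deltas]))
    below = ubd <= threshold
    for i in range(len(below)):
        if below[i:].all():      # differences stay below the threshold from here on
            return deltas[i]
    return deltas[-1]
```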
In the rest of this paper, three indices, namely RI, NMI, and error rate, are used to evaluate the effectiveness.

Evaluation Indices

Rand Index. For a dataset with $n$ data points, the total number of sample pairs is $\frac{n(n-1)}{2}$. If two sample points that belong to the same class are assigned to the same cluster, we count such a pair in $a$; if two sample points that belong to different classes are assigned to different clusters, we count such a pair in $b$. The Rand index (RI) is calculated as
$RI=\frac{a+b}{n(n-1)/2}.$
The $RI$ value represents the proportion of correctly clustered sample pairs among all sample pairs and is often used to measure the similarity between two partitions. Obviously, $RI$ lies between 0 and 1. If $RI=1$, the clustering is completely correct, and if $RI=0$, it is completely wrong.
Normalized Mutual Information. We use $U$ and $V$ to denote the true label vector and the predicted label vector, respectively. Let $U_i$ represent the elements belonging to class $i$ in $U$ and $V_j$ represent the elements belonging to class $j$ in $V$. $H(U)$ represents the information entropy of $U$, which can be calculated by
$H(U)=-\sum_{i}p_i\log p_i,$
where the base of the logarithm is usually 2 and $p_i$ represents the ratio of the number of nodes belonging to class $i$ to the total number of nodes, i.e., $p_i=\frac{|U_i|}{n}$. Now, we can obtain the formula for the mutual information (MI):
$MI(U,V)=\sum_{i}\sum_{j}p_{ij}\log\frac{p_{ij}}{p_i\times p_j},$
where $p_{ij}=\frac{|U_i\cap V_j|}{n}$. Based on the information entropy and the mutual information, we obtain the normalized mutual information as
$NMI(U,V)=\frac{2\,MI(U,V)}{H(U)+H(V)}.$
Error Rate. The error rate is defined by
$\min_{\pi:\ \text{permutation over}\ \{1,2,\dots,k\}}\ \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{\pi(\hat l_i)\ne l_i\},$
where $\hat l_i$ and $l_i$ are the true and predicted labels of node $i$, respectively.
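For completeness, a small sketch of these three indices in Python follows, assuming class labels are encoded as integers 0,...,k−1. RI and NMI are taken from scikit-learn (rand_score and normalized_mutual_info_score, the latter with its default arithmetic normalization, matching the formula above), and the permutation in the error rate is found with the Hungarian algorithm; the helper name is ours.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import rand_score, normalized_mutual_info_score

def error_rate(true_labels, pred_labels, k):
    """Minimum fraction of misclassified nodes over all label permutations."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    n = len(true_labels)
    confusion = np.zeros((k, k), dtype=int)     # confusion[i, j]: true i, predicted j
    for t, p in zip(true_labels, pred_labels):
        confusion[t, p] += 1
    # Maximizing the matched counts with the Hungarian algorithm is equivalent
    # to minimizing the error over all permutations of the predicted labels.
    rows, cols = linear_sum_assignment(-confusion)
    return 1.0 - confusion[rows, cols].sum() / n

# ri  = rand_score(true_labels, pred_labels)
# nmi = normalized_mutual_info_score(true_labels, pred_labels)
```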

4.2. Real Networks Experiments

After some pre-processing, these six real datasets are all labeled networks containing $k$ non-overlapping communities. We use RSC-$\delta$ to denote the RSC algorithm with $\delta=\delta_0$ satisfying the condition in (6). Actually, NJW, RSC, and RSC-$\delta$ are three different cases of the RSC algorithm for different values of $\delta$: when $\delta=0$, it is the NJW algorithm; when $\delta=1$, it is the RSC algorithm; and when $\delta=\delta_0$ in (6), it is RSC-$\delta$. Table 2 shows the experimental results of these three cases. Furthermore, the best performance on each dataset is indicated in bold. The last row in Table 2 shows the corresponding $\delta_0$ for RSC-$\delta$.
As can be seen from the table, RSC-$\delta$ clusters the UKfaculty and karate datasets completely correctly. Moreover, RSC-$\delta$ achieves the best clustering result on the politicalblog dataset, with only 58 clustering errors.
Table 3 shows the quantities appearing in the upper bound on $\mu(S_i\,\Delta\,A_i)$ proposed in Theorem 5. From this observation, the performance of the RSC-$\delta$ algorithm is affected by the two quantities $\mu_{\max}/\mu_{\min}$ and $\bar\phi_k(G)$. For example, RSC-$\delta$ does not perform well on caltech and dolphins. All networks except caltech have a minimal average conductance smaller than 0.4, while that of caltech is larger than 0.5. Although dolphins has a small $\bar\phi_k(G)$, its $\mu_{\max}/\mu_{\min}$ is larger than 2.

4.3. Synthetic Data Experiments

In this section, we use artificial networks to evaluate the performance of the RSC-$\delta$ algorithm in terms of the average degree, the mixing parameter, and the number of nodes in the largest community. We generate artificial networks using the LFR benchmark, which is considered a standard test bed for community detection and is characterized by non-uniform distributions of node degrees and community sizes.
The test artificial networks are generated with the following parameters: the number of nodes (n), the average degree ($\bar d$), the maximum degree (maxd), the mixing parameter ($\mu$), the number of nodes in the smallest community (minc), and the number of nodes in the largest community (maxc). The value of the mixing parameter $\mu$ is between 0.1 and 0.9. Low values of $\mu$ give a clear community structure in which intra-cluster links greatly outnumber inter-cluster links [17].
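As an illustration of this setup, the snippet below generates one such network with networkx's LFR_benchmark_graph generator. The power-law exponents tau1 and tau2 are illustrative choices that the paper does not specify, and the generator may need parameter tuning or a different seed to converge.

```python
import networkx as nx

# Hedged sketch: one LFR network with n = 500, average degree 15, mu = 0.5,
# community sizes between 100 and 300, and maximum degree 220 (the Section 4.3.2
# setting). tau1/tau2 (degree and community-size exponents) are our own choices.
G = nx.LFR_benchmark_graph(
    n=500, tau1=2.5, tau2=1.5, mu=0.5,
    average_degree=15, max_degree=220,
    min_community=100, max_community=300,
    seed=42,
)
# Ground-truth communities are stored as a node attribute.
communities = {frozenset(G.nodes[v]["community"]) for v in G}
W = nx.to_numpy_array(G)   # adjacency matrix usable by the algorithms above
```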

4.3.1. The Ratio of the Average Degree to the Maximum Degree

In this experiment, we generate nine artificial networks consisting of 500 nodes each. To evaluate the performance of RSC-$\delta$ in terms of the average degree, we fix the parameters $\mu=0.5$, minc = 100, maxc = 300, and maxd = 220, and let the average degree vary from 10 to 170, i.e., 10, 30, 50, 70, 90, 110, 130, 150, and 170. The ratio of the average degree to the maximum degree thus varies from 0.0455 to 0.7727. The performance comparison is summarized in Figure 2.
From these results, we observe that the performance of RSC-$\delta$ is highly dependent on the average degree of the network. As the average degree increases, RI and NMI increase and the error rate decreases significantly. Actually, this phenomenon is verified by inequality (2), since equality holds when the graph is regular.

4.3.2. Mixing Parameter

In this experiment, we again generate nine artificial networks with 500 nodes and fix the parameters $\bar d=15$, minc = 100, maxc = 300, and maxd = 220. In order to study the effect of the mixing parameter on RSC-$\delta$, $\mu$ varies from 0.1 to 0.9. The experimental results are shown in Figure 3.
From these results, we see that RSC-$\delta$ performs excellently when $\mu$ is between 0 and 0.3. However, its performance drops sharply as $\mu$ varies from 0.3 to 0.5. This phenomenon coincides with the result on the real datasets that RSC-$\delta$ does not perform well when $\bar\phi_k(G)$ is larger than 0.5. However, the performance of RSC-$\delta$ remains stable when $\mu\ge 0.5$, which shows that RSC-$\delta$ is less affected by $\mu$ in this range.

4.3.3. The Number of Nodes in the Largest Community

In this experiment, we generate 13 artificial networks consisting of 1700 nodes each. To evaluate the performance of RSC-$\delta$ in terms of the number of nodes in the largest community, we fix the parameters $\mu=0.5$, $\bar d=30$, minc = 300, and maxd = 500, and let the number of nodes in the largest community vary from 300 to 900 with a step size of 50. The experimental results are shown in Figure 4.
Since both the degree distribution and the community-size distribution of graphs generated by the LFR benchmark are power laws, this experiment uses the ratio maxc/minc to simulate $\mu_{\max}/\mu_{\min}$. The results show that the RSC-$\delta$ algorithm performs well when the network is "balanced", which also verifies the results on the real datasets.

5. Conclusions

Traditional spectral clustering algorithms such as NJW perform poorly on sparse networks with a strong degree of heterogeneity. The RSC algorithm improves the performance of spectral clustering on sparse networks through degree correction. Based on spectral graph theory, this paper investigates the degree-correction method of RSC and shows that the RSC algorithm works for a wide class of networks. Moreover, we provide a method to find an appropriate degree correction $\tau$ to refine the RSC algorithm. Some numerical experiments are conducted to evaluate the performance of our method. Comparing the experimental results on the six real datasets, RSC-$\delta$ performs well on the karate, politicalblog, and simmons datasets. The experimental results on the artificial networks show that RSC-$\delta$ performs well when the average degree is much smaller than the maximum degree. Furthermore, the performance of the RSC-$\delta$ algorithm is less affected by the mixing parameter when $\mu\ge 0.5$. Finally, the numerical experiments also show that the algorithm is affected by the two quantities $\bar\phi_k(G)$ and $\mu_{\max}/\mu_{\min}$.

6. Discussion

The RSC algorithm uses a constant τ for the degree-correction. Can we use different degree-corrections for different nodes? We try to use the information of the neighbor nodes of each node as follows.
Let $N(i)$ be the set of nodes adjacent to node $i$. Denote $d_{\max}^{\,i}=\max\{d_j: j\in N(i)\}$, $d_{\min}^{\,i}=\min\{d_j: j\in N(i)\}$, $d_{\mathrm{mid}}^{\,i}=\frac{1}{2}\big(d_{\max}^{\,i}+d_{\min}^{\,i}\big)$, and $d_{\mathrm{mean}}^{\,i}=\sum_{j\in N(i)}d_j/d_i$.
Let $\Pi=\mathrm{diag}(\pi_1,\dots,\pi_n)$ be a diagonal matrix of order $n$. The modified normalized Laplacian matrix is
$L_\Pi=(D+\Pi)^{-1/2}\,W\,(D+\Pi)^{-1/2}.$
We use RSC-max, RSC-min, RSC-mean, and RSC-mid to denote the methods in which $\pi_i$ equals $d_{\max}^{\,i}$, $d_{\min}^{\,i}$, $d_{\mathrm{mean}}^{\,i}$, and $d_{\mathrm{mid}}^{\,i}$, respectively, for $i=1,2,\dots,n$. Table 4 shows the experimental results of these methods. We can see that the RSC-min algorithm is slightly better than RSC: it performs better than RSC on five datasets and only misclassifies two nodes on UKfaculty. Therefore, using a different degree correction for each node might improve the performance of the RSC algorithm. We leave this to future work.
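A brief sketch of this node-wise correction is given below; the helper names are ours, and the graph is assumed to have no isolated nodes so that every neighbor set is non-empty.

```python
import numpy as np

def node_wise_correction(W, mode="min"):
    """pi_i = min / max / mean / mid of the neighbor degrees {d_j : j in N(i)}."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    pi = np.zeros_like(d)
    for i in range(len(d)):
        dn = d[np.flatnonzero(W[i] > 0)]          # degrees of the neighbors of i
        if mode == "min":
            pi[i] = dn.min()
        elif mode == "max":
            pi[i] = dn.max()
        elif mode == "mean":
            pi[i] = dn.sum() / d[i]               # d_mean^i = sum_{j in N(i)} d_j / d_i
        else:                                      # "mid"
            pi[i] = 0.5 * (dn.min() + dn.max())
    return pi

def modified_laplacian(W, pi):
    """L_Pi = (D + Pi)^{-1/2} W (D + Pi)^{-1/2}."""
    W = np.asarray(W, dtype=float)
    inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1) + pi))
    return inv_sqrt @ W @ inv_sqrt
```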

Author Contributions

Conceptualization, W.L.; data curation, W.L. and F.L.; formal analysis, W.L.; funding acquisition, W.L.; investigation, F.L.; methodology, W.L.; project administration, W.L.; resources, W.L. and F.L.; software, F.L.; supervision, W.L.; validation, W.L., F.L. and Y.Z.; visualization, W.L. and F.L.; writing—original draft preparation, W.L.; writing—review and editing, W.L., F.L. and Y.Z.; All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (No. 11901094).

Data Availability Statement

The data presented in this study are openly available at http://zke.fas.harvard.edu/software.html, accessed on 10 September 2022.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Adamic, L.A.; Glance, N. The political blogosphere and the 2004 US election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery, Chicago, IL, USA, 21–25 August 2005; pp. 36–43.
  2. Hamad, D.; Biela, P. Introduction to spectral clustering. In Proceedings of the 2008 3rd International Conference on Information and Communication Technologies: From Theory to Applications, Damascus, Syria, 7–11 April 2008; pp. 1–6.
  3. Khan, B.S.; Niazi, M.A. Network community detection: A review and visual survey. arXiv 2017, arXiv:1708.00977.
  4. Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 2007, 17, 395–416.
  5. Ng, A.; Jordan, M.; Weiss, Y. On spectral clustering: Analysis and an algorithm. Adv. Neural Inf. Process. Syst. 2001, 14, 849–856.
  6. Chaudhuri, K.; Chung, F.; Tsiatas, A. Spectral clustering of graphs with general degrees in the extended planted partition model. In Proceedings of the Conference on Learning Theory, JMLR Workshop and Conference Proceedings, Edinburgh, UK, 25–27 June 2012; pp. 1–35.
  7. Qin, T.; Rohe, K. Regularized spectral clustering under the degree-corrected stochastic blockmodel. Adv. Neural Inf. Process. Syst. 2013, 26, 3120–3128.
  8. Qing, H.; Wang, J. An improved spectral clustering method for community detection under the degree-corrected stochastic blockmodel. arXiv 2020, arXiv:2011.06374.
  9. Chung, F.R.K. Spectral Graph Theory; CBMS Reg. Conf. Ser. Math. 92; AMS: Providence, RI, USA, 1997.
  10. Kolev, P.; Mehlhorn, K. A Note on Spectral Clustering. In Proceedings of the 24th Annual European Symposium on Algorithms (ESA 2016), Aarhus, Denmark, 22–26 August 2016; Volume 57, pp. 57:1–57:14.
  11. Peng, R.; Sun, H.; Zanetti, L. Partitioning well-clustered graphs: Spectral clustering works! In Proceedings of the Conference on Learning Theory, Paris, France, 3–6 July 2015; pp. 1423–1455.
  12. Mizutani, T. Improved analysis of spectral algorithm for clustering. Optim. Lett. 2021, 15, 1303–1325.
  13. Nepusz, T.; Petróczi, A.; Négyessy, L.; Bazsó, F. Fuzzy communities and the concept of bridgeness in complex networks. Phys. Rev. E 2008, 77, 016107.
  14. Red, V.; Kelsic, E.D.; Mucha, P.J.; Porter, M.A. Comparing community structure to characteristics in online collegiate social networks. SIAM Rev. 2011, 53, 526–543.
  15. Lusseau, D. The emergent properties of a dolphin social network. Proc. R. Soc. Lond. Ser. B Biol. Sci. 2003, 270, S186–S188.
  16. Zachary, W.W. An information flow model for conflict and fission in small groups. J. Anthropol. Res. 1977, 33, 452–473.
  17. Yang, C.; Liu, Z.; Zhao, D.; Sun, M.; Chang, E. Network representation learning with rich text information. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
Figure 1. Plots of the UB( δ ) in six real datasets: x axis: δ and y axis: UB( δ ). (a) UB for different values of δ on UKfaculty; (b) UB for different values of δ on caltech; (c) UB for different values of δ on dolphins; (d) UB for different values of δ on karate; (e) UB for different values of δ on politicalblog; (f) UB for different values of δ on simmons.
Figure 2. Ri, Nmi, and Error rate for different average degrees: x axis: the ratio of the average degree to the maximum degree; and the y axis: Ri, Nmi, Error rate. (a) Ri for different values of d ¯ ; (b) Nmi for different values of d ¯ ; (c) Error rate for different values of d ¯ .
Figure 3. Ri, Nmi, and Error rate for different mixing parameters: x axis: μ and y axis: Ri, Nmi, Error rate. (a) Ri for different values of μ ; (b) Nmi for different values of μ ; (c) Error rate for different values of μ .
Figure 4. Ri, Nmi, and Error rate for different numbers of nodes in the largest community: x axis: the number of nodes; and y axis: Ri, Nmi, Error rate. (a) Ri for different values of maxc; (b) Nmi for different values of maxc; (c) Error rate for different values of maxc.
Table 1. The information of six real datasets.
DataSet        Source                          n      k    d_min   d_max   d̄
UKfaculty      Nepusz et al. (2008) [13]       79     3    2       39      13.97
caltech        Traud et al. (2011) [14]        590    8    1       179     43.46
dolphins       Lusseau (2003) [15]             62     2    1       12      5.12
karate         Zachary (1977) [16]             34     2    1       17      4.6
politicalblog  Adamic and Glance (2005) [1]    1222   2    1       351     27.35
simmons        Traud et al. (2011) [14]        1137   4    1       293     42.66
Table 2. Results on six real datasets.
Metric       Method   UKfaculty   Caltech    Dolphins   Karate   Politicalblog   Simmons
RI           NJW      0.9834      0.9091     1          0.9412   0.5003          0.8596
             RSC      0.9834      0.9091     1          0.9412   0.5003          0.8596
             RSC-δ    1           0.8936     0.9677     1        0.9095          0.8550
NMI          NJW      0.9502      0.6138     1          0.8365   0.0006          0.6796
             RSC      1           0.5881     0.8904     1        0.7133          0.6143
             RSC-δ    1           0.5867     0.8904     1        0.7317          0.6187
Error rate   NJW      1/79        149/590    0/62       1/34     586/1222        284/1137
             RSC      0/79        170/590    1/62       0/34     64/1222         244/1137
             RSC-δ    0/79        174/590    1/62       0/34     58/1222         238/1137
δ_0                   0.71        2.155      2.435      1.205    0.15            0.625
Table 3. The $\bar\phi_k(G)\,k\,\mu_{\max}/\mu_{\min}$ of the six real datasets.
DataSet        μ_min    μ_max    φ̄_k(G)   μ_max/μ_min   φ̄_k(G)·k·μ_max/μ_min
UKfaculty      189      519      0.1909    2.7460        1.5724
caltech        1443     4821     0.5062    3.3410        13.5302
dolphins       94       224      0.0453    2.3830        0.2159
karate         76       80       0.1283    1.0526        0.2701
politicalblog  16,175   17,253   0.0943    1.0666        0.2012
simmons        8796     15,592   0.2946    1.7726        2.0890
Table 4. Different methods of degree correction.
Metric       Method     UKfaculty   Caltech    Dolphins   Karate   Politicalblog   Simmons
RI           RSC        1           0.8967     0.9677     1        0.9007          0.8521
             RSC-min    0.9646      0.9008     1          1        0.9065          0.8590
             RSC-max    0.9834      0.9018     1          1        0.5104          0.8525
             RSC-mean   0.9646      0.8976     1          1        0.5002          0.8504
             RSC-mid    0.9834      0.9005     1          1        0.5002          0.8539
NMI          RSC        1           0.5881     0.8904     1        0.7133          0.6143
             RSC-min    0.8985      0.5953     1          1        0.7243          0.6228
             RSC-max    0.9502      0.6016     1          1        0.0227          0.6172
             RSC-mean   0.8985      0.5933     1          1        0.0019          0.6073
             RSC-mid    0.9502      0.6006     1          1        0.0019          0.6189
Error rate   RSC        0/79        170/590    1/62       0/34     64/1222         244/1137
             RSC-min    2/79        162/590    0/62       0/34     60/1222         222/1137
             RSC-max    1/79        163/590    0/62       0/34     521/1222        242/1137
             RSC-mean   2/79        170/590    0/62       0/34     586/1222        240/1137
             RSC-mid    1/79        164/590    0/62       0/34     586/1222        237/1137