Article

Information Limits for Community Detection in Hypergraph with Label Information

1 School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
2 Department of Statistics, North Dakota State University, Fargo, ND 58103, USA
* Author to whom correspondence should be addressed.
Symmetry 2021, 13(11), 2060; https://doi.org/10.3390/sym13112060
Submission received: 26 September 2021 / Revised: 11 October 2021 / Accepted: 21 October 2021 / Published: 1 November 2021

Abstract

In network data mining, community detection refers to the problem of partitioning the nodes of a network into clusters (communities), which is equivalent to identifying the cluster label of each node. A label estimator is said to achieve exact recovery of the true labels (communities) if it coincides with the true labels with probability converging to one. In this work, we consider the effect of label information on the exact recovery of communities in an m-uniform Hypergraph Stochastic Block Model (HSBM). We investigate two scenarios of label information: (1) a noisy label is observed independently for each node, matching the true label with probability 1 − α_n; (2) the true label of each node is observed independently with probability 1 − α_n. We derive sharp boundaries for exact recovery under both scenarios from an information-theoretic point of view. The label information improves the sharp detection boundary if and only if α_n = n^{−β+o(1)} for a constant β > 0.

1. Introduction

A graph or network consists of a set of nodes (vertices) and an edge set. Graphs have been used extensively to model a variety of systems in many fields [1,2,3,4,5]. Due to this widespread application, network data analysis has drawn a lot of attention in both the statistics and machine learning communities [6,7,8,9,10,11,12,13]. Real-world networks are often more complex than ordinary graphs, and in this case a hypergraph is a popular alternative model [8,14,15,16,17,18,19,20,21]. Given a positive integer n, let V = [n] := {1, 2, …, n}. An undirected m-uniform hypergraph on V is a pair H_m = (V, E) in which E is a set of subsets of V such that |e| = m for every e ∈ E; each element of E is called a hyperedge. That is, in H_m, each hyperedge consists of exactly m distinct nodes. For i_1 < i_2 < ⋯ < i_m, we set A_{i_1 i_2 ⋯ i_m} = 1 if {i_1, i_2, …, i_m} is a hyperedge and A_{i_1 i_2 ⋯ i_m} = 0 otherwise. We further set A_{i_1 i_2 ⋯ i_m} = A_{j_1 j_2 ⋯ j_m} whenever {i_1, i_2, …, i_m} = {j_1, j_2, …, j_m}, and A_{i_1 i_2 ⋯ i_m} = 0 whenever |{i_1, i_2, …, i_m}| < m. That is, the hypergraph is symmetric and self-loops are not allowed. The m-dimensional symmetric binary array A = (A_{i_1 ⋯ i_m}) is called the adjacency tensor of the hypergraph H_m. When m = 2, H_2 is just the usual graph that has been widely used in community detection problems [9].
A hypergraph is random if A_{i_1 i_2 ⋯ i_m} is a random variable for i_1 < i_2 < ⋯ < i_m. The binary m-uniform Hypergraph Stochastic Block Model (HSBM) H_{m,n,p,q} is defined as follows. Each node i ∈ [n] is independently and uniformly assigned a community label σ_i, with P(σ_i = +1) = P(σ_i = −1) = 1/2. We write σ = (σ_1, σ_2, …, σ_n), I_+ = I_+(σ) = {i : σ_i = +1}, and I_− = I_−(σ) = {i : σ_i = −1}. Then, for 1 ≤ i_1 < i_2 < ⋯ < i_m ≤ n,

A_{i_1, i_2, …, i_m} ~ Bern(p)  if {i_1, i_2, …, i_m} ⊆ I_+ or {i_1, i_2, …, i_m} ⊆ I_−,   and   A_{i_1, i_2, …, i_m} ~ Bern(q)  otherwise.

In addition, the A_{i_1, i_2, …, i_m} (i_1 < i_2 < ⋯ < i_m) are assumed to be mutually independent conditionally on σ. Here, +1 and −1 represent the two communities, and I_+ and I_− denote the sets of nodes that belong to the +1 and −1 communities, respectively. The subset {i_1, i_2, …, i_m} forms a hyperedge with probability p if the distinct nodes i_1, i_2, …, i_m are all in the same community; otherwise, {i_1, i_2, …, i_m} forms a hyperedge with probability q. Throughout this paper, the community structure is assumed to be balanced; that is, the number of nodes in each community is n/2, as in [6,22,23]. Moreover, we focus on the case p = a log(n)/n^{m−1} and q = b log(n)/n^{m−1} with a > b > 0, since this is the smallest order of hyperedge probability for which exact recovery is possible. We denote this binary m-uniform Hypergraph Stochastic Block Model by H_{m,n,a,b} = H_m(n, a log(n)/n^{m−1}, b log(n)/n^{m−1}).
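As a concrete illustration of this sampling mechanism, here is a minimal Python sketch (our own code; the function name, the sparse dictionary representation of A, and the exactly balanced label assignment are illustrative choices, not part of the paper):

```python
import itertools
import numpy as np

def sample_hsbm(n, m, a, b, rng=None):
    """Sample a balanced m-uniform HSBM H_{m,n,a,b} (illustrative sketch).

    Returns the label vector sigma (entries +1/-1) and a dict mapping each
    m-subset of [n] (as a sorted tuple) to 0/1, i.e. a sparse adjacency tensor.
    """
    rng = np.random.default_rng(rng)
    p = a * np.log(n) / n ** (m - 1)   # within-community hyperedge probability
    q = b * np.log(n) / n ** (m - 1)   # between-community hyperedge probability
    # exactly balanced labels: n/2 nodes per community (an assumption matching balancedness)
    sigma = np.array([1] * (n // 2) + [-1] * (n - n // 2))
    rng.shuffle(sigma)
    A = {}
    for e in itertools.combinations(range(n), m):
        same = len(set(sigma[list(e)])) == 1      # all m nodes share one label
        A[e] = int(rng.random() < (p if same else q))
    return sigma, A

# Example (small instance): sigma, A = sample_hsbm(n=40, m=3, a=6.0, b=2.0, rng=0)
```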
Community detection refers to the problem of identifying the true label σ based on an observation of a hypergraph A. Let σ ^ be an estimator of σ . We say σ ^ is an exact recovery of σ or σ ^ exactly recovers σ if:
P(∃ s ∈ {±1} : σ̂ = s·σ) = 1 − o(1).

In words, exact recovery means that the estimated label vector σ̂ equals the true label vector σ, up to a global sign flip, with probability converging to one as the number of nodes goes to infinity. We say exact recovery is possible if there is an estimator σ̂ that exactly recovers σ, and exact recovery is impossible if no estimator exactly recovers σ.
In practice, along with the hypergraph A, side information about node labels is usually available [23,24,25,26,27,28,29,30]. For example, in a co-authorship or co-citation network, the cluster labels of some authors are known [28]. In a Web query network, some queries are labeled [30]. In student relational networks, the dorms in which students live can serve as label information [29]. Various algorithms have been developed to incorporate label information in community recovery in hypergraphs [28,29,30], and incorporating side information has been shown to improve clustering performance [23,24,25,26,27,28,29,30]. The sharp recovery boundary with label or side information was given by [23,27] in the graph case. However, to the best of our knowledge, the sharp recovery boundary for hypergraphs is still unknown. In this paper, we study the effect of label information on the boundary of exact recovery for hypergraphs and consider two types of label information: (1) a noisy label is observed independently for each node, matching the true label with probability 1 − α_n; (2) the true label of each node is observed independently with probability 1 − α_n. Let α_n = n^{−β+o(1)} with a constant β ≥ 0. From an information-theoretic point of view, we derive sharp boundaries of exact recovery in terms of m, a, b, β. Interestingly, the label information is useful if and only if β > 0. The main result is summarized in Table 1, where η_{m,a,b}(β) and C_{m,a,b} are defined in Equations (1) and (2). In both cases, for a fixed m, the region (in terms of a, b) where exact recovery is impossible shrinks as β gets larger. The label information is helpful if and only if β > 0; that is, α_n has to converge to zero at the rate n^{−β} for some β > 0. The regions in Table 1 are visualized in Figures 1 and 2 in Section 2.
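For concreteness, the two side-information mechanisms can be simulated as follows (a minimal sketch, assuming the pure polynomial rate α_n = n^{−β}; the function names are illustrative):

```python
import numpy as np

def noisy_labels(sigma, beta, rng=None):
    """Scenario (1): each label is flipped independently with probability alpha_n = n**(-beta)."""
    rng = np.random.default_rng(rng)
    n = len(sigma)
    alpha_n = n ** (-beta)
    flip = rng.random(n) < alpha_n
    return np.where(flip, -np.asarray(sigma), sigma)

def partial_labels(sigma, beta, rng=None):
    """Scenario (2): each true label is revealed independently with probability 1 - alpha_n;
    an entry of 0 means the label of that node is unobserved."""
    rng = np.random.default_rng(rng)
    n = len(sigma)
    alpha_n = n ** (-beta)
    observed = rng.random(n) < 1 - alpha_n
    return np.where(observed, sigma, 0)
```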

2. Main Result

In this section, we consider community detection in hypergraphs when noisy labels or a proportion of the true labels are observed, from an information-theoretic point of view. We derive sharp boundaries of exact recovery, which provide a benchmark for developing practical community detection algorithms.

2.1. Detection with Noisy Label Information

In this subsection, we consider community detection in hypergraphs when a noisy version of the node labels is available. In the graph regime, community detection with noisy label information was proposed in [27] and extensively studied in [23]. Here, we focus on an m-uniform hypergraph with an arbitrary fixed m ≥ 2. Given a true label vector σ = (σ_1, …, σ_n), for each node i a noisy label Y_i is observed independently, and Y_i coincides with the true label σ_i with probability 1 − α_n. More specifically, P(Y_i = σ_i | σ_i) = 1 − α_n and P(Y_i = −σ_i | σ_i) = α_n with α_n ∈ [0, 1/2], and the Y_i (1 ≤ i ≤ n) are independent conditionally on σ. If α_n = 0, the true label of each node is fully known. If α_n = 1/2, the noisy labels Y = (Y_1, Y_2, …, Y_n) do not provide any information about the true labels. The hypergraph A and Y are assumed to be conditionally independent given σ. In this subsection, we focus on the effect of the noisy labels Y on community detection in a hypergraph.
Assume α_n = n^{−β+o(1)} with a constant β ≥ 0 and define

η_{m,a,b}(β) = \frac{1}{2^{m-1}(m-1)!}\Big[a + b - \frac{γ_{m,a,b}(β)}{C_{m,a,b}} + \frac{β}{2C_{m,a,b}}\log\frac{γ_{m,a,b}(β)+β}{γ_{m,a,b}(β)-β}\Big] + \frac{β}{2},    (1)

where

C_{m,a,b} = \frac{\log(a) - \log(b)}{2^{m-1}(m-1)!},   γ_{m,a,b}(β) = \sqrt{β^2 + 4ab\,C_{m,a,b}^2}.    (2)

Here, C_{m,a,b} and γ_{m,a,b} are introduced only for notational convenience. The quantity η_{m,a,b}(β) can be viewed as the signal contained in the model: the larger η_{m,a,b}(β) is, the easier exact recovery becomes. This is easiest to see in the special case β = 0, where

η_{m,a,b}(0) = \frac{(\sqrt{a} - \sqrt{b})^2}{2^{m-1}(m-1)!}.
For a fixed m, a large η_{m,a,b}(0) means that the gap between a and b is large, so within-community node tuples are much more densely connected by hyperedges than between-community tuples; hence, it becomes easier to cluster the nodes into groups. Note that η_{m,a,b}(0) was used to characterize the sharp detection boundary in [6]. For an arbitrary β ≥ 0, we provide necessary and sufficient conditions for the exact recovery of the community structure. To this end, we first investigate the maximum likelihood estimator (MLE) of the true labels; the region where exact recovery is impossible corresponds to the region where the MLE fails. Then, based on the noisy labels, we construct an estimator that exactly recovers the community structure. The result is summarized in the following theorem.
Theorem 1.
Assume α_n = n^{−β+o(1)} with a constant β ≥ 0. Then exact recovery in the HSBM H_{m,n,a,b} is impossible if

η_{m,a,b}(β) < 1 when β < C_{m,a,b}(a − b),   or   β < 1 when β > C_{m,a,b}(a − b).    (4)

Exact recovery is possible if

η_{m,a,b}(β) > 1 when β < C_{m,a,b}(a − b),   or   β > 1 when β > C_{m,a,b}(a − b).    (5)
Here, η m , a , b ( β ) and C m , a , b are defined in Equations (1) and (2).
Based on Theorem 1, there is a phase transition phenomenon for exact recovery in H_{m,n,a,b}. In the region β < C_{m,a,b}(a − b), exact recovery is possible if η_{m,a,b}(β) > 1 and impossible if η_{m,a,b}(β) < 1. In the region β > C_{m,a,b}(a − b), exact recovery is possible if β > 1 and impossible if β < 1. In this sense, the phase transition occurs at 1, and 1 is the sharp boundary for exact recovery. When α_n is bounded away from zero, β = 0 and C_{m,a,b}(a − b) > 0 trivially hold; then, Theorem 1 recovers Theorems 4 and 5 in [6]. This shows that the noisy labels are useful if and only if α_n converges to zero at the rate n^{−β} for some β > 0. Furthermore, for fixed m, a, b, exact recovery becomes easier as β increases, since η_{m,a,b}(β) is increasing in β (see Corollary A1). When m = 2, Theorem 1 reduces to Theorems 1 and 2 in [23]. Note that C_{m,a,b} is decreasing in m; hence, for a fixed β, the region of exact recovery for m = 2 contains the region for m ≥ 3 as a proper subset. These findings are summarized in Figure 1, where we visualize the regions characterized by (4) and (5) for m = 2, 3 and β = 0, 0.4, 0.8. In Figure 1, the red (green) region is where exact recovery is impossible (possible). We point out that the time complexity of our estimator achieving exact recovery is O(n^m). Since our focus in this paper is to derive the sharp boundary of exact recovery, as in [23,27], and not to propose algorithms with optimal performance, our estimator may or may not outperform existing algorithms.
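To make the boundary quantities concrete, here is a small numerical sketch (our own code; it assumes the reconstruction of Equations (1) and (2) given above) that evaluates η_{m,a,b}(β), C_{m,a,b}, γ_{m,a,b}(β) and checks the Theorem 1 condition for a given (m, a, b, β):

```python
import math

def recovery_threshold(m, a, b, beta):
    """Evaluate eta_{m,a,b}(beta), C_{m,a,b}, gamma_{m,a,b}(beta) from Eqs. (1)-(2)
    and check the sufficient condition (5) of Theorem 1 (illustrative sketch)."""
    K = 2 ** (m - 1) * math.factorial(m - 1)
    C = (math.log(a) - math.log(b)) / K
    gamma = math.sqrt(beta ** 2 + 4 * a * b * C ** 2)
    eta = (a + b - gamma / C
           + beta / (2 * C) * math.log((gamma + beta) / (gamma - beta))) / K + beta / 2
    # Theorem 1 compares eta with 1 when beta < C*(a-b), and beta with 1 when beta > C*(a-b);
    # the boundary case beta == C*(a-b) is not covered by the theorem.
    stat = eta if beta < C * (a - b) else beta
    return {"eta": eta, "C": C, "gamma": gamma, "exact_recovery_possible": stat > 1}

# Example usage: print(recovery_threshold(m=3, a=9.0, b=1.0, beta=0.4))
```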

2.2. Detection with Partially Observed Labels

In this subsection, we consider the community detection problem when the true labels are partially observed. This type of side information was considered in [23,24,25,26] in the context of graphs and in [28,29,30] for hypergraphs. Here, we focus on an m-uniform hypergraph with an arbitrary fixed m ≥ 2. Given the true labels σ = (σ_1, …, σ_n), for each node i the true label is observed independently with probability 1 − α_n. More specifically, we define a random variable Y_i with P(Y_i = σ_i | σ_i) = 1 − α_n and P(Y_i = 0 | σ_i) = α_n, where α_n ∈ [0, 1]. Here, Y_i = 0 indicates that the true label of node i is not observed. If α_n = 1, no label information is observed; if α_n = 0, all the true labels are observed and community detection is not necessary. We study how α_n changes the sharp detection boundary from an information-theoretic point of view. To this end, we investigate the maximum likelihood estimator (MLE) of the true labels; the region where exact recovery is impossible corresponds to the region where the MLE fails. The exact recovery estimator is constructed based on the partially observed labels. The result is summarized in the following theorem.
Theorem 2.
Assume α_n = n^{−β+o(1)} with a constant β ≥ 0. Then exact recovery in the HSBM H_{m,n,a,b} is impossible if

\frac{(\sqrt{a} - \sqrt{b})^2}{2^{m-1}(m-1)!} + β < 1.    (6)

Exact recovery is possible if

\frac{(\sqrt{a} - \sqrt{b})^2}{2^{m-1}(m-1)!} + β > 1.    (7)
Theorem 2 clearly characterizes how the partially observed labels affect the boundary for exact recovery. A phase transition for exact recovery occurs at 1, since exact recovery is possible if η_{m,a,b}(0) + β > 1 and impossible if η_{m,a,b}(0) + β < 1. When β = 0, Theorem 2 recovers Theorems 4 and 5 in [6]. If β > 0, the region (6) where exact recovery is impossible is smaller than that in [6]; the side information of partially known labels makes exact recovery easier if and only if β > 0. When m = 2, Theorem 2 reduces to Theorems 1 and 2 in [23]. For a fixed β, the region of exact recovery for m ≥ 3 is smaller than that for m = 2. These findings can be verified in Figure 2, where we visualize the regions characterized by (6) and (7) for m = 2, 3 and β = 0, 0.4, 0.8. In Figure 2, the red (green) region is where exact recovery is impossible (possible). Finally, we point out that the time complexity of our estimator achieving exact recovery is O(n^m). Again, our focus in this paper is to derive the sharp boundary of exact recovery, as in [23,27], not to propose algorithms with optimal performance; hence, our estimator may or may not outperform existing algorithms.
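The Theorem 2 condition is simple enough to check directly; a small helper (illustrative only) is:

```python
import math

def partial_label_recovery_possible(m, a, b, beta):
    """Check the sufficient condition (7): (sqrt(a)-sqrt(b))**2 / (2**(m-1)*(m-1)!) + beta > 1."""
    K = 2 ** (m - 1) * math.factorial(m - 1)
    return (math.sqrt(a) - math.sqrt(b)) ** 2 / K + beta > 1

# e.g. partial_label_recovery_possible(3, 9.0, 1.0, 0.0) is False (signal 4/8 = 0.5 < 1),
# while partial_label_recovery_possible(3, 9.0, 1.0, 0.6) is True (0.5 + 0.6 > 1).
```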

3. Proof of Main Result

In this section, we provide detailed proofs of Theorems 1 and 2.
To start with, we derive the explicit expression of the likelihood function. The likelihood function of hypergraph A given the node label σ is:
P(A | σ) = \prod_{1 \le i_1 < \cdots < i_m \le n} \big[ p^{A_{i_1,\ldots,i_m}} (1-p)^{1-A_{i_1,\ldots,i_m}} \big]^{I[σ_{i_1} = \cdots = σ_{i_m}]} \big[ q^{A_{i_1,\ldots,i_m}} (1-q)^{1-A_{i_1,\ldots,i_m}} \big]^{1 - I[σ_{i_1} = \cdots = σ_{i_m}]}
  = \prod_{1 \le i_1 < \cdots < i_m \le n} \Big( \frac{p(1-q)}{q(1-p)} \Big)^{A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m}]} \Big( \frac{q}{1-q} \Big)^{A_{i_1,\ldots,i_m}} \Big( \frac{1-p}{1-q} \Big)^{I[σ_{i_1} = \cdots = σ_{i_m}]} (1-q).
Then, the log-likelihood function can be written as follows:
\log P(A | σ) = \log\frac{p(1-q)}{q(1-p)} \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m}] + \log\frac{q}{1-q} \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m} + \log\frac{1-p}{1-q} \sum_{1 \le i_1 < \cdots < i_m \le n} I[σ_{i_1} = \cdots = σ_{i_m}] + \log(1-q) \sum_{1 \le i_1 < \cdots < i_m \le n} 1 := \mathrm{I} + \mathrm{II},
where
\mathrm{I} = \log\frac{p(1-q)}{q(1-p)} \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m}] = \Big( \log\frac{p}{q} + \log\frac{1-q}{1-p} \Big) \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m}] := (C_{a,b} + o(1))(e_+ + e_-) \approx C_{a,b}(e_+ + e_-),
with C_{a,b} = \log(a) - \log(b),   e_+ = \sum_{i_1 < \cdots < i_m} A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m} = +1],   e_- = \sum_{i_1 < \cdots < i_m} A_{i_1,\ldots,i_m} I[σ_{i_1} = \cdots = σ_{i_m} = -1],
and
\mathrm{II} = \log\frac{q}{1-q} \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m} + \log\frac{1-p}{1-q} \sum_{1 \le i_1 < \cdots < i_m \le n} I[σ_{i_1} = \cdots = σ_{i_m}] + \log(1-q) \sum_{1 \le i_1 < \cdots < i_m \le n} 1 = e_{[n]} \log\frac{q}{1-q} + 2\binom{n/2}{m} \log\frac{1-p}{1-q} + \binom{n}{m} \log(1-q),   where   e_{[n]} = \sum_{1 \le i_1 < \cdots < i_m \le n} A_{i_1,\ldots,i_m}.
Note that II is independent of σ .
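Since II does not depend on σ and C_{a,b} > 0 for a > b, maximizing the likelihood over balanced label vectors amounts to maximizing e_+ + e_−. A short sketch that counts these two quantities from the sparse adjacency dictionary used in the sampler above (our own representation, not the paper's notation):

```python
def within_community_edges(A, sigma):
    """Count (e_plus, e_minus): hyperedges whose m nodes all carry label +1 / all carry -1."""
    e_plus = sum(v for e, v in A.items() if v and all(sigma[i] == 1 for i in e))
    e_minus = sum(v for e, v in A.items() if v and all(sigma[i] == -1 for i in e))
    return e_plus, e_minus

# The (balanced) maximum likelihood labeling maximizes e_plus + e_minus.
```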

Proof of Theorem 1

The likelihood function of the vector of noisy labels Y given the node labels σ is the following:

P(Y | σ) = \prod_{i=1}^{n} (1-α_n)^{I[y_i σ_i = +1]} α_n^{I[y_i σ_i = -1]} = \prod_{i=1}^{n} (1-α_n)^{I[y_i σ_i = +1]} α_n^{1 - I[y_i σ_i = +1]}.

Then, the log-likelihood function can be written as follows:

\log P(Y | σ) = \log\frac{1-α_n}{α_n} \sum_{i=1}^{n} I[y_i σ_i = +1] + \log(α_n) \sum_{i=1}^{n} 1 := \mathrm{I}_s + \mathrm{II}_s,

where

\mathrm{I}_s = \log\frac{1-α_n}{α_n} \sum_{i=1}^{n} I[y_i σ_i = +1] = C_{α_n}(s_+ + s_-),   C_{α_n} = \log\frac{1-α_n}{α_n},   s_+ = \sum_{i=1}^{n} I[y_i = +1, σ_i = +1],   s_- = \sum_{i=1}^{n} I[y_i = -1, σ_i = -1],

and II_s = n log(α_n).
Noting that A and Y are independent given σ, the joint log-likelihood of (A, Y) given σ is the following:

\log P(A, Y | σ) = \log P(A | σ) + \log P(Y | σ) = \mathrm{I} + \mathrm{II} + \mathrm{I}_s + \mathrm{II}_s \approx C_{a,b}(e_+ + e_-) + C_{α_n}(s_+ + s_-) + \mathrm{II} + \mathrm{II}_s,    (8)

where C_{a,b} = log(a) − log(b), C_{α_n} = log(1 − α_n) − log(α_n), and II + II_s collects the terms that are independent of σ.
Denote by e_{S_1,S_2} the number of hyperedges between two sets of nodes S_1 and S_2. Then, define the following events:

F = {maximum likelihood fails},
F_{i,+} = {i ∈ I_+ : e_{i,I_-} ≥ e_{i,I_+} + \tfrac{C_{α_n}}{C_{a,b}} y_i + 1},
F_{i,-} = {i ∈ I_- : e_{i,I_+} ≥ e_{i,I_-} − \tfrac{C_{α_n}}{C_{a,b}} y_i + 1},
F_+ = \bigcup_{i ∈ I_+} F_{i,+},   F_- = \bigcup_{i ∈ I_-} F_{i,-}.
Lemma 1.
P(F) = 1 − o(1) if the events F_+ and F_− both occur.
Proof. 
Take i_0 such that F_{i_0,+} occurs and j_0 such that F_{j_0,−} occurs. Then, define the following:

Ĩ_+ = (I_+ \ {i_0}) ∪ {j_0},   Ĩ_− = (I_− \ {j_0}) ∪ {i_0}.

Denote Λ = P(A, Y | σ̃)/P(A, Y | σ), where σ̃ is the label vector corresponding to (Ĩ_+, Ĩ_−); we need to show that P(Λ > 1) = 1 − o(1).
By (8), we have the following:
log ( Λ ) C a , b ( e I ˜ + + e I ˜ ) ( e + + e ) + C α n ( s I ˜ + + s I ˜ ) ( s + + s ) .
It is clear that
e I ˜ + e + = e i 0 , I + + e j 0 , I + e j 0 , I + \ { i 0 } , e I ˜ e = e j 0 , I + e i 0 , I e i 0 , I \ { j 0 } , s I ˜ + s + = I [ y i 0 = + 1 ] + I [ y j 0 = + 1 ] , s I ˜ s = I [ y j 0 = 1 ] + I [ y i 0 = 1 ] .
Plugging into (10) yields
log ( Λ ) C a , b ( e j 0 , I + e j 0 , I ) ( e i 0 , I + e i 0 , I ) e j 0 , I + \ { i 0 } e i 0 , I \ { j 0 } + C α n { y j 0 y i 0 } = C a , b ( e j 0 , I + e j 0 , I ) + C α n y j 0 + C a , b ( e i 0 , I e i 0 , I + ) C α n y i 0 C a , b e i 0 , I \ { j 0 } C a , b e j 0 , I + \ { i 0 } 2 C a , b ( 1 e i 0 , I \ { j 0 } ) .
In the last inequality, we assumed that e i 0 , I \ { j 0 } e j 0 , I + \ { i 0 } without loss of generality.
Next, we will show that E ( e i 0 , I \ { j 0 } ) = o ( 1 ) . Rewrite e i 0 , I \ { j 0 } as:
e i 0 , I \ { j 0 } = 1 i 3 < < i m n 2 1 { i 3 , , i m } I \ { j 0 } A i 0 j 0 i 3 i m = k = 1 n 2 1 m 2 Z k ,
where Z k i . i . d . Z = B e r n ( q ) and q = b log ( n ) n m 1 . Then,
E ( e i 0 , I \ { j 0 } ) = n 2 1 m 2 b log ( n ) n m 1 = log ( n ) n O ( 1 ) = o ( 1 ) .
Applying Markov’s inequality, we have
P ( Λ > 1 ) = P ( log ( Λ ) > 0 ) P ( 2 C a , b ( 1 e i 0 , I \ { j 0 } ) > 0 ) = P ( e i 0 , I \ { j 0 } < 1 ) = 1 P ( e i 0 , I \ { j 0 } 1 ) 1 E ( e i 0 , I \ { j 0 } ) = 1 o ( 1 ) ,
which completes the proof.    □
A. Proof of the impossible part of Theorem 1
Let H be a fixed subset of I_+ of size |H| = n/log^τ(n) with a constant τ > 1/(m − 1), and let θ_n = log(n)/log log(n). For any i ∈ H, define the following events:

Δ_{i,H} = {e_{i,H} ≥ θ_n},   F_{i,H} = {e_{i,I_-} ≥ e_{i,I_+ \setminus H} + θ_n + \tfrac{C_{α_n}}{C_{a,b}} y_i + 1},   Δ_H = \bigcup_{i ∈ H} Δ_{i,H},   F_H = \bigcup_{i ∈ H} F_{i,H}.
Lemma 2.
P ( Δ H ) = o ( 1 ) .
Proof. 
Let W_k be i.i.d. Bern(p). By definition, for any i ∈ H,

e_{i,H} = \sum_{\substack{i_2 < \cdots < i_m \\ \{i_2,\ldots,i_m\} \subseteq H \setminus \{i\}}} A_{i i_2 \cdots i_m} = \sum_{k=1}^{\binom{|H|-1}{m-1}} W_k,   E(e_{i,H}) = \binom{|H|-1}{m-1} \frac{a \log(n)}{n^{m-1}}.

Then, the multiplicative Chernoff bound (see (iii) in Lemma A1) gives the following:

P(Δ_{i,H}) = P(e_{i,H} ≥ θ_n) = P\Big( \sum_{k=1}^{\binom{|H|-1}{m-1}} W_k ≥ θ_n \Big) \le \Big( \frac{e\, a \binom{|H|-1}{m-1} \log\log(n)}{n^{m-1}} \Big)^{θ_n}.

By the union bound, we have the following:

P(Δ_H) \le |H| \Big( \frac{e\, a \binom{|H|-1}{m-1} \log\log(n)}{n^{m-1}} \Big)^{θ_n} = n^{1 - (m-1)τ + o(1)} = o(1),

since τ > 1/(m − 1) by the assumption on |H|.    □
Lemma 3.
P(F_{i,H}) ≥ \frac{1}{|H|}\log\frac{1}{δ} under the condition in (4), for any δ ∈ (0, 1).
Proof. 
See Appendix A.1 and Appendix A.2.    □
Lemma 4.
P ( F H ) = 1 o ( 1 ) under the condition in (4).
Proof. 
Under the condition in (4), Lemma 3 holds; that is,

P(F_{i,H}) > \frac{1}{|H|}\log\frac{1}{δ}    (11)

for any δ ∈ (0, 1) and all sufficiently large n. Since the events F_{i,H}, i ∈ H, are independent and identically distributed, we have the following:

P(F_H) = P\Big( \bigcup_{i ∈ H} F_{i,H} \Big) = 1 - P\Big( \bigcap_{i ∈ H} F_{i,H}^c \Big) = 1 - (1 - P(F_{i,H}))^{|H|} \ge 1 - e^{-|H| P(F_{i,H})} > 1 - δ.

Here, we used (11) in the last inequality.    □
Lemma 5.
P ( F ) = 1 o ( 1 ) under the condition in (4).
Proof. 
By Lemmas 2 and 4, there exists a sufficiently small δ > 0 such that P(Δ_H^c) ≥ 1 − δ and P(F_H) ≥ 1 − δ simultaneously hold. Clearly, Δ_H^c ∩ F_H ⊆ F_+. Hence,

P(F_+) ≥ P(Δ_H^c) + P(F_H) − 1 ≥ 1 − 2δ.

By symmetry, we have

P(F_−) ≥ 1 − 2δ.

Then, it follows from Lemma 1 that

P(F) ≥ P(F_+) + P(F_−) − 1 ≥ 1 − 4δ,

which completes the proof.    □
The impossible part of Theorem 1 follows directly from Lemma 5.
B. Proof of the possible part of Theorem 1
Let ℋ_1 be an independently generated random m-uniform hypergraph on the same node set V = [n], in which each hyperedge is included with probability d_n/log(n). Denote its complement by ℋ_2 = ℋ_1^c and set

H_1 = H ∩ ℋ_1,   H_2 = H ∩ ℋ_2.

A weak recovery algorithm [12] is applied to H_1 to return a partition into two communities Ĩ_+ and Ĩ_−, which agree with the true communities I_+ and I_− on at least (1 − δ)n nodes. Here, δ = δ(d_n) depends on d_n in such a way that δ → 0 as d_n → ∞; d_n can be taken as O(log log(n)) [12]. In the next step, we use H_2 to decide whether or not to flip a node's membership. More specifically, for a node i ∈ Ĩ_+, if it has at least as many hyperedges in H_2 going to Ĩ_− as it has going to Ĩ_+ plus the scaled side information (C_{α_n}/C_{a,b}) y_i, then we reset i ∈ Ĩ_−. Similarly, for i ∈ Ĩ_−, if it has at least as many hyperedges in H_2 going to Ĩ_+ as it has going to Ĩ_− minus the scaled side information (C_{α_n}/C_{a,b}) y_i, then we reset i ∈ Ĩ_+. If the number of flips in the two communities is not the same, the changes are discarded. This procedure is summarized in Algorithm 1.
Algorithm 1: Algorithm for exact recovery of the community structure in hypergraphs with label information.
1. Input: hypergraph H and label information y;
2. Partition H = H_1 ∪ H_2, where H_1 = H ∩ ℋ_1, H_2 = H ∩ ℋ_1^c, and ℋ_1 is an
    Erdős–Rényi m-uniform hypergraph generated with hyperedge probability d_n/log(n);
3. Apply the weak recovery algorithm [12] to H_1 to return a partition I_{+,0} ∪ I_{−,0};
4. Initialize Ĩ_+ ← I_{+,0} and Ĩ_− ← I_{−,0};
5. Flip the membership of node i if
      i ∈ Ĩ_+ and e_{i,Ĩ_−} ≥ e_{i,Ĩ_+} + (C_{α_n}/C_{a,b}) y_i in H_2, or
      i ∈ Ĩ_− and e_{i,Ĩ_+} ≥ e_{i,Ĩ_−} − (C_{α_n}/C_{a,b}) y_i in H_2;
6. If |Ĩ_+| ≠ |I_{+,0}|, then keep I_{+,0} and I_{−,0} unchanged.
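A minimal Python sketch of the refinement step (step 5) follows; it is our own illustrative code, it interprets e_{i,Ĩ_±} as the number of H_2 hyperedges containing i whose other m − 1 nodes all lie in Ĩ_±, and it omits the safeguard of step 6:

```python
import numpy as np

def flip_step(A2, sigma_hat, y, C_alpha, C_ab):
    """One pass of step 5 of Algorithm 1 (illustrative sketch).

    A2: sparse dict of the hyperedges retained in H_2 (tuple -> 0/1);
    sigma_hat: +1/-1 labels from the weak-recovery step (numpy array);
    y: noisy labels; C_alpha = log((1-alpha_n)/alpha_n); C_ab = log(a) - log(b).
    """
    n = len(sigma_hat)
    e_same = np.zeros(n)    # H_2 hyperedges at i whose other nodes share i's current label
    e_other = np.zeros(n)   # H_2 hyperedges at i whose other nodes all carry the opposite label
    for e, v in A2.items():
        if not v:
            continue
        for i in e:
            rest = [j for j in e if j != i]
            if all(sigma_hat[j] == sigma_hat[i] for j in rest):
                e_same[i] += 1
            elif all(sigma_hat[j] == -sigma_hat[i] for j in rest):
                e_other[i] += 1
    new = sigma_hat.copy()
    for i in range(n):
        side = (C_alpha / C_ab) * y[i] * sigma_hat[i]   # signed, scaled side information
        if e_other[i] >= e_same[i] + side:
            new[i] = -sigma_hat[i]
    return new
```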
To show (5), the possible part of Theorem 1, we first introduce the following definitions. A node i ∈ [n] is mis-classified if and only if it belongs to

{i ∈ Ĩ_+ : σ_i = −1} ∪ {i ∈ Ĩ_− : σ_i = +1}.

Without loss of generality, assume that i ∈ Ĩ_+. Then, the mis-classification probability of node i is given by the following:
M i = P e i , I ˜ e i , I ˜ + + C α n C a , b y i = P k = 1 n 2 m 1 n 2 δ m 1 Z k + k = 1 n 2 δ m 1 W k k = 1 n 2 m 1 n 2 δ m 1 W k + k = 1 n 2 δ m 1 Z k + C α n C a , b y i .
In the last equation, we assumed ℋ_2 to be a complete hypergraph. Then, the probability that a mis-classified label exists satisfies

M ≤ n·M_i,

by the union bound over all nodes.
Consider a node i H 1 . Its degree is given by the following:
deg ( i ) = 1 i 2 < < i m n 1 { i 2 , , i m } [ n ] \ { i } A i i 2 i m .
Lemma 6.
With probability 1 − o(1), \max_{i} \deg(i) \le 2\binom{n-1}{m-1}\frac{d_n}{\log(n)}, where deg(i) denotes the degree of node i in ℋ_1.
Proof. 
Note that deg(i) ~ Binom\big(\binom{n-1}{m-1}, \frac{d_n}{\log(n)}\big). Then, the multiplicative Chernoff bound (see (iii) in Lemma A1) gives the following:

P(\deg(i) \ge 2μ) \le (e/4)^{μ} < e^{-μ/4},

where μ = \binom{n-1}{m-1}\frac{d_n}{\log(n)}. By the union bound, we have the following:

P\Big( \max_{i} \deg(i) \ge 2μ \Big) \le n\, P(\deg(i) \ge 2μ) < n e^{-μ/4} = o(1).    □
The lemma above implies that

\min_{i} \deg_{ℋ_2}(i) \ge \binom{n-1}{m-1}\Big(1 - \frac{2 d_n}{\log(n)}\Big)

with probability 1 − o(1). Therefore, taking the incompleteness of ℋ_2 into account, we loosen the upper bound (12) by removing 2\binom{n-1}{m-1}\frac{d_n}{\log(n)} terms from both of the summations on the right-hand side of (12). That is,
M i = P k = 1 n 2 m 1 n 2 δ m 1 Z k + k = 1 n 2 δ m 1 W k k = 1 n 2 m 1 n 2 δ m 1 2 n 1 m 1 d n log ( n ) W k + k = 1 n 2 δ m 1 2 n 1 m 1 d n log ( n ) Z k + C α n C a , b y i .
Lemma 7.
M ≤ n^{1 − η_{m,a,b}(β) + o(1)} if β < C_{m,a,b}(a − b), and M ≤ n^{1 − β + o(1)} if β > C_{m,a,b}(a − b).
Proof. 
See Appendix A.3. □
The possible part of Theorem 1 follows directly from Lemma 7.

4. Conclusions

In this paper, we studied the effect of label information on the exact recovery of communities in uniform hypergraphs from an information-theoretic point of view. Specifically, we considered two types of label information: a noisy label observed independently for each node, matching the true label with probability 1 − α_n, and the true label of each node observed independently with probability 1 − α_n. We used the maximum likelihood method to derive a lower bound for exact recovery and then constructed an estimator that exactly recovers the communities above the lower bound. In this way, we obtained sharp boundaries for exact recovery under both scenarios. We found that the label information improves the sharp detection boundary if and only if α_n converges to zero at the rate n^{−β} for some positive constant β.
There are several possible future research directions: (I) The sharp recovery boundary for general HSBM with label information is still unknown. Characterizing the boundary in this case is an important problem. (II) In this paper, we focused on the label information. It is important to consider other side information, such as the covariates observed for each node.

Author Contributions

Methodology: X.Z., W.Z. and M.Y.; writing—original draft preparation: W.Z., M.Y.; writing—review and editing: W.Z. and X.Z.; supervision: M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors are grateful to the editor and reviewers for helpful comments that significantly improved this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Chernoff Bound

For a random variable X, denote its cumulant generating function (cgf) by ψ X ( t ) = log ( E ( e t X ) ) . Define
ϕ X , D ( t ) = t D ψ X ( t ) , ϕ X , D ( t ) = t D ψ X ( t ) ,
for any fixed D R , where t ranges over R or R + .
Lemma A1.
Assume X i i . i . d . X with cgf ψ X ( t ) , i = 1 , , n .
(i) 
(Lemma 15 in [23]) For any D , ϵ R ,
P 1 n i = 1 n X i D ϵ e n ϕ X , D ( t max ) + | t max | ϵ 1 σ X ^ 2 n ϵ 2
where t max = arg sup t R ( ϕ X , D ( t ) ) , X ^ is a random variable with the same alphabet as X but distributed according to e t max x P ( x ) E X ( e t max X ) , and μ X ^ , σ X ^ 2 are the mean and variance of X ^ , respectively.
(ii) 
(Generic Chernoff bound ) For any D R ,
P 1 n i = 1 n X i D e n ϕ X , D ( t max )
where t max = arg sup t > 0 ( ϕ X , D ( t ) ) , and
P 1 n i = 1 n X i D e n ϕ X , D ( t max )
where t max = arg sup t > 0 ( ϕ X , D ( t ) ) .
(iii) 
(Multiplicative Chernoff bound) For any t > 1 and independent Bernoulli random variables X_i,

P\Big( \sum_{i=1}^{n} X_i \ge tμ \Big) \le \Big(\frac{e}{t}\Big)^{tμ} e^{-μ},

where μ = E\big( \sum_{i=1}^{n} X_i \big).
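As a quick sanity check on this form of the bound, the following Monte Carlo sketch (our own illustrative code) compares the empirical tail probability of a Bernoulli sum with the stated bound:

```python
import numpy as np

def check_multiplicative_chernoff(n=2000, p=0.01, t=2.0, trials=20000, seed=0):
    """Empirically verify P(sum X_i >= t*mu) <= (e/t)**(t*mu) * exp(-mu)
    for i.i.d. Bernoulli(p) variables, with mu = n*p the mean of the sum."""
    rng = np.random.default_rng(seed)
    mu = n * p
    sums = rng.binomial(n, p, size=trials)          # each draw is a Bernoulli sum
    empirical = np.mean(sums >= t * mu)
    bound = (np.e / t) ** (t * mu) * np.exp(-mu)
    return empirical, bound

# empirical, bound = check_multiplicative_chernoff()
# Up to Monte Carlo noise, the empirical tail probability should not exceed the bound.
```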
For later use, we consider X = C_{a,b}(Z − W), where Z ~ Bern(q) and W ~ Bern(p), with p = a log(n)/n^{m−1}, q = b log(n)/n^{m−1}, C_{a,b} = log(a) − log(b), and a > b > 0. Define
η_{m,a,b}(Δ) = \frac{1}{2^{m-1}(m-1)!}\Big[a + b - \frac{γ_{m,a,b}(Δ)}{C_{m,a,b}} + \frac{Δ}{2C_{m,a,b}}\log\frac{γ_{m,a,b}(Δ)+Δ}{γ_{m,a,b}(Δ)-Δ}\Big] + \frac{Δ}{2},   γ_{m,a,b}(Δ) = \sqrt{Δ^2 + 4ab\,C_{m,a,b}^2},    (A2)

where C_{m,a,b} = \frac{\log(a) - \log(b)}{2^{m-1}(m-1)!} and Δ ∈ ℝ is a constant. In the special case Δ = 0, we have

η_{m,a,b}(0) = \frac{(\sqrt{a} - \sqrt{b})^2}{2^{m-1}(m-1)!}.
Lemma A2.
(i) (Lower bound) Assume D_{m,n} = \frac{\log(n)}{l_{m,n}}(Δ + o(1)) with a constant Δ ∈ ℝ, and ε_{m,n} = \frac{\log(n)}{l_{m,n}}\,o(1); then

P\Big( \frac{1}{l_{m,n}} \sum_{i=1}^{l_{m,n}} X_i \ge D_{m,n} + ε_{m,n} \Big) \ge n^{-η_{m,a,b}(Δ) + o(1)},    (A3)

where l_{m,n} = \binom{n/2}{m-1} and η_{m,a,b}(Δ) is given in (A2).
(ii) (Upper bound) Assume D_{m,n} = \frac{\log(n)}{l_{m,n}}(Δ + o(1)) with Δ ∈ ℝ; then

P\Big( \frac{1}{l_{m,n}} \sum_{i=1}^{l_{m,n}} X_i \ge D_{m,n} \Big) \le n^{-η_{m,a,b}(Δ) + o(1)},    (A4)

when Δ > −C_{m,a,b}(a − b), and

P\Big( \frac{1}{l_{m,n}} \sum_{i=1}^{l_{m,n}} X_i \le D_{m,n} \Big) \le n^{-η_{m,a,b}(Δ) + o(1)},    (A5)

when Δ < −C_{m,a,b}(a − b). Here, l_{m,n} and η_{m,a,b}(Δ) are defined as in (i).
Proof. 
We first calculate and approximate ψ_X(t) and ϕ_{X,D}(t). Denote s = (a/b)^t = e^{t C_{a,b}}. Then, direct calculation gives the following:

ψ_X(t) = \log E\big(e^{t C_{a,b}(Z - W)}\big) = \log E\big(s^{Z - W}\big) = \log(1 - q(1-s)) + \log(1 - p(1 - s^{-1})).
Define
ψ ˜ X ( t ) : = log ( n ) n m 1 a + b b s a s 1 = ψ X ( t ) + log ( n ) n m 1 o ( 1 ) .
Taking D = D m , n = log ( n ) l m , n ( Δ + o ( 1 ) ) in the first equation of (A1), we have
ϕ ˜ X , D ( t ) = log ( n ) n m 1 Δ log ( s ) C m , a , b + a + b b s a s 1 + o ( 1 ) ,
where l m , n = n 2 m 1 .
Taking the first derivative of ϕ ˜ X , D ( t ) , w.r.t. t yields
ϕ ˜ X , D ( t ) = log ( n ) l m , n Δ b C m , a , b s + a C m , a , b s 1 .
Set ϕ ˜ X , D ( t ) = 0 , and solve
0 = b C m , a , b s 2 Δ s a C m , a , b ,
we have
s * = γ m , a , b ( Δ ) + Δ 2 b C m , a , b ,
where
γ m , a , b ( Δ ) = Δ 2 + 4 a b C m , a , b 2 .
Noting that
log ( s * ) = log γ m , a , b ( Δ ) + Δ 2 b C m , a , b = 1 2 log γ m , a , b ( Δ ) + Δ γ m , a , b ( Δ ) Δ + C m , a , b , b s * + a s * 1 = γ m , a , b ( Δ ) C m , a , b .
Plugging in (A6) yields
e l m , n ϕ ˜ X ( t * ) = n l m , n n m 1 a + b γ m , a , b ( Δ ) C m , a , b + Δ 2 C m , a , b log γ m , a , b ( Δ ) + Δ γ m , a , b ( Δ ) Δ + Δ 2 + o ( 1 ) = n η m , a , b ( Δ ) + o ( 1 ) .
where η m , a , b ( Δ ) and γ m , a , b ( Δ ) are given in (A2).
(1) For part ( i ) , note that s max = s * is the global maximum of ϕ ˜ X , D ( t ) on R , since ϕ ˜ X , D ( t ) < 0 . Therefore, the first part is completed by applying the Chernoff bound (see ( i ) in Lemma A1).
(2) To show (A4) in ( i i ) , we get from t > 0 that
s max = max { s * , 1 } = 1 , if Δ < C m , a , b ( a b ) , s * , if Δ > C m , a , b ( a b ) ,
which is the global maximum of ϕ ˜ X , D ( t ) on ( 0 , ) , since ϕ ˜ X , D ( t ) < 0 .
If Δ < C m , a , b ( a b ) , then s max = 1 leads to a trivial bound since in this case
e l m , n ϕ X , D ( t max = 0 ) = 1 .
If Δ > C m , a , b ( a b ) , then s max = s * and (A6) hold. This completes the proof of (A4) by applying the Chernoff bound (see ( i i ) in Lemma A1).
Now, we are left to show (A5) in part ( i i ) . Consider the random variable X , we have
ψ X ( t ) = log ( 1 p ( 1 s ) ) + log ( 1 q ( 1 s 1 ) ) ,
Define
ψ ˜ X ( t ) : = log ( n ) n m 1 a + b a s b s 1 = ψ ˜ X ( t ) + log ( n ) n m 1 o ( 1 ) .
Taking D = D m , n = log ( n ) l m , n ( Δ + o ( 1 ) ) in the second equation of (A1), we have
ϕ ˜ X , D ( t ) = log ( n ) n m 1 Δ log ( s ) C m , a , b + a + b a s b s 1 ,
where Δ R and C m a , b is defined as above.
Taking the first derivative of ϕ ˜ X , D ( t ) , w.r.t. t yields
ϕ ˜ X , D ( t ) = log ( n ) l m , n Δ a C m , a , b s + b C m , a , b s 1
Set ϕ ˜ X , D ( t ) = 0 , and solve
0 = a C m , a , b s 2 + Δ s b C m , a , b ,
we have
s * = γ m , a , b ( Δ ) Δ 2 a C m , a , b ,
where
γ m , a , b ( Δ ) = Δ 2 + 4 a b C m , a , b 2 .
Noting that t > 0 , we have
s max = max { s * , 1 } = 1 , if Δ > C m , a , b ( a b ) , s * , if Δ < C m , a , b ( a b ) ,
which is the global maximum of ϕ ˜ X , D ( t ) on ( 0 , ) , since ϕ ˜ X , D ( t ) < 0 .
If Δ > C m , a , b ( a b ) , then s max = 1 leads to a trivial bound since
e l m , n ϕ X , D ( t max = 0 ) = 1 .
If Δ < C m , a , b ( a b ) , then s max = s * . Noting that
log ( s * ) = log γ m , a , b ( Δ ) Δ 2 a C m , a , b = 1 2 log γ m , a , b ( Δ ) + Δ γ m , a , b ( Δ ) Δ + C a , b , a s * + b s * 1 = γ m , a , b ( Δ ) C m , a , b .
Then
e l m , n ϕ ˜ X ( t * ) = n l m , n n m 1 a + b γ m , a , b ( Δ ) C m , a , b + Δ 2 C m , a , b log γ m , a , b ( Δ ) + Δ γ m , a , b ( Δ ) Δ + Δ 2 + o ( 1 ) = n η m , a , b ( Δ ) + o ( 1 ) .
This completes the proof of (A5) by applying the Chernoff bound (see ( i i ) in Lemma A1).

Appendix A.2. Proof of Lemma 3

Denote l_{m,n} = \binom{n/2}{m-1} and recall that θ_n = log(n)/log log(n). Then
P ( F i , H ) = P k = 1 | I | m 1 Z k k = 1 | I + | | H | m 1 W k + C α n C a , b y i + 1 + θ n P k = 1 l m , n ( Z k W k ) C α n C a , b y i + 1 + θ n = P k = 1 l m , n X k C α n y i + C a , b + C a , b θ n : = P 1 l m , n k = 1 l m , n X k D ϵ m , n , y i ϵ m , n ,
where
D ϵ m , n , y i = C a , b l m , n C α n C a , b y i + 1 + θ n + ϵ m , n , ϵ m , n = log ( n ) n m 1 o ( 1 ) .
It follows that Δ = β y i , since
Δ β , y i = l m , n log ( n ) D ϵ m , n , y i = β y i + o ( 1 ) .
Here, we used C α n = ( β + o ( 1 ) ) log ( n ) . Therefore,
  • If y_i = +1, then η_{m,a,b}(Δ) = η_{m,a,b}(β) as in (1);
  • If y_i = −1, then η_{m,a,b}(Δ) = η_{m,a,b}(−β) = η_{m,a,b}(β) − β.
By (A3) in Lemma A2, (A6) becomes
P ( F i , H ) ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + α n n η m , a , b ( β ) + β + o ( 1 ) = ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + n η m , a , b ( β ) + o ( 1 ) = n η m , a , b ( β ) + o ( 1 )
Therefore, if η m , a , b ( β ) 1 ε for some 0 < ε < 1 , then
P ( F i , H ) n 1 ε > 1 | H | log 1 δ
holds for δ ( 0 , 1 ) and a sufficiently large n, since | H | = n log τ ( n ) for τ > 1 m 1 . This completes the first part under condition (4), the impossible part of Theorem 1.
For the second part in (4), recall that β > C m , a , b ( a b ) . Then
P ( F i , H ) = P k = 1 | I | m 1 Z k k = 1 | I + | | H | m 1 W k + C α n C a , b y i + 1 + θ n P k = 1 l m , n ( Z k W k ) C α n C a , b y i + 1 + θ n = 1 P k = 1 l m , n X k C α n y i + C a , b + C a , b θ n : = 1 P 1 l m , n k = 1 l m , n X k D ϵ m , n , y i ,
where
D ϵ m , n , y i = C a , b l m , n C α n C a , b y i + 1 + θ n .
It follows that Δ = β y i , since
Δ β , y i = l m , n log ( n ) D ϵ m , n , y i = β y i + o ( 1 ) .
Here we used C α n = ( β + o ( 1 ) ) log ( n ) . We still have
  • If y_i = +1, then η_{m,a,b}(Δ) = η_{m,a,b}(β); in this case the bound (A5) is trivial, so the corresponding probability is upper-bounded by 1;
  • If y_i = −1, then η_{m,a,b}(Δ) = η_{m,a,b}(β) − β; since β > C_{m,a,b}(a − b) by assumption, (A5) applies.
Then (A7) becomes
P ( F i , H ) 1 ( 1 α n ) α n n η m , a , b ( β ) + β + o ( 1 ) = α n n η m , a , b ( β ) + o ( 1 ) = n β n η m , a , b ( β ) + o ( 1 )
Therefore, if β 1 ε 1 and η m , a , b ( β ) 1 + ε 2 for two sufficiently small positive constants ε 1 and ε 2 , then
P ( F i , H ) n 1 ( n ε 1 n ε 2 ) > 1 | H | log 1 δ
holds for δ ( 0 , 1 ) and a sufficiently large n, since | H | = n log τ ( n ) for τ > 1 m 1 . This completes the second part under condition (4), the impossible part of Theorem 1.

Appendix A.3. Proof of Lemma 7

M i = P k = 1 n 2 m 1 n 2 δ m 1 Z k + k = 1 n 2 δ m 1 W k k = 1 n 2 m 1 n 2 δ m 1 2 n 1 m 1 d n log ( n ) W k + k = 1 n 2 δ m 1 2 n 1 m 1 d n log ( n ) Z k + C α n C a , b y i P k = 1 n 2 m 1 Z k + k = 1 n 2 δ m 1 W k k = 1 n 2 m 1 n 2 δ m 1 2 n 1 m 1 d n log ( n ) W k C α n C a , b y i P k = 1 n 2 m 1 Z k k = 1 n 2 m 1 W k + k = 1 2 n 2 δ m 1 + 2 n 1 m 1 d n log ( n ) W k C α n C a , b y i
Defining λ = 1 log ( δ ) . Then
M i P k = 1 n 2 m 1 ( Z k W k ) C α n C a , b y i λ log ( n ) + P k = 1 2 n 2 δ m 1 + 2 n 1 m 1 d n log ( n ) W k λ log ( n ) : = I + II .
For II , the multiplicative Chernoff bound (see ( i i i ) in Lemma A1) gives
II λ log ( n ) e n m 1 a log ( n ) 1 2 n 2 δ m 1 + 2 n 1 m 1 d n log ( n ) λ log ( n ) = λ a e n m 1 2 n 2 δ m 1 + 2 n 1 m 1 d n log ( n ) λ log ( n ) = n λ log λ a e 1 2 1 n m 1 n 2 δ m 1 + 2 1 n m 1 n 1 m 1 d n log ( n ) = n λ log 1 δ m 1 1 log 1 + c m d n log ( n ) 1 δ m 1 log 1 δ m 1 + o ( 1 ) = n ( m 1 ) log ( 1 δ ) 1 log 1 + c m d n log ( n ) 1 δ m 1 log 1 δ m 1 + o ( 1 ) n ( m 1 ) ( 1 + Ω ( 1 ) ) ,
where c m depends only on m. We used δ 0 as d n , and log 1 + d n log ( n ) 1 δ m 1 < log 1 δ m 1 for a sufficiently large d n in the last inequality.
For I , we again apply the Chernoff bound (see ( i i ) in Lemma A4).
I = P k = 1 l m , n X k C α n y i C a , b λ log ( n ) = P 1 l m , n k = 1 l m , n X k D λ , y i ,
where
D λ , y i = C a , b l m , n C α n C a , b y i λ log ( n ) .
It follows that Δ = β y i , since
Δ λ , y i = l m , n log ( n ) D ϵ m , n , y i = β y i + o ( 1 ) .
Here, we used C α n = ( β + o ( 1 ) ) log ( n ) . We still have the following:
  • If y_i = +1, then η_{m,a,b}(Δ) = η_{m,a,b}(β), and (A4) applies;
  • If y_i = −1, then η_{m,a,b}(Δ) = η_{m,a,b}(β) − β. If β < C_{m,a,b}(a − b), then (A4) applies; if β > C_{m,a,b}(a − b), then we take the upper bound to be 1.
To sum up,
(i)
If C m , a , b ( a b ) > β , then (A9) becomes
I ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + α n n η m , a , b ( β ) + β + o ( 1 ) = ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + n η m , a , b ( β ) + o ( 1 ) = n η m , a , b ( β ) + o ( 1 ) .
Combining with (A8) yields
M i I + II n η m , a , b ( β ) + o ( 1 ) + n ( m 1 ) + Ω ( 1 ) .
(ii)
If C m , a , b ( a b ) < β , then (A9) becomes
I ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + α n = ( 1 α n ) n η m , a , b ( β ) + o ( 1 ) + n β + o ( 1 ) = n β + o ( 1 ) .
Here, we used the fact that η m , a , b ( β ) β for any β R (see Lemma A3). Combining with (A8) yields the following:
M i I + II n η m , a , b ( β ) + o ( 1 ) + n ( m 1 ) + Ω ( 1 )
That is,
M i n η m , a , b ( β ) + o ( 1 ) + n ( m 1 ) + Ω ( 1 ) , C m , a , b ( a b ) > β , n β + o ( 1 ) + n ( m 1 ) + Ω ( 1 ) , C m , a , b ( a b ) < β .
By union bound, the probability of failure is
M n M i n 1 η m , a , b ( β ) + o ( 1 ) , C m , a , b ( a b ) > β , n β + o ( 1 ) , C m , a , b ( a b ) < β ,
under condition (5). This completes the proof of Lemma 7.
Denote β_m = 2^{m−1}(m−1)!\,β and rewrite (1) as

η_{a,b}(β_m) = a + b - \frac{γ_{a,b}(β_m)}{C_{a,b}} + \frac{β_m}{2C_{a,b}}\log\frac{γ_{a,b}(β_m)+β_m}{γ_{a,b}(β_m)-β_m} + \frac{β_m}{2},   γ_{a,b}(β_m) = \sqrt{β_m^2 + 4ab\,C_{a,b}^2},

where C_{a,b} = \log(a) - \log(b), so that η_{a,b}(β_m) = 2^{m−1}(m−1)!\,η_{m,a,b}(β).
Lemma A3.
η_{a,b}(β_m) ≥ β_m.
Proof. 
Let
ξ a , b ( β m ) : = η a , b ( β m ) β m ,
and we will show that it is convex in β m , with a global minimum value of 0. By (1), we have
C a , b ξ a , b ( β m ) = C a , b ( η a , b ( β m ) β m ) = C a , b ( a + b ) γ a , b ( β m ) + β m 2 log γ a , b ( β m ) + β m γ a , b ( β m ) β m C a , b = C a , b ( a + b ) γ a , b ( β m ) β m log γ a , b ( β m ) β m 2 b = C a , b ( a + b ) γ a , b ( β m ) β m log ( γ a , b ( β m ) β m ) + β m log ( 2 b ) .
Here, we used the fact that
log γ a , b ( β m ) β m 2 b = 1 2 log γ a , b ( β m ) + β m γ a , b ( β m ) β m C a , b .
Taking the first two derivatives of ξ a , b ( β m ) w.r.t. β m , and using
γ a , b ( β m ) = γ a , b 1 ( β m ) β m , ( log ( γ a , b ( β m ) β m ) ) = γ a , b 1 ( β m ) ,
we have
C a , b ξ a , b ( β m ) = log γ a , b ( β m ) β m 2 b , C a , b ξ a , b ( β m ) = γ a , b 1 ( β m ) .
Thus, ξ_{a,b}(β_m) is convex with a unique critical point β_m^* = C_{a,b}(a − b). Hence, ξ_{a,b}(β_m) ≥ ξ_{a,b}(β_m^*) = 0. □
Corollary A1.
η a , b ( β m ) is increasing in β m for any β m 0 .
Proof. 
Taking the first two derivatives of η a , b ( β m ) = ξ a , b ( β m ) + β m w.r.t. β m and using (A11), we have the following:
C a , b η a , b ( β m ) = C a , b ( ξ a , b ( β m ) + 1 ) = log γ a , b ( β m ) β m 2 b + log a b = log γ a , b ( β m ) β m 2 a , C a , b ξ a , b ( β m ) = C a , b η a , b ( β m ) = γ a , b 1 ( β m ) .
Thus, η_{a,b}(β_m) is convex with a unique critical point β_m^{**} = −C_{a,b}(a − b) < 0. Hence, η_{a,b}(β_m) is increasing in β_m for any β_m ≥ 0. □
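A quick numerical check of Lemma A3 and Corollary A1, using our reconstructed form of η_{a,b} (illustrative code, one choice of a and b):

```python
import numpy as np

def eta_ab(beta_m, a, b):
    """eta_{a,b}(beta_m) as rewritten above (our reconstruction); C_{a,b} = log(a) - log(b)."""
    C = np.log(a) - np.log(b)
    gamma = np.sqrt(beta_m ** 2 + 4 * a * b * C ** 2)
    return (a + b - gamma / C
            + beta_m / (2 * C) * np.log((gamma + beta_m) / (gamma - beta_m))
            + beta_m / 2)

# Check eta_{a,b}(beta_m) >= beta_m (Lemma A3) and monotonicity on beta_m >= 0 (Corollary A1):
grid = np.linspace(0.0, 5.0, 200)
vals = eta_ab(grid, a=9.0, b=1.0)
assert np.all(vals >= grid) and np.all(np.diff(vals) > 0)
```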

References

  1. Chen, J.; Yuan, B. Detecting functional modules in the yeast protein–protein interaction network. Bioinformatics 2006, 22, 2283–2290. [Google Scholar] [CrossRef]
  2. Costa, L.F.; Oliveira, O.N., Jr.; Travieso, G.; Rodrigues, F.A.; Villas Boas, P.R.; Antiqueira, L.; Viana, M.P.; Correa Rocha, L.E. Analyzing and modeling real-world phenomena with complex networks: A survey of applications. Adv. Phys. 2011, 60, 329–412. [Google Scholar] [CrossRef] [Green Version]
  3. Fortunato, S. Community detection in graphs. Phys. Rep. 2010, 486, 75–174. [Google Scholar] [CrossRef] [Green Version]
  4. Newman, M.E.J. Coauthorship networks and patterns of scientific collaboration. Proc. Natl. Acad. Sci. USA 2004, 101, 5200–5205. [Google Scholar] [CrossRef] [Green Version]
  5. Ma’ayan, A. Introduction to Network Analysis in Systems Biology. Sci. Signal. 2011, 4, tr5. [Google Scholar] [CrossRef] [Green Version]
  6. Kim, C.; Bandeira, A.; Goemans, M. Stochastic Block Model for Hypergraphs: Statistical limits and a semidefinite programming approach. arXiv 2018, arXiv:1807.02884. [Google Scholar]
  7. Lei, J. A goodness-of-fit test for stochastic block models. Ann. Stat. 2016, 44, 401–424. [Google Scholar] [CrossRef]
  8. Yuan, M.; Nan, Y. Test dense subgraphs in sparse uniform hypergraph. Commun. Stat.-Theory Methods 2020, 1–20. [Google Scholar] [CrossRef]
  9. Abbe, E. Community Detection and Stochastic Block Models: Recent Developments. J. Mach. Learn. Res. 2018, 18, 6446–6531. [Google Scholar]
  10. Agarwal, S.; Branson, K.; Belongie, S. Higher order learning with graphs. In Proceedings of the International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 17–24. [Google Scholar]
  11. Amini, A.; Chen, A.; Bickel, P. Pseudo-likelihood methods for community detection in large sparse networks. Ann. Stat. 2013, 41, 2097–2122. [Google Scholar] [CrossRef]
  12. Ahn, K.; Lee, K.; Suh, C. Hypergraph Spectral Clustering in the Weighted Stochastic Block Model. IEEE J. Sel. Top. Signal Process. 2018, 12, 959–974. [Google Scholar] [CrossRef] [Green Version]
  13. Bickel, P.J.; Sarkar, P. Hypothesis testing for automated community detection in networks. J. R. Stat. Soc. Ser. B 2016, 78, 253–273. [Google Scholar] [CrossRef] [Green Version]
  14. Ghoshdastidar, D.; Dukkipati, A. Consistency of spectral partitioning of uniform hypergraphs under planted partition model. Adv. Neural Inf. Process. Syst. 2014, 27, 397–405. [Google Scholar]
  15. Ghoshdastidar, D.; Dukkipati, A. Consistency of spectral hypergraph partitioning under planted partition model. Ann. Stat. 2017, 45, 289–315. [Google Scholar] [CrossRef]
  16. Ke, Z.; Shi, F.; Xia, D. Community Detection for Hypergraph Networks via Regularized Tensor Power Iteration. arXiv 2020, arXiv:1909.06503. [Google Scholar]
  17. Kim, S. Higher-order correlation clustering for image segmentation. Adv. Neural Inf. Process. Syst. 2011, 24, 1530–1538. [Google Scholar]
  18. Yuan, M.; Liu, R.; Feng, Y.; Shang, Z. Testing community structures for hypergraphs. arXiv 2018, arXiv:1810.04617. [Google Scholar]
  19. Yuan, M.; Shang, Z. Sharp detection boundaries on testing dense subhypergraph. arXiv 2021, arXiv:2101.04584. [Google Scholar]
  20. Yuan, M.; Shang, Z. Heterogeneous Dense Subhypergraph Detection. arXiv 2021, arXiv:2104.04047. [Google Scholar]
  21. Yuan, M.; Shang, Z. Information Limits for Detecting a Subhypergraph. arXiv 2021, arXiv:2105.02259. [Google Scholar]
  22. Abbe, E.; Banderira, A.; Hall, G. Exact Recovery in the Stochastic Block Model. IEEE Trans. Inf. Theory 2016, 62, 471–487. [Google Scholar] [CrossRef] [Green Version]
  23. Saad, H.; Nosratinia, A. Community detection with side information: Exact recovery under the stochastic block model. IEEE J. Sel. Top. Signal Process. 2018, 12, 944–958. [Google Scholar] [CrossRef] [Green Version]
  24. Cai, T.T.; Liang, T.; Rakhlin, A. Inference via Message Passing on Partially Labeled Stochastic Block Models. arXiv 2016, arXiv:1603.06923. [Google Scholar]
  25. Kanade, V.; Mossel, E.; Schramm, T. Global and Local Information in Clustering Labeled Block Models. IEEE Trans. Inf. Theory 2016, 62, 5906–5917. [Google Scholar] [CrossRef]
  26. Kadavankandy, A.; Avrachenkov, K.; Cottatellucci, L.; Sundaresan, R. The Power of Side-Information in Subgraph Detection. IEEE Trans. Signal Process. 2018, 66, 1905–1919. [Google Scholar] [CrossRef] [Green Version]
  27. Mossel, E.; Xu, J. Local algorithms for block models with side information. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, Cambridge, MA, USA, 14–16 January 2016; pp. 71–80. [Google Scholar]
  28. Tudisco, F.; Prokopchik, K.; Benson, A. A nonlinear diffusion method for semi-supervised learning on hypergraphs. arXiv 2021, arXiv:2103.14867. [Google Scholar]
  29. Tudisco, F.; Benson, A.; Prokopchik, K. Nonlinear Higher-Order Label Spreading. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021. [Google Scholar]
  30. Whang, J.; Du, R.; Jung, S.; Lee, G.; Drake, B.; Liu, Q.; Kang, S.; Park, H. MEGA: Multi-View Semi-Supervised Clustering of Hypergraphs. Proc. VLDB Endow. 2020, 13, 698–711. [Google Scholar] [CrossRef]
Figure 1. Detection boundary with noisy labels observed for m = 2 , 3 and β = 0 , 0.4 , 0.8 . Red regions: exact recovery is impossible; green regions: exact recovery is possible.
Figure 2. Detection boundary with partially observed labels for m = 2 , 3 and β = 0 , 0.4 , 0.8 . Red regions: exact recovery is impossible; green regions: exact recovery is possible.
Table 1. Regions for the exact recovery of community structure in a hypergraph with label information.

Region where noisy labels are observed | Recovery
(i) η_{m,a,b}(β) < 1 and β < C_{m,a,b}(a − b) | Exact recovery is impossible
(ii) β < 1 and β > C_{m,a,b}(a − b) | Exact recovery is impossible
(i) η_{m,a,b}(β) > 1 and β < C_{m,a,b}(a − b) | Exact recovery is possible
(ii) β > 1 and β > C_{m,a,b}(a − b) | Exact recovery is possible

Region where true labels are partially observed | Recovery
(√a − √b)² / (2^{m−1}(m−1)!) + β < 1 | Exact recovery is impossible
(√a − √b)² / (2^{m−1}(m−1)!) + β > 1 | Exact recovery is possible
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
