Article

Weighted Graph-Based Two-Sample Test via Empirical Likelihood

by Xiaofeng Zhao 1 and Mingao Yuan 2,*
1 School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
2 Department of Statistics, North Dakota State University, Fargo, ND 58103, USA
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(17), 2745; https://doi.org/10.3390/math12172745
Submission received: 2 August 2024 / Revised: 30 August 2024 / Accepted: 2 September 2024 / Published: 4 September 2024
(This article belongs to the Special Issue Network Biology and Machine Learning in Bioinformatics)

Abstract:
In network data analysis, one of the important problems is determining if two collections of networks are drawn from the same distribution. This problem can be modeled in the framework of two-sample hypothesis testing. Several graph-based two-sample tests have been studied. However, the methods mainly focus on binary graphs, and many real-world networks are weighted. In this paper, we apply empirical likelihood to test the difference in two populations of weighted networks. We derive the limiting distribution of the test statistic under the null hypothesis. We use simulation experiments to evaluate the power of the proposed method. The results show that the proposed test has satisfactory performance. Then, we apply the proposed method to a biological dataset.
MSC:
60K35; 05C80

1. Introduction

Comparing the distributions of two samples is a fundamental problem in statistics, known as the two-sample hypothesis testing problem. Under the null hypothesis, the two distributions are equal, while under the alternative hypothesis, they differ. In practice, the two-sample test has numerous applications. In medicine, it is widely used in clinical trials [1]. In biology, researchers apply two-sample tests to detect differences in gene expression [2]. In manufacturing, two-sample tests are carried out to choose more efficient production processes and to examine product quality [3]. In the social sciences, two-sample tests are used to compare groups of people of different races, genders, ethnicities, etc. [4]. The two-sample problem has been studied extensively in the literature; classical two-sample hypothesis tests include the two-sample t test, Hotelling's T-squared test, the Wilcoxon test, and the Kolmogorov–Smirnov test.
A network is a structure that represents a group of objects and the relationships between them; in mathematics, it is known as a graph. A network consists of nodes and edges, with nodes representing objects and edges representing the relationships between those objects. Over the past decades, network data analysis has received intense attention.
In many applications, a number of graphs from several populations are available. A natural question is whether the graph samples are from the same distribution. This problem can be formulated as a graph-based two-sample test. Under the null hypothesis, the distributions are the same. Under the alternative hypothesis, the distributions are different. The graph-based two-sample test has wide applications. Ref. [5] used a graph-based two-sample test to study how different screening rules may influence the diversification benefits of portfolios in asset management and wealth management. In brain network data analysis, a graph-based two-sample test can be employed to distinguish between various brain disorders [6,7].
Graph-based two-sample testing has been widely studied. Ref. [8] considered testing whether two networks have the same distribution and proposed a consistent, minimax optimal two-sample test. Ref. [9] proposed two new tests for the small-sample setting. Ref. [10] provided sufficient conditions under which it is possible to test the difference between two populations of inhomogeneous random graphs. Refs. [11,12] studied the two-sample problem in the regime of random dot product graphs, with test statistics based on kernel functions of the spectral decomposition of the adjacency matrix. Refs. [13,14] proposed tests based on subgraph counts. Ref. [15] proposed a powerful test for weighted graphs. However, the aforementioned graph-based two-sample tests have drawbacks in the following respects: (1) most of the tests are developed for binary (unweighted) graphs; (2) the assumption of independent edges is a strong condition that makes the tests conservative in some sense.
In practice, many real-world networks are weighted and the edges are correlated. For example, in brain networks, the edges are constructed based on correlations or other association measures between two brain regions. The association measures are weights of the edges, and they may be correlated [16]. In this paper, we study a weighted graph-based two-sample test. We propose a novel graph-based two-sample test based on empirical likelihood [17]. Empirical likelihood is a nonparametric method that does not require the form of the underlying distribution of data, and it retains some of the advantages of likelihood-based inference. We derive the asymptotic distribution of the test statistic under the null hypothesis. We use a simulation to study the power of the proposed test. We apply the proposed method to real-world weighted networks.
The rest of the paper is organized as follows: Section 2 describes the model and the proposed new graph-based two-sample test. Section 3 evaluates the performance of the new test using simulations and its application to real data. The proofs are deferred to Appendix A.
Notations: Let $c_1, c_2$ be two positive constants. For two positive sequences $a_n, b_n$, denote $a_n \asymp b_n$ if $c_1 \le \frac{a_n}{b_n} \le c_2$; $a_n = O(b_n)$ if $\frac{a_n}{b_n} \le c_2$; and $a_n = o(b_n)$ if $\lim_{n\to\infty} \frac{a_n}{b_n} = 0$. For a sequence of random variables $X_n$, $X_n = O_P(a_n)$ means $\frac{X_n}{a_n}$ is bounded in probability, and $X_n = o_P(a_n)$ means $\frac{X_n}{a_n}$ converges to zero in probability.

2. Weighted Graph-Based Two-Sample Empirical Likelihood Test

A graph $G$ is defined as $G = (V, E)$, where $V$ is the set of all vertices (nodes) and $E$ is the set of all edges in the graph. The adjacency matrix $A$ of $G$ is defined as follows: $A_{ij} = 1$ if there is an edge between vertices $i$ and $j$, and $A_{ij} = 0$ otherwise. We assume the graph is undirected and has no self-loops, that is, $A_{ij} = A_{ji}$ and $A_{ii} = 0$. Then, the adjacency matrix $A$ is symmetric and its diagonal elements are zero. If $A_{ij} \sim \mathrm{Bern}(p_{ij})$, $0 \le p_{ij} \le 1$, we refer to it as an inhomogeneous random graph. Since $A_{ij} \in \{0, 1\}$, the random graph $A$ is said to be binary or unweighted.
To incorporate weights, we define the weighted random graph as follows.
Definition 1. 
Let $F(z;\theta)$ be a probability distribution with parameter θ, and let $h: [0,1]^2 \to \mathbb{R}^d$ be a symmetric function, that is, $h(x,y) = h(y,x)$. Here, $d$ is a positive integer. Let $n$ be the number of nodes and $U = \{U_1, U_2, \dots, U_n\}$ be an independent sample from the uniform distribution $\mathrm{Unif}([0,1])$. The random weighted graph $G(F,h)$ is defined as follows. Given $U$, the edges $A_{ij}$ ($1 \le i < j \le n$) are conditionally independent and follow the distribution $F(z; h(U_i, U_j))$, that is,
$$A_{ij} \sim F(z; h(U_i, U_j)).$$
Denote $A \sim G(n, F, h)$.
When $F(z;\theta)$ is the Bernoulli distribution, $G(F,h)$ is the well-known random graphon model for unweighted graphs. Further, if $h$ is a constant between zero and one, it is the Erdős–Rényi random graph. If $F(z;\theta)$ is not the Bernoulli distribution, the graph $A$ is weighted. Since the distributions of the $A_{ij}$ depend on $U$, the edges $A_{ij}$ ($1 \le i < j \le n$) are not independent. Hence, the random graph $G(F,h)$ can model both weights and correlations in weighted networks. Figure 1 provides visualizations of two weighted graphs. In Figure 1, $F(z;\theta) = 0.7\,\delta_{\{0\}} + 0.3\,F_{exp}(z;\theta)$, where $\delta_{\{0\}}$ is the Dirac measure centered at 0, and $F_{exp}(z;\theta) = 1 - e^{-\theta z}$ is the exponential distribution. The left weighted graph is constructed based on $F(z; h_1(x,y))$ with $h_1(x,y) = e^{xy}$, and the right weighted graph is constructed based on $F(z; h_2(x,y))$ with $h_2(x,y) = 2 + e^{xy}$. The number on an edge represents the weight of the edge. On average, the left graph has larger weights than the right graph.
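The sampling scheme of Definition 1 can be sketched in a few lines of code. The mixture $F(z;\theta) = p_0\,\delta_{\{0\}} + (1-p_0)\,F_{exp}(z;\theta)$ used for Figure 1 is hard-coded; the function name `sample_weighted_graph` and the mixing-weight argument `p_zero` are our own notation, not from the paper.

```python
import numpy as np

def sample_weighted_graph(n, h, rng, p_zero=0.7):
    """Sample an adjacency matrix from the weighted graphon model of
    Definition 1, with F(z; theta) = p_zero * delta_{0} + (1 - p_zero) * Exp(theta).
    `h` maps the latent pair (U_i, U_j) to the exponential rate parameter."""
    U = rng.uniform(size=n)                       # latent node variables U_1, ..., U_n
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            theta = h(U[i], U[j])                 # edge-specific parameter
            if rng.uniform() > p_zero:            # weight is 0 with probability p_zero
                A[i, j] = rng.exponential(scale=1.0 / theta)
            A[j, i] = A[i, j]                     # undirected: symmetric, zero diagonal
    return A

rng = np.random.default_rng(0)
A = sample_weighted_graph(8, lambda x, y: np.exp(x * y), rng)   # h_1 of Figure 1
```

Using $h_2(x,y) = 2 + e^{xy}$ instead would produce, on average, smaller weights, matching the right panel of Figure 1.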
Let $m_1, m_2$ be two positive integers, and let $n_1 = (n_{11}, n_{12}, \dots, n_{1m_1})$ and $n_2 = (n_{21}, n_{22}, \dots, n_{2m_2})$ be two sequences of positive integers. Given independent networks $A_1, \dots, A_{m_1}$ with $A_l \sim G(n_{1l}, F, h_1)$ ($1 \le l \le m_1$) and independent networks $B_1, \dots, B_{m_2}$ with $B_l \sim G(n_{2l}, F, h_2)$ ($1 \le l \le m_2$), we are interested in testing whether the two samples have the same distribution, that is, we test the following hypotheses
$$H_0: h_1 = h_2, \qquad H_1: h_1 \ne h_2. \tag{1}$$
Under the null hypothesis, $h_1 = h_2$ and the two graph samples are drawn from the same distribution. Under the alternative hypothesis, $h_1 \ne h_2$ and the distributions of the two samples are different.
A hypothesis testing problem similar to (1) has been studied in several papers. In [11,13,14], the distribution $F$ is assumed to be the Bernoulli distribution. Ref. [15] considered (1) under the assumption that the edges are independent and $d = 1$. In this work, we investigate (1) in a more general setting and propose a new test based on empirical likelihood.
Empirical likelihood (EL) was introduced by Owen [17,18] to construct a confidence region for the mean. It is a nonparametric method that does not require a prespecified distribution for the data. As a counterpart of the parametric likelihood method, it inherits the advantageous properties of the likelihood-based method. The empirical likelihood confidence region respects the shape of the data and usually outperforms the method based on asymptotic normality. Empirical likelihood has also been widely used in hypothesis testing.
Now, we present the empirical likelihood test for (1). For positive integers $k \ge 3$ and $l \ge 1$, define the cycle statistics
$$C_t^{(l)}(A) = \binom{n_{1t}}{k}^{-1} \sum_{i_1 < i_2 < \cdots < i_k} A_{t,i_1 i_2}^{l}\, A_{t,i_2 i_3}^{l} \cdots A_{t,i_k i_1}^{l},$$
$$C_s^{(l)}(B) = \binom{n_{2s}}{k}^{-1} \sum_{i_1 < i_2 < \cdots < i_k} B_{s,i_1 i_2}^{l}\, B_{s,i_2 i_3}^{l} \cdots B_{s,i_k i_1}^{l}.$$
In the binary graph case, $C_t^{(l)}(A)$ represents the density of $k$-cycles in graph $A_t$, and $C_t^{(l_1)}(A) = C_t^{(l_2)}(A)$ for all $l_1, l_2$. For weighted graphs, $C_t^{(l_1)}(A) \ne C_t^{(l_2)}(A)$ in general if $l_1 \ne l_2$. Let
$$X_t = \left(C_t^{(1)}(A), C_t^{(2)}(A), \dots, C_t^{(d)}(A)\right), \quad 1 \le t \le m_1,$$
$$Y_s = \left(C_s^{(1)}(B), C_s^{(2)}(B), \dots, C_s^{(d)}(B)\right), \quad 1 \le s \le m_2.$$
That is, $X_t$ and $Y_s$ are $d$-dimensional vectors with components $C_t^{(l)}(A)$ and $C_s^{(l)}(B)$, respectively. Let $(p_1, p_2, \dots, p_{m_1})$ and $(q_1, q_2, \dots, q_{m_2})$ be probability vectors, that is,
$$\sum_{i=1}^{m_1} p_i = 1, \qquad \sum_{j=1}^{m_2} q_j = 1, \qquad p_i \ge 0, \qquad q_j \ge 0.$$
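For $k = 3$, the cycle statistics above can be computed from the trace of the cube of the elementwise $l$-th power of the adjacency matrix, since for a symmetric matrix with zero diagonal each triangle is counted six times on the diagonal of $W^3$. The sketch below is ours (the helper names `cycle_stat` and `summary_vector` are not from the paper) and covers only $k = 3$; larger $k$ would need a different counting scheme.

```python
import math
import numpy as np

def cycle_stat(A, l, k=3):
    """C^{(l)}(A) for k = 3: the normalized sum over vertex triples of
    A_{ij}^l A_{jk}^l A_{ki}^l.  Since A is symmetric with zero diagonal,
    trace(W^3) with W = A**l counts each triangle exactly six times."""
    W = A ** l                                   # elementwise l-th power
    n = A.shape[0]
    total = np.trace(W @ W @ W) / 6.0            # sum over unordered triples
    return total / math.comb(n, 3)               # normalize by C(n, 3)

def summary_vector(A, d):
    """The d-dimensional vector X_t = (C^{(1)}(A), ..., C^{(d)}(A))."""
    return np.array([cycle_stat(A, l) for l in range(1, d + 1)])
```

For a binary graph, `cycle_stat(A, l)` returns the same value for every `l`, as noted in the text; for weighted graphs the components of `summary_vector` generally differ.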
Define the empirical likelihood test statistic as
$$R_{m_1,m_2} = \max\left\{\prod_{i=1}^{m_1} m_1 p_i \prod_{j=1}^{m_2} m_2 q_j \;\middle|\; \sum_{i=1}^{m_1} p_i X_i - \sum_{j=1}^{m_2} q_j Y_j = 0_d \right\}.$$
According to the Lagrange multiplier method, the maximizer is given by
$$\hat{p}_i = \frac{1}{m_1\left(1 + \lambda_1^\top (X_i - \mu)\right)},$$
$$\hat{q}_j = \frac{1}{m_2\left(1 + \lambda_2^\top (Y_j - \mu)\right)},$$
where $\mu$, $\lambda_1$, and $\lambda_2$ are the solutions to the following nonlinear equations:
$$\frac{1}{m_1} \sum_{j=1}^{m_1} \frac{X_j - \mu}{1 + \lambda_1^\top (X_j - \mu)} = 0,$$
$$\frac{1}{m_2} \sum_{j=1}^{m_2} \frac{Y_j - \mu}{1 + \lambda_2^\top (Y_j - \mu)} = 0,$$
$$m_1 \lambda_1 + m_2 \lambda_2 = 0.$$
The test statistic $R_{m_1,m_2}$ is a generalization of the classical two-sample empirical likelihood in [19,20]. The difference is that, unlike in [19,20], $X_1, X_2, \dots, X_{m_1}$ are not identically distributed, and neither are $Y_1, Y_2, \dots, Y_{m_2}$. The limiting distribution of $R_{m_1,m_2}$ under the null hypothesis is given by the following theorem.
Theorem 1. 
Suppose $n_1, n_2$ are fixed and the $4d$-th moment of the distribution $F$ is finite. Assume that $\frac{m_1}{m_1+m_2} \to \beta_0 \in (0,1)$ as $m_1, m_2 \to \infty$. Under the null hypothesis $H_0$, $-2\log R_{m_1,m_2}$ converges in distribution to $\chi_d^2$ as $m_1, m_2 \to \infty$. Here, $\chi_d^2$ is the chi-square distribution with $d$ degrees of freedom.
According to Theorem 1, we define the empirical likelihood test for (1) as follows:
$$\text{Reject } H_0 \text{ at significance level } \alpha \text{ if } -2\log R_{m_1,m_2} > \chi^2_{d,1-\alpha},$$
where $\chi^2_{d,1-\alpha}$ is the $100(1-\alpha)\%$ quantile of the chi-square distribution with $d$ degrees of freedom. The p-value of the empirical likelihood test is defined as follows:
$$p\text{-value} = P\left(\chi_d^2 > -2\log R_{m_1,m_2}\right).$$
The assumption on the finiteness of the $4d$-th moment of the distribution $F$ is not very restrictive. For unweighted networks, $F$ is the Bernoulli distribution, and all the moments of $F$ exist. In modeling weighted networks, the weights are usually assumed to follow a distribution in the exponential family, and many such distributions satisfy this assumption, for instance, the normal, exponential, and Poisson distributions. In many applications, the weights are correlations, which lie between −1 and 1, so it is reasonable to assume that their higher moments exist.
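For $d = 1$, the test can be sketched by solving the two estimating equations for $(\mu, \lambda_1)$, with $\lambda_2 = -(m_1/m_2)\lambda_1$ substituted in from the third equation, and then evaluating $-2\log R_{m_1,m_2}$. This is our own minimal implementation (the function name `el_two_sample_test` is ours), using a generic root finder and with no safeguards against $\lambda$ leaving the region where all weights stay positive.

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import chi2

def el_two_sample_test(X, Y):
    """Empirical likelihood two-sample test for d = 1: solve the estimating
    equations for (mu, lambda_1), then return -2 log R and its chi2(1) p-value."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    m1, m2 = len(X), len(Y)

    def equations(par):
        mu, lam1 = par
        lam2 = -m1 * lam1 / m2                       # enforces m1*lam1 + m2*lam2 = 0
        return [np.mean((X - mu) / (1.0 + lam1 * (X - mu))),
                np.mean((Y - mu) / (1.0 + lam2 * (Y - mu)))]

    mu0 = (m1 * X.mean() + m2 * Y.mean()) / (m1 + m2)    # pooled starting value
    mu, lam1 = fsolve(equations, [mu0, 0.0])
    lam2 = -m1 * lam1 / m2
    stat = 2.0 * (np.sum(np.log1p(lam1 * (X - mu)))
                  + np.sum(np.log1p(lam2 * (Y - mu))))   # -2 log R
    return stat, chi2.sf(stat, df=1)

rng = np.random.default_rng(1)
X, Y = rng.normal(0, 1, 40), rng.normal(0, 1, 40)        # H0 holds for these samples
stat, pval = el_two_sample_test(X, Y)
```

In practice, the summaries $X_t$ and $Y_s$ would be the cycle statistics of the observed networks; here two i.i.d. normal samples stand in for them purely to exercise the solver.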

3. Simulations

In this section, we run simulations to evaluate our proposed empirical likelihood test. For binary (unweighted) graphs, refs. [13,14] proposed a two-sample t-test based on $X_t$ ($1 \le t \le m_1$) and $Y_s$ ($1 \le s \le m_2$). We compare our empirical likelihood test with this t-test in the unweighted case. For weighted networks, it is not clear whether the t-test remains valid; nevertheless, as a comparison, we still run simulations to evaluate its performance. Note that the t-test statistic is a function of $A_{t,ij}$ ($1 \le i < j \le n_{1t}$, $1 \le t \le m_1$) and $B_{s,ij}$ ($1 \le i < j \le n_{2s}$, $1 \le s \le m_2$). For weighted networks, $A_{t,ij} \in \mathbb{R}$ and $B_{s,ij} \in \mathbb{R}$. We still plug them into the t-test statistic, adopt the same rejection rule as in the unweighted case, and then compare our empirical likelihood test with it.
In the simulations, we set the Type I error $\alpha$ to 0.05 and report the empirical size and power over 2000 trials. The empirical size is calculated as follows: generate an independent sample $A_1, \dots, A_{m_1}, B_1, \dots, B_{m_2} \sim G(n, F, h)$, perform the empirical likelihood test, and record whether $H_0$ is rejected. Repeat the experiment 2000 times; the rejection rate is the empirical size. The empirical power is calculated analogously: generate independent samples $A_1, \dots, A_{m_1} \sim G(n, F, h_1)$ and $B_1, \dots, B_{m_2} \sim G(n, F, h_2)$, perform the empirical likelihood test, and record whether $H_0$ is rejected; the rejection rate over 2000 repetitions is the empirical power. The empirical size and power of the t-test are calculated in the same way.
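The Monte Carlo procedure for the empirical size can be sketched as follows. For brevity, this illustration uses Erdős–Rényi graphs, triangle densities as the summary statistics, the off-the-shelf two-sample t-test in place of the empirical likelihood test, and far fewer than 2000 trials; all of these are our simplifications, not the paper's exact setup.

```python
import numpy as np
from scipy.stats import ttest_ind

def triangle_density(A):
    """Density of 3-cycles in a binary graph: trace(A^3)/6 over C(n, 3)."""
    n = A.shape[0]
    return (np.trace(A @ A @ A) / 6.0) / (n * (n - 1) * (n - 2) / 6.0)

def empirical_size(m=20, n=10, p=0.3, trials=200, alpha=0.05, seed=0):
    """Monte Carlo Type I error: both samples are Erdos-Renyi G(n, p),
    so H0 holds; record how often the test rejects at level alpha."""
    rng = np.random.default_rng(seed)

    def sample_graph():
        A = np.triu((rng.uniform(size=(n, n)) < p).astype(float), 1)
        return A + A.T                      # symmetric binary adjacency matrix

    rejections = 0
    for _ in range(trials):
        X = [triangle_density(sample_graph()) for _ in range(m)]
        Y = [triangle_density(sample_graph()) for _ in range(m)]
        if ttest_ind(X, Y).pvalue < alpha:
            rejections += 1
    return rejections / trials

size = empirical_size()   # should land near the nominal level 0.05
```

Replacing the second sample's generator with $h_2 \ne h_1$ turns the same loop into the empirical power computation.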
We take $m_1 = m_2 = m \in \{20, 30, 40\}$, $n_{11} = n_{12} = \cdots = n_{1m_1} = n_1 \in \{10, 20, 30\}$, and $n_{21} = n_{22} = \cdots = n_{2m_2} = n_2 \in \{10, 20, 30\}$. We consider the following five situations.
In the first simulation, $d = 1$, $h_1(x,y) = \frac{1}{1.5 + xy}$, and $h_2(x,y) = h_1(x,y) + c$ with $c = 0, 0.01, 0.02$. $F(z;\theta)$ is the Bernoulli distribution. The results are shown in Table 1.
In the second simulation, $d = 1$, $h_1(x,y) = \frac{e^{xy}}{10}$, and $h_2(x,y) = h_1(x,y) + c$ with $c = 0, 0.02, 0.04$. $F(z;\theta)$ is the Poisson distribution. The results are shown in Table 2.
In the third simulation, we consider $d = 1$, $h_1(x,y) = \frac{e^{xy} + x^2 y^2}{2}$, and $h_2(x,y) = h_1(x,y) + c$ with $c = 0, 0.04, 0.06$. $F(z;\theta)$ is the exponential distribution. The results are shown in Table 3.
In the fourth simulation, we study $d = 2$, $h_1(x,y) = \left(xy,\; e^{xy} + x^2 y^2\right)$, and $h_2(x,y) = h_1(x,y) + c$. Here, $c = (c_1, c_2)$ takes the values
$$c = (0, 0), \qquad c = (0.1, 0.1), \qquad c = (0.2, 0.2).$$
$F(z; h_1(x,y))$ and $F(z; h_2(x,y))$ are normal distributions whose mean and variance are given by the two components of $h_1(x,y)$ and $h_2(x,y)$, respectively. The results are given in Table 4.
In the last simulation, we consider $d = 2$ and the Gamma distribution with parameters $\alpha, \beta$. Let $h_1(x,y) = (f_1(x,y), f_2(x,y)) = \left(x^3 y^3,\; xy\right)$ and $h_2(x,y) = h_1(x,y) + c$. Here, $c = (c_1, c_2)$ is the same as in the normal distribution case. $F(z; h_1(x,y))$ is the Gamma distribution with parameters $\alpha, \beta$, which depend on $f_1, f_2$ as follows:
$$f_1 = \frac{\alpha}{\beta}, \qquad f_2 = \frac{\alpha}{\beta^2} + f_1^2 = \frac{\alpha + \alpha^2}{\beta^2}.$$
The result is given in Table 5.
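The moment map above can be inverted in closed form: $\beta = f_1/(f_2 - f_1^2)$ and $\alpha = f_1\beta$. The sketch below assumes the rate parameterization of the Gamma distribution (mean $\alpha/\beta$, variance $\alpha/\beta^2$); the helper name `gamma_params` is ours.

```python
def gamma_params(f1, f2):
    """Invert the moment map f1 = alpha/beta (mean) and
    f2 = alpha/beta**2 + f1**2 (second moment) of a Gamma distribution
    in the rate parameterization."""
    var = f2 - f1 * f1            # the variance, must be positive
    beta = f1 / var
    alpha = f1 * beta
    return alpha, beta
```

With $f_1 = x^3y^3$ and $f_2 = xy$, the variance $xy(1 - x^5y^5)$ is positive for $x, y \in (0, 1)$, so the inversion is well defined on the support of $(U_i, U_j)$.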
All the Type I errors (the $c = 0$ columns) are close to the nominal level 0.05. As $m$ increases, the power of the tests approaches 1, indicating that our empirical results are consistent with Theorem 1. Moreover, our method has higher power, and its power converges to 1 faster, than the two-sample t-test.

4. Real Data Analysis

In this section, we apply our proposed method to a real dataset from a proof-of-concept study [21]. The dataset comprises 72 samples: 47 patients diagnosed with acute myeloid leukemia (AML) and 25 with acute lymphoblastic leukemia (ALL). The dataset contains 7129 features representing gene expression levels measured via DNA microarray analysis. We randomly choose 10, 20, and 50 features, and also include all 7129 features, to calculate the correlation matrices. We take the absolute values of the correlation matrix elements and set the diagonal elements to 0, thereby mimicking the adjacency matrix of a weighted graph. We apply the proposed empirical likelihood test and the t-test to the networks and report the p-values. The results are shown in Table 6. Note that the two groups of networks are constructed based on two different types of diseases, so the tests should reject the null hypothesis that they come from the same graph distribution. When $n \ge 20$, the p-values of the empirical likelihood test are smaller than 0.05; hence, we reject the null hypothesis. For $n = 20, 30$, the p-values of the t-test are larger than 0.05, and we fail to reject the null hypothesis. This indicates that the proposed empirical likelihood test is more powerful than the t-test.
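The network construction described above, absolute Pearson correlations between genes with the diagonal zeroed, can be sketched as follows. The function name and the synthetic expression matrix are ours; the real input would be the Golub et al. microarray data.

```python
import numpy as np

def correlation_network(expr):
    """Weighted adjacency matrix from a (samples x genes) expression matrix:
    absolute Pearson correlations between genes, diagonal set to 0."""
    A = np.abs(np.corrcoef(expr, rowvar=False))   # gene-by-gene |correlation|
    np.fill_diagonal(A, 0.0)                      # mimic an adjacency matrix
    return A

rng = np.random.default_rng(2)
expr = rng.normal(size=(47, 10))    # placeholder: 47 samples, 10 chosen genes
A = correlation_network(expr)
```

Building one such network per disease group yields the two samples of weighted graphs to which the tests are applied.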

5. Conclusions and Discussion

In this paper, we study the weighted graph two-sample testing problem and propose an empirical likelihood test. Simulations are employed to evaluate the performance of the proposed test. As a comparison, we also run simulations for the graph-based two-sample t-test, which was developed for unweighted networks. The simulation study shows that the t-test still works for weighted networks, and that the proposed empirical likelihood test has higher power than the t-test.
One limitation of the current work is that the proposed empirical likelihood test only works for large sample sizes $m_1, m_2$: Theorem 1 requires $m_1, m_2 \to \infty$. In practice, the sample sizes may be very small, for instance, $m_1 = m_2 = 1$ [22]. In this case, the proposed empirical likelihood test does not work. Developing a powerful test that is valid for small $m_1$ and $m_2$ is an important future topic. Another important direction is to study the weighted graph-based two-sample testing problem from a minimax perspective; in the unweighted case, this problem has been studied in [10].

Author Contributions

Conceptualization, X.Z. and M.Y.; methodology, M.Y.; formal analysis, X.Z.; data curation, X.Z.; writing—review and editing, X.Z.; supervision, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Proof of Empirical Likelihood. 
Firstly, we consider the case d = 1 . In this case,
$$X_i = \binom{n_i}{k}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n_i} A_{i,i_1 i_2}\, A_{i,i_2 i_3} \cdots A_{i,i_k i_1},$$
$$Y_i = \binom{l_i}{k}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le l_i} B_{i,i_1 i_2}\, B_{i,i_2 i_3} \cdots B_{i,i_k i_1}.$$
Let $\bar{X}_{m_1} = \frac{1}{m_1}\sum_{i=1}^{m_1} X_i$, $\bar{Y}_{m_2} = \frac{1}{m_2}\sum_{i=1}^{m_2} Y_i$, $\sigma_i^2 = \mathrm{Var}(X_i)$, $\tau_i^2 = \mathrm{Var}(Y_i)$,
$$\sigma^2 = \frac{1}{m_1}\sum_{i=1}^{m_1} \sigma_i^2, \qquad \tau^2 = \frac{1}{m_2}\sum_{i=1}^{m_2} \tau_i^2,$$
$$S_n^2 = \frac{1}{m_1}\sum_{i=1}^{m_1} \left(X_i - \bar{X}_{m_1}\right)^2, \qquad \tau_n^2 = \frac{1}{m_2}\sum_{i=1}^{m_2} \left(Y_i - \bar{Y}_{m_2}\right)^2,$$
$$R_n^2 = \frac{1}{m_1}\sum_{i=1}^{m_1} \left(X_i - \hat{\mu}\right)^2, \qquad T_n^2 = \frac{1}{m_2}\sum_{i=1}^{m_2} \left(Y_i - \hat{\mu}\right)^2,$$
$$\hat{\mu} = \hat{\eta}\bar{X}_{m_1} + (1-\hat{\eta})\bar{Y}_{m_2}, \qquad \hat{\eta} = \frac{m_1/S_n^2}{m_1/S_n^2 + m_2/\tau_n^2}.$$
Let $\rho = \frac{m_1}{m_1+m_2}$; then $\hat{\eta}$ can be written as
$$\hat{\eta} = \rho\,\tau_n^2\left(\rho\,\tau_n^2 + (1-\rho)S_n^2\right)^{-1}.$$
By the definition of $\hat{\mu}$,
$$\bar{X}_{m_1} - \hat{\mu} = \bar{X}_{m_1} - \hat{\eta}\bar{X}_{m_1} - (1-\hat{\eta})\bar{Y}_{m_2} = (1-\hat{\eta})\left(\bar{X}_{m_1} - \bar{Y}_{m_2}\right),$$
$$\bar{Y}_{m_2} - \hat{\mu} = \bar{Y}_{m_2} - \hat{\eta}\bar{X}_{m_1} - (1-\hat{\eta})\bar{Y}_{m_2} = -\hat{\eta}\left(\bar{X}_{m_1} - \bar{Y}_{m_2}\right).$$
Let
$$\mu_0 = E\left[A_{1,12}A_{1,23}\cdots A_{1,k1}\right] = \int f(\mu_1,\mu_2)f(\mu_2,\mu_3)\cdots f(\mu_k,\mu_1)\,d\mu_1\,d\mu_2\cdots d\mu_k.$$
Lemma A1. 
Suppose $\frac{1}{m_1}\sum_{i=1}^{m_1}\sigma_i^2 = \sigma^2 + o(1)$ and $\frac{1}{m_2}\sum_{i=1}^{m_2}\tau_i^2 = \tau^2 + o(1)$ for two positive constants, σ and τ, and suppose $0 < c_1 \le n_i \le c_2 < +\infty$ and $0 < c_1 \le l_i \le c_2 < +\infty$. Then, $S_n^2 = \sigma^2 + o_p(1)$ and $\tau_n^2 = \tau^2 + o_p(1)$.
Proof. 
Note that
$$S_n^2 = \frac{1}{m_1}\sum_{i=1}^{m_1}\left[(X_i-\mu_0)^2 + (\mu_0-\bar{X}_{m_1})^2 + 2(X_i-\mu_0)(\mu_0-\bar{X}_{m_1})\right] = \frac{1}{m_1}\sum_{i=1}^{m_1}(X_i-\mu_0)^2 - \left(\bar{X}_{m_1}-\mu_0\right)^2 = \frac{1}{m_1}\sum_{i=1}^{m_1}\sigma_i^2 + O_P\left(\frac{1}{\sqrt{m_1}}\right) = \sigma^2 + O_P\left(\frac{1}{\sqrt{m_1}}\right).$$
Hence, S n 2 = σ 2 + o p ( 1 ) . Similarly, one can obtain τ n 2 = τ 2 + o p ( 1 ) . □
Lemma A2. 
Suppose $\frac{1}{m_1}\sum_{i=1}^{m_1}\sigma_i^2 = \sigma^2 + o(1)$ and $\frac{1}{m_2}\sum_{i=1}^{m_2}\tau_i^2 = \tau^2 + o(1)$ for two positive constants, σ and τ, and suppose the 4th moment of the distribution $F$ is finite. Then, as $m_1, m_2 \to \infty$, $\sqrt{m_1}\left(\bar{X}_{m_1} - \mu_0\right) \xrightarrow{d} N(0, \sigma^2)$ and $\sqrt{m_2}\left(\bar{Y}_{m_2} - \mu_0\right) \xrightarrow{d} N(0, \tau^2)$.
Proof. 
We shall use the Lindeberg central limit theorem to prove the results. Note that the $X_i$ are independent and
$$\sqrt{m_1}\left(\bar{X}_{m_1} - \mu_0\right) = \sum_{i=1}^{m_1}\frac{X_i - \mu_0}{\sqrt{m_1}}.$$
Then,
$$\sum_{i=1}^{m_1}\mathrm{Var}\left(\frac{X_i-\mu_0}{\sqrt{m_1}}\right) = \frac{1}{m_1}\sum_{i=1}^{m_1}\sigma_i^2 = \sigma^2 + o(1).$$
Next, we verify Lindeberg's condition. Let $\varepsilon > 0$ be any fixed constant. According to the Cauchy–Schwarz inequality and Markov's inequality, we have
$$\frac{1}{\sigma^2}\sum_{i=1}^{m_1}E\left[\left(\frac{X_i-\mu_0}{\sqrt{m_1}}\right)^2 I\left(\left|\frac{X_i-\mu_0}{\sqrt{m_1}}\right| > \varepsilon\sigma\right)\right] \le \frac{1}{\sigma^2}\sum_{i=1}^{m_1}E\left[\left(\frac{X_i-\mu_0}{\sqrt{m_1}}\right)^4\right]\frac{1}{\varepsilon^2\sigma^2} = \frac{1}{\varepsilon^2\sigma^4}\sum_{i=1}^{m_1}\frac{E(X_i-\mu_0)^4}{m_1^2} = o(1),$$
where we used the fact that the 4th moment of the distribution $F$ exists. Hence, $\sqrt{m_1}\left(\bar{X}_{m_1}-\mu_0\right) \xrightarrow{d} N(0,\sigma^2)$. Similarly, one has $\sqrt{m_2}\left(\bar{Y}_{m_2}-\mu_0\right) \xrightarrow{d} N(0,\tau^2)$. □
Lemma A3. 
Suppose 1 m 1 i = 1 m 1 σ i 2 = σ 2 + o ( 1 ) and 1 m 2 i = 1 m 2 τ i 2 = τ 2 + o ( 1 ) for two positive constants, σ and τ. Then, R n 2 = σ 2 + o p ( 1 ) and T n 2 = τ 2 + o p ( 1 ) .
Proof. 
According to Lemma A2, we have $\bar{X}_{m_1} - \mu_0 = O_P\left(\frac{1}{\sqrt{m_1}}\right)$. Then,
$$R_n^2 = \frac{1}{m_1}\sum_{i=1}^{m_1}\left(X_i-\hat{\mu}\right)^2 = \frac{1}{m_1}\sum_{i=1}^{m_1}\left(X_i-\mu_0+\mu_0-\hat{\mu}\right)^2 = \frac{1}{m_1}\sum_{i=1}^{m_1}(X_i-\mu_0)^2 + (\mu_0-\hat{\mu})^2 + 2\left(\bar{X}_{m_1}-\mu_0\right)(\mu_0-\hat{\mu}) = \sigma^2 + O_P\left(\frac{1}{\sqrt{m_1}}\right) + (\hat{\mu}-\mu_0)^2.$$
Note that
$$\hat{\mu} - \mu_0 = \hat{\eta}\left(\bar{X}_{m_1} - \mu_0\right) + (1-\hat{\eta})\left(\bar{Y}_{m_2} - \mu_0\right) = O_P\left(\frac{1}{\sqrt{m_1}}\right) + O_P\left(\frac{1}{\sqrt{m_2}}\right).$$
Hence, R n 2 = σ 2 + o p ( 1 ) . Similarly, we have T n 2 = τ 2 + o p ( 1 ) . □
Proof of Theorem 1. 
Note that
$$0 = \frac{1}{m_1}\sum_{i=1}^{m_1}\frac{X_i-\hat{\mu}}{1+\lambda_1(X_i-\hat{\mu})} = \frac{1}{m_1}\sum_{i=1}^{m_1}(X_i-\hat{\mu}) - \frac{1}{m_1}\sum_{i=1}^{m_1}\frac{\lambda_1(X_i-\hat{\mu})^2}{1+\lambda_1(X_i-\hat{\mu})},$$
from which it follows that
$$\bar{X}_{m_1} - \hat{\mu} = \lambda_1\,\frac{1}{m_1}\sum_{i=1}^{m_1}\frac{(X_i-\hat{\mu})^2}{1+\lambda_1(X_i-\hat{\mu})}. \tag{A1}$$
Taking absolute values on both sides of (A1) yields
$$\left|\bar{X}_{m_1} - \hat{\mu}\right| \ge \frac{|\lambda_1|\,R_n^2}{1+|\lambda_1|\max_{1\le i\le m_1}|X_i-\hat{\mu}|}.$$
According to Lemmas A2 and A3, we have $\bar{X}_{m_1} - \hat{\mu} = O_P\left(\frac{1}{\sqrt{m_1}}+\frac{1}{\sqrt{m_2}}\right)$ and $R_n^2 = \sigma^2 + o_p(1)$, respectively. Then, $\lambda_1 = O_p\left(\frac{1}{\sqrt{m_1}}\right)$. Similarly, $\lambda_2 = O_p\left(\frac{1}{\sqrt{m_2}}\right)$.
Next, we find the leading terms of $\lambda_1$ and $\lambda_2$. Note that
$$0 = \frac{1}{m_1}\sum_{i=1}^{m_1}\frac{X_i-\hat{\mu}}{1+\lambda_1(X_i-\hat{\mu})} = \frac{1}{m_1}\sum_{i=1}^{m_1}(X_i-\hat{\mu}) - \frac{\lambda_1}{m_1}\sum_{i=1}^{m_1}(X_i-\hat{\mu})^2 + \frac{\lambda_1^2}{m_1}\sum_{i=1}^{m_1}\frac{(X_i-\hat{\mu})^3}{1+\lambda_1(X_i-\hat{\mu})}.$$
Then, $\bar{X}_{m_1} - \hat{\mu} = \lambda_1 R_n^2 + O_p\left(\frac{1}{m_1}\right)$. Hence, $\lambda_1 = \frac{\bar{X}_{m_1}-\hat{\mu}}{R_n^2} + O_p\left(\frac{1}{m_1}\right)$. Similarly, $\lambda_2 = \frac{\bar{Y}_{m_2}-\hat{\mu}}{T_n^2} + O_p\left(\frac{1}{m_2}\right)$. Therefore,
$$\begin{aligned}
-2\log R_{m_1,m_2} &= 2\sum_{i=1}^{m_1}\log\left(1+\lambda_1(X_i-\hat{\mu})\right) + 2\sum_{i=1}^{m_2}\log\left(1+\lambda_2(Y_i-\hat{\mu})\right)\\
&= 2\left[\lambda_1\sum_{i=1}^{m_1}(X_i-\hat{\mu}) - \frac{\lambda_1^2}{2}\sum_{i=1}^{m_1}(X_i-\hat{\mu})^2 + O\left(m_1|\lambda_1|^3\right)\right]\\
&\quad + 2\left[\lambda_2\sum_{i=1}^{m_2}(Y_i-\hat{\mu}) - \frac{\lambda_2^2}{2}\sum_{i=1}^{m_2}(Y_i-\hat{\mu})^2 + O\left(m_2|\lambda_2|^3\right)\right]\\
&= \frac{m_1\left(\bar{X}_{m_1}-\hat{\mu}\right)^2}{R_n^2} + \frac{m_2\left(\bar{Y}_{m_2}-\hat{\mu}\right)^2}{T_n^2} + O_p\left(\frac{1}{\sqrt{m_1}}+\frac{1}{\sqrt{m_2}}\right)\\
&= \frac{m_1(1-\hat{\eta})^2\left(\bar{X}_{m_1}-\bar{Y}_{m_2}\right)^2}{R_n^2} + \frac{m_2\hat{\eta}^2\left(\bar{X}_{m_1}-\bar{Y}_{m_2}\right)^2}{T_n^2} + o_p(1)\\
&= \left[\frac{m_1(1-\hat{\eta})^2}{R_n^2} + \frac{m_2\hat{\eta}^2}{T_n^2}\right]\left(\bar{X}_{m_1}-\bar{Y}_{m_2}\right)^2 + o_p(1).
\end{aligned}$$
Let $\eta = \frac{m_1/\sigma^2}{m_1/\sigma^2 + m_2/\tau^2}$. Then, it is easy to obtain $\hat{\eta} = \eta + o_p(1)$. Hence,
$$\begin{aligned}
\frac{m_1(1-\hat{\eta})^2}{R_n^2} + \frac{m_2\hat{\eta}^2}{T_n^2} &= \left[\frac{m_1}{\sigma^2}\left(\frac{m_2/\tau^2}{m_1/\sigma^2+m_2/\tau^2}\right)^2 + \frac{m_2}{\tau^2}\left(\frac{m_1/\sigma^2}{m_1/\sigma^2+m_2/\tau^2}\right)^2\right]\left(1+o_p(1)\right)\\
&= \frac{\frac{m_1}{\sigma^2}\frac{m_2}{\tau^2}\left(\frac{m_2}{\tau^2}+\frac{m_1}{\sigma^2}\right)}{\left(\frac{m_1}{\sigma^2}+\frac{m_2}{\tau^2}\right)^2}\left(1+o_p(1)\right) = \frac{\frac{m_1}{\sigma^2}\frac{m_2}{\tau^2}}{\frac{m_1}{\sigma^2}+\frac{m_2}{\tau^2}}\left(1+o_p(1)\right) = \frac{1+o_p(1)}{\frac{\tau^2}{m_2}+\frac{\sigma^2}{m_1}}.
\end{aligned} \tag{A3}$$
Note that $X_i$ and $Y_i$ are independent. Then, according to Lemma A2, we have
$$\frac{\bar{X}_{m_1}-\bar{Y}_{m_2}}{\sqrt{\frac{\sigma^2}{m_1}+\frac{\tau^2}{m_2}}} = \frac{\sqrt{\sigma^2/m_1}}{\sqrt{\frac{\sigma^2}{m_1}+\frac{\tau^2}{m_2}}}\cdot\frac{\bar{X}_{m_1}-\mu_0}{\sqrt{\sigma^2/m_1}} - \frac{\sqrt{\tau^2/m_2}}{\sqrt{\frac{\sigma^2}{m_1}+\frac{\tau^2}{m_2}}}\cdot\frac{\bar{Y}_{m_2}-\mu_0}{\sqrt{\tau^2/m_2}} \xrightarrow{d} N(0,1). \tag{A4}$$
Then, the desired result for $d = 1$ follows from (A3) and (A4).
The proof for $d \ge 2$ is similar to that for $d = 1$. In this case, $S_n^2$, $\tau_n^2$, $R_n^2$, and $T_n^2$ denote the corresponding sample covariance matrices, and
$$\hat{\eta} = \rho\,\tau_n^2\left(\rho\,\tau_n^2 + (1-\rho)S_n^2\right)^{-1}, \qquad \hat{\mu} = \hat{\eta}\bar{X}_{m_1} + (I-\hat{\eta})\bar{Y}_{m_2}.$$
The rest of the proof is almost the same. We omit it. □

References

  1. Callegaro, A.; Spiessens, B. Testing treatment effect in randomized clinical trials with possible nonproportional hazards. Stat. Biopharm. Res. 2017, 9, 204–211. [Google Scholar] [CrossRef]
  2. Tusher, V.G.; Tibshirani, R.; Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 2001, 98, 5116–5121. [Google Scholar] [CrossRef] [PubMed]
  3. Montgomery, D.C. A modern framework for achieving enterprise excellence. Int. J. Lean Six Sigma 2010, 1, 56–65. [Google Scholar] [CrossRef]
  4. Blau, F.D.; Kahn, L.M. The gender wage gap: Extent, trends, and explanations. J. Econ. Lit. 2017, 55, 789–865. [Google Scholar]
  5. Gudmundarson, R.; Peters, G. Assessing portfolio diversification via two-sample graph kernel inference. A case study on the influence of ESG screening. PLoS ONE 2024, 19, e0301804. [Google Scholar]
  6. Arroyo, J.; Kessler, D.; Levina, E.; Taylor, S. Network classification with applications to brain connectomics. Ann. Appl. Stat. 2017, 13, 1648–1677. [Google Scholar] [CrossRef]
  7. Stam, C.J.; Jones, B.; Nolte, G.; Breakspear, M.; Scheltens, P. Small-world networks and functional connectivity in Alzheimer’s disease. Cereb. Cortex 2007, 17, 92–99. [Google Scholar] [CrossRef]
  8. Ghoshdastidar, D.; Gutzeit, M.; Carpentier, A.; von Luxburg, U. Two-sample tests for large random graphs using network statistics. In Proceedings of the Conference on Learning Theory, PMLR, Amsterdam, The Netherlands, 7–10 July 2017; pp. 954–977. [Google Scholar]
  9. Ghoshdastidar, D.; Von Luxburg, U. Practical methods for graph two-sample testing. Adv. Neural Inf. Process. Syst. 2018, 31, 1568. [Google Scholar]
  10. Ghoshdastidar, D.; Gutzeit, M.; Carpentier, A.; Von Luxburg, U. Two-sample hypothesis testing for inhomogeneous random graphs. Ann. Stat. 2020, 48, 2208–2229. [Google Scholar] [CrossRef]
  11. Tang, M.; Athreya, A.; Sussman, D.L.; Lyzinski, V.; Priebe, C.E. A nonparametric two-sample hypothesis testing problem for random graphs. Bernoulli 2017, 23, 1599–1630. [Google Scholar] [CrossRef]
  12. Tang, M.; Athreya, A.; Sussman, D.L.; Lyzinski, V.; Park, Y.; Priebe, C.E. A semiparametric two-sample hypothesis testing problem for random graphs. J. Comput. Graph. Stat. 2017, 26, 344–354. [Google Scholar] [CrossRef]
  13. Maugis, P.A.; Olhede, S.; Priebe, C.; Wolfe, P. Testing for equivalence of network distribution using subgraph counts. J. Comput. Graph. Stat. 2020, 29, 455–465. [Google Scholar] [CrossRef]
  14. Maugis, P. Central limit theorems for local network statistics. arXiv 2020, arXiv:2006.15738. [Google Scholar] [CrossRef]
  15. Yuan, M.; Wen, Q. A practical two-sample test for weighted random graphs. J. Appl. Stat. 2021, 50, 495–511. [Google Scholar] [CrossRef] [PubMed]
  16. Simpson, S.; Bowman, F.; Laurienti, P. Analyzing complex functional brain networks: Fusing statistics and network science to understand the brain. Stat. Surv. 2013, 7, 1–36. [Google Scholar] [CrossRef] [PubMed]
  17. Owen, A.B. Empirical Likelihood; Chapman and Hall/CRC: Boca Raton, FL, USA, 2001. [Google Scholar]
  18. Owen, A.B. Empirical likelihood confidence region. Ann. Stat. 1990, 18, 90–120. [Google Scholar] [CrossRef]
  19. Liu, Y.; Zou, C.; Zhang, R. Empirical likelihood for the two-sample mean problem. Stat. Probab. Lett. 2008, 78, 548–556. [Google Scholar] [CrossRef]
  20. Wu, C.; Yan, Y. Empirical Likelihood Inference for Two-Sample Problems. Stat. Its Interface 2012, 5, 345–354. [Google Scholar] [CrossRef]
  21. Golub, T.R.; Slonim, D.K.; Tamayo, P.; Huard, C.; Gaasenbeek, M.; Mesirov, J.P.; Coller, H.; Loh, M.L.; Downing, J.R.; Caligiuri, M.A.; et al. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 1999, 286, 531–537. [Google Scholar] [CrossRef] [PubMed]
  22. Ghoshdastidar, D.; Luxburg, V.U. Two-sample hypothesis testing for inhomogeneous random graphs. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, QC, Canada, 3–8 December 2018; Volume 48, pp. 3019–3028. [Google Scholar]
Figure 1. Weighted graphs with $F(z;\theta) = 0.3\,\delta_{\{1\}} + 0.7\,F_{exp}(z;\theta)$. The left weighted graph corresponds to $F(z;h(x,y))$ with $h(x,y) = e^{xy}$, and the right weighted graph corresponds to $F(z;h(x,y))$ with $h(x,y) = 2 + e^{xy}$.
Table 1. Simulated size and power with graphs generated from the Bernoulli distribution.

| n1, n2 (m = 20) | Method | c = 0 (Size) | c = 0.01 (Power) | c = 0.02 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.054 | 0.092 | 0.196 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.117 | 0.284 |
| n1 = 10, n2 = 30 | t test | 0.052 | 0.125 | 0.323 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.194 | 0.601 |
| n1 = 20, n2 = 30 | t test | 0.051 | 0.241 | 0.745 |
| n1 = 30, n2 = 30 | t test | 0.054 | 0.387 | 0.917 |
| n1 = 10, n2 = 10 | EL test | 0.048 | 0.106 | 0.213 |
| n1 = 10, n2 = 20 | EL test | 0.050 | 0.130 | 0.300 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.136 | 0.337 |
| n1 = 20, n2 = 20 | EL test | 0.049 | 0.220 | 0.634 |
| n1 = 20, n2 = 30 | EL test | 0.053 | 0.269 | 0.768 |
| n1 = 30, n2 = 30 | EL test | 0.049 | 0.419 | 0.926 |

| n1, n2 (m = 30) | Method | c = 0 (Size) | c = 0.01 (Power) | c = 0.02 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.096 | 0.269 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.134 | 0.393 |
| n1 = 10, n2 = 30 | t test | 0.050 | 0.160 | 0.432 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.264 | 0.774 |
| n1 = 20, n2 = 30 | t test | 0.049 | 0.361 | 0.891 |
| n1 = 30, n2 = 30 | t test | 0.051 | 0.528 | 0.982 |
| n1 = 10, n2 = 10 | EL test | 0.052 | 0.112 | 0.283 |
| n1 = 10, n2 = 20 | EL test | 0.053 | 0.143 | 0.211 |
| n1 = 10, n2 = 30 | EL test | 0.052 | 0.164 | 0.438 |
| n1 = 20, n2 = 20 | EL test | 0.053 | 0.279 | 0.787 |
| n1 = 20, n2 = 30 | EL test | 0.051 | 0.375 | 0.899 |
| n1 = 30, n2 = 30 | EL test | 0.050 | 0.548 | 0.988 |

| n1, n2 (m = 40) | Method | c = 0 (Size) | c = 0.01 (Power) | c = 0.02 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.049 | 0.119 | 0.329 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.169 | 0.495 |
| n1 = 10, n2 = 30 | t test | 0.052 | 0.180 | 0.540 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.350 | 0.886 |
| n1 = 20, n2 = 30 | t test | 0.050 | 0.463 | 0.953 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.657 | 0.996 |
| n1 = 10, n2 = 10 | EL test | 0.053 | 0.122 | 0.338 |
| n1 = 10, n2 = 20 | EL test | 0.052 | 0.176 | 0.503 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.183 | 0.550 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.360 | 0.890 |
| n1 = 20, n2 = 30 | EL test | 0.053 | 0.471 | 0.958 |
| n1 = 30, n2 = 30 | EL test | 0.055 | 0.670 | 0.998 |
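The t test column of Table 1 can be approximated by a small Monte Carlo sketch: generate two samples of Bernoulli graphs, compare their mean edge weights with a Welch t statistic, and record the rejection rate. The base edge probability 0.5, the normal critical value 1.96, and the reduction of each graph to its mean edge weight are illustrative assumptions rather than the paper's exact setup, and this is the plain t test, not the proposed EL statistic.

```python
import numpy as np

rng = np.random.default_rng(1)

def graph_means(n, m, p):
    """Mean edge weight of n Bernoulli(p) graphs on m nodes each."""
    k = m * (m - 1) // 2                       # edges in an undirected graph
    return rng.binomial(1, p, size=(n, k)).mean(axis=1)

def welch_t(a, b):
    """Welch two-sample t statistic."""
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

def rejection_rate(n1, n2, m, p1, p2, reps=2000, z=1.96):
    """Fraction of replications with |t| above the normal critical value."""
    rej = 0
    for _ in range(reps):
        t = welch_t(graph_means(n1, m, p1), graph_means(n2, m, p2))
        rej += abs(t) > z
    return rej / reps

size = rejection_rate(20, 20, 20, 0.5, 0.5)    # c = 0: should be near 0.05
power = rejection_rate(20, 20, 20, 0.5, 0.52)  # c = 0.02: well above the size
```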
Table 2. Simulated size and power with graphs generated from the Poisson distribution.

| n1, n2 (m = 20) | Method | c = 0 (Size) | c = 0.02 (Power) | c = 0.04 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.048 | 0.107 | 0.169 |
| n1 = 10, n2 = 20 | t test | 0.050 | 0.208 | 0.418 |
| n1 = 10, n2 = 30 | t test | 0.052 | 0.234 | 0.464 |
| n1 = 20, n2 = 20 | t test | 0.053 | 0.292 | 0.774 |
| n1 = 20, n2 = 30 | t test | 0.051 | 0.440 | 0.931 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.633 | 0.982 |
| n1 = 10, n2 = 10 | EL test | 0.051 | 0.141 | 0.280 |
| n1 = 10, n2 = 20 | EL test | 0.052 | 0.252 | 0.478 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.289 | 0.540 |
| n1 = 20, n2 = 20 | EL test | 0.056 | 0.333 | 0.811 |
| n1 = 20, n2 = 30 | EL test | 0.053 | 0.466 | 0.943 |
| n1 = 30, n2 = 30 | EL test | 0.055 | 0.670 | 0.995 |

| n1, n2 (m = 30) | Method | c = 0 (Size) | c = 0.02 (Power) | c = 0.04 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.126 | 0.243 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.235 | 0.498 |
| n1 = 10, n2 = 30 | t test | 0.050 | 0.261 | 0.549 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.402 | 0.912 |
| n1 = 20, n2 = 30 | t test | 0.049 | 0.577 | 0.974 |
| n1 = 30, n2 = 30 | t test | 0.051 | 0.790 | 0.985 |
| n1 = 10, n2 = 10 | EL test | 0.052 | 0.157 | 0.352 |
| n1 = 10, n2 = 20 | EL test | 0.053 | 0.285 | 0.561 |
| n1 = 10, n2 = 30 | EL test | 0.052 | 0.333 | 0.650 |
| n1 = 20, n2 = 20 | EL test | 0.053 | 0.441 | 0.916 |
| n1 = 20, n2 = 30 | EL test | 0.051 | 0.597 | 0.985 |
| n1 = 30, n2 = 30 | EL test | 0.050 | 0.807 | 0.992 |

| n1, n2 (m = 40) | Method | c = 0 (Size) | c = 0.02 (Power) | c = 0.04 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.049 | 0.113 | 0.325 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.234 | 0.616 |
| n1 = 10, n2 = 30 | t test | 0.052 | 0.272 | 0.621 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.496 | 0.972 |
| n1 = 20, n2 = 30 | t test | 0.050 | 0.658 | 0.980 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.891 | 0.996 |
| n1 = 10, n2 = 10 | EL test | 0.053 | 0.140 | 0.404 |
| n1 = 10, n2 = 20 | EL test | 0.052 | 0.310 | 0.694 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.352 | 0.713 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.524 | 0.981 |
| n1 = 20, n2 = 30 | EL test | 0.053 | 0.689 | 0.997 |
| n1 = 30, n2 = 30 | EL test | 0.055 | 0.900 | 1.000 |
Table 3. Simulated size and power with graphs generated from the exponential distribution.

| n1, n2 (m = 20) | Method | c = 0 (Size) | c = 0.04 (Power) | c = 0.06 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.051 | 0.165 | 0.288 |
| n1 = 10, n2 = 20 | t test | 0.053 | 0.170 | 0.322 |
| n1 = 10, n2 = 30 | t test | 0.054 | 0.174 | 0.347 |
| n1 = 20, n2 = 20 | t test | 0.050 | 0.411 | 0.759 |
| n1 = 20, n2 = 30 | t test | 0.052 | 0.512 | 0.827 |
| n1 = 30, n2 = 30 | t test | 0.055 | 0.671 | 0.931 |
| n1 = 10, n2 = 10 | EL test | 0.050 | 0.219 | 0.346 |
| n1 = 10, n2 = 20 | EL test | 0.053 | 0.262 | 0.467 |
| n1 = 10, n2 = 30 | EL test | 0.055 | 0.275 | 0.515 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.458 | 0.788 |
| n1 = 20, n2 = 30 | EL test | 0.051 | 0.565 | 0.871 |
| n1 = 30, n2 = 30 | EL test | 0.054 | 0.699 | 0.952 |

| n1, n2 (m = 30) | Method | c = 0 (Size) | c = 0.04 (Power) | c = 0.06 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.049 | 0.206 | 0.410 |
| n1 = 10, n2 = 20 | t test | 0.052 | 0.258 | 0.514 |
| n1 = 10, n2 = 30 | t test | 0.053 | 0.280 | 0.551 |
| n1 = 20, n2 = 20 | t test | 0.048 | 0.590 | 0.891 |
| n1 = 20, n2 = 30 | t test | 0.053 | 0.674 | 0.953 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.848 | 0.982 |
| n1 = 10, n2 = 10 | EL test | 0.053 | 0.239 | 0.457 |
| n1 = 10, n2 = 20 | EL test | 0.051 | 0.351 | 0.621 |
| n1 = 10, n2 = 30 | EL test | 0.054 | 0.397 | 0.679 |
| n1 = 20, n2 = 20 | EL test | 0.055 | 0.622 | 0.904 |
| n1 = 20, n2 = 30 | EL test | 0.049 | 0.727 | 0.968 |
| n1 = 30, n2 = 30 | EL test | 0.050 | 0.863 | 0.995 |

| n1, n2 (m = 40) | Method | c = 0 (Size) | c = 0.04 (Power) | c = 0.06 (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.053 | 0.253 | 0.522 |
| n1 = 10, n2 = 20 | t test | 0.052 | 0.334 | 0.671 |
| n1 = 10, n2 = 30 | t test | 0.055 | 0.357 | 0.694 |
| n1 = 20, n2 = 20 | t test | 0.051 | 0.711 | 0.958 |
| n1 = 20, n2 = 30 | t test | 0.053 | 0.816 | 0.983 |
| n1 = 30, n2 = 30 | t test | 0.056 | 0.933 | 1.000 |
| n1 = 10, n2 = 10 | EL test | 0.055 | 0.281 | 0.553 |
| n1 = 10, n2 = 20 | EL test | 0.049 | 0.423 | 0.756 |
| n1 = 10, n2 = 30 | EL test | 0.052 | 0.461 | 0.797 |
| n1 = 20, n2 = 20 | EL test | 0.054 | 0.732 | 0.969 |
| n1 = 20, n2 = 30 | EL test | 0.055 | 0.845 | 0.995 |
| n1 = 30, n2 = 30 | EL test | 0.051 | 0.948 | 1.000 |
Table 4. Simulated size and power with graphs generated from the multivariate normal distribution.

| n1, n2 (m = 20) | Method | c = (0, 0) (Size) | c = (0.1, 0.1) (Power) | c = (0.2, 0.2) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.126 | 0.273 |
| n1 = 10, n2 = 20 | t test | 0.051 | 0.161 | 0.315 |
| n1 = 10, n2 = 30 | t test | 0.054 | 0.232 | 0.409 |
| n1 = 20, n2 = 20 | t test | 0.056 | 0.245 | 0.414 |
| n1 = 20, n2 = 30 | t test | 0.049 | 0.252 | 0.468 |
| n1 = 30, n2 = 30 | t test | 0.051 | 0.263 | 0.531 |
| n1 = 10, n2 = 10 | EL test | 0.052 | 0.175 | 0.344 |
| n1 = 10, n2 = 20 | EL test | 0.054 | 0.183 | 0.358 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.243 | 0.414 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.258 | 0.468 |
| n1 = 20, n2 = 30 | EL test | 0.053 | 0.263 | 0.521 |
| n1 = 30, n2 = 30 | EL test | 0.052 | 0.289 | 0.588 |

| n1, n2 (m = 30) | Method | c = (0, 0) (Size) | c = (0.1, 0.1) (Power) | c = (0.2, 0.2) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.404 | 0.830 |
| n1 = 10, n2 = 20 | t test | 0.054 | 0.512 | 0.881 |
| n1 = 10, n2 = 30 | t test | 0.049 | 0.616 | 0.926 |
| n1 = 20, n2 = 20 | t test | 0.058 | 0.619 | 0.959 |
| n1 = 20, n2 = 30 | t test | 0.052 | 0.731 | 0.974 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.796 | 0.983 |
| n1 = 10, n2 = 10 | EL test | 0.052 | 0.466 | 0.857 |
| n1 = 10, n2 = 20 | EL test | 0.055 | 0.527 | 0.898 |
| n1 = 10, n2 = 30 | EL test | 0.053 | 0.635 | 0.933 |
| n1 = 20, n2 = 20 | EL test | 0.050 | 0.662 | 0.972 |
| n1 = 20, n2 = 30 | EL test | 0.051 | 0.775 | 0.982 |
| n1 = 30, n2 = 30 | EL test | 0.053 | 0.812 | 0.992 |

| n1, n2 (m = 40) | Method | c = (0, 0) (Size) | c = (0.1, 0.1) (Power) | c = (0.2, 0.2) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.732 | 0.981 |
| n1 = 10, n2 = 20 | t test | 0.049 | 0.841 | 0.991 |
| n1 = 10, n2 = 30 | t test | 0.052 | 0.872 | 0.992 |
| n1 = 20, n2 = 20 | t test | 0.050 | 0.942 | 0.993 |
| n1 = 20, n2 = 30 | t test | 0.049 | 0.951 | 1.000 |
| n1 = 30, n2 = 30 | t test | 0.055 | 0.985 | 1.000 |
| n1 = 10, n2 = 10 | EL test | 0.051 | 0.777 | 0.992 |
| n1 = 10, n2 = 20 | EL test | 0.052 | 0.850 | 0.996 |
| n1 = 10, n2 = 30 | EL test | 0.050 | 0.882 | 0.997 |
| n1 = 20, n2 = 20 | EL test | 0.051 | 0.947 | 1.000 |
| n1 = 20, n2 = 30 | EL test | 0.052 | 0.958 | 1.000 |
| n1 = 30, n2 = 30 | EL test | 0.052 | 0.989 | 1.000 |
Table 5. Simulated size and power with graphs generated from the multivariate Gamma distribution.

| n1, n2 (m = 20) | Method | c = (0, 0) (Size) | c = (0.05, 0.05) (Power) | c = (0.1, 0.1) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.147 | 0.159 |
| n1 = 10, n2 = 20 | t test | 0.053 | 0.151 | 0.165 |
| n1 = 10, n2 = 30 | t test | 0.049 | 0.231 | 0.171 |
| n1 = 20, n2 = 20 | t test | 0.055 | 0.254 | 0.256 |
| n1 = 20, n2 = 30 | t test | 0.052 | 0.266 | 0.269 |
| n1 = 30, n2 = 30 | t test | 0.049 | 0.279 | 0.295 |
| n1 = 10, n2 = 10 | EL test | 0.052 | 0.153 | 0.179 |
| n1 = 10, n2 = 20 | EL test | 0.054 | 0.158 | 0.332 |
| n1 = 10, n2 = 30 | EL test | 0.055 | 0.272 | 0.346 |
| n1 = 20, n2 = 20 | EL test | 0.050 | 0.283 | 0.353 |
| n1 = 20, n2 = 30 | EL test | 0.052 | 0.289 | 0.385 |
| n1 = 30, n2 = 30 | EL test | 0.051 | 0.340 | 0.410 |

| n1, n2 (m = 30) | Method | c = (0, 0) (Size) | c = (0.05, 0.05) (Power) | c = (0.1, 0.1) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.271 | 0.795 |
| n1 = 10, n2 = 20 | t test | 0.050 | 0.323 | 0.845 |
| n1 = 10, n2 = 30 | t test | 0.051 | 0.331 | 0.872 |
| n1 = 20, n2 = 20 | t test | 0.052 | 0.411 | 0.934 |
| n1 = 20, n2 = 30 | t test | 0.055 | 0.429 | 0.954 |
| n1 = 30, n2 = 30 | t test | 0.052 | 0.466 | 0.970 |
| n1 = 10, n2 = 10 | EL test | 0.049 | 0.338 | 0.812 |
| n1 = 10, n2 = 20 | EL test | 0.053 | 0.401 | 0.888 |
| n1 = 10, n2 = 30 | EL test | 0.051 | 0.432 | 0.912 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.467 | 0.950 |
| n1 = 20, n2 = 30 | EL test | 0.054 | 0.481 | 0.962 |
| n1 = 30, n2 = 30 | EL test | 0.052 | 0.552 | 0.989 |

| n1, n2 (m = 40) | Method | c = (0, 0) (Size) | c = (0.05, 0.05) (Power) | c = (0.1, 0.1) (Power) |
|---|---|---|---|---|
| n1 = 10, n2 = 10 | t test | 0.052 | 0.612 | 0.950 |
| n1 = 10, n2 = 20 | t test | 0.053 | 0.729 | 0.973 |
| n1 = 10, n2 = 30 | t test | 0.051 | 0.754 | 0.987 |
| n1 = 20, n2 = 20 | t test | 0.050 | 0.831 | 0.990 |
| n1 = 20, n2 = 30 | t test | 0.052 | 0.849 | 1.000 |
| n1 = 30, n2 = 30 | t test | 0.050 | 0.914 | 1.000 |
| n1 = 10, n2 = 10 | EL test | 0.051 | 0.677 | 0.959 |
| n1 = 10, n2 = 20 | EL test | 0.052 | 0.742 | 0.988 |
| n1 = 10, n2 = 30 | EL test | 0.054 | 0.777 | 0.995 |
| n1 = 20, n2 = 20 | EL test | 0.052 | 0.843 | 0.999 |
| n1 = 20, n2 = 30 | EL test | 0.051 | 0.864 | 1.000 |
| n1 = 30, n2 = 30 | EL test | 0.053 | 0.925 | 1.000 |
Table 6. p-values of the tests on the real-data graphs.

| m1, m2 | Method | n = 10 | n = 20 | n = 50 | n = 7129 |
|---|---|---|---|---|---|
| m1 = 47, m2 = 25 | t test | 0.423 | 0.231 | 0.115 | <0.001 |
| m1 = 47, m2 = 25 | EL test | 0.146 | 0.047 | 0.032 | <0.001 |