Review

Minimum Phi-Divergence Estimators and Phi-Divergence Test Statistics in Contingency Tables with Symmetry Structure: An Overview

by Leandro Pardo 1,* and Nirian Martín 2

1 Department of Statistics and O.R., Complutense University of Madrid, 28040 Madrid, Spain
2 Department of Statistics, Carlos III University of Madrid, 28903 Getafe (Madrid), Spain
* Author to whom correspondence should be addressed.
Symmetry 2010, 2(2), 1108-1120; https://doi.org/10.3390/sym2021108
Submission received: 12 March 2010 / Revised: 7 May 2010 / Accepted: 10 June 2010 / Published: 11 June 2010
(This article belongs to the Special Issue Feature Papers: Symmetry Concepts and Applications)

Abstract

In recent years, minimum phi-divergence estimators (MϕE) and phi-divergence test statistics (ϕTS) have been introduced as a very good alternative to the classical maximum likelihood estimator and likelihood ratio test for different statistical problems. The main purpose of this paper is to present an overview of the main results obtained so far for contingency tables with symmetry structure on the basis of MϕE and ϕTS.
Classification:
MSC 62B10, 62H15

1. Introduction

An interesting problem in a two-way contingency table is to investigate whether there are symmetric patterns in the data: cell probabilities on one side of the main diagonal are a mirror image of those on the other side. This problem was first discussed by Bowker [1], who gave the maximum likelihood estimator as well as a large-sample chi-square-type test for the null hypothesis of symmetry. The minimum discrimination information estimator was proposed in [2] and the minimum chi-squared estimator in [3]. In [4,5,6,7] new families of test statistics, based on ϕ-divergence measures, were introduced. These families contain as particular cases the test statistic given by [1] as well as the likelihood ratio test.
Let X and Y denote two ordinal response variables, each having I levels. When we classify subjects on both variables, there are I² possible combinations of classifications. The responses (X, Y) of a subject randomly chosen from some population have a probability distribution. Let p_ij = Pr(X = i, Y = j), with p_ij > 0, i, j = 1, ..., I. We display this distribution in a square table having I rows for the categories of X and I columns for the categories of Y. Consider a random sample of size n on (X, Y), and denote by n_ij the observed frequency in the (i, j)th cell, (i, j) ∈ I × I, with Σ_{i=1}^{I} Σ_{j=1}^{I} n_ij = n. The classical problem of testing for symmetry is given by
H_0: p_{ij} = p_{ji}, \quad (i,j) \in I \times I
versus
H_1^{*}: p_{ij} \neq p_{ji} \quad \text{for at least one } (i,j) \text{ pair}
This problem was considered for the first time by Bowker [1] using the Pearson test statistic
X^2 = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i<j}}^{I} \frac{(n_{ij} - n_{ji})^2}{n_{ij} + n_{ji}}
for which he established that X² is asymptotically distributed as χ²_k for large n, where k = I(I − 1)/2.
In some real problems (e.g., in medicine, psychology, sociology, etc.) the categorical response variables (X, Y) represent the measure after and before a treatment. In such situations our interest is to determine the treatment effect, i.e., whether X ⪰ Y (we assume that X represents the measure after the treatment and Y the measure before the treatment). In the following we understand that X is preferred or indifferent to Y, according to the joint likelihood ratio ordering, if and only if (iff) p_ij ≥ p_ji for all i ≥ j. In this situation the alternative hypothesis is
H_1: p_{ij} \geq p_{ji} \quad \text{for all } i \geq j
This problem was first considered by El Barmi and Kochar [8], who presented the likelihood ratio test for the problem of testing
H_0: p_{ij} = p_{ji} \quad \text{against} \quad H_1: p_{ij} \geq p_{ji}, \ i \geq j
and considered its application to a real-life problem: they tested whether the vision of the two eyes, for 7477 women, is the same against the alternative that the right eye has better vision than the left eye. In [5] these results were extended using ϕ-divergence measures.
In this paper we present an overview of contingency tables with symmetry structure on the basis of divergence measures. We pay special attention to the family of ϕ-divergence test statistics for testing H_0 versus H_1^*, H_0 against H_1, and also for testing H_1 against the alternative H_2 of no restrictions on the p_ij's, i.e.,
H_1: p_{ij} \geq p_{ji}, \ i \geq j, \quad \text{against} \quad H_2: \text{no restriction on the } p_{ij}
It is interesting to observe that we consider not only ϕ-divergence test statistics but also minimum ϕ-divergence estimators in order to estimate the parameters of the model.

2. Phi-divergence Measures

We consider the set
\Theta = \left\{ \theta = \left(p_{ij};\ 1 \le i, j \le I,\ (i,j) \neq (I,I)\right)^{T} : p_{ij} > 0 \ \text{and} \ \sum_{i=1}^{I} \sum_{\substack{j=1 \\ (i,j) \neq (I,I)}}^{I} p_{ij} < 1 \right\}

and we denote p(θ) = p = (p_11, ..., p_II)^T, with p_II = 1 − Σ_{i=1}^{I} Σ_{j=1, (i,j)≠(I,I)}^{I} p_ij, or equivalently the I × I matrix p(θ) = [p_ij].
The ϕ -divergence between two probability distributions p = [ p i j ] , q = [ q i j ] was introduced independently by [9] and [10]. It is defined as follows:
D_{\phi}(p, q) = \sum_{i=1}^{I} \sum_{j=1}^{I} q_{ij}\, \phi\!\left(\frac{p_{ij}}{q_{ij}}\right), \quad \phi \in \Phi^{*}
where Φ* is the class of all convex functions ϕ : [0, ∞) → ℝ such that ϕ(1) = 0 and ϕ''(1) > 0, and we adopt the conventions 0·ϕ(0/0) = 0 and 0·ϕ(p/0) = p·lim_{u→∞} ϕ(u)/u. For every ϕ ∈ Φ* that is differentiable at x = 1, the function ψ given by
\psi(x) = \phi(x) - \phi'(1)(x - 1)
also belongs to Φ*. Then we have D_ψ(p, q) = D_ϕ(p, q), and ψ has the additional property that ψ'(1) = 0. Because the two divergence measures are equivalent, we can consider the set Φ* to be equivalent to the set
\Phi \equiv \Phi^{*} \cap \{\phi : \phi'(1) = 0\}
An important family of ϕ-divergences in statistical problems is the power-divergence family

\phi_{\lambda}(x) = \frac{x^{\lambda+1} - x + \lambda(1 - x)}{\lambda(\lambda + 1)}, \quad \lambda \neq 0, \ \lambda \neq -1

\phi_{0}(x) = \lim_{\lambda \to 0} \phi_{\lambda}(x) = x \ln x - x + 1

\phi_{-1}(x) = \lim_{\lambda \to -1} \phi_{\lambda}(x) = -\ln x + x - 1

which was introduced and studied by [11]. Notice that ϕ_λ ∈ Φ. In the following we shall denote the power-divergence measures by D_{ϕ_λ}(p, q), λ ∈ ℝ. For more details about ϕ-divergence measures see [12].
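As a purely illustrative aid (not part of the original paper), the following minimal Python sketch evaluates the power-divergence function ϕ_λ and the associated divergence D_{ϕ_λ}(p, q); it assumes NumPy, and the function names phi_lambda and phi_divergence are ours.

import numpy as np

def phi_lambda(x, lam):
    # Power-divergence function phi_lambda(x); the limits lam = 0 and lam = -1 are handled separately.
    x = np.asarray(x, dtype=float)
    if lam == 0:
        return x * np.log(x) - x + 1.0
    if lam == -1:
        return -np.log(x) + x - 1.0
    return (x ** (lam + 1) - x + lam * (1.0 - x)) / (lam * (lam + 1))

def phi_divergence(p, q, lam):
    # D_phi(p, q) = sum_ij q_ij * phi(p_ij / q_ij); assumes all entries of q are strictly positive.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(q * phi_lambda(p / q, lam)))

# Toy probability matrices (made up): the divergence is 0 iff p = q.
p = np.array([[0.20, 0.10], [0.30, 0.40]])
q = np.array([[0.25, 0.15], [0.25, 0.35]])
print(phi_divergence(p, p, 0.0))   # 0.0 (Kullback-Leibler case, lam = 0)
print(phi_divergence(p, q, 1.0))   # lam = 1 gives one half of the Pearson chi-square divergence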

3. Hypothesis Testing: H_0 versus H_1^*

We define
B = \left\{ (a_{11}, \ldots, a_{1I}, a_{22}, \ldots, a_{2I}, \ldots, a_{I-1\,I-1}, a_{I-1\,I})^{T} \in \mathbb{R}_{+}^{I(I+1)/2 - 1} : \sum_{i \le j} a_{ij} < 1 \right\}
Then the hypothesis (1) can be written as
H_0: \theta = g(\beta), \quad \beta = (p_{11}, \ldots, p_{1I}, p_{22}, \ldots, p_{2I}, \ldots, p_{I-1\,I-1}, p_{I-1\,I})^{T} \in B
where the function g is defined by g(β) = (g_ij(β); i, j = 1, ..., I, (i, j) ≠ (I, I))^T with
g_{ij}(\beta) = \begin{cases} p_{ij}, & i \le j \\ p_{ji}, & i > j \end{cases} \qquad i, j = 1, \ldots, I-1
and
g_{Ij}(\beta) = p_{jI}, \ j = 1, \ldots, I-1, \qquad g_{iI}(\beta) = p_{iI}, \ i = 1, \ldots, I-1
Note that p(g(β)) = (g_ij(β); i, j = 1, ..., I)^T, where
g_{II}(\beta) = 1 - \sum_{\substack{i, j = 1 \\ (i,j) \neq (I,I)}}^{I} g_{ij}(\beta)
The maximum likelihood estimator (MLE) of β can be defined as
\hat{\beta} = \arg\min_{\beta \in B} D_{KL}\left(\hat{p}, p(g(\beta))\right) \quad \text{a.s.}
where D K L ( p ^ , p ( g ( β ) ) ) is the Kullback–Leibler divergence measure (see [13,14]) defined by
D_{KL}\left(\hat{p}, p(g(\beta))\right) = \sum_{i=1}^{I} \sum_{j=1}^{I} \hat{p}_{ij} \log \frac{\hat{p}_{ij}}{g_{ij}(\beta)}
We denote θ̂ = g(β̂) and p(θ̂) = (p_11(θ̂), ..., p_II(θ̂))^T. It is well known that p_ij(θ̂) = (p̂_ij + p̂_ji)/2, i, j = 1, ..., I. Using the ideas developed in [15], we can consider the minimum ϕ_2-divergence estimator (Mϕ_2E), replacing the Kullback–Leibler divergence by a ϕ_2-divergence measure, in the following way:
\hat{\beta}_{\phi_2} = \arg\min_{\beta \in B} D_{\phi_2}\left(\hat{p}, p(g(\beta))\right), \quad \phi_2 \in \Phi^{*}
where
D_{\phi_2}\left(\hat{p}, p(g(\beta))\right) = \sum_{i=1}^{I} \sum_{j=1}^{I} g_{ij}(\beta)\, \phi_2\!\left(\frac{\hat{p}_{ij}}{g_{ij}(\beta)}\right)
We denote θ̂^{S,ϕ_2} = g(β̂_{ϕ_2}) and we have (see [7,16])
\sqrt{n}\left(g(\hat{\beta}_{\phi_2}) - g(\beta)\right) \xrightarrow[n \to \infty]{\ L\ } N\!\left(0,\ I_F^{S}(\beta)^{-1}\right)
where
I_F^{S}(\theta) = \Sigma_{\theta} - \Sigma_{\theta} B(\theta)^{T} \left(B(\theta)\, \Sigma_{\theta}\, B(\theta)^{T}\right)^{-1} B(\theta)\, \Sigma_{\theta}
where Σ_θ = diag(θ) − θθ^T and B(θ) = (∂h_ij(θ)/∂θ) is of order I(I − 1)/2 × (I² − 1). The functions h_ij are given by
h_{ij}(\theta) = p_{ij} - p_{ji}, \quad i < j, \ i = 1, \ldots, I-1, \ j = 1, \ldots, I
It is not difficult to establish that the matrix I F S ( θ ) can be written as
I_F^{S}(\theta) = M_{\beta}^{T}\, I_F(\beta)^{-1} M_{\beta}
where I_F(β) is the Fisher information matrix corresponding to β ∈ B.
If we consider the family of power divergences, we get the minimum power-divergence estimator θ̂^{S,λ} of θ under the hypothesis of symmetry, whose expression is given by
\hat{\theta}_{ij}^{S,\lambda} = \frac{\left(\frac{\hat{p}_{ij}^{\lambda+1} + \hat{p}_{ji}^{\lambda+1}}{2}\right)^{\frac{1}{\lambda+1}}}{\sum_{i=1}^{I} \sum_{j=1}^{I} \left(\frac{\hat{p}_{ij}^{\lambda+1} + \hat{p}_{ji}^{\lambda+1}}{2}\right)^{\frac{1}{\lambda+1}}}, \quad i, j = 1, \ldots, I
For λ = 0 we get
\hat{\theta}_{ij}^{S,0} = \frac{\hat{p}_{ij} + \hat{p}_{ji}}{2}, \quad i, j = 1, \ldots, I
hence we obtain the maximum likelihood estimator for symmetry introduced by [1]. For λ = −1 we obtain, as a limit case,
\hat{\theta}_{ij}^{S,-1} = \frac{\left(\hat{p}_{ij}\, \hat{p}_{ji}\right)^{1/2}}{\sum_{i=1}^{I} \sum_{j=1}^{I} \left(\hat{p}_{ij}\, \hat{p}_{ji}\right)^{1/2}}, \quad i, j = 1, \ldots, I
i.e., the minimum discrimination information estimator for symmetry introduced and studied in [2]. For λ = 1 we get the minimum chi-squared estimator for symmetry introduced in [3],
\hat{\theta}_{ij}^{S,1} = \frac{\left(\frac{\hat{p}_{ij}^{2} + \hat{p}_{ji}^{2}}{2}\right)^{1/2}}{\sum_{i=1}^{I} \sum_{j=1}^{I} \left(\frac{\hat{p}_{ij}^{2} + \hat{p}_{ji}^{2}}{2}\right)^{1/2}}, \quad i, j = 1, \ldots, I
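These closed-form estimators are straightforward to compute from the observed proportions. The following minimal sketch is ours (not code from the paper); it assumes NumPy, the count matrix is made up, and it recovers the three special cases above.

import numpy as np

def min_power_div_symmetry(counts, lam):
    # Minimum power-divergence estimator of the cell probabilities under symmetry.
    p_hat = np.asarray(counts, dtype=float)
    p_hat = p_hat / p_hat.sum()
    if lam == -1:
        num = np.sqrt(p_hat * p_hat.T)                       # geometric-mean limit case
    else:
        num = ((p_hat ** (lam + 1) + p_hat.T ** (lam + 1)) / 2.0) ** (1.0 / (lam + 1))
    return num / num.sum()                                   # normalize so the cells sum to 1

counts = np.array([[20,  8,  4],
                   [12, 30,  6],
                   [ 6, 10, 25]])
print(min_power_div_symmetry(counts, 0))    # equals (p_hat_ij + p_hat_ji)/2: the MLE under symmetry
print(min_power_div_symmetry(counts, 1))    # minimum chi-squared estimator
print(min_power_div_symmetry(counts, -1))   # minimum discrimination information estimator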
We denote θ̂_{ϕ_2} = g(β̂_{ϕ_2}) and by

p(\hat{\theta}_{\phi_2}) = \left(p_{11}(\hat{\theta}_{\phi_2}), \ldots, p_{II}(\hat{\theta}_{\phi_2})\right)^{T}

the Mϕ_2E of the probability vector that characterizes the symmetry model. Based on p(θ̂_{ϕ_2}) it is possible to define a new family of statistics for testing (1) that contains as particular cases the Pearson test statistic as well as the likelihood ratio test. This family of statistics is given by
T_n^{\phi_1}(\hat{\theta}_{\phi_2}) \equiv \frac{2n}{\phi_1''(1)}\, D_{\phi_1}\left(\hat{p}, p(\hat{\theta}_{\phi_2})\right) = \frac{2n}{\phi_1''(1)} \sum_{i=1}^{I} \sum_{j=1}^{I} p_{ij}(\hat{\theta}_{\phi_2})\, \phi_1\!\left(\frac{\hat{p}_{ij}}{p_{ij}(\hat{\theta}_{\phi_2})}\right)
We can observe that the family (13) involves two functions ϕ_1 and ϕ_2, both belonging to Φ*. We use the function ϕ_2 to obtain the Mϕ_2E and ϕ_1 to obtain the family of statistics. If we consider ϕ_1(x) = ½(x − 1)² and ϕ_2(x) = x log x − x + 1 we get the Pearson test statistic, whose expression was given in (3), and for ϕ_1(x) = ϕ_2(x) = x log x − x + 1 we get the likelihood ratio test given by
G^2 = 2 \sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} n_{ij} \log \frac{2\, n_{ij}}{n_{ij} + n_{ji}}
In the following theorem the asymptotic distribution of T_n^{ϕ_1}(θ̂_{ϕ_2}) is obtained.
Theorem 1
The asymptotic distribution of T_n^{ϕ_1}(θ̂_{ϕ_2}) is chi-squared with m = I(I − 1)/2 degrees of freedom.
Proof. 
See Chapter 8 in [12].
Thus, for a given significance level α ∈ (0, 1), the critical value of T_n^{ϕ_1}(θ̂_{ϕ_2}) may be approximated by χ²_{m,α}, the upper 100α percentile of the chi-square distribution with m degrees of freedom; i.e., we reject the hypothesis of symmetry iff
T_n^{\phi_1}(\hat{\theta}_{\phi_2}) > \chi^{2}_{m,\alpha}
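For the two classical members of this family, Bowker's X² and the likelihood ratio G², the test is easy to carry out numerically. The following sketch is ours, not code from the paper; it assumes NumPy/SciPy, strictly positive off-diagonal counts, and a made-up count matrix.

import numpy as np
from scipy.stats import chi2

def symmetry_tests(counts, alpha=0.05):
    # Pearson (Bowker) X^2 and likelihood-ratio G^2 for H0: p_ij = p_ji.
    n = np.asarray(counts, dtype=float)
    I = n.shape[0]
    iu = np.triu_indices(I, k=1)                 # index pairs with i < j
    nij, nji = n[iu], n.T[iu]
    X2 = np.sum((nij - nji) ** 2 / (nij + nji))
    G2 = 2.0 * np.sum(nij * np.log(2 * nij / (nij + nji))
                      + nji * np.log(2 * nji / (nij + nji)))
    m = I * (I - 1) // 2                         # degrees of freedom
    crit = chi2.ppf(1.0 - alpha, m)
    return {"X2": X2, "G2": G2, "df": m, "critical value": crit,
            "p-value X2": chi2.sf(X2, m), "p-value G2": chi2.sf(G2, m)}

counts = np.array([[20,  8,  4],
                   [12, 30,  6],
                   [ 6, 10, 25]])
print(symmetry_tests(counts))   # reject symmetry whenever the statistic exceeds the critical value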
Now we are going to analyze the power of the test. Let q = (q_11, ..., q_II)^T be a point in the alternative hypothesis, i.e., there exist at least two indexes i and j for which q_ij ≠ q_ji. We denote by θ_a^{ϕ_2} the point in Θ verifying
\theta_a^{\phi_2} = \arg\min_{\theta \in \Theta_0} D_{\phi_2}\left(q, p(\theta)\right)
where Θ 0 is given by
\Theta_0 = \{\theta \in \Theta : \theta = g(\beta) \ \text{for some} \ \beta \in B\}
It is clear that
\theta_a^{\phi_2} = \left(f_{ij}(q);\ i, j = 1, \ldots, I,\ (i,j) \neq (I,I)\right)^{T}
and
p(\theta_a^{\phi_2}) = \left(f_{ij}(q);\ i, j = 1, \ldots, I\right)^{T} \equiv f(q)
with
f_{II}(q) = 1 - \sum_{\substack{i, j = 1 \\ (i,j) \neq (I,I)}}^{I} f_{ij}(q)
The notation f_ij(q) indicates that the elements of the vector θ_a^{ϕ_2} depend on q. For instance, for the power-divergence family ϕ_λ(x) we have
f_{ij}(q) = \frac{\left(q_{ij}^{\lambda+1} + q_{ji}^{\lambda+1}\right)^{\frac{1}{\lambda+1}}}{\sum_{i=1}^{I} \sum_{j=1}^{I} \left(q_{ij}^{\lambda+1} + q_{ji}^{\lambda+1}\right)^{\frac{1}{\lambda+1}}}, \quad i, j = 1, \ldots, I
We also denote
\hat{\theta}^{S,\phi_2} = \left(p_{ij}^{S,\phi_2};\ i, j = 1, \ldots, I,\ (i,j) \neq (I,I)\right)^{T}
and then
p(\hat{\theta}^{S,\phi_2}) = \left(p_{ij}^{S,\phi_2};\ i, j = 1, \ldots, I\right)^{T} \equiv f(\hat{p})
where f = (f_ij; i, j = 1, ..., I)^T. If the alternative q is true, we have that p̂ tends to q and p(θ̂^{S,ϕ_2}) tends to p(θ_a^{ϕ_2}) in probability.
If we define the function
\Psi_{\phi_1}(q) = D_{\phi_1}\left(q, f(q)\right)
we have
\Psi_{\phi_1}(\hat{p}) = \Psi_{\phi_1}(q) + \sum_{i=1}^{I} \sum_{j=1}^{I} \frac{\partial D_{\phi_1}(q, f(q))}{\partial q_{ij}} \left(\hat{p}_{ij} - q_{ij}\right) + o\!\left(\|\hat{p} - q\|\right)
Then the random variables
\sqrt{n}\left(D_{\phi_1}(\hat{p}, f(\hat{p})) - D_{\phi_1}(q, f(q))\right)
and
\sqrt{n} \sum_{i=1}^{I} \sum_{j=1}^{I} \frac{\partial D_{\phi_1}(q, f(q))}{\partial q_{ij}} \left(\hat{p}_{ij} - q_{ij}\right)
have the same asymptotic distribution. If we define
l_{ij} = \frac{\partial D_{\phi_1}(q, f(q))}{\partial q_{ij}}
and l = (l_ij; i, j = 1, ..., I)^T, we have
\sqrt{n}\left(D_{\phi_1}(\hat{p}, f(\hat{p})) - D_{\phi_1}(q, f(q))\right) \xrightarrow[n \to \infty]{\ L\ } N\!\left(0,\ l^{T} \Sigma_{q}\, l\right)
where Σ_q = diag(q) − q q^T.
If we consider the maximum likelihood estimator instead of the minimum ϕ-divergence estimator, we get
l^{T} \Sigma_{q}\, l = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} q_{ij} \left(m_{ij}^{\phi_1}\right)^{2} - \left(\sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} q_{ij}\, m_{ij}^{\phi_1}\right)^{2}
where
m_{ij}^{\phi_1} = \frac{1}{2}\,\phi_1\!\left(\frac{2 q_{ij}}{q_{ij} + q_{ji}}\right) + \frac{1}{2}\,\phi_1\!\left(\frac{2 q_{ji}}{q_{ij} + q_{ji}}\right) + \frac{q_{ji}}{q_{ij} + q_{ji}}\left[\phi_1'\!\left(\frac{2 q_{ij}}{q_{ij} + q_{ji}}\right) - \phi_1'\!\left(\frac{2 q_{ji}}{q_{ij} + q_{ji}}\right)\right]
If we consider the power-divergence measure, it is also interesting to observe that
m_{ij}^{\lambda} = \frac{1}{2 \lambda (\lambda + 1)}\left[\left(\frac{2 p_{ij}}{p_{ij} + p_{ji}}\right)^{\lambda+1} + \left(\frac{2 p_{ji}}{p_{ij} + p_{ji}}\right)^{\lambda+1} - 2\right] + \frac{1}{\lambda}\,\frac{p_{ji}}{p_{ij} + p_{ji}}\left[\left(\frac{2 p_{ij}}{p_{ij} + p_{ji}}\right)^{\lambda} - \left(\frac{2 p_{ji}}{p_{ij} + p_{ji}}\right)^{\lambda}\right]
For λ → 0 and λ = 1 we get
m_{ij}^{0} = \log \frac{2 p_{ij}}{p_{ij} + p_{ji}} \quad \text{and} \quad m_{ij}^{1} = \frac{p_{ij}^{2} - 3 p_{ji}^{2} + 2 p_{ij} p_{ji}}{2 (p_{ij} + p_{ji})^{2}}
respectively. Therefore, the corresponding asymptotic variances are given by
\sigma_{(0)}^{2} = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{ij} \left(\log \frac{2 p_{ij}}{p_{ij} + p_{ji}}\right)^{2} - \left(\sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{ij} \log \frac{2 p_{ij}}{p_{ij} + p_{ji}}\right)^{2}
and
\sigma_{(1)}^{2} = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{ij} \left(\frac{p_{ij}^{2} - 3 p_{ji}^{2} + 2 p_{ij} p_{ji}}{2 (p_{ij} + p_{ji})^{2}}\right)^{2} - \left(\sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} p_{ij}\, \frac{p_{ij}^{2} - 3 p_{ji}^{2} + 2 p_{ij} p_{ji}}{2 (p_{ij} + p_{ji})^{2}}\right)^{2}
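As a quick numerical illustration of the two variance expressions above, the following sketch (ours, not from the paper; it assumes NumPy, and the asymmetric probability matrix q is made up) evaluates σ_(0)² and σ_(1)² directly:

import numpy as np

def asymptotic_variances(q):
    # sigma_(0)^2 and sigma_(1)^2 for the lambda = 0 (likelihood ratio) and lambda = 1 cases.
    q = np.asarray(q, dtype=float)
    off = ~np.eye(q.shape[0], dtype=bool)                     # off-diagonal cells i != j
    m0 = np.log(2.0 * q / (q + q.T))
    m1 = (q ** 2 - 3.0 * q.T ** 2 + 2.0 * q * q.T) / (2.0 * (q + q.T) ** 2)
    def var(m):
        return float(np.sum(q[off] * m[off] ** 2) - np.sum(q[off] * m[off]) ** 2)
    return var(m0), var(m1)

q = np.array([[0.20, 0.15, 0.05],
              [0.05, 0.25, 0.10],
              [0.05, 0.05, 0.10]])
print(asymptotic_variances(q))    # both variances are zero when q is exactly symmetric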
Based on the previous result we can formulate the following theorem.
Theorem 2
The asymptotic power of the test given in (15), at the alternative q, is given by
\beta_{n, \phi_1, \phi_2}(q) = 1 - \Phi_n\!\left(\frac{1}{\sqrt{l^{T} \Sigma_{q}\, l}}\left(\frac{\phi_1''(1)}{2 \sqrt{n}}\, \chi^{2}_{m,\alpha} - \sqrt{n}\, D_{\phi_1}\!\left(q, p(\theta_a^{\phi_2})\right)\right)\right)
where Φ_n(x) is a sequence of distribution functions tending uniformly to the standard normal distribution function Φ(x).
We now consider a contiguous sequence of alternative hypotheses that approaches the null hypothesis H_0: θ = g(β), for some unknown β ∈ B, at the rate O(n^{−1/2}). Consider the multinomial probability vector
p_{n,ij} = p_{ij}(g(\beta)) + \frac{d_{ij}}{\sqrt{n}}, \quad i, j = 1, \ldots, I
where d = (d_11, ..., d_II)^T is a fixed I² × 1 vector such that Σ_{i=1}^{I} Σ_{j=1}^{I} d_ij = 0; recall that n is the total count of the multinomial distribution and β ∈ B. As n → ∞, the sequence of multinomial probabilities {p_n}_{n∈ℕ}, with p_n = (p_{n,ij}; i, j = 1, ..., I)^T, converges to a multinomial probability in H_0 at the rate O(n^{−1/2}). Let
H_{1,n}: p_n = p(g(\beta)) + \frac{d}{\sqrt{n}}, \quad \beta \in B
In the next theorem we present the asymptotic distribution of the family of test statistics T_n^{ϕ_1}(θ̂_{ϕ_2}) defined in (13) under the contiguous alternative hypotheses given in (18).
Theorem 3
Under H_{1,n}, given in (18), the family of test statistics T_n^{ϕ_1}(θ̂_{ϕ_2}) is asymptotically noncentral chi-squared distributed with I(I − 1)/2 degrees of freedom and noncentrality parameter
\delta = \frac{1}{2} \sum_{i=1}^{I} \sum_{\substack{j=1 \\ j \neq i}}^{I} \frac{d_{ij}^{2}}{p_{ij}} - \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i < j}}^{I} \frac{d_{ij}\, d_{ji}}{p_{ij}}
Proof. 
See Chapter 8 in [12].
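Theorem 3 suggests a simple recipe for approximating local power. The sketch below is ours, not from the paper (it assumes NumPy/SciPy; the symmetric matrix p0 and the drift matrix d, whose entries sum to zero, are made up). For a symmetric p the noncentrality parameter above can be rewritten as δ = Σ_{i<j} (d_ij − d_ji)²/(2 p_ij), which is the form used in the code.

import numpy as np
from scipy.stats import chi2, ncx2

def local_power(p_sym, d, alpha=0.05):
    # Approximate power of the symmetry test under p_n = p + d / sqrt(n), via Theorem 3.
    p, d = np.asarray(p_sym, dtype=float), np.asarray(d, dtype=float)
    I = p.shape[0]
    iu = np.triu_indices(I, k=1)
    delta = np.sum((d[iu] - d.T[iu]) ** 2 / (2.0 * p[iu]))    # noncentrality parameter
    m = I * (I - 1) // 2
    crit = chi2.ppf(1.0 - alpha, m)
    return ncx2.sf(crit, m, delta)                            # P(noncentral chi-square > critical value)

p0 = np.array([[0.25, 0.10, 0.05],
               [0.10, 0.20, 0.05],
               [0.05, 0.05, 0.15]])
d = np.array([[ 0.0,  0.2,  0.1],
              [-0.2,  0.0,  0.1],
              [-0.1, -0.1,  0.0]])
print(local_power(p0, d))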
An interesting simulation study can be found in [7]. In that study, some appealing alternatives to the classical Pearson and likelihood ratio test statistics emerge.

4. Hypothesis Testing: H_0 versus H_1 and H_1 versus H_2

In this section we consider the three hypotheses H_0, H_1 and H_2 given in (1), (4) and (5), respectively, and some test statistics based on the ϕ-divergences
D_{\phi}\left(\hat{p}, \hat{p}^{(0)}\right) \quad \text{and} \quad D_{\phi}\left(\hat{p}, \hat{p}^{(1)}\right)
for testing H 0 against H 1 and H 1 against H 2 .
In the expression (19), p̂ is the maximum likelihood estimator (MLE) of p, given by p̂ = (p̂_ij) with p̂_ij = n_ij/n, and p̂^(0) and p̂^(1) denote the MLEs of p under H_0 and H_1, respectively. These MLEs were obtained by [8]. Let
\theta_{ij} = \frac{p_{ij}}{p_{ij} + p_{ji}}, \quad \text{for } i > j
then H_0: θ_ij = 1/2 (for i > j) and H_1: θ_ij ≥ 1/2 (for i > j), and
\hat{\theta}_{ij} = \frac{n_{ij}}{n_{ij} + n_{ji}}, \qquad \hat{\theta}_{ij}^{(0)} = \frac{1}{2}, \qquad \hat{\theta}_{ij}^{(1)} = \max\!\left(\frac{n_{ij}}{n_{ij} + n_{ji}}, \frac{1}{2}\right) \quad \text{for } i > j
It follows that p ^ ( 0 ) and p ^ ( 1 ) are given by
\hat{p}_{ij}^{(0)} = \frac{n_{ij} + n_{ji}}{2n} \quad \text{and} \quad \hat{p}_{ij}^{(1)} = \frac{n_{ij} + n_{ji}}{n}\, \max\!\left(\frac{n_{ij}}{n_{ij} + n_{ji}}, \frac{1}{2}\right)
Then we have
D_{\phi}\left(\hat{p}, \hat{p}^{(0)}\right) = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i > j}}^{I} \frac{n_{ij} + n_{ji}}{2n}\left[\phi\!\left(2\hat{\theta}_{ij}\right) + \phi\!\left(2\left(1 - \hat{\theta}_{ij}\right)\right)\right]
and
D_{\phi}\left(\hat{p}, \hat{p}^{(1)}\right) = \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i > j}}^{I} \frac{n_{ij} + n_{ji}}{n}\left[\hat{\theta}_{ij}^{(1)}\, \phi\!\left(\frac{\hat{\theta}_{ij}}{\hat{\theta}_{ij}^{(1)}}\right) + \left(1 - \hat{\theta}_{ij}^{(1)}\right) \phi\!\left(\frac{1 - \hat{\theta}_{ij}}{1 - \hat{\theta}_{ij}^{(1)}}\right)\right]
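The restricted estimates p̂^(0) and p̂^(1) reduce to elementary cell-by-cell operations. A minimal sketch (ours, not from the paper; it assumes NumPy, a made-up count matrix and strictly positive off-diagonal counts):

import numpy as np

def restricted_mles(counts):
    # MLEs of p: unrestricted, under H0 (symmetry) and under H1 (p_ij >= p_ji for i > j).
    nmat = np.asarray(counts, dtype=float)
    n = nmat.sum()
    p_hat = nmat / n
    p_hat0 = (nmat + nmat.T) / (2.0 * n)          # symmetrized estimate under H0
    p_hat1 = p_hat.copy()
    I = nmat.shape[0]
    for i in range(I):
        for j in range(i):                        # cells with i > j
            tot = nmat[i, j] + nmat[j, i]
            t1 = max(nmat[i, j] / tot, 0.5)       # theta_hat^(1)_ij = max(theta_hat_ij, 1/2)
            p_hat1[i, j] = tot / n * t1
            p_hat1[j, i] = tot / n * (1.0 - t1)
    return p_hat, p_hat0, p_hat1

counts = np.array([[20,  8,  4],
                   [12, 30,  6],
                   [ 6, 10, 25]])
for name, est in zip(("p_hat", "p_hat(0)", "p_hat(1)"), restricted_mles(counts)):
    print(name, est, sep="\n")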
To solve the problem of testing H_1 against H_2, [8] considered the likelihood ratio test statistic
T_{12} = 2 \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i > j}}^{I} \left[n_{ij} \ln \hat{\theta}_{ij} + n_{ji} \ln\left(1 - \hat{\theta}_{ij}\right) - n_{ij} \ln \hat{\theta}_{ij}^{(1)} - n_{ji} \ln\left(1 - \hat{\theta}_{ij}^{(1)}\right)\right]
This statistic is such that
T_{12} = 2n\, D_{KL}\left(\hat{p}, \hat{p}^{(1)}\right)
where D_KL(p̂, p̂^(1)) is the Kullback–Leibler divergence, given by (20) with ϕ(x) = ϕ_0(x) as defined above. Thus the likelihood ratio test statistic is based on the closeness, in terms of the Kullback–Leibler divergence measure, between the probability distributions p̂ and p̂^(1). One could therefore measure the closeness between the two probability distributions by a more general divergence measure, provided its asymptotic distribution can be obtained. One appropriate family of divergence measures for that purpose is the family of ϕ-divergence measures.
As a generalization of the test statistic given in (20) for testing H 1 against H 2 we introduce the family of test statistics
T_{12}^{\phi} = \frac{2n}{\phi''(1)}\, D_{\phi}\left(\hat{p}, \hat{p}^{(1)}\right)
To test H_0 against H_1, El Barmi and Kochar [8] considered the likelihood ratio test statistic
T_{01} = 2 \sum_{i=1}^{I} \sum_{\substack{j=1 \\ i > j}}^{I} \left[n_{ij} \ln \hat{\theta}_{ij}^{(1)} + n_{ji} \ln\left(1 - \hat{\theta}_{ij}^{(1)}\right) - n_{ij} \ln \tfrac{1}{2} - n_{ji} \ln \tfrac{1}{2}\right]
It is clear that
T_{01} = 2n\left[D_{KL}\left(\hat{p}, \hat{p}^{(0)}\right) - D_{KL}\left(\hat{p}, \hat{p}^{(1)}\right)\right]
As a generalization of this test statistic we consider in this paper the family of test statistics
T_{01}^{\phi} = \frac{2n}{\phi''(1)}\left[D_{\phi}\left(\hat{p}, \hat{p}^{(0)}\right) - D_{\phi}\left(\hat{p}, \hat{p}^{(1)}\right)\right]
If ϕ = ϕ_0 then T_12^ϕ = T_12 and T_01^ϕ = T_01, and hence the families of test statistics T_12^ϕ and T_01^ϕ can be considered as generalizations of the test statistics T_12 and T_01, respectively.
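Putting the pieces together, T_01^ϕ and T_12^ϕ can be computed cell by cell from θ̂_ij and θ̂_ij^(1). The sketch below is ours, not from the paper (assuming NumPy, a made-up count matrix and strictly positive off-diagonal counts); it uses the power-divergence function ϕ_(λ), for which ϕ''(1) = 1, so that λ = 0 reproduces the likelihood ratio statistics T_01 and T_12 of [8].

import numpy as np

def phi_lam(x, lam):
    # Power-divergence function phi_lambda (for x > 0); phi''(1) = 1 for every lambda.
    if lam == 0:
        return x * np.log(x) - x + 1.0
    if lam == -1:
        return -np.log(x) + x - 1.0
    return (x ** (lam + 1) - x + lam * (1.0 - x)) / (lam * (lam + 1))

def T01_T12(counts, lam):
    nmat = np.asarray(counts, dtype=float)
    n = nmat.sum()
    I = nmat.shape[0]
    D0 = D1 = 0.0
    for i in range(I):
        for j in range(i):                              # cells with i > j
            tot = nmat[i, j] + nmat[j, i]
            th = nmat[i, j] / tot
            th1 = max(th, 0.5)
            D0 += tot / (2 * n) * (phi_lam(2 * th, lam) + phi_lam(2 * (1 - th), lam))
            D1 += tot / n * (th1 * phi_lam(th / th1, lam)
                             + (1 - th1) * phi_lam((1 - th) / (1 - th1), lam))
    return 2 * n * (D0 - D1), 2 * n * D1                # (T_01^phi, T_12^phi), since phi''(1) = 1

counts = np.array([[20,  8,  4],
                   [12, 30,  6],
                   [ 6, 10, 25]])
print(T01_T12(counts, lam=0))    # likelihood ratio statistics T_01 and T_12
print(T01_T12(counts, lam=1))    # Pearson-type counterparts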
In order to obtain the asymptotic distributions of the test statistics given in (21) and (22), we first define the so-called chi-bar-squared distribution with n degrees of freedom, denoted by χ̄²_n.
Definition 4
Let U = max(0, Z), where Z ~ N(0, 1), so that the c.d.f. of U is given by
F_U(u) = \begin{cases} \Phi(u), & u \geq 0 \\ 0, & u < 0 \end{cases}
where Φ denotes the standard normal cumulative distribution function. Let V = Σ_{k=1}^{n} U_k², where U_1, ..., U_n are independent and distributed like U; then V ~ χ̄²_n.
It is readily shown that
E(V) = \frac{n}{2} \quad \text{and} \quad \operatorname{Var}(V) = \frac{5n}{4}
This distribution is related to the χ² distribution. It can be readily shown that
\Pr(V > v) = \sum_{l=0}^{n} \binom{n}{l} \left(\frac{1}{2}\right)^{n} \Pr\left(\chi^{2}_{l} > v\right)
by conditioning on L, the number of non-zero U_i's (with the convention that χ²_0 ≡ 0).
Furthermore, like the χ²_n distribution, the χ̄²_n distribution is stochastically increasing in n. If V ~ χ̄²_n and V′ ~ χ̄²_{n′}, where n < n′, then V is stochastically smaller than V′: Pr(V′ > t) ≥ Pr(V > t) for all t. This follows since V′ is distributed as V + W, where W ~ χ̄²_{n′−n} with V and W independent. For more details about the chi-bar-squared distribution see [17].
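The mixture representation above translates directly into code. A minimal sketch (ours, not from the paper; it assumes SciPy):

from math import comb
from scipy.stats import chi2

def chibar2_sf(v, n):
    # Pr(chi-bar^2_n > v) = sum_{l=0}^{n} C(n, l) (1/2)^n Pr(chi^2_l > v), with chi^2_0 identically 0.
    return sum(comb(n, l) * 0.5 ** n * (chi2.sf(v, l) if l > 0 else float(v < 0))
               for l in range(n + 1))

# The tail probability increases with n, illustrating the stochastic ordering noted above.
for n in (2, 3, 6):
    print(n, chibar2_sf(7.81, n))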
The following theorem presents the asymptotic distribution of T_01^ϕ.
Theorem 5
Under H_0, T_01^ϕ converges in law to χ̄²_K as n → ∞, where K = I(I − 1)/2.
Proof. 
See [7].
If we consider the family of power divergences given in (8), we have the power divergence family of test statistics defined as
T_{01}^{\lambda} = T_{01}^{\phi_{(\lambda)}}
which can be used for testing H_0 against H_1. Some important statistics can therefore be expressed as members of the power divergence family of test statistics T_01^λ: T_01^1 is the Pearson test statistic (X²_01), T_01^{−1/2} is the Freeman–Tukey test statistic (F²_01), T_01^{−2} is the Neyman-modified test statistic (NM²_01), T_01^{−1} is the modified loglikelihood ratio test statistic (NG²_01), T_01^0 is the loglikelihood ratio test statistic (G²_01) introduced by [8], and T_01^{2/3} is the Cressie–Read test statistic (see [11]).
Theorem 6
Under H_1, T_12^ϕ converges in law to χ̄²_M as n → ∞, where M is the number of elements in the set {(i, j) : i > j, p_ij = p_ji}, M ≤ K = I(I − 1)/2, and
\lim_{n \to \infty} \Pr\left(T_{12}^{\phi} \geq t\right) \leq \sum_{l=0}^{K} \binom{K}{l} \left(\frac{1}{2}\right)^{K} \Pr\left(\chi^{2}_{l} \geq t\right)
If we consider the family of power divergences given in (8), we have the power divergence family of test statistics defined as
T_{12}^{\lambda} = T_{12}^{\phi_{(\lambda)}}
which we can use for testing H 1 against H 2 .
Remark 7
In the same way as before we can obtain the test statistics T_12^0 = G²_12, T_12^{−1} = NG²_12, T_12^1 = X²_12, T_12^{−1/2} = F²_12, T_12^{−2} = NM²_12 and T_12^{2/3}.
We refer here to the example of [18, Section 9.5], where the test proposed by Bowker [1] is applied. The tests proposed in this paper may be used in situations where it is hoped that a new formulation of a drug will reduce side-effects.
Example 
We consider 158 patients who have been treated with the old formulation and for whom records of any side-effects are available. We might now treat each patient with the new formulation and note the incidence of side-effects. Table 1 shows a possible outcome for such an experiment. Do the data in Table 1 provide any evidence of less severe side-effects with the new formulation of the drug? The two test statistics given in (21) and (22) are appropriate for this problem. For the test statistic T_01^ϕ given in (22), the null hypothesis is that, for all off-diagonal counts in the table, the associated probabilities are such that p_ij = p_ji. The alternative is that p_ij ≥ p_ji for all i > j. We have computed the members of the family {T_01^λ} given in Remark 7 and the corresponding asymptotic p-values P_01^λ = Pr(χ̄²_3 > T_01^λ), which are given in Table 2.
On the other hand, if we consider the usual Pearson test statistic X², its value is 9.33. In this case, using the chi-squared distribution with 3 degrees of freedom, the asymptotic distribution found by Bowker [1], we obtain Pr(χ²_3 > X²) = 0.025. Thus for all the considered statistics there is evidence of a differing incidence rate of side-effects under the two formulations; moreover, this difference is towards less severe side-effects under the new formulation. Therefore, the two considered tests lead to the same conclusion: there is strong evidence of a higher incidence rate of side-effects under the old formulation. The conclusion obtained in [18] agrees with ours.
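The reported p-values can be checked directly from the statistic values in Table 2 using the chi-bar-squared tail with K = 3. A short sketch (ours, not from the paper; it assumes SciPy):

from math import comb
from scipy.stats import chi2

# T_01^lambda values taken from Table 2 (lambda = -2, -1, -1/2, 0, 2/3, 1); K = I(I-1)/2 = 3 here.
t_values = {-2: 14.43, -1: 11.48, -0.5: 10.58, 0: 9.96, 2/3: 9.46, 1: 9.33}
for lam, t in t_values.items():
    p = sum(comb(3, l) * 0.5 ** 3 * chi2.sf(t, l) for l in range(1, 4))  # chi-bar^2_3 tail (l = 0 term is zero)
    print(lam, round(p, 3))      # should agree with P_01^lambda in Table 2 up to rounding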

Acknowledgements

This work was supported by Grants MTM 2009-10072 and BSCH-UCM 2008-910707.

References

  1. Bowker, A. A test for symmetry in contingency tables. J. Am. Statist. Assoc. 1948, 43, 572–574. [Google Scholar] [CrossRef]
  2. Ireland, C.T.; Ku, H.H.; Koch, G.G. Symmetry and marginal homogeneity of an r × r contingency table. J. Am. Statist. Assoc. 1969, 64, 1323–1341. [Google Scholar] [CrossRef]
  3. Quade, D.; Salama, I.A. A note on minimum chi-square statistics in contingency tables. Biometrics 1975, 31, 953–956. [Google Scholar] [CrossRef]
  4. Menéndez, M.L.; Pardo, J.A.; Pardo, L. Tests based on ϕ-divergences for bivariate symmetry. Metrika 2001, 53, 15–29. [Google Scholar]
  5. Menéndez, M.L.; Pardo, J.A.; Pardo, L. Tests for bivariate symmetry against ordered alternatives in square contingency tables. Aust. N. Z. J. Stat. 2003, 45, 115–124. [Google Scholar] [CrossRef]
  6. Menéndez, M.L.; Pardo, J.A.; Pardo, L. Tests of symmetry in three-dimensional contingency tables based on phi-divergence statistics. J. Appl. Stat. 2004, 31, 1095–1114. [Google Scholar] [CrossRef]
  7. Menéndez, M.L.; Pardo, J.A.; Pardo, L.; Zografos, K. On tests of symmetry, marginal homogeneity and quasi-symmetry in two contingency tables based on minimum ϕ-divergence estimator with constraints. J. Stat. Comput. Sim. 2005, 75, 555–580. [Google Scholar] [CrossRef]
  8. El Barmi, H.; Kochar, S.C. Likelihood ratio tests for symmetry against ordered alternatives in a square contingency table. Stat. Probab. Lett. 1995, 2, 167–173. [Google Scholar] [CrossRef]
  9. Csiszár, I. Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten; The Mathematical Institute of the Hungarian Academy of Sciences: Budapest, Hungary, 1963; Volume 8, pp. 84–108. [Google Scholar]
  10. Ali, S.M.; Silvey, S.D. A general class of coefficients of divergence of one distribution from another. J. Roy. Stat. Soc. Ser. B Stat. Met. 1966, 28, 131–142. [Google Scholar] [CrossRef]
  11. Cressie, N.; Read, T.R.C. Multinomial goodness-of-fit tests. J. Roy. Stat. Soc. Ser. B Stat. Met. 1984, 46, 440–464. [Google Scholar] [CrossRef]
  12. Pardo, L. Statistical Inference Based on Divergence Measures; Chapman & Hall/CRC: New York, NY, USA, 2006. [Google Scholar]
  13. Kullback, S. Information Theory and Statistics; John Wiley: New York, NY, USA, 1959. [Google Scholar]
  14. Kullback, S. Marginal homogeneity of multidimensional contingency tables. Ann. Math. Stat. 1971, 42, 594–606. [Google Scholar] [CrossRef]
  15. Morales, D.; Pardo, L.; Vajda, I. Asymptotic divergences of estimates of discrete distributions. J. Stat. Plan. Infer. 1995, 48, 347–369. [Google Scholar] [CrossRef]
  16. Pardo, J.A.; Pardo, L.; Zografos, K. Minimum ϕ-divergence estimators with constraints in multinomial populations. J. Stat. Plan. Infer. 2002, 104, 221–237. [Google Scholar] [CrossRef]
  17. Robertson, T.; Wright, F.T.; Dykstra, R.L. Order Restricted Statistical Inference; Wiley: New York, NY, USA, 1988. [Google Scholar]
  18. Sprent, P. Applied Nonparametric Statistical Methods; Chapman & Hall: London, UK, 1993. [Google Scholar]
Table 1. Side-effect levels for old and new formulation. (The table itself appears only as an image in the original.)
Table 2. Asymptotic p-values for T_01^λ.

λ          −2       −1       −1/2     0        2/3      1
T_01^λ     14.43    11.48    10.58    9.96     9.46     9.33
P_01^λ     0.001    0.003    0.004    0.006    0.007    0.008
