1. Introduction
Let $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ and $\overline{\mathbf{X}} = \{\overline{X}^n\}_{n=1}^{\infty}$ be two general sources (cf. Han [1]), where we use the term general source to denote a sequence of random variables $X^n$ (respectively, $\overline{X}^n$) indexed by block length n, where each component of $X^n$ (respectively, $\overline{X}^n$) takes values in an alphabet $\mathcal{X}$ and may vary depending on n.
We consider the hypothesis testing problem with null hypothesis $\mathbf{X}$, alternative hypothesis $\overline{\mathbf{X}}$, and acceptance region $\mathcal{A}_n \subset \mathcal{X}^n$. The probabilities of type I error and type II error are defined, respectively, as
$$\alpha_n := \Pr\{X^n \notin \mathcal{A}_n\}, \qquad \beta_n := \Pr\{\overline{X}^n \in \mathcal{A}_n\}.$$
We focus mainly on how to determine the ε-optimum exponent, defined as the supremum of achievable exponents R for the type II error probability $\beta_n$ under the constraint that the type I error probability $\alpha_n$ is allowed asymptotically up to a constant $\varepsilon \in [0, 1)$. The classical but fundamental result in this setting is the so-called Stein's lemma [2], which gives the ε-optimum exponent in the case where both the null and alternative hypotheses are stationary memoryless sources. The lemma shows that the ε-optimum exponent is given by $D(X\|\overline{X})$, the divergence between the stationary memoryless sources $X$ and $\overline{X}$. Chen [3] has generalized this lemma to the case where both $\mathbf{X}$ and $\overline{\mathbf{X}}$ are general sources, and established the general formula for the ε-optimum exponent in terms of divergence spectra. The ε-optimum exponent derived in [3] is referred to in this paper as the first-order ε-optimum exponent.
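As a numerical reference point for Stein's lemma, the following sketch (the function name and the example distributions are ours, purely for illustration) computes the divergence $D(X\|\overline{X})$ for two finite distributions; by the lemma, this value is the first-order exponent of the optimal type II error probability in the i.i.d. case.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats for finite distributions."""
    d = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # absolute continuity fails: infinite exponent
            d += pi * math.log(pi / qi)
    return d

# Null hypothesis X ~ uniform on {0, 1}, alternative X-bar ~ (0.8, 0.2).
p = [0.5, 0.5]
q = [0.8, 0.2]
D = kl_divergence(p, q)
# Stein's lemma: the optimal type II error behaves as e^{-nD}, so for a
# fixed allowed type I error, each extra symbol buys a factor e^{-D}.
print(D)
```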
On the other hand, second-order asymptotics have also been investigated in several contexts of information theory [4,5,6,7,8,9] to analyze the finer asymptotic behavior of the type II error probability, of the form $\beta_n \simeq \mathrm{e}^{-nR - \sqrt{n}S}$.
Strassen [4] first introduced the notion of the second-order ε-optimum achievable exponent in the hypothesis testing problem in the case where both $\mathbf{X}$ and $\overline{\mathbf{X}}$ are stationary memoryless sources. The results in [4] have also revealed that the asymptotic normality of the divergence density rate (or likelihood ratio rate) plays an important role in computing the second-order ε-optimum exponent.
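The role of asymptotic normality can be made concrete: if the per-letter divergence density $\log(P_X(x)/P_{\overline{X}}(x))$ has mean D and variance V under the null hypothesis, a Strassen-type expansion suggests $(1/n)\log(1/\beta_n) \approx D + \Phi^{-1}(\varepsilon)\sqrt{V/n}$. A small sketch under these assumptions (variable names and example distributions are ours, not the paper's):

```python
import math
from statistics import NormalDist

def divergence_moments(p, q):
    """Mean D and variance V of log(p(x)/q(x)) when x is drawn from p."""
    D = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    V = sum(pi * (math.log(pi / qi) - D) ** 2 for pi, qi in zip(p, q) if pi > 0.0)
    return D, V

p = [0.5, 0.5]   # null hypothesis
q = [0.8, 0.2]   # alternative hypothesis
eps = 0.1        # allowed asymptotic type I error probability
D, V = divergence_moments(p, q)

# Second-order term: negative for eps < 1/2, i.e. a stricter type I
# constraint costs an O(1/sqrt(n)) reduction of the type II exponent.
S = math.sqrt(V) * NormalDist().inv_cdf(eps)
n = 1000
approx_exponent = D + S / math.sqrt(n)
```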
In this paper, on the other hand, we investigate hypothesis testing for mixed memoryless sources. The class of mixed sources is quite important, because all stationary sources can be regarded as mixed sources consisting of stationary ergodic sources. The analysis for mixed sources is therefore primitive but fundamental, and thus we first focus on the case where the null hypothesis is a mixed memoryless source and the alternative hypothesis is a stationary memoryless source. In this direction, Han [1] first derived the single-letter formula for the first-order ε-optimum exponent in the case with mixed memoryless source $\mathbf{X}$ and stationary memoryless source $\overline{\mathbf{X}}$. The first main result in this paper is to establish the single-letter second-order ε-optimum exponent in the same setting by invoking the relevant asymptotic normality. The result is a substantial generalization of that of Strassen [4]. Second, we generalize this setting to the case where both null and alternative hypotheses are mixed memoryless, to establish the single-letter first-order ε-optimum exponent.
It should be emphasized that our results described here are valid for mixed memoryless sources with general mixtures, in the sense that the mixing weight for component sources may be an arbitrary probability measure. For the case of mixed general sources with finite discrete mixtures, we reveal a deep relationship with the compound hypothesis testing problem. We note that the compound hypothesis testing problem is important from both theoretical and practical points of view. We show that the first-order 0-optimum (respectively, exponentially r-optimum) exponent for mixed general hypothesis testing coincides with the 0-optimum (respectively, exponentially r-optimum) exponent for compound general hypothesis testing.
The present paper is organized as follows. In Section 2, we fix the problem setting and review the general formula (Theorem 1) for the first-order ε-optimum exponent. This is used to prove Theorem 5, which establishes a first-order single-letter formula for hypothesis testing in the case where both the null and alternative hypotheses are mixed memoryless. Moreover, we give the general formula (Theorem 2) for the second-order ε-optimum exponent, which is used to prove Theorem 4, which establishes a second-order single-letter formula for hypothesis testing in the case where the null hypothesis is mixed memoryless and the alternative hypothesis is stationary memoryless. In Section 3, we establish the single-letter second-order ε-optimum exponent in the case with mixed memoryless source $\mathbf{X}$ and stationary memoryless source $\overline{\mathbf{X}}$ (cf. Theorem 4). Furthermore, in Section 4, we consider the case where both the null and alternative hypotheses are mixed memoryless sources, and derive the single-letter first-order ε-optimum exponent (cf. Theorem 5). Section 5 is devoted to an extension from mixed memoryless sources to mixed general sources. Finally, in Section 6, we define the optimum exponent for the compound general hypothesis testing problem and discuss its relationship with hypothesis testing for mixed general sources. We conclude the paper in Section 7.
2. General Formulas for ε-Hypothesis Testing
In this section, we first review the first-order general formula and then give the second-order general formula. Throughout this paper, the following lemmas play an important role, where we use the notation $P_Z$ to indicate the probability distribution of a random variable Z.
Lemma 1 ([1] (Lemma 4.1.1)). For any $R > 0$, define the acceptance region as
$$\mathcal{A}_n = \left\{ x \in \mathcal{X}^n : \frac{1}{n} \log \frac{P_{X^n}(x)}{P_{\overline{X}^n}(x)} \ge R \right\};$$
then, it holds that
$$\beta_n \le \mathrm{e}^{-nR}.$$

Lemma 2 ([1] (Lemma 4.1.2)). For any acceptance region $\mathcal{A}_n$ and any $R > 0$, it holds that
$$\alpha_n + \mathrm{e}^{nR} \beta_n \ge \Pr\left\{ \frac{1}{n} \log \frac{P_{X^n}(X^n)}{P_{\overline{X}^n}(X^n)} \le R \right\}.$$

Proofs of these lemmas are found in [1].
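For i.i.d. binary hypotheses, the bound of Lemma 1 can be verified exactly by grouping sequences according to their number of ones; the following sketch (function and parameter names are ours) computes $\alpha_n$ and $\beta_n$ for the acceptance region of Lemma 1 and checks $\beta_n \le \mathrm{e}^{-nR}$:

```python
import math

def error_probs(p, q, n, R):
    """Exact type I/II error probabilities of the acceptance region
    { x : (1/n) log(P_{X^n}(x) / P_{Xbar^n}(x)) >= R }
    for Bernoulli(p) versus Bernoulli(q) hypotheses."""
    alpha = beta = 0.0
    for k in range(n + 1):  # k = number of ones in the sequence x
        llr = (k * math.log(p / q) + (n - k) * math.log((1 - p) / (1 - q))) / n
        count = math.comb(n, k)
        if llr >= R:   # x accepted: contributes to the type II error under q
            beta += count * q**k * (1 - q)**(n - k)
        else:          # x rejected: contributes to the type I error under p
            alpha += count * p**k * (1 - p)**(n - k)
    return alpha, beta

p, q, n = 0.5, 0.2, 60
D = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
R = 0.5 * D  # any fixed rate below the divergence
alpha, beta = error_probs(p, q, n, R)
# Lemma 1 guarantees beta <= e^{-nR}; alpha -> 0 as n grows since R < D.
```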
We define the first- and second-order ε-optimum exponents as follows.

Definition 1. Rate R is said to be ε-achievable if there exists an acceptance region $\mathcal{A}_n$ such that
$$\limsup_{n \to \infty} \alpha_n \le \varepsilon \quad \text{and} \quad \liminf_{n \to \infty} \frac{1}{n} \log \frac{1}{\beta_n} \ge R.$$

Definition 2 (First-order ε-optimum exponent).
$$B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}}) := \sup\{R : R \text{ is } \varepsilon\text{-achievable}\}.$$
The right-hand side of Equation (5) specifies the asymptotic behavior of the type II error probability of the form $\beta_n \simeq \mathrm{e}^{-nR}$. Chen [3] has derived the general limiting formula for $B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}})$ as follows, which is utilized to establish Theorem 5 in Section 4.
Theorem 1 (Chen [3] (Theorem 1)).
$$B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}}) = \sup\{R : K(R) \le \varepsilon\},$$
where
$$K(R) := \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{P_{X^n}(X^n)}{P_{\overline{X}^n}(X^n)} \le R \right\}.$$

Moreover, we consider the second-order ε-optimum exponent as follows.
Definition 3. Rate S is said to be (ε, R)-achievable if there exists an acceptance region $\mathcal{A}_n$ such that
$$\limsup_{n \to \infty} \alpha_n \le \varepsilon \quad \text{and} \quad \liminf_{n \to \infty} \frac{1}{\sqrt{n}} \log \frac{\mathrm{e}^{-nR}}{\beta_n} \ge S.$$

Definition 4 (Second-order ε-optimum exponent).
$$B_\varepsilon(R \mid \mathbf{X}\|\overline{\mathbf{X}}) := \sup\{S : S \text{ is } (\varepsilon, R)\text{-achievable}\}.$$

The right-hand side of Equation (9) specifies the asymptotic behavior of the form $\beta_n \simeq \mathrm{e}^{-nR - \sqrt{n}S}$. The general limiting formula for $B_\varepsilon(R \mid \mathbf{X}\|\overline{\mathbf{X}})$ is given as follows; it is the second-order counterpart of Theorem 1, and is utilized to establish Theorem 4 in Section 3.2 to give a second-order single-letter formula for hypothesis testing in the case where the null hypothesis is mixed memoryless and the alternative hypothesis is stationary memoryless.

Theorem 2.
$$B_\varepsilon(R \mid \mathbf{X}\|\overline{\mathbf{X}}) = \sup\{S : K(R, S) \le \varepsilon\},$$
where
$$K(R, S) := \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{P_{X^n}(X^n)}{P_{\overline{X}^n}(X^n)} \le R + \frac{S}{\sqrt{n}} \right\}.$$
3. Mixed Memoryless Sources
3.1. First-Order ε-Optimum Exponent
In the previous section, we have demonstrated the “limiting” formulas for general hypothesis testing. In this and subsequent sections, we consider special but insightful cases and compute the optimum exponents in single-letter forms.
Let Θ be an arbitrary probability space with a general probability measure $w$. Then, the hypothesis testing problem to be considered in this section is stated as follows:
The null hypothesis is a mixed stationary memoryless source $\mathbf{X}$; that is, for $x = (x_1, \dots, x_n) \in \mathcal{X}^n$,
$$P_{X^n}(x) = \int_{\Theta} P_{X_\theta^n}(x)\, w(\mathrm{d}\theta), \qquad (13)$$
where $\mathbf{X}_\theta = \{X_\theta^n\}_{n=1}^{\infty}$ is a stationary memoryless source for each $\theta \in \Theta$ and
$$P_{X_\theta^n}(x) = \prod_{i=1}^{n} P_{X_\theta}(x_i) \qquad (14)$$
with generic random variable $X_\theta$ taking values in $\mathcal{X}$.

The alternative hypothesis is a stationary memoryless source $\overline{\mathbf{X}}$ with generic random variable $\overline{X}$ taking values in $\mathcal{X}$; that is,
$$P_{\overline{X}^n}(x) = \prod_{i=1}^{n} P_{\overline{X}}(x_i).$$

We assume $\mathcal{X}$ to be a finite alphabet hereafter.
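Although each component $\mathbf{X}_\theta$ is memoryless, the mixture $\mathbf{X}$ itself generally is not: mixing introduces memory between symbols. A minimal sketch with a two-component binary mixture (the weights and component distributions are our own toy choices):

```python
# Two-component mixed memoryless source on {0, 1}:
# P_{X^n}(x) = w1 * prod_i P1(x_i) + w2 * prod_i P2(x_i)
w = [0.5, 0.5]
P = [{0: 0.9, 1: 0.1}, {0: 0.1, 1: 0.9}]

def mixed_prob(x):
    """Probability of the tuple x under the mixed source."""
    total = 0.0
    for wk, Pk in zip(w, P):
        prob = wk
        for s in x:
            prob *= Pk[s]
        total += prob
    return total

p0 = mixed_prob((0,))      # single-symbol marginal: 0.5*0.9 + 0.5*0.1 = 0.5
p00 = mixed_prob((0, 0))   # two-symbol joint: 0.5*0.81 + 0.5*0.01 = 0.41
# p00 != p0**2 (= 0.25): mixing i.i.d. components yields a source with memory.
```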
To investigate this special case, we first introduce an expurgated parameter set on the basis of types, where the type T of a sequence $x = (x_1, \dots, x_n) \in \mathcal{X}^n$ is the empirical distribution of x; that is, $T(a) = N(a|x)/n$ ($a \in \mathcal{X}$), with $N(a|x)$ the number of indices i such that $x_i = a$. Let $\mathcal{T}_n$ denote the set of all possible types of sequences of length n. Then, it is well known that
$$|\mathcal{T}_n| \le (n+1)^{|\mathcal{X}|}.$$
Now, for each $x \in \mathcal{X}^n$, we define a set of mixture parameters $\theta$. Since $\mathbf{X}_\theta$ is an i.i.d. source for each $\theta \in \Theta$, this set depends only on the type T of the sequence x, and therefore we may write it as a function of T instead of x. Moreover, we define the "expurgated" parameter set accordingly. Then, we have the following lemma:
Lemma 3 (Han [1]). Let $\mathbf{X}$ denote the mixed memoryless source defined in Equation (13); then, the weight of the expurgated parameter set tends to one as $n \to \infty$.

Next, we introduce two basic "decomposition" lemmas.

Lemma 4 (Upper Decomposition Lemma). Let $\mathbf{X}$ be a mixed memoryless source and $\overline{\mathbf{X}}$ be an arbitrary general source. Then, for any n and any real threshold, the mixture probability can be bounded from above in terms of the component probabilities $P_{X_\theta^n}$.

Lemma 5 (Lower Decomposition Lemma). Let $\mathbf{X}$ be a mixed memoryless source and $\overline{\mathbf{X}}$ be an arbitrary general source. Then, for any n, a corresponding lower bound in terms of the component probabilities holds.

These Lemmas 3–5 are used later to establish Theorems 3–5. First, Theorem 3, concerning the first-order ε-optimum exponent for mixed memoryless sources, has earlier been given as follows:
Theorem 3 (First-order ε-optimum exponent: Han [1]). For $\varepsilon \in [0, 1)$, the first-order ε-optimum exponent $B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}})$ is given in a single-letter form in terms of $D(X_\theta\|\overline{X})$, where $D(X_\theta\|\overline{X})$ denotes the Kullback–Leibler divergence between $X_\theta$ and $\overline{X}$.

Remark 1. If Θ is a singleton, the above formula reduces to
$$B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}}) = D(X\|\overline{X}),$$
which is nothing but Stein's lemma [2].

Remark 2. $B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}})$ can also be expressed in an equivalent supremum form. This can be verified as follows. Denote the alternative expression as a supremum; then, clearly it dominates the original one. Here, we assume that the domination is strict to show a contradiction. From the assumption, there exists a constant lying strictly between the two expressions. On the other hand, from the definition of the supremum, such an intermediate constant is itself achievable. Thus, taking it as the rate leads to a chain of inequalities that ends in a contradiction, where the last inequality is due to the definition of the original expression.

3.2. Second-Order ε-Optimum Exponent
Next, we establish the second-order ε-optimum exponent for mixed sources, which is the first main result in this paper.
Theorem 4 (Second-order ε-optimum exponent). For $\varepsilon \in (0, 1)$, the second-order ε-optimum exponent $B_\varepsilon(R \mid \mathbf{X}\|\overline{\mathbf{X}})$ is given in a single-letter form, determined by the mean and variance of the divergence density rate of the component sources.

Remark 3. If Θ is a singleton, Theorem 4 reduces, at the rate $R = D(X\|\overline{X})$, to the Gaussian-quantile expression originally due to Strassen [4].

Remark 4. From Theorem 3, it is not difficult to verify the two limiting relations (33) and (34). Here, let us consider the canonical equation for S. In view of Equations (33) and (34), this equation always has a solution. It should be noted that, in a degenerate case, the solution is not unique. By using the solution, it is not difficult to check that Theorem 4 can be expressed in the canonical form (35). The canonical equation is a useful expression for the second-order ε-optimum rate [7,10,11,12]; Equation (35) is the hypothesis testing counterpart of these results.

4. Mixed Memoryless Alternative Hypothesis
In this section, we consider the case where not only the null hypothesis but also the alternative hypothesis is a mixed memoryless source, and establish the single-letter formula for the first-order ε-optimum exponent, by which we intend to generalize Theorem 3.

Let $\{P_{\overline{X}_\sigma}\}_{\sigma \in \Sigma}$ be a family of probability distributions on $\mathcal{X}$, where Σ is a probability space with a probability measure. We assume here that Σ is a compact space and $P_{\overline{X}_\sigma}$ is continuous as a function of σ.
The hypothesis testing problem considered in this section is stated as follows:

The null hypothesis is a mixed memoryless source $\mathbf{X}$ as defined by Equations (13) and (14) in Section 3.1.

The alternative hypothesis is another mixed memoryless source $\overline{\mathbf{X}}$; that is, for $x \in \mathcal{X}^n$, $P_{\overline{X}^n}(x)$ is obtained by mixing the component i.i.d. probabilities $P_{\overline{X}_\sigma^n}(x) = \prod_{i=1}^{n} P_{\overline{X}_\sigma}(x_i)$ over Σ with respect to its probability measure.
Let us now consider, for each $P \in \mathcal{P}(\mathcal{X})$ (the set of probability distributions on $\mathcal{X}$), an equation with respect to σ whose right-hand side involves the essential infimum of $D(P\|\overline{X}_\sigma)$ with respect to σ, where the essential infimum is measured with respect to the probability measure on Σ.

Since the solution of this equation depends on P, we may write it as a function of P. Notice here that $D(P\|\overline{X}_\sigma)$ is continuous in P, and as we have assumed that Σ is compact and $P_{\overline{X}_\sigma}$ is continuous in σ, there indeed exists such a function. Now, to avoid technical subtleties, we assume here that this function may be chosen so as to be continuous. For example, if we consider a special case in which $\{P_{\overline{X}_\sigma}\}_{\sigma \in \Sigma}$ is a closed convex subset of $\mathcal{P}(\mathcal{X})$, then it is not difficult to verify that the function is uniquely determined and continuous (or even differentiable), which follows from the strict convexity of the divergence in its second argument. Another simple example is the case where Σ is a countable set.

Hereafter, for simplicity, we suppress the dependence on P in the notation. Then, we have the second main result in this paper:
Theorem 5 (First-order ε-optimum exponent). For $\varepsilon \in [0, 1)$, the first-order ε-optimum exponent $B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}})$ is given in a single-letter form.

Remark 5. In the case where Σ is a singleton, the above theorem coincides with Theorem 3. Therefore, this theorem is a direct generalization of Theorem 3. This also means that when both Θ and Σ are singletons, the theorem coincides with Stein's lemma (see Remark 1).
Remark 6. Remark 2 is also valid for this theorem; that is, $B_\varepsilon(\mathbf{X}\|\overline{\mathbf{X}})$ can also be expressed in the equivalent supremum form.

Proof of Theorem 5. To show the theorem, let the δ-typical set with respect to $P \in \mathcal{P}(\mathcal{X})$ be the set of all $x \in \mathcal{X}^n$ such that
$$\left| \frac{N(a|x)}{n} - P(a) \right| \le \delta \quad \text{for all } a \in \mathcal{X},$$
where $N(a|x)$ is the number of indices i such that $x_i = a$, and $\delta > 0$ is an arbitrary constant. Then, it is well known that the probability of the δ-typical set tends to one as $n \to \infty$.

In the sequel, we use the upper and lower bounds of the probability $P_{\overline{X}^n}(x)$ in the forms (45) and (46) for each typical x, where the correction term vanishes as $n \to \infty$, and the remaining constants are independent of n. Proofs of Equations (45) and (46) appear in Appendix E.
We then prove the theorem by using Equations (45) and (46) as follows. In view of Theorem 1 and Remark 6, it suffices to show two inequalities, (47) and (48).

Similar to the derivation of Equation (A23) with Lemma 4, we have the upper bound (49). From the definition of the δ-typical set and Equation (45), we also have a corresponding bound for any typical x. Here, we define two subsets of the parameter space. Then, from their definitions, there exists a small constant for which the required separation holds. Thus, it holds that the associated probability term can be bounded, where we have used the relation between the two subsets together with (45), for sufficiently large n and sufficiently small δ.

Therefore, noting that the normalized divergence density gives the arithmetic average of n i.i.d. variables with the corresponding expectation, the weak law of large numbers yields the required convergence. Thus, from Equations (54) and (57), the right-hand side of Equation (49) is suitably upper bounded, which completes the proof of (47).
Similar to the derivation of Equation (A32) with Lemma 5, we have a lower bound. From the definition of the δ-typical set and Equation (46), we also have a corresponding bound for any typical x.

We also partition the parameter space Σ into two sets. Then, if we take the relevant constants sufficiently small, there exists a constant satisfying the required separation. Thus, again by invoking the weak law of large numbers, we obtain the desired convergence. Summarizing, we obtain the required inequality. This completes the proof of Equation (48). □
Remark 7. Theorem 3 is a special case of Theorem 5 when Σ is a singleton.

To illustrate the significance of Theorem 5, let us now consider the special case with $\varepsilon = 0$. Then, by virtue of Theorem 5, we have the following simplified result:
Corollary 1. In the special case of $\varepsilon = 0$, we have the simplified single-letter expression (66).

Proof. The formula (40) can be written in this case as (67). Conversely, an appropriate choice of the parameter yields the reverse relations (69) and (71). As a consequence, (66) follows from (67), (69) and (71). □
Remark 8. One may wonder whether it might be possible to deal with the second-order ε-optimum problem, too, using the arguments developed above for the first-order ε-optimum problem with mixed memoryless sources $\mathbf{X}$ and $\overline{\mathbf{X}}$. To do so, however, it seems that we need some novel techniques, which remain to be studied.
5. Hypothesis Testing with Mixed General Sources
We have so far investigated the ε-hypothesis testing problem for mixed memoryless sources. In this section, we deal with more general settings, namely hypothesis testing with mixed general sources, which inherits the crux of the analysis for mixed memoryless sources (cf. Theorem 5). This leads us to a primitive but insightful "general" observation.
To do so, we consider the case where both the null hypothesis and the alternative hypothesis are finite mixtures of general sources, as follows:

The null hypothesis is a mixed general source $\mathbf{X}$ consisting of K general (not necessarily memoryless) sources $\mathbf{X}_1, \dots, \mathbf{X}_K$; that is, $P_{X^n}$ is a convex combination of $P_{X_1^n}, \dots, P_{X_K^n}$ with positive weights summing to one.

The alternative hypothesis is another mixed general source $\overline{\mathbf{X}}$ consisting of L general (not necessarily memoryless) sources $\overline{\mathbf{X}}_1, \dots, \overline{\mathbf{X}}_L$; that is, $P_{\overline{X}^n}$ is a convex combination of $P_{\overline{X}_1^n}, \dots, P_{\overline{X}_L^n}$ with positive weights summing to one.

In this general setting, it is hard to derive a compact formula for the first-order ε-optimum exponent for general $\varepsilon > 0$. Instead, we can obtain the following theorem in the special case of $\varepsilon = 0$.
Theorem 6. The first-order 0-optimum exponent for the above mixed general sources is determined by the component sources. In particular, if $\mathbf{X}_1, \dots, \mathbf{X}_K$ and $\overline{\mathbf{X}}_1, \dots, \overline{\mathbf{X}}_L$ are all stationary memoryless sources specified by $P_{X_i}$ and $P_{\overline{X}_j}$, respectively, then it reduces to a minimum of pairwise divergences, which is a special case of Corollary 1.

Furthermore, we can also consider the following exponentially r-optimum exponent in hypothesis testing with two mixed general sources $\mathbf{X}$ and $\overline{\mathbf{X}}$ as above.
Definition 5. Let $r > 0$ be any fixed constant. Rate R is said to be exponentially r-achievable if there exists an acceptance region $\mathcal{A}_n$ such that
$$\liminf_{n \to \infty} \frac{1}{n} \log \frac{1}{\alpha_n} \ge r \quad \text{and} \quad \liminf_{n \to \infty} \frac{1}{n} \log \frac{1}{\beta_n} \ge R.$$

Definition 6 (First-order exponentially r-optimum exponent). The supremum of all exponentially r-achievable rates R.

Then, it is not difficult to verify that a result analogous to Theorem 6 holds, which is a generalization of [1] (Remark 4.4.3):

Theorem 7. In particular, if the null and alternative hypotheses consist of stationary memoryless sources specified by $P_{X_i}$ and $P_{\overline{X}_j}$, respectively, then the exponent is given in a single-letter form by virtue of Hoeffding's theorem.

6. Hypothesis Testing with Compound General Sources
In this section, let us consider the compound hypothesis testing problem with finitely many null hypotheses $\mathbf{X}_1, \dots, \mathbf{X}_K$ and finitely many alternative hypotheses $\overline{\mathbf{X}}_1, \dots, \overline{\mathbf{X}}_L$, where the $\mathbf{X}_i$ and $\overline{\mathbf{X}}_j$ are general sources. As is well known, this problem is expected to have a primitive but "general" relationship, at the structural level, to that of hypothesis testing with mixed sources.
Specifically, compound hypothesis testing is the problem in which a pair of general sources $(\mathbf{X}_i, \overline{\mathbf{X}}_j)$ occurs as a pair (null hypothesis, alternative hypothesis), and the tester does not know which pair $(i, j)$ is actually in effect. This means that the acceptance region $\mathcal{A}_n$ cannot depend on i and j. The type I error probabilities of the compound hypothesis testing are given by
$$\alpha_n^{(i)} := \Pr\{X_i^n \notin \mathcal{A}_n\}$$
for each general null hypothesis $\mathbf{X}_i$. The type II error probabilities are likewise given by
$$\beta_n^{(j)} := \Pr\{\overline{X}_j^n \in \mathcal{A}_n\}$$
for each general alternative hypothesis $\overline{\mathbf{X}}_j$. Then, the following achievability is of our interest.

Definition 7. Rate R is said to be 0-achievable for the compound hypothesis testing if there exists an acceptance region $\mathcal{A}_n$ such that the type I error probability for each null hypothesis vanishes and the type II error probability for each alternative hypothesis decays with exponent at least R, for all $i = 1, \dots, K$ and $j = 1, \dots, L$.

Definition 8 (First-order 0-optimum exponent). The supremum of all rates R that are 0-achievable for the compound hypothesis testing.
Now, we have
Theorem 8. Assuming that the required absolute continuity conditions hold for all i and j, the first-order 0-optimum exponent for the compound hypothesis testing coincides with that for the corresponding mixed general hypothesis testing, where, with the sources of Equations (72) and (73), we use a notation that makes the dependence on $\mathbf{X}_i$ and $\overline{\mathbf{X}}_j$ explicit. From Theorems 6 and 8, we immediately obtain the first-order 0-optimum exponent for the compound hypothesis testing as follows:
Corollary 2. Under the same assumptions holding for all i and j, the first-order 0-optimum exponent for the compound hypothesis testing is given by the formula of Theorem 6. In particular, if $\mathbf{X}_1, \dots, \mathbf{X}_K$ and $\overline{\mathbf{X}}_1, \dots, \overline{\mathbf{X}}_L$ are all stationary memoryless sources specified by $P_{X_i}$ and $P_{\overline{X}_j}$, respectively, Equation (86) reduces to a minimum of pairwise divergences.

Remark 9. Similar to Definition 5, we can define the exponentially r-optimum exponent also for the compound hypothesis testing problem, as follows.
Definition 9. Let $r > 0$ be any fixed constant. Rate R is said to be exponentially r-achievable for the compound hypothesis testing if there exists an acceptance region $\mathcal{A}_n$ such that, for all i and j, the type I error probability for the null hypothesis $\mathbf{X}_i$ vanishes exponentially with exponent at least r and the type II error probability for the alternative hypothesis $\overline{\mathbf{X}}_j$ decays exponentially with exponent at least R.

Definition 10 (First-order exponentially r-optimum exponent). The supremum of all rates R that are exponentially r-achievable for the compound hypothesis testing.

Then, using an argument similar to the proof of Theorem 8, the following theorem can be shown:

Theorem 9. Under the assumptions of Theorem 8 holding for all i and j, the exponentially r-optimum exponent for the compound hypothesis testing coincides with that for the corresponding mixed general hypothesis testing, where, with the sources of Equations (72) and (73), we use the notation of Definitions 5 and 6 adapted to make the dependence on the pair (i, j) explicit. Combining Theorems 7 and 9, we immediately obtain the following corollary:
Corollary 3. Under the same assumptions holding for all i and j, the exponentially r-optimum exponent for the compound hypothesis testing is given by the formula of Theorem 7. In particular, if the null and alternative hypotheses consist of stationary memoryless sources specified by $P_{X_i}$ and $P_{\overline{X}_j}$, respectively, as in Theorem 7, then the formula reduces to a single-letter form, which corresponds to Equation (79).

7. Concluding Remarks
Thus far, we have investigated the first- and second-order ε-optimum exponents in the hypothesis testing problem. First, we have studied the second-order ε-optimum problem with a mixed memoryless null hypothesis and a stationary memoryless alternative hypothesis. As we have shown in the analysis of the second-order ε-optimum exponent, we use, as a key property, the asymptotic normality of the divergence density rate for each of the component sources. We also observe that the canonical representation, first introduced in [11], is still effective for expressing the second-order ε-optimum exponent for mixed memoryless sources in the hypothesis testing problem.
The first-order ε-optimum exponent in the case with mixed memoryless null and alternative hypotheses has also been established. One may wonder whether we can apply the same approach to the derivation of the second-order ε-optimum exponent in this setting. Notice that one of our key techniques for deriving the first-order ε-optimum exponent is a certain expansion, and a more careful evaluation of this expansion would be needed to compute the second-order ε-optimum exponent. This remains future work. Our final goal is the problem of hypothesis testing in which both the null and alternative hypotheses are general stationary sources. This paper characterizes the first- and second-order performance of hypothesis testing for mixed memoryless sources as a simple but crucial step toward this goal.
Finally, the relationship between the first-order 0-optimum (respectively, exponentially r-optimum) exponent in the hypothesis testing with mixed general sources and the 0-optimum (respectively, exponentially r-optimum) exponent in the compound hypothesis testing has also been demonstrated.