Article

On the Reliability Function of Variable-Rate Slepian-Wolf Coding †

1 College of Electronic Information and Automation, Tianjin University of Science and Technology, Tianjin 300222, China
2 Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1, Canada
3 Google, Mountain View, CA 94043, USA
4 IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in the 45th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 26–28 September 2007.
Entropy 2017, 19(8), 389; https://doi.org/10.3390/e19080389
Submission received: 13 June 2017 / Revised: 14 July 2017 / Accepted: 27 July 2017 / Published: 28 July 2017
(This article belongs to the Special Issue Multiuser Information Theory)

Abstract:
The reliability function of variable-rate Slepian-Wolf coding is linked to the reliability function of channel coding with constant composition codes, through which computable lower and upper bounds are derived. The bounds coincide at rates close to the Slepian-Wolf limit, yielding a complete characterization of the reliability function in that rate region. It is shown that variable-rate Slepian-Wolf codes can significantly outperform fixed-rate Slepian-Wolf codes in terms of rate-error tradeoff. Variable-rate Slepian-Wolf coding with rate below the Slepian-Wolf limit is also analyzed. In sharp contrast with fixed-rate Slepian-Wolf codes, for which the correct decoding probability decays to zero exponentially fast if the rate is below the Slepian-Wolf limit, the correct decoding probability of variable-rate Slepian-Wolf codes can be bounded away from zero.

1. Introduction

Consider the problem (see Figure 1) of compressing $X^n = (X_1, X_2, \ldots, X_n)$ with side information $Y^n = (Y_1, Y_2, \ldots, Y_n)$ available only at the decoder. Here $\{(X_i, Y_i)\}_{i=1}^{\infty}$ is a joint memoryless source with zero-order joint probability distribution $P_{XY}$ on finite alphabet $\mathcal{X} \times \mathcal{Y}$. Let $P_X$ and $P_Y$ be the marginal probability distributions of $X$ and $Y$ induced by the joint probability distribution $P_{XY}$. Without loss of generality, we shall assume $P_X(x) > 0$, $P_Y(y) > 0$ for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$. This problem was first studied by Slepian and Wolf in their landmark paper [1]. They proved the surprising result that the minimum rate for reconstructing $X^n$ at the decoder with asymptotically zero error probability (as the block length $n$ goes to infinity) is $H(X|Y)$, which is the same as in the case where the side information $Y^n$ is also available at the encoder. The fundamental limit $H(X|Y)$ is often referred to as the Slepian-Wolf limit. We shall assume $H(X|Y) > 0$ throughout this paper.
Different from conventional lossless source coding, where most effort has been devoted to variable-rate coding schemes, research on Slepian-Wolf coding has almost exclusively focused on fixed-rate codes (see, e.g., [2,3,4,5] and the references therein). This phenomenon can be partly explained by the influence of channel coding. It is well known that there is an intimate connection between channel coding and Slepian-Wolf coding. Intuitively, one may view $Y^n$ as the channel output generated by channel input $X^n$ through the discrete memoryless channel $P_{Y|X}$, where $P_{Y|X}$ is the probability transition matrix from $X$ to $Y$ induced by the joint probability distribution $P_{XY}$. Since $Y^n$ is not available at the encoder, Slepian-Wolf coding is, in a certain sense, similar to channel coding without feedback. In a channel coding system, there is little incentive to use variable-rate coding schemes if no feedback link exists from the receiver to the transmitter. Therefore, it seems justifiable to focus on fixed-rate codes in Slepian-Wolf coding.
This viewpoint turns out to be misleading. We shall show that variable-rate Slepian-Wolf codes can significantly outperform fixed-rate codes in terms of rate-error tradeoff. Specifically, it is revealed that variable-rate Slepian-Wolf codes can beat the sphere-packing bound for fixed-rate Slepian-Wolf codes at rates close to the Slepian-Wolf limit. It is known [6] that the correct decoding probability of fixed-rate Slepian-Wolf codes decays to zero exponentially fast if the rate is below the Slepian-Wolf limit. Somewhat surprisingly, the decoding error probability of variable-rate Slepian-Wolf codes can be bounded away from one even when they are operated below the Slepian-Wolf limit, and the performance degrades gracefully as the rate goes to zero. Therefore, variable-rate Slepian-Wolf coding is considerably more robust.
The rest of this paper is organized as follows. In Section 2, we review the existing bounds on the reliability function of fixed-rate Slepian-Wolf coding, and point out the intimate connections with their counterparts in channel coding. In Section 3, we characterize the reliability function of variable-rate Slepian-Wolf coding by leveraging the reliability function of channel coding with constant composition codes. Computable lower and upper bounds are derived. The bounds coincide at rates close to the Slepian-Wolf limit. The correct decoding probability of variable-rate Slepian-Wolf coding with rate below the Slepian-Wolf limit is studied in Section 4. An illustrative example is given in Section 5. We conclude the paper in Section 6. Throughout this paper, logarithms are to base e unless specified otherwise.

2. Fixed-Rate Slepian-Wolf Coding and Channel Coding

To facilitate the comparisons between the performances of fixed-rate Slepian-Wolf coding and variable-rate coding, we shall briefly review the existing bounds on the reliability function of fixed-rate Slepian-Wolf coding. It turns out that the most instructive way is to first consider their counterparts in channel coding. The reason is two-fold. First, it provides the setup to introduce several important definitions. Second, and more importantly, it will be clear that the reliability function of fixed-rate Slepian-Wolf coding is closely related to that of channel coding; indeed, such a connection will be further explored in the context of variable-rate Slepian-Wolf coding.
For any probability distributions $P, Q$ on $\mathcal{X}$ and probability transition matrices $V, W: \mathcal{X} \to \mathcal{Y}$, we use $H(P)$, $I(P, V)$, $D(Q \| P)$, and $D(W \| V | P)$ to denote the standard entropy, mutual information, divergence, and conditional divergence functions; specifically, we have
$$H(P) = -\sum_x P(x) \log P(x), \quad I(P, V) = \sum_{x,y} P(x) V(y|x) \log \frac{V(y|x)}{\sum_{x'} P(x') V(y|x')},$$
$$D(Q \| P) = \sum_x Q(x) \log \frac{Q(x)}{P(x)}, \quad D(W \| V | P) = \sum_{x,y} P(x) W(y|x) \log \frac{W(y|x)}{V(y|x)}.$$
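To keep the notation concrete, here is a minimal Python/NumPy sketch of these four quantities (the function names, the channel-matrix convention V[x, y] = V(y|x), and the 0 log 0 = 0 guard are our own choices, not from the paper):

```python
import numpy as np

EPS = 1e-300  # guard so that terms with zero mass contribute exactly 0

def entropy(P):
    """H(P) = -sum_x P(x) log P(x), natural log."""
    P = np.asarray(P, dtype=float)
    return -np.sum(P * np.log(P + EPS))

def mutual_information(P, V):
    """I(P, V) for input distribution P and channel matrix V[x, y] = V(y|x)."""
    P, V = np.asarray(P, dtype=float), np.asarray(V, dtype=float)
    PY = P @ V                   # output distribution: sum_x P(x) V(y|x)
    joint = P[:, None] * V       # P(x) V(y|x)
    return np.sum(joint * (np.log(V + EPS) - np.log(PY + EPS)[None, :]))

def divergence(Q, P):
    """D(Q || P) = sum_x Q(x) log(Q(x)/P(x))."""
    Q, P = np.asarray(Q, dtype=float), np.asarray(P, dtype=float)
    return np.sum(Q * (np.log(Q + EPS) - np.log(P + EPS)))

def cond_divergence(W, V, P):
    """D(W || V | P) = sum_{x,y} P(x) W(y|x) log(W(y|x)/V(y|x))."""
    W, V, P = (np.asarray(a, dtype=float) for a in (W, V, P))
    return np.sum(P[:, None] * W * (np.log(W + EPS) - np.log(V + EPS)))
```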
The main technical tool we need is the method of types. First, we shall quote a few basic definitions from [7]. Let $\mathcal{P}(\mathcal{X})$ denote the set of all probability distributions on $\mathcal{X}$. The type of a sequence $x^n \in \mathcal{X}^n$, denoted as $P_{x^n}$, is the empirical probability distribution of $x^n$. Let $\mathcal{P}_n(\mathcal{X})$ denote the set consisting of the possible types of sequences $x^n \in \mathcal{X}^n$. For any $P \in \mathcal{P}_n(\mathcal{X})$, the type class $T^n(P)$ is the set of sequences in $\mathcal{X}^n$ of type $P$. We will make frequent use of the following elementary results:
$$|\mathcal{P}_n(\mathcal{X})| \leq (n+1)^{|\mathcal{X}|}, \tag{1}$$
$$\frac{1}{(n+1)^{|\mathcal{X}|}}\, e^{nH(P)} \leq |T^n(P)| \leq e^{nH(P)}, \quad P \in \mathcal{P}_n(\mathcal{X}), \tag{2}$$
$$\prod_{i=1}^{n} P(x_i) = e^{-n[D(Q \| P) + H(Q)]}, \quad x^n \in T^n(Q),\ Q \in \mathcal{P}_n(\mathcal{X}),\ P \in \mathcal{P}(\mathcal{X}). \tag{3}$$
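A quick numerical check of (1) and (2) for a small binary alphabet (a sketch; the helper names and the choice n = 12 are our own):

```python
import math
from collections import Counter
from itertools import product

def type_of(xs, alphabet):
    """Empirical distribution (type) of a sequence."""
    counts = Counter(xs)
    n = len(xs)
    return tuple(counts[a] / n for a in alphabet)

def entropy(P):
    return -sum(p * math.log(p) for p in P if p > 0)

n, alphabet = 12, (0, 1)
# enumerate all 2^n sequences and group them by type
classes = Counter(type_of(xs, alphabet) for xs in product(alphabet, repeat=n))
print(len(classes) <= (n + 1) ** len(alphabet))      # (1): at most (n+1)^|X| types
for P, size in classes.items():                      # (2): type class size bounds
    H = entropy(P)
    assert math.exp(n * H) / (n + 1) ** len(alphabet) <= size <= math.exp(n * H)
```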
A block code $\mathcal{C}_n$ is an ordered collection of sequences in $\mathcal{X}^n$. We allow $\mathcal{C}_n$ to contain identical sequences. Moreover, for any set $\mathcal{A} \subseteq \mathcal{X}^n$, we say that $\mathcal{C}_n \subseteq \mathcal{A}$ if $x^n \in \mathcal{A}$ for all $x^n \in \mathcal{C}_n$. Note that $\mathcal{C}_n \subseteq \mathcal{A}$ does not imply $|\mathcal{C}_n| \leq |\mathcal{A}|$. The rate of $\mathcal{C}_n$ is defined as
$$R(\mathcal{C}_n) = \frac{1}{n} \log |\mathcal{C}_n|.$$
Given a channel $W_{Y|X}: \mathcal{X} \to \mathcal{Y}$, a block code $\mathcal{C}_n \subseteq \mathcal{X}^n$, and channel output $Y^n \in \mathcal{Y}^n$, the output of the optimal maximum likelihood (ML) decoder is
$$\hat{X}^n = \arg\min_{x^n \in \mathcal{C}_n} \sum_{i=1}^{n} -\log W_{Y|X}(Y_i | x_i),$$
where the ties are broken in an arbitrary manner. The average decoding error probability of block code $\mathcal{C}_n$ over channel $W_{Y|X}$ is defined as
$$P_e(\mathcal{C}_n, W_{Y|X}) = \frac{1}{|\mathcal{C}_n|} \sum_{x^n \in \mathcal{C}_n} \Pr\{\hat{X}^n \neq x^n \,|\, x^n \text{ is transmitted}\}.$$
The maximum decoding error probability of block code $\mathcal{C}_n$ over channel $W_{Y|X}$ is defined as
$$P_{e,\max}(\mathcal{C}_n, W_{Y|X}) = \max_{x^n \in \mathcal{C}_n} \Pr\{\hat{X}^n \neq x^n \,|\, x^n \text{ is transmitted}\}.$$
The average correct decoding probability of block code $\mathcal{C}_n$ over channel $W_{Y|X}$ is defined as
$$P_c(\mathcal{C}_n, W_{Y|X}) = 1 - P_e(\mathcal{C}_n, W_{Y|X}).$$
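As an illustration of these definitions, here is a hedged sketch of ML decoding and the empirical average error probability for a toy block code over a binary symmetric channel (the code, channel, and trial count are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.array([[0.9, 0.1],          # W[x, y] = W(y|x): BSC with crossover 0.1
              [0.1, 0.9]])
code = np.array([[0, 0, 0, 0, 0],  # a 2-codeword length-5 block code
                 [1, 1, 1, 1, 1]])

def ml_decode(y, code, W):
    # pick the codeword minimizing sum_i -log W(y_i | x_i)
    neg_ll = -np.log(W[code, y]).sum(axis=1)
    return np.argmin(neg_ll)

trials, errors = 100_000, 0
for _ in range(trials):
    m = rng.integers(len(code))
    x = code[m]
    y = np.array([rng.choice(2, p=W[xi]) for xi in x])  # pass x through the DMC
    errors += ml_decode(y, code, W) != m
print("empirical P_e:", errors / trials)
```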
Definition 1.
Given a channel $W_{Y|X}: \mathcal{X} \to \mathcal{Y}$, we say that an error exponent $E \geq 0$ is achievable with block codes at rate $R$ if for any $\delta > 0$, there exists a sequence of block codes $\{\mathcal{C}_n\}$ such that
$$\liminf_{n\to\infty} R(\mathcal{C}_n) \geq R - \delta, \quad \limsup_{n\to\infty} \frac{1}{n} \log P_e(\mathcal{C}_n, W_{Y|X}) \leq -E + \delta. \tag{4}$$
The largest achievable error exponent at rate $R$ is denoted by $E(W_{Y|X}, R)$. The function $E(W_{Y|X}, \cdot)$ is referred to as the reliability function of channel $W_{Y|X}$.
Similarly, we say that a correct decoding exponent $E_c \geq 0$ is achievable with block channel codes at rate $R$ if for any $\delta > 0$, there exists a sequence of block codes $\{\mathcal{C}_n\}$ such that
$$\liminf_{n\to\infty} R(\mathcal{C}_n) \geq R - \delta, \quad \liminf_{n\to\infty} \frac{1}{n} \log P_c(\mathcal{C}_n, W_{Y|X}) \geq -(E_c + \delta).$$
The smallest achievable correct decoding exponent at rate $R$ is denoted by $E_c(W_{Y|X}, R)$. It will be seen that $E_c(W_{Y|X}, R)$ is positive if and only if $R > C(W_{Y|X})$, where $C(W_{Y|X}) \triangleq \max_{Q_X} I(Q_X, W_{Y|X})$ is the capacity of channel $W_{Y|X}$. Therefore, we shall refer to the function $E_c(W_{Y|X}, \cdot)$ as the reliability function of channel $W_{Y|X}$ above the capacity.
Remark 1.
Given any block code $\mathcal{C}_n$ of average decoding error probability $P_e(\mathcal{C}_n, W_{Y|X})$, we can expurgate the worst half of the codewords so that the maximum decoding error probability of the resulting code is bounded above by $2 P_e(\mathcal{C}_n, W_{Y|X})$. Therefore, the reliability function $E(W_{Y|X}, \cdot)$ is unaffected if we replace $P_e(\mathcal{C}_n, W_{Y|X})$ by $P_{e,\max}(\mathcal{C}_n, W_{Y|X})$ in (4).
Definition 2.
Given a probability distribution $Q_X \in \mathcal{P}(\mathcal{X})$ and a channel $W_{Y|X}: \mathcal{X} \to \mathcal{Y}$, we say that an error exponent $E \geq 0$ is achievable at rate $R$ with constant composition codes of type approximately $Q_X$ if for any $\delta > 0$, there exists a sequence of block codes $\{\mathcal{C}_n\}$ with $\mathcal{C}_n \subseteq T^n(P_n)$ for some $P_n \in \mathcal{P}_n(\mathcal{X})$ such that
$$\lim_{n\to\infty} \|P_n - Q_X\| = 0, \quad \liminf_{n\to\infty} R(\mathcal{C}_n) \geq R - \delta, \quad \limsup_{n\to\infty} \frac{1}{n} \log P_e(\mathcal{C}_n, W_{Y|X}) \leq -E + \delta, \tag{5}$$
where $\|\cdot\|$ is the $l_1$ norm.
The largest achievable error exponent at rate $R$ for constant composition codes of type approximately $Q_X$ is denoted by $E(Q_X, W_{Y|X}, R)$. The function $E(Q_X, W_{Y|X}, \cdot)$ is referred to as the reliability function of channel $W_{Y|X}$ for constant composition codes of type approximately $Q_X$.
Similarly, we say that a correct decoding exponent $E_c \geq 0$ is achievable at rate $R$ with constant composition codes of type approximately $Q_X$ if for any $\delta > 0$, there exists a sequence of block codes $\{\mathcal{C}_n\}$ with $\mathcal{C}_n \subseteq T^n(P_n)$ for some $P_n \in \mathcal{P}_n(\mathcal{X})$ such that
$$\lim_{n\to\infty} \|P_n - Q_X\| = 0, \quad \liminf_{n\to\infty} R(\mathcal{C}_n) \geq R - \delta, \quad \liminf_{n\to\infty} \frac{1}{n} \log P_c(\mathcal{C}_n, W_{Y|X}) \geq -(E_c + \delta).$$
The smallest achievable correct decoding exponent at rate $R$ for constant composition codes of type approximately $Q_X$ is denoted by $E_c(Q_X, W_{Y|X}, R)$.
Remark 2.
The reliability function $E(Q_X, W_{Y|X}, \cdot)$ is unaffected if we replace $P_e(\mathcal{C}_n, W_{Y|X})$ by $P_{e,\max}(\mathcal{C}_n, W_{Y|X})$ in (5).
Let $|t|^+ = \max\{0, t\}$ and $d_{W_{Y|X}}(x, \tilde{x}) = -\log \sum_y \sqrt{W_{Y|X}(y|x) W_{Y|X}(y|\tilde{x})}$. Define
$$E_{ex}(Q_X, W_{Y|X}, R) = \min_{Q_{\tilde{X}|X}:\, Q_{\tilde{X}} = Q_X,\, I(Q_X, Q_{\tilde{X}|X}) \leq R} \mathbb{E}_{Q_{X\tilde{X}}}\big[d_{W_{Y|X}}(X, \tilde{X})\big] + I(Q_X, Q_{\tilde{X}|X}) - R, \tag{6}$$
$$E_{rc}(Q_X, W_{Y|X}, R) = \min_{V_{Y|X}} D(V_{Y|X} \| W_{Y|X} | Q_X) + \big|I(Q_X, V_{Y|X}) - R\big|^+, \tag{7}$$
$$E_{sp}(Q_X, W_{Y|X}, R) = \min_{V_{Y|X}:\, I(Q_X, V_{Y|X}) \leq R} D(V_{Y|X} \| W_{Y|X} | Q_X), \tag{8}$$
where in (6), $Q_{\tilde{X}}$ and $Q_{X\tilde{X}}$ are respectively the marginal probability distribution of $\tilde{X}$ and the joint probability distribution of $X$ and $\tilde{X}$ induced by $Q_X$ and $Q_{\tilde{X}|X}$.
Let $R_{ex}(Q_X, W_{Y|X})$ be the smallest $R \geq 0$ with $E_{ex}(Q_X, W_{Y|X}, R) < \infty$. We have
$$R_{ex}(Q_X, W_{Y|X}) = \min_{Q_{\tilde{X}|X}:\, Q_{\tilde{X}} = Q_X,\, \mathbb{E}_{Q_{X\tilde{X}}}[d_{W_{Y|X}}(X, \tilde{X})] < \infty} I(Q_X, Q_{\tilde{X}|X}). \tag{9}$$
It is known ([7], Exercise 5.18) that $E_{ex}(Q_X, W_{Y|X}, R)$ is a decreasing convex function of $R$ for $R \geq R_{ex}(Q_X, W_{Y|X})$; moreover, the minimum in (9) is achieved at $Q_{X\tilde{X}}$ if and only if
$$Q_{X\tilde{X}}(x, \tilde{x}) = \begin{cases} c\, Q(x) Q(\tilde{x}) & \text{if } d_{W_{Y|X}}(x, \tilde{x}) < \infty, \\ 0 & \text{otherwise}, \end{cases} \tag{10}$$
where the probability distribution $Q$ and the constant $c$ are uniquely determined by the condition $Q_{\tilde{X}} = Q_X$.
It is shown in ([8], Lemma 3) that, for some $R^*(Q_X, W_{Y|X}) \in [0, I(Q_X, W_{Y|X})]$, we have
$$\max\big\{E_{ex}(Q_X, W_{Y|X}, R), E_{rc}(Q_X, W_{Y|X}, R)\big\} = \begin{cases} E_{ex}(Q_X, W_{Y|X}, R) & \text{if } R \leq R^*(Q_X, W_{Y|X}), \\ E_{rc}(Q_X, W_{Y|X}, R) & \text{if } R > R^*(Q_X, W_{Y|X}). \end{cases}$$
It is also known ([7], Corollary 5.4) that
$$E_{rc}(Q_X, W_{Y|X}, R) = \begin{cases} E_{sp}(Q_X, W_{Y|X}, R) & \text{if } R \geq R_{cr}(Q_X, W_{Y|X}), \\ E_{sp}(Q_X, W_{Y|X}, R_{cr}) + R_{cr} - R & \text{if } 0 \leq R \leq R_{cr}(Q_X, W_{Y|X}), \end{cases} \tag{11}$$
where $R_{cr} \triangleq R_{cr}(Q_X, W_{Y|X})$ is the smallest $R$ at which the convex curve $E_{sp}(Q_X, W_{Y|X}, R)$ meets its supporting line of slope $-1$. It is obvious that $R_{cr}(Q_X, W_{Y|X}) \leq I(Q_X, W_{Y|X})$.
Proposition 1.
$R_{cr}(Q_X, W_{Y|X}) = I(Q_X, W_{Y|X})$ if and only if the value of
$$\frac{W_{Y|X}(y|x)}{\sum_{x'} Q_X(x') W_{Y|X}(y|x')}$$
does not depend on $y$ for all $x, y$ such that $Q_X(x) W_{Y|X}(y|x) > 0$.
Proof. 
See Appendix A. ☐
Define $R_{sp}(Q_X, W_{Y|X}) = \inf\{R > 0: E_{sp}(Q_X, W_{Y|X}, R) < \infty\}$. It is known ([7], Exercise 5.3) that
$$R_{sp}(Q_X, W_{Y|X}) = \min I(Q_X, V_{Y|X}), \tag{12}$$
where the minimum is taken over those $V_{Y|X}$'s for which $V_{Y|X}(y|x) = 0$ whenever $W_{Y|X}(y|x) = 0$; in particular, $R_{sp}(Q_X, W_{Y|X}) > 0$ if and only if for every $y \in \mathcal{Y}$ there exists an $x \in \mathcal{X}$ with $Q_X(x) > 0$ and $W_{Y|X}(y|x) = 0$.
Proposition 2.
The minimum in (12) is achieved at $V_{Y|X} = W_{Y|X}$ if and only if the value of
$$\frac{W_{Y|X}(y|x)}{\sum_{x'} Q_X(x') W_{Y|X}(y|x')}$$
does not depend on $y$ for all $x, y$ such that $Q_X(x) W_{Y|X}(y|x) > 0$.
Proof. 
The proof is similar to that of Proposition 1. The details are omitted. ☐
One can readily prove the following result by combining Propositions 1 and 2.
Proposition 3.
The following statements are equivalent:
  • $R_{cr}(Q_X, W_{Y|X}) = I(Q_X, W_{Y|X})$;
  • $R_{sp}(Q_X, W_{Y|X}) = I(Q_X, W_{Y|X})$;
  • the value of
    $$\frac{W_{Y|X}(y|x)}{\sum_{x'} Q_X(x') W_{Y|X}(y|x')}$$
    does not depend on $y$ for all $x, y$ such that $Q_X(x) W_{Y|X}(y|x) > 0$.
Proposition 4.
  • $E(Q_X, W_{Y|X}, R) \geq \max\{E_{ex}(Q_X, W_{Y|X}, R), E_{rc}(Q_X, W_{Y|X}, R)\}$;
  • $E(Q_X, W_{Y|X}, R) \leq E_{sp}(Q_X, W_{Y|X}, R)$ with the possible exception of $R = R_{sp}(Q_X, W_{Y|X})$, at which point the inequality does not necessarily hold;
  • $E_c(Q_X, W_{Y|X}, R) = \min_{V_{Y|X}} D(V_{Y|X} \| W_{Y|X} | Q_X) + |R - I(Q_X, V_{Y|X})|^+$. (13)
Remark 3.
$E_{ex}(Q_X, W_{Y|X}, R)$, $E_{rc}(Q_X, W_{Y|X}, R)$, and $E_{sp}(Q_X, W_{Y|X}, R)$ are respectively the expurgated exponent, the random coding exponent, and the sphere packing exponent of channel $W_{Y|X}$ for constant composition codes of type approximately $Q_X$. The results in Proposition 4 are well known [7,9]. However, bounding the decoding error probability of constant composition codes often serves as an intermediate step in characterizing the reliability function for general block codes; as a consequence, the reliability function for constant composition codes is rarely explicitly defined. Moreover, $E_{ex}(Q_X, W_{Y|X}, R)$, $E_{rc}(Q_X, W_{Y|X}, R)$, and $E_{sp}(Q_X, W_{Y|X}, R)$ are commonly used to bound the decoding error probability of constant composition codes for a fixed block length $n$; therefore, it is implicitly assumed that $Q_X$ is taken from $\mathcal{P}_n(\mathcal{X})$ (see, e.g., [7]). In contrast, we consider a sequence of constant composition codes with block length increasing to infinity and type converging to $Q_X$ for some $Q_X \in \mathcal{P}(\mathcal{X})$ (see Definition 2). A continuity argument is required for passing $Q_X$ from $\mathcal{P}_n(\mathcal{X})$ to $\mathcal{P}(\mathcal{X})$. For completeness, we supply the proof in Appendix B. Note that, unlike $E(Q_X, W_{Y|X}, \cdot)$, the function $E_c(Q_X, W_{Y|X}, \cdot)$ has been completely characterized.
Proposition 5.
  • $E(W_{Y|X}, R) = \sup_{Q_X} E(Q_X, W_{Y|X}, R)$;
  • $E_c(W_{Y|X}, R) = \inf_{Q_X} E_c(Q_X, W_{Y|X}, R)$.
Remark 4.
In view of the fact that $E_c(Q_X, W_{Y|X}, R)$ is a continuous function of $Q_X$ defined on a compact set, we can replace "inf" with "min" in the above equation, i.e.,
$$E_c(W_{Y|X}, R) = \min_{Q_X} E_c(Q_X, W_{Y|X}, R).$$
Proof. 
It is obvious that $E(W_{Y|X}, R) \geq \sup_{Q_X} E(Q_X, W_{Y|X}, R)$; the other direction follows from the fact that every block code $\mathcal{C}_n$ contains a constant composition code $\mathcal{C}'_n$ with $P_{e,\max}(\mathcal{C}'_n, W_{Y|X}) \leq P_{e,\max}(\mathcal{C}_n, W_{Y|X})$ and $R(\mathcal{C}'_n) \geq R(\mathcal{C}_n) - |\mathcal{X}| \frac{\log(n+1)}{n}$. Similarly, it is clear that $E_c(W_{Y|X}, R) \leq \inf_{Q_X} E_c(Q_X, W_{Y|X}, R)$; the other direction follows from the fact that given any block code $\mathcal{C}_n$, one can construct a constant composition code $\mathcal{C}'_n$ with $P_c(\mathcal{C}'_n, W_{Y|X}) \geq (n+1)^{-|\mathcal{X}|} P_c(\mathcal{C}_n, W_{Y|X})$ and $R(\mathcal{C}'_n) = R(\mathcal{C}_n)$ [9]. ☐
The expurgated exponent, random coding exponent, and sphere packing exponent of channel $W_{Y|X}$ for general block codes are defined as follows:
  • expurgated exponent
    $$E_{ex}(W_{Y|X}, R) = \max_{Q_X} E_{ex}(Q_X, W_{Y|X}, R), \tag{14}$$
  • random coding exponent
    $$E_{rc}(W_{Y|X}, R) = \max_{Q_X} E_{rc}(Q_X, W_{Y|X}, R), \tag{15}$$
  • sphere packing exponent
    $$E_{sp}(W_{Y|X}, R) = \max_{Q_X} E_{sp}(Q_X, W_{Y|X}, R). \tag{16}$$
Let $R_{sp}(W_{Y|X})$ be the smallest $R$ to the right of which $E_{sp}(W_{Y|X}, R)$ is finite. It is known ([7], Exercise 5.3; see also [10]) that
$$R_{sp}(W_{Y|X}) = \max_{Q_X} R_{sp}(Q_X, W_{Y|X}) = -\log \min_{Q_X} \max_{y} \sum_{x \in \mathcal{X}:\, W_{Y|X}(y|x) > 0} Q_X(x).$$
By Propositions 4 and 5, we recover the following well-known result [7,10]:
$$\max\{E_{ex}(W_{Y|X}, R), E_{rc}(W_{Y|X}, R)\} \leq E(W_{Y|X}, R) \leq E_{sp}(W_{Y|X}, R) \tag{17}$$
with the possible exception of $R = R_{sp}(W_{Y|X})$, at which point the second inequality in (17) does not necessarily hold.
Now we proceed to review the results on the reliability function of fixed-rate Slepian-Wolf coding. A fixed-rate Slepian-Wolf code $\phi_n(\cdot)$ is a mapping from $\mathcal{X}^n$ to a set $\mathcal{A}_n$. The rate of $\phi_n(\cdot)$ is defined as
$$R(\phi_n) = \frac{1}{n} \log |\mathcal{A}_n|.$$
Given $\phi_n(X^n)$ and $Y^n$, the output of the optimal maximum a posteriori (MAP) decoder is
$$\hat{X}^n = \arg\min_{x^n:\, \phi_n(x^n) = \phi_n(X^n)} \sum_{i=1}^{n} -\log P_{X|Y}(x_i | Y_i) = \arg\min_{x^n:\, \phi_n(x^n) = \phi_n(X^n)} \sum_{i=1}^{n} -\log P_{XY}(x_i, Y_i),$$
where the ties are broken in an arbitrary manner. The decoding error probability of Slepian-Wolf code $\phi_n(\cdot)$ is defined as
$$P_e(\phi_n, P_{XY}) = \Pr\{\hat{X}^n \neq X^n\}.$$
The correct decoding probability of Slepian-Wolf code $\phi_n(\cdot)$ is defined as
$$P_c(\phi_n, P_{XY}) = 1 - P_e(\phi_n, P_{XY}).$$
Definition 3.
Given a joint probability distribution $P_{XY}$, we say that an error exponent $E \geq 0$ is achievable with fixed-rate Slepian-Wolf codes at rate $R$ if for any $\delta > 0$, there exists a sequence of fixed-rate Slepian-Wolf codes $\{\phi_n\}$ such that
$$\limsup_{n\to\infty} R(\phi_n) \leq R + \delta, \quad \limsup_{n\to\infty} \frac{1}{n} \log P_e(\phi_n, P_{XY}) \leq -E + \delta.$$
The largest achievable error exponent at rate $R$ is denoted by $E_f(P_{XY}, R)$. The function $E_f(P_{XY}, \cdot)$ is referred to as the reliability function of fixed-rate Slepian-Wolf coding.
Similarly, we say that a correct decoding exponent $E_c \geq 0$ is achievable with fixed-rate Slepian-Wolf codes at rate $R$ if for any $\delta > 0$, there exists a sequence of fixed-rate Slepian-Wolf codes $\{\phi_n\}$ such that
$$\limsup_{n\to\infty} R(\phi_n) \leq R + \delta, \quad \liminf_{n\to\infty} \frac{1}{n} \log P_c(\phi_n, P_{XY}) \geq -(E_c + \delta).$$
The smallest achievable correct decoding exponent at rate $R$ is denoted by $E_f^c(P_{XY}, R)$. It will be seen that $E_f^c(P_{XY}, R)$ is positive if and only if $R < H(X|Y)$. Therefore, we shall refer to the function $E_f^c(P_{XY}, \cdot)$ as the reliability function of fixed-rate Slepian-Wolf coding below the Slepian-Wolf limit.
The expurgated exponent, random coding exponent, and sphere packing exponent of fixed-rate Slepian-Wolf coding are defined as follows:
  • expurgated exponent
    $$E_{f,ex}(P_{XY}, R) = \min_{Q_X} D(Q_X \| P_X) + E_{ex}(Q_X, P_{Y|X}, H(Q_X) - R), \tag{18}$$
  • random coding exponent
    $$E_{f,rc}(P_{XY}, R) = \min_{Q_X} D(Q_X \| P_X) + E_{rc}(Q_X, P_{Y|X}, H(Q_X) - R), \tag{19}$$
  • sphere packing exponent
    $$E_{f,sp}(P_{XY}, R) = \min_{Q_X} D(Q_X \| P_X) + E_{sp}(Q_X, P_{Y|X}, H(Q_X) - R). \tag{20}$$
Equivalently, the random coding exponent and sphere packing exponent of fixed-rate Slepian-Wolf coding can be written as [11]:
$$E_{f,rc}(P_{XY}, R) = \max_{0 \leq \rho \leq 1} \rho R - \log \sum_y \Big[\sum_x P_{XY}(x,y)^{\frac{1}{1+\rho}}\Big]^{1+\rho}, \quad E_{f,sp}(P_{XY}, R) = \sup_{\rho > 0}\, \rho R - \log \sum_y \Big[\sum_x P_{XY}(x,y)^{\frac{1}{1+\rho}}\Big]^{1+\rho}.$$
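These Gallager-style forms are directly computable. The sketch below (our own scaffolding) evaluates $E_{f,rc}$ by a grid search over $\rho \in [0, 1]$ and approximates $E_{f,sp}$ by truncating the supremum over $\rho > 0$ to a finite grid, so near $R_{f,sp}(P_{XY})$, where the true supremum is infinite, it returns only a finite lower approximation. The parameter values are our own illustrative choices:

```python
import numpy as np

def E0(rho, Pxy):
    """log sum_y [ sum_x P(x,y)^{1/(1+rho)} ]^{1+rho}, natural log."""
    inner = np.sum(Pxy ** (1.0 / (1.0 + rho)), axis=0)  # sum over x for each y
    return np.log(np.sum(inner ** (1.0 + rho)))

def E_f_rc(R, Pxy, grid=np.linspace(0.0, 1.0, 2001)):
    return max(rho * R - E0(rho, Pxy) for rho in grid)

def E_f_sp(R, Pxy, grid=np.linspace(1e-4, 50.0, 20001)):
    return max(rho * R - E0(rho, Pxy) for rho in grid)  # truncated sup over rho > 0

# binary example in the spirit of Section 5 with p = 0.1, tau = 0.4 (our numbers)
p, tau = 0.1, 0.4
Pxy = np.array([[tau * (1 - p), (1 - tau) * p],
                [tau * p, (1 - tau) * (1 - p)]])        # Pxy[x, y]
for R in (0.5, 0.6, 0.69):
    print(R, E_f_rc(R, Pxy), E_f_sp(R, Pxy))
```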
To see the connection between the random coding exponent and the sphere packing exponent, we shall write them in the following parametric forms [11]:
$$R = H(X^{(\rho)} | Y^{(\rho)}), \quad E_{f,sp}(P_{XY}, R) = D(P_{X^{(\rho)} Y^{(\rho)}} \| P_{XY}),$$
and
$$E_{f,rc}(P_{XY}, R) = \begin{cases} D(P_{X^{(\rho)} Y^{(\rho)}} \| P_{XY}) & \text{if } H(X|Y) \leq R \leq H(X^{(\rho)} | Y^{(\rho)})\big|_{\rho=1}, \\ R - \log \sum_y \big[\sum_x \sqrt{P_{XY}(x,y)}\big]^2 & \text{if } R > H(X^{(\rho)} | Y^{(\rho)})\big|_{\rho=1}, \end{cases}$$
where the joint distribution of $(X^{(\rho)}, Y^{(\rho)})$ is $P_{X^{(\rho)} Y^{(\rho)}}$, which is specified by
$$P_{Y^{(\rho)}}(y) = \frac{P_Y(y) \big[\sum_x P_{X|Y}(x|y)^{\frac{1}{1+\rho}}\big]^{1+\rho}}{\sum_{y'} P_Y(y') \big[\sum_x P_{X|Y}(x|y')^{\frac{1}{1+\rho}}\big]^{1+\rho}}, \quad y \in \mathcal{Y}, \tag{21}$$
$$P_{X^{(\rho)} | Y^{(\rho)}}(x|y) = \frac{P_{X|Y}(x|y)^{\frac{1}{1+\rho}}}{\sum_{x'} P_{X|Y}(x'|y)^{\frac{1}{1+\rho}}}, \quad x \in \mathcal{X},\ y \in \mathcal{Y}. \tag{22}$$
Define the critical rate
$$R_{f,cr}(P_{XY}) = H(X^{(\rho)} | Y^{(\rho)})\big|_{\rho=1}.$$
Note that $E_{f,rc}(P_{XY}, R)$ and $E_{f,sp}(P_{XY}, R)$ coincide when $R \in [H(X|Y), R_{f,cr}(P_{XY})]$. Let $R_{f,sp}(P_{XY}) = \sup\{R: E_{f,sp}(P_{XY}, R) < \infty\}$. It is shown in [12] that
$$R_{f,sp}(P_{XY}) = \max_y \log \big|\{x \in \mathcal{X}: P_{X|Y}(x|y) > 0\}\big|.$$
It is well known [8,11,13] that the reliability function $E_f(P_{XY}, \cdot)$ is upper-bounded by $E_{f,sp}(P_{XY}, \cdot)$ and lower-bounded by $E_{f,rc}(P_{XY}, \cdot)$ and $E_{f,ex}(P_{XY}, \cdot)$, i.e.,
$$\max\{E_{f,rc}(P_{XY}, R), E_{f,ex}(P_{XY}, R)\} \leq E_f(P_{XY}, R) \leq E_{f,sp}(P_{XY}, R) \tag{23}$$
with the possible exception of $R = R_{f,sp}(P_{XY})$, at which point the second inequality in (23) does not necessarily hold. Note that $E_f(P_{XY}, R)$ is completely characterized for $R \in [H(X|Y), R_{f,cr}(P_{XY})]$.
Unlike $E_f(P_{XY}, \cdot)$, the function $E_f^c(P_{XY}, \cdot)$ has been characterized for all $R$. Specifically, it is shown in [6,14] that
$$E_f^c(P_{XY}, R) = \min_{Q_X} D(Q_X \| P_X) + E_c(Q_X, P_{Y|X}, H(Q_X) - R). \tag{24}$$
Comparing (14) with (18), (15) with (19), (16) with (20), and (13) with (24), one can easily see that there exists an intimate connection between fixed-rate Slepian-Wolf coding for source distribution $P_{XY}$ and channel coding over channel $P_{Y|X}$. This connection can be roughly interpreted as a manifestation of the following facts [15].
  • Given, for each type $Q_X \in \mathcal{P}_n(\mathcal{X})$, a constant composition code $\mathcal{C}_n(Q_X) \subseteq T^n(Q_X)$ with $R(\mathcal{C}_n(Q_X)) \approx H(Q_X) - R$ and $P_{e,\max}(\mathcal{C}_n(Q_X), P_{Y|X}) \approx e^{-nE(Q_X)}$, one can use $\mathcal{C}_n(Q_X)$ to partition the type class $T^n(Q_X)$ into approximately $e^{nR}$ disjoint subsets such that each subset is a constant composition code of type $Q_X$ whose maximum decoding error probability over channel $P_{Y|X}$ is approximately equal to or less than that of $\mathcal{C}_n(Q_X)$. Note that these partitions, one for each type class, yield a fixed-rate Slepian-Wolf code of rate approximately $R$ with $\Pr\{\hat{X}^n \neq X^n | X^n \in T^n(Q_X)\} \lesssim e^{-nE(Q_X)}$. Since $\Pr\{X^n \in T^n(Q_X)\} \approx e^{-nD(Q_X \| P_X)}$ (cf. (2) and (3)), it follows that $\Pr\{\hat{X}^n \neq X^n, X^n \in T^n(Q_X)\} \lesssim e^{-n[D(Q_X \| P_X) + E(Q_X)]}$. The overall decoding error probability $\Pr\{\hat{X}^n \neq X^n\}$ of the resulting Slepian-Wolf code can be upper-bounded, on the exponential scale, by $e^{-n[D(Q_X^* \| P_X) + E(Q_X^*)]}$, where $Q_X^* = \arg\min_{Q_X} D(Q_X \| P_X) + E(Q_X)$. In contrast, one has the freedom to choose $Q_X$ in channel coding, which explains why maximization (instead of minimization) is used in (14)–(16).
  • Given a fixed-rate Slepian-Wolf code $\phi_n(\cdot)$ with $R(\phi_n) \approx R$ and $P_e(\phi_n, P_{XY}) \approx e^{-nE}$, one can, for each type $Q_X \in \mathcal{P}_n(\mathcal{X})$, lift out a constant composition code $\mathcal{C}_n(Q_X) \subseteq T^n(Q_X)$ with $R(\mathcal{C}_n(Q_X)) \gtrsim H(Q_X) - R$ and $P_e(\mathcal{C}_n(Q_X), P_{Y|X}) \lesssim e^{-n[E - D(Q_X \| P_X)]}$.
  • The correct decoding exponents for channel coding and fixed-rate Slepian-Wolf coding can be interpreted in a similar way. Note that in channel coding, to maximize the correct decoding probability one has to minimize the correct decoding exponent; this is why in (13) minimization (instead of maximization) is used.
Therefore, it should be clear that to characterize the reliability functions for channel coding and fixed-rate Slepian-Wolf coding, it suffices to focus on constant composition codes. It will be shown in the next section that a similar reduction holds for variable-rate Slepian-Wolf coding. Indeed, the reliability function for constant composition codes plays a predominant role in determining the fundamental rate-error tradeoff in variable-rate Slepian-Wolf coding.

3. Variable-Rate Slepian-Wolf Coding: Above the Slepian-Wolf Limit

A variable-rate Slepian-Wolf code $\varphi_n(\cdot)$ is a mapping from $\mathcal{X}^n$ to a binary prefix code $\mathcal{B}_n$. Let $l(\varphi_n(x^n))$ denote the length of the binary string $\varphi_n(x^n)$. The rate of variable-rate Slepian-Wolf code $\varphi_n(\cdot)$ is defined as
$$R(\varphi_n, P_{XY}) = \frac{1}{n \log_2 e}\, \mathbb{E}[l(\varphi_n(X^n))].$$
It is worth noting that $R(\varphi_n, P_{XY})$ depends on $P_{XY}$ only through $P_X$.
Given $\varphi_n(X^n)$ and $Y^n$, the output of the optimal maximum a posteriori (MAP) decoder is
$$\hat{X}^n = \arg\min_{x^n:\, \varphi_n(x^n) = \varphi_n(X^n)} \sum_{i=1}^{n} -\log P_{X|Y}(x_i | Y_i) = \arg\min_{x^n:\, \varphi_n(x^n) = \varphi_n(X^n)} \sum_{i=1}^{n} -\log P_{XY}(x_i, Y_i),$$
where the ties are broken in an arbitrary manner. The decoding error probability of variable-rate Slepian-Wolf code $\varphi_n(\cdot)$ is defined as
$$P_e(\varphi_n, P_{XY}) = \Pr\{\hat{X}^n \neq X^n\}.$$
The correct decoding probability of Slepian-Wolf code $\varphi_n(\cdot)$ is defined as
$$P_c(\varphi_n, P_{XY}) = 1 - P_e(\varphi_n, P_{XY}).$$
Definition 4.
Given a joint probability distribution $P_{XY}$, we say that an error exponent $E \geq 0$ is achievable with variable-rate Slepian-Wolf codes at rate $R$ if for any $\delta > 0$, there exists a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n\}$ such that
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) \leq R + \delta, \quad \limsup_{n\to\infty} \frac{1}{n} \log P_e(\varphi_n, P_{XY}) \leq -E + \delta.$$
The largest achievable error exponent at rate $R$ is denoted by $E_v(P_{XY}, R)$. The function $E_v(P_{XY}, \cdot)$ is referred to as the reliability function of variable-rate Slepian-Wolf coding.
The power of variable-rate Slepian-Wolf coding results from its flexibility in rate allocation. Since there is only a polynomial number of types for any given $n$ (cf. (1)), the encoder can convey the type information to the decoder using a negligible amount of rate when $n$ is large enough. Therefore, without much loss of generality, we can assume that the type of $X^n$ is known to the decoder. Under this assumption, an optimal fixed-rate Slepian-Wolf encoder of rate $R$ should partition $T^n(P)$ into $\min\{|T^n(P)|, e^{nR}\}$ disjoint subsets for each $P \in \mathcal{P}_n(\mathcal{X})$. It can be seen that the rate allocated to $T^n(P)$ is always $R$ if $|T^n(P)| \geq e^{nR}$. In general, the type $Q_X^*$ that dominates the error probability of fixed-rate Slepian-Wolf coding is different from $P_X$. In contrast, for variable-rate Slepian-Wolf coding, we can losslessly compress the sequences whose types are bounded away from $P_X$ by allocating enough rate to those type classes (their contribution to the overall rate is still negligible since the probability of those type classes is extremely small), and therefore effectively eliminate the dominant error event in fixed-rate Slepian-Wolf coding. As a consequence, the types that can cause decoding errors in variable-rate Slepian-Wolf coding must be very close to $P_X$. This is the main intuition underlying the proof of the following theorem. A similar argument has been used in the context of variable-rate Slepian-Wolf coding under mismatched decoding [16].
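This intuition is easy to check numerically: for a binary source, the total probability of the type classes whose types deviate noticeably from $P_X$ decays exponentially in $n$, so describing those sequences losslessly costs a vanishing amount of rate. A small sketch with exact binomial probabilities (the threshold and parameters are our own, not from the paper):

```python
import math

def binom_pmf(n, k, q):
    return math.comb(n, k) * q**k * (1 - q)**(n - k)

q, eps = 0.3, 0.05   # P_X(1) = 0.3; a type k/n is "far" if |k/n - q| > eps
for n in (100, 500, 1000, 2000):
    far = sum(binom_pmf(n, k, q) for k in range(n + 1) if abs(k / n - q) > eps)
    # rate overhead if far type classes are sent losslessly at log|X| nats/symbol
    overhead = far * math.log(2)
    print(f"n={n:5d}  P(far types)={far:.2e}  lossless overhead={overhead:.2e} nats/symbol")
```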
Theorem 1.
$$E_v(P_{XY}, R) = E(P_X, P_{Y|X}, H(P_X) - R).$$
Proof. 
The proof is divided into two parts. Firstly, we shall show that $E_v(P_{XY}, R) \geq E(P_X, P_{Y|X}, H(P_X) - R)$. The main idea is that one can use a constant composition code $\mathcal{C}_n$ of type approximately $P_X$ and rate approximately $H(P_X) - R$ to construct a variable-rate Slepian-Wolf code $\varphi_{n'}(\cdot)$ with $n' \geq n$, $R(\varphi_{n'}, P_{XY}) \approx R$, and $P_e(\varphi_{n'}, P_{XY}) \lesssim P_{e,\max}(\mathcal{C}_n, P_{Y|X})$.
By Definition 2, for any $\delta > 0$, there exists a sequence of constant composition codes $\{\mathcal{C}_n\}$ with $\mathcal{C}_n \subseteq T^n(P_n)$ for some $P_n \in \mathcal{P}_n(\mathcal{X})$ such that
$$\lim_{n\to\infty} \|P_n - P_X\| = 0, \quad \liminf_{n\to\infty} R(\mathcal{C}_n) \geq H(P_X) - R - \delta, \quad \limsup_{n\to\infty} \frac{1}{n} \log P_{e,\max}(\mathcal{C}_n, P_{Y|X}) \leq -E(P_X, P_{Y|X}, H(P_X) - R) + \delta.$$
Since $P_X(x) > 0$ for all $x \in \mathcal{X}$, we have
$$\max_{P \in \mathcal{P}_n(\mathcal{X}) \cap \mathcal{E}(\delta)} \max_x \frac{P_n(x)}{P(x)} \leq (1 + \delta)^2$$
for all sufficiently large $n$, where
$$\mathcal{E}(\delta) = \Big\{P \in \mathcal{P}(\mathcal{X}): \max_x \frac{P_X(x)}{P(x)} \leq 1 + \delta,\ H(P) \leq H(P_X) + \delta,\ D(P \| P_X) \leq \delta\Big\}.$$
Let $k_n = \lceil (1 + \delta)^2 n \rceil$. When $n$ is large enough, we can, for each $P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)$, construct a constant composition code $\mathcal{C}_{k_n}(P)$ of length $k_n$ and type $P$ by appending a fixed sequence in $\mathcal{X}^{k_n - n}$ to each codeword in $\mathcal{C}_n$. It is easy to see that
$$|\mathcal{C}_{k_n}(P)| = |\mathcal{C}_n|, \tag{25}$$
$$P_{e,\max}(\mathcal{C}_{k_n}(P), P_{Y|X}) = P_{e,\max}(\mathcal{C}_n, P_{Y|X}) \tag{26}$$
for all $P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)$. One can readily show by invoking the covering lemma in [17] that for each $P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)$, there exist $L(k_n)$ permutations $\pi_1, \ldots, \pi_{L(k_n)}$ of the integers $1, \ldots, k_n$ such that
$$\bigcup_{i=1}^{L(k_n)} \pi_i(\mathcal{C}_{k_n}(P)) = T^{k_n}(P),$$
where
$$L(k_n) = \max_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \Big\lceil |\mathcal{C}_{k_n}(P)|^{-1}\, |T^{k_n}(P)| \log |T^{k_n}(P)| \Big\rceil + 1.$$
In view of (25), we can rewrite $L(k_n)$ as
$$L(k_n) = \max_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \Big\lceil |\mathcal{C}_n|^{-1}\, |T^{k_n}(P)| \log |T^{k_n}(P)| \Big\rceil + 1.$$
Note that
$$P_{e,\max}(\pi_i(\mathcal{C}_{k_n}(P)), P_{Y|X}) = P_{e,\max}(\mathcal{C}_{k_n}(P), P_{Y|X}), \quad i = 1, 2, \ldots, L(k_n). \tag{27}$$
Given $\pi_1(\mathcal{C}_{k_n}(P)), \ldots, \pi_{L(k_n)}(\mathcal{C}_{k_n}(P))$, we can partition $T^{k_n}(P)$ into $L(k_n)$ disjoint subsets:
$$T^{k_n}(P, 1) = \pi_1(\mathcal{C}_{k_n}(P)), \quad T^{k_n}(P, i) = \pi_i(\mathcal{C}_{k_n}(P)) \setminus \bigcup_{j=1}^{i-1} \pi_j(\mathcal{C}_{k_n}(P)), \quad i = 2, \ldots, L(k_n).$$
It is clear that
$$P_{e,\max}(T^{k_n}(P, i), P_{Y|X}) \leq P_{e,\max}(\pi_i(\mathcal{C}_{k_n}(P)), P_{Y|X}), \quad i = 1, 2, \ldots, L(k_n). \tag{28}$$
Now construct a sequence of variable-rate Slepian-Wolf codes $\{\varphi_{k_n}(\cdot)\}$ as follows.
  • The encoder sends the type of $x^{k_n}$ to the decoder, where each type is uniquely represented by a binary sequence of length $m_1(k_n)$.
  • If $x^{k_n} \in T^{k_n}(P)$ for some $P \notin \mathcal{E}(\delta)$, the encoder sends $x^{k_n}$ losslessly to the decoder, where each $x^{k_n} \in T^{k_n}(P)$ is uniquely represented by a binary sequence of length $m_2(k_n)$.
  • If $x^{k_n} \in T^{k_n}(P)$ for some $P \in \mathcal{E}(\delta)$, the encoder finds the set $\pi_{i^*}(\mathcal{C}_{k_n}(P))$ that contains $x^{k_n}$ and sends the index $i^*$ to the decoder, where each index in $\{1, 2, \ldots, L(k_n)\}$ is uniquely represented by a binary sequence of length $m_3(k_n)$.
Specifically, we choose
$$m_1(k_n) = \lceil \log_2 |\mathcal{P}_{k_n}(\mathcal{X})| \rceil, \quad m_2(k_n) = \max_{P \in \mathcal{P}_{k_n}(\mathcal{X})} \lceil \log_2 |T^{k_n}(P)| \rceil, \quad m_3(k_n) = \lceil \log_2 L(k_n) \rceil.$$
Note that
$$R(\varphi_{k_n}, P_{XY}) = \frac{m_1(k_n) + (1 - \theta)\, m_2(k_n) + \theta\, m_3(k_n)}{k_n \log_2 e},$$
where
$$\theta = \sum_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \Pr\{X^{k_n} \in T^{k_n}(P)\}.$$
It is easy to verify (cf. (1)–(3)) that
$$m_1(k_n) \leq |\mathcal{X}| \log_2(k_n + 1) + 1, \quad m_2(k_n) \leq k_n \log_2 |\mathcal{X}| + 1, \quad 1 - \theta \leq (k_n + 1)^{|\mathcal{X}|} e^{-k_n \delta}.$$
Therefore, we have
$$\limsup_{n\to\infty} R(\varphi_{k_n}, P_{XY}) = \limsup_{n\to\infty} \frac{m_3(k_n)}{k_n \log_2 e} \leq \max_{P \in \mathcal{E}(\delta)} H(P) - \frac{1}{(1+\delta)^2} \liminf_{n\to\infty} R(\mathcal{C}_n) \leq H(P_X) + \delta - \frac{H(P_X) - R - \delta}{(1+\delta)^2}. \tag{29}$$
By (26)–(28) and the construction of $\varphi_{k_n}(\cdot)$, it is clear that
$$\begin{aligned} P_e(\varphi_{k_n}, P_{XY}) &= \sum_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \sum_{i=1}^{L(k_n)} \Pr\{X^{k_n} \in T^{k_n}(P, i)\} \Pr\{\hat{X}^{k_n} \neq X^{k_n} | X^{k_n} \in T^{k_n}(P, i)\} \\ &\leq \sum_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \sum_{i=1}^{L(k_n)} \Pr\{X^{k_n} \in T^{k_n}(P, i)\}\, P_{e,\max}(T^{k_n}(P, i), P_{Y|X}) \\ &\leq \sum_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \sum_{i=1}^{L(k_n)} \Pr\{X^{k_n} \in T^{k_n}(P, i)\}\, P_{e,\max}(\pi_i(\mathcal{C}_{k_n}(P)), P_{Y|X}) \\ &= \sum_{P \in \mathcal{P}_{k_n}(\mathcal{X}) \cap \mathcal{E}(\delta)} \sum_{i=1}^{L(k_n)} \Pr\{X^{k_n} \in T^{k_n}(P, i)\}\, P_{e,\max}(\mathcal{C}_n, P_{Y|X}) \\ &\leq P_{e,\max}(\mathcal{C}_n, P_{Y|X}), \end{aligned}$$
which implies
$$\limsup_{n\to\infty} \frac{1}{k_n} \log P_e(\varphi_{k_n}, P_{XY}) \leq \limsup_{n\to\infty} \frac{1}{k_n} \log P_{e,\max}(\mathcal{C}_n, P_{Y|X}) \leq \frac{-E(P_X, P_{Y|X}, H(P_X) - R) + \delta}{(1+\delta)^2}. \tag{30}$$
In view of (29), (30), and the fact that $\delta > 0$ is arbitrary, we must have $E_v(P_{XY}, R) \geq E(P_X, P_{Y|X}, H(P_X) - R)$ (cf. Definition 4).
Now we proceed to show that $E_v(P_{XY}, R) \leq E(P_X, P_{Y|X}, H(P_X) - R)$. The main idea is that one can extract a constant composition code of type approximately $P_X$ and rate approximately $H(P_X) - R$ or greater from a given variable-rate Slepian-Wolf code $\varphi_n(\cdot)$ of rate approximately $R$ such that the average decoding error probability of this constant composition code over channel $P_{Y|X}$ is bounded from above by $\gamma P_e(\varphi_n, P_{XY})$, where $\gamma$ is a constant that does not depend on $n$.
By Definition 4, for any $\delta > 0$, there exists a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n\}$ such that
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) \leq R + \delta, \tag{31}$$
$$\limsup_{n\to\infty} \frac{1}{n} \log P_e(\varphi_n, P_{XY}) \leq -E_v(P_{XY}, R) + \delta. \tag{32}$$
Suppose $\varphi_n(\cdot)$ induces a partition of $T^n(P)$, $P \in \mathcal{P}_n(\mathcal{X})$, into $N_n(P)$ disjoint subsets $T^n(P, 1), \ldots, T^n(P, N_n(P))$. Here the partition is defined as follows: $\varphi_n(x^n) = \varphi_n(\tilde{x}^n)$ if $x^n, \tilde{x}^n \in T^n(P, i)$ for some $i$, and $\varphi_n(x^n) \neq \varphi_n(\tilde{x}^n)$ if $x^n \in T^n(P, i)$, $\tilde{x}^n \in T^n(P, j)$ for $i \neq j$. Let
$$r(T^n(P)) = \frac{1}{n \log_2 e}\, \mathbb{E}[l(\varphi_n(X^n)) | X^n \in T^n(P)], \quad P \in \mathcal{P}_n(\mathcal{X}).$$
It follows from the source coding theorem that
$$r(T^n(P)) \geq \frac{1}{n} \sum_{i=1}^{N_n(P)} \frac{|T^n(P, i)|}{|T^n(P)|} \log \frac{|T^n(P)|}{|T^n(P, i)|}. \tag{33}$$
Define
$$\begin{aligned} \mathcal{F}_n(\delta) &= \Big\{(P, i): \frac{1}{n} \log \frac{|T^n(P)|}{|T^n(P, i)|} \leq R + 2\delta,\ P \in \mathcal{P}_n(\mathcal{X}),\ i = 1, 2, \ldots, N_n(P)\Big\}, \\ \mathcal{F}_n^c(\delta) &= \big\{(P, i) \notin \mathcal{F}_n(\delta): P \in \mathcal{P}_n(\mathcal{X}),\ i = 1, 2, \ldots, N_n(P)\big\}, \\ \mathcal{G}_n(\gamma) &= \big\{(P, i): \Pr\{\hat{X}^n \neq X^n | X^n \in T^n(P, i)\} \leq \gamma P_e(\varphi_n, P_{XY}),\ P \in \mathcal{P}_n(\mathcal{X}),\ i = 1, 2, \ldots, N_n(P)\big\}, \\ \mathcal{G}_n^c(\gamma) &= \big\{(P, i) \notin \mathcal{G}_n(\gamma): P \in \mathcal{P}_n(\mathcal{X}),\ i = 1, 2, \ldots, N_n(P)\big\}, \end{aligned}$$
where
$$\gamma > \frac{R + 2\delta}{\delta}. \tag{34}$$
Note that
$$\begin{aligned} R(\varphi_n, P_{XY}) &= \sum_{P \in \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\}\, r(T^n(P)) \\ &\geq \frac{1}{n} \sum_{P \in \mathcal{P}_n(\mathcal{X})} \sum_{i=1}^{N_n(P)} \Pr\{X^n \in T^n(P, i)\} \log \frac{|T^n(P)|}{|T^n(P, i)|} \quad (35) \\ &\geq \frac{1}{n} \sum_{(P, i) \in \mathcal{F}_n^c(\delta)} \Pr\{X^n \in T^n(P, i)\} \log \frac{|T^n(P)|}{|T^n(P, i)|} \\ &\geq (R + 2\delta) \sum_{(P, i) \in \mathcal{F}_n^c(\delta)} \Pr\{X^n \in T^n(P, i)\}, \quad (36) \end{aligned}$$
where (35) is due to (33). Combining (31) and (36) yields
$$\limsup_{n\to\infty} \sum_{(P, i) \in \mathcal{F}_n^c(\delta)} \Pr\{X^n \in T^n(P, i)\} \leq \frac{R + \delta}{R + 2\delta}. \tag{37}$$
Moreover, we have
$$\sum_{(P, i) \in \mathcal{G}_n^c(\gamma)} \Pr\{X^n \in T^n(P, i)\} \leq \frac{1}{\gamma} \tag{38}$$
since otherwise
$$\begin{aligned} P_e(\varphi_n, P_{XY}) &= \sum_{P \in \mathcal{P}_n(\mathcal{X})} \sum_{i=1}^{N_n(P)} \Pr\{X^n \in T^n(P, i)\} \Pr\{\hat{X}^n \neq X^n | X^n \in T^n(P, i)\} \\ &\geq \sum_{(P, i) \in \mathcal{G}_n^c(\gamma)} \Pr\{X^n \in T^n(P, i)\} \Pr\{\hat{X}^n \neq X^n | X^n \in T^n(P, i)\} \\ &> \gamma P_e(\varphi_n, P_{XY}) \sum_{(P, i) \in \mathcal{G}_n^c(\gamma)} \Pr\{X^n \in T^n(P, i)\} \\ &\geq P_e(\varphi_n, P_{XY}), \end{aligned}$$
which is absurd.
Define
$$\begin{aligned} \mathcal{S}_n(\delta) &= \Big\{P \in \mathcal{P}_n(\mathcal{X}): H(P) \geq H(P_X) - \delta,\ \max_x \frac{P(x)}{P_X(x)} \leq 1 + \delta\Big\}, \\ \mathcal{S}_n^c(\delta) &= \mathcal{P}_n(\mathcal{X}) \setminus \mathcal{S}_n(\delta), \\ \mathcal{D}_n(\delta, \gamma) &= \big\{(P, i): (P, i) \in \mathcal{F}_n(\delta) \cap \mathcal{G}_n(\gamma),\ P \in \mathcal{S}_n(\delta)\big\}, \\ \mathcal{D}_n^c(\delta, \gamma) &= \big\{(P, i) \notin \mathcal{D}_n(\delta, \gamma): P \in \mathcal{P}_n(\mathcal{X}),\ i = 1, 2, \ldots, N_n(P)\big\}. \end{aligned}$$
It follows from the weak law of large numbers that
$$\lim_{n\to\infty} \sum_{P \in \mathcal{S}_n^c(\delta)} \Pr\{X^n \in T^n(P)\} = 0. \tag{39}$$
We have
$$\begin{aligned} \liminf_{n\to\infty} \sum_{(P, i) \in \mathcal{D}_n(\delta, \gamma)} \Pr\{X^n \in T^n(P, i)\} &= \liminf_{n\to\infty} \Big[1 - \sum_{(P, i) \in \mathcal{D}_n^c(\delta, \gamma)} \Pr\{X^n \in T^n(P, i)\}\Big] \\ &\geq \liminf_{n\to\infty} \Big[1 - \sum_{(P, i) \in \mathcal{F}_n^c(\delta)} \Pr\{X^n \in T^n(P, i)\} - \sum_{(P, i) \in \mathcal{G}_n^c(\gamma)} \Pr\{X^n \in T^n(P, i)\} - \sum_{P \in \mathcal{S}_n^c(\delta)} \Pr\{X^n \in T^n(P)\}\Big] \\ &\geq 1 - \frac{R + \delta}{R + 2\delta} - \frac{1}{\gamma} \quad (40) \\ &> 0, \quad (41) \end{aligned}$$
where (40) is due to (37)–(39), and (41) is due to (34). Therefore, $\mathcal{D}_n(\delta, \gamma)$ is non-empty for all sufficiently large $n$. Pick an arbitrary $(P_n^*, i^*)$ from $\mathcal{D}_n(\delta, \gamma)$ for each sufficiently large $n$. We can construct a constant composition code $\mathcal{C}_{m_n}$ of length $m_n = \lceil (1 + \delta) n \rceil$ and type $P_{m_n}$ for some $P_{m_n} \in \mathcal{P}_{m_n}(\mathcal{X})$ by concatenating a fixed sequence in $\mathcal{X}^{m_n - n}$ to each sequence in $T^n(P_n^*, i^*)$ such that
$$\lim_{n\to\infty} \|P_{m_n} - P_X\| = 0. \tag{42}$$
Note that
$$\liminf_{n\to\infty} R(\mathcal{C}_{m_n}) = \liminf_{n\to\infty} \frac{1}{m_n} \log |T^n(P_n^*, i^*)| \geq \liminf_{n\to\infty} \frac{n}{m_n} \Big[\frac{1}{n} \log |T^n(P_n^*)| - R - 2\delta\Big] \geq \frac{H(P_X) - R - 3\delta}{1 + \delta}. \tag{43}$$
Moreover, since
$$P_e(\mathcal{C}_{m_n}, P_{Y|X}) = \Pr\{\hat{X}^n \neq X^n | X^n \in T^n(P_n^*, i^*)\} \leq \gamma P_e(\varphi_n, P_{XY}),$$
it follows from (32) that
$$\limsup_{n\to\infty} \frac{1}{m_n} \log P_e(\mathcal{C}_{m_n}, P_{Y|X}) \leq \frac{-E_v(P_{XY}, R) + \delta}{1 + \delta}. \tag{44}$$
In view of (42)–(44) and the fact that $\delta > 0$ is arbitrary, we must have $E_v(P_{XY}, R) \leq E(P_X, P_{Y|X}, H(P_X) - R)$ (cf. Definition 2). The proof is complete. ☐
The following result is an immediate consequence of Theorem 1 and Proposition 4.
Corollary 1.
Define
$$E_{v,ex}(P_{XY}, R) = E_{ex}(P_X, P_{Y|X}, H(P_X) - R), \quad E_{v,rc}(P_{XY}, R) = E_{rc}(P_X, P_{Y|X}, H(P_X) - R), \quad E_{v,sp}(P_{XY}, R) = E_{sp}(P_X, P_{Y|X}, H(P_X) - R).$$
We have
  • $E_v(P_{XY}, R) \geq \max\{E_{v,ex}(P_{XY}, R), E_{v,rc}(P_{XY}, R)\}$;
  • $E_v(P_{XY}, R) \leq E_{v,sp}(P_{XY}, R)$ with the possible exception of $R = H(P_X) - R_{sp}(P_X, P_{Y|X})$, at which point the inequality does not necessarily hold.
Remark 5.
  • We have $E_v(P_{XY}, R) = \infty$ for $R > H(P_X) - R_{ex}(P_X, P_{Y|X})$, and $E_v(P_{XY}, R) < \infty$ for $R < H(P_X) - R_{sp}(P_X, P_{Y|X})$. Therefore, $H(P_X) - R_{ex}(P_X, P_{Y|X})$ and $H(P_X) - R_{sp}(P_X, P_{Y|X})$ are respectively an upper bound and a lower bound on the zero-error rate of variable-rate Slepian-Wolf coding.
  • In view of (11), we have
    $$E_v(P_{XY}, R) = E_{v,sp}(P_{XY}, R) = E_{sp}(P_X, P_{Y|X}, H(P_X) - R)$$
    for $R \in [H(X|Y), H(P_X) - R_{cr}(P_X, P_{Y|X})]$. Note that
    $$E_{v,sp}(P_{XY}, R) \geq E_{f,sp}(P_{XY}, R) \geq E_f(P_{XY}, R),$$
    where the first inequality is strict unless the minimum in (20) is achieved at $Q_X = P_X$ (i.e., $P_{X^{(\rho)}} = P_X$, where $P_{X^{(\rho)}}$ is the marginal distribution of $X^{(\rho)}$ induced by $P_{Y^{(\rho)}}$ and $P_{X^{(\rho)}|Y^{(\rho)}}$ in (21) and (22)). Therefore, variable-rate Slepian-Wolf coding can outperform fixed-rate Slepian-Wolf coding in terms of rate-error tradeoff.
For $R > H(P_X) - R_{cr}(P_X, P_{Y|X})$, it is possible to obtain upper bounds on $E_v(P_{XY}, R)$ that are tighter than $E_{v,sp}(P_{XY}, R)$. Let $E_{ex}(P_{Y|X}, R)$ and $E_{sp}(P_{Y|X}, R)$ be respectively the expurgated exponent and the sphere packing exponent of channel $P_{Y|X}$. The straight-line exponent $E_{sl}(P_{Y|X}, R)$ of channel $P_{Y|X}$ [10] is the smallest linear function of $R$ that touches the curve $E_{sp}(P_{Y|X}, R)$ and also satisfies
$$E_{sl}(P_{Y|X}, 0) = E_{ex}(P_{Y|X}, 0),$$
where $E_{ex}(P_{Y|X}, 0)$ is assumed to be finite. Let $R_{sl}(P_{Y|X})$ be the point at which $E_{sl}(P_{Y|X}, R)$ and $E_{sp}(P_{Y|X}, R)$ coincide. It is well known [10] that $E(P_{Y|X}, R) \leq E_{sl}(P_{Y|X}, R)$ for $R \in (0, R_{sl}(P_{Y|X})]$. Since $E(P_X, P_{Y|X}, R) \leq E(P_{Y|X}, R)$, it follows from Theorem 1 that
$$E_v(P_{XY}, R) \leq E_{sl}(P_{Y|X}, H(P_X) - R)$$
for $R \in [\max\{H(P_X) - R_{sl}(P_{Y|X}), 0\}, H(P_X))$.
Note that the straight-line exponent holds for arbitrary block codes; one can obtain further improvement at high rates by leveraging bounds tailored to constant composition codes. Let $E_{ex}^*(Q_X, P_{Y|X}, 0)$ be the concave upper envelope of $E_{ex}(Q_X, P_{Y|X}, 0)$ considered as a function of $Q_X$. In view of ([7], Exercise 5.21), we have
$$E(Q_X, P_{Y|X}, R) \leq E_{ex}^*(Q_X, P_{Y|X}, 0)$$
for any $Q_X \in \mathcal{P}(\mathcal{X})$ and $R > 0$. Now it follows from Theorem 1 that
$$E_v(P_{XY}, R) \leq E_{ex}^*(P_X, P_{Y|X}, 0)$$
for $R < H(P_X)$.
The following theorem provides the second order expansion of E v ( P X Y , R ) at the Slepian-Wolf limit.
Theorem 2.
Assuming $R_{cr}(P_X, P_{Y|X}) < I(P_X, P_{Y|X})$ (see Proposition 1 for the necessary and sufficient condition), we have
$$\lim_{r \downarrow 0} \frac{E_v(P_{XY}, H(X|Y) + r)}{r^2} = \frac{1}{2} \Big[\sum_{x,y} P_{XY}(x,y)\, \tau^2(x,y) - \sum_x P_X(x) \Big(\sum_y \tau(x,y) P_{Y|X}(y|x)\Big)^2\Big]^{-1},$$
where $\tau(x,y) = \log P_Y(y) - \log P_{Y|X}(y|x)$.
Remark 6.
If $R_{cr}(P_X, P_{Y|X}) = I(P_X, P_{Y|X})$, then we have $E_{v,rc}(P_{XY}, R) = R - H(X|Y)$ for $R \geq H(X|Y)$, which implies
$$\lim_{r \downarrow 0} \frac{E_v(P_{XY}, H(X|Y) + r)}{r^2} = \infty.$$
It is also worth noting that the second-order expansion of $E_v(P_{XY}, R)$ at the Slepian-Wolf limit yields the redundancy-error tradeoff constant of variable-rate Slepian-Wolf coding derived in [18].
Proof. 
Since $R_{cr}(P_X, P_{Y|X}) < I(P_X, P_{Y|X})$, it follows that $H(X|Y) + r \in (H(X|Y), H(P_X) - R_{cr}(P_X, P_{Y|X}))$ when $r > 0$ is sufficiently close to zero. In this case, we have
$$\frac{E_v(P_{XY}, H(X|Y) + r)}{r^2} = \frac{E_{sp}(P_X, P_{Y|X}, I(P_X, P_{Y|X}) - r)}{r^2} = \min_{Q_{Y|X}:\, I(P_X, Q_{Y|X}) \leq I(P_X, P_{Y|X}) - r} \frac{D(Q_{Y|X} \| P_{Y|X} | P_X)}{r^2} = \min_{Q_{Y|X}:\, I(P_X, Q_{Y|X}) = I(P_X, P_{Y|X}) - r} \frac{D(Q_{Y|X} \| P_{Y|X} | P_X)}{r^2},$$
where the last equality follows from the fact that $E_{sp}(P_X, P_{Y|X}, R)$ is a strictly decreasing convex function of $R$ for $R \in (R_{sp}(P_X, P_{Y|X}), I(P_X, P_{Y|X})]$.
Let $\Delta(x, y) = Q_{Y|X}(y|x) - P_{Y|X}(y|x)$ for $x \in \mathcal{X}$, $y \in \mathcal{Y}$. Let $\Delta(y) = \sum_x P_X(x) \Delta(x, y)$ for $y \in \mathcal{Y}$. By the Taylor expansion,
$$\begin{aligned} I(P_X, Q_{Y|X}) &= \sum_{x,y} P_X(x)\big(P_{Y|X}(y|x) + \Delta(x,y)\big) \log\big(P_{Y|X}(y|x) + \Delta(x,y)\big) - \sum_y \big(P_Y(y) + \Delta(y)\big) \log\big(P_Y(y) + \Delta(y)\big) \\ &= \sum_{x,y} P_X(x)\big(P_{Y|X}(y|x) + \Delta(x,y)\big)\Big[\log P_{Y|X}(y|x) + \frac{\Delta(x,y)}{P_{Y|X}(y|x)} + o(\Delta(x,y))\Big] - \sum_y \big(P_Y(y) + \Delta(y)\big)\Big[\log P_Y(y) + \frac{\Delta(y)}{P_Y(y)} + o(\Delta(y))\Big] \\ &= I(P_X, P_{Y|X}) - \sum_y \big(\Delta(y) + \Delta(y) \log P_Y(y) + o(\Delta(y))\big) + \sum_{x,y} P_X(x)\big(\Delta(x,y) + \Delta(x,y) \log P_{Y|X}(y|x) + o(\Delta(x,y))\big) \end{aligned}$$
and
$$\begin{aligned} D(Q_{Y|X} \| P_{Y|X} | P_X) &= \sum_{x,y} P_X(x) Q_{Y|X}(y|x) \log \frac{Q_{Y|X}(y|x)}{P_{Y|X}(y|x)} = \sum_{x,y} P_X(x)\big(P_{Y|X}(y|x) + \Delta(x,y)\big) \log\Big(1 + \frac{\Delta(x,y)}{P_{Y|X}(y|x)}\Big) \\ &= \sum_{x,y} P_X(x)\big(P_{Y|X}(y|x) + \Delta(x,y)\big)\Big[\frac{\Delta(x,y)}{P_{Y|X}(y|x)} - \frac{\Delta^2(x,y)}{2 P_{Y|X}^2(y|x)} + o(\Delta^2(x,y))\Big] \\ &= \sum_{x,y} P_X(x) \frac{\Delta^2(x,y)}{2 P_{Y|X}(y|x)} + o(\Delta^2(x,y)). \end{aligned}$$
Here $f(z) = o(z)$ means $\lim_{z \to 0} \frac{f(z)}{z} = 0$.
As $r \to 0$, we have $\Delta(y) \to 0$, $\Delta(x,y) \to 0$ for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$. Therefore, by ignoring the higher-order terms, which do not affect the limit, we get
$$\lim_{r \downarrow 0} \frac{E_v(P_{XY}, H(X|Y) + r)}{r^2} = \lim_{r \downarrow 0} \frac{\min \sum_{x,y} P_X(x) \frac{\Delta^2(x,y)}{2 P_{Y|X}(y|x)}}{r^2}, \tag{45}$$
where the minimization is over $\Delta(x,y)$ ($x \in \mathcal{X}$, $y \in \mathcal{Y}$) subject to the constraints
  • $\sum_y \Delta(x, y) = 0$ for all $x \in \mathcal{X}$;
  • $\sum_{x,y} P_X(x) \tau(x, y) \Delta(x, y) = r$.
Introduce the Lagrange multipliers $\alpha(x)$ ($x \in \mathcal{X}$) and $\beta$ for these constraints, and define
$$G = \sum_{x,y} P_X(x) \frac{\Delta^2(x,y)}{2 P_{Y|X}(y|x)} - \sum_{x,y} \alpha(x) \Delta(x,y) - \beta \sum_{x,y} P_X(x) \tau(x,y) \Delta(x,y).$$
The Karush-Kuhn-Tucker conditions yield
$$\frac{\partial G}{\partial \Delta(x,y)} = -\alpha(x) - \beta P_X(x) \tau(x,y) + \frac{P_X(x) \Delta(x,y)}{P_{Y|X}(y|x)} = 0, \quad x \in \mathcal{X},\ y \in \mathcal{Y}.$$
Therefore, we have
$$\Delta(x,y) = \beta \tau(x,y) P_{Y|X}(y|x) + \frac{P_{Y|X}(y|x)}{P_X(x)}\, \alpha(x). \tag{46}$$
Substituting (46) into constraint 1, we obtain
$$\alpha(x) = -\beta P_X(x) \sum_y \tau(x,y) P_{Y|X}(y|x),$$
which, together with (46), yields
$$\Delta(x,y) = \beta \tau(x,y) P_{Y|X}(y|x) - \beta P_{Y|X}(y|x) \sum_{y'} \tau(x,y') P_{Y|X}(y'|x). \tag{47}$$
Therefore, we have
$$\begin{aligned} \sum_{x,y} P_X(x) \frac{\Delta^2(x,y)}{2 P_{Y|X}(y|x)} &= \frac{\beta^2}{2} \sum_{x,y} P_{XY}(x,y) \Big[\tau(x,y) - \sum_{y'} \tau(x,y') P_{Y|X}(y'|x)\Big]^2 \\ &= \frac{\beta^2}{2} \sum_{x,y} P_{XY}(x,y) \Big[\tau^2(x,y) - 2 \tau(x,y) \sum_{y'} \tau(x,y') P_{Y|X}(y'|x) + \Big(\sum_{y'} \tau(x,y') P_{Y|X}(y'|x)\Big)^2\Big] \\ &= \frac{\beta^2}{2} \Big[\sum_{x,y} P_{XY}(x,y)\, \tau^2(x,y) - \sum_x P_X(x) \Big(\sum_y \tau(x,y) P_{Y|X}(y|x)\Big)^2\Big]. \end{aligned} \tag{48}$$
Constraint 2 and (47) together yield
$$\frac{r^2}{\beta^2} = \Big[\frac{1}{\beta} \sum_{x,y} P_X(x) \tau(x,y) \Delta(x,y)\Big]^2 = \Big[\sum_{x,y} P_X(x) \tau(x,y) \Big(\tau(x,y) P_{Y|X}(y|x) - P_{Y|X}(y|x) \sum_{y'} \tau(x,y') P_{Y|X}(y'|x)\Big)\Big]^2 = \Big[\sum_{x,y} P_{XY}(x,y)\, \tau^2(x,y) - \sum_x P_X(x) \Big(\sum_y \tau(x,y) P_{Y|X}(y|x)\Big)^2\Big]^2. \tag{49}$$
The proof is complete by substituting (48) and (49) back into (45). ☐
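As a sanity check, the limiting constant in Theorem 2 can be evaluated directly from its closed form. The sketch below does so for a binary source in the spirit of Section 5; the parameter values are our own, and we implicitly assume $R_{cr}(P_X, P_{Y|X}) < I(P_X, P_{Y|X})$ holds for them:

```python
import numpy as np

p, tau = 0.1, 0.4                         # our example parameters
Pxy = np.array([[tau * (1 - p), (1 - tau) * p],
                [tau * p, (1 - tau) * (1 - p)]])   # Pxy[x, y]
Px = Pxy.sum(axis=1)
Py = Pxy.sum(axis=0)
Pygx = Pxy / Px[:, None]                  # P_{Y|X}(y|x)

tau_xy = np.log(Py)[None, :] - np.log(Pygx)        # tau(x, y)
m = np.sum(tau_xy * Pygx, axis=1)                  # E[tau | X = x]
V = np.sum(Pxy * tau_xy**2) - np.sum(Px * m**2)    # E[tau^2] - E[(E[tau|X])^2]
print("lim E_v(H(X|Y)+r)/r^2 =", 1.0 / (2.0 * V))
```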

4. Variable-Rate Slepian-Wolf Coding: Below the Slepian-Wolf Limit

Definition 5.
Given a joint probability distribution $P_{XY}$, we say that a correct decoding exponent $E_c \geq 0$ is achievable with variable-rate Slepian-Wolf codes at rate $R$ if for any $\delta > 0$, there exists a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n\}$ such that
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) \leq R + \delta, \quad \liminf_{n\to\infty} \frac{1}{n} \log P_c(\varphi_n, P_{XY}) \geq -(E_c + \delta).$$
The smallest achievable correct decoding exponent at rate $R$ is denoted by $E_v^c(P_{XY}, R)$.
In view of Theorem 1, it is tempting to conjecture that $E_v^c(P_{XY}, R) = E_c(P_X, P_{Y|X}, H(P_X) - R)$. It turns out this is not true. We shall show that $E_v^c(P_{XY}, R) = 0$ for all $R$. Actually, we have a stronger result: the correct decoding probability of variable-rate Slepian-Wolf coding can be bounded away from zero even when $R < H(X|Y)$. This is in sharp contrast with fixed-rate Slepian-Wolf coding, for which the correct decoding probability decays to zero exponentially fast if the rate is below the Slepian-Wolf limit. To make the statement precise, we need the following definition.
Definition 6.
Given a joint probability distribution $P_{XY}$, we say that a correct decoding probability $P_{c,v}(P_{XY}, R)$ is achievable with variable-rate Slepian-Wolf codes at rate $R$ if for any $\delta > 0$, there exists a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n\}$ such that
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) \leq R + \delta, \quad \limsup_{n\to\infty} P_c(\varphi_n, P_{XY}) \geq P_{c,v}(P_{XY}, R) - \delta.$$
The largest achievable correct decoding probability at rate $R$ is denoted by $P_{c,v}^{\max}(P_{XY}, R)$.
Theorem 3.
$$P_{c,v}^{\max}(P_{XY}, R) = \frac{R}{H(X|Y)} \quad \text{for } R \in (0, H(X|Y)].$$
Remark 7.
It is obvious that $P_{c,v}^{\max}(P_{XY}, R) = 1$ for $R > H(X|Y)$. Moreover, since $P_{c,v}^{\max}(P_{XY}, R)$ is a monotonically increasing function of $R$, it follows that $P_{c,v}^{\max}(P_{XY}, 0) = 0$.
Proof. 
The intuition underlying the proof is as follows. Assume the rate is below the Slepian-Wolf limit, i.e., $R < H(X|Y)$. For each type $P$ in the neighborhood of $P_X$, the rate allocated to the type class $T^n(P)$ should be no less than $H(X|Y)$ in order to correctly decode the sequences in $T^n(P)$. However, since almost all the probability is captured by the type classes whose types are in the neighborhood of $P_X$, there is not enough rate to protect all of them. Note that if the rate is evenly allocated among these type classes, none of them gets enough rate; consequently, the correct decoding probability goes to zero. A better strategy is to protect only a portion of them so as to accumulate enough rate. Specifically, we can protect a fraction $\frac{R}{H(X|Y)}$ of these type classes, so that the rate allocated to each of them is about $H(X|Y)$, and leave the remaining type classes unprotected. It turns out this strategy achieves the maximum correct decoding probability as the block length $n$ goes to infinity. Somewhat interestingly, although $E_v^c(P_{XY}, R) \neq E_c(P_X, P_{Y|X}, H(P_X) - R)$, the function $E_c(P_X, P_{Y|X}, \cdot)$ does play a fundamental role in establishing the correct result.
The proof is divided into two parts. Firstly, we shall show that $P_{c,v}^{\max}(P_{XY}, R) \geq \frac{R}{H(X|Y)}$. For any $\epsilon > 0$, define
$$\mathcal{U}(\epsilon) = \{P \in \mathcal{P}(\mathcal{X}): \|P - P_X\| \leq \epsilon\}.$$
Since $P_X(x) > 0$ for all $x \in \mathcal{X}$, we can choose $\epsilon$ small enough so that
$$q_{\min}(\epsilon) \triangleq \min_{P \in \mathcal{U}(\epsilon),\, x \in \mathcal{X}} P(x) > 0.$$
Using Stirling's approximation
$$\sqrt{2\pi m}\, \Big(\frac{m}{e}\Big)^m e^{\frac{1}{12m+1}} < m! < \sqrt{2\pi m}\, \Big(\frac{m}{e}\Big)^m e^{\frac{1}{12m}},$$
we have, for any $P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})$,
$$\Pr(X^n \in T^n(P)) = \frac{n!}{\prod_x (nP(x))!} \prod_x P_X(x)^{nP(x)} \leq \frac{\sqrt{2\pi n}\, e^{\frac{1}{12n}}}{\prod_x \sqrt{2\pi n P(x)}}\, e^{-n D(P \| P_X)} \leq \frac{\sqrt{2\pi}\, e^{\frac{1}{12n}}}{\prod_x \sqrt{2\pi P(x)}}\, n^{-\frac{|\mathcal{X}|-1}{2}} \leq \frac{\sqrt{2\pi}\, e^{\frac{1}{12n}}}{\prod_x \sqrt{2\pi q_{\min}(\epsilon)}}\, n^{-\frac{|\mathcal{X}|-1}{2}},$$
which implies that $\Pr(X^n \in T^n(P))$ converges uniformly to zero as $n \to \infty$ for all $P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})$.
which implies that Pr ( X n T n ( P ) ) converges uniformly to zero as n for all P U ( ϵ ) P n ( X ) . Moreover, it follows from the weak law of large numbers that
lim n P U ( ϵ ) P n ( X ) Pr ( X n T n ( P ) ) = 1 .
Therefore, for any δ > 0 , R ( 0 , H ( X | Y ) ] , and sufficiently large n, we can find a set S n U ( ϵ ) P n ( X ) such that
R H ( X | Y ) δ P S n Pr ( X n T n ( P ) ) R H ( X | Y ) .
Now consider a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n(\cdot)\}$ specified as follows.
  • The encoder sends the type of $X^n$ to the decoder, where each type is uniquely represented by a binary sequence of length $\lceil \log_2 |\mathcal{P}_n(\mathcal{X})| \rceil$.
  • For each $P \in \mathcal{S}_n$, the encoder partitions the type class $T^n(P)$ into $L_n$ subsets $T^n(P, 1), T^n(P, 2), \ldots, T^n(P, L_n)$. If $X^n \in T^n(P)$ for some $P \in \mathcal{S}_n$, the encoder finds the subset $T^n(P, i^*)$ that contains $X^n$ and sends the index $i^*$ to the decoder, where each index in $\{1, 2, \ldots, L_n\}$ is uniquely represented by a binary sequence of length $\lceil \log_2 L_n \rceil$.
  • The remaining type classes are left uncoded.
Specifically, we let
$$L_n = \Big\lceil 2 (n+1)^{|\mathcal{X}|^2} e^{n(H(X|Y) + \delta)} \Big\rceil.$$
It follows from ([8], Theorem 2) that for each $P \in \mathcal{S}_n$, it is possible to partition the type class $T^n(P)$ into $L_n$ disjoint subsets $T^n(P, 1), T^n(P, 2), \ldots, T^n(P, L_n)$ so that
$$\frac{1}{n} \log \Pr(\hat{X}^n \neq X^n | X^n \in T^n(P)) \leq -\min_{Q_X \in \mathcal{U}(\epsilon)} E_{rc}(Q_X, P_{Y|X}, H(Q_X) - H(X|Y) - \delta) + \epsilon$$
uniformly for all $P \in \mathcal{S}_n$ when $n$ is sufficiently large. In view of the fact that $E_{rc}(P_X, P_{Y|X}, I(P_X, P_{Y|X}) - \delta) > 0$ and that $E_{rc}(Q_X, P_{Y|X}, R)$ as a function of the pair $(Q_X, R)$ is uniformly continuous, we have
$$\min_{Q_X \in \mathcal{U}(\epsilon)} E_{rc}(Q_X, P_{Y|X}, H(Q_X) - H(X|Y) - \delta) - \epsilon \geq \kappa_1 > 0$$
for sufficiently small ϵ .
For this sequence of constructed variable-rate Slepian-Wolf codes $\{\varphi_n(\cdot)\}$, it can be readily verified that
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) = \limsup_{n\to\infty} \frac{1}{n \log_2 e} \Big[\lceil \log_2 |\mathcal{P}_n(\mathcal{X})| \rceil + \sum_{P \in \mathcal{S}_n} \Pr\{X^n \in T^n(P)\}\, \lceil \log_2 L_n \rceil\Big] \leq \frac{R}{H(X|Y)}\, (H(X|Y) + \delta)$$
and
$$\limsup_{n\to\infty} P_c(\varphi_n, P_{XY}) \geq \limsup_{n\to\infty} \sum_{P \in \mathcal{S}_n} \Pr\{X^n \in T^n(P)\} \big(1 - \Pr\{\hat{X}^n \neq X^n | X^n \in T^n(P)\}\big) \geq \limsup_{n\to\infty} \sum_{P \in \mathcal{S}_n} \Pr\{X^n \in T^n(P)\} (1 - e^{-n\kappa_1}) \geq \frac{R}{H(X|Y)} - \delta.$$
Since $\delta > 0$ is arbitrary, it follows from Definition 6 that $P_{c,v}^{\max}(P_{XY}, R) \geq \frac{R}{H(X|Y)}$.
Now we proceed to prove the other direction. It follows from Definition 6 that for any $\delta > 0$, there exists a sequence of variable-rate Slepian-Wolf codes $\{\varphi_n(\cdot)\}$ with
$$\limsup_{n\to\infty} R(\varphi_n, P_{XY}) \leq R + \delta, \quad \limsup_{n\to\infty} P_c(\varphi_n, P_{XY}) \geq P_{c,v}^{\max}(P_{XY}, R) - \delta.$$
Recall the definition of $T^n(P, 1), \ldots, T^n(P, N_n(P))$ as well as $r(T^n(P))$ in the proof of Theorem 1. For $P \in \mathcal{P}_n(\mathcal{X})$, define
$$\mathcal{I}_n(P, \delta) = \Big\{i: \frac{1}{n} \log \frac{|T^n(P)|}{|T^n(P, i)|} \leq H(X|Y) - \delta,\ i = 1, 2, \ldots, N_n(P)\Big\}, \quad \mathcal{I}_n^c(P, \delta) = \Big\{i: \frac{1}{n} \log \frac{|T^n(P)|}{|T^n(P, i)|} > H(X|Y) - \delta,\ i = 1, 2, \ldots, N_n(P)\Big\}.$$
Note that
$$\sum_{i \in \mathcal{I}_n(P, \delta)} \frac{|T^n(P, i)|}{|T^n(P)|} \geq 1 - \frac{r(T^n(P))}{H(X|Y) - \delta}$$
since
$$r(T^n(P)) \geq \frac{1}{n} \sum_{i=1}^{N_n(P)} \frac{|T^n(P, i)|}{|T^n(P)|} \log \frac{|T^n(P)|}{|T^n(P, i)|} \quad (50) \geq \frac{1}{n} \sum_{i \in \mathcal{I}_n^c(P, \delta)} \frac{|T^n(P, i)|}{|T^n(P)|} \log \frac{|T^n(P)|}{|T^n(P, i)|} \geq (H(X|Y) - \delta) \sum_{i \in \mathcal{I}_n^c(P, \delta)} \frac{|T^n(P, i)|}{|T^n(P)|},$$
where (50) is due to (33).
Each $T^n(P, i)$ can be viewed as a constant composition code of type $P$, and we have
$$\Pr\{\hat{X}^n = X^n | X^n \in T^n(P, i)\} = P_c(T^n(P, i), P_{Y|X}).$$
Note that for $P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})$ and $i \in \mathcal{I}_n(P, \delta)$,
$$\frac{1}{n} \log |T^n(P, i)| \geq \frac{1}{n} \log |T^n(P)| - H(X|Y) + \delta \geq H(P) - H(X|Y) + \delta - |\mathcal{X}| \frac{\log(n+1)}{n}.$$
Therefore, it follows from ([9], Lemma 5) that
$$\frac{1}{n} \log P_c(T^n(P, i), P_{Y|X}) \leq -\min_{Q_X \in \mathcal{U}(\epsilon)} E_c(Q_X, P_{Y|X}, H(Q_X) - H(X|Y) + \delta - \epsilon) + \epsilon$$
uniformly for all $P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})$ and $i \in \mathcal{I}_n(P, \delta)$ when $n$ is sufficiently large. In view of the fact that $E_c(P_X, P_{Y|X}, I(P_X, P_{Y|X}) + \delta) > 0$ and that $E_c(Q_X, P_{Y|X}, R)$ as a function of the pair $(Q_X, R)$ is uniformly continuous, we have
$$\min_{Q_X \in \mathcal{U}(\epsilon)} E_c(Q_X, P_{Y|X}, H(Q_X) - H(X|Y) + \delta - \epsilon) - \epsilon \geq \kappa_2 > 0$$
for sufficiently small ϵ .
Now it is easy to see that
$$\begin{aligned} \liminf_{n\to\infty} P_e(\varphi_n, P_{XY}) &\geq \liminf_{n\to\infty} \sum_{P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\} \sum_{i \in \mathcal{I}_n(P, \delta)} \frac{|T^n(P, i)|}{|T^n(P)|} \big(1 - \Pr\{\hat{X}^n = X^n | X^n \in T^n(P, i)\}\big) \\ &\geq \liminf_{n\to\infty} \sum_{P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\} \sum_{i \in \mathcal{I}_n(P, \delta)} \frac{|T^n(P, i)|}{|T^n(P)|} (1 - e^{-n\kappa_2}) \\ &\geq \liminf_{n\to\infty} \sum_{P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\} \Big(1 - \frac{r(T^n(P))}{H(X|Y) - \delta}\Big) (1 - e^{-n\kappa_2}) \\ &\geq \liminf_{n\to\infty} \sum_{P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\} (1 - e^{-n\kappa_2}) - \limsup_{n\to\infty} \sum_{P \in \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\}\, \frac{r(T^n(P))}{H(X|Y) - \delta}\, (1 - e^{-n\kappa_2}) \\ &= \liminf_{n\to\infty} \sum_{P \in \mathcal{U}(\epsilon) \cap \mathcal{P}_n(\mathcal{X})} \Pr\{X^n \in T^n(P)\} (1 - e^{-n\kappa_2}) - \limsup_{n\to\infty} \frac{R(\varphi_n, P_{XY})}{H(X|Y) - \delta}\, (1 - e^{-n\kappa_2}) \\ &\geq 1 - \frac{R + \delta}{H(X|Y) - \delta}, \end{aligned}$$
which implies
$$\limsup_{n\to\infty} P_c(\varphi_n, P_{XY}) \leq \frac{R + \delta}{H(X|Y) - \delta}.$$
Therefore, we have
$$P_{c,v}^{\max}(P_{XY}, R) - \delta \leq \frac{R + \delta}{H(X|Y) - \delta}.$$
Since δ > 0 is arbitrary, this completes the proof. ☐

5. Example

Consider the joint distribution $P_{XY}$ over $\mathbb{Z}_2 \times \mathbb{Z}_2$ with $P_{X|Y}(1|0) = P_{X|Y}(0|1) = p$ and $P_Y(0) = \tau$. We assume $p \in (0, \frac{1}{2})$, $\tau \in (0, \frac{1}{2}]$. It is easy to compute that
$$P_X(0) = 1 - P_X(1) = \tau(1-p) + (1-\tau)p, \quad P_{Y|X}(1|0) = 1 - P_{Y|X}(0|0) = \frac{(1-\tau)p}{\tau(1-p) + (1-\tau)p}, \quad P_{Y|X}(0|1) = 1 - P_{Y|X}(1|1) = \frac{\tau p}{\tau p + (1-\tau)(1-p)}.$$
For this joint distribution, we have $H(X|Y) = H_b(p)$, where $H_b(\cdot)$ is the binary entropy function (i.e., $H_b(p) = -p \log p - (1-p)\log(1-p)$). Given $R \in [0, \log 2]$, let $q$ be the unique number satisfying $H_b(q) = R$ and $q \leq \frac{1}{2}$. It can be verified that
$$E_{f,sp}(P_{XY}, R) = D(q \| p), \quad R \in [H_b(p), \log 2], \qquad E_f^c(P_{XY}, R) = D(q \| p), \quad R \in [0, H_b(p)].$$
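Both closed forms reduce to the binary divergence $D(q \| p)$ with $H_b(q) = R$; the sketch below (our own helper names) inverts the binary entropy function by bisection and evaluates the exponent:

```python
import math

def Hb(q):
    return 0.0 if q in (0.0, 1.0) else -q * math.log(q) - (1 - q) * math.log(1 - q)

def Hb_inv(R):
    """Unique q <= 1/2 with Hb(q) = R, found by bisection (Hb increases on [0, 1/2])."""
    lo, hi = 0.0, 0.5
    for _ in range(80):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if Hb(mid) < R else (lo, mid)
    return (lo + hi) / 2

def D(q, p):
    """Binary divergence D(q || p), natural log."""
    out = 0.0
    if q > 0: out += q * math.log(q / p)
    if q < 1: out += (1 - q) * math.log((1 - q) / (1 - p))
    return out

p = 0.1   # our example value; D(q||p) is E_f^c below H_b(p) and E_f,sp above it
for R in (0.1, 0.3, Hb(p), 0.6):
    q = Hb_inv(R)
    print(f"R={R:.3f}  q={q:.4f}  D(q||p)={D(q, p):.4f}")
```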
Note that
$$E_{ex}(Q_X, P_{Y|X}, 0) = -\sum_{x, x'} Q_X(x) Q_X(x') \log \sum_y \sqrt{P_{Y|X}(y|x) P_{Y|X}(y|x')} = -2 Q_X(0) Q_X(1) \log \sum_y \sqrt{P_{Y|X}(y|0) P_{Y|X}(y|1)},$$
which is a concave function of $Q_X$. Therefore,
$$E_{ex}^*(P_X, P_{Y|X}, 0) = E_{ex}(P_X, P_{Y|X}, 0).$$
Moreover, we have
$$E_{ex}(P_{Y|X}, 0) = \max_{Q_X} E_{ex}(Q_X, P_{Y|X}, 0) = -\frac{1}{2} \log \sum_y \sqrt{P_{Y|X}(y|0) P_{Y|X}(y|1)}.$$
It is easy to show that
$$E_{v,sp}(P_{XY}, H(P_X)) = E_{sp}(P_X, P_{Y|X}, 0) = \min_{Q_Y} \sum_x P_X(x) \sum_y Q_Y(y) \log \frac{Q_Y(y)}{P_{Y|X}(y|x)},$$
where the minimizer $Q_Y^*$ is given by
$$Q_Y^*(y) = \frac{\prod_x P_{Y|X}(y|x)^{P_X(x)}}{\sum_{y'} \prod_x P_{Y|X}(y'|x)^{P_X(x)}}, \quad y \in \mathcal{Y}.$$
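The closed-form minimizer can be verified numerically by comparing the objective at $Q_Y^*$ against a brute-force grid search over $Q_Y$ (a sketch with our own example parameters):

```python
import numpy as np

p, tau = 0.1, 0.4
Px = np.array([tau*(1-p) + (1-tau)*p, tau*p + (1-tau)*(1-p)])
Pygx = np.array([[1 - (1-tau)*p / Px[0], (1-tau)*p / Px[0]],
                 [tau*p / Px[1], 1 - tau*p / Px[1]]])   # P_{Y|X}(y|x)

def objective(Qy):
    # sum_x P_X(x) sum_y Q(y) log(Q(y) / P_{Y|X}(y|x))
    return np.sum(Px[:, None] * Qy[None, :] * (np.log(Qy)[None, :] - np.log(Pygx)))

# closed-form minimizer: normalized geometric mean of the channel rows
g = np.exp(Px @ np.log(Pygx))          # prod_x P(y|x)^{P_X(x)} for each y
Qstar = g / g.sum()

grid = np.linspace(1e-6, 1 - 1e-6, 10001)
best = min(objective(np.array([a, 1 - a])) for a in grid)
print(objective(Qstar), best)          # the two values should agree closely
```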
Define
$$E_{f,er}(P_{XY}, R) = \max\{E_{f,ex}(P_{XY}, R), E_{f,rc}(P_{XY}, R)\}, \quad E_{v,er}(P_{XY}, R) = \max\{E_{v,ex}(P_{XY}, R), E_{v,rc}(P_{XY}, R)\}.$$
We have
$$E_f(P_{XY}, R) \geq E_{f,er}(P_{XY}, R), \quad E_v(P_{XY}, R) \geq E_{v,er}(P_{XY}, R).$$
It can be seen from Figure 2 that the achievable error exponent $E_{v,er}(P_{XY}, R)$ of variable-rate Slepian-Wolf coding can completely dominate the sphere packing exponent $E_{f,sp}(P_{XY}, R)$ of fixed-rate Slepian-Wolf coding. The gain of variable-rate coding gradually diminishes as $\tau \to \frac{1}{2}$ (see Figure 3 and Figure 4).

6. Concluding Remarks

We have studied the reliability function of variable-rate Slepian-Wolf coding. An intimate connection between variable-rate Slepian-Wolf codes and constant composition codes has been revealed. It is shown that variable-rate Slepian-Wolf coding can outperform fixed-rate Slepian-Wolf coding in terms of rate-error tradeoff. Finally, we would like to mention that Theorem 1 has been generalized by Weinberger and Merhav in their recent paper on the optimal tradeoff between the error exponent and the excess-rate exponent of variable-rate Slepian-Wolf coding [19].

Acknowledgments

Jun Chen was supported in part by the Natural Sciences and Engineering Research Council of Canada through a Discovery Grant.

Author Contributions

All the authors contributed to the problem formulation and the proof of Theorem 2; Jun Chen established the remaining results and wrote the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Proposition 1

In view of (7) and (11), we have $R_{cr}(Q_X, W_{Y|X}) = I(Q_X, W_{Y|X})$ if and only if the minimum of the convex optimization problem
$$\min_{V_{Y|X}} \; D(V_{Y|X} \| W_{Y|X} | Q_X) + I(Q_X, V_{Y|X}) \tag{A1}$$
is achieved at $V_{Y|X} = W_{Y|X}$. Let $V_{Y|X}^*$ be a minimizer of the above optimization problem. Note that for $x, y$ such that $Q_X(x) W_{Y|X}(y|x) = 0$, there is no loss of generality in setting $V_{Y|X}^*(y|x) = W_{Y|X}(y|x)$. Let $A = \{x \in \mathcal{X} : Q_X(x) > 0\}$ and $B_x = \{y \in \mathcal{Y} : W_{Y|X}(y|x) > 0\}$ for $x \in A$. We can rewrite (A1) in the following equivalent form:
$$\min_{V_{Y|X}(y|x):\, x \in A,\, y \in B_x} \sum_{x \in A,\, y \in B_x} Q_X(x) V_{Y|X}(y|x) \log \frac{V_{Y|X}^2(y|x)}{W_{Y|X}(y|x) \sum_{x' \in A} Q_X(x') V_{Y|X}(y|x')}$$
subject to
$$V_{Y|X}(y|x) \geq 0 \ \text{ for all } x \in A,\, y \in B_x, \qquad \sum_{y \in B_x} V_{Y|X}(y|x) = 1 \ \text{ for all } x \in A.$$
Define
$$G = \sum_{x \in A,\, y \in B_x} Q_X(x) V_{Y|X}(y|x) \log \frac{V_{Y|X}^2(y|x)}{W_{Y|X}(y|x) \sum_{x' \in A} Q_X(x') V_{Y|X}(y|x')} - \sum_{x \in A,\, y \in B_x} \alpha(x, y) V_{Y|X}(y|x) - \sum_{x \in A,\, y \in B_x} \beta(x) V_{Y|X}(y|x),$$
where $\alpha(x, y) \in \mathbb{R}_+$ ($x \in A$, $y \in B_x$) and $\beta(x) \in \mathbb{R}$ ($x \in A$). The Karush-Kuhn-Tucker conditions yield
$$
\begin{aligned}
\frac{\partial G}{\partial V_{Y|X}(y^*|x^*)}\bigg|_{V_{Y|X} = V_{Y|X}^*} = \;& 2 Q_X(x^*) \log V_{Y|X}^*(y^*|x^*) + Q_X(x^*) - Q_X(x^*) \log W_{Y|X}(y^*|x^*) \\
& - Q_X(x^*) \log \sum_{x \in A} Q_X(x) V_{Y|X}^*(y^*|x) - \alpha(x^*, y^*) - \beta(x^*) = 0 \quad \text{for all } x^* \in A,\ y^* \in B_{x^*}, \\
& V_{Y|X}^*(y^*|x^*) \geq 0 \quad \text{for all } x^* \in A,\ y^* \in B_{x^*}, \\
& \sum_{y^* \in B_{x^*}} V_{Y|X}^*(y^*|x^*) = 1 \quad \text{for all } x^* \in A, \\
& \alpha(x^*, y^*) V_{Y|X}^*(y^*|x^*) = 0 \quad \text{for all } x^* \in A,\ y^* \in B_{x^*}.
\end{aligned}
$$
By the complementary slackness conditions (i.e., $V_{Y|X}^*(y^*|x^*) > 0 \Rightarrow \alpha(x^*, y^*) = 0$), we have $V_{Y|X}^* = W_{Y|X}$ if and only if for all $x^* \in A$, $y^* \in B_{x^*}$,
$$Q_X(x^*) \log W_{Y|X}(y^*|x^*) + Q_X(x^*) - Q_X(x^*) \log \sum_{x \in A} Q_X(x) W_{Y|X}(y^*|x) - \beta(x^*) = 0,$$
i.e., the value of
$$\frac{W_{Y|X}(y|x)}{\sum_{x'} Q_X(x') W_{Y|X}(y|x')}$$
does not depend on y for all $x, y$ such that $Q_X(x) W_{Y|X}(y|x) > 0$.
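To see this condition in action, one can solve the convex program (A1) numerically and check whether the minimizer coincides with $W_{Y|X}$. The following Python sketch (our illustration; the channel and input distribution are arbitrary assumed values) does this by grid search for binary alphabets:

```python
import math

# Our illustration: check whether the minimizer of D(V||W|Q) + I(Q,V) is W
# itself (equivalently, whether R_cr(Q, W) = I(Q, W)) for an assumed pair.
Q = [0.4, 0.6]
W = [[0.8, 0.2], [0.3, 0.7]]  # W[x][y] = W(y|x)

def objective(a, b):
    """D(V||W|Q) + I(Q,V) with V(1|0) = a, V(1|1) = b."""
    V = [[1 - a, a], [1 - b, b]]
    out = [sum(Q[x] * V[x][y] for x in (0, 1)) for y in (0, 1)]  # (QV)(y)
    return sum(Q[x] * V[x][y] * math.log(V[x][y] ** 2 / (W[x][y] * out[y]))
               for x in (0, 1) for y in (0, 1))

n = 500
best_val, best_ab = float("inf"), None
for i in range(1, n):
    for j in range(1, n):
        val = objective(i / n, j / n)
        if val < best_val:
            best_val, best_ab = val, (i / n, j / n)
print(f"grid minimizer V(1|0), V(1|1) = {best_ab}, value = {best_val:.5f}")
print(f"value at V = W: {objective(W[0][1], W[1][1]):.5f}")
```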

Appendix B. Proof of Proposition 4

• It is known ([7], Exercise 5.17) that for every $R > 0$, $\delta > 0$, and $P \in \mathcal{P}_n(\mathcal{X})$ there exists a constant composition code $C_n \subseteq T^n(P)$ such that
    $$R(C_n) \geq R - \delta, \qquad -\frac{1}{n} \log P_{e,\max}(C_n, W_{Y|X}) \geq E_{ex}(P, W_{Y|X}, R) - \delta$$
    whenever $n \geq n_0(|\mathcal{X}|, |\mathcal{Y}|, \delta)$. Let $\{P_n\}$ be a sequence of types with $P_n \in \mathcal{P}_n(\mathcal{X})$ and
    $$\lim_{n \to \infty} \|P_n - Q_X\| = 0.$$
    Define
    $$V_n^* = \arg\min_{V_n} \left[ \sum_{x, \tilde{x}} P_n(x) V_n(\tilde{x}|x) d_{W_{Y|X}}(x, \tilde{x}) + I(P_n, V_n) - R \right],$$
    where the minimization is over $V_n : \mathcal{X} \to \mathcal{X}$ subject to the constraints
    $$\sum_x P_n(x) V_n(\tilde{x}|x) = P_n(\tilde{x}) \ \text{ for all } \tilde{x} \in \mathcal{X}, \qquad I(P_n, V_n) \leq R.$$
    Note that $\{V_n^*\}$ must contain a convergent subsequence $\{V_{n_k}^*\}$. Define
    $$V^* = \lim_{k \to \infty} V_{n_k}^*.$$
    It is easy to verify that
    $$\sum_{x \in \mathcal{X}} Q_X(x) V^*(\tilde{x}|x) = \lim_{k \to \infty} \sum_{x \in \mathcal{X}} P_{n_k}(x) V_{n_k}^*(\tilde{x}|x) = \lim_{k \to \infty} P_{n_k}(\tilde{x}) = Q_X(\tilde{x}) \ \text{ for all } \tilde{x} \in \mathcal{X}, \qquad I(Q_X, V^*) = \lim_{k \to \infty} I(P_{n_k}, V_{n_k}^*) \leq R.$$
    Therefore, we have
    $$\limsup_{n \to \infty} E_{ex}(P_n, W_{Y|X}, R) \geq \limsup_{k \to \infty} E_{ex}(P_{n_k}, W_{Y|X}, R) = \limsup_{k \to \infty} \left[ \sum_{x, \tilde{x} \in \mathcal{X}} P_{n_k}(x) V_{n_k}^*(\tilde{x}|x) d_{W_{Y|X}}(x, \tilde{x}) + I(P_{n_k}, V_{n_k}^*) - R \right] = \sum_{x, \tilde{x} \in \mathcal{X}} Q_X(x) V^*(\tilde{x}|x) d_{W_{Y|X}}(x, \tilde{x}) + I(Q_X, V^*) - R \geq E_{ex}(Q_X, W_{Y|X}, R).$$
    It is also known ([7], Theorem 5.2) that for every $R > 0$, $\delta > 0$, and $P \in \mathcal{P}_n(\mathcal{X})$ there exists a constant composition code $C_n \subseteq T^n(P)$ such that
    $$R(C_n) \geq R - \delta, \qquad -\frac{1}{n} \log P_{e,\max}(C_n, W_{Y|X}) \geq E_{rc}(P, W_{Y|X}, R) - \delta$$
    whenever $n \geq n_0(|\mathcal{X}|, |\mathcal{Y}|, \delta)$. So it can be readily shown that
    $$E(Q_X, W_{Y|X}, R) \geq E_{rc}(Q_X, W_{Y|X}, R)$$
    by invoking the fact that $E_{rc}(P, W_{Y|X}, R)$ as a function of the pair $(P, R)$ is uniformly equicontinuous ([7], Lemma 5.5). The proof is complete.
• By Definition 2, for every $R > 0$, $\delta > 0$ there exists a sequence of block channel codes $\{C_n\}$ with $C_n \subseteq T^n(P_n)$ for some $P_n \in \mathcal{P}_n(\mathcal{X})$ such that
    $$\lim_{n \to \infty} \|P_n - Q_X\| = 0, \qquad \liminf_{n \to \infty} R(C_n) \geq R - \delta, \qquad \liminf_{n \to \infty} -\frac{1}{n} \log P_{e,\max}(C_n, W_{Y|X}) \geq E(Q_X, W_{Y|X}, R) - \delta. \tag{A2}$$
    For simplicity, we assume $R(C_n) \geq R - \delta$ for all n. Now it follows from ([7], Theorem 5.3) that
    $$\frac{1}{n} \log \frac{2}{P_{e,\max}(C_n, W_{Y|X})} \leq E_{sp}(P_n, W_{Y|X}, R - 2\delta)(1 + \delta) \tag{A3}$$
    whenever n n 0 ( | X | , | Y | , δ ) . Let
    $$V_{Y|X}^* = \arg\min_{V_{Y|X}:\, I(Q_X, V_{Y|X}) \leq R - 3\delta} D(V_{Y|X} \| W_{Y|X} | Q_X).$$
    Without loss of generality, we can set $V_{Y|X}^*(\cdot|x) = W_{Y|X}(\cdot|x)$ for all $x \in \{x \in \mathcal{X} : Q_X(x) = 0\}$. It is easy to see that there exists an $\epsilon > 0$ such that
    $$I(P, V_{Y|X}^*) \leq R - 2\delta, \qquad D(V_{Y|X}^* \| W_{Y|X} | P) \leq D(V_{Y|X}^* \| W_{Y|X} | Q_X) + \delta$$
    for all $P \in \mathcal{P}(\mathcal{X})$ with $\|P - Q_X\| \leq \epsilon$. Therefore, for all sufficiently large n,
    $$E_{sp}(P_n, W_{Y|X}, R - 2\delta) = \min_{V_{Y|X}:\, I(P_n, V_{Y|X}) \leq R - 2\delta} D(V_{Y|X} \| W_{Y|X} | P_n) \leq D(V_{Y|X}^* \| W_{Y|X} | P_n) \leq D(V_{Y|X}^* \| W_{Y|X} | Q_X) + \delta = E_{sp}(Q_X, W_{Y|X}, R - 3\delta) + \delta. \tag{A4}$$
    Combining (A2)–(A4), we get
    $$E(Q_X, W_{Y|X}, R) - \delta \leq [E_{sp}(Q_X, W_{Y|X}, R - 3\delta) + \delta](1 + \delta).$$
    In view of the fact that $\delta > 0$ is arbitrary and that for fixed P and $W_{Y|X}$, $E_{sp}(P, W_{Y|X}, R)$ is a decreasing continuous convex function of R in the interval where it is finite ([7], Lemma 5.4), the proof is complete.
• It is known ([9], Lemma 5) that every constant composition code $C_n$ of common type P for some $P \in \mathcal{P}_n(\mathcal{X})$ and rate $R(C_n) \geq R + \delta$ (with $R > 0$ and $\delta > 0$) satisfies
    $$-\frac{1}{n} \log P_c(C_n, W_{Y|X}) \geq \min_{V_{Y|X}} \left[ D(V_{Y|X} \| W_{Y|X} | P) + |R - I(P, V_{Y|X})|^+ \right] - \delta$$
    whenever $n \geq n_0(|\mathcal{X}|, |\mathcal{Y}|, \delta)$. Moreover, it is also known ([9], Lemma 2; [7], Exercise 5.16) that for every $R > 0$, $\delta > 0$, and $P \in \mathcal{P}_n(\mathcal{X})$ there exists a constant composition code $C_n \subseteq T^n(P)$ such that
    $$R(C_n) \geq R - \delta, \qquad -\frac{1}{n} \log P_c(C_n, W_{Y|X}) \leq \min_{V_{Y|X}} \left[ D(V_{Y|X} \| W_{Y|X} | P) + |R - I(P, V_{Y|X})|^+ \right] + \delta$$
    whenever $n \geq n_0(|\mathcal{X}|, |\mathcal{Y}|, \delta)$. In view of the fact that $\min_{V_{Y|X}} [D(V_{Y|X} \| W_{Y|X} | P) + |R - I(P, V_{Y|X})|^+]$ as a function of the pair $(P, R)$ is uniformly equicontinuous, it can be readily shown that
    $$E_c(Q_X, W_{Y|X}, R) = \min_{V_{Y|X}} \left[ D(V_{Y|X} \| W_{Y|X} | Q_X) + |R - I(Q_X, V_{Y|X})|^+ \right].$$
    The proof is complete.
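To make the correct-decoding exponent concrete, the following sketch (our illustration; the channel, input type, and rate are assumed values) evaluates $\min_{V_{Y|X}}[D(V_{Y|X}\|W_{Y|X}|P) + |R - I(P, V_{Y|X})|^+]$ for a binary symmetric channel by grid search:

```python
import math

# Our illustration: E_c(P, W, R) = min_V [ D(V||W|P) + |R - I(P,V)|^+ ]
# evaluated by grid search over V(1|0) = a, V(1|1) = b.
P = [0.5, 0.5]
W = [[0.95, 0.05], [0.05, 0.95]]  # an assumed BSC(0.05); W[x][y]
R = 0.6                           # nats; above this channel's capacity

def D_cond(V):
    """Conditional divergence D(V||W|P)."""
    return sum(P[x] * V[x][y] * math.log(V[x][y] / W[x][y])
               for x in (0, 1) for y in (0, 1))

def I_PV(V):
    """Mutual information I(P, V)."""
    out = [sum(P[x] * V[x][y] for x in (0, 1)) for y in (0, 1)]
    return sum(P[x] * V[x][y] * math.log(V[x][y] / out[y])
               for x in (0, 1) for y in (0, 1))

n, best = 400, float("inf")
for i in range(1, n):
    for j in range(1, n):
        V = [[1 - i / n, i / n], [1 - j / n, j / n]]
        best = min(best, D_cond(V) + max(R - I_PV(V), 0.0))
print(f"E_c(P, W, R) ~ {best:.5f} nats")
```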

References

  1. Slepian, D.; Wolf, J.K. Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory 1973, 19, 471–480. [Google Scholar] [CrossRef]
  2. Csiszár, I. Linear codes for sources and source networks: Error exponents, universal coding. IEEE Trans. Inf. Theory 1982, 28, 585–592. [Google Scholar] [CrossRef]
  3. Pradhan, S.S.; Ramchandran, K. Distributed source coding using syndromes (DISCUS): Design and construction. IEEE Trans. Inf. Theory 2003, 49, 626–643. [Google Scholar] [CrossRef]
  4. Chen, J.; He, D.-K.; Jagmohan, A. The equivalence between Slepian-Wolf coding and channel coding under density evolution. IEEE Trans. Commun. 2009, 57, 2534–2540. [Google Scholar] [CrossRef]
  5. Sun, Z.; Tian, C.; Chen, J.; Wong, K.M. LDPC code design for asynchronous Slepian-Wolf coding. IEEE Trans. Commun. 2010, 58, 511–520. [Google Scholar] [CrossRef]
  6. Oohama, Y.; Han, T.S. Universal coding for the Slepian-Wolf data compression system and the strong converse theorem. IEEE Trans. Inf. Theory 1994, 40, 1908–1919. [Google Scholar] [CrossRef]
  7. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic: New York, NY, USA, 1981. [Google Scholar]
  8. Csiszár, I.; Körner, J. Graph decomposition: A new key to coding theorems. IEEE Trans. Inf. Theory 1981, 27, 5–12. [Google Scholar] [CrossRef]
  9. Dueck, G.; Körner, J. Reliability function of a discrete memoryless channel at rates above capacity. IEEE Trans. Inf. Theory 1979, 25, 82–85. [Google Scholar] [CrossRef]
  10. Gallager, R.G. Information Theory and Reliable Communication; Wiley: New York, NY, USA, 1968. [Google Scholar]
  11. Gallager, R.G. Source Coding with Side Information and Universal Coding, MIT LIDS Technical Report (LIDS-P-937). Unpublished work. 1976.
  12. Chen, J.; He, D.-K.; Jagmohan, A.; Lastras-Montaño, L. On the redundancy-error tradeoff in Slepian-Wolf coding and channel coding. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 1326–1330. [Google Scholar]
  13. Csiszár, I.; Körner, J. Towards a general theory of source networks. IEEE Trans. Inf. Theory 1980, 26, 155–165. [Google Scholar] [CrossRef]
  14. Chen, J.; He, D.-K.; Jagmohan, A.; Lastras-Montaño, L.A.; Yang, E.-H. On the linear codebook-level duality between Slepian-Wolf coding and channel coding. IEEE Trans. Inf. Theory 2009, 55, 5575–5590. [Google Scholar] [CrossRef]
  15. Ahlswede, R.; Dueck, G. Good codes can be produced by a few permutations. IEEE Trans. Inf. Theory 1982, 28, 430–443. [Google Scholar] [CrossRef]
  16. Chen, J.; He, D.-K.; Jagmohan, A. On the duality between Slepian-Wolf coding and channel coding under mismatched decoding. IEEE Trans. Inf. Theory 2009, 55, 4006–4018. [Google Scholar] [CrossRef]
  17. Ahlswede, R. Coloring hypergraphs: A new approach to multi-user source coding—II. J. Comb. Inf. Syst. Sci. 1980, 5, 220–268. [Google Scholar]
  18. He, D.-K.; Lastras-Montaño, L.; Yang, E.-H.; Jagmohan, A.; Chen, J. On the redundancy of Slepian-Wolf coding. IEEE Trans. Inf. Theory 2009, 55, 5607–5627. [Google Scholar] [CrossRef]
  19. Weinberger, N.; Merhav, N. Optimum tradeoffs between the error exponent and the excess-rate exponent of variable-rate Slepian-Wolf coding. IEEE Trans. Inf. Theory 2015, 61, 2165–2190. [Google Scholar] [CrossRef]
Figure 1. Slepian-Wolf coding.
Figure 2. p = 0.05, τ = 0.12.
Figure 3. p = 0.05, τ = 0.35.
Figure 4. p = 0.05, τ = 0.50.
