Article

Guessing with Distributed Encoders

Annina Bracher, Amos Lapidoth and Christoph Pfister
1 P&C Solutions, Swiss Re, 8022 Zurich, Switzerland
2 Signal and Information Processing Laboratory, ETH Zurich, 8092 Zurich, Switzerland
* Author to whom correspondence should be addressed.
Entropy 2019, 21(3), 298; https://doi.org/10.3390/e21030298
Submission received: 10 December 2018 / Revised: 4 March 2019 / Accepted: 14 March 2019 / Published: 19 March 2019

Abstract
Two correlated sources emit a pair of sequences, each of which is observed by a different encoder. Each encoder produces a rate-limited description of the sequence it observes, and the two descriptions are presented to a guessing device that repeatedly produces sequence pairs until correct. The number of guesses until correct is random, and it is required that it have a moment (of some prespecified order) that tends to one as the length of the sequences tends to infinity. The description rate pairs that allow this are characterized in terms of the Rényi entropy and the Arimoto–Rényi conditional entropy of the joint law of the sources. This solves the guessing analog of the Slepian–Wolf distributed source-coding problem. The achievability is based on random binning, which is analyzed using a technique by Rosenthal.


1. Introduction

In the Massey–Arıkan guessing problem [1,2], a random variable $X$ is drawn from a finite set $\mathcal{X}$ according to some probability mass function (PMF) $P_X$, and it has to be determined by making guesses of the form "Is $X$ equal to $x$?" until the guess is correct. The guessing order is determined by a guessing function $G$, which is a bijective function from $\mathcal{X}$ to $\{1,\ldots,|\mathcal{X}|\}$. Guessing according to $G$ proceeds as follows: the first guess is the element $\hat{x}_1 \in \mathcal{X}$ satisfying $G(\hat{x}_1) = 1$; the second guess is the element $\hat{x}_2 \in \mathcal{X}$ satisfying $G(\hat{x}_2) = 2$; and so on. Consequently, $G(X)$ is the number of guesses needed to guess $X$. Arıkan [2] showed that for any $\rho > 0$, the $\rho$th moment of the number of guesses required by an optimal guesser $G$ to guess $X$ is bounded by:
$$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho H_{1/(1+\rho)}(X)} \le \mathbb{E}\big[G(X)^{\rho}\big] \le 2^{\rho H_{1/(1+\rho)}(X)},$$ (1)
where $\ln(\cdot)$ denotes the natural logarithm, and $H_{1/(1+\rho)}(X)$ denotes the Rényi entropy of order $\frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (1) were recently derived in [3]).
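As a quick numerical illustration of (1) (this sketch is not part of the paper; the PMF and the choice ρ = 1 are arbitrary), one can compute the ρth moment of the optimal guesser, which guesses in decreasing order of probability, and compare it with both sides of the bound:

```python
import math

def renyi_entropy(pmf, alpha):
    # Rényi entropy of order alpha (in bits), for alpha > 0, alpha != 1.
    return math.log2(sum(p ** alpha for p in pmf)) / (1 - alpha)

def optimal_guessing_moment(pmf, rho):
    # E[G(X)^rho] for the optimal guesser, which guesses in
    # decreasing order of probability (first guess = most likely value).
    ranked = sorted(pmf, reverse=True)
    return sum(p * (k + 1) ** rho for k, p in enumerate(ranked))

pmf = [0.5, 0.25, 0.125, 0.0625, 0.0625]   # arbitrary example PMF
rho = 1.0                                   # arbitrary moment order

h = renyi_entropy(pmf, 1 / (1 + rho))       # H_{1/(1+rho)}(X)
moment = optimal_guessing_moment(pmf, rho)  # E[G(X)^rho]
lower = 2 ** (rho * h) / (1 + math.log(len(pmf))) ** rho
upper = 2 ** (rho * h)
assert lower <= moment <= upper
print(f"{lower:.3f} <= {moment:.3f} <= {upper:.3f}")
```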
Guessing with an encoder is depicted in Figure 1. Here, prior to guessing $X$, the guesser is provided some side information about $X$ in the form of $f(X)$, where $f\colon \mathcal{X} \to \{1,\ldots,M\}$ is a function taking on at most $M$ different values ("labels"). Accordingly, a guessing function $G(\cdot|\cdot)$ is a function from $\mathcal{X}\times\{1,\ldots,M\}$ to $\{1,\ldots,|\mathcal{X}|\}$ such that for every label $m \in \{1,\ldots,M\}$, $G(\cdot|m)\colon \mathcal{X} \to \{1,\ldots,|\mathcal{X}|\}$ is bijective. If, among all encoders, $f$ minimizes the $\rho$th moment of the number of guesses required by an optimal guesser to guess $X$ after observing $f(X)$, then [4] (Corollary 7):
$$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho[H_{1/(1+\rho)}(X)-\log M]} \le \mathbb{E}\big[G(X\,|\,f(X))^{\rho}\big] \le 1 + 2^{\rho[H_{1/(1+\rho)}(X)-\log M+1]}.$$ (2)
Thus, in guessing a sequence of independent and identically distributed (IID) random variables, a description rate of approximately $H_{1/(1+\rho)}(X)$ bits per symbol is needed to drive the $\rho$th moment of the number of guesses to one as the sequence length tends to infinity [4,5] (see Section 2 for more related work).
In this paper, we generalize the single-encoder setting from Figure 1 to the setting with distributed encoders depicted in Figure 2, which is the analog of Slepian–Wolf coding [6] for guessing: a source generates a sequence of pairs $\{(X_i,Y_i)\}_{i=1}^{n}$ over a finite alphabet $\mathcal{X}\times\mathcal{Y}$. The sequence $X^n$ is described by one of $2^{nR_X}$ labels and the sequence $Y^n$ by one of $2^{nR_Y}$ labels using functions:
$$f_n\colon \mathcal{X}^n \to \{1,\ldots,2^{nR_X}\},$$ (3)
$$g_n\colon \mathcal{Y}^n \to \{1,\ldots,2^{nR_Y}\},$$ (4)
where $R_X \ge 0$ and $R_Y \ge 0$. Based on $f_n(X^n)$ and $g_n(Y^n)$, a guesser repeatedly produces guesses of the form $(\hat{x}^n, \hat{y}^n)$ until $(\hat{x}^n, \hat{y}^n) = (X^n, Y^n)$.
For a fixed $\rho > 0$, a rate pair $(R_X, R_Y) \in \mathbb{R}_{\ge 0}^2$ is called achievable if there exists a sequence of encoders and guessing functions $\{(f_n, g_n, G_n)\}_{n=1}^{\infty}$ such that the $\rho$th moment of the number of guesses tends to one as $n$ tends to infinity, i.e.,
$$\lim_{n\to\infty} \mathbb{E}\Big[G_n\big(X^n, Y^n \,\big|\, f_n(X^n), g_n(Y^n)\big)^{\rho}\Big] = 1.$$ (5)
Our main contribution is Theorem 1, which characterizes the achievable rate pairs. For a fixed $\rho > 0$, let the region $\mathcal{R}(\rho)$ comprise all rate pairs $(R_X, R_Y) \in \mathbb{R}_{\ge 0}^2$ satisfying the following inequalities simultaneously:
$$R_X \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(X^n|Y^n)}{n},$$ (6)
$$R_Y \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(Y^n|X^n)}{n},$$ (7)
$$R_X + R_Y \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(X^n,Y^n)}{n},$$ (8)
where the Rényi entropy $H_\alpha(\cdot)$ and the Arimoto–Rényi conditional entropy $H_\alpha(\cdot|\cdot)$ of order $\alpha$ are both defined in Section 3 ahead, and throughout the paper,
$$\tilde{\rho} \triangleq \frac{1}{1+\rho}.$$ (9)
Theorem 1.
For any $\rho > 0$, all rate pairs in the interior of $\mathcal{R}(\rho)$ are achievable, while those outside $\mathcal{R}(\rho)$ are not. If $\{(X_i,Y_i)\}_{i=1}^{\infty}$ are IID according to $P_{XY}$, then (6)–(8) reduce to:
$$R_X \ge H_{\tilde{\rho}}(X|Y),$$ (10)
$$R_Y \ge H_{\tilde{\rho}}(Y|X),$$ (11)
$$R_X + R_Y \ge H_{\tilde{\rho}}(X,Y).$$ (12)
Proof. 
The converse follows from Corollary 1 in Section 4; the achievability follows from Corollary 2 in Section 5; and the reduction of (6)–(8) to (10)–(12) in the IID case follows from (19) and (20) ahead. ☐
The rate region defined by (10)–(12) resembles the rate region of Slepian–Wolf coding [6] (Theorem 15.4.1); the difference is that the Shannon entropy and conditional entropy are replaced by their Rényi counterparts. The rate regions are related as follows:
Remark 1.
For memoryless sources and $\rho > 0$, the region $\mathcal{R}(\rho)$ is contained in the Slepian–Wolf region. Typically, the containment is strict.
Proof. 
The containment follows from the monotonicity of the Arimoto–Rényi conditional entropy: (9) implies that $\tilde{\rho} \in (0,1)$, so, by [7] (Proposition 5), $H_{\tilde{\rho}}(X|Y) \ge H(X|Y)$, $H_{\tilde{\rho}}(Y|X) \ge H(Y|X)$, and $H_{\tilde{\rho}}(X,Y) \ge H(X,Y)$. As for the strict containment, first note that the Slepian–Wolf region contains at least one rate pair $(R_X, R_Y)$ satisfying $R_X + R_Y = H(X,Y)$. Consequently, if $H_{\tilde{\rho}}(X,Y) > H(X,Y)$, then the containment is strict. Because $H_{\tilde{\rho}}(X,Y) > H(X,Y)$ unless $(X,Y)$ is distributed uniformly over its support [8], the containment is typically strict.
The claim can also be shown operationally: the probability of error is equal to the probability that more than one guess is needed, and for every $\rho > 0$,
$$\Pr\big[G_n(X^n,Y^n\,|\,f_n(X^n),g_n(Y^n)) \ge 2\big] = \Pr\big[G_n(X^n,Y^n\,|\,f_n(X^n),g_n(Y^n))^{\rho} - 1 \ge 2^{\rho} - 1\big]$$ (13)
$$\le \frac{\mathbb{E}\big[G_n(X^n,Y^n\,|\,f_n(X^n),g_n(Y^n))^{\rho}\big] - 1}{2^{\rho} - 1},$$ (14)
where (14) follows from Markov's inequality. Thus, the probability of error tends to zero if the $\rho$th moment of the number of guesses tends to one. ☐
Despite the resemblance between (10)–(12) and the Slepian–Wolf region, there is an important difference: while Slepian–Wolf coding allows separate encoding with the same sum rate as with joint encoding, this is not necessarily true in our setting:
Remark 2.
Although the sum rate constraint (12) is the same as in single-source guessing [5], separate encoding of $X^n$ and $Y^n$ may require a larger sum rate than joint encoding of $X^n$ and $Y^n$.
Proof. 
If $H_{\tilde{\rho}}(X|Y) + H_{\tilde{\rho}}(Y|X) > H_{\tilde{\rho}}(X,Y)$, then (10) and (11) together impose a stronger constraint on the sum rate than (12). For example, if:
$$P_{XY}(x,y) = \begin{array}{c|cc} & y=0 & y=1 \\ \hline x=0 & 0.65 & 0.17 \\ x=1 & 0.17 & 0.01 \end{array}$$
and $\rho = 1$, then $H_{1/2}(X|Y) + H_{1/2}(Y|X) \approx 1.61$ bits, so separate (distributed) encoding requires a sum rate exceeding 1.61 bits, as opposed to joint encoding, which is possible with $H_{1/2}(X,Y) \approx 1.58$ bits (in Slepian–Wolf coding, this cannot happen because $H(X,Y) - H(X|Y) - H(Y|X) = I(X;Y) \ge 0$). ☐
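The two values can be reproduced with a few lines of code (a sketch, not from the paper, that hard-codes the PMF above and evaluates the order-1/2 Rényi quantities defined in Section 3):

```python
import math

# P(x, y) from the example above, for x, y in {0, 1}.
P = {(0, 0): 0.65, (0, 1): 0.17, (1, 0): 0.17, (1, 1): 0.01}

# H_{1/2}(X,Y) = 2 * log2( sum over (x,y) of sqrt(P(x,y)) ).
H_joint = 2 * math.log2(sum(math.sqrt(p) for p in P.values()))

# H_{1/2}(X|Y) = log2( sum over y of ( sum over x of sqrt(P(x,y)) )^2 ),
# i.e., the Arimoto-Rényi conditional entropy at order 1/2; H_{1/2}(Y|X) is analogous.
H_X_given_Y = math.log2(sum(
    sum(math.sqrt(P[(x, y)]) for x in (0, 1)) ** 2 for y in (0, 1)))
H_Y_given_X = math.log2(sum(
    sum(math.sqrt(P[(x, y)]) for y in (0, 1)) ** 2 for x in (0, 1)))

print(f"H_1/2(X|Y) + H_1/2(Y|X) = {H_X_given_Y + H_Y_given_X:.2f} bits")  # approx. 1.61
print(f"H_1/2(X,Y)              = {H_joint:.2f} bits")                    # approx. 1.58
```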
The guessing problem is related to the task-encoding problem, where, based on $f_n(X^n)$ and $g_n(Y^n)$, the decoder outputs a list that is guaranteed to contain $(X^n, Y^n)$, and the $\rho$th moment of the list size is required to tend to one as $n$ tends to infinity. While, in the single-source setting, the guessing problem and the task-encoding problem have the same asymptotics [4], this is not the case in the distributed setting:
Remark 3.
For memoryless sources, the task-encoding region from [9] is strictly smaller than the guessing region $\mathcal{R}(\rho)$ unless $X$ and $Y$ are independent.
Proof. 
In the IID case, the task-encoding region is the set of all rate pairs $(R_X, R_Y) \in \mathbb{R}_{\ge 0}^2$ satisfying the following inequalities [9] (Theorem 1):
$$R_X \ge H_{\tilde{\rho}}(X),$$ (15)
$$R_Y \ge H_{\tilde{\rho}}(Y),$$ (16)
$$R_X + R_Y \ge H_{\tilde{\rho}}(X,Y) + K_{\tilde{\rho}}(X;Y),$$ (17)
where $K_\alpha(X;Y)$ is a Rényi measure of dependence studied in [10] (when $\alpha$ is one, $K_\alpha(X;Y)$ is the mutual information). The claim now follows from the following observations: by [7] (Theorem 2), $H_{\tilde{\rho}}(X) \ge H_{\tilde{\rho}}(X|Y)$ with equality if and only if $X$ and $Y$ are independent; similarly, $H_{\tilde{\rho}}(Y) \ge H_{\tilde{\rho}}(Y|X)$ with equality if and only if $X$ and $Y$ are independent; and by [10] (Theorem 2), $K_{\tilde{\rho}}(X;Y) \ge 0$ with equality if and only if $X$ and $Y$ are independent. ☐
The rest of this paper is structured as follows: in Section 2, we review other guessing settings; in Section 3, we recall the Rényi information measures and prove some auxiliary lemmas; in Section 4, we prove the converse theorem; and in Section 5, we prove the achievability theorem, which is based on random binning and, in the case ρ > 1 , is analyzed using a technique by Rosenthal [11].

2. Related Work

Tighter versions of (1) can be found in [3,12]. The large deviation behavior of guessing was studied in [13,14]. The relation between guessing and variable-length lossless source coding was explored in [3,15,16].
Mismatched guessing, where the assumed distribution of $X$ does not match its actual distribution, was studied in [17], along with guessing under source uncertainty, where the PMF of $X$ belongs to some known set, and a guesser was sought with good worst-case performance over that set. Guessing subject to distortion, where instead of guessing $X$, it suffices to guess an $\hat{X}$ that is close to $X$ according to some distortion measure, was treated in [18].
If the guesser observes some side information $Y$, then the $\rho$th moment of the number of guesses required by an optimal guesser is bounded by [2]:
$$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho H_{\tilde{\rho}}(X|Y)} \le \mathbb{E}\big[G(X|Y)^{\rho}\big] \le 2^{\rho H_{\tilde{\rho}}(X|Y)},$$ (18)
where $H_{\tilde{\rho}}(X|Y)$ denotes the Arimoto–Rényi conditional entropy of order $\tilde{\rho} = \frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (18) were recently derived in [3]). Guessing is related to the cutoff rate of a discrete memoryless channel, which is the supremum over all rates for which the $\rho$th moment of the number of guesses needed by the decoder to guess the message can be driven to one as the block length tends to infinity. In [2,19], the cutoff rate was expressed in terms of Gallager's $E_0$ function [20]. Joint source-channel guessing was considered in [21].
Guessing with an encoder, i.e., the situation where the side information can be chosen, was studied in [4], where it was also shown that guessing and task encoding [22] have the same asymptotics. With distributed encoders, however, task encoding [9] and guessing no longer have the same asymptotics; see Remark 3. Lower and upper bounds for guessing with a helper, i.e., an encoder that does not observe X, but has access to a random variable that is correlated with X, can be found in [5].

3. Preliminaries

Throughout the paper, $\log(\cdot)$ denotes the base-two logarithm. When clear from the context, we often omit sets and subscripts; for example, we write $x$ for $x \in \mathcal{X}$ and $P(x)$ for $P_X(x)$. The Rényi entropy [23] of order $\alpha$ is defined for positive $\alpha$ other than one as:
$$H_\alpha(X) \triangleq \frac{1}{1-\alpha}\log\sum_x P(x)^{\alpha}.$$ (19)
In the limit as $\alpha$ tends to one, the Shannon entropy is recovered, i.e., $\lim_{\alpha\to 1} H_\alpha(X) = H(X)$. The Arimoto–Rényi conditional entropy [24] of order $\alpha$ is defined for positive $\alpha$ other than one as:
$$H_\alpha(X|Y) \triangleq \frac{\alpha}{1-\alpha}\log\sum_y \Big(\sum_x P(x,y)^{\alpha}\Big)^{\frac{1}{\alpha}}.$$ (20)
In the limit as $\alpha$ tends to one, the Shannon conditional entropy is recovered, i.e., $\lim_{\alpha\to 1} H_\alpha(X|Y) = H(X|Y)$. The properties of the Arimoto–Rényi conditional entropy were studied in [7,24,25].
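For concreteness, definitions (19) and (20) can be transcribed directly into code, and the limits above can be checked numerically by letting α approach one (a sketch for illustration only; the PMF is an arbitrary choice):

```python
import math

def renyi_entropy(P_X, alpha):
    # H_alpha(X) as in (19); P_X maps x to P(x).
    return math.log2(sum(p ** alpha for p in P_X.values())) / (1 - alpha)

def arimoto_renyi_cond_entropy(P_XY, alpha):
    # H_alpha(X|Y) as in (20); P_XY maps (x, y) to P(x, y).
    ys = {y for (_, y) in P_XY}
    total = sum(
        sum(p ** alpha for (x, yy), p in P_XY.items() if yy == y) ** (1 / alpha)
        for y in ys)
    return alpha / (1 - alpha) * math.log2(total)

P_XY = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}  # arbitrary example
P_X = {0: 0.5, 1: 0.5}                                        # marginal of X

# As alpha tends to one, the values approach the Shannon quantities H(X) and H(X|Y).
for alpha in (0.5, 0.9, 0.99, 0.999):
    print(alpha, renyi_entropy(P_X, alpha),
          arimoto_renyi_cond_entropy(P_XY, alpha))
```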
In the rest of this section, we recall some properties of the Arimoto–Rényi conditional entropy that will be used in Section 4 (Lemmas 1–3), and we prove auxiliary results for Section 5 (Lemmas 4–7).
Lemma 1 
([7], Theorem 2). Let $\alpha > 0$, and let $P_{XYZ}$ be a PMF over the finite set $\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. Then,
$$H_\alpha(X|Y,Z) \le H_\alpha(X|Z)$$ (21)
with equality if and only if $X - Z - Y$ form a Markov chain.
Lemma 2 
([7], Proposition 4). Let $\alpha > 0$, and let $P_{XYZ}$ be a PMF over the finite set $\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. Then,
$$H_\alpha(X,Y|Z) \ge H_\alpha(X|Z)$$ (22)
with equality if and only if $Y$ is uniquely determined by $X$ and $Z$.
Lemma 3 
([7], Theorem 3). Let $\alpha > 0$, and let $P_{XYZ}$ be a PMF over the finite set $\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$. Then,
$$H_\alpha(X|Y,Z) \ge H_\alpha(X|Z) - \log|\mathcal{Y}|.$$ (23)
Lemma 4 
([20], Problem 4.15(f)). Let $\mathcal{Y}$ be a finite set, and let $f\colon \mathcal{Y} \to \mathbb{R}_{\ge 0}$. Then, for all $p \in (0,1]$,
$$\Big(\sum_y f(y)\Big)^{p} \le \sum_y f(y)^{p}.$$ (24)
Proof. 
If $\sum_y f(y) = 0$, then (24) holds because the left-hand side (LHS) and the right-hand side (RHS) are both zero. If $\sum_y f(y) > 0$, then:
$$\Big(\sum_y f(y)\Big)^{p} = \Big(\sum_y f(y)\Big)^{p}\sum_y \frac{f(y)}{\sum_{y'} f(y')}$$ (25)
$$\le \Big(\sum_y f(y)\Big)^{p}\sum_y \bigg(\frac{f(y)}{\sum_{y'} f(y')}\bigg)^{p}$$ (26)
$$= \sum_y f(y)^{p},$$ (27)
where (26) holds because $p \in (0,1]$ and $f(y)/\sum_{y'} f(y') \in [0,1]$ for every $y \in \mathcal{Y}$. ☐
Lemma 5.
Let $a$, $b$, and $c$ be nonnegative integers. Then, for all $p > 0$,
$$(1+a+b+c)^{p} \le 1 + 4^{p}(a^{p}+b^{p}+c^{p})$$ (28)
(the restriction to integers cannot be omitted; for example, (28) does not hold if $a = b = c = 0.1$ and $p = 2$).
Proof. 
If $p \in (0,1]$, then (28) follows from Lemma 4 because $4^{p} \ge 1$. If $p > 1$, then the cases with $a+b+c \in \{0,1,2\}$ can be checked individually. For $a+b+c \ge 3$,
$$(1+a+b+c)^{p} = \Big(\frac{3}{a+b+c}+3\Big)^{p}\cdot\Big(\frac{a+b+c}{3}\Big)^{p}$$ (29)
$$\le 4^{p}\cdot\Big(\frac{a+b+c}{3}\Big)^{p}$$ (30)
$$\le 4^{p}\cdot\frac{a^{p}+b^{p}+c^{p}}{3}$$ (31)
$$\le 1 + 4^{p}(a^{p}+b^{p}+c^{p}),$$ (32)
where (30) holds because $a+b+c \ge 3$, and (31) follows from Jensen's inequality because $z \mapsto z^{p}$ is convex on $\mathbb{R}_{\ge 0}$ since $p > 1$. ☐
Lemma 6.
Let $a$, $b$, $c$, and $d$ be nonnegative real numbers. Then, for all $p > 0$,
$$(a+b+c+d)^{p} \le 4^{p}(a^{p}+b^{p}+c^{p}+d^{p}).$$ (33)
Proof. 
If $p \in (0,1]$, then (33) follows from Lemma 4 because $4^{p} \ge 1$. If $p > 1$, then:
$$(a+b+c+d)^{p} = 4^{p}\cdot\Big(\frac{a+b+c+d}{4}\Big)^{p}$$ (34)
$$\le 4^{p}\cdot\frac{a^{p}+b^{p}+c^{p}+d^{p}}{4}$$ (35)
$$\le 4^{p}(a^{p}+b^{p}+c^{p}+d^{p}),$$ (36)
where (35) follows from Jensen's inequality because $z \mapsto z^{p}$ is convex on $\mathbb{R}_{\ge 0}$ since $p > 1$. ☐
Lemma 7 
(Rosenthal). Let $p > 1$, and let $X_1, \ldots, X_n$ be independent random variables that are either zero or one. Then, $X \triangleq \sum_{i=1}^{n} X_i$ satisfies:
$$\mathbb{E}[X^{p}] \le 2^{p^2}\max\big\{\mathbb{E}[X],\ \mathbb{E}[X]^{p}\big\}.$$ (37)
Proof. 
This is a special case of [11] (Lemma 1). For convenience, we also provide a self-contained proof:
$$\mathbb{E}[X^{p}] = \mathbb{E}\bigg[\sum_{i\in\{1,\ldots,n\}} X_i\cdot\Big(\sum_{j\in\{1,\ldots,n\}} X_j\Big)^{p-1}\bigg]$$ (38)
$$= \mathbb{E}\bigg[\sum_{i\in\{1,\ldots,n\}} X_i\cdot\Big(1+\sum_{j\in\{1,\ldots,n\}\setminus\{i\}} X_j\Big)^{p-1}\bigg]$$ (39)
$$= \sum_{i\in\{1,\ldots,n\}}\mathbb{E}\bigg[X_i\cdot\Big(1+\sum_{j\in\{1,\ldots,n\}\setminus\{i\}} X_j\Big)^{p-1}\bigg]$$ (40)
$$= \sum_{i\in\{1,\ldots,n\}}\mathbb{E}[X_i]\cdot\mathbb{E}\bigg[\Big(1+\sum_{j\in\{1,\ldots,n\}\setminus\{i\}} X_j\Big)^{p-1}\bigg]$$ (41)
$$\le \sum_{i\in\{1,\ldots,n\}}\mathbb{E}[X_i]\cdot\mathbb{E}\bigg[\Big(1+\sum_{j\in\{1,\ldots,n\}} X_j\Big)^{p-1}\bigg]$$ (42)
$$= \mathbb{E}[X]\cdot\mathbb{E}\big[(1+X)^{p-1}\big]$$ (43)
$$\le \mathbb{E}[X]\cdot 2^{p-1}\cdot\big(1+\mathbb{E}[X^{p-1}]\big)$$ (44)
$$= 2^{p-1}\big(\mathbb{E}[X]+\mathbb{E}[X]\,\mathbb{E}[X^{p-1}]\big)$$ (45)
$$\le 2^{p-1}\Big(\mathbb{E}[X]+\mathbb{E}[X]\,\mathbb{E}[X^{p}]^{\frac{p-1}{p}}\Big)$$ (46)
$$\le 2^{p}\max\Big\{\mathbb{E}[X],\ \mathbb{E}[X]\,\mathbb{E}[X^{p}]^{\frac{p-1}{p}}\Big\},$$ (47)
where (39) holds because each $X_i$ is either zero or one; (41) holds because $X_1,\ldots,X_n$ are independent; (42) holds because $z \mapsto z^{p-1}$ is increasing on $\mathbb{R}_{\ge 0}$ for $p > 1$; (44) holds because for real numbers $a \ge 0$, $b \ge 0$, and $r > 0$, we have $(a+b)^{r} \le (2\max\{a,b\})^{r} = 2^{r}\max\{a^{r},b^{r}\} \le 2^{r}(a^{r}+b^{r})$; and (46) follows from Jensen's inequality because $z \mapsto z^{(p-1)/p}$ is concave on $\mathbb{R}_{\ge 0}$ for $p > 1$.
We now consider two cases depending on which term on the RHS of (47) achieves the maximum: If the maximum is achieved by $\mathbb{E}[X]$, then $\mathbb{E}[X^{p}] \le 2^{p}\,\mathbb{E}[X]$, which implies (37) because $2^{p} \le 2^{p^2}$ since $p > 1$. If the maximum is achieved by $\mathbb{E}[X]\,\mathbb{E}[X^{p}]^{(p-1)/p}$, then:
$$\mathbb{E}[X^{p}] \le 2^{p}\,\mathbb{E}[X]\,\mathbb{E}[X^{p}]^{\frac{p-1}{p}}.$$ (48)
Rearranging (48), we obtain:
$$\mathbb{E}[X^{p}] \le 2^{p^2}\,\mathbb{E}[X]^{p},$$ (49)
so (37) holds also in this case. ☐
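A small Monte Carlo experiment makes (37) plausible (a sketch with arbitrary Bernoulli parameters and p; not part of the paper):

```python
import random

def check_rosenthal(probs, p, trials=200_000, seed=0):
    # X = sum of independent Bernoulli(prob_i) random variables.
    # Estimate E[X^p] and compare with 2^(p^2) * max(E[X], E[X]^p) from (37).
    rng = random.Random(seed)
    moment = 0.0
    for _ in range(trials):
        x = sum(rng.random() < q for q in probs)
        moment += x ** p
    moment /= trials
    mean = sum(probs)  # E[X]
    bound = 2 ** (p * p) * max(mean, mean ** p)
    return moment, bound

moment, bound = check_rosenthal(probs=[0.1, 0.3, 0.05, 0.6, 0.2], p=2.5)
print(f"E[X^p] ~ {moment:.3f} <= {bound:.3f}")
```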

4. Converse

In this section, we prove a nonasymptotic and an asymptotic converse result (Theorem 2 and Corollary 1, respectively).
Theorem 2.
Let $U - X - Y - V$ form a Markov chain over the finite set $\mathcal{U}\times\mathcal{X}\times\mathcal{Y}\times\mathcal{V}$, and let $\tau \triangleq 1 + \ln|\mathcal{X}\times\mathcal{Y}|$. Then, for every $\rho > 0$ and for every guesser, the $\rho$th moment of the number of guesses it takes to guess the pair $(X,Y)$ based on the side information $(U,V)$ satisfies:
$$\mathbb{E}\big[G(X,Y\,|\,U,V)^{\rho}\big] \ge \max\Big\{2^{\rho(H_{\tilde{\rho}}(X|Y)-\log|\mathcal{U}|-\log\tau)},\ 2^{\rho(H_{\tilde{\rho}}(Y|X)-\log|\mathcal{V}|-\log\tau)},\ 2^{\rho(H_{\tilde{\rho}}(X,Y)-\log|\mathcal{U}\times\mathcal{V}|-\log\tau)}\Big\}.$$ (50)
Proof. 
We view (50) as three lower bounds corresponding to the three terms in the maximization on its RHS. The lower bound involving $H_{\tilde{\rho}}(X,Y)$ holds because:
$$\mathbb{E}\big[G(X,Y\,|\,U,V)^{\rho}\big] \ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V)-\log\tau)}$$ (51)
$$\ge 2^{\rho(H_{\tilde{\rho}}(X,Y)-\log|\mathcal{U}\times\mathcal{V}|-\log\tau)},$$ (52)
where (51) follows from (18) and (52) follows from Lemma 3. The lower bound involving $H_{\tilde{\rho}}(X|Y)$ holds because:
$$\mathbb{E}\big[G(X,Y\,|\,U,V)^{\rho}\big] \ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V)-\log\tau)}$$ (53)
$$\ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V,Y)-\log\tau)}$$ (54)
$$= 2^{\rho(H_{\tilde{\rho}}(X|U,V,Y)-\log\tau)}$$ (55)
$$= 2^{\rho(H_{\tilde{\rho}}(X|U,Y)-\log\tau)}$$ (56)
$$\ge 2^{\rho(H_{\tilde{\rho}}(X|Y)-\log|\mathcal{U}|-\log\tau)},$$ (57)
where (53) follows from (18); (54) follows from Lemma 1; (55) follows from Lemma 2; (56) follows from Lemma 1 because $X - (U,Y) - V$ form a Markov chain; and (57) follows from Lemma 3. The lower bound involving $H_{\tilde{\rho}}(Y|X)$ is analogous to the one with $H_{\tilde{\rho}}(X|Y)$. ☐
Corollary 1.
For any $\rho > 0$, rate pairs outside $\mathcal{R}(\rho)$ are not achievable.
Proof. 
We first show that (8) is necessary for a rate pair $(R_X, R_Y) \in \mathbb{R}_{\ge 0}^2$ to be achievable. Indeed, if (8) does not hold, then there exists an $\epsilon > 0$ such that for infinitely many $n$,
$$\frac{H_{\tilde{\rho}}(X^n,Y^n)}{n} \ge R_X + R_Y + \epsilon.$$ (58)
Using Theorem 2 with $\mathcal{X} \leftarrow \mathcal{X}^n$, $\mathcal{Y} \leftarrow \mathcal{Y}^n$, $\mathcal{U} \triangleq \{1,\ldots,2^{nR_X}\}$, $\mathcal{V} \triangleq \{1,\ldots,2^{nR_Y}\}$, $P_{XY} \leftarrow P_{X^nY^n}$, $U \triangleq f_n(X^n)$, $V \triangleq g_n(Y^n)$, and $\tau_n = 1 + n\ln|\mathcal{X}\times\mathcal{Y}|$ leads to:
$$\mathbb{E}\big[G(X^n,Y^n\,|\,U,V)^{\rho}\big] \ge 2^{\rho(H_{\tilde{\rho}}(X^n,Y^n)-\log|\mathcal{U}\times\mathcal{V}|-\log\tau_n)}$$ (59)
$$\ge 2^{\rho n\big(\frac{1}{n}H_{\tilde{\rho}}(X^n,Y^n)-R_X-R_Y-\frac{1}{n}\log\tau_n\big)}.$$ (60)
It follows from (60), (58), and the fact that $\frac{1}{n}\log\tau_n$ tends to zero as $n$ tends to infinity that the LHS of (59) cannot tend to one as $n$ tends to infinity, so $(R_X, R_Y)$ is not achievable if (8) does not hold. The necessity of (6) and (7) can be shown in the same way. ☐

5. Achievability

In this section, we prove a nonasymptotic and an asymptotic achievability result (Theorem 3 and Corollary 2, respectively).
Theorem 3.
Let $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{U}$, and $\mathcal{V}$ be finite nonempty sets; let $P_{XY}$ be a PMF; let $\rho > 0$; and let $\epsilon > 0$ be such that:
$$\log|\mathcal{U}| \ge H_{\tilde{\rho}}(X|Y) + \epsilon,$$ (61)
$$\log|\mathcal{V}| \ge H_{\tilde{\rho}}(Y|X) + \epsilon,$$ (62)
$$\log|\mathcal{U}\times\mathcal{V}| \ge H_{\tilde{\rho}}(X,Y) + \epsilon.$$ (63)
Then, there exist functions $f\colon \mathcal{X}\to\mathcal{U}$ and $g\colon \mathcal{Y}\to\mathcal{V}$ and a guesser such that the $\rho$th moment of the number of guesses needed to guess the pair $(X,Y)$ based on the side information $(f(X),g(Y))$ satisfies:
$$\mathbb{E}\big[G(X,Y\,|\,f(X),g(Y))^{\rho}\big] \le \begin{cases} 1 + 4^{\rho+1}\cdot 2^{-\rho\epsilon} & \text{if } \rho\in(0,1],\\ 1 + 4^{(\rho+1)^2}\cdot 2^{-\epsilon} & \text{if } \rho>1.\end{cases}$$ (64)
Proof. 
Our achievability result relies on random binning: we map each $x \in \mathcal{X}$ uniformly at random to some $u \in \mathcal{U}$ and each $y \in \mathcal{Y}$ uniformly at random to some $v \in \mathcal{V}$. We then show that the $\rho$th moment of the number of guesses, averaged over all such mappings $f\colon \mathcal{X}\to\mathcal{U}$ and $g\colon \mathcal{Y}\to\mathcal{V}$, is upper-bounded by the RHS of (64). From this, we conclude that there exist $f$ and $g$ that satisfy (64).
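For small alphabets, this scheme is easy to simulate (a sketch with arbitrary toy parameters, not from the paper): draw the binning maps f and g uniformly at random, let the guesser go through all pairs in decreasing order of probability while skipping those whose labels do not match the observed (f(X), g(Y)), and estimate the ρth guessing moment by Monte Carlo.

```python
import random

def guessing_moment(P, num_labels_u, num_labels_v, rho, trials=20_000, seed=1):
    # P maps (x, y) to its probability. Returns a Monte Carlo estimate of
    # E_{f,g} E[ G(X, Y | f(X), g(Y))^rho ] under random binning.
    rng = random.Random(seed)
    xs = sorted({x for x, _ in P})
    ys = sorted({y for _, y in P})
    order = sorted(P, key=P.get, reverse=True)  # guess most likely pairs first
    total = 0.0
    for _ in range(trials):
        f = {x: rng.randrange(num_labels_u) for x in xs}  # random binning of x
        g = {y: rng.randrange(num_labels_v) for y in ys}  # random binning of y
        x, y = rng.choices(list(P), weights=list(P.values()))[0]
        u, v = f[x], g[y]
        guesses = 0
        for a, b in order:            # only pairs with matching labels are guessed
            if f[a] == u and g[b] == v:
                guesses += 1
            if (a, b) == (x, y):
                break
        total += guesses ** rho
    return total / trials

# Arbitrary toy source: uniform over a 4 x 4 alphabet, four labels per encoder.
P = {(x, y): 1 / 16 for x in range(4) for y in range(4)}
print(guessing_moment(P, num_labels_u=4, num_labels_v=4, rho=1.0))
```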
Let the guessing function $G$ correspond to guessing in decreasing order of probability [2] (ties can be resolved arbitrarily). Let $f$ and $g$ be distributed as described above, and denote by $\mathbb{E}_{f,g}[\cdot]$ the expectation with respect to $f$ and $g$. Then,
$$\mathbb{E}_{f,g}\Big[\mathbb{E}\big[G(X,Y\,|\,f(X),g(Y))^{\rho}\big]\Big] = \sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}\big[G(x,y\,|\,f(x),g(y))^{\rho}\big]$$ (65)
$$\le \sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}\bigg[\Big(\sum_{x',y'}\psi(x',y')\,\phi_f(x')\,\phi_g(y')\Big)^{\rho}\bigg]$$ (66)
$$= \sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}\big[(1+\beta_1+\beta_2+\beta_3)^{\rho}\big]$$ (67)
$$\le 1 + 4^{\rho}\sum_{x,y} P(x,y)\,\big(\mathbb{E}_{f,g}[\beta_1^{\rho}]+\mathbb{E}_{f,g}[\beta_2^{\rho}]+\mathbb{E}_{f,g}[\beta_3^{\rho}]\big)$$ (68)
with:
$$\psi(x',y') = \psi(x,y,x',y') \triangleq \mathbb{1}\{P(x',y') \ge P(x,y)\},$$ (69)
$$\phi_f(x') = \phi_f(x,x') \triangleq \mathbb{1}\{f(x') = f(x)\},$$ (70)
$$\phi_g(y') = \phi_g(y,y') \triangleq \mathbb{1}\{g(y') = g(y)\},$$ (71)
$$\beta_1 = \beta_1(x,y,f) \triangleq \sum_{x'\ne x}\psi(x',y)\,\phi_f(x'),$$ (72)
$$\beta_2 = \beta_2(x,y,g) \triangleq \sum_{y'\ne y}\psi(x,y')\,\phi_g(y'),$$ (73)
$$\beta_3 = \beta_3(x,y,f,g) \triangleq \sum_{x'\ne x,\,y'\ne y}\psi(x',y')\,\phi_f(x')\,\phi_g(y'),$$ (74)
where $\mathbb{1}\{\cdot\}$ is the indicator function that is one if the condition comprising its argument is true and zero otherwise; (65) holds because $(f,g)$ and $(X,Y)$ are independent; (66) holds because the number of guesses is upper-bounded by the number of pairs $(x',y')$ that are at least as likely as $(x,y)$ and that are mapped to the same labels $(u,v)$ as $(x,y)$; (67) follows from splitting the sum depending on whether $x' = x$ or not and whether $y' = y$ or not and from the fact that $\psi(x,y) = \phi_f(x) = \phi_g(y) = 1$; and (68) follows from Lemma 5 because $\beta_1$, $\beta_2$, and $\beta_3$ are nonnegative integers. As indicated in (69)–(74), the dependence of $\psi$, $\phi_f$, $\phi_g$, $\beta_1$, $\beta_2$, and $\beta_3$ on $x$, $y$, $f$, and $g$ is implicit in our notation.
We first treat the case $\rho \in (0,1]$. We bound the terms on the RHS of (68) as follows:
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1^{\rho}] \le \sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1]^{\rho}$$ (75)
$$= \sum_{x,y} P(x,y)\Big(\sum_{x'\ne x}\psi(x',y)\,\frac{1}{|\mathcal{U}|}\Big)^{\rho}$$ (76)
$$\le \sum_{x,y} P(x,y)\bigg(\sum_{x'}\Big[\frac{P(x',y)}{P(x,y)}\Big]^{\tilde{\rho}}\frac{1}{|\mathcal{U}|}\bigg)^{\rho}$$ (77)
$$= \frac{1}{|\mathcal{U}|^{\rho}}\sum_{x,y} P(x,y)^{\tilde{\rho}}\Big(\sum_{x'} P(x',y)^{\tilde{\rho}}\Big)^{\rho}$$ (78)
$$= \frac{1}{|\mathcal{U}|^{\rho}}\sum_{y}\Big(\sum_{x} P(x,y)^{\tilde{\rho}}\Big)\Big(\sum_{x'} P(x',y)^{\tilde{\rho}}\Big)^{\rho}$$ (79)
$$= \frac{1}{|\mathcal{U}|^{\rho}}\sum_{y}\Big(\sum_{x} P(x,y)^{\tilde{\rho}}\Big)^{1+\rho}$$ (80)
$$= 2^{\rho(H_{\tilde{\rho}}(X|Y)-\log|\mathcal{U}|)}$$ (81)
$$\le 2^{-\rho\epsilon},$$ (82)
where (75) follows from Jensen's inequality because $z \mapsto z^{\rho}$ is concave on $\mathbb{R}_{\ge 0}$ since $\rho \in (0,1]$; (76) holds because the expectation operator is linear and because $\mathbb{E}_{f,g}[\phi_f(x')] = 1/|\mathcal{U}|$ since $x' \ne x$; in (77), we extended the inner summation and used that $\psi(x',y) \le [P(x',y)/P(x,y)]^{\tilde{\rho}}$; and (82) follows from (61). In the same way, we obtain:
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_2^{\rho}] \le 2^{-\rho\epsilon}.$$ (83)
Similarly,
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3^{\rho}] \le \sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3]^{\rho}$$ (84)
$$= \sum_{x,y} P(x,y)\Big(\sum_{x'\ne x,\,y'\ne y}\psi(x',y')\,\frac{1}{|\mathcal{U}\times\mathcal{V}|}\Big)^{\rho}$$ (85)
$$\le \sum_{x,y} P(x,y)\bigg(\sum_{x',y'}\Big[\frac{P(x',y')}{P(x,y)}\Big]^{\tilde{\rho}}\frac{1}{|\mathcal{U}\times\mathcal{V}|}\bigg)^{\rho}$$ (86)
$$= \frac{1}{|\mathcal{U}\times\mathcal{V}|^{\rho}}\sum_{x,y} P(x,y)^{\tilde{\rho}}\Big(\sum_{x',y'} P(x',y')^{\tilde{\rho}}\Big)^{\rho}$$ (87)
$$= \frac{1}{|\mathcal{U}\times\mathcal{V}|^{\rho}}\Big(\sum_{x,y} P(x,y)^{\tilde{\rho}}\Big)^{1+\rho}$$ (88)
$$= 2^{\rho(H_{\tilde{\rho}}(X,Y)-\log|\mathcal{U}\times\mathcal{V}|)}$$ (89)
$$\le 2^{-\rho\epsilon}.$$ (90)
From (68), (82), (83), and (90), we obtain:
$$\mathbb{E}_{f,g}\Big[\mathbb{E}\big[G(X,Y\,|\,f(X),g(Y))^{\rho}\big]\Big] \le 1 + 3\cdot 4^{\rho}\cdot 2^{-\rho\epsilon}$$ (91)
$$\le 1 + 4^{\rho+1}\cdot 2^{-\rho\epsilon}$$ (92)
and hence infer the existence of $f\colon \mathcal{X}\to\mathcal{U}$ and $g\colon \mathcal{Y}\to\mathcal{V}$ satisfying (64).
We now consider (68) when $\rho > 1$. Unlike in the case $\rho \in (0,1]$, we cannot use Jensen's inequality as we did in (75). Instead, for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, we upper-bound the first expectation on the RHS of (68) by:
$$\mathbb{E}_{f,g}[\beta_1^{\rho}] \le 2^{\rho^2}\max\big\{\mathbb{E}_{f,g}[\beta_1],\ \mathbb{E}_{f,g}[\beta_1]^{\rho}\big\}$$ (93)
$$\le 2^{\rho^2}\big(\mathbb{E}_{f,g}[\beta_1]^{\rho}+\mathbb{E}_{f,g}[\beta_1]\big),$$ (94)
where (93) follows from Lemma 7 because $\rho > 1$ and because $\beta_1$ is a sum of independent random variables taking values in $\{0,1\}$. By the same steps as in (76)–(82),
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1]^{\rho} \le 2^{-\rho\epsilon}.$$ (95)
As to the expectation of the other term on the RHS of (94),
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1] \le \bigg(\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1]^{\rho}\bigg)^{\frac{1}{\rho}}$$ (96)
$$\le 2^{-\epsilon},$$ (97)
where (96) follows from Jensen's inequality because $z \mapsto z^{\frac{1}{\rho}}$ is concave on $\mathbb{R}_{\ge 0}$ since $\rho > 1$, and (97) follows from (95). From (94), (95), and (97), we obtain:
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_1^{\rho}] \le 2^{\rho^2}\big(2^{-\rho\epsilon}+2^{-\epsilon}\big)$$ (98)
$$\le 2^{\rho^2+1}\cdot 2^{-\epsilon},$$ (99)
where (99) holds because $2^{-\rho\epsilon} \le 2^{-\epsilon}$ since $\rho > 1$ and $\epsilon > 0$. In the same way, we obtain for the second expectation on the RHS of (68):
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_2^{\rho}] \le 2^{\rho^2+1}\cdot 2^{-\epsilon}.$$ (100)
Bounding $\mathbb{E}_{f,g}[\beta_3^{\rho}]$, i.e., the third expectation on the RHS of (68), is more involved because $\beta_3$ is not a sum of independent random variables. Our approach builds on the ideas used by Rosenthal [11] (Proof of Lemma 1); compare (47) and (48) with (108) and (123) ahead. For fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,
$$\mathbb{E}_{f,g}[\beta_3^{\rho}] = \mathbb{E}_{f,g}\bigg[\sum_{x'\ne x,\,y'\ne y}\psi(x',y')\,\phi_f(x')\,\phi_g(y')\cdot\Big(\sum_{\tilde{x}\ne x,\,\tilde{y}\ne y}\psi(\tilde{x},\tilde{y})\,\phi_f(\tilde{x})\,\phi_g(\tilde{y})\Big)^{\rho-1}\bigg]$$ (101)
$$= \mathbb{E}_{f,g}\bigg[\sum_{x'\ne x,\,y'\ne y}\psi(x',y')\,\phi_f(x')\,\phi_g(y')\cdot(1+\gamma_1+\gamma_2+\gamma_3)^{\rho-1}\bigg]$$ (102)
$$= \sum_{x'\ne x,\,y'\ne y}\mathbb{E}_{f,g}\Big[\psi(x',y')\,\phi_f(x')\,\phi_g(y')\cdot(1+\gamma_1+\gamma_2+\gamma_3)^{\rho-1}\Big]$$ (103)
$$= \sum_{x'\ne x,\,y'\ne y}\mathbb{E}_{f,g}\big[\psi(x',y')\,\phi_f(x')\,\phi_g(y')\big]\cdot\mathbb{E}_{f,g}\big[(1+\gamma_1+\gamma_2+\gamma_3)^{\rho-1}\big]$$ (104)
$$\le \sum_{x'\ne x,\,y'\ne y}\mathbb{E}_{f,g}\big[\psi(x',y')\,\phi_f(x')\,\phi_g(y')\big]\cdot\mathbb{E}_{f,g}\big[(1+\delta_1+\delta_2+\beta_3)^{\rho-1}\big]$$ (105)
$$\le \sum_{x'\ne x,\,y'\ne y}\mathbb{E}_{f,g}\big[\psi(x',y')\,\phi_f(x')\,\phi_g(y')\big]\cdot 4^{\rho-1}\cdot\mathbb{E}_{f,g}\big[1+\delta_1^{\rho-1}+\delta_2^{\rho-1}+\beta_3^{\rho-1}\big]$$ (106)
$$= 4^{\rho-1}\bigg\{\mathbb{E}_{f,g}[\beta_3]+\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]\,\mathbb{E}_{f,g}[\delta_1^{\rho-1}]+\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]\,\mathbb{E}_{f,g}[\delta_2^{\rho-1}]+\mathbb{E}_{f,g}[\beta_3]\,\mathbb{E}_{f,g}[\beta_3^{\rho-1}]\bigg\}$$ (107)
$$\le 4^{\rho}\max\bigg\{\mathbb{E}_{f,g}[\beta_3],\ \sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]\,\mathbb{E}_{f,g}[\delta_1^{\rho-1}],\ \sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]\,\mathbb{E}_{f,g}[\delta_2^{\rho-1}],\ \mathbb{E}_{f,g}[\beta_3]\,\mathbb{E}_{f,g}[\beta_3^{\rho-1}]\bigg\}$$ (108)
with:
$$\gamma_1 = \gamma_1(x,y,x',y',f) \triangleq \sum_{\tilde{x}\notin\{x,x'\}}\psi(\tilde{x},y')\,\phi_f(\tilde{x}),$$ (109)
$$\gamma_2 = \gamma_2(x,y,x',y',g) \triangleq \sum_{\tilde{y}\notin\{y,y'\}}\psi(x',\tilde{y})\,\phi_g(\tilde{y}),$$ (110)
$$\gamma_3 = \gamma_3(x,y,x',y',f,g) \triangleq \sum_{\tilde{x}\notin\{x,x'\},\,\tilde{y}\notin\{y,y'\}}\psi(\tilde{x},\tilde{y})\,\phi_f(\tilde{x})\,\phi_g(\tilde{y}),$$ (111)
$$\delta_1 = \delta_1(x,y,y',f) \triangleq \sum_{\tilde{x}\ne x}\psi(\tilde{x},y')\,\phi_f(\tilde{x}),$$ (112)
$$\delta_2 = \delta_2(x,y,x',g) \triangleq \sum_{\tilde{y}\ne y}\psi(x',\tilde{y})\,\phi_g(\tilde{y}),$$ (113)
where (102) follows from splitting the sum in braces depending on whether $\tilde{x} = x'$ or not and whether $\tilde{y} = y'$ or not and from assuming $\psi(x',y') = \phi_f(x') = \phi_g(y') = 1$ within the braces, which does not change the value of the expression because it is multiplied by $\psi(x',y')\,\phi_f(x')\,\phi_g(y')$; (104) holds because $(\phi_f(x'),\phi_g(y'))$ and $(\gamma_1,\gamma_2,\gamma_3)$ are independent since $\tilde{x} \ne x'$ and $\tilde{y} \ne y'$; (105) holds because $\rho - 1 > 0$, $\gamma_1 \le \delta_1$, $\gamma_2 \le \delta_2$, and $\gamma_3 \le \beta_3$; (106) follows from Lemma 6; and (107) follows from identifying $\mathbb{E}_{f,g}[\beta_3]$, $\mathbb{E}_{f,g}[\delta_1]$, and $\mathbb{E}_{f,g}[\delta_2]$ because $\phi_f(x')$ and $\phi_g(y')$ are independent, $\mathbb{E}_{f,g}[\phi_f(x')] = 1/|\mathcal{U}|$, and $\mathbb{E}_{f,g}[\phi_g(y')] = 1/|\mathcal{V}|$. As indicated in (109)–(113), the dependence of $\gamma_1$, $\gamma_2$, $\gamma_3$, $\delta_1$, and $\delta_2$ on $x$, $y$, $x'$, $y'$, $f$, and $g$ is implicit in our notation.
To bound $\mathbb{E}_{f,g}[\beta_3^{\rho}]$ further, we study some of the terms on the RHS of (108) separately, starting with the second, which involves the sum over $y'$. For fixed $x \in \mathcal{X}$, $y \in \mathcal{Y}$, and $y' \in \mathcal{Y}\setminus\{y\}$,
$$\mathbb{E}_{f,g}[\delta_1]\,\mathbb{E}_{f,g}[\delta_1^{\rho-1}] \le \mathbb{E}_{f,g}[\delta_1^{\rho}]^{\frac{1}{\rho}}\,\mathbb{E}_{f,g}[\delta_1^{\rho}]^{\frac{\rho-1}{\rho}}$$ (114)
$$= \mathbb{E}_{f,g}[\delta_1^{\rho}]$$ (115)
$$\le 2^{\rho^2}\max\big\{\mathbb{E}_{f,g}[\delta_1],\ \mathbb{E}_{f,g}[\delta_1]^{\rho}\big\}$$ (116)
$$\le 2^{\rho^2}\big(\mathbb{E}_{f,g}[\delta_1]+\mathbb{E}_{f,g}[\delta_1]^{\rho}\big),$$ (117)
where (114) follows from Jensen's inequality because $z \mapsto z^{\frac{1}{\rho}}$ and $z \mapsto z^{\frac{\rho-1}{\rho}}$ are both concave on $\mathbb{R}_{\ge 0}$ since $\rho > 1$, and (116) follows from Lemma 7 because $\rho > 1$ and because $\delta_1$ is a sum of independent random variables taking values in $\{0,1\}$. This implies that for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,
$$\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]\,\mathbb{E}_{f,g}[\delta_1^{\rho-1}] \le 2^{\rho^2}\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\big(\mathbb{E}_{f,g}[\delta_1]+\mathbb{E}_{f,g}[\delta_1]^{\rho}\big)$$ (118)
$$= 2^{\rho^2}\,\mathbb{E}_{f,g}[\beta_3]+2^{\rho^2}\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]^{\rho},$$ (119)
where (119) follows from the definitions of $\delta_1$ and $\beta_3$. Similarly, for the third term on the RHS of (108),
$$\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]\,\mathbb{E}_{f,g}[\delta_2^{\rho-1}] \le 2^{\rho^2}\,\mathbb{E}_{f,g}[\beta_3]+2^{\rho^2}\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]^{\rho}.$$ (120)
With the help of (119) and (120), we now go back to (108) and argue that it implies that for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,
$$\mathbb{E}_{f,g}[\beta_3^{\rho}] \le 2\cdot 4^{\rho^2}\bigg[\mathbb{E}_{f,g}[\beta_3]+\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]^{\rho}+\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]^{\rho}+\mathbb{E}_{f,g}[\beta_3]^{\rho}\bigg].$$ (121)
To prove this, we consider four cases depending on which term on the RHS of (108) achieves the maximum: If $\mathbb{E}_{f,g}[\beta_3]$ achieves the maximum, then (121) holds because $4^{\rho} \le 2\cdot 4^{\rho^2}$. If the LHS of (118) achieves the maximum, then (121) follows from (119) because $4^{\rho}\cdot 2^{\rho^2} \le 2\cdot 4^{\rho^2}$. If the LHS of (120) achieves the maximum, then (121) follows similarly. Finally, if $\mathbb{E}_{f,g}[\beta_3]\,\mathbb{E}_{f,g}[\beta_3^{\rho-1}]$ achieves the maximum, then:
$$\mathbb{E}_{f,g}[\beta_3^{\rho}] \le 4^{\rho}\,\mathbb{E}_{f,g}[\beta_3]\,\mathbb{E}_{f,g}[\beta_3^{\rho-1}]$$ (122)
$$\le 4^{\rho}\,\mathbb{E}_{f,g}[\beta_3]\,\mathbb{E}_{f,g}[\beta_3^{\rho}]^{\frac{\rho-1}{\rho}},$$ (123)
where (123) follows from Jensen's inequality because $z \mapsto z^{\frac{\rho-1}{\rho}}$ is concave on $\mathbb{R}_{\ge 0}$ for $\rho > 1$. Rearranging (123), we obtain:
$$\mathbb{E}_{f,g}[\beta_3^{\rho}] \le 4^{\rho^2}\,\mathbb{E}_{f,g}[\beta_3]^{\rho},$$ (124)
so (121) holds also in this case.
Having established (121), we now take the expectation of its sides to obtain:
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3^{\rho}] \le 2\cdot 4^{\rho^2}\sum_{x,y} P(x,y)\bigg[\mathbb{E}_{f,g}[\beta_3]+\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]^{\rho}+\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]^{\rho}+\mathbb{E}_{f,g}[\beta_3]^{\rho}\bigg].$$ (125)
We now study the terms on the RHS of (125) separately, starting with the fourth (last). By (85)–(90), which hold also if ρ > 1 ,
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3]^{\rho} \le 2^{-\rho\epsilon}.$$ (126)
As for the first term on the RHS of (125),
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3] \le 2^{-\epsilon},$$ (127)
which follows from (126) in the same way as (97) followed from (95). As for the second term on the RHS of (125),
$$\sum_{x,y} P(x,y)\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\,\mathbb{E}_{f,g}[\delta_1]^{\rho}$$
$$= \sum_{x,y} P(x,y)\sum_{y'\ne y}\frac{1}{|\mathcal{V}|}\Big(\sum_{x'\ne x}\psi(x',y')\,\frac{1}{|\mathcal{U}|}\Big)^{\rho}$$ (128)
$$\le \sum_{x,y} P(x,y)^{\tilde{\rho}}\sum_{y'}\frac{1}{|\mathcal{V}|}\Big(\sum_{x'} P(x',y')^{\tilde{\rho}}\,\frac{1}{|\mathcal{U}|}\Big)^{\rho}$$ (129)
$$= \sum_{x,y} P(x,y)^{\tilde{\rho}}\sum_{y'}\bigg[\Big(\sum_{x'} P(x',y')^{\tilde{\rho}}\Big)\frac{1}{|\mathcal{U}\times\mathcal{V}|^{\rho}}\bigg]^{\frac{1}{\rho}}\cdot\bigg[\Big(\sum_{x'} P(x',y')^{\tilde{\rho}}\Big)^{1+\rho}\frac{1}{|\mathcal{U}|^{\rho}}\bigg]^{\frac{\rho-1}{\rho}}$$ (130)
$$\le \sum_{x,y} P(x,y)^{\tilde{\rho}}\bigg[\sum_{y'}\Big(\sum_{x'} P(x',y')^{\tilde{\rho}}\Big)\frac{1}{|\mathcal{U}\times\mathcal{V}|^{\rho}}\bigg]^{\frac{1}{\rho}}\cdot\bigg[\sum_{y'}\Big(\sum_{x'} P(x',y')^{\tilde{\rho}}\Big)^{1+\rho}\frac{1}{|\mathcal{U}|^{\rho}}\bigg]^{\frac{\rho-1}{\rho}}$$ (131)
$$= \bigg[\frac{1}{|\mathcal{U}\times\mathcal{V}|^{\rho}}\Big(\sum_{x,y} P(x,y)^{\tilde{\rho}}\Big)^{1+\rho}\bigg]^{\frac{1}{\rho}}\cdot\bigg[\frac{1}{|\mathcal{U}|^{\rho}}\sum_{y}\Big(\sum_{x} P(x,y)^{\tilde{\rho}}\Big)^{1+\rho}\bigg]^{\frac{\rho-1}{\rho}}$$ (132)
$$\le \big(2^{-\rho\epsilon}\big)^{\frac{1}{\rho}}\cdot\big(2^{-\rho\epsilon}\big)^{\frac{\rho-1}{\rho}}$$ (133)
$$= 2^{-\rho\epsilon},$$ (134)
where in (129), we extended the inner summations and used that $\psi(x',y') \le [P(x',y')/P(x,y)]^{\tilde{\rho}}$; (131) follows from Hölder's inequality; and (133) follows from (89)–(90) and (81)–(82). In the same way, we obtain for the third term on the RHS of (125):
$$\sum_{x,y} P(x,y)\sum_{x'\ne x}\frac{1}{|\mathcal{U}|}\,\mathbb{E}_{f,g}[\delta_2]^{\rho} \le 2^{-\rho\epsilon}.$$ (135)
From (125), (127), (134), (135), and (126), we deduce:
$$\sum_{x,y} P(x,y)\,\mathbb{E}_{f,g}[\beta_3^{\rho}] \le 2\cdot 4^{\rho^2}\big(2^{-\epsilon}+2^{-\rho\epsilon}+2^{-\rho\epsilon}+2^{-\rho\epsilon}\big)$$ (136)
$$\le 8\cdot 4^{\rho^2}\cdot 2^{-\epsilon},$$ (137)
where (137) holds because $2^{-\rho\epsilon} \le 2^{-\epsilon}$ since $\rho > 1$ and $\epsilon > 0$. Finally, (68), (99), (100), and (137) imply:
$$\mathbb{E}_{f,g}\Big[\mathbb{E}\big[G(X,Y\,|\,f(X),g(Y))^{\rho}\big]\Big] \le 1 + 4^{\rho}\big(2\cdot 2^{\rho^2+1}\cdot 2^{-\epsilon}+8\cdot 4^{\rho^2}\cdot 2^{-\epsilon}\big)$$ (138)
$$\le 1 + 4^{(\rho+1)^2}\cdot 2^{-\epsilon}$$ (139)
and thus prove the existence of $f\colon \mathcal{X}\to\mathcal{U}$ and $g\colon \mathcal{Y}\to\mathcal{V}$ satisfying (64). ☐
Corollary 2.
For any $\rho > 0$, rate pairs in the interior of $\mathcal{R}(\rho)$ are achievable.
Proof. 
Let $(R_X, R_Y)$ be in the interior of $\mathcal{R}(\rho)$. Then, (6)–(8) hold with strict inequalities, and there exists a $\delta > 0$ such that for all sufficiently large $n$,
$$\log 2^{nR_X} \ge H_{\tilde{\rho}}(X^n|Y^n) + n\delta,$$ (140)
$$\log 2^{nR_Y} \ge H_{\tilde{\rho}}(Y^n|X^n) + n\delta,$$ (141)
$$\log 2^{nR_X} + \log 2^{nR_Y} \ge H_{\tilde{\rho}}(X^n,Y^n) + n\delta.$$ (142)
Using Theorem 3 with $\mathcal{X} \leftarrow \mathcal{X}^n$, $\mathcal{Y} \leftarrow \mathcal{Y}^n$, $\mathcal{U} \triangleq \{1,\ldots,2^{nR_X}\}$, $\mathcal{V} \triangleq \{1,\ldots,2^{nR_Y}\}$, $P_{XY} \leftarrow P_{X^nY^n}$, and $\epsilon_n \triangleq n\delta$ shows that, for all sufficiently large $n$, there exist encoders $f_n\colon \mathcal{X}^n\to\mathcal{U}$ and $g_n\colon \mathcal{Y}^n\to\mathcal{V}$ and a guessing function $G_n$ satisfying:
$$\mathbb{E}\Big[G_n\big(X^n,Y^n\,\big|\,f_n(X^n),g_n(Y^n)\big)^{\rho}\Big] \le \begin{cases} 1 + 4^{\rho+1}\cdot 2^{-\rho\epsilon_n} & \text{if } \rho\in(0,1],\\ 1 + 4^{(\rho+1)^2}\cdot 2^{-\epsilon_n} & \text{if } \rho>1.\end{cases}$$ (143)
Because $\epsilon_n$ tends to infinity as $n$ tends to infinity, the RHS of (143) tends to one as $n$ tends to infinity, which implies that the rate pair $(R_X, R_Y)$ is achievable. ☐

Author Contributions

Writing—original draft preparation, A.B., A.L. and C.P.; writing—review and editing, A.B., A.L. and C.P.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Massey, J.L. Guessing and entropy. In Proceedings of the 1994 IEEE International Symposium on Information Theory (ISIT), Trondheim, Norway, 27 June–1 July 1994; p. 204.
  2. Arıkan, E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory 1996, 42, 99–105.
  3. Sason, I.; Verdú, S. Improved bounds on lossless source coding and guessing moments via Rényi measures. IEEE Trans. Inf. Theory 2018, 64, 4323–4346.
  4. Bracher, A.; Hof, E.; Lapidoth, A. Guessing attacks on distributed-storage systems. arXiv 2017, arXiv:1701.01981v1.
  5. Graczyk, R.; Lapidoth, A. Variations on the guessing problem. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018; pp. 231–235.
  6. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2006; ISBN 978-0-471-24195-9.
  7. Fehr, S.; Berens, S. On the conditional Rényi entropy. IEEE Trans. Inf. Theory 2014, 60, 6801–6810.
  8. Csiszár, I. Generalized cutoff rates and Rényi's information measures. IEEE Trans. Inf. Theory 1995, 41, 26–34.
  9. Bracher, A.; Lapidoth, A.; Pfister, C. Distributed task encoding. In Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 1993–1997.
  10. Lapidoth, A.; Pfister, C. Two measures of dependence. In Proceedings of the 2016 IEEE International Conference on the Science of Electrical Engineering (ICSEE), Eilat, Israel, 16–18 November 2016; pp. 1–5.
  11. Rosenthal, H.P. On the subspaces of Lp (p > 2) spanned by sequences of independent random variables. Isr. J. Math. 1970, 8, 273–303.
  12. Boztaş, S. Comments on "An inequality on guessing and its application to sequential decoding". IEEE Trans. Inf. Theory 1997, 43, 2062–2063.
  13. Hanawal, M.K.; Sundaresan, R. Guessing revisited: A large deviations approach. IEEE Trans. Inf. Theory 2011, 57, 70–78.
  14. Christiansen, M.M.; Duffy, K.R. Guesswork, large deviations, and Shannon entropy. IEEE Trans. Inf. Theory 2013, 59, 796–802.
  15. Sundaresan, R. Guessing based on length functions. In Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT), Nice, France, 24–29 June 2007; pp. 716–719.
  16. Sason, I. Tight bounds on the Rényi entropy via majorization with applications to guessing and compression. Entropy 2018, 20, 896.
  17. Sundaresan, R. Guessing under source uncertainty. IEEE Trans. Inf. Theory 2007, 53, 269–287.
  18. Arıkan, E.; Merhav, N. Guessing subject to distortion. IEEE Trans. Inf. Theory 1998, 44, 1041–1056.
  19. Bunte, C.; Lapidoth, A. On the listsize capacity with feedback. IEEE Trans. Inf. Theory 2014, 60, 6733–6748.
  20. Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons: Hoboken, NJ, USA, 1968; ISBN 0-471-29048-3.
  21. Arıkan, E.; Merhav, N. Joint source-channel coding and guessing with application to sequential decoding. IEEE Trans. Inf. Theory 1998, 44, 1756–1769.
  22. Bunte, C.; Lapidoth, A. Encoding tasks and Rényi entropy. IEEE Trans. Inf. Theory 2014, 60, 5065–5076.
  23. Rényi, A. On measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1960; Volume 1, pp. 547–561.
  24. Arimoto, S. Information measures and capacity of order α for discrete memoryless channels. In Topics in Information Theory; Csiszár, I., Elias, P., Eds.; North-Holland Publishing Company: Amsterdam, The Netherlands, 1977; pp. 41–52. ISBN 0-7204-0699-4.
  25. Sason, I.; Verdú, S. Arimoto–Rényi conditional entropy and Bayesian M-Ary hypothesis testing. IEEE Trans. Inf. Theory 2018, 64, 4–25.
Figure 1. Guessing with an encoder $f$.
Figure 2. Guessing with distributed encoders $f_n$ and $g_n$.
