1. Introduction
For single or multi-terminal source coding systems, the converse coding theorems state that, at any data compression rates below the fundamental theoretical limit of the system, the error probability of decoding cannot go to zero as the block length n of the codes tends to infinity. On the other hand, the strong converse theorems state that, at any rates below the fundamental theoretical limit, the error probability of decoding must go to one as n tends to infinity. The former converse theorems are sometimes called the weak converse theorems to distinguish them from the strong converse theorems.
In this paper, we study the strong converse theorem for the rate distortion problem with side information at the decoder posed and investigated by Wyner and Ziv [1]. We call this source coding system the Wyner–Ziv source coding system (the WZ system). The WZ system is shown in Figure 1 and corresponds to the case where the switch is closed. In Figure 1, the sequence $\{(X_t, Y_t)\}_{t=1}^{n}$ represents independent copies of a pair of dependent random variables $(X, Y)$ which take values in the finite sets $\mathcal{X}$ and $\mathcal{Y}$, respectively. We assume that $(X, Y)$ has a probability distribution denoted by $p_{XY}$. The encoder $\varphi^{(n)}$ outputs a binary sequence which appears at a rate of $R$ bits per input symbol. The decoder function $\psi^{(n)}$ observes the codeword $\varphi^{(n)}(X^n)$ and the side information $Y^n$ to output a sequence $\hat{X}^n$. The $t$-th component $\hat{X}_t$ of $\hat{X}^n$ for $t = 1, 2, \ldots, n$ takes values in the finite reproduction alphabet $\hat{\mathcal{X}}$. Let $d: \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ be an arbitrary distortion measure on $\mathcal{X} \times \hat{\mathcal{X}}$. The distortion between $x^n \in \mathcal{X}^n$ and $\hat{x}^n \in \hat{\mathcal{X}}^n$ is defined by $d(x^n, \hat{x}^n) = \sum_{t=1}^{n} d(x_t, \hat{x}_t)$.
In general, we have two criteria on the quality of the reproduction. One is the excess-distortion probability of decoding, defined as the probability that the distortion per source symbol $\frac{1}{n} d(X^n, \hat{X}^n)$ exceeds the prescribed level $D$. The other is the average distortion, defined as the expectation of $\frac{1}{n} d(X^n, \hat{X}^n)$.
A pair $(R, D)$ is $\varepsilon$-achievable for $p_{XY}$ if there exists a sequence of pairs $\{(\varphi^{(n)}, \psi^{(n)})\}_{n \geq 1}$ such that, for any $\delta > 0$ and any sufficiently large $n$, the rate satisfies $\frac{1}{n} \log \|\varphi^{(n)}\| \leq R + \delta$ and the excess-distortion probability does not exceed $\varepsilon$, where $\|\varphi^{(n)}\|$ stands for the cardinality of the range of $\varphi^{(n)}$. The rate distortion region $\mathcal{R}_{\mathrm{WZ}}(\varepsilon|p_{XY})$ is defined as the set of all $\varepsilon$-achievable pairs $(R, D)$.
On the other hand, we can define a rate distortion region based on the average distortion criterion; its formal definition is the following. A pair $(R, D)$ is achievable for $p_{XY}$ if there exists a sequence of pairs $\{(\varphi^{(n)}, \psi^{(n)})\}_{n \geq 1}$ such that, for any $\delta > 0$ and any sufficiently large $n$, the rate satisfies $\frac{1}{n} \log \|\varphi^{(n)}\| \leq R + \delta$ and the average distortion does not exceed $D + \delta$. The rate distortion region $\mathcal{R}_{\mathrm{WZ}}(p_{XY})$ is defined as the set of all achievable pairs $(R, D)$.
If the switch is open, then the side information is not available to the decoder. In this case, the communication system corresponds to source coding for the discrete memoryless source (DMS) $X$ specified by $p_X$. For this system, we define rate distortion regions in a manner analogous to the definitions given above for the WZ system.
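The two distortion criteria defined above can be estimated empirically. The following Monte Carlo sketch is illustrative only: the binary source, the crossover probability, the trivial copy-the-side-information decoder, and the Hamming distortion are all assumptions made here, not the coding scheme of the paper.

```python
import random

# Toy Monte Carlo estimate of the excess-distortion probability and the
# average distortion for a WZ-style setup. All modeling choices below
# (binary source, BSC side information, copy decoder) are illustrative.
random.seed(0)

n = 200          # block length
trials = 2000    # number of simulated blocks
D = 0.15         # prescribed distortion level
p_flip = 0.1     # P(Y_t != X_t): correlation between source and side info

excess_count = 0
total_distortion = 0.0
for _ in range(trials):
    x = [random.randint(0, 1) for _ in range(n)]
    # Side information: X observed through a binary symmetric channel.
    y = [xt ^ (random.random() < p_flip) for xt in x]
    # Trivial decoder: output the side information as the reproduction.
    x_hat = y
    d = sum(xt != ht for xt, ht in zip(x, x_hat))  # Hamming distortion
    total_distortion += d / n
    if d / n > D:
        excess_count += 1

excess_prob = excess_count / trials         # empirical excess-distortion prob.
avg_distortion = total_distortion / trials  # empirical average distortion
print(excess_prob, avg_distortion)
```

For this toy decoder the average distortion concentrates near p_flip, while the excess-distortion probability is the small tail probability that the per-symbol Hamming distortion exceeds D.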
Previous works on the characterizations of these rate distortion regions are shown in Table 1. Shannon [2] determined the rate distortion region of the DMS. Subsequently, Wolfowitz [3] proved that the rate distortion region under the excess-distortion criterion coincides with it. Furthermore, he proved the strong converse theorem. That is, if $(R, D)$ lies outside the rate distortion region, then for any sequence $\{(\varphi^{(n)}, \psi^{(n)})\}_{n \geq 1}$ of encoder and decoder functions satisfying the rate condition, the excess-distortion probability must tend to one. The above strong converse theorem implies that, for any $\varepsilon \in (0, 1)$, the $\varepsilon$-rate distortion region coincides with the rate distortion region. Csiszár and Körner proved that, in Equation (3), the probability converges to one exponentially, and determined the optimal exponent as a function of the rate distortion pair.
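Shannon's rate distortion function for a DMS, referenced above, can be computed numerically with the Blahut-Arimoto algorithm. The sketch below is not from the paper; the binary uniform source and Hamming distortion matrix are illustrative assumptions, chosen because the closed form R(D) = log 2 - h(D) (in nats) is then available as a sanity check.

```python
import math

# Numerical sketch: a point (R, D) on Shannon's rate distortion curve for a
# DMS, computed by the Blahut-Arimoto iteration with slope parameter s <= 0.

def blahut_arimoto(p_x, dist, s, iters=500):
    """Return (R, D) in nats on the R(D) curve for slope parameter s <= 0."""
    nx, nz = len(p_x), len(dist[0])
    q_z = [1.0 / nz] * nz                      # reproduction distribution
    for _ in range(iters):
        # Update the test channel q(z|x) proportional to q(z) * e^{s d(x,z)}.
        q_zx = []
        for x in range(nx):
            row = [q_z[z] * math.exp(s * dist[x][z]) for z in range(nz)]
            tot = sum(row)
            q_zx.append([v / tot for v in row])
        # Update the reproduction distribution as the induced marginal.
        q_z = [sum(p_x[x] * q_zx[x][z] for x in range(nx)) for z in range(nz)]
    R = sum(p_x[x] * q_zx[x][z] * math.log(q_zx[x][z] / q_z[z])
            for x in range(nx) for z in range(nz) if q_zx[x][z] > 0)
    D = sum(p_x[x] * q_zx[x][z] * dist[x][z]
            for x in range(nx) for z in range(nz))
    return R, D

p_x = [0.5, 0.5]                   # uniform binary source (assumption)
hamming = [[0.0, 1.0], [1.0, 0.0]] # Hamming distortion (assumption)
R, D = blahut_arimoto(p_x, hamming, s=-2.0)
print(R, D)
```

For the slope s = -2, the parametric point lands at D = e^{-2}/(1 + e^{-2}) and R = log 2 - h(D) in nats, matching the closed form for this source.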
The previous works on the coding theorems for the WZ system are summarized in Table 1. The rate distortion region under the average distortion criterion was determined by Wyner and Ziv [1]. Csiszár and Körner [4] proved that the two regions coincide. On the other hand, there has so far been no result on the strong converse theorem for the WZ system.
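As a concrete point of reference for the region determined by Wyner and Ziv [1], their classical doubly symmetric binary example admits a closed-form description. The sketch below is illustrative and not part of the paper (the crossover probability is an assumption): it evaluates the curve g(D) = h(p0 * D) - h(D), whose lower convex envelope together with the point (p0, 0) gives the Wyner-Ziv function, and compares it with the conditional rate distortion function h(p0) - h(D) that is attainable when the side information is available at the encoder as well. The convex-envelope step is omitted for brevity.

```python
import math

# Illustrative evaluation of the binary Wyner-Ziv example: X uniform binary,
# Y = X xor N with N ~ Bernoulli(p0), Hamming distortion.

def h(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def conv(a, b):
    """Binary convolution a * b = a(1-b) + b(1-a)."""
    return a * (1 - b) + b * (1 - a)

p0 = 0.25  # assumed crossover probability between X and Y
for D in [0.01, 0.05, 0.10, 0.15, 0.20]:
    g = h(conv(p0, D)) - h(D)   # Wyner-Ziv curve before the convex envelope
    r_cond = h(p0) - h(D)       # side information at both terminals
    print(f"D={D:.2f}  g(D)={g:.4f}  R_X|Y(D)={r_cond:.4f}")
```

The printed values exhibit the well-known rate loss of decoder-only side information: g(D) dominates the conditional rate distortion function at every sampled distortion level.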
The main results of this paper are summarized in Table 1. For the WZ system, we prove that if $(R, D)$ is outside the rate distortion region, then for any sequence $\{(\varphi^{(n)}, \psi^{(n)})\}_{n \geq 1}$ of encoder and decoder functions satisfying the condition in Equation (2), the correct probability of decoding goes to zero exponentially, and we derive an explicit lower bound on the exponent. This result corresponds to Theorem 3 in Table 1. As a corollary of this theorem, we obtain the strong converse result stated in Corollary 2 in Table 1. This result states that we have an outer bound with an asymptotically vanishing gap from the rate distortion region.
To derive our result, we use a new method called the recursive method. This method is a general and powerful tool for proving strong converse theorems for several coding problems in information theory. In fact, the recursive method plays an important role in deriving the exponential strong converse results for the communication systems treated in [5,6,7,8].
2. Source Coding with Side Information at the Decoder
In the following argument, the operations $\mathrm{E}_p$ and $\mathrm{Var}_p$, respectively, stand for the expectation and the variance with respect to a probability distribution $p$. When the value of $p$ is obvious from the context, we omit the suffix $p$ in those operations, writing simply $\mathrm{E}$ and $\mathrm{Var}$. Let $\mathcal{X}$ and $\mathcal{Y}$ be finite sets and let $\{(X_t, Y_t)\}_{t=1}^{\infty}$ be a stationary discrete memoryless source. For each $t = 1, 2, \ldots$, the random pair $(X_t, Y_t)$ takes values in $\mathcal{X} \times \mathcal{Y}$ and has a probability distribution $p_{XY}$. We write $n$ independent copies of $X$ and $Y$, respectively, as $X^n = X_1 X_2 \cdots X_n$ and $Y^n = Y_1 Y_2 \cdots Y_n$.
We consider the communication system depicted in Figure 2. The data sequence $X^n$ is encoded to $\varphi^{(n)}(X^n)$ and is sent to the information processing center. At the center, the decoder function $\psi^{(n)}$ observes $\varphi^{(n)}(X^n)$ and $Y^n$ to output the estimation $\hat{X}^n$ of $X^n$. The encoder function $\varphi^{(n)}$ is a map defined on $\mathcal{X}^n$. Let $\hat{\mathcal{X}}$ be a reproduction alphabet. The decoder function $\psi^{(n)}$ maps the pair consisting of the codeword and the side information $Y^n$ into $\hat{\mathcal{X}}^n$. Let $d: \mathcal{X} \times \hat{\mathcal{X}} \to [0, \infty)$ be an arbitrary distortion measure on $\mathcal{X} \times \hat{\mathcal{X}}$. The distortion between $x^n \in \mathcal{X}^n$ and $\hat{x}^n \in \hat{\mathcal{X}}^n$ is defined by $d(x^n, \hat{x}^n) = \sum_{t=1}^{n} d(x_t, \hat{x}_t)$.
The excess-distortion probability of decoding is the probability that the distortion per source symbol $\frac{1}{n} d(X^n, \hat{X}^n)$ exceeds the prescribed level $D$, where $\hat{X}^n = \psi^{(n)}(\varphi^{(n)}(X^n), Y^n)$. The average distortion between $X^n$ and its estimation $\hat{X}^n$ is defined by the expectation of $\frac{1}{n} d(X^n, \hat{X}^n)$.
In the previous section, we gave the formal definitions of the rate distortion regions. We can show that these rate distortion regions satisfy the following property.
Property 1. - (a)
The rate distortion regions defined above are closed convex sets. - (b)
The rate distortion region has another form using an auxiliary rate distortion region, the definition of which is as follows. We set the region below, which is called the auxiliary rate distortion region. Using it, the rate distortion region can be expressed as follows, where $\mathrm{cl}$ stands for the closure operation.
It is well known that the rate distortion region was determined by Wyner and Ziv [1]. To describe their result, we introduce auxiliary random variables $U$ and $Z$, respectively, taking values in the finite sets $\mathcal{U}$ and $\mathcal{Z}$. We assume that the joint distribution of $(U, X, Y, Z)$ is as follows. The above condition is equivalent to the corresponding Markov chain condition. Define the following set of probability distributions. By definition, the inclusion between these sets is obvious. Set the following:
We can show that the above functions and sets satisfy the following property:
Property 2. - (a)
The region is a closed convex set.
- (b)
Proof of Property 2 is given in Appendix C. In Property 2 Part (b), the right member is regarded as another expression of the rate distortion region. This expression is useful for deriving our main result. The rate distortion region was determined by Wyner and Ziv [1]. Their result is the following:
Csiszár and Körner [4] obtained the following result on the rate distortion region under the excess-distortion criterion.
Theorem 2 (Csiszár and Körner [4]).
We are interested in the asymptotic behavior of the error probability of decoding, which tends to one as $n \to \infty$ when $(R, D)$ lies outside the rate distortion region. To examine the rate of convergence, we define the following quantity. Set
By time sharing, we have the following. Choosing suitable values in Equation (7), we obtain the following subadditivity property, which, together with Fekete's lemma, yields that the limit exists and satisfies the following:
The exponent function
is a convex function of
. In fact, from Equation (
7), we have that for any
where
. The region
is also a closed convex set. Our main aim is to find an explicit characterization of
. In this paper, we derive an explicit outer bound of
whose section by the plane
coincides with
.
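The subadditivity-plus-Fekete argument used above can be illustrated numerically. The sequence below is a toy assumption (not the exponent function of the paper) chosen so that subadditivity can be verified directly and the normalized sequence visibly approaches its infimum.

```python
import math

# Illustration of Fekete's subadditive lemma: if a_{n+m} <= a_n + a_m, then
# lim a_n / n exists and equals inf_n a_n / n. The toy sequence
# a_n = c*n + log(n + 1) is subadditive because
# log(n + m + 1) <= log(n + 1) + log(m + 1) for n, m >= 1.

c = 0.5
def a(n):
    return c * n + math.log(n + 1)

# Check subadditivity on a range of index pairs.
subadditive = all(a(n + m) <= a(n) + a(m) + 1e-12
                  for n in range(1, 60) for m in range(1, 60))

# a_n / n decreases toward its infimum c as n grows.
ratios = [a(n) / n for n in (10, 100, 1000, 10000)]
print(subadditive, ratios)
```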
3. Main Results
In this section, we state our main results. We first explain that the rate distortion region can be expressed with two families of supporting hyperplanes. To describe this result, we define two sets of probability distributions as follows. Then, we have the following property:
Proof of Property 3 is given in Appendix D. For the pair of parameters introduced below, define the following. We next define a function serving as a lower bound of the exponent function. For each choice of parameters, define the following. We can show that the above functions satisfy the following properties:
Property 4. - (a)
The cardinality bound appearing in the first definition is sufficient to describe the corresponding quantity. Furthermore, the cardinality bound in the second definition is sufficient to describe its quantity.
- (b)
For any choice of parameters, we have the following. - (c)
Fix any pair of parameters. The limit below exists and is nonnegative. Next, define a probability distribution as follows. Then, the resulting function is twice differentiable. Furthermore, we have the two equalities below; the second equality implies that the function is concave.
- (d)
Define the quantity below and set the following. Then, we have the stated identity. Furthermore, for any choice of parameters, we have the following. - (e)
For every relevant pair, the stated condition implies the bound below, where $g$ is the inverse function of the function defined above.
Proof of Property 4 Part (a) is given in
Appendix B. Proof of Property 4 Part (b) is given in
Appendix E. Proofs of Property 4 Parts (c), (d), and (e) are given in
Appendix F.
Our main result is the following:
Theorem 3. For any choice of the parameters satisfying the stated condition, we have the bound below. It follows from Theorem 3 and Property 4 Part (d) that if $(R, D)$ is outside the rate distortion region, then the error probability of decoding goes to one exponentially, and the exponent is not below the derived lower bound.
It immediately follows from Theorem 3 that we have the following corollary.
Corollary 1. For any choice of the parameters, we have the first bound below. Furthermore, we have the second bound. Proof of Theorem 3 will be given in the next section. The exponent function in the lossless case can be obtained as a corollary of the result of Oohama and Han [9] for the separate source coding problem of correlated sources [10]. The technique they used is the method of types [4], which is not useful for proving Theorem 3. In fact, when we use this method, it is very hard to extract a condition related to the Markov chain condition which the auxiliary random variable must satisfy when $(R, D)$ is on the boundary of the rate distortion region. Some novel techniques based on the information spectrum method introduced by Han [11] are necessary to prove this theorem.
From Theorem 3 and Property 4 Part (e), we can obtain an explicit outer bound of the rate distortion region with an asymptotically vanishing deviation from it. The strong converse theorem immediately follows from this corollary. To describe this outer bound, we set the region below, which serves as an outer bound of the rate distortion region. For each fixed parameter, we define the following quantity.
Step (a) follows from the preceding definition. Since the deviation term vanishes as $n \to \infty$, there exists a smallest positive integer $n_0 = n_0(\varepsilon)$ such that the deviation is below the prescribed level for all $n \geq n_0$. From Theorem 3 and Property 4 Part (e), we have the following corollary.
Corollary 2. For each fixed $\varepsilon$, we choose the above positive integer $n_0(\varepsilon)$. Then, for any $n \geq n_0(\varepsilon)$, we have the inclusion below. The above result, together with the preceding relation, yields the stated conclusion for each fixed $\varepsilon$. Proof of this corollary will be given in the next section.
The direct part of the coding theorem, i.e., the first inclusion, was established by Csiszár and Körner [4]. They proved a weak converse theorem to obtain the reverse inclusion. Until now, there has been no result on the strong converse theorem. The above corollary, which states the strong converse theorem for the Wyner–Ziv source coding problem, implies that a long-standing open problem since Csiszár and Körner [4] has been resolved.
4. Proof of the Main Results
In this section, we prove Theorem 3 and Corollary 2. We first present a lemma which upper bounds the correct probability of decoding by information spectrum quantities. We set
Then, we have the following:
Lemma 1. For any choice of the distributions below satisfying the stated condition, we have the bound below. The probability distribution and the stochastic matrices appearing in the right members of Equation (18) have the property that we can select them arbitrarily. In Equation (14), we can choose any probability distribution. In Equation (15), we can choose any stochastic matrix. In Equation (16), we can choose any stochastic matrix. In Equation (17), we can choose any stochastic matrix.
Lemma 2. Suppose that, for each $t$, the joint distribution of the random vector below is a marginal distribution of the full joint distribution. Then, we have the following Markov chain:
or equivalently, the corresponding conditional mutual information is zero. Proof of this lemma is given in
Appendix H. For
, set
. Let
be a random vector taking values in
×
×
. From Lemmas 1 and 2, we have the following:
Lemma 3. For any choice of the parameters satisfying the stated condition, we have the following:
where, for each $t$, the following probability distribution and stochastic matrices:
appearing in the first term in the right members of Equation (21) have the property that we can choose their values arbitrarily. Proof. On the probability distributions appearing in the right members of Equation (18), we take the following choices. In Equation (14), we choose
so that
In Equation (15), we choose
so that
In Equation (16), we choose
so that
In Equation (16), we note that
Step (a) follows from Lemma 2. In Equation (17), we choose
so that
From Lemma 1 and Equations (21)–(25), we have the bound of Equation (21) in Lemma 3. ☐
To evaluate an upper bound of Equation (21) in Lemma 3, we use the following lemma, which is well known as Cramér's bound in large deviation theory.
Lemma 4. For any real-valued random variable A and any constant, we have the following. Here, we define a quantity which serves as an exponential upper bound of
. For each
, let
be a set of all
Let
be a set of all probability distributions
on
having the form:
For simplicity of notation, we use the notation
for
. We assume that
is a marginal distribution of
. For
, we simply write
. For
and
, we define
where, for each
, the following probability distribution and stochastic matrices:
appearing in the definition of
are chosen so that they are induced by the joint distribution
.
By Lemmas 3 and 4, we have the following proposition:
Proposition 1. For any choice of the parameters satisfying the stated condition, we have the following. Proof. When
, the bound we wish to prove is obvious. In the following argument, we assume that
We define five random variables
by
By Lemma 3, for any
satisfying
we have
where we set
Applying Lemma 4 to the first term in the right member of Equation (26), we have
Solving Equation (28) with respect to
, we have
For this choice of
and Equation (27), we have
completing the proof. ☐
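The large deviation step in the proof above rests on the Cramér (Chernoff) bound of Lemma 4. The following deterministic check is illustrative and not part of the paper; a binomial random variable is an assumption made so that both the exact tail and the bound can be computed in closed form.

```python
import math

# Check of the Chernoff form of Cramer's bound:
# Pr{A >= a} <= e^{-theta*a} E[e^{theta*A}] for any theta > 0,
# with A ~ Binomial(n, p) so both sides are exactly computable.

n, p, a = 100, 0.3, 50

# Exact upper tail Pr{A >= a}.
tail = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, n + 1))

# Chernoff bound with the optimizing theta: e^theta = a(1-p) / ((n-a)p).
theta = math.log(a * (1 - p) / ((n - a) * p))
mgf = (1 - p + p * math.exp(theta)) ** n      # E[e^{theta*A}]
bound = math.exp(-theta * a) * mgf

print(tail, bound)
```

The exact tail probability sits strictly below the exponential bound, as the lemma guarantees for every positive value of the tilting parameter.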
By Proposition 1, we have the following corollary.
Corollary 3. For any choice of the parameters satisfying the stated condition, we have the following. We shall call this quantity the communication potential. The above corollary implies that the analysis of the communication potential leads to the establishment of a strong converse theorem for the Wyner–Ziv source coding problem. In the following argument, we derive an explicit lower bound of this quantity. We use a new technique we call the recursive method. The recursive method is a powerful tool to derive a single-letterized exponent function for rates below the rate distortion function. This method is also applicable to proving the exponential strong converse theorems for other problems in network information theory [5,6,7]. Set
For each
, define a function of
by
For each
, we define the conditional probability distribution
by
where
are constants for normalization. For
, define
where we define
for
Then, we have the following lemma:
Lemma 5. For each index and any choice of the parameters, we have the three relations below. The equality in Equation (34) in Lemma 5 is obvious from Equations (29)–(31). Proofs of Equations (32) and (33) in this lemma are given in
Appendix I. Next, we define a probability distribution of the random pair
taking values in
by
where
is a constant for normalization given by
For
, define
where we define
. Set
Then, we have the following:
Proof. By the equality Equation (34) in Lemma 5, we have
Step (a) follows from the definition in Equation (36) of
We next prove Equation (39) in Lemma 6. Multiplying both sides of Equation (35) by the appropriate factor, we have
Taking summations of Equations (41) and (42) with respect to
, we have
Step (a) follows from Equation (33) in Lemma 5. Step (b) follows from the definition in Equation (37) of . ☐
The following proposition is a mathematical core to prove our main result.
Proposition 2. For the given parameters, we choose the parameter α such that the condition below holds. Then, for any choice of the remaining parameters, we have the following. Proof. By Lemma 6, we have
For each
, we recursively choose
so that
and choose
,
,
, and
appearing in
such that they are the distributions induced by
. Then, for each
⋯,
n, we have the following chain of inequalities:
Step (a) follows from Hölder’s inequality and the following identity:
Step (b) follows from Equation (43). Step (c) follows from the definition of
Step (d) follows from the fact that, by Property 4 Part (a), the cardinality bound is sufficient to describe the quantity in question. Hence, we have the following:
Step (a) follows from Equation (38) in Lemma 6. Step (b) follows from Equation (45). Since Equation (46) holds for any
and any
, we have
Thus, we have Equation (44) in Proposition 2. ☐
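Step (a) of the chain of inequalities in the proof above invokes Hölder's inequality. A minimal numerical check follows; the vectors and the exponent are illustrative assumptions, not quantities from the paper.

```python
import math

# Check of Holder's inequality:
# sum |a_i b_i| <= (sum |a_i|^p)^(1/p) * (sum |b_i|^q)^(1/q), 1/p + 1/q = 1.

a = [0.3, 1.2, 2.5, 0.7]
b = [1.1, 0.4, 0.9, 2.0]
p = 3.0
q = p / (p - 1)   # conjugate exponent, so 1/p + 1/q = 1

lhs = sum(abs(x * y) for x, y in zip(a, b))
rhs = ((sum(abs(x)**p for x in a))**(1 / p)
       * (sum(abs(y)**q for y in b))**(1 / q))
print(lhs, rhs)
```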
Proof of Theorem 3: Then, we have the following:
Step (a) follows from Corollary 3. Step (b) follows from Proposition 2 and Equation (47). Since the above bound holds for any positive
,
and
, we have
Thus, Equation (10) in Theorem 3 is proved. ☐
Proof of Corollary 2: Since
g is the inverse function of
, the definition in Equation (
13) of
is equivalent to
By the definition of
, we have that
for
. We assume that for
,
Then, there exists a sequence
such that for
, we have
Then, by Theorem 3, we have
for any
. We claim that for
, we have
∈
. To prove this claim, we suppose that
does not belong to
for some
. Then, we have the following chain of inequalities:
Step (a) follows from
and Property 4 Part (e). Step (b) follows from Equation (48). The bound of Equation (50) contradicts Equation (49). Hence, we have
∈
or equivalently,
for
, which implies that for
,
completing the proof. ☐