Wiretap Channel with Action-Dependent Channel State Information

1 School of Information Science and Technology, Southwest JiaoTong University, Northbound Section Second Ring Road 111, Chengdu, China
2 Computer Science and Engineering Department, Shanghai Jiao Tong University, Dongchuan Road 800, Shanghai, China
3 Institute for Experimental Mathematics, Duisburg-Essen University, Ellernstr. 29, Essen, Germany
* Author to whom correspondence should be addressed.
Entropy 2013, 15(2), 445-473; https://doi.org/10.3390/e15020445
Submission received: 23 November 2012 / Revised: 9 January 2013 / Accepted: 17 January 2013 / Published: 28 January 2013

Abstract

In this paper, we investigate the model of the wiretap channel with action-dependent channel state information. Given the message to be communicated, the transmitter chooses an action sequence that affects the formation of the channel states, and then generates the channel input sequence based on the state sequence and the message. The main channel and the wiretap channel are two discrete memoryless channels (DMCs), connected with the legitimate receiver and the wiretapper, respectively, and the transition probability distribution of the main channel depends on the channel state. Measuring the wiretapper's uncertainty about the message by equivocation, inner and outer bounds on the capacity-equivocation region are provided, both for the case where the channel inputs are allowed to depend non-causally on the state sequence and the case where they are restricted to causal dependence. Furthermore, the secrecy capacities for both cases are bounded, providing the best transmission rate with perfect secrecy. The result is further explained via a binary example.

1. Introduction

Communication through state-dependent channels, with states known at the transmitter, was first investigated by Shannon [1] in 1958. In [1], the capacity of the discrete memoryless channel with causal (past and current) channel state information at the encoder was fully determined. After that, in order to solve the problem of coding for a computer memory with defective cells, Kuznetsov and Tsybakov [2] considered a channel in the presence of non-causal channel state information at the transmitter. They provided some coding techniques without determining the capacity. The capacity was found in 1980 by Gel'fand and Pinsker [3]. Furthermore, Costa [4] investigated a power-constrained additive noise channel, where part of the noise is known at the transmitter as side information. This channel is also known as the dirty paper channel. The assumption in these seminal papers, as well as in the work on communication with state-dependent channels that followed, is that the channel states are generated by nature and cannot be affected or controlled by the communication system.
In 2009, Weissman [5] revisited the above problem setting for the case where the transmitter can take actions that affect the formation of the states, see Figure 1. Specifically, Weissman considered a communication system where encoding is in two parts: given the message, an action sequence is created; the actions affect the formation of the channel states, which are accessible to the transmitter when producing the channel input sequence. The capacity of this model is fully determined, both for the case where the channel inputs are allowed to depend non-causally on the state sequence and the case where they are restricted to causal dependence. This framework captures various new channel coding scenarios that may arise naturally in recording for magnetic storage devices or coding for computer memories with defects.
Figure 1. Channel with action-dependent states.
Transmission of confidential messages has been studied in the literature for several classes of channels. Wyner, in his well-known paper on the wiretap channel [6], studied the problem of how to transmit confidential messages to the legitimate receiver via a degraded broadcast channel, while keeping the wiretapper as ignorant of the messages as possible, see Figure 2. Measuring the uncertainty of the wiretapper by equivocation, the capacity-equivocation region was established. Furthermore, the secrecy capacity was also established, which provides the maximum transmission rate with perfect secrecy. After the publication of Wyner's work, Csiszár and Körner [7] investigated a more general situation: the broadcast channel with confidential messages (BCC). In this model, a common message and a confidential message were sent through a general broadcast channel. The common message was assumed to be decoded correctly by both the legitimate receiver and the wiretapper, while the confidential message was only allowed to be obtained by the legitimate receiver. This model is also a generalization of [8], where no confidentiality condition is imposed. The capacity-equivocation region and the secrecy capacity region of the BCC [7] were fully determined, and the results also generalize those in [6]. Based on Wyner's work, Leung-Yan-Cheong and Hellman studied the Gaussian wiretap channel (GWC) [9], and showed that its secrecy capacity is the difference between the main channel capacity and the overall wiretap channel capacity (the cascade of the main channel and the wiretap channel).
Figure 2. Wiretap channel.
Inspired by the above works, Mitrpant et al. [10] studied the transmission of confidential messages in channels with channel state information (CSI). In [10], an inner bound on the capacity-equivocation region was provided for the Gaussian wiretap channel with CSI. Furthermore, Chen et al. [11] investigated the discrete memoryless wiretap channel with noncausal CSI (see Figure 3), and also provided an inner bound on the capacity-equivocation region. Note that the coding scheme of [11] is a combination of those in [3,6]. Based on the work of [11], Dai [12] provided an outer bound for the wiretap channel with noncausal CSI, and determined the capacity-equivocation region for the model of the wiretap channel with memoryless CSI, where memoryless means that at time $i$, the output of the channel encoder depends only on the $i$-th channel state.
Figure 3. Wiretap channel with noncausal channel state information.
In this paper, we study the wiretap channel with action-dependent channel state information, see Figure 4. Concretely, the transmitted message $W$ is first encoded as an action sequence $A^N$, and $A^N$ is the input of a discrete memoryless channel (DMC) whose output is the channel state sequence $S^N$. Then, the transmitted message $W$ and the state sequence $S^N$ are encoded as $X^N$. The main channel is a DMC with inputs $X^N$ and $S^N$ and output $Y^N$. The wiretap channel is also a DMC, with input $Y^N$ and output $Z^N$. Since the action-dependent state captures various new coding scenarios for channels with a rewrite option that may arise naturally in storage for computer memories with defects or in magnetic recording, it is natural to ask: what about the security of these channel models in the presence of a wiretapper? Measuring the wiretapper's uncertainty about the transmitted message by equivocation, inner and outer bounds on the capacity-equivocation region of the model of Figure 4 are provided, both for the case where the channel input is allowed to depend non-causally on the state sequence and the case where it is restricted to causal dependence.
Figure 4. Wiretap channel with action-dependent channel state information.
In this paper, random variables, sample values and alphabets are denoted by capital letters, lower case letters and calligraphic letters, respectively. A similar convention is applied to random vectors and their sample values. For example, $U^N$ denotes a random $N$-vector $(U_1,\ldots,U_N)$, and $u^N = (u_1,\ldots,u_N)$ is a specific vector value in $\mathcal{U}^N$, the $N$-th Cartesian power of $\mathcal{U}$. $U_i^N$ denotes a random $(N-i+1)$-vector $(U_i,\ldots,U_N)$, and $u_i^N = (u_i,\ldots,u_N)$ is a specific vector value in $\mathcal{U}_i^N$. Let $p_V(v)$ denote the probability mass function $\Pr\{V=v\}$. Throughout the paper, the logarithmic function is to the base 2.
The remainder of this paper is organized as follows. In Section 2, we present the basic definitions and the main result on the capacity-equivocation region of wiretap channel with action-dependent channel state information. In Section 3, we provide a binary example of the model of Figure 4. Final conclusions are presented in Section 4.

2. Notations, Definitions and the Main Results

In this section, the model of Figure 4 is considered in two parts: the model of Figure 4 with noncausal channel state information is described in Subsection 2.1, and the causal case is described in Subsection 2.2.

2.1. The Model of Figure 4 with Noncausal Channel State Information

In this subsection, a description of the wiretap channel with noncausal action-dependent channel state information is given by Definitions 1 to 6. Inner and outer bounds on the capacity-equivocation region $\mathcal{R}^{(n)}$, which is composed of all achievable $(R, R_e)$ pairs, are given in Theorem 1 and Theorem 2, respectively, where an achievable $(R, R_e)$ pair is defined in Definition 6.
Definition 1
(Action encoder) The message $W$ takes values in $\mathcal{W}$, and it is uniformly distributed over its range. The action encoder is a deterministic mapping:
$$f_1^N: \mathcal{W} \to \mathcal{A}^N$$
The input of the action encoder is $W$, while the output is $A^N$.
The channel state sequence $S^N$ is generated by a DMC with input $A^N$ and output $S^N$. The transition probability distribution is given by
$$p_{S^N|A^N}(s^N|a^N) = \prod_{i=1}^{N} p_{S_i|A_i}(s_i|a_i)$$
Note that the components of the state sequence $S^N$ may not be i.i.d. random variables, since $A^N$ is not generated i.i.d.
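To make Definition 1's state-generation step concrete, the following Python sketch samples a state sequence from an action sequence through a memoryless channel $p_{S|A}$; the channel matrix, the crossover probability and the action sequence are illustrative assumptions, not values prescribed by the model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_states(a_seq, p_s_given_a):
    """Draw s_i ~ p_{S|A}(. | a_i) independently for each i (memoryless channel)."""
    return np.array([rng.choice(p_s_given_a.shape[1], p=p_s_given_a[a])
                     for a in a_seq])

# Hypothetical binary action-to-state channel: a BSC with crossover r = 0.2,
# the same r used in the Section 3 example.
r = 0.2
p_s_given_a = np.array([[1 - r, r],
                        [r, 1 - r]])   # row = a, column = s

a_seq = rng.integers(0, 2, size=10)    # an arbitrary action sequence
print(a_seq, sample_states(a_seq, p_s_given_a))
```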
Definition 2
(Channel encoder) The inputs of the channel encoder are $W$ and $S^N$, while the output is $X^N$. The channel encoder $f_2^N$ is a matrix of conditional probabilities $f_2^N(x^N|w,s^N)$, where $x^N \in \mathcal{X}^N$, $w \in \mathcal{W}$, $s^N \in \mathcal{S}^N$ and $\sum_{x^N} f_2^N(x^N|w,s^N) = 1$; here $f_2^N(x^N|w,s^N)$ is the probability that the message $w$ and the channel state sequence $s^N$ are encoded as the channel input $x^N$.
Since the channel encoder knows the state sequence $s^N$ in a noncausal manner, at time $i$ ($1 \le i \le N$) the channel encoder $f_{2,i}^N$ is a matrix of conditional probabilities $f_{2,i}^N(x_i|w,s^N)$, where $x_i \in \mathcal{X}$ and $\sum_{x_i} f_{2,i}^N(x_i|w,s^N) = 1$; here $f_{2,i}^N(x_i|w,s^N)$ is the probability that the message $w$ and the channel state sequence $s^N$ are encoded as the $i$-th channel input $x_i$.
The transmission rate of the message is $\frac{\log\|\mathcal{W}\|}{N}$.
Definition 3
(Main channel) The main channel is a DMC with finite input alphabet $\mathcal{X} \times \mathcal{S}$, finite output alphabet $\mathcal{Y}$ and transition probability $Q_M(y|x,s)$, where $x \in \mathcal{X}$, $s \in \mathcal{S}$, $y \in \mathcal{Y}$, and $Q_M(y^N|x^N,s^N) = \prod_{n=1}^{N} Q_M(y_n|x_n,s_n)$. The inputs of the main channel are $X^N$ and $S^N$, while the output is $Y^N$.
Definition 4
(Wiretap channel) The wiretap channel is also a DMC, with finite input alphabet $\mathcal{Y}$, finite output alphabet $\mathcal{Z}$ and transition probability $Q_W(z|y)$, where $y \in \mathcal{Y}$, $z \in \mathcal{Z}$. The input and output of the wiretap channel are $Y^N$ and $Z^N$, respectively. The equivocation to the wiretapper is defined as
$$\Delta = \frac{H(W|Z^N)}{N}$$
The cascade of the main channel and the wiretap channel is another DMC with transition probability
$$Q_{MW}(z|x,s) = \sum_{y \in \mathcal{Y}} Q_W(z|y)\, Q_M(y|x,s)$$
Note that $(X^N, S^N) \to Y^N \to Z^N$ and $W \to A^N \to S^N$ are two Markov chains in the model of Figure 4.
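As a quick sanity check on the cascade formula above, $Q_{MW}$ is just a composition of the two transition matrices. A minimal Python sketch, with an assumed binary main channel (the one used later in Section 3) and a BSC wiretap channel:

```python
import numpy as np

def cascade(q_m, q_w):
    """Q_MW(z|x,s) = sum_y Q_W(z|y) Q_M(y|x,s).

    q_m: array of shape (|X|, |S|, |Y|); q_w: array of shape (|Y|, |Z|).
    Returns an array of shape (|X|, |S|, |Z|).
    """
    return q_m @ q_w  # contracts the trailing y-axis of q_m with q_w's rows

# Assumed example: flip probability p, inverted when s = 1, then a BSC(q).
p, q = 0.1, 0.2
q_m = np.array([[[1 - p, p], [p, 1 - p]],   # x = 0: rows s = 0, s = 1
                [[p, 1 - p], [1 - p, p]]])  # x = 1: rows s = 0, s = 1
q_w = np.array([[1 - q, q], [q, 1 - q]])
q_mw = cascade(q_m, q_w)
assert np.allclose(q_mw.sum(axis=-1), 1.0)  # each row remains a distribution
print(q_mw)
```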
Definition 5
(Decoder) The decoder for the legitimate receiver is a mapping $f_{D1}: \mathcal{Y}^N \to \mathcal{W}$, with input $Y^N$ and output $\hat{W}$. Let $P_e$ be the error probability of the receiver, defined as $\Pr\{W \neq \hat{W}\}$.
Definition 6
(Achievable $(R, R_e)$ pair in the model of Figure 4) A pair $(R, R_e)$ (where $R, R_e > 0$) is called achievable if, for any $\epsilon > 0$ (where $\epsilon$ is an arbitrarily small positive real number, $\epsilon \to 0$), there exists a channel encoder-decoder $(N, \Delta, P_e)$ such that
$$\lim_{N\to\infty}\frac{\log\|\mathcal{W}\|}{N} = R,\quad \lim_{N\to\infty}\Delta \ge R_e,\quad P_e \le \epsilon \qquad (5)$$
The capacity-equivocation region $\mathcal{R}^{(n)}$ is the set composed of all achievable $(R, R_e)$ pairs. Inner and outer bounds on $\mathcal{R}^{(n)}$ are provided in the following Theorem 1 and Theorem 2, which are proved in Section A and Section B, respectively.
Theorem 1
(Inner bound) A single-letter characterization of the region $\mathcal{R}^{(ni)}$ is as follows:
$$\mathcal{R}^{(ni)} = \big\{(R, R_e):\ 0 \le R_e \le R,\ R \le I(U;Y) - I(U;S|A),\ R_e \le I(U;Y) - I(U;Z),\ R_e \le H(A|Z)\big\}$$
where $p_{UASXYZ}(u,a,s,x,y,z) = p_{Z|Y}(z|y)\, p_{Y|X,S}(y|x,s)\, p_{UAXS}(u,a,x,s)$, which implies that $(A,U) \to (X,S) \to Y \to Z$.
The region $\mathcal{R}^{(ni)}$ satisfies $\mathcal{R}^{(ni)} \subseteq \mathcal{R}^{(n)}$.
Remark 1
Some notes on Theorem 1 follow.
  • The formula $R_e \le H(A|Z)$ in Theorem 1 implies that the wiretapper obtains information about the message not only from the codeword transmitted over the channels, but also from the action sequence $a^N$: if the wiretapper knows $a^N$, he knows the corresponding message.
  • The region $\mathcal{R}^{(ni)}$ is convex; the proof follows directly by introducing a time-sharing random variable into Theorem 1, and therefore it is omitted here.
  • The range of the random variable $U$ satisfies
    $$\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\| + 2$$
    The proof is in Section C.
  • Without the equivocation parameter, the capacity of the main channel is given by
    $$C_M = \max_{p_{X|U,S}(x|u,s)\, p_{U|A,S}(u|a,s)\, p_A(a)} \big(I(U;Y) - I(U;S|A)\big) \qquad (6)$$
    The formula (6) was proved by Weissman [5], and the proof is omitted here.
  • Secrecy capacity
    The points in $\mathcal{R}^{(ni)}$ for which $R_e = R$ are of considerable interest, since they imply perfect secrecy, $H(W) = H(W|Z^N)$. Clearly, we can lower bound the secrecy capacity $C_s^{(n)}$ of the model of Figure 4 with noncausal channel state information by
    $$C_s^{(n)} \ge \max_{p_{UAXS}(u,a,x,s)} \min\big\{I(U;Y) - I(U;Z),\ I(U;Y) - I(U;S|A),\ H(A|Z)\big\} \qquad (7)$$
    Proof 1 (Proof of (7)). Substituting $R_e = R$ into the region $\mathcal{R}^{(ni)}$ in Theorem 1, we have
    $$R \le I(U;Y) - I(U;Z)$$
    $$R \le I(U;Y) - I(U;S|A)$$
    $$R \le H(A|Z)$$
    Note that the pair $(R = \max\min\{I(U;Y) - I(U;Z),\ I(U;Y) - I(U;S|A),\ H(A|Z)\},\ R_e = R)$ is achievable, and therefore the secrecy capacity satisfies $C_s^{(n)} \ge \max\min\{I(U;Y) - I(U;Z),\ I(U;Y) - I(U;S|A),\ H(A|Z)\}$. This completes the proof.
Theorem 2
(Outer bound) A single-letter characterization of the region $\mathcal{R}^{(no)}$ is as follows:
$$\mathcal{R}^{(no)} = \big\{(R, R_e):\ 0 \le R_e \le R,\ R \le I(U;Y) - I(U;S|A),\ R_e \le I(U;Y) - I(K;Z|V)\big\}$$
where $p_{UKVASXYZ}(u,k,v,a,s,x,y,z) = p_{Z|Y}(z|y)\, p_{Y|X,S}(y|x,s)\, p_{X|U,S}(x|u,s)\, p_{V|K}(v|k)\, p_{K|U}(k|u)\, p_{UAS}(u,a,s)$, which implies that $(A,U,K,V) \to (X,S) \to Y \to Z$ and $V \to K \to U \to Y \to Z$ are two Markov chains.
The region $\mathcal{R}^{(no)}$ satisfies $\mathcal{R}^{(n)} \subseteq \mathcal{R}^{(no)}$.
Remark 2
Some notes on Theorem 2 follow.
  • The region $\mathcal{R}^{(no)}$ is convex; the proof is similar to that for Theorem 1 and is therefore omitted here.
  • The ranges of the random variables $U$, $V$ and $K$ satisfy
    $$\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\| + 1$$
    $$\|\mathcal{V}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\|$$
    $$\|\mathcal{K}\| \le \|\mathcal{X}\|^2\|\mathcal{A}\|^2\|\mathcal{S}\|^2$$
    The proof is in Section D.
  • Observing the formula $R_e \le I(U;Y) - I(K;Z|V)$ in Theorem 2, we have
    $$I(U;Y) - I(K;Z|V) \stackrel{(a)}{=} I(U;Y) - H(Z|V) + H(Z|K) \ge I(U;Y) - H(Z) + H(Z|K) \ge I(U;Y) - H(Z) + H(Z|K,U) \stackrel{(b)}{=} I(U;Y) - H(Z) + H(Z|U) = I(U;Y) - I(U;Z)$$
    where (a) is from the fact that $V \to K \to Z$ is a Markov chain, and (b) is from the Markov chain $K \to U \to Y \to Z$. Then it is easy to see that $\mathcal{R}^{(ni)} \subseteq \mathcal{R}^{(no)}$.
  • The secrecy capacity $C_s^{(n)}$ of the model of Figure 4 with noncausal channel state information is upper bounded by
    $$C_s^{(n)} \le \max_{p_{XUKVAS}(x,u,k,v,a,s)} \min\big\{I(U;Y) - I(K;Z|V),\ I(U;Y) - I(U;S|A)\big\}$$
    The upper bound is easily obtained by substituting $R_e = R$ into the region $\mathcal{R}^{(no)}$ in Theorem 2, and therefore the proof is omitted here.

2.2. The Model of Figure 4 with Causal Channel State Information

The model of Figure 4 with causal channel state information is similar to the model with noncausal channel state information in Subsection 2.1, except that the state sequence $S^N$ in Definition 1 is known to the channel encoder in a causal manner, i.e., at time $i$ ($1 \le i \le N$) the output of the encoder is generated according to $f_{2,i}(x_i|w, s^i)$, where $s^i = (s_1, s_2, \ldots, s_i)$ and $f_{2,i}(x_i|w, s^i)$ is the probability that the message $w$ and the state sequence $s^i$ are encoded as the channel input $x_i$ at time $i$. Define
$$f^N(x^N|w,s^N) = \prod_{i=1}^{N} f_{2,i}(x_i|w,s^i)$$
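For intuition, a causal encoder can be simulated as a loop in which the $i$-th input is drawn from a distribution that depends only on the message and the states observed so far, exactly the factorization displayed above. A minimal Python sketch; the concrete encoder $f$ below is a hypothetical deterministic choice, not one prescribed by the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def causal_encode(w, s_seq, f):
    """Draw x_i ~ f(. | w, s^i): the i-th input depends only on w and s_1..s_i."""
    x_seq = []
    for i in range(len(s_seq)):
        dist = f(w, s_seq[: i + 1])       # distribution over X given (w, s^i)
        x_seq.append(rng.choice(len(dist), p=dist))
    return np.array(x_seq)

# Hypothetical deterministic encoder: x_i = u_i XOR s_i, where u_i is the i-th
# bit of the message w (in the spirit of f^(7) from Section 3).
def f(w, s_prefix):
    i = len(s_prefix) - 1
    u_i = (w >> i) & 1
    dist = np.zeros(2)
    dist[u_i ^ int(s_prefix[-1])] = 1.0
    return dist

print(causal_encode(w=0b1011, s_seq=np.array([0, 1, 1, 0]), f=f))
```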
Inner and outer bounds on the capacity-equivocation region $\mathcal{R}^{(c)}$ for the model of Figure 4 with causal channel state information are provided in the following Theorem 3 and Theorem 4, which are proved in Section E and Section F, respectively.
Theorem 3
(Inner bound) A single-letter characterization of the region $\mathcal{R}^{(ci)}$ is as follows:
$$\mathcal{R}^{(ci)} = \big\{(R, R_e):\ 0 \le R_e \le R,\ R \le I(U;Y),\ R_e \le I(U;Y) - I(U;Z),\ R_e \le H(A|Z)\big\}$$
where $p_{UASXYZ}(u,a,s,x,y,z) = p_{Z|Y}(z|y)\, p_{Y|X,S}(y|x,s)\, p_{X|U,S}(x|u,s)\, p_{S|A}(s|a)\, p_{UA}(u,a)$, which implies that $(A,U) \to (X,S) \to Y \to Z$ and $U \to A \to S$.
The region $\mathcal{R}^{(ci)}$ satisfies $\mathcal{R}^{(ci)} \subseteq \mathcal{R}^{(c)}$.
Remark 3
Some notes on Theorem 3 follow.
  • The region $\mathcal{R}^{(ci)}$ is convex.
  • The range of the random variable $U$ satisfies
    $$\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\| + 1$$
    The proof is similar to that for Theorem 1, and it is omitted here.
  • Without the equivocation parameter, the capacity of the main channel is given by
    $$C_M^* = \max_{p_{X|U,S}(x|u,s)\, p_{UA}(u,a)} I(U;Y) \qquad (14)$$
    The formula (14) was proved by Weissman [5], and the proof is omitted here.
  • Secrecy capacity
    The points in $\mathcal{R}^{(ci)}$ for which $R_e = R$ are of considerable interest, since they imply perfect secrecy, $H(W) = H(W|Z^N)$. Clearly, we can lower bound the secrecy capacity $C_s^{(c)}$ of the model of Figure 4 with causal channel state information by
    $$C_s^{(c)} \ge \max_{p_{X|U,S}(x|u,s)\, p_{UA}(u,a)} \min\big\{I(U;Y) - I(U;Z),\ H(A|Z)\big\} \qquad (15)$$
    Proof 2 (Proof of (15)). Substituting $R_e = R$ into the region $\mathcal{R}^{(ci)}$ in Theorem 3, we have
    $$R \le I(U;Y) - I(U;Z)$$
    $$R \le I(U;Y)$$
    $$R \le H(A|Z)$$
    Note that the pair $(R = \max\min\{I(U;Y) - I(U;Z),\ H(A|Z)\},\ R_e = R)$ is achievable, and therefore the secrecy capacity satisfies $C_s^{(c)} \ge \max\min\{I(U;Y) - I(U;Z),\ H(A|Z)\}$. This completes the proof.
Theorem 4
(Outer bound) A single-letter characterization of the region $\mathcal{R}^{(co)}$ is as follows:
$$\mathcal{R}^{(co)} = \big\{(R, R_e):\ 0 \le R_e \le R,\ R \le I(U;Y),\ R_e \le I(U;Y) - I(K;Z|V)\big\}$$
where $p_{UKVASXYZ}(u,k,v,a,s,x,y,z) = p_{Z|Y}(z|y)\, p_{Y|X,S}(y|x,s)\, p_{X|U,S}(x|u,s)\, p_{V|K}(v|k)\, p_{K|U}(k|u)\, p_{UAS}(u,a,s)$, which implies that $(A,U,K,V) \to (X,S) \to Y \to Z$ and $V \to K \to U \to Y \to Z$ are two Markov chains.
The region $\mathcal{R}^{(co)}$ satisfies $\mathcal{R}^{(c)} \subseteq \mathcal{R}^{(co)}$.
Remark 4. Some notes on Theorem 4 follow.
  • The region $\mathcal{R}^{(co)}$ is convex.
  • The ranges of the random variables $U$, $V$ and $K$ satisfy
    $$\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\|$$
    $$\|\mathcal{V}\| \le \|\mathcal{X}\|\|\mathcal{A}\|\|\mathcal{S}\|$$
    $$\|\mathcal{K}\| \le \|\mathcal{X}\|^2\|\mathcal{A}\|^2\|\mathcal{S}\|^2$$
    The proof is similar to that in Section D, and it is omitted here.
  • The secrecy capacity $C_s^{(c)}$ of the model of Figure 4 with causal channel state information is upper bounded by
    $$C_s^{(c)} \le \max_{p_{X|U,S}(x|u,s)\, p_{UKVAS}(u,k,v,a,s)} \big(I(U;Y) - I(K;Z|V)\big)$$
    The upper bound is easily obtained by substituting $R_e = R$ into the region $\mathcal{R}^{(co)}$ in Theorem 4, and therefore the proof is omitted here.

3. A Binary Example for the Model of Figure 4 with Causal Channel State Information

In this section, we calculate bounds on the secrecy capacity of a special case of the model of Figure 4 with causal channel state information.
Suppose that the channel state information $S^N$ is available at the channel encoder in a causal manner. Meanwhile, the random variables $X$, $Y$ and $Z$ take values in $\{0,1\}$, and the transition probability of the main channel is defined as follows:
When $s = 0$,
$$p_{Y|X,S}(y|x,s=0) = \begin{cases} 1-p, & \text{if } y = x \\ p, & \text{otherwise} \end{cases}$$
When $s = 1$,
$$p_{Y|X,S}(y|x,s=1) = \begin{cases} p, & \text{if } y = x \\ 1-p, & \text{otherwise} \end{cases}$$
The wiretap channel is a BSC (binary symmetric channel) with crossover probability $q$, i.e.,
$$p_{Z|Y}(z|y) = \begin{cases} 1-q, & \text{if } z = y \\ q, & \text{otherwise} \end{cases}$$
The channel for generating the state sequence $S^N$ is a BSC with crossover probability $r$, i.e.,
$$p_{S|A}(s|a) = \begin{cases} 1-r, & \text{if } s = a \\ r, & \text{otherwise} \end{cases}$$
From Remark 3 and Remark 4 we know that the secrecy capacity for the causal case is bounded by
$$\max\min\big\{I(U;Y) - I(U;Z),\ H(A|Z)\big\} \le C_s^{(c)} \le \max\big(I(U;Y) - I(K;Z|V)\big) \stackrel{(a)}{\le} \max I(U;Y) \qquad (24)$$
Note that in (a), equality is achieved if $V = K$. Moreover, $\max I(U;Y)$, $\max H(A|Z)$ and $\max(I(U;Y) - I(U;Z))$ are achieved if $A$ is a function of $U$ and $X$ is a function of $U$ and $S$; this is similar to the argument in [5]. Define $a = g(u)$ and $x = f(u,s)$; then (24) can be written as
$$\max_{f,g,p_U(u)}\min\big\{I(U;Y) - I(U;Z),\ H(A|Z)\big\} \le C_s^{(c)} \le \max_{f,g,p_U(u)} I(U;Y)$$
and this is because the joint probability distribution $p_{AUSXYZ}(a,u,s,x,y,z)$ can be calculated by
$$p_{AUSXYZ}(a,u,s,x,y,z) = p_{Z|Y}(z|y)\, p_{Y|X,S}(y|x,s)\, 1_{\{x = f(u,s)\}}\, p_{S|A}(s|a)\, 1_{\{a = g(u)\}}\, p_U(u)$$
It now remains to calculate the quantities $\max_{f,g,p_U(u)}(I(U;Y) - I(U;Z))$, $\max_{f,g,p_U(u)} H(A|Z)$ and $\max_{f,g,p_U(u)} I(U;Y)$; see the remainder of this section.
Let $U$ take values in $\{0,1\}$, with probability $p_U(0) = \alpha$ and $p_U(1) = 1 - \alpha$.
In addition, there are 16 possible functions $f$ and 4 possible functions $g$. Writing the argument pair $(u,s)$ as $us$, define
$f^{(1)}(u,s): 00 \to 0, 01 \to 0, 10 \to 0, 11 \to 0$.  $f^{(2)}(u,s): 00 \to 0, 01 \to 0, 10 \to 0, 11 \to 1$.
$f^{(3)}(u,s): 00 \to 0, 01 \to 0, 10 \to 1, 11 \to 0$.  $f^{(4)}(u,s): 00 \to 0, 01 \to 0, 10 \to 1, 11 \to 1$.
$f^{(5)}(u,s): 00 \to 0, 01 \to 1, 10 \to 0, 11 \to 0$.  $f^{(6)}(u,s): 00 \to 0, 01 \to 1, 10 \to 0, 11 \to 1$.
$f^{(7)}(u,s): 00 \to 0, 01 \to 1, 10 \to 1, 11 \to 0$.  $f^{(8)}(u,s): 00 \to 0, 01 \to 1, 10 \to 1, 11 \to 1$.
$f^{(9)}(u,s): 00 \to 1, 01 \to 0, 10 \to 0, 11 \to 0$.  $f^{(10)}(u,s): 00 \to 1, 01 \to 0, 10 \to 0, 11 \to 1$.
$f^{(11)}(u,s): 00 \to 1, 01 \to 0, 10 \to 1, 11 \to 0$.  $f^{(12)}(u,s): 00 \to 1, 01 \to 0, 10 \to 1, 11 \to 1$.
$f^{(13)}(u,s): 00 \to 1, 01 \to 1, 10 \to 0, 11 \to 0$.  $f^{(14)}(u,s): 00 \to 1, 01 \to 1, 10 \to 0, 11 \to 1$.
$f^{(15)}(u,s): 00 \to 1, 01 \to 1, 10 \to 1, 11 \to 0$.  $f^{(16)}(u,s): 00 \to 1, 01 \to 1, 10 \to 1, 11 \to 1$.
$g^{(1)}(u): 0 \to 0, 1 \to 0$.  $g^{(2)}(u): 0 \to 0, 1 \to 1$.
$g^{(3)}(u): 0 \to 1, 1 \to 0$.  $g^{(4)}(u): 0 \to 1, 1 \to 1$.
The quantity $I(U;Y)$ depends on the joint probability mass function $p_{UY}(u,y)$, and we have
$$p_{UY}(u,y) = \sum_{x,s,a} p_{UYXSA}(u,y,x,s,a) = \sum_{x,s,a} p_{Y|XS}(y|x,s)\, p_{X|U,S}(x|u,s)\, p_U(u)\, p_{A|U}(a|u)\, p_{S|A}(s|a)$$
The quantity $I(U;Z)$ depends on the joint probability mass function $p_{UZ}(u,z)$, and we have
$$p_{UZ}(u,z) = \sum_y p_{UYZ}(u,y,z) = \sum_y p_{Z|Y}(z|y)\, p_{UY}(u,y)$$
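Since all alphabets here are binary, $p_{UY}$ and $p_{UZ}$ (and hence $I(U;Y)$ and $I(U;Z)$) can be computed by brute-force marginalization exactly as in the two displays above. A Python sketch (helper names and the example parameter values are mine):

```python
import itertools
import numpy as np

def mutual_informations(f, g, alpha, p, q, r):
    """Brute-force p_UY, p_UZ and the resulting I(U;Y), I(U;Z) (in bits)."""
    p_u = [alpha, 1 - alpha]
    p_uy = np.zeros((2, 2))
    p_uz = np.zeros((2, 2))
    for u, s, y in itertools.product(range(2), repeat=3):
        a = g(u)                                    # a = g(u)
        x = f(u, s)                                 # x = f(u, s)
        p_s = (1 - r) if s == a else r              # BSC(r): a -> s
        # Main channel: Pr{y|x,s} = 1-p if y = x and s = 0, flipped when s = 1.
        p_y = (1 - p) if (y == x) != (s == 1) else p
        prob = p_u[u] * p_s * p_y
        p_uy[u, y] += prob
        for z in range(2):                          # wiretap BSC(q): y -> z
            p_uz[u, z] += prob * ((1 - q) if z == y else q)

    def mi(joint):
        pu, pv = joint.sum(1), joint.sum(0)
        return sum(joint[i, j] * np.log2(joint[i, j] / (pu[i] * pv[j]))
                   for i in range(2) for j in range(2) if joint[i, j] > 0)

    return mi(p_uy), mi(p_uz)

# f^(7): x = u XOR s, g^(2): a = u, alpha = 1/2; p, q, r are example values.
iuy, iuz = mutual_informations(lambda u, s: u ^ s, lambda u: u,
                               alpha=0.5, p=0.1, q=0.2, r=0.2)
print(iuy - iuz)  # equals h(p*q) - h(p) with p*q = p + q - 2pq
```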
By choosing over the above $f$, $g$ and $\alpha$, we find that
$$\max_{f,g,p_U(u)}\big(I(U;Y) - I(U;Z)\big) = \max\Big\{h(p*q) - h(p),\ h\big(q*(r*p)\big) - h(p*r) - 2r\Big(\tfrac{1}{2r} - 1\Big)\big(h(p*q) - h(p)\big)\Big\}$$
where $p*q = p + q - 2pq$. Moreover, $h(p*q) - h(p)$ is achieved when $f = f^{(7)}$, $g = g^{(2)}$ and $\alpha = \frac{1}{2}$, and $h(q*(r*p)) - h(p*r) - 2r(\frac{1}{2r} - 1)(h(p*q) - h(p))$ is achieved when $f = f^{(2)}$, $g = g^{(2)}$ and $\alpha = \frac{1}{2}$.
Moreover,
$$\max_{f,g,p_U(u)} H(A|Z) = h(p*q)$$
where $p*q = p + q - 2pq$, and $h(p*q)$ is achieved when $f = f^{(7)}$, $g = g^{(2)}$ and $\alpha = \frac{1}{2}$.
In addition,
$$\max_{f,g,p_U(u)} I(U;Y) = 1 - h(p)$$
and equality is achieved if $f = f^{(7)}$, $g = g^{(2)}$ and $\alpha = \frac{1}{2}$.
It is easy to see that $\max_{f,g,p_U(u)} H(A|Z) = h(p*q) \ge \max_{f,g,p_U(u)}(I(U;Y) - I(U;Z))$, and therefore the secrecy capacity for the causal case is bounded by
$$\max\Big\{h(p*q) - h(p),\ h\big(q*(r*p)\big) - h(p*r) - 2r\Big(\tfrac{1}{2r} - 1\Big)\big(h(p*q) - h(p)\big)\Big\} \le C_s^{(c)} \le 1 - h(p)$$
Figure 5 below gives lower and upper bounds on the secrecy capacity of the model of Figure 4 with causal channel state information. It is easy to see that when $q = 0.5$, the lower bound meets the upper bound, i.e., the secrecy capacity satisfies $C_s^{(c)} = 1 - h(p)$. This is because when $q = 0.5$, zero leakage is always satisfied and the problem reduces to coding for a channel with causal states. Moreover, when $r$ is fixed, the bounds on the secrecy capacity improve as $p$ decreases.
Figure 5. When $r = 0.2$, lower and upper bounds on the secrecy capacity of the model of Figure 4 with causal channel state information.
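The closed-form bounds above are easy to evaluate as functions of $p$, which is how a figure like Figure 5 can be reproduced. A minimal Python sketch; for simplicity it evaluates only the $h(p*q) - h(p)$ branch of the lower bound (still a valid lower bound, since the overall lower bound is a maximum of two terms):

```python
import numpy as np

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    m = (x > 0) & (x < 1)
    out[m] = -x[m]*np.log2(x[m]) - (1 - x[m])*np.log2(1 - x[m])
    return out

def conv(a, b):
    """Binary convolution a*b = a + b - 2ab."""
    return a + b - 2*a*b

q, r = 0.2, 0.2                      # example values; Figure 5 uses r = 0.2
p = np.linspace(0.0, 0.5, 200)
lower = h(conv(p, q)) - h(p)         # first term of the max in the lower bound
upper = 1 - h(p)
# Plotting lower/upper against p reproduces the shape of Figure 5; at q = 0.5
# the two curves coincide, since conv(p, 0.5) = 0.5 and h(0.5) = 1.
```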

4. Conclusions

In this paper, we study the model of the wiretap channel with action-dependent channel state information. Inner and outer bounds on the capacity-equivocation region are provided, both for the case where the channel inputs are allowed to depend non-causally on the state sequence and the case where they are restricted to causal dependence. Furthermore, the secrecy capacities for both cases are bounded, providing the best transmission rate with perfect secrecy. The result is further explained via a binary example.

Acknowledgement

The authors would like to thank N. Cai for his help in improving this paper. This work was supported by a sub-project of the National Basic Research Program of China under Grant 2012CB316100 on Broadband Mobile Communications at High Speeds, the National Natural Science Foundation of China under Grant 61271222, and the Research Fund for the Doctoral Program of Higher Education of China (No. 20100073110016).

References

  1. Shannon, C.E. Channels with side information at the transmitter. IBM J. Res. Dev. 1958, 2, 289–293. [Google Scholar] [CrossRef]
  2. Kuznetsov, N.V.; Tsybakov, B.S. Coding in memories with defective cells. Probl. Peredachi Informatsii 1974, 10, 52–60. [Google Scholar]
  3. Gel’fand, S.I.; Pinsker, M.S. Coding for channel with random parameters. Probl. Control Inf. Theory 1980, 9, 19–31. [Google Scholar]
  4. Costa, M.H.M. Writing on dirty paper. IEEE Trans. Inf. Theory 1983, 29, 439–441. [Google Scholar] [CrossRef]
  5. Weissman, T. Capacity of channels with action-dependent states. IEEE Trans. Inf. Theory 2010, 56, 5396–5411. [Google Scholar] [CrossRef]
  6. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387. [Google Scholar] [CrossRef]
  7. Csiszár, I.; Körner, J. Broadcast channels with confidential messages. IEEE Trans. Inf. Theory 1978, 24, 339–348. [Google Scholar]
  8. Körner, J.; Marton, K. General broadcast channels with degraded message sets. IEEE Trans. Inf. Theory 1977, 23, 60–64. [Google Scholar] [CrossRef]
  9. Leung-Yan-Cheong, S.K.; Hellman, M.E. The Gaussian wire-tap channel. IEEE Trans. Inf. Theory 1978, 24, 451–456. [Google Scholar] [CrossRef]
  10. Mitrpant, C.; Han Vinck, A.J.; Luo, Y. An achievable region for the Gaussian wiretap channel with side information. IEEE Trans. Inf. Theory 2006, 52, 2181–2190. [Google Scholar] [CrossRef]
  11. Chen, Y.; Han Vinck, A.J. Wiretap channel with side information. IEEE Trans. Inf. Theory 2008, 54, 395–402. [Google Scholar] [CrossRef]
  12. Dai, B.; Luo, Y. Some new results on wiretap channel with side information. Entropy 2012, 14, 1671–1702. [Google Scholar] [CrossRef]
  13. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Academic: London, UK, 1981; pp. 123–124. [Google Scholar]

A. Proof of Theorem 1

In this section, we will show that any pair $(R, R_e) \in \mathcal{R}^{(ni)}$ is achievable. Gel'fand-Pinsker's binning and Wyner's random binning techniques are used in the construction of the code-books.
Now the remainder of this section is organized as follows. The code construction is in Subsection A.1. The proof of achievability is given in Subsection A.2.

A.1. Code Construction

Since $R_e \le I(U;Y) - I(U;Z)$, $R_e \le H(A|Z)$ and $R_e \le R \le I(U;Y) - I(U;S|A)$, it is sufficient to show that the pair $(R,\ R_e = \min\{I(U;Y) - \max(I(U;Z), I(U;S|A)),\ H(A|Z)\})$ is achievable; note that this implies $R \ge R_e = \min\{I(U;Y) - \max(I(U;Z), I(U;S|A)),\ H(A|Z)\}$.
The construction of the code and the proof of achievability are considered in two cases:
  • (Case 1) If $H(A|Z) \ge I(U;Y) - \max(I(U;Z), I(U;S|A))$, the double binning technique of [11] is used in the construction of the code-book.
  • (Case 2) If $H(A|Z) \le I(U;Y) - \max(I(U;Z), I(U;S|A))$, Gel'fand-Pinsker's binning technique [3] is used in the construction of the code-book.
  • (Code construction for Case 1)
    Given a pair $(R, R_e)$, choose a joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$ such that
    $$0 \le R_e \le R$$
    $$R \le I(U;Y) - I(U;S|A)$$
    $$R_e = I(U;Y) - \max(I(U;Z), I(U;S|A))$$
    The message set $\mathcal{W}$ satisfies the following condition:
    $$\lim_{N\to\infty}\frac{1}{N}\log\|\mathcal{W}\| = R = I(U;Y) - I(U;S|A) - \gamma \qquad (A1)$$
    where $\gamma$ is a fixed positive real number and
    $$0 \le \gamma \stackrel{(a)}{\le} \max(I(U;Z), I(U;S|A)) - I(U;S|A) \qquad (A2)$$
    Note that (a) is from $R \ge R_e = I(U;Y) - \max(I(U;Z), I(U;S|A))$ and (A1). Let $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$.
    Code-book generation:
    (Construction of $A^N$)
    Generate $2^{NR}$ i.i.d. sequences $a^N$ according to the probability mass function $p_A(a)$, and index each sequence by $i \in \{1, 2, \ldots, 2^{NR}\}$. For a given message $w \in \mathcal{W}$, choose the corresponding $a^N(w)$ as the output of the action encoder.
    (Construction of $U^N$)
    For the transmitted action sequence $a^N(w)$, generate $2^{N(I(U;Y) - \epsilon_{2,N})}$ (where $\epsilon_{2,N} \to 0$ as $N \to \infty$) i.i.d. sequences $u^N$ according to the probability mass function $p_{U|A}(u_i|a_i(w))$. Distribute these sequences at random into $2^{NR} = 2^{N(I(U;Y) - I(U;S|A) - \gamma)}$ bins such that each bin contains $2^{N(I(U;S|A) + \gamma - \epsilon_{2,N})}$ sequences, and index each bin by $i \in \{1, 2, \ldots, 2^{NR}\}$. Then place the $2^{N(I(U;S|A) + \gamma - \epsilon_{2,N})}$ sequences of every bin randomly into $2^{N(\max(I(U;S|A), I(U;Z)) - I(U;Z) + \epsilon_{3,N})}$ (where $\epsilon_{3,N} \to 0$ as $N \to \infty$) subbins, such that every subbin contains $2^{N(I(U;S|A) + \gamma - \epsilon_{2,N} - \max(I(U;S|A), I(U;Z)) + I(U;Z) - \epsilon_{3,N})}$ sequences. Let $J$ be the random variable representing the index of the subbin, and index each subbin by $j \in \{1, 2, \ldots, 2^{N(\max(I(U;S|A), I(U;Z)) - I(U;Z) + \epsilon_{3,N})}\}$, i.e.,
    $$\log\|J\| = N\big(\max(I(U;S|A), I(U;Z)) - I(U;Z) + \epsilon_{3,N}\big) \qquad (A3)$$
    Here note that the exponent of the number of sequences in every subbin is upper bounded as follows:
    $$I(U;S|A) + \gamma - \epsilon_{2,N} - \max(I(U;S|A), I(U;Z)) + I(U;Z) - \epsilon_{3,N} \stackrel{(a)}{\le} I(U;Z) - \epsilon_{2,N} - \epsilon_{3,N} \qquad (A4)$$
    where (a) is from (A2). This implies that
    $$\lim_{N\to\infty} H(U^N|W, J, Z^N) = 0 \qquad (A5)$$
    Note that (A5) can be proved by using Fano's inequality and (A4).
    Let $s^N$ be the state sequence generated in response to the action sequence $a^N(w)$. For a given message $w \in \mathcal{W}$ and channel state sequence $s^N$, try to find a sequence $u^N(w, i^*)$ in bin $w$ such that $(u^N(w, i^*), a^N(w), s^N) \in T^N_{UAS}(\epsilon_2)$. If multiple such sequences exist in bin $w$, choose the one with the smallest index in the bin. If no such sequence exists, declare an encoding error.
    Figure A1 shows the construction of $U^N$ for case 1.
    Figure A1. Code-book construction for $U^N$ in Theorem 1 for case 1.
    (Construction of $X^N$) The $x^N$ is generated according to a new discrete memoryless channel (DMC) with inputs $u^N$, $s^N$ and output $x^N$. The transition probability of this new DMC is $p_{X|U,S}(x|u,s)$, which is obtained from the joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$. The probability $p_{X^N|U^N,S^N}(x^N|u^N,s^N)$ is calculated as follows:
    $$p_{X^N|U^N,S^N}(x^N|u^N,s^N) = \prod_{i=1}^{N} p_{X|U,S}(x_i|u_i,s_i)$$
    Decoding:
    Given a vector $y^N \in \mathcal{Y}^N$, try to find a sequence $u^N(\hat{w}, \hat{i})$ such that $(u^N(\hat{w}, \hat{i}), a^N(\hat{w}), y^N) \in T^N_{UAY}(\epsilon_3)$. If all such sequences share the same message index $\hat{w}$, output the corresponding $\hat{w}$; otherwise, i.e., if no such sequence exists or such sequences have different message indices, declare a decoding error. (A toy sketch of this bin/subbin encoding and the typicality decoding rule is given at the end of this subsection.)
  • (Code construction for Case 2)
    Given a pair $(R, R_e)$, choose a joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$ such that
    $$0 \le R_e \le R$$
    $$R \le I(U;Y) - I(U;S|A)$$
    $$R_e = H(A|Z)$$
    The message set $\mathcal{W}$ satisfies the following condition:
    $$\lim_{N\to\infty}\frac{1}{N}\log\|\mathcal{W}\| = R = I(U;Y) - I(U;S|A) - \gamma_1 \qquad (A7)$$
    where $\gamma_1$ is a fixed positive real number and
    $$0 \le \gamma_1 \stackrel{(b)}{\le} I(U;Y) - I(U;S|A) - H(A|Z)$$
    Note that (b) is from $R \ge R_e = H(A|Z)$ and (A7). Let $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$.
    Code-book generation:
    (Construction of $A^N$)
    Generate $2^{NR}$ i.i.d. sequences $a^N$ according to the probability mass function $p_A(a)$, and index each sequence by $i \in \{1, 2, \ldots, 2^{NR}\}$. For a given message $w \in \mathcal{W}$, choose the corresponding $a^N(w)$ as the output of the action encoder.
    (Construction of $U^N$)
    For the transmitted action sequence $a^N(w)$, generate $2^{N(I(U;Y) - \epsilon_{2,N})}$ (where $\epsilon_{2,N} \to 0$ as $N \to \infty$) i.i.d. sequences $u^N$ according to the probability mass function $p_{U|A}(u_i|a_i(w))$. Distribute these sequences at random into $2^{NR} = 2^{N(I(U;Y) - I(U;S|A) - \gamma_1)}$ bins such that each bin contains $2^{N(I(U;S|A) + \gamma_1 - \epsilon_{2,N})}$ sequences, and index each bin by $i \in \{1, 2, \ldots, 2^{NR}\}$.
    Let $s^N$ be the state sequence generated in response to the action sequence $a^N(w)$. For a given message $w \in \mathcal{W}$ and channel state sequence $s^N$, try to find a sequence $u^N(w, i^*)$ in bin $w$ such that $(u^N(w, i^*), a^N(w), s^N) \in T^N_{UAS}(\epsilon_2)$. If multiple such sequences exist in bin $w$, choose the one with the smallest index in the bin. If no such sequence exists, declare an encoding error.
    Figure A2 shows the construction of $U^N$ for case 2.
    Figure A2. Code-book construction for $U^N$ in Theorem 1 for case 2.
    (Construction of $X^N$) The $x^N$ is generated in the same way as for case 1, and the description is omitted here.
    Decoding:
    Given a vector $y^N \in \mathcal{Y}^N$, try to find a sequence $u^N(\hat{w}, \hat{i})$ such that $(u^N(\hat{w}, \hat{i}), a^N(\hat{w}), y^N) \in T^N_{UAY}(\epsilon_3)$. If all such sequences share the same message index $\hat{w}$, output the corresponding $\hat{w}$; otherwise, i.e., if no such sequence exists or such sequences have different message indices, declare a decoding error.
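To make the random binning above concrete, here is a toy Python sketch of the case 1 structure: codeword indices are randomly assigned to bins and subbins, encoding picks the smallest-index codeword in bin $w$ that passes a (stand-in) typicality test, and decoding succeeds only when all passing codewords agree on the message index. The rates and the typicality callbacks are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy per-symbol exponents; assumed for illustration only.
N, R_total, R, R_J = 8, 1.5, 0.5, 0.25
n_codewords = 2 ** round(N * R_total)   # all generated u^N sequences
n_bins = 2 ** round(N * R)              # one bin per message w
n_subbins = 2 ** round(N * R_J)         # subbin index J within each bin

# Random double binning: each codeword index gets a (bin, subbin) label.
bins = rng.integers(0, n_bins, size=n_codewords)
subbins = rng.integers(0, n_subbins, size=n_codewords)

def encode(w, typical_with_state):
    """Smallest-index codeword in bin w passing the encoder typicality test."""
    for idx in np.flatnonzero(bins == w):
        if typical_with_state(idx):
            return idx, subbins[idx]    # chosen codeword and its subbin index J
    raise RuntimeError("encoding error: no typical codeword in bin")

def decode(typical_with_output):
    """Unique message index among codewords passing the decoder test, else error."""
    hits = {int(bins[idx]) for idx in range(n_codewords) if typical_with_output(idx)}
    if len(hits) == 1:
        return hits.pop()
    raise RuntimeError("decoding error: %d candidate messages" % len(hits))

# Dummy typicality tests standing in for membership of T^N_{UAS} and T^N_{UAY}.
idx, j = encode(3, typical_with_state=lambda i: i % 7 == 0)
print(idx, j)
print(decode(typical_with_output=lambda i: i == idx))
```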

A.2. Proof of Achievability

By using the above definitions, it is easy to verify that $\lim_{N\to\infty}\frac{\log\|\mathcal{W}\|}{N} = R$.
Then, for the two cases, note that the above encoding and decoding schemes are similar to those used in [5]. Hence, by similar arguments as in [5], it is easy to show that $P_e \le \epsilon$ for both cases, and the proof is omitted here. It remains to show that $\lim_{N\to\infty}\Delta \ge R_e$ for the two cases, see the following.
Proof of $\lim_{N\to\infty}\Delta \ge R_e$ for case 1:
$$\begin{aligned}
\lim_{N\to\infty}\Delta &= \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) = \lim_{N\to\infty}\frac{1}{N}\big(H(W,Z^N) - H(Z^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(H(W,Z^N,U^N,J) - H(J,U^N|Z^N,W) - H(Z^N)\big)\\
&\stackrel{(a)}{=} \lim_{N\to\infty}\frac{1}{N}\big(H(Z^N|U^N) + H(U^N,J,W) - H(J,U^N|Z^N,W) - H(Z^N)\big)\\
&\stackrel{(b)}{=} \lim_{N\to\infty}\frac{1}{N}\big(H(Z^N|U^N) + H(U^N) - H(J,U^N|Z^N,W) - H(Z^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - H(J,U^N|Z^N,W) - I(Z^N;U^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - H(J|Z^N,W) - H(U^N|Z^N,W,J) - I(Z^N;U^N)\big)\\
&\stackrel{(c)}{\ge} \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - \log\|J\| - H(U^N|Z^N,W,J) - I(Z^N;U^N)\big)\\
&\ge \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - H(U^N|Y^N) - \log\|J\| - H(U^N|Z^N,W,J) - I(Z^N;U^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(I(Y^N;U^N) - \log\|J\| - H(U^N|Z^N,W,J) - I(Z^N;U^N)\big)\\
&\stackrel{(d)}{=} \lim_{N\to\infty}\frac{1}{N}\big(N I(Y;U) - \log\|J\| - H(U^N|Z^N,W,J) - N I(Z;U)\big)\\
&\stackrel{(e)}{=} \lim_{N\to\infty}\frac{1}{N}\big(N I(Y;U) - N\max(I(U;S|A), I(U;Z)) + N I(U;Z) - N\epsilon_{3,N} - N I(Z;U)\big)\\
&\stackrel{(f)}{=} I(Y;U) - \max(I(U;S|A), I(U;Z)) = R_e
\end{aligned}$$
where (a) is from $(W,J) \to U^N \to Z^N$, (b) is from $H(J,W|U^N) = 0$, (c) is from $H(J|Z^N,W) \le H(J) \le \log\|J\|$, (d) is from the fact that $S^N$, $U^N$ and $X^N$ are i.i.d. generated random vectors and the channels are discrete memoryless, (e) is from (A3) and (A5), and (f) is from $\epsilon_{3,N} \to 0$ as $N \to \infty$.
Thus, $\lim_{N\to\infty}\Delta \ge R_e$ for case 1 is proved.
Proof of $\lim_{N\to\infty}\Delta \ge R_e$ for case 2:
$$\lim_{N\to\infty}\Delta = \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) \stackrel{(1)}{=} \lim_{N\to\infty}\frac{1}{N}H(A^N|Z^N) \stackrel{(2)}{=} \lim_{N\to\infty}\frac{1}{N}\big(N H(A|Z)\big) = H(A|Z) = R_e$$
where (1) is from the fact that $A^N$ is a function of $W$, and (2) is from the fact that $A^N$ and $X^N$ are i.i.d. generated random vectors and the channels are discrete memoryless.
Thus, $\lim_{N\to\infty}\Delta \ge R_e$ for case 2 is proved.
The proof of Theorem 1 is completed.

B. Proof of Theorem 2

In this section, we prove Theorem 2: all achievable $(R, R_e)$ pairs are contained in the set $\mathcal{R}^{(no)}$. Suppose $(R, R_e)$ is achievable, i.e., for any given $\epsilon > 0$, there exists a channel encoder-decoder $(N, \Delta, P_e)$ such that
$$\lim_{N\to\infty}\frac{\log\|\mathcal{W}\|}{N} = R,\quad \lim_{N\to\infty}\Delta \ge R_e,\quad P_e \le \epsilon$$
Then we will show the existence of random variables $(A,U,K,V) \to (X,S) \to Y \to Z$ such that
$$0 \le R_e \le R \qquad (A11)$$
$$R \le I(U;Y) - I(U;S|A) \qquad (A12)$$
$$R_e \le I(U;Y) - I(K;Z|V) \qquad (A13)$$
Since $W$ is uniformly distributed over $\mathcal{W}$, we have $H(W) = \log\|\mathcal{W}\|$. The formulas (A12) and (A13) are proved by Lemma 1, see the following.
Lemma 1
The random vectors $Y^N$, $Z^N$ and the random variables $W$, $V$, $U$, $K$, $A$, $Y$, $Z$ of Theorem 2 satisfy:
$$\frac{1}{N}H(W) \le I(U;Y) - I(U;S|A) + \frac{1}{N}\delta(P_e) \qquad (A14)$$
$$\frac{1}{N}H(W|Z^N) \le I(U;Y) - I(K;Z|V) + \frac{1}{N}\delta(P_e) \qquad (A15)$$
where $\delta(P_e) = h(P_e) + P_e\log(\|\mathcal{W}\| - 1)$. Note that $h(P_e) = -P_e\log P_e - (1-P_e)\log(1-P_e)$.
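For reference, $\delta(P_e)$ above is the standard Fano term bounding $H(W|\hat{W})$; a tiny Python helper (the function name is mine) to evaluate it numerically:

```python
import math

def fano_delta(pe, w_size):
    """delta(Pe) = h(Pe) + Pe * log2(|W| - 1), Fano's bound on H(W|W_hat)."""
    hpe = 0.0 if pe in (0.0, 1.0) else -pe*math.log2(pe) - (1-pe)*math.log2(1-pe)
    return hpe + pe * math.log2(w_size - 1)

print(fano_delta(0.01, 2**10))  # small when Pe is small
```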
Substituting $H(W) = \log\|\mathcal{W}\|$ and (5) into (A14) and (A15), and using the fact that $\epsilon \to 0$, the formulas (A12) and (A13) are obtained. The formula (A11) follows from
$$R_e \le \lim_{N\to\infty}\Delta = \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) \le \lim_{N\to\infty}\frac{1}{N}H(W) = R$$
It remains to prove Lemma 1, see the following.
Proof 3 (Proof of Lemma 1)
The formula (A14) follows from (A16), (A18) and (A28). The formula (A15) is from (A16), (A17), (A18), (A22), (A28) and (A29).
<Part i> We begin with the left-hand sides of the inequalities (A14) and (A15), see the following.
Since $W \to Y^N \to Z^N$ is a Markov chain, for the message $W$ we have
$$\frac{1}{N}H(W) = \frac{1}{N}H(W|Y^N) + \frac{1}{N}I(Y^N;W) \stackrel{(a)}{\le} \frac{1}{N}\delta(P_e) + \frac{1}{N}I(Y^N;W) \qquad (A16)$$
For the equivocation to the wiretapper, we have
$$\frac{1}{N}H(W|Z^N) = \frac{1}{N}\big(H(W) - I(W;Z^N)\big) = \frac{1}{N}\big(I(W;Y^N) + H(W|Y^N) - I(W;Z^N)\big) \stackrel{(b)}{\le} \frac{1}{N}\big(I(W;Y^N) - I(W;Z^N) + \delta(P_e)\big) \qquad (A17)$$
Note that (a) and (b) follow from Fano’s inequality.
<Part ii> By using the chain rule, the quantity $I(Y^N;W)$ in formulas (A16) and (A17) can be bounded as follows:
$$\begin{aligned}
\frac{1}{N}I(Y^N;W) &= \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W|Y^{i-1})\\
&\stackrel{(1)}{=} \frac{1}{N}\sum_{i=1}^{N}\big(I(Y_i;W|Y^{i-1}) - I(S_i;W|S_{i+1}^N, A^N)\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(I(Y_i;W,S_{i+1}^N,A^N|Y^{i-1}) - I(Y_i;S_{i+1}^N,A^N|W,Y^{i-1}) - I(S_i;W,Y^{i-1}|S_{i+1}^N,A^N) + I(S_i;Y^{i-1}|W,S_{i+1}^N,A^N)\big)\\
&\stackrel{(2)}{=} \frac{1}{N}\sum_{i=1}^{N}\big(I(Y_i;W,S_{i+1}^N,A^N|Y^{i-1}) - I(S_i;W,Y^{i-1}|S_{i+1}^N,A^N)\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(H(Y_i|Y^{i-1}) - H(Y_i|Y^{i-1},W,S_{i+1}^N,A^N) - H(S_i|S_{i+1}^N,A^N) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1})\big)\\
&\stackrel{(3)}{\le} \frac{1}{N}\sum_{i=1}^{N}\big(H(Y_i) - H(Y_i|Y^{i-1},W,S_{i+1}^N,A^N) - H(S_i|A_i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1})\big)
\end{aligned} \qquad (A18)$$
where formula (1) follows from $W \to A^N \to S^N$, formula (2) follows from
$$\sum_{i=1}^{N} I(Y_i;S_{i+1}^N,A^N|W,Y^{i-1}) = \sum_{i=1}^{N} I(S_i;Y^{i-1}|W,S_{i+1}^N,A^N) \qquad (A19)$$
and formula (3) follows from the Markov chain $S_i \to A_i \to (S_{i+1}^N, A^{i-1}, A_{i+1}^N)$.
Proof 4 (Proof of (A19)). The left-hand side of (A19) can be rewritten as
$$\begin{aligned}
\sum_{i=1}^{N} I(Y_i;S_{i+1}^N,A^N|W,Y^{i-1}) &\stackrel{(1)}{=} \sum_{i=1}^{N} I(Y_i;S_{i+1}^N|W,Y^{i-1},A^N) = \sum_{i=1}^{N}\sum_{j=i+1}^{N} I(Y_i;S_j|A^N,Y^{i-1},W,S_{j+1}^N)\\
&= \sum_{j=1}^{N}\sum_{i=j+1}^{N} I(Y_j;S_i|A^N,Y^{j-1},S_{i+1}^N,W) = \sum_{i=1}^{N}\sum_{j=1}^{i-1} I(Y_j;S_i|A^N,Y^{j-1},S_{i+1}^N,W)
\end{aligned} \qquad (A20)$$
where (1) is from the fact that $A^N$ is a deterministic function of $W$.
The right-hand side of (A19) can be rewritten as
$$\sum_{i=1}^{N} I(S_i;Y^{i-1}|W,S_{i+1}^N,A^N) = \sum_{i=1}^{N}\sum_{j=1}^{i-1} I(Y_j;S_i|A^N,W,Y^{j-1},S_{i+1}^N) \qquad (A21)$$
The formula (A19) is proved by (A20) and (A21). The proof is completed.
<Part iii> Similar to (A18), the quantity $I(W;Z^N)$ in formula (A17) can be bounded as follows:
$$\begin{aligned}
\frac{1}{N}I(Z^N;W) &= \frac{1}{N}\sum_{i=1}^{N} I(Z_i;W|Z^{i-1})\\
&\stackrel{(a)}{=} \frac{1}{N}\sum_{i=1}^{N}\big(I(Z_i;W|Z^{i-1}) - I(S_i;W|S_{i+1}^N,A^N)\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(I(Z_i;W,S_{i+1}^N,A^N|Z^{i-1}) - I(Z_i;S_{i+1}^N,A^N|W,Z^{i-1}) - I(S_i;W,Z^{i-1}|S_{i+1}^N,A^N) + I(S_i;Z^{i-1}|W,S_{i+1}^N,A^N)\big)\\
&\stackrel{(b)}{=} \frac{1}{N}\sum_{i=1}^{N}\big(I(Z_i;W,S_{i+1}^N,A^N|Z^{i-1}) - I(S_i;W,Z^{i-1}|S_{i+1}^N,A^N)\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W,S_{i+1}^N,A^N) - H(S_i|S_{i+1}^N,A^N) + H(S_i|S_{i+1}^N,A^N,W,Z^{i-1})\big)\\
&\ge \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W,S_{i+1}^N,A^N) - H(S_i|A_i) + H(S_i|S_{i+1}^N,A^N,W,Z^{i-1},Y^{i-1})\big)\\
&\stackrel{(c)}{=} \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W,S_{i+1}^N,A^N) - H(S_i|A_i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1})\big)
\end{aligned} \qquad (A22)$$
where formula (a) follows from $W \to A^N \to S^N$, formula (b) follows from
$$\sum_{i=1}^{N} I(Z_i;S_{i+1}^N,A^N|W,Z^{i-1}) = \sum_{i=1}^{N} I(S_i;Z^{i-1}|W,S_{i+1}^N,A^N) \qquad (A23)$$
and formula (c) follows from the Markov chain $Z^{i-1} \to (S_{i+1}^N, A^N, W, Y^{i-1}) \to S_i$. Note that the proof of (A23) is similar to the proof of (A19), and therefore it is omitted here.
<Part iv> (single letter) To complete the proof, we introduce a random variable J, which is independent of W, A N , X N , S N , Y N and Z N . Furthermore, J is uniformly distributed over { 1 , 2 , . . . , N } . Define
$$U = (W, Y^{J-1}, S_{J+1}^N, A^N, J) \qquad (A24)$$
$$K = (W, Z^{J-1}, S_{J+1}^N, A^N, J) \qquad (A25)$$
$$V = (Z^{J-1}, J) \qquad (A26)$$
$$X = X_J,\quad Y = Y_J,\quad Z = Z_J,\quad S = S_J,\quad A = (A_J, J) \qquad (A27)$$
<Part v> Then (A18) can be rewritten as
$$\begin{aligned}
\frac{1}{N}I(W;Y^N) &\le \frac{1}{N}\sum_{i=1}^{N}\big(H(Y_i) - H(Y_i|Y^{i-1},W,S_{i+1}^N,A^N) - H(S_i|A_i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1})\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(H(Y_i|J=i) - H(Y_i|Y^{i-1},W,S_{i+1}^N,A^N,J=i) - H(S_i|A_i,J=i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1},A_i,J=i)\big)\\
&= H(Y_J|J) - H(Y_J|Y^{J-1},W,S_{J+1}^N,A^N,J) - H(S_J|A_J,J) + H(S_J|S_{J+1}^N,A^N,W,Y^{J-1},A_J,J)\\
&\le H(Y_J) - H(Y_J|Y^{J-1},W,S_{J+1}^N,A^N,J) - H(S_J|A_J,J) + H(S_J|S_{J+1}^N,A^N,W,Y^{J-1},A_J,J)\\
&= H(Y) - H(Y|U) - H(S|A) + H(S|U,A)\\
&= I(U;Y) - I(U;S|A)
\end{aligned} \qquad (A28)$$
Analogously, (A22) is rewritten as follows:
$$\begin{aligned}
\frac{1}{N}I(Z^N;W) &\ge \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W,S_{i+1}^N,A^N) - H(S_i|A_i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1})\big)\\
&= \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1},J=i) - H(Z_i|Z^{i-1},W,S_{i+1}^N,A^N,J=i) - H(S_i|A_i,J=i) + H(S_i|S_{i+1}^N,A^N,W,Y^{i-1},A_i,J=i)\big)\\
&= H(Z_J|Z^{J-1},J) - H(Z_J|Z^{J-1},W,S_{J+1}^N,A^N,J) - H(S_J|A_J,J) + H(S_J|S_{J+1}^N,A^N,W,Y^{J-1},A_J,J)\\
&= H(Z|V) - H(Z|K,V) - H(S|A) + H(S|U,A)\\
&= I(Z;K|V) - I(U;S|A)
\end{aligned} \qquad (A29)$$
Substituting (A28) and (A29) into (A16) and (A17), Lemma 1 is proved.
In addition, by using the definitions of $U$, $K$, $V$, $Y$ and $Z$ (see (A24), (A25), (A26) and (A27), and note that $V$ is a part of $K$), and observing that $Z^{J-1} \to (Y^{J-1}, W, S_{J+1}^N, A^N, J) \to Y_J \to Z_J$ is a Markov chain, it is easy to check that the Markov chain $V \to K \to U \to Y \to Z$ holds.
The proof of Theorem 2 is completed.

C. Size Constraint of The Random Variables in Theorem 1

By using the support lemma (see [13], p. 310), it suffices to show that the random variable $U$ can be replaced by a new one, preserving the Markov chain $(U,A) \to (X,S) \to Y \to Z$ and the quantities $I(U;Z)$, $I(U;Y)$ and $I(U;S|A)$, such that the range of the new $U$ satisfies $\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 2$. The proof is in the remainder of this section.
Let
$$\bar{p} = p_{XSA}(x,s,a)$$
Define the following continuous scalar functions of $\bar{p}$:
$$f_{XSA}(\bar{p}) = p_{XSA}(x,s,a),\quad f_Y(\bar{p}) = H(Y),\quad f_Z(\bar{p}) = H(Z),\quad f_{S|A}(\bar{p}) = H(S|A)$$
Since only $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| - 1$ of the functions $f_{XSA}(\bar{p})$ need to be considered, the total number of continuous scalar functions of $\bar{p}$ is $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 2$.
Let $\bar{p}_{XSA|U} = \Pr\{X=x, S=s, A=a | U=u\}$. With these distributions, we have
$$p_{XSA}(x,s,a) = \sum_{u\in\mathcal{U}} p(U=u)\, f_{XSA}(\bar{p}_{XSA|U}) \qquad (A31)$$
$$I(U;Z) = f_Z(\bar{p}) - \sum_{u\in\mathcal{U}} p(U=u)\, f_Z(\bar{p}_{XSA|U}) \qquad (A32)$$
$$I(U;S|A) = f_{S|A}(\bar{p}) - \sum_{u\in\mathcal{U}} p(U=u)\, f_{S|A}(\bar{p}_{XSA|U}) \qquad (A33)$$
$$H(Y|U) = \sum_{u\in\mathcal{U}} p(U=u)\, f_Y(\bar{p}_{XSA|U}) \qquad (A34)$$
According to the support lemma ([13], p. 310), the random variable $U$ can be replaced by a new one such that the new $U$ takes at most $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 2$ different values and the expressions (A31), (A32), (A33) and (A34) are preserved.

D. Size Constraint of The Random Variables in Theorem 2

By using the support lemma (see [13], p. 310), it suffices to show that the random variables $U$, $V$ and $K$ can be replaced by new ones, preserving the Markov chains $(U,K,A,V) \to (X,S) \to Y \to Z$ and $V \to K \to Y \to Z$ and the quantities $I(U;Y)$, $I(K;Z|V)$ and $I(U;S|A)$, such that the ranges of the new $U$, $V$ and $K$ satisfy $\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 1$, $\|\mathcal{V}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$ and $\|\mathcal{K}\| \le \|\mathcal{X}\|^2\|\mathcal{S}\|^2\|\mathcal{A}\|^2$. The proof is in the remainder of this section.
  • (Proof of $\|\mathcal{U}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 1$)
    Let
    $$\bar{p} = p_{XSA}(x,s,a)$$
    Define the following continuous scalar functions of $\bar{p}$:
    $$f_{XSA}(\bar{p}) = p_{XSA}(x,s,a),\quad f_Y(\bar{p}) = H(Y),\quad f_{S|A}(\bar{p}) = H(S|A)$$
    Since only $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| - 1$ of the functions $f_{XSA}(\bar{p})$ need to be considered, the total number of continuous scalar functions of $\bar{p}$ is $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 1$.
    Let $\bar{p}_{XSA|U} = \Pr\{X=x, S=s, A=a | U=u\}$. With these distributions, we have
    $$p_{XSA}(x,s,a) = \sum_{u\in\mathcal{U}} p(U=u)\, f_{XSA}(\bar{p}_{XSA|U}) \qquad (A36)$$
    $$I(U;S|A) = f_{S|A}(\bar{p}) - \sum_{u\in\mathcal{U}} p(U=u)\, f_{S|A}(\bar{p}_{XSA|U}) \qquad (A37)$$
    $$H(Y|U) = \sum_{u\in\mathcal{U}} p(U=u)\, f_Y(\bar{p}_{XSA|U}) \qquad (A38)$$
    According to the support lemma ([13], p. 310), the random variable $U$ can be replaced by a new one such that the new $U$ takes at most $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| + 1$ different values and the expressions (A36), (A37) and (A38) are preserved.
  • (Proof of $\|\mathcal{V}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$)
    Let
    $$\bar{p} = p_{XSA}(x,s,a)$$
    Define the following continuous scalar functions of $\bar{p}$:
    $$f_{XSA}(\bar{p}) = p_{XSA}(x,s,a),\quad f_Z(\bar{p}) = H(Z)$$
    Since only $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| - 1$ of the functions $f_{XSA}(\bar{p})$ need to be considered, the total number of continuous scalar functions of $\bar{p}$ is $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$.
    Let $\bar{p}_{XSA|V} = \Pr\{X=x, S=s, A=a | V=v\}$. With these distributions, we have
    $$p_{XSA}(x,s,a) = \sum_{v\in\mathcal{V}} p(V=v)\, f_{XSA}(\bar{p}_{XSA|V}) \qquad (A40)$$
    $$H(Z|V) = \sum_{v\in\mathcal{V}} p(V=v)\, f_Z(\bar{p}_{XSA|V}) \qquad (A41)$$
    According to the support lemma ([13], p. 310), the random variable $V$ can be replaced by a new one such that the new $V$ takes at most $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$ different values and the expressions (A40) and (A41) are preserved.
  • (Proof of $\|\mathcal{K}\| \le \|\mathcal{X}\|^2\|\mathcal{S}\|^2\|\mathcal{A}\|^2$)
    Once the alphabet of $V$ is fixed, we apply similar arguments to bound the alphabet of $K$, see the following. Define $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$ continuous scalar functions of $\bar{p}_{XSA}$:
    $$f_{XSA}(\bar{p}_{XSA}) = p_{XSA}(x,s,a),\quad f_Z(\bar{p}_{XSA}) = H(Z)$$
    where, of the functions $f_{XSA}(\bar{p}_{XSA})$, only $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\| - 1$ are to be considered.
    For every fixed $v$, let $\bar{p}_{XSA|K,V} = \Pr\{X=x, S=s, A=a | K=k, V=v\}$. With these distributions, we have
    $$\Pr\{X=x, S=s, A=a | V=v\} = \sum_{k\in\mathcal{K}} \Pr\{K=k|V=v\}\, f_{XSA}(\bar{p}_{XSA|K,V}) \qquad (A42)$$
    $$I(K;Z|V=v) = H(Z|V=v) - \sum_{k\in\mathcal{K}} \Pr\{K=k|V=v\}\, f_Z(\bar{p}_{XSA|K,V}) \qquad (A43)$$
    By the support lemma ([13], p. 310), for every fixed $v$ the size of the alphabet of the random variable $K$ need not be larger than $\|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$; combined with $\|\mathcal{V}\| \le \|\mathcal{X}\|\|\mathcal{S}\|\|\mathcal{A}\|$, this proves $\|\mathcal{K}\| \le \|\mathcal{X}\|^2\|\mathcal{S}\|^2\|\mathcal{A}\|^2$.

E. Proof of Theorem 3

In this section, we will show that any pair $(R, R_e) \in \mathcal{R}^{(ci)}$ is achievable. Wyner's random binning technique is used in the construction of the code-book.
Now the remainder of this section is organized as follows. The code construction is in Subsection E.1. The proof of achievability is given in Subsection E.2.

E.1. Code Construction

Since $R_e \le I(U;Y) - I(U;Z)$, $R_e \le H(A|Z)$ and $R_e \le R \le I(U;Y)$, it is sufficient to show that the pair $(R,\ R_e = \min\{I(U;Y) - I(U;Z),\ H(A|Z)\})$ is achievable; note that this implies $R \ge R_e = \min\{I(U;Y) - I(U;Z),\ H(A|Z)\}$.
The construction of the code and the proof of achievability are considered in two cases:
  • (Case 1) If $H(A|Z) \ge I(U;Y) - I(U;Z)$, Wyner's random binning technique [6] is used in the construction of the code-book.
  • (Case 2) If $H(A|Z) \le I(U;Y) - I(U;Z)$, Shannon's strategy [1] is used in the construction of the code-book.
  • (Code construction for case 1)
    Given a pair $(R, R_e)$, choose a joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$ such that
    $$0 \le R_e \le R$$
    $$R \le I(U;Y)$$
    $$R_e = I(U;Y) - I(U;Z)$$
    The message set $\mathcal{W}$ satisfies the following condition:
    $$\lim_{N\to\infty}\frac{1}{N}\log\|\mathcal{W}\| = R = I(U;Y) - \gamma \qquad (A44)$$
    where $\gamma$ is a fixed positive real number and
    $$0 \le \gamma \stackrel{(a)}{\le} I(U;Z) \qquad (A45)$$
    Note that (a) is from $R \ge R_e = I(U;Y) - I(U;Z)$ and (A44). Let $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$.
    Code-book generation:
    (Construction of $A^N$)
    Generate $2^{NR}$ i.i.d. sequences $a^N$ according to the probability mass function $p_A(a)$, and index each sequence by $i \in \{1, 2, \ldots, 2^{NR}\}$. For a given message $w \in \mathcal{W}$, choose the corresponding $a^N(w)$ as the output of the action encoder.
    (Construction of $U^N$)
    For the transmitted action sequence $a^N(w)$, generate $2^{N(I(U;Y) - \epsilon_{2,N})}$ (where $\epsilon_{2,N} \to 0$ as $N \to \infty$) i.i.d. sequences $u^N$ according to the probability mass function $p_{U|A}(u_i|a_i(w))$. Distribute these sequences at random into $2^{NR} = 2^{N(I(U;Y) - \gamma)}$ bins such that each bin contains $2^{N(\gamma - \epsilon_{2,N})}$ sequences, and index each bin by $i \in \{1, 2, \ldots, 2^{NR}\}$.
    Here note that the exponent of the number of sequences in every bin is upper bounded as follows:
    $$\gamma - \epsilon_{2,N} \stackrel{(a)}{\le} I(U;Z) - \epsilon_{2,N} \qquad (A46)$$
    where (a) is from (A45). This implies that
    $$\lim_{N\to\infty} H(U^N|W, Z^N) = 0 \qquad (A47)$$
    Note that (A47) can be proved by using Fano's inequality and (A46).
    For a given message $w \in \mathcal{W}$, randomly choose a sequence $u^N(w, i^*)$ in bin $w$ as the realization of $U^N$.
    Let $s^N$ be the state sequence generated in response to the action sequence $a^N(w)$.
    (Construction of $X^N$) The $x^N$ is generated according to a new discrete memoryless channel (DMC) with inputs $u^N$, $s^N$ and output $x^N$. The transition probability of this new DMC is $p_{X|U,S}(x|u,s)$, which is obtained from the joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$. The probability $p_{X^N|U^N,S^N}(x^N|u^N,s^N)$ is calculated as follows:
    $$p_{X^N|U^N,S^N}(x^N|u^N,s^N) = \prod_{i=1}^{N} p_{X|U,S}(x_i|u_i,s_i)$$
    Decoding:
    Given a vector $y^N \in \mathcal{Y}^N$, try to find a sequence $u^N(\hat{w}, \hat{i})$ such that $(u^N(\hat{w}, \hat{i}), a^N(\hat{w}), y^N) \in T^N_{UAY}(\epsilon_3)$. If all such sequences share the same message index $\hat{w}$, output the corresponding $\hat{w}$; otherwise, i.e., if no such sequence exists or such sequences have different message indices, declare a decoding error.
  • (Code construction for case 2)
    Given a pair $(R, R_e)$, choose a joint probability mass function $p_{UASXYZ}(u,a,s,x,y,z)$ such that
    $$0 \le R_e \le R$$
    $$R \le I(U;Y)$$
    $$R_e = H(A|Z)$$
    The message set $\mathcal{W}$ satisfies the following condition:
    $$\lim_{N\to\infty}\frac{1}{N}\log\|\mathcal{W}\| = R = I(U;Y) - \gamma_1 \qquad (A49)$$
    where $\gamma_1$ is a fixed positive real number and
    $$0 \le \gamma_1 \stackrel{(b)}{\le} I(U;Y) - H(A|Z)$$
    Note that (b) is from $R \ge R_e = H(A|Z)$ and (A49). Let $\mathcal{W} = \{1, 2, \ldots, 2^{NR}\}$.
    Code-book generation:
    (Construction of $A^N$)
    Generate $2^{NR}$ i.i.d. sequences $a^N$ according to the probability mass function $p_A(a)$, and index each sequence by $i \in \{1, 2, \ldots, 2^{NR}\}$. For a given message $w \in \mathcal{W}$, choose the corresponding $a^N(w)$ as the output of the action encoder.
    (Construction of $U^N$)
    For the transmitted action sequence $a^N(w)$, generate $2^{NR}$ i.i.d. sequences $u^N$ according to the probability mass function $p_{U|A}(u_i|a_i(w))$, and index each $u^N$ by $i \in \{1, 2, \ldots, 2^{NR}\}$.
    For a given message $w \in \mathcal{W}$, choose the sequence $u^N(w)$ as the realization of $U^N$.
    Let $s^N$ be the state sequence generated in response to the action sequence $a^N(w)$.
    (Construction of $X^N$) The $x^N$ is generated in the same way as for case 1, and the description is omitted here.
    Decoding:
    Given a vector $y^N \in \mathcal{Y}^N$, try to find a sequence $u^N(\hat{w})$ such that $(u^N(\hat{w}), a^N(\hat{w}), y^N) \in T^N_{UAY}(\epsilon_3)$. If all such sequences share the same message index $\hat{w}$, output the corresponding $\hat{w}$; otherwise, i.e., if no such sequence exists or such sequences have different message indices, declare a decoding error.

E.2. Proof of Achievability

By using the above definitions, it is easy to verify that $\lim_{N\to\infty}\frac{\log\|\mathcal{W}\|}{N} = R$.
Then, for the two cases, note that the above encoding and decoding schemes are similar to those used in [5]. Hence, by similar arguments as in [5], it is easy to show that $P_e \le \epsilon$ for both cases, and the proof is omitted here. It remains to show that $\lim_{N\to\infty}\Delta \ge R_e$ for the two cases, see the following.
Proof of $\lim_{N\to\infty}\Delta \ge R_e$ for case 1:
$$\begin{aligned}
\lim_{N\to\infty}\Delta &= \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) = \lim_{N\to\infty}\frac{1}{N}\big(H(W,Z^N) - H(Z^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(H(W,Z^N,U^N) - H(U^N|Z^N,W) - H(Z^N)\big)\\
&\stackrel{(a)}{=} \lim_{N\to\infty}\frac{1}{N}\big(H(Z^N|U^N) + H(U^N,W) - H(U^N|Z^N,W) - H(Z^N)\big)\\
&\stackrel{(b)}{=} \lim_{N\to\infty}\frac{1}{N}\big(H(Z^N|U^N) + H(U^N) - H(U^N|Z^N,W) - H(Z^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - H(U^N|Z^N,W) - I(Z^N;U^N)\big)\\
&\ge \lim_{N\to\infty}\frac{1}{N}\big(H(U^N) - H(U^N|Y^N) - H(U^N|Z^N,W) - I(Z^N;U^N)\big)\\
&= \lim_{N\to\infty}\frac{1}{N}\big(I(U^N;Y^N) - H(U^N|Z^N,W) - I(Z^N;U^N)\big)\\
&\stackrel{(c)}{=} \lim_{N\to\infty}\frac{1}{N}\big(N I(U;Y) - N I(U;Z)\big) = I(U;Y) - I(U;Z) = R_e
\end{aligned}$$
where (a) is from $W \to U^N \to Z^N$, (b) is from $H(W|U^N) = 0$, and (c) is from the fact that $S^N$, $U^N$ and $X^N$ are i.i.d. generated random vectors, the channels are discrete memoryless, and (A47).
Thus, $\lim_{N\to\infty}\Delta \ge R_e$ for case 1 is proved.
Proof of $\lim_{N\to\infty}\Delta \ge R_e$ for case 2:
$$\lim_{N\to\infty}\Delta = \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) \stackrel{(1)}{=} \lim_{N\to\infty}\frac{1}{N}H(A^N|Z^N) \stackrel{(2)}{=} \lim_{N\to\infty}\frac{1}{N}\big(N H(A|Z)\big) = H(A|Z) = R_e$$
where (1) is from the fact that $A^N$ is a function of $W$, and (2) is from the fact that $A^N$ and $X^N$ are i.i.d. generated random vectors and the channels are discrete memoryless.
Thus, $\lim_{N\to\infty}\Delta \ge R_e$ for case 2 is proved.
The proof of Theorem 3 is completed.

F. Proof of Theorem 4

In this section, we prove Theorem 4: all achievable $(R, R_e)$ pairs are contained in the set $\mathcal{R}^{(co)}$. Suppose $(R, R_e)$ is achievable, i.e., for any given $\epsilon > 0$, there exists a channel encoder-decoder $(N, \Delta, P_e)$ such that
$$\lim_{N\to\infty}\frac{\log\|\mathcal{W}\|}{N} = R,\quad \lim_{N\to\infty}\Delta \ge R_e,\quad P_e \le \epsilon$$
Then we will show the existence of random variables $(A,U,K,V) \to (X,S) \to Y \to Z$ such that
$$0 \le R_e \le R \qquad (A53)$$
$$R \le I(U;Y) \qquad (A54)$$
$$R_e \le I(U;Y) - I(K;Z|V) \qquad (A55)$$
Since $W$ is uniformly distributed over $\mathcal{W}$, we have $H(W) = \log\|\mathcal{W}\|$. The formulas (A54) and (A55) are proved by Lemma 2, see the following.
Lemma 2
The random vectors $Y^N$, $Z^N$ and the random variables $W$, $V$, $U$, $K$, $A$, $Y$, $Z$ of Theorem 4 satisfy:
$$\frac{1}{N}H(W) \le I(U;Y) + \frac{1}{N}\delta(P_e) \qquad (A56)$$
$$\frac{1}{N}H(W|Z^N) \le I(U;Y) - I(K;Z|V) + \frac{1}{N}\delta(P_e) \qquad (A57)$$
where $\delta(P_e) = h(P_e) + P_e\log(\|\mathcal{W}\| - 1)$. Note that $h(P_e) = -P_e\log P_e - (1-P_e)\log(1-P_e)$.
Substituting $H(W) = \log\|\mathcal{W}\|$ and (5) into (A56) and (A57), and using the fact that $\epsilon \to 0$, the formulas (A54) and (A55) are obtained. The formula (A53) follows from
$$R_e \le \lim_{N\to\infty}\Delta = \lim_{N\to\infty}\frac{1}{N}H(W|Z^N) \le \lim_{N\to\infty}\frac{1}{N}H(W) = R$$
It remains to prove Lemma 2, see the following.
Proof 5 (Proof of Lemma 2)
The formula (A56) follows from (A58), (A60) and (A66). The formula (A57) is from (A58), (A59), (A60), (A61), (A66) and (A67).
<Part i> We begin with the left-hand sides of the inequalities (A56) and (A57), see the following.
Since $W \to Y^N \to Z^N$ is a Markov chain, for the message $W$ we have
$$\frac{1}{N}H(W) = \frac{1}{N}H(W|Y^N) + \frac{1}{N}I(Y^N;W) \stackrel{(a)}{\le} \frac{1}{N}\delta(P_e) + \frac{1}{N}I(Y^N;W) \qquad (A58)$$
For the equivocation to the wiretapper, we have
$$\frac{1}{N}H(W|Z^N) = \frac{1}{N}\big(H(W) - I(W;Z^N)\big) = \frac{1}{N}\big(I(W;Y^N) + H(W|Y^N) - I(W;Z^N)\big) \stackrel{(b)}{\le} \frac{1}{N}\big(I(W;Y^N) - I(W;Z^N) + \delta(P_e)\big) \qquad (A59)$$
Note that (a) and (b) follow from Fano’s inequality.
<Part ii> By using the chain rule, the quantity $I(Y^N;W)$ in formulas (A58) and (A59) can be bounded as follows:
$$\frac{1}{N}I(Y^N;W) = \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W|Y^{i-1}) \le \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W,Y^{i-1}) \le \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W,Y^{i-1},S^{i-1}) \qquad (A60)$$
<Part iii> Similar to (A60), the quantity $I(W;Z^N)$ in formula (A59) can be rewritten as follows:
$$\frac{1}{N}I(Z^N;W) = \frac{1}{N}\sum_{i=1}^{N} I(Z_i;W|Z^{i-1}) = \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W)\big) \qquad (A61)$$
<Part iv> (single letter) To complete the proof, we introduce a random variable J, which is independent of W, A N , X N , S N , Y N and Z N . Furthermore, J is uniformly distributed over { 1 , 2 , . . . , N } . Define
$$U = (W, Y^{J-1}, S^{J-1}, J) \qquad (A62)$$
$$K = (W, Z^{J-1}, J) \qquad (A63)$$
$$V = (Z^{J-1}, J) \qquad (A64)$$
$$X = X_J,\quad Y = Y_J,\quad Z = Z_J,\quad S = S_J,\quad A = A_J \qquad (A65)$$
<Part v> Then (A60) can be rewritten as
$$\frac{1}{N}I(W;Y^N) \le \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W,Y^{i-1},S^{i-1}) = \frac{1}{N}\sum_{i=1}^{N} I(Y_i;W,Y^{i-1},S^{i-1}|J=i) = I(Y_J;W,Y^{J-1},S^{J-1}|J) \le I(Y_J;W,Y^{J-1},S^{J-1},J) = I(U;Y) \qquad (A66)$$
Analogously, (A61) is rewritten as follows:
$$\frac{1}{N}I(Z^N;W) = \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1}) - H(Z_i|Z^{i-1},W)\big) = \frac{1}{N}\sum_{i=1}^{N}\big(H(Z_i|Z^{i-1},J=i) - H(Z_i|Z^{i-1},W,J=i)\big) = H(Z_J|Z^{J-1},J) - H(Z_J|Z^{J-1},W,J) = H(Z|V) - H(Z|K,V) = I(Z;K|V) \qquad (A67)$$
Substituting (A66) and (A67) into (A58) and (A59), Lemma 2 is proved.
In addition, by using the definitions of $U$, $K$, $V$, $Y$ and $Z$ (see (A62), (A63), (A64) and (A65), and note that $V$ is a part of $K$), and observing that $Z^{J-1} \to (Y^{J-1}, W, S^{J-1}, J) \to Y_J \to Z_J$ and $(W, Y^{J-1}, S^{J-1}, J) \to A_J \to S_J$ are two Markov chains, it is easy to check that the Markov chains $V \to K \to U \to Y \to Z$ and $U \to A \to S$ hold.
The proof of Theorem 4 is completed.
