Amplitude Constrained Vector Gaussian Wiretap Channel: Properties of the Secrecy-Capacity-Achieving Input Distribution

Antonino Favano; Luca Barletta; Alex Dytso

doi:10.3390/e25050741

Abstract

This paper studies the secrecy capacity of an n-dimensional Gaussian wiretap channel under a peak power constraint. This work determines the largest peak power constraint

{\bar{R}}_{n}

, such that an input distribution uniformly distributed on a single sphere is optimal; this regime is termed the low-amplitude regime. The asymptotic value of

{\bar{R}}_{n}

as n goes to infinity is completely characterized as a function of noise variance at both receivers. Moreover, the secrecy capacity is also characterized in a form amenable to computation. Several numerical examples are provided, such as the example of the secrecy-capacity-achieving distribution beyond the low-amplitude regime. Furthermore, for the scalar case

(n = 1)

, we show that the secrecy-capacity-achieving input distribution is discrete with finitely many mass points, whose number is at most of the order of

\frac{R^{2}}{σ_{1}^{2}}

, where

σ_{1}^{2}

is the variance of the Gaussian noise over the legitimate channel.

Keywords:

wiretap channel; MIMO; amplitude constraints

1. Introduction

Consider the vector Gaussian wiretap channel with outputs

\begin{matrix} Y_{1} & = X + N_{1}, \end{matrix}

(1a)

\begin{matrix} Y_{2} & = X + N_{2}, \end{matrix}

(1b)

where

X \in R^{n}

,

N_{1} \sim N (0_{n}, σ_{1}^{2} I_{n})

and

N_{2} \sim N (0_{n}, σ_{2}^{2} I_{n})

, and with

(X, N_{1}, N_{2})

being mutually independent. The output

Y_{1}

is observed by the legitimate receiver, whereas the output

Y_{2}

is observed by the malicious receiver. In this work, we are interested in the scenario where the input

X

is limited by a peak power constraint or amplitude constraint, and assume that

X \in B_{0} (R) = {x : ∥ x ∥ \leq R}

, i.e.,

B_{0} (R)

is an n-ball centered at the origin and of radius

R

. For this setting, the secrecy capacity is given by

\begin{matrix} C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) & = max_{X \in B_{0} (R)} I (X; Y_{1}) - I (X; Y_{2}) \end{matrix}

(2)

\begin{matrix} = max_{X \in B_{0} (R)} I (X; Y_{1} | Y_{2}), \end{matrix}

(3)

where the last expression holds due to the (stochastically) degraded nature of the channel. It can be shown that for

σ_{1}^{2} \geq σ_{2}^{2}

the secrecy capacity is equal to zero. Therefore, in the remainder, we assume that

σ_{1}^{2} < σ_{2}^{2}

.

We are interested in studying the input distribution

P_{X^{★}}

that maximizes (3) in the low (but not vanishing) amplitude regime. Since closed-form expressions for secrecy capacity are rare, we derive the secrecy capacity in an integral form that is easy to evaluate. For the scalar case

(n = 1)

, we establish an upper bound on the number of mass points of

P_{X^{★}}

, valid for any amplitude regime. We also argue in Section 2.3 that the solution to the secrecy capacity can shed light on other problems seemingly unrelated to security. The paper also provides a number of numerical simulations of

P_{X^{★}}

and

C_{s}

, the data for which are made available at [1].

1.1. Literature Review

The wiretap channel was introduced by Wyner in [2], who also established the secrecy capacity of the degraded wiretap channel. The results of [2] were extended to the Gaussian wiretap channel in [3]. The wiretap channel plays a central role in network information theory; the interested reader is referred to [4,5,6,7,8] and references therein for a detailed treatment of the topic. Furthermore, for an in-depth discussion on the wiretap fading channel, refer to [9,10,11,12].

In [3], it was shown that the secrecy-capacity-achieving input distribution of the Gaussian wiretap channel, under an average power constraint, is Gaussian. In [13], the authors investigated the Gaussian wiretap channel consisting of two antennas, both at the transmitter and receiver sides, and of a single antenna for the eavesdropper. The secrecy capacity of the MIMO wiretap channel was characterized in [14,15], where the Gaussian input was shown to be optimal. An elegant proof, using the I-MMSE relationship [16], of the optimality of Gaussian input, is given in [17]. Moreover, an alternative approach in the characterization of the secrecy capacity of a MIMO wiretap channel was proposed in [18]. In [19,20], the authors discuss the optimal signaling for secrecy rate maximization under average power constraints.

The secrecy capacity of the Gaussian wiretap channel under the peak power constraint has received far less attention. The secrecy capacity of the scalar Gaussian wiretap channel with an amplitude and power constraint was considered in [21], where the authors showed that the capacity-achieving input distribution

P_{X^{★}}

is discrete with finitely many support points.

The work of [21] was extended to noise-dependent channels by Soltani and Rezki in [22]. For further studies on the properties of the secrecy-capacity-achieving input distribution for a class of degraded wiretap channels, refer to [23,24,25].

The secrecy capacity for the vector wiretap channel with a peak power constraint was considered in [25], where it was shown that the optimal input distribution is concentrated on finitely many concentric shells.

1.2. Contributions and Paper Outline

In Section 2, we introduce the mathematical tools, assumptions, and definitions used throughout the paper. Specifically, in Section 2.1, we introduce the oscillation theorem. In Section 2.2, we give a definition of low-amplitude regimes. Moreover, in Section 2.3, we show how the wiretap channel can be seen as a generalization of point-to-point channels and the evaluation of the largest minimum mean square error (MMSE), both under the assumption of amplitude-constrained input. In Section 2.4, we provide a definition of the Karush–Kuhn–Tucker (KKT) conditions for the wiretap channel.

In Section 3, we detail our main results. Theorem 2 provides a sufficient condition for the optimality of a single hypersphere. Theorem 3 and Theorem 4 give the conditions under which we can fully characterize the behavior of

{\bar{R}}_{n}

, that is, the radius below which we are in the low-amplitude regime, i.e., the optimal input distribution is composed of a single shell. Furthermore, Theorem 5 gives an implicit and an explicit upper bound on the number of mass points of the secrecy-capacity-achieving input distribution when

n = 1

.

In Section 4, we derive the secrecy capacity expression for the low-amplitude regime in Theorem 6. We also investigate its behavior when the number of antennas n goes to infinity.

Section 5 extends the investigation of the secrecy capacity beyond the low-amplitude regime. We numerically estimate both the optimal input pmf and the resulting capacity via an algorithmic procedure based on the KKT conditions introduced in Lemma 2.

Sections 6, 7, 8, and 9 provide the proofs of Theorems 3, 4, 5, and 6, respectively. Finally, Section 10 concludes the paper.

1.3. Notation

We use bold letters for vectors (

x

) and uppercase letters for random variables (X). We denote by

∥ x ∥

the Euclidean norm of the vector

x

. Given a vector

x \in R^{n}

and a scalar a, with a slight abuse of notation, we denote

∥ a \cdot e_{1} + x ∥

by

∥ a + x ∥

, where

e_{1} = [1, 0, \dots, 0]

is the first vector in the standard basis of the Euclidean vector space

R^{n}

. Given a random variable X, its probability density function (pdf), pmf, and cumulative distribution function are denoted by

f_{X}

,

P_{X}

, and

F_{X}

, respectively. The support set of

P_{X}

is denoted and defined as

\begin{matrix} supp (P_{X}) & = {x : for every open set D ∋ x we have that P_{X} (D) > 0} . \end{matrix}

(4)

We denote by

N (μ, Σ)

a multivariate Gaussian distribution with mean vector

μ

and covariance matrix

Σ

. The pdf of a Gaussian random variable with zero mean and variance

σ^{2}

is denoted by

ϕ_{σ} (\cdot)

. We denote by

χ_{n}^{2} (λ)

the noncentral chi-square distribution with n degrees of freedom and with noncentrality parameter

λ

. We represent the

n \times 1

vector of zeros by

0_{n}

and the

n \times n

identity matrix by

I_{n}

. Furthermore, we represent by

D

the relative entropy. The minimum mean squared error is denoted by

\begin{matrix} mmse (X | X + N) = E [{∥ X - E [X | X + N] ∥}^{2}] . \end{matrix}

(5)

The modified Bessel function of the first kind of order

v \geq 0

is denoted by

I_{v} (x), x \in R

. The following ratio of the Bessel functions is commonly used in this work:

h_{v} (x) = \frac{I_{v} (x)}{I_{v - 1} (x)}, x \in R, v \geq 0 .

(6)

Finally, the number of zeros (counted in accordance with their multiplicities) of a function

f : R \to R

on the interval

I

is denoted by

N (I, f)

. Similarly, if

f : C \to C

is a function on the complex domain,

N (D, f)

denotes the number of its zeros within the region

D

.
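As an illustration of how the ratio in (6) can be evaluated without forming either Bessel function explicitly, the following sketch uses the standard three-term recurrence of the modified Bessel functions, rewritten as a downward recursion on the ratio itself. This is an assumption-laden alternative to library routines with exponential scaling, not the implementation used for the results in this paper; the function name `bessel_ratio` and the parameter `extra_terms` are hypothetical.

```python
import math


def bessel_ratio(v: float, x: float, extra_terms: int = 60) -> float:
    """Approximate h_v(x) = I_v(x) / I_{v-1}(x) for v > 0 and x >= 0.

    The recurrence I_{v-1}(x) = I_{v+1}(x) + (2 v / x) I_v(x) gives
    h_v(x) = 1 / (2 v / x + h_{v+1}(x)), which is iterated downward
    starting from a high order where the ratio is close to zero.
    """
    if v <= 0 or x < 0:
        raise ValueError("this sketch assumes v > 0 and x >= 0")
    if x == 0.0:
        return 0.0
    depth = int(x) + extra_terms   # start well above the argument x
    h = 0.0                        # crude seed for h_{v+depth}(x)
    for k in range(depth - 1, -1, -1):
        # Each step maps h_{v+k+1}(x) to h_{v+k}(x); the seed error shrinks.
        h = 1.0 / (2.0 * (v + k) / x + h)
    return h


if __name__ == "__main__":
    # For v = 1/2 the ratio reduces to tanh(x), which gives a simple check.
    print(bessel_ratio(0.5, 2.0), math.tanh(2.0))
```

Because only the ratio is propagated, no intermediate quantity grows exponentially in x, which is the issue that exponential scaling addresses in library implementations.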

2. Preliminaries

2.1. Oscillation Theorem

In this work, we often need to upper bound the number of oscillations of a function, i.e., its number of sign changes. This is useful, for example, to bound the number of zeros of a function or the number of roots of an equation. To be more precise, let us define the number of sign changes as follows.

Definition 1

(Sign Changes of a Function). The number of sign changes of a function

ξ : Ω \to R

is given by

S (ξ) = sup_{m \in N} \{sup_{y_{1} < \dots < y_{m} \subseteq Ω} N {ξ (y_{i})}_{i = 1}^{m}\},

(7)

where

N {ξ (y_{i})}_{i = 1}^{m}

is the number of sign changes of the sequence

{ξ (y_{i})}_{i = 1}^{m}

.
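A minimal sketch of Definition 1 for a finite sample of points is given next. It assumes the usual convention that exact zeros in the sequence are skipped, which the definition above does not spell out, and it only yields a lower estimate of the supremum in (7) because a finite grid can miss sign changes. The function names are hypothetical.

```python
def count_sign_changes(values):
    """Number of sign changes in a finite sequence, skipping exact zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def sign_changes_on_grid(xi, grid):
    """Lower estimate of S(xi): sign changes of xi sampled on an increasing grid."""
    return count_sign_changes([xi(y) for y in sorted(grid)])


if __name__ == "__main__":
    grid = [-2.0 + 0.25 * k for k in range(17)]
    # y^3 - y is negative, positive, negative, positive across this grid.
    print(sign_changes_on_grid(lambda y: y ** 3 - y, grid))
```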

Definition 2

(Totally Positive Kernel). A function

f : I_{1} \times I_{2} \to R

is said to be a totally positive kernel of order n if

det ({[f (x_{i}, y_{j})]}_{i, j = 1}^{m}) > 0

for all

1 \leq m \leq n

, for all

x_{1} < \dots < x_{m} \in I_{1}

, and

y_{1} < \dots < y_{m} \in I_{2}

. If f is a totally positive kernel of order n for all

n \in N

, then f is a strictly totally positive kernel.

In [26], Karlin noticed that some integral transformations have a variation-diminishing property, which is described in the following theorem.

Theorem 1

(Oscillation Theorem). Given domains

I_{1}

and

I_{2}

, let

p : I_{1} \times I_{2} \to R

be a strictly totally positive kernel. For an arbitrary y, suppose

p (\cdot, y) : I_{1} \to R

is an n-times differentiable function. Assume that μ is a measure on

I_{2}

, and let

ξ : I_{2} \to R

be a function with

S (ξ) = n

. For

x \in I_{1}

, define

Ξ (x) = \int ξ (y) p (x, y) d μ (y) .

(8)

If

Ξ : I_{1} \to R

is an n-times differentiable function, then either

N (I_{1}, Ξ) \leq n

, or

Ξ \equiv 0

.

The above theorem says that the number of zeros of a function

Ξ

, which is the output of the integral transformation, is at most the number of sign changes of the function

ξ

, which is the input to the integral transformation.

2.2. Low-Amplitude Regime

In this work, a low-amplitude regime is defined as follows.

Definition 3.

Let

X_{R} \sim P_{X_{R}}

be uniform on

C (R) = {x : ∥ x ∥ = R}

. The capacity in (3) is said to be in the low-amplitude regime if

R \leq {\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

, where

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2}) = max \{R : P_{X_{R}} = arg max_{P_{X} : X \in B_{0} (R)} I (X; Y_{1} | Y_{2})\} .

(9)

If the set in (9) is empty, then we assign

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2}) = 0

.

The quantity

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

represents the largest radius

R

, for which

P_{X_{R}}

is secrecy-capacity-achieving.

One of the main objectives of this work is to characterize

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

.

2.3. Connections to Other Optimization Problems

The distribution

P_{X_{R}}

occurs in a variety of statistical and information-theoretic applications. For example, consider the following two optimization problems:

\begin{matrix} max_{P_{X} : X \in B_{0} (R)} & I (X; X + N), \end{matrix}

(10)

\begin{matrix} max_{P_{X} : X \in B_{0} (R)} & mmse (X | X + N), \end{matrix}

(11)

where

N \sim N (0_{n}, σ^{2} I_{n})

. The first problem seeks to characterize the capacity of the point-to-point channel under an amplitude constraint, and the second problem seeks to find the largest minimum mean squared error under the assumption that the signal has bounded amplitude; the interested reader is referred to [27,28,29] for a detailed background on both problems.

Similarly to the wiretap channel, we can define the low-amplitude regime for both problems as the largest

R

such that

P_{X_{R}}

is optimal and denote these by

{\bar{R}}_{n}^{ptp} (σ^{2})

and

{\bar{R}}_{n}^{MMSE} (σ^{2})

. We now argue that both

{\bar{R}}_{n}^{ptp} (σ^{2})

and

{\bar{R}}_{n}^{MMSE} (σ^{2})

can be seen as a special case of the wiretap solution. Hence, the wiretap channel provides an interesting unification and generalization of these two problems.

First, note that the point-to-point solution can be recovered from the wiretap by simply specializing the wiretap channel to the point-to-point channel, that is,

\begin{matrix} {\bar{R}}_{n}^{ptp} (σ^{2}) = lim_{σ_{2} \to \infty} {\bar{R}}_{n} (σ^{2}, σ_{2}^{2}) . \end{matrix}

(12)

Second, to see that the MMSE solution can be recovered from the wiretap, recall that by the I-MMSE relationship [16] we have that

\begin{matrix} max_{P_{X} : X \in B_{0} (R)} I (X; Y_{1}) - I (X; Y_{2}) \end{matrix}

\begin{matrix} = max_{P_{X} : X \in B_{0} (R)} \frac{1}{2} \int_{σ_{1}^{2}}^{\infty} \frac{mmse (X | X + \sqrt{s} Z)}{s^{2}} d s - \frac{1}{2} \int_{σ_{2}^{2}}^{\infty} \frac{mmse (X | X + \sqrt{s} Z)}{s^{2}} d s \end{matrix}

(13)

\begin{matrix} = max_{P_{X} : X \in B_{0} (R)} \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{mmse (X | X + \sqrt{s} Z)}{s^{2}} d s \end{matrix}

(14)

where

Z

is standard Gaussian. Now, note that if we choose

σ_{2}^{2} = σ_{1}^{2} + ϵ

, then by the mean value theorem we arrive at

\begin{matrix} max_{P_{X} : X \in B_{0} (R)} I (X; Y_{1}) - I (X; Y_{2}) = max_{P_{X} : X \in B_{0} (R)} \frac{ϵ}{2} \frac{mmse (X | X + \sqrt{σ_{1}^{2}} Z)}{σ_{1}^{4}} + o (ϵ), \end{matrix}

(15)

where

{lim}_{ϵ \to 0^{+}} o (ϵ) / ϵ = 0

. Consequently, for a small enough

ϵ > 0

,

{\bar{R}}_{n}^{MMSE} (σ^{2}) = {\bar{R}}_{n} (σ^{2}, σ^{2} + ϵ) .

(16)

2.4. KKT Conditions

Let us define the secrecy density for the vector Gaussian wiretap channel as

\begin{matrix} Ξ (x; P_{X^{★}}) & = D (f_{Y_{1} | X} (\cdot | x) ∥ f_{Y_{1}^{★}}) - D (f_{Y_{2} | X} (\cdot | x) ∥ f_{Y_{2}^{★}}), \end{matrix}

(17)

where

D (\cdot ∥ \cdot)

is the relative entropy.

For the scalar case

(n = 1)

, the KKT conditions are necessary and sufficient to ensure that

P_{X^{★}}

is capacity-achieving [21].

Lemma 1.

P_{X^{★}}

maximizes (3) if, and only if,

\begin{matrix} Ξ (x) & = C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, 1), x \in supp (P_{X^{★}}), \end{matrix}

(18)

\begin{matrix} Ξ (x) & \leq C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, 1), x \in [- R, R], \end{matrix}

(19)

where for

x \in R

\begin{matrix} Ξ (x) & = D (f_{Y_{1} | X} (\cdot | x) ∥ f_{Y_{1}^{★}}) - D (f_{Y_{2} | X} (\cdot | x) ∥ f_{Y_{2}^{★}}) \end{matrix}

(20)

\begin{matrix} = E [g (Y_{1}) | X = x] + log (\frac{σ_{2}}{σ_{1}}), \end{matrix}

(21)

and where

\begin{matrix} g (y) = E [log \frac{f_{Y_{2}^{★}} (y + N)}{f_{Y_{1}^{★}} (y)}], y \in R, \end{matrix}

(22)

with

N \sim N (0, σ_{2}^{2} - σ_{1}^{2})

.

Proof.

The first part of Lemma 1 was shown in [21]. The proof of (21) goes as follows:

\begin{matrix} D (f_{Y_{1} | X} (\cdot | x) ∥ f_{Y_{1}^{★}}) - D (f_{Y_{2} | X} (\cdot | x) ∥ f_{Y_{2}^{★}}) - log (\frac{σ_{2}}{σ_{1}}) \end{matrix}

(23)

\begin{matrix} = \int_{- \infty}^{\infty} log \frac{1}{f_{Y_{1}^{★}} (y)} ϕ_{σ_{1}} (y - x) d y - \int_{- \infty}^{\infty} log \frac{1}{f_{Y_{2}^{★}} (y)} E [ϕ_{σ_{1}} (y - x - N)] d y \end{matrix}

(24)

\begin{matrix} = \int_{- \infty}^{\infty} log \frac{1}{f_{Y_{1}^{★}} (y)} ϕ_{σ_{1}} (y - x) d y - \int_{- \infty}^{\infty} E [log \frac{1}{f_{Y_{2}^{★}} (y + N)}] ϕ_{σ_{1}} (y - x) d y \end{matrix}

(25)

\begin{matrix} = \int_{- \infty}^{\infty} E [log \frac{f_{Y_{2}^{★}} (y + N)}{f_{Y_{1}^{★}} (y)}] ϕ_{σ_{1}} (y - x) d y \end{matrix}

(26)

\begin{matrix} = \int_{- \infty}^{\infty} g (y) ϕ_{σ_{1}} (y - x) d y, \end{matrix}

(27)

where

N \sim N (0, σ_{2}^{2} - σ_{1}^{2})

and (24) holds by noticing that

ϕ_{σ_{2}} (y - x)

can be reformulated as the convolution of Gaussian pdfs

E [ϕ_{σ_{1}} (y - x - N)]

; in (25) we applied the change of variable

y \mapsto y + N

. This concludes the proof. □
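The convolution identity used in step (24) can be checked numerically. The sketch below compares the Gaussian pdf with variance σ_2^2 against the average of the pdf with variance σ_1^2 shifted by N, using a composite Simpson rule; the integration range, the step count, and the function names are assumptions chosen for illustration only.

```python
import math


def gauss_pdf(t, var):
    """Zero-mean Gaussian pdf with variance var, evaluated at t."""
    return math.exp(-t * t / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def smoothed_pdf(t, var1, var2, steps=2000):
    """E[phi_{sigma_1}(t - N)] with N ~ N(0, var2 - var1), by composite Simpson."""
    dvar = var2 - var1
    half_width = 10.0 * math.sqrt(dvar)   # truncate N to ten standard deviations
    h = 2.0 * half_width / steps          # steps must be even
    total = 0.0
    for i in range(steps + 1):
        n = -half_width + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * gauss_pdf(t - n, var1) * gauss_pdf(n, dvar)
    return total * h / 3.0


if __name__ == "__main__":
    # Both numbers should agree: phi_{sigma_2}(t) versus the smoothed phi_{sigma_1}.
    print(gauss_pdf(0.7, 2.5), smoothed_pdf(0.7, 1.0, 2.5))
```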

The convexity of the optimization problem is also guaranteed for the vector wiretap model in (1) with

n > 1

. Then, the results of Lemma 1 can be extended to the vector case as follows.

Lemma 2.

P_{X^{★}}

maximizes (3) if, and only if,

\begin{matrix} Ξ (x; P_{X^{★}}) & = C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n), x \in supp (P_{X^{★}}), \end{matrix}

(28a)

\begin{matrix} Ξ (x; P_{X^{★}}) & \leq C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n), x \in B_{0} (R), \end{matrix}

(28b)

where for

x \in R^{n}

\begin{matrix} Ξ (x; P_{X^{★}}) & = D (f_{Y_{1} | X} (\cdot | x) ∥ f_{Y_{1}^{★}}) - D (f_{Y_{2} | X} (\cdot | x) ∥ f_{Y_{2}^{★}}) \end{matrix}

(29)

\begin{matrix} = E [g (Y_{1}) | X = x], \end{matrix}

(30)

and where

\begin{matrix} g (y) = E [log \frac{f_{Y_{2}^{★}} (y + N)}{f_{Y_{1}^{★}} (y)}] + n log (\frac{σ_{2}}{σ_{1}}), y \in R^{n}, \end{matrix}

(31)

with

N \sim N (0_{n}, (σ_{2}^{2} - σ_{1}^{2}) I_{n})

.

Proof.

This is a straightforward vector extension of Lemma 1. □

Thanks to the spherical symmetry of the additive noise distributions and of

P_{X}

, the secrecy density

Ξ (x; P_{X})

can be expressed as a function of

∥ x ∥

only. Therefore, we denote the secrecy density in spherical coordinates by

\tilde{Ξ} (∥ x ∥; P_{∥ X ∥})

, and give a rigorous definition in (A9).

3. Main Results

3.1. A New Sufficient Condition on the Optimality of $P_{X_{R}}$

Our first main result provides a sufficient condition for the optimality of

P_{X_{R}}

.

Theorem 2.

If

R < σ_{1}^{2} \sqrt{n (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}})},

(32)

then

P_{X_{R}}

is secrecy-capacity-achieving.

Proof.

Let us consider the equivalent definition of the secrecy density in spherical coordinates (A9). Note that if the derivative of

\tilde{Ξ} (∥ x ∥; P_{∥ X_{R} ∥})

makes at most one sign change, from negative to positive, then the maximum of

∥ x ∥ \mapsto \tilde{Ξ} (∥ x ∥; P_{∥ X_{R} ∥})

occurs at either

∥ x ∥ = 0

or

∥ x ∥ = R

.

From Lemma A1 in Appendix B, the derivative of

\tilde{Ξ}

is as given below

\begin{matrix} {\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥}) = ∥ x ∥ E [{\tilde{M}}_{2} (σ_{1} Q_{n + 2}) - M_{1} (σ_{1} Q_{n + 2})] \end{matrix}

(33)

where

Q_{n + 2}^{2}

is a noncentral chi-square random variable with

n + 2

degrees of freedom and noncentrality parameter

\frac{{∥ x ∥}^{2}}{σ_{1}^{2}}

, and

\begin{matrix} M_{i} (y) & = \frac{1}{σ_{i}^{2}} (\frac{R}{y} h_{\frac{n}{2}} (\frac{R}{σ_{i}^{2}} y) - 1), i \in {1, 2} \end{matrix}

(34)

\begin{matrix} {\tilde{M}}_{2} (y) & = E [M_{2} (∥ y + W ∥)], \end{matrix}

(35)

where

W \sim N (0_{n + 2}, (σ_{2}^{2} - σ_{1}^{2}) I_{n + 2})

. A calculation related to (33) was erroneously performed in [27]. However, this error does not change the results of [27] as only the sign of the derivative is important and not the value itself. Note that

{\tilde{Ξ}}^{'} (0; P_{∥ X_{R} ∥}) = 0

and that

{\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥}) > 0

for a sufficiently large

∥ x ∥

; in fact, we have

\begin{matrix} {\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥}) & > ∥ x ∥ (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{∥ x ∥}{σ_{1}^{2}} E [\frac{R}{σ_{1} Q_{n + 2}}] \end{matrix}

(36)

\begin{matrix} = ∥ x ∥ (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{∥ x ∥}{σ_{1}^{2}} E [\frac{R}{∥ x ∥} h_{\frac{n}{2}} (\frac{∥ x ∥}{σ_{1}} Q_{n})] \end{matrix}

(37)

\begin{matrix} \geq ∥ x ∥ (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{R}{σ_{1}^{2}}, \end{matrix}

(38)

where (36) follows from

0 \leq h_{\frac{n}{2}} (x) \leq 1

for

x \geq 0

; (37) follows by noticing that

\frac{R}{σ_{1} \sqrt{t}} f_{Q_{n + 2}^{2}} (t) = \frac{R}{∥ x ∥} h_{\frac{n}{2}} (\frac{∥ x ∥}{σ_{1}} \sqrt{t}) f_{Q_{n}^{2}} (t)

; and finally, (38) holds by

h_{\frac{n}{2}} (x) \leq 1

.

Then, to show that

\tilde{Ξ} (∥ x ∥; P_{∥ X_{R} ∥})

is maximized at

∥ x ∥ = R

, we need to prove that

{\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥})

changes sign at most once. To that end, we need Karlin’s oscillation theorem presented in Section 2.1. By using (33), the fact that the pdf of a noncentral chi-square distribution is a strictly totally positive kernel [26], and Theorem 1, the number of sign changes of

{\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥})

is upper-bounded by the number of sign changes of

G_{σ_{1}, σ_{2}, R, n} (y) = {\tilde{M}}_{2} (y) - M_{1} (y),

(39)

for

y \in R^{+}

. Note that

\begin{matrix} G_{σ_{1}, σ_{2}, R, n} (y) & \geq - \frac{1}{σ_{2}^{2}} + \frac{1}{σ_{1}^{2}} - \frac{R}{σ_{1}^{2} y} h_{\frac{n}{2}} (\frac{R}{σ_{1}^{2}} y) \end{matrix}

(40)

\begin{matrix} \geq - \frac{1}{σ_{2}^{2}} + \frac{1}{σ_{1}^{2}} - \frac{R^{2}}{σ_{1}^{4} n}, \end{matrix}

(41)

where the inequality in (40) follows from

h_{\frac{n}{2}} (x) \geq 0

for

x \geq 0

, and (41) follows from

h_{\frac{n}{2}} (x) \leq \frac{x}{n}

for

x \geq 0

and

n \in N

. We conclude by noting that (41) is nonnegative, hence has no sign change, for

R < σ_{1}^{2} \sqrt{n (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}})}

(42)

for all

y \in R^{+}

, thus guaranteeing that

P_{X_{R}}

is secrecy-capacity-achieving. □
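For a quick sense of scale, the threshold in (32) and the lower bound in (41) can be evaluated directly. The sketch below is only a calculator for these two closed-form expressions; the function names are hypothetical and the inputs are the noise variances σ_1^2 and σ_2^2, not the standard deviations.

```python
import math


def sufficient_radius(var1, var2, n):
    """Right-hand side of (32): sigma_1^2 * sqrt(n (1/sigma_1^2 - 1/sigma_2^2))."""
    if not (0.0 < var1 < var2) or n < 1:
        raise ValueError("this sketch assumes 0 < var1 < var2 and n >= 1")
    return var1 * math.sqrt(n * (1.0 / var1 - 1.0 / var2))


def g_lower_bound(R, var1, var2, n):
    """Right-hand side of (41), a lower bound on G that does not depend on y."""
    return -1.0 / var2 + 1.0 / var1 - R * R / (var1 * var1 * n)


if __name__ == "__main__":
    r_max = sufficient_radius(1.0, 2.0, 4)
    # Any radius below r_max keeps the bound in (41) positive.
    print(r_max, g_lower_bound(0.9 * r_max, 1.0, 2.0, 4))
```

The threshold scales as the square root of n, in line with the asymptotic growth stated in Theorem 4.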

Remark 1.

As a consequence of the proof of Theorem 2, for any

R \geq 0, σ_{2} \geq σ_{1} \geq 0

and

n \in N

, if

G_{σ_{1}, σ_{2}, R, n} (y)

has at most one sign change, then

P_{X_{R}}

is secrecy-capacity-achieving if, and only if, for all

∥ x ∥ = R

Ξ (0; P_{X_{R}}) \leq Ξ (x; P_{X_{R}}) .

(43)

Because of the difficulty in evaluating analytical properties of (39), proving that

G_{σ_{1}, σ_{2}, R, n}

has at most one sign change does not seem easy. However, in Appendix A, we show via extensive numerical evaluations that

G_{σ_{1}, σ_{2}, R, n}

changes sign at most once for any

n, R, σ_{1}, σ_{2}

that we tried.

3.2. Characterizing the Low-Amplitude Regime

Let us characterize the low-amplitude regime as follows.

Theorem 3.

Consider a function

\begin{matrix} f (R) & = \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{E [h_{\frac{n}{2}}^{2} (\frac{∥ \sqrt{s} Z ∥ R}{s}) + h_{\frac{n}{2}}^{2} (\frac{∥ R + \sqrt{s} Z ∥ R}{s})] - 1}{s^{2}} d s \end{matrix}

(44)

where

Z \sim N (0_{n}, I_{n})

. If

G_{σ_{1}, σ_{2}, R, n}

of (39) has at most one sign change, the input

X_{R}

is secrecy-capacity-achieving if, and only if,

R \leq {\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

, where

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

is given as the solution of

f (R) = 0 .

(45)

Remark 2.

Note that (45) always has a solution. To see this, observe that

f (0) = \frac{1}{σ_{2}^{2}} - \frac{1}{σ_{1}^{2}} < 0

and

f (\infty) = \frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}} > 0

. Moreover, the solution is unique because

f (R)

monotonically increases for

R \geq 0

.

The solution to (45) needs to be found numerically. To avoid any loss of accuracy in the numerical evaluation of

h_{v} (x)

for large values of x, we used the exponential scaling provided in the MATLAB implementation of

I_{v} (x)

. Since evaluating

f (R)

is rather straightforward and not time-consuming, we opted for a binary search algorithm.
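The binary search can be sketched as follows for the scalar case only, where the Bessel ratio of order 1/2 reduces to the hyperbolic tangent. This is a simplified illustration under several assumptions that are not part of the original procedure: n = 1, Gaussian expectations computed by a truncated Simpson rule, the substitution u = 1/s in the outer integral, and hypothetical function names, grid sizes, and tolerance. The root is the low-amplitude radius only under the sign-change condition of Theorem 3.

```python
import math


def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def gauss_mean(func, steps=160, z_max=8.0):
    """E[func(Z)] for Z ~ N(0, 1), truncated to [-z_max, z_max]."""
    value = simpson(lambda z: func(z) * math.exp(-0.5 * z * z), -z_max, z_max, steps)
    return value / math.sqrt(2.0 * math.pi)


def f_scalar(R, var1, var2, s_steps=100):
    """f(R) from (44) for n = 1, using h_{1/2}(x) = tanh(x)."""
    def inner(u):                       # u = 1/s, so ds / s^2 becomes du
        s = 1.0 / u
        root_s = math.sqrt(s)
        first = gauss_mean(lambda z: math.tanh(abs(root_s * z) * R / s) ** 2)
        second = gauss_mean(lambda z: math.tanh(abs(R + root_s * z) * R / s) ** 2)
        return first + second - 1.0
    return simpson(inner, 1.0 / var2, 1.0 / var1, s_steps)


def low_amplitude_radius(var1, var2, tol=1e-4):
    """Bisection for the root of f, relying on f(0) < 0 and monotonicity (Remark 2)."""
    lo, hi = 0.0, 1.0
    while f_scalar(hi, var1, var2) < 0.0:   # expand until the sign flips
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f_scalar(mid, var1, var2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(low_amplitude_radius(1.0, 1.001), low_amplitude_radius(1.0, 1000.0))
```

For n greater than 1 the expectations involve the norm of an n-dimensional Gaussian vector and the Bessel ratio of order n/2, which this sketch does not cover.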

In Table 1, we show the values of

{\bar{R}}_{n} (1, σ_{2}^{2})

for some values of

σ_{2}^{2}

and n. Moreover, we report the values of

{\bar{R}}_{n}^{ptp} (1)

and

{\bar{R}}_{n}^{MMSE} (1)

from [27] in the first and the last row, respectively. As predicted by (12), we can appreciate the close match of the

{\bar{R}}_{n}^{ptp} (1)

row with the one of

{\bar{R}}_{n} (1, 1000)

. Similarly, the agreement between the

{\bar{R}}_{n}^{MMSE} (1)

row and the

{\bar{R}}_{n} (1, 1.001)

row is justified by (16).

Table 1. Values of

{\bar{R}}_{n}^{MMSE} (1)

,

{\bar{R}}_{n} (1, σ_{2}^{2})

, and

{\bar{R}}_{n}^{ptp} (1)

.

3.3. Large n Asymptotics

We now use the result in Theorem 3 to characterize the asymptotic behavior of

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

. In particular, it is shown that

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

increases as

\sqrt{n}

.

Theorem 4.

For

σ_{1}^{2} \leq σ_{2}^{2}

lim_{n \to \infty} \frac{{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})}{\sqrt{n}} = c (σ_{1}^{2}, σ_{2}^{2}),

(46)

where

c = c (σ_{1}^{2}, σ_{2}^{2})

is the solution of

\int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{\frac{c^{2}}{{(\frac{\sqrt{s}}{2} + \sqrt{\frac{s}{4} + c^{2}})}^{2}} + \frac{c^{2} (c^{2} + s)}{{(\frac{s}{2} + \sqrt{\frac{s^{2}}{4} + c^{2} (c^{2} + s)})}^{2}} - 1}{s^{2}} d s = 0 .

(47)

Proof.

See Section 7. □
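Since the integrand in (47) is available in closed form, the constant c can be computed with a one-dimensional quadrature and a bisection. The sketch below assumes that the left-hand side of (47) is increasing in c (each of the two fractions increases with c), substitutes u = 1/s to integrate over a bounded interval, and uses hypothetical function names, step counts, and tolerance.

```python
import math


def numerator_47(c, s):
    """Numerator of the integrand in (47) at noise level s."""
    c2 = c * c
    first = c2 / (math.sqrt(s) / 2.0 + math.sqrt(s / 4.0 + c2)) ** 2
    second = c2 * (c2 + s) / (s / 2.0 + math.sqrt(s * s / 4.0 + c2 * (c2 + s))) ** 2
    return first + second - 1.0


def lhs_47(c, var1, var2, steps=2000):
    """Left-hand side of (47) by composite Simpson after substituting u = 1/s."""
    a, b = 1.0 / var2, 1.0 / var1
    h = (b - a) / steps
    total = numerator_47(c, 1.0 / a) + numerator_47(c, 1.0 / b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * numerator_47(c, 1.0 / (a + i * h))
    return total * h / 3.0


def asymptotic_slope(var1, var2, tol=1e-10):
    """Bisection for c(sigma_1^2, sigma_2^2), assuming lhs_47 increases in c."""
    lo, hi = 0.0, 1.0
    while lhs_47(hi, var1, var2) < 0.0:   # expand until the sign flips
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs_47(mid, var1, var2) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for var2 in (1.001, 1.5, 10.0, 1000.0):
        print(var2, asymptotic_slope(1.0, var2))
```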

In Figure 1, for

σ_{1}^{2} = 1

and

σ_{2}^{2} = 1.001, 1.5, 10, 1000

, we show the behavior of

{\bar{R}}_{n} (1, σ_{2}^{2}) / \sqrt{n}

and how it converges to

c (1, σ_{2}^{2})

.

Figure 1. Asymptotic behavior of

{\bar{R}}_{n} (1, σ_{2}^{2}) / \sqrt{n}

versus n for

σ_{1}^{2} = 1

and

σ_{2}^{2} = 1.001, 1.5, 10, 1000

. In red, we show

c (1, σ_{2}^{2})

defined in (46).

3.4. Scalar Case $(n = 1)$

For the scalar case, the optimal input distribution

P_{X^{★}}

is discrete. In this regime, we provide an implicit and an explicit upper bound on the number of support points of the optimal input probability mass function (pmf)

P_{X^{★}}

.

Theorem 5.

Let

Y_{1}^{★}

and

Y_{2}^{★}

be the secrecy-capacity-achieving output distributions at the legitimate and malicious receivers, respectively, and let

\begin{matrix} g (y) = E [log \frac{f_{Y_{2}^{★}} (y + N)}{f_{Y_{1}^{★}} (y)}], y \in R, \end{matrix}

(48)

with

N \sim N (0, σ_{2}^{2} - σ_{1}^{2})

. For

R > 0

, an implicit upper bound on the number of support points of

P_{X^{★}}

is

\begin{matrix} | supp (P_{X^{★}}) | \leq N ([- L, L], g (\cdot) + κ_{1}) < \infty \end{matrix}

(49)

where

\begin{matrix} κ_{1} & = log (\frac{σ_{2}}{σ_{1}}) - C_{s}, \end{matrix}

(50)

\begin{matrix} L & = R \frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} + \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}} . \end{matrix}

(51)

Moreover, an explicit upper bound on the number of support points of

P_{X^{★}}

is obtained by using

\begin{matrix} N ([- L, L], g (\cdot) + κ_{1}) \leq ρ \frac{R^{2}}{σ_{1}^{2}} + O (log (R)), \end{matrix}

(52)

where

ρ = {(2 e + 1)}^{2} {(\frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}})}^{2} + {(\frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} + 1)}^{2}

.

The upper bounds in Theorem 5 are generalizations of the upper bounds on the number of points presented in [30] in the context of a point-to-point AWGN channel with an amplitude constraint. Indeed, if we let

σ_{2} \to \infty

, while keeping

σ_{1}

and

R

fixed, then the wiretap channel reduces to the AWGN point-to-point channel.

To find a lower bound on the number of mass points, a possible approach consists of the following steps:

\begin{matrix} C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, 1) & = I (X^{★}; Y_{1}) - I (X^{★}; Y_{2}) \end{matrix}

(53)

\begin{matrix} \leq H (X^{★}) - I (X^{★}; Y_{2}) \end{matrix}

(54)

\begin{matrix} \leq log (| supp (P_{X^{★}}) |) - I (X^{★}; Y_{2}), \end{matrix}

(55)

where (54) follows from the nonnegativity of the conditional entropy of a discrete input, and (55) follows from the fact that entropy is maximized by a uniform distribution over the support. Furthermore, by using a suboptimal uniform (continuous) distribution on

[- R, R]

as an input and the entropy power inequality, the secrecy capacity is lower-bounded by

C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, 1) \geq \frac{1}{2} log (1 + \frac{\frac{2 R^{2}}{π e σ_{1}^{2}}}{1 + \frac{R^{2}}{σ_{2}^{2}}}) .

(56)

Combining the bounds in (55) and (56), we arrive at the following lower bound on the number of points:

| supp (P_{X^{★}}) | \geq \sqrt{1 + \frac{\frac{2 R^{2}}{π e σ_{1}^{2}}}{1 + \frac{R^{2}}{σ_{2}^{2}}}} e^{I (X^{★}; Y_{2})} .

(57)

At this point, one needs to determine the behavior of

I (X^{★}; Y_{2})

. A trivial lower bound on

| supp (P_{X^{★}}) |

can be found by lower-bounding

I (X^{★}; Y_{2})

by zero. However, this lower bound on

| supp (P_{X^{★}}) |

does not grow with

R

, while the upper bound does increase with

R

. A possible way of establishing a lower bound that increases in

R

is by showing that

I (X^{★}; Y_{2}) \approx \frac{1}{2} log (1 + \frac{R^{2}}{σ_{2}^{2}})

. However, because not much is known about the structure of the optimal input distribution

P_{X^{★}}

, it is not immediately evident how one can establish such an approximation or whether it is valid.
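The saturation of the trivial lower bound can be seen numerically. The sketch below evaluates the right-hand side of (57) with the mutual information at the eavesdropper replaced by zero, together with its limit for large R, which follows by letting R grow in the same expression. The function names are hypothetical.

```python
import math


def trivial_support_lower_bound(R, var1, var2):
    """Right-hand side of (57) with I(X*; Y_2) replaced by its lower bound 0."""
    ratio = (2.0 * R * R / (math.pi * math.e * var1)) / (1.0 + R * R / var2)
    return math.sqrt(1.0 + ratio)


def saturation_level(var1, var2):
    """Limit of the trivial bound as R grows without bound."""
    return math.sqrt(1.0 + 2.0 * var2 / (math.pi * math.e * var1))


if __name__ == "__main__":
    for R in (1.0, 10.0, 100.0):
        print(R, trivial_support_lower_bound(R, 1.0, 2.0))
    print("limit", saturation_level(1.0, 2.0))
```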

4. Secrecy Capacity Expression in the Low-Amplitude Regime

The result in Theorem 3 can also be used to establish the secrecy capacity for all

R \leq {\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

, as is performed next.

Theorem 6.

If

G_{σ_{1}, σ_{2}, R, n}

of (39) has at most one sign change and if

R \leq {\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

, then

C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) = \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{R^{2} - R^{2} E [h_{\frac{n}{2}}^{2} (\frac{∥ R + \sqrt{s} Z ∥ R}{s})]}{s^{2}} d s .

(58)

Proof.

See Section 9. □
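To illustrate that (58) is indeed easy to evaluate, the following sketch computes it for the scalar case, where the Bessel ratio of order 1/2 equals the hyperbolic tangent. It assumes n = 1, natural logarithms (nats), a truncated Simpson rule for the Gaussian expectation, and the substitution u = 1/s; the function names and grid sizes are hypothetical. The output is meaningful as a secrecy capacity only when R lies in the low-amplitude regime, for example when R satisfies (32).

```python
import math


def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def secrecy_capacity_scalar(R, var1, var2, s_steps=200, z_steps=200, z_max=8.0):
    """Expression (58) for n = 1, with h_{1/2}(x) = tanh(x), in nats."""
    def mean_tanh_sq(s):
        root_s = math.sqrt(s)

        def weighted(z):
            return math.tanh((R + root_s * z) * R / s) ** 2 * math.exp(-0.5 * z * z)

        return simpson(weighted, -z_max, z_max, z_steps) / math.sqrt(2.0 * math.pi)

    def inner(u):                     # u = 1/s, so ds / s^2 becomes du
        return R * R * (1.0 - mean_tanh_sq(1.0 / u))

    return 0.5 * simpson(inner, 1.0 / var2, 1.0 / var1, s_steps)


if __name__ == "__main__":
    # R = 0.5 satisfies (32) for var1 = 1 and var2 = 2, since sqrt(0.5) > 0.5.
    print(secrecy_capacity_scalar(0.5, 1.0, 2.0))
```

Because the squared hyperbolic tangent is nonnegative, the value returned never exceeds R^2 (1/(2σ_1^2) - 1/(2σ_2^2)).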

Large n Asymptotics

It is important to note that, since

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

grows as

\sqrt{n}

, according to Theorem 4, when we keep

R

constant and increase the number of antennas to infinity, the low-amplitude regime becomes the only regime. The next theorem characterizes the secrecy capacity in this ‘massive-MIMO’ regime (i.e., where

R

is fixed and n goes to infinity).

Theorem 7.

Consider the expression in (58) and fix

R \geq 0

and

σ_{1}^{2} \leq σ_{2}^{2}

, then

\begin{matrix} lim_{n \to \infty} C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) = R^{2} (\frac{1}{2 σ_{1}^{2}} - \frac{1}{2 σ_{2}^{2}}) . \end{matrix}

(59)

Proof.

See Appendix C. □

Remark 3.

The result in Theorem 7 is reminiscent of the capacity in the wideband regime [31, Ch. 9], where the capacity increases linearly in the signal-to-noise ratio. Similarly, Theorem 7 shows that in the large antenna regime, the secrecy capacity grows linearly with the difference in the signal-to-noise ratio between the legitimate user and the eavesdropper.

In Theorem 7,

R

was held fixed. It is also interesting to study the case when

R

is a function of n. Specifically, it is interesting to study the case when

R = c \sqrt{n}

for some coefficient c.

Theorem 8.

Suppose that

c \leq c (σ_{1}^{2}, σ_{2}^{2})

. Then,

lim_{n \to \infty} \frac{C_{s} (σ_{1}^{2}, σ_{2}^{2}, c \sqrt{n}, n)}{n} = \frac{1}{2} log (\frac{1 + c^{2} / σ_{1}^{2}}{1 + c^{2} / σ_{2}^{2}}) .

(60)

Proof.

See Appendix D. □

Notice that (60) is equivalent to the secrecy capacity of a vector Gaussian wiretap channel subject to an average power constraint. Gaussian wiretap channels under average power constraints have been extensively investigated [3,32] and, for an average power constraint

E [∥ X ∥^{2}] \leq P

, the resulting secrecy capacity is given by [3]

C_{G} (σ_{1}^{2}, σ_{2}^{2}, P, n) = \frac{n}{2} log \frac{1 + P / σ_{1}^{2}}{1 + P / σ_{2}^{2}} .

(61)

Thus, the result in (60) can be restated as

lim_{n \to \infty} \frac{C_{s} (σ_{1}^{2}, σ_{2}^{2}, c \sqrt{n}, n)}{C_{G} (σ_{1}^{2}, σ_{2}^{2}, c^{2}, n)} = 1 .

(62)

In other words, for the regime considered in Theorem 8, for a large enough n the secrecy capacity under the amplitude constraint

R_{n} = c \sqrt{n}

behaves as the secrecy capacity under the average power constraint

c^{2}

.

5. Beyond the Low-Amplitude Regime

To evaluate the secrecy capacity and find the optimal distribution

P_{X^{★}}

beyond

{\bar{R}}_{n}

we rely on numerical estimations. We remark that, as pointed out in [25], the secrecy-capacity-achieving distribution is isotropic and consists of finitely many concentric shells. Keeping this in mind, we can find the optimal input distribution

P_{X^{★}}

by just optimizing over

P_{∥ X ∥}

with

∥ X ∥ \leq R

.

5.1. Numerical Algorithm

In the case of scalar Gaussian wiretap channels, the secrecy capacity and the optimal input pmf can be estimated via the algorithm described in [33], i.e., a numerical procedure that takes inspiration from the deterministic annealing algorithm sketched in [34]. Let us denote by

{\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

the numerical estimate of the secrecy capacity, and by

{\hat{P}}_{∥ X^{★} ∥}

, the estimate of the optimal pmf on the input norm. To numerically evaluate

{\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

and

{\hat{P}}_{∥ X^{★} ∥}

, we extend to the vector case the algorithm in [33]. Our extension is defined in Algorithm 1. The input parameters of the main function are the noise variances

σ_{1}^{2}

and

σ_{2}^{2}

, the radius

R

, the vectors

ρ

and

p

which are, respectively, the mass point positions and probabilities of a tentative input pmf, the number of iterations in the while loop

N_{c}

, and finally, a tolerance

ε

to set the precision of the secrecy capacity estimate.

Algorithm 1 Secrecy capacity and optimal input pmf estimation

procedure Main $(σ_{1}^{2}, σ_{2}^{2}, R, ρ, p, N_{c}, ε)$
    repeat
        $k \leftarrow 0$
        while $k < N_{c}$ do
            $k \leftarrow k + 1$
            $ρ \leftarrow$ Gradient Ascent $(ρ, p)$
            $p \leftarrow$ Blahut–Arimoto $(ρ, p)$
        end while
        valid ← KKT Validation $(ρ, p, ε)$
        if valid = False then
            $(ρ, p) \leftarrow$ Add Point $(ρ, p)$
        end if
    until valid = True
    ${\hat{P}}_{∥ X^{★} ∥} \leftarrow (ρ, p)$
    ${\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) \leftarrow I_{s} (∥ X ∥; {\hat{P}}_{∥ X^{★} ∥})$
    return ${\hat{P}}_{∥ X^{★} ∥}, {\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)$
end procedure

At its core, the numerical procedure iteratively refines its estimate of

P_{∥ X^{★} ∥}

by running a gradient ascent algorithm to update the vector

ρ

and a variant of the Blahut–Arimoto algorithm [35] to update

p

.

The Gradient Ascent procedure uses the secrecy information as the objective function and stops either when

ρ

has reached convergence or at a given maximum number of iterations. Let us denote by

I_{s} (∥ X ∥; P_{∥ X ∥})

the secrecy information as a function of the input norm. Notice that, given a tentative pmf

{\hat{P}}_{∥ X ∥}

of mass points

ρ

, probabilities

p

, and

| supp ({\hat{P}}_{∥ X ∥}) | = K

, we have

\begin{matrix} I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥}) = \sum_{i = 1}^{K} p_{i} \cdot \tilde{Ξ} (ρ_{i}; {\hat{P}}_{∥ X ∥}), \end{matrix}

(63)

where

\tilde{Ξ} (t; {\hat{P}}_{∥ X ∥})

is the secrecy density, with respect to the input norm, defined in (A9) and where

p_{i}

and

ρ_{i}

are, respectively, the ith element of

p

and

ρ

. Then, the Gradient Ascent updates are given by

\begin{matrix} ρ_{i} = ρ_{i} + α \cdot \frac{\partial}{\partial ρ_{i}} I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥}), i = 1, \dots, K, \end{matrix}

(64)

where the partial derivatives are defined in Appendix E and

α

is the step size in the gradient ascent. We remark that, to ensure convergence to a local maximum, we use the gradient ascent algorithm in a backtracking line search version [36]. By suitably adjusting the step size

α

at each iteration, the backtracking line search version guarantees us that each new update of

ρ

provides a nondecreasing associated secrecy information, compared to the previous update of

ρ

.
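The update (64) with a backtracking step size can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for the numerical results: the function name, the Armijo constant `c1`, the shrink factor, and the generic `objective` and `gradient` callables (standing in for the secrecy information and its partial derivatives) are all hypothetical, and the constraint that the mass points stay in [0, R] is not enforced here.

```python
def backtracking_ascent_step(objective, gradient, rho,
                             alpha0=1.0, shrink=0.5, c1=1e-4, max_halvings=50):
    """One gradient-ascent update of rho with an Armijo backtracking step size.

    The step size alpha is shrunk until the objective increases by at least
    c1 * alpha * ||grad||^2, so an accepted update never decreases the objective.
    If no step is accepted, rho is returned unchanged.
    """
    base = objective(rho)
    grad = gradient(rho)
    grad_sq = sum(g * g for g in grad)
    alpha = alpha0
    for _ in range(max_halvings):
        candidate = [r + alpha * g for r, g in zip(rho, grad)]
        if objective(candidate) >= base + c1 * alpha * grad_sq:
            return candidate
        alpha *= shrink
    return list(rho)
```

Calling this function repeatedly until the change in `rho` falls below a threshold, or until an iteration cap is hit, mirrors the two stopping rules described above.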

The Blahut–Arimoto function runs a variant of the Blahut–Arimoto algorithm. For the scalar case, an example of the Blahut–Arimoto optimization, applied to wiretap channels, is given in [37]. Similar results can be extended to the case of vector wiretap channels. Given the current probabilities

p_{i}

’s, the updates are obtained by evaluating

\begin{matrix} p_{i}^{'} & = p_{i} exp (\tilde{Ξ} (ρ_{i}; {\hat{P}}_{∥ X ∥})), i = 1, \dots, K, \end{matrix}

(65)

and finally, by normalizing each

p_{i}^{'}

and assigning them to the entries of the vector

p

\begin{matrix} p_{i} & = \frac{p_{i}^{'}}{\sum_{k = 1}^{K} p_{k}^{'}}, i = 1, \dots, K . \end{matrix}

(66)

Similarly to Gradient Ascent, the Blahut–Arimoto procedure stops either when the values of

p

have reached a stable convergence or after a set number of updates.
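A single pass of the updates (65) and (66) can be sketched as below. The function name is hypothetical, and the secrecy-density values at the mass points are assumed to be supplied by the caller; in an actual run they would have to be recomputed from the updated pmf before the next pass.

```python
import math

def blahut_arimoto_update(p, xi):
    """Apply (65) and then normalize the weights as in (66).

    p  : current probabilities p_i
    xi : secrecy-density values at the mass points rho_i (same length as p)
    """
    # Subtracting max(xi) rescales every weight by the same factor, which
    # cancels in the normalization and avoids overflow in exp().
    shift = max(xi)
    weights = [p_i * math.exp(x_i - shift) for p_i, x_i in zip(p, xi)]
    total = sum(weights)
    return [w / total for w in weights]
```

If the secrecy density takes the same value at every mass point, the update leaves `p` unchanged, which is consistent with the equality condition checked later in the KKT validation.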

Since the joint optimization of

ρ

and

p

is not numerically feasible, we need to reiterate both the Blahut–Arimoto and the Gradient Ascent procedures a given number of times, namely

N_{c}

. The parameter

N_{c}

is chosen empirically in such a way that

ρ

and

p

become fairly stable, and therefore we can expect to have reached joint convergence for both of them.

Then, the KKT Validation procedure ensures that the values of

ρ

and

p

are indeed close to the optimal ones. We check the optimality of

{\hat{P}}_{∥ X ∥}

by verifying whether the KKT conditions in Lemma 2 are satisfied. Since the algorithm has to verify the KKT conditions numerically, i.e., with finite precision, we find it more convenient to check the negated version of (28), where a tolerance parameter

ε

is introduced that trades off accuracy with computational burden. Specifically,

{\hat{P}}_{∥ X ∥}

is not an optimal input pmf if any of the following conditions are satisfied:

\begin{matrix} | \tilde{Ξ} (t; {\hat{P}}_{∥ X ∥}) - I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥}) | > ε, for some t \in supp ({\hat{P}}_{∥ X ∥}) \end{matrix}

(67a)

\begin{matrix} I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥}) + ε < \tilde{Ξ} (t; {\hat{P}}_{∥ X ∥}), for some t \in [0, R] . \end{matrix}

(67b)

Note that in (67), in place of the secrecy capacity

C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

, which is unknown, we used the secrecy information given by the tentative pmf

{\hat{P}}_{∥ X ∥}

, i.e.,

I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥})

. Condition (67a) is derived by negating (28a): there exists a

t \in supp ({\hat{P}}_{∥ X ∥})

, such that

\tilde{Ξ} (t; {\hat{P}}_{∥ X ∥})

is

ε

-away from the secrecy information

I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥})

. Condition (67b) is the negated version of (28b): there exists a

t \in [0, R]

such that

\tilde{Ξ} (t; {\hat{P}}_{∥ X ∥})

is at least

ε

-larger than the secrecy information

I_{s} (∥ X ∥; {\hat{P}}_{∥ X ∥})

. With some abuse of notation, we refer to (67) as the

ε

-KKT conditions. If the tentative pmf

{\hat{P}}_{∥ X ∥}

does not pass the check of the

ε

-KKT conditions, then the algorithm checks whether a new point has to be added to the pmf.
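The test in (67) can be sketched as follows. The function name is hypothetical, and the condition "for some t in [0, R]" in (67b) is approximated by a finite grid, which is an assumption of this sketch rather than something stated in the text.

```python
def violates_eps_kkt(xi_on_support, xi_on_grid, secrecy_information, eps):
    """Return True if (67a) or (67b) holds, i.e., the tentative pmf is rejected.

    xi_on_support : secrecy density evaluated at the mass points of the pmf
    xi_on_grid    : secrecy density evaluated on a finite grid of [0, R]
    """
    # (67a): some support point is more than eps away from the secrecy information.
    cond_a = any(abs(x - secrecy_information) > eps for x in xi_on_support)
    # (67b): some point of [0, R] exceeds the secrecy information by more than eps.
    cond_b = any(x > secrecy_information + eps for x in xi_on_grid)
    return cond_a or cond_b
```

A finer grid reduces the risk of missing a narrow peak of the secrecy density between grid points, at the cost of more evaluations.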

The Add Point procedure evaluates the position of the new mass point

\begin{matrix} ρ_{new} = arg max_{t \in [0, R]} \tilde{Ξ} (t; {\hat{P}}_{∥ X ∥}) . \end{matrix}

(68)

The point

ρ_{new}

is appended to the vector

ρ

and the probabilities

p

are set to be equiprobable.
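A minimal sketch of this step is given below. The function name and the grid-based maximization of (68) are assumptions made for illustration; the text does not specify how the arg max is computed.

```python
def add_point(rho, xi_on_grid, grid):
    """Append the grid maximizer of the secrecy density, as in (68),
    and reset the probabilities to the uniform pmf."""
    best = max(range(len(grid)), key=lambda i: xi_on_grid[i])
    new_rho = list(rho) + [grid[best]]
    new_p = [1.0 / len(new_rho)] * len(new_rho)
    return new_rho, new_p
```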

The whole procedure is repeated until KKT Validation gives a positive outcome, and at that point the algorithm returns

{\hat{P}}_{∥ X^{★} ∥}

as the optimal pmf estimate and

{\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

as the secrecy capacity estimate.
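The control flow of Algorithm 1 can be sketched as follows, with the four subroutines and the secrecy information passed in as callables. All names and signatures here are hypothetical, and the `max_outer` cap is a safeguard added for this sketch that does not appear in Algorithm 1.

```python
def estimate(rho, p, n_c, eps, gradient_ascent, blahut_arimoto,
             kkt_valid, add_point, secrecy_information, max_outer=100):
    """Outer loop of Algorithm 1 with the subroutines supplied by the caller."""
    for _ in range(max_outer):              # safeguard, not part of Algorithm 1
        for _ in range(n_c):                # alternate the two updates N_c times
            rho = gradient_ascent(rho, p)
            p = blahut_arimoto(rho, p)
        if kkt_valid(rho, p, eps):          # epsilon-KKT conditions satisfied
            return (rho, p), secrecy_information(rho, p)
        rho, p = add_point(rho, p)          # otherwise enlarge the support
    raise RuntimeError("epsilon-KKT conditions not met within max_outer rounds")
```

Passing the subroutines as arguments keeps the loop independent of how the secrecy density is evaluated, so the same skeleton applies to any dimension n.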

Remark 4.

In this work, we focus on the secrecy capacity and on the secrecy-capacity-achieving input distribution. However, it is possible to study other points of the rate-equivocation region of the degraded wiretap Gaussian channel by suitably changing the KKT conditions, as reported in [21], Equations (33) and (34). With the due modifications, the proposed optimization algorithm can find the optimal input distribution for any point of the rate-equivocation region.

5.2. Numerical Results

In Figure 2, we show with black dots the numerical estimate

{\hat{C}}_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

versus

R

, evaluated via Algorithm 1, for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 1.5, 10

,

n = 2, 4

, and tolerance

ε = 10^{- 6}

. For the same values of

σ_{1}^{2}

,

σ_{2}^{2}

, and n we also show, with the red lines, the analytical low-amplitude regime secrecy capacity

C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

versus

R

from Theorem 6. In addition, we show with blue dotted lines the secrecy capacity under the average power constraint

E [{∥ X ∥}^{2}] \leq R^{2}

:

\begin{matrix} C_{G} (σ_{1}^{2}, σ_{2}^{2}, R^{2}, n) & = \frac{n}{2} log \frac{1 + R^{2} / σ_{1}^{2}}{1 + R^{2} / σ_{2}^{2}} \geq C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n), \end{matrix}

(69)

where the inequality follows by noting that the average power constraint

E [{∥ X ∥}^{2}] \leq R^{2}

is weaker than the amplitude constraint

∥ X ∥ \leq R

. Finally, the dashed vertical lines show

{\bar{R}}_{n}

, i.e., the upper limit of the low-amplitude regime, for the considered values of

σ_{1}^{2}

,

σ_{2}^{2}

, and n.

Figure 2. Secrecy capacity in bits per channel use (bpcu) versus

R

for

σ_{2}^{2} = 1.5, 10

and

n = 2, 4

. The secrecy capacity under average power constraints

C_{G} (σ_{1}^{2}, σ_{2}^{2}, R^{2}, n)

is defined in (69), while the secrecy capacity under peak power constraints, i.e.,

C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)

, is defined in (58).

In Figure 3, we consider discrete values for

R

and for each value of

R

we plot the corresponding estimated pmf

{\hat{P}}_{∥ X^{★} ∥}

, evaluated via Algorithm 1, for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 1.5

,

n = 2, 8

, and tolerance

ε = 10^{- 6}

. The figure shows, at each

R

, the normalized amplitude of support points in the estimated pmf, while the size of the circles qualitatively shows the probability associated with each support point. Similarly, Figure 4 shows the evolution of the pmf estimate for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 10

,

n = 2, 8

, and

ε = 10^{- 6}

. It is interesting to notice that, in both Figure 3 and Figure 4, when a new mass point is added to the pmf, it appears at zero. Moreover, the mass point of radius

R

always seems to be optimal.

Figure 3. Evolution of the numerically estimated

{\hat{P}}_{∥ X^{★} ∥}

versus

R

for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 1.5

, (a)

n = 2

, and (b)

n = 8

.

Figure 4. Evolution of the numerically estimated

{\hat{P}}_{∥ X^{★} ∥}

versus

R

for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 10

, (a)

n = 2

, and (b)

n = 8

.

Finally, Figure 5 shows the output distributions of the legitimate user and of the eavesdropper in the case of

σ_{1}^{2} = 1

,

σ_{2}^{2} = 10

,

n = 2

, and for two values of

R

. At the top of the figure, the distributions are shown for

R = 2.25

, which is a value close to

{\bar{R}}_{2} (1, 10)

. At the bottom of the figure, the distributions are shown for

R = 7.5

. For both values of

R

, the legitimate user sees an output distribution where the concentric rings of the input distribution are easily distinguishable. On the other hand, as expected, the output distribution seen by the eavesdropper is close to a Gaussian.

Figure 5. Output pdf of the legitimate user and of the eavesdropper for

σ_{1}^{2} = 1

,

σ_{2}^{2} = 10

,

n = 2

, (a,b)

R = 2.25

, and (c,d)

R = 7.5

. An animation showing the evolution of the output pdf as

R

varies can be found in [1].

6. Proof of Theorem 3

Estimation Theoretic Representation

By Remark 1, if

G_{σ_{1}, σ_{2}, R, n}

has at most one sign change,

P_{X_{R}}

is secrecy-capacity-achieving if, and only if, for all

∥ x ∥ = R

Ξ (0; P_{X_{R}}) \leq Ξ (x; P_{X_{R}}) .

(70)

We seek to re-write the condition (70) in the estimation theoretic form. To that end, we need the following representation of the relative entropy [38]:

D (P_{X_{1} + \sqrt{t} Z} ∥ P_{X_{2} + \sqrt{t} Z}) = \frac{1}{2} \int_{t}^{\infty} \frac{g (s)}{s^{2}} d s,

(71)

where

\begin{matrix} g (s) & = E [∥ X_{1} - ℓ_{2} (X_{1} + \sqrt{s} Z) ∥^{2}] - E [∥ X_{1} - ℓ_{1} (X_{1} + \sqrt{s} Z) ∥^{2}] \end{matrix}

(72)

and where

\begin{matrix} ℓ_{i} (y) & = E [X_{i} | X_{i} + \sqrt{s} Z = y] \end{matrix}

(73)

\begin{matrix} = \int x_{i} f_{X_{i} | X_{i} + \sqrt{s} Z} (x_{i} ∣ y) d x_{i}, i \in {1, 2} . \end{matrix}

(74)

Another fact that will be important for our expression is

\begin{matrix} E [X_{R} ∣ X_{R} + \sqrt{s} Z = y] = \frac{R y}{∥ y ∥} h_{\frac{n}{2}} (\frac{∥ y ∥ R}{s}), \end{matrix}

(75)

see, for example [27], for the proof.

Next, using (71) and (75) note that for any

∥ x ∥ = R

we have that for

i \in {1, 2}

\begin{matrix} D (P_{x + \sqrt{σ_{i}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{i}^{2}} Z}) & = \frac{1}{2} \int_{σ_{i}^{2}}^{\infty} \frac{E [{∥x - \frac{R (x + \sqrt{s} Z)}{∥ x + \sqrt{s} Z ∥} h_{\frac{n}{2}} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})∥}^{2}]}{s^{2}} d s \end{matrix}

(76)

\begin{matrix} = \frac{1}{2} \int_{σ_{i}^{2}}^{\infty} \frac{E [{∥x∥}^{2}] - E [{∥\frac{R (x + \sqrt{s} Z)}{∥ x + \sqrt{s} Z ∥} h_{\frac{n}{2}} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})∥}^{2}]}{s^{2}} d s \end{matrix}

(77)

\begin{matrix} = \frac{1}{2} \int_{σ_{i}^{2}}^{\infty} \frac{R^{2} - R^{2} E [h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})]}{s^{2}} d s, \end{matrix}

(78)

where (77) follows from

\begin{matrix} mmse (X_{R} | Y) & = E [∥ X_{R} - E [X_{R} | Y] ∥^{2}] \end{matrix}

(79)

\begin{matrix} = E [∥ X_{R} ∥^{2}] - E [∥ E [X_{R} | Y] ∥^{2}] . \end{matrix}

(80)
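For completeness, the step from (79) to (80) can be spelled out by expanding the square and applying the tower property of conditional expectation to the cross term:

```latex
\begin{align*}
E\left[\| X_R - E[X_R | Y] \|^2\right]
 &= E\left[\| X_R \|^2\right] - 2\, E\left[ X_R^{T} E[X_R | Y] \right] + E\left[\| E[X_R | Y] \|^2\right] \\
 &= E\left[\| X_R \|^2\right] - 2\, E\left[ E[X_R | Y]^{T} E[X_R | Y] \right] + E\left[\| E[X_R | Y] \|^2\right] \\
 &= E\left[\| X_R \|^2\right] - E\left[\| E[X_R | Y] \|^2\right].
\end{align*}
```

The second line uses the fact that E[X_R^T g(Y)] = E[E[X_R | Y]^T g(Y)] for any function g of the observation, here with g(Y) = E[X_R | Y].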

Moreover, for

∥ x ∥ = 0

, it holds

\begin{matrix} D (P_{0 + \sqrt{σ_{i}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{i}^{2}} Z}) = \frac{1}{2} \int_{σ_{i}^{2}}^{\infty} \frac{R^{2} E [h_{\frac{n}{2}}^{2} (\frac{R ∥ Z ∥}{s})]}{s^{2}} d s . \end{matrix}

(81)

Now, note that by using the definition of

Ξ (x; P_{X_{R}})

in (30), (78), and (81) we have that for

∥ x ∥ = R

\begin{matrix} Ξ (x; P_{X_{R}}) & = D (P_{x + \sqrt{σ_{1}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{1}^{2}} Z}) - D (P_{x + \sqrt{σ_{2}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{2}^{2}} Z}) \end{matrix}

(82)

\begin{matrix} = \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{R^{2} - R^{2} E [h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})]}{s^{2}} d s, \end{matrix}

(83)

and

\begin{matrix} Ξ (0; P_{X_{R}}) & = D (P_{0 + \sqrt{σ_{1}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{1}^{2}} Z}) - D (P_{0 + \sqrt{σ_{2}^{2}} Z} ∥ P_{X_{R} + \sqrt{σ_{2}^{2}} Z}) \end{matrix}

(84)

\begin{matrix} = \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{R^{2} E [h_{\frac{n}{2}}^{2} (\frac{∥ \sqrt{s} Z ∥ R}{s})]}{s^{2}} d s \end{matrix}

(85)

Consequently, the necessary and sufficient condition in Theorem 2 can be equivalently written as

\begin{matrix} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{E [h_{\frac{n}{2}}^{2} (\frac{∥ \sqrt{s} Z ∥ R}{s}) + h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})] - 1}{s^{2}} d s \leq 0 . \end{matrix}

(86)

Now

{\bar{R}}_{n} (σ_{1}^{2}, σ_{2}^{2})

will be the largest

R

that satisfies (86), which concludes the proof of Theorem 3.

7. Proof of Theorem 4

The objective of the proof is to understand how the condition in (45) behaves as

n \to \infty

. To study the large n behavior, we need the following bounds on the function

h_{ν}

[39,40]: for

ν > \frac{1}{2}

\begin{matrix} h_{ν} (x) = \frac{x}{\frac{2 ν - 1}{2} + \sqrt{\frac{{(2 ν - 1)}^{2}}{4} + x^{2}}} \cdot g_{ν} (x), \end{matrix}

(87)

where

\begin{matrix} 1 \geq g_{ν} (x) \geq \frac{\frac{2 ν - 1}{2} + \sqrt{\frac{{(2 ν - 1)}^{2}}{4} + x^{2}}}{ν + \sqrt{ν^{2} + x^{2}}} . \end{matrix}

(88)
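The bounds implied by (87) and (88) can be checked numerically with the sketch below. It assumes that h_ν(x) is the ratio I_ν(x) / I_{ν-1}(x) of modified Bessel functions of the first kind; this definition is given elsewhere in the paper and is not restated in this section, so it should be read as an assumption of the example. The Bessel functions are evaluated by their power series, which is adequate only for moderate x, and all function names are hypothetical.

```python
import math

def bessel_i(nu, x, terms=200):
    """Modified Bessel function of the first kind I_nu(x), for x > 0 and nu > -1,
    from its power series (each term is computed in the log domain)."""
    log_half_x = math.log(x / 2.0)
    total = 0.0
    for k in range(terms):
        log_term = ((2 * k + nu) * log_half_x
                    - math.lgamma(k + 1) - math.lgamma(k + nu + 1))
        total += math.exp(log_term)
    return total

def h_nu(nu, x):
    """Assumed definition: h_nu(x) = I_nu(x) / I_{nu-1}(x)."""
    return bessel_i(nu, x) / bessel_i(nu - 1.0, x)

def upper_bound(nu, x):
    """Value of (87) with g_nu(x) replaced by its upper bound 1."""
    a = (2 * nu - 1) / 2
    return x / (a + math.sqrt(a * a + x * x))

def lower_bound(nu, x):
    """Value of (87) with g_nu(x) replaced by its lower bound in (88)."""
    return x / (nu + math.sqrt(nu * nu + x * x))
```

For ν = 3/2 the ratio has the closed form coth(x) - 1/x, which gives an independent check of the series evaluation.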

Now let

R = c \sqrt{n}

for some

c > 0

. The goal is to understand the behavior of

E [h_{\frac{n}{2}}^{2} (\frac{∥ \sqrt{s} Z ∥ R}{s}) + h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})]

(89)

as n goes to infinity. First, let

\begin{matrix} V_{n} = \frac{∥ Z ∥}{\sqrt{n}}, \end{matrix}

(90)

and note that

\begin{matrix} lim_{n \to \infty} E [h_{\frac{n}{2}}^{2} (\frac{∥ \sqrt{s} Z ∥ c \sqrt{n}}{s})] & = lim_{n \to \infty} E [{(\frac{\frac{c V_{n}}{\sqrt{s}}}{\frac{n - 1}{2 n} + \sqrt{\frac{{(n - 1)}^{2}}{4 n^{2}} + {(\frac{c V_{n}}{\sqrt{s}})}^{2}}} \cdot g_{\frac{n}{2}} (\frac{c V_{n}}{\sqrt{s}} n))}^{2}] \end{matrix}

(91)

\begin{matrix} = E [lim_{n \to \infty} {(\frac{\frac{c V_{n}}{\sqrt{s}}}{\frac{n - 1}{2 n} + \sqrt{\frac{{(n - 1)}^{2}}{4 n^{2}} + {(\frac{c V_{n}}{\sqrt{s}})}^{2}}} \cdot g_{\frac{n}{2}} (\frac{c V_{n}}{\sqrt{s}} n))}^{2}] \end{matrix}

(92)

\begin{matrix} = \frac{c^{2}}{{(\frac{\sqrt{s}}{2} + \sqrt{\frac{s}{4} + c^{2}})}^{2}}, \end{matrix}

(93)

where (92) follows from the dominated convergence theorem, and (93) follows since, by the law of large numbers we have, almost surely,

\begin{matrix} lim_{n \to \infty} V_{n}^{2} = lim_{n \to \infty} \frac{1}{n} \sum_{i = 1}^{n} Z_{i}^{2} = E [Z^{2}] = 1 . \end{matrix}

(94)

Second, let

\begin{matrix} W_{n} = \frac{∥ x + \sqrt{s} Z ∥}{\sqrt{n}}, \end{matrix}

(95)

where, without loss of generality, we take

x = [R, 0, \dots, 0]

\begin{matrix} lim_{n \to \infty} E [h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ c \sqrt{n}}{s})] & = lim_{n \to \infty} E [{(\frac{\frac{c W_{n}}{s} \cdot g_{\frac{n}{2}} (\frac{c W_{n}}{s} n)}{\frac{n - 1}{2 n} + \sqrt{\frac{{(n - 1)}^{2}}{4 n^{2}} + {(\frac{c W_{n}}{s})}^{2}}})}^{2}] \end{matrix}

(96)

\begin{matrix} = E [lim_{n \to \infty} {(\frac{\frac{c W_{n}}{s} \cdot g_{\frac{n}{2}} (\frac{c W_{n}}{s} n)}{\frac{n - 1}{2 n} + \sqrt{\frac{{(n - 1)}^{2}}{4 n^{2}} + {(\frac{c W_{n}}{s})}^{2}}})}^{2}] \end{matrix}

(97)

\begin{matrix} = \frac{c^{2} (c^{2} + s)}{{(\frac{s}{2} + \sqrt{\frac{s^{2}}{4} + c^{2} (c^{2} + s)})}^{2}}, \end{matrix}

(98)

where (97) follows from the dominated convergence theorem and where (98) follows since, by the strong law of large numbers we have, almost surely,

\begin{matrix} lim_{n \to \infty} W_{n}^{2} & = lim_{n \to \infty} \frac{1}{n} {(\sqrt{s} Z_{1} + c \sqrt{n})}^{2} + s lim_{n \to \infty} \frac{1}{n} \sum_{i = 2}^{n} Z_{i}^{2} \end{matrix}

(99)

\begin{matrix} = c^{2} + s . \end{matrix}

(100)

Combining (93) and (98) with (45), we arrive at

\begin{matrix} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{\frac{c^{2}}{{(\frac{\sqrt{s}}{2} + \sqrt{\frac{s}{4} + c^{2}})}^{2}} + \frac{c^{2} (c^{2} + s)}{{(\frac{s}{2} + \sqrt{\frac{s^{2}}{4} + c^{2} (c^{2} + s)})}^{2}} - 1}{s^{2}} d s = 0 . \end{matrix}

(101)
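Equation (101) can be solved for c numerically. The sketch below is one possible approach, not the authors' code: it approximates the integral with the composite Simpson rule and brackets the root by bisection, relying on the integrand being negative for small c and positive for large c. The function names, the bracket endpoints, and the number of quadrature intervals are assumptions chosen for illustration.

```python
import math

def integrand(c, s):
    """Integrand of (101) for a given c and integration variable s."""
    first = c ** 2 / (math.sqrt(s) / 2 + math.sqrt(s / 4 + c ** 2)) ** 2
    u = c ** 2 * (c ** 2 + s)
    second = u / (s / 2 + math.sqrt(s ** 2 / 4 + u)) ** 2
    return (first + second - 1.0) / s ** 2

def lhs_101(c, sigma1_sq, sigma2_sq, intervals=2000):
    """Composite Simpson approximation of the left-hand side of (101).
    The number of intervals must be even."""
    h = (sigma2_sq - sigma1_sq) / intervals
    total = integrand(c, sigma1_sq) + integrand(c, sigma2_sq)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * integrand(c, sigma1_sq + k * h)
    return total * h / 3

def solve_c(sigma1_sq, sigma2_sq, lo=1e-3, hi=1e3, iters=80):
    """Bisection on c, assuming lhs_101 < 0 at lo and lhs_101 > 0 at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs_101(mid, sigma1_sq, sigma2_sq) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The sign assumption at the bracket endpoints should be verified for the noise variances of interest before trusting the returned value.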

8. Proof of Theorem 5

8.1. Implicit Upper Bound

A consequence of the KKT conditions of Lemma 1 is the inclusion

supp (P_{X^{★}}) \subseteq \{x \in [- R, R] : Ξ (x) - C_{s} = 0\}

(102)

which suggests the following upper bound on the number of support points of

P_{X^{★}}

:

\begin{matrix} | supp (P_{X^{★}}) | & \leq N ([- R, R], Ξ (x) - C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, 1)) \end{matrix}

(103)

\begin{matrix} = N ([- R, R], E [g (Y_{1}) + log (\frac{σ_{2}}{σ_{1}}) - C_{s} | X = x]) \end{matrix}

(104)

\begin{matrix} \leq S (g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) \end{matrix}

(105)

\begin{matrix} \leq N (\mathbb{R}, g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) \end{matrix}

(106)

\begin{matrix} = N ([- L, L], g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) \end{matrix}

(107)

\begin{matrix} < \infty, \end{matrix}

(108)

where (104) follows from using (21); (105) follows from applying Karlin’s oscillation theorem (Theorem 1) and the fact that the Gaussian pdf is a strictly totally positive kernel, which was shown in [26]; (106) holds because each sign change of a continuous function requires a zero; (107) is proved in Lemma A3 in Appendix B; and (108) follows because

g (\cdot)

is an analytic function in

(- L, L)

. The implicit upper bound (49) of Theorem 5 follows from (107) and (108).

8.2. Explicit Upper Bound

The key to finding an explicit upper bound on the number of zeros will be the following complex-analytic result.

Lemma 3

(Tijdeman’s Number of Zeros Lemma [41]). Let

L, s, t

be positive numbers, such that

s > 1

. For the complex-valued function

f \neq 0

, which is analytic on

| z | < (s t + s + t) L

, its number of zeros

N (D_{L}, f)

within the disk

D_{L} = {z : | z | \leq L}

satisfies

\begin{matrix} N (D_{L}, f) & \leq \frac{1}{log s} (log max_{| z | \leq (s t + s + t) L} | f (z) | - log max_{| z | \leq t L} | f (z) |) . \end{matrix}

(109)

Furthermore, the following loosened version of the implicit upper bound in (49) will be useful.

Lemma 4.

\begin{matrix} | supp (P_{X^{★}}) | & \leq N ([- L, L], h (\cdot)) + 1 \end{matrix}

(110)

where

\begin{matrix} \frac{h (y)}{σ_{1}^{2} f_{Y_{1}} (y)} & = \frac{E_{N} [E [X^{★} | Y_{2} = y + N]] - y}{σ_{2}^{2}} - \frac{E [X^{★} | Y_{1} = y] - y}{σ_{1}^{2}} \end{matrix}

(111)

\begin{matrix} = \frac{E [N log f_{Y_{2}} (y + N)]}{σ_{2}^{2} - σ_{1}^{2}} - \frac{E [X^{★} | Y_{1} = y] - y}{σ_{1}^{2}}, \end{matrix}

(112)

and where

N \sim N (0, σ_{2}^{2} - σ_{1}^{2})

.

Proof.

Starting from (107), we can write

\begin{matrix} | supp (P_{X^{★}}) | & \leq N ([- L, L], g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) \end{matrix}

(113)

\begin{matrix} \leq N ([- L, L], g^{'} (\cdot)) + 1 \end{matrix}

(114)

\begin{matrix} = N ([- L, L], σ_{1}^{2} f_{Y_{1}} (\cdot) g^{'} (\cdot)) + 1 \end{matrix}

(115)

where in step (114), we applied Rolle’s theorem, and in step (115), we used the fact that multiplying by a strictly positive function (i.e.,

σ_{1}^{2} f_{Y_{1}}

) does not change the number of zeros. The first derivative of g can be computed as follows:

\begin{matrix} g^{'} (y) & = E [\frac{d}{d y} log f_{Y_{2}} (y + N)] - \frac{d}{d y} log f_{Y_{1}} (y) \end{matrix}

(116)

\begin{matrix} = \frac{E_{N} [E [X^{★} | Y_{2} = y + N]] - y}{σ_{2}^{2}} - \frac{E [X^{★} | Y_{1} = y] - y}{σ_{1}^{2}}, \end{matrix}

(117)

where in the last step, we used the well-known Tweedie’s formula (see for example [42,43]):

E [X^{★} | Y_{i} = y] = y + σ_{i}^{2} \frac{d}{d y} log f_{Y_{i}} (y) .

(118)
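Tweedie’s formula (118) can be sanity-checked on a simple discrete input. The sketch below uses a scalar input that is equiprobable on {-R, +R}, chosen only for illustration (it is not claimed to be the optimal input), for which the conditional mean has the closed form R tanh(R y / σ²). The derivative of the log-density is approximated by central differences, and all function names are hypothetical.

```python
import math

def output_pdf(y, R, sigma_sq):
    """pdf of Y = X + N for X equiprobable on {-R, +R} and N ~ N(0, sigma_sq)."""
    norm = math.sqrt(2 * math.pi * sigma_sq)
    return 0.5 * (math.exp(-(y - R) ** 2 / (2 * sigma_sq))
                  + math.exp(-(y + R) ** 2 / (2 * sigma_sq))) / norm

def tweedie_estimate(y, R, sigma_sq, step=1e-5):
    """Right-hand side of (118), with the derivative taken by central differences."""
    score = (math.log(output_pdf(y + step, R, sigma_sq))
             - math.log(output_pdf(y - step, R, sigma_sq))) / (2 * step)
    return y + sigma_sq * score

def conditional_mean(y, R, sigma_sq):
    """Closed-form E[X | Y = y] for the two-point input."""
    return R * math.tanh(R * y / sigma_sq)
```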

An alternative expression for the first term in the right-hand side (RHS) of (116) is as follows:

\begin{matrix} E [\frac{d}{d y} log f_{Y_{2}} (y + N)] & = \int_{- \infty}^{\infty} f_{N} (n) \frac{d}{d y} log f_{Y_{2}} (y + n) d n \end{matrix}

(119)

\begin{matrix} = - \int_{- \infty}^{\infty} (\frac{d}{d n} f_{N} (n)) \cdot log f_{Y_{2}} (y + n) d n \end{matrix}

(120)

\begin{matrix} = \int_{- \infty}^{\infty} \frac{n}{σ_{2}^{2} - σ_{1}^{2}} f_{N} (n) \cdot log f_{Y_{2}} (y + n) d n \end{matrix}

(121)

\begin{matrix} = \frac{1}{σ_{2}^{2} - σ_{1}^{2}} E [N log f_{Y_{2}} (y + N)], \end{matrix}

(122)

where

f_{N} (n) = ϕ_{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} (n)

. The proof is concluded by letting

\begin{matrix} h (y) ≜ σ_{1}^{2} f_{Y_{1}} (y) g^{'} (y) . \end{matrix}

(123)

□

To apply Tijdeman’s Number of Zeros Lemma, upper and lower bounds on the maximum modulus of the complex analytic extension of h over the disk

D_{L} = {z : | z | \leq L}

are proposed in Lemmas A4 and A5 in Appendix B. Using those bounds, we can provide an upper bound on the number of mass points as follows:

\begin{matrix} N ([- L, L], h (\cdot)) \end{matrix}

\begin{matrix} \leq N (D_{L}, \overset{˘}{h} (\cdot)) \end{matrix}

(124)

\begin{matrix} \leq min_{s > 1, t > 0} \{\frac{log \frac{{max}_{| z | \leq (s t + s + t) L} | \overset{˘}{h} (z) |}{{max}_{| z | \leq t L} | \overset{˘}{h} (z) |}}{log s}\} \end{matrix}

(125)

\begin{matrix} \leq log \frac{\frac{e^{\frac{{(2 e + 1)}^{2} L^{2}}{2 σ_{1}^{2}}}}{\sqrt{2 π σ_{1}^{2}}} (a_{1} {(2 e + 1)}^{2} L^{2} + a_{2} (2 e + 1) L + a_{3})}{(c_{1} L - c_{2} R) \frac{exp (- \frac{{(L + R)}^{2}}{2 σ_{1}^{2}})}{\sqrt{2 π σ_{1}^{2}}}} \end{matrix}

(126)

\begin{matrix} = \frac{{(2 e + 1)}^{2} L^{2}}{2 σ_{1}^{2}} + \frac{{(L + R)}^{2}}{2 σ_{1}^{2}} + log \frac{a_{1} {(2 e + 1)}^{2} L^{2} + a_{2} (2 e + 1) L + a_{3}}{c_{1} L - c_{2} R} \\ = \frac{{(2 e + 1)}^{2} {(d_{1} R + d_{2})}^{2}}{2 σ_{1}^{2}} + \frac{{((d_{1} + 1) R + d_{2})}^{2}}{2 σ_{1}^{2}} \end{matrix}

(127)

\begin{matrix} + log \frac{a_{1} {(2 e + 1)}^{2} {(d_{1} R + d_{2})}^{2} + a_{2} (2 e + 1) (d_{1} R + d_{2}) + a_{3}}{(c_{1} d_{1} - c_{2}) R + c_{1} d_{2}} \end{matrix}

(128)

\begin{matrix} \leq b_{1} \frac{R^{2}}{σ_{1}^{2}} + b_{2} + log \frac{b_{3} R^{2} + b_{4} R + b_{5}}{b_{6} R + b_{7}} \end{matrix}

(129)

\begin{matrix} \leq b_{1} \frac{R^{2}}{σ_{1}^{2}} + O (log (R)), \end{matrix}

(130)

where (124) follows because extending to a larger domain can only increase the number of zeros; (125) follows from Tijdeman’s Number of Zeros Lemma; (126) follows from choosing

s = e

and

t = 1

and using the bounds in Lemmas A4 and A5; (128) follows from using the value of L in (A38); (129) follows from using the bound

{(a + b)}^{2} \leq 2 (a^{2} + b^{2})

and defining

\begin{matrix} b_{1} & = {(2 e + 1)}^{2} d_{1}^{2} + {(d_{1} + 1)}^{2} \end{matrix}

(131a)

\begin{matrix} = {(2 e + 1)}^{2} {(\frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}})}^{2} + {(\frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} + 1)}^{2} \end{matrix}

(131b)

\begin{matrix} b_{2} & = \frac{({(2 e + 1)}^{2} + 1) d_{2}^{2}}{σ_{1}^{2}} \end{matrix}

(131c)

\begin{matrix} = \frac{({(2 e + 1)}^{2} + 1)}{σ_{1}^{2}} \frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}} \end{matrix}

(131d)

\begin{matrix} = ({(2 e + 1)}^{2} + 1) (1 + 2 \frac{σ_{2}^{2}}{σ_{2}^{2} - σ_{1}^{2}} C_{s}) \end{matrix}

(131e)

\begin{matrix} b_{3} & = 2 {(2 e + 1)}^{2} a_{1} d_{1}^{2} \end{matrix}

(131f)

\begin{matrix} = 2 {(2 e + 1)}^{2} \frac{3 σ_{1}^{2}}{σ_{2}^{2} \sqrt{σ_{2}^{2} - σ_{1}^{2}}} {(\frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}})}^{2} \end{matrix}

(131g)

\begin{matrix} b_{4} & = (2 e + 1) d_{1} a_{2} \end{matrix}

(131h)

\begin{matrix} = (2 e + 1) \frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} (\frac{\sqrt{2} σ_{1}^{2}}{\sqrt{σ_{2}^{2}} \sqrt{σ_{2}^{2} - σ_{1}^{2}}} + 2) \end{matrix}

(131i)

\begin{matrix} b_{5} & = 2 {(2 e + 1)}^{2} a_{1} d_{2}^{2} + (2 e + 1) a_{2} d_{2} + a_{3} \\ = 2 {(2 e + 1)}^{2} \frac{3 σ_{1}^{2}}{σ_{2}^{2} \sqrt{σ_{2}^{2} - σ_{1}^{2}}} (\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}) \\ + (2 e + 1) (\frac{\sqrt{2} σ_{1}^{2}}{\sqrt{σ_{2}^{2}} \sqrt{σ_{2}^{2} - σ_{1}^{2}}} + 2) \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}} \end{matrix}

(131j)

\begin{matrix} + \frac{σ_{1}^{2}}{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} \cdot \sqrt{| log (2 π σ_{2}^{2}) |^{2} + \frac{24 {(σ_{2}^{2} - σ_{1}^{2})}^{2}}{σ_{2}^{4}} + π^{2}} \end{matrix}

(131k)

\begin{matrix} b_{6} & = c_{1} d_{1} - c_{2} \end{matrix}

(131l)

\begin{matrix} = \frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} \frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} - \frac{σ_{2}^{2} + σ_{1}^{2}}{σ_{2}^{2}} = 2 \frac{σ_{1}}{σ_{2}} \end{matrix}

(131m)

\begin{matrix} b_{7} & = c_{1} d_{2} \end{matrix}

(131n)

\begin{matrix} = \frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}}; \end{matrix}

(131o)

and (130) follows from the fact that the

b_{1}, b_{3}, b_{4}

, and

b_{6}

coefficients do not depend on

R

and the fact that the coefficients

b_{2}, b_{5}

, and

b_{7}

, while they do depend on

R

through

C_{s}

, do not grow with

R

. The fact that

C_{s}

does not grow with

R

follows from the bound in (69).
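Two of the steps above can be written out explicitly. The following worked computation uses only the quantities already appearing in (125)–(131), with L = d_{1} R + d_{2} as substituted in (128):

```latex
\begin{align*}
&s = e,\ t = 1 \ \Longrightarrow\ s t + s + t = 2e + 1, \qquad \log s = 1, \qquad t L = L, \\[4pt]
&\frac{(2e+1)^{2} (d_{1} R + d_{2})^{2}}{2 \sigma_{1}^{2}}
 + \frac{((d_{1}+1) R + d_{2})^{2}}{2 \sigma_{1}^{2}} \\
&\qquad \leq \frac{(2e+1)^{2} \left(2 d_{1}^{2} R^{2} + 2 d_{2}^{2}\right)}{2 \sigma_{1}^{2}}
 + \frac{2 (d_{1}+1)^{2} R^{2} + 2 d_{2}^{2}}{2 \sigma_{1}^{2}} \\
&\qquad = \left((2e+1)^{2} d_{1}^{2} + (d_{1}+1)^{2}\right) \frac{R^{2}}{\sigma_{1}^{2}}
 + \frac{\left((2e+1)^{2} + 1\right) d_{2}^{2}}{\sigma_{1}^{2}}
 = b_{1} \frac{R^{2}}{\sigma_{1}^{2}} + b_{2}.
\end{align*}
```

The first line explains the factor (2e + 1) and the absence of a 1 / log s prefactor in (126); the remaining lines apply the bound (a + b)^2 ≤ 2 (a^2 + b^2) to each quadratic term and recover the definitions (131a) and (131c).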

Finally, the explicit upper bound on the number of support points of

P_{X^{★}}

in (52) is a consequence of (130).

9. Proof of Theorem 6

Using the KKT conditions in (28), we have that for

x = [R, 0, \dots, 0]

\begin{matrix} C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) & = Ξ (x; P_{X_{R}}) \end{matrix}

(132)

\begin{matrix} = D (f_{Y_{1} | X} (\cdot | x) ∥ f_{Y_{1}^{★}}) - D (f_{Y_{2} | X} (\cdot | x) ∥ f_{Y_{2}^{★}}) \end{matrix}

(133)

\begin{matrix} = \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{R^{2} - R^{2} E [h_{\frac{n}{2}}^{2} (\frac{∥ x + \sqrt{s} Z ∥ R}{s})]}{s^{2}} d s \end{matrix}

(134)

where the last expression was computed in (83). This concludes the proof.

10. Conclusions

This paper has focused on the secrecy capacity of the n-dimensional vector Gaussian wiretap channel under a peak power (or amplitude) constraint in a so-called low (but not vanishing) amplitude regime. In this regime, the optimal input distribution

P_{X_{R}}

is supported on a single n-dimensional sphere of radius

R

. The paper has identified the largest

{\bar{R}}_{n}

, such that the distribution

P_{X_{R}}

is optimal. In addition, the asymptotic behavior of

{\bar{R}}_{n}

has been completely characterized as dimension n approaches infinity. As a by-product of the analysis, the capacity in the low-amplitude regime has also been characterized in a more or less closed form. The paper has also provided a number of supporting numerical examples. Implicit and explicit upper bounds have been proposed on the number of mass points for the optimal input distribution

P_{X^{★}}

in the scalar case with

n = 1

.

There are several interesting future directions. For example, one interesting direction would be to determine a regime in which a mixture of a mass point at zero and

P_{X_{R}}

is optimal. It would also be interesting to establish a lower bound on the number of mass points in the support of the optimal input distribution when

n = 1

. We note that such a lower bound was obtained for a point-to-point channel in [30]. We finally remark that the extension of the results of this paper to nondegraded wiretap channels is not trivial and also constitutes an interesting but ambitious future direction.

Author Contributions

A.F., L.B. and A.D. contributed equally to this work. All authors have read and agreed to the published version of the manuscript. Part of this work was presented at the 2021 IEEE Information Theory Workshop [44], at the 2022 IEEE International Symposium on Information Theory [45], at the 2022 IEEE International Mediterranean Conference on Communications and Networking [33], and in the PhD dissertation in [46].

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Datasets for the numerical results provided in this work are available at [1].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Examples of the Function G_σ1,σ2,R,n

In this section, we give supporting numerical arguments that the function

G_{σ_{1}, σ_{2}, R, n}

defined in (39) has at most one sign change. Figure A1 demonstrates the behavior of the function

G_{σ_{1}, σ_{2}, R, n}

. In addition, the code that generates the function

G_{σ_{1}, σ_{2}, R, n}

for various values of

n, σ_{1}

, and

σ_{2}

is provided in [1].

Figure A1. Examples of the function

G_{σ_{1}, σ_{2}, R, n}

defined in (39). (a)

n = 3

,

σ_{1} = 1

, and

σ_{2} = 2

. (b)

n = 11

,

σ_{1} = 1

, and

σ_{2} = 2

. (c)

n = 4

,

σ_{1} = 3

, and

σ_{2} = 3.1

. (d)

n = 11

,

σ_{1} = 3

, and

σ_{2} = 3.1

.

Appendix B. Derivative of the Secrecy-Density

Lemma A1.

The derivative of the secrecy density for the input

P_{X_{R}}

is

\begin{matrix} {\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X_{R} ∥}) = ∥ x ∥ E [{\tilde{M}}_{2} (σ_{1} Q_{n + 2}) - M_{1} (σ_{1} Q_{n + 2})] \end{matrix}

(A1)

where

Q_{n + 2}^{2}

is a noncentral chi-square random variable with

n + 2

degrees of freedom and noncentrality parameter

\frac{{∥ x ∥}^{2}}{σ_{1}^{2}}

and

\begin{matrix} M_{i} (y) & = \frac{1}{σ_{i}^{2}} (\frac{R}{y} h_{\frac{n}{2}} (\frac{R}{σ_{i}^{2}} y) - 1), i \in {1, 2} \end{matrix}

(A2)

\begin{matrix} {\tilde{M}}_{2} (y) & = E [M_{2} (∥ y + W ∥)], \end{matrix}

(A3)

where

W \sim N (0_{n + 2}, (σ_{2}^{2} - σ_{1}^{2}) I_{n + 2})

.

Proof.

We start with the secrecy density expressed in spherical coordinates. A quick way to obtain the information densities in this coordinate system is to note that:

\begin{matrix} I (X; Y_{i}) \end{matrix}

\begin{matrix} = h (Y_{i}) - h (N_{i}) \end{matrix}

(A4)

\begin{matrix} = h (∥ Y_{i} ∥) + (n - 1) E [log ∥ Y_{i} ∥] + h_{λ} (\frac{Y_{i}}{∥ Y_{i} ∥}) - h (N_{i}) \end{matrix}

(A5)

\begin{matrix} = h (∥ Y_{i} ∥^{2}) + (\frac{n}{2} - 1) E [log ∥ Y_{i} ∥^{2}] + log \frac{π^{\frac{n}{2}}}{Γ (\frac{n}{2})} - \frac{n}{2} log (2 π e σ_{i}^{2}) \end{matrix}

(A6)

\begin{matrix} = h (σ_{i}^{2} {∥\frac{X}{σ_{i}} + {\tilde{N}}_{i}∥}^{2}) + (\frac{n}{2} - 1) E [log (σ_{i}^{2} {∥\frac{X}{σ_{i}} + {\tilde{N}}_{i}∥}^{2})] + log \frac{π^{\frac{n}{2}}}{Γ (\frac{n}{2})} - \frac{n}{2} log (2 π e σ_{i}^{2}) \end{matrix}

(A7)

\begin{matrix} = h ({∥\frac{X}{σ_{i}} + {\tilde{N}}_{i}∥}^{2}) + (\frac{n}{2} - 1) E [log {∥\frac{X}{σ_{i}} + {\tilde{N}}_{i}∥}^{2}] - log ({(2 e)}^{\frac{n}{2}} Γ (\frac{n}{2})), \end{matrix}

(A8)

where (A5) holds by [47], Lemma 6.17, and by independence between

∥ Y_{i} ∥

and

\frac{Y_{i}}{∥ Y_{i} ∥}

; the term

h_{λ} (\cdot)

is a differential entropy-like quantity for random vectors on the n-dimensional unit sphere ([47], Lemma 6.16); (A6) holds because

\frac{Y_{i}}{∥ Y_{i} ∥}

is uniform on the unit sphere and thanks to [47], Lemma 6.15; the term

Γ (z)

is the gamma function; and in (A7) we have

{\tilde{N}}_{i} \sim N (0_{n}, I_{n})

. The secrecy density can now be written as follows:

\tilde{Ξ} (∥ x ∥; P_{∥ X ∥}) = i_{1} (∥ x ∥; P_{X}) - i_{2} (∥ x ∥; P_{X})

(A9)

where

\begin{matrix} i_{j} (∥ x ∥; P_{X}) & = - \int_{0}^{\infty} f_{χ_{n}^{2} (\frac{{∥ x ∥}^{2}}{σ_{j}^{2}})} (y) log \frac{\int_{0}^{R} f_{χ_{n}^{2} (\frac{t^{2}}{σ_{j}^{2}})} (y) d P_{∥ X ∥} (t)}{y^{\frac{n}{2} - 1}} d y - log ({(2 e)}^{\frac{n}{2}} Γ (\frac{n}{2})), \end{matrix}

(A10)

for

j \in {1, 2}

. The term

f_{χ_{n}^{2} (λ)} (y)

is the noncentral chi-square pdf with n degrees of freedom and noncentrality parameter

λ

.

Given two values

ρ_{1}, ρ_{2}

with

ρ_{1} > ρ_{2}

, write

\begin{matrix} i_{j} (ρ_{1}; P_{X}) - i_{j} (ρ_{2}; P_{X}) & = \int_{0}^{\infty} (f_{χ_{n}^{2} (\frac{ρ_{1}^{2}}{σ_{j}^{2}})} (y) - f_{χ_{n}^{2} (\frac{ρ_{2}^{2}}{σ_{j}^{2}})} (y)) log \frac{y^{\frac{n}{2} - 1}}{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})} d y \end{matrix}

(A11)

\begin{matrix} = \int_{0}^{\infty} (F_{χ_{n}^{2} (\frac{ρ_{2}^{2}}{σ_{j}^{2}})} (y) - F_{χ_{n}^{2} (\frac{ρ_{1}^{2}}{σ_{j}^{2}})} (y)) \frac{d}{d y} log \frac{y^{\frac{n}{2} - 1}}{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})} d y \end{matrix}

(A12)

where we have integrated by parts and where

F_{χ_{n}^{2} (λ)} (y)

is the cumulative distribution function of

χ_{n}^{2} (λ)

. Now notice that

\int_{0}^{\infty} (F_{χ_{n}^{2} (\frac{ρ_{2}^{2}}{σ_{j}^{2}})} (y) - F_{χ_{n}^{2} (\frac{ρ_{1}^{2}}{σ_{j}^{2}})} (y)) d y = \frac{ρ_{1}^{2} - ρ_{2}^{2}}{σ_{j}^{2}} .

(A13)

Since

χ_{n}^{2} (\frac{ρ_{1}^{2}}{σ_{j}^{2}})

stochastically dominates

χ_{n}^{2} (\frac{ρ_{2}^{2}}{σ_{j}^{2}})

, the integrand function in (A13) is always positive. We can introduce an auxiliary output random variable

Q_{j}

, for

j \in {1, 2}

, with pdf

f_{Q_{j}} (y; ρ_{1}, ρ_{2}) = \frac{σ_{j}^{2}}{ρ_{1}^{2} - ρ_{2}^{2}} (F_{χ_{n}^{2} (\frac{ρ_{2}^{2}}{σ_{j}^{2}})} (y) - F_{χ_{n}^{2} (\frac{ρ_{1}^{2}}{σ_{j}^{2}})} (y)),

(A14)

for

y > 0

, to rewrite (A12) as follows:

\begin{matrix} i_{j} (ρ_{1}; P_{X}) - i_{j} (ρ_{2}; P_{X}) & = - \frac{ρ_{1}^{2} - ρ_{2}^{2}}{σ_{j}^{2}} \int_{0}^{\infty} f_{Q_{j}} (y; ρ_{1}, ρ_{2}) \frac{d}{d y} log \frac{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})}{y^{\frac{n}{2} - 1}} d y . \end{matrix}

(A15)

We evaluate the derivative in (A15) as:

\begin{matrix} \frac{d}{d y} log \frac{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})}{y^{\frac{n}{2} - 1}} \end{matrix}

\begin{matrix} = \frac{y^{\frac{n}{2} - 1}}{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})} \int_{0}^{R} \frac{d}{d y} \frac{f_{χ_{n}^{2} (\frac{t^{2}}{σ_{j}^{2}})} (y)}{y^{\frac{n}{2} - 1}} d P_{∥ X ∥} (t) \end{matrix}

(A16)

\begin{matrix} = \frac{y^{\frac{n}{2} - 1}}{f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X})} \int_{0}^{R} (\frac{f_{χ_{n - 2}^{2} (\frac{t^{2}}{σ_{j}^{2}})} (y)}{2 y^{\frac{n}{2} - 1}} - (\frac{1}{2} + \frac{\frac{n}{2} - 1}{y}) \frac{f_{χ_{n}^{2} (\frac{t^{2}}{σ_{j}^{2}})} (y)}{y^{\frac{n}{2} - 1}}) d P_{∥ X ∥} (t) \end{matrix}

(A17)

\begin{matrix} = E [\frac{1}{2} \frac{f_{χ_{n - 2}^{2} (\frac{{∥ X ∥}^{2}}{σ_{j}^{2}})} (\frac{{∥Y∥}^{2}}{σ_{j}^{2}})}{f_{χ_{n}^{2} (\frac{{∥ X ∥}^{2}}{σ_{j}^{2}})} (\frac{{∥Y∥}^{2}}{σ_{j}^{2}})} - (\frac{1}{2} + \frac{\frac{n}{2} - 1}{\frac{{∥Y∥}^{2}}{σ_{j}^{2}}}) | \frac{{∥Y∥}^{2}}{σ_{j}^{2}} = y] \end{matrix}

(A18)

\begin{matrix} = E [\frac{1}{2} \frac{∥ X ∥}{∥ Y ∥} \frac{I_{\frac{n}{2} - 2} (\frac{∥ X ∥ ∥ Y ∥}{σ_{j}^{2}})}{I_{\frac{n}{2} - 1} (\frac{∥ X ∥ ∥ Y ∥}{σ_{j}^{2}})} - (\frac{1}{2} + \frac{\frac{n}{2} - 1}{\frac{{∥Y∥}^{2}}{σ_{j}^{2}}}) | \frac{{∥Y∥}^{2}}{σ_{j}^{2}} = y] \end{matrix}

(A19)

\begin{matrix} = E [\frac{1}{2} \frac{∥ X ∥}{∥ Y ∥} h_{\frac{n}{2}} (\frac{∥ X ∥ ∥ Y ∥}{σ_{j}^{2}}) - \frac{1}{2} | \frac{{∥Y∥}^{2}}{σ_{j}^{2}} = y] \end{matrix}

(A20)

where, in (A16), we used

f_{∥ \frac{Y}{σ_{j}} ∥^{2}} (y; P_{X}) = \int_{0}^{R} f_{χ_{n}^{2} (\frac{t^{2}}{σ_{j}^{2}})} (y) d P_{∥ X ∥} (t);

(A21)

in (A17), we used the relationship

\frac{d}{d y} f_{χ_{n}^{2} (ρ^{2})} (y) = \frac{1}{2} f_{χ_{n - 2}^{2} (ρ^{2})} (y) - \frac{1}{2} f_{χ_{n}^{2} (ρ^{2})} (y);

(A22)

and (A20) follows from the recurrence relationship

I_{ν - 1} (z) - I_{ν + 1} (z) = \frac{2 ν}{z} I_{ν} (z) .

(A23)
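The derivative relationship (A22) is easy to sanity-check numerically. The following is a minimal sketch, not the authors' code: it builds the noncentral chi-square pdf from its Poisson-weighted mixture form and compares a central finite difference with the right-hand side of (A22). The function names, the series truncation rule, and the step size `eps` are illustrative assumptions, and the check assumes n ≥ 3 so that n − 2 degrees of freedom is positive.

```python
import math

def ncx2_pdf(y, dof, lam):
    """Noncentral chi-square pdf, from its Poisson-weighted mixture form."""
    if y <= 0.0:
        return 0.0
    half = lam / 2.0

    def log_central(k):
        # log-pdf of a central chi-square variable with k degrees of freedom
        return ((k / 2.0 - 1.0) * math.log(y) - y / 2.0
                - (k / 2.0) * math.log(2.0) - math.lgamma(k / 2.0))

    if half == 0.0:
        return math.exp(log_central(dof))
    n_terms = int(math.sqrt(lam * y)) + 40  # generous truncation (assumption)
    total = 0.0
    for i in range(n_terms):
        log_weight = -half + i * math.log(half) - math.lgamma(i + 1.0)
        total += math.exp(log_weight + log_central(dof + 2 * i))
    return total

def derivative_gap(y, n, lam, eps=1e-5):
    """|finite-difference derivative - right-hand side of (A22)| at the point y."""
    numeric = (ncx2_pdf(y + eps, n, lam) - ncx2_pdf(y - eps, n, lam)) / (2.0 * eps)
    closed_form = 0.5 * ncx2_pdf(y, n - 2, lam) - 0.5 * ncx2_pdf(y, n, lam)
    return abs(numeric - closed_form)

if __name__ == "__main__":
    print(derivative_gap(3.2, 5, 1.7))
```

The gap should be at the level of the finite-difference error, which supports the use of (A22) in (A17).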

Putting together (A15) and (A20), we find

\begin{matrix} i_{j} (ρ_{1}; P_{X}) - i_{j} (ρ_{2}; P_{X}) & = - \frac{ρ_{1}^{2} - ρ_{2}^{2}}{2 σ_{j}^{2}} E [E [\frac{∥ X ∥}{∥ Y ∥} h_{\frac{n}{2}} (\frac{∥ X ∥ ∥ Y ∥}{σ_{j}^{2}}) - 1 | \frac{{∥Y∥}^{2}}{σ_{j}^{2}} = Q_{j}]] . \end{matrix}

(A24)

We are now in the position to compute the derivative of the information density as

\begin{matrix} i_{j}^{'} (ρ; P_{X}) & = lim_{h \to 0} \frac{i_{j} (ρ + h; P_{X}) - i_{j} (ρ; P_{X})}{h} \end{matrix}

(A25)

\begin{matrix} = - \frac{ρ}{σ_{j}^{2}} E [E [\frac{∥ X ∥}{∥ Y ∥} h_{\frac{n}{2}} (\frac{∥ X ∥ ∥ Y ∥}{σ_{j}^{2}}) - 1 | \frac{{∥Y∥}^{2}}{σ_{j}^{2}} = Q^{'}]], \end{matrix}

(A26)

where

Q^{'} \sim χ_{n + 2}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})

thanks to Lemma A2.

The final result is obtained by letting

\begin{matrix} {\tilde{Ξ}}^{'} (∥ x ∥; P_{∥ X ∥}) = i_{1}^{'} (∥ x ∥; P_{X}) - i_{2}^{'} (∥ x ∥; P_{X}) \end{matrix}

(A27)

and by specializing the result to the input

P_{X_{R}}

. □

Lemma A2.

Consider the pdf

f_{Q_{j}} (y; ρ_{1}, ρ_{2})

defined in (A14). For any

ρ \geq 0

we have

lim_{h \to 0} f_{Q_{j}} (y; ρ + h, ρ) = f_{χ_{n + 2}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (y), y > 0 .

(A28)

Proof.

Thanks to the definition (A14), we have

\begin{matrix} lim_{h \to 0} f_{Q_{j}} (y; ρ + h, ρ) \end{matrix}

\begin{matrix} = lim_{h \to 0} \frac{σ_{j}^{2}}{h (2 ρ + h)} (F_{χ_{n}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (y) - F_{χ_{n}^{2} (\frac{{(ρ + h)}^{2}}{σ_{j}^{2}})} (y)) \end{matrix}

(A29)

\begin{matrix} = lim_{h \to 0} \frac{σ_{j}^{2}}{h (2 ρ + h)} \int_{0}^{y} (f_{χ_{n}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (t) - f_{χ_{n}^{2} (\frac{{(ρ + h)}^{2}}{σ_{j}^{2}})} (t)) d t \end{matrix}

(A30)

\begin{matrix} = \frac{σ_{j}^{2}}{2 ρ} \int_{0}^{y} \sum_{i = 0}^{\infty} lim_{h \to 0} \frac{1}{h} (\frac{e^{- \frac{ρ^{2}}{2 σ_{j}^{2}}} {(\frac{ρ^{2}}{2 σ_{j}^{2}})}^{i}}{i!} - \frac{e^{- \frac{{(ρ + h)}^{2}}{2 σ_{j}^{2}}} {(\frac{{(ρ + h)}^{2}}{2 σ_{j}^{2}})}^{i}}{i!}) f_{χ_{n + 2 i}^{2}} (t) d t \end{matrix}

(A31)

\begin{matrix} = - \frac{σ_{j}^{2}}{2 ρ} \int_{0}^{y} \sum_{i = 0}^{\infty} \frac{d}{d ρ} (\frac{e^{- \frac{ρ^{2}}{2 σ_{j}^{2}}} {(\frac{ρ^{2}}{2 σ_{j}^{2}})}^{i}}{i!}) f_{χ_{n + 2 i}^{2}} (t) d t \end{matrix}

(A32)

\begin{matrix} = \frac{1}{2} \int_{0}^{y} \sum_{i = 0}^{\infty} (\frac{e^{- \frac{ρ^{2}}{2 σ_{j}^{2}}} {(\frac{ρ^{2}}{2 σ_{j}^{2}})}^{i}}{i!} - \frac{e^{- \frac{ρ^{2}}{2 σ_{j}^{2}}} {(\frac{ρ^{2}}{2 σ_{j}^{2}})}^{i - 1}}{(i - 1)!} 1 (i \geq 1)) f_{χ_{n + 2 i}^{2}} (t) d t \end{matrix}

(A33)

\begin{matrix} = \frac{1}{2} \int_{0}^{y} (f_{χ_{n}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (t) - f_{χ_{n + 2}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (t)) d t \end{matrix}

(A34)

\begin{matrix} = \int_{0}^{y} \frac{d}{d t} f_{χ_{n + 2}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (t) d t \end{matrix}

(A35)

\begin{matrix} = f_{χ_{n + 2}^{2} (\frac{ρ^{2}}{σ_{j}^{2}})} (y), \end{matrix}

(A36)

where

1 (\cdot)

is the indicator function; in (A31) we used the Poisson-weighted mixture representation of the noncentral chi-square pdf, and in (A35), we used (A22). □
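Lemma A2 can also be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it evaluates the pdf in (A14) for a small but finite h, obtaining the difference of the two cdfs by midpoint quadrature of the pdf difference, and compares it with the limiting pdf in (A28). The helper names, the value of `h`, and the number of quadrature steps are hypothetical choices.

```python
import math

def ncx2_pdf(y, dof, lam):
    """Noncentral chi-square pdf, from its Poisson-weighted mixture form."""
    if y <= 0.0:
        return 0.0
    half = lam / 2.0

    def log_central(k):
        # log-pdf of a central chi-square variable with k degrees of freedom
        return ((k / 2.0 - 1.0) * math.log(y) - y / 2.0
                - (k / 2.0) * math.log(2.0) - math.lgamma(k / 2.0))

    if half == 0.0:
        return math.exp(log_central(dof))
    n_terms = int(math.sqrt(lam * y)) + 40  # generous truncation (assumption)
    total = 0.0
    for i in range(n_terms):
        log_weight = -half + i * math.log(half) - math.lgamma(i + 1.0)
        total += math.exp(log_weight + log_central(dof + 2 * i))
    return total

def auxiliary_pdf_and_limit(y, n, rho, var, h=1e-4, steps=2000):
    """Return (f_Q(y; rho + h, rho) from (A14), limiting pdf from (A28))."""
    lam_lo = rho ** 2 / var
    lam_hi = (rho + h) ** 2 / var
    dt = y / steps
    cdf_diff = 0.0  # F_{chi2_n(lam_lo)}(y) - F_{chi2_n(lam_hi)}(y)
    for m in range(steps):
        t = (m + 0.5) * dt  # midpoint rule
        cdf_diff += (ncx2_pdf(t, n, lam_lo) - ncx2_pdf(t, n, lam_hi)) * dt
    f_q = var / (h * (2.0 * rho + h)) * cdf_diff
    return f_q, ncx2_pdf(y, n + 2, lam_lo)

if __name__ == "__main__":
    print(auxiliary_pdf_and_limit(2.5, 4, 1.2, 1.0))
```

For small h the two returned values should agree up to an error of order h, consistent with (A28).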

Lemma A3.

There exists some

L = L (σ_{1}, σ_{2}, R) < \infty

such that

\begin{matrix} N (\mathbb{R}, g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) = N ([- L, L], g (\cdot) + log (\frac{σ_{2}}{σ_{1}}) - C_{s}) < \infty . \end{matrix}

(A37)

Furthermore, L can be upper-bounded as follows:

L \leq R d_{1} + d_{2}

(A38)

where

\begin{matrix} d_{1} & = \frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}}, \end{matrix}

(A39)

\begin{matrix} d_{2} & = \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}} \leq \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{G}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}}, \end{matrix}

(A40)

with

\begin{matrix} C_{G} (σ_{1}^{2}, σ_{2}^{2}, R^{2}, 1) & = \frac{1}{2} log \frac{1 + R^{2} / σ_{1}^{2}}{1 + R^{2} / σ_{2}^{2}} . \end{matrix}

(A41)

Proof.

First, note that

C_{s} \leq C_{G}

thanks to (69). Second, for

| y | \geq R

, we can lower-bound the function g as follows:

\begin{matrix} g (y) & = E [log f_{Y_{2}^{★}} (y + N)] - log f_{Y_{1}^{★}} (y) \end{matrix}

(A42)

\begin{matrix} = E [log E [ϕ_{σ_{2}} (y + N - X^{★}) | N]] - log E [ϕ_{σ_{1}} (y - X^{★})] \end{matrix}

(A43)

\begin{matrix} \geq E [log ϕ_{σ_{2}} (y + N - X^{★})] - log E [ϕ_{σ_{1}} (y - X^{★})] \end{matrix}

(A44)

\begin{matrix} \geq log \frac{σ_{1}}{σ_{2}} - E [\frac{{(y + N - X^{★})}^{2}}{2 σ_{2}^{2}}] + \frac{{(| y | - R)}^{2}}{2 σ_{1}^{2}} \end{matrix}

(A45)

\begin{matrix} = log \frac{σ_{1}}{σ_{2}} - E [\frac{{(y - X^{★})}^{2}}{2 σ_{2}^{2}}] - \frac{σ_{2}^{2} - σ_{1}^{2}}{2 σ_{2}^{2}} + \frac{{(| y | - R)}^{2}}{2 σ_{1}^{2}} \end{matrix}

(A46)

\begin{matrix} \geq log \frac{σ_{1}}{σ_{2}} - \frac{{(| y | + R)}^{2}}{2 σ_{2}^{2}} - \frac{σ_{2}^{2} - σ_{1}^{2}}{2 σ_{2}^{2}} + \frac{{(| y | - R)}^{2}}{2 σ_{1}^{2}}, \end{matrix}

(A47)

where (A44) follows from applying Jensen’s inequality and the law of iterated expectation to the first term; (A45) follows from

E [ϕ_{σ_{1}} (y - X^{★})] \leq ϕ_{σ_{1}} (| y | - R), | y | \geq R;

(A48)

and (A47) follows from

{(y - X^{★})}^{2} \leq {(| y | + R)}^{2}

for all

| y | \geq R \geq | X^{★} |

. The RHS of

\begin{matrix} g (y) + log (\frac{σ_{2}}{σ_{1}}) - C_{s} & \geq - \frac{{(| y | + R)}^{2}}{2 σ_{2}^{2}} - \frac{σ_{2}^{2} - σ_{1}^{2}}{2 σ_{2}^{2}} + \frac{{(| y | - R)}^{2}}{2 σ_{1}^{2}} - C_{s} \end{matrix}

(A49)

is strictly positive when

| y | > \frac{R (\frac{1}{σ_{1}^{2}} + \frac{1}{σ_{2}^{2}}) + \sqrt{\frac{4 R^{2}}{σ_{1}^{2} σ_{2}^{2}} + (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) (\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s})}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}} .

(A50)

By using the bound

\sqrt{a + b} \leq \sqrt{a} + \sqrt{b}

, we arrive at

\begin{matrix} | y | & \geq R \frac{σ_{2} + σ_{1}}{σ_{2} - σ_{1}} + \sqrt{\frac{\frac{σ_{2}^{2} - σ_{1}^{2}}{σ_{2}^{2}} + 2 C_{s}}{\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}}} . \end{matrix}

(A51)

This concludes the proof for the bound on L. □
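As a quick illustration of Lemma A3, the sketch below evaluates the computable version of the bound (A38), with C_G from (A41) substituted for C_s as in (A40), and compares it with the exact threshold in (A50). It assumes natural logarithms (nats) and takes the standard deviations σ_1 < σ_2 as inputs; the function names and the sample parameters are hypothetical.

```python
import math

def gaussian_secrecy_capacity(s1, s2, R):
    """C_G in (A41), in nats, for standard deviations s1 < s2."""
    return 0.5 * math.log((1.0 + R ** 2 / s1 ** 2) / (1.0 + R ** 2 / s2 ** 2))

def rhs_a49(y, s1, s2, R, cs):
    """Right-hand side of (A49) evaluated at |y|."""
    return (-(abs(y) + R) ** 2 / (2.0 * s2 ** 2)
            - (s2 ** 2 - s1 ** 2) / (2.0 * s2 ** 2)
            + (abs(y) - R) ** 2 / (2.0 * s1 ** 2) - cs)

def exact_threshold(s1, s2, R, cs):
    """Threshold on |y| in (A50): the larger root of the quadratic behind (A49)."""
    a, b = 1.0 / s1 ** 2, 1.0 / s2 ** 2
    k = (s2 ** 2 - s1 ** 2) / s2 ** 2 + 2.0 * cs
    return (R * (a + b) + math.sqrt(4.0 * R ** 2 * a * b + (a - b) * k)) / (a - b)

def zero_location_bound(s1, s2, R):
    """R * d1 + d2 from (A38)-(A40), with C_G used in place of C_s."""
    cg = gaussian_secrecy_capacity(s1, s2, R)
    d1 = (s2 + s1) / (s2 - s1)
    d2 = math.sqrt(((s2 ** 2 - s1 ** 2) / s2 ** 2 + 2.0 * cg)
                   / (1.0 / s1 ** 2 - 1.0 / s2 ** 2))
    return R * d1 + d2

if __name__ == "__main__":
    s1, s2, R = 1.0, 2.0, 1.5
    print(exact_threshold(s1, s2, R, gaussian_secrecy_capacity(s1, s2, R)),
          zero_location_bound(s1, s2, R))
```

The simplified bound is looser than the exact threshold, which is the price of the inequality \sqrt{a + b} \leq \sqrt{a} + \sqrt{b} used to reach (A51).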

Lemma A4.

Let

\overset{˘}{h} : \mathbb{C} \to \mathbb{C}

denote the complex extension of the function h in (123). Then, for

B \geq R

, we have that

\begin{matrix} max_{| z | \leq B} | \overset{˘}{h} (z) | \leq \frac{1}{\sqrt{2 π σ_{1}^{2}}} e^{\frac{B^{2}}{2 σ_{1}^{2}}} (a_{1} B^{2} + a_{2} B + a_{3}) \end{matrix}

(A52)

where

\begin{matrix} a_{1} & = \frac{3 σ_{1}^{2}}{σ_{2}^{2} \sqrt{σ_{2}^{2} - σ_{1}^{2}}}, \end{matrix}

(A53)

\begin{matrix} a_{2} & = \frac{\sqrt{2} σ_{1}^{2}}{\sqrt{σ_{2}^{2}} \sqrt{σ_{2}^{2} - σ_{1}^{2}}} + 2, \end{matrix}

(A54)

\begin{matrix} a_{3} & = \frac{σ_{1}^{2}}{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} (\sqrt{| log (2 π σ_{2}^{2}) |^{2} + \frac{24 {(σ_{2}^{2} - σ_{1}^{2})}^{2}}{σ_{2}^{4}} + π^{2}}) . \end{matrix}

(A55)

Proof.

Let us denote

z = z_{R} + i z_{I}

, where

z_{R}

and

z_{I}

are real numbers and

i = \sqrt{- 1}

is the imaginary unit. Then, by the triangle inequality, we have:

\begin{matrix} | \overset{˘}{h} (z) | & = |\frac{σ_{1}^{2} f_{Y_{1}} (z) E [N log f_{Y_{2}} (z + N)]}{σ_{2}^{2} - σ_{1}^{2}} - E [X^{★} ϕ_{σ_{1}} (z - X^{★})] + z f_{Y_{1}} (z)| \end{matrix}

(A56)

\begin{matrix} \leq |f_{Y_{1}} (z)| (\frac{σ_{1}^{2}}{σ_{2}^{2} - σ_{1}^{2}} E [| N | \cdot | log f_{Y_{2}} (z + N) |] + | z |) + E [| X^{★} | \cdot | ϕ_{σ_{1}} (z - X^{★}) |] . \end{matrix}

(A57)

Next, let us upper-bound each contribution of (A57). For

| z | \leq B

, we have

\begin{matrix} {|log f_{Y_{2}} (z + n)|}^{2} \end{matrix}

\begin{matrix} = {|log |f_{Y_{2}} (z + n)| + i arg (f_{Y_{2}} (z + n))|}^{2} \end{matrix}

(A58)

\begin{matrix} = {log}^{2} | f_{Y_{2}} (z + n) | + {arg}^{2} (f_{Y_{2}} (z + n)) \end{matrix}

(A59)

\begin{matrix} = {log}^{2} |E [ϕ_{σ_{2}} (z + n - X^{★})]| + {arg}^{2} (E [ϕ_{σ_{2}} (z + n - X^{★})]) \end{matrix}

(A60)

\begin{matrix} \leq {log}^{2} (\frac{1}{\sqrt{2 π σ_{2}^{2}}} E [exp (- \frac{{(z_{R} + n - X^{★})}^{2} - z_{I}^{2}}{2 σ_{2}^{2}})]) + {arg}^{2} (\sum_{x} α_{x} exp (i θ_{x})) \end{matrix}

(A61)

\begin{matrix} \leq {(\frac{z_{I}^{2}}{2 σ_{2}^{2}} - \frac{1}{2} log (2 π σ_{2}^{2}) + log E [e^{- \frac{{(z_{R} + n - X^{★})}^{2}}{2 σ_{2}^{2}}}])}^{2} + π^{2} \end{matrix}

(A62)

\begin{matrix} \leq 2 {(\frac{z_{I}^{2}}{2 σ_{2}^{2}} - \frac{1}{2} log (2 π σ_{2}^{2}))}^{2} + 2 {log}^{2} E [e^{- \frac{{(z_{R} + n - X^{★})}^{2}}{2 σ_{2}^{2}}}] + π^{2} \end{matrix}

(A63)

\begin{matrix} \leq 2 {(\frac{z_{I}^{2}}{2 σ_{2}^{2}} - \frac{1}{2} log (2 π σ_{2}^{2}))}^{2} + 2 \frac{E^{2} [{(z_{R} + n - X^{★})}^{2}]}{4 σ_{2}^{4}} + π^{2} \end{matrix}

(A64)

\begin{matrix} \leq 2 {(\frac{z_{I}^{2}}{2 σ_{2}^{2}} - \frac{1}{2} log (2 π σ_{2}^{2}))}^{2} + 2 \frac{{({(z_{R} + n)}^{2} + R^{2})}^{2}}{4 σ_{2}^{4}} + π^{2} \end{matrix}

(A65)

\begin{matrix} \leq \frac{2 B^{2}}{σ_{2}^{2}} + {| log (2 π σ_{2}^{2}) |}^{2} + \frac{8 (B^{4} + n^{4}) + R^{4}}{σ_{2}^{4}} + π^{2}, \end{matrix}

(A66)

where step (A61) holds by the triangle inequality; step (A62) holds by noticing that

- π < arg (\sum_{x \in supp (P_{X^{★}})} α_{x} exp (i θ_{x})) \leq π,

(A67)

where

{α_{x}}

and

{θ_{x}}

are real numbers that depend on x; (A63) follows from using the bound

{(a + b)}^{2} \leq 2 (a^{2} + b^{2})

; (A64) holds because

x \mapsto {log}^{2} (x)

is a decreasing function for

x < 1

and because

E [e^{- \frac{{(z_{R} + n - X^{★})}^{2}}{2 σ_{2}^{2}}}] \geq e^{- \frac{E [{(z_{R} + n - X^{★})}^{2}]}{2 σ_{2}^{2}}}

, which follows from Jensen’s inequality; (A65) follows from

E [X^{★}] = 0

and

E [{(X^{★})}^{2}] \leq R^{2}

; and (A66) follows from the bound

{| a + b |}^{k} \leq 2^{k - 1} ({| a |}^{k} + {| b |}^{k})

for

k \geq 1

. Furthermore, given that

| z_{R} | \leq B

and

| z_{I} | \leq B

, we arrive at the bound

{({(z_{R} + n)}^{2} + R^{2})}^{2} \leq 2 (8 (B^{4} + n^{4}) + R^{4}) .

(A68)

Consequently,

\begin{matrix} \frac{E [| N | \cdot | log f_{Y_{2}} (z + N) |]}{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} \end{matrix}

\begin{matrix} \leq \frac{\sqrt{E [{| N |}^{2}] E [{| log f_{Y_{2}} (z + N) |}^{2}]}}{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} \end{matrix}

(A69)

\begin{matrix} \leq \sqrt{\frac{2 B^{2}}{σ_{2}^{2}} + {| log (2 π σ_{2}^{2}) |}^{2} + \frac{8 (B^{4} + E [N^{4}]) + R^{4}}{σ_{2}^{4}} + π^{2}} \end{matrix}

(A70)

\begin{matrix} = \sqrt{\frac{2 B^{2}}{σ_{2}^{2}} + {| log (2 π σ_{2}^{2}) |}^{2} + \frac{8 B^{4} + 24 {(σ_{2}^{2} - σ_{1}^{2})}^{2} + R^{4}}{σ_{2}^{4}} + π^{2}}, \end{matrix}

(A71)

where (A69) follows from the Cauchy–Schwarz inequality; (A70) follows from (A66) together with

E [N^{2}] = σ_{2}^{2} - σ_{1}^{2}

; and (A71) follows from

E [N^{4}] = 3 {(σ_{2}^{2} - σ_{1}^{2})}^{2}

. Moreover, we have

\begin{matrix} | f_{Y_{1}} (z) | & \leq E [|ϕ_{σ_{1}} (z - X^{★})|] \end{matrix}

(A72)

\begin{matrix} = \frac{1}{\sqrt{2 π σ_{1}^{2}}} E [exp (- \frac{{(z_{R} - X^{★})}^{2} - z_{I}^{2}}{2 σ_{1}^{2}})] \end{matrix}

(A73)

\begin{matrix} \leq \frac{1}{\sqrt{2 π σ_{1}^{2}}} exp (\frac{B^{2}}{2 σ_{1}^{2}}), \end{matrix}

(A74)

and finally

\begin{matrix} E [| X^{★} | \cdot | ϕ_{σ_{1}} (z - X^{★}) |] & \leq R E [| ϕ_{σ_{1}} (z - X^{★}) |] \end{matrix}

(A75)

\begin{matrix} \leq R \frac{1}{\sqrt{2 π σ_{1}^{2}}} exp (\frac{B^{2}}{2 σ_{1}^{2}}) . \end{matrix}

(A76)

Putting all contributions together, we get

\begin{matrix} | \overset{˘}{h} (z) | \sqrt{2 π σ_{1}^{2}} e^{- \frac{B^{2}}{2 σ_{1}^{2}}} & \leq \frac{σ_{1}^{2} \sqrt{\frac{2 B^{2}}{σ_{2}^{2}} + {| log (2 π σ_{2}^{2}) |}^{2} + \frac{8 B^{4} + 24 {(σ_{2}^{2} - σ_{1}^{2})}^{2} + R^{4}}{σ_{2}^{4}} + π^{2}}}{\sqrt{σ_{2}^{2} - σ_{1}^{2}}} + B + R \end{matrix}

(A77)

\begin{matrix} \leq a_{1} B^{2} + a_{2} B + a_{3}, \end{matrix}

(A78)

where, in the last step, we have used that

\sqrt{\sum_{i} x_{i}} \leq \sum_{i} \sqrt{x_{i}}

and the fact that

R \leq B

. □

Lemma A5.

Let

\overset{˘}{h} : \mathbb{C} \to \mathbb{C}

denote the complex extension of the function h in (123). Then, for

\begin{matrix} B \geq R \frac{σ_{2}^{2} + σ_{1}^{2}}{σ_{2}^{2} - σ_{1}^{2}}, \end{matrix}

(A79)

we have that

\begin{matrix} max_{| z | \leq B} | \overset{˘}{h} (z) | \geq (c_{1} B - c_{2} R) \frac{exp (- \frac{{(B + R)}^{2}}{2 σ_{1}^{2}})}{\sqrt{2 π σ_{1}^{2}}} > 0, \end{matrix}

(A80)

where

c_{1} = 1 - \frac{σ_{1}^{2}}{σ_{2}^{2}}

and

c_{2} = 1 + \frac{σ_{1}^{2}}{σ_{2}^{2}}

.

Proof.

First, note that

\begin{matrix} \frac{E_{N} [E [X^{★} | Y_{2} = B + N]]}{σ_{2}^{2}} - \frac{E [X^{★} | Y_{1} = B]}{σ_{1}^{2}} \geq - \frac{R}{σ_{2}^{2}} - \frac{R}{σ_{1}^{2}} . \end{matrix}

(A81)

Second, note that the condition in (A79) implies that

\begin{matrix} 0 \leq B (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{R}{σ_{2}^{2}} - \frac{R}{σ_{1}^{2}} . \end{matrix}

(A82)

Therefore, by using (111) together with (A81) and (A82), we arrive at

\begin{matrix} max_{| z | \leq B} | \overset{˘}{h} (z) | & \geq |\overset{˘}{h} (B)| \end{matrix}

(A83)

\begin{matrix} = |\frac{E [E [X^{★} | Y_{2} = B + N]] - B}{σ_{2}^{2}} - \frac{E [X^{★} | Y_{1} = B] - B}{σ_{1}^{2}}| σ_{1}^{2} f_{Y_{1}} (B) \end{matrix}

(A84)

\begin{matrix} \geq (B (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{R}{σ_{2}^{2}} - \frac{R}{σ_{1}^{2}}) σ_{1}^{2} f_{Y_{1}} (B) \end{matrix}

(A85)

\begin{matrix} \geq (B (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}}) - \frac{R}{σ_{2}^{2}} - \frac{R}{σ_{1}^{2}}) \frac{σ_{1}^{2}}{\sqrt{2 π σ_{1}^{2}}} exp (- \frac{{(B + R)}^{2}}{2 σ_{1}^{2}}), \end{matrix}

(A86)

where, in the last bound, we have used Jensen’s inequality to arrive at

\begin{matrix} f_{Y_{1}} (B) & = E [ϕ_{σ_{1}} (B - X^{★})] \end{matrix}

(A87)

\begin{matrix} = \frac{1}{\sqrt{2 π σ_{1}^{2}}} E [exp (- \frac{{(B - X^{★})}^{2}}{2 σ_{1}^{2}})] \end{matrix}

(A88)

\begin{matrix} \geq \frac{1}{\sqrt{2 π σ_{1}^{2}}} exp (- \frac{{(B + R)}^{2}}{2 σ_{1}^{2}}) . \end{matrix}

(A89)

This concludes the proof. □
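To get a feel for the magnitudes involved, the two bounds of Lemmas A4 and A5 can be evaluated side by side. The sketch below is only an illustration: it codes the right-hand sides of (A52) and (A80) with the constants (A53)–(A55) and c_1, c_2, taking the variances σ_1^2 < σ_2^2 as inputs and using natural logarithms. The function names and sample values are hypothetical.

```python
import math

def upper_bound_a52(B, var1, var2):
    """Right-hand side of (A52); var1 and var2 are the variances sigma_1^2 < sigma_2^2."""
    gap = math.sqrt(var2 - var1)
    a1 = 3.0 * var1 / (var2 * gap)
    a2 = math.sqrt(2.0) * var1 / (math.sqrt(var2) * gap) + 2.0
    a3 = var1 / gap * math.sqrt(math.log(2.0 * math.pi * var2) ** 2
                                + 24.0 * (var2 - var1) ** 2 / var2 ** 2
                                + math.pi ** 2)
    envelope = math.exp(B ** 2 / (2.0 * var1)) / math.sqrt(2.0 * math.pi * var1)
    return envelope * (a1 * B ** 2 + a2 * B + a3)

def lower_bound_a80(B, var1, var2, R):
    """Right-hand side of (A80), meaningful when B satisfies (A79)."""
    c1 = 1.0 - var1 / var2
    c2 = 1.0 + var1 / var2
    envelope = math.exp(-(B + R) ** 2 / (2.0 * var1)) / math.sqrt(2.0 * math.pi * var1)
    return (c1 * B - c2 * R) * envelope

def smallest_admissible_b(var1, var2, R):
    """Threshold on B from (A79)."""
    return R * (var2 + var1) / (var2 - var1)

if __name__ == "__main__":
    var1, var2, R = 1.0, 4.0, 1.0
    B = 2.0  # any value at or above smallest_admissible_b(var1, var2, R)
    print(lower_bound_a80(B, var1, var2, R), upper_bound_a52(B, var1, var2))
```

At the threshold in (A79) the lower bound vanishes, and it becomes strictly positive as soon as B exceeds that threshold.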

Appendix C. Proof of Theorem 7

To study the large n behavior, we need the following bounds on the function

h_{ν}

[39,40]: for

ν > \frac{1}{2}

\begin{matrix} h_{ν} (x) = \frac{x}{\frac{2 ν - 1}{2} + \sqrt{\frac{{(2 ν - 1)}^{2}}{4} + x^{2}}} \cdot g_{ν} (x), \end{matrix}

(A90)

where

\begin{matrix} 1 \geq g_{ν} (x) \geq \frac{\frac{2 ν - 1}{2} + \sqrt{\frac{{(2 ν - 1)}^{2}}{4} + x^{2}}}{ν + \sqrt{ν^{2} + x^{2}}} . \end{matrix}

(A91)
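The sandwich implied by (A90) and (A91) can be verified numerically for a few orders and arguments. The sketch below is a check under stated assumptions, not a proof: it takes h_ν to be the ratio of modified Bessel functions I_ν / I_{ν−1}, as suggested by (A19)–(A20), and evaluates the Bessel functions by a truncated power series that is adequate only for moderate arguments. All function names are hypothetical.

```python
import math

def bessel_i(nu, x, extra_terms=60):
    """Modified Bessel function I_nu(x), x > 0, via a truncated power series."""
    total = 0.0
    for k in range(int(x) + extra_terms):
        log_term = ((2 * k + nu) * math.log(x / 2.0)
                    - math.lgamma(k + 1.0) - math.lgamma(k + nu + 1.0))
        total += math.exp(log_term)
    return total

def h_ratio(nu, x):
    """h_nu(x) = I_nu(x) / I_{nu-1}(x) (assumed definition)."""
    return bessel_i(nu, x) / bessel_i(nu - 1.0, x)

def upper_bound(nu, x):
    """Upper bound on h_nu from (A90) with g_nu <= 1."""
    half = (2.0 * nu - 1.0) / 2.0
    return x / (half + math.sqrt(half ** 2 + x ** 2))

def lower_bound(nu, x):
    """Lower bound on h_nu from (A90) combined with the lower bound in (A91)."""
    return x / (nu + math.sqrt(nu ** 2 + x ** 2))

if __name__ == "__main__":
    for nu in (1.0, 1.5, 5.5):
        for x in (0.1, 1.0, 5.0, 20.0):
            print(nu, x, lower_bound(nu, x), h_ratio(nu, x), upper_bound(nu, x))
```

For ν = 3/2 the ratio has the closed form coth(x) − 1/x, which gives an independent check of the series evaluation.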

Moreover, let

U_{n} = ∥ R + \sqrt{s} Z ∥

(A92)

with

Z \sim N (0_{n}, I_{n})

. Consequently,

\begin{matrix} lim_{n \to \infty} E [h_{\frac{n}{2}}^{2} (\frac{∥ R + \sqrt{s} Z ∥ R}{s})] \end{matrix}

\begin{matrix} = E [lim_{n \to \infty} h_{\frac{n}{2}}^{2} (\frac{∥ R + \sqrt{s} Z ∥ R}{s})] \end{matrix}

(A93)

\begin{matrix} = E [lim_{n \to \infty} \frac{U_{n}^{2} \frac{R^{2}}{s^{2}}}{{(\frac{n - 1}{2} + \sqrt{\frac{{(n - 1)}^{2}}{4} + U_{n}^{2} \frac{R^{2}}{s^{2}}})}^{2}} \cdot g_{\frac{n}{2}}^{2} (U_{n} \frac{R}{s})] \end{matrix}

(A94)

\begin{matrix} = E [lim_{n \to \infty} \frac{\frac{1}{n} U_{n}^{2} \frac{R^{2}}{s^{2}}}{n \cdot {(\frac{1}{2} + \sqrt{\frac{1}{4} + {(\frac{1}{n} U_{n} \frac{R}{s})}^{2}})}^{2}} \cdot g_{\frac{n}{2}}^{2} (U_{n} \frac{R}{s})] \end{matrix}

(A95)

\begin{matrix} = 0, \end{matrix}

(A96)

where (A93) follows from the dominated convergence theorem, since

| h_{ν} | \leq 1

; (A94) follows from using (A90); (A96) follows from using the strong law of large numbers to note that

lim_{n \to \infty} \frac{1}{n} U_{n}^{2} = lim_{n \to \infty} \frac{{∥ R + \sqrt{s} Z ∥}^{2}}{n} = s .

(A97)

Now, combining the capacity expression in (58) and (A96), we have that

\begin{matrix} lim_{n \to \infty} C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n) = \frac{1}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{R^{2}}{s^{2}} d s = R^{2} (\frac{1}{2 σ_{1}^{2}} - \frac{1}{2 σ_{2}^{2}}) . \end{matrix}

(A98)

Appendix D. Proof of Theorem 8

Let

R_{n} = c \sqrt{n}

. Then,

\begin{matrix} lim_{n \to \infty} \frac{C_{s} (σ_{1}^{2}, σ_{2}^{2}, R_{n}, n)}{n} & = \frac{c^{2}}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{1 - {lim}_{n \to \infty} E [h_{\frac{n}{2}}^{2} (\frac{∥ R_{n} + \sqrt{s} Z ∥ R_{n}}{s})]}{s^{2}} d s \end{matrix}

(A99)

\begin{matrix} = \frac{c^{2}}{2} \int_{σ_{1}^{2}}^{σ_{2}^{2}} \frac{1 - \frac{c^{2} (c^{2} + s)}{{(\frac{s}{2} + \sqrt{\frac{s^{2}}{4} + c^{2} (c^{2} + s)})}^{2}}}{s^{2}} d s \end{matrix}

(A100)

\begin{matrix} = \frac{1}{2} log (\frac{σ_{2}^{2} (c^{2} + σ_{1}^{2})}{σ_{1}^{2} (c^{2} + σ_{2}^{2})}), \end{matrix}

(A101)

where (A100) follows from the limit established in (98). This concludes the proof.
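For completeness, the step from (A100) to (A101) can be spelled out. The following worked computation uses only the quantities already defined above; it relies on the observation that the expression under the square root in (A100) is a perfect square.

```latex
\begin{align*}
\frac{s^{2}}{4} + c^{2}(c^{2}+s) &= \Bigl(c^{2}+\frac{s}{2}\Bigr)^{2}
\quad\Longrightarrow\quad
\frac{s}{2}+\sqrt{\frac{s^{2}}{4}+c^{2}(c^{2}+s)} = c^{2}+s, \\
1-\frac{c^{2}(c^{2}+s)}{(c^{2}+s)^{2}} &= \frac{s}{c^{2}+s}, \\
\frac{c^{2}}{2}\int_{\sigma_{1}^{2}}^{\sigma_{2}^{2}} \frac{ds}{s\,(c^{2}+s)}
&= \frac{1}{2}\int_{\sigma_{1}^{2}}^{\sigma_{2}^{2}} \Bigl(\frac{1}{s}-\frac{1}{c^{2}+s}\Bigr)\,ds
= \frac{1}{2}\log\frac{\sigma_{2}^{2}\,(c^{2}+\sigma_{1}^{2})}{\sigma_{1}^{2}\,(c^{2}+\sigma_{2}^{2})}.
\end{align*}
```

The last expression coincides with (A101).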

Appendix E. Partial Derivatives for the Gradient Ascent Algorithm

The partial derivatives of the secrecy information, with respect to any mass point

ρ_{l} \in supp (P_{∥ X ∥})

, are given by

\begin{matrix} \frac{\partial}{\partial ρ_{l}} I_{s} (∥ X ∥; P_{∥ X ∥}) & = \sum_{k = 1}^{K} p_{k} \cdot \frac{\partial}{\partial ρ_{l}} \tilde{Ξ} (ρ_{k}; {\hat{P}}_{∥ X ∥}), l = 1, \dots, K . \end{matrix}

(A102)

By (A9), we have that

\tilde{Ξ} (∥ x ∥; {\hat{P}}_{∥ X ∥}) = i_{1} (∥ x ∥; P_{X}) - i_{2} (∥ x ∥; P_{X})

, where

i_{j} (∥ x ∥; P_{X})

, for

j = 1, 2

, is defined in (A10). Therefore, to compute (A102), we define the following derivatives

\begin{matrix} \frac{\partial}{\partial ρ_{l}} i_{j} (ρ_{k}; P_{X}) & = \int_{0}^{\infty} \frac{\partial}{\partial ρ_{l}} (f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{j}^{2})} (y) log \frac{y^{\frac{n}{2} - 1}}{\sum_{m = 1}^{K} p_{m} f_{χ_{n}^{2} (ρ_{m}^{2} / σ_{j}^{2})} (y)}) d y, \end{matrix}

(A103)

where

f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{j}^{2})} (y)

is the noncentral chi-square pdf with noncentrality parameter

ρ_{k}^{2} / σ_{j}^{2}

and n degrees of freedom. Notice that the derivative of

f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{j}^{2})} (y)

with respect to

ρ_{l}

is different from zero only when

k = l

and is given by

\begin{matrix} \frac{\partial}{\partial ρ_{l}} f_{χ_{n}^{2} (ρ_{l}^{2} / σ_{j}^{2})} (y) & = \frac{ρ_{l}}{σ_{j}^{2}} (f_{χ_{n + 2}^{2} (ρ_{l}^{2} / σ_{j}^{2})} (y) - f_{χ_{n}^{2} (ρ_{l}^{2} / σ_{j}^{2})} (y)) . \end{matrix}

(A104)

Moreover, given the probability

p_{l}

associated with

ρ_{l}

, we have that

\begin{matrix} \frac{\partial}{\partial ρ_{l}} log \frac{y^{\frac{n}{2} - 1}}{\sum_{k = 1}^{K} p_{k} f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{j}^{2})} (y)} & = - p_{l} \frac{\frac{\partial}{\partial ρ_{l}} f_{χ_{n}^{2} (ρ_{l}^{2} / σ_{j}^{2})} (y)}{\sum_{k = 1}^{K} p_{k} f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{j}^{2})} (y)} . \end{matrix}

(A105)

Finally, by combining everything together, we find

\begin{matrix} \frac{\partial}{\partial ρ_{l}} I_{s} (∥ X ∥; P_{∥ X ∥}) = \\ p_{l} \int_{0}^{\infty} \frac{ρ_{l}}{σ_{1}^{2}} (f_{χ_{n + 2}^{2} (ρ_{l}^{2} / σ_{1}^{2})} (y) - f_{χ_{n}^{2} (ρ_{l}^{2} / σ_{1}^{2})} (y)) [log \frac{y^{\frac{n}{2} - 1}}{\sum_{k = 1}^{K} p_{k} f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{1}^{2})} (y)} - 1] d y \\ - p_{l} \int_{0}^{\infty} \frac{ρ_{l}}{σ_{2}^{2}} (f_{χ_{n + 2}^{2} (ρ_{l}^{2} / σ_{2}^{2})} (y) - f_{χ_{n}^{2} (ρ_{l}^{2} / σ_{2}^{2})} (y)) [log \frac{y^{\frac{n}{2} - 1}}{\sum_{k = 1}^{K} p_{k} f_{χ_{n}^{2} (ρ_{k}^{2} / σ_{2}^{2})} (y)} - 1] d y . \end{matrix}

(A106)
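The expression (A106) can be prototyped and validated against a finite difference of the secrecy information. The sketch below is an illustrative implementation under several assumptions and is not the authors' code: the input is a discrete distribution with K mass points `rhos` and probabilities `probs`, the integrals over y are approximated by a midpoint rule truncated at `y_max`, and the secrecy information is assembled from (A9)–(A10), where the additive constants of the two channels cancel. All function and parameter names are hypothetical.

```python
import math

def ncx2_pdf(y, dof, lam):
    """Noncentral chi-square pdf, from its Poisson-weighted mixture form."""
    if y <= 0.0:
        return 0.0
    half = lam / 2.0

    def log_central(k):
        # log-pdf of a central chi-square variable with k degrees of freedom
        return ((k / 2.0 - 1.0) * math.log(y) - y / 2.0
                - (k / 2.0) * math.log(2.0) - math.lgamma(k / 2.0))

    if half == 0.0:
        return math.exp(log_central(dof))
    total = 0.0
    for i in range(int(math.sqrt(lam * y)) + 40):  # truncation (assumption)
        log_weight = -half + i * math.log(half) - math.lgamma(i + 1.0)
        total += math.exp(log_weight + log_central(dof + 2 * i))
    return total

def log_ratio(y, n, var, rhos, probs):
    """Return (log of y^(n/2-1) over the output mixture pdf, the mixture pdf)."""
    mix = sum(p * ncx2_pdf(y, n, r * r / var) for r, p in zip(rhos, probs))
    return (n / 2.0 - 1.0) * math.log(y) - math.log(mix), mix

def secrecy_information(n, var1, var2, rhos, probs, y_max=50.0, steps=800):
    """Sum over k of p_k * (i_1(rho_k) - i_2(rho_k)), by midpoint quadrature."""
    dy = y_max / steps
    total = 0.0
    for sign, var in ((1.0, var1), (-1.0, var2)):
        for m in range(steps):
            y = (m + 0.5) * dy
            log_term, mix = log_ratio(y, n, var, rhos, probs)
            total += sign * mix * log_term * dy
    return total

def partial_derivative(l, n, var1, var2, rhos, probs, y_max=50.0, steps=800):
    """Midpoint-rule evaluation of (A106) for the mass point with index l."""
    dy = y_max / steps
    total = 0.0
    for sign, var in ((1.0, var1), (-1.0, var2)):
        lam = rhos[l] ** 2 / var
        for m in range(steps):
            y = (m + 0.5) * dy
            log_term, _ = log_ratio(y, n, var, rhos, probs)
            diff = ncx2_pdf(y, n + 2, lam) - ncx2_pdf(y, n, lam)  # from (A104)
            total += sign * probs[l] * rhos[l] / var * diff * (log_term - 1.0) * dy
    return total

if __name__ == "__main__":
    print(partial_derivative(1, 4, 1.0, 2.0, [0.5, 1.5], [0.3, 0.7]))
```

In a gradient ascent loop, one would update each ρ_l along this partial derivative and project the result back onto [0, R]; the step size and the projection are design choices not specified here.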

References

1. Favano, A.; Barletta, L.; Dytso, A. Simulated Data. Available online: https://github.com/ucando83/WiretapCapacity (accessed on 26 April 2023).
2. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387.
3. Leung-Yan-Cheong, S.; Hellman, M. The Gaussian wire-tap channel. IEEE Trans. Inf. Theory 1978, 24, 451–456.
4. Bloch, M.; Barros, J. Physical-Layer Security: From Information Theory to Security Engineering; Cambridge University Press: Cambridge, UK, 2011.
5. Oggier, F.; Hassibi, B. A Perspective on the MIMO Wiretap Channel. Proc. IEEE 2015, 103, 1874–1882.
6. Liang, Y.; Poor, H.V.; Shamai (Shitz), S. Information theoretic security. Found. Trends Commun. Inf. Theory 2009, 5, 355–580.
7. Poor, H.V.; Schaefer, R.F. Wireless physical layer security. Proc. Natl. Acad. Sci. USA 2017, 114, 19–26.
8. Mukherjee, A.; Fakoorian, S.A.A.; Huang, J.; Swindlehurst, A.L. Principles of physical layer security in multiuser wireless networks: A survey. IEEE Commun. Surv. Tutor. 2014, 16, 1550–1573.
9. Gopala, P.K.; Lai, L.; El Gamal, H. On the secrecy capacity of fading channels. IEEE Trans. Inf. Theory 2008, 54, 4687–4698.
10. Bloch, M.; Barros, J.; Rodrigues, M.R.; McLaughlin, S.W. Wireless information-theoretic security. IEEE Trans. Inf. Theory 2008, 54, 2515–2534.
11. Khisti, A.; Tchamkerten, A.; Wornell, G.W. Secure broadcasting over fading channels. IEEE Trans. Inf. Theory 2008, 54, 2453–2469.
12. Liang, Y.; Poor, H.V.; Shamai, S. Secure communication over fading channels. IEEE Trans. Inf. Theory 2008, 54, 2470–2492.
13. Shafiee, S.; Liu, N.; Ulukus, S. Towards the secrecy capacity of the Gaussian MIMO wire-tap channel: The 2-2-1 channel. IEEE Trans. Inf. Theory 2009, 55, 4033–4039.
14. Khisti, A.; Wornell, G.W. Secure transmission with multiple antennas–Part II: The MIMOME wiretap channel. IEEE Trans. Inf. Theory 2010, 56, 5515–5532.
15. Oggier, F.; Hassibi, B. The secrecy capacity of the MIMO wiretap channel. IEEE Trans. Inf. Theory 2011, 57, 4961–4972.
16. Guo, D.; Shamai, S.; Verdú, S. Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans. Inf. Theory 2005, 51, 1261–1282.
17. Bustin, R.; Liu, R.; Poor, H.V.; Shamai, S. An MMSE approach to the secrecy capacity of the MIMO Gaussian wiretap channel. Eurasip J. Wirel. Commun. Netw. 2009, 2009, 370970.
18. Liu, T.; Shamai, S. A note on the secrecy capacity of the multiple-antenna wiretap channel. IEEE Trans. Inf. Theory 2009, 55, 2547–2553.
19. Loyka, S.; Charalambous, C.D. An algorithm for global maximization of secrecy rates in Gaussian MIMO wiretap channels. IEEE Trans. Commun. 2015, 63, 2288–2299.
20. Loyka, S.; Charalambous, C.D. Optimal signaling for secure communications over Gaussian MIMO wiretap channels. IEEE Trans. Inf. Theory 2016, 62, 7207–7215.
21. Ozel, O.; Ekrem, E.; Ulukus, S. Gaussian wiretap channel with amplitude and variance constraints. IEEE Trans. Inf. Theory 2015, 61, 5553–5563.
22. Soltani, M.; Rezki, Z. Optical wiretap channel with input-dependent Gaussian noise under peak-and average-intensity constraints. IEEE Trans. Inf. Theory 2018, 64, 6878–6893.
23. Soltani, M.; Rezki, Z. The Degraded Discrete-Time Poisson Wiretap Channel. arXiv 2021, arXiv:2101.03650.
24. Nam, S.H.; Lee, S.H. Secrecy Capacity of a Gaussian Wiretap Channel with One-bit ADCs is Always Positive. In Proceedings of the 2019 IEEE Information Theory Workshop (ITW), Visby, Sweden, 25–28 August 2019; pp. 1–5.
25. Dytso, A.; Egan, M.; Perlaza, S.M.; Poor, H.V.; Shitz, S.S. Optimal Inputs for Some Classes of Degraded Wiretap Channels. In Proceedings of the 2018 IEEE Information Theory Workshop (ITW), Guangzhou, China, 25–29 November 2018; pp. 1–5.
26. Karlin, S. Pólya type distributions, II. Ann. Math. Stat. 1957, 28, 281–308.
27. Dytso, A.; Al, M.; Poor, H.V.; Shamai Shitz, S. On the Capacity of the Peak Power Constrained Vector Gaussian Channel: An Estimation Theoretic Perspective. IEEE Trans. Inf. Theory 2019, 65, 3907–3921.
28. Favano, A.; Ferrari, M.; Magarini, M.; Barletta, L. The Capacity of the Amplitude-Constrained Vector Gaussian Channel. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021; pp. 426–431.
29. Berry, J.C. Minimax estimation of a bounded normal mean vector. J. Multivar. Anal. 1990, 35, 130–139.
30. Dytso, A.; Yagli, S.; Poor, H.V.; Shamai (Shitz), S. The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points. IEEE Trans. Inf. Theory 2020, 66, 2006–2022.
31. Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
32. Han, T.S.; Endo, H.; Sasaki, M. Reliability and Secrecy Functions of the Wiretap Channel Under Cost Constraint. IEEE Trans. Inf. Theory 2014, 60, 6819–6843.
33. Barletta, L.; Dytso, A. Amplitude-Constrained Gaussian Wiretap Channel: Computation of the Optimal Input Distribution. In Proceedings of the 2022 IEEE International Mediterranean Conference on Communications and Networking (MeditCom), Athens, Greece, 5–8 September 2022; pp. 106–111.
34. Rose, K. A mapping approach to rate-distortion computation and analysis. IEEE Trans. Inf. Theory 1994, 40, 1939–1952.
35. Blahut, R. Computation of channel capacity and rate-distortion functions. IEEE Trans. Inf. Theory 1972, 18, 460–473.
36. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
37. Yasui, K.; Suko, T.; Matsushima, T. An algorithm for computing the secrecy capacity of broadcast channels with confidential messages. In Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT), Nice, France, 24–29 June 2007; pp. 936–940.
38. Verdú, S. Mismatched estimation and relative entropy. IEEE Trans. Inf. Theory 2010, 56, 3712–3720.
39. Segura, J. Bounds for ratios of modified Bessel functions and associated Turán-type inequalities. J. Math. Anal. Appl. 2011, 374, 516–528.
40. Baricz, Á. Bounds for Turánians of modified Bessel functions. Expo. Math. 2015, 33, 223–251.
41. Tijdeman, R. On the number of zeros of general exponential polynomials. In Proceedings of the Indagationes Mathematicae; North-Holland: Amsterdam, The Netherlands, 1971; Volume 74, pp. 1–7.
42. Esposito, R. On a relation between detection and estimation in decision theory. Inf. Control 1968, 12, 116–120.
43. Dytso, A.; Poor, H.V.; Shitz, S.S. A general derivative identity for the conditional mean estimator in Gaussian noise and some applications. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 1183–1188.
44. Barletta, L.; Dytso, A. Scalar Gaussian Wiretap Channel: Bounds on the Support Size of the Secrecy-Capacity-Achieving Distribution. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW), Kanazawa, Japan, 17–21 October 2021; pp. 1–6.
45. Favano, A.; Barletta, L.; Dytso, A. On the Capacity Achieving Input of Amplitude Constrained Vector Gaussian Wiretap Channel. In Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT), Espoo, Finland, 26 June–1 July 2022; pp. 850–855.
46. Favano, A. The Capacity of Amplitude-Constrained Vector Gaussian Channels. Ph.D. Dissertation, Politecnico di Milano, Milan, Italy, 2022.
47. Lapidoth, A.; Moser, S.M. Capacity bounds via duality with applications to multiple-antenna systems on flat-fading channels. IEEE Trans. Inf. Theory 2003, 49, 2426–2467.

Figure 1. Asymptotic behavior of ${\bar{R}}_{n} (1, σ_{2}^{2}) / \sqrt{n}$ versus $n$ for $σ_{1}^{2} = 1$ and $σ_{2}^{2} = 1.001, 1.5, 10, 1000$. In red, we show $c (1, σ_{2}^{2})$ defined in (46).

Figure 2. Secrecy capacity in bits per channel use (bpcu) versus $R$ for $σ_{2}^{2} = 1.5, 10$ and $n = 2, 4$. The secrecy capacity under an average power constraint, $C_{G} (σ_{1}^{2}, σ_{2}^{2}, R^{2}, n)$, is defined in (69), while the secrecy capacity under a peak power constraint, $C_{s} (σ_{1}^{2}, σ_{2}^{2}, R, n)$, is defined in (58).

Figure 3. Evolution of the numerically estimated ${\hat{P}}_{∥ X^{★} ∥}$ versus $R$ for $σ_{1}^{2} = 1$, $σ_{2}^{2} = 1.5$, (a) $n = 2$, and (b) $n = 8$.

Figure 4. Evolution of the numerically estimated ${\hat{P}}_{∥ X^{★} ∥}$ versus $R$ for $σ_{1}^{2} = 1$, $σ_{2}^{2} = 10$, (a) $n = 2$, and (b) $n = 8$.

Figure 5. Output pdf of the legitimate user and of the eavesdropper for $σ_{1}^{2} = 1$, $σ_{2}^{2} = 10$, $n = 2$, (a,b) $R = 2.25$, and (c,d) $R = 7.5$. An animation showing the evolution of the output pdf as $R$ varies can be found in [1].

Table 1. Values of ${\bar{R}}_{n}^{MMSE} (1)$, ${\bar{R}}_{n} (1, σ_{2}^{2})$, and ${\bar{R}}_{n}^{ptp} (1)$.

n	MMSE	$σ_{2}^{2} = 1.001$	$σ_{2}^{2} = 1.5$	$σ_{2}^{2} = 10$	$σ_{2}^{2} = 1000$	ptp
1	1.057	1.057	1.161	1.518	1.664	1.666
2	1.535	1.535	1.687	2.221	2.450	2.454
3	1.908	1.909	2.098	2.768	3.061	3.065
4	2.223	2.224	2.444	3.229	3.575	3.580
5	2.501	2.501	2.750	3.634	4.026	4.031
6	2.751	2.752	3.025	3.999	4.432	4.438
7	2.981	2.982	3.278	4.334	4.805	4.811
8	3.195	3.196	3.513	4.646	5.151	5.158
9	3.395	3.396	3.733	4.937	5.475	5.483
10	3.585	3.586	3.941	5.213	5.781	5.789
11	3.765	3.766	4.139	5.475	6.072	6.080
12	3.936	3.938	4.328	5.725	6.350	6.359
13	4.101	4.102	4.509	5.964	6.616	6.625
14	4.259	4.260	4.683	6.195	6.872	6.881
15	4.412	4.413	4.851	6.417	7.119	7.128
16	4.560	4.561	5.013	6.632	7.357	7.367
17	4.702	4.704	5.170	6.839	7.588	7.598
18	4.841	4.842	5.323	7.041	7.812	7.823
19	4.976	4.977	5.471	7.238	8.030	8.041
20	5.107	5.109	5.616	7.429	8.242	8.254
21	5.235	5.237	5.756	7.615	8.449	8.461
22	5.360	5.362	5.894	7.797	8.651	8.663
23	5.483	5.484	6.028	7.974	8.848	8.860
24	5.602	5.603	6.159	8.148	9.041	9.054
25	5.719	5.720	6.288	8.318	9.230	9.243
26	5.834	5.835	6.414	8.485	9.416	9.428
27	5.946	5.948	6.538	8.649	9.597	9.610
28	6.056	6.058	6.659	8.809	9.775	9.789
29	6.165	6.166	6.778	8.967	9.951	9.964
30	6.271	6.273	6.895	9.122	10.123	10.136
31	6.376	6.378	7.010	9.274	10.292	10.306
32	6.479	6.481	7.124	9.424	10.458	10.472
33	6.580	6.582	7.235	9.571	10.622	10.636
34	6.680	6.682	7.345	9.717	10.783	10.798
35	6.779	6.780	7.453	9.860	10.942	10.957
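
As a rough illustration of the normalization by $\sqrt{n}$ used in Figure 1, the tabulated radii can be divided by $\sqrt{n}$ directly. The sketch below copies a handful of rows from Table 1 (perfect-square values of $n$, chosen only for convenience) and keeps three of the columns; the names `TABLE_ROWS`, `normalized`, and `ratios` are illustrative and do not come from the paper.

```python
import math

# A few rows copied from Table 1 (sigma_1^2 = 1).
# Tuple order: (MMSE column, sigma_2^2 = 1.5 column, ptp column).
TABLE_ROWS = {
    1: (1.057, 1.161, 1.666),
    4: (2.223, 2.444, 3.580),
    9: (3.395, 3.733, 5.483),
    16: (4.560, 5.013, 7.367),
    25: (5.719, 6.288, 9.243),
}


def normalized(rows):
    """Divide every tabulated radius by sqrt(n)."""
    return {n: tuple(v / math.sqrt(n) for v in vals) for n, vals in rows.items()}


ratios = normalized(TABLE_ROWS)
for n in sorted(ratios):
    mmse, wiretap, ptp = ratios[n]
    print(f"n={n:2d}  MMSE={mmse:.4f}  sigma_2^2=1.5: {wiretap:.4f}  ptp={ptp:.4f}")
```

For the rows selected here, each normalized column increases slowly with $n$, and the $σ_{2}^{2} = 1.5$ value stays between the MMSE and ptp values. This is only a check on the tabulated numbers, not a computation of the limit $c (1, σ_{2}^{2})$.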

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
