Abstract
This paper studies the secrecy capacity of an n-dimensional Gaussian wiretap channel under a peak power constraint. It determines the largest peak power constraint $\bar{R}_n$ such that an input distribution uniformly distributed on a single sphere is optimal; this regime is termed the low-amplitude regime. The asymptotic value of $\bar{R}_n$ as n goes to infinity is completely characterized as a function of the noise variances at both receivers. Moreover, the secrecy capacity is characterized in a form amenable to computation. Several numerical examples are provided, including the secrecy-capacity-achieving distribution beyond the low-amplitude regime. Furthermore, for the scalar case ($n = 1$), we show that the secrecy-capacity-achieving input distribution is discrete with finitely many points, at most of the order of $R^2/\sigma_1^2$, where $\sigma_1^2$ is the variance of the Gaussian noise over the legitimate channel.
1. Introduction
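Consider the vector Gaussian wiretap channel with outputs
$$\mathbf{Y}_1 = \mathbf{X} + \mathbf{N}_1, \tag{1}$$
$$\mathbf{Y}_2 = \mathbf{X} + \mathbf{N}_2, \tag{2}$$
where $\mathbf{X} \in \mathbb{R}^n$, $\mathbf{N}_1 \sim \mathcal{N}(\mathbf{0}_n, \sigma_1^2 \mathbf{I}_n)$, and $\mathbf{N}_2 \sim \mathcal{N}(\mathbf{0}_n, \sigma_2^2 \mathbf{I}_n)$, with $\mathbf{X}$, $\mathbf{N}_1$, and $\mathbf{N}_2$ being mutually independent. The output $\mathbf{Y}_1$ is observed by the legitimate receiver, whereas the output $\mathbf{Y}_2$ is observed by the malicious receiver. In this work, we are interested in the scenario where the input $\mathbf{X}$ is limited by a peak power constraint or amplitude constraint, and assume that $\mathbf{X} \in \mathcal{B}_0(R) = \{\mathbf{x} : \|\mathbf{x}\| \le R\}$, i.e., $\mathcal{B}_0(R)$ is an n-ball centered at the origin and of radius $R$. For this setting, the secrecy capacity is given by
$$C_s(\sigma_1^2, \sigma_2^2, R, n) = \max_{\mathbf{X} \in \mathcal{B}_0(R)} I(\mathbf{X}; \mathbf{Y}_1) - I(\mathbf{X}; \mathbf{Y}_2) \tag{3}$$
$$= \max_{\mathbf{X} \in \mathcal{B}_0(R)} I(\mathbf{X}; \mathbf{Y}_1 \mid \mathbf{Y}_2), \tag{4}$$
where the last expression holds due to the (stochastically) degraded nature of the channel. It can be shown that for $\sigma_1^2 \ge \sigma_2^2$ the secrecy capacity is equal to zero. Therefore, in the remainder, we assume that $\sigma_1^2 < \sigma_2^2$.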
We are interested in studying the input distribution $P_{\mathbf{X}^\star}$ that maximizes (3) in the low (but not vanishing) amplitude regime. Since closed-form expressions for secrecy capacity are rare, we derive the secrecy capacity in an integral form that is easy to evaluate. For the scalar case ($n = 1$), we establish an upper bound on the number of mass points of $P_{X^\star}$, valid for any amplitude regime. We also argue in Section 2.3 that the solution to the secrecy capacity problem can shed light on other problems seemingly unrelated to security. The paper also provides a number of numerical simulations of $P_{\mathbf{X}^\star}$ and $C_s$, the data for which are made available at [1].
1.1. Literature Review
The wiretap channel was introduced by Wyner in [2], who also established the secrecy capacity of the degraded wiretap channel. The results of [2] were extended to the Gaussian wiretap channel in [3]. The wiretap channel plays a central role in network information theory; the interested reader is referred to [4,5,6,7,8] and references therein for a detailed treatment of the topic. Furthermore, for an in-depth discussion on the wiretap fading channel, refer to [9,10,11,12].
In [3], it was shown that the secrecy-capacity-achieving input distribution of the Gaussian wiretap channel, under an average power constraint, is Gaussian. In [13], the authors investigated the Gaussian wiretap channel consisting of two antennas, both at the transmitter and receiver sides, and of a single antenna for the eavesdropper. The secrecy capacity of the MIMO wiretap channel was characterized in [14,15], where the Gaussian input was shown to be optimal. An elegant proof, using the I-MMSE relationship [16], of the optimality of Gaussian input, is given in [17]. Moreover, an alternative approach in the characterization of the secrecy capacity of a MIMO wiretap channel was proposed in [18]. In [19,20], the authors discuss the optimal signaling for secrecy rate maximization under average power constraints.
The secrecy capacity of the Gaussian wiretap channel under the peak power constraint has received far less attention. The secrecy capacity of the scalar Gaussian wiretap channel with an amplitude and power constraint was considered in [21], where the authors showed that the capacity-achieving input distribution is discrete with finitely many support points.
The work of [21] was extended to noise-dependent channels by Soltani and Rezki in [22]. For further studies on the properties of the secrecy-capacity-achieving input distribution for a class of degraded wiretap channels, refer to [23,24,25].
The secrecy capacity for the vector wiretap channel with a peak power constraint was considered in [25], where it was shown that the optimal input distribution is concentrated on finitely many concentric shells.
1.2. Contributions and Paper Outline
In Section 2, we introduce the mathematical tools, assumptions, and definitions used throughout the paper. Specifically, in Section 2.1, we introduce the oscillation theorem. In Section 2.2, we give a definition of low-amplitude regimes. Moreover, in Section 2.3, we show how the wiretap channel can be seen as a generalization of point-to-point channels and the evaluation of the largest minimum mean square error (MMSE), both under the assumption of amplitude-constrained input. In Section 2.4, we provide a definition of the Karush–Kuhn–Tucker (KKT) conditions for the wiretap channel.
In Section 3, we detail our main results. Theorem 2 provides a sufficient condition for the optimality of a single hypersphere. Theorem 3 and Theorem 4 give the conditions under which we can fully characterize the behavior of $\bar{R}_n$, that is, the radius below which we are in the low-amplitude regime, i.e., the optimal input distribution is composed of a single shell. Furthermore, Theorem 5 gives an implicit and an explicit upper bound on the number of mass points of the secrecy-capacity-achieving input distribution when $n = 1$.
In Section 4, we derive the secrecy capacity expression for the low-amplitude regime in Theorem 6. We also investigate its behavior when the number of antennas n goes to infinity.
Section 5 extends the investigation of the secrecy capacity beyond the low-amplitude regime. We numerically estimate both the optimal input pmf and the resulting capacity via an algorithmic procedure based on the KKT conditions introduced in Lemma 2.
Sections 6, 7, 8, and 9 provide the proofs of Theorems 3, 4, 5, and 6, respectively. Finally, Section 10 concludes the paper.
1.3. Notation
We use bold letters for vectors () and uppercase letters for random variables (X). We denote by the Euclidean norm of the vector . Given a vector and a scalar a, with a little abuse of notation, we denote by , where is the first vector in the standard basis of the Euclidean vector space . Given a random variable X, its probability density function (pdf), pmf, and cumulative distribution function are denoted by , , and , respectively. The support set of is denoted and defined as
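We denote by $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ a multivariate Gaussian distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. The pdf of a Gaussian random variable with zero mean and variance $\sigma^2$ is denoted by $\phi_\sigma(\cdot)$. We denote by $\chi^2_n(\lambda)$ the noncentral chi-square distribution with n degrees of freedom and with noncentrality parameter $\lambda$. We represent the vector of zeros by $\mathbf{0}_n$ and the identity matrix by $\mathbf{I}_n$. Furthermore, we represent by $D(\cdot \| \cdot)$ the relative entropy. The minimum mean squared error is denoted by
$$\mathrm{mmse}(\mathbf{X} \mid \mathbf{Y}) = \mathbb{E}\left[\|\mathbf{X} - \mathbb{E}[\mathbf{X} \mid \mathbf{Y}]\|^2\right]. \tag{6}$$
The modified Bessel function of the first kind of order $\nu \ge 0$ is denoted by $I_\nu(x)$, $x \ge 0$. The following ratio of the Bessel functions is commonly used in this work:
$$h_\nu(x) = \frac{I_\nu(x)}{I_{\nu-1}(x)}. \tag{7}$$
Finally, the number of zeros (counted in accordance with their multiplicities) of a function $f$ on the interval $\mathcal{I}$ is denoted by $\mathrm{N}(\mathcal{I}, f)$. Similarly, if $f$ is a function on the complex domain, $\mathrm{N}(\mathcal{D}, f)$ denotes the number of its zeros within the region $\mathcal{D}$.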
2. Preliminaries
2.1. Oscillation Theorem
In this work, we often need to upper bound the number of oscillations of a function, i.e., its number of sign changes. This is useful, for example, to bound the number of zeros of a function or the number of roots of an equation. To be more precise, let us define the number of sign changes as follows.
Definition 1
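(Sign Changes of a Function). The number of sign changes of a function $\xi: \Omega \to \mathbb{R}$ is given by
$$\mathscr{S}(\xi) = \sup_{m \in \mathbb{N}} \left\{ \sup_{y_1 < \cdots < y_m \in \Omega} \mathscr{N}\{\xi(y_i)\}_{i=1}^m \right\}, \tag{8}$$
where $\mathscr{N}\{\xi(y_i)\}_{i=1}^m$ is the number of sign changes of the sequence $\{\xi(y_i)\}_{i=1}^m$.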
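As a small illustration of Definition 1, the following sketch counts the sign changes of a finite sequence and uses a uniform grid to obtain a lower estimate of the number of sign changes of a function. The function names and the grid size are illustrative choices made here, and ignoring exact zeros when counting is an assumed convention.

```python
from typing import Callable, Sequence


def sequence_sign_changes(values: Sequence[float]) -> int:
    """Count the sign changes of a finite sequence, ignoring exact zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def grid_sign_changes(f: Callable[[float], float], lo: float, hi: float,
                      m: int = 2001) -> int:
    """Lower estimate of the sign changes of f on [lo, hi] from an m-point grid.

    The supremum in the definition runs over all finite point sets, so a
    single grid can only miss sign changes, never invent them.
    """
    step = (hi - lo) / (m - 1)
    return sequence_sign_changes([f(lo + i * step) for i in range(m)])
```

For instance, `grid_sign_changes(lambda x: x**3 - x, -2.0, 2.0)` recovers the three sign changes of the cubic at its roots -1, 0, and 1.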
Definition 2
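(Totally Positive Kernel). A function $f: \mathcal{I}_1 \times \mathcal{I}_2 \to \mathbb{R}$ is said to be a totally positive kernel of order n if $\det\left([f(x_i, y_j)]_{i,j=1}^m\right) > 0$ for all $1 \le m \le n$, for all $x_1 < \cdots < x_m \in \mathcal{I}_1$, and $y_1 < \cdots < y_m \in \mathcal{I}_2$. If f is a totally positive kernel of order n for all $n \in \mathbb{N}$, then f is a strictly totally positive kernel.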
In [26], Karlin noticed that some integral transformations have a variation-diminishing property, which is described in the following theorem.
Theorem 1
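(Oscillation Theorem). Given domains $\mathcal{I}_1$ and $\mathcal{I}_2$, let $p: \mathcal{I}_1 \times \mathcal{I}_2 \to \mathbb{R}$ be a strictly totally positive kernel. For an arbitrary y, suppose $p(\cdot, y): \mathcal{I}_1 \to \mathbb{R}$ is an n-times differentiable function. Assume that μ is a measure on $\mathcal{I}_2$, and let $\xi: \mathcal{I}_2 \to \mathbb{R}$ be a function with $\mathscr{S}(\xi) = n$. For $x \in \mathcal{I}_1$, define
$$\Xi(x) = \int \xi(y)\, p(x, y)\, \mathrm{d}\mu(y). \tag{9}$$
If $\Xi: \mathcal{I}_1 \to \mathbb{R}$ is an n-times differentiable function, then either $\mathrm{N}(\mathcal{I}_1, \Xi) \le n$, or $\Xi \equiv 0$.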
The above theorem says that, unless the output of the integral transformation is identically zero, its number of zeros is at most the number of sign changes of the function at the input of the integral transformation.
2.2. Low-Amplitude Regime
In this work, a low-amplitude regime is defined as follows.
Definition 3.
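The quantity $\bar{R}_n(\sigma_1^2, \sigma_2^2)$ represents the largest radius $R$ for which $P_{\mathbf{X}_R}$, the uniform distribution on the sphere of radius $R$, is secrecy-capacity-achieving.
One of the main objectives of this work is to characterize $\bar{R}_n(\sigma_1^2, \sigma_2^2)$.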
2.3. Connections to Other Optimization Problems
The distribution occurs in a variety of statistical and information-theoretic applications. For example, consider the following two optimization problems:
where . The first problem seeks to characterize the capacity of the point-to-point channel under an amplitude constraint, and the second problem seeks to find the largest minimum mean squared error under the assumption that the signal has bounded amplitude; the interested reader is referred to [27,28,29] for a detailed background on both problems.
Similarly to the wiretap channel, we can define the low-amplitude regime for both problems as the largest $R$ such that $P_{\mathbf{X}_R}$ is optimal and denote these by $\bar{R}_n^{\mathrm{ptp}}$ and $\bar{R}_n^{\mathrm{MMSE}}$. We now argue that both $\bar{R}_n^{\mathrm{ptp}}$ and $\bar{R}_n^{\mathrm{MMSE}}$ can be seen as special cases of the wiretap solution. Hence, the wiretap channel provides an interesting unification and generalization of these two problems.
First, note that the point-to-point solution can be recovered from the wiretap by simply specializing the wiretap channel to the point-to-point channel, that is,
Second, to see that the MMSE solution can be recovered from the wiretap, recall that by the I-MMSE relationship [16] we have that
where is standard Gaussian. Now, note that if we choose , then by the mean value theorem we arrive at
where . Consequently, for a small enough ,
2.4. KKT Conditions
Let us define the secrecy density for the vector Gaussian wiretap channel as
where $D(\cdot \| \cdot)$ is the relative entropy.
For the scalar case ($n = 1$), the KKT conditions are necessary and sufficient to ensure that $P_{X^\star}$ is capacity-achieving [21].
Lemma 1.
Proof.
The convexity of the optimization problem is also guaranteed for the vector wiretap model in (1) with . Then, the results of Lemma 1 can be extended to the vector case as follows.
Lemma 2.
Proof.
This is a straightforward vector extension of Lemma 1. □
Thanks to the spherical symmetry of the additive noise distributions and of , the secrecy density can be expressed as a function of only. Therefore, we denote the secrecy density in spherical coordinates by , and give a rigorous definition in (A9).
3. Main Results
3.1. A New Sufficient Condition on the Optimality of $P_{\mathbf{X}_R}$
Our first main result provides a sufficient condition for the optimality of $P_{\mathbf{X}_R}$.
Theorem 2.
If
then $P_{\mathbf{X}_R}$ is secrecy-capacity-achieving.
Proof.
Let us consider the equivalent definition of the secrecy density in spherical coordinates (A9). Note that if the derivative of makes at most one sign change, from negative to positive, then the maximum of occurs at either or .
From Lemma A1 in Appendix B, the derivative of the secrecy density is given by
where is a noncentral chi-square random variable with degrees of freedom and noncentrality parameter , and
where . A calculation related to (33) was erroneously performed in [27]. However, this error does not change the results of [27] as only the sign of the derivative is important and not the value itself. Note that and that for a sufficiently large ; in fact, we have
where (36) follows from for ; (37) follows by noticing that ; and finally, (38) holds by .
Then, to show that the secrecy density is maximized at $\|\mathbf{x}\| = R$, we need to prove that its derivative changes sign at most once. To that end, we need Karlin’s oscillation theorem presented in Section 2.1. By using (33), the fact that the pdf of a chi-square is a strictly totally positive kernel [26], and Theorem 1, the number of sign changes of the derivative is upper-bounded by the number of sign changes of
for . Note that
where the inequality in (40) follows from for , and (41) follows from for and . We conclude by noting that (41) is nonnegative, hence has no sign change, for
for all , thus guaranteeing that is secrecy-capacity-achieving. □
Remark 1.
As a consequence of the proof of Theorem 2, for any and , if has at most one sign change, then is secrecy-capacity-achieving if, and only if, for all
Because of the difficulty in evaluating analytical properties of (39), proving that $G_{\sigma_1,\sigma_2,R,n}$ has at most one sign change does not seem easy. However, in Appendix A, we show via extensive numerical evaluations that $G_{\sigma_1,\sigma_2,R,n}$ changes sign at most once for all parameter values that we tried.
3.2. Characterizing the Low-Amplitude Regime
Let us characterize the low-amplitude regime as follows.
Theorem 3.
Consider a function
where . If of (39) has at most one sign change, the input is secrecy-capacity-achieving if, and only if, , where is given as the solution of
Remark 2.
Note that (45) always has a solution. To see this, observe that and . Moreover, the solution is unique because monotonically increases for .
The solution to (45) needs to be found numerically. To avoid any loss of accuracy in the numerical evaluation of the Bessel function ratio for large values of x, we used the exponential scaling provided in the MATLAB implementation of the modified Bessel function. Since each evaluation of the function in (45) is straightforward and fast, we opted for a binary search algorithm.
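The numerical recipe described above can be sketched as follows. This is only an illustration under stated assumptions: the Bessel ratio is taken to be $I_\nu(x)/I_{\nu-1}(x)$, it is computed here through the standard three-term recurrence of the modified Bessel functions rather than through MATLAB's scaled Bessel routine, and the binary search is written for a generic increasing function because the exact expression in (45) is not reproduced in this sketch. The names `bessel_ratio` and `bisect_increasing` are hypothetical.

```python
import math
from typing import Callable


def bessel_ratio(nu: float, x: float, extra_terms: int = 200) -> float:
    """Return I_nu(x) / I_{nu-1}(x) for x > 0 without forming either Bessel function.

    The recurrence I_{k-1}(x) - I_{k+1}(x) = (2k/x) I_k(x) gives
    r_k = 1 / (2k/x + r_{k+1}) for r_k = I_k(x) / I_{k-1}(x).
    Iterating downward from a large order is numerically stable and
    never overflows, even for large x.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    start = extra_terms + int(math.ceil(x))  # start well above the argument
    r = 0.0  # crude initial guess at the highest order; its error dies out
    for j in range(start, -1, -1):
        r = 1.0 / (2.0 * (nu + j) / x + r)
    return r


def bisect_increasing(f: Callable[[float], float], target: float,
                      lo: float, hi: float, tol: float = 1e-12) -> float:
    """Solve f(x) = target by binary search, assuming f is increasing on [lo, hi]."""
    if not (f(lo) <= target <= f(hi)):
        raise ValueError("target is not bracketed by f(lo) and f(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A convenient sanity check is the order $\nu = 1/2$, for which the ratio reduces to $\tanh(x)$; for example, `bisect_increasing(lambda t: bessel_ratio(0.5, t), 0.5, 1e-6, 10.0)` should agree with `math.atanh(0.5)`.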
In Table 1, we show the values of for some values of and n. Moreover, we report the values of and from [27] in the first and the last row, respectively. As predicted by (12), we can appreciate the close match of the row with the one of . Similarly, the agreement between the row and the row is justified by (16).
Table 1.
Values of , , and .
3.3. Large n Asymptotics
We now use the result in Theorem 3 to characterize the asymptotic behavior of $\bar{R}_n$. In particular, it is shown that $\bar{R}_n$ increases as $\sqrt{n}$.
Theorem 4.
For
where is the solution of
Proof.
See Section 7. □
In Figure 1, for and , we show the behavior of and how its asymptotic converges to .
Figure 1.
Asymptotic behavior of versus n for and . In red, we show defined in (46).
3.4. Scalar Case
For the scalar case, the optimal input distribution $P_{X^\star}$ is discrete. In this regime, we provide an implicit and an explicit upper bound on the number of support points of the optimal input probability mass function (pmf) $P_{X^\star}$.
Theorem 5.
Let and be the secrecy-capacity-achieving output distributions at the legitimate and malicious receivers, respectively, and let
with . For , an implicit upper bound on the number of support points of is
where
Moreover, an explicit upper bound on the number of support points of is obtained by using
where .
The upper bounds in Theorem 5 are generalizations of the upper bounds on the number of points presented in [30] in the context of a point-to-point AWGN channel with an amplitude constraint. Indeed, if we let $\sigma_2 \to \infty$, while keeping $\sigma_1$ and $R$ fixed, then the wiretap channel reduces to the AWGN point-to-point channel.
To find a lower bound on the number of mass points, a possible approach consists of the following steps:
where the above uses the nonnegativity of the entropy and the fact that entropy is maximized by a uniform distribution. Furthermore, by using a suboptimal uniform (continuous) distribution on as an input and the entropy power inequality, the secrecy capacity is lower-bounded by
Combining the bounds in (55) and (56), we arrive at the following lower bound on the number of points:
At this point, one needs to determine the behavior of . A trivial lower bound on can be found by lower-bounding by zero. However, this lower bound on does not grow with , while the upper bound does increase with . A possible way of establishing a lower bound that increases in is by showing that . However, because not much is known about the structure of the optimal input distribution , it is not immediately evident how one can establish such an approximation or whether it is valid.
4. Secrecy Capacity Expression in the Low-Amplitude Regime
The result in Theorem 3 can also be used to establish the secrecy capacity for all $R \le \bar{R}_n(\sigma_1^2, \sigma_2^2)$, as is done next.
Theorem 6.
If $G_{\sigma_1,\sigma_2,R,n}$ of (39) has at most one sign change and if $R \le \bar{R}_n(\sigma_1^2, \sigma_2^2)$, then
Proof.
See Section 9. □
Large n Asymptotics
It is important to note that, since $\bar{R}_n$ grows as $\sqrt{n}$ according to Theorem 4, when we keep $R$ constant and increase the number of antennas to infinity, the low-amplitude regime becomes the only regime. The next theorem characterizes the secrecy capacity in this ‘massive-MIMO’ regime (i.e., where $R$ is fixed and n goes to infinity).
Theorem 7.
Consider the expression in (58) and fix and , then
Proof.
See Appendix C. □
Remark 3.
The result in Theorem 7 is reminiscent of the capacity in the wideband regime [31, Ch. 9], where the capacity increases linearly in the signal-to-noise ratio. Similarly, Theorem 7 shows that in the large antenna regime, the secrecy capacity grows linearly with the difference in the signal-to-noise ratio between the legitimate user and the eavesdropper.
In Theorem 7, $R$ was held fixed. It is also interesting to study the case when $R$ is a function of n. Specifically, it is interesting to study the case when $R = c\sqrt{n}$ for some coefficient c.
Theorem 8.
Suppose that $R = c\sqrt{n}$. Then,
Proof.
See Appendix D. □
Notice that (60) is equivalent to the secrecy capacity of a vector Gaussian wiretap channel subject to an average power constraint. Gaussian wiretap channels under average power constraints have been extensively investigated [3,32] and, for an average power constraint , the resulting secrecy capacity is given by [3]
Thus, the result in (60) can be restated as
In other words, for the regime considered in Theorem 8, for a large enough n the secrecy capacity under the amplitude constraint behaves as the secrecy capacity under the average power constraint .
5. Beyond the Low-Amplitude Regime
To evaluate the secrecy capacity and find the optimal distribution beyond the low-amplitude regime, we rely on numerical estimations. We remark that, as pointed out in [25], the secrecy-capacity-achieving distribution is isotropic and consists of finitely many concentric shells. Keeping this in mind, we can find the optimal input distribution by optimizing only over the distribution of the input norm $\|\mathbf{X}\| \le R$.
5.1. Numerical Algorithm
In the case of scalar Gaussian wiretap channels, the secrecy capacity and the optimal input pmf can be estimated via the algorithm described in [33], i.e., a numerical procedure that takes inspiration from the deterministic annealing algorithm sketched in [34]. Let us denote by $\hat{C}_s$ the numerical estimate of the secrecy capacity, and by $\hat{P}_{\|\mathbf{X}\|}$ the estimate of the optimal pmf on the input norm. To numerically evaluate $\hat{C}_s$ and $\hat{P}_{\|\mathbf{X}\|}$, we extend the algorithm in [33] to the vector case. Our extension is defined in Algorithm 1. The input parameters of the main function are the noise variances $\sigma_1^2$ and $\sigma_2^2$, the radius $R$, the vectors $\boldsymbol{\rho}$ and $\mathbf{p}$ containing, respectively, the mass point positions and probabilities of a tentative input pmf, the number of iterations of the while loop, and, finally, a tolerance $\varepsilon$ that sets the precision of the secrecy capacity estimate.
Algorithm 1. Secrecy capacity and optimal input pmf estimation.
At its core, the numerical procedure iteratively refines its estimate of the optimal pmf by running a gradient ascent algorithm to update the vector of mass point positions and a variant of the Blahut–Arimoto algorithm [35] to update the vector of probabilities.
The Gradient Ascent procedure uses the secrecy information as the objective function and stops either when has reached convergence or at a given maximum number of iterations. Let us denote by the secrecy information as a function of the input norm. Notice that, given a tentative pmf of mass points , probabilities , and , we have
where is the secrecy density, with respect to the input norm, defined in (A9) and where and are, respectively, the ith element of and . Then, the Gradient Ascent updates are given by
where the partial derivatives are defined in Appendix E and the multiplying coefficient is the step size of the gradient ascent. We remark that, to ensure convergence to a local maximum, we use gradient ascent with a backtracking line search [36]. By suitably adjusting the step size at each iteration, the backtracking line search guarantees that each update of the mass point positions yields a secrecy information that is no smaller than the one of the previous update.
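The backtracking idea can be sketched on a generic differentiable objective as follows. This is a minimal illustration and not the procedure of Algorithm 1: the objective and its gradient are passed in as callables (in the setting above they would be the secrecy information and the partial derivatives of Appendix E), and the Armijo-type acceptance rule, the projection of the positions onto $[0, R]$, and all parameter names are assumptions made for this sketch.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def backtracking_ascent(objective: Callable[[Vector], float],
                        gradient: Callable[[Vector], Vector],
                        x0: Vector, upper: float,
                        step0: float = 1.0, shrink: float = 0.5,
                        armijo: float = 1e-4, max_iter: int = 500,
                        tol: float = 1e-12) -> Tuple[Vector, List[float]]:
    """Projected gradient ascent on [0, upper]^k with a backtracking step size.

    Returns the final point and the objective values of all accepted
    updates, which are nondecreasing by construction.
    """
    x = list(x0)
    history = [objective(x)]
    for _ in range(max_iter):
        g = gradient(x)
        step, accepted = step0, False
        while step > 1e-16:
            # Tentative update, clipped to the feasible interval (assumption).
            cand = [min(max(xi + step * gi, 0.0), upper) for xi, gi in zip(x, g)]
            val = objective(cand)
            gain = sum(gi * (ci - xi) for gi, ci, xi in zip(g, cand, x))
            if val >= history[-1] + armijo * gain:  # sufficient increase
                accepted = True
                break
            step *= shrink  # otherwise shrink the step and retry
        if not accepted:
            break
        improvement = val - history[-1]
        x = cand
        history.append(val)
        if improvement <= tol:
            break
    return x, history
```

On the toy concave objective $-(x_0 - 1)^2 - (x_1 - 2)^2$ with box $[0, 3]^2$, the first full step overshoots the maximizer, the second step is halved by the line search, and the iterates settle at $(1, 2)$.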
The Blahut–Arimoto function runs a variant of the Blahut–Arimoto algorithm. For the scalar case, an example of the Blahut–Arimoto optimization, applied to wiretap channels, is given in [37]. Similar results can be extended to the case of vector wiretap channels. Given the current probabilities ’s, the updates are obtained by evaluating
and finally, by normalizing each and assigning them to the entries of the vector
Similarly to Gradient Ascent, the Blahut–Arimoto procedure stops either when the probabilities have converged or after a set number of updates.
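For readers unfamiliar with this type of update, the sketch below shows the classical Blahut–Arimoto iteration for a discrete memoryless point-to-point channel: each probability is multiplied by the exponential of a divergence and the result is normalized. It is meant only to illustrate the multiply-and-normalize structure; the wiretap variant used in this section is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import List, Tuple


def divergences(W: List[List[float]], p: List[float]) -> List[float]:
    """D(W[i] || q) in nats for each input letter i, where q is the output pmf."""
    n_out = len(W[0])
    q = [sum(p[i] * W[i][j] for i in range(len(p))) for j in range(n_out)]
    return [sum(w * math.log(w / q[j]) for j, w in enumerate(row) if w > 0.0)
            for row in W]


def blahut_arimoto(W: List[List[float]], p0: List[float],
                   max_iter: int = 1000, tol: float = 1e-13
                   ) -> Tuple[List[float], float]:
    """Classical Blahut-Arimoto; W[i][j] = P(Y = j | X = i).

    Returns the input pmf and the mutual information (nats) it achieves.
    """
    p = list(p0)
    for _ in range(max_iter):
        d = divergences(W, p)
        weights = [pi * math.exp(di) for pi, di in zip(p, d)]
        total = sum(weights)
        new_p = [w / total for w in weights]  # normalization step
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:  # probabilities have stabilized
            break
    info = sum(pi * di for pi, di in zip(p, divergences(W, p)))
    return p, info
```

For a binary symmetric channel with crossover probability 0.1 and a skewed starting pmf, the iteration converges to the uniform input and to the capacity $\log 2 - H(0.1)$ nats.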
Since the joint optimization of and is not numerically feasible, we need to reiterate both the Blahut–Arimoto and the Gradient Ascent procedures a given number of times, namely . The parameter is chosen empirically in such a way that and become fairly stable, and therefore we can expect to have reached joint convergence for both of them.
Then, the KKT Validation procedure ensures that the estimated pmf and secrecy capacity are indeed close to the optimal ones. We check the optimality of the tentative pmf by verifying whether the KKT conditions in Lemma 2 are satisfied. Since the algorithm has to verify the KKT conditions numerically, i.e., with finite precision, we find it more convenient to check the negated version of (28), where a tolerance parameter $\varepsilon$ is introduced that trades off accuracy against computational burden. Specifically, the tentative pmf is not an optimal input pmf if any of the following conditions is satisfied:
Note that in (67), in place of the secrecy capacity, which is unknown, we used the secrecy information given by the tentative pmf. Condition (67a) is derived by negating (28a): there exists a mass point in the support of the tentative pmf such that the secrecy density at that point is more than $\varepsilon$ away from the secrecy information. Condition (67b) is the negated version of (28b): there exists a point in $[0, R]$ such that the secrecy density at that point is at least $\varepsilon$ larger than the secrecy information. With some abuse of notation, we refer to (67) as the $\varepsilon$-KKT conditions. If the tentative pmf does not pass the check of the $\varepsilon$-KKT conditions, then the algorithm checks whether a new point has to be added to the pmf.
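The two negated conditions can be checked numerically along the following lines. This is a hedged sketch based only on the verbal description above: the secrecy density is passed in as a callable of the input norm, the search over $[0, R]$ is replaced by a uniform grid (an assumption), and the function name and signature are hypothetical.

```python
from typing import Callable, Sequence


def violates_eps_kkt(density: Callable[[float], float],
                     support: Sequence[float],
                     secrecy_info: float,
                     radius: float,
                     eps: float,
                     grid_size: int = 1001) -> bool:
    """Return True if the tentative pmf fails the eps-KKT check.

    density      : secrecy density as a function of the input norm,
                   evaluated for the tentative pmf
    support      : mass point positions of the tentative pmf
    secrecy_info : secrecy information achieved by the tentative pmf
    """
    # Negated equality condition: some support point has a density that is
    # more than eps away from the secrecy information.
    if any(abs(density(r) - secrecy_info) > eps for r in support):
        return True
    # Negated inequality condition: some point of [0, radius] (here, of a
    # uniform grid) has a density exceeding the secrecy information by eps.
    step = radius / (grid_size - 1)
    return any(density(i * step) > secrecy_info + eps for i in range(grid_size))
```

A smaller tolerance makes the check stricter and typically requires a finer grid, which is the accuracy versus computation trade-off mentioned above.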
The Add Point procedure evaluates the position of the new mass point
The new point is appended to the vector of mass point positions, and the probabilities are reset so that all mass points are equiprobable.
The whole procedure is repeated until KKT Validation gives a positive outcome, and at that point the algorithm returns the tentative pmf as the optimal pmf estimate and the associated secrecy information as the secrecy capacity estimate.
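The overall control flow described in this subsection can be summarized by the following skeleton. It follows only the prose description and is not a transcription of Algorithm 1: the refinement step (Gradient Ascent followed by Blahut–Arimoto), the KKT check, and the rule that places the new point are supplied as callables, and the cap `max_points` is a safeguard added for this sketch.

```python
from typing import Callable, List, Tuple

Pmf = Tuple[List[float], List[float]]  # (positions, probabilities)


def estimate_pmf(positions: List[float], probs: List[float],
                 refine: Callable[[List[float], List[float]], Pmf],
                 passes_kkt: Callable[[List[float], List[float]], bool],
                 new_point: Callable[[List[float], List[float]], float],
                 inner_rounds: int, max_points: int = 50) -> Pmf:
    """Alternate refinement and validation, adding a mass point on failure."""
    rho, p = list(positions), list(probs)
    while True:
        # Repeat the position and probability updates a fixed number of times.
        for _ in range(inner_rounds):
            rho, p = refine(rho, p)
        # Stop when the validation succeeds (or the safeguard is reached).
        if passes_kkt(rho, p) or len(rho) >= max_points:
            return rho, p
        # Otherwise append a new mass point and reset to equal probabilities.
        rho = rho + [new_point(rho, p)]
        p = [1.0 / len(rho)] * len(rho)
```

In a full implementation, `refine` would wrap the two update procedures, `passes_kkt` would wrap the $\varepsilon$-KKT check, and `new_point` would return the position produced by the Add Point procedure.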
Remark 4.
In this work, we focus on the secrecy capacity and on the secrecy-capacity-achieving input distribution. However, it is possible to study other points of the rate-equivocation region of the degraded wiretap Gaussian channel by suitably changing the KKT conditions, as reported in [21], Equations (33) and (34). With the due modifications, the proposed optimization algorithm can find the optimal input distribution for any point of the rate-equivocation region.
5.2. Numerical Results
In Figure 2, we show with black dots the numerical estimate versus , evaluated via Algorithm 1, for , , , and tolerance . For the same values of , , and n we also show, with the red lines, the analytical low-amplitude regime secrecy capacity versus from Theorem 6. In addition, we show with blue dotted lines the secrecy capacity under the average power constraint :
where the inequality follows by noting that the average power constraint $\mathbb{E}[\|\mathbf{X}\|^2] \le R^2$ is weaker than the amplitude constraint $\|\mathbf{X}\| \le R$. Finally, the dashed vertical lines show $\bar{R}_n$, i.e., the upper limit of the low-amplitude regime, for the considered values of $\sigma_1^2$, $\sigma_2^2$, and n.
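The average-power benchmark can be evaluated in a few lines. The sketch assumes the standard closed form for the Gaussian wiretap channel of [3] applied to n independent dimensions with total power $R^2$, namely $\frac{n}{2}\log\frac{1 + R^2/(n\sigma_1^2)}{1 + R^2/(n\sigma_2^2)}$ in nats; the function name is hypothetical.

```python
import math


def gaussian_secrecy_capacity(var1: float, var2: float,
                              radius: float, n: int) -> float:
    """Secrecy capacity (nats) under the average power constraint E||X||^2 <= R^2.

    var1, var2 : noise variances at the legitimate receiver and eavesdropper
    radius     : R, so that the total average power is R^2
    n          : number of dimensions (antennas)
    """
    if var1 >= var2:
        return 0.0  # the eavesdropper is not degraded; no positive secrecy rate
    snr1 = radius ** 2 / (n * var1)
    snr2 = radius ** 2 / (n * var2)
    # log1p keeps the difference accurate when both SNRs are tiny (large n).
    return 0.5 * n * (math.log1p(snr1) - math.log1p(snr2))
```

For fixed $R$ and growing n, this expression approaches $\frac{R^2}{2}\left(\frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2}\right)$, which is linear in the difference of the two signal-to-noise ratios.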
In Figure 3, we consider discrete values for $R$ and for each value of $R$ we plot the corresponding estimated pmf, evaluated via Algorithm 1, for , , , and tolerance . The figure shows, at each $R$, the normalized amplitude of the support points in the estimated pmf, while the size of the circles qualitatively shows the probability associated with each support point. Similarly, Figure 4 shows the evolution of the pmf estimate for , , , and . It is interesting to notice that, in both Figure 3 and Figure 4, when a new mass point is added to the pmf, it appears at zero. Moreover, the mass point of radius $R$ always seems to be optimal.
Figure 3.
Evolution of the numerically estimated versus for , , (a) , and (b) .
Figure 4.
Evolution of the numerically estimated versus for , , (a) , and (b) .
Finally, Figure 5 shows the output distributions of the legitimate user and of the eavesdropper in the case of , , , and for two values of $R$. At the top of the figure, the distributions are shown for , which is a value close to . At the bottom of the figure, the distributions are shown for . For both values of $R$, the legitimate user sees an output distribution where the concentric rings of the input distribution are easily distinguishable. On the other hand, as expected, the output distribution seen by the eavesdropper is close to a Gaussian.
Figure 5.
Output pdf of the legitimate user and of the eavesdropper for , , , (a,b) , and (c,d) . An animation showing the evolution of the output pdf as varies can be found in [1].
6. Proof of Theorem 3
Estimation Theoretic Representation
By Remark 1, if has at most one sign change, is secrecy-capacity-achieving if, and only if, for all
We seek to re-write the condition (70) in the estimation theoretic form. To that end, we need the following representation of the relative entropy [38]:
where
and where
Another fact that will be important for our expression is
see, for example [27], for the proof.
Next, using (71) and (75) note that for any we have that for
where (77) follows from
Moreover, for , it holds
Consequently, the necessary and sufficient condition in Theorem 2 can be equivalently written as
Now will be the largest that satisfies (86), which concludes the proof of Theorem 3.
7. Proof of Theorem 4
The objective of the proof is to understand how the condition in (45) behaves as $n \to \infty$. To study the large n behavior, we need the following bounds on the Bessel function ratio [39,40]: for
where
Now let for some . The goal is to understand the behavior of
as n goes to infinity. First, let
and note that
where (92) follows from the dominated convergence theorem, and (93) follows since, by the law of large numbers we have, almost surely,
8. Proof of Theorem 5
8.1. Implicit Upper Bound
A consequence of the KKT conditions of Lemma 1 is the inclusion
which suggests the following upper bound on the number of support points of :
where (104) follows from using (21); (105) follows from applying Karlin’s oscillation Theorem 1 and the fact that the Gaussian pdf is a strictly totally positive kernel, which was shown in [26]; (107) is proved in Lemma A3 in the Appendix B; and (108) follows because is an analytic function in . The implicit upper bound (49) of Theorem 5 follows from (107) and (108).
8.2. Explicit Upper Bound
The key to finding an explicit upper bound on the number of zeros will be the following complex-analytic result.
Lemma 3
(Tijdeman’s Number of Zeros Lemma [41]). Let be positive numbers, such that . For the complex valued function , which is analytic on , its number of zeros within the disk satisfies
Furthermore, the following loosened version of the implicit upper bound in (49) will be useful.
Lemma 4.
where
and where .
Proof.
Starting from (107), we can write
where in step (114), we applied Rolle’s theorem, and in step (115), we used the fact that multiplying by a strictly positive function (i.e., ) does not change the number of zeros. The first derivative of g can be computed as follows:
where in the last step, we used the well-known Tweedie’s formula (see for example [42,43]):
An alternative expression for the first term in the right-hand side (RHS) of (116) is as follows:
where . The proof is concluded by letting
□
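Tweedie's formula, used in the proof above, can be checked numerically. The sketch below assumes its standard scalar form, $\mathbb{E}[X \mid Y = y] = y + \sigma^2 \frac{\mathrm{d}}{\mathrm{d}y} \log f_Y(y)$ for $Y = X + N$ with $N \sim \mathcal{N}(0, \sigma^2)$, and compares it with the conditional mean computed directly from Bayes' rule for a discrete input. The function names and the finite-difference step are illustrative choices.

```python
import math
from typing import Sequence


def log_output_pdf(y: float, points: Sequence[float],
                   probs: Sequence[float], var: float) -> float:
    """log f_Y(y) for Y = X + N, X discrete on `points`, N ~ N(0, var)."""
    terms = [math.log(p) - (y - x) ** 2 / (2.0 * var) for x, p in zip(points, probs)]
    m = max(terms)  # log-sum-exp for numerical stability
    return (m + math.log(sum(math.exp(t - m) for t in terms))
            - 0.5 * math.log(2.0 * math.pi * var))


def conditional_mean_tweedie(y: float, points: Sequence[float],
                             probs: Sequence[float], var: float,
                             h: float = 1e-5) -> float:
    """E[X | Y = y] from Tweedie's formula, with a central-difference score."""
    score = (log_output_pdf(y + h, points, probs, var)
             - log_output_pdf(y - h, points, probs, var)) / (2.0 * h)
    return y + var * score


def conditional_mean_direct(y: float, points: Sequence[float],
                            probs: Sequence[float], var: float) -> float:
    """E[X | Y = y] from Bayes' rule, for comparison."""
    w = [p * math.exp(-(y - x) ** 2 / (2.0 * var)) for x, p in zip(points, probs)]
    return sum(wi * x for wi, x in zip(w, points)) / sum(w)
```

For an equiprobable input on $\{-A, +A\}$ both routines reproduce the known expression $A \tanh(A y / \sigma^2)$.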
To apply Tijdeman’s number of zeros lemma, upper and lower bounds on the maximum modulus of the complex analytic extension of h over the disk are provided in Lemmas A4 and A5 in Appendix B. Using those bounds, we can provide an upper bound on the number of mass points as follows:
where (124) follows because extending to a larger domain can only increase the number of zeros; (125) follows from the Tijdeman’s Number of Zeros Lemma; (126) follows from choosing and and using bounds in Lemmas A4 and A5; (128) follows from using the value of L in (A38); (129) using the bound and defining
and (130) follows from the fact that the , and coefficients do not depend on and the fact that the coefficients , and , while they do depend on through , do not grow with . The fact that does not grow with follows from the bound in (69).
9. Proof of Theorem 6
Using the KKT conditions in (28), we have that for
where the last expression was computed in (83). This concludes the proof.
10. Conclusions
This paper has focused on the secrecy capacity of the n-dimensional vector Gaussian wiretap channel under a peak power (or amplitude) constraint in the so-called low (but not vanishing) amplitude regime. In this regime, the optimal input distribution is supported on a single n-dimensional sphere of radius $R$. The paper has identified the largest $R$ such that the distribution $P_{\mathbf{X}_R}$ is optimal. In addition, the asymptotic behavior of $\bar{R}_n$ has been completely characterized as the dimension n approaches infinity. As a by-product of the analysis, the capacity in the low-amplitude regime has also been characterized in an integral form that is easy to evaluate. The paper has also provided a number of supporting numerical examples. Implicit and explicit upper bounds have been proposed on the number of mass points of the optimal input distribution in the scalar case ($n = 1$).
There are several interesting future directions. For example, one interesting direction would be to determine a regime in which a mixture of a mass point at zero and $P_{\mathbf{X}_R}$ is optimal. It would also be interesting to establish a lower bound on the number of mass points in the support of the optimal input distribution when $n = 1$. We note that such a lower bound was obtained for a point-to-point channel in [30]. We finally remark that the extension of the results of this paper to nondegraded wiretap channels is not trivial and also constitutes an interesting but ambitious future direction.
Author Contributions
A.F., L.B. and A.D. contributed equally to this work. All authors have read and agreed to the published version of the manuscript. Part of this work was presented at the 2021 IEEE Information Theory Workshop [44], at the 2022 IEEE International Symposium on Information Theory [45], at the 2022 IEEE International Mediterranean Conference on Communications and Networking [33], and in the PhD dissertation in [46].
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
Datasets for the numerical results provided in this work are available at [1].
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Examples of the Function $G_{\sigma_1,\sigma_2,R,n}$
In this section, we give supporting numerical arguments that the function $G_{\sigma_1,\sigma_2,R,n}$ defined in (39) has at most one sign change. Figure A1 demonstrates the behavior of this function. In addition, the code that generates the function for various values of $\sigma_1$, $\sigma_2$, $R$, and n is provided in [1].
Figure A1.
Examples of the function defined in (39). (a) , , and . (b) , , and . (c) , , and . (d) , , and .
Appendix B. Derivative of the Secrecy Density
Lemma A1.
The derivative of the secrecy density for the input is
where is a noncentral chi-square random variable with degrees of freedom and noncentrality parameter and
where .
Proof.
We start with the secrecy density expressed in spherical coordinates. A quick way to obtain the information densities in this coordinate system is to note that:
where (A5) holds by [47], Lemma 6.17, and by independence between and ; the term is a differential entropy-like quantity for random vectors on the n-dimensional unit sphere ([47], Lemma 6.16); (A6) holds because is uniform on the unit sphere and thanks to [47], Lemma 6.15; the term is the gamma function; and in (A7) we have . We can now write the secrecy density as follows:
where
for . The term is the noncentral chi-square pdf with n degrees of freedom and noncentrality parameter .
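For numerical work, the noncentral chi-square pdf can be evaluated through its standard representation as a Poisson-weighted mixture of central chi-square densities with n + 2k degrees of freedom. The sketch below uses only the Python standard library; the function names, the truncation length `terms`, and the test values n = 4 and noncentrality 3 are illustrative assumptions and are not taken from the paper.

```python
import math

def ncx2_pdf(x, n, lam, terms=100):
    """Noncentral chi-square pdf with n degrees of freedom, noncentrality lam > 0.

    Uses the Poisson-mixture series
        f(x) = sum_k Poisson(k; lam/2) * chi2_pdf(x; n + 2k),
    truncated after `terms` terms (an illustrative choice).
    """
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    log_half_lam = math.log(lam / 2.0)
    total = 0.0
    for k in range(terms):
        half_df = n / 2.0 + k
        # Log of the Poisson weight with mean lam/2.
        log_weight = -lam / 2.0 + k * log_half_lam - math.lgamma(k + 1)
        # Log of the central chi-square pdf with n + 2k degrees of freedom.
        log_chi2 = ((half_df - 1.0) * log_x - x / 2.0
                    - half_df * math.log(2.0) - math.lgamma(half_df))
        total += math.exp(log_weight + log_chi2)
    return total

def midpoint_integral(func, lo, hi, steps):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

n, lam = 4, 3.0
mass = midpoint_integral(lambda x: ncx2_pdf(x, n, lam), 0.0, 80.0, 4000)
mean = midpoint_integral(lambda x: x * ncx2_pdf(x, n, lam), 0.0, 80.0, 4000)
print(round(mass, 4), round(mean, 4))
```

As a sanity check, the total mass should be close to 1 and the mean close to n plus the noncentrality parameter, which is 7 for these values.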
Given two values with , write
where we have integrated by parts and where is the cumulative distribution function of . Now notice that
Since stochastically dominates , the integrand in (A13) is always positive. We can introduce an auxiliary output random variable , for , with pdf
for , to rewrite (A12) as follows:
We evaluate the derivative in (A15) as:
where, in (A16), we used
in (A17), we used the relationship
and (A20) follows from the recurrence relationship
Putting together (A15) and (A20), we find
We are now in a position to compute the derivative of the information density as
where thanks to Lemma A2.
The final result is obtained by letting
and by specializing the result to the input . □
Lemma A2.
Consider the pdf defined in (A14). For any we have
Proof.
Lemma A3.
There exists some such that
Furthermore, L can be upper-bounded as follows:
where
with
Proof.
First, note that thanks to (69). Second, for , we can lower-bound the function g as follows:
where (A44) follows from applying Jensen’s inequality and the law of iterated expectation to the first term; (A45) follows from
and (A47) follows from for all . The RHS of
is strictly positive when
By using the bound , we arrive at
This concludes the proof for the bound on L. □
Lemma A4.
Proof.
Let us denote , where and are real numbers and is the imaginary unit. Then, by the triangle inequality, we have:
Next, let us upper-bound each contribution of (A57). For , we have
where step (A61) holds by the triangle inequality; step (A62) holds by noticing that
where and are real numbers that depend on x; (A63) follows from using the bound ; (A64) holds because is a decreasing function for and because , which follows from Jensen’s inequality; (A65) follows from and ; and (A66) follows from the bound for . Furthermore, given that and , we arrive at the bound
Consequently,
where (A69) follows from the Cauchy–Schwarz inequality and (A70) follows from . Moreover, we have
and finally
Putting all contributions together, we get
where, in the last step, we have used that and the fact that . □
Lemma A5.
Appendix C. Proof of Theorem 7
To study the large n behavior, we need the following bounds on the function [39,40]: for
where
Moreover, let
with . Consequently,
where (A93) follows from the dominated convergence theorem, since ; (A94) follows from using (A90); (A96) follows from using the strong law of large numbers to note that
Now, combining the capacity expression in (58) and (A96), we have that
Appendix D. Proof of Theorem 8
Appendix E. Partial Derivatives for the Gradient Ascent Algorithm
The partial derivatives of the secrecy information, with respect to any mass point , are defined as
By (A9), we have that , where , for , is defined in (A10). Therefore, to compute (A102), we define the following derivatives
where is the noncentral chi-square pdf with noncentrality parameter and n degrees of freedom. Notice that the derivative of with respect to is different from zero only when and is given by
Moreover, given the probability associated with , we have that
Finally, by combining everything together, we find
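To illustrate how such partial derivatives could drive an iterative search, the following is a minimal sketch of projected gradient ascent. It rests on several assumptions that are not taken from the paper: the mass points are treated as amplitudes constrained to the interval [0, R], their probabilities are held fixed, the step size and iteration count are arbitrary, and `toy_grad` is a hypothetical stand-in for the gradient assembled from the derivatives above.

```python
def projected_gradient_ascent(grad, start, radius, step=0.1, iters=500):
    """Maximize an objective over points in [0, radius], given its gradient."""
    points = list(start)
    for _ in range(iters):
        g = grad(points)
        # Ascent step, then projection onto the feasible interval [0, radius].
        points = [min(max(p + step * gi, 0.0), radius)
                  for p, gi in zip(points, g)]
    return points

# Hypothetical concave toy objective f(r) = -sum_i (r_i - t_i)^2.
# Its unconstrained maximizer is r = t; the last target lies outside [0, 2].
targets = [0.4, 1.5, 2.7]

def toy_grad(points):
    """Gradient of the toy objective with respect to each point."""
    return [-2.0 * (p - t) for p, t in zip(points, targets)]

result = projected_gradient_ascent(toy_grad, start=[1.0, 1.0, 1.0], radius=2.0)
print([round(r, 6) for r in result])
```

In this toy run, the first two points converge to their targets and the third is held at the boundary by the projection, which is how the peak constraint would act on a mass point that tries to move beyond the radius.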
References
1. Favano, A.; Barletta, L.; Dytso, A. Simulated Data. Available online: https://github.com/ucando83/WiretapCapacity (accessed on 26 April 2023).
2. Wyner, A.D. The wire-tap channel. Bell Syst. Tech. J. 1975, 54, 1355–1387.
3. Leung-Yan-Cheong, S.; Hellman, M. The Gaussian wire-tap channel. IEEE Trans. Inf. Theory 1978, 24, 451–456.
4. Bloch, M.; Barros, J. Physical-Layer Security: From Information Theory to Security Engineering; Cambridge University Press: Cambridge, UK, 2011.
5. Oggier, F.; Hassibi, B. A Perspective on the MIMO Wiretap Channel. Proc. IEEE 2015, 103, 1874–1882.
6. Liang, Y.; Poor, H.V.; Shamai (Shitz), S. Information theoretic security. Found. Trends Commun. Inf. Theory 2009, 5, 355–580.
7. Poor, H.V.; Schaefer, R.F. Wireless physical layer security. Proc. Natl. Acad. Sci. USA 2017, 114, 19–26.
8. Mukherjee, A.; Fakoorian, S.A.A.; Huang, J.; Swindlehurst, A.L. Principles of physical layer security in multiuser wireless networks: A survey. IEEE Commun. Surv. Tutor. 2014, 16, 1550–1573.
9. Gopala, P.K.; Lai, L.; El Gamal, H. On the secrecy capacity of fading channels. IEEE Trans. Inf. Theory 2008, 54, 4687–4698.
10. Bloch, M.; Barros, J.; Rodrigues, M.R.; McLaughlin, S.W. Wireless information-theoretic security. IEEE Trans. Inf. Theory 2008, 54, 2515–2534.
11. Khisti, A.; Tchamkerten, A.; Wornell, G.W. Secure broadcasting over fading channels. IEEE Trans. Inf. Theory 2008, 54, 2453–2469.
12. Liang, Y.; Poor, H.V.; Shamai, S. Secure communication over fading channels. IEEE Trans. Inf. Theory 2008, 54, 2470–2492.
13. Shafiee, S.; Liu, N.; Ulukus, S. Towards the secrecy capacity of the Gaussian MIMO wire-tap channel: The 2-2-1 channel. IEEE Trans. Inf. Theory 2009, 55, 4033–4039.
14. Khisti, A.; Wornell, G.W. Secure transmission with multiple antennas–Part II: The MIMOME wiretap channel. IEEE Trans. Inf. Theory 2010, 56, 5515–5532.
15. Oggier, F.; Hassibi, B. The secrecy capacity of the MIMO wiretap channel. IEEE Trans. Inf. Theory 2011, 57, 4961–4972.
16. Guo, D.; Shamai, S.; Verdú, S. Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans. Inf. Theory 2005, 51, 1261–1282.
17. Bustin, R.; Liu, R.; Poor, H.V.; Shamai, S. An MMSE approach to the secrecy capacity of the MIMO Gaussian wiretap channel. Eurasip J. Wirel. Commun. Netw. 2009, 2009, 370970.
18. Liu, T.; Shamai, S. A note on the secrecy capacity of the multiple-antenna wiretap channel. IEEE Trans. Inf. Theory 2009, 55, 2547–2553.
19. Loyka, S.; Charalambous, C.D. An algorithm for global maximization of secrecy rates in Gaussian MIMO wiretap channels. IEEE Trans. Commun. 2015, 63, 2288–2299.
20. Loyka, S.; Charalambous, C.D. Optimal signaling for secure communications over Gaussian MIMO wiretap channels. IEEE Trans. Inf. Theory 2016, 62, 7207–7215.
21. Ozel, O.; Ekrem, E.; Ulukus, S. Gaussian wiretap channel with amplitude and variance constraints. IEEE Trans. Inf. Theory 2015, 61, 5553–5563.
22. Soltani, M.; Rezki, Z. Optical wiretap channel with input-dependent Gaussian noise under peak- and average-intensity constraints. IEEE Trans. Inf. Theory 2018, 64, 6878–6893.
23. Soltani, M.; Rezki, Z. The Degraded Discrete-Time Poisson Wiretap Channel. arXiv 2021, arXiv:2101.03650.
24. Nam, S.H.; Lee, S.H. Secrecy Capacity of a Gaussian Wiretap Channel with One-bit ADCs is Always Positive. In Proceedings of the 2019 IEEE Information Theory Workshop (ITW), Visby, Sweden, 25–28 August 2019; pp. 1–5.
25. Dytso, A.; Egan, M.; Perlaza, S.M.; Poor, H.V.; Shitz, S.S. Optimal Inputs for Some Classes of Degraded Wiretap Channels. In Proceedings of the 2018 IEEE Information Theory Workshop (ITW), Guangzhou, China, 25–29 November 2018; pp. 1–5.
26. Karlin, S. Pólya type distributions, II. Ann. Math. Stat. 1957, 28, 281–308.
27. Dytso, A.; Al, M.; Poor, H.V.; Shamai Shitz, S. On the Capacity of the Peak Power Constrained Vector Gaussian Channel: An Estimation Theoretic Perspective. IEEE Trans. Inf. Theory 2019, 65, 3907–3921.
28. Favano, A.; Ferrari, M.; Magarini, M.; Barletta, L. The Capacity of the Amplitude-Constrained Vector Gaussian Channel. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021; pp. 426–431.
29. Berry, J.C. Minimax estimation of a bounded normal mean vector. J. Multivar. Anal. 1990, 35, 130–139.
30. Dytso, A.; Yagli, S.; Poor, H.V.; Shamai (Shitz), S. The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points. IEEE Trans. Inf. Theory 2020, 66, 2006–2022.
31. Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
32. Han, T.S.; Endo, H.; Sasaki, M. Reliability and Secrecy Functions of the Wiretap Channel Under Cost Constraint. IEEE Trans. Inf. Theory 2014, 60, 6819–6843.
33. Barletta, L.; Dytso, A. Amplitude-Constrained Gaussian Wiretap Channel: Computation of the Optimal Input Distribution. In Proceedings of the 2022 IEEE International Mediterranean Conference on Communications and Networking (MeditCom), Athens, Greece, 5–8 September 2022; pp. 106–111.
34. Rose, K. A mapping approach to rate-distortion computation and analysis. IEEE Trans. Inf. Theory 1994, 40, 1939–1952.
35. Blahut, R. Computation of channel capacity and rate-distortion functions. IEEE Trans. Inf. Theory 1972, 18, 460–473.
36. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004.
37. Yasui, K.; Suko, T.; Matsushima, T. An algorithm for computing the secrecy capacity of broadcast channels with confidential messages. In Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT), Nice, France, 24–29 June 2007; pp. 936–940.
38. Verdú, S. Mismatched estimation and relative entropy. IEEE Trans. Inf. Theory 2010, 56, 3712–3720.
39. Segura, J. Bounds for ratios of modified Bessel functions and associated Turán-type inequalities. J. Math. Anal. Appl. 2011, 374, 516–528.
40. Baricz, Á. Bounds for Turánians of modified Bessel functions. Expo. Math. 2015, 33, 223–251.
41. Tijdeman, R. On the number of zeros of general exponential polynomials. In Proceedings of the Indagationes Mathematicae; North-Holland: Amsterdam, The Netherlands, 1971; Volume 74, pp. 1–7.
42. Esposito, R. On a relation between detection and estimation in decision theory. Inf. Control 1968, 12, 116–120.
43. Dytso, A.; Poor, H.V.; Shitz, S.S. A general derivative identity for the conditional mean estimator in Gaussian noise and some applications. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 1183–1188.
44. Barletta, L.; Dytso, A. Scalar Gaussian Wiretap Channel: Bounds on the Support Size of the Secrecy-Capacity-Achieving Distribution. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW), Kanazawa, Japan, 17–21 October 2021; pp. 1–6.
45. Favano, A.; Barletta, L.; Dytso, A. On the Capacity Achieving Input of Amplitude Constrained Vector Gaussian Wiretap Channel. In Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT), Espoo, Finland, 26 June–1 July 2022; pp. 850–855.
46. Favano, A. The Capacity of Amplitude-Constrained Vector Gaussian Channels. Ph.D. Dissertation, Politecnico di Milano, Milan, Italy, 2022.
47. Lapidoth, A.; Moser, S.M. Capacity bounds via duality with applications to multiple-antenna systems on flat-fading channels. IEEE Trans. Inf. Theory 2003, 49, 2426–2467.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).




