Article

A Lower Bound on the Differential Entropy of Log-Concave Random Vectors with Applications

1 Center for the Mathematics of Information, California Institute of Technology, Pasadena, CA 91125, USA
2 Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
* Author to whom correspondence should be addressed.
Entropy 2018, 20(3), 185; https://doi.org/10.3390/e20030185
Submission received: 18 January 2018 / Revised: 6 March 2018 / Accepted: 6 March 2018 / Published: 9 March 2018
(This article belongs to the Special Issue Entropy and Information Inequalities)

Abstract:
We derive a lower bound on the differential entropy of a log-concave random variable X in terms of the p-th absolute moment of X. The new bound leads to a reverse entropy power inequality with an explicit constant, and to new bounds on the rate-distortion function and the channel capacity. Specifically, we study the rate-distortion function for log-concave sources and the distortion measure $d(x, \hat{x}) = |x - \hat{x}|^r$, with $r \ge 1$, and we establish that the difference between the rate-distortion function and the Shannon lower bound is at most $\log \sqrt{\pi e} \approx 1.5$ bits, independently of r and the target distortion d. For mean-square error distortion, the difference is at most $\log \sqrt{\frac{\pi e}{2}} \approx 1$ bit, regardless of d. We also provide bounds on the capacity of memoryless additive noise channels when the noise is log-concave. We show that the difference between the capacity of such channels and the capacity of the Gaussian channel with the same noise power is at most $\log \sqrt{\frac{\pi e}{2}} \approx 1$ bit. Our results generalize to the case of a random vector X with possibly dependent coordinates. Our proof technique leverages tools from convex geometry.

1. Introduction

It is well known that the differential entropy among all zero-mean random variables with the same second moment is maximized by the Gaussian distribution:
$h(X) \le \frac{1}{2}\log\left(2\pi e\,\mathbb{E}[|X|^2]\right). \quad (1)$
More generally, the differential entropy under p-th moment constraint is upper bounded as (see e.g., [1] (Appendix 2)), for p > 0 ,
$h(X) \le \log\left(\alpha_p \|X\|_p\right), \quad (2)$
where
$\alpha_p \triangleq 2 e^{\frac{1}{p}}\, \Gamma\!\left(1+\tfrac{1}{p}\right) p^{\frac{1}{p}}, \qquad \|X\|_p \triangleq \mathbb{E}[|X|^p]^{\frac{1}{p}}. \quad (3)$
Here, Γ denotes the Gamma function. Of course, if $p = 2$, then $\alpha_p = \sqrt{2\pi e}$, and Equation (2) reduces to Equation (1). A natural question to ask is whether a matching lower bound on $h(X)$ can be found in terms of the p-norm $\|X\|_p$. The quest is meaningless without additional assumptions on the density of X, as $h(X) = -\infty$ is possible even if $\|X\|_p$ is finite. In this paper, we show that if the density $f_X(x)$ of X is log-concave (that is, if $\log f_X(x)$ is concave), then $h(X)$ stays within a constant of the upper bound in Equation (2) (see Theorem 3 in Section 2 below):
$h(X) \ge \log \frac{2\,\|X - \mathbb{E}[X]\|_p}{\Gamma(p+1)^{\frac{1}{p}}}, \quad (4)$
where $p \ge 1$. Moreover, the bound (4) tightens for $p = 2$, where we have
$h(X) \ge \frac{1}{2}\log\left(4\,\mathrm{Var}[X]\right). \quad (5)$
The bound (4) actually holds for $p > -1$ if, in addition to being log-concave, X is symmetric (that is, $f_X(-x) = f_X(x)$); see Theorem 1 in Section 2 below.
The class of log-concave distributions is rich and contains important distributions in probability, statistics and analysis. The Gaussian, Laplace and chi distributions, as well as the uniform distribution on a convex set, are all log-concave. The class of log-concave random vectors is well behaved under natural probabilistic operations: a famous result of Prékopa [2] states that sums of independent log-concave random vectors, as well as marginals of log-concave random vectors, are log-concave. Furthermore, log-concave distributions have moments of all orders.
Together with the classical bound in Equation (2), the bound in (4) tells us that entropy and moments of log-concave random variables are comparable.
Using a different proof technique, Bobkov and Madiman [3] recently showed that the differential entropy of a log-concave X satisfies
$h(X) \ge \frac{1}{2}\log\left(\frac{\mathrm{Var}[X]}{2}\right). \quad (6)$
Our results in (4) and (5) tighten (6), in addition to providing a comparison with other moments.
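As a quick numerical illustration of how these bounds nest (our own sketch, not part of the paper's derivations), one can evaluate (1), (5) and (6) for a Laplace random variable, whose differential entropy and variance are available in closed form:

```python
import math

# Laplace(b): f(x) = exp(-|x|/b)/(2b); h(X) = 1 + ln(2b) nats, Var[X] = 2b^2.
b = 1.0
h_true = 1 + math.log(2 * b)                        # differential entropy, in nats
var = 2 * b * b

upper = 0.5 * math.log(2 * math.pi * math.e * var)  # Gaussian maximum-entropy bound (1)
lower_new = 0.5 * math.log(4 * var)                 # the new bound (5)
lower_bm = 0.5 * math.log(var / 2)                  # Bobkov-Madiman bound (6)

print(lower_bm <= lower_new <= h_true <= upper)     # True: (6) <= (5) <= h(X) <= (1)
```

The gap between (5) and (6) is $\frac{1}{2}\log 8 \approx 1.04$ nats here, illustrating the tightening.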
Furthermore, this paper generalizes the lower bound on the differential entropy in (4) to random vectors. If the random vector $X = (X_1, \dots, X_n)$ consists of independent random variables, then the differential entropy of X is equal to the sum of the differential entropies of its components, and one can trivially apply (4) component-wise to obtain a lower bound on $h(X)$. In this paper, we show that, even for dependent components, as long as the density of the random vector X is log-concave and satisfies a symmetry condition, its differential entropy is bounded from below in terms of the covariance matrix of X (see Theorem 4 in Section 2 below). As noted in [4], such a generalization is related to the famous hyperplane conjecture in convex geometry. We also extend our results to a more general class of random variables, namely, the class of $\gamma$-concave random variables, with $\gamma < 0$.
The bound (4) on the differential entropy allows us to derive reverse entropy power inequalities with explicit constants. The fundamental entropy power inequality of Shannon [5] and Stam [6] states that for all independent continuous random vectors X and Y in R n ,
$N(X+Y) \ge N(X) + N(Y), \quad (7)$
where
$N(X) = e^{\frac{2}{n} h(X)} \quad (8)$
denotes the entropy power of X. It is of interest to characterize distributions for which a reverse form of (7) holds. In this direction, it was shown by Bobkov and Madiman [7] that, given any continuous log-concave random vectors X and Y in R n , there exist affine volume-preserving maps u 1 , u 2 such that a reverse entropy power inequality holds for u 1 ( X ) and u 2 ( Y ) :
$N(u_1(X) + u_2(Y)) \le c\left(N(u_1(X)) + N(u_2(Y))\right) = c\left(N(X) + N(Y)\right), \quad (9)$
for some universal constant $c \ge 1$ (independent of the dimension).
In applications, it is important to know the precise value of the constant c that appears in (9). It was shown by Cover and Zhang [8] that, if X and Y are identically distributed (possibly dependent) log-concave random variables, then
$N(X+Y) \le 4\, N(X). \quad (10)$
Inequality (10) easily extends to random vectors (see [9]). A similar bound for the difference of i.i.d. log-concave random vectors was obtained in [10], and reads as
$N(X-Y) \le e^2\, N(X). \quad (11)$
Recently, a new form of reverse entropy power inequality was investigated in [11], and a general reverse entropy power-type inequality was developed in [12]. For further details, we refer to the survey paper [13]. In Section 5, we provide explicit constants for non-identically distributed and uncorrelated log-concave random vectors (possibly dependent). In particular, we prove that as long as log-concave random variables X and Y are uncorrelated,
$N(X+Y) \le \frac{\pi e}{2}\left(N(X) + N(Y)\right). \quad (12)$
A generalization of (12) to arbitrary dimension is stated in Theorem 8 in Section 2 below.
The bound (4) on the differential entropy is essential in the study of the difference between the rate-distortion function and the Shannon lower bound, which we describe next. Given a nonnegative number d, the rate-distortion function $R_X(d)$ under the r-th moment distortion measure is given by
$R_X(d) = \inf_{P_{\hat{X}|X}:\ \mathbb{E}[|X-\hat{X}|^r] \le d} I(X; \hat{X}), \quad (13)$
where the infimum is over all transition probability kernels $P_{\hat{X}|X}\colon \mathbb{R} \to \mathbb{R}$ satisfying the moment constraint. The celebrated Shannon lower bound [14] states that the rate-distortion function is bounded from below by
$R_X(d) \ge \underline{R}_X(d) \triangleq h(X) - \log\left(\alpha_r\, d^{\frac{1}{r}}\right), \quad (14)$
where $\alpha_r$ is defined in (3). For mean-square error distortion ($r = 2$), (14) simplifies to
$R_X(d) \ge h(X) - \frac{1}{2}\log\left(2\pi e\, d\right). \quad (15)$
The Shannon lower bound states that the rate-distortion function is bounded from below by the difference between the differential entropy of the source and a term that increases with the target distortion d, explicitly linking the storage requirements for X to the information content of X (measured by $h(X)$) and the desired reproduction distortion d. As shown in [15,16,17] under progressively less stringent assumptions (Koch [17] showed that (16) holds as long as $H(\lfloor X \rfloor) < \infty$), the Shannon lower bound is tight in the limit of low distortion:
$0 \le R_X(d) - \underline{R}_X(d) \xrightarrow[d \to 0]{} 0. \quad (16)$
The speed of convergence in (16) and its finite blocklength refinement were recently explored in [18]. Due to its simplicity and its tightness in the high resolution/low distortion limit, the Shannon lower bound can serve as a proxy for the rate-distortion function $R_X(d)$, which rarely has an explicit representation. Furthermore, the tightness of the Shannon lower bound at low d is linked to the optimality of simple lattice quantizers [18], an insight of evident practical significance. Gish and Pierce [19] showed that, for mean-square error distortion, the difference between the entropy rate of a scalar quantizer, $H_1$, and the rate-distortion function $R_X(d)$ converges to $\frac{1}{2}\log_2 \frac{2\pi e}{12} \approx 0.254$ bit/sample in the limit $d \to 0$. Ziv [20] proved that $\tilde{H}_1 - R_X(d)$ is bounded by $\frac{1}{2}\log_2 \frac{2\pi e}{6} \approx 0.754$ bit/sample, universally in d, where $\tilde{H}_1$ is the entropy rate of a dithered scalar quantizer.
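The two quantizer constants quoted above are simple closed forms; as a sanity check of the arithmetic (our own illustration):

```python
import math

# Gish-Pierce: gap of an entropy-coded scalar quantizer over R_X(d) as d -> 0, in bits
gish_pierce = 0.5 * math.log2(2 * math.pi * math.e / 12)
# Ziv: gap of a dithered scalar quantizer, uniform in d, in bits
ziv = 0.5 * math.log2(2 * math.pi * math.e / 6)
print(gish_pierce, ziv)  # roughly 0.255 and 0.755 bit/sample
```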
In this paper, we show that the gap between $R_X(d)$ and $\underline{R}_X(d)$ is bounded universally in d, provided that the source density is log-concave: for mean-square error distortion ($r = 2$ in (13)), we have
$R_X(d) - \underline{R}_X(d) \le \frac{1}{2}\log\frac{\pi e}{2} \approx 1.05 \text{ bits}. \quad (17)$
Besides leading to the reverse entropy power inequality and the reverse Shannon lower bound, the new bounds on the differential entropy allow us to bound the capacity of additive noise memoryless channels, provided that the noise follows a log-concave distribution.
The capacity of a channel that adds memoryless noise Z is given by (see e.g., [21] (Chapter 9))
$C_Z(P) = \sup_{X:\ \mathbb{E}[|X|^2] \le P} I(X; X+Z), \quad (18)$
where P is the power allotted for the transmission. As a consequence of the entropy power inequality (7) (or, more elementarily, as a consequence of the worst additive noise lemma, see [22,23]), it holds that
$C_Z(P) \ge C_{Z^*}(P) = \frac{1}{2}\log\left(1 + \frac{P}{\mathrm{Var}[Z]}\right), \quad (19)$
for arbitrary noise Z, where $C_{Z^*}(P)$ denotes the capacity of the additive white Gaussian noise channel with noise variance $\mathrm{Var}[Z]$. This fact is well known (see e.g., [21] (Chapter 9)), and is referred to as the saddle-point condition.
In this paper, we show that, whenever the noise Z is log-concave, the difference between the capacity C Z ( P ) and the capacity of a Gaussian channel with the same noise power satisfies
$C_Z(P) - C_{Z^*}(P) \le \frac{1}{2}\log\frac{\pi e}{2} \approx 1.05 \text{ bits}. \quad (20)$
Let us mention a similar result by Zamir and Erez [24], who showed that the capacity of an arbitrary memoryless additive noise channel is well approximated by the mutual information between the Gaussian input and the output of the channel:
$C_Z(P) - I(X^*; X^* + Z) \le \frac{1}{2} \text{ bit}, \quad (21)$
where X * is a Gaussian input satisfying the power constraint. The bounds (20) and (21) are not directly comparable.
The rest of the paper is organized as follows. Section 2 presents and discusses our main results: the lower bounds on the differential entropy in Theorems 1, 3 and 4, the reverse entropy power inequalities with explicit constants in Theorems 7 and 8, the upper bounds on $R_X(d) - \underline{R}_X(d)$ in Theorems 9 and 10, and the bounds on the capacity of memoryless additive channels in Theorems 12 and 13. The convex geometry tools used to prove the bounds in Theorems 1, 3 and 4 are presented in Section 3. In Section 4, we extend our results to the class of $\gamma$-concave random variables. The reverse entropy power inequalities in Theorems 7 and 8 are proven in Section 5. The bounds on the rate-distortion function in Theorems 9 and 10 are proven in Section 6. The bounds on the channel capacity in Theorems 12 and 13 are proven in Section 7.

2. Main Results

2.1. Lower Bounds on the Differential Entropy

A function $f\colon \mathbb{R}^n \to [0, +\infty)$ is log-concave if $\log f\colon \mathbb{R}^n \to [-\infty, \infty)$ is a concave function. Equivalently, f is log-concave if for every $\lambda \in [0,1]$ and for every $x, y \in \mathbb{R}^n$, one has
$f((1-\lambda)x + \lambda y) \ge f(x)^{1-\lambda}\, f(y)^{\lambda}. \quad (22)$
We say that a random vector X in R n is log-concave if it has a probability density function f X with respect to Lebesgue measure in R n such that f X is log-concave.
Our first result is a lower bound on the differential entropy of a symmetric log-concave random variable in terms of its moments.
Theorem 1.
Let X be a symmetric log-concave random variable. Then, for every $p > -1$,
$h(X) \ge \log \frac{2\,\|X\|_p}{\Gamma(p+1)^{\frac{1}{p}}}. \quad (23)$
Moreover, (23) holds with equality for the uniform distribution in the limit $p \to -1$.
As we will see in Theorem 3, for p = 2 , the bound (23) tightens as
$h(X) \ge \log\left(2\,\|X\|_2\right). \quad (24)$
The difference between the upper bound in (2) and the lower bound in (23) grows as $\log p$ as $p \to +\infty$ and as $\frac{1}{p}$ as $p \to 0^{+}$, and reaches its minimum value of $\log e \approx 1.4$ bits at $p = 1$.
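This gap has the closed form $\log_2\!\big(\alpha_p\, \Gamma(p+1)^{1/p}/2\big)$ bits, which is straightforward to evaluate; the following short computation (our own illustration, not from the paper) confirms the claimed behavior:

```python
import math

def alpha(p):
    # alpha_p = 2 e^{1/p} Gamma(1 + 1/p) p^{1/p}, the constant defined in (3)
    return 2 * math.exp(1 / p) * math.gamma(1 + 1 / p) * p ** (1 / p)

def gap_bits(p):
    # difference between the upper bound (2) and the lower bound (23), in bits
    return math.log2(alpha(p) * math.gamma(p + 1) ** (1 / p) / 2)

print(gap_bits(1))                   # log2(e), about 1.44 bits
print(gap_bits(0.1) > gap_bits(1))   # True: the gap grows as p -> 0+
print(gap_bits(50) > gap_bits(1))    # True: the gap grows as p -> infinity
```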
The next theorem, due to Karlin, Proschan and Barlow [25], shows that the moments of a symmetric log-concave random variable are comparable, and demonstrates that the bound in Theorem 1 tightens as $p \to -1$.
Theorem 2.
Let X be a symmetric log-concave random variable. Then, for every $-1 < p \le q$,
$\frac{\|X\|_q}{\Gamma(q+1)^{\frac{1}{q}}} \le \frac{\|X\|_p}{\Gamma(p+1)^{\frac{1}{p}}}. \quad (25)$
Moreover, the Laplace distribution satisfies (25) with equality [25].
Combining Theorem 2 with the well-known fact that $\|X\|_p$ is non-decreasing in p, we deduce that, for every symmetric log-concave random variable X and every $-1 < p < q$,
$\|X\|_p \le \|X\|_q \le \frac{\Gamma(q+1)^{\frac{1}{q}}}{\Gamma(p+1)^{\frac{1}{p}}}\, \|X\|_p. \quad (26)$
Using Theorem 1 and (24), we immediately obtain the following upper bound on the relative entropy $D(X\|G_X)$ between a symmetric log-concave random variable X and a Gaussian $G_X$ with the same variance as X.
Corollary 1.
Let X be a symmetric log-concave random variable. Then, for every $p > -1$,
$D(X\|G_X) \le \log\sqrt{\pi e} + \Delta_p, \quad (27)$
where $G_X \sim \mathcal{N}(0, \|X\|_2^2)$, and
$\Delta_p \triangleq \begin{cases} \log\left(\dfrac{\Gamma(p+1)^{\frac{1}{p}}}{\sqrt{2}}\, \dfrac{\|X\|_2}{\|X\|_p}\right), & p \ne 2, \\[2mm] -\log\sqrt{2}, & p = 2. \end{cases} \quad (28)$
Remark 1.
The uniform distribution achieves equality in (27) in the limit $p \to -1$. Indeed, if U is uniformly distributed on a symmetric interval, then
$\Delta_p = \log\frac{\Gamma(p+2)^{\frac{1}{p}}}{\sqrt{6}} \xrightarrow[p \to -1]{} \frac{1}{2}\log\frac{1}{6}, \quad (29)$
and so, in the limit $p \to -1$, the upper bound in Corollary 1 coincides with the true value of $D(U\|G_U)$:
$D(U\|G_U) = \frac{1}{2}\log\frac{\pi e}{6}. \quad (30)$
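The closed form (30) is easy to confirm numerically (an illustrative check, not part of the paper): with U uniform on $[-1/2, 1/2]$, $h(U) = 0$, so $D(U\|G_U)$ reduces to $\mathbb{E}[-\ln g(U)]$ where $g$ is the density of $\mathcal{N}(0, 1/12)$:

```python
import math

var = 1 / 12                                   # Var[U] for U uniform on [-1/2, 1/2]
closed = 0.5 * math.log(math.pi * math.e / 6)  # the value in (30), in nats

n = 100000
s = 0.0
for i in range(n):                             # midpoint rule on [-1/2, 1/2]
    x = -0.5 + (i + 0.5) / n
    g = math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)
    s += -math.log(g)
numeric = s / n                                # = E[-ln g(U)] = D(U||G_U) since h(U) = 0

print(abs(closed - numeric) < 1e-6)            # True
```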
We next provide a lower bound for the differential entropy of log-concave random variables that are not necessarily symmetric.
Theorem 3.
Let X be a log-concave random variable. Then, for every $p \ge 1$,
$h(X) \ge \log \frac{2\,\|X - \mathbb{E}[X]\|_p}{\Gamma(p+1)^{\frac{1}{p}}}. \quad (31)$
Moreover, for $p = 2$, the bound (31) tightens as
$h(X) \ge \log\left(2\sqrt{\mathrm{Var}[X]}\right). \quad (32)$
The next proposition is an analog of Theorem 2 for log-concave random variables that are not necessarily symmetric.
Proposition 1.
Let X be a log-concave random variable. Then, for every $1 \le p \le q$,
$\frac{\|X - \mathbb{E}[X]\|_q}{\Gamma(q+1)^{\frac{1}{q}}} \le \frac{2\,\|X - \mathbb{E}[X]\|_p}{\Gamma(p+1)^{\frac{1}{p}}}. \quad (33)$
Remark 2.
Contrary to Theorem 2, we do not know whether there exists a distribution that realizes equality in (33).
Using Theorem 3, we immediately obtain the following upper bound on the relative entropy $D(X\|G_X)$ between an arbitrary log-concave random variable X and a Gaussian $G_X$ with the same variance as X. Recall the definition of $\Delta_p$ in (28).
Corollary 2.
Let X be a zero-mean, log-concave random variable. Then, for every $p \ge 1$,
$D(X\|G_X) \le \log\sqrt{\pi e} + \Delta_p, \quad (34)$
where $G_X \sim \mathcal{N}(0, \|X\|_2^2)$. In particular, by taking $p = 2$, we necessarily have
$D(X\|G_X) \le \log\sqrt{\frac{\pi e}{2}}. \quad (35)$
For a given distribution of X, one can optimize over p to further tighten (35), as seen in (29) for the uniform distribution.
We now present a generalization of the bound in Theorem 1 to random vectors satisfying a symmetry condition. A function $f\colon \mathbb{R}^n \to \mathbb{R}$ is called unconditional if, for every $(x_1, \dots, x_n) \in \mathbb{R}^n$ and every $(\varepsilon_1, \dots, \varepsilon_n) \in \{-1, 1\}^n$, one has
$f(\varepsilon_1 x_1, \dots, \varepsilon_n x_n) = f(x_1, \dots, x_n). \quad (36)$
For example, the probability density function of the standard Gaussian distribution is unconditional. We say that a random vector X in R n is unconditional if it has a probability density function f X with respect to Lebesgue measure in R n such that f X is unconditional.
Theorem 4.
Let X be a symmetric log-concave random vector in $\mathbb{R}^n$, $n \ge 2$. Then,
$h(X) \ge \frac{n}{2}\log\frac{|K_X|^{\frac{1}{n}}}{c(n)}, \quad (37)$
where $|K_X|$ denotes the determinant of the covariance matrix of X, and $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$. If, in addition, X is unconditional, then $c(n) = \frac{e^2}{2}$.
By combining Theorem 4 with the well-known upper bound on the differential entropy, we deduce that, for every symmetric log-concave random vector X in R n ,
$\frac{n}{2}\log\frac{|K_X|^{\frac{1}{n}}}{c(n)} \le h(X) \le \frac{n}{2}\log\left(2\pi e\, |K_X|^{\frac{1}{n}}\right), \quad (38)$
where $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ if, in addition, X is unconditional.
Using Theorem 4, we immediately obtain the following upper bound on the relative entropy $D(X\|G_X)$ between a symmetric log-concave random vector X and a Gaussian $G_X$ with the same covariance matrix as X.
Corollary 3.
Let X be a symmetric log-concave random vector in $\mathbb{R}^n$. Then,
$D(X\|G_X) \le \frac{n}{2}\log\left(2\pi e\, c(n)\right), \quad (39)$
where $G_X \sim \mathcal{N}(0, K_X)$, with $c(n) = \frac{n^2 e^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ when X is unconditional.
For isotropic unconditional log-concave random vectors (whose definition we recall in Section 3.3 below), we extend Theorem 4 to other moments.
Theorem 5.
Let $X = (X_1, \dots, X_n)$ be an isotropic unconditional log-concave random vector. Then, for every $p > -1$,
$h(X) \ge \max_{i \in \{1, \dots, n\}} n \log\left(\frac{2\,\|X_i\|_p}{\Gamma(p+1)^{\frac{1}{p}}} \cdot \frac{1}{c}\right), \quad (40)$
where $c = e\sqrt{6}$. If, in addition, $f_X$ is invariant under permutations of the coordinates, then $c = e$.

2.2. Extension to γ -Concave Random Variables

The bound in Theorem 1 can be extended to a class of random variables larger than the log-concave class, namely the class of $\gamma$-concave random variables, which we describe next.
Let $\gamma < 0$. We say that a probability density function $f\colon \mathbb{R}^n \to [0, +\infty)$ is γ-concave if $f^{\gamma}$ is convex. Equivalently, f is γ-concave if for every $\lambda \in [0,1]$ and every $x, y \in \mathbb{R}^n$, one has
$f((1-\lambda)x + \lambda y) \ge \left((1-\lambda)\, f(x)^{\gamma} + \lambda\, f(y)^{\gamma}\right)^{\frac{1}{\gamma}}. \quad (41)$
As $\gamma \to 0$, (41) agrees with (22), and thus 0-concave distributions correspond to log-concave distributions. The class of γ-concave distributions has been deeply studied in [26,27].
Since, for fixed $a, b \ge 0$, the function $\left((1-\lambda) a^{\gamma} + \lambda b^{\gamma}\right)^{\frac{1}{\gamma}}$ is non-decreasing in γ, we deduce that any log-concave distribution is γ-concave, for any $\gamma < 0$.
For example, extended Cauchy distributions, that is, distributions of the form
$f_X(x) = \frac{C_{\gamma}}{\left(1 + |x|\right)^{n - \frac{1}{\gamma}}}, \quad x \in \mathbb{R}^n, \quad (42)$
where C γ is the normalization constant, are γ -concave distributions (but are not log-concave).
We say that a random vector X in R n is γ -concave if it has a probability density function f X with respect to Lebesgue measure in R n such that f X is γ -concave.
We derive the following lower bound on the differential entropy for one-dimensional symmetric γ-concave random variables, with $\gamma \in (-1, 0)$.
Theorem 6.
Let $\gamma \in (-1, 0)$. Let X be a symmetric γ-concave random variable. Then, for every $p \in \left[1, -1 - \frac{1}{\gamma}\right)$,
$h(X) \ge \log\left(\frac{2\,\|X\|_p}{\Gamma(p+1)^{\frac{1}{p}}} \cdot \frac{\Gamma\!\left(-1-\frac{1}{\gamma}\right)^{1+\frac{1}{p}}}{\Gamma\!\left(-\frac{1}{\gamma}\right)\, \Gamma\!\left(-\frac{1}{\gamma} - (p+1)\right)^{\frac{1}{p}}}\right). \quad (43)$
Notice that (43) reduces to (23) as $\gamma \to 0$. Theorem 6 implies the following relation between entropy and second moment, for any $\gamma \in \left(-\frac{1}{3}, 0\right)$.
Corollary 4.
Let $\gamma \in \left(-\frac{1}{3}, 0\right)$. Let X be a symmetric γ-concave random variable. Then,
$h(X) \ge \frac{1}{2}\log\left(2\,\|X\|_2^2\, \frac{\Gamma\!\left(-1-\frac{1}{\gamma}\right)^3}{\Gamma\!\left(-\frac{1}{\gamma}\right)^2\, \Gamma\!\left(-\frac{1}{\gamma}-3\right)}\right) = \frac{1}{2}\log\left(2\,\|X\|_2^2\, \frac{(2\gamma+1)(3\gamma+1)}{(\gamma+1)^2}\right). \quad (44)$
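The Gamma-function simplification in (44) follows from the recurrence $\Gamma(z+1) = z\,\Gamma(z)$ and can be checked numerically (an illustrative check of the identity, not part of the paper):

```python
import math

def gamma_ratio(g):
    # Gamma(-1-1/g)^3 / (Gamma(-1/g)^2 Gamma(-1/g - 3)) for g in (-1/3, 0)
    a = -1 / g
    return math.gamma(a - 1) ** 3 / (math.gamma(a) ** 2 * math.gamma(a - 3))

def rational_form(g):
    # the claimed closed form (2g+1)(3g+1)/(g+1)^2
    return (2 * g + 1) * (3 * g + 1) / (g + 1) ** 2

for g in (-0.05, -0.1, -0.2, -0.3):
    assert abs(gamma_ratio(g) - rational_form(g)) < 1e-9 * rational_form(g)
print("identity verified")
```

For instance, at $\gamma = -0.2$ both sides equal $216/576 = 0.375$.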

2.3. Reverse Entropy Power Inequality with an Explicit Constant

As an application of Theorems 3 and 4, we establish in Theorems 7 and 8 below a reverse form of the entropy power inequality (7) with explicit constants, for uncorrelated log-concave random vectors. Recall the definition of the entropy power (8).
Theorem 7.
Let X and Y be uncorrelated log-concave random variables. Then,
$N(X+Y) \le \frac{\pi e}{2}\left(N(X) + N(Y)\right). \quad (45)$
As a consequence of Corollary 4, reverse entropy power inequalities for more general distributions can be obtained. In particular, for any uncorrelated symmetric γ-concave random variables X and Y, with $\gamma \in \left(-\frac{1}{3}, 0\right)$,
$N(X+Y) \le \frac{\pi e\, (\gamma+1)^2}{(2\gamma+1)(3\gamma+1)}\left(N(X) + N(Y)\right). \quad (46)$
One cannot have a reverse entropy power inequality in higher dimensions for arbitrary log-concave random vectors. Indeed, just consider X uniformly distributed on $\left[-\frac{\varepsilon}{2}, \frac{\varepsilon}{2}\right] \times \left[-\frac{1}{2}, \frac{1}{2}\right]$ and Y uniformly distributed on $\left[-\frac{1}{2}, \frac{1}{2}\right] \times \left[-\frac{\varepsilon}{2}, \frac{\varepsilon}{2}\right]$ in $\mathbb{R}^2$, with $\varepsilon > 0$ small enough that $N(X)$ and $N(Y)$ are arbitrarily small compared to $N(X+Y)$. Hence, we need to put X and Y in a certain position for a reverse form of (7) to be possible. While the isotropic position (discussed in Section 3) will work, it can be relaxed to the weaker condition that the covariance matrices are proportional. Recall that we denote by $K_X$ the covariance matrix of X.
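The failure in this example can be made quantitative (our own illustrative calculation under the stated example): the coordinates of $X+Y$ are independent, and each is a trapezoidal density (the convolution of uniforms of widths 1 and ε), whose entropy a short integration shows to be $\varepsilon/2$ nats:

```python
import math

def entropy_powers(eps):
    # X ~ Unif([-eps/2, eps/2] x [-1/2, 1/2]); by symmetry N(Y) = N(X).
    n = 2
    h_x = math.log(eps * 1.0)        # entropy of a uniform = log(volume of the box)
    N_x = math.exp(2 * h_x / n)      # = eps
    # Each coordinate of X + Y is a (1, eps)-trapezoid with entropy eps/2 nats.
    h_sum = 2 * (eps / 2)
    N_sum = math.exp(2 * h_sum / n)  # -> 1 as eps -> 0
    return N_x, N_sum

for eps in (1e-1, 1e-2, 1e-3):
    N_x, N_sum = entropy_powers(eps)
    print(N_sum / (2 * N_x))         # N(X+Y)/(N(X)+N(Y)) blows up as eps -> 0
```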
Theorem 8.
Let X and Y be uncorrelated symmetric log-concave random vectors in $\mathbb{R}^n$ such that $K_X$ and $K_Y$ are proportional. Then,
$N(X+Y) \le \frac{\pi e^3 n^2}{2\sqrt{2}\,(n+2)}\left(N(X) + N(Y)\right). \quad (47)$
If, in addition, X and Y are unconditional, then
$N(X+Y) \le \pi e^3\left(N(X) + N(Y)\right). \quad (48)$

2.4. New Bounds on the Rate-Distortion Function

As an application of Theorems 1 and 3, we show in Corollary 5 below that, in the class of one-dimensional log-concave distributions, the rate-distortion function does not exceed the Shannon lower bound by more than $\log\sqrt{\pi e} \approx 1.55$ bits (which can be refined to $\log e \approx 1.44$ bits when the source is symmetric), independently of d and $r \ge 1$. Denote for brevity
$\beta_r \triangleq \sqrt{1 + r^{\frac{2}{r}}\, \frac{\Gamma\!\left(\frac{3}{r}\right)}{\Gamma\!\left(\frac{1}{r}\right)}}, \quad (49)$
and recall the definition of α r in (3).
We start by giving a bound on the difference between the rate-distortion function and the Shannon lower bound, which applies to general, not necessarily log-concave, random variables.
Theorem 9.
Let $d \ge 0$ and $r \ge 1$. Let X be an arbitrary random variable.
(1) Let $r \in [1, 2]$. If $\|X\|_2 > d^{\frac{1}{r}}$, then
$R_X(d) - \underline{R}_X(d) \le D(X\|G_X) + \log\frac{\alpha_r}{\sqrt{2\pi e}}. \quad (50)$
If $\|X\|_2 \le d^{\frac{1}{r}}$, then $R_X(d) = 0$.
(2) Let $r > 2$. If $\|X\|_2 \ge d^{\frac{1}{r}}$, then
$R_X(d) - \underline{R}_X(d) \le D(X\|G_X) + \log \beta_r. \quad (51)$
If $\|X\|_r \le d^{\frac{1}{r}}$, then $R_X(d) = 0$. If $\|X\|_r > d^{\frac{1}{r}}$ and $\|X\|_2 < d^{\frac{1}{r}}$, then $R_X(d) \le \log\frac{\sqrt{2\pi e}\,\beta_r}{\alpha_r}$.
Remark 3.
For Gaussian X and r = 2 , the upper bound in (50) is 0, as expected.
The next result refines the bounds in Theorem 9 for symmetric log-concave random variables when r > 2 .
Theorem 10.
Let $d \ge 0$ and $r > 2$. Let X be a symmetric log-concave random variable.
If $\|X\|_2 \ge d^{\frac{1}{r}}$, then
$R_X(d) - \underline{R}_X(d) \le D(X\|G_X) + \min\left\{\log \beta_r,\ \log\frac{\alpha_r\, \Gamma(r+1)^{\frac{1}{r}}}{2\sqrt{\pi e}}\right\}. \quad (52)$
If $\|X\|_r \le d^{\frac{1}{r}}$ or $\|X\|_2 \le \frac{\sqrt{2}}{\Gamma(r+1)^{1/r}}\, d^{\frac{1}{r}}$, then $R_X(d) = 0$. If $\|X\|_r > d^{\frac{1}{r}}$ and $\|X\|_2 \in \left(\frac{\sqrt{2}}{\Gamma(r+1)^{1/r}}\, d^{\frac{1}{r}},\ d^{\frac{1}{r}}\right)$, then $R_X(d) \le \min\left\{\log\frac{\sqrt{2\pi e}\,\beta_r}{\alpha_r},\ \log\frac{\Gamma(r+1)^{\frac{1}{r}}}{\sqrt{2}}\right\}$.
To bound R X d R _ X d independently of the distribution of X, we apply the bound (35) on D ( X | | G X ) to Theorems 9 and 10:
Corollary 5.
Let X be a log-concave random variable. For $r \in [1, 2]$, we have
$R_X(d) - \underline{R}_X(d) \le \log\frac{\alpha_r}{2}. \quad (53)$
For $r > 2$, we have
$R_X(d) - \underline{R}_X(d) \le \log\left(\sqrt{\frac{\pi e}{2}}\, \beta_r\right). \quad (54)$
If, in addition, X is symmetric, then, for $r > 2$, we have
$R_X(d) - \underline{R}_X(d) \le \min\left\{\log\frac{\alpha_r\, \Gamma(r+1)^{\frac{1}{r}}}{2\sqrt{2}},\ \log\left(\sqrt{\frac{\pi e}{2}}\, \beta_r\right)\right\}. \quad (55)$
Figure 1a presents our bound for different values of r. Regardless of r and d,
$R_X(d) - \underline{R}_X(d) \le \log\sqrt{\pi e} \approx 1.55 \text{ bits}. \quad (56)$
The bounds in Figure 1a tighten for symmetric log-concave sources when $r \in (2, 4.3)$. Figure 1b presents this tighter bound for different values of r. Regardless of r and d,
$R_X(d) - \underline{R}_X(d) \le \log e \approx 1.44 \text{ bits}. \quad (57)$
One can see that the graph in Figure 1b is continuous at r = 2 , contrary to the graph in Figure 1a. This is because Theorem 2, which applies to symmetric log-concave random variables, is strong enough to imply the tightening of (51) given in (52), while Proposition 1, which provides a counterpart of Theorem 2 applicable to all log-concave random variables, is insufficient to derive a similar tightening in that more general setting.
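The curves in Figure 1 are elementary to reproduce from (53)–(55); the sketch below (our own illustrative helper, not the paper's code) evaluates the bound in bits and confirms the worst-case values quoted above:

```python
import math

def alpha(r):
    # the max-entropy constant in (3)
    return 2 * math.exp(1 / r) * math.gamma(1 + 1 / r) * r ** (1 / r)

def beta(r):
    # the constant in (49)
    return math.sqrt(1 + r ** (2 / r) * math.gamma(3 / r) / math.gamma(1 / r))

def bound_bits(r, symmetric=False):
    # Corollary 5: upper bound on R_X(d) - SLB for log-concave X, in bits
    if r <= 2:
        return math.log2(alpha(r) / 2)
    general = math.log2(math.sqrt(math.pi * math.e / 2) * beta(r))
    if not symmetric:
        return general
    sym = math.log2(alpha(r) * math.gamma(r + 1) ** (1 / r) / (2 * math.sqrt(2)))
    return min(general, sym)

print(bound_bits(1))                                   # log2(e), about 1.44
print(bound_bits(2.001))                               # about log2(sqrt(pi e)) ~ 1.55
print(bound_bits(3, symmetric=True) < bound_bits(3))   # True: tightening on (2, 4.3)
```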
Remark 4.
While Corollary 5 bounds the difference $R_X(d) - \underline{R}_X(d)$ by a universal constant independent of the distribution of X, tighter bounds can be obtained if one is willing to relinquish such universality. For example, for mean-square error distortion ($r = 2$) and a uniformly distributed source U, using Remark 1, we obtain
$R_U(d) - \underline{R}_U(d) \le \frac{1}{2}\log\frac{2\pi e}{12} \approx 0.254 \text{ bits}. \quad (58)$
Theorem 9 easily extends to a random vector X in $\mathbb{R}^n$, $n \ge 2$, with a similar proof, the only difference being an extra term of $\frac{n}{2}\log\left(\frac{1}{n}\|X\|_2^2 \,/\, |K_X|^{\frac{1}{n}}\right)$ that appears on the right-hand side of (50) and (51), and that comes from the upper bound on the differential entropy in (38). Here,
$\|X\|_p \triangleq \left(\sum_{i=1}^{n} \mathbb{E}[|X_i|^p]\right)^{\frac{1}{p}}. \quad (59)$
As a result, the bound on $R_X(d) - \underline{R}_X(d)$ can be arbitrarily large in higher dimensions because of the term $\frac{1}{n}\|X\|_2^2 \,/\, |K_X|^{\frac{1}{n}}$. However, for isotropic random vectors (whose definition we recall in Section 3.3 below), one has $\frac{1}{n}\|X\|_2^2 = |K_X|^{\frac{1}{n}}$. Hence, using the bound (39) on $D(X\|G_X)$, we can bound $R_X(d) - \underline{R}_X(d)$ independently of the distribution of an isotropic log-concave random vector X in $\mathbb{R}^n$, $n \ge 2$.
Corollary 6.
Let X be an isotropic log-concave random vector in $\mathbb{R}^n$, $n \ge 2$. Then,
$R_X(d) - \underline{R}_X(d) \le \frac{n}{2}\log\left(2\pi e\, c(n)\right), \quad (60)$
where $c(n) = \frac{n^2 e^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ if, in addition, X is unconditional.
Let us consider the rate-distortion function under the determinant constraint for random vectors in $\mathbb{R}^n$, $n \ge 2$:
$R_X^{\mathrm{cov}}(d) = \inf_{P_{\hat{X}|X}:\ |K_{X-\hat{X}}|^{1/n} \le d} I(X; \hat{X}), \quad (61)$
where the infimum is taken over all joint distributions satisfying the determinant constraint $|K_{X-\hat{X}}|^{1/n} \le d$. For this distortion measure, we have the following bound.
Theorem 11.
Let X be a symmetric log-concave random vector in $\mathbb{R}^n$. If $|K_X|^{\frac{1}{n}} > d$, then
$0 \le R_X^{\mathrm{cov}}(d) - \underline{R}_X(d) \le D(X\|G_X) \le \frac{n}{2}\log\left(2\pi e\, c(n)\right), \quad (62)$
with $c(n) = \frac{n^2 e^2}{4\sqrt{2}\,(n+2)}$. If, in addition, X is unconditional, then $c(n) = \frac{e^2}{2}$. If $|K_X|^{\frac{1}{n}} \le d$, then $R_X^{\mathrm{cov}}(d) = 0$.

2.5. New Bounds on the Capacity of Memoryless Additive Channels

As another application of Theorem 3, we compare the capacity $C_Z$ of a channel with log-concave additive noise Z with the capacity of the Gaussian channel. Recall that the capacity of the Gaussian channel with noise variance $\mathrm{Var}[Z]$ is
$C_{Z^*}(P) = \frac{1}{2}\log\left(1 + \frac{P}{\mathrm{Var}[Z]}\right). \quad (63)$
Theorem 12.
Let Z be a log-concave random variable. Then,
$0 \le C_Z(P) - C_{Z^*}(P) \le \frac{1}{2}\log\frac{\pi e}{2} \approx 1.05 \text{ bits}. \quad (64)$
Remark 5.
Theorem 12 tells us that the capacity of a channel with log-concave additive noise exceeds the capacity of a Gaussian channel with the same noise power by no more than 1.05 bits.
As an application of Theorem 4, we can provide bounds for the capacity of a channel with log-concave additive noise Z in $\mathbb{R}^n$, $n \ge 1$. The formula for capacity (18) generalizes to dimension n as
$C_Z(P) = \sup_{X:\ \frac{1}{n}\|X\|_2^2 \le P} I(X; X+Z). \quad (65)$
Theorem 13.
Let Z be a symmetric log-concave random vector in $\mathbb{R}^n$. Then,
$0 \le C_Z(P) - \frac{n}{2}\log\left(1 + \frac{P}{|K_Z|^{\frac{1}{n}}}\right) \le \frac{n}{2}\log\left(2\pi e\, c(n)\, \frac{\frac{1}{n}\|Z\|_2^2 + P}{|K_Z|^{\frac{1}{n}} + P}\right), \quad (66)$
where $c(n) = \frac{n^2 e^2}{4\sqrt{2}\,(n+2)}$. If, in addition, Z is unconditional, then $c(n) = \frac{e^2}{2}$.
The upper bound in Theorem 13 can be arbitrarily large, since the ratio $\frac{1}{n}\|Z\|_2^2 \,/\, |K_Z|^{\frac{1}{n}}$ can be inflated. For isotropic random vectors (whose definition is recalled in Section 3.3 below), one has $\frac{1}{n}\|Z\|_2^2 = |K_Z|^{\frac{1}{n}}$, and the following corollary follows.
Corollary 7.
Let Z be an isotropic log-concave random vector in $\mathbb{R}^n$. Then,
$0 \le C_Z(P) - \frac{n}{2}\log\left(1 + \frac{P}{|K_Z|^{\frac{1}{n}}}\right) \le \frac{n}{2}\log\left(2\pi e\, c(n)\right), \quad (67)$
where $c(n) = \frac{n^2 e^2}{4\sqrt{2}\,(n+2)}$. If, in addition, Z is unconditional, then $c(n) = \frac{e^2}{2}$.

3. New Lower Bounds on the Differential Entropy

3.1. Proof of Theorem 1

The key to our development is the following result for one-dimensional log-concave distributions, well known in convex geometry. It can be found, in a slightly different form, in [28].
Lemma 1.
The function
$F(r) = \frac{1}{\Gamma(r+1)} \int_0^{+\infty} x^r f(x)\, dx$
is log-concave on $(-1, +\infty)$, whenever $f\colon [0, +\infty) \to [0, +\infty)$ is log-concave [28].
Proof of Theorem 1.
Let $p > 0$. Applying Lemma 1 to the values $-1, 0, p$, we have
$F(0) = F\!\left(\frac{p}{p+1}\,(-1) + \frac{1}{p+1}\, p\right) \ge F(-1)^{\frac{p}{p+1}}\, F(p)^{\frac{1}{p+1}}. \quad (68)$
The bound in Theorem 1 follows by computing the values $F(-1)$, $F(0)$ and $F(p)$ for $f = f_X$.
One has
$F(0) = \frac{1}{2}, \qquad F(p) = \frac{\|X\|_p^p}{2\,\Gamma(p+1)}. \quad (69)$
To compute $F(-1)$, we first provide a different expression for $F(r)$. Notice that
$F(r) = \frac{1}{\Gamma(r+1)} \int_0^{+\infty} x^r \int_0^{f_X(x)} dt\, dx = \frac{r+1}{\Gamma(r+2)} \int_0^{\max f_X} \int_{\{x \ge 0:\ f_X(x) \ge t\}} x^r\, dx\, dt. \quad (70)$
Denote the generalized inverse of $f_X$ by $f_X^{-1}(t) \triangleq \sup\{x \ge 0 : f_X(x) \ge t\}$, $t \ge 0$. Since $f_X$ is log-concave and
$f_X(x) \le f_X(0) = \max f_X, \quad (71)$
it follows that $f_X$ is non-increasing on $[0, +\infty)$. Therefore, $\{x \ge 0 : f_X(x) \ge t\} = [0, f_X^{-1}(t)]$. Hence,
$F(r) = \frac{r+1}{\Gamma(r+2)} \int_0^{f_X(0)} \int_0^{f_X^{-1}(t)} x^r\, dx\, dt = \frac{1}{\Gamma(r+2)} \int_0^{f_X(0)} \left(f_X^{-1}(t)\right)^{r+1} dt. \quad (72)$
We deduce that
$F(-1) = f_X(0). \quad (73)$
Plugging (69) and (73) into (68), we obtain
$f_X(0) \le \frac{\Gamma(p+1)^{\frac{1}{p}}}{2\,\|X\|_p}. \quad (74)$
It follows immediately that
$h(X) = \int f_X(x) \log\frac{1}{f_X(x)}\, dx \ge \log\frac{1}{f_X(0)} \ge \log\frac{2\,\|X\|_p}{\Gamma(p+1)^{\frac{1}{p}}}. \quad (75)$
For $p \in (-1, 0)$, the bound is obtained similarly by applying Lemma 1 to the values $-1, p, 0$.
We now show that equality is attained, in the limit $p \to -1$, by U uniformly distributed on a symmetric interval $\left[-\frac{a}{2}, \frac{a}{2}\right]$, for some $a > 0$. In this case, we have
$\|U\|_p^p = \left(\frac{a}{2}\right)^p \frac{1}{p+1}. \quad (76)$
Hence,
$\frac{1}{p}\log\frac{2^p\, \|U\|_p^p}{\Gamma(p+1)} = \log\frac{a}{\Gamma(p+2)^{\frac{1}{p}}} \xrightarrow[p \to -1]{} \log(a) = h(U). \quad (77)$ ☐
Remark 6.
From (71) and (74), we see that the following statement holds: for every symmetric log-concave random variable $X \sim f_X$, for every $p > -1$, and for every $x \in \mathbb{R}$,
$f_X(x) \le \frac{\Gamma(p+1)^{\frac{1}{p}}}{2\,\|X\|_p}. \quad (78)$
Inequality (78) is the main ingredient in the proof of Theorem 1. It is instructive to provide a direct proof of inequality (78) without appealing to Lemma 1, the ideas going back to [25]:
Proof of inequality (78).
By considering $X \,|\, X \ge 0$, where X is symmetric log-concave, it is enough to show that for every log-concave density f supported on $[0, +\infty)$, one has
$f(0) \left(\int_0^{+\infty} x^p f(x)\, dx\right)^{\frac{1}{p}} \le \Gamma(p+1)^{\frac{1}{p}}. \quad (79)$
By a scaling argument, one may assume that $f(0) = 1$. Take $g(x) = e^{-x}$. If $f = g$, then the result follows by a straightforward computation. Assume that $f \ne g$. Since $f \ne g$ and $\int f = \int g$, the function $f - g$ changes sign at least once. However, since $f(0) = g(0)$, f is log-concave and g is log-affine, the function $f - g$ changes sign exactly once. It follows that there exists a unique point $x_0 > 0$ such that $f(x) \ge g(x)$ for every $0 < x < x_0$, and $f(x) \le g(x)$ for every $x > x_0$. We deduce that for every $x > 0$ and every $p \ne 0$, $p > -1$,
$\frac{1}{p}\left(f(x) - g(x)\right)\left(x^p - x_0^p\right) \le 0. \quad (80)$
Integrating over $x > 0$, we arrive at
$\frac{1}{p}\left(\int_0^{+\infty} x^p f(x)\, dx - \Gamma(p+1)\right) = \frac{1}{p}\int_0^{+\infty} \left(x^p - x_0^p\right)\left(f(x) - g(x)\right) dx \le 0, \quad (81)$
which yields the desired result. ☐
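The argument above shows that equality in (78) forces $f$ to be the exponential $g$, i.e., X to be Laplace. A quick numerical check (our own illustration): for Laplace(1), $\mathbb{E}|X|^p = \Gamma(p+1)$, so (78) holds with equality at $f_X(0) = 1/2$, while for the standard Gaussian it is strict:

```python
import math

def rhs_78(p, abs_moment):
    # right-hand side of (78): Gamma(p+1)^{1/p} / (2 ||X||_p), given E|X|^p
    return math.gamma(p + 1) ** (1 / p) / (2 * abs_moment ** (1 / p))

for p in (1.0, 2.0, 3.5):
    laplace = rhs_78(p, math.gamma(p + 1))   # Laplace(1): f_X(0) = 1/2 exactly
    gauss_m = 2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)
    gauss = rhs_78(p, gauss_m)               # N(0,1): f_X(0) = 1/sqrt(2 pi)
    assert abs(laplace - 0.5) < 1e-12
    assert gauss > 1 / math.sqrt(2 * math.pi)
print("checks passed")
```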
Actually, the powerful and versatile result of Lemma 1, which implies (78), is also proved using the technique in (79)–(81). In the context of information theory, Lemma 1 has been previously applied to obtain reverse entropy power inequalities [7], as well as to establish optimal concentration of the information content [29]. In this paper, we make use of Lemma 1 to prove Theorem 1. Moreover, Lemma 1 immediately implies Theorem 2. Below, we recall the argument for completeness.
Proof of Theorem 2.
The result follows by applying Lemma 1 to the values $0, p, q$. If $0 < p < q$, then
$F(p) = F\!\left(\left(1 - \frac{p}{q}\right)\cdot 0 + \frac{p}{q}\cdot q\right) \ge F(0)^{1-\frac{p}{q}}\, F(q)^{\frac{p}{q}}. \quad (82)$
Hence,
$\frac{\|X\|_p^p}{\Gamma(p+1)} \ge \left(\frac{\|X\|_q^q}{\Gamma(q+1)}\right)^{\frac{p}{q}}, \quad (83)$
which yields the desired result. The bound is obtained similarly if $-1 < p < q < 0$ or if $-1 < p < 0 < q$. ☐
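As an illustration of Theorem 2 (our own check): for U uniform on $[-1, 1]$, $\mathbb{E}|U|^p = 1/(p+1)$, so the normalized moment equals $\Gamma(p+2)^{-1/p}$, which (25) says must be non-increasing in p:

```python
import math

def normalized_moment(p):
    # ||U||_p / Gamma(p+1)^{1/p} for U uniform on [-1, 1]; equals Gamma(p+2)^{-1/p}
    return (1 / ((p + 1) * math.gamma(p + 1))) ** (1 / p)

vals = [normalized_moment(p) for p in (0.5, 1, 2, 4, 8)]
print(all(a >= b for a, b in zip(vals, vals[1:])))  # True: non-increasing, as (25) says
```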

3.2. Proof of Theorem 3 and Proposition 1

The proof leverages the ideas from [10].
Proof of Theorem 3.
Let Y be an independent copy of X. Jensen’s inequality yields
$h(X) = -\int f_X \log f_X \ge -\log\left( \int f_X^2 \right) = -\log\left( f_{X-Y}(0) \right).$
Since $X - Y$ is symmetric and log-concave, we can apply inequality (74) to $X - Y$ to obtain
$\frac{1}{f_{X-Y}(0)} \ge \frac{2\,\|X-Y\|_p}{\Gamma(p+1)^{\frac1p}} \ge \frac{2\,\|X - \mathbb{E}[X]\|_p}{\Gamma(p+1)^{\frac1p}},$
where the last inequality again follows from Jensen’s inequality. Combining (84) and (85) leads to the desired result:
$h(X) \ge \log \frac{1}{f_{X-Y}(0)} \ge \log \frac{2\,\|X - \mathbb{E}[X]\|_p}{\Gamma(p+1)^{\frac1p}}.$
For $p = 2$, one may tighten (85) by noticing that
$\|X-Y\|_2^2 = 2 \operatorname{Var}[X].$
Hence,
$h(X) \ge \log \frac{1}{f_{X-Y}(0)} \ge \log\left( \sqrt{2}\, \|X-Y\|_2 \right) = \log\left( 2 \sqrt{\operatorname{Var}[X]} \right).$ ☐
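The $p = 2$ bound, $h(X) \ge \log\left(2\sqrt{\operatorname{Var}[X]}\right)$ in nats, can be sanity-checked against distributions whose entropy and variance are known in closed form; a minimal sketch:

```python
import math

def bound(var):
    # lower bound log(2 * sqrt(Var)), in nats
    return math.log(2 * math.sqrt(var))

# (differential entropy in nats, variance), closed forms
cases = {
    "uniform[0,1]": (0.0, 1 / 12),
    "Laplace(b=1)": (1 + math.log(2), 2.0),
    "N(0,1)":       (0.5 * math.log(2 * math.pi * math.e), 1.0),
}
for name, (h, var) in cases.items():
    assert h >= bound(var), name
```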
Proof of Proposition 1.
Let $Y$ be an independent copy of $X$. Since $X - Y$ is symmetric and log-concave, we can apply Theorem 2 to $X - Y$. Jensen’s inequality and the triangle inequality yield:
$\|X - \mathbb{E}[X]\|_q \le \|X-Y\|_q \le \frac{\Gamma(q+1)^{\frac1q}}{\Gamma(p+1)^{\frac1p}}\, \|X-Y\|_p \le 2\, \frac{\Gamma(q+1)^{\frac1q}}{\Gamma(p+1)^{\frac1p}}\, \|X - \mathbb{E}[X]\|_p.$ ☐

3.3. Proof of Theorem 4

We say that a random vector $X \sim f_X$ is isotropic if $X$ is symmetric and for all unit vectors $\theta$, one has
$\mathbb{E}\left[ \langle X, \theta \rangle^2 \right] = m_X^2,$
for some constant $m_X > 0$. Equivalently, $X$ is isotropic if its covariance matrix $K_X$ is a multiple of the identity matrix $I_n$,
$K_X = m_X^2\, I_n,$
for some constant $m_X > 0$. The constant
$L_X \triangleq f_X(0)^{\frac1n}\, m_X$
is called the isotropic constant of $X$.
It is well known that $L_X$ is bounded from below by a positive constant independent of the dimension [30]. A long-standing conjecture in convex geometry, the hyperplane conjecture, asks whether the isotropic constant of an isotropic log-concave random vector is also bounded from above by a universal constant (independent of the dimension). This conjecture holds under additional assumptions, but, in full generality, $L_X$ is known to be bounded only by a constant that depends on the dimension. For further details, we refer the reader to [31]. We will use the following upper bounds on $L_X$ (see [32] for the best dependence on the dimension to date).
Lemma 2.
Let $X$ be an isotropic log-concave random vector in $\mathbb{R}^n$, with $n \ge 2$. Then, $L_X^2 \le \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$. If, in addition, $X$ is unconditional, then $L_X^2 \le \frac{e^2}{2}$.
If $X$ is uniformly distributed on a convex set, these bounds hold without the factor $e^2$.
Even though the bounds in Lemma 2 are well known, we could not find a reference in the literature. We thus include a short proof for completeness.
Proof. 
It was shown by Ball [30] (Lemma 8) that if $X$ is uniformly distributed on a convex set, then $L_X^2 \le \frac{n^2}{4\sqrt{2}\,(n+2)}$. If $X$ is uniformly distributed on a convex set and is unconditional, then it is known that $L_X^2 \le \frac12$ (see, e.g., [33] (Proposition 2.1)). Now, one can pass from uniform distributions on convex sets to log-concave distributions at the expense of an extra factor $e^2$, as shown by Ball [30] (Theorem 7). ☐
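As a concrete sanity check (not from the paper), the uniform distribution on the cube $[-1,1]^n$ is isotropic and unconditional, with $f_X(0) = 2^{-n}$ and $m_X^2 = 1/3$, so its squared isotropic constant is $1/12$; this is consistent with the Lemma 2 bounds without the $e^2$ factor (bound forms as we read them above):

```python
import math

n = 3
f0 = 2.0 ** (-n)                 # density of Uniform([-1,1]^n) at the origin
m2 = 1.0 / 3.0                   # E<X, theta>^2 for any unit vector theta
L2 = f0 ** (2.0 / n) * m2        # squared isotropic constant L_X^2

assert abs(L2 - 1 / 12) < 1e-12
# bounds for uniform distributions on convex sets (no e^2 factor)
assert L2 <= n ** 2 / (4 * math.sqrt(2) * (n + 2))
assert L2 <= 0.5                 # unconditional case
```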
We are now ready to prove Theorem 4.
Proof of Theorem 4.
Let $\tilde{X} \sim f_{\tilde{X}}$ be an isotropic log-concave random vector. Notice that $f_{\tilde{X}}(0)^{\frac2n}\, |K_{\tilde{X}}|^{\frac1n} = L_{\tilde{X}}^2$, hence, using Lemma 2, we have
$h(\tilde{X}) = \int f_{\tilde{X}}(x) \log \frac{1}{f_{\tilde{X}}(x)}\, dx \ge \log \frac{1}{f_{\tilde{X}}(0)} \ge \frac{n}{2} \log \frac{|K_{\tilde{X}}|^{\frac1n}}{c(n)},$
with $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$. If, in addition, $\tilde{X}$ is unconditional, then again by Lemma 2, one may take $c(n) = \frac{e^2}{2}$.
Now consider an arbitrary symmetric log-concave random vector $X$. One can apply a linear change of variables to put $X$ in isotropic position. Indeed, by defining $\tilde{X} = K_X^{-\frac12} X$, one has for every unit vector $\theta$,
$\mathbb{E}\left[ \langle \tilde{X}, \theta \rangle^2 \right] = \mathbb{E}\left[ \langle X, K_X^{-\frac12} \theta \rangle^2 \right] = \langle K_X ( K_X^{-\frac12} \theta ), K_X^{-\frac12} \theta \rangle = 1.$
It follows that $\tilde{X}$ is an isotropic log-concave random vector with $m_{\tilde{X}} = 1$, so that $|K_{\tilde{X}}|^{\frac1n} = 1$. Therefore, we can use (93) to obtain
$h(\tilde{X}) \ge \frac{n}{2} \log \frac{1}{c(n)},$
where $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ when $X$ is unconditional. We deduce that
$h(X) = h(\tilde{X}) + \frac{n}{2} \log |K_X|^{\frac1n} \ge \frac{n}{2} \log \frac{|K_X|^{\frac1n}}{c(n)}.$ ☐

3.4. Proof of Theorem 5

First, we need the following lemma.
Lemma 3.
Let $X \sim f_X$ be an isotropic unconditional log-concave random vector. Then, for every $i \in \{1, \ldots, n\}$,
$f_{X_i}(0) \ge \frac{f_X(0)^{\frac1n}}{c},$
where $f_{X_i}$ is the marginal density of the $i$-th component of $X$, i.e., for every $t \in \mathbb{R}$,
$f_{X_i}(t) = \int_{\mathbb{R}^{n-1}} f_X(x_1, \ldots, x_{i-1}, t, x_{i+1}, \ldots, x_n)\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n.$
Here, $c = e^6$. If, in addition, $f_X$ is invariant under permutations of coordinates, then $c = e$ [33] (Proposition 3.2).
Proof of Theorem 5.
Let $i \in \{1, \ldots, n\}$. We have
$\|X_i\|_p^p = \int_{\mathbb{R}} |t|^p f_{X_i}(t)\, dt.$
Since $f_X$ is unconditional and log-concave, it follows that $f_{X_i}$ is symmetric and log-concave, so inequality (74) applies to $f_{X_i}$:
$\int_{\mathbb{R}} |t|^p f_{X_i}(t)\, dt \le \frac{\Gamma(p+1)}{\left( 2 f_{X_i}(0) \right)^p}.$
We apply Lemma 3 to pass from $f_{X_i}$ to $f_X$ in the right side of (100):
$f_X(0)^{\frac1n}\, \|X_i\|_p \le \frac{c\, \Gamma(p+1)^{\frac1p}}{2}.$
Thus,
$h(X) \ge \log \frac{1}{f_X(0)} \ge n \log \frac{2\, \|X_i\|_p}{c\, \Gamma(p+1)^{\frac1p}}.$ ☐

4. Extension to γ -Concave Random Variables

In this section, we prove Theorem 6, which extends Theorem 1 to the class of γ -concave random variables, with γ < 0 . First, we need the following key lemma, which extends Lemma 1.
Lemma 4.
Let $f \colon [0, +\infty) \to [0, +\infty)$ be a $\gamma$-concave function, with $\gamma < 0$. Then, the function
$F(r) = \frac{\Gamma\left( -\frac{1}{\gamma} \right)}{\Gamma\left( -\frac{1}{\gamma} - (r+1) \right)} \cdot \frac{1}{\Gamma(r+1)} \int_0^{+\infty} t^r f(t)\, dt$
is log-concave on $\left( -1, -1 - \frac{1}{\gamma} \right)$ [34] (Theorem 7).
One can recover Lemma 1 from Lemma 4 by letting γ tend to 0 from below.
Proof of Theorem 6.
Let us first consider the case $p \in (-1, 0)$. Let us denote by $f_X$ the probability density function of $X$. By applying Lemma 4 to the values $-1, p, 0$, we have
$F(p) = F\!\left( -1 \cdot (-p) + 0 \cdot (p+1) \right) \ge F(-1)^{-p}\, F(0)^{p+1}.$
From the proof of Theorem 1, we deduce that $F(-1) = f_X(0)$. In addition, notice that, for $\gamma \in (-1, 0)$,
$F(0) = \frac12 \cdot \frac{\Gamma\left( -\frac{1}{\gamma} \right)}{\Gamma\left( -\frac{1}{\gamma} - 1 \right)}.$
Hence,
$f_X(0)^{-p} \le \frac{2^p\, \|X\|_p^p}{\Gamma(p+1)} \cdot \frac{\Gamma\left( -1 - \frac{1}{\gamma} \right)^{p+1}}{\Gamma\left( -\frac{1}{\gamma} \right)^p\, \Gamma\left( -\frac{1}{\gamma} - (p+1) \right)},$
and the bound on differential entropy follows:
$h(X) \ge \log \frac{1}{f_X(0)} \ge \frac1p \log \left( \frac{2^p\, \|X\|_p^p}{\Gamma(p+1)} \cdot \frac{\Gamma\left( -1 - \frac{1}{\gamma} \right)^{p+1}}{\Gamma\left( -\frac{1}{\gamma} \right)^p\, \Gamma\left( -\frac{1}{\gamma} - (p+1) \right)} \right).$
For the case $p \in \left( 0, -1 - \frac{1}{\gamma} \right)$, the bound is obtained similarly by applying Lemma 4 to the values $-1, 0, p$. ☐
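The normalization in Lemma 4 is tuned so that $F$ is constant for the extremal density $f(t) = (1+t)^{1/\gamma}$, which is $\gamma$-concave; a numerical sketch of this fact (crude midpoint-rule integration is an assumption of ours):

```python
import math

gam = -0.2                  # gamma < 0; f(t) = (1+t)^{1/gam} is gamma-concave
a = -1.0 / gam              # = 5; moments of f are finite for r < a - 1 = 4

def F(r, hi=1000.0, n=200_000):
    # Gamma(-1/gamma) / (Gamma(-1/gamma - (r+1)) * Gamma(r+1)) * \int_0^inf t^r f(t) dt
    h = hi / n
    integral = sum(((i + 0.5) * h) ** r * (1.0 + (i + 0.5) * h) ** (1.0 / gam)
                   for i in range(n)) * h
    return math.gamma(a) / (math.gamma(a - (r + 1)) * math.gamma(r + 1)) * integral

vals = [F(r) for r in (0.5, 1.0, 2.0)]
# for this extremal f, F(r) should be identically 1
assert all(abs(v - 1.0) < 1e-2 for v in vals)
```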

5. Reverse Entropy Power Inequality with Explicit Constant

5.1. Proof of Theorem 7

Proof. 
Using the upper bound on the differential entropy (1), we have
$h(X+Y) \le \frac12 \log\left( 2\pi e \operatorname{Var}[X+Y] \right) = \frac12 \log\left( 2\pi e \left( \operatorname{Var}[X] + \operatorname{Var}[Y] \right) \right),$
the last equality being valid since $X$ and $Y$ are uncorrelated. Hence,
$N(X+Y) \le 2\pi e \left( \operatorname{Var}[X] + \operatorname{Var}[Y] \right).$
Using inequality (32), we conclude that
$N(X+Y) \le \frac{\pi e}{2} \left( N(X) + N(Y) \right).$ ☐
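This reverse entropy power inequality can be illustrated on the sum of two independent uniform random variables on $[-\frac12, \frac12]$, assuming the convention $N(X) = e^{2h(X)}$ consistent with the displays above (each summand has $h = 0$ nats, and the triangular-density sum has $h = \frac12$ nat exactly, which we also verify numerically):

```python
import math

# N(X) = e^{2h(X)}; uniform on [-1/2, 1/2] has h = 0 nats, so N = 1
N_x = N_y = 1.0

def h_triangular(n=200_000):
    # differential entropy (nats) of the triangular density f(t) = 1 - |t|
    # on [-1, 1], computed by a midpoint rule
    step = 2.0 / n
    s = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * step
        f = 1.0 - abs(t)
        if f > 0.0:
            s -= f * math.log(f) * step
    return s

h_sum = h_triangular()           # exact value is 1/2 nat
N_sum = math.exp(2 * h_sum)
assert abs(h_sum - 0.5) < 1e-3
assert N_sum <= (math.pi * math.e / 2) * (N_x + N_y)
```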

5.2. Proof of Theorem 8

Proof. 
Since $X$ and $Y$ are uncorrelated and $K_X$ and $K_Y$ are proportional,
$|K_{X+Y}|^{\frac1n} = |K_X + K_Y|^{\frac1n} = |K_X|^{\frac1n} + |K_Y|^{\frac1n}.$
Using (110) and the upper bound on the differential entropy (38), we obtain
$h(X+Y) \le \frac{n}{2} \log\left( 2\pi e\, |K_{X+Y}|^{\frac1n} \right) = \frac{n}{2} \log\left( 2\pi e \left( |K_X|^{\frac1n} + |K_Y|^{\frac1n} \right) \right).$
Using Theorem 4, we conclude that
$N(X+Y) \le 2\pi e \left( |K_X|^{\frac1n} + |K_Y|^{\frac1n} \right) \le 2\pi e\, c(n) \left( N(X) + N(Y) \right),$
where $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ if $X$ and $Y$ are unconditional. ☐

6. New Bounds on the Rate-Distortion Function

6.1. Proof of Theorem 9

Proof. 
Under mean-square error distortion ($r = 2$), the result is implicit in [21] (Chapter 10). Denote for brevity $\sigma \triangleq \|X\|_2$.
(1) Let $r \in [1, 2]$. Assume that $\sigma > d^{\frac1r}$. We take
$\hat{X} = \left( 1 - \frac{d^{\frac2r}}{\sigma^2} \right) (X + Z),$
where $Z \sim \mathcal{N}\!\left( 0, \frac{\sigma^2 d^{\frac2r}}{\sigma^2 - d^{\frac2r}} \right)$ is independent of $X$. This choice of $\hat{X}$ is admissible since
$\|X - \hat{X}\|_r^r \le \|X - \hat{X}\|_2^r = \left( \left( \frac{d^{\frac2r}}{\sigma^2} \right)^2 \sigma^2 + \left( 1 - \frac{d^{\frac2r}}{\sigma^2} \right)^2 \|Z\|_2^2 \right)^{\frac{r}{2}} = d,$
where we used $r \le 2$ and the left-hand side of inequality (26). Upper-bounding the rate-distortion function by the mutual information between $X$ and $\hat{X}$, we obtain
$R_X(d) \le I(X; \hat{X}) = h(X+Z) - h(Z),$
where we used homogeneity of differential entropy for the last equality. Invoking the upper bound on the differential entropy (1), we have
$h(X+Z) - h(Z) \le \frac12 \log\left( 2\pi e \left( \sigma^2 + \frac{\sigma^2 d^{\frac2r}}{\sigma^2 - d^{\frac2r}} \right) \right) - h(Z) = \underline{R}_X(d) + D(X \| G_X) + \log \frac{\alpha_r}{\sqrt{2\pi e}},$
and (50) follows.
If $\|X\|_2 \le d^{\frac1r}$, then $\|X\|_r \le \|X\|_2 \le d^{\frac1r}$, and setting $\hat{X} \equiv 0$ leads to $R_X(d) = 0$.
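The admissibility computation for this test channel involves only second moments, so it holds without further distributional assumptions on $X$ (only $Z$ zero-mean and independent of $X$). A minimal sketch for $r = 2$, so that $d^{2/r} = d$:

```python
sigma2, d = 3.0, 1.0                 # requires sigma^2 > d
lam = 1 - d / sigma2                 # scaling factor in the test channel
var_z = sigma2 * d / (sigma2 - d)    # Var[Z]
# E(X - hat X)^2 with hat X = lam * (X + Z), Z zero-mean, independent of X
mse = (1 - lam) ** 2 * sigma2 + lam ** 2 * var_z
assert abs(mse - d) < 1e-12
```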
(2) Let $r > 2$. The argument presented here works for every $r \ge 1$; however, for $r \in [1, 2]$, the argument in part (1) provides a tighter bound. Assume that $\sigma \ge d^{\frac1r}$. We take
$\hat{X} = X + Z,$
where $Z$ is independent of $X$ and realizes the maximum differential entropy under the $r$-th moment constraint $\|Z\|_r^r = d$. The probability density function of $Z$ is given by
$f_Z(x) = \frac{r^{1 - \frac1r}}{2\, \Gamma\left( \frac1r \right) d^{\frac1r}}\, e^{-\frac{|x|^r}{r d}}, \quad x \in \mathbb{R}.$
Notice that
$\|Z\|_2^2 = d^{\frac2r}\, r^{\frac2r}\, \frac{\Gamma\left( \frac3r \right)}{\Gamma\left( \frac1r \right)}.$
We have
$h(X+Z) - h(Z) \le \frac12 \log\left( 2\pi e \left( \sigma^2 + \|Z\|_2^2 \right) \right) - \log\left( \alpha_r d^{\frac1r} \right)$
$\le \underline{R}_X(d) + \log\left( \sqrt{2\pi e}\, \beta_r\, \sigma \right) - h(X),$
where $\beta_r$ is defined in (49). Hence,
$R_X(d) \le \underline{R}_X(d) + D(X \| G_X) + \log \beta_r.$
If $\|X\|_r^r \le d$, then setting $\hat{X} \equiv 0$ leads to $R_X(d) = 0$. Finally, if $\|X\|_r^r > d$ and $\sigma < d^{\frac1r}$, then, from (120), we obtain
$R_X(d) \le \log\left( \sqrt{2\pi e}\, \beta_r\, d^{\frac1r} \right) - \log\left( \alpha_r d^{\frac1r} \right) = \log \frac{\sqrt{2\pi e}\, \beta_r}{\alpha_r}.$ ☐

6.2. Proof of Theorem 10

Proof. 
Denote for brevity $\sigma \triangleq \|X\|_2$, and recall that $X$ is a symmetric log-concave random variable.
Assume that $\sigma \ge d^{\frac1r}$. We take
$\hat{X} = \left( 1 - \frac{\delta}{\sigma^2} \right) (X + Z), \qquad \delta \triangleq \frac{2\, d^{\frac2r}}{\Gamma(r+1)^{\frac2r}},$
where $Z \sim \mathcal{N}\!\left( 0, \frac{\sigma^2 \delta}{\sigma^2 - \delta} \right)$ is independent of $X$. This choice of $\hat{X}$ is admissible since
$\|X - \hat{X}\|_r^r \le \|X - \hat{X}\|_2^r\, \frac{\Gamma(r+1)}{2^{\frac{r}{2}}} = \delta^{\frac{r}{2}}\, \frac{\Gamma(r+1)}{2^{\frac{r}{2}}} = d,$
where we used $r > 2$ and Theorem 2. Using the upper bound on the differential entropy (1), we have
$h(X+Z) - h(Z) \le \frac12 \log\left( 2\pi e \left( \sigma^2 + \frac{\sigma^2 \delta}{\sigma^2 - \delta} \right) \right) - h(Z) = \frac12 \log \frac{\sigma^2}{\delta}.$
Hence,
$R_X(d) \le \underline{R}_X(d) + D(X \| G_X) + \log \frac{\alpha_r\, \Gamma(r+1)^{\frac1r}}{2 \sqrt{\pi e}}.$
If $\sigma^2 \le \delta$, then Theorem 2 yields $\|X\|_r^r \le d$, hence $R_X(d) = 0$. Finally, if $\|X\|_r^r > d$ and $\sigma^2 \in \left( \delta, d^{\frac2r} \right)$, then, from (126), we obtain
$R_X(d) \le \frac12 \log \frac{\sigma^2}{\delta} \le \frac12 \log \frac{\Gamma(r+1)^{\frac2r}}{2}.$ ☐
Remark 7.
1) Let us explain the strategy in the proofs of Theorems 9 and 10. By definition, $R_X(d) \le I(X; \hat{X})$ for any $\hat{X}$ satisfying the distortion constraint. In our study, we chose $\hat{X}$ of the form $\lambda (X + Z)$, with $\lambda \in [0, 1]$, where $Z$ is independent of $X$. To find the best bounds possible with this choice of $\hat{X}$, we need to minimize $\|X - \hat{X}\|_r^r$ over $\lambda$. Notice that if $\hat{X} = \lambda (X + Z)$ and $Z$ is symmetric, then $\|X - \hat{X}\|_r^r = \|(1-\lambda) X + \lambda Z\|_r^r$.
To estimate $\|(1-\lambda) X + \lambda Z\|_r^r$ in terms of $\|X\|_r$ and $\|Z\|_r$, one can use the triangle inequality and the convexity of $t \mapsto t^r$ to get the bound
$\|(1-\lambda) X + \lambda Z\|_r^r \le 2^{r-1} \left( (1-\lambda)^r\, \|X\|_r^r + \lambda^r\, \|Z\|_r^r \right),$
or one can apply Jensen’s inequality directly to get the bound
$\|(1-\lambda) X + \lambda Z\|_r^r \le (1-\lambda)\, \|X\|_r^r + \lambda\, \|Z\|_r^r.$
A simple study shows that (130) provides a tighter bound than (129). This justifies choosing $\hat{X}$ as in (117) in the proof of (51).
To justify the choice of $\hat{X}$ in (113) (and in (124)), which leads to the tightening of (51) for $r \in [1, 2]$ in (50) (and in (52)), we bound the $r$-th norm by the second norm, and we note that, by the independence of $X$ and $Z$,
$\|(1-\lambda) X + \lambda Z\|_2^2 = (1-\lambda)^2\, \|X\|_2^2 + \lambda^2\, \|Z\|_2^2.$
A simple study shows that (131) provides a tighter bound than (130).
2) Using Corollary 2, if $r = 2$, one may rewrite our bound in terms of the rate-distortion function of a Gaussian source as follows:
$R_X(d) \ge R_{G_X}(d) - \log \frac{\sqrt{\pi e}}{\Delta_p},$
where $\Delta_p$ is defined in (28), and where
$R_{G_X}(d) = \frac12 \log \frac{\sigma^2}{d}$
is the rate-distortion function of a Gaussian source with the same variance $\sigma^2$ as $X$. It is well known that for an arbitrary source and mean-square error distortion (see, e.g., [21] (Chapter 10)),
$R_X(d) \le R_{G_X}(d).$
By taking $p = 2$ in (132), we obtain
$0 \le R_{G_X}(d) - R_X(d) \le \frac12 \log \frac{\pi e}{2}.$
The bounds in (134) and (135) tell us that the rate-distortion function of any log-concave source is well approximated by that of a Gaussian source. In particular, approximating $R_X(d)$ of an arbitrary log-concave source by
$\hat{R}_X(d) = \frac12 \log \frac{\sigma^2}{d} - \frac14 \log \frac{\pi e}{2},$
we guarantee an approximation error $|R_X(d) - \hat{R}_X(d)|$ of at most $\frac14 \log \frac{\pi e}{2} \approx \frac12$ bit.
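Numerically, the guaranteed half-width of this approximation is $\frac14 \log_2 \frac{\pi e}{2} \approx 0.52$ bits; a minimal sketch with illustrative (hypothetical) source parameters:

```python
import math

sigma2, d = 4.0, 0.25                                  # illustrative values
R_gauss = 0.5 * math.log2(sigma2 / d)                  # Gaussian R(d), bits
half_width = 0.25 * math.log2(math.pi * math.e / 2)    # ~0.52 bits
R_hat = R_gauss - half_width                           # proposed approximation

assert 0.5 < half_width < 0.55
# R_X(d) of any log-concave source with the same second moment lies in
# [R_gauss - 2*half_width, R_gauss], hence within half_width of R_hat
```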

6.3. Proof of Theorem 11

Proof. 
If $|K_X|^{\frac1n} > d$, then we choose $\hat{X} = \left( 1 - \frac{d}{|K_X|^{\frac1n}} \right) (X + Z)$, where $Z \sim \mathcal{N}\!\left( 0, \frac{d}{|K_X|^{\frac1n} - d} \cdot K_X \right)$ is independent of $X$. This choice is admissible by the independence of $X$ and $Z$ and the fact that $K_X$ and $K_Z$ are proportional. Upper-bounding the rate-distortion function by the mutual information between $X$ and $\hat{X}$, we have
$R_X^{\mathrm{cov}}(d) \le h(X+Z) - h(Z) \le \frac{n}{2} \log \frac{|K_X|^{\frac1n}}{d}.$
Since the Shannon lower bound for the determinant constraint coincides with that for the mean-square error constraint,
$R_X^{\mathrm{cov}}(d) \ge \underline{R}_X(d) = h(X) - \frac{n}{2} \log (2\pi e d).$
On the other hand, using (137), we have
$R_X^{\mathrm{cov}}(d) - \underline{R}_X(d) \le D(X \| G_X) \le \frac{n}{2} \log\left( 2\pi e\, c(n) \right),$
where (139) follows from Corollary 3.
If $|K_X|^{\frac1n} \le d$, then we put $\hat{X} \equiv 0$, which leads to $R_X^{\mathrm{cov}}(d) = 0$. ☐

7. New Bounds on the Capacity of Memoryless Additive Channels

Recall that the capacity of such a channel is
$C_Z(P) = \sup_{X \colon \frac1n \|X\|_2^2 \le P} I(X; X+Z) = \sup_{X \colon \frac1n \|X\|_2^2 \le P} \left( h(X+Z) - h(Z) \right).$
We compare the capacity C Z of a channel with log-concave additive noise with the capacity of the Gaussian channel.

7.1. Proof of Theorem 12

Proof. 
The lower bound is well known, as mentioned in (19). To obtain the upper bound, we first use the upper bound on the differential entropy (1) to conclude that
$h(X+Z) \le \frac12 \log\left( 2\pi e \left( P + \operatorname{Var}[Z] \right) \right),$
for every random variable $X$ such that $\|X\|_2^2 \le P$. By combining (140), (141) and (32), we deduce that
$C_Z(P) \le \frac12 \log\left( 2\pi e \left( P + \operatorname{Var}[Z] \right) \right) - \frac12 \log\left( 4 \operatorname{Var}[Z] \right) = \frac12 \log\left( \frac{\pi e}{2} \left( 1 + \frac{P}{\operatorname{Var}[Z]} \right) \right),$
which is the desired result. ☐
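The resulting capacity sandwich can be evaluated numerically for a specific log-concave noise; the sketch below uses Laplace noise with closed-form entropy (an illustrative choice of ours, not the paper's example), together with the well-known Gaussian-capacity lower bound:

```python
import math

P, b = 10.0, 1.0
var_z = 2 * b * b                      # variance of Laplace(b) noise
h_z = 1 + math.log(2 * b)              # h(Z) in nats, closed form

lower = 0.5 * math.log(1 + P / var_z)                              # Gaussian capacity
upper = 0.5 * math.log(2 * math.pi * math.e * (P + var_z)) - h_z   # from (141)
cap = 0.5 * math.log((math.pi * math.e / 2) * (1 + P / var_z))     # Theorem 12 bound

assert lower <= upper <= cap
# the Theorem 12 bound exceeds the Gaussian capacity by exactly (1/2)log(pi e/2)
assert abs((cap - lower) - 0.5 * math.log(math.pi * math.e / 2)) < 1e-12
```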

7.2. Proof of Theorem 13

Proof. 
The lower bound is well known, as mentioned in (19). To obtain the upper bound, we write
$h(X+Z) - h(Z) \le \frac{n}{2} \log\left( 2\pi e\, |K_{X+Z}|^{\frac1n} \right) - h(Z) \le \frac{n}{2} \log\left( 2\pi e\, c(n)\, \frac{\frac1n \|Z\|_2^2 + P}{|K_Z|^{\frac1n}} \right),$
where $c(n) = \frac{e^2 n^2}{4\sqrt{2}\,(n+2)}$ in general, and $c(n) = \frac{e^2}{2}$ if $Z$ is unconditional. The first inequality in (143) is obtained from the upper bound on the differential entropy (38). The last inequality in (143) is obtained by applying the arithmetic-geometric mean inequality and Theorem 4. ☐

8. Conclusions

Several recent results show that the entropy of a log-concave probability density has nice properties. For example, reverse, strengthened and stable versions of the entropy power inequality were recently obtained for log-concave random vectors (see, e.g., [3,11,35,36,37,38]). This line of developments suggests that, in some sense, log-concave random vectors behave like Gaussians.
Our work follows this line of results, by establishing a new lower bound on differential entropy for log-concave random variables in (4), for log-concave random vectors with possibly dependent coordinates in (37), and for γ -concave random variables in (43). We made use of the new lower bounds in several applications. First, we derived reverse entropy power inequalities with explicit constants for uncorrelated, possibly dependent log-concave random vectors in (12) and (47). We also showed a universal bound on the difference between the rate-distortion function and the Shannon lower bound for log-concave random variables in Figure 1a and Figure 1b, and for log-concave random vectors in (59). Finally, we established an upper bound on the capacity of memoryless additive noise channels when the noise is a log-concave random vector in (20) and (66).
Under the Gaussian assumption, information-theoretic limits in many communication scenarios admit simple closed-form expressions. Our work demonstrates that, at least in three such scenarios (source coding, channel coding and joint source-channel coding), the information-theoretic limits admit a closed-form approximation with at most 1 bit of error if the Gaussian assumption is relaxed to the log-concave one. We hope that the approach will be useful in gaining insights into those communication and data processing scenarios in which the Gaussianity of the observed distributions is violated but the log-concavity is preserved.

Acknowledgments

This work is supported in part by the National Science Foundation (NSF) under Grant CCF-1566567, and by the Walter S. Baer and Jeri Weiss CMI Postdoctoral Fellowship. The authors would also like to thank an anonymous referee for pointing out that the bound (23) and, up to a factor 2, the bound (25) also apply to the non-symmetric case if p 1 .

Author Contributions

Arnaud Marsiglietti and Victoria Kostina contributed equally to the research and writing of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zamir, R.; Feder, M. On universal quantization by randomized uniform/lattice quantizers. IEEE Trans. Inf. Theory 1992, 38, 428–436.
  2. Prékopa, A. On logarithmic concave measures and functions. Acta Sci. Math. 1973, 34, 335–343.
  3. Bobkov, S.; Madiman, M. The entropy per coordinate of a random vector is highly constrained under convexity conditions. IEEE Trans. Inf. Theory 2011, 57, 4940–4954.
  4. Bobkov, S.; Madiman, M. Entropy and the hyperplane conjecture in convex geometry. In Proceedings of the 2010 IEEE International Symposium on Information Theory (ISIT), Austin, TX, USA, 13–18 June 2010; pp. 1438–1442.
  5. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656.
  6. Stam, A.J. Some inequalities satisfied by the quantities of information of Fisher and Shannon. Inf. Control 1959, 2, 101–112.
  7. Bobkov, S.; Madiman, M. Reverse Brunn–Minkowski and reverse entropy power inequalities for convex measures. J. Funct. Anal. 2012, 262, 3309–3339.
  8. Cover, T.M.; Zhang, Z. On the maximum entropy of the sum of two dependent random variables. IEEE Trans. Inf. Theory 1994, 40, 1244–1246.
  9. Madiman, M.; Kontoyiannis, I. Entropy bounds on abelian groups and the Ruzsa divergence. IEEE Trans. Inf. Theory 2018, 64, 77–92.
  10. Bobkov, S.; Madiman, M. On the problem of reversibility of the entropy power inequality. In Limit Theorems in Probability, Statistics and Number Theory; Springer Proceedings in Mathematics and Statistics; Springer: Berlin/Heidelberg, Germany, 2013; Volume 42, pp. 61–74.
  11. Ball, K.; Nayar, P.; Tkocz, T. A reverse entropy power inequality for log-concave random vectors. Studia Math. 2016, 235, 17–30.
  12. Courtade, T.A. Links between the Logarithmic Sobolev Inequality and the convolution inequalities for Entropy and Fisher Information. arXiv 2016, arXiv:1608.05431.
  13. Madiman, M.; Melbourne, J.; Xu, P. Forward and Reverse Entropy Power Inequalities in Convex Geometry. In Convexity and Concentration; Carlen, E., Madiman, M., Werner, E., Eds.; The IMA Volumes in Mathematics and Its Applications; Springer: New York, NY, USA, 2017; Volume 161, pp. 427–485.
  14. Shannon, C.E. Coding theorems for a discrete source with a fidelity criterion. IRE Int. Conv. Rec. 1959, 7, 142–163. Reprinted with changes in Information and Decision Processes; Machol, R.E., Ed.; McGraw-Hill: New York, NY, USA, 1960; pp. 93–126.
  15. Linkov, Y.N. Evaluation of ϵ-entropy of random variables for small ϵ. Probl. Inf. Transm. 1965, 1, 18–26.
  16. Linder, T.; Zamir, R. On the asymptotic tightness of the Shannon lower bound. IEEE Trans. Inf. Theory 1994, 40, 2026–2031.
  17. Koch, T. The Shannon Lower Bound is Asymptotically Tight. IEEE Trans. Inf. Theory 2016, 62, 6155–6161.
  18. Kostina, V. Data compression with low distortion and finite blocklength. IEEE Trans. Inf. Theory 2017, 63, 4268–4285.
  19. Gish, H.; Pierce, J. Asymptotically efficient quantizing. IEEE Trans. Inf. Theory 1968, 14, 676–683.
  20. Ziv, J. On universal quantization. IEEE Trans. Inf. Theory 1985, 31, 344–347.
  21. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2012.
  22. Ihara, S. On the capacity of channels with additive non-Gaussian noise. Inf. Control 1978, 37, 34–39.
  23. Diggavi, S.N.; Cover, T.M. The worst additive noise under a covariance constraint. IEEE Trans. Inf. Theory 2001, 47, 3072–3081.
  24. Zamir, R.; Erez, U. A Gaussian input is not too bad. IEEE Trans. Inf. Theory 2004, 50, 1340–1353.
  25. Karlin, S.; Proschan, F.; Barlow, R.E. Moment inequalities of Pólya frequency functions. Pac. J. Math. 1961, 11, 1023–1033.
  26. Borell, C. Convex measures on locally convex spaces. Ark. Mat. 1974, 12, 239–252.
  27. Borell, C. Convex set functions in d-space. Period. Math. Hungar. 1975, 6, 111–136.
  28. Borell, C. Complements of Lyapunov’s inequality. Math. Ann. 1973, 205, 323–331.
  29. Fradelizi, M.; Madiman, M.; Wang, L. Optimal concentration of information content for log-concave densities. In High Dimensional Probability VII; Birkhäuser: Cham, Switzerland, 2016; Volume 71, pp. 45–60.
  30. Ball, K. Logarithmically concave functions and sections of convex sets in ℝn. Studia Math. 1988, 88, 69–84.
  31. Brazitikos, S.; Giannopoulos, A.; Valettas, P.; Vritsiou, B.H. Geometry of Isotropic Convex Bodies; Mathematical Surveys and Monographs, 196; American Mathematical Society: Providence, RI, USA, 2014.
  32. Klartag, B. On convex perturbations with a bounded isotropic constant. Geom. Funct. Anal. 2006, 16, 1274–1290.
  33. Bobkov, S.; Nazarov, F. On convex bodies and log-concave probability measures with unconditional basis. In Geometric Aspects of Functional Analysis; Springer: Berlin/Heidelberg, Germany, 2003; pp. 53–69.
  34. Fradelizi, M.; Guédon, O.; Pajor, A. Thin-shell concentration for convex measures. Studia Math. 2014, 223, 123–148.
  35. Ball, K.; Nguyen, V.H. Entropy jumps for isotropic log-concave random vectors and spectral gap. Studia Math. 2012, 213, 81–96.
  36. Toscani, G. A concavity property for the reciprocal of Fisher information and its consequences on Costa’s EPI. Physica A 2015, 432, 35–42.
  37. Toscani, G. A strengthened entropy power inequality for log-concave densities. IEEE Trans. Inf. Theory 2015, 61, 6550–6559.
  38. Courtade, T.A.; Fathi, M.; Pananjady, A. Wasserstein Stability of the Entropy Power Inequality for Log-Concave Densities. arXiv 2016, arXiv:1610.07969.
Figure 1. The bound on the difference between the rate-distortion function under r-th moment constraint and the Shannon lower bound, stated in Corollary 5.
