Hypercontractive Inequalities for the Second Norm of Highly Concentrated Functions, and Mrs. Gerber’s-Type Inequalities for the Second Rényi Entropy

Levhari, Niv; Samorodnitsky, Alex

doi:10.3390/e24101376


by Niv Levhari ¹,²,* and Alex Samorodnitsky ¹,*

¹ School of Engineering and Computer Science, The Hebrew University of Jerusalem, Jerusalem 9103401, Israel

² School of Mathematical Sciences, Tel Aviv University, Tel Aviv 6997801, Israel

* Authors to whom correspondence should be addressed.

Entropy 2022, 24(10), 1376; https://doi.org/10.3390/e24101376

Submission received: 4 July 2022 / Revised: 8 September 2022 / Accepted: 18 September 2022 / Published: 27 September 2022

(This article belongs to the Special Issue Extremal and Additive Combinatorial Aspects in Information Theory)


Abstract

Let

T_{ϵ}

,

0 \leq ϵ \leq 1 / 2

, be the noise operator acting on functions on the boolean cube

{0, 1}^{n}

. Let f be a distribution on

{0, 1}^{n}

and let

q > 1

. We prove tight Mrs. Gerber-type results for the second Rényi entropy of

T_{ϵ} f

which take into account the value of the

q^{t h}

Rényi entropy of f. For a general function f on

{0, 1}^{n}

we prove tight hypercontractive inequalities for the

ℓ_{2}

norm of

T_{ϵ} f

which take into account the ratio between

ℓ_{q}

and

ℓ_{1}

norms of f.

Keywords:

entropy; hypercontractivity; Rényi entropy; Mrs. Gerber’s inequality

1. Introduction

This paper considers the problem of quantifying the decrease in the

ℓ_{2}

norm of a function on the boolean cube when this function is acted on by the noise operator.

Given a noise parameter

0 \leq ϵ \leq 1 / 2

, the noise operator

T_{ϵ}

acts on functions on the boolean cube as follows: for

f : {0, 1}^{n} \to R

,

T_{ϵ} f

at a point x is the expected value of f at y, where y is a random binary vector whose

i^{t h}

coordinate is

x_{i}

with probability

1 - ϵ

and

1 - x_{i}

with probability

ϵ

, independently for different coordinates. Namely,

(T_{ϵ} f) (x) = \sum_{y \in {0, 1}^{n}} ϵ^{| y - x |} {(1 - ϵ)}^{n - | y - x |} f (y)

, where

| \cdot |

denotes the Hamming distance. We will write

f_{ϵ}

for

T_{ϵ} f

, for brevity.
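For concreteness, the definition can be evaluated by brute force on a small cube. The following is a minimal sketch, not taken from the paper: it assumes that a function on the cube is stored as an array of length 2^n whose index encodes a point through its binary digits, and the name `noise_operator` is introduced here only for illustration.

```python
import numpy as np

def noise_operator(f, eps):
    """Brute-force T_eps f on {0,1}^n.

    Assumption (ours): f is an array of length 2**n, and index x encodes the
    point of the cube whose coordinates are the binary digits of x.
    """
    f = np.asarray(f, dtype=float)
    size = f.size
    n = size.bit_length() - 1
    if size != 1 << n:
        raise ValueError("length of f must be a power of two")
    out = np.zeros(size)
    for x in range(size):
        for y in range(size):
            d = bin(x ^ y).count("1")  # Hamming distance |y - x|
            out[x] += eps**d * (1 - eps) ** (n - d) * f[y]
    return out
```

With eps = 0 the function is returned unchanged, and with eps = 1/2 the output is the constant function equal to the mean of f, which are the two extreme cases of the definition.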

Note that

f_{ϵ}

is a convex combination of shifted copies of f. Hence, the noise operator decreases norms. Recall that the

ℓ_{q}

norm of a function is given by

{∥ f ∥}_{q} = {(E {| f |}^{q})}^{\frac{1}{q}}

(the expectations here and below are taken w.r.t. the uniform measure on

{0, 1}^{n}

). The norms

{{∥ f ∥}_{q}}_{q}

increase with q. An effective way to quantify the decrease of

ℓ_{q}

norm under noise is given by the hypercontractive inequality [1,2,3] (see also, e.g., [4] for background), which upper bounds the

ℓ_{q}

norm of the noisy version of a function by a smaller norm of the original function.

∥ f_{ϵ} ∥_{q} \leq {∥ f ∥}_{1 + {(1 - 2 ϵ)}^{2} (q - 1)} .

(1)

This inequality is essentially tight in the following sense. For any

p < 1 + (q - 1) {(1 - 2 ϵ)}^{2}

there exists a non-constant function

f : {0, 1}^{n} \to R

with

∥ f_{ϵ} ∥_{q} > {∥ f ∥}_{p}

.
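Inequality (1) can be checked numerically on small examples. The sketch below is only an illustration under our own conventions: functions are arrays of length 2^n, the noise operator is written as an n-fold Kronecker power of the one-bit transition matrix (which agrees with the definition because the coordinates are flipped independently), and the helper names are hypothetical.

```python
import numpy as np

def noise_matrix(n, eps):
    """Matrix of T_eps on {0,1}^n as an n-fold Kronecker power."""
    one_bit = np.array([[1 - eps, eps], [eps, 1 - eps]])
    m = np.array([[1.0]])
    for _ in range(n):
        m = np.kron(m, one_bit)
    return m

def norm(f, q):
    """l_q norm with respect to the uniform measure: (E|f|^q)^(1/q)."""
    return np.mean(np.abs(f) ** q) ** (1.0 / q)

def hypercontractive_gap(f, eps, q):
    """Right-hand side minus left-hand side of (1); nonnegative when (1) holds."""
    n = len(f).bit_length() - 1
    p = 1 + (1 - 2 * eps) ** 2 * (q - 1)
    return norm(f, p) - norm(noise_matrix(n, eps) @ f, q)

# Example: a random function on {0,1}^4.
rng = np.random.default_rng(0)
f = rng.normal(size=16)
print(hypercontractive_gap(f, eps=0.1, q=2.0))
```

The printed gap should be nonnegative up to rounding; it does not by itself say anything about tightness, which requires choosing f adversarially.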

Entropy provides another example of a convex homogeneous functional on (nonnegative) functions on the boolean cube. For a nonnegative function f let the entropy of f be given by

E n t (f) = E f \log_{2} f - E f \log_{2} E f

. The entropy of f is closely related to Shannon’s entropy of the corresponding distribution

f / Σ f

on

{0, 1}^{n}

, and similarly the entropy of

f_{ϵ}

is related to Shannon’s entropy of the output of a binary symmetric channel with error probability

ϵ

on input distributed according to

f / Σ f

(see below and, e.g., the discussion in the introduction of [5]). The decrease in entropy (or, correspondingly, the increase in Shannon’s entropy) after noise is quantified in the “Mrs. Gerber’s Lemma” [6]:

E n t (f_{ϵ}) \leq n E f \cdot ψ (\frac{E n t (f)}{n E f}, ϵ),

(2)

where

ψ = ψ (x, ϵ) = 1 - H ((1 - 2 ϵ) \cdot H^{- 1} (1 - x) + ϵ)

is an explicitly given function on

[0, 1] \times [0, 1 / 2]

, which is increasing and strictly concave in its first argument for any

0 < ϵ < \frac{1}{2}

. Here and below we write

H (t) = t \log_{2} (\frac{1}{t}) + (1 - t) \log_{2} (\frac{1}{1 - t})

for the binary entropy function.

Equality holds iff f is a product function with equal marginals. That is, there exists a function

g : {0, 1} \to R

, such that for any

x = (x_{1}, \dots, x_{n}) \in {0, 1}^{n}

holds

f (x) = \prod_{i = 1}^{n} g (x_{i})

.

One has

ψ (0, ϵ) = 0

and

{\frac{\partial ψ}{\partial x}}_{| x = 0} = {(1 - 2 ϵ)}^{2}

. Hence

ψ (x, ϵ) \leq {(1 - 2 ϵ)}^{2} \cdot x

, with equality only at

x = 0

. Hence the inequality (2) has the following weaker linear approximation version

E n t (f_{ϵ}) \leq {(1 - 2 ϵ)}^{2} \cdot E n t (f),

(3)

in which equality holds if and only if f is a constant function.
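The relation between (2) and its linear approximation (3) can be seen numerically. The following sketch assumes the normalization ψ(x, ϵ) = 1 - H((1 - 2ϵ) H^{-1}(1 - x) + ϵ), which is the one satisfying ψ(0, ϵ) = 0, and computes H^{-1} on [0, 1/2] by bisection; the function names are ours.

```python
import math

def H(t):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def H_inv(x):
    """Inverse of H restricted to [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if H(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def psi(x, eps):
    """Mrs. Gerber function, normalized so that psi(0, eps) = 0."""
    return 1 - H((1 - 2 * eps) * H_inv(1 - x) + eps)

# Compare psi with the linear bound (1 - 2 eps)^2 * x on a grid.
eps = 0.1
for x in (0.1, 0.5, 0.9):
    print(x, psi(x, eps), (1 - 2 * eps) ** 2 * x)
```

On such a grid the first printed value should stay below the second, and the gap should widen as x grows, which is the regime in which (2) is significantly stronger than (3).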

Rényi entropies. There is a well-known connection between

ℓ_{q}

norms of a nonnegative function f and its entropy (see, e.g., [7]): Assume, as we may by homogeneity, that

E f = 1

. Then

E n t (f) = \lim_{q \to 1} \frac{1}{q - 1} \log_{2} {| | f | |}_{q}^{q}

. (The quantity

E n t_{q} (f) = \frac{1}{q - 1} \log_{2} {| | f | |}_{q}^{q}

is known as the

q^{t h}

Rényi entropy of f ([8])). (Note that this notion is defined for all, not necessarily nonnegative, functions on

{0, 1}^{n}

.) The entropies

{E n t_{q} (f)}_{q}

increase with q. Restating the inequality (1) in terms of Rényi entropies gives

E n t_{q} (f_{ϵ}) \leq \frac{{(1 - 2 ϵ)}^{2} q}{{(1 - 2 ϵ)}^{2} (q - 1) + 1} \cdot E n t_{1 + {(1 - 2 ϵ)}^{2} (q - 1)} (f) .

Note that taking

q \to 1

in this inequality recovers only the (weaker) linear approximation version (3) of Mrs. Gerber’s inequality (2). This highlights an important difference between inequalities (1) and (2). Mrs. Gerber’s lemma takes into account the distribution of a function, specifically the ratio between its entropy and its

ℓ_{1}

norm. When this ratio is exponentially large in n, which typically holds in the information theory contexts in which this inequality is applied, (2) is significantly stronger than (3). On the other hand, hypercontractive inequalities seem to be typically applied in contexts in which the ratio between different norms of the function is subexponential in n, and there are examples of such functions for which (1) is essentially tight. With that, there are several recent results [9,10,11] which show that (1) can be strengthened, if the ratio

\frac{{∥ f ∥}_{q}}{{∥ f ∥}_{1}}

, for some

q > 1

, is exponentially large in n. In the framework of Rényi entropies, the possibility of a result analogous to (2) for higher Rényi entropies was discussed in [12].
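The definitions of Ent and of the Rényi entropies above are easy to evaluate directly. The sketch below is an illustration with hypothetical helper names; it assumes a nonnegative function stored as an array of its 2^n values, uses the convention 0 log 0 = 0, and checks on one example that Ent_q approaches Ent as q tends to 1.

```python
import numpy as np

def ent(f):
    """Ent(f) = E f log2 f - E f log2 E f for nonnegative f (0 log 0 = 0)."""
    f = np.asarray(f, dtype=float)
    pos = f[f > 0]
    mean = f.mean()
    return np.sum(pos * np.log2(pos)) / f.size - mean * np.log2(mean)

def renyi_ent(f, q):
    """Ent_q(f) = log2(E|f|^q) / (q - 1), for q != 1."""
    f = np.asarray(f, dtype=float)
    return np.log2(np.mean(np.abs(f) ** q)) / (q - 1)

# A function on {0,1}^2 with expectation 1.
f = np.array([2.0, 1.0, 0.5, 0.5])
print(ent(f))                      # entropy of f
print(renyi_ent(f, 1 + 1e-6))      # close to ent(f)
print([renyi_ent(f, q) for q in (1.5, 2.0, 3.0)])
```

For this f the values of Ent_q should be nondecreasing in q, in line with the monotonicity stated above.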

Our results. This paper proves a Mrs. Gerber type result for the second Rényi entropy, and a hypercontractive inequality for the

ℓ_{2}

norm of

f_{ϵ}

which take into account the ratio between

ℓ_{q}

and

ℓ_{1}

norms of f. We try to pattern the results below after (2).

We start with a Mrs. Gerber type inequality.

Proposition 1.

Let

q > 1

, and let f be a nonnegative function on

{0, 1}^{n}

such that

E f = 1

. Then

\frac{E n t_{2} (f_{ϵ})}{n} \leq ψ_{2, q} (\frac{E n t_{q} (f)}{n}, ϵ),

(4)

where

ψ_{2, q}

is an explicitly given function on

[0, 1] \times [0, 1 / 2]

, which is increasing and concave in its first argument. The function

ψ_{2, q}

is defined in Definition 1 below.

This inequality is essentially tight in the following sense. For any

0 < x < 1

and

0 < ϵ < \frac{1}{2}

, and for any

y < ψ_{2, q} (x, ϵ)

there exists a sufficiently large n and a nonnegative function f on

{0, 1}^{n}

with

E f = 1

,

\frac{E n t_{q} (f)}{n} \leq x

and

\frac{E n t_{2} (f_{ϵ})}{n} > y

.

Let us make some comments about this result.

–: The functions ${ψ_{2, q}}_{q}$ are somewhat cumbersome to describe, and hence we relegate their precise definition to Definition 1 below.
–: Inequality (4) upper bounds $E n t_{2} (f_{ϵ})$ in terms of $E n t_{q} (f)$ for $q > 1$ , and $ϵ$ . Taking $q = 2$ gives an upper bound on $E n t_{2} (f_{ϵ})$ in terms of $E n t_{2} (f)$ and $ϵ$ , in analogy to (2).
–: Recall that for a point $x \in {0, 1}^{n}$ and $0 \leq r \leq n$ , the Hamming sphere of radius r around x is the set ${y \in {0, 1}^{n} : | y - x | = r}$ . As will be seen from the proof of Proposition 1, (4) is essentially tight for a certain convex combination of the uniform distribution on ${0, 1}^{n}$ and the characteristic function of a Hamming sphere of an appropriate radius (depending on q, $ϵ$ , and the required value of $E n t_{q} (f)$ ).
–: In information theory one typically considers a slightly different notion of Rényi entropies: For a probability distribution P on $Ω$ , the $q^{t h}$ Rényi entropy of P is given by $H_{q} (P) = - \frac{1}{q - 1} \log_{2} (\sum_{ω \in Ω} P^{q} (ω))$ . To connect notions, if f is a nonnegative (non-zero) function on ${0, 1}^{n}$ with expectation 1, then $P = \frac{f}{2^{n}}$ is a probability distribution, and $E n t_{q} (f) = n - H_{q} (P)$ . Furthermore, $E n t_{q} (f_{ϵ}) = n - H_{q} (X \oplus Z)$ , where X is a random variable on ${0, 1}^{n}$ distributed according to P and Z is an independent noise vector corresponding to a binary symmetric channel with crossover probability $ϵ$ . Hence, (2) can be restated as

$H (X \oplus Z) \geq n \cdot φ (\frac{H (X)}{n}, ϵ),$

and Proposition 1 can be restated as

$H_{2} (X \oplus Z) \geq n \cdot φ_{2, q} (\frac{H_{q} (X)}{n}, ϵ)$

Here $φ$ is an explicitly given function on $[0, 1] \times [0, 1 / 2]$ , which is increasing and convex in its first argument ( $φ (x, ϵ) = 1 - ψ (1 - x, ϵ)$ ), and similarly for $φ_{2, q}$ .
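The two identities relating Ent_q to the information-theoretic Rényi entropy H_q can be verified on a small example. This is a sketch under our own conventions (arrays of length 2^n, hypothetical function names); the output distribution of the binary symmetric channel is computed with a Kronecker power of the one-bit transition matrix.

```python
import numpy as np

def renyi_ent(f, q):
    """Ent_q(f) = log2(E|f|^q) / (q - 1), expectation over the uniform measure."""
    return np.log2(np.mean(np.abs(f) ** q)) / (q - 1)

def renyi_entropy_bits(P, q):
    """H_q(P) = -log2(sum_w P(w)^q) / (q - 1) for a probability vector P."""
    return -np.log2(np.sum(P ** q)) / (q - 1)

def bsc_output(P, eps):
    """Distribution of X xor Z, with X ~ P and Z i.i.d. Bernoulli(eps) noise."""
    n = P.size.bit_length() - 1
    one_bit = np.array([[1 - eps, eps], [eps, 1 - eps]])
    m = np.array([[1.0]])
    for _ in range(n):
        m = np.kron(m, one_bit)
    return m @ P

n, q, eps = 4, 2.0, 0.1
rng = np.random.default_rng(0)
f = rng.random(2 ** n)
f /= f.mean()                 # normalize so that E f = 1
P = f / 2 ** n                # the corresponding probability distribution
f_eps = 2 ** n * bsc_output(P, eps)
print(renyi_ent(f, q), n - renyi_entropy_bits(P, q))
print(renyi_ent(f_eps, q), n - renyi_entropy_bits(bsc_output(P, eps), q))
```

Each printed pair should agree up to rounding.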

Next, we describe our main result, a hypercontractive inequality for the $ℓ_{2}$ norm of $f_{ϵ}$ which takes into account the ratio between $ℓ_{q}$ and $ℓ_{1}$ norms of f, and more specifically $E n t_{q} (\frac{f}{{∥ f ∥}_{1}}) = \frac{q}{q - 1} \log_{2} (\frac{{∥ f ∥}_{q}}{{∥ f ∥}_{1}})$ .

Theorem 1.

Let

q > 1

, and let f be a non-zero function on

{0, 1}^{n}

. Then

∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{κ},

(5)

where

κ = κ_{2, q} (\frac{E n t_{q} (\frac{f}{{∥ f ∥}_{1}})}{n}, ϵ)

, and

κ_{2, q}

is an explicitly given function on

[0, 1] \times [0, 1 / 2]

, which is decreasing in its first argument and which satisfies

κ_{2, q} (0, ϵ) = 1 + {(1 - 2 ϵ)}^{2}

, for all

0 \leq ϵ \leq \frac{1}{2}

. The function

κ_{2, q}

is defined in Definition 1 below.

This inequality is essentially tight in the following sense. For any

0 < x < 1

and

0 < ϵ < \frac{1}{2}

, and for any

y < κ_{2, q} (x, ϵ)

there exists a sufficiently large n and a function f on

{0, 1}^{n}

with

\frac{E n t_{q} (f / {∥ f ∥}_{1})}{n} \geq x

and

∥ f_{ϵ} ∥_{2} > {∥ f ∥}_{y}

.

Some comments (see also Lemma 10 below).

–: The precise definition of the functions ${κ_{2, q}}_{q}$ will be given in Definition 1 below. At this point let us just observe that since the sequence ${E n t_{q} (f)}_{q}$ increases with q, we would expect the fact that $E n t_{q} (f)$ is large to become less significant as q increases. This is expressed in the properties of the functions ${κ_{2, q}}_{q}$ in the following manner: If $q \geq 2$ then for any $0 < ϵ < \frac{1}{2}$ the function $κ_{2, q} (x, ϵ)$ starts as a constant- $(1 + {(1 - 2 ϵ)}^{2})$ function up to some $x = x (q, ϵ) > 0$ , and becomes strictly decreasing after that. In other words $x (q, ϵ)$ is the largest possible value of $\frac{E n t_{q} (\frac{f}{{∥ f ∥}_{1}})}{n}$ for which Theorem 1 provides no new information compared to (1). For $1 < q < 2$ there is a value $0 < ϵ (q) < \frac{1}{2}$ , such that for all $ϵ \leq ϵ (q)$ the function $κ_{2, q} (x, ϵ)$ is strictly decreasing (in which case we say that $x (q, ϵ) = 0$ ). However, $x (q, ϵ) > 0$ for all $ϵ > ϵ (q)$ . The function $ϵ (q)$ decreases with q (in particular, $ϵ (q) = 0$ for $q \geq 2$ ). The function $x (q, ϵ)$ increases both in q and in $ϵ$ .
–: Notably, taking $q \to 1$ in Theorem 1 gives (see Corollary 1)

$∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{κ},$

where $κ = κ_{2, 1} (E n t (\frac{f}{{∥ f ∥}_{1}}) / n, ϵ) = - \frac{E n t (\frac{f}{{∥ f ∥}_{1}}) / n}{ϕ_{ϵ} (1 - E n t (\frac{f}{{∥ f ∥}_{1}}) / n)}$ . The function $κ_{2, 1} (x, ϵ) = - \frac{x}{ϕ_{ϵ} (1 - x)}$ is strictly decreasing in x for any $0 < ϵ < \frac{1}{2}$ . It satisfies $κ_{2, 1} (0, ϵ) = \lim_{x \to 0} κ_{2, 1} (x, ϵ) = 1 + {(1 - 2 ϵ)}^{2}$ , for all $0 \leq ϵ \leq \frac{1}{2}$ . Hence, this is stronger than (1) for any non-constant function f and for any $0 < ϵ < \frac{1}{2}$ , with the difference between the two inequalities becoming significant when $E n t (\frac{f}{{∥ f ∥}_{1}}) / n$ is bounded away from 0.
–: As will be seen from the proof of Theorem 1, (5) is essentially tight for a certain convex combination of the uniform distribution on ${0, 1}^{n}$ and characteristic functions of one or two Hamming spheres of appropriate radii (the number of the spheres and their radii depend on q, $ϵ$ , and the required value of $E n t_{q} (\frac{f}{{∥ f ∥}_{1}})$ ).
–: Let f be a non-constant function and let $0 < ϵ < \frac{1}{2}$ be fixed. Consider the function $F (q) = F_{f, ϵ} (q) = κ_{2, q} (\frac{E n t_{q} (\frac{f}{{∥ f ∥}_{1}})}{n}, ϵ)$ . It will be seen that there is a unique value $1 < q (f, ϵ) \leq 1 + {(1 - 2 ϵ)}^{2}$ of q for which $F (q) = q$ . Furthermore, $q (f, ϵ) = \min_{q \geq 1} F (q)$ . Hence it provides the best possible value for $κ$ in Theorem 1. With that, determining $q (f, ϵ)$ might in principle require knowledge of all the Rényi entropies $E n t_{q} (f)$ , for $1 \leq q \leq 1 + {(1 - 2 ϵ)}^{2}$ , while typically we are in possession of one of the “easier” Rényi entropies, such as $E n t (f)$ or $E n t_{2} (f)$ .

1.1. Full Statements of Proposition 1 and Theorem 1

We now define the functions

{ψ_{2, q}}_{q}

in Proposition 1 and

{κ_{2, q}}_{q}

in Theorem 1, completing the statements of these claims. We start with introducing yet another function on

[0, 1] \times [0, 1 / 2]

which will play a key role in what follows (we remark that this function was studied in [9]). For

0 \leq x \leq 1

and

0 \leq ϵ \leq \frac{1}{2}

, let

σ = H^{- 1} (x)

and let

y = y (x, ϵ) = \frac{- ϵ^{2} + ϵ \sqrt{ϵ^{2} + 4 (1 - 2 ϵ) σ (1 - σ)}}{2 (1 - 2 ϵ)}

. Let

Φ (x, ϵ) = \frac{1}{2} \cdot (x - 1 + σ H (\frac{y}{σ}) + (1 - σ) H (\frac{y}{1 - σ}) + 2 y \log_{2} (ϵ) + (1 - 2 y) \log_{2} (1 - ϵ)) .

The function

Φ

is nonpositive. It is increasing and concave in its first argument. Additional relevant properties of

Φ

are listed in Lemma 3 below. For a fixed

ϵ

, it will be convenient to write

ϕ_{ϵ} (x) = Φ (x, 2 ϵ (1 - ϵ))

, viewing

ϕ_{ϵ}

as a univariate function on

[0, 1]

.
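The function Φ is explicit enough to be evaluated numerically. The following sketch transcribes the formulas above; it is restricted, as an assumption of ours, to 0 < x ≤ 1 and 0 < ϵ < 1/2 so that no boundary limits need to be handled, and H^{-1} is computed on [0, 1/2] by bisection. The names `Phi` and `phi` are hypothetical.

```python
import math

def H(t):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def H_inv(x):
    """Inverse of H restricted to [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if H(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def Phi(x, eps):
    """Phi(x, eps) from the text; sketch valid for 0 < x <= 1, 0 < eps < 1/2."""
    if not (0 < x <= 1 and 0 < eps < 0.5):
        raise ValueError("sketch only covers 0 < x <= 1 and 0 < eps < 1/2")
    s = H_inv(x)  # sigma
    y = (-eps**2 + eps * math.sqrt(eps**2 + 4 * (1 - 2 * eps) * s * (1 - s))) / (
        2 * (1 - 2 * eps)
    )
    return 0.5 * (
        x - 1
        + s * H(y / s)
        + (1 - s) * H(y / (1 - s))
        + 2 * y * math.log2(eps)
        + (1 - 2 * y) * math.log2(1 - eps)
    )

def phi(x, eps):
    """phi_eps(x) = Phi(x, 2 eps (1 - eps))."""
    return Phi(x, 2 * eps * (1 - eps))

print(Phi(0.5, 0.1), Phi(1.0, 0.1))
```

On a grid of values of x one should observe that Φ is nonpositive and increasing in x, and that it vanishes (up to rounding) at x = 1.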

Definition 1.

Let

0 \leq x \leq 1

and

0 \leq ϵ \leq \frac{1}{2}

.

If $ϕ_{ϵ}^{'} (1 - x) < \frac{1}{q}$ , let $α_{0} = {(ϕ_{ϵ}^{'})}^{- 1} (\frac{1}{q})$ . Define

$ψ_{2, q} (x, ϵ) = 2 \cdot \begin{cases} \frac{q - 1}{q} \cdot x + (ϕ_{ϵ} (α_{0}) + \frac{1 - α_{0}}{q}) & \text{if } ϕ_{ϵ}^{'} (1 - x) < \frac{1}{q} \\ ϕ_{ϵ} (1 - x) + x & \text{otherwise} \end{cases}$

Let $y = \frac{q - 1}{q} \cdot x + \frac{1}{q}$ . Let $q_{0} = 1 + {(1 - 2 ϵ)}^{2}$ . If $y \geq \frac{1}{q_{0}}$ , let $α_{0}$ be determined by $1 - α_{0} - \frac{α_{0} ϕ_{ϵ} (α_{0})}{1 - α_{0}} = y$ . If $x = 0$ , define $κ_{2, q} (x, ϵ) = q_{0}$ . Otherwise, define

$κ_{2, q} (x, ϵ) = \begin{cases} q_{0} & \text{if } y \leq \frac{1}{q_{0}} \\ - \frac{x}{ϕ_{ϵ} (1 - x)} & \text{if } y > \frac{1}{q_{0}} \text{ and } - \frac{x}{ϕ_{ϵ} (1 - x)} \geq q \\ \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})} & \text{if } y > \frac{1}{q_{0}} \text{ and } - \frac{x}{ϕ_{ϵ} (1 - x)} < q \end{cases}$

We remark that it is not immediately obvious that the functions

ψ_{2, q}

and

κ_{2, q}

are well-defined. This will be clarified in the proofs of Proposition 1 and Theorem 1.

We state explicitly some special cases of Theorem 1, which seem to be the most relevant for applications. They describe the improvement over (1), given non-trivial information about

E n t (f)

and

{∥ f ∥}_{2}

.

Corollary 1.

1.: Taking $q \to 1$ in Theorem 1 gives:

$∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{κ}, \text{ with } κ = - \frac{E n t (\frac{f}{{∥ f ∥}_{1}}) / n}{ϕ_{ϵ} (1 - E n t (\frac{f}{{∥ f ∥}_{1}}) / n)} .$

2.: Taking $q = 2$ in Theorem 1 gives, for $x = \frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n}$ and $q_{0} = 1 + {(1 - 2 ϵ)}^{2}$ :

$∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{κ}, \text{ with } κ = \begin{cases} q_{0} & \text{if } \frac{x + 1}{2} \leq \frac{1}{q_{0}} \\ \frac{α - 1}{ϕ_{ϵ} (α)} & \text{otherwise} \end{cases}$

In the second case α is determined by $1 - α - \frac{α ϕ_{ϵ} (α)}{1 - α} = \frac{x + 1}{2}$ .

We observe that both Proposition 1 and Theorem 1 are based on the following claim ([9], Corollary 3.2). This claim also explains the relevance of function

Φ

.

Theorem 2.

Let

0 \leq x \leq 1

. Let f be a function on

{0, 1}^{n}

supported on a set of cardinality at most

2^{x n}

. Then, for any

0 \leq ϵ \leq \frac{1}{2}

holds

〈f_{ϵ}, f〉 \leq 2^{(2 Φ (x, ϵ) + 1 - x) \cdot n} \cdot {∥ f ∥}_{2}^{2},

Moreover, this is tight, up to a polynomial in n factor, if f is the characteristic function of a Hamming sphere of radius

H^{- 1} (x) \cdot n

.
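The bound of Theorem 2 can be compared with the exact value of 〈f_ϵ, f〉 for the characteristic function of a Hamming sphere in a small dimension. The sketch below is an illustration only: the helper names are ours, Φ is transcribed from Section 1.1 for 0 < x ≤ 1 and 0 < ϵ < 1/2, and the exponent x is taken to be log2 of the sphere size divided by n.

```python
import math
from itertools import combinations

def H(t):
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def H_inv(x):
    lo, hi = 0.0, 0.5
    for _ in range(200):  # bisection on [0, 1/2]
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if H(mid) < x else (lo, mid)
    return (lo + hi) / 2

def Phi(x, eps):
    s = H_inv(x)
    y = (-eps**2 + eps * math.sqrt(eps**2 + 4 * (1 - 2 * eps) * s * (1 - s))) / (
        2 * (1 - 2 * eps)
    )
    return 0.5 * (x - 1 + s * H(y / s) + (1 - s) * H(y / (1 - s))
                  + 2 * y * math.log2(eps) + (1 - 2 * y) * math.log2(1 - eps))

def sphere_ratio(n, r, eps):
    """Exact <f_eps, f> / ||f||_2^2 for f the indicator of a radius-r sphere."""
    points = [sum(1 << i for i in c) for c in combinations(range(n), r)]
    total = 0.0
    for a in points:
        for b in points:
            d = bin(a ^ b).count("1")  # Hamming distance
            total += eps**d * (1 - eps) ** (n - d)
    return total / len(points)

def theorem2_bound(n, support_size, eps):
    """2^((2 Phi(x, eps) + 1 - x) n) with x = log2(support_size) / n."""
    x = math.log2(support_size) / n
    return 2 ** ((2 * Phi(x, eps) + 1 - x) * n)

n, eps = 10, 0.1
for r in range(1, 6):
    print(r, sphere_ratio(n, r, eps), theorem2_bound(n, math.comb(n, r), eps))
```

For each radius the exact ratio should not exceed the bound; the remaining gap at such a small n is consistent with tightness holding only up to a factor polynomial in n.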

1.2. Applications

We describe some applications of the results above, related mainly to coding theory. We start with providing some relevant context.

Coding theory. A binary error-correcting code C of length n and minimal distance d is a subset of

{0, 1}^{n}

in which the distance between any two distinct points is at least d. Let

A (n, d)

be the maximal size of such a code. A well-known open problem in coding theory is to determine, given

0 < δ < \frac{1}{2}

, the asymptotic maximal rate

R (δ) = {lim sup}_{n \to \infty} \frac{1}{n} \log_{2} A (n, ⌊ δ n ⌋)

of a code with relative distance

δ

. The best known lower bound on

R (δ)

is the Gilbert-Varshamov bound

R (δ) \geq 1 - H (δ)

[13]. The best known upper bounds on

R (δ)

were obtained in [14] using the linear programming relaxation, constructed in [15], of the combinatorial problem of bounding

A (n, d)

. Let

A_{L P} (n, d)

be the value of the appropriate linear program of [15] and let

R_{L P} (δ) = {lim sup}_{n \to \infty} \frac{1}{n} \log_{2} A_{L P} (n, ⌊ δ n ⌋)

. By construction,

A_{L P} (n, d) \geq A (n, d)

for all n and d and hence

R_{L P} (δ) \geq R (δ)

. The first JPL bound of [14] is

R (δ) \leq R_{L P} (δ) \leq H (1 / 2 - \sqrt{δ (1 - δ)})

. This bound is the best known for a subrange of values of

δ

. The best known bound is the second JPL bound of [14]. It is better than the first bound for relatively small values of

δ

. However, it is more complicated to state explicitly and we omit it here. The second JPL bound is strictly larger than the Gilbert-Varshamov bound for all

0 < δ < \frac{1}{2}

, and hence

R (δ)

is unknown for all these values of

δ

.

The value of

R_{L P} (δ)

is also unknown, for all

0 < δ < \frac{1}{2}

. Clearly

R_{L P} (δ) \geq R (δ) \geq 1 - H (δ)

. It was conjectured in [14] that

R_{L P} (δ)

lies strictly between the second JPL bound and the Gilbert-Varshamov bound. On the other hand, there is convincing numerical evidence [16] that

R_{L P} (δ)

in fact coincides with the second JPL bound. A lower bound

R_{L P} (δ) \geq \frac{1 - H (δ) + H (1 / 2 - \sqrt{δ (1 - δ)})}{2}

was shown in [17] (note that the RHS here is the arithmetic average of the Gilbert-Varshamov bound and the first JPL bound). It was improved, for a subrange of

δ

, in [18].
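The explicit bounds mentioned above are simple to tabulate. The sketch below evaluates the Gilbert-Varshamov lower bound, the first JPL upper bound, and their arithmetic average (the lower bound on R_LP from [17]); the second JPL bound is omitted, as in the text, and the function names are ours.

```python
import math

def H(t):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def gilbert_varshamov(delta):
    """Lower bound on R(delta): 1 - H(delta)."""
    return 1 - H(delta)

def first_jpl(delta):
    """First JPL upper bound: H(1/2 - sqrt(delta (1 - delta)))."""
    return H(0.5 - math.sqrt(delta * (1 - delta)))

def lp_lower_bound(delta):
    """Lower bound on R_LP(delta): average of the two bounds above."""
    return (gilbert_varshamov(delta) + first_jpl(delta)) / 2

for delta in (0.05, 0.15, 0.25, 0.35, 0.45):
    print(delta, gilbert_varshamov(delta), lp_lower_bound(delta), first_jpl(delta))
```

In each printed row the three values should appear in nondecreasing order.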

A different approach to obtain upper bounds on the cardinality of binary codes was presented in [19]. For a subset

D \subseteq {0, 1}^{n}

, let

M_{D}

be the adjacency matrix of the subgraph of the discrete cube induced by the vertices of D. Let

λ (D)

be the maximal eigenvalue of

M_{D}

. The following claim was proved in [19] for binary linear codes (and extended in [18] to general binary codes): Let D be a subset of

{0, 1}^{n}

with

λ (D) \geq n - 2 d + 1

. Let C be a code of length n and minimal distance d. Then

| C | ≲ | D |

(here we use the approximate inequality sign to indicate that the inequality holds up to lower order terms). Choosing for D the Hamming balls of different radii with their corresponding parameters leads to a simple proof of the first JPL bound on

R (δ)

. Ref. [19] posed the natural problem of finding subsets of

{0, 1}^{n}

with the largest possible eigenvalue for their cardinality. This question was answered in [20], where it was shown that Hamming balls of radius

r = ρ n

,

0 < ρ < \frac{1}{2}

have essentially the largest eigenvalues for their cardinality. This seems to indicate that at least the straightforward version of the approach of [19], as described above, does not lead to an improvement of the first JPL bound. The claim in [20] was derived from a logarithmic Sobolev inequality for highly concentrated functions on the boolean cube. We continue with a brief description of relevant notions.

Logarithmic Sobolev inequalities. Viewing both sides of (1), with q = 2, as functions of

ϵ

, and writing

L (ϵ)

for the LHS and

R (ϵ)

for the RHS, we have

L (0) = R (0) = {∥ f ∥}_{2}

, and

L (ϵ) \leq R (ϵ)

for

0 \leq ϵ \leq \frac{1}{2}

. Since both L and R are differentiable in

ϵ

this implies

L^{'} (0) \leq R^{'} (0)

. This inequality is the logarithmic Sobolev inequality ([3]) for the Hamming cube. We proceed to describe it in more detail. Recall that the Dirichlet form

E (f, g)

for functions f and g on the Hamming cube is defined by

E (f, g) = E_{x} \sum_{y \sim x} (f (x) - f (y)) (g (x) - g (y))

. Here

y \sim x

means that x and y differ in precisely one coordinate. The logarithmic Sobolev inequality then states that

E (f, f) \geq 2 \ln 2 \cdot E n t (f^{2})

. This inequality describes the behavior of the norm on the RHS of the hypercontractive inequality (1) as

ϵ \to 0

and, as such, can be viewed as a special case of (1). In point of fact, it was introduced in [3] as a way to prove (1) by (roughly speaking) integrating this inequality over the noise parameter (using the semigroup property of noise operators). Following [3], logarithmic Sobolev inequalities were shown to hold in many spaces of interest (see [21] for discussion and for many applications of these inequalities).
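The logarithmic Sobolev inequality in the normalization used here can be tested directly on small cubes. The sketch below is an illustration with hypothetical helper names; functions are arrays of length 2^n indexed by the binary encoding of a point, so that the neighbors of x are obtained by flipping one bit of the index.

```python
import math
import numpy as np

def dirichlet_form(f, g):
    """E(f, g) = E_x sum_{y ~ x} (f(x) - f(y)) (g(x) - g(y))."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    n = f.size.bit_length() - 1
    idx = np.arange(f.size)
    total = 0.0
    for i in range(n):
        nb = idx ^ (1 << i)  # neighbor obtained by flipping coordinate i
        total += np.mean((f - f[nb]) * (g - g[nb]))
    return total

def ent(h):
    """Ent(h) = E h log2 h - E h log2 E h for nonnegative h (0 log 0 = 0)."""
    h = np.asarray(h, dtype=float)
    pos = h[h > 0]
    mean = h.mean()
    return np.sum(pos * np.log2(pos)) / h.size - mean * np.log2(mean)

rng = np.random.default_rng(0)
f = rng.normal(size=32)  # a random function on {0,1}^5
print(dirichlet_form(f, f), 2 * math.log(2) * ent(f ** 2))
```

The first printed number should be at least as large as the second.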

The logarithmic Sobolev inequality for highly concentrated functions in [20] (we will state this inequality explicitly in the discussion following Corollary 2 below) improves over the inequality

E (f, f) \geq 2 \ln 2 \cdot E n t (f^{2})

similarly to the improvement to (1) provided by Theorem 1. However, deducing a tight hypercontractive inequality, such as Theorem 1, from the inequality in [20] by integration (following the approach of [3]) seems to be more challenging. Roughly speaking, the problem lies in the fact that the concentration of f might decrease very quickly under noise. With that, a family of logarithmic Sobolev inequalities, generalizing that of [20] was proved in [10]. Integrating these inequalities over noise leads to a family of hypercontractive inequalities which improve over (1) for highly concentrated functions and which are essentially tight in the vicinity of

ϵ = 0

. These inequalities were used in [10] to prove a version of the uncertainty principle on

{0, 1}^{n}

.

An uncertainty principle on

{0, 1}^{n}

. We recall some basic notions in Fourier analysis on the Hamming cube (see [4]). For

α \in {0, 1}^{n}

, define the Walsh-Fourier character

W_{α}

on

{0, 1}^{n}

by setting

W_{α} (y) = {(- 1)}^{\sum α_{i} y_{i}}

, for all

y \in {0, 1}^{n}

. The weight of the character

W_{α}

is the Hamming weight

| α |

of

α

. The characters

{W_{α}}_{α \in {0, 1}^{n}}

form an orthonormal basis in the space of real-valued functions on

{0, 1}^{n}

, under the inner product

〈f, g〉 = \frac{1}{2^{n}} \sum_{x \in {0, 1}^{n}} f (x) g (x)

. The expansion

f = \sum_{α \in {0, 1}^{n}} \hat{f} (α) W_{α}

defines the Fourier transform

\hat{f}

of f. We also have the Parseval identity,

{∥ f ∥}_{2}^{2} = \sum_{α \in {0, 1}^{n}} {\hat{f}}^{2} (α)

.
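The Fourier expansion and the Parseval identity can be illustrated by a direct computation. The following is a minimal sketch under our own indexing convention (a point or a character label is the integer whose binary digits are its coordinates); the name `walsh_fourier` is hypothetical and the implementation is a plain quadratic-time sum rather than a fast transform.

```python
import numpy as np

def walsh_fourier(f):
    """Coefficients hat f(alpha) = <f, W_alpha> under the uniform inner product."""
    f = np.asarray(f, dtype=float)
    size = f.size
    hat = np.zeros(size)
    for a in range(size):
        # W_alpha(y) = (-1)^(sum_i alpha_i y_i), i.e. the parity of a & y
        signs = np.array([(-1) ** bin(a & y).count("1") for y in range(size)])
        hat[a] = np.mean(f * signs)
    return hat

rng = np.random.default_rng(1)
f = rng.normal(size=16)
hat = walsh_fourier(f)
print(np.mean(f ** 2), np.sum(hat ** 2))  # the two sides of Parseval
```

The two printed values should coincide up to rounding.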

The uncertainty principle asserts that a function and its Fourier transform cannot be simultaneously narrowly concentrated. A well-known way (see, e.g., [22]) to state this for the Hamming cube is as follows. If f is a non-zero function on

{0, 1}^{n}

then

| s u p p (f) | \geq \frac{2^{n}}{| s u p p (\hat{f}) |}

. In [10], (see also the discussion following Theorem 1.10 in [9]) a different way to formalize this statement for the Hamming cube was presented. If f is a function on

{0, 1}^{n}

with

\frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n} \geq 1 - H (ρ)

, then its Fourier transform

\hat{f}

cannot attain its

ℓ_{2}

norm in a Hamming ball of radius much smaller than

(\frac{1}{2} - \sqrt{ρ (1 - ρ)}) \cdot n

. This result was then used to establish some properties of binary linear codes.

Our results. We now pass to presenting our results which are relevant to the topics above. We first remark that the idea of using hypercontractivity to study binary codes was discussed already in [23]. In [24], the hypercontractive inequality (1) was used to obtain bounds on the distance components and other parameters of binary codes. We observe (a similar observation was made in [9]) that these bounds can be strengthened by replacing (1) by (stronger) inequalities of Theorem 1. We do not go into details.

Next, we consider some implications of Theorem 1, focussing on the behavior of the norm

κ = κ_{2, 2}

for values of the noise parameter

ϵ

in the vicinity of 0. Clearly, for any

0 \leq x \leq 1

the function

κ_{2, 2} (x, ϵ)

is 2 at

ϵ = 0

. We prove the following technical claim.

Lemma 1.

Assume

0 < x < 1

. Let

κ (ϵ) = κ_{2, 2} (x, ϵ)

.

1.: $κ^{'} (0) = \frac{4}{\ln 2} \cdot \frac{(2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))} - 1)}{x} .$
2.: Let $ϵ \sim 0$ express the fact that ϵ is a sufficiently small absolute constant. Then for $ϵ \sim 0$ holds $| κ^{'} (ϵ) - κ^{'} (0) | \leq O (ϵ)$ , where the asymptotic notation hides absolute constants which may depend on x.

We use the first part of this claim to rederive a slightly weaker (but sufficient for applications, see the discussion following Corollary 4) version of the logarithmic Sobolev inequality from [20].

Corollary 2.

For any function f on

{0, 1}^{n}

holds

E (f, f) \geq ℓ (\frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n}) \cdot E n t (f^{2}),

where

ℓ (x) = 2 \cdot \frac{1 - 2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))}}{x}

is a convex and increasing function on

[0, 1]

, taking

[0, 1]

onto

[2 \ln 2, 2]

.
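The constant ℓ(x) in Corollary 2 interpolates between the classical constant 2 ln 2 and the value 2. The sketch below evaluates it numerically for 0 < x ≤ 1, with H^{-1} computed on [0, 1/2] by bisection; the function names are ours.

```python
import math

def H(t):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def H_inv(x):
    """Inverse of H restricted to [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if H(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ell(x):
    """l(x) = 2 (1 - 2 sqrt(s (1 - s))) / x with s = H^{-1}(1 - x), for 0 < x <= 1."""
    s = H_inv(1 - x)
    return 2 * (1 - 2 * math.sqrt(s * (1 - s))) / x

for x in (0.01, 0.25, 0.5, 0.75, 1.0):
    print(x, ell(x))
print("2 ln 2 =", 2 * math.log(2))
```

For small x the value should be close to 2 ln 2, and it should increase to 2 at x = 1, so that the improvement over the classical inequality grows with the concentration of f.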

We remark that in [20] (see also Theorem 6 in [10]) a somewhat stronger logarithmic Sobolev inequality

E (f, f) \geq ℓ (\frac{E n t (\frac{f^{2}}{{∥ f ∥}_{2}^{2}})}{n}) \cdot E n t (f^{2})

was shown using a different approach. (It seems that it might be possible to recover this stronger inequality by differentiating a corresponding hypercontractive inequality at zero, if one considers a more general version of Theorem 1 which takes into account the ratio between

ℓ_{q}

and

ℓ_{p}

norms of f, for

q > p

, and in this case taking both q and p to be very close to 2. We omit the details.)

Next, we use the second part of Lemma 1 to rederive the uncertainty principle from [10].

Corollary 3.

Let f be a non-zero function on

{0, 1}^{n}

such that

\frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n} = 1 - H (ρ)

, for some

0 \leq ρ < 1

. Let

0 \leq μ < \frac{1}{2} - \sqrt{ρ (1 - ρ)}

. Then

\sum_{| α | \leq μ n} {\hat{f}}^{2} (α) \leq 2^{- c n} \cdot \sum_{α} {\hat{f}}^{2} (α),

where c is an absolute constant depending on ρ and μ.

Let us remark that it seems helpful to have an explicit hypercontractive inequality (given by Theorem 1) from which both Corollaries 2 and 3 can be derived as special cases.

The following two results are simple consequences of Corollaries 2 and 3, respectively. Recall that for a subset

D \subseteq {0, 1}^{n}

,

λ (D)

is the maximal eigenvalue of the adjacency matrix of the subgraph of the discrete cube induced by the vertices of D. Recall also that

R_{L P} (δ)

denotes the best possible upper bound on the asymptotic maximal rate

R (δ)

of a code with relative distance

δ

which is possible to obtain using the linear programming approach of [15].

Corollary 4.

1.: Let D be a subset of ${0, 1}^{n}$ of cardinality $| D | = 2^{H (ρ) n}$ , for some $0 \leq ρ \leq 1$ . Then

$λ (D) \leq 2 \sqrt{ρ (1 - ρ)} \cdot n .$

This is almost tight if D is a Hamming ball of exponentially small cardinality.

2.: For any $0 \leq δ \leq \frac{1}{2}$ holds

$R_{L P} (δ) \geq \frac{1 - H (δ) + H (1 / 2 - \sqrt{δ (1 - δ)})}{2} .$

Some comments.

–: As discussed above, the first of these claims answers the question of [19] and shows that a certain approach to bound binary codes does not lead to an improvement of the first JPL bound. The second claim shows that the best possible bound obtainable via the linear programming approach of [15] is not better than the arithmetic average of the Gilbert-Varshamov bound and the first JPL bound. Observe that the first claim is a consequence of the logarithmic Sobolev inequality in Corollary 2, and hence of the behavior of the norm $κ_{2, 2}$ in Theorem 1 as $ϵ \to 0$ . The second claim is a consequence of the uncertainty principle in Corollary 3, and hence of the behavior of the norm $κ_{2, 2}$ in Theorem 1 as $ϵ \sim 0$ . We find these connections between notions to be rather intriguing.
–: As we have mentioned, the first of the claims recovers a result of [20], where it was also derived from the appropriate logarithmic Sobolev inequality. (Apart from this claim being a simple corollary of Theorem 1, an additional reason for stating it here is that it has only appeared in the unpublished arXiv preprint [20].) The second claim recovers a result of [17].

Finally we present a result of a somewhat different nature. The question of the maximal possible ratio

\frac{{∥ f ∥}_{2}}{{∥ f ∥}_{1}}

for a polynomial f of degree s on

{0, 1}^{n}

is considered in analysis [25,26] in connection with a conjecture of Pelczynski. The following claim is a simple consequence of Corollary 2.

Corollary 5.

Let

0 \leq s \leq \frac{n}{2}

and let f be a polynomial of degree s on

{0, 1}^{n}

(that is, f is a restriction of a degree s polynomial on

R^{n}

to

{0, 1}^{n}

). Then, writing σ for

\frac{s}{n}

,

\frac{1}{n} \log_{2} (\frac{{∥ f ∥}_{2}}{{∥ f ∥}_{1}}) \leq \frac{1 - H (\frac{1}{2} - \sqrt{σ (1 - σ)})}{2} .

We remark that this improves the estimate of [25] for

0.3\ldots \leq \frac{s}{n} < \frac{1}{2}

.

1.3. Related Work

In [10], it was shown that if

\frac{{∥ f ∥}_{p}}{{∥ f ∥}_{1}} \geq 2^{ρ n}

, for some

p \geq 1

and

ρ \geq 0

, then

{∥ f ∥}_{p} \geq {∥ f_{ϵ} ∥}_{1 + \frac{p - 1}{{(1 - 2 ϵ)}^{2}} + Δ (p, ρ, ϵ)}

, where

Δ (p, ρ, ϵ) > 0

for all

p > 1

,

ϵ, ρ > 0

(cf. with (1), which can be restated as

{∥ f ∥}_{p} \geq {∥ f_{ϵ} ∥}_{1 + \frac{p - 1}{{(1 - 2 ϵ)}^{2}}}

, for

p = 1 + {(1 - 2 ϵ)}^{2} (q - 1)

). The function

Δ (p, ρ, ϵ)

is “semi-explicit”, in the following sense: it is an explicit function of the (unique) solution of a certain explicit differential equation.

In [11], it was shown, using a different approach, that (restating the result in the notation of this paper)

∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{q}

, where q is determined by

F_{f, ϵ} (q) = q

(in the notation of the last comment above to Theorem 1). As we have observed, this is the best possible value for

κ

in Theorem 1, but it might not be easy to determine explicitly in practice (compare with Corollary 1).

In [27], Mrs. Gerber type inequalities for Rényi divergence and arbitrary distributions on Polish spaces were proved, using a different approach. The results in [27] apply in higher generality, but they seem to be somewhat less explicit than those in Proposition 1.

This paper is organized as follows. We prove Proposition 1 in Section 2 and Theorem 1 in Section 3. We prove the remaining claims, including some technical lemmas and claims made above in the comments to the main results, in Section 4.

2. Proof of Proposition 1

We first prove (4) and then show it to be tight. We prove (4) in two steps, using Theorem 2 to reduce it to a claim about properties of the function

ϕ_{ϵ}

, and then proving that claim.

We start with the first step. It follows closely the proof of Theorem 1.8 in [9], and hence will be presented rather briefly, and not in a self-contained manner. Let f be a function on

{0, 1}^{n}

, for which we want to show (4). Recall that, by assumption,

E f = 1

. This means that

{∥ f ∥}_{\infty} \leq 2^{n}

, and that the points at which

f < 2^{- n}

, say, contribute little to both sides of (4), so we may ignore them for the sake of the discussion (that is, we may and will assume that f vanishes on these points). All the remaining points can be partitioned into

O (n)

level sets

A_{1}, \dots, A_{r}

such that f varies by a factor of 2 at most in each level set. Let

α_{i} = \frac{1}{n} \log_{2} (| A_{i} |)

, and let

ν_{i} = \frac{1}{n} \log_{2} (v_{i})

, where

v_{i}

is the minimal value of f on

A_{i}

. Then, as shown in the proof of Theorem 1.8 in [9], up to an additive error term of

O (\frac{\log (n)}{n})

, we have,

\frac{\mathrm{Ent}_{2} (f_{ϵ})}{n} = \frac{1}{n} \log_{2} {∥ f_{ϵ} ∥}_{2}^{2} \leq 2 \cdot \max_{1 \leq i \leq r} \{ϕ_{ϵ} (α_{i}) + ν_{i}\} .

The negligible error here contributes towards a negligible error in (4), which can then be removed by a tensorization argument, so we will ignore it from now on.

Let

N = \frac{1}{n} \log_{2} ({∥ f ∥}_{q})

. Note that

N = \frac{q - 1}{q} \cdot \frac{\mathrm{Ent}_{q} (f)}{n}

. Hence, in particular,

N \leq \frac{q - 1}{q}

. Note also that for any

1 \leq i \leq r

we have

α_{i} + ν_{i} \leq 1

(since

E f = 1

) and

\frac{α_{i} - 1}{q} + ν_{i} \leq N

(since

\frac{| A_{i} |}{2^{n}} 2^{q ν_{i} n} \leq \frac{1}{2^{n}} \sum_{x \in A_{i}} f^{q} (x) \leq {∥ f ∥}_{q}^{q}

). We also have

0 \leq α_{i} \leq 1

and

- 1 \leq ν_{i} \leq 1

. This discussion leads to the definition of the following two subsets of

R^{2}

, which will play an important role in the proof of Theorem 1 as well. (We remark that the relevance of the set

Ω

in the following definition is not immediately obvious. It will be made clear in the following arguments.)

Definition 2.

Let

q > 1

and

0 < N \leq \frac{q - 1}{q}

. Let

Ω_{0} \subseteq R^{2}

be defined by

Ω_{0} = \{(α, ν) : 0 \leq α \leq 1, - 1 \leq ν \leq 1, α + ν \leq 1, \frac{α - 1}{q} + ν \leq N\} .

Let

Ω \subseteq Ω_{0}

be the set of all pairs

(α, ν) \in Ω_{0}

with

ν \geq 0

.
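The reduction to the set Ω_0 can be illustrated numerically. The following is a minimal sketch, assuming a randomly generated positive function on a small cube (the test function, the dimension n = 8 and the value q = 3 are illustrative choices, not taken from the paper). It groups the points into dyadic level sets, computes the pairs (α_i, ν_i), and checks that each pair satisfies the constraints defining Ω_0.

```python
import math
import random

def level_set_pairs(values, n):
    """Return the (alpha_i, nu_i) pairs of the dyadic level sets of f.

    `values` lists f(x) over all 2^n points of the cube. Points with
    f < 2^{-n} are dropped, as in the discussion above.
    """
    buckets = {}
    for v in values:
        if v < 2.0 ** (-n):
            continue
        k = math.floor(math.log2(v))  # bucket k holds values in [2^k, 2^(k+1))
        buckets.setdefault(k, []).append(v)
    pairs = []
    for vals in buckets.values():
        alpha = math.log2(len(vals)) / n   # normalised log-size of the level set
        nu = math.log2(min(vals)) / n      # normalised log of the minimal value
        pairs.append((alpha, nu))
    return pairs

def in_omega0(alpha, nu, q, N, tol=1e-9):
    """Membership test for Omega_0, up to a floating-point tolerance."""
    return (-tol <= alpha <= 1 + tol
            and -1 - tol <= nu <= 1 + tol
            and alpha + nu <= 1 + tol
            and (alpha - 1) / q + nu <= N + tol)

random.seed(0)
n, q = 8, 3.0                                  # illustrative parameters
raw = [2.0 ** random.uniform(-3, 5) for _ in range(2 ** n)]
mean = sum(raw) / len(raw)
f = [v / mean for v in raw]                    # normalise so that E f = 1
norm_q = (sum(v ** q for v in f) / len(f)) ** (1 / q)
N = math.log2(norm_q) / n                      # N = (1/n) log2 ||f||_q
pairs = level_set_pairs(f, n)
all_inside = all(in_omega0(a, v, q, N) for a, v in pairs)
print(len(pairs), all_inside)
```

The check only exercises the two counting inequalities behind the constraints; it says nothing about the error terms handled by tensorization.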

By the preceding discussion, (4) will follow from the following claim.

Lemma 2.

For all

0 \leq ϵ \leq \frac{1}{2}

we have

\max_{(α, ν) \in Ω_{0}} \{ϕ_{ϵ} (α) + ν\} = \frac{1}{2} \cdot ψ_{2, q} (\frac{q N}{q - 1}, ϵ),

where

ψ_{2, q}

is defined in Definition 1.

Before proving Lemma 2, we collect the relevant properties of the function

ϕ_{ϵ}

in the following lemma.

Lemma 3.

Let

0 < ϵ < \frac{1}{2}

. Let

q_{0} = q_{0} (ϵ) = 1 + {(1 - 2 ϵ)}^{2}

. The function

ϕ_{ϵ}

has the following properties.

1.: $ϕ_{ϵ} (α)$ is strictly concave and increasing from $ϕ_{ϵ} (0) = - \frac{\log_{2} (\frac{4}{q_{0}})}{2}$ to 0 on $0 \leq α \leq 1$ .
2.: $ϕ_{ϵ}^{'} (0) = 1$ , $ϕ_{ϵ}^{'} (1) = \frac{1}{q_{0}}$ .
3.: $\frac{α - 1}{ϕ_{ϵ} (α)}$ is strictly increasing in α, going up to $q_{0}$ , as $α \to 1$ .
4.: The function $g (α) = 1 - α - \frac{α}{1 - α} \cdot ϕ_{ϵ} (α)$ is strictly decreasing on $[0, 1]$ . Moreover, $g (0) = 1$ and $g (1) = \frac{1}{q_{0}}$ .

This lemma will be proved in Section 4. For now we assume its correctness, and proceed with the proof of Lemma 2.

Proof.

Our first observation is that the maximum of

ϕ_{ϵ} (α) + ν

on

Ω_{0}

is located in

Ω

, since for any point

(α, ν) \in Ω_{0}

with

ν < 0

, the point

(α, 0)

is in

Ω

. So we may and will replace

Ω_{0}

with

Ω

in the following argument.

Since

ϕ_{ϵ}

is increasing, any local maximum of

ϕ_{ϵ} (α) + ν

is located on the upper boundary of

Ω

, that is on the piecewise linear curve which starts as the straight line

\frac{α}{q} + ν = N + \frac{1}{q}

, for

0 \leq α \leq 1 - \frac{q N}{q - 1}

and continues as the straight line

α + ν = 1

for

1 - \frac{q N}{q - 1} \leq α \leq 1

.
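The shape of the upper boundary of Ω is easy to check directly. The sketch below assumes illustrative values q = 3 and N = 0.4 (any q > 1 and 0 < N ≤ (q − 1)/q would do); the function names are hypothetical. It computes the largest admissible ν for each α as the minimum of the two linear constraints and confirms that the two lines meet at α = 1 − qN/(q − 1).

```python
def upper_boundary(alpha, q, N):
    """Largest nu such that (alpha, nu) lies in Omega, for 0 <= alpha <= 1."""
    return min(N + (1 - alpha) / q, 1 - alpha)

def kink(q, N):
    """Abscissa at which the two boundary lines meet."""
    return 1 - q * N / (q - 1)

q, N = 3.0, 0.4          # illustrative values with 0 < N <= (q - 1) / q
a = kink(q, N)
grid = [i / 100 for i in range(101)]

# Left of the kink the constraint (alpha - 1)/q + nu <= N is the active one.
left_ok = all(upper_boundary(t, q, N) == N + (1 - t) / q
              for t in grid if t < a - 1e-9)
# Right of the kink the constraint alpha + nu <= 1 is the active one.
right_ok = all(upper_boundary(t, q, N) == 1 - t
               for t in grid if t > a + 1e-9)
print(a, upper_boundary(a, q, N), left_ok, right_ok)
```

At the kink both lines give ν = qN/(q − 1), which is the point (1 − qN/(q − 1), qN/(q − 1)) that recurs throughout the argument.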

Note that, since

ϕ_{ϵ}^{'} < 1

for

α > 0

, the function

ϕ_{ϵ} (α) + ν

decreases (as a function of

α

) on the line

α + ν = 1

for

1 - \frac{q N}{q - 1} \leq α \leq 1

. Next, let

h (α) = ϕ_{ϵ} (α) - \frac{α}{q} + (N + \frac{1}{q})

. The function h describes the restriction of

ϕ_{ϵ} (α) + ν

to the line

\frac{α}{q} + ν = N + \frac{1}{q}

, and we are interested in the maximum of h on the interval

I = \{0 \leq α \leq 1 - \frac{q N}{q - 1}\}

. We have

h^{'} (α) = ϕ_{ϵ}^{'} (α) - \frac{1}{q}

. By Lemma 3, the function h is concave, and hence there are two possible cases:

$ϕ_{ϵ}^{'} (1 - \frac{q N}{q - 1}) \geq \frac{1}{q}$ . In this case h is increasing on I and we get

$\max_{(α, ν) \in Ω} \{ϕ_{ϵ} (α) + ν\} = \max_{α \in I} {h (α)} = h (1 - \frac{q N}{q - 1}) =$

$ϕ_{ϵ} (1 - \frac{q N}{q - 1}) + \frac{q N}{q - 1} = \frac{1}{2} \cdot ψ_{2, q} (\frac{q N}{q - 1}, ϵ) .$

The last equality follows from the definition of $ψ_{2, q}$ in this case.
$ϕ_{ϵ}^{'} (1 - \frac{q N}{q - 1}) < \frac{1}{q}$ . Note that, by Lemma 3, $1 = ϕ_{ϵ}^{'} (0) > \frac{1}{q}$ . Hence, in this case the maximum of h on I is located at the unique zero of its derivative, that is at the point $α_{0}$ such that $ϕ_{ϵ}^{'} (α_{0}) = \frac{1}{q}$ . Using the definition of $ψ_{2, q}$ in this case, we get

$\max_{(α, ν) \in Ω} \{ϕ_{ϵ} (α) + ν\} = h (α_{0}) = N + (ϕ_{ϵ} (α_{0}) + \frac{1 - α_{0}}{q}) = \frac{1}{2} \cdot ψ_{2, q} (\frac{q N}{q - 1}, ϵ) .$

□

This concludes the proof of (4). The fact that

ψ_{2, q} (x, ϵ)

is strictly increasing and concave in its first argument is an easy implication of Lemma 3.

We pass to showing the tightness of (4). Let

0 < ϵ < \frac{1}{2}

and

0 < x < 1

. Set

N = \frac{q - 1}{q} \cdot x

. Let

Ω

be the domain defined in Definition 2, and let

(α^{*}, ν^{*})

be the maximum point of

ϕ_{ϵ} (α) + ν

on

Ω

(note that the discussion above determines this point uniquely). We proceed to define the function f. Let n be sufficiently large. For

y \in {0, 1}^{n}

, let

| y |

denote the Hamming weight of y, that is, the number of 1-coordinates in y. Let

r = ⌊ H^{- 1} (α^{*}) \cdot n ⌋

. Let

S = \{y \in {0, 1}^{n} : | y | = r\}

be the Hamming sphere around zero of radius r in

{0, 1}^{n}

. Now there are two cases to consider.

If $ϕ_{ϵ}^{'} (1 - x) < \frac{1}{q}$ , then by the discussion above, the point $(α^{*}, ν^{*})$ lies on the line $\frac{α}{q} + ν = N + \frac{1}{q}$ , but not on the line $α + ν = 1$ . Observe that $2^{α^{*} n - o (n)} \leq | S | \leq 2^{α^{*} n}$ (the first estimate follows from the Stirling formula; for the second estimate see, e.g., Theorem 1.4.5 in [28]). As a first attempt, let $g = 2^{ν^{*} n} \cdot 1_{S}$ . Then $N - o_{n} (1) \leq \frac{α^{*} - 1}{q} + ν^{*} - o_{n} (1) \leq \frac{1}{n} \log_{2} {∥ g ∥}_{q} \leq \frac{α^{*} - 1}{q} + ν^{*} = N$ . That is, $x - o_{n} (1) \leq \frac{\mathrm{Ent}_{q} (g)}{n} \leq x$ . However, $E g$ is exponentially small. To correct that, we define f to be $v = 2^{(ν^{*} - δ) \cdot n}$ on S, and $\frac{2^{n} - | S | v}{2^{n} - | S |}$ on the complement of S. Then $E f = 1$ . We choose $δ$ to be as small as possible, while ensuring that $\frac{\mathrm{Ent}_{q} (f)}{n} \leq x$ . Since the contribution of the constant-1 function to ${∥ f ∥}_{q}$ is exponentially small w.r.t. ${∥ f ∥}_{q}$ , we can choose $δ = o_{n} (1)$ . We now have $E f = 1$ , $\frac{\mathrm{Ent}_{q} (f)}{n} \leq x$ , and

$\frac{\mathrm{Ent}_{2} (f_{ϵ})}{n} = \frac{1}{n} \log_{2} {∥ f_{ϵ} ∥}_{2}^{2} = \frac{1}{n} \log_{2} 〈f_{2 ϵ (1 - ϵ)}, f〉 \geq$

$2 \cdot (ϕ_{ϵ} (α^{*}) + ν^{*}) - o_{n} (1) \geq ψ_{2, q} (x, ϵ) - o_{n} (1) .$

Here the second equality follows from the semigroup property of the noise operator: $T_{ϵ} \circ T_{ϵ} = T_{2 ϵ (1 - ϵ)}$ . The first inequality follows from the tightness part of Theorem 2 and the definition of $ϕ_{ϵ}$ . The second inequality follows from Lemma 2.
The tightness of (4) in this case now follows, taking into account the fact that $ψ_{2, q}$ is strictly increasing.
If $ϕ_{ϵ}^{'} (1 - x) \geq \frac{1}{q}$ , the point $(α^{*}, ν^{*})$ lies at the intersection of the lines $\frac{α}{q} + ν = N + \frac{1}{q}$ and $α + ν = 1$ . Hence the function $g = 2^{ν^{*} n} \cdot 1_{S}$ has both $x - o_{n} (1) \leq \frac{\mathrm{Ent}_{q} (g)}{n} \leq x$ , and $1 - o_{n} (1) \leq E g \leq 1$ . It is easy to see that g can be corrected as in the preceding case, by decreasing it slightly on S and adding a constant component, to obtain a function f with expectation 1 and $\frac{\mathrm{Ent}_{q} (f)}{n} \leq x$ , and with $\frac{\mathrm{Ent}_{2} (f_{ϵ})}{n} \geq ψ_{2, q} (x, ϵ) - o_{n} (1)$ , proving the tightness of (4) in this case as well. We omit the details.
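The two elementary ingredients of the construction above, the size of the Hamming sphere and the correction that restores expectation 1, can be checked in exact arithmetic. This is a rough sketch under stated assumptions: n = 200, α^{*} = 0.5 and the exponent log_2 v = 90 are illustrative values, and the helper names are hypothetical. The value v is taken to be an integer power of 2 only so that `Fraction` arithmetic stays exact.

```python
import math
from fractions import Fraction

def binary_entropy(p):
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy_inverse(alpha):
    """Bisection for t in [0, 1/2] with H(t) <= alpha and H(t) close to alpha."""
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if binary_entropy(mid) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo

def corrected_function(n, r, log2_v):
    """Sphere size, value v on the sphere, and the value off the sphere
    chosen so that the expectation of f equals 1."""
    size = math.comb(n, r)
    v = Fraction(2) ** log2_v
    off = (Fraction(2) ** n - size * v) / (Fraction(2) ** n - size)
    return size, v, off

n, alpha = 200, 0.5                      # illustrative values
r = math.floor(entropy_inverse(alpha) * n)
size, v, off = corrected_function(n, r, log2_v=90)

# E f = (|S| * v + (2^n - |S|) * off) / 2^n, computed exactly.
mean = (size * v + (2 ** n - size) * off) / Fraction(2) ** n
rate = math.log2(size) / n               # (1/n) log2 |S|, to compare with alpha
print(r, rate, mean == 1, off > 0)
```

The off-sphere value is positive exactly when |S| v < 2^n, which is the condition α^{*} + ν^{*} < 1 in normalized form.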

This completes the proof of Proposition 1. □
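The semigroup identity T_ϵ ∘ T_ϵ = T_{2ϵ(1−ϵ)} used in the tightness argument can be verified by brute force on a small cube. The sketch below assumes the standard definition of the noise operator (each coordinate of y differs from that of x independently with probability ϵ) and an arbitrary test function on {0, 1}^3; it also checks the resulting identity ∥f_ϵ∥_2^2 = ⟨f_{2ϵ(1−ϵ)}, f⟩.

```python
import itertools

def noise_operator(f, n, eps):
    """Apply T_eps to f, where f maps each point of {0,1}^n (a tuple) to a float."""
    points = list(itertools.product((0, 1), repeat=n))
    out = {}
    for x in points:
        total = 0.0
        for y in points:
            d = sum(a != b for a, b in zip(x, y))   # Hamming distance between x and y
            total += eps ** d * (1 - eps) ** (n - d) * f[y]
        out[x] = total
    return out

def inner(f, g):
    """Inner product <f, g> = E[f g] under the uniform measure on the cube."""
    return sum(f[x] * g[x] for x in f) / len(f)

n, eps = 3, 0.2                                      # illustrative values
points = list(itertools.product((0, 1), repeat=n))
f = {x: float(1 + sum(x) ** 2) for x in points}     # arbitrary test function

f_eps = noise_operator(f, n, eps)
twice = noise_operator(f_eps, n, eps)
once = noise_operator(f, n, 2 * eps * (1 - eps))

max_gap = max(abs(twice[x] - once[x]) for x in points)
norm_gap = abs(inner(f_eps, f_eps) - inner(once, f))
print(max_gap, norm_gap)
```

Both gaps should vanish up to rounding, since flipping a coordinate twice with probability ϵ each time changes it with probability 2ϵ(1 − ϵ), and T_ϵ is self-adjoint.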

3. Proof of Theorem 1

The high-level outline of the argument in this proof is similar to that of Proposition 1. We start with proving (5), doing this in two steps. In the first step Theorem 2 is used to reduce (5) to a claim about properties of the function

ϕ_{ϵ}

. That claim is proved in the second step.

We will give only a brief description of the first step since, similarly to the first step in the proof of Proposition 1, it follows closely the proof of Theorem 1.8 in [9]. Let f be a function on

{0, 1}^{n}

, for which we may and will assume that

f \geq 2^{- n}

and that

E f = {∥ f ∥}_{1} = 1

. There are

O (n)

real numbers

0 \leq α_{1}, \dots, α_{r} \leq 1

and

- 1 \leq ν_{1}, \dots, ν_{r} \leq 1

, such that, up to a negligible error, which may be removed by tensorization, we have

\frac{1}{n} \log_{2} ∥ f_{ϵ} ∥_{2} \leq \max_{1 \leq i \leq r} \{ϕ_{ϵ} (α_{i}) + ν_{i}\} \quad \text{and} \quad \frac{1}{n} \log_{2} {∥ f ∥}_{q} = \max_{1 \leq i \leq r} \{\frac{α_{i} - 1}{q} + ν_{i}\} .

Hence (5) reduces to claim (6) in the following proposition.

Proposition 2.

Let

q > 1

and

0 \leq α_{1}, \dots, α_{r} \leq 1

,

- 1 \leq ν_{1}, \dots, ν_{r} \leq 1

with

\max_{1 \leq i \leq r} \{(α_{i} - 1) + ν_{i}\} = 0

. Let

N = \max_{1 \leq i \leq r} \{\frac{α_{i} - 1}{q} + ν_{i}\}

. Then for any

0 \leq ϵ \leq \frac{1}{2}

we have

\max_{1 \leq i \leq r} \{ϕ_{ϵ} (α_{i}) + ν_{i}\} \leq \max_{1 \leq i \leq r} \{\frac{α_{i} - 1}{κ} + ν_{i}\},

(6)

where

κ = κ_{2, q} (\frac{q N}{q - 1}, ϵ)

is defined in Definition 1 (it is easy to see that

0 \leq N \leq \frac{q - 1}{q}

, and hence κ is well defined).

Moreover, this is tight, in the following sense. For any

0 < N < \frac{q - 1}{q}

and

0 < ϵ < \frac{1}{2}

, and for any

\tilde{κ} < κ_{2, q} (\frac{q N}{q - 1}, ϵ)

, there exist

0 \leq α_{1}, α_{2} \leq 1

and

- 1 \leq ν_{1}, ν_{2} \leq 1

such that

\max_{1 \leq i \leq 2} \{(α_{i} - 1) + ν_{i}\} = 0

,

\max_{1 \leq i \leq 2} \{\frac{α_{i} - 1}{q} + ν_{i}\} = N

, and

\max_{1 \leq i \leq 2} \{ϕ_{ϵ} (α_{i}) + ν_{i}\} > \max_{1 \leq i \leq 2} \{\frac{α_{i} - 1}{\tilde{κ}} + ν_{i}\}

.

Proof of Proposition 2.

We start with verifying simple boundary cases. First, we observe that

ϕ_{0} (x) = \frac{x - 1}{2}

(Lemma 9) and that

ϕ_{\frac{1}{2}} (x) = x - 1

(see the relevant discussion in the proof of Corollary 1). In addition, it is easy to see that

κ_{2, q} (x, \frac{1}{2}) = 1

for all

q \geq 1

and

0 \leq x \leq 1

; and (bearing in mind that

ϕ_{0} (x) = \frac{x - 1}{2}

) that

κ_{2, q} (x, 0) = 2

for all

q \geq 1

and

0 \leq x \leq 1

. Therefore (6) is an identity for

ϵ = 0

and

ϵ = \frac{1}{2}

. Hence we may and will assume from now on that

0 < ϵ < \frac{1}{2}

.

Let

q_{0} = 1 + {(1 - 2 ϵ)}^{2}

. We proceed to consider the (simple) cases

N = 0

or

N + \frac{1}{q} \leq \frac{1}{q_{0}}

. Note that in these cases we have

κ = κ_{2, q} (\frac{q N}{q - 1}, ϵ) = q_{0}

. Next, observe that, by the first and the second claims of Lemma 3, for any

0 \leq α \leq 1

we have

ϕ_{ϵ} (α) \leq \frac{α - 1}{q_{0}} = \frac{α - 1}{κ}

and hence (6) holds trivially in these cases.

We continue to prove (6), assuming from now on that

N > 0

and that

N + \frac{1}{q} > \frac{1}{q_{0}}

. Let

Ω \subseteq R^{2}

be the set defined in Definition 2. We now define a family of continuous functions on

Ω

, which will play an important role in the following argument. Let

(α_{1}, ν_{1})

be a point in

Ω

with

\frac{α_{1} - 1}{q} + ν_{1} = N

. Define a function

f = f_{α_{1}, ν_{1}}

on

Ω

as follows. For

(α, ν) \in Ω

with

α < 1

let

f (α, ν)

be the value of

κ

for which

ϕ_{ϵ} (α) + ν = \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α - 1}{κ} + ν\}

. In addition, let

f (1, 0) = \frac{1 - α_{1}}{ν_{1}}

.

Lemma 4.

For any choice of

(α_{1}, ν_{1})

as above the function

f_{α_{1}, ν_{1}}

is well-defined and continuous on Ω.

Let

M (α_{1}, ν_{1}) = \max_{Ω} f_{α_{1}, ν_{1}}

. The inequality (6) will follow from the next main technical claim, describing the behavior of

M (α_{1}, ν_{1})

, as a function of

α_{1}

and

ν_{1}

. Before stating this claim, let us make some preliminary comments. Note that the points

(1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

and

(0, N + \frac{1}{q})

are possible choices for

(α_{1}, ν_{1})

. Note also that

α_{0}

in the third part of the claim is well-defined, by the fourth claim of Lemma 3.

Proposition 3.

1.: $M (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1}) = \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} .$
2.: If $\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \geq q$ , then for any choice of $(α_{1}, ν_{1})$ we have

$M (α_{1}, ν_{1}) \leq M (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1}) .$
3.: If $\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \leq q$ , then for any choice of $(α_{1}, ν_{1})$ we have

$M (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1}) \leq M (α_{1}, ν_{1}) \leq M (0, N + \frac{1}{q}) = \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})},$

where $α_{0}$ is determined by $1 - α_{0} - \frac{α_{0} ϕ_{ϵ} (α_{0})}{1 - α_{0}} = N + \frac{1}{q}$ .

We will prove Lemma 4 and Proposition 3 in Section 3.1 and Section 3.2. For now we assume their validity and complete the proof of Proposition 2.

We first prove (6). Note that if

x = \frac{q N}{q - 1}

then in the definition of

κ_{2, q} (x, ϵ)

we have

y = \frac{q - 1}{q} \cdot x + \frac{1}{q} = N + \frac{1}{q}

. Recall also that we may assume that

N > 0

and that

y = N + \frac{1}{q} > \frac{1}{q_{0}}

.

By assumption

α_{i} + ν_{i} \leq 1

, and

\frac{α_{i} - 1}{q} + ν_{i} \leq N

for all

1 \leq i \leq r

. Moreover there is an index

1 \leq i \leq r

for which

\frac{α_{i} - 1}{q} + ν_{i} = N

. Assume, w.l.o.g., that

i = 1

. We apply Proposition 3 to the function

f_{α_{1}, ν_{1}}

. Observe that the claim of the proposition together with the definition of

κ

imply

M (α_{1}, ν_{1}) \leq κ

. By the definition of

f_{α_{1}, ν_{1}}

, this means that for any point

(α, ν) \in Ω

we have

ϕ_{ϵ} (α) + ν \leq \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α - 1}{κ} + ν\}

. We now claim that this inequality holds for all the points

(α_{i}, ν_{i})

,

1 \leq i \leq r

, which will immediately imply (6). In fact, points

(α_{i}, ν_{i})

with

0 \leq ν_{i} \leq 1

lie in

Ω

and hence the inequality holds for these points. Furthermore, if

ν_{i} < 0

for some

1 \leq i \leq r

, then the point

(α_{i}, 0)

lies in

Ω

, and hence

ϕ_{ϵ} (α_{i}) \leq \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α_{i} - 1}{κ}\}

. Since $ν_{i} < 0$ , it follows that

ϕ_{ϵ} (α_{i}) + ν_{i} \leq \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α_{i} - 1}{κ} + ν_{i}\}

, proving the inequality in this case as well.

We pass to proving the tightness of (6), starting with the case

N + \frac{1}{q} \leq \frac{1}{q_{0}}

. In this case, by definition,

κ = q_{0}

. Let

\tilde{κ} < κ

be given. Observe that since, by assumption,

N > 0

, we have

q > q_{0}

. Set

α_{1} = \frac{\frac{1}{q_{0}} - \frac{1}{q} - N}{\frac{1}{q_{0}} - \frac{1}{q}}

. Set

ν_{1} = \frac{1 - α_{1}}{q_{0}}

. Let

δ > 0

be sufficiently small (depending on N and

\tilde{κ}

). Set

α_{2} = 1 - δ

and

ν_{2} = δ

. It is easy to see that

α_{1}, α_{2}

and

ν_{1}

,

ν_{2}

satisfy the required constraints. We claim that

ϕ_{ϵ} (α_{2}) + ν_{2} > \max_{1 \leq i \leq 2} \{\frac{α_{i} - 1}{\tilde{κ}} + ν_{i}\}

. In fact, for a sufficiently small

δ

we have, using the second claim of Lemma 3 (and observing that

ϕ_{ϵ}^{'}

is continuous), that

ϕ_{ϵ} (α_{2}) + ν_{2} = ϕ_{ϵ} (1 - δ) + δ \approx - \frac{δ}{q_{0}} + δ > - \frac{δ}{\tilde{κ}} + δ = \frac{α_{2} - 1}{\tilde{κ}} + ν_{2},

and

ϕ_{ϵ} (α_{2}) + ν_{2} \approx - \frac{δ}{q_{0}} + δ > 0 \geq \frac{α_{1} - 1}{\tilde{κ}} + \frac{1 - α_{1}}{q_{0}} = \frac{α_{1} - 1}{\tilde{κ}} + ν_{1} .
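The statement that the two points chosen in this case satisfy the required constraints can be checked mechanically. The sketch below assumes illustrative parameters ϵ = 0.1, q = 4, N = 0.2 and δ = 10^{-3}, which satisfy N + 1/q ≤ 1/q_0; the function name is hypothetical. It confirms that the maximum of (α_i − 1) + ν_i is 0 and that the maximum of (α_i − 1)/q + ν_i equals N.

```python
def first_case_points(q, q0, N, delta):
    """The two points (alpha_i, nu_i) used for tightness when N + 1/q <= 1/q0."""
    gap = 1 / q0 - 1 / q
    alpha1 = (gap - N) / gap
    nu1 = (1 - alpha1) / q0
    return [(alpha1, nu1), (1 - delta, delta)]

eps = 0.1                                # illustrative noise parameter
q0 = 1 + (1 - 2 * eps) ** 2              # q0 = 1.64
q, N, delta = 4.0, 0.2, 1e-3             # illustrative values with N + 1/q <= 1/q0

pts = first_case_points(q, q0, N, delta)
in_range = all(0 <= a <= 1 and -1 <= v <= 1 for a, v in pts)
max_sum = max(a - 1 + v for a, v in pts)         # expected to be 0
max_q = max((a - 1) / q + v for a, v in pts)     # expected to be N
print(pts, max_sum, max_q)
```

The first point attains the value N in the second maximum and the second point attains 0 in the first, so both normalizations are met by different points.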

We pass to the case

N + \frac{1}{q} > \frac{1}{q_{0}}

and

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \geq q

. In this case

κ = \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})}

. Set

α_{1} = α_{2} = 1 - \frac{q N}{q - 1}

and

ν_{1} = ν_{2} = \frac{q N}{q - 1}

. It is easy to see that

α_{1}, α_{2}

and

ν_{1}

,

ν_{2}

satisfy the required constraints. It is also easy to see that for any

\tilde{κ} < κ

we have

ϕ_{ϵ} (α_{1}) + ν_{1} = \frac{α_{1} - 1}{κ} + ν_{1} > \frac{α_{1} - 1}{\tilde{κ}} + ν_{1} .

It remains to deal with the case

N + \frac{1}{q} > \frac{1}{q_{0}}

and

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q

. Let

α_{0}

be determined by

1 - α_{0} - \frac{α_{0} ϕ_{ϵ} (α_{0})}{1 - α_{0}} = N + \frac{1}{q}

. Then

κ = \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})}

. Set

α_{1} = 0

and

ν_{1} = N + \frac{1}{q}

. Set

α_{2} = α_{0}

and

ν_{2} = 1 - α_{0}

. It is easy to see that in this case the function

1 - α - \frac{α ϕ_{ϵ} (α)}{1 - α}

is larger than

N + \frac{1}{q}

at

α = 1 - \frac{q N}{q - 1}

, and hence the fourth claim of Lemma 3 implies that

α_{2} = α_{0} > 1 - \frac{q N}{q - 1}

. Using this, it is easy to see that

α_{1}, α_{2}

and

ν_{1}

,

ν_{2}

satisfy the required constraints. Furthermore, note that

α_{2} < 1

(again, using the fourth claim of Lemma 3). It is also easy to verify, using the definition of

α_{0}

, that

ϕ_{ϵ} (α_{2}) + ν_{2} = \frac{α_{1} - 1}{κ} + ν_{1} = \frac{α_{2} - 1}{κ} + ν_{2},

which implies that for any

\tilde{κ} < κ

we have

ϕ_{ϵ} (α_{2}) + ν_{2} > \max_{1 \leq i \leq 2} \{\frac{α_{i} - 1}{\tilde{κ}} + ν_{i}\}

. This completes the proof of Proposition 2. □

We now prove Lemma 4 and Proposition 3. Recall that we may assume

N > 0

and

N + \frac{1}{q} > \frac{1}{q_{0}}

.

3.1. Proof of Lemma 4

Let

(α_{1}, ν_{1})

be a point in

Ω

with

\frac{α_{1} - 1}{q} + ν_{1} = N

. We start with some simple but useful observations about

α_{1}

and

ν_{1}

.

Lemma 5.

1.: $α_{1} \leq 1 - \frac{q N}{q - 1}$ and $ν_{1} \geq \frac{q N}{q - 1}$ .
2.: $\frac{1 - α_{1}}{ν_{1}} < q_{0}$ .

Proof.

The first claim of the lemma is an easy consequence of the relations

\frac{α_{1} - 1}{q} + ν_{1} = N

and

α_{1} + ν_{1} \leq 1

. We omit the details.

We pass to the second claim of the lemma, distinguishing two cases,

q \leq q_{0}

and

q > q_{0}

. If

q \leq q_{0}

, then

ν_{1} = N + \frac{1 - α_{1}}{q} > \frac{1 - α_{1}}{q} \geq \frac{1 - α_{1}}{q_{0}}

. If

q > q_{0}

, we use the fact that

N + \frac{1}{q} > \frac{1}{q_{0}}

to obtain

\frac{1 - α_{1}}{q_{0}} < (1 - α_{1}) (N + \frac{1}{q}) = (1 - α_{1}) (\frac{α_{1}}{q} + ν_{1})

. Viewing the last expression as a function of

α_{1}

, it is easy to see that it equals

ν_{1}

at

α_{1} = 0

and that it decreases in

α_{1}

. Hence

ν_{1} \geq (1 - α_{1}) (\frac{α_{1}}{q} + ν_{1}) > \frac{1 - α_{1}}{q_{0}}

, completing the argument in this case as well. □
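Both claims of Lemma 5 involve only q, N, q_0 and the point (α_1, ν_1), so they can be sanity-checked by random sampling. This is a sketch rather than a proof; the sampling ranges, the seed and the small margin used to stay away from the boundary case N + 1/q = 1/q_0 are arbitrary choices.

```python
import random

def lemma5_holds(q, N, eps, alpha1, tol=1e-12):
    """Check both claims of Lemma 5 for a point on the line (alpha - 1)/q + nu = N."""
    q0 = 1 + (1 - 2 * eps) ** 2
    nu1 = N + (1 - alpha1) / q
    x = q * N / (q - 1)
    claim1 = alpha1 <= 1 - x + tol and nu1 >= x - tol
    claim2 = (1 - alpha1) / nu1 < q0
    return claim1 and claim2

random.seed(1)
checked = 0
failures = 0
while checked < 10000:
    q = random.uniform(1.01, 10.0)
    eps = random.uniform(0.01, 0.49)
    N = random.uniform(1e-3, (q - 1) / q)
    q0 = 1 + (1 - 2 * eps) ** 2
    if N + 1 / q <= 1 / q0 + 1e-6:
        continue  # outside the standing assumption N + 1/q > 1/q0
    # alpha1 + nu1 <= 1 forces alpha1 <= 1 - qN/(q - 1), so sample in that range.
    alpha1 = random.uniform(0.0, 1 - q * N / (q - 1))
    checked += 1
    if not lemma5_holds(q, N, eps, alpha1):
        failures += 1
print(checked, failures)
```

No failures are expected, since the first claim is a rearrangement of the two defining relations and the second follows from ν_1 ≥ (1 − α_1)(N + 1/q).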

We now show that the function

f = f_{α_{1}, ν_{1}}

is well-defined and that its values lie in the interval

(0, q_{0})

. By Lemma 5,

α_{1} < 1

and

0 < f (1, 0) = \frac{1 - α_{1}}{ν_{1}} < q_{0}

. Let now

α < 1

. In this case the function

g (κ) = \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α - 1}{κ} + ν\}

is a strictly increasing continuous function of

κ

, which is

- \infty

at

κ = 0

. Furthermore, by Lemma 3,

ϕ_{ϵ} (α) < \frac{α - 1}{q_{0}}

, implying that

g (q_{0}) > ϕ_{ϵ} (α) + ν

. Hence, by the intermediate value theorem, there exists a unique

0 < κ < q_{0}

for which

ϕ_{ϵ} (α) + ν = \max \{\frac{α_{1} - 1}{κ} + ν_{1}, \frac{α - 1}{κ} + ν\}

.
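The existence argument above is constructive: since g is strictly increasing in κ, the value f(α, ν) can be located by bisection. The following is a minimal sketch in which `phi_value` is a hypothetical stand-in for ϕ_ϵ(α), supplied by the caller and assumed to lie below (α − 1)/q_0; the numerical inputs are illustrative only.

```python
def solve_kappa(alpha1, nu1, alpha, nu, phi_value, q0, iters=200):
    """Find kappa in (0, q0) with phi_value + nu = g(kappa), for alpha < 1.

    phi_value is a stand-in for phi_eps(alpha); it must be smaller than
    (alpha - 1) / q0 so that g(q0) exceeds the target value.
    """
    target = phi_value + nu

    def g(kappa):
        return max((alpha1 - 1) / kappa + nu1, (alpha - 1) / kappa + nu)

    lo, hi = 0.0, q0      # g tends to -infinity as kappa -> 0+, and g(q0) > target
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi

# Illustrative inputs: q0 = 1.8, (alpha1, nu1) = (0.2, 0.5), (alpha, nu) = (0.6, 0.1).
kappa = solve_kappa(alpha1=0.2, nu1=0.5, alpha=0.6, nu=0.1,
                    phi_value=-0.225, q0=1.8)
print(kappa)
```

For these inputs the first branch of the maximum is the active one, and solving −0.8/κ + 0.5 = −0.125 by hand gives κ = 1.28, which the bisection reproduces.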

Next, we argue that f is continuous on

Ω

. Let

(α, ν) \in Ω

. If

α < 1

, then there exists a compact neighborhood of

(α, ν)

in which both one-sided derivatives of

g (κ)

are positive and bounded. This, together with the fact that

ϕ_{ϵ} (α) + ν

is continuous, implies that f is continuous at

(α, ν)

.

It remains to argue that f is continuous at

(1, 0)

. Let O be a sufficiently small neighbourhood of

(1, 0)

in

Ω

. Let

(α, ν) \in O

, with

α < 1

. Then

ϕ_{ϵ} (α) + ν

is close to

ϕ_{ϵ} (1) + 0 = 0

. We would like to claim that

f (α, ν)

is close to

f (1, 0) = \frac{1 - α_{1}}{ν_{1}}

. In fact, assume towards a contradiction that

f (α, ν)

is significantly larger than

\frac{1 - α_{1}}{ν_{1}}

. In this case

ϕ_{ϵ} (α) + ν = \max \{\frac{α_{1} - 1}{f (α, ν)} + ν_{1}, \frac{α - 1}{f (α, ν)} + ν\} \geq \frac{α_{1} - 1}{f (α, ν)} + ν_{1}

is significantly larger than 0 (taking into account that

α_{1} < 1

), reaching a contradiction. On the other hand, assume that

f (α, ν)

is significantly smaller than

\frac{1 - α_{1}}{ν_{1}}

, and hence significantly smaller than

q_{0}

(by the second claim of Lemma 5). Recall that

ϕ_{ϵ} (1) = 0

and that

ϕ_{ϵ}^{'} (1) = \frac{1}{q_{0}}

. Hence

ϕ_{ϵ} (α) = \frac{α - 1}{q_{0}} + O ({(1 - α)}^{2}) > \frac{α - 1}{f (α, ν)}

. This means that

ϕ_{ϵ} (α) + ν = \frac{α_{1} - 1}{f (α, ν)} + ν_{1}

, which is significantly smaller than 0, again reaching a contradiction. This completes the proof of Lemma 4.

We collect some useful properties of

f = f_{α_{1}, ν_{1}}

in the following claim.

Corollary 6.

1.: For any $(α, ν) \in Ω$ we have $ϕ_{ϵ} (α) + ν = \max \{\frac{α_{1} - 1}{f (α, ν)} + ν_{1}, \frac{α - 1}{f (α, ν)} + ν\}$ .
2.: $0 < f \leq M (α_{1}, ν_{1}) < q_{0}$ on Ω.
3.: For any $(α, ν) \in Ω$ we have $f (α, ν) \leq \frac{α - 1}{ϕ_{ϵ} (α)}$ . (If $α = 1$ we replace the RHS of this inequality with $q_{0}$ .)

Proof.

The first two claims follow immediately from the preceding discussion and from the continuity of f. For the third claim, recall that

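ϕ_{ϵ} (α) + ν = \max \{\frac{α_{1} - 1}{f (α, ν)} + ν_{1}, \frac{α - 1}{f (α, ν)} + ν\} \geq \frac{α - 1}{f (α, ν)} + ν

. Hence $ϕ_{ϵ} (α) \geq \frac{α - 1}{f (α, ν)}$ and, since $f (α, ν) > 0$ and $ϕ_{ϵ} (α) < 0$ for $α < 1$ , rearranging gives the claim.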

□

3.2. Proof of Proposition 3

Let

(α_{1}, ν_{1})

be given, let

f = f_{α_{1}, ν_{1}}

, and let

M = M (α_{1}, ν_{1}) = \max_{Ω} f

. Let

(α^{*}, ν^{*})

be a maximum point of f. Then

f (α^{*}, ν^{*}) = M

and hence

ϕ_{ϵ} (α^{*}) + ν^{*} = \max \{\frac{α_{1} - 1}{M} + ν_{1}, \frac{α^{*} - 1}{M} + ν^{*}\}

. Clearly either

\frac{α_{1} - 1}{M} + ν_{1} \neq \frac{α^{*} - 1}{M} + ν^{*}

or

\frac{α_{1} - 1}{M} + ν_{1} = \frac{α^{*} - 1}{M} + ν^{*}

. In the first case we say that

(α^{*}, ν^{*})

is a maximum point of the first type, and otherwise it is a maximum point of the second type.

The following two claims constitute the main steps of the proof of Proposition 3. They describe the respective behavior of maximum points of the first and the second type.

Lemma 6.

Let

(α^{*}, ν^{*})

be a maximum point of

f

of the first type. Then the following two claims hold.

$\frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} > \frac{α^{*} - 1}{f (α^{*}, ν^{*})} + ν^{*}$ .
$α^{*} \leq 1 - \frac{q N}{q - 1}$ .

Lemma 7.

If

(α_{1}, ν_{1}) = (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

, then

(1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

is the unique maximum point of f. This is a maximum point of the second type.

If

(α_{1}, ν_{1}) \neq (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

, then there are two possible cases.

$\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \geq q$ . Let $(α^{*}, ν^{*})$ be a maximum point of $f$ of the second type in this case. Then $α^{*} \leq 1 - \frac{q N}{q - 1}$ .
$\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q$ . In this case f has a unique maximum point $(α^{*}, ν^{*})$ . This point is of the second type. Furthermore, $α^{*} > 1 - \frac{q N}{q - 1}$ , and it is uniquely determined by the following identity:

$\frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} = \frac{α^{*} - α_{1}}{α^{*} - (1 - ν_{1})} .$

Lemmas 6 and 7 will be proved in Section 3.3. At this point we prove Proposition 3 assuming these lemmas hold.

We start with the first claim of Proposition 3. Let

α_{1} = 1 - \frac{q N}{q - 1}

and

ν_{1} = \frac{q N}{q - 1}

. Let

f = f_{α_{1}, ν_{1}}

. By the first claim of Lemma 7, we have

M (α_{1}, ν_{1}) = f (α_{1}, ν_{1}) = \frac{α_{1} - 1}{ϕ_{ϵ} (α_{1})} = \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} .

We pass to the second claim of the proposition. Assume that

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \geq q

. Let

f = f_{α_{1}, ν_{1}}

, for some

α_{1}

and

ν_{1}

. Let

(α^{*}, ν^{*})

be a maximum point of f. Then Lemmas 6 and 7 imply that

α^{*} \leq 1 - \frac{q N}{q - 1}

. Hence

M (α_{1}, ν_{1}) = f (α^{*}, ν^{*}) \leq \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} \leq \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} = M (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1}) .

Here in the second step we have used the third claim of Corollary 6, in the third step the third claim of Lemma 3 and in the fourth step the first claim of the proposition.

We pass to the third claim of the proposition. Assume that

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q

. Let

f = f_{α_{1}, ν_{1}}

, for some

α_{1}

and

ν_{1}

. Then, by Lemma 7, f has a unique maximum point

(α^{*}, ν^{*})

. This means that

α^{*}

is determined by

α_{1}

and

ν_{1}

, and furthermore, since

ν_{1} = N + \frac{1 - α_{1}}{q}

,

α^{*}

is a function of

α_{1}

. We will show the following claim below.

Lemma 8.

If

(α_{1}, ν_{1}) \neq (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

and

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q

, then

α^{*}

is a decreasing function of

α_{1}

.

Assume Lemma 8 to hold. We have

M (α_{1}, ν_{1}) = f (α^{*}, ν^{*}) = \frac{α^{*} (α_{1}) - 1}{ϕ_{ϵ} (α^{*} (α_{1}))} \leq \frac{α^{*} (0) - 1}{ϕ_{ϵ} (α^{*} (0))} = M (0, N + \frac{1}{q}) .

The second step uses the fact that

(α^{*}, ν^{*})

is a maximum point of the second type, and hence

f (α^{*}, ν^{*}) = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})}

. The third step uses Lemma 8 and the third claim of Lemma 3, and the fourth step the fact that

α_{1} = 0

implies

ν_{1} = N + \frac{1}{q}

.

Next, by Lemma 7,

α = α^{*} (0)

is determined by the identity

\frac{α - 1}{ϕ_{ϵ} (α)} = \frac{α}{α - (\frac{q - 1}{q} - N)}

which, after rearranging, gives

1 - α - \frac{α ϕ_{ϵ} (α)}{1 - α} = N + \frac{1}{q}

. Hence, by the fourth claim of Lemma 3,

α^{*} (0) = α_{0}

and

M (0, N + \frac{1}{q}) = \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})}

.
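The rearrangement used above is short enough to spell out. The following is a sketch of the algebra, writing c for (q − 1)/q − N (a shorthand introduced here, not in the paper) and assuming α < 1, so that ϕ_ϵ(α) ≠ 0 and division by α − 1 is legitimate.

```latex
% Shorthand (not in the paper): c = \frac{q-1}{q} - N, so that 1 - c = N + \frac{1}{q}.
\frac{\alpha - 1}{\phi_{\epsilon}(\alpha)} = \frac{\alpha}{\alpha - c}
\;\Longleftrightarrow\;
\alpha - c = \frac{\alpha\,\phi_{\epsilon}(\alpha)}{\alpha - 1}
           = -\frac{\alpha\,\phi_{\epsilon}(\alpha)}{1 - \alpha}
\;\Longleftrightarrow\;
1 - \alpha - \frac{\alpha\,\phi_{\epsilon}(\alpha)}{1 - \alpha} = 1 - c = N + \frac{1}{q}.
```

The left-hand side of the last identity is the function g of the fourth claim of Lemma 3, which is why its strict monotonicity pins down α uniquely.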

To conclude the proof of the third claim of the proposition, observe that since

α^{*} > 1 - \frac{q N}{q - 1}

, we have

M (α_{1}, ν_{1}) = f (α^{*}, ν^{*}) = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} > \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})},

where the last inequality is by the third claim of Lemma 3. This completes the proof of Proposition 3.

It remains to prove Lemmas 6–8.

3.3. Proofs of the Remaining Lemmas

Proof of Lemma 6.

We start with the first claim of the lemma. Assume towards a contradiction that

\frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} < \frac{α^{*} - 1}{f (α^{*}, ν^{*})} + ν^{*}

. Since f is a positive continuous function on

Ω

, there is a neighborhood O of

(α^{*}, ν^{*})

in

Ω

on which

\frac{α_{1} - 1}{f (α, ν)} + ν_{1} < \frac{α - 1}{f (α, ν)} + ν

. This means that any point

(α, ν) \in O

satisfies

ϕ_{ϵ} (α) + ν = \frac{α - 1}{f (α, ν)} + ν

, and hence

f (α, ν) = \frac{α - 1}{ϕ_{ϵ} (α)}

. Since

f (α^{*}, ν^{*}) \geq f (α, ν)

, this implies that

\frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} \geq \frac{α - 1}{ϕ_{ϵ} (α)}

, and hence, by the third claim of Lemma 3, that

α^{*} \geq α

. It follows that

α^{*}

has to be 1, and hence

(α^{*}, ν^{*}) = (1, 0)

. However, in this case

\frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} = \frac{α^{*} - 1}{f (α^{*}, ν^{*})} + ν^{*} = 0

, reaching a contradiction.

We pass to the second claim of the lemma. By the first claim

\frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} > \frac{α^{*} - 1}{f (α^{*}, ν^{*})} + ν^{*}

. We claim that this implies that

(α^{*}, ν^{*})

is a local maximum of

ϕ_{ϵ} (α) + ν

. In fact, arguing as above, there is a neighborhood O of

(α^{*}, ν^{*})

on which

\frac{α_{1} - 1}{f (α, ν)} + ν_{1} > \frac{α - 1}{f (α, ν)} + ν

. This means that for any point

(α, ν) \in O

we have

ϕ_{ϵ} (α) + ν = \frac{α_{1} - 1}{f (α, ν)} + ν_{1}

. Since

f (α^{*}, ν^{*}) \geq f (α, ν)

, this implies that

ϕ_{ϵ} (α) + ν \leq ϕ_{ϵ} (α^{*}) + ν^{*}

. To complete the proof, recall that any local maximum

(α, ν)

of

ϕ_{ϵ} (α) + ν

has

α \leq 1 - \frac{q N}{q - 1}

(as shown in the proof of Proposition 1). □

Proof of Lemma 7.

Let

(α^{*}, ν^{*})

be a maximum point of f of the second type. The first observation is that

(α^{*}, ν^{*})

has to lie on the upper boundary of

Ω

. In fact, assume not. Then for a sufficiently small

τ > 0

the point

(α, ν) = (α^{*}, ν^{*} + τ)

is in

Ω

. Since

f (α, ν) \leq f (α^{*}, ν^{*})

, we have

ϕ_{ϵ} (α) + ν > ϕ_{ϵ} (α^{*}) + ν^{*} = \frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} \geq \frac{α_{1} - 1}{f (α, ν)} + ν_{1}

. Hence

f (α, ν)

is determined by the equality

ϕ_{ϵ} (α) + ν = \frac{α - 1}{f (α, ν)} + ν

, which implies

f (α, ν) = f (α^{*}, ν^{*}) = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})}

. Hence

(α, ν)

is a point of maximum of f of the first type with

\frac{α_{1} - 1}{f (α, ν)} + ν_{1} < \frac{α - 1}{f (α, ν)} + ν

. This, however, contradicts the first claim of Lemma 6.

Recall that the upper boundary of

Ω

is a piecewise linear curve which starts as the straight line

\frac{α}{q} + ν = N + \frac{1}{q}

, for

0 \leq α \leq 1 - \frac{q N}{q - 1}

and continues as the straight line

α + ν = 1

for

1 - \frac{q N}{q - 1} \leq α \leq 1

. Hence there are two cases to consider: In the first case

α^{*} \leq 1 - \frac{q N}{q - 1}

and

\frac{α^{*}}{q} + ν^{*} = N + \frac{1}{q}

. In the second case

1 - \frac{q N}{q - 1} < α^{*} \leq 1

and

α^{*} + ν^{*} = 1

.

Assume that the second case holds. Then

(α^{*}, ν^{*})

satisfies

$\frac{α_{1} - 1}{f (α^{*}, ν^{*})} + ν_{1} = \frac{α^{*} - 1}{f (α^{*}, ν^{*})} + ν^{*} = ϕ_{ϵ} (α^{*}) + ν^{*}$ .
$1 - \frac{q N}{q - 1} < α^{*} \leq 1$ and $α^{*} + ν^{*} = 1$ .

In particular,

f (α^{*}, ν^{*}) = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} = \frac{α^{*} - α_{1}}{α^{*} - (1 - ν_{1})}

. Consider the following two functions of

α

:

g_{1} (α) = \frac{α - 1}{ϕ_{ϵ} (α)}

and

g_{2} (α) = \frac{α - α_{1}}{α - (1 - ν_{1})}

, for

α > 1 - \frac{q N}{q - 1}

. Note that

g_{2}

is well-defined since, by Lemma 5,

ν_{1} \geq \frac{q N}{q - 1}

. By the third claim of Lemma 3,

g_{1}

is strictly increasing. On the other hand,

g_{2} (α) = 1 + \frac{1 - α_{1} - ν_{1}}{α - (1 - ν_{1})}

is non-increasing. Note also that

g_{1} (1) = q_{0}

(more precisely,

\lim_{α \to 1} g_{1} (α) = q_{0}

) and, by Lemma 5,

g_{2} (1) = \frac{1 - α_{1}}{ν_{1}} < q_{0}

. This means that

g_{1}

and

g_{2}

coincide at a (unique) point

1 - \frac{q N}{q - 1} < α < 1

iff

g_{1} (1 - \frac{q N}{q - 1}) < g_{2} (1 - \frac{q N}{q - 1})

.

Observe that if

(α_{1}, ν_{1}) = (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

then

g_{2}

is the constant 1-function. Furthermore, by the first and the third claims of Lemma 3,

g_{1} (1 - \frac{q N}{q - 1}) \geq g_{1} (0) = \frac{2}{\log_{2} (4 / q_{0})} \geq 1

, and hence in this case

g_{1}

and

g_{2}

cannot coincide for

α > 1 - \frac{q N}{q - 1}

. If

(α_{1}, ν_{1}) \neq (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

then it is easy to see (recall that

\frac{α_{1}}{q} + ν_{1} = N + \frac{1}{q}

) that

g_{2} (1 - \frac{q N}{q - 1}) = q

, and hence the two functions have a unique intersection at some

α > 1 - \frac{q N}{q - 1}

iff

g_{1} (1 - \frac{q N}{q - 1}) = \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})}

is smaller than q.
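The identity $g_{2} (1 - \frac{q N}{q - 1}) = q$ used above can be sanity-checked in exact rational arithmetic. The sketch below is only an illustration: the sample values of q, N and α_1 are hypothetical choices (not taken from the paper), and ν_1 is read off the line α_1/q + ν_1 = N + 1/q.

```python
from fractions import Fraction


def g2(alpha, alpha1, nu1):
    """g_2(alpha) = (alpha - alpha_1) / (alpha - (1 - nu_1))."""
    return (alpha - alpha1) / (alpha - (1 - nu1))


def g2_at_breakpoint(q, N, alpha1):
    """Evaluate g_2 at alpha = 1 - qN/(q-1), with nu_1 read off the line
    alpha_1/q + nu_1 = N + 1/q."""
    nu1 = N + 1 / q - alpha1 / q
    alpha_b = 1 - q * N / (q - 1)
    return g2(alpha_b, alpha1, nu1)


# Hypothetical sample values; alpha_1 must differ from the breakpoint 1 - qN/(q-1).
q, N, alpha1 = Fraction(3, 2), Fraction(1, 10), Fraction(1, 5)
print(g2_at_breakpoint(q, N, alpha1) == q)  # exact rational comparison
```

The reason behind the identity is that, on this line, the denominator ν_1 - qN/(q-1) equals 1/q times the numerator 1 - qN/(q-1) - α_1.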

To recap, the second case can hold only provided

(α_{1}, ν_{1}) \neq (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

and

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q

. Furthermore, if it holds then

1 - \frac{q N}{q - 1} < α^{*} < 1

is uniquely determined by the equality

g_{1} (α^{*}) = g_{2} (α^{*})

.

We can now complete the proof of the lemma. First, let

(α_{1}, ν_{1}) = (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

. By the preceding discussion, in this case a maximum point

(α^{*}, ν^{*})

of f of the second type has to have

α^{*} \leq α_{1}

. Moreover, taking into account Lemma 6, this is true for any maximum point of f. By the third claim of Corollary 6, this means that

M (α_{1}, ν_{1}) \leq \frac{α_{1} - 1}{ϕ_{ϵ} (α_{1})} = f (α_{1}, ν_{1})

. Hence

(α_{1}, ν_{1})

is a maximum point of f. It is trivially a maximum point of the second type. To see that it is a unique maximum point, note that for any point

(α, ν)

on the upper boundary of

Ω

, if

α = α_{1}

, then necessarily

ν = ν_{1}

. So, for any other putative maximum point

(α, ν)

, we would have

α < α_{1}

and hence, by the third claim of Lemma 3 and the third claim of Corollary 6,

f (α, ν) \leq \frac{α - 1}{ϕ_{ϵ} (α)} < \frac{α_{1} - 1}{ϕ_{ϵ} (α_{1})} = f (α_{1}, ν_{1})

. This proves the first claim of the lemma.

Assume now that

(α_{1}, ν_{1}) \neq (1 - \frac{q N}{q - 1}, \frac{q N}{q - 1})

. Let

(α^{*}, ν^{*})

be a maximum point of f of the second type. If

g_{1} (1 - \frac{q N}{q - 1}) = \frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} \geq q

, then the preceding discussion implies that

α^{*} \leq 1 - \frac{q N}{q - 1}

, proving the second claim of the lemma.

If

\frac{- \frac{q N}{q - 1}}{ϕ_{ϵ} (1 - \frac{q N}{q - 1})} < q

, let

α

be the unique solution for

g_{1} (α) = g_{2} (α)

on

1 - \frac{q N}{q - 1} < α < 1

. Set

α^{*} = α

and

ν^{*} = 1 - α

. We claim that

(α^{*}, ν^{*})

is the unique maximum point of f (note that by Lemma 6 it would necessarily be of the second type). In fact, let us first verify that

\frac{α_{1} - 1}{κ} + ν_{1} = \frac{α^{*} - 1}{κ} + ν^{*} = ϕ_{ϵ} (α^{*}) + ν^{*}

, for

κ = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})}

. The second equality is immediate, by the definition of

κ

. The first equality is equivalent to

κ = \frac{α^{*} - α_{1}}{α^{*} - (1 - ν_{1})}

, which follows from the definitions of

α^{*}

and

κ

. Hence

f (α^{*}, ν^{*}) = κ = \frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})}

. For any other putative maximum point

(α, ν)

, we would have, by the preceding discussion, that

α \leq 1 - \frac{q N}{q - 1} < α^{*}

and hence, as above,

f (α, ν) \leq \frac{α - 1}{ϕ_{ϵ} (α)} < f (α^{*}, ν^{*})

. This proves the third claim of the lemma. □

Proof of Lemma 8.

Under the assumptions of the lemma,

α^{*}

is the unique solution on

(1 - \frac{q N}{q - 1}, 1)

of the identity

\frac{α^{*} - 1}{ϕ_{ϵ} (α^{*})} = \frac{α^{*} - α_{1}}{α^{*} - (1 - ν_{1})} .

Here the LHS is a strictly increasing and the RHS a strictly decreasing (since by assumption

α_{1} \neq 1 - \frac{q N}{q - 1}

, and hence

α_{1} + ν_{1} < 1

) functions of

α^{*}

. It follows that to prove the claim of the lemma it suffices to show that for a fixed

α^{*} > 1 - \frac{q N}{q - 1}

the RHS is a decreasing function of

α_{1}

(keeping in mind that

ν_{1} = - \frac{α_{1}}{q} + (N + \frac{1}{q})

). However, this is easily verifiable by a direct differentiation of the RHS w.r.t.

α_{1}

. □
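The differentiation step in the proof of Lemma 8 can be made explicit. Writing a for α^{*} and c = a - 1 + N + 1/q, the RHS is (a - α_1)/(c - α_1/q), whose derivative in α_1 is (a/q - c)/(c - α_1/q)^2, and a/q - c = ((q - 1)/q)(1 - a) - N, which is negative exactly when a > 1 - qN/(q - 1). The following sketch checks this on hypothetical sample values (chosen for illustration only).

```python
from fractions import Fraction


def rhs(alpha_star, alpha1, q, N):
    """RHS of the identity, with nu_1 = -alpha_1/q + (N + 1/q) substituted."""
    nu1 = -alpha1 / q + (N + 1 / q)
    return (alpha_star - alpha1) / (alpha_star - (1 - nu1))


def slope_numerator(alpha_star, q, N):
    """Numerator of d(RHS)/d(alpha_1); the denominator is a square.

    Differentiating (a - t) / (c - t/q) in t, with c = a - 1 + N + 1/q,
    gives (a/q - c) / (c - t/q)^2, and a/q - c = (q-1)/q * (1 - a) - N.
    """
    return (q - 1) / q * (1 - alpha_star) - N


q, N = Fraction(3, 2), Fraction(1, 10)   # hypothetical sample values
alpha_star = Fraction(4, 5)              # lies above 1 - qN/(q-1) = 7/10
values = [rhs(alpha_star, Fraction(k, 10), q, N) for k in range(0, 7)]
print(slope_numerator(alpha_star, q, N) < 0)
print(all(u > v for u, v in zip(values, values[1:])))
```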

This completes the proof of Proposition 2 and of (5). We proceed to complete the proof of Theorem 1. The tightness of (5) follows from the tightness of (6), similarly to the way the tightness of (4) was shown in the proof of Proposition 1. We omit the details.

It remains to consider the properties of the function

κ_{2, q}

. We first remark that it is easy to see, using the properties of the function

ϕ_{ϵ}

given in Lemma 3, that

κ_{2, q}

is a continuous function of its first variable (we omit the details). In particular, we can replace strict inequalities with non-strict ones in the definition of

κ_{2, q}

in Definition 1. Now there are two cases to consider.

$q \geq q_{0}$ . In this case, by the third claim of Lemma 3, $- \frac{x}{ϕ_{ϵ} (1 - x)}$ is never larger than q, and hence

$κ_{2, q} (x, ϵ) = \{\begin{matrix} q_{0} & if & y \leq \frac{1}{q_{0}} \\ \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})} & if & y \geq \frac{1}{q_{0}} \end{matrix}$

Here $y = \frac{q - 1}{q} \cdot x + \frac{1}{q}$ , $q_{0} = 1 + {(1 - 2 ϵ)}^{2}$ , and $α_{0}$ is determined by $1 - α_{0} - \frac{α_{0} ϕ_{ϵ} (α_{0})}{1 - α_{0}} = y$ . Note that $α_{0}$ is well-defined, by the fourth claim of Lemma 3. The fact that $κ_{2, q}$ is decreasing in x follows from combining the third and the fourth claims of Lemma 3. In fact, $κ_{2, q}$ is a constant- $(1 + {(1 - 2 ϵ)}^{2})$ function for $0 \leq x \leq \frac{q - q_{0}}{(q - 1) q_{0}}$ , and it is strictly decreasing for larger x.
$q < q_{0}$ . In this case y is always greater than $\frac{1}{q_{0}}$ and we have that

$κ_{2, q} (x, ϵ) = \{\begin{matrix} - \frac{x}{ϕ_{ϵ} (1 - x)} & if & - \frac{x}{ϕ_{ϵ} (1 - x)} \geq q \\ \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})} & if & - \frac{x}{ϕ_{ϵ} (1 - x)} \leq q \end{matrix}$

It suffices to show that $κ_{2, q}$ is decreasing on both relevant subintervals of $[0, 1]$ , and this again follows from the third and the fourth claims of Lemma 3. In this case $κ_{2, q}$ is strictly decreasing on $[0, 1]$ .

This completes the proof of Theorem 1. □

4. Remaining Proofs

Proof of Lemma 3.

The strict concavity of

ϕ_{ϵ}

and the bounds on its derivative were shown in [9], Lemma 2.13 (note that

ϕ_{ϵ} (x) = \frac{1}{2} \tilde{ϕ} (x, 2 ϵ (1 - ϵ))

in the notation of [9]). The values of

ϕ_{ϵ}

at the endpoints of the interval

[0, 1]

are directly computable.

We pass to the third claim of the lemma. Taking the derivative and rearranging, it suffices to prove that for any

α \in (0, 1)

holds

ϕ_{ϵ} (α) > (α - 1) ϕ_{ϵ}^{'} (α)

. This follows immediately from the strict concavity of

ϕ_{ϵ}

and the fact that

ϕ_{ϵ} (1) = 0

.

We pass to the last claim of the lemma. Taking the derivative and rearranging, it suffices to prove that for any

α \in (0, 1)

holds

(1 - α) (α ϕ_{ϵ}^{'} (α) + (1 - α)) > - ϕ_{ϵ} (α) .

Since

(1 - α) \cdot ϕ_{ϵ}^{'} (α) > - ϕ_{ϵ} (α)

, it suffices to show that

α ϕ_{ϵ}^{'} (α) + (1 - α) \geq ϕ_{ϵ}^{'} (α)

, and this follows from the first two claims of the lemma. The values of the function g at the endpoints are directly computable. □
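For readers who want to see the third claim of Lemma 3 numerically, here is a minimal sketch. It evaluates ϕ_ϵ through the relation ϕ_ϵ(x) = ½ \tilde{ϕ}(x, 2ϵ(1 - ϵ)) and the closed form of \tilde{ϕ} recalled in the proof of Lemma 9, and checks on a grid that g_1(α) = (α - 1)/ϕ_ϵ(α) increases from 2/log_2(4/q_0) towards q_0. The function names, the bisection-based inverse entropy, the grid, and the value ϵ = 0.1 are illustrative assumptions.

```python
import math


def H(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def H_inv(t):
    """Inverse of H restricted to [0, 1/2], computed by bisection."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if H(mid) < t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def phi(x, eps):
    """phi_eps(x) = 0.5 * phi_tilde(x, d) with d = 2 eps (1 - eps), 0 < eps < 1/2."""
    d = 2 * eps * (1 - eps)
    s = H_inv(x)
    z = (-d * d + d * math.sqrt(d * d + 4 * (1 - 2 * d) * s * (1 - s))) / (2 * (1 - 2 * d))
    first = s * H(z / s) if s > 0 else 0.0
    second = (1 - s) * H(z / (1 - s))
    tilde = x - 1 + first + second + 2 * z * math.log2(d) + (1 - 2 * z) * math.log2(1 - d)
    return 0.5 * tilde


def g1(alpha, eps):
    """g_1(alpha) = (alpha - 1) / phi_eps(alpha), for 0 <= alpha < 1."""
    return (alpha - 1) / phi(alpha, eps)


eps = 0.1
q0 = 1 + (1 - 2 * eps) ** 2
grid = [g1(k / 20, eps) for k in range(20)]
print(all(u < v for u, v in zip(grid, grid[1:])))   # monotonicity on the grid
print(g1(0.0, eps), 2 / math.log2(4 / q0))          # value at alpha = 0
print(g1(1 - 1e-4, eps), q0)                        # behaviour near alpha = 1
```

A grid check of this kind is of course not a proof; it only illustrates the monotonicity that the tangent-line argument establishes.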

Proof of Lemma 1.

We start with a technical lemma which deals with the behavior of the function

ϕ_{ϵ} (x)

in the vicinity of

ϵ = 0

. We write

ϵ \sim 0

as a shorthand for “

ϵ

close to 0”. We again use the fact that

ϕ_{ϵ} (x) = Φ (x, 2 ϵ (1 - ϵ)) = \frac{1}{2} \tilde{ϕ} (x, 2 ϵ (1 - ϵ))

, where the function

\tilde{ϕ}

was defined and studied in [9]. In the calculations below

ϕ (x, ϵ)

is written instead of

ϕ_{ϵ} (x)

, for notational convenience.

Lemma 9.

Let

0 < t < 1

. Then

1. $ϕ (t, 0) = \frac{t - 1}{2}$, and for $ϵ \sim 0$ holds $| ϕ (t, ϵ) - \frac{t - 1}{2} | \leq O (ϵ)$.
2. $\frac{\partial ϕ}{\partial ϵ} (t, 0) = \frac{2 \sqrt{H^{- 1} (t) (1 - H^{- 1} (t))} - 1}{\ln (2)}$, and for $ϵ \sim 0$ holds

$| \frac{\partial ϕ}{\partial ϵ} (t, ϵ) - \frac{2 \sqrt{H^{- 1} (t) (1 - H^{- 1} (t))} - 1}{\ln (2)} | \leq O (ϵ)$.
3. $\frac{\partial ϕ}{\partial t} (t, 0) = \frac{1}{2}$, and for $ϵ \sim 0$ holds $| \frac{\partial ϕ}{\partial t} (t, ϵ) - \frac{1}{2} | \leq O (ϵ)$.

Proof of Lemma 9.

Notation. Here and below we write

a \pm ϵ

as a shorthand for the interval

[a - ϵ, a + ϵ]

.

Recall that

\tilde{ϕ} (t, ϵ) = t - 1 + σ H (\frac{z}{σ}) + (1 - σ) H (\frac{z}{1 - σ}) + 2 z \log_{2} (ϵ) + (1 - 2 z) \log_{2} (1 - ϵ),

where

σ = H^{- 1} (t)

and

z = z (t, ϵ) = \frac{- ϵ^{2} + ϵ \sqrt{ϵ^{2} + 4 (1 - 2 ϵ) σ (1 - σ)}}{2 (1 - 2 ϵ)}

.

The fact that

ϕ (t, 0) = \frac{1}{2} \tilde{ϕ} (t, 0) = \frac{t - 1}{2}

is verified by inspection, observing that

z (t, 0) = 0

for any t. Note also that, by assumption,

σ > 0

, and hence

z (t, ϵ) \in \sqrt{σ (1 - σ)} \cdot ϵ \pm O (ϵ^{2})

for a sufficiently small

ϵ

.
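The formula for z is the positive root of the quadratic (1 - 2ϵ) z^2 + ϵ^2 z - ϵ^2 σ(1 - σ) = 0, which is the same as saying (σ - z)(1 - σ - z)/z^2 = (1 - ϵ)^2/ϵ^2. The sketch below checks this numerically, together with the first-order behaviour z ≈ \sqrt{σ(1 - σ)} · ϵ. The value σ = 0.11 and the sample values of ϵ are arbitrary illustrative choices.

```python
import math


def z_of(sigma, eps):
    """z(t, eps) as a function of sigma = H^{-1}(t), for 0 < eps < 1/2."""
    disc = eps ** 2 + 4 * (1 - 2 * eps) * sigma * (1 - sigma)
    return (-eps ** 2 + eps * math.sqrt(disc)) / (2 * (1 - 2 * eps))


sigma = 0.11  # hypothetical sample value of H^{-1}(t)
for eps in (1e-1, 1e-2, 1e-3):
    z = z_of(sigma, eps)
    lhs = (sigma - z) * (1 - sigma - z) / z ** 2
    rhs = (1 - eps) ** 2 / eps ** 2
    gap = abs(z - math.sqrt(sigma * (1 - sigma)) * eps)
    # The identity holds up to rounding; the gap to the linear term is O(eps^2).
    print(math.isclose(lhs, rhs, rel_tol=1e-9), gap <= eps ** 2)
```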

Using (as in the proof of Lemma 2.13 in [9]) the fact that for

ϵ > 0

holds

\frac{(σ - z) (1 - σ - z)}{z^{2}} = \frac{{(1 - ϵ)}^{2}}{ϵ^{2}}

, and writing

δ = 2 ϵ (1 - ϵ)

, we have that

\frac{\partial ϕ (t, ϵ)}{\partial ϵ} = \frac{1}{2} \cdot \frac{\partial \tilde{ϕ} (t, δ)}{\partial ϵ} = \frac{1 - 2 ϵ}{\ln (2)} \cdot \frac{2 z - δ}{δ (1 - δ)} .

Hence for

ϵ \sim 0

we have

\frac{\partial ϕ (t, ϵ)}{\partial ϵ} \in \frac{2 \sqrt{σ (1 - σ)} - 1}{\ln (2)} \pm O (δ)

, or equivalently

\frac{\partial ϕ (t, ϵ)}{\partial ϵ} \in \frac{2 \sqrt{σ (1 - σ)} - 1}{\ln (2)} \pm O (ϵ)

.

In particular,

{\frac{\partial ϕ (t, ϵ)}{\partial ϵ}}_{| ϵ = 0} = \lim_{ϵ \to 0} \frac{\partial ϕ (t, ϵ)}{\partial ϵ} = \frac{2 \sqrt{σ (1 - σ)} - 1}{\ln (2)} = \frac{2 \sqrt{H^{- 1} (t) (1 - H^{- 1} (t))} - 1}{\ln (2)} .

This proves both the first and the second claims of the lemma.

We pass to the third claim of the lemma. As shown in the proof of Lemma 2.13 in [9], we have

\frac{\partial \tilde{ϕ}}{\partial t} (t, ϵ) = \frac{\ln (\frac{1 - σ - z}{σ - z})}{\ln (\frac{1 - σ}{σ})}

. Hence

\frac{\partial ϕ (t, ϵ)}{\partial t} = \frac{1}{2} \cdot \frac{\partial \tilde{ϕ} (t, δ)}{\partial t} = \frac{1}{2} \cdot \frac{\ln (\frac{1 - σ - z}{σ - z})}{\ln (\frac{1 - σ}{σ})},

where

z = z (t, δ)

. Recall that for any

0 < t < 1

we have

z (t, 0) = 0

and in addition for

δ \sim 0

we have

z (t, δ) \in \sqrt{σ (1 - σ)} \cdot δ \pm O (δ^{2})

. The third claim of the lemma now follows by inspection. This completes the proof of Lemma 9. □
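The three claims of Lemma 9 can be illustrated with finite differences. The sketch below implements ϕ(t, ϵ) = ½ \tilde{ϕ}(t, 2ϵ(1 - ϵ)) from the closed form recalled above and compares difference quotients at a small ϵ with the limiting values stated in the lemma. The step sizes, tolerances and helper names are assumptions made for this illustration only.

```python
import math


def H(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def H_inv(t):
    """Inverse of H on [0, 1/2] by bisection."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if H(mid) < t else (lo, mid)
    return (lo + hi) / 2


def phi(t, eps):
    """phi(t, eps) = 0.5 * phi_tilde(t, d), d = 2 eps (1 - eps), 0 < eps < 1/2."""
    d = 2 * eps * (1 - eps)
    s = H_inv(t)
    z = (-d * d + d * math.sqrt(d * d + 4 * (1 - 2 * d) * s * (1 - s))) / (2 * (1 - 2 * d))
    first = s * H(z / s) if s > 0 else 0.0
    second = (1 - s) * H(z / (1 - s))
    return 0.5 * (t - 1 + first + second + 2 * z * math.log2(d) + (1 - 2 * z) * math.log2(1 - d))


def d_eps_at_zero(t, h=1e-5):
    """One-sided difference quotient for d(phi)/d(eps) at eps = 0, using phi(t, 0) = (t-1)/2."""
    return (phi(t, h) - (t - 1) / 2) / h


def d_t(t, eps=1e-5, dt=1e-3):
    """Central difference quotient for d(phi)/dt at a small eps."""
    return (phi(t + dt, eps) - phi(t - dt, eps)) / (2 * dt)


for t in (0.3, 0.5, 0.8):
    s = H_inv(t)
    claimed = (2 * math.sqrt(s * (1 - s)) - 1) / math.log(2)
    print(t, d_eps_at_zero(t), claimed, d_t(t))
```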

We proceed with the proof of Lemma 1. First, consider the definition of

κ = κ_{2, 2}

. For

ϵ

sufficiently close to zero, we have that

\frac{x + 1}{2} > \frac{1}{q_{0}}

(recall that

q_{0} = 1 + {(1 - 2 ϵ)}^{2}

) and hence

κ = \frac{α - 1}{ϕ (α, ϵ)}

, where

α = α (ϵ)

is determined by

1 - α + \frac{α ϕ (α, ϵ)}{α - 1} = \frac{x + 1}{2}

. Taking the derivative w.r.t.

ϵ

in the definition of

α

and rearranging gives

α^{'} (ϵ) = - \frac{α (α - 1) \frac{\partial ϕ}{\partial ϵ} (α, ϵ)}{α (α - 1) \frac{\partial ϕ}{\partial α} (α, ϵ) - ϕ (α, ϵ) - {(α - 1)}^{2}} .

Using the first claim of Lemma 9, it is easy to see that

α (0) = 1 - x

. Hence, using all claims of Lemma 9, we have that

α^{'} (0) = - \frac{2}{\ln 2} \cdot \frac{α (0) (2 \sqrt{H^{- 1} (α (0)) (1 - H^{- 1} (α (0)))} - 1)}{1 - α (0)} =

- \frac{2}{\ln 2} \cdot \frac{(1 - x) (2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))} - 1)}{x}

Next, we compute

κ

and

κ^{'}

at 0. Note that by the definition of

κ

, we have

1 - α + \frac{α}{κ} = \frac{x + 1}{2}

. Hence,

κ = \frac{α}{\frac{x + 1}{2} + α - 1}

and

κ^{'} = \frac{κ (1 - κ) α^{'}}{α}

. In particular,

κ (0) = 2

and

κ^{'} (0) = - \frac{2 α^{'} (0)}{α (0)} = \frac{4}{\ln 2} \cdot \frac{(2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))} - 1)}{x},

proving the first claim of the lemma.
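The algebra linking κ and α in this step is elementary and can be verified exactly. With c = (x - 1)/2 we have κ = α/(α + c), so dκ/dα = c/(α + c)^2 = κ(1 - κ)/α, and the chain rule gives κ' = κ(1 - κ) α'/α. The sketch below checks this, and the value κ = 2 at α = 1 - x, in rational arithmetic; the sample values of x and α are hypothetical.

```python
from fractions import Fraction


def kappa(alpha, x):
    """kappa = alpha / ((x + 1)/2 + alpha - 1)."""
    return alpha / ((x + 1) / 2 + alpha - 1)


def dkappa_dalpha(alpha, x):
    """Exact derivative of kappa in alpha (x fixed): c / (alpha + c)^2, c = (x - 1)/2."""
    c = (x - 1) / 2
    return c / (alpha + c) ** 2


x = Fraction(1, 3)                      # hypothetical sample value
print(kappa(1 - x, x))                  # value of kappa at alpha(0) = 1 - x
for alpha in (Fraction(2, 3), Fraction(3, 4), Fraction(9, 10)):
    k = kappa(alpha, x)
    print(dkappa_dalpha(alpha, x) == k * (1 - k) / alpha)
```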

Let now

ϵ \sim 0

. We start with estimating

α (ϵ)

and

κ (ϵ)

. From the identity

1 - α + \frac{α ϕ (α, ϵ)}{α - 1} = \frac{x + 1}{2}

, using the monotonicity of the LHS in

α

(by the fourth claim of Lemma 3) and Lemma 9, it is easy to see that

α (ϵ) \in 1 - x \pm O (ϵ)

. From this, and from the identity

1 - α + \frac{α}{κ} = \frac{x + 1}{2}

, we get

κ (ϵ) = \frac{α (ϵ)}{\frac{x + 1}{2} + α (ϵ) - 1} \in 2 \pm O (ϵ)

.

Proceeding in a similar vein, using the above expression for

α^{'}

, we get that

α^{'} (ϵ) \in \frac{2}{\ln (2)} \cdot \frac{1 - x}{x} \cdot (1 - 2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))}) \pm O (ϵ),

and

κ^{'} (ϵ) = \frac{κ (ϵ) (1 - κ (ϵ)) α^{'} (ϵ)}{α (ϵ)} \in - \frac{2 α^{'} (ϵ)}{α (ϵ)} \pm O (ϵ) \subseteq

\frac{4}{\ln 2} \cdot \frac{(2 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))} - 1)}{x} \pm O (ϵ),

completing the proof of the lemma. □

Proof of Corollary 2.

Let

q = 2

and

κ = κ_{2, 2}

(see the second claim of Corollary 1 for a more explicit statement of Theorem 1 in this case). Viewing both sides of (5) as functions of

ϵ

, and writing

L (ϵ)

for the LHS and

R (ϵ)

for the RHS, we have

L (0) = R (0) = {∥ f ∥}_{2}

, and

L (ϵ) \leq R (ϵ)

for

0 \leq ϵ \leq \frac{1}{2}

. It is easy to see that both L and R are differentiable, and we may deduce that

L^{'} (0) \leq R^{'} (0)

. Computing the derivatives (see, e.g., [3]) gives

L^{'} (0) = - \frac{1}{2} \cdot \frac{E (f, f)}{{∥ f ∥}_{2}} and R^{'} (0) = \frac{\ln (2) κ^{'} (0)}{4} \cdot \frac{E n t (f^{2})}{{∥ f ∥}_{2}},

where we write

κ^{'} (0)

for

{\frac{\partial κ}{\partial ϵ}}_{| ϵ = 0}

. Hence

L^{'} (0) \leq R^{'} (0)

is equivalent to

E (f, f) \geq - \frac{\ln (2) κ^{'} (0)}{2} \cdot E n t (f^{2}) .

(7)

The claim of the corollary now follows from the first claim of Lemma 1. It only remains to add that the fact that

ℓ (\cdot)

is a convex and increasing function on

[0, 1]

, taking

[0, 1]

onto

[2 \ln 2, 2]

was proved in [20]. □
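The stated properties of ℓ can be illustrated numerically. Combining (7) with the first claim of Lemma 1 gives ℓ(x) = (2 - 4\sqrt{H^{-1}(1 - x)(1 - H^{-1}(1 - x))})/x, which is also the form used in the proof of Corollary 5. The sketch below evaluates this expression on a grid; the bisection-based inverse entropy, the grid and the tolerances are assumptions of this illustration, and a grid check does not replace the proof in [20].

```python
import math


def H(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def H_inv(t):
    """Inverse of H on [0, 1/2] by bisection."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if H(mid) < t else (lo, mid)
    return (lo + hi) / 2


def ell(x):
    """ell(x) = (2 - 4 sqrt(rho (1 - rho))) / x with rho = H^{-1}(1 - x), for 0 < x <= 1."""
    rho = H_inv(1 - x)
    return (2 - 4 * math.sqrt(rho * (1 - rho))) / x


grid = [ell(k / 10) for k in range(1, 11)]
print(ell(1e-4), 2 * math.log(2))   # behaviour near x = 0
print(ell(1.0))                     # value at x = 1
print(all(u < v for u, v in zip(grid, grid[1:])))                       # increasing
print(all(grid[i - 1] + grid[i + 1] >= 2 * grid[i] for i in range(1, 9)))  # convex on the grid
```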

Proof of Corollary 3.

Let us point out that our argument follows along the same lines as the proof of the same result in [10]. We do believe that the argument here is worth presenting in full, since it seems to be somewhat more explicit and easier to parse.

We use the simple fact (see, e.g., [4]) that for any

0 \leq ϵ \leq \frac{1}{2}

and for any

α \in {0, 1}^{n}

holds

\hat{f_{ϵ}} (α) = {(1 - 2 ϵ)}^{| α |} \hat{f} (α)

. Hence, using Parseval’s identity in the first step below, we have

∥ f_{ϵ} ∥_{2}^{2} = \sum_{α \in {0, 1}^{n}} {(1 - 2 ϵ)}^{2 | α |} {\hat{f}}^{2} (α) \geq {(1 - 2 ϵ)}^{2 μ n} \cdot \sum_{| α | \leq μ n} {\hat{f}}^{2} (α) .

Since this holds for any

0 \leq ϵ \leq \frac{1}{2}

, we deduce that

\sum_{| α | \leq μ n} {\hat{f}}^{2} (α) \leq \min_{0 \leq ϵ \leq \frac{1}{2}} \frac{∥ f_{ϵ} ∥_{2}^{2}}{{(1 - 2 ϵ)}^{2 μ n}} \leq \min_{0 \leq ϵ \leq \frac{1}{2}} \frac{{∥ f ∥}_{κ}^{2}}{{(1 - 2 ϵ)}^{2 μ n}},

where we have used Theorem 1 with

q = 2

in the second step, and

κ = κ (ϵ) = κ_{2, 2} (\frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n}, ϵ)

.
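The two ingredients of this step, the Fourier multiplier (1 - 2ϵ)^{|α|} of the noise operator and the resulting bound on the low-weight Fourier mass, can be checked by brute force on a small cube. The sketch below does this for n = 3; the sample function values, ϵ = 0.2 and the cutoff m (playing the role of μn) are arbitrary illustrative choices, and the helper names are not from the paper.

```python
from itertools import product


def cube(n):
    """All points of {0,1}^n as tuples."""
    return list(product((0, 1), repeat=n))


def noise(f, n, eps):
    """T_eps f(x) = E f(y), where y flips each bit of x independently w.p. eps."""
    out = {}
    for x in cube(n):
        total = 0.0
        for y in cube(n):
            d = sum(a != b for a, b in zip(x, y))
            total += f[y] * eps ** d * (1 - eps) ** (n - d)
        out[x] = total
    return out


def fourier(f, n):
    """hat f(a) = E_x f(x) (-1)^{<a, x>}, expectation over the uniform measure."""
    return {a: sum(f[x] * (-1) ** sum(ai * xi for ai, xi in zip(a, x))
                   for x in cube(n)) / 2 ** n
            for a in cube(n)}


n, eps = 3, 0.2
f = dict(zip(cube(n), [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]))  # arbitrary sample
f_hat, f_eps = fourier(f, n), noise(f, n, eps)
f_eps_hat = fourier(f_eps, n)

max_err = max(abs(f_eps_hat[a] - (1 - 2 * eps) ** sum(a) * f_hat[a]) for a in cube(n))
norm_sq = sum(v * v for v in f_eps.values()) / 2 ** n   # ||f_eps||_2^2
m = 1                                                   # plays the role of mu * n
low = sum(f_hat[a] ** 2 for a in cube(n) if sum(a) <= m)
print(max_err < 1e-12, low <= norm_sq / (1 - 2 * eps) ** (2 * m))
```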

Let

F (ϵ) = \frac{1}{n} \log_{2} (\frac{{∥ f ∥}_{κ}^{2}}{{(1 - 2 ϵ)}^{2 μ n}}) = \frac{1}{n} \log_{2} ({∥ f ∥}_{κ}^{2}) - 2 μ \log_{2} (1 - 2 ϵ)

. Since

κ (0) = 2

, we have

F (0) = \frac{1}{n} \log_{2} ({∥ f ∥}_{2}^{2})

. Hence the claim of the corollary is equivalent to the claim that

\min_{0 \leq ϵ \leq \frac{1}{2}} F (ϵ)

is smaller than

F (0)

by some absolute constant. To show this, it suffices to show that

F^{'} (ϵ)

is negative and bounded away from 0 by an absolute constant for

ϵ

in a constant length interval

[0, ϵ_{0}]

.

Recall that for any nonnegative non-zero function g on

{0, 1}^{n}

holds

\frac{E n t (g^{2})}{E g^{2}} \geq \log_{2} (\frac{E g^{2}}{E^{2} g}) = E n t_{2} (\frac{g}{{∥ g ∥}_{1}})

(see, e.g., [10]). Recall also that

\frac{\partial}{\partial ϵ} \log_{2} ({∥ f ∥}_{κ (ϵ)}) = \frac{κ^{'}}{κ^{2}} \cdot \frac{E n t ({| f |}^{κ})}{{∥ f ∥}_{κ}^{κ}}

.

Hence, recalling that, by Lemma 1,

κ^{'} < 0

in the vicinity of 0, we have

F^{'} (ϵ) = 2 \frac{κ^{'}}{κ^{2}} \cdot \frac{1}{n} \frac{E n t ({| f |}^{κ})}{{∥ f ∥}_{κ}^{κ}} + \frac{4}{\ln (2)} \cdot \frac{μ}{1 - 2 ϵ} \leq 2 \frac{κ^{'}}{κ^{2}} \cdot \frac{1}{n} \log_{2} (\frac{E ({| f |}^{κ})}{E^{2} {| f |}^{κ / 2}}) + \frac{4}{\ln (2)} \cdot \frac{μ}{1 - 2 ϵ} .

Let

x = \frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n} = 1 - H (ρ)

. Recalling again that

κ (0) = 2

and applying the first claim of Lemma 1 we get

F^{'} (0) \leq \frac{κ^{'} (0)}{2} \cdot x + \frac{4 μ}{\ln (2)} = \frac{4}{\ln (2)} \cdot (μ - (\frac{1}{2} - \sqrt{ρ (1 - ρ)})) < 0 .

It now suffices to show that for sufficiently small

ϵ

we have

F^{'} (ϵ) \leq F^{'} (0) + O (ϵ)

. Taking the second claim of Lemma 1 into account, it is enough to show that

\frac{1}{n} \log_{2} (\frac{E ({| f |}^{κ})}{E^{2} {| f |}^{κ / 2}}) \geq x - O (ϵ)

. Let

G (ϵ) = \frac{1}{n} \log_{2} (\frac{E ({| f |}^{κ})}{E^{2} {| f |}^{κ / 2}})

. Then

G (0) = x

and it suffices to show that

| G^{'} |

is bounded by an absolute constant. A simple calculation gives that

G^{'} = \frac{κ^{'}}{κ} \cdot (\frac{1}{n} \frac{E n t ({| f |}^{κ})}{E {| f |}^{κ}} - \frac{2}{n} \frac{E n t ({| f |}^{κ / 2})}{E {| f |}^{κ / 2}} + G) .

The RHS in the last expression is bounded by a constant, since for any nonnegative non-zero function g on

{0, 1}^{n}

both

\frac{E n t (g)}{E g}

and

\log_{2} (\frac{E g^{2}}{E^{2} (g)})

are bounded by n. □

Proof of Corollary 4.

The first claim of the corollary

Let

D \subseteq {0, 1}^{n}

,

| D | = 2^{H (ρ) n}

. Let

M_{D}

be the adjacency matrix of the subgraph of the discrete cube induced by the vertices of D. Let

λ (D)

be the maximal eigenvalue of

M_{D}

. Let f be a maximal eigenvector of

M_{D}

. We view f as a function on D and extend it to a function on

{0, 1}^{n}

by defining it to be zero outside D. Let A be the adjacency matrix of

{0, 1}^{n}

. Then

λ (D) = \frac{〈f, M_{D} f〉}{〈f, f〉} = \frac{〈f, A f〉}{〈f, f〉}

. Note also that since f is supported on D we have

E^{2} | f | = {(〈f, sign (f) \cdot 1_{D}〉)}^{2} \leq E f^{2} \cdot E {(sign (f) \cdot 1_{D})}^{2} = E f^{2} \cdot \frac{| D |}{2^{n}} = E f^{2} \cdot 2^{(H (ρ) - 1) n} .

It follows that

\frac{E n t_{2} (\frac{f}{{∥ f ∥}_{1}})}{n} \geq 1 - H (ρ)

.

Next, it is easy to check that for any function g on

{0, 1}^{n}

holds

E (g, g) = 2 〈g, (n I - A) g〉

, where I is the

2^{n} \times 2^{n}

identity matrix. Hence, using Corollary 2 and the fact that

\frac{E n t (f^{2})}{E f^{2}} \geq \log_{2} (\frac{E f^{2}}{E^{2} | f |}) = E n t_{2} (\frac{f}{{∥ f ∥}_{1}})

, we have, writing x for

\frac{1}{n} E n t_{2} (\frac{f}{{∥ f ∥}_{1}})

,

λ (D) = \frac{〈f, A f〉}{〈f, f〉} = n - \frac{1}{2} \frac{E (f, f)}{〈f, f〉} \leq n - \frac{1}{2} \frac{ℓ (x) \cdot E n t (f^{2})}{E f^{2}} \leq

n - \frac{n}{2} x ℓ (x) \leq n (1 - \frac{1}{2} (1 - H (ρ)) ℓ (1 - H (ρ))) = 2 \sqrt{ρ (1 - ρ)} \cdot n .

This is almost tight if we set

r = ⌈ ρ n ⌉

and take

D = \{x \in {0, 1}^{n} : | x | \leq r\}

to be the Hamming ball of radius r around 0. In fact, recall that

| D | \approx 2^{H (ρ) n}

(see, e.g., [19]) and, as shown in [19],

λ (D) \geq 2 \sqrt{ρ (1 - ρ)} \cdot n - o (n)

. □
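As a small illustration of the first claim, the sketch below computes λ(D) for a Hamming ball of radius 1 in {0, 1}^4 and compares it with 2\sqrt{ρ(1 - ρ)} · n, where ρ ≤ 1/2 is defined by |D| = 2^{H(ρ) n}. Power iteration on M_D + I is an implementation choice made here (the shift avoids oscillation on bipartite graphs), not a method from the paper, and the dimension and the set D are arbitrary.

```python
import math
from itertools import product


def H(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def H_inv(t):
    """Inverse of H on [0, 1/2] by bisection."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if H(mid) < t else (lo, mid)
    return (lo + hi) / 2


def max_eigenvalue(D, iters=500):
    """Largest eigenvalue of the adjacency matrix M_D of the subgraph induced by D."""
    D = list(D)
    index = {v: i for i, v in enumerate(D)}
    nbrs = []
    for v in D:
        flips = (v[:i] + (1 - v[i],) + v[i + 1:] for i in range(len(v)))
        nbrs.append([index[w] for w in flips if w in index])
    vec = [1.0] * len(D)
    for _ in range(iters):
        new = [vec[i] + sum(vec[j] for j in nbrs[i]) for i in range(len(D))]  # (M_D + I) vec
        norm = math.sqrt(sum(t * t for t in new))
        vec = [t / norm for t in new]
    # Rayleigh quotient of M_D at the (unit-norm) limit vector.
    return sum(vec[i] * sum(vec[j] for j in nbrs[i]) for i in range(len(D)))


def bound(D, n):
    """2 sqrt(rho (1 - rho)) n with rho = H^{-1}(log2|D| / n)."""
    rho = H_inv(math.log2(len(D)) / n)
    return 2 * math.sqrt(rho * (1 - rho)) * n


n = 4
ball = [x for x in product((0, 1), repeat=n) if sum(x) <= 1]
print(max_eigenvalue(ball), bound(ball, n))
```

For the radius-1 ball the induced subgraph is a star with n leaves, so its largest eigenvalue is \sqrt{n}.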

The second claim of the corollary

Let

0 < δ < \frac{1}{2}

. Let

d = ⌊ δ n ⌋

, and let f be a feasible solution of the dual linear program of [15] with parameters n and d. Then, as observed in [29], f can be viewed as a function on

{0, 1}^{n}

with the following properties:

f is symmetric, that is $f (x)$ depends only on $| x |$ .
$f (x) \leq 0$ for $| x | \geq d$ .
$\hat{f} \geq 0$ and $\hat{f} (0) = 1$ .
$f (0) \leq 2^{R_{L P} (δ) \cdot n + o (n)}$ .

To prove the claim, we will show that any function f with the first three of these properties satisfies

\frac{1}{n} \log_{2} (f (0)) \geq \frac{1 - H (δ) + H (\frac{1}{2} - \sqrt{δ (1 - δ)})}{2} - o_{n} (1)

.

Notation: We write

{∥ g ∥}_{q, F}

for

{(\sum_{α \in {0, 1}^{n}} {| g (α) |}^{q})}^{1 / q}

. Note that Parseval’s identity states

{∥ f ∥}_{2} = {∥ \hat{f} ∥}_{2, F}

. We write ≈, ≲, and ≳ to denote equality or inequality which hold up to lower order terms. To give an example, recall that for

0 < ρ \leq \frac{1}{2}

the cardinalities of the Hamming ball

\{x \in {0, 1}^{n} : | x | \leq r\}

and the Hamming sphere

\{x \in {0, 1}^{n} : | x | = r\}

are

2^{H (ρ) n}

, up to lower order terms. We write this as

\frac{1}{n} \log_{2} (| \{x \in {0, 1}^{n} : | x | \leq r\} |) \approx H (ρ)

.

We start with some preliminary observations. First, we need some simple and well-known facts from Fourier analysis on

{0, 1}^{n}

. If f is symmetric, then so is

\hat{f}

. Next,

\hat{f} (0) = E f \leq {∥ f ∥}_{1}

. Finally, using the fact that in our case

\hat{f} \geq 0

,

f (0) = \sum_{α \in {0, 1}^{n}} \hat{f} (α) = {∥ \hat{f} ∥}_{1, F}

.

Next, we claim that if f is symmetric and if, for some

0 \leq i \leq n

holds

\frac{1}{2^{n}} (\binom{n}{i}) | f (i) | \geq Ω (\frac{1}{n}) \cdot {∥ f ∥}_{1}

then

\frac{{∥ f ∥}_{2}}{{∥ f ∥}_{1}} \geq Ω (\frac{1}{n}) \cdot \sqrt{\frac{2^{n}}{(\binom{n}{i})}}

. In fact, we will have

{∥ f ∥}_{2}^{2} \geq \frac{1}{2^{n}} (\binom{n}{i}) f^{2} (i) \geq Ω (\frac{1}{n^{2}}) \cdot \frac{1}{2^{n}} (\binom{n}{i}) {(\frac{2^{n}}{(\binom{n}{i})} {∥ f ∥}_{1})}^{2} = Ω (\frac{1}{n^{2}}) \cdot \frac{2^{n}}{(\binom{n}{i})} {∥ f ∥}_{1}^{2} .

Similarly, if for some

0 \leq j \leq n

holds

(\binom{n}{j}) {\hat{f}}^{2} (j) \geq Ω (\frac{1}{n}) \cdot {∥ \hat{f} ∥}_{2, F}^{2}

then

\frac{∥ \hat{f} ∥_{1, F}}{∥ \hat{f} ∥_{2, F}} \geq Ω (\frac{1}{n}) \cdot \sqrt{(\binom{n}{j})}

.

Finally, we need a slight extension of Corollary 3. As stated, it shows that if f has a large second entropy, then

\hat{f}

cannot attain its

ℓ_{2}

norm in a Hamming ball of small radius around 0. We claim, as was also observed in [10], that this holds more generally for Hamming balls with arbitrary centers in

{0, 1}^{n}

. To see that, let

z \in {0, 1}^{n}

, and define

g = f \cdot W_{z}

, where

W_{z}

is the corresponding Walsh-Fourier character. It is easy to see that for any

y \in {0, 1}^{n}

holds

\hat{g} (y) = \hat{f} (y + z)

, and hence g has the same first and second norms as f. Moreover, writing

B (z, r)

for the Hamming ball of radius r around z, we have

\sum_{α \in B (z, r)} {\hat{f}}^{2} (α) = \sum_{β \in B (0, r)} {\hat{g}}^{2} (β)

.

We pass to the proof of the claim. Note that since

f (x) \leq 0

for

| x | \geq d

and since

E f \geq 0

, there exists

0 \leq i \leq d - 1

such that

\frac{1}{2^{n}} (\binom{n}{i}) | f (i) | \geq Ω (\frac{1}{n}) \cdot {∥ f ∥}_{1}

. Hence

\frac{1}{n} E n t_{2} (\frac{f}{{∥ f ∥}_{1}}) = \frac{1}{n} \log_{2} (\frac{{∥ f ∥}_{2}^{2}}{{∥ f ∥}_{1}^{2}}) ≳ 1 - H (\frac{i}{n}) \geq 1 - H (δ) .

By Corollary 3 this means that

\hat{f}

cannot attain its

ℓ_{2}

norm inside Hamming balls of radii much smaller than

r (δ) : = (\frac{1}{2} - \sqrt{δ (1 - δ)}) \cdot n

around the all-0 and all-1 vectors. Hence there exists

r (δ) - o (n) \leq j \leq r (δ) + o (n)

such that

(\binom{n}{j}) {\hat{f}}^{2} (j) \geq Ω (\frac{1}{n}) \cdot {∥ \hat{f} ∥}_{2, F}^{2}

. It follows that

\frac{1}{n} \log_{2} (\frac{∥ \hat{f} ∥_{1, F}}{∥ \hat{f} ∥_{2, F}}) ≳ \frac{H (\frac{j}{n})}{2} ≳ \frac{H (\frac{1}{2} - \sqrt{δ (1 - δ)})}{2} .

We can now complete the proof of the second claim of the corollary. We have

0 = \frac{1}{n} \log_{2} (\hat{f} (0)) \leq \frac{1}{n} \log_{2} ({∥ f ∥}_{1}) ≲ \frac{1}{n} \log_{2} ({∥ f ∥}_{2}) - \frac{1 - H (δ)}{2} =

\frac{1}{n} \log_{2} (∥ \hat{f} ∥_{2, F}) - \frac{1 - H (δ)}{2} ≲ \frac{1}{n} \log_{2} (∥ \hat{f} ∥_{1, F}) - \frac{1 - H (δ) + H (\frac{1}{2} - \sqrt{δ (1 - δ)})}{2} =

\frac{1}{n} \log_{2} (f (0)) - \frac{1 - H (δ) + H (\frac{1}{2} - \sqrt{δ (1 - δ)})}{2} .

□

Proof of Corollary 5.

Let

0 \leq s \leq n / 2

and let f be a polynomial of degree s on

{0, 1}^{n}

. We need two simple and well-known facts from Fourier analysis on

{0, 1}^{n}

. First, that the Fourier expansion of f is supported on characters of weight at most s; and second, that for any function g on

{0, 1}^{n}

holds

E (g, g) = 4 \sum_{α \in {0, 1}^{n}} | α | {\hat{g}}^{2} (α)

. Combining these two facts implies that

E (f, f) = 4 \sum_{α \in {0, 1}^{n}} | α | {\hat{f}}^{2} (α) = 4 \sum_{| α | \leq s} | α | {\hat{f}}^{2} (α) \leq 4 s \cdot \sum_{| α | \leq s} {\hat{f}}^{2} (α) = 4 s \cdot E f^{2},

where in the last step we used Parseval’s identity.
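The two expressions for the Dirichlet form used in this paper, 4 \sum_{α} |α| \hat{f}^2(α) here and 2〈f, (nI - A) f〉 in the proof of Corollary 4, can be compared by brute force, together with the bound E(f, f) ≤ 4s · E f^2 for a polynomial of degree s. In the sketch below the sample polynomial, the dimension n = 3 and the helper names are illustrative assumptions; the inner product is taken with respect to the uniform measure.

```python
import math
from itertools import product


def cube(n):
    """All points of {0,1}^n as tuples."""
    return list(product((0, 1), repeat=n))


def fourier(f, n):
    """hat f(a) = E_x f(x) (-1)^{<a, x>}."""
    return {a: sum(f[x] * (-1) ** sum(ai * xi for ai, xi in zip(a, x))
                   for x in cube(n)) / 2 ** n
            for a in cube(n)}


def dirichlet_fourier(f, n):
    """4 * sum_a |a| hat f(a)^2."""
    return 4 * sum(sum(a) * c * c for a, c in fourier(f, n).items())


def dirichlet_graph(f, n):
    """2 <f, (nI - A) f>, with <u, v> = E u v and A the adjacency matrix of the cube."""
    total = 0.0
    for x in cube(n):
        neighbours = sum(f[x[:i] + (1 - x[i],) + x[i + 1:]] for i in range(n))
        total += f[x] * (n * f[x] - neighbours)
    return 2 * total / 2 ** n


n, s = 3, 2
f = {x: 1.0 + x[0] - 2.0 * x[1] * x[2] for x in cube(n)}   # sample polynomial of degree 2
second_moment = sum(v * v for v in f.values()) / 2 ** n    # E f^2
print(math.isclose(dirichlet_fourier(f, n), dirichlet_graph(f, n)))
print(dirichlet_fourier(f, n) <= 4 * s * second_moment)
```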

Write

σ

for

s / n

and x for

\frac{1}{n} E n t_{2} (\frac{f}{{∥ f ∥}_{1}})

. We have, using Corollary 2,

4 σ n = 4 s \geq \frac{E (f, f)}{E f^{2}} \geq ℓ (x) \cdot \frac{E n t (f^{2})}{E f^{2}} \geq n x ℓ (x) =

n \cdot (2 - 4 \sqrt{H^{- 1} (1 - x) (1 - H^{- 1} (1 - x))}) .

Rearranging and simplifying, this is equivalent to

\frac{1}{n} \log_{2} (\frac{{∥ f ∥}_{2}}{{∥ f ∥}_{1}}) = \frac{x}{2} \leq \frac{1 - H (\frac{1}{2} - \sqrt{σ (1 - σ)})}{2},

completing the proof. □

Proof of Corollary 1.

We start with the first claim of the corollary. First consider the case

ϵ = \frac{1}{2}

. It is easy to see that

ϕ_{\frac{1}{2}} (x) = x - 1

(note that in the definition of

Φ (x, ϵ)

we have

y (x, \frac{1}{2}) = \lim_{ϵ \to \frac{1}{2}} y (x, ϵ) = H^{- 1} (x) (1 - H^{- 1} (x))

) and hence in this case the value of

κ

given by the claim is 1 (as it should be).

Assume now

ϵ < \frac{1}{2}

. This implies that

q_{0} = 1 + {(1 - 2 ϵ)}^{2} > 1

. By the first and third claims of Lemma 3, this means that for any

0 \leq x \leq 1

we have

\frac{- x}{ϕ_{ϵ} (1 - x)} \geq - \frac{1}{ϕ_{ϵ} (0)} = \frac{2}{\log_{2} (\frac{4}{q_{0}})} > 1

. Hence, it is easy to see that for q sufficiently close to 1 the first and the third clauses in the definition of

κ_{2, q}

in Definition 1 do not apply, and we have

κ_{2, q} (x, ϵ) = \frac{- x}{ϕ_{ϵ} (1 - x)}

. Theorem 1 then gives

∥ f_{ϵ} ∥_{2} \leq {∥ f ∥}_{κ}, \quad \text{with} \quad κ = - \frac{\frac{E n t_{q} (\frac{f}{{∥ f ∥}_{1}})}{n}}{ϕ_{ϵ} (1 - \frac{E n t_{q} (\frac{f}{{∥ f ∥}_{1}})}{n})} .

Taking

q \to 1

and recalling that

E n t_{q} (\cdot) \to_{q \to 1} E n t (\cdot)

completes the proof of the claim.

We pass to the second claim of the corollary. First consider the case

ϵ = 0

. Note that in this case

q_{0} = 2

. Furthermore, by the first claim of Lemma 9,

ϕ_{0} (x) = \frac{x - 1}{2}

, and hence the value of

κ

given by the claim is 2 (as expected).

Assume now

ϵ > 0

. This implies that

q_{0} < 2

, and hence, by the third claim of Lemma 3, for any

0 \leq x \leq 1

we have

\frac{- x}{ϕ_{ϵ} (1 - x)} \leq q_{0} < 2 = q

. Hence the second clause in the definition of

κ_{2, q}

in Definition 1 does not apply. The remaining two clauses give the claim, as stated. □

Proofs of Comments to Theorem 1

Some of the claims in these comments require a proof. These claims are restated and proved in the following lemma.

Lemma 10.

If $q \geq 2$ then for any $0 < ϵ < \frac{1}{2}$ the function $κ_{2, q} (x, ϵ)$ starts as a constant- $(1 + {(1 - 2 ϵ)}^{2})$ function up to some $x = x (q, ϵ) > 0$ , and becomes strictly decreasing after that. For $1 < q < 2$ there is a value $0 < ϵ (q) < \frac{1}{2}$ , such that for all $ϵ \leq ϵ (q)$ the function $κ_{2, q} (x, ϵ)$ is strictly decreasing (in which case we say that $x (q, ϵ) = 0$ ). However, $x (q, ϵ) > 0$ for all $ϵ > ϵ (q)$ . The function $ϵ (q)$ decreases with q (in particular, $ϵ (q) = 0$ for $q \geq 2$ ). The function $x (q, ϵ)$ increases both in q and in ϵ.
The function $κ_{2, 1} (x, ϵ) = - \frac{x}{ϕ_{ϵ} (1 - x)}$ is strictly decreasing in its first argument for any $0 < ϵ < \frac{1}{2}$ . It satisfies $κ_{2, 1} (0, ϵ) = \lim_{x \to 0} κ_{2, 1} (x, ϵ) = 1 + {(1 - 2 ϵ)}^{2}$ , for all $0 \leq ϵ \leq \frac{1}{2}$ .
Let f be a non-constant function on ${0, 1}^{n}$ . Let $0 < ϵ < \frac{1}{2}$ . Let $F (q) = F_{f, ϵ} (q) = κ_{2, q} (E n t_{q} (\frac{f}{{∥ f ∥}_{1}}) / n, ϵ)$ . There is a unique value $1 < q (f, ϵ) \leq 1 + {(1 - 2 ϵ)}^{2}$ of q for which $F (q) = q$ . Moreover, $q (f, ϵ) = \min_{q \geq 1} F (q)$ . Furthermore, $\lim_{ϵ \to 0} q (f, ϵ) = 2$ for any f.

Proof.

The first claim of the lemma follows from the properties of

κ_{2, q}

as shown in the proof of Theorem 1. In particular, it is easy to see that for

q \leq 2

we have

ϵ (q) = \frac{1 - \sqrt{q - 1}}{2}

and for

ϵ \geq ϵ (q)

we have

x (q, ϵ) = \frac{q - (1 + {(1 - 2 ϵ)}^{2})}{(1 + {(1 - 2 ϵ)}^{2}) \cdot (q - 1)}

. The claim that

ϵ (q)

decreases with q and that

x (q, ϵ)

increases in both q and

ϵ

follows by direct verification.
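The direct verification can be sketched numerically as follows. The code evaluates ϵ(q) and x(q, ϵ) from the formulas above and checks that q_0 = 1 + (1 - 2ϵ)^2 equals q at ϵ = ϵ(q), that x(q, ϵ(q)) = 0, and that the stated monotonicity holds on a few sample points. The function names and the sample points are illustrative assumptions.

```python
import math


def q0(eps):
    """q_0 = 1 + (1 - 2 eps)^2."""
    return 1 + (1 - 2 * eps) ** 2


def eps_threshold(q):
    """eps(q) = (1 - sqrt(q - 1)) / 2, for 1 < q <= 2."""
    return (1 - math.sqrt(q - 1)) / 2


def x_threshold(q, eps):
    """x(q, eps) = (q - q_0) / (q_0 (q - 1)), for eps >= eps(q)."""
    return (q - q0(eps)) / (q0(eps) * (q - 1))


q = 1.5
print(q0(eps_threshold(q)), x_threshold(q, eps_threshold(q)))

qs = [1.1, 1.3, 1.5, 1.7, 1.9, 2.0]
print(all(eps_threshold(a) > eps_threshold(b) for a, b in zip(qs, qs[1:])))

eps_grid = [0.15, 0.25, 0.35, 0.45, 0.49]
print(all(x_threshold(q, a) < x_threshold(q, b) for a, b in zip(eps_grid, eps_grid[1:])))

q_grid = [1.2, 1.5, 2.0, 3.0]
print(all(x_threshold(a, 0.3) < x_threshold(b, 0.3) for a, b in zip(q_grid, q_grid[1:])))
```

The monotonicity in q can also be read off the derivative of x(q, ϵ) in q, which equals (q_0 - 1)/(q_0 (q - 1)^2) and is positive for ϵ < 1/2.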

The second claim of the lemma follows immediately from the third claim of Lemma 3.

We pass to the third claim of the lemma. Note that the function

x (q) = E n t_{q} (\frac{f}{{∥ f ∥}_{1}}) / n

is positive and strictly increasing in q. We need the following auxiliary claim.

Lemma 11.

The function

y (q) = \frac{q - 1}{q} \cdot x (q) + \frac{1}{q}

is strictly decreasing in q.

Proof of Lemma 11.

Assume w.l.o.g. that

f \geq 0

and that

E f = 1

. Let

P = \frac{f}{2^{n}}

be a distribution on

{0, 1}^{n}

. A simple calculation gives that

y (q) = 1 + \frac{1}{n} \cdot \log_{2} ({(\sum_{a \in {0, 1}^{n}} P {(a)}^{q})}^{\frac{1}{q}}),

which is strictly decreasing in q, by Hölder’s inequality. □
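The simple calculation behind Lemma 11 can be illustrated on a toy distribution. The sketch below assumes the normalization Ent_q(f) = \frac{1}{q - 1} \log_2 E f^q for f ≥ 0 with E f = 1 (which agrees with the formula for Ent_2 used in the proof of Corollary 3), and compares ((q - 1)/q) · x(q) + 1/q with 1 + (1/n) log_2 ‖P‖_q. The distribution P on {0, 1}^2 and the sample values of q are arbitrary choices.

```python
import math


def y_direct(P, n, q):
    """1 + (1/n) log2 of the l_q norm of the probability vector P."""
    return 1 + math.log2(sum(p ** q for p in P) ** (1 / q)) / n


def y_from_entropy(P, n, q):
    """(q-1)/q * x(q) + 1/q with x(q) = Ent_q(f)/n, f = 2^n P (so that E f = 1).

    Assumes Ent_q(f) = log2(E f^q) / (q - 1), for q > 1.
    """
    f = [p * 2 ** n for p in P]
    mean_fq = sum(v ** q for v in f) / 2 ** n
    x = math.log2(mean_fq) / ((q - 1) * n)
    return (q - 1) / q * x + 1 / q


n = 2
P = [0.5, 0.25, 0.125, 0.125]      # toy distribution on {0,1}^2
qs = [1.5, 2.0, 3.0, 5.0]
ys = [y_direct(P, n, q) for q in qs]
print(all(math.isclose(y_direct(P, n, q), y_from_entropy(P, n, q)) for q in qs))
print(all(u > v for u, v in zip(ys, ys[1:])))   # y decreases with q
```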

We proceed with the proof of the third claim of Lemma 10. Let

q_{0} = 1 + {(1 - 2 ϵ)}^{2}

. We claim, first, that F is strictly increasing on

q_{0} \leq q < \infty

. In fact, for these values of q the second clause of Definition 1 does not apply (by the third claim of Lemma 3) and we have

κ_{2, q} (x, ϵ) = \{\begin{matrix} q_{0} & if & y \leq \frac{1}{q_{0}} \\ \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})} & if & y > \frac{1}{q_{0}} \end{matrix},

where

y = y (q)

and

α_{0}

is determined by

1 - α_{0} - \frac{α_{0} ϕ_{ϵ} (α_{0})}{1 - α_{0}} = y

. The claim now follows by combining Lemma 11, and the third and fourth claims of Lemma 3.

Next, we claim that there exists a unique value

1 \leq q = q^{*} \leq q_{0}

for which

\frac{- x}{ϕ_{ϵ} (1 - x)} = q

(here

x = x (q)

). Moreover, F decreases for

1 \leq q \leq q^{*}

and increases for

q \geq q^{*}

. Finally,

F (q^{*}) = q^{*}

. Observe that verifying these claims will essentially complete the proof of the third claim of Lemma 10 (apart from the fact that

\lim_{ϵ \to 0} q (f, ϵ) = 2

).

In fact, by the first and third claims of Lemma 3, and the fact that x is strictly increasing in q, the function

\frac{- x}{ϕ_{ϵ} (1 - x)}

is strictly decreasing in q, taking values between

\frac{2}{\log_{2} (\frac{4}{q_{0}})}

and

q_{0}

. This means that it has a unique intersection

q = q^{*}

with the function q in

[1, q_{0}]

. Next, observe that by Definition 1 for

q \leq q_{0}

we have

κ_{2, q} (x, ϵ) = \{\begin{matrix} - \frac{x}{ϕ_{ϵ} (1 - x)} & if & - \frac{x}{ϕ_{ϵ} (1 - x)} \geq q \\ \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})} & if & - \frac{x}{ϕ_{ϵ} (1 - x)} \leq q \end{matrix}

This means that for

q < q^{*}

we have

F (q) = κ_{2, q} (x, ϵ) = - \frac{x}{ϕ_{ϵ} (1 - x)}

, which is decreasing in q, and for

q > q^{*}

we have

F (q) = \frac{α_{0} - 1}{ϕ_{ϵ} (α_{0})}

, which increases in q. Finally, for

q = q^{*}

, we have

F (q) = - \frac{x}{ϕ_{ϵ} (1 - x)} = q

.

It remains to verify that

\lim_{ϵ \to 0} q (f, ϵ) = 2

. By the first claim of Lemma 9,

ϕ_{0} (x) = \frac{x - 1}{2}

. This means that for any

0 < x \leq 1

we have

\lim_{ϵ \to 0} \frac{- x}{ϕ_{ϵ} (1 - x)} = 2

. The claim follows since, by the preceding discussion,

q = q (f, ϵ) = \frac{- x (q)}{ϕ_{ϵ} (1 - x (q))}

. □

Author Contributions

Conceptualization, N.L. and A.S.; Investigation, N.L. and A.S.; Methodology, N.L. and A.S.; Validation, N.L. and A.S.; Writing—original draft, N.L. and A.S.; Writing—review & editing, N.L. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank Or Ordentlich for a very helpful discussion. We would also like to thank Igal Sason and the anonymous referees for their comments, which led to significant improvements in the presentation of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Beckner, W. Inequalities in Fourier Analysis. Ann. Math. 1975, 102, 159–182.
Bonami, A. Etude des coefficients Fourier des fonctions de L_p(G). Annales de l’Institut Fourier 1970, 20, 335–402.
Gross, L. Logarithmic Sobolev inequalities. Am. J. Math. 1975, 97, 1061–1083.
O’Donnell, R. Analysis of Boolean Functions; Cambridge University Press: New York, NY, USA, 2014.
Samorodnitsky, A. An upper bound on ℓ_q norms of noisy functions. IEEE Trans. Inf. Theory 2020, 66, 742–748.
Wyner, A.D.; Ziv, J. A theorem on the entropy of certain binary sequences and applications: Part I. IEEE Trans. Inf. Theory 1973, 19, 769–772.
Cover, T.; Thomas, J. Elements of Information Theory; Wiley: Hoboken, NJ, USA, 2006.
Rényi, A. On the measures of entropy and information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20–30 June 1961; Volume 1, pp. 547–561.
Kirshner, N.; Samorodnitsky, A. A moment ratio bound for polynomials and some extremal properties of Krawchouk polynomials and Hamming spheres. IEEE Trans. Inf. Theory 2021, 67, 3509–3541.
Polyanskiy, Y.; Samorodnitsky, A. Improved log-Sobolev inequalities, hypercontractivity and uncertainty principle on the hypercube. J. Funct. Anal. 2019, 277, 108280.
Yu, L.; Anantharam, V.; Chen, J. Graphs of joint types. In Proceedings of the ISIT, Melbourne, Australia, 12–20 July 2021.
Chandar, V.; Tchamkerten, A. Most informative quantization functions. In Proceedings of the ITA Workshop, San Diego, CA, USA, 9–14 February 2014.
MacWilliams, J.; Sloane, N.J.A. The Theory of Error Correcting Codes; Elsevier: Amsterdam, The Netherlands, 1977.
McEliece, R.J.; Rodemich, E.R.; Rumsey, H., Jr.; Welch, L.R. New upper bounds on the rate of a code via the Delsarte-MacWilliams inequalities. IEEE Trans. Inf. Theory 1977, 23, 157–166.
Delsarte, P. An algebraic approach to the association schemes of coding theory. Philips Res. Rep. Suppl. 1973, 10, vi+97.
Barg, A.; Jaffe, D.B. Numerical results on the asymptotic rate of binary codes. In Codes and Association Schemes; Barg, A., Litsyn, S., Eds.; American Mathematical Society: Providence, RI, USA, 2001.
Samorodnitsky, A. On the Optimum of Delsarte’s Linear Program. J. Comb. Theory Ser. A 2001, 96, 261–287.
Navon, M.; Samorodnitsky, A. On Delsarte’s Linear Programming Bounds for Binary Codes. In Proceedings of the FOCS 2005, Pittsburgh, PA, USA, 23–25 October 2005; pp. 327–338.
Friedman, J.; Tillich, J.-P. Generalized Alon-Boppana theorems and error-correcting codes. SIAM J. Discret. Math. 2005, 19, 700–718.
Samorodnitsky, A. A modified logarithmic Sobolev inequality for the hamming cube and some applications. arXiv 2008, arXiv:0807.1679.
Boucheron, S.; Lugosi, G.; Massart, P. Concentration Inequalities; Oxford University Press: Oxford, UK, 2013.
Donoho, D.L.; Stark, P.B. Uncertainty principles and signal recovery. SIAM J. Appl. Math. 1989, 49, 906–931.
Kalai, G.; Linial, N. On the distance distribution of codes. IEEE Trans. Inf. Theory 1995, 41, 1467–1472.
Ashikhmin, A.; Cohen, G.; Krivelevich, M.; Litsyn, S. Bounds on distance distributions in codes of known size. IEEE Trans. Inf. Theory 2005, 51, 250–258.
Eskenazis, A.; Ivanisvili, P. Polynomial inequalities on the Hamming cube. Prob. Theory Relat. Fields 2020, 178, 235–287.
Ivanisvili, P.; Tkocz, T. Comparison of moments of Rademacher Chaoses. Ark. Mat. 2019, 57, 121–128.
Yu, L. Strong Brascamp-Lieb Inequalities. arXiv 2021, arXiv:2102.06935.
van Lint, J.H. Introduction to Coding Theory, 3rd ed.; Graduate Texts in Mathematics; Springer: Berlin, Germany, 1999; Volume 86.
Kalai, G.; (The Hebrew University of Jerusalem, Jerusalem, Israel); Linial, N.; (The Hebrew University of Jerusalem, Jerusalem, Israel). Personal communication, 1994.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

