Optimum Achievable Rates in Two Random Number Generation Problems with f-Divergences Using Smooth Rényi Entropy

Nomura, Ryo; Yagi, Hideki

doi:10.3390/e26090766


Optimum Achievable Rates in Two Random Number Generation Problems with f-Divergences Using Smooth Rényi Entropy^†

by Ryo Nomura ^1,*,‡ and Hideki Yagi ^2,‡

¹ Center for Data Science, Waseda University, Tokyo 169-8050, Japan

² Department of Computer and Network Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan

^* Author to whom correspondence should be addressed.

^† This paper is an extension of our conference papers: Nomura, R.; Yagi, H. Optimum source resolvability rate with respect to f-divergences using the smooth Rényi entropy. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; and Nomura, R.; Yagi, H. Optimum intrinsic randomness rate with respect to f-divergences using the smooth min entropy. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021.

^‡ These authors contributed equally to this work.

Entropy 2024, 26(9), 766; https://doi.org/10.3390/e26090766

Submission received: 1 July 2024 / Revised: 29 August 2024 / Accepted: 4 September 2024 / Published: 6 September 2024

(This article belongs to the Section Information Theory, Probability and Statistics)


Abstract

Two typical fixed-length random number generation problems in information theory are considered for general sources. One is the source resolvability problem and the other is the intrinsic randomness problem. In each of these problems, the optimum achievable rate with respect to the given approximation measure is one of our main concerns and has been characterized using two different information quantities: the information spectrum and the smooth Rényi entropy. Recently, optimum achievable rates with respect to f-divergences have been characterized using the information spectrum quantity. The f-divergence is a general non-negative measure between two probability distributions on the basis of a convex function f. The class of f-divergences includes several important measures such as the variational distance, the KL divergence, the Hellinger distance and so on. Hence, it is meaningful to consider the random number generation problems with respect to f-divergences. However, optimum achievable rates with respect to f-divergences using the smooth Rényi entropy have not yet been clarified in either problem. In this paper, we analyze the optimum achievable rates using the smooth Rényi entropy and extend the class of f-divergences. To do so, we first derive general formulas of the first-order optimum achievable rates with respect to f-divergences in both problems under the same conditions as imposed by previous studies. Next, we relax the conditions on the f-divergence and generalize the obtained general formulas. Then, we particularize our general formulas to several specified functions f. As a result, we reveal that it is easy to derive optimum achievable rates for several important measures from our general formulas. Furthermore, a kind of duality between the resolvability and the intrinsic randomness is revealed in terms of the smooth Rényi entropy. Second-order optimum achievable rates and optimistic achievable rates are also investigated.

Keywords:

f-divergence; Hellinger distance; intrinsic randomness; Kullback–Leibler divergence; random number generation; smooth Rényi entropy; source resolvability; variational distance

1. Introduction

Two typical fixed-length random number generation problems in information theory are considered for general sources. One is the source resolvability problem (i.e., the resolvability problem), and the other is the intrinsic randomness problem. The problem setting of the resolvability problem is as follows. Given an arbitrary source

X = {X^{n}}_{n = 1}^{\infty}

(the target random number), we approximate it by using a discrete random number that is uniformly distributed, which we call the uniform random number. Here, the size of the uniform random number is requested to be as small as possible. In this setting, a degree of approximation is measured by several criteria. Han and Verdú [1] and Steinberg and Verdú [2] have determined the first-order optimum achievable rates with respect to the variational distance and the normalized Kullback–Leibler (KL) divergence. Nomura [3] has studied the first-order optimum achievable rates with respect to the KL divergence. Recently, Nomura [4] has characterized the first-order optimum achievable rates with respect to f-divergences. The class of f-divergence considered in [4] includes the variational distance and the KL divergence. Hence, the result can be considered as a generalization of the results given in [1,3]. The second-order optimum achievable rates in the resolvability problem have also been studied with respect to several approximation measures [4,5]. It should be noted that the results mentioned above are based on the information spectrum quantity. On the other hand, Uyematsu [6] has characterized the first-order optimum achievable rate with respect to the variational distance using the smooth Rényi entropy.

The intrinsic randomness problem, which is also one of the typical random number generation problems, has also been studied. The problem setting of the intrinsic randomness problem is as follows. By using a given arbitrary source

X = {X^{n}}_{n = 1}^{\infty}

(the coin random number), we approximate a discrete uniform random number whose size is requested to be as large as possible. Also in the intrinsic randomness problem, optimum achievable rates with respect to various criteria have been considered. Vembu and Verdú [7] have considered the intrinsic randomness problem with respect to the variational distance as well as the normalized KL divergence and derived general formulas of the first-order optimum achievable rates (cf. Han [8]). Hayashi [9] has considered the first- and second-order optimum achievable rates with respect to the KL divergence. Recently, the first- and second-order optimum achievable rates with respect to f-divergences have been clarified in [4]. The results mentioned here are based on information spectrum quantities. On the other hand, Uyematsu and Kunimatsu [10] have characterized the first-order optimum achievable rates with respect to the variational distance using the smooth Rényi entropy.

Related works include works given by Liu, Cuff and Verdú [11], Yagi and Han [12], Kumagai and Hayashi [13,14], and Yu and Tan [15]. In [11], the channel resolvability problem with respect to the

E_{γ}

-divergence has been considered. They have applied their results to the case of the source resolvability problem. Yagi and Han [12] have determined the optimum variable-length resolvability rates with respect to the variational distance as well as the KL divergence. Kumagai and Hayashi [13,14] have determined the first- and second-order optimum achievable rates in the random number conversion problem. It should be noted that the random number conversion problem includes the resolvability and intrinsic randomness problems treated in this paper. In [13,14], an approximation measure related to the Hellinger distance has been used. Yu and Tan [15] have considered the random number conversion problem with respect to the Rényi divergence.

As mentioned above, various approximation measures have been considered in both the resolvability and the intrinsic randomness problems. Furthermore, general formulas of achievable rates have been characterized by using the information spectrum quantity and the smooth Rényi entropy. We note here that optimum achievable rates with respect to f-divergences using the smooth Rényi entropy have not yet been clarified. The smooth Rényi entropy is an information quantity that has a clear operational meaning and is easy to understand. Moreover, the class of f-divergences is a general family of approximation measures that includes several important measures. In this paper, hence, we characterize the first- and second-order optimum achievable rates with respect to f-divergences using the smooth Rényi entropy. In addition, we extend the class of f-divergences for which optimum achievable rates can be characterized. As a result, we find that two types of smooth Rényi entropies are useful to describe these optimum achievable rates for a wider class of f-divergences. Furthermore, a kind of duality between the resolvability and the intrinsic randomness is revealed in terms of the smooth Rényi entropy and f-divergences.

This paper is organized as follows. In Section 2, we describe the problem setting and give some definitions of the optimum first-order achievable rates. The class of f-divergences and the smooth Rényi entropy are also introduced. In Section 3 and Section 4, we show general formulas of the optimum first-order achievable rates in the resolvability problem and the intrinsic randomness problem, respectively. In Section 5, we derive the general formulas of these achievable rates for an extended class of f-divergences. In Section 6, we apply the general formulas obtained in previous sections to some specified functions f and compute the optimum first-order achievable rates in each case. In Section 7, we show general formulas of the optimum second-order achievable rates in the two problems. In Section 8, optimum achievable rates in the optimistic sense are considered. Section 9 is devoted to the discussion concerning our results. Finally, we provide some concluding remarks on our results in Section 10.

2. Preliminaries

2.1. f-Divergences

The f-divergence between two probability distributions

P_{Z}

and

P_{\bar{Z}}

is defined as follows [17]. Let

f (t)

be a convex function defined for

t > 0

and

f (1) = 0

.

Definition 1

(f-divergence [17]). Let

P_{Z}

and

P_{\bar{Z}}

denote probability distributions over a finite or countably infinite set

Z

. The f-divergence between

P_{Z}

and

P_{\bar{Z}}

is defined by

D_{f} (Z | | \bar{Z}) : = \sum_{z \in Z} P_{\bar{Z}} (z) f (\frac{P_{Z} (z)}{P_{\bar{Z}} (z)}),

(1)

where we set $0 f (\frac{0}{0}) = 0$, $f (0) = \lim_{t \downarrow 0} f (t)$, and $0 f (\frac{a}{0}) = \lim_{t \to 0} t f (\frac{a}{t}) = a \lim_{u \to \infty} \frac{f (u)}{u}$.

The f-divergence is a general approximation measure, which includes some important measures. We give some examples of f-divergences [17,18]:

$f (t) = t log t$ : (Kullback–Leibler (KL) divergence)

$\begin{matrix} D_{f} (Z | | \bar{Z}) = \sum_{z \in Z} P_{Z} (z) log \frac{P_{Z} (z)}{P_{\bar{Z}} (z)} = : D (Z | | \bar{Z}) . \end{matrix}$

(2)
$f (t) = - log t$ : (Reverse Kullback–Leibler divergence)

$\begin{matrix} D_{f} (Z | | \bar{Z}) = \sum_{z \in Z} P_{\bar{Z}} (z) log \frac{P_{\bar{Z}} (z)}{P_{Z} (z)} = D (\bar{Z} | | Z) . \end{matrix}$

(3)
$f (t) = 1 - \sqrt{t}$ : (Hellinger distance)

$\begin{matrix} D_{f} (Z | | \bar{Z}) = 1 - \sum_{z \in Z} \sqrt{P_{Z} (z) P_{\bar{Z}} (z)} . \end{matrix}$

(4)
$f (t) = {(1 - \sqrt{t})}^{2}$ : (Squared Hellinger distance)

$\begin{matrix} D_{f} (Z | | \bar{Z}) = \sum_{z \in Z} {(\sqrt{P_{Z} (z)} - \sqrt{P_{\bar{Z}} (z)})}^{2} . \end{matrix}$

(5)
$f (t) = | t - 1 |$ : (Variational distance)

$\begin{matrix} D_{f} (Z | | \bar{Z}) = \sum_{z \in Z} | P_{Z} (z) - P_{\bar{Z}} (z) | . \end{matrix}$

(6)
$f (t) = {(1 - t)}^{+} : = \max \{1 - t, 0\}$ : (Half variational distance)

$\begin{matrix} D_{f} (Z | | \bar{Z}) & = & \frac{1}{2} \sum_{z \in Z} | P_{Z} (z) - P_{\bar{Z}} (z) | \\ & = & \sum_{z \in Z : P_{Z} (z) > P_{\bar{Z}} (z)} (P_{Z} (z) - P_{\bar{Z}} (z)) . \end{matrix}$

(7)
$f (t) = \frac{t^{α} - α t - (1 - α)}{α (α - 1)}$ : $α$ -divergence ( $0 < α < 1$ )

$\begin{matrix} D_{f} (Z | | \bar{Z}) & = & \frac{1}{α (1 - α)} (1 - \sum_{z \in Z} P_{Z} {(z)}^{α} P_{\bar{Z}} {(z)}^{1 - α}) . \end{matrix}$

(8)
$f (t) = {(t - γ)}^{+}$ : ( $E_{γ}$ -divergence) For any given $γ \geq 1$ ,

$\begin{matrix} D_{f} (Z | | \bar{Z}) = \sum_{z \in Z : P_{Z} (z) > γ P_{\bar{Z}} (z)} (P_{Z} (z) - γ P_{\bar{Z}} (z)) = : E_{γ} (Z | | \bar{Z}) . \end{matrix}$

(9)

The $E_{γ}$-divergence is a generalization of the half variational distance defined in (7): $γ \geq 1$ is arbitrary, and setting $γ = 1$ in (9) recovers (7).
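As a numerical illustration of Definition 1, the following minimal sketch evaluates an f-divergence on a finite set, including the boundary conventions stated after (1). The function name `f_divergence` and its arguments `f_at_zero` and `slope_at_inf` are assumptions introduced here for illustration; the two limits are passed in by hand rather than computed.

```python
import math

def f_divergence(p, q, f, f_at_zero, slope_at_inf):
    """D_f(Z || Zbar) = sum_z q(z) f(p(z) / q(z)) on a finite set.

    p, q          : probability lists for P_Z and P_Zbar (same length).
    f_at_zero     : f(0) = lim_{t -> 0+} f(t), possibly math.inf.
    slope_at_inf  : lim_{u -> inf} f(u) / u, used for 0 * f(a / 0).
    """
    total = 0.0
    for pz, qz in zip(p, q):
        if pz == 0.0 and qz == 0.0:
            continue                      # convention 0 * f(0/0) = 0
        if qz == 0.0:
            total += pz * slope_at_inf    # convention 0 * f(a/0) = a * lim f(u)/u
        elif pz == 0.0:
            total += qz * f_at_zero       # f(0) is defined as a limit
        else:
            total += qz * f(pz / qz)
    return total

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]
kl = f_divergence(p, q, lambda t: t * math.log(t), 0.0, math.inf)       # (2)
tv = f_divergence(p, q, lambda t: abs(t - 1.0), 1.0, 1.0)               # (6)
half_tv = f_divergence(p, q, lambda t: max(1.0 - t, 0.0), 1.0, 0.0)     # (7)
print(kl, tv, half_tv)
```

Passing the two limits explicitly keeps the sketch short; for the functions listed in (2)–(9) they can be read off directly from f.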

Remark 1.

It is known [4] that the

E_{γ}

-divergence can be expressed as an f-divergence using the function:

f (t) = {(γ - t)}^{+} + 1 - γ .

(10)
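The claim of Remark 1 can be checked numerically. The sketch below compares (9) with the f-divergence induced by (10) on small hand-picked distributions; the helper names `e_gamma` and `d_f_remark1` are hypothetical. It uses the identity $q f (p / q) = {(γ q - p)}^{+} + (1 - γ) q$ for $q > 0$, and the fact that the function in (10) has $\lim_{u \to \infty} f (u) / u = 0$, so terms with $q = 0$ vanish.

```python
def e_gamma(p, q, gamma):
    """E_gamma divergence as written in (9)."""
    return sum(max(pz - gamma * qz, 0.0) for pz, qz in zip(p, q))

def d_f_remark1(p, q, gamma):
    """f-divergence for f(t) = (gamma - t)^+ + 1 - gamma, cf. (10).

    For q(z) > 0:  q f(p/q) = (gamma q - p)^+ + (1 - gamma) q.
    For q(z) = 0:  the term is p * lim f(u)/u = 0, which the same
    expression also returns because (0 - p)^+ = 0.
    """
    return sum(max(gamma * qz - pz, 0.0) + (1.0 - gamma) * qz
               for pz, qz in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.6, 0.4, 0.0]
for gamma in (1.0, 1.5, 3.0):
    print(gamma, e_gamma(p, q, gamma), d_f_remark1(p, q, gamma))
```

Agreement on a few examples is of course not a proof; it only illustrates the identity stated in the remark.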

The following key property holds for the f-divergence from Jensen’s inequality [17]:

\begin{matrix} \sum_{z \in Z^{'}} b (z) f (\frac{a (z)}{b (z)}) & \geq & (\sum_{z \in Z^{'}} b (z)) f (\frac{\sum_{z \in Z^{'}} a (z)}{\sum_{z \in Z^{'}} b (z)}) . \end{matrix}

(11)

As we have mentioned above, the f-divergence is a general approximation measure, which includes several important measures. In this study, we first assume the following conditions on the function f that have also been imposed by previous studies [4].

C1): The function $f (t)$ is a decreasing function for $t > 0$ with $f (0) > 0$ .
C2): The function $f (t)$ satisfies

$lim_{u \to \infty} \frac{f (u)}{u} = 0 .$

(12)
C3): For any pair of positive real numbers $(a, b)$ , it holds that

$lim_{n \to \infty} \frac{f (e^{- n b})}{e^{n a}} = 0 .$

(13)

Remark 2.

Notice here that functions

f (t) = - log t

,

f (t) = 1 - \sqrt{t}

, and

f (t) = {(1 - t)}^{+}

satisfy the above conditions, while

f (t) = t log t

does not satisfy conditions C1) and C2). Moreover, it is not difficult to check that (10) satisfies these conditions.

Remark 3.

For a decreasing function

f (t)

, it always holds that

f (0) = {lim}_{t ↓ 0} f (t) \geq 0

because

f (1) = 0

. Then, the condition

f (0) > 0

in C1) excludes the case of

f (t) = 0

for all

t \geq 0

, in which f-divergence is identically zero.

Remark 4.

From the definition of the f-divergence, C2) means

0 f (\frac{a}{0}) = 0,

(14)

for any

a > 0

. In the derivation of our main theorems, we can use (14) instead of (12).

Remark 5.

We will show in Section 5 that condition C1) is automatically met for the function f satisfying condition C2) (cf. claim (i) of Lemma 1).

2.2. Smooth Rényi Entropy

In what follows, we consider the case of

Z = X^{n}

, where

X

is a finite or countably infinite set and n is an integer. We consider the general source defined as an infinite sequence

X = {\{X^{n} = (X_{1}^{(n)}, X_{2}^{(n)}, \dots, X_{n}^{(n)})\}}_{n = 1}^{\infty}

of n-dimensional random variables

X^{n}

, where each component random variable

X_{i}^{(n)}

takes values in a countable set

X

. Let

P_{X} (\cdot)

denote the probability distribution of the random variable X. In this paper, we assume the following condition on the source

X

:

\underline{H} (X) < + \infty,

(15)

where

\begin{matrix} \underline{H} (X) & = & \sup \{R \mid \lim_{n \to \infty} \Pr \{\frac{1}{n} \log \frac{1}{P_{X^{n}} (X^{n})} \geq R\} = 1\} \end{matrix}

(16)

is called the spectral inf-entropy rate of the source

X

[8]. Here, Han [8] (Theorem 1.7.2) has shown that

\begin{matrix} \underline{H} (X) \leq \log | X | \end{matrix}

(17)

holds. Hence, the condition (15) holds for any source with a finite alphabet.

The random number

U_{M}

which is uniformly distributed on

{1, 2, \dots, M}

is defined by

P_{U_{M}} (i) = \frac{1}{M}, i \in U_{M} : = {1, 2, \dots, M} .

(18)

We next introduce the smooth Rényi entropy of the source.

Definition 2

(Smooth Rényi entropy of order

α

[19]). For given random variables

X^{n}

, the smooth Rényi entropy of order α given

δ (0 \leq δ < 1)

is defined by

H_{α} (δ | X^{n}) : = \frac{1}{1 - α} inf_{P_{{\bar{X}}^{n}} \in B^{δ} (P_{X^{n}})} log (\sum_{x \in X^{n}} P_{{\bar{X}}^{n}} {(x)}^{α}),

(19)

where

\begin{matrix} B^{δ} (P_{X^{n}}) & : = & \{P_{{\bar{X}}^{n}} \in P^{n} |\frac{1}{2} \sum_{x \in X^{n}} | P_{X^{n}} (x) - P_{{\bar{X}}^{n}} (x) | \leq δ\} . \end{matrix}

(20)

Here,

H_{α} (δ | X^{n})

is a decreasing function of

δ

. The smooth Rényi entropy of order 0 and the smooth Rényi entropy of order ∞ are, respectively, called the smooth max entropy and the smooth min entropy [20].

The following theorems have shown alternative expressions of the smooth max entropy and the smooth min entropy.

Theorem 1

(Uyematsu [6,21]).

H_{0} (δ | X^{n}) = min_{\begin{matrix} A_{n} \subset X^{n} \\ Pr {X^{n} \in A_{n}} \geq 1 - δ \end{matrix}} log | A_{n} | .

(21)

Theorem 2

(Uyematsu and Kunimatsu [10]).

H_{\infty} (δ | X^{n}) = - inf_{\begin{matrix} β \geq \frac{1}{| X^{n} |} : \\ \sum_{x \in X^{n}} {(P_{X^{n}} (x) - β)}^{+} \leq δ \end{matrix}} log β,

(22)

where, if $\mathcal{X}$ is a countably infinite set, the infimum is taken over $β \geq 0$.

It should be noted that these alternative expressions are simple and easy to understand compared to (19). Figure 1 and Figure 2 show operational meanings of (21) and (22). As in Figure 1, the smooth max entropy $H_{0} (δ | X^{n})$ is equal to the logarithm of the cardinality of the smallest set $A_{n}$ with $\Pr \{X^{n} \in A_{n}\} \geq 1 - δ$; such a set collects the sequences $x \in X^{n}$ with the largest probabilities. On the other hand, the smooth min entropy $H_{\infty} (δ | X^{n})$ is equal to the supremum of $- \log β$ such that the total probability mass exceeding the level $β$, that is, $\sum_{x \in X^{n}} {(P_{X^{n}} (x) - β)}^{+}$, is less than or equal to $δ$ (Figure 2).

In this paper, we use the above alternative expressions of the smooth max entropy and the smooth min entropy instead of (19).
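The alternative expressions (21) and (22) translate directly into short computations for a finite distribution. The following is a minimal sketch under that assumption; the function names are hypothetical, natural logarithms are used, and a small tolerance guards against floating-point rounding in the mass comparison.

```python
import math

def smooth_max_entropy(p, delta):
    """H_0(delta | X) via (21): log-size of the smallest set with mass >= 1 - delta."""
    target = 1.0 - delta
    mass, size = 0.0, 0
    for px in sorted(p, reverse=True):          # most probable elements first
        if size > 0 and mass >= target - 1e-12:
            break
        mass += px
        size += 1
    return math.log(size)

def smooth_min_entropy(p, delta):
    """H_inf(delta | X) via (22): -log of the smallest admissible level beta."""
    probs = sorted(p, reverse=True)
    n = len(probs)
    top_sum, beta = 0.0, probs[0]
    for k in range(1, n + 1):
        top_sum += probs[k - 1]
        # If exactly the k largest probabilities exceed beta, the excess mass
        # sum_{i <= k} (p_i - beta) equals delta at beta = (top_sum - delta) / k.
        beta = (top_sum - delta) / k
        next_p = probs[k] if k < n else 0.0
        if beta >= next_p:                      # consistent with "only k exceed beta"
            break
    beta = max(beta, 1.0 / n)                   # constraint beta >= 1 / |X| in (22)
    return -math.log(beta)

p = [0.5, 0.25, 0.125, 0.125]
print(smooth_max_entropy(p, 0.25), smooth_min_entropy(p, 0.25))
```

For this example, the two most probable elements already carry mass 0.75, so the smooth max entropy at δ = 0.25 is log 2, while cutting the distribution at level β = 0.25 removes mass exactly 0.25, so the smooth min entropy is log 4.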

3. Source Resolvability Problem

We consider the problem concerning how to simulate a given discrete source

X = {X^{n}}_{n = 1}^{\infty}

by using the uniform random number

U_{M_{n}}

and the mapping

ϕ_{n}

. Figure 3 is an illustrative figure of this problem (the probability distribution for

X^{n}

is depicted in black, while the one for

ϕ_{n} (U_{M_{n}})

is shown in blue). Since it is hard to simulate the exact source in general, we consider the approximation problem under some measure. This problem is called the resolvability problem. One of the main objectives in the resolvability problem is to derive the smallest value of a in the form of

M_{n} = e^{n a}

, which we call the optimum resolvability rate [1,8]. This is formulated as follows.

Definition 3.

Rate R is said to be D-achievable with the given f-divergence if there exists a sequence of mappings

ϕ_{n} : U_{M_{n}} \to X^{n}

such that

\underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D,

(23)

\underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \leq R .

(24)

Given some D, if the rate constraint R is sufficiently large, it can be shown that there exists a sequence of mappings satisfying constraints in the above definition. Conversely, if R is too small, no sequence of mappings that satisfies constraints can be found. Therefore, in the resolvability problem, the infimum of R is of particular interest.

Definition 4 (First-order optimum resolvability rate).

\begin{matrix} S_{r}^{(f)} (D | X) & : = & \inf \{R \mid R \text{ is } D\text{-achievable with the given } f\text{-divergence}\} . \end{matrix}

(25)

Remark 6.

It should be noted that we do not use

D_{f} (ϕ_{n} (U_{M_{n}}) | | X^{n})

but

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}}))

as a condition in Definition 3. This distinction is important when considering asymmetric measures such as the KL divergence.

Remark 7.

We consider the case where D is in

[0, f (0))

under the given f-divergence. Since

f (t)

is defined in the range

t > 0

and we assume that the function

f (t)

is a decreasing function of t,

D_{f} (X^{n} | | Y^{n}) \leq f (0)

holds for any distributions

P_{X^{n}} (\cdot)

and

P_{Y^{n}} (\cdot)

from the definition of f-divergence. Hence,

D \geq f (0)

means that there exists no restriction about the approximation error (for example,

f (0) = 1

in the case of the half variational distance and

f (0) = \infty

in the case of the reverse KL divergence). This case leads to the trivial result that the first-order optimum resolvability rate equals 0. Hence, we only consider the case of

D \in [0, f (0))

. A similar observation is applicable throughout the following sections.

Our main objective in this section is to derive the general formula of the first-order optimum resolvability rate. To do so, we first derive the following two theorems. We use the notation

f^{- 1} (a) = \inf \{t \mid f (t) = a\} .
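For a continuous decreasing f, the quantity $f^{- 1} (D)$ with $0 \leq D < f (0)$ lies in $(0, 1]$ and can be located by bisection, since $\inf \{t \mid f (t) = D\}$ then coincides with $\inf \{t \mid f (t) \leq D\}$. The sketch below assumes exactly this setting (continuity and monotonicity of f); the function name `f_inverse` is hypothetical.

```python
import math

def f_inverse(f, D, iterations=60):
    """Approximate f^{-1}(D) = inf{t : f(t) = D} for continuous decreasing f.

    Bisection on (0, 1]: f(1) = 0 <= D, while f(t) > D for t close to 0
    whenever D < f(0). f is never evaluated at t = 0.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) <= D:
            hi = mid          # mid is still admissible, move left
        else:
            lo = mid
    return hi

half_tv = lambda t: max(1.0 - t, 0.0)      # closed form: f^{-1}(D) = 1 - D
reverse_kl = lambda t: -math.log(t)        # closed form: f^{-1}(D) = exp(-D)
hellinger = lambda t: 1.0 - math.sqrt(t)   # closed form: f^{-1}(D) = (1 - D)^2

for f in (half_tv, reverse_kl, hellinger):
    print(f_inverse(f, 0.3))
```

For the half variational distance this gives $1 - f^{- 1} (D) = D$, so the smoothing parameter appearing in the theorems below is simply D in that case.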

Theorem 3.

Under conditions C1)–C3), for any

γ > 0

and any

M_{n}

satisfying

\frac{1}{n} log M_{n} \geq \frac{1}{n} H_{0} (1 - f^{- 1} (D) | X^{n}) + γ,

(26)

there exists a mapping

ϕ_{n}

, which satisfies

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D + γ

(27)

for sufficiently large n.

Proof.

We arbitrarily fix

M_{n}

satisfying (26). We show that there exists a mapping

ϕ_{n}

that satisfies (27) for sufficiently large n. Let

B_{n} \subset X^{n}

denote a set satisfying

Pr {X^{n} \in B_{n}} \geq f^{- 1} (D)

(28)

and

log | B_{n} | = H_{0} (1 - f^{- 1} (D) | X^{n}) .

(29)

The existence of the above set

B_{n}

is guaranteed by (21). We define the probability distribution

P_{{\bar{X}}^{n}}

over

B_{n}

as

P_{{\bar{X}}^{n}} (x) : = \begin{cases} \frac{P_{X^{n}} (x)}{\Pr \{X^{n} \in B_{n}\}} & x \in B_{n}, \\ 0 & \text{otherwise} . \end{cases}

(30)

Furthermore, let a set

C_{n}

be as

C_{n} : = \{x \in B_{n} |P_{{\bar{X}}^{n}} (x) \geq \frac{1}{M_{n}}\}

(31)

and arrange elements in

C_{n}

as

C_{n} = {x_{1}, x_{2}, \dots, x_{| C_{n} |}}

(32)

according to

P_{{\bar{X}}^{n}} (x)

in ascending order. That is,

P_{{\bar{X}}^{n}} (x_{i}) \leq P_{{\bar{X}}^{n}} (x_{j}) (1 \leq i < j \leq | C_{n} |)

holds. Here, we define

i^{*} : = | C_{n} |

and index

x \in B_{n} ∖ C_{n}

as

x_{i^{*} + 1}, x_{i^{*} + 2}, \dots, x_{| B_{n} |}

arbitrarily.

Then, from the above definition, it holds that

P_{{\bar{X}}^{n}} (x_{i^{*}}) = max_{x \in C_{n}} P_{{\bar{X}}^{n}} (x) .

(33)

Thus, from the assumption (15), for any small

ε \in (0, \underline{H} (X))

, it holds that

P_{{\bar{X}}^{n}} (x_{i^{*}}) \geq e^{- n (\underline{H} (X) - ε)}

(34)

for sufficiently large n.

Set

k_{0} = 0

. For

x_{1}

we determine

k_{1}

such that

\frac{k_{1}}{M_{n}} \leq P_{{\bar{X}}^{n}} (x_{1}), \frac{k_{1} + 1}{M_{n}} > P_{{\bar{X}}^{n}} (x_{1}) .

(35)

Secondly, we determine

k_{2}

for

x_{2}

such that

\frac{k_{2} - k_{1}}{M_{n}} \leq P_{{\bar{X}}^{n}} (x_{2}), \frac{k_{2} - k_{1} + 1}{M_{n}} > P_{{\bar{X}}^{n}} (x_{2}) .

(36)

In a similar way, we repeat this operation to choose

k_{i}

for

x_{i}

as long as possible. Then, it is not difficult to check that the above procedure does not stop for any $i < i^{*}$.

We define a mapping

ϕ_{n} : U_{M_{n}} \to X^{n}

as

ϕ_{n} (j) = \begin{cases} x_{i} & k_{i - 1} + 1 \leq j \leq k_{i}, \ i < i^{*}, \\ x_{i^{*}} & \text{otherwise} \end{cases}

(37)

and set

{\tilde{X}}^{n} = ϕ_{n} (U_{M_{n}})

.

We evaluate the performance of the mapping

ϕ_{n}

. From the construction of the mapping, for any i satisfying

1 \leq i \leq i^{*} - 1

it holds that

P_{{\tilde{X}}^{n}} (x_{i}) \leq P_{{\bar{X}}^{n}} (x_{i})

(38)

P_{{\bar{X}}^{n}} (x_{i}) < P_{{\tilde{X}}^{n}} (x_{i}) + \frac{1}{M_{n}} .

(39)

We next evaluate

P_{{\tilde{X}}^{n}} (x_{i^{*}})

. From the construction, we have

P_{{\bar{X}}^{n}} (x_{i^{*}}) \leq P_{{\tilde{X}}^{n}} (x_{i^{*}})

. Since

P_{{\tilde{X}}^{n}} (x_{i}) = 0

holds for

\forall x_{i} \in B_{n} \setminus C_{n}

, we obtain

P_{{\bar{X}}^{n}} (x_{i}) - P_{{\tilde{X}}^{n}} (x_{i}) = P_{{\bar{X}}^{n}} (x_{i}) < \frac{1}{M_{n}}

(40)

for

\forall x_{i} \in B_{n} \setminus C_{n}

. Hence, also from the construction of the mapping, we obtain

\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i^{*}}) - P_{{\bar{X}}^{n}} (x_{i^{*}}) & = & (1 - \sum_{i = 1}^{i^{*} - 1} P_{{\tilde{X}}^{n}} (x_{i})) - (1 - \sum_{x_{i} \in B_{n} \setminus \{x_{i^{*}}\}} P_{{\bar{X}}^{n}} (x_{i})) \\ & = & \sum_{x_{i} \in B_{n} \setminus \{x_{i^{*}}\}} P_{{\bar{X}}^{n}} (x_{i}) - \sum_{x_{i} \in B_{n} \setminus \{x_{i^{*}}\}} P_{{\tilde{X}}^{n}} (x_{i}) \\ & = & \sum_{x_{i} \in B_{n} \setminus \{x_{i^{*}}\}} (P_{{\bar{X}}^{n}} (x_{i}) - P_{{\tilde{X}}^{n}} (x_{i})) \\ & \leq & \frac{| B_{n} |}{M_{n}} \\ & \leq & e^{- n γ}, \end{matrix}

(41)

where the second equality is from the fact that

P_{{\tilde{X}}^{n}} (x_{i}) = 0

for

\forall x_{i} \in B_{n} \setminus C_{n}

, the first inequality is due to (39) and (40), and the last inequality is obtained from (26) and (29). Thus, we have

P_{{\tilde{X}}^{n}} (x_{i^{*}}) \leq P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ} .

(42)

From the above argument, the f-divergence is given by

\begin{matrix} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) & = & \sum_{i = 1}^{i^{*}} P_{{\tilde{X}}^{n}} (x_{i}) f (\frac{P_{X^{n}} (x_{i})}{P_{{\tilde{X}}^{n}} (x_{i})}) \\ & = & \sum_{i = 1}^{i^{*}} P_{{\tilde{X}}^{n}} (x_{i}) f (\frac{P_{{\bar{X}}^{n}} (x_{i}) \Pr \{X^{n} \in B_{n}\}}{P_{{\tilde{X}}^{n}} (x_{i})}) \\ & = & \sum_{i = 1}^{i^{*} - 1} P_{{\tilde{X}}^{n}} (x_{i}) f (\frac{P_{{\bar{X}}^{n}} (x_{i}) \Pr \{X^{n} \in B_{n}\}}{P_{{\tilde{X}}^{n}} (x_{i})}) \\ & & + P_{{\tilde{X}}^{n}} (x_{i^{*}}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}}{P_{{\tilde{X}}^{n}} (x_{i^{*}})}) \\ & \leq & \sum_{i = 1}^{i^{*} - 1} P_{{\tilde{X}}^{n}} (x_{i}) f (\Pr \{X^{n} \in B_{n}\}) \\ & & + P_{{\tilde{X}}^{n}} (x_{i^{*}}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}}{P_{{\tilde{X}}^{n}} (x_{i^{*}})}), \end{matrix}

(43)

where the first equality is due to the condition C2) and the last inequality is due to (38) and the condition C1).

The second term of the RHS of (43) is evaluated as follows. From (42) and C1), we have

\begin{matrix} P_{{\tilde{X}}^{n}} (x_{i^{*}}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\}}{P_{{\tilde{X}}^{n}} (x_{i^{*}})}) \\ \leq (P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\}}{P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}}) . \end{matrix}

(44)

Here, using the relation

\begin{matrix} P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\} \\ = (1 - e^{- n γ}) P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\} + e^{- n γ} P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\}, \end{matrix}

(45)

we obtain

\begin{matrix} (P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}}{P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}}) \\ \leq P_{{\bar{X}}^{n}} (x_{i^{*}}) f (\frac{(1 - e^{- n γ}) P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}}{P_{{\bar{X}}^{n}} (x_{i^{*}})}) + e^{- n γ} f (\frac{e^{- n γ} P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}}{e^{- n γ}}) \\ = P_{{\bar{X}}^{n}} (x_{i^{*}}) f ((1 - e^{- n γ}) \Pr \{X^{n} \in B_{n}\}) + e^{- n γ} f (P_{{\bar{X}}^{n}} (x_{i^{*}}) \Pr \{X^{n} \in B_{n}\}) \\ \leq P_{{\bar{X}}^{n}} (x_{i^{*}}) f ((1 - e^{- n γ}) \Pr \{X^{n} \in B_{n}\}) + e^{- n γ} f (e^{- n (\underline{H} (X) - ε)} \Pr \{X^{n} \in B_{n}\}) \end{matrix}

(46)

for sufficiently large n, where the first inequality is due to (11) and the last inequality is from (34) and the condition C1).

Hence, from C3) and the continuity of the function f, for

\forall ν > 0

we have

\begin{matrix} (P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}) f (\frac{P_{{\bar{X}}^{n}} (x_{i^{*}}) Pr \{X^{n} \in B_{n}\}}{P_{{\bar{X}}^{n}} (x_{i^{*}}) + e^{- n γ}}) \\ \leq P_{{\bar{X}}^{n}} (x_{i^{*}}) f (Pr \{X^{n} \in B_{n}\} - e^{- n γ}) + ν \\ \leq P_{{\bar{X}}^{n}} (x_{i^{*}}) f (Pr \{X^{n} \in B_{n}\}) + 2 ν \end{matrix}

(47)

for sufficiently large n. Therefore, noting that

P_{{\bar{X}}^{n}} (x_{i^{*}}) \leq P_{{\tilde{X}}^{n}} (x_{i^{*}})

, from (28), (43), (44) and (47) it holds that

\begin{matrix} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) & \leq & \sum_{i = 1}^{i^{*}} P_{{\tilde{X}}^{n}} (x_{i}) f (\Pr \{X^{n} \in B_{n}\}) + 2 ν \\ & = & f (\Pr \{X^{n} \in B_{n}\}) + 2 ν \\ & \leq & f (f^{- 1} (D)) + 2 ν \\ & = & D + 2 ν \end{matrix}

(48)

for sufficiently large n. This completes the proof of the theorem. □
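The construction in the proof above can be mimicked for a single finite distribution. The following sketch is an illustration only, not the asymptotic statement of Theorem 3: it builds $B_{n}$, $P_{{\bar{X}}^{n}}$, $C_{n}$ and the output distribution of $ϕ_{n} (U_{M_{n}})$ as in (28)–(37), using exact rationals so that the floor operations in (35) and (36) are not affected by rounding. The names `resolvability_mapping` and `d_f`, and the argument `mass_target` (standing for $f^{- 1} (D)$), are assumptions introduced here.

```python
import math
from fractions import Fraction

def resolvability_mapping(p, mass_target, M):
    """Return the pmf of phi(U_M) built as in the proof of Theorem 3.

    p           : list of Fractions (the target distribution).
    mass_target : Fraction playing the role of f^{-1}(D) in (28).
    M           : size of the uniform random number.
    """
    order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    B, mass = [], Fraction(0)
    for i in order:                       # smallest set with Pr{X in B} >= mass_target
        if mass >= mass_target:
            break
        B.append(i)
        mass += p[i]
    p_bar = {i: p[i] / mass for i in B}   # normalized restriction, cf. (30)
    # C as in (31), arranged in ascending order of p_bar, cf. (32)
    C = sorted((i for i in B if p_bar[i] >= Fraction(1, M)), key=lambda i: p_bar[i])
    if not C:
        raise ValueError("M is too small: the set C is empty")
    # k_i - k_{i-1} = floor(M * p_bar(x_i)) for i < i*, cf. (35) and (36)
    counts = {i: math.floor(M * p_bar[i]) for i in C[:-1]}
    counts[C[-1]] = M - sum(counts.values())   # remaining indices go to x_{i*}, cf. (37)
    return [Fraction(counts.get(i, 0), M) for i in range(len(p))]

def d_f(p, p_tilde, f):
    """D_f(X || phi(U_M)); terms with p_tilde(x) = 0 vanish under C2), cf. (14)."""
    return sum(float(q) * f(float(px / q)) for px, q in zip(p, p_tilde) if q > 0)

p = [Fraction(4, 10), Fraction(3, 10), Fraction(2, 10), Fraction(1, 10)]
half_tv = lambda t: max(1.0 - t, 0.0)     # f^{-1}(D) = 1 - D, so D = 0.1 gives 9/10
for M in (8, 900):
    print(M, d_f(p, resolvability_mapping(p, Fraction(9, 10), M), half_tv))
```

In this toy example the divergence is 0.225 for M = 8 and drops to the target D = 0.1 for M = 900, which mirrors how the quantization error of order $| B_{n} | / M_{n}$ in (41) becomes negligible as $M_{n}$ grows.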

Theorem 4.

Under conditions C1) and C2), for any mapping

ϕ_{n}

satisfying

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D,

(49)

it holds that

\frac{1}{n} log M_{n} \geq \frac{1}{n} H_{0} (1 - f^{- 1} (D) | X^{n}) .

(50)

Proof.

It suffices to show the fact that the relation

\frac{1}{n} log M_{n} < \frac{1}{n} H_{0} (1 - f^{- 1} (D) | X^{n})

(51)

necessarily yields

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) > D .

(52)

We here denote

H^{'} : = H_{0} (1 - f^{- 1} (D) | X^{n})

for short. For any fixed mapping

ϕ_{n} : U_{M_{n}} \to X^{n}

, we set

{\tilde{X}}^{n} : = ϕ_{n} (U_{M_{n}})

and

B_{n} : = \{x \in X^{n} | P_{{\tilde{X}}^{n}} (x) > 0\} .

(53)

Then, from the property of the mapping it must hold that

M_{n} \geq | B_{n} | .

(54)

From the condition C2) the f-divergence between

P_{X^{n}}

and

P_{{\tilde{X}}^{n}}

is lower bounded by

\begin{matrix} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) & = & \sum_{x \in B_{n}} P_{{\tilde{X}}^{n}} (x) f (\frac{P_{X^{n}} (x)}{P_{{\tilde{X}}^{n}} (x)}) \\ & \geq & f (\Pr \{X^{n} \in B_{n}\}) \\ & \geq & f (\max_{\substack{B_{n} \subset X^{n} \\ | B_{n} | \leq M_{n}}} \Pr \{X^{n} \in B_{n}\}) \\ & \geq & f (\max_{\substack{B_{n} \subset X^{n} \\ \log | B_{n} | < H^{'}}} \Pr \{X^{n} \in B_{n}\}) \\ & > & f (1 - (1 - f^{- 1} (D))) \\ & = & D, \end{matrix}

(55)

where the first inequality is due to (11), the second inequality is due to condition C1) and (54) and the third inequality is from (51). The last inequality is from the definition of the alternative expressions given in Theorem 1. This completes the proof. □

Theorems 3 and 4 show that the smooth max entropy and the inverse function of f have important roles in the resolvability problem with respect to f-divergences. From these theorems, we obtain the following theorem, which addresses the general formula of the optimum resolvability rate. It should be noted that because of the assumption

0 \leq D < f (0)

and C1), we have

0 < f^{- 1} (D) \leq 1

.

Theorem 5.

Under conditions C1)–C3), it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = & \lim_{ν \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) \\ & = & \lim_{ν \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_{0} (1 - f^{- 1} (D) + ν | X^{n}) . \end{matrix}

(56)

Proof.

We here show the first equality, because the second equality can be derived from the first equality together with the continuity of the function

f^{- 1}

.

(Direct Part:) Fix

ν > 0

arbitrarily. From Theorem 3, for any

γ > 0

, there exists a mapping

ϕ_{n}

such that

\begin{matrix} \frac{1}{n} log M_{n} & \leq & \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) + γ, \end{matrix}

(57)

and

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D + ν + γ .

(58)

We here use the diagonal line argument [8]. Fix a sequence

{γ_{i}}_{i = 1}^{\infty}

such that

γ_{1} > γ_{2} > \dots > 0

, and we repeat the above argument as

i \to \infty

. Then, we can show that there exists a mapping

ϕ_{n}

satisfying

\underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D + ν,

(59)

and

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \leq \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) . \end{matrix}

(60)

Here, also from the diagonal line argument with respect to

ν

, we obtain

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \leq lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) . \end{matrix}

(61)

This completes the proof of the direct part.

(Converse Part:) Fix

ν > 0

arbitrarily. From Theorem 4, for any mapping

ϕ_{n}

satisfying

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D + ν,

(62)

it holds that

\frac{1}{n} log M_{n} \geq \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) .

(63)

Consequently, we have

\underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D + ν

(64)

and

\underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \geq \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) .

(65)

We also use the diagonal line argument [8]. We repeat the above argument as

i \to \infty

for a sequence

{ν_{i}}_{i = 1}^{\infty}

such that

ν_{1} > ν_{2} > \dots > 0

. Then, for any mapping

ϕ_{n}

satisfying

\underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D,

(66)

it holds that

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} & \geq & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) . \end{matrix}

(67)

This completes the proof of the converse part. □

4. Intrinsic Randomness Problem

In the previous section, we revealed the general formula for the optimum resolvability rate. In this section, we consider how to approximate the uniform random number

U_{M_{n}}

by using the given discrete source

X = {X^{n}}_{n = 1}^{\infty}

and the mapping

φ_{n}

. Figure 4 is an illustrative figure of the problem (the probability distribution for

U_{M_{n}}

is depicted in blue, while the one for

φ_{n} (X^{n})

is shown in black). The size of the random number

M_{n}

is required to be as large as possible. In the intrinsic randomness problem, one of our main concerns is to derive the largest value of b in the form of

M_{n} = e^{n b}

under some approximation measure [7]. This problem setting is formulated as follows.

Definition 5.

R is said to be Δ-achievable with the given f-divergence if there exists a sequence of mappings

φ_{n} : X^{n} \to U_{M_{n}}

such that

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & \leq & Δ, \end{matrix}

(68)

\begin{matrix} \underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} & \geq & R . \end{matrix}

(69)

In this case, given

Δ

, if the rate constraint R is sufficiently small, it can be shown that there exists a sequence of mappings that satisfies the constraints. On the other hand, if R is too large, no sequence of mappings that achieves the desired constraints can be found. Consequently, in this setting, the supremum of R is of particular interest.

Definition 6 (First-order optimum intrinsic randomness rate).

S_{ι}^{(f)} (Δ | X) : = sup \{R | R \text{ is } Δ\text{-achievable with the given } f\text{-divergence}\} .

(70)
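As a small numerical illustration of Definitions 5 and 6, the following Python sketch evaluates the approximation measure for a toy source and two candidate mappings. It assumes the convention D_f(Z || Z̄) = Σ P_Z̄ f(P_Z / P_Z̄) used in this paper (cf. (91)). The helper names `pushforward` and `f_divergence_to_uniform`, the toy distribution, and both mappings are illustrative assumptions and are not taken from the paper.

```python
def pushforward(p, phi, m):
    """Distribution of phi(X) on {0, ..., m-1} when X has distribution p."""
    out = [0.0] * m
    for x, px in enumerate(p):
        out[phi[x]] += px
    return out


def f_divergence_to_uniform(p_out, f):
    """D_f(phi(X) || U_M) = sum_i (1/M) * f(P(i) / (1/M))."""
    m = len(p_out)
    return sum((1.0 / m) * f(q * m) for q in p_out)


def half_variational(t):
    """f(t) = (1 - t)^+, which yields the half variational distance."""
    return max(1.0 - t, 0.0)


# Toy source on four symbols (hypothetical numbers).
p = [0.5, 0.25, 0.125, 0.125]
phi_balanced = [0, 1, 1, 1]   # output masses 0.5 and 0.5
phi_skewed = [0, 0, 1, 1]     # output masses 0.75 and 0.25

d_balanced = f_divergence_to_uniform(pushforward(p, phi_balanced, 2), half_variational)
d_skewed = f_divergence_to_uniform(pushforward(p, phi_skewed, 2), half_variational)
print(d_balanced, d_skewed)
```

For a fixed output size, the mapping with the smaller divergence is the better approximation of the uniform random number; the achievability question asks how fast the output size may grow while the divergence stays below Δ asymptotically.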

Remark 8.

It should be emphasized that we use the f-divergence of the form

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}})

instead of

D_{f} (U_{M_{n}} | | φ_{n} (X^{n}))

(cf. Remark 6).

We also assume that

Δ \in [0, f (0))

in this section (cf. Remark 7). In order to analyze the general formula of the optimum intrinsic randomness rate

S_{ι}^{(f)} (Δ | X)

, we first give two theorems.

Theorem 6.

Under conditions C1) and C2), for any

γ > 0

and

M_{n}

satisfying

\frac{1}{n} log M_{n} \leq \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ) | X^{n}) - γ,

(71)

there exists a mapping

φ_{n}

such that

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ + γ

(72)

for sufficiently large n.

Proof.

We set

β_{0}

as

- log β_{0} = H_{\infty} (1 - f^{- 1} (Δ) | X^{n})

(73)

for short.

From Theorem 2, we notice that

1 - f^{- 1} (Δ) \geq \sum_{x \in X^{n}} {(P_{X^{n}} (x) - β_{0})}^{+} = : 1 - A_{n} (Δ),

(74)

where if

β_{0} > 1 / | X^{n} |

holds, then

f^{- 1} (Δ) = A_{n} (Δ)

. We shall show that for any

M_{n}

satisfying

\frac{1}{n} log M_{n} \leq - \frac{1}{n} log β_{0} - \frac{1}{n} log \frac{1}{A_{n} (Δ)} - \frac{γ}{2},

(75)

there exists a mapping

φ_{n}

such that

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ + γ

(76)

for sufficiently large n.

For every sequence

x \in X^{n}

, we define the probability distribution

P_{{\bar{X}}^{n}} (x) : = \{\begin{matrix} \frac{β_{0}}{A_{n} (Δ)} & P_{X^{n}} (x) \geq β_{0}, \\ \frac{P_{X^{n}} (x)}{A_{n} (Δ)} & P_{X^{n}} (x) < β_{0} . \end{matrix}

(77)

Since

0 < A_{n} (Δ) < 1

, this probability distribution is well-defined. Then, from the definition of the smooth min entropy it holds that

\sum_{x \in X^{n}} P_{{\bar{X}}^{n}} (x) = 1 .

(78)

Here, from (75) and the definition of the smooth min entropy it holds that

\begin{matrix} M_{n} & \leq \frac{1}{β_{0}} A_{n} (Δ) e^{- n γ / 2} \leq | X^{n} | \end{matrix}

(79)

for sufficiently large n.

We next define the mapping

φ_{n} : X^{n} \to U_{M_{n}}

by using

P_{{\bar{X}}^{n}}

. To do so, we classify the elements of

X^{n}

into

I_{n} (i) (1 \leq i \leq M_{n})

as follows.

We choose a set $I_{n} (1)$ arbitrarily satisfying

$\sum_{x \in I_{n} (1)} P_{{\bar{X}}^{n}} (x) \leq \frac{1}{M_{n}},$

(80)

$\sum_{x \in I_{n} (1)} P_{{\bar{X}}^{n}} (x) + P_{{\bar{X}}^{n}} (x^{'}) > \frac{1}{M_{n}}$

(81)

for any $x^{'} \in X^{n} ∖ I_{n} (1)$ .
Next, we choose a set $I_{n} (2) \subset X^{n} ∖ I_{n} (1)$ satisfying

$\sum_{x \in I_{n} (2)} P_{{\bar{X}}^{n}} (x) \leq \frac{1}{M_{n}},$

(82)

$\sum_{x \in I_{n} (2)} P_{{\bar{X}}^{n}} (x) + P_{{\bar{X}}^{n}} (x^{'}) > \frac{1}{M_{n}}$

(83)

for any $x^{'} \in X^{n} ∖ ⋃_{i = 1}^{2} I_{n} (i)$ .

Furthermore, we repeat this operation

(M_{n} - 1)

times so as to choose sets

I_{n} (i) (1 \leq i \leq M_{n} - 1)

. Notice here that since

\frac{1}{M_{n}} > \frac{β_{0}}{A_{n} (Δ)}

holds, we can repeat this operation

(M_{n} - 1)

times. Thus, by the above procedure, all of the sets

I_{n} (i) (1 \leq i \leq M_{n} - 1)

are nonempty. Lastly, we set

I_{n} (M_{n}) = X^{n} ∖ \cup_{i = 1}^{M_{n} - 1} I_{n} (i)

.

From

I_{n} (i) (1 \leq i \leq M_{n})

, we define the mapping

φ_{n} : X^{n} \to U_{M_{n}}

as follows:

φ_{n} (x) = i, x \in I_{n} (i) .

(84)

Furthermore, we set

{\tilde{U}}_{M_{n}} = φ_{n} (X^{n})

. Thus,

P_{{\tilde{U}}_{M_{n}}} (i) = \sum_{x \in I_{n} (i)} P_{X^{n}} (x)

(85)

holds for every i in

1 \leq i \leq M_{n}

.
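The construction of the mapping can be sketched in Python as follows. This is a toy sketch under stated assumptions: the source distribution, the value of β_0, and M are hypothetical numbers chosen so that 1/M > β_0/A holds, exact rational arithmetic is used to avoid rounding issues, indices are 0-based, and a single greedy pass is only one of many ways to choose the sets arbitrarily as required by (80)–(83).

```python
from fractions import Fraction


def clipped_distribution(p, beta0):
    """Return A = 1 - sum_x (p(x) - beta0)^+ and p_bar(x) = min(p(x), beta0) / A, cf. (74), (77)."""
    a = 1 - sum(max(px - beta0, 0) for px in p)
    return a, [min(px, beta0) / a for px in p]


def greedy_bins(p_bar, m):
    """Fill bins 0..m-2 maximally subject to p_bar-mass <= 1/m; the last bin takes the rest."""
    cap = Fraction(1, m)
    remaining = list(range(len(p_bar)))
    bins = []
    for _ in range(m - 1):
        mass, chosen, skipped = Fraction(0), [], []
        for x in remaining:
            if mass + p_bar[x] <= cap:
                chosen.append(x)
                mass += p_bar[x]
            else:
                # Adding x would exceed 1/m, and it still would after the bin is complete.
                skipped.append(x)
        bins.append(chosen)
        remaining = skipped
    bins.append(remaining)
    return bins


# Hypothetical toy source on eight symbols.
p = [Fraction(1, 2), Fraction(1, 8), Fraction(1, 8), Fraction(1, 16),
     Fraction(1, 16), Fraction(1, 16), Fraction(1, 32), Fraction(1, 32)]
beta0, m = Fraction(1, 16), 3

a, p_bar = clipped_distribution(p, beta0)
bins = greedy_bins(p_bar, m)
phi = {x: i for i, members in enumerate(bins) for x in members}   # mapping as in (84)
p_out = [sum(p[x] for x in members) for members in bins]          # output masses as in (85)
bar_mass = [sum(p_bar[x] for x in members) for members in bins]
print(bins, p_out)
```

On such a small alphabet the output is far from uniform; the sketch is only meant to make the bookkeeping of the sets and of the clipped distribution concrete.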

We next evaluate the above mapping

φ_{n}

. From the construction of the mapping, it holds that

\frac{1}{M_{n}} < \sum_{x \in I_{n} (i)} P_{{\bar{X}}^{n}} (x) + \frac{β_{0}}{A_{n} (Δ)}

(86)

for all

i (1 \leq i \leq M_{n} - 1)

and

\frac{1}{M_{n}} \leq \sum_{x \in I_{n} (M_{n})} P_{{\bar{X}}^{n}} (x) .

(87)

Hence, for all

i (1 \leq i \leq M_{n})

, we have

\frac{1}{M_{n}} - \frac{β_{0}}{A_{n} (Δ)} \leq \sum_{x \in I_{n} (i)} P_{{\bar{X}}^{n}} (x) .

(88)

Here, notice that for all

x \in X^{n}

P_{{\bar{X}}^{n}} (x) \leq \frac{P_{X^{n}} (x)}{A_{n} (Δ)}

(89)

holds from (77). Thus, we have

\begin{matrix} \frac{P_{{\tilde{U}}_{M_{n}}} (i)}{A_{n} (Δ)} & = & \frac{\sum_{x \in I_{n} (i)} P_{X^{n}} (x)}{A_{n} (Δ)} \\ \geq & \sum_{x \in I_{n} (i)} P_{{\bar{X}}^{n}} (x) \\ > & \frac{1}{M_{n}} - \frac{β_{0}}{A_{n} (Δ)} \\ = & \frac{1}{M_{n}} (1 - \frac{M_{n} β_{0}}{A_{n} (Δ)}) \\ \geq & \frac{1}{M_{n}} (1 - e^{- n γ / 2}) \end{matrix}

(90)

for all

i (1 \leq i \leq M_{n})

where the first equality is due to (85), the first inequality is due to (89), the second inequality is due to (88), and the last inequality is due to (79). Hence, we obtain

\begin{matrix} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n}} \frac{1}{M_{n}} f (\frac{P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{1}{M_{n}}}) \\ \leq & \sum_{1 \leq i \leq M_{n}} \frac{1}{M_{n}} f (A_{n} (Δ) (1 - e^{- n γ / 2})) \\ \leq & f (f^{- 1} (Δ)) + δ_{n} \\ = & Δ + δ_{n}, \end{matrix}

(91)

where we can choose some

δ_{n} > 0

such that

δ_{n} \to 0 (n \to \infty)

, the first inequality is due to (90), and the second inequality is due to the continuity of the function f, (74) and C1). This completes the proof of the theorem. □

Theorem 7.

Under conditions C1) and C2), for any

ε > 0

if the mapping

φ_{n}

satisfies

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ - ε,

(92)

then it holds that

\frac{1}{n} log M_{n} \leq \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ) | X^{n})

(93)

for sufficiently large n.

Proof.

Setting

H^{'} : = H_{\infty} (1 - f^{- 1} (Δ) | X^{n}),

(94)

we only consider the case where

H^{'} < | X^{n} |

holds, because

H^{'} = | X^{n} |

means the trivial result. Let

ε > 0

be fixed arbitrarily. We show that if

\frac{1}{n} log M_{n} > \frac{1}{n} H^{'}

(95)

holds for infinitely many

n = n_{1}, n_{2}, \dots,

then for any

φ_{n}

it holds that

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) > Δ - ε .

(96)

From (95), there exists a positive constant

γ

satisfying

\frac{1}{n} log M_{n} - 2 γ > \frac{1}{n} H^{'} .

(97)

Here, for

γ > 0

satisfying the above inequality we set

T_{n}

as

\begin{matrix} T_{n} & : = \{x \in X^{n} |\frac{1}{n} log \frac{1}{P_{X^{n}} (x)} \leq \frac{1}{n} H^{'} + γ\} \end{matrix}

(98)

\begin{matrix} = \{x \in X^{n} |P_{X^{n}} (x) \geq e^{- (H^{'} + n γ)}\} . \end{matrix}

(99)

Then, from the relation

\begin{matrix} 1 \geq \sum_{x \in T_{n}} P_{X^{n}} (x) \geq | T_{n} | e^{- (H^{'} + n γ)} \end{matrix}

(100)

we have

| T_{n} | \leq e^{H^{'} + n γ} .

(101)

Next, we fix

M_{n}

and a mapping

φ_{n}

satisfying (95) and set

{\tilde{U}}_{M_{n}}

as

{\tilde{U}}_{M_{n}} = φ_{n} (X^{n})

. Using

φ_{n}

and

T_{n}

, we set

I_{n}

as

I_{n} : = {i | \exists x \in T_{n}, φ_{n} (x) = i} .

(102)

Thus, the set

I_{n}

is the set of indices i to which at least one

x \in T_{n}

is mapped, and the set

{(I_{n})}^{c}

is the set of indices i to which only elements

x \in {(T_{n})}^{c}

are mapped (if any).

Then, from the definition of the mapping and (101), it holds that

| I_{n} | \leq | T_{n} | \leq e^{H^{'} + n γ} .

(103)

On the other hand, from (97), we have

\begin{matrix} M_{n} > e^{H^{'} + 2 n γ} . \end{matrix}

(104)

This means that

\frac{| I_{n} |}{M_{n}} \leq e^{- n γ}

(105)

holds. Hence, from the condition C2), we have

\frac{| I_{n} |}{M_{n}} f (\frac{M_{n}}{| I_{n} |}) \to 0 (n \to \infty) .

(106)

From the above argument, the f-divergence between

φ_{n} (X^{n})

and

U_{M_{n}}

is evaluated as

\begin{matrix} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n}} \frac{1}{M_{n}} f (\frac{P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{1}{M_{n}}}) \\ = & \sum_{1 \leq i \leq M_{n}, i \in I_{n}} \frac{1}{M_{n}} f (\frac{P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{1}{M_{n}}}) \\ + \sum_{1 \leq i \leq M_{n}, i \in {(I_{n})}^{c}} \frac{1}{M_{n}} f (\frac{P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{1}{M_{n}}}) \\ \geq & \frac{| I_{n} |}{M_{n}} f (\frac{\sum_{1 \leq i \leq M_{n}, i \in I_{n}} P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{| I_{n} |}{M_{n}}}) \\ + \frac{| {(I_{n})}^{c} |}{M_{n}} f (\frac{\sum_{1 \leq i \leq M_{n}, i \in {(I_{n})}^{c}} P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{| {(I_{n})}^{c} |}{M_{n}}}) \\ \geq & \frac{| I_{n} |}{M_{n}} f (\frac{1}{\frac{| I_{n} |}{M_{n}}}) + \frac{| {(I_{n})}^{c} |}{M_{n}} f (\frac{\sum_{1 \leq i \leq M_{n}, i \in {(I_{n})}^{c}} P_{{\tilde{U}}_{M_{n}}} (i)}{\frac{| {(I_{n})}^{c} |}{M_{n}}}) \\ \geq & \frac{| I_{n} |}{M_{n}} f (\frac{M_{n}}{| I_{n} |}) + \frac{| {(I_{n})}^{c} |}{M_{n}} f (\frac{Pr {X^{n} \in {(T_{n})}^{c}}}{\frac{| {(I_{n})}^{c} |}{M_{n}}}), \end{matrix}

(107)

where the first inequality is due to (11), and the last inequality is due to the relation

\sum_{1 \leq i \leq M_{n}, i \in {(I_{n})}^{c}} P_{{\tilde{U}}_{M_{n}}} (i) \leq Pr {X^{n} \in {(T_{n})}^{c}}

(108)

and C1).
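The counting facts used so far, namely the cardinality bounds (101) and (103) and the mass relation (108), can be checked on a toy example. The following Python sketch uses a hypothetical distribution, threshold, and mapping (0-based indices); the function names are illustrative assumptions.

```python
def high_probability_set(p, threshold):
    """T = {x : p(x) >= threshold}, which has the shape of the set in (99)."""
    return {x for x, px in enumerate(p) if px >= threshold}


def image_indices(phi, subset):
    """I = {i : phi(x) = i for some x in subset}, as in (102)."""
    return {phi[x] for x in subset}


# Hypothetical toy source, mapping, and threshold.
p = [0.4, 0.3, 0.1, 0.1, 0.05, 0.05]
phi = [0, 0, 1, 2, 2, 3]
m = 4
threshold = 0.08

t_set = high_probability_set(p, threshold)
i_set = image_indices(phi, t_set)
i_comp = set(range(m)) - i_set

# Output mass on the complement of I versus source mass outside T, cf. (108).
mass_on_i_comp = sum(px for x, px in enumerate(p) if phi[x] in i_comp)
mass_outside_t = sum(px for x, px in enumerate(p) if x not in t_set)
print(len(t_set), len(i_set), mass_on_i_comp, mass_outside_t)
```

Every element of T has probability at least the threshold, so T contains at most 1/threshold elements, and an index outside I can only collect mass from outside T.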

We next focus on the evaluation of the second term on the RHS of (107). From the definition of the smooth min entropy

H^{'}

and Theorem 1, for any

γ > 0

it necessarily holds that

\begin{matrix} \sum_{x \in X^{n}} {(P_{X^{n}} (x) - e^{- (H^{'} + n γ)})}^{+} \geq 1 - f^{- 1} (Δ) . \end{matrix}

(109)

Thus, from the definition of

T_{n}

it holds that

\begin{matrix} \sum_{x \in T_{n}} (P_{X^{n}} (x) - e^{- (H^{'} + n γ)}) & = & \sum_{x \in X^{n}} {(P_{X^{n}} (x) - e^{- (H^{'} + n γ)})}^{+} \\ \geq & 1 - f^{- 1} (Δ) . \end{matrix}

(110)

Thus, we obtain

\begin{matrix} Pr \{X^{n} \in T_{n}\} & \geq & 1 - f^{- 1} (Δ), \end{matrix}

(111)

from which it holds that

\begin{matrix} Pr \{X^{n} \in {(T_{n})}^{c}\} & < 1 - (1 - f^{- 1} (Δ)) = f^{- 1} (Δ) . \end{matrix}

(112)

Substituting the above inequality into (107), we obtain

\begin{matrix} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & > & \frac{| I_{n} |}{M_{n}} f (\frac{M_{n}}{| I_{n} |}) + \frac{| {(I_{n})}^{c} |}{M_{n}} f (\frac{f^{- 1} (Δ)}{\frac{| {(I_{n})}^{c} |}{M_{n}}}) . \end{matrix}

(113)

Noticing that

\frac{| {(I_{n})}^{c} |}{M_{n}} > 1 - e^{- n γ},

(114)

from (105), for some

δ_{n} \to 0

, we have

\begin{matrix} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & > & (1 - e^{- n γ}) f (\frac{f^{- 1} (Δ)}{1 - e^{- n γ}}) - δ_{n} \\ = & f (\frac{f^{- 1} (Δ)}{1 - e^{- n γ}}) - e^{- n γ} f (\frac{f^{- 1} (Δ)}{1 - e^{- n γ}}) - δ_{n} \\ \geq & f (f^{- 1} (Δ) (1 + γ_{n}^{'})) - 2 δ_{n} \end{matrix}

(115)

for sufficiently large n, where we use the property (106) and the notation

γ_{n}^{'} = \frac{e^{- n γ}}{1 - e^{- n γ}}

. Since

γ_{n}^{'} \to 0 (n \to \infty)

holds, from the continuity of the function f, it holds that

\begin{matrix} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & > & Δ - 3 δ_{n} \\ \geq & Δ - ε \end{matrix}

(116)

for

n = n_{j}, n_{j + 1}, \dots,

with some

j \geq 1

. Therefore, we obtain the theorem. □

Theorems 6 and 7 show that the smooth min entropy and the inverse function of f have important roles in the intrinsic randomness problem with respect to f-divergences, while the smooth max entropy is important in the resolvability problem. By using the above two theorems, we obtain the following theorem. It should be noted that because of the assumption

0 \leq Δ < f (0)

and C1), we have

0 < f^{- 1} (Δ) \leq 1

.

Theorem 8.

Under conditions C1) and C2), it holds that

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) \\ = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ) + ν | X^{n}) . \end{matrix}

(117)

Proof.

(Direct Part:) Fix

ν > 0

arbitrarily. From Theorem 6, for any

γ > 0

and

M_{n}

such that

\begin{matrix} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) - 2 γ & \leq \frac{1}{n} log M_{n} \leq \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ) | X^{n}) - γ \end{matrix}

(118)

holds, there exists a mapping

φ_{n}

satisfying

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ + γ

(119)

for sufficiently large n.

Since

γ > 0

is arbitrary, we obtain

\underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} \geq \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) .

(120)

Here, we fix a sequence

{ν_{i}}_{i = 1}^{\infty}

such that

ν_{1} > ν_{2} > \dots > 0

, and we repeat the above argument as

i \to \infty

. Then, we can show that there exists a mapping

φ_{n}

satisfying

\underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} \geq lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}),

(121)

and

\underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ .

(122)

This completes the proof of the direct part of the theorem.

(Converse Part:) Fix

ν > 0

arbitrarily. From Theorem 7, for any mapping

φ_{n}

satisfying

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ + ν,

(123)

it holds that

\frac{1}{n} log M_{n} \leq \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + 2 ν) | X^{n}) .

(124)

Thus, for any

ν > 0

, we obtain

\underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ + ν,

(125)

and

\underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} \leq \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + 2 ν) | X^{n}) .

(126)

Noting that

ν > 0

is arbitrary, we fix a sequence

{ν_{i}}_{i = 1}^{\infty}

such that

ν_{1} > ν_{2} > \dots > 0

, and we repeat the above argument as

i \to \infty

. Then, we obtain

\underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) \leq Δ,

(127)

and

\begin{matrix} \underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} \leq lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) . \end{matrix}

(128)

This completes the proof of the converse part of the theorem. □

5. Relaxation of Conditions C1) and C2)

Thus far, we have derived the general formulas for the optimum resolvability rate under conditions C1)–C3) and for the optimum intrinsic randomness rate under conditions C1) and C2). In this section, we relax conditions C1) and C2) to extend the class of f-divergence for which we can characterize these optimum rates. Hereafter, we do not consider a linear function

f (t) = a (t - 1)

with some a because it always gives a trivial case where

D_{f} (Z | | \bar{Z}) = 0

.

We consider the following condition, which is a relaxation of C2):

C2’): The function f satisfies

$\begin{matrix} lim_{u \to \infty} \frac{f (u)}{u} < + \infty . \end{matrix}$

(129)

For function f satisfying condition C2’), we denote the LHS of (129) by

\begin{matrix} c_{f} = lim_{u \to \infty} \frac{f (u)}{u} . \end{matrix}

(130)

We give some examples of the function

f (t)

, which satisfies C2’) but not C2).

$f (t) = | t - 1 |$ : The f-divergence is the variational distance, and $c_{f} = 1$ .
$f (t) = {(1 - \sqrt{t})}^{2}$ : The f-divergence is the squared Hellinger distance, and $c_{f} = 1$ .
$f (t) = \frac{t^{α} - α t - (1 - α)}{α (α - 1)}$ ( $0 < α < 1$ ): The f-divergence is the $α$ -divergence, and $c_{f} = \frac{1}{1 - α}$ .
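The constants c_f listed above can be checked numerically by evaluating f(u)/u at a large argument. The following Python sketch does so; evaluating at u = 10^12 is only a crude proxy for the limit in (130), and the function names and the choice α = 1/2 are illustrative assumptions.

```python
def slope_at_infinity(f, u=1e12):
    """Crude numerical proxy for c_f = lim_{u -> infinity} f(u) / u."""
    return f(u) / u


ALPHA = 0.5  # any value in (0, 1); chosen only for illustration


def variational(t):
    return abs(t - 1.0)


def squared_hellinger(t):
    return (1.0 - t ** 0.5) ** 2


def alpha_divergence(t, alpha=ALPHA):
    return (t ** alpha - alpha * t - (1.0 - alpha)) / (alpha * (alpha - 1.0))


estimates = {
    "variational": slope_at_infinity(variational),
    "squared Hellinger": slope_at_infinity(squared_hellinger),
    "alpha": slope_at_infinity(alpha_divergence),
}
for name, value in estimates.items():
    print(name, value)
```

All three estimates are finite and nonzero, which is consistent with these functions satisfying C2’) but not C2).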

For function

f (t)

satisfying condition C2’), we consider its modified function

\begin{matrix} f_{0} (t) : = f (t) + c_{f} (1 - t), \end{matrix}

(131)

which is offset by

c_{f} (1 - t)

. This function is called the offset function of f. It should be noted that under condition C2), which is a special case of C2’), it holds that

c_{f} = 0

and thus

f_{0} (t) = f (t)

for all

t \geq 0

. We have the following lemma:

Lemma 1.

Assume that the function

f (t)

satisfies condition C2’). Then,

(i): the offset function $f_{0}$ satisfies conditions C1) and C2),
(ii): for any pair of probability distributions $P_{Z}$ and $P_{\bar{Z}}$ with the same alphabet $Z$ , it holds that

$\begin{matrix} D_{f} (Z | | \bar{Z}) = D_{f_{0}} (Z | | \bar{Z}) . \end{matrix}$

(132)

Proof.

It is easily verified that

f_{0}

is a convex function with

f_{0} (1) = 0

, and claim (ii) is well-known. So, here we show claim (i). By definition, it holds that

\begin{matrix} lim_{u \to \infty} \frac{f_{0} (u)}{u} & = lim_{u \to \infty} (\frac{f (u)}{u} + \frac{c_{f} (1 - u)}{u}) \\ = lim_{u \to \infty} \frac{f (u)}{u} - c_{f} = 0, \end{matrix}

(133)

which indicates that

f_{0}

satisfies condition C2).

To show that C1) holds, we use the left-derivative of

f_{0}

at

t > 0

, denoted as

\begin{matrix} f_{0}^{'} (t -) = lim_{h ↑ 0} \frac{f_{0} (t + h) - f_{0} (t)}{h} \end{matrix}

(134)

(cf. [22]). Unlike the ordinary derivative, the left-derivative at

t > 0

always exists for the function

f_{0}

, which is convex and hence continuous. To show that

f_{0}

satisfies condition C1), it suffices to show that

f_{0}^{'} (t -) \leq 0

for all

t > 0

. Using the left-derivative

f_{0}^{'} (t -)

, a tangent line at

t > 0

can be expressed as

f_{0}^{'} (t -) \cdot t + b

with some b, where

f_{0}^{'} (t -)

and b correspond to the slope and intercept of this tangent line, respectively. We call this tangent line the left-tangent line at t. Fixing

t^{*} > 0

arbitrarily, let

a^{*} : = f_{0}^{'} (t^{*} -)

and

b^{*}

be the intercept of the left-tangent line at

t^{*}

. The convexity of

f_{0}

implies that

\begin{matrix} f_{0} (t) \geq a^{*} t + b^{*} (\forall t \geq 0) . \end{matrix}

(135)

Then, it follows from (133) that

\begin{matrix} 0 = lim_{u \to \infty} \frac{f_{0} (u)}{u} \geq lim_{u \to \infty} \frac{a^{*} u + b^{*}}{u} = a^{*} = f_{0}^{'} (t^{*} -) . \end{matrix}

(136)

Since

t^{*} > 0

is arbitrary, this inequality implies that

f_{0} (t)

is decreasing for

t > 0

with

f_{0} (0) > 0

, completing the proof of the lemma. □
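Both claims of Lemma 1 can be checked numerically for the variational distance, for which c_f = 1. The following Python sketch assumes the convention D_f(Z || Z̄) = Σ P_Z̄ f(P_Z / P_Z̄) with a full-support P_Z̄; the two distributions and the grid are hypothetical, and the function names are illustrative.

```python
def f_divergence(p, q, f):
    """D_f(Z || Zbar) = sum_z q(z) * f(p(z) / q(z)), assuming q(z) > 0 for all z."""
    return sum(qz * f(pz / qz) for pz, qz in zip(p, q))


C_F = 1.0  # c_f for f(t) = |t - 1|


def f(t):
    return abs(t - 1.0)


def f0(t):
    """Offset function f0(t) = f(t) + c_f * (1 - t), cf. (131)."""
    return f(t) + C_F * (1.0 - t)


# Hypothetical pair of distributions on three symbols.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

d_f = f_divergence(p, q, f)
d_f0 = f_divergence(p, q, f0)

# f0 should be nonincreasing; check on a grid of points in [0, 5].
grid = [0.1 * k for k in range(51)]
nonincreasing = all(f0(s) >= f0(t) - 1e-12 for s, t in zip(grid, grid[1:]))
print(d_f, d_f0, nonincreasing)
```

The offset term contributes c_f Σ (P_Z̄ − P_Z) = 0 to the divergence, which is why the two values agree, while f itself increases for t > 1 and f_0 does not.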

Lemma 1 indicates that if the original function f satisfies condition C2’), then its offset function

f_{0}

satisfies conditions C1) and C2) without changing the value of f-divergence. Because condition C2) is a special instance of condition C2’) with

c_{f} = 0

, claim (i) of Lemma 1 implies that condition C1) is superfluous for functions satisfying C2) (cf. Remark 5). The following proposition is immediately obtained by claim (ii) of Lemma 1:

Proposition 1.

Assume that the function

f (t)

satisfies condition C2’). Then, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = S_{r}^{(f_{0})} (D | X), \end{matrix}

(137)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = S_{ι}^{(f_{0})} (Δ | X) . \end{matrix}

(138)

It is easily verified that if f satisfies condition C3) as well as C2’), then so does

f_{0}

. From this fact, Lemma 1, and Proposition 1, we have the following generalization of Theorem 5.

Theorem 9.

Under conditions C2’) and C3), it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f_{0}^{- 1} (D + ν) | X^{n}) \\ = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f_{0}^{- 1} (D) + ν | X^{n}) . \end{matrix}

(139)

For the optimum intrinsic randomness rate, we also have the generalized result of Theorem 8.

Theorem 10.

Under condition C2’), it holds that

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f_{0}^{- 1} (Δ + ν) | X^{n}) \\ = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f_{0}^{- 1} (Δ) + ν | X^{n}) . \end{matrix}

(140)

6. Particularization to Several Distance Measures

In previous sections, we have derived the general formula of the first-order optimum resolvability and intrinsic randomness rates with respect to f-divergences, where the smooth Rényi entropy and the inverse function of f have important roles. In this section, we first focus on several specified functions f satisfying conditions C1)–C3) and compute these rates by using Theorems 5 and 8. In addition, we consider the function f satisfying C2’) and C3) and compute the rates by using Theorems 9 and 10.

It will turn out that it is easy to derive the optimum achievable rates for specified approximation measures. We use the notation

D_{f} (X^{n} | | {\tilde{X}}^{n}) : = D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})), D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) : = D_{f} (φ_{n} (X^{n}) | | U_{M_{n}})

(141)

for convenience.

Remark 9.

Since the function

f (t) = t log t

(which yields the KL divergence) satisfies neither C1) nor C2), we cannot apply Theorems 5 and 8 to the case of the KL divergence:

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) = D (X^{n} | | {\tilde{X}}^{n}) & = & \sum_{x \in X^{n}} P_{X^{n}} (x) log \frac{P_{X^{n}} (x)}{P_{{\tilde{X}}^{n}} (x)}, \end{matrix}

(142)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) = D ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n}} P_{{\tilde{U}}_{M_{n}}} (i) log \frac{P_{{\tilde{U}}_{M_{n}}} (i)}{P_{U_{M_{n}}} (i)} . \end{matrix}

(143)

The resolvability problem with respect to the KL divergence of this direction has not been considered yet. On the other hand, in the intrinsic randomness problem, Hayashi [9] ([Theorem 7]) has studied the problem with respect to the normalized KL divergence:

\frac{1}{n} D ({\tilde{U}}_{M_{n}} | | U_{M_{n}})

as well as

D (U_{M_{n}} | | {\tilde{U}}_{M_{n}})

.

6.1. Half Variational Distance

We first consider the case of

f (t)

given as

f (t) = {(1 - t)}^{+}

, which indicates

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & \frac{1}{2} \sum_{x \in X^{n}} |P_{{\tilde{X}}^{n}} (x) - P_{X^{n}} (x)|, \end{matrix}

(144)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \frac{1}{2} \sum_{1 \leq i \leq M_{n}} |P_{{\tilde{U}}_{M_{n}}} (i) - P_{U_{M_{n}}} (i)| . \end{matrix}

(145)

In this special case, we obtain the following corollary:

Corollary 1.

For

f (t) = {(1 - t)}^{+}

, it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (D + ν | X^{n}), \end{matrix}

(146)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (Δ + ν | X^{n}) . \end{matrix}

(147)

Proof.

In the case of

f (t) = {(1 - t)}^{+}

, the inverse function becomes

f^{- 1} (D) = 1 - D

, because

0 \leq D < 1

holds. Hence, from Theorems 5 and 8, we obtain the corollary. □
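To make the quantities in Corollary 1 concrete for a single finite distribution, the following Python sketch computes both smooth entropies. It rests on assumptions that this section does not spell out: H_0(δ|X) is taken as the logarithm of the smallest number of outcomes whose total probability is at least 1 − δ, and H_∞(δ|X) is taken as −log β, where β solves Σ_x (P_X(x) − β)^+ = δ, which is the form suggested by (74) and (109). Any cap of H_∞ at log |X| is ignored, and 0 ≤ δ < 1 is required.

```python
import math


def smooth_max_entropy(p, delta):
    """Assumed form: log of the smallest number of outcomes with total mass >= 1 - delta."""
    mass, count = 0.0, 0
    for px in sorted(p, reverse=True):
        if mass >= 1.0 - delta - 1e-12:
            break
        mass += px
        count += 1
    return math.log(count)


def smooth_min_entropy(p, delta):
    """Assumed form: -log(beta) with sum_x (p(x) - beta)^+ = delta (water-filling)."""
    q = sorted(p, reverse=True)
    prefix = 0.0
    for k, qk in enumerate(q, start=1):
        prefix += qk
        beta = (prefix - delta) / k          # level if exactly the k largest masses are clipped
        nxt = q[k] if k < len(q) else 0.0
        if beta >= nxt:                      # level does not touch the next mass
            return -math.log(beta)
    raise ValueError("delta must satisfy 0 <= delta < 1")


# Hypothetical toy distribution.
p = [0.5, 0.25, 0.125, 0.125]
h0 = smooth_max_entropy(p, 0.25)
hinf = smooth_min_entropy(p, 0.25)
print(h0, hinf)
```

With δ = 0 these reduce to the logarithm of the support size and to the min entropy, respectively; Corollary 1 concerns the normalized limits of such quantities for X^n.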

The former result in the above corollary coincides with the result given by Uyematsu [6] ([Theorem 6]), while the latter one coincides with the result given by Uyematsu and Kunimatsu [10] ([Theorem 6]). It is important to note that

S_{r}^{(f)} (D | X)

has been addressed by Steinberg [2] and Han [8] ([Theorem 2.4.1]), and

S_{ι}^{(f)} (Δ | X)

has also been addressed by Vembu and Verdú [7] ([Theorem 1]), Han [8] ([Theorem 2.4.2]), and Hayashi [9] ([Theorem 2]), using different information-theoretic approaches. In particular, Hayashi [9] ([Theorem 2]) has considered various achievable rates concerning the intrinsic randomness problem with respect to the variational distance, but these are not included in our current analysis. Our work provides an alternative derivation and contextualizes these results within our framework of f-divergences.

6.2. Reverse Kullback–Leibler Divergence

Secondly, we consider the case of

f (t) = - log t

, which indicates

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & D (ϕ_{n} (U_{M_{n}}) | | X^{n}) = \sum_{x \in X^{n}} P_{{\tilde{X}}^{n}} (x) log \frac{P_{{\tilde{X}}^{n}} (x)}{P_{X^{n}} (x)}, \end{matrix}

(148)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & D (U_{M_{n}} | | φ_{n} (X^{n})) = \sum_{1 \leq i \leq M_{n}} P_{U_{M_{n}}} (i) log \frac{P_{U_{M_{n}}} (i)}{P_{{\tilde{U}}_{M_{n}}} (i)} . \end{matrix}

(149)

In this case, we obtain the following corollary:

Corollary 2.

For

f (t) = - log t

, it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - e^{- (D + ν)} | X^{n}), \end{matrix}

(150)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - e^{- (Δ + ν)} | X^{n}) . \end{matrix}

(151)

Proof.

The inverse function is immediately given by

f^{- 1} (D) = e^{- D}

. Hence, from Theorems 5 and 8, we obtain the corollary. □

It is important to note that

S_{ι}^{(f)} (Δ | X)

has been previously addressed by Hayashi [9] ([Theorem 7]) using different information-theoretic approaches. In particular, Vembu and Verdú [7] ([Theorem 1]) and Hayashi [9] ([Theorem 7]) have also considered the intrinsic randomness problem with respect to the normalized KL divergence, which is not included in our current analysis.

6.3. Hellinger Distance

We consider the case of

f (t) = 1 - \sqrt{t}

, which indicates

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & 1 - \sum_{x \in X^{n}} \sqrt{P_{X^{n}} (x) P_{{\tilde{X}}^{n}} (x)}, \end{matrix}

(152)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & 1 - \sum_{1 \leq i \leq M_{n}} \sqrt{P_{{\tilde{U}}_{M_{n}}} (i) P_{U_{M_{n}}} (i)} . \end{matrix}

(153)

In this case, we obtain the following corollary:

Corollary 3.

For

f (t) = 1 - \sqrt{t}

, it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (2 D - D^{2} + ν | X^{n}), \end{matrix}

(154)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (2 Δ - Δ^{2} + ν | X^{n}) . \end{matrix}

(155)

Proof.

The inverse function of

f (t) = 1 - \sqrt{t}

is given by

f^{- 1} (D) = {(1 - D)}^{2}

. Hence, from Theorems 5 and 8, we obtain the corollary. Notice here that since both D and

Δ

are smaller than one,

2 D - D^{2}

as well as

2 Δ - Δ^{2}

are positive. □
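The inverse functions used in the proofs of Corollaries 1–3 can be sanity-checked numerically. The following Python sketch verifies f(f^{-1}(D)) = D on a few sample values and evaluates the smoothing parameter 1 − f^{-1}(D) that enters the smooth entropies; the dictionary layout, names, and sample values are illustrative assumptions.

```python
import math

# name -> (f, f_inverse), following the proofs of Corollaries 1-3.
MEASURES = {
    "half variational": (lambda t: max(1.0 - t, 0.0), lambda d: 1.0 - d),
    "reverse KL": (lambda t: -math.log(t), lambda d: math.exp(-d)),
    "Hellinger": (lambda t: 1.0 - math.sqrt(t), lambda d: (1.0 - d) ** 2),
}


def smoothing_parameter(name, d):
    """Return 1 - f^{-1}(d), the argument of the smooth entropy in Theorems 5 and 8."""
    _, f_inv = MEASURES[name]
    return 1.0 - f_inv(d)


samples = [0.1, 0.5, 0.9]
round_trip_error = max(
    abs(f(f_inv(d)) - d) for f, f_inv in MEASURES.values() for d in samples
)
print(round_trip_error, smoothing_parameter("Hellinger", 0.5))
```

For the Hellinger distance the smoothing parameter equals 2D − D², in agreement with (154) and (155).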

It is worth noting that Kumagai and Hayashi [13] have analyzed this quantity for the case of i.i.d. sources. Importantly, they addressed this quantity as part of a broader problem: the random number conversion problem. On the other hand, our approach differs in that we derive this quantity from results based on f-divergences.

6.4. E_γ-Divergence

We consider the case of

f (t) = {(γ - t)}^{+} + 1 - γ

, which indicates

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & \sum_{x \in X^{n} : P_{X^{n}} (x) > γ P_{{\tilde{X}}^{n}} (x)} (P_{X^{n}} (x) - γ P_{{\tilde{X}}^{n}} (x)) . \end{matrix}

(156)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n} : P_{{\tilde{U}}_{M_{n}}} (i) > γ P_{U_{M_{n}}} (i)} (P_{{\tilde{U}}_{M_{n}}} (i) - γ P_{U_{M_{n}}} (i)) . \end{matrix}

(157)

In this case, we obtain the corollary:

Corollary 4.

For

f (t) = {(γ - t)}^{+} + 1 - γ

, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (D + ν | X^{n}), \end{matrix}

(158)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (Δ + ν | X^{n}) . \end{matrix}

(159)

Proof.

Noting that

γ \geq 1

, we have

f (t) = 1 - t

for

0 \leq t \leq γ

, and hence

f^{- 1} (D) = 1 - D

as in Corollary 1. Hence, the corollary follows from Theorems 5 and 8. □

Remark 10.

The above corollary shows that both optimum achievable rates with respect to the

E_{γ}

-divergence do not depend on γ, which means that these rates coincide with the optimum achievable rates with respect to the half variational distance (cf. Corollary 1).

6.5. Variational Distance

We next consider functions f satisfying C2’) and C3). Firstly, the function

f (t) = | 1 - t |

is considered:

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & \sum_{x \in X^{n}} |P_{{\tilde{X}}^{n}} (x) - P_{X^{n}} (x)|, \end{matrix}

(160)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n}} |P_{{\tilde{U}}_{M_{n}}} (i) - P_{U_{M_{n}}} (i)| . \end{matrix}

(161)

As we have already mentioned in the previous section,

f (t) = | 1 - t |

does not satisfy C1). However, it satisfies C2’) and C3). Hence, from Theorems 9 and 10, we obtain the corollary:

Corollary 5.

For

f (t) = | 1 - t |

, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (\frac{D}{2} + ν| X^{n}), \end{matrix}

(162)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (\frac{Δ}{2} + ν| X^{n}) . \end{matrix}

(163)

Proof.

Noticing that

c_{f} = 1

, we have

f_{0} (t) = | 1 - t | + (1 - t)

, from which we obtain

f_{0}^{- 1} (D) = 1 - \frac{D}{2} .

(164)

Therefore, we obtain the corollary from Theorems 9 and 10. □

6.6. Squared Hellinger Distance

We consider the function

f (t) = {(1 - \sqrt{t})}^{2}

, which also satisfies C2’) and C3). It indicates

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & \sum_{x \in X^{n}} {(\sqrt{P_{{\tilde{X}}^{n}} (x)} - \sqrt{P_{X^{n}} (x)})}^{2}, \end{matrix}

(165)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \sum_{1 \leq i \leq M_{n}} {(\sqrt{P_{{\tilde{U}}_{M_{n}}} (i)} - \sqrt{P_{U_{M_{n}}} (i)})}^{2} . \end{matrix}

(166)

In this case, we also apply Theorems 9 and 10.

Corollary 6.

For

f (t) = {(1 - \sqrt{t})}^{2}

, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (D - \frac{D^{2}}{4} + ν| X^{n}), \end{matrix}

(167)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (Δ - \frac{Δ^{2}}{4} + ν| X^{n}) . \end{matrix}

(168)

Proof.

Noticing that

c_{f} = 1

, we obtain

f_{0}^{- 1} (D) = {(1 - \frac{D}{2})}^{2} .

(169)

Hence, we obtain the corollary. □

Remark 11.

The variational distance is twice the half variational distance. Consequently, the results of Corollary 5 can be trivially derived from those of Corollary 1. However, we emphasize that

f (t) = | 1 - t |

does not satisfy conditions C1) and C2). Therefore, to derive the results of Corollary 5, it is necessary to apply the discussion from Section 5, specifically the examination using

f_{0} (t)

. This underscores the importance of our theoretical framework in handling cases where the function f does not meet conditions C1) and C2). A similar relationship exists between Corollary 3 and Corollary 6.

6.7. α-Divergence

We consider the function

f (t) = \frac{t^{α} - α t - (1 - α)}{α (α - 1)}

(

0 < α < 1

), which also satisfies C2’) and C3). The

α

-divergence in our setting is given by

\begin{matrix} D_{f} (X^{n} | | {\tilde{X}}^{n}) & = & \frac{1}{α (1 - α)} (1 - \sum_{x \in X^{n}} P_{X^{n}} {(x)}^{α} P_{{\tilde{X}}^{n}} {(x)}^{1 - α}), \end{matrix}

(170)

\begin{matrix} D_{f} ({\tilde{U}}_{M_{n}} | | U_{M_{n}}) & = & \frac{1}{α (1 - α)} (1 - \sum_{1 \leq i \leq M_{n}} P_{{\tilde{U}}_{M_{n}}} {(i)}^{α} P_{U_{M_{n}}} {(i)}^{1 - α}) . \end{matrix}

(171)

In this case, we obtain the following corollary using Theorems 9 and 10.

Corollary 7.

For

f (t) = \frac{t^{α} - α t - (1 - α)}{α (α - 1)}

, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - {(D α (α - 1) + 1)}^{1 / α} + ν| X^{n}), \end{matrix}

(172)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - {(Δ α (α - 1) + 1)}^{1 / α} + ν| X^{n}) . \end{matrix}

(173)

Proof.

Noticing that

c_{f} = 1 / (1 - α)

, we obtain

\begin{matrix} f_{0} (t) & = & \frac{t^{α} - 1}{α (α - 1)} \Leftrightarrow f_{0}^{- 1} (D) = {(D α (α - 1) + 1)}^{1 / α} . \end{matrix}

(174)

Hence, we obtain the corollary. □

Let us consider the case of

α = 1 / 2

. In this case, the inverse function can be simply expressed as

f_{0}^{- 1} (D) = {(1 - \frac{D}{4})}^{2} .

(175)

Hence, we have

\begin{matrix} S_{r}^{(f)} (D | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (\frac{D}{2} - \frac{D^{2}}{16} + ν| X^{n}) . \end{matrix}

(176)

It is known that

α

-divergence with

α = 1 / 2

is related to the squared Hellinger distance. In actuality, the optimum resolvability rate

S_{r}^{(f)} (D | X)

with respect to the squared Hellinger distance is identical to

S_{r}^{(f)} (2 D | X)

with respect to the

α

-divergence with

α = 1 / 2

.
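The factor-of-two relation stated above can be checked on a concrete pair of distributions. In the following Python sketch the distributions `p` and `q` are arbitrary illustrative values and the function names are hypothetical; the divergences follow the expressions (165) and (170).

```python
import math


def squared_hellinger(p, q):
    """Squared Hellinger distance as in (165)."""
    return sum((math.sqrt(qi) - math.sqrt(pi)) ** 2 for pi, qi in zip(p, q))


def alpha_divergence(p, q, alpha):
    """alpha-divergence as in (170), for 0 < alpha < 1."""
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - s) / (alpha * (1.0 - alpha))


def smoothing_hellinger(d):
    """Smoothing parameter D - D^2/4 appearing in Corollary 6."""
    return d - d * d / 4.0


def smoothing_alpha(d, alpha):
    """Smoothing parameter 1 - (D alpha (alpha - 1) + 1)^(1/alpha) appearing in Corollary 7."""
    return 1.0 - (d * alpha * (alpha - 1.0) + 1.0) ** (1.0 / alpha)


p = [0.5, 0.3, 0.2]  # illustrative distributions, not taken from the paper
q = [0.2, 0.2, 0.6]
print(alpha_divergence(p, q, 0.5), 2.0 * squared_hellinger(p, q))
print(smoothing_alpha(2.0 * 0.6, 0.5), smoothing_hellinger(0.6))
```

The first line compares the α-divergence at α = 1/2 with twice the squared Hellinger distance, and the second compares the smoothing parameters at 2D and D, which is the scaling behind the identity between the two resolvability rates.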

7. Second-Order Optimum Achievable Rate

Thus far, we have considered the first-order optimum resolvability rate as well as the first-order optimum intrinsic randomness rate. The second-order rate, which enables a finer evaluation of achievable rates, has already been investigated in several information-theoretic problems [5,9,23,24,25,26,27,28,29,30]. In this section, following these results, we also consider the second-order optimum achievable rates in random number generation problems with respect to f-divergences.

It is important to acknowledge that the second-order analysis for information-theoretic problems was initiated by Hayashi [9]. Building upon this work, Kumagai and Hayashi [13] conducted a second-order analysis for the broader random number conversion problem. They focused on i.i.d. sources and provided a more detailed analysis in this context. On the other hand, our results apply to more general sources, including but not limited to i.i.d. sources.

7.1. General Formula

We first define the second-order achievability in the resolvability problem.

Definition 7.

L is said to be

(D, R)

-achievable with the given f-divergence if there exists a sequence of mappings

ϕ_{n} : U_{M_{n}} \to X^{n}

such that

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D, \end{matrix}

(177)

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} log \frac{M_{n}}{e^{n R}} \leq L . \end{matrix}

(178)

Definition 8 (Second-order optimum resolvability rate).

\begin{matrix} S_{r}^{(f)} (D, R | X) : = inf \{L | \text{$L$ is $(D, R)$-achievable with the given $f$-divergence}\} . \end{matrix}

In order to analyze the above quantity, we use the following condition instead of C3):

C3’): For any pair of positive real numbers $(a, b)$ , it holds that

$lim_{n \to \infty} \frac{f (e^{- n b})}{e^{\sqrt{n} a}} = 0 .$

(179)

Here, functions

f (t) = - log t

,

f (t) = 1 - \sqrt{t}

,

f (t) = {(1 - t)}^{+}

, and

f (t) = {(γ - t)}^{+} + (1 - γ)

satisfy the condition C3’). Then, the following theorem holds:

Theorem 11 (Second-order optimum resolvability rate).

Under conditions C2’) and C3’), it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (1 - f_{0}^{- 1} (D + ν) | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(180)

where

f_{0}

is the offset function of f, defined in (131).

In particular, under conditions C2) and C3’), it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (1 - f^{- 1} (D + ν) | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(181)

Proof.

Noticing that Lemma 1 indicates that the offset function

f_{0}

satisfies conditions C1)–C3), the proof of (180) proceeds in parallel with proofs of Theorems 3–5 in which f,

\frac{1}{n}

and

e^{- n γ}

are replaced by

f_{0}

,

\frac{1}{\sqrt{n}}

and

e^{- \sqrt{n} γ}

, respectively. Equation (181) is a special case of (180) with

f_{0} = f

. □
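Condition C3’) can be probed numerically for the functions listed before Theorem 11. The Python sketch below evaluates the ratio f(e^{-nb}) / e^{\sqrt{n} a} from (179) for a few block lengths. The values a = b = 0.5 and γ = 2, as well as the function names, are illustrative assumptions, and a finite table of values naturally does not prove the limit.

```python
import math


def c3_prime_ratio(f, n, a, b):
    """Ratio f(exp(-n*b)) / exp(sqrt(n)*a), which must vanish as n grows under C3')."""
    return f(math.exp(-n * b)) / math.exp(math.sqrt(n) * a)


GAMMA = 2.0  # illustrative choice of gamma >= 1 for the E_gamma-divergence

candidates = {
    "-log t": lambda t: -math.log(t),
    "1 - sqrt(t)": lambda t: 1.0 - math.sqrt(t),
    "(1 - t)^+": lambda t: max(1.0 - t, 0.0),
    "(gamma - t)^+ + 1 - gamma": lambda t: max(GAMMA - t, 0.0) + 1.0 - GAMMA,
}

# n is kept moderate so that exp(-n*b) does not underflow to 0.0 in double precision.
for name, f in candidates.items():
    ratios = [c3_prime_ratio(f, n, a=0.5, b=0.5) for n in (10, 100, 1000)]
    print(name, ratios)
```

For the three bounded functions the numerator stays below f(0), so the decay comes entirely from the denominator; for f(t) = -log t the numerator grows linearly in n, which is still dominated by e^{\sqrt{n} a}.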

We next consider the case of the intrinsic randomness problem.

Definition 9.

L is said to be

(Δ, R)

-achievable with the given f-divergence if there exists a sequence of mappings

φ_{n} : X^{n} \to U_{M_{n}}

such that

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & \leq Δ, \end{matrix}

(182)

\begin{matrix} \underset{n \to \infty}{lim inf} \frac{1}{\sqrt{n}} log \frac{M_{n}}{e^{n R}} & \geq L . \end{matrix}

(183)

Definition 10 (Second-order optimum intrinsic randomness rate).

S_{ι}^{(f)} (Δ, R | X) : = sup \{L | \text{$L$ is $(Δ, R)$-achievable with the given $f$-divergence}\} .

(184)

Then, we have the theorem:

Theorem 12

(Second-order optimum intrinsic randomness rate). Under condition C2’), it holds that

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (1 - f_{0}^{- 1} (Δ + ν) | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(185)

where

f_{0}

is the offset function of f.

In particular, under condition C2), it holds that

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(186)

Proof.

The proof of (185) proceeds in parallel with proofs of Theorems 6–8 in which f and

\frac{1}{n}

are replaced by

f_{0}

and

\frac{1}{\sqrt{n}}

. Equation (186) is a special case of (185) with

f_{0} = f

. □

Theorems 11 and 12 show that, in both the resolvability and the intrinsic randomness problems, the smooth Rényi entropy and the inverse function of f also play essential roles in expressing the second-order optimum achievable rates.

7.2. Particularizations to Several Distance Measures

Analogously to Section 4, we compute

S_{r}^{(f)} (D, R | X)

and

S_{ι}^{(f)} (Δ, R | X)

for the specified function f satisfying C1), C2) and C3’), by using Theorems 11 and 12. We obtain the following corollary:

Corollary 8.

For

f (t) = {(1 - t)}^{+}

, it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (D + ν | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(187)

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (Δ + ν | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(188)

For

f (t) = - log t

, it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (1 - e^{- (D + ν)} | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(189)

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (1 - e^{- (Δ + ν)} | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(190)

For

f (t) = 1 - \sqrt{t}

, it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (2 D - D^{2} + ν | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(191)

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (2 Δ - Δ^{2} + ν | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(192)

For

f (t) = {(γ - t)}^{+} + 1 - γ

, we have

\begin{matrix} S_{r}^{(f)} (D, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (D + ν | X^{n}) - n R}{\sqrt{n}}, \end{matrix}

(193)

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) & = & lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (Δ + ν | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(194)

Proof.

The proof is similar to proofs of Corollaries 1–4. □
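To make the second-order quantities in Corollary 8 concrete, the following Python sketch evaluates them by brute force for a short i.i.d. binary source. It assumes the usual operational characterizations of the smooth entropies: H_0(δ | X^n) is taken to be the logarithm of the smallest number of sequences whose total probability is at least 1 - δ, and H_∞(δ | X^n) is taken to be -log c for the smallest cap level c with Σ_x (P(x) - c)^+ ≤ δ. These forms are assumptions of the sketch rather than quotations of Theorems 1 and 2, and the source parameter, block lengths, and function names are illustrative.

```python
import math


def smooth_max_entropy(probs, delta):
    """Assumed form: log of the fewest outcomes with total mass >= 1 - delta (0 <= delta < 1)."""
    total, count = 0.0, 0
    for p in sorted(probs, reverse=True):
        if total >= 1.0 - delta - 1e-12:
            break
        total += p
        count += 1
    return math.log(count)


def smooth_min_entropy(probs, delta):
    """Assumed form: -log of the smallest cap c with sum(max(p - c, 0)) <= delta."""
    lo, hi = 0.0, max(probs)
    for _ in range(200):
        c = 0.5 * (lo + hi)
        if sum(max(p - c, 0.0) for p in probs) > delta:
            lo = c  # too much mass is cut off, so the cap must be higher
        else:
            hi = c
    return -math.log(hi)


def bernoulli_block(p, n):
    """Probabilities of all 2^n sequences of an i.i.d. Bernoulli(p) source, grouped by type."""
    probs = []
    for k in range(n + 1):
        probs.extend([p ** k * (1.0 - p) ** (n - k)] * math.comb(n, k))
    return probs


def second_order_gap(entropy_value, n, rate):
    """(H - n R) / sqrt(n), the normalization used in (187) and (188)."""
    return (entropy_value - n * rate) / math.sqrt(n)


p, delta = 0.2, 0.1
rate = -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)  # Shannon entropy in nats, used as R
for n in (4, 8, 12):
    probs = bernoulli_block(p, n)
    print(n,
          second_order_gap(smooth_max_entropy(probs, delta), n, rate),
          second_order_gap(smooth_min_entropy(probs, delta), n, rate))
```

The brute-force enumeration is only feasible for small n; it is meant to show how the two normalized quantities are formed, not to approximate the limits in (187) and (188).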

The optimum achievable rates with the variational distance in terms of the smooth Rényi entropy have already been derived. Relations (187) and (188) coincide with the result given by Tagashira and Uyematsu [31], and the result given by Namekawa and Uyematsu [32], respectively. As in the case of the first-order achievability, second-order optimum rates for the half variational distance (

f (t) = {(1 - t)}^{+}

) and the

E_{γ}

-divergence (

f (t) = {(γ - t)}^{+} + 1 - γ

) are the same, regardless of the value of

γ \geq 1

.

It is important to note that

S_{ι}^{(f)} (Δ, R | X)

in the case of the variational distance and the reverse KL divergence has been addressed by Hayashi [9] ([Theorem 3]) and [9] ([Theorem 9]), respectively, using different information-theoretic approaches. Furthermore,

S_{ι}^{(f)} (Δ, R | X)

, in the case of the Hellinger distance for i.i.d. sources, was studied by Kumagai and Hayashi [13] in a broader setting, namely the random number conversion problem. Their work focused specifically on i.i.d. sources, while our results extend to more general source models.

8. Optimistic Optimum Achievable Rates

8.1. Source Resolvability

In previous sections, we have treated general formulas of the first- and second-order optimum rates. In this section, we consider optimum achievable rates in the optimistic sense. The notion of the optimistic optimum rates was first introduced by Vembu, Verdú and Steinberg [33] in the source-channel coding problem. Several researchers have developed an optimistic coding scenario in other information-theoretic problems [4,9,34,35]. In this subsection, we also clarify the optimistic optimum resolvability rate with respect to the f-divergence using the smooth Rényi entropy.

Definition 11.

R is said to be optimistically D-achievable with the given f-divergence if there exists a sequence of mappings

ϕ_{n} : U_{M_{n}} \to X^{n}

satisfying

\underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D, \underset{n \to \infty}{lim inf} \frac{1}{n} log M_{n} \leq R .

(195)

Definition 12 (Optimistic first-order optimum resolvability rate).

\begin{matrix} T_{r}^{(f)} (D | X) & : = inf \{R | \text{$R$ is optimistically $D$-achievable with the given $f$-divergence}\} . \end{matrix}

(196)

We similarly define the second-order achievability in the optimistic scenario.

Definition 13.

L is said to be optimistically

(D, R)

-achievable with the given f-divergence if there exists a sequence of mappings

ϕ_{n} : U_{M_{n}} \to X^{n}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D, \quad \underset{n \to \infty}{lim inf} \frac{1}{\sqrt{n}} log \frac{M_{n}}{e^{n R}} \leq L . \end{matrix}

(197)

Definition 14 (Optimistic second-order optimum resolvability rate).

\begin{matrix} T_{r}^{(f)} (D, R | X) : = inf \{L | \text{$L$ is optimistically $(D, R)$-achievable with the given $f$-divergence}\} . \end{matrix}

(198)

Remark 12.

Conditions of optimistic D-achievability (195) can also be written as

\underset{n \to \infty}{lim inf} D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}})) \leq D, \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \leq R .

(199)

In actuality, the optimistic first-order optimum resolvability rate on the basis of (199) coincides with the one defined by Definition 12. A similar argument is also applicable in the optimistic second-order optimum resolvability rate as well as the optimistic optimum intrinsic randomness rates.

The following theorem can be obtained by using Theorems 3 and 4.

Theorem 13.

Under conditions C2’) and C3), for any

0 \leq D < f (0)

, it holds that

\begin{matrix} T_{r}^{(f)} (D | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{0} (1 - f_{0}^{- 1} (D + ν) | X^{n}), \end{matrix}

(200)

where

f_{0}

is the offset function of f, defined in (131).

Proof.

The proof proceeds in parallel with the proofs of Theorems 5 and 9 in which

{lim sup}_{n \to \infty} 1 / n log M_{n}

is replaced by

{lim inf}_{n \to \infty} 1 / n log M_{n}

. □

Theorem 14.

Under conditions C2’) and C3’), for any

0 \leq D < f (0)

, it holds that

\begin{matrix} T_{r}^{(f)} (D, R | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{0} (1 - f_{0}^{- 1} (D + ν) | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(201)

Proof.

The proof proceeds in parallel with the proof of Theorem 11 in which

{lim sup}_{n \to \infty} 1 / \sqrt{n} log M_{n}

is replaced by

{lim inf}_{n \to \infty} 1 / \sqrt{n} log M_{n}

. □

We have revealed the first- and second-order optimum resolvability rates in the optimistic scenario. As a result, the effectiveness of Theorems 3 and 4 has also been shown.

The optimistic second-order optimum achievable rates with the half variational distance using the smooth Rényi entropy have already been derived by Tagashira and Uyematsu [31]. If we consider the case of

f (t) = {(1 - t)}^{+}

, Theorem 14 coincides with their result.
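The difference between the usual criterion (lim sup of the rate) and the optimistic one (lim inf of the rate) only becomes visible for sources whose normalized smooth max entropy oscillates. The following Python sketch uses a hypothetical source, not taken from the paper, that is uniform on 2^n sequences for even n and on 4^n sequences for odd n. It assumes that, for a uniform distribution on K points, H_0(δ | X^n) = log ⌈(1 - δ) K⌉. Whether this toy source meets every assumption of Theorem 13 is not checked here, and the minimum and maximum over a finite window are only crude stand-ins for lim inf and lim sup.

```python
import math


def support_size(n):
    """Hypothetical nonstationary source: uniform on 2^n (n even) or 4^n (n odd) sequences."""
    return 2 ** n if n % 2 == 0 else 4 ** n


def normalized_smooth_max_entropy(n, delta):
    """(1/n) H_0(delta | X^n) for the uniform source, under the assumed counting formula."""
    needed = math.ceil((1.0 - delta) * support_size(n))
    return math.log(needed) / n


delta = 0.1
tail = [normalized_smooth_max_entropy(n, delta) for n in range(41, 61)]
optimistic_rate = min(tail)    # finite-window proxy for the lim inf in (200)
pessimistic_rate = max(tail)   # finite-window proxy for the corresponding lim sup
print(optimistic_rate, math.log(2), pessimistic_rate, math.log(4))
```

In this toy example the two proxies settle near log 2 and log 4, respectively, which illustrates why the optimistic rate can be strictly smaller than the ordinary one for nonstationary sources.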

8.2. Intrinsic Randomness

We next consider the optimum intrinsic randomness rates in the optimistic scenario.

Definition 15.

R is said to be optimistically Δ-achievable with the given f-divergence if there exists a sequence of mappings

φ_{n} : X^{n} \to U_{M_{n}}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & \leq Δ, \underset{n \to \infty}{lim sup} \frac{1}{n} log M_{n} \geq R . \end{matrix}

(202)

Definition 16 (Optimistic first-order optimum intrinsic randomness rate).

T_{ι}^{(f)} (Δ | X) : = sup \{R | \text{$R$ is optimistically $Δ$-achievable with the given $f$-divergence}\} .

(203)

Definition 17.

L is said to be optimistically

(Δ, R)

-achievable with the given f-divergence if there exists a sequence of mappings

φ_{n} : X^{n} \to U_{M_{n}}

satisfying

\begin{matrix} \underset{n \to \infty}{lim sup} D_{f} (φ_{n} (X^{n}) | | U_{M_{n}}) & \leq Δ, \underset{n \to \infty}{lim sup} \frac{1}{\sqrt{n}} log \frac{M_{n}}{e^{n R}} \geq L . \end{matrix}

(204)

Definition 18 (Optimistic second-order optimum intrinsic randomness rate).

T_{ι}^{(f)} (Δ, R | X) : = sup \{L | \text{$L$ is optimistically $(Δ, R)$-achievable with the given $f$-divergence}\} .

(205)

Then, we have the following theorem by using Theorems 6 and 7.

Theorem 15.

Under condition C2’), for any

0 \leq Δ < f (0)

it holds that

\begin{matrix} T_{ι}^{(f)} (Δ | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{\infty} (1 - f_{0}^{- 1} (Δ + ν) | X^{n}) . \end{matrix}

(206)

Proof.

The proof is similar to the proofs of Theorems 8 and 10 in which

{lim inf}_{n \to \infty} 1 / n log M_{n}

is replaced by

{lim sup}_{n \to \infty} 1 / n log M_{n}

. □

Theorem 16.

Under condition C2’), for any

0 \leq Δ < f (0)

it holds that

\begin{matrix} T_{ι}^{(f)} (Δ, R | X) & = lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{\infty} (1 - f_{0}^{- 1} (Δ + ν) | X^{n}) - n R}{\sqrt{n}} . \end{matrix}

(207)

Proof.

The proof is similar to the proof of Theorem 12 in which

{lim inf}_{n \to \infty} 1 / \sqrt{n} log M_{n}

is replaced by

{lim sup}_{n \to \infty} 1 / \sqrt{n} log M_{n}

. □

We have revealed the first- and second-order optimum intrinsic randomness rates in an optimistic scenario. As in the case of the resolvability problem, the effectiveness of Theorems 6 and 7 has also been shown.

The optimistic first-order optimum intrinsic randomness rate with the half variational distance using the smooth Rényi entropy has been derived by Uyematsu and Kunimatsu [10], while the second-order one has been characterized by Namekawa and Uyematsu [32]. Our results (Theorems 15 and 16) are generalizations of their results.

It is important to acknowledge that the topic of optimistic optimum achievable rates has also been previously studied in [9]. Our analysis of

T_{ι} (Δ | X)

and

T_{ι} (Δ, R | X)

relates to the optimistic optimum achievable rates for intrinsic randomness with variational distance, which were addressed in Theorems 2 and 3 of [9] using different information-theoretic quantities. It should be noted that Hayashi's work [9] encompasses the analysis of several optimal rates, including the optimistic optimum achievable rates.

9. Discussion

Theorems 5 and 8 (as well as Theorems 11 and 12) have shown a kind of duality of two optimum achievable rates in different random number generation problems in terms of the smooth Rényi entropy. It should be noted that in the case of the variational distance, Theorem 6 in [6] and Theorem 7 in [10] have implied the same duality.

As we have mentioned in Section 1, the optimum achievable rates

S_{r}^{(f)} (D | X)

and

S_{ι}^{(f)} (Δ | X)

have already been characterized by using the information spectrum quantity.

Definition 19.

\begin{matrix} {\bar{K}}_{f} (ε | X) & : = & inf \{R |\underset{n \to \infty}{lim sup} f (Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \leq R\}) \leq ε\}, \\ {\underline{K}}_{f} (ε | X) & : = & sup \{R |\underset{n \to \infty}{lim sup} f (Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \geq R\}) \leq ε\} . \end{matrix}

(208)

Then, using these two quantities the following theorem has already been given.

Theorem 17

(Nomura [4] ([Theorems 3.1 and 4.1])). Under conditions C1)–C3), it holds that

\begin{matrix} S_{r}^{(f)} (D | X) & = & {\bar{K}}_{f} (D | X), \end{matrix}

(209)

\begin{matrix} S_{ι}^{(f)} (Δ | X) & = & {\underline{K}}_{f} (Δ | X) . \end{matrix}

(210)

From the above theorem and Theorems 5 and 8, we obtain the following relationship.

Theorem 18.

Under conditions C1)–C3), it holds that

\begin{matrix} lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}) & = & {\bar{K}}_{f} (D | X), \end{matrix}

(211)

\begin{matrix} lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) & = & {\underline{K}}_{f} (Δ | X) . \end{matrix}

(212)

The above theorem shows equivalences between information spectrum quantities and smooth Rényi entropies.

Remark 13.

Theorem 18 can also be proved by using previous results and the continuity of the function f. In actuality, for

f (t) = {(1 - t)}^{+}

, Steinberg and Verdú [2] have shown

S_{r}^{(f)} (D | X) = inf \{R |\underset{n \to \infty}{lim sup} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \geq R\} \leq D\},

(213)

from which together with the theorem given by Uyematsu [6] ([Theorem 6]) (Corollary 1 in this paper), we obtain

lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (D + ν | X^{n}) = inf \{R |\underset{n \to \infty}{lim sup} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \geq R\} \leq D\} .

(214)

Since

\begin{matrix} {\bar{K}}_{f} (D | X) = inf \{R |\underset{n \to \infty}{lim sup} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \geq R\} \leq 1 - f^{- 1} (D)\} \end{matrix}

(215)

holds under conditions C1)–C3), we have (211). Equation (212) can also be derived from Corollary 1 and the result given by [8] ([Theorem 2.4.2]).
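The step from Definition 19 to (215) rests on the monotonicity of f: for a decreasing f, the condition f(p) ≤ D on a probability p is the same as 1 - p ≤ 1 - f^{-1}(D), up to boundary events. A short Python sketch checks this equivalence on a grid for f(t) = -log t. The grid values are arbitrary and are chosen away from the boundary p = e^{-D} to avoid floating-point ties, and the function names are hypothetical.

```python
import math


def f(t):
    """Reverse KL generator f(t) = -log t, decreasing on (0, 1]."""
    return -math.log(t)


def f_inv(d):
    """Inverse of f: f_inv(D) = exp(-D)."""
    return math.exp(-d)


def spectrum_form(p, d):
    """Condition as it appears in Definition 19: f(p) <= D."""
    return f(p) <= d


def tail_form(p, d):
    """Condition as it appears in (215): 1 - p <= 1 - f^{-1}(D)."""
    return (1.0 - p) <= 1.0 - f_inv(d)


grid_p = [k / 20 for k in range(1, 20)]
grid_d = [0.1, 0.5, 1.0, 2.0]
agree = all(spectrum_form(p, d) == tail_form(p, d) for p in grid_p for d in grid_d)
print(agree)
```

Here p plays the role of the probability inside f in Definition 19 and 1 - p that of the complementary tail probability in (215).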

Remark 14.

From Definition 19, two quantities

{\bar{K}}_{f} (D | X)

and

{\underline{K}}_{f} (D | X)

are right-continuous functions of D, while

\underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D) | X^{n}) \quad \text{and} \quad \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (D) | X^{n})

(216)

may not be. The operation

{lim}_{ν ↓ 0}

in Theorem 18 can be regarded as an operation that makes the quantities in (216) right-continuous. Furthermore, since

f^{- 1} (D)

is a decreasing function of D,

H_{α} (1 - f^{- 1} (D) | X^{n})

is also a decreasing function of D. This means that the relation

\underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D) | X^{n}) \geq lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{1}{n} H_{0} (1 - f^{- 1} (D + ν) | X^{n}),

(217)

holds. It should be emphasized that the above inequality holds with equality except for at most countably many D. Similarly, we obtain

\underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ) | X^{n}) \geq lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{1}{n} H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}),

(218)

where the equality holds except for at most countably many Δ. A similar observation can be applied to Theorem 20 below.

The quantity on the right-hand side of Equation (214) is an information-spectrum quantity defined in [8]. This quantity has been instrumental in analyzing various problems, including source coding and resolvability. On the other hand, the following quantity is specifically used for analyzing the intrinsic randomness problem:

sup \{R |\underset{n \to \infty}{lim sup} Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \leq R\} \leq D\}

(219)

It is noteworthy that Hayashi [9] has defined second-order extensions of these quantities, further expanding their applicability in information theory. These extensions provide a more refined analysis of the asymptotic behavior of various information-theoretic problems.

We next consider the case of the second-order setting. We first define two quantities:

\begin{matrix} {\bar{K}}_{f} (ε, R | X) & : = & inf \{L |\underset{n \to \infty}{lim sup} f (Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \leq R + \frac{L}{\sqrt{n}}\}) \leq ε\}, \end{matrix}

(220)

\begin{matrix} {\underline{K}}_{f} (ε, R | X) & : = & sup \{L |\underset{n \to \infty}{lim sup} f (Pr \{\frac{1}{n} log \frac{1}{P_{X^{n}} (X^{n})} \geq R + \frac{L}{\sqrt{n}}\}) \leq ε\} . \end{matrix}

(221)

By using these quantities, the following theorem has been obtained.

Theorem 19

(Nomura [4] ([Theorems 6.1 and 6.2])). Under conditions C1), C2) and C3’), it holds that

\begin{matrix} S_{r}^{(f)} (D, R | X) & = & {\bar{K}}_{f} (D, R | X), \end{matrix}

(222)

\begin{matrix} S_{ι}^{(f)} (Δ, R | X) & = & {\underline{K}}_{f} (Δ, R | X) . \end{matrix}

(223)

From the above theorem and Theorems 11 and 12, we obtain:

Theorem 20.

Under conditions C1), C2) and C3’), it holds that

\begin{matrix} lim_{ν ↓ 0} \underset{n \to \infty}{lim sup} \frac{H_{0} (1 - f^{- 1} (D + ν) | X^{n}) - n R}{\sqrt{n}} & = & {\bar{K}}_{f} (D, R | X), \end{matrix}

(224)

\begin{matrix} lim_{ν ↓ 0} \underset{n \to \infty}{lim inf} \frac{H_{\infty} (1 - f^{- 1} (Δ + ν) | X^{n}) - n R}{\sqrt{n}} & = & {\underline{K}}_{f} (Δ, R | X) . \end{matrix}

(225)

The above theorem also shows equivalences between information spectrum quantities and smooth Rényi entropies in the second-order sense.

We have discussed functions f under C1), C2), and C3) (or C3’)) for simplicity. We can also extend the discussions for f under C2’) and C3) (or C3’)) with due modification using

f_{0}

.

10. Concluding Remarks

We have so far considered the optimum achievable rates in two random number generation problems with respect to a subclass of f-divergences. We have derived general formulas for the first- and second-order optimum achievable rates with respect to the given f-divergence, expressed by the smooth Rényi entropy evaluated at a smoothing parameter that involves the inverse function of f. To our knowledge, this is the first use of the smooth Rényi entropy in information theory that contains the general function f. We believe that this is important from both the theoretical and practical viewpoints. In actuality, we have shown that we can easily derive the results on several important measures, such as the variational distance, the KL divergence, and the Hellinger distance, by substituting the specified function f into our general formulas. It should be noted that the optimum achievable rates with these important measures have not been characterized before by using the smooth Rényi entropy, except for the variational distance. The expressions of the smooth max entropy in Theorem 1 and the smooth min entropy in Theorem 2 are simple and easy to understand. Hence, our results using the smooth max entropy and the smooth min entropy are also easy to interpret. This provides another viewpoint for understanding the mechanism of the random number generation problems, compared to the results given in [4], in which the information spectrum quantities are used. In addition, we have shown that the conditions on the f-divergence can be relaxed, so that the general formulas hold for a wider class of f-divergences. These are the major contributions of this paper.

As a consequence of our results and the results in [4], the equivalence of the smooth Rényi entropy and the information spectrum quantity has been clarified (Theorem 18). One may consider that if we show this equivalence first, then we can derive Theorems 5 and 8 directly. This observation is correct. That is, one simple method of deriving both of the general formulas of the optimum achievable rates (Theorems 5 and 8) is to show this equivalence (Theorem 18) first. Then, combining Theorem 18 and the results in [4], we obtain Theorems 5 and 8. However, we have taken another approach to show Theorems 5 and 8 in this paper. For example, we first showed Theorems 3 and 4 so as to establish Theorem 5. Although Theorem 5 has been established by using Theorems 3 and 4, we think that these two theorems are significant in themselves. In fact, Theorem 3 shows how to construct an optimum mapping in the resolvability problem, and Theorem 4 shows the relationship between the rate of the random number and the smooth max entropy at finite block length. Hence, these two theorems are significant not only for proving Theorem 5 but also for constructing the optimum mapping in practical situations.

In this paper, we have considered the f-divergence

D_{f} (X^{n} | | ϕ_{n} (U_{M_{n}}))

in the case of the resolvability problem and

D_{f} (φ_{n} (X^{n}) | | U_{M_{n}})

in the case of the intrinsic randomness problem and shown a kind of duality of these problems in terms of the smooth Rényi entropy. On the other hand, we can consider the resolvability problem with respect to

D_{f} (ϕ_{n} (U_{M_{n}}) | | X^{n})

as well as the intrinsic randomness problem with respect to

D_{f} (U_{M_{n}} | | φ_{n} (X^{n}))

. Although these problems are also important, the techniques of the present paper cannot be applied to them directly. Treating them seems to require novel techniques, which remain to be studied. The same limitation arises in the information spectrum approach [4].

Finally, condition C3) and assumption (15) on the source are needed only to show the direct part (Theorem 3) in the resolvability problem. Examining whether these conditions are necessary, or whether they can be weakened, is left for future work.

Author Contributions

R.N. and H.Y. conceptualized the overall study. R.N. took the lead in writing the manuscript, with H.Y. contributing by writing Section 5. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JSPS KAKENHI Grant Numbers JP20K04462, JP22K04111, and JP23K10992, and by the Kayamori Foundation of Informational Science Advancement.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Han, T.S.; Verdú, S. Approximation theory of output statistics. IEEE Trans. Inf. Theory 1993, 39, 752–772.
2. Steinberg, Y.; Verdú, S. Simulation of random processes and rate-distortion theory. IEEE Trans. Inf. Theory 1996, 42, 63–86.
3. Nomura, R. Source resolvability with Kullback-Leibler divergence. In Proceedings of the 2018 IEEE International Symposium on Information Theory, Vail, CO, USA, 17–22 June 2018; pp. 2042–2046.
4. Nomura, R. Source resolvability and intrinsic randomness: Two random number generation problems with respect to a subclass of f-divergences. IEEE Trans. Inf. Theory 2020, 66, 7588–7601.
5. Nomura, R.; Han, T.S. Second-order resolvability, intrinsic randomness, and fixed-length source coding for mixed sources: Information spectrum approach. IEEE Trans. Inf. Theory 2013, 59, 1–16.
6. Uyematsu, T. Relating source coding and resolvability: A direct approach. In Proceedings of the 2010 IEEE International Symposium on Information Theory, Austin, TX, USA, 13–18 June 2010; pp. 1350–1354.
7. Vembu, S.; Verdú, S. Generating random bits from an arbitrary source: Fundamental limits. IEEE Trans. Inf. Theory 1995, 41, 1322–1332.
8. Han, T.S. Information-Spectrum Methods in Information Theory; Springer: New York, NY, USA, 2003.
9. Hayashi, M. Second-order asymptotics in fixed-length source coding and intrinsic randomness. IEEE Trans. Inf. Theory 2008, 54, 4619–4637.
10. Uyematsu, T.; Kunimatsu, S. A new unified method for intrinsic randomness problems of general sources. In Proceedings of the 2013 IEEE Information Theory Workshop (ITW), Seville, Spain, 9–13 September 2013; pp. 1–5.
11. Liu, J.; Cuff, P.; Verdú, S. E_γ-resolvability. IEEE Trans. Inf. Theory 2017, 63, 2629–2658.
12. Yagi, H.; Han, T.S. Variable-length resolvability for mixed sources and its application to variable-length source coding. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018.
13. Kumagai, W.; Hayashi, M. Second-order asymptotics of conversions of distributions and entangled states based on Rayleigh-normal probability distributions. IEEE Trans. Inf. Theory 2017, 63, 1829–1857.
14. Kumagai, W.; Hayashi, M. Random number conversion and LOCC conversion via restricted storage. IEEE Trans. Inf. Theory 2017, 63, 2504–2532.
15. Yu, L.; Tan, V.Y.F. Simulation of random variables under Rényi divergence measures of all orders. IEEE Trans. Inf. Theory 2019, 65, 3349–3383.
16. Yagi, H.; Han, T.S. Variable-length resolvability for general sources and channels. Entropy 2023, 25, 1466.
17. Csiszár, I.; Shields, P.C. Information theory and statistics: A tutorial. Found. Trends® Commun. Inf. Theory 2004, 1, 417–528.
18. Sason, I.; Verdú, S. f-divergence inequalities. IEEE Trans. Inf. Theory 2016, 62, 5973–6006.
19. Renner, R.; Wolf, S. Smooth Rényi entropy and applications. In Proceedings of the 2004 IEEE International Symposium on Information Theory (ISIT), Chicago, IL, USA, 27 June–2 July 2004; p. 233.
20. Holenstein, T.; Renner, R. On the randomness of independent experiments. IEEE Trans. Inf. Theory 2011, 57, 1865–1871.
21. Uyematsu, T. A new unified method for fixed-length source coding problems of general sources. IEICE Trans. Fundam. 2010, E93-A, 1868–1877.
22. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970.
23. Hayashi, M. Information spectrum approach to second-order coding rate in channel coding. IEEE Trans. Inf. Theory 2009, 55, 4947–4966.
24. Polyanskiy, Y.; Poor, H.; Verdú, S. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359.
25. Ingber, A.; Kochman, Y. The dispersion of lossy source coding. In Proceedings of the Data Compression Conference (DCC), Snowbird, UT, USA, 29–31 March 2011; pp. 53–62.
26. Kostina, V.; Verdú, S. Fixed-length lossy compression in the finite blocklength regime. IEEE Trans. Inf. Theory 2012, 58, 3309–3338.
27. Kontoyiannis, I.; Verdú, S. Optimal lossless compression: Source varentropy and dispersion. In Proceedings of the 2013 IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 1739–1742.
28. Tan, V.Y.F.; Kosut, O. On the dispersions of three network information theory problems. IEEE Trans. Inf. Theory 2014, 60, 881–903.
29. Yagi, H.; Han, T.S.; Nomura, R. First- and second-order coding theorems for mixed memoryless channels with general mixture. IEEE Trans. Inf. Theory 2016, 62, 4395–4412.
30. Watanabe, S. Second-order region for Gray-Wyner network. IEEE Trans. Inf. Theory 2017, 63, 1006–1018.
31. Tagashira, S.; Uyematsu, T. The second order asymptotic rates in fixed-length coding and resolvability problem in terms of smooth Rényi entropy. IEICE Tech. Rep. 2013, 112, 65–70. (In Japanese)
32. Namekawa, E.; Uyematsu, T. The second order asymptotic rates in intrinsic randomness problem in terms of smooth Rényi entropy. IEICE Tech. Rep. 2015, 114, 1–6. (In Japanese)
33. Vembu, S.; Verdú, S.; Steinberg, Y. The source-channel separation theorem revisited. IEEE Trans. Inf. Theory 1995, 41, 44–54.
34. Chen, P.O.; Alajaji, F. Optimistic Shannon coding theorems for arbitrary single-user systems. IEEE Trans. Inf. Theory 1999, 45, 2623–2629.
35. Koga, H. Four limits in probability and their roles in source coding. IEICE Trans. Fundam. 2011, 94, 2073–2082.

Figure 1. Smooth max entropy H_{0}(δ | X^{n}).

Figure 2. Smooth min entropy H_{\infty}(δ | X^{n}).

Figure 3. Resolvability problem.

Figure 4. Intrinsic randomness problem.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

