Article

Optimum Achievable Rates in Two Random Number Generation Problems with f-Divergences Using Smooth Rényi Entropy †

1 Center for Data Science, Waseda University, Tokyo 169-8050, Japan
2 Department of Computer and Network Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan
* Author to whom correspondence should be addressed.
This paper is an extension of our conference papers: Nomura, R.; Yagi, H. Optimum source resolvability rate with respect to f-divergences using the smooth Rényi entropy. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; and Nomura, R.; Yagi, H. Optimum intrinsic randomness rate with respect to f-divergences using the smooth min entropy. In Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT), Melbourne, Australia, 12–20 July 2021.
These authors contributed equally to this work.
Entropy 2024, 26(9), 766; https://doi.org/10.3390/e26090766
Submission received: 1 July 2024 / Revised: 29 August 2024 / Accepted: 4 September 2024 / Published: 6 September 2024
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract:
Two typical fixed-length random number generation problems in information theory are considered for general sources. One is the source resolvability problem and the other is the intrinsic randomness problem. In each of these problems, the optimum achievable rate with respect to a given approximation measure is one of our main concerns, and it has been characterized using two different information quantities: the information spectrum and the smooth Rényi entropy. Recently, optimum achievable rates with respect to f-divergences have been characterized using the information spectrum quantity. The f-divergence is a general non-negative measure between two probability distributions defined via a convex function f. The class of f-divergences includes several important measures such as the variational distance, the KL divergence, the Hellinger distance, and so on. Hence, it is meaningful to consider the random number generation problems with respect to f-divergences. However, optimum achievable rates with respect to f-divergences using the smooth Rényi entropy have not been clarified yet in either problem. In this paper, we analyze the optimum achievable rates using the smooth Rényi entropy and extend the class of f-divergences for which they can be characterized. To do so, we first derive general formulas of the first-order optimum achievable rates with respect to f-divergences in both problems under the same conditions as imposed by previous studies. Next, we relax the conditions on the f-divergence and generalize the obtained general formulas. Then, we particularize our general formulas to several specific functions f. As a result, we show that optimum achievable rates for several important measures are easily derived from our general formulas. Furthermore, a kind of duality between the resolvability and the intrinsic randomness is revealed in terms of the smooth Rényi entropy. Second-order optimum achievable rates and optimistic achievable rates are also investigated.

1. Introduction

Two typical fixed-length random number generation problems in information theory are considered for general sources. One is the source resolvability problem (i.e., the resolvability problem), and the other is the intrinsic randomness problem. The setting of the resolvability problem is as follows. Given an arbitrary source $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ (the target random number), we approximate it by using a discrete random number that is uniformly distributed, which we call the uniform random number. Here, the size of the uniform random number is required to be as small as possible. In this setting, the degree of approximation is measured by several criteria. Han and Verdú [1] and Steinberg and Verdú [2] have determined the first-order optimum achievable rates with respect to the variational distance and the normalized Kullback–Leibler (KL) divergence. Nomura [3] has studied the first-order optimum achievable rates with respect to the KL divergence. Recently, Nomura [4] has characterized the first-order optimum achievable rates with respect to f-divergences. The class of f-divergences considered in [4] includes the variational distance and the KL divergence; hence, the result can be considered a generalization of the results given in [1,3]. The second-order optimum achievable rates in the resolvability problem have also been studied with respect to several approximation measures [4,5]. It should be noted that the results mentioned above are based on the information spectrum quantity. On the other hand, Uyematsu [6] has characterized the first-order optimum achievable rate with respect to the variational distance using the smooth Rényi entropy.
The intrinsic randomness problem, which is also a typical random number generation problem, has likewise been studied. Its setting is as follows. By using a given arbitrary source $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ (the coin random number), we approximate a discrete uniform random number whose size is required to be as large as possible. In the intrinsic randomness problem, too, optimum achievable rates with respect to various criteria have been considered. Vembu and Verdú [7] have considered the intrinsic randomness problem with respect to the variational distance as well as the normalized KL divergence and derived general formulas of the first-order optimum achievable rates (cf. Han [8]). Hayashi [9] has considered the first- and second-order optimum achievable rates with respect to the KL divergence. Recently, the first- and second-order optimum achievable rates with respect to f-divergences have been clarified in [4]. The results mentioned here are based on information spectrum quantities. On the other hand, Uyematsu and Kunimatsu [10] have characterized the first-order optimum achievable rates with respect to the variational distance using the smooth Rényi entropy.
Related works include those by Liu, Cuff and Verdú [11], Yagi and Han [12], Kumagai and Hayashi [13,14], and Yu and Tan [15]. In [11], the channel resolvability problem with respect to the $E_\gamma$-divergence has been considered, and the results have been applied to the case of the source resolvability problem. Yagi and Han [16] have determined the optimum variable-length resolvability rates with respect to the variational distance as well as the KL divergence. Kumagai and Hayashi [13,14] have determined the first- and second-order optimum achievable rates in the random number conversion problem. It should be noted that the random number conversion problem includes the resolvability and intrinsic randomness problems treated in this paper. In [13,14], an approximation measure related to the Hellinger distance has been used. Yu and Tan [15] have considered the random number conversion problem with respect to the Rényi divergence.
As mentioned above, in both the resolvability and the intrinsic randomness problems, various approximation measures have been considered, and general formulas of achievable rates have been characterized by using the information spectrum quantity and the smooth Rényi entropy. We note here that optimum achievable rates with respect to f-divergences using the smooth Rényi entropy have not been clarified yet. The smooth Rényi entropy is an information quantity that has a clear operational meaning and is easy to understand. Moreover, the class of f-divergences is a general class of distance measures that includes several important measures. In this paper, we therefore characterize the first- and second-order optimum achievable rates with respect to f-divergences using the smooth Rényi entropy. In addition, we extend the class of f-divergences for which optimum achievable rates can be characterized. As a result, we find that two types of smooth Rényi entropies are useful to describe these optimum achievable rates for a wider class of f-divergences. Furthermore, a kind of duality between the resolvability and the intrinsic randomness is revealed in terms of the smooth Rényi entropy and f-divergences.
This paper is organized as follows. In Section 2, we describe the problem settings and give definitions of the optimum first-order achievable rates. The class of f-divergences and the smooth Rényi entropy are also introduced. In Sections 3 and 4, we show general formulas of the optimum first-order achievable rates in the resolvability problem and the intrinsic randomness problem, respectively. In Section 5, we derive the general formulas of these achievable rates for an extended class of f-divergences. In Section 6, we apply the general formulas obtained in the previous sections to some specific functions f and compute the optimum first-order achievable rates in each case. In Section 7, we show general formulas of the optimum second-order achievable rates in the two problems. In Section 8, optimum achievable rates in the optimistic sense are considered. Section 9 is devoted to the discussion concerning our results. Finally, we provide some concluding remarks in Section 10.

2. Preliminaries

2.1. f-Divergences

The f-divergence between two probability distributions $P_Z$ and $P_{\bar{Z}}$ is defined as follows [17]. Let $f(t)$ be a convex function defined for $t > 0$ with $f(1) = 0$.
Definition 1 
(f-divergence [17]). Let $P_Z$ and $P_{\bar{Z}}$ denote probability distributions over a finite or countably infinite set $\mathcal{Z}$. The f-divergence between $P_Z$ and $P_{\bar{Z}}$ is defined by
$$D_f(Z\|\bar{Z}) := \sum_{z\in\mathcal{Z}} P_{\bar{Z}}(z)\, f\!\left(\frac{P_Z(z)}{P_{\bar{Z}}(z)}\right), \tag{1}$$
where we set $0 f\!\left(\frac{0}{0}\right) = 0$, $f(0) = \lim_{t\downarrow 0} f(t)$, and $0 f\!\left(\frac{a}{0}\right) = \lim_{t\downarrow 0} t f\!\left(\frac{a}{t}\right) = a \lim_{u\to\infty} \frac{f(u)}{u}$.
The f-divergence is a general approximation measure, which includes some important measures. We give some examples of f-divergences [17,18]:
  • $f(t) = t\log t$ (Kullback–Leibler (KL) divergence):
    $$D_f(Z\|\bar{Z}) = \sum_{z\in\mathcal{Z}} P_Z(z)\log\frac{P_Z(z)}{P_{\bar{Z}}(z)} =: D(Z\|\bar{Z}). \tag{2}$$
  • $f(t) = -\log t$ (reverse Kullback–Leibler divergence):
    $$D_f(Z\|\bar{Z}) = \sum_{z\in\mathcal{Z}} P_{\bar{Z}}(z)\log\frac{P_{\bar{Z}}(z)}{P_Z(z)} = D(\bar{Z}\|Z). \tag{3}$$
  • $f(t) = 1-\sqrt{t}$ (Hellinger distance):
    $$D_f(Z\|\bar{Z}) = 1 - \sum_{z\in\mathcal{Z}} \sqrt{P_Z(z) P_{\bar{Z}}(z)}. \tag{4}$$
  • $f(t) = (1-\sqrt{t})^2$ (squared Hellinger distance):
    $$D_f(Z\|\bar{Z}) = \sum_{z\in\mathcal{Z}} \left(\sqrt{P_Z(z)} - \sqrt{P_{\bar{Z}}(z)}\right)^2. \tag{5}$$
  • $f(t) = |t-1|$ (variational distance):
    $$D_f(Z\|\bar{Z}) = \sum_{z\in\mathcal{Z}} |P_Z(z) - P_{\bar{Z}}(z)|. \tag{6}$$
  • $f(t) = (1-t)^+ := \max\{1-t, 0\}$ (half variational distance):
    $$D_f(Z\|\bar{Z}) = \frac{1}{2}\sum_{z\in\mathcal{Z}} |P_Z(z) - P_{\bar{Z}}(z)| = \sum_{z\in\mathcal{Z}: P_Z(z) > P_{\bar{Z}}(z)} \left(P_Z(z) - P_{\bar{Z}}(z)\right). \tag{7}$$
  • $f(t) = \frac{t^\alpha - \alpha t - (1-\alpha)}{\alpha(\alpha-1)}$ ($\alpha$-divergence, $0 < \alpha < 1$):
    $$D_f(Z\|\bar{Z}) = \frac{1}{\alpha(1-\alpha)}\left(1 - \sum_{z\in\mathcal{Z}} P_Z(z)^\alpha P_{\bar{Z}}(z)^{1-\alpha}\right). \tag{8}$$
  • $f(t) = (t-\gamma)^+$ ($E_\gamma$-divergence): for any given $\gamma \ge 1$,
    $$D_f(Z\|\bar{Z}) = \sum_{z\in\mathcal{Z}: P_Z(z) > \gamma P_{\bar{Z}}(z)} \left(P_Z(z) - \gamma P_{\bar{Z}}(z)\right) =: E_\gamma(Z\|\bar{Z}). \tag{9}$$
The $E_\gamma$-divergence is a generalization of the half variational distance defined in (7), because $\gamma \ge 1$ is arbitrary (the case $\gamma = 1$ recovers (7)).
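Before proceeding, it may help to see these measures evaluated numerically. The following Python snippet is our own illustrative sketch (the helper `f_divergence` and the example distributions are not from the paper); it implements Definition 1 with the stated conventions for zero denominators:

```python
import math

def f_divergence(p, q, f):
    """D_f(P||Q) = sum_z Q(z) f(P(z)/Q(z)) with the conventions of Definition 1."""
    total = 0.0
    for pz, qz in zip(p, q):
        if qz > 0:
            total += qz * f(pz / qz)
        elif pz > 0:
            # 0 * f(a/0) = a * lim_{u->inf} f(u)/u, approximated numerically
            u = 1e12
            total += pz * f(u) / u
    return total

kl        = lambda t: t * math.log(t) if t > 0 else 0.0   # KL divergence
hellinger = lambda t: 1 - math.sqrt(t)                    # Hellinger distance
vd        = lambda t: abs(t - 1)                          # variational distance
half_vd   = lambda t: max(1 - t, 0.0)                     # half variational distance
e_gamma   = lambda t, g=2.0: max(t - g, 0.0)              # E_gamma-divergence (gamma = 2)

P, Q = [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]
for name, f in [("KL", kl), ("Hellinger", hellinger), ("VD", vd),
                ("half VD", half_vd), ("E_2", e_gamma)]:
    print(name, f_divergence(P, Q, f))
```

Running it shows, for instance, that the half variational distance equals half the variational distance, as in (6) and (7).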
Remark 1. 
It is known [4] that the $E_\gamma$-divergence can be expressed as an f-divergence using the function
$$f(t) = (\gamma - t)^+ + 1 - \gamma. \tag{10}$$
The following key property holds for the f-divergence from Jensen's inequality [17]:
$$\sum_{z\in\mathcal{Z}} b(z)\, f\!\left(\frac{a(z)}{b(z)}\right) \ge \left(\sum_{z\in\mathcal{Z}} b(z)\right) f\!\left(\frac{\sum_{z\in\mathcal{Z}} a(z)}{\sum_{z\in\mathcal{Z}} b(z)}\right). \tag{11}$$
As we have mentioned above, the f-divergence is a general approximation measure, which includes several important measures. In this study, we first assume the following conditions on the function f that have also been imposed by previous studies [4].
C1) 
The function $f(t)$ is a decreasing function for $t > 0$ with $f(0) > 0$.
C2) 
The function $f(t)$ satisfies
$$\lim_{u\to\infty}\frac{f(u)}{u} = 0. \tag{12}$$
C3) 
For any pair of positive real numbers $(a, b)$, it holds that
$$\lim_{n\to\infty} f\!\left(e^{-nb}\right) e^{-na} = 0. \tag{13}$$
Remark 2. 
Notice here that the functions $f(t) = -\log t$, $f(t) = 1-\sqrt{t}$, and $f(t) = (1-t)^+$ satisfy the above conditions, while $f(t) = t\log t$ does not satisfy conditions C1) and C2). Moreover, it is not difficult to check that (10) satisfies these conditions.
Remark 3. 
For a decreasing function $f(t)$, it always holds that $f(0) = \lim_{t\downarrow 0} f(t) \ge 0$ because $f(1) = 0$. Then, the condition $f(0) > 0$ in C1) excludes the case of $f(t) = 0$ for all $t \ge 0$, in which the f-divergence is identically zero.
Remark 4. 
From the definition of the f-divergence, C2) means
$$0\, f\!\left(\frac{a}{0}\right) = 0 \tag{14}$$
for any $a > 0$. In the derivation of our main theorems, we can use (14) instead of (12).
Remark 5. 
We will show in Section 5 that condition C1) is automatically met for the function f satisfying condition C2) (cf. claim (i) of Lemma 1).

2.2. Smooth Rényi Entropy

In what follows, we consider the case of $\mathcal{Z} = \mathcal{X}^n$, where $\mathcal{X}$ is a finite or countably infinite set and $n$ is an integer. We consider the general source defined as an infinite sequence
$$\mathbf{X} = \left\{ X^n = \left( X_1^{(n)}, X_2^{(n)}, \ldots, X_n^{(n)} \right) \right\}_{n=1}^{\infty}$$
of $n$-dimensional random variables $X^n$, where each component random variable $X_i^{(n)}$ takes values in a countable set $\mathcal{X}$. Let $P_X(\cdot)$ denote the probability distribution of the random variable $X$. In this paper, we assume the following condition on the source $\mathbf{X}$:
$$\underline{H}(\mathbf{X}) < +\infty, \tag{15}$$
where
$$\underline{H}(\mathbf{X}) = \sup\left\{ R \,\middle|\, \lim_{n\to\infty} \Pr\left\{ \frac{1}{n}\log\frac{1}{P_{X^n}(X^n)} \ge R \right\} = 1 \right\} \tag{16}$$
is called the spectral inf-entropy rate of the source $\mathbf{X}$ [8]. Here, Han [8] (Theorem 1.7.2) has shown that
$$\underline{H}(\mathbf{X}) \le \log|\mathcal{X}| \tag{17}$$
holds. Hence, condition (15) holds for any source with a finite alphabet.
The random number $U_M$, which is uniformly distributed on $\{1, 2, \ldots, M\}$, is defined by
$$P_{U_M}(i) = \frac{1}{M} \qquad \left(\forall i \in \mathcal{U}_M := \{1, 2, \ldots, M\}\right). \tag{18}$$
We next introduce the smooth Rényi entropy of the source.
Definition 2 
(Smooth Rényi entropy of order α [19]). For given random variables $X^n$, the smooth Rényi entropy of order $\alpha$ given $\delta$ ($0 \le \delta < 1$) is defined by
$$H_\alpha(\delta|X^n) := \frac{1}{1-\alpha} \inf_{P_{\bar{X}^n} \in \mathcal{B}^\delta(P_{X^n})} \log \sum_{\mathbf{x}\in\mathcal{X}^n} P_{\bar{X}^n}(\mathbf{x})^\alpha, \tag{19}$$
where
$$\mathcal{B}^\delta(P_{X^n}) := \left\{ P_{\bar{X}^n} \in \mathcal{P}_n \,\middle|\, \frac{1}{2}\sum_{\mathbf{x}\in\mathcal{X}^n} |P_{X^n}(\mathbf{x}) - P_{\bar{X}^n}(\mathbf{x})| \le \delta \right\} \tag{20}$$
and $\mathcal{P}_n$ denotes the set of probability distributions on $\mathcal{X}^n$.
Here, $H_\alpha(\delta|X^n)$ is a decreasing function of $\delta$. The smooth Rényi entropy of order 0 and the smooth Rényi entropy of order $\infty$ are, respectively, called the smooth max entropy and the smooth min entropy [20].
The following theorems have shown alternative expressions of the smooth max entropy and the smooth min entropy.
Theorem 1 
(Uyematsu [6,21]).
$$H_0(\delta|X^n) = \min_{\substack{A_n \subseteq \mathcal{X}^n:\\ \Pr\{X^n \in A_n\} \ge 1-\delta}} \log |A_n|. \tag{21}$$
Theorem 2 
(Uyematsu and Kunimatsu [10]).
$$H_\infty(\delta|X^n) = -\log \inf\left\{ \beta \ge \frac{1}{|\mathcal{X}^n|} \,\middle|\, \sum_{\mathbf{x}\in\mathcal{X}^n} \left(P_{X^n}(\mathbf{x}) - \beta\right)^+ \le \delta \right\}, \tag{22}$$
where, if $\mathcal{X}$ is a countably infinite set, the infimum is taken over $\beta > 0$.
It should be noted that these alternative expressions are simpler and easier to interpret than (19). Figures 1 and 2 illustrate the operational meanings of (21) and (22). As shown in Figure 1, the smooth max entropy $H_0(\delta|X^n)$ equals the logarithm of the cardinality of a smallest set $A_n$ with $\Pr\{X^n \in A_n\} \ge 1-\delta$, in which each sequence $\mathbf{x} \in A_n$ has large probability. On the other hand, the smooth min entropy $H_\infty(\delta|X^n)$ equals the supremum of $-\log\beta$ such that the total probability mass exceeding $\beta$, i.e., $\sum_{\mathbf{x}}(P_{X^n}(\mathbf{x})-\beta)^+$, is at most $\delta$ (Figure 2).
In this paper, we use the above alternative expressions of the smooth max entropy and the smooth min entropy instead of (19).
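To make the alternative expressions concrete, here is a small Python sketch of ours (the function names are our own, not from the paper) that computes $H_0(\delta|X^n)$ and $H_\infty(\delta|X^n)$ for an explicitly given distribution via Theorems 1 and 2; natural logarithms are used throughout:

```python
import math

def smooth_max_entropy(p, delta):
    """H_0(delta|X): log-cardinality of a smallest set A with Pr{A} >= 1 - delta (Theorem 1)."""
    p = sorted(p, reverse=True)          # keep the most probable outcomes first
    mass, k = 0.0, 0
    while mass < 1 - delta:
        mass += p[k]
        k += 1
    return math.log(k)

def smooth_min_entropy(p, delta, tol=1e-12):
    """H_inf(delta|X): -log of the smallest beta >= 1/|X| such that the mass clipped
    above beta, sum_x (p(x) - beta)^+, is at most delta (Theorem 2)."""
    clipped = lambda beta: sum(max(px - beta, 0.0) for px in p)
    lo, hi = 1.0 / len(p), max(p)        # beta = max(p) always satisfies the constraint
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if clipped(mid) <= delta:
            hi = mid                     # constraint holds: try a smaller beta
        else:
            lo = mid
    return -math.log(hi)

P = [0.4, 0.3, 0.2, 0.1]
print(smooth_max_entropy(P, 0.1))        # log 3: {0.4, 0.3, 0.2} already covers 0.9
print(smooth_min_entropy(P, 0.1))        # -log 0.3: clipping 0.4 to 0.3 removes mass 0.1
```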

3. Source Resolvability Problem

We consider the problem of how to simulate a given discrete source $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ by using the uniform random number $U_{M_n}$ and a mapping $\phi_n$. Figure 3 is an illustrative figure of this problem (the probability distribution of $X^n$ is depicted in black, while that of $\phi_n(U_{M_n})$ is shown in blue). Since it is hard to simulate the source exactly in general, we consider the approximation problem under some measure. This problem is called the resolvability problem. One of the main objectives in the resolvability problem is to derive the smallest value of $a$ in the form of $M_n = e^{na}$, which we call the optimum resolvability rate [1,8]. This is formulated as follows.
Definition 3. 
Rate $R$ is said to be D-achievable with the given f-divergence if there exists a sequence of mappings $\phi_n: \mathcal{U}_{M_n} \to \mathcal{X}^n$ such that
$$\limsup_{n\to\infty} D_f(X^n \| \phi_n(U_{M_n})) \le D, \tag{23}$$
$$\limsup_{n\to\infty} \frac{1}{n}\log M_n \le R. \tag{24}$$
Given some $D$, if the rate constraint $R$ is sufficiently large, it can be shown that there exists a sequence of mappings satisfying the constraints in the above definition. Conversely, if $R$ is too small, no sequence of mappings satisfying the constraints can be found. Therefore, in the resolvability problem, the infimum of achievable $R$ is of particular interest.
Definition 4 (First-order optimum resolvability rate).
$$S_r^{(f)}(D|\mathbf{X}) := \inf\left\{ R \mid R \text{ is } D\text{-achievable with the given } f\text{-divergence} \right\}. \tag{25}$$
Remark 6. 
It should be noted that we use $D_f(X^n\|\phi_n(U_{M_n}))$, not $D_f(\phi_n(U_{M_n})\|X^n)$, as the condition in Definition 3. This distinction is important when considering asymmetric measures such as the KL divergence.
Remark 7. 
We consider the case where $D$ lies in $[0, f(0))$ under the given f-divergence. Since $f(t)$ is defined for $t > 0$ and we assume that $f(t)$ is a decreasing function of $t$, $D_f(X^n\|Y^n) \le f(0)$ holds for any distributions $P_{X^n}(\cdot)$ and $P_{Y^n}(\cdot)$ from the definition of the f-divergence. Hence, $D \ge f(0)$ means that there is no restriction on the approximation error (for example, $f(0) = 1$ in the case of the half variational distance and $f(0) = +\infty$ in the case of the KL divergence). This case leads to the trivial result that the first-order optimum resolvability rate equals 0. Hence, we only consider the case of $D \in [0, f(0))$. A similar observation applies throughout the following sections.
Our main objective in this section is to derive the general formula of the first-order optimum resolvability rate. To do so, we first derive the following two theorems. We use the notation $f^{-1}(a) := \inf\{t \mid f(t) = a\}$.
Theorem 3. 
Under conditions C1)–C3), for any $\gamma > 0$ and any $M_n$ satisfying
$$\frac{1}{n}\log M_n \ge \frac{1}{n} H_0(1 - f^{-1}(D) | X^n) + \gamma, \tag{26}$$
there exists a mapping $\phi_n$ that satisfies
$$D_f(X^n \| \phi_n(U_{M_n})) \le D + \gamma \tag{27}$$
for sufficiently large n.
Proof. 
We arbitrarily fix $M_n$ satisfying (26). We show that there exists a mapping $\phi_n$ that satisfies (27) for sufficiently large $n$. Let $B_n \subseteq \mathcal{X}^n$ denote a set satisfying
$$\Pr\{X^n \in B_n\} \ge f^{-1}(D) \tag{28}$$
and
$$\log |B_n| = H_0(1 - f^{-1}(D) | X^n). \tag{29}$$
The existence of such a set $B_n$ is guaranteed by (21). We define the probability distribution $P_{\bar{X}^n}$ over $B_n$ as
$$P_{\bar{X}^n}(\mathbf{x}) := \begin{cases} \dfrac{P_{X^n}(\mathbf{x})}{\Pr\{X^n \in B_n\}} & \mathbf{x} \in B_n, \\ 0 & \text{otherwise}. \end{cases} \tag{30}$$
Furthermore, let the set $C_n$ be
$$C_n := \left\{ \mathbf{x} \in B_n \,\middle|\, P_{\bar{X}^n}(\mathbf{x}) \ge \frac{1}{M_n} \right\} \tag{31}$$
and arrange the elements of $C_n$ as
$$C_n = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{|C_n|}\} \tag{32}$$
according to $P_{\bar{X}^n}(\mathbf{x})$ in ascending order. That is, $P_{\bar{X}^n}(\mathbf{x}_i) \le P_{\bar{X}^n}(\mathbf{x}_j)$ ($1 \le i < j \le |C_n|$) holds. Here, we define $i^* := |C_n|$ and index the elements of $B_n \setminus C_n$ as $\mathbf{x}_{i^*+1}, \mathbf{x}_{i^*+2}, \ldots, \mathbf{x}_{|B_n|}$ arbitrarily.
Then, from the above definition, it holds that
$$P_{\bar{X}^n}(\mathbf{x}_{i^*}) = \max_{\mathbf{x} \in C_n} P_{\bar{X}^n}(\mathbf{x}). \tag{33}$$
Thus, from the assumption (15), for any small $\varepsilon \in (0, \underline{H}(\mathbf{X}))$, it holds that
$$P_{\bar{X}^n}(\mathbf{x}_{i^*}) \ge e^{-n(\underline{H}(\mathbf{X}) - \varepsilon)} \tag{34}$$
for sufficiently large $n$.
Set $k_0 = 0$. For $\mathbf{x}_1$, we determine $k_1$ such that
$$\frac{k_1}{M_n} \le P_{\bar{X}^n}(\mathbf{x}_1), \qquad \frac{k_1 + 1}{M_n} > P_{\bar{X}^n}(\mathbf{x}_1). \tag{35}$$
Secondly, we determine $k_2$ for $\mathbf{x}_2$ such that
$$\frac{k_2 - k_1}{M_n} \le P_{\bar{X}^n}(\mathbf{x}_2), \qquad \frac{k_2 - k_1 + 1}{M_n} > P_{\bar{X}^n}(\mathbf{x}_2). \tag{36}$$
In a similar way, we repeat this operation to choose $k_i$ for $\mathbf{x}_i$ as long as possible. Then, it is not difficult to check that the above procedure does not stop before $i$ reaches $i^*$.
We define a mapping $\phi_n: \mathcal{U}_{M_n} \to \mathcal{X}^n$ as
$$\phi_n(j) = \begin{cases} \mathbf{x}_i & k_{i-1}+1 \le j \le k_i,\ i < i^*, \\ \mathbf{x}_{i^*} & \text{otherwise}, \end{cases} \tag{37}$$
and set $\tilde{X}^n = \phi_n(U_{M_n})$.
We evaluate the performance of the mapping $\phi_n$. From the construction of the mapping, for any $i$ satisfying $1 \le i \le i^*-1$, it holds that
$$P_{\tilde{X}^n}(\mathbf{x}_i) \le P_{\bar{X}^n}(\mathbf{x}_i), \tag{38}$$
$$P_{\bar{X}^n}(\mathbf{x}_i) < P_{\tilde{X}^n}(\mathbf{x}_i) + \frac{1}{M_n}. \tag{39}$$
We next evaluate $P_{\tilde{X}^n}(\mathbf{x}_{i^*})$. From the construction, we have $P_{\bar{X}^n}(\mathbf{x}_{i^*}) \le P_{\tilde{X}^n}(\mathbf{x}_{i^*})$. Since $P_{\tilde{X}^n}(\mathbf{x}_i) = 0$ holds for $\mathbf{x}_i \in B_n \setminus C_n$, we obtain
$$P_{\bar{X}^n}(\mathbf{x}_i) - P_{\tilde{X}^n}(\mathbf{x}_i) = P_{\bar{X}^n}(\mathbf{x}_i) < \frac{1}{M_n} \tag{40}$$
for $\mathbf{x}_i \in B_n \setminus C_n$. Hence, also from the construction of the mapping, we obtain
$$P_{\tilde{X}^n}(\mathbf{x}_{i^*}) - P_{\bar{X}^n}(\mathbf{x}_{i^*}) = 1 - \sum_{i=1}^{i^*-1} P_{\tilde{X}^n}(\mathbf{x}_i) - \left(1 - \sum_{\mathbf{x}_i \in B_n\setminus\{\mathbf{x}_{i^*}\}} P_{\bar{X}^n}(\mathbf{x}_i)\right) = \sum_{\mathbf{x}_i \in B_n\setminus\{\mathbf{x}_{i^*}\}} P_{\bar{X}^n}(\mathbf{x}_i) - \sum_{\mathbf{x}_i \in B_n\setminus\{\mathbf{x}_{i^*}\}} P_{\tilde{X}^n}(\mathbf{x}_i) = \sum_{\mathbf{x}_i \in B_n\setminus\{\mathbf{x}_{i^*}\}} \left(P_{\bar{X}^n}(\mathbf{x}_i) - P_{\tilde{X}^n}(\mathbf{x}_i)\right) \le \frac{|B_n|}{M_n} \le e^{-n\gamma}, \tag{41}$$
where the second equality is due to the fact that $P_{\tilde{X}^n}(\mathbf{x}_i) = 0$ for $\mathbf{x}_i \in B_n \setminus C_n$, the first inequality is due to (39) and (40), and the last inequality is obtained from (26) and (29). Thus, we have
$$P_{\tilde{X}^n}(\mathbf{x}_{i^*}) \le P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}. \tag{42}$$
From the above argument, the f-divergence is evaluated as
$$D_f(X^n \| \phi_n(U_{M_n})) = \sum_{i=1}^{i^*} P_{\tilde{X}^n}(\mathbf{x}_i) f\!\left(\frac{P_{X^n}(\mathbf{x}_i)}{P_{\tilde{X}^n}(\mathbf{x}_i)}\right) = \sum_{i=1}^{i^*} P_{\tilde{X}^n}(\mathbf{x}_i) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_i)\Pr\{X^n\in B_n\}}{P_{\tilde{X}^n}(\mathbf{x}_i)}\right)$$
$$= \sum_{i=1}^{i^*-1} P_{\tilde{X}^n}(\mathbf{x}_i) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_i)\Pr\{X^n\in B_n\}}{P_{\tilde{X}^n}(\mathbf{x}_i)}\right) + P_{\tilde{X}^n}(\mathbf{x}_{i^*}) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\tilde{X}^n}(\mathbf{x}_{i^*})}\right)$$
$$\le \sum_{i=1}^{i^*-1} P_{\tilde{X}^n}(\mathbf{x}_i) f\!\left(\Pr\{X^n\in B_n\}\right) + P_{\tilde{X}^n}(\mathbf{x}_{i^*}) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\tilde{X}^n}(\mathbf{x}_{i^*})}\right), \tag{43}$$
where the first equality is due to condition C2) and the last inequality is due to (38) and condition C1).
The second term on the RHS of (43) is evaluated as follows. From (42) and C1), we have
$$P_{\tilde{X}^n}(\mathbf{x}_{i^*}) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\tilde{X}^n}(\mathbf{x}_{i^*})}\right) \le \left(P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}\right) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}}\right). \tag{44}$$
Here, using the relation
$$P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\} = (1-e^{-n\gamma})\, P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\} + e^{-n\gamma}\, P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}, \tag{45}$$
we obtain
$$\left(P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}\right) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}}\right) \le P_{\bar{X}^n}(\mathbf{x}_{i^*}) f\!\left(\frac{(1-e^{-n\gamma})\, P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\bar{X}^n}(\mathbf{x}_{i^*})}\right) + e^{-n\gamma} f\!\left(\frac{e^{-n\gamma}\, P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{e^{-n\gamma}}\right)$$
$$= P_{\bar{X}^n}(\mathbf{x}_{i^*}) f\!\left((1-e^{-n\gamma})\Pr\{X^n\in B_n\}\right) + e^{-n\gamma} f\!\left(P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}\right)$$
$$\le P_{\bar{X}^n}(\mathbf{x}_{i^*}) f\!\left((1-e^{-n\gamma})\Pr\{X^n\in B_n\}\right) + e^{-n\gamma} f\!\left(e^{-n(\underline{H}(\mathbf{X})-\varepsilon)}\Pr\{X^n\in B_n\}\right) \tag{46}$$
for sufficiently large $n$, where the first inequality is due to (11) and the last inequality is from (34) and condition C1).
Hence, from C3) and the continuity of the function $f$, for any $\nu > 0$ we have
$$\left(P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}\right) f\!\left(\frac{P_{\bar{X}^n}(\mathbf{x}_{i^*})\Pr\{X^n\in B_n\}}{P_{\bar{X}^n}(\mathbf{x}_{i^*}) + e^{-n\gamma}}\right) \le P_{\bar{X}^n}(\mathbf{x}_{i^*}) f\!\left((1-e^{-n\gamma})\Pr\{X^n\in B_n\}\right) + \nu \le P_{\bar{X}^n}(\mathbf{x}_{i^*}) f\!\left(\Pr\{X^n\in B_n\}\right) + 2\nu \tag{47}$$
for sufficiently large $n$. Therefore, noting that $P_{\bar{X}^n}(\mathbf{x}_{i^*}) \le P_{\tilde{X}^n}(\mathbf{x}_{i^*})$, from (28), (43), (44) and (47), it holds that
$$D_f(X^n \| \phi_n(U_{M_n})) \le \sum_{i=1}^{i^*} P_{\tilde{X}^n}(\mathbf{x}_i) f\!\left(\Pr\{X^n\in B_n\}\right) + 2\nu = f\!\left(\Pr\{X^n\in B_n\}\right) + 2\nu \le f\!\left(f^{-1}(D)\right) + 2\nu = D + 2\nu \tag{48}$$
for sufficiently large $n$. This completes the proof of the theorem. □
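The mapping constructed in this proof simply quantizes the conditional distribution over $B_n$ to multiples of $1/M_n$, sending all leftover mass to the most probable sequence $\mathbf{x}_{i^*}$. The following Python sketch is our own illustration of this step (the name `resolvability_map` and the toy numbers are ours, not from the paper):

```python
import math

def resolvability_map(p_bar, M):
    """Quantize the conditioned distribution p_bar into an M-point uniform random number,
    following the proof of Theorem 3. Returns the distribution of phi_n(U_M)."""
    order = sorted(range(len(p_bar)), key=lambda i: p_bar[i])   # ascending probability
    C = [i for i in order if p_bar[i] >= 1.0 / M]               # the set C_n; last is x_{i*}
    counts = {i: 0 for i in range(len(p_bar))}
    used = 0
    for i in C[:-1]:
        k = math.floor(p_bar[i] * M)     # k/M <= p_bar[i] < (k+1)/M, as in (35) and (36)
        counts[i] = k
        used += k
    if C:
        counts[C[-1]] = M - used         # all leftover indices map to x_{i*}
    return [counts[i] / M for i in range(len(p_bar))]

p_bar = [0.05, 0.15, 0.3, 0.5]
print(resolvability_map(p_bar, 16))      # every entry is a multiple of 1/16
```

Sequences outside $C_n$ (here the mass 0.05) receive probability zero, which is exactly the error term controlled by (40)–(42).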
Theorem 4. 
Under conditions C1) and C2), for any mapping $\phi_n$ satisfying
$$D_f(X^n \| \phi_n(U_{M_n})) \le D, \tag{49}$$
it holds that
$$\frac{1}{n}\log M_n \ge \frac{1}{n} H_0(1 - f^{-1}(D) | X^n). \tag{50}$$
Proof. 
It suffices to show that the relation
$$\frac{1}{n}\log M_n < \frac{1}{n} H_0(1 - f^{-1}(D) | X^n) \tag{51}$$
necessarily yields
$$D_f(X^n \| \phi_n(U_{M_n})) > D. \tag{52}$$
We denote $H := H_0(1 - f^{-1}(D) | X^n)$ for short. For any fixed mapping $\phi_n: \mathcal{U}_{M_n} \to \mathcal{X}^n$, we set $\tilde{X}^n := \phi_n(U_{M_n})$ and
$$B_n := \left\{ \mathbf{x} \in \mathcal{X}^n \mid P_{\tilde{X}^n}(\mathbf{x}) > 0 \right\}. \tag{53}$$
Then, from the property of the mapping, it must hold that
$$M_n \ge |B_n|. \tag{54}$$
From condition C2), the f-divergence between $P_{X^n}$ and $P_{\tilde{X}^n}$ is lower bounded by
$$D_f(X^n \| \phi_n(U_{M_n})) = \sum_{\mathbf{x}\in B_n} P_{\tilde{X}^n}(\mathbf{x}) f\!\left(\frac{P_{X^n}(\mathbf{x})}{P_{\tilde{X}^n}(\mathbf{x})}\right) \ge f\!\left(\Pr\{X^n\in B_n\}\right) \ge f\!\left(\max_{\substack{B_n\subseteq\mathcal{X}^n:\\ |B_n|\le M_n}} \Pr\{X^n\in B_n\}\right) \ge f\!\left(\max_{\substack{B_n\subseteq\mathcal{X}^n:\\ \log|B_n| < H}} \Pr\{X^n\in B_n\}\right) > f\!\left(1 - (1 - f^{-1}(D))\right) = D, \tag{55}$$
where the first inequality is due to (11), the second inequality is due to condition C1) and (54), and the third inequality is from (51). The last inequality follows from the alternative expression given in Theorem 1. This completes the proof. □
Theorems 3 and 4 show that the smooth max entropy and the inverse function of f play important roles in the resolvability problem with respect to f-divergences. From these theorems, we obtain the following theorem, which gives the general formula of the optimum resolvability rate. It should be noted that because of the assumption $0 \le D < f(0)$ and C1), we have $0 < f^{-1}(D) \le 1$.
Theorem 5. 
Under conditions C1)–C3), it holds that
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D)+\nu|X^n). \tag{56}$$
Proof. 
We here show the first equality, because the second equality can be derived from the first together with the continuity of the function $f^{-1}$.
(Direct Part:) Fix $\nu > 0$ arbitrarily. From Theorem 3, for any $\gamma > 0$, there exists a mapping $\phi_n$ such that
$$\frac{1}{n}\log M_n \le \frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n) + \gamma, \tag{57}$$
and
$$D_f(X^n \| \phi_n(U_{M_n})) \le D + \nu + \gamma. \tag{58}$$
We here use the diagonal line argument [8]. Fix a sequence $\{\gamma_i\}_{i=1}^{\infty}$ such that $\gamma_1 > \gamma_2 > \cdots \to 0$, and repeat the above argument as $i\to\infty$. Then, we can show that there exists a mapping $\phi_n$ satisfying
$$\limsup_{n\to\infty} D_f(X^n \| \phi_n(U_{M_n})) \le D + \nu, \tag{59}$$
and
$$\limsup_{n\to\infty}\frac{1}{n}\log M_n \le \limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n). \tag{60}$$
Here, also from the diagonal line argument with respect to $\nu$, we obtain
$$\limsup_{n\to\infty}\frac{1}{n}\log M_n \le \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n). \tag{61}$$
This completes the proof of the direct part.
(Converse Part:) We fix $\nu > 0$ arbitrarily. From Theorem 4, any mapping $\phi_n$ satisfying
$$D_f(X^n \| \phi_n(U_{M_n})) \le D + \nu \tag{62}$$
must satisfy
$$\frac{1}{n}\log M_n \ge \frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n). \tag{63}$$
Consequently, we have
$$\limsup_{n\to\infty} D_f(X^n \| \phi_n(U_{M_n})) \le D + \nu \tag{64}$$
and
$$\limsup_{n\to\infty}\frac{1}{n}\log M_n \ge \limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n). \tag{65}$$
We again use the diagonal line argument [8]. We repeat the above argument as $i\to\infty$ for a sequence $\{\nu_i\}_{i=1}^{\infty}$ such that $\nu_1 > \nu_2 > \cdots \to 0$. Then, any mapping $\phi_n$ satisfying
$$\limsup_{n\to\infty} D_f(X^n \| \phi_n(U_{M_n})) \le D \tag{66}$$
must satisfy
$$\limsup_{n\to\infty}\frac{1}{n}\log M_n \ge \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f^{-1}(D+\nu)|X^n). \tag{67}$$
This completes the proof of the converse part. □

4. Intrinsic Randomness Problem

In the previous section, we revealed the general formula for the optimum resolvability rate. In this section, we consider how to approximate the uniform random number $U_{M_n}$ by using the given discrete source $\mathbf{X} = \{X^n\}_{n=1}^{\infty}$ and a mapping $\varphi_n$. Figure 4 is an illustrative figure of the problem (the probability distribution of $U_{M_n}$ is depicted in blue, while that of $\varphi_n(X^n)$ is shown in black). The size $M_n$ of the random number is required to be as large as possible. In the intrinsic randomness problem, one of our main concerns is to derive the largest value of $b$ in the form of $M_n = e^{nb}$ under some approximation measure [7]. This problem setting is formulated as follows.
Definition 5. 
$R$ is said to be Δ-achievable with the given f-divergence if there exists a sequence of mappings $\varphi_n: \mathcal{X}^n \to \mathcal{U}_{M_n}$ such that
$$\limsup_{n\to\infty} D_f(\varphi_n(X^n) \| U_{M_n}) \le \Delta, \tag{68}$$
$$\liminf_{n\to\infty}\frac{1}{n}\log M_n \ge R. \tag{69}$$
In this case, given Δ , if the rate constraint R is sufficiently small, it can be shown that there exists a sequence of mappings that satisfies the constraints. On the other hand, if R is too large, no sequence of mappings that achieves the desired constraints can be found. Consequently, in this setting, the supremum of R is of particular interest.
Definition 6 (First-order optimum intrinsic randomness rate).
$$S_\iota^{(f)}(\Delta|\mathbf{X}) := \sup\left\{ R \mid R \text{ is } \Delta\text{-achievable with the given } f\text{-divergence} \right\}. \tag{70}$$
Remark 8. 
It should be emphasized that we use the f-divergence of the form $D_f(\varphi_n(X^n)\|U_{M_n})$ instead of $D_f(U_{M_n}\|\varphi_n(X^n))$ (cf. Remark 6).
We also assume that $\Delta \in [0, f(0))$ in this section (cf. Remark 7). In order to derive the general formula of the optimum intrinsic randomness rate $S_\iota^{(f)}(\Delta|\mathbf{X})$, we first give two theorems.
Theorem 6. 
Under conditions C1) and C2), for any $\gamma > 0$ and $M_n$ satisfying
$$\frac{1}{n}\log M_n \le \frac{1}{n} H_\infty(1 - f^{-1}(\Delta)|X^n) - \gamma, \tag{71}$$
there exists a mapping $\varphi_n$ such that
$$D_f(\varphi_n(X^n) \| U_{M_n}) \le \Delta + \gamma \tag{72}$$
for sufficiently large n.
Proof. 
We set $\beta_0$ so that
$$-\log\beta_0 = H_\infty(1 - f^{-1}(\Delta)|X^n) \tag{73}$$
for short. From Theorem 2, we notice that
$$1 - f^{-1}(\Delta) \ge \sum_{\mathbf{x}\in\mathcal{X}^n}\left(P_{X^n}(\mathbf{x}) - \beta_0\right)^+ =: 1 - A_n(\Delta), \tag{74}$$
where, if $\beta_0 > 1/|\mathcal{X}^n|$ holds, then $f^{-1}(\Delta) = A_n(\Delta)$. We shall show that for any $M_n$ satisfying
$$\frac{1}{n}\log M_n \le \frac{1}{n}\log\frac{1}{\beta_0} - \frac{1}{n}\log\frac{1}{A_n(\Delta)} - \frac{\gamma}{2} \tag{75}$$
(note that (71) implies (75) for all sufficiently large $n$, since $A_n(\Delta) \ge f^{-1}(\Delta) > 0$), there exists a mapping $\varphi_n$ such that
$$D_f(\varphi_n(X^n) \| U_{M_n}) \le \Delta + \gamma \tag{76}$$
for sufficiently large $n$.
For every sequence $\mathbf{x}\in\mathcal{X}^n$, we define the probability distribution
$$P_{\bar{X}^n}(\mathbf{x}) := \begin{cases} \dfrac{\beta_0}{A_n(\Delta)} & P_{X^n}(\mathbf{x}) \ge \beta_0, \\[2pt] \dfrac{P_{X^n}(\mathbf{x})}{A_n(\Delta)} & P_{X^n}(\mathbf{x}) < \beta_0. \end{cases} \tag{77}$$
Since $0 < A_n(\Delta) \le 1$, this probability distribution is well-defined. Then, from the definition of $A_n(\Delta)$, it holds that
$$\sum_{\mathbf{x}\in\mathcal{X}^n} P_{\bar{X}^n}(\mathbf{x}) = 1. \tag{78}$$
Here, from (75) and the definition of the smooth min entropy, it holds that
$$M_n \le \frac{A_n(\Delta)}{\beta_0}\, e^{-n\gamma/2} \le |\mathcal{X}^n| \tag{79}$$
for sufficiently large $n$.
We next define the mapping $\varphi_n: \mathcal{X}^n \to \mathcal{U}_{M_n}$ by using $P_{\bar{X}^n}$. To do so, we classify the elements of $\mathcal{X}^n$ into $I_n(i)$ ($1\le i\le M_n$) as follows.
  • We choose a set $I_n(1)$ arbitrarily so as to satisfy
    $$\sum_{\mathbf{x}\in I_n(1)} P_{\bar{X}^n}(\mathbf{x}) \le \frac{1}{M_n}, \tag{80}$$
    $$\sum_{\mathbf{x}\in I_n(1)} P_{\bar{X}^n}(\mathbf{x}) + P_{\bar{X}^n}(\mathbf{x}') > \frac{1}{M_n} \tag{81}$$
    for any $\mathbf{x}' \in \mathcal{X}^n \setminus I_n(1)$.
  • Next, we choose a set $I_n(2) \subseteq \mathcal{X}^n \setminus I_n(1)$ satisfying
    $$\sum_{\mathbf{x}\in I_n(2)} P_{\bar{X}^n}(\mathbf{x}) \le \frac{1}{M_n}, \tag{82}$$
    $$\sum_{\mathbf{x}\in I_n(2)} P_{\bar{X}^n}(\mathbf{x}) + P_{\bar{X}^n}(\mathbf{x}') > \frac{1}{M_n} \tag{83}$$
    for any $\mathbf{x}' \in \mathcal{X}^n \setminus \bigcup_{i=1}^{2} I_n(i)$.
We repeat this operation $(M_n-1)$ times so as to choose sets $I_n(i)$ ($1\le i\le M_n-1$). Notice here that since $\frac{1}{M_n} > \frac{\beta_0}{A_n(\Delta)}$ holds, we can indeed repeat this operation $(M_n-1)$ times; in particular, none of the sets $I_n(i)$ ($1\le i\le M_n-1$) is empty. Lastly, we set $I_n(M_n) := \mathcal{X}^n \setminus \bigcup_{i=1}^{M_n-1} I_n(i)$.
From $I_n(i)$ ($1\le i\le M_n$), we define the mapping $\varphi_n: \mathcal{X}^n \to \mathcal{U}_{M_n}$ as
$$\varphi_n(\mathbf{x}) = i \quad \text{for } \mathbf{x} \in I_n(i). \tag{84}$$
Furthermore, we set $\tilde{U}_{M_n} = \varphi_n(X^n)$. Thus,
$$P_{\tilde{U}_{M_n}}(i) = \sum_{\mathbf{x}\in I_n(i)} P_{X^n}(\mathbf{x}) \tag{85}$$
holds for every $i$ with $1\le i\le M_n$.
We next evaluate the above mapping $\varphi_n$. From the construction of the mapping, it holds that
$$\frac{1}{M_n} < \sum_{\mathbf{x}\in I_n(i)} P_{\bar{X}^n}(\mathbf{x}) + \frac{\beta_0}{A_n(\Delta)} \tag{86}$$
for all $i$ ($1\le i\le M_n-1$) and
$$\frac{1}{M_n} \le \sum_{\mathbf{x}\in I_n(M_n)} P_{\bar{X}^n}(\mathbf{x}). \tag{87}$$
Hence, for all $i$ ($1\le i\le M_n$), we have
$$\frac{1}{M_n} - \frac{\beta_0}{A_n(\Delta)} \le \sum_{\mathbf{x}\in I_n(i)} P_{\bar{X}^n}(\mathbf{x}). \tag{88}$$
Here, notice that for all $\mathbf{x}\in\mathcal{X}^n$,
$$P_{\bar{X}^n}(\mathbf{x}) \le \frac{P_{X^n}(\mathbf{x})}{A_n(\Delta)} \tag{89}$$
holds from (77). Thus, we have
$$\frac{P_{\tilde{U}_{M_n}}(i)}{A_n(\Delta)} = \sum_{\mathbf{x}\in I_n(i)}\frac{P_{X^n}(\mathbf{x})}{A_n(\Delta)} \ge \sum_{\mathbf{x}\in I_n(i)} P_{\bar{X}^n}(\mathbf{x}) \ge \frac{1}{M_n} - \frac{\beta_0}{A_n(\Delta)} = \frac{1}{M_n}\left(1 - \frac{M_n\beta_0}{A_n(\Delta)}\right) \ge \frac{1}{M_n}\left(1 - e^{-n\gamma/2}\right) \tag{90}$$
for all $i$ ($1\le i\le M_n$), where the first equality is due to (85), the first inequality is due to (89), the second inequality is due to (88), and the last inequality is due to (79). Hence, we obtain
$$D_f(\varphi_n(X^n)\|U_{M_n}) = \sum_{1\le i\le M_n}\frac{1}{M_n} f\!\left(\frac{P_{\tilde{U}_{M_n}}(i)}{1/M_n}\right) \le \sum_{1\le i\le M_n}\frac{1}{M_n} f\!\left(A_n(\Delta)\left(1 - e^{-n\gamma/2}\right)\right) \le f\!\left(f^{-1}(\Delta)\right) + \delta_n = \Delta + \delta_n, \tag{91}$$
where we can choose some $\delta_n > 0$ such that $\delta_n \to 0$ ($n\to\infty$); the first inequality is due to (90) and C1), and the second inequality is due to the continuity of the function $f$, (74), and C1). This completes the proof of the theorem. □
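The mapping in this proof clips the source distribution at the level $\beta_0$, renormalizes it, and then packs sequences into $M_n$ bins of $\bar{P}$-mass at most $1/M_n$. Below is a minimal Python sketch of ours of this binning (the sequential greedy choice and all names are our own simplification of the sets $I_n(i)$):

```python
def intrinsic_randomness_map(p, beta0, M):
    """Greedy binning in the spirit of the proof of Theorem 6.
    p: source distribution; beta0: clipping level; M: target uniform size.
    Returns the distribution of phi_n(X^n) on {1, ..., M}."""
    A = sum(min(px, beta0) for px in p)          # A_n = 1 - sum_x (p(x) - beta0)^+
    p_bar = [min(px, beta0) / A for px in p]     # clipped and renormalized, as in (77)
    bins, bar_mass, b = [0.0] * M, [0.0] * M, 0
    for px, pbx in zip(p, p_bar):
        if b < M - 1 and bar_mass[b] + pbx > 1.0 / M:
            b += 1                               # current bin is full; open the next one
        bins[b] += px                            # actual probability routed to index b
        bar_mass[b] += pbx
    return bins

p = [0.30, 0.25, 0.15, 0.10, 0.08, 0.06, 0.04, 0.02]
print(intrinsic_randomness_map(p, beta0=0.12, M=4))
```

For such a small $M_n$ the output is only roughly uniform; the proof shows that the deviation vanishes once $1/M_n$ dominates $\beta_0/A_n(\Delta)$.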
Theorem 7. 
Under conditions C1) and C2), for any $\varepsilon > 0$, if the mapping $\varphi_n$ satisfies
$$D_f(\varphi_n(X^n) \| U_{M_n}) \le \Delta - \varepsilon, \tag{92}$$
then it holds that
$$\frac{1}{n}\log M_n \le \frac{1}{n} H_\infty(1 - f^{-1}(\Delta)|X^n) \tag{93}$$
for sufficiently large n.
Proof. 
Setting
$$H := H_\infty(1 - f^{-1}(\Delta)|X^n) \tag{94}$$
for short, we only consider the case where $H < \log|\mathcal{X}^n|$ holds, because $H = \log|\mathcal{X}^n|$ leads to the trivial result. Let $\varepsilon > 0$ be fixed arbitrarily. We show that if
$$\frac{1}{n}\log M_n > \frac{1}{n} H \tag{95}$$
holds for infinitely many $n = n_1, n_2, \ldots$, then for any $\varphi_n$ it holds that
$$D_f(\varphi_n(X^n) \| U_{M_n}) > \Delta - \varepsilon. \tag{96}$$
From (95), there exists a positive constant $\gamma$ satisfying
$$\frac{1}{n}\log M_n - 2\gamma > \frac{1}{n} H. \tag{97}$$
Here, for $\gamma > 0$ satisfying the above inequality, we set $T_n$ as
$$T_n := \left\{ \mathbf{x}\in\mathcal{X}^n \,\middle|\, \frac{1}{n}\log\frac{1}{P_{X^n}(\mathbf{x})} \le \frac{1}{n} H + \gamma \right\} \tag{98}$$
$$\phantom{T_n :} = \left\{ \mathbf{x}\in\mathcal{X}^n \,\middle|\, P_{X^n}(\mathbf{x}) \ge e^{-(H+n\gamma)} \right\}. \tag{99}$$
Then, from the relation
$$1 \ge \sum_{\mathbf{x}\in T_n} P_{X^n}(\mathbf{x}) \ge |T_n|\, e^{-(H+n\gamma)}, \tag{100}$$
we have
$$|T_n| \le e^{H+n\gamma}. \tag{101}$$
Next, we fix $M_n$ and a mapping $\varphi_n$ satisfying (95) and set $\tilde{U}_{M_n} = \varphi_n(X^n)$. Using $\varphi_n$ and $T_n$, we define $I_n$ as
$$I_n := \left\{ i \mid \exists\, \mathbf{x}\in T_n,\ \varphi_n(\mathbf{x}) = i \right\}. \tag{102}$$
Thus, $I_n$ is the set of indices $i$ produced from at least one $\mathbf{x}\in T_n$, and its complement $(I_n)^c$ is the set of indices produced only from $\mathbf{x}\in (T_n)^c$.
Then, from the definition of the mapping and (101), it holds that
$$|I_n| \le |T_n| \le e^{H+n\gamma}. \tag{103}$$
On the other hand, from (97), we have
$$M_n > e^{H+2n\gamma}. \tag{104}$$
This means that
$$\frac{|I_n|}{M_n} \le e^{-n\gamma} \tag{105}$$
holds. Hence, from condition C2), we have
$$\frac{|I_n|}{M_n} f\!\left(\frac{M_n}{|I_n|}\right) \to 0 \quad (n\to\infty). \tag{106}$$
From the above argument, the f-divergence between $\varphi_n(X^n)$ and $U_{M_n}$ is evaluated as
$$D_f(\varphi_n(X^n)\|U_{M_n}) = \sum_{1\le i\le M_n}\frac{1}{M_n} f\!\left(\frac{P_{\tilde{U}_{M_n}}(i)}{1/M_n}\right) = \sum_{\substack{1\le i\le M_n\\ i\in I_n}}\frac{1}{M_n} f\!\left(\frac{P_{\tilde{U}_{M_n}}(i)}{1/M_n}\right) + \sum_{\substack{1\le i\le M_n\\ i\in (I_n)^c}}\frac{1}{M_n} f\!\left(\frac{P_{\tilde{U}_{M_n}}(i)}{1/M_n}\right)$$
$$\ge \frac{|I_n|}{M_n} f\!\left(\frac{\sum_{i\in I_n} P_{\tilde{U}_{M_n}}(i)}{|I_n|/M_n}\right) + \frac{|(I_n)^c|}{M_n} f\!\left(\frac{\sum_{i\in (I_n)^c} P_{\tilde{U}_{M_n}}(i)}{|(I_n)^c|/M_n}\right) \ge \frac{|I_n|}{M_n} f\!\left(\frac{M_n}{|I_n|}\right) + \frac{|(I_n)^c|}{M_n} f\!\left(\frac{\Pr\{X^n\in(T_n)^c\}}{|(I_n)^c|/M_n}\right), \tag{107}$$
where the first inequality is due to (11), and the last inequality is due to $\sum_{i\in I_n} P_{\tilde{U}_{M_n}}(i) \le 1$, the relation
$$\sum_{\substack{1\le i\le M_n\\ i\in(I_n)^c}} P_{\tilde{U}_{M_n}}(i) \le \Pr\{X^n\in(T_n)^c\}, \tag{108}$$
and C1).
We next focus on the evaluation of the second term on the RHS of (107). From the definition of the smooth min entropy $H$ and Theorem 2, for any $\gamma > 0$, it necessarily holds that
$$\sum_{\mathbf{x}\in\mathcal{X}^n}\left(P_{X^n}(\mathbf{x}) - e^{-(H+n\gamma)}\right)^+ > 1 - f^{-1}(\Delta). \tag{109}$$
Thus, from the definition of $T_n$, it holds that
$$\sum_{\mathbf{x}\in T_n}\left(P_{X^n}(\mathbf{x}) - e^{-(H+n\gamma)}\right) = \sum_{\mathbf{x}\in\mathcal{X}^n}\left(P_{X^n}(\mathbf{x}) - e^{-(H+n\gamma)}\right)^+ > 1 - f^{-1}(\Delta). \tag{110}$$
Thus, we obtain
$$\Pr\{X^n\in T_n\} > 1 - f^{-1}(\Delta), \tag{111}$$
from which it holds that
$$\Pr\{X^n\in(T_n)^c\} < 1 - (1 - f^{-1}(\Delta)) = f^{-1}(\Delta). \tag{112}$$
Plugging the above inequality into (107), we obtain
$$D_f(\varphi_n(X^n)\|U_{M_n}) > \frac{|I_n|}{M_n} f\!\left(\frac{M_n}{|I_n|}\right) + \frac{|(I_n)^c|}{M_n} f\!\left(\frac{f^{-1}(\Delta)}{|(I_n)^c|/M_n}\right). \tag{113}$$
Noticing that
$$\frac{|(I_n)^c|}{M_n} \ge 1 - e^{-n\gamma} \tag{114}$$
from (105), for some $\delta_n \to 0$ we have
$$D_f(\varphi_n(X^n)\|U_{M_n}) > (1-e^{-n\gamma}) f\!\left(\frac{f^{-1}(\Delta)}{1-e^{-n\gamma}}\right) - \delta_n = f\!\left(\frac{f^{-1}(\Delta)}{1-e^{-n\gamma}}\right) - e^{-n\gamma} f\!\left(\frac{f^{-1}(\Delta)}{1-e^{-n\gamma}}\right) - \delta_n = f\!\left(f^{-1}(\Delta)(1+\gamma_n)\right) - 2\delta_n \tag{115}$$
for sufficiently large $n$, where we use property (106) and the notation $\gamma_n = \frac{e^{-n\gamma}}{1-e^{-n\gamma}}$. Since $\gamma_n \to 0$ ($n\to\infty$) holds, from the continuity of the function $f$, it holds that
$$D_f(\varphi_n(X^n)\|U_{M_n}) > \Delta - 3\delta_n \ge \Delta - \varepsilon \tag{116}$$
for $n = n_j, n_{j+1}, \ldots$ with some $j \ge 1$. Therefore, we obtain the theorem. □
Theorems 6 and 7 show that the smooth min entropy and the inverse function of f play important roles in the intrinsic randomness problem with respect to f-divergences, while the smooth max entropy is important in the resolvability problem. By using the above two theorems, we obtain the following theorem. It should be noted that because of the assumption $0 \le \Delta < f(0)$ and C1), we have $0 < f^{-1}(\Delta) \le 1$.
Theorem 8. 
Under conditions C1) and C2), it holds that
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta)+\nu|X^n). \tag{117}$$
Proof. 
(Direct Part:) Fix $\nu > 0$ arbitrarily. From Theorem 6, for any $\gamma > 0$ and $M_n$ such that
$$\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n) - 2\gamma \le \frac{1}{n}\log M_n \le \frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n) - \gamma \tag{118}$$
holds, there exists a mapping $\varphi_n$ satisfying
$$D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta + \nu + \gamma \tag{119}$$
for sufficiently large $n$. Since $\gamma > 0$ is arbitrary, we obtain
$$\liminf_{n\to\infty}\frac{1}{n}\log M_n \ge \liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n). \tag{120}$$
Here, we fix a sequence $\{\nu_i\}_{i=1}^{\infty}$ such that $\nu_1 > \nu_2 > \cdots \to 0$, and we repeat the above argument as $i\to\infty$. Then, we can show that there exists a mapping $\varphi_n$ satisfying
$$\liminf_{n\to\infty}\frac{1}{n}\log M_n \ge \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n), \tag{121}$$
and
$$\limsup_{n\to\infty} D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta. \tag{122}$$
This completes the proof of the direct part of the theorem.
(Converse Part:) Fix $\nu > 0$ arbitrarily. From Theorem 7, any mapping $\varphi_n$ satisfying
$$D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta + \nu \tag{123}$$
must satisfy
$$\frac{1}{n}\log M_n \le \frac{1}{n} H_\infty(1 - f^{-1}(\Delta+2\nu)|X^n) \tag{124}$$
for sufficiently large $n$. Thus, for any $\nu > 0$, we obtain
$$\limsup_{n\to\infty} D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta + \nu, \tag{125}$$
and
$$\liminf_{n\to\infty}\frac{1}{n}\log M_n \le \liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+2\nu)|X^n). \tag{126}$$
Noting that $\nu > 0$ is arbitrary, we fix a sequence $\{\nu_i\}_{i=1}^{\infty}$ such that $\nu_1 > \nu_2 > \cdots \to 0$, and we repeat the above argument as $i\to\infty$. Then, for any mapping $\varphi_n$ satisfying
$$\limsup_{n\to\infty} D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta, \tag{127}$$
it holds that
$$\liminf_{n\to\infty}\frac{1}{n}\log M_n \le \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f^{-1}(\Delta+\nu)|X^n). \tag{128}$$
This completes the proof of the converse part of the theorem. □

5. Relaxation of Conditions C1) and C2)

Thus far, we have derived the general formulas for the optimum resolvability rate under conditions C1)–C3) and for the optimum intrinsic randomness rate under conditions C1) and C2). In this section, we relax conditions C1) and C2) to extend the class of f-divergences for which we can characterize these optimum rates. Hereafter, we do not consider linear functions $f(t) = a(t-1)$ with some constant $a$, because they always give the trivial case where $D_f(Z\|\bar{Z}) = 0$.
We consider the following condition, which is a relaxation of C2):
C2’)
The function f satisfies
$$\lim_{u\to\infty}\frac{f(u)}{u} < +\infty. \tag{129}$$
For a function f satisfying condition C2'), we denote the LHS of (129) by
$$c_f = \lim_{u\to\infty}\frac{f(u)}{u}. \tag{130}$$
We give some examples of functions $f(t)$ that satisfy C2') but not C2):
  • $f(t) = |t-1|$: the f-divergence is the variational distance, and $c_f = 1$.
  • $f(t) = (1-\sqrt{t})^2$: the f-divergence is the squared Hellinger distance, and $c_f = 1$.
  • $f(t) = \frac{t^\alpha - \alpha t - (1-\alpha)}{\alpha(\alpha-1)}$ ($0 < \alpha < 1$): the f-divergence is the $\alpha$-divergence, and $c_f = \frac{1}{1-\alpha}$.
For a function $f(t)$ satisfying condition C2'), we consider its modified function
$$f_0(t) := f(t) + c_f(1-t), \tag{131}$$
which is offset by $c_f(1-t)$. This function is called the offset function of f. It should be noted that under condition C2), which is a special case of C2'), it holds that $c_f = 0$ and thus $f_0(t) = f(t)$ for all $t \ge 0$. We have the following lemma:
Lemma 1. 
Assume that the function f ( t ) satisfies condition C2’). Then,
(i) 
the offset function $f_0$ satisfies conditions C1) and C2),
(ii) 
for any pair of probability distributions $P_Z$ and $P_{\bar{Z}}$ on the same alphabet $\mathcal{Z}$, it holds that
$$D_f(Z\|\bar{Z}) = D_{f_0}(Z\|\bar{Z}). \tag{132}$$
Proof. 
It is easily verified that $f_0$ is a convex function with $f_0(1) = 0$, and claim (ii) is well known. So, here we show claim (i). By definition, it holds that
$$\lim_{u\to\infty}\frac{f_0(u)}{u} = \lim_{u\to\infty}\left(\frac{f(u)}{u} + \frac{c_f(1-u)}{u}\right) = \lim_{u\to\infty}\frac{f(u)}{u} - c_f = 0, \tag{133}$$
which indicates that $f_0$ satisfies condition C2).
To show that C1) holds, we use the left-derivative of $f_0$ at $t > 0$, denoted by
$$f_0'(t) = \lim_{h\uparrow 0}\frac{f_0(t+h) - f_0(t)}{h} \tag{134}$$
(cf. [22]). Contrary to ordinary derivatives, the left-derivative at $t > 0$ always exists for the convex function $f_0$, which is continuous. To show that $f_0$ satisfies condition C1), it suffices to show that $f_0'(t) \le 0$ for all $t > 0$. Using the left-derivative $f_0'(t)$, a tangent line at $t > 0$ can be expressed as $f_0'(t)\cdot t + b$ with some $b$, where $f_0'(t)$ and $b$ correspond to the slope and the intercept of this tangent line, respectively. We call this tangent line the left-tangent line at $t$. Fixing $t' > 0$ arbitrarily, let $a := f_0'(t')$ and let $b$ be the intercept of the left-tangent line at $t'$. The convexity of $f_0$ implies that
$$f_0(t) \ge at + b \qquad (\forall t \ge 0). \tag{135}$$
Then, it follows from (133) that
$$0 = \lim_{u\to\infty}\frac{f_0(u)}{u} \ge \lim_{u\to\infty}\frac{au+b}{u} = a = f_0'(t'). \tag{136}$$
Since $t' > 0$ is arbitrary, this inequality implies that $f_0(t)$ is decreasing for $t > 0$ with $f_0(0) > 0$, completing the proof of the lemma. □
Lemma 1 indicates that if the original function f satisfies condition C2’), then its offset function f 0 satisfies conditions C1) and C2) without changing the value of f-divergence. Because condition C2) is a special instance of condition C2’) with c f = 0 , claim (i) of Lemma 1 implies that condition C1) is superfluous for functions satisfying C2) (cf. Remark 5). The following proposition is immediately obtained by claim (ii) of Lemma 1:
Proposition 1. 
Assume that the function f ( t ) satisfies condition C2’). Then, we have
$$S_r^{(f)}(D|\mathbf{X}) = S_r^{(f_0)}(D|\mathbf{X}), \tag{137}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = S_\iota^{(f_0)}(\Delta|\mathbf{X}). \tag{138}$$
It is easily verified that if f satisfies condition C3) as well as C2'), then so does $f_0$. From this fact, Lemma 1, and Proposition 1, we have the following generalization of Theorem 5.
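Claim (ii) of Lemma 1 is easy to check numerically. The following sketch (ours, not from the paper) compares $D_f$ and $D_{f_0}$ for the variational distance, where $c_f = 1$ and condition C2) fails for $f$ itself:

```python
def f_div(p, q, f):
    # D_f(P||Q) for strictly positive distributions
    return sum(qz * f(pz / qz) for pz, qz in zip(p, q))

f  = lambda t: abs(t - 1)             # variational distance, c_f = 1
f0 = lambda t: abs(t - 1) + (1 - t)   # offset function f_0(t) = f(t) + c_f (1 - t)

P, Q = [0.6, 0.3, 0.1], [0.2, 0.5, 0.3]
print(f_div(P, Q, f), f_div(P, Q, f0))   # both print 0.8 = sum_z |P(z) - Q(z)|
```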
Theorem 9. 
Under conditions C2’) and C3), it holds that
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f_0^{-1}(D+\nu)|X^n) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - f_0^{-1}(D)+\nu|X^n). \tag{139}$$
For the optimum intrinsic randomness rate, we also have the generalized result of Theorem 8.
Theorem 10. 
Under condition C2’), it holds that
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f_0^{-1}(\Delta+\nu)|X^n) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - f_0^{-1}(\Delta)+\nu|X^n). \tag{140}$$

6. Particularization to Several Distance Measures

In the previous sections, we derived the general formulas of the first-order optimum resolvability and intrinsic randomness rates with respect to f-divergences, in which the smooth Rényi entropy and the inverse function of f play important roles. In this section, we first focus on several specific functions f satisfying conditions C1)–C3) and compute these rates by using Theorems 5 and 8. In addition, we consider functions f satisfying C2') and C3) and compute the rates by using Theorems 9 and 10.
It will turn out that it is easy to derive the optimum achievable rates for specific approximation measures. We use the notation
$$D_f(X^n\|\tilde{X}^n) := D_f(X^n\|\phi_n(U_{M_n})), \qquad D_f(\tilde{U}_{M_n}\|U_{M_n}) := D_f(\varphi_n(X^n)\|U_{M_n})$$
for convenience.
Remark 9. 
Since the function $f(t) = t\log t$ (which yields the KL divergence) does not satisfy C1) and C2), we cannot apply Theorems 5 and 8 to the case of the KL divergence:
$$D_f(X^n\|\tilde{X}^n) = D(X^n\|\tilde{X}^n) = \sum_{\mathbf{x}\in\mathcal{X}^n} P_{X^n}(\mathbf{x})\log\frac{P_{X^n}(\mathbf{x})}{P_{\tilde{X}^n}(\mathbf{x})}, \tag{141}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = D(\tilde{U}_{M_n}\|U_{M_n}) = \sum_{1\le i\le M_n} P_{\tilde{U}_{M_n}}(i)\log\frac{P_{\tilde{U}_{M_n}}(i)}{P_{U_{M_n}}(i)}. \tag{142}$$
The resolvability problem with respect to the KL divergence in this direction has not been considered yet. On the other hand, in the intrinsic randomness problem, Hayashi [9] (Theorem 7) has studied the problem with respect to the normalized KL divergence $\frac{1}{n}D(\tilde{U}_{M_n}\|U_{M_n})$ as well as $D(U_{M_n}\|\tilde{U}_{M_n})$.

6.1. Half Variational Distance

We first consider the case of $f(t) = (1-t)^+$, which yields
$$D_f(X^n\|\tilde{X}^n) = \frac{1}{2}\sum_{\mathbf{x}\in\mathcal{X}^n}|P_{X^n}(\mathbf{x}) - P_{\tilde{X}^n}(\mathbf{x})|, \tag{143}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = \frac{1}{2}\sum_{1\le i\le M_n}|P_{\tilde{U}_{M_n}}(i) - P_{U_{M_n}}(i)|. \tag{144}$$
In this special case, we obtain the following corollary:
Corollary 1. 
For $f(t) = (1-t)^+$, it holds that
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(D+\nu|X^n), \tag{145}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(\Delta+\nu|X^n). \tag{146}$$
Proof. 
In the case of $f(t) = (1-t)^+$, the inverse function becomes $f^{-1}(D) = 1 - D$, because $0 \le D < 1$ holds. Hence, from Theorems 5 and 8, we obtain the corollary. □
The former result in the above corollary coincides with the result given by Uyematsu [6] (Theorem 6), while the latter coincides with the result given by Uyematsu and Kunimatsu [10] (Theorem 6). It is important to note that $S_r^{(f)}(D|\mathbf{X})$ has also been addressed by Steinberg and Verdú [2] and Han [8] (Theorem 2.4.1), and $S_\iota^{(f)}(\Delta|\mathbf{X})$ by Vembu and Verdú [7] (Theorem 1), Han [8] (Theorem 2.4.2), and Hayashi [9] (Theorem 2), using different information-theoretic approaches. In particular, Hayashi [9] (Theorem 2) has considered various achievable rates concerning the intrinsic randomness problem with respect to the variational distance, but these are not included in our current analysis. Our work provides an alternative derivation and contextualizes these results within our framework of f-divergences.

6.2. Reverse Kullback–Leibler Divergence

Secondly, we consider the case of $f(t) = -\log t$, which yields
$$D_f(X^n\|\tilde{X}^n) = D(\phi_n(U_{M_n})\|X^n) = \sum_{\mathbf{x}\in\mathcal{X}^n} P_{\tilde{X}^n}(\mathbf{x})\log\frac{P_{\tilde{X}^n}(\mathbf{x})}{P_{X^n}(\mathbf{x})}, \tag{147}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = D(U_{M_n}\|\varphi_n(X^n)) = \sum_{1\le i\le M_n} P_{U_{M_n}}(i)\log\frac{P_{U_{M_n}}(i)}{P_{\tilde{U}_{M_n}}(i)}. \tag{148}$$
In this case, we obtain the following corollary:
Corollary 2. 
For $f(t) = -\log t$, it holds that
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(1 - e^{-(D+\nu)}|X^n), \tag{149}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(1 - e^{-(\Delta+\nu)}|X^n). \tag{150}$$
Proof. 
The inverse function is immediately given by $f^{-1}(D) = e^{-D}$. Hence, from Theorems 5 and 8, we obtain the corollary. □
It is important to note that $S_\iota^{(f)}(\Delta|\mathbf{X})$ has been previously addressed by Hayashi [9] (Theorem 7) using different information-theoretic approaches. In particular, Vembu and Verdú [7] (Theorem 1) and Hayashi [9] (Theorem 7) have also considered the intrinsic randomness problem with respect to the normalized KL divergence, which is not included in our current analysis.

6.3. Hellinger Distance

We consider the case of $f(t) = 1-\sqrt{t}$, which yields
$$D_f(X^n\|\tilde{X}^n) = 1 - \sum_{\mathbf{x}\in\mathcal{X}^n}\sqrt{P_{X^n}(\mathbf{x})\, P_{\tilde{X}^n}(\mathbf{x})}, \tag{151}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = 1 - \sum_{1\le i\le M_n}\sqrt{P_{\tilde{U}_{M_n}}(i)\, P_{U_{M_n}}(i)}. \tag{152}$$
In this case, we obtain the following corollary:
Corollary 3. 
For $f(t) = 1-\sqrt{t}$, it holds that
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(2D - D^2 + \nu|X^n), \tag{153}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(2\Delta - \Delta^2 + \nu|X^n). \tag{154}$$
Proof. 
The inverse function of $f(t) = 1-\sqrt{t}$ is given by $f^{-1}(D) = (1-D)^2$. Hence, from Theorems 5 and 8, we obtain the corollary. Notice here that since both $D$ and $\Delta$ are smaller than one, $2D - D^2$ as well as $2\Delta - \Delta^2$ are positive. □
It is worth noting that Kumagai and Hayashi [13] have analyzed this quantity for the case of i.i.d. sources. Importantly, they addressed this quantity as part of a broader problem: the random number conversion problem. On the other hand, our approach differs in that we derive this quantity from results based on f-divergences.

6.4. $E_\gamma$-Divergence

We consider the case of $f(t) = (\gamma-t)^+ + 1 - \gamma$, which yields
$$D_f(X^n\|\tilde{X}^n) = \sum_{\mathbf{x}\in\mathcal{X}^n: P_{X^n}(\mathbf{x}) > \gamma P_{\tilde{X}^n}(\mathbf{x})}\left(P_{X^n}(\mathbf{x}) - \gamma P_{\tilde{X}^n}(\mathbf{x})\right), \tag{155}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = \sum_{1\le i\le M_n: P_{\tilde{U}_{M_n}}(i) > \gamma P_{U_{M_n}}(i)}\left(P_{\tilde{U}_{M_n}}(i) - \gamma P_{U_{M_n}}(i)\right). \tag{156}$$
In this case, we obtain the corollary:
Corollary 4. 
For $f(t) = (\gamma-t)^+ + 1 - \gamma$, we have
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0(D+\nu|X^n), \tag{157}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty(\Delta+\nu|X^n). \tag{158}$$
Proof. 
Noting that $\gamma \ge 1$, we have $f(t) = 1-t$ for $0 \le t \le \gamma$, and hence $f^{-1}(D) = 1-D$. Hence, the corollary holds. □
Remark 10. 
The above corollary shows that both optimum achievable rates with respect to the $E_\gamma$-divergence do not depend on γ; they coincide with the optimum achievable rates with respect to the half variational distance (cf. Corollary 1).

6.5. Variational Distance

We next consider functions f satisfying C2') and C3). Firstly, the function $f(t) = |1-t|$ is considered:
$$D_f(X^n\|\tilde{X}^n) = \sum_{\mathbf{x}\in\mathcal{X}^n}|P_{X^n}(\mathbf{x}) - P_{\tilde{X}^n}(\mathbf{x})|, \tag{159}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = \sum_{1\le i\le M_n}|P_{\tilde{U}_{M_n}}(i) - P_{U_{M_n}}(i)|. \tag{160}$$
As we have already mentioned in the previous section, $f(t) = |1-t|$ does not satisfy C1). However, it satisfies C2') and C3). Hence, from Theorems 9 and 10, we obtain the following corollary:
Corollary 5. 
For $f(t) = |1-t|$, we have
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0\!\left(\frac{D}{2}+\nu \,\middle|\, X^n\right), \tag{161}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty\!\left(\frac{\Delta}{2}+\nu \,\middle|\, X^n\right). \tag{162}$$
Proof. 
Noticing that $c_f = 1$, we have $f_0(t) = |1-t| + (1-t)$, from which we obtain
$$f_0^{-1}(D) = 1 - \frac{D}{2}. \tag{163}$$
Therefore, we obtain the corollary from Theorems 9 and 10. □

6.6. Squared Hellinger Distance

We consider the function $f(t) = (1-\sqrt{t})^2$, which also satisfies C2') and C3). It yields
$$D_f(X^n\|\tilde{X}^n) = \sum_{\mathbf{x}\in\mathcal{X}^n}\left(\sqrt{P_{X^n}(\mathbf{x})} - \sqrt{P_{\tilde{X}^n}(\mathbf{x})}\right)^2, \tag{164}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = \sum_{1\le i\le M_n}\left(\sqrt{P_{\tilde{U}_{M_n}}(i)} - \sqrt{P_{U_{M_n}}(i)}\right)^2. \tag{165}$$
In this case, we also apply Theorems 9 and 10.
Corollary 6. 
For $f(t) = (1-\sqrt{t})^2$, we have
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0\!\left(D - \frac{D^2}{4}+\nu \,\middle|\, X^n\right), \tag{166}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty\!\left(\Delta - \frac{\Delta^2}{4}+\nu \,\middle|\, X^n\right). \tag{167}$$
Proof. 
Noticing that $c_f = 1$, we have $f_0(t) = 2(1-\sqrt{t})$, from which we obtain
$$f_0^{-1}(D) = \left(1 - \frac{D}{2}\right)^2. \tag{168}$$
Hence, we obtain the corollary. □
Remark 11. 
The variational distance is twice the half variational distance. Consequently, the results of Corollary 5 can be trivially derived from those of Corollary 1. However, we emphasize that $f(t) = |1-t|$ does not satisfy conditions C1) and C2). Therefore, to derive the results of Corollary 5, it is necessary to apply the discussion from Section 5, specifically the examination using $f_0(t)$. This underscores the importance of our theoretical framework in handling cases where the function f does not meet conditions C1) and C2). A similar relationship exists between Corollary 3 and Corollary 6.

6.7. α-Divergence

We consider the function $f(t) = \frac{t^\alpha - \alpha t - (1-\alpha)}{\alpha(\alpha-1)}$ ($0 < \alpha < 1$), which also satisfies C2') and C3). The α-divergence in our setting is given by
$$D_f(X^n\|\tilde{X}^n) = \frac{1}{\alpha(1-\alpha)}\left(1 - \sum_{\mathbf{x}\in\mathcal{X}^n} P_{X^n}(\mathbf{x})^\alpha\, P_{\tilde{X}^n}(\mathbf{x})^{1-\alpha}\right), \tag{169}$$
$$D_f(\tilde{U}_{M_n}\|U_{M_n}) = \frac{1}{\alpha(1-\alpha)}\left(1 - \sum_{1\le i\le M_n} P_{\tilde{U}_{M_n}}(i)^\alpha\, P_{U_{M_n}}(i)^{1-\alpha}\right). \tag{170}$$
In this case, we obtain the following corollary using Theorems 9 and 10.
Corollary 7. 
For $f(t) = \frac{t^\alpha - \alpha t - (1-\alpha)}{\alpha(\alpha-1)}$, we have
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0\!\left(1 - \left(D\alpha(\alpha-1)+1\right)^{1/\alpha}+\nu \,\middle|\, X^n\right), \tag{171}$$
$$S_\iota^{(f)}(\Delta|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{1}{n} H_\infty\!\left(1 - \left(\Delta\alpha(\alpha-1)+1\right)^{1/\alpha}+\nu \,\middle|\, X^n\right). \tag{172}$$
Proof. 
Noticing that $c_f = 1/(1-\alpha)$, we obtain
$$f_0(t) = \frac{t^\alpha - 1}{\alpha(\alpha-1)}, \qquad f_0^{-1}(D) = \left(D\alpha(\alpha-1)+1\right)^{1/\alpha}. \tag{173}$$
Hence, we obtain the corollary. □
Let us consider the case of $\alpha = 1/2$. In this case, the inverse function can be simply expressed as
$$f_0^{-1}(D) = \left(1 - \frac{D}{4}\right)^2. \tag{174}$$
Hence, we have
$$S_r^{(f)}(D|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{1}{n} H_0\!\left(\frac{D}{2} - \frac{D^2}{16}+\nu \,\middle|\, X^n\right). \tag{175}$$
It is known that the α-divergence with $\alpha = 1/2$ is related to the squared Hellinger distance. In actuality, the optimum resolvability rate $S_r^{(f)}(D|\mathbf{X})$ with respect to the squared Hellinger distance is identical to $S_r^{(f)}(2D|\mathbf{X})$ with respect to the α-divergence with $\alpha = 1/2$.
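Since all of the corollaries in this section reduce to evaluating $1 - f_0^{-1}(\cdot)$, a generic numerical inversion is a convenient cross-check. The sketch below (ours; the bisection helper is hypothetical, not from the paper) confirms two of the closed forms above:

```python
def inverse_decreasing(f, y, lo=0.0, hi=1.0, tol=1e-12):
    """Solve f(t) = y for t in [lo, hi] by bisection, assuming f is decreasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > y:
            lo = mid
        else:
            hi = mid
    return hi

f0_vd    = lambda t: 2 * max(1 - t, 0.0)                  # offset variational distance
f0_alpha = lambda t: (t ** 0.5 - 1) / (0.5 * (0.5 - 1))   # alpha-divergence, alpha = 1/2

D = 0.3
print(inverse_decreasing(f0_vd, D),    1 - D / 2)          # ~0.85 both ways
print(inverse_decreasing(f0_alpha, D), (1 - D / 4) ** 2)   # ~0.855625 both ways
```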

7. Second-Order Optimum Achievable Rate

Thus far, we have considered the first-order optimum resolvability rate as well as the first-order optimum intrinsic randomness rate. The second-order rate, which enables us to make a finer evaluation of achievable rates, has already been investigated in several information-theoretic problems [5,9,23,24,25,26,27,28,29,30]. In this section, following these results, we consider the second-order optimum achievable rates in the two random number generation problems with respect to f-divergences.
It is important to acknowledge that the second-order analysis for information-theoretic problems was initiated by Hayashi [9]. Building upon these works, Kumagai and Hayashi [13] conducted a second-order analysis for the broader random number conversion problem. They focused on i.i.d. sources and provided a more detailed analysis in this context. On the other hand, our results apply to more general sources, including but not limited to i.i.d. sources.

7.1. General Formula

We first define the second-order achievability in the resolvability problem.
Definition 7. 
$L$ is said to be $(D,R)$-achievable with the given f-divergence if there exists a sequence of mappings $\phi_n: \mathcal{U}_{M_n} \to \mathcal{X}^n$ such that
$$\limsup_{n\to\infty} D_f(X^n\|\phi_n(U_{M_n})) \le D, \tag{176}$$
$$\limsup_{n\to\infty}\frac{1}{\sqrt{n}}\log\frac{M_n}{e^{nR}} \le L. \tag{177}$$
Definition 8 (Second-order optimum resolvability rate).
$$S_r^{(f)}(D,R|\mathbf{X}) := \inf\left\{ L \mid L \text{ is } (D,R)\text{-achievable with the given } f\text{-divergence}\right\}. \tag{178}$$
In order to analyze the above quantity, we use the following condition instead of C3):
C3') 
For any pair of positive real numbers $(a, b)$, it holds that
$$\lim_{n\to\infty} f\!\left(e^{-nb}\right) e^{-\sqrt{n}\,a} = 0. \tag{179}$$
Here, the functions $f(t) = -\log t$, $f(t) = 1-\sqrt{t}$, $f(t) = (1-t)^+$, and $f(t) = (\gamma-t)^+ + (1-\gamma)$ satisfy condition C3'). Then, the following theorem holds:
Theorem 11 (Second-order optimum resolvability rate).
Under conditions C2') and C3'), it holds that
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(1 - f_0^{-1}(D+\nu)|X^n) - nR}{\sqrt{n}}, \tag{180}$$
where $f_0$ is the offset function of f, defined in (131).
In particular, under conditions C2) and C3'), it holds that
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(1 - f^{-1}(D+\nu)|X^n) - nR}{\sqrt{n}}. \tag{181}$$
Proof. 
Noticing that Lemma 1 indicates that the offset function $f_0$ satisfies conditions C1), C2), and C3'), the proof of (180) proceeds in parallel with the proofs of Theorems 3–5, in which $f$, $\frac{1}{n}$, and $e^{n\gamma}$ are replaced by $f_0$, $\frac{1}{\sqrt{n}}$, and $e^{\sqrt{n}\gamma}$, respectively. Equation (181) is a special case of (180) with $f_0 = f$. □
We next consider the case of the intrinsic randomness problem.
Definition 9. 
$L$ is said to be $(\Delta,R)$-achievable with the given f-divergence if there exists a sequence of mappings $\varphi_n: \mathcal{X}^n \to \mathcal{U}_{M_n}$ such that
$$\limsup_{n\to\infty} D_f(\varphi_n(X^n)\|U_{M_n}) \le \Delta, \tag{182}$$
$$\liminf_{n\to\infty}\frac{1}{\sqrt{n}}\log\frac{M_n}{e^{nR}} \ge L. \tag{183}$$
Definition 10 (Second-order optimum intrinsic randomness rate).
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) := \sup\left\{ L \mid L \text{ is } (\Delta,R)\text{-achievable with the given } f\text{-divergence}\right\}. \tag{184}$$
Then, we have the theorem:
Theorem 12 
(Second-order optimum intrinsic randomness rate). Under condition C2’), it holds that
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(1 - f_0^{-1}(\Delta+\nu)|X^n) - nR}{\sqrt{n}}, \tag{185}$$
where $f_0$ is the offset function of f.
In particular, under condition C2), it holds that
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(1 - f^{-1}(\Delta+\nu)|X^n) - nR}{\sqrt{n}}. \tag{186}$$
Proof. 
The proof of (185) proceeds in parallel with the proofs of Theorems 6–8, in which $f$ and $\frac{1}{n}$ are replaced by $f_0$ and $\frac{1}{\sqrt{n}}$, respectively. Equation (186) is a special case of (185) with $f_0 = f$. □
Theorems 11 and 12 show that, in both the resolvability and the intrinsic randomness problems, the smooth Rényi entropy and the inverse function of f also play essential roles in expressing the second-order optimum achievable rates.
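For intuition, the quantity appearing in (181) can be evaluated by brute force for a small i.i.d. source, using the exact smooth max entropy over all $2^n$ sequence probabilities (our own sketch, exponential in $n$ and only meant for tiny block lengths):

```python
import math
from itertools import product

def smooth_max_entropy(probs, delta):
    probs = sorted(probs, reverse=True)
    mass, k = 0.0, 0
    while mass < 1 - delta:
        mass += probs[k]
        k += 1
    return math.log(k)

p, D = 0.11, 0.1
R = -(p * math.log(p) + (1 - p) * math.log(1 - p))   # first-order rate of the source
for n in (4, 8, 12, 16):
    pn = [math.prod(t) for t in product([p, 1 - p], repeat=n)]
    H0 = smooth_max_entropy(pn, D)                   # f(t) = (1-t)^+, so delta = D
    print(n, H0 / n, (H0 - n * R) / math.sqrt(n))    # first- and second-order terms
```

The printed values are the finite-$n$ sequences whose limits appear in (145) and, with this choice of $R$, in (187).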

7.2. Particularizations to Several Distance Measures

Analogously to Section 6, we compute $S_r^{(f)}(D,R|\mathbf{X})$ and $S_\iota^{(f)}(\Delta,R|\mathbf{X})$ for specific functions f satisfying C1), C2), and C3'), by using Theorems 11 and 12. We obtain the following corollary:
Corollary 8. 
For $f(t) = (1-t)^+$, it holds that
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(D+\nu|X^n) - nR}{\sqrt{n}}, \tag{187}$$
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(\Delta+\nu|X^n) - nR}{\sqrt{n}}. \tag{188}$$
For $f(t) = -\log t$, it holds that
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(1 - e^{-(D+\nu)}|X^n) - nR}{\sqrt{n}}, \tag{189}$$
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(1 - e^{-(\Delta+\nu)}|X^n) - nR}{\sqrt{n}}. \tag{190}$$
For $f(t) = 1-\sqrt{t}$, it holds that
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(2D - D^2+\nu|X^n) - nR}{\sqrt{n}}, \tag{191}$$
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(2\Delta - \Delta^2+\nu|X^n) - nR}{\sqrt{n}}. \tag{192}$$
For $f(t) = (\gamma-t)^+ + 1 - \gamma$, we have
$$S_r^{(f)}(D,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\limsup_{n\to\infty}\frac{H_0(D+\nu|X^n) - nR}{\sqrt{n}}, \tag{193}$$
$$S_\iota^{(f)}(\Delta,R|\mathbf{X}) = \lim_{\nu\downarrow 0}\liminf_{n\to\infty}\frac{H_\infty(\Delta+\nu|X^n) - nR}{\sqrt{n}}. \tag{194}$$
Proof. 
The proof is similar to proofs of Corollaries 1–4. □
The optimum achievable rates with respect to the variational distance in terms of the smooth Rényi entropy have already been derived: relation (187) coincides with the result given by Tagashira and Uyematsu [31], and relation (188) with the result given by Namekawa and Uyematsu [32]. As in the case of the first-order achievability, the second-order optimum rates for the half variational distance ($f(t) = (1-t)^+$) and the $E_\gamma$-divergence ($f(t) = (\gamma-t)^+ + 1 - \gamma$) are the same, regardless of the value of $\gamma \ge 1$.
It is important to note that $S_\iota^{(f)}(\Delta,R|\mathbf{X})$ in the case of the variational distance and of the reverse KL divergence has been addressed by Hayashi [9] (Theorem 3) and [9] (Theorem 9), respectively, using different information-theoretic approaches. Furthermore, $S_\iota^{(f)}(\Delta,R|\mathbf{X})$ in the case of the Hellinger distance for i.i.d. sources was studied by Kumagai and Hayashi [13] in the broader setting of the random number conversion problem. Their work focused specifically on i.i.d. sources, while our results extend to more general source models.

8. Optimistic Optimum Achievable Rates

8.1. Source Resolvability

In the previous sections, we have treated general formulas of the first- and second-order optimum rates. In this section, we consider optimum achievable rates in the optimistic sense. The notion of optimistic optimum rates was first introduced by Vembu, Verdú and Steinberg [33] in the source-channel coding problem. Several researchers have developed the optimistic coding scenario in other information-theoretic problems [4,9,34,35]. In this subsection, we clarify the optimistic optimum resolvability rate with respect to f-divergences using the smooth Rényi entropy.
Definition 11. 
$R$ is said to be optimistically $D$-achievable with the given $f$-divergence if there exists a sequence of mappings $\phi_n : \mathcal{U}_{M_n} \to \mathcal{X}^n$ satisfying
$$\limsup_{n \to \infty} D_f\bigl(X^n \,\big\|\, \phi_n(U_{M_n})\bigr) \le D, \qquad \liminf_{n \to \infty} \frac{1}{n} \log M_n \le R. \tag{195}$$
Definition 12 (Optimistic first-order optimum resolvability rate).
$$T_r^{(f)}(D \,|\, \mathbf{X}) := \inf\bigl\{ R \,\big|\, R \text{ is optimistically } D\text{-achievable with the given } f\text{-divergence} \bigr\}.$$
We similarly define the second-order achievability in the optimistic scenario.
Definition 13. 
$L$ is said to be optimistically $(D, R)$-achievable with the given $f$-divergence if there exists a sequence of mappings $\phi_n : \mathcal{U}_{M_n} \to \mathcal{X}^n$ satisfying
$$\limsup_{n \to \infty} D_f\bigl(X^n \,\big\|\, \phi_n(U_{M_n})\bigr) \le D, \qquad \liminf_{n \to \infty} \frac{1}{\sqrt{n}} \log \frac{M_n}{e^{nR}} \le L.$$
Definition 14 (Optimistic second-order optimum resolvability rate).
$$T_r^{(f)}(D, R \,|\, \mathbf{X}) := \inf\bigl\{ L \,\big|\, L \text{ is optimistically } (D, R)\text{-achievable with the given } f\text{-divergence} \bigr\}.$$
Remark 12. 
The conditions for optimistic $D$-achievability in (195) can also be written as
$$\liminf_{n \to \infty} D_f\bigl(X^n \,\big\|\, \phi_n(U_{M_n})\bigr) \le D, \qquad \limsup_{n \to \infty} \frac{1}{n} \log M_n \le R. \tag{199}$$
In actuality, the optimistic first-order optimum resolvability rate defined on the basis of (199) coincides with the one given in Definition 12. A similar argument applies to the optimistic second-order optimum resolvability rate as well as to the optimistic optimum intrinsic randomness rates.
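The reason the two formulations lead to the same rate can be sketched as follows (a heuristic outline of ours; the formal argument may differ in detail). If a sequence of mappings satisfies (195), then along a subsequence $\{n_k\}$ attaining the liminf of the rate we have, simultaneously,
$$\limsup_{k \to \infty} D_f\bigl(X^{n_k} \,\big\|\, \phi_{n_k}(U_{M_{n_k}})\bigr) \le D \quad \text{and} \quad \lim_{k \to \infty} \frac{1}{n_k} \log M_{n_k} \le R.$$
Conversely, given mappings that behave well only along a subsequence as in (199), one can choose $M_n$ large enough off the subsequence that the f-divergence there becomes negligible, so the limsup condition in (195) is not violated while the liminf of the rate is still attained along the subsequence. Under either formulation, optimistic achievability thus amounts to requiring good behavior along some subsequence of block lengths.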
The following theorem can be obtained by using Theorems 3 and 4.
Theorem 13. 
Under conditions C2’) and C3), for any $0 \le D < f(0)$, it holds that
$$T_r^{(f)}(D \,|\, \mathbf{X}) = \lim_{\nu \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} H_0\bigl(1 - f_0^{-1}(D + \nu) \,\big|\, X^n\bigr),$$
where $f_0$ is the offset function of $f$, defined in (131).
Proof. 
The proof proceeds in parallel with the proofs of Theorems 5 and 9, in which $\limsup_{n \to \infty} \frac{1}{n} \log M_n$ is replaced by $\liminf_{n \to \infty} \frac{1}{n} \log M_n$. □
Theorem 14. 
Under conditions C2’) and C3’), for any $0 \le D < f(0)$, it holds that
$$T_r^{(f)}(D, R \,|\, \mathbf{X}) = \lim_{\nu \downarrow 0} \liminf_{n \to \infty} \frac{H_0\bigl(1 - f_0^{-1}(D + \nu) \,\big|\, X^n\bigr) - nR}{\sqrt{n}}.$$
Proof. 
The proof proceeds in parallel with the proof of Theorem 11, in which $\limsup_{n \to \infty} \frac{1}{n} \log M_n$ is replaced by $\liminf_{n \to \infty} \frac{1}{n} \log M_n$. □
We have thus revealed the first- and second-order optimum resolvability rates in the optimistic scenario; as a by-product, the effectiveness of Theorems 3 and 4 has also been demonstrated.
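To make the smooth max entropy in these formulas concrete, it can be computed exactly for small block lengths. The sketch below is ours and only illustrative; it assumes the set-cardinality characterization $H_0(\delta|X^n) = \log \min\{|A| : P_{X^n}(A) \ge 1 - \delta\}$ with natural logarithms, and evaluates $\frac{1}{n} H_0(\delta|X^n)$ for an i.i.d. Bernoulli source by grouping sequences into types:

```python
from math import comb, ceil, log

def smooth_max_entropy_bernoulli(p, n, delta):
    """H_0(delta | X^n) in nats for X^n i.i.d. Bernoulli(p), assuming
    H_0(delta | P) = log of the minimum number of sequences whose total
    probability is at least 1 - delta (greedy: most probable first)."""
    # All comb(n, k) sequences with k ones share probability p^k (1-p)^(n-k).
    types = sorted(((p**k * (1 - p)**(n - k), comb(n, k)) for k in range(n + 1)),
                   reverse=True)
    mass, count = 0.0, 0
    for prob, mult in types:
        if mass + prob * mult >= 1 - delta:
            count += ceil(((1 - delta) - mass) / prob)  # partial type suffices
            return log(count)
        mass += prob * mult
        count += mult
    return log(count)  # delta = 0 may require the whole support

p, delta = 0.11, 0.1
for n in (10, 50, 200):
    print(n, smooth_max_entropy_bernoulli(p, n, delta) / n)
```

For a stationary memoryless source, $(1/n) H_0(\delta|X^n)$ approaches the Shannon entropy ($\approx 0.347$ nats for $p = 0.11$) for every fixed $\delta \in (0, 1)$, so the optimistic rates of Theorems 13 and 14 can differ from the ordinary ones only when the liminf and limsup over $n$ differ, e.g., for certain non-stationary sources.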
The optimistic second-order optimum achievable rate with respect to the half variational distance in terms of the smooth Rényi entropy has already been derived by Tagashira and Uyematsu [31]. In the case of $f(t) = (1 - t)^+$, Theorem 14 coincides with their result.

8.2. Intrinsic Randomness

We next consider the optimum intrinsic randomness rates in the optimistic scenario.
Definition 15. 
$R$ is said to be optimistically $\Delta$-achievable with the given $f$-divergence if there exists a sequence of mappings $\varphi_n : \mathcal{X}^n \to \mathcal{U}_{M_n}$ satisfying
$$\limsup_{n \to \infty} D_f\bigl(\varphi_n(X^n) \,\big\|\, U_{M_n}\bigr) \le \Delta, \qquad \limsup_{n \to \infty} \frac{1}{n} \log M_n \ge R.$$
Definition 16 (Optimistic first-order optimum intrinsic randomness rate).
$$T_\iota^{(f)}(\Delta \,|\, \mathbf{X}) := \sup\bigl\{ R \,\big|\, R \text{ is optimistically } \Delta\text{-achievable with the given } f\text{-divergence} \bigr\}.$$
Definition 17. 
$L$ is said to be optimistically $(\Delta, R)$-achievable with the given $f$-divergence if there exists a sequence of mappings $\varphi_n : \mathcal{X}^n \to \mathcal{U}_{M_n}$ satisfying
$$\limsup_{n \to \infty} D_f\bigl(\varphi_n(X^n) \,\big\|\, U_{M_n}\bigr) \le \Delta, \qquad \limsup_{n \to \infty} \frac{1}{\sqrt{n}} \log \frac{M_n}{e^{nR}} \ge L.$$
Definition 18 (Optimistic second-order optimum intrinsic randomness rate).
$$T_\iota^{(f)}(\Delta, R \,|\, \mathbf{X}) := \sup\bigl\{ L \,\big|\, L \text{ is optimistically } (\Delta, R)\text{-achievable with the given } f\text{-divergence} \bigr\}.$$
Then, we have the following theorem by using Theorems 6 and 7.
Theorem 15. 
Under condition C2’), for any $0 \le \Delta < f(0)$, it holds that
$$T_\iota^{(f)}(\Delta \,|\, \mathbf{X}) = \lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_\infty\bigl(1 - f_0^{-1}(\Delta + \nu) \,\big|\, X^n\bigr).$$
Proof. 
The proof is similar to the proofs of Theorems 8 and 10, in which $\liminf_{n \to \infty} \frac{1}{n} \log M_n$ is replaced by $\limsup_{n \to \infty} \frac{1}{n} \log M_n$. □
Theorem 16. 
Under condition C2’), for any $0 \le \Delta < f(0)$, it holds that
$$T_\iota^{(f)}(\Delta, R \,|\, \mathbf{X}) = \lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{H_\infty\bigl(1 - f_0^{-1}(\Delta + \nu) \,\big|\, X^n\bigr) - nR}{\sqrt{n}}.$$
Proof. 
The proof is similar to the proof of Theorem 12, in which $\liminf_{n \to \infty} \frac{1}{n} \log M_n$ is replaced by $\limsup_{n \to \infty} \frac{1}{n} \log M_n$. □
We have thus revealed the first- and second-order optimum intrinsic randomness rates in the optimistic scenario. As in the case of the resolvability problem, the effectiveness of Theorems 6 and 7 has also been demonstrated.
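A corresponding sketch for the smooth min entropy appearing in Theorems 15 and 16 is given below. It assumes the "capping" characterization $H_\infty(\delta|P) = -\log \beta^*$, where $\beta^*$ is the smallest cap with $\sum_x \min\{P(x), \beta^*\} \ge 1 - \delta$, i.e., the smoothing cuts at most $\delta$ of probability mass from the largest atoms; this matches the Renner–Wolf-style definitions, but readers should check it against the definition used in this paper. The code is ours and purely illustrative:

```python
import numpy as np

def smooth_min_entropy(p, delta, tol=1e-12):
    """H_inf(delta | P) in nats, assuming the capping characterization
    beta* = min{beta : sum_x min(P(x), beta) >= 1 - delta}, found by
    bisection (feasibility is monotone non-decreasing in beta)."""
    p = np.asarray(p, dtype=float)
    lo, hi = 0.0, float(p.max())
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if np.minimum(p, mid).sum() >= 1.0 - delta:
            hi = mid  # feasible: try a smaller cap
        else:
            lo = mid
    return float(-np.log(hi))

P = [0.6, 0.2, 0.15, 0.05]  # toy distribution
for d in (0.0, 0.1, 0.3):
    print(f"delta = {d:.1f}:  H_inf = {smooth_min_entropy(P, d):.4f} nats")
```

With $\delta = 0$ this returns the ordinary min entropy $-\log 0.6 \approx 0.51$ nats; as $\delta$ grows, mass is cut from the largest atoms and the smooth min entropy increases, which reflects why the intrinsic randomness rate increases with the permissible approximation level $\Delta$.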
The optimistic first-order optimum intrinsic randomness rate with the half variational distance using the smooth Rényi entropy has been derived by Uyematsu and Kunimatsu [10], while the second-order one has been characterized by Namekawa and Uyematsu [32]. Our results (Theorems 15 and 16) are generalizations of their results.
It is important to acknowledge that optimistic optimum achievable rates have also been studied in [9]. Our analysis of $T_\iota^{(f)}(\Delta|\mathbf{X})$ and $T_\iota^{(f)}(\Delta, R|\mathbf{X})$ relates to the optimistic optimum achievable rates for intrinsic randomness with respect to the variational distance, which were addressed in Theorems 2 and 3 of [9] using different information-theoretic quantities. It should be noted that the work in [9] encompasses the analysis of several optimum rates, including the optimistic optimum achievable rates.

9. Discussion

Theorems 5 and 8 (as well as Theorems 11 and 12) have shown a kind of duality between the optimum achievable rates of the two random number generation problems in terms of the smooth Rényi entropy. It should be noted that, in the case of the variational distance, Theorem 6 in [6] and Theorem 7 in [10] have implied the same duality.
As mentioned in Section 1, the optimum achievable rates $S_r^{(f)}(D|\mathbf{X})$ and $S_\iota^{(f)}(\Delta|\mathbf{X})$ have already been characterized by using information spectrum quantities.
Definition 19. 
$$\overline{K}_f(\varepsilon \,|\, \mathbf{X}) := \inf\left\{ R \,\middle|\, \limsup_{n \to \infty} f\left( \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \le R \right\} \right) \le \varepsilon \right\},$$
$$\underline{K}_f(\varepsilon \,|\, \mathbf{X}) := \sup\left\{ R \,\middle|\, \limsup_{n \to \infty} f\left( \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \ge R \right\} \right) \le \varepsilon \right\}.$$
Then, by using these two quantities, the following theorem has been established.
Theorem 17 
(Nomura [4] ([Theorems 3.1 and 4.1])). Under conditions C1)–C3), it holds that
$$S_r^{(f)}(D \,|\, \mathbf{X}) = \overline{K}_f(D \,|\, \mathbf{X}),$$
$$S_\iota^{(f)}(\Delta \,|\, \mathbf{X}) = \underline{K}_f(\Delta \,|\, \mathbf{X}).$$
From the above theorem and Theorems 5 and 8, we obtain the following relationship.
Theorem 18. 
Under conditions C1)–C3), it holds that
$$\lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_0\bigl(1 - f^{-1}(D + \nu) \,\big|\, X^n\bigr) = \overline{K}_f(D \,|\, \mathbf{X}), \tag{211}$$
$$\lim_{\nu \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} H_\infty\bigl(1 - f^{-1}(\Delta + \nu) \,\big|\, X^n\bigr) = \underline{K}_f(\Delta \,|\, \mathbf{X}). \tag{212}$$
The above theorem shows equivalences between information spectrum quantities and smooth Rényi entropies.
Remark 13. 
Theorem 18 can also be proved by using previous results and the continuity of the function $f$. In actuality, for $f(t) = (1 - t)^+$, Steinberg and Verdú [2] have shown that
$$S_r^{(f)}(D \,|\, \mathbf{X}) = \inf\left\{ R \,\middle|\, \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} > R \right\} \le D \right\}, \tag{213}$$
from which, together with the theorem given by Uyematsu [6] ([Theorem 6]) (Corollary 1 in this paper), we obtain
$$\lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_0(D + \nu \,|\, X^n) = \inf\left\{ R \,\middle|\, \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} > R \right\} \le D \right\}. \tag{214}$$
Since
$$\overline{K}_f(D \,|\, \mathbf{X}) = \inf\left\{ R \,\middle|\, \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} > R \right\} \le 1 - f^{-1}(D) \right\} \tag{215}$$
holds under conditions C1)–C3), we have (211). Equation (212) can also be derived from Corollary 1 and the result given in [8] ([Theorem 2.4.2]).
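The step from (213) and (214) to (211) rests only on inverting the decreasing function $f$; for the reader's convenience, we record the elementary chain of equivalences (our own rendering of the argument). Since $f$ is decreasing on $[0, 1]$ with $f(1) = 0$ and continuous, for any probability $a$,
$$f(a) \le D \;\Longleftrightarrow\; a \ge f^{-1}(D) \;\Longleftrightarrow\; 1 - a \le 1 - f^{-1}(D).$$
Applying this with $a = \Pr\bigl\{\frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \le R\bigr\}$ turns the constraint in the definition of $\overline{K}_f(D|\mathbf{X})$ into the constraint in (215), and then (214), evaluated at the level $1 - f^{-1}(D)$ in place of $D$, yields (211).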
Remark 14. 
From Definition 19, the two quantities $\overline{K}_f(D|\mathbf{X})$ and $\underline{K}_f(D|\mathbf{X})$ are right-continuous functions of $D$, while
$$\limsup_{n \to \infty} \frac{1}{n} H_0\bigl(1 - f^{-1}(D) \,\big|\, X^n\bigr) \quad \text{and} \quad \liminf_{n \to \infty} \frac{1}{n} H_\infty\bigl(1 - f^{-1}(D) \,\big|\, X^n\bigr) \tag{216}$$
may not be. The operation $\lim_{\nu \downarrow 0}$ in Theorem 18 can thus be considered an operation that makes the quantities in (216) right-continuous. Furthermore, since $f^{-1}(D)$ is a decreasing function of $D$, $H_\alpha\bigl(1 - f^{-1}(D) \,\big|\, X^n\bigr)$ is also a decreasing function of $D$. This means that the relation
$$\limsup_{n \to \infty} \frac{1}{n} H_0\bigl(1 - f^{-1}(D) \,\big|\, X^n\bigr) \ge \lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{1}{n} H_0\bigl(1 - f^{-1}(D + \nu) \,\big|\, X^n\bigr)$$
holds. It should be emphasized that the above inequality holds with equality except for at most countably many $D$. Similarly, we obtain
$$\liminf_{n \to \infty} \frac{1}{n} H_\infty\bigl(1 - f^{-1}(\Delta) \,\big|\, X^n\bigr) \ge \lim_{\nu \downarrow 0} \liminf_{n \to \infty} \frac{1}{n} H_\infty\bigl(1 - f^{-1}(\Delta + \nu) \,\big|\, X^n\bigr),$$
where the equality holds except for at most countably many Δ. A similar observation can be applied to Theorem 20 below.
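The "at most countably many" claims follow from the standard fact that a monotone function on an interval has at most countably many discontinuities. Indeed, writing $g(D) := \limsup_{n \to \infty} \frac{1}{n} H_0(1 - f^{-1}(D)|X^n)$, which is non-increasing in $D$, every point where the inequality above is strict satisfies $g(D) > \lim_{\nu \downarrow 0} g(D + \nu)$. Each such point contributes a non-empty open interval $\bigl(\lim_{\nu \downarrow 0} g(D + \nu),\, g(D)\bigr)$, distinct points contribute disjoint intervals by monotonicity, and each interval contains a distinct rational number; hence there are at most countably many such points.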
The quantity on the right-hand side of Equation (214) is an information spectrum quantity defined in [8], which has been instrumental in analyzing various problems, including source coding and resolvability. On the other hand, the following quantity is specifically used for analyzing the intrinsic randomness problem:
$$\sup\left\{ R \,\middle|\, \limsup_{n \to \infty} \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} < R \right\} \le D \right\}.$$
It is noteworthy that Hayashi [9] has defined second-order extensions of these quantities, further expanding their applicability in information theory. These extensions provide a more refined analysis of the asymptotic behavior of various information-theoretic problems.
We next consider the second-order setting and first define two quantities:
$$\overline{K}_f(\varepsilon, R \,|\, \mathbf{X}) := \inf\left\{ L \,\middle|\, \limsup_{n \to \infty} f\left( \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \le R + \frac{L}{\sqrt{n}} \right\} \right) \le \varepsilon \right\},$$
$$\underline{K}_f(\varepsilon, R \,|\, \mathbf{X}) := \sup\left\{ L \,\middle|\, \limsup_{n \to \infty} f\left( \Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \ge R + \frac{L}{\sqrt{n}} \right\} \right) \le \varepsilon \right\}.$$
By using these quantities, the following theorem has been obtained.
Theorem 19 
(Nomura [4] ([Theorems 6.1 and 6.2])). Under conditions C1), C2) and C3’), it holds that
$$S_r^{(f)}(D, R \,|\, \mathbf{X}) = \overline{K}_f(D, R \,|\, \mathbf{X}),$$
$$S_\iota^{(f)}(\Delta, R \,|\, \mathbf{X}) = \underline{K}_f(\Delta, R \,|\, \mathbf{X}).$$
From the above theorem and Theorems 11 and 12, we obtain the following:
Theorem 20. 
Under conditions C1), C2) and C3’), it holds that
$$\lim_{\nu \downarrow 0} \limsup_{n \to \infty} \frac{H_0\bigl(1 - f^{-1}(D + \nu) \,\big|\, X^n\bigr) - nR}{\sqrt{n}} = \overline{K}_f(D, R \,|\, \mathbf{X}),$$
$$\lim_{\nu \downarrow 0} \liminf_{n \to \infty} \frac{H_\infty\bigl(1 - f^{-1}(\Delta + \nu) \,\big|\, X^n\bigr) - nR}{\sqrt{n}} = \underline{K}_f(\Delta, R \,|\, \mathbf{X}).$$
The above theorem also shows equivalences between information spectrum quantities and smooth Rényi entropies in the second-order sense.
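To see what Theorem 20 yields in the simplest case, consider, as an illustration outside the paper's general setting, an i.i.d. source $\mathbf{X}$ with Shannon entropy $H(X)$ and finite positive varentropy $V(X)$, and take $R = H(X)$ and $f(t) = (1 - t)^+$. By the central limit theorem,
$$\Pr\left\{ \frac{1}{n} \log \frac{1}{P_{X^n}(X^n)} \le H(X) + \frac{L}{\sqrt{n}} \right\} \to \Phi\!\left(\frac{L}{\sqrt{V(X)}}\right),$$
where $\Phi$ denotes the standard Gaussian cumulative distribution function. The constraint in the definition of $\overline{K}_f(D, H(X)|\mathbf{X})$ then becomes $1 - \Phi(L/\sqrt{V(X)}) \le D$, and hence
$$\overline{K}_f\bigl(D, H(X) \,\big|\, \mathbf{X}\bigr) = \sqrt{V(X)}\, \Phi^{-1}(1 - D),$$
recovering the familiar Gaussian second-order term.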
We have discussed functions $f$ under conditions C1), C2), and C3) (or C3’)) for simplicity. The discussion can also be extended to $f$ under C2’) and C3) (or C3’)) with due modifications using $f_0$.

10. Concluding Remarks

We have considered the optimum achievable rates in two random number generation problems with respect to a subclass of f-divergences. We have established general formulas for the first- and second-order optimum achievable rates with respect to a given f-divergence by using the smooth Rényi entropy together with the inverse function of f. To our knowledge, this is the first characterization in information theory that combines the smooth Rényi entropy with a general function f. We believe that this is important from both the theoretical and the practical viewpoints. In actuality, we have shown that the results for several important measures, such as the variational distance, the KL divergence, and the Hellinger distance, follow easily by substituting the specified function f into our general formulas. It should be noted that, except for the variational distance, the optimum achievable rates with respect to these measures had not previously been characterized using the smooth Rényi entropy. The expressions of the smooth max entropy in Theorem 1 and the smooth min entropy in Theorem 2 are simple and easy to understand; hence, our results stated via these entropies are also easy to interpret. This provides another viewpoint for understanding the mechanism of the random number generation problems, compared with the results given in [4], in which information spectrum quantities are used. In addition, we have shown that the conditions on the f-divergence can be relaxed, so that the general formulas hold for a wider class of f-divergences. These are the major contributions of this paper.
As a consequence of our results and the results in [4], the equivalence between the smooth Rényi entropy and the information spectrum quantity has been clarified (Theorem 18). One may observe that if we establish this equivalence first, then Theorems 5 and 8 follow directly. This observation is correct: one simple way of deriving both general formulas for the optimum achievable rates (Theorems 5 and 8) is to prove the equivalence (Theorem 18) first and then combine it with the results in [4]. However, we have taken another approach in this paper. For example, we first established Theorems 3 and 4 and used them to prove Theorem 5. Although Theorem 5 rests on Theorems 3 and 4, these two theorems are significant in themselves: Theorem 3 shows how to construct an optimum mapping in the resolvability problem, and Theorem 4 relates the rate of the random number to the smooth max entropy at finite block lengths. Hence, these two theorems matter not only for proving Theorem 5 but also for constructing optimum mappings in practical situations.
In this paper, we have considered the f-divergence $D_f(X^n \| \phi_n(U_{M_n}))$ in the resolvability problem and $D_f(\varphi_n(X^n) \| U_{M_n})$ in the intrinsic randomness problem, and we have shown a kind of duality between these problems in terms of the smooth Rényi entropy. On the other hand, one can also consider the resolvability problem with respect to $D_f(\phi_n(U_{M_n}) \| X^n)$ as well as the intrinsic randomness problem with respect to $D_f(U_{M_n} \| \varphi_n(X^n))$. Although these problems are also important, the techniques of the present paper cannot be applied to them directly; it seems that some novel techniques are needed, which remain to be studied. The situation is similar for the information spectrum approach [4].
Finally, condition C3) and assumption (15) on the source have been needed only to prove the direct part (Theorem 3) of the resolvability problem. Investigating the necessity, or a weakening, of these conditions is also left for future work.

Author Contributions

R.N. and H.Y. conceptualized the overall study. R.N. took the lead in writing the manuscript, with H.Y. contributing by writing Section 5. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JSPS KAKENHI Grant Numbers JP20K04462, JP22K04111, and JP23K10992, and by the Kayamori Foundation of Informational Science Advancement.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Han, T.S.; Verdú, S. Approximation theory of output statistics. IEEE Trans. Inf. Theory 1993, 39, 752–772. [Google Scholar] [CrossRef]
  2. Steinberg, Y.; Verdú, S. Simulation of random processes and rate-distortion theory. IEEE Trans. Inf. Theory 1996, 42, 63–86. [Google Scholar] [CrossRef]
  3. Nomura, R. Source resolvability with Kullback-Leibler divergence. In Proceedings of the 2018 IEEE International Symposium on Information Theory, Vail, CO, USA, 17–22 June 2018; pp. 2042–2046. [Google Scholar]
  4. Nomura, R. Source resolvability and intrinsic randomness: Two random number generation problems with respect to a subclass of f-divergences. IEEE Trans. Inf. Theory 2020, 66, 7588–7601. [Google Scholar] [CrossRef]
  5. Nomura, R.; Han, T.S. Second-order resolvability, intrinsic randomness, and fixed-length source coding for mixed sources: Information spectrum approach. IEEE Trans. Inf. Theory 2013, 59, 1–16. [Google Scholar] [CrossRef]
  6. Uyematsu, T. Relating source coding and resolvability: A direct approach. In Proceedings of the 2010 IEEE International Symposium on Information Theory, Austin, TX, USA, 13–18 June 2010; pp. 1350–1354. [Google Scholar]
  7. Vembu, S.; Verdú, S. Generating random bits from an arbitrary source: Fundamental limits. IEEE Trans. Inf. Theory 1995, 41, 1322–1332. [Google Scholar] [CrossRef]
  8. Han, T.S. Information-Spectrum Methods in Information Theory; Springer: New York, NY, USA, 2003. [Google Scholar]
  9. Hayashi, M. Second-order asymptotics in fixed-length source coding and intrinsic randomness. IEEE Trans. Inf. Theory 2008, 54, 4619–4637. [Google Scholar]
  10. Uyematsu, T.; Kunimatsu, S. A new unified method for intrinsic randomness problems of general sources. In Proceedings of the 2013 IEEE Information Theory Workshop (ITW), Seville, Spain, 9–13 September 2013; pp. 1–5. [Google Scholar]
  11. Liu, J.; Cuff, P.; Verdú, S. Eγ-resolvability. IEEE Trans. Inf. Theory 2017, 63, 2629–2658. [Google Scholar]
  12. Yagi, H.; Han, T.S. Variable-length resolvability for mixed sources and its application to variable-length source coding. In Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018. [Google Scholar]
  13. Kumagai, W.; Hayashi, M. Second-order asymptotics of conversions of distributions and entangled states based on Rayleigh-normal probability distributions. IEEE Trans. Inf. Theory 2017, 63, 1829–1857. [Google Scholar] [CrossRef]
  14. Kumagai, W.; Hayashi, M. Random number conversion and LOCC conversion via restricted storage. IEEE Trans. Inf. Theory 2017, 63, 2504–2532. [Google Scholar] [CrossRef]
  15. Yu, L.; Tan, V.Y.F. Simulation of random variables under Rényi divergence measures of all orders. IEEE Trans. Inf. Theory 2019, 65, 3349–3383. [Google Scholar] [CrossRef]
  16. Yagi, H.; Han, T.S. Variable-length resolvability for general sources and channels. Entropy 2023, 25, 1466. [Google Scholar] [CrossRef] [PubMed]
  17. Csiszár, I.; Shields, P.C. Information theory and statistics: A tutorial. Found. Trends® Commun. Inf. Theory 2004, 1, 417–528. [Google Scholar] [CrossRef]
  18. Sason, I.; Verdú, S. f-divergence inequalities. IEEE Trans. Inf. Theory 2016, 62, 5973–6006. [Google Scholar] [CrossRef]
  19. Renner, R.; Wolf, S. Smooth Rényi entropy and applications. In Proceedings of the 2004 IEEE International Symposium on Information Theory (ISIT), Chicago, IL, USA, 27 June–2 July 2004; p. 233. [Google Scholar]
  20. Holenstein, T.; Renner, R. On the randomness of independent experiments. IEEE Trans. Inf. Theory 2011, 57, 1865–1871. [Google Scholar] [CrossRef]
  21. Uyematsu, T. A new unified method for fixed-length source coding problems of general sources. IEICE Trans. Fundam. 2010, E93-A, 1868–1877. [Google Scholar]
  22. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970. [Google Scholar]
  23. Hayashi, M. Information spectrum approach to second-order coding rate in channel coding. IEEE Trans. Inf. Theory 2009, 55, 4947–4966. [Google Scholar] [CrossRef]
  24. Polyanskiy, Y.; Poor, H.; Verdú, S. Channel coding rate in the finite blocklength regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359. [Google Scholar] [CrossRef]
  25. Ingber, A.; Kochman, Y. The dispersion of lossy source coding. In Proceedings of the Data Compression Conference (DCC), Snowbird, UT, USA, 29–31 March 2011; pp. 53–62. [Google Scholar]
  26. Kostina, V.; Verdú, S. Fixed-length lossy compression in the finite blocklength regime. IEEE Trans. Inf. Theory 2012, 58, 3309–3338. [Google Scholar] [CrossRef]
  27. Kontoyiannis, I.; Verdú, S. Optimal lossless compression: Source varentropy and dispersion. In Proceedings of the 2013 IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 1739–1742. [Google Scholar]
  28. Tan, V.Y.F.; Kosut, O. On the dispersions of three network information theory problems. IEEE Trans. Inf. Theory 2014, 60, 881–903. [Google Scholar] [CrossRef]
  29. Yagi, H.; Han, T.S.; Nomura, R. First- and second-order coding theorems for mixed memoryless channels with general mixture. IEEE Trans. Inf. Theory 2016, 62, 4395–4412. [Google Scholar] [CrossRef]
  30. Watanabe, S. Second-order region for Gray-Wyner network. IEEE Trans. Inf. Theory 2017, 63, 1006–1018. [Google Scholar] [CrossRef]
  31. Tagashira, S.; Uyematsu, T. The second order asymptotic rates in fixed-length coding and resolvability problem in terms of smooth Rényi entropy. IEICE Tech. Rep. 2013, 112, 65–70. (In Japanese) [Google Scholar]
  32. Namekawa, E.; Uyematsu, T. The second order asymptotic rates in intrinsic randomness problem in terms of smooth Rényi entropy. IEICE Tech. Rep. 2015, 114, 1–6. (In Japanese) [Google Scholar]
  33. Vembu, S.; Verdú, S.; Steinberg, Y. The source-channel separation theorem revisited. IEEE Trans. Inf. Theory 1995, 41, 44–54. [Google Scholar] [CrossRef]
  34. Chen, P.O.; Alajaji, F. Optimistic Shannon coding theorems for arbitrary single-user systems. IEEE Trans. Inf. Theory 1999, 45, 2623–2629. [Google Scholar] [CrossRef]
  35. Koga, H. Four limits in probability and their roles in source coding. IEICE Trans. Fundam. 2011, 94, 2073–2082. [Google Scholar] [CrossRef]
Figure 1. Smooth max entropy $H_0(\delta|X^n)$.
Figure 2. Smooth min entropy $H_\infty(\delta|X^n)$.
Figure 3. Resolvability problem.
Figure 4. Intrinsic randomness problem.