Article

Robust Biometric Authentication from an Information Theoretic Perspective †

1 Chair of Theoretical Information Technology, Technical University of Munich, Munich 80290, Germany
2 Information Theory and Applications Chair, Technische Universität Berlin, Berlin 10587, Germany
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in the 7th IEEE International Workshop on Information Forensics and Security, Rome, Italy, 16–19 November 2015.
Entropy 2017, 19(9), 480; https://doi.org/10.3390/e19090480
Submission received: 22 June 2017 / Revised: 28 August 2017 / Accepted: 7 September 2017 / Published: 9 September 2017
(This article belongs to the Special Issue Information-Theoretic Security)

Abstract

Robust biometric authentication is studied from an information theoretic perspective. Compound sources are used to account for uncertainty in the knowledge of the source statistics and are further used to model certain attack classes. It is shown that authentication is robust against source uncertainty and a special class of attacks under the strong secrecy condition. A single-letter characterization of the privacy secrecy capacity region is derived for the generated and chosen secret key models. Furthermore, the question of whether small variations of the compound source lead to large losses of the privacy secrecy capacity region is studied. It is shown that biometric authentication is robust in the sense that its privacy secrecy capacity region depends continuously on the compound source.

1. Introduction

Biometric identifiers, such as fingerprints and iris and retina scans, are becoming increasingly attractive for use in security systems, for example in authentication and identification systems, because of their uniqueness and time-invariant characteristics. Conventional personal authentication systems usually use secret passwords or physical tokens to guarantee the legitimacy of a person. Biometric authentication systems, on the other hand, use the physical characteristics of a person to guarantee the legitimacy of the person to be authenticated.
Biometric authentication systems are decomposed into two phases: the enrollment phase and the authentication phase. A simple authentication approach is to gather biometric measurements in the enrollment phase, apply a one-way function and then store the result in a public database. In the authentication phase, new biometric measurements are gathered, the same one-way function is applied, and the outcome is compared to the one stored in the database. Unfortunately, biometric measurements might be affected by noise. To deal with noisy data, error correction is needed. Therefore, helper data is also generated during the enrollment phase based on the biometric measurements and stored in the public database; it is then used in the authentication phase to correct the noisy imperfections of the measurements.
Since the database containing the helper data is public, an eavesdropper can access the data at will. How can we prevent an eavesdropper from gaining information about the biometric data from the publicly stored helper data? One is interested in encoding the biometric data into helper data and a secret key such that the helper data does not reveal any information about the secret key. Cryptographic techniques are one approach to keeping the key secret. However, security on higher layers is usually based on the assumption of insufficient computational capabilities of eavesdroppers. Information theoretic security, on the contrary, uses the physical properties of the source to guarantee security independently of the computational capabilities of the adversary. This line of research was initiated by Shannon in [1] and has attracted considerable interest recently—cf., for example, the recent textbooks [2,3,4] and references therein. In particular, Ahlswede and Csiszár in [5] and Maurer in [6] introduced a secret key sharing model. It consists of two terminals that observe the correlated sequences of a joint source. Both terminals generate a common key based on their observations and using public communication. The message transmitted over the public channel should not leak any amount of information about the common key.
Both works mentioned above use the weak secrecy condition as a measure of secrecy. Given a code of a certain blocklength, the weak secrecy condition is fulfilled if the mutual information between the key and the available information at the eavesdropper normalized by the code blocklength is arbitrarily small for large blocklengths. On the other hand, the strong secrecy condition is fulfilled if the un-normalized mutual information between the key and the available information at the eavesdropper is arbitrarily small for large blocklengths, i.e., the total amount of information leaked to the eavesdropper is negligible. The secret key sharing model satisfying the strong secrecy condition has been studied in [7].
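For concreteness, the two conditions can be stated as follows (a standard formulation, written here with the notation used later in Section 2, where $K$ is the key, $M$ is the public message, and $n$ is the blocklength):
$$\text{weak secrecy:}\ \ \frac{1}{n} I(K;M) \le \epsilon_n, \qquad \text{strong secrecy:}\ \ I(K;M) \le \epsilon_n, \qquad \text{with } \epsilon_n \to 0 \text{ as } n \to \infty.$$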
One could model biometric authentication similar to this secret key generation source model; however, this model does not take into account the amount of information that the public data (the helper data in the biometric scenario) leaks about the biometric measurement. The goal of biometric authentication is to perform a secret and successful authentication procedure without compromising the information about the user (privacy leakage). Biometric information is unique and cannot be replaced, so once it is compromised, it is compromised forever, which might lead to identity theft (see [8,9,10] for more information on privacy concerns). Since the helper data we use to deal with noisy data is a function of the biometric measurements, it contains information about the biometric measurement. Thus, if attackers break into the database, they may be able to extract information about the biometric measurement from the stored helper data. Hence, we aim to control the privacy leakage as well. An information theoretic approach to secure biometric authentication controlling the privacy leakage was studied in [11,12] under ideal conditions, i.e., with perfect source state information (SSI) and without the presence of active attackers.
In both references [11,12], the capacity results were derived under the weak secrecy condition. In [13], the capacity result for sequential key-distillation with rate-limited one-way public communication was shown under the strong secrecy condition.
For reliable authentication, SSI is needed; however, in practical systems, it is never perfectly available. Compound sources model a simple and realistic SSI scenario in which the legitimate users are not aware of the actual source realization. Nevertheless, they know that it belongs to a known uncertainty set of sources and that it remains constant during the entire observation. This model was first introduced and studied in [14,15] in a channel coding context. Compound sources can also model the presence of an active attacker who is able to control the state of the source. We are interested in performing an authentication process that is robust against such uncertainties and attacks. Secret key generation under source uncertainty was studied in [16,17,18,19]. In [16], secret key generation using compound joint sources was studied and the key-capacity was established.
In [20], the achievability result of the privacy secrecy capacity region for generated secret keys for compound sources was derived under the weak secrecy condition. In this work, we study robust biometric authentication in detail and extend this result in several directions. First, we consider a model where the legitimate users suffer from source uncertainty and/or attacks and derive achievability results under the strong secrecy condition for both generated and chosen secret key authentication. We then provide matching converses to obtain single-letter characterizations of the corresponding privacy secrecy capacity regions.
We further address the following question: can small changes of the compound source cause large changes in the privacy secrecy capacity region? Such a question was first studied in [21] for arbitrarily varying quantum channels (AVQCs), showing that the deterministic capacity has discontinuity points, while the randomness-assisted capacity is a continuous function of the AVQC. This line of research is continued in [22,23], in which the classical compound wiretap channel, the arbitrarily varying wiretap channel (AVWC), and the compound broadcast channel with confidential messages (BCC) are studied. We study this question for the biometric authentication problem at hand and show that the corresponding privacy secrecy capacity regions are continuous functions of the underlying uncertainty sets. Thus, small changes in the compound set lead only to small changes in the capacity region.
The rest of this paper is organized as follows. In Section 2, we introduce the biometric authentication model for perfect SSI and present the corresponding capacity results. In Section 3, we introduce the biometric authentication model for compound sources and show that reliable authentication that is secure under the strong secrecy condition is possible at positive rates despite source uncertainty, deriving a single-letter characterization of the privacy secrecy capacity region for the chosen and generated secret key models. In Section 4, we show that the privacy secrecy capacity region for compound sources is a continuous function of the uncertainty set. Finally, the paper ends with a conclusion in Section 5.
Notation: Discrete random variables are denoted by capital letters and their realizations and ranges by lower case and script letters. $\mathcal{P}(\mathcal{X})$ denotes the set of all probability distributions on $\mathcal{X}$; $\mathbb{E}(\cdot)$ denotes the expectation of a random variable; $\Pr\{\cdot\}$, $H(\cdot)$ and $I(\cdot\,;\cdot)$ indicate the probability, the entropy of a random variable, and the mutual information between two random variables; $D(\cdot\|\cdot)$ is the information divergence; $\|p-q\|_{TV}$ is the total variation distance between $p$ and $q$ on $\mathcal{X}$, defined as $\|p-q\|_{TV} \triangleq \sum_{x\in\mathcal{X}} |p(x)-q(x)|$. The set $T_{p,\delta}^n$ denotes the set of $\delta$-typical sequences of length $n$ with respect to the distribution $p$; the set $T_{W,\delta}^n(x^n)$ denotes the set of $\delta$-conditionally typical sequences with respect to the conditional distribution $W: \mathcal{X} \to \mathcal{P}(\mathcal{Y})$ and sequence $x^n \in \mathcal{X}^n$; $p_{x^n}$ denotes the empirical distribution (type) of the sequence $x^n$.

2. Information Theoretic Model for Biometric Authentication

Let $\mathcal{X}$ and $\mathcal{Y}$ be two finite alphabets. Let $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$ be a pair of biometric sequences of length $n \in \mathbb{N}$; then, the discrete memoryless joint-source is given by the joint probability distribution $Q^n(x^n, y^n) \triangleq \prod_{i=1}^n Q(x_i, y_i)$. This models perfect SSI, i.e., all possible measurements are generated by the discrete memoryless joint-source $Q$, which is perfectly known at both the enrollment and the authentication terminal.
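As a minimal numeric illustration (the $2\times 2$ joint distribution below is hypothetical and not taken from the paper), the following sketch draws $n$ i.i.d. pairs from such a discrete memoryless joint-source $Q$:

```python
import numpy as np
rng = np.random.default_rng(0)

# Hypothetical joint distribution Q(x, y) on a 2 x 2 alphabet
Q = np.array([[0.4, 0.1],
              [0.1, 0.4]])

n = 10
# Draw n i.i.d. pairs (x_i, y_i) ~ Q, i.e., (x^n, y^n) ~ Q^n
flat = rng.choice(Q.size, size=n, p=Q.ravel())
x, y = np.unravel_index(flat, Q.shape)
print(list(zip(x, y)))
```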

2.1. Generated Secret Key Model

The information theoretic authentication model consists of a discrete memoryless joint-source $Q$, which represents the biometric measurement source, and two terminals: the enrollment terminal and the authentication terminal, as shown in Figure 1. At the enrollment terminal, the enrollment sequence $X^n$ is observed and the secret key $K$ and helper data $M$ are generated. At the authentication terminal, the authentication sequence $Y^n$ is observed. An estimate $\hat{K}$ of the secret key is made based on the authentication sequence $Y^n$ and the helper data $M$. Since the helper data is stored in a public database, it should reveal nothing about the secret key $K$ and as little as possible about the enrollment measurement $X^n$. The distribution of the key must be close to uniform.
We consider a block-processing of arbitrary but fixed length $n$. Let $\mathcal{M} \triangleq \{1, \ldots, M_n\}$ be the helper data set and $\mathcal{K} \triangleq \{1, \ldots, K_n\}$ the secret key set.
Definition 1.
An $(n, M_n, K_n)$-code for generated secret key authentication for joint-source $Q \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ consists of an encoder $f$ at the enrollment terminal with
$$f: \mathcal{X}^n \to \mathcal{K} \times \mathcal{M}$$
and a decoder $\varphi$ at the authentication terminal
$$\varphi: \mathcal{Y}^n \times \mathcal{M} \to \mathcal{K}.$$
Remark 1.
Note that the function $f$ maps every $x^n$ into a pair $(k, m) \in \mathcal{K} \times \mathcal{M}$, which implies that $K_n M_n \le |\mathcal{X}^n|$.
Definition 2.
A privacy secrecy rate pair $(R_{PL}, R_K) \in \mathbb{R}_+^2$ is called achievable for generated secret key authentication for a joint-source $Q$ if, for any $\delta > 0$, there exist an $n(\delta) \in \mathbb{N}$ and a sequence of $(n, M_n, K_n)$-codes such that, for all $n \ge n(\delta)$, we have
$$\Pr\{\hat{K} \neq K\} \le \delta, \tag{1a}$$
$$\frac{1}{n} H(K) + \delta \ge \frac{1}{n} \log K_n \ge R_K - \delta, \tag{1b}$$
$$\frac{1}{n} I(K; M) \le \delta, \tag{1c}$$
$$\frac{1}{n} I(X^n; M) \le R_{PL} + \delta. \tag{1d}$$
Remark 2.
Condition (1b) requires the key distribution $p_K$ to be close to the uniform distribution $p_{\tilde{K}}$, where $\tilde{K}$ is a random variable uniformly distributed over the key set $\mathcal{K}$. By (1b), we have $\frac{1}{n} \log K_n - \frac{1}{n} H(K) = \frac{1}{n} D(p_K \| p_{\tilde{K}}) \le \delta$; combined with Pinsker's inequality, we have $\|p_K - p_{\tilde{K}}\|_{TV} \le \sqrt{2 \ln 2 \, \delta}$. For small $\delta$, the two distributions are thus close to each other.
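A quick numeric sanity check of this Pinsker-type bound (a minimal sketch with an arbitrary four-letter key distribution; all values are illustrative):

```python
import numpy as np

def kl_bits(p, q):
    """KL divergence D(p||q) in bits (assumes supp(p) is contained in supp(q))."""
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def tv(p, q):
    """Total variation distance as defined in the paper: sum_x |p(x) - q(x)|."""
    return np.abs(p - q).sum()

# A slightly non-uniform key distribution vs. the uniform distribution
p_K = np.array([0.28, 0.26, 0.24, 0.22])
p_unif = np.full(4, 0.25)

D = kl_bits(p_K, p_unif)
# Pinsker's inequality: ||p - q||_TV <= sqrt(2 ln 2 * D(p||q))
assert tv(p_K, p_unif) <= np.sqrt(2 * np.log(2) * D)
print(tv(p_K, p_unif), np.sqrt(2 * np.log(2) * D))
```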
Remark 3.
Condition (1a) stands for reliable authentication; the information about the key leaked by the helper data is negligible by (1c); and the information rate $\frac{1}{n} I(X^n; M)$ about the biometric measurements leaked by the helper data is bounded by $R_{PL} + \delta$ by (1d).
Definition 3.
The set of all achievable privacy secrecy rate pairs for generated key authentication is called the privacy secrecy capacity region and is denoted by $\mathcal{C}_G(Q)$.
We next present the privacy secrecy capacity region for the generated key authentication for the joint-source Q, which was first established in [11,12].
To do so, for some $U$ with alphabet size $|\mathcal{U}| \le |\mathcal{X}| + 1$ and $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$, we define the region $\mathcal{R}(Q, V)$ as the set of all $(R_{PL}, R_K) \in \mathbb{R}_+^2$ satisfying
$$R_K \le I(U; Y), \qquad R_{PL} \ge I(U; X) - I(U; Y),$$
with $P_{UXY}(u, x, y) = V(u|x) Q(x, y)$.
Theorem 1
([11,12]). The privacy secrecy capacity region for generated key authentication is given by
$$\mathcal{C}_G(Q) = \bigcup_{V: \mathcal{X} \to \mathcal{P}(\mathcal{U})} \mathcal{R}(Q, V).$$

2.2. Chosen Secret Key Model

In this section, we study the authentication model for systems in which the secret key is chosen beforehand. At the enrollment terminal, a secret key $K$ is chosen uniformly and independently of the biometric measurements. The secret key $K$ is bound to the biometric measurements $X^n$, and, based on this, the helper data $M$ is generated, as shown in Figure 2. At the authentication terminal, the authentication measurement $Y^n$ is observed. An estimate $\hat{K}$ of the secret key is made based on the authentication sequence $Y^n$ and the helper data $M$. Since the helper data is stored in a public database, it should reveal nothing about the secret key and as little as possible about the enrollment sequence $X^n$. However, we should be able to reconstruct $K$. To achieve this, a masking layer based on the one-time pad principle is used.
The masking layer, i.e., another uniformly distributed chosen secret key $K$, is added on top of the generated secret key authentication scheme. At the enrollment terminal, a secret key $K_g$ and helper data $M'$ are generated. The generated secret key is added modulo-$|\mathcal{K}|$ to the masking layer $K$ and sent together with the helper data as additional helper data, i.e., $M = (M', K \oplus K_g)$. At the authentication terminal, an estimate $\hat{K}_g$ of the generated secret key is made based on $Y^n$ and $M'$, and the masking layer is estimated as $\hat{K} = (K \oplus K_g) \ominus \hat{K}_g$, where $\oplus$ and $\ominus$ denote addition and subtraction modulo $|\mathcal{K}|$.
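A minimal sketch of this masking layer (all parameters are illustrative, and K_g below is a stand-in uniform value rather than a key actually distilled from biometric measurements):

```python
import secrets

K_SIZE = 2**16  # |K|: size of the key set (illustrative)

# Enrollment: bind the chosen key K to the generated key K_g via a one-time pad.
K = secrets.randbelow(K_SIZE)        # chosen secret key, uniform on {0, ..., |K|-1}
K_g = secrets.randbelow(K_SIZE)      # stand-in for the key generated from X^n
pad = (K + K_g) % K_SIZE             # K (+) K_g, stored as additional helper data

# Authentication: recover K from the estimate of the generated key.
K_g_hat = K_g                        # assume the generated key was decoded correctly
K_hat = (pad - K_g_hat) % K_SIZE     # (K (+) K_g) (-) K_g_hat
assert K_hat == K
```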
We consider a block-processing of arbitrary but fixed length $n$. Let $\mathcal{M} \triangleq \{1, \ldots, M_n\}$ be the helper data set and $\mathcal{K} \triangleq \{1, \ldots, K_n\}$ the secret key set.
Definition 4.
An $(n, M_n, K_n)$-code for chosen secret key authentication for joint-source $Q \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ consists of an encoder $f$ at the enrollment terminal with
$$f: \mathcal{K} \times \mathcal{X}^n \to \mathcal{M}$$
and a decoder $\varphi$ at the authentication terminal
$$\varphi: \mathcal{Y}^n \times \mathcal{M} \to \mathcal{K}.$$
Definition 5.
A privacy secrecy rate pair $(R_{PL}, R_K) \in \mathbb{R}_+^2$ for chosen secret key authentication is called achievable for a joint-source $Q$ if, for any $\delta > 0$, there exist an $n(\delta) \in \mathbb{N}$ and a sequence of $(n, M_n, K_n)$-codes such that, for all $n \ge n(\delta)$, we have
$$\Pr\{\hat{K} \neq K\} \le \delta,$$
$$\frac{1}{n} \log K_n \ge R_K - \delta,$$
$$\frac{1}{n} I(K; M) \le \delta,$$
$$\frac{1}{n} I(X^n; M) \le R_{PL} + \delta.$$
Remark 4.
The difference between Definitions 5 and 2 is that here the uniformity of the key is already guaranteed.
Definition 6.
The set of all achievable privacy secrecy rate pairs for chosen secret key authentication for the joint-source $Q \in \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ is called the privacy secrecy capacity region and is denoted by $\mathcal{C}_C(Q)$.
We next present the privacy secrecy capacity region for chosen secret key authentication for the joint-source $Q$, as shown in [11].
Theorem 2
([11]). The privacy secrecy capacity region for chosen secret key authentication is given by
$$\mathcal{C}_C(Q) = \bigcup_{V: \mathcal{X} \to \mathcal{P}(\mathcal{U})} \mathcal{R}(Q, V).$$

3. Authentication for Compound Sources

Let $\mathcal{X}$ and $\mathcal{Y}$ be two finite sets and $\mathcal{S}$ a finite state set. Let $(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n$ be a sequence pair of length $n \in \mathbb{N}$. For every $s \in \mathcal{S}$, the discrete memoryless joint-source is given by the joint probability distribution $Q_s^n(x^n, y^n) \triangleq \prod_{i=1}^n Q_s(x_i, y_i) = \prod_{i=1}^n p_s(x_i) W_s(y_i|x_i)$, with $p_s \in \mathcal{P}(\mathcal{X})$ a marginal distribution on $\mathcal{X}$ and $W_s: \mathcal{X} \to \mathcal{P}(\mathcal{Y})$ a stochastic matrix.
Definition 7.
The discrete memoryless compound joint-source $\mathcal{Q}_{XY}$ is given by the family of joint probability distributions on $\mathcal{X} \times \mathcal{Y}$ as
$$\mathcal{Q}_{XY} \triangleq \{Q_s \in \mathcal{P}(\mathcal{X} \times \mathcal{Y}) : s \in \mathcal{S}\}.$$
We define the finite set of marginal distributions $\mathcal{Q}_X$ over the alphabet $\mathcal{X}$ induced by the compound joint-source $\mathcal{Q}_{XY}$ as
$$\mathcal{Q}_X \triangleq \Big\{p_s \in \mathcal{P}(\mathcal{X}) : s \in \mathcal{S},\ p_s(x) = \sum_{y \in \mathcal{Y}} Q_s(x, y) \text{ for every } x \in \mathcal{X} \text{ and } Q_s \in \mathcal{Q}_{XY}\Big\}.$$
We define $\mathcal{L}$ as the index set of $\mathcal{Q}_X$. Note that $|\mathcal{L}| = |\mathcal{Q}_X| \le |\mathcal{Q}_{XY}|$.
For every $\ell \in \mathcal{L}$, we define the subset of the compound joint-source $\mathcal{Q}_{XY}$ with the same marginal distribution $p_\ell$ as
$$\mathcal{Q}_{XY,\ell} \triangleq \{Q_s \in \mathcal{Q}_{XY} : Q_s(x, y) = p_\ell(x) W_s(y|x) \text{ for every } (x, y) \in \mathcal{X} \times \mathcal{Y}\}.$$
For every $\ell \in \mathcal{L}$, we define the index set $\mathcal{S}_\ell$ of $\mathcal{Q}_{XY,\ell}$ as
$$\mathcal{S}_\ell \triangleq \{s \in \mathcal{S} : Q_s \in \mathcal{Q}_{XY,\ell}\}.$$
Remark 5.
Note that, for every $\ell, \ell' \in \mathcal{L}$ with $\ell \neq \ell'$, it holds that $\mathcal{Q}_{XY,\ell} \cap \mathcal{Q}_{XY,\ell'} = \emptyset$, $\mathcal{S}_\ell \cap \mathcal{S}_{\ell'} = \emptyset$, $\mathcal{S} = \bigcup_{\ell \in \mathcal{L}} \mathcal{S}_\ell$ and $\mathcal{Q}_{XY} = \bigcup_{\ell \in \mathcal{L}} \mathcal{Q}_{XY,\ell}$.

3.1. Compound Generated Secret Key Model

In this section, we study generated secret key authentication for finite compound joint-sources, a special class of sources that models limited SSI, as shown in Figure 3.
We consider a block-processing of arbitrary but fixed length $n$. Let $\mathcal{M} \triangleq \{1, \ldots, M_n\}$ be the helper data set and $\mathcal{K} \triangleq \{1, \ldots, K_n\}$ the secret key set.
Definition 8.
An $(n, M_n, K_n)$-code for generated secret key authentication for the compound joint-source $\mathcal{Q}_{XY} \subseteq \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ consists of an encoder $f$ at the enrollment terminal with
$$f: \mathcal{X}^n \to \mathcal{K} \times \mathcal{M}$$
and a decoder $\varphi$ at the authentication terminal
$$\varphi: \mathcal{Y}^n \times \mathcal{M} \to \mathcal{K}.$$
Definition 9.
A privacy secrecy rate pair $(R_{PL}, R_K) \in \mathbb{R}_+^2$ is called achievable for generated secret key authentication for the compound joint-source $\mathcal{Q}_{XY}$ if, for any $\delta > 0$, there exist an $n(\delta) \in \mathbb{N}$ and a sequence of $(n, M_n, K_n)$-codes such that, for all $n \ge n(\delta)$ and for every $s \in \mathcal{S}$, we have
$$\Pr\{\hat{K} \neq K\} \le \delta, \qquad \frac{1}{n} H(K) + \delta \ge \frac{1}{n} \log K_n \ge R_K - \delta, \qquad I(K; M) \le \delta, \qquad \frac{1}{n} I(X_s^n; M) \le R_{PL} + \delta.$$
Consider the compound joint-source $\mathcal{Q}_{XY}$. For a fixed $\ell \in \mathcal{L}$, $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$ and for every $s \in \mathcal{S}_\ell$, we define the region $\mathcal{R}(V, \ell, s)$ as the set of all $(R_{PL}, R_K) \in \mathbb{R}_+^2$ that satisfy
$$R_K \le I(U; Y_s), \qquad R_{PL} \ge I(U; X_\ell) - I(U; Y_s),$$
with $P_{UXY,s}(u, x, y) = V(u|x) Q_s(x, y)$.
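Since $\mathcal{R}(V, \ell, s)$ is determined by its corner point $\big(I(U;X_\ell) - I(U;Y_s),\, I(U;Y_s)\big)$, it can be evaluated numerically for given distributions. The following minimal sketch does this for hypothetical binary alphabets and an arbitrary test channel $V$ (all numbers are illustrative, not taken from the paper):

```python
import numpy as np

def mutual_info(PAB):
    """I(A;B) in bits from a joint pmf matrix PAB[a, b]."""
    PA, PB = PAB.sum(1, keepdims=True), PAB.sum(0, keepdims=True)
    mask = PAB > 0
    return float((PAB[mask] * np.log2(PAB[mask] / (PA @ PB)[mask])).sum())

# Hypothetical ingredients: marginal p_l, test channel V(u|x), source channel W_s(y|x)
p = np.array([0.5, 0.5])
V = np.array([[0.9, 0.1], [0.2, 0.8]])   # rows indexed by x, columns by u
W = np.array([[0.8, 0.2], [0.3, 0.7]])   # rows indexed by x, columns by y

P_UX = (p[:, None] * V).T                # P(u, x) = p(x) V(u|x)
P_UY = P_UX @ W                          # P(u, y) = sum_x P(u, x) W(y|x)

R_K_max = mutual_info(P_UY)                        # corner point I(U; Y_s)
R_PL_min = mutual_info(P_UX) - mutual_info(P_UY)   # corner point I(U; X_l) - I(U; Y_s)
print(R_K_max, R_PL_min)
```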
Theorem 3.
The privacy secrecy capacity region for generated secret key authentication for the compound joint-source $\mathcal{Q}_{XY}$ is given by
$$\mathcal{C}_G(\mathcal{Q}_{XY}) = \bigcap_{\ell \in \mathcal{L}} \ \bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_\ell|}} \ \bigcap_{s \in \mathcal{S}_\ell} \mathcal{R}(V, \ell, s).$$
Proof. 
The proof of Theorem 3 consists of two parts: achievability and converse. The achievability scheme uses the following protocol:
  • Estimate the marginal distribution $\hat{p} \in \mathcal{Q}_X$ from the observed sequence $X^n$ at the enrollment terminal via hypothesis testing.
  • Compute the key $K$ and helper data $M'$ based on $X^n$, a common sequence $T = U^n$ shared by the enrollment and authentication terminals, and an extractor function $g: \{0,1\}^N \times \{0,1\}^d \to \{0,1\}^k$ with $N, d, k \in \mathbb{N}$, whose inputs are the shared sequence $T$ and a sequence $U^d$ of $d$ uniformly distributed bits. The helper data $M'$ is equivalent to the helper data for the case with perfect SSI. The extended helper data in this case also contains the index of the estimated marginal distribution and the uniformly distributed bit sequence, i.e., $M = (M', \hat{L}, U^d)$.
  • Store the extended helper data $M$ in the public database.
  • Estimate the key $\hat{K}$ at the authentication terminal based on the observations $M$ and $Y^n$, the latter of which can be seen as the output of one of the channels in $\mathcal{W}_{\hat{\ell}} \triangleq \{W_s: \mathcal{X} \to \mathcal{P}(\mathcal{Y}) : s \in \mathcal{S}_{\hat{\ell}}\}$.
A detailed proof can be found in Appendix A. □
Remark 6.
Note that authentication for the compound source model is a generalization of the models studied in [11,12], which correspond to $|\mathcal{S}| = 1$. Furthermore, one can see that, for $|\mathcal{S}| = 1$, the capacity region under the strong secrecy condition equals the capacity region under the weak secrecy condition shown in [11,12].
Remark 7.
As already mentioned, we aim for strong secrecy, i.e., in contrast to the weak secrecy constraint in (1c), we now require the un-normalized mutual information between the key and the helper data to be negligibly small. It would be ideal to show perfect secrecy and a perfectly uniform key, i.e., $I(K; M) = 0$ and $H(K) = \log K_n$. It would be interesting to see how this constraint affects the achievable rate region. We suspect that the achievable rate region under perfect secrecy and a perfectly uniform key remains the same as in Theorem 3.
Remark 8.
From the protocol, note that, once we have estimated the marginal distribution $\hat{p} \in \mathcal{Q}_X$, we deal with a compound channel model without channel state information (CSI) at the transmitter (see [24]).
Remark 9.
The order of the set operations in the capacity region reflects the fact that the marginal distribution is estimated first. This can be seen as partial state information, where the marginal distribution over $\mathcal{X}$ is known.

3.2. Compound Chosen Secret Key Model

In this section, we study chosen secret key authentication for finite compound joint-sources (see Figure 4).
We consider an $(n, M_n, K_n)$-code of arbitrary but fixed length $n$.
Definition 10.
A privacy secrecy rate pair $(R_{PL}, R_K) \in \mathbb{R}_+^2$ is called achievable for chosen secret key authentication for the compound joint-source $\mathcal{Q}_{XY}$ if, for any $\delta > 0$, there exist an $n(\delta) \in \mathbb{N}$ and a sequence of $(n, M_n, K_n)$-codes such that, for all $n \ge n(\delta)$ and for every $s \in \mathcal{S}$, we have
$$\Pr\{\hat{K} \neq K\} \le \delta,$$
$$\frac{1}{n} \log K_n \ge R_K - \delta,$$
$$I(K; M) \le \delta,$$
$$\frac{1}{n} I(X_s^n; M) \le R_{PL} + \delta.$$
Consider the compound joint-source $\mathcal{Q}_{XY}$. For a fixed $\ell \in \mathcal{L}$, $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$ and for every $s \in \mathcal{S}_\ell$, we define the region $\mathcal{R}(V, \ell, s)$ as the set of all $(R_{PL}, R_K) \in \mathbb{R}_+^2$ that satisfy
$$R_K \le I(U; Y_s), \qquad R_{PL} \ge I(U; X_\ell) - I(U; Y_s),$$
with $P_{UXY,s}(u, x, y) = V(u|x) Q_s(x, y)$.
Theorem 4.
The privacy secrecy capacity region for chosen secret key authentication for the compound joint-source $\mathcal{Q}_{XY}$ is given by
$$\mathcal{C}_C(\mathcal{Q}_{XY}) = \bigcap_{\ell \in \mathcal{L}} \ \bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_\ell|}} \ \bigcap_{s \in \mathcal{S}_\ell} \mathcal{R}(V, \ell, s).$$
Proof. 
The proof can be found in Appendix B. □
Remark 10.
Note that, as for generated secret key authentication for compound sources, chosen secret key authentication for compound sources is a generalization of the model studied in [11]. Furthermore, for perfect SSI, one can see that the capacity region under the strong secrecy condition equals the capacity region under the weak secrecy condition shown in [11].
Remark 11.
Note that the privacy secrecy capacity region for the generated key model equals the privacy secrecy capacity region for chosen secret key authentication, i.e., $\mathcal{C}_G(\mathcal{Q}_{XY}) = \mathcal{C}_C(\mathcal{Q}_{XY})$.

4. Continuity of the Privacy Secrecy Capacity Region for Compound Sources

We are interested in studying how small variations in the compound source affect the privacy secrecy capacity region. Whether the capacity or capacity region is a continuous function of a source or channel is not always clear, especially if the source or channel is complicated. In [22], one can find an example of an AVWC whose uncertainty set consists of only two channels and whose unassisted secrecy capacity already exhibits discontinuity points. For a detailed discussion, see [25]. In this section, we study the continuity of the privacy secrecy capacity region for compound sources. For this purpose, we introduce distances between compound sources and between capacity regions, respectively.

4.1. Distance between Compound Sources

Definition 11.
Let $\mathcal{Q}_{XY,1}$ and $\mathcal{Q}_{XY,2}$ be two compound sources. We define
$$d_1(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}) = \max_{s_2 \in \mathcal{S}_2} \min_{s_1 \in \mathcal{S}_1} \|Q_{s_1} - Q_{s_2}\|_{TV}, \qquad d_2(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}) = \max_{s_1 \in \mathcal{S}_1} \min_{s_2 \in \mathcal{S}_2} \|Q_{s_1} - Q_{s_2}\|_{TV}.$$
The Hausdorff distance $D_H(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2})$ between $\mathcal{Q}_{XY,1}$ and $\mathcal{Q}_{XY,2}$ is defined as
$$D_H(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}) = \max\big\{d_1(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}),\, d_2(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2})\big\}.$$
Definition 12.
Let $\mathcal{R}_1$ and $\mathcal{R}_2$ be two non-empty subsets of the metric space $(\mathbb{R}^2, d)$ with $d(x, y) = \sqrt{\sum_{i=1}^2 |x_i - y_i|^2}$ for all $x, y \in \mathbb{R}^2$. We define the distance between two sets as
$$D_R(\mathcal{R}_1, \mathcal{R}_2) = \max\Big\{\max_{r_1 \in \mathcal{R}_1} \min_{r_2 \in \mathcal{R}_2} d(r_1, r_2),\ \max_{r_2 \in \mathcal{R}_2} \min_{r_1 \in \mathcal{R}_1} d(r_1, r_2)\Big\}.$$
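Both $D_H$ and $D_R$ are Hausdorff distances, differing only in the underlying point distance. A minimal sketch computing them for toy finite sets (the compound sources and rate points below are hypothetical; for continuous regions, the finite point sets serve only as an approximation):

```python
import numpy as np

def hausdorff(A, B, dist):
    """Hausdorff distance between two finite non-empty sets under 'dist'."""
    d1 = max(min(dist(a, b) for a in A) for b in B)
    d2 = max(min(dist(a, b) for b in B) for a in A)
    return max(d1, d2)

tv = lambda P, Q: np.abs(P - Q).sum()                            # TV distance on joint pmfs
euc = lambda r1, r2: float(np.linalg.norm(np.subtract(r1, r2)))  # metric d on R^2

# D_H between two toy compound sources (sets of 2x2 joint pmfs)
Q1 = [np.array([[0.4, 0.1], [0.1, 0.4]]), np.array([[0.3, 0.2], [0.2, 0.3]])]
Q2 = [np.array([[0.38, 0.12], [0.12, 0.38]])]
print(hausdorff(Q1, Q2, tv))

# D_R between two finite sets of rate pairs (R_PL, R_K)
R1 = [(0.2, 0.5), (0.3, 0.4)]
R2 = [(0.25, 0.45)]
print(hausdorff(R1, R2, euc))
```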

4.2. Continuity of the Privacy Secrecy Capacity Region

Theorem 5.
Let $\epsilon \in (0, 1)$ and $n \in \mathbb{N}$. Let $\mathcal{Q}_{XY,1}$ and $\mathcal{Q}_{XY,2}$ be two compound sources. If
$$D_H(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}) \le \epsilon,$$
then it holds that
$$D_R\big(\mathcal{C}_G(\mathcal{Q}_{XY,1}), \mathcal{C}_G(\mathcal{Q}_{XY,2})\big) \le \delta(\epsilon, |\mathcal{X}|, |\mathcal{Y}|)$$
with $\delta(\epsilon) = \sqrt{\delta_1(\epsilon)^2 + \delta_2(\epsilon)^2}$, where $\delta_1(\epsilon) = 2\epsilon \log |\mathcal{Y}| + 2 H_2(\epsilon) - \epsilon \log \frac{\epsilon}{|\mathcal{U}|}$ and $\delta_2(\epsilon) = 2\epsilon \log |\mathcal{Y}||\mathcal{X}| + 4 H_2(\epsilon) - 2\epsilon \log \frac{\epsilon}{|\mathcal{U}|}$.
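Assuming the form of $\delta_1(\epsilon)$ and $\delta_2(\epsilon)$ as reconstructed above, one can check numerically that $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ (all alphabet sizes below are illustrative):

```python
import numpy as np

def delta_bound(eps, X, Y, U):
    """delta(eps) from Theorem 5 (logs in bits; alphabet sizes are placeholders)."""
    H2 = -eps * np.log2(eps) - (1 - eps) * np.log2(1 - eps)  # binary entropy H_2(eps)
    d1 = 2 * eps * np.log2(Y) + 2 * H2 - eps * np.log2(eps / U)
    d2 = 2 * eps * np.log2(Y * X) + 4 * H2 - 2 * eps * np.log2(eps / U)
    return np.hypot(d1, d2)  # sqrt(d1^2 + d2^2)

for eps in (0.1, 0.01, 0.001):
    print(eps, delta_bound(eps, X=4, Y=4, U=6))  # decreases towards 0 as eps -> 0
```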
Remark 12.
Note that, since the privacy secrecy capacity region for the generated secret key equals the privacy secrecy capacity region for the chosen secret key, the continuity behaviour also holds for the chosen secret key privacy secrecy capacity region.
Remark 13.
This theorem shows that the privacy secrecy capacity region is a continuous function of the uncertainty set. In other words, small variations of the uncertainty set lead to small variations in the capacity region.
Proof. 
A detailed proof can be found in Appendix C. □
Remark 14.
A complete characterisation of the discontinuity behaviour of the AVC capacity under list decoding can be found in [26]. Note that, by Theorem 5, this behaviour cannot occur here.

5. Conclusions

In this paper, we considered a biometric authentication model in the presence of source uncertainty. In particular, we studied a model where the actual source realization is not known; however, it belongs to a known source set: this is the finite compound source model. We have shown that biometric authentication is robust against source uncertainty and certain classes of attacks. In other words, reliable and secure authentication is possible at positive key rates. We further characterized the minimum privacy leakage rate under source uncertainty. For future work, perfect secrecy for the biometric authentication model and compound sources with infinitely many states are of great interest.

Acknowledgments

The authors would like to thank Sebastian Baur for insightful discussions. This work was supported by the Gottfried Wilhelm Leibniz Programme of the German Research Foundation (DFG) under Grant BO 1734/20-1, Grant BO 1734/24-1 and Grant BO 1734/25-1.

Author Contributions

Andrea Grigorescu, Holger Boche and Rafael Schaefer conceived this study and derived the results. Andrea Grigorescu wrote the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 3

Appendix A.1. Achievability of Theorem 3

Appendix A.1.1. State Estimation

We first show that we can estimate the marginal distribution $\hat{p} \in \mathcal{Q}_X$ correctly with probability approaching one. Then, for every $\ell = \hat{\ell} \in \mathcal{L}$, we use the random coding argument to show that all rate pairs $(R_{PL}, R_K) \in \mathcal{R}(V, \ell, s)$ are achievable.
To estimate the actual source realization, we perform hypothesis testing. The set of hypotheses is the finite set of marginal distributions $\mathcal{Q}_X$. For every $\ell \in \mathcal{L}$, we define
$$\delta_\ell = \frac{1}{2} \min_{\ell' \in \mathcal{L}, \ell' \neq \ell} \|p_\ell - p_{\ell'}\|_{TV}.$$
We choose $0 < \delta' < \min_{\ell \in \mathcal{L}} \delta_\ell$ and consider the test set (set of typical sequences) $T_{p_\ell, \delta'}^n \triangleq \{x^n \in \mathcal{X}^n : \|p_{x^n} - p_\ell\|_{TV} \le \delta'\}$. Note that, for every $\ell, \ell' \in \mathcal{L}$ with $\ell \neq \ell'$, we have $T_{p_\ell, \delta'}^n \cap T_{p_{\ell'}, \delta'}^n = \emptyset$. We show this by arbitrarily choosing a sequence $x^n \in T_{p_\ell, \delta'}^n$ of type $p_{x^n}$ and showing that $\|p_{\ell'} - p_{x^n}\|_{TV} > \delta'$ for $\ell' \neq \ell$. By the triangle inequality, we have
$$\|p_\ell - p_{\ell'}\|_{TV} = \|p_\ell - p_{\ell'} + p_{x^n} - p_{x^n}\|_{TV} \le \|p_\ell - p_{x^n}\|_{TV} + \|p_{x^n} - p_{\ell'}\|_{TV}.$$
Hence,
$$\|p_{x^n} - p_{\ell'}\|_{TV} \ge \|p_\ell - p_{\ell'}\|_{TV} - \|p_\ell - p_{x^n}\|_{TV} \ge 2\delta_\ell - \delta' > \delta',$$
proving the disjointness of the sets.
The test function is the indicator function $\mathbb{1}[x^n \in T_{p_\ell, \delta'}^n]$, i.e., after observing $x^n$, the test looks for the hypothesis $\hat{p} = p_\ell$ for which $\mathbb{1}[x^n \in T_{p_\ell, \delta'}^n] = 1$.
An error occurs if the sequence $x^n$ was generated by the source $p_\ell$ for some $\ell \in \mathcal{L}$ but $x^n \notin T_{p_\ell, \delta'}^n$. This implies that either $x^n \notin \bigcup_{\ell' \in \mathcal{L}} T_{p_{\ell'}, \delta'}^n$ or $x^n \in T_{p_{\ell'}, \delta'}^n$ with $\ell' \neq \ell$. Using Lemma 2.12 in [27], we upper bound the probability of this error event by
$$p_\ell^n\big((T_{p_\ell, \delta'}^n)^c\big) \le \epsilon_{\delta'}(n, |\mathcal{X}|), \tag{A1}$$
where $\epsilon_{\delta'}(n, |\mathcal{X}|) = (n+1)^{|\mathcal{X}|}\, 2^{-n c \delta'^2}$. Letting $n \to \infty$, the right-hand side of (A1) tends to zero.
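A minimal sketch of this typicality-based hypothesis test (the two candidate marginals and all parameters are hypothetical; with the separation condition on $\delta'$, the typical sets are disjoint and the test picks the unique matching hypothesis with probability approaching one):

```python
import numpy as np
rng = np.random.default_rng(0)

def empirical(xs, alphabet_size):
    """Empirical distribution (type) of the sequence xs."""
    return np.bincount(xs, minlength=alphabet_size) / len(xs)

# Hypothesis set Q_X: two candidate marginals on a 3-letter alphabet
marginals = [np.array([0.6, 0.3, 0.1]), np.array([0.2, 0.3, 0.5])]
min_tv = min(np.abs(p - q).sum() for i, p in enumerate(marginals)
             for q in marginals[i + 1:])
delta_prime = 0.25 * min_tv   # any value in (0, min_tv / 2) works

n = 2000
x = rng.choice(3, size=n, p=marginals[0])  # true source: index 0
p_hat = empirical(x, 3)

# Accept the (unique) hypothesis whose typical set contains x^n;
# .index(True) would fail only in the unlikely atypical event.
decisions = [np.abs(p_hat - p).sum() <= delta_prime for p in marginals]
print(decisions.index(True))  # -> 0 with probability -> 1 as n grows
```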

Appendix A.1.2. Code Construction

For each L , we consider the auxiliary random variable U and the channel V and construct a code for which we analyze the decoding error, secrecy and privacy condition.
Generate $2^{n(R_K + R_M)}$ codewords $U_{k,m}^n$ with $k \in \mathcal{K} \triangleq \{1, \ldots, 2^{nR_K}\}$ and $m \in \mathcal{M} \triangleq \{1, \ldots, 2^{nR_M}\}$ by choosing each symbol $U_i^{k,m}$ of the codebook independently at random according to $p_{u_\ell} \in \mathcal{P}(\mathcal{U})$, computed from $p_\ell(x) V(u|x)$ for every $(x, u) \in \mathcal{X} \times \mathcal{U}$. We denote the codebook by $\tilde{\mathcal{U}} = \{U_{k,m}^n\}_{(k,m) \in \mathcal{K} \times \mathcal{M}}$.
For every $\ell \in \mathcal{L}$ and every $s \in \mathcal{S}_\ell$, we define the following channels $\Sigma_\ell^X: \mathcal{U} \to \mathcal{P}(\mathcal{X})$, $\Sigma_s^Y: \mathcal{U} \to \mathcal{P}(\mathcal{Y})$ and $\Sigma_s^{XY}: \mathcal{U} \to \mathcal{P}(\mathcal{X} \times \mathcal{Y})$ that satisfy
$$\Sigma_\ell^X(x|u) = \frac{p_\ell(x) V(u|x)}{\sum_{x' \in \mathcal{X}} p_\ell(x') V(u|x')}, \qquad \Sigma_s^Y(y|u) = \frac{\sum_{x \in \mathcal{X}} V(u|x) Q_s(x, y)}{\sum_{(x', y') \in \mathcal{X} \times \mathcal{Y}} V(u|x') Q_s(x', y')}, \qquad \Sigma_s^{XY}(x, y|u) = \frac{V(u|x) Q_s(x, y)}{\sum_{(x', y') \in \mathcal{X} \times \mathcal{Y}} V(u|x') Q_s(x', y')},$$
for every $(u, x, y) \in \mathcal{U} \times \mathcal{X} \times \mathcal{Y}$.
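These reverse channels are just Bayes' rule applied to the effective joint distribution $p_\ell(x) V(u|x)$. A minimal sketch for $\Sigma_\ell^X$ with hypothetical small alphabets (all numbers illustrative):

```python
import numpy as np

# Hypothetical alphabets: |X| = 2, |U| = 3
p = np.array([0.7, 0.3])                 # marginal p_l on X
V = np.array([[0.5, 0.3, 0.2],           # V(u|x): rows indexed by x, columns by u
              [0.1, 0.4, 0.5]])

p_u = p @ V                              # p_{u_l}(u) = sum_x p_l(x) V(u|x)
# Reverse channel Sigma_X(x|u) = p_l(x) V(u|x) / p_{u_l}(u)  (Bayes' rule)
Sigma_X = (p[:, None] * V) / p_u[None, :]
assert np.allclose(Sigma_X.sum(axis=0), 1.0)  # each column is a pmf on X
```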

Appendix A.1.3. Encoding Sets

For every $(k, m, \ell) \in \mathcal{K} \times \mathcal{M} \times \mathcal{L}$, we define the encoding sets $E_{k,m,\ell}(\tilde{\mathcal{U}}) \subseteq \mathcal{X}^n$ as follows:
$$E_{k,m,\ell}(\tilde{\mathcal{U}}) = T_{\Sigma_\ell^X, \delta''}^n(U_{k,m}^n),$$
with $\delta'' > \delta' |\mathcal{U}|$.
Remark 15.
Note that, by the definition of $\delta''$ and Lemma 2.10 in [27], if $U_{k,m}^n \in T_{p_{u_\ell}, \delta'''}^n$ with $\delta''' = \delta'' |\mathcal{U}|$ and $x^n \in T_{\Sigma_\ell^X, \delta''}^n(U_{k,m}^n)$, then $x^n \in T_{p_\ell, \delta'}^n$.

Appendix A.1.4. Decoding Sets

For every $(k, m, \ell) \in \mathcal{K} \times \mathcal{M} \times \mathcal{L}$, we define the decoding sets $D_k(m(\tilde{\mathcal{U}}), \ell) \subseteq \mathcal{Y}^n$ as follows:
$$D'_k(m(\tilde{\mathcal{U}}), \ell) \triangleq \bigcup_{s \in \mathcal{S}_{\hat{\ell}}} T_{\Sigma_s^Y, \delta''}^n(U_{k,m}^n), \qquad D_k(m(\tilde{\mathcal{U}}), \ell) \triangleq D'_k(m(\tilde{\mathcal{U}}), \ell) \cap \bigcap_{\substack{k' \in \mathcal{K} \\ k' \neq k}} D'_{k'}(m(\tilde{\mathcal{U}}), \ell)^c,$$
with $\delta'' > \delta' |\mathcal{U}|$.
Remark 16.
One could consider sending some bits of the sequence $X^n$ over the public channel so that the user at the authentication terminal can estimate the actual source realization and thus avoid the complicated decoding strategy. However, this approach would violate the strong secrecy condition.

Appendix A.1.5. Encoder–Decoder Pair Sets

For every $(k, m) \in \mathcal{K} \times \mathcal{M}$, we define the encoder–decoder pair set $C_{k,m,\ell}(\tilde{\mathcal{U}}) \subseteq \mathcal{X}^n \times \mathcal{Y}^n$ as follows:
$$C_{k,m,\ell}(\tilde{\mathcal{U}}) = \big(E_{k,m,\ell}(\tilde{\mathcal{U}}) \times D_k(m(\tilde{\mathcal{U}}), \ell)\big) \cap \bigcup_{s \in \mathcal{S}_{\hat{\ell}}} T_{\Sigma_s^{XY}, \tilde{\delta}}^n(U_{k,m}^n),$$
with $\tilde{\delta} > 0$.

Appendix A.1.6. Error Analysis

For every $\ell \in \mathcal{L}$, assume that the marginal distribution was estimated correctly, i.e., $\hat{\ell} = \ell$. We analyze the probability of each error event separately. We denote the error at the enrollment terminal given the codebook $\tilde{\mathcal{U}}$ by $\epsilon_{E,n}(\tilde{\mathcal{U}})$. An error occurs at the enrollment terminal if the observed sequence $x^n$ does not belong to $E_{k,m,\ell}(\tilde{\mathcal{U}})$ for any $(k, m) \in \mathcal{K} \times \mathcal{M}$, i.e.,
$$\epsilon_{E,n}(\tilde{\mathcal{U}}) = p_\ell^n\Big(\bigcap_{(k,m) \in \mathcal{K} \times \mathcal{M}} E_{k,m,\ell}(\tilde{\mathcal{U}})^c\Big) = \prod_{(k,m) \in \mathcal{K} \times \mathcal{M}} \big(1 - p_\ell^n(E_{k,m,\ell}(\tilde{\mathcal{U}}))\big).$$
Averaging over all codebooks, from the independence of the random variables involved and from Lemma 2.13 in [27], we have
$$\mathbb{E}_{\tilde{\mathcal{U}}}\big(\epsilon_{E,n}(\tilde{\mathcal{U}})\big) = \prod_{(k,m) \in \mathcal{K} \times \mathcal{M}} \mathbb{E}_{U_{k,m}^n}\big(1 - p_\ell^n(T_{\Sigma_\ell^X, \delta''}^n(U_{k,m}^n))\big) \le \Big[1 - (n+1)^{-|\mathcal{U}||\mathcal{X}|}\, 2^{-n(I(U; X_\ell) + \psi(\delta'', |\mathcal{U}||\mathcal{X}|))}\Big]^{2^{n(R_K + R_M)}} \le \exp\Big(-(n+1)^{-|\mathcal{U}||\mathcal{X}|}\, 2^{n(R_K + R_M - I(U; X_\ell) - \psi(\delta'', |\mathcal{U}||\mathcal{X}|))}\Big). \tag{A2}$$
The inequality (A2) follows from $(1-x)^r \le \exp(-rx)$, which holds for every $x \in [0,1]$ and $r > 0$. Letting $n \to \infty$ and choosing
$$R_K + R_M > I(U; X_\ell) + \psi(\delta'', |\mathcal{U}||\mathcal{X}|), \tag{A3}$$
the right-hand side of (A2) goes to zero doubly exponentially fast. An error occurs at the authentication terminal when $(k, m)$ was encoded at the enrollment terminal but $k' \neq k$ was decoded at the authentication terminal. The set of joint observations describing this event is given by
$$CE_{k,m,\ell}(\tilde{\mathcal{U}})^c = C_{k,m,\ell}(\tilde{\mathcal{U}})^c \cap \big(E_{k,m,\ell}(\tilde{\mathcal{U}}) \times \mathcal{Y}^n\big) \subseteq \big(E_{k,m,\ell}(\tilde{\mathcal{U}}) \times D_k(m(\tilde{\mathcal{U}}), \ell)^c\big) \cup \bigcap_{s \in \mathcal{S}_\ell} T_{\Sigma_s^{XY}, \tilde{\delta}}^n(U_{k,m}^n)^c.$$
We denote the error probability of this event given the codebook $\tilde{\mathcal{U}}$ for each correlated source $Q_t$ with $t \in \mathcal{S}_\ell$ by $\epsilon_{n,k}^t(\tilde{\mathcal{U}})$. We have
$$\begin{aligned}
\epsilon_{n,k}^t(\tilde{\mathcal{U}}) &= \Sigma_t^{XY,n}\big(CE_{k,m,\ell}(\tilde{\mathcal{U}})^c \mid U_{k,m}^n\big) \\
&\le \Sigma_t^{XY,n}\big(E_{k,m,\ell}(\tilde{\mathcal{U}}) \times D_k(m(\tilde{\mathcal{U}}), \ell)^c \mid U_{k,m}^n\big) + \Sigma_t^{XY,n}\Big(\bigcap_{s \in \mathcal{S}_\ell} T_{\Sigma_s^{XY}, \tilde{\delta}}^n(U_{k,m}^n)^c \mid U_{k,m}^n\Big) \\
&\le \Sigma_t^{Y,n}\big(D'_k(m(\tilde{\mathcal{U}}), \ell)^c \mid U_{k,m}^n\big) + \Sigma_t^{Y,n}\Big(\bigcup_{\substack{k' \in \mathcal{K} \\ k' \neq k}} D'_{k'}(m(\tilde{\mathcal{U}}), \ell) \mid U_{k,m}^n\Big) + \Sigma_t^{XY,n}\big(T_{\Sigma_t^{XY}, \tilde{\delta}}^n(U_{k,m}^n)^c \mid U_{k,m}^n\big) \\
&\le \Sigma_t^{Y,n}\big(T_{\Sigma_t^Y, \delta''}^n(U_{k,m}^n)^c \mid U_{k,m}^n\big) + \sum_{s \in \mathcal{S}_\ell} \sum_{\substack{k' \in \mathcal{K} \\ k' \neq k}} \Sigma_t^{Y,n}\big(T_{\Sigma_s^Y, \delta''}^n(U_{k',m}^n) \mid U_{k,m}^n\big) + \Sigma_t^{XY,n}\big(T_{\Sigma_t^{XY}, \tilde{\delta}}^n(U_{k,m}^n)^c \mid U_{k,m}^n\big).
\end{aligned}$$
Averaging over all codebooks and applying Lemma 2.12 in [27], we have
$$\mathbb{E}_{\tilde{\mathcal{U}}}\big(\epsilon_{n,k}^t(\tilde{\mathcal{U}})\big) \le \epsilon_{\delta''}(n, |\mathcal{U}||\mathcal{Y}|) + \epsilon_{\tilde{\delta}}(n, |\mathcal{U}||\mathcal{X}||\mathcal{Y}|) + \sum_{s \in \mathcal{S}_\ell} \sum_{\substack{k' \in \mathcal{K} \\ k' \neq k}} \mathbb{E}_{U_{k,m}^n} \mathbb{E}_{U_{k',m}^n}\, \Sigma_t^{Y,n}\big(T_{\Sigma_s^Y, \delta''}^n(U_{k',m}^n) \mid U_{k,m}^n\big),$$
with $\epsilon_{\delta''}(n, |\mathcal{U}||\mathcal{Y}|) = (n+1)^{|\mathcal{U}||\mathcal{Y}|}\, 2^{-n c \delta''^2}$ and $\epsilon_{\tilde{\delta}}(n, |\mathcal{U}||\mathcal{X}||\mathcal{Y}|) = (n+1)^{|\mathcal{U}||\mathcal{X}||\mathcal{Y}|}\, 2^{-n c \tilde{\delta}^2}$.
For $k' \neq k$, applying Lemma 3.3 in [28], we can bound the inner expectation by
$$\mathbb{E}_{U_{k',m}^n}\, \Sigma_t^{Y,n}\big(T_{\Sigma_s^Y, \delta''}^n(U_{k',m}^n) \mid U_{k,m}^n\big) \le \frac{p_{Y,t}^n\big(T_{\Sigma_s^Y, \delta''}^n(U_{k',m}^n)\big)}{p_{u_\ell}^n\big(T_{p_{u_\ell}, \delta'''}^n\big)},$$
with $\delta''' = \delta'' |\mathcal{U}|$, since $U_{k,m}^n \in T_{p_{u_\ell}, \delta'''}^n$ with probability one. For any $t, s \in \mathcal{S}_\ell$, we have
$$\mathbb{E}_{U_{k',m}^n}\, \Sigma_t^{Y,n}\big(T_{\Sigma_s^Y, \delta''}^n(U_{k',m}^n) \mid U_{k,m}^n\big) \le \frac{(n+1)^{|\mathcal{U}||\mathcal{Y}|}}{1 - \epsilon_{\delta'''}(n, |\mathcal{U}|)}\, 2^{-n(I(U; Y_s) - \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|))}.$$
For every $t, s \in \mathcal{S}_\ell$ and every $k \in \mathcal{K}$, we have
$$\mathbb{E}_{\tilde{\mathcal{U}}}\big(\epsilon_{n,k}^t(\tilde{\mathcal{U}}) \mid U_{k,m}^n\big) \le \epsilon_{\delta''}(n, |\mathcal{U}||\mathcal{Y}|) + \epsilon_{\tilde{\delta}}(n, |\mathcal{U}||\mathcal{X}||\mathcal{Y}|) + \frac{(n+1)^{|\mathcal{U}||\mathcal{Y}|}}{1 - \epsilon_{\delta'''}(n, |\mathcal{U}|)}\, |\mathcal{S}_\ell|\, 2^{-n(\min_{s \in \mathcal{S}_\ell} I(U; Y_s) - R_K - \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|))}.$$
There is an $n(\delta'', \delta''', \tilde{\delta}, |\mathcal{U}|, |\mathcal{X}|, |\mathcal{Y}|)$ such that, for all $n > n(\delta'', \delta''', \tilde{\delta}, |\mathcal{U}|, |\mathcal{X}|, |\mathcal{Y}|)$, we have
$$\mathbb{E}_{\tilde{\mathcal{U}}}\big(\epsilon_{n,k}^t(\tilde{\mathcal{U}}) \mid U_{k,m}^n\big) \le |\mathcal{S}_\ell|\, 2^{-n(\min_{s \in \mathcal{S}_\ell} I(U; Y_s) - R_K - \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|))} \tag{A4}$$
for all $k \in \mathcal{K}$. By choosing
$$R_K < \min_{s \in \mathcal{S}_\ell} I(U; Y_s) - \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) \tag{A5}$$
and letting $n \to \infty$, the right-hand side of (A4) tends to zero. Considering (A5) and (A3), the helper data rate is lower bounded by
$$R_M > I(U; X_\ell) - \min_{s \in \mathcal{S}_\ell} I(U; Y_s) + \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) + \psi(\delta'', |\mathcal{U}|, |\mathcal{X}|). \tag{A6}$$

Appendix A.1.7. Key Distribution

Besides reliability, a privacy secrecy rate pair has to fulfill three other conditions. One of them is that the secret key distribution must be close to the uniform distribution. Here, we show that this is indeed satisfied, using the proof of [13]. For completeness, we sketch the proof given in [13] for sequential key distillation, which consists of two phases: reconciliation and privacy amplification. The reconciliation step is equivalent to the reliability analysis above. The privacy amplification step consists of the construction of the key $K$ from a common shared sequence $T = U^n$ using an extractor function $g: \{0,1\}^N \times \{0,1\}^d \to \{0,1\}^k$ with $d, k, N \in \mathbb{N}$, whose inputs are the shared sequence $T$ and a sequence $U^d$ of $d$ uniformly distributed bits, and whose output is a nearly uniformly distributed sequence of $k$ bits.
Lemma 1
([7]). Let $T \in \{0,1\}^n$ be the random variable that represents the common sequence shared by both terminals and let $E$ be the random variable that represents the total knowledge about $T$ available to the eavesdropper. Let $e$ be a particular realization of $E$. If both terminals know the conditional min-entropy $H_\infty(T \mid E = e) \ge \gamma n$ for some $\gamma \in (0, 1)$, then there exists an extractor
$$g: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^k$$
with
$$d \le n \delta(n) \quad \text{and} \quad k \ge n(\gamma - \delta(n)),$$
where $\lim_{n \to \infty} \delta(n) = 0$, such that, if $U^d$ is a random variable with uniform distribution on $\{0,1\}^d$ and both terminals choose $K = g(T, U^d)$ as their secret key, then
$$H(K \mid U^d, E = e) \ge k - \delta(n).$$
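A minimal sketch of seeded privacy amplification, using Toeplitz hashing (a standard 2-universal family) as a stand-in for the extractor $g$; this is not necessarily the construction used in [7] or [13], and all parameters below are illustrative:

```python
import numpy as np
rng = np.random.default_rng(1)

def toeplitz_hash(t_bits, seed_bits, k):
    """2-universal hash via a k x n Toeplitz matrix built from the public seed.
    A stand-in for the seeded extractor g(T, U^d)."""
    n = len(t_bits)
    col, row = seed_bits[:k], seed_bits[k - 1:k - 1 + n]  # first column / first row
    H = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
                  for i in range(k)])
    return H.dot(t_bits) % 2  # k nearly uniform output bits

n, k = 64, 16
d = n + k - 1                          # seed length for the Toeplitz construction
T = rng.integers(0, 2, n)              # shared reconciled sequence
U_d = rng.integers(0, 2, d)            # public uniform seed (part of the helper data)
K = toeplitz_hash(T, U_d, k)           # both terminals compute the same key
print(K)
```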
Sequential key distillation protocol: For every source realization $s \in \mathcal{S}$, there is an $\ell = \ell(s) \in \mathcal{L}$ such that $Q_s \in \mathcal{Q}_{XY,\ell}$. For every $\ell \in \mathcal{L}$, we perform the following protocol:
  • Repeat the reconciliation protocol $i \in \mathbb{N}$ times, creating $i$ shared sequences $T_1, T_2, \ldots, T_i$ of length $n$.
  • Perform the privacy amplification phase based on an extractor with output size $k$, i.e., $K = g(T_1, T_2, \ldots, T_i, U^d) = g(U_1^n, U_2^n, \ldots, U_i^n, U^d) = g(U^N, U^d)$ with $N = i \cdot n$. $U^d$ has to be transmitted over the public channel together with the public message $M^i$.
  • The total information available to the eavesdropper is $E = (M^i, U^d, \Theta)$, with $\Theta$ a binary random variable, introduced for calculation purposes, indicating whether $T_i \in T_{p_T, \delta}^n$.
In [13], it was shown that
$$H_\infty(T^i \mid M^i = m^i, \hat{L} = \ell, \Theta = 1, U^d) \ge N I(U; Y_s) - N \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) - H(X|U) - 2i - i \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) - \delta_\epsilon(i) \ge N(\gamma - \delta(N)), \tag{A7}$$
with $\lim_{n \to \infty} \delta_\epsilon(n) = 0$ (see Lemma 1 in [13]). Using Lemma 1, we have
$$H(K \mid M^i = m^i, \hat{L} = \ell, \Theta = 1, U^d) \ge k - \delta(N), \tag{A8}$$
which implies that
$$H(K \mid \hat{L} = \ell) \ge H(K \mid M^i, \Theta, U^d, \hat{L} = \ell) \ge k - \delta(N). \tag{A9}$$
Since this holds for every $\ell \in \mathcal{L}$, we have that
$$\log |\mathcal{K}| = k \ge H(K) \ge H(K \mid \hat{L}).$$
Furthermore, we have
$$\begin{aligned}
H(K \mid \hat{L}) &= \sum_{\tilde{\ell} \in \mathcal{L}} \Pr\{\hat{L} = \tilde{\ell}\}\, H(K \mid \hat{L} = \tilde{\ell}) = \Pr\{\hat{L} = \ell\}\, H(K \mid \hat{L} = \ell) + \sum_{\substack{\tilde{\ell} \in \mathcal{L} \\ \tilde{\ell} \neq \ell}} \Pr\{\hat{L} = \tilde{\ell}\}\, H(K \mid \hat{L} = \tilde{\ell}) \\
&\le \Pr\{\hat{L} = \ell\}\, H(K \mid \hat{L} = \ell) + \Pr\{\hat{L} \neq \ell\} \max_{\ell' \in \mathcal{L}} H(K \mid \hat{L} = \ell') \\
&\le H(K \mid \hat{L} = \ell) + \epsilon_{\delta'}(n, |\mathcal{X}|)\, i N \log |\mathcal{X}| \tag{A10} \\
&\le H(K \mid \hat{L} = \ell) + (n+1)^{|\mathcal{X}|}\, i N \log |\mathcal{X}|\, 2^{-n c \delta'^2} = H(K \mid \hat{L} = \ell) + \epsilon'_{\delta'}(n, i, |\mathcal{X}|), \tag{A11}
\end{aligned}$$
where $\lim_{i,n \to \infty} \epsilon'_{\delta'}(n, i, |\mathcal{X}|) = 0$ and (A10) follows from (A1). We then have that
$$\big|H(K \mid \hat{L}) - H(K \mid \hat{L} = \ell)\big| \le \epsilon'_{\delta'}(n, i, |\mathcal{X}|), \tag{A12}$$
showing that $H(K \mid \hat{L})$ approaches $H(K \mid \hat{L} = \ell)$ as $n$ or $i$ (or both) grow.
Combining (A8), (A9) and (A12), we get
$$H(K) \ge H(K \mid \hat{L} = \ell) - \epsilon'_{\delta'}(n, i, |\mathcal{X}|) \ge k - \delta(N) - \epsilon'_{\delta'}(n, i, |\mathcal{X}|) = \log |\mathcal{K}| - \delta(N) - \epsilon'_{\delta'}(n, i, |\mathcal{X}|). \tag{A13}$$

Appendix A.1.8. Privacy Leakage

Another condition that has to be fulfilled by an achievable privacy secrecy rate pair is that the information rate provided by the helper data about the sequence X n is bounded. We show here that this condition is fulfilled.
For every source realization $s \in \mathcal{S}$, there is an $\ell = \ell(s) \in \mathcal{L}$ such that $Q_s \in \mathcal{Q}_{XY,\ell}$. For every $\ell \in \mathcal{L}$, we have
$$\frac{1}{N} I(X^N; M^i, \Theta, U^d, \hat{L}) = \frac{1}{N} I(X^N; \hat{L}) + \frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L}) \le \frac{\log |\mathcal{L}|}{N} + \frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L}). \tag{A14}$$
We analyze the second term on the right-hand side of (A14):
$$\frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L}) \le \Pr\{\hat{L} = \ell\}\, \frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L} = \ell) + \Pr\{\hat{L} \neq \ell\}\, \log |\mathcal{X}|. \tag{A15}$$
Similar to (A11), we have
$$\frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L}) \le \frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L} = \ell) + \epsilon_{\delta'}(n, |\mathcal{X}|)\, i \log |\mathcal{X}|. \tag{A16}$$
For every $\ell \in \mathcal{L}$, from (A6) it holds that
$$\frac{1}{N} I(X^N; M^i, \Theta, U^d \mid \hat{L} = \ell) \le \frac{i \log |\mathcal{M}|}{N} + \frac{d+1}{N} \le I(U; X \mid \hat{L} = \ell) - I(U; Y_s) + \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) + \psi(\delta'', |\mathcal{U}|, |\mathcal{X}|) + \frac{d+1}{N}, \tag{A17}$$
with $\phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) > 0$ and $\psi(\delta'', |\mathcal{U}|, |\mathcal{X}|) > 0$. Combining (A14), (A15) and (A17), it follows that
$$\frac{1}{N} I(X^N; M^i, \Theta, U^d, \hat{L}) \le I(U; X \mid \hat{L} = \ell) - I(U; Y_s) + \frac{\log |\mathcal{L}|}{N} + \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) + \psi(\delta'', |\mathcal{U}|, |\mathcal{X}|) + \epsilon_{\delta'}(n, |\mathcal{X}|)\, i \log |\mathcal{X}|,$$
where the last three terms on the right-hand side go to zero for $n$ and $i$ large enough.

Appendix A.1.9. Secrecy Leakage

The last condition that has to be fulfilled by an achievable privacy secrecy rate pair is that the information rate provided by the helper data about the secret key is negligibly small. For every source realization $s \in \mathcal{S}$, there is an $\ell = \ell(s) \in \mathcal{L}$ such that $Q_s \in \mathcal{Q}_{XY,\ell}$. For every $\ell \in \mathcal{L}$, we have
$$I(K; M^i, \Theta, U^d, \hat{L}) = I(K; \hat{L}) + I(K; M^i, \Theta, U^d \mid \hat{L}). \tag{A18}$$
We first consider the first term of (A18). Using (A8) and (A12), we get
$$I(K; \hat{L}) = H(K) - H(K \mid \hat{L}) \le \delta(N) + \epsilon'_{\delta'}(n, i, |\mathcal{X}|). \tag{A19}$$
We now consider the second term of (A18). Using (A13), we get
$$\begin{aligned}
I(K; M^i, \Theta, U^d \mid \hat{L}) &\le \Pr\{\hat{L} = \ell\}\, I(K; M^i, \Theta, U^d \mid \hat{L} = \ell) + \Pr\{\hat{L} \neq \ell\}\, N \log |\mathcal{X}| \\
&\le H(K \mid \hat{L} = \ell) - H(K \mid M^i, \Theta, U^d, \hat{L} = \ell) + \epsilon_{\delta'}(n, |\mathcal{X}|)\, i N \log |\mathcal{X}| \\
&\le \log |\mathcal{K}| - \log |\mathcal{K}| + \delta(N) + \epsilon'_{\delta'}(n, i, |\mathcal{X}|) = \delta(N) + \epsilon'_{\delta'}(n, i, |\mathcal{X}|).
\end{aligned}$$
Hence,
$$I(K; M^i, \Theta, U^d, \hat{L}) \le 2\delta(N) + 2\epsilon'_{\delta'}(n, i, |\mathcal{X}|).$$
Note that the right-hand side of the inequality goes to zero for $N$ large enough, showing that, for every source realization $s \in \mathcal{S}$, the secret key information rate leaked by the helper data is negligibly small.
Note that we have shown that the rate pair can be achieved for large $N = i \cdot n$, i.e., not yet for all $N \in \mathbb{N}$. To show achievability for all blocklengths $N \in \mathbb{N}$, we define the sequence $N_i$, $i \in \mathbb{N}$, with $N_i = i^2$. We showed that, for the sequence $N_i$ of blocklengths with $i \in \mathbb{N}$, there exists a blocklength $N_{i_0}$ such that, for all blocklengths $N_i > N_{i_0}$, we can find a code sequence that fulfills the achievability conditions. For every $N_i < N < N_{i+1}$, one can write $N = N_i + r_i$ with $r_i < N_{i+1} - N_i$. We use only the first $N_i$ symbols to generate the key and discard the remaining $r_i$. One can easily see that there is an $\epsilon(N)$ such that, for $\delta = \epsilon(N)$, all conditions are fulfilled. This completes the proof of achievability.

Appendix A.2. Converse of Theorem 3

For the converse, we consider a genie-aided enrollment and authentication terminal, i.e., the users at the enrollment and authentication terminals have partial knowledge of the source: they know the actual state $\ell$ of the marginal distribution but not the complete source state. The converse follows from the corresponding result for a joint-source with perfect SSI shown in [11]. For a fixed $\ell \in \mathcal{L}$, $s \in \mathcal{S}_\ell$ and $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$, we define the region $\mathcal{R}(V, \ell, s)$ as the set of all $(R_{PL}, R_K) \in \mathbb{R}_+^2$ that satisfy
$$R_K \le I(U; Y_s), \qquad R_{PL} \ge I(U; X \mid L = \ell) - I(U; Y_s).$$
We start by analyzing the secret key rate. For a fixed $\ell \in \mathcal{L}$ and $s \in \mathcal{S}_\ell$, we have
$$H(K) = H(K \mid L = \ell) = I(K; M Y_s^n \mid L = \ell) + H(K \mid M Y_s^n \hat{K}, L = \ell),$$
where $\hat{K}$ is a deterministic function of $M$, $Y^n$ and $L = \ell$, i.e., $\hat{K} = f(M, Y^n, L = \ell)$. Hence,
$$\begin{aligned}
H(K) &\le I(K; M Y_s^n \mid L = \ell) + H(K \mid \hat{K}) \le I(K; M Y_s^n \mid L = \ell) + \epsilon_n \tag{A21} \\
&= I(K; M \mid L = \ell) + I(K M; Y_s^n \mid L = \ell) + \epsilon_n \\
&= I(K; M \mid L = \ell) + \sum_{i=1}^n I(K M; Y_{s,i} \mid Y_s^{i-1}, L = \ell) + \epsilon_n \\
&\le I(K; M \mid L = \ell) + \sum_{i=1}^n I(K M Y_s^{i-1}; Y_{s,i} \mid L = \ell) + \epsilon_n \\
&\le I(K; M \mid L = \ell) + \sum_{i=1}^n I(K M X^{i-1}; Y_{s,i} \mid L = \ell) + \epsilon_n \tag{A22} \\
&= I(K; M \mid L = \ell) + n I(U; Y_s \mid L = \ell) + \epsilon_n, \tag{A23}
\end{aligned}$$
where (A21) holds for $\epsilon_n = 1 + \Pr\{\hat{K} \neq K\} \log K_n$ and follows from Fano's inequality, and (A22) from the fact that $Y^{i-1} - K M X^{i-1} - Y_i$ forms a Markov chain. The latter comes from
$$P_{K M Y^{i-1} X^{i-1} Y_i}(k, m, y^{i-1}, x^{i-1}, y_i) = \sum_{x_i} \sum_{x_{i+1}^n} p(x^{i-1})\, p(x_i)\, p(x_{i+1}^n)\, P_{K M \mid X^n}(k, m \mid x^n)\, W_s(y_i \mid x_i)\, W_s^{i-1}(y^{i-1} \mid x^{i-1}) = P_{X^{i-1} K M Y_i}(x^{i-1}, k, m, y_i)\, W_s^{i-1}(y^{i-1} \mid x^{i-1}) = p(x^{i-1}) \Pr(k, m, y_i \mid x^{i-1})\, W_s^{i-1}(y^{i-1} \mid x^{i-1}).$$
We define $U_{\ell,i} \triangleq (K, M, X^{i-1})$. The equality (A23) is obtained using a time-sharing variable $T$ uniformly distributed over $\{1, \ldots, n\}$ and independent of all other variables. Setting $U_\ell = (U_{\ell,T}, T)$, $X_\ell = X_{\ell,T}$ and $Y_s = Y_{s,T}$, we obtain
$$\sum_{i=1}^n I(K M X^{i-1}; Y_{s,i} \mid L = \ell) = \sum_{i=1}^n I(U_{\ell,i}; Y_{s,i} \mid L = \ell) = n I(U_{\ell,T}; Y_{s,T} \mid T, L = \ell) = n I\big((U_{\ell,T}, T); Y_{s,T} \mid L = \ell\big) = n I(U_\ell; Y_s \mid L = \ell).$$
Dividing by $n$, we get
$$\frac{1}{n} H(K) \le \frac{1}{n} I(K; M \mid L = \ell) + I(U_\ell; Y_s \mid L = \ell) + \frac{\epsilon_n}{n} \le I(U_\ell; Y_s \mid L = \ell) + \lambda_{n,\ell} + \frac{1}{n} + \frac{\epsilon_n}{n},$$
where the last inequality holds with $\lambda_{n,\ell} \to 0$ for $n \to \infty$ (see [11]).
Assuming the rate pair $(R_{PL}, R_K)$ is achievable, we have $\epsilon_n \le 1 + \delta \log K_n$ and obtain
$$R_K - \delta \le I(U_\ell; Y_s \mid L = \ell) + \lambda_{n,\ell} + \frac{1}{n} + \frac{1 + \delta \log K_n}{n}.$$
We continue with the privacy leakage. For a fixed $s \in \mathcal{S}_\ell$, we have
$$\begin{aligned}
I(X^n; M) &= I(X^n; M \mid L = \ell) = H(M \mid L = \ell) - H(M \mid X^n, L = \ell) \\
&\ge H(M \mid Y_s^n, L = \ell) - H(K M \mid X^n, L = \ell) \\
&= H(K M \mid Y_s^n, L = \ell) - H(K \mid M Y^n \hat{K}, L = \ell) - H(K M \mid X^n, L = \ell) \\
&\ge H(K M \mid Y_s^n, L = \ell) - H(K \mid \hat{K}) - H(K M \mid X^n, L = \ell) \\
&\ge H(K M \mid Y_s^n, L = \ell) - \epsilon_n - H(K M \mid X^n, L = \ell) \\
&= I(K M; X^n \mid L = \ell) - I(K M; Y_s^n \mid L = \ell) - \epsilon_n \\
&= \sum_{i=1}^n I(K M; X_{\ell,i} \mid X^{i-1}, L = \ell) - \sum_{i=1}^n I(K M; Y_{s,i} \mid Y_s^{i-1}, L = \ell) - \epsilon_n \\
&= \sum_{i=1}^n I(K M X^{i-1}; X_{\ell,i} \mid L = \ell) - \sum_{i=1}^n I(K M Y_s^{i-1}; Y_{s,i} \mid L = \ell) - \epsilon_n \\
&\ge \sum_{i=1}^n I(K M X^{i-1}; X_{\ell,i} \mid L = \ell) - \sum_{i=1}^n I(K M X^{i-1}; Y_{s,i} \mid L = \ell) - \epsilon_n \\
&= n I(U_\ell; X_\ell \mid L = \ell) - n I(U_\ell; Y_s \mid L = \ell) - \epsilon_n.
\end{aligned}$$
Dividing by $n$, we get
$$\frac{1}{n} I(X^n; M) \ge I(U_\ell; X_\ell \mid L = \ell) - I(U_\ell; Y_s \mid L = \ell) - \frac{\epsilon_n}{n}.$$
Assuming $(R_{PL}, R_K)$ is achievable, we have $\epsilon_n \le 1 + \delta \log K_n$ and obtain
$$R_{PL} + \delta \ge I(U_\ell; X_\ell \mid L = \ell) - I(U_\ell; Y_s \mid L = \ell) - \frac{1 + \delta \log K_n}{n}.$$
We have shown that $\mathcal{C}_G(\mathcal{Q}_{XY}) \subseteq \bigcap_{\ell \in \mathcal{L}} \mathcal{C}_\ell$. This means that, if $(R_{PL}, R_K) \in \mathcal{C}_G(\mathcal{Q}_{XY})$ holds, then we have that $(R_{PL}, R_K) \in \bigcap_{\ell \in \mathcal{L}} \mathcal{C}_\ell$. Equivalently, if $(R_{PL}, R_K) \notin \bigcap_{\ell \in \mathcal{L}} \mathcal{C}_\ell$, then $(R_{PL}, R_K) \notin \mathcal{C}_G(\mathcal{Q}_{XY})$. Assume $(R_{PL}^*, R_K^*) \notin \bigcap_{\ell \in \mathcal{L}} \mathcal{C}_\ell$. This implies that there exists an $\ell \in \mathcal{L}$ such that, for all auxiliary channels $V$, we have that $(R_{PL}^*, R_K^*) \notin \mathcal{R}(V, \ell)$, which implies that $(R_{PL}^*, R_K^*) \notin \mathcal{C}_G(\mathcal{Q}_{XY})$. This completes the converse and therewith proves the desired result.
It remains to derive the bound on the cardinality of the auxiliary random variable $U$. Let $\ell \in \mathcal{L}$ be arbitrary but fixed and let $U$ be a random variable fulfilling $P_{UXY,s}(u, x, y) = V(u|x) Q_s(x, y)$ for all $s \in \mathcal{S}_\ell$. We show that there is a random variable $\bar{U}$ with range $|\bar{\mathcal{U}}| = |\mathcal{X}| + |\mathcal{S}_\ell|$ such that
$$I(\bar{U}; Y_s) = I(U; Y_s), \qquad I(\bar{U}; X_\ell) - I(\bar{U}; Y_s) = I(U; X_\ell) - I(U; Y_s), \tag{A26}$$
for all $s \in \mathcal{S}_\ell$. We consider the following $|\mathcal{X}| + |\mathcal{S}_\ell|$ real-valued continuous functions on $\mathcal{P}(\mathcal{X})$:
$$f_x(p) = p(x) \ \text{for all } x \in \mathcal{X} \text{ but one}, \qquad g_s(p) = H(p W_s), \qquad h(p) = H(p),$$
for all $s \in \mathcal{S}_\ell$. We have that $\Sigma_\ell^X(\cdot \mid u) \in \mathcal{P}(\mathcal{X})$ has $\mu$-measure $p_{u_\ell}$. Then, it holds that
$$\sum_u p_{u_\ell}(u)\, f_x\big(\Sigma_\ell^X(\cdot \mid u)\big) = p_\ell(x), \qquad \sum_u p_{u_\ell}(u)\, g_s\big(\Sigma_\ell^X(\cdot \mid u)\big) = H(Y_s \mid U), \qquad \sum_u p_{u_\ell}(u)\, h\big(\Sigma_\ell^X(\cdot \mid u)\big) = H(X_\ell \mid U),$$
for all $s \in \mathcal{S}_\ell$. According to Lemma 15.4 in [27], there exists a random variable $\bar{U}$ with values in $\bar{\mathcal{U}} = \{1, \ldots, |\mathcal{X}| + |\mathcal{S}_\ell|\}$ fulfilling the Markov condition such that (A26) holds (see also Lemma 15.5 in [27]). □

Appendix B. Proof of Theorem 4

Appendix B.1. Achievability of Theorem 4

The achievability proof of Theorem 4 is very similar to the achievability proof of Theorem 3, where first the index of the marginal distribution over $\mathcal{X}$ is estimated. The difference is that, in this model, we use a generated secret key $K_{\ell,g}$ in a one-time pad system to conceal the uniformly distributed chosen key $K$ over the set $\mathcal{K}$; as in [11], it is additionally sent together with the generated helper message $M_{\ell,g}$ and the index of the estimated marginal distribution $\hat{L}$ over the public message, i.e., the helper data is $M = (M_{\ell,g}, K \oplus K_{\ell,g}, \hat{L})$. The error analysis is similar to the error analysis for Theorem 3, and the key is already uniformly distributed; however, we should take a closer look at the privacy leakage and the secrecy leakage. We perform the privacy amplification step as in Appendix A to show that strong secrecy is fulfilled.

Appendix B.1.1. Privacy Leakage

Another condition that has to be fulfilled by an achievable privacy secrecy rate pair is that the information rate provided by the helper data about the sequence X n is bounded. We show here that this condition is fulfilled.
For every source realization $s \in \mathcal{S}$, there is an $\ell = \ell(s)$ such that $Q_s \in \mathcal{Q}_{XY,\ell}$. We have
$$\frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d, \hat{L}) \le \frac{\log |\mathcal{L}|}{N} + \frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L}). \tag{A27}$$
We analyze the second term on the right-hand side of (A27):
$$\frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L}) \le \Pr\{\hat{L} = \ell\}\, \frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \ell) + \Pr\{\hat{L} \neq \ell\}\, \log |\mathcal{X}|. \tag{A28}$$
Similar to (A11), we have
$$\frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L}) \le \frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \ell) + \epsilon_{\delta'}(n, i, |\mathcal{X}|)\, \log |\mathcal{X}|. \tag{A29}$$
In [11], the authors show that, for every $\ell \in \mathcal{L}$ (with $Q_s \in \mathcal{Q}_{XY,\ell}$), it holds that
$$\begin{aligned}
\frac{1}{N} I(X^N; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \ell) &\le \frac{1}{N} I(X^N; M_{\ell,g}^i, \Theta, U^d \mid \hat{L} = \ell) + \frac{1}{N} H(K \oplus K_{\ell,g} \mid \hat{L} = \ell) - \frac{1}{N} H(K \oplus K_{\ell,g} \mid X^n, M_{\ell,g}^i, \Theta, U^d, K_{\ell,g}, \hat{L} = \ell) \\
&\le \frac{1}{N} I(X^N; M_{\ell,g}^i, \Theta, U^d \mid \hat{L} = \ell) + \frac{1}{N} \log K_N - \frac{1}{N} \log K_N \\
&\le I(U; X \mid \hat{L} = \ell) - I(U; Y_s) + \phi(\delta'', |\mathcal{U}|, |\mathcal{Y}|) + \psi(\delta'', |\mathcal{U}|, |\mathcal{X}|) + \frac{d+1}{N},
\end{aligned}$$
which proves the bound on the privacy leakage.

Appendix B.1.2. Secrecy Leakage

For every source realization $s \in \mathcal{S}$, there is an $\ell = \ell(s)$ such that $Q_s \in \mathcal{Q}_{XY,\ell}$. Following similar steps as for the privacy leakage, the secrecy leakage can be upper bounded. We have
$$I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d, \hat{L}) = I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L}). \tag{A30}$$
We analyze the right-hand side of (A30):
$$I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L}) = \sum_{\tilde{\ell} \in \mathcal{L}} \Pr\{\hat{L} = \tilde{\ell}\}\, I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \tilde{\ell}) = \Pr\{\hat{L} = \ell\}\, I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \ell) + \Pr\{\hat{L} \neq \ell\}\, I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} \neq \ell). \tag{A31}$$
For every $\ell \in \mathcal{L}$, it holds that
$$I(K; M_{\ell,g}^i, K \oplus K_{\ell,g}, \Theta, U^d \mid \hat{L} = \ell) \le \log |\mathcal{K}| - H(K_{\ell,g} \mid \hat{L} = \ell) + I(K_{\ell,g}; M_{\ell,g}^i, \Theta, U^d \mid \hat{L} = \ell) \le \delta(N) + \epsilon'_{\delta'}(n, i, |\mathcal{X}|) + I(K_{\ell,g}; M_{\ell,g}^i, \Theta, U^d \mid \hat{L} = \ell). \tag{A32}$$
The last inequality follows from (A13). Substituting $K_{\ell,g}$ for $K$, combining (A31) with (A19), and letting $i, n \to \infty$, we obtain the desired result.

Appendix B.2. Converse of Theorem 4

The converse of Theorem 4 can be shown using the same lines of arguments as for the converse of Theorem 3. □

Appendix C. Proof of Theorem 5

For every channel $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$, every $s_1 \in \mathcal{S}_1$ and every $s_2 \in \mathcal{S}_2$, we have the following effective sources:
$$P_{UXY,s_1}(u, x, y) = V(u|x) Q_{s_1}(x, y), \qquad P_{UXY,s_2}(u, x, y) = V(u|x) Q_{s_2}(x, y).$$
Let $D_H(\mathcal{Q}_{XY,1}, \mathcal{Q}_{XY,2}) \le \epsilon$ and let $(\bar{V}, \bar{s}_1, \bar{s}_2)$, with $\bar{s}_1 \in \mathcal{S}_1$ and $\bar{s}_2 \in \mathcal{S}_2$, be the channel and state pair attaining the maximum in the Hausdorff distance. Then, we have that
$$\|P_{UXY,\bar{s}_1} - P_{UXY,\bar{s}_2}\|_{TV} = \sum_{u \in \mathcal{U}} \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \big|V(u|x) Q_{\bar{s}_1}(x, y) - V(u|x) Q_{\bar{s}_2}(x, y)\big| = \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \big|Q_{\bar{s}_1}(x, y) - Q_{\bar{s}_2}(x, y)\big| \sum_{u \in \mathcal{U}} V(u|x) = \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} \big|Q_{\bar{s}_1}(x, y) - Q_{\bar{s}_2}(x, y)\big| \le \epsilon,$$
and
$$\|P_{U,\bar{s}_1} - P_{U,\bar{s}_2}\|_{TV} = \sum_{u \in \mathcal{U}} \Big|\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} V(u|x) \big(Q_{\bar{s}_1}(x, y) - Q_{\bar{s}_2}(x, y)\big)\Big| \le \sum_{u \in \mathcal{U}} \sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} V(u|x) \big|Q_{\bar{s}_1}(x, y) - Q_{\bar{s}_2}(x, y)\big| \le \epsilon.$$
For every channel $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$ and for every $s_1 \in \mathcal{S}_1$ and $s_2 \in \mathcal{S}_2$, there are $\ell_1 = \ell_1(s_1)$ and $\ell_2 = \ell_2(s_2)$, and the region $\mathcal{R}(V, \ell_i, s_i)$ with $i \in \{1, 2\}$ is rectangular. Therefore, to calculate the Hausdorff distance between regions, we are only interested in the corner points:
$$R_{K,s_i} = I(U_{\ell_i}; Y_{s_i}), \qquad R_{PL,s_i} = I(U_{\ell_i}; X_{\ell_i}) - I(U_{\ell_i}; Y_{s_i}).$$
Let $V$ be arbitrary but fixed. Then, for every $s_1 \in \mathcal{S}_1$ and $s_2 \in \mathcal{S}_2$, we have
$$\big|I(U_{\ell_1}; Y_{s_1}) - I(U_{\ell_2}; Y_{s_2})\big| = \big|H(Y_{s_1}) - H(Y_{s_2}) + H(Y_{s_2} \mid U_{\ell_2}) - H(Y_{s_1} \mid U_{\ell_1})\big| \le \big|H(Y_{s_1}) - H(Y_{s_2})\big| + \big|H(Y_{s_2} \mid U_{\ell_2}) - H(Y_{s_1} \mid U_{\ell_1})\big|.$$
For $\bar{V}$, $\bar{s}_1$ and $\bar{s}_2$, there are $\bar{\ell}_1 = \bar{\ell}_1(\bar{s}_1)$ and $\bar{\ell}_2 = \bar{\ell}_2(\bar{s}_2)$. Using Lemma 2.12 in [27] and Lemma 1 in [22], we get
$$\big|I(U_{\bar{\ell}_1}; Y_{\bar{s}_1}) - I(U_{\bar{\ell}_2}; Y_{\bar{s}_2})\big| \le 2\epsilon \log |\mathcal{Y}| + 2 H_2(\epsilon) - \epsilon \log \frac{\epsilon}{|\mathcal{U}|}. \tag{A34}$$
Following the same line of arguments as for (A34), we get
$$\big|I(U_{\bar{\ell}_1}; X_{\bar{\ell}_1}) - I(U_{\bar{\ell}_2}; X_{\bar{\ell}_2})\big| \le 2\epsilon \log |\mathcal{X}| + 2 H_2(\epsilon) - \epsilon \log \frac{\epsilon}{|\mathcal{U}|}.$$
Hence, for every channel $V: \mathcal{X} \to \mathcal{P}(\mathcal{U})$, $\bar{s}_1$ and $\bar{s}_2$, we have
$$D_H\big(\mathcal{R}(V, \bar{\ell}_1, \bar{s}_1), \mathcal{R}(V, \bar{\ell}_2, \bar{s}_2)\big) \le \delta(\epsilon), \tag{A35}$$
with $\delta(\epsilon) = \sqrt{\delta_1(\epsilon)^2 + \delta_2(\epsilon)^2}$, where $\delta_1(\epsilon) = 2\epsilon \log |\mathcal{Y}| + 2 H_2(\epsilon) - \epsilon \log \frac{\epsilon}{|\mathcal{U}|}$ and $\delta_2(\epsilon) = 2\epsilon \log |\mathcal{Y}||\mathcal{X}| + 4 H_2(\epsilon) - 2\epsilon \log \frac{\epsilon}{|\mathcal{U}|}$.
For fixed $\ell_1$ and $\ell_2$, we denote
$$\mathcal{R}(V, \ell_1) = \bigcap_{s_1 \in \mathcal{S}_{\ell_1}} \mathcal{R}(V, \ell_1, s_1), \qquad \mathcal{R}(V, \ell_2) = \bigcap_{s_2 \in \mathcal{S}_{\ell_2}} \mathcal{R}(V, \ell_2, s_2), \qquad \mathcal{R}(\ell_1) = \bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_{\ell_1}|}} \mathcal{R}(V, \ell_1), \qquad \mathcal{R}(\ell_2) = \bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_{\ell_2}|}} \mathcal{R}(V, \ell_2).$$
We have
$$D_H\big(\mathcal{R}(V, \ell_1), \mathcal{R}(V, \ell_2)\big) = D_H\Big(\bigcap_{s_1 \in \mathcal{S}_{\ell_1}} \mathcal{R}(V, \ell_1, s_1), \bigcap_{s_2 \in \mathcal{S}_{\ell_2}} \mathcal{R}(V, \ell_2, s_2)\Big) \tag{A36}$$
$$= D_H\Big(\bigcup_{s_1 \in \mathcal{S}_{\ell_1}} \mathcal{R}(V, \ell_1, s_1)^c, \bigcup_{s_2 \in \mathcal{S}_{\ell_2}} \mathcal{R}(V, \ell_2, s_2)^c\Big) \le D_H\big(\mathcal{R}(\bar{V}, \bar{\ell}_1, \bar{s}_1)^c, \mathcal{R}(\bar{V}, \bar{\ell}_2, \bar{s}_2)^c\big) \le \delta(\epsilon). \tag{A37}$$
The equality in (A37) holds since the Hausdorff distance between two sets equals the Hausdorff distance between the complements of the sets. The inequality in (A37) holds since $(\bar{V}, \bar{s}_1, \bar{s}_2)$ indexes the sets that maximize the Hausdorff distance. It also holds that
$$D_H\big(\mathcal{R}(\ell_1), \mathcal{R}(\ell_2)\big) = D_H\Big(\bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_{\ell_1}|}} \mathcal{R}(V, \ell_1), \bigcup_{\substack{V: \mathcal{X} \to \mathcal{P}(\mathcal{U}) \\ |\mathcal{U}| \le |\mathcal{X}| + |\mathcal{S}_{\ell_2}|}} \mathcal{R}(V, \ell_2)\Big) \le D_H\big(\mathcal{R}(\bar{V}, \bar{\ell}_1, \bar{s}_1)^c, \mathcal{R}(\bar{V}, \bar{\ell}_2, \bar{s}_2)^c\big) \le \delta(\epsilon),$$
and
$$D_H\big(\mathcal{C}_G(\mathcal{Q}_{XY,1}), \mathcal{C}_G(\mathcal{Q}_{XY,2})\big) = D_H\Big(\bigcap_{\ell_1 \in \mathcal{L}_1} \mathcal{R}(\ell_1), \bigcap_{\ell_2 \in \mathcal{L}_2} \mathcal{R}(\ell_2)\Big) = D_H\Big(\bigcup_{\ell_1 \in \mathcal{L}_1} \mathcal{R}(\ell_1)^c, \bigcup_{\ell_2 \in \mathcal{L}_2} \mathcal{R}(\ell_2)^c\Big) \le D_H\big(\mathcal{R}(\bar{V}, \bar{\ell}_1, \bar{s}_1)^c, \mathcal{R}(\bar{V}, \bar{\ell}_2, \bar{s}_2)^c\big) \le \delta(\epsilon). \qquad \square$$

References

1. Shannon, C.E. Communication theory of secrecy systems. Bell Syst. Tech. J. 1949, 28, 656–715.
2. Liang, Y.; Poor, H.V.; Shamai, S. Information theoretic security. Found. Trends Commun. Inf. Theor. 2009, 5, 355–580.
3. Bloch, M.; Barros, J. Physical-Layer Security; Cambridge University Press: Cambridge, UK, 2011.
4. Schaefer, R.F.; Boche, H.; Khisti, A.; Poor, H.V. Information Theoretic Security and Privacy of Information Systems; Cambridge University Press: Cambridge, UK, 2017.
5. Ahlswede, R.; Csiszár, I. Common randomness in information theory and cryptography—Part I: Secret sharing. IEEE Trans. Inf. Theor. 1993, 39, 1121–1132.
6. Maurer, U.M. Secret key agreement by public discussion from common information. IEEE Trans. Inf. Theor. 1993, 39, 733–742.
7. Maurer, U.; Wolf, S. Information-theoretic key agreement: From weak to strong secrecy for free. Adv. Crypt. EUROCRYPT 2000, 1807, 351–368.
8. Schneier, B. Inside risks: The uses and abuses of biometrics. Commun. ACM 1999, 42, 136.
9. Ratha, N.K.; Connell, J.H.; Bolle, R.M. Enhancing security and privacy in biometrics-based authentication systems. IBM Syst. J. 2001, 40, 614–634.
10. Prabhakar, S.; Pankanti, S.; Jain, A.K. Biometric recognition: Security and privacy concerns. IEEE Secur. Priv. 2003, 1, 33–42.
11. Ignatenko, T.; Willems, F.M. Biometric systems: Privacy and secrecy aspects. IEEE Trans. Inf. Forensics Secur. 2009, 4, 956–973.
12. Lai, L.; Ho, S.W.; Poor, H.V. Privacy–security trade-offs in biometric security systems—Part I: Single use case. IEEE Trans. Inf. Forensics Secur. 2011, 6, 122–139.
13. Chou, R.A.; Bloch, M.R. One-way rate-limited sequential key-distillation. In Proceedings of the IEEE International Symposium on Information Theory, Cambridge, MA, USA, 1–6 July 2012; pp. 1777–1781.
14. Wolfowitz, J. Simultaneous channels. Arch. Ration. Mech. Anal. 1959, 4, 371–386.
15. Blackwell, D.; Breiman, L.; Thomasian, A. The capacity of a class of channels. Ann. Math. Stat. 1959, 30, 1229–1241.
16. Boche, H.; Wyrembelski, R.F. Secret key generation using compound sources-optimal key-rates and communication costs. In Proceedings of the 9th International ITG Conference on Systems, Communication and Coding, München, Germany, 21–24 January 2013; pp. 1–6.
17. Bloch, M. Channel intrinsic randomness. In Proceedings of the IEEE International Symposium on Information Theory, Austin, TX, USA, 13–18 June 2010; pp. 2607–2611.
18. Chou, R.; Bloch, M.R. Secret-key generation with arbitrarily varying eavesdropper's channel. In Proceedings of the IEEE Global Conference on Signal and Information Processing, Austin, TX, USA, 3–5 December 2013; pp. 277–280.
19. Tavangaran, N.; Boche, H.; Schaefer, R.F. Secret-key generation using compound sources and one-way public communication. IEEE Trans. Inf. Forensics Secur. 2017, 12, 227–241.
20. Grigorescu, A.; Boche, H.; Schaefer, R.F. Robust PUF based authentication. In Proceedings of the IEEE International Workshop on Information Forensics and Security, Rome, Italy, 16–19 November 2015; pp. 1–6.
21. Boche, H.; Nötzel, J. Positivity, discontinuity, finite resources, and nonzero error for arbitrarily varying quantum channels. J. Math. Phys. 2014, 55, 122201.
22. Boche, H.; Schaefer, R.F.; Poor, H.V. On the continuity of the secrecy capacity of compound and arbitrarily varying wiretap channels. IEEE Trans. Inf. Forensics Secur. 2015, 10, 2531–2546.
23. Grigorescu, A.; Boche, H.; Schaefer, R.F.; Poor, H.V. Capacity region continuity of the compound broadcast channel with confidential messages. In Proceedings of the IEEE Information Theory Workshop, Jerusalem, Israel, 24 April–1 May 2015; pp. 1–6.
24. Wolfowitz, J. Coding Theorems of Information Theory; Springer: New York, NY, USA, 1978.
25. Schaefer, R.F.; Boche, H.; Poor, H.V. Super-activation as a unique feature of secure communication in malicious environments. Information 2016, 7, 24.
26. Boche, H.; Schaefer, R.F.; Poor, H.V. Characterization of Super-Additivity and Discontinuity Behavior of the Capacity of Arbitrarily Varying Channels Under List Decoding. Available online: http://ieeexplore.ieee.org/abstract/document/8007044/ (accessed on 7 September 2017).
27. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: Cambridge, UK, 2011.
28. Bjelaković, I.; Boche, H.; Sommerfeld, J. Secrecy results for compound wiretap channels. Probl. Inf. Transm. 2013, 49, 73–98.
Figure 1. The biometric measurements $X^n$ and $Y^n$ are observed at the enrollment and authentication terminals, respectively. In the enrollment terminal, the key $K$ and the helper data $M$ are generated. The helper data is public; hence, the eavesdropper also has access to it. In the authentication terminal, an estimate $\hat{K}$ of the key is made based on the observed biometric measurements $Y^n$ and the helper data $M$.
Figure 2. The biometric sequences $X^n$ and $Y^n$ are observed at the enrollment and authentication terminals, respectively. In the enrollment terminal, the helper data $M$ is generated for a given secret key $K$. The helper data is public; hence, the eavesdropper also has access to it. In the authentication terminal, an estimate $\hat{K}$ of the key is made based on the observed biometric authentication sequence $Y^n$ and the helper data $M$.
Figure 3. The attacker controls the state of the source $s \in \mathcal{S}$. The biometric sequences $X^n$ and $Y^n$ are observed at the enrollment and authentication terminals, respectively. In the enrollment terminal, the key $K$ and the helper data $M$ are generated. The helper data is public; hence, the attacker also has access to it. In the authentication terminal, an estimate $\hat{K}$ of the key is made based on the observed authentication sequence $Y^n$ and the helper data $M$.
Figure 4. The attacker controls the state of the source $s \in \mathcal{S}$. The biometric sequences $X^n$ and $Y^n$ are observed at the enrollment and authentication terminals, respectively. In the enrollment terminal, the key $K$ is predefined and the helper data $M$ is generated. The helper data is public; hence, the attacker also has access to it. In the authentication terminal, an estimate $\hat{K}$ of the key is made based on the observed authentication sequence $Y^n$ and the helper data $M$.
