1. Introduction
In the current era of big data, artificial intelligence, and privacy awareness, there is a growing need for securely computing with distributed data. The field of secure multi-party computation is evolving towards maturity, and leads to secure cryptographic solutions for this purpose. In its most general setting, there are multiple parties, each having private inputs, and they want to jointly evaluate a certain function on their inputs, without revealing those inputs. Although these cryptographic solutions achieve a high level of security, the challenge is often to restrict their computational and communication effort.
We focus on the setting of two parties using additively homomorphic encryption, where only one party holds the decryption key. This setting has numerous applications in secure signal processing, such as privacy-protected face recognition, privacy-protected k-means clustering, and privacy-protected content recommendation [1]. Other applications are neural network classification for e-health, biometric matching, watermark detection, fingerprinting for DRM, and smart grids [1].
Within all those applications, there is a clear need for a secure comparison protocol as a building block [1]. A few secure protocols based on homomorphic encryption are known for comparing two integers, the so-called millionaires problem. We present a new comparison protocol, which is dedicated to lightweight environments that require little memory and a low computational effort. For this reason, we focus on the semi-trusted model where both players follow the required protocol steps. As far as we know, this protocol is the solution with the lowest number of modular multiplications (of two additively homomorphic encrypted numbers) to date. The main reason is that our protocol does not need intermediate decryptions, which are computationally expensive.
First, the problem definition and notations are explained in the preliminaries. Then, an overview of the most important related work is presented. The actual comparison protocol, called LSIC (Lightweight Secure Integer Comparison), is presented in Section 2, and its correctness and security are proven in Section 3. In Section 4, the computational, communication, and storage complexity of our solution is analyzed, counting the total number of multiplications, transmissions, and stored encryptions. These complexity parameters are compared with the most important alternative solutions in Section 5. In Section 6, it is shown how any comparison protocol can be transferred to the client–server model in an efficient and secure way. The final section contains the conclusions.
1.1. Preliminaries
We consider two parties, A and B, each having a private integer, a and b, respectively. Both parties would like to know which of their integers is the larger one, without revealing its value to the other party. This problem is called the millionaires problem. Since we are especially interested in lightweight environments, we assume both players are honest but curious.
The output of our secure multiparty computation protocol is the bit t such that . We consider different variants with different types of output:
The value t becomes known to both A and B.
A obtains the value t encrypted by a public key (homomorphic) cryptosystem, and B holds the private key K.
A and B share the output, i.e., they both have a private output bit, respectively, and , such that (⊕ denotes exclusive or).
In the last two variants, the value t is not known to A and B, but it enables them to use the output for other applications. For the public key cryptosystem used in our protocols, any additively homomorphic and semantically secure cryptosystem could be used. In our notation, we consider two classes of encryption systems, namely
to denote the encryption of a single bit with, e.g., Goldwasser-Micali (GM) [2,3], and
to denote encryption of integers with, e.g., Paillier [4] or Damgård, Geisler, and Krøigaard (DGK) [5,6,7].
Besides the model where both parties know their private integer, we also consider the client–server model. In this model, party A is the client who has both integers, but only in encrypted form, and B is the server, who owns the private key of the cryptosystem. In the client–server model, neither party is allowed to learn the value of the integers a and b, and the same variants as above apply for obtaining the output t. Such a model is often used in privacy-protecting applications where intermediate values should remain unknown, e.g., in face recognition [8].
Let N be the modulus of the encryption system, which is usually equal to the product of two large primes. We recall an important property of homomorphic encryption systems, namely that for bits x and y, multiplying their encryptions modulo N yields an encryption of x ⊕ y, and for integers x and y, multiplying their encryptions modulo N yields an encryption of x + y. An encrypted integer is negated most efficiently by inverting its encryption modulo N with the Euclidean algorithm [9]. During the complexity computations, we assume that N is 1024 bits long, and that any security-wise comparable symmetric encryption system has a key size of 80 bits [10]. For other key lengths, similar complexity conclusions can be drawn. For convenience, we neglect in our notation that the ciphertext size in Paillier is N² and use N instead, just like in DGK or RSA.
In DGK and other similar cryptosystems, an integer x is encrypted by computing g^x · h^r mod N, where g and h are fixed public generators, and r is a secret random number chosen by the encrypting party. In GM, a bit x is encrypted by computing g^x · r² mod N, where g is a fixed public integer (quadratic nonresidue) of bits, and r is a secret random number of the same size, chosen by the encrypting party. In both cryptosystems, the variable r is used to randomize the outcome of the encryption, so two encryptions of the same value will not be recognized as such. This is also the reason that within a cryptographic protocol, where encryptions are received from the other party, processed within local computations, and subsequently returned, encryptions need to be re-randomized before returning.
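As a concrete illustration of randomized GM encryption and its exclusive-or homomorphism, the following Python sketch uses toy parameters. The primes 499 and 547, the choice g = N − 1, and all function names are assumptions made for this illustration only; the modulus is far too small to offer any security.

```python
import math
import secrets

# Toy Goldwasser-Micali parameters (assumed for illustration, NOT secure).
P, Q = 499, 547          # secret primes, both congruent to 3 modulo 4
N = P * Q                # public modulus
G = N - 1                # -1 is a quadratic nonresidue modulo P and modulo Q


def random_unit() -> int:
    """Return a random r in [1, N-1] that is coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return r


def encrypt(bit: int) -> int:
    """Encrypt one bit as g^bit * r^2 mod N with a fresh random r."""
    r = random_unit()
    return (pow(G, bit, N) * r * r) % N


def rerandomize(cipher: int) -> int:
    """Multiply by a fresh random square; the plaintext bit is unchanged."""
    r = random_unit()
    return (cipher * r * r) % N


def decrypt(cipher: int) -> int:
    """Return 0 if cipher is a square modulo P (Euler's criterion), else 1."""
    return 0 if pow(cipher, (P - 1) // 2, P) == 1 else 1


def xor(c1: int, c2: int) -> int:
    """Homomorphic exclusive or: multiply the two ciphertexts modulo N."""
    return (c1 * c2) % N
```

Multiplying a ciphertext by the bare value G flips the encrypted bit without any randomization, which is why a result computed this way is re-randomized before it is sent to the other party.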
We use pseudo-code to describe the protocols. Assertions between braces { } are used to describe the current value of variables. Additionally, comments are used, prefixed by ▷, to explain the corresponding line of the protocol. Each statement is prefixed by A or B, indicating the party that performs the statement. For example, A: means that A multiplies the (encrypted) variable with an encrypted 1 modulo N, and stores the result in the (encrypted) variable .
To compute the computational complexity of the different protocols, we use the fact that an exponentiation modulo
N with an exponent of
n bits will on average take
multiplications modulo
N. When the factoring of
N is known, this can be reduced to
by using the Chinese remainder theorem [
9]. Namely, a multiplication modulo
N is assumed to be equivalent to four multiplications modulo one of the two prime factors. The effort for negating an encrypted number is considered negligible. A Paillier decryption takes
multiplications modulo
N [
4]. A Paillier encryption of a plaintext of
n bits will take
multiplications modulo
N [
4].
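The multiplication counts used in such estimates can be checked experimentally. The sketch below assumes the textbook left-to-right square-and-multiply method, which performs one squaring per exponent bit and one extra multiplication per 1 bit, i.e., about 1.5·n operations for a random n-bit exponent. The function name is hypothetical, and optimizations such as windowing or the Chinese remainder theorem are not modeled.

```python
def modexp_count(base: int, exponent: int, modulus: int) -> tuple[int, int]:
    """Left-to-right square-and-multiply; returns (result, multiplications)."""
    result = 1
    count = 0
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # one squaring per exponent bit
        count += 1
        if bit == "1":
            result = (result * base) % modulus  # extra multiplication for a 1 bit
            count += 1
    return result % modulus, count
```

The count includes the trivial first squaring of 1, so it slightly overestimates the work of an optimized implementation.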
We use to denote the integer division of x by y, so . Let be the statistical security parameter, whose value is usually chosen around 80. The maximum size of the input variables is denoted by ℓ. We assume all random variables are uniformly chosen.
1.2. Related Work
The first solution to the millionaires problem was given by Yao [11] in 1982. His solution, based on garbled circuits, has been improved many times since.
Most solutions are based either on homomorphic encryption or on garbled circuits. In Section 5, we describe one of the best candidates for each category, and compare it with our solution LSIC. The candidate that uses homomorphic encryption is by Damgård, Geisler, and Krøigaard [5,6,7], who use a dedicated cryptosystem fine-tuned for small plaintext values. Their protocol is described in detail in Section 5.2 and compared to our work. One of the most efficient implementations nowadays based on garbled circuits is described by Kolesnikov, Sadeghi and Schneider [12]. The general garbled circuit approach is described in Section 5.3, and their specific implementation is compared to our work.
Other related work is, e.g., by Fischlin [13], who describes a system that enables computing the product (AND) of two quadratic residues. However, an error parameter is required to guarantee the correctness of the result, which increases the computational and communication load, including ℓ decryptions.
Garay, Schoenmakers and Villegas [14] describe a nice solution for the client–server setting in the multi-party case, but since they use the malicious adversary model instead of the honest-but-curious model, their solutions are less efficient.
More recent results focus on the malicious adversary model [15], or other techniques, such as fully homomorphic encryption [16], which reduces communication but increases computational effort, or oblivious transfers [17].
Kerschbaum and Terzidis (KT) present an efficient solution to the millionaires problem in the semi-trusted model in [18], as described in detail in Section 5.1. This solution was later extended to multiple parties by Kerschbaum, Biswas and de Hoogh [19].
2. Comparison Protocol
Suppose party A has a private unencrypted number a, and party B has a private and unencrypted number b. The integers a and b have size ℓ. We denote their bits by and , for , where and are the least significant bits. We use the notation () to denote the integer , i.e., the first l bits of a, and the same for b. Note that and .
The idea behind our comparison protocol [
20] is to compute the bits
, from
towards
, where
. The bit
can be computed from
by using the relation:
The correctness of this recurrence relation is easily seen by observing that and are the most significant bits of and , respectively.
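On plaintext bits, a recurrence of this kind can be sketched as follows. The sketch assumes that the output bit is (a < b) and writes the update in the polynomial form t ← (1 − (a_i − b_i)²)·t + (1 − a_i)·b_i, which keeps t when the two bits agree and takes b_i otherwise. The function name and the starting value t = 0 are choices made for this illustration.

```python
def less_than_bitwise(a: int, b: int, nbits: int) -> int:
    """Return 1 if a < b, scanning bits from least to most significant."""
    t = 0  # comparison of the empty prefixes: a is not smaller than b
    for i in range(nbits):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        # Keep t when a_i == b_i; otherwise the new most significant bit decides.
        t = (1 - (a_i - b_i) ** 2) * t + (1 - a_i) * b_i
    return t
```

Because the bit processed last is the most significant one, it overrides the result of all lower bits whenever a_i and b_i differ.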
In order to compute
, we propose the protocol Lightweight Secure Integer Comparison (LSIC), which is shown in Algorithm 1. Algorithm 1 contains assertions between braces { } to describe the current value of variables. Additionally, comments are used, prefixed by ▷, to explain the protocol. The correctness of the protocol, i.e., the proof of the assertions, is shown in
Section 3.1. The formal security proof, to show that A and B learned nothing more than
, is given in
Section 3.2.
Algorithm 1 Lightweight secure integer comparison (LSIC). |
| | |
Input A | |
Input B and the decryption key K | |
Joint output bit t, where | |
| | |
| Party B encrypts and randomizes and sends to A |
| if then |
| A: |
| else |
5: | A: | ▷ , randomized in line 17 |
| end if |
| |
| for do | ▷ A computes from |
| |
10: | A blinds by tossing a fair coin |
| if then |
| A: |
| else |
| A: | ▷ |
15: | end if |
| |
| A randomizes and sends it to B |
| if then |
| B: | ▷ , randomized in line 24 |
20: | else |
| B: |
| end if | ▷ B computed without decrypting |
| |
| B encrypts , randomizes , and sends both and to A |
25: | |
| if then | ▷ A smartly unblinds |
| A: | ▷ |
| end if |
| |
30: | if then |
| |
| A: | ▷ |
| else |
| |
35: | A: |
| end if |
| |
| end for |
| |
40: | A sends to B so B can decrypt it and send back to A |
Party A and B encrypt single bits by
, but only party B can decrypt. The main idea is that A uses variable
, which is the encryption of
, and computes, in a joint protocol with B,
. This computation is done recursively, starting with
,
until
. In order for A to compute the next value, B sends the encrypted bits
in line 24, but since this is not enough for A, as he is computing in the encrypted domain, B also sends the encryption of
, being the product of
and
. To compute the product
, A sends a blinded version of
to B in line 17, because each intermediate value
should be unknown to B (and A). The product is smartly unblinded again by A in line 27. Although the computations by A in lines 27 and 32 are not obvious, in
Section 3.1, it is shown that A indeed correctly computes the next value
from
,
,
, and
. The computation in line 27 is actually an efficient combination of two similar statements:
| |
if then | ▷ A unblinds |
A: | ▷ |
end if | |
| |
if then | |
A: | ▷ |
end if | |
| |
Since each statement is the inverse of the other one (as proved in
Section 3.1), a double execution should be avoided. Only when
, i.e.,
, one execution is necessary.
This protocol works for any number of bits
ℓ. Since the protocol doesn’t require intermediate decryptions, the computational complexity is low. Algorithm 1 computes
, but also
is similarly computed by letting A compute
and B:
in the beginning, and flipping the value of
t at the end by B:
. The encryption system should be homomorphic and semantically secure. We use GM, because encryption and re-randomization are easy, but one could also use, e.g., Paillier or DGK. This requires a small modification of the homomorphic operations on encrypted numbers, as depicted in
Table 1 below, where
is used to denote Paillier encryption.
The modified operation in line 27 works, because , so . The modified operation in line 32 works, because in this case (), , so whenever , we have . Therefore, the addition of t and cannot exceed 1.
When the output
has to remain secret to both players, obviously the final step of the protocol at line 40 can be skipped. This also saves the (only) costly decryption by B. When the output has to be secretly shared among both players, the final step has to be modified slightly:
A blinds by tossing a fair coin | |
if then | |
A: | |
else | |
A: | ▷ |
end if | |
| |
A: | |
A sends to B | |
B decrypts | |
B: | |
| |
All homomorphic systems consist of an encryption and a randomization part. For example, in GM the encryption of a bit x has the form g^x · r² mod N, where g is a fixed integer (quadratic nonresidue) and r is randomly chosen. When implementing our comparison protocol, the randomization part can be skipped in most computations. Only when a value has to be sent to the other party (in lines 1, 17, 24, and 40) should the result be re-randomized.
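To make the message flow concrete, the following Python sketch simulates both parties of an LSIC-style comparison in a single process, using toy GM parameters. It is a simplified reading of Algorithm 1 under stated assumptions, not a reference implementation: the output is assumed to be (a < b), the unblinding and the update of t are written as two separate steps instead of the combined step of line 27, and the primes, the generator g = N − 1, and all identifiers are hypothetical.

```python
import math
import secrets

# Toy Goldwasser-Micali key (assumed values, NOT secure).
P, Q = 499, 547            # B's secret primes, both 3 mod 4
N = P * Q                  # public modulus
G = N - 1                  # public nonresidue; also an unrandomized encryption of 1


def _unit() -> int:
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return r


def enc(bit: int) -> int:
    r = _unit()
    return pow(G, bit, N) * r * r % N


def rerand(c: int) -> int:
    r = _unit()
    return c * r * r % N


def dec(c: int) -> int:    # only B can do this, since it needs P
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1


def bit(value: int, i: int) -> int:
    return (value >> i) & 1


def lsic_less_than(a: int, b: int, nbits: int) -> int:
    """Simulate A (holding a) and B (holding b and the key); return (a < b)."""
    enc_b0 = enc(bit(b, 0))                        # B -> A
    # A: t = (1 - a_0) * b_0; the value 1 is an unrandomized encryption of 0.
    enc_t = enc_b0 if bit(a, 0) == 0 else 1

    for i in range(1, nbits):
        # A blinds t with a fair coin c (tau = t xor c) and re-randomizes.
        c = secrets.randbelow(2)
        enc_tau = rerand(enc_t * G % N if c else enc_t)    # A -> B

        # B forms an encryption of tau * b_i without decrypting tau.
        b_i = bit(b, i)
        enc_tau_b = rerand(enc_tau) if b_i else enc(0)
        enc_bi = enc(b_i)                                  # B -> A: both values

        # A unblinds: t * b_i = b_i xor (tau * b_i) when c = 1.
        enc_tb = enc_bi * enc_tau_b % N if c else enc_tau_b

        # A updates t according to its own bit a_i.
        if bit(a, i) == 0:
            enc_t = enc_t * enc_bi * enc_tb % N            # t OR b_i
        else:
            enc_t = enc_tb                                 # t AND b_i

    # A -> B: re-randomized result; B decrypts and reports the bit.
    return dec(rerand(enc_t))
```

In this sketch, decryption happens only once, at the very end; every value that crosses between the parties is re-randomized first.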
3. Correctness and Security
3.1. Correctness
The number
is blinded by A in line 14 by negating the bit with probability
, in which case
(otherwise,
). From
, B computes
. After receiving
, A computes either
or
depending on the value of
. To see that the computation in line 27 has been done correctly, observe that the value of
is toggled from
to
and back by computing
:
The converse follows by substituting
for
and vice versa. The correctness of the computation of
(describing the relation
) from
in lines 32 and 35 follows from
Table 2 below.
The first three rows show the eight possible values of the triplet
. The last four rows are computed from the first three. For computing the row
, the recurrence relation of Equation (
1) is used. From the table follows that, for each value of the triplet
,
when
, and
otherwise.
It follows that in each iteration, the value of is correctly computed from the previous one, and thus that the decrypted value t at the end of the comparison protocol indeed equals the bit , and therefore .
3.2. Security
In order to show that our protocol privately computes the comparison of two integers in the semi-honest model, we have to show that whatever can be computed by A or B from their view of a protocol execution, can be computed from their input and the comparison result (see Definition 7.2.1 in Goldreich [
21]).
The view of A consists of its private number
a, the size
ℓ, the comparison bit
t, the internal coin tosses
,
, and all intermediate messages received from B: the encrypted bits
,
, which are the encrypted bits of the number
b, and the encrypted bits
,
, which equal
,
being the blinded version of the bit
. Summarizing, the view of A equals
It suffices to show that there exists a probabilistic polynomial-time algorithm
such that
is computationally indistinguishable from
[
21]. Since the encryption algorithm is semantically secure, every pair of encryptions is computationally indistinguishable, so by letting
randomly generate
encryptions and
coin tosses, this condition is easily verified.
The view of B consists of its private number
b, the decryption key
K, the size
ℓ, the comparison bit
t, and all intermediate messages received from A: the blinded bits
,
,
being the blinded version of the bit
. Note that B also receives
in the end, but since this number doesn’t add anything to the view, we leave it out for convenience. Since B owns the decryption key, all encrypted values
can be decrypted to
. Additionally, B is able to deduce the randomization part of each encryption, but since A carefully uses re-randomization before each transmission, this information can be considered as a random variable and is therefore useless to B. Summarizing, the view of B is equivalent to
Again, we have to show that there exists a probabilistic polynomial-time algorithm
such that
is computationally indistinguishable from
. This is easily satisfied by letting
randomly generate
bits. The derivation in Equation (
2) shows that the bits
are indeed uniformly distributed.
In fact, we can even prove a much stronger assertion, namely that yields no more information regarding a than its inputs and output do, implying perfect (or information-theoretic) security for A towards B. This is due to the blinding technique of the numbers :
- (i)
The bit t contains the value .
- (ii)
A tosses a fair coin c, when head then else .
- (iii)
A sends to B.
To see that
, where
I denotes the mutual information, we show that each
is uniformly distributed.
It’s easy to see that the random variable
(and its uniform distribution) is independent of
a,
b,
K,
ℓ,
t, or any previous value of
, so the equality of mutual information follows.
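The uniformity of the blinded bit can also be verified by exhaustive enumeration. The sketch below assumes that the blinded bit is computed as t ⊕ c for a fair coin c (the function name is hypothetical) and returns its exact distribution.

```python
from fractions import Fraction


def blinded_distribution(t: int) -> dict[int, Fraction]:
    """Exact distribution of the blinded bit t xor c for a fair coin c."""
    dist = {0: Fraction(0), 1: Fraction(0)}
    for c in (0, 1):                 # both coin outcomes have probability 1/2
        dist[t ^ c] += Fraction(1, 2)
    return dist
```

Both values of t give the same distribution, so observing the blinded bit tells B nothing about t.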
We conclude that the private input of B is computationally secure towards A, and that the private input of A is unconditionally (so even with unbounded computation power) secure towards B.
4. Complexity
In this section, we compute the computational, communication, and storage complexity of our protocol. We assume that during the protocol, the randomization part is omitted where possible. In fact, an encrypted value only has to be re-randomized, i.e., multiplied by a random square, when it is transmitted to the other party, such that two pairs of encrypted bits are indistinguishable. Note that although B owns the decryption key (also called private key), re-randomization is necessary even for A, because B might be able to recognize the randomization part.
The summary of the complexity analysis of our comparison protocol is depicted in
Table 3. We ignore the efforts for key generation and key distribution.
The computational complexity is measured in the number of multiplications modulo N, the communication complexity is measured in the number of messages. Each message equals an encrypted bit and has a size of bits. The storage complexity is measured in the number of encrypted numbers (of size bits) to be stored. When the comparison result has to be known to both parties, we need an extra decryption of the comparison result by B, as depicted in the table above. If the output bit should be available in encrypted form, no decryptions are required. The unencrypted message of one bit (the comparison result) from B to A is neglected in our analysis.
An important observation is that the (maximal) number of multiplications modulo
N is even less than the number of modular multiplications needed for the Paillier encryption of an
ℓ (
) bit number:
[
4].
4.1. Computational Complexity
We compute the computational complexity by counting the number of multiplications modulo N, since these form the main computational load. Due to the construction of GM, the encryption of 0 requires one multiplication (squaring) modulo N, and the encryption of 1 requires two multiplications modulo N. A decryption of to x requires the computation of the Jacobi symbol , which equals the product of the Legendre symbols and . The Legendre symbol is equivalent to modulo p, and therefore requires approximately multiplications modulo p. A multiplication modulo N is more intensive, and requires more or less four multiplications modulo p (or q), such that the total number of multiplications modulo N for one GM decryption is estimated by .
The expected number of multiplications (in the encrypted domain) for A in this protocol equals times: for the blinding of in line 14, 2 for the re-randomization of in line 17, for the possible conversion of in line 27, and for the computation of in line 32. Together with 2 for the final re-randomization of in line 40, for a total of multiplications. The expected computational load for B equals the ℓ encryptions of in lines 1 and 24, and re-randomizations of in line 24, for a total of multiplications.
The actual number of multiplications depends on the (bit) values of a and b. Since timing attacks might reveal some information about these numbers, we also mention the maximal number of multiplications, which are computed similarly and equal for both A and B. This maximum for B is achieved when each bit of b is one, because then the encryption of will take 2 multiplications, as well as the randomization of . For A, we have to take a closer look at lines 14, 27 and 32 and their conditions , and , respectively. One can see that at most two of these three conditions will hold for each i in the for loop. This will be the case when either or .
No intermediate decryptions are needed in the protocol. Only when the end result
has to be available in plain text, or a conversion to another encryption system is needed (see e.g., the client–server solution in
Section 6), one decryption by B is desired at the end which costs approximately
multiplications modulo
N.
Some values can be precomputed to reduce the number of multiplications, but this requires more storage capacity. All encryptions and all random parts (the squarings) can be precomputed and stored.
4.2. Communication Complexity
The numbers that are sent in our protocol are all single bits encrypted by the GM system, resulting in encrypted numbers of bits.
A sends to B times the value of , and once the number , for a total of ℓ numbers of bits. B sends to A the number , and times the numbers and , for a total of numbers of bits. Our protocol takes ℓ communication rounds (plus half a round in the first step).
4.3. Storage Complexity
We count the number of encrypted values that have to be stored, since plain integers are relatively small.
A has to store the current values of , and , requiring three storage units. When is computed, the storage unit of can be used so this doesn’t require an extra storage unit. B has to store and , requiring two storage units. When is received, this can be stored in the storage unit of avoiding an extra storage unit.
The storage complexity expands when using precomputations to store encryptions of known bits, or some random squares used for re-randomization. The total number of storage units will depend on the implementation and the requirements with respect to waiting time, communication, computation and storage capacities.
5. Alternative Solutions
In this section, we compare our comparison protocol, the Lightweight Secure Integer Comparison (LSIC), with other solutions for solving the millionaires problem. Since we are only interested in the most efficient solutions, we restrict ourselves to the semi-honest model. In the literature, we find two main classes of solutions that use public key cryptography, namely one based on homomorphic encryptions, and one based on garbled circuits. We compare our solution to the best representatives in each category.
5.1. KT
Kerschbaum and Terzidis (KT) present an efficient solution to the millionaires problem in the semi-trusted model in [18]. Their cryptographic protocol is depicted in Algorithm 2. Party A computes the Paillier-encrypted difference
and lets B decide whether
. They use the upper half
of the plaintext range to represent negative numbers. Since B is not allowed to learn
x, A uses multiplicative blinding (which preserves the sign of
x) in line 5 to hide it from B. To prevent
x from exceeding
,
ℓ is bounded by
. The main disadvantage is that multiplicative blinding is not perfect and leaks some information about
x, and thus
a, to B [
22].
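The sign-preserving effect of multiplicative blinding can be illustrated on plaintext values, without any encryption. The sketch below makes several assumptions that are not taken from Algorithm 2: the difference is x = a − b, the blinded value is r·x + r′ with 0 ≤ r′ < r, the blinding factor r has at most ℓ bits, and the toy modulus 2⁶¹ − 1 stands in for the Paillier plaintext modulus. All names are hypothetical.

```python
import secrets

N = (1 << 61) - 1   # hypothetical stand-in for the plaintext modulus


def blind_difference(a: int, b: int, r: int, r_prime: int) -> int:
    """Return (r * (a - b) + r_prime) mod N, assuming 0 <= r_prime < r."""
    assert 0 <= r_prime < r
    return (r * (a - b) + r_prime) % N


def is_nonnegative(value: int) -> bool:
    """Interpret the upper half of [0, N) as negative numbers."""
    return value <= N // 2


def kt_style_compare(a: int, b: int, nbits: int) -> bool:
    """Return True when a >= b, looking only at the blinded difference."""
    r = secrets.randbelow(1 << nbits) + 1      # blinding factor, at least 1
    r_prime = secrets.randbelow(r)             # additive offset below r
    return is_nonnegative(blind_difference(a, b, r, r_prime))
```

The sign survives because r·x + r′ is at least 0 when x ≥ 0 and at most −1 when x ≤ −1. The magnitude of the blinded value still grows with |a − b|, which is the kind of leakage mentioned above.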
We compute the computational complexity of KT to compare it with our solution LSIC. In line 2,
b is encrypted by Paillier, a number of
ℓ bits. In line 5 a number of
bits is encrypted, and an exponentiation is computed to the power
r which has
bits, and in line 8, the number
x is decrypted. Therefore, the total number of multiplications modulo
N is
. This computational complexity is better than the other alternatives, but worse than LSIC. Although the communication complexity of KT is very good, its weaker notion of security is a serious drawback.
Algorithm 2 Kerschbaum and Terzidis (KT). |
| |
Input A integer a | |
Input B integer b and the decryption key K | |
Joint output bit t, where | |
| |
| {Both a and b consist of ℓ bits} |
| B encrypts b and sends to A |
| A chooses random number r of bits |
| A chooses random number such that |
5: | A: | ▷ |
| |
| A sends to B |
| B decrypts |
| if then |
10: | B: |
| else |
| B: |
| end if |
| |
15: | B sends t to A |
| |
5.2. DGK
The DGK (Damgård, Geisler, and Krøigaard) protocol [5,6,7], which is actually inspired by Blake and Kolesnikov [23], is depicted in Algorithm 3.
Algorithm 3 The Damgård, Geisler, and Krøigaard (DGK) comparison protocol. |
| |
Input A | |
Input B and the decryption key K | |
Joint output bit t, where | |
| |
| Party B encrypts the bits , and sends all to A. |
| for all do | ▷ A computes the bitwise exclusive or’s |
| if then |
| A: |
5: | else |
| A: | ▷ |
| end if |
| |
| end for |
10: | for all do |
| A encrypts |
| A: |
| {} |
| A blinds towards B by raising to a random non-zero number |
15: | end for |
| A randomly permutes the numbers and sends them to B |
| B checks whether one of them is zero, and returns the result t to A |
| |
The main idea of DGK is to search, from left to right, for the first position
i where the bits of
a and
b differ. When
, as indicated by
, then
. All other numbers
,
will be positive. The values and order of the numbers
have to be blinded because they reveal some information about
a towards B. They use a special homomorphic encryption system that is fine-tuned to small plaintext sizes
u and enables efficiently checking whether encrypted values are zero [
6].
There is an important computational improvement in Algorithm 3, which is not mentioned by Damgård, Geisler, and Krøigaard, based on the observation that the number can only be zero when . This implies that A only has to compute the numbers for which . For the other numbers , an encrypted, non-zero random number can be chosen. This will save some modular multiplications for A.
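The search for the first differing bit can be sketched on plaintext bits. The sketch assumes the commonly used form c_i = a_i − b_i + 1 + 3·Σ_{j>i}(a_j ⊕ b_j), in which the constant 1 selects the test a < b. Encryption, blinding, and permutation are omitted, and the function names are hypothetical.

```python
def dgk_values(a: int, b: int, nbits: int) -> list[int]:
    """Return the values c_i, ordered from the most significant bit down."""
    values = []
    higher_diffs = 0                      # number of differing bits above position i
    for i in reversed(range(nbits)):      # most significant bit first
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        values.append(a_i - b_i + 1 + 3 * higher_diffs)
        higher_diffs += a_i ^ b_i
    return values


def dgk_less_than(a: int, b: int, nbits: int) -> bool:
    """a < b exactly when one of the c_i equals zero."""
    return 0 in dgk_values(a, b, nbits)
```

In this form, a zero can only occur at a position where a_i = 0 and b_i = 1 with all higher bits equal, which is consistent with the observation above that positions with a_i = 1 never need to be computed.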
The computational complexity of the DGK protocol is analyzed as follows:
- (i)
The bitwise exclusive or’s have to be computed for the bits of a and b. This takes on average multiplications.
- (ii)
From the exclusive or’s, the ℓ numbers have to be computed. This can be done in multiplications, by storing the intermediate result of .
- (iii)
The numbers
have to be blinded, i.e., raised to a random number of length
u [
6]. This step can be done efficiently due to the smartly chosen encryption system. The number
u is relatively small and equal to the first prime larger than
. The blinding takes around
multiplications.
These estimates are based on the idea that in case one has to raise a number to an exponent of n bits, this will take on average multiplications.
When taking u, the plaintext size, equal to , we come up with a total of multiplications, which is of order (due to the blinding of the numbers ).
These are all multiplications on the account of A. For B, the main computational load is in decrypting the received numbers. B receives the
ℓ blinded numbers
and has to decrypt them to decide whether one of them is zero or not. Therefore, a full decryption is not required, only a check whether the encrypted value is zero or not. This is relatively easy in the DGK encryption system, and is equivalent to raising each
to the power
v, which is a number of size
[
6]. By using the factorization of
N while decrypting, the total number of multiplications for B is about
. Unless
ℓ is very large, the decryptions (by B) determine the main computational load of the DGK protocol.
The encryptions by A and B are easy, except for the randomization part, which is done when sending the encrypted value to the other party. (Re)randomizations in the DGK encryption system are roughly equivalent to raising a number to an exponent of size
bits [
6], so we estimate each randomization by
multiplications modulo
N. Each party has to (re)randomize
ℓ numbers.
In the DGK protocol, A sends to B the ℓ numbers , and B sends to A the ℓ numbers , and the final result t (which is only one bit). So the communication load from B to A is ℓ instead of our . More importantly, the DGK protocol takes only one (and a half) round.
In the DGK protocol, A has to store the ℓ numbers , and , for a total of storage units. Party B has to store the ℓ numbers before they can be decrypted, for a total of ℓ storage units. The storage complexity can be reduced by increasing the number of communication rounds, but it cannot be less than linear in ℓ, because the order of the ℓ numbers has to be randomized towards B.
The DGK protocol offers perfect (unconditional) security for A towards B, because B only receives blinded values, and computational security for B towards A, protected by the semantically secure encryption system.
5.3. Garbled Circuits
The millionaires problem could also be solved by using some form of garbled circuits. The main components of such a solution are:
- (1)
A creates a garbled circuit for comparing two ℓ bits numbers and sends it to B. The private inputs of A are incorporated by using only the corresponding input wires.
- (2)
For each input bit of B, A and B perform an (1 out of 2) oblivious transfer protocol so B can use the correct input wire of the garbled circuit.
- (3)
B evaluates the garbled circuit, which results in one output wire.
- (4)
B sends (a part of) the output wire to A, which translates this to the result of the comparison.
The computationally most intensive step is the oblivious transfer of the input bits of B (step 2), since this involves asymmetric cryptography, while evaluation of the circuit can be efficiently done with symmetric techniques. The most efficient implementations of step 2 are based on Elliptic Curve Cryptography (ECC), in which case at least one ECC encryption and decryption is needed per input bit. In [
24], it is estimated that one ECC encryption plus decryption is comparable to 200 multiplications modulo a 1024-bit number, when considering a security level similar to an RSA modulus of 1024 bits, which in ECC corresponds to a 160-bit modulus.
It must be noted that the computational complexity of GC can be considerably reduced by using precomputations, but this is considered out of scope for the environments we are interested in.
The communication complexity of one of the best known GC solutions [
12] is
, where
t is the usual key size of symmetrical cryptosystems (we use
). The circuit here is also based on our recurrence relation that is depicted in Equation (
1).
The storage complexity of GC is more or less equal to the communication complexity, since the communicated garbled circuit, and the obliviously transferred values have to be stored separately. However, when using more communication rounds, where in each round the oblivious transfer values and the garbled circuit part with respect to one bit is communicated, the storage complexity could be reduced to approximately , i.e., bits per party.
Although other variants of garbled circuits exist, the most efficient implementation considered here works in the semi-honest model, and offers computational security for both parties. Furthermore, the solution considered here relies on a weak form of the Random Oracle assumption, namely correlation-robust functions [12].
5.4. Summary
A rough summary of the previous subsections is given in Table 4.
We chose a modulus size of 1024 bits for the asymmetric systems (used in LSIC, KT, and DGK), 160 bits for the elliptic curves used in the oblivious transfer part of GC, and 80 bits for the symmetric systems used in the circuit evaluation. The amount of computation is measured as the number of multiplications modulo a 1024-bit number. The amount of communication and storage is measured in bits. The computational load of GC is a lower bound, since it counts only the one ECC encryption and decryption (per input bit) needed in the oblivious transfer part.
The computational complexity of the four solutions is compared in Figure 1, which also includes the multiplications needed for (re)randomization.
When comparing LSIC with the other solutions, it’s clear that both its computational and storage complexity are much smaller, while its communication complexity is slightly larger. The large computational effort of the Paillier decryption in KT is clearly visible in Figure 1, and motivates our search for a solution like LSIC that does not require one.
In contrast to KT, DGK, and GC, which are constant-round solutions, LSIC uses a number of communication rounds that is linear in ℓ. Although the extra rounds could cause delays depending on the implementation, this approach offers a significant reduction in storage complexity.
6. Client-Server Model
The LSIC protocol could, like most comparison protocols, also be used in the client–server model. Although we focus on LSIC, the same approach could be used for converting any comparison protocol to the client–server model. We also discuss some variations of the client–server protocol, as depicted in Algorithm 4, making it suitable for different types of output.
Assume A holds the encryptions of two ℓ-bit numbers a and b, party B has the private key, and they want to compare a and b. The actual values of a and b are not known to A and B. Note that the encryption scheme used for a and b (e.g., Paillier) differs from GM, because a and b consist of more than one bit.
The main idea of Algorithm 4 is that the most significant bit of
indicates whether
. Since
x is an
bit number, its most significant bit equals
. In line 2, the number
x is statistically blinded by the random number
r, which should contain
more bits than
x. Since we don’t allow carry-overs modulo
N when computing
x, this protocol only works whenever
. In line 9 of Algorithm 4 we use the LSIC protocol to compute
. Since its output
t should remain unknown to A and B, to protect the privacy of
a and
b, we use the output variant where A obtains the encrypted value
. The correctness of Algorithm 4, i.e., the proof of the assertions, and the formal security proof, is given in later subsections. Algorithm 4 computes
, but also
could be similarly computed by swapping
a and
b in the computation of
x and flipping
t at the end, which is most easily done by flipping
(by A) or
(by B) before encrypting it.
Algorithm 4 Client–server comparison. |
| |
Input A and | |
Input B the decryption key K | |
Output A encrypted bit , where | |
| |
| {Both a and b consist of ℓ bits} |
| A: | ▷ |
| A chooses a random number r of bits for blinding x |
| A: | ▷ |
5: | A sends to B |
| B decrypts |
| A: |
| B: |
| A: |
10: | |
| B encrypts and sends to A | ▷ is the -th bit of z |
| A encrypts | ▷ is the -th bit of r |
| A: | ▷ |
| {} |
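The arithmetic behind Algorithm 4 can be followed on plaintext values. The sketch below is not the protocol itself: it omits all encryption and assumes the standard blinding construction x = 2^ℓ + a − b and z = x + r, with the LSIC step returning t = 1 exactly when (z mod 2^ℓ) < (r mod 2^ℓ). The function name and the parameter `sigma` (the number of extra blinding bits) are hypothetical.

```python
import secrets


def client_server_compare_plain(a: int, b: int, ell: int, sigma: int = 40) -> int:
    """Plaintext walk-through of the blinding arithmetic (no encryption).

    Assumed formulas: x = 2^ell + a - b, z = x + r, and the comparison
    step yields t = 1 iff (z mod 2^ell) < (r mod 2^ell).
    Returns 1 iff a >= b.
    """
    assert 0 <= a < 2**ell and 0 <= b < 2**ell
    # A: x has ell + 1 bits; its top bit is 1 exactly when a >= b.
    x = 2**ell + a - b
    # A: r has sigma more bits than x and statistically hides x.
    r = secrets.randbits(ell + 1 + sigma)
    z = x + r  # the value B would obtain by decrypting
    # A and B keep the low ell bits of r and z as comparison inputs.
    c = r % 2**ell
    d = z % 2**ell
    t = int(d < c)  # carry out of the low ell bits of x + r
    # Bit at position ell (counting from 0) of z and of r.
    z_bit = (z >> ell) & 1
    r_bit = (r >> ell) & 1
    return z_bit ^ r_bit ^ t


if __name__ == "__main__":
    ell = 8
    for a, b in [(5, 3), (3, 5), (7, 7), (0, 255), (255, 0)]:
        print(a, b, client_server_compare_plain(a, b, ell))
```

In the actual protocol, A only ever sees the three bits in encrypted form and combines them homomorphically; here they are combined in the clear to make the carry argument visible.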
This client–server protocol only needs one decryption, namely the Paillier decryption in line 6. In addition to LSIC, which is executed in line 9, only five extra multiplications modulo
N are needed (and two GM encryptions), so this client–server protocol has a very low overall computational complexity. In other known solutions, like the client–server protocol by Erkin et al. in [
8], the number
is computed via the number
, which requires a division of an encrypted number by
, and thus an exponentiation to a number of size
, which requires substantially more modular multiplications.
Note that although the inputs are encrypted by Paillier, the output is encrypted by QR. It’s also possible to obtain the output encrypted by Paillier by replacing lines 11 to 13 by the following lines:
B computes , encrypts it with Paillier, and sends to A |
A encrypts with Paillier | |
A: | ▷ |
| |
The only problem with this solution is that the output
of the comparison protocol in line 9 is encrypted by QR, while Paillier encryption is needed in line 13. To overcome this problem, one solution is to run LSIC entirely with Paillier encrypted bits. This introduces some extra computations for the encryptions and re-randomizations. A computationally less intensive solution is to convert the QR encrypted output bit at the end into a Paillier encryption. This requires a modification of the final line, number 40, of Algorithm 1:
{ is the QR encrypted bit } | |
A blinds by tossing a fair coin | |
if then | |
A: | |
else | |
A: | ▷ |
end if | |
| |
A sends to B | |
B decrypts , encrypts it with Paillier , and sends to A |
if then | |
A: | |
else | ▷ A unblinds t |
A: | ▷ |
end if | |
{Now is the Paillier encrypted bit } | |
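The conversion steps can be tried out with toy versions of both cryptosystems. The following is a minimal sketch, assuming textbook Goldwasser–Micali (the QR-based scheme) and textbook Paillier with generator N + 1, both over the same tiny modulus for brevity; all function names are hypothetical, the primes are far too small for real use, and in the real protocol only B would hold the decryption functions. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import math
import secrets

# Toy primes, both congruent to 3 mod 4; far too small for real use.
P, Q = 499, 547
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid because the Paillier generator is N + 1


def rand_unit() -> int:
    """Random element of Z_N that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            return r


def gm_enc(bit: int) -> int:
    # -1 is a non-residue mod P and mod Q because both are 3 mod 4.
    return (pow(N - 1, bit, N) * pow(rand_unit(), 2, N)) % N


def gm_dec(c: int) -> int:
    # The plaintext is 0 iff c is a quadratic residue modulo P.
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1


def pa_enc(m: int) -> int:
    return ((1 + m * N) * pow(rand_unit(), N, N2)) % N2


def pa_dec(c: int) -> int:
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def qr_to_paillier(ct_t: int) -> int:
    # A: blind t with a fair coin s; multiplying by a fresh GM encryption
    # of s XORs s into the plaintext and re-randomizes the ciphertext.
    s = secrets.randbits(1)
    blinded = (ct_t * gm_enc(s)) % N
    # B: decrypt the blinded bit u = t XOR s and re-encrypt under Paillier.
    pa_u = pa_enc(gm_dec(blinded))
    # A: unblind homomorphically; if s = 1 then t = 1 - u.
    if s == 0:
        return pa_u
    return (pa_enc(1) * pow(pa_u, -1, N2)) % N2


if __name__ == "__main__":
    for t in (0, 1):
        print(t, pa_dec(qr_to_paillier(gm_enc(t))))
```

B only ever decrypts the blinded bit, which is uniformly distributed because of the coin toss, so the conversion reveals nothing about t to B.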
It’s also possible to obtain the comparison result in the client–server model as a shared secret. To this end, the final lines of Algorithm 4, starting with line 9, have to be modified slightly:
A:
B:
{}
The shared output solution saves two multiplications by A, and one encryption and transmission of the number by B. On the other hand, the shared output version of LSIC requires an extra blinding action and transmission by A and also an extra decryption by B.
6.1. Correctness
Since , it is clear that the most significant bit of x, i.e., , will be equal to the comparison result of , i.e., if and only if , since both a and b are ℓ bits long. This proves that the assertion in line 14 indeed gives the correct output.
In order to understand line 13, where the number
is computed, both in the QR encrypted and the Paillier encrypted version, observe that for each positive integer
x, the number
is defined by
such that
. Since
we know that
and
whenever
, and
and
, otherwise. In the first case,
. In the second case,
. So
The relation
easily follows, which shows the correctness of line 13 in the Paillier encrypted version. The correctness in the QR encrypted version follows by observing that
, so
. Since
is a bit value, we have
, so
The correctness of the shared output version of the client–server protocol also follows immediately from this equation.
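The carry argument of this subsection can be written out as a short derivation. It assumes the standard blinding construction x = 2^ℓ + a − b and z = x + r with r ≥ 0, and writes t for the bit that equals 1 exactly when z mod 2^ℓ < r mod 2^ℓ; these concrete formulas and the bracket notation are assumptions of this sketch.

```latex
\begin{align*}
z &= x + r, \qquad x = 2^{\ell}\lfloor x/2^{\ell}\rfloor + (x \bmod 2^{\ell}),\\
(x \bmod 2^{\ell}) + (r \bmod 2^{\ell}) &= (z \bmod 2^{\ell}) + t\,2^{\ell},
  \qquad t = [\,z \bmod 2^{\ell} < r \bmod 2^{\ell}\,],\\
\lfloor x/2^{\ell}\rfloor &= \lfloor z/2^{\ell}\rfloor - \lfloor r/2^{\ell}\rfloor - t,\\
x_{\ell} &= z_{\ell} \oplus r_{\ell} \oplus t
  \quad\text{since } \lfloor x/2^{\ell}\rfloor \in \{0,1\}.
\end{align*}
```

Here x_ℓ, z_ℓ and r_ℓ denote the bits at position ℓ, counting the least significant bit as position 0. The third line is the relation needed for an additively homomorphic output, and the fourth is its reduction modulo 2, which is what an XOR-homomorphic scheme can evaluate.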
6.2. Security
We prove the security of the first mentioned variant of the client–server protocol as depicted in Algorithm 4. The proof of the other variants is similar.
In order to show that our protocol privately computes the comparison of two integers in the semi-honest model, we have to show that whatever can be computed by A or B from their view of a protocol execution, can be computed from their input and the comparison result (see Definition 7.2.1 in Goldreich [21]).
The view of A consists of the encrypted numbers
and
, the size
ℓ, the random number
r, the encrypted comparison bit
, and all intermediate messages received from B. Besides
, this also includes all intermediate messages in the LSIC subprotocol: the encrypted bits
,
, which are the encrypted bits of the number
d, and the encrypted bits
,
, which equal
,
being the blinded version of the bit
. Summarizing, the view of A equals
It suffices to show that there exists a probabilistic polynomial-time algorithm
such that
is computationally indistinguishable from
[
21]. Since both encryption algorithms are semantically secure, every pair of encryptions is computationally indistinguishable, so by letting
randomly generate
encryptions and one random number with the same distribution as
r, this condition is easily verified.
The view of B consists of the decryption key
K and the size
ℓ, and all intermediate messages received from A. Besides
, this also includes all intermediate messages in the LSIC subprotocol: the blinded bits
,
,
being the blinded version of the bit
. The comparison result
and the intermediate comparison result
t are not received by B. Since B owns the decryption key, all encrypted values
can be decrypted to
, just like the number
z. Additionally, B is able to deduce the randomization part of each encryption, but since A carefully uses re-randomization before each transmission, this information can be considered as a random variable and is therefore useless to B. Summarizing, the view of B is equivalent to
Again, we have to show that there exists a probabilistic polynomial-time algorithm
such that
is computationally indistinguishable from
. In Equation (
2), it was shown that the bits
are uniformly distributed, so we could let
generate
random bits for those. The number
z is a number of
bits, but is, due to its construction, statistically indistinguishable from a random number of equal length, which means that the difference between both probabilities is bounded by
. Therefore, we can even prove the stronger assertion that
is statistically indistinguishable from
.
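The statistical-blinding claim can be checked numerically for small parameters. The sketch below assumes that r is uniform over ℓ + 1 + σ bits, that x has at most ℓ + 1 bits, and that closeness is measured by statistical distance (half the sum of absolute probability differences); `sigma` and the function name are hypothetical.

```python
from fractions import Fraction


def blinding_distance(x: int, r_bits: int) -> Fraction:
    """Exact statistical distance between z = x + r and r itself,
    for r uniform on [0, 2^r_bits) and a fixed x with 0 <= x <= 2^r_bits."""
    m = 2**r_bits
    # r takes each value in [0, m) and z each value in [x, x + m) with
    # probability 1/m, so only values in exactly one range contribute.
    differing = sum(1 for v in range(x + m) if (v < m) != (v >= x))
    return Fraction(differing, 2 * m)


if __name__ == "__main__":
    ell, sigma = 4, 6
    r_bits = ell + 1 + sigma
    worst = max(blinding_distance(x, r_bits) for x in range(1, 2**(ell + 1)))
    print(worst, worst < Fraction(1, 2**sigma))
```

For a fixed x the distance equals x divided by 2^(ℓ+1+σ), so every (ℓ+1)-bit x stays below 2^(−σ): each extra blinding bit halves what B can learn.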
We conclude that the private input of B is computationally secure towards A, and that the private input of A is statistically secure towards B. We use the term statistically secure instead of perfectly secure, although the private input of A is secure even when B has unbounded computing power. The difference with perfect security is that the amount of information leakage is not zero, but negligible.
7. Conclusions
We described a new protocol for the millionaires problem using additively homomorphic encryption in the honest-but-curious model. There are no restrictions on the size of the inputs. Since this protocol based on homomorphic encryption doesn’t use intermediate decryptions, it has the lowest number of modular multiplications of all known solutions, and also a low communication and storage complexity, making it preferable for lightweight environments. The (maximal) number of multiplications modulo N is even less than the number of modular multiplications needed for the Paillier encryption of an ℓ-bit number. The number of communication rounds is equal to the number of input bits. The private input of the first player (A) is computationally secure towards the second player (B), and the private input of the second player is even perfectly secure towards the first player. Furthermore, we showed how to transform any comparison protocol to the client–server model in an efficient and secure way. The client–server solution offers computational security for B, and even statistical security for A. We showed three output variants for both the private-input and the client–server model, namely a publicly known output, an encrypted output, and a secret-shared output.