1.1. Private Information Retrieval
Private Information Retrieval (PIR) protocols allow a client to fetch items from a server’s database without disclosing to the server which item was requested. A main challenge in constructing PIR protocols is minimizing the communication complexity. The idea of PIR was introduced by Chor et al. [1], together with a 2-server PIR protocol having communication complexity $O(n^{1/3})$ for a database of size n. PIR has a wide variety of applications, such as anonymous communication [2,3], privacy-preserving media streaming [4], blockchain security [5,6], personalized advertisement [7], and location and contact discovery [8,9,10].
The naive approach to PIR is just to make the server send all the items in the database to the client: we stress that PIR cares only about the privacy of the client’s request but not about the privacy of the server. However, it entails a huge communication complexity equal to the size of the database. To shorten the communication complexity and still keep the privacy of the request, there are two main approaches to construct PIR:
Historically, the first type of PIR was Multi-Server PIR [1], where the database is replicated across $k$ non-colluding servers. The client secret-shares its request; the servers locally compute a secret-shared response and send it back to the client, who recovers the item from the shares of the response. Multi-Server PIR protocols, such as [11,12,13,14], are relatively efficient in the information-theoretic setting. The requirement that the replicated database be kept by non-colluding parties is restrictive; however, there is room for such PIR, for example in blockchain databases, cloud services, and multi-server enterprise ecosystems, where a small number of servers (but not all) are likely to be compromised.
Single-Server PIR protocols work in the computational setting and are built on homomorphic encryption (FHE, AHE, or SHE). The starting point for single-server PIR is the AHE-based protocol of Kushilevitz and Ostrovsky [15]. Early single-server PIR constructions were inefficient in both computation and communication, although significant recent progress has produced practically suitable single-server PIR solutions [16,17,18,19]. For instance, the SHE-based OnionPIR protocol [16] achieves a 64 KB request and a 128 KB response in the online phase of the protocol (and the same in the offline phase) for all realistic database sizes.
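To make the multi-server idea concrete, the following minimal Python sketch shows a standard textbook 2-server scheme over a database of bits. It has linear communication and is therefore only an illustration, not the sublinear protocol of [1]; the function names (`make_queries`, `answer`, `retrieve`) are hypothetical. Each server sees a uniformly random bit vector, so a single server learns nothing about the requested index.

```python
import secrets

def make_queries(n, i):
    """Client: build two bit vectors that differ only at position i."""
    q1 = [secrets.randbelow(2) for _ in range(n)]  # uniformly random selection vector
    q2 = q1.copy()
    q2[i] ^= 1                                     # flip the bit of the wanted index
    return q1, q2

def answer(db, q):
    """Server: XOR of the database bits selected by the query vector."""
    acc = 0
    for bit, sel in zip(db, q):
        acc ^= bit & sel
    return acc

def retrieve(db, i):
    """Run both (non-colluding) servers and combine their one-bit answers."""
    q1, q2 = make_queries(len(db), i)
    # All positions except i are selected in both queries or in neither, so they cancel.
    return answer(db, q1) ^ answer(db, q2)
```

Here the request is additively shared over bits and each server applies a linear map to its share, which is the same pattern that the more efficient protocols refine.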
At a high level, in both approaches the database is represented as a function (usually a polynomial) f such that $f(x) = y$ holds for any key x and the corresponding value (a record) y. The client then has to send the request x to the server (or servers) in a way that preserves its privacy. For Single-Server PIR, this means that x is sent encrypted; in the Multi-Server paradigm, x is secret-shared. The encryption or secret sharing has to be homomorphic, so that the server (or servers) can compute the function under the encryption or secret sharing and send the encrypted or secret-shared response y back to the client.
In the 2-server computationally secure PIR of Gilboa and Ishai [20], the request is shared as a DPF (Distributed Point Function) and has polylogarithmic length. In this case, only additive operations are needed to compute the shares of the response (DPF sharing is homomorphic with respect to them). However, in the information-theoretic setting, which is the focus of this work, it is still unclear how to construct a PIR that is efficient in terms of communication and computation from a secret sharing that is homomorphic with respect to any number of additions and multiplications.
Currently, three generations of information-theoretic PIR protocols exist. The first generation, originating from the work of Chor et al. [1], is based on Reed-Muller codes and has communication complexity
. The second generation, from Beimel et al. [21], restated some of the previous results in a more arithmetic language, in terms of polynomials, and also considered a certain encoding of the inputs with element-wise secret sharing of the encoding, which resulted in
communication complexity. The third generation, from the works of Efremenko [11] (followed by [22,23,24,25,26]), Yekhanin [12], Beimel et al. [13], and Dvir and Gopi [14], is based on matching vectors and is the most computationally efficient line of protocols, with complexity
for database size
n. In all third-generation schemes except [14], as demonstrated by Beimel et al. [13], a combination of two secret-sharing schemes is in fact used, each linear over a different group, together with a share conversion with respect to some relation, which allows the servers to locally perform a non-linear operation over the shares (except in the case of the identity relation).
1.2. Share Conversion
Suppose that there is some number of parties, each holding a share of a secret s which was created by a secret-sharing scheme $\mathcal{L}_1$.
Share conversion is a local computation performed by those parties, based only on their shares, which outputs new shares of a secret $s'$ in a different scheme $\mathcal{L}_2$ so that some predefined relation holds between s and $s'$. A systematic study of share conversion was started by Cramer et al. [27], who considered the case $s' = s$ for two arbitrary linear secret-sharing schemes over different fields.
Let us consider an easy illustrative example: for the function over the ring , and for the conversion’s relation , for the input shared in a linear scheme over the ring R, it is possible to compute in the following circuit: first, according to the linear property of the first scheme, servers locally compute shares of , then convert shares of , and to shares of , and over , and finally obtain shares of the response .
This approach, however, leaves room for improvement, as such a conversion usually increases the size of the request and response in PIR. The conversion is a local operation, so the issue is not trivial: to evaluate the circuit that computes some succinct function $f$ representing the database, the client must form its request as a proper input to this circuit. In addition, not every circuit can be computed within the existing secret-sharing and conversion schemes, which means that we are bound to certain circuit families only and, depending on the VC-dimension of the corresponding function families, the proper representation of the request might be much larger than the size of the database. Recall that the notion of VC-dimension was introduced by V. Vapnik and A. Chervonenkis in [28]. Informally, for the boolean function family
, where each
, VC-dimension
is the size of the largest
such that the set
of restrictions of functions from
contains all the possible boolean functions over
I. The higher
relative to
, the more efficient PIR can be built. For a precise definition, see [
13].
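As a small illustration of the informal definition above, the following Python sketch computes the VC-dimension of a finite family of boolean functions by brute force. The function name `vc_dimension` and the two example families are assumptions made for this illustration; the search is exponential and only meant for toy domains.

```python
from itertools import combinations, product

def vc_dimension(functions, domain):
    """Size of the largest subset I of `domain` on which `functions`
    realizes every possible 0/1 labelling (i.e., I is shattered)."""
    best = 0
    for size in range(1, len(domain) + 1):
        shattered_some = False
        for subset in combinations(domain, size):
            # Collect the restrictions of all functions to this subset.
            patterns = {tuple(f(x) for x in subset) for f in functions}
            if len(patterns) == 2 ** size:
                shattered_some = True
                break
        if not shattered_some:
            break  # subsets of shattered sets are shattered, so larger sizes fail too
        best = size
    return best

# Threshold functions x >= t on a 4-point domain shatter single points only.
domain = list(range(4))
thresholds = [lambda x, t=t: int(x >= t) for t in range(5)]

# All 2^3 truth tables on a 3-point domain shatter the whole domain.
small = list(range(3))
all_tables = [lambda x, tab=tab: tab[x] for tab in product([0, 1], repeat=3)]
```

The threshold family contains five functions but cannot realize the labelling (1, 0) on any ordered pair of points, whereas the family of all truth tables shatters its entire domain.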
Using homomorphic properties of secret-sharing schemes to compute on shared values is a widely used technique in information-theoretic MPC, initiated by the seminal work of [29]. Indeed, in order to securely evaluate an algebraic circuit (in the semi-honest model), the parties share their inputs with Shamir secret sharing. Linear combinations can then be homomorphically evaluated ‘for free’ via local computation on the shares, so additions can be performed any number of times. Multiplications can also be performed; however, multiplying two shared values results in a value shared according to Shamir with doubled degree. This limits the depth of a circuit computable with (even) 1-privacy if we require that the only communication round be the sending of shares for the final reconstruction. This idea transfers to PIR, where the inputs come from a single party, so they may also be conveniently preprocessed by it via arbitrarily complex functions (which is not always possible for inputs distributed among multiple parties). For instance, for 3-server PIR, degree-2 polynomials can be locally evaluated if Shamir secret sharing is used. As degree-2 polynomials (over a field) in
n variables have non-trivially high VC dimension (
), this allows for encoding each input via a vector of
entries and using the appropriate share conversion. For
k-server PIR, different kinds of share conversion may enable us to evaluate a family of shallow circuits that both has high VC dimension and admits a suitable secret sharing with share conversion, allowing us to evaluate the circuits locally. In particular, note that a share conversion for a suitable relation, rather than a function, suffices to evaluate circuits of that type.
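The degree-doubling effect described above can be checked with a minimal Python sketch of 1-private Shamir sharing among three parties. The prime modulus `P = 101` and the helper names `share` and `reconstruct` are assumptions chosen for illustration only. Sums of shares stay on a degree-1 polynomial, while products of shares lie on a degree-2 polynomial, which three evaluation points still determine.

```python
import secrets

P = 101  # small prime modulus, chosen only for illustration

def share(secret, n_parties=3, degree=1):
    """Shamir-share `secret` with a random polynomial f of the given degree,
    f(0) = secret; party j (j = 1..n_parties) receives f(j) mod P."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(degree)]
    return [sum(c * pow(j, e, P) for e, c in enumerate(coeffs)) % P
            for j in range(1, n_parties + 1)]

def reconstruct(shares):
    """Lagrange-interpolate at 0 from the shares of parties 1..len(shares)."""
    xs = range(1, len(shares) + 1)
    total = 0
    for xj, yj in zip(xs, shares):
        num, den = 1, 1
        for xk in xs:
            if xk != xj:
                num = num * (-xk) % P
                den = den * (xj - xk) % P
        total = (total + yj * num * pow(den, -1, P)) % P  # modular inverse (Python 3.8+)
    return total

a, b = share(7), share(9)
local_sum = [(x + y) % P for x, y in zip(a, b)]   # still a degree-1 sharing
local_prod = [(x * y) % P for x, y in zip(a, b)]  # now a degree-2 sharing
```

With three servers, `reconstruct(local_prod)` recovers the product because a degree-2 polynomial is fixed by three points; a further local multiplication would raise the degree to 4 and could no longer be reconstructed from three shares.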
1.3. BIKO Framework
In [13], Beimel, Ishai, Kushilevitz, and Orlov (BIKO) interpret the state-of-the-art 3-server PIR schemes as using share conversion from a variant of Shamir secret sharing over a certain ring
for small composite
m, applied to circuits stemming from MV codes [
30]
with a bounded set
of
values, for some
. It has the property that
, while
for
is in
S. We refer to such codes as
S-bounded MV codes. They manage to get improved complexity of the resulting PIR, by using conversions from CNF secret sharing rather than from Shamir over certain small
, for which a conversion from Shamir for that relation does not exist (the
-CNF is a threshold secret sharing scheme introduced in [
31]; see
Section 2.2 for a detailed description). Specifically, they obtained conversions from
-CNF over
to the additive secret sharing scheme over
for the following relation
. They work with the so-called
canonical set
, where
is the decomposition of
m into prime factors. This is a useful choice, due to the existence of good
-bounded MV codes over composite moduli
m. Their approach is motivated by the existence of conversions from CNF to additive (roughly, CNF can be converted to “any” scheme, and any scheme can be converted to additive). Accordingly, they use
as CNF over a certain ring, and
as additive over another ring. This relation (although not a function) suffices to evaluate the required type of circuits, arising from the MV family. There is a potential tradeoff here between the best MV codes that exist over a certain ring
R, and the size (more generally, the identity) of the set
S that can be achieved. On a high level:
The smaller S is, the easier it is to find a suitable share conversion (required to evaluate functions in the circuit family induced by the MV code).
The larger S is, the easier it is to find an MV code resulting in a family of circuits with high VC dimension. The communication complexity of the resulting PIR decreases with the VC dimension of the set (and eventually, the size of the shallow circuit to evaluate).
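For readers unfamiliar with CNF sharing, the following Python sketch illustrates (2,3)-CNF sharing over a ring of integers modulo m and the trivial local conversion to additive sharing over the same ring (the identity relation). It assumes the usual replicated form of CNF, in which the secret is a sum of three random summands and each party misses exactly one of them; it is not the BIKO conversion to a different group, and all function names are hypothetical.

```python
import secrets

def cnf_share(secret, m):
    """(2,3)-CNF over Z_m: s = r0 + r1 + r2 (mod m);
    party i holds every summand r_j with j != i."""
    r0, r1 = secrets.randbelow(m), secrets.randbelow(m)
    r2 = (secret - r0 - r1) % m
    r = [r0, r1, r2]
    return [{j: r[j] for j in range(3) if j != i} for i in range(3)]

def cnf_to_additive(i, share):
    """Local conversion: party i outputs only the summand r_{(i+1) mod 3}."""
    return share[(i + 1) % 3]

def cnf_reconstruct(share_a, share_b, m):
    """Any two distinct parties jointly hold all three summands."""
    merged = {**share_a, **share_b}
    return sum(merged.values()) % m

shares = cnf_share(4, 6)
additive = [cnf_to_additive(i, shares[i]) for i in range(3)]
```

Each party outputs a different summand, so the three outputs add up to the secret without any interaction. The conversions studied in this work are harder because the target shares live in a different group and must satisfy a non-trivial relation.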
The concrete parameters of both constructions used so far for 3-server PIR (in their most efficient variants) follow from Theorem 7 below, instantiated with known constructions of MV codes and share conversion schemes.
At a very high level, these PIR protocols consist of three steps, as shown in Construction 1.
Construction 1: BIKO Framework [13]
1. Let denote the server’s database. The client preprocesses its input into a vector for a (constant) ring R, where is a set of vectors of an S-bounded MV code. It shares the vector coordinate-wise among the k servers via some -private secret sharing scheme (so no single server learns anything about the secret).
2. The servers use linear homomorphism properties of , which are homomorphic over certain finite groups, to locally evaluate (an encoding of) f on the shared v. More concretely,
where . In some more detail, each uses linear homomorphism of , then a share conversion from to relatively to , applied to each share of , and finally linear homomorphism of is applied to evaluate on the resulting shares. The share conversion is required to transform for into a non-zero value, and for into 0’s, making the sum non-zero iff. . Then each server sends its share to the client.
3. The client recovers the output using linear homomorphism of , and post-processing the value.
The correctness of the scheme is easy to verify.
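The S-bounded matching-vector property that Construction 1 relies on can be checked mechanically for small families. The sketch below assumes the standard formulation, in which two lists of vectors over the integers modulo m have inner product 0 on matching indices and an inner product in S on all other pairs; the function name and the toy family are illustrative assumptions, not a code from the literature.

```python
def is_s_bounded_mv(us, vs, m, S):
    """Check <u_i, v_i> = 0 and <u_i, v_j> in S (mod m) for all i != j."""
    def inner(a, b):
        return sum(x * y for x, y in zip(a, b)) % m

    for i, u in enumerate(us):
        for j, v in enumerate(vs):
            ip = inner(u, v)
            if i == j and ip != 0:
                return False   # matching pair must be orthogonal
            if i != j and ip not in S:
                return False   # non-matching pair must land in S
    return True

# Toy family of two vector pairs over Z_6.
us = [(1, 0), (0, 1)]
vs = [(0, 1), (1, 0)]
```

For this toy family, every non-matching inner product equals 1, so the check succeeds for any S containing 1 and fails otherwise. Useful families are far larger, and the difficulty lies in making the family large while keeping S small.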
For 3-server PIR, Ref. [13] provides a technique for constructing the conversion (if such a conversion exists) from (2,3)-CNF to additive secret sharing, and obtains results for some special cases. Building on the results of Beimel et al., Paskin-Cherniavsky and Schmerler [32] proved that there is a share conversion from (2,3)-CNF over $\mathbb{Z}_m$ to 3-additive secret sharing over $\mathbb{Z}_p^{\beta}$ if $m = p_1 p_2$ for distinct odd primes $p_1$ and $p_2$, one of which is equal to p. They thereby found infinitely many cases in which a conversion falling into the BIKO framework exists.
Theorem 1 ([
13,
32])
. Let , where are distinct primes, and p is a prime. Then, there exists a share conversion from -CNF to additive over for the relation for some β in the following cases:- 1
, and
- 2.
and .
For other cases of and p, however, the existence of the conversion was neither confirmed nor disproved. The constant in Theorem 1 seems to grow with m, but due to the techniques used, it has not been proven for any infinite family of parameters.
Remark 1. However, not all the 3rd generation information-theoretic PIR protocols fall into the BIKO framework. For instance, the work of [14] could be viewed as a certain generalization of it. This beautiful work surprisingly manages to carry over "3rd generation" PIR communication complexity previously achieved for 3 or more servers, to the 2-server setting, resolving a long standing open problem, thereby illustrating the limitations of the BIKO framework, providing evidence that generalizing it in certain directions can be instrumental in the context of PIR. In some more detail, [14]’s PIR has a bilinear, rather than linear reconstruction in , and the step corresponding to share conversion can not be cleanly viewed as a share conversion from to according to (or in fact any) relation. In particular, the client essentially uses a 2-out-of-3 sharing scheme to make the share conversion work, with himself holding one of the shares.
1.4. Our Contribution
Following the BIKO framework [
13] and utilizing some results of [
32], we prove that:
Theorem 2 (Main result, informal).
There exists a share conversion from (2,3)-CNF over to 3-additive secret-sharing scheme over for any odd prime q.
There is no conversion from (2,3)-CNF over to for any odd primes q and p (including the case ) and any .
In this way, we prove the existence of the conversion for infinitely many cases, and also for infinitely many cases we prove a conversion does not exist. Together with [
32] for
m’s which are products of two primes, it leaves open only the question of the conversion in the case when
, where
and
are both odd and not equal to
p.
Note also that for considered cases, we managed to compute the parameter which determines the server’s response size. We prove that in Theorem 7 is indeed the best for among where . More concretely, one of our contributions is the precise value of for share conversion with respect to relation . Previous techniques did not allow to compute , as they traded generality that could allow computing for some additional simplicity—using a single row in to understand the rank difference .
Another somewhat surprising observation we made is that we may sometimes increase
S beyond
so that a conversion from
-CNF over
to
(for the same
as before) still exists. This may have two possible implications. A direct implication that we observed experimentally for several values of
m, is that the rank difference
sometimes goes down, but not all the way to 0. Thus, if the share conversion still exists, as follows from the BIKO technique,
may decrease, leading to the reduced size of the server’s response. We checked this fact for some small
m’s by computer search and obtained positive results, which are presented in
Section 4. Indeed, we obtained smaller
supplementing
up to
by additional values. We informally summarize the results of the computer search in the following theorem.
Theorem 3. There exists a share conversion from (2,3)-CNF over to 3-additive secret-sharing scheme over with respect to the relation , refining β, where:
, , and, , ;
, , , ;
, , , , ;
, , , , ;
, , , , ;
, , , , ;
, , , , ;
, , , , .
This result may also be viewed as evidence that canonical sets for m with a larger number r of prime factors may potentially have share conversions for for (significantly) smaller than number of servers (as we have conversions for servers but S larger than , where the resulting linear system has much more rows than columns). This direction is interesting to explore, initiating a systematic search for share conversions with server sets as small as possible, resulting in PIR with share complexity polynomial in MV codeword length for m which is a factor of r primes.
In addition to our two main contributions,
we identify a few minor errors in [
13,
32]. Nevertheless, these errors do not affect the correctness of any of their main contributions.
We recalculated some computer search results of [
13] (BIKO), as they contradict the theoretical result of Paskin-Cherniavsky and Schmerler. In particular, [
13] showed the absence of the conversion for
,
, while [
32] proved that the conversion for this case exists. In addition, we obtained numerical results for cases
, 26, 33 which were not considered in BIKO. Our numerical results given in
Section 4 confirm both our theoretical result for
and the conclusion of [
32].
We corrected some calculation mistakes made in previous work [
32]. The corrigenda are shown in
Appendix A.
1.5. Instantiations of BIKO and Future Directions of Our Work
Almost all third-generation PIR protocols falling within the BIKO framework utilize a conversion from Shamir secret sharing instead of CNF. The existence of a conversion from the Shamir secret-sharing scheme implies the existence of a conversion from CNF, but not vice versa [
13].
The following theorem by V. Grolmusz generalizes a similar instance of the theorem for 3 servers in [
13], to put our work in a broader context. It states the size of the MV families depending on the constant
m which has an impact on the complexity of the PIR protocols based on them.
Theorem 4 ([
30])
. Let where the ’s are distinct constant primes, and is constant. Then there exists an MV code family of size which is -bounded. Here , where is the largest prime. In fact, the construction in Theorem 4 generalizes to any m with r distinct prime divisors.
Next, we outline some parameters for which suitable share conversions leading to (3rd generation) PIR via the BIKO framework and MV codes from Theorem 4 exist. Note that Theorems 5 and 6 were initially stated in terms of conversion from Shamir secret sharing, but a corresponding conversion from CNF is implied.
Theorem 5 ([
11,
26])
. For each , there exists a number m with r distinct prime divisors , with , for which there exists a share conversion from -CNF over to -additive over for some , and relation . Furthermore, such a conversion exists for every m of the form with r distinct prime divisors, if the number of parties, , is replaced by . In a nutshell, the above result is obtained by [
26] via a composition technique applied to [
11]’s result for 3-server and
-server PIR. The reduction in the number of parties from
to
for
m with
r prime distinct divisors follows from the (somewhat surprising) 3-party conversion for
and
.
In [
23], the authors found 50 additional such 3-party conversions for
(which need to satisfy a certain condition), leading to further improvements in the number of parties as a function of
r. Note, that for all
m found in [
23],
are large, so the constant in Theorem 1 grows fast with
r.
Theorem 6 ([
23])
. For each , there exists a share conversion from -CNF over to -additive over for some , and relation . For each , there exists a share conversion from -CNF over to -additive over for some , and relation . Note that for the above instantiations, “descending” from [
11],
m must be odd.
Theorem 7 (Implicit in [
13])
. Let , , and C an S-bounded MV code family of vectors in . Assume also there exists share conversion from -CNF over to for some constant β, for the relation . Then there exists a k-server PIR family for databases of size with client’s message of size and server’s message of size . From Theorems 6 and 7 follows
Corollary 7. Let . Then there exists some where are primes where for some β. Then a -server PIR with client communication complexity of and (each) server’s communication complexity , for some exists. For , this improves to servers, and servers for .
We note that among the known m’s in the Corollary above for , and grows particularly fast with r for if servers (instead of ) exist.
Instantiating Theorem 7 with Theorem 4 for MV-code construction, and either Theorem 1, we obtain the best known concrete efficiency of 3-server PIR, with communication complexity. On the other hand, for more than polynomially improved communication complexity and a larger number of servers, the best result is obtained by instantiating the share conversion via Theorem 5.
Our concrete result does not improve communication complexity for 3-server PIR, which is essentially optimal for conversion from
by [
13] as stated in Theorem 1. However, the technical tools developed may help understand the existence of share conversions for even
m with a larger number of prime factors, yielding PIR with better communication complexity and a larger number of servers. Due to the generality of BIKO’s framework, converting from CNF, one could hopefully get improved communication complexity relative to the number of servers. In particular, as noted above, the instantiation of BIKO as in [
11] does not yield PIR protocols with even
m, and the known values of
m have large maximal factors and lead to PIR with high constants in the exponent. By a direct corollary from Theorems 4 and 5 similar to Corollary 7, we get a 6-server PIR with communication complexity
. Using the BIKO framework instantiated with the ‘furthermore’ part of Theorem 5, for 8-server PIR we obtain a complexity of
, by using
, and instantiating Theorem 4 with
. Thus, as far as we know, no PIR with complexity better than
(best known 6-server PIR) exists for 7 servers. We conjecture that a 7-server PIR with much improved constants exists, by using share conversion from
-CNF with parameters generalizing the conversions we obtained for
.
Conjecture 1. A share conversion from -CNF over to for some constant β exists, implying a 7-server PIR for.
We hope to be able to verify the conjecture more easily by generalizing the insights we have for the existence of a share conversion for
to a share conversion to
(more generally, for
for some composite
c), and the fact that in this case of
, the analysis turned out to be rather simple. Another reason to hope we can manage with 7 servers is that
is in that case, has a form similar to the 3-server case considered in the present work (unlike, for example, the 6-server case). See
Section 1.6 for more details.
A broader goal is improving the number of servers needed for PIR with communication complexity (CC) corresponding to MV codes over
with
r prime factors. While [
26] show how to achieve
servers for an infinite number of
r’s and corresponding
m’s, and
-server PIR for finitely many
r’s, it would be interesting to improve Theorem 6 to get share conversion for
-server PIR for all
r. Our hope is to devise a composition theorem along the lines of [
26], composing ‘gadgets’ of conversions from
-CNF over
for coprime composite
m’s. As we already have such conversions for infinitely many pairwise coprime
m’s via Theorem 1, we only need a suitable composition theorem. In fact, it is not hard to show that if we had conversions for coprime
respectively, both to
for the same
p, say
, we would obtain the result. In particular, it is strictly easier to prove the existence of conversion from
to
for some
depending on
for infinitely many coprime
’s (as the 51 known cases based on Mersenne-style primes in [
26] are a special case). To summarize, to complete this direction, we only need to find a conversion from
-CNF over
to
for infinitely many coprimes
’s of the form
where
are distinct primes. This seems to require only a moderate extension of the (linear-algebraic) toolbox for conversions from
-CNF that has been laid out in the seminal work of [
13] and subsequently in [
32].
A still more ambitious direction (which is likely to be more technically involved) is expected to lead to dramatic improvements in the number of servers, bringing it down from exponential to linear in r. It relies on the following composition lemma, which is not hard to prove (see the full version for details).
Lemma 1. Let , where are odd coprime integers. Assume there exists a share conversion from -CNF over to -CNF over for the relation (and an analogous conversion exists for ). Then there exists a share conversion from -CNF over to -CNF over for .
Remark 2. More generally, slightly optimizing parameters, relatively to iteratively applying Lemma 1 for two ’s, for any , and as above, we obtain a share conversion from -CNF over to -CNF over for the relations .
Assume a conversion generalizing our result from Theorem A1 for 3 servers to more servers, while keeping the conversion to a scheme -CNF for sufficiently small t. Such a scheme has enough redundancy to support multiplications over the resulting field unlike -additive, which has none (if needed, the field characteristic 2 may be replaced with some other prime, generalizing Theorem 1 instead). Then we can obtain PIR with linear server complexity , using Theorem 4, and applying Lemma 1 times. More precisely, we have:
Corollary 7. Assume there exists a (global) constant t, such that for all sufficiently large k, the following holds. For infinitely many ’s of the form where all are odd distinct primes, there exists a share conversion from -CNF to -CNF over for the relation . Then, for all sufficiently large r, there exists a -server PIR with communication complexity .
1.6. Our Techniques
As described above, one of the main contributions of [
13] was an instantiation of the framework for designing PIR protocols, which reduces the question of the existence of a three-server PIR protocol to the existence of a share conversion for certain parameters
, and certain linear sharing schemes over Abelian rings
R,
determined by the parameters.
BIKO provide a criterion for the existence of the share conversion in the case when
for distinct primes
and
and the set
, where
,
,
,
. Namely, they prove that for such
m and
, the share conversion from (2,3)-CNF over
to 3-additive scheme over
exists if and only if
, where the rank is computed over
. The matrices
and
are matrices over
with
columns and
and
rows respectively which are constructed from some specific system of equations and inequalities. Beimel et al. in [
13] did not provide the general solution for this system; however, they proved existence and nonexistence of the conversion for some special cases.
While the solvability of a system can be verified efficiently for a concrete instance, it does not provide a simple condition for characterizing triples
for which solutions exist. Moreover, the size of the matrix
in this system is
which makes the numerical solution for big
m’s heavy in practice (though asymptotically efficient). Before [32], which proved the solvability of the system for odd primes $p_1$ and $p_2$ when one of them equals p, even the question of whether an infinite set of such triples exists remained open.
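Rank computations over a prime field, such as those mentioned above, reduce to Gaussian elimination with modular arithmetic. The following Python sketch is a generic routine, not the authors' implementation: the small matrices in it are toy stand-ins for the much larger matrices of the BIKO criterion, which are not reproduced here, and the helper names are hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field F_p (p prime) via Gaussian elimination."""
    mat = [[x % p for x in row] for row in rows]
    n_cols = len(mat[0]) if mat else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col]), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        inv = pow(mat[rank][col], -1, p)          # modular inverse (Python 3.8+)
        mat[rank] = [x * inv % p for x in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][col]:
                factor = mat[r][col]
                mat[r] = [(a - factor * b) % p for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank

def rank_difference(top, extra, p):
    """Rank of the stacked matrix [top; extra] minus the rank of top, over F_p."""
    return rank_mod_p(top + extra, p) - rank_mod_p(top, p)

top = [[1, 2, 0], [2, 4, 0]]   # toy matrix of rank 1 over F_5
```

A rank difference of 0 means that every row of `extra` already lies in the row space of `top`. Note that the result depends on the field: the same integer matrix can have different ranks modulo different primes.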
Our concrete goal in this work is to better understand the case of
, motivated by understanding the technical foundations of the broader problem for
m which is a product of
distinct primes (see
Section 1.5 for details). We proceed using the BIKO characterization above. Concretely, for parameters
and
p, this reduces to calculating the quantity
, where the rank is computed over
.
In [
32], the case
for odd
and
was explored. To simplify the technical task, the authors of [
32] rely on the observation from [
13] that
iff
does not span
for
any row
of
. Thus, they replace
with some
as above, and work with that (forgoing the goal of understanding the particular value of
). Then, they proceed by bringing the matrix
to a more convenient form by performing a sequence of carefully tailored elimination steps on the rows of the matrix
. The sequence of eliminations is based on observing a three-level structure of the matrix
, and working on blocks of decreasing coarseness as the elimination process progresses. It also involves a change of basis at some point, to make the matrix’s structure easier to understand. That is, the matrix is rewritten so that the set of columns corresponds to a new basis; here we even manage to use fewer vectors, as it suffices to include a set of vectors which is guaranteed to span
. However, the resulting matrix after that process remains too complex to check whether
for all parameters. The analysis up to that point (resulting in some matrix
to analyze) is oblivious to the particular parameters except for not looking at even
m (not because it was particularly hard, but rather out of a decision to limit the scope of the paper to what was already achieved). To obtain their partial result for some of the parameters, the authors then reduce the matrix’s rows modulo a certain vector subspace (formally, they multiply it on the right by a certain square matrix
L with non-trivial left kernel). Clearly, it holds that if
, then
as well (implying the existence of a share conversion), but not necessarily the other way around. The matrix
turns out to be sufficiently simple to analyze, and for
p which is either
or
, the resulting rank difference is non-zero. However, we do not yet understand other parameters, for which
, or the case of even
m. Also, due to the first simplification, the concrete value of
is not found, and thus the concrete answer complexity of the resulting PIR as implied by Theorem 8 remains unknown.
Our current paper considers the case where
. We proceed by a quite straightforward generalization of [
32]’s elimination process up until producing the matrix
, except that we do not make the simplification of keeping a single row out of
, but rather keep the entire matrix. The main divergence from [
32] is that we do not perform the reduction modulo a subspace, but are able to directly check whether
, and furthermore to compute the exact value of
. This is made possible, as the case where
turns out to be particularly simple, and we managed to successfully analyze it directly (for all
). The other cases (when
m is odd, and
p is not equal to
or
) remain open.