1. Introduction
Encrypting data before uploading it to the cloud has become a standard way to protect data privacy. Cloud services are now part of people’s daily lives. Sharing data through a cloud server reduces storage costs on the user end, but it also raises concerns about data leakage. To protect confidentiality, users may therefore encrypt data before uploading it to the cloud server. Because key management is convenient, most people use public-key cryptosystems to encrypt data. To confirm that a user’s public key is correct, a certificate for that key is often required as proof. However, managing and verifying certificates adds computation and transmission costs, so many certificate-less public-key cryptosystems have been proposed.
Functional encryption is a type of certificate-less public key encryption. For example, identity-based encryption (IBE), attribute-based encryption (ABE), and subset-predicate encryption (SPE) are functional encryption methods. In 1984, Shamir introduced the concept of IBE [1]. In IBE, the data owner can use the receivers’ identities as the encryption keys, thereby saving the management and verification of certificates. In 2005, Sahai and Waters proposed a fuzzy IBE scheme [2]. Since a fuzzy identity can be regarded as an attribute, the scheme of Sahai and Waters is considered the first ABE scheme. In ABE schemes, the data owner can use attributes to encrypt data, so there is no need to use certificates. The concept of SPE was first introduced by Katz et al. [3] in 2017. In SPE schemes, the data owner can select an attribute set to encrypt data, and the receivers can decrypt it if and only if their attributes are subsets of the attribute set of the encrypted data.
Proxy re-encryption (PRE) is used to reduce file-sharing costs. To change the recipient of a ciphertext from Alice to Bob, one may decrypt the ciphertext with Alice’s private key and then encrypt it with Bob’s public key. However, this method is inefficient. The purpose of PRE is to convert the ciphertext directly, without decrypting it. The first PRE scheme was proposed by Blaze et al. in 1998 [4]. In their scheme, a semi-trusted proxy is allowed to transform a ciphertext for Alice into a ciphertext for Bob without changing the content of the message, thereby reducing the computation and transmission costs of sharing files. PRE can also be used in combination with functional encryption. For example, the first identity-based PRE scheme was proposed by Green and Ateniese [5] in 2007, and the first attribute-based PRE scheme was proposed by Liang et al. in 2009 [6].
Searchable encryption (SE) solves the problem that files cannot be searched after encryption. Although encryption can protect data privacy, it makes searching the data difficult. To solve this problem, Song et al. [7] proposed the first SE scheme in 2000, in which encrypted files are searchable. In 2004, Boneh et al. proposed the first public-key SE scheme with keyword search [8]. In Boneh et al.’s scheme, users can choose a keyword to search for files. In 2007, Hwang and Lee [9] proposed an SE scheme that supports multi-keyword search, increasing the flexibility of searchability.
On the other hand, the rapid development of quantum computers puts some traditional encryption schemes at risk of being broken. Quantum bits can exist in superposition, which allows certain quantum algorithms to solve problems that are believed to be hard for classical computers. In 1994, Shor proposed a quantum algorithm [10] that can find the prime factors of a large integer in polynomial time. Shor’s approach also solves the discrete logarithm problem, which is generally considered as difficult as the prime factorization problem on classical computers. Therefore, encryption schemes based on such mathematical problems are at risk of being broken once sufficiently large quantum computers are available.
Lattice-based cryptography has been extensively studied for security against both classical and quantum algorithms. To resist quantum attacks, the National Institute of Standards and Technology (NIST) launched a post-quantum cryptography standardization process. In July 2022, NIST selected Kyber [11], a lattice-based key-encapsulation mechanism, for standardization. This result shows that lattice-based schemes are generally considered to be effective against quantum attacks. Lattice-based cryptography is considered quantum-resistant because it is based on mathematical problems that are believed to be difficult for quantum computers to solve. No known quantum algorithm solves the shortest vector problem exponentially faster than classical algorithms, so quantum attacks are not expected to break encryption schemes based on that problem. The Learning with Errors (LWE) problem is likewise believed to be intractable for quantum computers. In addition, no known quantum algorithm can efficiently solve the Decisional Learning with Errors (D-LWE) problem, which has applications in many fields.
Many lattice-based public-key cryptosystems have been proposed. In 1996, Ajtai [12] proposed a one-way hash function based on the shortest vector problem. In 1997, Goldreich et al. proposed the first public-key cryptosystem based on the closest vector problem. Hoffstein et al. [13] proposed a public key encryption scheme called the Number Theory Research Unit (NTRU) in 1998. Although the security of NTRU is not formally proven, its computation cost is lower than that of previous schemes. In 2005, Regev [14] proposed the LWE problem, which is at least as hard as the worst case of the shortest independent vectors problem. Since then, most lattice-based public-key cryptosystems have been based on the LWE problem due to its low computational cost. Even the Kyber algorithm [11], which NIST selected for post-quantum standardization, is based on a variant of the LWE problem (Module-LWE).
One of the advantages of lattice-based encryption is that it can provide functional encryption. In 2008, Gentry et al. published an IBE scheme based on LWE. In 2013, Boneh proposed the first lattice-based ABE scheme [15]. In 2014, Singh et al. proposed the first lattice-based identity-based PRE [16]. In 2019, Liu et al. [17] proposed an ABE protocol for searchable keywords. In 2022, Wang et al. [18] proposed a searchable SPE scheme.
Outsourcing computation reduces the computation and transmission costs of the data sender. In 2019, Zhang et al. [19] proposed a lattice-based SE scheme that supports outsourced computation. In their scheme, the data owner (DO) generates oriented keys and sends them to the proxy server (PS). With those keys, PS can prove that DO has authorized it to assist in encryption. With the assistance of PS, DO can save some computation and transmission costs.
Some lattice-based SE schemes [17, 20, 21, 22] use a linear secret sharing scheme (LSSS) as the access structure, but using LSSS under the LWE problem may cause problems. Because the Gaussian elimination method used to solve the LSSS problem will generate coefficients that cannot be guaranteed to be small enough, the decryption or search results after running LSSS may be wrong. Therefore, LSSS-based schemes under the LWE problem cannot be realized in practice. To avoid this problem, the proposed scheme uses a tree-based access structure.
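The effect of large reconstruction coefficients can be illustrated with simple arithmetic. The following is a toy sketch only, not any of the cited schemes: it assumes a Regev-style encoding in which a bit is stored as bit·⌊q/2⌋ plus small noise, and a two-share linear reconstruction. The modulus, noise values, coefficients, and all function names are hypothetical.

```python
def decode_bit(value, q):
    """Decode 1 if value is nearer to q/2 than to 0 (mod q), else 0."""
    v = value % q
    return 1 if q // 4 <= v < (3 * q) // 4 else 0


def make_shares(encoded, coeffs, q, free_share=1000):
    """Return shares (s1, s2) with c1*s1 + c2*s2 = encoded (mod q)."""
    c1, c2 = coeffs
    s2 = free_share % q
    s1 = (encoded - c2 * s2) * pow(c1, -1, q) % q  # needs Python 3.8+
    return [s1, s2]


def reconstruct(shares, coeffs, q):
    """Linear reconstruction: sum of coefficient * share, reduced mod q."""
    return sum(c * s for c, s in zip(coeffs, shares)) % q


def run(coeffs, q=4093, bit=1, noise=(3, 2)):
    """Encode a bit, add small noise to each share, reconstruct, decode."""
    encoded = bit * (q // 2)
    shares = make_shares(encoded, coeffs, q)
    noisy = [s + e for s, e in zip(shares, noise)]
    return decode_bit(reconstruct(noisy, coeffs, q), q)


# Small coefficients: combined noise is 1*3 + 1*2 = 5, far below q/4.
small = run((1, 1))
# Large coefficients: combined noise is 300*3 + 250*2 = 1400, above q/4.
large = run((300, 250))
print(small, large)
```

With coefficients (1, 1) the original bit is recovered, whereas with coefficients (300, 250) the same small per-share noise is amplified past q/4 and the decoded bit flips.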
1.1. Problem Statements
In summary, to improve the efficiency of cloud services, an encrypted file-sharing scheme should meet the following features:
The encryption should resist quantum attacks.
The encrypted file should be searchable. Moreover, the scheme should support multi-keyword search to increase the flexibility of searchability.
To reduce the file-sharing costs of the data owner, the scheme should support PRE.
The encryption should avoid the cost of using certificates.
Unfortunately, no known scheme can achieve these features simultaneously. Therefore, we propose a scheme to satisfy these features simultaneously.
1.2. Contributions
We propose a multi-keyword searchable identity-based proxy re-encryption scheme from lattices. To highlight the contributions, a feature comparison between the proposed scheme and other schemes is shown in Table 2. The proposed scheme is the first to offer the following properties simultaneously:
To resist quantum attacks, the security of the proposed scheme is based on the LWE problem from lattices.
The flexibility of searchability is increased through the proposed multi-keyword search that supports AND and OR operations. The access structure is tree-based rather than LSSS-based, avoiding possible errors in the decryption and search phases.
KGC only needs to assist users in generating private keys during the registration phase. The burden on the KGC is reduced since it does not involve other phases. Moreover, the risk of KGC being attacked by adversaries through the network can be reduced.
The proposed scheme supports PRE to reduce file-sharing costs of the data owner. The concept of outsourcing computation is added to the scheme design. As the number of data users increases, the costs required for the data owner remain the same, and the proxy server will handle the increased workload.
Users’ access rights are verified in both the search phase and the decryption phase to prevent adversaries from illegally accessing files.
The proposed scheme is identity-based, which avoids the cost of using certificates.
2. Preliminaries
The definitions of lattice, trapdoors, and hardness assumptions are shown here. Furthermore, the access structure, system model, and security model of the proposed scheme are presented here.
2.1. Lattices
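Definition 1. An n-dimensional lattice $\Lambda$ is defined as the set of integer linear combinations of m linearly independent vectors $\mathbf{b}_1, \ldots, \mathbf{b}_m \in \mathbb{R}^n$:
$$\Lambda = \mathcal{L}(\mathbf{B}) = \left\{ \sum_{i=1}^{m} x_i \mathbf{b}_i : x_i \in \mathbb{Z} \right\},$$
where $\mathbf{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ is a basis of $\Lambda$, and the rank of $\Lambda$ is m.
Definition 2. Given a prime number q, a matrix $\mathbf{A} \in \mathbb{Z}_q^{n \times m}$, and a vector $\mathbf{u} \in \mathbb{Z}_q^{n}$.
Three types of lattices are defined as follows:
$$\Lambda_q(\mathbf{A}) = \{\mathbf{e} \in \mathbb{Z}^m : \exists\, \mathbf{s} \in \mathbb{Z}_q^n,\ \mathbf{A}^{\top}\mathbf{s} = \mathbf{e} \bmod q\},$$
$$\Lambda_q^{\perp}(\mathbf{A}) = \{\mathbf{e} \in \mathbb{Z}^m : \mathbf{A}\mathbf{e} = \mathbf{0} \bmod q\},$$
$$\Lambda_q^{\mathbf{u}}(\mathbf{A}) = \{\mathbf{e} \in \mathbb{Z}^m : \mathbf{A}\mathbf{e} = \mathbf{u} \bmod q\}.$$
2.2. Discrete Gaussians
Definition 3. Given a Gaussian parameter $\sigma > 0$, a vector $\mathbf{c} \in \mathbb{R}^m$, and a lattice $\Lambda \subseteq \mathbb{Z}^m$.
The discrete Gaussian distribution over $\Lambda$ is
$$D_{\Lambda,\sigma,\mathbf{c}}(\mathbf{x}) = \frac{\rho_{\sigma,\mathbf{c}}(\mathbf{x})}{\rho_{\sigma,\mathbf{c}}(\Lambda)}, \quad \mathbf{x} \in \Lambda.$$
The sum of all $\rho_{\sigma,\mathbf{c}}(\mathbf{x})$ is
$$\rho_{\sigma,\mathbf{c}}(\Lambda) = \sum_{\mathbf{x} \in \Lambda} \rho_{\sigma,\mathbf{c}}(\mathbf{x}).$$
The Gaussian function on $\mathbb{R}^m$ is
$$\rho_{\sigma,\mathbf{c}}(\mathbf{x}) = \exp\left(-\pi \frac{\|\mathbf{x} - \mathbf{c}\|^2}{\sigma^2}\right).$$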
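As a small numerical illustration, the discrete Gaussian can be tabulated over the one-dimensional lattice Z. The sketch below assumes the common convention that the Gaussian function is exp(−π(x − c)²/σ²) and approximates the infinite normalizing sum by truncating to a window around the center; the window width `tail` and all function names are assumptions made for this example.

```python
import math
import random


def rho(x, sigma, c=0.0):
    """Gaussian function with parameter sigma and center c (assumed form)."""
    return math.exp(-math.pi * (x - c) ** 2 / sigma ** 2)


def discrete_gaussian_pmf(sigma, c=0.0, tail=12):
    """Approximate the discrete Gaussian over Z on a truncated window.

    The window [c - tail*sigma, c + tail*sigma] stands in for all of Z;
    the mass outside it is negligible for the default tail.
    """
    lo = math.floor(c - tail * sigma)
    hi = math.ceil(c + tail * sigma)
    support = range(lo, hi + 1)
    weights = [rho(x, sigma, c) for x in support]
    total = sum(weights)  # truncated version of the normalizing sum
    return {x: w / total for x, w in zip(support, weights)}


def sample(pmf, rng, count):
    """Draw `count` integers according to the tabulated distribution."""
    xs = list(pmf)
    return rng.choices(xs, weights=[pmf[x] for x in xs], k=count)


pmf = discrete_gaussian_pmf(sigma=3.0)
print(sample(pmf, random.Random(1), 5))
```

Table-based sampling like this is only practical in one dimension; it is meant to make the definition concrete rather than to suggest how lattice schemes sample in practice.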
2.3. Inhomogeneous Short Integer Solution (ISIS)
Definition 4 (ISIS problem). Given a prime q, a random matrix $\mathbf{A} \in \mathbb{Z}_q^{n \times m}$, a target vector $\mathbf{u} \in \mathbb{Z}_q^{n}$, and a parameter $\beta > 0$, the goal is to find a non-zero vector $\mathbf{e} \in \mathbb{Z}^{m}$ such that $\mathbf{A}\mathbf{e} = \mathbf{u} \bmod q$ and $\|\mathbf{e}\| \le \beta$.
The ISIS problem is closely related to the Shortest Vector Problem (SVP) in lattices: solving random ISIS instances is known to be at least as hard as approximating certain worst-case lattice problems. It is believed to be resistant to efficient classical and quantum algorithms. Lattice-based schemes often design public and private keys based on the ISIS problem. $\mathbf{A}$ and $\mathbf{u}$ are often used as public parameters or public keys, while $\mathbf{e}$ is often used as a private key.
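Although finding an ISIS solution is believed to be hard, checking a candidate is easy, which is what makes the problem usable for key pairs. The sketch below assumes the usual formulation (a non-zero integer vector whose image under the matrix equals the target modulo q and whose Euclidean norm is at most the bound); the tiny parameters and the function name are hypothetical and offer no security.

```python
import math


def is_isis_solution(A, u, e, q, beta):
    """Check that e is non-zero, short, and satisfies A e = u (mod q)."""
    n, m = len(A), len(A[0])
    if len(u) != n or len(e) != m:
        raise ValueError("dimension mismatch")
    if all(x == 0 for x in e):
        return False  # the zero vector is excluded by definition
    if math.sqrt(sum(x * x for x in e)) > beta:
        return False  # candidate is too long
    return all(
        sum(A[i][j] * e[j] for j in range(m)) % q == u[i] % q
        for i in range(n)
    )


q = 13
A = [[1, 2, 3, 4],
     [5, 6, 7, 8]]   # plays the role of a public matrix
e = [1, 0, -1, 1]    # plays the role of a short private vector
u = [2, 6]           # public target: A e mod q
print(is_isis_solution(A, u, e, q, beta=2.0))
```

Here the public data are the matrix and the target, and the short vector acts as the secret that only its owner can produce.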
2.4. Decisional Learning with Errors (D-LWE)
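Given a prime q, a Gaussian distribution $\chi$ over $\mathbb{Z}_q$, and a positive integer n. Assume that there is a non-specified oracle $\mathcal{O}$, which could be a truly uniform random sampler $\mathcal{O}_{\$}$ or a noisy pseudo-random sampler $\mathcal{O}_{\mathbf{s}}$. $\mathcal{O}_{\mathbf{s}}$ and $\mathcal{O}_{\$}$ are defined as follows:
The noisy pseudo-random sampler $\mathcal{O}_{\mathbf{s}}$ outputs pseudo-random samples $(\mathbf{a}_i, \mathbf{a}_i^{\top}\mathbf{s} + e_i) \in \mathbb{Z}_q^{n} \times \mathbb{Z}_q$, where $e_i$ is sampled from $\chi$, $\mathbf{s} \in \mathbb{Z}_q^{n}$ is a consistent secret vector, and $\mathbf{a}_i \in \mathbb{Z}_q^{n}$ is a uniformly random vector.
The truly random sampler $\mathcal{O}_{\$}$ outputs uniformly random samples in $\mathbb{Z}_q^{n} \times \mathbb{Z}_q$.
Definition 5 (D-LWE problem). Given a polynomial number of samples from $\mathcal{O}$, decide whether $\mathcal{O}$ is $\mathcal{O}_{\mathbf{s}}$ or $\mathcal{O}_{\$}$.
Definition 6. The advantage of an adversary $\mathcal{A}$ to break the D-LWE problem is defined as
$$\mathrm{Adv}^{\text{D-LWE}}_{\mathcal{A}} = \left| \Pr\left[\mathcal{A}^{\mathcal{O}_{\mathbf{s}}} = 1\right] - \Pr\left[\mathcal{A}^{\mathcal{O}_{\$}} = 1\right] \right|.$$
Definition 7 (D-LWE Assumption). If no polynomial-time algorithm has a non-negligible advantage in solving the D-LWE problem, then the D-LWE assumption holds.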
The D-LWE problem is widely regarded as a hard problem in lattice-based cryptography. Many lattice-based cryptographic schemes, such as encryption, signatures, and key exchange protocols, rely on the hardness of D-LWE for their security. As long as solving the D-LWE problem is difficult, the security of those schemes will be guaranteed.
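The two samplers can be sketched in a few lines. This is a toy illustration under stated assumptions: the noise is drawn uniformly from a small bounded range as a stand-in for the Gaussian error distribution, the parameters are far too small to be secure, and the function names are hypothetical. The `residual` helper shows that a party who knows the secret vector can tell the two samplers apart, whereas the D-LWE assumption says that a party without it cannot.

```python
import random


def lwe_sample(s, q, rng, noise_bound=2):
    """Noisy pseudo-random sample (a, <a, s> + e mod q)."""
    a = [rng.randrange(q) for _ in range(len(s))]
    e = rng.randint(-noise_bound, noise_bound)  # stand-in for Gaussian noise
    b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
    return a, b


def uniform_sample(n, q, rng):
    """Truly uniform sample from Z_q^n x Z_q."""
    return [rng.randrange(q) for _ in range(n)], rng.randrange(q)


def centered(x, q):
    """Representative of x mod q closest to zero."""
    x %= q
    return x - q if x > q // 2 else x


def residual(sample, s, q):
    """b - <a, s> mod q, centered: small for LWE samples when s is known."""
    a, b = sample
    return centered(b - sum(ai * si for ai, si in zip(a, s)), q)


rng = random.Random(2024)
q, n = 3329, 8
s = [rng.randrange(q) for _ in range(n)]
lwe_res = [residual(lwe_sample(s, q, rng), s, q) for _ in range(200)]
uni_res = [residual(uniform_sample(n, q, rng), s, q) for _ in range(200)]
print(max(abs(r) for r in lwe_res), max(abs(r) for r in uni_res))
```

For the noisy sampler every residual stays within the noise bound, while for the uniform sampler the residuals spread over the whole range of Z_q.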
2.5. Trapdoor Functions
There are two types of trapdoor functions. Type 2 trapdoor functions are smaller and faster than Type 1 trapdoor functions. The Type 2 trapdoor functions are based on the one-way function proposed by Regev [23], and their security rests on the hardness of the learning with errors (LWE) problem. In 2012, Micciancio and Peikert [24] proposed an efficient trapdoor generation function. Then, Genise and Micciancio [25] gave improved algorithms in 2018. Therefore, the proposed scheme applies the trapdoor generation functions [24, 25] and introduces them here.
Given a prime q and two positive integers . To obtain a trapdoor of a lattice, a gadget vector is defined as , where . Moreover, a gadget matrix is defined as
Definition 8. Given a matrix and an invertible matrix . Output a matrix and a trapdoor of , where . Moreover, the trapdoor’s quality guarantees that , where the function extracts the Euclidean length of the input, and the little-omega notation is an asymptotic notation.
Definition 9. Given a matrix , a trapdoor of , an invertible matrix , a target , and a Gaussian parameter σ. Output a vector such that .
The details of how to generate with a gadget matrix are described as follows:
Randomly choose a perturbation vector with , and divide into two parts :
Choose a vector from , and compute
Output .
Definition 10. Given a matrix ( is an arbitrary matrix), a trapdoor of , an invertible matrix , and a Gaussian parameter σ. Output a trapdoor of .
will call repeatedly until getting a trapdoor containing d linearly independent vectors such that .
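To make the role of the gadget concrete, the sketch below builds a gadget vector and gadget matrix and inverts the gadget by bit decomposition. It assumes the power-of-two gadget commonly used with the trapdoors of Micciancio and Peikert [24], namely the vector (1, 2, 4, ..., 2^(k−1)) with k = ⌈log2 q⌉ and the block-diagonal matrix with one copy of that vector per row; these formulas, the parameter values, and the function names are assumptions made for illustration rather than the exact definitions of the proposed scheme.

```python
def gadget_vector(q):
    """Powers of two (1, 2, ..., 2^(k-1)) with k = ceil(log2 q)."""
    k = (q - 1).bit_length()  # equals ceil(log2 q) for q >= 2
    return [1 << j for j in range(k)]


def gadget_matrix(n, q):
    """Block-diagonal n x (n*k) matrix with the gadget vector in each row."""
    g = gadget_vector(q)
    k = len(g)
    G = [[0] * (n * k) for _ in range(n)]
    for i in range(n):
        for j in range(k):
            G[i][i * k + j] = g[j]
    return G


def bit_decompose(u, q):
    """Return a binary (hence short) vector x with G x = u (mod q)."""
    k = (q - 1).bit_length()
    x = []
    for ui in u:
        ui %= q
        x.extend((ui >> j) & 1 for j in range(k))
    return x


def mat_vec_mod(M, x, q):
    """Matrix-vector product reduced modulo q."""
    return [sum(a * b for a, b in zip(row, x)) % q for row in M]


q, n = 13, 2
G = gadget_matrix(n, q)
u = [11, 6]
x = bit_decompose(u, q)
print(x, mat_vec_mod(G, x, q))
```

The point of the gadget is that short preimages are easy to compute for it, and a trapdoor transfers this ability to an otherwise random-looking matrix. Deterministic bit decomposition is used here for simplicity; the sampling algorithms in [24, 25] produce Gaussian-distributed preimages instead.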
2.6. Tree-Based Access Structure
In the proposed scheme, the data owner will choose a set of keywords while encrypting data, and a data user may determine a keyword-search policy while generating a search token. Moreover, the cloud server will send the encrypted data to the data user if satisfies . The access structure in the proposed scheme is a tree-based access structure supporting OR and AND gates.
Figure 1 shows an example of the tree-based access structure applied in the proposed scheme. In the example, the keyword-search policy , where denotes the i-th keyword in the system. Here is how to generate , which will be used in the proposed scheme. First, the data user will set the value of the root node as an identity matrix . Second, since the root node has two child nodes and requires an AND gate, the data user will compute two invertible matrices such that . Third, since has three child nodes and requires an OR gate, the data user will set . Fourth, since has two child nodes and requires an AND gate, the data user will compute two invertible matrices such that . Finally, each belonging to will be used to generate a search token in the proposed scheme. If a keyword set can exactly satisfy , then . For example, in Figure 1, both and can exactly satisfy , then .
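The top-down assignment described above can be sketched with a simplified model. Two assumptions are made loudly here: the AND-gate constraint is taken to be additive (the child values sum to the parent value), and nonzero scalars modulo a prime q stand in for the invertible matrices of the scheme, with 1 playing the role of the identity matrix. The policy, the keyword names w1 to w5, and the function names are hypothetical; the tree shape mirrors the walkthrough (an AND root over a three-way OR and a two-way AND).

```python
import random


def satisfies(policy, keywords):
    """Evaluate an AND/OR policy tree against a set of keywords."""
    if isinstance(policy, str):
        return policy in keywords
    gate, children = policy
    results = [satisfies(child, keywords) for child in children]
    return all(results) if gate == "AND" else any(results)


def split_nonzero(value, count, q, rng):
    """Split value into `count` nonzero elements of Z_q that sum to value."""
    while True:
        parts = [rng.randrange(1, q) for _ in range(count - 1)]
        last = (value - sum(parts)) % q
        if last != 0:
            return parts + [last]


def assign_shares(policy, value, q, rng, out=None):
    """Push a value down the tree: OR copies it, AND splits it additively."""
    out = {} if out is None else out
    if isinstance(policy, str):
        out[policy] = value % q
        return out
    gate, children = policy
    if gate == "OR":
        for child in children:
            assign_shares(child, value, q, rng, out)
    else:
        parts = split_nonzero(value, len(children), q, rng)
        for child, part in zip(children, parts):
            assign_shares(child, part, q, rng, out)
    return out


q = 97
policy = ("AND", [("OR", ["w1", "w2", "w3"]), ("AND", ["w4", "w5"])])
shares = assign_shares(policy, 1, q, random.Random(7))
exact_set = {"w1", "w4", "w5"}
print(satisfies(policy, exact_set), sum(shares[w] for w in exact_set) % q)
```

Under these assumptions, the leaf values of any keyword set that exactly satisfies the policy recombine to the root value, while a set that misses a required keyword cannot recombine it. The sketch assumes that each keyword appears at most once in the policy.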
2.7. System Model
Figure 2 shows the system model of the proposed scheme. It includes five system roles, and their behaviors are defined as follows.
Key Generation Center (KGC): KGC is fully trusted and is responsible for system setup and key distribution.
Data Owner (DO): DO can pre-generate re-encryption keys and send them to PS before the encryption phase. Then, in the encryption phase, DO computes the ciphertext, index, and s (a set of DUs’ identities). Then, DO sends (ciphertext, index, s) to PS.
Proxy Server (PS): PS is fully trusted and responsible for re-encryption. After receiving the re-encryption keys, ciphertext, index, and s from DO, PS re-encrypts the ciphertext and index. Finally, PS sends the re-encrypted results to CS.
Cloud Server (CS): CS is honest-but-curious and responsible for data storage and search.
Data User (DU): DU can send a search request with a search token to CS to download matching ciphertexts.
The proposed scheme consists of the following polynomial-time algorithms.
: Taking a security parameter as input, KGC outputs the public parameters and the master secret key .
: With a user’s identity and as inputs, KGC computes the private key of the user.
Re-KeyGen(): Taking DU’s identity , and as inputs, the user computes the re-encryption key .
Enc(): Taking an identity , a one-bit message , and a keyword set as inputs, DO outputs the ciphertext and the index .
Re-Enc(): Given the ciphertext , the index , DO’s identity , DU’s identity , and a re-encryption key as inputs, PS computes the re-encrypted ciphertext and re-encrypted index .
TokenGen(): Taking a keyword-search policy , , and as inputs, DU can compute the search token .
or : Given , , and as inputs, CS checks whether matches . If the result matches, CS sends to DU. Otherwise, CS sends ⊥ to DU.
Dec(): With and , DU decrypts and gets the plaintext .
2.8. Security Model
Based on the security requirements of cloud services, the proposed scheme achieves indistinguishability under chosen plaintext attacks (IND-CPA) and indistinguishability under chosen keyword attacks (IND-CKA). The security model is defined as the following IND-CPA and IND-CKA games. Assume that is a polynomial-time adversary, and the simulator simulates the games.
2.8.1. Ciphertext Security
The IND-CPA game is defined as follows:
: gives a target identity to . receives a polynomial number of samples from the D-LWE oracle.
initializes the system and sends public parameters to .
can adaptively issue the following queries multiple times:
- query: gives an identity to . sends to .
- query: gives an identity to . sends to .
- KeyGen query: gives an identity . If , aborts it. Otherwise, sends to .
- Re-KeyGen query: gives DO’s identity and DU’s identity . If , aborts it. Otherwise, sends to .
: submits two messages () to . randomly chooses . Then, computes the ciphertext related to . Finally, sends to .
: may do more queries as in .
: Finally, answers a bit . If , wins the IND-CPA game.
Definition 11. The advantage of to win the IND-CPA game is defined as
Definition 12. If no polynomial-time adversary wins the IND-CPA game with a non-negligible advantage, then the proposed scheme is IND-CPA secure.
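The challenge and guess phases of such a game can be sketched as a small experiment harness. This is an illustration only: it omits the query phases, uses one-bit messages, and measures the advantage by the conventional quantity |Pr[guess is correct] − 1/2|, which is assumed here. The two toy "encryption" functions and the adversary are hypothetical stand-ins and have nothing to do with the proposed scheme.

```python
import random


def ind_cpa_experiment(encrypt, adversary_guess, rng):
    """One run: pick a random bit, encrypt the matching message, get a guess."""
    m0, m1 = 0, 1
    b = rng.randrange(2)
    challenge = encrypt(m1 if b else m0, rng)
    return adversary_guess(challenge, rng) == b


def estimate_advantage(encrypt, adversary_guess, trials, seed=0):
    """Empirical |Pr[win] - 1/2| over independent runs."""
    rng = random.Random(seed)
    wins = sum(
        ind_cpa_experiment(encrypt, adversary_guess, rng) for _ in range(trials)
    )
    return abs(wins / trials - 0.5)


def masked_encrypt(bit, rng):
    """Toy cipher: XOR with a fresh random bit, so output is independent of bit."""
    return bit ^ rng.randrange(2)


def leaky_encrypt(bit, rng):
    """Broken toy cipher: the 'ciphertext' is the plaintext itself."""
    return bit


def read_ciphertext(challenge, rng):
    """Adversary that outputs the challenge value as its guess."""
    return challenge


print(estimate_advantage(leaky_encrypt, read_ciphertext, 2000))
print(estimate_advantage(masked_encrypt, read_ciphertext, 2000))
```

The leaky cipher yields the maximum advantage of 1/2, while the masked cipher yields an estimate close to zero, which is the behavior the security definition asks of a secure scheme against every polynomial-time adversary.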
2.8.2. Keyword-Search Security
The IND-CKA game is defined as follows:
: gives a target identity and two target keywords to . randomly chooses and will use for the challenge. Furthermore, receives a polynomial number of samples from the D-LWE oracle.
initializes the system and sends public parameters to .
can adaptively issue the following queries multiple times:
- query: gives an identity to . sends to .
- query: gives an identity to . sends to .
- KeyGen query: gives an identity . If , aborts it. Otherwise, sends to .
- Re-KeyGen query: gives DO’s identity and DU’s identity . If , aborts it. Otherwise, sends to .
- TokenGen query: gives an identity and a keyword-search policy . If ( and ( contains or )), aborts it. Otherwise, sends to .
: submits a signal to start the . computes the index related to . Finally, sends to .
: may do more queries as in .
: Finally, answers a bit . If , wins the IND-CKA game.
Definition 13. The advantage of to win the IND-CKA game is defined as
Definition 14. If no polynomial-time adversary wins the IND-CKA game with a non-negligible advantage, then the proposed scheme is IND-CKA secure.
7. Conclusions
In this research, a multi-keyword searchable identity-based proxy re-encryption scheme from lattices has been proposed. First, based on the D-LWE assumption, the proposed scheme can resist quantum attacks. Second, it provides multi-keyword searchability with AND and OR operators, increasing the flexibility of keyword searches. Third, KGC in the proposed scheme can be offline, avoiding the risk of being attacked by adversaries through the network. Fourth, the proposed scheme supports PRE to reduce file-sharing costs of the data owner. The costs of the data owner will remain the same as the number of data users increases. Fifth, to prevent illegal access to files, the user’s access rights are verified in both the search and decryption phases. Finally, the proposed scheme is identity-based, avoiding the cost of using certificates.
Compared with other schemes, the proposed scheme performs well in transmission cost. Although its computation cost is slightly higher than that of other schemes for single-keyword search, it supports multi-keyword search, which increases the flexibility of keyword searches. In addition, the cost of the data owner does not increase with the number of data users. In the future, we will try to reduce the transmission and computation costs of the proposed scheme. Furthermore, future research will design a new multi-keyword mechanism that supports threshold operations.