1. Introduction
Wireless sensor networks (WSNs) and cloud computing have been widely deployed in daily life. A WSN consists of small low-power sensors and lightweight mobile devices connected to the Internet [1, 2]. These devices collect and exchange information in a variety of applications. Cloud computing offers virtually unlimited storage and computation capacity. Driven by the assistance of cloud computing, WSN technology is emerging rapidly: it uses the cloud to store and process data, which reduces the burden on lightweight mobile devices.
Increasing attention has been paid to using WSN technology as a crucial part of the Internet of Things (IoT) in various industries. IoT improves manufacturing efficiency and enables sustainable production [3, 4, 5, 6, 7]. In IoT and cloud-assisted WSN applications, enterprises and individuals use cloud storage for data storage and data sharing, which reduces the burden of local storage.
Figure 1 shows the typical architecture of a cloud-assisted WSN. In this architecture, the cloud-assisted WSN system has powerful data processing capabilities and storage resources. The sensors deployed in the system collect data and upload them to the cloud server through a lightweight mobile device. When the cloud-assisted WSN receives data, it stores them and sends them to the relevant industry workers for use. In a specific practical scenario, such as a cloud-assisted medical system [8], the medical data documents must be kept confidential from everyone except the patient and the chief physician. Consequently, the stored data must be kept secure, since any information disclosure may result in serious consequences. Security requirements have therefore become a key challenge in cloud-assisted WSN.
Security concerns, such as whether users can trust that nobody is able to modify or observe their data, remain the stumbling block that hinders the adoption of cloud-assisted WSN. Generally, users encrypt their data before uploading them to the cloud server to protect data confidentiality. Unfortunately, this approach rules out the data search services provided by modern search engines, which makes effective search over encrypted data a challenging research problem. There are two trivial solutions to the search problem in encrypted documents. In the first, the data receiver downloads the encrypted data, decrypts them, and searches for the keyword locally. However, this method is impractical since it incurs a huge communication cost and occupies a large amount of local storage in the WSN. In the second, the data receiver sends the authorization key to the cloud server, which enables the server to decrypt the encrypted documents in the cloud and perform the search. However, this approach exposes the data to the cloud server and contradicts the original intention of data encryption. To address this problem, searchable encryption was proposed [9]. Searchable encryption enables a data receiver to authorize the cloud server to search over encrypted documents and return the associated encrypted files without decrypting them.
Searchable encryption can be divided into symmetric searchable encryption (SSE) and public key encryption with keyword search (PEKS). In SSE, a shared key is required to achieve data sharing. PEKS [10] was proposed to eliminate the shared key in SSE. The general PEKS system includes three participants: data senders, a data receiver and a cloud server. Data senders encrypt the data file and keyword index using the data receiver’s public key and then send the ciphertexts to the cloud server. The data receiver uses its private key to generate a keyword trapdoor and transmits it to the cloud server. The cloud server uses the trapdoor to test the keyword ciphertext: if the keyword in the ciphertext and the keyword in the trapdoor are equal, it outputs “equal”; otherwise it outputs “not equal”.
Unfortunately, traditional PEKS suffers from an inherent insecurity problem regarding trapdoor privacy. Anyone can use the data receiver’s public key to generate a valid keyword ciphertext. If the channel between the data receiver and the cloud server is public, then the trapdoor is also exposed. If the adversary can execute the test algorithm, it can verify whether the trapdoor and a ciphertext match. When they match, the keyword in the trapdoor equals the keyword in the ciphertext; otherwise, the adversary guesses another keyword and repeats until the correct keyword is found, which is feasible because the keyword space is small. This kind of attack is called an off-line keyword guessing attack (off-line KGA), as shown in Figure 2. The off-line KGA is divided into an external adversary’s off-line KGA and an internal server adversary’s off-line KGA, according to whether the adversary is an external party or the server itself.
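The guessing loop behind the off-line KGA can be sketched in a few lines of Python. The three `mock_*` functions are hypothetical stand-ins with no cryptographic security: they only mimic the public interface of a PEKS scheme (in a real scheme the trapdoor would require the receiver’s private key). The keyword list and key bytes are likewise invented for illustration.

```python
import hashlib
import os

# Mock PEKS with NO real security: it only mimics the public interface
# (encrypt under a public key, test a ciphertext against a trapdoor).
def mock_trapdoor(pk: bytes, keyword: str) -> bytes:
    # In a real PEKS scheme this step needs the receiver's private key.
    return hashlib.sha256(pk + keyword.encode()).digest()

def mock_encrypt(pk: bytes, keyword: str):
    # Randomized keyword ciphertext: (nonce, tag bound to the keyword).
    nonce = os.urandom(16)
    inner = hashlib.sha256(pk + keyword.encode()).digest()
    return nonce, hashlib.sha256(nonce + inner).digest()

def mock_test(ciphertext, trapdoor: bytes) -> bool:
    nonce, tag = ciphertext
    return hashlib.sha256(nonce + trapdoor).digest() == tag

def offline_kga(pk, eavesdropped_trapdoor, dictionary):
    """Try each candidate keyword until one matches the captured trapdoor."""
    for guess in dictionary:
        candidate = mock_encrypt(pk, guess)          # anyone can encrypt
        if mock_test(candidate, eavesdropped_trapdoor):
            return guess
    return None

receiver_pk = b"receiver-public-key"
trapdoor = mock_trapdoor(receiver_pk, "diabetes")    # sent over a public channel
small_keyword_space = ["asthma", "diabetes", "influenza"]
recovered = offline_kga(receiver_pk, trapdoor, small_keyword_space)
print(recovered)
```

The attacker never contacts the server: it needs only the public key, the captured trapdoor and the ability to run the test, so the cost is one encryption and one test per candidate keyword.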
In addition, traditional PEKS has another inherent insecurity problem regarding trapdoor privacy. Since the keyword space is small, a malicious data sender (or an external adversary) can guess a keyword and generate a data file ciphertext together with the associated keyword ciphertext. If the channel between the data receiver and the cloud server is public, then the trapdoor used to locate and return encrypted files is also exposed. After the cloud server performs the test operation, the matching encrypted data files are returned. If the returned files include an encrypted data file generated by the malicious sender, the sender knows which keyword is associated with that file and therefore also learns the keyword in the trapdoor. This kind of attack is called an on-line keyword guessing attack (on-line KGA), as shown in Figure 3. The difference between on-line KGA and off-line KGA is whether the adversary carries out the attack through the cloud server.
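The on-line KGA can be illustrated with a small simulation. `MockCloud` is a hypothetical model of what an observer sees on the public channel, not a cryptographic construction: the plaintext keyword comparison stands in for trapdoor matching, and the `reencrypt` flag replaces returned bytes with fresh random bytes as a stand-in for any transformation that makes returned ciphertexts unrecognisable to their sender.

```python
import os

class MockCloud:
    """Stores (keyword, file ciphertext) pairs; models only what an observer sees."""

    def __init__(self, reencrypt: bool):
        self.reencrypt = reencrypt
        self.records = []                      # list of (keyword, ciphertext bytes)

    def upload(self, keyword: str, ciphertext: bytes) -> None:
        self.records.append((keyword, ciphertext))

    def search(self, keyword: str) -> list:
        # Stand-in for trapdoor matching; the result travels over the public channel.
        hits = [ct for kw, ct in self.records if kw == keyword]
        if self.reencrypt:
            # Stand-in for re-encryption: the observer sees fresh-looking bytes.
            hits = [os.urandom(len(ct)) for ct in hits]
        return hits

def online_kga(cloud, guesses, receiver_keyword):
    # The malicious sender plants one recognisable file per guessed keyword.
    markers = {}
    for guess in guesses:
        marker = os.urandom(32)
        markers[marker] = guess
        cloud.upload(guess, marker)
    # The receiver's query runs; the attacker observes the returned ciphertexts.
    for ct in cloud.search(receiver_keyword):
        if ct in markers:
            return markers[ct]
    return None

guesses = ["asthma", "diabetes", "influenza"]
leaked = online_kga(MockCloud(reencrypt=False), guesses, "diabetes")
hidden = online_kga(MockCloud(reencrypt=True), guesses, "diabetes")
print(leaked, hidden)
```

In the first run the attacker recognises its own planted file among the results and learns the queried keyword; in the second it cannot link any returned ciphertext to a file it uploaded.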
For both types of attack, a trivial solution is to rely on a secure channel. A secure channel between the cloud server and the data receiver prevents the off-line KGA initiated by an external adversary as well as the on-line KGA. However, the cost of building a secure channel makes this impractical for applications that communicate over Wi-Fi or 4G. Moreover, to resist the off-line KGA initiated by the internal server adversary, the data receiver and every data sender would have to share a secret over a secure channel, which breaks the asymmetry property of PEKS. Therefore, it is significant and essential to resist both off-line KGA and on-line KGA performed by external and internal adversaries.
Consider a specific scenario: Personal Health Records (PHRs) must be kept confidential from everyone except the patient and the chief physician. To protect the privacy of their PHRs, patients need to encrypt the PHR data before uploading them to the cloud server. We want to implement a search function so that a chief physician can search the PHR information that he or she is authorized to access. A PEKS scheme can solve the keyword search problem over encrypted PHRs. However, PEKS suffers from an inherent problem, namely the keyword guessing attack (KGA). During a search, the adversary may obtain the keyword in the trapdoor, which exposes private PHR data to the adversary. Therefore, an efficient and secure data sharing and searching scheme that addresses both the off-line KGA and the on-line KGA is needed to guarantee data privacy.
1.1. Our Contributions
In this paper, we study how to resist both off-line KGA and on-line KGA performed by external and internal adversaries in PEKS and propose a remedy to these problems. Specifically, our contributions are as follows:
1. We introduce a data sharing and searching (DSS) framework that can effectively resist both off-line KGA and on-line KGA performed by external and internal adversaries. We also give a specific dual server DSS construction. The scheme achieves double ciphertext indistinguishability against the on-line KGA and indistinguishability against a chosen keyword attack (IND-CKA). We adopt the dual server method, which divides the cloud server into a forward server and a backward server such that no single server can complete the test algorithm independently or learn the correspondence between a trapdoor and a keyword ciphertext. Therefore, the off-line KGA cannot be conducted successfully.
2. We add data file encryption/decryption to our scheme. Traditional PEKS focuses on the search process and provides only a keyword encryption algorithm; it has no algorithm for encrypting or decrypting data files. In actual applications, however, data file encryption/decryption is indispensable. A malicious data sender may initiate an on-line KGA by observing the returned encrypted files. We adopt the re-encryption technique so that the malicious data sender (including the backward server) cannot learn the correspondence between a trapdoor and an encrypted data file. Therefore, the on-line KGA cannot be conducted successfully.
3. Our scheme simultaneously resists both off-line KGA and on-line KGA performed by external and internal adversaries. Unlike the trivial solutions, it does not require a secure channel and it keeps the asymmetry property. Compared to the previous schemes, our scheme also improves efficiency by eliminating the pairing computation and offers richer functionality by adding the data file encryption/decryption process.
Technical note: We choose PEKS as the starting point for the design of the scheme. To resist KGA, we consider on-line KGA and off-line KGA separately. For an external adversary’s off-line KGA, the scheme generates a key pair for the cloud server, which prevents the external adversary from launching an off-line KGA after eavesdropping on the trapdoor over the public channel. Note, however, that generating a key pair for the server does not by itself fully resist an external adversary’s off-line KGA. For example, Baek’s scheme has a fixed trapdoor, and the adversary can guess a keyword by comparing two bilinear pairings. We therefore also require the trapdoor to satisfy trapdoor indistinguishability to overcome the external adversary’s off-line KGA.
For an internal server adversary’s off-line KGA, we divide the cloud server into two servers, the forward server and the backward server. No single server can complete the test algorithm independently, so no single server can learn the correspondence between the trapdoor and the keyword ciphertext, and the off-line KGA cannot be initiated. Therefore, our framework can resist off-line KGA performed by external and internal adversaries.
For on-line KGA, since the attack is initiated by observing the returned data files, we need to consider data file encryption/decryption. We use an encryption scheme to provide data file encryption/decryption. The malicious data sender checks whether the returned data file ciphertexts include one that it generated itself, and from this infers the keyword in the eavesdropped trapdoor. Since the cloud server has strong computing power, we let the forward server perform a second encryption on the data file ciphertext. The resulting double ciphertext satisfies ciphertext indistinguishability against a malicious data sender, and therefore the malicious data sender cannot initiate an on-line KGA.
1.2. Related Works
In 2000, Song et al. first proposed an SSE scheme based on a symmetric cryptosystem [9]. Song et al.’s scheme searches for a keyword by word-by-word comparison over the ciphertext, so its efficiency is low. It is also vulnerable to statistical attacks and cannot be proven secure. After Song et al.’s scheme, many researchers proposed SSE schemes [11, 12]. SSE can only be built on a symmetric cryptosystem and therefore has a key distribution problem. To solve this problem, Boneh et al. proposed the first PEKS scheme based on an asymmetric cryptosystem in 2004 [10]. Boneh et al.’s scheme is transformed from identity-based encryption (IBE) by replacing the identity in the IBE with the keyword. It needs a secure channel between the cloud server and the receiver for uploading the trapdoor. However, building a secure channel is expensive, and in an IoT environment the receiver and the cloud server are typically connected through an insecure communication channel. In 2005, Abdalla et al. explored the conversion relationship between IBE and PEKS [13]. They showed that an anonymous IBE scheme can be transformed into a PEKS scheme and proposed a temporary keyword search scheme. Baek et al. proposed a PEKS scheme that removes the secure channel (dPEKS) [14]. In 2006, Baek et al. proposed a scheme combining a public key encryption (PKE) scheme and PEKS [15]. The scheme achieves both data file encryption/decryption and keyword search. However, it cannot resist off-line KGA: because its trapdoor is fixed, the adversary can test each keyword through a bilinear pairing to obtain the keyword in the trapdoor. Rhee et al. improved Baek et al.’s security model in 2009 by allowing the adversary to obtain the correspondence between the ciphertext and the trapdoor [16].
In 2010, Rhee et al. proposed a new dPEKS scheme [17]. It introduced a new security definition, trapdoor indistinguishability, which is a sufficient condition for resisting the external adversary’s off-line KGA. In 2013, Fang et al. proposed a scheme that can resist the external adversary’s off-line KGA in the standard model [18]. Fang et al.’s scheme is the first dPEKS scheme to achieve indistinguishability against a chosen keyword ciphertext attack, which allows the adversary to issue test queries. Rhee et al.’s two schemes and Fang et al.’s scheme cannot resist the internal server’s off-line KGA. In 2014, Chen et al. [19] proposed a generalized structure against on-line KGA. Chen et al.’s scheme [19] only satisfies trapdoor security against on-line KGA and remains vulnerable to off-line KGA. In 2016, Chen et al. proposed a two cloud server model [20] in which no single server can complete the test operation, so it can resist the off-line KGA. However, in Chen et al.’s scheme [20], anyone who can generate a trapdoor and access the test query can create a security problem. It also cannot resist on-line KGA.
In 2016, Chen et al. proposed a joint scheme combining PKE and PEKS [21]. This scheme achieves IND-CCA security and indistinguishability against a chosen keyword ciphertext attack, but it cannot resist off-line and on-line KGA. In 2009, Tang et al. proposed a PEKS scheme for resisting off-line KGA [22]. Tang et al.’s method is to share previously registered keywords between the receiver and every data sender. In 2017, Satio et al. proposed a designated-senders PEKS scheme [23]. A designated data sender must first be authenticated by the receiver. Only a designated data sender can generate a valid ciphertext and upload the shared encrypted data to the cloud server; therefore, the internal server adversary cannot generate a valid ciphertext and cannot initiate the off-line KGA. In the same year, Huang et al. [24] and Jiang et al. [25] also used the idea of designated senders: only designated senders can generate valid ciphertexts, so these schemes can resist the internal adversary’s off-line KGA. In 2018, Wu et al. proposed a scheme against the off-line KGA of an internal server adversary [26], in which the data receiver shares a secret with every sender. However, all the above five schemes break the asymmetry property of PEKS and cannot resist on-line KGA. Zhu et al. proposed a PEKS scheme with public verifiability [27]. It achieves public verifiability of the search results, but it cannot resist the internal server’s off-line KGA. Han et al. presented a survey of recent keyword search schemes [28]. Many other researchers have also studied the keyword search problem [29, 30].
After we finished our work, we found that Noroozi et al. concurrently presented a generalized PEKS structure against off-line KGA and on-line KGA by an external adversary [31]. Their method combines PEKS with a designated server structure and the technique of re-randomizing ciphertexts. However, resisting the external adversary alone is not enough; a PEKS scheme also needs to resist an internal server adversary. In our work, we design a PEKS scheme that simultaneously resists both the external adversary and the internal server adversary.
Noroozi et al. also consider that designing a PEKS scheme which is secure against off-line KGA and on-line KGA, even when performed by the internal server adversary, remains a challenging problem.
We likewise found that this challenging problem still needs to be addressed, and we designed a secure and efficient data sharing and searching (DSS) scheme against both off-line KGA and on-line KGA performed by external and internal adversaries.
1.3. Organization
The paper is organized as follows. The scheme definition and security model are described in Section 2. A secure and efficient data sharing and searching scheme against KGA (DSS against KGA) is proposed in Section 3. We analyze the security and efficiency of the proposed scheme in Section 3. The paper is concluded in Section 4.
2. Scheme Definition and Security Models
2.1. System Model
The model of the dual server DSS against KGA scheme (dual server DSS against KGA model) that we propose is shown in Figure 4. There are four participants in this model: data senders, a receiver, cloud server 1 and cloud server 2. The workflow is as follows:
First, data senders encrypt the data file M using the data receiver’s public key and the encryption algorithm to form a data file ciphertext. Data senders also encrypt the corresponding keyword index using the two servers’ public keys, the receiver’s public key and the encryption algorithm to form a keyword ciphertext, and then send the ciphertext to cloud server 1. Second, cloud server 1 generates the double ciphertext by re-encrypting the data file ciphertext. Then, the data receiver uses its secret key to generate a keyword trapdoor and transmits it to cloud server 1. Next, cloud server 1 uses the trapdoor and the keyword ciphertext to compute the transitional ciphertext and sends it to cloud server 2. Afterwards, cloud server 2 outputs the matching result. If the keyword in the ciphertext and the keyword in the trapdoor are equal, cloud server 2 sends the relevant encrypted data file to the data receiver. In the final step, the receiver decrypts the data file’s double ciphertext using its secret key to obtain the message M.
Although our scheme uses the re-encryption technique, its computational efficiency is almost equal to that of Noroozi et al.’s re-randomizing ciphertexts technique. Of course, the re-encryption technique can also be easily replaced with a re-randomizing ciphertexts technique in our work.
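To make the re-randomizing alternative concrete, the following is a minimal sketch using textbook ElGamal over a toy prime field. It is not the construction of this paper or of Noroozi et al.; the modulus `P`, generator `G` and all function names are assumptions chosen for illustration, and the parameters are not suitable for real use.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; toy-sized, not a recommended parameter
G = 3            # assumed base element for the sketch

def keygen():
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk                      # (public key, secret key)

def encrypt(pk, m):
    r = secrets.randbelow(P - 2) + 1
    return pow(G, r, P), (m * pow(pk, r, P)) % P

def rerandomize(pk, ct):
    # Multiply in a fresh encryption of 1: same plaintext, new-looking ciphertext.
    c1, c2 = ct
    s = secrets.randbelow(P - 2) + 1
    return (c1 * pow(G, s, P)) % P, (c2 * pow(pk, s, P)) % P

def decrypt(sk, ct):
    c1, c2 = ct
    # pow with a negative exponent computes the modular inverse (Python 3.8+).
    return (c2 * pow(c1, -sk, P)) % P

pk, sk = keygen()
message = 123456789
original = encrypt(pk, message)
refreshed = rerandomize(pk, original)             # needs only the public key
print(decrypt(sk, original) == message, decrypt(sk, refreshed) == message)
```

The server can refresh a ciphertext with the receiver’s public key alone, and the receiver decrypts the refreshed ciphertext exactly as it would the original, which is why the sender can no longer recognise its own upload among the returned files.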
2.2. Algorithm Definitions
Before defining our algorithms, we list in Table 1 the notation for the mathematical symbols used throughout the paper.
More specifically, a scheme of DSS against KGA consists of the following algorithms:
- (1) Setup: on input a security parameter k, output the system parameter.
- (2) Key generation:
Server key generation: on input the system parameter, output two pairs of public and secret keys, one for cloud server 1 and one for cloud server 2.
Receiver key generation: on input the system parameter, output a pair of public and secret keys for the receiver.
- (3) Encryption: on input the system parameter, the cloud server 1 public key, the cloud server 2 public key, the receiver public key, the keyword w and the message M, output the ciphertext.
- (4) Re-encryption: on input the system parameter, the receiver public key and the ciphertext, output the double ciphertext.
- (5) Trapdoor generation: on input the system parameter, the cloud server 1 public key, the cloud server 2 public key, the receiver public key, the receiver secret key and the keyword w, output the keyword search trapdoor.
- (6) Test: on input the system parameter, the cloud server 1 secret key, the cloud server 2 secret key, the keyword search trapdoor and the ciphertext, output the ciphertext if the keyword search trapdoor matches the ciphertext, and ⊥ otherwise. The matching process is as follows:
Test at cloud server 1: cloud server 1 inputs the trapdoor, the ciphertext, the cloud server 1 secret key and the system parameter, and outputs the transitional ciphertext.
Test at cloud server 2: cloud server 2 inputs the system parameter, the transitional ciphertext and the cloud server 2 secret key. If the transitional ciphertext satisfies the matching condition, it outputs the double ciphertext, and ⊥ otherwise.
- (7) Decryption: on input the system parameter, the receiver secret key and the ciphertext, output the message M.
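The algorithm list can also be read as a programming interface. The sketch below is one hypothetical rendering in Python: the class name, method names, parameter names and types are all assumptions made for illustration and do not come from the paper. It fixes only who supplies which input, not how any algorithm is computed.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Tuple

KeyPair = Tuple[Any, Any]   # (public key, secret key)

class DualServerDSS(ABC):
    """Hypothetical interface mirroring the seven algorithms of the scheme."""

    @abstractmethod
    def setup(self, k: int) -> Any:
        """Security parameter k -> system parameter."""

    @abstractmethod
    def keygen_servers(self, params: Any) -> Tuple[KeyPair, KeyPair]:
        """Key pairs for cloud server 1 and cloud server 2."""

    @abstractmethod
    def keygen_receiver(self, params: Any) -> KeyPair:
        """Key pair for the receiver."""

    @abstractmethod
    def encrypt(self, params, pk_s1, pk_s2, pk_r, w: str, m: bytes) -> Any:
        """Sender side: keyword w and message m -> ciphertext."""

    @abstractmethod
    def reencrypt(self, params, pk_r, ciphertext) -> Any:
        """Cloud server 1: ciphertext -> double ciphertext."""

    @abstractmethod
    def trapdoor(self, params, pk_s1, pk_s2, pk_r, sk_r, w: str) -> Any:
        """Receiver side: keyword w -> keyword search trapdoor."""

    @abstractmethod
    def test1(self, params, sk_s1, trapdoor, ciphertext) -> Any:
        """Cloud server 1: trapdoor and ciphertext -> transitional ciphertext."""

    @abstractmethod
    def test2(self, params, sk_s2, transitional) -> Optional[Any]:
        """Cloud server 2: double ciphertext on a match, None otherwise."""

    @abstractmethod
    def decrypt(self, params, sk_r, double_ciphertext) -> bytes:
        """Receiver side: recover the message."""

    def test(self, params, sk_s1, sk_s2, trapdoor, ciphertext) -> Optional[Any]:
        # The full test is the composition of the two single-server steps;
        # neither secret key alone is enough to reach a result.
        transitional = self.test1(params, sk_s1, trapdoor, ciphertext)
        return self.test2(params, sk_s2, transitional)
```

Splitting the test into `test1` and `test2` makes the dual server requirement visible in the types: each step consumes exactly one server’s secret key, and `None` plays the role of ⊥.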
2.3. Security Model
We define six security models: the indistinguishability against a chosen keyword attack (IND-CKA 1) security model for cloud server 1, the IND-CKA 2 security model for cloud server 2, the trapdoor indistinguishability against the off-line KGA (IND-Trapdoor 1) security model for cloud server 1, the trapdoor indistinguishability against the off-line KGA (IND-Trapdoor 2) security model for cloud server 2, the double ciphertext indistinguishability against the on-line KGA (IND-Double ciphertext) security model, and the transitional ciphertext indistinguishability against a chosen keyword attack (IND-CKA 3) security model.
Both cloud server 1 and cloud server 2 are assumed to be “honest but curious” and not to collude with each other. More specifically, the two servers strictly follow the testing process of the algorithm but may be curious about the content of the keyword. These models also implicitly cover security against external adversaries, since an external adversary has less capability than a cloud server.
We first define the semantic security of the keyword ciphertext. No adversary can distinguish the challenge ciphertext unless the trapdoor is available. Formally, we define the security models IND-CKA 1 and IND-CKA 2 as games played between a challenger and an adversary.
For the IND-CKA 1 security model (Table 2), the challenger generates three key pairs. It sends the public keys and the secret key of cloud server 1 to the cloud server 1 adversary. The adversary can access the trapdoor oracle to obtain the trapdoor of any keyword, and outputs two distinct challenge keywords and a message. The challenger generates the challenge PEKS ciphertext of one of the two keywords, selected by a random bit b, and sends it to the adversary. During the game, the adversary can adaptively continue to query the trapdoor oracle for any keyword other than the two challenge keywords. Finally, the adversary outputs a bit as its guess of b.
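The structure of this game can be sketched as a small test harness. Everything in the block is a hypothetical illustration rather than the paper’s formal model: the class and function names are invented, the message is omitted for brevity, the rule that challenge keywords must not have been queried beforehand is an assumed restriction, and the “scheme” is a deliberately weak deterministic mock chosen so that the adversary visibly wins.

```python
import hashlib
import secrets

class IndCkaChallenger:
    """One run of an IND-CKA style game (hypothetical harness)."""

    def __init__(self, encrypt, trapdoor):
        self._encrypt = encrypt        # keyword -> keyword ciphertext
        self._trapdoor = trapdoor      # keyword -> trapdoor
        self._challenge = ()           # the two challenge keywords, once chosen
        self._queried = set()
        self._bit = None

    def trapdoor_oracle(self, keyword):
        if keyword in self._challenge:
            raise ValueError("trapdoors for challenge keywords are not allowed")
        self._queried.add(keyword)
        return self._trapdoor(keyword)

    def challenge(self, w0, w1):
        # Assumed restriction: distinct keywords that were not queried earlier.
        if w0 == w1 or self._queried & {w0, w1}:
            raise ValueError("invalid challenge keywords")
        self._challenge = (w0, w1)
        self._bit = secrets.randbelow(2)           # the challenger's random bit b
        return self._encrypt(self._challenge[self._bit])

    def adversary_wins(self, guess):
        return guess == self._bit

# Deliberately weak mock: deterministic keyword "encryption" anyone can recompute.
def weak_encrypt(keyword):
    return hashlib.sha256(b"public-key" + keyword.encode()).hexdigest()

def weak_trapdoor(keyword):
    return hashlib.sha256(b"secret-key" + keyword.encode()).hexdigest()

def play_once():
    game = IndCkaChallenger(weak_encrypt, weak_trapdoor)
    game.trapdoor_oracle("asthma")                 # allowed: not a challenge keyword
    ct = game.challenge("diabetes", "influenza")
    guess = 0 if ct == weak_encrypt("diabetes") else 1
    return game.adversary_wins(guess)

wins = sum(play_once() for _ in range(200))
print(wins / 200 - 0.5)                            # empirical advantage estimate
```

Because the mock encryption is deterministic, the adversary simply recomputes the ciphertext of one challenge keyword and compares; a scheme satisfying the definition must leave every efficient adversary with a guessing rate negligibly different from one half.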
For the IND-CKA 2 security model (Table 3), the game is similar to IND-CKA 1 and is played between a challenger and the cloud server 2 adversary. We omit the details here. The definition is as follows:
Definition 1 (IND-CKA). A scheme of DSS against KGA is indistinguishable against a chosen keyword attack if no PPT adversary can win game IND-CKA 1 or game IND-CKA 2 with non-negligible advantage, where the adversary in IND-CKA 1 is cloud server 1 and the adversary in IND-CKA 2 is cloud server 2.
We define advantage as:
Next, we define the semantic security of the keyword trapdoor. No adversary can distinguish the challenge trapdoor; that is, the challenge trapdoor does not reveal any information about the keyword. Formally, we define the security models IND-Trapdoor 1 and IND-Trapdoor 2 as games played between a challenger and an adversary.
The IND-Trapdoor 1 and IND-Trapdoor 2 games are similar to IND-CKA 1, except that the adversary is given a challenge trapdoor instead of the PEKS challenge ciphertext. For the IND-Trapdoor 1 security model (Table 4), the challenger generates three key pairs. It sends the public keys and the secret key of cloud server 1 to the cloud server 1 adversary. The adversary can access the trapdoor oracle to obtain the trapdoor of any keyword, and outputs two distinct challenge keywords. The challenger generates the challenge trapdoor of one of the two keywords, selected by a random bit b, and sends it to the adversary. During the game, the adversary can adaptively continue to query the trapdoor oracle for any keyword other than the two challenge keywords. Finally, the adversary outputs a bit as its guess of b.
For the IND-Trapdoor 2 security model (Table 5), the game is similar to IND-Trapdoor 1 and is played between a challenger and the cloud server 2 adversary. We omit the details here. The definition is as follows:
Definition 2 (IND-Trapdoor). A scheme of DSS against KGA has trapdoor indistinguishability against off-line KGA if no PPT adversary can win game IND-Trapdoor 1 or game IND-Trapdoor 2 with non-negligible advantage, where the adversary in IND-Trapdoor 1 is cloud server 1 and the adversary in IND-Trapdoor 2 is cloud server 2.
We define advantage as:
After that, we define the semantic security of the double ciphertext. No adversary can distinguish the challenge double ciphertext. Formally, we define the IND-Double ciphertext security model in Table 6. The IND-Double ciphertext game is similar to IND-CKA 1, except that the adversary is given a challenge double ciphertext instead of the PEKS challenge ciphertext. The adversary outputs two distinct challenge ciphertexts. The challenger generates the challenge double ciphertext of one of them, selected by a random bit b, and sends it to the adversary. Finally, the adversary outputs a bit as its guess of b.
Definition 3 (IND-Double ciphertext). A scheme of DSS against KGA has double ciphertext indistinguishability against the on-line KGA if no PPT adversary can win the game IND-Double ciphertext with non-negligible advantage, where the adversary is a malicious data sender (including cloud server 2).
We define advantage as:
Finally, we define the semantic security of the transitional ciphertext. No adversary can distinguish the challenge transitional ciphertext unless the trapdoor is available. Formally, we define the security model IND-CKA 3 in Table 7. The IND-CKA 3 game is similar to IND-CKA 1, except that the adversary is given the challenge transitional ciphertext instead of the PEKS challenge ciphertext. We omit the details here.
Definition 4 (IND-CKA 3). A scheme of DSS against KGA has transitional ciphertext indistinguishability against a chosen keyword attack if no PPT adversary can win the game IND-CKA 3 with non-negligible advantage, where the adversary may be any party, including cloud server 2.
We define advantage as:
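The advantage in each game of this section can be written in the usual form for bit-guessing experiments. As a sketch under assumed notation ($\mathcal{A}$ for the adversary, $b'$ for the bit it outputs, $k$ for the security parameter, and the superscript as a placeholder for the name of the game), a conventional expression is:

```latex
\mathrm{Adv}^{\mathrm{game}}_{\mathcal{A}}(k)
  = \left| \Pr\left[ b' = b \right] - \frac{1}{2} \right|
```

Under this convention, a scheme meets the corresponding definition when the quantity is negligible in $k$ for every PPT adversary: an adversary that guesses at random has advantage 0, while one that always recovers b has advantage 1/2.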