1. Introduction
Recently, cloud services [1] have been rapidly promoted by the development of internet technologies. As a widely used outsourcing paradigm, the cloud has been accepted by the market (e.g., iCloud, Dropbox, and Microsoft Cloud), providing a convenient and low-cost method for data storage and data sharing [2]. As shown in Figure 1, cloud services play an important role in daily life, in applications such as smart healthcare, smart agriculture, smart cities, and smart transport [3,4,5,6]. According to a report released by Gartner in 2021, more than 45% of IT spending on infrastructure, applications, and business process outsourcing will shift from traditional solutions to the cloud by 2024. Despite the proliferation of the cloud, data security and privacy preservation remain long-term concerns for users, since they lose physical control of their data. Therefore, cloud service providers (CSPs) are commonly treated as honest-but-curious (HBC) entities. On the other hand, cloud services should prevent data breaches on the cloud to enhance CSPs’ reliability [7]. Therefore, it is crucial to design a secure and privacy-preserving data sharing scheme for cloud services.
The General Data Protection Regulation (GDPR) sets strict privacy requirements for CSPs. Specifically, three principles must be satisfied: (1) Receiver access control. Following the data collection limitation principle, the data should only be sent to receivers that meet the data sender’s access policies. (2) Sender access control. Following the data quality principle, the data sender should be identified to ensure data accuracy. (3) Data privacy. Following the data privacy principle, sensitive data (e.g., access policies and shared data) should not be disclosed. Therefore, access control in the data sharing scheme should be specified by both the sender and the receiver. Moreover, the cloud-based data sharing scheme should guarantee data privacy when devices share data and store it on the cloud.
Inherently, several significant challenges arise when applying existing data sharing schemes to cloud services [8,9,10,11,12,13,14], which are caused not only by data breaches but also by users’ strict privacy requirements. Taking the smart transport system as an example, end devices (e.g., distance sensors, speed sensors, and temperature sensors) collect information from vehicles [15]. By analyzing this information, the smart transport system can perform more precise and effective traffic management. Due to the restricted computational resources and storage capability of end devices, vehicles primarily outsource the collected data to the cloud server. As shown in Figure 1, vehicles are willing to share their data in order to avoid traffic jams and plan optimal travel paths. The collected data are transmitted from end devices to the cloud server. Then, the cloud server sends the required data to target vehicles. The collected data might contain sensitive information such as individuals’ daily trajectories and real-time locations. If this information stored on the cloud server were accessible to anyone, it would directly threaten users’ data security. Therefore, it is necessary to ensure that sensitive information cannot be snooped on by the cloud server and to prevent unauthorized entities from accessing it.
However, most current access control schemes support only one-sided access control (i.e., either sender or receiver access control). One-sided access control schemes cannot satisfy practical privacy requirements and result in substantial communication overhead for transmitting information in the system. With bilateral access control, the access policy can be specified by both senders and receivers. Specifically, vehicles can grant access to their information only to designated end devices, and end devices can simultaneously choose to receive information only from specified vehicles or other devices. Intuitively, attribute-based encryption (ABE) seems to be a possible solution for access control among multiple users [16]. However, standard ABE approaches cannot support bilateral access control. To tailor the ABE technique for bilateral access control, ABE with keyword search (ABKS) enables receivers to seek suitable senders using keywords [10]. That scheme, however, requires additional interactions between the users and the cloud server, introducing extra communication overhead for users. Later, Ateniese et al. [17] presented an encryption primitive at CRYPTO’19, named Matchmaking Encryption (ME), which achieves bilateral access control without revealing private information of either senders or receivers. When the matching fails, nothing (i.e., neither the access policies nor the data) is disclosed. However, the matching process imposes heavy computation and communication overhead on end devices. To further improve efficiency, it is desirable to delegate the matching process to the cloud without revealing any of the users’ private information. To summarize, practical secure data sharing for cloud services should provide privacy preservation and bilateral access control in order to facilitate the growth of cloud services.
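The bilateral matching condition described above can be illustrated with a minimal, non-cryptographic sketch. It assumes the identity-based setting, in which each side names exactly one target identity. The names `Envelope` and `bilateral_match` are hypothetical and are not part of any scheme discussed here, and plaintext equality checks stand in for the cryptographic matching, so the sketch shows only the access logic and none of the privacy guarantees.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Envelope:
    """Toy stand-in for a ciphertext (hypothetical; nothing here is encrypted)."""
    sender_id: str        # identity of the sender who produced the message
    target_receiver: str  # receiver identity chosen by the sender's policy
    payload: str


def bilateral_match(env: Envelope, receiver_id: str, target_sender: str) -> Optional[str]:
    """Return the payload only if BOTH policies are satisfied, else None."""
    sender_policy_ok = env.target_receiver == receiver_id   # sender accepts this receiver
    receiver_policy_ok = env.sender_id == target_sender     # receiver accepts this sender
    if sender_policy_ok and receiver_policy_ok:
        return env.payload
    # A single failure value is returned so the caller cannot tell which side mismatched.
    return None


env = Envelope(sender_id="vehicle-17", target_receiver="sensor-A", payload="speed=42")
matched = bilateral_match(env, receiver_id="sensor-A", target_sender="vehicle-17")
wrong_receiver = bilateral_match(env, receiver_id="sensor-B", target_sender="vehicle-17")
wrong_sender = bilateral_match(env, receiver_id="sensor-A", target_sender="vehicle-99")
```

A one-sided scheme would evaluate only one of the two conditions. In the sketch, the payload is released only when both hold, and a mismatch on either side produces the same result.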
In this paper, we introduce a cloud-based privacy-preserving data sharing scheme with bilateral access control. By analyzing the practical security requirements, we formalize the crucial challenges in the state of the art. Specifically, to provide bilateral access control, we construct our scheme based on identity-based matchmaking encryption (IB-ME) so that both sides can specify match policies simultaneously. To achieve high efficiency, we delegate the matching process to the cloud server while protecting the user’s private information and data by designing a signature-based match tag. The contributions of our work are summarized as follows:
We propose a data sharing scheme for cloud services, derived from identity-based matchmaking encryption, named IBME-DS. The access policies in IBME-DS are specified by both the sender and the receiver to achieve bilateral access control.
To further improve system efficiency, we design a privacy-preserving matching mechanism that delegates the matching process to the cloud server while ensuring user privacy and data confidentiality during the matching procedure.
We formally define the system model, threat model, and security model of IBME-DS. Then, a comprehensive security analysis demonstrates that our proposed scheme meets the practical security requirements.
Finally, we evaluate the performance of IBME-DS by conducting extensive experiments on a real-world dataset to show that IBME-DS is more efficient than related schemes.
Organization. The remainder of this paper is structured as follows. Section 2 presents the preliminaries used in this paper. In Section 3, we define the system model, threat model, and security model of our scheme. Section 4 provides the concrete construction based on bilinear groups. In Section 5, we give rigorous security proofs for our scheme. Section 6 presents the theoretical analysis and experimental performance. In Section 7, we review related work on access control and matchmaking encryption. In Section 8, we discuss the advantages and limitations of our research. Finally, we conclude our work in Section 9.
5. Security Analysis
Theorem 1. If the underlying IB-ME is IND-CPA secure, our proposed IBME-DS is secure.
Proof. Assume that a PPT adversary can break IBME-DS, a simulator can use to break the underlying IB-ME.
Setup: Choose random values and set , and . The master secret key is and the master public key is the tuple . Then, the is sent to . selects a random value . Then, sends system parameters to . And the padding function is under control.
Queries: performs the hash queries on to construct hash table list :
1. If query has been requested before, that the query can be found in , returns . Otherwise, generates a coin , .
2. If , chooses and computes . Then, add to . Otherwise, sets , and adds to , where x is unknown to .
3. Return .
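The hash-query steps above follow the common pattern of a lazily sampled, programmable random oracle with a biased coin. The following is a minimal sketch of that pattern only, under stated assumptions: the class name `CoinFlipOracle`, the bias parameter `delta`, the table layout, and the toy group (the order-11 subgroup of the integers modulo 23) are all illustrative choices and are not taken from the proof, whose exact symbols are not reproduced here.

```python
import random


class CoinFlipOracle:
    """Lazily sampled, programmable hash oracle over a toy prime-order subgroup."""

    def __init__(self, p: int, q: int, g: int, challenge: int, delta: float):
        self.p, self.q, self.g = p, q, g   # g generates the order-q subgroup of Z_p^*
        self.challenge = challenge         # group element whose discrete log is unknown
        self.delta = delta                 # probability that the coin is 0
        self.table = {}                    # query -> (coin, exponent, answer)
        self._rng = random.SystemRandom()

    def query(self, x: bytes) -> int:
        if x in self.table:                          # repeated query: replay stored answer
            return self.table[x][2]
        coin = 0 if self._rng.random() < self.delta else 1
        r = self._rng.randrange(1, self.q)           # random nonzero exponent
        if coin == 0:
            answer = pow(self.g, r, self.p)          # simulator knows the discrete log r
        else:
            answer = pow(self.challenge, r, self.p)  # challenge embedded, log unknown
        self.table[x] = (coin, r, answer)
        return answer


# Toy parameters (assumption): 4 generates the subgroup of order 11 in Z_23^*.
P, Q, G = 23, 11, 4
oracle = CoinFlipOracle(P, Q, G, challenge=pow(G, 7, P), delta=0.5)
first = oracle.query(b"alice")
second = oracle.query(b"alice")
```

Storing the coin and exponent alongside each answer is what later lets a simulator answer key queries when it knows the exponent and abort when it does not.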
Queries: performs the hash queries on . The hash list is constructed by . If was already queried, returns the value . Otherwise, chooses . Then, it adds to . Finally, returns to .
KeyGen Queries: Upon being the input, obtains by from , and returns . Upon being the input, obtains from . If , returns . Otherwise, terminates the game.
Challenge: sends to , where , . Then performs as follows:
1. queries and . Let and , it means that .
2. selects a random .
3. sends to .
Guess: outputs a as a guess on b.
If holds, we say that breaks the security of our proposed scheme. Then, can make use of the result from to break the underlying IB-ME. and are used in IB-ME. If can tell is computed from , also can tell that in IB-ME with the same advantage.
Moreover, in MatchTag and Match, are computed with a randomness . From the view of , the distribution of these elements is indistinguishable from the random elements. If can tell the difference, then can solve the discrete logarithm problem. □
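The challenge and guess phases of the proof above follow the usual indistinguishability game. A simplified sketch of that game is given below, assuming a plain symmetric setting with no key-generation or hash oracles; the function names and the two toy "schemes" (an identity map that leaks the plaintext and a one-time pad) are illustrative assumptions and are unrelated to IBME-DS itself. It only shows how the advantage |Pr[b' = b] - 1/2| is measured.

```python
import secrets


def ind_cpa_round(encrypt, choose, guess) -> bool:
    """Play one challenge round; return True if the adversary guesses b."""
    m0, m1 = choose()                      # adversary picks two equal-length messages
    b = secrets.randbelow(2)               # challenger's hidden bit
    c = encrypt(m1 if b else m0)           # challenge ciphertext
    return guess(m0, m1, c) == b


def estimate_advantage(encrypt, choose, guess, trials: int = 2000) -> float:
    """Empirical |Pr[b' = b] - 1/2| over independent rounds."""
    wins = sum(ind_cpa_round(encrypt, choose, guess) for _ in range(trials))
    return abs(wins / trials - 0.5)


def identity_encrypt(m: bytes) -> bytes:
    return m                               # deliberately broken: leaks the plaintext


def one_time_pad_encrypt(m: bytes) -> bytes:
    key = secrets.token_bytes(len(m))      # fresh uniform key for every message
    return bytes(x ^ k for x, k in zip(m, key))


def choose():
    return b"A" * 16, b"B" * 16


def guess(m0: bytes, m1: bytes, c: bytes) -> int:
    return 1 if c == m1 else 0             # naive distinguisher: compare with m1


broken_adv = estimate_advantage(identity_encrypt, choose, guess)
otp_adv = estimate_advantage(one_time_pad_encrypt, choose, guess)
```

Against the leaky map the distinguisher always wins, while against the one-time pad its empirical advantage stays close to zero. A reduction argues that any adversary with non-negligible advantage in one game yields an adversary with comparable advantage in the other.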
Theorem 2. Our proposed IBME-DS achieves authenticity if the bilinear Diffie–Hellman (BDH) problem is hard.
Proof. Suppose that a PPT adversary breaks the authenticity of IBME-DS, a simulator is able to use to break BDH problem with non-negligible advantage. Receiving the challenge , computes .
Setup: sends the public system parameters to , where two hash functions and are under control.
Queries: performs the hash queries on to construct hash table list :
1. If query has been requested before, that the query can be found in , returns . Otherwise, generates a coin randomly, .
2. If , randomly selects and computes . Otherwise, computes . Then, add to .
3. Finally, send to .
Queries: performs the hash queries on to construct the hash table list :
1. If query has been requested before, then the query can be found in , returns . Otherwise, generates a coin randomly, .
2. If , randomly selects and computes . Otherwise, computes . Then, add to .
3. Finally, send to .
Queries: performs the hash queries on . The hash list is constructed by . If was already queried, returns the value . Otherwise, randomly chooses . Then, it adds to . Finally, returns to .
SKGen Queries: Input , obtains by from . If , returns ; otherwise, aborts.
RKGen Queries: Input , obtains by from . If , returns ; otherwise, aborts.
Forgery: sends the tuple to . Let , performs as follows:
1. queries and .
2. If both the tuples and without and , aborts. If not, , , , where , and .
3. parses C as , computes and selects a random tuple .
4. Return .
If the above simulation holds, we say that breaks the authenticity of our proposed scheme. can forge the valid ciphertext even if it is not authorized, as it can solve the BDH problem.
Next, we assume that the adversary makes and queries to oracle SKGen and RKGen, and analyze the advantage that outputs the solution to the BDH assumption. By the above proof, the probability that does not abort for any of these calls is and the probability that does not abort in the forgery phase is . Thus, the total probability that does not abort is and will get the maximum value when which is . If does not abort, it outputs the correct solution with a probability at least . Hence, can solve the BDH problem with advantage . □
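The abort analysis above is an instance of a standard optimization over the coin bias. The sketch below uses a generic form as an assumption: every one of `q` key queries must hit coin 0 (probability `delta` each) and `k` coins in the forgery phase must be 1 (probability `1 - delta` each). The exact exponents of the proof are not reproduced here, so `q`, `k`, and the function names are hypothetical.

```python
import math


def no_abort_probability(delta: float, q: int, k: int = 1) -> float:
    """delta^q * (1 - delta)^k: all q key queries hit coin 0, k forgery coins hit 1."""
    return delta ** q * (1.0 - delta) ** k


def optimal_delta(q: int, k: int = 1) -> float:
    """Maximizer of delta^q * (1 - delta)^k on (0, 1), from d/d(delta) log f = 0."""
    return q / (q + k)


q_total = 20                                   # hypothetical total number of key queries
best = optimal_delta(q_total)
best_value = no_abort_probability(best, q_total)
grid_max = max(no_abort_probability(d / 1000, q_total) for d in range(1, 1000))
lower_bound = 1 / (math.e * (q_total + 1))     # since (1 + 1/q)^q <= e when k = 1
```

Setting the derivative of `q*log(delta) + k*log(1 - delta)` to zero gives `delta = q / (q + k)`. For `k = 1` the resulting no-abort probability is at least `1 / (e * (q + 1))`, so the reduction loses only a factor linear in the number of queries.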
Theorem 3. Our proposed IBME-DS achieves privacy preservation if the discrete logarithm (DL) problem is hard.
Proof. The privacy preservation of IBME-DS lies in the fact that no one can obtain any private information (i.e., the outsourced data and access policy) by intercepting the communication or making unauthorized modifications to it.
From the data privacy aspect, IBME-DS encrypts the data using the user’s secret encryption key. By the security of IBME-DS, the encrypted data appears as a random string from the view of the adversary. Therefore, no PPT adversary can obtain the data contained in the system.
From the aspect of the sender’s identity and access policy, even though receivers can specify the sender to generate the sender’s corresponding hash , it is impossible to further calculate the sender’s secret key due to the hardness of the DL problem. The designed access policy is embedded in the ciphertext as .
From the aspect of the receiver’s identity and access policy, the identity and access policy are hidden in , which should be kept secretly by receivers. During the matching process, receivers will hide their secret decryption key by randomly choosing parameter .
Therefore, no PPT adversary can determine the identity and access policy of the sender or the receiver. This completes the proof that IBME-DS achieves privacy preservation if the DL problem is hard. □
Theorem 4. There does not exist any PPT adversary who can forge the match tag for the match algorithm in IBME-DS, if the CDH assumption holds.
Proof. With the master public key , the ciphertext C and the match tag , the cloud server can get and . However, since the random numbers are unknown to the cloud server, it cannot retrieve any sensitive information about from C and , according to the hardness of the CDH problem. Furthermore, for the final result of match , its construction is based on the BDH problem. Hence, the cloud server learns only what we intend to disclose, namely whether the equation holds. This completes the proof that no PPT adversary can forge the match tag in IBME-DS if the CDH assumption holds. □