Abstract
Medical data sharing is pivotal in enhancing accessibility and collaboration among healthcare providers, researchers, and institutions, ultimately leading to better patient outcomes and more efficient healthcare delivery. However, due to the sensitive nature of medical information, ensuring both privacy and confidentiality is paramount. Access control-based data sharing methods have been explored to address these issues, but data privacy concerns remain. Therefore, this paper proposes a secure and privacy-preserving data sharing scheme that balances data confidentiality and privacy. By leveraging key aggregate encryption and private set intersection techniques, our scheme ensures secure data sharing while preventing the exposure of sensitive data-related information. We conduct informal and formal security analyses, including Burrows–Abadi–Needham logic and Scyther, to demonstrate its resilience against potential adversarial attacks. We also measure the execution time of cryptographic operations using the multiprecision integer and rational arithmetic cryptographic library and compare the proposed scheme with existing related schemes in terms of security, computational cost, and time complexity. Our findings indicate a high level of security and efficiency, showing that the proposed scheme protects data privacy while enabling secure and flexible sharing of medical data.
Keywords:
medical data sharing; key aggregate encryption; private set intersection; homomorphic encryption; mutual authentication
MSC: 68M12
1. Introduction
With the rapid advancement of modern technologies and the increasing digitalization of the medical sector, there has been a significant surge in both the volume and diversity of medical data. This proliferation, especially within the realm of medical information systems such as electronic health records (EHRs) and health information exchange (HIE), promotes accessibility and data sharing among healthcare providers, researchers, and institutions, improving patient outcomes, accelerating healthcare discovery, and optimizing healthcare delivery [1,2]. Medical data can be used to identify patterns and correlations to advance the understanding, diagnosis, and treatment of disease. This can improve quality of care and patient satisfaction by personalizing the care process and effectively managing chronic conditions. Ultimately, medical data sharing is an essential element of modern healthcare and plays an important role in advancing medical information systems and improving people’s quality of life.
Despite the numerous advantages of medical data sharing, significant concerns remain, particularly in the realms of security and privacy. The sensitive nature of medical information means that without adequate protection, there is a substantial risk of severe privacy violations and data breaches [3,4,5]. Unauthorized access or the theft of medical data can not only infringe upon privacy rights but also undermine trust in the entire medical information system. Data breaches can disrupt healthcare operations, compromise the integrity of research efforts, and even jeopardize public health initiatives. Ensuring the security of medical data is not merely about privacy but is also a critical aspect of protecting the entire healthcare infrastructure. This underscores the crucial importance of balancing confidentiality and privacy in data sharing, prompting the exploration of advanced solutions that can enhance utility while ensuring data security. Hence, it is imperative to conduct research on secure sharing of medical data by restricting access to information.
To enhance medical data security, researchers have increasingly explored data sharing frameworks that leverage access control technologies, such as attribute-based encryption (ABE) [6] and key aggregate encryption (KAE) [7]. Based on specific properties for generating and decrypting ciphertext, these frameworks ensure that only authorized users can access data, allowing data owners to securely share sensitive data while maintaining strict control over access rights. While offering substantial flexibility in data and access rights management, they also raise serious concerns regarding data privacy, necessitating thorough examination from alternative perspectives. In access control-based systems, privacy vulnerabilities can arise either from revealing sensitive information through attribute values [8,9,10] or from exposing data-related information itself, because data users must identify the desired data before requesting an access key from the owner. Conversely, if data owners choose to withhold data-related information, data users will resort to inefficient and insecure practices, such as randomly querying a cloud server to determine the availability of certain data. This method is inherently flawed, as it neither guarantees efficiency nor meets the security requirements for handling sensitive medical data. Therefore, there is a need for further research and development in medical data sharing methodologies that enable secure data sharing between data owners and users while maintaining data privacy.
In this paper, we propose a design of a secure and privacy-preserving data sharing scheme for medical information systems. We integrate private set intersection (PSI) with KAE to achieve an equilibrium between data confidentiality and privacy. To ensure secure and adaptable access control over medical data, we leverage a single access key feature of KAE [11] while integrating PSI to alleviate potential privacy concerns. PSI enables both data owners and users to confirm the presence of common information in their respective private sets without revealing the information about them, allowing them to issue or request access keys only after verifying intersection. This mechanism markedly diminishes the necessity for data owners to divulge data-related information, effectively mitigating a primary privacy concern. By adopting this approach, we not only reduce the risk of information exposure but also fortify the overall data security framework within medical information systems and address the imperatives of data confidentiality and privacy preservation in medical data sharing. The key considerations of this paper are as follows.
- We propose a privacy-preserving medical data sharing scheme. To maintain a balance between privacy and data sharing, we leverage PSI between the data owner and the data user before access requests. This facilitates interaction and data sharing while protecting sensitive information.
- The proposed scheme ensures secure data sharing and access control through KAE. Since KAE enables secure and flexible access control with a single aggregate key, the integration of KAE in the proposed scheme enhances data security by reducing the risk of data breaches and unauthorized disclosures.
- We perform security analysis using the Scyther tool [12] and mathematical analysis methods such as Burrows–Abadi–Needham (BAN) logic [13] and indistinguishability against the chosen plaintext attack (IND-CPA). In addition, we conduct performance analyses using the multiprecision integer and rational arithmetic cryptographic library (MIRACL) [14] and compare the obtained results with those of previous studies.
The remainder of this paper is organized as follows. The related works are presented in Section 2, and the preliminaries for the paper are in Section 3. Section 4 presents the system model for the proposed scheme, including the network model, adversary model, and security model. Section 5 explains the proposed medical data sharing scheme. Informal and formal security analyses are performed in Section 6, and the comparative analysis is conducted in Section 7. Section 8 summarizes the conclusions of this paper.
2. Related Works
Considerable research has been conducted in the realm of medical data sharing with a notable focus on data security and privacy preservation. In 2022, Bao et al. [15] proposed a lightweight ABE scheme, specifically tailored for the Internet of Things (IoT) and supported by cloud technology, within smart healthcare systems. Their approach prioritizes both the efficiency required by resource-limited devices and the implementation of fine-grained access control, ensuring that data access aligns with user authorization levels. Mamta et al. [16] developed a secure and efficient fine-grained data sharing scheme for IoT-based healthcare systems. Their approach critiques existing models of fine-grained medical data sharing and leverages fog computing alongside ABE to significantly reduce the computational load on data users while simultaneously enhancing data confidentiality. Wang et al. [17] proposed a consortium blockchain-based scheme for personal health record (PHR) management and sharing that prioritizes both security and privacy. They emphasized the importance of allowing patients to customize access control to their PHR according to their individual preferences, ensuring that only authorized users have access. To achieve this, they integrated a modified ABE scheme with smart contracts, enabling functionalities for secure search, privacy preservation, and personalized access control. Oh et al. [18] introduced a patient-centric secure PHR sharing system, addressing data integrity, transparency, and mutual authentication, among other requirements. They acknowledged the common use of attribute-based searchable encryption (ABSE) in medical data sharing but identified key management challenges inherent in its implementation. To address this issue, they adopted the concept of key aggregate searchable encryption (KASE), presenting a key aggregate dynamic searchable encryption framework integrated with a linear secret sharing scheme.
In 2023, Trivedi and Patel [19] developed a KASE-based framework for sharing electronic health records in integrated healthcare systems on clouds. They identified limitations in existing schemes, particularly the lack of secure multi-user authorization and keyword untraceability, and designed their framework to provide both features. This approach not only enhances security but also demonstrates significant efficiency gains in storage, communication, and computation. Xu et al. [20] devised a privacy-enhanced medical data sharing framework that utilizes an authorization mechanism and ABE on the blockchain, aiming to address the challenges of fragmented healthcare systems, which can compromise treatment quality and lead to privacy breaches. Their approach empowers data owners with control over data access via ABE, complemented by an efficient authorization and revocation mechanism, which ensures that authorized doctors can access data while access for unauthorized individuals is swiftly revoked. Zhang et al. [21] developed a multi-server search scheme that facilitates collaborative operations among various healthcare entities for tasks such as diagnostic institution location, medical data retrieval, and cross-domain data exploration. Their scheme incorporates a secure data transfer method that enables servers from disparate organizations to perform joint computational tasks while preserving the confidentiality of each participant’s data and the privacy of their search identities. Zhang et al. [22] identified that while weighted ABE enhances the flexibility of access policies in medical data sharing, it poses challenges for data owners striving to maintain control over their privacy, particularly in collaborative e-health systems. Aiming to strike an optimal balance between privacy and flexibility, they proposed a cloud-based system for sharing personal health records. This system employs AND-weighted ABE, enabling users to access data only if they belong to specified organizations, thus reinforcing both security and selective accessibility. Peng et al. [23] proposed a patient-centric EMR sharing scheme, aiming to tackle the complex issues of privacy concerns and inter-agency distrust stemming from the misuse of access to medical records and the challenge of admitting unconscious patients. They designed an architecture for privacy-preserving medical data sharing, leveraging a dual-blockchain system and an identity-based tripartite authentication key agreement scheme, fostering trust between patients and healthcare institutions.
In 2024, Zhang et al. [24] pointed out that existing attribute-based searchable encryption schemes could expose sensitive information about data users and lead to data tampering and even untrusted results due to the delegation of complex search operations to a cloud server. In response, they proposed a blockchain-based anonymous ABSE scheme to enhance data sharing security. This approach conceals the attributes of the access policy, thereby safeguarding the confidentiality of the attributes that fulfil the access requirements. By integrating ABSE with blockchain technology, the scheme also incorporates features like tamper-proofing, integrity verification, and non-repudiation, significantly bolstering the trust and security of digital transactions. Jastaniah et al. [25] introduced the SAMA scheme, crafted to overcome the shortcomings of current methodologies in managing data aggregation and sharing for wearable devices. Their objective was to offer a scheme that is not only centered around the user and privacy-friendly but also flexible enough to efficiently support multiple data owners and requesters. The SAMA scheme integrates multi-key partial homomorphic encryption with ciphertext-policy ABE to ensure robust data confidentiality, user-centric access control, and streamlined data processing tailored for wearable technology. Yin et al. [26] indicated the paucity of research into the privacy implications associated with user identity during the key generation phase. To address this gap, they proposed a decentralized ciphertext-policy ABE scheme, specifically designed to bolster the secure dissemination of sensitive healthcare information within blockchain-enabled healthcare systems. Leveraging Shamir’s threshold secret sharing, their scheme distributed the master key across all attribute nodes in the blockchain, thereby augmenting the robustness and enhancing the system’s resilience to adversarial attacks.
While existing research in medical data sharing has made significant progress in areas such as security, efficiency, and privacy, one crucial aspect requires more focused attention. Given the highly sensitive nature of medical data, the potential for information leakage through attributes and data-related information in access control-based systems, as observed in the aforementioned studies, poses significant privacy concerns. This issue impedes the development of secure data sharing practices, subsequently restricting thorough data analysis and collaborative efforts [27,28]. To enhance cooperation among various data owners and advance medical research, it is essential to prioritize the protection of each entity’s data privacy. This entails limiting the information disclosed to only what is necessary during the process of sharing medical data. Hence, we utilize key aggregate encryption and private set intersection to protect data confidentiality and prevent any inadvertent exposure of sensitive information, and we verify the legitimacy of entities through mutual authentication. This process protects data at all stages, including data upload, storage, transmission, and access, and ensures a secure data sharing environment.
3. Preliminaries
This section briefly introduces the foundational mathematical and technical principles to aid in comprehending the contents of this document.
3.1. Elliptic Curve Cryptography
Elliptic curve cryptography (ECC) is a public-key cryptosystem that exploits the mathematical properties of elliptic curves over finite fields [29]. An elliptic curve over a finite field $\mathbb{F}_p$ is defined as $E_p(a,b): y^2 = x^3 + ax + b \pmod{p}$, where $p$ is a large prime and $a, b \in \mathbb{F}_p$, ensuring that the discriminant $4a^3 + 27b^2 \neq 0 \pmod{p}$. The additive group $\mathbb{G} = \{(x,y) : (x,y) \in E_p(a,b)\} \cup \{\mathcal{O}\}$ is defined, where $\mathcal{O}$ symbolizes the point at infinity, serving as the identity element of $\mathbb{G}$. Scalar multiplication is defined by repeated addition as $kP = P + P + \cdots + P$ ($k$ times), with a base point $P \in \mathbb{G}$ and an integer $k \in \mathbb{Z}_p^*$. The hardness assumptions underlying the security of ECC are as follows.
- Elliptic curve discrete logarithm problem (ECDLP): Given two points $P$ and $Q$ on $E_p(a,b)$, determining the scalar $k$ such that $Q = kP$ is considered computationally difficult.
- Elliptic curve computational Diffie–Hellman problem (ECCDHP): Given two points $aP$ and $bP$, it is hard to calculate $abP$.
- Elliptic curve decisional Diffie–Hellman problem (ECDDHP): Given three points $aP$, $bP$, and $cP$, it is difficult to determine whether $cP = abP$, where $a, b, c \in \mathbb{Z}_p^*$.
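To make the group operations above concrete, the following is a minimal sketch of point addition and double-and-add scalar multiplication on a toy curve. The curve parameters, the base point, and all function names are illustrative assumptions (they do not come from the scheme), and a 7-bit prime offers no security; the sketch only shows why both parties of a Diffie–Hellman exchange arrive at the same point.

```python
# Toy curve y^2 = x^3 + 2x + 3 over F_97 (assumed parameters, illustration only).
P_MOD, A, B = 97, 2, 3
INF = None  # the point at infinity, identity element of the group


def is_on_curve(pt):
    """Check that pt satisfies the curve equation (INF counts as on the curve)."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def add(p1, p2):
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # p2 is the inverse of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, pt):
    """Compute kP by double-and-add (k is a non-negative integer)."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


G = (3, 6)  # base point: 6^2 = 36 = 3^3 + 2*3 + 3
a, b = 13, 29  # private scalars of two hypothetical parties
shared_1 = scalar_mult(a, scalar_mult(b, G))
shared_2 = scalar_mult(b, scalar_mult(a, G))
print(shared_1 == shared_2)
```

Recovering `a` from `scalar_mult(a, G)` is the ECDLP; on a curve this small it can be brute-forced, which is why deployed systems use primes of roughly 256 bits.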
3.2. Bilinear Pairing
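Let $\mathbb{G}_1$ be an additive group, consisting of points on an elliptic curve $E$ defined over a field $\mathbb{F}_p$, having order $n$ and identity element $\mathcal{O}$. Let $\mathbb{G}_T$ be a multiplicative group of the same order. A bilinear pairing $e: \mathbb{G}_1 \times \mathbb{G}_1 \rightarrow \mathbb{G}_T$ satisfies the following conditions.
- Bilinearity: For $P, Q \in \mathbb{G}_1$ and $a, b \in \mathbb{Z}_n^*$, $e(aP, bQ) = e(P, Q)^{ab}$.
- Non-degeneracy: $e(P, Q) \neq 1$ for some $P, Q \in \mathbb{G}_1$.
- Efficiency: $e(P, Q)$ can be calculated in polynomial time for $P, Q \in \mathbb{G}_1$.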
3.3. Decisional Bilinear Diffie–Hellman (DBDH) Assumption
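This assumption states that a probabilistic polynomial-time adversary $\mathcal{A}$, given $(P, aP, bP, cP)$ for a generator $P \in \mathbb{G}_1$ and random $a, b, c \in \mathbb{Z}_n^*$, lacks the ability to differentiate between $e(P, P)^{abc}$ and a random element $Z \in \mathbb{G}_T$. Consequently, we can express $\mathcal{A}$’s advantage as follows.
$$Adv_{\mathcal{A}}^{DBDH} = \left| \Pr[\mathcal{A}(P, aP, bP, cP, e(P,P)^{abc}) = 1] - \Pr[\mathcal{A}(P, aP, bP, cP, Z) = 1] \right|$$
The DBDH assumption holds if $\mathcal{A}$ cannot distinguish $Z$, i.e., whether $Z = e(P, P)^{abc}$ or $Z$ is random, with a non-negligible advantage.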
3.4. Key Aggregate Encryption
Key aggregate encryption (KAE) is an access control cryptosystem that streamlines data decryption by allowing a collection of data encrypted under multiple keys to be decrypted with a single constant-size aggregate key [7]. This aggregate key, though as compact as a single secret key, combines the capabilities of many such keys, granting decryption authority for any chosen subset of ciphertext classes. In contrast to traditional systems that require a distinct key for each ciphertext, KAE uses a single aggregate key, reducing complexity and cost. This approach not only diminishes the key management overhead but also improves efficiency in data sharing. However, most KAE schemes rely on bilinear pairing operations, incurring significant computational overhead [30]. Such methods are particularly inefficient in scenarios involving the sharing, processing, and transmission of large datasets. In response, an alternative approach called ECC-based KAE was introduced. This method optimizes resource utilization by leveraging the small key size of ECC, ensuring robust security while facilitating efficient data transmission and processing. Consequently, the proposed system adopts the ECC-based KAE method, with the operational process outlined as follows.
- (1) KAE.Setup (): Generate a random number , compute for , and publish . Then, discard
- (2) KAE.KeyGen (): Generate and compute . Then, output public and private key pair .
- (3) KAE.Encrypt (): For data in , choose a random number and compute , , . Then, output .
- (4) KAE.Extract (): For the subset of data class indices S, output the aggregate key .
- (5) KAE.Decrypt (): If , output ⊥. Otherwise, calculate , , and output .
3.5. Brakerski–Gentry–Vaikuntanathan
Brakerski–Gentry–Vaikuntanathan (BGV) [31] is a fully homomorphic encryption (FHE) scheme that enables arithmetic operations on encrypted data. Computations are carried out directly on ciphertexts without decrypting them, producing an encrypted output that, when decrypted, yields the same result as the same operations performed on the plaintexts. Each homomorphic operation increases the noise in a ciphertext; BGV manages this noise, and with bootstrapping it supports an unlimited number of additions and multiplications while preserving decryption correctness. Below is a description of the BGV algorithm.
- (1) BGV.Setup (): Select a ring , where l is a power of 2. Given a security parameter , set the ciphertext modulus q, plaintext modulus t, and the noise distribution . Output .
- (2) BGV.KeyGen (): Generate the secret key , a random polynomial , and a random error polynomial . Calculate the public key .
- (3) BGV.Enc (): Generate , random polynomial u, and compute .
- (4) BGV.Dec (): Calculate using s.
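The four algorithms above can be illustrated with a toy BGV-style scheme over the ring $\mathbb{Z}_q[x]/(x^N + 1)$. This is a minimal sketch under stated assumptions: the parameter values, the sign convention of the public key, the ternary noise distribution, and every function name are chosen for illustration and are not taken from the scheme's specification; the parameters are far too small to be secure, and only homomorphic addition is shown.

```python
import random

# Toy parameters (assumed): ring degree, ciphertext modulus, plaintext modulus.
N, Q, T = 8, 1 << 20, 17
rng = random.Random(2024)  # NOT a cryptographic RNG; demonstration only


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_mul(a, b):
    """Multiply in Z_Q[x]/(x^N + 1): x^N wraps around with a sign flip."""
    res = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                res[k] = (res[k] + a[i] * b[j]) % Q
            else:
                res[k - N] = (res[k - N] - a[i] * b[j]) % Q
    return res


def small():
    """Sample a polynomial with coefficients in {-1, 0, 1} (toy noise)."""
    return [rng.choice((-1, 0, 1)) for _ in range(N)]


def keygen():
    s, e = small(), small()
    a = [rng.randrange(Q) for _ in range(N)]
    # Public key b = -(a*s + T*e), so that b + a*s = -T*e (small, multiple of T).
    b = [(-(x + T * y)) % Q for x, y in zip(poly_mul(a, s), e)]
    return (b, a), s


def encrypt(pk, m):
    b, a = pk
    u, e0, e1 = small(), small(), small()
    c0 = [(x + T * y + z) % Q for x, y, z in zip(poly_mul(b, u), e0, m)]
    c1 = [(x + T * y) % Q for x, y in zip(poly_mul(a, u), e1)]
    return c0, c1


def decrypt(s, ct):
    c0, c1 = ct
    v = poly_add(c0, poly_mul(c1, s))            # equals m + T*noise mod Q
    centered = [x - Q if x > Q // 2 else x for x in v]
    return [x % T for x in centered]             # noise term vanishes mod T


def add_ct(ct1, ct2):
    """Homomorphic addition: add ciphertexts component-wise."""
    return poly_add(ct1[0], ct2[0]), poly_add(ct1[1], ct2[1])


pk, sk = keygen()
m1 = [1, 2, 3, 4, 5, 6, 7, 8]
m2 = [16, 0, 5, 9, 2, 11, 3, 1]
ct_sum = add_ct(encrypt(pk, m1), encrypt(pk, m2))
print(decrypt(sk, ct_sum) == [(x + y) % T for x, y in zip(m1, m2)])
```

Decryption works because the noise term stays well below `Q / 2`; homomorphic multiplication grows the noise much faster, which is what modulus switching and bootstrapping are designed to control in a full implementation.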
3.6. Private Set Intersection
Private set intersection (PSI) is a cryptographic protocol designed to identify common elements between sets held by two or more parties, such as individuals or organizations, without revealing any underlying data. This functionality enables the parties to determine overlapping information while preserving the confidentiality of their respective datasets. In the typical PSI scenario discussed in this paper, the sender and receiver have sets X and Y with sizes and , respectively. Upon the receiver’s request for the intersection, the sender computes it and transmits the result. The receiver leverages their private key to decrypt the intersected set. The detailed structure of the PSI employed in this study is as follows.
- (1) PSI.Setup (): Sender and receiver each generate a public–private key pair using the BGV.KeyGen procedure.
- (2) PSI.Enc (): Receiver encrypts each element using BGV.Enc and transmits the ciphertext to the sender.
- (3) PSI.Intersection (): Sender chooses a random number for , and computes . Then, the sender returns to the receiver.
- (4) PSI.Ext (): Receiver computes using BGV secret key s, and obtains where BGV.Dec().
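The intersection step can be sketched with a common homomorphic-encryption-based PSI pattern: for each receiver element y, the sender evaluates a randomly masked product of (y − x) over its own set, which is zero exactly when y is in the sender's set. The sketch below assumes this is the pattern intended by PSI.Intersection and PSI.Ext, and it deliberately runs on plaintext values modulo a prime so the logic is visible; in the protocol the sender would perform the same arithmetic on BGV ciphertexts and never see y. All names and the modulus are hypothetical.

```python
import hashlib
import secrets

T = 2_147_483_647  # prime plaintext modulus (2^31 - 1), an assumed toy choice


def encode(keyword):
    """Map a keyword to an integer in [0, T) (illustrative encoding)."""
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % T


def sender_evaluate(receiver_items, sender_set):
    """For each receiver item y, return r * prod_{x in X} (y - x) mod T.

    In the real protocol y arrives encrypted and this product is computed
    homomorphically; here it is computed in the clear for illustration.
    """
    masked = []
    for y in receiver_items:
        r = 1 + secrets.randbelow(T - 1)  # fresh nonzero mask per item
        prod = 1
        for x in sender_set:
            prod = prod * (y - x) % T
        masked.append(r * prod % T)
    return masked


def receiver_extract(receiver_items, masked):
    """Keep the items whose masked value is zero (after decryption in the real protocol)."""
    return {y for y, d in zip(receiver_items, masked) if d == 0}


owner_keywords = {encode(k) for k in ("diabetes", "asthma", "oncology")}
user_keywords = [encode(k) for k in ("asthma", "cardiology", "oncology")]
common = receiver_extract(user_keywords, sender_evaluate(user_keywords, owner_keywords))
print(len(common))
```

Because T is prime and the mask is nonzero, a result of zero occurs only for a genuine match, while a non-matching item yields a value that is uniformly distributed over the nonzero residues and therefore reveals nothing about the sender's other elements.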
4. System Models
In this section, we introduce the models proposed in our study: the network model, the adversary model, and the security model.
4.1. Network Model
The proposed system consists of four entities: a trusted authority (), a data owner (), a data user (), and a cloud server (). The system architecture is illustrated in Figure 1, and a detailed description of each entity is given as follows.
Figure 1. Network model of the proposed scheme.
- : is a trusted authority that initiates the system by generating parameters for data sharing. undertakes the task of registering both and , issuing them with the necessary credentials.
- Data owner (): is a hospital, clinic, or research institution. encrypts medical data and sends them to . When requests a common keyword identification query, computes an intersection set result, decryptable only by after legitimacy verification. also provides the aggregate key and relevant data class set upon ’s data access request.
- Data user (): is a doctor, nurse, researcher, patient, etc., within a medical institution. To access data, initiates a common keyword identification query. After receiving the results, requests access to data related to the matched keyword results and then uses the aggregate key to decrypt the data obtained from CS.
- Cloud server (): is an entity that stores the medical data and returns the data search results. When data are uploaded by , stores the data if has the necessary legal permissions. facilitates data access to following a verification process to ascertain the legal status of .
The communication flows of the proposed model are summarized as follows.
- (1) initializes the system parameters for authentication, intersection calculation, and data sharing.
- (2) registers and , storing the identity information to prevent duplicate registrations. Then, issues credentials for secure data sharing through authentication.
- (3) encrypts the medical data and uploads them to . then verifies the ’s legitimacy prior to storing the data.
- (4) submits the common keyword identification query for owned information. generates and transmits the encrypted intersection results after confirming ’s legitimacy with . verifies the received message and stores the intersection.
- (5) transmits a query for data access permission to using the intersection results. Then, generates and sends the aggregate key and corresponding data class set based on the intersected keywords.
- (6) requests the data from using a data class set and decrypts them using the aggregate key obtained from .
4.2. Adversary Model
We adopt the Dolev–Yao (DY) model, which is widely used to evaluate the security of protocols, to assess the security of the proposed scheme [32]. The DY model assumes that an adversary has the ability to intercept all communications on a network, read and modify intercepted messages, and create and transmit new messages. These capabilities allow the adversary to carry out a range of attacks, such as impersonation, replay, and man-in-the-middle attacks. By analyzing the proposed scheme using the DY model, our objective is to assess its effectiveness in preventing unauthorized access, data tampering, and other malicious activities by potential adversaries.
4.3. Security Model
Aligned with the adversary model outlined in Section 4.2, the proposed scheme is designed to uphold stringent data privacy standards. Given the sensitive nature of medical data, a breach could have serious consequences. Hence, protecting the confidentiality of ’s information is critical to prevent unauthorized access and the subsequent leakage of sensitive data. To ensure robust data privacy, it is imperative that the ciphertext remains impervious to unauthorized decryption attempts, thereby preventing the exposure of plaintext information. In order to rigorously assess and validate the efficacy of our approach in preserving data privacy, we adopt the IND-CPA model. In this paper, we introduce the IND-CPA model game for evaluating the security posture of the proposed scheme.
Definition 1 (Data privacy). In our proposed scheme, we establish semantic security for data privacy using the IND-CPA model. The advantage of adversary is quantified by . The scheme achieves security against IND-CPA if, across all potential attacks, the inequality is upheld, where ε represents a negligibly small probability.
- Init. selects a specific set from the available set , which it aims to exploit.
- Setup. The simulator provides the system parameters to .
- Phase 1. For , submits an aggregate key request query to . Subsequently, generates and transmits the aggregate key to .
- Challenge. selects two plaintexts, and , of equal length from a set of possible plaintexts associated with class . These plaintexts are then forwarded to . Thereafter, obtains a random bit via a coin flip. Following this, encrypts the selected plaintext and transmits the resulting ciphertext to .
- Phase 2. iterates through Phase 1 for , encompassing classes that do not belong to .
- Guess. produces an estimate of the true value of ϰ and communicates it to . If the estimate aligns with the true value ϰ, is deemed successful in the game.
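For reference, the success condition of the game above is conventionally written as a bound on how far the adversary's guessing probability departs from one half. The expression below is the standard textbook form and is given only as a sketch; the symbols $\mathcal{A}$ for the adversary and $\varkappa'$ for its guess are assumptions about notation rather than a quotation of Definition 1.

```latex
\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathcal{A}}
  = \left| \Pr\left[ \varkappa' = \varkappa \right] - \frac{1}{2} \right|
  \le \varepsilon
```

An adversary that guesses blindly succeeds with probability 1/2 and thus has advantage 0, so the bound says that no efficient adversary does noticeably better than a coin flip.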
5. Proposed Scheme
The proposed scheme encompasses six distinct phases: setup, registration, data upload, common keyword identification, aggregate key issuance, and data request and download. Figure 2 is the flowchart of the proposed scheme. During the setup phase, initializes the system parameters. In the registration phase, registers and , providing them with the necessary credentials for data sharing. In the data upload phase, uploads the encrypted data, which are then stored by following verification of ’s legitimacy. The common keyword identification phase involves communicating with to acquire matching keywords. During the aggregate key issuance phase, issues an aggregate key along with the corresponding dataset based on the set of keywords requested by . Finally, in the data request and download phase, can request and retrieve the data from . Table 1 provides the notation utilized throughout the proposed scheme.
Figure 2. The overall flowchart of the proposed scheme.
Table 1. Notation.
5.1. Setup Phase
sets a security parameter and chooses ciphertext modulus q, plaintext modulus t, noise distribution , and . also generates the bilinear parameters and chooses a generator and . Then, computes for . generates a hash function and publishes .
5.2. Registration Phase
conducts the registration of both and , issuing the necessary credentials. The registration procedure is performed in a secure channel, and we present this phase only for , since the registration process is identical for both and .
- Step 1: selects and sends to .
- Step 2: checks whether is registered by computing . generates , , , and and computes , , . Then, stores , and sends to .
- Step 3: stores securely.
5.3. Data Upload Phase
For data security, computes the authentication message and encrypted data using the random nonce , credentials , and for data . Upon the message being received, stores the encrypted data after verifying ’s legal registration with . The data upload process is illustrated in Figure 3, with detailed steps outlined below.
Figure 3. Data upload phase.
- Step 1: generates a random nonce , and computes , , . also computes , , , , for document . Then, sends to .
- Step 2: Upon receiving the uploaded message, computes and checks whether is equal to . If it is correct, computes and stores .
5.4. Common Keyword Identification Phase
initiates a request to obtain common keywords related to its own data. sends the for Y with an authentication value. After receiving the query, verifies the legitimacy of through using and transmits the encrypted intersection results . Subsequently, extracts the common keywords from the intersection set. Figure 4 illustrates the common keyword identification procedure, providing a detailed overview of each step.
Figure 4. Common keyword identification phase.
- Step 1: generates , , and computes , , , . Then, sends .
- Step 2: After receiving the message, checks and by computing . If it is correct, generates and and computes , , . also generates for and computes . Then, transmits .
- Step 3: checks and computes . If is equal to , computes using , and inputs y in where .
5.5. Aggregate Key Issuance Phase
To obtain the data access permission about intersection , sends the aggregate key request message to . After confirming the validity of the , provides an accessible dataset S with an aggregate key . Figure 5 depicts the process for aggregate key issuance, outlining the steps in detail below.
Figure 5. Aggregate key issuance phase.
- Step 1: generates , and computes , , , . Then, sends .
- Step 2: checks , and computes , . If , generates , and computes , , , , . Then, transmits .
- Step 3: checks and computes for checking . If accurate, computes .
5.6. Data Request and Download Phase
requests data from corresponding to the dataset S received from , and transmits the matched results . then uses the aggregate key to decrypt the received data and obtain the document . This phase is delineated in Figure 6, elucidating each sequential step below.
Figure 6. Data request and download phase.
- Step 1: generates , , computes , , , and sends .
- Step 2: According to the received message, checks and computes , . For S, generates and computes , , . Then, sends .
- Step 3: checks and obtains data by computing . To verify the data, checks whether is equal to .
- Correctness:
6. Security Analysis
We conduct a comprehensive security analysis to demonstrate the resilience of our proposed scheme. Our assessment ranges from informal attack scenarios to formal analysis. In the informal security analysis, we evaluate whether the proposed scheme meets essential security requirements, including resilience against impersonation, replay, and denial-of-service attacks, as well as mutual authentication and data privacy. For the formal analysis, we use IND-CPA to verify the robustness of data privacy protections. We employ BAN logic to confirm that mutual authentication is guaranteed and utilize the Scyther tool to validate the security of the proposed scheme against potential vulnerabilities, focusing on common keyword identification and aggregate key issuance.
6.1. Informal Security Analysis
We evaluate the robustness of the proposed scheme against various threats that can occur in a medical data sharing environment. We also verify that mutual authentication between entities is provided during communication. We assume that an adversary attempts security breaches under the assumptions described in Section 4.2.
6.1.1. Impersonation Attack
endeavors to impersonate in an effort to intercept the transmitted messages between , , and , aiming to obtain sensitive data. initiates the transmission of a data request message, denoted as , to as outlined in Section 5.6. However, cannot compute the message without the dataset S. also endeavors to extract data from an intercepted message , but it is impossible without an aggregate key . In an attempt to acquire the and S of , endeavors to transmit in Section 5.5. However, faces insurmountable barriers as it lacks crucial information including ’s secret key , a random nonce , the common keyword set , and the identity of the data owner . Even if tries to obtain and S from , it is impossible because needs . Furthermore, attempts to access and for the desired data detailed in Section 5.4 are futile due to ’s lack of knowledge about and . Consequently, the security of the proposed system against impersonation attacks is affirmed.
6.1.2. Replay and Man-in-the-Middle (MITM) Attack
tries to resend the common keyword confirmation message , data access permission request message , and data request message with the purpose of obtaining data. However, these messages contain timestamps , and random nonces , and each entity that receives a message checks its freshness. Even if retransmits a previous message, the receiving entity can identify it as a replayed message and reject it. also intercepts and attempts to modify the messages, but this is impossible without knowledge of . Hence, our scheme resists replay and MITM attacks.
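The freshness check described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `FreshnessChecker`, the 5-second acceptance window, and the in-memory nonce cache are all assumptions introduced here.

```python
import time

MAX_SKEW_SECONDS = 5.0  # hypothetical acceptance window, not taken from the paper


class FreshnessChecker:
    """Rejects messages that are stale or whose nonce was already seen."""

    def __init__(self, max_skew=MAX_SKEW_SECONDS):
        self.max_skew = max_skew
        self._seen = {}  # nonce -> timestamp of the message that carried it

    def accept(self, timestamp, nonce, now=None):
        now = time.time() if now is None else now
        # Forget nonces whose timestamps have left the window; a replay of
        # such a message would fail the timestamp test below anyway.
        self._seen = {n: t for n, t in self._seen.items()
                      if abs(now - t) <= self.max_skew}
        if abs(now - timestamp) > self.max_skew:
            return False  # stale (or far-future) timestamp
        if nonce in self._seen:
            return False  # replay inside the acceptance window
        self._seen[nonce] = timestamp
        return True
```

Pairing the timestamp window with a nonce cache keeps the cache small: only nonces from messages that could still pass the timestamp test need to be remembered.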
6.1.3. Denial-of-Service (DoS) Attack
seeks to disrupt availability by inundating with an overwhelming volume of messages, thereby overloading its capacity or halting data sharing services altogether. During such an attack, repeatedly transmits data upload messages and data request messages to . However, mitigates this threat by checking the timestamps of incoming messages and promptly discarding any deemed invalid. Consequently, the proposed scheme defends against DoS attacks and maintains service availability.
6.1.4. Mutual Authentication
In the proposed scheme, legitimacy is verified between communicating entities to ensure secure medical data sharing. In Section 5.4, when requests the common keyword results from through , upon receiving the message, checks whether has been legitimately registered in via . If this verification is successful, sends along with the common keyword identification function , which can be decrypted by . then verifies the correctness of the message sent by the , who has legitimately registered with , via . Mutual authentication is performed in the same way in the other phases. Therefore, the proposed scheme ensures mutual authentication.
6.1.5. Data Verification
Upon receiving the results of the data request query , proceeds with data verification. This involves computing and subsequently checking whether . This verification process ensures that the received data have not been tampered with, which adds a layer of security and reinforces the integrity of transmitted data. Therefore, the proposed scheme not only facilitates secure medical data sharing but also preserves data integrity, mitigating the risk of unauthorized modifications.
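A check of this kind can be sketched as recomputing a digest over the decrypted document and comparing it with the value received. This is a minimal sketch under assumptions: SHA-256 stands in for the scheme's hash function, and the names `digest` and `verify_document` are hypothetical.

```python
import hashlib
import hmac


def digest(document: bytes) -> bytes:
    # SHA-256 stands in for the scheme's hash function (an assumption).
    return hashlib.sha256(document).digest()


def verify_document(document: bytes, received_digest: bytes) -> bool:
    # Recompute the digest locally and compare the two values in constant time.
    return hmac.compare_digest(digest(document), received_digest)
```

A single flipped byte in the document changes the digest, so `verify_document` returns `False` for modified data.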
6.2. Semantic Security
In Theorem 1, we show that the proposed scheme provides IND-CPA security.
Theorem 1.
Given a probabilistic polynomial-time adversary with a non-negligible advantage ε, is capable of tackling the formidable assumption problem with a gain of .
Proof of Theorem 1.
Let be an entity capable of compromising the proposed scheme with an advantage of . In response, we introduce to engage in the DBDH game, achieving an advantage of . The challenger selects a generator and four random values . then randomly determines a value and shares it with . If , computes , resulting in the tuple . Otherwise, if , computes , resulting in the tuple .
Init. employs to produce a distinct subset from the existing set , which aims to focus on. Afterward, delivers this selected set to .
Setup. formulates the public parameters , where and . Then, disseminates these parameters to .
Phase 1. submits an query for , and responds to by calculating as .
Challenge. submits two plaintexts of equal length, denoted as and , along with to . randomly flips a coin to determine . If and , we set , then and is computed. Otherwise, if , then and . also calculates , , and sends to .
Phase 2. repeats Phase 1 to obtain within .
Guess. hypothesizes to guess . If , returns 0, indicating , and , with an advantage of , can practically obtain the ciphertext, resulting in a probability . If , returns 1, indicating , and receives an invalid ciphertext. Therefore, by correctly guessing , gains no significant advantage, and the probability of success in the game is . The probability of a successful game can be calculated as
Hence, the proposed scheme provides IND-CPA security. □
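As a hedged illustration of how a calculation of this kind usually proceeds, suppose the challenger's bit is written $\mu$ and the simulator's output $\mu'$ (both names are placeholders introduced here), that the adversary wins with probability $1/2 + \varepsilon$ when it receives a valid ciphertext, and that it can do no better than guessing, with probability $1/2$, when the ciphertext is invalid. These assumptions follow the usual shape of a DBDH reduction and are not a formula confirmed by the text above. Under them, the overall success probability works out as:

```latex
\begin{align*}
\Pr[\mu' = \mu]
  &= \Pr[\mu = 0]\,\Pr[\mu' = \mu \mid \mu = 0]
   + \Pr[\mu = 1]\,\Pr[\mu' = \mu \mid \mu = 1] \\
  &= \tfrac{1}{2}\left(\tfrac{1}{2} + \varepsilon\right)
   + \tfrac{1}{2}\cdot\tfrac{1}{2} \\
  &= \tfrac{1}{2} + \tfrac{\varepsilon}{2}.
\end{align*}
```

In this sketch, the simulator's advantage over random guessing is $\varepsilon/2$, which is non-negligible whenever $\varepsilon$ is.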
6.3. Formal Security Analysis Using BAN Logic
In the proposed scheme, and perform mutual authentication in Section 5.4 to prove that they are entities correctly registered with the TA before performing an intersection. To demonstrate the mutual authentication of our scheme, we utilize a widely recognized formal verification technique called BAN logic [13]. Many researchers have verified the mutual authentication of their approaches using BAN logic [33,34]. To analyze our approach with BAN logic, we provide the following notations and descriptions. Table 2 lists the notation used in BAN logic.
Table 2. BAN logic notation.
6.3.1. Rules
The rules employed for analyzing the security scheme in BAN logic are outlined below.
- Message meaning rule (MMR):
- Freshness rule (FR):
- Nonce verification rule (NVR):
- Jurisdiction rule (JR):
- Belief rule (BR):
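For reference, these five rules are commonly stated in the BAN logic literature in the forms below. The principals $P$ and $Q$, statements $X$ and $Y$, and key $K$ are generic placeholders, and the symbols follow the usual BAN conventions ($\mid\equiv$ believes, $\triangleleft$ sees, $\mid\sim$ once said, $\Rightarrow$ controls, $\#$ fresh), which may differ in detail from the notation of Table 2.

```latex
\begin{align*}
\text{MMR:}\quad & \frac{P \mid\equiv P \stackrel{K}{\leftrightarrow} Q,\quad P \triangleleft \{X\}_K}
                        {P \mid\equiv Q \mid\sim X} \\[4pt]
\text{FR:}\quad  & \frac{P \mid\equiv \#(X)}
                        {P \mid\equiv \#(X, Y)} \\[4pt]
\text{NVR:}\quad & \frac{P \mid\equiv \#(X),\quad P \mid\equiv Q \mid\sim X}
                        {P \mid\equiv Q \mid\equiv X} \\[4pt]
\text{JR:}\quad  & \frac{P \mid\equiv Q \Rightarrow X,\quad P \mid\equiv Q \mid\equiv X}
                        {P \mid\equiv X} \\[4pt]
\text{BR:}\quad  & \frac{P \mid\equiv (X, Y)}
                        {P \mid\equiv X}
\end{align*}
```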
6.3.2. Goals
The goals for checking the adequacy of the authentication properties of the proposed scheme are defined as follows.
- Goal 1:
- Goal 2:
- Goal 3:
- Goal 4:
6.3.3. Assumptions
The assumptions driving the analysis are presented as follows.
- :
- :
- :
- :
- :
- :
6.3.4. Idealized Forms
The idealized forms for messages exchanged among communication entities are outlined below.
- :
- :
6.3.5. Proof
In accordance with the provided rules, idealized forms, and assumptions, the analytical process aimed at achieving the goals of the proposed scheme is outlined as follows.
- Step 1:
- can be obtained from .
- Step 2:
- can be obtained by applying the MMR with .
- Step 3:
- can be obtained by applying the FR with and .
- Step 4:
- can be obtained by applying the NVR with and .
- Step 5:
- can be obtained by applying the BR with .
- Step 6:
- can be obtained by applying the JR with and .
- Step 7:
- can be obtained from .
- Step 8:
- can be obtained by applying the MMR with .
- Step 9:
- can be obtained by applying the FR with and .
- Step 10:
- can be obtained by applying the NVR with and .
- Step 11:
- can be obtained by applying the BR with .
- Step 12:
- can be obtained by applying the JR with and .
Therefore, all goals are accomplished, and the proposed scheme delivers mutual authentication.
6.4. Scyther Tool
We utilize the Scyther tool for the formal security analysis of the proposed scheme. Scyther is a push-button tool designed for the verification and analysis of security protocols [12]. It offers extensive verification capabilities, ensuring termination while verifying the correctness of the scheme across an unlimited number of sessions. Scyther also provides features for model checking and multi-protocol analysis, complemented by a Python-based graphical user interface. These functionalities help users identify and address security vulnerabilities within systems. Scyther delineates roles and events, representing message transmission and reception, based on the Security Protocol Description Language (SPDL). The Scyther command-line tool evaluates the security of a proposed protocol by scrutinizing the various claim events described in Table 3. Upon completion of the simulation, the result window reports the outcome for each claim. A status of “OK” in the “Status” tab, along with “No attacks” in the “Comment” tab, indicates that no attack was found on the authentication process. Figure 7 presents the simulation result of the proposed scheme, showing the “OK” status and “No attacks” comments for all claim events. Therefore, the analysis confirms the robustness of the security measures implemented.
Table 3. Scyther tool claim events.
Figure 7. Scyther results.
7. Comparative Analysis
We perform an evaluative comparison regarding the security and efficiency metrics of our approach against pertinent existing frameworks.
7.1. Security Features
To ensure data confidentiality and privacy in a medical information system, it is imperative that only authorized data users should be granted access to the data, with no information related to data being disclosed. Achieving this necessitates the implementation of robust security measures to thwart unauthorized access attempts, secure message exchanges between entities, and grant data access only following thorough verification via mutual authentication. Moreover, it is crucial to maintain data integrity by verifying the authenticity of the information accessed by data users. In this context, we evaluate the security features of our proposed scheme against existing related schemes to determine its effectiveness in thwarting potential threats such as impersonation, replay, MITM, and DoS attacks. In addition, our evaluation focuses on verifying the robustness of mutual authentication, data integrity verification, and the prevention of data privacy leaks. Table 4 delineates the analysis results, comparing our proposed scheme with existing ones in terms of their capability to address the aforementioned security concerns. Based on our findings, existing studies lack robustness against DoS attacks, do not adequately consider MITM attacks, and lack essential features such as mutual authentication, data verification, or data privacy. In contrast, our proposed scheme meets the security requirements for secure data sharing within medical information systems.
Table 4. Security features.
7.2. Computational Costs
We measured the execution time of cryptographic operations on a personal computer (PC) using MIRACL [14], a software library designed to facilitate the practical implementation of cryptographic techniques and algorithms. The PC’s specifications are as follows: Ubuntu 20.04.6 LTS operating system, 16 GB of RAM, and an Intel Core i5-10400 processor operating at 2.90 GHz (64-bit CPU). To ensure the accuracy of the measurements, we averaged the duration of 100 iterations of each cryptographic operation; the results are reported in Table 5.
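The averaging procedure can be sketched as below. This is only an illustration of the measurement method, not the authors' benchmark: the measurements in the paper were taken with MIRACL, whereas this sketch uses Python with a SHA-256 hash as a stand-in workload, and the names `average_time_ms` and `sample_hash` are hypothetical.

```python
import hashlib
import time


def average_time_ms(operation, iterations=100):
    """Mean wall-clock time of `operation` in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


def sample_hash():
    # Stand-in workload: one SHA-256 over a 1 KiB buffer (an assumption).
    hashlib.sha256(b"\x00" * 1024).digest()


if __name__ == "__main__":
    print(f"hash: {average_time_ms(sample_hash):.6f} ms")
```

Timing the whole loop and dividing by the iteration count reduces the influence of timer resolution on very short operations.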
Table 5. Execution time of each cryptographic operation.
We analyzed the execution time required for the messages exchanged over the public channel. For a consistent comparison, we treat the number of keywords and attributes discussed in each paper as 1 and compare the computational cost as the data volume, denoted by , increases. The comparison results are laid out in Table 6. As depicted in Figure 8, our proposed scheme demonstrates the lowest execution times with increasing data volume. Within medical information systems, the seamless exchange of vast datasets between data owners and users is critical for driving research, advancing medical technologies, and enhancing service delivery. Therefore, our scheme is not only efficient but also well suited for real-world medical information systems.
Table 6. Execution time comparison.
Figure 8. Comparison of execution times as the number of data items increases [18,19,22,24].
7.3. Time Complexity Comparison
We conduct a comparative analysis of time complexity concerning computation and communication costs in relation to existing studies. Regarding computation costs, our analysis encompasses encryption, request, and verification. Encryption involves the process of encrypting data from the owner. Request refers to the process of a user desiring access to specific data, while verification entails confirming the accuracy and reliability of received encrypted data. Regarding communication costs, we define the access key as the decryption key, the request as the user’s data access query, and the ciphertext as the encrypted data received from the cloud server. As illustrated in Table 7, our comparison demonstrates significantly lower time complexity for both computation and communication compared to existing methods. Thus, our proposed approach offers enhanced efficiency and performance, rendering it more suitable for medical data sharing systems.
Table 7. Time complexity comparison.
8. Conclusions
We have proposed a secure and privacy-preserving data sharing scheme designed for medical information systems. This scheme leverages KAE to facilitate secure and flexible data sharing between data owners and users and incorporates PSI techniques to achieve a balance between data privacy and flexible sharing. The security of our proposed scheme was evaluated through both informal and formal security analyses. Through the use of BAN logic, we showed that the scheme supports mutual authentication, while semantic security (IND-CPA) was used to prove data privacy. Additionally, the robustness of our scheme was validated using the Scyther tool, confirming its resilience against potential security threats. Our assessment extended to a comparative analysis of the security properties, execution times, and complexities, contrasting our scheme with existing methodologies. This comparison highlighted the improved security and efficiency metrics of our scheme. In conclusion, the proposed data sharing scheme not only meets the stringent security and privacy requirements of medical information systems but also exhibits superior performance and flexibility. However, as our system employs homomorphic encryption in determining the intersection of private sets, there may be a computational burden on each entity. Hence, we intend to pursue future research aimed at identifying intersections using a lighter methodology. In addition, since the development of quantum computing technology may introduce advanced security risks, we will consider studies to improve the resilience of the proposed method to these threats.
Author Contributions
Conceptualization, J.O.; methodology, J.O. and S.S.; software, D.K.; validation, Y.P. (Yohan Park) and M.K.; formal analysis, J.O. and S.S.; investigation, J.O. and M.K.; writing—original draft preparation, J.O.; writing—review and editing, S.S. and Y.P. (Yohan Park); supervision, Y.P. (Youngho Park); funding acquisition, Y.P. (Youngho Park). All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the National Research Foundation of Korea (NRF) and funded by the Ministry of Education under grant number 2020R1I1A3058605.
Data Availability Statement
Data are contained within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Arunprasath, S.; Annamalai, S. Improving patient centric data retrieval and cyber security in healthcare: Privacy preserving solutions for a secure future. Multimed. Tools Appl. 2024, 1–31.
- Wang, T.; Wu, Q.; Chen, J.; Chen, F.; Xie, D.; Shen, H. Health data security sharing method based on hybrid blockchain. Future Gener. Comp. Syst. 2024, 153, 251–261.
- Zhang, J.; Yang, Y.; Liu, X.; Ma, J. An efficient blockchain-based hierarchical data sharing for Healthcare Internet of Things. IEEE Trans. Ind. Inform. 2022, 18, 7139–7150.
- Khan, M.A.; Alhakami, H.; Alhakami, W.; Shvetsov, A.V.; Ullah, I. A smart card-based two-factor mutual authentication scheme for efficient deployment of an IoT-based telecare medical information system. Sensors 2023, 23, 5419.
- Lee, J.; Oh, J.; Kwon, D.; Kim, M.; Kim, K.; Park, Y. Blockchain-enabled key aggregate searchable encryption scheme for personal health record sharing with multi-delegation. IEEE Internet Things J. 2024, 11, 17482–17494.
- Sahai, A.; Waters, B. Fuzzy identity-based encryption. In Proceedings of the Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, 22–26 May 2005; Volume 24, pp. 457–473.
- Chu, C.K.; Chow, S.S.; Tzeng, W.G.; Zhou, J.; Deng, R.H. Key-aggregate cryptosystem for scalable data sharing in cloud storage. IEEE Trans. Parallel Distrib. Syst. 2014, 25, 468–477.
- Yang, L.; Li, C.; Cheng, Y.; Yu, S.; Ma, J. Achieving privacy-preserving sensitive attributes for large universe based on private set intersection. Inf. Sci. 2022, 582, 529–546.
- Sucasas, V.; Mantas, G.; Papaioannou, M.; Rodriguez, J. Attribute-based pseudonymity for privacy-preserving authentication in cloud services. IEEE Trans. Cloud Comput. 2023, 11, 168–184.
- Wang, H.; Liang, J.; Ding, Y.; Tang, S.; Wang, Y. Ciphertext-policy attribute-based encryption supporting policy-hiding and cloud auditing in smart health. Comput. Stand. Interfaces 2023, 84, 103696.
- Oh, J.; Lee, J.; Kim, M.; Park, Y.; Park, K.; Noh, S. A secure data sharing based on key aggregate searchable encryption in fog-enabled IoT environment. IEEE Trans. Netw. Sci. Eng. 2022, 9, 4468–4481.
- Cremers, C.J. The Scyther Tool: Verification, Falsification, and Analysis of Security Protocols: Tool Paper. In Proceedings of the International Conference on Computer Aided Verification, Princeton, NJ, USA, 7–14 July 2008; pp. 414–418.
- Burrows, M.; Abadi, M.; Needham, R. A logic of authentication. ACM Trans. Comput. Syst. 1990, 8, 18–36.
- MIRACL Cryptographic SDK. Available online: https://github.com/miracl/MIRACL (accessed on 2 April 2024).
- Bao, Y.; Qiu, W.; Cheng, X. Secure and lightweight fine-grained searchable data sharing for IoT-oriented and cloud-assisted smart healthcare system. IEEE Internet Things J. 2022, 9, 2513–2526.
- Mamta; Gupta, B.B.; Lytras, M.D. Fog-enabled secure and efficient fine-grained searchable data sharing and management scheme for IoT-based healthcare systems. In IEEE Transactions on Engineering Management; IEEE: New York, NY, USA, 2022; pp. 1–13.
- Wang, Y.; Zhang, A.; Zhang, P.; Qu, Y.; Yu, S. Security-aware and privacy-preserving personal health record sharing using consortium blockchain. IEEE Internet Things J. 2022, 9, 12014–12028.
- Oh, J.; Lee, J.; Kim, M.; Park, Y.; Park, K.; Noh, S. A secure personal health record sharing system with key aggregate dynamic searchable encryption. Electronics 2022, 11, 3199.
- Trivedi, H.S.; Patel, S.J. Key-aggregate searchable encryption with multi-user authorization and keyword untraceability for distributed IoT healthcare systems. Trans. Emerg. Telecommun. Technol. 2023, 34, e4734.
- Xu, G.; Qi, C.; Dong, W.; Gong, L.; Liu, S.; Chen, S.; Liu, J.; Zheng, X. A privacy-preserving medical data sharing scheme based on blockchain. IEEE J. Biomed. Health Inform. 2023, 27, 698–709.
- Zhang, C.; Luo, X.; Fan, Q.; Wu, T.; Zhu, L. Enabling privacy-preserving multi-server collaborative search in smart healthcare. Future Gener. Comp. Syst. 2023, 143, 265–276.
- Zhang, Y.; Guo, F.; Susilo, W.; Yang, G. Balancing privacy and flexibility of cloud-based personal health records sharing system. IEEE Trans. Cloud Comput. 2023, 11, 2420–2430.
- Peng, G.; Zhang, A.; Lin, X. Patient-centric fine-grained access control for electronic medical record sharing with security via dual-blockchain. IEEE Trans. Netw. Sci. Eng. 2023, 10, 2908–3921.
- Zhang, K.; Zhang, Y.; Li, Y.; Liu, X.; Lu, L. A blockchain-based anonymous attribute-based searchable encryption scheme for data sharing. IEEE Internet Things J. 2024, 11, 1685–1697.
- Jastaniah, K.; Zhang, N.; Mustafa, M.A. Efficient user-centric privacy-friendly and flexible wearable data aggregation and sharing. In IEEE Transactions on Cloud Computing; IEEE: New York, NY, USA, 2024.
- Yin, H.; Zhao, Y.; Zhang, L.; Qiao, B.; Chen, W.; Wang, H. Attribute-based searchable encryption with decentralized key management for healthcare data sharing. J. Syst. Architect. 2024, 148, 103081.
- Lai, C.; Zhang, H.; Lu, R.; Zheng, D. Privacy-preserving medical data sharing scheme based on two-party cloud-assisted PSI. IEEE Internet Things J. 2024, 11, 15855–15868.
- Lax, G.; Nardone, R.; Russo, A. Enabling secure health information sharing among healthcare organizations by public blockchain. Multimed. Tools Appl. 2024, 1–17.
- Koblitz, N. Elliptic curve cryptosystems. Math. Comput. 1987, 48, 203–209.
- Patranabis, S.; Shrivastava, Y.; Mukhopadhyay, D. Dynamic key-aggregate cryptosystem on elliptic curves for online data sharing. In Progress in Cryptology, Proceedings of the INDOCRYPT 2015: 16th International Conference on Cryptology in India, Bangalore, India, 6–9 December 2015; Springer: Berlin/Heidelberg, Germany, 2015.
- Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (TOCT) 2014, 6, 13.
- Dolev, D.; Yao, A. On the security of public key protocols. IEEE Trans. Inf. Theory 1983, 29, 198–208.
- Son, S.; Lee, J.; Park, Y.; Park, Y.; Das, A.K. Design of blockchain-based lightweight V2I handover authentication protocol for VANET. IEEE Trans. Netw. Sci. Eng. 2022, 9, 1346–1358.
- Attir, A.; Naït-Abdesselam, F.; Faraoun, K.M. Lightweight anonymous and mutual authentication scheme for wireless body area networks. Comput. Netw. 2023, 224, 109625.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).