Article

Design of Secure and Privacy-Preserving Data Sharing Scheme Based on Key Aggregation and Private Set Intersection in Medical Information System

1 School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
2 School of Computer Engineering, Keimyung University, Daegu 42601, Republic of Korea
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(11), 1717; https://doi.org/10.3390/math12111717
Submission received: 3 May 2024 / Revised: 29 May 2024 / Accepted: 30 May 2024 / Published: 31 May 2024

Abstract

Medical data sharing is pivotal in enhancing accessibility and collaboration among healthcare providers, researchers, and institutions, ultimately leading to improved patient outcomes and more efficient healthcare delivery. However, due to the sensitive nature of medical information, ensuring both privacy and confidentiality is paramount. Access control-based data sharing methods have been explored to address these issues, but data privacy concerns remain. Therefore, this paper proposes a secure and privacy-preserving data sharing scheme that balances data confidentiality and privacy. By leveraging key aggregate encryption and private set intersection techniques, our scheme ensures secure data sharing while protecting against the exposure of sensitive data-related information. We conduct informal and formal security analyses, including Burrows–Abadi–Needham logic and Scyther, to demonstrate its resilience against potential adversarial attacks. We also measure the execution time of cryptographic operations using the multiprecision integer and rational arithmetic cryptographic library (MIRACL) and perform a comparative analysis with existing related schemes in terms of security, computational cost, and time complexity. Our findings indicate a high level of security and efficiency, showing that the proposed scheme provides a solution that protects data privacy while enabling secure and flexible sharing of medical data.

1. Introduction

With the rapid advancement of modern technologies and the increasing digitalization of the medical sector, there has been a significant surge in both the volume and diversity of medical data. This proliferation, especially within the realm of medical information systems such as electronic health records (EHRs) and health information exchange (HIE), promotes accessibility and data sharing among healthcare providers, researchers, and institutions, improving patient outcomes, accelerating healthcare discovery, and optimizing healthcare delivery [1,2]. Medical data can be used to identify patterns and correlations to advance the understanding, diagnosis, and treatment of disease. This can improve quality of care and patient satisfaction by personalizing the care process and effectively managing chronic conditions. Ultimately, medical data sharing is an essential element of modern healthcare and plays an important role in advancing medical information systems and improving people’s quality of life.
Despite the numerous advantages of medical data sharing, significant concerns remain, particularly in the realms of security and privacy. The sensitive nature of medical information means that without adequate protection, there is a substantial risk of severe privacy violations and data breaches [3,4,5]. Unauthorized access or the theft of medical data can not only infringe upon privacy rights but also undermine trust in the entire medical information system. Data breaches can disrupt healthcare operations, compromise the integrity of research efforts, and even jeopardize public health initiatives. Ensuring the security of medical data is not merely about privacy but is also a critical aspect of protecting the entire healthcare infrastructure. This underscores the crucial importance of balancing confidentiality and privacy in data sharing, prompting the exploration of advanced solutions that can enhance utility while ensuring data security. Hence, it is imperative to conduct research on secure sharing of medical data by restricting access to information.
To enhance medical data security, researchers have increasingly explored data sharing frameworks that leverage access control technologies, such as attribute-based encryption (ABE) [6] and key aggregate encryption (KAE) [7]. Based on specific properties for generating and decrypting ciphertext, these frameworks ensure that only authorized users can access data, allowing data owners to securely share sensitive data while maintaining strict control over access rights. While offering substantial flexibility in data and access rights management, they also raise serious concerns regarding data privacy, necessitating thorough examination from alternative perspectives. In access control-based systems, privacy vulnerabilities can arise either from revealing sensitive information through attribute values [8,9,10] or from exposing data-related information itself, because data users must identify the desired data before requesting an access key from the owner. Conversely, if data owners choose to withhold data-related information, data users must resort to inefficient and insecure practices, such as randomly querying a cloud server to determine the availability of certain data. This method is inherently flawed, as it neither guarantees efficiency nor meets the security requirements for handling sensitive medical data. Therefore, there is a need for further research and development in medical data sharing methodologies that enable secure data sharing between data owners and users while maintaining data privacy.
In this paper, we propose a design of a secure and privacy-preserving data sharing scheme for medical information systems. We integrate private set intersection (PSI) with KAE to achieve an equilibrium between data confidentiality and privacy. To ensure secure and adaptable access control over medical data, we leverage a single access key feature of KAE [11] while integrating PSI to alleviate potential privacy concerns. PSI enables both data owners and users to confirm the presence of common information in their respective private sets without revealing the information about them, allowing them to issue or request access keys only after verifying intersection. This mechanism markedly diminishes the necessity for data owners to divulge data-related information, effectively mitigating a primary privacy concern. By adopting this approach, we not only reduce the risk of information exposure but also fortify the overall data security framework within medical information systems and address the imperatives of data confidentiality and privacy preservation in medical data sharing. The key considerations of this paper are as follows.
  • We propose a privacy-preserving medical data sharing scheme. To maintain a balance between privacy and data sharing, we leverage PSI between the data owner and the data user before access requests. This facilitates interaction and data sharing while protecting sensitive information.
  • The proposed scheme ensures secure data sharing and access control through KAE. Since KAE enables secure and flexible access control with a single aggregate key, the integration of KAE in the proposed scheme enhances data security by reducing the risk of data breaches and unauthorized disclosures.
  • We perform security analysis using the Scyther tool [12] and mathematical analysis methods such as Burrows–Abadi–Needham (BAN) logic [13] and indistinguishability against the chosen plaintext attack (IND-CPA). In addition, we conduct performance analyses using the multiprecision integer and rational arithmetic cryptographic library (MIRACL) [14] and compare the obtained results with those of previous studies.
The remainder of this paper is organized as follows. The related works are presented in Section 2, and the preliminaries for the paper are given in Section 3. Section 4 presents the system model for the proposed scheme, including the network model, adversary model, and security model. Section 5 explains the proposed medical data sharing scheme. Informal and formal security analyses are performed in Section 6, and the comparative analysis is conducted in Section 7. Section 8 summarizes the conclusions of this paper.

2. Related Works

Considerable research has been conducted in the realm of medical data sharing with a notable focus on data security and privacy preservation. In 2022, Bao et al. [15] proposed a lightweight ABE scheme, specifically tailored for the Internet of Things (IoT) and supported by cloud technology, within smart healthcare systems. Their approach prioritizes both the efficiency required by resource-limited devices and the implementation of fine-grained access control, ensuring that data access aligns with user authorization levels. Mamta et al. [16] developed a secure and efficient fine-grained data sharing scheme for IoT-based healthcare systems. Their approach critiques existing models of fine-grained medical data sharing and leverages fog computing alongside ABE to significantly reduce the computational load on data users while simultaneously enhancing data confidentiality. Wang et al. [17] proposed a consortium blockchain-based scheme for personal health record (PHR) management and sharing that prioritizes both security and privacy. They emphasized the importance of allowing patients to customize access control to their PHR according to their individual preferences, ensuring that only authorized users have access. To achieve this, they integrated a modified ABE scheme with smart contracts, enabling functionalities for secure search, privacy preservation, and personalized access control. Oh et al. [18] introduced a patient-centric secure PHR sharing system, addressing data integrity, transparency, mutual authentication, etc. They acknowledged the common use of attribute-based searchable encryption (ABSE) in medical data sharing but identified key management challenges inherent in its implementation. To address this issue, they adopted the concept of key aggregate searchable encryption (KASE), presenting a key aggregate dynamic searchable encryption framework integrated with a linear secret sharing scheme.
In 2023, Trivedi and Patel [19] developed a KASE-based framework for sharing electronic health records in integrated healthcare systems on clouds. They identified limitations in existing schemes, particularly the lack of secure multi-user authorization and keyword untraceability. Their framework addresses these limitations by incorporating robust security features, including secure multi-user authorization and keyword untraceability. This approach not only enhances security but also demonstrates significant efficiency gains in storage, communication, and computation. Xu et al. [20] devised a privacy-enhanced medical data sharing framework that utilizes an authorization mechanism and ABE on the blockchain, aiming to address the challenges of fragmented healthcare systems, which can compromise treatment quality and lead to privacy breaches. Their approach empowers data owners with control over data access via ABE, complemented by an efficient authorization and revocation mechanism, which ensures that authorized doctors can access data while access for unauthorized individuals is swiftly revoked. Zhang et al. [21] developed a multi-server search scheme that facilitates collaborative operations among various healthcare entities for tasks such as diagnostic institution location, medical data retrieval, and cross-domain data exploration. Their scheme incorporates a secure data transfer method that enables servers from disparate organizations to perform joint computational tasks while preserving the confidentiality of each participant’s data and the privacy of their search identities. Zhang et al. [22] identified that while weighted ABE enhances the flexibility of access policies in medical data sharing, it poses challenges for data owners striving to maintain control over their privacy, particularly in collaborative e-health systems. Aiming to strike an optimal balance between privacy and flexibility, they proposed a cloud-based system for sharing personal health records. This system employs AND-weighted ABE, enabling users to access data only if they belong to specified organizations, thus reinforcing both security and selective accessibility. Peng et al. [23] proposed a patient-centric EMR sharing scheme, aiming to tackle the complex issues of privacy concerns and inter-agency distrust stemming from the misuse of access to medical records and the challenge of admitting unconscious patients. They designed an architecture for privacy-preserving medical data sharing, leveraging a dual-blockchain system and an identity-based tripartite authentication key agreement scheme, fostering trust between patients and healthcare institutions.
In 2024, Zhang et al. [24] pointed out that existing attribute-based searchable encryption schemes could expose sensitive information about data users and lead to data tampering and even untrusted results due to the delegation of complex search operations to a cloud server. In response, they proposed a blockchain-based anonymous ABSE scheme to enhance data sharing security. This approach conceals the attributes of the access policy, thereby safeguarding the confidentiality of the attributes that fulfil the access requirements. By integrating ABSE with blockchain technology, the scheme also incorporates features like tamper-proofing, integrity verification, and non-repudiation, significantly bolstering the trust and security of digital transactions. Jastaniah et al. [25] introduced the SAMA scheme, crafted to overcome the shortcomings of current methodologies in managing data aggregation and sharing for wearable devices. Their objective was to offer a scheme that is not only centered around the user and privacy-friendly but also flexible enough to efficiently support multiple data owners and requesters. The SAMA scheme integrates multi-key partial homomorphic encryption with ciphertext-policy ABE to ensure robust data confidentiality, user-centric access control, and streamlined data processing tailored for wearable technology. Yin et al. [26] indicated the paucity of research into the privacy implications associated with user identity during the key generation phase. To address this gap, they proposed a decentralized ciphertext-policy ABE scheme, specifically designed to bolster the secure dissemination of sensitive healthcare information within blockchain-enabled healthcare systems. Leveraging Shamir’s threshold secret sharing, their scheme distributed the master key across all attribute nodes in the blockchain, thereby augmenting the robustness and enhancing the system’s resilience to adversarial attacks.
While existing research in medical data sharing has made significant progress in areas such as security, efficiency, and privacy, one crucial aspect requires more focused attention. Given the highly sensitive nature of medical data, the potential for information leakage through attributes and data-related information in access control-based systems, as observed in the aforementioned studies, poses significant privacy concerns. This issue impedes the development of secure data sharing practices, subsequently restricting thorough data analysis and collaborative efforts [27,28]. To enhance cooperation among various data owners and advance medical research, it is essential to prioritize the protection of each entity’s data privacy. This entails limiting the information disclosed during the process of sharing medical data to only what is necessary. Hence, we utilize key aggregate encryption and private set intersection to protect data confidentiality and prevent any inadvertent exposure of sensitive information, and we verify the legitimacy of entities through mutual authentication. This process protects data at all stages, including data upload, storage, transmission, and access, and ensures a secure data sharing environment.

3. Preliminaries

This section briefly introduces the foundational mathematical and technical principles to aid in comprehending the contents of this document.

3.1. Elliptic Curve Cryptography

Elliptic curve cryptography (ECC) is a form of public key cryptography that exploits the mathematical properties of elliptic curves over finite fields [29]. An elliptic curve $E_p(a, b)$ over a finite field $\mathbb{Z}_p$ is defined as $E_p(a, b): y^2 \equiv x^3 + ax + b \pmod{p}$, where $p$ is a large prime and $x, y, a, b \in \mathbb{Z}_p$, with the discriminant satisfying $4a^3 + 27b^2 \not\equiv 0 \pmod{p}$. The additive group $G = \{(x, y) : x, y \in \mathbb{Z}_p, (x, y) \in E/\mathbb{Z}_p\} \cup \{O\}$ is defined, where $O$ denotes the point at infinity, serving as the identity element of $G$. Scalar multiplication is defined by repeated addition as $\alpha P = P + P + \cdots + P$ ($\alpha$ times), with a base point $P \in G$ and an integer $\alpha \in \mathbb{Z}_p^*$. The security of ECC rests on the following hardness assumptions.
  • Elliptic curve discrete logarithm problem (ECDLP): Given two points $P$ and $Q$ on $E_p(a, b)$, determining the scalar $\alpha \in \mathbb{Z}_p$ such that $Q = \alpha \cdot P$ is considered computationally difficult.
  • Elliptic curve computational Diffie–Hellman problem (ECCDHP): Given two points $\alpha \cdot P$ and $\beta \cdot P$, it is hard to calculate $\alpha \cdot \beta \cdot P$.
  • Elliptic curve decisional Diffie–Hellman problem (ECDDHP): Given three points $\alpha \cdot P$, $\beta \cdot P$, and $\gamma \cdot P$, it is difficult to determine whether $\gamma \cdot P = \alpha \cdot \beta \cdot P$, where $\alpha, \beta, \gamma \in \mathbb{Z}_p$.
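The group law and scalar multiplication described above can be illustrated on a textbook-sized curve. The following is a minimal sketch, assuming the toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{Z}_{17}$ with base point $(5, 1)$ of order 19; these parameters and all function names are illustrative choices that do not come from the paper, and the curve is far too small to be secure.

```python
# Toy curve y^2 = x^3 + 2x + 2 over Z_17 (illustrative only, NOT secure).
P_MOD, A, B = 17, 2, 2
O = None  # point at infinity, the identity element of the group

def add(p1, p2):
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is O:
        return p2
    if p2 is O:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O  # P + (-P) = O
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mul(k, point):
    """Compute k * point by double-and-add."""
    result, addend = O, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result

BASE = (5, 1)          # assumed base point P of order 19
alpha, beta = 7, 11    # hypothetical private scalars
shared_a = scalar_mul(alpha, scalar_mul(beta, BASE))
shared_b = scalar_mul(beta, scalar_mul(alpha, BASE))
```

Both parties arrive at the same point $\alpha \cdot \beta \cdot P$, which is the quantity the ECCDHP asks an eavesdropper to compute from $\alpha \cdot P$ and $\beta \cdot P$ alone. On a curve this small the ECDLP is trivially solvable by exhaustive search; the hardness assumptions only apply for large $p$.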

3.2. Bilinear Pairing

Let $G$ be an additive group, consisting of points on an elliptic curve $E$ defined over a field $F$, having order $n$ and identity element $O$. Let $G_T$ be a multiplicative group. A bilinear pairing $\hat{e}: G \times G \rightarrow G_T$ satisfies the following conditions.
  • Bilinearity: For $P, Q \in G$ and $a, b \in \mathbb{Z}_p^*$, $\hat{e}(aP, bQ) = \hat{e}(P, Q)^{ab}$.
  • Non-degeneracy: $\hat{e}(P, Q) \neq 1$ for some $P, Q \in G$.
  • Efficiency: $\hat{e}(P, Q)$ can be calculated in polynomial time for $P, Q \in G$.

3.3. Decisional Bilinear Diffie–Hellman (DBDH) Assumption

This assumption states that a probabilistic polynomial-time adversary $\mathcal{A}$ lacks the ability to differentiate between $(aP, bP, cP, \hat{e}(P, P)^{abc})$ and $(aP, bP, cP, \hat{e}(P, P)^{d})$. Consequently, we can express $\mathcal{A}$’s advantage $\varepsilon$ as follows.
$$\left| \Pr[\mathcal{A}(aP, bP, cP, \hat{e}(P, P)^{abc}) = 1] - \Pr[\mathcal{A}(aP, bP, cP, \hat{e}(P, P)^{d}) = 1] \right| \leq \varepsilon$$
The DBDH assumption holds if $\mathcal{A}$ cannot decide whether $\hat{e}(P, P)^{d} = \hat{e}(P, P)^{abc}$, i.e., whether $d = abc$ or $d \in \mathbb{Z}_q^*$ is random, with non-negligible advantage.

3.4. Key Aggregate Encryption

Key aggregate encryption (KAE) is an access control cryptosystem that streamlines data decryption by allowing a collection of data encrypted under multiple keys to be decrypted using a single constant-size aggregate key [7]. This aggregate key, though as compact as a single secret key, combines the capabilities of many such keys, granting decryption authority for any chosen subset of ciphertext classes. In contrast to traditional systems that require a distinct key for each ciphertext, KAE uses a single aggregate key, reducing complexity and cost. This approach not only diminishes the key management overhead but also improves efficiency in data sharing. However, most KAE schemes rely on bilinear pairing operations, incurring significant computational overhead [30]. Especially in scenarios involving data sharing, processing, and transmission of large datasets, such methods prove to be inefficient. In response, an alternative approach called ECC-based KAE was introduced. This method optimizes resource utilization by leveraging the small key size of ECC, ensuring robust security while facilitating efficient data transmission and processing. Consequently, the proposed system adopts the ECC-based KAE method, with the operational process outlined as follows.
(1) KAE.Setup($1^\lambda$, $n$): Generate a random number $\alpha \in \mathbb{Z}_p$, compute $P_i = \alpha^i P \in G$ for $i \in \{1, \ldots, n, n+2, \ldots, 2n\}$, and publish $param = \{P, n, \{P_i\}_{1 \leq i \leq 2n,\, i \neq n+1}\}$. Then, discard $\alpha$.
(2) KAE.KeyGen(): Generate $sk \in \mathbb{Z}_p$ and compute $pk = sk \cdot P$. Then, output the public and private key pair $(pk, sk) = (sk \cdot P, sk)$.
(3) KAE.Encrypt($param$, $pk$, $i$, $F$): For data $F_i \in G_T$ in class $i \in \{1, \ldots, n\}$, choose a random number $s \in \mathbb{Z}_p$ and compute $c_1 = s \cdot P$, $c_2 = s \cdot (pk + P_i)$, $c_3 = F_i \cdot e(P_1, P_n)^s$. Then, output $C = \{c_1, c_2, c_3\}$.
(4) KAE.Extract($param$, $sk$, $S$): For the subset $S$ of data class indices, output the aggregate key $AK = \sum_{j \in S} sk \cdot P_{n+1-j}$.
(5) KAE.Decrypt($param$, $AK$, $S$, $i$, $C$): If $i \notin S$, output ⊥. Otherwise, calculate $v_1 = \sum_{j \in S,\, j \neq i} P_{n+1-j+i}$ and $v_2 = \sum_{j \in S} P_{n+1-j}$, and output $F_i = c_3 \cdot \dfrac{e(AK + v_1, c_1)}{e(v_2, c_2)}$.

3.5. Brakerski–Gentry–Vaikuntanathan

Brakerski–Gentry–Vaikuntanathan (BGV) [31] is a fully homomorphic encryption (FHE) scheme that enables arithmetic operations on encrypted data. Computation proceeds without decrypting the inputs and produces an encrypted output that, when decrypted, yields the same result as the operations performed on the plaintext. The BGV scheme supports an unlimited number of additions and multiplications on encrypted data through a process known as bootstrapping, which manages the noise that accumulates during computation so that ciphertexts remain decryptable. The BGV algorithms are described below.
(1) BGV.Setup($1^\lambda$): Select a ring $R_q = \mathbb{Z}_q[X]/(X^l + 1)$, where $l$ is a power of 2. Given a security parameter $\lambda$, set the ciphertext modulus $q$, plaintext modulus $t$, and the noise distribution $\mathcal{X}$. Output $params = (R_q, l, q, t, \mathcal{X})$.
(2) BGV.KeyGen($params$): Generate the secret key $s \in \{-1, 0, 1\}^l$, a random polynomial $r \in R_q$, and a random error polynomial $e \leftarrow \mathcal{X}$. Calculate the public key $p = (p_0, p_1) = (-r \cdot s + t \cdot e, r)$.
(3) BGV.Enc($params$, $p$, $m$): Generate $e_0, e_1 \leftarrow \mathcal{X}$ and a random polynomial $u$, and compute $ct = (ct_0, ct_1) = (p_0 \cdot u + t e_0 + m, p_1 \cdot u + t e_1) = \llbracket m \rrbracket_p$.
(4) BGV.Dec($params$, $s$, $ct$): Calculate $m = [[ct_0 + ct_1 \cdot s]_q]_t$ using $s$.
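The key generation, encryption, and decryption steps above can be traced with a very small ring. The sketch below assumes toy parameters ($l = 8$, $t = 17$, a 20-bit $q$), ternary noise in place of a proper noise distribution, and the public key convention $(-r \cdot s + t \cdot e, r)$, under which $ct_0 + ct_1 \cdot s$ reduces to $m$ plus $t$ times a small noise term. It covers only encryption, decryption, and ciphertext addition; multiplication, relinearization, and bootstrapping are omitted. All names are hypothetical.

```python
import random

L_DEG = 8        # ring degree l (power of 2), toy size
Q = 1_048_577    # toy ciphertext modulus q (assumption)
T = 17           # plaintext modulus t (assumption)

def small_poly():
    """Ternary polynomial, used for the secret key and for noise."""
    return [random.choice((-1, 0, 1)) for _ in range(L_DEG)]

def poly_mul(a, b):
    """Multiply in Z_Q[X] / (X^l + 1): wrap-around terms pick up a minus sign."""
    out = [0] * L_DEG
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < L_DEG:
                out[k] = (out[k] + ai * bj) % Q
            else:
                out[k - L_DEG] = (out[k - L_DEG] - ai * bj) % Q
    return out

def keygen():
    s = small_poly()
    r = [random.randrange(Q) for _ in range(L_DEG)]
    e = small_poly()
    p0 = [(-rs + T * ei) % Q for rs, ei in zip(poly_mul(r, s), e)]
    return (p0, r), s

def encrypt(pk, m):
    p0, p1 = pk
    u, e0, e1 = small_poly(), small_poly(), small_poly()
    ct0 = [(a + T * b + c) % Q for a, b, c in zip(poly_mul(p0, u), e0, m)]
    ct1 = [(a + T * b) % Q for a, b in zip(poly_mul(p1, u), e1)]
    return ct0, ct1

def decrypt(s, ct):
    ct0, ct1 = ct
    v = [(a + b) % Q for a, b in zip(ct0, poly_mul(ct1, s))]
    centered = [x - Q if x > Q // 2 else x for x in v]  # [.]_q, centered
    return [x % T for x in centered]                    # [.]_t

def add_ct(ct_a, ct_b):
    """Homomorphic addition: add the ciphertexts component-wise."""
    return tuple([(x + y) % Q for x, y in zip(a, b)] for a, b in zip(ct_a, ct_b))

pk, s = keygen()
m1 = [3, 0, 16, 5, 0, 0, 1, 9]
m2 = [4, 4, 4, 4, 4, 4, 4, 4]
total = decrypt(s, add_ct(encrypt(pk, m1), encrypt(pk, m2)))
```

Decryption succeeds because the accumulated term $m + t \cdot (e u + e_0 + e_1 s)$ stays far below $q/2$ for these parameters; once the noise grows past that bound the centered reduction no longer recovers $m$, which is the situation bootstrapping is meant to prevent.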

3.6. Private Set Intersection

Private set intersection (PSI) is a cryptographic protocol designed to identify common elements between sets held by two or more parties, such as individuals or organizations, without revealing any underlying data. This functionality enables the parties to determine overlapping information while preserving the confidentiality of their respective datasets. In the typical PSI scenario discussed in this paper, the sender and receiver have sets X and Y with sizes N X and N Y , respectively. Upon the receiver’s request for the intersection, the sender computes it and transmits the result. The receiver leverages their private key to decrypt the intersected set. The detailed structure of the PSI employed in this study is as follows.
(1) PSI.Setup($1^\lambda$): Sender and receiver each generate a public–private key pair using the BGV.KeyGen procedure.
(2) PSI.Enc($Y$, $p$): Receiver encrypts each element $y_z \in Y$ using BGV.Enc and transmits the ciphertext $ct = \llbracket Y \rrbracket_p = (\llbracket y_1 \rrbracket_p, \llbracket y_2 \rrbracket_p, \ldots, \llbracket y_{N_Y} \rrbracket_p)$ to the sender.
(3) PSI.Intersection($ct$, $X$): Sender chooses a random number $r_z$ for each $\llbracket y_z \rrbracket_p \in ct$ and computes $d_z = r_z \prod_{x \in X} (\llbracket y_z \rrbracket_p - x)$. Then, the sender returns $(d_1, d_2, \ldots, d_{N_Y})$ to the receiver.
(4) PSI.Ext($d$, $s$): Receiver computes $Dec(d_z) = r_z \prod_{x \in X} (y_z - x)$ using the BGV secret key $s$ and obtains the intersection $X \cap Y = \{ y_z : \text{BGV.Dec}(d_z) = 0 \}$.
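The arithmetic that the sender evaluates under encryption can be shown in the clear. The sketch below deliberately leaves out the BGV layer and works directly on plaintext values modulo an assumed prime $t = 65537$, so it only illustrates why a decrypted value of zero signals membership; it provides no privacy by itself. The function names and the example sets are hypothetical.

```python
import secrets

T = 65537  # prime plaintext modulus (assumption); set elements must be < T

def intersection_mask(y, sender_set):
    """Plaintext analogue of d_z = r_z * prod_{x in X} (y_z - x) mod t."""
    r = secrets.randbelow(T - 1) + 1   # nonzero random mask r_z
    prod = 1
    for x in sender_set:
        prod = prod * (y - x) % T
    return r * prod % T

def extract(receiver_set, masks):
    """Keep exactly the receiver elements whose masked product is zero."""
    return {y for y, d in zip(receiver_set, masks) if d == 0}

sender_keywords = [3, 8, 21, 34]     # X, held by the sender
receiver_keywords = [5, 8, 13, 21]   # Y, held by the receiver
masks = [intersection_mask(y, sender_keywords) for y in receiver_keywords]
common = extract(receiver_keywords, masks)
```

Because $t$ is prime and $r_z \neq 0$, the product vanishes exactly when $y_z$ equals some $x \in X$. For a non-matching $y_z$ the result is a nonzero value randomized by $r_z$, so the receiver learns nothing about the sender's other elements beyond non-membership.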

4. System Models

In this section, we introduce the models proposed in our study: the network model, the adversary model, and the security model.

4.1. Network Model

The proposed system consists of four entities: a trusted authority ( TA ), a data owner ( DO ), a data user ( DU ), and a cloud server ( CS ). The system architecture is illustrated in Figure 1, and a detailed description of each entity is given as follows.
  • Trusted authority ( TA ): TA initiates the system by generating parameters for data sharing. TA undertakes the task of registering both DO and DU , issuing them the necessary credentials.
  • Data owner ( DO ): DO is a hospital, clinic, or research institution. DO encrypts medical data and sends them to CS . When DU submits a common keyword identification query, DO computes an intersection set result, decryptable only by DU , after legitimacy verification. DO also provides the aggregate key and relevant data class set upon DU ’s data access request.
  • Data user ( DU ): DU is a doctor, nurse, researcher, patient, etc., within a medical institution. To access data, DU initiates a common keyword identification query. After receiving the results, DU requests access to data related to the matched keyword results and then uses the aggregate key to decrypt the data obtained from CS.
  • Cloud server ( CS ): CS is an entity that stores the medical data and returns the data search results. When data are uploaded by DO , CS stores the data if DO has the necessary legal permissions. CS facilitates data access to DU following a verification process to ascertain the legal status of DU .
The communication flows of the proposed model are summarized as follows.
(1) TA initializes the system parameters for authentication, intersection calculation, and data sharing.
(2) TA registers DO and DU , storing the identity information to prevent duplicate registrations. Then, TA issues credentials for secure data sharing through authentication.
(3) DO encrypts the medical data and uploads them to CS . CS then verifies the DO ’s legitimacy prior to storing the data.
(4) DU submits the common keyword identification query for owned information. DO generates and transmits the encrypted intersection results after confirming DU ’s legitimacy with TA . DU verifies the received message and stores the intersection.
(5) DU transmits a query for data access permission to DO using the intersection results. Then, DO generates and sends the aggregate key and corresponding data class set based on the intersected keywords.
(6) DU requests the data from CS using a data class set and decrypts them using the aggregate key obtained from DO .

4.2. Adversary Model

We adopt the Dolev–Yao (DY) model [32], which is widely used to evaluate the security of protocols, to assess the security of the proposed scheme. The DY model assumes that an adversary has the ability to intercept all communications on a network, read and modify intercepted messages, and create and transmit new messages. These capabilities allow the adversary to carry out a range of attacks, such as impersonation, replay, and man-in-the-middle attacks. By analyzing the proposed scheme using the DY model, our objective is to assess its effectiveness in preventing unauthorized access, data tampering, and other malicious activities by potential adversaries.

4.3. Security Model

Aligned with the adversary model outlined in Section 4.2, the proposed scheme is designed to uphold stringent data privacy standards. Given the sensitive nature of medical data, a breach could have serious consequences. Hence, protecting the confidentiality of DO ’s information is critical to prevent unauthorized access and the subsequent leakage of sensitive data. To ensure robust data privacy, it is imperative that the ciphertext remains impervious to unauthorized decryption attempts, thereby preventing the exposure of plaintext information. In order to rigorously assess and validate the efficacy of our approach in preserving data privacy, we adopt the IND-CPA model. In this paper, we introduce the IND-CPA model game for evaluating the security posture of the proposed scheme.
Definition 1 (Data privacy). In our proposed scheme, we establish semantic security for data privacy using the IND-CPA model. The advantage of adversary $\mathcal{A}$ is quantified by $Adv_{\mathcal{A}}^{IND\text{-}CPA} = \left| \Pr[\varkappa' = \varkappa] - \frac{1}{2} \right|$. The scheme achieves security against IND-CPA if, across all potential attacks, the inequality $\left| \Pr[\varkappa' = \varkappa] - \frac{1}{2} \right| \leq \varepsilon$ is upheld, where $\varepsilon$ represents a negligibly small probability.
  • Init. $\mathcal{A}$ selects a specific set $S_a$ from the available set $S = \{1, \ldots, n\}$, which it aims to exploit.
  • Setup. The simulator $\mathcal{B}$ provides the system parameters to $\mathcal{A}$.
  • Phase 1. For $S^* \subseteq \bar{S}_a$, $\mathcal{A}$ submits an aggregate key request query to $\mathcal{B}$. Subsequently, $\mathcal{B}$ generates and transmits the aggregate key to $\mathcal{A}$.
  • Challenge. $\mathcal{A}$ selects two plaintexts, $F_0$ and $F_1$, of equal length from a set of possible plaintexts associated with class $i_t$. These plaintexts are then forwarded to $\mathcal{B}$. Thereafter, $\mathcal{B}$ obtains a random bit $\varkappa \in \{0, 1\}$ via a coin flip. Following this, $\mathcal{B}$ encrypts the selected plaintext $F_\varkappa$ and transmits the resulting ciphertext to $\mathcal{A}$.
  • Phase 2. $\mathcal{A}$ repeats Phase 1 for $S^* \subseteq \bar{S}_a$, encompassing classes that do not belong to $S_a$.
  • Guess. $\mathcal{A}$ produces an estimate $\varkappa'$ of the true value of $\varkappa$ and communicates it to $\mathcal{B}$. If the estimate $\varkappa'$ matches the true value $\varkappa$, $\mathcal{A}$ is deemed successful in the game.
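The Challenge and Guess steps of this game can be made concrete with a small harness. The sketch below is a simplification: it omits the Init, Setup, and aggregate key query phases and plays only the challenge against a deliberately weak, deterministic toy cipher, to show what a non-negligible advantage looks like. The cipher, the adversary, and all function names are hypothetical and unrelated to the proposed scheme.

```python
import secrets

def ind_cpa_round(encrypt, choose, guess):
    """One Challenge/Guess round; returns True if the adversary guessed the bit."""
    f0, f1 = choose()
    bit = secrets.randbelow(2)              # the simulator's coin flip
    challenge = encrypt((f0, f1)[bit])
    return guess(f0, f1, challenge) == bit

def estimate_advantage(encrypt, choose, guess, trials=1000):
    """Empirical |Pr[guess = bit] - 1/2| over repeated rounds."""
    wins = sum(ind_cpa_round(encrypt, choose, guess) for _ in range(trials))
    return abs(wins / trials - 0.5)

def det_encrypt(f):
    """Deterministic toy 'cipher': equal plaintexts give equal ciphertexts."""
    return (f * 31 + 7) % 257

def choose():
    return 10, 20                           # two distinct challenge plaintexts

def reencrypt_guess(f0, f1, challenge):
    """With public encryption, the adversary re-encrypts F_0 and compares."""
    return 0 if det_encrypt(f0) == challenge else 1

advantage = estimate_advantage(det_encrypt, choose, reencrypt_guess)
```

Against the deterministic cipher the adversary wins every round, so the estimated advantage is the maximum value of 1/2. A fresh random value in each encryption, such as the random $s$ chosen in KAE.Encrypt, is what rules out this re-encrypt-and-compare strategy.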

5. Proposed Scheme

The proposed scheme encompasses six distinct phases: setup, registration, data upload, common keyword identification, aggregate key issuance, and data request and download. Figure 2 is the flowchart of the proposed scheme. During the setup phase, TA initializes the system parameters. In the registration phase, TA registers DO and DU , providing them with the necessary credentials for data sharing. In the data upload phase, DO uploads the encrypted data, which are then stored by CS following verification of DO ’s legitimacy. The common keyword identification phase involves DU communicating with DO to acquire matching keywords. During the aggregate key issuance phase, DO issues an aggregate key along with the corresponding dataset based on the set of keywords requested by DU . Finally, in the data request and download phase, DU can request and retrieve the data from CS . Table 1 provides the notation utilized throughout the proposed scheme.

5.1. Setup Phase

TA sets a security parameter $\lambda$ and chooses the ciphertext modulus $q$, plaintext modulus $t$, noise distribution $\mathcal{X}$, and ring $R_q = \mathbb{Z}_q[X]/(X^l + 1)$. TA also generates the bilinear parameters $(p, G, G_T, \hat{e})$ and chooses a generator $P \in G$ and $\alpha \in \mathbb{Z}_p$. Then, TA computes $P_i = \alpha^i P \in G$ for $i \in \{1, \ldots, n, n+2, \ldots, 2n\}$. TA selects a hash function $h: \{0, 1\}^* \rightarrow \mathbb{Z}_p$ and publishes $param = \{q, t, l, \mathcal{X}, R_q, p, G, G_T, \hat{e}, P, \{P_i\}_{1 \leq i \leq 2n,\, i \neq n+1}, n, h\}$.

5.2. Registration Phase

TA conducts the registration of both DO and DU , issuing the necessary credentials. The registration procedure is performed in a secure channel, and we present this phase only for DO , since the registration process is identical for both DO and DU .
Step 1: DO selects and sends $ID_o$ to TA .
Step 2: TA checks whether $ID_o$ is registered by computing $V_o = h(ID_o \,\|\, k_{TA})$. TA generates $r_o \in R_q$, $e_o \leftarrow \mathcal{X}$, $s_o \in \{-1, 0, 1\}^l$, and $R_o \in \mathbb{Z}_p$ and computes $p_o = (-r_o \cdot s_o + t \cdot e_o, r_o)$, $v_o = (k_{TA} + R_o) \bmod p$, $d_o = R_o \cdot P$. Then, TA stores $V_o = h(ID_o \,\|\, k_{TA})$ and sends $\{p_o, s_o, v_o, d_o\}$ to DO .
Step 3: DO stores $\{p_o, s_o, v_o, d_o\}$ securely.

5.3. Data Upload Phase

For data security, DO computes the authentication message and the encrypted data for document $F_i$ using the random nonces $s, u \in Z_p$ and the credentials $v_o$ and $sk_o$. Upon receiving the message, CS verifies that DO is legitimately registered with TA and then stores the encrypted data $\{c_1, c_2, c_3, v_i\}$. The data upload process is illustrated in Figure 3, with detailed steps outlined below.
Step 1: 
DO generates random nonces $s, u \in Z_p$ and computes $U_1 = u \cdot P$, $U_2 = u \cdot pk_s$, and $U_3 = u + h(U_2 \| v_o \cdot P) \cdot sk_o \pmod p$. DO also computes $c_1 = s \cdot P$, $c_2 = s \cdot (pk_o + P_i)$, $c_3 = F_i \cdot e(P_1, P_n)^s$, $v_i = h(F_i \| ID_o)$, and $C_i = (c_1 \| c_2 \| c_3 \| v_i) \oplus U_2$ for document $F_i$ ($i \in \{1, \ldots, n\}$). Then, DO sends $\{ID_o, U_1, U_3, d_o, C_i\}$ to CS.
Step 2: 
Upon receiving the uploaded message, CS computes $U_2^* = U_1 \cdot sk_s$ and checks whether $U_3 \cdot P$ is equal to $U_1 + h(U_2^* \| pk_{TA} + d_o) \cdot pk_o$. If it is correct, CS computes $(c_1 \| c_2 \| c_3 \| v_i) = C_i \oplus U_2$ and stores $\{c_1, c_2, c_3, v_i\}$.
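The verification in Step 2 can be checked algebraically with a small sketch. This is only an illustration under loud assumptions: a "point" $x \cdot P$ is stored as the scalar $x$ modulo a toy prime (which is completely insecure), the hash is instantiated with SHA-256, and $pk_{TA} = k_{TA} \cdot P$, $pk_o = sk_o \cdot P$, and $pk_s = sk_s \cdot P$ are assumed. The function names `point`, `upload_auth`, and `cs_verify` are hypothetical and not part of the original scheme.

```python
import hashlib

P_ORDER = 2**61 - 1  # toy prime order (assumption); a real system uses an elliptic-curve group


def h(*parts):
    """Stand-in for h: {0,1}* -> Z_p, built from SHA-256 (assumption)."""
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P_ORDER


def point(x):
    """Toy 'point' x*P, stored as the scalar x mod p. Insecure; algebra check only."""
    return x % P_ORDER


def upload_auth(u, sk_o, v_o, pk_s):
    """DO side: U1 = u*P, U2 = u*pk_s, U3 = u + h(U2 || v_o*P) * sk_o mod p."""
    U1 = point(u)
    U2 = (u * pk_s) % P_ORDER
    U3 = (u + h(U2, point(v_o)) * sk_o) % P_ORDER
    return U1, U2, U3


def cs_verify(U1, U3, d_o, pk_TA, pk_o, sk_s):
    """CS side: recompute U2* = U1*sk_s and test U3*P == U1 + h(U2* || pk_TA + d_o) * pk_o."""
    U2_star = (U1 * sk_s) % P_ORDER
    lhs = point(U3)
    rhs = (U1 + h(U2_star, (pk_TA + d_o) % P_ORDER) * pk_o) % P_ORDER
    return lhs == rhs
```

The check succeeds because $v_o \cdot P = (k_{TA} + R_o) \cdot P = pk_{TA} + d_o$ and $U_1 \cdot sk_s = u \cdot pk_s$, so both sides hash the same values.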

5.4. Common Keyword Identification Phase

DU initiates a request to obtain the common keywords related to its own data. DU sends the ciphertext ct of its keyword set $Y$ together with an authentication value. After receiving the query, DO verifies the legitimacy of DU through $A_3$ using $d_u$ and transmits the encrypted intersection results $d_z$. Subsequently, DU extracts the common keywords from the intersection set. Figure 4 illustrates the common keyword identification procedure, and each step is detailed below.
Step 1: 
DU generates $a_1$ and $T_{A1}$ and computes $A_1 = a_1 \cdot P$, $A_2 = a_1 \cdot pk_o$, $A_3 = a_1 + h(A_2 \| v_u \cdot P \| ID_u) \cdot sk_u \pmod p$, and $ct = [\![Y]\!]_{p_u} = ([\![y_1]\!]_{p_u}, [\![y_2]\!]_{p_u}, \ldots, [\![y_m]\!]_{p_u})$, where $[\![y]\!]_{p_u}$ denotes the encryption of $y$ under $p_u$. Then, DU sends $\{T_{A1}, ID_u, d_u, A_1, A_3, ct\}$.
Step 2: 
After receiving the message, DO checks $|T_{A1}^* - T_{A1}| \le \Delta T$ and computes $A_2^* = A_1 \cdot sk_o$ to verify $A_3 \cdot P \overset{?}{=} A_1 + h(A_2^* \| pk_{TA} + d_u \| ID_u) \cdot pk_u$. If it is correct, DO generates $a_2$ and $T_{A2}$ and computes $A_4 = a_2 \cdot P$, $A_5 = a_2 \cdot pk_u$, and $A_6 = a_2 + h(A_5 \| v_o \cdot P \| A_2^* \| ID_o) \cdot sk_o \pmod p$. DO also generates $r_z$ for each $[\![y_z]\!]_{p_u} \in ct$ and computes $d_z = r_z \prod_{x \in X} ([\![y_z]\!]_{p_u} - x)$. Then, DO transmits $\{T_{A2}, ID_o, d_o, A_4, A_6, d_z\}$.
Step 3: 
DU checks $|T_{A2}^* - T_{A2}| \le \Delta T$ and computes $A_5^* = A_4 \cdot sk_u$. If $A_6 \cdot P$ equals $A_4 + h(A_5^* \| pk_{TA} + d_o \| A_2 \| ID_o) \cdot pk_o$, DU computes $Dec(d_z) = r_z \prod_{x \in X} (y_z - x)$ using $s_u$. The result is zero exactly when $y_z$ matches one of DO’s keywords, so DU adds $y_z$ to $S_I$ whenever $y_z \in X \cap Y$.
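The intersection test rests on the fact that $r_z \prod_{x \in X} (y_z - x)$ vanishes exactly when $y_z \in X$. A minimal plaintext sketch of this idea follows. It deliberately omits the homomorphic encryption layer (in the scheme DO evaluates the product on ciphertexts and DU decrypts), encodes keywords as integers below a toy prime modulus, and uses the hypothetical helper names `intersection_tags` and `common_keywords`.

```python
import secrets

T_MOD = 65537  # toy prime plaintext modulus t (assumption)


def intersection_tags(X, Y):
    """DO side (without encryption): d_z = r_z * prod_{x in X} (y_z - x) mod t."""
    tags = []
    for y in Y:
        acc = 1 + secrets.randbelow(T_MOD - 1)  # fresh non-zero blinding factor r_z
        for x in X:
            acc = acc * (y - x) % T_MOD
        tags.append(acc)
    return tags


def common_keywords(Y, tags):
    """DU side: a zero tag means the keyword is also in DO's set X."""
    return [y for y, d in zip(Y, tags) if d == 0]
```

Because the modulus is prime and $r_z$ is non-zero, a tag is zero only if some factor $(y_z - x)$ is zero. For non-matching keywords the random factor $r_z$ hides the value of the product from DU.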

5.5. Aggregate Key Issuance Phase

To obtain data access permission for the intersection $S_I$, DU sends the aggregate key request message $\{T_{B1}, ID_u, B_1, B_3, B_4\}$ to DO. After confirming the validity of DU, DO provides an accessible dataset $S$ with an aggregate key $AK$. Figure 5 depicts the aggregate key issuance process, and the steps are detailed below.
Step 1: 
DU generates $b_1$ and $T_{B1}$ and computes $B_1 = b_1 \cdot P$, $B_2 = b_1 \cdot pk_o$, $B_3 = S_I \oplus h(B_2 \| T_{B1})$, and $B_4 = b_1 + h(ID_o \| ID_u \| B_2 \| S_I) \cdot sk_u \pmod p$. Then, DU sends $\{T_{B1}, ID_u, B_1, B_3, B_4\}$.
Step 2: 
DO checks $|T_{B1}^* - T_{B1}| \le \Delta T$ and computes $B_2^* = B_1 \cdot sk_o$ and $S_I = B_3 \oplus h(B_2^* \| T_{B1})$. If $B_4 \cdot P \overset{?}{=} B_1 + h(ID_o \| ID_u \| B_2^* \| S_I) \cdot pk_u$ holds, DO generates $b_2$ and $T_{B2}$ and computes $B_5 = b_2 \cdot P$, $B_6 = b_2 \cdot pk_u$, $AK = \sum_{j \in S} sk_o \cdot P_{n+1-j}$, $B_7 = b_2 + h(ID_o \| ID_u \| B_2^* \| B_6 \| S_I) \cdot sk_o \pmod p$, and $B_8 = (AK \| S) \oplus h(B_2^* \| B_6)$. Then, DO transmits $\{T_{B2}, ID_o, B_5, B_7, B_8\}$.
Step 3: 
DU checks $|T_{B2}^* - T_{B2}| \le \Delta T$ and computes $B_6^* = B_5 \cdot sk_u$ to check $B_7 \cdot P \overset{?}{=} B_5 + h(ID_o \| ID_u \| B_2 \| B_6^* \| T_{B2}) \cdot pk_o$. If the check holds, DU computes $(AK \| S) = B_8 \oplus h(B_2 \| B_6^*)$.
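The masking $B_8 = (AK \| S) \oplus h(B_2^* \| B_6)$ and its removal in Step 3 can be sketched as follows. The source does not say how the hash output is stretched to the length of $AK \| S$, so this sketch assumes an extendable-output function (SHAKE-256) as the pad generator. The function name `mask` and the byte encodings of the shared values are hypothetical.

```python
import hashlib


def mask(data: bytes, *shared: bytes) -> bytes:
    """XOR data with a pad derived from the shared values (assumed SHAKE-256 expansion)."""
    pad = hashlib.shake_256(b"|".join(shared)).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, pad))


# XOR is its own inverse, so unmasking is the same call with the same shared values.
payload = b"AK-bytes||S=1,3,4"              # placeholder encoding of (AK || S)
B8 = mask(payload, b"B2-bytes", b"B6-bytes")
recovered = mask(B8, b"B2-bytes", b"B6-bytes")
```

Only a party that can recompute both shared values (DO through $B_2^*$ and $B_6$, DU through $B_2$ and $B_6^*$) derives the same pad and recovers the payload.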

5.6. Data Request and Download Phase

DU requests from CS the data corresponding to the dataset $S$ received from DO, and CS transmits the matched results $\{T_{D2}, c_1, v_i, PF_i\}$. DU then uses the aggregate key $AK$ to decrypt the received data and obtain the document $F_i$. This phase is illustrated in Figure 6, and each step is detailed below.
Step 1: 
DU generates $d_1$ and $T_{D1}$, computes $D_1 = d_1 \cdot P$, $D_2 = d_1 \cdot pk_s$, and $D_3 = S \oplus h(D_2 \| T_{D1})$, and sends $\{T_{D1}, D_1, D_3\}$.
Step 2: 
Upon receiving the message, CS checks $|T_{D1}^* - T_{D1}| \le \Delta T$ and computes $D_2^* = D_1 \cdot sk_s$ and $S = D_3 \oplus h(D_2^* \| T_{D1})$. For $S$, CS generates $T_{D2}$ and computes $v_1 = \sum_{j \in S, j \ne i} P_{n+1-j+i}$, $v_2 = \sum_{j \in S} P_{n+1-j}$, and $PF_i = c_3 \cdot \frac{e(v_1, c_1)}{e(v_2, c_2)}$. Then, CS sends $\{T_{D2}, c_1, v_i, PF_i\}$.
Step 3: 
DU checks $|T_{D2}^* - T_{D2}| \le \Delta T$ and obtains data $F_i$ by computing $F_i^* = PF_i \cdot e(AK, c_1)$. To verify the data, DU checks whether $v_i$ is equal to $h(F_i^* \| ID_i)$.
Correctness: 
$$
\begin{aligned}
F_i &= PF_i \cdot e(AK, c_1) = c_3 \cdot \frac{e(v_1, c_1)}{e(v_2, c_2)} \cdot e(AK, c_1) \\
&= c_3 \cdot \frac{e\left(\sum_{j \in S, j \ne i} P_{n+1-j+i},\; s \cdot P\right)}{e\left(\sum_{j \in S} P_{n+1-j},\; s \cdot (pk_o + P_i)\right)} \cdot e\left(\sum_{j \in S} sk_o \cdot P_{n+1-j},\; s \cdot P\right) \\
&= c_3 \cdot \frac{e\left(\sum_{j \in S, j \ne i} P_{n+1-j+i},\; s \cdot P\right)}{e\left(\sum_{j \in S} P_{n+1-j},\; s \cdot pk_o\right) \cdot e\left(\sum_{j \in S} P_{n+1-j},\; s \cdot P_i\right)} \cdot e\left(\sum_{j \in S} sk_o \cdot P_{n+1-j},\; s \cdot P\right) \\
&= c_3 \cdot \frac{e\left(\sum_{j \in S} sk_o \cdot P_{n+1-j},\; s \cdot P\right)}{e\left(\sum_{j \in S} P_{n+1-j},\; s \cdot pk_o\right) \cdot e\left(\alpha^{n+1} P,\; s \cdot P\right)} \\
&= \frac{F_i \cdot e(P_1, P_n)^s}{e\left(\alpha^{n+1} P,\; s \cdot P\right)} = \frac{F_i \cdot e(P_1, P_n)^s}{e(P_1, P_n)^s} = F_i
\end{aligned}
$$
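The correctness argument can be replayed numerically in a toy model. The sketch below is an assumption-laden illustration, not an implementation of the scheme: every group element $x \cdot P$ is represented by its discrete logarithm $x$, the pairing $e(xP, yP)$ by the exponent $xy$ of $e(P, P)$, and multiplication in $G_T$ by addition of exponents. This is insecure by construction and serves only to check the algebra. It also assumes $pk_o = sk_o \cdot P$, and all function names are hypothetical.

```python
P_MOD = 2**61 - 1  # toy prime group order (assumption)


def public_points(n, alpha):
    """Discrete logs of P_i = alpha^i * P for i in 1..2n, with i = n+1 withheld."""
    return {i: pow(alpha, i, P_MOD) for i in range(1, 2 * n + 1) if i != n + 1}


def pair(x, y):
    """Toy pairing: e(xP, yP) = e(P, P)^(xy), stored as the exponent xy mod p."""
    return x * y % P_MOD


def encrypt(F, i, s, sk_o, pts, n):
    c1 = s % P_MOD                                  # c1 = s*P
    c2 = s * (sk_o + pts[i]) % P_MOD                # c2 = s*(pk_o + P_i)
    c3 = (F + s * pair(pts[1], pts[n])) % P_MOD     # c3 = F_i * e(P_1, P_n)^s
    return c1, c2, c3


def aggregate_key(S, sk_o, pts, n):
    """AK = sum_{j in S} sk_o * P_{n+1-j}."""
    return sum(sk_o * pts[n + 1 - j] for j in S) % P_MOD


def partial_decrypt(c1, c2, c3, i, S, pts, n):
    """CS side: PF_i = c3 * e(v1, c1) / e(v2, c2)."""
    v1 = sum(pts[n + 1 - j + i] for j in S if j != i) % P_MOD
    v2 = sum(pts[n + 1 - j] for j in S) % P_MOD
    return (c3 + pair(v1, c1) - pair(v2, c2)) % P_MOD


def final_decrypt(PF, AK, c1):
    """DU side: F_i = PF_i * e(AK, c1)."""
    return (PF + pair(AK, c1)) % P_MOD
```

In this model decryption returns $F_i$ only when the document index $i$ belongs to the set $S$ covered by the aggregate key. Otherwise the term $e(\alpha^{n+1} P, s \cdot P)$ is not cancelled and the result differs from $F_i$.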

6. Security Analysis

We conduct a comprehensive security analysis to demonstrate the resilience of our proposed scheme. The assessment ranges from informal attack scenarios to formal analysis. In the informal security analysis, we evaluate whether the proposed scheme meets essential security requirements, including resilience against impersonation, replay, and denial-of-service attacks, as well as mutual authentication and data privacy. For the formal analysis, we use IND-CPA to verify the robustness of the data privacy protections. We employ BAN logic to confirm mutual authentication and use the Scyther tool to validate the security of the proposed scheme against potential vulnerabilities, focusing on common keyword identification and aggregate key issuance.

6.1. Informal Security Analysis

We evaluate the robustness of the proposed scheme against various threats that can occur in a medical data sharing environment. We also verify that mutual authentication between entities is provided during communication. We assume that the adversary attempts to breach security under the assumptions described in Section 4.2.

6.1.1. Impersonation Attack

$\mathcal{A}$ attempts to impersonate DU and to intercept the messages transmitted among DO, DU, and CS in order to obtain sensitive data. $\mathcal{A}$ first tries to send a data request message $\{T_{D1}, D_1, D_3\}$ to CS, as outlined in Section 5.6. However, $\mathcal{A}$ cannot compute this message without the dataset $S$. $\mathcal{A}$ also tries to extract data from an intercepted message $\{T_{D2}, c_1, v_i, PF_i\}$, which is impossible without the aggregate key $AK$. To acquire the $AK$ and $S$ of DU, $\mathcal{A}$ tries to transmit $\{T_{B1}, ID_u, B_1, B_3, B_4\}$ from Section 5.5. However, $\mathcal{A}$ cannot construct it because it lacks DU’s secret key $sk_u$, the random nonce $b_1$, the common keyword set $S_I$, and the identity of the data owner $ID_o$. Even if $\mathcal{A}$ tries to obtain $AK$ and $S$ from $\{T_{B2}, ID_o, B_5, B_7, B_8\}$, this is impossible because $\mathcal{A}$ needs $sk_u$. Furthermore, attempts to obtain $S_I$ and $ID_o$ for the desired data through the exchange in Section 5.4 fail because $\mathcal{A}$ does not know $sk_u$ and $a_1$. Consequently, the proposed scheme resists impersonation attacks.

6.1.2. Replay and Man-in-the-Middle (MITM) Attack

$\mathcal{A}$ tries to resend the common keyword confirmation message $\{T_{A1}, ID_u, d_u, A_1, A_3, ct\}$, the data access permission request message $\{T_{B1}, ID_u, B_1, B_3, B_4\}$, and the data request message $\{T_{D1}, D_1, D_3\}$ in order to obtain data. However, these messages contain the timestamps $T_{A1}, T_{B1}, T_{D1}$ and the random nonces $a_1, b_1, d_1$, and each entity that receives a message checks its freshness. Even if $\mathcal{A}$ retransmits a previous message, the receiving entity can identify it as stale and reject it. $\mathcal{A}$ may also intercept and attempt to modify the messages, but this is impossible without knowledge of $sk_u, sk_o, a_1, a_2, b_1, b_2, d_1, d_2$. Hence, our scheme resists replay and MITM attacks.
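The freshness test $|T^* - T| \le \Delta T$ that each receiver applies can be sketched in a few lines. The value of $\Delta T$ and the use of seconds as the time unit are assumptions made for illustration, since the source does not fix them, and the function name `is_fresh` is hypothetical.

```python
import time

DELTA_T = 2.0  # assumed maximum transmission delay, in seconds


def is_fresh(t_sent, t_received=None, delta_t=DELTA_T):
    """Accept a message only if |T* - T| <= delta_t, where T* is the receive time."""
    if t_received is None:
        t_received = time.time()
    return abs(t_received - t_sent) <= delta_t
```

A message replayed after the window has elapsed carries an old timestamp and is rejected. An adversary cannot simply refresh the timestamp of the request messages, because $T_{B1}$ and $T_{D1}$ are bound into the hash-based masks $B_3$ and $D_3$.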

6.1.3. Denial of Services (DoS) Attack

$\mathcal{A}$ seeks to disrupt availability by flooding CS with messages, thereby overloading its capacity or halting the data sharing service altogether. During such an attack, $\mathcal{A}$ repeatedly transmits data upload messages $\{ID_o, U_1, U_3, d_o, C_i\}$ and data request messages $\{T_{D1}, D_1, D_3\}$ to CS. However, CS mitigates this threat by checking the timestamps of incoming messages and promptly discarding any that are invalid. Consequently, the proposed scheme defends against such DoS attacks and preserves service availability.

6.1.4. Mutual Authentication

In the proposed scheme, the communicating entities verify each other’s legitimacy to ensure secure medical data sharing. In Section 5.4, DU requests the common keyword results from DO through $\{T_{A1}, ID_u, d_u, A_1, A_3, ct\}$. Upon receiving the message, DO checks whether DU has been legitimately registered with TA via $A_3 \cdot P \overset{?}{=} A_1 + h(A_2^* \| pk_{TA} + d_u \| ID_u) \cdot pk_u$. If this verification is successful, DO sends $\{T_{A2}, ID_o, d_o, A_4, A_6, d_z\}$, which includes the common keyword identification result $d_z = r_z \prod_{x \in X} ([\![y_z]\!]_{p_u} - x)$ that can be decrypted by DU. DU then verifies that the message was sent by the DO legitimately registered with TA via $A_6 \cdot P \overset{?}{=} A_4 + h(A_5^* \| pk_{TA} + d_o \| A_2 \| ID_o) \cdot pk_o$. Mutual authentication is performed in the same way in the other phases. Therefore, the proposed scheme ensures mutual authentication.

6.1.5. Data Verification

Upon receiving the results of the data request query $\{T_{D2}, c_1, v_i, PF_i\}$, DU proceeds with data verification. This involves computing $F_i^* = PF_i \cdot e(AK, c_1)$ and then checking whether $v_i \overset{?}{=} h(F_i^* \| ID_i)$. This check ensures that the received data have not been tampered with. Therefore, the proposed scheme not only facilitates secure medical data sharing but also protects data integrity, mitigating the risk of unauthorized modifications.

6.2. Semantic Security

In Theorem 1, we show that the proposed scheme provides IND-CPA security.
Theorem 1. 
If a probabilistic polynomial-time adversary $\mathcal{A}$ breaks the proposed scheme with a non-negligible advantage $\varepsilon$, then $\mathcal{A}$ can be used to solve the underlying hard problem (DBDH) with an advantage of $\varepsilon/2$.
Proof of Theorem 1. 
Let $\mathcal{A}$ be an adversary capable of compromising the proposed scheme with an advantage of $\varepsilon$. We construct $\mathcal{B}$, which plays the DBDH game with an advantage of $\varepsilon/2$. The challenger $\mathcal{C}$ selects a generator $P \in G$ and four random values $a, b, c, d \in Z_p$. $\mathcal{C}$ then randomly determines a value $\varkappa \in \{0, 1\}$ and shares it with $\mathcal{B}$. If $\varkappa = 0$, $\mathcal{C}$ computes $V = \hat{e}(P, P)^{abc}$, resulting in the tuple $(aP, bP, cP, \hat{e}(P, P)^{abc})$. Otherwise, if $\varkappa = 1$, $\mathcal{C}$ computes $V = \hat{e}(P, P)^{d}$, resulting in the tuple $(aP, bP, cP, \hat{e}(P, P)^{d})$.
Init. $\mathcal{B}$ runs $\mathcal{A}$, which selects a subset $S_a$ of the set $S = \{1, \ldots, n\}$ that it intends to attack. $\mathcal{A}$ then delivers this selected set to $\mathcal{B}$.
Setup. $\mathcal{B}$ formulates the public parameters $\{P_i\}_{1 \le i \le 2n,\, i \ne n+1}$, where $\alpha^{1} = a$ and $\alpha^{n} = b$. Then, $\mathcal{B}$ sends these parameters to $\mathcal{A}$.
Phase 1. $\mathcal{A}$ submits an $AK$ query for $S^* \subseteq \bar{S}_a$, and $\mathcal{B}$ responds to $\mathcal{A}$ by calculating $AK = \sum_{j \in S^*} sk_o \cdot P_{n+1-j}$.
Challenge. $\mathcal{A}$ submits two plaintexts of equal length, denoted as $F_0$ and $F_1$, along with $S^*$ to $\mathcal{B}$. $\mathcal{B}$ randomly flips a coin to determine $\varkappa \in \{0, 1\}$. If $\varkappa = 0$ and $V = \hat{e}(P, P)^{abc}$, we set $s = c$, so that $\hat{e}(P, P)^{abc} = \hat{e}(P, P)^{ab \cdot s} = \hat{e}(aP, bP)^{s} = \hat{e}(P_1, P_n)^{s}$, and $c_3 = F_\varkappa \cdot \hat{e}(P, P)^{abc}$ is computed. Otherwise, if $\varkappa = 1$, then $V = \hat{e}(P, P)^{d}$ and $c_3 = F_\varkappa \cdot \hat{e}(P, P)^{d}$. $\mathcal{B}$ also calculates $c_1 = s \cdot P$ and $c_2 = s \cdot (pk_o + P_i)$ and sends $\{c_1, c_2, c_3\}$ to $\mathcal{A}$.
Phase 2. $\mathcal{A}$ repeats Phase 1 to obtain $AK$ for $S^* \subseteq \bar{S}_a$.
Guess. $\mathcal{A}$ outputs $\varkappa'$ as its guess of $\varkappa$. If $\varkappa' = \varkappa$, $\mathcal{B}$ returns 0, indicating $V = \hat{e}(P, P)^{abc}$; in this case $\mathcal{A}$ receives a valid ciphertext and, with an advantage of $\varepsilon$, succeeds with probability $\Pr[\varkappa' = \varkappa \mid V = \hat{e}(P, P)^{abc}] = \frac{1}{2} + \varepsilon$. If $\varkappa' \ne \varkappa$, $\mathcal{B}$ returns 1, indicating $V = \hat{e}(P, P)^{d}$, and $\mathcal{A}$ receives an invalid ciphertext. In this case $\mathcal{A}$ gains no advantage in guessing $\varkappa$, and the probability is $\Pr[\varkappa' \ne \varkappa \mid V = \hat{e}(P, P)^{d}] = \frac{1}{2}$. The probability $\Pr$ of a successful game can be calculated as
$$
\begin{aligned}
\Pr &= \frac{1}{2} \Pr[\mathcal{A}(aP, bP, cP, \hat{e}(P, P)^{abc}) = 1] + \frac{1}{2} \Pr[\mathcal{A}(aP, bP, cP, \hat{e}(P, P)^{d}) = 1] - \frac{1}{2} \\
&= \frac{1}{2} \Pr[\varkappa' = \varkappa \mid V = \hat{e}(P, P)^{abc}] + \frac{1}{2} \Pr[\varkappa' \ne \varkappa \mid V = \hat{e}(P, P)^{d}] - \frac{1}{2} \\
&= \frac{1}{2} \times \left(\frac{1}{2} + \varepsilon\right) + \frac{1}{2} \times \frac{1}{2} - \frac{1}{2} = \frac{\varepsilon}{2}
\end{aligned}
$$
Hence, the proposed scheme provides IND-CPA security. □

6.3. Formal Security Analysis Using BAN Logic

In the proposed scheme, DO and DU perform mutual authentication in Section 5.4 to prove that they are entities correctly registered with TA before computing the intersection. To demonstrate the mutual authentication of our scheme, we use a widely recognized formal verification technique called BAN logic [13]. Many researchers have verified the mutual authentication of their approaches using BAN logic [33,34]. To analyze our scheme with BAN logic, we provide the following notations and descriptions. Table 2 lists the notation used in BAN logic.

6.3.1. Rules

The rules employed for analyzing the security scheme in BAN logic are outlined below.
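  • Message meaning rule (MMR):
    $\dfrac{Q \mid\equiv Q \overset{L}{\leftrightarrow} K,\quad Q \triangleleft \langle M \rangle_{L}}{Q \mid\equiv K \mid\sim M}$
  • Freshness rule (FR):
    $\dfrac{Q \mid\equiv \#(M)}{Q \mid\equiv \#(M, S)}$
  • Nonce verification rule (NVR):
    $\dfrac{Q \mid\equiv \#(M),\quad Q \mid\equiv K \mid\sim M}{Q \mid\equiv K \mid\equiv M}$
  • Jurisdiction rule (JR):
    $\dfrac{Q \mid\equiv K \mid\Rightarrow M,\quad Q \mid\equiv K \mid\equiv M}{Q \mid\equiv M}$
  • Belief rule (BR):
    $\dfrac{Q \mid\equiv K \mid\equiv (M, S)}{Q \mid\equiv K \mid\equiv M}$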

6.3.2. Goals

The goals for checking the adequacy of the authentication properties of the proposed scheme are defined as follows.
Goal 1: 
$DO \mid\equiv A_3$
Goal 2: 
$DO \mid\equiv DU \mid\equiv A_3$
Goal 3: 
$DU \mid\equiv A_6$
Goal 4: 
$DU \mid\equiv DO \mid\equiv A_6$

6.3.3. Assumptions

The assumptions driving the analysis are presented as follows.
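$A_1$:
$DO \mid\equiv DO \overset{A_2}{\leftrightarrow} DU$
$A_2$:
$DO \mid\equiv \#(T_{A1})$
$A_3$:
$DO \mid\equiv DU \mid\Rightarrow A_3$
$A_4$:
$DU \mid\equiv DO \overset{A_5}{\leftrightarrow} DU$
$A_5$:
$DU \mid\equiv \#(T_{A2})$
$A_6$:
$DU \mid\equiv DO \mid\Rightarrow A_6$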

6.3.4. Idealized Forms

The idealized forms for messages exchanged among communication entities are outlined below.
$M_1$:
$DU \rightarrow DO: \langle T_{A1}, ID_u, v_u, A_3 \rangle_{A_2}$
$M_2$:
$DO \rightarrow DU: \langle T_{A2}, ID_o, v_o, A_6 \rangle_{A_5}$

6.3.5. Proof

In accordance with the provided rules, idealized forms, and assumptions, the analytical process aimed at achieving the goals of the proposed scheme is outlined as follows.
Step 1: 
$S_1$ can be obtained from $M_1$.
$S_1$: $DO \triangleleft \langle T_{A1}, ID_u, v_u, A_3 \rangle_{A_2}$
Step 2: 
$S_2$ can be obtained by applying the MMR with $S_1$ and $A_1$.
$S_2$: $DO \mid\equiv DU \mid\sim (T_{A1}, ID_u, v_u, A_3)$
Step 3: 
$S_3$ can be obtained by applying the FR with $S_2$ and $A_2$.
$S_3$: $DO \mid\equiv \#(T_{A1}, ID_u, v_u, A_3)$
Step 4: 
$S_4$ can be obtained by applying the NVR with $S_2$ and $S_3$.
$S_4$: $DO \mid\equiv DU \mid\equiv (T_{A1}, ID_u, v_u, A_3)$
Step 5: 
$S_5$ can be obtained by applying the BR with $S_4$.
$S_5$: $DO \mid\equiv DU \mid\equiv A_3$ (Goal 2)
Step 6: 
$S_6$ can be obtained by applying the JR with $S_5$ and $A_3$.
$S_6$: $DO \mid\equiv A_3$ (Goal 1)
Step 7: 
$S_7$ can be obtained from $M_2$.
$S_7$: $DU \triangleleft \langle T_{A2}, ID_o, v_o, A_6 \rangle_{A_5}$
Step 8: 
$S_8$ can be obtained by applying the MMR with $S_7$ and $A_4$.
$S_8$: $DU \mid\equiv DO \mid\sim (T_{A2}, ID_o, v_o, A_6)$
Step 9: 
$S_9$ can be obtained by applying the FR with $S_8$ and $A_5$.
$S_9$: $DU \mid\equiv \#(T_{A2}, ID_o, v_o, A_6)$
Step 10: 
$S_{10}$ can be obtained by applying the NVR with $S_8$ and $S_9$.
$S_{10}$: $DU \mid\equiv DO \mid\equiv (T_{A2}, ID_o, v_o, A_6)$
Step 11: 
$S_{11}$ can be obtained by applying the BR with $S_{10}$.
$S_{11}$: $DU \mid\equiv DO \mid\equiv A_6$ (Goal 4)
Step 12: 
$S_{12}$ can be obtained by applying the JR with $S_{11}$ and $A_6$.
$S_{12}$: $DU \mid\equiv A_6$ (Goal 3)
Therefore, all goals are accomplished, and the proposed scheme delivers mutual authentication.

6.4. Scyther Tool

We use the Scyther tool for the formal security analysis of the proposed scheme. Scyther is a push-button tool designed for the verification and analysis of security protocols [12]. It offers extensive verification capabilities, guaranteeing termination while verifying the correctness of a protocol across an unbounded number of sessions. Scyther also provides features for model checking and multi-protocol analysis, complemented by a Python-based graphical user interface. These functionalities help users identify and address security vulnerabilities. Protocols are modeled in the Security Protocol Description Language (SPDL) as roles and events that represent message transmission and reception. The Scyther command-line tool evaluates the security of a protocol by checking the claim events described in Table 3. Upon completion of the simulation, the result window reports the outcome for each claim. A status of “OK” in the “Status” tab, along with “No attacks” in the “Comment” tab, indicates that no attack on the claim was found. Figure 7 presents the simulation result of the proposed scheme, showing the “OK” status and “No attacks” comment for all claim events. Therefore, the analysis supports the robustness of the security measures implemented.

7. Comparative Analysis

We perform an evaluative comparison regarding the security and efficiency metrics of our approach against pertinent existing frameworks.

7.1. Security Features

To ensure data confidentiality and privacy in a medical information system, it is imperative that only authorized data users should be granted access to the data, with no information related to data being disclosed. Achieving this necessitates the implementation of robust security measures to thwart unauthorized access attempts, secure message exchanges between entities, and grant data access only following thorough verification via mutual authentication. Moreover, it is crucial to maintain data integrity by verifying the authenticity of the information accessed by data users. In this context, we evaluate the security features of our proposed scheme against existing related schemes to determine its effectiveness in thwarting potential threats such as impersonation, replay, MITM, and DoS attacks. In addition, our evaluation focuses on verifying the robustness of mutual authentication, data integrity verification, and the prevention of data privacy leaks. Table 4 delineates the analysis results, comparing our proposed scheme with existing ones in terms of their capability to address the aforementioned security concerns. Based on our findings, existing studies lack robustness against DoS attacks, do not adequately consider MITM attacks, and lack essential features such as mutual authentication, data verification, or data privacy. In contrast, our proposed scheme meets the security requirements for secure data sharing within medical information systems.

7.2. Computational Costs

We measured the execution time of cryptographic operations on a personal computer (PC) using MIRACL [14], a software library designed to facilitate the practical implementation of cryptographic techniques and algorithms. The PC’s specifications are as follows: Ubuntu 20.04.6 LTS operating system, 16 GB of RAM, and an Intel Core i5-10400 processor operating at 2.90 GHz (64-bit CPU). To ensure the accuracy of the measurements, we calculated the average duration of 100 iterations for each cryptographic operation, and the results are given in Table 5.
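The averaging procedure can be illustrated with a small timing harness. This is a generic sketch and not the measurement code used for Table 5: the source timings were obtained with MIRACL, whereas here a SHA-256 call from the Python standard library stands in for a cryptographic operation. The function name `average_time_ms` is hypothetical.

```python
import hashlib
import time


def average_time_ms(operation, iterations=100):
    """Return the mean wall-clock time of `operation` over `iterations` runs, in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


# Stand-in workload (assumption): hashing a 1 KiB message.
message = b"\x00" * 1024
mean_hash_ms = average_time_ms(lambda: hashlib.sha256(message).digest())
```

Averaging over many iterations smooths out scheduling noise, which matters because individual operations of this kind often finish in well under a millisecond.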
We analyzed the execution time of the messages exchanged over the public channel. For a consistent comparison, we treat the number of keywords and attributes discussed in each paper as 1 and compare the computational cost as the data volume, denoted by $\kappa$, increases. The comparison results are given in Table 6. As depicted in Figure 8, our proposed scheme shows the lowest execution times as the data volume increases. Within medical information systems, the seamless exchange of vast datasets between data owners and users is critical for driving research, advancing medical technologies, and enhancing service delivery. Therefore, our scheme is not only efficient but also well suited for real-world medical information systems.

7.3. Time Complexity Comparison

We conduct a comparative analysis of time complexity concerning computation and communication costs in relation to existing studies. Regarding computation costs, our analysis encompasses encryption, request, and verification. Encryption involves the process of encrypting data from the owner. Request refers to the process of a user desiring access to specific data, while verification entails confirming the accuracy and reliability of received encrypted data. Regarding communication costs, we define the access key as the decryption key, the request as the user’s data access query, and the ciphertext as the encrypted data received from the cloud server. As illustrated in Table 7, our comparison demonstrates significantly lower time complexity for both computation and communication compared to existing methods. Thus, our proposed approach offers enhanced efficiency and performance, rendering it more suitable for medical data sharing systems.

8. Conclusions

We have proposed a secure and privacy-preserving data sharing scheme designed for medical information systems. This scheme leverages KAE to facilitate secure and flexible data sharing between data owners and users and incorporates PSI techniques to achieve a balance between data privacy and flexible sharing. The security of our proposed scheme was evaluated through both informal and formal security analyses. Using BAN logic, we showed that the scheme supports mutual authentication, and the semantic security analysis was used to prove data privacy. Additionally, the robustness of our scheme was validated using the Scyther tool, confirming its resilience against potential security threats. We also compared the security properties, execution times, and complexities of our scheme with those of existing schemes. This comparison highlighted the improved security and efficiency of our scheme. In conclusion, the proposed data sharing scheme not only meets the stringent security and privacy requirements of medical information systems but also exhibits superior performance and flexibility. However, because our scheme employs homomorphic encryption to determine the intersection of private sets, it may place a computational burden on each entity. Hence, we intend to pursue future research aimed at identifying intersections using a lighter method. In addition, since advances in quantum computing may introduce new security risks, we will study how to improve the resilience of the proposed method to these threats.

Author Contributions

Conceptualization, J.O.; methodology, J.O. and S.S.; software, D.K.; validation, Y.P. (Yohan Park) and M.K.; formal analysis, J.O. and S.S.; investigation, J.O. and M.K.; writing—original draft preparation, J.O.; writing—review and editing, S.S. and Y.P. (Yohan Park); supervision, Y.P. (Youngho Park); funding acquisition, Y.P. (Youngho Park). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Research Foundation of Korea (NRF) and funded by the Ministry of Education under grant number 2020R1I1A3058605.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arunprasath, S.; Annamalai, S. Improving patient centric data retrieval and cyber security in healthcare: Privacy preserving solutions for a secure future. Multimed. Tools Appl. 2024, 1–31.
  2. Wang, T.; Wu, Q.; Chen, J.; Chen, F.; Xie, D.; Shen, H. Health data security sharing method based on hybrid blockchain. Future Gener. Comp. Syst. 2024, 153, 251–261.
  3. Zhang, J.; Yang, Y.; Liu, X.; Ma, J. An efficient blockchain-based hierarchical data sharing for Healthcare Internet of Things. IEEE Trans. Ind. Inform. 2022, 18, 7139–7150.
  4. Khan, M.A.; Alhakami, H.; Alhakami, W.; Shvetsov, A.V.; Ullah, I. A smart card-based two-factor mutual authentication scheme for efficient deployment of an IoT-based telecare medical information system. Sensors 2023, 23, 5419.
  5. Lee, J.; Oh, J.; Kwon, D.; Kim, M.; Kim, K.; Park, Y. Blockchain-enabled key aggregate searchable encryption scheme for personal health record sharing with multi-delegation. IEEE Internet Things J. 2024, 11, 17482–17494.
  6. Sahai, A.; Waters, B. Fuzzy identity-based encryption. In Proceedings of the Advances in Cryptology–EUROCRYPT 2005: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Aarhus, Denmark, 22–26 May 2005; Volume 24, pp. 457–473.
  7. Chu, C.K.; Chow, S.S.; Tzeng, W.G.; Zhou, J.; Deng, R.H. Key-aggregate cryptosystem for scalable data sharing in cloud storage. IEEE Trans. Parallel Distrib. Syst. 2014, 25, 468–477.
  8. Yang, L.; Li, C.; Cheng, Y.; Yu, S.; Ma, J. Achieving privacy-preserving sensitive attributes for large universe based on private set intersection. Inf. Sci. 2022, 582, 529–546.
  9. Sucasas, V.; Mantas, G.; Papaioannou, M.; Rodriguez, J. Attribute-based pseudonymity for privacy-preserving authentication in cloud services. IEEE Trans. Cloud Comput. 2023, 11, 168–184.
  10. Wang, H.; Liang, J.; Ding, Y.; Tang, S.; Wang, Y. Ciphertext-policy attribute-based encryption supporting policy-hiding and cloud auditing in smart health. Comput. Stand. Interfaces 2023, 84, 103696.
  11. Oh, J.; Lee, J.; Kim, M.; Park, Y.; Park, K.; Noh, S. A secure data sharing based on key aggregate searchable encryption in fog-enabled IoT environment. IEEE Trans. Netw. Sci. Eng. 2022, 9, 4468–4481.
  12. Cremers, C.J. The Scyther Tool: Verification, Falsification, and Analysis of Security Protocols: Tool Paper. In Proceedings of the International Conference on Computer Aided Verification, Princeton, NJ, USA, 7–14 July 2008; pp. 414–418.
  13. Burrows, M.; Abadi, M.; Needham, R. A logic of authentication. ACM Trans. Comput. Syst. 1990, 8, 18–36.
  14. MIRACL Cryptographic SDK. Available online: https://github.com/miracl/MIRACL (accessed on 2 April 2024).
  15. Bao, Y.; Qiu, W.; Cheng, X. Secure and lightweight fine-grained searchable data sharing for IoT-oriented and cloud-assisted smart healthcare system. IEEE Internet Things J. 2022, 9, 2513–2526.
  16. Mamta; Gupta, B.B.; Lytras, M.D. Fog-enabled secure and efficient fine-grained searchable data sharing and management scheme for IoT-based healthcare systems. In IEEE Transactions on Engineering Management; IEEE: New York, NY, USA, 2022; pp. 1–13.
  17. Wang, Y.; Zhang, A.; Zhang, P.; Qu, Y.; Yu, S. Security-aware and privacy-preserving personal health record sharing using consortium blockchain. IEEE Internet Things J. 2022, 9, 12014–12028.
  18. Oh, J.; Lee, J.; Kim, M.; Park, Y.; Park, K.; Noh, S. A secure personal health record sharing system with key aggregate dynamic searchable encryption. Electronics 2022, 11, 3199.
  19. Trivedi, H.S.; Patel, S.J. Key-aggregate searchable encryption with multi-user authorization and keyword untraceability for distributed IoT healthcare systems. Trans. Emerg. Telecommun. Technol. 2023, 34, e4734.
  20. Xu, G.; Qi, C.; Dong, W.; Gong, L.; Liu, S.; Chen, S.; Liu, J.; Zheng, X. A privacy-preserving medical data sharing scheme based on blockchain. IEEE J. Biomed. Health Inform. 2023, 27, 698–709.
  21. Zhang, C.; Luo, X.; Fan, Q.; Wu, T.; Zhu, L. Enabling privacy-preserving multi-server collaborative search in smart healthcare. Future Gener. Comp. Syst. 2023, 143, 265–276.
  22. Zhang, Y.; Guo, F.; Susilo, W.; Yang, G. Balancing privacy and flexibility of cloud-based personal health records sharing system. IEEE Trans. Cloud Comput. 2023, 11, 2420–2430.
  23. Peng, G.; Zhang, A.; Lin, X. Patient-centric fine-grained access control for electronic medical record sharing with security via dual-blockchain. IEEE Trans. Netw. Sci. Eng. 2023, 10, 2908–3921.
  24. Zhang, K.; Zhang, Y.; Li, Y.; Liu, X.; Lu, L. A blockchain-based anonymous attribute-based searchable encryption scheme for data sharing. IEEE Internet Things J. 2024, 11, 1685–1697.
  25. Jastaniah, K.; Zhang, N.; Mustafa, M.A. Efficient user-centric privacy-friendly and flexible wearable data aggregation and sharing. In IEEE Transactions on Cloud Computing; IEEE: New York, NY, USA, 2024.
  26. Yin, H.; Zhao, Y.; Zhang, L.; Qiao, B.; Chen, W.; Wang, H. Attribute-based searchable encryption with decentralized key management for healthcare data sharing. J. Syst. Architect. 2024, 148, 103081.
  27. Lai, C.; Zhang, H.; Lu, R.; Zheng, D. Privacy-preserving medical data sharing scheme based on two-party cloud-assisted PSI. IEEE Internet Things J. 2024, 11, 15855–15868.
  28. Lax, G.; Nardone, R.; Russo, A. Enabling secure health information sharing among healthcare organizations by public blockchain. Multimed. Tools Appl. 2024, 1–17.
  29. Koblitz, N. Elliptic curve cryptosystems. Math. Comput. 1987, 48, 203–209.
  30. Patranabis, S.; Shrivastava, Y.; Mukhopadhyay, D. Dynamic key-aggregate cryptosystem on elliptic curves for online data sharing. In Progress in Cryptology, Proceedings of the INDOCRYPT 2015: 16th International Conference on Cryptology in India, Bangalore, India, 6–9 December 2015; Springer: Berlin/Heidelberg, Germany, 2015.
  31. Brakerski, Z.; Gentry, C.; Vaikuntanathan, V. (Leveled) fully homomorphic encryption without bootstrapping. ACM Trans. Comput. Theory (TOCT) 2014, 6, 13.
  32. Dolev, D.; Yao, A. On the security of public key protocols. IEEE Trans. Inf. Theory 1983, 29, 198–208.
  33. Son, S.; Lee, J.; Park, Y.; Park, Y.; Das, A.K. Design of blockchain-based lightweight V2I handover authentication protocol for VANET. IEEE Trans. Netw. Sci. Eng. 2022, 9, 1346–1358.
  34. Attir, A.; Naït-Abdesselam, F.; Faraoun, K.M. Lightweight anonymous and mutual authentication scheme for wireless body area networks. Comput. Netw. 2023, 224, 109625.
Figure 1. Network model of the proposed scheme.
Figure 2. The overall flowchart of the proposed scheme.
Figure 3. Data upload phase.
Figure 4. Common keyword identification phase.
Figure 5. Aggregate key issuance phase.
Figure 6. Data request and download phase.
Figure 7. Scyther results.
Figure 8. Comparison of the execution times with the number of data [18,19,22,24].
Table 1. Notation.

| Notation | Description |
|---|---|
| $ID_o$, $ID_u$ | Identities of DO and DU |
| $(pk_{TA}, k_{TA})$ | TA's public and master key pair based on ECC |
| $(pk_o, sk_o)$, $(p_o, s_o)$ | DO's public-private key pairs based on ECC and BGV |
| $(pk_u, sk_u)$, $(p_u, s_u)$ | DU's public-private key pairs based on ECC and BGV |
| $(pk_s, k_s)$ | CS's public-private key pair based on ECC |
| $n$ | Maximum number of documents |
| $S$ | Dataset index of DU |
| $\alpha, R_o, r_o, R_u, r_u, s$ | Random numbers |
| $e_o, e_u$ | Random errors |
| $u, a_1, a_2, b_1, b_2, d_1, d_2$ | Random nonces |
| $T_{A1}, T_{A2}, T_{B1}, T_{B2}, T_{D1}, T_{D2}$ | Timestamps |
| $T$ | Maximum transmission delay |
| $AK$ | Aggregate key |
| $G$, $G_T$ | Additive group and multiplicative group |
| $\hat{e}$ | Bilinear map $\hat{e}: G \times G \to G_T$ |
| $h$ | One-way hash function $h: \{0,1\}^* \to Z_p$ |
| $\oplus$ | Bitwise exclusive-or operator |
| $\parallel$ | Concatenation operator |
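The last three entries of Table 1 can be illustrated with a minimal sketch. It assumes SHA-256 (the hash timed in Table 5) as the underlying function and reduces the digest modulo a prime to land in $Z_p$. The prime `P`, the helper names `h`, `concat`, and `xor`, and the plain byte-wise concatenation are all assumptions made for illustration and are not taken from the scheme itself.

```python
import hashlib

# Illustrative prime modulus (assumption): 2^255 - 19, not the scheme's actual p.
P = 2**255 - 19


def h(data: bytes) -> int:
    """Map an arbitrary byte string to an integer in Z_p via SHA-256.

    Reducing a 256-bit digest modulo a ~255-bit prime introduces a small
    bias; this is acceptable for a sketch but should be reviewed before
    any real use.
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % P


def concat(*parts: bytes) -> bytes:
    """Concatenation operator: join byte strings in order."""
    return b"".join(parts)


def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise exclusive-or of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("xor operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical inputs standing in for an identity and a nonce.
    value = h(concat(b"ID_o", b"nonce"))
    print(0 <= value < P)
    masked = xor(b"secret!!", b"padding8")
    print(xor(masked, b"padding8"))
```

Applying `xor` twice with the same mask recovers the original bytes, which is the property that masking with exclusive-or relies on.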
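Table 2. BAN logic notation.

| Notation | Description |
|---|---|
| $Q \mid\equiv M$ | $Q$ believes statement $M$ |
| $\# M$ | Statement $M$ is fresh |
| $Q \triangleleft M$ | $Q$ receives statement $M$ |
| $Q \mid\sim M$ | $Q$ once said $M$ |
| $Q \Rightarrow M$ | $Q$ controls statement $M$ |
| $\langle M \rangle_L$ | Statement $M$ is combined with secret statement $L$ |
| $Q \overset{L}{\rightleftharpoons} K$ | $L$ is a secret known only to $Q$ and $K$ |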
Table 3. Scyther tool claim events.

| Claim Event | Description |
|---|---|
| Secrecy | Confirms that sensitive information remains confidential during communication |
| Alive | Verifies active participation of the communicating parties |
| Weakagree | Checks whether the communicating participant is an active user |
| Niagree | Ensures an implicit agreement between communicating participants |
| Nisynch | Ensures messages are exchanged in the proper order between authorized participants |
Table 4. Security features.
Security Features[18][19][22][24]Ours
Replay attack
MITM attack
Impersonation attack
DoS attack××
Mutual authentication××
Data verification×××
Data Privacy××××
∘: Support/resist the security features; ×: Does not support/resist the security features; −: Not applicable.
Table 5. Execution time of each cryptographic operation.

| Notation | Description | Execution Time |
|---|---|---|
| $T_{bp}^{G_m}$ | Bilinear pairing $\hat{e}: G_m \times G_m \to G_{mT}$ ($G_m$: multiplicative group) | 4.717 ms |
| $T_{e}^{G_{mT}}$ | Exponentiation in $G_{mT}$ | 1.990 ms |
| $T_{m}^{G_{mT}}$ | Multiplication/Division in $G_{mT}$ | 0.032 ms |
| $T_{m}^{G_m}$ | Multiplication in $G_m$ | 0.323 ms |
| $T_{a}^{G_m}$ | Point addition in $G_m$ | 0.013 ms |
| $T_{bp}^{G_a}$ | Bilinear pairing $\hat{e}: G_a \times G_a \to G_{aT}$ ($G_a$: additive group) | 3.023 ms |
| $T_{e}^{G_{aT}}$ | Exponentiation in $G_{aT}$ | 0.341 ms |
| $T_{m}^{G_{aT}}$ | Multiplication/Division in $G_{aT}$ | 0.027 ms |
| $T_{m}^{G_a}$ | Multiplication in $G_a$ | 0.172 ms |
| $T_{a}^{G_a}$ | Point addition in $G_a$ | 0.003 ms |
| $T_{m}^{Z}$ | Multiplication in $Z_p$ | 0.006 ms |
| $T_{a}^{Z}$ | Addition in $Z_p$ | 0.005 ms |
| $T_e$ | Modular exponentiation | 0.094 ms |
| $T_s$ | Symmetric key encryption/decryption | 0.001 ms |
| $T_h$ | SHA-256 hash function | 0.001 ms |
Table 6. Execution time comparison.

| Scheme | Operation count | Execution time (ms) |
|---|---|---|
| [18] | $3T_{bp}^{G_m} + T_{e}^{G_{mT}} + 16T_{m}^{G_m} + 6T_{m}^{Z} + 4T_{a}^{G_m} + 22T_h + \kappa(2T_{bp}^{G_m} + T_{m}^{G_{mT}} + 2T_{a}^{G_m} + T_h)$ | $9.493\kappa + 24.419$ |
| [19] | $8T_{bp}^{G_m} + 8T_{e}^{G_{mT}} + 4T_{m}^{G_{mT}} + 11T_{m}^{G_m} + 2T_{m}^{Z} + 6T_{a}^{G_m} + 5T_h + \kappa(2T_{bp}^{G_m} + T_{m}^{G_{mT}} + 4T_{a}^{G_m} + T_h)$ | $9.519\kappa + 57.432$ |
| [22] | $6T_{m}^{G_m} + 7T_{m}^{Z} + 4T_{a}^{Z} + \kappa(4T_{bp}^{G_m} + 2T_{m}^{G_{mT}} + 4T_{m}^{G_m} + 9T_{m}^{Z} + T_{a}^{G_m} + T_{a}^{Z})$ | $20.296\kappa + 28.346$ |
| [24] | $6T_{bp}^{G_m} + T_{e}^{G_{mT}} + 6T_{m}^{G_m} + 5T_{m}^{Z} + 2T_{a}^{G_m} + T_{a}^{Z} + T_e + T_s + 4T_h + \kappa(7T_{bp}^{G_m} + 3T_{m}^{G_{mT}} + 4T_{m}^{G_m} + 3T_{m}^{Z} + 2T_{a}^{G_m} + 2T_e + T_s + 2T_h)$ | $34.642\kappa + 32.39$ |
| Ours | $T_{bp}^{G_a} + T_{e}^{G_{aT}} + 38T_{m}^{G_a} + 11T_{m}^{Z} + 9T_{a}^{G_a} + 10T_{a}^{Z} + 16T_h + \kappa(3T_{bp}^{G_a} + T_{m}^{G_{aT}} + 2T_{m}^{G_a} + T_h)$ | $9.441\kappa + 11.708$ |
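As a rough illustration of how the linear forms in Table 6 are used, the following sketch recomputes each scheme's per-item coefficient (the factor multiplying $\kappa$) from the unit costs in Table 5 and then evaluates the total time for a chosen $\kappa$. The dictionary keys and function names are hypothetical labels introduced here. The constant terms are taken as reported in Table 6 rather than recomputed.

```python
# Unit costs in ms, copied from Table 5 (keys are hypothetical labels).
UNIT_MS = {
    "bp_Gm": 4.717, "m_GmT": 0.032, "m_Gm": 0.323, "a_Gm": 0.013,
    "bp_Ga": 3.023, "m_GaT": 0.027, "m_Ga": 0.172,
    "m_Z": 0.006, "a_Z": 0.005, "e": 0.094, "s": 0.001, "h": 0.001,
}

# Operation counts inside the kappa(...) factor of each Table 6 row.
PER_ITEM_OPS = {
    "[18]": {"bp_Gm": 2, "m_GmT": 1, "a_Gm": 2, "h": 1},
    "[19]": {"bp_Gm": 2, "m_GmT": 1, "a_Gm": 4, "h": 1},
    "[22]": {"bp_Gm": 4, "m_GmT": 2, "m_Gm": 4, "m_Z": 9, "a_Gm": 1, "a_Z": 1},
    "[24]": {"bp_Gm": 7, "m_GmT": 3, "m_Gm": 4, "m_Z": 3, "a_Gm": 2,
             "e": 2, "s": 1, "h": 2},
    "Ours": {"bp_Ga": 3, "m_GaT": 1, "m_Ga": 2, "h": 1},
}

# Constant terms in ms, as reported in Table 6 (not recomputed here).
REPORTED_CONSTANT_MS = {
    "[18]": 24.419, "[19]": 57.432, "[22]": 28.346, "[24]": 32.39,
    "Ours": 11.708,
}


def per_item_ms(scheme: str) -> float:
    """Cost in ms added by each additional data item (coefficient of kappa)."""
    return sum(n * UNIT_MS[op] for op, n in PER_ITEM_OPS[scheme].items())


def total_ms(scheme: str, kappa: int) -> float:
    """Total time in ms for kappa data items: coefficient * kappa + constant."""
    return per_item_ms(scheme) * kappa + REPORTED_CONSTANT_MS[scheme]


if __name__ == "__main__":
    kappa = 10  # illustrative number of data items (assumption)
    for scheme in PER_ITEM_OPS:
        print(f"{scheme}: {per_item_ms(scheme):.3f} ms/item, "
              f"{total_ms(scheme, kappa):.3f} ms total")
```

Because every scheme's cost is linear in $\kappa$, the per-item coefficient dominates the comparison as the number of data items grows, which is the trend Figure 8 plots.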
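Table 7. Time complexity comparison.

| Scheme | Computation: Encryption | Computation: Request | Computation: Verification | Communication: Access Key | Communication: Request | Communication: Ciphertext |
|---|---|---|---|---|---|---|
| [18] | $O(\lvert KW \rvert \cdot E)$ | $O(\lvert Q \rvert \cdot E)$ | $O(\lvert S \rvert \cdot P)$ | $O(1)$ | $O(1)$ | $O(1)$ |
| [19] | $O(\lvert KW \rvert \cdot P)$ | $O(\lvert Q \rvert \cdot M)$ | $O(\lvert Q \rvert \cdot P)$ | $O(1)$ | $O(\lvert Q \rvert)$ | $O(\lvert Q \rvert)$ |
| [22] | $O(\lvert A \rvert \cdot E)$ | NA | $O(\lvert A \rvert \cdot P)$ | $O(\lvert A \rvert)$ | NA | $O(1)$ |
| [24] | $O(\lvert KW \rvert \cdot P)$ | $O(\lvert Q \rvert \cdot H)$ | $O(\lvert Q \rvert \cdot P)$ | $O(\lvert A \rvert)$ | $O(\lvert Q \rvert)$ | $O(\lvert Q \rvert)$ |
| Ours | $O(1)$ | $O(1)$ | $O(\lvert S \rvert \cdot P)$ | $O(1)$ | $O(1)$ | $O(1)$ |

$\lvert KW \rvert$: the number of keywords associated with the ciphertext; $\lvert A \rvert$: the number of attributes in the access policy; $\lvert S \rvert$: the number of data items; $\lvert Q \rvert$: the number of keywords in the query set; P: pairing; M: multi-scalar multiplication; H: hash; E: exponentiation; NA: not applicable.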
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Oh, J.; Son, S.; Kwon, D.; Kim, M.; Park, Y.; Park, Y. Design of Secure and Privacy-Preserving Data Sharing Scheme Based on Key Aggregation and Private Set Intersection in Medical Information System. Mathematics 2024, 12, 1717. https://doi.org/10.3390/math12111717
