Article

A Blockchain-Based Privacy-Preserving Healthcare Data Sharing Scheme for Incremental Updates

1
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
2
Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250014, China
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(1), 89; https://doi.org/10.3390/sym16010089
Submission received: 17 November 2023 / Revised: 28 December 2023 / Accepted: 3 January 2024 / Published: 11 January 2024

Abstract

With the rapid development of artificial intelligence (AI) in the healthcare industry, the sharing of personal healthcare data plays an essential role in advancing medical AI. Unfortunately, personal healthcare data sharing is plagued by challenges like ambiguous data ownership and privacy leakage. Blockchain, which stores the hash of shared data on-chain and ciphertext off-chain, is treated as a promising approach to address the above issues. However, this approach lacks a flexible and reliable mechanism for incremental updates of the same case data. To avoid the overhead of authentication, access control, and rewards caused by on-chain data changes, we propose a blockchain and trusted execution environment (TEE)-based privacy-preserving sharing scheme for healthcare data that supports incremental updates. Based on chameleon hash and TEE, the scheme achieves reliable incremental updates and verification without changing the on-chain data. In the scheme, for privacy concerns, off-chain data are protected through symmetric encryption, whereas data verification, decryption, and computation are performed within TEE. The experimental results show the feasibility and effectiveness of the proposed scheme.

1. Introduction

In the era of the digital economy, data serve as a valuable asset and play a pivotal role in driving innovation and facilitating economic growth across various industries. Particularly in the medical field, with the widespread use of electronic medical records (EMRs), the popularization of smart medical devices, and the rise in health-tracking apps, individual users have accumulated a large amount of healthcare data [1]. Utilizing these data, pertinent organizations such as medical research institutes and data centers employ machine learning and deep learning techniques [2,3] to extract vital information, which can assist in diagnosing diseases, preventing epidemics, and further improving healthcare services [4]. Consequently, encouraging users to share their healthcare data with medical research organizations has emerged as a significant approach in fostering the development of medical research.
Most personal healthcare data is stored in centralized servers, such as cloud servers [5]. This storage and management approach has its limitations. If the server fails, it could have catastrophic consequences for data storage and data-sharing services. Additionally, outsourced storage creates an ambiguous attribution of ownership of the data. The cloud server operator could assert ownership or control of the data, rendering the user incapable of directly administering or manipulating their information. Moreover, the absence of access control of personal healthcare data can significantly escalate the risk of data leakage, misuse, and unauthorized access [6].
Blockchain is a promising solution for these challenges [7,8]. By adopting a distributed storage and consensus mechanism, blockchain eliminates the risk of a single point of failure associated with centralized storage, thereby enhancing the reliability and stability of data storage. The non-tampering and transparent features of blockchain ensure data integrity and traceability, effectively resolving the challenges of data management and control authority. Furthermore, the smart contract enables precise control of access and sharing privileges of sensitive personal medical data by defining protocols between communicating parties [9,10].
As blockchain-based sharing is applied more widely, existing systems still face several problems. One issue arises when individuals are diagnosed with chronic diseases like diabetes. As their condition changes over time, their personal case data grow. Initially, the case record includes the diagnosis, lifestyle advice, and prescribed medications. Subsequently, data such as follow-up visits, blood glucose monitoring, medication adjustments, and relevant test results (e.g., funduscopic examinations and renal function assessments) are added. This accumulation of data enables a comprehensive assessment and analysis of specific diseases and treatment outcomes, necessitating incremental updates to the on-chain data. However, the existing schemes [11,12], which store the hash of shared data on-chain and ciphertexts off-chain, lack a flexible and reliable incremental update mechanism, so a new record must be republished for every incremental update. Successive versions of the same case record contain largely identical information, leading to much redundant data in the blockchain [13,14]. Furthermore, data changes increase the overhead of the system in terms of authentication, access control, and rewards [1,15].
To address the above problems, we propose a privacy-preserving sharing scheme for personal healthcare data that supports off-chain incremental updates based on blockchain and TEE.
The major contributions of the proposed scheme are as follows.
  • We propose a blockchain and TEE-based healthcare data-sharing scheme that supports incremental updates. The scheme achieves incremental updating without changing the data on-chain, effectively reducing data redundancy and minimizing system overhead.
  • To ensure that shared and incremental update records are traceable, we construct a shared blockchain (SB) and an updated blockchain (UB) to store data-sharing and incremental update transactions, respectively. We design a data validation mechanism in TEE to ensure the quality of shared data.
  • We use symmetric encryption to protect data stored off-chain. We divide the shared transaction into two parts—on-chain state tracking and off-chain TEE execution—and complete the verification and computation of confidential data within the TEE. The security of personal case data storage and sharing is effectively guaranteed.
  • We developed a prototype of the scheme and conducted experiments to evaluate its gas consumption and computational overhead. Additionally, we compare it with execution in a non-TEE environment. The experimental results show the effectiveness and feasibility of our scheme.
The remainder of the paper is organized as follows. We begin by introducing some related work in Section 2. Section 3 is concerned with some preliminaries used in this paper. In Section 4, we describe the system model and design goals. In Section 5, the proposed system’s operational details are presented. In Section 6, we give a security analysis and implementation evaluation. Finally, this paper is concluded in Section 7.

2. Related Work

2.1. Storage and Management for Sharing Data

Deshmukh et al. [16] proposed a cloud-based electronic health record management system where patients and doctors can access relevant medical data using keys. However, outsourcing data storage removes the authority of patients to manage and control their data. To ensure personal ownership of shared data, Tian et al. [17] and Guo et al. [18] proposed data ownership management mechanisms for cloud servers. However, these schemes are feasible only if the cloud storage servers are fully trusted. To establish an open, transparent, and decentralized healthcare data management and sharing system, Azaria et al. [19] introduced blockchain technology. However, this scheme is not sufficient for big-data scenarios. Nguyen et al. [20] and Zhang et al. [21] used blockchain technology to provide secure and reliable data access and management. They used the on-chain ledger and off-chain storage (OLOS) model to improve system performance and scalability. However, these schemes were limited to static data storage and did not address the need for incremental updates of on-chain data.

2.2. Preserving Privacy for Medical Data Sharing

Kumar et al. [22] proposed a data-sharing protocol for industrial healthcare systems based on federated blockchain and deep learning. Belhadi et al. [23] proposed an algorithm for training complex health data in IoMT, an end-to-end intelligence framework based on blockchain technology, federated learning, and genetics. However, these unencrypted systems are susceptible to attacks on the network or system, leading to potential privacy violations. Zhang et al. [24] utilized blockchain to access and retrieve electronic health records, share them among authorized users, and implement symmetric key encryption for privacy protection. However, this scheme did not consider the storage capacity on the blockchain. Liu et al. [25] introduced a blockchain-based approach to maintain privacy while sharing electronic health records. The shared data are stored in the cloud, and the indexed information is stored on the blockchain. However, this solution ignores the design of patients sharing their medical data with third-party organizations. Li et al. [1] proposed a blockchain-based healthcare data-sharing scheme for the MIoT industry that rewards individuals who share their data. However, it only focuses on the behavioral privacy of the data subjects and does not include privacy protection for the shared data. Li et al. [26] combined homomorphic encryption (HE) with smart contracts to solve the privacy leakage problem in the health insurance claims process. However, homomorphic encryption is inefficient, and it is difficult to verify the calculation results.

3. Preliminaries

3.1. Trusted Execution Environment

A trusted execution environment (TEE) is a secure processing environment that runs on a separate kernel. Intel SGX [27] and ARM TrustZone [28] are well-known examples. A TEE guarantees the authenticity of the executed code, the integrity of the run-time state, and the confidentiality of the code, data, and run-time state stored in persistent memory [29]. Following [30], we give a simple formal definition of a TEE:
  • $(pk_{TEE}, sk_{TEE}) \leftarrow TEE.Init(1^{\lambda})$. The TEE takes the security parameter $\lambda$ as input and generates a public–private key pair $(pk_{TEE}, sk_{TEE})$, called the master public key and the master private key.
  • $eid \leftarrow TEE.Install(p)$. The TEE takes a program $p$ as input, stores it in an enclave, and returns the enclave identifier $eid$.
  • $(out, \rho) \leftarrow TEE.Resume(eid, f, in)$. The TEE takes the enclave identifier $eid$, a function $f$, and an input $in$, and outputs $out$, the result of running $f$ on $in$, together with an attestation $\rho$.
  • $\{0, 1\} \leftarrow TEE.verify(pk_{TEE}, p, out, \rho)$. The verification algorithm takes the master public key $pk_{TEE}$, the program $p$, the output $out$, and the attestation $\rho$, and outputs 1 if the attestation is correct and 0 otherwise.
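
The four algorithms above can be read as a small programming interface. The following Python sketch is one hedged way to mock it and is not the API of Intel SGX or any real enclave SDK. The names `MockTEE`, `tee_verify`, and the `measurement` argument (a byte string identifying the program $p$) are assumptions made for illustration. The attestation $\rho$ is modelled as a Schnorr signature over a toy group that is far too small to be secure.

```python
import hashlib
import secrets

# Toy Schnorr group, far too small to be secure: P_MOD = 2 * Q_ORD + 1 with both
# prime, and G = 4 generates the subgroup of order Q_ORD.
P_MOD, Q_ORD, G = 1019, 509, 4


def _challenge(r: int, msg: bytes) -> int:
    """Hash the commitment r and the message to a scalar modulo Q_ORD."""
    digest = hashlib.sha256(r.to_bytes(2, "big") + msg).digest()
    return int.from_bytes(digest, "big") % Q_ORD


def _statement(measurement: bytes, out) -> bytes:
    """Bytes bound by the attestation: program measurement plus output."""
    return measurement + b"|" + repr(out).encode()


class MockTEE:
    """Hypothetical stand-in for an enclave platform (not a real SGX API)."""

    def __init__(self):
        # TEE.Init: generate the master key pair (pk_TEE, sk_TEE).
        self._sk = secrets.randbelow(Q_ORD - 1) + 1
        self.pk = pow(G, self._sk, P_MOD)
        self._enclaves = {}

    def install(self, measurement: bytes, program: dict) -> str:
        """TEE.Install: store program p (name -> callable) and return eid."""
        eid = secrets.token_hex(8)
        self._enclaves[eid] = (measurement, program)
        return eid

    def resume(self, eid: str, f: str, inp):
        """TEE.Resume: run f on inp and return (out, rho)."""
        measurement, program = self._enclaves[eid]
        out = program[f](inp)
        k = secrets.randbelow(Q_ORD - 1) + 1
        e = _challenge(pow(G, k, P_MOD), _statement(measurement, out))
        s = (k + self._sk * e) % Q_ORD
        return out, (e, s)


def tee_verify(pk: int, measurement: bytes, out, rho) -> bool:
    """TEE.verify: check rho using only pk_TEE, the measurement, and out."""
    e, s = rho
    r = (pow(G, s, P_MOD) * pow(pk, -e, P_MOD)) % P_MOD  # equals g^k if honest
    return _challenge(r, _statement(measurement, out)) == e
```

A caller would install a program once, call `resume` for each request, and hand `(out, rho)` to any party that knows `pk`. A real platform would replace the toy signature with hardware-backed remote attestation.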

3.2. Chameleon Hash

Chameleon hash (CH) is a special type of cryptographic hash function proposed by Krawczyk et al. [31]. Briefly, a chameleon hash has a trapdoor, and the trapdoor holder can efficiently generate collisions. A chameleon hash is a tuple of efficient algorithms $CH = (CH.KGen, CH.Hash, CH.Adapt, CH.Check)$.
  • $(tk, hk) \leftarrow CH.KGen(1^{\lambda})$. The key generation algorithm $CH.KGen$ takes as input a security parameter $\lambda$ and outputs a private–public key pair $(tk, hk)$.
  • $(h, \mu) \leftarrow CH.Hash(hk, m)$. The hashing algorithm $CH.Hash$ takes as input a public key $hk$ and a message $m \in M$ and outputs a hash $h$ and its check string $\mu$.
  • $\mu' \leftarrow CH.Adapt(tk, h, \mu, m, m')$. The adaptation algorithm $CH.Adapt$ takes as input a private key $tk$, a triple of old hash $h$, check string $\mu$, and message $m$, and a new message $m' \in M$, and outputs a new check string $\mu'$.
  • $\{0, 1\} \leftarrow CH.Check(hk, m, h, \mu)$. The deterministic verification algorithm $CH.Check$ takes as input the public key $hk$ and a triple of hash value $h$, check string $\mu$, and message $m \in M$. It outputs 1 if $(h, \mu)$ is a valid hash–check string pair for the message $m$; otherwise, it outputs 0.
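
To make the trapdoor property concrete, the sketch below instantiates the four algorithms with the classic discrete-logarithm construction $h = g^{m} \cdot hk^{\mu} \bmod p$. This is an assumption for illustration: the text above does not fix a particular instantiation. The function names and the toy group parameters are hypothetical, and the group is far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters (insecure): P_MOD = 2 * Q_ORD + 1 with both prime, and G = 4
# generates the subgroup of order Q_ORD.
P_MOD, Q_ORD, G = 1019, 509, 4


def to_scalar(message: bytes) -> int:
    """Map a message (e.g. a ciphertext) to an exponent modulo Q_ORD."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % Q_ORD


def ch_kgen():
    """CH.KGen: return (tk, hk) with hk = g^tk."""
    tk = secrets.randbelow(Q_ORD - 1) + 1
    return tk, pow(G, tk, P_MOD)


def _evaluate(hk: int, m: bytes, mu: int) -> int:
    return (pow(G, to_scalar(m), P_MOD) * pow(hk, mu, P_MOD)) % P_MOD


def ch_hash(hk: int, m: bytes):
    """CH.Hash: return (h, mu) for a fresh random check string mu."""
    mu = secrets.randbelow(Q_ORD)
    return _evaluate(hk, m, mu), mu


def ch_check(hk: int, m: bytes, h: int, mu: int) -> bool:
    """CH.Check: True if (h, mu) is valid for m."""
    return h == _evaluate(hk, m, mu)


def ch_adapt(tk: int, h: int, mu: int, m: bytes, m_new: bytes) -> int:
    """CH.Adapt: return mu' such that (h, mu') is valid for m_new."""
    hk = pow(G, tk, P_MOD)
    if not ch_check(hk, m, h, mu):
        raise ValueError("old (h, mu, m) triple is not valid")
    # m + tk * mu = m_new + tk * mu'  (mod Q_ORD), solved for mu'
    delta = (to_scalar(m) - to_scalar(m_new)) * pow(tk, -1, Q_ORD)
    return (mu + delta) % Q_ORD
```

With these functions, a holder of `tk` can call `ch_adapt` repeatedly so that one fixed `h` validates a sequence of different messages, each under its own check string. A party without `tk` can still run `ch_check`.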

4. System Model

In this section, we give the system model of our proposed scheme. Based on the system model, we state the design goals and the adversary model.

4.1. System Architecture

To secure healthcare data sharing and facilitate reliable incremental updates without altering the on-chain data, we design a system architecture, as shown in Figure 1. The entities involved in the system include the data owner, data user, blockchain, TEE, smart contract, and cloud service provider. Each entity is described below.
  • Data Owner (DO): DO is the personal healthcare dataset owner who provides his case data to relevant medical research organizations in exchange for rewards. The DO establishes access policies and sales rules for his data. At the same time, DO holds his private key for incremental updates of his case data on-chain.
  • Data User (DU): DU is the demander of personal healthcare data (e.g., the medical research institution or the insurance company) willing to pay certain rewards for the right to use the health data.
  • Blockchain: Our system uses two blockchains: a shared blockchain (SB) and an updated blockchain (UB). The SB leverages the public blockchain to record and execute shared transactions, promoting transparency and openness. The UB uses the consortium blockchain to store off-chain incremental update records for traceability and validation. It is comanaged by medical research organizations, insurance companies, and regulatory agencies as trusted institutions.
  • TEE: The trusted, isolated execution environment that operates independently of untrusted operating systems. In this scheme, TEE acts as an off-chain trusted executor, providing an efficient, reliable, and secure execution environment for incremental updates and confidential data computation.
  • Smart Contract: We have designed various smart contracts to implement system functions, including data upload/incremental update smart contract dataUpload, access control smart contract accessControl, and key authorization smart contract keyAuthorization. These smart contracts are executed automatically when triggered.
  • Cloud Service Provider (CSP): An entity with extensive storage capabilities that stores the ciphertext of shared data.
The execution flow of the system, as depicted in Figure 1, can be described as follows.
DO encrypts his case data to be shared, stores the ciphertext in the CSP, and obtains the storage address (steps 1–2). Next, DO computes the chameleon hash of the ciphertext and releases the chameleon hash digest, address, and other necessary information to the SB. Meanwhile, the data access and sales rules are uploaded to the access control smart contract to generate the access policy (step 3).
During the incremental update process, DO submits a data update request to the smart contract within the SB (step 4). After authentication, the SB issues an off-chain execution license to the TEE (step 5). DO establishes a remote attestation channel with the TEE and transmits the update private key and the new ciphertext to the TEE (steps 6–7). The TEE executes the incremental update algorithm and generates the update identification (step 8). Upon completion of the update, the TEE reports the status back to the SB and uploads the update log to the UB (step 9).
During the data-sharing process, the DU retrieves the required data information from the SB and submits a data usage request to the SB (step ①). The access control smart contract accessControl determines whether the DU satisfies the access policy. If it does, it generates an off-chain TEE execution license (steps ②–③). The TEE obtains the shared data ciphertext stored in the CSP and the off-chain update logs in the UB. Subsequently, it establishes a remote attestation channel with DU (steps ④–⑤). The TEE performs data validation to obtain the qualified data, then decrypts the qualified data and performs data calculation to obtain the final result (steps ⑥–⑦). The TEE transmits the execution status to the SB. Once the majority of nodes have verified and signed the results, the TEE encrypts the computation results and delivers them to the DU (steps ⑧–⑨).

4.2. Design Goal

The goal of our scheme is to effectively address the issue of incremental updates of personal case data on-chain while ensuring secure data sharing. Consequently, we propose the following design goals.
  • Support incremental updates: Individual healthcare data accumulate over time and under changing conditions, making incremental updates necessary to maintain the integrity of case data in the chain. The scheme should be designed with a flexible and reliable incremental update mechanism to cope with the above scenarios.
  • Verifiable: A validation mechanism must be established to ensure that DO can perform incremental updates of his case data only in the TEE, and that DO cannot tamper with the original data or privately update case data without permission.
  • Security: The update private key, which is confidential data, requires a secure execution environment. In addition, as personal healthcare data can contain a significant amount of private information, it is critical to ensure that it is stored and shared securely.
  • Ownership: Only DO can make incremental updates to his case data on-chain, and no one else can tamper with the data of the DO. DO has the right to set access policies for his shared data.
  • Traceability: The system must ensure that off-chain incremental update records and shared transaction records are traceable.

4.3. Threat Model

In our scheme, the threat model we consider is as follows.
  • Honest but curious cloud: The CSP is seen as an honest but curious entity. It will honestly execute system commands and also be curious about the data stored.
  • Repudiation and fraud attack: Malicious DOs may attempt to upload false, redundant, or irrelevant data in pursuit of profit. At the same time, malicious DUs may conduct denial and fraud attacks by rejecting data usage records and denying payments.
In our model, we assume that the blockchain and the TEE are fully trusted, which means that adversaries cannot break them directly. Physical attacks on the TEE, such as side-channel attacks in which the attacker retrieves critical secrets, are not considered in this model.

5. Our Concrete Scheme

In this section, we provide specific details of our scheme, which can be divided into system initialization, data upload, TEE-based incremental updates, and data sharing. The symbols used in this section are summarized in Table 1.

5.1. System Initialization

The security parameter of the system is defined as $\lambda$. $G$ is a cyclic additive group of prime order $q$ with generator $P$, and $H: \{0, 1\}^{*} \rightarrow Z_q^{*}$ is a secure cryptographic hash function.
DO picks a random number $sk_{DO} \in Z_q^{*}$ and computes $pk_{DO} = sk_{DO} P$ as the public key. Thus, the key pair of DO is $(sk_{DO}, pk_{DO})$. In addition, the address of DO on the blockchain is associated with its public key.
TEE picks a random number $sk_{TEE} \in Z_q^{*}$ and computes $pk_{TEE} = sk_{TEE} P$ as the public key. Thus, the key pair of TEE is $(sk_{TEE}, pk_{TEE})$. Then, TEE executes the installation algorithm $TEE.Install(f) \rightarrow eid$ to deploy the program code for data update, data validation, and data computation. TEE encrypts the license information $eid$ and sends it to the smart contract of SB. The $eid$ is an identifier that lets the TEE verify whether a smart contract license is legitimate.
The blockchain deploys the smart contracts. The key authorization smart contract keyAuthorization picks a random number $sk_{asc} \in Z_q^{*}$ and computes $pk_{asc} = sk_{asc} P$ as the public key. Thus, the key pair of the key authorization smart contract is $(sk_{asc}, pk_{asc})$.

5.2. Data Upload

In the system, we use the on-chain and off-chain storage model. Consequently, the data upload process is divided into two stages: storage to CSP and upload to the blockchain. Figure 2 illustrates the logical flow of the data upload process. To ensure secure storage, we employ an efficient symmetric encryption algorithm to encrypt the plaintext data. We then encrypt the symmetric key using the public key of the smart contract keyAuthorization and the public key of the TEE to generate the ciphertext for key exchange. To regulate the storage of on-chain data, we design the storage transaction $T_s$ in the SB.

5.2.1. Storage in CSP

The following are the steps for DO to follow when uploading his healthcare data for the first time.
  • DO encrypts the healthcare data plaintext $MD_1$ with a symmetric key $key$ to obtain the encrypted data $ED_1$.
    $ED_1 = E_{sym}(key, MD_1)$
  • DO stores the healthcare data ciphertext $ED_1$ in CSP and obtains the off-chain storage address $addr$.

5.2.2. Upload to Blockchain

  • DO computes the chameleon hash of the ciphertext $ED_1$ and obtains $h$.
    $(h, \mu_1) \leftarrow CH.Hash(hk, ED_1)$
  • DO picks a random number $R_T$ and encrypts the symmetric key $key$ with the public key $pk_{TEE}$ of the TEE to obtain the ciphertext $T$. Then, DO picks a random number $R_A$ and encrypts the ciphertext $T$ with the public key $pk_{asc}$ of the smart contract keyAuthorization to obtain the ciphertext $EK$.
    $T = key + R_T \, pk_{TEE}$
    $EK = T + R_A \, pk_{asc}$
  • DO uploads the random numbers $R_T$ and $R_A$ to the smart contract keyAuthorization and sends the access details, such as the charge, to the smart contract accessControl in the following form, where $charge$ is the fee to be paid for using the data, and $StartTime$ and $EndTime$ delimit the period during which the data may be used.
    $Policy = \langle pk_{DO} \| charge \| StartTime \| EndTime \| Sig_{DO} \rangle$
  • The access control smart contract accessControl parses the $Policy$ and stores it as key–value pairs in the access control table.
  • DO uploads his healthcare data record to the blockchain SB, where it is stored as a storage transaction $T_s$. $AD$ is the address of DO on the SB, calculated from his public key, and $type$ is the type of healthcare data (e.g., heart data or blood pressure data).
    $T_s = \langle pk_{DO} \| AD \| (h, \mu_1) \| EK \| addr \| type \| TimeStamp \| Sig_{DO} \rangle$
Note that DOs are required to pay a deposit before uploading data in order to prevent them from posting meaningless data.
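
The two-layer wrapping of $key$ in this subsection, and the later unwrapping by keyAuthorization and the TEE, can be checked with a few lines of arithmetic. The sketch below is only an algebra check under a loud assumption: the elliptic-curve group is replaced by integers modulo a prime under addition, in which discrete logarithms are trivial, so it offers no confidentiality at all. The function names `keygen`, `wrap`, `unwrap`, and `demo` are hypothetical.

```python
import secrets

# Stand-in group: integers modulo the Mersenne prime 2**61 - 1 under addition.
# Discrete logarithms are trivial here, so this is an algebra check only.
N = 2**61 - 1
P = 5  # arbitrary non-zero element playing the role of the generator P


def keygen():
    """Return (sk, pk) with pk = sk * P."""
    sk = secrets.randbelow(N - 1) + 1
    return sk, (sk * P) % N


def wrap(value: int, r: int, pk: int) -> int:
    """Compute value + r * pk, as in T = key + R_T pk_TEE and EK = T + R_A pk_asc."""
    return (value + r * pk) % N


def unwrap(ciphertext: int, r: int, sk: int) -> int:
    """Compute ciphertext - sk * r * P, which removes one wrapping layer."""
    return (ciphertext - sk * r * P) % N


def demo():
    sk_tee, pk_tee = keygen()
    sk_asc, pk_asc = keygen()
    key = secrets.randbelow(N)  # symmetric key encoded as a group element
    r_t, r_a = secrets.randbelow(N), secrets.randbelow(N)

    t = wrap(key, r_t, pk_tee)   # DO: T = key + R_T pk_TEE
    ek = wrap(t, r_a, pk_asc)    # DO: EK = T + R_A pk_asc

    t_recovered = unwrap(ek, r_a, sk_asc)             # keyAuthorization
    key_recovered = unwrap(t_recovered, r_t, sk_tee)  # TEE
    return key, t, t_recovered, key_recovered
```

Each layer cancels because $pk = sk \cdot P$, so $r \cdot pk = sk \cdot r \cdot P$. The outer layer is removed on-chain and the inner layer only inside the TEE.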

5.3. TEE-Based Off-Chain Incremental Updates

As the condition changes, DO generates additional healthcare data. There is no need to create a new shared transaction. Instead, DO can initiate a data update request to the data upload smart contract and perform the incremental update in the TEE. The workflow between the entities in this section is illustrated in Figure 3, while the execution specifics for each entity are detailed below.
  • DO initiates a data update request to the smart contract dataUpload.
  • After verification, the smart contract dataUpload sends an off-chain execution license $License$ to the TEE.
    $License = \langle E_{asy}(eid, pk_{TEE}) \| E_{asy}(pk_{DO}, pk_{TEE}) \| T_s \| TimeStamp \| H(License) \rangle$
  • The TEE verifies the license information. If the verification is successful, DO and CSP establish the remote attestation channel with the TEE. DO and CSP transmit the update private key $tk$, the new ciphertext $ED_{j+1}$, and the original ciphertext $ED_j$ to the TEE via the remote attestation channel. The TEE retrieves the update records of $ED_j$ in UB. The TEE verifies the updated data, then executes the incremental update algorithm and generates the update identification for the current update. Finally, the new ciphertext $ED_{j+1}$ is stored in the CSP. Algorithm 1 illustrates the above process.
  • The TEE uploads the record of this incremental update to the blockchain UB. Transaction $T_u$ is generated in UB.
    $T_u = \langle pk_{DO} \| (\mu_j, \rho) \| TimeStamp \| Sig_{TEE} \rangle$
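
The $License$ carries a digest of its own fields, which lets the TEE detect a license that was altered in transit. The sketch below shows one possible reading, assuming that $H(License)$ is a SHA-256 digest over the other four fields and that the two $E_{asy}$ values are already available as opaque byte strings. The names `License`, `issue_license`, `license_intact`, and the `max_age` freshness window are hypothetical and not part of the scheme as described.

```python
import hashlib
import hmac
from dataclasses import dataclass


def license_digest(enc_eid: bytes, enc_pk_do: bytes, ts_id: bytes, timestamp: int) -> bytes:
    """SHA-256 over the length-prefixed license fields."""
    h = hashlib.sha256()
    for part in (enc_eid, enc_pk_do, ts_id, str(timestamp).encode()):
        h.update(len(part).to_bytes(4, "big"))  # length prefix keeps the encoding unambiguous
        h.update(part)
    return h.digest()


@dataclass(frozen=True)
class License:
    enc_eid: bytes     # E_asy(eid, pk_TEE), treated as opaque bytes
    enc_pk_do: bytes   # E_asy(pk_DO, pk_TEE), treated as opaque bytes
    ts_id: bytes       # identifier of the storage transaction T_s
    timestamp: int     # issue time in Unix seconds
    digest: bytes      # H(License) over the four fields above


def issue_license(enc_eid: bytes, enc_pk_do: bytes, ts_id: bytes, timestamp: int) -> License:
    digest = license_digest(enc_eid, enc_pk_do, ts_id, timestamp)
    return License(enc_eid, enc_pk_do, ts_id, timestamp, digest)


def license_intact(lic: License, now: int, max_age: int = 300) -> bool:
    """Check the digest and that the license is at most max_age seconds old."""
    expected = license_digest(lic.enc_eid, lic.enc_pk_do, lic.ts_id, lic.timestamp)
    fresh = 0 <= now - lic.timestamp <= max_age
    return fresh and hmac.compare_digest(expected, lic.digest)
```

A plain digest only detects modification. Deciding whether the license really originates from the smart contract relies on the encrypted $eid$, which the TEE decrypts and compares in Algorithm 1.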
Algorithm 1 TEE-based incremental updates
Input: $License$, $(h, \mu_1)$, $ED_j$, $T_u$, $ED_{j+1}$, $tk$, $hk$
Output: $(\mu_{j+1}, \rho)$
  $eid \leftarrow D_{asy}(E_{asy}(eid, pk_{TEE}), sk_{TEE})$
  if $verify(eid)$ then
    if $T_u == NULL$ then
      if $CH.Check(hk, h, \mu_1, ED_1)$ then
        $\mu_2 \leftarrow CH.Adapt(tk, h, \mu_1, ED_1, ED_2)$
      end if
    else
      get $(\mu_j, \rho)$ from $T_u$
      if $verify(\rho)$ then
        if $CH.Check(hk, h, \mu_j, ED_j)$ then
          $\mu_{j+1} \leftarrow CH.Adapt(tk, h, \mu_j, ED_j, ED_{j+1})$
        end if
      end if
    end if
  end if
  use attestation $\rho$ to attest to $\mu_{j+1}$
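
The control flow of Algorithm 1 can be sketched in Python as follows. The function `tee_incremental_update` is hypothetical. The chameleon hash and attestation primitives are passed in as callables so that any instantiation can be plugged in, with `ch_check(hk, m, h, mu)` and `ch_adapt(tk, h, mu, m, m_new)` following the argument order of the definitions in Section 3.2. License decryption is reduced to a boolean for brevity, which is a simplifying assumption.

```python
from typing import Callable, Optional, Tuple

Record = Tuple[int, bytes]  # (mu_j, rho) as stored in a UB transaction T_u


def tee_incremental_update(
    license_ok: bool,
    last_record: Optional[Record],
    h: int,
    mu_1: int,
    ed_old: bytes,
    ed_new: bytes,
    tk: int,
    hk: int,
    ch_check: Callable[[int, bytes, int, int], bool],
    ch_adapt: Callable[[int, int, int, bytes, bytes], int],
    verify_attestation: Callable[[int, bytes], bool],
    attest: Callable[[int], bytes],
) -> Optional[Record]:
    """Return (mu_{j+1}, rho) for a valid update, or None if any check fails."""
    if not license_ok:
        return None

    if last_record is None:
        # First update: the current check string is the on-chain mu_1.
        mu_old = mu_1
    else:
        # Later updates: take mu_j from UB and require a valid TEE attestation.
        mu_old, rho = last_record
        if not verify_attestation(mu_old, rho):
            return None

    # The old ciphertext must still match the unchanged on-chain hash h.
    if not ch_check(hk, ed_old, h, mu_old):
        return None

    mu_new = ch_adapt(tk, h, mu_old, ed_old, ed_new)
    return mu_new, attest(mu_new)
```

Returning `None` on any failed check means that no update record is produced, so an unverified ciphertext never receives an attested check string.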

5.4. Privacy-Preserving Healthcare Data Sharing

Smart contracts are utilized for automated access control in data sharing. Furthermore, to prevent privacy breaches during the data-sharing process, the execution of the sharing transaction is divided into two components: on-chain status tracking and off-chain TEE execution. The TEE releases only the computation results, so that the healthcare data are usable but not visible. The data-sharing process can be divided into access control with smart contracts and off-chain shared transaction execution with TEE.

5.4.1. Access Control with Smart Contracts

We use smart contracts to implement access control and key exchange. We design two smart contracts: the access control smart contract accessControl and the key authorization smart contract keyAuthorization. The accessControl smart contract manages the access control lists of all shared data and determines whether DU meets the case data access policy set by the DO. The keyAuthorization smart contract decrypts the key $EK$ corresponding to the ciphertext of the shared case data and generates the intermediate ciphertext $T$, which only the TEE can decrypt. Figure 4 illustrates the flow of data-sharing requests and access control. Based on Figure 4, we present a more detailed description of each process.
  • After retrieving the desired data from SB, DU submits a data usage request to the SB.
  • The accessControl smart contract verifies whether the DU meets the access policy according to the access control list. If it does, it calls keyAuthorization to decrypt the wrapped secret key $EK$ of DO and obtain the intermediate ciphertext $T$. The last two terms cancel because $pk_{asc} = sk_{asc} P$.
    $T = EK - sk_{asc} R_A P = T + R_A \, pk_{asc} - sk_{asc} R_A P = T + R_A \, pk_{asc} - R_A \, sk_{asc} P$
  • The accessControl smart contract generates the off-chain execution license of TEE. The chosen data transaction $T_s$, the intermediate ciphertext $T$, and the random number $R_T$ are transmitted to the TEE.
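
The policy check performed by accessControl in the steps above can be sketched as an off-chain Python model. The class and method names are hypothetical, the table is keyed by an arbitrary data identifier as an assumption, and verification of $Sig_{DO}$ is omitted. An on-chain implementation would be written in the contract language of the chosen blockchain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    pk_do: str        # public key of the data owner
    charge: int       # fee for using the data
    start_time: int   # start of the usage window (Unix seconds)
    end_time: int     # end of the usage window (Unix seconds)


class AccessControl:
    """Hypothetical off-chain model of the accessControl contract."""

    def __init__(self):
        self._table = {}  # data identifier -> Policy (the access control table)

    def register(self, data_id: str, policy: Policy) -> None:
        """Store the policy submitted by DO as a key-value pair."""
        self._table[data_id] = policy

    def authorize(self, data_id: str, payment: int, now: int) -> bool:
        """Return True if a DU request satisfies the stored policy."""
        policy = self._table.get(data_id)
        if policy is None:
            return False
        in_window = policy.start_time <= now <= policy.end_time
        return in_window and payment >= policy.charge
```

Only when `authorize` returns `True` would the contract go on to call keyAuthorization and issue the off-chain execution license.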

5.4.2. Off-Chain Shared Transaction Execution with the TEE

We designed a data validation mechanism within the TEE for the incremental update mechanism. Only validated data can be selected as qualified shared data to participate in data sharing and bring economic benefits to data owners.
  • Data verification and selection
    The TEE retrieves the shared data ciphertext in the CSP according to the data storage address $addr$ and retrieves the update record in the blockchain UB.
    Based on the update identification, the TEE first verifies that the update record includes the off-chain execution attestation $\rho$ of the TEE, and then verifies the chameleon hash value of the data to ensure consistency with the on-chain record. Only data that meet both criteria are selected as qualified data.
  • Data decryption and calculation
    DU and TEE establish a remote attestation channel. The DU transmits the training model $Model$ and its secure hash $H(Model)$ to the TEE. The TEE verifies the integrity of the model and executes the following steps.
    The TEE obtains the symmetric key $key$ by decrypting the ciphertext $T$. The last two terms cancel because $pk_{TEE} = sk_{TEE} P$.
    $key = T - sk_{TEE} R_T P = key + R_T \, pk_{TEE} - sk_{TEE} R_T P = key + R_T \, pk_{TEE} - R_T \, sk_{TEE} P$
    The TEE decrypts the validated ciphertext $ED_i$ using the symmetric key $key$ to obtain the plaintext $MD_i$.
    $MD_i \leftarrow D_{sym}(ED_i, key)$
    The TEE performs data calculations using the validated plaintext and the model $Model$ provided by DU. The final result is $res$.
    The TEE encrypts the result $res$ with the public key of DU and sends it to the DU. The TEE subsequently signals the completion of the off-chain transaction execution back to the blockchain SB. Upon validation from a majority of SB nodes, the rewards are transferred to the account of DO, and the transaction is recorded in the SB.
Algorithm 2 illustrates the above process. In addition, to prevent malicious DOs from uploading fake records, existing reputation mechanisms [32] can be used to identify dishonest DOs.
Algorithm 2 Off-Chain Shared Transaction Execution With TEE
verification():
Input: $\widetilde{ED} = \{ED_1, ED_2, \ldots, ED_l\}$, $\widetilde{T_u}$, $hk$, $h$
Output: Qualified array $QA[\,]$
  for $ED_i$ in $\widetilde{ED}$ do
    get $(\rho, \mu_i)$ from $\widetilde{T_u}$
    if $verify(\rho)$ && $CH.Check(hk, h, ED_i, \mu_i)$ then
      add $ED_i$ to $QA[\,]$
    end if
  end for
decryption and calculation():
Input: $T$, $R_T$, $Model$, $H(Model)$, $QA[\,]$
Output: $E_{asy}(res, pk_{DU})$
  if $verify(\bar{H}(Model), H(Model))$ then
    $key \leftarrow T - sk_{TEE} R_T P$
    for $ED_i$ in $QA[\,]$ do
      $MD_i \leftarrow D_{sym}(ED_i, key)$
      train $Model$ on $MD_i$
    end for
    obtain result $res$
    encrypt $res$ to obtain $E_{asy}(res, pk_{DU})$
  end if
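
The verification half of Algorithm 2, together with the model hash check, can be sketched in Python as follows. The function names are hypothetical. The chameleon hash check and the attestation check are injected as callables, with `ch_check(hk, m, h, mu)` following the argument order of the definition in Section 3.2. The update log is assumed to be a list aligned with the ciphertexts, and the model hash $H$ is assumed to be SHA-256. Decryption and training are left out because they depend on the chosen cipher and model.

```python
import hashlib
import hmac
from typing import Callable, List, Sequence, Tuple

Record = Tuple[int, bytes]  # (mu_i, rho) taken from a UB transaction T_u


def select_qualified(
    ciphertexts: Sequence[bytes],
    records: Sequence[Record],
    hk: int,
    h: int,
    ch_check: Callable[[int, bytes, int, int], bool],
    verify_attestation: Callable[[int, bytes], bool],
) -> List[bytes]:
    """Return the ciphertexts whose update record passes both checks."""
    qualified = []
    for ed, (mu, rho) in zip(ciphertexts, records):
        # Keep ED_i only if the TEE attested to mu_i and (h, mu_i) is valid for ED_i.
        if verify_attestation(mu, rho) and ch_check(hk, ed, h, mu):
            qualified.append(ed)
    return qualified


def model_matches(model_bytes: bytes, claimed_digest: bytes) -> bool:
    """Recompute the model hash inside the TEE and compare with H(Model) from DU."""
    actual = hashlib.sha256(model_bytes).digest()
    return hmac.compare_digest(actual, claimed_digest)
```

Ciphertexts that fail either check are silently dropped, so they neither enter the computation nor earn rewards for the data owner.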

6. Security Analysis, Comparison, and Implementation Evaluation

6.1. Security Analysis

  • Support incremental updates: Based on a hybrid storage model combining on-chain and off-chain methods, we utilize chameleon hash as the on-chain digest for personal case data. This particular hash function has the unique property that the trapdoor owner can efficiently find a collision $h(M) = h(M')$ for distinct messages $M$ and $M'$. As a result, DO can utilize his private key to update his case data off-chain while keeping the on-chain data unchanged. This helps avoid the overhead associated with data authentication, access control, and rewards caused by changes to the on-chain data. In addition, as a trusted off-chain extension of the blockchain, the TEE provides a secure and reliable environment for making incremental updates. At the same time, the TEE can provide a verifiable attestation of the operations it performs, ensuring that DO performs incremental updates of his case data in a compliant manner.
  • Security: First, DO encrypts its case data using a symmetric encryption algorithm and stores it in the CSP, preventing unauthorized users without the key from accessing the plaintext data. Second, the blockchain stores the chameleon hash of the case data, which can be used to verify that the data have not been tampered with illegally by malicious users. Additionally, private key verification for incremental updates on personal healthcare data, as well as the decryption and computation process of shared data, takes place solely within the TEE. The TEE guarantees that internal computation is concealed and internal data remain inaccessible from external sources to prevent privacy breaches.
  • Verifiable: After completing the incremental update of data, an identification μ for the update is generated. The TEE then generates an execution attestation ρ for μ and uploads the record of the incremental update to the blockchain SB. Before sharing the data, the TEE verifies that the execution attestation is correct by checking the record in the blockchain SB to prevent unauthorized private modifications by the DO. Then, the TEE verifies the correctness of the chameleon hash value for the data by using the update identification μ to prevent any tampering by DO with the original data. Only the data that pass the validation process are considered qualified shared data. This method ensures that DO can only update his case data incrementally in a compliant manner, and the off-chain incremental update records can be verified.
  • Ownership: The chameleon hash of the on-chain shared data is held personally by DO, and only the private key holder can find a hash collision. As a result, only DO has the right to incrementally update his on-chain data. In addition, smart contracts are utilized to implement access control and key exchange for shared data. DO submits access rules for his shared data to the smart contract and has the authority to establish the access policy for his case data. The smart contract grants data usage permission to authorized DUs following the access control policy set by DO, while unauthorized DUs are prohibited from utilizing the data. Our scheme effectively ensures that DO maintains ownership of the data they share.
  • Traceability: To ensure traceability of off-chain update records, we set up the blockchain UB to store off-chain incremental update records. In addition, the blockchain SB stores the transaction records of the sharing parties. These transactions are public and accessible to anyone, ensuring non-tampering and traceability of the shared records.
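
The collision property that underpins the incremental update and ownership arguments above can be illustrated with a toy example. The sketch below is not the RSA-based chameleon hash evaluated in Section 6.3.2; it assumes the simpler discrete-logarithm construction of Krawczyk and Rabin [31], with deliberately tiny and insecure parameters (`P`, `Q`, `G`) chosen only for readability. The function names `keygen`, `chameleon_hash`, and `find_collision` are hypothetical; `tk` and `hk` follow the notation of Table 1.

```python
import hashlib
import secrets

# Toy parameters (insecure, illustration only): P = 2*Q + 1 with P and Q prime.
P = 2039
Q = 1019
G = 4  # a square modulo P, hence a generator of the subgroup of order Q


def digest(msg: bytes) -> int:
    """Map a message to an exponent modulo Q."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % Q


def keygen():
    """Return (tk, hk): trapdoor (private) key and hash (public) key."""
    tk = secrets.randbelow(Q - 1) + 1  # uniform in [1, Q-1]
    hk = pow(G, tk, P)
    return tk, hk


def chameleon_hash(hk: int, msg: bytes, r: int) -> int:
    """h = G^digest(msg) * hk^r mod P."""
    return (pow(G, digest(msg), P) * pow(hk, r, P)) % P


def find_collision(tk: int, msg: bytes, r: int, new_msg: bytes) -> int:
    """Use the trapdoor to find r' with the same hash for new_msg.

    Solves digest(msg) + tk*r = digest(new_msg) + tk*r' (mod Q).
    """
    return (r + (digest(msg) - digest(new_msg)) * pow(tk, -1, Q)) % Q


tk, hk = keygen()
r = 123
h_old = chameleon_hash(hk, b"case data v1", r)
r_new = find_collision(tk, b"case data v1", r, b"case data v1 + new record")
h_new = chameleon_hash(hk, b"case data v1 + new record", r_new)
print(h_old == h_new)
```

Only the holder of `tk` can compute `r_new`, which is why the on-chain digest can stay fixed while the off-chain data changes. The modular inverse `pow(tk, -1, Q)` requires Python 3.8 or later.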
In our scheme, we assume that, before data sharing, patients have provided legal consent in accordance with applicable regulations and ethical guidelines, such as informed consent and privacy policies. We encourage ethics review and compliance with applicable ethics guidelines and regulations before our scheme is implemented.

6.2. Comparison

We compared our scheme with other schemes of the same type, and the results are presented in Table 2. From the table, we can see that scheme [33] does not address the security of shared data, while scheme [34] introduces ambiguity regarding the ownership of shared data. Furthermore, neither scheme addresses the issue of incremental updates. The results show that our scheme stands out by combining security, unambiguous data ownership, traceability, and support for incremental updates.

6.3. Implementation Evaluation

To verify the effectiveness of the proposed scheme, we conducted simulation experiments on a computer with Intel(R) Core(TM) i5-7500 CPU @ 3.40 GHz 3.41 GHz, with 8 GB RAM on an Ubuntu 18.04 LTS 64-bit operating system. We selected Ethereum as the blockchain platform for our scheme, utilizing the Ethereum client geth version 1.10.12-stable for blockchain construction. Intel SGX is employed as the trusted execution environment (TEE) in our experiments, and Solidity is used for coding smart contracts.

6.3.1. Performance of Smart Contracts

We implemented the data upload smart contract dataUpload, the access control smart contract accessControl, and the key authorization smart contract keyAuthorization using Solidity 0.8.0, and deployed the above smart contracts on Remix, which is an editor for modifying, debugging, and deploying smart contracts. According to the data format of each smart contract, we entered the test data, and the experimental results are shown in Table 3. The gas cost is an overhead that users need to pay, encouraging nodes to execute the contracts.
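
The Ether fees in Table 3 follow directly from the gas used and the gas price (1 ether = 10^9 gwei). The short sketch below reproduces that arithmetic for two Table 3 entries; the helper name `fee_in_ether` is hypothetical and is not part of the deployed contracts.

```python
from decimal import Decimal

GWEI_PER_ETHER = Decimal(10) ** 9


def fee_in_ether(gas_used: int, gas_price_gwei: int) -> Decimal:
    """Fee = gas used x gas price, converted from gwei to ether."""
    return Decimal(gas_used) * Decimal(gas_price_gwei) / GWEI_PER_ETHER


# Gas values and the 20 gwei gas price are taken from Table 3.
print(fee_in_ether(853_386, 20))  # dataUpload, Create
print(fee_in_ether(94_750, 20))   # dataUpload, Invoke
```

`Decimal` is used instead of floating point so that the computed fees match the tabulated values exactly.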

6.3.2. System Computational Overhead

  • Comparison between SHA-256 and Chameleon Hash
Experiments were conducted to compare the computation time of chameleon hash and SHA-256 for different data sizes. We implemented the RSA-based chameleon hash and SHA-256 using the basic library functions of the C language. We then used 1–6 MB of data to measure the time overhead of computing the chameleon hash and SHA-256 values, respectively. The data were read in chunks and the partial computations merged; each test was repeated 1000 times and averaged, and the results are shown in Figure 5. The experimental data show that, for the same data size, the computation overhead of the chameleon hash is slightly higher than that of SHA-256. However, the time overhead is still within feasible limits for users. It should be noted that the time difference between the two decreases gradually as the data size increases. We suspect that this is due to the increasing time overhead of reading data, which becomes a substantial portion of the total computation time.
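
The chunked measurement procedure described above can be sketched as follows. This is a minimal Python illustration of the SHA-256 side only, not the C implementation used in the experiments; the function names, the 64 KB chunk size, and the in-memory zero-filled test data are assumptions made for the example.

```python
import hashlib
import io
import time


def sha256_chunked(stream, chunk_size: int = 64 * 1024) -> str:
    """Hash a stream by reading fixed-size chunks and merging them incrementally."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()


def average_time(data: bytes, runs: int = 1000) -> float:
    """Average wall-clock seconds per hash over the given number of runs."""
    start = time.perf_counter()
    for _ in range(runs):
        sha256_chunked(io.BytesIO(data))
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    for size_mb in range(1, 7):  # 1-6 MB, as in the experiment
        data = bytes(size_mb * 1024 * 1024)
        # runs is reduced here to keep the demonstration short
        print(f"{size_mb} MB: {average_time(data, runs=10):.6f} s")
```

Absolute timings from such a script depend on the machine and interpreter and are not comparable with the figures reported for the C implementation.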
  • Comparison between SGX and non-SGX
Our version of the Intel SGX SDK and Intel SGX PSW is v2.10.100.2. Since SGX does not support common C/C++ cryptographic libraries, we implemented the data verification, AES-based data decryption, and data update functions using the basic C libraries, ran the code in the SGX environment, and compared it with the non-SGX environment.
First, we conducted tests to measure the time overhead of the data validation and calculation algorithm for data ranging from 50 to 300 KB in both the SGX and non-SGX environments. The experimental results are presented in Figure 6a. Analysis of the results indicates that the time overhead for data validation in the SGX environment is approximately $3 \times 10^{-3}$ s higher than that in the non-SGX environment. Secondly, we compared the decryption time of data in SGX and non-SGX environments for ciphertexts ranging from 10 to 50 KB. The experimental results are shown in Figure 6b. The results indicate that the time overhead of decrypting ciphertexts with the same data size is approximately $2 \times 10^{-3}$ s more in SGX than non-SGX environments on average. Finally, we conducted tests on incremental updating data and measured the time consumption with various original data sizes. We used 50 KB and 150 KB as new data, and according to the experimental results shown in Figure 6c, it was determined that the overhead for updating data is positively correlated with the size of the data. In addition, the execution time in the SGX environment was about 0.1 s slower on average than the execution time in the non-SGX environment.
In summary, although SGX adds some execution time, the additional overhead is very small, and we consider it acceptable given the improved security and reliability. In addition, according to our analysis of the working mechanism of SGX, most of the added time overhead is caused by moving data into and out of the SGX enclave. From the theoretical analysis, as the data volume increases, the time overhead of moving data into and out of the enclave accounts for a progressively smaller proportion of the total time overhead.

7. Conclusions

This paper proposes a blockchain and TEE-based privacy-preserving sharing scheme for healthcare data that supports incremental updates. The scheme utilizes chameleon hash and TEE to achieve reliable off-chain incremental updates and verification without modifying the on-chain data, which reduces the authentication, access control, and rewards overheads brought to the system by incremental updates of the same case data. In addition, considering privacy and security issues, we employ symmetric encryption to protect data off-chain and divide the sharing transaction into two parts: on-chain state tracking and off-chain TEE execution. TEE only returns the computation results of confidential data, ensuring that the data are available but not visible in the data usage process. We test the scheme regarding smart contract gas consumption and system computation performance and compare the system with the non-TEE environment for experiments. The experimental results demonstrate that our scheme achieves reliable incremental updates and verification of on-chain data within the tolerable time overhead of the system. Moreover, our scheme ensures the security of data sharing.
The implications of our work extend to the scientific community, policymakers, and health organizations. Researchers can build upon our findings to explore advanced encryption techniques and further incorporate additional security measures to enhance privacy in healthcare data sharing. Policymakers and health organizations can leverage our scheme’s insights to develop standardized frameworks that balance data privacy and collaboration, fostering innovation and improving patient care.
However, it is essential to acknowledge the limitations of our proposed scheme. How to make our scheme better able to handle the increasing amount of healthcare data and support highly concurrent users remains a challenge, and we will focus on this issue in our future work to achieve large-scale healthcare data sharing.

Author Contributions

Conceptualization, L.W.; data curation, X.L.; funding acquisition, L.W.; methodology, L.W. and X.L.; project administration, L.W.; software, X.L.; supervision, S.X. and S.Z.; validation, C.G. and Q.H.; writing—original draft, X.L.; writing—review and editing, W.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Shandong Provincial Key Research and Development Program (Grant Number 2021CXGC010107 and 2020CXGC010107), the National Natural Science Foundation of China (Grant Number 62102209), the Shandong Provincial Natural Science Foundation of China (Grant Number ZR2020KF035), and the New 20 project of higher education of Jinan, China (Grant Number 202228017).

Data Availability Statement

The datasets used during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, T.; Wang, H.; He, D.; Yu, J. Blockchain-based privacy-preserving and rewarding private data sharing for IoT. IEEE Internet Things J. 2022, 9, 15138–15149.
  2. Liu, T.; Siegel, E.; Shen, D. Deep learning and medical image analysis for COVID-19 diagnosis and prediction. Annu. Rev. Biomed. Eng. 2022, 24, 179–201.
  3. Bhattacharya, S.; Maddikunta, P.K.; Pham, Q.V.; Gadekallu, T.R.; Chowdhary, C.L.; Alazab, M.; Piran, M.J. Deep learning and medical image processing for coronavirus (COVID-19) pandemic: A survey. Sustain. Cities Soc. 2021, 65, 102589.
  4. Apell, P.; Eriksson, H. Artificial intelligence (AI) healthcare technology innovations: The current state and challenges from a life science industry perspective. Technol. Anal. Strateg. Manag. 2023, 35, 179–193.
  5. Tan, L.; Yu, K.; Shi, N.; Yang, C.; Wei, W.; Lu, H. Towards secure and privacy-preserving data sharing for COVID-19 medical records: A blockchain-empowered approach. IEEE Trans. Netw. Sci. Eng. 2021, 9, 271–281.
  6. Xi, P.; Zhang, X.; Wang, L.; Liu, W.; Peng, S. A review of Blockchain-based secure sharing of healthcare data. Appl. Sci. 2022, 12, 7912.
  7. Shamshad, S.; Minahil; Mahmood, K.; Kumari, S.; Chen, C.-M. A secure blockchain-based e-health records storage and sharing scheme. J. Inf. Secur. Appl. 2020, 55, 102590.
  8. An, H.; Chen, J. ElearnChain: A privacy-preserving consortium blockchain system for e-learning educational records. J. Inf. Secur. Appl. 2021, 63, 103013.
  9. Novo, O. Blockchain meets IoT: An architecture for scalable access management in IoT. IEEE Internet Things J. 2018, 5, 1184–1195.
  10. Hasan, H.R.; Salah, K. Blockchain-based solution for proof of delivery of physical assets. In Blockchain–ICBC 2018, Proceedings of the First International Conference, Held as Part of the Services Conference Federation, SBF 2018, Seattle, WA, USA, 25–30 June 2018; Proceedings 1; Springer International Publishing: Berlin/Heidelberg, Germany, 2018.
  11. Jayabalan, J.; Jeyanthi, N. Scalable blockchain model using off-chain IPFS storage for healthcare data security and privacy. J. Parallel Distrib. Comput. 2022, 164, 152–167.
  12. Wang, L.; Meng, L.; Liu, F.; Shao, W.; Fu, K.; Xu, S.; Zhang, S. A User-Centered Medical Data Sharing Scheme for Privacy-Preserving Machine Learning. Secur. Commun. Netw. 2022, 2022, 3670107.
  13. Nishi, F.K.; Shams-E-Mofiz, M.; Khan, M.M.; Alsufyani, A.; Bourouis, S.; Gupta, P.; Saini, D.K. Electronic healthcare data record security using blockchain and smart contract. J. Sens. 2022, 2022, 7299185.
  14. Benil, T.; Jasper, J. Blockchain based secure medical data outsourcing with data deduplication in cloud environment. Comput. Commun. 2023, 209, 1–13.
  15. Shrestha, A.K.; Vassileva, J.; Deters, R. A blockchain platform for user data sharing ensuring user control and incentives. Front. Blockchain 2020, 3, 497985.
  16. Deshmukh, P. Design of cloud security in the EHR for Indian healthcare services. J. King Saud Univ. Comput. Inf. Sci. 2017, 29, 281–287.
  17. Tian, G.; Ma, H.; Xie, Y.; Liu, Z. Randomized deduplication with ownership management and data sharing in cloud storage. J. Inf. Secur. Appl. 2020, 51, 102432.
  18. Guo, C.; Wang, L.; Tang, X.; Feng, B.; Zhang, G. Two-party interactive secure deduplication with efficient data ownership management in cloud storage. J. Inf. Secur. Appl. 2023, 73, 103426.
  19. Azaria, A.; Ekblaw, A.; Vieira, T.; Lippman, A. Medrec: Using blockchain for medical data access and permission management. In Proceedings of the 2016 2nd International Conference on Open and Big Data (OBD), Vienna, Austria, 22–24 August 2016; IEEE: New York, NY, USA, 2016.
  20. Zhang, G.; Yang, Z.; Liu, W. Blockchain for secure ehrs sharing of mobile cloud based e-health systems. IEEE Access 2019, 7, 66792–66806.
  21. Zhang, G.; Yang, Z.; Liu, W. Blockchain-based privacy preserving e-health system for healthcare data in cloud. Comput. Netw. 2022, 203, 108586.
  22. Kumar, R.; Kumar, P.; Tripathi, R.; Gupta, G.P.; Islam, A.K.M.N.; Shorfuzzaman, M. Permissioned blockchain and deep learning for secure and efficient data sharing in industrial healthcare systems. IEEE Trans. Ind. Inform. 2022, 18, 8065–8073.
  23. Belhadi, A.; Holland, J.O.; Yazidi, A.; Srivastava, G.; Lin, J.C.; Djenouri, Y. BIoMT-ISeg: Blockchain internet of medical things for intelligent segmentation. Front. Physiol. 2023, 13, 1097204.
  24. Zhang, X.; Poslad, S. Blockchain support for flexible queries with granular access control to electronic medical records (EMR). In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018; IEEE: New York, NY, USA, 2018.
  25. Liu, J.; Li, X.; Ye, L.; Zhang, H.; Du, X.; Guizani, M. BPDS: A blockchain based privacy-preserving data sharing for electronic medical records. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 9–13 December 2018; IEEE: New York, NY, USA, 2018.
  26. Li, F.; Liu, K.; Zhang, L.; Huang, S.; Wu, Q. EHRchain: A blockchain-based EHR system using attribute-based and homomorphic cryptosystem. IEEE Trans. Serv. Comput. 2021, 15, 2755–2765.
  27. Costan, V.; Srinivas, D. Intel SGX Explained. Cryptology ePrint Archive. 2016. Available online: https://ia.cr/2016/086 (accessed on 17 November 2023).
  28. Ngabonziza, B.; Martin, D.; Bailey, A.; Cho, H.; Martin, S. Trustzone explained: Architectural features and use cases. In Proceedings of the 2016 IEEE 2nd International Conference on Collaboration and Internet Computing (CIC), Pittsburgh, PA, USA, 1–3 November 2016; IEEE: New York, NY, USA, 2016.
  29. Sabt, M.; Achemlal, M.; Bouabdallah, A. Trusted execution environment: What it is, and what it is not. In Proceedings of the 2015 IEEE Trustcom/BigDataSE/Ispa, Helsinki, Finland, 20–22 August 2015; IEEE: New York, NY, USA, 2015; Volume 1.
  30. Mao, W.; Jiang, P.; Zhu, L. BTAA: Blockchain and TEE Assisted Authentication for IoT Systems. IEEE Internet Things J. 2023, 10, 12603–12615.
  31. Krawczyk, H.; Rabin, T. Chameleon Hashing and Signatures. Cryptology ePrint Archive. 1998. Available online: https://ia.cr/1998/010 (accessed on 17 November 2023).
  32. Kang, J.; Xiong, Z.; Niyato, D.; Xie, S.; Zhang, J. Incentive mechanism for reliable federated learning: A joint optimization approach to combining reputation and contract theory. IEEE Internet Things J. 2019, 6, 10700–10714.
  33. Shen, M.; Duan, J.; Zhu, L.; Zhang, J.; Du, X.; Guizani, M. Blockchain-based incentives for secure and collaborative data sharing in multiple clouds. IEEE J. Sel. Areas Commun. 2020, 38, 1229–1241.
  34. Huang, H.; Zhu, P.; Xiao, F.; Sun, X.; Huang, Q. A blockchain-based scheme for privacy-preserving and secure sharing of medical data. Comput. Secur. 2020, 99, 102010.
Figure 1. System architecture.
Figure 2. The logical flow of data upload.
Figure 3. The logical flow of TEE-based off-chain incremental updates.
Figure 4. The logical flow of access control with smart contracts.
Figure 5. SHA-256 and chameleon computational overheads.
Figure 6. SGX vs. non-SGX overheads.
Table 1. Notations and descriptions.
Notations | Descriptions
$sk$, $pk$ | The private–public key pair
$tk$, $hk$ | The chameleon hash private–public key pair
$MD$, $ED$ | Personal healthcare data in plaintext, ciphertext
$h$ | Chameleon hash value
$key$ | Symmetric encryption secret key
$addr$ | Storage address in CSP
$eid$ | License information of SGX
$E_{asy}(\cdot)$, $D_{asy}(\cdot)$ | Asymmetric cryptography
$E_{sym}(\cdot)$, $D_{sym}(\cdot)$ | Symmetric cryptography
$R$ | Random number
$\mu$ | Data update identification
$\rho$ | Attestation of SGX
$Model$ | The training model of DU
$H(\cdot)$ | Secure hash function (i.e., SHA256)
Table 2. Scheme comparison.
Scheme | Security | Ownership | Traceability | Supports Incremental Updates
Shen [33]××
Huang [34]××
Li [26]×
Wang [12]×
Ours
Table 3. Gas cost of smart contracts.
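Smart Contract | Operations | Cost/Gas | Gas Price | Eth Fee (Ether)
dataUpload | Create | 853,386 | 20 gwei | 0.01706772
dataUpload | Invoke | 94,750 | 20 gwei | 0.001895
accessControl | Create | 754,786 | 20 gwei | 0.01509572
accessControl | addPolicy | 45,492 | 20 gwei | 0.00090984
accessControl | verifyPolicy | 30,360 | 20 gwei | 0.0006072
keyAuthorization | Create | 553,387 | 20 gwei | 0.01106774
keyAuthorization | Invoke | 107,999 | 20 gwei | 0.00215998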
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, L.; Liu, X.; Shao, W.; Guan, C.; Huang, Q.; Xu, S.; Zhang, S. A Blockchain-Based Privacy-Preserving Healthcare Data Sharing Scheme for Incremental Updates. Symmetry 2024, 16, 89. https://doi.org/10.3390/sym16010089
