1. Introduction
Recent developments in information and communication technology (ICT) have revolutionised the whole healthcare system. The main aim of smart healthcare systems is to enhance the quality of healthcare and reduce related costs. The introduction of IoT in the healthcare system significantly increased the processing of large data for targeted services on a daily basis [
1]. The smart healthcare system collects the patient’s health information (PHI), including personal identifiable information (PII), and shares that information with relevant stakeholders to provide medical services to the patient [
2]. The idea of smart healthcare has been researched in many directions to provide healthcare services over a connected network. The smart ambulance, patient monitoring, nurse reservation, smart hospital, and blockchain-enabled security system are proposed to support the smart healthcare system in smart cities. The smart healthcare system architecture involves all the important stakeholders, including government, hospitals, medical research institutes, pharmaceuticals, clinics, and transport systems, to ensure timely services to citizens.
Figure 1 describes the information flow among the components/stakeholders in a smart healthcare ecosystem.
A smart healthcare system includes the Internet of Medical Things (IoMT) as an integral part of the healthcare system to provide data for better diagnosis. The IoMT is a connected infrastructure of smart medical devices that connect with smart healthcare via the internet. Combined with mobile applications, the IoMT enables the collection of patient data and its sharing with doctors or hospitals for the prevention of chronic issues, and the tracking, monitoring, and better control of diseases [
3]. The mismanagement of patient data has resulted in many kinds of data breaches. The Health Insurance Portability and Accountability Act (HIPAA) stated that 13,236,569 medical records were breached in 2018, double the number breached in 2017. Data breaches expose patients to economic threats, possible social stigma, and mental anguish (Health privacy project, 2007) [
4]. As the use of technology in the healthcare system increases, more information about patients’ health flows through the system, with or without the necessary regulation. Service providers collect a considerable amount of PII to provide user-centric services. In addition, the gathered information is shared with other stakeholders without the necessary consent of the patient. In response to increasing threats to the privacy of PII, the EU published the General Data Protection Regulation (GDPR) in 2016. In particular, Art. 25 of the GDPR, “Data Protection by Design and by Default”, is the most interesting and controversial article since it addresses the anonymisation mechanisms [
5]. Healthcare is one of the domains where large amounts of data are managed on a daily basis [
6]. Blockchain is able to maintain a large distributed database [
7]. Blockchain plays a crucial role in healthcare applications for improving medical record management, insurance claim processes, accelerating clinical/biomedical research, and advancing the healthcare data ledger [
8].
A smart healthcare system includes multiple stakeholders, as presented in
Figure 1, and involves multiple communications among these stakeholders. The most common transactions involve patients: hospitals, clinics, nursing homes, and pharmacies collect patients’ health data to provide medical services. Government institutes use patients’ information for health policy making, disease control (such as COVID-19), law enforcement, and national health guidelines. The transport system is also an active participant in a smart healthcare system to handle accidents and emergency services. Research institutes and industry associations also play important roles in a smart healthcare system to provide quality health services to patients.
In this context, this paper proposes a framework that is GDPR “privacy by design” and “Right to be Forgotten” compliant for data storage and sharing in the smart healthcare system using blockchain. It is highlighted that the GDPR is considered in this research as the regulation covers all personal information protection in any area of usage, while HIPAA covers information protection in the health insurance area only. Moreover, the limit on data access time for requesters is provided in the GDPR only.
Blockchain technology is a distributed security solution for healthcare applications. The main characteristics of the blockchain include decentralised data management, immutable audit trails, data provenance, robustness, availability, security, and privacy. These distributed, data-centric security characteristics make blockchain more suitable for healthcare applications than traditional databases. The proposed framework uses blockchain in a smart healthcare system to store non-personal information. Smart contracts are designed to share health data. Based on prior permission, the requestor can access information for a limited period. Moreover, information sharing is auditable because the transactions are stored on a blockchain. Despite the advantages mentioned above, blockchain has two major issues. First, data cannot be deleted after being uploaded to the blockchain because blockchain provides immutability. Second, when a large volume of data is saved on the blockchain, retrieval and search of information are inefficient. To deal with these issues, the InterPlanetary File System (IPFS), a distributed file system for data storage, is used in the proposed framework. The framework separates PII and PHI from public information and uses IPFS to store this critical private information off-chain. Only the owner/creator can access this personal information; other members cannot. Further, a protocol is embedded to delete the information on IPFS so that the framework complies with the GDPR’s “Right to be Forgotten” rule. The key contributions of the paper, in the direction of GDPR “Privacy by Design”, can be summarised as follows:
Firstly, a GDPR-compliant data storage and sharing framework using blockchain is proposed for smart healthcare systems while storing public information on blockchain and private information (such as PII, PHI, etc.) off-chain.
Secondly, IPFS is used to store private data in an encrypted form. Users and IoMT devices can upload data on IPFS, and only owners can access the data.
Thirdly, smart contracts are designed to share the off-chain data. Further, a proxy re-encryption network is used to share the encrypted data.
Finally, the proposed blockchain-based framework is implemented using permissioned blockchain along with IPFS and an oracle proxy re-encryption network, and evaluated in comparison with the state-of-the-art protocols, considering several metrics for blockchain-based healthcare systems.
The remainder of this paper is organised as follows:
Section 2 elaborates on related studies on GDPR and smart healthcare, and reviews the literature on data storage and sharing techniques using blockchain. The proposed blockchain framework to store and share PII and an elaboration and detailed description with scenarios are presented in
Section 3. Experimental results and their analyses are discussed in
Section 4. Finally, the conclusion and future works are presented in
Section 5.
3. The Proposed Blockchain-Based Data Storage and Sharing Framework
This section describes the detailed design of the proposed blockchain-based data storage and sharing of IoMT data in smart healthcare.
Figure 2 shows the main components of the proposed blockchain-based system, and its workflow is given in
Box 1. In the system design, IoMT device interaction differs from the interaction of other stakeholders (doctors, patients, hospitals, etc.) with the system. The participants of the blockchain network, except for the constrained IoMT devices, interact directly with the blockchain through a smart contract to read and write content on the blockchain. The IoMT devices do not interact directly with the blockchain; they interact via their owners, such as patients, doctors, and hospitals. Further, PII and PHI are stored off-chain on IPFS, and the hash of the data is stored on the blockchain to maintain immutability and transparency. The rest of the section explains each component of the system.
Box 1. Workflow of the proposed blockchain-based data storage and sharing architecture
The data subject (DS) encrypts the record with her symmetric key and uploads it to IPFS.
The hash of the record is pushed to the blockchain.
The data requestor (DR) requests the record from the data subject through a smart contract.
DS receives the request from DR to access the record.
DS grants the proxy re-encryption server permission to access the hash of record R on IPFS.
The proxy re-encryption server retrieves the encrypted R using the received hash of R.
The proxy re-encryption server generates a new key.
The proxy re-encryption server re-encrypts the encrypted record R and sends it to DR through a smart contract.
DR decrypts the record with their own symmetric key and obtains the record R.
3.1. IoMT Devices
The IoMT devices are implanted in patients to monitor their activity and health status. These devices have limited computing, network, and storage capacity, so they need a gateway to interact with the blockchain and IPFS. In the system design, the owner of the devices acts as the gateway, and the devices interact with the blockchain and IPFS with the help of owner nodes. The data collected or sensed by the IoMT devices are stored off-chain, and their hash is logged on-chain with a timestamp. When a participating node downloads data from the off-chain store, it can cross-verify the data against the hash downloaded from the blockchain. This enforces transparency, integrity, and trust in the system.
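The cross-verification step described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the dictionary and list are in-memory stand-ins for IPFS and the ledger, SHA-256 is an assumed hash function, and the hex digest is used as a simplified address (real IPFS content identifiers are encoded differently).

```python
import hashlib
import time

off_chain_store = {}  # stand-in for IPFS: address -> encrypted bytes
ledger = []           # stand-in for the blockchain: append-only log entries


def upload(encrypted_data: bytes) -> str:
    """Store data off-chain and log only its hash and a timestamp on-chain."""
    digest = hashlib.sha256(encrypted_data).hexdigest()
    off_chain_store[digest] = encrypted_data
    ledger.append({"hash": digest, "timestamp": time.time()})
    return digest


def verify(address: str) -> bool:
    """Recompute the hash of the downloaded data and compare it with the logged hash."""
    data = off_chain_store[address]
    logged = any(entry["hash"] == address for entry in ledger)
    return logged and hashlib.sha256(data).hexdigest() == address


address = upload(b"ciphertext of a sensor reading")
ok_before = verify(address)                   # data matches the logged hash
off_chain_store[address] = b"tampered bytes"  # simulate off-chain tampering
ok_after = verify(address)                    # mismatch is detected
```

Because only the digest is logged, the ledger entry reveals nothing about the record beyond its fingerprint, yet any change to the off-chain bytes is detectable.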
3.2. Participants
The proposed system is designed for a private blockchain network, so all participants must register in the system. Each participant is treated as a blockchain node. Participants are all the important stakeholders of smart healthcare, such as hospitals, doctors, patients, drug and medical device manufacturers, medical centres, and many others who connect with healthcare to get or provide services. GDPR terminology is used to explain the patient data storage and sharing scenario. The patient is a data subject and willingly provides their information to data collectors and the data analyser.
Table 2 shows the mapping of terminologies between the healthcare system and GDPR roles.
3.3. Off-Chain Data Storage (IPFS)
IPFS stores the IoMT stream data and other health data such as health reports, prescriptions from doctors, critical disease history, etc. Critical information that can be used to harm the subject, or PII, is stored off-chain, and other information can be stored on the blockchain directly. The data of IoMT devices and other information are stored on IPFS in encrypted form to provide secrecy. The data are distributed and, at the same time, secure, and only the data subject can access them. There are two motivations behind off-chain storage. The first is to make the system comply with the “Right to be Forgotten” rule under the GDPR and so protect the privacy of the data owner; the owner of the data also has full control over their data. The second is to reduce the cost of storing data on-chain. The IoMT devices first encrypt the data with their symmetric key and encrypt the symmetric key with the owner’s public key. Then, the encrypted key is stored on IPFS along with the encrypted data. Only the hashes of the off-chain data, not the actual data, are stored on the blockchain. The deletion of data on IPFS to exercise the “Right to be Forgotten” will be discussed in the implementation section. Further, the implementation section also discusses the communication between blockchain and IPFS and the communication of nodes/participants with blockchain and IPFS.
Figure 3 illustrates the sequence flow of uploading the data on blockchain and IPFS.
The device encrypts the sensed data D with its symmetric key before storing it on IPFS. Further, the owner of the device encrypts the symmetric key of the device with the owner’s public key, stores both the encrypted data and the encrypted key on IPFS, and returns the address to the device. Algorithm 1 defines the upload of data D by the device.
Algorithm 1: Record upload on IPFS by IoMT
1. Start
2. Device with data D and a symmetric key
3. Device encrypts D with the symmetric key
4. Encrypt the device key with the owner’s public key
5. Store (encrypted D, encrypted device key)
6. Return CID
7. Exit
Doctors, lab technicians, and other stakeholders also store their report R on IPFS in encrypted form, as described in Algorithm 2. In Algorithm 2, node means any stakeholder of the smart healthcare system who participates and is registered in the blockchain network.
Algorithm 2: Record upload on IPFS by Node
1. Start
2. Node with record R and a symmetric key
3. Node encrypts R with the symmetric key
4. Encrypt the node key with the public key
5. Store (encrypted R, encrypted node key)
6. Return CID
7. Exit
3.4. Blockchain
Blockchain is the core component of the proposed system. All stakeholders of smart healthcare register themselves in the smart healthcare blockchain network to get the services. According to the access control list, the participants, nodes, or stakeholders have access privileges on the blockchain. The access control list is also saved on the blockchain, and smart contracts are designed to implement the access control list. The blockchain provides data provenance, data tracking, logging of transactions, and accountability of actions performed by the participants.
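As a rough illustration of how a contract entry point might consult the access control list before acting, consider the sketch below. The role names, operation names, and the `invoke` helper are hypothetical and chosen only for this example; the framework's actual chaincode is written in Go and is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControlList:
    # role -> set of permitted operations (hypothetical roles and operations)
    rules: dict = field(default_factory=dict)

    def grant(self, role: str, operation: str) -> None:
        self.rules.setdefault(role, set()).add(operation)

    def is_allowed(self, role: str, operation: str) -> bool:
        return operation in self.rules.get(role, set())


acl = AccessControlList()
acl.grant("patient", "upload_record")
acl.grant("patient", "delete_record")
acl.grant("doctor", "request_record")


def invoke(role: str, operation: str) -> str:
    """Mimics a contract entry point: reject the call unless the ACL permits it."""
    if not acl.is_allowed(role, operation):
        raise PermissionError(f"{role} may not {operation}")
    return f"{operation} accepted for {role}"


result = invoke("doctor", "request_record")
```

In this sketch a doctor may request a record but a call such as `invoke("doctor", "delete_record")` raises `PermissionError`, mirroring the idea that privileges follow the list stored on the ledger.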
3.5. Smart Contract
The smart contracts are deployed on the blockchain, and participants of the blockchain run the smart contract to perform a specific task. The smart contracts are designed to provide authentication, authorisation, access control, and logging of the transactions on the blockchain. The smart contract offers three main functionalities. First, the owner of the IoMT device controls the device’s data and streaming using a smart contract. Second, the participants use a smart contract to share the data among the network. Third, the owner shares data access control of devices with authorised nodes. For example, a patient gives access to the thermostat to the doctor to check the temperature. The doctor provides the prescription or treatment to the patient based on IoMT device data.
3.6. Data Sharing
Data streamed from the IoMT devices and data uploaded by the participants are stored on IPFS in encrypted form. Only users authorised by the owner of the data can access the encrypted data, which preserves the privacy and confidentiality of the data. The next step is for the requester to obtain the decrypted data and analyse it to provide a health service. Smart contracts are designed and deployed on the blockchain to share the data with other participants. Further, a proxy re-encryption server is deployed to ensure the confidentiality and privacy of data. The re-encryption server also maintains a copy of the symmetric key of all stakeholders and IoMT devices.
Whenever a participant wants to access the encrypted data, she sends a request to the owner of the data or device (the data subject). If the data subject agrees to share the data, they send the address of the IPFS tuple and simultaneously send consent to the re-encryption server to generate the new re-encryption key along with an encrypted symmetric key. The re-encryption server then generates a new key with the private key of DS and the public key of the requester. The re-encryption server re-encrypts the encrypted key with the newly generated key and sends the re-encrypted key to the requestor. The re-encrypted key carries a timestamp, so it becomes invalid after a specified time. The requestor decrypts the re-encrypted key with her private key to obtain the symmetric key. After obtaining the symmetric key, the requestor decrypts the data and analyses it for the desired service.
Figure 4 describes the sequence flow for sharing the data in the system.
After the record is uploaded, only the owner who uploaded it controls the data and can share it with other participants. For example, a patient consults a doctor regarding a specific disease, and the doctor needs access to the patient’s health record. The doctor requests the patient to share the specific record. Algorithm 3 describes the sharing process of the record.
Algorithm 3: Record R sharing on IPFS by Node
1. Start
2. Participant P requests the tuple of record R from DS
3. If DS accepts the sharing request
4. Then generate re-encryption key N with the public key of participant P and the private key of DS for the time T
5. DS sends N to the proxy re-encryption server
6. Re-encryption proxy network re-encrypts the encrypted symmetric key with N
7. Send the re-encrypted key to participant P
8. Participant P decrypts the re-encrypted key with her private key
9. Decrypt the record R with the recovered symmetric key
10. End If
11. Else denied
12. Exit
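The time limit T attached to a sharing grant can be illustrated with a small sketch. It models only the expiry check, not proxy re-encryption itself; the `Grant` type, its field names, and the explicit `now` argument are assumptions introduced here for clarity and testability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    requester: str        # participant P
    record_address: str   # address of the record tuple on the off-chain store
    expires_at: float     # time after which the grant is invalid


def issue_grant(requester: str, record_address: str, now: float, duration_s: float) -> Grant:
    """Data subject accepts a request and bounds access to a period T = duration_s."""
    return Grant(requester, record_address, now + duration_s)


def may_access(grant: Grant, requester: str, record_address: str, now: float) -> bool:
    """Access is allowed only for the named requester, the named record, and before expiry."""
    return (
        grant.requester == requester
        and grant.record_address == record_address
        and now < grant.expires_at
    )


grant = issue_grant("doctor-1", "record-address-1", now=1000.0, duration_s=3600.0)
within_period = may_access(grant, "doctor-1", "record-address-1", now=2000.0)
after_expiry = may_access(grant, "doctor-1", "record-address-1", now=5000.0)
wrong_party = may_access(grant, "lab-7", "record-address-1", now=2000.0)
```

Passing the clock value in explicitly keeps the check deterministic; a deployed contract would instead rely on whatever notion of time its platform provides.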
3.7. Deletion at IPFS
To comply with the GDPR’s content-deletion requirement, the framework follows the “Proof of ownership” concept proposed in Politou et al. [2]. The proposed model saves the PII and PHI on IPFS (off-chain), as discussed earlier. Proof of ownership is attached to the file when it is uploaded to IPFS. When the owner of content wants to delete it permanently, she sends an erasure request to IPFS. The requester sends a hashed version of the file along with the content-dependent key d. The key d is derived from the master key that each user owns. This extension applies only to PII and PHI; other information can be shared without proof of ownership. The user appends the proof when she wants to delete specific information, which upholds the “Right to be Forgotten” rule of the GDPR.
The protocol is described in three phases: initialisation, record distribution, and record deletion, and five functions: KeyGen, RecKeyGen, ProofGen, GenDelRequest, and CheckProof.
The protocol runs these functions:
KeyGen: A random generation function which generates the master key for each user. The user keeps this key secret.
RecKeyGen (mKey, R) → rKey: A function that generates the record key rKey from the master key mKey and record R
. In the proposed work,
ProofGen (rKey, R) → P: This function generates the proof of ownership P for the record R using the record key rKey.
GenDelRequest (R, rKey) → (h, rKey): This function generates the delete request for record R and record key rKey. The resulting request contains the hash h of record R and the record key rKey as proof of ownership.
CheckProof (h, rKey) → (Pass, Fail): This function checks that the requester is the owner who pushed the record R with hash h. It passes only if the owner pushed the record.
In the initialisation phase, each user runs KeyGen(), generates the master key mKey, and keeps the key secret. The master key is different to the private key.
In the record distribution phase, the user stores a record on IPFS using the record key. The functions RecKeyGen and ProofGen are used to generate the record key for the record R and its ownership proof P.
To distribute the record, the user commits a tuple (R, P) that distributes the record on the IPFS network. The IPFS network uses a distributed sloppy hash table (DSHT) and the BitSwap protocol to distribute the record over the network [4]. To delete a record R that was uploaded earlier, the user uses GenDelRequest () to delete the record on IPFS. The user sends the request as the tuple d = (h(R), rKey) to the network. After receiving the request, a node in the network runs CheckProof () to locate h(R) and verify the ownership. If the request is valid, the node forwards the request to its neighbouring nodes to delete the record R by sending h(R). Algorithm 4 describes this process. It is highlighted that the proposed GDPR-compliant framework can also be utilised in other domains, such as a connected vehicle traffic environment [29,30,31] or an electric vehicle charging network environment [32,33,34], for driver and vehicle data protection. In the next section, the experimental implementation and analysis of results are presented.
Algorithm 4: Record delete by the owner
1. Start
2. Node receives delete request d = (h(R), rKey)
3. If the h(R) = h stored with the node then
4. If CheckProof (h, rKey) == true then
5. Delete R from the local store
6. Forward d to neighbour nodes using DSHT
7. End if
8. End if
9. Exit
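The five functions and Algorithm 4 can be sketched end to end as below. The text does not fix the concrete constructions, so several choices here are assumptions made only for illustration: SHA-256 as h, HMAC-SHA256 of h(R) under mKey as the record key, and the proof P as a hash of rKey bound to h(R). The `Node` class is an in-memory stand-in for an IPFS peer, and a direct neighbour list replaces DSHT routing.

```python
import hashlib
import hmac
import secrets


def h(record: bytes) -> bytes:
    return hashlib.sha256(record).digest()


def key_gen() -> bytes:
    """KeyGen: random master key, kept secret by the user."""
    return secrets.token_bytes(32)


def rec_key_gen(m_key: bytes, record: bytes) -> bytes:
    """RecKeyGen (ASSUMED construction): HMAC-SHA256 of h(R) under mKey."""
    return hmac.new(m_key, h(record), hashlib.sha256).digest()


def proof_gen(r_key: bytes, record: bytes) -> bytes:
    """ProofGen (ASSUMED construction): hash of rKey bound to h(R)."""
    return hashlib.sha256(r_key + h(record)).digest()


def gen_del_request(record: bytes, r_key: bytes) -> tuple:
    """GenDelRequest: d = (h(R), rKey)."""
    return (h(record), r_key)


class Node:
    def __init__(self) -> None:
        self.store = {}       # h(R) -> (R, P)
        self.neighbours = []

    def put(self, record: bytes, proof: bytes) -> None:
        self.store[h(record)] = (record, proof)

    def check_proof(self, record_hash: bytes, r_key: bytes) -> bool:
        """CheckProof: recompute the proof from rKey and compare with the stored P."""
        entry = self.store.get(record_hash)
        if entry is None:
            return False
        expected = hashlib.sha256(r_key + record_hash).digest()
        return hmac.compare_digest(expected, entry[1])

    def handle_delete(self, request: tuple) -> None:
        """Algorithm 4: delete locally if the proof passes, then forward d."""
        record_hash, r_key = request
        if record_hash in self.store and self.check_proof(record_hash, r_key):
            del self.store[record_hash]
            for neighbour in self.neighbours:
                neighbour.handle_delete(request)


m_key = key_gen()
record = b"encrypted lab report"
r_key = rec_key_gen(m_key, record)
node_a, node_b = Node(), Node()
node_a.neighbours.append(node_b)
node_b.neighbours.append(node_a)
for node in (node_a, node_b):
    node.put(record, proof_gen(r_key, record))

node_a.handle_delete((h(record), key_gen()))  # forged request with a wrong key
survived_forgery = h(record) in node_a.store
node_a.handle_delete(gen_del_request(record, r_key))
deleted_everywhere = h(record) not in node_a.store and h(record) not in node_b.store
```

Forwarding stops on its own: a node that no longer holds h(R) fails the first check and does not relay the request, so cyclic neighbour links do not loop.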
4. Experimental Results and Discussion
In this section, simulation experiments are performed to analyse the performance of the proposed blockchain-based privacy-preserving framework in a smart healthcare system. The simulation setting, parameters, and performance analysis of results are discussed for a smart healthcare environment. The discussion is divided into three parts: GDPR compliance, security and privacy analysis, and performance evaluation. The system uses the Hyperledger Fabric blockchain platform to build a distributed network, and smart contracts (called chaincode in Hyperledger Fabric) are implemented in the Go language and deployed on the blockchain. Further, IPFS is used to store the IoMT data and other data, as specified in the architecture. The first step in the implementation is creating the distributed network and the ledger. The network is built using five nodes, i.e., patient along with two IoMT devices, doctor, hospital, laboratory, and government.
The prototype was implemented to analyse the performance of the system. Several obstacles were encountered during its implementation. The major ones were poor performance metrics, such as long transaction verification time, high query time, and high search time, caused by the system configuration, e.g., random access memory (RAM) and read-only memory (ROM) size and processor capacity. The implementation was therefore moved to a system with larger RAM and ROM and a higher-capacity processor to obtain the desired results. The complexity of the programs (smart contracts and network design) was also crucial, and the programs were revised several times to lower their complexity, which directly affects performance. We therefore conclude that performance depends on system components such as the central processing unit (CPU) and RAM and ROM size, and on the complexity of the smart contract programs and other algorithms.
4.1. GDPR-Compliance
As discussed in Section 3, the data owner can perform CRUD operations on their data, and others cannot modify these rights. This provides the “Right to access” and “Right to rectification”. Smart contracts support the “Right to be informed” by implementing the request and access policy before health data are shared with other parties (who are not owners). The owner has full access to their own data and can manage data usage. This is how data owners exercise the “Right to restricted processing” and “Right to data portability”. The deletion of data on IPFS is defined to comply with the “Right to be Forgotten”: the owner can delete their own data on IPFS at any time, and the blockchain does not hold any personal data. The PII and other critical information, including PHI, are stored off-chain on IPFS. The data owner can delete the off-chain data via smart contract using a proxy re-encryption network. The transactions for uploading, deleting, or modifying data are recorded on the blockchain, and these transactions do not include the actual data.
4.2. Security and Privacy Analysis
Further, the proposed framework is analysed for security and privacy using the following parameters:
Authentication: The proposed solution is based on a private blockchain, which implies a secure identity-based solution. Participants must register on the network before uploading, accessing, or sharing health data. All participants/stakeholders of the healthcare system are verifiable through the standard identity management system, i.e., identities are managed by a trusted certificate authority (CA). All transactions are digitally signed at proposal time, so all identities in the private network of the healthcare system are authenticated. In the prototype implementation, the Hyperledger Fabric Membership Service Provider (MSP) is used for identity management.
Privacy by design: Article 25 of the GDPR requires privacy to be incorporated into the system from design time, and the proposed framework does so. The PII and other critical data are stored off-chain at the peer level on IPFS. The data owner can share the private data with other participants on the chain with the help of smart contracts. The proposed framework enables confidentiality through off-chain data storage and sharing through smart contracts. Only network participants have access to smart contracts and data transactions; this preserves confidentiality and privacy within the permissioned network.
Visibility and transparency: The proposed framework ensures data transparency. All the transactions related to the private data are visible to the data controller, and the data controller is responsible for informing the data subject about their private data.
Traceability: The proposed framework enforces trust in the system, as the data logs are stored on the blockchain ledger, and the blockchain provides immutability. The changes in data, data requests, data sharing, and other transactions related to data are stored on the blockchain and cannot be modified at any point. These logs can be used to track the data for forensics or other purposes.
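The tamper-evidence that makes such logs useful for tracing can be illustrated with a generic hash-chained log. This is a simplified teaching sketch of the general idea, not the ledger structure used by Hyperledger Fabric; the entry fields and the all-zero genesis value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first entry's predecessor


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def append(log: list, payload: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def is_intact(log: list) -> bool:
    """Walk the chain and recompute every hash; any edit breaks a link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"action": "upload", "record": "hash-of-R"})
append(log, {"action": "share", "record": "hash-of-R", "to": "doctor"})
intact_before = is_intact(log)
log[0]["payload"]["action"] = "delete"  # simulate tampering with history
intact_after = is_intact(log)
```

Note that the payloads carry only record hashes and actions, in line with the framework's rule that transactions do not contain the actual data.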
4.3. Performance and Scalability
The proposed framework is intended to serve many participants accessing data simultaneously, so its performance and scalability should be evaluated. The performance of a blockchain platform can be affected by many variables, such as transaction size, block size, network size, and hardware limits. In this subsection, the effectiveness and efficiency of the proposed framework are evaluated using Hyperledger Caliper, a tool that is also used to evaluate the performance of various private blockchain frameworks such as Hyperledger Fabric and Sawtooth.
Figure 5 and Figure 6 present the latency and throughput of the READ and WRITE operations on the blockchain. As presented in Figure 5, the WRITE operation takes more time than the READ operation. In a READ operation, users submit a query via a smart contract or another user interaction method and get the results, whereas in a WRITE operation, the user updates some data on the ledger or in off-chain storage. The consensus algorithm also runs to synchronise all nodes’ data after a defined time, so the whole process takes longer than a READ operation. Further, the READ and WRITE operation times increase with the number of nodes: as the number of nodes increases, the search time increases for the READ operation, and the WRITE time increases accordingly. The throughput of the network is calculated on 200 TPS and 500 TPS workloads on READ and WRITE operations, respectively. As the number of nodes increases, the throughput for READ and WRITE operations decreases because, as the number of transactions per second increases, the database also grows, so the system takes more time to process the operations.
The performance of READ and WRITE operations in the system is described in
Figure 7. The throughputs of READ and WRITE operations on different workloads of 100 TPS, 200 TPS, 300 TPS, 400 TPS, and 500 TPS were collected to calculate the performance. As shown in
Figure 7, the WRITE operation achieved its highest throughput, 157 TPS, on a 200 TPS workload; performance then degraded as the workload increased. In contrast, the READ operation performed best on the 500 TPS workload, reaching 498 TPS. After that, the performance started to degrade.
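One way to read these two peak figures is as the fraction of the offered workload that was actually served. The ratio below is a derived quantity introduced here for illustration; it is not a metric reported in the experiments.

```python
def served_fraction(achieved_tps: float, offered_tps: float) -> float:
    """Share of the offered workload that the network actually processed."""
    return achieved_tps / offered_tps


write_fraction = served_fraction(157, 200)  # WRITE peak: 157 TPS on a 200 TPS workload
read_fraction = served_fraction(498, 500)   # READ peak: 498 TPS on a 500 TPS workload

print(f"WRITE served fraction: {write_fraction:.1%}")
print(f"READ served fraction:  {read_fraction:.1%}")
```

At their respective peaks, READ serves nearly the whole offered load while WRITE serves roughly four fifths of it, which is consistent with the extra consensus and synchronisation work that WRITE operations involve.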
The smart contract is deployed on the blockchain, and nodes trigger queries through a smart contract for different functionality such as storing the IoMT data, deleting the record on IPFS, requesting to share the record, etc. An increment in the number of nodes increases the number of smart contract queries, leading to a slowing down of the execution process of queries with the existing computation resources. As shown in
Figure 8, the execution time of a smart contract increases with the increment in the number of nodes participating in the network.
4.4. Comparative Results Analysis
The first comparison concerns computation and storage throughput against Healthchain [22] and the traditional system. Healthchain takes 8.565 ms to generate the user transactions, and the traditional system takes about 130 ms. We analysed our system and found that it takes only 6.098 ms to generate the user transaction when only ten nodes are present in the network.
Figure 9 represents the same. As the number of nodes in the network increases, the time to generate the transactions also increases. When the number of participants in the system increases, the number of transaction requests per second also increases, and the additional transactions take longer to compute with the existing computation resources. This implies that the computation time and storage of the system increase with the number of participants. The computation time and storage were measured when the number of transactions in the network was 50, 100, 150, 200, and 250.
Figure 9 represents storage overhead as the number of transactions varies. The figure illustrates that as the number of transactions increases, storage overhead also increases.
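For a sense of scale, the reported transaction-generation times can be turned into relative reductions. The calculation below uses only the three figures quoted above and applies to the ten-node setting; it assumes the baselines were measured under comparable conditions, which the text does not state.

```python
def relative_reduction(baseline_ms: float, proposed_ms: float) -> float:
    """Fraction by which the proposed time is lower than the baseline time."""
    return (baseline_ms - proposed_ms) / baseline_ms


PROPOSED_MS = 6.098      # proposed framework, ten nodes
HEALTHCHAIN_MS = 8.565   # Healthchain
TRADITIONAL_MS = 130.0   # traditional system (approximate figure)

vs_healthchain = relative_reduction(HEALTHCHAIN_MS, PROPOSED_MS)
vs_traditional = relative_reduction(TRADITIONAL_MS, PROPOSED_MS)

print(f"Reduction vs Healthchain: {vs_healthchain:.1%}")
print(f"Reduction vs traditional: {vs_traditional:.1%}")
```

Since generation time grows with the number of nodes, these percentages should not be extrapolated to larger networks.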
Figure 10,
Figure 11 and
Figure 12 show that the proposed framework achieves high accuracy in terms of security, privacy, and authorisation. These parameters are observed and compared with previous work.
Figure 10 presents the accuracy of security of the proposed framework along with the security accuracy of the FHIR Chain (FHIRC) [
35], Attribute-Based Encryption (ABE) [
36], and the Decentralised Telemedicine Framework (DTF) [
27]. The results show that the proposed framework achieves high security accuracy (99.6 per cent) when the number of records is 600. With the same number of records, DTF achieves 99.5 per cent, ABE 98.5 per cent, and FHIRC only 97.3 per cent security accuracy. A maximum authentication accuracy of 99.5 per cent is observed when the number of records is 400, as presented in
Figure 12. Maximum privacy accuracy is 99.6 per cent when the number of records is 600, as presented in
Figure 11. The privacy and authentication accuracy are compared with FHIRC, ABE, and DTF. The results show that the proposed system has high accuracy compared to the state-of-the-art techniques.
5. Conclusions and Future Work
This paper presents a GDPR-compliant data storage and sharing framework using blockchain for the smart healthcare system, storing public information on the blockchain and private information (such as PII, PHI, etc.) off-chain. In the proposed framework, data control is given to the data owner. The medical data, including data from the IoMT devices, are managed (e.g., storage, option, sharing, and deletion) only by the data owner. IPFS stores the data off-chain to improve privacy and give the user control over deletion. The data are stored in encrypted form on-chain and off-chain. The system uses a proxy re-encryption network to share the encrypted data. The results show that the framework provides security and privacy for health data and gives control to the owner rather than the service provider. In addition, simulation results show that the proposed framework outperforms the state-of-the-art protocols. In future work, the framework can bring the patient into the blockchain network so that they can trace their own data at any time, which could be achieved by integrating a mobile API with the framework. Further, this paper does not include an emergency service scenario, which can be considered in future research.