1. Introduction
The blockchain can be regarded as a chain of blocks that have already been time-labeled and stay connected by employing encryption features (cryptographic hashes). A block can include the communications of various users, which are publicly offered to all users within the network. Furthermore, each block embraces the hash of the earlier block as well as the traditional transaction data, which results in the generation of an immutable and secured, append-only chain. There is a continuous increase in chain length with the addition of each new block to the end of the chain [
1].
The blockchain infrastructure is usually based on a peer-to-peer interaction, wherein both network users (transactions participators) as well as the blockchain “miners” (those that enable communications with a distributed ledger) are involved. The storing of the ledger is done in a decentralized node network, which is generated via cryptographic routes that have been computed by all the miners present in the network [
2]. Furthermore, highly reliable storage capabilities are provided by the blockchain ledger since digital signatures, consensus mechanisms, and hash chains are employed for its creation, because of such unconventional features, blockchain technology offers many facilities, including information traceability, security, and non-repudiation, while at the same time keeping all information within a public decentralized way, thus maintaining privacy [
3].
Figure 1 depicts the blockchain structure overview.
With regards to the access and management of authorizations, three kinds of blockchains can be considered: Unrestricted, consortium, and private. An unrestricted (or permissionless) blockchain must be highly distributed, wherein any person can contribute via a miner. Even though this warrants maximum immutability, the efficiency is limited because of the collaborative achievement of consensus with the help of the highly extended miner network. While in a private blockchain where miners process blocks for a single organization, efficiency can be maximized and immutability can be altered. A consortium (or federated) blockchain can offer productivity like a private one, while a moderately distributed miner network is integrated, including nodes offered by the nominated organizations [
4,
5].
The healthcare industry has various unique requirements in terms of privacy and security due to the need for additional legal compliance to safeguard the medical information of patients. In the age of the Internet, where cloud storage as well as the adoption of mobile health devices has facilitated the sharing of data and records, the number of malicious attacks and the possibility of private information becoming compromised during sharing have increased significantly. It has now become easy to obtain health information; patients use smart modules and consult multiple doctors, thus increasing the concerns regarding sharing and secrecy of this information. Furthermore, patients can get their medical history records easily through a widely accessible network without the need to request permission from every provider [
6]. Healthcare utilities face the need for unique requirements with regards to interoperability, the transfer of biomedical data, authentication, data sharing, and the considerations regarding e-Health (mobile health) [
7].
E-Health refers to the provision of health facilities by employing digital technology. In the e-Health area, research trends have been concentrated on employing electronic health records to perform tasks related to diagnoses and patient monitoring. Typically, a patient may be involved with various providers of healthcare services, including specialists, primary care physicians, and therapists [
8]. Thus, in the industry and research community, health record exchanging/sharing has been garnering attention as well as causing concern pertaining to privacy preservation and data security. Recently, the Blockchain technique has been employed in numerous applications to manage and control biomedical information in the internet of medical things (IoMT) for remote and urban areas [
9].
For a particular patient, a single disease could be produced by or may be linked to an additional pathema(s). In this case, reliability and interpretation accuracy are based on the patient’s supplementary health information provided to the doctor. The doctor may get some information pertaining to the related sickness by asking the patient [
10]. However, this process cannot always be deemed as effective for diagnosis for two reasons: i) If events occurred long before, there is a chance that the patient will not remember the details, for example, the medical examinations or medicines he/she had taken at that time, thus potentially impacting the treatment accuracy or diagnosis he/she receives; ii) most patients have limited medical knowledge and fail to describe the diagnosis or treatments they have received professionally, which can impact the current doctor’s judgment. Thus, the present physician may fail to deduce precise information when carrying out the diagnosis [
11].
To resolve the above-mentioned issues, we have put forward a new scheme for a secure and privacy preserving blockchain within the setting of a hospital/medical care unit. The biomedical data are stored in the private blockchain, which offers benefits in terms of better privacy preservation, a fast transaction, a better security performance, and cost-effectiveness. Moreover, hospitals have been set up in a way that allows for the formulation of a consortium blockchain, which gives the possibility of storing searchable indexes pertaining to data elements. Doctors can easily visit the corresponding hospital’s private blockchain to search the consortium blockchain for indexes pertaining to the data of interest as well as having access to original patient records. Moreover, the algorithm has the ability to digest different forms of biomedical data such as images, patient’s records, patient’s bio-signals, etc. Additionally, the system performance can be increased and make defense against a cyber-attack more robust by employing the adaptive technique for public encryption with keyword search (PEKS) and double key encryption.
2. Literature Review
Over recent years, there has been a peak in interest in the blockchain technique in terms of establishing privacy and security in e-Health because of the associated benefits in data managing, i.e., integrated autonomy and the immutability features of the blockchain.
Q. Xia et al. [
12] put forward a framework for data sharing, which is based on blockchain health data sharing to address challenges pertaining to access control for sensitive data stored in the cloud. The system was designed considering a permissioned blockchain, which gives access to those verified users who are invited. Yue et al. [
13] also put forward a three-layer system made up of data usage, data storage, and data managing layers. Unlike the aforementioned works in which the cloud is used as the storage infrastructure, this framework suggests that the secluded blockchain can also act as a cloud. In [
14], transactions are employed so as to conduct instructions, such as querying, storing, and data sharing. The authors integrated blockchain and off-blockchain storage to develop a private data administration platform focusing on confidentiality. The latest health care/biomedical applications pertaining to blockchain technologies have been studied by Kuo et al. [
15]. They also described the potential challenges and proposed solutions that integrate blockchain technologies into the health care/biomedical domains. A new blockchain-based scheme was put forward by Hussein et al. [
16] to secure and manage medical records. The proposed algorithm employs the genetic algorithm as well as discrete wavelet transform to increase the security and robustness.
The state-transition functions of the contract are employed to carry out policies, which allow for the imposing of data interchange only for those transactions that are legitimate. Instead of keeping the health record in the block, [
17] have added addresses pertaining to mobile devices and sensors to the healthcare blockchain with regards to pervasive social network (PSN) nodes. Similar to the Bitcoin approach, the block employed in [
18] consists of a structure based on a Merkle tree. The tree’s leaf nodes characterize patient data communications; resources adding to the official patient biomedical data record are also described. However, the actual record document is not included in these transactions, and in its place fast healthcare interoperability resources (FHIR) are referenced via uniform resource locators (URLs). Separate from the above-mentioned works that concentrate on health biomedical data sharing, [
19] and [
20] have concentrated on various other issues. In [
19], a blockchain stage architecture is proposed pertaining to clinical trials and precision medical treatment. This work evaluates a design related to the blockchain podium that is focused on the health field, especially precision medicine and clinical trials. There have been few studies that focus on key management schemes pertaining to the blockchain technique, and [
20] made an effort to develop a key discussion with regards to this area. To design a lightweight backup, it employs body sensor networks, which also help to establish an efficient recovery scheme pertaining to keys of the health blockchain. In 2016, a cloud-based electronic health record sharing system was put forward by Liu et al. [
21], which supports a fuzzy keyword search. In their search scheme, a medical practitioner (or other user) is allowed to quickly retrieve electronic health records (EHRs) based on the symptoms mentioned in the queried keywords. Implementing the attribute-based encryption (ABE) algorithm also helps to authenticate users with regards to their attributes. Guo et al. [
22] proposed an attribute-based encryption for the fine-grained authorization scheme pertaining to a relational database. Recently in 2017, C. Hu and P. Liu [
23] put forward a searchable symmetric encryption scheme that was based on blockchain technology, in which blockchain is employed as a peer-to-peer network for storing user data based on a pay-per-use approach. In the decentralized system, each of the users possesses equal status and he/she can request other users to store his/her data through the submission of a transaction.
Table 1 shows the summery of the literature review. From this, we found the following observations of interest:
The value of the various data combinations and data integrity are poorly considered; where most of the previous studies focused on using simulated or even simple individual data in the proposed schemes;
However, it is vital to replace lost or compromised keys; key management and access control are not strongly focused on or sometimes not even mentioned;
Most studies focused on the proof of the main concept without using real-world data, i.e., a simulated data type; however, it is essential to use the real-world data for the evaluation of the proposed scheme.
In this study, the proposed scheme was designed to overcome the mentioned challenges and gaps found in previous studies. Therefore, two main databases were used to validate the proposed scheme. Moreover, the use of multiple databases with different data structures combined with traditional biomedical patient’s record contributes to improving the effectiveness and achieving the objective of data integrity. On the other hand, the use of two blockchain structures assists the key management process and improves it.
3. Background
3.1. The Blockchain
The blockchain can be explained as a scattered database, which also includes an ordered records list, which are linked via a continuous chain of blocks [
24]. Normally, a block includes the encrypted key (hash value) of the earlier block, contributor signature, payload, and timestamp. The hash value from the earlier block imparts immutability from modification to the blockchain. The block’s payload changes with the application. For original data, it can be an address pointer or some other information or the content pertaining to the transaction. The generator and generation time pertaining to the block is demonstrated by the contributor signature and timestamp. Two important entities are involved in the blockchain network: Verifiers and miners. Miners signify the nodes that can create new blocks pertaining to the blockchain. Various scenarios can define diverse node types as miners. For instance, in the Bitcoin blockchain, the nodes showing proof of work are allocated as miners to maintain records pertaining to the transactions. The acceptance of new blocks is done only when verifiers confirm the validity of the new blocks. Mining can be defined as a set of processes involving verifying, generating, and accumulating new blocks into the blockchain. In the blockchain network, the consensus mechanism is crucial for ensuring the reliability and security of the mining processes. It helps to determine who is responsible for keeping records and how the new block’s validity can be checked. Since blockchain technology was originally employed for Bitcoin, transactions are broadly employed in the network to showcase new produced data. In our study, all transactions signify the secured indexes pertaining to the new emerging health records in the system. In general, blockchain technology can be divided into three groups: Private blockchain, permissionless blockchain (public blockchain), and consortium blockchain. In a permissionless blockchain, anybody from any part of the world can enter into the system, which allows access to data and the sending of transactions—for instance, the Bitcoin system. In a remote blockchain, the entry right is fully controlled by an organization. The management of a consortium blockchain is done by various organizations. With regards to our work, we have utilized the consortium blockchain and private blockchain in the e-Health system for managing and storing patient’s records, which aid in enhancing the diagnosis process.
3.2. Keyword Search
Boneh et al. [
25] proposed the public encryption with keyword search (PEKS) to perform searches with encrypted data maintained in asymmetric settings. Normally, from a message m, extraction of a keyword w is done. The keyword is encrypted with a public key and a trapdoor is employed for the search, which is produced by the unit that corresponds to the public key.
PEKS consists of four polynomial time randomized algorithms; KeyGen (λ), PEKS (pki, w), Trapdoor (ski, w’), and Test (pki, cw, Tw).
Afterwards, different keyword search algorithms were suggested to offer diverse searching functionality by integrating PEKS with other cryptographic primitives. PEKS schemes with a designed tester have been proposed to improve the search security [
26,
27]. When integrated with proxy re-encryption [
28], the mechanisms of the keyword search allow a delegate to search for keywords of interest from the data of the delegator. In [
26], to resolve the user privacy issue, keyword searches along with oblivious transfers were introduced. The data supplier is stopped from knowing the selected keywords as well as the corresponding cipher text due to the oblivious keyword search. Some applications need to search more than one keyword. The works of [
27,
28] demonstrate the public encryption key used with a conjunctive field keyword.
In this study, to achieve the needed security objectives, the cited PEKS schemes can be mixed together as explained. The use of multiple schemes enhances the keyword searching function, especially in multiple blockchain networks where the search requires longer periods.
3.3. Discrete Wavelet Transform
As is mentioned in the introduction, one of the essential objectives is to unify various patients’ data (text records with images) into a single scheme; during this, it is important to efficiently use the available resources and maintain the security. Biomedical images can be very large, making it difficult to fit them inside the scheme structure. To resolve this problem, biomedical image segmentation, involving discrete wavelet transform (DWT), was done. DWT was used to reduce the image distortion that might have occurred during the segmentation process. This procedure converts bigger images into smaller, more manageable portions [
29].
It is important to develop input images that function with the generated data chains. The DWT can be defined as a mathematical model to perform multi-resolution analysis [
30]. A common method to perform spatial frequency analysis is the wavelet transform. Equation (1) [
31] presents the 1D continuous wavelet transform (CWT).
where
y(
t) represents the input signal and φ defines the wavelet function;
a denotes the scaling parameter and
b signifies the translation parameter. The base functions pertaining to a DWT were captured with the help of sampling from CWT. Equation 2 demonstrates the
function.
1D wavelet transform was employed to derive 2D DWT by applying filters in columns and rows. The images were segmented by 2D CWT into four diverse sub-bands i.e., LH, LL, HH, and HL. Every sub-band signifies an image, with estimated horizontal, vertical, and diagonal details, correspondingly.
4. Materials and Methods
The computer clients and hospital servers are considered as being semi-trusted. They are deemed as being trustworthy when performing the protocol but not so when it comes to accessing or deducing the health information of a patient without proper authorization. In the public channel, there is a chance that outsider attackers could eavesdrop or interfere with the transmissions, such as encrypted patients data, secure indexes, and trapdoors. The computer clients are prohibited from colluding with the server that would allow them to infer the user’s real identity [
32,
33].
In this study, we aimed to achieve these main contributions: Access control as well as security issues. Since the record of the patient is a secrecy sensitive issue, it is crucial to maintain data refuge, including data auditing, data integrity and confidentiality, and access control. Normally, encryption and signatures ensure data integrity and confidentiality. It is crucial to maintain access control and data auditability to guarantee the monitoring of all data access activities under data generators (hospitals) and data owners (patients). These things can be achieved by employing cryptographic primitives via authentication, identification, and authorization.
Even though privacy preservation can be established partly through access control and data confidentiality, in e-Health systems, a user’s identity can also leak certain privacy-sensitive information. Thus, it is crucial to keep the identity information of the user secret. In general, it is important to establish unlinkability and anonymity to achieve identity privacy. Herein, unlinkability refers to the condition in which eavesdroppers cannot judge if two or more flows of records originate from the same source. In the proposed system, a patient authorizes the doctor to access his/her history records to enhance the diagnosis. In this process, the desirable content can only be accessed by the authorized doctor. The eavesdroppers fail to deduce the keywords while the physician also looks out for pseudo-identities.
4.1. Used Datasets
In this study, two main databases were used to validate the proposed scheme. The first database contained various collections of magnetic resonance (MR) images with individual information gathered from healthy cases and patient cases. This database is available on brain-development.org/ixi-dataset/ [
34]. Each case in this database was well documented and explained, and could be found as an attachment. It was used to test our proposed algorithm, which was designed to accept various data types because of this. Another common data type in hospitals and healthcare units is microscopic histological images. In this study, the images from
www.microscopyu.com/galleries/pathology were employed to validate the proposed scheme [
35]. Most patient’s records were from these data types, consequently, using them was very helpful to improve and get high detection accuracy.
4.2. Image Segmentation
The proposed image segmentation method shown in
Figure 2 was based on a previous study [
36]. Wavelet analysis was employed for the extraction of the original image as well as data fusion. Initially, various image modalities were segmented into four different channels, i.e., low horizontal and low vertical (LL1), while the detailed images included low horizontal and high vertical (LH1), high horizontal and low vertical (HL1), and high horizontal and high vertical (HH1) frequencies. In the second stage, every channel (segment) would be sub-segmented into four new channels, namely: HL2, LL2, HH1, and LH1 [
37]. This procedure continued for two more levels. Finally, the original image was sub-divided into small image groups as follows:
4.3. General Description of the Proposed Scheme Design
The blockchain architecture that forms the central part of our proposed scheme is described in
Figure 3. It mainly consists of a consortium blockchain and private blockchain. The central part represents the private blockchain while the boundary represents the consortium blockchain. The centre is the core chain, composed of the scheme supervisor, these members are responsible for giving authority to any new user located inside the outer or consortium blockchain.
This chain is open for any user who needs to provide detection and evidence services to join as a member node. The link nodes have the ability to communicate over a peer-to-peer network. The network (consortium blockchain) has the features of decentration and active change, which include the ways of networking and the communication mechanisms between nodes. The central nodes are located inside the private blockchain, while the other nodes can freely join or exit the consortium network. The information is sent from each node to its neighbor node, then the neighbor node forwards the received information to its own neighbor node. Through this technique, the information increasingly spreads throughout the network. The node first directs its own block mark (like an ID) to the neighbor node. If the neighbor node’s mark is less than the first node’s, the block needs to obtain the missing information. If the neighbor node mark is higher, the neighbor node receives the reverse block information. As a result, all nodes continuously exchange block information with their neighbor nodes.
The proposed scheme design is shown in
Figure 4. In this design, security issues are resolved using a new technique. In this technique, we employed the blockchain twice, the second blockchain algorithm was used to generate a secure sequence for the hash key generated in first blockchain algorithm. This means that the generated hash keys were double encrypted, thus improving the security index. Moreover, modifications in block structure were prepared, and new sections were added to fulfill the scheme adaptivity for various data types.
Figure 5 shows the proposed block structure. It contains two main parts: The block header and system payload. The block header consists of the block index, time stamp, previous block hash, next block hash, and nonce. The system payload consists of the encryption type, record type information, patient ID, and system data. Note that the information (of any data type) is stored inside the system data section on the system payload, and the other parts are used to form the security barrier that isolates the patient’s information. There are three entities in the system: Scheme supervisor, scheme facility providers or hospitals, and operators (patients).
Scheme supervisor: The entire system is supervised by the system manager. It is necessary that all the doctors and patients register with the system manager. This produces system bounds and maintains a public key tree for patients and doctors. Scheme facility providers: Medical service providers can be defined as medical institutions that offer medical services for the users. Typically, each hospital includes a server and numerous computer clients. A doctor commands each computer client to record health information of their respective patients. Then, the clients produce blocks pertaining to the patients’ health records and subsequently transmit the information to the hospital’s private blockchain. Furthermore, the nominated computer client also verifies the new incoming blocks. The server is also involved in the registration of the users by maintaining a register table for the doctors and users. It also assumes the responsibility of collecting new blocks within the private blockchain at specific intervals and the formulation of new blocks pertaining to the consortium blockchain. In addition, the chosen server is responsible for confirming new blocks within the consortium blockchain. Notably, the doctors who are outside the private blockchain are authenticated by the server so as to allow them to access users in the private blockchain. Operators: Users can refer to those beneficiaries (doctors and patients) who visit the hospitals to avail or offer services. They should register on the hospital server prior to visiting the doctor. After registration, each patient receives a token from the server. The patient should not disclose anything about the token and only show it to the doctor when visiting. The beacon serves as evidence related to the interaction between the patient and the doctor, providing authorization for the doctor who is generating the record for the patient. The obtained record is then stored within the hospital’s private blockchains. For the doctors, they could also offer computer clients. Consequently, the development of the consortium blockchain and private blockchain can be achieved with the existing infrastructures in the hospitals without the need to add more instruments. Notably, the installation of software on the hospital servers and computers is enough to establish a consortium blockchain and private blockchain.
The system was constructed from a polynomial based sequence. Assume
with size n. It is expressed by computing
and a polynomial is constructed as follows:
The scheme algorithm is expressed as follows:
Scheme algorithm |
Input: Various data types |
Output: Blockchain series |
The data type selection is made by doctor (physician) If data type is text → step 7 Do Image segmentation process Wavelet coefficients computation , Initiate image matrix vector Initiate the data vector Generate the hash keys for Blockchain#1 Indexing the Blockchain#1 requests Generate the Blockchain#2 hash keys While i Output: Blockchain series based on data input
|
5. Results
In this study, 1749 images and 859 patients’ records were used to verify the scheme algorithm activity. An eight computer combination was set up to act as a hospital network; three of them were core I5 CPU, 8 Gbyte RAM, while the others were core I3 CPU, 8 Gbyte RAM. The PCs were located in different places within the same building and connected within the same backbone. The blockchain environment was achieved using Octave software under the Ubuntu 16.04 operating system.
Figure 6 shows the types of data that are most frequently used in hospitals (microscopic histological images, magnetic resonance imaging (MRI) images, and patient descriptive records). The operation process ran for 210 hours continuously receiving different requests from different users.
Figure 7 shows the generated blocks with data content, showing the image data after segmenting and inserting into generated blocks. The acquiring time for requesting and searching a certain patient record is depicted in
Figure 8, the selection of the requested number started at 50 request/second and increased gradually to 400 request/second where it was limited by the number of PCs. The response latency increased gradually with the increase in the request number. This is normal because of the extra functions needed in each transaction. It can be seen that when the requests were about 200, the total latency was below 100 ms. However, it became 750 ms when the number of requests reached 400. The increment ratio is acceptable when compared with the complexity of the implemented system (linking between two blockchains). The system scalability was maintained and depended upon the number of private miner nodes. On the basis of the obtained results, it was established that the suggested algorithm solution performs a very good paring regarding the distribution of the other nodes that were linked to them. As the Chord algorithm offers access to nodes pertaining to replicated content, it also enables peers to access other nodes with replicated data regardless of persisting communication issues in a few of the nodes. As a result of this, our solution’s overall operation was not hindered.
Table 2 shows the average resource usage of the entire designed network. It can be seen from
Table 2 that more memory would be recommended for when the physical memory modules are fully loaded. This can be expected in cases where biomedical images (such as DICOM images) occupy a large amount of space. Even though the location address was provided, replications pertaining to these files in the blockchain were not anticipated.
To validate the scheme robustness, the publishing time cost of a transaction (blockchain publishing transactions) was calculated. We initially calculated the size of the transactions in the blockchain, which were by 128 bytes for the private blockchain and 512 bytes for the consortium blockchain. We then set the number of transaction for different cases as n = 500, 1000, 1500, and 2000 to study the scheme under various payloads.
Table 3 depicts the statistical analysis for publishing time cost (transactions). In the table, the average time cost changed slightly with the transaction numbers for both the private and consortium blockchain. However, this difference increased at higher transaction numbers. The reasons for this can be explained by studying the transaction publishing procedure. It includes stuffing the transaction into a package, signing the package, and distributing it. After being confirmed by the other nodes, the transaction is accepted and added to the chain. For each transaction, these steps yield a basic time cost, which is varying for each case. The time cost rises gradually with the progress of the package where the signature time cost, transmission, and confirmation procedures rise with the package length. This result confirms that the setting of keyword size should not be too large to control the transaction efficiency and achieve the availability of the system.
6. Conclusions
The main aim of this study was to provide a state-of-the-art solution for managing different biomedical data types in a scheme that improves data integrity and makes the working system more reliable and adaptable. Such a scheme must be evaluated and tested under real-environment conditions, as biomedical data is an extremely sensitive issue.
In this study, a new scheme for managing biomedical data was presented. Two blockchains were employed to improve the system security and increase the overall stability. Furthermore, the combination of various data types contributed to an overall reduction in system costs whilst also improving the searching and processing time. To validate the data integrity, two real databases with a traditional hospital patients’ record were employed. Moreover, for the blockchains, the conformance proof could be well defined and planned as an accord mechanism wherein authenticated blocks were constructed. With regards to the blockchains, public key encryption with a keyword search was proposed for the records sharing protocol. After receiving hatches from the patient, the authorization of the physician was completed allowing access to the desirable biomedical records, thus enhancing their diagnosis ability. The obtained results indicated that the scheme was fast, had the ability to work under dense payloads, and thus performed well enough to be used within a large hospital environment.
For potential future work, the proposed technique is promising in respect to improving the security of the Internet of Things’ (IoT) connections. It has the ability to work with a fast response rate and a low latency. The specialized blockchain features (decentralized, integrity, and anonymity) can further be used as an optimum security solution.