Big Data Privacy Protection and Security Provisions of the Healthcare SecPri-BGMPOP Method in a Cloud Environment

Kuttiyappan, Moorthi; Appadurai, Jothi Prabha; Kavin, Balasubramanian Prabhu; Selvaraj, Jeeva; Gan, Hong-Seng; Lai, Wen-Cheng

doi:10.3390/math12131969


by Moorthi Kuttiyappan ¹, Jothi Prabha Appadurai ², Balasubramanian Prabhu Kavin ³, Jeeva Selvaraj ⁴, Hong-Seng Gan ⁵ and Wen-Cheng Lai ⁶,*

¹ Department of Computer Science and Engineering, Dr.N.G.P. Institute of Technology, Coimbatore 641048, Tamil Nadu, India

² Department of CSE (Networks), Kakatiya Institute of Technology and Science, Warangal 506015, Telangana, India

³ Department of Data Science and Business Systems, College of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Chengalpattu District, Kattankulathur 603203, Tamil Nadu, India

⁴ Department of Information Science and Engineering, Jain Deemed to Be University, Global Campus, Bangalore 560069, Karnataka, India

⁵ School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi’an Jiaotong-Liverpool University, Suzhou 215400, China

⁶ Department of Electrical Engineering, Ming Chi University of Technology, New Taipei City 24301, Taiwan

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(13), 1969; https://doi.org/10.3390/math12131969

Submission received: 2 April 2024 / Revised: 5 June 2024 / Accepted: 11 June 2024 / Published: 25 June 2024

(This article belongs to the Special Issue Big Data-Driven Responsible Edge Intelligence: Privacy, Security, Robustness, and Potential)


Abstract: Healthcare is one of the fastest-growing industries, and the enormous amount of data it generates requires extensive cloud storage. The cloud may offer some protection, but there is no assurance that data owners can rely on it for security and privacy services. It is therefore essential to offer security and privacy protection. However, maintaining privacy and security in an untrusted green cloud environment is difficult, so the data owner should retain complete control over the data. A new method, SecPri-BGMPOP (Security and Privacy of BoostGraph Convolutional Network-Pinpointing-Optimization Performance), is proposed that handles the numerous problems relating to security and privacy protection in several steps. First, the Boost Graph Convolutional Network Clustering (BGCNC) algorithm, which reduces computational complexity in terms of time and memory, was applied to the input dataset to begin the clustering process. Second, magnify pinpointing-based encryption used a portion of the identification bit string to generate a safe key, which avoids identity leakage even if a rival or attacker breaks the key or the encrypted data. Finally, to determine the accuracy of the method, an optimal key was created using a meta-heuristic algorithmic framework called Hybrid Fragment Horde Bland Lobo Optimisation (HFHBLO). The proposed method is kept in a cloud environment, allowing analytics users to utilise it without risking their privacy or security.

Keywords: big data; security; privacy; Boost Graph Convolutional Network Clustering algorithm; magnify pinpointing based encryption approach; Hybrid Particle Swarm; Grey Wolf Optimization

MSC: 37M10; 68T09

1. Introduction

The rapid expansion of the Internet and of big data analysis, which have gradually permeated the medical sector, has produced large volumes of electronic medical records, hospital information system data, medical images, and other types of data. Experts in related sectors predicted that the amount of data in the medical industry would be 44 times larger in 2020 than it was in 2009 [1]. Improved designs for scalable and secure Internet of Things (IoT)-based healthcare data transmission have been built on routing protocols. A range of wearables and other IoT devices gather health data. The medical IoT network collects physiological data, which is then sent to the healthcare big data centre for archival and disease diagnosis. The medical file must be encrypted before transmission in order to preserve patients’ privacy and stop third parties from eavesdropping on private communications. The patient applies an access policy to the protected data to define the permissible attributes and relationships. Only users possessing the appropriate attribute secret keys, such as a doctor, nurse, anaesthetist, or a member of the patient’s family, are allowed to decrypt the ciphertext. This technique is called attribute-based encryption [2]. Information security and protection have evolved into a fundamental problem that affects numerous cloud applications. The ease with which cloud administrators can access sensitive data is one of the major concerns with regard to data security and privacy. This concern sharply increases client anxiety and hinders the adoption of distributed computing in many industries, including the financial sector and governmental and administrative entities. The conventional big data approach does not address the security of structured or unstructured data. In cases where the risk of disclosing personal information is real, it also has to combine the concepts of insurance and security. The enormous combination of organised and big data necessitates new models for improving safety and security. The growing issue of data security in big data is of great interest to researchers [3].

A patient with unclear symptoms may be diagnosed and treated in many hospitals, and various medical records may be kept, according to the current medical system [4]. As a result, developing a cross-domain, safe data-sharing system is essential to facilitating patient care at various hospitals. For instance, the clinicians at hospital B have access to the examination report created by hospital A. The public clouds are used to store and provide ubiquitous data access for the encrypted medical files produced by various hospitals [5]. The cross-domain access policy for a patient’s protected medical records is determined by the patient. Each member of the medical staff must register with their respective medical facility in order to obtain the attribute secret key needed to decrypt the patient’s encrypted files [6]. Figure 1 explains the role of big data in the healthcare industry [7]. The sources of big data used in healthcare range from wearable technology and search engine server logs to electronic health records. This is a limitless ocean of information that presents an infinite number of opportunities. Knowing how to use this data effectively is crucial. All parties involved in the healthcare system, including healthcare organisations (HCOs), patients, medical professionals, pharmaceutical makers, etc., can benefit from proper storage and analysis instruments. In general, patients’ health improves, doctors can dramatically improve medical results, HCOs can reduce costs and increase operational efficiency, pharmaceutical companies can make better judgements, and other healthcare providers can do so as well.

Unencrypted data transmission is one of the privacy issues brought on by the volume of data [8,9]. The cloud server, like any database server in the cloud, is unreliable. Febo is a comprehensive health toolbox that helps a person keep track of their bowel movements and discover how their body reacts [10]. Energy-efficient green computing [11] is used to create the best plans for assigning tasks, and [12,13] address energy waste issues in dynamic networking environments. The goal of both of these examples is to cut down on overall energy use. One of our primary areas of focus is healthcare. Because each person’s genome occupies more than 140 terabytes, healthcare applications require a significant amount of space to safely store genome data [14,15]. As a result of security breaches [16,17,18], the data that owners place on big data servers or in the cloud can be revealed. Password guessing, brute-force attacks, stolen-verifier attacks, and other security threats target big data storage systems. Current security measures, which advocate encrypting data before delivering it, do not adequately protect the confidentiality of users and data owners [19].

Figure 2 portrays the privacy assurance of the Big Data in Healthcare system [20]. SecPri-BGMPOP, a solution that incorporates multiple processes, is proposed as a means of addressing the numerous issues associated with privacy and security protection. First, a new Boost Graph Convolutional Network Clustering (BGCNC) algorithm is used, and we provide a framework to improve its convergence. The advantages in terms of memory and computation are significant: the method has the same memory complexity as conventional SGD and the same time complexity per epoch as full gradient descent. Second, a Magnify Pinpointing-Based Encryption (MPBE) method uses a portion of the identification bit string to create a safe key, preventing identity leakage even if the key or the encrypted data is cracked by an adversary or attack. Third, an optimal key is produced using the meta-heuristic algorithmic framework Hybrid Fragment Horde Bland Lobo Optimisation (HFHBLO). Despite their good performance, traditional algorithms can be improved upon to address their flaws. The traditional PSO method has some drawbacks, such as subpar performance in a number of domains. The GWO algorithm additionally suffers from slower convergence, lower solution precision, and less efficient local searching. The position update is the primary modification in the proposed model. Our recommended approach is maintained in a cloud environment, enabling analytics users to access it without jeopardising their security or privacy.

The contribution of the work is as follows:

The purpose of the Prvsec-Sanitize is to offer a cloud-based healthcare big data environment with an improved security mechanism.
The Boost Graph Convolutional Network Clustering (BGCNC) technique is a quick and memory-efficient training algorithm that lowers computational complexity in terms of time and memory measures.
Even when traditional algorithms perform well, they can still be enhanced to address flaws and increase the bar. The traditional PSO method has certain drawbacks, such as poorer performance in a number of domains. The GWO algorithm also has poorer solution precision, slower convergence, and less successful local searching in addition to these drawbacks. Thus, more investigation is required to improve integration and robustness.

2. Literature Review

One of the most prominent research fields at the moment is cloud security, and several methods have been put out over the previous few decades. Smart grids make extensive use of software-defined networks (SDN) for communication network management and monitoring. Big data analytics are becoming increasingly significant for SDN-based smart grids. A possible strategy is to examine the vast amounts of data produced by an SDN-based smart grid by utilising machine learning techniques. The optimising and differentially private clustering algorithm (ODPCA) is a technique for selectively private and optimal clustering. For privacy-preserving clustering of mixed data, the ODPCA combines the differentially private K-means algorithm with the K-modes approach. The distribution of the privacy budget is optimised [21] to boost the accuracy of the clustering findings. Using fog computing technology, secure healthcare-sensitive data may be kept in the cloud. A tri-party, one-round authenticated key-agreement mechanism that may generate a session key among participants and enable safe communication can also be represented by bilinear pairing cryptography. Finally, a decoy method might be used to access and safeguard private healthcare data. The suggested method [22] states that when an intruder is discovered accessing system data, fake files are retrieved right away to protect data security.

A blockchain-based system uses secure key management (BC-EKM). In order to build the stake blockchain, a hybrid sensor network is used. The security of key management is compromised by an easily targeted nontrusted base station (BS) due to dynamics. In addition to avianizing the role of BS, the distribution key management schemes also result in a significant and extra expense on sensors. Finally, we carry out in-depth security simulations and analyses [23]. For a number of smart city applications, Holistic Big Data Integrated Artificial Intelligent Modelling (HBDIAIM) has been presented as a way to enhance the security and privacy of the data management interface. To adequately secure the private data management interface in smart city applications, a differential evolutionary algorithm has been incorporated into HBDIAIM. The differential evolutionary algorithm also incorporates the big data analytics-assisted decision privacy approach, which enhances the scalability and accessibility of data in a data management interface based on their corresponding storage location. Additionally, to solve privacy and scalability issues in the data management interface for various smart city applications, the Adaptable Interference Method was created [24].

By recommending a better key management system, this study aims to address the challenges related to the security and privacy concerns of sensitive patient information. The suggested approach also strives to offer a simple and structured key management system. This system has a null rekeying technique and only needs a few key calculations to achieve forward and backward compatibility secrecy. As a consequence, the Healthcare Key Management (HCKM) framework, which strives to decode the same plain text with various keys, is the safe and privacy-preserving key management method for e-health systems. While maintaining an adequate degree of security, HCKM reduces the rekeying overhead for group members and the overhead stated in terms of the number of messages exchanged [25]. A blockchain-based Internet of Things solution for large data transport and privacy protection is facilitated by AI. Using graph modeling, the proposed algorithm first constructs a reliable and scalable system for data collection and transmission. Additionally, the approach based on artificial intelligence is used to extract the subset of nodes and produce efficient healthcare services. Using symmetric-based digital certificates, blockchain allows for secure and private transmission of communication resources [26].

By lengthening the keys in the Data Encryption Standard (DES), the Triple Data Encryption Standard (TDES) technique offers a way that is significantly easier to use while securing data privacy and preventing assaults. The experiment’s findings demonstrate that the TDES technique is effective in securing and safeguarding substantial volumes of healthcare data stored in the cloud [27]. This research presents a unique encryption technique using Serpent, an Advanced Encryption Standard (AES), and elliptic curve cryptography to protect healthcare data in IoT-enabled healthcare infrastructure. The suggested hybrid encryption approach improves healthcare data security by integrating symmetric and asymmetric-based encryption techniques. This technique additionally makes use of an elliptic curve-based digital signature to ensure the data’s integrity [28]. The Internet of medical things (IoMT) network receives massive amounts of big data, also known as health data, and registers a sizable number of patients and devices each day. To prevent misuse, this patient data should remain private and secure on the IoMT network. To achieve such data privacy and security, the interplanetary file system (IPFS) and a three-level/tier network have been proposed [29].

An enhanced architecture for safe and scalable IoT-based healthcare data transfer is based on the routing protocol. First, a variety of Internet of Things (IoT) devices, including wearables and sensors, collect health data. Utilizing data reduction and cleaning procedures, the raw data are preprocessed. Principal Component Analysis (PCA) and K-Nearest Neighbour (KNN) imputation are used to minimize the dimensionality of the data. The preprocessed data is used to extract the features using modified local binary patterns (MLBP). The FDT-RPL protocol for low-power and lossy networks enhances overall data transmission security by integrating the Butter Ant Optimization (BAO) algorithm with the fuzzy dynamic trust-based RPL algorithm [30]. None of these methods, therefore, will guarantee knowledge protection, which is necessary for both privacy and usefulness. To safeguard medical records, an effective anonymization approach is nevertheless required. Recent developments have shown that sensitive data, including proprietary knowledge, is kept, frequently by meta-heuristic algorithms. These methods are designed to generate an ideal key for the sanitization process. It is demonstrated that these algorithms produce superior results when compared to traditional methods. When compared to security, privacy is valued as a benefit, which is why it is important for both purchasing and selling in the business. Therefore, it is critical to strike a balance between security and privacy. A number of research frequently employed query and k-anonymity to address privacy concerns. However, these methods need a lot of computational resources and time. Our recommended approach offers a more reliable security system for cloud-based large data in healthcare.

The remaining parts of the paper are laid out as follows: The proposed work is discussed in Section 3, Results and Discussions are presented in Section 4, and the conclusion and the work to come are discussed in Section 5.

3. Proposed Method

The SecPri-BGMPOP approach is proposed as a multi-step solution to the numerous problems relating to security and privacy protection. The Boost Graph Convolutional Network Clustering (BGCNC) algorithm, which reduces computational complexity in terms of time and memory, is first applied to the input dataset to begin the clustering process. Figure 3 shows the architecture of the data security and privacy model. BGCNC is a fast and memory-efficient training algorithm that operates as follows: at each step, it samples a block of nodes associated with a dense subgraph and restricts the neighbourhood search to this subgraph. The dense subgraph is identified via a graph-clustering algorithm. This straightforward yet effective technique greatly increases memory and computational efficiency. An encryption method based on Magnify Pinpointing then creates a safe key from a portion of the identification bit string, which prevents identity leaks even if an adversary or attacker decodes the key or the encrypted material. An optimum key is obtained using Hybrid Fragment Horde Bland Lobo Optimisation, an enhanced meta-heuristic algorithmic framework. Our suggested SecPri-BGMPOP is intended to offer a better security method for cloud-based big data in healthcare.

3.1. Graph Convolutional Network Clustering (GCNC)

The graph convolutional network clustering technique achieves the best of both worlds: it has the same memory complexity as conventional SGD and the same time complexity per epoch as full gradient descent [31]. Consider the scenario in which each batch’s embeddings are computed for a set of nodes $B$ from layer 1 to layer $L$. Because every layer of computation uses the same subgraph $A_{B, B}$ (links within $B$), the embedding utilisation is given by the number of edges in this batch, $\|A_{B, B}\|_{0}$. To increase embedding utilisation, we therefore need to build a batch $B$ that maximises the within-batch edges, which links the efficiency of SGD updates to graph clustering techniques. Figure 4 shows the complete graph $G$ and the graph with the clustering partition $\bar{G}$ as examples of the community expansion. As can be seen, Cluster-GCN can concentrate on the neighbours within each cluster rather than conducting a thorough neighbourhood search.

A graph $G = (V, ε, A)$ consists of $N = |V|$ vertices and $|ε|$ edges such that an edge between any two vertices $i$ and $j$ represents their similarity. The corresponding adjacency matrix $A$ is an $N \times N$ sparse matrix whose $(i, j)$ entry equals 1 if there is an edge between $i$ and $j$ and 0 otherwise. The nodes of $G$ are partitioned into $c$ groups, $V = [V_{1}, \dots, V_{c}]$, where $V_{t}$ consists of the nodes in the $t$-th partition; $L$ is the number of layers, $F$ is the number of features, $N$ is the number of nodes, and $b$ is the batch size. As a result, we have $c$ subgraphs

\bar{G} = [G_{1}, \dots, G_{c}] = [\{v_{1}, ε_{1}\}, \dots, \{v_{c}, ε_{c}\}]

(1)

where each $ε_{t}$ is made up only of the links between the nodes in $v_{t}$. After the nodes are reorganised, the adjacency matrix is divided into $c^{2}$ submatrices as

A = \bar{A} + ∆ = \begin{bmatrix} A_{11} & \cdots & A_{1c} \\ \vdots & \ddots & \vdots \\ A_{c1} & \cdots & A_{cc} \end{bmatrix}

(2)

and

\bar{A} = \begin{bmatrix} A_{11} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & A_{cc} \end{bmatrix}, \quad ∆ = \begin{bmatrix} 0 & \cdots & A_{1c} \\ \vdots & \ddots & \vdots \\ A_{c1} & \cdots & 0 \end{bmatrix}

(3)
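The split in Equations (2) and (3) can be illustrated with a minimal sketch. The function name `split_adjacency`, the 6-node toy graph, and the two-cluster partition are assumptions made for illustration only; they are not taken from the paper.

```python
def split_adjacency(A, clusters):
    """Split A into a block-diagonal part A_bar and an off-diagonal part Delta."""
    n = len(A)
    # Map every node to the index of the cluster that contains it.
    cluster_of = {v: t for t, nodes in enumerate(clusters) for v in nodes}
    A_bar = [[0] * n for _ in range(n)]
    Delta = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if cluster_of[i] == cluster_of[j]:
                A_bar[i][j] = A[i][j]   # link inside a cluster (block A_tt)
            else:
                Delta[i][j] = A[i][j]   # link between clusters (block A_st)
    return A_bar, Delta


# Hypothetical undirected toy graph on 6 nodes.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

clusters = [[0, 1, 2], [3, 4, 5]]
A_bar, Delta = split_adjacency(A, clusters)

# Each undirected edge appears twice in a symmetric adjacency matrix.
within_edges = sum(map(sum, A_bar)) // 2
between_edges = sum(map(sum, Delta)) // 2
print(within_edges, between_edges)
```

In this toy case only the edge (2, 3) crosses the partition, so it is the only link that ends up in the off-diagonal part.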

Each diagonal block $A_{tt}$ is a $|v_{t}| \times |v_{t}|$ adjacency matrix containing the links within $G_{t}$; $\bar{A}$ is the adjacency matrix for graph $\bar{G}$; $A_{st}$ contains the links between two partitions $v_{s}$ and $v_{t}$; and $∆$ is the matrix made up of all off-diagonal blocks of $A$. Similarly, we can divide the features $X$ and the labels $Y$ according to the partition $[V_{1}, \dots, V_{c}]$ as $[X_{1}, \dots, X_{c}]$ and $[Y_{1}, \dots, Y_{c}]$, where $X_{t}$ and $Y_{t}$ are made up of the features and labels of the nodes in $V_{t}$.

Working with $\bar{G}$ allows the objective function of GCN to be decomposed over the groups (clusters). If we use $\bar{A}'$ to represent the normalised version of $\bar{A}$, the final embedding matrix is

Z^{(L)} = \bar{A}' σ(\bar{A}' σ(\cdots σ(\bar{A}' X W^{(0)}) W^{(1)}) \cdots) W^{(L - 1)}

(4)

Because $\bar{A}$ has a block-diagonal structure (recall that $\bar{A}'_{tt}$ is the corresponding diagonal block of $\bar{A}'$), this is significantly easier to compute than the neighbourhood search used in earlier SGD-based training methods. The loss function can likewise be broken down into

L_{\bar{A}'} = \sum_{t} \frac{|v_{t}|}{N} L_{\bar{A}'_{tt}} \quad \text{and} \quad L_{\bar{A}'_{tt}} = \frac{1}{|v_{t}|} \sum_{i \in v_{t}} \mathrm{loss}(y_{i}, z_{i}^{(L)})

(5)
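As a quick numerical check of Equation (5), the size-weighted sum of per-cluster mean losses equals the mean loss over all nodes. The sketch below assumes the per-node loss values are already available; the numbers, the partition, and the function names are hypothetical.

```python
def cluster_loss(losses, nodes):
    """Mean loss over the nodes of one cluster: (1/|v_t|) * sum of loss_i."""
    return sum(losses[i] for i in nodes) / len(nodes)


def decomposed_loss(losses, clusters):
    """Weighted sum over clusters: sum_t (|v_t| / N) * cluster_loss_t."""
    N = sum(len(nodes) for nodes in clusters)
    return sum(len(nodes) / N * cluster_loss(losses, nodes) for nodes in clusters)


# Hypothetical per-node losses loss(y_i, z_i) for 6 nodes.
losses = [0.2, 0.4, 0.9, 0.1, 0.5, 0.3]
clusters = [[0, 1], [2, 3, 4, 5]]   # deliberately unequal cluster sizes

total = decomposed_loss(losses, clusters)
global_mean = sum(losses) / len(losses)
print(abs(total - global_mean) < 1e-12)
```

The weights $|v_{t}|/N$ matter as soon as the clusters have different sizes; an unweighted average of the cluster losses would not match the global mean for the partition used here.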

The decomposition form in (3) and (4) serves as the basis for the Cluster-GCN (5). At each step, we sample a cluster $V_{t}$ and then conduct SGD to update based on the gradient of $L_{\bar{A}'_{tt}}$; this only requires the subgraph $A_{tt}$, the features $X_{t}$, and the labels $Y_{t}$ of the current batch, together with the model weights $\{W^{(l)}\}_{l = 1}^{L}$. Compared to the neighbourhood search strategy employed in earlier SGD-based training methods, the implementation merely calls for forward and backward propagation of matrix products (one block of (5)). As previously mentioned, the embedding utilisation corresponds to the within-cluster links of each batch, so a partition produced by graph clustering is exactly what we require, for two reasons. (1) Because each node and its neighbours are usually found in the same cluster, neighbouring nodes have a high probability of remaining in the same cluster after a few hops. (2) Because we replace $A$ with its block-diagonal approximation $\bar{A}$ and the error is proportional to the between-cluster links $∆$, we need a partition that reduces the number of between-cluster links.

Each node in $v_{t}$ of $A_{tt}$ only links to other nodes inside $v_{t}$. Each batch’s computation only involves the matrix products $\bar{A}'_{tt} X_{t}^{(l)} W^{(l)}$ and a few element-wise operations; thus the computation time for a batch is $O(\|A_{tt}\|_{0} F + b F^{2})$. As a result, $O(\|A\|_{0} F + N F^{2})$ is the overall time complexity per epoch. On average, each batch only needs to compute $O(b L)$ embeddings, which is linear rather than exponential in $L$. For embedding storage, we only need to load $b$ examples per batch, and storing each layer’s embeddings uses $O(b L F)$ memory.

Boost Graph Convolutional Network Clustering (BGCNC)

Although the approach above achieves good computational and memory complexity, two potential concerns still exist:

Certain links (the $∆$ part in Equation (2)) are eliminated from the graph after it has been partitioned. As a result, the performance can be impacted.
Algorithms for graph clustering tend to group similar nodes together. As a result, a cluster’s distribution may deviate from that of the original data set, which could cause a biased estimate of the full gradient while doing SGD updates.

Based on the label distribution of each cluster, we determine its entropy value. When compared to random partitioning, it is evident that most clusters have lower entropies, which suggests that the label distributions of the clusters are skewed towards particular labels. This raises the variance between batches and could have an impact on SGD convergence [32].
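The entropy comparison can be sketched as follows. The choice of base-2 logarithm, the function name `label_entropy`, and the toy label lists are assumptions for illustration; the source does not state which logarithm base it uses.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the label distribution of one cluster."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical clusters: one dominated by a single label, one evenly mixed.
skewed_cluster = ["a", "a", "a", "b"]
mixed_cluster = ["a", "b", "a", "b"]

h_skewed = label_entropy(skewed_cluster)
h_mixed = label_entropy(mixed_cluster)
print(h_skewed < h_mixed)
```

A cluster whose nodes mostly share one label scores lower than an evenly mixed one, which is the skew that raises the variance between batches.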

To overcome the aforementioned problems, we suggest a stochastic multiple clustering strategy that incorporates between-cluster links and lessens the variance across batches. We first divide the graph into $p$ clusters $V_{1}, \dots, V_{p}$ using a sizable $p$. Instead of taking into account just one cluster while building a batch $B$ for an SGD update, we pick $q$ clusters at random, denoted as $t_{1}, \dots, t_{q}$, and include their nodes $\{V_{t_{1}} \cup \dots \cup V_{t_{q}}\}$ in the batch. Moreover, the links between the selected clusters,

\{A_{i j} | i, j \in t_{1}, \dots, t_{q}\}

(6)

are added back.

In this way, the cluster combinations reduce the variance between batches and reintegrate the between-cluster linkages. Algorithm 1 presents our final Boost Graph Convolutional Network Clustering Technique.

Algorithm 1: Boost Graph Convolutional Network Clustering

1: Input: Graph A, Feature X, Label Y
2: Output: Node representation $\hat{X}$
3: Partition graph nodes into $c$ clusters $V_{1}, V_{2}, \dots, V_{c}$
4: For $iter = 1, \dots, \max_{iter}$ do
5: Randomly choose $q$ clusters, $t_{1}, \dots, t_{q}$, from V without replacement;
6: Form the subgraph $\hat{G}$ with nodes $\hat{V} = [V_{t_{1}}, V_{t_{2}}, \dots, V_{t_{q}}]$ and links $A_{\hat{V}, \hat{V}}$;
7: Compute $g \leftarrow \nabla L_{A_{\hat{V}, \hat{V}}}$ (loss on the subgraph $A_{\hat{V}, \hat{V}}$);
8: Conduct Adam update using gradient estimator $g$
9: Output: $\{W_{l}\}_{l = 1}^{L}$
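Steps 5 and 6 of Algorithm 1 can be sketched as below. This is only the batch-construction part: the gradient computation and the Adam update of steps 7 and 8 are omitted, and the function name `build_batch`, the toy ring graph, and the fixed random seed are assumptions for illustration.

```python
import random


def build_batch(A, clusters, q, rng):
    """Pick q clusters without replacement and return the induced subgraph."""
    chosen = rng.sample(range(len(clusters)), q)
    nodes = sorted(v for t in chosen for v in clusters[t])
    # Keeping every entry A[i][j] with i and j in the batch re-adds the links
    # between the chosen clusters, not only the links inside each cluster.
    sub_A = [[A[i][j] for j in nodes] for i in nodes]
    return chosen, nodes, sub_A


# Hypothetical 6-node ring graph partitioned into three clusters of two nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

clusters = [[0, 1], [2, 3], [4, 5]]
rng = random.Random(0)   # seeded only to make the sketch repeatable
chosen, nodes, sub_A = build_batch(A, clusters, q=2, rng=rng)
print(chosen, nodes)
```

Because any two clusters of this ring are joined by exactly one edge, the returned submatrix always contains one between-cluster link in addition to the two within-cluster links.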

The data is divided into groups using the Boost Graph Convolutional Network Clustering technique, and each group is then handed to a faster parallel process to reduce the computational complexity in terms of time and memory. The magnify pinpointing-based encryption method is then applied to the data groups, transforming the data into a different format based on the generated key value. To save time, the encryption, decryption, and key creation processes all use the magnify pinpointing-based approach.

3.2. Magnify Pinpointing-Based Encryption (MPBE)

Only key creation, encryption, and decryption times are calculated in the suggested method; network overhead remains an open issue at present. This framework shows how user 1 and user 2 can access each other’s data by using magnify pinpointing-based encryption as a security, authentication, and authorization tool. Figure 5 shows user 1 and user 2, two users of the cloud’s services, as part of the cloud environment, accessing data through the cloud database.

Role-based classification is faster with identity-based encryption (IBE) [33]. With these techniques, users are given access to the data after authenticating their identities. The scheme is first deployed on proxy servers to filter out unwanted users. The identities of allowed users are recorded on the proxy servers, and whenever an unauthorised user tries to access the server’s service, access is refused because no key matches that identity. As a result, each user must register their identity in order to access the service. Security is increasingly in demand as a result of technological advancements like cloud computing, and several security companies use this system to protect user data and information [34]. Uploading and downloading data or files is a crucial component of cloud security implementation. The effectiveness of an IBE scheme is also significantly influenced by network congestion and service speed. Algorithm 2 describes the key generation process. To improve time efficiency, however, the suggested method described in Section 3.2 focuses primarily on key creation, encryption, and decryption; these are the only times calculated, and network overhead is not optimised at this stage. Using IBE as a security, authentication, and authorization tool, this system illustrates how Bob and Alice can access each other’s data. An unauthorised user cannot obtain an authorised user’s identity because the private key generator (PKG) stores user identities in Lagrange polynomial form, which is computationally hard for an outsider to invert in order to recover the user’s original identity. Because identities are communicated over Secure Sockets Layer (SSL) connections, an attacker in the middle of the network cannot obtain Bob’s identity when he sends it to Alice. Hash-based key generation techniques are computationally costly; hence, pairing-based key generation techniques are used instead.

Algorithm 2: Key generation

1:  The private key generator (PKG) starts the setup process and decides the security parameters, such as the bit level and the type of curve.
2:  Bob obtains the master public key from the PKG.
3:  Bob authenticates himself by issuing his identity to the PKG and receives the private key for encrypting the data.
4:  Similarly, Alice obtains the master public key from the PKG.
5:  Alice authenticates herself by issuing her identity to the PKG and receives the private key for encrypting the data.
6:  Bob sends his identity to Alice for generating the public key related to Bob’s identity. Alice will use this key to decrypt the data authenticated by Bob.
7:  Alice obtains Bob’s data from the database and decrypts it for access.

As mentioned, the demand for security is rising because of technological advancements like cloud computing. This system is used by several security companies to protect user data and information. Uploading and downloading of data or files is a crucial component of cloud security implementation.

Let $G$ be a group of prime order $p$ that admits a bilinear map onto group $G_{1}$. The bilinear map is written $e: G \times G_{1} \to G_{2}$, where $g$ is the generator of group $G$. The group size is determined by a security parameter, and each identity is represented by four strings, each of length $n . 4$.

SL = (sl_{1}, sl_{2}, sl_{3}, \dots, sl_{n}) .

(7)

The bit strings of SL are used to produce fixed-length $n$-bit strings. The suggested magnify pinpointing-based encryption algorithm includes the following phases.

3.2.1. Initial Phase

The system parameters are created first. A secret $\alpha$ is chosen at random from $Z_{p}$. A random generator $g \in G$ is selected, the value $g_{1} = g^{\alpha}$ is fixed, and $g_{2}$ is chosen at random from $G$. A random value $u' \in G$ and a random $n$-length vector $U = \{u_{i}\}$ are then selected, which completes the authority parameters. Finally, the public parameters $g, g_{1}, g_{2}, u'$, and $U$ are broadcast, and the master key is $g_{2}^{\alpha}$.

3.2.2. Generation Phase

Let $v$ be the $n$-bit string SL for the user, and let $V \subseteq \{1, \dots, n\}$ be the set of all $i$ for which $v_{i} = 1$. $V$ is split into two parts, namely $V = \{v_{1}, v_{2}, \dots, v_{m}\}$ and $\{V_{r_{1}}, V_{r_{2}}, \dots, V_{r_{m}}\}$, such that $m + r_{m} = n$, where $V_{r_{1}}$ stands for a random value that is introduced into the suggested approach to increase security. The private key corresponding to identity $v$ is obtained by selecting a random value, as indicated in Equation (8), where $u' \prod_{i \in V} v_{i}$ is the group operation performed during key generation:

d_{v} = (u' \prod_{i \in V} v_{i})

(8)
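
As a minimal sketch of how an identity can be turned into an n-bit string and its index set V, consider the following. The use of SHA-256 to derive the bits, the default n = 16, and the function names are assumptions made for illustration; the paper does not state how SL is derived.

```python
import hashlib


def identity_to_bits(identity: str, n: int = 16) -> list[int]:
    """Map an identity to an n-bit string (assumption: SHA-256 based, n <= 256)."""
    digest = hashlib.sha256(identity.encode()).digest()
    # Unpack every byte of the digest into bits, most significant bit first.
    bits = [(byte >> (7 - k)) & 1 for byte in digest for k in range(8)]
    return bits[:n]


def index_set(bits: list[int]) -> set[int]:
    """Return V, the 1-based positions i for which v_i = 1."""
    return {i + 1 for i, b in enumerate(bits) if b == 1}


bits = identity_to_bits("bob@example.org", n=16)
V = index_set(bits)
```

For the bit string 1011, for example, `index_set([1, 0, 1, 1])` gives the set {1, 3, 4}.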

Let $U = \{u_{1}, u_{2}, \dots, u_{n}\}$ and $V = \{v_{1}, v_{2}, \dots, v_{m}\}$ such that $m < n$. A polynomial function is then constructed with the Lagrange coefficient method, and polynomial interpolation is performed. Polynomial interpolation makes it possible to conceal $v$, which can still be reconstructed efficiently from the available data points. For the suggested strategy, the polynomial equation with Lagrange coefficients is

\Delta_{i, v}(x) = \sum_{i = 0, k \in V}^{n} \left( \prod_{0 < j < n,\, j \neq i} \frac{x - x_{j}}{x_{i} - x_{j}} \right) y_{k}

(9)

where $x = u_{i}$ and $y = v_{k}$. The random set $u_{i}$ is generated once for every user identity, and the Lagrange coefficient of each identity is generated using the same $u_{i}$ values. Because the authority uses only $m$ terms of the identity, a challenger never learns the original identity of the authentic user, so it is difficult to retrieve or deduce the key created for a specific SL. In the special case where the user identity and the $U$ values are identical, the challenger is still unable to infer anything from the key, because $\Delta_{i, v}(x)$ produces an error of zero: the greatest error between any two subsequent nodes is zero.
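
The interpolation step can be illustrated with a short, generic Lagrange routine. This is a sketch of textbook Lagrange interpolation in exact rational arithmetic, not the authors' implementation; the sample points are made up and lie on the line y = 2x + 1.

```python
from fractions import Fraction


def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial passing through `points`."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Lagrange basis factor (x - x_j) / (x_i - x_j)
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


# Hypothetical data points on y = 2x + 1
points = [(1, 3), (2, 5), (4, 9)]
value_at_zero = lagrange_interpolate(points, 0)
```

Evaluating at any of the original nodes returns the stored value exactly, so the interpolation error at the nodes is zero.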

3.2.3. Encryption Phase

Let the message be $M \in G_{1}$, and let $c$ be a random number selected from $Z_{p}$. Equation (10) can be used to encrypt under an identity $v$ for Clusters 1, 2, and 4. Encryption using three-key TDEA is prohibited unless otherwise approved by another NIST guidance.

C = (e(g_{1}, g_{2})^{c} M, g^{c}, (u' \prod_{i \in V} v_{i})^{c})

(10)

3.2.4. Decryption Phase

Let $C = (C_{1}, C_{2}, C_{3})$ be a valid ciphertext for message $M$ under user identity $v$. Then $C$ can be decrypted using the key $d_{v} = (d_{1}, d_{2})$, as given in Equations (11)–(13):

C_{1} \frac{e(d_{2}, C_{3})}{e(d_{1}, C_{2})} = (e(g_{1}, g_{2})^{c} M) \frac{e(g^{r}, (u' \prod_{i \in V} v_{i})^{c})}{e(g_{2}^{\alpha} (u' \prod_{i \in V} v_{i})^{r}, g^{c})},

(11)

= (e(g_{1}, g_{2})^{c} M) \frac{e(g, (u' \prod_{i \in V} v_{i})^{rc})}{e(g_{1}, g_{2})^{c} e((u' \prod_{i \in V} v_{i})^{rc}, g)},

(12)

= M

(13)
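
The cancellation behind Equations (11)–(13) can be spelled out with the bilinearity rule $e(a^{x}, b^{y}) = e(a, b)^{xy}$. The derivation below is a sketch under stated assumptions: the shorthand $W = u' \prod_{i \in V} v_{i}$ is introduced here, the key components are taken to be $d_{1} = g_{2}^{\alpha} W^{r}$ and $d_{2} = g^{r}$ (as the denominator and numerator of Equation (11) suggest), and the pairing is assumed symmetric so that $e(a, b) = e(b, a)$.

```latex
\begin{align*}
e(d_{1}, C_{2}) &= e\!\left(g_{2}^{\alpha} W^{r},\, g^{c}\right)
                 = e(g_{2}, g)^{\alpha c}\, e(W, g)^{rc}
                 = e(g_{1}, g_{2})^{c}\, e(W, g)^{rc},
                 \qquad \text{since } g_{1} = g^{\alpha}, \\
e(d_{2}, C_{3}) &= e\!\left(g^{r},\, W^{c}\right) = e(g, W)^{rc}, \\
C_{1}\, \frac{e(d_{2}, C_{3})}{e(d_{1}, C_{2})}
                &= e(g_{1}, g_{2})^{c} M \cdot
                   \frac{e(g, W)^{rc}}{e(g_{1}, g_{2})^{c}\, e(W, g)^{rc}} = M .
\end{align*}
```

Both the blinding factor $e(g_{1}, g_{2})^{c}$ and the identity-dependent factor $e(g, W)^{rc}$ cancel, which is why only a holder of the matching key recovers $M$.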

Next the generated key is optimised using the Hybrid Fragment Horde Bland Lobo Optimization algorithm.

3.3. Hybrid Fragment Horde Bland Lobo Optimization

3.3.1. Traditional PSO Algorithm

The PSO algorithm works with three vectors: the x-vector, the p-vector, and the v-vector [35]. The x-vector tracks the particle’s current location inside the search zone, the p-vector (pbest) records the best position the particle has found so far, and the v-vector holds the particle velocity, which determines where the particle moves during the iteration. The particles are initially displaced at random in preset directions. Each particle then gradually changes direction so that it moves towards its previous best position, and it searches the surrounding area for better positions according to a fitness function, $fit : S^{m} \to S$. Here, the particle’s position is $\vec{M} \in S^{m}$ and its velocity is $\vec{w}$. Both are initially chosen at random and are subsequently updated repeatedly using Equations (14) and (15):

\vec{w} = \omega \vec{w} + c_{1} r_{1} (\vec{q} - \vec{M}) + c_{2} r_{2} (\vec{f} - \vec{M})

(14)

The inertia weight $\omega$ is a user-defined behavioural parameter that controls how much of the previous velocity is retained. The particles implicitly interact with one another because $\vec{q}$ is the particle’s own previous best position (pbest) and $\vec{f}$ is the best position found so far within the swarm (gbest). The acceleration constants are $c_{1}$ and $c_{2}$, and they are weighted by the stochastic variables $r_{1}, r_{2} \sim U(0,1)$.

As Equation (15) shows, the particle is moved to its next position in the search zone by adding the velocity to its present position, regardless of fitness gains [36]. The architecture of the algorithm is shown in Figure 6.

\vec{M} \leftarrow \vec{M} + \vec{w}

(15)
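
One PSO iteration for a single particle, following Equations (14) and (15), can be sketched as below. The function name, the default values of the inertia weight and acceleration constants, and the use of plain Python lists are illustrative assumptions.

```python
import random


def pso_step(position, velocity, pbest, gbest,
             omega=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply Equation (14) to the velocity, then Equation (15) to the position."""
    new_velocity = []
    for m, w, q, f in zip(position, velocity, pbest, gbest):
        r1, r2 = rng.random(), rng.random()  # r1, r2 ~ U(0, 1)
        new_velocity.append(omega * w + c1 * r1 * (q - m) + c2 * r2 * (f - m))
    # Equation (15): move the particle by its new velocity.
    new_position = [m + w for m, w in zip(position, new_velocity)]
    return new_position, new_velocity


# A particle already sitting on both pbest and gbest keeps only its inertia term.
pos, vel = pso_step([1.0, 2.0], [0.5, -0.5], [1.0, 2.0], [1.0, 2.0], omega=0.5)
```

When the particle coincides with both attractors, the two random terms vanish and the velocity is simply scaled by the inertia weight.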

3.3.2. Traditional GWO Algorithm

The GWO algorithm uses a hierarchy of search agents [37]. Equations (16) and (17) model mathematically how grey wolves encircle their prey while hunting:

\vec{B} = |\vec{E} \cdot {\vec{M}}_{q} (u) - \vec{M} (u)|

(16)

\vec{M} (u + 1) = {\vec{M}}_{q} (u) - \vec{H} \cdot \vec{B}

(17)

where $u$ is the current iteration, and $\vec{H}$ and $\vec{E}$ are the coefficient vectors. Grey wolves have a special ability to locate their prey and surround it. This hunting behaviour is simulated mathematically by assuming that the alpha, beta, and delta wolves have heightened awareness of potential prey sites. The top three solutions are therefore taken into account, regardless of the remaining solutions, as expressed in Equations (18)–(24):

{\vec{B}}_{α} = |{\vec{E}}_{1} \cdot {\vec{M}}_{α} - \vec{M} (u)|

(18)

{\vec{B}}_{β} = |{\vec{E}}_{2} \cdot {\vec{M}}_{β} - \vec{M} (u)|

(19)

{\vec{B}}_{δ} = |{\vec{E}}_{3} \cdot {\vec{M}}_{δ} - \vec{M} (u)|

(20)

{\vec{M}}_{1} = {\vec{M}}_{α} - {\vec{H}}_{1} \cdot ({\vec{B}}_{α})

(21)

{\vec{M}}_{2} = {\vec{M}}_{β} - {\vec{H}}_{2} \cdot ({\vec{B}}_{β})

(22)

{\vec{M}}_{3} = {\vec{M}}_{δ} - {\vec{H}}_{3} \cdot ({\vec{B}}_{δ})

(23)

\vec{M} (u + 1) = \frac{{\vec{M}}_{1} + {\vec{M}}_{2} + {\vec{M}}_{3}}{3}

(24)
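
The hunting update of Equations (18)–(24) for one wolf can be sketched as follows. The paper does not define how the coefficients are drawn, so the formulas `H = 2*e*r1 - e` and `E = 2*r2` are taken from the standard GWO formulation and should be read as an assumption, as should the function name and signature.

```python
import random


def gwo_step(wolf, alpha, beta, delta, e, rng=random):
    """Move one wolf towards the alpha, beta, and delta leaders."""
    new_position = []
    for d in range(len(wolf)):
        candidates = []
        for leader in (alpha, beta, delta):
            H = 2 * e * rng.random() - e          # assumed standard GWO coefficient
            E = 2 * rng.random()                  # assumed standard GWO coefficient
            B = abs(E * leader[d] - wolf[d])      # Equations (18)-(20)
            candidates.append(leader[d] - H * B)  # Equations (21)-(23)
        new_position.append(sum(candidates) / 3)  # Equation (24)
    return new_position


# With e = 0 the coefficient H vanishes and the wolf lands on the leaders' mean.
moved = gwo_step([0.0, 0.0], [3.0, 0.0], [6.0, 3.0], [0.0, 6.0], e=0.0)
```

Larger values of `e` allow `H` to exceed 1 in magnitude, which pushes wolves away from the leaders and favours exploration; small values favour convergence.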

Despite their good performance, traditional algorithms can be improved upon to address their flaws and raise the bar.

3.3.3. Hybrid Fragment Horde Bland Lobo Optimization

As mentioned, traditional algorithms perform well but can still be improved to address their shortcomings. The conventional PSO algorithm has some flaws, including inferior performance across a variety of fields. The GWO algorithm likewise suffers from slower convergence, worse solution precision, and less effective local searching. Additional research is therefore needed to enhance robustness and integration, and this work deals with these issues using a new hybrid methodology. In the suggested Hybrid Fragment Horde Bland Lobo Optimization (HFHBLO), the PSO update is blended with the GWO algorithm. Equations (16) and (17) give the mathematical model of prey encircling in the proposed approach, whereas Equations (18)–(24) give the mathematical model of the hunting strategy. The main modification in the proposed model is the position update. Equation (25) gives the position update in the HFHBLO model, where $\vec{M}$ denotes the position obtained from the PSO velocity update in Equations (14) and (15):

\vec{M} (u + 1) = \frac{{\vec{M}}_{1} + {\vec{M}}_{2} + {\vec{M}}_{3} + \vec{M}}{4}

(25)

In the classic PSO method, $c_{1}$ and $c_{2}$ are regarded as acceleration constants; in the recommended Hybrid Fragment Horde Bland Lobo Optimization model, however, $c_{1}$ and $c_{2}$ vary over the values 0.1, 0.3, 0.5, 0.7, and 1. The optimal key is generated using the Hybrid Fragment Horde Bland Lobo Optimization, and Algorithm 3 presents the optimal key selection based on it.

Algorithm 3: Hybrid Fragment Horde Bland Lobo Optimization

1: $M_{j}$ is the grey wolf population, where $j = 1, 2, \dots, N$. Here, $M_{a}$, $M_{b}$, and $M_{d}$ denote the best searching agent, the second-best searching agent, and the third-best searching agent, respectively. Moreover, $e$ is the component, and $H$ and $E$ are coefficients. The goal of this algorithm is to output the best searching agent, $M_{a}$.
2: {
3: Set initial values for $M_{j}$
4: Set initial values for $e$, $H$, and $E$
5: Measure the fitness values of each searching agent, $M_{a}$, $M_{b}$, and $M_{d}$.
6:  while (u < max) do
7:  {
8:  for each searching agent, do
9:  {
10: Revise the present location of the searching agent using Equation (25)
11: }
12: Revise $e$, $H$, and $E$
13: Assess fitness values for all searching agents
14: Revise $M_{a}$, $M_{b}$, and $M_{d}$
15: u := u + 1
16: }
17: return $M_{a}$
18: }
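
The loop of Algorithm 3 can be sketched end to end as below. Several choices are assumptions made only so that the sketch runs: the sphere function stands in for the key-quality fitness, `e` decreases linearly from 2 towards 0 and the coefficients follow the standard GWO formulas, the best agent M_a plays the role of gbest in the PSO term, and the leaders are taken from each agent's best position so far.

```python
import random


def sphere(x):
    """Toy fitness to minimise (assumption: stands in for the real objective)."""
    return sum(v * v for v in x)


def hfhblo_sketch(fitness, dim=4, n_agents=12, max_iter=50,
                  omega=0.5, c1=0.5, c2=0.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_agents)]
    vel = [[0.0] * dim for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    leaders = sorted(pbest, key=fitness)[:3]       # M_a, M_b, M_d
    history = [fitness(leaders[0])]
    for u in range(max_iter):
        e = 2 - 2 * u / max_iter                   # assumed linear decrease
        for k in range(n_agents):
            new = []
            for d in range(dim):
                cands = []                         # M_1, M_2, M_3, Eqs. (18)-(23)
                for leader in leaders:
                    H = 2 * e * rng.random() - e
                    E = 2 * rng.random()
                    cands.append(leader[d] - H * abs(E * leader[d] - pos[k][d]))
                # PSO term, Eqs. (14)-(15), with M_a used as gbest
                vel[k][d] = (omega * vel[k][d]
                             + c1 * rng.random() * (pbest[k][d] - pos[k][d])
                             + c2 * rng.random() * (leaders[0][d] - pos[k][d]))
                pso_coord = pos[k][d] + vel[k][d]
                new.append((sum(cands) + pso_coord) / 4)   # Eq. (25)
            pos[k] = new
            if fitness(new) < fitness(pbest[k]):
                pbest[k] = new[:]
        leaders = sorted(pbest, key=fitness)[:3]   # revise M_a, M_b, M_d
        history.append(fitness(leaders[0]))
    return leaders[0], history


best, history = hfhblo_sketch(sphere)
```

Because the leaders are drawn from best-so-far positions, the recorded best fitness in `history` never gets worse from one iteration to the next.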

4. Results and Discussion

The experiments used Java code for clustering and the SecPri-BGMPOP technique and were run under Windows using Hadoop JARs in CloudSim. Python was used in the development of the suggested procedure. The Faculty of Education of Universiti Kebangsaan Malaysia provided the autism datasets, which were gathered from autistic children in various age groups. They comprise the 24-month dataset, which contains 26 attributes and 209 instances; the 30-month dataset, which contains 29 attributes and 209 instances; the 36-month dataset, which contains 31 attributes and 234 instances; and the 48-month dataset, which contains 33 attributes and 302 instances. Each dataset holds autism diagnostic data with three score alternatives: z = 0, v = 5, and x = 10. The cut-off values for the four datasets were 71, 95, 100, and 105, respectively.

The experimental findings of this research are compared with the current system. The SecPri-BGMPOP technique is assessed by tracking the amount of time and memory used during execution. The total SecPri-BGMPOP execution time in CloudSim is measured in milliseconds (ms). Memory utilisation was tracked using the Java-based CloudSim program. The memory footprint of SecPri-BGMPOP and the stack memory needed for newly generated objects are measured in kilobytes (KB).

The Boost Graph Convolutional Network Clustering approach’s SecPri-BGMPOP threshold is “1” for time and memory efficiency tests. Table 1 lists the results for various cluster sizes. Then, in terms of time and memory requirements, the proposed system is compared with FCM [38] and WFCM [39] and the Boost Graph Convolutional Network Clustering employed in previous work.

According to the graphs plotted against cluster count (Figure 7 and Figure 8), the process takes less time to complete as the number of clusters increases, because many mappers can run simultaneously. The least memory is needed when the number of clusters is low, because higher cluster counts also result in higher memory allocation.

Table 2 lists the clustering-based time and memory metrics for the “2” SecPri-BGMPOP threshold, and Figure 9 and Figure 10 display graphs of these measures. The findings demonstrate that when the number of clusters rises, both time and memory are decreased since the threshold chosen is the best option.

Table 3 lists the clustering-based time and memory measurements for SecPri-BGMPOP threshold value 4 for WFCM. The graphs in Figure 11 and Figure 12 show that as the number of clusters increases, the time to complete increases, since a high threshold requires more processing power, while less memory is required.
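
The trends described for the three thresholds can be checked with a short calculation over the values reported in Tables 1–3. The sketch below only restates those table entries; the helper names are hypothetical, and the percentage change is taken between the 1-cluster and 4-cluster rows.

```python
# (clusters, time in s, memory in KB), copied from Tables 1-3; keys are thresholds.
tables = {
    1: [(1, 31.569, 1520), (2, 30.638, 1613), (4, 30.456, 1663)],
    2: [(1, 29.55, 1325), (2, 27.55, 1265), (4, 26.45, 1232)],
    4: [(1, 22.37, 1216), (2, 24.45, 1216), (4, 25.64, 1199)],
}


def percent_change(first, last):
    """Relative change from `first` to `last`, in percent."""
    return 100.0 * (last - first) / first


def summarise(rows):
    """Percent change in time and memory between the first and last rows."""
    (_, t0, m0), (_, t1, m1) = rows[0], rows[-1]
    return percent_change(t0, t1), percent_change(m0, m1)


for threshold, rows in tables.items():
    dt, dm = summarise(rows)
    print(f"threshold {threshold}: time {dt:+.2f}%, memory {dm:+.2f}%")
```

A negative value means the quantity falls as the cluster count grows from 1 to 4, so the signs can be compared directly with the statements made for each threshold.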

Based on clusters 1, 2, and 4, the Boost Graph Convolutional Network Clustering method was evaluated in terms of time and memory for the SecPri-BGMPOP threshold values of 1, 2, and 4. The suggested system and the Boost Graph Convolutional Network Clustering used in earlier work are then compared with FCM and WFCM in terms of time and memory requirements. Because our suggested approach is now maintained in a cloud environment, analytics consumers can use it without worrying about their security or privacy. In order to demonstrate that the method based on Boost Graph Convolutional Network Clustering has produced better results than the existing FCM methodology, we compared it to the latter. Measuring performance includes looking at things like memory utilisation and execution time. The Boost Graph Convolutional Network Clustering recommended method produced improved outcomes, as illustrated in Figure 13.

When comparing the Boost Graph Convolutional Network Clustering (BGCNC)-based SecPri-BGMPOP method to FCM clustering, it is discovered that Boost Graph Convolutional Network Clustering performs better. Here, we have examined time and memory across various cluster sizes. When we examine the data, we discover that the proposed system uses less memory and requires less time than the current FCM-based technique (see Figure 10). As a result, our suggested Boost Graph Convolutional Network Clustering approach performs better and produces better results.

The computation time of the key generation phase was compared for the proposed MPBE, upgraded W09, Water 05, and the MPBE curve. Figure 14 and Figure 15 show that, compared with the other methods, the Bibber approach requires less computation time for the key generation, encryption, and decryption phases.

The simulation thus shows that our suggested information security technique outperforms the traditional algorithms currently in use under particular attacks, as is evident from the results in Table 4 and Table 5. Protecting this type of data is essential, because sensitive diagnostic information about autism is needed to determine whether a person is autistic, and such information is especially relevant in the healthcare industry. The findings demonstrate that our suggested cleaning method protects these data against some threats more effectively than existing techniques. They also indicate that the healthcare industry can make extensive use of the suggested strategy for data security and privacy. Performance evaluation includes assessing memory utilisation and execution time. The SecPri-BGMPOP approach was shown to substantially reduce memory consumption and computation time, which is consistent with the experiment’s findings.

5. Conclusions

Security and privacy protection have been the subjects of distinct research endeavours in the past; the SecPri-BGMPOP strategy aims to address both. The purpose of this paper was to provide big data analytics with security while maintaining privacy. First, the Boost Graph Convolutional Network Clustering (BGCNC) algorithm, which reduces computational complexity in terms of time and memory measurements, is applied to the input dataset to begin the clustering process. Second, it is enlarged by employing a piece of the magnifying bit string to generate a secure key; pinpointing-based encryption avoids amplifying leakage even if a rival or attacker decrypts the key or the asymmetric encryption. Finally, to determine the accuracy of the method, an optimal key was created using a meta-heuristic algorithmic framework called Hybrid Fragment Horde Bland Lobo Optimisation (HFHBLO), a hybrid of particle swarm and grey wolf optimisation. Our proposed method is currently kept in a cloud environment, allowing analytics users to utilise it without risking their privacy or security. We compared the BGCNC-based technique with the current FCM approach to show that it delivers superior results. Examining memory usage and execution time is part of measuring performance. Our SecPri-BGMPOP approach achieves an efficiency of 58.130 s with an information loss of 0.024%. The encryption and decryption times are 0.0086 s and 0.315 s, respectively. Consistent with the experiment’s results, the SecPri-BGMPOP strategy significantly cuts down on memory usage and computation time. In the future, we intend to present additional real-world uses for the SecPri-BGMPOP scheme and to contrast it with other encryption techniques. We also aim to improve the SecPri-BGMPOP scheme in order to accelerate the encryption and decryption processes further.

Author Contributions

M.K.: Conceptualization; J.P.A.: Initial draft of manuscript; H.-S.G.: Experimental design and Methodology; B.P.K.: Validation and Analyses; J.S.: Review; W.-C.L.: Editing, Software. All authors have read and agreed to the published version of the manuscript.

Funding

Department of Electrical Engineering, Ming Chi University of Technology, Taiwan.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare that they have no potential conflicts.

References

1. Lv, Z.; Qiao, L. Analysis of healthcare big data. Future Gener. Comput. Syst. 2020, 109, 103–110.
2. Yang, Y.; Zheng, X.; Guo, W.; Liu, X.; Chang, V. Privacy-preserving smart IoT-based healthcare big data storage and self-adaptive access control system. Inf. Sci. 2019, 479, 567–592.
3. Hathaliya, J.J.; Tanwar, S. An exhaustive survey on security and privacy issues in Healthcare 4.0. Comput. Commun. 2020, 153, 311–335.
4. Meyer, A.N.; Giardina, T.D.; Khawaja, L.; Singh, H. Patient and clinician experiences of uncertainty in the diagnostic process: Current understanding and future directions. Patient Educ. Couns. 2021, 104, 2606–2615.
5. Premkamal, P.K.; Pasupuleti, S.K.; Alphonse, P.J.A. A new verifiable outsourced ciphertext-policy attribute based encryption for big data privacy and access control in cloud. J. Ambient Intell. Humaniz. Comput. 2019, 10, 2693–2707.
6. Chelladurai, M.U.; Pandian, S.; Ramasamy, K. A blockchain based patient centric electronic health record storage and integrity management for e-Health systems. Health Policy Technol. 2021, 10, 100513.
7. Available online: https://www.n-ix.com/big-data-healthcare-key-benefits-uses-cases/ (accessed on 31 May 2024).
8. Razzaq, A. Blockchain-based secure data transmission for internet of underwater things. Clust. Comput. 2022, 25, 4495–4514.
9. Liu, J.; Liu, Z.; Sun, C.; Zhuang, J. A data transmission approach based on ant colony optimization and threshold proxy re-encryption in wsns. J. Artifi. Intell. Technol. 2022, 2, 23–31.
10. Wang, T.; Ke, H.; Zheng, X.; Wang, K.; Sangaiah, A.K.; Liu, A. Big data cleaning based on mobile edge computing in industrial sensor-cloud. IEEE Trans. Industr. Inform. 2019, 16, 1321–1329.
11. Hosseinioun, P.; Kheirabadi, M.; Tabbakh, S.R.K.; Ghaemi, R. A new energy-aware tasks scheduling approach in fog computing using hybrid meta-heuristic algorithm. J. Parallel Distrib. Comput. 2020, 143, 88–96.
12. Shukla, D.K.; Ali, S.; Trivedi, M.C. Energy Aware Scheduling of Tasks in Cloud environment. Turk. Online J. Qual. Inq. 2021, 12, 8816.
13. Papageorgiou, P. A Novel Framework for Maritime Security Assessments and Its Applications on the Shipping Industry. Cyber Security Examples. Ph.D. Thesis, Liverpool John Moores University, Liverpool, UK, 2022.
14. Miyachi, K.; Mackey, T.K. hOCBS: A privacy-preserving blockchain framework for healthcare data leveraging an on-chain and off-chain system design. Inf. Process. Manag. 2021, 58, 102535.
15. Pramanik, P.K.D.; Pal, S.; Mukhopadhyay, M. Healthcare big data: A comprehensive overview. In Research Anthology on Big Data Analytics, Architectures, and Applications; IGI Global: Hershey, PA, USA, 2022; pp. 119–147.
16. Riaz, S.; Khan, A.H.; Haroon, M.; Latif, S.; Bhatti, S. Big data security and privacy: Current challenges and future research perspective in cloud environment. In Proceedings of the 2020 International Conference on Information Management and Technology (ICIMTech), Bandung, Indonesia, 13–14 August 2020; pp. 977–982.
17. Sauber, A.M.; El-Kafrawy, P.M.; Shawish, A.F.; Amin, M.A.; Hagag, I.M. A New Secure Model for Data Protection over Cloud Computing. Comput. Intell. Neurosci. 2021, 2021, 8113253.
18. Samaraweera, G.D.; Chang, J.M. Security and privacy implications on database systems in Big Data era: A survey. IEEE Trans. Knowl. Data Eng. 2019, 33, 239–258.
19. Amanullah, M.A.; Habeeb, R.A.A.; Nasaruddin, F.H.; Gani, A.; Ahmed, E.; Nainar, A.S.M.; Akim, N.M.; Imran, M. Deep learning and big data technologies for IoT security. Comput. Commun. 2020, 151, 495–517.
20. Ghayvat, H.; Pandya, S.N.; Bhattacharya, P.; Zuhair, M.; Rashid, M.; Hakak, S.; Dev, K. CP-BDHCA: Blockchain-based Confidentiality-Privacy preserving Big Data scheme for healthcare clouds and applications. IEEE J. Biomed. Health Inform. 2021, 26, 1937–1948.
21. Lv, Z.; Wang, L.; Guan, Z.; Wu, J.; Du, X.; Zhao, H.; Guizani, M. An optimizing and differentially private clustering algorithm for mixed data in SDN-based smart grid. IEEE Access 2019, 7, 45773–45782.
22. Shanmugapriya, E.; Kavitha, R. Medical big data analysis: Preserving security and privacy with hybrid cloud technology. Soft Comput. 2019, 23, 2585–2596.
23. Tian, Y.; Wang, Z.; Xiong, J.; Ma, J. A blockchain-based secure key management scheme with trustworthiness in DWSNs. IEEE Trans. Indus. Inform. 2020, 16, 6193–6202.
24. Chen, J.; Ramanathan, L.; Alazab, M. Holistic big data integrated artificial intelligent modeling to improve privacy and security in data management of smart cities. Microprocess. Microsyst. 2021, 81, 103722.
25. Iqbal, S.; Kiah, M.L.M.; Zaidan, A.A.; Zaidan, B.B.; Albahri, O.S.; Albahri, A.S.; Alsalem, M.A. Real-time-based E-health systems: Design and implementation of a lightweight key management protocol for securing sensitive information of patients. Health Technol. 2019, 9, 93–111.
26. Elhoseny, M.; Haseeb, K.; Shah, A.A.; Ahmad, I.; Jan, Z.; Alghamdi, M.I. IoT solution for AI-enabled PRIVACY-PREServing with big data transferring: An application for healthcare using blockchain. Energies 2021, 14, 5364.
27. Ramachandra, M.N.; Rao, M.S.; Lai, W.C.; Parameshachari, B.D.; Babu, J.A.; Hemalatha, K.L. An Efficient and Secure Big Data Storage in Cloud Environment by Using Triple Data Encryption Standard. Big Data Cogn. Comput. 2022, 6, 101.
28. Gehlot, A.; Misra, N. Privacy and Security Enabling for Healthcare Data using Lightweight Deep learning with Cryptography. In Proceedings of the 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon), Mysuru, India, 16–17 October 2022; pp. 1–6.
29. Thantilage, R.D.; Le-Khac, N.A.; Kechadi, M.T. Healthcare data security and privacy in Data Warehouse architectures. Inform. Med. Unlocked 2023, 39, 101270.
30. Refaee, E.; Parveen, S.; Begum, K.M.J.; Parveen, F.; Raja, M.C.; Gupta, S.K.; Krishnan, S. Secure and scalable healthcare data transmission in IoT based on optimized routing protocols for mobile computing applications. Wirel. Commun. Mob. Comput. 2022, 2022, 5665408.
31. Lejun, Z.; Minghui, P.; Shen, S.; Weizheng, W.; Zilong, J.; Yansen, S.; Huiling, C.; Ran, G.; Gataullin, S. Redundant data detection and deletion to meet privacy protection requirements in blockchain-based edge computing environment. China Commun. 2024, 21, 149–159.
32. Şenel, F.A.; Gökçe, F.; Yüksel, A.S.; Yiğit, T. A novel hybrid PSO–GWO algorithm for optimization problems. Eng. Comput. 2018, 35, 1359–1373.
33. Zolghadr-Asli, B.; Bozorg-Haddad, O.; Chu, X. Crow Search Algorithm (CSA). In Advanced Optimization by Nature-Inspired Algorithms. Studies in Computational Intelligence; Bozorg-Haddad, O., Ed.; Springer: Singapore, 2018; Volume 720.
34. Alphonsa, M.M.A.; Amudhavalli, P. Genetically modified glowworm swarm optimization based privacy preservation in cloud computing for healthcare sector. Evol. Intell. 2018, 11, 101–116.
35. Kavitha, C.; Anita, X. Task failure resilience technique for improving the performance of MapReduce in Hadoop. Etri J. 2020, 42, 748–760.
36. Kavitha, C.; Srividhya, S.R.; Lai, W.-C.; Mani, V. IMapC: Inner MAPping Combiner to Enhance the Performance of MapReduce in Hadoop. Electronics 2022, 11, 1599.
37. Pu, Y.; Rong, Y.; Chen, J.; Mao, Y. Accelerated identification algorithms for exponential nonlinear models: Two-stage method and particle swarm optimization method. Circuits Syst. Signal Process. 2022, 41, 2636–2652.
38. Reegan, A.S.; Kabila, V. Highly secured cluster based WSN using novel FCM and enhanced ECC-ElGamal encryption in IoT. Wirel. Pers. Commun. 2021, 118, 1313–1329.
39. Thulasimani, L.; Hyils Sharon Magdalene, A. Lessening Spectrum Sensing Data Falsification Attack by Weighted Fuzzy Clustering Means Using Simulation Annealing in Cognitive Radio Networks. In International Conference on Advances in Electrical and Computer Technologies; Springer Nature: Singapore, 2021; pp. 423–435.

Figure 1. Big data in the healthcare industry.

Figure 2. Big data privacy assurance in the healthcare system.

Figure 3. The main architecture of the data security and privacy model.

Figure 4. The distinction between our suggested cluster technique and conventional graph convolution is the neighbourhood expansion. The extension of neighbourhood nodes begins at the red node. The growing convolution suffers from hyperbolic neighbourhood growth; however, our technique can stop expensive neighbourhood expansion.

Figure 5. Framework for proposed work.

Figure 6. Architecture of the traditional PSO algorithm.

Figure 7. Clustering-based time measure for SecPri-BGMPOP threshold value “1”.

Figure 8. Cluster-based memory measures for SecPri-BGMPOP threshold value “1”.

Figure 9. Clustering-based time measure for SecPri-BGMPOP threshold “2”.

Figure 10. Clustering-based memory measure for SecPri-BGMPOP threshold value “2”.

Figure 11. Clustering-based time measure for SecPri-BGMPOP threshold value “4”.

Figure 12. Clustering-based memory measure for SecPri-BGMPOP threshold value “4”.

Figure 13. Comparison of time measures with BGCNC and FCM for SecPri-BGMPOP threshold value “1, 2, 4”.

Figure 14. Comparison of proposed vs. FCM memory measures for SecPri-BGMPOP threshold value “1, 2, 4”.

Figure 15. Timing comparison graph for the proposed work in key generation, encryption phase, and decryption phase.

Table 1. Cluster-Based Time, Memory Measures for SecPri-BGMPOP Threshold Value “1”.

Number of Clusters	Time (s)	Memory (KB)
1	31.569	1520
2	30.638	1613
4	30.456	1663

Table 2. Clustering-Based Time and Memory Measures for SecPri-BGMPOP Threshold Value “2”.

Number of Clusters	Time (s)	Memory (KB)
1	29.55	1325
2	27.55	1265
4	26.45	1232

Table 3. Clustering-Based Time and Memory Measures for SecPri-BGMPOP Threshold Value “4”.

Number of Clusters	Time (s)	Memory (KB)
1	22.37	1216
2	24.45	1216
4	25.64	1199

Table 4. Sanitization Method and Proposed SecPri-BGMPOP.

Performance Metrics	Proposed vs. Previous Approaches
Performance Metrics	Sanitization Method	SecPri-BGMPOP
Information loss	0.07%	0.024%
Throughput	3.5 Mbps	7 Mbps
Throughput	3.625 Mbps	7.16 Mbps
Encryption time	0.11 s	0.0086 s
Decryption time	0.054 s	0.315 s
Efficiency	46.87 s	58.130 s

Table 5. Performance-Enhanced Hybrid Fragment Horde Bland Lobo Optimization.

HFHBLO	PSO	GWO	Attack
Better than	0.26%	0.23%	CCA
Superior to	0.40%	0.29%	CPA

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kuttiyappan, M.; Appadurai, J.P.; Kavin, B.P.; Selvaraj, J.; Gan, H.-S.; Lai, W.-C. Big Data Privacy Protection and Security Provisions of the Healthcare SecPri-BGMPOP Method in a Cloud Environment. Mathematics 2024, 12, 1969. https://doi.org/10.3390/math12131969
