Securing the Internet of Health Things: Embedded Federated Learning-Driven Long Short-Term Memory for Cyberattack Detection

Kumar, Manish; Kim, Sunggon

doi:10.3390/electronics13173461

by Manish Kumar and Sunggon Kim *

Department of Computer Science, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2024, 13(17), 3461; https://doi.org/10.3390/electronics13173461

Submission received: 5 August 2024 / Revised: 23 August 2024 / Accepted: 29 August 2024 / Published: 31 August 2024

(This article belongs to the Special Issue Computer Architecture & Parallel and Distributed Computing)

Abstract

The proliferation of the Internet of Health Things (IoHT) introduces significant benefits for healthcare through enhanced connectivity and data-driven insights, but it also presents substantial cybersecurity challenges. Protecting sensitive health data from cyberattacks is critical. This paper proposes a novel approach for detecting cyberattacks in IoHT environments using a Federated Learning (FL) framework integrated with Long Short-Term Memory (LSTM) networks. The FL paradigm ensures data privacy by allowing individual IoHT devices to collaboratively train a global model without sharing local data, thereby maintaining patient confidentiality. LSTM networks, known for their effectiveness in handling time-series data, are employed to capture and analyze temporal patterns indicative of cyberthreats. Our proposed system uses an embedded feature selection technique that minimizes the computational complexity of the cyberattack detection model and leverages the decentralized nature of FL to create a robust and scalable cyberattack detection mechanism. We refer to the proposed approach as Embedded Federated Learning-Driven Long Short-Term Memory (EFL-LSTM). Extensive experiments using real-world ECU-IoHT data demonstrate that our proposed model outperforms traditional models regarding accuracy (97.16%) and data privacy. The outcomes highlight the feasibility and advantages of integrating Federated Learning with LSTM networks to enhance the cybersecurity posture of IoHT infrastructures. This research paves the way for future developments in secure and privacy-preserving IoHT systems, ensuring reliable protection against evolving cyberthreats.

Keywords: cybersecurity; federated learning; internet of things; long short-term memory

1. Introduction

The proliferation of the Internet of Health Things (IoHT) brings significant advancements to healthcare by enhancing connectivity and providing data-driven insights. It creates an environment where numerous smart healthcare sensors are interconnected across various networks and primarily monitor vital physiological parameters such as temperature, blood pressure, and heart rate, which measure a patient’s health [1]. Saheed and Arowolo [2] have discussed how these parameters interact with one another to distribute sensitive medical information that will be used by healthcare officials, hospitals, and physicians for medication and support. These confidential data are securely stored at data centers by various gateways and then transmitted to authorized users. The IoHT provides a medium that not only helps hospitals manage seamless services but also allows doctors to access patient information anytime, anywhere, in a digital manner. Suleski et al. [3,4] have discussed multi-factor authentication applications for next-generation authentication in the IoHT.

However, it also introduces considerable cybersecurity challenges, and the implications of this for patients’ health as well as for patients’ privacy are a deep concern today [5]. As the IoHT becomes increasingly integrated into healthcare systems, the risk of cyberattacks targeting sensitive health data escalates, posing serious threats to patient confidentiality and the integrity of healthcare infrastructure. A Siemens healthcare report highlighted this issue in 2019 and discussed how hackers used data from electronic health records, such as social security numbers, credit card or bank account information, and even patient diagnosis histories, for blackmail or ransom purposes [6]. They also froze hospital services by hacking installed computers and related networks, demanding large sums of money. These incidents not only damage the hospital’s reputation but also erode patients’ trust in the hospital. Si-ahmed et al. [7] have surveyed the challenges posed by IoHT environments, such as the integration of a vast array of interconnected medical devices, the critical nature of real-time patient data, and the potential for physical harm resulting from compromised medical devices. Researchers have also discussed the possible threats at different levels, such as during data collection, transmission, and storage. Rasool et al. [8] discussed the necessity of security as well as privacy for medical IoHT networks and proposed that an AI-based approach could facilitate these requirements. The authors of [9] specifically discussed several attack vectors that might be launched to control smart medical devices like intelligent pacemakers and revealed their negative impacts on these devices. These factors significantly differ from traditional cyberattacks that primarily target centralized healthcare databases and IT infrastructure. 
IoHT-specific cyberattacks can exploit vulnerabilities unique to medical IoT devices, such as manipulating sensor data and hindering remote access to life-saving devices [10].

Protecting sensitive health data from cyberattacks is critical, yet existing solutions often fall short due to their inability to provide robust security without compromising data privacy. Abdulwahid [11] discusses the use of Machine Learning (ML) techniques to detect middle box-based attacks on the IoHT, as such systems generate huge amounts of network data. These data can be analyzed with ML techniques to detect any unusual activities. Similarly, Kilincer et al. [12] provide an alternative Deep Learning (DL) solution by applying feature selection and a multilayer perceptron (MLP) to detect man-in-the-middle attacks on an IoMT-based (WUSTL-EHMS-2020) system. As IoHT-based systems generate large volumes of temporal data, DL-based techniques (e.g., RNN, LSTM) perform better than traditional ML by effectively capturing temporal dependencies and patterns [13,14]. Kumar et al. [15] have discussed the efficiency of using these AI-based techniques in healthcare problems. Both conventional ML and DL-based techniques use a centralized approach, posing significant privacy risks by aggregating sensitive data on a central server, making it vulnerable to breaches. This approach also leads to higher latency and inefficiencies due to the continuous data transfer [16]. In IoHT environments, traditional centralized cybersecurity methods are becoming increasingly inadequate because they often necessitate sharing local data, risking privacy breaches. Moreover, most research focuses on a particular type of attack, whereas in practice, attackers may launch more than one type of cyberattack on an IoHT-based network.

This paper proposes a novel approach for detecting multiple cyberattacks in IoHT environments using a Federated Learning (FL) framework integrated with Long Short-Term Memory (LSTM) networks. The FL paradigm ensures data privacy by allowing individual IoHT devices to collaboratively train a global model without sharing local data, thereby maintaining patient confidentiality [17]. LSTM networks, known for their effectiveness in handling time-series data, are employed to capture and analyze temporal patterns indicative of cyberthreats. Our proposed system not only uses embedded feature selection techniques to minimize the computational complexity of the cyberattack detection model but also leverages the decentralized nature of FL to create a robust and scalable cyberattack detection mechanism. We conduct extensive experiments using real-world ECU-IoHT data, demonstrating that our proposed model outperforms traditional models in terms of accuracy and data privacy. The subjective and quantitative outcomes highlight the feasibility and advantages of integrating Federated Learning with LSTM networks in enhancing the cybersecurity posture of IoHT infrastructures. This research paves the way for future developments in secure and privacy-preserving IoHT systems, ensuring reliable protection against evolving cyberthreats. The key contributions of this article are as follows:

We employ embedded feature selection techniques to minimize the computational complexity of the cyberattack detection model.
We propose a novel approach for detecting multiple cyberattacks in IoHT environments using a Federated Learning (FL) framework integrated with Long Short-Term Memory (LSTM) networks (FL-LSTM).
We ensure data privacy by enabling individual IoHT devices to collaboratively train a global model without sharing local data, thereby maintaining patient confidentiality.
We utilize LSTM networks, known for their effectiveness in handling time-series data, to capture and analyze temporal patterns indicative of cyberthreats.

The rest of the article is structured as follows: Related research is discussed in Section 2. Section 3 presents the materials and methods, highlighting the mathematical structure of the LSTM-based Federated Learning network for detecting cyberattacks. The outcomes of the simulation and comparative studies are showcased in Section 4. Section 5 explores noteworthy observations, and their implications are discussed in Section 6.

2. Related Work

Rashid et al. [18] have discussed the role of IoT technologies in enhancing people’s well-being and quality of life and the related risks in terms of cyberattacks. They suggest the use of various ML techniques such as Linear Regression, Support Vector Machines (SVMs), Decision Trees (DTs), Random Forests, Artificial Neural Networks (ANNs), and the K-Nearest Neighbors (K-NN) method for detecting such anomalies. These technologies are employed as binary and multi-class classifiers, acknowledging that attackers may launch multiple types of attacks to complicate countermeasures, potentially demanding ransom to release vital networks like the IoHT. However, Ensemble Learning was not part of their research. Saheed et al. [2] likewise discussed the application of ML and Deep Recurrent Neural Networks (DRNNs). Recurrent networks can efficiently solve time-series problems. Medical-related IoT systems store temporal logged data, making deep recurrent techniques such as LSTM, Gated Recurrent Units (GRUs), and Bi-Directional LSTM (Bi-LSTM) indispensable [19]. They also showcased an ensemble LSTM model to produce more aggregated outcomes. Fazlullah et al. [20] demonstrated EL-based techniques using a set of LSTM networks to classify cyberattacks on IoMT networks. They focused on a binary classification problem to determine whether an IoMT-based network is compromised. Their research highlighted that attackers might launch a series of attacks to completely seize a medical facility. Rahman et al. [21] presented an LSTM-based technique to detect distributed DoS attacks. Al-Kahtani et al. [22] developed a hybrid model that fused LSTM and GRU-based RNNs to detect specific intrusions. Similarly, Hong et al. [23] compared the various available RNN-based models to detect contingencies and cyberattacks on power grids. These works suggest that LSTM-based networks are very efficient in identifying cyberattacks.
However, data privacy may be compromised since these networks are centralized models. The research in [24] highlights the application of feature selection techniques, such as neighborhood component analysis, and LSTM-based Directed Acyclic Graph (DAG) networks for classifying IoHT attacks. Although the test accuracy is impressive, the process of finding regularization parameters is time-consuming. In the current research, we address this issue by implementing embedded techniques like Ensemble Learning (EL). Typically, IoT-based networks generate extensive network information, and these log data are crucial for identifying cyberattacks. However, using the entire dataset to build a predictive model can lead to overfitting and an increased convergence time. Liu et al. [25] have discussed the issue of imbalanced classification and highlight that it will affect the performance of any AI-based model. They also discussed how the use of an embedded feature selection technique can solve this problem. In addition, we have incorporated RusBoost-based EL for feature selection.

Abbas et al. [26] raised concerns with conventional ML and DL approaches, particularly regarding the centralized use of cyberattack detection models, as the privacy of crucial data may be compromised. Since these intelligent techniques learn from a network’s past records, providing all data to a central system to build such a model could reveal vital records, especially medical records. Patient privacy is paramount when building any system. To avoid such complications, Yang et al. [27] recommended using a decentralized approach like Federated Learning (FL) to prevent these issues and detect poisoning attacks in IoT networks. FL-based techniques combine private data from multiple clients to train local models, which are then aggregated to create a global model, preventing the exposure of actual data. This global model is then shared with both the central and local systems for detecting cyberattacks. In addition, Dao et al. [28] provided an FL-based framework that they successfully applied to various IoT-based networks, focusing solely on DoS attacks. Günen [29] discussed another issue with AI-based classifiers, namely data imbalance, as these classifiers are usually designed based on equal classes, and model performance may deteriorate in case of imbalanced classification. This research primarily focuses on building an FL network to detect cyberattacks on IoHT networks. Gupta et al. [30] have specifically discussed the role of FL in privacy preservation in IoHT systems. Similarly, Nair et al. [31] have showcased the application of FL in privacy preservation for IoMT big data analysis using edge computing. Singh et al. [32] have applied FL to detect intrusion in an IoMT-based network. Here, we employ LSTM to build an FL-based cyberthreat detection system. The embedded RUSBoost-based EL assists in feature selection, handles data imbalance in the classification step, and overcomes the overfitting problem of the proposed FL-based techniques.
The integration of FL is crucial as it allows for decentralized data processing, enhancing data privacy while maintaining a high detection accuracy. The works discussed above are summarized in Table 1, which also highlights the uniqueness of the present research.

3. Materials and Methods

This section provides information about the research background, which highlights the technologies involved and the ECU-IoHT dataset used in the development of the proposed cyberattack detection model. The collected dataset was preprocessed to serve as the input to the proposed model. Additionally, this section discusses the embedded feature selection and the functioning of the proposed FL network.

3.1. Research Background

The Internet of Health Things (IoHT): A specialized and regulated subset of the broader IoT, the IoHT is specifically designed to address the unique demands and challenges of healthcare [37]. It involves the connection of medical devices, health-monitoring sensors (such as fitness trackers and glucose monitors), and other healthcare-related technologies to the internet. This connectivity enables the real-time collection, transmission, and analysis of health data, enhancing patient care, diagnostics, and health management. The IoHT handles highly sensitive and personal health data, necessitating stringent privacy and security measures to protect patient information, in compliance with regulations like the Health Insurance Portability and Accountability Act (HIPAA) in the U.S. and the General Data Protection Regulation (GDPR) in Europe [38]. By focusing on improving patient care, safety, and health outcomes through connected devices, data analysis, and real-time monitoring, the IoHT distinguishes itself from the general IoT by its adherence to strict privacy and regulatory standards.

Ensemble Learning (EL): EL is a kind of ML paradigm that aggregates the predictions of several models, such as SVM, K-NN, ANN, or DT models. EL can capitalize on the strengths of each model and mitigate their weaknesses. This will lead to improved accuracy, robustness, and generalization [39]. DT-based ensembles, like Random Forests and Gradient-Boosting Machines, not only provide robust predictions but also offer intrinsic mechanisms for feature selection through feature importance metrics [40,41]. By evaluating and ranking features based on their contributions to the ensemble, practitioners can perform embedded feature selection, leading to more efficient models with potentially improved generalization capabilities.

Random Undersampling Boosting (RUSBoost): In imbalanced datasets, typical feature selection methods may overestimate the importance of features that are more predictive of the majority class. RUSBoost, by mitigating this imbalance, provides a more balanced view of feature importance, leading to better feature selection decisions. In embedded feature selection, as with other boosting methods, RUSBoost inherently performs feature selection by focusing on features that reduce the classification error over successive iterations. The features that consistently contribute to correcting misclassifications are identified as more important. After training, features with low importance can be trimmed [26,42].
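The undersampling half of RUSBoost can be illustrated with a minimal Python sketch. It covers only the class-balancing step, not the boosting loop, and it is not the authors' implementation. The function name `random_undersample`, the choice to shrink every class to the size of the rarest class, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class.

    Assumption: balancing down to the rarest class is one simple policy;
    other target ratios are possible.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(rows) for rows in by_class.values())
    out_x, out_y = [], []
    for y, rows in by_class.items():
        for x in rng.sample(rows, target):  # sampling without replacement
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy, made-up data: 8 majority-class rows versus 2 minority-class rows.
X = [[i] for i in range(10)]
y = ["Smurf"] * 8 + ["DoS"] * 2
X_bal, y_bal = random_undersample(X, y)
print(Counter(y_bal))
```

In a boosting setting, a fresh undersampled subset of this kind would typically be drawn before each weak learner is trained, so that no single learner is dominated by the majority class.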

Federated Learning (FL): FL is a decentralized approach to ML where multiple devices or servers collaboratively train a model without sharing their local data. Instead of sending data to a central server for training, each device (often referred to as a “client”) trains the model locally using its own data, and only the model updates (e.g., gradients or model parameters) are shared with a central server. The server aggregates these updates to create a global model that benefits from the collective knowledge of all participating devices without exposing their individual data [43]. Federated Learning in the IoHT offers significant benefits by enabling the development of accurate and privacy-preserving prediction models. It allows healthcare providers to leverage the wealth of data generated by IoHT devices without compromising patient privacy or regulatory compliance. Data security is provided by keeping data on local devices, enabling FL to reduce the risk of data breaches and cyberattacks that could occur if large amounts of sensitive health data were centralized [28].

Long Short-Term Memory (LSTM): LSTM is a type of RNN architecture designed to process and predict sequences of data. The LSTM architecture is particularly well suited for tasks involving time-series data, natural language processing (NLP), and other sequence-based problems because it can capture and retain information over long periods, which is challenging for standard RNNs. The LSTM cell has a “cell state” that runs through the entire sequence, functioning as a memory that carries relevant information across the sequence. It is regulated by several gates that control the flow of information [44]. In this research, the LSTM technique is used as a classifier that will detect various cyberattacks.

These technologies were utilized to develop an AI-based cyberattack detection model, leveraging the effectiveness of RUSBoost-based Ensemble Learning for feature selection to address challenges like class imbalance and overfitting. Additionally, we emphasized the integration of Deep Recurrent Networks, particularly LSTM-based Federated Learning, to create a robust model that ensures data privacy.

3.2. Research Dataset and Preprocessing

The ECU-IoHT dataset [45] was developed by Edith Cowan University (ECU), Australia, to address security concerns in IoHT-based systems, where various physiological sensors are connected to IoT-based networks. During our experiments, the Libelium MySignals healthcare kit was used as a testbed to create an IoHT environment. This kit comprises multiple health-related sensors to monitor various biometrics. These sensors primarily monitor vital parameters such as temperature, blood pressure (BP), and heart rate to assess a patient’s health status. These data are transferred to the user’s private cloud account with the help of Wi-Fi or Bluetooth. Multiple cyberattacks were executed on the target device from the host IP, with the default gateway. The dataset includes a variety of cyberattacks to ensure comprehensive testing of the proposed model. Specifically, it consists of instances of the following:

ARP poisoning (ARP spoofing): In this type of attack, the attacker sends spoofed ARP messages over LAN networks and redirects the traffic from the intended host to the attacker. This diverts the traffic from the target address to another IP address, and subsequently generates a huge amount of malicious ARP packets in Wireshark.
Denial-of-Service (DoS) attack: DoS attacks usually entail large amounts of data being sent to the server, potentially impacting its reliability or stability. This is realized with the help of Ettercap, an open-source network security tool for man-in-the-middle attacks on local area networks (LANs).
Smurf attack: A Smurf attack is a kind of distributed DoS attack. Usually, a high quantity of echo requests will be flooded into the server during this type of attack.
Nmap port scan: Nmap port scans are categorized as a type of attack in the ECU-IoHT dataset, underscoring their role in the reconnaissance phase of cyberattacks.

In the three hours of simulation, ECU-IoHT keeps records of factors such as the type of protocol, simulation period, number of flows, source bytes, destination bytes, source packets, destination packets, normal and attack instances, and unique source and destination IP addresses. Figure 1 depicts a histogram analysis of the ECU-IoHT dataset, illustrating the frequency of various cyberattacks and normal instances. The analysis shows that the “Smurf attack” occurred most frequently, with a maximum of 77,920 instances in the current IoHT network. In contrast, the “DoS attack” had the fewest occurrences, totaling only 639 times. The figure also displays the frequency of other attacks and normal instances. It indicates that the data used to build the proposed cyberthreat detection model represent an example of imbalanced classification, which needs to be addressed to maintain overall accuracy. The ECU-IoHT dataset was developed following the standard methodology for whitehat penetration testing, a widely recognized approach used by companies and IT organizations to simulate various attacks for vulnerability analyses. The reasoning behind the selection of these four attacks (ARP poisoning, DoS, Smurf, and Nmap) is that they represent common threats that an attacker might launch to compromise IoHT systems. Therefore, our detection system needed to be designed to identify these types of attacks, enabling timely countermeasures to be implemented.

After collecting the datasets, the data underwent preprocessing. This step not only made the data compatible with feature selection but also assisted in building the proposed classification network. From the available dataset, we created a table consisting of encoded features and labeled targets, as any DL/ML network only accepts numerical values. The IoHT-based dataset contains a lot of network information and features, such as source or destination addresses or protocols having character values. These needed to be encoded using the label-encoding technique, which converts categorical data into numerical format. The data were then randomly divided into training, validation, and test sets in the ratio of 70:15:15, respectively. After that, any feature selection technique could be applied to these data. For the proposed model, the selected input features needed to be reshaped and kept in a cell array. Similarly, the target output data needed to be categorized because the proposed model accepts sequential data.
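The label-encoding and 70:15:15 splitting steps described above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' MATLAB pipeline. The helper names `label_encode` and `split_70_15_15`, the sorted-order code assignment, the fixed random seed, and the toy protocol values are assumptions.

```python
import random


def label_encode(values):
    """Map each distinct categorical value to an integer code (codes follow sorted order)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def split_70_15_15(rows, seed=0):
    """Shuffle the rows, then split them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 70 // 100  # integer arithmetic avoids float rounding
    n_val = len(rows) * 15 // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


# Toy, made-up protocol column and 100 placeholder row indices.
protocols = ["ARP", "DNS", "ARP", "TCP", "DNS", "ARP"]
codes, mapping = label_encode(protocols)
train, val, test = split_70_15_15(range(100))
print(codes, mapping)
print(len(train), len(val), len(test))
```

Note that label encoding imposes an arbitrary order on the categories; this is usually harmless for tree-based feature scoring, but it is worth keeping in mind when the encoded values are fed to other model types.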

3.3. Proposed Embedded Federated Learning-Driven Long Short-Term Memory (EFL-LSTM) Model

For the present research, we chose to apply the ML-based embedded Ensemble Learning (EL) approach to identify the important features for detecting different attacks on IoHT-based systems. Here, the EL involved utilizing various Decision Trees (DTs) trained on subsets of the collected ECU-IoHT datasets, and the results from these DTs were aggregated to provide individual feature scores that highlight the features that significantly impact the proposed cyberthreat detection model. The EL-based feature selection algorithm can be mathematically expressed with the following equations.

Let $X$ be a matrix of features, with $n$ rows (samples) and $d$ columns (features). Similarly, let $y$ be the labels’ vector, and $(X, y)$ can be represented as follows:

$$X = [x_{1}, x_{2}, \dots, x_{n}] \tag{1}$$

$$y = [y_{1}, y_{2}, \dots, y_{n}] \tag{2}$$

Split $X$ and $y$ into training and validation sets, $(X_{train}, y_{train})$ and $(X_{val}, y_{val})$, respectively. Then, define an ensemble of $m$ models; each of them is trained on a random set of features, as represented in Equation (3):

$$h_{i} = [x_{1 i}, x_{2 i}, \dots, x_{j i}] \tag{3}$$

For each model $i$ in the ensemble, randomly select a subset of features $X_{i}$ from $X$. Train the EL model on the subset of features $X_{i}$ and the corresponding labels $y_{i}$. The subset of features $X_{i}$ can be represented as a binary vector of length $j$, as follows:

$$X_{i} = [x_{1 i}, x_{2 i}, \dots, x_{j i}] \tag{4}$$

where $x_{j i}$ is equal to 1 if the $j^{\text{th}}$ feature is included in the subset, and is 0 otherwise. Each model $h_{i}$ can be represented as the function $h_{i} : X \to y$, which maps the feature vector to the target labels. Combine the predictions of the EL to produce the final prediction:

$$\hat{y} = \frac{\sum w_{i} * h_{i}}{\sum w_{i}} \tag{5}$$

where $\hat{y}$ is the predicted output value and $w_{i}$ is the weight assigned to model $i$. Evaluate the performance on the validation set, using a suitable evaluation metric, such as accuracy or the mean squared error (MSE). Then, calculate the importance of each feature across the ensemble using Equation (6):

$$I_{j} = \frac{\sum w_{i} * I_{j i}}{\sum w_{i}} \tag{6}$$

where $I_{j}$ is the feature score of the $j^{\text{th}}$ feature, and $I_{j i}$ is the importance of feature $j$ in model $i$. Feature importance can be defined in various ways, such as the average decrease in impurity for DTs or the magnitude of weights for linear models. For the proposed work, we chose to employ an ensemble of Decision Trees using the Random Undersampling Boosting (RUSBoost) method for feature selection, which is particularly effective for dealing with imbalanced datasets [46]. RUSBoost balances the class distribution by undersampling the majority class while boosting the overall performance. Using the selected features $I_{j}$, an LSTM cell is designed to maintain long-term dependencies by using a series of gates that regulate the flow of information [47]. The mathematical model of an LSTM cell is represented as follows:

$$f_{t} = \sigma (W_{f} \cdot [h_{t - 1}, I_{j}] + b_{f}) \tag{7}$$

where $f_{t}$ is the forget gate that determines which information to discard from the cell state. Similarly, the input gate, i.e., $i_{t}$, decides which values to update in the cell state:

$$i_{t} = \sigma (W_{i} \cdot [h_{t - 1}, I_{j}] + b_{i}) \tag{8}$$

Then, create a vector of new candidate values that could be added to the cell state, as follows:

$$\bar{C}_{t} = \tanh (W_{c} \cdot [h_{t - 1}, I_{j}] + b_{c}) \tag{9}$$

Update the cell state, i.e., $C_{t}$, based on the previous cell state, the forget gate, and the input gate:

$$C_{t} = f_{t} * C_{t - 1} + i_{t} * \bar{C}_{t} \tag{10}$$

The output gate, i.e., $o_{t}$, decides what the next hidden state should be, as follows:

$$o_{t} = \sigma (W_{o} \cdot [h_{t - 1}, I_{j}] + b_{o}) \tag{11}$$

The hidden state, i.e., $h_{t}$, is then updated using the output gate and new cell state:

$$h_{t} = o_{t} * \tanh (C_{t}) \tag{12}$$

where $\sigma$ is the sigmoid activation function and $\tanh$ is the hyperbolic tangent activation function. The symbols $(W_{f}, W_{i}, W_{c}, W_{o})$ are the weight matrices of the forget gate, input gate, candidate cell state, and output gate. Similarly, $(b_{f}, b_{i}, b_{c}, b_{o})$ are the bias vectors of the forget gate, input gate, candidate cell state, and output gate.
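To make the gate equations concrete, the following Python sketch performs a single LSTM time step following Equations (7)-(12), using plain lists rather than a deep learning library. It is an illustrative sketch only. The helper names `affine` and `lstm_step`, the parameter dictionary layout, and the all-zero toy weights are assumptions and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the activation written as sigma in Equations (7), (8), and (11)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W . v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; x_t plays the role of the selected features I_j."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, I_j]
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]        # Eq. (7), forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]        # Eq. (8), input gate
    c_bar = [math.tanh(a) for a in affine(p["Wc"], z, p["bc"])]  # Eq. (9), candidates
    c = [ft * cp + it * cb
         for ft, cp, it, cb in zip(f, c_prev, i, c_bar)]         # Eq. (10), cell state
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]        # Eq. (11), output gate
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]             # Eq. (12), hidden state
    return h, c


# Toy setup (assumption): 2 hidden units, 3 input features, all-zero parameters.
hidden, n_in = 2, 3
zeros_W = [[0.0] * (hidden + n_in) for _ in range(hidden)]
params = {k: zeros_W for k in ("Wf", "Wi", "Wc", "Wo")}
params.update({k: [0.0] * hidden for k in ("bf", "bi", "bc", "bo")})
h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
print(h, c)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate vector is zero, so the new cell state is simply half of the previous one. This makes the sketch easy to check by hand before real weights are substituted.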

Equations (1)–(12) represent the architecture of the Ensemble Learning-based LSTM model, which forms the core of our proposed cyberthreat detection model. This model includes various layers such as dropout layers, fully connected layers, soft-max layers, and classification layers. The Adaptive Moment Estimation (Adam) algorithm was employed to optimize the entire network. In the context of Federated Learning (FL), a global LSTM model is first initialized on a central server, with initial weights and biases set randomly. Instead of training the model on the entire dataset centrally, local models are trained on each client’s data using this global model as a starting point. After local training, the updated weights from each client are sent back to the server, where they are averaged to update the global model. This global model is then redistributed to the clients for further training. Crucially, FL preserves privacy by ensuring that raw client data remain on the local devices, and only model updates (such as gradients or weights) are shared with the central server for aggregation and further refinement of the global model. Figure 2 highlights the structure of the proposed model with all necessary steps like feature selection, preprocessing, and building as well as validating the proposed LSTM-based FL network.
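The server-side averaging loop described above can be sketched as follows. This is a simplified illustration and not the authors' implementation. The model is reduced to a flat list of two weights, `local_update` is a hypothetical stand-in for local LSTM training (a single gradient step), and the optional `client_sizes` weighting is an assumption; the text only states that client weights are averaged.

```python
def average_weights(client_weights, client_sizes=None):
    """Average the clients' weight vectors; optionally weight clients by sample count."""
    if client_sizes is None:
        client_sizes = [1] * len(client_weights)  # plain, unweighted average
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * s for w, s in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


def local_update(global_weights, local_gradient, lr=0.1):
    """Hypothetical stand-in for local training: one gradient step from the global weights."""
    return [w - lr * g for w, g in zip(global_weights, local_gradient)]


# Toy setup (assumption): 5 clients, each with a fixed made-up gradient.
global_w = [0.0, 0.0]
client_grads = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
for communication_round in range(3):
    # Each client starts from the current global model and trains locally;
    # only the updated weights (never the raw data) are returned to the server.
    updates = [local_update(global_w, g) for g in client_grads]
    global_w = average_weights(updates)
print(global_w)
```

Only the weight vectors cross the client boundary in this loop, which is the property that the privacy argument in the text relies on.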

The parameters used to build the proposed EFL-LSTM model are summarized in Table 2. The choice of parameters was based on technical and scientific considerations to optimize the performance of the model. For feature selection, a Decision Tree (DT) was used as the learner parameter, which was then enhanced using Ensemble Learning (EL) techniques. The developed EL component was then further optimized with the RusBoost algorithm to improve the feature selection efficacy. Table 2 details the hyperparameters of the LSTM network used as a classifier in the EFL-LSTM technique. The model was optimized using the Adam learning algorithm, which is known for its computational efficiency and low memory requirements. The number of epochs was set to 10 to balance the training time and model performance. The initial learning rate was set to 0.001 with a piecewise schedule to allow the learning rate to decrease over time, enhancing the convergence. The learning rate drop period and drop factor were set to 5 and 0.1, respectively, to ensure a gradual and controlled reduction in learning rate, preventing abrupt changes that could destabilize the training. Since the model integrates Federated Learning (FL) with EL and LSTM, the number of clients (local models) was set to 5. This choice ensured that there was a sufficient distribution of data across multiple clients while maintaining a manageable computational complexity.
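The piecewise learning-rate schedule described above can be sketched with the stated values (initial rate 0.001, drop period 5, drop factor 0.1, 10 epochs). The function name `piecewise_learning_rate` is hypothetical, and the sketch assumes that the rate is multiplied by the drop factor once every `drop_period` completed epochs, so that the first drop takes effect at epoch 6.

```python
def piecewise_learning_rate(epoch, initial_lr=0.001, drop_period=5, drop_factor=0.1):
    """Learning rate for a 1-indexed epoch under a piecewise (step) schedule.

    Assumption: the rate is multiplied by drop_factor after every
    drop_period completed epochs.
    """
    n_drops = (epoch - 1) // drop_period
    return initial_lr * drop_factor ** n_drops


schedule = [piecewise_learning_rate(e) for e in range(1, 11)]
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Under this assumption, the first five epochs run at the initial rate and the remaining five at one tenth of it, which matches the controlled reduction that the paragraph describes.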

4. Subjective and Quantitative Evaluation

The proposed EFL-LSTM model was implemented in the MATLAB 2024a software environment. The hardware configuration included a 12th Generation Intel Core i5-12400 processor, 16 GB of RAM, and a 2.50 GHz clock frequency. In terms of network and cyberthreat information, the ECU-IoHT dataset contains a total of 111,208 entries and features information such as the protocol type, timestamps, the number of flows, source bytes, destination bytes, source packets, destination packets, instances of normal and attack occurrences, and unique source and destination IP addresses.

4.1. Subjective Analysis

Initially, we needed to determine the optimal leaf size to build a Decision Tree (DT) for the Ensemble Learning-based proposed classifier. Figure 3a illustrates the classification error across various leaf sizes, indicating that a leaf size of 5 is more efficient than sizes of 1 and 15, as it resulted in the minimum classification error. For this experimental task, we utilized 25 Decision Trees. Figure 3b showcases the feature scores based on embedded EL methods, with the x-axis demonstrating all features. Feature 1 represents the timestamps, capturing the precise moment when the packet was recorded. Feature 2 identifies the source address, which is the MAC or IP address of the device that transmitted the packet. In a similar vein, Feature 3 denotes the destination address, identifying the MAC or IP address of the receiving device. Feature 4 indicates the protocol used for the communication, such as ARP or DNS. Feature 5 describes the length of the packet in bytes, offering insight into the size of the data being transmitted. Lastly, Feature 6 provides additional information or content of the packet, giving a detailed account of the communication process and context. Similarly, the y-axis highlights the feature scores, and it reveals that here, all features are important in the present IoHT network, as indicated by the non-zero feature score values.

Figure 4a,b demonstrate the training progress of the global model of the proposed EFL-LSTM method, showing that the percentage accuracy improved with each iteration and reached the expected limit of 100% accuracy. Similarly, the loss decreased with each iteration, reaching its minimum level, indicating that the developed predictive model was ready to be tested with unknown features that were not included in the training. The light and dark blue curves correspond to the accuracy, whereas the red and yellow curves represent the loss. These colors are system-generated by default to visually distinguish between the metrics and demonstrate that the accuracy of the proposed technique improves with the number of iterations or epochs. Consequently, the loss (or error) decreases as the model trains. The numbers of iterations and epochs are predefined in the experimental setup.

Figure 5 exhibits the normalized confusion matrix of the proposed EFL-LSTM network, revealing the performance of the developed model subjectively. Here, normal events and specific cyberattacks were classified by the proposed model. For example, the proposed network detected “ARP spoofing” 342 times with 100% accuracy. In the case of a “DoS attack”, this attack was detected accurately only 21 times; it was misclassified once as an “Nmap port scan” and 65 times as “no attack”. Figure 5 also displays the classification of specific attacks separately in parallel columns. These columns provide a detailed breakdown of the classification results for specific types of cyberattacks, displayed separately from the overall confusion matrix. This figure shows that when detecting a “DoS attack” or an “Nmap port scan”, our model needs improvement, as it exhibited accuracies of 24.1% and 77.3% for these attacks, respectively. This may have happened due to the smaller amount of information available about such attacks compared to others. However, the proposed model shows significant outcomes in the case of classifying potential cyberattacks as “no attack” or a “Smurf attack”, providing accuracies of up to 95.2% and 100%, respectively.
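The per-class accuracy quoted for the “DoS attack” class follows directly from the row counts given above. The short Python sketch below reproduces that arithmetic. It assumes that the three counts listed in the text are the only non-zero entries in that row of the confusion matrix, and the helper name is illustrative.

```python
# Counts for samples whose true class is "DoS attack", as reported in the text.
# Assumption: all other entries in this row of the confusion matrix are zero.
dos_row = {"DoS attack": 21, "Nmap port scan": 1, "no attack": 65}


def per_class_accuracy(row, true_label):
    """Percentage of samples of true_label that were predicted as true_label."""
    total = sum(row.values())
    return 100.0 * row[true_label] / total


dos_accuracy = per_class_accuracy(dos_row, "DoS attack")
print(f"DoS attack: {dos_accuracy:.1f}% of {sum(dos_row.values())} samples")
```

With 21 correct detections out of 87 samples, the result is 24.1%, which matches the value read from the normalized confusion matrix.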

4.2. Quantitative Analysis

The given confusion matrix can also be used to determine the overall accuracy of the proposed EFL-LSTM model. Additionally, various other performance measures can be evaluated, including metrics such as accuracy, recall, specificity, precision, the False Positive Rate (FPR), F1 score, Matthews Correlation Coefficient (MCC), and Cohen’s Kappa (K). These metrics are mathematically expressed as follows:

\mathrm{Accuracy}\ (\%) = \left(\frac{TP + TN}{TP + TN + FP + FN}\right) \times 100

(13)

\mathrm{Recall} = \frac{TP}{TP + FN}

(14)

\mathrm{Specificity} = \frac{TN}{TN + FP}

(15)

\mathrm{Precision} = \frac{TP}{TP + FP}

(16)

\mathrm{FPR} = \frac{FP}{FP + TN}

(17)

\mathrm{F1\ Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(18)

\mathrm{MCC} = \frac{TN \times TP - FN \times FP}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}

(19)

K = \frac{2 \times (TP \times TN - FN \times FP)}{(TP + FP) \times (FP + TN) + (TP + FN) \times (FN + TN)}

(20)

where TP, TN, FN, and FP denote the True Positives, True Negatives, False Negatives, and False Positives, respectively.
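As an illustration, the following Python sketch evaluates these eight measures from the four confusion counts using their standard binary-classification definitions. The function name and the example counts are hypothetical and are not taken from the experiments. The sketch also assumes that every denominator is non-zero.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute the eight performance measures from binary confusion counts.

    Assumption: all denominators are non-zero (no class is empty).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    kappa_denominator = (tp + fp) * (fp + tn) + (tp + fn) * (fn + tn)
    return {
        "accuracy_pct": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "recall": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "fpr": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tn * tp - fn * fp) / mcc_denominator,
        "kappa": 2 * (tp * tn - fn * fp) / kappa_denominator,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=40, tn=50, fp=5, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class problem such as the one studied here, one common approach is to compute these counts per class in a one-vs-rest fashion and then average the resulting measures. Whether that matches the averaging used for the reported figures is not stated in this section.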

Table 3 presents the quantitative results of the experiments. The proposed EFL-LSTM technique achieves an accuracy score of 0.9716, i.e., a percentage accuracy of 97.16%, whereas the simple EL-based embedded LSTM network technique achieves an accuracy score of 0.9325, i.e., 93.25%. The F1 score of the proposed technique (0.8237) likewise surpasses that of the non-FL-based technique (0.7737), and the False Positive Rate (FPR) decreases from 0.0228 to 0.0068. The remaining measures show the same trend, indicating that the proposed FL-based technique outperforms the non-FL LSTM-based technique. Although features were selected with the help of DT-based Ensemble Learning, the improved performance of the proposed model is attributed to its use of diverse data during training, obtained from subsets of the ECU-IoHT data. Additionally, the FL-based technique employs robust aggregation techniques that enhance the accuracy, convergence speed, robustness, and privacy preservation capabilities of FL algorithms, ultimately leading to better overall performance.
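The aggregation rule itself is not spelled out in this section. Purely as an illustration, the Python sketch below assumes sample-weighted federated averaging (FedAvg), a common choice in FL, in which each client's parameters are weighted by the number of local training samples. The function name, the parameter vectors, and the client sizes are all hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Sample-weighted average of client parameter vectors (FedAvg-style).

    client_weights: one flat list of floats per client, all of equal length.
    client_sizes: number of local training samples held by each client.
    """
    if len(client_weights) != len(client_sizes):
        raise ValueError("one size is required per client")
    total_samples = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes))
        / total_samples
        for i in range(n_params)
    ]


# Hypothetical example with 5 clients holding 2 parameters each.
local_models = [[0.10, 0.50], [0.20, 0.40], [0.30, 0.30], [0.40, 0.20], [0.50, 0.10]]
local_sizes = [100, 200, 300, 200, 200]
global_model = federated_average(local_models, local_sizes)
print(global_model)
```

Only the parameter vectors and sample counts leave each client in this scheme. The raw records stay local, which is the property the FL component relies on for privacy.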

Figure 6 illustrates the relationship between the training time, selected features, and percentage accuracy. The features were selected based on the best feature scores, which were evaluated using the embedded Ensemble Learning (EL) techniques. The plot shows that selecting all six features resulted in a training time of 900 s and a percentage accuracy of 97.16%. When the best four features were selected, the training time decreased to 755 s, and the percentage accuracy dropped slightly to 96.57%. However, selecting only the best two features caused a significant decrease in the percentage accuracy, which fell to 92.88%. This drop in accuracy occurred because, as shown in Figure 3b, all features in our model are important, and so eliminating any features negatively impacts the performance. Therefore, while reducing the number of features can decrease the training time of the proposed algorithm, it compromises the accuracy.
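The trade-off can be made concrete with a few lines of Python arithmetic on the figures quoted above. The variable names are illustrative. The training time for the two-feature case is not given in the text, so it is left out.

```python
# Values quoted in the text for Figure 6.
all_features = {"count": 6, "time_s": 900, "accuracy_pct": 97.16}
best_four = {"count": 4, "time_s": 755, "accuracy_pct": 96.57}
best_two_accuracy_pct = 92.88  # training time not reported in the text

# Relative reduction in training time when going from six to four features.
time_saving_pct = 100 * (all_features["time_s"] - best_four["time_s"]) / all_features["time_s"]

# Accuracy lost, in percentage points, relative to using all six features.
drop_four_points = all_features["accuracy_pct"] - best_four["accuracy_pct"]
drop_two_points = all_features["accuracy_pct"] - best_two_accuracy_pct

print(f"time saved with four features: {time_saving_pct:.1f}%")
print(f"accuracy lost with four features: {drop_four_points:.2f} points")
print(f"accuracy lost with two features: {drop_two_points:.2f} points")
```

Dropping two features therefore saves about 16.1% of the training time at a cost of 0.59 percentage points, whereas keeping only two features costs 4.28 percentage points.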

5. Discussion

To demonstrate the efficacy of the proposed model, we conducted a comprehensive set of experiments, including both subjective and quantitative analyses. These experiments cover various aspects such as the optimal leaf size for the Decision Trees, feature importance scores, confusion matrices, quantitative performance metrics, and training times. Additionally, we performed a sensitivity analysis to examine the impact of different parameters on the model’s accuracy. The results, obtained on the ECU-IoHT dataset, show that the proposed EFL-LSTM model is effective in detecting cyberattacks within IoHT networks. The model’s performance was significantly enhanced by the careful selection and optimization of key parameters. For instance, the choice of a Decision Tree (DT) as the base learner within the ensemble framework and the use of the RUSBoost algorithm were instrumental in effective feature selection, as highlighted in Table 2. These decisions directly contributed to improving the model’s overall robustness and accuracy.

The impact of varying leaf sizes on the classification error is illustrated in Figure 3a, which reveals that while adjusting the leaf size influences model performance, increasing the leaf size beyond a certain point does not guarantee further error reduction and only increases the computational time. This insight informed our choice of the optimal leaf size for the DT-based ensemble model. Additionally, Figure 3b provides an analysis of feature importance using the embedded RUSBoost-based Ensemble Learning method, indicating that all features in the IoHT network contribute to the model’s effectiveness, with each feature demonstrating a non-zero importance score.

The training progress plot in Figure 4 further supports the model’s capacity to adapt and improve iteratively, as evidenced by the continuous increase in accuracy with each iteration. This iterative learning process highlights the model’s ability to refine its performance over time. Furthermore, the normalized confusion matrix (Figure 5) presents a detailed evaluation of the model’s classification capabilities across different types of cyberattacks. While the model achieved high accuracy in detecting ARP spoofing and Smurf attacks, its performance was markedly lower for less frequent attacks such as DoS and Nmap port scan attacks, with accuracies of 24.1% and 77.3%, respectively. This discrepancy can be attributed to the lower representation of these attacks in the dataset, which presents an opportunity for further enhancement by augmenting the dataset with more diverse examples of these specific threats. Although Smurf attacks and ARP spoofing may be less common today due to existing mitigation techniques, our model’s ability to accurately detect these attacks underscores its comprehensive threat detection capabilities. Including these varied attacks in our evaluation was crucial for demonstrating this model’s generalizability and ensuring its applicability across a wide range of cyberthreats.

In Table 3, we present a quantitative comparison between the proposed EFL-LSTM model and a non-FL-based technique. The results show that the EFL-LSTM model significantly outperformed the non-FL approach, achieving an overall accuracy of 97.16% compared to 93.25%. This superior performance extended across various evaluation metrics, further validating the effectiveness of our approach. Additionally, Figure 6 illustrates the trade-off between training time and accuracy when varying the number of selected features. While fewer features reduced the training time, it came at the cost of decreased accuracy, emphasizing the critical role of comprehensive feature selection in maintaining a high detection performance. Figure 7 presents a sensitivity analysis of the proposed EFL-LSTM model, showing accuracy based on varying hyperparameters such as the number of hidden units and learning rates. The model achieved its highest accuracy, 0.9744, when the learning rate was set to 0.0001 and the number of hidden units was 100.

The robustness of the proposed EFL-LSTM network was further validated by applying it to the WUSTL-EHMS-2020 [48] dataset for intrusion detection. The model achieved an impressive accuracy of 99.78% on this binary classification task, successfully distinguishing between normal and attack events. Specifically, when tested on 30% of previously unseen data, the model correctly classified normal events 4227 times and misclassified 55 instances as attacks. Similarly, it accurately identified 559 attack instances, with only 55 false negatives. These results reinforce the model’s ability to generalize effectively across different datasets and cyberthreat scenarios.

In summary, the proposed EFL-LSTM model demonstrates strong performance in detecting a variety of cyberattacks within IoHT networks, driven by the strategic combination of LSTM networks, Ensemble Learning, and Federated Learning. The proposed model can therefore be applied to detect other cyberattacks and to address imbalanced classification problems, provided that it is trained with more data covering those cases. While there are areas for further refinement, particularly in addressing less frequent attack types, the model’s ability to achieve high accuracy across different datasets and attacks establishes it as a promising solution for enhancing IoHT network security.

6. Conclusions

This research introduces an innovative approach to enhancing cybersecurity in IoHT environments by integrating Federated Learning (FL) with Long Short-Term Memory (LSTM) networks. The proposed Embedded Federated Learning-Driven LSTM (EFL-LSTM) model combines advanced feature selection techniques, such as RUSBoost, with FL’s decentralized framework to protect sensitive healthcare data while improving the cyberthreat detection accuracy. This model can also help to mitigate problems such as overfitting and imbalanced classification in AI-based solutions. Our experiments using real-world IoHT data, specifically from the ECU-IoHT and WUSTL-EHMS-2020 datasets, have yielded promising results, achieving accuracies of 97.16% and 99.78%, respectively. These high levels of accuracy demonstrate the model’s robust capability in detecting and classifying potential cyberattacks within healthcare systems. The practical contribution of this work lies in its ability to enhance the security of healthcare networks by enabling local data processing, thereby minimizing the risk of data breaches and safeguarding patient confidentiality. Looking ahead, future work will focus on refining the model to improve the detection of specific threats like DoS attacks and Nmap port scans. Expanding the scope of FL applications across the diverse healthcare industry will also be critical in maintaining resilience against emerging cyberthreats.

Author Contributions

Conceptualization, M.K.; methodology, M.K.; software, M.K.; formal analysis, M.K. and S.K.; investigation, S.K.; resources, S.K.; writing—original draft, M.K.; writing—review and editing, M.K.; Supervision, S.K.; project administration, S.K.; funding acquisition, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by Seoul National University of Science and Technology.

Data Availability Statement

The data used for this research are available at https://ro.ecu.edu.au/datasets/48/ (Accessed on: 20 May 2024).

Acknowledgments

This research was financially supported by the Seoul National University of Science and Technology.

Conflicts of Interest

There are no conflicts of interest between the authors.

References

1. Zhang, Y.; Zhu, D.; Wang, M.; Li, J.; Zhang, J. A Comparative Study of Cyber Security Intrusion Detection in Healthcare Systems. Int. J. Crit. Infrastruct. Prot. 2024, 44, 100658.
2. Saheed, Y.K.; Arowolo, M.O. Efficient Cyber Attack Detection on the Internet of Medical Things-Smart Environment Based on Deep Recurrent Neural Network and Machine Learning Algorithms. IEEE Access 2021, 9, 161546–161554.
3. Suleski, T.; Ahmed, M.; Yang, W.; Wang, E. A Review of Multi-Factor Authentication in the Internet of Healthcare Things. Digit. Health 2023, 9, 20552076231177144.
4. Suleski, T.; Ahmed, M. A Data Taxonomy for Adaptive Multifactor Authentication in the Internet of Health Care Things. J. Med. Internet Res. 2023, 25, e44114.
5. Donga, L.; Raj, R.K.; Mishra, S. Internet of Healthcare Things (IoHT): Towards a Digital Chain of Custody. In Proceedings of the 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI 2022), Dallas, TX, USA, 14–17 June 2022; pp. 524–526.
6. Siemens Healthcare GmbH. Embracing Healthcare 4.0. Available online: https://www.siemens-healthineers.com/insights/news/embracing-healthcare-4-0.html (accessed on 13 June 2024).
7. Si-Ahmed, A.; Al-Garadi, M.A.; Boustia, N. Survey of Machine Learning Based Intrusion Detection Methods for Internet of Medical Things. Appl. Soft Comput. 2022, 140, 110227.
8. Rasool, R.U.; Ahmad, H.F.; Rafique, W.; Qayyum, A.; Qadir, J. Security and Privacy of Internet of Medical Things: A Contemporary Review in the Age of Surveillance, Botnets, and Adversarial ML. J. Netw. Comput. Appl. 2022, 201, 103332.
9. Mohamad Noor, M.B.; Hassan, W.H. Current Research on Internet of Things (IoT) Security: A Survey. Comput. Netw. 2019, 148, 283–294.
10. Algethami, S.A.; Alshamrani, S.S. A Deep Learning-Based Framework for Strengthening Cybersecurity in Internet of Health Things (IoHT) Environments. Appl. Sci. 2024, 14, 4729.
11. Al Abdulwahid, A. Detection of Middlebox-Based Attacks in Healthcare Internet of Things Using Multiple Machine Learning Models. Comput. Intell. Neurosci. 2022, 2022, 2037954.
12. Firat Kilincer, I.; Ertam, F.; Sengur, A.; Tan, R.S.; Rajendra Acharya, U. Automated Detection of Cybersecurity Attacks in Healthcare Systems with Recursive Feature Elimination and Multilayer Perceptron Optimization. Biocybern. Biomed. Eng. 2023, 43, 30–41.
13. Ullah, I.; Mahmoud, Q.H. Design and Development of RNN Anomaly Detection Model for IoT Networks. IEEE Access 2022, 10, 62722–62750.
14. Singh, S.K.; Kumar, M.; Tanwar, S.; Park, J.H. GRU-Based Digital Twin Framework for Data Allocation and Storage in IoT-Enabled Smart Home Networks. Future Gener. Comput. Syst. 2024, 153, 391–402.
15. Kumar, M.; Kumar Singh, S.; Kim, S. Predictive Analytics for Mortality: FSRNCA-FLANN Modeling Using Public Health Inventory Records. IEEE Access 2024, 12, 81252–81264.
16. Mothukuri, V.; Khare, P.; Parizi, R.M.; Pouriyeh, S.; Dehghantanha, A.; Srivastava, G. Federated-Learning-Based Anomaly Detection for IoT Security Attacks. IEEE Internet Things J. 2022, 9, 2545–2554.
17. Hijazi, N.M.; Aloqaily, M.; Guizani, M.; Ouni, B.; Karray, F. Secure Federated Learning with Fully Homomorphic Encryption for IoT Communications. IEEE Internet Things J. 2023, 11, 4289–4300.
18. Rashid, M.M.; Kamruzzaman, J.; Hassan, M.M.; Imam, T.; Gordon, S. Cyberattacks Detection in Iot-Based Smart City Applications Using Machine Learning Techniques. Int. J. Environ. Res. Public Health 2020, 17, 9347.
19. Saharkhizan, M.; Azmoodeh, A.; Dehghantanha, A.; Choo, K.K.R.; Parizi, R.M. An Ensemble of Deep Recurrent Neural Networks for Detecting IoT Cyber Attacks Using Network Traffic. IEEE Internet Things J. 2020, 7, 8852–8859.
20. Khan, F.; Jan, M.A.; Alturki, R.; Alshehri, M.D.; Shah, S.T.; Rehman, A.U. A Secure Ensemble Learning-Based Fog-Cloud Approach for Cyberattack Detection in IoMT. IEEE Trans. Ind. Inform. 2023, 19, 10125–10132.
21. Rehman, S.U.; Khaliq, M.; Imtiaz, S.I.; Rasool, A.; Shafiq, M.; Javed, A.R.; Jalil, Z.; Bashir, A.K. DIDDOS: An Approach for Detection and Identification of Distributed Denial of Service (DDoS) Cyberattacks Using Gated Recurrent Units (GRU). Future Gener. Comput. Syst. 2021, 118, 453–466.
22. Al-Kahtani, M.S.; Mehmood, Z.; Sadad, T.; Zada, I.; Ali, G.; Elaffendi, M. Intrusion Detection in the Internet of Things Using Fusion of GRU-LSTM Deep Learning Model. Intell. Autom. Soft Comput. 2023, 37, 2279–2290.
23. Hong, W.C.; Huang, D.R.; Chen, C.L.; Lee, J.S. Towards Accurate and Efficient Classification of Power System Contingencies and Cyber-Attacks Using Recurrent Neural Networks. IEEE Access 2020, 8, 123297–123309.
24. Kumar, M.; Kim, C.; Son, Y.; Singh, S.K.; Kim, S. Empowering Cyberattack Identification in IoHT Networks with Neighborhood-Component-Based Improvised Long Short-Term Memory. IEEE Internet Things J. 2024, 11, 16638–16646.
25. Liu, H.; Zhou, M.; Liu, Q. An Embedded Feature Selection Method for Imbalanced Data Classification. IEEE/CAA J. Autom. Sin. 2019, 6, 703–715.
26. Abbas, S.; Hejaili, A.A.; Sampedro, G.A.; Abisado, M.; Almadhor, A.S.; Shahzad, T.; Ouahada, K. A Novel Federated Edge Learning Approach for Detecting Cyberattacks in IoT Infrastructures. IEEE Access 2023, 11, 112189–112198.
27. Yang, R.; He, H.; Wang, Y.; Qu, Y.; Zhang, W. Dependable Federated Learning for IoT Intrusion Detection against Poisoning Attacks. Comput. Secur. 2023, 132, 103381.
28. Dao, N.N.; Phan, T.V.; Sa’ad, U.; Kim, J.; Bauschert, T.; Do, D.T.; Cho, S. Securing Heterogeneous IoT With Intelligent DDoS Attack Behavior Learning. IEEE Syst. J. 2022, 16, 1974–1983.
29. Günen, M.A. Fast Building Detection Using New Feature Sets Derived from a Very High-Resolution Image, Digital Elevation and Surface Model. Int. J. Remote Sens. 2024, 45, 1477–1497.
30. Gupta, A.; Misra, S.; Pathak, N.; Das, D. FedCare: Federated Learning for Resource-Constrained Healthcare Devices in IoMT System. IEEE Trans. Comput. Soc. Syst. 2023, 10, 1587–1596.
31. Nair, A.K.; Sahoo, J.; Raj, E.D. Privacy Preserving Federated Learning Framework for IoMT Based Big Data Analysis Using Edge Computing. Comput. Stand. Interfaces 2023, 86, 103720.
32. Singh, P.; Gaba, G.S.; Kaur, A.; Hedabou, M.; Gurtov, A. Dew-Cloud-Based Hierarchical Federated Learning for Intrusion Detection in IoMT. IEEE J. Biomed. Health Inform. 2023, 27, 722–731.
33. Adil, M.; Javaid, N.; Qasim, U.; Ullah, I.; Shafiq, M.; Choi, J.G. LSTM and Bat-Based Rusboost Approach for Electricity Theft Detection. Appl. Sci. 2020, 10, 4378.
34. Elayan, H.; Aloqaily, M.; Guizani, M. Sustainability of Healthcare Data Analysis IoT-Based Systems Using Deep Federated Learning. IEEE Internet Things J. 2022, 9, 7338–7346.
35. Lakhan, A.; Mohammed, M.A.; Nedoma, J.; Martinek, R.; Tiwari, P.; Vidyarthi, A.; Alkhayyat, A.; Wang, W. Federated-Learning Based Privacy Preservation and Fraud-Enabled Blockchain IoMT System for Healthcare. IEEE J. Biomed. Health Inform. 2023, 27, 664–672.
36. Ahmed, J.; Nguyen, T.N.; Ali, B.; Javed, M.A.; Mirza, J. On the Physical Layer Security of Federated Learning Based IoMT Networks. IEEE J. Biomed. Health Inform. 2023, 27, 691–697.
37. Ketu, S.; Mishra, P.K. Internet of Healthcare Things: A Contemporary Survey. J. Netw. Comput. Appl. 2021, 192, 103179.
38. Krzyzanowski, B.; Manson, S.M. Twenty Years of the Health Insurance Portability and Accountability Act Safe Harbor Provision: Unsolved Challenges and Ways Forward. JMIR Med. Inform. 2022, 10, e37756.
39. Javed, A.R.; Fahad, L.G.; Farhan, A.A.; Abbas, S.; Srivastava, G.; Parizi, R.M.; Khan, M.S. Automated Cognitive Health Assessment in Smart Homes Using Machine Learning. Sustain. Cities Soc. 2021, 65, 102572.
40. Adler, A.I.; Painsky, A. Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection. Entropy 2022, 24, 687.
41. Christo, V.R.E.; Nehemiah, H.K.; Brighty, J.; Kannan, A. Feature Selection and Instance Selection from Clinical Datasets Using Co-Operative Co-Evolution and Classification Using Random Forest. IETE J. Res. 2022, 68, 2508–2521.
42. Vong, C.M.; Du, J. Accurate and Efficient Sequential Ensemble Learning for Highly Imbalanced Multi-Class Data. Neural Netw. 2020, 128, 268–278.
43. Zhao, Y.; Chen, J.; Wu, D.; Teng, J.; Yu, S. Multi-Task Network Anomaly Detection Using Federated Learning. ACM Int. Conf. Proc. Ser. 2019, 24, 273–279.
44. Sung, Y.; Jang, S.; Jeong, Y.S.; Park, J.H. Malware Classification Algorithm Using Advanced Word2vec-Based Bi-LSTM for Ground Control Stations. Comput. Commun. 2020, 153, 342–348.
45. Ahmed, M.; Byreddy, S.; Nutakki, A.; Sikos, L.F.; Haskell-Dowland, P. ECU-IoHT: A Dataset for Analyzing Cyberattacks in Internet of Health Things. Ad Hoc Netw. 2021, 122, 102621.
46. Zuech, R.; Hancock, J.; Khoshgoftaar, T.M. Detecting Web Attacks Using Random Undersampling and Ensemble Learners. J. Big Data 2021, 8, 75.
47. Jadav, D.; Jadav, N.K.; Gupta, R.; Tanwar, S.; Alfarraj, O.; Tolba, A.; Raboaca, M.S.; Marina, V. A Trustworthy Healthcare Management Framework Using Amalgamation of AI and Blockchain Network. Mathematics 2023, 11, 637.
48. Hady, A.A.; Ghubaish, A.; Salman, T.; Unal, D.; Jain, R. IoMT Intrusion Detection. Available online: https://www.cse.wustl.edu/~jain/ehms/index.html (accessed on 20 June 2024).

Figure 1. Frequency of different cyberattacks in an IoHT-based system.

Figure 2. Structure of the proposed EFL-LSTM model.

Figure 3. (a) Classification error at different leaf sizes; (b) feature scores of the IoHT network.

Figure 4. Training progress of the proposed EFL-LSTM network. (a) Percentage accuracy during the model’s training progress; (b) loss during the model’s training progress.

Figure 5. Confusion matrix of the proposed EFL-LSTM network.

Figure 6. Training time in terms of the selected features and the model’s accuracy.

Figure 7. Sensitivity analysis: model accuracy with different learning rates and hidden units.

Table 1. Comparison of this study with other available research.

Research	Year	Techniques	Medical Security	Multiple Attacks	Data Privacy	Solution of Imbalanced Classification
Saheed et al. [2]	2021	DRNN, DTs, KNN, RF	Yes	No	No	No
Adil et al. [33]	2020	LSTM, RUSBoost	No	No	No	Yes
Saharkhizan et al. [19]	2020	LSTM-based Ensemble Learning	No	Yes	No	No
Abbas et al. [26]	2023	Federated Deep Neural Network	No	No	Yes	Yes
Al-Kahtani et al. [22]	2023	LSTM-GRU	No	No	No	No
Kumar et al. [24]	2024	LSTM-DAG Network	Yes	Yes	No	No
Elayan et al. [34]	2022	Deep FL	Yes	No	Yes	No
Lakhan et al. [35]	2023	FL and Blockchain	Yes	No	Yes	No
Ahmed et al. [36]	2023	FL	Yes	No	Yes	No
Proposed Work	2024	RUSBoost-based EL and LSTM-based FL	Yes	Yes	Yes	Yes

Table 2. Design parameters of the proposed Embedded Federated Learning-Driven Long Short-Term Memory (EFL-LSTM) method.

Method	Parameter	Type/Value
Ensemble Learning-based feature selection	Learner	Decision Tree (DT)
	Number of learning cycles	50
	Boosting principle	Random Undersampling and Boosting (RUSBoost)
Proposed EFL-LSTM classification network	Learning algorithm	Adam
	Maximum epochs	10
	Mini batch size	64
	Initial learning rate	0.01
	Learning rate schedule	Piecewise
	Learning rate drop period	5
	Learning rate drop factor	0.1
	Number of clients	5

Table 3. Performance metrics of DL-based techniques.

Technique	Accuracy	Recall	Specificity	Precision	FPR	F1 Score	MCC	K
Embedded LSTM	0.9325	0.7160	0.9457	0.8954	0.0228	0.7737	0.7906	0.8868
Proposed EFL-LSTM	0.9716	0.7933	0.9932	0.9396	0.0068	0.8237	0.8363	0.9112

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
