1. Introduction
The number of intelligent devices connected to the internet grows daily. According to Gartner, the number of connected Internet of Things (IoT) devices is predicted to reach 27 billion by 2025, almost double the number connected in 2021 [1]. Integrating intelligent capabilities into medical devices is revolutionizing the medical field; the integration of IoT with medical devices is termed the internet of medical things (IoMT). IoMT devices account for almost 30% of the IoT device market [2]. The adoption of IoT technology in the medical field has improved patient health monitoring, healthcare operations, and remote healthcare services. However, security and privacy remain a concern in technology-enabled healthcare operations. For instance, according to the Cynerio report [3], more than half of the medical devices connected to the IoT contain critical vulnerabilities. Unintended public exposure of these devices poses security risks and helps adversaries exploit the critical vulnerabilities. Attackers may use advanced persistent threats (APT) and known vulnerabilities to compromise victim devices [4]. Successful exploitation of the vulnerabilities may disrupt healthcare operations and put human life in danger. Security should therefore be a high priority in remote health monitoring using IoMT [5,6,7,8].
Attack detection and mitigation in IoMT can be performed using various techniques and methodologies. Attacks such as man-in-the-middle, malicious network traffic injection, and denial of service are used to compromise IoMT networks. Detection and prevention techniques include log monitoring, vulnerability management, threat intelligence, end device monitoring, and intrusion detection and prevention systems. The intrusion detection system (IDS) is a commonly used technique to identify security issues and network attacks in IoMT. Network traffic anomalies, signature-based rules, or security policies are implemented in the IDS to identify network attacks in IoT-enabled networks [9]. These traditional detection techniques lose effectiveness as attackers constantly update their strategies and use advanced hacking techniques. For instance, security policies can be easily evaded if the attacker performs network reconnaissance and reverse engineers the configurations of network devices, such as routers and firewalls. Researchers have started exploring machine learning (ML) and deep learning (DL) solutions to improve attack detection. Growing computing and processing capabilities allow ML and DL techniques to be used at scale to predict attack events accurately.
Intelligent IDS solutions using ML and DL techniques have been proposed in the literature to detect attacks in conventional networks [10]. However, these solutions are not directly applicable in the IoMT context because various health IoT sensors are connected to the internet, and conventional network datasets are not ideal for evaluating attack detection in IoMT. Moreover, in smart health applications, most existing works focus only on analyzing the network traffic to identify IoMT attacks [10,11,12,13]. In an application like IoMT, patient biometric information is important and gives more insight into the patient’s condition. There is a direct relationship between sudden drops in the patient sensing data and attacks on the network that affect the confidentiality, availability, and integrity of healthcare data. We therefore considered a combined network traffic and patient biometrics dataset to predict attack events and analyze the relationship between the two disparate data types when attack events occur.
Hady et al. [14] utilized the combination of network traffic data and patient sensing data as a dataset and showed that the combined dataset slightly improved the attack detection performance. Nevertheless, the detection performance can still be improved. Ref. [15] augmented the dataset so that the normal and attack data were equal in proportion. That work did not consider the normal-to-attack traffic proportion seen in a real-time network traffic scenario. To improve the attack detection performance and ensure that the dataset follows real-time attack traffic proportions, we preprocess the datasets, apply particle swarm optimization (PSO) for feature selection, and utilize various ML and DL models to study the accurate prediction of IoMT attacks. Our main contributions in this work are as follows.
      
- Propose a particle swarm optimization and deep neural network (PSO-DNN) based model to effectively detect IoMT attacks using the combined network traffic and patient biometric datasets.
- Perform a detailed detection performance analysis of various machine learning and deep learning models to improve attack detection in the IoMT intrusion detection system.
- Obtain better performance than the state-of-the-art works reported on the same datasets, with an accuracy of 96%.
The remainder of the paper is organized as follows. Section 2 discusses the background and related work. Section 3 describes the proposed approach to improve intrusion detection in IoMT. Section 4 describes the dataset used in our study. Section 5 discusses the experimental setup and performance evaluation of the ML and DL models used in our study. Section 6 includes the discussion and future work on the IoMT intrusion detection system. Section 7 concludes the paper.
  2. Background and Related Work
This section discusses the state-of-the-art work on cyber attack detection in IoMT using machine learning and deep learning techniques, along with the limitations of that work.
The IoMT network comprises the IoT medical devices attached to the patient’s body or installed on the patient’s premises, an IoT gateway that connects to the conventional network, and the internet, which enables remote monitoring of the patient’s health condition and activity. A successful attack on the IoMT network can have significant consequences, including risk to patient life. Several works have been proposed in the literature to detect and mitigate cyber attacks in the IoMT network. Signature-based and anomaly-based intrusion detection systems have existed for decades to detect attacks, including in IoMT networks. However, signature/policy-based solutions cannot identify novel attacks, including advanced persistent threats. Although anomaly-based solutions can detect unknown attacks, their false positive rate is much higher. Additionally, tuning is required to set the thresholds that flag attacks.
Recently, machine learning techniques have been proposed in the literature for intrusion detection in IoMT networks [16,17,18] and other technology fields [19]. The authors in [20] surveyed the security and privacy solutions in IoMT and discussed various solutions, including machine learning solutions to the attack detection problem. The authors concluded that an effective intrusion detection system still needs to be proposed to detect the attacks proactively, as IoT devices are constrained in memory and computation, and little security control is implemented at the device level.
In [16], the authors proposed an ensemble classifier-based intrusion detection system to classify attacks in smart hospitals. The bagging decision tree technique obtained 93.2% accuracy in classifying the attacks in the KDDcup-99 dataset. However, the KDDcup-99 dataset was generated in a conventional network environment long ago, and IoT device traffic is not included in the dataset simulation. Moreover, there is room to improve the classification accuracy on the KDD datasets. The authors in [17] presented a two-stage model: an ensemble of decision tree, naive Bayes, and random forest classifiers in the first stage, and XGBoost in the second stage to classify the normal and attack network records. The IoT-based dataset ToN-IoT was used to perform the experiments and evaluate their proposed model. The authors reported that their model obtained 96.35% accuracy in classifying the attacks in IoMT. However, the ToN-IoT dataset was generated in an industrial IoT network setup, and Modbus and weather sensors were used to generate the IoT data. These two sensor types are not generally used in the IoMT environment, so the reported results may not be applicable for detecting IoMT network attacks.
Radoglou et al. [11] explored an active learning approach to retrain the ML models and tested the proposed approach on HTTP and Modbus communication protocol network datasets. The CIC-IDS2017 dataset was used to test the performance of the ML models on HTTP traffic. The authors reported that the decision tree classifier achieved 96.44% accuracy in classifying the attacks. For the Modbus datasets, random forest obtained 94.45% accuracy. Zachos et al. [21] combined network traffic features, IoT device features, and gateway features to form a unique feature set and applied machine learning models to improve attack detection performance in the IoMT network. The CPU and memory consumption levels of the IoT device were considered as features for evaluation. The authors showed that random forest performed better than the other ML models for attack detection in IoMT. Thamilarasu et al. [22] proposed a mobile agent-based intrusion detection system to detect network and device-based attacks in IoMT. Machine learning and regression algorithms were tested on simulation-generated datasets. Accuracies of 99.8% and 97.93% were obtained for network-level and device-level intrusion detection, respectively, when the decision tree (DT) model was used for evaluation. Binbusayyis et al. [23] evaluated the performance of ML algorithms on the Bot-IoT dataset. The Bot-IoT dataset covers the denial of service (DoS), distributed denial of service (DDoS), scan, and theft attack categories. However, IoMT-relevant attacks, such as man-in-the-middle and spoofing attacks, are not included in the dataset. The authors reported that the decision tree achieved 100% accuracy on the test dataset and that other ML models, such as KNN, NB, and SVM, obtained more than 99%.
Overall, the review of ML-based models for attack detection in the IoMT environment indicates that most datasets were not generated with a focus on IoMT attacks and the IoMT environment. Nevertheless, the reported results were impressive, with an accuracy of more than 95% in most of the contributions. The input features may include network traffic features, metric features, or IoT device memory and CPU features. None of the above works considered patient biometric data as a feature to detect cyber attacks in IoMT.
Some researchers have also explored deep learning models to detect or classify attacks in the IoMT network. The authors in [12] applied particle swarm optimization (PSO) for feature selection and then used ML/DL-based models to detect cyber attacks in the IoMT. The authors considered the NSL-KDD datasets to evaluate the performance of the proposed approach. The PSO and RF-based solution obtained the best results, with an accuracy of 99.76%. However, the NSL-KDD dataset was not generated with a focus on the IoT network environment and is not the right dataset to evaluate attack detection in IoMT networks. Awotunde et al. [24] utilized a deep feed-forward neural network to classify network attacks in the IoT network. A deep autoencoder was used to reduce the feature dimension. The network flow records were extracted from the ToN-IoT dataset to conduct the experiments. The authors reported that their DAE-DFFNN model obtained 89% accuracy and mentioned that it performed better than ML models like SVM and DT. The authors in [25] proposed an SDN-enabled CNN and LSTM hybrid DL framework for IoMT malware detection. The authors obtained more than 99% accuracy in detecting malware. However, the proposed method was not evaluated as an IoMT intrusion detection system for detecting network attacks. The authors in [13] leveraged an intelligent agent-based swarm neural network for intrusion detection in IoMT. The ToN-IoT dataset was used to conduct the experiments, and the authors reported that their proposed neural network obtained 99.5% accuracy on it. Manimurugan et al. [26] presented a deep belief neural network to classify network attacks in the IoMT. A CICIDS dataset was considered to evaluate the proposed method. The deep belief neural network achieved more than 96% accuracy for attack classification. However, the CICIDS dataset was not generated with a focus on IoMT network attacks.
The review of DL models used for IoMT intrusion detection indicates that DL models, except swarm-based neural networks, have not been widely leveraged to detect IoMT network attacks. Additionally, the reviewed DL models only consider network traffic or metric datasets to classify or detect the attacks. Patient biometric data are not considered as a feature in any of the works discussed above.
Table 1 compares the ML- and DL-based state-of-the-art solutions for attack detection in IoMT.
  3. Proposed Approach
To effectively identify the IoMT attacks using network traffic and patient biometric data, we explored various ways to improve the attack detection performance. Prior to discussing our proposed approach, we describe the IoMT intrusion detection system architecture. 
Figure 1 displays a typical IoMT network architecture used to manage security operations and predict attacks using ML or DL techniques. The IoMT intrusion detection system architecture contains patient sensor devices, an IoT gateway, a network traffic collector, an ML/DL data processing pipeline, an intrusion detection system, and the security operators who monitor the attacks. The sensor devices may include, but are not limited to, a temperature sensor, pulse rate detector, heart rate detector, ECG device, blood pressure monitor, and respiration rate monitor. IoT network protocols, such as MQ Telemetry Transport (MQTT) and Advanced Message Queuing Protocol (AMQP), are used to send the IoT sensing devices’ information to the remote servers. The IoT gateways collect the sensor data through wireless or wired communication and send the data to remote locations. Both the network traffic and the patient biometric data are collected in our approach. The intrusion detection system shown in Figure 1 includes the data processing and analytical capabilities to detect intrusion in the IoMT network. Continuous monitoring and analysis are required at the intrusion detection system level to tune the alerts and reduce false positives. The applications of the IoMT system include remote patient monitoring and monitoring of the physical premises in the hospital environment to protect and improve patient health.
Figure 2 shows our proposed approach to improve the IoMT attack detection performance using ML and DL models. In contrast to performing the network traffic analytics and detecting intrusion in the network infrastructure, we leverage the IoT sensing data from the patient-specific IoMT systems to identify the patient biometrics anomalies and improve the overall attack detection performance when an adversary conducts attacks in the IoMT network infrastructure. The network traffic and patient biometric data are combined using the timestamp of the network traffic events and patient biometric data events in the IoMT. The final dataset includes the majority of the network traffic features and a minority of the patient biometric features.
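The timestamp-based combination described above can be illustrated with a small sketch. The record layout and the field names (`timestamp`, `src_bytes`, `heart_rate`, `spo2`) are hypothetical and only stand in for the real flow and biometric features; the sketch assumes an exact timestamp match, which may be stricter than the alignment used to build the actual dataset.

```python
def merge_by_timestamp(flow_records, biometric_records, key="timestamp"):
    """Inner-join two lists of dicts on a shared timestamp field."""
    # Index the biometric readings by timestamp for constant-time lookup.
    bio_by_time = {rec[key]: rec for rec in biometric_records}
    merged = []
    for flow in flow_records:
        bio = bio_by_time.get(flow[key])
        if bio is None:
            continue  # no biometric reading recorded at this timestamp
        row = dict(flow)
        row.update({k: v for k, v in bio.items() if k != key})
        merged.append(row)
    return merged


# Hypothetical network flow records and patient readings.
flows = [
    {"timestamp": 1, "src_bytes": 310, "dst_bytes": 120},
    {"timestamp": 2, "src_bytes": 305, "dst_bytes": 118},
    {"timestamp": 3, "src_bytes": 990, "dst_bytes": 0},
]
vitals = [
    {"timestamp": 1, "heart_rate": 78, "spo2": 98},
    {"timestamp": 3, "heart_rate": 0, "spo2": 0},
]
combined = merge_by_timestamp(flows, vitals)
```

Each merged row carries both feature groups, so a sudden drop in the biometric values can be examined next to the network behavior observed at the same moment.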
Data Preprocessing: The data were preprocessed using a simple imputer and a standard scaler. The simple imputer replaces the missing values in a column with the mean, median, or most frequent value of that column. We used the median to replace the missing values in the feature columns. The standard scaler is a well-known data standardization technique: each feature is standardized by subtracting the feature mean from each value and dividing the result by the feature’s standard deviation. The resulting feature data have zero mean and unit variance.
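The two preprocessing steps can be written out in a few lines. This is a pure-Python sketch of the arithmetic only; the terminology in the text suggests scikit-learn's `SimpleImputer(strategy="median")` and `StandardScaler`, but that is an assumption, and the `pulse_rate` values below are made up for illustration.

```python
import statistics


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in column]


def standardize(column):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(v - mean) / std for v in column]


# Hypothetical feature column with two missing readings.
pulse_rate = [72.0, None, 80.0, 76.0, None, 120.0]
filled = impute_median(pulse_rate)
scaled = standardize(filled)
```

The median is less sensitive than the mean to outliers such as the 120.0 reading, which is one common reason for preferring it when imputing sensor data.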
PSO Feature Selection: Selecting the relevant features is needed to improve the accuracy of the model and speed up its predictions. We use the particle swarm optimization technique [27] to select the relevant features for IoMT attack prediction. Let $X = (x_1, x_2, \ldots, x_d)$ be the $d$ features of the dataset. A selected feature is represented as 1; a feature that is not selected is assigned 0.

$$x_i \in \{0, 1\}, \quad i = 1, \ldots, d$$

The total number of features in the dataset is $d$. To select a feature, we assign a threshold value of 0.5. If the feature value $v_i$ is greater than the threshold, the feature is selected; otherwise, the feature value is 0.

$$x_i = \begin{cases} 1, & v_i > 0.5 \\ 0, & \text{otherwise} \end{cases}$$
The function $F(X)$ optimizes the classification accuracy while penalizing the number of features selected. The objective is to minimize $F(X)$, in which $\alpha$ is a parameter that decides the trade-off between the classification accuracy and the number of features selected with respect to the total number of features. $P$ denotes the classification accuracy, $N_f$ denotes the number of selected features, and $N_t$ denotes the total number of features [27].

$$F(X) = \alpha (1 - P) + (1 - \alpha)\left(1 - \frac{N_f}{N_t}\right)$$
      
Figure 3 shows the PSO fitness values as the number of iterations varies from 0 to 100. The feature selection process yielded 8 optimal features out of the 43 combined network and biometric features. Once the number of features reaches 8, the fitness value becomes constant and stays well below 0.231. We therefore used the PSO-selected features to evaluate the IoMT attack detection performance.
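The thresholding rule and the fitness evaluation used during PSO feature selection can be sketched as follows. The exact form of the fitness function is an assumption here: the sketch uses a form commonly found in binary PSO feature-selection tutorials, $\alpha(1 - P) + (1 - \alpha)(1 - N_f/N_t)$, and the values of `alpha`, `accuracy`, and `position` are illustrative rather than taken from the experiments.

```python
def select_features(position, threshold=0.5):
    """Turn a particle's continuous position into a 0/1 feature mask."""
    return [1 if value > threshold else 0 for value in position]


def fitness(accuracy, mask, alpha=0.88):
    """Assumed objective: trade classification error against subset size."""
    n_selected = sum(mask)
    n_total = len(mask)
    return alpha * (1.0 - accuracy) + (1.0 - alpha) * (1.0 - n_selected / n_total)


# Hypothetical particle position over five features.
position = [0.9, 0.2, 0.51, 0.5, 0.7]
mask = select_features(position)  # a value of exactly 0.5 is not selected
score = fitness(accuracy=0.9, mask=mask, alpha=0.5)
```

In a full PSO run, each particle's position would be updated from its own best and the swarm's best positions, and `accuracy` would come from training a classifier on the features where `mask` is 1.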
 The dataset is split into training and test datasets to conduct the experiments and evaluate the performance. We split the dataset into 70% training data and 30% test data for our experimental evaluation. The following ML and DL algorithms are used to test our datasets and determine the best-performing models when the network traffic data and patient biometric data are combined.
  3.1. Machine Learning Models
  3.1.1. Logistic Regression (LR)
Logistic regression is used for the binary classification of a given dataset. It takes real values as input and predicts the probability that the input features belong to the output class. The coefficient values are determined using stochastic gradient descent. A logistic function transforms the output into a probability.
The probability of predicting the class $R$, given the input sample $X$, is defined as

$$P(R \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta^{T} X)}}$$

where $\beta_0$ is the intercept and $\beta$ is the vector of coefficients.
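A minimal sketch of the logistic model and one stochastic gradient descent update is given below. The function names, the learning rate, and the toy sample are assumptions made for illustration; they are not the configuration used in the experiments.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real value to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def sgd_step(features, label, weights, bias, lr=0.1):
    """One stochastic gradient step on the log-loss for a single sample."""
    error = label - predict_proba(features, weights, bias)
    new_weights = [w + lr * error * x for w, x in zip(weights, features)]
    new_bias = bias + lr * error
    return new_weights, new_bias


# Hypothetical sample with two features and a positive label.
sample, label = [1.0, 2.0], 1
weights, bias = [0.0, 0.0], 0.0
before = predict_proba(sample, weights, bias)
weights, bias = sgd_step(sample, label, weights, bias)
after = predict_proba(sample, weights, bias)
```

After the update, the predicted probability for the positive sample moves toward 1, which is the behavior the gradient step is meant to produce.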
          
  3.1.2. K-Nearest Neighbor (KNN)
KNN can be used for both classification and regression. The prediction is based on feature similarity: the nearest neighbors are selected according to a distance measure. We used uniform weights to assign equal weight to all neighbors. The number of neighbors is set to 3 in our evaluation, and the average of the nearest neighbors’ values is assigned as the final predicted value. The Euclidean distance ($E$) is used to determine the nearest neighbors [28].

$$E(x, y) = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$$

where $k$ denotes the number of neighbors, and $x_i$ and $y_i$ are the data points in the $i$th dimension.
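The following sketch shows a 3-nearest-neighbor classifier with Euclidean distance and uniform weights. For class labels, uniform weighting amounts to a majority vote among the neighbors, which is what the sketch implements; the training points are made up, and the labels follow the dataset convention of 0 for attack and 1 for normal.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set: label 1 = normal, 0 = attack.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
labels = [1, 1, 1, 0, 0, 0]
near_normal = knn_predict(points, labels, (0.5, 0.5))
near_attack = knn_predict(points, labels, (5.5, 5.5))
```

An odd number of neighbors such as 3 avoids tied votes in binary classification.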
  3.1.3. Decision Tree
A decision tree is a tree-like graph that is used for regression and classification problems. Each node or branch in the tree represents a feature of the dataset. The entropy and the information gain of each feature with respect to the target variable are used to construct the DT. The class of a test sample is predicted by traversing the tree down to a leaf node.
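Entropy and information gain, the quantities used to choose the splitting feature, can be computed as in the sketch below. It assumes categorical feature values and uses made-up labels; a real tree builder would evaluate every candidate feature this way and split on the one with the highest gain.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, feature_values):
    """Entropy reduction obtained by splitting the labels on a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical labels and two candidate features.
labels = [1, 1, 0, 0]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # tells nothing about the class
gain_good = information_gain(labels, informative)
gain_bad = information_gain(labels, uninformative)
```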
  3.1.4. AdaBoost
Adaptive boosting (AdaBoost) is a well-known boosting technique used to improve the performance of weak classifiers. Multiple weak classifiers are combined to form a robust classifier. A single classifier may not accurately classify the input, so the misclassified samples are passed to another classifier to improve the overall accuracy. The classifiers can be decision trees, logistic regression, random forest, etc.
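One round of the classic AdaBoost reweighting can be sketched as follows. The sketch assumes labels in {-1, +1} and sample weights that sum to 1, which is the textbook formulation and not necessarily the exact variant used in the experiments; the predictions are made up.

```python
import math


def adaboost_round(predictions, labels, weights):
    """One AdaBoost round: return the classifier weight and new sample weights."""
    # Weighted error of the current weak classifier.
    error = sum(w for p, y, w in zip(predictions, labels, weights) if p != y)
    error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha = 0.5 * math.log((1 - error) / error)
    # Misclassified samples (y * p = -1) gain weight, correct ones lose weight.
    updated = [
        w * math.exp(-alpha * y * p)
        for p, y, w in zip(predictions, labels, weights)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Hypothetical weak classifier that gets the last sample wrong.
labels = [1, 1, -1, -1]
predictions = [1, 1, -1, 1]
weights = [0.25, 0.25, 0.25, 0.25]
alpha, new_weights = adaboost_round(predictions, labels, weights)
```

After the update, the misclassified sample holds half of the total weight, so the next weak classifier is pushed to get it right.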
  3.1.5. Random Forest (RF)
RF is a family of decision tree-based machine learning algorithms. Ensemble learning is used to perform the classification and predictions, and bootstrapping RF is used to perform the predictions in this work. The bootstrapping method combines ensemble learning with the random selection of decision trees and determines the prediction output as the average of all the decision tree predictions.
Let $B$ be the number of decision trees, indexed by $b = 1$ to $B$, and let $T_b(x)$ denote the regression prediction value of the $b$th decision tree [29]. The regression prediction of the random forest is then defined as

$$\hat{f}_{\mathrm{rf}}^{B}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x)$$
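The averaging over bootstrapped trees can be illustrated with a deliberately simplified sketch. Each "tree" here is a depth-one regression stump fitted on a bootstrap sample, which is a stand-in chosen for brevity and not a full random forest (there is no random feature selection, for instance); the data and the number of trees are made up.

```python
import random
import statistics


def bootstrap_sample(rows, rng):
    """Draw len(rows) samples from rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(sample):
    """Depth-one regression tree: split at the median x, predict side means."""
    xs = sorted(x for x, _ in sample)
    split = xs[len(xs) // 2]
    left = [y for x, y in sample if x < split]
    right = [y for x, y in sample if x >= split]  # never empty: split is in xs
    right_mean = statistics.fmean(right)
    left_mean = statistics.fmean(left) if left else right_mean
    return lambda x: left_mean if x < split else right_mean


def forest_predict(trees, x):
    """Average the predictions of all trees, as in the formula above."""
    return sum(tree(x) for tree in trees) / len(trees)


# Hypothetical one-feature data: the target is 0 for small x and 1 for large x.
data = [(x, 0.0) for x in range(5)] + [(x, 1.0) for x in range(5, 10)]
rng = random.Random(0)
trees = [fit_stump(bootstrap_sample(data, rng)) for _ in range(25)]
low = forest_predict(trees, 0)
high = forest_predict(trees, 9)
```

Averaging many trees trained on different bootstrap samples reduces the variance of any single tree's prediction.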
          
  3.1.6. Support Vector Machine (SVM)
SVM is a supervised machine learning algorithm used for regression and classification problems. SVM separates the data of two classes with a hyperplane or decision boundary. We selected the linear kernel in our experiments to classify benign and attack traffic in the IoMT network. With nonlinear kernels, the original feature space is converted into a new feature space to support nonlinear decision boundaries.
The hyperplane function is denoted as

$$f(x) = w^{T} x + b$$

The objective function $\frac{1}{2}\lVert w \rVert^{2}$ needs to be minimized such that $y_i (w^{T} x_i + b) \geq 1$ is satisfied for every training sample $(x_i, y_i)$.
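The linear decision function and the margin condition can be checked numerically with the sketch below. It assumes the standard hard-margin formulation (minimize half the squared norm of `w` subject to `y * (w·x + b) >= 1` for every sample) with labels in {-1, +1}; the points and candidate hyperplanes are made up, and no optimizer is included.

```python
def decision_value(w, b, x):
    """Value of the hyperplane function w.x + b at point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def satisfies_margin(w, b, points, labels):
    """True if every sample lies on or outside the margin."""
    return all(y * decision_value(w, b, x) >= 1 for x, y in zip(points, labels))


def objective(w):
    """Hard-margin SVM objective: half the squared norm of w."""
    return 0.5 * sum(wi * wi for wi in w)


# Hypothetical linearly separable samples with labels in {-1, +1}.
points = [(2, 0), (3, 1), (-2, 0), (-3, -1)]
labels = [1, 1, -1, -1]
feasible = satisfies_margin((0.5, 0.0), 0.0, points, labels)
too_small = satisfies_margin((0.25, 0.0), 0.0, points, labels)
```

Shrinking `w` lowers the objective but eventually violates the constraint, which is why the two are optimized together.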
  3.2. DL Methods
  3.2.1. Deep Neural Networks (DNN)
A deep neural network contains more than two hidden layers in addition to the input layer and the output layer. A DNN stacks more than one multi-layer perceptron (MLP) layer to produce the output. The MLP is a universal approximator and is well suited for mapping non-linear input–output relationships. Typically, MLPs consist of three layers. The input layer feeds the input values to the neural network. The output layer performs the classification or prediction for the given problem. The hidden layer contains the neurons that process the input data and forward the processed data as input to the output layer. The number of hidden layers can be arbitrary. The neuron processing unit is represented as follows [30]:

$$y = \phi\left(\sum_{i} w_i x_i + b\right)$$

where $b$ denotes the bias value, $w_i$ denotes the $i$th neuron weight, and $x_i$ denotes the input to the $i$th neuron unit. $\phi$ is the non-linear activation function, and $y$ is the neuron processing unit output.
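The neuron computation, and how neurons stack into layers, can be sketched in a few lines. The layer sizes, weights, and activation choices below are arbitrary illustrations and do not reflect the architecture of the PSO-DNN model.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def dense_layer(inputs, weight_rows, biases, activation):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]


# Hypothetical forward pass: 2 inputs -> 2 hidden neurons -> 1 output neuron.
features = [1.0, 2.0]
hidden = dense_layer(features, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], relu)
output = dense_layer(hidden, [[1.0, -0.5]], [0.0], sigmoid)
```

A deeper network simply repeats the `dense_layer` call, feeding each layer's output into the next.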
  3.2.2. Convolutional Neural Network (CNN)
A convolutional neural network is a feed-forward neural network that takes input in a one-dimensional or two-dimensional form. CNN is commonly used for image classification and object detection in images. The CNN contains a convolutional layer, a ReLU layer, a pooling layer, and a fully connected layer. The convolutional layer extracts the feature patterns from the input features, and the ReLU layer incorporates non-linearity into the network. The pooling layer reduces the dimensionality of the feature map. Finally, the fully connected layer multiplies the input by weights and adds bias values to produce the output. Usually, the last layer of the CNN is a fully connected layer.
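For one-dimensional input such as a feature vector, the layer sequence can be traced with the sketch below. It assumes ReLU as the non-linearity and max pooling as the pooling operation; the signal, kernel, and weights are made up, and the convolution is the unflipped sliding dot product that deep learning libraries typically implement.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal and take dot products (no padding)."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


def relu(values):
    """Element-wise non-linearity: negative responses become zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Keep the largest value in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def fully_connected(values, weights, bias):
    """Weighted sum of the pooled features plus a bias."""
    return sum(w * v for w, v in zip(weights, values)) + bias


# Hypothetical input and parameters.
signal = [1, 2, 3, 0, -4, 5]
feature_map = conv1d(signal, [1, -1])
activated = relu(feature_map)
pooled = max_pool1d(activated, size=2)
score = fully_connected(pooled, [0.5, 0.5], 0.0)
```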
  3.2.3. Long Short-Term Memory (LSTM)
LSTM is a class of recurrent neural network (RNN) used to predict and classify time series input data. In contrast to a plain RNN, LSTM remembers the long-term dependencies of the input data. LSTM comprises the forget gate, input gate, and output gate. The forget gate determines whether or not the information from the previous timestamp is valuable. The input gate is used to learn from the input data, and the output gate passes the information to the next timestamp.
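A single LSTM time step with scalar input and state makes the role of each gate concrete. This follows the standard LSTM cell equations; the parameter names (`wf`, `uf`, `bf`, and so on) are hypothetical, and real implementations use weight matrices rather than scalars.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step for scalar input x, hidden state h_prev, cell state c_prev."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # expose part of the memory to the next step
    return h, c


# With all parameters at zero, every gate is half open and the candidate is 0.
names = ["wf", "uf", "bf", "wi", "ui", "bi", "wg", "ug", "bg", "wo", "uo", "bo"]
params = dict.fromkeys(names, 0.0)
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=params)
```

The cell state `c` is what carries long-term information: the forget gate scales the old value, and the input gate controls how much new content is written.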
  4. Dataset Description
In this section, the IoMT dataset used for the model evaluation is discussed in detail. The dataset was created using a real-time health monitoring testbed [14]. The testbed comprises the sensor devices attached to the patient’s body, the network gateway, and the Software Defined Network (SDN) controller for visualizing the network traffic. The network traffic and sensor data generated in the testbed are used to detect anomalies in the data and identify attacks. Three attacks were simulated in the environment to generate the attack dataset: man-in-the-middle, data injection, and spoofing attacks.
The man-in-the-middle attack includes the attacker who joins the patient health monitoring system and can read/manipulate the network traffic on the fly. This attack results in violations of the patient’s data confidentiality and integrity in the network. The data injection attack includes manipulating the patient health network packets passing through the gateway. This attack results in violating patient data integrity. The spoofing attacks result in reading the network traffic passing through the gateway. It violates the patient’s data confidentiality.
The combined network traffic and patient biometric dataset is generated using the ARGUS tool [31]. The biometric data include the temperature, peripheral oxygen saturation, pulse rate, systolic blood pressure, diastolic blood pressure, heart rate, respiration rate, and ECG ST segment data. The network traffic flow records and their metrics are captured to obtain the overall network traffic features. Overall, the dataset contains 44 features, of which 35 are network traffic features. The dataset output is labeled as attack or normal traffic. The attack traffic is labeled as “0”, whereas the normal traffic is labeled as “1”. Table 2 lists all the features considered in the dataset, with the description and type of each feature.
As shown in Table 3, the dataset comprises 14,272 normal and 2046 attack network records. In real-time networks, attacks are rarely seen, and the normal-to-attack traffic ratio is very high. To mimic this real-time scenario, we reduced the attack sample count by randomly selecting 1400 attack samples and considered an unbalanced dataset for the performance evaluation.
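The random selection of attack samples can be sketched as below. The record layout with a `label` field and the fixed seed are assumptions for illustration; the label convention (0 for attack, 1 for normal) and the sample counts follow the dataset description.

```python
import random


def downsample_attacks(records, n_attacks, attack_label=0, seed=0):
    """Keep all normal records and a random subset of the attack records."""
    rng = random.Random(seed)
    attacks = [r for r in records if r["label"] == attack_label]
    normal = [r for r in records if r["label"] != attack_label]
    kept = rng.sample(attacks, n_attacks)  # sampling without replacement
    combined = normal + kept
    rng.shuffle(combined)
    return combined


# Placeholder records carrying only the label; real rows hold all features.
records = [{"label": 1}] * 14272 + [{"label": 0}] * 2046
subset = downsample_attacks(records, n_attacks=1400)
n_attack_kept = sum(1 for r in subset if r["label"] == 0)
```

With these counts, roughly one record in eleven is an attack, so plain accuracy should be read alongside per-class metrics when models are compared on this data.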
  6. Discussion and Future Work
We leveraged PSO-based feature selection and the DNN model to predict the IoMT attacks. Although our work improved the performance of IoMT intrusion detection, the dataset used for our evaluation mainly addresses attacks on patient data confidentiality and integrity. Denial of service (DoS) attacks are not considered in our evaluation. One direction for future work is IoMT attack classification using ML and DL models, covering DoS, data injection, man-in-the-middle attacks, etc.
Data analytics plays a significant role in the smart health industry. A secure implementation of machine learning operations (MLOps) is essential to align with health industry regulations and meet compliance requirements. Combining patient data with network data in our approach requires securely collecting, processing, and transforming the data in real-time applications. Hence, additional security measures are needed in the MLOps implementation to apply our approach in real-time healthcare industry applications.
Various sensing-related patient data are generated in IoMT networks. The deployment location of the intrusion detection system and the privacy of the patient data are both important in the IoMT network. Data anonymization techniques will therefore be used in the future to preserve patient privacy when collecting patient biometric data. The most relevant patient biometric features for attack detection will also be studied to understand the correlation between attack detection and biometric features.
We also want to explore the adversarial attacks in IoMT targeting the ML and DL solutions and manipulating the predictions. An adversary may compromise the machine learning operations infrastructure and poison the data. This results in attacks possibly going undetected, or new attacks being missed in the intrusion detection system.