2. Machine Learning Methods
Different types of machine learning methods have been used in the literature for the detection of malicious attacks. The most commonly used methods are random forest (RF), decision tree (DT), and k-nearest neighbors (KNN). The workings of these methods are described below.
Figure 3 shows the structure of a DT. The root node is split into decision nodes and sub-nodes using one of several splitting algorithms. Decision nodes are split into further sub-nodes and decide which of their sub-nodes an instance is passed to. Leaf nodes provide the outcomes and are not split into further sub-nodes. A section of the entire tree is called a sub-tree, which includes a decision node and its leaf nodes.
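The node types described above can be sketched in a few lines of Python. This is only an illustrative, hand-built tree and not a trained model: the feature names (packet_rate, failed_logins), the thresholds, and the labels are hypothetical and do not come from any of the surveyed papers.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # outcome; a leaf is never split further


@dataclass
class Decision:
    feature: str      # hypothetical feature tested at this node
    threshold: float  # go left if value <= threshold, otherwise right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Decision]


def predict(node, sample):
    """Walk from the root through decision nodes until a leaf is reached."""
    while isinstance(node, Decision):
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Root node -> one decision sub-node -> leaves (all values are made up).
tree = Decision(
    feature="packet_rate",
    threshold=100.0,
    left=Leaf("normal"),
    right=Decision(
        feature="failed_logins",
        threshold=3.0,
        left=Leaf("normal"),
        right=Leaf("attack"),
    ),
)

print(predict(tree, {"packet_rate": 500.0, "failed_logins": 10.0}))
```

The right branch of the root is itself a sub-tree: one decision node with two leaf nodes.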
KNN is one of the simplest and most popular ML algorithms. It estimates the class of an unlabeled instance from the classes of its nearest neighbors.
Figure 4 shows the KNN process for two classes, A and B, and a query instance (marked with a question mark) whose class needs to be estimated. The neighborhood size of the query instance is three, because there are three instances within the small circle: one instance belongs to class A and two instances belong to class B. As more neighbors of the query instance belong to class B, it is assigned the class B label.
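The voting step described above can be illustrated with a minimal sketch. The coordinates are invented so that, as in the description of Figure 4, the three nearest neighbors of the query are one class A instance and two class B instances; Euclidean distance is an assumption, since the text does not name a distance measure.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D points: (coordinates, class label).
train = [
    ((1.0, 1.0), "A"),
    ((0.8, 1.5), "A"),
    ((2.0, 2.0), "A"),
    ((2.8, 2.6), "B"),
    ((3.0, 3.0), "B"),
    ((4.0, 4.0), "B"),
]

query = (2.5, 2.5)
print(knn_predict(train, query, k=3))  # two B votes against one A vote
```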
Random forest (RF) is an ensemble learning method that consists of multiple decision trees operating together. A number of decision trees, T1–Tn, are trained on different samples of the dataset drawn with replacement. Each tree from T1 to Tn individually provides a class for a query instance, and the class with the most votes becomes the final class prediction, as illustrated in Figure 5.
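The two ingredients named above, sampling with replacement and majority voting, can be sketched as follows. To keep the example short, each "tree" is a one-level decision stump on a single numeric feature; this simplification, the toy data, and all function names are assumptions made for illustration, and a real random forest also randomizes the features considered at each split.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-level tree on (x, label) pairs by minimizing training errors."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        left = [y for x, y in samples if x <= threshold]
        right = [y for x, y in samples if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, best_threshold, best_left, best_right = best
    return lambda x: best_left if x <= best_threshold else best_right


def train_forest(samples, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(train_stump(bootstrap))
    return forest


def forest_predict(forest, x):
    """Each tree votes for a class; the class with the most votes wins."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy data: small values are "normal", large values are "attack".
data = [(x, "normal") for x in (1, 2, 3, 4, 5)]
data += [(x, "attack") for x in (11, 12, 13, 14, 15)]

forest = train_forest(data)
print(forest_predict(forest, 1), forest_predict(forest, 15))
```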
Supervised, semi-supervised, and unsupervised ML methods are presented in the following subsections for detailed discussion on cyber attack detection in IoT systems and devices.
2.1. Supervised Machine Learning Methods in IoT Security
An SVM classifier was used in [46] to detect selective forward (SF) and blackhole (BH) network layer attacks in a network. IoT testbed data were used to test the model against both attacks. Two network topologies, one with the sink in the middle and one with the sink at the top, were used to evaluate the detection. The SVM detection model was not able to detect all the malicious nodes for the SF attack, and the precision rate was below 50%. The Matthews correlation coefficient equation was used to derive the precision, accuracy, PPV, NPV, and TPR. The results showed that 100% TPR and a 99.8% accuracy rate were achieved for SF and BH network routing attacks. In [47], c-support SVM was proposed for detecting abnormalities within IoT networks. The model was trained and evaluated on both normal and malicious data using the KDD-99 dataset. The detection accuracy was up to 100% when SH and BH attacks were present, whereas 81% detection accuracy was achieved when different network topologies were evaluated for all routing attacks.
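For readers unfamiliar with the evaluation metrics named in this subsection, the sketch below computes precision (PPV), NPV, TPR, accuracy, and the Matthews correlation coefficient from the four counts of a binary confusion matrix, using their standard definitions. The counts are hypothetical and are not taken from [46] or [47].

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard detection metrics from true/false positive/negative counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision_ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "tpr_recall": safe_div(tp, tp + fn),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for 100 evaluated nodes.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A detector can reach a high accuracy and TPR while its precision stays low if false positives are frequent, which is why several of these metrics are usually reported together.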
A uniform intrusion detection method was used in [48] to detect DoS, probe, U2R, and R2L intrusions in an IoT network. To improve the security of the IoT network and the detection accuracy, the NSL-KDD and KDDCUP99 datasets were used, with random forest as the supervised classifier. Other classifiers (KNN, NB, DT, RF, and LR) were used for comparison to test the accuracy of the proposed method. On both datasets, the accuracy of the proposed model was shown to be up to 99.9% with minimal use of time and energy. In [49], researchers proposed the IoTArgos model to detect anomalies and new attacks and to protect the privacy of user data in smart homes. IoTArgos used a two-stage IDS with a supervised classification algorithm to filter and detect known attacks using a training suite for the classification algorithm. The classifiers used in this article were kNN, LR, RF, NB, and SVM. To identify and evaluate new attacks, IoTArgos also used the anomaly detection algorithms CBLOF, FastABOD, FB, IForest, LOF, and PCA. Experimental results showed that the precision of the proposed IoTArgos model was 0.9876 and its recall was 0.9763.
In [50], attacks such as DoS and spoofing were detected using the IoTID20 dataset. RF, SVC, XGBoost, and LR techniques were used to detect intrusion and improve the performance and accuracy of the proposed model. Simulation results showed that these techniques provide high accuracy and can be used to detect IoT attacks. In [51], various supervised ML methods available in the MLlib library of Apache Spark were used for fast data processing and identification of the SYN-DoS cyber attack. Both detection performance and application/training time were analyzed. ML algorithms such as RF, DT, LR, SVM, and GBT were tested on the SYN-DoS public dataset. Experimental results showed that the RF accuracy rate was 100%, and the shortest training time (up to 23.22 s) was achieved by DT with 2 million rows. The minimum application time was 0.13 s for about 600,000 instances in the case of the RF algorithm with Apache Spark. Note that this algorithm needs to be used in a cloud environment for better scalability and ease of use. Moreover, the model generated by RF is easy to use and easy to implement in both low-level and high-level languages.
Table 3 presents a summary of different supervised ML methods in terms of: (1) types of malicious attacks, (2) feature selection methods, (3) detection methods, and (4) datasets considered for study.
2.2. Semi-Supervised Machine Learning Methods in IoT Security
In [52], classifiers such as SVM and KNN were used to classify the feature sets, and an ensemble method was used to detect normal or malicious packets. The dataset used in this paper was the NSL-KDD dataset. Note that all classifiers worked in a distributed environment to reduce future attacks. With hybrid methods and fewer features, the accuracy increased by 10%, and the false positive rate was shown to be reduced by 0.05. Hence, it was concluded that detection performance was improved, with a higher true positive rate and fewer features. A flow-based NIDS (SSLEEK) approach was developed in [53] to produce alerts on anomalies and malicious attacks. NetFlow files were used to detect botnet traffic in a network session. This method showed improvements in accuracy and efficiency compared to traditional NIDS. The classifiers selected were K-means, K-NN, and GMM. The K-NN classifier is among the most popular classifiers in machine learning; its workings are shown in Figure 4.
In [54], the hierarchical stacking-temporal convolutional network (HS-TCN) was developed to detect anomalies in the communication of smart homes. With this semi-supervised technique, a 30% improvement in results was shown compared to the supervised model. Using hierarchical and stacking methods improved security and performance. A multi-layer clustering model was proposed in [55] to detect and prevent intrusion. The semi-supervised multi-layered clustering (SMLC) model was compared with tri-training and with classifiers such as RF, Bagging, and AdaboostM1 on two datasets, NSL and Kyoto 2006+. The results showed that multi-layer clustering performed better than the tri-training model while using 20% less unlabeled data and had comparable performance to the ensemble method, although SMLC had a higher testing time than the latter. In [56], a fuzziness-based learning approach was used in which unlabeled samples supported a supervised approach to improve the performance of classifiers. NNRw (neural network with random weights) was used as the base classifier because it is computationally efficient and has excellent learning performance. The proposed method showed that unlabeled samples belonging to the high and low fuzziness categories played an important role in improving the classifiers’ performance compared to other existing classifiers.
In [57], the authors proposed a two-model Gaussian fields approach and a spectral graph transducer to detect unknown malicious attacks. They also used MPCK-means to improve the performance of clustering methods. The KDD Cup 1999 dataset was used to test the models. In [58], the DAS-CIDS system was designed to enhance the performance of IDS and to reduce the false alarm rate. The DARPA (KDD99) dataset was used to analyze the detection performance, and Snort alarms were used to reduce the false alarm rate. Results showed that the proposed method is more efficient than traditional supervised classifiers due to its automatic support for unlabeled data. A dynamic ensemble algorithm was used in [59] in combination with a semi-supervised extreme learning machine (SSELM). Moreover, the mutual information criterion was proposed for detecting anomalies in large-scale data. SSELM works as a base classifier that provides high relevance and low redundancy. Real-life datasets from UCI (BC, COIL20, ILPD, and HARS) were used for the experiment, and results showed that the proposed algorithm outperformed the state-of-the-art methods in terms of average classification. In [60], the authors proposed the SDRK machine learning model to detect and mitigate intrusion on fog nodes. NSL-KDD was used as the dataset. The SDRK model was tested on fog nodes that lie between the cloud layer and the IoT. The proposed model showed higher accuracy and a shorter testing time, with a detection accuracy of up to 99.78%.
Table 4 presents a summary of semi-supervised ML methods in terms of types of malicious attacks, feature selection methods, detection methods, and datasets considered.
2.3. Unsupervised Learning in IoT Security
The grey wolf optimization one-class support vector machine (GWO-OCSVM) was proposed in [61] to detect botnet attacks launched from IoT devices. The OCSVM, IF, and LOF algorithms were used to test the proposed model, and results showed that GWO-OCSVM can detect botnet attacks and perform classification better than the other algorithms. On the N-BaIoT dataset, experiments showed that GWO-OCSVM achieved better results than the other three algorithms in terms of FPR, TPR, and G-means. The performance was enhanced by up to 92%. In [62], MCS applications were used to protect the reliability and correctness of user data. The cyber trustworthiness of the MCS report was ensured in the presence of smart and scheming adversaries. Real IoT datasets were used to demonstrate the effectiveness and accuracy of this model.
In [63], the authors proposed the IRESE model to detect rare events and anomalies in the incoming data stream on the edge devices of the IoT. To assess the detection performance of the IRESE model, various rare-event types (gunshot, glass break, scream, and siren) were used for testing. The whole system was tested using an agile-based IoT gateway. Testing results showed that IRESE is a portable and lightweight system, which can be deployed anywhere and can detect rare events from the start. Anomaly-based detection was used in [64] to detect botnets in IoT devices. Multiple features were used from both datasets (unbalanced and balanced), but only three features were able to differentiate between normal and malicious traffic. Experiments showed that the best precision and accuracy, of up to 90%, were achieved through RF and entropy with five features in both the balanced and unbalanced datasets. Note that the result was the same when 10 features were used. Results showed that a single model for all IoT devices provides good detection; however, a separate model for each IoT device provided a more accurate detection rate.
In [65], the authors considered hybrid intrusion detection (misuse-based and anomaly-based detection) that uses MapReduce for distributed detection. The misuse-based and anomaly-based methods used supervised and unsupervised optimum-path forest (OPF) models to detect intrusion from wireless sensor networks and IoT devices. Anomaly detection based on unsupervised OPF was used for detecting internal attacks that happened in 6LoWPAN, and misuse detection was used for external cyber attacks that originated from the Internet. Both internal and external attack detection showed superior results compared to other existing classifiers.
In [66], the authors proposed a network threat situation assessment model based on unsupervised learning, which used unlabeled data to detect network threats in an IoT system. The CSIC 2010 HTTP, ADFA-LD, UNSW-NB15, and ISOT datasets were used to test the proposed model. Experimental results showed that the developed model performed better than the traditional model based on the supervised method, which used labeled data to detect network threats.
Table 5 presents a summary of unsupervised ML methods in terms of types of malicious attacks, feature selection methods, detection methods, and datasets considered.
3. Deep Learning Methods
Deep learning methods have also been used in the literature for the detection of malicious attacks. Popular deep learning methods in the IoT are deep belief networks and adaptive boost algorithms. The deep belief network (DBN) is a popular deep learning algorithm that consists of a visible layer (input layer) and multiple hidden layers (latent variables). This algorithm works in layers. First, the input layer sends data to the first hidden layer, which processes it. Next, the second hidden layer takes the output of the first hidden layer as its input and processes the data. This process is repeated until the last layer produces the output of the algorithm, as shown in Figure 6.
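The layer-by-layer flow described above can be sketched as a simple forward pass in which each layer's output becomes the next layer's input. This illustrates only the data flow and not DBN training (the greedy layer-wise pre-training of restricted Boltzmann machines is omitted). The sigmoid activation, the random weights, and the layer sizes are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_inputs, n_units, rng):
    """Create one hidden layer with random weights and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Each unit computes a weighted sum of its inputs, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(visible, layers):
    """Pass the visible (input) vector through the hidden layers in order."""
    activations = visible
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


rng = random.Random(0)
# Hypothetical sizes: 4 visible units -> 3 hidden -> 3 hidden -> 2 outputs.
layers = [make_layer(4, 3, rng), make_layer(3, 3, rng), make_layer(3, 2, rng)]
output = forward([0.2, 0.9, 0.0, 0.5], layers)
print(output)
```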
Figure 7 presents the working of the popular adaptive boost (AdaBoost) algorithm, in which instance weights are reassigned at each iteration and higher weights are assigned to incorrectly classified instances. At the start, all instances have equal weights. After the first classifier, incorrectly classified instances are given higher weights than correctly classified instances, and the reweighted instances are used as input to the second classifier. Each classifier (model) is thus created by using the errors of the previous classifiers, and this process is repeated until specified conditions are met, that is, until a strong classifier is obtained. A detailed discussion of the use of deep learning for the detection of cyber attacks in IoT systems is provided below.
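The reweighting step can be made concrete with a short sketch of one boosting round. The text does not give the update formula, so the standard discrete AdaBoost rule is assumed here: the classifier weight is alpha = 0.5 ln((1 - err) / err), misclassified instances are scaled by exp(alpha), correctly classified ones by exp(-alpha), and the weights are then normalized. The five-instance example is hypothetical.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One AdaBoost round: return (alpha, new normalized instance weights).

    weights: current instance weights, summing to 1.
    misclassified: booleans, True where the current classifier was wrong.
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    scaled = [
        w * math.exp(alpha if wrong else -alpha)
        for w, wrong in zip(weights, misclassified)
    ]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]


# Five instances with equal starting weights; the third one is misclassified.
start = [0.2] * 5
alpha, updated = adaboost_reweight(start, [False, False, True, False, False])
print(alpha, updated)
```

In this example the single misclassified instance ends up carrying half of the total weight, so the next classifier is pushed to get it right.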
An intrusion detection model based on a hybrid genetic algorithm and deep belief network (DBN) algorithm was presented in [67]. It achieved higher accuracy and a higher detection rate. The KDDCUP dataset was used for simulation and evaluation of the model. Existing DL models were used for a performance comparison in detecting DoS, R2L, Probe, and U2R attacks. The DL algorithms used for comparison were the thermodynamics-based artificial neural network (TANN), the fuzzy clusters-artificial neural network (FC-ANN), and the back propagation neural network (BPNN). Results showed that the intrusion detection performance of the hybrid genetic algorithm and DBN model was considerably better than that of the existing DL algorithms considered. In [68], the authors proposed an LM-BP neural-network-based model to detect DoS, R2L, Probe, and U2R attacks. The aim was to show that this approach is better than the traditional BP and PSO-BP models. The KDDCUP-99 dataset was used for simulation and performance evaluation. Results showed that LM-BP outperforms the other models in terms of intrusion detection: the false alarm rate was 1.34% and the accuracy rate was up to 93.31%, both better than those of the other models.
An IDS model, the cost-sensitive stacked auto-encoder (CSSAE), was developed in [69] to deal with imbalanced data in IDS. Two datasets, KDDCUP-99 and NSL-KDD, were used to evaluate the performance of CSSAE, and it was compared with the SAE and NDAE models. Experimental results showed that the accuracy of CSSAE reached up to 99.35%, better than that of the SAE and NDAE models, and that it was 1.15 times faster. A cyber attack detection method based on the recurrent neural network (RNN) was developed in [70]. LSTM acts as a module in an ensemble of detectors, and the LSTM modules are merged with a DT method to produce the final output. RF, KNN, MLP, and SVM classifiers were applied to the datasets. The effectiveness and performance of the proposed method were evaluated on a real-world dataset (Modbus Network Traffic), and the method obtained a 99% accuracy rate in detecting cyber attacks on IoT devices.
In [71], a CNN-based dual deep learning model with a disaggregation and aggregation architecture was proposed to detect cyber-physical attacks using an energy audit. The proposed model uses an energy meter to check the system behavior and detect attacks. The disaggregation model detects cyber attacks, and the aggregation model detects physical attacks. By using energy consumption, the proposed model can detect attacks much better than a single deep learning method. The simulation results showed that cyber attacks were detected between 900 and 1100 s, and physical attacks were detected within 150 to 600 s.
The intelligent intrusion detection system (IID) was used with a DBN-based deep learning algorithm in [72] to detect malicious traffic in an IoT environment. The proposed method was evaluated on both a real network (as proof of concept) and a simulated network (as evidence of scalability) to demonstrate its effectiveness. For evaluation, IID was compared with the inverse weight clustering (IWC) model. Results showed that the proposed model can efficiently detect attacks in both real and simulated traffic. IoT devices are changing rapidly in shape, size, complexity, and usage, and it is becoming difficult to detect attacks transferred between IoT devices. Hence, a hybrid convolutional neural network model (HCNN) was developed in [73] to detect DoS, sinkhole, and eavesdropping attacks on IoT devices. The UNSW NB15 dataset was used, and HCNN was compared with an RNN model. Experimental results showed that the hybrid approach can detect a wider range of attacks in IoT systems than RNN. HCNN achieved 98% better efficiency than RNN.
In [24], a distributed deep learning approach was proposed for an IoT/Fog system to detect DoS, R2L, Probe, and U2R-based cyber attacks. Distributed deep learning was compared with centralized learning, and results showed that the distributed deep learning method can detect attacks with up to 99% accuracy. The NSL-KDD dataset was used in this study. In [74], a deep neural network-based learning strategy was proposed for the identification and detection of malicious attacks in IoT networks. The traffic classes considered were DoS, probing, malicious, scan, spying, wrong setup, and normal. Different ML classifiers (GaussianNB, SVM, SGD, RF, LDA, LR, and DT) were compared with the DNN on the DS2OS dataset. Simulation results showed that the training accuracy was 98.27% and the testing accuracy was 98.29%, which resulted in an average accuracy of 98.28%.
A DNN-based framework was proposed in [75] for detecting network attacks and reducing the false alarm rate. A self-adaptive identification method was adopted, in which the proposed model can send an early warning when an attack is detected in an IoT network. The NSL-KDD dataset was used for performance evaluation. The early warning accuracy of the proposed DNN model was 99.9% in a comparison with PCA, Gain Ratio, and DBN-based frameworks. SVM and SDA methods were used for attack classification. In [76], a vector convolutional deep learning (VCDL) approach in a fog environment was used to detect DDoS, DoS, theft, and reconnaissance-based malicious attacks in IoT traffic. UNSW’s Bot-IoT dataset was used for testing and evaluation of the model. The proposed model was compared with SVM, RNN, and LSTM models for performance analysis. Results showed that VCDL performed best, with accuracy, precision, and recall of up to 99.974%, 99.99%, and 99.75%, respectively.
In [77], software defined network–IoT was proposed along with a fuzzy neural network (FNN) to detect three attacks, namely, man-in-the-middle (MITM), malicious code (MC), and side-channel (SC) attacks, in addition to DDoS in IoT traffic. The fuzzy-rule-based neural network system was trained and tested using the NSL-KDD dataset. With the FNN detection model, the detection accuracy for these four malicious attacks was reported to be up to 83%. In [78], a feed-forward neural network (FFNN) model was proposed with new layers for multi-class classification to detect DDoS, DoS, data gathering, and data theft attacks. The efficiency of the model was tested using both binary and multi-class classification on datasets with real IoT traffic. Results showed that the detection accuracy of the proposed model was up to 99.99% for binary classification and 99.79% for multi-class classification.
With the rapid growth of IoT devices, it has become more difficult to secure them against malware. In this regard, an ARM-processor-based IoT application was considered in [79]. An LSTM-based RNN structure was used to detect malware in the IoT network. The LSTM structure had three layers, and it showed promising results compared to RF, NB, SVM, MLP, KDD, DT, and AdaBoost classifiers. The detection accuracy rate was up to 98%. In [80], a bi-directional LSTM recurrent neural network (BLSTM-RNN) model was developed to detect backdoor, DoS, worm, analysis, and reconnaissance attacks in an IoT network, and the UNSW-NB15 dataset was used for evaluation. Experimental results showed that the intrusion detection accuracy rate of the proposed method was 95.7%. Moreover, its precision rate and minimum wrong detection rate were 100% and 0.04%, respectively. A zero false alarm rate was also achieved with recall. The F1-score was up to 98%.
An anomaly detection system (ADS) based on deep learning was presented in [81] to identify malicious activities, such as fuzzers, analysis, backdoor, DoS, generic, exploits, reconnaissance, shellcode, and worm attacks, in an industrial IoT environment. In the proposed ADS, the results of a deep auto-encoder (DAE) were used to initialize a deep feed-forward neural network (DFFNN) in the training and testing phases. The older NSL-KDD dataset and the newer UNSW-NB15 dataset were used to detect both outdated and new malicious attacks. For evaluation, malicious activities were detected using the two models of the ADS, DAE and DFFNN. Results showed that the detection rate of the proposed model was up to 99%, and the false positive rate was low, at 1.8%.
In [82], an efficient intrusion detection model based on deep learning was proposed to detect DoS, injection, reconnaissance, and zero-attacks in the Brownfield industrial IoT system. A denoising auto-encoder was used for unsupervised learning and a deep neural network for supervised learning, both on the MODBUS dataset. The proposed model was compared with SVM, KNN, NB, and RF models for testing and evaluation. The proposed model showed promising results, with a detection rate of 91.49%, a precision rate of 96.41%, and a false positive rate of 1.87%.
Table 6 presents a summary of DL methods in terms of types of malicious attacks, feature selection methods, detection methods, and datasets considered.
4. Conclusions and Future Prospects
IoT is an emerging technology, but the security of its devices and systems is a major concern. Therefore, this paper presented security concerns in IoT networks. Supervised, semi-supervised, and unsupervised machine learning methods were discussed for the detection of different malicious attacks in IoT networks, and deep-learning-based methods were also explained for the detection of cyber attacks in IoT systems. For both machine and deep learning detection methods, various malicious attacks, such as DoS, DDoS, probing, U2R, R2L, botnet, spoofing, and MITM attacks, were discussed, and the datasets used were included. All learning methods were compared in terms of the types of attack, feature selection methods, method(s) used to detect attacks, and datasets, in order to identify the best techniques or methods to detect these attacks.
In the future, research should be focused on system throughput, as more IoT devices will be connected to IoT systems. Therefore, scalability issues of detection methods should also be considered when addressing security protocols. Security protocols should be designed to be cost-efficient and computationally efficient to meet the devices’ resource constraints. Future studies should also be focused on data security, infrastructure problems, and privacy leakage. Novel machine and deep learning methods can also be explored to overcome cyber attacks. Semi-supervised machine learning and reinforcement learning methods have not been well explored for malicious attack detection in IoT systems. There is also a need for a comprehensive cyber-detection system which can offer robustness, scalability, accuracy, and protection against all types of malicious threats.