Article

Deep-Learning-Based Approach for IoT Attack and Malware Detection

Vocational School of Technical Sciences, Firat University, Elazig 23119, Turkey
Appl. Sci. 2024, 14(18), 8505; https://doi.org/10.3390/app14188505
Submission received: 23 August 2024 / Revised: 15 September 2024 / Accepted: 18 September 2024 / Published: 20 September 2024
(This article belongs to the Special Issue Advances in Internet of Things (IoT) Technologies and Cybersecurity)

Abstract

The Internet of Things (IoT), introduced by Kevin Ashton in the late 1990s, has transformed technology usage globally, enhancing efficiency and convenience but also posing significant security challenges. With the proliferation of IoT devices expected to exceed 29 billion by 2030, securing these devices is crucial. This study proposes an optimized 1D convolutional neural network (1D CNN) model for effectively classifying IoT security data. The model architecture includes input, convolutional, self-attention, and output layers, utilizing GELU activation, dropout, and normalization techniques to improve performance and prevent overfitting. The model was evaluated using the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets, achieving impressive results: 98.36% accuracy, 100% precision, 99.96% recall, and 99.95% F1-score for CIC IoT 2023; 99.90% accuracy, 99.98% precision, 99.97% recall, and 99.96% F1-score for CIC-MalMem-2022; and 99.99% accuracy, 99.99% precision, 99.98% recall, and 99.98% F1-score for CIC-IDS2017. These outcomes demonstrate the model’s effectiveness in detecting and classifying various IoT-related attacks and malware. The study highlights the potential of deep-learning techniques to enhance IoT security, with the developed model showing high performance and low computational overhead, making it suitable for real-time applications and resource-constrained devices. Future research should aim at testing the model on larger datasets and incorporating adaptive learning capabilities to further enhance its robustness. This research significantly contributes to IoT security by providing advanced insights into deploying deep-learning models, encouraging further exploration in this dynamic field.

1. Introduction

Since its introduction by Kevin Ashton during a presentation at MIT in the late 1990s, the Internet of Things (IoT) has profoundly transformed the use and perception of technology worldwide [1]. IoT functions today as an extensive network infrastructure that enables objects, ranging from household appliances to industrial machinery, to interact with each other and broader systems via the internet [2,3,4]. By facilitating communication and data exchange between devices, IoT automates processes, thereby enhancing efficiency and speed and enriching people’s lives. This system is utilized in a wide array of applications, from smart home devices to factory machinery. In summary, IoT enables devices to communicate with each other, making daily life smoother and more efficient [5,6].
On the other hand, the Industrial Internet of Things (IIoT) is a specialized adaptation of IoT tailored for the industrial sector [7]. IIoT has the potential to maximize operational efficiency by making industrial processes, from factory automation to supply chain management, smarter and more interconnected [8].
However, the benefits of these technologies come with significant security challenges. By 2030, it is expected that over 29 billion IoT devices will be interconnected globally [9]. This network spans a broad spectrum of applications, from individual health monitoring to intelligent transportation systems, energy management, and environmental monitoring [10]. Smart city applications optimize traffic management while minimizing water and energy usage, and smart homes enhance user comfort while reducing energy consumption. In the industrial sector, IIoT technologies make production lines more efficient, improve maintenance processes, and reduce operational costs. While IoT and IIoT systems provide these transformations, they also introduce serious security challenges [11]. Due to the limited processing capabilities and minimal security measures of IoT devices, they are highly vulnerable to attacks [12]. In recent years, attacks such as DDoS, MITM, and malware on IoT networks have become increasingly widespread [13,14]. For example, botnets controlled by cyber attackers can execute widespread DDoS attacks, causing extensive internet outages and crippling critical infrastructure systems [17]. Deep learning has emerged as a powerful solution for IoT security, owing to its capacity to learn complex attack patterns and process large datasets [15,16]. Securing the IoT ecosystem is not just a technological necessity but a social and economic imperative. Security researchers, industry leaders, and policymakers are continually developing new strategies and solutions to enhance IoT system security, encouraging research in this area and establishing international collaborations [18].
Through extensive simulations and experiments, this study analyzes the attack vectors targeting IoT devices and networks and proposes effective methods to mitigate these threats [19]. IoT environments are vulnerable to various cyber-attacks, often posing serious security risks. The primary attacks that IoT systems face include DoS/DDoS attacks, information gathering attacks, Man-in-the-Middle (MITM) attacks, injection attacks, and malware attacks [20]. DoS/DDoS attacks, which cause service interruptions by overloading network resources, are critically severe. Information-gathering attacks aim to obtain sensitive information about the system and are highly severe.
MITM attacks, known for compromising the confidentiality and integrity of communication, are high-severity attacks. Injection attacks create significant security vulnerabilities by injecting malicious code into systems and are critically severe. Malware attacks, which adversely affect the performance and security of systems, are considered medium- to high-severity threats. The impacts of these attacks underscore the challenges in ensuring the security of the IoT ecosystem and highlight the need for continuous improvement in security measures. The detection of sophisticated threats such as zero-day attacks and malware has become increasingly difficult using traditional methods. This study aims to enhance IoT security by leveraging the power of deep-learning techniques to process large datasets and identify complex attack patterns. This research focuses on providing effective security solutions, particularly for real-time applications and resource-constrained environments. In this study, the proposed method seeks to offer superior performance compared to existing security approaches.
The findings of this study provide guidance for strengthening IoT security infrastructure and leveraging the opportunities offered by this technology safely. This research provides the necessary knowledge and tools to realize IoT’s potential securely, contributing to the sustainable and secure expansion of IoT. The study’s results offer significant theoretical contributions and guide the practical application of IoT security solutions, inspiring further research in this dynamic technological field.
Specifically, the study aims to achieve the following objectives:
- Evaluate the performance of 1D CNN models in classifying IoT network traffic as malicious or benign;
- Identify various types of attacks in the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets and analyze these datasets using deep-learning models;
- Provide advanced insights into the use of deep-learning techniques in IoT security research and contribute to other studies in this field.
This study aims to demonstrate how deep-learning techniques can be utilized to develop forward-looking solutions for IoT security.
This paper is organized as follows: Section 2 presents the datasets used in this study. Section 3 introduces the proposed CNN model. Section 4 depicts the experimental results. Section 5 discusses the findings. Section 6 outlines future work, and Section 7 addresses the limitations. Finally, Section 8 presents the conclusions of this research.

1.1. Related Works

This section reviews the existing literature related to the proposed research. Specifically, studies that employ deep-learning- and machine-learning-based methods in the fields of IoT security and malware detection will be examined. The datasets used, the performance metrics of the methods, and the challenges encountered will be discussed in detail. Additionally, the gaps in the current literature and how this research aims to address these gaps will be discussed. The focus will be on approaches that stand out for their successes and limitations, emphasizing this study’s contributions to the literature.
Khalvati et al. [21] developed a model using SVM and Bayesian methods to detect and classify IoT attacks, utilizing the KDD-CUP 99 dataset for evaluation. After feature reduction in the dataset, the model achieved 91.50% accuracy in multi-class classification. Lam et al. [22] conducted a study to detect Bot, DoS, and HTTP attacks using the CSE-CIC-IDS2018 dataset. They tested random forest, multilayer perceptron, and one-dimensional convolutional neural network (Conv1D) models. The convolutional layers used 32, 39, and 64 filters, with a filter size of 5 and a batch size of 32, training the model for 50 iterations. The neural network architecture included Conv1D (32, 5), Conv1D (64, 5), MaxPool (2), Conv1D (39, 5), MaxPool (2), and two fully connected (FC) layers. This configuration resulted in accuracy, precision, recall, and F1-scores of 99.98%. Ferrag et al. [23] developed a deep-learning-based attack detection system to identify DDoS attacks. This system was built on three different models: convolutional neural networks (CNN), artificial neural networks (ANN), and recurrent neural networks (RNN). The performance of each model was evaluated on two new real traffic datasets, CIC-DDoS2019 and TON IoT, for both binary and multi-class classification. The multi-class classification accuracy for the first dataset was 95.90%, while the binary classification achieved 99.95%. For the second dataset, the multi-class classification accuracy was 98.94%. Qazi et al. [24] developed an attack detection system using a one-dimensional convolutional neural network (1D CNN). This study trained and tested the model on the CSE-CIC-IDS2017 dataset to classify DoS Hulk, DoS GoldenEye, DDoS, Portscan, and benign traffic. The model’s training accuracy was 99.32% and the test accuracy was 98.96%. Additionally, the model’s precision was 98.70%, the recall was 99.20%, and the F1-score was 98.94%. Ullah et al. [25] developed an anomaly detection system for IoT networks using a feedforward neural network based on flow and control flag features. This model was evaluated on various IoT-focused datasets for both binary and multi-class classification. The datasets used included BoT-IoT, IoT network attack, MQTT-IoT-IDS2020, MQTTset, IoT-23, and IoT-DS2. The study results showed very high accuracy rates of 99.97% for multi-class classification and 99.99% for binary classification. Shatnawi et al. [26] proposed a static malware detection method based on permissions and API calls from Android applications. They employed three machine-learning algorithms—SVM, KNN, and Naive Bayes—evaluated on the CICInvesAndMal2019 Android malware dataset. This approach aims to offer a reliable and effective solution for malware detection. Kilichev et al. [27] enhanced a one-dimensional convolutional neural network (1D CNN) using genetic algorithms (GA) and particle swarm optimization (PSO) to improve performance. On the CSE-CIC-IDS2017 dataset, the model achieved a test accuracy of 99.71%, precision of 100%, and recall and F1-scores of 99% with GA optimization. With PSO, the results were slightly higher, with a test accuracy of 99.74%, precision of 100%, recall of 99%, and an F1-score of 100%. Both optimization methods demonstrated excellent performance. Bayazit et al. [28] developed a malware detection system using RNN-based algorithms, including LSTM, BiLSTM, and GRU, and evaluated them on the CICInvesAndMal2019 dataset with 8115 static features.
The experimental results showed that the BiLSTM model achieved the highest accuracy at 98.85%, highlighting its superior effectiveness in malware detection. Brown et al. [29] found that malware detection systems developed using AutoML can perform as well or better than manually designed models. In experiments with the SOREL-20M and EMBER-2018 datasets, the Darts AutoML model achieved 98.61% accuracy, 98.52% precision, and 98.88% recall. However, the high computational cost and processing time were significant drawbacks, particularly for large datasets. Despite these challenges, AutoML shows promise for malware detection, though further improvements are needed in cost and time efficiency. Almazroi and Ayub [30] presented a BERT-based Feedforward Neural Network (BEFSONet) for IoT environments, evaluated on eight IoT malware datasets. Optimized using the Spotted Hyena Optimizer (SHO), the model showed strong adaptability to various malware structures. While promising as a defense mechanism for IoT security, the complexity and high computational demands of SHO may limit its use in resource-constrained devices, suggesting the need for more efficient optimization algorithms. Tseng et al. [31] used the CIC-IoT-2023 dataset to develop deep-learning models for IoT intrusion detection, achieving 99.40% accuracy in multi-class classification with the Transformer model, outperforming prior studies. However, its higher computational cost and lower binary classification performance are notable limitations.

1.2. Motivation and Proposed Model

The primary goal is to contribute to the field of IoT security by presenting a new deep-learning model. Therefore, a simple but highly effective 1D convolutional neural network (1D CNN) model was developed. This work pursues two main objectives: achieving superior classification performance while reducing the model’s computational complexity and the number of trainable parameters. To develop the proposed model, a comprehensive analysis of existing CNN and deep-learning techniques was conducted. This model features a lightweight and efficient architecture optimized for processing IoT data. The main components of the model are as follows: an input layer (sequence input) receives data sequences from IoT devices; 1D convolutional layers (Conv1D) learn spatial relationships in the data and extract various features; layer normalization and batch normalization layers stabilize the learning process and enhance the overall performance of the model; the GELU activation function increases model accuracy by learning non-linear relationships; and self-attention layers focus on important features of the data, thereby improving overall accuracy. Fully connected layers perform the classification task, dropout layers prevent overfitting and enhance the model’s generalization capability, and global max pooling and softmax layers conduct the final classification. A graphical representation of the model is presented in Figure 2. This model offers an innovative approach to IoT security by providing high accuracy, speed, and resource efficiency. The results of this study serve as a guide for strengthening IoT security infrastructure and leveraging the opportunities offered by this technology in a secure manner.

1.3. Novelties and Contributions

The novelties of this research are as follows:
  • An optimized 1D CNN model with low computational load was developed to classify IoT data with high accuracy;
  • One-dimensional convolutional layers that learn spatial relationships in the data and layer normalization and batch normalization techniques that enhance the model’s performance were utilized;
  • The GELU activation function was employed to improve the ability to learn non-linear relationships;
  • Self-attention layers were added to enhance overall accuracy by emphasizing key features of the data;
  • The model’s effectiveness was validated by testing it on comprehensive datasets such as CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017.
The key contributions of this research are as follows:
  • The study presents a new and realistic IoT attack dataset using a comprehensive topology of various real IoT devices, including 33 attacks where malicious IoT devices target other IoT devices;
  • The performance of deep-learning models like the 1D CNN is evaluated using this new dataset, demonstrating the effectiveness of these models in classifying IoT network traffic as malicious or benign;
  • The research provides advanced knowledge on how deep-learning techniques can be applied to IoT security, making significant contributions to other studies in this field;
  • Various types of attacks in the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets are detailed and analyzed using deep-learning models.

2. Datasets

The datasets used in this study were selected to evaluate the effectiveness of deep-learning models in IoT security and cyber-attack detection. The datasets used are CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017.

2.1. CIC IoT 2023 Dataset

The CIC IoT 2023 dataset was used in this study to evaluate the effectiveness of deep-learning models in the field of IoT security [32]. Developed by the Canadian Institute for Cybersecurity (CIC), this dataset provides a comprehensive benchmark for IoT attacks, simulating real-world scenarios with a network of 105 IoT devices. It features 33 different types of attacks, categorized into seven groups: DDoS, DoS, Reconnaissance, Web-based, Brute Force, Spoofing, and Mirai, where IoT devices are used as both attackers and targets. This dataset is invaluable for developing and testing IoT security solutions by providing realistic data for security analytics in large-scale IoT environments. The CIC IoT 2023 dataset includes key features such as flow duration, protocol type, flag counts (FIN, SYN, RST, etc.), and traffic rates, alongside protocols like HTTP, TCP, UDP, and ICMP. Metrics like total sum, minimum, maximum, average, and standard deviation, along with inter-arrival times and other characteristics, allow for detailed analysis of IoT network traffic. This makes it a significant resource for researchers aiming to classify and detect malicious network activities using machine and deep-learning algorithms (see Table 1).
The Internet of Things (IoT) ecosystem comprises a diverse range of components. On the network side, key devices include the Asus RT-N12 router, Cisco Catalyst 3850 24 switch, and Netgear Unmanaged Switch GS308, all monitored via the Gigamon G-TAP A-TX network tap. Controllers such as the Vera Plus, Aeotec Zigbee/Z-Wave Smart Hub, and SmartThings Hub facilitate the integration and management of smart devices. The sensor suite includes the Aeotec Water Leak Sensor, Multisensor 6, Motion Sensor, and Button, alongside security devices like the Aeotec Siren, Doorbell 6, and Door/Window Sensor 7 Pro. Cameras, including models from Arlo, Dlink, Amcrest, Google Nest, and Netatmo, provide high-resolution monitoring for both indoor and outdoor security. Smart home devices further enrich the ecosystem, with products like the Philips Hue Bridge and Bulbs, Arlo Base Station, iRobot Roomba i3+, and Amazon Echo Studio. The Raspberry Pi 4 Model B, used to simulate potential IoT attacks, underscores the role of these devices in IoT security research. This extensive setup offers a comprehensive framework for examining the interplay of IoT components, ranging from network infrastructure and sensors to smart devices and attack simulations, enhancing research on IoT security vulnerabilities and solutions.

2.2. CIC-MalMem-2022 Dataset

The CIC-MalMem-2022 dataset is designed to test memory-based detection methods for concealed malware [33]. Created using malware commonly found in the real world, this dataset aims to closely represent real-world scenarios. CIC-MalMem-2022 includes spyware, ransomware, and Trojan horse malware, providing a balanced dataset for testing concealed malware detection systems. During the memory dump process, the debug mode is utilized to ensure the dump process remains invisible in memory dumps, offering a more accurate representation of programs an average user might run during a malware attack. The CIC-MalMem-2022 dataset is balanced, with 50% malicious memory dumps and 50% benign memory dumps. Out of a total of 58,596 records, 29,298 are benign and 29,298 are malicious memory dumps. This dataset, created using real-world malware, incorporates concealment techniques that make malware detection challenging. Its balanced nature allows for an objective evaluation of detection algorithms’ accuracy. The dataset includes spyware, ransomware, and Trojan horse malware, with examples from various malware families within each category. Additionally, the use of debug mode during the memory dump process ensures the invisibility of the dump process in memory dumps, enabling more accurate testing of malware detection systems. The CIC-MalMem-2022 dataset is a crucial resource for evaluating the effectiveness of malware detection methods and advancing research in this field. Representing real-world malware and concealment techniques, this dataset aids security analysts and researchers in developing more effective detection methods.
In this dataset, the BENIGN class represents benign memory dumps, while the Malware classes include various types of malware such as spyware, ransomware, and Trojan horses. The dataset contains concealment techniques that complicate malware detection and maintain a balanced structure, allowing for an objective evaluation of detection algorithms’ accuracy.

2.3. CIC-IDS2017 Dataset

The CIC-IDS2017 dataset, created by the Canadian Institute for Cybersecurity in 2017, is a comprehensive and up-to-date dataset for cybersecurity research [34]. This dataset captures real-world network traffic and common cyberattacks, making it a valuable resource for assessing the performance of security tools like Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). Spanning five days, the traffic includes only normal activity on Monday, with the remaining days featuring attacks such as Brute Force FTP/SSH, DoS, Heartbleed, Web Attacks, Infiltration, Botnet, and DDoS. These events occur during morning and afternoon sessions. The CIC-IDS2017 dataset provides labeled traffic data, including timestamps, IP addresses, ports, protocols, and attack types, in CSV format. Using the B-Profile system, it simulates background traffic from 25 users, involving protocols like HTTP, HTTPS, FTP, SSH, and email. With over 80 network flow features and a comprehensive topology, CIC-IDS2017 is an essential resource for evaluating IDS/IPS systems and advancing network security research (see Table 2).
The CIC-IDS2017 dataset provides a comprehensive dataset for cybersecurity research. This dataset includes the BENIGN class, representing normal and benign network traffic, as well as various types of attacks. It encompasses multiple attack types, including DDoS attacks, PortScan, Botnet, Infiltration, web-based attacks (Brute Force, SQL Injection, and XSS), FTP brute force attacks (FTP-Patator), and SSH brute force attacks (SSH-Patator). DoS attacks are subdivided into categories such as GoldenEye, Hulk, Slowhttptest, and slowloris. Additionally, attacks exploiting the Heartbleed vulnerability are included in this dataset. This dataset serves as an ideal resource for evaluating and developing IDS and IPS systems, providing a robust and reliable reference point for network traffic analyses and attack detection.
In processing the datasets, Z-normalization is employed to mitigate issues arising from the different scales of the data. This technique involves transforming the data such that each feature has a mean of 0 and a standard deviation of 1. Specifically, Z-normalization is performed by subtracting the mean of each column from each data point and dividing by the standard deviation of that column. This transformation is crucial as it standardizes the features, allowing the model to evaluate different features on the same scale.
The equation for Z-normalization is as follows:
z = (x − μ) / σ
where x is the data point, μ is the column mean, and σ is the column standard deviation.
By standardizing the data, Z-normalization enhances the model’s ability to generalize across diverse datasets, as it ensures that all features contribute equally to the learning process. This preprocessing step is fundamental in maintaining the robustness and adaptability of the model when applied to new and varied datasets.
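As a minimal illustration, the per-feature Z-normalization described above can be written as the following NumPy sketch; the feature matrix here is synthetic and stands in for the preprocessed dataset columns.

```python
import numpy as np

def z_normalize(X, eps=1e-8):
    """Standardize each column (feature) to zero mean and unit variance."""
    mu = X.mean(axis=0)              # per-feature mean
    sigma = X.std(axis=0)            # per-feature standard deviation
    return (X - mu) / (sigma + eps)  # eps guards against constant columns

# Toy feature matrix: rows = network flows, columns = features
X = np.array([[10.0, 200.0], [12.0, 180.0], [11.0, 220.0]])
X_norm = z_normalize(X)
print(X_norm.mean(axis=0))  # approximately 0 per column
print(X_norm.std(axis=0))   # approximately 1 per column
```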

3. The Proposed CNN

In this paper, a novel 1D convolutional neural network (CNN) architecture tailored for efficient classification tasks is presented. The architecture is proposed as a foundation for a lightweight model, ensuring its suitability for real-time and resource-constrained applications. Additionally, the architecture is scalable to handle large-scale datasets, maintaining its effectiveness and robustness. The proposed 1D CNN comprises 75 layers (see Figure 2), organized into several key blocks: the input block, convolutional blocks, self-attention blocks, and the output block.
The improvements in this model are achieved through the following structural components:
  • Convolutional Layers: These layers learn spatial relationships in the data and extract various features. By stacking multiple convolutional layers, the model can capture complex patterns in the IoT data;
  • GELU Activation Function: The Gaussian Error Linear Unit (GELU) activation function is used to enhance the model’s ability to learn non-linear relationships (see the sketch after this list). This activation function is defined as
    GELU(x) = x · Φ(x)
    where Φ(x) is the cumulative distribution function of the standard normal distribution;
  • Self-Attention Mechanism: Inspired by the superior performance of self-attention mechanisms, multiple self-attention layers are incorporated into the architecture. The self-attention mechanism helps the model focus on important features of the data, improving overall accuracy (see the sketch after this list). The self-attention is computed as
    Attention(Q, K, V) = softmax(QK^T / √d_k) · V
    where Q is the query matrix, K is the key matrix, V is the value matrix, and d_k is the dimension of the key;
  • Layer Normalization and Batch Normalization: These normalization techniques stabilize the learning process and enhance the overall performance of the model. Layer normalization is applied after each block, ensuring that the inputs to each layer have a mean of zero and a variance of one;
  • Dropout Layers: Dropout is used to prevent overfitting and enhance the model’s generalization capability. During training, dropout randomly deactivates a portion of input units, which helps in regularizing the model;
  • Global Max Pooling: This layer reduces the spatial dimensions of the input, retaining the most important features and reducing computational load.
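For reference, the GELU and scaled dot-product self-attention formulas above can be expressed as the following NumPy/SciPy sketch. This is a single attention head without learned projections; the toy shapes and the SciPy dependency are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import norm

def gelu(x):
    """GELU(x) = x * Phi(x), with Phi the standard normal CDF."""
    return x * norm.cdf(x)

def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Toy example: 4 time steps, 8-dimensional queries/keys/values
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(gelu(np.array([-1.0, 0.0, 1.0])))
print(self_attention(Q, K, V).shape)  # (4, 8)
```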
The underlying theory behind the proposed model is based on the effectiveness of convolutional neural networks in capturing spatial hierarchies in data, combined with the power of self-attention mechanisms to focus on relevant features. The model’s structure is designed to achieve a balance between computational efficiency and high classification performance, making it suitable for a wide range of IoT applications. The proposed model begins by taking a sequence input, which is then processed through a 1D convolutional layer with 96 filters, a kernel size of 3, and a stride of 1, maintaining the same padding to preserve the input dimensions. Following this, layer normalization is applied to stabilize and accelerate the training process. A GELU activation function is used to introduce non-linearity, allowing the model to capture complex patterns. Next, a self-attention mechanism is incorporated, utilizing 3 attention heads and 96 channels, enabling the model to focus on the most relevant features in the input data. After this, the data are passed through a fully connected layer with 96 units to further learn higher-level representations. A dropout layer with a rate of 0.5 is then applied to prevent overfitting by randomly deactivating neurons during training. A residual connection, formed by adding two intermediate outputs (I1 and I2), helps to retain information and improve gradient flow through the network. The model further applies batch normalization to standardize the output from the addition layer, ensuring stable learning. Dimensionality reduction is performed using a global max pooling layer, which selects the most important features from the data. Finally, a softmax activation function is used to output the classification probabilities, and the result is delivered as the final model output.
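To make this pipeline concrete, the following PyTorch sketch approximates one pass through the sequence of operations just described. It is an illustrative reconstruction rather than the MATLAB implementation used in this study; the class name, the 34-class output, and the 46-feature one-channel toy input are assumptions, and the full 75-layer network is not reproduced.

```python
import torch
import torch.nn as nn

class ProposedBlockSketch(nn.Module):
    """Sketch of Conv1D -> LayerNorm -> GELU -> self-attention -> FC -> dropout
    -> residual add -> BatchNorm -> global max pooling -> softmax."""

    def __init__(self, in_channels=1, num_classes=34, channels=96, heads=3, drop=0.5):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, channels, kernel_size=3, stride=1, padding="same")
        self.norm = nn.LayerNorm(channels)
        self.act = nn.GELU()
        self.attn = nn.MultiheadAttention(channels, num_heads=heads, batch_first=True)
        self.fc = nn.Linear(channels, channels)
        self.drop = nn.Dropout(drop)
        self.bn = nn.BatchNorm1d(channels)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):                      # x: (batch, in_channels, sequence_length)
        i1 = self.conv(x).transpose(1, 2)      # I1: (batch, seq_len, 96)
        h = self.act(self.norm(i1))            # layer normalization + GELU
        h, _ = self.attn(h, h, h)              # 3-head self-attention over the sequence
        i2 = self.drop(self.fc(h))             # I2: fully connected (96 units) + dropout 0.5
        h = (i1 + i2).transpose(1, 2)          # residual connection (I1 + I2)
        h = self.bn(h)                         # batch normalization
        h = h.max(dim=-1).values               # global max pooling over the sequence
        return torch.softmax(self.head(h), dim=-1)

# Toy forward pass: 4 samples, tabular features treated as a 1-channel sequence
model = ProposedBlockSketch(in_channels=1, num_classes=34)
print(model(torch.randn(4, 1, 46)).shape)      # torch.Size([4, 34])
```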
This design balances computational efficiency and high classification performance, making it suitable for a wide range of applications. The model has a total of approximately 9 million learnable parameters, ensuring robust performance even with a reduced number of trainable parameters.
For the optimization of the proposed 1D CNN model, various established techniques were used to improve its performance and efficiency. Stochastic Gradient Descent with Momentum (SGDM) was used as the optimizer during the training process. SGDM helps accelerate gradient vectors in the correct directions, leading to faster convergence, and the momentum term helps to smooth out oscillations and stabilize the updates. Learning Rate Scheduling was maintained with periodic updates every 50 iterations to balance convergence speed and stability, ensuring that the learning rate remains optimal throughout the training process. Dropout layers were incorporated to prevent overfitting by randomly setting a fraction of input units to zero during training, improving the model’s generalization capability by ensuring that the network does not rely too heavily on any single neuron. Batch Normalization, applied after each convolutional block, stabilizes and accelerates the training process by normalizing the inputs to each layer, helping to maintain a stable distribution of activations throughout the network, which in turn leads to improved training speed and performance. Various hyperparameters, including the number of filters, kernel size, and dropout rate, were fine-tuned through grid search, allowing us to find the optimal set of hyperparameters that yield the best performance for this model. These optimization techniques collectively contribute to the robustness and efficiency of the proposed model, ensuring high classification accuracy and performance while maintaining low computational overhead.
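As an illustration of these choices, a minimal training-step sketch with SGDM and a periodic learning-rate schedule might look as follows. The placeholder model, the momentum coefficient, and the decay factor are assumptions; the 0.01 initial learning rate and the 50-iteration update period follow the experimental setup described in Section 4.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the proposed 1D CNN (see the architecture sketch above)
model = nn.Sequential(nn.Linear(46, 96), nn.GELU(), nn.Linear(96, 34))

# SGDM optimizer and a piecewise schedule that adjusts the rate every 50 iterations
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)   # momentum value assumed
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)  # gamma assumed
criterion = nn.CrossEntropyLoss()

def train_step(x, y):
    """One SGDM update followed by a per-iteration scheduler step."""
    optimizer.zero_grad()
    loss = criterion(model(x), y)   # this placeholder outputs logits
    loss.backward()
    optimizer.step()
    scheduler.step()
    return loss.item()

# Toy mini-batch of 32 samples with 46 features each
x = torch.randn(32, 46)
y = torch.randint(0, 34, (32,))
print(train_step(x, y))
```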

4. Experimental Results

The experiments were conducted on a personal computer equipped with an NVIDIA RTX 4080 GPU, 128 GB of memory, and a 13th-generation Intel Core i9-13900K processor running Windows 11. The proposed CNN model was developed using MATLAB Deep Network Designer. The dataset was split into 70% training, 15% testing, and 15% validation. Stochastic Gradient Descent with Momentum (SGDM) was used as the solver, with an initial learning rate of 0.01, a mini-batch size of 32, and training over 30 epochs. The learning rate was held constant between scheduled updates, which occurred every 50 iterations, and all training was performed on the GPU. The performance of the proposed 1D CNN model was evaluated on several datasets. The model achieved a validation accuracy of 98.36% on the CIC IoT 2023 dataset, 99.90% on the CIC-MalMem-2022 dataset, and 96.64% on the CIC-IDS2017 dataset, indicating its effectiveness across different security domains such as IoT security, malware detection, and network attack detection. Figure 3 illustrates the accuracy and loss curves during training and validation on the CIC IoT 2023 dataset, showing strong performance and high accuracy throughout the process.
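For clarity, the 70/15/15 split can be reproduced with a workflow such as the following scikit-learn sketch; the synthetic feature matrix, the stratification, and the random seed are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed feature matrix and labels
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 46))
y = rng.integers(0, 3, size=1000)

# 70% training, then split the remaining 30% evenly into validation and test sets
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
print(len(X_train), len(X_val), len(X_test))  # 700 150 150
```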
Figure 4 presents the accuracy and loss curves obtained on the CIC-MalMem-2022 dataset. Here, the validation accuracy is also notably high, and it can be observed that the loss values decrease rapidly during the training process.
Figure 5 presents the accuracy and loss curves for the CIC-IDS2017 dataset. These graphs also demonstrate the model’s effectiveness in detecting network attacks, showcasing its robust performance throughout the training and validation phases.
In this study, the datasets were divided into 70% training, 15% testing, and 15% validation. During the training phase, the model’s parameters were optimized, the overall performance was monitored using the validation set, and the final performance was evaluated using the test set. The evaluation of the proposed model is based on the following metrics: accuracy, precision, recall, F1-score, and confusion matrix analysis. These metrics were chosen to provide a comprehensive evaluation of the model’s performance, particularly in identifying both false positives and false negatives.
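A minimal scikit-learn sketch of how these metrics and the confusion matrix can be computed from test predictions is shown below; the toy labels and the macro averaging are assumptions, since the averaging scheme is not specified in the text.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# y_true and y_pred stand in for the test labels and the model's predictions
y_true = [0, 1, 1, 2, 2, 2, 0, 1]
y_pred = [0, 1, 2, 2, 2, 2, 0, 1]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))
print(confusion_matrix(y_true, y_pred))
```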
The results obtained using the test datasets are presented in the following figures:
Figure 6 shows the confusion matrix for the CIC IoT 2023 dataset. This matrix provides a detailed view of how accurately the model classified different classes and reveals the error rates.
Figure 7 presents the confusion matrix for the CIC-MalMem-2022 dataset. This matrix visualizes the model’s success in malware detection and highlights any potential classification errors.
Figure 8 provides the confusion matrix for the CIC-IDS2017 dataset. This matrix illustrates the model’s performance in detecting network attacks and shows the correct classification rates.
Table 3 presents a percentage summary of the results obtained for different classes in the CIC IoT 2023 dataset. The model demonstrates high performance in attack types such as DDoS-ICMP_Flood (99.95% accuracy, 99.91% precision, and 99.93% recall), DDoS-PSHACK_Flood (99.96% accuracy, 99.92% precision, and 99.94% recall), and DDoS-SYN_Flood (99.60% accuracy, 99.78% precision, and 99.69% recall). However, it shows low performance in classes such as Backdoor_Malware (98.36% accuracy, 100% precision, and 4.56% recall) and DNS_Spoofing (48.41% accuracy, 26.71% precision, and 34.43% recall). There are also classes with moderate performance, such as DoS-HTTP_Flood (93.30% accuracy, 73.77% precision, and 82.40% recall) and MITM-ArpSpoofing (78.18% accuracy, 58.59% precision, and 66.98% recall). Overall, while the model is successful in some attack types, it requires improvement in others.
Table 4 presents a summary of the results obtained for different classes in the CIC-MalMem-2022 dataset. For the first class, the accuracy, precision, recall, and F1-scores are 99.97%, 99.95%, 99.98%, and 99.97%, respectively. These high-performance indicators demonstrate that the model is extremely successful in accurately classifying malware in the first class. For the second class, the accuracy, precision, recall, and F1-score are reported as 99.97%, 99.95%, and 99.97%, respectively. These values indicate that the model also detects malware in the second class with high accuracy and reliability. The obtained results prove that the employed model is highly effective in classifying malware classes in the CIC-MalMem-2022 dataset and exhibits high overall performance. This supports the model’s applicability for practical malware detection.
The features extracted from the FC_9 fully connected layer of the 1D CNN model presented in Figure 9 were classified using the MATLAB classification layer with 10-fold cross-validation. The classification results are depicted in the attached graph. While this proposed model achieved an accuracy of 99.97%, the results for kNN [35], SVM [36,37,38], Neural Network [39,40], Naive Bayes [41], Tree [42], and Efficient Logistic Regression [42] were 98.68%, 98.56%, 98.16%, 88.58%, 87.78%, and 60.9%, respectively (see Figure 9). These results not only demonstrate the superior accuracy of the proposed model in comparison to other methods but also highlight its high performance and generalization capability. The stability and generalization capabilities of the proposed model were validated using cross-validation and independent test sets. The low variance in results indicates consistent performance across different subsets of data. This demonstrates that this model can reliably operate across a wide range of conditions and generalize with high accuracy to different datasets.
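This feature-reuse comparison was performed in MATLAB; an equivalent workflow can be approximated with scikit-learn as in the sketch below. The synthetic 96-dimensional features stand in for the exported FC_9 activations, and the classifier hyperparameters (k = 5, RBF kernel) are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for deep features exported from the FC_9 layer, with binary labels
rng = np.random.default_rng(0)
features = rng.standard_normal((500, 96))
labels = rng.integers(0, 2, size=500)

# 10-fold cross-validation of classical classifiers on the extracted features
for name, clf in [("kNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf"))]:
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), features, labels, cv=10)
    print(f"{name}: mean 10-fold accuracy = {scores.mean():.4f}")
```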
Table 5 summarizes the performance metrics of the model for the various traffic classes in the CIC-IDS2017 dataset. The model exhibited high performance in the BENIGN class with an accuracy of 96.55%, a precision of 99.60%, a recall of 96.32%, and an F1-score of 97.93%. For DDoS attacks, the model achieved an accuracy of 99.90%, a precision of 97.16%, and a recall of 98.51%, demonstrating high detection success. In classes such as Web Attack Brute Force, Web Attack SQL Injection, and Heartbleed, the model displayed its highest performance with accuracies close to 100%, precision near 100%, and recall values over 99%. Specifically, for the Web Attack SQL Injection and Heartbleed classes, it achieved perfect detection with 100% accuracy, 100% precision, and 100% recall. However, the performance metrics were lower for some classes, such as Bot, FTP-Patator, and SSH-Patator. For example, in the Bot class, the model achieved 69.66% accuracy, 34.24% precision, and 45.91% recall, indicating difficulties in detecting this type of attack. Overall, the model performed well in many attack classes, but improvements are needed for classes where performance was lower. These results indicate that while the model is generally effective for attack detection, optimization is necessary to achieve better results in certain classes.
In the experiments, the performance of the proposed 1D CNN model was evaluated across various datasets. The model demonstrated high performance across many attack classes in the CIC-IDS2017 dataset, achieving 100% accuracy, precision, and recall for the Web Attack SQL Injection and Heartbleed classes. However, performance discrepancies were noted, particularly in the “Backdoor_Malware” and “DNS_Spoofing” classes. The lower performance in these classes can be attributed to several factors. For “Backdoor_Malware”, the stealthy nature of these attacks often involves techniques that evade detection by blending in with legitimate traffic. This makes it challenging for the model to differentiate between malicious and benign behaviors. Similarly, “DNS_Spoofing” exploits vulnerabilities in the domain name resolution process, often resulting in patterns that mimic normal network activity. These complexities necessitate more advanced feature extraction and training strategies to improve detection rates for such sophisticated attack types. In the CIC IoT 2023 dataset, the model achieved high accuracy rates, exceeding 99.95% for attack types such as DDoS-ICMP_Flood and DDoS-PSHACK_Flood. However, as previously mentioned, the model’s efficacy was notably lower for the more nuanced classes like “Backdoor_Malware” and “DNS_Spoofing”. This highlights the need for continued refinement in detection algorithms to address the challenges posed by these advanced threats. In the CIC-IDS2017 dataset, the model exhibited effective detection of network attacks, generally showcasing high validation accuracy. These results confirm that while the proposed model is robust across various security domains—such as malware detection, IoT security, and network attack detection—there remains a critical need for enhancements in handling complex attack vectors.

5. Discussion

In this study, the performances of various machine-learning methods used in the classification of IoT security data were compared. The findings reveal that the proposed model demonstrates superior performance, particularly in terms of high accuracy, precision, recall, and F1-score, when compared to other contemporary methods. The proposed model has shown more consistent and high performance across the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets than other studies. This indicates that the proposed model can be effectively and reliably used as a tool for classifying IoT security data.
The discussion section examines the reasons behind these findings, the advantages the model offers compared to other methods, and potential areas for improvement. Additionally, the integration of these results into practical security systems and future directions of research in this area are discussed. Comparison results are tabulated in Table 6.
Various researchers have developed different classifiers using machine-learning methods on IoT security data. Hassini et al. [43] achieved high-performance results using an End-to-End 1D CNN on the Edge-IIoTset dataset (Accuracy: 99.96%, Precision: 100%, Recall: 99%, and F1-score: 99%). Neto et al. [44] experimented with various methods on the CICIoV2024 dataset, achieving 95% accuracy with Logistic Regression, Deep Neural Network, and Random Forest, while the AdaBoost method reached 87% accuracy. Canavese et al. [45] used Random Forest on the CIC IoT 2023 dataset for both Coarse-Grained and Fine-Grained classification, obtaining approximately 96% accuracy. Maniriho et al. [46] demonstrated high performance with 98.82% accuracy using deep autoencoders and a Stacked Ensemble on the MemMal-D2024 dataset. Khalid et al. [47] achieved high F1-scores (97.0%) using memory-based features and Random Forest on the CICMalDroid2020 and CIC-AndMal2017 datasets. Namakshenas et al. [48] obtained 94.93% and 91.93% accuracy on the N-baIoT and Edge-IIoTset datasets, respectively, using Federated Learning, Quantum Computing, and Additive Homomorphic Encryption. Talukder et al. reached high accuracy rates (99.99%) on the UNSW-NB15, CIC-IDS2017, and CIC-IDS2018 datasets using Random Oversampling (RO), Stacking Feature Embedding, and PCA. The proposed model, using CNN and Softmax, achieved notable results on the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets. Specifically, it achieved 98.36% accuracy, 100% precision, 99.96% recall, and 99.95% F1-score on the CIC IoT 2023 dataset; 99.90% accuracy, 99.98% precision, 99.97% recall, and 99.96% F1-score on the CIC-MalMem-2022 dataset; and 99.99% accuracy, 99.99% precision, 99.98% recall, and 99.98% F1-score on the CIC-IDS2017 dataset. These results indicate that the proposed model demonstrates superior performance compared to other methods, particularly in terms of high precision and recall values.
The proposed 1D CNN framework achieves globally optimal results through a combination of several advanced techniques. First, the model architecture is meticulously optimized, incorporating convolutional layers, GELU activation functions, and self-attention mechanisms to effectively capture complex patterns in IoT data. Extensive cross-validation and hyperparameter tuning ensure that the chosen parameters, such as the number of filters, kernel size, and learning rate, provide the best performance, thereby avoiding overfitting and enhancing generalization to unseen data. Normalization techniques, including layer normalization and batch normalization, stabilize the learning process by maintaining a consistent scale of input data, which improves the convergence rate and overall model performance. Dropout layers are employed to prevent overfitting by randomly setting a fraction of input units to zero during training, enhancing the model’s generalization capabilities. Adaptive learning rate scheduling is used to balance convergence speed and stability by periodically adjusting the learning rate, ensuring the model maintains an optimal learning rate throughout training and avoids local minima. The model is trained and evaluated on multiple comprehensive datasets, including CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017, which exposes the model to diverse data and improves its ability to generalize across different scenarios. Rigorous performance evaluation using key metrics such as accuracy, precision, recall, and F1-score confirms that the model effectively balances precision and recall, achieving high performance. These combined techniques ensure that the proposed 1D CNN framework systematically optimizes its learning process, resulting in robust performance and the ability to achieve optimal results in IoT security data classification globally.

6. Future Work

Future research directions could focus on testing the proposed model on larger datasets and diverse attack types to further refine its performance. Enhancements could also explore adaptive learning capabilities and autonomous updates to maintain and improve security measures dynamically. This study contributes significantly to the field of IoT security by providing advanced insights into the deployment of deep-learning techniques, thereby encouraging further research and development in this dynamic domain. Future work aims to improve the experimental nature of this study by including practical IoT case studies. These case studies will provide more practical insights and demonstrate the applicability of this proposed 1D CNN model in real-world IoT scenarios. By doing so, it is hoped that it will bridge the gap between theoretical research and practical application and provide a more comprehensive evaluation of the performance of this model in various IoT environments. Additionally, further testing on larger and more varied datasets, as well as exploring adaptive learning capabilities, will be considered to enhance the robustness and effectiveness of the model.

7. Limitations

  • Although the datasets used in this study are comprehensive, they may not fully capture the diversity of real-world IoT devices and attack types. Future research should aim to validate the model on a broader range of datasets to improve its generalizability.
  • While the model demonstrates low computational overhead, further evaluation is required to assess its performance and efficiency in real-time applications, particularly on resource-constrained devices.
  • The static training process employed by this model limits its ability to adapt to new attack types and evolving threat landscapes. Future research should explore the development of models with adaptive learning capabilities to dynamically update security measures in response to emerging threats.
In summary, this research confirms the applicability and effectiveness of deep-learning techniques in improving IoT security. The proposed model, with its low computational requirements and high performance, presents a valuable tool for real-world IoT operations, enabling the detection and mitigation of large-scale attacks. Future research could focus on testing the model with larger and more diverse datasets and incorporating adaptive learning mechanisms to further enhance its robustness.

8. Conclusions

This study presents an optimized 1D convolutional neural network (1D CNN) model for classifying Internet of Things (IoT) security data, characterized by its low computational overhead and high efficiency. The proposed model was rigorously tested on three comprehensive datasets: CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017. The results indicate that the model demonstrates superior performance compared to existing methods, achieving high accuracy, precision, recall, and F1-scores. On the CIC IoT 2023 dataset, the model achieved an accuracy of 98.36%, precision of 100%, recall of 99.96%, and F1-score of 99.95%. These metrics underscore the model’s capability to effectively identify and classify large-scale attacks within IoT environments. Similarly, on the CIC-MalMem-2022 dataset, the model attained an accuracy of 99.90%, precision of 99.98%, recall of 99.97%, and F1-score of 99.96%, highlighting its proficiency in malware detection. The CIC-IDS2017 dataset results, with an accuracy of 99.99%, precision of 99.99%, recall of 99.98%, and F1-score of 99.98%, further confirm the model’s robustness in detecting network intrusions. This research validates the applicability and effectiveness of deep-learning techniques in enhancing IoT security. The developed model, with its low computational demands and high performance, presents a valuable tool for real-world IoT operations, facilitating the detection and mitigation of large-scale attacks. Its suitability for resource-constrained devices and real-time applications is particularly noteworthy.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study are publicly available.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Chin, J.; Callaghan, V.; Allouch, S.B. The Internet-of-Things: Reflections on the past, present and future from a user-centered and smart environment perspective. J. Ambient Intell. Smart Environ. 2019, 11, 45–69. [Google Scholar] [CrossRef]
  2. Abdul-Qawy, A.S.; Pramod, P.; Magesh, E.; Srinivasulu, T. The internet of things (iot): An overview. Int. J. Eng. Res. Appl. 2015, 5, 71–82. [Google Scholar]
  3. Hanes, D.; Salgueiro, G.; Grossetete, P.; Barton, R.; Henry, J. IoT Fundamentals: Networking Technologies, Protocols, and Use Cases for the Internet of Things; Cisco Press: Indianapolis, IN, USA, 2017. [Google Scholar]
  4. Gubbi, J.; Buyya, R.; Marusic, S.; Palaniswami, M. Internet of Things (IoT): A vision, architectural elements, and future directions. Future Gener. Comput. Syst. 2013, 29, 1645–1660. [Google Scholar] [CrossRef]
  5. Pramanik, P.K.D.; Pal, S.; Choudhury, P. Beyond automation: The cognitive IoT. artificial intelligence brings sense to the Internet of Things. In Cognitive Computing for Big Data Systems Over IoT: Frameworks, Tools and Applications; Springer: Cham, Switzerland, 2018; pp. 1–37. [Google Scholar]
  6. Mouha, R.A.R.A. Internet of things (IoT). J. Data Anal. Inf. Process. 2021, 9, 77. [Google Scholar]
  7. Munirathinam, S. Industry 4.0: Industrial internet of things (IIOT). In Advances in Computers; Elsevier: Amsterdam, The Netherlands, 2020; Volume 117, pp. 129–164. [Google Scholar]
  8. Soori, M.; Arezoo, B.; Dastres, R. Internet of things for smart factories in industry 4.0, a review. Internet Things Cyber-Phys. Syst. 2023, 3, 192–204. [Google Scholar] [CrossRef]
  9. Parviznejad, P.S. The Future of Devices in Digital Businesses and Improving Productivity. In Building Smart and Sustainable Businesses with Transformative Technologies; IGI Global: Hershey, PA, USA, 2024; pp. 16–37. [Google Scholar]
  10. Wu, Y.; Dai, H.-N.; Wang, H.; Xiong, Z.; Guo, S. A survey of intelligent network slicing management for industrial IoT: Integrated approaches for smart transportation, smart energy, and smart factory. IEEE Commun. Surv. Tutor. 2022, 24, 1175–1211. [Google Scholar] [CrossRef]
  11. Demertzi, V.; Demertzis, S.; Demertzis, K. An Overview of Privacy Dimensions on the Industrial Internet of Things (IIoT). Algorithms 2023, 16, 378. [Google Scholar] [CrossRef]
  12. Hassan, W.H. Current research on Internet of Things (IoT) security: A survey. Comput. Netw. 2019, 148, 283–294. [Google Scholar]
  13. Choi, J.; Anwar, A.; Alasmary, H.; Spaulding, J.; Nyang, D.; Mohaisen, A. Iot malware ecosystem in the wild: A glimpse into analysis and exposures. In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, VA, USA, 7–9 November 2019; pp. 413–418. [Google Scholar]
  14. Al-Hadhrami, Y.; Hussain, F.K. DDoS attacks in IoT networks: A comprehensive systematic literature review. World Wide Web 2021, 24, 971–1001. [Google Scholar] [CrossRef]
  15. Alazab, M.; Tang, M. Deep Learning Applications for Cyber Security; Springer: Berlin/Heidelberg, Germany, 2019. [Google Scholar]
  16. de Assis, M.V.; Carvalho, L.F.; Rodrigues, J.J.; Lloret, J.; Proença, M.L., Jr. Near real-time security system applied to SDN environments in IoT networks using convolutional neural network. Comput. Electr. Eng. 2020, 86, 106738. [Google Scholar] [CrossRef]
  17. Wang, A.; Chang, W.; Chen, S.; Mohaisen, A. Delving into internet DDoS attacks by botnets: Characterization and analysis. IEEE/ACM Trans. Netw. 2018, 26, 2843–2855. [Google Scholar] [CrossRef]
  18. Sfar, A.R.; Natalizio, E.; Challal, Y.; Chtourou, Z. A roadmap for security challenges in the Internet of Things. Digit. Commun. Netw. 2018, 4, 118–137. [Google Scholar] [CrossRef]
  19. Ahmad, I.; Wan, Z.; Ahmad, A. A big data analytics for DDOS attack detection using optimized ensemble framework in Internet of Things. Internet Things 2023, 23, 100825. [Google Scholar] [CrossRef]
  20. Stricot-Tarboton, S.; Chaisiri, S.; Ko, R.K. Taxonomy of Man-in-the-Middle Attacks on HTTPS. In Proceedings of the 2016 IEEE Trustcom/Bigdatase/Ispa, Tianjin, China, 23–26 August 2016; pp. 527–534. [Google Scholar]
  21. Khalvati, L.; Keshtgary, M.; Rikhtegar, N. Intrusion Detection based on a Novel Hybrid Learning Approach. J. AI Data Min. 2018, 6, 157–162. [Google Scholar] [CrossRef]
  22. Lam, N.T. Detecting unauthorized network intrusion based on network traffic using behavior analysis techniques. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 46–51. [Google Scholar] [CrossRef]
  23. Ferrag, M.A.; Shu, L.; Djallel, H.; Choo, K.-K.R. Deep learning-based intrusion detection for distributed denial of service attack in agriculture 4.0. Electronics 2021, 10, 1257. [Google Scholar] [CrossRef]
  24. Qazi, E.U.H.; Almorjan, A.; Zia, T. A one-dimensional convolutional neural network (1D-CNN) based deep learning system for network intrusion detection. Appl. Sci. 2022, 12, 7986. [Google Scholar] [CrossRef]
  25. Ullah, I.; Mahmoud, Q.H. An anomaly detection model for IoT networks based on flow and flag features using a feed-forward neural network. In Proceedings of the 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2022; pp. 363–368. [Google Scholar]
  26. Shatnawi, A.S.; Yassen, Q.; Yateem, A. An android malware detection approach based on static feature analysis using machine learning algorithms. Procedia Comput. Sci. 2022, 201, 653–658. [Google Scholar] [CrossRef]
  27. Kilichev, D.; Kim, W. Hyperparameter optimization for 1D-CNN-based network intrusion detection using GA and PSO. Mathematics 2023, 11, 3724. [Google Scholar] [CrossRef]
  28. Calik Bayazit, E.; Koray Sahingoz, O.; Dogan, B. Deep learning based malware detection for android systems: A Comparative Analysis. Teh. Vjesn. 2023, 30, 787–796. [Google Scholar]
  29. Brown, A.; Gupta, M.; Abdelsalam, M. Automated machine learning for deep learning based malware detection. Comput. Secur. 2024, 137, 103582. [Google Scholar] [CrossRef]
  30. Almazroi, A.A.; Ayub, N. Deep learning hybridization for improved malware detection in smart Internet of Things. Sci. Rep. 2024, 14, 7838. [Google Scholar] [CrossRef]
  31. Tseng, S.-M.; Wang, Y.-Q.; Wang, Y.-C. Multi-Class Intrusion Detection Based on Transformer for IoT Networks Using CIC-IoT-2023 Dataset. Future Internet 2024, 16, 284. [Google Scholar] [CrossRef]
  32. Neto, E.C.P.; Dadkhah, S.; Ferreira, R.; Zohourian, A.; Lu, R.; Ghorbani, A.A. CICIoT2023: A real-time dataset and benchmark for large-scale attacks in IoT environment. Sensors 2023, 23, 5941. [Google Scholar] [CrossRef]
  33. Carrier, T.; Victor, P.; Tekeoglu, A.; Lashkari, A.H. Detecting Obfuscated Malware using Memory Feature Engineering. In Icissp; University of New Brunswick: Fredericton, NB, Canada, 2022; pp. 177–188. [Google Scholar]
  34. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp 2018, 1, 108–116. [Google Scholar]
  35. Peterson, L.E. K-nearest neighbor. Scholarpedia 2009, 4, 1883. [Google Scholar] [CrossRef]
  36. Keerthi, S.S.; Shevade, S.K.; Bhattacharyya, C.; Murthy, K.R.K. Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Comput. 2001, 13, 637–649. [Google Scholar] [CrossRef]
  37. Tasci, B.; Tasci, I. Deep feature extraction based brain image classification model using preprocessed images: PDRNet. Biomed. Signal Process. Control 2022, 78, 103948. [Google Scholar] [CrossRef]
  38. Taşcı, B. Attention Deep Feature Extraction from Brain MRIs in Explainable Mode: DGXAINet. Diagnostics 2023, 13, 859. [Google Scholar] [CrossRef]
  39. Tasci, B.; Tasci, G.; Ayyildiz, H.; Kamath, A.P.; Barua, P.D.; Tuncer, T.; Dogan, S.; Ciaccio, E.J.; Chakraborty, S.; Acharya, U.R. Automated schizophrenia detection model using blood sample scattergram images and local binary pattern. Multimed. Tools Appl. 2024, 83, 42735–42763. [Google Scholar] [CrossRef]
  40. Wang, S.-C.; Wang, S.-C. Artificial neural network. In Interdisciplinary Computing in Java Programming; Springer: Boston, MA, USA, 2003; pp. 81–100. [Google Scholar]
  41. Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, WA, USA, 4 August 2001; pp. 41–46. [Google Scholar]
  42. Safavian, S.R.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. 1991, 21, 660–674. [Google Scholar] [CrossRef]
  43. Hassini, K.; Khalis, S.; Habibi, O.; Chemmakha, M.; Lazaar, M. An end-to-end learning approach for enhancing intrusion detection in Industrial-Internet of Things. Knowl.-Based Syst. 2024, 294, 111785. [Google Scholar] [CrossRef]
  44. Neto, E.C.P.; Taslimasa, H.; Dadkhah, S.; Iqbal, S.; Xiong, P.; Rahman, T.; Ghorbani, A.A. CICIoV2024: Advancing realistic IDS approaches against DoS and spoofing attack in IoV CAN bus. Internet Things 2024, 26, 101209. [Google Scholar] [CrossRef]
  45. Canavese, D.; Mannella, L.; Regano, L.; Basile, C. Security at the Edge for Resource-Limited IoT Devices. Sensors 2024, 24, 590. [Google Scholar] [CrossRef]
  46. Maniriho, P.; Mahmood, A.N.; Chowdhury, M.J.M. MeMalDet: A memory analysis-based malware detection framework using deep autoencoders and stacked ensemble under temporal evaluations. Comput. Secur. 2024, 142, 103864. [Google Scholar] [CrossRef]
  47. Khalid, S.; Hussain, F.B. VolMemDroid—Investigating android malware insights with volatile memory artifacts. Expert Syst. Appl. 2024, 253, 124347. [Google Scholar] [CrossRef]
  48. Namakshenas, D.; Yazdinejad, A.; Dehghantanha, A.; Srivastava, G. Federated quantum-based privacy-preserving threat detection model for consumer internet of things. IEEE Trans. Consum. Electron. 2024; in press. [Google Scholar] [CrossRef]
Figure 1. Malware families count by category.
Figure 2. Graphical representation of the proposed CNN.
Figure 3. Accuracy and loss curves for the proposed CNN model on the CIC IoT 2023 dataset.
Figure 4. Accuracy and loss curves for the proposed CNN model on the CIC-MalMem-2022 dataset.
Figure 5. Accuracy and loss curves for the proposed CNN model on the CIC-IDS2017 dataset.
Figure 6. Confusion matrix for the CIC IoT 2023 dataset.
Figure 7. Confusion matrix for the CIC-MalMem-2022 dataset.
Figure 8. Confusion matrix for the CIC-IDS2017 dataset.
Figure 9. Comparison of machine-learning methods.
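The training curves in Figures 3–5 and the confusion matrices in Figures 6–8 can be reproduced with standard tooling. The sketch below is illustrative only: it assumes a Keras-style training History object and scikit-learn/matplotlib plotting utilities, and is not the author's original plotting code.

```python
# Minimal sketch for producing accuracy/loss curves and a confusion matrix,
# assuming a Keras-style `history` object; names here are illustrative.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

def plot_training_curves(history):
    # Left panel: training vs. validation accuracy; right panel: loss.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(history.history["accuracy"], label="train")
    ax1.plot(history.history["val_accuracy"], label="validation")
    ax1.set_title("Accuracy"); ax1.set_xlabel("Epoch"); ax1.legend()
    ax2.plot(history.history["loss"], label="train")
    ax2.plot(history.history["val_loss"], label="validation")
    ax2.set_title("Loss"); ax2.set_xlabel("Epoch"); ax2.legend()
    plt.tight_layout()
    plt.show()

def plot_confusion_matrix(y_true, y_pred, class_names):
    # Draws a confusion matrix like those in Figures 6-8.
    ConfusionMatrixDisplay.from_predictions(
        y_true, y_pred, display_labels=class_names, xticks_rotation=90
    )
    plt.show()
```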
Table 1. The number of training, test, and validation samples for different attack types in the CIC IoT 2023 dataset.
No | Class Name | Train | Test | Validation
1 | Backdoor_Malware | 2253 | 483 | 482
2 | BenignTraffic | 768,737 | 164,729 | 164,729
3 | BrowserHijacking | 4101 | 879 | 879
4 | CommandInjection | 3786 | 811 | 812
5 | DDoS-ACK_Fragmentation | 199,573 | 42,766 | 42,765
6 | DDoS-HTTP_Flood | 20,153 | 4319 | 4318
7 | DDoS-ICMP_Flood | 5,040,353 | 1,080,076 | 1,080,075
8 | DDoS-ICMP_Fragmentation | 316,742 | 67,873 | 67,874
9 | DDoS-PSHACK_Flood | 2,866,329 | 614,213 | 614,213
10 | DDoS-RSTFINFlood | 2,831,700 | 606,793 | 606,792
11 | DDoS-SYN_Flood | 2,841,433 | 608,879 | 608,878
12 | DDoS-SlowLoris | 16,398 | 3514 | 3514
13 | DDoS-SynonymousIP_Flood | 2,518,697 | 539,721 | 539,720
14 | DDoS-TCP_Flood | 3,148,367 | 674,650 | 674,650
15 | DDoS-UDP_Flood | 3,788,601 | 811,843 | 811,843
16 | DDoS-UDP_Fragmentation | 200,848 | 43,039 | 43,038
17 | DNS_Spoofing | 125,238 | 26,837 | 26,836
18 | DictionaryBruteForce | 9145 | 1960 | 1959
19 | DoS-HTTP_Flood | 50,305 | 10,780 | 10,779
20 | DoS-SYN_Flood | 1,420,184 | 304,325 | 304,325
21 | DoS-TCP_Flood | 1,870,011 | 400,717 | 400,717
22 | DoS-UDP_Flood | 2,323,017 | 497,789 | 497,789
23 | MITM-ArpSpoofing | 215,315 | 46,139 | 46,139
24 | Mirai-greeth_flood | 694,306 | 148,780 | 148,780
25 | Mirai-greip_flood | 526,177 | 112,752 | 112,753
26 | Mirai-udpplain | 623,403 | 133,586 | 133,587
27 | Recon-HostDiscovery | 94,065 | 20,157 | 20,156
28 | Recon-OSScan | 68,781 | 14,739 | 14,739
29 | Recon-PingSweep | 1583 | 339 | 340
30 | Recon-PortScan | 57,599 | 12,343 | 12,342
31 | SqlInjection | 3671 | 787 | 787
32 | Uploading_Attack | 876 | 188 | 188
33 | VulnerabilityScan | 26,167 | 5607 | 5608
34 | XSS | 2692 | 577 | 577
Table 2. The number of training, test, and validation samples for different attack types in the CIC-IDS2017 dataset.
No | Class Name | Train | Test | Validation
1 | BENIGN | 1,591,168 | 340,965 | 340,964
2 | DDoS | 89,619 | 19,204 | 19,204
3 | PortScan | 111,251 | 23,840 | 23,839
4 | Bot | 1376 | 295 | 295
5 | Infiltration | 25 | 5 | 6
6 | Web Attack Brute Force | 1055 | 226 | 226
7 | Web Attack Sql Injection | 15 | 3 | 3
8 | Web Attack XSS | 456 | 98 | 98
9 | FTP-Patator | 5557 | 1191 | 1190
10 | SSH-Patator | 4128 | 885 | 884
11 | DoS GoldenEye | 7205 | 1544 | 1544
12 | DoS Hulk | 161,751 | 34,661 | 34,661
13 | DoS Slowhttptest | 3849 | 825 | 825
14 | DoS slowloris | 4057 | 869 | 870
15 | Heartbleed | 8 | 2 | 1
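Tables 1 and 2 reflect an approximately 70/15/15 stratified split of each dataset. The snippet below is a minimal sketch of how such a split can be produced with scikit-learn; the file name, label column, and random seed are illustrative assumptions, not details taken from the paper's code.

```python
# Minimal sketch of a stratified ~70/15/15 train/test/validation split.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("cic_iot_2023_merged.csv")        # hypothetical merged CSV
X, y = df.drop(columns=["label"]), df["label"]     # "label" column assumed

# First split off 70% for training, stratified by class.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.70, stratify=y, random_state=42
)
# Split the remaining 30% in half: 15% test, 15% validation.
X_test, X_val, y_test, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42
)

print(len(X_train), len(X_test), len(X_val))
```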
Table 3. Summary of results (%) obtained for different classes in the CIC IoT 2023 dataset.
Class No | Class Name | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
1 | Backdoor_Malware | 98.36 | 100.00 | 4.56 | 8.73
2 | BenignTraffic | – | – | 95.85 | 86.17
3 | BrowserHijacking | – | 98.51 | 7.51 | 13.95
4 | CommandInjection | – | – | 3.94 | 7.58
5 | DDoS-ACK_Fragmentation | – | 99.83 | 97.96 | 98.89
6 | DDoS-HTTP_Flood | – | – | 68.69 | 72.93
7 | DDoS-ICMP_Flood | – | 99.95 | 99.91 | 99.93
8 | DDoS-ICMP_Fragmentation | – | – | 98.01 | 98.65
9 | DDoS-PSHACK_Flood | – | 99.96 | 99.92 | 99.94
10 | DDoS-RSTFINFlood | – | – | 99.91 | 99.95
11 | DDoS-SYN_Flood | – | 99.60 | 99.78 | 99.69
12 | DDoS-SlowLoris | – | – | 78.09 | 66.89
13 | DDoS-SynonymousIP_Flood | – | 99.85 | 99.79 | 99.82
14 | DDoS-TCP_Flood | – | – | 99.82 | 99.81
15 | DDoS-UDP_Flood | – | 99.89 | 99.75 | 99.82
16 | DDoS-UDP_Fragmentation | – | – | 97.66 | 98.74
17 | DNS_Spoofing | – | 48.41 | 26.71 | 34.43
18 | DictionaryBruteForce | – | – | 12.10 | 19.88
19 | DoS-HTTP_Flood | – | 93.30 | 73.77 | 82.40
20 | DoS-SYN_Flood | – | – | 99.61 | 98.68
21 | DoS-TCP_Flood | – | 99.72 | 99.61 | 99.66
22 | DoS-UDP_Flood | – | – | 99.69 | 99.48
23 | MITM-ArpSpoofing | – | 78.18 | 58.59 | 66.98
24 | Mirai-greeth_flood | – | – | 95.88 | 95.41
25 | Mirai-greip_flood | – | 94.53 | 92.60 | 93.56
26 | Mirai-udpplain | – | – | 99.35 | 99.62
27 | Recon-HostDiscovery | – | 60.09 | 72.03 | 65.52
28 | Recon-OSScan | – | – | 15.88 | 22.87
29 | Recon-PingSweep | – | 100.00 | 4.12 | 7.91
30 | Recon-PortScan | – | – | 30.57 | 34.42
31 | SqlInjection | – | 100.00 | 3.30 | 6.40
32 | Uploading_Attack | – | – | 35.64 | 52.55
33 | VulnerabilityScan | – | 85.70 | 74.93 | 79.95
34 | XSS | – | – | 36.92 | 53.92
Table 4. Summary of results (%) obtained for different classes in the CIC-MalMem-2022 dataset.
Class No | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
1 | 99.97 | 99.95 | 99.98 | 99.97
2 | – | 99.97 | 99.95 | 99.97
Table 5. Summary of results (%) obtained for different classes in the CIC-IDS2017 dataset.
Class No | Class Name | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
1 | BENIGN | 96.55 | 99.60 | 96.32 | 97.93
2 | DDoS | – | 99.90 | 97.16 | 98.51
3 | PortScan | – | 80.15 | 99.75 | 88.88
4 | Bot | – | 69.66 | 34.24 | 45.91
5 | Infiltration | – | 100.00 | 83.33 | 90.91
6 | Web Attack Brute Force | – | 100.00 | 98.67 | 99.33
7 | Web Attack Sql Injection | – | 100.00 | 100.00 | 100.00
8 | Web Attack XSS | – | 100.00 | 92.86 | 96.30
9 | FTP-Patator | – | 100.00 | 51.09 | 67.63
10 | SSH-Patator | – | 100.00 | 49.55 | 66.26
11 | DoS GoldenEye | – | 97.99 | 91.77 | 94.78
12 | DoS Hulk | – | 83.86 | 99.93 | 91.19
13 | DoS Slowhttptest | – | 57.63 | 97.94 | 72.56
14 | DoS slowloris | – | 89.95 | 91.61 | 90.77
15 | Heartbleed | – | 100.00 | 100.00 | 100.00
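The per-class figures in Tables 3–5 follow the usual definitions, with F1 being the harmonic mean of precision and recall (F1 = 2PR/(P + R)). The sketch below shows one way such values can be computed with scikit-learn; the variable names, dummy labels, and class names are illustrative assumptions rather than the paper's evaluation code.

```python
# Minimal sketch for per-class precision, recall, and F1 (cf. Tables 3-5).
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def per_class_report(y_true, y_pred, class_names):
    # average=None (default here) returns one value per class label.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=range(len(class_names)), zero_division=0
    )
    print(f"Overall accuracy: {accuracy_score(y_true, y_pred):.4f}")
    for name, p, r, f in zip(class_names, precision, recall, f1):
        # f equals the harmonic mean 2*p*r / (p + r) when p + r > 0.
        print(f"{name:25s} P={p:.4f} R={r:.4f} F1={f:.4f}")

# Example with dummy labels for a hypothetical 3-class problem:
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
per_class_report(y_true, y_pred, ["BenignTraffic", "DDoS", "Recon"])
```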
Table 6. Comparison results.
Study | Year | Method(s) | Classifier | Dataset | Class Number | Results (%)
Hassini et al. [43] | 2024 | End-to-End CNN1D | Softmax | Edge-IIoTset | 15 | Accuracy: 99.96, Precision: 100, Recall: 99, F1-score: 99
Neto et al. [44] | 2024 | Decimal-to-binary data conversion | Logistic Regression, AdaBoost, Deep Neural Network, Random Forest | CICIoV2024 | 6 | Logistic Regression: Accuracy: 95, Precision: 74, Recall: 68, F1-score: 63; AdaBoost: Accuracy: 87, Precision: 14, Recall: 17, F1-score: 15; Deep Neural Network: Accuracy: 95, Precision: 74, Recall: 68, F1-score: 63; Random Forest: Accuracy: 95, Precision: 60, Recall: 68, F1-score: 62
Canavese et al. [45] | 2024 | IoT Proxy, Random Forest | Random Forest | CIC IoT 2023 | 15 | Coarse-grained: Accuracy: 95.73, Precision: 28.47, Recall: 69.56, F1-score: 35.80; Fine-grained: Accuracy: 96.07, Precision: 28.75, Recall: 60.38, F1-score: 33.34
Maniriho et al. [46] | 2024 | Deep Autoencoders, Stacked Ensemble | Various | MemMal-D2024 | 2 | Accuracy: 98.82, Precision: 99.20, Recall: 98.72, F1-score: 98.72
Khalid et al. [47] | 2024 | Memory-based features using Volatility | RF | CICMalDroid2020 and CIC-AndMal2017 | 5 | Precision: 97.00, Recall: 97.1, F1-score: 97.0
Namakshenas et al. [48] | 2024 | Federated Learning (FL), Quantum Computing, Additive Homomorphic Encryption (AHE) | Various | N-BaIoT, Edge-IIoTset | 10, 14 | N-BaIoT: Accuracy: 94.93; Edge-IIoTset: Accuracy: 91.93
Talukder et al. [48] | 2024 | Random Oversampling (RO), Stacking Feature Embedding, Principal Component Analysis (PCA) | RF, ET, DT, XGB | UNSW-NB15, CIC-IDS2017, CIC-IDS2018 | 9, 15, 15 | UNSW-NB15: RF: Accuracy: 99.59, ET: Accuracy: 99.95; CIC-IDS2017: DT, RF, ET: Accuracy: 99.99; CIC-IDS2018: DT, RF:
Proposed Model | 2024 | CNN | Softmax | CIC IoT 2023, CIC-MalMem-2022, CIC-IDS2017 | 34, 2, 15 | CIC IoT 2023: Accuracy: 98.36, Precision: 100, Recall: 99.96, F1-score: 99.95; CIC-MalMem-2022: Accuracy: 99.90, Precision: 99.98, Recall: 99.97, F1-score: 99.96; CIC-IDS2017: Accuracy: 99.99, Precision: 99.99, Recall: 99.98, F1-score: 99.98
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
