1. Introduction
Since its introduction by Kevin Ashton during a presentation at MIT in the late 1990s, the Internet of Things (IoT) has profoundly transformed the use and perception of technology worldwide [1]. IoT functions today as an extensive network infrastructure that enables objects, ranging from household appliances to industrial machinery, to interact with each other and broader systems via the internet [2,3,4]. By facilitating communication and data exchange between devices, IoT automates processes, thereby enhancing efficiency and speed and enriching people's lives. This system is utilized in a wide array of applications, from smart home devices to factory machinery. In summary, IoT enables devices to communicate with each other, making daily life smoother and more efficient [5,6]. The Industrial Internet of Things (IIoT), in turn, is a specialized adaptation of IoT tailored for the industrial sector [7]. IIoT has the potential to maximize operational efficiency by making industrial processes, from factory automation to supply chain management, smarter and more interconnected [8].
However, the benefits of these technologies come with significant security challenges. By 2030, more than 29 billion IoT devices are expected to be interconnected globally [9]. This network spans a broad spectrum of applications, from individual health monitoring to intelligent transportation systems, energy management, and environmental monitoring [10]. Smart city applications optimize traffic management while minimizing water and energy usage, and smart homes enhance user comfort while reducing energy consumption. In the industrial sector, IIoT technologies make production lines more efficient, improve maintenance processes, and reduce operational costs. While IoT and IIoT systems enable these transformations, they also introduce serious security challenges [11]. Because IoT devices have limited processing capabilities and minimal security measures, they are highly vulnerable to attacks [12]. In recent years, attacks such as DDoS, MITM, and malware on IoT networks have become increasingly widespread [13,14]. Deep learning has emerged as a powerful solution for IoT security, owing to its capacity to learn complex attack patterns and process large datasets [15,16]. For example, botnets controlled by cyber attackers can execute widespread DDoS attacks, causing extensive internet outages and crippling critical infrastructure systems [17]. Securing the IoT ecosystem is therefore not just a technological necessity but a social and economic imperative. Security researchers, industry leaders, and policymakers are continually developing new strategies and solutions to enhance IoT system security, encouraging research in this area and establishing international collaborations [18]. Through extensive simulations and experiments, this study analyzes the attack vectors targeting IoT devices and networks and proposes effective methods to mitigate these threats [19].
IoT environments are vulnerable to a variety of cyber-attacks that often pose serious security risks. The primary attacks that IoT systems face include DoS/DDoS attacks, information-gathering attacks, Man-in-the-Middle (MITM) attacks, injection attacks, and malware attacks [20]. DoS/DDoS attacks, which cause service interruptions by overloading network resources, are critically severe. Information-gathering attacks aim to obtain sensitive information about the system and are highly severe. MITM attacks, known for compromising the confidentiality and integrity of communication, are high-severity attacks. Injection attacks create significant security vulnerabilities by injecting malicious code into systems and are critically severe. Malware attacks, which adversely affect the performance and security of systems, are considered medium- to high-severity threats. The impacts of these attacks underscore the challenges in securing the IoT ecosystem and highlight the need for continuous improvement in security measures. The detection of sophisticated threats such as zero-day attacks and malware has become increasingly difficult using traditional methods. This study aims to enhance IoT security by leveraging the power of deep-learning techniques to process large datasets and identify complex attack patterns. The research focuses on providing effective security solutions, particularly for real-time applications and resource-constrained environments, and the proposed method seeks to offer superior performance compared to existing security approaches.
The findings of this study provide guidance for strengthening IoT security infrastructure and leveraging the opportunities offered by this technology safely. This research provides the necessary knowledge and tools to realize IoT’s potential securely, contributing to the sustainable and secure expansion of IoT. The study’s results offer significant theoretical contributions and guide the practical application of IoT security solutions, inspiring further research in this dynamic technological field.
Specifically, the study aims to achieve the following objectives:
- Evaluate the performance of 1D CNN models in classifying IoT network traffic as malicious or benign;
- Identify various types of attacks in the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets and analyze these datasets using deep-learning models;
- Provide advanced insights into the use of deep-learning techniques in IoT security research and contribute to other studies in this field.
This study aims to demonstrate how deep-learning techniques can be utilized to develop forward-looking solutions for IoT security.
This paper is organized as follows: Section 2 presents the datasets used in this study; Section 3 introduces the proposed CNN model; Section 4 depicts the experimental results; Section 5 discusses the findings; Section 6 outlines future work; Section 7 addresses the limitations; and Section 8 presents the conclusions of this research.
1.1. Related Works
This section reviews the existing literature related to the proposed research. Specifically, studies that employ deep-learning- and machine-learning-based methods in the fields of IoT security and malware detection will be examined. The datasets used, the performance metrics of the methods, and the challenges encountered will be discussed in detail. Additionally, the gaps in the current literature, and how this research aims to address them, will be discussed. The focus will be on approaches that stand out for their successes and limitations, emphasizing this study's contributions to the literature.
Rikhtegar et al. [21] developed a model using SVM and Bayesian methods to detect and classify IoT attacks, utilizing the KDD-CUP 99 dataset for evaluation. After feature reduction in the dataset, the model achieved 91.50% accuracy in multi-class classification. Lam et al. [22] conducted a study to detect Bot, DoS, and HTTP attacks using the CSE-CIC-IDS2018 dataset. They tested random forest, multilayer perceptron, and one-dimensional convolutional neural network (Conv1D) models. The convolutional layers used 32, 39, and 64 filters, with a filter size of 5 and a batch size of 32, training the model for 50 iterations. The neural network architecture included Conv1D(32, 5), Conv1D(64, 5), MaxPool(2), Conv1D(39, 5), MaxPool(2), and two fully connected (FC) layers. This configuration resulted in accuracy, precision, recall, and F1-scores of 99.98%. Ferrag et al. [23] developed a deep-learning-based attack detection system to identify DDoS attacks. This system was built on three different models: convolutional neural networks (CNN), artificial neural networks (ANN), and recurrent neural networks (RNN). The performance of each model was evaluated on two new real traffic datasets, CIC-DDoS2019 and TON_IoT, for both binary and multi-class classification. The multi-class classification accuracy for the first dataset was 95.90%, while the binary classification achieved 99.95%. For the second dataset, the multi-class classification accuracy was 98.94%. Qazi et al. [24] developed an attack detection system using a one-dimensional convolutional neural network (1D CNN). This study trained and tested the model on the CSE-CIC-IDS2017 dataset to classify DoS Hulk, DoS GoldenEye, DDoS, Portscan, and benign traffic. The model's training accuracy was 99.32% and the test accuracy was 98.96%. Additionally, the model's precision was 98.70%, the recall was 99.20%, and the F1-score was 98.94%. Ullah et al. [25] developed an anomaly detection system for IoT networks using a feedforward neural network based on flow and control flag features. This model was evaluated on various IoT-focused datasets for both binary and multi-class classification. The datasets used included BoT-IoT, IoT network attack, MQTT-IoT-IDS2020, MQTTset, IoT-23, and IoT-DS2. The study results showed very high accuracy rates of 99.97% for multi-class classification and 99.99% for binary classification. Shatnawi et al. [26] proposed a static malware detection method based on permissions and API calls from Android applications. They employed three machine-learning algorithms (SVM, KNN, and Naive Bayes), evaluated on the CICInvesAndMal2019 Android malware dataset. This approach aims to offer a reliable and effective solution for malware detection. Kilichev et al. [27] enhanced a one-dimensional convolutional neural network (1D CNN) using genetic algorithms (GA) and particle swarm optimization (PSO) to improve performance. On the CSE-CIC-IDS2017 dataset, the model achieved a test accuracy of 99.71%, precision of 100%, and recall and F1-scores of 99% with GA optimization. With PSO, the results were slightly higher, with a test accuracy of 99.74%, precision of 100%, recall of 99%, and an F1-score of 100%. Both optimization methods demonstrated excellent performance. Bayazit et al. [28] developed a malware detection system using RNN-based algorithms, including LSTM, BiLSTM, and GRU, and evaluated them on the CICInvesAndMal2019 dataset with 8115 static features. The experimental results showed that the BiLSTM model achieved the highest accuracy at 98.85%, highlighting its superior effectiveness in malware detection. Brown et al. [29] found that malware detection systems developed using AutoML can perform as well as or better than manually designed models. In experiments with the SOREL-20M and EMBER-2018 datasets, the DARTS AutoML model achieved 98.61% accuracy, 98.52% precision, and 98.88% recall. However, the high computational cost and processing time were significant drawbacks, particularly for large datasets. Despite these challenges, AutoML shows promise for malware detection, though further improvements are needed in cost and time efficiency. Almazroi and Ayub [30] presented a BERT-based Feedforward Neural Network (BEFSONet) for IoT environments, evaluated on eight IoT malware datasets. Optimized using the Spotted Hyena Optimizer (SHO), the model showed strong adaptability to various malware structures. While promising as a defense mechanism for IoT security, the complexity and high computational demands of SHO may limit its use in resource-constrained devices, suggesting the need for more efficient optimization algorithms. Tseng et al. [31] used the CIC-IoT-2023 dataset to develop deep-learning models for IoT intrusion detection, achieving 99.40% accuracy in multi-class classification with a Transformer model, outperforming prior studies. However, its higher computational cost and lower binary classification performance are notable limitations.
1.2. Motivation and Proposed Model
The primary goal is to contribute to the field of IoT security by presenting a new deep-learning model. Therefore, a simple but highly effective 1D convolutional neural network (1D CNN) model was developed. This work pursues two main objectives: achieving superior classification performance and reducing the model's computational complexity and number of trainable parameters. To develop the proposed model, a comprehensive analysis of existing CNN and deep-learning techniques was conducted. The model features a lightweight and efficient architecture optimized for processing IoT data. Its main components are as follows: an input layer (sequence input) receives data sequences from IoT devices; 1D convolutional layers (Conv1D) learn spatial relationships in the data and extract various features; layer normalization and batch normalization layers stabilize the learning process and enhance the overall performance of the model; the GELU activation function increases model accuracy by learning non-linear relationships; self-attention layers focus on important features of the data, thereby improving overall accuracy; fully connected layers perform the classification task; dropout layers prevent overfitting and enhance the model's generalization capability; and global max pooling and softmax layers conduct the final classification. A graphical representation of the model is presented in Figure 1. This model offers an innovative approach to IoT security by providing high accuracy, speed, and resource efficiency. The results of this study serve as a guide for strengthening IoT security infrastructure and leveraging the opportunities offered by this technology in a secure manner.
1.3. Novelties and Contributions
The novelties of this research are as follows:
- An optimized 1D CNN model with low computational load was developed to classify IoT data with high accuracy;
- One-dimensional convolutional layers that learn spatial relationships in the data, together with layer normalization and batch normalization techniques that enhance the model's performance, were utilized;
- The GELU activation function was employed to improve the ability to learn non-linear relationships;
- Self-attention layers were added to enhance overall accuracy by emphasizing key features of the data;
- The model's effectiveness was validated by testing it on comprehensive datasets such as CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017.
The key contributions of this research are as follows:
- The study employs a new and realistic IoT attack dataset built on a comprehensive topology of various real IoT devices, including 33 attacks in which malicious IoT devices target other IoT devices;
- The performance of deep-learning models such as the 1D CNN is evaluated on this new dataset, demonstrating the effectiveness of these models in classifying IoT network traffic as malicious or benign;
- The research provides advanced knowledge on how deep-learning techniques can be applied to IoT security, making significant contributions to other studies in this field;
- Various types of attacks in the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets are detailed and analyzed using deep-learning models.
2. Datasets
The datasets used in this study were selected to evaluate the effectiveness of deep-learning models in IoT security and cyber-attack detection: CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017.
2.1. CIC IoT 2023 Dataset
The CIC IoT 2023 dataset was used in this study to evaluate the effectiveness of deep-learning models in IoT security [32]. Developed by the Canadian Institute for Cybersecurity (CIC), this dataset provides a comprehensive benchmark for IoT attacks, simulating real-world scenarios with a network of 105 IoT devices. It features 33 different types of attacks, categorized into seven groups: DDoS, DoS, Reconnaissance, Web-based, Brute Force, Spoofing, and Mirai, where IoT devices are used as both attackers and targets. This dataset is invaluable for developing and testing IoT security solutions, as it provides realistic data for security analytics in large-scale IoT environments. The CIC IoT 2023 dataset includes key features such as flow duration, protocol type, flag counts (FIN, SYN, RST, etc.), and traffic rates, alongside protocols like HTTP, TCP, UDP, and ICMP. Metrics like total sum, minimum, maximum, average, and standard deviation, along with inter-arrival times and other characteristics, allow for detailed analysis of IoT network traffic. This makes it a significant resource for researchers aiming to classify and detect malicious network activities using machine- and deep-learning algorithms (see Table 1).
The Internet of Things (IoT) ecosystem comprises a diverse range of components. On the network side, key devices include the Asus RT-N12 router, Cisco Catalyst 3850 24 switch, and Netgear Unmanaged Switch GS308, all monitored via the Gigamon G-TAP A-TX network tap. Controllers such as the Vera Plus, Aeotec Zigbee/Z-Wave Smart Hub, and SmartThings Hub facilitate the integration and management of smart devices. The sensor suite includes the Aeotec Water Leak Sensor, Multisensor 6, Motion Sensor, and Button, alongside security devices like the Aeotec Siren, Doorbell 6, and Door/Window Sensor 7 Pro. Cameras, including models from Arlo, Dlink, Amcrest, Google Nest, and Netatmo, provide high-resolution monitoring for both indoor and outdoor security. Smart home devices further enrich the ecosystem, with products like the Philips Hue Bridge and Bulbs, Arlo Base Station, iRobot Roomba i3+, and Amazon Echo Studio. The Raspberry Pi 4 Model B, used to simulate potential IoT attacks, underscores the role of these devices in IoT security research. This extensive setup offers a comprehensive framework for examining the interplay of IoT components, ranging from network infrastructure and sensors to smart devices and attack simulations, enhancing research on IoT security vulnerabilities and solutions.
2.2. CIC-MalMem-2022 Dataset
The CIC-MalMem-2022 dataset is designed to test memory-based detection methods for concealed malware [33]. Created from malware commonly found in the wild, it aims to closely represent real-world scenarios. The dataset covers spyware, ransomware, and Trojan horses, with examples from various malware families within each category, and it incorporates concealment techniques that make detection challenging. During the memory dump process, debug mode is used so that the dump process itself remains invisible in the memory dumps, offering a more accurate representation of the programs an average user might run during a malware attack. The dataset is balanced, with 50% malicious and 50% benign memory dumps: of 58,596 records in total, 29,298 are benign and 29,298 are malicious. This balance allows for an objective evaluation of detection algorithms' accuracy. Representing real-world malware and concealment techniques, CIC-MalMem-2022 is a crucial resource for evaluating the effectiveness of malware detection methods and aids security analysts and researchers in developing more effective detection approaches.
In this dataset, the BENIGN class represents benign memory dumps, while the malware classes include the various types of malware noted above: spyware, ransomware, and Trojan horses.
2.3. CIC-IDS2017 Dataset
The CIC-IDS2017 dataset, created by the Canadian Institute for Cybersecurity in 2017, is a comprehensive and up-to-date dataset for cybersecurity research [34]. This dataset captures real-world network traffic and common cyberattacks, making it a valuable resource for assessing the performance of security tools like Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). Spanning five days, the traffic includes only normal activity on Monday, with the remaining days featuring attacks such as Brute Force FTP/SSH, DoS, Heartbleed, Web Attacks, Infiltration, Botnet, and DDoS. These events occur during morning and afternoon sessions. The CIC-IDS2017 dataset provides labeled traffic data, including timestamps, IP addresses, ports, protocols, and attack types, in CSV format. Using the B-Profile system, it simulates background traffic from 25 users, involving protocols like HTTP, HTTPS, FTP, SSH, and email. With over 80 network flow features and a comprehensive topology, CIC-IDS2017 is an essential resource for evaluating IDS/IPS systems and advancing network security research (see Table 2).
The CIC-IDS2017 dataset includes the BENIGN class, representing normal and benign network traffic, as well as various types of attacks. It encompasses multiple attack types, including DDoS attacks, PortScan, Botnet, Infiltration, web-based attacks (Brute Force, SQL Injection, and XSS), FTP brute force attacks (FTP-Patator), and SSH brute force attacks (SSH-Patator). DoS attacks are subdivided into categories such as GoldenEye, Hulk, Slowhttptest, and slowloris. Additionally, attacks exploiting the Heartbleed vulnerability are included in this dataset. This dataset serves as an ideal resource for evaluating and developing IDS and IPS systems, providing a robust and reliable reference point for network traffic analyses and attack detection.
In processing the datasets, Z-normalization is employed to mitigate issues arising from the different scales of the data. This technique involves transforming the data such that each feature has a mean of 0 and a standard deviation of 1. Specifically, Z-normalization is performed by subtracting the mean of each column from each data point and dividing by the standard deviation of that column. This transformation is crucial as it standardizes the features, allowing the model to evaluate different features on the same scale.
The equation for Z-normalization is as follows:
$$z = \frac{x - \mu}{\sigma}$$
where $x$ is the data point, $\mu$ is the column mean, and $\sigma$ is the column standard deviation.
By standardizing the data, Z-normalization enhances the model’s ability to generalize across diverse datasets, as it ensures that all features contribute equally to the learning process. This preprocessing step is fundamental in maintaining the robustness and adaptability of the model when applied to new and varied datasets.
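As a concrete illustration, a minimal NumPy sketch of this column-wise Z-normalization could look as follows (the function name and the zero-variance guard are our own additions, not part of the original pipeline):

```python
import numpy as np

def z_normalize(X: np.ndarray) -> np.ndarray:
    """Column-wise Z-normalization: each feature gets zero mean and unit std."""
    mu = X.mean(axis=0)                        # per-column mean
    sigma = X.std(axis=0)                      # per-column standard deviation
    sigma = np.where(sigma == 0, 1.0, sigma)   # guard against constant columns
    return (X - mu) / sigma

# Example: normalize a small feature matrix of network-flow statistics.
X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
print(z_normalize(X))  # each column now has mean 0 and standard deviation 1
```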
3. The Proposed CNN
In this paper, a novel 1D convolutional neural network (CNN) architecture tailored for efficient classification tasks is presented. The architecture is proposed as a foundation for a lightweight model, ensuring its suitability for real-time and resource-constrained applications. Additionally, the architecture is scalable to handle large-scale datasets, maintaining its effectiveness and robustness. The proposed 1D CNN comprises 75 layers (see Figure 2), organized into several key blocks: the input block, convolutional blocks, self-attention blocks, and the output block.
The improvements in this model are achieved through the following structural components (a short code sketch of the GELU and self-attention computations follows this list):
- Convolutional Layers: These layers learn spatial relationships in the data and extract various features. By stacking multiple convolutional layers, the model can capture complex patterns in the IoT data;
- GELU Activation Function: The Gaussian Error Linear Unit (GELU) activation function is used to enhance the model's ability to learn non-linear relationships. This activation function is defined as $\mathrm{GELU}(x) = x\,\Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard normal distribution;
- Self-Attention Mechanism: Inspired by the superior performance of self-attention mechanisms, multiple self-attention layers are incorporated into the architecture. The self-attention mechanism helps the model focus on important features of the data, improving overall accuracy. The self-attention is computed as $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$, where $Q$ is the query matrix, $K$ is the key matrix, $V$ is the value matrix, and $d_k$ is the dimension of the key;
- Layer Normalization and Batch Normalization: These normalization techniques stabilize the learning process and enhance the overall performance of the model. Layer normalization is applied after each block, ensuring that the inputs to each layer have a mean of zero and a variance of one;
- Dropout Layers: Dropout is used to prevent overfitting and enhance the model's generalization capability. During training, dropout randomly deactivates a portion of input units, which helps in regularizing the model;
- Global Max Pooling: This layer reduces the spatial dimensions of the input, retaining the most important features and reducing computational load.
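To make the two formulas above concrete, here is a minimal NumPy sketch of the GELU activation and scaled dot-product self-attention; it is an illustrative reimplementation of the standard definitions, not the paper's MATLAB code:

```python
import numpy as np
from scipy.stats import norm

def gelu(x: np.ndarray) -> np.ndarray:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return x * norm.cdf(x)

def self_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq_q, seq_k) similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

# Self-attention uses the same sequence for queries, keys, and values.
X = np.random.randn(10, 96)   # 10 time steps, 96 channels
print(gelu(X).shape, self_attention(X, X, X).shape)
```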
The underlying theory behind the proposed model is based on the effectiveness of convolutional neural networks in capturing spatial hierarchies in data, combined with the power of self-attention mechanisms to focus on relevant features. The model’s structure is designed to achieve a balance between computational efficiency and high classification performance, making it suitable for a wide range of IoT applications. The proposed model begins by taking a sequence input, which is then processed through a 1D convolutional layer with 96 filters, a kernel size of 3, and a stride of 1, maintaining the same padding to preserve the input dimensions. Following this, layer normalization is applied to stabilize and accelerate the training process. A GELU activation function is used to introduce non-linearity, allowing the model to capture complex patterns. Next, a self-attention mechanism is incorporated, utilizing 3 attention heads and 96 channels, enabling the model to focus on the most relevant features in the input data. After this, the data are passed through a fully connected layer with 96 units to further learn higher-level representations. A dropout layer with a rate of 0.5 is then applied to prevent overfitting by randomly deactivating neurons during training. A residual connection, formed by adding two intermediate outputs (I1 and I2), helps to retain information and improve gradient flow through the network. The model further applies batch normalization to standardize the output from the addition layer, ensuring stable learning. Dimensionality reduction is performed using a global max pooling layer, which selects the most important features from the data. Finally, a softmax activation function is used to output the classification probabilities, and the result is delivered as the final model output.
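The forward pass just described can be approximated with a compact PyTorch sketch. This is our own hedged reconstruction of a single convolution/attention block (the original is a 75-layer MATLAB network, so layer counts and minor details differ), and the input sizes in the usage example are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IoTAttentionCNN(nn.Module):
    """Sketch of the described pipeline: Conv1D(96, k=3, s=1, same padding)
    -> LayerNorm -> GELU -> 3-head self-attention -> FC(96) -> Dropout(0.5)
    -> residual add (I1 + I2) -> BatchNorm -> global max pool -> softmax."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, 96, kernel_size=3, stride=1,
                              padding="same")
        self.norm = nn.LayerNorm(96)
        self.attn = nn.MultiheadAttention(embed_dim=96, num_heads=3,
                                          batch_first=True)
        self.fc = nn.Linear(96, 96)
        self.drop = nn.Dropout(0.5)
        self.bn = nn.BatchNorm1d(96)
        self.head = nn.Linear(96, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, sequence_length)
        x = self.conv(x).transpose(1, 2)      # -> (batch, seq_len, 96)
        i1 = F.gelu(self.norm(x))             # intermediate output I1
        attn_out, _ = self.attn(i1, i1, i1)   # self-attention over the sequence
        i2 = self.drop(self.fc(attn_out))     # intermediate output I2
        x = (i1 + i2).transpose(1, 2)         # residual add -> (batch, 96, seq)
        x = self.bn(x).amax(dim=-1)           # batch norm + global max pooling
        return torch.softmax(self.head(x), dim=-1)

# Usage example: e.g., 34 classes (33 attacks + benign) and an illustrative
# sequence length of 46 features per flow.
model = IoTAttentionCNN(in_channels=1, num_classes=34)
probs = model(torch.randn(8, 1, 46))
```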
This design balances computational efficiency and high classification performance, making it suitable for a wide range of applications. The model has a total of approximately 9 million learnable parameters, ensuring robust performance even with a reduced number of trainable parameters.
For the optimization of the proposed 1D CNN model, various established techniques were used to improve its performance and efficiency. Stochastic Gradient Descent with Momentum (SGDM) was used as the optimizer during the training process. SGDM helps accelerate gradient vectors in the correct directions, leading to faster convergence, and the momentum term helps to smooth out oscillations and stabilize the updates. Learning Rate Scheduling was maintained with periodic updates every 50 iterations to balance convergence speed and stability, ensuring that the learning rate remains optimal throughout the training process. Dropout layers were incorporated to prevent overfitting by randomly setting a fraction of input units to zero during training, improving the model’s generalization capability by ensuring that the network does not rely too heavily on any single neuron. Batch Normalization, applied after each convolutional block, stabilizes and accelerates the training process by normalizing the inputs to each layer, helping to maintain a stable distribution of activations throughout the network, which in turn leads to improved training speed and performance. Various hyperparameters, including the number of filters, kernel size, and dropout rate, were fine-tuned through grid search, allowing us to find the optimal set of hyperparameters that yield the best performance for this model. These optimization techniques collectively contribute to the robustness and efficiency of the proposed model, ensuring high classification accuracy and performance while maintaining low computational overhead.
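A hedged PyTorch equivalent of this training configuration might look as follows. The data, the `model` placeholder, the momentum value, and the decay factor `gamma` are our assumptions (the paper states only SGDM, the learning rate, and the 50-iteration update period), and the model is assumed to emit logits for the cross-entropy loss:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data; in the study, the normalized dataset splits are used.
data = TensorDataset(torch.randn(320, 46), torch.randint(0, 34, (320,)))
loader = DataLoader(data, batch_size=32, shuffle=True)    # mini-batch size 32
model = torch.nn.Linear(46, 34)                           # placeholder model

# SGD with momentum (SGDM); lr matches the paper, momentum=0.9 is assumed.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Piecewise schedule: the learning rate is adjusted every 50 iterations
# (the decay factor 0.9 is an assumption; the paper states only the period).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.9)

model.train()
for epoch in range(30):                       # 30 epochs, as in the paper
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(xb), yb) # assumes logit outputs
        loss.backward()
        optimizer.step()
        scheduler.step()                      # one scheduler tick per iteration
```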
4. Experimental Results
The experiments were conducted on a personal computer equipped with an NVIDIA RTX 4080 GPU, 128 GB of memory, and a 13th-generation Intel Core i9-13900K processor running Windows 11. The proposed CNN model was developed using the MATLAB Deep Network Designer. The dataset was split into 70% training, 15% testing, and 15% validation. Stochastic Gradient Descent with Momentum (SGDM) was used as the solver, with an initial learning rate of 0.01, a mini-batch size of 32, and training over 30 epochs. The learning rate was held constant between scheduled updates, which occurred every 50 iterations, and all training was performed on the GPU. The performance of the proposed 1D CNN model was evaluated on several datasets. The model achieved a validation accuracy of 98.36% on the CIC IoT 2023 dataset, 99.90% on the CIC-MalMem-2022 dataset, and 96.64% on the CIC-IDS2017 dataset, indicating its effectiveness across different security domains such as IoT security, malware detection, and network attack detection.
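The 70/15/15 split can be reproduced with a two-stage scikit-learn split. This is an illustrative sketch (stratification and the random seed are our assumptions), with `X` and `y` standing in for the feature matrix and labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a dataset's feature matrix and labels.
X, y = np.random.randn(1000, 46), np.random.randint(0, 2, 1000)

# Stage 1: hold out 70% for training; the remaining 30% is split next.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# Stage 2: split the remainder evenly into 15% validation and 15% testing.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, stratify=y_rest, random_state=42)
```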
Figure 3 illustrates the accuracy and loss curves during training and validation on the CIC IoT 2023 dataset, showing strong performance and high accuracy throughout the process.
Figure 4 presents the accuracy and loss curves obtained on the CIC-MalMem-2022 dataset. Here, the validation accuracy is also notably high, and it can be observed that the loss values decrease rapidly during the training process.
Figure 5 presents the accuracy and loss curves for the CIC-IDS2017 dataset. These graphs also demonstrate the model’s effectiveness in detecting network attacks, showcasing its robust performance throughout the training and validation phases.
In this study, the datasets were divided into 70% training, 15% testing, and 15% validation. During the training phase, the model’s parameters were optimized, the overall performance was monitored using the validation set, and the final performance was evaluated using the test set. The evaluation of the proposed model is based on the following metrics: accuracy, precision, recall, F1-score, and confusion matrix analysis. These metrics were chosen to provide a comprehensive evaluation of the model’s performance, particularly in identifying both false positives and false negatives.
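For completeness, these metrics can be computed from the test-set predictions as in the sketch below (scikit-learn is assumed; macro averaging across classes is our assumption, since the paper does not state the averaging scheme, and the label arrays are random placeholders):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# y_test: true labels; y_pred: the trained model's test-set predictions
# (random placeholders here; in the study these come from the 1D CNN).
y_test = np.random.randint(0, 3, 300)
y_pred = np.random.randint(0, 3, 300)

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred, average="macro"))
print("recall   :", recall_score(y_test, y_pred, average="macro"))
print("F1-score :", f1_score(y_test, y_pred, average="macro"))
print("confusion matrix:\n", confusion_matrix(y_test, y_pred))
```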
The results obtained using the test datasets are presented in the following figures:
Figure 6 shows the confusion matrix for the CIC IoT 2023 dataset. This matrix provides a detailed view of how accurately the model classified different classes and reveals the error rates.
Figure 7 presents the confusion matrix for the CIC-MalMem-2022 dataset. This matrix visualizes the model’s success in malware detection and highlights any potential classification errors.
Figure 8 provides the confusion matrix for the CIC-IDS2017 dataset. This matrix illustrates the model’s performance in detecting network attacks and shows the correct classification rates.
Table 3 presents a percentage summary of the results obtained for different classes in the CIC IoT 2023 dataset. The model demonstrates high performance in attack types such as DDoS-ICMP_Flood (99.95% accuracy, 99.91% precision, and 99.93% recall), DDoS-PSHACK_Flood (99.96% accuracy, 99.92% precision, and 99.94% recall), and DDoS-SYN_Flood (99.60% accuracy, 99.78% precision, and 99.69% recall). However, it shows low performance in classes such as Backdoor_Malware (98.36% accuracy, 100% precision, and 4.56% recall) and DNS_Spoofing (48.41% accuracy, 26.71% precision, and 34.43% recall). There are also classes with moderate performance, such as DoS-HTTP_Flood (93.30% accuracy, 73.77% precision, and 82.40% recall) and MITM-ArpSpoofing (78.18% accuracy, 58.59% precision, and 66.98% recall). Overall, while the model is successful in some attack types, it requires improvement in others.
Table 4 presents a summary of the results obtained for different classes in the CIC-MalMem-2022 dataset. For the first class, the accuracy, precision, recall, and F1-scores are 99.97%, 99.95%, 99.98%, and 99.97%, respectively. These high-performance indicators demonstrate that the model is extremely successful in accurately classifying malware in the first class. For the second class, the accuracy, precision, and F1-score are reported as 99.97%, 99.95%, and 99.97%, respectively. These values indicate that the model also detects malware in the second class with high accuracy and reliability. The obtained results prove that the employed model is highly effective in classifying malware classes in the CIC-MalMem-2022 dataset and exhibits high overall performance. This supports the model's applicability for practical malware detection.
The features extracted from the FC_9 fully connected layer of the 1D CNN model were classified using the MATLAB classification layer with 10-fold cross-validation; the classification results are depicted in Figure 9. While this proposed model achieved an accuracy of 99.97%, the results for kNN [35], SVM [36,37,38], Neural Network [39,40], Naive Bayes [41], Tree [42], and Efficient Logistic Regression [42] were 98.68%, 98.56%, 98.16%, 88.58%, 87.78%, and 60.9%, respectively (see Figure 9). These results not only demonstrate the superior accuracy of the proposed model in comparison to other methods but also highlight its high performance and generalization capability. The stability and generalization capabilities of the proposed model were validated using cross-validation and independent test sets. The low variance in results indicates consistent performance across different subsets of data. This demonstrates that this model can reliably operate across a wide range of conditions and generalize with high accuracy to different datasets.
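The comparison above can be reproduced in outline with scikit-learn. The sketch below assumes `features` holds the FC_9 activations exported from the trained network and `labels` the class labels (random placeholders here; the classifier hyperparameters are illustrative, not those of the study):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Stand-ins for the FC_9 deep features and their class labels.
features = np.random.randn(500, 96)
labels = np.random.randint(0, 2, 500)

classifiers = {
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    "Naive Bayes": GaussianNB(),
    "Tree": DecisionTreeClassifier(),
}

# 10-fold cross-validation on the deep features, mirroring the protocol above.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, features, labels, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.4f} (+/- {scores.std():.4f})")
```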
Table 5 summarizes the performance metrics of the model for the various attack classes in the CIC-IDS2017 dataset. The model exhibited high performance in the BENIGN class with an accuracy of 96.55%, a precision of 99.60%, a recall of 96.32%, and an F1-score of 97.93%. For DDoS attacks, the model achieved an accuracy of 99.90%, a precision of 97.16%, and a recall of 98.51%, demonstrating high detection success. In classes such as Web Attack Brute Force, Web Attack SQL Injection, and Heartbleed, the model displayed its highest performance with accuracies close to 100%, precision near 100%, and recall values over 99%. Specifically, for the Web Attack SQL Injection and Heartbleed classes, it achieved perfect detection with 100% accuracy, 100% precision, and 100% recall. However, the performance metrics were lower for some classes, such as Bot, FTP-Patator, and SSH-Patator. For example, in the Bot class, the model achieved 69.66% accuracy, 34.24% precision, and 45.91% recall, indicating difficulties in detecting this class. Overall, the model performed well in many attack classes, but improvements are needed for classes where performance was lower. These results indicate that while the model is generally effective for attack detection, optimization is necessary to achieve better results in certain classes.
In the experiments, the performance of the proposed 1D CNN model was evaluated across various datasets. The model demonstrated high performance across many attack classes in the CIC-IDS2017 dataset, achieving 100% accuracy, precision, and recall for the Web Attack SQL Injection and Heartbleed classes. However, performance discrepancies were noted, particularly in the "Backdoor_Malware" and "DNS_Spoofing" classes of the CIC IoT 2023 dataset. The lower performance in these classes can be attributed to several factors. For "Backdoor_Malware", the stealthy nature of these attacks often involves techniques that evade detection by blending in with legitimate traffic. This makes it challenging for the model to differentiate between malicious and benign behaviors. Similarly, "DNS_Spoofing" exploits vulnerabilities in the domain name resolution process, often resulting in patterns that mimic normal network activity. These complexities necessitate more advanced feature extraction and training strategies to improve detection rates for such sophisticated attack types. In the CIC IoT 2023 dataset, the model achieved high accuracy rates, exceeding 99.95% for attack types such as DDoS-ICMP_Flood and DDoS-PSHACK_Flood. However, as previously mentioned, the model's efficacy was notably lower for the more nuanced classes like "Backdoor_Malware" and "DNS_Spoofing". This highlights the need for continued refinement in detection algorithms to address the challenges posed by these advanced threats. In the CIC-IDS2017 dataset, the model exhibited effective detection of network attacks, generally showcasing high validation accuracy. These results confirm that while the proposed model is robust across various security domains, such as malware detection, IoT security, and network attack detection, there remains a critical need for enhancements in handling complex attack vectors.
5. Discussion
In this study, the performances of various machine-learning methods used in the classification of IoT security data were compared. The findings reveal that the proposed model demonstrates superior performance, particularly in terms of high accuracy, precision, recall, and F1-score, when compared to other contemporary methods. The proposed model has shown more consistent and high performance across the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets than other studies. This indicates that the proposed model can be effectively and reliably used as a tool for classifying IoT security data.
The discussion section examines the reasons behind these findings, the advantages the model offers compared to other methods, and potential areas for improvement. Additionally, the integration of these results into practical security systems and future directions of research in this area are discussed. Comparison results are tabulated in Table 6.
Various researchers have developed different classifiers using machine-learning methods on IoT security data. Hassini et al. achieved high-performance results using an end-to-end 1D CNN on the Edge-IIoTset dataset (accuracy: 99.96%, precision: 100%, recall: 99%, and F1-score: 99%). Neto et al. [44] experimented with various methods on the CICIoV2024 dataset, achieving 95% accuracy with Logistic Regression, Deep Neural Network, and Random Forest, while the AdaBoost method reached 87% accuracy. Canavese et al. [45] used Random Forest on the CIC IoT 2023 dataset for both coarse-grained and fine-grained classification, obtaining approximately 96% accuracy. Maniriho et al. [46] demonstrated high performance with 98.82% accuracy using deep autoencoders and a stacked ensemble on the MemMal-D2024 dataset. Khalid et al. [47] achieved high F1-scores (97.0%) using memory-based features and Random Forest on the CICMalDroid2020 and CIC-AndMal2017 datasets. Namakshenas et al. [48] obtained 94.93% and 91.93% accuracy on the N-BaIoT and Edge-IIoTset datasets, respectively, using Federated Learning, Quantum Computing, and Additive Homomorphic Encryption. Talukder et al. reached high accuracy rates (99.99%) on the UNSW-NB15, CIC-IDS2017, and CIC-IDS2018 datasets using Random Oversampling (RO), Stacking Feature Embedding, and PCA. The proposed model, using a CNN and softmax, achieved notable results on the CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017 datasets. Specifically, it achieved 98.36% accuracy, 100% precision, 99.96% recall, and a 99.95% F1-score on the CIC IoT 2023 dataset; 99.90% accuracy, 99.98% precision, 99.97% recall, and a 99.96% F1-score on the CIC-MalMem-2022 dataset; and 99.99% accuracy, 99.99% precision, 99.98% recall, and a 99.98% F1-score on the CIC-IDS2017 dataset. These results indicate that the proposed model demonstrates superior performance compared to other methods, particularly in terms of high precision and recall values.
The proposed 1D CNN framework achieves globally optimal results through a combination of several advanced techniques. First, the model architecture is meticulously optimized, incorporating convolutional layers, GELU activation functions, and self-attention mechanisms to effectively capture complex patterns in IoT data. Extensive cross-validation and hyperparameter tuning ensure that the chosen parameters, such as the number of filters, kernel size, and learning rate, provide the best performance, thereby avoiding overfitting and enhancing generalization to unseen data. Normalization techniques, including layer normalization and batch normalization, stabilize the learning process by maintaining a consistent scale of input data, which improves the convergence rate and overall model performance. Dropout layers are employed to prevent overfitting by randomly setting a fraction of input units to zero during training, enhancing the model’s generalization capabilities. Adaptive learning rate scheduling is used to balance convergence speed and stability by periodically adjusting the learning rate, ensuring the model maintains an optimal learning rate throughout training and avoids local minima. The model is trained and evaluated on multiple comprehensive datasets, including CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017, which exposes the model to diverse data and improves its ability to generalize across different scenarios. Rigorous performance evaluation using key metrics such as accuracy, precision, recall, and F1-score confirms that the model effectively balances precision and recall, achieving high performance. These combined techniques ensure that the proposed 1D CNN framework systematically optimizes its learning process, resulting in robust performance and the ability to achieve optimal results in IoT security data classification globally.
6. Future Work
Future research directions could focus on testing the proposed model on larger datasets and diverse attack types to further refine its performance. Enhancements could also explore adaptive learning capabilities and autonomous updates to maintain and improve security measures dynamically. This study contributes significantly to the field of IoT security by providing advanced insights into the deployment of deep-learning techniques, thereby encouraging further research and development in this dynamic domain. Future work aims to strengthen the experimental side of this study by including practical IoT case studies, which will provide more practical insights and demonstrate the applicability of the proposed 1D CNN model in real-world IoT scenarios. In doing so, it is hoped that the gap between theoretical research and practical application will be bridged, providing a more comprehensive evaluation of the model's performance in various IoT environments. Additionally, further testing on larger and more varied datasets, as well as exploring adaptive learning capabilities, will be considered to enhance the robustness and effectiveness of the model.
7. Limitations
Although the datasets used in this study are comprehensive, they may not fully capture the diversity of real-world IoT devices and attack types. Future research should aim to validate the model on a broader range of datasets to improve its generalizability. While the model demonstrates low computational overhead, further evaluation is required to assess its performance and efficiency in real-time applications, particularly on resource-constrained devices.
The static training process employed by this model limits its ability to adapt to new attack types and evolving threat landscapes. Future research should explore the development of models with adaptive learning capabilities to dynamically update security measures in response to emerging threats.
In conclusion, this research confirms the applicability and effectiveness of deep learning techniques in improving IoT security. The proposed model, with its low computational requirements and high performance, presents a valuable tool for real-world IoT operations, enabling the detection and mitigation of large-scale attacks. Future research could focus on testing the model with larger and more diverse datasets and incorporating adaptive learning mechanisms to further enhance its robustness.
8. Conclusions
This study presents an optimized 1D convolutional neural network (1D CNN) model for classifying Internet of Things (IoT) security data, characterized by its low computational overhead and high efficiency. The proposed model was rigorously tested on three comprehensive datasets: CIC IoT 2023, CIC-MalMem-2022, and CIC-IDS2017. The results indicate that the model demonstrates superior performance compared to existing methods, achieving high accuracy, precision, recall, and F1-scores. On the CIC IoT 2023 dataset, the model achieved an accuracy of 98.36%, precision of 100%, recall of 99.96%, and F1-score of 99.95%. These metrics underscore the model’s capability to effectively identify and classify large-scale attacks within IoT environments. Similarly, on the CIC-MalMem-2022 dataset, the model attained an accuracy of 99.90%, precision of 99.98%, recall of 99.97%, and F1-score of 99.96%, highlighting its proficiency in malware detection. The CIC-IDS2017 dataset results, with an accuracy of 99.99%, precision of 99.99%, recall of 99.98%, and F1-score of 99.98%, further confirm the model’s robustness in detecting network intrusions. This research validates the applicability and effectiveness of deep-learning techniques in enhancing IoT security. The developed model, with its low computational demands and high performance, presents a valuable tool for real-world IoT operations, facilitating the detection and mitigation of large-scale attacks. Its suitability for resource-constrained devices and real-time applications is particularly noteworthy.