1. Introduction
Over the last decade, the proliferation of the Internet of Things (IoT) across various domains, including healthcare, agriculture, smart cities, environmental monitoring, and industry, has significantly influenced the global economy. By facilitating communication between physical devices via the internet, IoT has enabled the emergence of innovative services and applications [1]. Projections indicate that by 2025, IoT-related developments will generate an economic impact ranging from USD 3.9 trillion to 11.1 trillion [2].
The Industrial Internet of Things (IIoT) extends the IoT paradigm to the manufacturing sector to enhance operational efficiency. IIoT involves the integration of intelligent devices with management platforms, while supervisory control and data acquisition (SCADA) systems serve as fundamental components. The transition of traditional SCADA architectures into IoT-enabled frameworks has fundamentally altered the landscape of cybersecurity threats [3].
Although integrating SCADA systems with IoT and cloud infrastructures provides cost-effectiveness and enhanced performance, it simultaneously introduces heightened cybersecurity risks, mainly due to the potential for remote access vulnerabilities [4]. IIoT devices are increasingly susceptible to various cyberthreats, including distributed denial-of-service (DDoS) attacks and malware infections. Consequently, research efforts concerning IIoT security have intensified recently [5].
The IIoT ecosystem is increasingly vulnerable to cybersecurity threats, including unauthorized access, data integrity breaches, denial-of-service (DoS) attacks, and protocol-specific vulnerabilities. To mitigate these risks, intrusion detection systems (IDSs) are critical in identifying anomalous network traffic, enabling proactive security measures [6]. Additionally, digital forensic incident response (DFIR) strategies are essential for protecting SCADA systems from cyberthreats [7]. However, conventional security approaches have limitations, necessitating the development of next-generation detection methods. Recent advancements in machine learning and deep learning have significantly improved the accuracy and efficiency of IDS models, making them increasingly relevant in IIoT cybersecurity research [8,9,10,11,12].
A promising approach in this domain is the integration of hybrid deep learning models that enhance IDS performance against evolving cyberthreats, including DDoS and DoS attacks. Hybrid deep learning models combine multiple neural network architectures or integrate traditional machine learning approaches to enhance predictive performance across various applications. Such methodologies leverage the strengths of distinct algorithms to address the shortcomings found in their isolated use [13]. By leveraging the strengths of multiple deep learning architectures, hybrid models improve detection accuracy while reducing false-positive rates, both critical factors in complex and dynamic IIoT environments. Notably, Konatham et al. [14] introduced a hybrid model combining convolutional neural networks (CNNs) and gated recurrent units (GRUs), achieving 94.94% accuracy in detecting anomalies in IIoT edge computing environments. This highlights the effectiveness of integrating spatial and temporal feature extraction. Similarly, Marzouk et al. [15] developed a hybrid model incorporating metaheuristic algorithms for intrusion detection in clustered IIoT environments, demonstrating adaptability to different network architectures.
DDoS attacks remain a significant challenge in network security, requiring advanced detection mechanisms. Shaikh et al. [16] proposed a CNN–LSTM hybrid model specifically designed to detect DDoS attacks, utilizing CNNs for spatial feature extraction and LSTMs for temporal sequence analysis. This dual-method approach enhances detection accuracy while addressing the limitations of traditional IDS models, which often struggle with the complexity and volume of IIoT network data. Further refinement has been achieved through optimization techniques such as the satin bowerbird optimization algorithm, which improves data preprocessing for enhanced model compatibility [17].
Recent developments also incorporate federated learning to enhance privacy while maintaining detection performance. Huang et al. [18] introduced a federated learning-based IDS model that integrates CNNs with attention mechanisms, effectively addressing privacy concerns and accuracy challenges in IIoT environments. This approach is particularly relevant given the increasing emphasis on secure industrial networks.
Additionally, ensemble methods have demonstrated significant potential in improving predictive performance. Begum et al. [19] proposed a CNN–LSTM hybrid model that achieved 99% accuracy on the KDD-Cup dataset, reinforcing the effectiveness of ensemble techniques in intrusion detection. Javeed et al. [20] further highlighted the importance of hybrid classifiers in detecting sophisticated cyberthreats in secure industrial environments.
This study employed the WUSTL-IIoT-2021 dataset to detect cyberthreats in IIoT systems, providing a robust platform for evaluating security mechanisms by simulating real-world industrial conditions [21]. As machine learning continues to demonstrate significant promise in cybersecurity, research in this area has expanded rapidly [22,23].
In this study, SCADA network traffic was analyzed using five machine learning models (CART, decision tree, logistic regression, naïve Bayes, random forest), five deep learning models (CNN, GRU, LSTM, RNN, MLP), and two hybrid models (CNN–LSTM, LSTM–CNN). A key contribution of this research is the comparative evaluation of machine learning, deep learning, and hybrid models within a unified experimental framework, providing a comprehensive analysis of their relative effectiveness. By systematically applying different hyperparameter configurations, the study aimed to refine model performance and establish a foundation for future research. The model demonstrating the highest accuracy in cyberattack detection was identified and evaluated for potential integration into network-based IDS solutions. When deployed in IDS environments, the proposed model offers superior detection accuracy and reduced error rates compared to conventional security mechanisms.
The results indicated that the multilayer perceptron (MLP) model outperformed other approaches, achieving an accuracy of 99.99%, surpassing similar studies in the literature. Contrary to widely held assumptions that hybrid models yield the highest performance, this study demonstrates that standalone models can achieve superior accuracy when applied to the WUSTL-IIoT-2021 dataset. These findings provide valuable insights for advancing cyberattack detection methodologies in IIoT environments.
Furthermore, this study makes significant contributions to the cybersecurity literature by providing a detailed analysis of common attack types in SCADA systems, offering a valuable resource for researchers and developers working in this domain. Using a dataset obtained directly from real IIoT devices ensures the practical applicability of the proposed model across both SCADA and IIoT environments. Unlike traditional attack detection techniques, integrating artificial intelligence-based methodologies introduces an innovative perspective, with the effectiveness of the proposed approach validated through comparative analysis with existing studies.
In addition to evaluating machine learning and deep learning models, the study also explores hybrid architectures, presenting new insights for cybersecurity experts. The study lays a solid foundation for future research by testing and optimizing various hyperparameters. A comprehensive assessment of SCADA security vulnerabilities and cyberthreats is conducted, enhancing awareness of potential risks. Moreover, the use of advanced big data analytics in this research contributes to improved data processing methodologies and more effective cyberthreat detection.
This study sought to answer the following research questions:
To strengthen cybersecurity in IIoT environments, which machine learning, deep learning, and hybrid models exhibit the highest performance in cyberattack detection?
How do these models compare when evaluated against key performance metrics such as accuracy, F1 score, precision, and recall?
The structure of this paper is as follows. Section 2 provides a review of related literature, Section 3 details the methodology and dataset, Section 4 describes the implementation and evaluation processes, Section 5 presents and analyzes the results, and Section 6 concludes the study with discussions on future research directions.
2. Related Work
IIoT systems form a complex structure that integrates various devices and technologies, making them vulnerable to cyberattacks. In particular, denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks can seriously affect the operation of IIoT networks, and such attacks have the potential to shut down production systems [24,25]. Many IIoT devices rely on outdated technology and insufficient security measures, which makes it easier for attackers to infiltrate the network [26]. In addition, since IIoT systems usually operate over cloud-based infrastructures, cloud computing systems also become a target for attackers [27,28].
The WUSTL-IIoT-2021 dataset has become a widely recognized benchmark for evaluating intrusion detection systems (IDSs) in IIoT environments. Its consistent adoption across numerous studies enables direct comparisons between different models, algorithms, and methodologies within a standardized experimental framework [29].
For instance, Alani et al. (2022) [30] introduced a deep learning-based IDS for industrial IoT, employing a multi-stage process. Initially, the dataset was randomly partitioned into training and testing subsets, with 75% allocated for training a deep learning classifier over 75 epochs using a batch size of 1000. The remaining 25% was used in the testing phase to evaluate the classifier’s generalization capability. To further enhance generalization, a third stage incorporated 10-fold cross-validation. The resulting model, DeepIIoT, demonstrated exceptional performance on the WUSTL-IIoT-2021 dataset, achieving accuracy exceeding 99% with false-positive and false-negative rates of 0.069% and 0.032%, respectively. Notably, DeepIIoT outperformed alternative approaches tackling similar challenges.
In 2023, Mohy-Eddine et al. [31] introduced an intrusion detection model emphasizing advanced preprocessing techniques. Their approach integrated feature selection using the Pearson correlation coefficient (PCC) and outlier detection via the isolation forest (IF) algorithm. The PCC was employed as a feature reduction method to improve model convergence, lower computational costs, and enhance training efficiency. The IF algorithm effectively identified outliers within the Bot-IoT and WUSTL-IIoT-2021 datasets, significantly improving model performance, particularly in handling imbalanced datasets. Similarly, Babayigit and Abubaker (2024) [32] proposed a hybrid framework for detecting malicious activities in IIoT environments, incorporating minimum description length (MDL) and transfer learning (TL). This framework standardized dimensions and distributions across diverse IIoT datasets, enabling a unified feature representation. A CNN-GRU model was trained on the integrated datasets, with Bayesian optimization (BO) applied for hyperparameter tuning. Experimental results demonstrated the framework’s effectiveness in developing a robust deep learning model capable of generalizing across multiple datasets, emphasizing the crucial role of dataset quality in ensuring reliable training and testing outcomes.
Eid et al. (2024) [33] developed a machine learning-based intrusion detection system (IDS) for IIoT networks, evaluating six machine learning algorithms: decision tree (DT), random forest (RF), k-nearest neighbor (KNN), support vector machine (SVM), logistic regression (LR), and naïve Bayes (NB). All models performed well, with Matthews correlation coefficient (MCC) scores exceeding 99%. Alani (2023) [9] investigated a flow-based IDS for IIoT utilizing classifiers such as RF, LR, DT, and Gaussian naïve Bayes (GNB). The study employed a preprocessed dataset and applied recursive feature elimination to refine the feature set to 11 attributes. A 10-fold cross-validation approach ensured robust generalization with minimal accuracy variance.
Popoola et al. (2023) [34] introduced a federated deep learning (FDL) model for intrusion detection in consumer-centric IoT, achieving accuracy of 0.9997 for DT with an inference time of 0.1517 μs. Their comparative analysis of multiple datasets, including X-IIoTID, Edge-IIoTset, and WUSTL-IIoT-2021, demonstrated the superiority of FDL models over centralized deep learning (CDL) models, providing timely and privacy-preserving intrusion detection. Alzahrani and Aldhyani (2023) [35] examined AI-driven cybersecurity enhancements for industrial control systems, comparing machine learning techniques (KNN, RF) and deep learning architectures (CNN-GRU). Their study achieved exceptionally high accuracy rates (99.99% for KNN and RF and 99.98% for CNN-GRU) on the WUSTL-IIoT-2018 dataset, further confirming the effectiveness of AI-based intrusion detection strategies in IIoT security.
Similarly, Dina et al. (2023) [36] proposed a deep learning-based intrusion detection model utilizing feedforward neural networks (FNNs) and convolutional neural networks (CNNs). Their evaluation using the WUSTL-IIoT-2021 dataset demonstrated superior performance, with the FNN-focal model achieving accuracy of 98.95%. Diaba et al. (2023) [37] analyzed the impact of manipulated datasets on machine learning models to assess cybersecurity risks in power systems. Based on the WUSTL-IIoT-2021 and WUSTL-IIoT-2018 datasets, their findings revealed that manipulated datasets led to reduced accuracy, increased prediction errors, and longer training times for all algorithms except the boosted tree algorithm.
Xu et al. (2023) [38] proposed an IoT intrusion detection system based on machine learning, employing the XGBoost classifier in conjunction with two-stage feature selection methods: binary gray wolf optimization (BGWO) and recursive feature elimination with XGBoost (RFE-XGBoost). Experiments conducted on five publicly available datasets demonstrated the approach’s superior accuracy, recall, precision, and F1 score performance. However, the study also highlighted challenges in scaling the method to large datasets due to computational complexity, memory constraints, efficiency, generalization ability, and robustness.
Ahakonye et al. (2024) [39] introduced the trees bootstrap aggregation (TBA) algorithm for detecting and classifying IoT-SCADA network traffic, focusing on the IEC-104 network communication protocol. Their study demonstrated TBA’s high precision in identifying different network traffic types while reducing false-acceptance rates in heterogeneous IIoT sensor data. Bekbulatova et al. (2023) [40] addressed IIoT security concerns by proposing an anomaly-based IDS for detecting zero-day attacks. Their approach utilized semi-supervised learning on large-scale, unlabeled IIoT network traffic, implementing the DeepSAD model in a federated learning framework. While the centralized model outperformed the federated approach in detecting DoS attacks, variations in client performance within the federated setting indicated potential areas for future optimization.
Gaber et al. (2023) [41] made a notable contribution to IIoT cybersecurity by integrating particle swarm optimization (PSO) and the bat algorithm (BA) for feature selection, significantly improving the efficiency of IIoT-based traffic classification. Their study, which utilized the WUSTL-IIoT dataset, achieved accuracy of 99.99% and precision of 99.96%, substantially enhancing computational efficiency. Eid et al. (2024) [33] systematically evaluated machine learning models, investigating preprocessing techniques and dataset imbalances for IDSs in IIoT environments. Their study demonstrated that applying the synthetic minority oversampling technique (SMOTE) improved binary classification accuracy, with random forest and decision trees achieving 99.98%. The study also introduced a novel multi-class classification approach using SMOTE, enhancing detection performance for various attack types, with RF, DT, and LR achieving near-perfect accuracy.
Casajús-Setién et al. (2023) [42] proposed an anomaly detection-based IDS framework using a transformer model, significantly advancing IIoT cybersecurity. Utilizing the WUSTL-IIoT-2021 dataset, their research demonstrated the model’s effectiveness in analyzing sequential network flows using a streamlined multi-head attention mechanism.
Ye et al. (2024) [43] introduced an ensemble framework incorporating an enhanced harmony search algorithm (HBO) for feature selection in IDSs. Their approach, tested on datasets such as NSL-KDD, WUSTL-IIoT-2021, and HAI, improved intrusion detection accuracy by nearly 15% on large-scale datasets while significantly reducing the number of original features.
Lastly, Saxena and Mittal (2023) [44] conducted a comprehensive review of existing IIoT network datasets and advanced persistent threat (APT) attack characteristics, proposing a standardized evaluation framework for benchmarking IDS models. Their assessment, applied to datasets including WUSTL-IIoT-2021, underscored the dataset’s significance in advancing IIoT security research and developing effective countermeasures against sophisticated cyberthreats.
In summary, studies utilizing the WUSTL-IIoT-2021 dataset frequently adapt their methodologies and consistently report high accuracy in intrusion detection. Many of the proposed models evaluated on this dataset achieve accuracy nearing 99%, reflecting a prevailing trend in developing robust and effective intrusion detection systems for IIoT environments. Several studies have demonstrated superior performance compared to the existing literature, highlighting the reliability and effectiveness of these approaches. The strong emphasis on high accuracy underscores the critical need for advanced, trustworthy security solutions to protect IIoT networks. As a widely adopted benchmark, the WUSTL-IIoT-2021 dataset plays a key role in these evaluations.
As shown in Table 1, the WUSTL-IIoT-2021 dataset has been analyzed in various studies. However, as it is relatively new compared to other datasets in the field, research directly focused on it remains limited. This study is expected to contribute to its growing adoption. Furthermore, this study stands out by examining three different modeling approaches (machine learning, deep learning, and hybrid models), making it one of the few in this domain to do so. The analysis also incorporates 12 different models, a rare approach in existing research. While hybrid methods have generally outperformed standalone deep learning techniques in previous studies, the standalone models employed in this research demonstrated superior performance on the WUSTL-IIoT-2021 dataset.
3. Methods
This study aimed to detect cyberattacks in IIoT networks, which are increasingly adopted, interconnected, and expected to play a growing role in cybersecurity research as network sizes expand. The proposed framework consists of four key stages (data preprocessing, data splitting, classification, and evaluation), as illustrated in Figure 1. The data preprocessing phase prepared the raw dataset for the classification algorithms.
Subsequently, machine learning, deep learning, and hybrid models were applied to the preprocessed dataset. All modeling tasks were executed within the Google Colab environment to overcome hardware limitations, leveraging Google’s computational infrastructure. The analysis relied on widely used libraries and tools, including Pandas, MLlib, and Scikit-learn, with PyCharm as the development environment. Apache Spark (ver. 3.0.0, running on OpenJDK 8 headless) was selected as the computing platform, and Python (ver. 3.10) was used as the primary programming language. Hyperparameter tuning was performed for the deep learning models, with optimal values determined through iterative testing and alignment with commonly adopted parameters in the literature. The models categorized network traffic into normal and attack classes, and their performance was assessed on an independent test set that was not utilized during training. The model achieving the highest accuracy in attack detection was identified and considered for integration into network-based intrusion detection systems (IDSs). When implemented in IDSs, the developed model demonstrates superior detection rates and reduced error margins, surpassing traditional intrusion detection techniques.
A binary classification approach was employed to distinguish between normal and attack traffic. Since approximately 90% of the attacks in the dataset were DoS traffic and the remaining attack types exhibited characteristics highly similar to DoS patterns, all attacks were grouped into a single class. The dataset was divided into two subsets to ensure objective evaluation, with 70% used for training and 30% for testing; the test set was exclusively reserved for performance assessment. The models were evaluated based on F1 score, accuracy, recall, and precision metrics to assess their effectiveness in detecting attack traffic.
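As a concrete illustration of this split, the following sketch shows how the 70/30 partition could be performed with scikit-learn, assuming the preprocessed records are held in a pandas DataFrame `df` with a binary `Target` column (0 = normal, 1 = attack); the column name and the use of stratification are illustrative assumptions rather than details reported in the study.

```python
# Hedged sketch of the 70/30 train/test split described above.
# Assumption: preprocessed data in a DataFrame `df` with a binary `Target` label.
from sklearn.model_selection import train_test_split

X = df.drop(columns=["Target"])
y = df["Target"]

# Stratification (an assumption, not stated in the paper) keeps the ~8% attack
# ratio identical in both subsets; the 30% test portion is held out for evaluation only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)
```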
During the final evaluation phase, the performance of the proposed method was rigorously analyzed, and the results demonstrated high accuracy in detecting DoS attacks, underscoring the model’s effectiveness in IIoT cybersecurity applications.
3.1. Dataset
The WUSTL-IIoT-2021 dataset was developed to facilitate the detection of cyberattacks targeting IIoT networks, supporting cybersecurity research. Created by Zolanvari et al. [29], this dataset was explicitly designed to enhance the detection and classification of denial-of-service (DoS) attacks. It was generated using an IIoT testbed that closely replicates real-world industrial systems, enabling the execution of actual cyberattacks. The dataset encompasses 2.7 GB of network traffic data collected over approximately 53 h. Before analysis, the dataset underwent preprocessing, including removing missing values, corrupted entries, and extreme outliers. Although it originally contained six distinct attack types, a binary classification approach was adopted, distinguishing between “attack traffic” and “normal traffic.” Given the high feature similarity among different attack types, all attack instances were labeled class 1, while normal traffic was designated class 0. The characteristics of the dataset are presented in Table 2.
The dataset was intentionally designed to be imbalanced to ensure a realistic representation of industrial network environments. Attack scenarios, including command injection, reconnaissance, and DoS attacks, were executed against the testbed to capture a diverse range of malicious activities. Notably, attack traffic constitutes less than 8% of the dataset, aligning with real-world industrial control system conditions.
Table 3 provides statistical insights into the dataset, with an average data rate of 419 kbit/s and an average packet size of 76.75 bytes. Since DoS attacks typically generate high volumes of network traffic, approximately 90% of recorded attack instances were allocated to this category. In contrast, other attack types were comparatively infrequent, generating only limited traffic.
The dataset includes various host-specific attributes, such as source and destination IP addresses. However, incorporating these features during model training may lead to overfitting, limiting the model’s ability to generalize to unseen data. Additionally, specific attributes, such as flow start and end times, do not directly contribute to attack detection. Consequently, features such as StartTime, LastTime, SrcAddr, DstAddr, sIpId, and dIpId were removed to prevent model over-learning and improve detection performance. Following this refinement process, the total number of features was reduced to 42.
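A minimal pandas sketch of this feature-removal step is shown below; the file name is hypothetical, and the column names follow the spellings used above, so they should be checked against the actual CSV export.

```python
# Illustrative sketch of dropping host-specific and time-related attributes.
import pandas as pd

df = pd.read_csv("wustl_iiot_2021.csv")  # hypothetical file name

drop_cols = ["StartTime", "LastTime", "SrcAddr", "DstAddr", "sIpId", "dIpId"]
df = df.drop(columns=[c for c in drop_cols if c in df.columns])

print(df.shape)  # 42 candidate features should remain alongside the label column(s)
```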
Feature selection plays a critical role in constructing an adequate dataset for intrusion detection. The selected features exhibited significant changes during attack phases compared to normal network behavior. If a feature remains static across both attack and normal states, even the most advanced detection algorithms will fail to identify anomalies. The final dataset contains 42 features, with their descriptions detailed in Table 4.
The WUSTL-IIoT-2021 dataset contains 1,194,464 samples. It covers four attack types and is labeled across six classes: normal traffic, total attack traffic, command injection traffic, DoS traffic, reconnaissance traffic, and backdoor traffic, the last four of which correspond to the individual attack types.
Table 5 shows the data distribution by class. Additionally, Table 6 shows the count, mean, standard deviation, minimum, and maximum values of the dataset.
The WUSTL-IIoT-2021 dataset was selected for this study because it includes DoS attacks specific to IIoT environments and has recently been widely adopted in machine learning research.
3.2. Data Preprocessing
The data preprocessing phase consists of three key stages: encoding, normalization, and feature selection.
The first stage, encoding, involves converting textual attributes within the dataset into numerical values to facilitate processing by artificial intelligence algorithms. Additionally, class labels indicating the category of each data entry are numerically represented. In cases where class labels are initially in textual format, they are transformed into numerical equivalents. For instance, the Traffic_Type attribute in the dataset comprises textual categories such as normal, command injection, DoS, reconnaissance, and backdoor. Given that 92% of the dataset consists of normal traffic and 90% of the remaining attack traffic corresponds to DoS attacks, all other attack types were also categorized as DoS attacks. Since textual representations hinder computational efficiency, these labels were converted into numerical values. Through one-hot encoding, the Traffic_Type feature was transformed into the target feature, as shown in Table 7. In this conversion, rows labeled 0 represent normal traffic, while those labeled 1 correspond to attack traffic.
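Continuing the earlier sketch, this label-encoding step could be expressed as follows; the Traffic_Type column name is taken from the text above, and mapping every non-normal category to class 1 reproduces the binary target of Table 7.

```python
# Map every non-normal Traffic_Type value to the attack class (1), normal to 0.
df["Target"] = (df["Traffic_Type"].str.lower() != "normal").astype(int)
df = df.drop(columns=["Traffic_Type"])  # the textual label is no longer needed
```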
The second stage, normalization, standardizes the numerical values of features within a 0–1 range to prevent attributes with large numerical values from disproportionately influencing the model’s outcomes. Normalization ensures that all features contribute equitably to the learning process, enhancing model stability and performance. After normalization, the dataset is prepared for use in machine learning models. This step is particularly critical in mitigating the dominance of high-value features, which could otherwise introduce biases in algorithmic calculations. Normalization was applied uniformly to all features, ensuring that their values remained within the 0–1 range.
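A sketch of this normalization step, continuing from the split above and assuming scikit-learn's MinMaxScaler, is given below; fitting the scaler only on the training portion is an added precaution against information leakage rather than a detail reported in the study.

```python
# Min-max scaling of all features to the 0-1 range.
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only (assumption)
X_test_scaled = scaler.transform(X_test)        # apply the same scaling to the test set
```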
The third stage, feature selection, involves eliminating non-contributory features that increase computational complexity and strain local hardware resources. An excessive number of features can lead to higher processing costs, increased energy consumption, and extended computation times. Therefore, feature reduction is often implemented to enhance model efficiency by streamlining the dataset.
In this study, random forest, logistic regression, and ExtraTreesClassifier algorithms were employed to evaluate the significance of each feature. The outcomes of these three feature selection techniques were compared, and the results indicated that feature reduction was unnecessary for this dataset. All 42 features were deemed relevant and retained for model training. By analyzing the impact of these features on classification accuracy, this study provides valuable insights for future research and contributes to the broader literature on intrusion detection in IIoT environments.
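The feature-importance comparison could be sketched as follows, using the same three estimators named above; all hyperparameters are scikit-learn defaults and purely illustrative.

```python
# Compare feature importances from random forest, extra trees, and logistic regression.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression

rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train_scaled, y_train)
et = ExtraTreesClassifier(n_estimators=100, random_state=42).fit(X_train_scaled, y_train)
lr = LogisticRegression(max_iter=1000).fit(X_train_scaled, y_train)

importance = pd.DataFrame({
    "feature": X.columns,
    "random_forest": rf.feature_importances_,
    "extra_trees": et.feature_importances_,
    "logreg_abs_coef": abs(lr.coef_[0]),  # coefficient magnitude as a rough importance proxy
})
print(importance.sort_values("random_forest", ascending=False).head(10))
```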
4. Experiments and Evaluations
In this section, the WUSTL-IIoT-2021 dataset was analyzed using five machine learning models, five deep learning models, and two additional hybrid models. Google infrastructure was used to overcome hardware limitations, and all modeling operations were performed in the Google Colab environment. Popular libraries and tools such as Pandas, MLlib, Scikit-learn, and PyCharm were used in the analysis processes. The Apache Spark platform was chosen as the working environment, and Python was used as the programming language. Hyperparameter tuning was carried out for the deep learning models; the most appropriate values were determined by testing different settings and considering the values commonly used in the literature. The classification process used a binary approach (normal and attack traffic). To increase the objectivity of the results, the dataset was divided into training and testing subsets: 70% was reserved for model training and the remaining 30% for testing. The test portion was never used for any purpose other than evaluation (e.g., it was excluded from training). While detecting the attack traffic, the results were evaluated and compared according to the F1 score, accuracy, recall, and precision values. The results were compared across the five machine learning and five deep learning algorithms, and the two hybrid models were additionally evaluated and interpreted.
4.1. Model Parameters and Training Configurations
This section presents the parameters used for the machine learning, deep learning, and hybrid models in this study. Each table provides a structured overview of the models, their configurations, and key hyperparameters.
Table 8 below summarizes the study’s machine learning models and their key parameter settings.
Table 9 presents the deep learning models, their input, hidden, and output layer configurations, and additional settings.
Table 10 outlines the hybrid models used in this study, detailing the combination of deep learning architectures and their respective configurations.
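For illustration, a CNN–LSTM hybrid of the kind listed in Table 10 could be assembled as below, assuming a Keras/TensorFlow implementation in which each 42-feature flow record is reshaped to (42, 1) so the 1D convolution can scan the feature vector; the layer sizes are illustrative and not the study's reported configuration.

```python
# Hedged sketch of a CNN-LSTM hybrid for binary flow classification.
from tensorflow.keras import layers, models

def build_cnn_lstm(n_features: int = 42) -> models.Model:
    model = models.Sequential([
        layers.Input(shape=(n_features, 1)),
        layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),  # spatial block
        layers.MaxPooling1D(pool_size=2),
        layers.LSTM(64),                        # temporal block follows the convolutional block
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # binary output: normal vs. attack
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```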
Table 11 provides an overview of the training configurations applied to all deep learning models.
The Adam optimizer dynamically adjusts the learning rate, improving optimization efficiency and model convergence. To prevent overfitting, the epoch count should be increased only with caution.
4.2. Evaluation Parameters
The most commonly used metrics in the literature (F1 score, accuracy, recall, and precision) were employed to evaluate the classification models’ performance. These metrics serve as standard benchmarks for assessing and comparing the effectiveness of different classification algorithms and are widely utilized in various machine learning and deep learning applications [45,46].
These evaluation parameters are derived from the confusion matrix, which consists of four key components:
True Positive (TP): The number of correctly classified attack instances.
True Negative (TN): The number of correctly classified normal instances.
False Positive (FP): Normal instances incorrectly classified as attacks.
False Negative (FN): Attack instances mistakenly classified as normal.
The evaluation metrics are computed as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN) measures the proportion of correctly classified instances among all instances.
Precision = TP / (TP + FP) quantifies the proportion of correctly classified attack instances among all predicted attack instances.
Recall (sensitivity) = TP / (TP + FN) measures the proportion of correctly classified attack instances relative to all actual attack instances.
F1 score = 2 × (Precision × Recall) / (Precision + Recall) represents the harmonic mean of precision and recall, providing a balanced measure that accounts for both false positives and false negatives.
While precision emphasizes the accuracy of positive classifications, recall provides insight into the model’s ability to identify actual attack instances correctly. The F1 score serves as a comprehensive measure that balances both precision and recall, making it one of the most widely adopted performance indicators in classification tasks.
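Assuming scikit-learn and a fitted classifier (the hypothetical `clf` below) that produces hard 0/1 predictions on the held-out test split, these metrics can be computed directly, as in the following sketch.

```python
# Compute accuracy, precision, recall, and F1 score on the 30% test set.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_pred = clf.predict(X_test_scaled)  # `clf` is a hypothetical fitted classifier

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()  # TN, FP, FN, TP
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))
```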
5. Results and Comparison
The results of this study were compared across traditional machine learning, deep learning, and hybrid learning algorithms. The dataset was split into 70% training and 30% testing. Commonly accepted hyperparameter values were initially assigned to deep learning algorithms, and fine-tuning was performed to achieve the best results.
Although the dataset contains six different attack types, a binary classification approach was adopted: attack traffic vs. normal traffic. This method was successful due to the high similarity between feature distributions of different attack types. The Adam optimization algorithm accelerated and enhanced the model’s convergence to the global minimum. Adam combines momentum and RMSprop methods, ensuring a balanced and efficient optimization process.
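For reference, the standard Adam update rules (not specific to this study) make the combination of momentum and RMSprop explicit: the first-moment estimate plays the role of momentum and the second-moment estimate rescales the step as in RMSprop.

```latex
% Standard Adam update rules, combining momentum- and RMSprop-style estimates
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t              % first-moment (momentum) estimate
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2             % second-moment (RMSprop-style) estimate
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}   % bias correction
\theta_t = \theta_{t-1} - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}     % parameter update
```

Here g_t is the gradient at step t, α the learning rate, β1 and β2 the decay rates of the moment estimates, and ε a small constant for numerical stability.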
Using 42 features from the WUSTL-IIoT-2021 dataset, machine learning, deep learning, and hybrid learning models were analyzed. The results are presented in Figure 2, which compares the various machine learning and deep learning algorithms in terms of accuracy on the WUSTL-IIoT-2021 dataset.
MLP achieved the highest accuracy among the deep learning models, while CART and logistic regression led the machine learning models; all three exceeded 99.87% accuracy. However, the CNN–LSTM hybrid model performed worse than the standalone CNN and LSTM models on the WUSTL-IIoT-2021 dataset. While hybrid methods generally outperform individual deep learning techniques in the literature, standalone models performed better on this dataset, likely due to its unique characteristics.
The lowest-performing models were RNN, CNN, and naïve Bayes, with accuracy of 92.74%. These models struggled to classify DoS traffic and normal traffic instances accurately. A detailed comparison of classification metrics (accuracy, precision, recall, and F1 score) is presented in Figure 3 and Table 12.
The results showed that the LSTM–CNN model had the lowest accuracy, at 92.22%, while the MLP model achieved the highest accuracy, at 99.99%. The highest accuracy values are highlighted in bold. RNN and CNN exhibited the lowest success rates among the deep learning models, whereas CART was the best-performing machine learning model. A comparative analysis of all models is visually presented in Figure 3.
The dataset is better suited for binary classification models. In the dataset, normal traffic is labeled 0, and attack traffic is labeled 1. The most commonly used models for binary classification in the literature include logistic regression, decision tree, random forest, and support vector machine (SVM). Our study examined these models along with additional approaches. While high classification performance was achieved with models suited for binary classification, MLP yielded the highest accuracy.
This is likely due to the heterogeneous nature of attack traffic within the dataset. Although the dataset consists of a simple 0–1 classification structure, the feature variations among different attack types increase complexity. As a result, simpler models such as logistic regression and decision tree struggle with classification. For complex datasets like this, more advanced models should be preferred. Therefore, this study explored binary classification models and various machine learning and deep learning models.
MLP model performance and hyperparameters: The MLP model achieved the highest accuracy (99.99%) using 10 epochs and a batch size of 100. It consists of three hidden layers with 64, 128, and 256 neurons. The output layer has a single neuron and employs the sigmoid activation function. The model was trained using binary cross-entropy as the loss function and Adam as the optimizer. The selected hyperparameters are listed in Table 13.
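A minimal sketch of this architecture, assuming a Keras/TensorFlow implementation and using the hyperparameters reported above (three hidden layers of 64, 128, and 256 units, dropout of 0.5, a sigmoid output, binary cross-entropy, the Adam optimizer, 10 epochs, and a batch size of 100), is shown below; the placement of dropout after each hidden layer is an assumption.

```python
# Hedged sketch of the MLP configuration reported in Table 13.
from tensorflow.keras import layers, models

mlp = models.Sequential([
    layers.Input(shape=(42,)),                # 42 preprocessed flow features
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),    # binary output: normal vs. attack
])
mlp.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
mlp.fit(X_train_scaled, y_train, epochs=10, batch_size=100)
```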
ROC curve and performance evaluation: The receiver operating characteristic (ROC) curve in Figure 4 provides insights into the model’s classification performance. The x-axis represents the false-positive rate (FPR), while the y-axis represents the true-positive rate (TPR). A high TPR and low FPR indicate strong model performance. The closer the ROC curve is to the top-left corner, the better the model’s accuracy.
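The ROC analysis of Figure 4 can be reproduced in outline as follows, assuming scikit-learn and a model that exposes class-1 probabilities (for example, the Keras MLP sketched earlier via predict()).

```python
# Plot a ROC curve and report the area under it (AUC).
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

y_score = mlp.predict(X_test_scaled).ravel()  # predicted probability of the attack class
fpr, tpr, _ = roc_curve(y_test, y_score)

plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.4f}")
plt.plot([0, 1], [0, 1], linestyle="--")      # chance diagonal for reference
plt.xlabel("False-positive rate (FPR)")
plt.ylabel("True-positive rate (TPR)")
plt.legend()
plt.show()
```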
Recent studies show that the use of traditional methods for attack detection is decreasing and AI-based approaches are becoming more common. Hybrid models are increasingly being developed to minimize false positives. However, in this study, such hybrid approaches showed lower performance. This is thought to be due to the unique feature structure of the dataset.
As seen in Table 1, a comprehensive literature review revealed that this study achieved the highest accuracy on the WUSTL-IIoT-2021 dataset while employing a wide range of models. These results show the importance of choosing the right classification model according to dataset complexity.
6. Conclusions and Future Works
Integrating SCADA systems with IIoT has significantly increased operational efficiency in various industrial sectors. However, this integration has also introduced numerous vulnerabilities, most notably exposure to DoS attacks, underscoring the need for effective IDSs. Understanding these threats and their mitigation mechanisms is crucial to maintaining the integrity and reliability of industrial operations. SCADA systems are critical for monitoring and controlling industrial processes, but their increased exposure to external networks has made them susceptible to various cyberthreats.
This study examined the network traffic of a SCADA system built with IIoT devices, analyzing both normal and attack traffic. The analysis, conducted using machine learning, deep learning, and hybrid learning models, aims to make a significant contribution given the limited number of studies in the literature. Experimental studies were conducted in the Google Colab environment using the Apache Spark big data platform, and the performance of the models was evaluated with metrics such as accuracy, precision, recall, and F1 score. The MLP model achieved 99.99% accuracy, the CART model 99.98%, and the logistic regression model 99.86%, outperforming other methods in the literature. In this process, hyperparameter adjustments were performed for the deep learning algorithms, and parameter optimizations were performed for the machine learning algorithms. The MLP model, which had the highest accuracy rate, was trained for 10 epochs using ReLU activations in its hidden layers and a sigmoid activation at the output. It has a dropout rate of 0.5 and consists of three hidden layers containing 64, 128, and 256 units. The Adam algorithm was used for model optimization, and binary cross-entropy was adopted as the loss function.
The proposed model successfully detects security threats in the data streams generated by SCADA systems through AI-driven anomaly detection. Additionally, in-depth analyses of information security, authentication mechanisms, and security technologies are presented, offering a comprehensive framework for protecting industrial control systems. The study reinforces security measures in this critical field by addressing SCADA security challenges within the broader IoT ecosystem.
Future studies can improve model performance by examining hybrid models and applying data-balancing techniques. Class imbalance problems can be addressed by investigating the effects of over- and undersampling methods. In addition, validating the developed models on different IIoT datasets is important to assess their generalizability. Integrating techniques such as hyperparameter optimization and automatic feature selection can increase the reliability and effectiveness of intrusion detection systems.