Article

Improving Deceptive Patch Solutions Using Novel Deep Learning-Based Time Analysis Model for Industrial Control Systems

by Hayriye Tanyıldız 1,*, Canan Batur Şahin 1 and Özlem Batur Dinler 2
1 Faculty of Engineering and Natural Sciences, Malatya Turgut Ozal University, Malatya 44210, Turkey
2 Faculty of Computer Engineering, Siirt University, Siirt 56100, Turkey
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(20), 9287; https://doi.org/10.3390/app14209287
Submission received: 16 July 2024 / Revised: 2 October 2024 / Accepted: 6 October 2024 / Published: 12 October 2024
(This article belongs to the Special Issue Advances in Security, Trust and Privacy in Internet of Things)

Abstract:
Industrial control systems (ICSs) are critical components that automate the processes and operations of electromechanical systems. These systems are vulnerable to cyberattacks and can be the targets of malicious activities. With increased internet connectivity and integration with the Internet of Things (IoT), ICSs become even more exposed to cyberattacks, which can have serious consequences, such as service interruption, financial losses, and security hazards. Threat actors target these systems with sophisticated attacks that can cause devastating damage, and cybersecurity vulnerabilities in ICSs have recently led to increasing cyberattacks and malware exploits. Hence, this paper proposes a security solution with dynamic and adaptive deceptive patching strategies, building on studies on the use of deceptive patches against attackers in industrial control systems. Within the present study’s scope, brief information on the adversarial training method and window size manipulation is presented, with emphasis on how these methods can be integrated into industrial control systems and combined with deceptive patch solutions to increase cybersecurity. The discussed techniques improve network and system security by making it more challenging for attackers to predict their targets and attack methods. The acquired results demonstrate that the suggested hybrid method improves the application of deception to software patching prediction, reflecting enhanced patch security.

1. Introduction

Industrial control systems (ICSs) are computer-based systems that manage and supervise production, energy generation, water and wastewater management, transportation, and other critical infrastructure. ICSs comprise various components, such as process control systems (PCSs) and distributed control systems (DCSs), and include systems that provide centralized monitoring and control, such as supervisory control and data acquisition (SCADA) systems.
Cyberattacks on industrial control systems can seriously disrupt critical infrastructure, compromise safety systems, and cause significant financial losses. Deception techniques in cybersecurity refer to methods used to mislead attackers or manipulate their actions. These techniques can include honeypots (luring attackers into a controlled environment), honeytokens (placing false or decoy data that trigger an alert when accessed), or even disinformation campaigns to confuse potential attackers.
However, it is crucial to note that engagement in deceptive practices should be performed ethically and legally. Organizations should consult cybersecurity professionals and adhere to legal requirements or regulations when implementing such measures.
Additionally, while deception can be a valuable tool in defending against cyberthreats, it is essential not to rely on these tactics alone. Robust security measures, such as network segmentation, access controls, regular vulnerability assessments, employee training programs, and incident response plans, remain essential to maintaining the security of industrial control systems. On the one hand, the complexity and diversity of industrial control systems make traditional methods of detecting and preventing cyberattacks inadequate. On the other hand, cyberattackers are constantly developing more advanced and sophisticated attack techniques, which further increases the cybersecurity risks of ICSs.
Software security updates that use deception to influence attacker decision-making and exploit generation are called deceptive patches. One standard method of deception is the use of decoy systems or honeypots within the ICS environment. These decoy systems are designed to mimic fundamental components of the control system, such as SCADA devices or PLCs, but are isolated from critical operations. When an attacker interacts with these deceptive elements, security teams gain valuable insights into the attacker’s tactics and motivations. By implementing deception, organizations can create an additional layer of defense to mislead and confuse potential attackers. It is also possible to apply deception to software security patches to impact attackers’ decision-making process. Traditional software security patches can inadvertently aid the exploit generation process, since attackers can analyze a released patch to locate the vulnerability that it fixes. Therefore, it is essential to create and analyze solutions regarding how deception can be applied to software patches as part of a defense strategy, to help protect both the patches and the programs to which they are deployed from attacks.
In recent years, digitalization and Industry 4.0 transformation have made industrial control systems more vulnerable to cyberattacks [1]. Therefore, it is vital to consider and develop cybersecurity measures.
The study in [2] focuses on examining adversarial samples, which are used to mislead machine learning models. Adversarial samples represent specifically designed inputs that cause a model to produce erroneous predictions when classifying. Such samples can lead to security vulnerabilities and the inability of models to deal with real-world data successfully. In the mentioned paper, adversarial training also involves using adversarial samples in the model’s training process. Thus, the model learns to deal with adversarial samples, increasing the classification success.
Deceptive virtual hosts enhance cyber–physical system security in industrial control networks [3]. The referenced study stresses the importance of research and development in this field, suggesting that deceptive virtual hosts can contribute significantly to the cybersecurity of industrial control networks.
Probability-based models can assess the effectiveness and performance of cyberdeception techniques, which may help to make them more efficient and secure [4]. Probability models can be employed to predict the success rates of cyberdeception techniques, the probability of detection, and how attackers will react to these techniques; thus, cyberdeception strategies can be better optimized, and cyberdefense mechanisms can be strengthened.
Combining mobile target defense and cyberdeception techniques effectively prevents cyberattacks on IoT systems and increases their security [5]. This approach will likely reduce attackers’ success and enable cybersecurity teams to react to threats more quickly and effectively.
Cyberdeception technologies, particularly honeypots and honeytokens, can be used in hybrid cyberdefense strategies. How can these technologies increase cybersecurity? Researchers argue that such technologies help to strengthen cybersecurity by misleading cyberattackers, denying them access to real systems and data, and providing valuable intelligence for understanding attackers’ real intentions and abilities [6].
A study [7] examines the possibility of stealthy cyberattacks toward an IDS involving function code attacks, injection attacks, and reconnaissance attacks, improving its robustness to adversarial attacks. The results show that the detector’s robustness to adversarial samples increases after training on a mixture of the original dataset and newly produced samples.
A timeline analysis of the effect of deceptive patches is presented, and, finally, a formal model investigating the theoretical security of deceptive patches is analyzed. A framework employing the traditional software patching lifecycle is introduced, with additional steps added to generate different versions of the released patches. The metrics that trigger the release of the diversified patches in question are also discussed [8].
The study in [9] suggests GRN, an interpretable multivariate time series anomaly detection method based on neural graph networks and gated recurrent units (GRUs). GRN can automatically learn the possible correlations between sensors from multidimensional industrial control time series data. The experimental findings show that the model is more interpretable and provides more effective solutions.
The proposed approach in [10] combines a structure learning approach with graph neural networks and utilizes attention weights to ensure explainability for the anomalies detected. The experiments conducted on two real-world sensor datasets with ground truth anomalies demonstrate that the method detects anomalies more accurately than baseline approaches.
The current research aims to divert attackers’ attention from fundamental system components and security vulnerabilities by providing them with false or misleading information, thereby acting as a proactive defense mechanism against cybersecurity vulnerabilities in industrial control systems (ICSs). Furthermore, it proposes an artificial intelligence-based hybrid model to ensure cybersecurity in industrial control systems. The model is innovative in combining deceptive patch and window size manipulation techniques.
The present study aims to integrate deceptive patch solutions with window size manipulation and adversarial training methods and to examine this combination as a means of increasing cybersecurity in industrial control systems. Adversarial training is a method developed to detect and prevent cyberattacks against artificial neural networks and is expected to strengthen cybersecurity in ICSs when combined with deceptive patch solutions. In general, the purpose of deceptive patches in protecting ICSs is to provide an additional layer of defense against cyberattacks by deceiving attackers and providing an early warning of a potential attack. This paper first reviews the types of cyberattacks and vulnerabilities affecting industrial control systems. Then, it focuses on how adversarial training, deceptive patch solutions, and window size manipulation can be applied in industrial control systems. Finally, trials and evaluations are performed on an ICS dataset to assess the effectiveness of this integrated approach. Deceptive patching technology increases security by changing the attack surface of a system and making it more challenging for an attacker to perform an attack successfully. Window size manipulation changes how network protocols function, making it more difficult for an attacker to observe and analyze the network traffic. Combining these two techniques enables the analysis of large datasets to detect and respond to potential attacks, preventing attacks from succeeding and disrupting the attacker’s strategy.
This paper explains in detail how the suggested artificial intelligence-based hybrid model functions, how it is implemented, and the consequences. Moreover, it discusses what this model means for future cybersecurity strategies and the roadmap that it presents for further research and development.
The present study, which explains how deceptive patch solutions can be integrated with window size manipulation to increase cybersecurity in ICSs with the adversarial training method, is a significant step toward the better understanding and application of methods to create an effective cybersecurity strategy for industrial control systems.
A summary of the novelties and contributions of the present study is presented below:
  • Analyzing the suitability of deceptive patching techniques for industrial control systems, developing novel methods, and optimizing the current methods;
  • Warning system administrators about the presence of an attacker in the system with the early detection of attacks by assessing the effectiveness and performance of the methods developed;
  • Reducing the risk of system damage by diverting attackers’ attention from fundamental system components and security vulnerabilities and decreasing the risk of critical infrastructure damage;
  • Developing deceptive patching strategies for industrial control systems and applying dynamic and adaptive deception techniques;
  • Ensuring real-time threat analysis and response capabilities for system security, analyzing aggressive behavior patterns specific to industrial control systems, and optimizing the deceptive patching strategies with these models.
Our study is organized as follows. Section 2 presents the background. Section 3 describes the methods employed. Section 4 addresses the suggested approach. The experiments and their outcomes are presented in Section 5. Finally, the conclusions and future research are given in Section 6.

2. Background

Industrial control systems (ICSs) are widely integrated into our lives nowadays. They ensure that the most critical infrastructure and processes are managed more efficiently. Gas, water, manufacturing, power distribution, and transportation are ICS-dependent to ensure the daily functioning of their processes. This section briefly mentions the types of attacks on critical infrastructure, shedding light on the increasing cyberthreats to ICS devices.

2.1. Attack Types

Spyware is utilized to sabotage ICSs by infiltrating them, manipulating transactions, or accessing sensitive data. Stuxnet is a famous example of spyware designed against Iran’s nuclear facilities in 2010, which disrupted the facility’s operation by changing the speed of centrifuges. In addition, Night Dragon (2010) attackers utilized sophisticated malware to target global energy, oil, and petrochemical companies. In the Duqu/Flame/Gauss (2011) incident, highly developed and sophisticated malware was utilized to target particular organizations, such as ICS producers. In the Shamoon (2012) incident, the malware was utilized to target large energy companies in the Middle East, including RasGas and Saudi Aramco. Havex (2013) is an example of an ICS-focused malware campaign [11].
A distributed denial of service (DDoS) attack represents a malicious attempt to sabotage a network by overwhelming its capability to process legitimate requests and traffic. Consequently, the victim is denied service, which leads to expensive setbacks and downtime. A DDoS attack is a network-based attack that exploits network-based internet services, e.g., the domain name service (DNS), routers, and the network time protocol (NTP). It aims to disrupt the network devices connecting the organization to the internet. Load balancers, routers (traditional WAN and ISP edge routers), and firewalls can be listed among these devices.
A significant number of security vulnerabilities in ICSs are caused by human errors such as misconfigurations, outdated software, and user errors. Moreover, malicious insiders can also damage ICSs.
Social engineering attacks target human vulnerabilities, defrauding users and causing information leaks. Such attacks are realized especially with phishing and spear phishing attacks via e-mail.
A zero-day attack represents a software vulnerability that attackers use before the vendor realizes it. In this case, there is no patch; therefore, attackers can use the vulnerability easily since they know that defense is absent, which transforms zero-day vulnerabilities into a significant security threat. Such attacks can be used to damage ICSs.
Advanced persistent threats (APTs), which are long-term, targeted, and sophisticated attacks, aim to infiltrate ICSs, gather intelligence from inside systems, and manipulate the infrastructure. Man-in-the-middle (MITM) attacks are performed to modify, hack, or spoof data in communication channels. These attacks can be utilized to manipulate command and control messages sent to ICSs.

2.2. Cyberattack History for Industrial Control Systems

ICSs have been integrated into modern life, ensuring that the most essential processes and infrastructure are managed more efficiently. Gas, water, manufacturing, transportation, and power distribution are ICS-dependent, keeping their processes functioning daily. Table 1 summarizes studies on cybersecurity in ICSs, categorized according to the features of their security problems.

3. Deep Learning-Based Methods

3.1. Time Series Analysis for Deep Learning

Time series frequently contain temporal dependencies, causing two otherwise identical points in time to belong to different classes or to produce predictions of different behaviors. This feature usually renders their analysis more challenging. Moreover, the aim is to capture sophisticated inter-sensor relationships and to identify and explain abnormalities that deviate from these relationships. Deep learning approaches have recently enhanced anomaly detection in high-dimensional time series data (such as sensor data). Nevertheless, the current methods still do not thoroughly learn the structure of the correlations between variables or utilize it to predict the expected behavior of the time series. Deep learning models represent a very effective tool for anomaly detection in time series. Such models learn particular patterns in time series and can identify data outside these patterns as anomalies.

3.2. Deep Learning-Based Classifiers

The current study presents some newly introduced approaches to time series tasks that employ deep learning architectures. The representative algorithms that can be employed to this end include recurrent neural networks (RNNs), long short-term memory (LSTM), gated recurrent units (GRUs), convolutional neural networks (CNNs), multilayer perceptrons (MLPs), and attention mechanisms.

3.2.1. Gated Recurrent Unit (GRU)

A gated recurrent unit (GRU) is a recurrent neural network (RNN) structure. This development helps RNNs to learn long-sequence dependencies more effectively. GRU networks have two gates: a reset gate (r) that adjusts the incorporation of the new input with the previous memory and an update gate (z) that controls the preservation of the previous memory. Figure 1 shows the transition functions in the hidden units of GRUs.
The update gate assists the model in determining the previous information amount (from previous time steps). Equation (1) is employed for the current update gate.
$z_t^j = \sigma(W_z x_t + U_z h_{t-1})^j$ (1)
When Xt is fed into the network unit, it is multiplied by its weight Wz. Likewise, ht−1, which holds the information for the previous t − 1 units, is multiplied by its weight Uz. A sigmoid activation function squashes the outcome between 0 and 1. Figure 2 demonstrates the function’s behavior.
The reset gate ensures that the model decides how much of the previous information to forget. Equation (2) is utilized to compute the reset gate.
$r_t^j = \sigma(W_r x_t + U_r h_{t-1})^j$ (2)
This equation is similar to the equation of the update gate. The difference arises from the weights and from how the gate is used, which will be demonstrated in the subsequent parts to reveal precisely how the gates impact the final output. First, the reset gate is used. ĥt denotes the new memory content, which utilizes the reset gate to store the relevant information from the previous time step. It is calculated in the equation below:
$\hat{h}_t^j = \tanh(W x_t + r_t \odot (U h_{t-1}))^j$ (3)
Finally, the network must compute the vector ht, which holds the information for the current unit. The update gate determines what to collect from the current memory content ĥt and from the previous step ht−1 to form the final memory at the current time step.
$h_t^j = z_t^j h_{t-1}^j + (1 - z_t^j)\hat{h}_t^j$ (4)
Figure 1 and Figure 3 show the GRU and LSTM memory block architectures.
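To illustrate how Equations (1)–(4) map to code, the following minimal NumPy sketch performs a single GRU forward step; the weight matrices, input size, and hidden size are hypothetical placeholders rather than values from the present study.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_z, U_z, W_r, U_r, W_h, U_h):
    """One GRU forward step following Equations (1)-(4)."""
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev)               # update gate, Eq. (1)
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev)               # reset gate, Eq. (2)
    h_hat = np.tanh(W_h @ x_t + r_t * (U_h @ h_prev))     # candidate memory, Eq. (3)
    h_t = z_t * h_prev + (1.0 - z_t) * h_hat              # final hidden state, Eq. (4)
    return h_t

# Hypothetical sizes: 4 input features, 8 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
params = [rng.standard_normal((n_hid, n_in)) if i % 2 == 0 else rng.standard_normal((n_hid, n_hid))
          for i in range(6)]
h = gru_step(rng.standard_normal(n_in), np.zeros(n_hid), *params)
print(h.shape)  # (8,)
```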

3.2.2. Long Short-Term Memory (LSTM)

A long short-term memory (LSTM) neural network architecture is a specific variation of the recurrent neural network (RNN) capable of learning long-term dependencies. This primarily distinguishes it from a feed-forward neural network, which simply maps an input vector into an output vector. The feature vector representation is crucial to establishing a high-accuracy vulnerability prediction model. Equation (5) computes the network output ht.
$h_t^j = o_t^j \tanh(c_t^j)$ (5)
where ctj refers to the memory content of LSTM unit j at time t, and otj denotes the output gate for LSTM unit j at time t, which manages the exposure of the memory content. The output gate is
$o_t^j = \sigma(W_o x_t + U_o h_{t-1} + V_o c_t)^j$ (6)
where σ denotes the standard sigmoid function, and Wo, Uo, and Vo represent the weight, uniform, and diagonal matrices for the output gate, respectively. Each Xt refers to the input at time step t during training, and ht−1 contains the information from the previous unit at time step t − 1.
$c_t^j = f_t^j c_{t-1}^j + i_t^j \hat{c}_t^j$ (7)
Ĉt represents the new memory content; the memory cell content ct is partially updated from the existing memory and partially from this new memory content. The new memory content is computed below, where Wc and Uc are the weight and uniform matrices for the new memory content.
$\hat{c}_t^j = \tanh(W_c x_t + U_c h_{t-1})^j$ (8)
The forget gate ftj modulates the degree to which the existing memory is forgotten for LSTM unit j at time t, and the input gate itj modulates the degree to which new memory content is added to the memory cell for LSTM unit j at time t. Wf, Uf, Vf, Wi, Ui, and Vi represent the weight, uniform, and diagonal matrices for the forget and input gates.
$f_t^j = \sigma(W_f x_t + U_f h_{t-1} + V_f c_{t-1})^j$ (9)
$i_t^j = \sigma(W_i x_t + U_i h_{t-1} + V_i c_{t-1})^j$ (10)
LSTM can automatically learn syntactic and semantic features, representing long-term dependencies. Long short-term memory (LSTM) is a recurrent neural network that maps an input vector sequence into an output vector sequence. At every step t, the LSTM unit reads the input Xt, the past output state ht−1, and the memory ct−1 and uses a set of model parameters to compute the output state ht.
Although LSTM units usually display more powerful performance [24], GRU units are more computationally efficient since they have fewer parameters and require shorter training times.
LSTM [25], introduced by Hochreiter and Schmidhuber, is a frequently employed model to detect anomalies in time series. LSTM represents a type of recurrent neural network (RNN) developed to analyze time series data. LSTM can model long-term dependencies in time series and thus detect anomalies.
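Analogously, the following NumPy sketch implements one LSTM step following Equations (5)–(10); the dictionary of weight matrices and the dimensions are illustrative assumptions, not parameters from the present study.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following Equations (5)-(10); p is a dict of weights."""
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["V_f"] * c_prev)  # forget gate, Eq. (9)
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["V_i"] * c_prev)  # input gate, Eq. (10)
    c_hat = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev)                    # new memory content, Eq. (8)
    c_t = f_t * c_prev + i_t * c_hat                                       # memory cell, Eq. (7)
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["V_o"] * c_t)     # output gate, Eq. (6)
    h_t = o_t * np.tanh(c_t)                                               # output state, Eq. (5)
    return h_t, c_t

# Hypothetical sizes: 4 input features, 8 hidden units; V_* are diagonal matrices stored as vectors.
rng = np.random.default_rng(1)
n_in, n_hid = 4, 8
p = {f"W_{g}": rng.standard_normal((n_hid, n_in)) for g in "fico"}
p.update({f"U_{g}": rng.standard_normal((n_hid, n_hid)) for g in "fico"})
p.update({f"V_{g}": rng.standard_normal(n_hid) for g in "fio"})
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), p)
```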

3.2.3. Convolutional Neural Network (CNN)

The CNN is a popular deep learning model for image processing. Nevertheless, it can also be utilized to detect anomalies in time series. The CNN can detect specific patterns in time series and identify data outside these patterns as anomalies.
The CNN applies the basic concept of the neural network (NN) algorithm with more layers. The CNN utilizes a convolution layer capable of addressing spatial information in images, whereas fully connected layers have memory for information storage in time series data. One difference between computer vision problems and time series ones is the input to the model, which is an image matrix for computer vision and a 1D array for time series forecasting. The observation sequence can treat the raw input data as a 1D array, which the CNN model can read and filter. In this way, it is possible to implement the mentioned principle in time series analysis [26].
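As a brief illustration of treating a window of raw sensor readings as a 1D input to a CNN, the following PyTorch sketch defines a minimal convolutional model; the layer sizes, window length, and feature count are assumptions for demonstration only, not the configuration used in the study.

```python
import torch
import torch.nn as nn

# Minimal 1D CNN for windowed time series; sizes are illustrative.
class TimeSeriesCNN(nn.Module):
    def __init__(self, n_features: int, window: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1),  # filters slide along the time axis
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.head = nn.Linear(32 * (window // 2), 1)  # anomaly score / next-value prediction

    def forward(self, x):            # x: (batch, n_features, window)
        z = self.conv(x)
        return self.head(z.flatten(1))

model = TimeSeriesCNN(n_features=10, window=60)
out = model(torch.randn(8, 10, 60))   # a batch of 8 windows
print(out.shape)                      # torch.Size([8, 1])
```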

3.2.4. Multilayer Perceptron (MLP)

A multilayer perceptron is a type of artificial neural network (ANN) model. In its simplest form, an MLP comprises an input layer, one or more hidden layers, and an output layer. The nodes or neurons in each layer are connected to all neurons in the next layer. Hence, this type of network is also called a “fully connected” or “dense” network.
The main characteristic of an MLP is that it has one or more hidden layers, each with an activation function (usually a ReLU or sigmoid function). This allows the MLP to model complex features and enables the network to solve nonlinear problems.
Time series forecasting and prediction are among the main problems of function approximation. The MLP comprises a hierarchy of processing units organized in a series of two or more mutually exclusive layers of neurons. The input layer acts as a holding site for the input presented to the network, and the output layer is the point at which the overall mapping of the network input is produced. The back-propagation algorithm is the most common algorithm for MLP learning [27].
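A minimal sketch of a fully connected (dense) network for one-step time series forecasting, trained with back-propagation, is given below; the layer sizes and the window length of 30 are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal fully connected (dense) network for one-step time series forecasting.
mlp = nn.Sequential(
    nn.Linear(30, 64),   # input layer: a window of 30 past values
    nn.ReLU(),           # hidden layer with a nonlinear activation
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),    # output layer: the predicted next value
)

x = torch.randn(16, 30)                               # 16 windows of 30 time steps
loss = nn.functional.mse_loss(mlp(x), torch.randn(16, 1))
loss.backward()                                       # back-propagation computes the gradients
```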

3.2.5. Attention Mechanisms

Attention mechanisms determine the parts of the input data that a model should focus on. They bring the essential parts of the input data that contain critical information for learning to the forefront. Thus, the model gives more weight to the essential parts of the data in the learning process and can produce better and more accurate predictions [28].
The attention mechanism helps models to learn long-term dependencies and complex relationships more effectively, particularly in models working on sequential data (text, time series, etc.).
The above-mentioned mechanisms have become popular with Transformer models, predominantly used for natural language processing. However, these mechanisms can also be employed in time series estimation. Attention mechanisms become essential, especially in multivariate time series predictions.
In summary, incorporating attention mechanisms into models assists in increasing the performance and flexibility of models, in transferring learning between tasks, and in solving larger and more complex problems. Therefore, their use has become popular in artificial intelligence and deep learning.
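The following sketch shows the core scaled dot-product attention computation applied to a multivariate window as self-attention; the window length and feature count are hypothetical, and the function is a generic illustration rather than the model used in the study.

```python
import torch
import torch.nn.functional as F

def dot_product_attention(q, k, v):
    """Weights each time step of v by how relevant the keys k are to the queries q."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)  # similarity between queries and keys
    weights = F.softmax(scores, dim=-1)                     # attention weights sum to 1 per query
    return weights @ v, weights

# Hypothetical multivariate window: 60 time steps, 10 sensor features.
x = torch.randn(1, 60, 10)
context, attn = dot_product_attention(x, x, x)              # self-attention over the window
print(context.shape, attn.shape)  # torch.Size([1, 60, 10]) torch.Size([1, 60, 60])
```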

3.2.6. Generative Adversarial Networks (GANs)

GANs represent a class of artificial intelligence algorithms employed in unsupervised machine learning. They are applied by a system of two neural networks competing with each other within the framework of a zero-sum game. GANs typically comprise two main components: a generator and a discriminator. The generator generates novel data samples, whereas the discriminator assesses them in terms of authenticity. The generator iteratively improves by trying to trick the discriminator with ever-better synthetic data.
GANs were first applied to visual data, but their ability to model complex, high-dimensional distributions makes them suitable for time series data. When applied to time series data, GANs can create new synthetic sequences that mimic the statistical properties of the original data. Within the context of time series, a GAN can be trained on a set of historical data points (e.g., stock prices, weather forecast models, etc.) to create new series that are statistically similar to the input data.

3.2.7. Variational Autoencoder (VAE)

A variational autoencoder (VAE) represents an unsupervised and generative autoencoder model that forces the distribution of vectors in a hidden space to follow a normal distribution.
Like autoencoders, VAEs consist of two main parts: an encoder and a decoder. However, unlike autoencoders, the encoder part of a VAE learns the parameters that represent the probability distribution of the data, rather than the original data. These parameters usually represent a mean and a standard deviation and can be represented differently in different VAE models.
The variational autoencoder (VAE) is the probabilistic sibling of the autoencoder, a Bayesian neural network that attempts to reconstruct the parameters of the distribution of the (selected) output rather than the original input itself. An anomaly score can then be designed to correspond to the probability of an anomaly.
Variational autoencoders (VAEs) differ from GRU-based models. A VAE consists of an encoder, a decoder, and a loss function that encourages learned encodings to approximate a prior distribution.

3.2.8. Asynchronous Advantage Actor–Critic (A3C)

The basic structure of A3C has two components: an actor and a critic. The actor identifies the action to take in a given environment, whereas the critic assesses how appropriate this action is.
Its asynchronous structure allows many copies of the learning agent to run simultaneously. Each agent updates the policy (in other words, the set of rules that determines which actions are to be taken in which situations) independently of the others. This accelerates the learning process and helps to obtain a broader range of experience.
Another essential feature of A3C is the advantage function, which helps to predict how influential each action will be on future rewards. These more accurate predictions lead to a faster and more stable learning process.
In conclusion, the A3C algorithm is a reinforcement learning algorithm known for performing parallel learning on multiprocessor systems and learning faster and more efficiently.

4. Proposed Model

4.1. Proposed Hybrid Model

In the present study, deceptive patch solutions and window size manipulation are combined into an innovative hybrid technique developed to prevent and detect cyberattacks. Window size manipulation is a technique for the more effective detection of anomalies and attacks in the time series data used in cybersecurity analysis. The addition of noise is often used to increase a model’s robustness and generalization ability; it creates fake data that resemble the real data, making it more challenging for attackers to identify the real data and thereby increasing cybersecurity. The proposed model aims to show that deceptive patch solutions and window size manipulation can both be used in the field of cybersecurity and are complementary to each other. In the context of DoS attacks, window size manipulation can refer to the TCP window size, which determines the amount of data that can be sent before an acknowledgment is received; by manipulating this window size, attackers can potentially amplify their attacks by sending more data without waiting for acknowledgments. Instead of classical deceptive patch models with adversarial attacks, minimal and hardly detectable changes are added to the dataset obtained by applying window size manipulation. Such minimal and hardly detectable changes also mirror the stealthy tactics used by attackers to make their DoS attacks harder to detect. The effects of manipulating the window size are as follows.
A large window size allows the model to understand a broader context. For example, a large window size in a time series forecast can help the model to learn about long-term dependencies and trends. In contrast, a small window size can help the model to learn more general features because it relies on less contextual information. However, it can also prevent the model from missing critical long-term dependencies or complex structures.
The interval-based selection of the window size value aims to choose the window size that yields the most accurate result for the model. Effectively adjusting the window size is critical for the early detection of security breaches and the ability to respond accurately. In training on the created dataset, attention mechanisms, which are among the deep learning models previously run on the dataset, were preferred in order to benefit from a model structure that determines which parts of the input data should be focused on and helps to obtain more accurate and effective results. The training and testing process was performed using the model developed with the attention mechanism algorithm; a sketch of the windowing step is given below.
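The sketch below illustrates how overlapping windows and next-step targets could be cut from a multivariate sensor log for several candidate window sizes; the synthetic data, the predicted feature, and the forecasting horizon are assumptions, not the study’s actual preprocessing code.

```python
import numpy as np

def make_windows(series: np.ndarray, window: int, horizon: int = 1):
    """Slice a (time, features) array into overlapping windows and next-step targets."""
    X, y = [], []
    for start in range(len(series) - window - horizon + 1):
        X.append(series[start:start + window])
        y.append(series[start + window + horizon - 1, 0])  # predict feature 0, `horizon` steps ahead
    return np.stack(X), np.array(y)

# Hypothetical sensor log: 1000 time steps, 10 sensors.
data = np.random.rand(1000, 10)
for window in (20, 40, 60, 80, 100, 120):   # candidate window sizes, as in the evaluated intervals
    X, y = make_windows(data, window)
    print(window, X.shape, y.shape)
```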
The model shown in Figure 4 is trained using deceptive patching technology and window size manipulation techniques. Combining these two methods, the hybrid model provides a more comprehensive and flexible cybersecurity solution. This hybrid model can make ICSs more resilient to cyberthreats, while helping cybersecurity experts to manage threats more quickly and effectively.
The methods used in the proposed framework are explained briefly in the following sections.

4.1.1. Window Size Manipulation

The window size is a TCP/IP parameter that regulates network traffic; it determines how much data can be sent at a time over a network connection. Window size manipulation can be used to detect abnormal behavior in network traffic and to prevent cyberattacks. Malicious users can manipulate the window size to perform denial of service (DoS) attacks. In the TCP/IP protocol, the window size determines how many data segments can be sent without acknowledgment. A malicious user can overload the target system or corrupt the network traffic by changing this window size. For example, the “TCP Zero Window” attack is such an attack type: the attacker sets the window size advertised by the receiver to zero, causing the sender to stop sending data and essentially freezing the network connection. This situation can take the target server out of service or severely disrupt the network traffic. Moreover, an attacker can acquire important information about the state and functioning of a target system by monitoring and analyzing the network traffic; we can disrupt this flow of information and render the attacker’s actions more difficult by manipulating the window size. The AI model can be trained to identify actions that violate normal functioning in the ICS dataset. If an anomaly is detected, the model can manipulate the window size in the ICS’s network traffic, making it challenging for an attacker to monitor the flow of information in the system.
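As a simple, rule-based illustration of monitoring the advertised TCP window size (not the AI model described above), the following pandas sketch flags senders that repeatedly advertise a zero receive window; the column names, addresses, and threshold are hypothetical.

```python
import pandas as pd

# Hypothetical packet-capture summary with per-packet TCP header fields.
packets = pd.DataFrame({
    "src": ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.5"],
    "dst": ["10.0.0.20"] * 4,
    "tcp_window": [8192, 0, 4096, 0],
})

# Flag senders that repeatedly advertise a zero receive window (possible TCP Zero Window abuse).
zero_window_counts = packets[packets["tcp_window"] == 0].groupby("src").size()
suspects = zero_window_counts[zero_window_counts >= 2]
print(suspects)
```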

4.1.2. Adversarial Training Methods

Adversarial training is a method developed to detect and prevent malicious attacks on neural networks. It is based on creating malicious samples that target the classifier in the learning process and training the classifier with these samples. Thus, the classifier becomes more resistant to cyberattacks and produces more reliable predictions [11,29].
Adversarial training was employed to increase the model’s resilience and generalization ability. In this process, adversarial samples were created and included in the training dataset.

Fast Gradient Sign Method (FGSM) Training Method

The fast gradient sign method (FGSM) is a method employed to generate adversarial samples in machine learning. Adversarial samples render the original data samples misleading with minor but deliberate modifications.
The FGSM generates adversarial samples by computing the gradient of a data sample and checking the sign of the gradient (i.e., whether it is positive or negative). The FGSM is popular because of its low computational cost and is mainly used to test classification models’ vulnerabilities. It minimally alters an input image using the model’s gradient, thereby misleading the model. These changes are usually invisible to the human eye but can considerably impact the model’s classification accuracy.
In the present study, adversarial samples were created using the fast gradient sign method (FGSM).
The mathematical model of the FGSM attack is specified in Equation (11):
$X' = X + \varepsilon \cdot \mathrm{sign}(\nabla_X J(X, Y_{true}))$ (11)
Here, X′ refers to the adversarial sample, X denotes the original input, ε represents a small step size, and ∇X J(X, Ytrue) refers to the gradient of the loss function with respect to X. The sign function takes the sign of the gradient and thus takes a step in the direction that produces the most significant change.
The adversarial samples created are added to the standard training dataset, and the model is trained on this extended dataset. This approach makes the model more resilient to adversarial attacks because the model encounters adversarial samples during the training process and learns to classify them correctly.
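A minimal PyTorch sketch of this procedure is given below: an adversarial sample is generated per Equation (11), and the model is then trained on clean and perturbed inputs together. The structure of the training step is an assumption; only the FGSM formula and the idea of mixing adversarial samples into training follow the text above.

```python
import torch

def fgsm_example(model, loss_fn, x, y, epsilon=0.01):
    """Generate an adversarial sample X' = X + eps * sign(grad_X J(X, y)), per Equation (11)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def adversarial_train_step(model, loss_fn, optimizer, x, y, epsilon=0.01):
    """One training step on the extended dataset: clean windows plus FGSM-perturbed windows."""
    x_adv = fgsm_example(model, loss_fn, x, y, epsilon)
    optimizer.zero_grad()                                    # clear gradients from sample generation
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)   # clean + adversarial loss
    loss.backward()
    optimizer.step()
    return loss.item()
```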

Projected Gradient Descent (PGD) Training Method

PGD, namely the projected gradient descent attack, involves adversarial sample generation against deep learning models. These samples are added to the standard training dataset, and the model is trained on this new dataset. The above-mentioned method can help to make the model more resilient to adversarial attacks.
The PGD attack formulates the adversarial sampling process as an optimization problem. This iteratively updates the input to maximize the output of a target model along one aspect of the input sample. This process keeps the input within a specific area, so that it only moves slightly through a step.
This process is applied by adding a series of PGD attacks to the model’s training dataset, which helps the model to learn to recognize and correctly classify adversarial samples. Hence, the model’s overall resilience to adversarial attacks increases.
The PGD method ensures an iterative solution to an optimization problem. While aiming to find the minimum or maximum of a function, the PGD moves in the negative direction of the gradient at each step. However, it also applies a projection operation to keep the solution within a specific area at each step. This ensures that the solution remains within a particular set of constraints. In adversarial attacks, PGD is generally employed to make minor changes to the model’s input, which can cause the model to make an error. In this case, the optimization goal is to maximize the model’s probability of making an error.
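The following sketch illustrates the PGD idea under these assumptions: repeated gradient-sign steps, each followed by a projection back into an ε-ball around the original input. The parameter values mirror those reported later for the attention mechanism + PGD model (epsilon = 0.01, alpha = 0.01, num_steps = 3), but the loop itself is a sketch rather than the study’s code.

```python
import torch

def pgd_example(model, loss_fn, x, y, epsilon=0.01, alpha=0.01, num_steps=3):
    """Iterative FGSM-style steps, each projected back into the epsilon-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()              # gradient ascent step
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)  # projection onto the constraint set
        x_adv = x_adv.detach()
    return x_adv
```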

MIA Training Method

The momentum iterative attack (MIA) is usually used to model security attacks against neural networks. It mainly belongs to the adversarial attack type, which includes the misuse of artificial learning models, which can be deceptively successful. The MIA makes minor modifications to the input data to mislead the model’s intended output.
The MIA aims to create a more effective adversarial sample by adding a momentum term to iterative attacks. An iterative attack attempts to mislead the model’s response by manipulating the model’s predictions step by step. The momentum term ensures consistency between these steps and makes the attack more effective. In other words, the primary purpose of the MIA is to mislead a model’s decision boundaries and direct the model to produce erroneous predictions.
However, it is essential to understand such attacks and determine how they function to make a model more robust and resilient. By simulating such attacks, we can determine how vulnerable a model is to them, thus making the model more resilient to them. This type of attack is usually employed to assess a model’s security and resilience. If a model can correctly classify adversarial samples, this model is considered safe and robust.
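A sketch of the momentum iterative idea is shown below: normalized gradients are accumulated across steps with a decay factor, which keeps successive steps consistent. The normalization by the mean absolute gradient and the decay factor mu = 1.0 are assumptions for illustration.

```python
import torch

def mia_example(model, loss_fn, x, y, epsilon=0.001, alpha=0.01, num_steps=3, mu=1.0):
    """Momentum iterative attack sketch: accumulate normalized gradients with decay factor mu."""
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        with torch.no_grad():
            grad = x_adv.grad
            g = mu * g + grad / grad.abs().mean().clamp(min=1e-12)  # momentum term across steps
            x_adv = x_adv + alpha * g.sign()
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)   # stay within the allowed perturbation
        x_adv = x_adv.detach()
    return x_adv
```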

Jacobian-Based Saliency Map Attack (JSMA) Training Method

The Jacobian-based saliency map attack (JSMA) represents an adversarial attack usually utilized against deep learning models. The attack changes a model’s input in a tiny and targeted way to redirect the model’s output to a specific target.
The name JSMA arises from two main concepts: the Jacobian matrix and the saliency map. The Jacobian matrix measures the impact of each function input on the output. The saliency map is a map showing the impact of the input on the output. The JSMA utilizes the Jacobian matrix to mislead the output of a model. First, it measures the impact of each input feature on the output. Afterward, the features affecting the output the most are determined, and these features are changed in a targeted way. The changes in question are generally minimal and invisible to the human eye but can considerably impact the model’s output [30].
The JSMA is generally employed as an attack method but can also make a model more robust. It is possible to utilize the JSMA to understand how a model deals with misleading samples and enhance its performance on such samples. The process mentioned above is commonly known as adversarial training.
Adversarial training involves creating misleading samples of a model and adding these examples to the model’s training set. The model then learns to classify these misleading samples correctly. The JSMA can be employed to create these misleading samples.

CWL2 Attack Training Method

The CWL2 attack represents an adversarial attack technique developed by Nicholas Carlini and David Wagner in 2017 [11]. The attack is used to mislead learning models by minimally altering their inputs so that the model yields an incorrect output.
The CWL2 attack changes the input data to direct the model to a target output. For example, in an image classification model, an attacker can cause an image to appear to belong to a different class by making small, usually invisible changes. These changes are generally invisible to the human eye but considerably impact the model’s output.
The term “L2” in the CWL2 attack name indicates that this attack uses the L2 norm (or Euclidean distance) to measure the extent of the changes in the input data. The L2 norm refers to the linear (Euclidean) distance between two points or vectors. The CWL2 attack employs this measure when comparing the original and modified inputs.

5. Experimental Results

The current study examines the performance of deceptive patch solutions for cybersecurity in industrial control systems by integrating them with adversarial training and window size manipulation. The acquired results demonstrate that this integrated approach can strengthen the cybersecurity of industrial control systems by significantly increasing the success of detecting and preventing cyberattacks by deceiving attackers. Hence, these methods create an effective cybersecurity strategy for industrial control systems.

5.1. Dataset

The HAICon2021 [31,32] dataset was used in the present study. The ICS dataset was analyzed to identify essential characteristics, such as sensor data, control data, and configuration files. The HAICon2021 dataset is designed for anomaly detection in industrial control systems. It includes data collected from various sensors and actuators and covers normal operating data as well as data with injected anomalies. The dataset contains time series data collected from multiple sensors, which can be used to build machine learning models that detect abnormal system behaviors.

5.2. Results and Discussion

This study was developed in Python 3.10 on the Google Colab Pro platform. The hardware infrastructure was supported by an NVIDIA A100 GPU with 2 × 16 GB (32 GB) of RAM. We evaluated the performance results for the standard models in terms of the precision, accuracy, recall, F1, mean absolute error (MAE), median absolute error (MAD), and maximum absolute error metrics.
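For reference, the following sketch computes the listed metrics with scikit-learn and NumPy on hypothetical anomaly labels and scores; it illustrates the metric definitions only and is not the study’s evaluation script.

```python
import numpy as np
from sklearn.metrics import precision_score, accuracy_score, recall_score, f1_score

def report(y_true, y_pred, scores_true, scores_pred):
    """Classification metrics on anomaly labels plus error metrics on the raw anomaly scores."""
    errors = np.abs(np.asarray(scores_true) - np.asarray(scores_pred))
    return {
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        "mae": errors.mean(),        # mean absolute error
        "mad": np.median(errors),    # median absolute error
        "max_ae": errors.max(),      # maximum absolute error
    }

# Hypothetical example with 5 windows.
print(report([0, 1, 0, 1, 1], [0, 1, 0, 0, 1],
             [0.1, 0.9, 0.2, 0.8, 0.7], [0.2, 0.8, 0.1, 0.3, 0.9]))
```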

5.2.1. Patch Applications for Standard Models

The patch performance results for the standard models are presented in Table 2. The parameters used for the standard models are noise_std = 0.01, N_Hiddens = 100, N_Layers = 3, and BATCH_SIZE = 512. Moreover, the hidden_dim = 256, latent_dim = 20, and 0.01 parameters are set for the VAE method. The LSTM model performed with low precision and accuracy values. The precision, accuracy, and sensitivity scores of the CNN model were higher, which indicates that this model could perform better; however, its MAE score was the highest, indicating larger fluctuations in its predictions. Although the accuracy score of the MLP model is very high, its precision, sensitivity, and F1 score are rather low, which may indicate that most of the model’s correct predictions are negative samples. The Transformer is a model with a very high sensitivity score; however, its precision and accuracy values could be higher. The precision and accuracy scores of the attention mechanism model are higher than those of the other models, but its sensitivity score is low, indicating that the model has difficulty in detecting all positive cases. The RESNET model has a high precision score, which indicates that most of its positive predictions are correct, and its accuracy score is also at an acceptable level. The random forest model excels in terms of its accuracy, precision, and F1 score, and its MAD and MAE values are low, indicating fewer errors. The Gaussian naive Bayes model has a high accuracy score, but its precision and sensitivity values are below average, which may indicate that most of its correct predictions are negative samples; its ability to detect positive situations is strong, but these predictions can often be false positives. The performance of this model could be improved.

5.2.2. Patch Applications for Proposed Hybrid Model

The proposed hybrid model was assessed based on determined window intervals of 20–40, 40–60, 60–80, 80–100, 100–120, and 120–140 for each adversarial training method.
Window Size Manipulation + Attention Mechanism + FGSM
In this model, window size manipulation was applied to the model created by applying the attention mechanism algorithm, and noise was added to the data using the FGSM method.
After these trials, we observed that the optimum window size for this model was in the 80–100 range, as represented in Table 3. When evaluated based on the mean absolute error (MAE), which reflects the average magnitude of the errors in the model’s predictions, we observe that the model achieves its most accurate predictions within this window size range. Some improvement in the model’s overall performance may have been observed with window size manipulation. However, it is essential to note that the sensitivity and precision values of the model are still low, and the maximum absolute error value is high. This indicates that the model may be more susceptible to certain types of errors. The parameters of the proposed attention mechanism + FGSM model are as follows: batch_size = 32, n_epochs = 32, epsilon = 0.01, optimizer = AdamW optimizer, loss function = mean squared error (MSE). The loss value is calculated between the model’s prediction and the true answer.
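The sketch below assembles these pieces into one self-contained training loop: windowed inputs, a small attention-based forecaster, FGSM noise with epsilon = 0.01, the AdamW optimizer, MSE loss, and a batch size of 32, as listed above. The model architecture, the synthetic data, and the reduced epoch count are illustrative assumptions rather than the study’s released implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class AttentionForecaster(nn.Module):
    """Small attention-based forecaster; the architecture is an assumption for illustration."""
    def __init__(self, n_features=10, window=90, d_model=32):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                       # x: (batch, window, n_features)
        z = self.proj(x)
        z, _ = self.attn(z, z, z)               # self-attention over the window
        return self.head(z[:, -1])              # predict the next value from the last position

def fgsm(model, loss_fn, x, y, eps):
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

X, Y = torch.randn(256, 90, 10), torch.randn(256, 1)        # synthetic windowed data (window in 80-100)
model, loss_fn = AttentionForecaster(), nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters())
loader = DataLoader(TensorDataset(X, Y), batch_size=32, shuffle=True)

for epoch in range(2):                                       # n_epochs = 32 in the reported setup
    for xb, yb in loader:
        xb_adv = fgsm(model, loss_fn, xb, yb, eps=0.01)      # FGSM noise injection
        optimizer.zero_grad()
        (loss_fn(model(xb), yb) + loss_fn(model(xb_adv), yb)).backward()
        optimizer.step()
```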
Window Size Manipulation + Attention Mechanisms + Projected Gradient Descent (PGD)
The hybrid model performance based on the PGD method is shown in Table 4. These results show that increasing the window size into the (100–120) range significantly affects the model’s performance. The precision, accuracy, and sensitivity metrics are slightly reduced compared to the previous range (80–100). The F1 score is still low, indicating that the model fails to strike an ideal balance between precision and sensitivity. The model’s error metrics have decreased somewhat, indicating that its predictions are closer to the actual values. These results show that the window size significantly affects the model performance and that this hyperparameter must be carefully selected to optimize the model’s overall performance. The proposed attention mechanism + PGD model’s parameters are epsilon = 0.01, alpha = 0.01, and num_steps = 3.
Window Size Manipulation + Attention Mechanism + CWL2
The hybrid model performance based on the CWL2 method is presented in Table 5. The results show that the CWL2 method is unsuitable for this model and dataset. The model’s performance metrics are low, and the error metrics are very high, indicating that the model’s predictions are often incorrect and frequently very different from the actual values. These results show that the model cannot learn effectively on this dataset and that its predictions are often inaccurate. The parameter of the proposed attention mechanism + CWL2 model is epsilon = 0.001. The epsilon value determines the step size during the Carlini–Wagner (CW) attack and optimization in the training function; it is used as the learning rate (lr) in the CW attack.
Window Size Manipulation + Attention Mechanism + JSMA
The hybrid model performance based on the JSMA method is presented in Table 6. The results show that the metrics for the different window sizes are low. For example, the sensitivity obtained in the range of (20–40) is 0.079%, the accuracy is 0.001%, and the F1 score is 0.058%, which are low values. This shows that the model’s classification ability is weak and that it performs poorly on these data. Especially in the range of (40–60), the accuracy and sensitivity values of zero reveal that the model cannot produce any correct predictions in this data range. The error metrics also reach rather high values, showing that the model’s predictions are generally very far from the true values and that the model’s learning process is ineffective. These results show that the JSMA attack is unsuitable for this model and dataset. The model cannot learn effectively on this dataset, and its predictions are often incorrect. For the model’s predictions to become more accurate, a more suitable dataset may be needed, or the model may need to be given a deeper and more complex structure. As a result, the current model’s performance on this dataset is relatively poor, and it requires further development and improvement. The parameter used in the proposed attention mechanism + JSMA model is epsilon = 0.001. This is a hyperparameter that determines how large the parameter updates become during optimization. In the JSMA attack, the lr parameter, which is the learning rate of the optimizer (AdamW), is also set to this value.
Window Size Manipulation + Attention Mechanism + MIA
When Table 7 is examined, it is found that, as the window size increases, the precision of the model generally decreases. This may indicate that the model’s predictions are less accurate when operating over a more extensive data window. In particular, a drop in sensitivity is evident between the (80–100) and (100–120) ranges. The accuracy value is usually high but fluctuates across window sizes; for example, the accuracy is 68.9% and 73.8% for window sizes (20–40) and (80–100), respectively, whereas these values are lower for window sizes (60–80) and (120–140). The recall value generally increases as the window size increases, indicating that the model’s ability to capture positive cases generally improves when working on a larger data window. The F1 score generally decreases as the window size increases, indicating that the balance between precision and recall often deteriorates as the window size increases. The mean absolute error (MAE), median absolute error, and maximum absolute error generally decrease as the window size increases, indicating that the model’s predictions are often closer to the true values when operating over a more extensive data window. The parameters of the proposed attention mechanism + MIA model are as follows: epsilon = 0.001, alpha = 0.01, num_steps = 3.

6. Conclusions

This paper presents deceptive patch solutions for cybersecurity in industrial control systems with adversarial training and integrated window size manipulation methods. These methods could be integrated to increase the cybersecurity of industrial control systems and ensure their resistance against adversarial attacks. The present study is a step toward developing more effective and reliable solutions in cybersecurity. The adversarial training method and deceptive patch solutions offer a more effective cybersecurity defense strategy in industrial control systems. The mentioned integration works with the current methods used to detect and block cyberattacks, increasing the scope and effectiveness of cybersecurity measures. Finally, as a result of assessing the cybersecurity performance of the integrated method and discussing its effectiveness, it was observed that the suggested hybrid method was quite successful.
We aim to obtain more successful results in error detection by adding more features to increase the model’s performance and working with novel deep learning architectures on more comprehensive datasets in future research.

Author Contributions

Conceptualization, C.B.Ş. and H.T.; methodology, H.T. and C.B.Ş.; formal analysis, Ö.B.D.; investigation, H.T. and Ö.B.D.; writing—original draft preparation, C.B.Ş.; writing—review & editing, Ö.B.D. and C.B.Ş.; visualization, H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Malatya Turgut Özal University Scientific Research Projects Coordination Unit under grant number 24Y05.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available in a publicly accessible DACON repository. Available online: https://github.com/icsdataset/hai, accessed on 20 March 2024.

Acknowledgments

Thanks to Malatya Turgut Özal University, Scientific Research Projects Coordination Unit.

Conflicts of Interest

There is no conflict of interest regarding authorship.

References

  1. Stouffer, K.; Pillitteri, V.; Lightman, S. Guide to Industrial Control Systems (ICS) Security. NIST Special Publication 800-82 Revision 2. National Institute of Standards and Technology. 2015. Available online: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-82r2.pdf (accessed on 27 September 2024).
  2. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  3. Vollmer, T.; Manic, M. Cyber-Physical System Security With Deceptive Virtual Hosts for Industrial Control Networks. IEEE Trans. Ind. Inform. 2014, 10, 1337–1347. [Google Scholar] [CrossRef]
  4. Ge, M.; Cho, J.-H.; Kim, D.; Dixit, G.; Chen, I.-R. Proactive Defense for Internet-of-things: Moving Target Defense With Cyberdeception. ACM Trans. 2021, 22, 1–31. [Google Scholar] [CrossRef]
  5. Qin, X.; Jiang, F.; Cen, M.; Doss, R. Hybrid Cyber Defense Strategies Using Honey-X: A Survey. Comput. Netw. 2023, 230, 109776. [Google Scholar] [CrossRef]
  6. Chen, J.; Gao, X.; Deng, R.; He, Y.; Fang, C.; Cheng, P. Generating Adversarial Examples Against Machine Learning-Based Intrusion Detector in Industrial Control Systems. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1810–1825. [Google Scholar] [CrossRef]
  7. Buchanan, S.S. Cyber-Attacks to Industrial Control Systems since Stuxnet: A Systematic Review; Capitol Technology University ProQuest Dissertations Publishing: Laurel, MD, USA, 2022; p. 29163646. [Google Scholar]
  8. Mekdad, Y.; Bernieri, G.; Conti, M.; El Fergougui, A. The Rise of ICS Malware: A Comparative Analysis. In Computer Security. ESORICS 2021 International Workshops; ESORICS 2021. Lecture Notes in Computer Science 2022; Springer: Cham, Switzerland, 2022; Volume 13106. [Google Scholar] [CrossRef]
  9. Deng, A.; Hooi, B. Graph Neural Network-Based Anomaly Detection in Multivariate Time Series. Computer Science. arXiv 2021, arXiv:2106.06947. [Google Scholar] [CrossRef]
  10. Zhang, Y.; Chen, Y.; Wang, J.; Pan, Z. Unsupervised deep anomaly detection for multi-sensor time-series signals. IEEE Transactions on Knowledge and Data Engineering. arXiv 2021, arXiv:2107.12626. [Google Scholar]
  11. Yuan, X.; He, P.; Zhu, Q.; Li, X. Adversarial examples: Attacks and defenses for deep learning. arXiv 2019, arXiv:1712.07107. Available online: https://arxiv.org/abs/1712.07107 (accessed on 5 October 2024).
  12. Hassani, P. Implementing Patch Management Process, Bachelor’s Thesis, 2020, School of Technology Degree Programme in Information and Communication Technology. Available online: https://www.theseus.fi/handle/10024/341620 (accessed on 15 January 2023).
  13. Yantz, M. Importance of Patch Management to Avoid Business Vulnerabilities. 2019. Available online: https://itsupportguys.com/it-blog/importance-of-patch-management-to-avoid-business-vulnerabilities/ (accessed on 31 March 2020).
  14. Koskenkorva, H. The Role of Security Patch Management in Vulnerability Management. Master’s Thesis, Master of Engineering Cybersecurity 2021, South-Eastern Finland University of Applied Sciences, Kouvola, Kymenlaakso, 2021. [Google Scholar]
  15. Panetta, K. Gartner’s Top 10 Security Projects for 2020-2021. Blog. Updated 15 September 2020. Available online: https://www.gartner.com/smarterwithgartner/gartner-top-security-projects-for-2020-2021/ (accessed on 15 January 2023).
  16. Olswang, A.; Gonda, T.; Puzis, R.; Shani, G.; Shapira, B.; Tractinsky, N. Prioritizing vulnerability patches in large networks. Expert Syst. Appl. 2022, 193, 116467. [Google Scholar] [CrossRef]
  17. Corallo, A.; Lazoi, M.; Lezzi, M.; Luperto, A. Cybersecurity awareness in the context of the Industrial Internet of Things: A systematic literature review. Comput. Ind. 2022, 137, 103614. [Google Scholar] [CrossRef]
  18. Dhirani, L.L.; Armstrong, E.; Newe, T. Industrial IoT, Cyber Threats, and Standards Landscape: Evaluation and Roadmap. Sensors 2021, 21, 3901. [Google Scholar] [CrossRef] [PubMed]
  19. Altulaihan, E.; Almaiah, M.A.; Aljughaiman, A. Cybersecurity Threats, Countermeasures and Mitigation Techniques on the IoT: Future Research Directions. Electronics 2022, 11, 3330. [Google Scholar] [CrossRef]
  20. Firoozjaei, M.D.; Mahmoudyar, N.; Baseri, Y.; Ghorbani, A.A. An evaluation framework for industrial control system cyber incidents. Int. J. Crit. Infrastruct. Prot. 2022, 36, 100487. [Google Scholar] [CrossRef]
  21. Yang, B.; Zhang, Y. Cybersecurity Analysis of Wind Farm Industrial Control System Based on Hierarchical Threat Analysis Model Framework. In Proceedings of the 2022 International Conference on Computing, Communication, Perception and Quantum Technology (CCPQT), Xiamen, China, 5–7 August 2022; pp. 6–13. [Google Scholar] [CrossRef]
  22. Tong, H.; Xu, J.; Zhang, L.; Liang, S.; Mai, C.; Ding, W. The Risk of Cyber Security for Power Stability Control System and Its Test Platform. In Proceedings of the 2022 IEEE 4th International Conference on Power, Intelligent Computing and Systems (ICPICS), Shenyang, China, 29–31 July 2022; pp. 267–272. [Google Scholar] [CrossRef]
  23. Illustrated Guide to LSTMs and GRUs: A Step by Step Explanation. Available online: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21 (accessed on 25 September 2024).
  24. Alzahrani, A.; Aldhyani, T.H.H. Design of Efficient Based Artificial Intelligence Approaches for Sustainable Cyber Security in Smart Industrial Control System. Sustainability 2023, 15, 8076. [Google Scholar] [CrossRef]
  25. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. Available online: https://www.bioinf.jku.at/publications/older/2604.pdf (accessed on 10 March 2024).
  26. Wibawa, A.P.; Utama, A.B.P.; Elmunsyah, H.; Pujianto, U.; Dwiyanto, F.A.; Hernandez, L. Time-series analysis with smoothed Convolutional Neural Network. J. Big Data 2022, 9, 44. [Google Scholar] [CrossRef] [PubMed]
  27. Shiblee, M.; Kalra, P.K.; Chandra, B. Time Series Prediction with Multilayer Perceptron (MLP): A New Generalized Error Based Approach. In Advances in Neuro-Information Processing; Köppen, M., Kasabov, N., Coghill, G., Eds.; ICONIP 2008. Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5507. [Google Scholar] [CrossRef]
  28. Qin, Y.; Song, D.; Cheng, H.; Cheng, W.; Jiang, G.; Cottrell, G. A dual-stage attention-based recurrent neural network for time series prediction. arXiv 2017, arXiv:1704.02971. [Google Scholar]
  29. Zhao, W.; Alwidian, S.; Mahmoud, Q.H. Adversarial Training Methods for Deep Learning: A Systematic Review. Algorithms 2022, 15, 283. [Google Scholar] [CrossRef]
  30. Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z.B.; Swami, A. The Limitations of Deep Learning in Adversarial Settings. arXiv 2016, arXiv:1511.07528. Available online: https://arxiv.org/abs/1511.07528 (accessed on 5 October 2024).
  31. HAICon 2021. Available online: https://github.com/icsdataset/hai (accessed on 20 March 2024).
  32. Shin, H.-K.; Lee, W.; Yun, J.-H.; Min, B.-G. Two ICS Security Datasets and Anomaly Detection Contest on the HIL-based Augmented ICS Testbed. In Cyber Security Experimentation and Test (CSET ‘21); Association for Computing Machinery: New York, NY, USA, 2021; pp. 36–40. [Google Scholar]
Figure 1. LSTM and GRU recurrent neural networks [23].
Figure 2. The overall framework of the model.
Figure 3. The convergence of the sigmoid activation function.
Figure 4. The overall framework of the proposed hybrid model.
Table 1. Comparison with other studies on cybersecurity on ICSs.
No. | Year | Study | Feature | Ref.
1 | 2020 | Industrial Control Systems: Cyberattack Trends and Countermeasures | Summarizes common attacks collected on industrial control systems (ICSs) and suggests solutions to the most prominent ones. | [12]
2 | 2019 | Importance of Patch Management to Avoid Business Vulnerabilities | Assesses patch management compliance and risk management in ICSs. | [13]
3 | 2023 | Implementing Patch Management Process | Presents approaches, solutions, and methods for security patch management in ICSs. | [14]
4 | 2021 | The Role of Security Patch Management in Vulnerability Management | Presents approaches for risk-based security patch management in ICSs. | [15]
5 | 2021 | Gartner Top 10 Security Projects for 2020–2021 | Provides a joint overview of key security projects for recent years. | [15]
6 | 2022 | Prioritizing Vulnerability Patches in Large Networks | Ranks vulnerability fixes based on each machine’s position within large networks. | [16]
7 | 2022 | Cybersecurity Awareness in the Context of the Industrial Internet of Things: A Systematic Literature Review | Reviews cybersecurity awareness in IIoT contexts together with cybersecurity policies. | [17]
8 | 2021 | Industrial IoT, Cyber Threats, and Standards Landscape: Evaluation and Roadmap | Surveys actual and perceived risks, threats, information sources, and operational implementation challenges. | [18]
9 | 2022 | Cybersecurity Threats, Countermeasures and Mitigation Techniques on the IoT: Future Research Directions | Discusses popular application-layer protocols for IoT security. | [19]
10 | 2022 | An Evaluation Framework for Industrial Control System Cyber Incidents | Analyzes ICS cyber incidents to highlight protection solutions for ICSs. | [20]
11 | 2022 | Cybersecurity Analysis of Wind Farm Industrial Control System Based on Hierarchical Threat Analysis Model Framework | Identifies and analyzes cybersecurity threats, providing a basis for threat abatement and system design through the visualization of threats. | [21]
12 | 2022 | The Risk of Cyber Security for Power Stability Control System and Its Test Platform | Analyzes cybersecurity risks for the power stability control system and its test platform in ICSs. | [22]
Table 2. Patch performance results for standard models.
Model | Precision | Accuracy | Recall | F1 | MAE | MAD | Maximum Absolute Error
GRU | 0.0752 | 0.2031 | 0.1057 | 0.0527 | 0.0121 | 0.0052 | 21.5777
LSTM | 0.0592 | 0.1560 | 0.0958 | 0.0333 | 0.0154 | 0.0070 | 20.9665
CNN | 0.0356 | 0.0067 | 0.0789 | 0.0144 | 0.0214 | 0.0068 | 24.8914
MLP | 0.0688 | 0.5837 | 0.0470 | 0.0144 | 0.0403 | 0.0087 | 20.5822
Transformer | 0.0848 | 0.1031 | 0.1422 | 0.0606 | 0.0149 | 0.0053 | 21.0068
Attention Mechanism | 0.0975 | 0.2601 | 0.1031 | 0.0307 | 0.0130 | 0.0054 | 20.5695
ResNet | 0.1777 | 0.4559 | 0.0958 | 0.0333 | 0.0077 | 0.0040 | 21.6432
Random Forest | 0.1417 | 0.9928 | 0.1556 | 0.1475 | 0.0827 | 0.055 | –
Gaussian Naive Bayes | 0.1031 | 0.9865 | 0.1044 | 0.1037 | 0.0870 | 0.045 | –
GAN | 0.1005 | 0.7370 | 0.1190 | 0.0686 | 0.0123 | 0.0056 | 21.000
A3C | 0.0441 | 0.2971 | 0.0805 | 0.0280 | 0.0138 | 0.0060 | 20.1652
VAE | 0.0538 | 0.1248 | 0.0703 | 0.0175 | 0.0106 | 0.0042 | 20.2315
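The metric columns reported in Tables 2–7 (precision, accuracy, recall, F1 score, MAE, MAD, and maximum absolute error) can be reproduced from model outputs along the lines of the sketch below. The 0.5 decision threshold, the interpretation of MAD as the median absolute deviation of the errors, and the use of NumPy/scikit-learn are illustrative assumptions, not the study's exact evaluation code.

```python
# Illustrative sketch only (assumed threshold and MAD definition);
# not the authors' evaluation script.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def patch_prediction_metrics(y_true, y_score, threshold=0.5):
    """Compute the metric columns from binary labels and continuous model scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    y_pred = (y_score >= threshold).astype(int)       # assumed 0.5 cut-off
    labels = y_true.astype(int)

    abs_err = np.abs(y_score - y_true)                # per-sample absolute errors
    return {
        "precision": precision_score(labels, y_pred, zero_division=0),
        "accuracy": accuracy_score(labels, y_pred),
        "recall": recall_score(labels, y_pred, zero_division=0),
        "f1": f1_score(labels, y_pred, zero_division=0),
        "mae": abs_err.mean(),                                   # mean absolute error
        "mad": np.median(np.abs(abs_err - np.median(abs_err))),  # median absolute deviation
        "max_abs_error": abs_err.max(),
    }
```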
Table 3. Hybrid model performance based on FGSM method.
Window Size | Precision | Accuracy | Recall | F1 Score | MAE | MAD | Maximum Absolute Error
20–40 | 0.138 | 0.341 | 0.143 | 0.113 | 0.0165 | 0.0017 | 21.672
40–60 | 0.101 | 0.217 | 0.140 | 0.079 | 0.0122 | 0.0085 | 20.3738
60–80 | 0.079 | 0.140 | 0.138 | 0.115 | 0.0096 | 0.0031 | 20.4977
80–100 | 0.132 | 0.795 | 0.138 | 0.097 | 0.0065 | 0.0025 | 20.7388
100–120 | 0.071 | 0.541 | 0.081 | 0.047 | 0.0133 | 0.0048 | 20.8360
120–140 | 0.082 | 0.213 | 0.077 | 0.028 | 0.0082 | 0.0032 | 20.5589
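For context on Table 3, the fast gradient sign method (FGSM) of Goodfellow et al. [2] perturbs each input window in the direction of the sign of the loss gradient. The following PyTorch sketch, with an assumed epsilon and loss function, only illustrates how such perturbations can be generated for adversarial training on windowed time-series batches; it is not the study's implementation.

```python
# Illustrative FGSM sketch (epsilon and loss function are assumptions).
import torch

def fgsm_perturb(model, x, y, loss_fn, epsilon=0.05):
    """Single gradient-sign step: x_adv = x + epsilon * sign(d loss / d x)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
    model.zero_grad(set_to_none=True)   # drop gradients accumulated on the model above
    return x_adv.detach()
```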
Table 4. Hybrid model performance based on PGD method.
Window Size | Precision | Accuracy | Recall | F1 Score | MAE | MAD | Maximum Absolute Error
20–40 | 0.2037 | 0.617 | 0.204 | 0.182 | 0.0070 | 0.0018 | 21.4162
40–60 | 0.1045 | 0.295 | 0.136 | 0.086 | 0.0124 | 0.0086 | 20.5604
60–80 | 0.1317 | 0.480 | 0.138 | 0.115 | 0.0096 | 0.0031 | 20.4977
80–100 | 0.146 | 0.241 | 0.145 | 0.120 | 0.0107 | 0.0079 | 21.7364
100–120 | 0.113 | 0.322 | 0.137 | 0.089 | 0.0095 | 0.0069 | 20.4732
120–140 | 0.082 | 0.213 | 0.077 | 0.028 | 0.0082 | 0.0032 | 20.5589
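Projected gradient descent (PGD), the method behind Table 4, can be viewed as an iterative FGSM in which every step is projected back onto an L-infinity ball of radius epsilon around the original window. A minimal sketch follows, again with assumed epsilon, step size, and iteration count rather than the study's settings.

```python
# Illustrative PGD sketch (epsilon, alpha, and steps are assumptions).
import torch

def pgd_perturb(model, x, y, loss_fn, epsilon=0.05, alpha=0.01, steps=10):
    """Iterative gradient-sign steps kept within |x_adv - x| <= epsilon."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)  # projection step
        x_adv = x_adv.detach()
        model.zero_grad(set_to_none=True)
    return x_adv
```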
Table 5. Hybrid model performance based on CWL2 method.
Window Size | Precision | Accuracy | Recall | F1 Score | MAE | MAD | Maximum Absolute Error
20–40 | 0.00002 | 0.00033 | 0.0625 | 0.00004 | 2256.11 | 1406.3 | 4935.39
40–60 | 0.1202 | 0.1177 | 0.1106 | 0.0853 | 0.0084 | 0.0059 | 21.8995
60–80 | 0.00002 | 0.0003 | 0.0625 | 0.00004 | 690.14 | 365.54 | 1567.36
80–100 | 0.00002 | 0.0003 | 0.0625 | 0.00004 | 1269.41 | 730.44 | 2811.29
100–120 | 0.00002 | 0.0003 | 0.0625 | 0.00004 | 1694.36 | 734.22 | 3846.79
120–140 | 0.00002 | 0.0003 | 0.0625 | 0.0004 | 2668.45 | 1458.6 | 5853.51
Table 6. Hybrid model performance based on JSMA.
Window Size | Precision | Accuracy | Recall | F1 Score | MAE | MAD | Maximum Absolute Error
20–40 | 0.079 | 0.001 | 0.050 | 0.058 | 0.6149 | 0.1851 | 945.1836
40–60 | 0.0 | 0.0 | 0.0 | 0.0 | 569.18 | 232.75 | 1807.27
60–80 | 0.014 | 0.273 | 0.525 | 0.022 | 1.853 | 0.2005 | 2550.765
80–100 | 0.014 | 0.272 | 0.524 | 0.022 | 1.7547 | 0.2086 | 2583.856
100–120 | 0.000025 | 0.000078 | 0.0015 | 0.0005 | 12.277 | 0.0985 | 3264.548
120–140 | 0.0 | 0.0 | 0.0 | 0.0 | 1251 | 691.63 | 3672.13
Table 7. Hybrid model performance based on MIA method.
Window Size | Precision | Accuracy | Recall | F1 Score | MAE | MAD | Maximum Absolute Error
20–40 | 0.1856 | 0.679 | 0.172 | 0.136 | 0.0160 | 0.0111 | 21.6042
40–60 | 0.2142 | 0.739 | 0.235 | 0.150 | 0.0130 | 0.0107 | 21.5030
60–80 | 0.2098 | 0.288 | 0.209 | 0.175 | 0.0120 | 0.0101 | 21.4362
80–100 | 0.1430 | 0.738 | 0.183 | 0.125 | 0.0115 | 0.0084 | 21.3297
100–120 | 0.1378 | 0.404 | 0.1957 | 0.112 | 0.0111 | 0.0090 | 21.6640
120–140 | 0.1593 | 0.251 | 0.2131 | 0.132 | 0.0091 | 0.0063 | 21.5921
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
