Article

Prediction of Network Security Situation Based on Attention Mechanism and Convolutional Neural Network–Gated Recurrent Unit

1 School of Electronics and Information, Zhengzhou University of Light Industry, Zhengzhou 450003, China
2 School of Computer Science and Technology, Zhengzhou University of Light Industry, Zhengzhou 450003, China
3 School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450003, China
4 Research Institute of Industrial Technology, Zhengzhou University of Light Industry, Zhengzhou 450003, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6652; https://doi.org/10.3390/app14156652
Submission received: 27 June 2024 / Revised: 24 July 2024 / Accepted: 27 July 2024 / Published: 30 July 2024

Abstract:
Network-security situation prediction is a crucial aspect in the field of network security. It is primarily achieved through monitoring network behavior and identifying potential threats to prevent and respond to network attacks. In order to enhance the accuracy of situation prediction, this paper proposes a method that combines a convolutional neural network (CNN) and a gated recurrent unit (GRU), while also incorporating an attention mechanism. The model can simultaneously handle the spatial and temporal features of network behavior and optimize the weight allocation of features through the attention mechanism. Firstly, the CNN’s powerful feature extraction ability is utilized to extract the spatial features of the network behavior. Secondly, time-series features of network behavior are processed through the GRU layer. Finally, to enhance the model’s performance further, we introduce attention mechanisms, which can dynamically adjust the importance of different features based on the current context information; this enables the model to focus more on critical information for accurate predictions. The experimental results show that the network-security situation prediction method, which combines a CNN and a GRU and introduces an attention mechanism, performs well in terms of the fitting effect and can effectively enhance the accuracy of situation prediction.

1. Introduction

With the increasing diversification and complexity of network security attacks [1], traditional security protection measures, including anti-virus software, vulnerability scanning, intrusion detection systems (IDS), and firewalls, whether active or passive defense, are facing development bottlenecks. Anti-virus software and vulnerability scanning tools typically concentrate on safeguarding individual computers. They do not offer a thorough evaluation of the overall security status of the network environment and lack the capacity to analyze and oversee the security of the entire network system at a macro level. While an intrusion detection system (IDS) [2,3] can monitor the entire network, it can only take corrective actions after identifying potential threats. This reactive approach leaves the network in a passive state, unable to proactively avert future network security risks. Network-security situational awareness systems, through dynamic and comprehensive monitoring of the real-time security status and risk level of the network, have overcome the limitation of a single security device that can only monitor local security information [4,5]. As a result, they have gradually become a focal point in the field of network security research. Network-security situation prediction technology is an essential component of network-security situation awareness. It can analyze historical network data to predict future network threats and issue warnings [6]. Network-security situation prediction plays an important role in network defense, network-security early warning, and network resource allocation.
In recent years, with the rapid progress of artificial intelligence technology, all sectors have become acutely aware of its enormous potential and advantages. Artificial intelligence technology is now considered a crucial engine for the industry’s development. In the field of network-security situation prediction, this trend is also evident, with artificial intelligence technology gradually emerging as a key driver of innovation and development in this domain. At present, many studies on situation prediction focus on time-series analysis, typically utilizing recurrent neural networks (RNNs) [7], long short-term memory networks (LSTMs) [8], gated recurrent units (GRUs) [9,10,11], or their variants to predict network situations. However, these methods do not fully utilize effective dataset feature-processing techniques when dealing with large-scale network data, resulting in limitations in feature extraction. In order to address this deficiency, this paper introduces the convolutional neural network (CNN) structure to enhance the feature extraction capability of the dataset. In this way, we aim to optimize the feature processing process to enhance the efficiency and accuracy of network-security situation prediction.
Therefore, aiming to address the shortcomings of datasets in feature processing, this paper proposes a CNN-GRU neural network model integrating an attention mechanism for predicting network security situations. At present, some studies have combined the advantages of CNNs and GRUs. They utilize CNNs to extract spatial features and GRUs to process time-series data. These studies have made significant achievements in fields such as power grid and water level prediction [12,13,14]. However, there is a scarcity of research on network-security situation prediction. The research in this paper utilizes real data released by the National Internet Emergency Response Center (CERT) as the experimental dataset. It employs a combination of a CNN, a GRU, and an attention mechanism to enable the model to automatically focus on the most relevant data, thereby enhancing the accuracy of network-security situation prediction.
The main contributions of this paper are as follows.
  • Utilizing a gated recurrent unit (GRU) to capture overall trends and patterns in time-series data enhances prediction accuracy.
  • Convolutional neural networks (CNNs) are suitable for processing data with spatial correlation. In the prediction of network security situations, a 1D CNN is utilized to extract local features from time-series data.
  • The CNN-GRU model combines a convolutional neural network and a gated recurrent unit network to handle both sequential and spatial data. This allows the model to take into account both the time-series characteristics of the data and their spatial distribution characteristics, thus improving the prediction accuracy.
  • By integrating the attention mechanism, the model can focus more on the features that have significant effects on the prediction of situation value. This not only enhances the accuracy of the prediction but also accelerates the convergence rate of model training.
This paper is divided into five sections, explaining the method of predicting network security situations based on the attention mechanism and CNN-GRU. Section 1 introduces the research background and emphasizes the importance of predicting network security situations in cyberspace governance. Section 2 provides a summary of existing research on network-security situation prediction. Section 3 elaborates on the attention mechanism and the architecture design of the CNN-GRU model. Section 4 introduces the datasets used in the experiment and the evaluation criteria, and offers an analysis and discussion of the experimental results. Section 5 summarizes the network-security situation prediction method proposed in this paper and outlines future research directions.

2. Related Works

In the field of network-security situation prediction, prediction methods can be roughly divided into three categories according to the data sources, their characteristics, and the technical means used: methods based on uncertain reasoning theory, methods based on machine learning, and methods based on neural networks.

2.1. Predictive Methods Based on Uncertain Inference Theory

In the field of network-security situational awareness, grey system theory and D-S evidence theory are two commonly used uncertain reasoning tools that help manage and utilize incomplete information for decision-making and prediction. Shi and Xie [15] employed a network-security situation prediction method based on D-S evidence theory that considers the impact of past and present network security situations on future network security conditions; this approach aims to reduce uncertainty and assess the probability of the network being in various security states in the near future. Liu [16] designed a prediction model based on improved Dempster–Shafer (D-S) evidence theory and verified, through simulation experiments, the accuracy and robustness of the enhanced method in handling large data conflicts. Deng et al. [17] combined grey GM(1,1) and GM(1,N) models for network-security situation prediction: the GM(1,1) model is used to predict changes in the situational factors and to derive functions describing these changes, which are then combined with the GM(1,N) model to predict the network security situation. A novel adaptive grey Verhulst model, which improves prediction accuracy by adjusting the generated sequence, was proposed by Leau and Manickam [18]. The literature above applies uncertain reasoning theory to predict the network security situation. Uncertain reasoning theory is suitable for situations with limited and incomplete data, but it has significant limitations in handling data with large and irregular fluctuations.

2.2. Machine Learning-Based Prediction Methods

In the case of complete data, machine learning can be used to predict network security situations. Machine learning excels at processing small samples and non-linear data, offering high accuracy and ease of understanding. Support vector machine (SVM) and the hidden Markov model (HMM) are two commonly used prediction methods. Jingjing Hu et al. [19] proposed a network-security situation prediction model (MR-SVM) based on MapReduce and support vector machine (SVM). The model utilizes the CS algorithm to optimize SVM parameters and conducts distributed training through MapReduce to enhance training speed. Ke et al. [20] proposed a support vector machine network-security situation prediction model optimized using an enhanced artificial bee colony algorithm—this enhancement aims to improve the convergence speed and accuracy of the algorithm by refining the search equation. Wei Liang et al. [21] proposed a mobile network-security situation prediction algorithm based on the weighted hidden Markov model. This algorithm utilizes multi-scale entropy to address the issue of slow data training speed. It predicts future security situations by optimizing the parameters of the HMM state transition matrix and incorporating the autocorrelation coefficient. Peshave et al. [22] proposed a method based on hidden Markov model (HMM) integration for predicting cybersecurity threat events and evaluated its prediction accuracy through two strategies—majority voting and maximum generative likelihood—which were shown to outperform a baseline predictor in predicting malicious cyber events. Despite its significant advantages, machine learning faces challenges in training and model design when dealing with large-scale data.

2.3. Neural Network-Based Prediction Method

In recent years, neural network methods have played an increasingly important role in network security for predicting network situations. Neural networks can handle complex nonlinear relationships and capture high-level data features through hierarchical network structures, and they have become the mainstream approach for predicting trends in large-scale, high-dimensional data. Situation prediction with neural networks falls into three main categories: BP neural networks, RBF neural networks, and recurrent neural networks (RNNs), each with its own advantages and disadvantages. The advantages of the BP neural network [23,24] lie in its adaptive and self-learning abilities: it can adjust parameters to meet the desired output, with strong nonlinear mapping ability, rigorous theoretical support, and good generalization ability. However, its numerous parameters result in a slow convergence rate and make it easy to fall into a local minimum. The advantages of the RBF network [25] include a unique best-approximation property, the absence of local minimum problems, strong input–output mapping ability, a linear relationship between network connection weights and output, fast classification, and rapid learning convergence. However, its reasoning process and basis cannot be explained, and it may not function properly when data are insufficient. The advantages of recurrent neural networks [26,27,28,29,30,31,32] lie in their memory, parameter sharing, and Turing completeness, making them especially suitable for processing sequence data and nonlinear feature learning; the GRU used in this paper belongs to this family. However, their training is challenging due to the large number of parameters and the difficulty of parameter tuning. Table 1 shows a comparison of these prediction methods.

3. Model Construction

In this paper, we introduce a convolutional neural network (CNN) structure to enhance the feature extraction capability of the dataset. Our goal is to optimize the feature processing flow to improve the efficiency and accuracy of network-security situation prediction. The network-security situation prediction model proposed in this study combines a convolutional neural network (CNN) with a gated recurrent unit (GRU) and introduces an attention mechanism to enhance the performance of feature selection and time-series analysis. The model structure mainly includes three key components: a feature extraction layer, a time-series analysis layer, and an attention layer. The model structure is shown in Figure 1.
  • Feature extraction layer: Utilizing CNN technology, this layer can accurately capture the spatial characteristics in the network behavior and provide robust data support for subsequent analysis.
  • Time-series analysis layer: The time-series analysis layer consists of the GRU module. The GRU module learns the feature vector extracted from the CNN module and captures its internal change rules.
  • Attention layer: The attention mechanism is introduced based on the CNN and GRU. By assigning weights to different features, the model can automatically focus on the most influential key information for the prediction task, thereby enhancing the accuracy of predictions.

3.1. 1D CNN

In order to improve the feature extraction ability for the data, this paper introduces a one-dimensional convolutional neural network (1D CNN), a deep learning model specifically designed to process linear sequence data. Compared with traditional fully connected neural networks, the 1D CNN can more effectively capture local dependencies when processing sequence data. This capability enables it to excel in tasks such as speech recognition, natural language processing, and time-series analysis.
The core of the 1D CNN is the convolutional layer, which extracts important features by sliding a local window of the convolution kernel over the input data. Let the input of the 1D CNN layer be $X = (x_1, x_2, \ldots, x_m)$, where $x_i$ denotes the $i$-th input feature, and let $Y = (y_1, y_2, \ldots, y_n)$ be the output of the convolutional layer. This paper uses Conv1d as the convolution layer, with a kernel size of 3, a stride of 1, and padding of 1. The function of the convolution layer is to extract local features from the input data and map these features to higher-level representations. The output of the convolution layer is calculated as shown in Formula (1).
$y_t = \sum_{k=1}^{m} w_k x_{t-k+1}$  (1)
The output Y of the convolution layer is obtained by applying the convolution kernel to each local window of the input sequence X and calculating the dot product of the convolution kernel with these windows. This process not only retains the critical information in the sequence but also minimizes feature reduction through the design of the convolution kernel.
In the formula, $y_t$ is the output of the convolutional layer at moment $t$, $w_k$ is the weight of the convolutional kernel, and $m$ is the length of the convolutional kernel.
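For illustration, the following minimal PyTorch sketch shows a Conv1d layer configured as described above (kernel size 3, stride 1, padding 1). The channel counts and window length are illustrative assumptions, since the paper does not report them.

```python
import torch
import torch.nn as nn

# Minimal sketch of the 1D convolutional feature extractor.
# in_channels=1 (a single situation-value series) and out_channels=16
# are illustrative assumptions; the paper does not report channel counts.
conv = nn.Conv1d(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=1)

# A batch of 4 input windows, each containing 8 consecutive situation values.
x = torch.randn(4, 1, 8)   # (batch, channels, sequence length)
y = conv(x)                # padding=1 keeps the sequence length at 8
print(y.shape)             # torch.Size([4, 16, 8])
```

With padding of 1 and a stride of 1, the output sequence has the same length as the input, so no boundary information is lost before the GRU layer.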

3.2. GRU

The gated recurrent unit (GRU) is an enhanced type of recurrent neural network (RNN) known for its simplified architecture and reduced parameter count. This design not only makes the GRU more efficient during training but also improves its computing speed when handling specific tasks. The core innovation of the GRU lies in its unique gating technology, which helps alleviate the gradient vanishing problem commonly encountered during long time-series training. The GRU contains two key gating structures: the reset gate and the update gate. The reset gate function determines the impact of the current input on the hidden state from the previous moment, while the update gate is responsible for regulating the influence of the previous hidden state on the current output. In addition, the GRU introduces the concept of a candidate hidden state, which is used to calculate the hidden state at the current time point. These design improvements not only simplify the model but also ensure that it can effectively capture the long-term dependencies in the sequence data. The GRU structure is shown in Figure 2.
The GRU layer receives the data $Y = (y_1, y_2, \ldots, y_n)$ that have been processed by the CNN layer. The sigmoid function is used to activate the reset gate and the update gate so that their output values lie between 0 and 1. The sigmoid output acts as a weighting factor, where 0 represents completely forgetting and 1 represents completely retaining the information. The GRU first receives the inputs $h_{t-1}$ and $x_t$, which are passed into the reset gate and the update gate. These gates are computed by Formulas (2) and (3) to obtain the outputs $r_t$ and $z_t$ at time $t$. Subsequently, the combination of $r_t$, $h_{t-1}$, and $x_t$ is fed into the network, and the tanh function is applied to produce the candidate hidden state $\tilde{h}_t$, as computed by Formula (4). The output $z_t$ of the update gate serves two purposes: it is multiplied by $\tilde{h}_t$ to determine which part of the new information is retained, and it is combined with $h_{t-1}$ to decide which past information to discard. Finally, the output $h_t$ of the GRU is computed through Formula (5). Through its unique gating mechanism and simplified structure, this model improves training efficiency while maintaining performance, making it a powerful tool for processing sequential data.
$z_t = \sigma(w_z \cdot [h_{t-1}, x_t])$  (2)
$r_t = \sigma(w_r \cdot [h_{t-1}, x_t])$  (3)
$\tilde{h}_t = \tanh(w_{\tilde{h}} \cdot [r_t \odot h_{t-1}, x_t])$  (4)
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$  (5)
In the formulas, $r_t$ is the reset gate; $z_t$ is the update gate; $\sigma$ is the sigmoid activation function; $x_t$ is the input information at time $t$; $h_{t-1}$ is the hidden state of the previous moment; $\tilde{h}_t$ is the candidate hidden state; and $h_t$ is the hidden state passed to the next moment in time. "$\cdot$" denotes the product of the weight matrix with the concatenated vector $[h_{t-1}, x_t]$, and "$\odot$" denotes element-wise (Hadamard) multiplication of two vectors, which results in a vector.
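To make the gating equations concrete, the following sketch implements a single GRU step directly from Formulas (2)–(5). The weight shapes are illustrative, bias terms are omitted to mirror the formulas as written, and in practice PyTorch's built-in GRU module can be used instead.

```python
import torch

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step following Formulas (2)-(5): gates act on [h_{t-1}, x_t].

    W_z, W_r, W_h each have shape (hidden_dim, hidden_dim + input_dim);
    biases are omitted purely to mirror the formulas as written.
    """
    concat = torch.cat([h_prev, x_t], dim=-1)         # [h_{t-1}, x_t]
    z_t = torch.sigmoid(concat @ W_z.T)               # update gate, Formula (2)
    r_t = torch.sigmoid(concat @ W_r.T)               # reset gate,  Formula (3)
    concat_r = torch.cat([r_t * h_prev, x_t], dim=-1)
    h_tilde = torch.tanh(concat_r @ W_h.T)            # candidate state, Formula (4)
    h_t = (1 - z_t) * h_prev + z_t * h_tilde          # new hidden state, Formula (5)
    return h_t

# Illustrative dimensions (not taken from the paper).
input_dim, hidden_dim = 16, 32
W_z = torch.randn(hidden_dim, hidden_dim + input_dim)
W_r = torch.randn(hidden_dim, hidden_dim + input_dim)
W_h = torch.randn(hidden_dim, hidden_dim + input_dim)

x_t = torch.randn(4, input_dim)       # one time step of CNN features
h_prev = torch.zeros(4, hidden_dim)   # previous hidden state
h_t = gru_step(x_t, h_prev, W_z, W_r, W_h)
print(h_t.shape)                      # torch.Size([4, 32])
```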

3.3. Attention Mechanism

The primary function of the attention mechanism is to selectively focus on the components most crucial to the current task from a multitude of inputs, thereby improving the model's capacity to identify and process essential information. In this paper, by applying the attention mechanism to the outputs of the gated recurrent unit (GRU), distinct weights can be assigned to the various feature vectors. This strategy effectively emphasizes the crucial features that significantly influence the prediction of the network security situation. The attention mechanism quantitatively assesses the interactions between features by calculating a score $s(k_i, q)$ for each feature vector to ascertain its importance; the higher this score, the more attention the feature vector receives from the model. The weight value $\alpha_i$ reflects the relative importance of the $i$-th input feature in the attention weighting process. By weighting and summing all feature vectors, a final vector is obtained as the output, integrating the importance of all features. The structure of the attention mechanism is shown in Figure 3.
The attention layer receives the output sequence $h_t = (x_1, x_2, \ldots, x_t)$ of the GRU module as its input and computes the query, key, and value for the inputs, as shown in Formulas (6)–(8).
$Q = W_Q h_t$  (6)
$K = W_K h_t$  (7)
$V = W_V h_t$  (8)
The attention scores are then normalized by the softmax function, as shown in Formula (9). Finally, the value vectors are weighted and summed according to Formula (10) to obtain the fused representation used for the network-security situation prediction.
$\alpha_i = \mathrm{softmax}(s(k_i, q)) = \frac{\exp(s(k_i, q))}{\sum_{j=1}^{N} \exp(s(k_j, q))}$  (9)
$\mathrm{out} = \sum_{i=1}^{N} \alpha_i v_i = \sum_{i=1}^{N} \frac{\exp(s(k_i, q))}{\sum_{j} \exp(s(k_j, q))} v_i$  (10)
By integrating attention mechanisms, neural networks acquire the ability to flexibly adjust the intensity of attention to the various components of the input data. This approach not only improves the ability to capture important signals but also enhances the accuracy of its predictions and the generalization performance of the model. When processing the input sequences, the model uses the query vector to identify and select the relevant elements, weighting the values of these elements. This weighting strategy allows the model to focus on information closely related to the task while ignoring less relevant parts, thereby achieving better performance when executing a specific task.
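The following minimal PyTorch sketch shows one possible implementation of the attention layer described by Formulas (6)–(10) and of the overall CNN–GRU–attention pipeline of this section. The projection sizes, the scaled dot-product score, and taking the query from the last GRU hidden state are assumptions; the paper does not specify these details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Single-head attention over GRU outputs, mirroring Formulas (6)-(10)."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.W_q = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Formula (6)
        self.W_k = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Formula (7)
        self.W_v = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Formula (8)

    def forward(self, h):                      # h: (batch, seq_len, hidden_dim)
        q = self.W_q(h[:, -1:, :])             # query from last hidden state (an assumption)
        k, v = self.W_k(h), self.W_v(h)
        scores = torch.bmm(q, k.transpose(1, 2)) / k.size(-1) ** 0.5
        alpha = F.softmax(scores, dim=-1)      # Formula (9)
        return torch.bmm(alpha, v).squeeze(1)  # weighted sum, Formula (10)

class CNNGRUAttention(nn.Module):
    """Sketch of the CNN -> GRU -> attention -> regression pipeline."""
    def __init__(self, in_channels=1, cnn_channels=16, hidden_dim=32):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, cnn_channels, kernel_size=3, stride=1, padding=1)
        self.gru = nn.GRU(cnn_channels, hidden_dim, batch_first=True)
        self.attn = Attention(hidden_dim)
        self.fc = nn.Linear(hidden_dim, 1)     # predicted situation value

    def forward(self, x):                      # x: (batch, 1, window_length)
        feats = torch.relu(self.conv(x))       # local/spatial features
        out, _ = self.gru(feats.transpose(1, 2))   # (batch, seq, hidden)
        context = self.attn(out)               # attention-weighted summary
        return self.fc(context)

model = CNNGRUAttention()
pred = model(torch.randn(4, 1, 8))             # four windows of 8 weekly values
print(pred.shape)                              # torch.Size([4, 1])
```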

4. Experiment and Results Analysis

4.1. Preprocessing of the Experimental Datasets

In the field of network-security situation analysis, current studies often rely on outdated datasets or data collected in specific virtual environments, which may not be widely representative. The data for this study are derived from real data published by the National Internet Emergency Response Center, covering 171 weekly network-security situation reports from 2018 to 2021. These weekly reports evaluate the network security situation along five key dimensions: the number of domestic hosts infected by viruses, the total number of domestic websites illegally tampered with, the number of domestic websites implanted with backdoors, the number of counterfeit pages of domestic websites, and the number of newly discovered information security vulnerabilities. To translate these data into an intuitive network-security situation indicator, this paper quantifies them by assigning different weights to the different security threat types; the specific weight assignments are shown in Table 2. Formula (11) is then used to calculate the weekly network-security situation value so as to reflect the overall network security situation more accurately.
$S_A = \sum_{i=1}^{5} \frac{NT_i}{NT_i^{\max}} \cdot w_i$  (11)
where $NT_i$ represents the number of a specific network security threat in a particular week (with $i$ indexing the threat type), $NT_i^{\max}$ represents the maximum number of this type of security threat across the 171 selected weeks of data, and $w_i$ represents its corresponding weight. The processed data are shown in Figure 4.
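As an illustration of Formula (11), the snippet below normalizes each of the five weekly threat counts by its maximum over the 171 weeks and applies the Table 2 weights. The numerical counts shown are made up for demonstration and are not taken from the CERT reports.

```python
import numpy as np

# Table 2 weights for the five threat indicators, in the order:
# infected hosts, tampered websites, backdoored websites,
# counterfeit pages, new vulnerabilities.
weights = np.array([0.30, 0.25, 0.15, 0.15, 0.15])

def situation_value(counts_week, counts_max):
    """Formula (11): weighted sum of per-indicator counts, each
    normalized by its maximum over all 171 weeks."""
    counts_week = np.asarray(counts_week, dtype=float)
    counts_max = np.asarray(counts_max, dtype=float)
    return float(np.sum(counts_week / counts_max * weights))

# Illustrative numbers only (not taken from the published reports).
week = [120_000, 3_000, 800, 1_500, 400]
maxima = [500_000, 12_000, 5_000, 9_000, 900]
print(round(situation_value(week, maxima), 4))
```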

4.2. Evaluation Criteria

In order to evaluate the effect of the proposed prediction model, two indicators are selected: the mean-square error (MSE) and the goodness of fit (coefficient of determination, $R^2$). They are calculated as follows:
$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$  (12)
$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$  (13)
where $y_i$ represents the true value of a sample, $\hat{y}_i$ represents the predicted value of a sample, $N$ represents the number of samples, $\bar{y}$ represents the average of the true values, and $R^2$ is the coefficient of goodness of fit (coefficient of determination).
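A minimal sketch of the two evaluation indicators follows; library implementations such as scikit-learn's mean_squared_error and r2_score compute the same quantities.

```python
import numpy as np

def mse(y_true, y_pred):
    """Formula (12): mean-square error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Formula (13): coefficient of determination (goodness of fit)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative values only.
y_true = [0.21, 0.35, 0.28, 0.40]
y_pred = [0.20, 0.36, 0.27, 0.41]
print(mse(y_true, y_pred), r2(y_true, y_pred))
```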

4.3. Training Environment and Parameter Configuration

Experimental environment: Windows 10 64-bit operating system with an Intel(R) Core(TM) i7-9700 CPU @ 3.00 GHz processor. Model training and testing were conducted using the PyTorch deep learning framework in a Python 3.8 environment.
The configuration of parameters during model training significantly impacts the model’s performance. Proper parameter settings can accelerate the convergence of the model and improve prediction accuracy. The parameter settings during the training process are shown in Table 3.
In the model training stage, the dataset is divided into a training set and a test set in an 8:2 ratio and then transformed into feature tensors. After shuffling the order, the processed data are fed into the model for training.
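A sketch of this training setup is given below, using the Table 3 hyperparameters (Adam optimizer, learning rate 0.001, batch size 1, 100 epochs). The sliding-window length, the MSE loss, and splitting chronologically before shuffling only the training windows are assumptions, since the paper does not state these details; the model refers to the CNNGRUAttention sketch shown in Section 3.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Assumed sliding-window construction: predict each week's situation value
# from the previous `window` weekly values (the paper does not give the length).
def make_windows(series, window=8):
    xs, ys = [], []
    for i in range(len(series) - window):
        xs.append(series[i:i + window])
        ys.append(series[i + window])
    x = torch.tensor(xs, dtype=torch.float32).unsqueeze(1)  # (N, 1, window)
    y = torch.tensor(ys, dtype=torch.float32).unsqueeze(1)  # (N, 1)
    return x, y

series = torch.rand(171).tolist()   # placeholder for the 171 weekly situation values
x, y = make_windows(series)

split = int(0.8 * len(x))           # 8:2 train/test split
train_ds = TensorDataset(x[:split], y[:split])
test_ds = TensorDataset(x[split:], y[split:])
train_loader = DataLoader(train_ds, batch_size=1, shuffle=True)  # Table 3: batch size 1

model = CNNGRUAttention()           # the sketch from Section 3.3
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)       # Table 3: Adam, lr 0.001
loss_fn = torch.nn.MSELoss()        # assumed regression loss

for epoch in range(100):            # Table 3: 100 epochs
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
```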

4.4. Analysis of the Experimental Results

Because of differences in experimental environments, configurations, and datasets, the proposed model cannot be compared directly with models from other studies. Therefore, ablation experiments are conducted on the proposed model to demonstrate its effectiveness. Figure 5 visually presents the performance of the proposed method compared to the GRU, CNN, and GRU-Attention models. The experiment selected 17 weeks of network-security situation data and compared the predictions of the four models against the real network-security situation values. As shown in Figure 5, all of these prediction models have some predictive capability, but the method proposed in this study achieves an almost perfect fit between the predicted and real values. Across multiple data points, the GRU predicts better than the CNN alone, highlighting the superiority of the GRU for time-series prediction. The prediction of the GRU-Attention model improves significantly after the addition of the CNN, indicating that introducing a CNN can markedly enhance the model's predictive ability. The experimental results further confirm that a model-fusion strategy integrating the advantages of the different components yields significantly more accurate predictions than the CNN or GRU alone. This fusion not only optimizes prediction performance but also enhances the model's ability to capture data features, resulting in more accurate experimental results.
Figure 6 displays a trend plot of changes in model training error as the number of iterations increases. It can be observed that in the model with the attention mechanism, the convergence rate is significantly faster. This demonstrates that the inclusion of the attention mechanism can substantially enhance the convergence rate of the model. The prediction model proposed in this paper performs well in terms of convergence speed, confirming the effectiveness and superiority of the model in capturing temporal data features. Additionally, its prediction accuracy is excellent. In this paper, the model can effectively learn the key information from temporal data, demonstrating a faster convergence rate and higher accuracy during model training. This makes it an effective situation prediction model.
Table 4 lists the specific comparison results between the method proposed in this paper and several other methods in terms of mean-square error (MSE) and goodness of fit. Through the comparative analysis, we can clearly see that the model proposed in this study shows excellent performance in both key metrics of prediction error and goodness of fit, surpassing the other compared models, validating its application value in the domain of network-security situation prediction.
Table 5 displays the absolute error of the various prediction models at each time point. The results further show that the prediction method proposed in this research achieves accurate predictions in most cases. When using the proposed method, all of the absolute errors are kept at roughly 0.004 or below, in most cases an order of magnitude lower than those of the other models. Therefore, the prediction method of this study is capable of providing highly precise prediction results in most cases, with the error range consistently kept at a low level, demonstrating its advantage in prediction accuracy.

5. Conclusions

In this paper, the effectiveness of the proposed CNN-GRU-based network-security situation prediction model is experimentally demonstrated using real data released by the National Internet Emergency Response Center. The proposed model combines the spatial feature extraction capability of convolutional neural networks, the time-series processing capability of gated recurrent units, and the key-feature focusing capability of an attention mechanism, achieving a significant performance improvement in the task of network-security situation prediction. The experimental results show that, compared with the CNN alone, the GRU alone, and the GRU-Attention model without a CNN, the proposed model improves the goodness of fit by 1.2%, 1%, and 7%, respectively, and its absolute error is one order of magnitude lower. In terms of convergence speed, the proposed model is the first to converge as the number of training rounds increases, compared with using the CNN or GRU alone. Although the current model achieves good performance, there is still room for further optimization. Network-security situation prediction can benefit from the fusion of multimodal data, such as combining network traffic data, system logs, and user behavior data, and future research can explore how to effectively integrate these data sources.

Author Contributions

Conceptualization, Y.F. and H.Z.; methodology, Y.F. and H.Z.; validation, J.Z., Z.C. and L.Z.; data curation, R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (62072416), the Key Research and Development Special Project of Henan Province (221111210500), and the Key Technologies R&D Program of Henan Province (232102211053, 242102211071, 242102210142).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arogundade, O.R. Network security concepts, dangers, and defense best practical. Comput. Eng. Intell. Syst. 2023, 14, 25–38. [Google Scholar]
  2. Husák, M.; Komárková, J.; Bou-Harb, E.; Čeleda, P. Survey of attack projection, prediction, and forecasting in cyber security. IEEE Commun. Surv. Tutor. 2018, 21, 640–660. [Google Scholar] [CrossRef]
  3. Nasir, M.H.; Khan, S.A.; Khan, M.M.; Fatima, M. Swarm intelligence inspired intrusion detection systems—A systematic literature review. Comput. Netw. 2022, 205, 108708. [Google Scholar] [CrossRef]
  4. S, S.M.J.; Thirunavukkarasu, M.; Kumaran, N.; Thamaraiselvi, D. Deep learning with blockchain based cyber security threat intelligence and situational awareness system for intrusion alert prediction. Sustain. Comput. Inform. Syst. 2024, 42, 100955. [Google Scholar]
  5. Zhang, J.; Feng, H.; Liu, B.; Zhao, D. Survey of technology in network security situation awareness. Sensors 2023, 23, 2608. [Google Scholar] [CrossRef] [PubMed]
  6. Sokol, P.; Staňa, R.; Gajdoš, A.; Pekarčík, P. Network security situation awareness forecasting based on statistical approach and neural networks. Log. J. IGPL 2023, 31, 352–374. [Google Scholar] [CrossRef]
  7. Shen, Y.; Mariconti, E.; Vervier, P.A.; Stringhini, G. Tiresias: Predicting security events through deep learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 592–605. [Google Scholar]
  8. Zhang, H.; Kang, C.; Xiao, Y. Research on network security situation awareness based on the LSTM-DT model. Sensors 2021, 21, 4788. [Google Scholar] [CrossRef] [PubMed]
  9. Zhang, S.; Fu, Q.; An, D. Network Security Situation Prediction Model Based on VMD Decomposition and DWOA Optimized BiGRU-ATTN Neural Network. IEEE Access 2023, 11, 129507–129535. [Google Scholar] [CrossRef]
  10. Xie, P.S.; Wang, S.; Zhao, Y.W.; Shao, W.J.; Li, W.; Feng, T. Security Situation Prediction Method of Industrial Control System Based on Self-Attention and GRU Neural Network. Int. J. Netw. Secur. 2023, 25, 729–735. [Google Scholar]
  11. Yuan, Y.; Xu, W. Neural network security situation prediction method based on attention-GRU. In Proceedings of the International Conference on Cyber Security, Artificial Intelligence, and Digital Economy (CSAIDE 2022), Huzhou, China, 15–17 April 2022; SPIE: Bellingham, WA, USA, 2022; Volume 12330, pp. 94–99. [Google Scholar]
  12. Li, X. CNN-GRU model based on attention mechanism for large-scale energy storage optimization in smart grid. Front. Energy Res. 2023, 11, 1228256. [Google Scholar] [CrossRef]
  13. Li, M.W.; Xu, D.Y.; Geng, J.; Hong, W.C. A hybrid approach for forecasting ship motion using CNN–GRU–AM and GCWOA. Appl. Soft Comput. 2022, 114, 108084. [Google Scholar] [CrossRef]
  14. Pan, M.; Zhou, H.; Cao, J.; Liu, Y.; Hao, J.; Li, S.; Chen, C.-H. Water level prediction model based on GRU and CNN. IEEE Access 2020, 8, 60090–60100. [Google Scholar] [CrossRef]
  15. Shi, B.; Xie, X. Research on network security situation prediction method based on DS evidence theory. Comput. Eng. Des. 2013, 34, 821–825. [Google Scholar]
  16. Liu, D. Prediction of network security based on DS evidence theory. ETRI J. 2020, 42, 799–804. [Google Scholar] [CrossRef]
  17. Deng, Y.; Wen, Z.; Jiang, X. Network security situation prediction method based on grey theory. J. Hunan Univ. Technol. 2015, 29, 69–73. [Google Scholar]
  18. Leau, Y.B.; Manickam, S. A novel adaptive grey verhulst model for network security situation prediction. Int. J. Adv. Comput. Sci. Appl. 2016, 7. [Google Scholar] [CrossRef]
  19. Hu, J.; Ma, D.; Liu, C.; Shi, Z.; Yan, H.; Hu, C. Network security situation prediction based on MR-SVM. IEEE Access 2019, 7, 130937–130945. [Google Scholar] [CrossRef]
  20. Ke, G.; Chen, R.S.; Chen, Y.C.; Yeh, J.H. Network security situation prediction method based on support vector machine optimized by artificial Bee colony algorithms. J. Comput. 2021, 32, 144–153. [Google Scholar]
  21. Liang, W.; Long, J.; Chen, Z.; Yan, X.; Li, Y.; Zhang, Q.; Li, K.-C. A security situation prediction algorithm based on HMM in mobile network. Wirel. Commun. Mob. Comput. 2018, 2018, 1–11. [Google Scholar] [CrossRef]
  22. Peshave, A.; Ganesan, A.; Oates, T. Predicting network threat events using HMM ensembles. In International Conference on Advanced Data Mining and Applications; Springer International Publishing: Cham, Switzerland, 2022; pp. 229–240. [Google Scholar]
  23. Li, Y.; Feng, W. Improved population intelligence algorithm and BP neural network for network security posture prediction. Int. J. Distrib. Sens. Netw. 2023, 9970205. [Google Scholar] [CrossRef]
  24. Xiao, P.; Xian, M.; Wang, H. Network security situation prediction method based on MEA-BP. In Proceedings of the 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017; pp. 1–5. [Google Scholar]
  25. Chen, Z. Research on internet security situation awareness prediction technology based on improved RBF neural network algorithm. J. Comput. Cogn. Eng. 2022, 1, 103–108. [Google Scholar]
  26. Shang, L.; Zhao, W.; Zhang, J.; Fu, Q.; Zhao, Q.; Yang, Y. Network security situation prediction based on long short-term memory network. In Proceedings of the 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS), Matsue, Japan, 18–20 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–4. [Google Scholar]
  27. Xiao, K.; Zhang, Y.; He, Y.; Xu, G.; Wang, C. Industrial IoT Network Security Situation Prediction Based on Improved SSA-BiLSTM. In Proceedings of the China Conference on Wireless Sensor Networks, Guangzhou, China, 10–13 November 2022; Springer Nature: Singapore, 2022; pp. 212–224. [Google Scholar]
  28. Ansari, M.S.; Bartoš, V.; Lee, B. GRU-based deep learning approach for network intrusion alert prediction. Future Gener. Comput. Syst. 2022, 128, 235–247. [Google Scholar] [CrossRef]
  29. Jacob, S.; Qiao, Y.; Jacob, P.; Lee, B. Using recurrent neural networks to predict future events in a case with application to cyber security. In Proceedings of the BUSTECH 2020: The Tenth International Conference on Business Intelligence and Technology, Nice, France, 25–29 October 2020; pp. 13–19. [Google Scholar]
  30. Gao, F.; Xia, J.; Wu, D.; Wang, W.; Wang, C.; Song, C. Network security situation prediction based on LSTM. In Proceedings of the 2023 2nd International Conference on Cloud Computing, Big Data Application and Software Engineering (CBASE), Chengdu, China, 3–5 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 350–354. [Google Scholar]
  31. Du, X.; Ding, X.; Tao, F. Network Security Situation Prediction Based on Optimized Clock-Cycle Recurrent Neural Network for Sensor-Enabled Networks. Sensors 2023, 23, 6087. [Google Scholar] [CrossRef] [PubMed]
  32. Zhao, D.; Shen, P.; Zeng, S. ALSNAP: Attention-based long and short-period network security situation prediction. Ad Hoc Netw. 2023, 150, 103279. [Google Scholar] [CrossRef]
Figure 1. Model architecture.
Figure 2. GRU structure.
Figure 3. Attention mechanism structure.
Figure 4. Standardized network-security situation value.
Figure 5. Comparison of different models.
Figure 6. Number of iterations of the different models.
Table 1. Comparison of different prediction methods.

Method | Advantage | Disadvantage
Predictive methods based on uncertain inference theory | Suitable for scenarios involving limited and incomplete data. | The error is significant when dealing with irregular and fluctuating data.
Machine learning-based prediction methods | Good at dealing with small samples and nonlinear data; high accuracy; easy to understand. | There are challenges and limitations in training and designing models when working with large-scale data.
Neural network-based prediction methods | Can handle complex nonlinear relationships and capture high-level features in the data through a hierarchical network structure. | Difficult to train due to the large number of parameters and the complexity of tuning them.
Table 2. Network-security threat weight allocation.

Threat Indicator | Weight
Number of hosts infected with network viruses in China | 0.30
Total number of tampered websites in the region | 0.25
Total number of backdoor websites implanted in the region | 0.15
Number of counterfeit pages on domestic websites | 0.15
Number of new information security vulnerabilities | 0.15
Table 3. Setting of the training parameters.

Parameter | Setting
Optimizer | Adam
Learning rate | 0.001
Batch size | 1
Epoch | 100
Table 4. Comparison of prediction error and fit degree.

Model | MSE | R²
CNN_GRU_Attention | 0.000003 | 0.999500
GRU | 0.000053 | 0.990206
CNN | 0.000067 | 0.987618
GRU_Attention | 0.000385 | 0.929284
Table 5. Comparison of absolute errors of different prediction models.

Sample Serial Number | CNN | GRU | GRU-Attention | CNN-GRU-Attention
1 | 0.011427313 | 0.008430898 | 0.003279835 | 0.004217833
2 | 0.006762058 | 0.010302186 | 0.016496509 | 0.000301093
3 | 0.012943178 | 0.002882749 | 0.003816336 | 0.002004951
4 | 0.013780326 | 0.007871121 | 0.005694509 | 0.001495391
5 | 0.005748540 | 0.006720603 | 0.019698501 | 0.000264764
6 | 0.006194934 | 0.002452418 | 0.023608312 | 0.000444755
7 | 0.007489443 | 0.003896475 | 0.022428155 | 0.001112163
8 | 0.006958321 | 0.000323042 | 0.027176306 | 0.000948891
9 | 0.004510581 | 0.005910844 | 0.022330225 | 0.001037091
10 | 0.001725197 | 0.012642682 | 0.016634792 | 0.003122360
11 | 0.005592316 | 0.007008880 | 0.022670507 | 0.001148432
12 | 0.003334522 | 0.008058220 | 0.022099733 | 0.001389056
13 | 0.000453293 | 0.002312928 | 0.024769366 | 0.000990987
14 | 0.002509624 | 0.003312185 | 0.027166948 | 0.001395062
15 | 0.001139462 | 0.003684968 | 0.025482863 | 0.000966758
16 | 0.008094788 | 0.004034996 | 0.018952727 | 0.001177907
17 | 0.012091130 | 0.004834712 | 0.013671935 | 0.001008064
18 | 0.014755487 | 0.003796071 | 0.009149134 | 0.001440793
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
