1. Introduction
With the rapid development of the information revolution and cyberspace, the Internet has greatly promoted economic and social prosperity and progress. It has brought great convenience to people, along with new security risks and challenges. Traditional network security defense systems provide mainly passive defense based on firewalls and anti-virus software. Such systems have serious shortcomings in preventing security threats: they cannot deal with new viruses and cannot prevent attacks in advance, and once a system is invaded, it can suffer huge losses. As the final step of big data incorporated with AI-enabled cyber security situational awareness, network security situation prediction analyzes the previous and current states of the network and then predicts its future situation. It can formulate safe and effective preventive measures before the network is attacked, shifting from passive defense to active defense that performs dynamic analysis, real-time monitoring, and trend prediction. Therefore, designing an effective and accurate network security situation prediction model is a key step in the transition from passive defense to active defense.
At present, there are various network security situation prediction models [1]. Traditional prediction models are divided into gray prediction models [2,3], D-S evidence theory models [4], and artificial intelligence models [5]. Gray prediction models require accurate mathematical expressions for prediction; they are computationally intensive and can only predict the general trend of a network situation. Meanwhile, D-S evidence theory models are qualitative knowledge models based on expert experience and qualitatively described data. Such models cannot make effective use of quantitative data and may face combinatorial explosion, and their prediction accuracy is biased by the uncertainty of expert experience. Neither class of model can meet the situation prediction requirements of large, complex networks. Artificial intelligence models use various neural networks to train on quantitative data, but their learning speed is slow and they are prone to overfitting. Zhang et al. [6] used the gray correlational-entropy method to analyze the correlation of the factors that affect network security, selected the key factors, established the corresponding process equations and prediction equations based on these factors, and predicted the network security situation recursively using Kalman filtering. Although its prediction accuracy is higher than that of the RBF algorithm, its computational cost is also higher. Furthermore, Wang et al. [7] proposed an improved D-S evidence theory for correlation analysis that fuses the reliability and rationality of the prediction results; it reduces the false-alarm rate of the system, but it cannot be used in large-scale networks. Ren et al. [8] established an improved BP neural network prediction model whose predictions are consistent with actual situations, but the gradient-descent characteristics of the BP neural network give the algorithm a lengthy training time. In order to reduce the training time, Zhu et al. [9] used two nonlinear mathematical modeling methods, the extreme learning machine (ELM) and the multilayer perceptron neural network (MLPNN), to establish a prediction model that also has good adaptability.
In order to seek better algorithms for network security situation prediction models, this paper introduces a meta-heuristic search algorithm into an ELM, using the unique optimization capability of the meta-heuristic search algorithm to overcome the shortcomings of the ELM itself and thus achieve better training results. In recent years, many researchers have proposed a variety of meta-heuristic search algorithms and improved versions of them. Meta-heuristic search algorithms are bio-inspired optimization algorithms derived from observing and simulating the natural behavior of biological populations. These algorithms generally have multiple advantages, such as proximity, stability, and adaptability, and are widely used in image processing [10] and feature selection [11]. Common meta-heuristic search algorithms include particle-swarm optimization [12], gray wolf optimization [13], whale optimization [14], sine cosine [15], salp swarm [16], and sparrow search [17] algorithms. Meta-heuristic search algorithms tend to fall into local optima, reducing the diversity of the population in later iterations. In response to this problem, many researchers have proposed improvements to various algorithms. For example, Liu et al. [18] introduced an adaptive leader-follower adjustment strategy to address the unstable solution results of the salp swarm algorithm, which enhanced the stability of the algorithm. Zhou et al. [19] used cat-mapped chaotic sequences combined with the inverse-solution method instead of randomly generated initial populations in order to avoid premature convergence of the whale optimization algorithm, enhancing its initial population diversity and solution-seeking traversal. Zhou et al. [20] used tent chaotic mapping to improve the wolf initialization method, making the initial distribution of wolves more uniform and enhancing the global search capability of the algorithm. Zhang et al. [21] proposed an improved whale optimization algorithm (NGS-WOA) based on nonlinear adaptive weights and a golden sine operator. First, NGS-WOA introduces a nonlinear adaptive weight that enables the search agent to explore the search space adaptively and balance the exploitation and exploration stages. Second, an improved golden sine operator is introduced into the WOA. These strategies effectively improve the performance of the algorithm, giving NGS-WOA strong global convergence and helping it avoid local optima. Zhang et al. [22] proposed a new Gaussian mutation operator for the fireworks algorithm, which makes sparks learn from more samples; combined the rule-explosion operator of the fireworks algorithm with the migration operator of biogeography-based optimization (BBO) to increase information sharing; and adopted a new overall selection strategy that gives high-quality solutions a high probability of entering the next generation without high computing cost. Cheng et al. [23] used an improved tent chaos mapping to initialize the population, increasing population diversity, and added an adaptive local search strategy to improve the global search ability. Liang et al. [24] proposed an improved SSA based on adaptive weights and improved boundary constraints: the adaptive weights improve the convergence speed of the algorithm, and the improved boundary-handling strategy improves the convergence accuracy to a certain extent.
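Several of the improvements surveyed above rely on chaotic initialization, for instance the tent chaotic mapping used in [20,23]. The idea can be sketched as follows (a minimal illustrative sketch in Python; the function name, parameter values, and bounds are our own, not taken from the cited works):

```python
import numpy as np

def tent_map_init(pop_size, dim, lb, ub, mu=2.0):
    """Initialize a population with a tent chaotic map instead of
    uniform random sampling, so that individuals are spread more
    evenly over the search space."""
    chaos = np.empty((pop_size, dim))
    x = np.random.rand(dim)            # chaotic seed in [0, 1)
    x[x == 0.5] = 0.499                # avoid the map's degenerate point
    for i in range(pop_size):
        # tent map: x -> mu*x if x < 0.5, else mu*(1 - x)
        x = np.where(x < 0.5, mu * x, mu * (1.0 - x))
        chaos[i] = x
    # scale the chaotic sequence from [0, 1] into [lb, ub]
    return lb + chaos * (ub - lb)

pop = tent_map_init(30, 10, lb=-5.0, ub=5.0)
print(pop.shape)  # (30, 10)
```

Compared with purely random initialization, the chaotic sequence covers the interval more uniformly, so the subsequent global search is less likely to start from a clustered population.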
In this paper, we weigh the advantages and disadvantages of the ELM and meta-heuristic search algorithms, and improve the SSA among the meta-heuristic search algorithms. Then, by combining the improved SSA (ISSA) with an ELM, we propose an ISSA-ELM network security situation prediction model. By comparing the ISSA with six other algorithms on 15 benchmark functions, we verify the superior performance of the improved algorithm. We also conduct network situation prediction experiments against the traditional ELM algorithm and the GA-ELM algorithm presented by Gokul et al. [25]. The comparison verifies the practicability and accuracy of our model.
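The coupling between a meta-heuristic and the ELM that underlies this approach can be summarized as follows: each candidate solution encodes the ELM's input weights and hidden biases, and its fitness is the resulting training error. The sketch below illustrates only this encoding and fitness evaluation; a plain random search stands in for the ISSA, whose actual update rules are beyond this sketch, and all names and sizes are our own:

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_error(params, X, T, n_hidden):
    """Fitness of one candidate: decode input weights W and biases b
    from the flat parameter vector, solve the output weights in closed
    form, and return the training mean squared error."""
    n_features = X.shape[1]
    W = params[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = params[n_features * n_hidden:]
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T           # closed-form output weights
    return float(np.mean((H @ beta - T) ** 2))

def optimize_elm(X, T, n_hidden=10, pop=20, iters=30):
    """Stand-in optimizer: keep the best of pop*iters random candidates.
    A real (I)SSA would instead update candidates with its own rules."""
    dim = X.shape[1] * n_hidden + n_hidden
    best, best_err = None, np.inf
    for _ in range(iters):
        for cand in rng.standard_normal((pop, dim)):
            err = elm_error(cand, X, T, n_hidden)
            if err < best_err:
                best, best_err = cand, err
    return best, best_err

# toy data: learn y = x^2 on [0, 1]
X = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
T = X ** 2
_, err = optimize_elm(X, T)
print(err)  # small training error
```

The key design point is that the output weights never enter the search space: they are always recovered analytically, so the meta-heuristic only has to optimize the randomly initialized part of the network.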
2. Extreme Learning Machine
The extreme learning machine was first proposed by Huang et al. [26] in 2004. It is based on a single hidden-layer feedforward neural network: the input-layer weights and hidden-layer biases are selected at random, and the output-layer weights are then calculated analytically according to Moore–Penrose generalized inverse matrix theory. The extreme learning machine has the advantages of requiring few training parameters and being a fast learner with strong generalization ability. Let the numbers of nodes in the input layer, hidden layer, and output layer of the ELM be $n$, $l$, and $m$, respectively. The network structure is shown in Figure 1.
For $N$ given arbitrarily different samples $(\mathbf{x}_i, \mathbf{t}_i)$, where $\mathbf{x}_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^{\mathrm{T}} \in \mathbb{R}^n$ and $\mathbf{t}_i = [t_{i1}, t_{i2}, \ldots, t_{im}]^{\mathrm{T}} \in \mathbb{R}^m$, the output of the ELM is as follows:
$$\sum_{i=1}^{l} \boldsymbol{\beta}_i \, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{o}_j, \quad j = 1, 2, \ldots, N,$$
where $\mathbf{w}_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^{\mathrm{T}}$ is the input weight between the input-layer neurons and the $i$-th hidden-layer neuron; $\boldsymbol{\beta}_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^{\mathrm{T}}$ is the output weight between the $i$-th hidden-layer neuron and the output-layer neurons; $b_i$ is the bias of the $i$-th hidden-layer neuron; and $g(\cdot)$ is the activation function of the hidden-layer neurons. The matrix expression of the ELM system is as follows:
$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T},$$
where
$$\mathbf{H} = \begin{bmatrix} g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_l \cdot \mathbf{x}_1 + b_l) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_l \cdot \mathbf{x}_N + b_l) \end{bmatrix}_{N \times l}; \quad \boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^{\mathrm{T}} \\ \vdots \\ \boldsymbol{\beta}_l^{\mathrm{T}} \end{bmatrix}_{l \times m}; \quad \mathbf{T} = \begin{bmatrix} \mathbf{t}_1^{\mathrm{T}} \\ \vdots \\ \mathbf{t}_N^{\mathrm{T}} \end{bmatrix}_{N \times m}.$$
In order to achieve the final training effect of the ELM, the least-squares solution $\hat{\boldsymbol{\beta}}$ needs to be obtained so that:
$$\left\| \mathbf{H}\hat{\boldsymbol{\beta}} - \mathbf{T} \right\| = \min_{\boldsymbol{\beta}} \left\| \mathbf{H}\boldsymbol{\beta} - \mathbf{T} \right\|,$$
where $\mathbf{H}$ is the hidden-layer output matrix of the ELM network and $\mathbf{T}$ is the expected output matrix of the network's samples. Finally, the output weight is obtained by solving the formula:
$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T},$$
where $\mathbf{H}^{\dagger}$ is the Moore–Penrose generalized inverse of the output matrix $\mathbf{H}$.
It can be concluded that the ELM does not need the gradient-descent method when training samples. Compared with a traditional back-propagation neural network trained by gradient descent, the ELM greatly reduces training time while retaining accurate prediction capability.
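As an illustration of this closed-form training, a minimal ELM regression can be written in a few lines of NumPy (an illustrative sketch, not the implementation used in this paper; the hidden-layer size, activation, and toy data are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, n_hidden=50):
    """Random input weights and biases; output weights solved in one
    step via the Moore-Penrose pseudoinverse (beta = pinv(H) @ T)."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # input weights w_i
    b = rng.standard_normal(n_hidden)                # hidden biases b_i
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ T                     # no gradient descent needed
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy regression: learn y = sin(x) on [-3, 3]
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T)
mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print(mse)  # small training error
```

Training reduces to a single pseudoinverse computation, which is why the ELM is orders of magnitude faster to train than an iteratively optimized back-propagation network of similar size.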
6. Conclusions
To tackle the problem of accuracy in network security situation prediction, we improve the sparrow search algorithm and combine it with the extreme learning machine to propose the ISSA-ELM model. The ELM neural network trains samples quickly, while the ISSA optimizes its initial weights; together, they can accurately predict the future network security situation. The ISSA overcomes the original algorithm's tendency to fall into local optima, has good global convergence performance and robustness, shows stronger optimization capability, and outperforms the original algorithm overall.
Experimental comparisons show that the ISSA-ELM model has certain advantages over GA-ELM and SSA-ELM in a real network environment: it converges quickly and achieves higher prediction accuracy. However, ISSA-ELM also has shortcomings. For example, the hidden-layer node-selection process involves great uncertainty, and an overly large sliding-window size makes ISSA-ELM prone to overfitting. Future studies should focus on adaptively selecting the number of hidden-layer nodes to further improve the convergence speed and prediction accuracy.