1. Introduction
Symmetry can be viewed as a guiding principle of reform and modernization. The 21st century is characterized by a worldwide drive for digital modernization in many areas of human life. Research on innovative digital development needs to be carried out symmetrically and must reach all areas of the global community. Today, there is an asymmetry of scientific contribution in the study of digital innovation between technical areas and the humanities. The reason for this situation is that the problem of understanding digital innovation is associated, first of all, with data mining, machine learning, cyber data, and the Internet of Things, which belong to the field of technical sciences. However, an analysis of the formation of digital modernization will be meager and asymmetric if we do not take into account socio-economic aspects, such as the persistence and reliability of digital transformation.
Mathematics and information technology form a multifunctional scientific basis that provides solutions to many issues in both the technical and economic sciences, especially in the field of complex security.
The rapid development of information technologies opens up new prospects for the development of information processing systems, whose structure is changing fundamentally. At the same time, regulatory requirements are being tightened, especially in terms of complex object security [1]. For the FEC (fuel and energy complex) of Russia, this is connected with current and recently adopted normative documents, such as: the energy security doctrine of the Russian Federation (Decree of the President of the Russian Federation, 13 May 2019, № 216); the new version of the information security doctrine (Decree of the President of the Russian Federation № 646, dated 05 December 2016); federal law, 26 July 2017, № 187-FL, “On the Security of the Russian Federation Critical Data Infrastructure”; federal law, 21 July 2011, № 256-FL, “On the safety of fuel and energy complex facilities”; and federal law, 27 July 2006, № 149-FL, “About information, information technology and data security” [1]. In these documents, priority is given to the security of fuel and energy complex objects [1] on the basis of the threat monitoring of their functioning. These issues are also relevant for the global fuel and energy industry. The report [2] indicates: «At the same time, intense financial strain is hurting the industry, including companies who own and operate critical infrastructure facilities. Policy makers and regulators need to ensure that operational, maintenance and safety expenditures are prioritised and appropriately maintained».
In this case, the threat monitoring of any object’s functioning is a relevant issue that is currently implemented using the following technologies, illustrated here by the example of FEC objects:
Monitoring using separate, usually heterogeneous security systems: physical [3,4]; economic [5,6]; fire [7,8]; informational [9,10]; intellectual property security, such as data loss prevention; technogenic [11,12]; psychological; security against terrorism [13]; ecological safety [14]; and power security [5,15], where there is no integration of information flows of security threats (events, violations, and incidents) between them for the purpose of collecting, processing, comprehensively analyzing, and forecasting risks and undesired events.
Monitoring using models and algorithms to support security management with an observer [9,10,16,17,18] who makes the final decision based on a specific security system model.
Today, predictive analytics [19] is a global trend [20,21]. This is primarily due to the big data phenomenon, which requires new methods of information processing, such as data mining. Experts believe [22,23] that it opens up new prospects for security tasks, including the threat monitoring of any object’s functioning.
There are attempts to create such solutions by individual domestic companies, such as Kaspersky Industrial CyberSecurity and Kaspersky Machine Learning for Anomaly Detection of Kaspersky Lab, KOMRAD of NPO (scientific-industrial association) ECHELON, ViPNet Industrial Security of INFOTECS, etc. These companies face a lack of domestic digital technologies for a number of security tasks in the fields of data mining, artificial intelligence, and predictive analytics. Foreign solutions are more widely available on the market, for example, the MaxPatrol security control system of Positive Technologies, the EcoStruxure architecture of Schneider Electric, the Predix platform of General Electric, etc. Like the domestic solutions, however, they do not take into account the possibility of using artificial intelligence and predictive analytics for a number of security tasks.
It should be noted that for the Predix platform of General Electric, it is stated that machine learning algorithms are used for solving analytical problems, first of all to detect anomalies in data flows about the state of industrial equipment. Predix should be considered exclusively as an analytical platform for monitoring the state of industrial equipment; however, there are a number of objectively complex issues. There is no open information about the security of the software implementations of the Predix machine learning algorithms, which poses a potential threat to information security at several levels of the OSI model. The IIoT platform architecture and Predix analytical applications can run both in the cloud and on-premises, i.e., at the edge level. This fact raises doubts about the possibility of deploying the platform in accordance with the regulator requirements (Order of the Ministry of Energy of the Russian Federation № 1015, dated 6 November 2018), which establish the need for data processing centers to be located on the territory of the Russian Federation.
Thus, we can distinguish the following promising technologies for the threat monitoring of any object’s functioning:
The complex security monitoring of an object considering the integration of information flows of security threats and events, including critical systems security [24,25], and their processing using artificial intelligence [26,27,28];
Monitoring using integrated security management support algorithms [12,29] and component models that include reasonable intruder models and digital object models constructed from digital structures, such as information models [17], neural networks [12], or neurographs [30]; and
Monitoring based on predictive models [19,26,28] using current states and historical incident events at an object to forecast future incidents.
Today, the last technology is hardly implemented for security systems [21]. This is critical for the whole class of risk analysis techniques and methods currently used for security systems, including [31,32,33]: brainstorming, structured and semi-structured questionnaires, the Delphi method, checklists, scenario analysis, Business Impact Analysis (BIA), fault tree analysis, event tree analysis, cause-and-effect analysis, layers of protection analysis, decision tree analysis, Human Reliability Assessment (HRA), bow tie analysis, Markov chains, queueing theory, the Monte Carlo method, the Bayesian approach and Bayesian networks, FN curves, the index method, the probability and impact matrix, multicriteria analysis, etc. Most of these techniques and methods solve the problem on the basis of a specific security system model rather than a complex object security model. Their application is also limited in terms of integrating information flows of security threats and events; processing them and assessing risk using artificial intelligence; and accounting for risk with respect to a specific intruder model and a digital object model (digital twin). Almost none of these techniques and methods are applicable to big data, and they do not implement all the features of data mining in predictive analytics. This explains their limited use for different objects, including the fuzzy final output relative to the current object states.
The approach proposed in this article addresses the risk assessment problem by predicting complex object security incidents based on incident monitoring. The aim of the study is to forecast and assess the risk of the complex security of an object based on the main security requirements. The subject of the study is a system for the risk assessment and forecasting of complex security incidents of an object based on its monitoring. The purpose of the study is to improve the efficiency of forecasting and risk assessment for complex object security on the basis of forming a data mining classification model according to safety requirements and a system for forecasting complex security incidents based on monitoring.
At the end of this section, we discuss the novelty of the study. The obtained results allow us to quantify a reasonable risk of a future incident for any object, an approach widely used in world practice, while the mathematical model forms the basis of a new approach to the risk assessment of complex threats based on their monitoring.
The Decree of the President of the Russian Federation № 490, dated 10 October 2019, «On the development of artificial intelligence in the Russian Federation», directly indicates the relevance and necessity of creating such systems to ensure national security and law enforcement; improve employees’ safety when performing business processes (including forecasting risks and adverse events, and reducing the level of direct human participation in processes associated with an increased life and health risk); and ensure complex security system formation when creating, developing, implementing, and using artificial intelligence technologies.
At the same time, the solution of this task will contribute to the formation and development of unique complex security systems aimed at identifying, repelling, and eliminating existing and future threats to operation, and to ensuring object security, including constant threat monitoring.
The solution of this task will also add opportunities for the practical clarification of the current regulatory documents on critical information infrastructure objects of the Russian Federation; for a wider application of data mining algorithms and big data technologies when considering not only a physical object but also its digital twin; and for building security systems against complex threats and developing the neurograph theory of complex security.
2. Materials and Methods
The stages of forming a predictive model as a cybernetic predictive device were first described in [34], and their use in automatic control theory with artificial intelligence elements, as a predictive module, was described in [26]. Then, for joint use with data mining, the stages were clarified in a number of domestic and foreign articles. For example, [19] indicates that data mining methods can be used to form a predictive model, including the decision tree method, which is consistently recognized as the most popular practical method due to its simplicity and efficiency. In what follows, we adopt the approach described in [26], taking data mining methods into account. The proposed approach for solving this problem includes:
Building and training a classification model. This step is performed using a data sample that includes a finite set of precedents, namely previous incidents at the site. For each incident, the main groups of security requirements are collected and analyzed. They are used to train a model that identifies dependencies, patterns, and interrelations that correspond not only to the given sample but also to other parameters, even those that have not yet been observed. Data sets of object incidents, with the parameters of each object where an incident occurred, serve as model attributes. For the classification model, we used the C4.5 algorithm, which characterizes the class of each object. Using a set of object attributes and the corresponding class, the C4.5 algorithm builds a decision tree that can predict the classes of new, previously unseen objects from their attributes. Decision tree classification creates a flowchart for distributing new data. At the first stage, an algorithm and a program were developed to generate the data on the basis of which the decision tree is built. This algorithm is a kind of expert system that determines the acceptable range of values for the checked set. The range is created from conditions formed in accordance with the current regulatory documents, including those mentioned above.
The decision tree building. At the second stage, C4.5 builds a decision tree based on the data produced by the program. This tree is designed to speed up the process of predicting an incident for the checked set and to identify possible security violations that may lead to an incident.
Development of the complex security risk assessment and incident forecasting system. The system is a predictive self-configuring neural system serving as a prediction module [26], including a self-configuring neural network, a recurrent neural network, and a predictive model that allows for determining the risk and forecasting the probability of an incident for an object. The neural system is built on the principle of forming a base prediction class of an object, including historical incident events and considering current object states via a fuzzy system of predictive model outputs for risk assessment and for forecasting future incidents.
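As an illustration of the attribute-selection step above, the following sketch computes the gain-ratio criterion that C4.5 uses to choose splitting attributes. The incident records, attribute meanings, and class labels are invented for illustration and are not taken from the study.

```python
# Minimal sketch of the C4.5 attribute-selection criterion (gain ratio).
# Data and attribute names are illustrative, not from the paper.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """Information gain of splitting on a discrete attribute, normalised
    by the split information, as C4.5 does."""
    n = len(labels)
    base = entropy(labels)
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(row[attr], []).append(lab)
    gain = base - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical incident records: (object type, violation found) -> class.
rows = [(0, 1), (1, 0), (0, 1), (1, 0)]
labels = ["incident", "no_incident", "incident", "no_incident"]

# Attribute 0 separates the classes perfectly here, giving the maximum ratio.
print(round(gain_ratio(rows, labels, 0), 3))  # 1.0 for a perfect binary split
```

C4.5 repeats this computation at every node, splitting on the attribute with the highest gain ratio until the leaves are (sufficiently) pure.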
According to the definition given by the workshop organizers [35], self-configuration is the process of selecting and using existing algorithmic components (such as operator variants). However, this definition is given exclusively for genetic algorithms. It is therefore appropriate to define an SCNN (self-configuring neural network): a self-configuring neural network is a neural network that configures its neurons and the connections between them on the basis of the prediction class of an object.
SCNN can be trained with or without a teacher depending on which neurons are used in the neural network. A perceptron [36], a sigmoid neuron [37,38], and an adaline neuron [39] can only be trained with a teacher. The instar and outstar neurons proposed by S. Grossberg can be trained with or without a teacher. A teacher is not required to train WTA-type neurons [37,38,40]. Learning with a teacher requires an “external” teacher who evaluates the behavior of the system and manages its subsequent modifications. When learning without a teacher, the network makes the required changes by self-organizing.
The most interesting aspect is the possibility of training SCNN perceptrons without a teacher, which is theoretically possible through adaptive thresholds of the neuron activation functions and a special learning algorithm.
To adapt thresholds, one must be able to forecast them. Currently, various approaches are used for this purpose, based on probability theory [41] and mathematical statistics [42]; RNNs (recurrent neural networks) [43]; self-organization based on the competitive Kohonen rule [40], the Hebb rule [44], or the Gustafson–Kessel algorithm [45]; and the functional and correlation analysis of time series based on the Takens–Mañé theory on the possibility of reconstructing the phase space of a dynamical system [26,46]. All these methods are used to find a certain measure of the considered set, namely the attractor. For a recurrent neural network, the attractor is a stability point in the state space, a local minimum of the Lyapunov energy function [36,37,47]. Nowadays, RNNs are widely used [48,49,50] in LSTM-based models using GRUs and encoder-decoders combining RNNs and convolutional neural networks. In [49], a network with an LSTM-based encoder-decoder architecture proved effective for detecting anomalous (critical) values, including for time series that do not contain information about external factors affecting the correlation between time series elements. A similar task is to forecast future threshold values based on historical threshold events without considering any other external factors that directly or indirectly affect the threshold.
The solution that is usually obtained requires clarification, especially regarding the accounting of current incidents for the examined object, risk assessment, and the forecasting of future incidents, where the risk value is vague and determined by a corresponding linguistic variable. Currently, fuzzy inference systems based on the principles of Zadeh fuzzy logic [51], the Wang–Mendel modeling method [52], and Takagi–Sugeno–Kang models [53] are widely used for this purpose.
3. Results
Consider the set (1) of historical events (precedents) about incidents that occurred at objects, where each element is an examined object. Each object is characterized by a set of variables (2), comprising independent base variables, whose values are known and on whose basis the set of the object’s base class is defined; independent functional variables, whose values are known and on whose basis the value of the dependent variable is determined; and the time of the incident at the object.
Furthermore, using the concept of data mining as an analogy close to predictive analytics [19], we write the set of independent base variables in the form of the variable set (3) for the object under study, where each variable can take a value from the set (4). The variable set of the examined object is given by (5), comprising independent base variables whose values are known; independent functional variables, whose values are known and on whose basis the value of the dependent variable is determined; and the time of the incident at the examined object.
In real-time mode, the object can have a number of current states, which make up the set (6) [29]. The subject is notified about each of them by event alerts (7) [29]. It is assumed that the number of object states is not equal to the number of event alerts (8): one object state is reported by several event alerts, whose count is the number of event alerts that report the given object state. For the object under study, we write, by analogy, the variable set (9) and the set (10).
Then, the set of the base prediction class for the object under study is given by (11). Equation (11), and hence the definition of the base prediction class, should be explained as follows. The base prediction class includes all possible historical incident events at objects for which two requirements hold: all elements of the independent base variable set are equal to those of the examined object, and, for each variable of the independent base variable set, all elements of its value set are equal to the elements of the examined object’s set.
Let us write the independent functional variable set as the variable set (12) for the object under study. Each variable can take a value from a set determined by a cause-and-effect relation (13).
The most suitable risk analysis method from the discussion in Section 2 can be used to form the selected set (13) for each specific object. To form this set, we use the class of each object and the C4.5 decision tree built in steps 1 and 2, where the variable can take a value from a discrete set. It should be noted that, in most cases, for historical incident events the cause-and-effect relationship has already been established by the relevant competent authorities.
For the examined object, let us write the variable set (14). Each variable can take a value from a set (15) defined as a result of preventive, supervisory, or expert actions in relation to the object, based on the current regulatory documents and technical regulations that determine safe object operation.
To form the specified set (15) for each particular object, the same risk analysis method should be used as for set (13); however, the risk analysis is now performed on data obtained as a result of preventive, supervisory, or expert actions. To form this set, we use an approach similar to the formation of set (10), where for the class of each object there is a C4.5 decision tree, built in steps 1 and 2, on the basis of preventive, supervisory, and expert actions.
To determine the value of the dependent variable of the base prediction class of the object, the incident prediction system includes an SCNN (Figure 1), which is trained without a teacher. The SCNN architecture is a three-layer feedforward network of the perceptron type, in which the input layer neurons correspond to the independent functional variables from (2) received with (11); the intermediate layer neurons correspond to the independent base variables (2) and have an adaptive activation function threshold; and the output layer neurons correspond to the value of the dependent variable and have an adaptive activation function threshold.
In the SCNN architecture, taking into account the base prediction class, we solve the fundamental problem of training a perceptron neural network without a teacher. This ability is provided by the presence of adaptive thresholds for the activation functions of the intermediate and output layer neurons, as well as by a special learning algorithm.
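A minimal sketch of the forward pass of such a three-layer perceptron-type network with per-neuron activation thresholds is given below. The layer sizes, weights, and threshold values are illustrative assumptions, and the study's specific activation functions (Equations (29) and (31)) are not reproduced; a generic shifted unipolar sigmoid stands in for them.

```python
# Sketch of a three-layer feedforward (perceptron-type) network with
# per-neuron activation thresholds, in the spirit of the SCNN described above.
# Weights, sizes, and threshold values are illustrative assumptions.
import math
import random

random.seed(0)

def sigmoid(x, threshold):
    # Unipolar sigmoid shifted by an adaptive threshold.
    return 1.0 / (1.0 + math.exp(-(x - threshold)))

def layer(inputs, weights, thresholds):
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)), t)
            for ws, t in zip(weights, thresholds)]

n_in, n_hidden, n_out = 4, 3, 1   # sizes are assumed, not from the paper
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
t_hidden = [0.1, 0.2, 0.3]        # adaptive thresholds (assumed values)
t_out = [0.5]                     # output threshold (assumed value)

x = [0.2, 0.7, 0.1, 0.9]          # example independent functional variables
hidden = layer(x, w_hidden, t_hidden)
y = layer(hidden, w_out, t_out)[0]
print(0.0 < y < 1.0)              # the dependent variable lies in (0, 1)
```

In the described system, the thresholds in `t_hidden` and `t_out` would not be fixed constants but forecast values produced by the recurrent model, which is what makes training without a teacher possible.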
To determine the adaptive threshold of the activation function, let us use the fundamental concept of an elementary outcome [41] for the incident. If the present moment of time is fixed, then, for the set of historical events (1) about incidents that occurred up to that time, all elementary outcomes are identified. Then, the adaptive threshold of the activation function for that time is, in the general view, a distribution (probability) series of discrete values, taking into account the number of intermediate layer neurons (16).
To adapt the threshold to a future time, it is necessary to forecast the threshold time series. This task can be formalized as follows: given the values of a univariate time series characterizing the threshold values at discrete equidistant moments of time, determine the values of the subsequent time series elements that make up the variable set for the relevant points in time; their number defines the time series forecast horizon. The sampling value is defined by (17). Then, considering (16), the threshold values for the forecast time are given by (18). The dimension of the time series formed on the basis of (18) equals the length of the input window, and the number of predicted elements defines the forecast horizon. The optimal set of these parameters can be found experimentally.
To solve the problem, we need to define the functional dependency (19), where one variable set is used for forecasting, another is the set of predicted values, and the dependence function between the time series elements is set by the internal structure of the RNN. The dependence function is determined by pre-training the RNN model on a sample consisting of thresholds taken at a given interval. During training, pairs of variable sets are iteratively selected from the sample using algorithm (20).
Each pair of variable sets is inserted into (19), which changes the internal structure of the RNN according to the task-defined regularities in the training data array, i.e., the model is trained and the dependence function is identified.
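The sliding-window construction of training pairs described by (17)-(20) can be sketched as follows; the window lengths and the threshold values in the series are illustrative assumptions.

```python
# Sketch of the sliding-window pair construction used to train the
# forecasting model: from a univariate threshold series, form pairs of an
# input window of length m and a forecast window of length h.
def make_training_pairs(series, m, h):
    """Return (X, Y) pairs: X = m consecutive values, Y = the next h values."""
    pairs = []
    for i in range(len(series) - m - h + 1):
        x = series[i:i + m]
        y = series[i + m:i + m + h]
        pairs.append((x, y))
    return pairs

thresholds = [0.1, 0.2, 0.15, 0.3, 0.25, 0.4]  # example threshold history
pairs = make_training_pairs(thresholds, m=3, h=1)
for x, y in pairs:
    print(x, "->", y)
```

Each printed pair corresponds to one iteration of the selection in (20): the input window is fed to the recurrent model and the forecast window serves as its training target.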
The adaptive activation function threshold of the output layer neuron is defined by (21). Physically, this means that the activator of incident development is the cause-and-effect relationship with the maximum activation function threshold value in time.
The proposed SCNN architecture (Figure 1) is not recurrent, i.e., there is no signal transfer from the output or hidden layer to the input layer [36,37,47,54,55,56]. Therefore, this fact must be taken into account in the training process.
The SCNN network training occurs in several stages using a special algorithm:
- 1.
Configuration of the SCNN architecture (Figure 1). A prerequisite for any neural network to develop good generalization abilities in the learning process is a competent identification of the Vapnik–Chervonenkis measure [47]. Then, for the SCNN, considering the proposed approach [57], we define the upper and lower bounds of this measure by (22), where [ ] denotes the integer part of a number, and the remaining quantities are the dimension of the variable set, the total number of network weights, the total number of neurons in the network, and the network weights themselves.
From Equation (22), it follows that the lower bound of the range is approximately equal to the number of weights connecting the input and intermediate layers, and the upper bound exceeds double the total number of network weights. From Equation (22), it is easy to obtain the network configuration conditions (23), which must be satisfied both when configuring and when reconfiguring the network, where the lower bound of the range (24) is defined by the SCNN architecture. Thus, the number of neurons in the intermediate layer is affected by the ratio of the number of network weights to the number of training samples of the base prediction class (11).
- 2.
Training of the RNN model and determination of the thresholds of both the intermediate and output layers by algorithm (20) and Equation (21). At this stage, it is necessary to determine the sample parameters in advance; this can be done experimentally and will be shown later.
- 3.
Training of the SCNN using the set of base prediction classes (11). In general, the learning algorithm is determined by the type of neurons used in the neural network. For the SCNN of the perceptron type (Figure 1), the gradient method and backpropagation [47] are used. In this case, the required continuous activation functions for the intermediate and output layers are obtained by approximating the corresponding threshold values. This approach has the following physical meaning: at the threshold value, the probability of an incident is 50%, which is equivalent to the threshold of its occurrence at the given time. Then, for a unipolar sigmoid activation function [47] of the intermediate layer, we obtain Equation (25), where the parameter affects the form of the activation function.
It is easy to show that Equation (25) can be written as (26), which leads to an exponential approximation [42] of the threshold series by an exponential function (27), whose parameters are those of the approximating exponential. Then, taking the logarithm of Equation (27) and applying the properties of logarithms to its right side, we obtain Equation (28) for the parameters.
The final equation (29) of the intermediate layer activation function, taking into account (26) and the limit value, follows, where the parameter affects the form of the activation function. From Equation (26), it is easy to obtain the approximation error, which equals the known mean-squared error [42] of the exponential approximation under the condition of convergence of the corresponding sequence to zero in the mean square [41] (30).
Equations (29) and (30) imply the convergence of the functions in probability, based on the condition of convergence with probability one (16), and convergence in the mean square [41]. Convergence in probability implies weak convergence of the functions [58], assuming that the function is continuous and can have discontinuities only at isolated points, from which the required convergence follows.
By analogy, we obtain the final equation (31) of the output layer activation function for an exponential approximation, including (21) and (29), where the parameter affects the form of the activation function. By analogy, we also obtain the approximation error (32), which equals the known mean-squared error of the exponential approximation.
The algorithm [47] terminates when the norm of the gradient of the objective function falls below a given value characterizing the accuracy of the learning process. For the SCNN of the perceptron type (Figure 1), this means that the components of the gradient (33) for all weights of the intermediate and output layers must be less than or equal to the corresponding errors.
The obtained Equations (29) and (30) can be estimated through the corresponding confidence coefficients of the function approximations by the Fisher criterion [42] (34). From Equation (34), it is easy to obtain the reliability condition (35) of the SCNN model, where the critical value of the Fisher criterion is taken from the Fisher distribution tables [42] for a given significance level and known parameters.
Thus, the data for SCNN training is the base prediction class, and the accuracy of its training is determined by the network independently, without a teacher.
- 4.
Completion of training.
- 5.
SCNN reconfiguration, which includes a repetition of steps 1–3 if there is information about a change in the set of independent functional variables or the set of independent base variables (2), or a repetition of steps 2–3 if there is information about a new incident in (1).
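As a numerical illustration of the sizing relation stated in step 1, the following sketch computes bounds under the approximations quoted there (a lower bound close to the number of input-to-intermediate weights, and an upper bound exceeding double the total weight count). The exact form of Equation (22) is not reproduced, and the layer sizes are assumed.

```python
# Illustrative check of the network-sizing relation from step 1: the lower
# bound of the Vapnik-Chervonenkis measure is taken as roughly the number of
# input-to-hidden weights, and the upper bound as more than twice the total
# weight count. This is an illustration under those stated approximations,
# not the paper's exact Eq. (22).
def vc_bounds(n_in, n_hidden, n_out):
    w_in_hidden = n_in * n_hidden          # weights: input -> intermediate layer
    w_total = w_in_hidden + n_hidden * n_out
    lower = w_in_hidden                    # approx. lower bound (as stated)
    upper = 2 * w_total + 1                # exceeds double the total weights
    return lower, upper

lower, upper = vc_bounds(n_in=4, n_hidden=3, n_out=1)
print(lower, upper)  # 12 31
```

Comparing such bounds with the number of training samples of the base prediction class (11) gives a quick feasibility check when choosing the intermediate layer size during configuration or reconfiguration.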
After network training, data from the object under study (5) is fed to the network and, as a result, the value of the dependent variable is formed for the base prediction class (11). It should be noted that during operation we obtain the value at the current moment rather than at the training moment, as in the classical neural network training approach [37,47]. This is due to the time difference between the last network reconfiguration and the current moment; since the total execution time of steps 1 and 3 is approximately 1 min [59], the two moments can be accepted as equal (36).
Furthermore, to refine the obtained results and form the predictive model, we consider the dependent variable as a membership function [51] over the states (6), comprising three fuzzy sets, namely a “low risk”, “medium risk”, and “high risk” of an incident at the object. The states (6) correspond to the incident cause, the incident consequence, and the incident occurrence.
The membership functions are shown in Figure 2. They are built considering both the analysis of the states and the assumption that, among the fuzzy sets “low risk”, “medium risk”, and “high risk”, the set “medium risk” has a symmetric triangular membership function relative to the central point of the event.
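A minimal sketch of the three membership functions on a normalized [0, 1] output, with a symmetric triangular function for "medium risk", is given below; the breakpoints are illustrative assumptions rather than the study's calibrated values from Figure 2.

```python
# Sketch of the three membership functions: "low risk", "medium risk", and
# "high risk", with a symmetric triangle for "medium risk" centred on the
# middle of the range. Breakpoints on [0, 1] are illustrative assumptions.
def mu_low(x):
    return max(0.0, min(1.0, (0.5 - x) / 0.5))

def mu_medium(x):
    # Symmetric triangle centred at 0.5 with support (0, 1).
    return max(0.0, 1.0 - abs(x - 0.5) / 0.5)

def mu_high(x):
    return max(0.0, min(1.0, (x - 0.5) / 0.5))

x = 0.5  # network output at the central point of the range
print(mu_low(x), mu_medium(x), mu_high(x))  # 0.0 1.0 0.0
```

At the central point, only "medium risk" is fully activated; toward the ends of the range the "low risk" and "high risk" sets dominate, which is the behavior the symmetric construction above is meant to convey.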
Then, the probability (37) of the occurrence of a fuzzy event at the object under study, where the event is a low, medium, or high risk value of an incident, is determined by the formula from [52,60], in which the probability of the occurrence of each condition appears. If we assume a continuous uniform distribution [41] of state occurrence, we derive (38), where the value in square brackets is obtained by solving a system of equations for the membership functions of “medium risk” and “high risk” of an incident at the object.
Result (38) takes into account the vagueness of the linguistic variables “low risk”, “medium risk”, and “high risk” of an incident at the site, and each event is assigned the probability of its occurrence.
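Following Zadeh's definition of the probability of a fuzzy event used in (37), the probability is the membership-weighted sum of the state probabilities. In this sketch the membership values and the uniform state probabilities are illustrative assumptions, not values from the study.

```python
# Sketch of Eq. (37): the probability of a fuzzy event as the membership-
# weighted sum of state probabilities, P(A) = sum_i mu_A(s_i) * p(s_i),
# following Zadeh's definition. All numeric values are illustrative.
def fuzzy_event_probability(memberships, probabilities):
    return sum(m * p for m, p in zip(memberships, probabilities))

# Three states (incident cause, consequence, occurrence), assumed equiprobable
# as in the continuous uniform distribution assumption leading to (38).
p_states = [1 / 3, 1 / 3, 1 / 3]
mu_high = [0.0, 0.5, 1.0]  # assumed "high risk" memberships of the states

print(round(fuzzy_event_probability(mu_high, p_states), 3))  # 0.5
```

The same weighted sum, evaluated with the "low risk" or "medium risk" membership values, yields the probabilities used in the fuzzy rules below.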
The obtained Equation (38) forms the basis of the fuzzy inference system [
52,
53] of the predicative model. Then, let us formulate the fuzzy rule inference system for the risk assessment and prediction of incident:
If = , then there is a “high risk” of an incident on the object. The incident is occurring at the current time or, with probability , will occur during due to violations of current regulations corresponding to or due to random causes corresponding to (11).
If = , then there is a “medium risk” of an incident. The incident will occur at time with probability due to violations of existing regulatory documents corresponding to or due to random causes corresponding to (11).
If = , then there is a “low risk” of an incident on the object. The incident is not occurring at the current time , or the incident will occur with probability during time due to violations of existing regulatory documents corresponding to or due to random causes corresponding to (11).
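A minimal sketch of firing these three rules, assuming max-membership rule selection over illustrative triangular fuzzy sets (the breakpoints are hypothetical stand-ins, not the paper’s values):

```python
def tri(x, a, b, c):
    # Triangular membership: 0 outside (a, c), peak 1 at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Illustrative fuzzy sets on a normalized incident-state axis in [0, 1].
MEMBERSHIPS = {
    "low risk":    lambda x: tri(x, -0.5, 0.0, 0.5),
    "medium risk": lambda x: tri(x, 0.0, 0.5, 1.0),
    "high risk":   lambda x: tri(x, 0.5, 1.0, 1.5),
}

def infer_risk(x):
    """Fire the rule whose fuzzy set has the highest membership at x."""
    return max(MEMBERSHIPS, key=lambda label: MEMBERSHIPS[label](x))
```

A state near 0 then triggers the “low risk” rule, a state near the center the “medium risk” rule, and a state near 1 the “high risk” rule.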
It should be noted that, above, we discussed the situation with one incident , but this approach can be generalized to many incidents (39); if accepted, the last set will be written as an extended form of the complex threat [27] and represented as the set (1). Set (6), which is the set of elementary threats , and set are related as follows:
where the elementary threat is determined by the incident under the terms:
Further arguments presented in [27] do not lose their meaning; rather, they add opportunities for a wider application of data mining algorithms and big data technologies in building security systems against complex threats and for the development of the neurographic theory of complex security. At the same time, taking into account the results obtained in this article, they allow us to solve the important problem of optimizing detector sets for an object considering (40), while maintaining the necessary level of object security against complex threats.
4. Experiment and Discussion
For practical use of the proposed incident prediction system, it is important to define optimal sets of independent basic and functional variables. The selection is made on the basis of a preliminary analysis of the object, taking into account the current regulatory documents. The most generalized approach to this analysis is described in [
61] and is based on Peril Classification by the categories of geophysical, hydrological, meteorological, climatological, biological, and extraterrestrial. If one follows the approach of the Center for Security Studies (CSS) [
62], such an analysis for a critical infrastructure object can be performed on the basis of a scored resilience index, where variables are weighted by domain experts in collaboration with representatives of the critical infrastructure object on the basis of their relative importance compared to others. In Russia, the rules for categorizing objects of the critical information infrastructure of the Russian Federation (Decree of the Government of the Russian Federation № 127, dated 8 February 2018) define a categorical approach to critical infrastructure object evaluation based on indicators of significance criteria, including social, political, economic, and environmental significance for national defense, state security, and law enforcement, related by category to measures ensuring the safety of the object (Order of the Federal Service for Technical and Export Control of Russia № 239, dated 25 December 2017). Taking this approach into account, data from current regulatory documents and technical regulations that determine the safe operation of an object can be used to determine the variables.
To apply the proposed incident prediction system in practice, that is, to conduct SCNN training without a teacher and improve forecasting performance, we need to define the sample parameters in advance, namely , , and , as well as the network reconfiguration time . This can be done experimentally.
Assume that the sample consists of 1080 threshold values taken at an interval of 30 s. During minimal preprocessing of the training sample, the presence of missing values is checked and numeric values are normalized to the range . The algorithm for creating and training the RNN model is implemented in Python using the open-source TensorFlow library. The RNN architecture is a two-layer network with one layer of LSTM cells and one fully connected layer. The following parameter ranges, in minutes, are selected for the experiments: with a step of 1 and with a step of 2.5. During the experiment, we compared the RNN-predicted threshold values with the actual ones, using elements of a separately collected test sample, which was not used in RNN training, as input and verification data. The forecast error is calculated as a percentage: the modulus of the difference between the actual and predicted threshold values. A good value , with a mean prediction error of 5.41% on the interval , was obtained for the parameter min (Figure 3).
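The preprocessing and error measurement described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: a synthetic series stands in for the real 1080-value sample, a naive persistence forecast stands in for the trained two-layer LSTM, and the window sizes merely illustrate a 30 min history and 10 min horizon at the 30 s sampling interval.

```python
import numpy as np

def normalize(series):
    """Min-max normalization of threshold values to the range [0, 1]."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)

def windows(series, n_in, n_out):
    """Slide over the series: n_in input values, n_out values to forecast."""
    X, Y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        Y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(Y)

def mean_forecast_error(actual, predicted):
    """Mean absolute actual-vs-predicted difference, as a percentage."""
    return 100.0 * np.mean(np.abs(actual - predicted))

# Synthetic stand-in for a 1080-value sample taken at 30 s intervals.
rng = np.random.default_rng(0)
series = normalize(np.cumsum(rng.normal(size=1080)))

# 60 values = 30 min of history; 20 values = 10 min forecast horizon.
X, Y = windows(series, n_in=60, n_out=20)

# Persistence baseline in place of the trained LSTM: repeat the last input.
pred = np.repeat(X[:, -1:], Y.shape[1], axis=1)
err = mean_forecast_error(Y, pred)
```

In the actual experiment, `pred` would come from the trained LSTM model, and `err` would be averaged over the separately collected test sample.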
The graph of the mean dependence (over all values of ) of the forecast error on is shown in Figure 4.
Based on the obtained results, we can draw the following conclusions.
1. The forecast error decreases as the amount of input data increases (Figure 4). The lowest mean error for any forecast horizon is achieved when analyzing the values of the previous 30 min.
2. Using the optimal set of parameters ( and ) gives a mean forecast error of 4.41% at a forecast horizon of 10 min, which is a good result (Figure 3).
3. When using the optimal set of parameters ( and ), at the given significance level of and the value of the Fisher criterion [42], the critical value is obtained. The obtained result fully satisfies requirement (35), which indicates the reliability of the SCNN model as a whole.
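The reliability check in item 3 can be reproduced along these lines. Since the degrees of freedom and the observed F value are not restated here, the numbers below are placeholders, and scipy is assumed to be available:

```python
from scipy.stats import f as fisher_f

def model_is_reliable(f_value, alpha, dof_model, dof_resid):
    """Compare the observed Fisher (F) statistic with the critical value
    at significance level alpha; the model is accepted when F > F_crit."""
    f_crit = fisher_f.ppf(1.0 - alpha, dof_model, dof_resid)
    return f_value > f_crit, f_crit

# Illustrative numbers only (alpha = 0.05, assumed degrees of freedom);
# the paper's actual F value and degrees of freedom should be used instead.
ok, f_crit = model_is_reliable(f_value=12.0, alpha=0.05,
                               dof_model=3, dof_resid=30)
```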
4. Keeping the RNN forecast error at the level defined in item 2 is possible only if the SCNN reconfiguration period is longer than the forecast horizon: . For the obtained results, this means that min.
5. Further experimental work on optimizing the RNN parameters can improve the obtained solution and, if necessary, allows the selection of the optimal parameters , , and , as well as the SCNN reconfiguration time, for a specific object.
The obtained optimal sets of parameters ( and ) have great practical importance. Parameter determines the risk time horizon, while parameter defines the horizon of the response to the incident in terms of threat monitoring, which can be used to refine existing regulatory documents. In paragraph 4 of the procedure for informing the Russian Federal Security Service about computer incidents, responding to them, and taking measures to eliminate the consequences of computer attacks on significant objects of the critical information infrastructure of the Russian Federation (Russian Federal Security Service order № 282, dated 19 June 2019), the response horizon for a computer incident related to the operation of a significant critical information infrastructure object is set at no later than 3 h from the moment of detection; for other objects, it is no later than 24 h from the moment of detection. In this context, the obtained parameter min (Figure 3), with a mean forecast error of 4.41%, is a good result.
If , which is possible at stage 5 of the network training, the RNN forecast error is a random variable during the reconfiguration period, and the SCNN error will be determined by a complex function of both the RNN forecast error and the SCNN training error. The latter error is directly related to the time interval of the readjustment of the weights to the new data of the base prediction class, i.e., to the constant updating during the reconfiguration process. Considering Equation (36), the probability of such an error occurring is equal to the probability of an incident occurring during the reconfiguration period. Then, can be written as a condition under which the network reconfiguration may not be performed up to any value :
Otherwise, the network reconfiguration is conducted in the same way as the previous time . In fact, condition (41) means that the SCNN has forecasted the occurrence of a future incident and that its training error is insignificant.
Condition (41) can also be used to extend the forecast interval. However, in this case, an external teacher is required to determine the confidence values of the intervals .
The SCNN error can be estimated from the RNN forecast error if we assume that the error is approximately equal to the mean forecast error (item 2) and that the error , considering (21), is approximately equal to:
where the value , by analogy, can be defined from Figure 3. For the value , %. If we assume that the number of independent base variables is = 3, then, by Equation (42), we derive .