A Gnn-Enhanced Ant Colony Optimization for Security Strategy Orchestration

Miao, Weiwei; Zhao, Xinjian; Wang, Ce; Chen, Shi; Gao, Peng; Li, Qianmu

doi:10.3390/sym16091183

by Weiwei Miao ¹, Xinjian Zhao ¹, Ce Wang ²,*, Shi Chen ¹, Peng Gao ² and Qianmu Li ²

¹ State Grid Jiangsu Electric Power Co., Ltd., Information & Telecommunication Branch, Nanjing 210024, China

² School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

* Author to whom correspondence should be addressed.

Symmetry 2024, 16(9), 1183; https://doi.org/10.3390/sym16091183

Submission received: 13 July 2024 / Revised: 2 September 2024 / Accepted: 2 September 2024 / Published: 10 September 2024

(This article belongs to the Section Computer)

Abstract:

The expansion of Internet of Things (IoT) technology and the rapid increase in data in smart grid business scenarios have led to a need for more dynamic and adaptive security strategies. Traditional static security measures struggle to meet the evolving low-voltage security requirements of state grid systems in this new IoT-driven environment. By incorporating symmetry in metaheuristic algorithms, we can further improve performance and robustness. Symmetrical properties have the potential to lead to more efficient and balanced solutions, improving the overall stability of the grid. We propose a GNN-enhanced ant colony optimization method for orchestrating grid security strategies, which trains across combinatorial optimization problems (COPs) representative of state grid business scenarios to learn specific mappings from instances to their heuristic measures. The learned heuristic metrics are embedded into the ant colony optimization (ACO) to generate the optimal security policy adapted to the current security situation. Compared to the ACO and adaptive elite ACO, our method reduces the average time consumption of finding a path within a limited time in the capacitated vehicle routing problem by 67.09% and 66.98%, respectively. Additionally, ablation experiments verify the effectiveness and necessity of the individual functional modules.

Keywords:

ant colony optimization; graph neural network; internet of things; security strategy orchestration

1. Introduction

Modern state grid business scenarios involve a large number of heterogeneous devices and nodes, which are widely distributed and interconnected, forming a complex Internet of Things (IoT) system [1,2,3,4,5]. The heterogeneous information is diverse, including sensor node data, video, audio, control commands, and other types of information, and its transmission is vulnerable to various security threats, such as tampering, spoofing, and replay attacks. Due to the interconnected nature of these systems, an abnormality in a single node or device can compromise the stability of the entire grid.

Symmetry also has applications in the optimization and operation of smart grids. Symmetry can refer to the balanced distribution of loads, the consistency of grid component configurations, and consistency of data patterns. Utilizing symmetry in heuristic algorithms can improve their performance. A symmetric load distribution can reduce transmission losses and improve the efficiency of power delivery. Additionally, symmetry in the data patterns can facilitate more accurate anomaly detection and predictive maintenance to ensure the stable and secure operation of the grid. Blockchain technology can further enhance these benefits by providing a decentralized and tamper-resistant framework for secure data sharing and authentication across the grid, ensuring the integrity of symmetrical data patterns [2,6]. For example, integrating blockchain with secure consensus mechanisms can protect against data tampering and unauthorized access [7,8,9], reinforcing the grid’s overall security architecture. Furthermore, lightweight and efficient blockchain-based messaging schemes can enhance real-time communication between grid components, supporting the balanced distribution of loads and improving system resilience [10,11].

As smart grid systems become more complex, there is an increasing need for advanced optimization techniques to enhance adaptability and efficiency. Heuristic algorithms, such as ant colony optimization, have been introduced to address these challenges in various aspects of smart grid operations. In electricity generation, transmission, and distribution, heuristic algorithms are applied to system control and monitoring to improve reliability and performance [12]. Additionally, in the field of electricity consumption, heuristic algorithms optimize resource allocation and operational strategies, supporting integrated energy services and electricity market transactions. By leveraging these algorithms, smart grids can achieve more dynamic and adaptive control, meeting the evolving demands of modern electricity systems.

The complexity of the new state grid business scenarios lies in its multipoint collaboration and linkage characteristics, involving human–machine collaboration, modularized application scenarios, and intelligent decision-making for linkage strategies [13].

The existing grid security measures are mainly aimed at the high-voltage level, and the protection of low-voltage control-related business is relatively weak [14,16]. With the emergence of intelligent attacks, advanced persistent threats, and other new threats, the security situation of low-voltage control-related business is increasingly severe [15]. Moreover, traditional security policy orchestration is usually based on static rules, which makes it difficult to adapt to the dynamic and complex security threat environment, and it lacks intelligent decision-making mechanisms [17].

Based on the above challenges, a GNN-enhanced ant colony optimization method for security policy orchestration is proposed, aiming to improve the overall security and system performance of a novel power business system. The decision-making method embeds a graph neural network (GNN) learner [18] to learn the complex relationship between power information devices and threat information, so as to better capture the potential association patterns of security policies in the power system, and combines it with an ant colony algorithm to achieve the intelligent path selection and resource allocation of security policies. In addition, an adaptive security policy orchestration framework for edge computing [19] is designed to achieve the dynamic adjustment of policies and improve the system’s ability to cope with unknown threats. The research results in this paper are expected to provide a new theoretical foundation and technical support for the security protection of low-voltage control-related services, promote the safe, stable, and reliable operation of the power system, and, at the same time, be extensible to other similar network security issues.

The main contributions of this paper are as follows.

(1): An intelligent path selection and resource allocation method for security policy based on the graph neural network and ant colony algorithm is proposed, which can achieve efficient and reliable security policy scheduling in a complex low-voltage control-related business environment.
(2): An adaptive security policy optimization framework is designed to improve the system’s ability to cope with unknown threats by achieving the dynamic adjustment of security policies in different situations.
(3): The effectiveness of the proposed method is verified through extensive experiments, which provides new ideas and references for the security protection of low-voltage control-related services.

The remainder of this paper is organized as follows: In Section 2, we review the related work, focusing on ant colony optimization (ACO), graph neural networks (GNNs), and security orchestration, automation, and response (SOAR). Section 3 presents our proposed methodology, detailing the integration of the GNN with ACO to form the GACO algorithm. Section 4 discusses the experimental setup, performance evaluation, and statistical significance testing of the results. We reach a conclusion in Section 5.

2. Related Work

In this section, we first review the advancements in security orchestration, automation, and response (SOAR). Then, we briefly introduce the key concepts and applications of graph neural networks (GNNs). Finally, we discuss the ant colony optimization (ACO) algorithm and its relevance to our proposed methodology.

2.1. Security Orchestration, Automation, and Response (SOAR)

To address the prominent issues faced in network security operations and maintenance, Gartner first proposed the concept of “SOAR” in 2015, which was defined as security operations analytics and reporting [20]. With the rapid development and evolution of security operations technology, Gartner redefined the “SOAR” concept in 2017 as security orchestration [21], automation, and response. This technology aims to help enterprises and organizations collect various monitored network information, conduct a comprehensive analysis and classification of security events, and utilize standardized workflows to integrate products and security services from different security vendors through a combination of human and machine efforts [22]. This assists security operations personnel in defining, prioritizing, and driving standardized incident response activities, thereby enhancing the capability and efficiency of network security incident operations and maintenance.

With the continuous development of the IoT industry, the number of dedicated SOAR products is increasing, but several issues and contradictions have also emerged. Currently, SOAR faces the following prominent problems:

(a): Lack of standardization: There are no unified standards for IoT information security detection, evaluation, and access [23,24]. Different analysis and detection engines assign varying priority levels to the same risk event. The lack of uniform management ultimately results in inconsistent product security standards and frequent vulnerabilities.
(b): Single validation method: The validation methods used to verify the effectiveness of threat mitigation are limited and cannot ensure the accuracy and objectivity of the results. Consequently, the effectiveness of threat mitigation measures cannot be guaranteed from the root cause [25]. The validation process lacks a dual verification mechanism, relying instead on a single entity.
(c): Insufficient intelligent orchestration: Existing playbooks for orchestration are primarily manually generated. With the increasing number of IoT devices and the continuous iteration of solutions [26], the manual workload becomes increasingly heavy.

In real projects, a prerequisite for automation in SOAR is the integration of security devices into the SOAR platform so that their functions can be invoked from it, for example, the packet filtering function of a traditional firewall or the state monitoring function of a next-generation firewall (NGFW), which obtains the session state. At present, SOAR vendors such as Phantom [27] and Demisto already offer commercial products at an initial scale, and these have shown good results in use cases for security incident operations and emergency response.

2.2. Graph Neural Network

Graph neural networks are a class of deep learning models specialized in processing graph-structured data. In recent years, graph neural networks have achieved remarkable success in many fields, such as social network analysis, recommender systems, traffic prediction, etc. [28]. The core idea of graph neural networks is to learn the representation of nodes by iteratively aggregating their neighborhood information and optimize the whole network through end-to-end training [29].

In the field of power grid security, graph neural networks have shown broad application prospects. The power grid is essentially a complex network system, and its topology and node attributes contain rich information. Traditional methods struggle to exploit this graph structure information effectively, and graph neural networks provide new ideas for solving this problem [30].

Graph neural networks can be used to model grid topology and node characteristics and learn low-dimensional node embedding representations [31]. By capturing the interactions and dependencies between nodes, graph neural networks are able to generate information-rich node embeddings that provide a strong support for subsequent tasks. Haghshenas et al. [32] proposed a graph convolutional network-based approach to achieve efficient grid fault localization and anomaly detection by learning the embedding representations of grid nodes. Graph neural networks have also shown advantages in attack detection and risk assessment. By modeling the topology of the grid and node interaction patterns, graph neural networks can automatically learn anomaly patterns and attack features. Hansen et al. [33] proposed an intrusion detection method based on graph attention networks, which achieves a high accuracy attack detection and classification by learning the attention weights between nodes. In addition, graph neural networks have been used for vulnerability analysis and the risk assessment of power grids. By modeling the critical nodes and connection patterns of the grid, graph neural networks can predict potential fault propagation paths and cascading failure risks. In 2023, Ma et al. [34] proposed a graph neural network-based grid risk assessment framework to achieve the identification and risk quantification of grid vulnerabilities by learning the importance and connection strength of nodes.

2.3. Ant Colony Optimization Algorithm

Ant colony optimization (ACO) is a swarm intelligence algorithm in which a group of simple individuals collaborate to exhibit intelligent behavior and solve a complex problem. ACO was first proposed by Italian scholars, including A. Colorni and M. Dorigo, in 1991. Artificial ants in ACO implement a stochastic solution construction process that uses (artificial) pheromone information, which is modified based on the ants’ search experience, together with heuristic information to generate a solution to the problem [35]. One of the first such algorithms was the ant colony algorithm itself, which was introduced using the well-known traveling salesman problem (TSP) as its application example. Despite the promising results achieved by the ACO algorithm, it is still not sufficient to compete with the state-of-the-art TSP algorithms [29]. The most common technique for fine-tuning ant-built solutions is to combine ACO with local improvement heuristics. Pop et al. [36] proposed a method that uses other complete solutions to take out partial solutions. This method speeds up the process of solution construction and allows the direct utilization of good parts of the solution. The results presented by the researchers using this approach show that it is very effective in the absence of other useful local search methods.

ACO has its own unique advantages in solving routing problems. Pal et al. [37] proposed an ant colony-based energy-efficient routing scheme (A-ESR) for energy-efficient networks. The authors used the idea of traffic concentration to simplify the NP-hard energy consumption minimization problem by concentrating the data flow on certain heavily loaded links while shutting down other lightly loaded links. Sumathi et al. [38] proposed a new adaptive routing algorithm based on the ant colony algorithm, which improves the transfer rules of the ants during the search process, making the algorithm more flexible and adaptable to different environments.

3. Methods

This section describes the proposed methodology for security policy orchestration in power systems, leveraging the combined strengths of graph neural networks (GNNs) and the ant colony optimization (ACO) algorithm. The approach is designed to address the limitations of traditional static security policies by enabling dynamic and adaptive policy adjustments based on real-time system conditions.

3.1. Construction of Directed Graph of Alarm Node Relationship

We constructed the security policy as a directed graph, as shown in Figure 1. The alarm information node $V_{1}$ stores information including alarm classification, alarm level, and alarm content, and the device nodes $D_{1}, D_{2}, D_{3}$ store the corresponding device information. $b_{11}$ denotes the disposal measures, and $t_{11}$ denotes the disposal order. In the feature generation stage, one-hot encoding is used to generate feature representations of the alarm classification and alarm level, and the trained Word2Vec model is used to extract the word embedding representation of the alarm content, converting discrete text information into a continuous word embedding representation. The generated feature representations are concatenated into a node feature vector as the input of the graph neural network.

Let $G = (V, E)$ be the input graph, where $V$ is the node set, $V_{1}$ denotes the alarm information node, and $D_{1}, D_{2}, D_{3}$ denote the device nodes. $E$ is the set of edges representing relationships between nodes. For the alarm information node $V_{1}$, we used several feature representation methods: on the one hand, we used one-hot encoding to represent the alarm classification $f_{0}^{c}$ and the alarm level $f_{0}^{l}$; on the other hand, we used the pre-trained Word2Vec model to extract the word embedding representation of the alarm content $f_{0}^{w}$. For the device nodes $D_{1}, D_{2}, D_{3}$, we used the feature vector of device information $f_{i}^{d}$. The features obtained for each node $v_{i}$ are concatenated into the node feature vector $f_{i}$, forming a complete node feature representation. The graph neural network model is represented as $f_{θ}(G)$, where $θ$ denotes the model parameters, and it learns the representation of the whole graph $G$.
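The feature construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the vocabularies `ALARM_CLASSES` and `ALARM_LEVELS`, the three-dimensional `TOY_EMBEDDINGS` table (standing in for a trained Word2Vec model), and the choice of averaging word vectors to pool the alarm content are all assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Hypothetical vocabularies; the paper does not list its alarm classes or levels.
ALARM_CLASSES = ["intrusion", "malware", "misconfiguration"]
ALARM_LEVELS = ["low", "medium", "high"]

# Toy stand-in for a trained Word2Vec lookup table (3-dimensional vectors).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "port": [0.2, 0.0, 0.4],
    "scan": [0.6, 0.2, 0.0],
}


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Return a one-hot vector for `value` over a fixed vocabulary."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == item else 0.0 for item in vocabulary]


def mean_embedding(tokens: Sequence[str],
                   table: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Average the vectors of known tokens (assumed pooling); zeros if none are known."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


def alarm_node_features(alarm_class: str,
                        alarm_level: str,
                        content_tokens: Sequence[str]) -> List[float]:
    """Concatenate f^c (class), f^l (level), and f^w (content) into one vector."""
    f_c = one_hot(alarm_class, ALARM_CLASSES)
    f_l = one_hot(alarm_level, ALARM_LEVELS)
    f_w = mean_embedding(content_tokens, TOY_EMBEDDINGS, dim=3)
    return f_c + f_l + f_w


features = alarm_node_features("intrusion", "high", ["port", "scan", "detected"])
```

The resulting vector has one block per feature type, so its length is the sum of the two vocabulary sizes and the embedding dimension (nine in this toy setting).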

The weight of edge $(v_{0}, v_{i})$ is predicted as $\hat{w}_{i} = g_{1}(f(v_{i}))$, and the weight of edge $(v_{i}, v_{j})$ is predicted as $\hat{w}_{ij} = g_{2}(f(v_{i}), f(v_{j}))$, where $f(v_{i})$ and $f(v_{j})$ denote the feature representations of the nodes $v_{i}$ and $v_{j}$, respectively, and $g_{1}$ and $g_{2}$ are the functions used for predicting the weights of the edges.

3.2. Heuristic Learner Based on Graph Neural Network Enhancement

The overall architecture of the GNN-enhanced ant colony optimization (GACO) is shown in Figure 2. The main components include a heuristic learner based on graph neural network and an adaptive elite strategy.

Among them, the heuristic learner embeds the heuristic information into the pheromone model of the ant colony algorithm by extracting features and rules through graph neural network. The adaptive strategy dynamically adjusts the parameters and strategies of the algorithm according to the state and performance of the algorithm to improve the adaptability of the algorithm. The elite strategy accelerates the convergence of the algorithm by evaluating the quality of the candidate solutions, selecting the top-performing solutions as the elite solutions, and feeding their pheromone information to the pheromone model.

3.2.1. Ant Colony Coding Initialization

There are $N$ power business device nodes and $M$ search paths. Each ant generates a route, denoted as $r(ν_{1}, ν_{n})$, when it reaches the end point. The matrix of the ant colony is expressed as $P = \{P_{1}, P_{2}, \dots, P_{M}\}$. The path of the $m$th ant is denoted as $P_{m} = \{p_{m,1}, p_{m,2}, \dots, p_{m,N}\}$, where $p_{m,n}$ denotes the device node selected by the $m$th ant at position $n$ of its path, and $N$ denotes the node size of the network. When $p_{m,n} = 0$, it means that the node is not selected by the ant.

The coding aims to bridge the mathematical models with real-world challenges. The initial ant colony iterations neglected the influence of information and relied on a stochastic generator for population encoding.
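A minimal sketch of this encoding is given below. It assumes, for illustration only, that device nodes are numbered from 1 so that 0 can mark an unselected slot, and that each ant's initial path length is drawn at random; the helper name `init_colony` is hypothetical.

```python
import random
from typing import List


def init_colony(num_ants: int, num_nodes: int, rng: random.Random) -> List[List[int]]:
    """Build the colony matrix P: one row per ant, N slots per row.

    A slot holds a node id in 1..N, or 0 when no node is selected there.
    """
    colony = []
    for _ in range(num_ants):
        path_len = rng.randint(2, num_nodes)                      # assumed random length
        visited = rng.sample(range(1, num_nodes + 1), path_len)   # distinct node ids
        colony.append(visited + [0] * (num_nodes - path_len))     # pad with zeros
    return colony


P = init_colony(num_ants=4, num_nodes=6, rng=random.Random(0))
```

Padding every row to length $N$ keeps $P$ rectangular, which matches the fixed-size row $P_{m}$ used in the text.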

3.2.2. Heuristic Learner Based on Graph Neural Network

For the pheromone model $P_{H}$, we used the pre-trained GNN model from [33] as the GNN backbone. This model operates on an anisotropic message passing framework and incorporates edge gating mechanisms. To effectively capture high-order information in graph-structured data, we adopted an architecture similar to ResGCN [39]. By employing residual connections and output aggregation strategies, the standard graph convolutional network (GCN) is extended to deeper layers. These residual connections enable the model to retain essential features from each layer, thereby mitigating the vanishing gradient problem and allowing for a deeper architecture. Additionally, we simplified the aggregation process by removing transformation matrices and activation functions from each layer, which reduces the computational complexity and focuses on the core information propagation mechanism.

The GNN has 10 layers. In layer $l$, the node features of $v_{i}$ are denoted as $h_{v}^{(l)}$, the edge features of edge $\langle i, j \rangle$ are denoted as $h_{e}^{(l)}$, and the feature propagation is defined as:

$$h_{v}^{(l+1)} = \sum_{u \in N_{v}} \tilde{A}_{ve} h_{e}^{(l)} + h_{v}^{(l)}, \quad h_{v}^{(0)} = e_{v}, \tag{1}$$

$$h_{e}^{(l+1)} = \sum_{u \in N_{v}} \tilde{A}_{ve} h_{v}^{(l)} + h_{e}^{(l)}, \quad h_{e}^{(0)} = e_{ij}, \tag{2}$$

where $\tilde{A}_{ve} = \frac{1}{|N_{v}|}$ and $N_{v}$ denotes the set of neighbor nodes of node $v$.
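One way to read Equations (1) and (2) is as a residual mean aggregation without weight matrices or activations. The sketch below follows that reading under explicit assumptions: node and edge features share one dimension, a node averages the features of its incident edges, and an edge averages the features of its two endpoints. The function `propagate` and the toy triangle graph are illustrative and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Edge = Tuple[int, int]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def scale(a: Vector, s: float) -> Vector:
    return [x * s for x in a]


def propagate(node_feats: Dict[int, Vector],
              edge_feats: Dict[Edge, Vector],
              num_layers: int) -> Tuple[Dict[int, Vector], Dict[Edge, Vector]]:
    """Apply `num_layers` rounds of residual mean aggregation.

    Both updates read the layer-l features and write layer l+1,
    so nodes and edges are updated simultaneously.
    """
    h_v = {v: list(f) for v, f in node_feats.items()}
    h_e = {e: list(f) for e, f in edge_feats.items()}
    dim = len(next(iter(h_v.values())))
    incident = {v: [e for e in h_e if v in e] for v in h_v}

    for _ in range(num_layers):
        new_v = {}
        for v, edges in incident.items():
            agg = [0.0] * dim
            for e in edges:
                agg = add(agg, h_e[e])
            if edges:
                agg = scale(agg, 1.0 / len(edges))   # normalization 1 / |N_v|
            new_v[v] = add(agg, h_v[v])              # residual connection
        new_e = {}
        for (i, j), feat in h_e.items():
            agg = scale(add(h_v[i], h_v[j]), 0.5)    # assumed: mean of the two endpoints
            new_e[(i, j)] = add(agg, feat)           # residual connection
        h_v, h_e = new_v, new_e
    return h_v, h_e


# Toy triangle graph with 1-dimensional features.
nodes = {0: [1.0], 1: [1.0], 2: [1.0]}
edges = {(0, 1): [0.0], (1, 2): [0.0], (0, 2): [0.0]}
h_v1, h_e1 = propagate(nodes, edges, num_layers=1)
h_v2, h_e2 = propagate(nodes, edges, num_layers=2)
```

Because of the residual terms, feature magnitudes grow with depth in this toy example; a practical model would typically normalize or read out only the final edge features.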

The extracted edge features from the final layer of the GNN are input to a 3-layer multilayer perceptron (MLP). The MLP maps the aggregated edge features to real-valued heuristic representations, which enables the network to better understand and utilize the information of the input graph by learning appropriate parameters and representations.

The heuristic learner consists of a GNN with trainable parameters $θ$ and an MLP that maps the extracted edge features to heuristic parameters to parameterize the heuristic space. The input instance $I$ is mapped to its heuristic measure $η_{θ}(I)$, which is abbreviated as $η_{θ}$. The probability of constructing a solution $s$ is factorized as follows:

$$P_{η_{θ}}(s \mid I) = \prod_{t=1}^{n} P_{η_{θ}}(s_{t} \mid s_{<t}, I), \tag{3}$$

where $s_{t}$ represents the decision at step $t$ and $s_{<t}$ denotes the sequence of decisions made before step $t$. This probabilistic formulation leverages the heuristic representations $η_{θ}$ learned by the GNN and MLP to guide the decision-making process in generating the solution $s$.
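The factorization in Equation (3) can be illustrated by sampling a solution one decision at a time and accumulating the log-probability of each step. The sketch below assumes a TSP-like setting in which the heuristic measure is a square matrix `eta` and each step chooses among unvisited nodes with probability proportional to the heuristic value; pheromone is left out to isolate the heuristic term, and the function name `sample_tour` is hypothetical.

```python
import math
import random
from typing import List, Tuple


def sample_tour(eta: List[List[float]],
                rng: random.Random,
                start: int = 0) -> Tuple[List[int], float]:
    """Sample a tour step by step and return it with its log-probability.

    log P(s | I) is the sum over steps of log P(s_t | s_<t, I).
    """
    n = len(eta)
    tour = [start]
    log_prob = 0.0
    while len(tour) < n:
        current = tour[-1]
        candidates = [j for j in range(n) if j not in tour]
        weights = [eta[current][j] for j in candidates]
        total = sum(weights)
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        log_prob += math.log(eta[current][nxt] / total)   # one factor of the product
        tour.append(nxt)
    return tour, log_prob


# Uniform heuristic over 4 nodes: every tour from node 0 has probability 1/3 * 1/2 * 1.
uniform_eta = [[1.0] * 4 for _ in range(4)]
tour, log_prob = sample_tour(uniform_eta, random.Random(0))
```

With a uniform heuristic, each of the six tours starting at node 0 is equally likely, so the returned log-probability equals $-\ln 6$ regardless of which tour is drawn.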

3.2.3. Training the Heuristic Learner

We first used the GNN to extract the edge features of the input graph, which include feature vectors of nodes and edges, as well as the relationships between them. Then, we input these features into the MLP, which maps them into a real-valued heuristic representation space to parameterize the heuristic space. In this way, we obtained the heuristic measure of the input graph $η_{θ}(I)$, which contains the predicted value of each solution component $c_{ij}$, as well as the deviation of the heuristic measure. We trained the heuristic learner across COP instances to minimize the loss functions:

$$L_{1} = \frac{1}{N} \sum_{i=1}^{N} \begin{cases} \frac{1}{2} (\hat{w}_{i} - η_{i;θ})^{2}, & \text{if } |\hat{w}_{i} - η_{i;θ}| \leq δ, \\ δ \left( |\hat{w}_{i} - η_{i;θ}| - \frac{1}{2} δ \right), & \text{otherwise}, \end{cases} \tag{4}$$

$$L_{2} = \frac{1}{N} \sum_{i=1}^{N} (\hat{w}_{ij} - η_{i;θ})^{2}, \tag{5}$$

where $η_{i;θ}$ is the true weight of the edge $(v_{i}, v_{j})$, $\hat{w}_{ij}$ is the weight predicted by the model, and $δ$ is a hyperparameter used to adjust the degree of smoothing. By minimizing the loss function, we can train the graph neural network model so that it can effectively learn the weight information of the edges in the graph.
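The two losses have the form of a Huber (smooth) loss and a mean squared error. The following sketch computes both on plain Python lists; it assumes that the threshold test and the penalty use the same residual between a predicted and a reference weight, and the function names and sample values are illustrative only.

```python
from typing import Sequence


def huber_loss(predicted: Sequence[float],
               reference: Sequence[float],
               delta: float = 1.0) -> float:
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for w_hat, eta in zip(predicted, reference):
        residual = abs(w_hat - eta)
        if residual <= delta:
            total += 0.5 * residual ** 2
        else:
            total += delta * (residual - 0.5 * delta)
    return total / len(predicted)


def mse_loss(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean squared error between predicted and reference weights."""
    return sum((w_hat - eta) ** 2 for w_hat, eta in zip(predicted, reference)) / len(predicted)


# Residuals 0.5 (quadratic branch) and 3.0 (linear branch) with delta = 1.
l1 = huber_loss([0.5, 3.0], [0.0, 0.0], delta=1.0)
l2 = mse_loss([1.0, 2.0], [0.0, 0.0])
```

The linear branch limits the influence of large residuals, which is the smoothing role the text attributes to $δ$.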

3.2.4. Adaptive Elite Strategy

The rule guiding ants to discover a path involves selecting a node randomly each time from the chaos, initiating the search from the current node, and terminating upon reaching the end node. Throughout the exploration, in adherence to biologically inspired principles, each ant deposits pheromones along its traversed path. The pseudo-code of the proposed GACO algorithm is presented in Algorithm 1.

In the initial stage of the algorithm, in order to introduce more randomness, we set the pheromone concentration to a constant and used roulette-wheel selection to choose the next hop [38]. The specific formulas are as follows:

$$SP_{i,j}(m) = \frac{τ_{i,j}^{α}(m) \, φ_{i,j}^{β}(m)}{\sum_{n=1}^{N} τ_{i,n}^{α}(m) \, φ_{i,n}^{β}(m)}, \tag{6}$$

$$φ_{i,j} = \frac{1}{w_{e} f^{CE*}(l) + w_{d} f^{D*}(l) - w_{s} f^{S*}(l)}, \tag{7}$$

where $SP_{i,j}(m)$ represents the probability of the $m$th ant moving from node $i$ to node $j$, $τ_{i,j}^{α}(m)$ represents the pheromone concentration on link $l$, and $φ_{i,j}$ represents the visibility of link $l$, which consists of three QoS parameters weighted to emphasize the importance of each attribute. $f^{CE*}$ is the cost function, which evaluates the resource allocation during the execution of the strategy; $f^{D*}$ is the security function, which evaluates the alarm level of the security event; and $f^{S*}$ is the correlation function, which measures the degree of correlation between the strategy and the relevant indicators of the power system.
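Equations (6) and (7) can be sketched as three small functions: one for the visibility of a link, one for the normalized transition probabilities, and one for the roulette-wheel draw. The weights, exponents, and function names below are hypothetical choices for the example, and the sketch assumes that the normalization runs over the candidate next nodes.

```python
import random
from typing import Dict, Sequence


def visibility(cost: float, security: float, correlation: float,
               w_e: float = 1.0, w_d: float = 1.0, w_s: float = 0.5) -> float:
    """Eq. (7): inverse of the weighted cost and security terms minus the correlation term."""
    denom = w_e * cost + w_d * security - w_s * correlation
    if denom <= 0:
        raise ValueError("visibility denominator must be positive")
    return 1.0 / denom


def transition_probabilities(tau_row: Sequence[float],
                             phi_row: Sequence[float],
                             candidates: Sequence[int],
                             alpha: float = 1.0,
                             beta: float = 2.0) -> Dict[int, float]:
    """Eq. (6): probability of moving to each candidate node j from the current node."""
    scores = {j: (tau_row[j] ** alpha) * (phi_row[j] ** beta) for j in candidates}
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}


def roulette_select(probs: Dict[int, float], rng: random.Random) -> int:
    """Draw one node with probability proportional to its entry in `probs`."""
    r = rng.random()
    cumulative = 0.0
    for j, p in probs.items():
        cumulative += p
        if r < cumulative:
            return j
    return j  # guard against floating-point round-off in the last bucket


phi_example = visibility(cost=2.0, security=1.0, correlation=2.0)
probs = transition_probabilities(
    tau_row=[1.0, 1.0, 1.0],        # constant initial pheromone, as in the text
    phi_row=[0.0, 1.0, 2.0],
    candidates=[1, 2],
    alpha=1.0,
    beta=1.0,
)
next_node = roulette_select(probs, random.Random(0))
```

With constant pheromone, the probabilities depend only on visibility, so node 2 (visibility 2.0) is twice as likely to be chosen as node 1 (visibility 1.0).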

Algorithm 1: GACO algorithm

Input: Security alarm information $I$, number of iterations $g_{max}$

Output: Optimal path $r_{best}$

1: $t = 1$ (initialization)
2: Heuristic information $η_{θ}(I) \leftarrow I$
3: Initialize the ant colony parameters, set the ants in the control center, and generate the first generation of ant colonies in chaos
4: Ants generate initial feasible paths according to the rules of Equation (6)
5: Optimal path $r_{best} = r$
6: while $t \leq g_{max}$ do
7:   if $τ > τ_{max}$ or $τ < τ_{min}$ then
8:     Update information $τ$ according to Equation (9)
9:   else
10:     Update information $τ$ according to Equation (8)
11:   end if
12:   $r_{best} = r$
13:   $t = t + 1$
14: end while
15: return $r_{best}$

In order to distinguish between good and bad search schemes, each scheme is evaluated for fitness before the pheromone is updated. The next node is selected based on the pheromone concentration and the heuristic metric generated by the GNN, and, based on the updated pheromone, elite ants are obtained by the elite strategy. The best route is the global optimal solution of the algorithm, and the corresponding ant is called the elite ant; its route is expressed as $r_{best}$ [37]. The pheromone update method can be expressed as:

$$τ_{i,j}(g, g+1) = ρ \cdot τ_{i,j}(g) + Δτ_{i,j}(g, g+1) + Δτ_{i,j}^{*}, \qquad Δτ_{i,j}^{*} = δ \cdot Q, \tag{8}$$

where $Δτ_{i,j}^{*}$ denotes the pheromone deposited along the elite ant's route $r_{best}$, $δ$ denotes the weight of the pheromone of elite ants, and $Q$ is a constant.
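A minimal sketch of the update in Equation (8) follows. It assumes that ordinary ants deposit $Q$ divided by their path cost on each edge they traverse (a common Ant System convention that the text does not spell out) and that the elite bonus $δ \cdot Q$ is added only on the edges of $r_{best}$; the function name and parameter values are hypothetical.

```python
from typing import List, Sequence


def elite_update(tau: List[List[float]],
                 ant_paths: Sequence[Sequence[int]],
                 ant_costs: Sequence[float],
                 best_path: Sequence[int],
                 rho: float = 0.9,
                 delta: float = 2.0,
                 q: float = 1.0) -> List[List[float]]:
    """Return the pheromone matrix after one elite-strategy update.

    rho multiplies the old pheromone, as written in Eq. (8).
    """
    n = len(tau)
    new_tau = [[rho * tau[i][j] for j in range(n)] for i in range(n)]

    # Regular deposit by every ant (assumed rule: Q / path cost per traversed edge).
    for path, cost in zip(ant_paths, ant_costs):
        for i, j in zip(path, path[1:]):
            new_tau[i][j] += q / cost

    # Extra deposit delta * Q on the edges of the elite route r_best.
    for i, j in zip(best_path, best_path[1:]):
        new_tau[i][j] += delta * q
    return new_tau


tau0 = [[1.0] * 3 for _ in range(3)]
tau1 = elite_update(tau0, ant_paths=[[0, 1, 2]], ant_costs=[2.0],
                    best_path=[0, 1, 2], rho=0.5, delta=2.0, q=1.0)
```

Edges are treated as directed here, in line with the directed graph of Section 3.1, so only the traversed direction of each edge is reinforced.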

During the collaborative phase, ants communicate via pheromones with the aim of attaining superior solutions. However, conventional ant colony algorithms, akin to other evolutionary methods, often face premature convergence towards local optima. To overcome this challenge, an adaptive strategy was employed to dynamically adjust the volatility weight of pheromones.

This adaptive approach fosters improved global search exploration while enhancing convergence speed. Initially, as ants traverse paths, pheromones are emitted along their trajectories. With time, these pheromones accumulate, leading subsequent ants to favor paths already explored. Yet, to preserve colony diversity and bolster global search capability, an adaptive operator is introduced to regulate pheromone volatility. The adaptive update method is delineated as follows:

$$τ_{i,j}(g, g+1) = \begin{cases} ρ^{1+μ(θ)} \cdot τ_{i,j}(g) + Δτ_{i,j}(g, g+1), & τ > τ_{max}, \\ ρ^{1-μ(θ)} \cdot τ_{i,j}(g) + Δτ_{i,j}(g, g+1), & τ < τ_{min}, \end{cases} \qquad μ(θ) = θ / ζ, \tag{9}$$

where $ρ$ is the pheromone volatility factor, $τ_{max}$ is the maximum pheromone concentration, and $τ_{min}$ is the minimum pheromone concentration. $ζ$ is a constant, $θ$ denotes the convergence factor, and $μ(θ)$ represents a function proportional to $θ$.
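The effect of the adaptive exponent in Equation (9) is easiest to see on a single pheromone value. The sketch below assumes $0 < ρ < 1$, so that the exponent $1 + μ(θ)$ shrinks an over-saturated trail faster and the exponent $1 - μ(θ)$ shrinks a depleted trail more slowly; the in-range branch uses plain decay and omits the elite bonus of Equation (8) for brevity. All parameter values and the function name are hypothetical.

```python
def adaptive_update(tau_ij: float,
                    delta_tau: float,
                    rho: float,
                    theta: float,
                    zeta: float,
                    tau_min: float,
                    tau_max: float) -> float:
    """Update one pheromone value with a state-dependent decay exponent."""
    mu = theta / zeta                      # mu(theta) = theta / zeta
    if tau_ij > tau_max:
        retention = rho ** (1.0 + mu)      # stronger decay for saturated trails
    elif tau_ij < tau_min:
        retention = rho ** (1.0 - mu)      # weaker decay for depleted trails
    else:
        retention = rho                    # in range: plain decay (elite bonus omitted)
    return retention * tau_ij + delta_tau


too_high = adaptive_update(10.0, 0.0, rho=0.5, theta=1.0, zeta=1.0, tau_min=0.1, tau_max=5.0)
too_low = adaptive_update(0.05, 0.0, rho=0.5, theta=1.0, zeta=1.0, tau_min=0.1, tau_max=5.0)
in_range = adaptive_update(2.0, 0.5, rho=0.5, theta=1.0, zeta=1.0, tau_min=0.1, tau_max=5.0)
```

With $μ = 1$ and $ρ = 0.5$, the saturated trail keeps a quarter of its value while the depleted trail keeps all of it, which pushes both back towards the interval $[τ_{min}, τ_{max}]$.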

3.3. Security Policy Orchestration

In the low-voltage control business of power grid, facing the complex and changing network environment and evolving security threats, traditional static defense measures can hardly provide sufficient protection [40]. In order to meet this challenge, this paper proposes a security policy scheduling method based on graph neural network and ant colony algorithm, which achieves dynamic and adaptive security policy scheduling through intelligent anomaly detection, threat association, and policy optimization. Figure 3 shows the overall architecture of security policy orchestration, and the security policy orchestration process is divided into the following steps.

Step 1: Monitoring and Anomaly Detection

The power network monitoring system monitors network traffic and system behavior in real time. When a suspicious attack behavior is detected, the system will trigger an abnormal alarm and transmit relevant information to the security event management module to generate security events.

Step 2: Intelligent Reasoning

The intelligent reasoning engine includes three functional modules: ant colony algorithm, strategy template library, and script planning. Among them, the ant colony algorithm module achieves the intelligent security policy optimization of security events by comprehensively utilizing the heuristic information of ant colony algorithm and graph neural network; based on the control strategy and orchestration rule base, the script orchestration planning module achieves the orchestration of security script response actions and dynamically adjusts the generated scripts based on event handling responses and performance evaluation results.

Step 3: Security incident response and policy enforcement

The intelligent decision-making and orchestration engine automatically converts the generated scripts into executable security incident response scripts and achieves the automated deployment and execution of policies through the orchestration and configuration of security devices.

4. Experiment

4.1. Experimental Setup

In order to verify the effectiveness of the security strategy for power grid low-voltage control services, this paper adopted two classic combinatorial optimization problems, namely TSP (Traveling Salesman Problem) and CVRP (Capacitated Vehicle Routing Problem) [41], and applied them to simulate security path planning and resource allocation in policy orchestration.

In the redesigned TSP scenario, the security and network equipment in the power business system are treated as the cities that the traveling salesman must visit, and the distance between cities represents the degree of association between devices. Path optimization algorithms such as the ant colony algorithm minimize the access path over the control equipment, thereby improving the access efficiency and security of threat warning information. In the CVRP scenario, the workload differences between devices in the low-voltage control business of the power grid are taken into account: the CVRP formulation balances the load of each vehicle (service vehicle) under limited resources, maximizes resource utilization, and ensures the secure transmission and processing of threat warning information. For the TSP100, the coordinates of each city are denoted by (x_{i}, y_{i}), all cities are placed in the unit square [0,1] × [0,1], and the same data distribution is used for the training and testing phases. CVRP100 instances were generated by uniformly sampling customer locations from the two-dimensional unit square [0,1] × [0,1], as for the TSP. Each customer was assigned a random demand, representing the workload or resource requirement, sampled from a predefined range (e.g., [1,10]). A central depot, from which all vehicles start and to which they return, was also randomly placed within the unit square.
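The instance generation described above can be sketched as follows. This is a minimal illustration and not the authors' generator: the function names, the seeding scheme, and the choice of integer demands are assumptions, and the vehicle capacity is omitted because the text does not state it.

```python
import random


def make_tsp_instance(n=100, seed=0):
    """Sample n city coordinates (x_i, y_i) uniformly from the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def make_cvrp_instance(n=100, demand_range=(1, 10), seed=0):
    """Sample a depot, n customer locations, and one demand per customer.

    Integer demands drawn from demand_range are an assumption; the text only
    gives [1,10] as an example of a predefined range.
    """
    rng = random.Random(seed)
    depot = (rng.random(), rng.random())
    customers = [(rng.random(), rng.random()) for _ in range(n)]
    demands = [rng.randint(*demand_range) for _ in range(n)]
    return depot, customers, demands
```

Using the same functions with different seeds for training and testing keeps both phases on the same data distribution, as the setup requires.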

To evaluate the performance of the method, this paper uses the average consumption, a metric commonly used in path optimization problems: the average cost of the paths found within a given time limit. The lower the average consumption, the better the algorithm performs. These experiments allow the applicability and effectiveness of the proposed security policy orchestration method in the low-voltage control business of the power grid to be evaluated.
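For the TSP case, the metric can be illustrated with a short sketch. The names `tour_length`, `average_cost`, and `solver` are hypothetical; the sketch assumes Euclidean distances and a solver that enforces its own time limit and returns a visiting order for each instance.

```python
import math


def tour_length(coords, tour):
    """Length of the closed tour that visits coords in the order given by tour."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )


def average_cost(instances, solver):
    """Mean tour length over a list of instances.

    solver(coords) is assumed to return a permutation of range(len(coords))
    and to respect its own time budget.
    """
    costs = [tour_length(coords, solver(coords)) for coords in instances]
    return sum(costs) / len(costs)
```

For example, visiting the four corners of the unit square in order gives a tour of length 4, so a solver that returns the identity order on that instance has an average cost of 4.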

The experiments were conducted on an NVIDIA GeForce RTX 3090 GPU with 24 GB of VRAM (NVIDIA Corporation, Santa Clara, CA, USA).

4.2. Experimental Results’ Analysis

GACO was compared with two classical ACO algorithms, the adaptive elitist ant system (AEAS) and the ant colony system (labeled ACO in the tables), and with two neural combinatorial optimization (NCO) algorithms, POMO [42] and GCN-MCTS [43], on two combinatorial optimization problems: the TSP100 and the CVRP100. Table 1 and Table 2 report the average consumption of the paths found within different time limits T by the original heuristic measures and the deep learning-based heuristic measures on the TSP100 and CVRP100 problems, respectively.

On the TSP100 problem (Table 1), GACO performs particularly well at T = 1: its average cost is only 8.882, which is 44.06% and 43.32% lower than the costs of AEAS and ACO, respectively. Figure 4 also shows that GACO achieves significant performance improvements in optimizing paths and resource allocation, reflecting a more effective strategy. By combining the ant colony algorithm with the heuristic information provided by the GNN, GACO can better adapt to changing business scenarios and provide more intelligent solutions for path selection and resource allocation.
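The quoted reductions can be checked against the T = 1 column of Table 1 with a small calculation. The helper name `relative_reduction` is hypothetical. Because it works from the rounded table entries, it yields roughly 44.05% and 43.31%, which agree with the quoted figures to within 0.01 percentage points.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Average costs at T = 1 on TSP100, taken from Table 1.
gaco, aeas, aco = 8.882, 15.875, 15.668

reduction_vs_aeas = relative_reduction(aeas, gaco)
reduction_vs_aco = relative_reduction(aco, gaco)
print(f"vs AEAS: {reduction_vs_aeas:.2f}%  vs ACO: {reduction_vs_aco:.2f}%")
```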

Both the TSP and the CVRP are typical graph-structured problems, and the relationship between their nodes and edges is the key to solving them. A GNN can capture the complex dependencies in the graph structure through the propagation and updating of information between nodes. Introducing a GNN into GACO makes it possible to use this modeling capability to represent and optimize the graph structure of the problem more effectively, so the algorithm performs better on complex graph problems. In addition, GACO can learn richer and more effective features globally, which improves the path selection strategy, strengthens the global search capability of the algorithm, and reduces the likelihood of falling into local optima.
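To make concrete where a heuristic measure enters the search, the sketch below builds one ant's tour with the standard ACO transition rule, in which the probability of moving from node i to node j is proportional to tau[i][j]^alpha * eta[i][j]^beta. It is not the authors' implementation: the function names and the default values of alpha and beta are assumptions. The inverse-distance matrix is the classical hand-crafted heuristic, and in a GNN-enhanced variant a learned matrix of the same shape would be passed as `eta` instead.

```python
import math
import random


def inverse_distance_heuristic(coords, eps=1e-9):
    """Classical heuristic eta[i][j] = 1 / d(i, j), with zeros on the diagonal."""
    return [
        [0.0 if i == j else 1.0 / (math.dist(a, b) + eps)
         for j, b in enumerate(coords)]
        for i, a in enumerate(coords)
    ]


def construct_tour(tau, eta, alpha=1.0, beta=2.0, rng=None):
    """Build one tour, sampling each next node with weight tau^alpha * eta^beta."""
    rng = rng or random.Random()
    n = len(tau)
    current = rng.randrange(n)
    tour = [current]
    unvisited = set(range(n)) - {current}
    while unvisited:
        candidates = sorted(unvisited)
        weights = [
            (tau[current][j] ** alpha) * (eta[current][j] ** beta)
            for j in candidates
        ]
        # Sample the next node in proportion to its weight.
        current = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(current)
        unvisited.remove(current)
    return tour
```

Keeping `eta` as an argument means the same construction loop serves both the original and the learned heuristic, which is the comparison the experiments draw.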

4.3. Hyperparameter Experiment

We conducted a detailed experimental analysis of the model's hyperparameters, as shown in Figure 5 and Figure 6, to assess their impact on performance. Specifically, we set the learning rate (lr) to 0.001, 0.01, and 0.1 and observed its effect on the graph neural network model when solving the TSP100 and CVRP100 problems. The results show that the learning rate has a significant effect on model performance. A learning rate of 0.001 performs better than 0.01 and 0.1, especially at larger T values. However, the curves for learning rates of 0.01 and 0.1 cross: at some T values, 0.01 performs better than 0.1, while the opposite holds at other T values. This indicates that the effect of hyperparameter selection on model performance is complex and requires further in-depth analysis.

4.4. Ablation Experiment

To further verify the effectiveness of the key components of GACO, this paper conducted an ablation experiment. As shown in Figure 7, removing the GNN module and the adaptive module from GACO yields two variants, "without GNN" and "GNN + Elite", respectively.

GACO combines the GNN-based pheromone model with the ant colony algorithm to select paths and allocate resources intelligently. Its strong performance may be attributed to its joint consideration of the interplay between graph structure and pheromone, which enables better adaptation to changing business scenarios during path optimization. The GNN + Elite strategy introduces elite information to steer the search process, with the expectation that individually excellent paths in the ant colony will improve overall performance. The performance of the variant without GNN, however, decreases considerably compared to GACO, indicating the effectiveness of the GNN in the overall architecture.

4.5. Discussion

In the context of real-world grid security, the efficient planning of security paths and allocation of resources is essential to maintain operational stability and mitigate potential threats. The proposed approach was successful in reducing path consumption and balancing resource loads, demonstrating its potential to improve security event response times and optimize the utilization of available resources. The approach offers several improvements over traditional static security policies. By dynamically adjusting policies and optimizing resource allocation based on real-time data, the approach enhances the flexibility and adaptability of security measures. This helps to prevent and mitigate threats more effectively, reduces downtime, and improves the overall security of the grid.

The methodology can be applied to a variety of security operations within the grid, such as real-time threat detection and response, security alarm management, and resource allocation. For example, in the event of a security breach or anomaly, the system can quickly route control signals and data using optimized paths to minimize the impact of the threat and ensure a swift and effective response.

5. Conclusions

This paper proposes a security policy orchestration method based on a graph neural network and the ant colony algorithm, aiming to solve the problem that traditional static security policies are difficult to adapt to the new type of power system. By combining the device-node features learned by the graph neural network with the ant colony algorithm, the method achieves intelligent perception of the power system security situation and precise prevention and control through dynamic adjustment of the security policy. The experimental results show that the proposed method reduces the average consumption of the path found within a limited time compared to the traditional ant colony algorithm and the adaptive elite ant colony algorithm, which verifies its effectiveness. The results provide a new theoretical foundation and technical support for the security protection of low-voltage control-related operations and are expected to promote the safe, stable, and reliable operation of power systems. However, the method also increases computational complexity. In large-scale power systems in particular, the processing time may grow significantly as the number of nodes increases, making it difficult to meet the demand for real-time decision-making. To address this problem, future research will explore more efficient algorithmic optimizations that reduce the computational complexity and make the method better suited to real-time application scenarios.

Author Contributions

Methodology, W.M., C.W. and P.G.; validation, W.M., X.Z. and S.C.; formal analysis, W.M., X.Z. and S.C.; investigation, W.M., X.Z. and P.G.; resources, W.M.; writing—original draft, W.M.; writing—review and editing, W.M., Q.L. and X.Z.; visualization, X.Z.; supervision, Q.L.; project administration, W.M.; funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Jiangsu Electric Power Company Ltd., under Grant J2023124.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest. Authors Weiwei Miao, Xinjian Zhao and Shi Chen were employed by the company State Grid Jiangsu Electric Power Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Pinto, S.J.; Siano, P.; Parente, M. Review of Cybersecurity Analysis in Smart Distribution Systems and Future Directions for Using Unsupervised Learning Methods for Cyber Detection. Energies 2023, 16, 1651.
2. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. Security challenges and concerns of Internet of Things (IoT). Cyber-Phys. Syst. Archit. Secur. Appl. 2019, 1, 153–185.
3. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. Secure IoT structural design for smart homes. In Smart Cities Cybersecurity and Privacy; Elsevier: Amsterdam, The Netherlands, 2019; pp. 187–201.
4. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. Present scenarios of IoT projects with security aspects focused. Digit. Twin Technol. Smart Cities 2020, 2, 95–122.
5. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. CoAP—Application layer connection-less lightweight protocol for the Internet of Things (IoT) and CoAP-IPSEC Security with DTLS Supporting CoAP. Digit. Twin Technol. Smart Cities 2020, 1, 151–175.
6. Bhattacharjya, A.; Zhong, X.; Li, X. A Lightweight and Efficient Secure Hybrid RSA (SHRSA) Messaging Scheme with Four-Layered Authentication Stack. IEEE Access 2019, 7, 30487–30506.
7. Bhattacharjya, A.; Kozdroj, K.; Bazydlo, G.; Wisniewski, R. Trusted and Secure Blockchain-Based Architecture for Internet-of-Medical-Things. Electronics 2022, 11, 2560.
8. Bhattacharjya, A.; Wisniewski, R.; Nidumolu, V. Holistic Research on Blockchain’s Consensus Protocol Mechanisms with Security and Concurrency Analysis Aspects of CPS. Electronics 2022, 11, 2760.
9. Bhattacharjya, A. A Holistic Study on the Use of Blockchain Technology in CPS and IoT Architectures Maintaining the CIA Triad in Data Communication. Int. J. Appl. Math. Comput. Sci. 2022, 32, 403–413.
10. Bachani, V.; Bhattacharjya, A. Preferential Delegated Proof of Stake (PDPoS)-Modified DPoS with Two Layers towards Scalability and Higher TPS. Symmetry 2023, 15, 4.
11. Bazydlo, G.; Kozdroj, K.; Wisniewski, R.; Bhattacharjya, A. Trusted Third Party Application in Durable Medium e-Service. Appl. Sci. 2024, 14, 191.
12. Gu, H.; Shang, J.; Wang, P.; Mi, J.; Bhattacharjya, A. A Secure Protocol Authentication Method Based on the Strand Space Model for Blockchain-Based Industrial Internet of Things. Symmetry 2024, 16, 851.
13. Rahman, S.; Khan, I.A.; Khan, A.A.; Mallik, A.; Nadeem, M.F. Comprehensive review & impact analysis of integrating projected electric vehicle charging load to the existing low voltage distribution system. Renew. Sustain. Energy Rev. 2022, 153, 111756.
14. Alrashidi, M. Community Battery Storage Systems Planning for Voltage Regulation in Low Voltage Distribution Systems. Appl. Sci. 2022, 12, 9083.
15. Koirala, A.; Van Acker, T.; D’Hulst, R.; Van Hertem, D. Hosting capacity of photovoltaic systems in low voltage distribution systems: A benchmark of deterministic and stochastic approaches. Renew. Sustain. Energy Rev. 2022, 155, 111899.
16. Zong, S.; Lyu, Y.; Wang, C. Control strategy of multiple interlinking converters for low-voltage hybrid microgrid based on adaptive droop. Energy Rep. 2023, 9, 721–731.
17. Davi-Arderius, D.; Troncia, M.; Peiro, J.J. Operational Challenges and Economics in Future Voltage Control Services. Curr. Sustain./Renew. Energy Rep. 2023, 3, 130–138.
18. Zhang, J.; Feng, H.; Liu, B.; Zhao, D. Survey of Technology in Network Security Situation Awareness. Sensors 2023, 23, 2608.
19. Das, R.; Soylu, M. A key review on graph data science: The power of graphs in scientific studies. Chemom. Intell. Lab. Syst. 2023, 240, 104896.
20. Bartolomeo, G.; Yosofie, M.; Baeurle, S.; Haluszczynski, O.; Mohan, N.; Ott, J. Oakestra: A Lightweight Hierarchical Orchestration Framework for Edge Computing. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC), Boston, MA, USA, 10–12 July 2023; Volume 2023, pp. 215–231.
21. Ghiasi, M.; Niknam, T.; Wang, Z.; Mehrandezh, M.; Dehghani, M.; Ghadimi, N. A comprehensive review of cyber-attacks and defense mechanisms for improving security in smart grid energy systems: Past, present and future. Electr. Power Syst. Res. 2023, 215, 108975.
22. Hasan, M.K.; Habib, A.K.M.A.; Shukur, Z.; Ibrahim, F.; Islam, S.; Razzaque, M.A. Review on cyber-physical and cyber-security system in smart grid: Standards, protocols, constraints, and recommendations. J. Netw. Comput. Appl. 2023, 209, 103540.
23. An, D.; Zhang, F.; Yang, Q.; Zhang, C. Data Integrity Attack in Dynamic State Estimation of Smart Grid: Attack Model and Countermeasures. IEEE Trans. Autom. Sci. Eng. 2022, 19, 1631–1644.
24. Bitirgen, K.; Filik, U.B. A hybrid deep learning model for discrimination of physical disturbance and cyber-attack detection in smart grid. Int. J. Crit. Infrastruct. Prot. 2023, 40, 100582.
25. Ding, J.; Qammar, A.; Zhang, Z.; Karim, A.; Ning, H. Cyber Threats to Smart Grids: Review, Taxonomy, Potential Solutions, and Future Directions. Energies 2022, 15, 6799.
26. Syrmakesis, A.D.; Alcaraz, C.; Hatziargyriou, N.D. Classifying resilience approaches for protecting smart grids against cyber threats. Int. J. Inf. Secur. 2022, 21, 1189–1210.
27. Inayat, U.; Zia, M.F.; Mahmood, S.; Berghout, T.; Benbouzid, M. Cybersecurity Enhancement of Smart Grid: Attacks, Methods, and Prospects. Electronics 2022, 11, 3854.
28. Jiang, W.; Luo, J. Graph Neural Network for Traffic Forecasting: A Survey. arXiv 2022, arXiv:2101.11174.
29. Wu, L.; Cui, P.; Pei, J.; Zhao, L.; Guo, X. Graph Neural Networks: Foundation, Frontiers and Applications. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Long Beach, CA, USA, 6–10 August 2023; Volume 2023, pp. 5831–5832.
30. Velickovic, P. Everything is connected: Graph neural networks. Curr. Opin. Struct. Biol. 2023, 79, 102538.
31. Ullah, I.; Manzo, M.; Shah, M.; Madden, M.G. Graph convolutional networks: Analysis, improvements and results. Appl. Intell. 2022, 52, 9033–9044.
32. Haghshenas, S.H.; Hasnat, M.A.; Naeini, M. A Temporal Graph Neural Network for Cyber Attack Detection and Localization in Smart Grids. In Proceedings of the 2023 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 16–20 July 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–5.
33. Hansen, J.B.; Anfinsen, S.N.; Bianchi, F.M. Power Flow Balancing With Decentralized Graph Neural Networks. IEEE Trans. Power Syst. 2023, 38, 2423–2433.
34. Ma, T.-X.; Duan, X.; Xu, Y.; Wang, R.-L.; Li, X.-Y. Research on fault location in DC distribution network based on adaptive artificial bee colony slime mould algorithm. IEEE Access 2023, 11, 62630–62638.
35. Comert, S.E.; Yazgan, H.R. A new approach based on hybrid ant colony optimization-artificial bee colony algorithm for multi-objective electric vehicle routing problems. Eng. Appl. Artif. Intell. 2023, 123, 106375.
36. Pop, P.C.; Cosma, O.; Sabo, C.; Sitar, C.P. A comprehensive survey on the generalized traveling salesman problem. Eur. J. Oper. Res. 2024, 314, 819–835.
37. Pal, K.; Sachan, S.; Gholian-Jouybari, F.; Hajiaghaei-Keshteli, M. An analysis of the security of multi-area power transmission lines using fuzzy-ACO. Expert Syst. Appl. 2023, 224, 120070.
38. Sumathi, M.; Vijayaraj, N.; Raja, S.P.; Rajkamal, M. HHO-ACO hybridized load balancing technique in cloud computing. Int. J. Inf. Technol. 2023, 15, 1357–1365.
39. Li, C.; Liu, Y.; Xiao, J.; Zhou, J. MCEAACO-QSRP: A Novel QoS-Secure Routing Protocol for Industrial Internet of Things. IEEE Internet Things J. 2022, 9, 18760–18777.
40. Chuang, Y.-T.; Hung, Y.-T. A real-time and ACO-based offloading algorithm in edge computing. J. Parallel Distrib. Comput. 2023, 179, 104703.
41. Chen, B.; Wu, Q.H.; Li, M.; Xiahou, K. Detection of false data injection attacks on power systems using graph edge-conditioned convolutional networks. Prot. Control Mod. Power Syst. 2023, 8, 16.
42. Kwon, Y.-D.; Choo, J.; Kim, B.; Yoon, I.; Gwon, Y.; Min, S. POMO: Policy optimization with multiple optima for reinforcement learning. Adv. Neural Inf. Process. Syst. 2020, 33, 21188–21198.
43. Fu, Z.-H.; Qiu, K.-B.; Zha, H. Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 7474–7482.

Figure 1. Alarm node relationship-directed graph construction.

Figure 2. Overall architecture of GACO.

Figure 3. Security policy orchestration architecture.

Figure 4. Performance comparison on the CVRP problem. Two traditional ACO algorithms and two NCO algorithms are used for comparison.

Figure 5. Different learning rate settings on the TSP100 problem.

Figure 6. Different learning rate settings on the CVRP100 problem.

Figure 7. Ablation experiment of GACO. We evaluated the role of different components in GACO.

Table 1. Average consumption of GACO and the benchmark methods on TSP100 for different time limits T. * indicates the best performance.

Method	T = 1	T = 10	T = 20	T = 50	T = 100
GACO	8.882 *	8.543 *	8.421 *	8.227 *	8.113 *
AEAS	15.875	15.023	13.984	11.276	9.851
ACO	15.668	12.058	10.859	9.887	9.461
POMO	12.463	10.745	9.432	9.137	8.512
GCN-MCTS	13.928	11.684	10.473	10.017	9.089

Table 2. Average consumption of GACO and the benchmark methods on CVRP100 for different time limits T. * indicates the best performance.

Method	T = 1	T = 10	T = 20	T = 50	T = 100
GACO	17.184 *	15.586 *	15.257 *	14.761 *	7.853 *
AEAS	38.412	34.557	32.269	30.004	23.785
ACO	38.776	28.764	26.267	25.142	23.864
POMO	25.472	23.593	22.183	21.029	16.347
GCN-MCTS	23.8553	21.481	20.628	20.184	14.371

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
