**1. Introduction**

In smart grids, advanced communication and information technologies are employed to enhance the intelligence and automation of power systems. Meanwhile, cyber threats are introduced into the physical systems and can trigger the self-organized criticality of the power system, leading to cascading failure propagation between networks and even blackouts [1–3]. As the scale of the smart grid expands, how to optimize the power system structure and effectively alleviate cascading failures has aroused wide concern.

Modern power systems are dynamical systems characterized by complexity and nonlinearity. To reduce model complexity, complex network theory and graph theory are introduced to describe the network dynamics [4]. Besides, the characteristics of complex networks can be used to analyze their impact on cascading propagation [5]. The larger the cluster coefficient (CC) of the network, the wider the cascading failure propagation. Moreover, the smaller the average path length (APL) of the network, the deeper the cascading failure propagation [6]. Statistics indicate that the power system is a typical sparse network owing to geographical constraints and limited investment budgets [7]. As the power system expands, regional and long-distance power transmission lines are constructed to balance the regional generation capacity. With the increase in transmission lines, the APL increases slowly, while the regional CC is relatively large. Therefore, cascading failures can easily propagate across large regions of the power system.

Previous studies have put forward the load-capacity model to analyze cascading failure propagation. A cascading failure model of the power system based on complex network theory has been combined with the characteristics of power flows [8]. System capacity and network connectivity affect the propagation of cascading failures [9]. An electrical path efficiency matrix assists the assessment of power system influences and losses [10]. Based on percolation theory [11], the remaining giant component indicates the robustness of the network. However, the evaluation indexes of existing studies assess only the performance of the connected component and cannot be applied to isolated islands. Since the power system can maintain islanding operation after attacks, the robustness index of the power system should account for all surviving islands.

Additionally, relevant research has focused on the mechanism of cascading failures. In the power system, cascading failures can be triggered by physical equipment malfunction or misoperation owing to weather or human factors, and by intentional cyber-attacks. Power node or link failures caused by hidden system failures, as well as large-area blackouts caused by natural disasters, act as random attacks (RA) on the power system. Adversaries can also attack specific targets. For example, high degree node attacks (HDNA) disconnect highly connected substations to destroy the network connectivity. Moreover, cyber-attacks compromise communication data to control power system operations, and can be launched not only as simultaneous attacks but also as sequential attacks [12]. For example, a large area of new energy resources simultaneously disconnects from the backbone network, or some special targets are sequentially compromised by coordinated strategies. Current research indicates that the vulnerability sequence attack (VSA) damages the network more seriously than simultaneous attacks [13], because VSA can collapse the whole network by attacking fewer nodes. The evolution of both logical and real values of system parameters can be analyzed by a hybrid attack graph under attack and recovery scenarios [14]. As simultaneous attacks and sequential attacks have diverse impacts on power systems, it is necessary to investigate cascading failure propagation under multiple attack scenarios by using proper evaluation indexes.

However, the vulnerability of the topology is affected by the transmission efficiency, connectivity, and connected components [15], particularly the power flow distribution of power systems [16]. The topology of the power system is relatively inflexible and vulnerable to intentional attacks [17]. Diverse fault diagnosis technologies have been applied to monitor, locate, and identify faults, which requires handling a large amount of data and consumes system resources [2,3]. An effective control chart technique could substantially decrease the loss caused by diagnosis and correction [18]. Optimal nonlinear adaptive control reduced uncertainties and improved the robustness under different operation scenarios [19]. To decrease the network vulnerability, the network structure can be optimized by link-addition strategies to mitigate cascading failures [20]. Existing research proposes an interlink addition strategy to increase connectivity density, in order to reduce the cascade-safe region and improve the network connectivity [21]. To improve the network robustness, connectivity links and interlinks could be added simultaneously [22], although the construction costs are too high for practical realization [23]. Ji et al. [24] compared various connectivity link addition strategies to verify the feasibility of the low-degree node link-addition strategy and to improve the power network robustness. However, these link-addition strategies focus on pure topology evolution evaluated by degree or betweenness indexes, without considering the special characteristics of power systems.

Since the power system is managed in regions, isolated islands can remain in operation. The Fast–Newman algorithm is introduced to divide the network topology into communities, thereby ensuring that the network can be effectively partitioned [25]. In power systems, the location of generators is the key factor for a valid community [8]. Besides, the load distribution influences the power generation dispatch and control strategy [26]. To provide sufficient power supply, the power system can be partitioned into communities following the power flow directions. Moreover, critical regions greatly affect the topology evolution, and the community partition of these regions seriously influences the network vulnerability [27]. Achieving reliability and preventive maintenance is another optimization goal [28]. Therefore, a community-based link-addition strategy is proposed to optimize the existing power network topology, in order to reduce investment budgets and alleviate the burden of load centers.

In summary, existing research has confirmed that the power system is affected by the community structure, but less attention has been paid to the optimal community structure for mitigating cascading failure propagation. To address this issue, we propose an improved load-capacity model based on the islanding power flow distribution, in terms of complex system and percolation theory. The island ratio is used as a measure of the robustness of power networks. To further capture the difficulty of attacks, an evaluation indicator is introduced to assess the influence of sequential attacks. To optimize the original power system, three community-based link-addition strategies between low-degree nodes are proposed to meet the requirements of engineering practice. This paper is of practical significance for optimizing the network topology and improving the network robustness of the power system.

The remainder of the paper is organized as follows. Section 2 presents the fundamental theoretical background on constructing a load-capacity model. Section 3 discusses the evaluation index. Section 4 describes the process of constructing the link-addition strategies. Section 5 provides the simulation results and the corresponding analysis. Section 6 summarizes several concluding remarks and discusses the challenging issues.

#### **2. System Model**

Based on complex network theory, the power system is modeled as a directed graph *GP* = (*VP*, *EP*), with *N* nodes and without multiple edges or loops, where *VP* and *EP* are the power nodes and lines, respectively. The power nodes are categorized into three types: generator nodes that generate electricity, load nodes that consume electricity, and substation nodes that transfer electricity. In particular, a generator node that carries loads can be classified as a load node. The direction of each power line follows the power flow, which changes over time. To decrease calculation complexity, this study ignores the differences between transmission lines, transient voltage instability, and phase angle mismatch. In this graph, nodes and lines can be removed as a result of failures or attacks. It is assumed that adversaries can manipulate the system information to construct malicious attacks on any target of the system.

In the power system, the real and reactive power injections are balanced at every node, as indicated in Equations (1) and (2). Moreover, the real and reactive power flows in transmission lines follow Kirchhoff's laws, as expressed in Equations (3) and (4) [29].

Real and reactive power injection at node *i*:

$$P_i = V_i \sum_{j=1}^{N} V_j (G_{ij} \cos \theta_{ij} + B_{ij} \sin \theta_{ij}), \tag{1}$$

$$Q_i = V_i \sum_{j=1}^{N} V_j (G_{ij} \sin \theta_{ij} - B_{ij} \cos \theta_{ij}), \tag{2}$$

Real and reactive power flows from node *i* to node *j* are:

$$P_{ij} = V_i^2 G_{ij} - V_i V_j (G_{ij} \cos \theta_{ij} + B_{ij} \sin \theta_{ij}), \tag{3}$$

$$Q_{ij} = -V_i^2 B_{ij} - V_i V_j (G_{ij} \sin \theta_{ij} - B_{ij} \cos \theta_{ij}), \tag{4}$$

where *Pi* is the real power injection at power node *i*, *Qi* the reactive power injection at power node *i*, *Pij* the real power flow from node *i* to node *j*, *Qij* the reactive power flow from node *i* to node *j*, *V* the voltage magnitude, θ*ij* the difference in phase angle between power nodes *i* and *j*, *Gij* the conductance, *Bij* the susceptance, and *N* the initial number of nodes, *i*, *j* ∈ *N*.
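As an illustration, Equations (1)–(4) can be evaluated numerically as in the following minimal sketch. The function names and the two-bus data are hypothetical. It is assumed here that *Gij* and *Bij* in Equations (1) and (2) are entries of the bus admittance matrix, whereas in Equations (3) and (4) they are the series parameters of the line between *i* and *j*; the text does not state this distinction explicitly.

```python
import math

def power_injection(i, V, theta, G, B):
    """Return (P_i, Q_i) following Equations (1) and (2).

    Assumption: G and B hold bus admittance matrix entries.
    """
    p = q = 0.0
    for j in range(len(V)):
        t = theta[i] - theta[j]  # theta_ij
        p += V[j] * (G[i][j] * math.cos(t) + B[i][j] * math.sin(t))
        q += V[j] * (G[i][j] * math.sin(t) - B[i][j] * math.cos(t))
    return V[i] * p, V[i] * q

def line_flow(i, j, V, theta, G, B):
    """Return (P_ij, Q_ij) following Equations (3) and (4).

    Assumption: G and B hold series line conductance and susceptance.
    """
    t = theta[i] - theta[j]
    p = V[i] ** 2 * G[i][j] - V[i] * V[j] * (
        G[i][j] * math.cos(t) + B[i][j] * math.sin(t))
    q = -V[i] ** 2 * B[i][j] - V[i] * V[j] * (
        G[i][j] * math.sin(t) - B[i][j] * math.cos(t))
    return p, q

# Hypothetical lossless two-bus system with line reactance 0.1 p.u.
V = [1.0, 1.0]
theta = [0.1, 0.0]
G_bus = [[0.0, 0.0], [0.0, 0.0]]
B_bus = [[-10.0, 10.0], [10.0, -10.0]]
G_line = [[0.0, 0.0], [0.0, 0.0]]
B_line = [[0.0, -10.0], [-10.0, 0.0]]

P0, Q0 = power_injection(0, V, theta, G_bus, B_bus)
P01, Q01 = line_flow(0, 1, V, theta, G_line, B_line)
```

In this two-bus case the injection at node 0 equals the flow on its only line, which gives a quick consistency check between the two pairs of equations.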

According to the power flow distribution, the power system capacity is assumed to be proportional to its initial state [30]. It is assumed that the power system is provided with moderate reactive power to compensate losses and avoid limit violations at the same voltage level. The initial power flow capacity is the maximum power flow in the transmission lines, as in Equation (5). The initial generation capacity is the maximum output of the generators, as in Equation (6). The initial node capacity is the maximum sum of the outflows *P<sub>outflow,ij</sub>*(*i*) and local loads *L<sub>load</sub>*(*i*) of node *i*, as in Equation (7).

$$C_{PF} = \max(P_{ij}), \tag{5}$$

$$C_{\text{gen},i} = \max(P_{\text{gen}}(i)), \tag{6}$$

$$C_{\text{Node},i} = \max_{i,j \in \mathcal{N}} \left( \sum_{i \in \mathcal{N}} P_{\text{outflow},ij}(i) + L_{\text{load}}(i) \right), \tag{7}$$

So, the system capacity *C<sub>P</sub>* is α times the initial states.

$$C_P = \alpha \left( C_{PF}, C_{\text{gen},i}, C_{\text{Node},i} \right), \tag{8}$$

where α is the tolerance parameter, α ≥ 1. In the model, the same tolerance parameter α is applied throughout the system. It is assumed that the power system adopts the overcurrent protection mechanism. For simplicity, if the power flow exceeds the system capacity, the transmission line trips instantly without automatic reclosing.
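The capacity rule and the trip condition can be sketched as follows. This is only an illustration under stated assumptions: capacities are taken per line from a single initial power flow snapshot, and the function names and numerical values are hypothetical.

```python
def line_capacities(initial_flows, alpha):
    """Scale each initial line flow by the tolerance parameter alpha (alpha >= 1)."""
    if alpha < 1:
        raise ValueError("tolerance parameter alpha must be at least 1")
    return {line: alpha * abs(flow) for line, flow in initial_flows.items()}

def overloaded_lines(flows, capacities):
    """Return the lines whose flow magnitude exceeds capacity; these trip instantly."""
    return [line for line, flow in flows.items() if abs(flow) > capacities[line]]

# Hypothetical flows (p.u.) before and after a disturbance.
initial = {("a", "b"): 1.0, ("b", "c"): 0.5}
capacity = line_capacities(initial, alpha=1.2)
redistributed = {("a", "b"): 1.1, ("b", "c"): 0.7}
tripped = overloaded_lines(redistributed, capacity)
```

Here only line (b, c) trips, because its redistributed flow of 0.7 exceeds its capacity of 0.6, while line (a, b) stays within its capacity of 1.2.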

#### **3. Evaluation Index**

(1) Cluster coefficient

CC indicates the connectivity level between a node and its neighboring nodes [31]. Assume that node *i* has *ni* neighbors with *Ei* links among them, while the maximum possible number of links among these neighboring nodes is *ni*(*ni* − 1)/2. The CC of node *i* is given as follows.

$$\mathcal{C}(i) = \frac{2E_i}{n_i(n_i - 1)}, \tag{9}$$

Then, the global CC of the network equals the mean value of the local CC of all nodes:

$$\mathcal{C} = \sum_{i \in \mathcal{N}} \mathcal{C}(i) / N, \tag{10}$$
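Equations (9) and (10) can be computed directly from an adjacency structure, as in the minimal sketch below. It assumes an undirected view of the network stored as a dictionary of neighbor sets, and it assigns a CC of zero to nodes with fewer than two neighbors; both choices are assumptions of this illustration.

```python
def local_cc(adj, i):
    """Local cluster coefficient of node i, Equation (9)."""
    nbrs = adj[i]
    n = len(nbrs)
    if n < 2:
        return 0.0  # assumed convention for leaf and isolated nodes
    # Count each link among the neighbors of i exactly once.
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2.0 * links / (n * (n - 1))

def global_cc(adj):
    """Mean of the local cluster coefficients, Equation (10)."""
    return sum(local_cc(adj, i) for i in adj) / len(adj)

# Hypothetical example: a triangle 0-1-2 with a leaf node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cc = global_cc(adj)
```

For this graph the local values are 1, 1, 1/3, and 0, so the global CC is 7/12.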

(2) Average path length

APL is a measure of network efficiency. Dijkstra's algorithm [32] is used to find the shortest path from the source node *i* to the destination node *j*; the average distance between two nodes is then given as follows.

$$L = \frac{1}{N(N-1)} \sum_{i \neq j \in \mathcal{N}} d_{ij}. \tag{11}$$

In this study, *dij* is assumed to be the distance cost of one new connectivity link, which indicates the difficulty of adding one new link from one source node to the other destination node.
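A minimal sketch of Equation (11) using Dijkstra's algorithm is given below. The adjacency format (a dictionary mapping each node to its neighbors and link costs) is an assumption of this illustration, and unreachable pairs are simply left out of the sum.

```python
import heapq

def dijkstra(adj, src):
    """Shortest distances from src to every reachable node."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def average_path_length(adj):
    """Average shortest distance over ordered node pairs, Equation (11)."""
    n = len(adj)
    total = 0.0
    for s in adj:
        dist = dijkstra(adj, s)
        total += sum(d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))

# Hypothetical three-node path 0-1-2 with unit link costs.
adj = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
apl = average_path_length(adj)
```

For the three-node path the ordered-pair distances sum to 8, so the APL is 8/6 = 4/3.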

(3) Node vulnerability

Based on percolation theory, nodes are functional only in a giant component, which is a maximal connected component of the graph. The number of nodes that remain in the giant component after the removal of one node indicates the vulnerability of that node. Within one network, even when several nodes have the same vulnerability, their removal affects the remaining components differently. In [4], the node types and their locations are combined to further distinguish the most vulnerable node. If the nodes are in separate single loops, the node in the bigger single loop is more important than that in the smaller one. Since a line-shaped branch is generated after a single loop is opened, the longer the branch, the greater the loss of nodes. If the nodes are in the same single loop or in different single loops of the same size, further investigation is required until the most critical node is located.

$$I_{r(i)} = \frac{N'}{N}, \quad \forall\, \text{length}(r(i)) > \text{length}(r(\varphi)), \tag{12}$$

where *N*′ is the node number of the remaining giant component, ϕ the set of nodes with the same vulnerability, *r*(*i*) the single loop in which node *i* is located, and *length* stands for the length of the single loop, *i* ∈ ϕ ∈ *N*.
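The ratio in Equation (12) can be illustrated with the following sketch, which removes one node at a time and measures the remaining giant component. It covers only the ratio itself; the tie-breaking by single-loop length described above is not implemented, and the example graph is hypothetical.

```python
def components(adj, removed=frozenset()):
    """Connected components of an undirected graph after deleting `removed` nodes."""
    seen = set(removed)
    comps = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps

def giant_ratio_after_removal(adj, node):
    """Size of the largest remaining component divided by the initial node number."""
    comps = components(adj, removed={node})
    return max((len(c) for c in comps), default=0) / len(adj)

# Hypothetical example: hub 0 with leaves 1 and 2, and a branch 0-3-4.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
most_vulnerable = min(adj, key=lambda v: giant_ratio_after_removal(adj, v))
```

Removing the hub leaves a largest component of two nodes out of five, the smallest ratio among all single-node removals, so the hub is identified as the most vulnerable node.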

After some nodes are removed from the network in a random or targeted manner, the remaining giant component ratio is used to estimate the network robustness [33]. However, the power system can maintain islanding operation. Thus, the island ratio is defined as the proportion of nodes in all surviving isolated components of the power system.

$$I = \frac{\sum\_{x} \Theta(x)}{N},\tag{13}$$

where Θ(*x*) is the node number of surviving island *x*, and the sum runs over all surviving islands.

To assess the influence of sequential attacks on the network, an evaluation indicator *S* is introduced that combines the difficulty of attacks with the survivability of the network.

$$S = \tau \times I, \tag{14}$$

where τ is the number of sequential attacks, and *S* is a scalar without units.
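Equations (13) and (14) can be sketched as below. The survival criterion used here, namely that an island survives if it contains at least one generator node, is an assumption of this illustration, and the function names and the five-node example are hypothetical.

```python
def surviving_islands(adj, generators, removed):
    """Islands left after node removal that still contain a generator (assumed criterion)."""
    seen = set(removed)
    islands = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        island, stack = set(), [start]
        while stack:
            u = stack.pop()
            island.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        if island & generators:
            islands.append(island)
    return islands

def island_ratio(adj, generators, removed):
    """Island ratio I of Equation (13): surviving nodes over the initial node number."""
    islands = surviving_islands(adj, generators, removed)
    return sum(len(x) for x in islands) / len(adj)

def sequential_indicator(tau, ratio):
    """Indicator S of Equation (14): number of sequential attacks times island ratio."""
    return tau * ratio

# Hypothetical path 0-1-2-3-4 with generators at both ends; node 2 is attacked.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
I = island_ratio(adj, generators={0, 4}, removed={2})
S = sequential_indicator(1, I)
```

In this example both remaining islands keep a generator, so the island ratio is 0.8, whereas the giant component ratio alone would report only 0.4.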

#### **4. Link-Addition Strategy**

#### *4.1. Fast–Newman Algorithm for Community Partition*

According to the power system management, each community must have at least one generator node to supply sufficient electricity; otherwise, the partition is invalid. Valid communities of the directed power system graph are detected by evaluating the modularity with the Fast–Newman algorithm [25].

$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(v_i, v_j), \tag{15}$$

where *m* is the link number, 2*m* the sum of degrees of the network, *A* the adjacency matrix, *k* the degree of a node, and δ(*vi*, *vj*) the function indicating whether two nodes belong to the same community. If they are in the same community, it is 1, otherwise 0. The modularity *Q* lies in [−0.5, 1); the greater the modularity, the better the community partition. Statistics show that when *Q* is between 0.3 and 0.7, communities cluster effectively [34].
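The modularity of Equation (15) can be evaluated for a given partition as in the sketch below. It only scores a fixed partition and does not perform the greedy merging of the Fast–Newman algorithm; the undirected adjacency format and the six-node example are assumptions of this illustration.

```python
def modularity(adj, community):
    """Modularity Q of Equation (15) for an undirected graph and a node-to-community map."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # sum of degrees = 2m
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue  # delta(v_i, v_j) = 0
            a_ij = 1.0 if j in adj[i] else 0.0
            q += a_ij - len(adj[i]) * len(adj[j]) / two_m
    return q / two_m

# Hypothetical example: two triangles joined by the single link 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
merged = {v: "A" for v in adj}
q_split = modularity(adj, split)
q_merged = modularity(adj, merged)
```

Splitting the graph into its two triangles gives *Q* = 5/14 ≈ 0.357, inside the 0.3 to 0.7 range mentioned above, while placing all nodes in one community gives *Q* = 0.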

#### *4.2. Low-Degree-Node-Based Link-Addition Strategy*

A one-degree node (leaf node) of the power system is easily removed, owing to overload of its transmission line or removal of its neighboring node under disturbances or attacks. Through the addition of new links to the leaf nodes, the connectivity level of the network can be increased. This is because the removal of tree-shaped root nodes can cause a large area to be disconnected from the core component, and the leaf nodes of the most vulnerable nodes are critical for optimizing the power system topology. However, some leaf nodes are generator nodes, and it is unreasonable to connect two generators unless one of them carries a heavy load. The newly added links cannot overlap the original links. Moreover, the new network has to ensure that each community has at least one generator node. Accordingly, three link-addition strategies are proposed to enhance the original network connectivity and decrease the vulnerability.

(1) Low-degree-node link-addition strategy (LDNLAS)

The strategy aims to optimize long-distance transmission line construction to support long-distance electricity transmission in large-scale power systems. Based on the community partition and node vulnerability of the original power system, the new links from one community to other communities are chosen to minimize the average shortest path. If the most vulnerable node has leaf nodes, new links are first added from them.

$$E_{\text{LDNLAS}} = \sum_{D_1} E_{st} \quad \text{s.t.}\ \delta(v_s, v_t) = 0,\ s \in D_1,\ t \in \mathcal{N},\ D_1 \in \mathcal{N},\ s \neq t,\ \min L_{new} = \frac{1}{N(N-1)} \sum_{s \neq t} d_{st}, \tag{16}$$

where *Est* is an additional link, *s* the low-degree nodes, *t* the leaf nodes, and *D*<sub>1</sub> the set of low-degree nodes that satisfy the average shortest path *Lnew*.
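One possible reading of Equation (16) is sketched below: among candidate links between a leaf node and a low-degree node in a different community, the link that yields the smallest average path length is selected. This is an illustration under assumptions rather than the exact procedure of the paper: hop counts stand in for the distance cost *dst*, the `max_degree` threshold for "low-degree" is hypothetical, the graph is treated as undirected, and the vulnerability-based ordering of leaf nodes is omitted.

```python
from collections import deque
from itertools import product

def hop_apl(adj):
    """Average hop distance over ordered node pairs (unreachable pairs are skipped)."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))

def best_ldnlas_link(adj, community, generators, max_degree=2):
    """Pick one inter-community link (s, t) from a leaf t that minimizes the new APL."""
    leaves = [v for v in adj if len(adj[v]) == 1]
    low = [v for v in adj if len(adj[v]) <= max_degree]
    best, best_apl = None, float("inf")
    for t, s in product(leaves, low):
        if s == t or s in adj[t]:
            continue  # new links cannot overlap original links
        if community[s] == community[t]:
            continue  # delta(v_s, v_t) must be 0
        if s in generators and t in generators:
            continue  # avoid connecting two generators
        trial = {v: set(nbrs) for v, nbrs in adj.items()}
        trial[s].add(t)
        trial[t].add(s)
        apl = hop_apl(trial)
        if apl < best_apl:
            best, best_apl = (s, t), apl
    return best, best_apl

# Hypothetical six-node path split into two communities with one generator each.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
link, new_apl = best_ldnlas_link(adj, community, generators={1, 4})
```

In this example, linking a leaf to the second-to-last node of the other community gives a lower APL (52/30) than closing the path into a ring (54/30), which shows that the shortest-average-path criterion does not always favor leaf-to-leaf links.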

(2) Nearest-neighboring-node link-addition strategy (NNNLAS)

The strategy aims to connect the nearest nodes to enhance the local network connectivity and density. Based on the breadth-first search algorithm, new links are selected to have the shortest distance between neighboring nodes. If several candidate links have the same shortest distance, the one that yields the shortest average path is chosen.

$$E_{\text{NNNLAS}} = \sum_{D_2} E_{st} \quad \text{s.t.}\ \text{weight}(v_s, v_t),\ s \in D_2,\ t \in \mathcal{N},\ D_2 \in \mathcal{N},\ s \neq t,\ \min d_{st}, \tag{17}$$

where *Est* is an additional link, *t* the leaf nodes, *s* the neighbor of leaf nodes that satisfy the shortest path *dst*, and *D*<sub>2</sub> the set of neighboring nodes.
