5.1. Impact of Load Capacity Parameters on Cascading Failures
The cascading failure model of command and control networks evaluates network robustness by considering the initial load and capacity of nodes, using a nonlinear model and relevant parameters (e.g., load distribution coefficient and tolerance coefficient) to describe node state characteristics. The model employs a random attack strategy to select target nodes and introduces a probabilistic repair mechanism to simulate real-world uncertainties, incorporating a certain degree of randomness. However, this randomness is modeled based on predefined parameters (e.g., node capacity, load, and repair probability) rather than being completely stochastic. Overall, node vulnerability and failure propagation are governed and assessed by network structural parameters and state transition probabilities.
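A minimal sketch of the kind of load-capacity cascade described above may help fix ideas. It is not the paper's implementation: the power-law initial load `degree ** tau`, the capacity form `load + beta * load ** alpha`, the load-proportional redistribution rule, and all function and parameter names are assumptions chosen for illustration. The probabilistic repair step is omitted.

```python
import random


def initial_load(degree, tau):
    # Assumed form: initial load grows as a power of the node degree.
    return degree ** tau


def capacity(load, alpha, beta):
    # Assumed nonlinear load-capacity form; alpha = 1 gives a linear relation.
    return load + beta * load ** alpha


def cascade_size(adj, seed, tau, alpha, beta):
    """Fraction of nodes that have failed once the cascade from `seed` stops."""
    load = {v: initial_load(len(nbrs), tau) for v, nbrs in adj.items()}
    cap = {v: capacity(load[v], alpha, beta) for v in adj}
    failed = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for f in frontier:
            alive = [u for u in adj[f] if u not in failed]
            total = sum(load[u] for u in alive)
            if total == 0:
                continue  # no surviving neighbor can absorb the load
            for u in alive:
                # Each surviving neighbor takes a share proportional to its load.
                load[u] += load[f] * load[u] / total
            for u in alive:
                if load[u] > cap[u]:
                    failed.add(u)
                    next_frontier.append(u)
        frontier = next_frontier
    return len(failed) / len(adj)


def average_cascade_size(adj, tau, alpha, beta, trials=200, rng=None):
    """Average cascade size over `trials` random single-node attacks."""
    rng = rng or random.Random(0)
    nodes = list(adj)
    sizes = (cascade_size(adj, rng.choice(nodes), tau, alpha, beta)
             for _ in range(trials))
    return sum(sizes) / trials
```

In this sketch a larger `beta` leaves every node more headroom, so a single failure is less likely to overload its neighbors. On a five-node star, for instance, removing a leaf with `beta = 0.5` leaves the hub within capacity, whereas `beta = 0.1` overloads the hub and then every remaining leaf.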
Through the analysis of the cascading failure model in the command and control network mentioned above, it is evident that when other parameters are given, a larger means that the load capacity of the nodes is sufficient to handle additional load, preventing cascading failures caused by a single node failure. In this case, . Conversely, as decreases, the network is more likely to experience cascading failures triggered by a single node failure, leading to . There exists a critical value in the relationship between and , such that when , , and when , . Therefore, can serve as another metric for measuring the robustness of the command and control network against cascading failures—the smaller the , the better the network’s robustness. The following sections analyze the impact of various parameters on and .
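One way to read a critical value off simulated data is to scan the measured curve for the point from which the cascade scale stays negligible. The sketch below assumes that the scale drops to (near) zero once the capacity parameter exceeds the critical value. The function name, the `(parameter, scale)` input layout, and the threshold `eps` are hypothetical and are not taken from the paper.

```python
def critical_value(curve, eps=1e-3):
    """Smallest scanned parameter value from which the cascade scale stays
    at or below `eps`; returns None if the scale never settles below it.

    `curve` is an iterable of (parameter_value, average_cascade_scale) pairs.
    """
    critical = None
    for value, scale in sorted(curve):
        if scale <= eps:
            if critical is None:
                critical = value  # candidate start of the cascade-free regime
        else:
            critical = None  # a later cascade invalidates the candidate
    return critical
```

Because the estimate is limited to the scanned grid, a finer grid around the transition gives a sharper estimate.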
According to the command and control network generation algorithm, the network was initialized with , , , , and for robustness analysis. When , the load capacity model exhibited a linear relationship. The impact of load parameters and on the network’s robustness against cascading failures was analyzed, and the simulation results were obtained as the average of 200 experiments.
Figure 2 shows the curve of network robustness as a function of the parameter
. The figure illustrates the trend of network robustness under different values of
, revealing the impact of
on the network’s resilience. As
increased, the network robustness exhibited specific variation patterns, which intuitively reflected the network’s capability to withstand failures under different conditions. By analyzing this curve, one can better understand the role of the
parameter in enhancing network stability and failure recovery capabilities, as well as its optimization strategies. In
Figure 2, with
, it can be seen that for a given
, as
increases, the scale of cascading failures
increases, indicating that the network’s robustness against cascading failures improves. The critical value
first decreases and then increases as
increases, reaching its minimum when
.
Figure 3 shows the curve of network robustness as a function of the parameter
. The figure depicts the trend of network robustness under different values of
, revealing the impact of
on network resilience. As
increases, the network robustness changes accordingly. This curve intuitively reflects the network’s ability to withstand cascading failures under different conditions. By analyzing the curve, one can gain deeper insights into the influence of the parameter
on network structure and performance, providing a reference for optimizing network design. In
Figure 3, as
increases, the scale of cascading failures
increases, and the critical value
first decreases and then increases as
increases, reaching its minimum when
.
Since the initial load parameter reflects the hierarchical nature of the load distribution in the command and control network, it produces an uneven distribution of load and thereby affects the network’s response to cascading failures. The impacts of different values of this parameter on the scale of cascading failures and on the phase transition critical value were analyzed. The experimental statistical results are shown in Figure 4, with each curve obtained by averaging the results of 200 independent experiments.
Figure 4 shows the relationship between the average size of cascading failures and the load capacity parameters. It illustrates how the average size of cascading failures varies with these parameters and thus how node load capacity affects network resilience, providing a design reference for enhancing network robustness.
As shown in
Figure 4, under different values of the overload failure adjustment parameter
, the initial load parameter
and the average scale of cascading failures in the network are consistently negatively correlated. This indicates that the stronger the hierarchical nature of the load distribution in the command and control network, the smaller the average scale of cascading failures. The reason is that, as the load distribution becomes more hierarchical, a small number of high-command-level nodes bear a larger portion of the load, while a large number of low-command-level nodes bear less. Consequently, a large-scale cascading failure is triggered only when a high-command-level node fails, which lowers the average scale of cascading failures in the network.
5.2. Comparison of Optimal Recovery and Probabilistic Recovery Strategies
The probabilistic recovery strategy can also be described as performing random recovery under a certain recovery proportion. For example, if the recovery probability
, then it is essentially the same as random recovery with a recovery proportion of 0.3. Therefore, probabilistic recovery can be compared with optimal recovery, and this section focuses on comparing the two and observing their impact on the network. As shown in
Figure 5,
Figure 6 and
Figure 7, regardless of the recovery probability or the recovery proportion, the optimal recovery strategy consistently outperformed the probabilistic recovery strategy and could better suppress the propagation of cascading risks. The higher the recovery probability or recovery proportion, the more effectively the scope of cascading failures could be reduced.
Therefore, in command and control networks, when cascading failures occur, it is essential to screen and rank the failed nodes according to the recovery importance proposed in this paper and select the more critical failed nodes for optimal recovery. It is advisable to prioritize the recovery of nodes with more cooperative relationships and higher load capacity within the network. This optimal selection of recovery nodes offers greater benefits to the network compared with random recovery, allowing the cascading failure range to be quickly controlled at a lower level and effectively suppressing the propagation of cascading failures. From these results, it is clear that the negative impact of cascading failures on command and control networks is substantial. Even with recovery strategies in place, the harm caused by cascading risk propagation cannot be entirely mitigated, emphasizing the necessity of preventing cascading risk propagation.
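The contrast between the two strategies can be sketched as two selection rules over the set of failed nodes. This is an illustration under stated assumptions: the score in `recovery_importance` (degree multiplied by capacity) is a hypothetical stand-in for the recovery importance defined in the paper, and all function names are invented here.

```python
import random


def probabilistic_recovery(failed, p, rng):
    """Recover each failed node independently with probability p."""
    return [v for v in failed if rng.random() < p]


def recovery_importance(degree, capacity):
    # Hypothetical score: favors nodes with many cooperative links (degree)
    # and a high load capacity. The paper's own definition may differ.
    return {v: degree[v] * capacity[v] for v in degree}


def priority_recovery(failed, importance, proportion):
    """Recover the top `proportion` of failed nodes ranked by importance."""
    k = round(proportion * len(failed))
    ranked = sorted(failed, key=lambda v: importance[v], reverse=True)
    return ranked[:k]
```

With the same expected number of recovered nodes, the first rule spreads repairs at random, while the second concentrates them on the highest-scoring nodes.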
Under sustained attacks, the overall network resilience gradually decreases with the increasing number of attacks. The experiments revealed that when attacks were concentrated on high-importance nodes, the network connectivity and functionality rapidly deteriorated, resulting in command chain disruption and ineffective transmission of commands. Without timely repair, the failure of critical nodes could trigger a cascading effect, leading to large-scale network failure.
Although the priority-based recovery strategy proposed in this paper can enhance the resilience of command and control networks, its implementation is subject to the following technical constraints: (1) Real-time monitoring and dynamic evaluation: efficient monitoring systems are required to continuously collect node status (e.g., load and degree centrality) and quickly assess failure conditions, which is a prerequisite for implementing the strategy. (2) Computational and data processing capacity: the strategy relies on complex network topology analysis and node importance assessment, which requires robust computational resources and efficient algorithms to support real-time analysis and decision-making. (3) Reliability of communication links: communication links may be disrupted under attack, so redundant communication paths are needed to ensure that repair instructions are transmitted effectively and that repair operations can be coordinated through a centralized or distributed decision-making center. (4) Resource allocation and recovery priority: repair resources need to be allocated reasonably, and repair priorities must be clearly defined to maximize recovery effectiveness. The strategy therefore enhances network resilience in complex and dynamic environments only when these technical requirements are met.
5.3. Analysis of the Average Number of Attacks the Network Can Withstand
This section calculates the average number of attacks each network undergoes across 200 experiments, referred to as the “average number of attacks”. As shown in
Figure 8a, using a repair strategy during the cascading failure process in command and control networks can increase the number of attacks the network can withstand, thereby enhancing its robustness. To offset the advantage that larger networks gain from scale alone, this section also calculates the average attack scale each network can withstand, obtained by dividing the average number of attacks by the original number of nodes in the network.
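The two quantities defined above reduce to a mean and a normalization. A minimal sketch follows, assuming each experiment records the number of attacks the network underwent. The function names and input layout are hypothetical.

```python
def average_attacks(attack_counts):
    """Mean number of attacks a network underwent across experiments."""
    return sum(attack_counts) / len(attack_counts)


def average_attack_scale(attack_counts, n_nodes):
    """Average number of attacks divided by the original number of nodes."""
    return average_attacks(attack_counts) / n_nodes
```

For example, a 100-node network that underwent 10, 20, and 30 attacks in three runs has an average number of attacks of 20 and an average attack scale of 0.2.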
As shown in
Figure 8b, whether before or after using a repair strategy, the average attack scale that the network can withstand decreases as
increases. As shown in
Figure 8e, although using a repair strategy does help the network withstand a larger attack scale, the benefit brought by the repair strategy diminishes as
increases. As shown in
Figure 8f, the benefit to network robustness from the repair strategy increases gradually as
increases. This result confirms that increasing
helps improve network robustness.
However, using a repair strategy does not help all networks withstand more attacks. For example, as shown in
Figure 8c,d, when
and
, the average number of attacks and average attack scale that the network can withstand after using a repair strategy are lower than when no repair strategy is used.
Overall, using a repair strategy can help most networks withstand more attacks. Moreover, the parameters and have a significant moderating effect on the implementation effectiveness of the repair strategy.
The paper examines the effectiveness of different repair strategies (e.g., probabilistic repair and priority-based recovery) under sustained attack conditions. The results indicate that the priority-based recovery strategy can significantly delay the degradation of network performance. Especially under limited repair resources, by prioritizing the restoration of important nodes, this strategy effectively reduces the spread of failures and enhances the overall network recovery capacity over multiple attack rounds. However, as the attack intensity increases and the duration extends, even with the priority-based recovery strategy, the overall network resilience will inevitably decline.
5.4. Analysis of the Probability of Network Failure Due to a Single Attack
In this section, we discuss the probability of a command and control network failing due to a single attack. If a network completely collapses after just one attack, it indicates insufficient robustness to some extent. As shown in
Figure 9, when
and
, using a repair strategy can only help reduce the probability of failure due to a single attack for networks with certain values of
(complexity), and it may even increase the probability of such failures. However, when
, the network almost never experiences a complete collapse due to a single attack. For example, when
and
, the probability of network collapse after a single attack is reduced after using a repair strategy. This suggests that networks with higher
values have significantly stronger robustness.
As shown in
Figure 10, for networks with the same
, the probability of failure due to a single attack in medium-sized networks can be significantly reduced by using a repair strategy. For instance, when
, the probability of network failure due to a single attack is significantly reduced by using a repair strategy. This indicates that the repair strategy can effectively improve the robustness of medium-sized networks by reducing the probability of failure due to a single attack. However, in small and large networks, the probability of such failures increases with the use of a repair strategy. For example, when
or
, the probability of network failure due to a single attack is significantly higher with the repair strategy compared with the probability without it. Additionally, the probability of failure due to a single attack is notably higher in small networks than in large networks. For example, without using a repair strategy, the probability of this occurring in a network with
is 0.265, whereas for a network with
, the probability is 0.155, showing a significant difference between the two.
Overall, the discussion on the probability of network collapse due to a single attack reveals that such incidents are significantly less likely in networks with higher values, indicating that networks with higher have considerably stronger robustness. Although the use of a repair strategy does not help all networks with lower complexity reduce the probability of such failures, it can significantly lower the probability of these incidents in medium-sized networks, highlighting the dual nature of the repair strategy.
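The probability discussed in this section can be estimated as the share of experiments in which the network collapsed after its first attack. The sketch below assumes that each experiment records the number of attacks applied before collapse. The function name and input layout are hypothetical.

```python
def single_attack_failure_probability(attack_counts):
    """Fraction of experiments in which the network collapsed after one attack."""
    collapsed_at_once = sum(1 for count in attack_counts if count == 1)
    return collapsed_at_once / len(attack_counts)
```

Under this reading, a reported probability of 0.265 over 200 experiments corresponds to 53 runs that ended after the first attack.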
5.5. Performance Fluctuation Rate Analysis
The existing literature indicates that the network performance can fluctuate during the cascading failure process [6]. To compare the performance of a set of networks and to highlight the volatility of network performance, this section calculates the average performance fluctuation rate for each network. For a given network, the performance fluctuation rate is first calculated from the performance values obtained after each pair of consecutive iterations, and the average of all these fluctuation rates is then taken as the average performance fluctuation rate for that network. A higher average performance fluctuation rate indicates greater volatility in network performance during the iterations, implying that the network performance is more unstable. This section discusses the robustness of command and control networks from the perspectives of efficacy and efficiency based on network performance.
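The calculation described above can be sketched as follows. The exact formula is not given in this section, so two choices here are assumptions: the rate is taken as the relative change between consecutive iterations, and the average is taken over absolute values so that a higher result means greater volatility. The function names are hypothetical.

```python
def fluctuation_rates(performance):
    """Relative change in performance between consecutive iterations.

    Pairs whose earlier value is zero are skipped to avoid division by zero.
    """
    return [(cur - prev) / prev
            for prev, cur in zip(performance, performance[1:])
            if prev != 0]


def average_fluctuation_rate(performance):
    """Mean absolute fluctuation rate; 0.0 if no rate can be computed."""
    rates = fluctuation_rates(performance)
    if not rates:
        return 0.0
    return sum(abs(rate) for rate in rates) / len(rates)
```

For the performance series 1.0, 0.5, 0.5, 1.0 the rates are -0.5, 0.0, and 1.0, giving an average fluctuation rate of 0.5 under these assumptions.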
- (1) Robustness Analysis from the Efficacy Perspective
As shown in
Figure 11, comparing the scenarios of
and
, it is observed that after applying a repair strategy, the average performance fluctuation rate for the network with
shows a noticeable change, while the network with
is almost unaffected by the repair strategy. However, for the network with
, the repair strategy does not always have a positive impact. For example, when
and
, using the repair strategy increases the average performance fluctuation rate, indicating greater volatility in performance changes during the iterations and exacerbating the network instability. The change in network performance is attributed to changes in node status and load distribution. The results suggest that under the influence of the repair strategy, newly recovered nodes that are restored to normal operation and reintegrated into the network may disrupt the current operational order, leading to negative impacts on the network.
To demonstrate the performance changes in the command and control network more intuitively, this section analyzes how the proportions of three types of performance fluctuation rates change before and after applying the repair strategy. The three types are as follows: a positive fluctuation rate, where the current network performance is higher than the performance at the end of the previous iteration; a negative fluctuation rate, where it is lower; and a zero fluctuation rate, where it is unchanged.
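The three-way classification can be sketched as a count over consecutive iterations. The function name, the returned dictionary keys, and the tolerance `tol` used to decide that two values are equal are assumptions for illustration.

```python
def fluctuation_proportions(performance, tol=1e-12):
    """Share of iterations with a positive, negative, or zero fluctuation."""
    counts = {"positive": 0, "negative": 0, "zero": 0}
    pairs = list(zip(performance, performance[1:]))
    for prev, cur in pairs:
        if cur - prev > tol:
            counts["positive"] += 1
        elif prev - cur > tol:
            counts["negative"] += 1
        else:
            counts["zero"] += 1
    if not pairs:
        return {kind: 0.0 for kind in counts}
    return {kind: n / len(pairs) for kind, n in counts.items()}
```

Comparing the dictionaries computed with and without repair shows which of the three proportions the repair strategy shifts.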
As shown in
Figure 12, for networks with
, the impact of the repair strategy varies depending on the value of
, although it is generally noticeable. For instance, when
, after applying the repair strategy, the proportion of negative fluctuation rates increases, the proportion of zero fluctuation rates decreases, and the proportion of positive fluctuation rates remains nearly unchanged. This indicates that the network performance becomes more volatile and that, as attacks continue, it trends downward overall. Increased volatility suggests decreased stability in network operations, and a downward trend in network performance indicates weakened robustness against attacks.
Conversely, when , the situation is different from that with . After applying the repair strategy, the positive fluctuation rate increases, the proportion of cases with zero fluctuation decreases, and the proportion of negative fluctuation remains nearly unchanged. This means that while the network performance volatility increases, the overall trend in the network performance improves. Although the stability of the command and control network’s operations decreases, its robustness against attacks strengthens.
When
is larger, the effect of the repair strategy on the network efficacy becomes more complex. As shown in
Figure 13, when
, the use of the repair strategy significantly impacts the average efficacy fluctuation rate in only a few networks, and the situations in these networks are quite complicated.
For example, when , the proportion of cases with zero fluctuation rate significantly decreases, while the proportions of both positive and negative fluctuation rates increase, with the increase being nearly equal for both. This indicates not only that the network’s volatility is increasing but also that both the upward and downward trends in network performance are becoming more pronounced. Compared with the enhancement of a single trend, this situation suggests even poorer network stability.
- (2) Robustness Analysis from the Efficiency Perspective
As shown in
Figure 14, cases where the efficiency fluctuation rate is zero are rare both before and after implementing the repair strategy. When the proportion of zero fluctuation rates is zero, only positive and negative fluctuation rates exist and their proportions sum to one, so the two proportions cannot change in the same direction: any increase in one is matched by an equal decrease in the other. The repair strategy can therefore push the network’s efficiency in only one direction. For example, when
and
, after applying the repair strategy, the proportion of positive fluctuation rates decreases, while the proportion of negative fluctuation rates increases, with both changing by the same magnitude. This suggests that the downward trend in network performance is becoming more pronounced, indicating weaker robustness in the command and control network.
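The equal and opposite changes follow directly from the fact that the three proportions sum to one. In the short derivation below, the symbols $p_{+}$, $p_{-}$, and $p_{0}$ (the proportions of positive, negative, and zero fluctuation rates) and $\Delta$ (the change caused by the repair strategy) are notation introduced here for illustration and do not appear in the paper.

```latex
p_{+} + p_{-} + p_{0} = 1, \qquad p_{0} = 0 \ \text{both before and after repair}
\;\Longrightarrow\; \Delta p_{+} + \Delta p_{-} = 0
\;\Longrightarrow\; \Delta p_{+} = -\,\Delta p_{-}.
```

When $p_{0}$ is small but not exactly zero, the relation holds only approximately, with $\Delta p_{+} + \Delta p_{-} = -\Delta p_{0}$.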
As shown in
Figure 15, for networks with low or high
values, the repair strategy may either exacerbate the downward trend in network performance or enhance the upward trend. However, for networks with moderate
values, the effect of the repair strategy is more consistent, typically increasing the proportion of negative fluctuation rates and thus exacerbating the downward trend in performance.
The paper further explores network performance fluctuations under long-term attacks. The experiments show that after multiple rounds of attacks and repairs, performance metrics (such as network efficiency and connectivity) exhibit noticeable fluctuations. In particular, when repair resources are insufficient, repair operations may destabilize the network structure, leading to increased performance volatility. Additionally, the paper measures the network’s resilience in terms of the average number of attacks it can withstand across 200 experiments for different repair strategies. The results show that under sustained attacks, the average number of attacks the network can endure decreases as the attack intensity increases, and once the intensity surpasses a critical threshold, the network can hardly maintain its normal functionality.
Overall, the exploration of network performance fluctuations reveals that the impact of using a repair strategy on efficiency volatility is more significant than its impact on efficacy volatility. From the perspective of efficacy, smaller networks still have some room for improvement in achieving optimal performance levels, whereas larger networks tend to exhibit a relatively stable state in terms of optimal performance levels. From the perspective of efficiency, using a repair strategy significantly affects the volatility of the network’s current performance. Notably, when is at a moderate level, the use of a repair strategy increases the proportion of negative performance fluctuation rates, resulting in a significant negative impact on the robustness of the command and control network.