Article

Stochastic Evolutionary Analysis of an Aerial Attack–Defense Game in Uncertain Environments

1 Equipment Management and UAV Engineering College, Air Force Engineering University, Xi’an 710051, China
2 National Key Laboratory of Unmanned Aerial Vehicle Technology, Xi’an 710051, China
3 College of Information Technology, Nanjing Police University, Nanjing 210023, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(19), 3050; https://doi.org/10.3390/math12193050
Submission received: 8 July 2024 / Revised: 7 September 2024 / Accepted: 13 September 2024 / Published: 28 September 2024
(This article belongs to the Special Issue Operations Research and Its Applications)

Abstract

Aiming at the problem of random environmental interference in the strategy interaction and behavioral evolution of an aerial attack–defense game, this paper considers the influence of differences in performance and value between the two game players on strategy evolution. It explores the randomness of the complex battlefield environment, the uncertainty of the players’ behavioral states, and the limitations imposed by emergent situations; constructs a mathematical model of the stochastic evolution of an aerial coordinated attack–defense game in uncertain environments; and studies the stability of the strategy interaction and behavioral decision-making process of both players of the aerial attack–defense game. Simulation results show that many performance and value factors of the two game players strongly affect their strategy evolution trends, changing not only the outcome of strategy selection but also the rate of strategy evolution. In addition, random environmental factors interfere to a certain degree with the players’ strategy evolution process: they usually accelerate the rate of strategy evolution and greatly affect its course. This study provides a theoretical basis and feasible reference for improving mission decision-making, response mechanisms, and system modeling of aerial attack–defense games, and thus has important theoretical value and practical significance.
MSC:
65C30; 68U01; 91A35; 91A22; 91A25; 91A26

1. Introduction

Modern operation essentially means creating and using an “asymmetric” advantage to seize the battlefield advantage and defeat the other player [1]. Therefore, countries around the world are vigorously developing advanced equipment to seek an asymmetric technological advantage in the battlefield game. As typical new directed-energy equipment, laser equipment has the characteristics of strong anti-interference ability, high hitting accuracy, strong continuous striking ability given a sustained power supply, small collateral damage, and a high cost-effectiveness ratio. It is capable of realizing precise and efficient strikes against incoming targets without interfering with or accidentally injuring allies. It is extremely suitable for large-scale cluster combat scenarios, and has therefore attracted great attention from many countries. In recent decades, with the deepening of research on laser technology in the safety protection field, laser equipment has become more and more advanced and has been widely used in actual operations. As laser equipment becomes gradually miniaturized, compact, and lightweight in volume and weight, its carrying platforms can gradually diversify. The operation field is no longer limited to land and sea battlefields and has begun to expand to the sky and space, contributing to the formation of all-round, multi-level operational forces. Especially in the sky domain, air operations, as an important part of modern warfare, have become one of the key factors determining the victory or defeat of a war. Air operations are characterized by rapidity and mobility, the operational field is relatively independent, and the energy propagation of laser equipment is not easily affected by the geographic environment or the curvature of the earth, enabling real-time, dynamic, rapid, and precise strikes on incoming targets.
Therefore, laser equipment mounted on an aircraft platform for air operations has gradually become a key direction for future development [2,3]. However, with the introduction of new equipment, aerial attack–defense confrontations will take on more complex forms in the future. Differences in the airborne performance of the attacking and defending players will have a direct impact on the outcome of the game. In addition, each game player faces the interaction of multiple factors, such as rapid changes in the surrounding environment, the unpredictability of the target’s behavioral state, and the existence of unexpected situations, all of which seriously affect the player’s strategic behavioral choices. Therefore, studying the evolution mechanism of aerial attack–defense confrontation in uncertain environments and the impact of random environmental interference on the strategy interaction process has important theoretical value and practical significance for decision-making responses and system model analysis of future aerial games.
Thus far, many scholars have conducted research on the issue of aerial attack–defense confrontation and have obtained corresponding results. The open literature on aerial attack–defense confrontation mainly focuses on three aspects: game players, finite rationality, and stochastic uncertainty. In terms of game players, the impact of the maneuver behaviors of the two players involved in the air confrontation on the decision-making outcome is commonly investigated. Li Zuolong et al. [4] proposed a hierarchical decision-making algorithm for over-the-horizon air combat based on deep reinforcement learning. Park Hyunju et al. [5] developed an automated maneuver strategy generation algorithm for UCAV air-to-air combat in line-of-sight conditions using a differential-game-theory-based approach. Zhang Jiandong et al. [6] constructed a multi-UAV cooperative air combat maneuver decision-making model based on multi-agent reinforcement learning, building on the study of 1v1 autonomous air combat maneuver decision-making. These studies mainly considered the influence of the game players’ maneuvering behavior on the decision-making results, but did not take into account the environmental uncertainty and the limitations of emergency situations in air combat attack–defense confrontation. In particular, they did not account for the bounded rationality of the decision-making players with respect to game strategy, which is a clear research gap given the complexity of the decision-making environment.
In terms of finite rationality, similar studies have mainly analyzed the strategy changes of the two players participating in the aerial confrontation under the premise of finite rationality. Zhao Minrui et al. [7] proposed a deep reinforcement learning (DRL)-based approach to solve the problem of limited rationality and insufficient intelligence in the operation of current autonomous decision-making systems. Yu Minggang et al. [8,9] constructed a cluster cooperation evolution model based on the multivariate public goods evolution game. They theoretically derived and characterized the conditions under which unmanned combat clusters’ strategies prevail, and provided the evolutionary dynamics process of clusters in the association structure. Hu Shiguang et al. [10] studied the game evolution of cooperative behavioral strategies of UAV swarms in communication-constrained environments. Gao Yifan et al. [11] proposed an evolutionary game-theory-based target assignment method for multi-UAV networks in 3D scenarios. Sheng Lei et al. [12] proposed a posture evolution game model for the UAV cluster dynamic attack and defense problem. These studies take into account the effect of limited rationality on the strategy choices of both players of the air game, but do not consider the effect of random uncertainty interference on the players’ strategies, which leaves the resulting decision-making with certain limitations.
In the area of stochastic uncertainty, related studies have focused on the problem of high dynamics and uncertainty in realistic aerial confrontation. Huang Changqiang et al. [13] constructed an autonomous maneuvering decision-making system, which used fuzzy logic to describe the highly dynamic and significant uncertainty of an air combat game, computing air combat posture through Bayesian theory to provide support for the intelligent decision-making of fighter aircraft. Wang Yuan et al. [14] presented an unmanned combat aerial vehicle (UCAV) maneuver decision-making method for solving the uncertainty caused by incomplete target information. Li Yiyuan et al. [15] proposed a decision-making method for the dynamic threat assessment of UAV targets based on intuitionistic fuzzy multi-attribute decision-making for the problem of uncertainty and dynamic change in multi-attribute information of UAV targets. Chen Bo et al. [16] proposed a dynamic decision-making method based on attribute weights and time weights by considering the fuzzy uncertainty of the target and the time factor of target information. Ren Zhi et al. [17] proposed a collaborative decision-making method based on dynamic games with incomplete information for generating maneuvering strategies for multiple UAVs in air combat. Cao Yuan et al. [18] proposed an unmanned combat UAV maneuvering decision-making algorithm that combines deep reinforcement learning and game theory, addressing the problem that unmanned combat UAVs, easily affected by complex factors in modern air combat, struggle to quickly and accurately perceive situational information and make maneuvering decisions autonomously.
These studies considered the interference of uncertain random factors on air combat decision-making, but did not address the influence of random interference factors on the dynamic process of decision-making of the game players from the perspective of the coordinated evolution of offensive and defensive confrontation, which gives the research in this area the potential for further expansion.
Furthermore, numerous scholars have proposed various theoretical approaches in the realm of behavioral decision-making. Kwangjin Yang et al. [19] introduced a novel aerial combat algorithm that extracts knowledge from human pilots’ experiences and employs behavior tree models as decision-making frameworks to guide aircraft during engagements. H. Mansikka et al. [20] utilized the critical decision method to enhance pilots’ tactical decision-making in air combat training. Shahzad Faizi et al. [21] proposed a multi-criteria group decision-making method based on Bonferroni and Heronian mean operators. Zhang Hongpeng et al. [22] developed a behavioral decision-making approach founded on deep reinforcement learning and Monte Carlo tree search, investigating autonomous decision-making in the absence of human knowledge or advantage functions.
The above literature on the aerial attack–defense game problem provides good ideas for solving the complex air operation decision-making problem, and has high theoretical and applied value. However, there are still some shortcomings. Firstly, the existing results are mainly based on the premise that the player has perfect rationality. However, battlefield situation information perception is uncertain and incomplete; the assumption of perfect rationality is not in line with the real situation. Secondly, at the present stage, most game studies on aerial attack–defense games assume symmetrical and equal capabilities of both game players. In reality, the decision-making of an aerial attack–defense game takes the form of asymmetric performance and value between the two game players in a complex environment, which makes existing research methods difficult to apply. Thirdly, existing research focuses on solving for the optimal strategy of the aerial attack–defense game, and there is a gap in research on the influence of key parameters on the dynamic process of game decision-making and the results of strategy selection.
Based on the above analysis, this paper is oriented toward the future development trend in aerial attack–defense games and mission requirements, considering the impact of the difference in performance and value between both game players and random uncertain environmental interference on the interaction and behavioral evolution of aerial attack–defense confrontation strategies, focusing on the strategy interaction stability and behavior decision-making evolution processes of both players in the air operation attack–defense game. The main contributions of this paper are:
(1)
The micro-evolutionary mechanism of the aerial attack–defense game in uncertain environments is studied. By constructing a mathematical game model of aerial attack–defense confrontation behavior, we analyze the role of multiple individual performance factors in strategy selection, explore the risk that random uncertainties in the complex battlefield environment pose to the game players’ maneuver decision-making, and uncover the micro-evolutionary mechanism by which asymmetric performance and random uncertain environmental disturbances shape the decision-making of aerial attack–defense confrontation behavior.
(2)
The difference in performance and value between both game players is analyzed, and the operational effectiveness evaluation system of the equipment is improved. On the basis of traditional operational effectiveness indices, dynamic factors such as time sensitivity and equipment load weight coefficients are introduced to further improve the operational effectiveness assessment system, which helps decision-makers to more accurately assess the effectiveness of equipment and optimize the allocation of resources in the complex and changing battlefield environment.
(3)
This study supports the construction of the aerial game mission decision-making response mechanism and game system model. The study not only focuses on the construction and analysis of the theoretical model, but also emphasizes its guiding significance for actual management practice: through an in-depth study of the influence of key variables on the behavior of the game players and the stable state of the system’s evolution, it provides a scientific basis and theoretical support for improving the decision-making and response mechanisms of aerial game missions in the future.

2. Evolutionary Game Modeling

2.1. Problem Description and Parameterization

In the process of aerial attack–defense game confrontation, the main players involved are the red and blue players. There are great differences between the two players in parameters such as performance and value. The strategic gains of both game players are not only disturbed by many factors (the probability of the aircraft finding the target, the probability of penetration survival of the equipment, the probability of hitting the target, the probability of destroying the target, the reaction time of the equipment, the number of bombs or strikes, the environment, and the value of the target), but are also affected by the asymmetric imbalance of the strategy combinations of the two players and by competitive conflict within the cluster. In order to analyze the influence of the deployment of new equipment on the evolution of the players’ air attack–defense confrontation behavior, this paper introduces the equipment operational effectiveness index system to establish a mathematical model of the players’ air attack–defense confrontation behavior and analyzes the players’ attack–defense evolution process. The operational effectiveness of equipment is primarily influenced by a combination of basic effectiveness, countermeasure effectiveness, and environmental factors [23,24,25]. The details are shown in Figure 1.
Considering the randomness of combat operations and the complexity of mission requirements, the paper does not consider the factors of availability and credibility, but focuses on the influence of the inherent ability, penetration survival ability, response ability, continuous strike ability, and environmental interference factors of equipment in the game process. Currently, the measurements of effectiveness indices are generally expressed by numerical characteristics with probability properties [26]. Therefore, based on the traditional effectiveness probability index, this paper introduces variables such as equipment reaction time sensitivity parameter, equipment load weight coefficient, and strategy gain correction factor to study the strategy evolution process of both players of the game. The specific relevant parameters and meanings involved in the paper are shown in Table 1.
Table 1 provides some key parameters and their corresponding meanings of the game players in the whole process of operation simulation. These parameters run through the whole operation process and affect every decision point of the air operation. Through the comprehensive consideration of these parameters, we can construct a multi-dimensional and dynamic interactive operation simulation mathematical model, which can not only simulate the performance of a single equipment system, but also evaluate the synergistic effect and strategic value of the whole operation system. Taking the relevant parameters of the blue player as an example, the calculation process of the factors influencing the red player is similar. The specific calculation process is as follows:
$$P_{Cb} = \prod_{i=1}^{4} P_{bi} \cdot P_{Tb} \cdot P_{Mb}, \qquad P_{Tb} = e^{-0.2 T_b}, \qquad P_{Mb} = e^{-10/M_b}$$
$$f = \begin{cases} \alpha, & \alpha > 1 \\ \beta, & \beta = 1 \\ \gamma, & 0 < \gamma < 1 \end{cases}$$
In the above formula, $P_{bi}$ $(i = 1, 2, 3, 4)$ represents the probabilities of the different stages in the whole process of equipment operation.
$P_{Tb}$ represents the equipment’s reaction-time sensitivity parameter. It reflects the adaptability and flexibility of the equipment system in the dynamic battlefield environment. The parameter generally takes an exponential form: the longer the reaction time required for the equipment to hit the target, the smaller the battlefield gain.
$P_{Mb}$ represents the equipment’s load weight coefficient, which describes the continuous strike capability of the equipment. The larger the parameter, the stronger the continuous strike capability of the equipment and the greater the battlefield gain.
$f$ represents the correction factor of strategy income, which is used to represent the influence of the superior or inferior situation caused by the unequal strategy choices of the two game players. Its purpose is to balance the rationality of the income caused by the non-equilibrium of the strategy selection of the two game players. $\alpha$ denotes the gain amplification coefficient of the situational-advantage strategy, $\beta$ denotes the gain balance coefficient of the situational-equivalence strategy, $\gamma$ denotes the gain reduction coefficient of the situational-disadvantage strategy, and $\alpha$, $\beta$, $\gamma$ should satisfy the equivalence relation $\alpha - \beta = \beta - \gamma$.
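As a concrete illustration of how these indices combine, the sketch below evaluates the composite effectiveness probability $P_{Cb}$ from assumed stage probabilities, reaction time, and load. All numerical values here are hypothetical, chosen only to show the shape of the calculation, and are not taken from the paper.

```python
import math

def composite_effectiveness(stage_probs, T, M):
    """Composite strike effectiveness: the product of the stage probabilities,
    scaled by the reaction-time sensitivity P_T = exp(-0.2*T) and the
    load weight coefficient P_M = exp(-10/M)."""
    P_T = math.exp(-0.2 * T)      # longer reaction time -> smaller gain
    P_M = math.exp(-10.0 / M)     # larger load -> stronger sustained strike
    return math.prod(stage_probs) * P_T * P_M

# Hypothetical values: four stage probabilities, reaction time T_b, load M_b
P_Cb = composite_effectiveness([0.9, 0.85, 0.8, 0.95], T=2.0, M=20)
print(round(P_Cb, 4))
```

Because both correction terms are exponentials, the composite index decreases monotonically in the reaction time and increases monotonically in the load, which matches the qualitative descriptions of $P_{Tb}$ and $P_{Mb}$ above.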

2.2. Model Assumptions and Payment Matrix

In the process of the aerial attack–defense game, the two players are equipped with different types of fire equipment: the red player, carrying new laser equipment, acts as the defense player, while the blue player, carrying traditional air-to-air missile weapons, acts as the attack player. During the confrontation, both players are within the effective range of each other’s energy output. The attacking and defending players are boundedly rational clusters with autonomous intelligent decision-making abilities, and their strategy choices follow certain dynamic evolution laws. Accordingly, the model of red–blue offense and defense confrontation is constructed under the following assumptions:
The blue player’s strategy set is $S_A = (S_{A1}, S_{A2})$, representing a coordinated attack and an independent attack strategy, respectively, with corresponding cluster choice probabilities $(y, 1-y)$. Correspondingly, the red player’s strategy set is $S_D = (S_{D1}, S_{D2})$, representing a coordinated defense and an independent defense strategy, respectively, with corresponding cluster choice probabilities $(x, 1-x)$, where $0 \le x, y \le 1$. The game tree formed in the process of attacking and defending is shown in Figure 2.
When the red player adopts an independent defense strategy, the blue player, as the attacking player, can choose between the two strategies of coordinated attack and independent attack. Due to the inequality of the behavioral strategies of the two players, the gains that the two subjects can obtain under different strategy combinations differ. Under the premise of not considering repeated strikes, the two cases are described as follows:
(1)
If the blue player chooses to attack independently, the two players are symmetric and equal at the strategy level, and the attack gain the blue player can achieve is $R_{b1} = \beta \mu P_{Cb} r_{r0}$; the costs incurred include $C_{b1}$ and $M_b C_{b3}$. The defense gain the red player can achieve is $r_{r1} = \beta \mu p_{cr} R_{b0}$; the costs incurred include $c_{r1}$ and $m_r c_{r3}$.
(2)
If the blue player chooses the coordinated attack strategy, the two players are asymmetric and unequal at the strategy level, and the blue player is in an advantageous position. The attack gain the blue player can obtain is $R_{b2} = \alpha P_{Cb} r_{r0}$; the costs incurred include $C_{b1}$, $C_{b2}$, and $M_b C_{b3}$. The defense gain the red player can obtain is $r_{r3} = \gamma \mu p_{cr} R_{b0}$; the costs incurred include $c_{r1}$ and $m_r c_{r3}$.
Similarly, when the red player adopts a coordinated defense strategy, the blue player as the attacking player can choose between the two strategies of coordinated attack and independent attack. Without considering repeated strikes, the two situations are discussed:
(1)
If the blue player chooses the independent attack strategy, the two players are asymmetric and unequal at the strategy level, and the attack gain the blue player can obtain is $R_{b3} = \gamma \mu P_{Cb} r_{r0}$; the costs incurred include $C_{b1}$ and $M_b C_{b3}$. Meanwhile, the red player is in a dominant position, and the defense gain the red player can obtain is $r_{r2} = \alpha p_{cr} R_{b0}$; the costs incurred include $c_{r1}$, $c_{r2}$, and $m_r c_{r3}$.
(2)
If the blue player chooses the coordinated attack strategy, the two players are symmetric and equal at the strategy level, and the attack gain the blue player can obtain is $R_{b4} = \beta P_{Cb} r_{r0}$; the costs incurred include $C_{b1}$, $C_{b2}$, and $M_b C_{b3}$. The defense gain the red player can obtain is $r_{r4} = \beta p_{cr} R_{b0}$; the costs incurred include $c_{r1}$, $c_{r2}$, and $m_r c_{r3}$.
Players participating in the game are mainly categorized as the attacker and the defender. However, the boundaries between the offensive and defensive players are not fixed in the actual process. Affected by the complex battlefield environment, the two players will undergo mutual conversion and attack–defense role reversal, resulting in a dynamic pursuit–evasion game process. Considering that the gains obtained by each player in the confrontation are the losses of the other, it is necessary to account for one’s own losses in the calculation of the gains. Combining the previous assumptions with the analysis of the players’ main strategies, the payoff matrix is established as shown in Table 2.
Let $U_{11}$, $U_{12}$, and $\overline{U_1}$ denote the expected gain of the red player choosing the coordinated defense strategy, the expected gain of the independent defense strategy, and the average expected gain, respectively. These are specified as follows:
$$U_{11} = y\,(r_{r0} + r_{r4} - c_{r1} - c_{r2} - m_r c_{r3} - R_{b4}) + (1-y)\,(r_{r0} + r_{r2} - c_{r1} - c_{r2} - m_r c_{r3} - R_{b3})$$
$$U_{12} = y\,(r_{r0} + r_{r3} - c_{r1} - m_r c_{r3} - R_{b2}) + (1-y)\,(r_{r0} + r_{r1} - c_{r1} - m_r c_{r3} - R_{b1})$$
$$\overline{U_1} = x U_{11} + (1-x) U_{12}$$
Let $U_{21}$, $U_{22}$, and $\overline{U_2}$ denote the expected gain of the blue player choosing the coordinated attack strategy, the expected gain of the independent attack strategy, and the average expected gain, respectively. These are specified as follows:
$$U_{21} = x\,(R_{b0} + R_{b4} - C_{b1} - C_{b2} - M_b C_{b3} - r_{r4}) + (1-x)\,(R_{b0} + R_{b2} - C_{b1} - C_{b2} - M_b C_{b3} - r_{r3})$$
$$U_{22} = x\,(R_{b0} + R_{b3} - C_{b1} - M_b C_{b3} - r_{r2}) + (1-x)\,(R_{b0} + R_{b1} - C_{b1} - M_b C_{b3} - r_{r1})$$
$$\overline{U_2} = y U_{21} + (1-y) U_{22}$$
Then, the replicator dynamic equations for the two game players are specified as follows:
$$F(x) = \frac{dx}{dt} = x(1-x)\left[ y\,(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) + r_{r2} - R_{b3} - r_{r1} - c_{r2} + R_{b1} \right]$$
$$F(y) = \frac{dy}{dt} = y(1-y)\left[ x\,(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) + R_{b2} - r_{r3} - R_{b1} + r_{r1} - C_{b2} \right]$$
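The deterministic strategy evolution described by Equations (5) and (6) can be traced numerically with a simple Euler iteration. The sketch below is a minimal illustration; the payoff and cost values in the parameter dictionary are purely hypothetical and are not taken from the paper.

```python
def replicator_step(x, y, p, h=0.01):
    """One Euler step of the replicator dynamics F(x), F(y).
    p holds the payoff terms R_b1..R_b4, r_r1..r_r4 and the costs c_r2, C_b2."""
    A1 = (p["rr4"] - p["Rb4"] - p["rr2"] + p["Rb3"]
          - p["rr3"] + p["Rb2"] + p["rr1"] - p["Rb1"])
    B1 = p["rr2"] - p["Rb3"] - p["rr1"] - p["cr2"] + p["Rb1"]
    A2 = (p["Rb4"] - p["rr4"] - p["Rb2"] + p["rr3"]
          - p["Rb3"] + p["rr2"] + p["Rb1"] - p["rr1"])
    B2 = p["Rb2"] - p["rr3"] - p["Rb1"] + p["rr1"] - p["Cb2"]
    Fx = x * (1 - x) * (y * A1 + B1)
    Fy = y * (1 - y) * (x * A2 + B2)
    return x + h * Fx, y + h * Fy

# Illustrative (hypothetical) parameter set
p = dict(Rb1=1.0, Rb2=1.6, Rb3=0.6, Rb4=1.2,
         rr1=0.9, rr2=1.5, rr3=0.5, rr4=1.1,
         cr2=0.3, Cb2=0.3)
x, y = 0.5, 0.5
for _ in range(5000):
    x, y = replicator_step(x, y, p)
print(round(x, 3), round(y, 3))
```

With this parameter set the bracketed drift terms are positive for both players, so the trajectory converges toward the coordinated strategies $(x, y) \to (1, 1)$; other payoff combinations drive the system toward different corners of the unit square.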

3. Stochastic Evolutionary Game Modeling

3.1. Stochastic Evolutionary Game

The aerial attack–defense game is a macroscopic, complex dynamic system; its battlefield environment has a high degree of complexity and uncertainty. Rapid changes in the surrounding environment, the unpredictability of the target state, and the existence of unexpected situations may greatly influence the behavioral strategy choices of the game players. However, classical evolutionary game theory lacks consideration of uncertainty interference [27] and is unable to describe the uncertainty and stochastic dynamics of complex battlefield environments. Therefore, this paper introduces Gaussian white noise [28,29] as a stochastic perturbation term and combines it with stochastic differential equations (SDEs) [30,31,32] to characterize the impact of uncertainties on the game players [33]. Considering $x, y \in [0,1]$, the factors $1-x$ and $1-y$ are also non-negative, $1-x, 1-y \in [0,1]$, and omitting them has no effect on the evolutionary outcome of the game system [27]. Combining the above factors, the replicator dynamic Equations (5) and (6) were simplified and improved as follows:
$$dx(t) = x(t)\left[ y\,(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) + r_{r2} - R_{b3} - r_{r1} - c_{r2} + R_{b1} \right] dt + \delta_1 x(t)\, d\omega(t)$$
$$dy(t) = y(t)\left[ x\,(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) + R_{b2} - r_{r3} - R_{b1} + r_{r1} - C_{b2} \right] dt + \delta_2 y(t)\, d\omega(t)$$
where $\omega(t)$ is standard one-dimensional Brownian motion, obeying the normal distribution $N(0, t)$; $d\omega(t)$ denotes Gaussian white noise, and its increment $\Delta\omega(t) = \omega(t+h) - \omega(t)$ obeys the normal distribution $N(0, h)$ when $t > 0$ and the step size $h > 0$; $\delta_1$ and $\delta_2$ are positive constants indicating the intensity of the random environmental disturbances.

3.2. Equilibrium Solving

Itô stochastic differential equations are a central concept in stochastic analysis, used to describe stochastic processes that vary over time in a random environment. Such an equation takes into account random fluctuations caused by Brownian motion or other stochastic processes; its general form is as follows [34,35]:
$$dX(t) = F(t, X(t))\,dt + G(t, X(t))\,d\omega(t)$$
where $t \in [t_0, T]$, $X(t_0) = X_0$, $X_0 \in \mathbb{R}$. $X(t)$ is a random variable dependent on $t$; $F(t, X(t))$ is the drift term, which represents the expected rate of change of the system in the absence of random disturbances; $G(t, X(t))$ is the diffusion term, which describes the intensity of the random disturbances to the system.
Accordingly, Equations (7) and (8) are SDEs. Since these Itô stochastic differential equations are nonlinear, analytical solutions cannot be derived directly.
For SDEs, the stochastic Taylor expansion is the basis of numerical solution [34,36]; it can be used to approximate the stochastic process over small time intervals. Let $h = (T - t_0)/N$ and $t_n = t_0 + nh$; a stochastic Taylor expansion of Equation (9) then yields the following:
$$X(t_{n+1}) = X(t_n) + \Gamma_0 F(X(t_n)) + \Gamma_1 G(X(t_n)) + \Gamma_{11} K_1 G(X(t_n)) + \Gamma_{00} K_0 F(X(t_n)) + R$$
where $\Gamma_0 = h$, $\Gamma_1 = \Delta\omega_n$, $\Gamma_{11} = (\Delta\omega_n^2 - h)/2$, $\Gamma_{00} = h^2/2$, $K_1 = G(X)\,\partial/\partial X$, $K_0 = F(X)\,\partial/\partial X + \frac{1}{2} G^2(X)\,\partial^2/\partial X^2$, and $R$ is the remainder term of the expansion.
The Euler method and the Milstein method are the two main techniques for numerically solving SDEs, and both are used in simulations to solve Itô processes or other types of stochastic processes. The Euler method is relatively simple and computationally efficient but has low accuracy. The Milstein method is more advanced: it introduces a second-order correction term into Euler’s method to improve the accuracy of the numerical solution.
Therefore, the Milstein method is used to solve SDEs. The specific forms are as follows [37]:
$$X(t_{n+1}) = X(t_n) + h F(X(t_n)) + \Delta\omega_n G(X(t_n)) + \tfrac{1}{2}(\Delta\omega_n^2 - h)\, G(X(t_n))\, G'(X(t_n))$$
Referring to Equation (11), Equations (7) and (8) are expanded and solved as follows:
$$x_{n+1} = x_n + x_n\left[ y_n (r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) + r_{r2} - R_{b3} - r_{r1} - c_{r2} + R_{b1} \right] h + \Delta\omega_n \delta_1 x_n + \tfrac{1}{2}(\Delta\omega_n^2 - h)\, \delta_1^2 x_n$$
$$y_{n+1} = y_n + y_n\left[ x_n (R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) + R_{b2} - r_{r3} - R_{b1} + r_{r1} - C_{b2} \right] h + \Delta\omega_n \delta_2 y_n + \tfrac{1}{2}(\Delta\omega_n^2 - h)\, \delta_2^2 y_n$$
According to Equations (12) and (13), the numerical solution of the differential Equations (7) and (8) can be realized, yielding the corresponding equilibrium solution of the attack–defense evolution.
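The Milstein iteration of Equations (12) and (13) can be sketched as follows. Since the diffusion term is $G(x) = \delta x$, the correction term uses $G\,G' = \delta^2 x$. The drift coefficients, noise intensities, and clamping of trajectories to $[0, 1]$ below are illustrative assumptions, not details fixed by the paper.

```python
import random

def milstein_sim(x0, y0, A1, B1, A2, B2, d1, d2, h=0.01, steps=2000, seed=1):
    """Simulate the stochastic replicator SDEs with the Milstein scheme.
    Drift: x*(y*A1 + B1) and y*(x*A2 + B2); diffusion: d1*x and d2*y."""
    rng = random.Random(seed)
    x, y = x0, y0
    for _ in range(steps):
        dw1 = rng.gauss(0.0, h ** 0.5)   # Brownian increments ~ N(0, h)
        dw2 = rng.gauss(0.0, h ** 0.5)
        fx = x * (y * A1 + B1)
        fy = y * (x * A2 + B2)
        x += fx * h + d1 * x * dw1 + 0.5 * d1 * d1 * x * (dw1 * dw1 - h)
        y += fy * h + d2 * y * dw2 + 0.5 * d2 * d2 * y * (dw2 * dw2 - h)
        x = min(max(x, 0.0), 1.0)        # keep trajectories in [0, 1]
        y = min(max(y, 0.0), 1.0)
    return x, y

# Hypothetical drift coefficients and noise intensities: negative drift
# pushes both strategies toward 0 despite the multiplicative noise
print(milstein_sim(0.5, 0.5, A1=0.0, B1=-0.7, A2=0.0, B2=-0.7, d1=0.2, d2=0.2))
```

Repeating the run with larger $\delta_1$, $\delta_2$ shows the qualitative effect discussed later in the paper: stronger random disturbance perturbs the trajectory and typically speeds up convergence toward an absorbing state.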

3.3. System Stability Analysis

When $x(t) = 0$ and $y(t) = 0$, Equations (7) and (8) admit at least the zero solution. This indicates that the system would always remain in a stable equilibrium state if there were no interference from random factors. However, that state is too ideal to be realized in reality. The game players are bound to experience varying degrees of interference from random factors inside and outside the system, which affects their decision-making and the stability of the system. In order to discuss the stability problem generated by stochastic perturbations, the strategy selection of the two game players is analyzed for stability according to the stability discrimination theorem for SDEs [35,38].
Let there be a continuously differentiable smooth function V ( t , x ) and positive constants c 1 , c 2 such that:
$$c_1 |x|^p \le V(t, x) \le c_2 |x|^p, \quad t \ge 0$$
(1)
If there is a positive constant $\lambda$ such that $LV(t,x) \le -\lambda V(t,x)$, $t \ge 0$, then the zero solution of the Itô stochastic differential Equation (10) is exponentially stable in the $p$-th moment, and $E|x(t, x_0)|^p < \frac{c_2}{c_1} |x_0|^p e^{-\lambda t}$ holds.
(2)
If there is a positive constant $\lambda$ such that $LV(t,x) \ge \lambda V(t,x)$, $t \ge 0$, then the zero solution of the Itô stochastic differential Equation (10) is unstable in the $p$-th moment, and $E|x(t, x_0)|^p \ge \frac{c_2}{c_1} |x_0|^p e^{\lambda t}$ holds.
Among them, $LV(t,x) = V_t(t,x) + V_x(t,x) F(t,x) + \frac{1}{2} G^2(t,x) V_{xx}(t,x)$.
For Equations (7) and (8), let V(t, x) = x, V(t, y) = y, 0 \le x, y \le 1, c_1 = c_2 = 1, p = 1, and \lambda = 1; then LV(t, x) = F(t, x), which gives:
LV(t, x) = [y(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) + r_{r2} - R_{b3} - r_{r1} - c_{r2} + R_{b1}]x  (15)
LV(t, y) = [x(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) + R_{b2} - r_{r3} - R_{b1} + r_{r1} - C_{b2}]y  (16)
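As a one-line check of the reduction used here (with the operator LV as defined following the stability theorem): since V(t, x) = x is linear in x and independent of t,

```latex
% V_t = 0, V_x = 1, V_xx = 0, so the diffusion term drops out:
LV(t,x) = \underbrace{V_t(t,x)}_{=\,0}
        + \underbrace{V_x(t,x)}_{=\,1}\, F(t,x)
        + \frac{1}{2}\, G^2(t,x)\, \underbrace{V_{xx}(t,x)}_{=\,0}
        = F(t,x)
```

and the same computation with V(t, y) = y gives LV(t, y) = F(t, y).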
If the zero-solution p-order moment indices of Equations (15) and (16) are to be stable, the following requirements need to be satisfied:
y(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) \le -1 - r_{r2} + R_{b3} + r_{r1} + c_{r2} - R_{b1}  (17)
x(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) \le -1 - R_{b2} + r_{r3} + R_{b1} - r_{r1} + C_{b2}  (18)
If the zero-solution p-order moment indices of Equations (15) and (16) are to be unstable, the following requirements need to be satisfied:
y(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}) \ge 1 - r_{r2} + R_{b3} + r_{r1} + c_{r2} - R_{b1}
x(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}) \ge 1 - R_{b2} + r_{r3} + R_{b1} - r_{r1} + C_{b2}
By simplifying Equations (17) and (18), respectively, we can obtain the conditions under which Equations (15) and (16) satisfy the stability of the zero-solution P-order moment index:
(a)
When r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1} > 0, then y \le (-1 - r_{r2} + R_{b3} + r_{r1} + c_{r2} - R_{b1})/(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}), and -1 + c_{r2} - r_{r4} + R_{b4} + r_{r3} - R_{b2} \ge 0.
(b)
When r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1} < 0, then y \ge (-1 - r_{r2} + R_{b3} + r_{r1} + c_{r2} - R_{b1})/(r_{r4} - R_{b4} - r_{r2} + R_{b3} - r_{r3} + R_{b2} + r_{r1} - R_{b1}), and -1 + c_{r2} - r_{r4} + R_{b4} + r_{r3} - R_{b2} \le 0.
(c)
When R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1} > 0, then x \le (-1 - R_{b2} + r_{r3} + R_{b1} - r_{r1} + C_{b2})/(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}), and -1 + C_{b2} - R_{b4} + r_{r4} + R_{b3} - r_{r2} \ge 0.
(d)
When R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1} < 0, then x \ge (-1 - R_{b2} + r_{r3} + R_{b1} - r_{r1} + C_{b2})/(R_{b4} - r_{r4} - R_{b2} + r_{r3} - R_{b3} + r_{r2} + R_{b1} - r_{r1}), and -1 + C_{b2} - R_{b4} + r_{r4} + R_{b3} - r_{r2} \le 0.
In summary, the stochastic evolutionary game system composed of the attacking and defending players satisfies the condition for zero-solution expected moment index stability as (a ∪ b) ∩ (c ∪ d). Similarly, the instability condition of the zero-solution expected moment index, which satisfies Equations (15) and (16), can be obtained. Here, (a ∪ b) and (c ∪ d) represent the unions of two sets of conditions. Specifically, a and b are conditions pertaining to the variable y, while c and d are conditions related to the variable x. The condition (a ∪ b) indicates that when the attacker's strategy y satisfies certain inequalities, the zero-solution p-order moment index of Equation (15) is stable, ensuring stability of the x-component of the system. Similarly, the condition (c ∪ d) signifies that when the defender's strategy x meets specific inequalities, the zero-solution p-order moment index of Equation (16) is stable, guaranteeing stability of the y-component of the system. When (a ∪ b) ∩ (c ∪ d) holds, both components of the system simultaneously satisfy at least one condition from each of these two sets. This yields four possible combinations of stability conditions: (a, c), (b, c), (a, d), and (b, d). These combined conditions ensure that, within these parameter ranges, the system as a whole converges to a stable state.
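The membership test behind these combined conditions can be checked mechanically. The sketch below (an illustration, not the authors' code; the placeholder key names stand in for the payoff entries r_{ri}, R_{bi} and the costs c_{r2}, C_{b2}, and λ = 1 as above) evaluates whether a given strategy pair (x, y) lies in the stable region of Equations (17) and (18):

```python
def stability_check(p, x, y, lam=1.0):
    """Evaluate the zero-solution stability conditions for Equations (15)-(16).

    Returns (x_component_stable, y_component_stable), i.e. whether
    LV(t,x) <= -lam*V and LV(t,y) <= -lam*V hold at the point (x, y).
    """
    A = p["rr4"] - p["Rb4"] - p["rr2"] + p["Rb3"] - p["rr3"] + p["Rb2"] + p["rr1"] - p["Rb1"]
    B = p["rr2"] - p["Rb3"] - p["rr1"] - p["cr2"] + p["Rb1"]
    C = p["Rb4"] - p["rr4"] - p["Rb2"] + p["rr3"] - p["Rb3"] + p["rr2"] + p["Rb1"] - p["rr1"]
    D = p["Rb2"] - p["rr3"] - p["Rb1"] + p["rr1"] - p["Cb2"]
    # Stability of (17): y*A + B <= -lam; stability of (18): x*C + D <= -lam
    return (y * A + B <= -lam, x * C + D <= -lam)
```

A parameter set satisfies the combined condition (a ∪ b) ∩ (c ∪ d) for all strategy pairs precisely when both components return true over the whole unit square, which can be verified at the corner points since the two expressions are linear in x and y.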

4. Numerical Simulation Analysis

4.1. System Stability Verification

In order to confirm the correctness of the above theoretical derivation and verify the validity of the model, the decision-making evolution process of the two sides of the game is simulated and analyzed in Matlab 2023b (MathWorks, Natick, MA, USA). Before the numerical simulation, the parameters must be initialized according to the system stability constraints, so the following values are chosen on the basis of those constraints. The initial selection probability of both game players is set to 0.5, the intensity coefficients of the random environmental disturbances are taken as \delta_i = 0.5, 1, 2, the simulation step size is h = 0.001, and the other parameters are set to values that satisfy the stability conditions of Equations (17) and (18). The specific values of the relevant parameters are: R_{b0} = 100, P_{b1} = P_{b2} = P_{b3} = P_{b4} = 0.8, T_b = 3, M_b = 6, C_{b2} = 10; r_{r0} = 100, p_{r1} = p_{r3} = 0.8, p_{r2} = 1, p_{r4} = 0.5, t_r = 1, m_r = 50, c_{r2} = 10; \alpha = 1.2, \gamma = 0.8, \mu = 0.8. Figure 3 shows the dynamic evolution of the game players' strategies under different interference intensities.
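The Matlab experiment can be reproduced in outline with a short Python script. This is a hedged sketch: the payoff entries below are hypothetical stand-ins, since the mapping from the primitives listed above (R_{b0}, P_{bi}, T_b, and so on) to the payoff matrix is defined in the earlier sections of the paper; the values are chosen only so that the stability conditions (17) and (18) hold.

```python
import numpy as np

def simulate(p, delta, h=1e-3, steps=5000, x0=0.5, y0=0.5, seed=0):
    """Milstein simulation of Equations (12)-(13); returns both trajectories."""
    rng = np.random.default_rng(seed)
    A = p["rr4"] - p["Rb4"] - p["rr2"] + p["Rb3"] - p["rr3"] + p["Rb2"] + p["rr1"] - p["Rb1"]
    B = p["rr2"] - p["Rb3"] - p["rr1"] - p["cr2"] + p["Rb1"]
    C = p["Rb4"] - p["rr4"] - p["Rb2"] + p["rr3"] - p["Rb3"] + p["rr2"] + p["Rb1"] - p["rr1"]
    D = p["Rb2"] - p["rr3"] - p["Rb1"] + p["rr1"] - p["Cb2"]
    x, y = x0, y0
    xs, ys = [x], [y]
    for _ in range(steps):
        w1, w2 = rng.normal(0.0, np.sqrt(h), size=2)
        fx = x * (y * A + B)  # drift bracket of Equation (15)
        fy = y * (x * C + D)  # drift bracket of Equation (16)
        x = x + fx * h + w1 * delta * x + 0.5 * (w1**2 - h) * delta**2 * x
        y = y + fy * h + w2 * delta * y + 0.5 * (w2**2 - h) * delta**2 * y
        x = min(max(x, 0.0), 1.0)  # keep probabilities in [0, 1]
        y = min(max(y, 0.0), 1.0)
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

# Illustrative payoff entries (hypothetical) satisfying conditions (17)-(18);
# sweep the interference intensities of Figure 3.
params = dict(rr1=5, rr2=8, rr3=4, rr4=6, Rb1=6, Rb2=7, Rb3=5, Rb4=9, cr2=10, Cb2=10)
trajectories = {delta: simulate(params, delta) for delta in (0.5, 1.0, 2.0)}
```

Plotting the three pairs of trajectories against the step index reproduces the qualitative picture of Figure 3: convergence to the equilibrium, with larger δ producing visibly larger swings along the way.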
As can be seen from the figure, the two game players reach a stable equilibrium state under the initial conditions, and the strategy evolution processes of both players show a certain degree of volatility under the different stochastic environmental interference intensities. In particular, the greater the interference intensity, the less accurately the decision-maker can predict changes in the environment, and the larger the decision-making swing. The stability exhibited by this model demonstrates its efficacy in describing the decision-making behavior of both parties in the game and accurately predicting their ultimate strategy choices. Moreover, the volatility observed in the model reveals the system's dynamics and complexities, while also reflecting the impact of environmental uncertainties on the decision-making process. These findings not only validate the accuracy of the aforementioned theoretical derivations but also confirm the model's effectiveness.
From a mathematical perspective, the stability of the system is influenced by random components. As the intensity of random environmental interferences increases, the impact of these random components on the system strengthens, leading to greater fluctuations in the system’s state. Such fluctuations may cause the system to deviate from its stable equilibrium point, thereby affecting the strategic choices of decision-makers. The system’s stability primarily depends on the zero-solution exponential stability of the stochastic differential equation, implying that under the influence of random perturbations, the system can gradually converge to a stable state without divergence. By observing the evolutionary trends in strategies under various interference intensities, we find that although the strategy evolution process exhibits greater volatility and more complex dynamic characteristics under higher interference intensities—potentially corresponding to “critical point” phenomena in real decision-making environments, where decision behaviors may undergo qualitative changes when uncertainty exceeds a certain threshold—the system ultimately converges to a stable equilibrium state. This indicates that the system possesses strong robustness, capable of resisting uncertainties brought about by external environmental changes within a certain range, demonstrating good dynamic stability.
From the perspective of the system’s evolutionary process, we observe that the game players exhibit a degree of volatility under different random environmental interference intensities, yet the entire evolutionary process maintains strong continuity. This continuity is driven by the players’ continuous adjustment of strategies in response to environmental changes and the opponent’s strategic shifts. It is primarily manifested in the fact that the decision evolution of both players depends not only on their current strategic choices but also on changes in external environmental interference intensities.
On the one hand, random environmental interferences introduce uncertainty, causing fluctuations in the strategy evolution process. From an overall perspective, however, both players tend to evolve towards a stable equilibrium state, reflecting the system's good continuity during the evolutionary process. This means that small initial differences do not lead to significant behavioral changes in the game players. Nevertheless, as the intensity of environmental disturbances increases, subtle changes begin to appear in this continuity. Larger disturbances may lead to greater oscillations in decision-makers' strategies, demonstrating increased uncertainty. This is because accurate prediction becomes difficult in highly uncertain environments, which affects the continuity of strategy selection.
On the other hand, the intensification of environmental interferences does not linearly affect the strategy evolution process but presents a nonlinear dynamic response. Under low-intensity interferences, the strategy adjustment process is relatively smooth, with the decision-making process showing high continuity along the time axis. In contrast, under high-intensity interferences, strategy adjustments become more drastic, impacting system continuity. The evolutionary trajectory exhibits fluctuations and jumps, indicating that the decision-making processes of both players are significantly disturbed in high-intensity random environments, resulting in reduced continuity. This may be because under weaker interferences, the system can better adapt and adjust strategies, while under stronger interferences, the system may need to undergo larger-scale adjustments to cope with environmental changes, exhibiting more complex dynamic behaviors. This leads to fluctuations and jumps in strategy evolution, but these fluctuations represent dynamic adjustments made by decision strategies to adapt to environmental changes, without breaking the overall continuity of system evolution.
In summary, despite environmental interferences, the strategy evolution process of both game players maintains a degree of continuity. However, when faced with intense interferences, this continuity may be locally disrupted, with decision-makers’ reactions becoming more sensitive, manifested as large short-term fluctuations in system strategies. Nevertheless, after multiple adjustments and adaptations, the system gradually converges to a stable equilibrium state under the disturbed environment. This also suggests that when designing practical strategies or models, we need to fully consider environmental uncertainties and their impact on strategy continuity.

4.2. System Sensitivity Analysis

The selection of parameters is based on the initial assignment, and the intensity coefficient of random environmental interference is uniformly taken as \delta_i = 0.5. Combined with the stochastic differential equations, we analyze reasonable values of each parameter within the constraint range. Since the stochastic differential equations of the two game players are similar in mathematical form, only one side needs to be analyzed; we take the red side as an example, and the analysis of the blue side's factors is similar.
(1)
Basic probability of equipment operation in the total process, p_{ri}.
Under the condition that the initial values of the other parameters remain unchanged, only the value of the basic probability of equipment operation p_{ri} is varied, and the influence of different values of p_{ri} on the game evolution result is analyzed. Taking the laser equipment damage probability p_{r4} as an example, let p_{r4} take the values 0, 0.25, 0.5, 0.75, and 1, respectively; the simulation results are shown in Figure 4, where the dashed lines indicate the deterministic evolution curves and the solid lines indicate the stochastic evolution curves.
As can be seen from the figure, when p_{r4} is small, the strategy evolution of both game players tends toward the independent strategy. Stochastic environmental disturbances have a small effect on the strategy evolution, only causing a certain degree of interference and affecting the rate of strategy evolution. The reason for this phenomenon is that the red player's equipment has a low probability of damaging the blue target, so the red cluster cannot obtain a battlefield advantage. In order to reduce the risk of its own mass destruction caused by cluster coordinated operation, it prefers the independent strategy, while the blue cluster has the advantage in this case and prefers to act alone to avoid a decrease in revenue.
As the value of p_{r4} increases, the red cluster tends to choose the coordinated strategy and its convergence speed accelerates, while the blue cluster still tends to choose the independent strategy. As the damage probability to blue targets from red equipment increases, the red cluster gradually begins to gain a battlefield advantage, and the potential benefits of coordinated operations increase, making the red subjects more willing to adopt coordinated tactics in order to maximize the effectiveness of their equipment systems. When the value of p_{r4} is largest, the blue cluster shows a tendency to evolve toward the coordinated strategy. However, due to the interference of random environmental factors, the strategy selection process exhibits decision swing. This may reflect the fact that, in actual air operations, even in the face of a high opposing damage efficiency, the blue cluster may not be fully inclined to engage in coordinated operations because of tactical flexibility, risk aversion, or other strategic considerations.
The decision-making process not only reflects the strategic choices of both players under varying conditions but also unveils the intricate tactical considerations inherent in aerial operations. The evolutionary curves reveal that this process is far from a simple linear progression; instead, it exhibits characteristics of continuity, fluctuation, and distinct phases. These nonlinear features underscore the uncertainty and complexity of aerial operation, manifesting not only in strategic shifts but also in the accelerated convergence of strategies. As the value of p_{r4} incrementally increases, the red player's strategy demonstrates a gradual and continuous transition from independent to collaborative action. This shift highlights a growing flexibility and adaptability in strategic selection, reflecting the red player's tactical inclination to decisively adopt aggressive maneuvers when gaining an advantage. Conversely, while the blue player's strategy only begins to show a collaborative trend after p_{r4} reaches a certain threshold, the oscillations in its decision-making process reveal a continuous evolution in strategic choice. This fluidity suggests an ongoing calibration between tactical flexibility and risk aversion, potentially indicating a delicate balance between preserving strength through independent action and enhancing survival probability through collaboration when faced with extreme threats. Indeed, the strategic evolution of both game players is not characterized by abrupt mutations but rather by continuous dynamic changes. This continuity not only reflects the adaptability of both players in aerial operational environments but also elucidates how, under varying probabilities of destruction, game players adjust their strategies to counter opposing threats, thereby maximizing their operational effectiveness and survival probability.
(2)
Equipment reaction time, t_r.
Under the condition that the initial values of the other parameters remain unchanged, only the value of the equipment reaction time t_r is varied, and the influence of different values of t_r on the game evolution results is analyzed. Let t_r take the values 0, 0.5, 1, 5, and 10, respectively; the simulation results are shown in Figure 5.
As can be seen from the figure, the value of t_r does not affect the evolution of the behavioral decisions of the blue cluster, but it has a great impact on the behavioral decisions of the red cluster. The reason may be that the tactical choices and decision-making process of the blue cluster are less sensitive to the reaction time of the red side's equipment, or that the blue side has already adopted a strategy adapted to different reaction times.
When the value of t_r is small, the red cluster tends to choose the coordinated strategy, and random environmental interference has a greater impact on the strategy evolution of the game cluster. This may reflect the fact that in actual air operations, the quick-reaction strike capability of the equipment allows the red cluster to work together more efficiently and effectively and gain an advantage in the confrontation, thus achieving greater battlefield gains. However, laser equipment is susceptible to atmospheric conditions, such as atmospheric energy absorption, attenuation, thermal blooming, and turbulence, which reduce the game cluster's strategic gains; therefore, its strategy selection fluctuates more.
As the value of t_r gradually increases, the red cluster tends to choose the independent strategy, the convergence speed accelerates, and the influence of random environmental disturbances on the strategy evolution of the game cluster becomes smaller. This may be because a longer equipment reaction time increases the uncertainty and risk of coordinated action, causing the red cluster to prefer independent action and reduce its reliance on coordination.
In the aforementioned analysis, we observe that variations in equipment reaction time t_r exert significantly different impacts on the strategic choices of both red and blue players. This disparity underscores the critical importance of time sensitivity in tactical decision-making. Notably, the simulation results reveal profound effects of differing equipment response times on the strategic choices of the red player. Specifically, as t_r fluctuates, the red player's decision-making process between cooperative and independent strategies reflects its adaptive adjustments to the operational environment and the adversary's tactics. This process not only demonstrates the red player's flexibility and adaptability under varying tactical conditions but also highlights how temporal factors emerge as crucial sensitive variables influencing decision-making within a competitive environment.
(3)
Sustained strike capability of the equipment, m_r.
Under the condition that the initial values of the other parameters remain unchanged, only the value of the sustained strike capability m_r of the equipment is varied, and the influence of different values of m_r on the game evolution result is analyzed. Let m_r take the values 0, 10, 50, 100, and 300, respectively; the simulation results are shown in Figure 6. As can be seen from the figure, the value of m_r does not affect the evolution result of the blue cluster's behavioral decision-making, but it has a great influence on the behavioral decision-making of the red cluster. The sustained strike capability of the equipment is an important index of the equipment system's ability to continuously destroy or suppress a target within a certain period of time, and it directly affects the strategy selection, operational efficiency, and battlefield control ability of the game players. Therefore, the sustained strike capability of the laser equipment directly affects the strategy evolution of the red cluster. In addition, the blue side's tactical choices and decision-making process may be less sensitive to the red side's sustained strike capability, relying instead on its own more robust or diversified tactical system, focusing on defensive flexibility, intelligence collection and electronic warfare capabilities, or rapid mobility to respond to energy output threats of different intensities.
When the value of m_r is small, the red cluster tends to choose the independent strategy. This may be because a small sustained strike capability means that the red cluster's laser equipment needs a longer cooling time after each attack, which limits its ability to fight continuously and respond quickly. In this case, the red cluster is more inclined to adopt the independent strategy in order to spread risk, avoid a concentration of forces that would deplete operational effectiveness in a short period of time and make continued confrontation impossible, or use its limited strike capability more effectively in independent operations.
With the gradual increase in m_r, the red cluster gradually tends to choose the coordinated strategy. However, due to the interference of random environmental factors, the strategy selection process exhibits decision swing. As the sustained strike capability of the equipment increases, the continuous operation capability of the red equipment is enhanced, providing more continuous and high-intensity fire coverage and creating conditions for coordinated operations. The red cluster can then better coordinate its actions for a complementary effect, and the concentrated energy output puts greater pressure on the enemy, achieving more effective battlefield control and target suppression. However, random environmental factors may lead to inaccurate battlefield situation assessment and temporarily reduce the efficiency of coordinated operations, prompting the red cluster to balance between independent and coordinated strategies and leading to fluctuations in strategy selection.
In the aforementioned analysis, the strategic choices of the red player reflect its sensitivity and adaptability to changes in sustained strike capability. Specifically, under conditions of limited energy output resources, independent operation can maximize the utilization of existing resources and reduce reliance on coordinated operation, demonstrating the system’s self-preservation mechanism under resource constraints. Conversely, when energy output resources are abundant, the red player transitions from an independent strategy to a coordinated one, favoring the use of enhanced energy output for concentrated and sustained strikes, aiming to achieve higher operational effectiveness and battlefield control. This strategic shift not only illustrates the red player’s real-time assessment of the operational environment and its own capabilities but also reflects its adaptive adjustments in optimizing battlefield resources and capabilities, revealing its flexible adjustment capabilities and complex decision-making mechanisms under different tactical demands.
Furthermore, the oscillation observed in the decision-making process indicates that while pursuing maximum operational effectiveness, the red player must also fully consider the risks brought by the dynamic changes in the battlefield environment. It also reminds players that they must maintain the flexibility and adaptability of their strategies. The decision-making process of the players in the game is not merely a direct response to equipment performance but also involves the real-time assessment of the battlefield situation and anticipation of the opponent’s strategy. The complexity and dynamism of this decision-making process emphasize that in confrontation games, decision-making relies not only on the enhancement of one’s own capabilities but also on keen perception and flexible response to environmental changes and opponent actions.
(4)
The inherent value of the target, r_{r0}.
Under the condition that the initial values of the other parameters remain unchanged, only the inherent value of the target r_{r0} is varied, and the influence of different values of r_{r0} on the game evolution result is analyzed. Let r_{r0} take the values 10, 100, 200, 500, and 800, respectively; the simulation results are shown in Figure 7. As can be seen from the figure, the value of r_{r0} has a great influence on the behavioral decision-making of both the red and blue players. The random environmental interference factors have little influence on the strategy evolution of the game players, only causing a certain degree of interference and affecting the rate of strategy evolution.
When the value of r_{r0} is small, both the red and blue clusters tend to choose the independent strategy. The reason may be that in actual air operations, if the strategic value of the target is not high, the acquisition or destruction of the low-value target has little impact on the overall war situation and is not worth the risk and cost of coordinated operation. Both parties may conclude that it is not necessary to invest substantial resources in coordinated operation.
With the gradual increase in r_{r0}, both the red and blue clusters tend to choose the coordinated strategy, and the convergence speed accelerates. This may be because as the strategic value of the target increases, the destruction or capture of high-value targets has a significant impact on the outcome of the war. In actual air operations, high-value targets may require more accurate and more concentrated strikes. Both parties recognize the importance of coordinated operations, which prompts them to adjust their strategies more quickly, reach a consensus more efficiently, and prefer coordinated operations to ensure mission success. In addition, high-value targets may prompt both parties to allocate resources more effectively to ensure maximum operational effectiveness and improve the accuracy and success rate of strikes.
The analysis presented herein elucidates the decision-making processes of both red and blue players under varying inherent value returns of targets, reflecting the sensitivity of the game players to the target value. It reveals a significant correlation between the target value and the willingness to choose a cooperative strategy. Specifically, in the face of low-value targets, the incentives for collaboration are insufficient; each player prioritizes cost-effectiveness and risk management, tending towards autonomous decision-making to mitigate risks and expenses. As the target value increases, the destruction or capture of high-value targets exerts a pronounced influence on the operational outcome. The potential benefits derived from cooperation begin to outweigh the risks and costs associated with independent actions, prompting a strategic shift towards collaborative operations. This transition emphasizes the concentration of resources and enhancement of operational effectiveness, underscoring the increasing demand for the cooperative engagement of high-value targets, as well as a heightened recognition of the importance of such collaboration among the involved players. This process not only aids in understanding the motivations underlying strategic choices but also provides theoretical guidance for practical operation deployments. It suggests that in varying strategic contexts, optimizing resource allocation and operational modalities through the precise assessments of objective value can effectively achieve strategic aims.
(5)
Network collaboration cost, c_{r2}.
Under the condition that the initial values of the other parameters remain unchanged, only the value of the network collaboration cost c_{r2} is varied, and the influence of different values of c_{r2} on the game evolution result is analyzed. Let c_{r2} take the values 0, 5, 10, 30, and 80, respectively; the simulation results are shown in Figure 8. As can be seen from the figure, the value of c_{r2} does not affect the evolution result of the blue cluster's behavioral decision-making, but it has a great influence on the behavioral decision-making of the red cluster. The random environmental interference factors have little influence on the evolution of the game players' strategies, only causing a certain degree of interference, which affects the rate of strategy evolution. This is because the network coordination cost of the red cluster only affects the strategic evolution of the red cluster, not that of the blue cluster.
When the value of c_{r2} is small, the red cluster tends to choose the coordinated strategy. In this case, the additional cost of coordinated operation is relatively small, and the overall benefits of coordinated operation may exceed its cost. Compared with the potential strategic advantages brought by collaboration, such as information sharing and fire concentration, the network collaboration cost is acceptable. Therefore, the red cluster is more inclined to choose a coordinated strategy to enhance the overall operational effectiveness.
As c_{r2} gradually increases, the red cluster gradually tends to choose the independent strategy. As the cost of network collaboration increases, the additional resource consumption required for collaboration, such as communication resources, command and control costs, and exposure risks, becomes more significant, prompting the red cluster to evaluate the cost-effectiveness of independent action and gradually turn to independent strategies to save costs and reduce exposure risks.
The aforementioned analysis elucidates the strategic choices made by game players in a dynamic environment, based on a careful assessment of costs and benefits. When weighing the pros and cons of coordinated versus independent operations, the red player adjusts their decision-making inclinations based on the actual costs associated with networked collaboration. When the benefits of information sharing and enhanced operational effectiveness from coordinated operation significantly outweigh the costs, the red player is more inclined to opt for coordinated strategies to maximize overall benefits. Conversely, should the costs of coordination surpass the strategic benefits it confers, the red player shifts to independent strategies to avoid high resource consumption and exposure risks. The essence of this decision-making process lies in the trade-off between resource consumption, benefits, and risks, which reflects the player’s adaptive capacity in complex environments and the evolutionary game mechanism that optimizes strategies through cost–benefit analysis. This demonstrates the player’s ability to flexibly adjust strategies based on external environmental factors and cost considerations.
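The qualitative effect of the collaboration cost described above can be illustrated with a drift-only (δ_i = 0) iteration of Equations (7) and (8). This is a hypothetical sketch: the payoff entries below are illustrative stand-ins, since the mapping from c_{r2} and the other primitives to the payoff matrix is defined in the earlier sections of the paper; only the placement of c_{r2} inside the drift bracket follows Equation (15).

```python
def equilibrium_tendency(p, h=1e-3, steps=20000, x0=0.5, y0=0.5):
    """Deterministic (delta_i = 0) evolution of the drift in Equations (7)-(8);
    returns the final (x, y) as the evolutionary tendency."""
    A = p["rr4"] - p["Rb4"] - p["rr2"] + p["Rb3"] - p["rr3"] + p["Rb2"] + p["rr1"] - p["Rb1"]
    B = p["rr2"] - p["Rb3"] - p["rr1"] - p["cr2"] + p["Rb1"]
    C = p["Rb4"] - p["rr4"] - p["Rb2"] + p["rr3"] - p["Rb3"] + p["rr2"] + p["Rb1"] - p["rr1"]
    D = p["Rb2"] - p["rr3"] - p["Rb1"] + p["rr1"] - p["Cb2"]
    x, y = x0, y0
    for _ in range(steps):
        fx, fy = x * (y * A + B), y * (x * C + D)
        x = min(max(x + fx * h, 0.0), 1.0)  # clamp probabilities to [0, 1]
        y = min(max(y + fy * h, 0.0), 1.0)
    return x, y

# Hypothetical payoff entries; sweep c_r2 over the values used for Figure 8.
base = dict(rr1=5, rr2=8, rr3=4, rr4=6, Rb1=6, Rb2=7, Rb3=5, Rb4=9, Cb2=10)
results = {cr2: equilibrium_tendency(dict(base, cr2=cr2)) for cr2 in (0, 5, 10, 30, 80)}
```

With these illustrative numbers, a zero collaboration cost drives the red strategy probability x towards 1 (coordinated), while a large cost drives it towards 0 (independent), mirroring the trend reported for Figure 8; the blue side's tendency is unaffected by c_{r2} except through the red trajectory.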
Based on the above analysis, it can be seen that the evolution of the system is disturbed by random factors in the external environment: the greater the intensity of random environmental interference, the less accurately the game players can predict changes in the environment, and the greater the fluctuation range of their decision-making during strategy evolution. Random environmental interference usually accelerates the strategy evolution rate of the game players, prompting both players to reach equilibrium as soon as possible. In addition, the basic probability of equipment operation in the total process, equipment reaction time, equipment sustained strike capability, target inherent value, and network coordination cost all have different effects on the evolution rate and outcome of the air attack–defense game system. The system decision-making process is a dynamic, multi-factor process in which each game player chooses the optimal strategy according to its own cost–benefit analysis. This process depends not only on the choice of parameters but also on the speed of strategy adjustment and the responsiveness to environmental changes. Therefore, in actual operations, decision-makers need to understand how the various parameters regulate the decision-making process of the game players and comprehensively consider the changes that the many factors cause in the decision-making evolution of both game players.

5. Conclusions

To address the problem of random environmental interference in the strategy interaction and behavioral evolution of an aerial attack–defense game in uncertain environments, this paper discussed the influence of differences in performance and value between the two game players on strategy evolution and studied the stability of strategy interaction and the behavioral decision-making evolution of both players. From the analysis of the simulation results, the following conclusions can be drawn:
(1)
The performance- and value-related index factors of the two players strongly influence the trend of strategy evolution, changing not only the outcome of strategy selection but also the rate of evolution. In terms of outcomes, parameters such as the basic probability of equipment operation over the total process and the target's inherent value significantly affect the strategy evolution results of both players and can even reverse their strategy selection. By contrast, parameters such as equipment reaction time, sustained strike capability, and network coordination cost mainly affect the player that owns the parameter and have little effect on the other player. In terms of evolution rate, the basic probability of equipment operation over the total process slows the evolution of both players; equipment reaction time, sustained strike capability, and target inherent value relatively slow the evolution of the parameter owner; and equipment reaction time, sustained strike capability, and network coordination cost relatively accelerate the evolution of the other player.
(2)
Random environmental factors interfere to a certain degree with the players' strategy evolution: they usually speed up the rate of evolution, push the players to reach equilibrium sooner, and strongly shape the evolution of the game strategies. In particular, at the transition moment when a strategy is about to change, random environmental factors cause sharp fluctuations in strategy choice, seriously affecting the player's selection.
(3)
Traditional evolutionary game theory lacks a description of the random dynamic environment and therefore cannot capture the uncertainty and random dynamics of a complex battlefield. The stochastic evolutionary game model describes the random external environment better and characterizes the evolution of the players' behavior strategies more accurately, giving it stronger universality and practicability.
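The acceleration effect stated in conclusion (2) can be probed empirically with a one-dimensional sketch. The drift gap delta, the multiplicative noise form, and the thresholds below are illustrative assumptions rather than the paper's model; the sketch only shows how first-passage times to a neighborhood of equilibrium might be compared with and without environmental noise.

```python
import numpy as np

def hitting_time(sigma, delta=-0.2, x0=0.3, eps=0.05, dt=0.01,
                 max_steps=20000, seed=0):
    """Time until the coordination share x first falls below eps.

    One-dimensional sketch: dx = delta * x(1-x) dt + sigma * x(1-x) dW,
    where delta is an assumed (negative) net payoff gap favouring the
    independent strategy. Returns np.inf if eps is never reached.
    """
    rng = np.random.default_rng(seed)
    x = x0
    for step in range(max_steps):
        if x < eps:
            return step * dt
        dW = rng.normal(0.0, np.sqrt(dt))
        x += delta * x * (1 - x) * dt + sigma * x * (1 - x) * dW
        x = min(max(x, 0.0), 1.0)
    return np.inf

# Compare the deterministic passage time with the noisy ensemble average.
t_det = hitting_time(sigma=0.0)
t_noisy = [hitting_time(sigma=0.4, seed=s) for s in range(50)]
print(t_det, np.mean(t_noisy))
```

Averaging the noisy passage times over many seeds and comparing them with the deterministic value is one simple way to quantify how strongly a given noise intensity perturbs the approach to equilibrium in such a model.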
This research can provide theoretical support and a reference for improving the mission decision-making response mechanism and system-model construction of aerial games. The model nevertheless has limitations: the simplification of environmental influences, the assumptions about the players' behavioral strategies, the diversity and complexity of strategy evolution, and the sensitivity of the model parameters are all issues that future research may face. In subsequent work, we will focus on the influence of further significant factors, such as equipment availability and credibility, on the choice of air-operation behavior strategies; combine complex networks to explore how factors such as the diversity of strategy selection, the diversity of game players, and multi-stage dynamic strategies shape the evolution of the players' behavior strategies; and work to establish and refine the mathematical model of the players' strategy selection over the mission process.

Author Contributions

Conceptualization, S.H. and L.R.; methodology, S.H.; software, S.H., Z.W. and B.L.; validation, Z.W., B.L. and W.W.; formal analysis, S.H., H.X. and Z.W.; investigation, S.H. and L.R.; resources, L.R.; data curation, S.H. and L.R.; writing—original draft preparation, S.H.; writing—review and editing, L.R., Z.W. and H.X.; visualization, S.H. and Z.W.; supervision, W.W. and H.X.; project administration, L.R.; funding acquisition, L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Current research is limited to the application and extension of game theory, which benefits academic and theoretical research and poses no threat to public health or national security. The authors acknowledge the dual-use potential of research involving aerial attack–defense games and confirm that all necessary precautions have been taken to prevent potential misuse. As an ethical responsibility, the authors strictly adhere to relevant national and international laws concerning dual-use research of concern (DURC) and advocate responsible deployment, ethical consideration, regulatory compliance, and transparent reporting to mitigate misuse risks and foster beneficial outcomes.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to thank everyone who provided comments on this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Equipment operational effectiveness index system.
Figure 2. The game tree of air combat attack–defense confrontation.
Figure 3. Evolution stabilization strategies of both game players. (a) Trends in the evolution of red player’s strategy; (b) Trends in the evolution of blue player’s strategy.
Figure 4. Impact of p_ri on the evolution of the game players' behavioral decisions. (a) Impact of p_ri on the evolution of behavioral decisions in the red cluster; (b) impact of p_ri on the evolution of behavioral decisions in the blue cluster.
Figure 5. Impact of t_r on the evolution of the game players' behavioral decisions. (a) Impact of t_r on the evolution of behavioral decisions in the red cluster; (b) impact of t_r on the evolution of behavioral decisions in the blue cluster.
Figure 6. Impact of m_r on the evolution of the game players' behavioral decisions. (a) Impact of m_r on the evolution of behavioral decisions in the red cluster; (b) impact of m_r on the evolution of behavioral decisions in the blue cluster.
Figure 7. Impact of r_r0 on the evolution of the game players' behavioral decisions. (a) Impact of r_r0 on the evolution of behavioral decisions in the red cluster; (b) impact of r_r0 on the evolution of behavioral decisions in the blue cluster.
Figure 8. Impact of c_r2 on the evolution of the game players' behavioral decisions. (a) Impact of c_r2 on the evolution of behavioral decisions in the red cluster; (b) impact of c_r2 on the evolution of behavioral decisions in the blue cluster.
Table 1. Model parameters and definitions.

P_Cb: basic probability of the whole process of air-to-air missile operations
p_Cr: basic probability of the whole process of laser operations
P_b1: probability of the red target being detected by the blue player
p_r1: probability of the blue target being detected by the red player
P_b2: penetration survival probability of the air-to-air missile
p_r2: penetration survival probability of the laser
P_b3: probability of the air-to-air missile hitting the target after penetration
p_r3: probability of the laser hitting the target after penetration
P_b4: destruction probability of the air-to-air missile after hitting the target
p_r4: destruction probability of the laser after hitting the target
P_Tb: response-time sensitivity parameter of the air-to-air missile
p_tr: response-time sensitivity parameter of the laser
T_b: reaction time of the air-to-air missile striking the target
t_r: reaction time of the laser striking the target
P_Mb: load weight coefficient of the air-to-air missile
p_mr: load weight coefficient of the laser
M_b: ammunition-carrying capacity of the blue player
m_r: sustained-strike capacity of the red player
f: strategy revenue correction factor
μ: discount rate for independent strategy returns
C_b1: flight basis cost of the blue player
c_r1: flight basis cost of the red player
C_b2: network collaboration cost of the blue player cluster
c_r2: network collaboration cost of the red player cluster
C_b3: single-missile strike cost of the blue player
c_r3: single laser strike cost of the red player
R_b0: inherent value income of the blue player
r_r0: inherent value income of the red player
R_b1: equal gain from an independent attack by the blue player
r_r1: equal gain from an independent defense by the red player
R_b2: advantageous gain from a coordinated attack by the blue player
r_r2: advantageous gain from a coordinated defense by the red player
R_b3: disadvantaged gain from an independent attack by the blue player
r_r3: disadvantaged gain from an independent defense by the red player
R_b4: equal gain from a coordinated attack by the blue player
r_r4: equal gain from a coordinated defense by the red player
Table 2. Gains matrix for both game players. The red player plays S_D1 with probability x and S_D2 with probability 1 - x; the blue player plays S_A1 with probability y and S_A2 with probability 1 - y. Each cell lists the red payoff followed by the blue payoff.

(S_D1, S_A1): red: r_r0 + r_r4 - c_r1 - c_r2 - m_r c_r3 - R_b4; blue: R_b0 + R_b4 - C_b1 - C_b2 - M_b C_b3 - r_r4
(S_D1, S_A2): red: r_r0 + r_r2 - c_r1 - c_r2 - m_r c_r3 - R_b3; blue: R_b0 + R_b3 - C_b1 - M_b C_b3 - r_r2
(S_D2, S_A1): red: r_r0 + r_r3 - c_r1 - m_r c_r3 - R_b2; blue: R_b0 + R_b2 - C_b1 - C_b2 - M_b C_b3 - r_r3
(S_D2, S_A2): red: r_r0 + r_r1 - c_r1 - m_r c_r3 - R_b1; blue: R_b0 + R_b1 - C_b1 - M_b C_b3 - r_r1
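As a sketch, the gains matrix of Table 2 can be turned into replicator drift terms for the two strategy shares. The numerical parameter values below are invented for illustration, and the payoff expressions assume that each player's costs and the opponent's gain term enter negatively, which is an interpretation of the cost–benefit structure of the model rather than a confirmed calibration.

```python
import numpy as np

# Illustrative parameter values (not the paper's calibration).
p = dict(r_r0=5.0, r_r1=1.0, r_r2=3.0, r_r3=0.5, r_r4=2.0,
         c_r1=0.8, c_r2=0.6, c_r3=0.4, m_r=2.0,
         R_b0=5.0, R_b1=1.0, R_b2=3.0, R_b3=0.5, R_b4=2.0,
         C_b1=0.8, C_b2=0.6, C_b3=0.4, M_b=2.0)

def payoffs(p):
    """Red and blue payoff matrices over (S_D1, S_D2) x (S_A1, S_A2)."""
    A = np.array([  # red player's payoffs
        [p['r_r0'] + p['r_r4'] - p['c_r1'] - p['c_r2'] - p['m_r'] * p['c_r3'] - p['R_b4'],
         p['r_r0'] + p['r_r2'] - p['c_r1'] - p['c_r2'] - p['m_r'] * p['c_r3'] - p['R_b3']],
        [p['r_r0'] + p['r_r3'] - p['c_r1'] - p['m_r'] * p['c_r3'] - p['R_b2'],
         p['r_r0'] + p['r_r1'] - p['c_r1'] - p['m_r'] * p['c_r3'] - p['R_b1']],
    ])
    B = np.array([  # blue player's payoffs
        [p['R_b0'] + p['R_b4'] - p['C_b1'] - p['C_b2'] - p['M_b'] * p['C_b3'] - p['r_r4'],
         p['R_b0'] + p['R_b3'] - p['C_b1'] - p['M_b'] * p['C_b3'] - p['r_r2']],
        [p['R_b0'] + p['R_b2'] - p['C_b1'] - p['C_b2'] - p['M_b'] * p['C_b3'] - p['r_r3'],
         p['R_b0'] + p['R_b1'] - p['C_b1'] - p['M_b'] * p['C_b3'] - p['r_r1']],
    ])
    return A, B

def replicator_drift(x, y, p):
    """Drift of the shares x (red playing S_D1) and y (blue playing S_A1)."""
    A, B = payoffs(p)
    yv = np.array([y, 1 - y])
    xv = np.array([x, 1 - x])
    fx = x * (1 - x) * (A[0] @ yv - A[1] @ yv)        # red's payoff advantage of S_D1
    fy = y * (1 - y) * (xv @ B[:, 0] - xv @ B[:, 1])  # blue's payoff advantage of S_A1
    return fx, fy

print(replicator_drift(0.5, 0.5, p))
```

These drift terms are the deterministic part of the stochastic evolutionary model; adding a noise term to each equation recovers the disturbed dynamics analyzed in the simulations.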
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
