Article

Orbital Pursuit–Evasion–Defense Linear-Quadratic Differential Game

Beijing Institute of Tracking and Telecommunications Technology, Beijing 100094, China
Aerospace 2024, 11(6), 443; https://doi.org/10.3390/aerospace11060443
Submission received: 13 March 2024 / Revised: 11 May 2024 / Accepted: 16 May 2024 / Published: 30 May 2024

Abstract

To find superior guidance strategies for preventing possible interception threats from space debris, out-of-control satellites, etc., this paper investigates an orbital pursuit–evasion–defense game problem with three players called the pursuer, the evader, and the defender, respectively. In this game, the pursuer aims to intercept the evader, while the evader tries to escape the pursuer. A defender accompanying the evader can protect the evader by actively intercepting the pursuer. For such a game, a linear-quadratic duration-adaptive (LQDA) strategy is first proposed as a basic strategy for the three players. Then, an advanced pursuit strategy is designed for the pursuer to evade the defender while chasing the evader. Meanwhile, a cooperative evasion–defense strategy is proposed for the evader and the defender to build their cooperation. Simulations show that the proposed LQDA strategy has higher interception accuracy than the classic LQ strategy, and that the proposed two-sided pursuit strategy improves the interception performance of the pursuer against a non-cooperative defender. However, if the evader and the defender employ the proposed cooperation strategy, the pursuer's interception becomes much more difficult.

1. Introduction

Recently, increasing research efforts [1] have been dedicated to orbital differential games, in which spacecraft are regarded as conflicting players that attempt to maximize their own interests. The most typical problem among them is the orbital pursuit–evasion game, in which each spacecraft aims to optimize survival performance indices such as miss distance, game duration, energy consumption, or a combination of them. While most previous research focuses on two-player orbital pursuit–evasion games [2], game scenarios that include three spacecraft have received far less attention and remain largely open. This paper investigates the three-player orbital pursuit–evasion–defense (PED) game, in which the roles of the three spacecraft are the pursuer, the evader, and the defender, respectively. A motivating scenario for the PED problem is the active protection of in-orbit spacecraft from space debris (or out-of-control satellites) [3]. In this scenario, the in-orbit spacecraft is the evader, the space debris is considered to be the pursuer, and the spacecraft accompanying the evader is the defender. The defender can reduce the impact threat of the debris by actively intercepting the pursuer. This study formulates this active protection problem as a PED differential game, as mentioned in [4].
Pontani and Conway [5] conducted an early study on a type of two-player orbital pursuit–evasion problem for spacecraft interception, in which the pursuer attempts to minimize the interception time while the evader aims to maximize it. A two-sided optimal solution, i.e., the saddle-point solution, is obtained by solving a challenging high-dimensional two-point boundary value problem (TPBVP). Thereafter, many researchers combined state-of-the-art intelligent optimization algorithms, such as evolutionary algorithms [6], with traditional gradient-based optimization algorithms, such as gradient descent [7], to form a variety of numerical methods for solving the TPBVP, such as the sensitivity method [8], the shooting method [9], and combined shooting and collocation methods [10]. Instead of focusing on the orbital pursuit–evasion game with interception time as the only objective, Jagat and Sinclair [11] examined the orbital linear-quadratic differential game (LQDG), in which the pursuer and the evader each try to optimize an individual performance index combining both the miss distance and the energy consumption; a pair of two-sided linear-quadratic (LQ) guidance laws is obtained. The orbital LQDG was further extended to nonlinear-quadratic cases by considering the nonlinear spacecraft dynamics [12]. Taking into account a more realistic information condition in the LQDG, Li et al. [13] investigated the orbital pursuit–evasion game with incomplete information and proposed an optimal strategy for the evader. The above research primarily focuses on orbital differential games between two players, whereas this paper investigates a game that involves three players.
Three-player differential games have been chiefly examined in the field of missile defense over the past couple of years [14], with a focus on the target–attacker–defender (TAD) problem. In this problem, a missile (attacker) pursues a non-maneuverable aircraft (target), while the target can launch another missile (defender) to protect itself by actively intercepting the attacker. Research efforts from various perspectives have been devoted to the problem. Shaferman and Shima [15] proposed a cooperative guidance law for the defender in which the possible guidance laws and parameters of the incoming homing missile are represented by a multiple-model adaptive estimator. Ratnoo and Shima [16] proposed a guidance strategy for the pursuer in situations where a set of guidance laws, such as line-of-sight guidance, is implemented by the defender missile. Considering both sides, Perelman et al. [17] reported cooperative evasion and pursuit strategies based on linear kinematics. More meticulously, Prokopov and Shima [18] categorized the possible cooperation schemes between the target and the defender into three types: one-way cooperation realized by the target, one-way cooperation realized by the defender, and two-way cooperation; comparison results highlighted that two-way cooperation performs best. Taking energy consumption as the optimization objective, Weiss et al. [19] determined the minimum interception and evasion effort required to achieve the desired performance in terms of miss distance. Further studies have improved the above strategies from different aspects: the optimization of the switched system [20], the accessibility of the control information [21], suitability for the large-distance case [22], and the utilization of learning-based methods [23].
Moreover, there is also a qualitative analysis of the three-player conflict problem presented by Rubinsky and Gutman [24] in which the algebraic conditions for the pursuer to capture the evader while escaping from the defender are examined.
Although extensive research has been conducted on the three-player TAD differential game in the missile field, few works address the orbital PED game that involves three spacecraft. Due to the dissimilarity in dynamics and game environment, methods developed for the missile TAD problem may not be directly applicable to the orbital PED game, in which the gravitational difference among the players needs to be considered. Moreover, the game duration of the orbital game cannot be estimated using a linearized collision triangle as in the TAD problem, which makes the orbital game more complex and difficult to solve. In addition, the defender in space need not be launched by the evader: it can hover or lurk at a distance from the evader at the beginning of the game. Existing research on the orbital PED problem is very limited. The interception trajectories of the orbital PED game under a continuous-thrust assumption were optimized by Liu et al. [25]. The obtained solutions are open-loop and thus cannot reflect real-time conflicts among the three players, as the problem is treated as trajectory optimization rather than closed-loop guidance. Later, they also optimized an impulsive-transfer solution to this problem [26], but the solution is still open-loop. Liang et al. [27] considered space active defense problems in a two-on-two engagement, including an interceptor, a protector, a target spacecraft, and a defender; closed-loop guidance laws are framed for them, but the differences in the gravitational force among the players are neglected in the game formulation.
In this paper, the three-player orbital PED game is examined and treated as a closed-loop guidance problem. Orbital dynamics accounting for the gravitational force are employed for the spacecraft. Considering both the energy consumption and the miss distance, an LQDG formulation is adopted to model the game. Three categories of guidance strategies are progressively designed and compared. The key contributions of this study are as follows: (1) a two-sided optimal pursuit strategy is designed for the pursuer, which enhances its self-defense ability while it chases the evader; (2) a cooperative evasion–defense strategy is devised for the evader and the defender to enhance their cooperation. Moreover, a variant of the LQ strategy, the linear-quadratic duration-adaptive (LQDA) strategy, is presented, with the time-to-go exclusively designed to achieve the interception condition.
The remainder of this paper is organized as follows: Section 2 briefly reviews the orbital dynamics and the two-player game model, and further introduces the formulation of the three-player orbital PED game. Section 3 presents the details of deriving and computing the LQDA guidance law, along with the two-sided optimal pursuit strategy and the design of the cooperative evasion–defense strategy. Section 4 highlights the effectiveness of the proposed strategies via numerical simulations, followed by concluding remarks in Section 5.

2. Problem Formulation

Throughout this paper, vectors are represented by bold lowercase letters and matrices by bold uppercase letters. $\|\cdot\|$ denotes the Euclidean norm of a vector. The superscript $(\cdot)^T$ represents the transpose. The subscript $(\cdot)_i$ indicates player $i$, where $i = P$ for the pursuer, $i = E$ for the evader, and $i = D$ for the defender.

2.1. Orbital Relative Dynamics

The orbital PED game in this study is established in the Local Vertical Local Horizontal (LVLH) frame (refer to Figure 1), which is attached to a moving point in the orbit of the Earth. The $x$-axis is directed along the radius vector from the Earth's center to the moving point. The $z$-axis is along the reference orbit normal, and the $y$-axis lies in the reference orbit plane, completing the right-hand coordinate system. If the moving point is regarded as the chief spacecraft, then the pursuer, the evader, and the defender are all deputies of it. Define $\mathbf{x}_i = (x_i, y_i, z_i, \dot{x}_i, \dot{y}_i, \dot{z}_i)^T \in \mathbb{R}^6$ as the vector of states in the LVLH reference frame and $\mathbf{u}_i = (a_{i,x}, a_{i,y}, a_{i,z})^T \in \mathbb{R}^3$ as the vector of control inputs, where the subscripts $i = P, E, D$ denote the pursuer, the evader, and the defender, respectively. Neglecting the orbital perturbations and assuming that the distances between the deputies and the chief are much shorter than those from the deputies to the Earth's center, the dynamics of the deputy spacecraft can be described by the following Clohessy–Wiltshire Equation (1) [28]:
$$\dot{\mathbf{x}}_i = A \mathbf{x}_i + B \mathbf{u}_i$$
where
$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 3\omega^2 & 0 & 0 & 0 & 2\omega & 0 \\ 0 & 0 & 0 & -2\omega & 0 & 0 \\ 0 & 0 & -\omega^2 & 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
$\omega$ is the angular velocity of the reference moving point. When the spacecraft applies no control, the state-transition matrix $\Phi(t_f, t)$ from the current game time $t$ to the terminal game time $t_f$ has the following form (2):
$$\Phi(t_f, t) = \begin{bmatrix} 4 - 3c & 0 & 0 & s/\omega & 2(1 - c)/\omega & 0 \\ 6(s - \omega\tau) & 1 & 0 & 2(c - 1)/\omega & (4s - 3\omega\tau)/\omega & 0 \\ 0 & 0 & c & 0 & 0 & s/\omega \\ 3\omega s & 0 & 0 & c & 2s & 0 \\ 6\omega(c - 1) & 0 & 0 & -2s & 4c - 3 & 0 \\ 0 & 0 & -\omega s & 0 & 0 & c \end{bmatrix}$$
where $\tau = t_f - t$, $s = \sin\omega\tau$, and $c = \cos\omega\tau$.
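As a quick sanity check, the transition matrix of Equation (2) can be coded directly. The following Python sketch is our illustration (the function name and the sample angular velocity are not from the paper); since the CW system is time-invariant, the result can be verified against the semigroup property $\Phi(\tau_1 + \tau_2) = \Phi(\tau_2)\Phi(\tau_1)$ that any state-transition matrix of a time-invariant system must satisfy.

```python
import numpy as np

def cw_stm(omega: float, tau: float) -> np.ndarray:
    """Clohessy-Wiltshire state-transition matrix Phi(tau), Equation (2),
    for the state ordering (x, y, z, xdot, ydot, zdot) in the LVLH frame."""
    s, c = np.sin(omega * tau), np.cos(omega * tau)
    w = omega
    return np.array([
        [4 - 3*c,       0, 0,    s/w,          2*(1 - c)/w,        0],
        [6*(s - w*tau), 1, 0,    2*(c - 1)/w,  (4*s - 3*w*tau)/w,  0],
        [0,             0, c,    0,            0,                  s/w],
        [3*w*s,         0, 0,    c,            2*s,                0],
        [6*w*(c - 1),   0, 0,    -2*s,         4*c - 3,            0],
        [0,             0, -w*s, 0,            0,                  c],
    ])
```

For a circular reference orbit, $\omega$ is the orbital mean motion; any consistent set of units may be used.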

2.2. Two-Player Pursuit–Evasion Game

The linear-quadratic pursuit–evasion game between two players is defined by a linear dynamical system as Equation (3):
$$\dot{\mathbf{x}}(t) = A\mathbf{x}(t) + B\mathbf{u}_P(t) - B\mathbf{u}_E(t)$$
A quadratic payoff function [2] that the pursuer attempts to minimize while the evader aims to maximize is given in Equation (4):
$$J = \frac{1}{2}\mathbf{x}^T(t_f) Q_f \mathbf{x}(t_f) + \frac{1}{2}\int_{t_0}^{t_f} \left[\mathbf{u}_P^T(t) R_P \mathbf{u}_P(t) - \mathbf{u}_E^T(t) R_E \mathbf{u}_E(t)\right] dt$$
where $\mathbf{x}(t) = \mathbf{x}_P(t) - \mathbf{x}_E(t)$ is the six-dimensional vector describing the state of the pursuit–evasion game system; $t_0$ and $t_f$ are the start and end times, respectively; the weighting matrix of the terminal state $Q_f \in \mathbb{R}^{6 \times 6}$ is positive semi-definite; and the weighting matrices of the consumed energy $R_P, R_E \in \mathbb{R}^{3 \times 3}$ are positive definite. In this study, the weighting matrices are assumed to take the following diagonal form (5):
$$Q_{fi} = \begin{bmatrix} q_{fi} I_3 & \mathbf{0}_3 \\ \mathbf{0}_3 & \mathbf{0}_3 \end{bmatrix}, \qquad R_{1i} = r_{1i} I_3, \qquad R_{2i} = r_{2i} I_3$$
where $I_3$ denotes the three-dimensional identity matrix, and $q_{fi}$, $r_{1i}$, and $r_{2i}$ are the weighting coefficients; the form of $Q_{fi}$, which weights only the terminal relative position, signifies that the game is an interception game.
The optimal control laws of the pursuer and the evader can be determined by solving for the saddle-point solution of the game, which is defined as a pair of strategies ( u P * , u E * ) that satisfies the following inequalities [10]:
$$J(\mathbf{u}_P^*, \mathbf{u}_E) \le J(\mathbf{u}_P^*, \mathbf{u}_E^*) \le J(\mathbf{u}_P, \mathbf{u}_E^*)$$
According to Ref. [29], the saddle-point strategies of the pursuer and the evader have the following forms:
$$\mathbf{u}_P^*(t) = -R_P^{-1} B^T P(t) \mathbf{x}(t)$$
$$\mathbf{u}_E^*(t) = -R_E^{-1} B^T P(t) \mathbf{x}(t)$$
where P ( t ) represents the solution of the following matrix Riccati differential equation [11]:
$$\dot{P}(t) = P(t)\left(B R_P^{-1} B^T - B R_E^{-1} B^T\right)P(t) - A^T P(t) - P(t)A - Q, \qquad P(t_f) = Q_f$$
where $Q$ is the running state weighting matrix, which is zero for the payoff in Equation (4).
The saddle-point strategies $(\mathbf{u}_P^*, \mathbf{u}_E^*)$ constitute a Nash equilibrium [2]: no player can improve its outcome by unilaterally deviating from $(\mathbf{u}_P^*, \mathbf{u}_E^*)$. In the subsequent sections, the strategies of this linear-quadratic game in Equations (7) and (8) are also referred to as the LQ strategy.
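For intuition, the Riccati solution above can be obtained numerically by integrating backward from $P(t_f) = Q_f$ and then evaluating the LQ controls of Equations (7) and (8). The sketch below is our illustration only: a plain fixed-step Euler scheme with generic matrices, not the authors' implementation.

```python
import numpy as np

def solve_game_riccati(A, B, Qf, RP, RE, tf, steps=2000):
    """Backward Euler integration of the game Riccati equation
    Pdot = P (B RP^-1 B^T - B RE^-1 B^T) P - A^T P - P A, P(tf) = Qf
    (zero running state cost). Returns [P(t0), ..., P(tf)] on a grid."""
    dt = tf / steps
    SP = B @ np.linalg.inv(RP) @ B.T   # pursuer control effectiveness
    SE = B @ np.linalg.inv(RE) @ B.T   # evader control effectiveness
    P = Qf.astype(float).copy()
    out = [P.copy()]
    for _ in range(steps):
        Pdot = P @ (SP - SE) @ P - A.T @ P - P @ A
        P = P - dt * Pdot              # step from t back to t - dt
        out.append(P.copy())
    return out[::-1]

def saddle_controls(P, B, RP, RE, x):
    """Saddle-point controls u_P* = -RP^-1 B^T P x and u_E* = -RE^-1 B^T P x."""
    lam = P @ x
    return -np.linalg.inv(RP) @ B.T @ lam, -np.linalg.inv(RE) @ B.T @ lam
```

Note that both controls share the same gain direction $B^T P \mathbf{x}$ and differ only through the energy weights, which is visible directly in the code.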

2.3. Three-Player Pursuit–Evasion–Defense Game

If a defender is added to the above two-player game, the game becomes a three-player pursuit–evasion–defense game. In this case, the purpose of the defender is to protect the evader from being intercepted by actively intercepting the pursuer.
To begin with, consider a simple case where the three spacecraft are “independent” of each other. In this case, the three-player game can be formulated as two sets of two-player pursuit–evasion games. Firstly, a pursuit–evasion game is formulated between the pursuer and the evader:
$$\dot{\mathbf{x}}(t) = A\mathbf{x}(t) + B\mathbf{u}_P(t) - B\mathbf{u}_E(t)$$
$$J^{PE} = \frac{1}{2}\mathbf{x}^T(t_f) Q_f \mathbf{x}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_P^T(t) R_P \mathbf{u}_P(t) - \mathbf{u}_E^T(t) R_E \mathbf{u}_E(t)\right] dt$$
where the superscript “PE” marks the game between the pursuer and the evader. Further, between the pursuer and the defender, another pursuit–evasion game appears as follows:
$$\dot{\mathbf{y}}(t) = A\mathbf{y}(t) + B\mathbf{u}_D(t) - B\mathbf{u}_P(t)$$
$$J^{DP} = \frac{1}{2}\mathbf{y}^T(t_f) Q_f \mathbf{y}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_D^T(t) R_D \mathbf{u}_D(t) - \mathbf{u}_P^T(t) R_P \mathbf{u}_P(t)\right] dt$$
where $\mathbf{y}(t) = \mathbf{x}_D(t) - \mathbf{x}_P(t)$ is the six-dimensional relative state of the system (12); $\mathbf{u}_D$ denotes the control input vector of the defender; the superscript "DP" marks the game between the pursuer and the defender; and $R_D \in \mathbb{R}^{3 \times 3}$ is the weighting matrix of the defender's energy consumption. In this game, the defender plays the role of a chaser against the pursuer.
Two more complex cases are further considered, in one of which the pursuer simultaneously chases the evader and avoids interception from the defender. In the other case, the evader and the defender cooperatively confront the pursuer. Different guidance laws are derived for each case, as described in Section 3.
The three-player game terminates if the pursuer intercepts the evader or the defender intercepts the pursuer. Let $m$ be the permitted terminal miss distance between the spacecraft, $r_{PE} = \|D\mathbf{x}(t)\|$, and $r_{DP} = \|D\mathbf{y}(t)\|$; then, the terminal time of the game is defined as
$$t_f = \min\left\{t \mid r_{PE}(t) = m \ \text{ or } \ r_{DP}(t) = m\right\}$$
where $D = [\,I_3 \ \ \mathbf{0}_3\,]$. If the defender's interception of the pursuer occurs earlier than the moment when the pursuer arrives at the evader, the evader and the defender win the game; otherwise, the pursuer wins. Based on the above game model, the guidance strategies for the three players are designed subsequently.
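The termination logic of Equation (15) is simple to implement in simulation. The sketch below is our illustration (function and variable names are not from the paper), with ties resolved in the pursuer's favour per the "otherwise" clause above.

```python
import numpy as np

# D picks out the position components of a 6-dimensional relative state.
D = np.hstack([np.eye(3), np.zeros((3, 3))])

def game_over(x_rel, y_rel, m):
    """Check the terminal condition of Equation (15): the game ends when
    r_PE = ||D x|| (pursuer-evader range) or r_DP = ||D y|| (defender-
    pursuer range) drops to the permitted miss distance m."""
    r_pe = np.linalg.norm(D @ x_rel)
    r_dp = np.linalg.norm(D @ y_rel)
    if r_pe <= m:
        return "pursuer wins"          # pursuer reached the evader first
    if r_dp <= m:
        return "evader-defender win"   # defender intercepted the pursuer
    return None                        # game continues
```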

3. Strategy Design

Different strategies are designed for the three introduced cases. The simplest scenario is the one where the three players are independent: the pursuer merely chases the evader and makes no maneuvers to evade the defender, the evader simply dodges the pursuer without coordinating with the defender, and the defender focuses solely on chasing the pursuer. For this scenario, a linear-quadratic duration-adaptive (LQDA) strategy is designed for the three players. Further, considering that the pursuer can chase the evader and simultaneously evade the defender, a two-sided optimal pursuit strategy is proposed for the pursuer. Finally, considering the potential coordination between the evader and the defender, a cooperative optimal evasion–defense strategy is designed.

3.1. Linear-Quadratic Duration-Adaptive Strategy

As discussed in Section 2.3, the three-player game in the first scenario can be formulated as two sets of two-player pursuit–evasion games. However, the LQ strategy in Equations (7) and (8) cannot be applied directly, since it assumes a game of fixed duration, whereas this three-player game terminates via the interception condition in Equation (15). For such a game, an LQDA strategy is designed for the spacecraft players.
The primary idea of the LQDA strategy is to adjust the time-to-go according to the real-time miss distance so as to achieve a terminal interception, that is, to complete a terminal control (TC) [30] in the pursuit–evasion game. More specifically, the LQDA strategy has the same form of control laws as the classic LQ strategy, but its time-to-go is designed differently. The idea of adjusting the time-to-go can be found in the missile literature [31]. Gutman and Rubinsky [32] derived analytical vector guidance laws through an analysis of the time-to-go. Later on, Ye et al. [33] applied this idea to a kind of orbital pursuit–evasion game that takes the terminal miss distance as the only payoff. However, these methods do not apply to the present LQ pursuit–evasion game, which considers both the miss distance and the energy consumption. This section therefore presents the details of the LQDA strategy.

3.1.1. Derivation

For the convenience of expression, a reduced state vector is utilized to describe such an interception game:
$$\mathbf{z}(t) = D\Phi(t_f, t)\mathbf{x}(t)$$
where $\mathbf{z}(t) \in \mathbb{R}^3$ is a three-dimensional state vector known as the zero-effort miss (ZEM) [31], predicted at time $t$ under the assumption that no control effort is applied during the interval $[t, t_f]$; $\Phi(t_f, t)$ is the state-transition matrix.
Further, the game defined by Equations (10) and (11) can be simplified using the following equations:
$$\dot{\mathbf{z}}(t) = G_P(t_f, t)\mathbf{u}_P(t) - G_E(t_f, t)\mathbf{u}_E(t)$$
$$J = \frac{1}{2}\mathbf{z}^T(t_f) Q_T \mathbf{z}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_P^T(t) R_P \mathbf{u}_P(t) - \mathbf{u}_E^T(t) R_E \mathbf{u}_E(t)\right] dt$$
where the control matrix $G_i(t_f, t) = D\Phi_i(t_f, t)B$ and the weighting matrix $Q_T = D Q_f D^T$.
Accordingly, the form of the saddle-point solution becomes the following [34]:
$$\mathbf{u}_P(t) = -R_P^{-1} G_P^T(\tau) K^{-1}(\tau) \mathbf{z}(t)$$
$$\mathbf{u}_E(t) = -R_E^{-1} G_E^T(\tau) K^{-1}(\tau) \mathbf{z}(t)$$
where the time-to-go $\tau = t_f - t$ replaces the game duration. The gain matrix is
$$K(\tau) = Q_T^{-1} + M_P(\tau) - M_E(\tau)$$
where the reduced controllability matrices $M_i(\tau)$, $i = P, E$, are computed over the time-to-go by
$$M_i(\tau) = \int_0^{\tau} G_i(\sigma) R_i^{-1} G_i^T(\sigma)\, d\sigma$$
The above solutions are derived from a standard LQ finite-time game and thus lack the terminal constraint of Equations (14) and (15), i.e.,
$$\|\mathbf{z}(t_f)\| = \|D\mathbf{x}(t_f)\| = r_{PE} = m$$
which transforms the original problem into a free-time pursuit–evasion TC problem [30]. To make the solutions in Equations (18) and (19) adaptive to the problem, the time-to-go is exclusively designed.
With the controls of the saddle-point strategy input, the final ZEM depends on the following state equation:
$$\dot{\mathbf{z}}(t) = \left[G_E(\tau) R_E^{-1} G_E^T(\tau) - G_P(\tau) R_P^{-1} G_P^T(\tau)\right] K^{-1}(\tau)\mathbf{z}(t)$$
Denoting $H(\tau) = \left[G_E(\tau) R_E^{-1} G_E^T(\tau) - G_P(\tau) R_P^{-1} G_P^T(\tau)\right] K^{-1}(\tau)$ and combining the initial condition $\mathbf{z}(t_0) = \mathbf{z}_0$, Equation (23) can be rewritten as follows:
$$\dot{\mathbf{z}}(t) = H(\tau)\mathbf{z}(t), \qquad \mathbf{z}(t_0) = \mathbf{z}_0$$
For this first-order linear time-variant ordinary differential equation, a closed-form solution of $\mathbf{z}(t)$ does not exist unless $H(\tau)$ commutes with its integral, i.e., $H(\tau)\int H(\tau)\,dt = \left[\int H(\tau)\,dt\right]H(\tau)$ [35]. Although $\mathbf{z}(t_f)$ could be computed by numerically integrating Equation (24), an approximate method is adopted to solve it more efficiently.
First, the interval $[t_0, t_f]$ is evenly divided into $N$ subintervals $[t_0, t_1], [t_1, t_2], \ldots, [t_{l-1}, t_l], \ldots, [t_{N-1}, t_N]$, where $t_N = t_f$ and $t_{l-1} = t_0 + (l-1)(t_f - t_0)/N$, $l = 1, 2, \ldots, N$. In each subinterval $[t_{l-1}, t_l]$, $H(\tau)$ is taken as the constant matrix $H(\tau_{l-1})$, where $\tau_{l-1} = t_f - t_{l-1} = \tau_0 (N + 1 - l)/N$. According to the analytical solution of a linear time-invariant system without control inputs [35],
$$\mathbf{z}(t_1) = e^{H(\tau_0)(t_1 - t_0)}\mathbf{z}(t_0), \quad \mathbf{z}(t_2) = e^{H(\tau_1)(t_2 - t_1)}\mathbf{z}(t_1), \quad \ldots, \quad \mathbf{z}(t_f) = e^{H(\tau_{N-1})(t_f - t_{N-1})}\mathbf{z}(t_{N-1})$$
Let $\Delta t = t_l - t_{l-1}$, $l = 1, 2, \ldots, N$; then
$$\mathbf{z}(t_f) = \left[e^{H(\tau_{N-1})\Delta t}\cdots e^{H(\tau_1)\Delta t}e^{H(\tau_0)\Delta t}\right]\mathbf{z}(t_0) \approx \left[e^{\Delta t \sum_{l=1}^{N} H(\tau_{l-1})}\right]\mathbf{z}(t_0)$$
where the product is merged into a single exponential by treating the sub-interval matrices as approximately commuting. Re-evaluated at the current time $t$ with time-to-go $\tau = t_f - t$, so that $\tau_{l-1} = (N + 1 - l)\tau/N$, this gives
$$\mathbf{z}(t_f) \approx \left[e^{\Delta t \sum_{l=1}^{N} H\left(\frac{(N+1-l)\tau}{N}\right)}\right] D\Phi(\tau)\mathbf{x}(t)$$
Combining Equations (22) and (26), the following relationship can be derived:
$$\left\| e^{\Delta t \sum_{l=1}^{N} H\left(\frac{(N+1-l)\tau}{N}\right)} D\Phi(\tau)\mathbf{x}(t) \right\| = \|D\mathbf{x}(t_f)\| = r_{PE} = m$$
The designed time-to-go $\tau$ is the exact solution of Equation (27). Given the current miss distance, the current time-to-go can be computed by solving the above equation for $\tau$.
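To make the piecewise-constant approximation of Equation (26) concrete, the following Python sketch chains the sub-interval matrix exponentials to propagate the ZEM. It is our illustration only: the matrix-exponential helper and the low-dimensional $H$ used for testing are stand-ins, not the paper's $H(\tau)$.

```python
import numpy as np

def expm_ss(M, n_squarings=10, order=12):
    """Matrix exponential via scaling-and-squaring with a truncated
    Taylor series (adequate for the small sub-interval matrices here)."""
    S = M / 2.0**n_squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, order + 1):
        term = term @ S / k
        E = E + term
    for _ in range(n_squarings):
        E = E @ E
    return E

def zem_final(H_of_tau, z0, tau0, N=50):
    """Approximate z(t_f) as in Equation (26): hold H constant on each of
    the N sub-intervals and chain the resulting matrix exponentials."""
    dt = tau0 / N
    z = z0.copy()
    for l in range(1, N + 1):
        tau_l = tau0 * (N + 1 - l) / N   # time-to-go at the left endpoint
        z = expm_ss(H_of_tau(tau_l) * dt) @ z
    return z
```

Chaining the exponentials (rather than summing all the exponents first) avoids the commutativity assumption at negligible extra cost.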
The procedure for computing the time-to-go is presented in Section 3.1.2. Given the time-to-go $\tau$, the control laws for the pursuer and the evader are as follows:
$$\mathbf{u}_P(t) = F_P(\tau)\mathbf{z}(t) = -R_P^{-1} G_P^T(\tau) K^{-1}(\tau) D\Phi(\tau)\mathbf{x}(t)$$
$$\mathbf{u}_E(t) = F_E(\tau)\mathbf{z}(t) = -R_E^{-1} G_E^T(\tau) K^{-1}(\tau) D\Phi(\tau)\mathbf{x}(t)$$
where the control gains vary with the real-time miss distance. Note that although the control laws have a linear form, they are nonlinear, since the equation determining the time-to-go is nonlinear.
Similar to the pursuer in Equation (28), the control law of the defender is given by
$$\mathbf{u}_D(t) = F_D(\bar{\tau})\bar{\mathbf{z}}(t) = -R_D^{-1} G_D^T(\bar{\tau}) K^{-1}(\bar{\tau}) D\Phi(\bar{\tau})\mathbf{y}(t)$$
where $\bar{\tau}$ represents the time-to-go of the defender calculated by Equation (13), and $\bar{\mathbf{z}}$ indicates the ZEM of the defender.
Eventually, Equations (28)–(30) constitute the LQDA strategy for the three independent players.

3.1.2. Calculation of Time-to-Go

Precisely, the time-to-go is the minimum positive zero of the following function $g(\tau)$:
$$g(\tau) = \left\| e^{\Delta t \sum_{l=1}^{N} H\left(\frac{(N+1-l)\tau}{N}\right)} D\Phi(\tau)\mathbf{x}(t) \right\| - m$$
Since $g(\tau)$ is not monotone and may have multiple zeros, solving for the time-to-go demands special techniques. Note that $g_k(\tau)$ is used here to denote $g(\tau)$ at $t = t_k$, $k \in \mathbb{N}$, because $g(\tau)$ changes with $\mathbf{x}(t)$ during the game.
Two rational assumptions are made: (1) at the beginning of the game, the distance between the two players is larger than the permitted miss distance [$m$ in Equation (14)]; (2) given the stronger maneuverability of the chasing player along with sufficient time, the miss distance will become smaller than the permitted value. These imply $g(0) = \|D\Phi(0)\mathbf{x}(t_0)\| - m = r(t_0) - m > 0$ and $g(\infty) < 0$, which together guarantee the existence of the time-to-go.
Based on these assumptions, the first zero must lie on a descending segment of the curve of $g(\tau)$, i.e., where $g'(\tau) \le 0$, which excludes all zeros with $g'(\tau) > 0$. Extensive simulations with a variety of parameters $m$, $R_P$, $R_E$, and $\mathbf{x}(t)$ further indicate that (3) the time interval between a pair of adjacent peak and trough of $g(\tau)$ is no less than 50 s, even for an extremely fluctuating $g(\tau)$ (see Figure 2); and (4) the change in the time-to-go between two consecutive steps is significantly less than 50 s unless a critical point occurs (see Figure 3). Accordingly, the time-to-go calculated in the last step is taken as the initial guess, and Newton's method [36] is used to efficiently solve for the current time-to-go.
For the time-to-go at the initial phase of the game, or when a critical point appears, an appropriate initial guess must be sought. According to observations (3) and (4), a minimum $\tau$ subject to $g'(\tau) < 0$ can be obtained by scanning from $\tau = 0$ to $\tau = T$ with a step of $\Delta\tau = 50$ s ($T$ being sufficiently large). This $\tau$ is then used as the initial guess in Newton's method. The entire algorithm is summarized in Algorithm 1.
Algorithm 1. Calculation of the time-to-go in the game
1:Initialize all the parameters in g 0 ( τ ) when t = t 0 .
2:For  τ = 0 , Δ τ , 2 Δ τ , , T  do
3:     Calculate the derivative of $g_0(\tau)$ with respect to $\tau$, $g_0'(\tau)$.
4:     If  $g_0'(\tau) \le 0$,
5:        Take τ as the initial guess and solve g 0 ( τ 0 ) = 0 using the Newton method.
6:        If  $\tau_0$ is found within the iteration accuracy,
7:              Break the loop.
8:        End if
9:     End if
10:End for
11:For  t = t 1 , , t i , , t N  do
12:      Let τ = τ i 1 and calculate the value of g i ( τ ) .
13:      If  $g_i'(\tau) \le 0$,
14:         Take τ as the initial guess and solve g i ( τ i ) = 0 using the Newton method.
15:         If  $\tau_i$ is found within the iteration accuracy,
16:              Break the loop.
17:         Else
18:              Repeat steps 2 to 8 to obtain the τ i and break the loop.
19:          End if
20:      Else
21:          Let $\tau = \tau_{i-1} - \Delta\tau$, and calculate the value of $g_i(\tau)$.
22:      End if
23:          Repeat steps 11 to 18.
24:End for
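The core of Algorithm 1 — scan in $\Delta\tau$ steps for a descending segment of $g$, then refine with Newton's method — can be sketched as follows. This is our illustration only; $g$ and its derivative are supplied as callables, whereas the paper evaluates the function $g(\tau)$ defined above.

```python
def solve_time_to_go(g, dg, dtau=50.0, T=20000.0, tol=1e-6, max_iter=50):
    """Find the smallest positive zero of g lying on a descending segment.

    g, dg  -- callables returning g(tau) and its derivative g'(tau)
    dtau   -- coarse scan step (the paper uses 50 s)
    T      -- upper bound of the scan; returns None if no zero is found
    """
    tau = 0.0
    while tau <= T:
        if dg(tau) < 0.0:           # descending segment (Algorithm 1, step 4)
            x = tau
            for _ in range(max_iter):
                d = dg(x)
                if d == 0.0:        # flat spot: abandon and keep scanning
                    break
                step = g(x) / d
                x -= step           # Newton update
                if abs(step) < tol:
                    return x        # converged zero of g = time-to-go
        tau += dtau
    return None
```

During the game, the previous step's time-to-go would be passed as the scan start instead of 0, as Algorithm 1 prescribes.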

3.2. Two-Sided Optimal Pursuit Strategy

In the second scenario, the pursuer chases the evader and simultaneously evades the defender. A two-sided optimal pursuit strategy is proposed for the pursuer, where "two-sided" means taking both the evader and the defender into consideration.
From the pursuer’s angle of view, the game system is a combination of Equations (10) and (12):
$$\dot{\mathbf{x}}(t) = A\mathbf{x}(t) + B\mathbf{u}_P(t) - B\mathbf{u}_E(t), \qquad \dot{\mathbf{y}}(t) = A\mathbf{y}(t) + B\mathbf{u}_D(t) - B\mathbf{u}_P(t)$$
To behave as a pursuer and an evader at the same time, a comprehensive payoff function must be constructed for the pursuer to simultaneously handle the two opponents:
$$J_P = \frac{1}{2}\mathbf{x}^T(t_f) Q_f \mathbf{x}(t_f) - \frac{1}{2}\mathbf{y}^T(t_f) Q_f \mathbf{y}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_P^T(t) R_P \mathbf{u}_P(t) - \mathbf{u}_E^T(t) R_E \mathbf{u}_E(t) - \mathbf{u}_D^T(t) R_D \mathbf{u}_D(t)\right] dt$$
To solve for the pursuer's optimal controls, a Hamiltonian $H$ and a terminal function $\Phi$ are constructed as follows:
$$H(t) = \frac{1}{2}\left[\mathbf{u}_P^T R_P \mathbf{u}_P - \mathbf{u}_E^T R_E \mathbf{u}_E - \mathbf{u}_D^T R_D \mathbf{u}_D\right] + \boldsymbol{\lambda}_E^T\left[A\mathbf{x} + B\mathbf{u}_P - B\mathbf{u}_E\right] + \boldsymbol{\lambda}_D^T\left[A\mathbf{y} + B\mathbf{u}_D - B\mathbf{u}_P\right]$$
$$\Phi(t_f) = \frac{1}{2}\mathbf{x}^T(t_f) Q_f \mathbf{x}(t_f) - \frac{1}{2}\mathbf{y}^T(t_f) Q_f \mathbf{y}(t_f)$$
where $\boldsymbol{\lambda}_E$ and $\boldsymbol{\lambda}_D$ denote the co-states associated with the state equations of $\mathbf{x}(t)$ and $\mathbf{y}(t)$, respectively. Substituting Equations (29) and (30) into Equation (34),
$$H = \frac{1}{2}\left[\mathbf{u}_P^T R_P \mathbf{u}_P - \mathbf{x}^T F_E^T(\tau) R_E F_E(\tau)\mathbf{x} - \mathbf{y}^T F_D^T(\tau) R_D F_D(\tau)\mathbf{y}\right] + \boldsymbol{\lambda}_E^T\left[A\mathbf{x} + B\mathbf{u}_P - BF_E(\tau)\mathbf{x}\right] + \boldsymbol{\lambda}_D^T\left[A\mathbf{y} + BF_D(\tau)\mathbf{y} - B\mathbf{u}_P\right]$$
According to Pontryagin’s maximum principle (PMP), the optimal controls of the pursuer must minimize the Hamiltonian. Thus, the derivative of Hamiltonian with respect to the pursuer’s control inputs satisfies the following equation:
$$\frac{\partial H}{\partial \mathbf{u}_P} = R_P\mathbf{u}_P(t) + B^T\boldsymbol{\lambda}_E - B^T\boldsymbol{\lambda}_D = 0 \;\Rightarrow\; \mathbf{u}_P(t) = -R_P^{-1}B^T(\boldsymbol{\lambda}_E - \boldsymbol{\lambda}_D)$$
The evolution of the co-state vectors λ E and λ D follows the adjoint equations:
$$\dot{\boldsymbol{\lambda}}_E(t) = -\frac{\partial H}{\partial \mathbf{x}(t)} = F_E^T(\tau) R_E F_E(\tau)\mathbf{x}(t) - \left[A - BF_E(\tau)\right]^T\boldsymbol{\lambda}_E$$
$$\dot{\boldsymbol{\lambda}}_D(t) = -\frac{\partial H}{\partial \mathbf{y}(t)} = F_D^T(\tau) R_D F_D(\tau)\mathbf{y}(t) - \left[A + BF_D(\tau)\right]^T\boldsymbol{\lambda}_D$$
The transversality condition of the co-state satisfies the following conditions:
$$\boldsymbol{\lambda}_E(t_f) = \frac{\partial \Phi(t_f)}{\partial \mathbf{x}(t_f)} = Q_f\mathbf{x}(t_f)$$
$$\boldsymbol{\lambda}_D(t_f) = \frac{\partial \Phi(t_f)}{\partial \mathbf{y}(t_f)} = -Q_f\mathbf{y}(t_f)$$
Substituting Equations (38) and (39) into Equation (37), and further substituting Equation (37) into Equation (32), the following equations are obtained:
$$\dot{\mathbf{x}}(t) = \left[A - BF_E(\tau)\right]\mathbf{x}(t) - BR_P^{-1}B^T(\boldsymbol{\lambda}_E - \boldsymbol{\lambda}_D), \qquad \dot{\mathbf{y}}(t) = \left[A + BF_D(\tau)\right]\mathbf{y}(t) + BR_P^{-1}B^T(\boldsymbol{\lambda}_E - \boldsymbol{\lambda}_D)$$
Combining the equations of the state and the co-state, we have
$$\begin{bmatrix}\dot{\mathbf{x}}\\ \dot{\mathbf{y}}\end{bmatrix} = \begin{bmatrix}A - BF_E(\tau) & 0\\ 0 & A + BF_D(\tau)\end{bmatrix}\begin{bmatrix}\mathbf{x}\\ \mathbf{y}\end{bmatrix} + \begin{bmatrix}-BR_P^{-1}B^T & BR_P^{-1}B^T\\ BR_P^{-1}B^T & -BR_P^{-1}B^T\end{bmatrix}\begin{bmatrix}\boldsymbol{\lambda}_E\\ \boldsymbol{\lambda}_D\end{bmatrix}$$
$$\begin{bmatrix}\dot{\boldsymbol{\lambda}}_E\\ \dot{\boldsymbol{\lambda}}_D\end{bmatrix} = \begin{bmatrix}F_E^T(\tau)R_EF_E(\tau) & 0\\ 0 & F_D^T(\tau)R_DF_D(\tau)\end{bmatrix}\begin{bmatrix}\mathbf{x}\\ \mathbf{y}\end{bmatrix} + \begin{bmatrix}-A^T + F_E^T(\tau)B^T & 0\\ 0 & -A^T - F_D^T(\tau)B^T\end{bmatrix}\begin{bmatrix}\boldsymbol{\lambda}_E\\ \boldsymbol{\lambda}_D\end{bmatrix}$$
Let $X = [\mathbf{x}^T, \mathbf{y}^T]^T$, $\Lambda = [\boldsymbol{\lambda}_E^T, \boldsymbol{\lambda}_D^T]^T$, and let
$$M = \begin{bmatrix}A - BF_E(\tau) & 0\\ 0 & A + BF_D(\tau)\end{bmatrix}, \qquad V = \begin{bmatrix}F_E^T(\tau)R_EF_E(\tau) & 0\\ 0 & F_D^T(\tau)R_DF_D(\tau)\end{bmatrix}$$
$$N = \begin{bmatrix}-BR_P^{-1}B^T & BR_P^{-1}B^T\\ BR_P^{-1}B^T & -BR_P^{-1}B^T\end{bmatrix}, \qquad W = \begin{bmatrix}-A^T + F_E^T(\tau)B^T & 0\\ 0 & -A^T - F_D^T(\tau)B^T\end{bmatrix}$$
Therefore, Equation (43) can be rewritten in the matrix form as below:
$$\begin{bmatrix}\dot{X}(t)\\ \dot{\Lambda}(t)\end{bmatrix} = \begin{bmatrix}M(\tau) & N(\tau)\\ V(\tau) & W(\tau)\end{bmatrix}\begin{bmatrix}X(t)\\ \Lambda(t)\end{bmatrix}$$
A linear relationship between the state and the co-state is obtained as below:
$$\Lambda(t) = P(t)X(t)$$
where P ( t ) represents the Riccati matrix. By substituting Equation (47) into Equation (46), a matrix Riccati differential equation is derived as follows:
$$\dot{P}(t) = -P(t)M - M^TP(t) - P(t)NP(t) + V$$
Substituting Equation (48) into Equations (40) and (41), the boundary condition of Equation (48) is derived as
$$P(t_f) = \begin{bmatrix}Q_f & 0\\ 0 & -Q_f\end{bmatrix}$$
By integrating Equation (48) backward from this boundary condition, the matrix $P(t)$ can be solved. The optimal control law of the pursuer in Equation (37) then becomes
$$\mathbf{u}_P(t) = -R_P^{-1}B^TS_PP(t)X(t)$$
where $S_P = [\,I_6 \ \ {-I_6}\,]$ selects $\boldsymbol{\lambda}_E - \boldsymbol{\lambda}_D$ from $\Lambda = PX$.
Eventually, Equation (50) provides a two-sided optimal pursuit strategy for an offensive as well as defensive pursuer.
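Numerically, Equations (45)–(50) amount to assembling the block matrices, integrating the Riccati equation backward from the sign-indefinite boundary condition of Equation (49), and extracting the pursuer's gain. The sketch below is our illustration only, with generic low-dimensional matrices and frozen (time-invariant) stand-in gains $F_E$, $F_D$; in the paper these gains vary with the time-to-go.

```python
import numpy as np

def two_sided_pursuit_gain(A, B, FE, FD, Qf, RP, RE, RD, tf, steps=1000):
    """Assemble M, N, V of Equation (45), integrate the Riccati equation
    Pdot = -P M - M^T P - P N P + V backward from P(tf) = diag(Qf, -Qf)
    (Equation (49)) with fixed-step Euler, and return P(t0)."""
    n = A.shape[0]
    S = B @ np.linalg.inv(RP) @ B.T
    M = np.block([[A - B @ FE, np.zeros((n, n))],
                  [np.zeros((n, n)), A + B @ FD]])
    N = np.block([[-S, S], [S, -S]])
    V = np.block([[FE.T @ RE @ FE, np.zeros((n, n))],
                  [np.zeros((n, n)), FD.T @ RD @ FD]])
    P = np.block([[Qf, np.zeros((n, n))], [np.zeros((n, n)), -Qf]])
    dt = tf / steps
    for _ in range(steps):
        Pdot = -P @ M - M.T @ P - P @ N @ P + V
        P = P - dt * Pdot          # backward Euler step
    return P

def pursuer_control(P, B, RP, X):
    """Equation (50): u_P = -RP^-1 B^T S_P P X with S_P = [I, -I]."""
    n = B.shape[0]
    SP = np.hstack([np.eye(n), -np.eye(n)])
    return -np.linalg.inv(RP) @ B.T @ SP @ P @ X
```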

3.3. Cooperative Evasion–Defense Strategy

The evader and the defender can coordinate with each other to defeat the pursuer. With the defender's help, the evader does not have to flee far from the pursuer; it only needs to keep its distance from the pursuer larger than the distance between the defender and the pursuer. Based on this, the following payoff function is constructed for the evader and the defender:
$$J^{DE} = \frac{1}{2}\left(\|\mathbf{x}(t_f)\|_{Q_f}^2 - \|\mathbf{y}(t_f)\|_{Q_f}^2\right) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_P^TR_P\mathbf{u}_P - \mathbf{u}_E^TR_E\mathbf{u}_E - \mathbf{u}_D^TR_D\mathbf{u}_D\right]dt = \frac{1}{2}\mathbf{x}^T(t_f)Q_f\mathbf{x}(t_f) - \frac{1}{2}\mathbf{y}^T(t_f)Q_f\mathbf{y}(t_f) + \frac{1}{2}\int_{t_0}^{t_f}\left[\mathbf{u}_P^TR_P\mathbf{u}_P - \mathbf{u}_E^TR_E\mathbf{u}_E - \mathbf{u}_D^TR_D\mathbf{u}_D\right]dt$$
In the above equation, the squared distances replace the distances themselves so that the payoff function retains a quadratic form, the same as in Equation (33).
Based on this, the Hamiltonian is constructed as
$$
H = \frac{1}{2} \left[ x^T(t) F_P^T(\tau) R_P F_P(\tau) x(t) - u_E^T(t) R_E u_E(t) - u_D^T(t) R_D u_D(t) \right] + \lambda_E^T \left[ A x(t) + B F_P(\tau) x(t) - B u_E(t) \right] + \lambda_D^T \left[ A y(t) + B u_D(t) - B F_P(\tau) x(t) \right]
$$
According to the PMP, the optimal controls of the evader and the defender must maximize the Hamiltonian, so the derivatives of the Hamiltonian with respect to their controls must vanish:
$$
\frac{\partial H}{\partial u_E} = -R_E u_E(t) - B^T \lambda_E = 0 \;\Rightarrow\; u_E(t) = -R_E^{-1} B^T \lambda_E
$$
$$
\frac{\partial H}{\partial u_D} = -R_D u_D(t) + B^T \lambda_D = 0 \;\Rightarrow\; u_D(t) = R_D^{-1} B^T \lambda_D
$$
The evolution of the co-state vectors λ E and λ D follows the adjoint equations:
$$
\dot{\lambda}_E(t) = -\frac{\partial H}{\partial x(t)} = -F_P^T(\tau) R_P F_P(\tau) x(t) - \left[ A + B F_P(\tau) \right]^T \lambda_E + F_P^T(\tau) B^T \lambda_D
$$
$$
\dot{\lambda}_D(t) = -\frac{\partial H}{\partial y(t)} = -A^T \lambda_D
$$
Substituting Equations (55) and (56) into the state equations (32), the following equations can be derived:
$$
\dot{x}(t) = \left[ A + B F_P(\tau) \right] x(t) + B R_E^{-1} B^T \lambda_E, \qquad
\dot{y}(t) = A y(t) - B F_P(\tau) x(t) + B R_D^{-1} B^T \lambda_D
$$
The equations of the state and the co-state are further combined as follows:
$$
\begin{bmatrix} \dot{x} \\ \dot{y} \end{bmatrix} =
\begin{bmatrix} A + B F_P(\tau) & 0 \\ -B F_P(\tau) & A \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} +
\begin{bmatrix} B R_E^{-1} B^T & 0 \\ 0 & B R_D^{-1} B^T \end{bmatrix}
\begin{bmatrix} \lambda_E \\ \lambda_D \end{bmatrix}
$$
$$
\begin{bmatrix} \dot{\lambda}_E \\ \dot{\lambda}_D \end{bmatrix} =
\begin{bmatrix} -F_P^T(\tau) R_P F_P(\tau) & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} +
\begin{bmatrix} -A^T - F_P^T(\tau) B^T & F_P^T(\tau) B^T \\ 0 & -A^T \end{bmatrix}
\begin{bmatrix} \lambda_E \\ \lambda_D \end{bmatrix}
$$
Performing a procedure similar to that in Equations (45)–(49), the following matrices can be obtained, forming a matrix Riccati differential equation of the same form as Equation (48):
$$
M = \begin{bmatrix} A + B F_P(\tau) & 0 \\ -B F_P(\tau) & A \end{bmatrix}, \quad
V = \begin{bmatrix} -F_P^T(\tau) R_P F_P(\tau) & 0 \\ 0 & 0 \end{bmatrix}
$$
$$
N = \begin{bmatrix} B R_E^{-1} B^T & 0 \\ 0 & B R_D^{-1} B^T \end{bmatrix}, \quad
W = \begin{bmatrix} -A^T - F_P^T(\tau) B^T & F_P^T(\tau) B^T \\ 0 & -A^T \end{bmatrix}
$$
After solving the matrix Riccati differential Equation (48), the optimal control laws of the evader and the defender are as follows:
$$
u_E(t) = -R_E^{-1} B^T S_E P(t) X(t)
$$
$$
u_D(t) = R_D^{-1} B^T S_D P(t) X(t)
$$
where $S_E = [I_3, \; 0_3]$ and $S_D = [0_3, \; I_3]$.
Consequently, Equations (61) and (62) constitute the optimal cooperative evasion–defense strategy for a coordinated evader–defender pair.
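A minimal numerical sketch of evaluating the two feedback laws of Equations (61) and (62) is given below. It assumes 6-D relative states, so the selection blocks are written out as 6 × 6 identities (the paper's compact notation abbreviates them); the sign convention $u_E = -R_E^{-1} B^T \lambda_E$, $u_D = R_D^{-1} B^T \lambda_D$ is an assumption of this sketch.

```python
import numpy as np

n = 6  # dimension of each relative state (position + velocity)
B = np.vstack([np.zeros((3, 3)), np.eye(3)])  # thrust enters the velocity rows

# Selection matrices picking the co-state blocks out of Lambda = P X
S_E = np.hstack([np.eye(n), np.zeros((n, n))])   # lambda_E = S_E P X
S_D = np.hstack([np.zeros((n, n)), np.eye(n)])   # lambda_D = S_D P X

def cooperative_controls(P, X, R_E, R_D):
    """Evaluate u_E = -R_E^{-1} B^T S_E P X and u_D = R_D^{-1} B^T S_D P X."""
    lam = P @ X
    u_E = -np.linalg.solve(R_E, B.T @ (S_E @ lam))
    u_D = np.linalg.solve(R_D, B.T @ (S_D @ lam))
    return u_E, u_D

# Sanity check with P = I and unit weights: the controls reduce to
# (minus) the velocity components of the corresponding co-state blocks
P = np.eye(2 * n)
X = np.arange(12, dtype=float)
u_E, u_D = cooperative_controls(P, X, np.eye(3), np.eye(3))
```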

4. Simulation Results

To verify the performance of the proposed strategies, a three-player orbital pursuit–evasion–defense game is simulated in this section under different strategy combinations. Initially, the evader moves in a circular orbit at a height of 400 km. To establish the LVLH coordinate frame, the evader's initial orbit is chosen as the reference orbit, and the point coinciding with the evader's initial position is selected as the reference moving point. The initial states of the three players in the LVLH frame are $x_P$ = [−6 km, −16 km, 4 km, −9 m/s, 13.6 m/s, 0 m/s], $x_E$ = [0 km, 0 km, 0 km, 0 m/s, 0 m/s, 0 m/s], and $x_D$ = [−1 km, 3 km, 0 km, 0 m/s, 0 m/s, 0 m/s]. The permitted terminal miss distance in the game is 100 m. The payoff weightings are tuned as $Q_f = 2 \times 10^4$, $R_P = 0.8 \times 10^{10}$, $R_E = 3.6 \times 10^{10}$, and $R_D = 0.2 \times 10^{10}$. This case (case 1 in Table 1) serves as the representative case used throughout Section 4.1, Section 4.2 and Section 4.3 for cross-comparison. Other cases with different initial positions are also listed in Table 1 (in case 2, the pursuer flies around the evader while the defender hovers; in case 3, the defender flies around the evader while the pursuer hovers).
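The LVLH relative dynamics underlying the simulation are the Clohessy–Wiltshire equations [28]. A minimal sketch of building the corresponding system matrices for the 400 km reference orbit follows; the constant values are standard Earth parameters, and the function name is illustrative:

```python
import numpy as np

MU = 3.986004418e14    # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6378.137e3   # Earth's equatorial radius, m

def cw_matrices(height_m):
    """Clohessy-Wiltshire A, B for a circular reference orbit
    (x radial, y along-track, z cross-track; state = [r; v])."""
    a = R_EARTH + height_m
    n = np.sqrt(MU / a**3)  # mean motion of the reference orbit
    A = np.array([
        [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
        [3 * n**2, 0.0, 0.0, 0.0, 2 * n, 0.0],
        [0.0, 0.0, 0.0, -2 * n, 0.0, 0.0],
        [0.0, 0.0, -n**2, 0.0, 0.0, 0.0],
    ])
    B = np.vstack([np.zeros((3, 3)), np.eye(3)])
    return A, B, n

A, B, n = cw_matrices(400e3)  # case-1 reference orbit at 400 km
```

For cases 2 and 3, the same function would be called with 500 km and 600 km.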

4.1. Results of the LQDA Strategy

First, we examine the LQDA strategy in the two-player pursuit–evasion game (case 1 with the defender temporarily removed). As illustrated in Figure 4a, the pursuer successfully intercepts the evader; compared with the results of the classic LQ strategy in Figure 4b, the LQDA strategy achieves higher interception accuracy. Then, a defender using the LQDA strategy is added (recovering case 1), and the game results are illustrated in Figure 4c. It can be observed that the defender intercepts the pursuer before the pursuer reaches the evader. Compared with Figure 4a, the appearance of an unexpected defender dramatically changes the ending of the game, signifying the importance of the defender to the evader. But if the defender uses the classic LQ strategy, the evader loses the game, as shown in Figure 4d. Figure 4e,f further depict the control histories of the three players in case 1 when the defender employs the LQDA and LQ strategies, respectively. Cases 2 and 3 further validate the significance of the defender: in these two cases, the pursuer is intercepted by the defender after 262 s and 242 s, respectively.
Figure 5 demonstrates the relationship between the miss distance and the time-to-go under different settings of $q_f$. A turning point appears at around 100 m, before which the curve drops sharply and after which it descends gradually (for a miss distance of 100 m, the interception time is 560 s; for 10 m, it rises to 1018 s). This suggests that once the permitted miss distance falls below a certain value, the required interception time increases greatly as the miss distance decreases.
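The turning-point behavior stems from the duration adaptation: the strategy selects the shortest time-to-go whose predicted miss distance meets the permitted value. The sketch below illustrates that selection with a toy monotone miss-distance model (not the paper's dynamics) and a bisection search standing in for a Newton–Raphson iteration; all names and numbers are assumptions for illustration.

```python
import numpy as np

def solve_time_to_go(miss_fn, m_allow, t_lo, t_hi, tol=1e-3):
    """Bisection for the shortest time-to-go at which the predicted miss
    distance drops to m_allow; assumes miss_fn decreases monotonically."""
    if miss_fn(t_hi) > m_allow:
        raise ValueError("permitted miss distance not reachable on this horizon")
    while t_hi - t_lo > tol:
        mid = 0.5 * (t_lo + t_hi)
        if miss_fn(mid) > m_allow:
            t_lo = mid   # still too far: more time is needed
        else:
            t_hi = mid   # requirement met: try a shorter duration
    return t_hi

# Toy monotone miss-distance model (illustrative only)
miss = lambda t_go: 5000.0 * np.exp(-t_go / 150.0)
t_100 = solve_time_to_go(miss, 100.0, 0.0, 2000.0)  # ~587 s
t_10 = solve_time_to_go(miss, 10.0, 0.0, 2000.0)    # ~932 s
```

Even in this toy model, tightening the permitted miss distance from 100 m to 10 m sharply increases the required duration, mirroring the 560 s to 1018 s jump reported above.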

4.2. Results of the Two-Sided Optimal Pursuit Strategy

When the control law of the pursuer has no component against the defender, the defender can break the two-player equilibrium between the pursuer and the evader. Thus, if the pursuer can pursue the evader and avoid the defender simultaneously, the ending of the game may shift in the pursuer's favor.
In this case, the parameters of the game are the same as those in Section 4.1, but the pursuer is employing the two-sided optimal pursuit strategy, which aims to chase the evader and dodge the defender simultaneously. Figure 6a depicts the trajectories of the game. Compared to the results of the LQDA strategy in Figure 4a, the pursuer shown in Figure 6a succeeds in avoiding the interception from the defender, leading to a sharp transition in the trajectory of the defender. As seen from the locally enlarged view in Figure 6a, the pursuer successfully intercepts the evader in the end.
The control histories are illustrated in Figure 6b. Several jumps can be observed in the three players' controls, accompanying the jumps of the time-to-go; this is because the control gain in the optimal feedback strategy is a function of the time-to-go. To further demonstrate the self-defense ability of the pursuer under the two-sided optimal pursuit strategy, the defender is placed at another initial position closer to the pursuer. The results in Figure 7a show that the pursuer can still safely complete the interception.
The results of the LQDA strategy and the two-sided optimal pursuit strategy under different $R_P$ settings are compared in Figure 7b. Given a lower $R_P$ (i.e., the importance of energy consumption decreases for the pursuer), the pursuer still cannot succeed under the LQDA strategy, whereas it can under the two-sided optimal pursuit strategy. These simulations show that it is indeed crucial for the pursuer to be both offensive and defensive in such a PED game, and that the designed two-sided optimal pursuit strategy helps the pursuer achieve high self-defense performance during an offensive task.

4.3. Results of the Cooperative Evasion–Defense Strategy

For the evader and the defender, the LQDA strategy also has drawbacks, because it ignores the information sharing and potential cooperation between the two players. Under the LQDA strategy, the defender focuses solely on intercepting the pursuer and makes no attempt to indirectly increase the distance between the pursuer and the evader. The evader, however, does not need to keep a large distance from the pursuer: the optimum is to keep the pursuer–evader distance larger than the pursuer–defender distance.
An example is given in Figure 8a, where the pursuer under the LQDA strategy succeeds in intercepting the evader before being intercepted by the defender. In this case, although $R_E$ is smaller than that in Section 4.1 (the importance of energy consumption decreases for the evader), the evader is still unable to escape the pursuer's interception. However, if the evader and the defender adopt the cooperative evasion–defense strategy, the ending of the game is quite different, as illustrated in Figure 8b. In cooperation with the defender, the evader acts as bait and flies toward the defender to reduce the distance between the defender and the pursuer. Eventually, the defender achieves a head-on interception of the pursuer.
In Figure 9, the game results are depicted to demonstrate the cooperation effect between the evader and the defender when one of them has no control. In Figure 9a, the defender is a free-flying spacecraft with no external control force, and the evader employs the proposed cooperation strategy; as the game goes on, the evader gradually flies toward the defender. Similarly, when the evader has no control (see Figure 9b), the defender actively flies toward the evader and intercepts the pursuer. From these cases, it can be concluded that the proposed cooperative evasion–defense strategy helps the evader and the defender build cooperation and achieve a better ending in the orbital PED game.

5. Conclusions

This paper investigates the orbital pursuit–evasion–defense (PED) game that involves three spacecraft players. First, a linear-quadratic duration-adaptive (LQDA) guidance law is proposed for each player. In contrast to the LQ strategy, which requires a fixed, known game duration, the LQDA strategy adaptively determines the game duration based on the real-time miss distance, thereby assuring terminal interception. Numerical analysis suggests that if the permitted miss distance is smaller than a certain threshold, the time required to achieve the interception increases significantly as the permitted miss distance decreases. Secondly, a two-sided optimal pursuit strategy is designed for the pursuer. Compared to the LQDA strategy, the designed pursuit strategy notably improves the self-defense ability of the pursuer. Furthermore, a cooperative evasion–defense strategy for the evader and the defender is presented. Simulations showed that the cooperative strategy helps the evader and the defender build mutual coordination and achieve a better ending in the orbital PED game. A limitation of this study is the absence of control constraints; saturation constraints on the control magnitude are common in the literature. Adding saturation constraints may prolong the interception time in the two-player pursuit–evasion game compared with the unconstrained case, or alter the game ending in the three-player PED game when the three players are given different control magnitudes. However, this should not change the effectiveness of the proposed guidance strategies, and the basic conclusions still hold when control constraints are considered. Future work may consider mass change and saturation constraints on the control magnitude for more realistic scenarios.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 12125207).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The author declares that there are no conflicts of interest in this work.

References

  1. Luo, Y.Z.; Li, Z.Y.; Zhu, H. Survey on spacecraft orbital pursuit-evasion differential games. Sci. Sin. Technol. 2020, 50, 1533–1545. [Google Scholar] [CrossRef]
  2. Li, Z.Y.; Zhu, H.; Luo, Y.Z. Orbital inspection game formulation and epsilon-Nash equilibrium solution. J. Spacecraft Rocket. 2024, 61, 157–172. [Google Scholar] [CrossRef]
  3. Gong, B.C.; Li, W.D.; Li, S.; Ma, W.; Zheng, L. Angles-only initial relative orbit determination algorithm for noncooperative spacecraft proximity operations. Astrodynamics 2018, 2, 217–231. [Google Scholar] [CrossRef]
  4. Spendel, D.F. Parameter Study of an Orbital Debris Defender Using Two Teams, Three Player Differential Game Theory; Air Force Institute of Technology: Wright-Patterson Air Force Base, OH, USA, 2018. [Google Scholar]
  5. Pontani, M.; Conway, B.A. Numerical solution of the three-dimensional orbital pursuit-evasion game. J. Guid. Control Dyn. 2009, 32, 474–487. [Google Scholar] [CrossRef]
  6. Storn, R. System design by constraint adaptation and differential evolution. IEEE Trans. Evolut. Comput. 1999, 3, 22–34. [Google Scholar] [CrossRef]
  7. Dong, X.; Zhou, D.X. Learning gradients by a gradient descent algorithm. J. Math. Anal. Appl. 2008, 341, 1018–1027. [Google Scholar] [CrossRef]
  8. Hafer, W.T.; Reed, H.L.; Turner, J.D.; Pham, K. Sensitivity methods applied to orbital pursuit evasion. J. Guid. Control Dyn. 2015, 38, 1118–1126. [Google Scholar] [CrossRef]
  9. Shen, H.X.; Casalino, L. Revisit of the three-dimensional orbital pursuit-evasion game. J. Guid. Control Dyn. 2018, 41, 1820–1828. [Google Scholar] [CrossRef]
  10. Li, Z.Y.; Zhu, H.; Yang, Z.; Luo, Y.-Z. Saddle point of orbital pursuit-evasion game under J2-perturbed dynamics. J. Guid. Control Dyn. 2020, 43, 1733–1739. [Google Scholar] [CrossRef]
  11. Jagat, A.; Sinclair, A.J. LQ Dynamic Optimization and Differential Games. In Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, San Diego, CA, USA, 3–7 January 2014; pp. 1–20. [Google Scholar]
  12. Jagat, A.; Sinclair, A.J. Nonlinear control for spacecraft pursuit-evasion game using the state-dependent Riccati equation method. IEEE Trans. Aerosp. Electron. Syst. 2017, 53, 3032–3042. [Google Scholar] [CrossRef]
  13. Li, Z.Y.; Zhu, H.; Luo, Y.Z. An escape strategy in orbital pursuit-evasion games with incomplete information. Sci. China Ser. E Technol. Sci. 2021, 64, 559–570. [Google Scholar] [CrossRef]
  14. Yan, X.D.; Lyu, S. A two-side cooperative interception guidance law for active air defense with a relative time-to-go deviation. Aerosp. Sci. Technol. 2020, 100, 105787. [Google Scholar] [CrossRef]
  15. Shaferman, V.; Shima, T. Cooperative multiple-model adaptive guidance for an aircraft defending missile. J. Guid. Control Dyn. 2010, 33, 1801–1813. [Google Scholar] [CrossRef]
  16. Ratnoo, A.; Shima, T. Guidance strategies against defended aerial targets. J. Guid. Control Dyn. 2012, 35, 1059–1068. [Google Scholar] [CrossRef]
  17. Perelman, A.; Shima, T.; Rusnak, I. Cooperative differential games strategies for active aircraft protection from a homing missile. J. Guid. Control Dyn. 2011, 34, 761–773. [Google Scholar] [CrossRef]
  18. Prokopov, O.; Shima, T. Linear quadratic optimal cooperative strategies for active aircraft protection. J. Guid. Control Dyn. 2013, 36, 753–764. [Google Scholar] [CrossRef]
  19. Weiss, M.; Shima, T.; Castaneda, D. Minimum effort intercept and evasion guidance algorithms for active aircraft defense. J. Guid. Control Dyn. 2016, 39, 2297–2310. [Google Scholar] [CrossRef]
  20. Shalumov, V. Online launch-time selection using deep learning in a target–missile–defender engagement. J. Guid. Control Dyn. 2019, 16, 224–236. [Google Scholar] [CrossRef]
  21. Sun, Q.L.; Zhang, C.F.; Liu, N.; Zhou, W.; Qi, N. Guidance laws for attacking defended target. Chin. J. Aeronaut. 2019, 32, 2337–2353. [Google Scholar] [CrossRef]
  22. Qi, N.M.; Sun, Q.L.; Zhao, J. Evasion and pursuit guidance law against defended target. Chin. J. Aeronaut. 2017, 30, 1958–1973. [Google Scholar] [CrossRef]
  23. Shalumov, V. Cooperative online guide-launch-guide policy in a target-missile-defender engagement using deep reinforcement learning. Aerosp. Sci. Technol. 2020, 104, 105996. [Google Scholar] [CrossRef]
  24. Rubinsky, S.; Gutman, S. Three player pursuit and evasion conflict. J. Guid. Control Dyn. 2014, 37, 98–110. [Google Scholar] [CrossRef]
  25. Liu, Y.F.; Li, R.F.; Wang, S.Q. Orbital three-player differential game using semi-direct collocation with nonlinear programming. In Proceedings of the 2nd International Conference on Control Science and Systems Engineering, Singapore, 27–29 July 2016; pp. 217–222. [Google Scholar]
  26. Liu, Y.F.; Li, R.F.; Hu, L.; Cai, Z.-Q. Optimal solution to orbital three-player defense problems using impulsive transfer. Soft Comput. 2018, 22, 2921–2934. [Google Scholar] [CrossRef]
  27. Liang, H.Z.; Wang, J.Y.; Liu, J.Q.; Liu, P. Guidance strategies for interceptor against active defense spacecraft in two-on-two engagement. Aerosp. Sci. Technol. 2020, 96, 105529. [Google Scholar] [CrossRef]
  28. Clohessy, W.H.; Wiltshire, R.S. Terminal guidance system for satellite rendezvous. J. Aerosp. Sci. 1960, 27, 653–658. [Google Scholar] [CrossRef]
  29. Engwerda, J. LQ Dynamic Optimization and Differential Games; John Wiley & Sons, Ltd.: Chichester, UK, 2005. [Google Scholar]
  30. Xia, X.; Xu, Z. Effective finite horizon linear-quadratic continuous terminal control. Int. J. Comput. Method 2013, 10, 1350038. [Google Scholar] [CrossRef]
  31. Rubinsky, S.; Gutman, S. Vector guidance approach to three-player conflict in exoatmospheric interception. J. Guid. Control Dyn. 2015, 38, 2270–2285. [Google Scholar] [CrossRef]
  32. Gutman, S.; Rubinsky, S. Exoatmospheric thrust vector interception via time-to-go analysis. J. Guid. Control Dyn. 2019, 42, 86–96. [Google Scholar] [CrossRef]
  33. Ye, D.; Shi, M.M.; Sun, Z.W. Satellite proximate interception vector guidance based on differential games. Chin. J. Aeronaut. 2018, 31, 1352–1361. [Google Scholar] [CrossRef]
  34. Behn, R.D.; Ho, Y.C. On a class of linear stochastic differential games. IEEE Trans. Automat. Contr. 1968, 13, 227–240. [Google Scholar]
  35. Ogata, K. Modern Control Engineering; Pearson Education: New Delhi, India, 2009. [Google Scholar]
  36. Ypma, T.J. Historical development of the Newton–Raphson method. SIAM Rev. 1995, 37, 531–551. [Google Scholar] [CrossRef]
Figure 1. Illustration of the PED game in the LVLH frame.
Figure 2. Illustration of a fluctuating g ( τ ) .
Figure 3. A critical point for the time-to-go.
Figure 4. Results of the game using the LQDA strategy. (a) The pursuer uses the LQDA strategy in the PE game; (b) the pursuer uses the LQ strategy in the PE game; (c) the defender uses the LQDA strategy in the PED game; (d) the defender uses the LQ strategy in the PED game; (e) the controls of the defender using the LQDA strategy; (f) the controls of the defender using the LQ strategy.
Figure 5. Miss distance vs. time-to-go under different $q_f$ settings.
Figure 6. Results of the game under the two-sided optimal pursuit strategy for the pursuer. (a) The game trajectories; (b) the controls and time-to-go.
Figure 7. Results of the game under the two-sided optimal pursuit strategy for the pursuer. (a) The trajectories when the defender has a more favorable initial position; (b) the results of LQDA and two-sided optimal pursuit (TSOP) strategy under different parameters R P .
Figure 8. Results of the game when Q f = 2 × 10 4 , R P = 0.8 × 10 10 , R E = 1.1 × 10 10 , R D = 0.2 × 10 10 . (a) the trajectories under the LQDA strategy; (b) the trajectories under the cooperation strategy.
Figure 9. Results of the game when the evader or the defender has no controls. (a) The game trajectories when u D = 0 ; (b) the game trajectories when u E = 0 .
Table 1. Initial states of the players in the game.

| Case | Height (km) | Player | x (km) | y (km) | z (km) | ẋ (m/s) | ẏ (m/s) | ż (m/s) |
|---|---|---|---|---|---|---|---|---|
| 1 | 400 | Evader | 0 | 0 | 0 | 0 | 0 | 0 |
| | | Pursuer | −6 | −16 | 4 | −9 | 13.6 | 0 |
| | | Defender | −1 | 3 | 0 | 0 | 0 | 0 |
| 2 | 500 | Evader | 0 | 0 | 0 | 0 | 0 | 0 |
| | | Pursuer | −6 | −16 | 4 | −9 | 13.6 | 0 |
| | | Defender | 0 | −5 | 0 | 0 | 0 | 0 |
| 3 | 600 | Evader | 0 | 0 | 0 | 0 | 0 | 0 |
| | | Pursuer | 0 | 10 | 0 | 0 | −2 | 0 |
| | | Defender | 5 | 10 | 0.5 | 4 | −10.8 | 0 |
Li, Z.-Y. Orbital Pursuit–Evasion–Defense Linear-Quadratic Differential Game. Aerospace 2024, 11, 443. https://doi.org/10.3390/aerospace11060443
