Algorithm for finding the guaranteed capture time.
After moving along a straight segment, the pursuer moves along a spiral. Thus, the pursuer can guarantee capture by iterating through all elements of the set of admissible evader velocities. This algorithm for computing the guaranteed capture time is implemented using the software package Maple 2023.
4.1. Sorting the Set of Velocities for Pursuit
In order to minimize the guaranteed search time, it is beneficial for the pursuer to arrange the set of velocities in a certain order. This involves the application of scheduling theory, which addresses such optimization problems. The primary challenge in establishing this order arises from the fact that the resultant tasks cannot be divided into separate, unrelated components. The sequence of actions taken influences the nature of the tasks at hand and the methods employed to address them. Scheduling theory addresses common elements found in most ordering dilemmas. While the resultant theoretical models may not precisely match specific real-world scenarios, their broad applicability allows for approximations to be derived across a wide spectrum of problems.
Scheduling theory extensively leverages a variety of established concepts and techniques aimed at making optimal decisions. These decisions are often arrived at through the creation and examination of relevant operational models. The advancement of methods for tackling extreme problems, statistical testing, heuristic programming, and similar domains has played a role in shaping the techniques for crafting optimal schedules. The maturation of scheduling theory into a distinct branch of applied mathematics coincided with the era when linear programming had reached a significant level of development. During this period, numerous linear models were studied, and tangible outcomes were achieved in the practical application of these models and methods.
Methods centered around sequential construction, analysis, and the elimination of scheduling options have become well-established approaches in scheduling theory. A multitude of domination rules have been put forth, aiming to discard entire sets of schedules through the examination of individually created segments. Computational strategies have been devised wherein the comparative merits of one schedule over others are methodically assessed. These strategies might incorporate heuristic components, such as the application of diverse preference functions, as well as certain training techniques paired with elements of statistical modeling and the like.
Dynamic programming techniques can be directly applied to address numerous problems in scheduling theory. In straightforward scenarios, optimal schedules can be devised by employing basic logic to assess alterations in schedule attributes brought about by elementary transformations. This set of methods constitutes the foundation of what is known as the combinatorial approach within scheduling theory. This approach has yielded the most efficient algorithms for solving several scheduling problems.
In our specific task, the finite and predetermined number of speeds, denoted as $n$, necessitates validation of each speed. It is assumed that during schedule creation, processing can commence with any of the available speeds. The time required to validate each speed depends on the schedule being developed. Leveraging the algorithm delineated earlier for determining the guaranteed search time, we formulate a matrix of time values $T = \{t_{ij}\}$, where $t_{ij}$ is the duration of checking speed $j$ when speed $i$ immediately precedes it. The time taken by the pursuer to check all the speeds, i.e., the guaranteed search time, therefore depends on the order of checking.
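As a minimal sketch (with a hypothetical 3-speed duration matrix `t`; these are illustrative numbers, not values from the paper), the guaranteed search time of a given checking order is accumulated as a sum of $n$ terms in the closed, TSP-like formulation:

```python
def tour_time(t, order):
    """Guaranteed search time for checking the speeds in the given
    order: t[i][j] is the duration of checking speed j when speed i
    immediately precedes it; the closing term returns to the start."""
    total = sum(t[i][j] for i, j in zip(order, order[1:]))
    return total + t[order[-1]][order[0]]

# Hypothetical 3-speed duration matrix (indices 0..2).
t = [[0, 4, 9],
     [3, 0, 5],
     [7, 6, 0]]
print(tour_time(t, [0, 1, 2]))  # 4 + 5 + 7 = 16
print(tour_time(t, [0, 2, 1]))  # 9 + 6 + 3 = 18
```

Different orders give different totals, and this total is exactly the quantity the scheduling step tries to minimize.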
Maximum test duration: the task is to check each of the $n$ speeds once and only once, in an order that minimizes the total duration of the pass. It is necessary to find a matrix $X = \{x_{ij}\}$ of order $n$ with elements $x_{ij} \in \{0, 1\}$, where $x_{ij} = 1$ if speed $j$ is checked immediately after speed $i$ and $x_{ij} = 0$ otherwise, such that the sum $\sum_{i,j} t_{ij} x_{ij}$ is minimized. Two methodologies are employed to attain optimal solutions of heightened accuracy when dealing with a limited number of speeds, while also providing approximations for scenarios involving a larger array of speeds. The primary algorithm, known as the "branch and bound" method, sequentially transforms matrices through a specific procedure, with each matrix falling into one of three standard configurations.
Initially, an initial matrix is formulated based on the provided problem. Subsequently, this matrix undergoes a systematic process following a predefined scheme to generate more straightforward iterations, which are also depicted in matrix form. The established procedures are repeatedly applied to each of these iterations until a conclusive solution to the problem is achieved. Thus, the analysis of each matrix results in one of three potential outcomes:
Directly acquiring a solution from the original matrix, particularly when the problem is of sufficient simplicity.
Disregarding a matrix from further consideration upon establishing that it does not yield a solution to the problem.
Employing branching, a method that involves simplifying the problem by contemplating two reduced-complexity variations of the original.
The solution to the problem is an order of checking all the speeds, given by a permutation of indices $(i_1, i_2, \ldots, i_n)$. The obtained solution is a sum of $n$ terms, each of which is an element of the matrix $T$ taken according to the adopted order:
$$ T = t_{i_1 i_2} + t_{i_2 i_3} + \cdots + t_{i_{n-1} i_n} + t_{i_n i_1}. $$
The optimal solution is the permutation that minimizes this sum. At each step of the algorithm, the problem involves $n$ speeds, of which $k$ are already fixed, and the remaining $n - k$ must be ordered optimally. For every such selection, a value must be assigned as a lower bound for all possible completions, including the optimal one. There are trivial lower bounds, such as the minimum element of matrix $T$ or the sum of the minimum elements of its $n$ rows. The subtlety of the algorithm lies in constructing this lower bound, striving to make it as large as possible.
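Both trivial bounds just mentioned can be sketched in a few lines (hypothetical matrix; diagonal entries are excluded, since a speed never precedes itself):

```python
def trivial_lower_bounds(t):
    """The two trivial lower bounds from the text: the minimum
    off-diagonal element of T, and the sum of the row minima."""
    n = len(t)
    min_element = min(t[i][j] for i in range(n) for j in range(n) if i != j)
    row_min_sum = sum(min(t[i][j] for j in range(n) if j != i)
                      for i in range(n))
    return min_element, row_min_sum

t = [[0, 4, 9],
     [3, 0, 5],
     [7, 6, 0]]
print(trivial_lower_bounds(t))  # (3, 13)
```

The second bound is much larger than the first, which illustrates why the algorithm strives to construct the largest bound it can.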
Thus, each matrix is characterized by the number of transitions that remain to be fixed, $n - k$, and by the lower bound $Y$ of its solutions. Moreover, it can be assumed that at least one solution of the problem is known for the remaining set of steps (for example, the permutation $1, \ldots, n$ is a solution); let $z_0$ be the value of the best of them. The matrix then undergoes further processing depending on the following possibilities:
If $n - k \le 2$, there are no more than two transitions left, and the solution is completed immediately. If its value is less than $z_0$, then $z_0$ is set equal to this new value, and the solution is recorded as the best known.
If $Y \ge z_0$, the matrix is excluded, because the orderings it represents cannot lead to better solutions than the one already known.
If none of the above situations apply, then two matrices are created instead of the original one. The branching of the original verification occurs in two directions, and each direction corresponds to its own matrix:
In one of them, the transition from i to j is chosen, as a result of which the lower bound of the solutions may increase.
In the other one, the transition from i to j is prohibited (the element $t_{ij}$ is set equal to $\infty$), because of which the lower bound of the solutions will undoubtedly increase.
Hence, the resultant matrices are distinguished by a progressively increasing lower bound and, potentially, a larger number of fixed transitions. Furthermore, each subsequent matrix admits fewer orderings than its predecessor, eventually culminating in a state where the permutation is completely determined.
Situations wherein the solution is readily derived, or the matrix is eliminated, are readily apparent. The core of branching revolves around the concepts of reduction and selection. Reduction endeavors to ensure that at least one zero is present in every row and column of the original matrix $T$. Given that each solution to the problem contains exactly one element from every row and every column of matrix $T$, altering all elements within a column or row by a constant, whether subtracting or adding, does not displace the optimal solution.
Subtract a constant $h$ from each element of a row or column of matrix $T$, and let the resulting matrix be $T'$. Then the optimal solution found from $T'$ is also optimal for $T$, i.e., both matrices have the same permutation that minimizes the time, and $h$ can be added to the lower bound for the solutions obtained from $T'$. Subtraction can continue until each column and row contains at least one zero (i.e., the minimum element in each row and column is zero). The sum of all reduction constants determines the lower bound for the original problem. The matrix $T$ is called reduced if it cannot be reduced further. In this case, finding route options is associated with studying a particular transition, say from $i$ to $j$. As a result, instead of the original matrix, we consider two matrices.
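A sketch of the reduction step (forbidden transitions, including the diagonal, are marked with infinity; the matrix is hypothetical illustrative data):

```python
INF = float("inf")

def reduce_matrix(t):
    """Subtract each row's minimum, then each column's minimum.
    Returns the reduced matrix and the sum of the reduction
    constants, which is a lower bound for the original problem."""
    n = len(t)
    t = [row[:] for row in t]
    bound = 0
    for i in range(n):                              # rows
        h = min(x for x in t[i] if x != INF)
        bound += h
        t[i] = [x - h if x != INF else INF for x in t[i]]
    for j in range(n):                              # columns
        h = min(t[i][j] for i in range(n) if t[i][j] != INF)
        bound += h
        for i in range(n):
            if t[i][j] != INF:
                t[i][j] -= h
    return t, bound

t0 = [[INF, 4, 9],
      [3, INF, 5],
      [7, 6, INF]]
reduced, bound = reduce_matrix(t0)
print(bound)  # 15: row constants 4 + 3 + 6, column constants 0 + 0 + 2
```

After reduction every row and column of `reduced` contains a zero, and `bound` is a valid lower bound for the best checking order.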
Matrix $T_{(i,j)}$, which is associated with finding the best of all solutions given by matrix $T$ that include the transition $(i, j)$.
Matrix $T_{(\overline{i,j})}$, which is associated with choosing the best of all solutions that do not include the transition $(i, j)$.
After fixing the transition from $i$ to $j$, we need to exclude transitions from $i$ to speeds other than $j$ and transitions to $j$ from speeds other than $i$, by setting all elements of row $i$ and column $j$, except $t_{ij}$, to infinity. We also need to prohibit the reverse order $(j, i)$ by setting $t_{ji} = \infty$, because checking all speeds during a single pass cannot include both $(i, j)$ and $(j, i)$ simultaneously. Since these prohibitions may eliminate some zeros in matrix $T_{(i,j)}$, a further reduction of $T_{(i,j)}$, and hence a new, larger lower bound for the solutions associated with it, is not excluded.
In the matrix $T_{(\overline{i,j})}$, the transition from $i$ to $j$ is prohibited, i.e., $t_{ij}$ is set to infinity. In this case, too, the matrix may admit further reduction and a resulting increase in the lower bound for the solutions obtained from $T_{(\overline{i,j})}$. The choice of $(i, j)$ should be such as to maximize the lower bound for $T_{(\overline{i,j})}$, which may allow a number of orderings to be eliminated without further branching. To achieve this, all possible pairs in the matrix are examined, and the choice is made so that the sum of the two corresponding reducing constants is maximal. Obviously, transitions corresponding to zero elements of the matrix should be prohibited first, since prohibiting nonzero elements does not contribute to a further reduction of the matrix.
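The selection of the branching transition can be sketched as follows, assuming an already reduced matrix (every row and column contains a zero; INF marks forbidden transitions). The penalty of a zero element is the smallest other element of its row plus the smallest other element of its column:

```python
INF = float("inf")

def choose_branching(t):
    """Among the zero elements of a reduced matrix, pick the
    transition (i, j) whose prohibition would raise the lower
    bound of the 'without (i, j)' branch the most."""
    n = len(t)
    best, best_penalty = None, -1
    for i in range(n):
        for j in range(n):
            if t[i][j] == 0:
                row_min = min(t[i][k] for k in range(n) if k != j)
                col_min = min(t[k][j] for k in range(n) if k != i)
                if row_min + col_min > best_penalty:
                    best, best_penalty = (i, j), row_min + col_min
    return best, best_penalty

m = [[INF, 0, 3],
     [0, INF, 0],
     [1, 0, INF]]
print(choose_branching(m))  # ((0, 1), 3)
```

Branching on the zero with the largest penalty makes the "prohibited" branch as unattractive as possible, so it can often be discarded immediately.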
The second way to order the enumeration of velocities is the method of dynamic programming. Without loss of generality, choose a certain velocity $v_1$ as the initial one. After that, divide the set of all velocities into four non-intersecting subsets:
$\{v_1\}$ - the set consisting only of the initial speed.
$\{v_i\}$ - the set consisting of a single non-initial speed.
$\{S_k\}$ - a set consisting of $k$ speeds, other than $v_1$ and $v_i$.
$\{R\}$ - the set consisting of the remaining speeds.
Let us assume that the optimal order for checking the speeds, starting with speed $v_1$, is known. Then we can choose a speed $v_i$ and a subset $\{S_k\}$ consisting of $k$ speeds in such a way that this optimal permutation begins with $\{v_1\}$, passes through the set $\{R\}$, reaches $\{v_i\}$, and after that checks the set $\{S_k\}$.
Let us now focus exclusively on the section of the permutation that lies between $\{v_i\}$ and $\{v_1\}$ with an intermediate check of $\{S_k\}$. The time of this segment must be the minimum possible. If this were not the case, then, without changing the part of the permutation up to speed $v_i$, we could replace this segment with a faster one and thereby reduce the time of the whole, which is impossible, since it contradicts the initial assumption that the optimal permutation is known.
Let $f(v_i; \{S_k\})$ be the time for checking the best permutation segment from $v_i$ to $v_1$, including the set $\{S_k\}$. Note that when $k = 0$, $f(v_i; \varnothing) = t_{i1}$.
If $k = n - 1$ and $v_i$ coincides with the start of the movement, then $f(v_1; \{S_{n-1}\})$ is the time of the optimal permutation of the original problem. The idea of dynamic programming is to increment $k$ step by step, starting from $k = 0$. Starting from $f(v_1; \{S_{n-1}\})$, the permutation is then traversed in reverse order to recover the optimal solution.
For the problem under consideration, the main functional equation of dynamic programming is given by:
$$ f(v_i; \{S_k\}) = \min_{v_j \in \{S_k\}} \left[ t_{ij} + f(v_j; \{S_k\} \setminus \{v_j\}) \right]. $$
This equation demonstrates that to find the optimal permutation starting from $v_i$ and ending with $v_1$, with $k$ intermediate velocities, one needs to choose the best among the permutations that start with the transition from $v_i$ to one of the velocities $v_j \in \{S_k\}$ and then move in the fastest manner to $v_1$ with intermediate visits to the remaining $k - 1$ options. Each of these options, in turn, represents the fastest of its own permutations, according to the same equation. Eventually, a point is reached where the right-hand side of the equation simply represents an element of $T$.
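This functional equation is the classical Held-Karp recursion, and a direct memoized sketch is short (hypothetical duration matrix; index 0 plays the role of the initial speed $v_1$, and the base case returns $t_{i1}$ as above):

```python
from functools import lru_cache

def held_karp(t):
    """f(i, S): time of the best segment that starts at speed i,
    checks every speed in S, and ends at the initial speed 0.
    The answer to the whole problem is f(0, {1, ..., n-1})."""
    n = len(t)

    @lru_cache(maxsize=None)
    def f(i, S):
        if not S:                      # k = 0: return straight to v1
            return t[i][0]
        return min(t[i][j] + f(j, S - {j}) for j in S)

    return f(0, frozenset(range(1, n)))

t = [[0, 4, 9],
     [3, 0, 5],
     [7, 6, 0]]
print(held_karp(t))  # 16: the order 0 -> 1 -> 2 -> 0 gives 4 + 5 + 7
```

The memoization table plays the role of the step-by-step tables built below for increasing $k$.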
The solution to the problem for five velocities will be considered as an example, with the fifth velocity taken as the starting point. Then $f(v_5; \{v_1, v_2, v_3, v_4\})$ represents the shortest time of the best permutation, and any sequence of checking the velocities that attains this time is optimal. At step $k = 0$, the solutions $f(v_i; \varnothing) = t_{i5}$ are written out for each of the remaining velocities.
At the first step, solutions for $k = 1$ are expressed in terms of the known solutions for $k = 0$. At the second step, solutions for $k = 2$ are expressed in terms of the known solutions for $k = 1$.
We proceed to the third step, using each of the solutions of the second step.
At the fourth step, the solution of the original problem is obtained.
At each stage there are several variants: for a given $k$, a value of $f$ must be computed for every admissible combination of a velocity and a $k$-element subset, and a number of comparisons must be made between the candidates. The total number of computations at all stages is the sum of these quantities.
As an example, let us consider solving the problem for six speeds. Table 1 is obtained by applying the algorithm for computing the guaranteed search time, implemented using the Maple 2023 software package.
4.2. Theoretical Game Model of Search and Interception
To reduce the guaranteed interception time, it is advisable for the pursuer to order the search for escape speeds. However, if the escapee becomes aware of this, they can move at a speed that the pursuer intends to check last, which would allow the escapee to maximize their search time. Thus, the search problem can be considered a game problem under the conditions of opposition.
The system $\Gamma = (X, Y, K)$, where $X$ and $Y$ are non-empty sets and $K \colon X \times Y \to \mathbb{R}$, is called an antagonistic game in normal form. Elements $x \in X$ and $y \in Y$ are called strategies of player 1 and player 2, respectively, in game $\Gamma$. Elements of the Cartesian product $X \times Y$ (i.e., strategy pairs $(x, y)$, where $x \in X$ and $y \in Y$) are called situations, and the function $K$ is the payoff function of player 1. Player 2's payoff in an antagonistic game in situation $(x, y)$ is assumed to be $-K(x, y)$, so the function $K$ is also called the payoff function of the game, and game $\Gamma$ is a zero-sum game.
Let us define the game for the search problem under consideration. Let the escapee choose any speed from a given finite set of speeds and any direction from a given finite set of directions. Then, the set of pure strategies of the escapee (player 1) is the set of combinations of their possible movement velocities and movement directions α, and the set of pure strategies of the pursuer is the set of all possible permutations of the escapee's velocities. The payoff is the time it takes to catch the escapee, which is found using the algorithm described above. The game is interpreted as follows: the players independently and simultaneously choose strategies $x \in X$ and $y \in Y$. After that, player 1 receives a payoff equal to $K(x, y)$, and player 2 receives a payoff equal to $-K(x, y)$. Antagonistic games in which both players have finite sets of strategies are called matrix games.
Let player 1 in a matrix game have a total of $m$ strategies. We establish a one-to-one correspondence between the set of strategies $X$ and the set $M = \{1, \ldots, m\}$. Similarly, if player 2 has $n$ strategies, we can establish a one-to-one correspondence between the sets $Y$ and $N = \{1, \ldots, n\}$. Then, the game is fully determined by the matrix $A = \{a_{ij}\}$, where $a_{ij} = K(x_i, y_j)$. In this case, the game is played as follows: player 1 chooses a row $i \in M$, and player 2 (simultaneously with player 1) chooses a column $j \in N$. After that, player 1 receives a payoff of $a_{ij}$, and player 2 receives $(-a_{ij})$.
Each player aims to maximize their own winnings by choosing a strategy. However, for player 1 the winnings are determined by the function $K(x, y)$, while for player 2 they are $-K(x, y)$, i.e., the players' goals are directly opposite. It should be noted that the winnings of player 1 (player 2) are determined by the situations that arise during the game. However, each situation, and therefore the winnings of a player, depends not only on their own choice but also on the strategy their opponent chooses. Therefore, in seeking to obtain the maximum possible winnings, each player must take the behavior of their opponent into account.
In game theory, it is assumed that both players act rationally, i.e., strive to achieve maximum winnings, assuming that their opponent acts in the way that is best for themselves. Let player 1 choose a strategy $x \in X$. Then, in the worst case, they will win $\inf_{y} K(x, y)$. Therefore, player 1 can always guarantee themselves a payoff of
$$ \underline{v} = \sup_{x \in X} \inf_{y \in Y} K(x, y). $$
If we abandon the assumption of the attainability of the extremum, then player 1 can always obtain winnings that are arbitrarily close to this value. The quantity $\underline{v}$ is called the lower value of the game. If the external extremum is reached, then the value is also called the maximin; the principle of constructing the strategy based on maximizing the minimum payoff is called the maximin principle; and the strategy chosen in accordance with this principle is the maximin strategy of player 1.
For player 2, similar reasoning can be applied. Suppose they choose strategy $y \in Y$. Then in the worst case, they will lose $\sup_{x} K(x, y)$. Therefore, the second player can always guarantee that their loss does not exceed
$$ \overline{v} = \inf_{y \in Y} \sup_{x \in X} K(x, y). $$
The number $\overline{v}$ is called the upper value of the game, and when the external extremum is attained, it is specifically called the minimax. The principle of constructing strategy $y$, aimed at minimizing the maximum losses, is known as the minimax principle, and the strategy chosen in accordance with this principle represents the minimax strategy of player 2. It should be emphasized that the existence of a minimax (maximin) strategy is determined by the attainability of the external extremum. In the matrix game $\Gamma_A$, these extrema are achieved, and the lower and upper values of the game are, respectively,
$$ \underline{v} = \max_{i} \min_{j} a_{ij}, \qquad \overline{v} = \min_{j} \max_{i} a_{ij}. $$
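For a payoff matrix, both values are one-line computations (illustrative 2×2 matrix; here the lower value is strictly below the upper value, so the game has no saddle point):

```python
def game_values(a):
    """Lower value (maximin): max over rows of the row minimum.
    Upper value (minimax): min over columns of the column maximum."""
    lower = max(min(row) for row in a)
    upper = min(max(row[j] for row in a) for j in range(len(a[0])))
    return lower, upper

a = [[3, 1],
     [0, 2]]
print(game_values(a))  # (1, 2)
```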
Let us consider the issue of optimal behavior for players in an antagonistic game. It is natural to regard a situation $(x^*, y^*)$ in game $\Gamma$ as optimal if neither player has an incentive to deviate from it. Such a situation is called an equilibrium, and the optimality principle based on constructing an equilibrium situation is called the principle of equilibrium. For antagonistic games, the principle of equilibrium is equivalent to the principles of minimax and maximin. In an antagonistic game $\Gamma$, a situation $(x^*, y^*)$ is called an equilibrium or a saddle point if
$$ K(x, y^*) \le K(x^*, y^*) \le K(x^*, y) \quad \text{for all } x \in X,\ y \in Y. $$
For the matrix game $\Gamma_A$, we are talking about the saddle points of the payoff matrix $A$, i.e., points $(i^*, j^*)$ such that for all $i \in M$ and $j \in N$ the inequalities
$$ a_{i j^*} \le a_{i^* j^*} \le a_{i^* j} $$
hold.
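A direct check of this definition (an entry is a saddle point when it is simultaneously minimal in its row and maximal in its column; the matrices are illustrative):

```python
def saddle_points(a):
    """All saddle points (i*, j*) of the payoff matrix a."""
    m, n = len(a), len(a[0])
    return [(i, j)
            for i in range(m) for j in range(n)
            if a[i][j] == min(a[i])
            and a[i][j] == max(a[k][j] for k in range(m))]

print(saddle_points([[4, 2, 5],
                     [3, 1, 6]]))   # [(0, 1)]: 2 is its row min and column max
print(saddle_points([[3, 1],
                     [0, 2]]))      # []: this game has no saddle point
```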
Theorem 1. Let $(x_1, y_1)$ and $(x_2, y_2)$ be two arbitrary equilibrium situations in the antagonistic game $\Gamma$. Then $K(x_1, y_1) = K(x_2, y_2)$ and $(x_1, y_2), (x_2, y_1) \in Z(\Gamma)$, where $Z(\Gamma)$ is the set of all equilibrium situations.
Let $(x^*, y^*)$ be an equilibrium situation in game $\Gamma$. The number $v = K(x^*, y^*)$ is called the value of game $\Gamma$.
Now we establish a connection between the principle of equilibrium and the principles of minimax in an antagonistic game.
Theorem 2. For an equilibrium situation to exist in game $\Gamma$, it is necessary and sufficient that the minimax and maximin exist and the equality $\underline{v} = \overline{v}$ is satisfied. When an equilibrium situation is present in a matrix game, the minimax strategy aligns with the maximin strategy. According to the definition of an equilibrium situation, each player can openly communicate their optimal (maximin) strategy to their opponent, and as a result, neither player can gain any additional advantage.
Now, consider a scenario where no equilibrium situation exists in game $\Gamma$, so that $\underline{v} < \overline{v}$. In such cases, both the maximin and minimax strategies cease to be optimal choices. Adhering to these strategies may not yield an advantage, as players could potentially secure greater gains by deviating from them. However, divulging the selected strategy to the opponent might result in even greater losses compared to sticking with the maximin or minimax approach.
In such intricate scenarios, it becomes reasonable for players to introduce an element of randomness into their actions, thereby amplifying the uncertainty in strategy selection. By embracing randomness, the outcome of the choice remains concealed from the opponent, just as it is initially uncertain to the player themselves until the random mechanism is set in motion. This stochastic choice of strategy is termed a "mixed strategy." As a random variable, a mixed strategy of player 1 is represented by an m-dimensional vector
$$ x = (\xi_1, \ldots, \xi_m), \qquad \xi_i \ge 0, \quad \sum_{i=1}^{m} \xi_i = 1. $$
Similarly, a mixed strategy of player 2 is an n-dimensional vector
$$ y = (\eta_1, \ldots, \eta_n), \qquad \eta_j \ge 0, \quad \sum_{j=1}^{n} \eta_j = 1. $$
In this case, $\xi_i$ and $\eta_j$ represent the probabilities of selecting the pure strategies $i$ and $j$, respectively, when the players employ the mixed strategies $x$ and $y$. We will use $X$ and $Y$ to denote the sets of mixed strategies of the first and second players, respectively. The set of mixed strategies of a player is an extension of their pure strategy space. A pair $(x, y)$ of mixed strategies of the players in matrix game $\Gamma_A$ is called a situation in mixed strategies.
Let us define the payoff of player 1 in the situation $(x, y)$ in mixed strategies for the matrix game $\Gamma_A$ as the mathematical expectation of their payoff, given that the players use the mixed strategies $x$ and $y$, respectively. The players choose their strategies independently of each other; therefore, the expected payoff $K(x, y)$ in the situation in mixed strategies $x = (\xi_1, \ldots, \xi_m)$ and $y = (\eta_1, \ldots, \eta_n)$ is equal to
$$ K(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}\, \xi_i\, \eta_j. $$
The situation $(x^*, y^*)$ is called an equilibrium situation if
$$ K(x, y^*) \le K(x^*, y^*) \le K(x^*, y) \quad \text{for all } x \in X,\ y \in Y. $$
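The expectation above is a plain double sum (illustrative data; for the 2×2 matrix used here, the strategies shown are optimal and attain the game value 1.5):

```python
def expected_payoff(a, x, y):
    """K(x, y) = sum_i sum_j a[i][j] * x[i] * y[j]: the expected
    payoff when the players use mixed strategies x and y."""
    return sum(a[i][j] * x[i] * y[j]
               for i in range(len(a)) for j in range(len(a[0])))

a = [[3, 1],
     [0, 2]]
# Optimal mixed strategies of this game (value 1.5).
print(expected_payoff(a, (0.5, 0.5), (0.25, 0.75)))  # 1.5
```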
Theorem 3. Every matrix game has a situation of equilibrium in mixed strategies.
A common way to solve a matrix game is by reducing it to a linear programming problem. However, difficulties arise when solving matrix games of large dimensions. Therefore, the iterative Brown-Robinson method is often used to find a solution. The idea of the method is to repeatedly play a fictitious game with the given payoff matrix. One repetition of the game is called a round. Let $A = \{a_{ij}\}$ be an $(m \times n)$-matrix game. In the first round, both players choose their pure strategies arbitrarily. In the $k$-th round, each player chooses the pure strategy that is the best response to the empirical distribution of the opponent's moves observed in the previous rounds.
So, suppose that in the first $k$ rounds, player 1 used the $i$-th strategy $c_i^k$ times, and player 2 used the $j$-th strategy $b_j^k$ times. Then in the $(k+1)$-th round, player 1 will use the $i_{k+1}$-th strategy, and player 2 will use their $j_{k+1}$-th strategy, where
$$ i_{k+1} = \arg\max_{i} \sum_{j=1}^{n} a_{ij}\, b_j^k, \qquad j_{k+1} = \arg\min_{j} \sum_{i=1}^{m} a_{ij}\, c_i^k. $$
Let $v$ be the value of the matrix game $\Gamma_A$. The vectors $\bar{x}^k = (c_1^k / k, \ldots, c_m^k / k)$ and $\bar{y}^k = (b_1^k / k, \ldots, b_n^k / k)$ are mixed strategies of players 1 and 2, respectively, so by the definition of the value of the game, we have
$$ \min_{j} \sum_{i=1}^{m} a_{ij}\, \frac{c_i^k}{k} \;\le\; v \;\le\; \max_{i} \sum_{j=1}^{n} a_{ij}\, \frac{b_j^k}{k}. $$
Thus, an iterative process is obtained that allows finding an approximate solution of the matrix game; the accuracy of the approximation to the true value of the game is determined by the length of the interval between these bounds. The convergence of the algorithm is guaranteed by Robinson's theorem.
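A compact sketch of the Brown-Robinson iteration described above (pure Python; the first-round pure strategies are fixed arbitrarily rather than randomly, which does not affect convergence, and the returned bounds bracket the value of the game):

```python
def brown_robinson(a, rounds=2000):
    """Fictitious play: each round, every player best-responds to
    the opponent's accumulated empirical mixed strategy. Returns
    lower/upper bounds on the game value and the empirical strategies."""
    m, n = len(a), len(a[0])
    row_gain = [0.0] * m          # accumulated payoff of each row
    col_loss = [0.0] * n          # accumulated loss of each column
    x_count, y_count = [0] * m, [0] * n
    i, j = 0, 0                   # arbitrary first-round choices
    for _ in range(rounds):
        x_count[i] += 1
        y_count[j] += 1
        for r in range(m):
            row_gain[r] += a[r][j]
        for c in range(n):
            col_loss[c] += a[i][c]
        i = max(range(m), key=lambda r: row_gain[r])
        j = min(range(n), key=lambda c: col_loss[c])
    x = [c / rounds for c in x_count]
    y = [c / rounds for c in y_count]
    return min(col_loss) / rounds, max(row_gain) / rounds, x, y

lower, upper, x, y = brown_robinson([[3, 1], [0, 2]])
# The true value of this game is 1.5; the bounds close in on it.
```

The interval [lower, upper] always contains the value of the game, and its length measures the accuracy reached after the given number of rounds.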
During our simulations, we employed the MPC [22] module as the primary instrument for governing the flight trajectory of the UAV.
Model predictive control (MPC), also known as rolling time-domain control or receding horizon control (RHC), is a control strategy. Its operating principle can be described as follows: at each sampling moment, using the current measurement information, a finite-horizon, open-loop optimal control problem is solved online, and the first element of the obtained control sequence is applied to the controlled object; in the next sampling period, the process is repeated. Model predictive control can therefore anticipate future events and act on them accordingly. It is mainly used for long-term multiple-input, multiple-output systems in the process industries, such as chemical plants and metallurgical manufacturing. MPC can handle soft and hard constraints on inputs, states, and outputs, which suits applications such as chemical plants, metallurgical manufacturing, oil refineries, power systems, and other process industries. For nonlinear and constrained systems, an exact analytical solution cannot be obtained by directly solving the Hamilton-Jacobi-Bellman equation of the system, so this method, which relies on real-time numerical optimization, has attracted extensive attention.
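The receding-horizon idea can be illustrated with a deliberately small sketch (this is not the MPC module used in the simulations; it is a toy 1-D double integrator with a brute-force search over a discrete control set, and all constants are illustrative):

```python
from itertools import product

def mpc_step(pos, vel, target, horizon=4, dt=0.1,
             controls=(-1.0, 0.0, 1.0)):
    """One receding-horizon step: enumerate every control sequence
    over the horizon, score the predicted trajectory, and return
    only the FIRST control; the rest is re-planned at the next sample."""
    def cost(seq):
        p, v, c = pos, vel, 0.0
        for u in seq:
            v += u * dt                  # toy double-integrator model
            p += v * dt
            c += (p - target) ** 2 + 0.1 * v ** 2 + 0.01 * u ** 2
        return c
    return min(product(controls, repeat=horizon), key=cost)[0]

# Closed loop: solve online at every sample, apply the first control.
pos, vel = 0.0, 0.0
for _ in range(200):
    u = mpc_step(pos, vel, target=1.0)
    vel += u * 0.1
    pos += vel * 0.1
```

After 20 s of simulated time the state settles near the setpoint; input constraints enter naturally by restricting the enumerated control set.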
Example 1. The initial distance between the pursuer and the evader is 200 km. The evader chooses a velocity from the set {8, 56, 78} and a direction from the set {23, 137, 182}. The maximum speed of the pursuer is 100 km/h. The set of strategies for the evader is then formed from all speed-direction combinations, and the set of strategies for the pursuer from all permutations of the evader's speeds. The resulting game matrix is shown in Table 2.
The game is solved by the Brown-Robinson method, and the value of the game is 35189.49. The evader's mixed strategy is (1/20, 0, 0, 0, 0, 0, 1/20, 3/10, 3/5), and the pursuer's mixed strategy is (1/4, 1/20, 3/5, 0, 0, 1/10).
We now adapt the problem to search and pursuit between quadrotor UAVs, modifying it slightly. Let the distance between the fugitive UAV and the ground be 100 m. The fugitive UAV selects a speed from the set {8, 56, 78} along one axis and a value from the set {23, 37, 82} as the direction along the other axis. The maximum speed of the chaser is 100 m/min. The fugitive's strategy set is then formed from these combinations. The modeling process is shown in Figure 1; the movement routes that the escapee and the chaser plot on the basis of their strategy sets are shown in Figure 2.
Example 2. Let the initial distance between the pursuer and the evader be 50 km. The evader chooses a speed from the set {4, 10, 16} and a direction from the set {8, 10, 16}. The maximum speed of the pursuer is 80 km/h. The evader's and pursuer's strategy sets are formed as in Example 1. The resulting game matrix is shown in Table 3.
We again adapt the problem to search and pursuit between quadrotor UAVs, modifying it slightly. Let the distance between the fugitive UAV and the ground be 100 m. The fugitive UAV selects a speed from the set {4, 10, 16} along one axis and a value from the set {8, 10, 16} as the direction along the other axis. The maximum speed of the chaser is $V^P$ = 100 m/min; the fugitive's strategy set is formed accordingly.
The game was solved using the Brown-Robinson method, and the value of the game is 1.57. The strategy for the evader is (1/20, 0, 0, 0, 0, 0, 1/10, 1/4, 3/5), and the strategy for the pursuer is (9/20, 1/20, 3/20, 1/20, 1/4, 1/20). The solutions to the examples show that the most probable speed for the evader is the maximum of the possible speeds. Therefore, the pursuer should start checking the speeds with the maximum possible speed.
The movement routes that the escapee and the chaser plot on the basis of their strategy sets are shown in Figure 3.