**1. Introduction**

Use of autonomous robots in groups is a promising area of research in today's mobile robotics, and it receives much attention. Decentralized control of autonomous robots is one of the more complex yet effective approaches. Thus, many papers cover decentralized control applications in ground-based robots [1–3] and autonomous unmanned aerial vehicles (UAVs) alike [4,5]. Most papers cover decentralized control of rotary-wing UAV groups, mainly quadcopter formations [6,7].

Currently, control of decentralized swarms of autonomous robots [8–10] and unmanned aerial vehicles (UAVs) [11,12] is a promising area of research. However, beside control and trajectory-planning algorithms, these formations require optimizing their transient trajectories. Since autonomous robot formations, including UAV formations, are complex nonlinear interconnected systems, soft computing could be effective for such optimization.

Control optimization research focuses on evolutionary algorithms [13] including genetic algorithms [14,15] and particle swarm optimization [16,17]. Some papers show implementation of particle swarm optimization for single-quadcopter controllers [18–20], UAV formations [21–23], UAV trajectory optimization [24], UAV movement planning [25] and UAV formations [26] in an uncertain environment. However, swarm optimization for vector field-controlled UAV formations remains under-researched. Thus, the goal hereof is to test the feasibility of applying particle swarm optimization to a decentralized consensus-based UAV formation controlled via vector field guidance.

### **2. Preliminary Remarks and Statement of Problem**

The assumptions here are the same as in [12,27]: no wind, inter-UAV communication enabled and sufficiently accurate computation of the relative inter-UAV distance in the group.

The dynamics of a decentralized UAV formation can be tested via full models as well as via high-level models that approximate the movement of the formation provided that the UAVs are equipped with fine-tuned onboard autopilots. Full models are preferable, e.g., for

**Citation:** Muslimov, T. Particle Swarm Optimization for Target Encirclement by a UAV Formation. *Eng. Proc.* **2023**, *33*, 15. https:// doi.org/10.3390/engproc2023033015

Academic Editors: Askhat Diveev, Ivan Zelinka, Arutun Avetisyan and Alexander Ilin

Published: 9 June 2023

**Copyright:** © 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

final testing of control algorithms or for testing formation-wide stability. High-level models, also referred to in the literature as guidance models, are more suitable for simulating spatial movement planning algorithms as well as for trajectory optimization. However, full models are still useful when the trial and error method is used to find the initial values for the formation controller coefficients that are further to be used for trajectory optimization.

UAV formation trajectory optimization is a subtask of cooperative target tracking. This task is sometimes referred to as collective circumnavigation or target encirclement. The idea is to maintain a certain preset distance not only between the UAVs (through specified angular values) but also to the target encirclement orbit, which is a moving path. Formal statement of the problem can be found in Section 4. We covered a similar problem in [28], where we used a genetic algorithm to solve it.

#### **3. Consensus-Based UAV Formation Control Algorithm**

This paper uses a decentralized neighbor-to-neighbor interaction topology, where UAVs only receive data from the neighboring aircraft. The topology can be defined as a graph referred to as the "interaction graph" hereinafter. Paper [12] shows a mathematical model of interaction in a consensus-based UAV formation. The same model is described later in this section. Let N be the set of all UAV agents.

Let **<sup>e</sup>***<sup>θ</sup>* <sup>∈</sup> <sup>R</sup>*N*×<sup>1</sup> be this vector, where <sup>R</sup>*N*×<sup>1</sup> is a space of *<sup>N</sup>* <sup>×</sup> 1-dimensional matrices with components from R. To find this vector, use some elements of the vector of all possible relative phase shift angle errors **e¯** *<sup>θ</sup>* = *e*ˆ*i*, *<sup>j</sup>* <sup>∈</sup> <sup>R</sup>*N*(*N*−1)×1, where *<sup>e</sup>*ˆ*i*, *<sup>j</sup>* is the value of error for the directly interacting *i*th and *j*th agents. The choice is dictated by the interaction architecture; in this research, the control action vector is set as such for open-chain interaction in the same manner as described in [12,27]:

$$\mathbf{e}\_{\theta} = \begin{bmatrix} e\_1 \\ \vdots \\ e\_k \\ \vdots \\ \vdots \\ e\_N \end{bmatrix} = \begin{bmatrix} \pounds\_{12} \\ \vdots \\ \pounds\_{k-1,k} + \pounds\_{k,k+1} \\ \vdots \\ \pounds\_{N-1,N} \end{bmatrix} = \mathbf{\hat{M}}\_{\theta} \mathbf{\bar{e}}\_{\theta} + \mathbf{D}\_{\prime} \tag{1}$$

where **<sup>D</sup>** <sup>=</sup> <sup>−</sup>**M***θ***H**−<sup>1</sup> *θ* **P***d θ* T , *P*ˆ *θ* !<sup>T</sup> is a system control vector in the space of relative distances (an (*N* − 1)-dimensional space generated by the interaction graph incidence matrix columns), and **H***<sup>θ</sup>* is a matrix that specifies the agents for agent-to-agent distance measurements, defined as follows:

$$\mathbf{H}\_{\theta} = \begin{bmatrix} \mathbf{q}\_{1} \\ \mathbf{q}\_{2} \\ \vdots \\ \mathbf{q}\_{N} \end{bmatrix}, \quad \mathbf{q}\_{i} = \begin{bmatrix} \vdots \\ 1 \\ \vdots \\ -1 \\ \vdots \end{bmatrix}^{\mathrm{T}}, \quad i < N, \ \mathbf{q}\_{N} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}^{\mathrm{T}},$$

where **<sup>H</sup>***<sup>θ</sup>* <sup>∈</sup> <sup>R</sup>*N*×*N*, **<sup>q</sup>***<sup>i</sup>* <sup>∈</sup> <sup>R</sup>1×*<sup>N</sup>* and the positions of "1" and "-1" in **<sup>q</sup>***<sup>i</sup>* are determined according to the structure of the interaction graph.

**P***d <sup>θ</sup>* <sup>∈</sup> <sup>R</sup>(*N*−1)×<sup>1</sup> is the vector of the desired inter-UAV phase shift angles and *<sup>P</sup>*<sup>ˆ</sup> *<sup>θ</sup>* <sup>=</sup> *<sup>N</sup>* ∑ *k*=1 *ϕk* is the total of the current UAV phase angles in an inertial coordinate system;


If **<sup>n</sup>** · (**d***<sup>i</sup>* <sup>×</sup> **<sup>d</sup>***i*+1) <sup>≥</sup> 0, then - *ei*, *<sup>i</sup>*+<sup>1</sup> <sup>=</sup> *<sup>β</sup>* <sup>=</sup> arccos (**d***i*, **<sup>d</sup>***i*+1) **d***i***d***i*+1 and - *ei*, *<sup>i</sup>*+<sup>1</sup> = 2*π* − *β* in other cases, where **d***k*, *k* ∈ N is the vector of aircraft-to-moving-target distance at a given time, **n** = (0, 0, 1) T;

**<sup>M</sup>***<sup>θ</sup>* <sup>∈</sup> <sup>R</sup>*N*×*<sup>N</sup>* is an interaction matrix that in cases of decentralized neighbor–neighbor interactions as herein is as follows:

$$\mathbf{M}\_{\theta} = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 1 & -2 & \ddots & \ddots & \vdots \\ 0 & \ddots & \ddots & 1 & 0 \\ \vdots & \ddots & 1 & -2 & 1 \\ 0 & \cdots & 0 & 1 & -1 \end{bmatrix};$$

**Mˆ** *<sup>θ</sup>* <sup>∈</sup> <sup>R</sup>*N*×(*N*−1) is a matrix derived from the matrix **<sup>M</sup>***θ***H**−<sup>1</sup> *<sup>θ</sup>* by removing the *N*th column.

For collective target encirclement, this paper uses the same control laws as in [12,27]. For control based on angular errors, it uses the speeds of the UAVs in the formation. The following control command vector **v***<sup>c</sup>* is set for UAV speeds:

$$\mathbf{v}^{\varepsilon} = \begin{bmatrix} v\_1 & v\_2 & \dots & v\_N \end{bmatrix}^{\mathrm{T}} = v\mathbf{1}\_N + \mathbf{L} \tag{2}$$

where **<sup>1</sup>***<sup>N</sup>* <sup>=</sup> " 1 1 ... 1 #<sup>T</sup> <sup>∈</sup> <sup>R</sup>*N*×<sup>1</sup> and the vector **L** = *vf*(2/*π*) arctan\$ *k<sup>θ</sup> ei* + *k* ˙ *θ e*˙*i* % *<sup>i</sup>*=1,*<sup>N</sup>* <sup>∈</sup> <sup>R</sup>*N*×<sup>1</sup> is found given (1), *<sup>k</sup><sup>θ</sup>* is the positive tuning coefficient, *k* ˙ *<sup>θ</sup>* is the positive tuning coefficient for the derivative signal, *vf* is the maximum norm of the additional velocity vector that is to be adjusted for the constraints of the real-UAV dynamics, and *v* is the ultimate linear cruise speed of the UAVs provided that the target is stationary.

Path error-based control relies on the heading angles of the UAVs in the formation. The following control law from [12,27] is applied to the heading-angle command vector **Ø***<sup>c</sup>* with slightly modified coefficients:

$$\mathcal{B}^c = \left(\varphi\_i + \lambda \left[\frac{\pi}{2} + \arctan\left(k\_o^i (d\_i - \rho) + k\_o^j d\_i\right)\right]\right)\_{i = \overline{1, N}} \in \mathbb{R}^{N \times 1},\tag{3}$$

where *di* is the ith UAV-to-target distance, ˙*di* is the corresponding derivative signal, *k<sup>i</sup> <sup>o</sup>* is the tuning coefficient for the distance-to-circular-path signal for the *i*th UAV, *k<sup>i</sup> <sup>o</sup>*˙ is the tuning coefficient for the distance-to-circular-path derivative signal for the *i*th UAV, *ρ* is the radius of the circular path that the UAV follows whilst encircling the target, *ϕ<sup>i</sup>* is the phase angle of circumvolution around the target for the *i*th UAV.

#### **4. Implementation of Particle Swarm Optimization**

The standard particleswarm solver from MATLAB2015b with default parameters was used for particle swarm optimization. A four-UAV formation was tested for this paper, hence eight tuning parameters. These parameters are coefficients in the control law (3):

$$\mathbf{K} = \begin{bmatrix} k\_o^1 & k\_\phi^1 & \dots & k\_o^4 & k\_\phi^4 \end{bmatrix} \in \mathbb{R}^{1 \times 8}.$$

We also added upper and lower bounds as follows:

[ 0.1 0.1 . . . 0.1 0.1 ] ≤ **K** ≤ [ 30 30 . . . 30 30 ]

These constraints were chosen in order to preserve the UAV formation stability. Stability can be lost if the control law coefficients go beyond certain limits in the absence of adaptive control.

The initial guess for the vector **K** was as follows:

$$\mathbf{K}\_{initial} = \begin{bmatrix} 1 & 2 & \dots & 1 & 2 \end{bmatrix} \in \mathbb{R}^{1 \times 8}.$$

For the fitness function, we chose the following:

$$F\_{fitness} \stackrel{\Delta}{=} \int\_0^{t\_n} t \sum\_{i=1,\ldots,4} \left( |d\_i - \rho| + |e\_i| \right) dt\_\prime \tag{4}$$

where *tn* is the particle swarm optimization time; the remaining parameters are defined in Equations (2) and (3). Thus, this solution optimizes not only for the error of each UAV's distance to the ultimate orbit of target encirclement but also for the relative neighbor-toneighbor distance errors.

The formal statement of the goal would be as follows:

$$\text{minimize} \qquad F\_{fitness} \stackrel{\Delta}{=} \int\_{0}^{t\_n} t \sum\_{i=1,\dots,4} \left( |d\_i - \rho| + |e\_i| \right) dt\_i$$
 
$$\text{s.t.} \quad \left[ \begin{array}{cccc} 0.1 & 0.1 & \dots & 0.1 & 0.1 \end{array} \right] \le \mathbf{K} \le \left[ \begin{array}{cccc} 30 & 30 & \dots & 30 & 30 \end{array} \right]$$

#### **5. Simulation Results**

#### *5.1. Simulation Parameters*

For simulation, we ran a high-level UAV model from [29]. The formation consisted of four UAVs of the same type. To make an initial guess, we also ran full UAV models of this formation. This allowed us to find, by trial and error, a controller coefficient that would keep the entire formation system stable. Control laws (2) and (3) were used in the simulation. The simulation parameters are shown in Table 1.

**Table 1.** Simulation parameters.


#### *5.2. Simulation Results and Discussion*

Figure 1 shows how the fitness function changed during optimization. Apparently, the function's value stopped declining drastically after 25 iterations. However, a significantly increasing number of iterations would be required to further reduce the value.

Running the optimization algorithm returned the following **K** = **K***opt* values that were used in the simulation:

$$\mathbf{K}\_{\rm opt} = \begin{bmatrix} .4.7426, & 25.1807, & 2.6214, & 23.8265, & 20.6034, & 29.4922, & 18.7241, & 26.8984 \end{bmatrix}^{\rm T}$$

**Figure 1.** Fitness function value change during simulation.

Figure 2 shows UAV formation angular errors before and after optimization. As can be seen in the graphs, optimization enabled the formation to reach the pre-specified relative angular positions somewhat faster. Figure 3 shows how path errors changed in the UAV formation. As can be seen from the graphs, UAV1 and UAV2 showed the most drastic changes. Apparently, transient trajectories before and after optimization are different (Figure 4). Even though the trajectories look similar in the figure, they are still different. That is especially noticeable in the trajectories for UAV1 and UAV2.

**Figure 2.** UAV angular errors at time *t* = 90 s. (**a**) Angular errors before particle swarm optimization; (**b**) angular errors after particle swarm optimization.

**Figure 3.** UAV path errors at time *t* = 90 s. (**a**) Path errors before particle swarm optimization; (**b**) path errors after particle swarm optimization.

**Figure 4.** UAV formation trajectories at time *t* = 90 s. (**a**) Trajectories before particle swarm optimization; (**b**) trajectories after particle swarm optimization.

Notably, although the controller coefficients were tuned only for the control law (3), the fitness function included the total angular error of the formation (4). The reason for this was that control by path errors is tied to control by angular errors in a decentralized UAV formation system. This connection can be seen, among other things, in the simulation results in Figure 2.

#### **6. Conclusions**

The paper demonstrates a successful use of particle swarm optimization for target encirclement and tracking by a UAV formation. The formation itself was a vector fieldcontrolled decentralized formation. Simulations showed a reduction in the proposed fitness function as well as a change in the pattern of transient trajectories. A close connection was found between optimizing the path error controller optimization and the quality of transient trajectories for angular errors in the UAV formation.

**Funding:** This work was supported by the Ministry of Science and Higher Education of the Russian Federation (Agreement No. 075-15-2021-1016).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
