Article

Decentralized Navigation with Optimality for Multiple Holonomic Agents in Simply Connected Workspaces

by Dimitrios Kotsinis 1,2 and Charalampos P. Bechlioulis 1,2,*
1 Division of Systems and Automatic Control, Department of Electrical and Computer Engineering, University of Patras, Rio, 26504 Patras, Greece
2 Athena Research Center, Robotics Institute, Artemidos 6 & Epidavrou, 15125 Maroussi, Greece
* Author to whom correspondence should be addressed.
Sensors 2024, 24(10), 3134; https://doi.org/10.3390/s24103134
Submission received: 31 March 2024 / Revised: 30 April 2024 / Accepted: 13 May 2024 / Published: 15 May 2024

Abstract:
Multi-agent systems are utilized increasingly often in the research community and industry, as they can complete tasks faster and more efficiently than single-agent systems. In this paper, we present an optimal approach to the multi-agent navigation problem in simply connected workspaces. Each agent must reach its destination from an initial position while following an optimal, collision-free trajectory. To achieve this, we design a decentralized control protocol, defined by a navigation function, in which each agent is equipped with a navigation controller that resolves imminent safety conflicts with the other agents, as well as with the workspace boundary, without requiring knowledge of the other agents' goal positions. Our approach is sub-optimal, since each agent follows a predetermined individually optimal policy calculated by a novel off-policy iterative method. We adopt this method because the computational complexity of learning-based methods that compute the globally optimal solution becomes prohibitive as the number of agents increases. To evaluate our approach, we examine how much the resulting sub-optimal trajectories deviate from the optimal ones and how much time the multi-agent system needs to accomplish its task as the number of agents increases. Finally, we compare our results with a discrete centralized policy method, the Multi-Agent Poli-RRT* algorithm, to demonstrate the validity of our method against other algorithms from the literature.

1. Introduction

Motion Planning (MP) is one of the fundamental problems in robotics, and a great number of control algorithms have been proposed to solve it. The main purpose is to find safe and, to the greatest extent possible, optimal trajectories, either for a single mobile robot/agent (Single-Agent System, SAS) or for multiple mobile robots/agents (Multi-Agent System, MAS), that guide the robot/agents from any initial point of the workspace to their destination. In the research community and industry, controlling MASs is a state-of-the-art task, as these systems can complete tasks faster and more efficiently than a SAS. Despite their effectiveness, developing control policies that yield suitable trajectories with minimum cost (in terms of the agents' path lengths and input energy) is extremely challenging.
There are two control policies used to navigate a MAS: the centralized policy (CP), in which the agents are treated as a whole system and each agent is guided by one global controller, and the decentralized policy (DP), in which each agent has its own control policy and exchanges information with the others to reach its destination safely, without collisions. Designing DP controllers is more challenging than designing centralized ones, as the CP simplifies the problem and provides full information about the environment and the goal positions of all agents. In a DP, on the other hand, the constraints on inter-agent communication play a vital role in the control design (for more information about the differences between CP and DP, the reader is referred to [1]). There is an abundance of studies that utilize these policies to efficiently navigate agents and help them accomplish their tasks, mostly in the fields of exploration [2,3,4,5,6,7] and formation control [8,9,10,11,12,13].
Furthermore, numerous researchers have proposed methods for navigating mobile robots to a final position; see [14,15,16,17,18]. Some of these studies use a non-degenerate navigation function [19], creating a negated-gradient potential field that is attractive towards the goal configuration and repulsive with respect to obstacles and the workspace boundary. The first studies to apply this function to help mobile agents navigate to their destinations without conflicts are [20,21], which describe a CP and a DP, respectively. We note that this type of controller does not provide strict global navigation, because a vector field on any manifold that has a unique attractive point must have at least as many saddles as obstacles [19]. Related multi-agent navigation function methods can be found in [22,23,24,25,26,27].
As mentioned earlier, the ultimate goal of MAS navigation is to design methods that find the optimal paths for the agents to follow. For a SAS, there are discrete methods (DM) and continuous methods (CM) that accomplish this task. Among the DM, the RRT* algorithm and its various versions [28,29] incrementally modify the robot's path to decrease its length. The CM rely on Reinforcement Learning (RL)-based methods that use iterative algorithms to converge to an optimal navigation policy [30,31,32]. In MAS, it is common to use methods based on Deep Reinforcement Learning (DRL), a powerful tool that combines neural networks and RL algorithms to allow each agent to learn from its interactions with the environment [33,34,35,36,37,38]. Despite the effectiveness of RL-based methods, their main disadvantage in MAS is the computational complexity and the abundance of data required to converge to the global policy.
Therefore, in this paper, we introduce a sub-optimal approach for navigating a MAS in planar simply connected workspaces (i.e., workspaces without any internal obstacles). We call it sub-optimal because we first compute the optimal navigation policy for each agent separately, with a novel off-policy iterative method [30] that avoids the computational complexity of global MAS learning, and we subsequently design a DP controller for multi-agent navigation that uses the negated gradient of a navigation function from [19]. We apply a DP to our MAS problem because every agent should operate independently from the others: each agent possesses predetermined information about its own optimal navigation and follows this policy to its destination whenever it is isolated from the others. Additionally, we adopt the approach of [21,26], referred to above, to design the collision-avoidance controller for each agent; this approach also demands fewer numerical operations than the CP of [20], thus relaxing the requirements of the solution. Hence, the main contributions of this work are summarized as follows:
  • We navigate each agent with a sub-optimal policy to its destination. To the best of our knowledge, this is the first work based on artificial potential fields that introduces optimality within a multi-agent navigation framework.
  • No collision with other nearby agents or the workspace boundary occurs.
  • Only knowledge of the current positions of the nearby agents, and not of their destinations, is required.
  • The complexity is rendered linear with respect to the number of agents and, if combined with the recent work [39], may become fixed (i.e., independent of the number of agents).
Additionally, to demonstrate the efficiency of the proposed controller, in our simulation study we examine how much the sub-optimal trajectory of each agent deviates from the optimal SAS trajectory, how much time the overall system needs to accomplish its task, and how this time changes as we increase the number of agents starting simultaneously from their initial positions. We also compare our results against the Multi-Agent Poli-RRT* method proposed in [40] in order to validate our method.
The rest of the paper is organized as follows: Section 2 formulates the problem. Section 3 describes the proposed decentralized navigation policy. Section 4 provides the proof of correctness of our approach. Section 5 presents the simulation results. Finally, Section 6 concludes the work and proposes future research directions.

2. Problem Formulation

Let us assume that we have a sphere manifold (ball set) $\mathcal{B}_{r_0}(q) = \{x \in \mathbb{R}^2 : \|x - q\| < r_0\}$. Inside this set, we define a simply connected workspace $\mathcal{W} \subset \mathcal{B}_{r_0}(q)$ with boundary $\partial\mathcal{W}$. There are m mobile agents (or robots) in $\mathcal{W}$, each occupying a disk $\mathcal{A}_i = \{q \in \mathcal{W} : \|q - q_i\| \le r_i\}$, $i = 1, \dots, m$, where $q_i$, $r_i$ denote the center and radius of the disk, respectively. The configuration space is spanned by $q = [q_1^T, \dots, q_m^T]^T$. The initial position vector is defined as $\bar{q} = [\bar{q}_1^T, \dots, \bar{q}_m^T]^T$ and the desired destination vector as $q_d = [q_{d_1}^T, \dots, q_{d_m}^T]^T$. The motion of each agent follows a single integrator model:
$$\dot{q}_i = u_i, \quad i = 1, \dots, m \tag{1}$$
where $u_i$ denotes the velocity command applied to each agent. In particular, $u_i$ will be a sub-optimal decentralized control policy that navigates the agent to its destination and prevents any conflicts/collisions with other agents and the workspace boundary $\partial\mathcal{W}$. The controller possesses knowledge of the current positions of other nearby agents, but not of their destinations. Thus, every agent sees the others as moving obstacles in $\mathcal{W}$.
Moreover, we assume that the workspace is spacious and that, at every point of it, each agent is capable of avoiding any conflict, since in this paper we mainly focus on the functionality and appropriate design of the sub-optimal control policy. Thus, we exclude narrowly connected workspaces, where the agents need global coordination to alleviate blocking issues and navigate towards their destinations, such as in [41]. We will deal with this issue in future work. Finally, we define the following infinite-horizon cost function for each agent:
$$V_{u_i}(\bar{q}_i) = \int_0^\infty \big[ Q\big(q_i(\tau; \bar{q}_i); q_{d_i}\big) + R\big(u_i(\tau; \bar{q}_i)\big) \big] \, d\tau \tag{2}$$
where $\bar{q}_i$ is the starting point of the agent in $\mathcal{W}$. Additionally, we define the state-related cost term Q and the input-related cost term R, respectively, as:
$$Q(q_i; q_{d_i}) = w \, \|q_i - q_{d_i}\|^2 \tag{3}$$
$$R(u_i) = (1 - w) \, \|u_i\|^2 \tag{4}$$
where $w \in (0, 1)$ is a weight parameter and $\|\cdot\|$ denotes the Euclidean norm. The metric (2), along with (3) and (4), forms a classical cost function from Optimal Regulation theory [42]. The state-related term (3) expresses the minimization of the settling time of the system (1), and the input-related term (4) penalizes the Euclidean norm of the control input, which allows us to account for the energy expenditure of the system (1). Furthermore, the value of the weight plays a vital role in the final value of the metric (2); it depends on which specification (reducing settling time or input energy) we wish to emphasize.
In summary, the aforementioned metric examines two factors: the first is the amount of time that the whole MAS needs to reach the goal configuration, and the second is the energy of the control input signal spent by each agent in order to successfully navigate the workspace. Thus, the objective of this work is to design appropriate velocity control policies $u_i$ such that collisions among the agents and with the workspace boundary are safely avoided and the overall cost
$$V_{MAS} = \sum_{i=1}^{m} V_{u_i}(\bar{q}_i) \tag{5}$$
is rendered as small as possible for any $\bar{q}_i \in \mathcal{W}$, $i = 1, \dots, m$.
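As an illustration of the cost structure above, the following minimal Python sketch approximates the per-agent cost (2) by forward Euler integration of the single-integrator dynamics (1) and sums the individual costs as in (5). The function names and the simple proportional test policy are illustrative assumptions, not part of the proposed method (which uses the optimal policy of [30]).

```python
import numpy as np

def agent_cost(q0, qd, policy, w=0.5, dt=1e-3, t_max=50.0):
    """Approximate V_{u_i}(q0) of Eq. (2) by Euler integration of q_dot = u (Eq. (1))."""
    q, qd = np.array(q0, float), np.array(qd, float)
    cost, t = 0.0, 0.0
    while t < t_max and np.linalg.norm(q - qd) > 1e-3:
        u = policy(q, qd)                                  # velocity command u_i
        cost += (w * np.linalg.norm(q - qd) ** 2           # state term Q, Eq. (3)
                 + (1 - w) * np.linalg.norm(u) ** 2) * dt  # input term R, Eq. (4)
        q, t = q + u * dt, t + dt
    return cost

def naive_policy(q, qd):
    # illustrative proportional policy; the paper instead uses the optimal policy of [30]
    return -(q - qd)

# overall MAS cost, Eq. (5): sum of the individual agent costs
starts = [(1.17, -1.87), (-1.23, 1.92)]   # initial positions of agents 1 and 2 (Table 1)
goals = [(-0.88, 1.72), (0.85, -1.63)]    # desired positions of agents 1 and 2 (Table 1)
print(sum(agent_cost(s, g, naive_policy) for s, g in zip(starts, goals)))
```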

3. Decentralized Navigation

In this section, we briefly discuss the sub-optimal DP. First, we recall the definition of the navigation function and its properties, as well as the sub-optimal decentralized control law with which each agent is equipped. As mentioned before, this control law navigates each agent to its destination optimally with respect to (2) while preventing collisions with the workspace boundary. We then briefly present the off-policy RL-based method of [30], which we use to find the individual optimal navigation policy of each agent, and we define the function that warns of conflicts between agents and suggests how to resolve them.

3.1. Navigation Function

According to [19], a Navigation Function (NF) is a real-valued map defined through cost functions whose negated gradient field is attractive towards the goal configuration and repulsive with respect to obstacles. In that paper, it was shown that strict global navigation (i.e., with a globally attracting equilibrium state) cannot be achieved, because a smooth vector field on any sphere world with a unique attractor must have at least as many saddles as obstacles. Our assumption of spherical robots (and thus spherical obstacles) does not restrict the generality of this work, since it has been proven in [19] that navigation properties are invariant under diffeomorphisms. An NF can be defined as follows:
Definition 1 ([19]). Let $F \subset \mathbb{R}^n$ be a compact connected analytic manifold with a boundary. A map $\varphi : F \to [0, 1]$ is a navigation function if:
1. It is analytic on F;
2. It has only one minimum at $q_d \in \mathrm{int}(F)$;
3. Its Hessian at all critical points (zero gradient vector field) is full rank;
4. $\lim_{q \to \partial F} \varphi(q) = 1$.
Since we design controllers with the DP, where each agent needs to know the current positions of the other nearby agents in order to prevent collisions, and each of them operates solely based on its own navigation law, we consider the following class of navigation functions from [26] for each agent:
$$\varphi_i = \sigma_d \circ \sigma \circ \hat{\varphi}_i = \left( \frac{\gamma_i}{\gamma_i + G_i} \right)^{1/k} \tag{6}$$
which is a composition of $\sigma_d(x) = x^{1/k}$, $\sigma(x) = \frac{x}{1+x}$ and
$$\hat{\varphi}_i = \frac{\gamma_i}{G_i}$$
is a polar function, where the column vector $\hat{\varphi} = [\hat{\varphi}_1, \dots, \hat{\varphi}_m]^T$ has its minimum at $q_d \in \mathrm{int}(F)$, $k$ is a positive constant, $\gamma_i^{-1}(0)$ denotes the desirable set and $G_i^{-1}(0)$ the set to avoid. We choose $\gamma_i = \gamma_{d_i}^k$, where $\gamma_{d_i} = V_{u^*(i)}$. This function is equal to the optimal cost function, which is calculated with a novel iterative method such as the one in [30], described in the next section. Thus, the gradient of $V_{u^*(i)}$ yields the optimal navigation of every agent towards its desired destination. Notice that the cost function $V_{u^*(i)}$, as a proper Lyapunov function, possesses the following properties [20]: (1) it is non-negative, (2) $V_{u^*(i)}(q) = 0 \Leftrightarrow q = q_{d_i}$, (3) it is differentiable with respect to its variables, and (4) its Hessian is positive-definite. Hence, the sub-optimal control law for each agent is given by
$$u_i = -u_{s_i} \frac{\nabla \varphi_i}{\|\nabla \varphi_i\| + \epsilon} \tag{7}$$
with $u_{s_i} = \sqrt{\frac{w}{1-w}} \, \|q_i - q_{d_i}\|$, as discussed in [30], which leads the agent smoothly to its destination while avoiding collisions ($u_i$ is co-linear with $\nabla \varphi_i$), and $\epsilon > 0$ is a small constant.
Remark 1. 
The variable $\epsilon$ takes small values, between $10^{-4}$ and $10^{-8}$, to avoid the singularity at the desired configuration. We choose its value according to the system's dynamics.
Remark 2. 
The estimation of the cost function (2) affects the behavior of the decentralized navigation controller (7). This can be seen easily from the gradient of the NF, which is equal to
$$\nabla \varphi_i(q_i) = \frac{1}{(\gamma_{d_i}^k + G_i)^{2/k}} \left( (\gamma_{d_i}^k + G_i)^{1/k} \, \nabla \gamma_{d_i} - \gamma_{d_i} \, \nabla (\gamma_{d_i}^k + G_i)^{1/k} \right) \tag{8}$$
where $\gamma_{d_i}$ is equal to the estimated function $V_{u^*(i)}$. As we observe in (8), the value and gradient of $\gamma_{d_i}$ for any $q_i \in \mathcal{W}$ affect the optimal reactive navigation of each agent and the way in which the agents interact to prevent conflicts among them.
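To make the structure of the controller concrete, the following Python sketch evaluates the navigation function (6) and the control law (7) numerically. It assumes that the estimated optimal cost $\gamma_{d_i} = V_{u^*(i)}$ and the term $G_i$ are available as callables, and it uses a finite-difference gradient for brevity; all names and default parameter values are illustrative.

```python
import numpy as np

def numgrad(f, q, h=1e-6):
    """Central-difference gradient of a scalar field f at point q (used here for brevity)."""
    q = np.asarray(q, float)
    g = np.zeros(q.shape)
    for k in range(q.size):
        e = np.zeros(q.shape)
        e[k] = h
        g[k] = (f(q + e) - f(q - e)) / (2.0 * h)
    return g

def suboptimal_control(q, qd, gamma_d, G, k=4.0, w=0.5, eps=1e-6):
    """Sub-optimal decentralized law (7): u_i = -u_s * grad(phi_i) / (||grad(phi_i)|| + eps).

    gamma_d: callable giving the estimated optimal cost V_{u*(i)} of [30] at a point.
    G:       callable giving the term G_i of Section 3.3 at a point.
    """
    phi = lambda p: gamma_d(p) / (gamma_d(p) ** k + G(p)) ** (1.0 / k)  # NF, Eq. (6)
    grad_phi = numgrad(phi, q)
    u_s = np.sqrt(w / (1.0 - w)) * np.linalg.norm(np.asarray(q, float) - qd)  # speed profile
    return -u_s * grad_phi / (np.linalg.norm(grad_phi) + eps)
```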

3.2. Individual Optimal Policy

As explained in the previous section, the control law of each agent involves an optimal navigation policy that leads it to its destination with minimal cost (2). Specifically, the function $\gamma_{d_i} = V_{u^*(i)}$ is the estimate of the optimal cost function extracted by the off-policy iterative method presented in [30]. In the present work, this optimal policy is considered known, and in this section we briefly describe how to extract it for a simply connected workspace. More specifically, we adopt the method of [30], in which the authors propose an off-policy iterative RL-based method that identifies control laws by minimizing a certain infinite-horizon cost function. The authors chose the off-policy approach [43] because it has a reduced computational cost compared to the on-policy one. Their method starts with an initial admissible control policy that leads the system to its goal from any starting point within $\mathcal{W}$. In practice, they define the initial policy with an Artificial Harmonic Potential Field (AHPF)-based method [44]. Moreover, they apply Zeroing Barrier Function theory [45] to converge to the optimal control policy while keeping the agent at a safe distance from the workspace boundary $\partial\mathcal{W}$. The approximation of the optimal cost function, $\hat{V}_{u^*(i)}$, is computed by a Gaussian Radial Basis Function (RBF) Neural Network [46], and its gradient yields the optimal navigation velocity of the agent.
Figure 1 shows a simple example of the aforementioned algorithm. In this example, a point robot lies in a simply connected workspace and we want to find its optimal navigation policy from any starting point in $\mathcal{W}$. Figure 1a shows the initial admissible policy designed by the AHPF-based method and Figure 1b the optimal policy to which the algorithm converges. As these two figures show, the robot traverses the shortest distance under the optimal policy from each of the three starting points. Finally, Figure 1c presents the minimum value of the cost function (2). As we can see, the algorithm achieves a sufficient approximation of the optimal solution within the workspace $\mathcal{W}$. Nevertheless, when multiple agents navigate within the same workspace, collisions may occur; these are handled by the repulsive terms, described next, which relate to nearby agents at risk of imminent collision.
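For illustration, a minimal sketch of such a Gaussian RBF approximator is given below. The placement of the centers and the width value are assumptions made for this example; in [30] the weights are obtained through the off-policy iteration itself rather than any fitting step suggested here.

```python
import numpy as np

class RBFCost:
    """Gaussian RBF model V_hat(q) = sum_j w_j * exp(-||q - c_j||^2 / (2 * s^2))."""

    def __init__(self, centers, sigma=0.5):
        self.c = np.asarray(centers, float)   # (N, 2) centers placed on a workspace grid
        self.s = float(sigma)
        self.w = np.zeros(len(self.c))        # weights; learned by the iteration of [30]

    def features(self, q):
        d2 = np.sum((self.c - q) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.s ** 2))

    def value(self, q):
        """Approximated optimal cost at q."""
        return self.features(q) @ self.w

    def gradient(self, q):
        """Gradient of the approximation; its negation gives the optimal velocity direction."""
        ph = self.features(q)            # (N,)
        d = q - self.c                   # (N, 2)
        return -(self.w * ph / self.s ** 2) @ d

# illustrative use: centers on a grid; the weights would come from the RL iteration
grid = np.array([[x, y] for x in np.linspace(-2, 2, 15) for y in np.linspace(-2, 2, 15)])
model = RBFCost(grid)
print(model.value(np.array([0.5, 0.5])), model.gradient(np.array([0.5, 0.5])))
```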

3.3. Resolving Conflicts via the Terms $G_i$

In this section, we use mathematical tools and terminology from [20,26] to define the function $G_i$ that is in charge of resolving the conflicts that lead to collisions among the agents. More specifically, the function $G_i$ is different for each agent and indicates whether the agents are close enough to collide with each other or with the workspace boundary. Hence, we define this function as follows:
$$G_i = G_{b_i} \, G_{a_i} \tag{9}$$
where $G_{b_i}$ is the function of the i-th agent for avoiding conflicts with the boundary and $G_{a_i}$ for avoiding conflicts with the other agents. When either of these two functions approaches zero, the agent is about to come into conflict with some nearby agent or with the workspace boundary $\partial\mathcal{W}$.

3.3.1. Calculation of the Function $G_{b_i}$

The function $G_{b_i}$ is selected as
$$G_{b_i} = b_{w_i} \tag{10}$$
where $b_{w_i}$ is a distance metric between the agent and the workspace boundary. In [20,26], $b_{w_i}$ is defined as the distance from the center of $\mathcal{W}$ to the agent's center, because a spherical world is assumed there. Since we deal with generic simply connected workspaces, we cannot use this metric. Therefore, we define its equivalent as follows:
$$b_{w_i} = d^2(q_i) - r_i^2 \tag{11}$$
where $d(q_i) = \min_{z \in \partial\mathcal{W}} \|q_i - z\|$.
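A simple way to evaluate (10) and (11) in practice is to sample the workspace boundary and take the minimum distance over the samples, as in the following sketch (the circular workspace used in the example is purely illustrative):

```python
import numpy as np

def boundary_distance(q, boundary_pts):
    """d(q_i) = min over sampled boundary points z of ||q_i - z||."""
    return np.min(np.linalg.norm(boundary_pts - q, axis=1))

def G_b(q, r, boundary_pts):
    """Boundary term (10)-(11): G_b = d(q)^2 - r^2, which vanishes on contact."""
    return boundary_distance(q, boundary_pts) ** 2 - r ** 2

# illustrative circular workspace boundary of radius 2.5, sampled at 360 points
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
boundary = 2.5 * np.column_stack([np.cos(theta), np.sin(theta)])
print(G_b(np.array([1.0, 1.0]), 0.1, boundary))
```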

3.3.2. Calculation of the Function $G_{a_i}$

Before defining the function $G_{a_i}$, we need some mathematical tools and definitions, and we assume that each agent treats the other nearby agents as moving obstacles. Let $A$ denote the agent under examination and $O_i$, $i = 1, \dots, m-1$, the moving obstacles of $A$. The robot proximity function, a measure of the distance between the agent and an obstacle, is defined as follows:
$$\beta_{A,i} = \|q_A - q_i\|^2 - (r_A + r_i)^2 \tag{12}$$
We use the term relation to describe an imminent collision between the agent and moving obstacles in $\mathcal{W}$. A binary relation is a relation between the agent and a single obstacle. Any relation can be expressed as a set of binary relations. Finally, the relation level is the number of binary relations contained in a relation.
Using the aforementioned terminology, we define the set of relations at level $l$ as $R^{(l)} = \{R_i^{(l)} : i = 1, \dots, s_l\}$, where $s_l = \binom{m-1}{l}$ and $R_i^{(l)}$ is the i-th set of binary relations at level $l$. In the same manner, we define the complementary set of $R_i^{(l)}$ as $\overline{R_i^{(l)}} = \{R_j^{(l)} : j \ne i,\ j = 1, \dots, s_l\}$. To keep things simple, we use $R_i$ and $\overline{R}_i$, with $i = 1, \dots, S$ and $S = \sum_{l=1}^{m-1} s_l$, to refer to a relation and its complementary relation, respectively. For example, Figure 2 shows the relations of an agent in a workspace with two other mobile obstacles. The total number of relations is $S = 3$ and the maximum relation level is two. For the relations shown in Figure 2, we have the following: on the left, $R_1 = \{(A, O_1)\}$ and $\overline{R}_1 = \{(A, O_2)\}$; in the center, $R_2 = \{(A, O_2)\}$ and $\overline{R}_2 = \{(A, O_1)\}$; and finally, on the right, $R_3 = \{(A, O_1), (A, O_2)\}$ and $\overline{R}_3 = \emptyset$.
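The relation sets can be enumerated mechanically. The sketch below is illustrative and follows the two-obstacle example of Figure 2; the complementary set is taken here, for illustration, as the binary relations of the agent with the obstacles outside $R_i$.

```python
from itertools import combinations

def relations(agent, obstacles):
    """Enumerate all relation sets R_i of the agent, grouped by level l = 1..m-1.

    For each R_i, the binary relations of the agent with the remaining obstacles
    are also returned, matching the two-obstacle example of Figure 2.
    """
    rels = []
    for level in range(1, len(obstacles) + 1):                 # s_l = C(m-1, l) sets per level
        for combo in combinations(obstacles, level):
            R = [(agent, o) for o in combo]                    # binary relations in R_i
            R_bar = [(agent, o) for o in obstacles if o not in combo]
            rels.append((R, R_bar))
    return rels

# the Figure 2 example: agent A with obstacles O1, O2 gives S = 3 relations
for R, R_bar in relations('A', ['O1', 'O2']):
    print(R, R_bar)
```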
A Relation Proximity Function (RPF) is a measure of the distance between the agent and the obstacles involved in a relation; each relation has its own RPF, which is equal to
$$b_{R_i} = \sum_{j \in R_i} \beta_j \tag{13}$$
where $\beta_j$ is the proximity function (12) of the binary relation $j$ in the set $R_i$. The property of the RPF is that when it is zero, the agent is in conflict with one or multiple obstacles, according to the relation level. A Relation Verification Function (RVF) is defined by
$$g_{R_i}(b_{R_i}, B_{\overline{R}_i}) = \begin{cases} b_{R_i} + \dfrac{\lambda \, b_{R_i}}{b_{R_i} + \big(B_{\overline{R}_i}\big)^{1/h}}, & \text{if } i \ne S \\[4pt] b_{R_i}, & \text{if } i = S \end{cases} \tag{14}$$
where $\lambda, h > 0$ and $B_{\overline{R}_i} = \prod_{k \in \overline{R}_i} b_k$. The RVF is zero if a relation holds while no other relation of the same level holds, and it has the following properties: (a) $\lim_{x \to 0} \lim_{y \to 0} g_R(x, y) = \lambda$ and (b) $\lim_{y \to 0} \lim_{x \to 0} g_R(x, y) = 0$.
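A direct transcription of (13) and (14), assuming the proximity values $\beta_j$ and the complementary term $B_{\overline{R}_i}$ have already been computed, could look as follows (parameter values are illustrative):

```python
def rpf(betas):
    """Relation Proximity Function (13): sum of the proximity values beta_j in R_i."""
    return sum(betas)

def rvf(b_R, B_Rbar, lam=1.0, h=2.0, last=False):
    """Relation Verification Function (14).

    'last' marks the single relation whose complementary set is empty (i = S),
    for which the RVF reduces to b_R. The values of lam and h are illustrative.
    """
    if last:
        return b_R
    return b_R + lam * b_R / (b_R + B_Rbar ** (1.0 / h))

# limiting behavior of the RVF near a collision (b_R -> 0):
print(rvf(1e-9, 4.0))  # ~ 0  : the relation is verified, complementary relations far from holding
print(rvf(1e-9, 0.0))  # ~ lam: the complementary relations also hold (B_Rbar = 0)
```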
It should be noted that, since we assume large workspaces, the function $g_{R_i}$ can take large values for some relation $R_i$, affecting the numerical stability of the algorithm. Consequently, we pass it through the sigmoidal function defined below, which ensures that the values remain adequately small. Figure 3 shows the graph of the sigmoidal function along with its gradient:
$$\sigma_{\tau_m}(g_{R_i}) = \tau_m \, \frac{e^{g_{R_i}/\tau_m} - e^{-g_{R_i}/\tau_m}}{e^{g_{R_i}/\tau_m} + e^{-g_{R_i}/\tau_m}} = \tau_m \tanh\!\left(\frac{g_{R_i}}{\tau_m}\right) \tag{15}$$
with $\tau_m = 5$.
Having defined the RVF, we can calculate the term $G_{a_i}$ as follows:
$$G_{a_i} = \prod_{j=1}^{S} \sigma_{\tau_m}(g_{R_j}) \tag{16}$$
with its gradient given by:
$$\nabla G_{a_i} = \sum_{j=1}^{S} \left( \prod_{\substack{k=1 \\ k \ne j}}^{S} \sigma_{\tau_m}(g_{R_k}) \right) \sigma_{\tau_m}'(g_{R_j}) \, \nabla g_{R_j} \tag{17}$$
Equation (17) reveals another benefit of the sigmoidal function. When a relation produces a large RVF value, the derivative of the sigmoidal function suppresses the gradient of that relation's RVF. As a result, when an agent is far away from the others, it is navigated toward its destination by the optimal policy alone.
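The saturation (15), the product (16) and the gradient (17) can be sketched as follows; the inputs are assumed to be precomputed RVF values $g_{R_j}$ and their gradients, and the final prints illustrate the vanishing-derivative effect discussed above:

```python
import numpy as np

def sat(g, tau=5.0):
    """Sigmoidal saturation (15): tau * tanh(g / tau)."""
    return tau * np.tanh(g / tau)

def sat_prime(g, tau=5.0):
    """Derivative of (15) with respect to g: 1 / cosh(g / tau)^2."""
    return 1.0 / np.cosh(g / tau) ** 2

def G_a(g_R):
    """Term (16): product of the saturated RVF values of all S relations."""
    return np.prod([sat(g) for g in g_R])

def grad_G_a(g_R, grad_g_R):
    """Gradient (17) via the product rule; grad_g_R[j] is the gradient of g_{R_j}."""
    S = len(g_R)
    total = np.zeros_like(np.asarray(grad_g_R[0], float))
    for j in range(S):
        others = np.prod([sat(g_R[k]) for k in range(S) if k != j])
        total += others * sat_prime(g_R[j]) * np.asarray(grad_g_R[j], float)
    return total

# far-away relations (large g_R) barely influence the gradient, as discussed above
print(sat_prime(30.0))                                   # ~ 2.4e-5
print(grad_G_a([0.2, 30.0], [np.ones(2), np.ones(2)]))   # dominated by the close relation
```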

4. Proof of Correctness

Let $\epsilon > 0$ and define the ball $B_{j,l}^{(i)}(\epsilon) = \{q : 0 < g_{R_j^{(l)}(i)}(q) < \epsilon\}$. According to [20], we distinguish the following topologies for the function $\varphi_i$:
  • The destination point $q_{d_i}$.
  • The free-space boundary: $\partial\mathcal{F} = G_i^{-1}(\delta)$, $\delta \to 0$.
  • The set near collisions: $\mathcal{F}_0(\epsilon) = \bigcup_{l=1}^{m-1} \bigcup_{j=1}^{s_l} B_{j,l}^{(i)}(\epsilon) \setminus \{q_{d_i}\}$.
  • The set away from collisions: $\mathcal{F}_1(\epsilon) = \mathcal{F} \setminus \big( \{q_{d_i}\} \cup \partial\mathcal{F} \cup \mathcal{F}_0(\epsilon) \big)$.
The following theorem examines the properties of the proposed decentralized optimal navigation protocol via an overall multi-agent Navigation Function.
Theorem 1. 
Let $I_1, I_2 \subset \mathbb{R}$ be intervals, and let $\hat{\varphi} : F \to I_1$ and $\sigma : I_1 \to I_2$ be analytic. Define the composition $\varphi : F \to I_2$ as $\varphi = \sigma \circ \hat{\varphi}$. If $\sigma$ is monotonically increasing on $I_1$, then the sets of critical points of $\hat{\varphi}$ and $\varphi$ coincide and the (Morse) index of each critical point is identical.
Proof. 
The proof proceeds similarly to [19]. Since we have a connected workspace and a well-defined function $\gamma_{d_i}$, we first recall some results for the navigation function $\varphi_i$ [20,26], which establish that the goal configurations are attainable without collisions and that there is always a direction of movement that decreases the overall potential function. More specifically,
  • Since the workspace is connected, the destination point $q_{d_i}$ is a non-degenerate local minimum of $\varphi_i$.
  • All critical points of $\varphi_i$ are in the interior of the free space.
  • For every $\epsilon > 0$, there exists a positive integer $N(\epsilon)$ such that if $k > N(\epsilon)$, then there are no critical points of $\hat{\varphi}_i$ in $\mathcal{F}_1(\epsilon)$.
  • There exists an $\epsilon_0 > 0$ such that $\hat{\varphi}_i$ has no local minimum in $\mathcal{F}_0(\epsilon)$, as long as $\epsilon < \epsilon_0$.
Nevertheless, the aforementioned properties do not by themselves guarantee global convergence of the system state to the goal configurations. To establish such convergence, we define a time-invariant Lyapunov function for the overall system with respect to the positions of all agents. The candidate Lyapunov function is the sum of the $\varphi_i$. Thus, differentiating the candidate Lyapunov function $\varphi = \sum_{i=1}^{m} \varphi_i$ along the trajectories of the multi-agent system (i.e., substituting the navigation protocol (7)), we obtain a negative semi-definite derivative. Thus, the value of the Lyapunov function decreases (i.e., no collisions occur) and, moreover, the system state converges to the largest invariant set of the system, which is composed of the desired configuration and the remaining equilibria that, fortunately, for sufficiently large parameters h, k, attain a measure-zero domain of attraction. Therefore, the system trajectories converge safely to the desired configuration when initialized from almost everywhere within the workspace. □

5. Results

In this section, we present the simulation results of the aforementioned method. We use 8 mobile holonomic agents of radius $r = 0.1$ in the workspace shown in Figure 4, with the initial and desired positions given in Table 1. We also run the same simulations with the Multi-Agent Poli-RRT* method to compare against our method's results; a short description of this comparison method is provided in the next subsection. For the simulations, we use MATLAB R2022b on a computer with an Intel(R) Core(TM) i7-7500U CPU @ 2.70 GHz and 8 GB of RAM (Intel, Santa Clara, CA, USA).

5.1. Multi-Agent Poli-RRT* Algorithm

The Multi-Agent Poli-RRT* method was first proposed in [40]. In that work, the authors extended the Poli-RRT* algorithm [47] to a non-holonomic multi-agent system in which the agents move in a workspace with internal obstacles, by adopting a priority-based approach. That is, the agents are ranked according to a priority criterion and the algorithm finds their trajectories in sequence, starting from the highest-priority agent and moving to the lowest one. Furthermore, in each iteration, when their turn comes, the lower-priority agents learn about the trajectories of the higher-priority agents from an updated list that contains the edges of the agents' trajectories and the time intervals during which they are occupied. With this structure, the agents can prevent possible collisions and maintain a safe distance from the previously planned agents. This algorithm can be easily adjusted to our problem, and we adapt the agents' controller to be similar to (7). A sketch of the priority-based structure is given below.
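The following sketch summarizes only the priority/reservation structure described above; it is not the Poli-RRT* implementation of [40,47]. The single-agent planner is left as a placeholder callable, and the reservation list simply accumulates the space-time edges of already-planned agents.

```python
def plan_with_priorities(agents, plan_single):
    """Sequential, priority-ordered planning as described for Multi-Agent Poli-RRT*.

    agents:      identifiers sorted from highest to lowest priority.
    plan_single: placeholder for the single-agent planner of [47]; it receives the
                 agent and the reservation list and returns a list of (edge, time)
                 entries describing the planned trajectory.
    """
    reservation = []       # space-time edges occupied by already-planned agents
    trajectories = {}
    for agent in agents:
        traj = plan_single(agent, reservation)
        trajectories[agent] = traj
        reservation.extend(traj)    # lower-priority agents must plan around these
    return trajectories

# dummy single-agent planner for illustration: go straight to the goal in unit time
goals = {'A1': (-0.88, 1.72), 'A2': (0.85, -1.63)}
dummy = lambda agent, res: [(goals[agent], 1.0)]
print(plan_with_priorities(['A1', 'A2'], dummy))
```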

5.2. Simulations

Before defining the sub-optimal control law for each agent, we use the off-policy method to determine the optimal cost function of each agent. Because we want to attach equal importance to the minimization of the settling time and of the input energy of the system (1), we set $w = 0.5$. In Figure 5, we illustrate the optimal vector fields of each agent computed with the method presented in [30]. Additionally, in Table 2 and Table 3, we present the time duration and cost value of each optimal policy, respectively.
As mentioned earlier, to demonstrate the efficiency of the proposed sub-optimal controller, in this simulation study we examine how much the trajectory of each agent deviates from the optimal SAS trajectory and how much time the overall system takes to accomplish its task as we increase the number of agents starting simultaneously from their initial positions. The goal of the MAS is for all agents to reach their destinations from the above initial positions while following optimal, collision-free trajectories. To this end, we ran three simulations: in the first, groups of two agents started towards their destinations; in the second, groups of four started together; and in the last, all the agents started together. We assume that an agent interacts only with the members of its group and does not affect the navigation of the agents outside the group. Notably, we implemented the comparison method with the same properties. For each simulation, we chose the groups of agents that interact and the parameters of the sub-optimal control law $(k, \lambda, h, \tau_m)$.
For the simulations, we categorized the agents into the following groups:
  • Two members: $(A_1, A_2)$, $(A_3, A_4)$, $(A_5, A_6)$ and $(A_7, A_8)$.
  • Four members: $(A_1, A_2, A_7, A_8)$ and $(A_3, A_4, A_5, A_6)$.
  • Eight members: $(A_1, A_2, A_3, A_4, A_5, A_6, A_7, A_8)$ (all the agents).
For the Multi-Agent Poli-RRT* method, the first agent has the highest priority and the last one has the lowest.
Regarding the parameters of the sub-optimal control law used in each simulation, we note that in Simulation 1 the variables $\lambda = h = \mathrm{NaN}$, because the complementary relations of each agent are not defined when only two agents interact. Also, the selected values are the best available ones, providing the best sub-optimal collision-free trajectory for each agent. Finally, the accompanying videos of each simulation are given in slow motion in order to make the collision-avoidance property observable.
Table 2 lists the time duration each agent needs to accomplish its task and Table 3 the cost value of each agent for every simulation; the last row of each table reports the time duration and cost value of the overall system. The results of both navigation methods are shown in these tables. Finally, Figure 6 shows the optimal trajectory and the greedy sub-optimal trajectories of each agent for the optimal SAS and for each simulation, and Figure 7 presents the trajectories obtained by the two navigation methods during Simulation 3.
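For completeness, a minimal Python sketch of the simulation loop is given below (our simulations were run in MATLAB; this sketch is only illustrative). All agents are integrated forward with the single-integrator model (1), and the time and cost (2) of each agent are recorded until it reaches its goal. The decentralized law (7) is assumed to be available as a callable control(i, q); the step size, tolerance and horizon values are assumptions.

```python
import numpy as np

def simulate(q0, qd, control, w=0.5, dt=1e-2, tol=5e-2, t_max=20.0):
    """Integrate q_dot_i = u_i (Eq. (1)) for all agents and record, per agent, the
    time needed to reach its goal and the cost (2) accumulated along the way.

    control(i, q): the decentralized law (7) of agent i given all current positions q.
    """
    q = np.array(q0, float)                 # (m, 2) current positions
    qd = np.array(qd, float)                # (m, 2) desired positions
    m = len(q)
    times, costs = np.zeros(m), np.zeros(m)
    done = np.zeros(m, dtype=bool)
    t = 0.0
    while not done.all() and t < t_max:
        for i in range(m):
            if done[i]:
                continue
            u = control(i, q)
            costs[i] += (w * np.linalg.norm(q[i] - qd[i]) ** 2
                         + (1.0 - w) * np.linalg.norm(u) ** 2) * dt
            q[i] = q[i] + u * dt
            if np.linalg.norm(q[i] - qd[i]) < tol:
                done[i], times[i] = True, t
        t += dt
    return times, costs
```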

6. Discussion

Figure 6 shows the agents' trajectories during the above simulations. Specifically, we see the optimal (solid blue line) and sub-optimal (dashed lines) trajectories of each agent separately. In all simulations, at the start and the end of their trajectories, every agent remains close to its optimal navigation. On the contrary, in the center of the workspace, where the groups of agents meet, each agent deviates from its optimal trajectory or remains idle for a short period in order to avoid conflicts with the other agents. As a result, the sub-optimal cost value increases as we enlarge the group. This behavior stems from the tuning of the parameters of the sub-optimal controller according to Theorem 1, which is required to establish convergence from almost everywhere in the workspace. Had we not selected proper parameter values, we would have observed the agents remaining idle in the center of the workspace without accomplishing their tasks, owing to the presence of a stable unwanted equilibrium.
In comparison with the Multi-Agent Poli-RRT* method, Table 2 and Table 3 showcase the global efficiency of our method. We also observe the effect of the priority-based approach of the comparison method: in Table 2 and Table 3, the highest-priority agents achieve smaller time durations and cost values than the respective agents under our method. The same can be seen in Figure 7, which shows the trajectories of the two methods during Simulation 3. The trajectories of the highest-priority agents under Poli-RRT* (dashed purple line) stay closer to their optimal trajectories than the sub-optimal ones (dashed red line). However, as the reservation list grows with each iteration, the lower-priority agents start to deviate from their optimal trajectories. For these reasons, our approach is globally more efficient than the above priority-based algorithm.
Moreover, in the aforementioned results, we observe the expected trade-off between total time duration and total cost value: as we increase the number of agents that interact in the workspace, the total time the agents require to accomplish their tasks decreases, while the total cost value of the MAS increases, since the agents deviate from their optimal trajectories. Figure 8 shows the decrease in the total time duration (blue lines) and the percentage increase of the total cost value relative to the global optimal cost value (red lines) for the sub-optimal navigation method (solid lines) and the Multi-Agent Poli-RRT* method (dashed lines), with the horizontal axis in logarithmic scale. This trade-off provides additional confirmation of the global efficiency of our method.

7. Conclusions

In this paper, we proposed a sub-optimal approach for the navigation problem of a MAS in a simply connected workspace. We observed a trade-off between the time duration needed to fulfill the task and the overall cost value as the number of members of a group increases. Moreover, we compared our method with the Multi-Agent Poli-RRT* algorithm to demonstrate its validity against other algorithms from the literature. We followed this approach to avoid the computational complexity of RL-based methods, designing a sub-optimal decentralized control law instead. Despite the promising results of our study, several issues remain. First of all, with this method the agents may be blocked by others if the workspace has narrow passages. Moreover, we cannot achieve global asymptotic navigation, as discussed in the previous sections [19]. In future research, we will extend this methodology to remain functional in narrow workspaces by coordinating the agents properly so that they reach their destinations, and we will utilize algorithms based on game theory to asymptotically approach the globally optimal navigation of the system.

Author Contributions

Conceptualization, D.K. and C.P.B.; methodology, D.K. and C.P.B.; software, D.K.; validation, D.K.; formal analysis, D.K. and C.P.B.; investigation, D.K. and C.P.B.; resources, C.P.B.; writing—original draft preparation, D.K.; writing—review and editing, C.P.B.; visualization, D.K.; supervision, C.P.B.; project administration, C.P.B.; funding acquisition, C.P.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the project “Applied Research for Autonomous Robotic Systems” (MIS 5200632) which is implemented within the framework of the National Recovery and Resilience Plan “Greece 2.0” (Measure: 16618—Basic and Applied Research) and is funded by the European Union—NextGenerationEU.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MP: Motion Planning.
MAS: Multi-Agent System.
SAS: Single-Agent System.
CP: Centralized Policy.
DP: Decentralized Policy.
RRT: Rapidly-exploring Random Tree.
RL: Reinforcement Learning.
DRL: Deep Reinforcement Learning.
NF: Navigation Function.
AHPF: Artificial Harmonic Potential Field.
RPF: Relation Proximity Function.
RVF: Relation Verification Function.

References

  1. Xuan, P.; Lesser, V.R. Multi-Agent Policies: From Centralized Ones to Decentralized Ones. In Proceedings of the First International Joint Conference on Autonomous Agents and Multi-Agent Systems, Bologna, Italy, 15–19 July 2002; pp. 1098–1105. [Google Scholar]
  2. Atinç, G.M.; Stipanovic, D.M.; Voulgaris, P.G. Supervised Coverage Control of Multi-Agent Systems. Automatica 2014, 50, 2936–2942. [Google Scholar] [CrossRef]
  3. Gul, F.; Mir, A.; Mir, I.; Mir, S.; Islaam, T.U.; Abualigah, L.; Forestiero, A. A Centralized Strategy for Multi-Agent Exploration. IEEE Access 2022, 10, 126871–126884. [Google Scholar] [CrossRef]
  4. Ota, J. Multi-Agent Robot Systems as Distributed Autonomous Systems. Adv. Eng. Inform. 2006, 20, 59–70. [Google Scholar] [CrossRef]
  5. Romeh, A.E.; Mirjalili, S. Multi-Robot Exploration of Unknown Space Using Combined Meta-Heuristic Salp Swarm Algorithm and Deterministic Coordinated Multi-Robot Exploration. Sensors 2023, 23, 2156. [Google Scholar] [CrossRef] [PubMed]
  6. Zhang, Z.; Yu, J.; Tang, J.; Xu, Y.; Wang, Y. MR-TopoMap: Multi-Robot Exploration Based on Topological Map in Communication Restricted Environment. IEEE Robot. Autom. Lett. 2022, 7, 10794–10801. [Google Scholar] [CrossRef]
  7. Rooker, M.N.; Birk, A. Multi-Robot Exploration under the Constraints of Wireless Networking. Control Eng. Pract. 2007, 15, 435–445. [Google Scholar] [CrossRef]
  8. Alonso-Mora, J.; Baker, S.; Rus, D. Multi-Robot Formation Control and Object Transport in Dynamic Environments via Constrained Optimization. Int. J. Robot. Res. 2017, 36, 1000–1021. [Google Scholar] [CrossRef]
  9. Borate, S.S.; Vadali, M. FFRRT: A Sampling-Based Path Planner for Flexible Multi-Robot Formations. In Proceedings of the 2021 5th International Conference on Advances in Robotics, Kanpur, India, 30 June–4 July 2021; pp. 53:1–53:6. [Google Scholar]
  10. Chipade, V.S.; Panagou, D. Multiagent Planning and Control for Swarm Herding in 2D Obstacle Environments under Bounded Inputs. IEEE Trans. Robot. 2021, 37, 1956–1972. [Google Scholar] [CrossRef]
  11. López-Nicolás, G.; Aranda, M.; Mezouar, Y. Adaptive Multi-Robot Formation Planning to Enclose and Track a Target with Motion and Visibility Constraints. IEEE Trans. Robot. 2020, 36, 142–156. [Google Scholar] [CrossRef]
  12. Cheng, P.D.C.; Indri, M.; Possieri, C.; Sassano, M.; Sibona, F. Path Planning in Formation and Collision Avoidance for Multi-Agent Systems. Nonlinear Anal. Hybrid Syst. 2023, 47, 101293. [Google Scholar] [CrossRef]
  13. Tong, X.; Yu, S.; Liu, G.; Niu, X.; Xia, C.; Chen, J.; Yang, Z.; Sun, Y. A Hybrid Formation Path Planning Based on A* and Multi-Target Improved Artificial Potential Field Algorithm in the 2D Random Environments. Adv. Eng. Inform. 2022, 54, 101755. [Google Scholar] [CrossRef]
  14. Banyassady, B.; de Berg, M.; Bringmann, K.; Buchin, K.; Fernau, H.; Halperin, D.; Kostitsyna, I.; Okamoto, Y.; Slot, S. Unlabeled Multi-Robot Motion Planning with Tighter Separation Bounds. In Proceedings of the 38th International Symposium on Computational Geometry, Berlin, Germany, 7–10 June 2022; pp. 12:1–12:16. [Google Scholar]
  15. Boardman, B.L.; Harden, T.; Martínez, S. Multiagent Motion Planning with Sporadic Communications for Collision Avoidance. IFAC J. Syst. Control 2021, 15, 100126. [Google Scholar] [CrossRef]
  16. Petrović, L.; Marković, I.; Seder, M. Multiagent Gaussian Process Motion Planning via Probabilistic Inference. In Proceedings of the 12th IFAC Symposium on Robot Control, St Etienne, France, 17–19 May 2018; Volume 51, pp. 160–165. [Google Scholar]
  17. Vlantis, P.; Bechlioulis, C.P.; Kyriakopoulos, K.J. Navigation of Multiple Disk-Shaped Robots with Independent Goals within Obstacle-Cluttered Environments. Sensors 2023, 23, 221. [Google Scholar] [CrossRef] [PubMed]
  18. Zhu, H.; Alonso-Mora, J. Chance-Constrained Collision Avoidance for MAVs in Dynamic Environments. IEEE Robot. Autom. Lett. 2019, 4, 776–783. [Google Scholar] [CrossRef]
  19. Koditschek, D.E.; Rimon, E. Robot Navigation Functions on Manifolds with Boundary. Adv. Appl. Math. 1990, 11, 412–442. [Google Scholar] [CrossRef]
  20. Loizou, S.G.; Kyriakopoulos, K.J. Closed Loop Navigation for Multiple Holonomic Vehicles. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, 30 September–4 October 2002; pp. 2861–2866. [Google Scholar]
  21. Zavlanos, M.M.; Kyriakopoulos, K.J. Decentralized Motion Control of Multiple Mobile Agents. In Proceedings of the 11th Mediterranean Conference on Control and Automation, Rhodes, Greece, 18–20 June 2003. [Google Scholar]
  22. Chen, J.; Dawson, D.M.; Salah, M.; Burg, T. Cooperative Control of Multiple Vehicles with Limited Sensing. Int. J. Adapt. Control Signal Process. 2006, 21, 115–131. [Google Scholar] [CrossRef]
  23. Loizou, S.G.; Kyriakopoulos, K.J. Closed Loop Navigation for Multiple Non-Holonomic Vehicles. In Proceedings of the 2003 IEEE International Conference on Robotics and Automation, Taipei, Taiwan, 14–19 September 2003; pp. 4240–4245. [Google Scholar]
  24. Loizou, S.G.; Lui, D.G.; Petrillo, A.; Santini, S. Connectivity Preserving Formation Stabilization in an Obstacle-Cluttered Environment in the Presence of Time-Varying Communication Delays. IEEE Trans. Autom. Control 2022, 67, 5525–5532. [Google Scholar] [CrossRef]
  25. Luis, R.; Tanner, H.G. Flocking, Formation Control, and Path Following for a Group of Mobile Robots. IEEE Trans. Control Syst. Technol. 2015, 23, 1268–1282. [Google Scholar]
  26. Dimarogonas, D.V.; Loizou, S.G.; Kyriakopoulos, K.J.; Zavlanos, M.M. A Feedback Stabilization and Collision Avoidance Scheme for Multiple Independent Non-point Agents. Automatica 2006, 42, 229–243. [Google Scholar] [CrossRef]
  27. Tanner, H.G.; Boddu, A. Multiagent Navigation Functions Revisited. IEEE Trans. Robot. 2012, 28, 1346–1359. [Google Scholar] [CrossRef]
  28. Karaman, S.; Frazzoli, E. Sampling-Based Algorithms for Optimal Motion Planning. Int. J. Robot. Res. 2011, 30, 846–894. [Google Scholar] [CrossRef]
  29. Véras, L.G.D.; Medeiros, F.L.; Guimarães, L.N.F. Systematic Literature Review of Sampling Process in Rapidly-Exploring Random Trees. IEEE Access 2019, 7, 50933–50953. [Google Scholar] [CrossRef]
  30. Rousseas, P.; Bechlioulis, C.P.; Kyriakopoulos, K.J. A Continuous Off-Policy Reinforcement Learning Scheme for Optimal Motion Planning in Simply-Connected Workspaces. In Proceedings of the IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023; pp. 10247–10253. [Google Scholar]
  31. Vlachos, C.; Rousseas, P.; Bechlioulis, C.P.; Kyriakopoulos, K.J. Reinforcement Learning-Based Optimal Multiple Waypoint Navigation. In Proceedings of the IEEE International Conference on Robotics and Automation, London, UK, 29 May–2 June 2023; pp. 1537–1543. [Google Scholar]
  32. Parras, J.; Apellániz, P.A.; Zazo, S. Deep Learning for Efficient and Optimal Motion Planning for AUVs with Disturbances. Sensors 2021, 21, 5011. [Google Scholar] [CrossRef] [PubMed]
  33. Chen, Y.F.; Liu, M.; Everett, M.; How, J.P. Decentralized Non-communicating Multi-Agent Collision Avoidance with Deep Reinforcement Learning. In Proceedings of the IEEE International Conference on Robotics and Automation, Singapore, 29 May–3 June 2017; pp. 285–292. [Google Scholar]
  34. Benjeddou, A.; Mechbal, N.; Deü, J. Smart Structures and Materials: Vibration and Control. J. Vib. Control 2020, 26, 1109. [Google Scholar] [CrossRef]
  35. Everett, M.; Chen, Y.F.; How, J.P. Motion Planning among Dynamic, Decision-Making Agents with Deep Reinforcement Learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, Spain, 1–5 October 2018; pp. 3052–3059. [Google Scholar]
  36. Qie, H.; Shi, D.; Shen, T.; Xu, X.; Li, Y.; Wang, L. Joint Optimization of Multi-UAV Target Assignment and Path Planning Based on Multi-Agent Reinforcement Learning. IEEE Access 2019, 7, 146264–146272. [Google Scholar] [CrossRef]
  37. Sartoretti, G.; Kerr, J.; Shi, Y.; Wagner, G.; Kumar, T.S.; Koenig, S.; Choset, H. PRIMAL: Path-Finding via Reinforcement and Imitation Multi-Agent Learning. IEEE Robot. Autom. Lett. 2019, 4, 2378–2385. [Google Scholar] [CrossRef]
  38. Wan, K.; Wu, D.; Li, B.; Gao, X.; Hu, Z.; Chen, D. MEMADDPG: An Efficient Learning-Based Motion Planning Method for Multiple Agents in Complex Environments. Int. J. Intell. Syst. 2022, 37, 2393–2427. [Google Scholar] [CrossRef]
  39. Rousseas, P.; Bechlioulis, C.P.; Kyriakopoulos, K.J. Reactive Optimal Motion Planning to Anywhere in the Presence of Moving Obstacles. Int. J. Robot. Res. 2024, 02783649241245729. [Google Scholar] [CrossRef]
  40. Ragaglia, M.; Prandini, M.; Bascetta, L. Multi-Agent Poli-RRT* Optimal Constrained RRT-based Planning for Multiple Vehicles with Feedback Linearisable Dynamics. In Modelling and Simulation for Autonomous Systems, Proceedings of the Third International Workshop, MESAS 2016, Rome, Italy, 15–16 June 2016; Springer International Publishing: New York, NY, USA, 2016; pp. 261–270. [Google Scholar]
  41. Peasgood, M.; Clark, C.M.; McPhee, J. A Complete and Scalable Strategy for Coordinating Multiple Robots within Roadmaps. IEEE Trans. Robot. 2008, 24, 283–292. [Google Scholar] [CrossRef]
  42. Kalman, R.E. Contributions to the Theory of Optimal Control. Bol. Soc. Mat. Mex. 1960, 5, 102–109. [Google Scholar]
  43. Song, R.; Lewis, F.L.; Wei, Q.; Zhang, H. Off-Policy Actor-Critic Structure for Optimal Control of Unknown Systems with Disturbances. IEEE Trans. Cybern. 2016, 46, 1041–1050. [Google Scholar] [CrossRef] [PubMed]
  44. Rousseas, P.; Bechlioulis, C.P.; Kyriakopoulos, K.J. Trajectory Planning in Unknown 2D Workspaces: A Smooth, Reactive, Harmonics-Based Approach. IEEE Robot. Autom. Lett. 2022, 7, 1992–1999. [Google Scholar] [CrossRef]
  45. Ames, A.D.; Xu, X.; Grizzle, J.W.; Tabuada, P. Control Barrier Function Based Quadratic Programs for Safety Critical Systems. IEEE Trans. Autom. Control 2017, 62, 3861–3876. [Google Scholar] [CrossRef]
  46. Majdisova, Z.; Skala, V. Radial Basis Function Approximations: Comparison and Applications. Appl. Math. Model. 2017, 51, 728–743. [Google Scholar] [CrossRef]
  47. Ragaglia, M.; Prandini, M.; Bascetta, L. Poli-RRT*: Optimal RRT-based Planning for Constrained and Feedback Linearisable Vehicle Dynamics. In Proceedings of the 14th European Control Conference, Linz, Austria, 15–17 July 2015; pp. 2521–2526. [Google Scholar]
Figure 1. (a) Initial Policy. (b) Optimal Policy. (c) Optimal Cost Value.
Figure 2. The relations which an agent has when a workspace involves three mobile agents.
Figure 3. (a) Sigmoid function and (b) gradient of sigmoid function for $\tau_m = 5$.
Figure 4. Workspace and the initial (symbol I) and desired (symbol G) position of each agent.
Figure 5. Optimal vector fields of each agent. The red-filled circle showcases the goal position. (a) Agent 1. (b) Agent 2. (c) Agent 3. (d) Agent 4. (e) Agent 5. (f) Agent 6. (g) Agent 7. (h) Agent 8.
Figure 6. The Optimal (solid blue line), Simulation 1 (dashed red line), Simulation 2 (dashed purple line) and Simulation 3 (dashed magenta line) trajectories of each agent. The green-filled circle showcases the initial position and the red-filled circle showcases the goal position. (a) Agent 1. (b) Agent 2. (c) Agent 3. (d) Agent 4. (e) Agent 5. (f) Agent 6. (g) Agent 7. (h) Agent 8.
Figure 7. Agents' optimal navigation (solid blue line) and their trajectories calculated with the sub-optimal navigation method (dashed red line) and Multi-Agent Poli-RRT* method (dashed purple line) in Simulation 3. The green-filled circle showcases the initial position and the red-filled circle the goal position. (a) Agent 1. (b) Agent 2. (c) Agent 3. (d) Agent 4. (e) Agent 5. (f) Agent 6. (g) Agent 7. (h) Agent 8.
Figure 8. Total time duration of the sub-optimal navigation method (solid blue line) and the Multi-Agent Poli-RRT* method (dashed blue line), and the percentage increase of the total cost value of the sub-optimal navigation method (solid red line) and the Multi-Agent Poli-RRT* method (dashed red line), according to the number of agents in the group.
Table 1. The initial and desired position of each agent.

Agent Index    Initial Position (x, y)    Desired Position (x, y)
1              (1.17, −1.87)              (−0.88, 1.72)
2              (−1.23, 1.92)              (0.85, −1.63)
3              (1.57, −0.85)              (−1.85, 0.70)
4              (−1.55, 0.72)              (1.94, −0.85)
5              (1.66, 0.86)               (−1.96, −0.81)
6              (−1.59, −0.77)             (2.04, 0.85)
7              (0.56, 1.93)               (−0.33, −1.44)
8              (−0.57, −1.84)             (0.37, 1.49)
Table 2. Time duration of each agent and of the whole system, with the sub-optimal control method and the Multi-Agent Poli-RRT* method.

Agent Index    Optimal SAS    Simulation 1               Simulation 2               Simulation 3
                              Sub-Optimal   Poli-RRT*    Sub-Optimal   Poli-RRT*    Sub-Optimal   Poli-RRT*
1              4.45           4.45          4.58         4.55          5.28         4.70          4.52
2              4.40           4.45          5.00         4.50          4.43         4.70          4.52
3              4.35           4.35          4.35         4.55          4.33         4.80          4.76
4              4.35           4.35          4.70         4.45          4.63         4.55          4.48
5              4.40           4.40          4.93         4.55          4.65         4.60          5.96
6              4.35           4.40          4.43         4.40          5.00         4.70          4.84
7              4.20           4.25          4.35         4.30          4.68         4.50          4.48
8              4.20           4.25          4.93         4.35          4.80         4.35          5.72
Total          34.70          17.45         19.56        9.10          10.28        4.80          5.96
Table 3. Cost value of each agent and of the whole system, with the sub-optimal control method and the Multi-Agent Poli-RRT* method.

Agent Index    Optimal SAS    Simulation 1               Simulation 2               Simulation 3
                              Sub-Optimal   Poli-RRT*    Sub-Optimal   Poli-RRT*    Sub-Optimal   Poli-RRT*
1              8.59           8.60          8.61         8.89          8.65         9.35          8.64
2              8.51           8.53          8.71         8.72          8.63         9.40          8.67
3              7.08           7.11          7.45         7.61          7.09         8.35          7.40
4              7.37           7.40          7.46         7.94          7.56         8.33          8.63
5              7.94           7.97          8.04         8.30          8.67         8.63          8.92
6              7.94           7.97          8.03         8.05          8.23         9.14          9.97
7              6.06           6.13          6.10         6.26          7.25         6.85          6.95
8              6.01           6.05          6.18         6.32          7.02         6.38          8.29
Total          59.51          59.75         60.58        62.09         63.10        66.43         67.47
