An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments

Benavides, Facundo; Ponzoni Carvalho Chanel, Caroline; Monzón, Pablo; Grampín, Eduardo

doi:10.3390/app9030573

Open AccessArticle

An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments

by

Facundo Benavides

^1,2,*,†

,

Caroline Ponzoni Carvalho Chanel

^2,‡

,

Pablo Monzón

^3,‡

and

Eduardo Grampín

^1,‡

¹

Computer Science Institute, Faculty of Engineering, Universidad de la República, 11300 Montevideo, Uruguay

²

Aerospace Vehicles Design and Control Department (DCAS), ISAE-SUPAERO, Université de Toulouse, 31055 Toulouse, France

³

Electrical Engineering Institute, Faculty of Engineering, Universidad de la República, 11300 Montevideo, Uruguay

^*

Author to whom correspondence should be addressed.

^†

Current address: J. H. y Reissig 565, Facultad de Ingeniería (InCo), 11300 Montevideo, Uruguay.

^‡

These authors contributed equally to this work.

Appl. Sci. 2019, 9(3), 573; https://doi.org/10.3390/app9030573

Submission received: 21 December 2018 / Revised: 31 January 2019 / Accepted: 2 February 2019 / Published: 9 February 2019

(This article belongs to the Special Issue Multi-Robot Systems: Challenges, Trends and Applications)

Download

Browse Figures

Versions Notes

Abstract

:

Featured Application

In application fields where strong communication requirements do not condition the mission, the present approach represents a proper option for coping with real communication constraints, being more fault tolerant and still having good performance simultaneously.

Abstract

The exploration problem is a fundamental subject in autonomous mobile robotics that deals with achieving the complete coverage of a previously unknown environment. There are several scenarios where completing exploration of a zone is a main part of the mission. Due to the efficiency and robustness brought by multi-robot systems, exploration is usually done cooperatively. Wireless communication plays an important role in collaborative multi-robot strategies. Unfortunately, the assumption of stable communication and end-to-end connectivity may be easily compromised in real scenarios. In this paper, a novel auto-adaptive multi-objective strategy is followed to support the selection of tasks regarding both exploration performance and connectivity level. Compared with others, the proposed approach shows effectiveness and flexibility to tackle the multi-robot exploration problem, being capable of decreasing the last of disconnection periods without noticeable degradation of the completion exploration time.

Keywords:

exploration missions; cooperative systems; multi-robot coordinated systems; constrained-communication environments

1. Introduction

The exploration problem is a fundamental subject in autonomous mobile robotics that deals with achieving the complete coverage of a previously unknown environment. There are several scenarios where completing exploration of a zone is a central part of the mission, e.g., planetary exploration, reconnaissance, search and rescue, agriculture, cleaning, or dangerous places as mined lands and radioactive zones. Additionally, due to the inner qualities—mainly efficiency and robustness—of multi-robot systems, exploration is usually done cooperatively [1].

Schematically, the exploration of an environment can be seen as the composition of Mapping and Motion Planning tasks. A map is needed in order to plan new motions. Moreover, choosing a correct motion sequence based on this map is also needed to expand the knowledge about the environment optimally. Consequently, Mapping is regularly interleaved with Motion Planning, and vice versa, during the whole process [1,2,3].

Given that the lack of knowledge is essentially inherent to exploration missions, the best choice for the robots is to visit the places where the gain of information can be potentially higher. The Task Identification problem concerns the identification of the points of interest that should be visited next. It strongly depends on both the sensory robot capabilities and the underlying environment representation. The most widely used representation for this purpose is the well-known Occupancy Grid structure [4]. Based on it, a method to identify points of interest was proposed by [5]. The strategy assumes that the closer to the frontier between known and unknown regions the tasks are defined, the more information the team can gather. Since then, the majority of exploration proposals has adopted this scheme known as Frontier Points or Frontier Regions [6,7,8].

When multiple robots are involved, it is advisable to avoid several of them moving to the same place. The Task Allocation problem concerns the search for a distribution of tasks to robots that maximises the overall system utility and minimises the amount of overlapped information obtained by the robots [1,9,10].

There exist a wide variety of proposed solutions to this problem where a family of methods based on market economies are probably the most popular ones. These methods are based on the notion of Auctions from which the robots can bid for the tasks to decide who goes to where at each moment. The market may be managed centrally either by a virtual agent at the base station as in [3] where the bids are processed centralised by a greedy algorithm or by a robotic agent as in [1]. Conversely, the fleet can manage to exchange the bids among all the members in order to take decentralised decisions [11,12], avoiding, in turn, the single point of failure. All these methods owe their popularity to their simplicity and ease of implementation, but they suffer from a significant shortcoming: falling in local minima [13].

Far from economy inspired approaches, a scheduling based approach is presented in [2]. This method combines an environment segmentation technique with the centralised task allocation method proposed by [14]. The exploration is performed after dividing the environment into disjoint segments. Thus, the expected sensory overlap between agents is decreased as much as possible.

In [15] the authors address coordination implicitly through localisation data exchanging. Robots are forced to wait for others before making a decision. Task selection is made iteratively—one robot after another—employing an objective function which rewards the right choices. In [16] a centralised approach is used. The tasks-to-robots distribution is computed balancing information gain, localisation quality, and navigation costs. Another centralised approach computes a utility function enabling the robots to locally prioritise the tasks within its scope and, potentially, also enabling the whole team to search for the best global distribution as well [9].

On the contrary, a decentralised approach, called minPos [17], attempts to distribute the robots over the unexplored locations as much as possible. By doing so, it has outperformed several reference proposals decreasing the completion-exploration time for a big set of practical scenarios. The working principle is to rank robots concerning their distance to every possible task. The robots coordinate their actions implicitly and may choose to visit the tasks for which they are best ranked at each point in time.

Finally, the strategy described in [18] is mainly devoted to deal with uncertainties in sensing and motion processes of a multi-robot system. To this end, the authors model the exploration and mapping problem as a POMDP that is solved centrally. In [19] the assignment algorithm works in an asynchronous fashion assuming that not all robots must be ready for new plans at the same time.

Wireless communication plays an important role in collaborative multi-robot strategies. Unfortunately, the assumption or requirement of stable communication and end-to-end connectivity may be easily compromised in real scenarios due to interference, fading, or simply robots moving beyond the communication range. When robots are unconnected they have no possibilities to coordinate their actions and damages or inner failures can lead to information losses. Therefore, depending on the application field, the exploration strategy should take this into account to prevent isolation situations.

1.1. Communication Issues

Mobile Ad-hoc NETworks (MANETs) constitute a particular example of scenarios where the topology of the robot network varies dynamically over time. This kind of network is recommended when the fixed infrastructure is no longer available, e.g., in disasters to support the communication among rescue team members. In such cases, connectivity is of utmost importance because the loss of communication could imply human losses.

A first critical issue concerns the collective knowledge of the environment. Under communication restrictions, such knowledge cannot be assumed to be always accessible and depending on the coordination mechanism could be the cause of significant performance degradation [20]. Therefore, depending on the application, the exploration strategy should take this into account in order to prevent the robots from becoming completely unconnected, let say isolation situations. Such an isolation situation, as well as its possible effects, are illustrated in the example scene depicted in Figure 1.

1.1.1. Connection Requirements

Three categories are mainly identified [20]:

None. Robots are not required to communicate.
Event-based connectivity. The need for regaining connectivity is triggered by particular events such as the discovery of new information or just periodically.
Continuous connectivity. Every robot must be connected at all times to any other fleet member either directly or in a multi-hop manner.

Please note that these requirements could have an impact on the fleet mobility and, in turn, on the availability of exploration strategies to be adopted. For instance, under a continuous connectivity scheme, the fleet is more restricted to move around than in other cases.

1.1.2. Communication Models

Communication model refers to the prior knowledge about communication capabilities that support the decision making of the robots along the exploration. Nevertheless, sometimes no communication model is assumed and, consequently, robots do not depend on communicating to decide where to go next. In such cases, explicit coordination only occurs opportunistically due to random encounters [20].

The communication models typically adopted are [20]:

None. Robots do not make any assumption on the communication possibilities between any pair of arbitrary locations.
Line-of-sight (LoS). Two robots can communicate if and only if their positions belong to a free-of-obstacle line segment. Usually, the distance is also restricted to a maximum value that is often related to the scope of the communication device.
Disc or Circle. Communication with any other robot is permitted when its location is within a fixed maximum distance (communication radius) regardless of the presence of obstacles.
Signal. Communication is available with a certain probability that depends on the estimated signal power between the robot positions. The higher the signal power, the higher the probability.
Traces. Robots can communicate with each other by dropping messages in the environment.

Additionally, to these five categories that cover an essential aspect of the communications, say connectivity, there exist other formulations aimed at cover bandwidth or throughput as well. Clear examples of its use are the applications with a strong dependence on video streaming like search and rescue applications.

1.2. Connectivity-Based Proposals

Despite its well-known inefficiencies, there exist some few approaches without any connection requirements where robots meet each other by chance. Nevertheless, this section only surveys the proposals that depend on connectivity in one way or another.

In [21] a behaviour-based approach is presented. The architecture is designed to guide the exploration constraining the fleet to keep within the communication range, establishing a mobile network. The well-known disk model and a graph structure are used to model the network connectivity and identify possible disconnections. Frontier cells are evaluated regarding costs (computed utilising a flooding algorithm) and information utility (based on the ideas proposed in [3]). Behaviours are selected according to the network topology conditions.

In [22] a centralised communicative exploration algorithm is proposed. Communicative exploration implies that the team of robots have to maintain connections between each other at all times. The target selection is based on a utility function that weights the benefits of exploring new regions versus the goal of keeping connected. While connectivity is valued using the classic disc model, the costs of the shortest paths are computed from the Manhattan distance notion. Due to spatial and movement restrictions, specific behaviours are defined to deal with deadlocks. Also following a centralised approach, [23] presents four fully reactive exploration strategies. They consist in translating the distance to tasks and disconnection situations into artificial forces that pull and push the robot to reach new positions smoothly, avoiding them to lose connectivity. The radio signal quality is modelled considering both the communication range and the distance attenuation effect. Deadlocks are avoided by assigning tasks to a cluster of robots. This allocation guarantees that robots belonging to the same cluster do not exert conflicting forces upon each other towards different directions.

In [24], the authors propose a decentralised version of the strategy proposed in [22] based on message exchanging and a graph structure where the group always tries to keep a biconnected network efficiently. Communication model is based on the classic disc model. In consequence, robot mobility is restricted by the communication range. Using the same graph theory, in [25] the experimental validation of a distributed algorithm that preserves connectivity is also discussed. Nevertheless, a different coordination mechanism—supported by a market-based negotiation algorithm—is adopted. Unfortunately, only results on connectivity maintenance are shown, lacking exploration metrics reports.

The proposal of [26] aims to maintain and repair the underlying wireless mesh network while the coverage task is being performed, all at once. The system works in a fully asynchronous and distributed way. Differently from previous works, the authors propose a network disconnection detection by checking the real state of connections without assumptions on communication range or propagation model. On the contrary, all nodes require knowledge about the area to be covered and on global positions.

In [15] the robots can disconnect as long as they regain connectivity periodically following a distributed but synchronous strategy. Authors address coordination implicitly through localisation data exchanging. Robots are forced to wait for others before making a decision. The system works as an optimisation method where each variable is optimised at a time in a round-robin while the others remain unchangeable. In [27] the authors describe a heterogeneous multi-robot system for exploration tasks. They consider several explorer robots and conceive a particular robot playing the role of relay dispenser. This agent is in charge of place relays when and where it is necessary to support the video/audio streaming generated by explorers.

In [28] the problem of exploration and mapping is addressed by using a Decentralised POMDP. This technique takes advantage of local interaction and coordination from the interaction-oriented resolution of decentralised decision makers. Distributed value functions (DVS) are used by decoupling the multi-agent problem into a set of individual agent problems. In order to address full local observability, limited information sharing, and communication breaks, an extension of the DVS methodology is proposed and applied in multi-robot exploration so that each robot computes locally a strategy that minimises the interaction between fleet members and maximises the coverage achieved by the team, even in communication constrained environments. A decision step consists in building the model, computing the policy from the DVS and producing a trajectory.

Rendezvous-based techniques have also been used to deal with limited communication ranges. In [16] robots are enabled to move out of the communication range but forced to rejoin the group frequently. After moving out the communication range robots have to return to a pre-arranged meeting point to exchange the information gathered during the disconnection period in order to avoid exploration overlaps.

The proposal presented in [29] describes a Particle Swarm Optimisation based approach to achieving fault-tolerance in preventing communication network splits. The principal objective is to keep the fleet k-connected. Considering that the application domain defines the fault-tolerance level required to the system, a MANET connectivity algorithm is extended with the concept of k-fault-tolerance.

A multi-robot system for crisis management is described in [30]. The system is composed of mobile sensors (ground robots—UGV) and mobile relays (aerial vehicles—UAV). However, some robots may change roles dynamically during the mission (e.g., UAVs equipped with both wireless routers and cameras). The problem is modelled and solved using constrained-based local search on a communication model based on graph theory.

In [31] a fully distributed approach for multi-robot sweep exploration is introduced. The proposal aims to guarantee full coverage using a minimum number of messages and to maintain connectivity at all times, even under severe restrictions on the communication type, range, and quality. The algorithm proposed uses communication not only to exchange information but to direct the robot movements. Communication intensity is used in order to disperse the fleet while beacons are used to mark locations of interest.

In [32] a multi-robot exploration algorithm based on multiple behaviours is proposed. Quad-rotors are asked to explore and map an indoor zone with unreliable communication and limited battery life. Robots are enabled to change roles both dynamically according to intrinsic and extrinsic factors (e.g., boundaries/distances and battery level) and hierarchically in order to explore and avoid collision among each other. The remaining battery level is considered in order to avoid losing gathered information. Quad-rotors are also able to leave the network, but after a fixed period they search for regaining connectivity. Relay robots are designated to forward information from/to the more distant robots improving communication between team members. Although no optimal relay placement is computed, the existence of relays is crucial in the proposed scheme.

In [33,34] the relay node dynamic re-positioning problem is tackled. The proposed solution relies on optimisation procedures and evolutionary algorithms to find the best relay locations and how the robots should move to these points. The authors follow a centralised multi-stage approach where one node is in charge of computing the best assignment regarding both connectivity and throughput.

In [35], the problem of how to connect one or more remote units to a base station investing a limited number of intermediate relay robots in constrained communication environments is investigated. The authors study the complexity of the optimal relay placement problem and propose methodologies to create chains or trees of relays as required by different static scenarios. By contrast, in changing environments static solutions cannot be successfully applied because the location optimality does not hold over time.

In [36] the exploration problem is addressed ensuring a time-varying connected topology in 3D cluttered environments but following a decentralised control strategy which enables simultaneous multi-task exploration.

Another centralised but asynchronous strategy is followed in [19,37] in order to address the problem of multi-robot exploration under recurrent connectivity. In these works, the authors leverage a variant of the Steiner tree problem that appears as a particular case of different known graph optimisation problems. Robot placement is treated as an optimisation problem through Integer Linear Programming. Exact and approximated algorithms are compared on particular scenarios.

1.3. Conclusions

Some conclusions arise from this brief survey of recent works. Firstly, it is remarkable that despite being the most restrictive class of exploration algorithms, the exploration strategies based on continuous connectivity are prevalent in applications where real-time image streaming are needed (e.g., search and rescue), or simply when human operators at the base station need to enforce timely information updates, or even when a high level of coordination is needed (i.e., when globally shared knowledge between robots is assumed). Additionally, robustness is also highly appreciated in hostile or inaccessible scenarios. In these missions, fault-tolerance is typically achieved adding redundancy (e.g., systems that guarantee k-connected time-varying network topologies) and employing distributed systems.

Nevertheless, when these strong requirements do not condition the mission, the event-based connectivity—that is less restrictive than the former concerning the fleet mobility—seems to be more appropriate.

Now, moving up from essential aspects as communication to the top of the software architecture stack. There exists a large set of distributed reactive and behaviour-based proposals. Compared to the centralised approaches, distributed approaches have the advantage of not presenting the single-point-of-failure weakness. However, in many cases, it suffers from deadlocks at the individual or collective level.

Market-based coordination methods represent another popular option. There exists a wide variety of implementations that mainly differ from each other in the way the bids are computed by the robots (e.g., single-item or multiple-item auctions). These difference are not insignificant and typically trade simplicity and computational efficiency off for proper coordination and local optima avoidance. Besides, since each auction involves a period of synchronicity between robots, fully asynchronous market-based systems have no place. Nevertheless, asynchronous systems may be advantageous over those that periodically ask the robots to wait for others before making a decision.

Finally, in communication-restricted environments, there seems to be a general agreement on the benefits of spreading out the fleet as long as the robots can regain connectivity in disconnection case. From this, and trying to balance these potentially opposed goals, some multi-objective utility-based approaches have been proposed. Also, defining multiple roles (including communication relays) has demonstrated to be a worthy strategy to address the multi-robot exploration problem when communication restrictions are present.

In conclusion, the survey suggests that in the context of decentralised systems there is room to try new ideas related to connectivity-regaining policies and rendezvous places. On the one hand, the event-based connectivity framework imposes the execution of connectivity-regaining actions in the presence of some events. On the other hand, rendezvous-based approaches imply the definition of particular meeting points where robots have to meet in order to regain connectivity. Leaving apart the fact that the selection of these places could be a hard issue itself, once the connectivity-regaining action is triggered and the meeting place is known by robots, they should interrupt its exploration plans deviating from its current trajectories in order to accomplish the new goal. This action probably leads to global time performance degradation and individual energy consumption increasing. However, what would happen if robots are only influenced to keep or recover connectivity at all times instead of being demanded to regain connectivity? Furthermore, what would happen if they are free to meet by chance, having been motivated to stay close but without having to meet at specific places?

1.4. Contributions

This work tries to answer these research questions from the development of a novel multi-objective approach where the robots, when selecting their targets, are always considering travelling costs and the opportunity cost of keeping connected or regaining connectivity. A simple yet useful model for the signal strength and attenuation effects provide the robots with connectivity awareness. Thereupon, connectivity level measurements and path costs are considered together into a task utility function for finding solutions with a right balance between the benefit of visiting the closer targets and the usefulness of keeping the team connectivity level as high as possible.

For the sake of robustness, a decentralised approach is followed. Robots make decisions asynchronously addressing coordination implicitly through localisation and mapping data exchanging. The human operator is asked to use his application-field expertise to play a part in the task assessment process.

The main contributions of this proposal can be summarised as follows.

Ease to Deploy and Flexibility

In order to establish the task selection criterion, the human operator only needs to choose the extra distance he is willing to ask the robots to travel in order to keep or enlarge the connectivity level of the fleet. From this criterion and through formal analysis, the weights of these potentially conflicting objectives are derived. This way the robots can deal with communication constraints auto-adjusting the weights of each objective in a more intuitive manner. Furthermore, by eliminating the need for training stages the system is more adaptable to different environments.

Good Performance

Asynchronism is taken as a natural way of avoiding waiting times to make a decision as well as decreasing the number of robots that are simultaneously making a decision. Since the task allocation computation strongly depends on the number of robots under consideration, asynchronism also makes optimal decisions can be linearly computable most of the time. As a consequence, robots can compute optimal tasks-to-robots distributions in a short time, achieving high levels of dispersion efficiently. Besides, regarding reconnections, the proposal consists of a rendezvous policy where the locations of the selected tasks become the meeting points themselves, avoiding deviations from the planned paths. Compared with others, the proposed approaches are capable of decreasing the last of disconnection periods without noticeable degradation of the completion exploration time.

1.5. Outline

The present document is organised as follows. Section 2 provides the exploration problem formalisation including models and goals. Next, an Auto-Adaptive Multi-Objective (AAMO) task selection approach, as well as the task allocation algorithm and the decentralised coordination mechanism, are thoroughly described in Section 3, Section 4 and Section 5, respectively. Experimental results related to a baseline and to the AAMO approach itself are discussed in Section 6. Finally, the document is concluded highlighting some future research directions in Section 7.

2. Problem Formulation

This section defines the instance of the multi-robot exploration problem, which constitutes the basis for the proposal formulated in this work. All particular assumptions are mentioned throughout the following sections. Firstly, the environment, robot and communication models are defined. Namely, some real communication constraints are taken into account and formalised into the model. A task definition is given as well as the task identification method. Finally, the global exploration objectives are stated.

2.1. Environment Model

The environment E is defined as a bounded planar workspace

E \subseteq R^{2}

previously unknown. Besides, E is represented by an occupancy grid structure [4] where each cell c can belong to three different probabilistic states

S = {f, o, u}

, standing for free, occupied and unknown, respectively. Typically,

P (s t a t e (c) = f) = 1 - P (s t a t e (c) = o)

is assumed. When

| P (s t a t e (c) = f) - 0.5 | < ϵ

the cell c is labelled as unknown; otherwise it is labelled as free or occupied, accordingly. These states represent all possible theoretical situations in which a point of the environment can be classified over time. The mapping algorithm frequently updates the probability value of each cell on each robot. Despite this, only the current classification of each cell at a given decision time step is considered. Consequently, the representation of E belongs to the domain of matrices

S^{m \times n}

. Furthermore, the region already explored

E_{k n o w n}

and the remaining that is yet unexplored

E_{u n k n o w n}

at time t may be defined from this representation as follows:

E_{u n k n o w n} (t) = {c \in E ∣ | P (s t a t e (c, t) = f) - 0.5 | < ϵ}

and

E_{k n o w n} (t) = {c \in E \ E_{u n k n o w n} (t)}

.

2.2. Robot Model

Given a robot team

R = \{R_{1}, R_{2}, \dots, R_{M}\}

consisting of M homogeneous circular rigid mobile robots with wireless communication capabilities, a traditional representation defines each robot:

R_{i} = (x_{i}, y_{i}, θ_{i}, r_{i}, s_{i}, c_{i})

, where

i \in [1 . . M]

and

X_{i} (t) = {x_{i} (t), y_{i} (t), θ_{i} (t)}

represents the configuration vector of the robot i at time t (position of its centre and heading with respect to the inertial frame),

r_{i}

represents the radius of the robot body, and

s_{i}, c_{i}

represent the sensory capabilities as maximum radius of sensing and maximum range of communication, respectively.

2.3. Communication Model

This model aims to support the connectivity awareness ability of robots needed to deal with disconnection situations during the exploration. Given the position of their teammates and obstacles, robots can estimate the connectivity degree of a specific location considering some of the communication constraints that are widely present in real scenarios, mainly indoor (e.g., office-like and buildings).

The signal strength function (

Γ_{i}

represents a slight adaptation of the signal strength function presented in [38])

Γ_{i} : N \times S^{m \times n} \times R \to R

is defined as follows:

\begin{matrix} Γ_{i} (j, E_{k n o w n} (t), t) & = Γ_{i}^{0} - d_{A t t} (i, j, t) - w_{A t t} (i, j, E_{k n o w n} (t), t) \\ Γ_{i}^{0} & = 10 \cdot D_{a f} \cdot l o g_{10} (c_{i} / r_{i}) \\ d_{A t t} (i, j, t) & = 10 \cdot D_{a f} \cdot l o g_{10} (d_{i} (j, t) / r_{i}) \\ d_{i} (j, t) & = {∥X_{i} (t), X_{j} (t)∥}_{2} \\ w_{A t t} (i, j, E_{k n o w n} (t), t) & = \{\begin{matrix} w_{i} (j, E_{k n o w n} (t), t) \cdot W_{a f} & if w_{i} (j, E_{k n o w n} (t), t) < C \\ C \cdot W_{a f} & otherwise \end{matrix} \end{matrix}

(1)

where,

d_{A t t}

and

w_{A t t}

stand for distance attenuation and wall attenuation terms, respectively. In addition,

d_{i} (j, t)

represents the Euclidean distance between two robot locations at time t: typically the transmitter (

X_{i} (t)

) and receiver (

X_{j} (t)

),

w_{i} (j, E_{k n o w n} (t), t)

represents the number of walls (robots cannot distinguish between different kind of rigid obstacles, but the term wall is used for simplicity and in order to be consistent with the underlying proposal) present in the known region between transmitter and receiver locations at time t,

D_{a f}

represents a distance attenuation factor, and

W_{a f}

represents a wall attenuation factor. Finally, C represents the maximum number of walls up to which the

W_{a f}

factor causes a significant effect in function

Γ_{i}

. When

w_{i} (j, E_{k n o w n} (t), t) \geq C

, the distance attenuation effect dominates. Finally, note that in [38] the independent term

Γ_{i}^{0}

is suggested to be either derived empirically or obtained directly from the wireless network device specification. Nevertheless, in this work the model is adapted in order to become independent from specific deployments (communication devices), deriving the

Γ_{i}^{0}

value so that the signal strength

Γ_{i} (j, E_{k n o w n} (t), t) = 0

when

d_{i} (j, t) = c_{i}

.

In Figure 2 the shape of the function

Γ_{i}

, as well as the attenuation effects caused by both distances and walls, are plotted.

Unfortunately, due to uncertain and incomplete knowledge, the

Γ_{i}

function only can either confirm the absence of connectivity or deliver an optimistic estimation of connectivity level instead. Although this model represents a valuable improvement in relation to others (e.g., the classic disk or line of sight models [20]), for the sake of simplicity other impairments also common in communication (e.g., bandwidth, information losses, fading, and multi-path propagation phenomenon [39,40]) are not considered in this work.

2.4. Task Identification Method

The task identification problem is addressed following a frontier point approach [5] where the free cells (cf. Section 2.1) that belong to a frontier are over labelled as frontier points (FP). Besides, the resulting set of FP cells is clustered (using procedures such as K-Means [41] or Affinity Propagation [42]) in order to identify the cells that better represent each frontier, defining a set of tasks (in the remainder of the document, the terms task and target are used indistinctly)

T = {T_{1}, T_{2}, \dots, T_{N}} ∣ T_{j} \in R^{2}, \forall j \in {1 \dots N}

. Thus, T represents, at each moment, the smallest set of promising locations that the robots could be interested in visiting to explore all frontiers. In Figure 3 these task cells are coloured in yellow.

2.5. Multi-Robot Task Allocation Problem—MRTA

Following the classification proposed in [10], the MRTA problem to be tackled is described as a single-task robots (ST), single-robot tasks (SR), and instantaneous assignment (IA) problem. ST means that each robot is able to visit at most one task at a time. SR means that each task requires only one robot to be explored. IA means that the available information about the robots, the tasks, and the environment permits only an instantaneous allocation of tasks to robots, preventing the possibility to plan future allocations. Additionally, an ST-SR-IA can be formulated as an instance of the well known Optimal Assignment Problem (OAP) as follows. Given M robots, N tasks, and utility estimates U for each

M N

possible robot-task pair, the goal is to assign tasks to robots so as to maximise overall expected utility. Finally, from an Integer Linear Programming perspective, the problem can be formalised as: Find the

M N

non-negative integers

α_{i j}

that maximise (2).

\sum_{i = 1}^{M} \sum_{j = 1}^{N} α_{i j} U_{i j}

(2)

\begin{matrix} s . t . \\ \sum_{i = 1}^{M} α_{i j} = 1, 1 \leq j \leq N \\ \sum_{j = 1}^{N} α_{i j} = 1, 1 \leq i \leq M \end{matrix}

2.6. Global Objectives

The exploration aims for the full coverage of a bounded indoor environment, a priori totally unknown, with a team of terrestrial robots, in minimal time and avoiding isolation situations as much as possible. In this context, isolation refers to the fact of being unconnected from any other fleet member. In this work, the multi-robot system is designed to address these objectives from the following definitions.

2.6.1. Full Coverage

Given the

E_{k n o w n}

and

E_{u n k n o w n}

previously defined in Section 2.1, it is possible to claim that the completion condition is reached when

E = E_{k n o w n}

or equivalently

E_{u n k n o w n} = \emptyset

. Although this condition is straightforward, it is useless in practice. Alternatively, the completion condition is conceived considering the sensing activity of the robots over time. Let

s e n_{i} (t) = s e n (X_{i} (t))

be the information gathered by the robot i at time t in the configuration

X_{i} (t)

. From this,

E_{k n o w n}

at completion time

T

is defined as follows:

E_{k n o w n} = ⋃_{i = 1}^{M} ⋃_{t = 0}^{T} s e n_{i} (t)

(3)

Finally, the completion condition may be written as in (4) implying that there are no reachable configurations where any robot can gather new information.

\begin{matrix} ∄ X_{i} ∣ s e n_{i} ⋂ E_{u n k n o w n} \neq \emptyset \end{matrix}

(4)

2.6.2. Completion Time Optimisation

Additionally to full coverage, the multi-robot system is asked to perform the exploration in minimal time. Therefore, from (4), the minimal completion time condition can be expressed as:

min T ∣ ∄ X_{i} (T), s e n_{i} (T) ⋂ E_{u n k n o w n} \neq \emptyset

(5)

2.6.3. Isolation Avoidance

In multi-robot exploration missions, the individual isolation situations (when a robot becomes unconnected from any other) are non-desirable. The key motivations to avoid them are (i) When robots are unconnected they have no possibilities to coordinate their actions, hence they could visit the same regions. Therefore, keeping the fleet connected is a way to decrease inefficiency; (ii) Damages or inner failures during isolation periods can lead to information losses. Therefore, keeping the fleet connected is also a way to decrease the risk of re-work and to prevent time performance degradation, consequently.

Thus, in addition to (5), the last of possible individual disconnections should be minimised. To this end, concepts of graph theory are borrowed in order to model a time-varying network topology of mobile robots. Such network is represented employing an undirected graph defined as

G (t) = (V, E (t))

where the nodes

V = {1 \dots M}

represent the robots

R_{i} ∣ i \in [1 . . M]

and the edges

E (t) = {i, j ∣ i, j \in V, j \in N_{i} (E_{k n o w n} (t), t)}

represent the operative communication links between any pair of robots

(R_{i}, R_{j})

.

The function

N_{i} (E_{k n o w n} (t), t) = {j ∣ Γ_{i} (j, E_{k n o w n} (t), t) > 0}

computes the neighbours of a robot i at time t. From this it is possible to define the isolation situations of any robot i like the periods when the corresponding node i has no incident edges (

d e g r e e (i) = 0

). Furthermore, isolation situations may repeat several times along the exploration.

In Figure 4 an example of an exploration timeline concerning disconnections is depicted.

From this model, the expression for the disconnection last optimisation may be obtained as follows:

min \sum_{i \in V} \sum_{k} Δ d E_{k}

(6)

\begin{matrix} where : & k indexes the disconnection events d E \\ Δ d E_{k} = t_{e k}^{i} - t_{s k}^{i}, represents the last of the disconnection event d E_{k} \\ t_{s k}^{i} = min t_{k} ∣ N_{i} (E_{k n o w n} (t), t_{k}) = \emptyset, represents the starting time of the disconnection d E_{k} \\ t_{e k}^{i} = max t_{k} ∣ N_{i} (E_{k n o w n} (t), t_{k}) = \emptyset, represents the ending time of the disconnection d E_{k} \end{matrix}

3. Auto-Adaptive Multi-Objective Task Selection Approach

In this section, a novel multi-objective based approach for multi-robot exploration missions is introduced. As was mentioned above, in exploration missions the best choice for the robots is to visit the places where the gain of information can be potentially higher. Gaining information is, actually, the only way to conclude the exploration task. Therefore, the connection between path-cost-based target selection strategies and the completion time performance obtained resides in the fact that this way the fleet expand its territorial knowledge potentially faster. Besides, when the environment presents communication restrictions, individual failures or incoordinations can lead to inefficiency more likely.

In order to make the system robust and efficient, a decentralised and asynchronous coordination mechanism is defined. An auto-adaptive multi-objective task utility function is defined in accordance with both the task identification method presented in Section 2.4 and the objectives of the exploration problem defined in Section 2.6. Its primary purpose is to integrate travelling costs and connectivity levels finding solutions with a right balance between the benefit of visiting the closer targets and the usefulness of keeping the team connectivity level as high as possible.

Furthermore, to make the system more flexible, an analytic approach through which the relative importance of each goal is set independently of the scenarios, is followed. As a result, an auto-adaptive procedure—where the human operator is asked to use his application field expertise in order to influence the robot decisions defining a criterion to balance the importance of both objectives—is developed. Several proofs of correctness on such a procedure are conducted demonstrating that the robots are always capable of auto-adapt the objectives weights to select the tasks accordingly with the human-operator criterion.

3.1. Task Utility Function

This function will guide the optimal task distribution search regarding well-balanced solutions where both the travelling cost and the team connectivity level are considered to evaluate the current targets. The objectives are implemented using utility functions such as (i) path utility function takes the travelling costs to deliver a notion of how beneficial—concerning distance—the tasks under consideration are; (ii) connectivity utility function gives the robots a connectivity awareness ability.

The task utility function

Φ_{i} : [0, 1] \times T \times R^{M} \times S^{m \times n} \to [0, 1]

, is defined as follows:

Φ_{i} (α, T_{j}, R, E_{k n o w n}) = α \cdot Ψ_{i} (T_{j}, E_{k n o w n}) + β \cdot Ω_{i} (T_{j}, R)

(7)

\begin{matrix} s . t . \\ 1 \leq i \leq M = | R |, 1 \leq j \leq N = | T | \\ α + β = 1 ∣ α, β \in [0, 1] \end{matrix}

Given the current state of the fleet R and the current environment knowledge

E_{k n o w n}

, the function

Φ_{i}

estimates the utility obtained by a robot

R_{i}

in case of selecting the task

T_{j}

. The current fleet state refers to both the location of the assigned tasks in case of assigned robots and the robot positions otherwise. The terms

Ψ

and

Ω

represent path utility and connectivity utility functions, respectively. The weights

α

and

β

work as tuning parameters that permit to adjust the kind of solutions the system will search for. If

α = 1

during the whole exploration, then the system would only intent to spread out the fleet. On the contrary, if

α = 0

then the system would always search for potentially fully connected solutions. Otherwise, when (

0 < α < 1

) the system will balance both path utility and connectivity utility. As a result, sometimes the robots could choose other tasks than the closest to favour the team connectivity level. This possibility is deeply analysed further below in Section 4.

Although in this double-objective function the symbol

β

could be substituted by

1 - α

, it is preserved for the sake of generality: if the weighted sum had more than two terms, it would not be possible to express all weights as

α

functions.

3.2. Path Utility

Path utility measures the relative effort needed for a robot to reach a task from its current location. The path utility function

Ψ_{i} : T \times S^{m \times n} \to [0, 1]

is defined as follows:

Ψ_{i} (T_{j}, E_{k n o w n}) = 2^{{(\frac{\bar{Δ} - Δ_{i} (T_{j})}{\bar{Δ}})}^{γ}} - 1

(8)

\begin{matrix} s . t . \\ 1 \leq i \leq M = | R |, 1 \leq j \leq N = | T | \\ where : \\ \bar{Δ} = \bar{d} - \underset{̲}{d} \\ \bar{d} = max {∥X_{i}, T_{j}∥}_{s p}, \forall j \\ \underset{̲}{d} = min {∥X_{i}, T_{j}∥}_{s p}, \forall j \\ Δ_{i} (T_{j}) = {∥X_{i}, T_{j}∥}_{s p} - \underset{̲}{d} \\ {∥X_{i}, T_{j}∥}_{s p} = min_{\begin{matrix} w p_{k} \in E_{k n o w n} \forall k \in {1 \dots e} \\ w p_{1} = X_{i}, w p_{e} = T_{j} \end{matrix}} \sum_{k = 1}^{e - 1} {∥w p_{k} - w p_{k + 1}∥}_{2} \end{matrix}

Given the current environment knowledge

E_{k n o w n}

, the function

Ψ_{i}

estimates the path utility obtained by a robot

R_{i}

in case of selecting the task

T_{j}

. The parameter

γ

works as a shaping factor that could be used to tune the relation between distance and utility. The ordered sequence of waypoints

w p_{k}

represents the shortest path between the robot configuration

X_{i}

and the target

T_{j}

. All segments

(w p_{k}, w p_{k + 1})

are safe given that they are always built regarding only the collision-free pathways present in the known region

E_{k n o w n}

. The wavefront propagation method proposed by [43] is employed to determine the waypoint sequence. The shape and behaviour of the

Ψ

function are depicted in Figure 5.

3.3. Connectivity Utility

Connectivity utility computes, optimistically, the connectivity level present in a location at a certain moment. The connectivity utility function

Ω_{i} : T \times R^{M} \times S^{m \times n} \to [0, 1]

is defined as follows:

Ω_{i} (T_{j}, R, E_{k n o w n}) = \frac{l o g_{2} ((2^{ρ} - 1) \cdot \frac{| N_{i} (E_{k n o w n} (t), t) |}{M - 1} + 1)}{ρ}

(9)

\begin{matrix} s . t . \\ 1 \leq i \leq M = | R |, 1 \leq j \leq N = | T | \end{matrix}

Given the current state of the fleet R and the current environment knowledge

E_{k n o w n}

, the function

Ω_{i}

estimates the connectivity utility obtained by a robot

R_{i}

in case of selecting the task

T_{j}

. Particularly, it is interesting to do so concerning the arrival time to

T_{j}

. The current fleet state refers to both the location of the assigned tasks in case of assigned robots and the robot positions otherwise. The parameter

ρ

works as a shaping factor that could be used to tune the relation between connectivity level and utility. Note that the utility is decreasing in the number of robots, and may favour the adoption of MANET compliant connectivity techniques. In such networks, messages travel from source to destination members in more than one hop, where intermediate nodes forward messages until the destination is reached. The shape and usefulness of the

Ω

function may be appreciated in Figure 6.

3.4. General Considerations

The definition of multi-objective weights is usually accomplished as an empirical matter. Typically, a search process is run in order to find—after a lot of trials—values that fit some optimal criteria. This kind of methods is typically used when the parametric function is planned to be used many times. However, in the exploration context, this assumption or even the possibility of running trials are frequently out of the question. It is not possible to assume that all scenarios where the exploration will be conducted will be similar between each other and, for this reason, is neither possible to assume that the best

α

and

β

values can remain unchangeable.

Furthermore, when these procedures are followed, at the end of the training stage it is often tough to associate the resultant parameter values with real aspects of the problem (e.g., performance metrics like time, distances, energy, or even connectivity levels). This lack of understanding may, in turn, wrongly influence the fine tuning of such parameters without rerunning a portion of trials. Taking those shortcomings into account, an analytic approach—through which the

α

and

β

values might be set independently of the scenarios—is explored.

4. Adaptive $α$ -Value Computation

When a multi-robot exploration process is going to run under communication constrained conditions, choosing between only exploring or exploring preserving connectivity level is a crucial decision. The first choice would be suitable when connectivity is out of the question, or it is impossible for a robot to keep connected and explore at once. In such a case, connectivity does not play any role in the decision-making process. On the contrary, the second choice is suitable when it is necessary to interleave high-performance exploration (minimising the total exploration time) and acceptable connectivity level (avoiding robot isolation as much as possible).

To this end, the human operator is let to use his application field expertise in order to influence the robot decision—defining a criterion to balance the importance of both objectives—by merely setting a parameter before the exploration starts.

Therefore, since

α

and

β

parameters determine the behaviour of the robots concerning target selection, two questions come up: (i) How can the value of those parameters be defined in order to ensure the applicability of the human-operator criterion along the exploration process? (ii) Should these values be adapted during the exploration process?

Henceforth, the task selection framework and the human-operator criterion are formalised. Besides, several proofs to demonstrate the existence and correctness of an adaptive

α

-value that makes the robots behave following the criterion mentioned above are conducted.

4.1. Task Selection Framework

This process is always made iteratively from a list, comparing the currently best task against the rest, one by one. Therefore, without loss of generality, the most relevant aspects can be studied just analysing all the possible relations between an arbitrary pair of tasks. Regarding the distance to a specific robot location and the connectivity level (number of connections with the rest of the fleet), any task can be classified according to Table 1.

Therefore, the meaning of these categories is straightforward: regarding the assignment of the fleet, Co means that the task location would offer to the robot at least the minimum level of connectivity (i.e., one connection to another fleet member); NC means the opposite; regarding the spatial distribution of tasks, Cl means that the task under consideration is closest to the robot than any other; F means that the task is furthest to the robot than any other task.

Moreover, let

R_{i}

a robot and

T_{j}

and

T_{k}

two tasks such that

c l a s s (T)

can belong to any class defined in Table 1. In any scenario, these tasks can be related to each other according to Table 2. Given that

T_{j}

and

T_{k}

are arbitrary tasks, the matrix can be considered symmetric. Thus, taking one of the triangular matrices is enough to study all possible cases.

From the lower triangular, it is possible to identify some cases where one task is better (regarding both path utility and connectivity utility) than the other. Such an example is the [Cl/Co;F/NC] where

T_{j}

is closer to the robot than

T_{k}

, and it is the only one that keeps the robot connected as well. Similarly, in the [Cl/NC;F/NC] case neither task can keep the robot connected, and in consequence, the closest task

T_{j}

results more convenient than

T_{k}

. Thus, in both previous cases, the criterion to choose a task is clear: the closest task should be selected. However, in the other cases, it is not clear at all which task should be selected. In one case, [Cl/NC;F/Co], whichever selection implies either traversing longer distances or losing connectivity. In the other case, [Cl/Co;F/Co], selecting the closest task

T_{j}

ensures traversing the shortest path but could imply losing connectivity. By contrast, selecting the furthest task

T_{k}

would be acceptable only when the gain in connectivity oppose a more significant travelling effort.

Definition 1.

The human operator threshold HO-Threshold expresses the human operator criterion through a distance that represents the extra effort made by robots that the human operator is willing to accept in order to maintain or enlarge the size of the robot communication network.

In other words, the human operator criterion is determined by setting the distance threshold until which the targets that preserve or enlarge connectivity are preferred over the rest, for all robots.

For instance, in the [Cl/NC;F/Co] case the selection will be conditioned as follows:

T_{k}

will be selected if and only if the length of the shortest path between

T_{k}

and the robot location is less than or equal to HO-Threshold.

T_{j}

will be selected otherwise.

In order to make the influence of HO-Threshold clearer, an example scene is depicted in Figure 7. Note that all tasks are within the HO-Threshold, but only

T_{3}

can enlarge the connectivity level of the robot

R_{1}

. Thus, applying Definition 1 leads to the selection of task

T_{3}

because it enables the robot

R_{1}

to travel more distance to gain connectivity. On the contrary, whether the HO-Threshold

\leq 3

,

T_{3}

would be no longer preferred over the rest, and consequently the closest task

T_{2}

would be selected instead.

Hence, in the presence of some specific conditions, it is expected that the application of the HO criterion can make the fleet more cohesive than following approaches that do not take communication constraints into account and less restrictive than the ones that do not permit disconnections or force re-connections as well.

Next, the proofs of correctness and existence of

α

(and

β

) values that implement the HO criterion are conducted regarding the cases present in the lower triangular of Table 2. The cases {[Cl/Co,F/NC];[Cl/NC,F/NC]} are considered first, while the remaining {[Cl/Co,F/Co];[F/Co,Cl/NC]} are considered afterwards.

4.2. [Cl/Co,F/NC] and [Cl/NC,F/NC] Cases

In Figure 8a,b, two instances of these cases are depicted, respectively.

Proposition 1.

When

T_{j}

and

T_{k}

belong to

[C l / C o, F / N C]

or

[C l / N C, F / N C]

, the values of α and β do not make any difference in the selection process.

Proof.

This claim can be derived directly from the following facts:

in the [Cl/Co,F/NC] case the furthest task $T_{k}$ makes the robot disconnected, and then applying (7) to $T_{j}$ and $T_{k}$ leads to:

$\begin{matrix} Φ_{i} (α, T_{k}, R) = α \cdot Ψ_{i} (T_{k}) \leq α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{i} (T_{j}, R) = Φ_{i} (α, T_{j}, R), \forall α \end{matrix}$

(10)

$\begin{matrix} s . t . & Ω_{i} (T_{j}, R) > 0, Ω_{i} (T_{k}, R) = 0 \\ Ψ_{i} (T_{k}) \leq Ψ_{i} (T_{j}) \end{matrix}$
in the [Cl/NC,F/NC] case both tasks make the robot to be disconnected, and thus the $Φ$ function value will depend only on the $Ψ$ term:

$\begin{matrix} Φ_{i} (α, T_{k}, R) = α \cdot Ψ_{i} (T_{k}) \leq α \cdot Ψ_{i} (T_{j}) = Φ_{i} (α, T_{j}, R), \forall α \end{matrix}$

(11)

$\begin{matrix} s . t . & Ω_{i} (T_{j}, R) = 0, Ω_{i} (T_{k}, R) = 0 \\ Ψ_{i} (T_{k}) \leq Ψ_{i} (T_{j}) \end{matrix}$

In conclusion, in any of these cases, the task selection is not affected by

α

. ☐

4.3. [Cl/Co,F/Co] and [F/Co,Cl/NC] Cases

In the [Cl/Co,F/Co] case both tasks offer the possibility to be connected. On the contrary, in the [F/Co,Cl/NC] case opposite objectives are present: one task is closer but disconnected while the other is connected but further. Thus, the latter case is taken to prove the existence of an

α

, that can respect any given HO criterion. The former case is finally used to corroborate the non-existence of any possible unwanted side effect caused by the achieved

α

expression.

4.3.1. [F/Co,Cl/NC] case.

Based on the human-operator criterion (set by a threshold value) we want an

α

-value that makes, following the scenario depicted in Figure 9,

T_{3}

preferred over

T_{2}

if and only if

T_{3}

belongs to the circular area defined by the HO-Threshold.

Next, the existence of such an

α

parameter will be demonstrated, and its value will be derived as well.

Proposition 2.

When

T_{j}

and

T_{k}

belong to [F/Co,Cl/NC] is always possible to find an α-value that satisfies the following inequality:

\begin{matrix} Φ_{i} (α, T_{j}, R) = α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{i} (T_{j}, R) \geq α \cdot Ψ_{i} (T_{k}) = Φ_{i} (α, T_{k}, R) \end{matrix}

(12)

\begin{matrix} s . t . & Ω_{i} (T_{j}, R) > 0, Ω_{i} (T_{k}, R) = 0 \\ Ψ_{i} (T_{j}) \leq Ψ_{i} (T_{k}) \end{matrix}

Proof.

Let

Ω_{1}

the utility assigned to the fact of being connected with only one teammate. Then, from (9), it is possible to state that: if

T_{j}

belongs to any

[* / C o]

class,

Ω_{1} \leq Ω_{i} (T_{j}, R), \forall (i, j)

over time. Moreover, if the number of robots does not change, it is also possible to state that

Ω_{1}

remains invariant over time. Applying this result into (12) leads to the inequality presented next in (13):

\begin{matrix} Φ_{i} (α, T_{j}, R) \geq α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{1} \geq & α \cdot Ψ_{i} (T_{k}) = Φ_{i} (α, T_{k}, R) \\ α \cdot Ψ_{i} (T_{j}) + (1 - α) \cdot Ω_{1} \geq & α \cdot Ψ_{i} (T_{k}) \\ α \cdot (Ψ_{i} (T_{j}) - Ω_{1}) + Ω_{1} \geq & α \cdot Ψ_{i} (T_{k}) \\ \frac{Ω_{1}}{Ψ_{i} (T_{k}) - Ψ_{i} (T_{j}) + Ω_{1}} \geq & α \end{matrix}

(13)

Besides, substituting

Ω_{1} = x

and

Ψ_{i} (T_{k}) - Ψ_{i} (T_{j}) = u

, equation (13) may be rewritten as follows:

\begin{matrix} α \leq \frac{x}{u + x} \Rightarrow & α \leq i n f (\frac{x}{u + x}) \end{matrix}

(14)

\begin{matrix} s . t . & 0 < c \leq x \leq 1 \\ 0 \leq u \leq 1 \\ given that : & Ω_{1} = c \\ Ψ_{i} (T_{k}) \geq Ψ_{i} (T_{j}) \end{matrix}

From (14) is possible to claim the existence of an

α

-value that obey any HO-Threshold if and only if the function

\frac{x}{u + x}

presents an absolute minimum on the domain:

D = {(x, u) ∣ 0 < c \leq x \leq 1, 0 \leq u \leq 1} .

This fact can be stated employing Weierstrass theorem (a function f has an absolute extreme if it is continuous and its domain is compact). Besides, the minimum point might be calculated analysing both: (i) the relative extrema and (ii) the points lying on the border of D. Following this procedure, it is possible to find the absolute extreme of the function

\frac{x}{u + x}

in

(x, u) = (c, 1)

.

Moreover, it is remarkable that this extreme represents a place where the most demanding conditions are reached: task

T_{j}

presents the lowest positive connectivity utility, and the distance between both tasks is the largest. Hence, the existence of a positive value

α \leq \frac{Ω_{1}}{1 + Ω_{1}}

(regardless of how demanding can be the distance relation between tasks) that might alter the task selection in favour of connectivity has been demonstrated. ☐

Nevertheless, in (13)

α

is independent of the HO-Threshold. Consequently, its direct application would result in a strictly connectivity-guided exploration, where tasks that offer connectivity are always preferred over the rest no matter how far they are. Therefore, to relate it with an HO-Threshold the value of the term

Ψ_{i} (T_{j})

in (13) must be substituted by the utility of being HO-Threshold far from the robot, say

Ψ_{i} (T_{HO})

. Next, the value of the term

Ψ_{i} (T_{k})

is substituted by 1 since

Ψ_{i} (T_{k}) = 1

represents the necessary condition to reach the extreme coordinate

u = 1

that arose from (14). Finally, the expression for an HO-Threshold dependent

α

, say

α_{HO}

, is expressed in (15) as follows:

\begin{matrix} α_{HO} = \frac{Ω_{1}}{1 - Ψ_{i} (T_{HO}) + Ω_{1}} \end{matrix}

(15)

Proposition 3.

The applicability of the

α_{HO}

referred to in (15) causes any task within the threshold scope that also offers any positive connectivity level to be favoured over the rest of the tasks that do not offer any connectivity level, regardless of how close to the robot they are.

Proof.

Φ_{i} (α, T_{j}, R) \geq Φ_{i} (α, T_{k}, R)

is imposed to any tasks

(T_{j}, T_{k})

that respect the [F/Co,Cl/NC] conditions:

\begin{matrix} Φ_{i} (α, T_{j}, R) = α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{i} (T_{j}, R) & \geq α \cdot Ψ_{i} (T_{k}) = Φ_{i} (α, T_{k}, R) \\ α \cdot Ψ_{i} (T_{j}) + (1 - α) \cdot Ω_{i} (T_{j}, R) & \geq α \cdot Ψ_{i} (T_{k}) \\ α \cdot (Ψ_{i} (T_{j}) - Ω_{i} (T_{j}, R)) + Ω_{i} (T_{j}, R) & \geq α \cdot Ψ_{i} (T_{k}) \\ \frac{Ω_{i} (T_{j}, R)}{Ψ_{i} (T_{k}) - Ψ_{i} (T_{j}) + Ω_{i} (T_{j}, R)} & \geq α \end{matrix}

Then, applying (15) leads to (16):

\begin{matrix} \frac{Ω_{i} (T_{j}, R)}{Ψ_{i} (T_{k}) - Ψ_{i} (T_{j}) + Ω_{i} (T_{j}, R)} \geq \frac{Ω_{1}}{1 + Ω_{1} - Ψ_{i} (T_{HO})} \\ Ψ_{i} (T_{j}) \geq \frac{Ω_{i} (T_{j}, R)}{Ω_{1}} \cdot (Ψ_{i} (T_{HO}) - 1) + Ψ_{i} (T_{k}) \\ Ψ_{i} (T_{j}) = s u p (\frac{Ω_{i} (T_{j}, R)}{Ω_{1}} \cdot (Ψ_{i} (T_{HO}) - 1) + Ψ_{i} (T_{k})) \end{matrix}

(16)

Since i)

Ω_{1}

is constant, ii)

\frac{Ω_{i} (T_{j}, R)}{Ω_{1}} \geq 1

, and iii)

(Ψ_{i} (T_{HO}) - 1) \leq 0

, it is possible to conclude that:

$Ψ_{i} (T_{j})$ is monotonically decreasing concerning $Ω_{i} (T_{j}, R)$ .
the upper bound is reached when:
(a)
$Ψ_{i} (T_{HO}) = 1$
(b)
$0 \leq Ψ_{i} (T_{HO}) < 1$ , $Ω_{i} (T_{j}, R) = Ω_{1}$ and $Ψ_{i} (T_{k}) = 1$ .

Please note that (a) is out of the proposition conditions. Instead, (16) can be rewritten imposing (b), leading to:

\begin{matrix} Ψ_{i} (T_{j}) \geq \frac{Ω_{1}}{Ω_{1}} \cdot (Ψ_{i} (T_{HO}) - 1) + 1) \\ Ψ_{i} (T_{j}) \geq Ψ_{i} (T_{HO}) \end{matrix}

which is true if, and only if,

Δ_{i} (T_{j}) \leq

HO-Threshold, which is indeed what the human operator would like to get from his criterion application to tasks within the HO-Threshold. Hence, following (15) under the [F/Co,Cl/NC] conditions it is always possible to compute an

α_{HO}

-value that makes the robots behave following the human-operator criterion. ☐

Likewise, it is important to highlight that the

α_{HO}

-value needs to be calculated every time a robot is ready to make a decision. This need for adaptation arises from

Ψ_{i} (T_{HO})

, which is not constant. Its value depends on the relation between the HO-Threshold and the relative distance to the current furthest task. That way, the robots can autonomously adapt the weights of the task utility function according to the changing conditions of the environment in order to be always consistent with the human-operator criterion.

4.3.2. [Cl/Co,F/Co] Case

This analysis is devoted to checking the applicability of the

α_{HO}

when the conditions to achieve a good trade-off between path cost and connectivity level are less demanding than in the [F/Co,Cl/NC] case. In the [Cl/Co,F/Co] case, although one task is closer than the other, the differences in the positive connectivity level offered by them could make the furthest task more attractive than the closest. From that, considering the connectivity level offered by the closest, two cases may be identified: (i) When

T_{j}

offers a higher level of connectivity than

T_{k}

. In such a case, there is no doubt that independently of the

α_{HO}

value, the selection would always favour the task

T_{j}

because it is the closest as well; (ii) On the contrary, when

T_{k}

offers a higher level of connectivity than

T_{j}

, the selection of

T_{k}

will depend on both how distant from robot it is and how much more connected would be the robot on

T_{k}

respect to

T_{j}

.

Finally, to show that the

α_{HO}

value does not introduce any unwanted side effect on the task selection process when tasks belong to the [Cl/Co,F/Co] case, it is needed to prove that it neither contradicts the first case nor restricts the occurrence of the second case.

Proposition 4.

In the presence of two tasks subject to the [Cl/Co,F/Co] case conditions, if

T_{j}

is the closest and simultaneously the one which provides the highest level of connectivity, then the application of the

α_{HO}

value will never result in the selection of

T_{k}

.

Proof.

By contradiction, it is assumed that under these conditions the selection could be in favour of

T_{k}

, implying that the following inequality holds:

\begin{matrix} Φ_{i} (α, T_{j}, R) = α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{i} (T_{j}, R) \leq α \cdot Ψ_{i} (T_{k}) + β \cdot Ω_{i} (T_{k}, R) = Φ_{i} (α, T_{k}, R) \\ α \cdot (Ψ_{i} (T_{j}) - Ψ_{i} (T_{k})) + β \cdot (Ω_{i} (T_{j}, R) - Ω_{i} (T_{k}, R)) \leq 0 \end{matrix}

(17)

Which implies that, independently of the

α_{HO}

value, the terms

(Ψ_{i} (T_{j}) - Ψ_{i} (T_{k}))

and

(Ω_{i} (T_{j}, R) - Ω_{i} (T_{k}, R))

should not be positive simultaneously. Thus, either

(Ψ_{i} (T_{j}) \leq Ψ_{i} (T_{k}))

or

(Ω_{i} (T_{j}, R) \leq Ω_{i} (T_{k}, R))

. However, this contradicts the hypothesis where

T_{j}

is stated as the closest and the one which simultaneously provides the highest level of connectivity, and accordingly the proposition has been demonstrated. ☐

Proposition 5.

In the presence of two tasks subject to the [Cl/Co,F/Co] case conditions, if

T_{j}

is the closest and

T_{k}

the one which provides the highest level of connectivity, then the application of the

α_{HO}

value will never be conclusive concerning the task selection.

Proof.

The relation between the utility of tasks is written as follows in (18):

\begin{matrix} Φ_{i} (α, T_{j}, R) = α \cdot Ψ_{i} (T_{j}) + β \cdot Ω_{i} (T_{j}, R) ≶ α \cdot Ψ_{i} (T_{k}) + β \cdot Ω_{i} (T_{k}, R) = Φ_{i} (α, T_{k}, R) \\ α \cdot (Ψ_{i} (T_{j}) - Ψ_{i} (T_{k})) ≶ β \cdot (Ω_{i} (T_{k}, R) - Ω_{i} (T_{j}, R)) \\ α \cdot (Ψ_{i} (T_{j}) - Ψ_{i} (T_{k})) ≶ (1 - α) \cdot (Ω_{i} (T_{k}, R) - Ω_{i} (T_{j}, R)) \\ α ≶ \frac{Ω_{i} (T_{k}, R) - Ω_{i} (T_{j}, R)}{(Ω_{i} (T_{k}, R) - Ω_{i} (T_{j}, R)) + (Ψ_{i} (T_{j}) - Ψ_{i} (T_{k}))} \end{matrix}

(18)

Substituting

(Ω_{i} (T_{k}, R) - Ω_{i} (T_{j}, R)) = x

and

(Ψ_{i} (T_{j}) - Ψ_{i} (T_{k})) = u

, it is possible to state that in order to favour the selection of

T_{j}

the inequality (19) must be held, otherwise the (20):

\begin{matrix} α \geq \frac{x}{u + x} & \Rightarrow α = s u p (\frac{x}{u + x}) \end{matrix}

(19)

\begin{matrix} α \leq \frac{x}{u + x} & \Rightarrow α = i n f (\frac{x}{u + x}) \\ s . t . & 0 \leq x \leq 1 \\ 0 \leq u \leq 1 \\ given that : & Ω_{i} (T_{k}, R) \geq Ω_{i} (T_{j}, R) \\ Ψ_{i} (T_{j}) \geq Ψ_{i} (T_{k}) \end{matrix}

(20)

On this domain, the function

\frac{x}{u + x}

presents an absolute maximum equal to 1 in the point

(x, u) = (1, 0)

, and absolute minima equal to 0 along the line segment defined by

(x, u) = (0, u)

. Assessing the

α_{HO}

expression derived in (15) with

(0, u)

leads to (21) and (22), respectively:

\begin{matrix} 0 = & \frac{Ω_{1}}{1 + Ω_{1} - Ψ_{i} (T_{HO})} ∴ 0 = Ω_{1} \end{matrix}

(21)

\begin{matrix} 1 = \frac{Ω_{1}}{1 + Ω_{1} - Ψ_{i} (T_{HO})} \\ 1 - Ψ_{i} (T_{HO}) = 0 ∴ Ψ_{i} (T_{HO}) = 1 \end{matrix}

(22)

From which, while the condition expressed in (21) is reached when

| R | \to \infty

, the one expressed in (22) is reached when HO-Threshold tends to 0. The condition (21) is unreachable in practice implying that no

α_{HO}

can make the task

T_{k}

always preferred over

T_{j}

. Conversely, the condition (22) is reachable if, and only if, the human operator deliberately does not want to care about connectivity. Otherwise, there is no positive

α_{HO}

-value that can make the task

T_{j}

always preferred over

T_{k}

.

Consequently, when

α_{HO} \in (0 . . 1]

under the [Cl/Co,F/Co] conditions, it is not possible to hold a single preference over time. ☐

4.4. Considerations and Usefulness

In order to establish the task selection criterion, the human operator only needs to choose the extra distance HO-Threshold—according to his expertise and knowledge—he is willing to ask the robots to travel in order to keep or enlarge the connectivity level of the fleet. Once the HO-Threshold is set, robots are capable of selecting tasks consistently with the HO criterion following the Equation (15). Furthermore, it is important to note that the HO-Threshold value does not change along the exploration but, as was pointed out, the

α_{H O}

does, due to the dependency on the

Ψ

function. This explains the need for auto-adaptive capabilities concerning the multi-objective

Φ

function.

Additionally, it also worth noticing that setting HO-Threshold

= \infty

it is a practical way to implement an event-based connectivity approach where the tasks that provide connectivity will always be preferred over the rest, no matter how close they are.

5. Task Allocation Scheme

The allocation scheme is founded on two pillars: the coordination method and the task selection algorithm.

5.1. Coordination Method

In order to take advantage of the individual computing power of the robots, to avoid the single point of failure, and to deal better with the presence of real communication constraints during the exploration, a decentralised approach is followed. Typically, estimation of travelling costs and target benefits, as well as mapping and localisation, are the tasks chosen to be made locally by the robots. However, to achieve a cooperative behaviour, both the local map and localisation information must be shared among team members.

Additionally, the relation between

| T |

and

| R |

can result in two somewhat different behaviours: (i) If

| T | < | R |

, not all robots would be needed to reach all targets. Some robots may choose to keep quiet; (ii) When

| T | \geq | R |

all robots would be needed in order to reach the maximum amount of targets at a time. When robots decide to explore, the task selection is made coordinately. Robots coordinate their actions implicitly, sharing specific information (such as locations, eventually already-done-selections, and local maps) and running the same selection algorithm. Thus, it is possible for the multi-robot system to compute a coordinated-tasks-to-robots distribution in a decentralised way [15,17,44].

To do so properly, the exchanging information time is carefully set up. The system is fully asynchronous, meaning that: (i) Robots do not wait for others; (ii) After selecting a task, the robots do exchange their selection in order to prevent future overlappings; (iii) Local maps and—by means of this —the sets of new available tasks are periodically exchanged, each time two conditions are met: (1) A waypoint of the planned path is reached; (2) New information has been gathered; (iv) Localisation data is exchanged at a higher rate than maps because its influence on the task selection algorithm is higher too.

While localisation data is exchanged periodically, the rest of data exchanging is triggered by events instead. These policies make the system more efficient and flexible because: (i) No data is transmitted when there is no new information to exchange; (ii) There is no need to set up any rate parameter when exploring different environments. The robot life-cycle algorithm is sketched in Algorithm 1.

Algorithm 1 Robotic Agent Life-cycle algorithm.

1:

functionexplore(i,R,

HO - Threshold

)

▹ i stands for the robot position in vector R.
▹ R stands for the robots location vector.

2:

a t T \leftarrow t r u e

▹ atTarget flag.

3:

p o s e \leftarrow R [i]

4:

g M a p \leftarrow g e t M a p (p o s e)

▹ Occupancy Grid map.

5:

while true do

6:

R^{*} \leftarrow \emptyset

▹ Vector of connected robot locations.

7:

for

j \in R \land j \neq i

do

8:

if

Γ_{i} (j, g M a p) > 0

then ▹ Connected robot.

9:

R^{*} [j] = r c v P o s e (j)

▹ Asking for localisation data.

10:

s n d P o s e (p o s e, j)

▹ Sending own localisation data.

11:

g M a p = m a p M e r g e (g M a p, r c v M a p (j))

▹ Asking for local maps.

12:

end if

13:

end for

14:

if

a t T

then

15:

T = g e t F r o n t i e r T a s k s (g M a p)

▹ Tasks location vector.

16:

t a s k \leftarrow g e t A s s i g n m e n t (i, R^{*}, T, | R |, HO - Threshold)

17:

g o t o (t a s k)

18:

end if

19:

p o s e = g e t P o s e ()

▹ Global localisation.

20:

a t T \leftarrow p o s e = t a s k

21:

[g M a p, n i] = m a p M e r g e (g M a p, g e t M a p (p o s e))

▹ Mapping.

22:

if

a t T \lor n i

then ▹

R_{i}

arrives at task or new information was gathered.

23:

for

j \in R^{*} \land j \neq i

do

24:

s n d M a p (g M a p, j)

▹ Sending local map.

25:

end for

26:

end if

27:

end while

28:

end function

5.2. Task Selection Algorithm

The task selection process employs the multi-objective utility function

Φ

defined in (7) with

α_{HO}

values dynamically adapted by (15) to solve the MRTA problem stated in Secion Section 2.5. The corresponding algorithm is sketched in Algorithm 2.

Algorithm 2 Task selection algorithm.

1:

functiongetAssignment(

i, R^{*}, T, M, HO - Threshold

)

▹ i stands for the robot position in vector $R^{*}$ .
▹ $R^{*}$ stands for the robots location vector.
▹ T stands for the current tasks location vector.
▹ M stands for the fleet size.

2:

[T^{u}, T^{a}] \leftarrow T

3:

R^{u} \leftarrow R^{*}

4:

T^{HO} \leftarrow {T_{j} ∣ T_{j} \in T^{u}, Δ_{k} (T_{j}) \leq HO - Threshold, \forall R_{k} \in R^{u}}

▹ Relative distance

Δ_{k} (T_{j})

is defined in Section 3.2.

5:

for each k in

R^{u}

do

6:

for each j in

T^{HO}

do

7:

P U [k, j] = Ψ_{k} (T_{j})

▹ Path utility matrix.

8:

end for

9:

end for

10:

α_{HO} \leftarrow \frac{Ω_{1}}{1 - Ψ_{i} (HO - Threshold) + Ω_{1}}

▹ (15)

11:

β = 1 - α_{HO}

12:

N^{HO} = | T^{HO} |

13:

M^{u} = | R^{u} |

14:

T 2 R D i s t \leftarrow A r_{M^{u}}^{N^{HO}}

▹ Tasks-to-robots distributions

T 2 R D i s t \in N^{| A r_{M^{u}}^{N^{HO}} | \times M^{u}}

15:

for each row r in

T 2 R D i s t

do

16:

Φ [r] = \sum_{k = 1}^{M^{u}} α_{HO} \cdot P U [k, j] + β \cdot Ω_{k} (T_{j}^{HO}, [R^{u}, T^{a}]), j = T 2 R D i s t (r, k)

17:

end for

18:

T^{*} \leftarrow T 2 R D i s t [r, i] ∣ arg {max}_{r} Φ [r]

19:

return

T^{*}

20:

end function

Firstly, the input parameter

R^{*}

specifically corresponds to the locations of the teammates currently connected with the robot

R_{i}

. Next, in lines 2 and 3, both the task and robot location sets are split up into two subsets each one (assigned and unassigned items, respectively). Line 4 is in charge of taking only the unassigned tasks that are within the HO-Threshold scope from every robot. Afterwards, from lines 5 until 9, the path utility matrix is computed regarding all possible task-robot combinations. Next, lines 10 and 11 aim to compute the

α_{HO}

and

β

values according to (15). The set of tasks-to-robots distributions is calculated from line 12 to 14. Finally, from line 15 to 17 all possible assignments are evaluated using the

Φ

function while the task corresponding to robot i of the best assignment is selected in line 18.

Some considerations on Algorithm 2 are hereafter discussed. Concerning the computation of the set of tasks-to-robots distributions (lines 12 to 14), it provides a way to potentially avoid falling in local minima or even taking wrong decisions. Note that the connectivity utility function is subject to locality conditions and thus, it is not possible to compute optimal distributions from the application of iterative polynomial-time assignation algorithms such as the Hungarian method [14].

On the contrary, Algorithm 2 can choose the optimum tasks-to-robots distribution by evaluating all possible

T^{HO}

-to-

R^{u}

distributions. Nevertheless, this process may be potentially very hard since

| A r_{M^{u}}^{N^{HO}} | = \frac{N^{HO}!}{(N^{HO} - M^{u})!} = Π_{m = 1}^{n} (N^{HO} - m + 1) = Π_{m = 0}^{n - 1} (N^{HO} - m) \to O (N^{* M^{u}})

. Therefore, the smaller

| T^{HO} |

and

| R^{u} |

the faster the algorithm will run. In the first case

| T^{HO} |

is bounded by pruning

| T^{u} |

with the help of HO-Threshold.

On the contrary, even being naturally bounded (

| R | \geq | R^{*} | \geq | R^{u} |

), the set R could imply a large

R^{u}

. Besides, all efforts are to keep the fleet connected as much as possible, leading to

| R^{*} | \to | R |

. Fortunately, in a fully asynchronous multi-robot system the probability of two or more robots being simultaneously making a decision is negligible.

Finally, note that Algorithm 2 assumes

| T^{HO} | > | R^{u} |

; otherwise the tasks-to-robots distribution cannot be computed. In such a case, the input parameters are managed in order to conduct a robots-to-tasks distribution instead. In turn,

| A r_{N^{HO}}^{M^{u}} |

does not represent a significant effort since

M^{u} \geq N^{HO}

holds for small values.

6. Baseline Statement and AAMO Approach Results

The aims of this section are: (i) To establish a baseline on the main figure of merits that will be defined to asses the benefits of different approaches; (ii) To assess and analyse the performance of different instances of the Auto-Adaptive Multi-Objective (AAMO) approach (different instances—from now on—refer to different HO-Threshold setup values) under non-ideal communication conditions; (iii) To compare AAMO instances against other approaches under non-ideal communication conditions.

Regarding the first purpose, the baseline is established regarding two state-of-art approaches so that the simulation runnings concern the comparison between a Yamauchi-based algorithm [5] and the minPos algorithm [17] under ideal communication conditions. These algorithms were chosen since they are decentralised, as are the author’s proposal; while Yamauchi is a reference on exploration and typically serves itself as a comparison baseline, the minPos proposal has demonstrated very good performance, outperforming other important reference algorithms.

On the contrary, regarding the AAMO assessment and the comparison with other approaches, the simulation runnings concern exploration missions subject to non-ideal communication conditions. In this case, the primary purpose is to understand how compromised could be the exploration time performance when the connectivity level is prioritised and to reveal possible improvements concerning previous techniques. In consequence, there are experiments which compare only the performance achieved by different instances of AAMO, while in other experiments, where relevant, comparison with state-of-art performance is taken into account too.

6.1. Simulation Setup

All simulations were conducted over MORSE physics simulator (www.openrobots.org/morse/doc/stable/morse.html) using ATRV-like robots equipped with laser range sensors. The more relevant simulation parameters are shown in Table 3.

Furthermore, it is important to precise that except for Communication range that depends on the device, the rest of communication factors were taken from [38] regarding their strong dependency on the materials present in the environment. The values of HO-Threshold correspond to 66%, 50%, and 33% of the communication range

c_{i}

, respectively. In all simulations localisation and low level motion control are taken for granted.

6.1.1. Scenarios

Simulations are conducted over synthetic scenarios (See Figure 10) where long distances and obstacle presence may offer similar challenging conditions that would be expected in the real world. The Loop and Cross scenarios (see Figure 10a,b) were mainly used to confirm the correctness of the implemented solutions and to show the advantages of using a multi-robot approach over a single one. Unfortunately, and caused by the shape and size of the free zones, on those scenarios there are nearly no possibilities to demonstrate any advantage of the proposed approaches over the others. Finally, the Maze scenario (Figure 10c), that represents the most challenging environment, was used to establish comparative results among the approaches. These results are further analysed below in Section 6.3 and Section 6.4, respectively. Due to the big amount of collected data, only the values related to Maze runnings are summarised and discussed here. Even so, all charts and screen-shots generated from data concerning all of the three environments are available online: www.fing.edu.uy/ fbenavid/projects/MuRE/mure.html.

6.1.2. Robotic Agent Architecture

From a software architecture point of view, each robot is organised in three layers. In Figure 11 the main components are roughly depicted. Each layer is responsible for different aspects grouped by abstraction levels so that the higher layer, the more abstract are the issues which the software components are devoted to.

Going bottom-up in the layer stack, in the first layer the components are in charge of the interaction between the robotic agent and the environment. The Motion Control component is taken from the MORSE (MORSE physics simulator www.openrobots.org/morse/doc/stable/morse.html) repository and is responsible for controlling the motors. Besides, in this work, the component follows a way-point-based motion strategy. In the Sensory Capabilities component all sensory systems in charge of gathering environmental information are grouped. The most relevant information comes from the Pose and the Laser scanner sensors, also taken from the MORSE repository. From the Pose sensor it is possible to know the robot configuration

X_{i} (t) = {x_{i} (t), y_{i} (t), θ_{i} (t)}

at any time—implementing the localisation capability—while the laser gives an array of distance measurements

z (t)

from which is possible to build the map of the close surroundings. Finally, the Communication Capabilities component is asked to manage every aspect related to communications receiving/sending information from/to (see incoming/outgoing arrows) other team members. In this work, and since only the distance and wall attenuation effects (discarding other sources of perturbation) are considered, the communication is simulated in a very simple manner directly applying the communication model introduced in Section 2.3.

The second layer represents the core of the system where the models and algorithms that support the highest level functionalities—namely related to the exploration purpose of the system—are allocated. On the one side, the World Model component is in charge of modelling all physic interaction between the robotic agents and its surroundings. By keeping several structures up-to-date (e.g., occupancy grid map, the position of the fleet members, assignment of the fleet members), it is also able to support foretelling services that would be required for the highest level algorithms. On the other side, Mapping and Path Planning components are also supported by the World Model component since it gives an access point to the mapping structures and the kinematic models as well. The Mapping component implements a standard occupancy grid approach [46] where the posterior of the map is calculated from a collection of separate problems of estimating

p (m_{k} | z (t), X_{i} (t))

for all grid cell

m_{i}

and where each

m_{i}

has attached to it one of the occupancy values

S = {f, o, u}

(previously defined in Section 2.1). The Path Planning component implements the wave-front propagation approach introduced in [43].

Finally, high level decisions as coordination are taken in the third layer when the task allocation scheme is executed by the Task Assignment component. In particular, the arrow between Task Assignment and Communication Capabilities components represents the exchange of current positions and task assignments from the agent to the fleet and vice-versa.

6.2. Figure of Merits

The performance of approaches is assessed regarding the following figures of merit. The first three are the most popular and represent the strongest quality indicators [20]. The fourth has been taken from [47] and sometimes can be useful to explain the results concerning the first two. The fifth was inspired by [48] in order to measure the connectivity quality. Besides, a sixth indicator is proposed here in order to have a better qualitative analysis of the connectivity aspects. Moreover, the connected components of the topology along the exploration are also plotted. The indicators are defined as follows:

Total explorationTime (TT): time elapsed from the beginning until the end of exploration measured in seconds.
PathLength (PL): sum of the distance travelled by each robot measured in meters.
CoverageRatio (CR): percentage of the accessible terrain covered by the team. Calculated as: $\frac{explored cells \cdot 100}{accessible cells}$ .
Over-Sensing cellRatio (OSR): percentage of cells sensed as new by more than one robot. Calculated as: $\frac{over - sensed cells \cdot 100}{explored cells}$ .
DisconnectionLastRatio (DLR): percentage of TT where at least one robot is totally unconnected. Calculated from the Fiedler number corresponding to the network connectivity graph (see Section 2.6.3).
MaximumDisconnectionLastRatio (MDLR): calculated as: $\frac{longest disconnection period \cdot 100}{T T}$ .

6.3. Baseline Statement

In this section, a baseline of performance on the main indicators is established from runnings of both Yamauchi and minPos approaches under ideal communication conditions. Since the exploration problem is expected to be more difficult under non-ideal communication conditions than otherwise [20], the obtained results may be considered as a baseline of the first four indicators—defined before in Section 6.2—with respect to the corresponding performance achieved in runnings conducted under non-ideal communication conditions.

6.3.1. Collected Data

In order to conduct the assessment and comparison stated above, at least ten realistic software-in-the-loop simulations were executed on the Maze scenario presented in Figure 10. All collected data is presented in Table 4 and are organised obeying the following scheme. The columns refer to (from left to right): figure of merits (FM); approaches, where Y and MP stand for Yamauchi and MinPos, respectively; and the fleet size

| R |

. In each fleet size, the average AVE and standard deviation StD values are registered.

6.3.2. Baseline Assessment

We start the analysis highlighting that both approaches can adequately explore all the environments presented above in Section 6.1.1. Coherently, both approaches achieve high levels of CR. This can be seen clearer in Figure 12.

Furthermore, the minPos approach outperforms Yamauchi concerning TT as was expected. However, the most notorious differences of performance are observed on fleets which size is less than or equal to five robots, as can be seen in Figure 13.

In crowded environments, going from one location to another is often more difficult than in the presence of fewer robots. Therefore, due to collision avoidance manoeuvres, both approaches show an increasing PL when the fleet size increases. This behaviour may be observed in the corresponding chart in Figure 14. On the one hand, Yamauchi presents a trend with an almost invariant slope along the different fleet size values. On the other hand, under MinPos, the trend of PL presents a positive but minor slope from one to five-robot-sized fleet after what it becomes very steep.

Hence, the analysis is divided into two cases. Firstly, when fleet size is less than or equal to five robots, MinPos is more efficient than Yamauchi since both approaches achieve very similar coverage ratios (see Figure 12) despite in the latter robots need to traverse longer distances than in the former, on average. That is expected since the Yamauchi approach does not take care about the dispersion of the fleet as the MinPos does and consequently, in the former robots are forced to deal with crowding more frequently than in the latter. This is a remarkable difference given that the energy needed to support an exploration mission will be closely related to the distance traversed by robots.

Contrarily, as the fleet size increase beyond five robots, the shape of the scenario and the peculiar wall distribution all together seem to make the crowding unavoidable for the MinPos approach, causing a severe worsening on its PL performance.

Finally, it is interesting to observe the over-sensing-cell phenomenon, because, by observing the amount of rework done by the fleet during exploration tasks, it also gives a good measure of the system efficiency.

In this case, we start the analysis pointing that in an ideal world—with perfect communications, perfect sensing and instantaneous actions—there would be no place for over-sensing. Nevertheless, in the real world, communications and sensing systems are not perfect and, more important, all actions take time. Even the ones which do not involve motion such as sensing, computing and communicating actions need some window time to be executed. Therefore, many things can happen simultaneously, e.g., sensing actions conducted on the same objects. In such a case, two or more robots might report the discovery of the same cells.

In conclusion, even under ideal communication conditions, it is possible to register some level of over-sensing, and this level is unavoidable because of the parallel nature of the system. However, it is equally interesting to analyse the over-sensing results: (i) When the fleets are obeying different policies; (ii) To have a baseline against which the results obtained under non-ideal communication conditions may be compared.

Backing to the experiments, during the simulation runnings we verify that the most significant over-sensing record is mainly generated at starting steps when all robots are very close to each other (recall that all robots start from the same corner of the scenario, see Table 3) and, in consequence, its sensing scopes overlap each other, significantly. In Figure 15 the robot placement setup at the starting time is shown.

Conversely, after this initial period, the robots overlap each other less frequently, and hence the OSR remains almost unchangeable over time, in both approaches. Despite this, minor differences may be highlighted. Due to a better fleet distribution on the terrain—which decreases the probability of simultaneous sensing events—the fleet makes slightly less rework under MinPos approach than under Yamauchi approach (see Figure 16).

6.3.3. Conclusions

Concerning the maze scenario, the conclusions of the section are: (i) Regarding fleets integrated with at most five robots, the MinPos approach is clearly advantageous (outperforming the Yamauchi approach in all assessed figures of merit); (ii) The benefits of employing the MinPos approach are severely affected when fleet increase beyond five robots, decreasing quickly or even disappearing when it is about eight robots.

6.4. AAMO Assessment

This section aims to study the impact of using different HO-Threshold values on the performance of the proposed AAMO approach when the fleet is asked to explore an environment under non-ideal communication conditions. Moreover, these results are compared with the one achieved by other approaches like Yamauchi and MinPos—when they are subject to non-ideal communication conditions too—and also with an event-based-connectivity strategy that does make all efforts in favour of connectivity (regardless the total exploration time).

This last comparison is namely important because the performance of this kind of strategy may serve as an upper bound on the connectivity level over time and the total exploration time as well. To do so, typically two strategies (based on different connection requirements, see Section 1.1.1) can be considered: the ones which force the robots to be connected only on task-arrival time (kind of event-based connectivity) or the ones which force the robots to keep always connected—even during the path traversal periods (continuous connectivity). In the former, the robots are forced to select only between tasks which location would not cause isolation on arrival—regarding the current task assignment of the fleet. Nevertheless, it does not take into account the connectivity level along the path between the current robot location and the location of the task under consideration. Conversely, the latter imposes stronger restrictions on the fleet mobility in order to guarantee connectivity at all times. Consequently, depending on the application field the latter strategy would be recommended but is more complex to implement than the former. On the contrary, the former allows a simpler implementation but could lead to a lower level of connectivity along the exploration. Concerning this document, a connectivity-at-task-arrival-time based strategy is used for comparison purposes.

Besides, it is also important to highlight that, despite Yamauchi and MinPos assume ideal communication conditions, neither approach needs to be modified or adapted in order to properly run under non-ideal communication condition. Nevertheless, in the MinPos case, some severe degradation is expected because of the following working hypothesis are not guaranteed anymore: All robots share the same map and know the position of the other fleet members, at all times. This could lead to incoordinations that, in turn, would harm the dispersion strategy on which the approach is strongly based. Conversely, in the Yamauchi case, the level of expected degradation is fewer due to the coordination level between robots is fewer as well. Robots only try to avoid going to the same task simultaneously.

6.4.1. Collected Data

In order to conduct the assessment and comparison stated above, at least ten realistic software-in- the-loop simulations were executed on the Maze scenario presented in Figure 10. All collected data is presented in Table 5 and are organised obeying the following scheme. The columns refer to (from left to right): Figure of Merits (FM), Approach, where Y, MP, EbC, and AAMO:HO-Th stand for Yamauchi, MinPos, Event-based Connectivity (implemented by an AAMO:∞ instance, as was mentioned above in Section 4.4) and Auto-Adaptive Multi- Objective:HumanOperator-Threshold, respectively; and fleet size

| R |

. The HO-Threshold values are 20 m, 15 m, and 10 m (equivalent to 66%, 50%, and 33% of the communication range

c_{i}

, respectively). Besides, since the communication conditions are non-ideal all runnings only concern fleets integrated with multiple robots (explicitly avoiding the single robot case because the communication conditions do not make any difference on it).

6.4.2. Effectiveness Assessment

We start the analysis highlighting that all implemented approaches—and particularly all AAMO instances—can adequately explore all the environments presented above in Section 6.1.1. Coherently, all instances achieve a high level of CR when exploring the Maze environment, as can be appreciated in Figure 17.

6.4.3. AAMO vs. Baseline Comparison

Concerning TT, as was expected in multi-robot systems, all AAMO instances benefit from adding robots to the fleet. This result can be seen in Figure 18a. Nevertheless, compared to the baseline results all AAMO instances show performance degradation (see Figure 18b).

The evidence indicates that the more efforts made in favour of connectivity (bigger HO-Threshold) the worst TT. In other words, not all HO-Threshold setup values produce the same level of performance degradation. Since the degradation of TT performance could be very problematic in many application fields, this subject is carefully analysed.

At first, the PL indicator can help to initially explain why the fleet spends more time under AAMO approach than under the MinPos approach, to explore the same environment. In Figure 19a it is possible to observe the same behaviour as in the baseline (see Section 6.3): larger fleets imply bigger PL; while Figure 19b shows the difference between the corresponding total length of the paths traversed by fleets.

The similarity between Figure 18b and Figure 19b is remarkable and could explain, to a large extent, the origin of TT degradation. Simply, under the AAMO approach, the robots are asked to invest some effort (translated as a distance using the HO-Threshold) in order to keep the fleet connected and hence it is logic to get a bigger PL as a result. Moreover, the tradeoff between path and connectivity utility discussed in Section 3.1 shows up through these results, reflecting that the price of connectivity maintenance is the inability to apply an optimal policy concerning path costs.

Nevertheless, there exists a small portion of the TT degradation that cannot be explained by the PL increasing. Therefore, the hypothesis assumed in the tractability analysis made at the end of Section 5.2 are compared here with the simulation results in order to add a complementary explanation on the TT degradation. Furthermore, this TT degradation shows a parabolic trend as the fleet size increase, reaching a maximum about three-sized fleets, independently of the HO-Threshold values. Thus, the analysis will be conducted observing what happens when the fleet size does change but the HO-Threshold does not (in order to explain the shape of the curve or the relative values), and the opposite conditions are imposed in order to explain the absolute values.

In any case, it is worth knowing that the Task selection algorithm is the most demanding software component in the software architecture of the robots. Hence, the overall performance of the multi-robot system is highly determined by the performance of this component. In turn—as was pointed out in Section 5.2—its performance is strongly influenced by the number of unassigned thresholded tasks

n = | T^{HO} |

and the number of unassigned robots in a connected component

m = | R^{u} |

that are making a decision at the same time, in the following way:

| A r_{m}^{n} | = \frac{n!}{(n - m)!} = Π_{m = 0}^{n - 1} (n - m) \to O (n^{m})

. Therefore, the smaller

| T^{HO} |

and

| R^{u} |

the faster the algorithm will run. Please recall that

| T^{HO} |

is upper bounded by the amount of unassigned tasks

| T^{u} |

.

Firstly, from Figure 20 and Figure 21, it is possible to examine how

| R^{u} |

and

| T^{u} |

change along explorations depending on the fleet size. In all cases, both values show well defined patterns that are easily identifiable. Concerning

| R^{u} |

(see Figure 20) it is possible to state that in all AAMO instances—working in a fully asynchronous modality—the probability of two or more robots simultaneously running a decision making process is negligible. Thus the majority of time either none robot is making a decision or at most one robot is evaluating the available tasks.

Results obtained during simulations are summarised in Table 6 and show a behaviour that is consistent with this last statement independently of the fleet size. The low ratio of robot coincidences is remarkable (e.g., for 3-sized fleets, about 96% of the decision making moments have only one robot participating on them).

In conclusion, in practice, the worsening of the TT performance is apparently only related to the incidence of the HO-Threshold on the

| T^{HO} |

value. Next, this relation is carefully studied, and some answers are essayed.

The parabola described by the TT degradation values in Figure 18 suggests the presence of two factors impacting on this behaviour. One presses the trend upwards and the other in a counter sense. In the following, two particular factors are analysed: the fleet size and the bounded condition of the environment. (i) As the fleet size increase robots make progress faster, causing

| T^{HO} |

to increase more quickly as well. When

| T^{HO} |

rises, the task selection algorithm becomes slower, and thus the increase in the fleet size could explain the first increasing section of the trend; (ii) In bounded environments, the multi-robot exploration systems typically show two mobility patterns that characterise, in turn, two different exploration stages: (1) One is characterised by the dispersion of the fleet on the terrain. In such a stage, the new available tasks appear closer to each other, and its total amount

| T^{u} |

is upward; (2) On the contrary, the second exploration stage is characterised by the convergence of the fleet to the remaining unexplored zones starting when it is no longer possible to disperse the fleet until the end of the exploration. In such a stage, the new available tasks generally appear further to each other and its total amount

| T^{u} |

is decreasing. Therefore, since the tasks

T^{HO}

are the ones which are closer than a relative distance HO-Threshold, under the AAMO approach, it is statistically less demanding for the robots to select a task during the last exploration stage than in the initial one.

Additionally, either when the fleet size increase or the HO-Threshold decrease, the transition from the first to the second exploration stage is achieved faster. This fact can be corroborated in both Figure 21 and Figure 22. For instance, concerning Figure 21, the 3-Robot system spends about 410 s to reach the end of the dispersion stage whereas the 5-Robot system and 8-Robot system spend about 320 s and 260 s, respectively. Likewise, from Figure 22, the AAMO:20 instance spends about 310 s to reach the end of the dispersion stage whereas the AAMO:15 and AAMO:10 spend about 260 s and 150 s, respectively.

Hence, despite the fact the impact of the fleet size on the exploration stage transition appears to be higher than the one caused by the HO-Threshold value, both aspects contribute to reducing the task selection effort enabling robots to save time in the task allocation procedure anticipatedly.

In conclusion, when the AAMO is executed in bounded environments, the addition of robots and the decreasing of HO-Threshold can almost entirely mitigate the worsening in the total exploration time performance. Please note that the performance degradation of AAMO:10 instances is almost null for eight-sized fleets.

From these promising results, in the following, all AAMO instances are compared with the other approaches concerning non-ideal communication conditions.

6.4.4. AAMO Efficiency Assessment

In this section, several statistical analyses were performed on different indicators to demonstrate the efficiency of the proposed AAMO approach. Wilcoxon signed-rank tests were performed (a non-parametric test was chosen since data in each condition do not follow a normal distribution) to compare samples from two populations. More precisely, it tests the indicator differences between approaches for a given fleet size.

Firstly, in relation to TT (see Figure 23), the evidence confirms two expected results: (i) All approaches benefit from adding robots to the fleet. A Wilcoxon difference test was performed regarding TT and the fleet size for each approach. All comparisons present a significant decrease in TT when fleet size increases (p-value

< 0.05

); (ii) Since it only takes care of connectivity, the EbC approach shows the worst performance regardless the fleet size. Wilcoxon tests showed a significant result (p-value

< 0.001

) for all comparisons between approaches given a fleet size.

Additionally, all AAMO instances show competitive TT results even slightly outperforming other approaches in the case of AAMO:10. In particular, a Wilcoxon difference test showed that AAMO:10 has a smaller TT than MinPos for 2 and 5 robots (resp.

W = 169

, p-value

< 0.01

, and

W = 159

, p-value

< 0.05

).

Secondly, concerning the PL indicator (see Figure 24), the EbC approach present the worst performance, coherently. Again, the Wilcoxon test showed significant results (p-value

< 0.001

). Likewise, all AAMO instances show competitive results too.

Besides, and as was pointed above, the TT and PL results show that the lack of ideal communication conditions negatively affects the MinPos approach more than the Yamauchi approach. Wilcoxon tests showed a trend for 4 and 5 robots (p-value

< 0.1

) concerning TT, and a sigfinicant difference in PL for 4 robots (p-value

< 0.05

).

Up to this point, the AAMO approach has shown results as good as the MinPos approach. Next, the indicators related to connectivity are analysed in order to properly assess the potential advantages of the AAMO approach in the presence of more realistic communication conditions.

The DLR indicator trend is shown in Figure 25. As can be seen, while the performance of the MinPos and Yamauchi approaches are the worst, the EbC performance is remarkably the best. These visual results were confirmed by Wilcoxon tests between approaches for each fleet size. DLR indicator is significantly bigger (p-value

< 0.001

) for MinPos and Yamauchi than AAMO and EbC approaches, except for 8-sized fleets where these results are significant only when compared to EbC. Moreover EbC has a significant smaller DLR indicator (p-value <

0.05

) than all the others approaches except for the 5 and 8 robots cases, in which any statistical difference can be found between AAMO approaches and EbC.

Similarly, the AAMO approach results represent a very good improvement with respect to both MinPos and Yamauchi approaches. The chart in Figure 25 reveals that our approach outperforms both Yamauchi and minPos approaches independently of the fleet size on average. Nevertheless, the smaller fleet, the greater outperforming. The explanation can arise correctly from intuition: when the environment is bounded, the probability of being disconnected tends to decrease as the fleet size increase. Therefore, the benefits of our approach tend to be smaller when the fleet size increases. Either way, it is always meaningful. Please note that even in the largest fleet size case, the DLR of AAMO represents an improvement of 20% on average compared to the corresponding Yamauchi or MinPos.

Furthermore, the relation between TT, DLR and HO-Threshold is noticeable. The more effort demanded by the human operator (higher threshold), the slower but higher connected the AAMO performs. This claim is confirmed by Wilcoxon tests that showed a significantly bigger (p-value

< 0.05

) TT indicator for AAMO:20 than for AAMO:10, and also show that the DRL indicator is significantly smaller (p-value

< 0.05

) for AAMO:20 than for AAMO:10, regardless the fleet size.

Regarding the oscillation registered, it could suggest the existence of the following rational pattern. When fleet size is even, the easier way to avoid isolation situations is keeping in pairs (connected with at least another teammate). Contrarily, when the fleet size is odd, not all robots can keep in pairs. In case the fleet has divided, at least one sub group must be composed of three robots. Therefore, this oscillatory behaviour could hint at the fact that odd-sized fleets need to make little more effort to avoid robot isolation situations and are consequently subject to bigger DLR results as well.

Likewise, it is interesting to analyse the DLR indicator and network topology together. This way it is possible to get a closer notion about the interaction between robots along the exploration. Figure 26 is devoted to showing the number of connected components present in the network, averaged over time.

Please note that for the AAMO:20 instance—run on 2-Robot fleet—the DLR is about 40% (see Figure 25), coinciding with the percentage achieved by the 2CC of the same fleet size in Figure 26. In other words, the fleet holds a network composed of one single connected component during 60% (100%–40%) of total exploration time. Consistently, this is equivalent to say that during this portion of the time none robot has been disconnected.

Additionally, and as a matter of fact, the chart shows that as the fleet size increase it is more challenging to keep the whole fleet connected: 1CC stack is decreasing in size as the fleet size increase. Nevertheless, it also shows that simultaneously with the adding of new robots, the fleet is more and more cohesive (in relative terms). This fact may be corroborated looking at the upper part of the chart where the stacks corresponding to the greatest number of connected components are plotted. The following pattern can be observed: the number of connected components (given by

n C C

) increase slower than the fleet size n. Again, the fact that the Maze scenario is bounded may explain this phenomenon to a large extent.

Although all this information gives an approximated notion about how disconnected is the fleet (group perspective) along explorations, it is not enough to hint what is happening at the individual level. Thus, it is also interesting to study the worst case of the individual disconnections last. This way it is easier to evaluate both coordination capabilities (how long a robot is unable to coordinate its actions with any other teammates) and risky situations (how long the fleet present single points of failure). Recall that the key motivations in considering communication constraints are strongly related with the rework avoidance: (i) When robots are unconnected they have fewer possibilities to coordinate their actions hence they could visit the same regions unnecessarily. Hence, keep them connected is a way to favour the efficiency; (ii) In the presence of damages or inner failures the exploration strategy should take those events into account preventing the need of re-exploration.

In Figure 27 the trend followed by Maximum Disconnection Last Ratio MDLR indicator is depicted showing that the bigger HO-Threshold, the shorter disconnection periods (Wilcoxon tests showed that the MDLR indicator is significant smaller (p-value

< 0.05

) for AAMO:20 than for AAMO:10, for 2 and 3 robots, and tends to be smaller (p-value

= 0.09

) for 4 robots) and that the last of isolation situations is at most equivalent to half of the DLR values for every fleet size and HO-Threshold value as well. In other words, the isolation situations regard more than one single robot and this in turn, reveals that under the AAMO approach the robots often intent to rejoin each other.

At last but not least, it is worth to discuss the trend of OSR as the fleet size increase. The results obtained by the different AAMO instances are depicted in Figure 28. In Section 6.3 the OSR levels were achieved mostly thanks to simultaneous sensing actions, conversely, in these simulation runnings, the OSRs achieve higher levels due to non-ideal communication conditions. As was expected, the more the mapping information of the robots is out-of-date with respect to each other, the higher the OSR. However, in any communication conditions, the same upper bound is achieved. This suggests that the size and bounded condition of the Maze environment could be limiting the over-sensing phenomenon when fleet size increase beyond five robots.

To sum up and concerning the Maze scenario and the baseline stated in Section 6.3, the conclusions of this section are: (i) The AAMO approach can be employed as a strategy to coordinate multi-robot systems that are dedicated to exploration tasks; (ii) As was expected, the HO-Threshold value directly impacts on the connectivity level that the fleet is able to hold during the mission; (iii) Likewise, the relation between HO-Threshold values and the TT and DLR/MDLR indicators is the expected: the bigger the HO-Threshold value, the worse TT performance, but the better DLR/MDLR ratios; (iv) Although all instances of the AAMO approach present TT degradation with respect to the baseline, in any case it is not significantly due to the computation of the proposed task-to-robots distribution; (v) All AAMO instances outperform the baseline concerning the DLR and MDLR indicators; (vi) With the exception of DLR/MDLR, all instances of the AAMO approach outperform the EbC approach; (vii) The topology of the fleet networks shown during exploration is consistent with the HO-Threshold values, for all AAMO instances.

The AAMO approach shows effectiveness and flexibility (through the HO-Threshold setup) to tackle the multi-robot exploration problem. Particularly concerning the efficiency related to both completion time and connectivity level maintenance, the approach appears as an intermediate solution that presents much better TT performance than the most restrictive approach EbC and better connectivity level along exploration than the approaches that do not take care about communication issues.

7. Conclusions

7.1. Task Assessment Problem

The proposed Auto-Adaptive Multi-Objective (AAMO) approach follows a multi-objective assessment strategy where the tasks under consideration are assessed regarding two objectives: the cost associated with the corresponding shortest path and the connectivity level each task location can offer to robots at arrival time. The multi-objective strategy is implemented employing a weighted sum that trades travelling cost off for connectivity levels. Up until now, all these concepts are quite standard being present in several state-of-art approaches.

Nevertheless, in this work: (i) Connectivity awareness ability is given to the robots by modelling attenuation effects that commonly affect the communication signal strength; (ii) the weights of these potentially conflicting objectives are derived from formal analysis instead of a training stage, making the system more adaptable to different environments; (iii) The human operator is asked to use his application-field expertise to play a part in the task assessment process by setting a distance threshold until which the tasks that preserve or enlarge connectivity are preferred over the rest. All this leads to a more flexible system where the robots can deal with communication constraints adjusting the weights of each objective independently of any scenario in a more intuitive manner, saving a lot of training time.

All existence and correctness proofs conducted on the task selection procedure support the fact that the robots are always capable of auto-adapting the objectives weights in order to select the tasks accordingly with the human-operator criterion. In conclusion, this task assessment approach may be very advantageous considering its ease of deployment.

7.2. Task Allocation Problem

Concerning the tasks-to-robots distribution algorithm, all previous proposals explicitly avoid the combinatorial blow-up of allocation complexity using different heuristics. Nevertheless, heuristic-based approaches make assumptions that cannot be verified at all times. In consequence, when the heuristic fails the robots choose suboptimal distributions.

Taking into account this limitation and since the number of possible distributions depends on both the number of available tasks and the number of robots making a decision, the proposal presented here computes optimal distributions based on more general assumptions such as: (i) Robots can implicitly coordinate their actions; (ii) Asynchronism may keep the number of simultaneous decision making at small values; (iii) Pruning the furthest tasks (out of the scope defined by the human operator –HO-Threshold) does not prevent the computation of optimal tasks-to-robots distribution.

7.3. Connectivity Maintenance Problem

While all event-based connectivity approaches consist in the execution of regaining-connectivity actions in the presence of specific events (e.g., typically disconnections, whenever it happens or periodically after a certain amount of time), the AAMO approach integrates a less restrictive connectivity strategy where the robots are motivated but not compelled to regain connectivity. When selecting their targets, the robots are always considering the opportunity cost of keeping connected or regaining connectivity, implicitly. Furthermore, in reconnection cases, the task location becomes the meeting point itself eliminating the need for rendezvous policy implementation and, maybe more important, avoiding deviations from natural paths. This way, the policy is utterly transparent to the eyes of the external observer: every time it is possible to explore and keep or enlarge connectivity level the robots will choose this option. On the contrary, when it is not, they merely behave guided by a pure path-cost exploration.

Particularly concerning the efficiency related to both completion time and connectivity level maintenance, the approach is capable of decreasing the last of disconnection periods without a noticeable degradation of the completion exploration time, appearing as an intermediate solution that presents much better completion time performance than the most restrictive event-based connectivity approaches and better connectivity level along exploration than the approaches that do not take care about communication issues.

7.4. Future Research Directions

New research questions have arisen along this work leaving, as a result, several opportunities to improve the developed system. Although the environments employed in simulations are proposed as benchmarks, it would be beneficial to check the validity and performance of the proposed approach on a broader variety of scenarios. Large office-like environments would be exciting to put the system into more realistic situations like mapping buildings where larger fleets could be employed, too. Since the robot model defined can support robots with different characteristics, exploiting heterogeneity could be a promising research direction. Integrating a fleet with heterogeneous robots (e.g., different in size, sensory, and motion capabilities) could enhance the skills of the fleet. For instance, given their greater mobility, UAVs could help the fleet to keep connected by playing a relay role, while small terrestrial robots could be the key to get into access-restricted spaces. At last but not least, executions on real systems are also planned. Despite the goodness of any simulator, many important details escape their scopes. The proposed approach is designed to serve as a solution for real-world applications so that it is imperative to verify its feasibility in real scenarios. In such a case, localisation and mapping errors cannot be ignored. Both simultaneous Localisation and Mapping (SLAM) algorithms and the sensory and motor devices should be carefully studied to limit the influence of this kind of errors on the high-level decision components. Regarding the equipment availability of the involved laboratories, the candidate platforms would be either IRobot or KheperaIII units.

Author Contributions

Conceptualisation, Methodology, Software, Validation, Analysis and Writing-original draft preparation, F.B.; Writing reviews and Supervision, E.G., C.C. and P.M.

Funding

This research was partially funded by the Comisión Sectorial de Investigación (CSIC-UdelaR) through its mobility program MIA.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following acronyms are used in this manuscript:

AAMO	Auto-Adaptive Multi-Objective approach
CC	Connected Component
CR	Coverage Ratio
EbC	Event-based Connectivity approach
DLR	Disconnection Last Ratio
DR	Dual Role approach
LoS	Line-of-sight
MANET	Mobile Ad-hoc NETwork
MDLR	Maximum Disconnection Last Ratio
MDP	Markov Decision Process
MORSE	Modular Open Robots Simulation Engine
MRS	Multi-Robot System
MP	MinPos approach
OSR	Over-Sensing Ratio
PL	Path length
POMDP	Partially Observable Markov Decision Process
SLAM	simultaneous Localisation and Mapping
TT	Total exploration Time
UAV	Unmanned Aerial Vehicle
Y	Yamauchi approach

References

Burgard, W.; Moors, M.; Stachniss, C.; Schneider, F. Coordinated multi-robot exploration. IEEE Trans. Robot. 2005, 21, 376–386. [Google Scholar] [CrossRef] [Green Version]
Wurm, K.; Stachniss, C.; Burgard, W. Coordinated multi-robot exploration using a segmentation of the environment. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008; pp. 1160–1165. [Google Scholar] [CrossRef]
Simmons, R.; Apfelbaum, D.; Burgard, W.; Fox, D.; Moors, M.; Thrun, S.; Younes, H. Coordination for Multi-Robot Exploration and Mapping; Aaai: Menlo Park, CA, USA; AAAI Press: Cambridge, MA, USA; MIT Press: London, UK, 1999/2000; pp. 852–858. [Google Scholar]
Elfes, A. Using Occupancy Grids for Mobile Robot Perception and Navigation. Computer 1989, 22, 46–57. [Google Scholar] [CrossRef]
Yamauchi, B. Frontier-based exploration using multiple robots. In Proceedings of the Second International Conference on Autonomous Agents AGENTS 98, Minneapolis, MN, USA, 10–13 May 1998; pp. 47–53. [Google Scholar] [CrossRef]
Keidar, M.; Kaminka, G.A. Efficient frontier detection for robot exploration. Int. J. Robot. Res. 2014, 33, 215–236. [Google Scholar] [CrossRef]
Keidar, M.; Kaminka, G.A. Robot Exploration with Fast Frontier Detection: Theory and Experiments. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 4–8 June 2012; pp. 113–120. [Google Scholar]
Yuan, J.; Huang, Y.; Tao, T.; Sun, F. A cooperative approach for multi-robot area exploration. In Proceedings of the IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010—Conference Proceedings, Taipei, Taiwan, 18–22 October 2010; pp. 1390–1395. [Google Scholar] [CrossRef]
Korsah, G.A.; Stentz, A.; Dias, M.B. A comprehensive taxonomy for multi-robot task allocation. Int. J. Robot. Res. 2013, 32, 1495–1512. [Google Scholar] [CrossRef]
Gerkey, B.P. A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems. Int. J. Robot. Res. 2004, 23, 939–954. [Google Scholar] [CrossRef] [Green Version]
Sheng, W.; Yang, Q.; Tan, J.; Xi, N. Distributed multi-robot coordination in area exploration. Robot. Auton. Syst. 2006, 54, 945–955. [Google Scholar] [CrossRef]
Zlot, R.; Stentz, A.; Dias, M.; Thayer, S. Multi-robot exploration controlled by a market economy. In Proceedings of the 2002 IEEE International Conference on Robotics and Automation, Washington, DC, USA, 11–15 May 2002; Volume 3, pp. 3016–3023. [Google Scholar] [CrossRef]
Cavalcante, R.C.; Noronha, T.F.; Chaimowicz, L. Improving combinatorial auctions for multi-robot exploration. In Proceedings of the 2013 16th International Conference on Advanced Robotics (ICAR), Montevideo, Uruguay, 25–29 November 2013; pp. 1–6. [Google Scholar] [CrossRef]
Kuhn, H.W. The Hungarian method for the assignment problem. Naval Res. Logist. 1955, 2, 83–97. [Google Scholar] [CrossRef] [Green Version]
Hollinger, G.A.; Singh, S. Multirobot coordination with periodic connectivity: Theory and experiments. IEEE Trans. Robot. 2012, 28, 967–973. [Google Scholar] [CrossRef]
Pham, V.C.; Juang, J.C. A multi-robot, cooperative, and active slam algorithm for exploration. Int. J. Innov. Comput. Inf. Control 2013, 9, 2567–2583. [Google Scholar]
Bautin, A.; Simonin, O. MinPos: A Novel Frontier Allocation Algorithm for Multi-robot Exploration. ICIRA 2012, 7507, 496–508. [Google Scholar]
Valentin, L.; Murrieta-Cid, R.; Muñoz-Gómez, L.; López-Padilla, R.; Alencastre-Miranda, M. Motion strategies for exploration and map building under uncertainty with multiple heterogeneous robots. Adv. Robot. 2014, 28, 1133–1149. [Google Scholar] [CrossRef]
Banfi, J.; Li, A.Q.; Basilico, N.; Rekleitis, I.; Amigoni, F. Asynchronous multirobot exploration under recurrent connectivity constraints. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 5491–5498. [Google Scholar] [CrossRef]
Amigoni, F.; Banfi, J.; Basilico, N. Multirobot Exploration of Communication-Restricted Environments: A Survey. IEEE Intell. Syst. 2017, 32, 48–57. [Google Scholar] [CrossRef]
Vazquez, J.; Malcolm, C. Distributed Multirobot Exploration Maintaining a Mobile Network. In Proceedings of the IEEE International Conference on Intelligent Systems, Varna, Bulgaria, 22–24 June 2004. [Google Scholar]
Rooker, M.N.; Birk, A. Multi-robot exploration under the constraints of wireless networking. Control Eng. Pract. 2007, 15, 435–445. [Google Scholar] [CrossRef]
Mosteo, A.R.; Montano, L.; Lagoudakis, M.G. Multi-robot routing under limited communication range. In Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, 19–23 May 2008; pp. 1531–1536. [Google Scholar] [CrossRef]
Le, V.T.; Bouraqadi, N.; Stinckwich, S.; Moraru, V.; Doniec, A. Making networked robots connectivity-aware. In Proceedings of the IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 3502–3507. [Google Scholar] [CrossRef]
Michael, N.; Zavlanos, M.M.; Kumar, V.; Pappas, G.J. Maintaining Connectivity in Mobile Robot Networks. Springer Tracts Adv. Robot. (STAR) 2009, 54, 117–126. [Google Scholar] [CrossRef] [Green Version]
Derbakova, A.; Correll, N.; Rus, D. Decentralized self-repair to maintain connectivity and coverage in networked multi-robot systems. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3863–3868. [Google Scholar] [CrossRef]
Pei, Y.; Mutka, M.W. Steiner traveler: Relay deployment for remote sensing in heterogeneous multi-robot exploration. In Proceedings of the IEEE International Conference on Robotics and Automation, Saint Paul, MN, USA, 14–18 May 2012; pp. 1551–1556. [Google Scholar] [CrossRef]
Laëtitia Matignon, L.J.; Mouaddib, A.I. Coordinated Multi-Robot Exploration Under Communication Constraints Using Decentralized Markov Decision Processes. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–26 July 2012. [Google Scholar]
Couceiro, M.S.; Figueiredo, C.M.; Rocha, R.P.; Ferreira, N.M. Darwinian swarm exploration under communication constraints: Initial deployment and fault-tolerance assessment. Robot. Autonomous Syst. 2014, 62, 528–544. [Google Scholar] [CrossRef]
Pralet, C.; Lesire, C. Deployment of Mobile Wireless Sensor Networks for Crisis Management: A Constraint-Based Local Search Approach. In Proceedings of the International Conference on Principles and Practice of Constraint Programming, Toulouse, France, 5–9 September 2014; pp. 870–885. [Google Scholar]
Jensen, E.A.; Nunes, E.; Gini, M. Communication-Restricted Exploration for Robot Teams; Workshop on Multiagent Interaction without Prior Coordination; Number Association for the Advancement of Artificial Intelligence (AAAI): Palo Alto, CA, USA, 2014; pp. 15–21. [Google Scholar]
Cesare, K.; Skeele, R.; Yoo, S.H.; Zhang, Y.; Hollinger, G. Multi-UAV exploration with limited communication and battery. In Proceedings of the IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 26–30 May 2015; Volume 2015, pp. 2230–2235. [Google Scholar] [CrossRef]
Magán-Carrión, R.; Camacho, J.; García-Teodoro, P.; Flushing, E.F.; Di Caro, G.A. A Dynamical Relay node placement solution for MANETs. Comput. Commun. 2017, 114, 36–50. [Google Scholar] [CrossRef]
Magán-Carrión, R.; Rodríguez-Gómez, R.A.; Camacho, J.; García-Teodoro, P. Optimal relay placement in multi-hop wireless networks. Ad Hoc Netw. 2016, 46, 23–36. [Google Scholar] [CrossRef]
Rahman, M.M.; Bobadilla, L.; Abodo, F.; Rapp, B. Relay vehicle formations for optimizing communication quality in robot networks. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 24–28 September 2017; pp. 6633–6639. [Google Scholar] [CrossRef]
Nestmeyer, T.; Robuffo Giordano, P.; Bülthoff, H.H.; Franchi, A. Decentralized simultaneous multi-target exploration using a connected network of multiple robots. Autonomous Robots 2017, 41, 989–1011. [Google Scholar] [CrossRef]
Banfi, J.; Quattrini Li, A.; Rekleitis, I.; Amigoni, F.; Basilico, N. Strategies for coordinated multirobot exploration with recurrent connectivity constraints. Autonomous Robots 2018, 42, 875–894. [Google Scholar] [CrossRef]
Bahl, P.; Padmanabhan, V.N. RADAR: An In-Building RF-based User Location and Tracking System. In Proceedings of the IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, Tel Aviv, Israel, 26–30 March 2000; pp. 775–784. [Google Scholar] [CrossRef]
Caccamo, S.; Parasuraman, R.; Freda, L.; Gianni, M.; Ogren, P. RCAMP: A resilient communication-aware motion planner for mobile robots with autonomous repair of wireless connectivity. In Proceedings of the IEEE International Conference on Intelligent Robots and System, Vancouver, BC, Canada, 24–28 September 2017; pp. 2010–2017. [Google Scholar] [CrossRef]
Fink, J.; Ribeiro, A.; Kumar, V. Robust Control of Mobility and Communications in Autonomous Robot Teams. IEEE Access 2013, 1, 290–309. [Google Scholar] [CrossRef]
MacQueen, J. Some Methods for classification and Analysis of Multivariate Observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA; 1967. [Google Scholar]
Frey, B.J.; Dueck, D. Clustering by passing messages between data points. Science 2007, 315, 972–976. [Google Scholar] [CrossRef] [PubMed]
Bautin, A.; Simonin, O.; Charpillet, F. SyWaP: Synchronized wavefront propagation for multi-robot assignment of spatially-situated tasks. In Proceedings of the 2013 16th International Conference on Advanced Robotics, ICAR, Montevideo, Uruguay, 25–29 November 2013. [Google Scholar] [CrossRef]
Bautin, A.; Simonin, O.; Charpillet, F. Towards a communication free coordination for multi-robot exploration. In Proceedings of the 6th National Conference on Control Architectures of Robots, CAR2011, Grenoble, France, 24–25 May 2011. [Google Scholar]
Yan, Z.; Fabresse, L.; Laval, J.; Bouraqadi, N. Metrics for performance benchmarking of multi-robot exploration. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Hamburg, Germany, 28 September–2 October 2015; pp. 3407–3414. [Google Scholar] [CrossRef]
Thrun, S.; Burgard, W.; Fox, D. Probabilistic Robotics (Intelligent Robotics and Autonomous Agents); The MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
Pal, A.; Tiwari, R.; Shukla, A. Communication constraints multi-agent territory exploration task. Appl. Intell. 2013, 38, 357–383. [Google Scholar] [CrossRef]
Satici, A.C.; Poonawala, H.; Eckert, H.; Spong, M.W. Connectivity preserving formation control with collision avoidance for nonholonomic wheeled mobile robots. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2013; pp. 5080–5086. [Google Scholar] [CrossRef]

Figure 1. Re-exploration caused by restricted communication. [20]. The yellow portion of the map is only known by robot B. Thus, robot A goes to re-explore the region beyond the red frontier.

Figure 2. Behaviour of the signal strength model. (a) The signal strength function

Γ_{i}

(dBm) is plotted for a [0..5] range of walls and [0..30] (m) range of distances; (b) Attenuation caused by distance; (c) Attenuation caused by wall interference.

Figure 2. Behaviour of the signal strength model. (a) The signal strength function

Γ_{i}

(dBm) is plotted for a [0..5] range of walls and [0..30] (m) range of distances; (b) Attenuation caused by distance; (c) Attenuation caused by wall interference.

Figure 3. Frontier points. The different cell types are identified according to the following colour code: dark blue cells are Obstacles, light blue cells are Unknown, green cells are Free, orange cells are FP cells, and yellow cells are tasks.

Figure 4. Disconnection events representation. The disconnection events

d E_{k}

can appear distributed along the exploration timeline. Its last is variable and depends on the movements realised by the fleet during the exploration. The starting and ending times of each disconnection are represented by the timestamps

t_{s k}^{i}

and

t_{e k}^{i}

, respectively.

Figure 4. Disconnection events representation. The disconnection events

d E_{k}

can appear distributed along the exploration timeline. Its last is variable and depends on the movements realised by the fleet during the exploration. The starting and ending times of each disconnection are represented by the timestamps

t_{s k}^{i}

and

t_{e k}^{i}

, respectively.

Figure 5. Path utility function behaviour. There are several tasks in the scene (blue circles). The closest is located 6 m away from the robot while the furthest is 36 m far away. The closest and furthest tasks always return

1.0

and

0.0

, respectively.

Figure 5. Path utility function behaviour. There are several tasks in the scene (blue circles). The closest is located 6 m away from the robot while the furthest is 36 m far away. The closest and furthest tasks always return

1.0

and

0.0

, respectively.

Figure 6. (a) Connectivity utility function shape; (b) A scene where the usefulness of the

Ω

function can be appreciated. Three robots (coloured dots) and several walls were arranged simulating an ongoing exploration process. Robots are surrounded with two lines of the same dot colour indicating sensory (dashed) and communication ranges (solid); (c) Shows the connectivity level map corresponding to (b). Therefore, as long as the will of another robot is to keep connected with the fleet, it would be able to take this perspective into account when deciding where going to.

Figure 6. (a) Connectivity utility function shape; (b) A scene where the usefulness of the

Ω

function can be appreciated. Three robots (coloured dots) and several walls were arranged simulating an ongoing exploration process. Robots are surrounded with two lines of the same dot colour indicating sensory (dashed) and communication ranges (solid); (c) Shows the connectivity level map corresponding to (b). Therefore, as long as the will of another robot is to keep connected with the fleet, it would be able to take this perspective into account when deciding where going to.

Figure 7. Two robots are carrying out an exploration mission. The communication and sensory ranges are drawn around the robots with red and green dashed lines, respectively. It is assumed that

R_{2}

has already chosen the task

T_{4}

whereas

R_{1}

is still selecting from

T_{1}, T_{2}

, and

T_{3}

. Dotted lines are used to show the sight-line between

R_{2}

and the tasks. The corresponding Euclidean distance is also shown. HO-Threshold is set to 6.

Figure 7. Two robots are carrying out an exploration mission. The communication and sensory ranges are drawn around the robots with red and green dashed lines, respectively. It is assumed that

R_{2}

has already chosen the task

T_{4}

whereas

R_{1}

is still selecting from

T_{1}, T_{2}

, and

T_{3}

. Dotted lines are used to show the sight-line between

R_{2}

and the tasks. The corresponding Euclidean distance is also shown. HO-Threshold is set to 6.

Figure 8. Two robots are carrying out an exploration mission. It is assumed that

R_{2}

has already chosen the task

T_{4}

whereas

R_{1}

is still making its decision. The communication and sensory ranges are drawn around the robots with red and green dashed lines, respectively. Dotted lines are used to show the sight-line between

R_{2}

and the tasks. The corresponding Euclidean distance is also shown. (a) [Cl/Co,F/NC] case: robot

R_{1}

is selecting from targets

T_{2}

that is the closest and keeps it connected and

T_{3}

that is the furthest and cause a disconnection; (b) [Cl/NC,F/NC] case: robot

R_{1}

is selecting from targets

T_{1}

—the closest—and

T_{2}

—the furthest—given that both targets cause a disconnection.

Figure 8. Two robots are carrying out an exploration mission. It is assumed that

R_{2}

has already chosen the task

T_{4}

whereas

R_{1}

is still making its decision. The communication and sensory ranges are drawn around the robots with red and green dashed lines, respectively. Dotted lines are used to show the sight-line between

R_{2}

and the tasks. The corresponding Euclidean distance is also shown. (a) [Cl/Co,F/NC] case: robot

R_{1}

is selecting from targets

T_{2}

that is the closest and keeps it connected and

T_{3}

that is the furthest and cause a disconnection; (b) [Cl/NC,F/NC] case: robot

R_{1}

is selecting from targets

T_{1}

—the closest—and

T_{2}

—the furthest—given that both targets cause a disconnection.

Figure 9. [F/Co,Cl/NC] case.

Figure 10. Benchmark scenarios. All terrains cover an 80 × 80 m

^{2}

flat surface with static obstacles (walls including the outer perimeter). Proposed as benchmarks in [45]. (a) Loop-like scenario; (b) Cross-like scenario; (c) Maze-like scenario.

Figure 10. Benchmark scenarios. All terrains cover an 80 × 80 m

^{2}

flat surface with static obstacles (walls including the outer perimeter). Proposed as benchmarks in [45]. (a) Loop-like scenario; (b) Cross-like scenario; (c) Maze-like scenario.

Figure 11. Robotic Agent Architecture. The first layer includes the software components that represent systems or devices through which the agent can interact with the environment. The second layer includes models and algorithms to keep the models up to date. The third layer includes the task identification and selection algorithms. Components on the shadowed zone were developed during this work.

Figure 12. Coverage ratio (CR) under ideal communication conditions. Both approaches achieve a coverage bigger than 99% of the terrain regardless of the fleet size.

Figure 13. Total exploration time (TT) under ideal communication conditions. Both approaches show a decreasing trend of TT as the fleet size increase. Nevertheless, the fact that the performance improvements are decreasing suppose the existence of a limit on the benefit from robots adding.

Figure 14. Path length (PL) under ideal communication conditions. The trend of PL is upward in both cases.

Figure 15. Robot placement setup at starting time. Robots are represented by black dots. The sensing scope of the robot placed right in the corner is represented by a grey area where it is possible to see the laser aces and the obstruction caused by some teammates. Robots are placed from the corner along the x and y axes. As the fleet size increase, new robots are placed next following the row of robots on each axis, alternately.

Figure 16. Over-sensing ratio (OSR) under ideal communication conditions. This shows how as fleet size increases the trend of OSR is upward as well. This is expected since the more robots sensing the environment the higher the probability of simultaneously sensing the same cells.

Figure 17. AAMO Coverage ratio. Regardless of how different the HO-Threshold values are, in all cases, the AAMO approach can cover more than

99 %

of the terrain.

Figure 17. AAMO Coverage ratio. Regardless of how different the HO-Threshold values are, in all cases, the AAMO approach can cover more than

99 %

of the terrain.

Figure 18. AAMO Total Exploration Time (TT) under non-ideal communication conditions and Degradation with respect to baseline results. (a) All AAMO instances show a decreasing trend of TT as the fleet size increase. The Yamauchi and MinPos approach results (coloured in purple and green, respectively) obtained under ideal communication conditions are placed together to make the comparison easier; (b) The degradation is expressed in terms of the difference between the TT achieved by each of the AAMO instances and the one achieved by the MinPos approach, for each fleet size.

Figure 19. AAMO Path length (PL) under non-ideal communication conditions and Degradation with respect to baseline results. (a) An increasing trend of PL is shown by all AAMO instances as the fleet size increase. The Yamauchi and MinPos approach results (coloured in purple and green, respectively) obtained under ideal communication conditions are placed together to make the comparison easier; (b) The degradation is expressed in terms of the difference between the PL achieved by each of the AAMO instances and the one achieved by the MinPos approach, for each fleet size.

Figure 20. The maximum amount of unassigned robots

| R^{u} |

in any connected component over time under different sized multi-robot systems. All images concern instances of AAMO set with HO-Threshold = 15. Blue dots represent the

| R^{u} |

(on average) that are simultaneously deciding along the exploration.

Figure 20. The maximum amount of unassigned robots

| R^{u} |

in any connected component over time under different sized multi-robot systems. All images concern instances of AAMO set with HO-Threshold = 15. Blue dots represent the

| R^{u} |

(on average) that are simultaneously deciding along the exploration.

Figure 21. Amount of unassigned tasks

| T^{u} |

over time for different sized multi-robot systems. All images concern instances of AAMO set with HO-Threshold = 15. The maximum

| T^{u} |

and the end of the dispersion stage are reached at the same time. Red dots represent the

| T^{u} |

considered by robots (on average) along the exploration.

Figure 21. Amount of unassigned tasks

| T^{u} |

over time for different sized multi-robot systems. All images concern instances of AAMO set with HO-Threshold = 15. The maximum

| T^{u} |

and the end of the dispersion stage are reached at the same time. Red dots represent the

| T^{u} |

considered by robots (on average) along the exploration.

Figure 22. Number of unassigned tasks

| T^{u} |

over time for different instances of the AAMO approach on 8-Robot systems. The maximum

| T^{u} |

and the end of the dispersion stage are reached at the same time. Red dots represent the

| T^{u} |

considered by robots (on average) along the exploration.

Figure 22. Number of unassigned tasks

| T^{u} |

over time for different instances of the AAMO approach on 8-Robot systems. The maximum

| T^{u} |

and the end of the dispersion stage are reached at the same time. Red dots represent the

| T^{u} |

considered by robots (on average) along the exploration.

Figure 23. Total Exploration Time (TT) under non-ideal communication conditions.

Figure 24. Path length (PL) under non-ideal communication conditions.

Figure 25. Disconnection Last Ratio (DLR) under non-ideal communication conditions. The bigger the HO-Threshold, the smaller DLR. This fact holds showing an oscillatory behaviour as the fleet size increase.

Figure 26. Network topology composition under non-ideal communication conditions averaged over time. Depending on the number of connected components and the fleet size, it is possible to study the existence of sufficient conditions to fall into isolation situations. For instance, for a 3-Robot fleet, the 2CC or 3CC topologies imply having at least one robot isolated while for a 5-Robot fleet this implication is related to 3CC, 4CC, or 5CC topologies, and so on.

Figure 27. Maximum Disconnection Last Ratio (MDLR) under non-ideal communication conditions. MDLR shows the longest individual isolation period registered by some fleet member along the exploration. The trend is oscillatory following the same pattern as the DLR indicator.

Figure 28. Over-sensing ratio (OSR) under non-ideal communication conditions. The Yamauchi and MinPos approach results (coloured in purple and green, respectively) obtained under ideal communication conditions are placed together to make the comparison easier.

Table 1. Task classification.

	Closest (Cl)	Furthest (F)
Connectivity	Closest (Cl)	Furthest (F)
Connected (Co)	Cl/Co	F/Co
Non-Connected (NC)	Cl/NC	F/NC

Table 2. Possible cases when selecting from two tasks.

	Cl/Co	F/Co	Cl/NC	F/NC
$T_{k}$	Cl/Co	F/Co	Cl/NC	F/NC
Cl/Co		$[\begin{array}{l} F / C o \\ C l / C o \end{array}]$		$[\begin{array}{l} F / N C \\ C l / C o \end{array}]$
F/Co	$[\begin{array}{l} C l / C o \\ F / C o \end{array}]$		$[\begin{array}{l} C l / N C \\ F / C o \end{array}]$
Cl/NC		$[\begin{array}{l} F / C o \\ C l / N C \end{array}]$		$[\begin{array}{l} F / N C \\ C l / N C \end{array}]$
F/NC	$[\begin{array}{l} C l / C o \\ F / N C \end{array}]$		$[\begin{array}{l} C l / N C \\ F / N C \end{array}]$

Table 3. Simulation setup.

Robot Features & Capabilities
Model	ATRV
Maximum speed ( $s_{i}$ )	1 (m/s)
Laser scan window	360 ( $^{\circ}$ )
Laser range	6 (m)
Laser resolution	3 ( $^{\circ}$ )
Communication parameters.
Range ( $c_{i}$ )	30 (m)
Wall attenuation factor ( $W_{a f}$ )	3.1 (dBm)
Distance attenuation factor ( $D_{a f}$ )	1.523
C factor	4 (walls)
Fleet features.
Heterogeneity	Homogeneous
Initial positions	Left Bottom corner
Environment features.
Terrain	80 × 80 (m $^{2}$ )
Wall height	2 (m)
Wall thickness	0.2 (m)
Corridor width	8 (m)
Grid Map features & parameters.
Mesh	Cartesian grid
Cell side	$2 r_{i}$
AAMO parameters.
$γ$	3
$ρ$	$2 \cdot (\| R \| - 1)$
HO-Threshold	20, 15, 10 (m).

Table 4. Yamauchi and MinPos results under ideal communication conditions on Maze environment.

FM		$\| R \|$
		1		2		3		4		5		8		10
		AVE	StD	AVE	StD	AVE	StD	AVE	StD	AVE	StD	AVE	StD	AVE	StD
TT	Y	1958	121.8	1288	255.9	902	128.6	791	125.6	647	78.5	516	41.3	459.5	41.5
TT	MP	1898	148.0	1044	110.1	779	72.8	615	52.0	505	48.8	496	19.7	482	37.4
PL	Y	1308	94.0	1581	157.4	1665	158.9	1831	225.7	1896	186.2	2093	95.9	2294	173.9
PL	MP	1268	59.0	1413	124.1	1420	85.4	1467	94.8	1592	107.8	2053	50.3	2438	140.7
CR	Y	99.1	0.09	99.0	0.06	99.0	0.04	99.1	0.05	99.0	0.04	99.0	0.05	99.1	0.04
CR	MP	99.0	0.04	99.1	0.06	99.1	0.06	99.1	0.05	99.1	0.05	99.1	0.07	99.1	0.07
OSR	Y	0.0	0.00	1.23	0.57	2.22	0.48	3.68	0.70	5.25	0.81	5.45	0.21	7.00	0.12
OSR	MP	0.0	0.00	0.93	0.19	2.31	0.38	2.94	0.25	4.37	0.50	5.45	0.11	7.00	0.08

Table 5. AAMO Results obtained under non-ideal communication conditions on Maze environment.

FM	Approach	$\| R \|$
		2		3		4		5		8
		AVE	StD	AVE	StD	AVE	StD	AVE	StD	AVE	StD
TT	Y	1216	131.7	904	73.6	759	78.2	653	53.3	496	86.5
	MP	1201	118.2	945	95.4	801	90.2	683	57.9	491	29.2
	EbC	1751	131.0	1600	190.7	1339	291.7	1028	154.6	750	89.0
	AAMO:20	1292	87.8	1100	88.1	913	94.9	767	104.0	661	108.2
	AAMO:15	1222	73.6	1100	130.1	823	79.0	723	65.7	606	85.8
	AAMO:10	1137	85.3	960	123.1	774	94.3	620	76.9	514	23.1
PL	Y	1592	144.2	1707	173.8	1842	190.0	1846	139.0	2216	290.7
	MP	1583	134.9	1744	177.7	1911	200.2	1960	173.8	2375	159.9
	EbC	2215	154.0	2929	323.5	3416	618.6	3181	415.1	3412	296.3
	AAMO:20	1726	114.4	2086	222.5	2243	236.1	2394	252.8	2782	376.1
	AAMO:15	1669	86.9	2056	207.8	2106	204.8	2263	219.2	2536	222.6
	AAMO:10	1542	119.9	1859	225.6	1982	215.5	2007	221.7	2279	248.5
CR	Y	99.2	0.33	99.4	0.37	99.3	0.33	99.3	0.24	99.2	0.20
	MP	99.3	0.35	99.4	0.29	99.4	0.33	99.6	0.34	99.2	0.20
	EbC	99.1	0.04	99.0	0.05	99.1	0.09	99.1	0.11	99.2	0.30
	AAMO:20	99.1	0.26	99.2	0.28	99.2	0.28	99.3	0.35	99.4	0.37
	AAMO:15	99.2	0.35	99.3	0.32	99.3	0.31	99.2	0.12	99.4	0.39
	AAMO:10	99.3	0.32	99.3	0.40	99.3	0.29	99.3	0.29	99.1	0.05
OSR	Y	13.39	6.72	20.21	17.71	28.95	19.00	23.28	13.94	6.44	1.28
	MP	20.49	13.22	28.72	18.85	30.46	19.77	27.98	18.51	6.10	0.87
	EbC	2.47	2.98	3.99	3.01	3.94	1.96	5.04	2.34	5.44	0.01
	AAMO:20	1.85	1.57	2.68	1.12	3.53	2.29	4.56	1.46	5.92	0.83
	AAMO:15	1.94	1.50	2.47	1.07	2.40	0.10	4.81	2.04	5.43	0.02
	AAMO:10	2.47	2.98	3.99	3.01	3.94	1.96	5.04	2.34	5.44	0.01
DLR	Y	77.0	9.10	85.9	6.35	79.7	7.47	77.1	5.94	60.7	9.38
	MP	81.7	7.69	87.2	7.26	80.2	8.28	77.0	8.52	59.6	10.74
	EbC	14.5	2.60	30.9	3.83	29.0	9.02	41.8	11.16	39.4	8.64
	AAMO:20	40.6	9.32	54.0	9.43	44.1	13.29	51.8	10.11	46.5	13.92
	AAMO:15	45.6	12.41	53.9	15.36	55.4	14.85	39.9	11.22	47.8	12.30
	AAMO:10	61.3	9.49	68.6	13.24	54.9	9.53	62.8	11.32	46.6	14.28
MDLR	Y	34.8	14.30	58.0	26.30	41.8	18.06	49.3	15.41	25.0	6.66
	MP	33.8	10.88	55.5	16.44	40.6	17.52	44.6	18.38	27.2	8.40
	EbC	3.3	0.39	6.6	1.70	6.7	2.99	12.7	6.90	11.0	3.84
	AAMO:20	17.8	10.47	21.7	8.96	15.9	7.61	24.2	12.99	18.0	8.88
	AAMO:15	19.2	6.49	22.2	9.65	24.4	13.17	15.2	6.62	17.4	4.36
	AAMO:10	27.2	10.11	33.6	10.95	20.0	5.72	24.9	9.15	22.9	8.61

Table 6. AAMO Robot Coincidence on Decision Making moments.

HO-Threshold	$\| R \|$	$\| R^{u} \|$ that are simultaneously making a decision
		1		2		3		4
		AVE	StD	AVE	StD	AVE	StD	AVE	StD
AAMO:10	3	0.959	0.01	0.041	0.01	$≃ 0$	$≃ 0$	n/a	n/a
	5	0.895	0.02	0.097	0.02	0.011	0.01	$≃ 0$	$≃ 0$
	8	0.807	0.02	0.153	0.04	0.037	0.02	$≃ 0$	$≃ 0$
AAMO:15	3	0.969	0.02	0.031	0.02	$≃ 0$	$≃ 0$	n/a	n/a
	5	0.929	0.02	0.068	0.03	0.003	0.01	$≃ 0$	$≃ 0$
	8	0.823	0.03	0.146	0.03	0.024	0.01	$≃ 0$	$≃ 0$
AAMO:20	3	0.968	0.02	0.032	0.02	$≃ 0$	$≃ 0$	n/a	n/a
	5	0.917	0.02	0.080	0.02	0.003	0.01	$≃ 0$	$≃ 0$
	8	0.838	0.02	0.148	0.03	0.018	0.01	$≃ 0$	$≃ 0$

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Benavides, F.; Ponzoni Carvalho Chanel, C.; Monzón, P.; Grampín, E. An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments. Appl. Sci. 2019, 9, 573. https://doi.org/10.3390/app9030573

AMA Style

Benavides F, Ponzoni Carvalho Chanel C, Monzón P, Grampín E. An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments. Applied Sciences. 2019; 9(3):573. https://doi.org/10.3390/app9030573

Chicago/Turabian Style

Benavides, Facundo, Caroline Ponzoni Carvalho Chanel, Pablo Monzón, and Eduardo Grampín. 2019. "An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments" Applied Sciences 9, no. 3: 573. https://doi.org/10.3390/app9030573

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

An Auto-Adaptive Multi-Objective Strategy for Multi-Robot Exploration of Constrained-Communication Environments

Abstract

Featured Application

Abstract

1. Introduction

1.1. Communication Issues

1.1.1. Connection Requirements

1.1.2. Communication Models

1.2. Connectivity-Based Proposals

1.3. Conclusions

1.4. Contributions

Ease to Deploy and Flexibility

Good Performance

1.5. Outline

2. Problem Formulation

2.1. Environment Model

2.2. Robot Model

2.3. Communication Model

2.4. Task Identification Method

2.5. Multi-Robot Task Allocation Problem—MRTA

2.6. Global Objectives

2.6.1. Full Coverage

2.6.2. Completion Time Optimisation

2.6.3. Isolation Avoidance

3. Auto-Adaptive Multi-Objective Task Selection Approach

3.1. Task Utility Function

3.2. Path Utility

3.3. Connectivity Utility

3.4. General Considerations

4. Adaptive α -Value Computation

4.1. Task Selection Framework

4.2. [Cl/Co,F/NC] and [Cl/NC,F/NC] Cases

4.3. [Cl/Co,F/Co] and [F/Co,Cl/NC] Cases

4.3.1. [F/Co,Cl/NC] case.

4.3.2. [Cl/Co,F/Co] Case

4.4. Considerations and Usefulness

5. Task Allocation Scheme

5.1. Coordination Method

5.2. Task Selection Algorithm

6. Baseline Statement and AAMO Approach Results

6.1. Simulation Setup

6.1.1. Scenarios

6.1.2. Robotic Agent Architecture

6.2. Figure of Merits

6.3. Baseline Statement

6.3.1. Collected Data

6.3.2. Baseline Assessment

6.3.3. Conclusions

6.4. AAMO Assessment

6.4.1. Collected Data

6.4.2. Effectiveness Assessment

6.4.3. AAMO vs. Baseline Comparison

6.4.4. AAMO Efficiency Assessment

7. Conclusions

7.1. Task Assessment Problem

7.2. Task Allocation Problem

7.3. Connectivity Maintenance Problem

7.4. Future Research Directions

Author Contributions

Funding

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

4. Adaptive $α$ -Value Computation