**1. Introduction**

The exploration problem is a fundamental subject in autonomous mobile robotics that deals with achieving complete coverage of a previously unknown environment. There are several scenarios where fully exploring a zone is a central part of the mission, e.g., planetary exploration, reconnaissance, search and rescue, agriculture, cleaning, or operation in dangerous places such as mined lands and radioactive zones. Additionally, due to the inherent qualities of multi-robot systems, mainly efficiency and robustness, exploration is usually done cooperatively [1].

Schematically, the exploration of an environment can be seen as the composition of *Mapping* and *Motion Planning* tasks. A map is needed in order to plan new motions. Moreover, choosing a correct motion sequence based on this map is also needed to expand the knowledge about the environment optimally. Consequently, *Mapping* is regularly interleaved with *Motion Planning*, and vice versa, during the whole process [1–3].

Given that a lack of knowledge is inherent to exploration missions, the best choice for the robots is to visit the places where the potential information gain is highest. The *Task Identification* problem concerns the identification of the points of interest that should be visited next. It strongly depends on both the robots' sensory capabilities and the underlying environment representation. The most widely used representation for this purpose is the well-known *Occupancy Grid* structure [4]. Based on it, a method to identify points of interest was proposed in [5]. The strategy assumes that the closer the tasks are defined to the frontier between known and unknown regions, the more information the team can gather. Since then, the majority of exploration proposals have adopted this scheme, known as *Frontier Points* or *Frontier Regions* [6–8].
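The frontier idea can be illustrated with a minimal sketch on a small occupancy grid. The cell encoding (`UNKNOWN`, `FREE`, `OCCUPIED`), the 4-neighbour connectivity, and the function name `frontier_cells` are assumptions made for this illustration and are not taken from [4] or [5].

```python
from typing import List, Tuple

# Assumed cell encoding for illustration only.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def frontier_cells(grid: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (row, col) of every free cell that touches an unknown cell.

    A cell counts as a frontier cell here if it is FREE and at least one of
    its 4-connected neighbours is UNKNOWN (an assumed, simplified criterion).
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbour is enough
    return frontiers


example_grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
print(frontier_cells(example_grid))  # [(0, 1), (2, 2)]
```

In this toy grid only the two free cells adjacent to the unknown column are reported, which is where a robot would be expected to gain new information.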

When multiple robots are involved, it is advisable to prevent several of them from moving to the same place. The *Task Allocation* problem concerns the search for a distribution of tasks to robots that maximises the overall system utility and minimises the amount of overlapping information obtained by the robots [1,9,10].

A wide variety of solutions to this problem have been proposed, of which the family of methods based on market economies is probably the most popular. These methods are based on the notion of *Auctions*, in which the robots bid for the tasks to decide who goes where at each moment. The market may be managed centrally, either by a virtual agent at the base station, as in [3], where the bids are processed centrally by a greedy algorithm, or by a robotic agent, as in [1]. Conversely, the fleet can exchange the bids among all its members in order to take decentralised decisions [11,12], which also avoids a single point of failure. All these methods owe their popularity to their simplicity and ease of implementation, but they suffer from a significant shortcoming: they can get trapped in local minima [13].
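As a rough illustration of a centrally processed auction, the sketch below collects one bid per robot-task pair and assigns tasks greedily, cheapest bid first. It is not the algorithm of [3] or [1]; using Euclidean distance as the bid and the name `greedy_auction` are assumptions for this example.

```python
from math import hypot
from typing import Dict, Tuple

Point = Tuple[float, float]


def greedy_auction(robots: Dict[str, Point], tasks: Dict[str, Point]) -> Dict[str, str]:
    """Assign at most one task per robot by accepting the cheapest bids first.

    Each bid is (cost, robot, task); the cost is the Euclidean distance
    between robot and task, which is an assumption of this sketch.
    """
    bids = sorted(
        (hypot(rp[0] - tp[0], rp[1] - tp[1]), r, t)
        for r, rp in robots.items()
        for t, tp in tasks.items()
    )
    assignment: Dict[str, str] = {}
    taken = set()
    for _cost, r, t in bids:
        if r in assignment or t in taken:
            continue  # robot already has a task, or task already sold
        assignment[r] = t
        taken.add(t)
    return assignment


robots = {"r1": (0.0, 0.0), "r2": (3.0, 0.0)}
tasks = {"a": (2.0, 0.0), "b": (10.0, 0.0)}
print(greedy_auction(robots, tasks))  # {'r2': 'a', 'r1': 'b'}
```

In this example the greedy result has a total travel cost of 1 + 10 = 11, whereas sending `r1` to `a` and `r2` to `b` would cost 2 + 7 = 9. This is a small illustration of how a greedy allocation can settle on a suboptimal distribution.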

Departing from economy-inspired approaches, a scheduling-based approach is presented in [2]. This method combines an environment segmentation technique with the centralised task allocation method proposed in [14]. The exploration is performed after dividing the environment into disjoint segments, so that the expected sensory overlap between agents is reduced as much as possible.

In [15] the authors address coordination implicitly through the exchange of localisation data. Robots are forced to wait for others before making a decision. Task selection is made iteratively, one robot after another, employing an objective function which rewards the right choices. In [16] a centralised approach is used. The tasks-to-robots distribution is computed by balancing information gain, localisation quality, and navigation costs. Another centralised approach computes a utility function that enables the robots to locally prioritise the tasks within their scope and, potentially, also enables the whole team to search for the best global distribution [9].

In contrast, a decentralised approach called *minPos* [17] attempts to spread the robots over the unexplored locations as much as possible. By doing so, it has outperformed several reference proposals, decreasing the exploration completion time for a large set of practical scenarios. The working principle is to rank robots according to their distance to every possible task. The robots coordinate their actions implicitly and may choose to visit the tasks for which they are best ranked at each point in time.
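The ranking principle described above can be sketched as follows. This is a simplified reading and not the implementation of [17]. A robot's rank for a task is taken to be the number of other robots strictly closer to that task. Euclidean distance, tie-breaking by the robot's own distance, and the name `rank_based_choice` are assumptions of this sketch.

```python
from math import hypot
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def rank_based_choice(robot: str, robots: Dict[str, Point],
                      tasks: Dict[str, Point]) -> Optional[str]:
    """Pick the task for which `robot` is best ranked among all robots.

    Rank = number of other robots strictly closer to the task (assumed
    definition). Ties are broken by the robot's own distance, then by name.
    Returns None when there are no tasks.
    """
    def dist(p: Point, q: Point) -> float:
        return hypot(p[0] - q[0], p[1] - q[1])

    best = None
    for t, tp in tasks.items():
        own = dist(robots[robot], tp)
        rank = sum(
            1 for other, op in robots.items()
            if other != robot and dist(op, tp) < own
        )
        key = (rank, own, t)
        if best is None or key < best:
            best = key
    return None if best is None else best[2]


robots = {"r1": (0.0, 0.0), "r2": (4.0, 0.0)}
tasks = {"a": (3.0, 0.0), "b": (-5.0, 0.0)}
print(rank_based_choice("r1", robots, tasks))  # b
print(rank_based_choice("r2", robots, tasks))  # a
```

Here `r1` selects `b` even though `a` is nearer to it, because `r2` is closer to `a`. Each robot can evaluate this rule on its own using only the positions of its teammates, which matches the implicit coordination described in the text.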

Finally, the strategy described in [18] is mainly devoted to dealing with uncertainties in the sensing and motion processes of a multi-robot system. To this end, the authors model the exploration and mapping problem as a partially observable Markov decision process (POMDP) that is solved centrally. In [19] the assignment algorithm works asynchronously, on the assumption that not all robots need to be ready for new plans at the same time.

Wireless communication plays an important role in collaborative multi-robot strategies. Unfortunately, the assumption or requirement of stable communication and end-to-end connectivity may be easily compromised in real scenarios due to interference, fading, or simply robots moving beyond the communication range. When robots are disconnected they have no way to coordinate their actions, and damage or internal failures can lead to information loss. Therefore, depending on the application field, the exploration strategy should take this into account to prevent robots from becoming isolated.
