1. Introduction
The use of autonomous mobile robots has increased across many fields in recent years. Mobile robots are used in industry, medicine, search and rescue, and other applications. Most of these applications require a team of robots to complete the task efficiently and in less time than their human counterparts [1]. However, the symmetrical shape of some environments hinders optimal path planning. Multiple robots are expected to explore more area in less time while solving robot localization and collision-avoidance issues. When a team of mobile robots operates together, the robots must be kept safe from collisions with each other and with their surroundings. Collision avoidance for multi-robot systems is the focus of many current studies. When deploying a multi-robot system, it must be ensured that the robots do not collide with each other or with the surroundings, especially in symmetric environments. There are two main approaches to collision avoidance in multi-robot systems: centralized and decentralized [2]. The centralized approach is efficient only for small groups of robots; for large groups, the decentralized approach is more effective because it is computationally less expensive. Yang [3] introduced a decentralized method that uses a triangular grid pattern, relying on previously explored maps and information from neighbors to avoid collisions.
In a decentralized approach, the agents need a communication system to share their current positions and velocities. Various types of communication setups have been introduced in recent years. The Alternating Direction Method of Multipliers (ADMM) used by Rey et al. [
4] works so that each agent communicates with the neighboring agents, creating a local coordinate system. The setup can manage transmission from all the agents simultaneously. Being a decentralized network, it can manage frequent setup changes and failures. Rodríguez et al. [
5] used the velocity information of each agent to create a cooperative communication setup. The relative velocity of agents was then used to decrease an individual agent’s detection region, resulting in collision avoidance. A scheme suggesting the use of a shared memory-driven device for coordination and communication was presented by Pesce et al. [
6]. Each agent utilizes the shared device to learn from the collective private observations and share relevant information with other agents. Deep reinforcement learning and Voronoi cell setup were proposed in Wang et al. [
7] and Nguyen et al. [
8] to ensure cooperative behavior among the agents. Another approach to collision avoidance is to use a local or global path planner. These approaches use path-planning strategies such as GMapping [
9] to avoid collisions [
10,
11]. Socially aware path-planning strategies have gained popularity recently. Avoidance of robot-to-pedestrian collisions, human-like speed of motion, and navigation through dense environments require carefully planned trajectories [
12,
13]. When planning a trajectory, the structure of the environment also plays a vital role in collision avoidance. It is essential to explore the environment carefully to maintain a low time and energy cost while reaching the target. Some task planner techniques have been presented to address this issue [
14]. However, very few studies have discussed cluttered and confined environments [
15,
16].
While searching the databases for this SLR, it was observed that most studies on collision avoidance concern aerial vehicles [17,18,19], surface vehicles [20,21,22], ships [23,24,25,26,27], and underwater vehicles [15,28,29]. However, the focus of this SLR is only on ground vehicles. By carefully scanning the titles and abstracts, any study using vehicles other than ground vehicles was excluded. In this SLR, 17 studies from 2015 to date were selected after systematic screening. All these studies address the collision-avoidance issue for multi-robot systems (ground vehicles) directly or indirectly. These studies propose new algorithms such as CADRL or extend classic techniques such as ORCA to avoid collisions. Although various studies were conducted before 2015, they rarely addressed collision avoidance in a decentralized manner for confined spaces, and many attempted to resolve the issue for no more than two agents. In this SLR, we select only those studies that use decentralized approaches for a large group of agents in confined-space scenarios. Some studies experimented with up to a hundred agents in different scenarios [2], while others considered techniques such as ECBS for a group of agents [30].
This paper aims to retrieve the most relevant studies on collision avoidance for multi-robot systems. A thorough literature review has been conducted to provide a critical understanding of the collision-avoidance issue for a group of autonomous ground mobile robots. This paper answers the following research questions:
RQ1. What is the number of primary studies on collision avoidance in the multi-robot system since 2015?
RQ2. How do researchers report the implemented techniques for avoiding collisions in a multi-robot system?
RQ3. How does a collision-avoidance technique solve coordination issues?
RQ4. What are the limitations of the current research on collision avoidance in multi-robot systems?
RQ5. What are the main directions of future research and possible solutions in multi-robot systems?
2. Protocol for Identifying Related Works on Collision Avoidance for Multi-Robot Systems
The method used by Okoli et al. [
31,
32] is used as a guideline to answer RQ1 by extracting the maximum possible detail from the studies on collision avoidance for multi-robot systems. This method involves planning, selection, extraction, and execution of the review. The inclusion criteria were that a study must address collision avoidance for multi-robot systems (more than two agents) and must be evaluated in multiple scenarios, at least through simulation. All the included works provided at least one successful solution for avoiding collisions. They experimented with as many as a hundred agents in circular, hallway, and deadlock arrangements [
2,
13,
33]. Some of the works indirectly addressed the issue of planning a collision-free path for a team of robots [
34,
35].
Eight electronic databases were selected after considering expert opinions on literature searching. Another critical factor in the selection was the availability of reliable and relevant data. The following is the list of selected databases: Scopus, Web of Science, Springer Link, IEEE Xplore, ACM Digital Library, and ProQuest. The academic search engine Google Scholar was also used to expand the search, as it is one of the most comprehensive sources. The keywords used for the search string were “Collision Avoidance,” “Decentralized,” “Multi-robot,” “Drone,” “Aerial,” and “Rotor.” Appropriate Boolean operators were used to ensure meaningful search results: the “AND,” “OR,” and “NOT” operators narrowed down the search, while in Google Scholar the “-” operator, which is equivalent to NOT, was used. As every database has its own search algorithm, the keywords and Boolean operators were modified accordingly, and each database was searched with a slightly different arrangement of keywords.
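As an illustration of how such a search string might be assembled, the short Python sketch below joins keywords with Boolean operators. The function name `build_query`, the split into required and excluded keywords, and the exact operator syntax are assumptions made for illustration only; the review does not report the literal strings submitted to each database.

```python
def build_query(required, excluded, not_operator="NOT"):
    """Join required terms with AND and append excluded terms.

    Hypothetical helper: real databases each have their own query syntax.
    """
    query = " AND ".join(f'"{term}"' for term in required)
    for term in excluded:
        if not_operator == "-":
            # Google Scholar style: a leading minus excludes a term.
            query += f' -"{term}"'
        else:
            query += f' {not_operator} "{term}"'
    return query


# Assumed grouping: ground-robot topic terms are required,
# aerial-vehicle terms are excluded.
REQUIRED = ["Collision Avoidance", "Decentralized", "Multi-robot"]
EXCLUDED = ["Drone", "Aerial", "Rotor"]

database_query = build_query(REQUIRED, EXCLUDED)
scholar_query = build_query(REQUIRED, EXCLUDED, not_operator="-")
print(database_query)
print(scholar_query)
```

Keeping the keyword lists separate from the operator syntax makes it straightforward to regenerate the string for a database that expects a different arrangement.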
The screening process followed to retrieve the most relevant articles involved three steps. Initially, 852 papers were retrieved from all the databases. After duplicates were removed, 692 papers were left. To narrow them down further, each study’s title was scanned to ensure it was related to the topic. Papers whose titles contained keywords pointing to issues other than multi-agent collision avoidance were excluded. After this first screening, 121 of the 692 papers remained for further investigation. For the second screening, the abstracts and keywords were examined more closely to identify the papers most relevant to the topic. The inclusion criteria for the second screening are shown in
Table 1.
The second screening resulted in 42 papers, which underwent full-text screening for quality appraisal and to ensure their relevance to collision avoidance in multi-robot systems. We analyzed whether the included papers placed sufficient emphasis on the topic and adequately answered the research questions. The main focus of the analysis was each paper’s aim, description, and methodology. A comprehensive summary of the screened papers was assembled to answer specific research questions thoroughly. This final screening resulted in 17 papers out of 852 that fulfilled all the criteria for inclusion in this review. The number of articles retained from each database after each screening step is shown in
Figure 1.
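The counts reported for the screening process (852, 692, 121, 42, and 17 papers) can be checked with simple arithmetic. The Python sketch below computes how many papers were excluded at each step and the overall retention rate; the stage labels are paraphrases of the text, and the helper name `excluded_per_stage` is hypothetical.

```python
# Paper counts after each screening stage, as reported in the text.
stages = [
    ("retrieved from all databases", 852),
    ("duplicate removal", 692),
    ("title screening", 121),
    ("abstract and keyword screening", 42),
    ("full-text screening", 17),
]


def excluded_per_stage(stages):
    """Return (stage label, number of papers excluded at that stage)."""
    pairs = zip(stages, stages[1:])
    return [(label, before - after) for (_, before), (label, after) in pairs]


removed = excluded_per_stage(stages)
for label, count in removed:
    print(f"{label}: {count} papers excluded")

overall_retention = stages[-1][1] / stages[0][1]
print(f"overall retention: {overall_retention:.1%}")
```

Under these counts, roughly 2% of the initially retrieved papers reach the final selection.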
3. Primary Studies on Collision Avoidance in the Multi-Robot System from 2015 to 2021
An overview of the systematic screening for each source is presented in
Table 2.
We present our analysis of the distribution of the papers by publication year (Figure 2), country (Figure 3), and publication type (Figure 4), along with statistical results from the papers. Across the databases, most of the publications matching the keywords were indexed by ProQuest. The literature search, which covered publications from 2015 onward, found no papers published in 2016 or 2018 on this particular topic.
After careful screening, the resulting 17 studies were considered the primary works on decentralized collision avoidance for multi-robot systems during the last seven years. The papers were divided into two classes: analytical and empirical. Careful analysis revealed that most of the studies used a combination of analytical and empirical approaches. Two of the studies used only the theoretical approach to validate their results: they stated theorems and proved them to validate their proposed model [
36]. The rest of the studies used both approaches to prove their proposed solutions. The validation of the proposed algorithms was conducted using different simulation setups. Only one of the studies used real-time experiments to validate the model. None of the studies used the quantitative approach, as it was unsuitable for the issue under consideration. Three studies innovated and applied existing models and used a mixed approach to validate their proposition [
37,
38,
39]. No specific studies focused on purely qualitative approaches or case studies, as these methods are not suitable for the topic. The topic of collision avoidance in a multi-robot system needs a more practical approach than a theoretical one and needs experiments to apply and validate the results.
4. Characteristics of Collision Avoidance in Multi-Robot System Techniques
This section discusses RQ2, which concerns the main characteristics of the collision-avoidance techniques for multi-robot systems, grouped into reinforcement learning (RL) and non-reinforcement learning (non-RL) approaches. It was observed that only a few studies address the issue directly, while others propose a path-planning algorithm that avoids collisions. Each study considered the hardware specifications of the agent to implement the proposed model effectively. Different real-world scenarios were evaluated to narrow the gap between simulation and real-world deployment. A summary of each related study follows. The technique, experimental design, benchmark, strengths, and weaknesses of all selected studies are summarized in
Table 3.
For non-RL approaches, eight different studies are selected with various agents in both open and confined spaces. Most studies used ORCA as their benchmark evaluation. Cap et al. [
33] address the problem of finding a collision-free trajectory for an agent in a dynamic environment. The setup considered is an infrastructure with agents already performing tasks when a new task is assigned to an individual agent. The proposed algorithm implements a token system and plans a global trajectory considering all the agents. Meanwhile, Dergachev et al. [
30] suggest coordinating sub-groups of the agents that appear to be deadlocked, using locally confined multi-agent pathfinding (MAPF) solvers. The limitation of this model is its assumption that each agent has prior knowledge of the environment and its failure to perform in uncertain situations. Arul et al. [
39] used buffered Voronoi cells (BVC) and reciprocal velocity obstacles (RVO) to develop a collision-avoidance method for dense environments. To calculate a local collision-free path for each agent, a suitable direction is first computed by superimposing the BVC and the RVO cones. However, like earlier decentralized methods, this method does not guarantee deadlock resolution. Other studies focus on alternative approaches that avoid the need for global communication among robots [
33,
41,
45].
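To make the buffered Voronoi cell idea mentioned above more concrete, the following Python sketch tests whether a candidate position lies inside an agent's BVC. It uses the commonly stated definition, in which each bisector between the agent and a neighbor is pulled back toward the agent by a safety radius. The function name, the 2D tuple representation, and the parameter `safety_radius` are assumptions for illustration; this is not the implementation used by Arul et al. [39].

```python
import math


def in_buffered_voronoi_cell(point, own_pos, neighbor_positions, safety_radius):
    """Return True if `point` lies in the buffered Voronoi cell of `own_pos`.

    For every neighbor, the point must stay at least `safety_radius`
    on the agent's side of the perpendicular bisector.
    """
    px, py = point
    ox, oy = own_pos
    for nx, ny in neighbor_positions:
        dx, dy = nx - ox, ny - oy            # vector from agent to neighbor
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            return False                      # coincident agents: no valid cell
        mx, my = (ox + nx) / 2.0, (oy + ny) / 2.0   # midpoint on the bisector
        # Signed distance of the point past the bisector, toward the neighbor.
        signed = ((px - mx) * dx + (py - my) * dy) / dist
        if signed + safety_radius > 0.0:
            return False
    return True


own = (0.0, 0.0)
neighbors = [(4.0, 0.0)]
print(in_buffered_voronoi_cell((1.0, 0.0), own, neighbors, 0.5))
print(in_buffered_voronoi_cell((1.8, 0.0), own, neighbors, 0.5))
```

If every agent only moves to points inside its own buffered cell, any two agents remain separated by at least twice the safety radius, which is why the construction needs only neighbor positions and no global communication.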
Mao et al. [
40] presented a collision-avoidance approach by considering the non-holonomic constraints of the agents. The proposed method is cheaper than PB-NC-DMPC, as it does not use central coordination or rely on communication among the robots. Another study by Wei et al. [
41] proposed altruistic coordination where each robot is ready to make concessions whenever in congested situations. It is demonstrated that when robots face a congested situation, they can implement waiting, moving forward, dodging, retreating, and turning-head strategies to make local adjustments. Another approach using robot graph exploration is proposed by Nagavarapu et al. [
36], in which no direct communication is needed to avoid collisions. A data structure is proposed to provide efficient information exchange. A modified Multi-Robot Depth First Search (MR-DFS) strategy is used to achieve better execution than other tree strategies. Zhang et al. [42] suggest a technique that uses velocity information in two cooperative strategies to decrease the effective detection regions of the vehicles for an arbitrarily large number of agents. Another approach using prioritization is presented by Das et al. [
34], where agents intentionally disclose their information in order to become prioritized. Competitive robots take part in spot auctions, where they show their willingness to pay the price to obtain access to the desired location. The results show that the proposed method can manage dynamic arrival without compromising the path-length optimality too much. Zhang et al. [
42] propose an obstacle-avoidance method that incorporates virtual center points, implemented in a distributed manner, which are set based on the current state of the nearby robots and the agent itself. The stability of the system is proved using a Lyapunov function. Two control modes, the obstacle-free mode and the obstacle-avoidance mode, are used for the robots and are switched using a direct signal.
Researchers have applied several reinforcement learning approaches for decentralized collision avoidance. Chen et al. [
13] developed a method that applies deep reinforcement learning to offload online computation to an offline learning procedure that predicts interaction patterns. A value network is developed that estimates the time to the goal from each agent’s joint configuration. This value network also admits efficient queries for finding a collision-free velocity vector while accounting for the uncertainty in other agents’ motion. However, some robots become stuck when the obstacle field is so dense that traps and dead ends form. One effective method to resolve the dead-end issue is presented by Lin et al. [
44]. Meanwhile, Li et al. [
37] presented a continuous action space-based algorithm. In this method, only the positions and velocities of nearby agents are observed by each agent. A solution of simple convex optimization with safety constraints from ORCA is implemented to resolve the multi-robot collision-avoidance problem. The training process of the proposed approach is much faster than other RL-based collision-avoidance algorithms. Fan et al. [
2] designed a decentralized sensor-level collision-avoidance policy. The movement velocity for an agent’s steering commands is directly drawn from raw sensor measurements. The technique used here is policy-gradient-based reinforcement learning. The said technique is integrated into a hybrid control framework to improve the policy’s robustness and effectiveness. Liang et al. [
12] used a depth camera together with a 2D LIDAR as multiple perception sensors to detect nearby dynamic agents and calculate collision-free velocities. The navigation model learned by the agents transfers directly to previously unseen virtual environments and dense real-world environments. However, in the case of glass or nonplanar surfaces, the sensors fail to perform accurately.
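The ORCA safety constraints mentioned above for Li et al. [37] take the form of half-planes in velocity space: each neighbor contributes one linear constraint on the velocities an agent may choose. As a minimal sketch of how a learned action could be made safe, the Python function below projects a preferred velocity onto a single half-plane. The function name and the restriction to one constraint are simplifying assumptions; with several neighbors, a small linear or quadratic program over all half-planes would be needed, and the exact formulation used in [37] is not reproduced here.

```python
def project_onto_half_plane(velocity, point_on_line, normal):
    """Closest velocity to `velocity` with (v - point_on_line) . normal >= 0.

    `normal` is assumed to be a unit vector pointing into the allowed side.
    """
    vx, vy = velocity
    px, py = point_on_line
    nx, ny = normal
    # Signed distance from the boundary line; negative means unsafe.
    signed = (vx - px) * nx + (vy - py) * ny
    if signed >= 0.0:
        return velocity                       # already satisfies the constraint
    # Move back to the boundary along the normal direction.
    return (vx - signed * nx, vy - signed * ny)


# Hypothetical constraint: the x component of velocity must not exceed 0.5.
safe = project_onto_half_plane((1.0, 0.0), (0.5, 0.0), (-1.0, 0.0))
unchanged = project_onto_half_plane((0.2, 0.3), (0.5, 0.0), (-1.0, 0.0))
print(safe, unchanged)
```

The projection changes the preferred velocity as little as possible, so the learned policy keeps control whenever its action is already safe.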
Bae et al. [
43] suggested a combination of deep Q-learning and a convolutional neural network (CNN). This combination enhances the learning algorithm and analyzes the situation more efficiently. Depending on the given situation, the agents can act independently or collaboratively. A memory regeneration technique is used to reuse experience data and reduce correlations between samples, which improves data efficiency. The presented method uses image-processing techniques such as object recognition [
46,
47] to obtain the robot’s location. However, an unnecessary movement occurs in an environment where the generated path is simple or without obstacles. Lin et al. [
44] propose a method with a geometric centroid of the robot team, which avoids collisions while maintaining connectivity using Deep RL. The proposed model can sometimes fail to predict the dead-end scenario effectively, which can cause the agent formation to take extra time to reach the goal. Cai et al. [
38] suggest combining multi-agent reinforcement learning (MARL) with decentralized Control Barrier Function (CBF) shields based on available local information. They extended Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to MADDPG with decentralized multiple Control Barrier Functions (MADDPGCBF). In the proposed approach, each agent has its own CBFs. These comprise cooperative CBFs and non-cooperative CBFs, which deal with the respective types of agents.
4.1. Experimental Setup for Decentralized Multi-Robot Collision Avoidance
This section discusses the main characteristics of the experimental approach and the hardware specifications of the agents used to implement the proposed models. Different real-world scenarios were tested to narrow the gap between simulation and real-world deployment. Based on the selected papers, researchers considered several essential scenarios: circle, random, roundabout, dense crowds, narrow corridors, obstacle shapes, intersections, edge traversals, deadlock situations, static and dynamic obstacles, and indoor environments. The environmental setup, scenarios, platform, and number of robots used by each study are summarized in
Table 4. The distributions of the papers per number of robots and environment are illustrated in
Figure 5 and
Figure 6.
4.2. Evaluation Measures
Our analysis identifies nine main evaluation parameters used by previous studies, as presented in
Table 5. We observed that the four main parameters for multi-robot collision avoidance are success rate, travel distance, time cost, and velocity. These parameters are crucial to evaluate the efficiency and successful implementation of the approaches. The other measurements are position error, minimum separation distance, cumulative reward, planning time and search scope.
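As a hedged illustration of how several of these measures could be computed from logged trajectories, the Python sketch below derives travel distance, average speed, success rate, and minimum separation distance. The function names, the representation of a trajectory as a list of 2D positions sampled at a fixed time step, and the sample data are all assumptions for illustration; the reviewed studies each define their own evaluation procedures.

```python
import math


def path_length(trajectory):
    """Travel distance: sum of distances between consecutive positions."""
    return sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))


def average_speed(trajectory, time_step):
    """Travel distance divided by elapsed time (needs at least two samples)."""
    duration = (len(trajectory) - 1) * time_step
    return path_length(trajectory) / duration


def success_rate(outcomes):
    """Fraction of trials in which the agent reached its goal without collision."""
    return sum(outcomes) / len(outcomes)


def minimum_separation(traj_a, traj_b):
    """Smallest distance between two agents at matching time steps."""
    return min(math.dist(a, b) for a, b in zip(traj_a, traj_b))


# Hypothetical trajectories sampled every 0.5 s.
traj_a = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
traj_b = [(0.0, 2.0), (3.0, 6.0), (6.0, 9.0)]

print(path_length(traj_a))
print(average_speed(traj_a, time_step=0.5))
print(minimum_separation(traj_a, traj_b))
print(success_rate([True, True, False, True]))
```

Time cost follows directly from the number of samples and the time step, so it is not given a separate function here.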
5. Collision-Avoidance Techniques to Solve Coordination Issues
To answer RQ3, we analyzed the primary works presenting collision-avoidance techniques for multi-robot systems while considering the coordination issues. The techniques studied in this work focus on solutions for a group of more than two agents. The initial search returned a long list of studies on collision avoidance in multi-robot systems. Most of these studies addressed manipulators or warehouse scenarios, which were unsuitable for this work. Studies that addressed agents other than ground robots were excluded, which left around 121 studies. These studies were further screened by excluding those that focused on aspects of the multi-robot system other than collision avoidance, such as trajectory optimization or navigation issues [
10,
49]. Studies that addressed only the coordination issue without considering collision avoidance were excluded [
7]. The final screening resulted in 17 carefully selected studies for relevance to the topic, each presenting a unique collision-avoidance strategy.
Collision avoidance is essential when dealing with multi-robot systems. It covers collisions with obstacles and collisions among the agents. The two types of methods used most often are centralized and decentralized approaches. The decentralized approach is computationally inexpensive and enables each agent to be more independent. Another division is between classical and reactive approaches [
50]. However, this review provides information about effective decentralized techniques, classical or reactive. This review is designed according to the guidelines provided by Okoli [
31], which resulted in the final selection of 17 papers. All the studies included are focused on finding and developing an effective strategy to avoid collisions in a multi-robot system. Several methods, including deep reinforcement learning, fuzzy logic, and supervised learning, are used as a base to validate or apply the proposed strategies. Most of the studies are focused on application in an open-space environment with static or dynamic obstacles, while confined-space scenarios are not often studied. Environments such as AC ducts and sewers need to be explored even more, as these environments offer different types of hurdles for a multi-robot system compared to open-space environments. Only 2 of the 17 studies addressed the deadlock situation [
2,
5], which can appear when agents need to swap their positions or cross a narrow entrance, while others fail to perform in such scenarios [
45]. Some of the crucial criteria for decentralized multi-robot collision avoidance are summarized as the following:
Coordination strategy: Several efficient coordination systems have produced successful collision avoidance for multi-robot systems. In the velocity obstacle method, robots transmit and receive each other’s states and intentions via an altruistic coordination network. A token-passing technique uses a synchronized shared memory that holds all robots’ current trajectories, and learning-based methods learn a value function that encodes collaborative behaviors. Using processed LIDAR data, agents can coordinate through the centroid of the robot team, which acts as a beacon for exchanging information among the robots. With decentralized control barrier functions, local information received through an agent’s sensors can be shared with other robots nearby.
Traversable region detection region: A unique application of locally confined multi-agent pathfinding (MAPF) solvers is suggested by Dergachev et al. [
30]. This approach presents a way to build a grid-based MAPF instance to avoid deadlocks. Through the learned policies, the robots use local observations of each robot and traversable detection region to collaboratively plan the move to accomplish the team’s navigation task. A proper data-structure-based technique is essential for providing efficient information exchange, as suggested by Nagavarapu et al. [
36]. Combining multiple perception sensors, namely a 2D LIDAR and a depth camera, enabled the agents to sense the dynamic agents in their surroundings and compute collision-free velocities. An approach using Voronoi cells and RVO cones provides an efficient calculation of a collision-free direction for each agent. On the other hand, exploiting the velocity information resulted in less complicated, collision-free trajectories through the traversable region-detection region. Furthermore, virtual center points implemented in a distributed manner should be set based on the current state of nearby robots and the agent itself.
Optimal multi-robot trajectories: While ensuring optimal collision-free navigation, agents are also required to maintain a coordination link within their coordinated network to lower the communication overhead in decentralized systems. One essential factor is detecting nearby dynamic agents and calculating collision-free velocities. Early detection and trajectory prediction result in a higher success rate, with less time and a shorter trajectory to the goal. By detecting the obstacle’s direction and treating the obstacle as the leader, the agent can be trained to maintain a predefined distance using formation control.
Adaptability: One of the essential criteria for successful application is adaptability, especially in the deadlock situations that often occur in multi-robot systems. A few studies faced a problem where agents became stuck when deployed in a dense environment. In some studies, agents also failed to navigate in an uncertain environment. In some scenarios, only static obstacles were considered while training the agents, making it challenging to address inter-agent collisions. Overall, not a single study was able to address all the issues faced when designing a collision-avoidance algorithm. However, deadlocks and inter-agent collisions are the two main issues that need to be addressed to develop an efficient collision-avoidance model.
6. Limitations of the Current Research on Collision Avoidance in Multi-Robot Systems
To answer RQ4, we discuss the challenges faced by the methods proposed in previous studies. The most common issues mentioned were dealing with dynamic obstacles [
40], congested situations [
41], narrow passages [
33], convergence time, deviation [
5] and deadlocks [
2,
42]. These issues are reported to arise when the obstacles are other agents, when more than one agent occupies one spot, when the path is not wide enough, when searching for the optimal path, and when more than one agent wants to cross the same spot. The listed issues are addressed by testing the proposed algorithms in multiple scenarios with different levels of complexity, and promising results are reported. A summary of challenges and future studies on decentralized multi-robot collision avoidance is presented in
Table 6.
Decision-Making Dynamic Obstacles: There can be two types of obstacles in a multi-robot system: static and dynamic. Dynamic obstacles include the other agents in the team. These obstacles can cause a problem because, in a decentralized approach, each agent makes its own decision, so more than one agent can decide to take the same action. This problem was explored by Mao et al. [
40] using linear equations of motion to design the state-space.
Congested situations: Congested situations are not widely studied when considering a multi-robot system. It is highlighted by Wei et al. [
41] that even when a researcher considers a congested scenario, the agents are still connected to a shared database such as a reservation table [
51] and conflict-avoidance table [
52]. Multiple scenarios with very congested situations were successfully tested by Chen et al. [
13] using Multi-Robot Cooperative Pathfinding (MRCP) based on the ideas from Ryan et al. [
53].
Narrow passages: When developing a collision-avoidance strategy for multiple agents, it is preferable to consider real-life scenarios that an agent can encounter. One such scenario is a narrow path, where passing or overtaking other agents is difficult. This becomes even more complex if the path contains obstacles. Some solutions such as prioritized planning [
54] and complete polynomial algorithms [
55,
56] were presented, but they fail in complex environments. An innovative solution for this issue is presented by Čáp et al. [
33], in which they suggest using an auction system for prioritized path planning.
Convergence time: Convergence is defined as a point when an agent completes an optimal trajectory without collisions. One strategy for a lower convergence time using Lyapunov-based analysis [
57] was proposed and later extended to multiple cooperative avoidance control strategies [58]. The main drawback of these techniques is that they either fail in situations with an arbitrarily large number of agents [59,60,61] or produce paths that are too conservative [
58,
62,
63]. This challenge was the main focus in [
5], where they proposed a collision-avoidance strategy with optimal convergence time using velocity-based detection regions.
Less deviation: An optimal collision-free trajectory should be as close to the ideal trajectory as possible. Standard deviation measures this aspect; a smaller deviation indicates that the trajectory is closer to the ideal path. A method ensuring a smaller deviation was proposed in [
58], but it was not suitable for a large number of agents. By developing collision avoidance for fully-actuated Lagrangian systems, the authors in [
5] successfully addressed the issue of deviation from the ideal path.
Deadlocks: In a multi-robot system, if the environment has intersections, a small entrance into a room, or a narrow shared path, there is a high chance of a deadlock arising. A deadlock occurs when agents need to pass each other but there is not enough space for them to do so. Some centralized methods [
64,
65,
66] have been proposed, but they cannot be applied effectively to a large group of agents. Taking inspiration from the hybrid control framework proposed by Egerstedt et al. [
67], a solution for operating many agents in the scenarios mentioned above is proposed by Fan et al. [
2]. They successfully trained up to 100 agents. Another solution for deadlocks is presented by Dergachev et al. [
30] using MAPF.
7. Future Directions for Multi-Robot Collision-Avoidance Studies
Beyond the issues raised above, there are several key highlights for future studies on decentralized multi-robot collision avoidance, as summarized in
Table 7. To answer RQ5, the following eight future directions are suggested:
Adaptability in an uncertain environment: Most researchers aim to make their collision-avoidance algorithm generalize easily to any uncertain environment. Many tasks involving multi-robot systems, such as search and rescue, duct exploration, and cleaning a chemical spill, involve a high degree of uncertainty in the environment. Furthermore, 2 of the 17 selected studies admitted to omitting this issue from their current study and aim to address it in subsequent research [
13,
27,
40].
More complex scenarios: In Čáp et al. [
33], paths wide enough for more than one agent to pass are tested, such as those in a warehouse, an office room, and other areas. The authors therefore want to extend the proposed method to more complex scenarios such as confined spaces. At the same time, Fan et al. [
2] admitted that their approach could not compete with a global path planner. Incorporating classical mapping methods (e.g., SLAM [
68]) and global path planners (e.g., RRT and A*) with their algorithms will be the goal in the future.
Dense obstacle field: Exploring environments with more obstacles, including dynamic obstacles, is stated as a future direction by Chen et al. [
13]. They describe their proposed algorithm as a collision-avoidance algorithm rather than a path-planning one, and it performs only in open environments with few obstacles.
Outdoor Settings: Mobile robot systems can be used for outdoor exploration, such as in Fan et al. [
2] and Erdmann et al. [
54]. However, Mao et al. [
40] considered only indoor settings with static obstacles when designing their collision-avoidance algorithm. Overcoming the freezing-robot problem and implementing their method in outdoor settings are declared future goals by Liang et al. [
12].
Heterogeneous Agents: In practical applications, multi-robot systems may consist of heterogeneous agents such as TurtleBot and Fetch robots. Most of the studies analyzed in this review have focused on homogeneous agents. Only Zhang et al. [
42] stated that their future work would aim to apply the proposed collision-avoidance method to heterogeneous agents.
Curvilinear Paths: When moving on curvilinear paths, more collisions are possible as agents may slip. This issue can be resolved by applying a piecewise linear approximation to the curved edge between two vertices, which yields multiple virtual vertices connected by linear edges. Inter-agent collision on curvilinear paths is the future direction chosen by Nagavarapu et al. [
36].
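The piecewise linear approximation mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Nagavarapu et al.: the curved edge is assumed to be given as a parametric function on [0, 1], and the names `approximate_curved_edge`, `polyline_length`, and `quarter_circle` are hypothetical.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def approximate_curved_edge(
    curve: Callable[[float], Point], num_segments: int
) -> List[Point]:
    """Sample a parametric curve on [0, 1] at evenly spaced parameters.

    The first and last points are the original vertices; the points in
    between are the virtual vertices, joined by straight (linear) edges.
    """
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    return [curve(i / num_segments) for i in range(num_segments + 1)]


def polyline_length(points: List[Point]) -> float:
    """Total length of the linear edges joining consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def quarter_circle(t: float, radius: float = 1.0) -> Point:
    """Illustrative curved edge: a quarter circle from (r, 0) to (0, r)."""
    angle = t * math.pi / 2
    return (radius * math.cos(angle), radius * math.sin(angle))


# Eight linear edges, i.e. seven virtual vertices between the two endpoints.
vertices = approximate_curved_edge(quarter_circle, 8)
approx_length = polyline_length(vertices)
```

Increasing `num_segments` brings the polyline length closer to the true arc length (here π/2), at the cost of more virtual vertices in the graph.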
Larger state space: Environments that are closer to real-world situations may involve complex dynamics, which lead to a larger state space. In a large state space, it is difficult and time-consuming to differentiate between relevant and irrelevant actions in order to achieve an effective collision-avoidance solution. An optimal solution for handling a large state space together with collision avoidance in multi-robot systems is addressed by Li et al. [
37].
Balance safety and performance: Studies are needed to examine the safety of the agents and how to keep them from collisions, without focusing too heavily on optimal trajectories, as highlighted by Cai et al. [
38]. Further studies should consider algorithms that can balance the multi-robot system’s safety and performance.
8. Conclusions
This study conducted a systematic review of RL and non-RL decentralized collision-avoidance techniques for multi-robots, with the main goal of summarizing the state of the art. The studies in this review were selected from reputable public databases such as Springer, IEEE Xplore, WoS and other sources. The outputs of these databases were filtered through a series of screening steps to exclude any irrelevant studies. It was observed that most of the studies used non-RL approaches to address the issue. Furthermore, the largest number of studies on decentralized collision avoidance for multi-robots in the last seven years came from China.
In general, the summarized studies were analyzed according to whether they used RL or non-RL approaches. Algorithms such as Discrete ORCA-MPC, CADRL, MR, MR-DFS, and other approaches have been used to solve the problem successfully. The success rate of each proposed technique and the coordination strategy used were then analyzed and discussed. It was observed that only 2 out of 17 studies considered confined-space scenarios, while the others implemented collision-avoidance strategies in relatively open and empty space. Few studies implemented their algorithms in real-world scenarios, while the others focused on simulations. Scenarios such as deadlocks or swapping in a narrow path are not widely studied. The most used algorithms are modified versions of ORCA implemented in relatively open spaces. The proposed techniques, together with their benchmark algorithms, strengths, and weaknesses, are presented and discussed as a reference for future research. The data used in this review are up to date to the best of our knowledge, and any omission is unintentional.
Author Contributions
Conceptualization, A.H.A.R. and M.S.M.N.; methodology, M.R. and A.H.A.R.; validation, M.F.N., N.M.R.N. and T.S.Y.; writing, M.R.; editing and visualization, G.J.A.-A. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Ministry of Higher Education, grant number FRGS/1/2020/ICT02/UKM/02/7.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
References
- Stączek, P.; Pizoń, J.; Danilczuk, W.; Gola, A. A digital twin approach for the improvement of an autonomous mobile robots (AMR’s) operating environment—A case study. Sensors 2021, 21, 7830.
- Fan, T.; Long, P.; Liu, W.; Pan, J. Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. Int. J. Robot. Res. 2020, 39, 856–892.
- Yang, X. A decentralized algorithm for collision-free search tasks by multiple robots in 3D areas. In Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics, ROBIO 2017, Macau, Macao, 5–8 December 2017; pp. 2502–2507.
- Rey, F.; Pan, Z.; Hauswirth, A.; Lygeros, J. Fully Decentralized ADMM for Coordination and Collision Avoidance. In Proceedings of the 2018 European Control Conference, ECC 2018, Limassol, Cyprus, 12–15 June 2018; pp. 825–830.
- Rodriguez-Seda, E.J.; Stipanovic, D.M. Cooperative avoidance control with velocity-based detection regions. IEEE Control. Syst. Lett. 2020, 4, 432–437.
- Pesce, E.; Montana, G. Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication. Mach. Learn. 2020, 109, 1727–1747.
- Wang, D.; Deng, H.; Pan, Z. MRCDRL: Multi-robot coordination with deep reinforcement learning. Neurocomputing 2020, 406, 68–76.
- Nguyen, M.T.; Maniu, C.S.; Olaru, S. Decentralized constructive collision avoidance for multi-agent dynamical systems. In Proceedings of the 2016 European Control Conference, ECC 2016, Aalborg, Denmark, 29 June–1 July 2016; pp. 1526–1531.
- Lee, W.C.; Salam, A.S.A.; Ibrahim, M.F.; Rahni, A.A.A.; Mohamed, A.Z. Autonomous industrial tank floor inspection robot. In Proceedings of the IEEE 2015 International Conference on Signal and Image Processing Applications, ICSIPA 2015-Proceedings, Kuala Lumpur, Malaysia, 19–21 October 2016; pp. 473–475.
- Wang, Y.; Wang, D.; Mihankhah, E. Navigation of multiple mobile robots in unknown environments using a new decentralized navigation function. In Proceedings of the 2016 14th International Conference on Control, Automation, Robotics and Vision, ICARCV 2016, Phuket, Thailand, 13–15 November 2016; pp. 13–15.
- Xiong, X.; Wang, J.; Zhang, F.; Li, K. Combining Deep Reinforcement Learning and Safety Based Control for Autonomous Driving. 2016. Available online: https://arxiv.org/abs/1612.00147 (accessed on 14 February 2022).
- Liang, J.; Patel, U.; Sathyamoorthy, A.J.; Manocha, D. Real-time Collision Avoidance for Mobile Robots in Dense Crowds using Implicit Multi-sensor Fusion and Deep Reinforcement Learning. 2020. Available online: http://arxiv.org/abs/2004.03089 (accessed on 14 February 2022).
- Chen, Y.F.; Liu, M.; Everett, M.; How, J.P. Decentralized non-communicating multi-agent collision avoidance with deep reinforcement learning. In Proceedings of the IEEE International Conference on Robotics and Automation, Singapore, 29 May–3 June 2017; pp. 285–292.
- Ibrahim, M.F.; Huddin, A.B.; Hussain, A.; Zaman, M.H.M. Frontier Strategy with GA based Task Scheduler for Autonomous Robotic Exploration Systems. Adv. Nat. Appl. Sci. 2020, 14, 259–265.
- Bechlioulis, C.P.; Giagkas, F.; Karras, G.C.; Kyriakopoulos, K.J. Robust Formation Control for Multiple Underwater Vehicles. Front. Robot. AI 2019, 6, 90.
- Hua, X.; Wang, G.; Xu, J.; Chen, K. Reinforcement learning-based collision-free path planner for redundant robot in narrow duct. J. Intell. Manuf. 2021, 32, 471–482.
- Dai, X.; Mao, Y.; Huang, T.; Qin, N.; Huang, D.; Li, Y. Automatic obstacle avoidance of quadrotor UAV via CNN-based learning. Neurocomputing 2020, 402, 346–358.
- Han, D.; Yang, Q.; Wang, R. Three-dimensional obstacle avoidance for UAV based on reinforcement learning and RealSense. J. Eng. 2020, 2020, 540–544.
- Back, S.; Cho, G.; Oh, J.; Tran, X.T.; Oh, H. Autonomous UAV Trail Navigation with Obstacle Avoidance Using Deep Neural Networks. J. Intell. Robot. Syst. Theory Appl. 2020, 100, 1195–1211.
- Woo, J.; Kim, N. Collision avoidance for an unmanned surface vehicle using deep reinforcement learning. Ocean. Eng. 2020, 199, 107001.
- Meyer, E.; Robinson, H.; Rasheed, A.; San, O. Taming an Autonomous Surface Vehicle for Path following and Collision Avoidance Using Deep Reinforcement Learning. IEEE Access 2020, 8, 41466–41481.
- Xu, X.; Lu, Y.; Liu, X.; Zhang, W. Intelligent collision avoidance algorithms for USVs via deep reinforcement learning under COLREGs. Ocean. Eng. 2020, 217, 107704.
- Wang, C.; Zhang, X.; Cong, L.; Li, J.; Zhang, J. Research on intelligent collision avoidance decision-making of unmanned ship in unknown environments. Evol. Syst. 2019, 10, 649–658.
- Sawada, R.; Sato, K.; Majima, T. Automatic ship collision avoidance using deep reinforcement learning with LSTM in continuous action spaces. J. Mar. Sci. Technol. 2021, 26, 509–524.
- Zhao, L.; Roh, M.i. COLREGs-compliant multiship collision avoidance based on deep reinforcement learning. Ocean. Eng. 2019, 191, 106436.
- Xie, S.; Garofano, V.; Chu, X.; Negenborn, R.R. Model predictive ship collision avoidance based on Q-learning beetle swarm antenna search and neural networks. Ocean. Eng. 2019, 193, 106609.
- Xie, S.; Chu, X.; Zheng, M.; Liu, C. A composite learning method for multi-ship collision avoidance based on reinforcement learning and inverse control. Neurocomputing 2020, 411, 375–392.
- Havenstrøm, S.T.; Rasheed, A.; San, O. Deep Reinforcement Learning Controller for 3D Path Following and Collision Avoidance by Autonomous Underwater Vehicles. Front. Robot. AI 2021, 7, 211.
- Kim, J. While Preserving Collision Avoidance. IEEE Trans. Cybern. 2016, 47, 4038–4048.
- Dergachev, S.; Yakovlev, K. Distributed Multi-Agent Navigation Based on Reciprocal Collision Avoidance and Locally Confined Multi-Agent Path Finding. In Proceedings of the IEEE International Conference on Automation Science and Engineering, Lyon, France, 23–27 August 2021; pp. 1489–1494.
- Okoli, C. A guide to conducting a standalone systematic literature review. Commun. Assoc. Inf. Syst. 2015, 37, 879–910.
- Sadiq, R.B.; Safie, N.; Rahman, A.H.A.; Goudarzi, S. Artificial intelligence maturity model: A systematic literature review. PeerJ Comput. Sci. 2021, 7, e661.
- Čáp, M.; Vokřínek, J.; Kleiner, A. Complete decentralized method for on-line multi-robot trajectory planning in well-formed infrastructures. In Proceedings of the International Conference on Automated Planning and Scheduling, ICAPS, Jerusalem, Israel, 7–11 June 2015; pp. 324–332.
- Das, S.; Nath, S.; Saha, I. SPARCAS: A Decentralized, Truthful Multi-Agent Collision-free Path Finding Mechanism. arXiv 2019, arXiv:1909.08290.
- Long, P.; Liu, W.; Pan, J. Deep-learned collision avoidance policy for distributed multi-agent navigation. IEEE Robot. Autom. Lett. 2017, 2, 656–663.
- Nagavarapu, S.C.; Vachhani, L.; Sinha, A. Multi-Robot Graph Exploration and Map Building with Collision Avoidance: A Decentralized Approach. J. Intell. Robot. Syst. 2016, 83, 503–523.
- Li, H.; Weng, B.; Gupta, A.; Pan, J.; Zhang, W. Reciprocal Collision Avoidance for General Nonlinear Agents using Reinforcement Learning. 2019. Available online: http://arxiv.org/abs/1910.10887 (accessed on 13 February 2022).
- Cai, Z.; Cao, H.; Lu, W.; Zhang, L.; Xiong, H. Safe Multi-Agent Reinforcement Learning through Decentralized Multiple Control Barrier Functions. 2021. Available online: http://arxiv.org/abs/2103.12553 (accessed on 13 February 2022).
- Arul, S.H.; Manocha, D. V-RVO: Decentralized Multi-Agent Collision Avoidance using Voronoi Diagrams and Reciprocal Velocity Obstacles. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 8097–8104.
- Mao, R.; Gao, H.; Guo, L. A Novel Collision-Free Navigation Approach for Multiple Nonholonomic Robots Based on ORCA and Linear MPC. Math. Probl. Eng. 2020, 2020, 4183427.
- Wei, C.; Hindriks, K.v.; Jonker, C.M. Altruistic coordination for multi-robot cooperative pathfinding. Appl. Intell. 2016, 44, 269–281.
- Zhang, L.; Wang, J.; Lin, Z.; Lin, L.; Chen, Y.; He, B. Distributed Cooperative Obstacle Avoidance for Mobile Robots Using Independent Virtual Center Points. J. Intell. Robot. Syst. 2019, 98, 791–805.
- Bae, H.; Kim, G.; Kim, J.; Qian, D.; Lee, S. Multi-robot path planning method using reinforcement learning. Appl. Sci. 2019, 9, 3057.
- Lin, J.; Yang, X.; Zheng, P.; Cheng, H. End-to-end Decentralized Multi-robot Navigation in Unknown Complex Environments via Deep Reinforcement Learning. In Proceedings of the 2019 IEEE International Conference on Mechatronics and Automation, ICMA 2019, Tianjin, China, 4–7 August 2019; pp. 2493–2500.
- Mehdipour, N.; Abdollahi, F.; Mirzaei, M. Consensus of multi-agent systems with double-integrator dynamics in the presence of moving obstacles. In Proceedings of the 2015 IEEE Conference on Control and Applications, CCA 2015-Proceedings, Sydney, Australia, 21–23 September 2015; pp. 1817–1822.
- Salameh, M.O.; Abdullah, A.; Sahran, S. Ensemble of vector and binary descriptor for loop closure detection. Adv. Intell. Syst. Comput. 2017, 447, 329–340.
- Dewi, D.A.; Sundararajan, E.; Prabuwono, A.S.; Cheng, L.M. Object detection without color feature: Case study Autonomous Robot. Int. J. Mech. Eng. Robot. Res. 2019, 8, 646–650.
- Ding, Y. Distributed Multi-robot Collision Avoidance Using the Voronoi-based Method. J. Phys. Conf. Ser. 2021, 1948, 012015.
- Ramalho, G.M.; Carvalho, S.R.; Finardi, E.C.; Moreno, U.F. Trajectory Optimization Using Sequential Convex Programming with Collision Avoidance. J. Control. Autom. Electr. Syst. 2018, 29, 318–327.
- Patle, B.K.; Babu L, G.; Pandey, A.; Parhi, D.R.K.; Jagadeesh, A. A review: On path planning strategies for navigation of mobile robot. Def. Technol. 2019, 15, 582–606.
- Silver, D. Cooperative Pathfinding. In Proceedings of the 1st Artificial Intelligence and Interactive Digital Entertainment Conference, Marina del Rey, CA, USA, 1–3 June 2005.
- Standley, T.; Korf, R. Complete Algorithms for Cooperative Pathfinding Problems. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence 2011, Barcelona Catalonia, Spain, 16–22 July 2011.
- Ryan, L.; Kostas, E.B. Efficient and complete centralized multi-robot path planning. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, 25–30 September 2011; Available online: https://ieeexplore.ieee.org/abstract/document/6095085/ (accessed on 14 February 2022).
- Erdmann, M.; Lozano-Pérez, T. On multiple moving objects. Algorithmica 1987, 2, 477–521.
- Surynek, P. A novel approach to path planning for multiple robots in bi-connected graphs. In Proceedings of the IEEE International Conference on Robotics and Automation, Kobe, Japan, 12–17 May 2009; pp. 3613–3619.
- De Wilde, B.; Ter Mors, A.W.; Witteveen, C. Push and rotate: Cooperative multi-agent path planning. In Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013, St. Paul, MN, USA, 6–10 May 2013; Volume 1, pp. 87–94.
- Leitmann, G.; Skowronski, J. Avoidance control. J. Optim. Theory Appl. 1977, 23, 581–591.
- Stipanović, D.M.; Hokayem, P.F.; Spong, M.W.; Šiljak, D.D. Cooperative avoidance control for multi-agent systems. J. Dyn. Syst. Meas. Control 2007, 129, 699–707.
- Rodríguez-Seda, E.J.; Stipanović, D.M.; Spong, M.W. Guaranteed Collision Avoidance for Autonomous Systems with Acceleration Constraints and Sensing Uncertainties. J. Optim. Theory Appl. 2016, 168, 1014–1038.
- Fiorini, P.; Shiller, Z. Motion planning in dynamic environments using velocity obstacles. Int. J. Robot. Res. 1998, 17, 760–772.
- Van den Berg, J.; Guy, S.J.; Lin, M.; Manocha, D. Reciprocal n-body Collision Avoidance; Springer Tracts in Advanced Robotics Book Series; Springer: Berlin/Heidelberg, Germany, 2011; Volume 70, pp. 3–19.
- Rodriguez-Seda, E.J.; Spong, M.W. Guaranteed safe motion of multiple lagrangian systems with limited actuation. In Proceedings of the IEEE Conference on Decision and Control, Maui, HI, USA, 10–13 December 2012; pp. 2773–2780.
- Khatib, O. Real-Time Obstacle Avoidance for Manipulators and Mobile Robots. In Proceedings of the IEEE International Conference on Robotics and Automation, St. Louis, MO, USA, 25–28 March 1985; pp. 500–505. Available online: https://ieeexplore.ieee.org/abstract/document/1087247/ (accessed on 14 February 2022).
- Yu, J.; LaValle, S.M. Optimal Multirobot Path Planning on Graphs: Complete Algorithms and Effective Heuristics. IEEE Trans. Robot. 2016, 32, 1163–1177.
- Schwartz, J.T.; Sharir, M. On the Piano Movers’ Problem: III. Coordinating the Motion of Several Independent Bodies: The Special Case of Circular Bodies Moving Amidst Polygonal Barriers. Int. J. Robot. Res. 1983, 2, 46–75.
- Tang, S.; Thomas, J.; Kumar, V. Hold Or take Optimal Plan (HOOP): A quadratic programming approach to multi-robot trajectory generation. Int. J. Robot. Res. 2017, 37, 1062–1084.
- Egerstedt, M.; Hu, X. A hybrid control approach to action coordination for mobile robots. Automatica 2002, 38, 125–130.
- Pirahansiah, F.; Abdullah, S.N.H.S.; Sahran, S. Simultaneous Localization and Mapping Trends and Humanoid Robot Linkages. Asia-Pac. J. Inf. Technol. Multimed. 2013, 2, 27–38.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).