1. Introduction
Multi-Agent Robot Exploration is a field of study concerned with finding optimal strategies for a group of robots exploring an unknown environment. The main challenge in this field is to coordinate the actions of multiple robots so that they work together effectively and efficiently [1,2,3,4,5,6,7,8,9,10,11,12]. To solve this problem, researchers often use optimization algorithms: mathematical methods that find the best solution to a given problem subject to specific objectives and constraints [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28]. Using a group of autonomous mobile robots offers a number of benefits, including increased effectiveness, dependability, and robustness when conducting information-gathering tasks such as exploration, surveillance, and inspection [29]. These benefits are attained through some form of team coordination, which is frequently designed under the assumption of unrestricted communication. However, real-world operations commonly involve communication-challenged conditions. In such environments, robots can only communicate with nearby teammates (locally), depending on their transmission capabilities and on the environment itself (e.g., the presence of obstacles or disturbances). As a result, achieving a good level of coordination may be difficult.
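As a rough illustration of the local-communication constraint described above, the following sketch checks whether two robots can exchange data, assuming a simple range threshold plus a coarse line-of-sight test against obstacle cells. The function name, the grid discretization, and all parameters are illustrative assumptions, not details from this paper:

```python
import math

def can_communicate(pos_a, pos_b, comm_range, obstacles=frozenset(), grid_step=0.25):
    """Return True if two robots can exchange data: they are within range and
    the straight segment between them avoids obstacle cells (coarse ray check)."""
    dist = math.dist(pos_a, pos_b)
    if dist > comm_range:
        return False
    # Sample points along the segment; reject if any falls in an obstacle cell.
    steps = max(1, int(dist / grid_step))
    for i in range(1, steps):
        t = i / steps
        x = pos_a[0] + t * (pos_b[0] - pos_a[0])
        y = pos_a[1] + t * (pos_b[1] - pos_a[1])
        if (int(x), int(y)) in obstacles:
            return False
    return True
```

Under such a predicate, the communication graph between teammates changes as the robots move, which is precisely what makes coordination difficult.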
Creating maps is one of the most fundamental yet difficult tasks for a group of autonomous mobile robots. Map creation is frequently part of the problem formulation and solution process in applications such as search and rescue and surveillance, so many of the conclusions we reach in the following can be readily applied to other problems [30,31]. The scenario of multi-robot exploration with restricted communication has received far less attention than the one with unconstrained communication (see, for example, [32]). Limiting communication, however, introduces several difficulties. The first major problem is maintaining a shared understanding of the surroundings throughout the exploration. With unlimited communication, such knowledge can be presumed to be available to every robot simultaneously; in that case, map-merging algorithms [33] (whether distributed or centralized) can use data-sharing protocols to exchange the updated map.
The frontier notion is often used in autonomous exploration methods to frame the problem. The idea of a frontier between open space and uncharted space serves as the foundation of the single-robot exploration approach first put forth by [34]. A team of robots simultaneously exploring different areas of an unknown environment is the foundation of another, coordinated multi-robot exploration approach [35]. The popular deterministic method called Coordinated Multi-Exploration (CME) attempts to deterministically build a well-defined map of an unknown space. The goal of an exploration procedure is to cover the complete environment in the shortest amount of time; therefore, the robots must use a centralized method to continuously track which areas of the environment have already been explored. Since CME is deterministic and always repeats the same search pattern, it has no way to escape local optima. The only way around this is to change the environment's configuration, which is not always possible. In addition, there is no guarantee that a waypoint will be reached, which significantly affects the individual robots, causes them to lose their assigned tasks, and leads to a breakdown in coordination. For exploration, robots require a global map in order to plan their paths and coordinate their operations [36]. The robots continuously communicate their positions on the map and which areas they have already examined. The occupancy grid [37] is the representation the robots use to share this information with one another.
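To illustrate how two robots might reconcile their occupancy grids during such an exchange, here is a minimal cell-wise merge sketch. The cell encoding (-1 unknown, 0 free, 1 occupied) and the conservative tie-breaking rule are assumptions for illustration; the paper's own map-merging protocol may differ:

```python
UNKNOWN, FREE, OCCUPIED = -1, 0, 1  # cell states in a simple occupancy grid

def merge_grids(grid_a, grid_b):
    """Merge two robots' occupancy grids cell by cell: any observed cell
    (FREE/OCCUPIED) overrides UNKNOWN; where both robots observed a cell,
    keep OCCUPIED, the conservative choice for safe path planning."""
    merged = []
    for row_a, row_b in zip(grid_a, grid_b):
        row = []
        for a, b in zip(row_a, row_b):
            if a == UNKNOWN:
                row.append(b)
            elif b == UNKNOWN:
                row.append(a)
            else:
                row.append(max(a, b))  # OCCUPIED (1) wins over FREE (0)
        merged.append(row)
    return merged
```

Exchanging only the cells observed since the last contact, rather than whole grids, would reduce the bandwidth needed under limited communication.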
The whole exploration problem can be abstractly defined as follows. Suppose there is a setting space S and that time is discretized into steps. There are n robots in the collection nRbt = {r1, r2, r3, … rn}, each of which can move in S. Call Kj(t) the area of S that robot j is aware of at time t, and assume that every robot can always communicate with every other robot. The map of the area of the environment that the robots have perceived at time t is then given by K(t) = K1(t) ∪ K2(t) ∪ … ∪ Kn(t).
The exploration is finished when there occurs a time t such that K(t) = S (or such that K(t) is equal to the free space of S). The challenge of multi-robot exploration is to decide, while optimizing a performance metric, which frontiers (borders of the known area K(t) of the environment) the robots should travel towards at each time step t. Typical performance measures are the time t to finish exploration (to be minimized), the area mapped in a given time (to be maximized), the distance traveled by the robots to finish exploration (to be minimized), or combinations of those metrics. The exploration strategy, which incorporates the "intelligence" of the system, determines which frontier each robot should explore in order to optimize the performance measure.
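The abstraction above can be sketched on a discrete grid. The names here are hypothetical (per_robot_known for each robot's set of known cells, known_map for their union, frontiers for known free cells bordering unknown space), and off-map cells are treated as unknown in this simplification:

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]

def known_map(per_robot_known: List[Set[Cell]]) -> Set[Cell]:
    """K(t): the union of every robot's individually known cells at time t."""
    team: Set[Cell] = set()
    for k_j in per_robot_known:
        team |= k_j
    return team

def frontiers(known_free: Set[Cell], known_all: Set[Cell]) -> Set[Cell]:
    """Frontier cells: known free cells with at least one unknown 4-neighbor."""
    result = set()
    for (x, y) in known_free:
        for neighbor in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbor not in known_all:
                result.add((x, y))
                break
    return result

def exploration_done(known: Set[Cell], space: Set[Cell]) -> bool:
    """Exploration terminates once the team's map K(t) covers all of S."""
    return space <= known
```

An exploration strategy then amounts to assigning each robot a frontier cell at every time step so that the chosen performance metric is optimized.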
1.1. Motivating Problem for the Paper
According to the publications cited in the literature review, hybridization is the newest trend in algorithm design [38,39,40,41]. Different algorithms are combined to create hybrids that outperform their individual components while requiring fewer resources. Researchers develop new hybrid algorithms with the express purpose of addressing the shortcomings of existing ones, enhancing the optimization variables and conducting experiments at varying degrees of complexity. This approach is therefore critical for developing novel hybrid solutions that accommodate limited onboard resources and incorporate bio-inspired methods into the MAE paradigm.
1.2. Research Contributions
Our study makes significant contributions to the field of space exploration in obstacle-cluttered environments by proposing a novel hybrid technique that combines deterministic and swarm-based methods. The key contributions of our approach include:
Hybrid Methodology: We present a hybrid methodology that combines deterministic and swarm-based approaches, harnessing the benefits of each.
Adaptability: The incorporation of swarm-based methodologies, which exhibit emergent behavior and decentralized decision-making, makes our technique versatile. This versatility allows the robots to adjust to changes in the environment and deal with unexpected obstacles or interruptions.
Robustness: The combination of deterministic and swarm-based approaches improves our approach’s overall robustness, allowing it to effectively navigate through complicated obstacle configurations.
Exploration Efficiency: By integrating established exploration patterns from deterministic methods with real-time changes and optimization possibilities afforded by swarm-based approaches, we optimize exploration efficiency.
2. Related Studies
This section discusses several existing methods for multi-agent exploration and optimization and highlights their limitations [42].
The basic objective of an autonomous robotic exploration algorithm is to guide robots into uncharted terrain, expanding the known and explored portion of a map that is being constructed as the robots move. The frontier notion is often used in autonomous exploration methods to frame the problem. In other words, the goal of Multi-Agent Robot Exploration Optimization is to find an optimal exploration strategy for a team of robots that allows them to cover a large area of the environment, gather information, and avoid obstacles while using minimal resources such as time and energy [43,44,45,46]. One approach to multi-agent exploration is based on swarm intelligence, where a group of agents interacts with each other and their environment to achieve a common goal. Swarm intelligence algorithms, such as ant colony optimization and particle swarm optimization, have been applied to multi-agent exploration problems, but they can be limited by their sensitivity to initial conditions and their inability to handle complex environments. Another approach is based on genetic algorithms, which are inspired by biological evolution and use a population of solutions to search for an optimum. Genetic algorithms have been applied to multi-agent exploration problems, but they can be computationally expensive and require a large number of evaluations to converge.
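For concreteness, below is a minimal particle swarm optimization sketch of the generic textbook form: each particle is pulled toward its own best position and the swarm's best position. The inertia weight w and the cognitive/social coefficients c1 and c2 are conventional choices, not parameters from the cited works, and this is not the exploration-specific variant discussed in this paper:

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5,
                 bounds=(-10.0, 10.0), seed=0):
    """Minimal PSO: velocities blend inertia, attraction to the particle's
    personal best, and attraction to the global best; positions are clamped."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

The sensitivity to initial conditions mentioned above is visible here: the returned optimum depends on the random seed that places the initial swarm.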
This paper attempts to address the shortcomings of ANFIS by using a modified Aquila Optimizer (AO) and the Opposition-Based Learning (OBL) technique to optimize the ANFIS parameters. The major goal of the developed model, AOOBL-ANFIS, is to improve the Aquila Optimizer (AO) search process while utilizing the AOOBL to improve the ANFIS's performance. The proposed model is assessed on real-world oil production datasets gathered from various oilfields, using a variety of performance metrics including root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2), standard deviation (Std), and computational time [47].
In [48], the authors presented the integration of reinforcement learning with a multi-agent system. More precisely, they suggest optimizing the mutual information between agents' identities and their trajectories to promote thorough exploration and a range of distinct behavioral patterns. To encourage learning sharing across agents while maintaining sufficient variety, the authors added agent-specific modules to the shared neural network design; these modules are regularized using the L1-norm. Experimental findings demonstrate that the approach delivers state-of-the-art performance on highly challenging StarCraft II micromanagement tasks and Google Research Football.
The approach proposed in [49] combines a parallel computational Aquila Optimizer, a bio-inspired technique, with deterministic multi-agent exploration. The Aquila Optimizer is inspired by the behavior of eagles and uses a stochastic search strategy to explore the search space efficiently. Stochastic factors are integrated into the Aquila Optimizer to enhance the algorithm's efficiency. The proposed approach is compared with several existing approaches, including the whale algorithm, using a simulation of a multi-agent exploration problem. The results demonstrate that the proposed approach outperforms the existing approaches in terms of convergence speed and solution quality. Overall, the proposed approach offers a promising solution for multi-agent exploration problems, particularly in complex and uncertain environments, by leveraging bio-inspired optimization algorithms and integrating stochastic factors into the search strategy.
Albina et al. [50] suggest stochastic optimization that simulates the coordinated predatory behavior of grey wolves and apply it to multi-robot exploration. Here, deterministic and metaheuristic methods are coupled to compute the robots' movement. A novel approach called 'hybrid stochastic exploration' makes use of the Coordinated Multi-Robot Exploration and Grey Wolf Optimizer algorithms. The reported results show that the proposed algorithm performs better than competing algorithms. However, the bio-inspired component was not modified to check the performance of the Grey Wolf Optimizer under varying values of the stochastic variables involved in the optimizer.
Anirudh et al. [51] present the Wavefront Frontier Detector (WFD), an autonomous frontier-based exploration approach. It is described and implemented on the Kobuki TurtleBot hardware platform and in the Gazebo simulation environment using the Robot Operating System (ROS). The benefit of this algorithm is that the robot can explore both large, open areas and small, cluttered areas. Additionally, the map produced with this method is compared and checked against the map produced using the turtlebot-teleop ROS package.
Limitations
Multi-robot space exploration in situations with many obstacles is a difficult undertaking that calls for a thorough assessment of the techniques employed. We examine the drawbacks of both deterministic and swarm-based methods in the context of space exploration in this subsection.
Deterministic Method:
Limited Robustness: The behavior of the robots is frequently governed by specified rules and algorithms in deterministic approaches. This method’s capacity to deal with unforeseen circumstances or dynamic changes in the environment may be constrained in an environment that is dense with obstacles. The method might find it difficult to adjust to unforeseen impediments or disturbances, which can result in less-than-ideal or ineffective exploration.
Lack of Scalability: Deterministic approaches are usually developed for a fixed number of robots and a fixed number of obstacles. It can be difficult to scale up the system or accommodate varying obstacle densities. As the number of robots and obstacles rises, the deterministic nature of these algorithms may result in greater processing complexity and communication overhead.
Exploration Efficiency: Deterministic approaches frequently rely on specified exploration patterns or robot trajectories. While these patterns may be successful in some situations, they may not be optimized for obstacle avoidance or efficient environment covering. The inability to adjust and make real-time decisions may impede overall exploration efficiency.
Swarm-based Method:
Lack of Determinism: Swarm-based approaches frequently demonstrate emergent behavior that cannot be controlled explicitly. While adaptability might be beneficial, it can also pose difficulties in assuring deterministic and predictable behavior, particularly in complicated, obstacle-cluttered settings. Due to the lack of determinism, it can be difficult to ensure collision-free exploration or adherence to established mission objectives.
Communication Overhead: To achieve collective decision-making and work distribution, swarm-based approaches often necessitate substantial communication and coordination among the robots. In obstacle-cluttered environments where communication links can be broken or limited, relying on communication can impair overall system performance and scalability.
Exploration Completeness: Due to their reliance on local interactions and limited sensing capabilities, swarm-based approaches may fail to achieve thorough exploration of the environment. Certain environmental areas or regions may go unexplored or underexplored, resulting in insufficient mapping or data collection.
To summarize, both deterministic and swarm-based techniques for space exploration in obstacle-cluttered settings have flaws. Deterministic approaches may lack robustness, scalability, and exploration efficiency, whereas swarm-based methods may encounter determinism, communication-overhead, and exploration-completeness difficulties. To overcome these drawbacks, hybrid systems that integrate adaptability, scalability, efficient decision-making, and resilience to efficiently navigate and investigate obstacle-cluttered environments in space exploration missions are required.
5. Discussion of Simulation Results
In this section, we present the results of the proposed multi-agent coordinated exploration based on the Aquila method. The feasibility of the proposed method is evaluated by increasing the complexity of the map, and the number of obstacles is varied. The map always has a size of 20 m × 20 m, and the maps are created with the robotics toolbox. The space is separated into an open area that needs to be explored and a dark zone that denotes the region occupied by obstacles. The outcomes are compared with the multi-agent Aquila and multi-agent whale optimizers in order to validate them and determine whether there are any advantages.
To calculate the total area of the investigated cells, we use Equation (25). The multi-agent uses this quantity to assess the area it has covered. A value of 0 indicates that no area has been explored, while a value of 1 indicates that the entire region has been examined. The resulting figure serves as an evaluation criterion for the percentage of the area under investigation.
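Since Equation (25) itself is not reproduced in this excerpt, the following sketch only illustrates the idea behind the metric: counting observed cells over all cells of the grid to obtain a value between 0 (nothing explored) and 1 (whole region examined). The cell encoding is an assumption:

```python
def explored_ratio(grid, unknown=-1):
    """Fraction of cells already observed: 0 means nothing explored,
    1 means the entire region has been examined."""
    total = explored = 0
    for row in grid:
        for cell in row:
            total += 1
            if cell != unknown:
                explored += 1
    return explored / total if total else 0.0
```

Multiplying this ratio by 100 yields the exploration percentages reported in the result tables.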
The framework is then tested using two alternative scenarios, as shown in Figure 1a,b. The vastness of the map, coupled with the total number of obstacles, iterations, and robot locations, is one of its most visible features. The results show that 96.9 and 99.3 percent of the total area was successfully explored in just 27 and 29 s, respectively. Additionally, more iterations can lead to even better performance. The results provide strong evidence for the effectiveness of the suggested strategy: it not only allowed for extensive searches of a wide area but also significantly decreased computational complexity and exploration time. Hence, MAE-PAO successfully demonstrates quick and effective map exploration.
6. MAE-PAO Algorithm Compared to the Latest CME-AO and Whale Algorithms
Our proposed MAE-PAO’s performance has been thoroughly simulated across a wide range of environmental conditions, and now we are taking those results and running with them. Here, we look at the outcomes from both MAE-PAO and the state-of-the-art CME-AO (CME-Aquila Optimizer) and whale technique, comparing and contrasting them using in-depth research and analysis. The study takes place in a more complex setting that presents challenging situations and a plethora of obstacles that appear in random order. We look at how the system operates in the two different environmental conditions that Maps 1 and 2 reflect. These maps’ complexity varies depending on the direction, number, and length of the barriers. The comparison accurately accounts for all significant variables, such as the percentage of the map that was explored, the total number of runs that were abandoned, and the cumulative time needed to explore the maps under varied conditions.
Case 1: The use of a classic Aquila Optimizer with multi-agent exploration (CME-AO) is shown in Figure 2a, while our proposed Adaptive Aquila Optimizer (MAE-PAO) is shown in Figure 2b. According to the simulation results, the traditional Aquila Optimizer covers % of the region in about 40.2 s with one failed run. The proposed MAE-PAO examined % of the area in around s with one failed run, which is a substantially shorter amount of time. The inherent nature of the deterministic approach, which requires the agents to take the same path each time the simulation starts, means that CME cannot fully explore the map on its own. Increasing the total number of iterations also improves computing capability.
Case 2: The comparative findings between the proposed MAE-PAO and the traditional Aquila under the various environmental circumstances of Map 2 are shown in Figure 3a,c. A total of 96.73% of the environment was explored by the proposed MAE-PAO algorithm; the required time was 27.13 s, with zero unsuccessful runs. With the traditional Aquila Optimizer, a similar exploration rate of 94.41 percent was attained, but the algorithm needed to be run numerous times, and each run took 45.7 s to complete. This shows that although the exploration rates of the two algorithms are comparable, the proposed MAE-PAO's execution time and number of failed simulated runs are much lower, providing more persuasive evidence of the algorithm's effectiveness.
6.1. Summarized Results
For quick reference and the readers' convenience, the full set of results described above is condensed in Table 1, which shows the results obtained using the proposed MAE-PAO and the referenced CME-AO algorithms. The comparison takes into account all relevant factors, including the proportion of the entire area that was investigated, the number of failures, and the amount of time needed for map exploration in various situations. According to the findings, the primary goal of space exploration is satisfactorily accomplished by the proposed MAE-PAO approach in fewer test runs and less time. The traditional CME-AO algorithms show a modest tendency toward exploring simpler environments, and their additional runs and longer exploration times count against them. As a result, the proposed MAE-PAO algorithm is a preferable option for practical onboard use.
6.2. Statistics-Based Performance Evaluation
Here, we compare the performance of the proposed MAE-PAO to that of CME-Aquila and whale by analyzing their respective statistical properties. Multiple tests under the same environmental conditions are carried out for this investigation, as demonstrated in Figure 2 and Figure 3, respectively. We examine the mean and dispersion of the exploration percentages, and we compute the full duration of the process. The findings acquired across several runs for the two separate environmental conditions indicated by Maps 1 and 2, respectively, are shown in Table 2 and Table 3.
For the proposed algorithm MAE-PAO (refer to Table 2), the average mean exploration rate for Case 1 and Case 2 is 97.15%, and the average mean time consumed by MAE-PAO is 26.41 s. For the CME-Aquila and whale algorithms, refer to Table 3 and Table 4, respectively, which report the corresponding details.
The collective exploration rates, along with the exploration times, of the proposed MAE-PAO and the contemporary CME-Aquila and whale algorithms are summarized in Table 5. As the table shows, the proposed MAE-PAO has an average area exploration rate of approximately 97.15%, which is greater than the rates of 92.76% and 80.452% for CME-Aquila and whale, respectively. Similarly, the mean exploration time (26.41 s) of the proposed MAE-PAO is much lower than those of CME-Aquila (43.32 s) and whale (31.164 s). This empirically verifies the superior efficiency of MAE-PAO's exploration across a broad variety of environmental situations. There is also very little variation across runs in terms of exploration rate and exploration time.
The whale algorithm is combined with the CME method to further validate the proposed algorithm. The comparison demonstrates that the whale algorithm underperforms relative to the proposed algorithm. The method was statistically tested for the average rate of exploration and for the time taken to complete the exploration process. The number of iterations was kept at 100 because the environment dimension is 20 × 20, which requires no more than 100 runs. Increasing the number of runs causes the robots/agents to keep exploring areas that have already been explored, while decreasing the number of runs results in inefficient and insufficient area exploration. The statistical results are given in Table 4: the average rate of exploration and the average time were found to be 80.3440% and 31.1760 s, respectively.
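The per-run statistics reported in the tables (mean exploration rate and time, together with their dispersion) can be computed as in the following sketch; the function name and dictionary keys are illustrative:

```python
import statistics

def run_statistics(rates, times):
    """Mean and standard deviation of exploration rate (%) and exploration
    time (s) over repeated runs, as used in the statistical comparison."""
    return {
        "mean_rate": statistics.mean(rates),
        "std_rate": statistics.stdev(rates) if len(rates) > 1 else 0.0,
        "mean_time": statistics.mean(times),
        "std_time": statistics.stdev(times) if len(times) > 1 else 0.0,
    }
```

Feeding each algorithm's per-run percentages and durations into this helper reproduces the kind of summary rows shown in Tables 2-5.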
6.3. CME-AO Test on Additional Map
The same idea is extended to a map containing obstacles of different shapes, added to check the efficacy of the algorithms. Since the robots explore the map at ground level, the height factor is ignored; moreover, the size of an obstacle does not affect the rate of exploration of a map. The number of robots is kept at three, and the algorithm is designed so that all three robots collaborate simultaneously, sharing information about their whereabouts on the map during the exploration process. The Robotics System Toolbox is employed to create a 2D grid map; this is why the shape of the map is constant in the algorithm, whereas the colors refer to the different robots.
The robot poses are specified as follows: r1 = (6,5,0), r2 = (7,4,0), and r3 = (8,4,0) for Figure 4a–c, and r1 = (2,5,0), r2 = (4,4,0), and r3 = (5,20,0) for Figure 4d. The map dimensions are 20 × 20 m for the first two figures and 25 × 25 m for Figure 4c. These configurations are employed to examine the effectiveness of the robots' exploration process in various map configurations. Figure 4a demonstrates the exploration process with a single robot, Figure 4b illustrates two robots, and Figure 4c showcases the exploration process with three robots. Lastly, Figure 4d depicts the exploration process with the same number of robots but with different robot poses.
The purpose of presenting these experimental data is to evaluate the effectiveness of the proposed algorithm under various map configurations and robot positions. This adds to the study’s contribution by providing insights about the algorithm’s adaptability and performance in a variety of scenarios.