1. Introduction
The mobile robot-aided mapping of environmental processes, such as information sampling [
1], sensor coverage [
2], localization of source [
3], and monitoring of environmental phenomena [
4], has been well investigated. In particular, sensor coverage with multiple robots involves optimally positioning robots to maximize overall performance in terms of sensing environmental phenomena. Monitoring is a persistent process for identifying anomalies in physical processes by efficiently collecting the most informative samples. For all of these objectives, it is required to obtain the model of the underlying processes through the physical sampling or mapping of the environment.
Modeling of physical processes plays an important role in autonomous robots’ decision-making. Robots must create a model of the environmental phenomenon to accomplish mapping tasks, especially when the environment is unexplored [
5]. Mapping spatial distribution enables the robots to work autonomously in search and rescue missions and make decisions without human intervention (e.g., for rescuing targets in areas with high radiation exposure). Similarly, the robot requires knowing the areas with higher risks to always choose a path to connect to the network, maintain communication, etc. [
6,
7]. Therefore, robotics researchers are actively investigating different strategies for mapping physical processes, such as radiation, Wi-Fi signal strength, gas distribution, radio signal strength, etc., using unmanned vehicles [
3,
8,
9].
To map the environmental phenomenon, sensing and measuring the value of a physical process throughout the environment is crucial. However, not all the locations can provide helpful information about the change in the process itself. The primary considerations involved in mapping such processes include the degree of autonomy, accuracy, and efficiency. Measuring intensity at every location is deemed impractical; hence, dense sampling is not viable in mapping. Instead, an accurate, time- and cost-effective process model can be obtained by gathering samples from the points containing the most significant information.
Exploration refers to accumulating samples from previously unexplored areas to reduce uncertainty in the map, while exploitation implies determining the next sampling point based on the best information from the current estimates (to localize the source, for example). Mapping algorithms and techniques in the literature use either exploration or exploitation, or a combination of both. For example, an active control law for mobile robots is proposed in [
3] to shift between exploration and exploitation objectives; research in [
2] has utilized a utility function to adjust exploration and exploitation. On the other hand, a parallel strip route (pure exploratory approach) is used in [
10,
11] to explore the environment for mapping the spatial distribution. Hence, these two techniques (exploration and exploitation) are fundamental to the mapping process. The study in this paper aims to compare how well various exploration and exploitation techniques perform, as well as analyze how tradeoffs between exploring and exploitation affect different sampling objectives. Specifically, the contributions made in this paper are three-fold.
We comprehensively compare various information-sampling variants. Our analysis evaluates the balance of performance metrics related to the accuracy, confidence bound, time, and energy consumption for the exploration of objective and source localization accuracy for the exploitation objective. Additionally, we investigate how both objectives can be balanced.
We systematically analyze the impact of different source locations on this tradeoff using single-robot experiments with random walk (RW) and fixed sweep trajectory (FS) as the baselines for comparison.
We extend this analysis to multi-robot settings with fixed and dynamic Voronoi partition-based adaptive sampling [
12] assignments to each robot in the system.
The outcomes of this investigation provide significant perspectives on selecting appropriate weights in the information function for active sampling with mobile robots, especially in scenarios where it is necessary to strike a balance between exploration (ensuring well-balanced performance metrics) and exploitation (locating sources with minimal samples) objectives.
2. Related Work
In informative path planning, both adaptive (taking the informativeness of the sampled data into account) and non-adaptive (without considering informativeness) sampling approaches have been previously reported in the literature. Non-adaptive sampling methods focus on sampling the whole environment [
11,
13,
14]. Non-adaptive methods are time-consuming, and with such methods, it is hard to achieve the desired threshold (upper bound) of information certainty. Alternatively, adaptive sampling methods provide convergence to an objective (threshold), and sampling can be stopped as soon as the desired threshold is reached.
Table 1 provides detailed information about closely related works in the literature on adaptive information gathering. Several objective functions have been used in coordination with Gaussian process regression [
15] to map the physical process (i.e., to predict the samples at unvisited (unexplored) locations with confidence bounds).
The approaches to planning a robot path that contains the most informative samples have been well studied. The technique utilized in [
1] aimed to maximize the mean entropy information metric when searching for a station. An informative planner algorithm based on RRT was utilized to select the path that provides maximum utility, i.e., the tradeoff between the informativeness and cost to reach that station. However, the tradeoff was applied based on the budget; if the cost was lower than the budget, then the path was selected. The authors of [
4] employed entropy as an information criterion over the Sparse Gaussian Process to identify the most informative locations to persistently monitor salinity in the ocean. Similarly, Ref. [
19] used wireless signals for robot localization in an indoor GPS-denied environment. A path loss model was learned from the data, and then the Gaussian process was trained with the mismatches between the models and the data with a focus on better prediction of model variance. In [
8], the authors focused on mapping in structured environments. The algorithm partitioned the environment for each robot and used differential entropy as an information theory metric on top of Gaussian Process predictions to determine the next sampling point.
The authors in [
20] proposed a Hexagonal Tree (HexTree)-based sampling algorithm, which took samples over a set of hexagonal grid points and built a tree of possible trajectories by extending candidate trajectories toward the sampled points. In [
24], an energy-aware approach is introduced to balance between coverage and sampling. Similarly, a recent work in [
25] considered balancing the coverage and sampling (learning environmental model) objectives with a time-varying parameter. However, we consider adaptive sampling as an independent objective without the need for performing the area coverage of the environment, allowing us to focus solely on analyzing the informative sampling tradeoff with different objectives (exploration to obtain new data or exploitation of available data to complete the sampling objective).
Researchers have investigated decentralized methods for modeling the environmental process as well [
26,
27,
28,
29]. Ref. [
26] introduced a technique utilizing radial basis functions, emphasizing the cooperative learning of the model under communication constraints. Meanwhile, Ref. [
27] devised a multi-robot algorithm that employed a pure exploration strategy to map the underlying physical process using Spatial GPR. However, for our analysis, we chose a centralized server to mitigate potential biases associated with decentralization.
In a related study [
28], a decentralized informative path-planning algorithm was introduced, aiming to balance exploration and exploitation. The study also conducted a comparison of exploitation coefficients based on completion time and mapping accuracy. However, the efficiency of the algorithm is influenced by the robot’s starting position, a factor not thoroughly examined, particularly in a multi-robot setting with a constrained energy perspective. Our study differs from [
28] in that we holistically consider the energy consumption and confidence bound into account for the tradeoff analysis of different sampling objectives. We also report the effect of the information sampling parameter (coefficient) at different time instances during the exploration task. To avoid bias from the robot’s location or the source, we performed extensive analysis through simulation experiments with five different source locations and performed five trials per source location.
Jang et al. [
29] proposed an approach to learn the underlying model function employing decentralized GPR. The paper implemented a pure exploration strategy to decrease variance, shifting to pure exploitation when the model’s variance exceeded a specified threshold. However, a potential drawback of this technique is that exploitation will not be initiated until the variance threshold is reached. In practical settings, determining the accurate threshold value is challenging without specific knowledge of the environmental process.
In the context of a time-invariant physical process, exploitation proves more advantageous than exploration, offering valuable insight, such as identifying the location of the source and determining its intensity. Our study aims to comprehensively understand which values for the exploration and exploitation coefficients yield more favorable tradeoffs for their respective underlying objectives without relying on a predefined threshold.
Compared to the literature, our work is novel and unique in the following ways: (1) we extensively analyze several variants of adaptive sampling parameters through simulation experiments with multiple source locations and distributions; (2) we present the analysis from a pure sensing perspective (i.e., to obtain an accurate prediction of the sensed information throughout the environment) with both single-robot and multi-robot use cases; and (3) we discuss how to holistically balance between sampling and energy demands considering practical resource constraints.
3. Robot Sampling Aided by Gaussian Processes Regression
Gaussian process regression (GPR) has been widely utilized for modeling spatial processes. For example, Ref. [
30] used GPR to model spatial functions for mobile wireless sensor networks and to generate a likelihood model for signal strength measurements. To obtain a spatial map in an environment with limited communication, Ref. [
12] employed GPR to the occurrence of model algae bloom. GPR was utilized in [
3] to obtain a model of radio signal strength and to obtain the maximum likelihood of the source location. Therefore, we use GPR in our work to dynamically predict and continuously update the sensor data estimates for the whole map region (including unexplored locations) using the available data sampled so far from locations previously visited by the mobile robot.
A Gaussian process (GP) is a non-parametric continuous function that defines a probability distribution over functions. It assumes that every point has a normal distribution and that there is a correlation between values at these points. Let
q be the 2D location (x and y coordinates) where
from where the signal strength is measured, and let
z be the measurement. The value of
z at any location q can be related to a function
using the Gaussian noise model for the observed location
q as
where
is additive Gaussian noise.
We are interested in calculating a posterior function
that makes predictions for given test locations
. A GPR model, also known as Kriging, assumes a GP prior that can be completely defined using mean and covariance. The joint Gaussian distribution on the test set
, assuming noisy observation
z, can be defined as follows:
where
K is the covariance matrix between the training points,
is the covariance matrix between the training points and test points, and
is the covariance between only the test points. The posterior mean and variance for any testing location
learned by GPR are as follows:
In our experiments, we employ a widely used kernel, i.e., a squared exponential.
where
q and
are both training points and
and l are hyper-parameters, called variance and length, respectively. We continuously learn these hyperparameters by maximizing the log marginal likelihood of the observations [
3,
31]. The variance and mean in Equations (
3) and (
4) are used to calculate the informativeness of every point.
The data used for training correspond to the information collected by the robot up to this point, while the testing data in this specific scenario encompass all the locations within the environment that the robot has not yet visited but needs to predict the signals for so it can decide which location it can go to next to speed up the sampling process. The methodology utilizes the fact that, once GPR is trained, mean values can be extrapolated for the entire region of interest.
4. Experiment Design and Implementation
We developed the simulations using the Robot Operating Systems (ROS [
34]) Gazebo simulation framework, built on top of the open-source code base from [
21]. We considered a 10 m × 15 m simulated area free of obstacles (to avoid bias in the analysis due to collision-avoidance algorithms). Until the robot’s battery is depleted, it takes samples based on the informative function, navigates to the location, and collects samples. With each new sample, the Gaussian process regression is trained, the intensity values of the whole environment (map) are predicted, and the informativeness of each location is updated based on one of the six variants in
Section 3, which includes six adaptive sampling variants and the two baseline non-adaptive sampling variants. The robot’s starting location and battery timing were kept fixed for all scenarios.
The ground truth for the Wi-Fi signal map (
Figure 1) was generated as per the equation for the received signal strength indicator (RSSI) [
35,
36]:
where
is the signal reference power at
= 1 m,
f is the signal frequency (2.4 GHz),
is the power of the signal transmitter,
is the path loss factor (
),
d is the distance between the signal source (i.e., an Access Point) and robot,
is a Gaussian distribution with zero mean and variance (0.65 dB
) to represent noise in signals, similar to the settings in [
21].
Figure 2 presents an architectural overview of the control flow in the GPR-based informative path planning approach and the experimental design to analyze the influence of the information function (Equation (
6)) on the outcome of mapping objectives.
4.1. Single-Robot Experiments
For single-robot experiments, we deployed a hector UAV robot (an aerial robot) with a battery capacity to sustain 500 ROS seconds (with a real-time factor close to 1) and a starting position of (4.5, 0). To provide a thorough analysis of the impact of exploration and exploitation on online learning and the mapping of the spatial distribution map, we consider two baseline scenarios:
Fixed Sweep (FS)
Random Walk (RW)
In a Fixed Sweep (FS) baseline, the UAV sweeps the whole region in a horizontal parallel strip pattern starting from the bottom center of the region to the upper center of the region. The idea is to make UAVs familiar with the overall intensity changes of the environment. In a random walk (RW) baseline, the UAV randomly explores the surrounding region after randomly moving up, down, left, or right by 3 points from its current position. All the
variants would use the RW as the initial approach for their minimum number of samples before using GPR and the Information function in Equation (
6) to choose the next sampling point. Once the UAV finishes sweeping the whole region or has taken at least 15 samples from the surroundings (using RW), the informativeness of each location is updated, and the next sampling location is chosen based on the information function variant.
Following the ideas of the time-varying nature of the
coefficient in [
33] and to further support our analysis, we include a new
TimeVariant (TW) approach that dynamically varies the
value with respect to the time evolution of the mission, gradually moving from exploration (giving full priority to the uncertainty of the information value in Equation (
6)) to exploitation (giving full priority to the predicted mean value of the information function). Inspired by a similar approach to balancing coverage and learning in [
25] and balancing coverage and recharging in [
24] throughout the mission duration, we present the time-variant coefficient below in Equation (
9) that dynamically changes the priority from exploration to exploitation based on the expected mission period (e.g., based on the maximum energy capacity or the task requirement).
Here, is the minimum time required to collect enough samples with which GPR can become useful, is the maximum time allocated for the mission. The idea for this time-varying is to generalize the dependence of with respect to robot limitations such as energy, communication, and task requirements. For instance, energy-limited robots like UAVs can choose their based on their current energy level. Higher energy means the robot can explore better, reaching farther regions, while lower energy can let it exploit the signal variations to find the peaks (sources).
This TimeVariant approach is expected to provide a balanced performance on multiple objectives in terms of mapping accuracy and source localization, but the cost of energy efficiency is not entirely known. Therefore, this paper will add this novel perspective in the comprehensive comparison of the sampling objectives when the information function priorities are fixed (a constant
in
Table 2) or dynamically change during the mission (a time-variant
in Equation (
9)).
4.2. Multi-Robot Experiments
Researchers have used the Voronoi partitioning method in a multi-robot setting to divide an environment for multi-robot sampling [
2,
12,
16]. Here, the robots are driven toward the centroids of the respective Voronoi region to maximize the mapping (sampling) performance and minimize the sensing cost. Robots choose the most informative location within the Voronoi region based on a utility function encompassing exploration and exploitation. Specifically, the work in [
21] uses the heterogeneity of robots to weight the Voronoi partition, which is continually updated during the sampling process. Motivated by the investigations in the works mentioned above, we use Voronoi partitions to analyze multi-robot settings to distribute regions among multi-robots. To divide the given region
for n robots, we divide the environment into n regions that the partition
for each robot that
i corresponds to:
In the case of multi-robot sampling, we have considered the following two scenarios of Voronoi partitioning:
Fixed Voronoi Partition (FVP)—Considering only the initial robot positions, the region associated with the robot is decided at the start of the experiment. We fix these positions throughout the experiment, and the utility function determines what points within the respective region are to be chosen as target points.
Dynamic Voronoi Partition (DVP)—In this scenario, the Voronoi partition continuously updates as the robot moves. The target point can only be determined if it belongs to the respective partition at the time of the request based on the informative function.
For multi-robot experiments, 3 simulated Jackal UGV robots were deployed to the same Gazebo simulation framework. The initial positions of the three robots are (3, 2), (3, 10), and (7, 7), respectively. For the multi-robot sampling scenario, we only employed the RW baseline and took five random samples per robot (totaling 15 samples) within the Voronoi region before utilizing adaptive sampling. The baseline variant of both scenarios (FVP and DVP) is random walk sampling (non-adaptive).
4.3. Performance Metrics
We consider the following performance metrics:
Samples: The number of Wi-Fi signal strength samples the robot takes using its Wi-Fi device. This number should be minimized for better information sampling.
RMSE: The root mean squared error between the predicted mean information (Wi-Fi signal strength) through the GPR and the ground truth information. The aim is to obtain predictions as close as possible to the ground truth, i.e., lower RMSE. The RMSE values in the tables and figures represent the average RMSE over the whole map.
Variance: The confidence bounds of the predicted values given by the GPR. The goal is to be confident about the predicted mean value, i.e., lower variance. The variance values in the tables and figures represent the average variance over the entire map.
Cumulative Distance: Cumulative distance refers to the total distance traveled by the robot. The shorter the distance traveled, the lower the power consumption. We have used the cumulative distance metric to determine the energy cost incurred by the robot. The cumulative distance should be as low as possible for the optimum approach.
Source localization accuracy: If the location at where the maximum mean value of the predicted GP map lies within 1m of the actual source location, then that is classified as the correct localization; otherwise, the localization is incorrect. The localization accuracy is the percentage of correct localization of all trials out of all source location experiments combined.
The energy consumption of a robot depends mainly on two types of devices onboard a robot: time-dependent hardware resources (e.g., computer, controller, and sensors) and mobility-dependent hardware resources (e.g., motors and manipulators). Accordingly, we can derive the instantaneous change in the energy consumption equation of a mobile robot as [
37],
where
and
are the coefficient parameters that provide importance to time-dependent and distance-dependent energy consumption. In real-world robotic systems, the motion (distance or velocity of the motors) significantly influences the energy characteristics of mobile robots [
37,
38]. For instance, it was found in [
39] that the motion component consumes up to 95% of power in a mobile robot. Since we set the velocity of the robot constant in our analysis for the sake of controlled comparison of sampling objectives, we considered the
cumulative distance as the key metric representing the energy consumption and obtained a measure of the energy efficiency of the robot’s sampling trajectory.
We ran five trials per variant in each scenario. Further, the experiments were repeated for five different Wi-Fi source locations, with each being at the middle, top-left, top-right, bottom-left, and bottom-right corners of the map area. In total, we conducted more than 1400 simulations for this analysis with the core results derived when the radio signal path loss factor in Equation (
8) (
) and repeated them for a different loss factor (
) to analyze the impact of the environment in
Section 5.4.
5. Results and Analysis
In adaptive sensing and the efficient modeling of environmental phenomena with robot-aided observations, the goal is to minimize the prediction (sensing) variance, improve the prediction (sensing) accuracy, and conserve energy by utilizing predictions promptly. It is generally understood that the exploration objective seeks to minimize the variance (uncertainty) of the predicted information, while exploitation seeks to minimize the RMSE of the predicted map (information accuracy). In both cases, we need to make use of the predictions as soon as possible. For instance, in the case of exploitation, we need to identify the signal source location, i.e., the place with the maximum signal intensity. An informative function can be either exploitation-based, exploration-based, or a weighted combination of both. In our variants, the MaxMean strategy does pure exploitation, whereas MaVar, Fixed Sweep, and Random Walk are pure exploration strategies. The rest of the variants combine exploration and exploitation. Below, we present the results of the single-robot and multi-robot experiments separately and then summarize the common analysis from an information sampling perspective.
5.1. Single-Robot Experiment Results
Table 3 summarizes the performance metrics results obtained by averaging the data collected over all trials with different source locations for the single-robot experiments. Detailed results are shown in
Figure 3, where the plots of performance metrics (RMSE, variance, and cumulative distance) are presented, comparing different variants of the analyzed information functions. An example evolution of the RMSE and variance of the GPR predictions for single-robot experiments can be seen in
Figure 4.
In
Figure 3b, summarized views of these performance metrics are presented from the perspective of the
parameter in different experiment settings. Here, we can visibly observe that the accuracy of the sensed information (RMSE) decreases when
and the uncertainty (variance) of the predicted information decreases with
. So, ideally, the
should be balanced to obtain high accuracy and low uncertainty. However, when the energy perspective is added (i.e., the cumulative distance), then the selection of
becomes complicated. Therefore, an in-depth discussion of this nature is essential to meaningfully analyze the tuning of the sampling function parameters in adaptive sensing and informative path planning applications. We present the discussion from the exploration and exploitation perspectives below. Depending on the mission requirements, one can choose the informative path-planning coefficients (
and
), and our study provided a direction toward this objective. It is worth noting that our focus lies on methodologies that accomplish both exploration and exploitation objectives while also conserving energy.
5.1.1. Exploration Perspective
The exploration objective is to obtain accurate predictions of the sampled environmental process with the highest confidence bounds, i.e., the lowest variance (uncertainty) in all map areas. We compare the performance metrics with respect to the variation in the
values in Equation (
6) at different instances during the exploration task.
As shown in
Table 3 and
Figure 3b, the higher the alpha values, the lesser the cumulative distance will be. Furthermore, the number of samples required for RMSE and variance saturation also increases with alpha. MaxVar, i.e.,
, yielded the best convergence results for all cases, but at the cost of increased distance and the number of samples that the robot had to collect. In the MaxVar approach, the robot prioritizes exploring locations with higher uncertainty that have not yet been visited. These locations could be situated at a considerable distance from the robot. On the other hand, the time-varying alpha (TW) approach kept increasing
behaving like a MaxVar-like variant in the beginning and a MaxMean-like variant in the end. Due to this time-varying nature, the uncertainty has a huge variability and was very high (close to 12.85 dBm
) in the end, which could not meet the exploration objectives of the mission.
For , the values for variance are not stable as the robot prioritizes exploitation, gets stuck in local optima, and keeps taking samples from the same location without exploring further; therefore, new sampling data do not consistently improve the map variance. Hence, values near 0.25 demonstrate optimal convergence and a well-balanced performance across all metrics. This is attributed to the fact that, in selecting the next target location to visit, the robot assigns higher importance to variance with a factor of . This approach enables the robot to explore while considering locations with maximum mean values, facilitating a better understanding of the source. For a more accurate selection of alpha values, the energy consumption and variance must be considered depending on the specific mapping scenario.
5.1.2. Exploitation Perspective
We take the source localization example as the objective of exploitation in our work. To properly locate a source, a robot should detect and provide a GP map with maximum mean at a point within 1 m of the real source location in any direction.
Table 4 shows the source localization accuracy of all variances in all scenarios based on the number of times the resultant GP map can be used to correctly identify the source locations at different instances in the sampling process.
It can be observed from
Table 4 that the localization accuracy of each approach improved with increasing numbers of samples. We are interested in identifying approaches to obtain better results with fewer observations. The RW, MaxMean, and FS approaches did not perform well, especially in the early stages of the experiment. With the FS baseline, fewer measurements are taken, while with MaxMean, the measurements are taken repetitively at the same location (local maxima) since the information function (with
) only depends on the predicted GP mean. The RW approach had improved source localization performance, but it was slower than the other approaches. RW’s results were better after just 25 samples than FS’s after 35 samples. In all scenarios (except FS), the initial 15 samples were obtained through random walk. Except for MaxMean, FS, and RW, all variants demonstrated strong performance in source localization. This can be attributed to their limited exploration and lack of consideration for information. Counterintuitively, giving full priority to the predicted mean value of the information function (by setting
, the sampling function could not obtain the source’s peak since learning the process was compromised by not accounting for the uncertainty of the predictions.
MaxVar, however, does not represent a cost-effective approach since it involves very long distances. MaxVarMaxMean, Alpha0.25, Alpha0.5, TW, and Alpha0.75 exhibit quicker convergence and cost effectiveness. If the MaxVarMaxMean approach fails to meet the variance and RMSE thresholds, it behaves identically to MaxVar.
We discovered that the alpha range
works well for exploitation objectives when minimizing distance cost is the first priority (e.g., if energy availability is heavily limited [
37]). However, the increased variance continues to be a concern as the alpha value increases. To further narrow down the selection for exploitation within this range, the need to maintain a threshold variance versus energy consumption needs to be considered. In particular, Alpha0.25 is most effective when a balanced tradeoff is necessary, especially in scenarios where source localization accuracy needs to be enhanced. Interestingly, in situations where cost is not a concern, the MaxVar approach proved to be the most effective in achieving both mapping accuracy and confidence in exploitation performance.
5.2. Multi-Robot Experiment Results
Table 5 summarizes the results for performance metrics obtained by averaging the data collected over all trials with different source locations for the multi-robot experiments. In
Figure 5, summarized views of these performance metrics are presented from the perspective of the
parameter in different experiment settings. Similarly to the single-robot experiments, we can observe that the mapping accuracy (RMSE) and the uncertainty (variance) of the predicted information improve with the reduction in
. Accordingly, the
should be the lowest value for all robots in the team. But, as you can see in the MaxVar approach where
, the energy consumed is the highest. We discuss the impact of this parameter in a multi-robot setting below, with the aim of setting the
for all the robots to leverage the advantages offered by the multiple robots in completing the sampling mission.
5.2.1. Exploration Perspective
The distance plots for multi-robot scenarios (see
Figure 5a,b) show that all variants of DVP approaches had much shorter travel distances than the same approaches based on FVP. However, the FVP scenario resulted in improved variance as well as the speed of convergence compared to DVP scenarios. Consequently, we can conclude that the DVP scenario is suitable for cost-effective sampling (less energy), while the FVP scenario is suited for faster convergence and better exploration results. This is because, in the FVP approach, the robot is assigned to a fixed region, and its Voronoi partition does not change when it visits a corner. On the contrary, in DVP, when a robot visits a corner, its Voronoi partition undergoes a significant change from the previous time step, leading to longer distances.
Consistent with the findings in single-robot experiments, MaxVar has promising results in terms of variance and takes fewer samples, but its distance cost is almost twice that of Alpha0.75 and MaxVarMaxMean. Additionally, when , we can observe stable variance with an extended distance. An value close to 0.25 offers the optimal balance between all the performance metrics while performing close to the MaxVar approach. It effectively reduces variance and RMSE while keeping the increase in distance within an acceptable range.
5.2.2. Exploitation Perspective
Table 6 presents the source localization results for the multi-robot experiments. Here, we observe similar results for both the multi-robot partitioning settings (FVP and DVP), where the Alpha0.25 variant still balances both source localization accuracy and the energy-consumption requirements well. After 25 samples, the dynamic and fixed Voronoi partitions performed close to each other, and they were successful at locating the source much faster, even with just 25 samples. This is expected, as more robots in the multi-robot system contribute to the task objectives and improve performance and efficiency. The performance improvements found with MaxMean and RW were lower than those of single-robot experiments. Generally, the alpha range
is useful to obtain a balanced performance for the exploitation objective. Higher Alpha variants like Alpha0.75 travel a lower distance and can localize the source faster but at the expense of decreased variance (see
Table 5).
5.3. Impact of Source Locations on the Sampling Performance
We also analyzed the impact of different source locations on the sampling performance (results for these special cases are available in the
Supplementary Materials). We found that there was almost no impact on the results across all sources, especially when the
value (i.e., the weights towards confidence bounds) was higher. However, for variants where
the value is higher (MaxMean, Alpha0.75, and Alpha0.5), they gave significantly different results for the furthest source locations at the bottom-right (0, 14) and top-right (9, 14) parts of the map area. This could be attributed to the fact that when
is higher, exploitation is more preferred, and therefore, localizing a much farther source could be difficult to accomplish. In summary, the effect of source locations was not observed for informative functions with greater weights for variance (exploration).
5.4. Impact of Wi-Fi Signal Distribution on the Sampling Performance
Further, we analyzed the impact of the signal distribution itself on the results by repeating all the single-robot cases with different path loss exponent (
in Equation (
8)) (results for these special cases are included in the
Supplementary Materials). It was observed that all approaches with
= 2 performed quite well in comparison to the cases where
, and we found that the change in both variance and RMSE metrics was smoother for all variants when
= 2 than the same approaches when
= 3. Nevertheless, the change in the signal distribution had a minimal impact on our analysis, and the observations made for
above hold for
as well.
5.5. Summary of Findings
Our findings suggest that optimizing can help strike a balance between the number of samples, the energy incurred, and the prediction accuracy while maintaining a high level of confidence. Specifically, assigning significance to the mean value is crucial. It is critical to give importance to both the mean and the variance of the predicted map; however, we determined that prioritizing the variance would quickly reduce the mapping uncertainty and help efficiently find the signal source in the map. This would allow the mapping process to simultaneously achieve exploration and exploitation objectives while maintaining a balance in the energy consumption attributed to the distance metric. Based on our analysis, an alpha value near represents an optimal balance between the two objectives, enabling robots to model a physical process efficiently and accurately.
Our analysis can help to decide the
values for specific scenarios based on the objective. For instance, the objective outlined in [
29] is for the robot to model the physical process, identify the source, and navigate to the source location. The proposed algorithm bears a resemblance to our MaxVarMaxMean approach. In this method, the algorithm explores the environment until a specific threshold is reached, subsequently employing MaxMean for exploitation. However, in real-world scenarios marked by noise and error-prone sensing, achieving the designated threshold for variance may be challenging for the robot. In such situations, continuous exploration persists, leading to increased distance traveled and higher energy consumption. On the other hand, a dynamic change in priority, as in the TimeVariant approach, could help balance the sampling objective (e.g., source localization) with other objectives such as achieving optimal coverage of the environment [
25], but at the cost of increased uncertainty of the predicted data, which would be of extreme value in a mapping task.
To address this issue, the algorithm can be enhanced by incorporating Alpha0.25, which proved to provide the best balance of all metrics, including energy consumption. This modification assigns significance to both the mean and the variance throughout the experiment, demonstrating comparable performance to the MaxVar approach during exploration and effectively identifying the source location. Relaxing the threshold condition makes this approach less susceptible to variations in source location, ultimately conserving energy. Subsequently, the robot can navigate to the source location after reaching a predefined number of samples. A similar approach can be used in [
16]. In methodologies similar to [
11], the goal is to model the environment by utilizing an aerial robot and subsequently identifying the sources using a ground robot; Alpha0.75 proves to be a suitable choice in this context, mitigating energy consumption while maintaining a source localization accuracy on par with alternatives. However, it is worth noting that, as previously discussed, Alpha0.75 is accompanied by increased variance. Nevertheless, when it comes to applications safety-related applications like nuclear radiation mapping [
40], where precision takes precedence over energy conservation, the MaxVar approach proves to be the most suitable choice.