To address the inherent limitations of the traditional ACO algorithm in terms of computational efficiency and convergence to local optima, in this section we propose the intelligently enhanced ant colony optimization algorithm (IEACO). The proposed algorithm improves the performance of ACO through the integration of six key advancements. First, an optimized initial pheromone distribution strategy is applied to enhance early-stage search efficiency. Second, an ε-greedy mechanism is incorporated into the state transition probability function to achieve a dynamic balance between exploration and exploitation. Third, adaptive methods for adjusting the heuristic function and pheromone intensity exponents are introduced, enhancing the algorithm’s adaptability across different scenarios. Fourth, a composite heuristic function based on target distance and turning angles is constructed to improve path evaluation accuracy. Fifth, an improved global pheromone update mechanism is designed to effectively reduce the risk of local optima entrapment. Finally, a multi-objective evaluation framework is established, transforming path planning into a multi-objective optimization problem that comprehensively considers path length, safety, energy consumption, and time complexity to achieve more thorough path optimization.
4.1. Non-Uniform Distribution of Initial Pheromone
The initial pheromone distribution plays a crucial role in guiding the initial search paths of ACO. As illustrated in Figure 4, traditional ACO employs a uniform distribution of initial pheromone in the grid map. This approach leads to numerous ineffective searches during the initial iterations, not only reducing convergence speed but also negatively impacting the algorithm’s overall efficiency and solution quality. To address this issue, this study proposes a non-uniform initial pheromone distribution strategy, depicted in Figure 5. This novel distribution strategy considers multiple key factors, including the distance from grid points to the target point, the distribution of surrounding obstacles, and the distance from the start point to the target point. The proposed non-uniform distribution of initial pheromone is mathematically formulated in Equation (6):
where $\tau_0$ represents the initial pheromone concentration, $\tau_0'$ denotes the improved pheromone concentration, D represents the distance information calculated using Equation (7), and n indicates the number of traversable free grid cells surrounding the current point i. In addition, $d_{iT}$ is the Euclidean distance from grid point i to the target point T, calculated as $d_{iT}=\sqrt{(x_i-x_T)^2+(y_i-y_T)^2}$, where the coordinates of i are $(x_i, y_i)$ and those of T are $(x_T, y_T)$, while $d_{ST}$ represents the Euclidean distance from the start point S to the target point T, calculated as $d_{ST}=\sqrt{(x_S-x_T)^2+(y_S-y_T)^2}$, where the coordinates of S are $(x_S, y_S)$. It is noteworthy that $d_{iT}\geq 1$, where 1 represents the length of a single grid cell. Figure 6 shows an example calculation of d.
This study proposes an innovative non-uniform pheromone initialization strategy that significantly enhances the ability of individuals to identify advantageous grids during ACO. Specifically, as illustrated in Figure 7, as the distance $d_{iT}$ from grid point i to the target point T increases, the value of $1/d_{iT}$ decreases, resulting in reduced pheromone concentration on grid i and consequently lowering its probability of selection. However, Figure 7 also reveals that as d increases, the rate of change in $1/d$ gradually plateaus. To address this issue and amplify the range of variation, we introduce the distance $d_{ST}$ as a coefficient in the calculation of D. Furthermore, this study considers the significant impact of the obstacle distribution around each grid on path selection. In Figure 8, black grids represent obstacles, while white grids indicate free spaces. Each grid is surrounded by eight adjacent grids, and the number of adjacent free grids is positively correlated with the grid’s suitability as a path point. As the number of surrounding obstacles increases, the pheromone concentration on the grid gradually decreases, reflected by a reduction in grayscale value.
By integrating the distance information of the current grid and the distribution of surrounding grids, we optimize the initial pheromone into a non-uniform distribution. This method not only considers the traversability of grids but also reflects their potential advantages as path points. Therefore, this improvement increases the probability of the algorithm selecting advantageous areas in the initial stages, thereby enhancing its initial search efficiency.
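For illustration, a minimal Python sketch of such an initialization is given below. Since Equations (6) and (7) are not reproduced here, the distance term is assumed to take the form $D=d_{ST}/d_{iT}$ and the free-neighbor count n is assumed to scale the base value $\tau_0$; the exact published formula may differ.

```python
import numpy as np

def nonuniform_initial_pheromone(grid, start, target, tau0=1.0):
    """Illustrative sketch of a non-uniform initial pheromone field.

    Assumes D = d_ST / d_iT weighted by the fraction of free neighboring
    cells, following the qualitative description in Section 4.1.
    `grid` is a 2-D array with 0 = free cell, 1 = obstacle.
    """
    rows, cols = grid.shape
    d_st = np.hypot(start[0] - target[0], start[1] - target[1])
    tau = np.zeros_like(grid, dtype=float)
    for x in range(rows):
        for y in range(cols):
            if grid[x, y] == 1:          # obstacles carry no pheromone
                continue
            # distance to target, floored at one grid cell to avoid /0
            d_it = max(np.hypot(x - target[0], y - target[1]), 1.0)
            # count traversable cells among the 8 neighbors
            n_free = sum(
                1
                for dx in (-1, 0, 1)
                for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0)
                and 0 <= x + dx < rows and 0 <= y + dy < cols
                and grid[x + dx, y + dy] == 0
            )
            D = d_st / d_it              # assumed distance term
            tau[x, y] = tau0 * D * n_free / 8.0
    return tau
```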
4.2. Improved State Transition Probability Using the ε-Greedy Strategy
In traditional ACO, the roulette wheel selection mechanism is the primary means by which individual ants move towards candidate points. However, the presence of numerous candidate nodes may significantly increase the computational complexity of traditional ACO. To address this issue, we introduce a deterministic state transition probability rule combined with an ε-greedy strategy to achieve a balance between exploration and exploitation. The improved state transition probability is presented in Equation (8):
where the argmax() function is used to determine the node j that maximizes the objective function $f(j)=[\tau_{ij}]^{\alpha}[\eta_{ij}]^{\beta}$, q is a random variable uniformly distributed in the interval [0, 1] that controls the transition mode, and ε is the threshold value of q. When $q \geq \varepsilon$, ant m will move at time k+1 to the node that produces the maximum product of pheromone concentration and heuristic visibility. This deterministic selection mode facilitates faster convergence of the algorithm. Conversely, when $q < \varepsilon$, ant m will select the next node according to the roulette wheel selection mechanism of the traditional ACO. The deterministic state transition probability rule explicitly defines the ant’s next action through a mathematical model, thereby reducing the computational burden associated with randomness. At the same time, the introduction of the ε-greedy strategy allows the algorithm to select the optimal path in most cases while retaining a certain probability of randomly choosing alternative paths. This strategy effectively balances the algorithm’s global search capability with its local optimization ability, enhancing overall algorithmic performance.
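To make the selection rule concrete, the following Python sketch (an illustration rather than the authors’ implementation) combines the greedy argmax mode with roulette wheel sampling. The score $f(j)=\tau_j^{\alpha}\eta_j^{\beta}$ and the comparison of a uniform draw q against the threshold ε follow the description above; the container types for `tau` and `eta` are assumptions.

```python
import numpy as np

def select_next_node(candidates, tau, eta, alpha, beta, eps, rng=np.random):
    """Sketch of the epsilon-greedy state transition of Section 4.2.

    `candidates` lists the admissible next nodes; `tau[j]` and `eta[j]` are
    their pheromone and heuristic values. With probability 1 - eps the greedy
    (argmax) mode is used, otherwise roulette wheel selection.
    """
    scores = np.array([tau[j] ** alpha * eta[j] ** beta for j in candidates])
    if rng.random() >= eps:
        # deterministic (greedy) mode: node with the largest score
        return candidates[int(np.argmax(scores))]
    # random mode: roulette wheel selection proportional to the scores
    probs = scores / scores.sum()
    return candidates[int(rng.choice(len(candidates), p=probs))]
```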
In fact, the value of ε determines the probability of choosing between the deterministic selection mode and the random selection mode. Therefore, ε significantly influences the convergence speed and global search capability. If the value of ε is small, the selection of the next point is more likely to occur in the deterministic mode; this accelerates convergence, but reduces the global search capability. If the value of ε is large, the transition tends towards the random mode, which increases the randomness of path selection and results in increased computational complexity. Thus, it is necessary to propose rules for setting the value of ε in order to balance the deterministic and random modes. Consequently, an adaptive adjustment mechanism for ε is introduced to improve the state transition probability rule. The proposed formula for calculating ε is shown in Equation (9).
In the early stages, ε takes a smaller value, favoring deterministic transitions with higher probability, which accelerates the search for locally optimal paths. In the intermediate stages of the algorithm’s execution, ε takes larger values, increasing the probability of choosing random transitions and thereby helping to avoid local optima. As the algorithm iterates to the later stages and its evolutionary direction becomes essentially determined, the value of ε is gradually reduced to accelerate convergence. Improving the state transition probability using the ε-greedy method thus helps to achieve a good balance between global search capability and convergence speed. Figure 9 illustrates the adaptive adjustment of ε.
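Because Equation (9) is not reproduced here, the schedule below is only a hypothetical piecewise function mirroring the small–large–small behavior described above; the stage boundaries and the bounds `eps_min` and `eps_max` are assumptions.

```python
def adaptive_epsilon(k, K, eps_min=0.1, eps_max=0.5):
    """Hypothetical epsilon schedule; Equation (9) is not reproduced here.

    Mirrors the qualitative behavior in the text: epsilon starts small,
    rises in the middle iterations to encourage exploration, and decays
    again towards the end to speed up convergence.
    """
    t = k / K
    if t < 1 / 3:            # early stage: mostly deterministic transitions
        return eps_min
    if t < 2 / 3:            # middle stage: more random transitions
        return eps_max
    # late stage: linearly shrink epsilon back towards eps_min
    return eps_max - (eps_max - eps_min) * (t - 2 / 3) / (1 / 3)
```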
4.4. Multi-Objective Heuristic Function
In traditional ACO, the heuristic function $\eta_{ij}$ is typically calculated as the reciprocal of the Euclidean distance between the current node i and the next node j, i.e., $\eta_{ij}=1/d_{ij}$; however, in a grid-based environment model, the distance between adjacent grids can be either 1 or $\sqrt{2}$. This causes the ants to be overly reliant on the heuristic function during the path search process, reducing the algorithm’s effectiveness and global search capability. To improve the algorithm’s global search efficiency, this study proposes an enhanced heuristic function; specifically, we incorporate the target point’s location information and a turning penalty factor into the heuristic function. The improved heuristic function is provided by Equation (12):
where $\lambda_1$ and $\lambda_2$ are the respective weighting coefficients of the target-distance term $d_{jT}$ and the turning-angle term θ, which are subject to the constraint $\lambda_1+\lambda_2=1$. The term θ denotes the angle formed by the start point S, the current point i, and the target point T, as illustrated in Figure 12.
The improved heuristic function comprehensively considers both the target distance information and the turning angle; the smaller the distance and the turning angle, the larger the heuristic value. The dynamic adjustment strategy for the weighting coefficient $\lambda_1$ is shown in Equation (13). Given that the sum of $\lambda_1$ and $\lambda_2$ is always 1, uniquely determining one of them also determines the other. As the iterations progress, the value of $\lambda_1$ gradually increases, leading to an increase in the denominator of the heuristic function and thereby progressively reducing the heuristic function value. This mechanism effectively reduces unnecessary searches in the later stages of the algorithm, enhancing its computational efficiency. Based on this improved heuristic function, the algorithm can better balance local and global information when selecting the next node, which enhances the goal-oriented nature of the search. This improvement not only increases the search efficiency of the algorithm but also significantly enhances the smoothness of the planned path. In addition, the turning penalty factor further improves path quality: it penalizes excessive turning, promoting smoother and more continuous paths with fewer sharp turns, which enhances the overall efficiency and stability of the planned trajectory and helps avoid unnecessary detours, leading to more direct paths.
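As an illustrative sketch (not the published Equation (12)), the following Python function combines the target distance and the S–i–T angle in the assumed form $\eta = 1/(\lambda_1 d_{jT} + \lambda_2 \theta)$, where the small additive constant merely guards against division by zero.

```python
import math

def improved_heuristic(current, candidate, start, target, lam1, lam2):
    """Sketch of a distance-plus-turning-angle heuristic (cf. Equation (12)).

    Assumes eta = 1 / (lam1 * d_jT + lam2 * theta), where d_jT is the
    Euclidean distance from the candidate node to the target and theta is
    the S-i-T angle at the current node; lam1 + lam2 is assumed to be 1.
    """
    d_jt = math.hypot(candidate[0] - target[0], candidate[1] - target[1])
    # angle at the current node between the directions towards S and towards T
    v1 = (start[0] - current[0], start[1] - current[1])
    v2 = (target[0] - current[0], target[1] - current[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    theta = math.acos(max(-1.0, min(1.0, dot / norm))) if norm > 0 else 0.0
    return 1.0 / (lam1 * d_jt + lam2 * theta + 1e-9)  # small constant avoids /0
```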
4.5. Adjusting the Global Pheromone Update Mechanism
In global path planning problems, the pheromone update mechanism of the traditional ACO algorithm optimizes ant search behavior by dynamically adjusting the path selection probabilities. This mechanism is primarily controlled by two key parameters: the evaporation coefficient ρ and the pheromone intensity coefficient Q. The evaporation coefficient ρ suppresses the algorithm’s tendency to become trapped in local optima, while the intensity coefficient Q enhances the attractiveness of effective paths. However, the static nature of these two parameters in traditional ACO may limit the algorithm’s overall performance. As the iteration process progresses, excessive pheromone concentrations may accumulate on advantageous paths, causing more ants to favor these high-concentration paths. This phenomenon can significantly reduce the algorithm’s global search capability and increase the risk of becoming trapped in locally optimal solutions. To overcome these limitations and enhance the algorithm’s robustness, this study proposes a method for dynamically adjusting the evaporation coefficient ρ and the pheromone intensity Q. This method is implemented through Equations (14) and (15), and aims to maintain a balance between exploitation of known good solutions and exploration of the search space, improving the algorithm’s ability to find globally optimal solutions.
In the initial and intermediate stages of the algorithm, as the iteration count k increases, the improved evaporation coefficient ρ exhibits an increasing trend while the improved intensity value Q demonstrates a decreasing trend. This dynamic adjustment mechanism gradually reduces the pheromone levels during the path planning process, thereby encouraging the ant colony to explore new solution spaces and enhancing the algorithm’s global search capability. In the later stages of the algorithm, once the globally optimal path has been identified, the pheromone intensity and evaporation rate remain constant; this prompts the ant colony to favor the paths with higher pheromone concentrations, accelerating the algorithm’s convergence. Figure 13 and Figure 14 respectively illustrate the variation trends of the ρ and Q parameters.
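A hypothetical schedule consistent with these trends is sketched below; since Equations (14) and (15) are not reproduced here, the linear ramps, the parameter bounds, and the switch point are illustrative assumptions only.

```python
def adaptive_rho_q(k, K, rho0=0.3, rho_max=0.7, q0=100.0, q_min=40.0,
                   switch=0.75):
    """Hypothetical schedules for the evaporation coefficient and intensity.

    Reflects only the described trends: rho increases and Q decreases with
    the iteration count k during the early and intermediate stages, then
    both are held constant after the assumed switch point `switch * K`.
    """
    t = min(k / (switch * K), 1.0)
    rho_k = rho0 + (rho_max - rho0) * t      # increasing evaporation
    q_k = q0 - (q0 - q_min) * t              # decreasing intensity
    return rho_k, q_k
```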
In path planning applications utilizing ACO, the max–min ant system (MMAS) prevents the algorithm from converging to local optima by constraining the range of pheromone concentrations. Specifically, the MMAS imposes upper and lower bounds on the pheromone concentration $\tau_{ij}$ of each edge, ensuring that the updated pheromone levels remain within the interval $[\tau_{\min}, \tau_{\max}]$. This strategy mitigates excessive pheromone accumulation on certain paths, thereby enhancing the diversity of the search space and reducing the occurrence of local optima. This constraint is expressed in Equation (16):
where $\tau_{\min}$ and $\tau_{\max}$ denote the minimum and maximum pheromone concentrations, respectively. Drawing from previous experimental findings, the calculation of $\tau_{\min}$ and $\tau_{\max}$ can be enhanced by incorporating iterative patterns and historical performance [35]. Equation (17) presents an improved formulation for computing $\tau_{\min}$ and $\tau_{\max}$:
in which $L_{\mathrm{best}}$ represents the minimal path length attained by an ant over k iterations. As demonstrated in Equation (17), incorporating the pheromone evaporation coefficient and the optimal path length under the current iteration enhances the convergence efficiency of the ACO algorithm. This modification provides more effective guidance for node selection by individual ants. Furthermore, integrating $L_{\mathrm{best}}$ enhances the adaptability of $\tau_{\max}$; in scenarios where $L_{\mathrm{best}}$ is large, indicating an expansive path search domain, the value of $\tau_{\max}$ can be increased to broaden the exploration space.
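For illustration, the sketch below applies MMAS-style bounding using the classical choice $\tau_{\max}=1/(\rho \cdot L_{\mathrm{best}})$; the exact bound formulas of Equations (16) and (17) are not reproduced, and the $\tau_{\min}$ ratio is an assumption.

```python
import numpy as np

def clamp_pheromone(tau, rho_k, best_len, tau_abs_min=1e-4):
    """Sketch of MMAS-style pheromone bounding (cf. Equations (16)-(17)).

    Uses the classical MMAS choice tau_max = 1 / (rho_k * best_len) and a
    fixed fraction of tau_max as tau_min (both assumptions), then clips the
    pheromone matrix into [tau_min, tau_max].
    """
    tau_max = 1.0 / (rho_k * best_len)
    tau_min = max(tau_max / 20.0, tau_abs_min)   # assumed ratio
    return np.clip(tau, tau_min, tau_max)
```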
By adjusting the state transition probability and the global pheromone update mechanism, the algorithm can escape local minima and increase effective exploration. Specifically, adaptive adjustment of the evaporation coefficient and pheromone intensity encourages continual exploration during the early stages of the algorithm, preventing early stagnation. Additionally, re-evaluating the adaptive variable at each iteration allows for a dynamic balance between deterministic (greedy) and probabilistic (explorative) selection, helping to avoid convergence to suboptimal solutions. Moreover, the transition criterion based on the current iteration k, with the threshold defined as a fraction of the total number of iterations K, has been experimentally validated; this adaptive threshold promotes a gradual shift from exploration to exploitation, enhancing the algorithm’s ability to escape local minima. The combination of these strategies significantly improves the algorithm’s robustness and ensures effective exploration, leading to higher-quality global solutions.
4.6. Multi-Objective Function Evaluation Index
In order to obtain a more accurate and effective solution, this study reformulates the path planning problem as a multi-objective optimization problem. We selected four key indicators as evaluation criteria: path length, safety, energy consumption, and time complexity. These indicators collectively constitute the constraints for multi-objective optimization of mobile robots, reflecting critical aspects such as path efficiency, reliability, energy efficiency, and computational performance [
36]. Utilizing the TurtleBot3-Burger as the experimental platform, we conducted an in-depth analysis of these evaluation indicators to quantify the algorithm’s performance in practical application scenarios.
(1) Path Length Evaluation Index
Path length is a critical performance metric for mobile robots, as its value directly affects both the movement efficiency of the robot and the time required to complete tasks. The path length is given by Equations (18) and (19):
where L represents the length of the path, n is the number of points on the path, $P_i$ and $P_{i+1}$ are adjacent points on the path, and $d(P_i, P_{i+1})$ denotes the distance between these adjacent points. In a grid map, if the movement between $P_i$ and $P_{i+1}$ is horizontal or vertical, then $d(P_i, P_{i+1})=1$; if it is diagonal, then $d(P_i, P_{i+1})=\sqrt{2}$.
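As a small worked example, the following Python function computes this grid path length, assuming consecutive points in `path` are 8-connected neighbors.

```python
import math

def path_length(path):
    """Grid path length per the description of Equations (18)-(19).

    `path` is a list of (x, y) grid coordinates; horizontal/vertical moves
    contribute 1 and diagonal moves contribute sqrt(2).
    """
    return sum(
        math.sqrt(2) if (a[0] != b[0] and a[1] != b[1]) else 1.0
        for a, b in zip(path, path[1:])
    )
```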
(2) Safety Evaluation Index
Safety can be reflected in the reliability of the path by quantifying the distance between the robot and obstacles. This metric aims to ensure that the robot remains as far away from obstacles as possible, achieving a collision-free path. Safety is directly related to the distance between the robot and the nearest obstacle. The calculation of the safety index is shown in Equations (20)–(23):
where $N_d$ represents the number of hazardous grid cells along the path, S denotes the overall safety metric, and $d_{\min}$ is the minimum distance between each point on the path and the nearest obstacle. In addition, a very small positive parameter is included to avoid $d_{\min}$ being 0, while $(x_{obs}, y_{obs})$ are the coordinates of the obstacle closest to the path point $P_i$.
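The sketch below illustrates one way such a safety index could be computed; Equations (20)–(23) are not reproduced, so the hazard threshold `danger_radius` and the aggregation of clearances are assumptions.

```python
import math

def safety_index(path, obstacles, danger_radius=1.5, eps=1e-6):
    """Illustrative safety metric; Equations (20)-(23) are not reproduced.

    For each path point the distance to the nearest obstacle is computed;
    points closer than `danger_radius` (an assumed threshold) are counted as
    hazardous, and the overall score is the mean clearance, with `eps`
    preventing degenerate zero values.
    """
    clearances = [
        max(min(math.hypot(p[0] - o[0], p[1] - o[1]) for o in obstacles), eps)
        for p in path
    ]
    hazardous = sum(1 for c in clearances if c < danger_radius)
    return hazardous, sum(clearances) / len(clearances)
```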
(3) Energy Consumption Evaluation Index
Energy consumption evaluation plays a crucial role in path planning, directly impacting the endurance of mobile robots. This study considers the path length, path angular variation, and turning frequency as the primary influencing factors, while also accounting for the energy consumed by rotation during the motion of the TurtleBot3-Burger robot. The comprehensive energy consumption indicator can be represented by Equations (24)–(28):
in which E represents the total energy consumption, $E_L$ is the energy consumed due to the path length, U is the voltage, I is the current, and $t_L$ is the motion time, with $t_L = L/v$, where v is the linear velocity. In addition, $\Delta\theta_i$ is defined as the angle change between adjacent path segments $P_{i-1}P_i$ and $P_iP_{i+1}$, $N_t$ represents the total number of turns along the path, $w_1$ and $w_2$ are the respective weight coefficients of the path angle change and the number of turns, $E_r$ represents the rotational energy consumption, P is the rotational power, and $t_r$ is the rotation time, calculated as $t_r = \theta_r/\omega$, where $\theta_r$ is the rotation angle and ω is the angular velocity.
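The exact Equations (24)–(28) are not reproduced here; the sketch below is a hypothetical decomposition into translation, turning-penalty, and rotation terms that follows the factors listed above, with the weights and the way the terms are combined being assumptions.

```python
import math

def energy_consumption(path, U, I, v, P_rot, omega, w1=1.0, w2=1.0):
    """Hypothetical energy model mirroring the description of Eqs. (24)-(28).

    Assumed decomposition: translation energy U*I*(L/v), a weighted penalty
    on cumulative heading change and turn count, and rotational energy
    P_rot * (total_angle / omega).
    """
    def heading(a, b):
        return math.atan2(b[1] - a[1], b[0] - a[0])

    L = sum(math.hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(path, path[1:]))
    turns, total_angle = 0, 0.0
    for p0, p1, p2 in zip(path, path[1:], path[2:]):
        dtheta = abs(heading(p1, p2) - heading(p0, p1))
        dtheta = min(dtheta, 2 * math.pi - dtheta)   # wrap to [0, pi]
        if dtheta > 1e-9:
            turns += 1
            total_angle += dtheta
    E_translate = U * I * (L / v)
    E_turning = w1 * total_angle + w2 * turns
    E_rotate = P_rot * (total_angle / omega)
    return E_translate + E_turning + E_rotate
```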
(4) Time Complexity Evaluation Index
In global path planning for mobile robots based on ACO, the algorithm’s time complexity is primarily determined by the path generation time, which depends on both the initialization of the state matrix parameters and the path exploration process [
37].
The state matrix initialization time can be expressed as follows:
where S is the dimensionality of the space and M is the number of ants. The initialization time for the algorithm’s information parameters is assumed to be $t_1$, while the time required to initialize the pheromone matrix is $t_2$, the initialization time for each ant is $t_3$, and the time required to initialize the path states is $t_4$.
The time of the path exploration phase can be expressed as follows:
where K is the maximum number of iterations, $t_5$ is the time required to calculate the fitness function, $t_6$ is the time required to select the optimal individual, $t_7$ is the time required to replace the final individual in the iteration, $t_8$ is the time required to compute the next potential grid point on the path, $t_9$ is the computation time for the improved state transition function, $t_{10}$ is the dynamic adjustment time for the pheromone evaporation rate and intensity, and $t_{11}$ is the time required to generate random numbers. The time required to select the next node position is $t_{12}$, and the time required to update the pheromone is $t_{13}$.
In summary, the total time T of the algorithm can be expressed as the sum of the state matrix initialization time and the path exploration time.
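Expressed schematically (an assumed summary rather than the paper’s exact equation, with $t_{\mathrm{select}}$ and $t_{\mathrm{update}}$ as placeholder per-step costs), the dominant contributions take the form:

```latex
% Assumed summary of the two phases, not the published expression:
T = T_{\mathrm{init}} + T_{\mathrm{search}},
\qquad
T_{\mathrm{search}} = O\bigl(K \cdot M \cdot (t_{\mathrm{select}} + t_{\mathrm{update}})\bigr)
```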
In the application of path planning for the TurtleBot3-Burger robot, multiple performance indicators exhibit complex interactions and potential conflicts, forming a multidimensional optimization problem. Specifically, four key indicators are mutually constraining, namely, the path length, energy consumption, safety, and time complexity, constituting a highly coupled system. For instance, while the shortest path may minimize time, it could lead to increased energy consumption and reduced safety; conversely, overemphasizing safety might extend path length, thereby increasing both energy consumption and time. Consequently, the path optimization process requires a method capable of dynamically balancing these competing objectives. To address this multi-objective optimization problem, this study proposes a weighted comprehensive evaluation method. The proposed method can be represented by the following formula:
where $w_1$, $w_2$, $w_3$, and $w_4$ represent the weight coefficients for the four key performance indicators. This weighted combination method provides a flexible framework capable of effectively handling the complex interrelationships and potential conflicts among the various indicators. The determination of the weight coefficients is a crucial parameter tuning process, typically based on the specific characteristics of the robot’s working environment, the particular requirements of the task, and the associated constraints.
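A minimal sketch of such a weighted evaluation is given below, assuming the four metrics are supplied as normalized costs; the example weights are arbitrary placeholders and should be tuned as discussed in the text.

```python
def weighted_path_score(L, S, E, T, w=(0.4, 0.3, 0.2, 0.1)):
    """Sketch of the weighted multi-objective evaluation (weights assumed).

    L, S, E, T are the (preferably normalized) path length, safety, energy,
    and time metrics; the weight vector is a tunable assumption chosen for
    the specific robot and task.
    """
    w1, w2, w3, w4 = w
    return w1 * L + w2 * S + w3 * E + w4 * T
```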
By adjusting the weight coefficients, this method can flexibly balance different performance indicators according to the requirements of specific application scenarios, including the path length, energy consumption, and safety. This customizability allows for precise tuning of the weights based on the characteristics of the TurtleBot3-Burger and its operational environment, thereby adapting to diverse task requirements. This multi-objective optimization approach provides a more comprehensive and flexible solution for robot path planning in complex environments, with the potential to significantly enhance the overall performance of the TurtleBot3-Burger across various application scenarios. Therefore, the robot’s path planning problem can be formalized as a constrained multi-objective optimization problem, expressed as follows:
Algorithm 1 provides the pseudocode for IEACO, while the IEACO flowchart is shown in Figure 15.
Algorithm 1 Process based on IEACO
Input: Initialize the parameters, such as K, M, α, β, Q, and ρ, as well as the heuristic information matrix, path matrix, and pheromone concentration;
1: Build a grid map environment model;
2: Initialize the parameters and build the start grid S and the target grid T;
3: k = 1; m = 1;
4: Arrange all ants at the designated start point S to initiate the path planning process;
5: for k = 1 to K do
6:   for m = 1 to M do
7:     while ant m does not arrive at the target grid T do
8:       Optimize the exponents α and β;
9:       Apply the adaptive adjustment rules for α and β;
10:      Implement the heuristic mechanism with target information and angle;
11:      Compute the improved heuristic function according to Equation (12);
12:      Select the next optional grid by state transition probability;
13:      Apply the ε-greedy state transition rule of Equation (8);
14:    end while
15:    if all the ants have successfully reached the target grid T then
16:      Update the global pheromone on all effective paths;
17:      Adjust ρ, Q, $\tau_{\min}$, and $\tau_{\max}$;
18:    end if
19:  end for
20: end for
Output: The final optimal path;
In general, compared with traditional ACO, the innovations of this paper are as follows: (1) a non-uniform initial pheromone distribution that enhances initial search efficiency; (2) an adaptive ε-greedy state transition mechanism that effectively balances exploration and exploitation; (3) dynamic adjustment of the α and β exponents to improve adaptability to varying conditions; (4) a multi-objective heuristic function that considers both distance and turning penalties, guiding the search towards smoother paths; (5) an adaptive global pheromone update mechanism that ensures continued exploration while focusing on promising paths; and (6) transformation of path planning into a multi-objective optimization problem, optimizing not only path length but also safety, energy consumption, and computational efficiency.