1. Introduction
Robotics is a growing research area in artificial intelligence. The robot motion planning (RMP) problem is a key problem in robotics [
1]. The problem concerns mobile robots and consists of finding a path for a robot between two positions and avoiding obstacles. This problem is present in many real-world applications. A mobile robot can be used in situations where humans can be in danger, or places that are not reachable due to the characteristics of the terrain [
2].
In this work, we are interested in the motion planning problem when the environment is not fully known. Many offline approaches exist to solve this problem with a known environment [
3]; however, it is a new challenge when parts of the environment are unknown [
4]. The robot must not only adapt its path to new conditions but also make quick decisions while looking for a minimal path. Our proposal is focused not only on path minimization but also on obtaining smooth paths.
In this paper, we consider a robot with limited sensors that can only detect nearby constraints in its environment such that the robot will discover new areas and new obstacles only when it is moving.
Our motivation is to propose a fast, adaptive, and effective technique for path minimization that can also obtain a smooth path without collision for a robot with limited sensors.
Unfortunately, comparisons with the existing techniques from the literature for online planning are difficult. The authors typically use specific instances of the problem that are neither publicly available nor sufficiently well described to be re-generated. In this paper, we also propose a new publicly available set of instances of robot motion planning for online techniques to have a common set for evaluation.
The main contributions of this work are (i) an adaptation of an offline robot motion approach for solving the online robot motion planning problem, (ii) a framework to solve motion planning problems with different degrees of knowledge of the environment, and (iii) a set of 40 new instances with different degrees of environment knowledge available for new research.
The next section introduces our approach with the environment modelization and the mathematical model of the problem. We roughly present OfflineRMP in
Section 4.1, our offline approach, which works in a fully known environment for the robot. OfflineRMP is the basis of our online approach called AdaptiveRMP, which is explained in detail in
Section 4.2. Also, we propose a framework to integrate both techniques according to the robot’s knowledge of the environment. We present different instances and testing scenarios in
Section 5, as well as the results and a statistical analysis for comparison. We revise the related work and discuss our framework and the results obtained in
Section 2. Finally, some conclusions and future work are presented in
Section 6.
2. Related Work
Robot motion planning has been studied since 1960. Its original version was the Piano Mover’s Problem, which aims to move a piano or complex object from one room to another with obstacles in the way [
5]. The path planning problem is demonstrated to be NP-hard in [
6].
Several approaches have been proposed to solve the robot motion planning problem. Most of these are focused on completely known environments, the offline motion planning problem, and the newest ones are focused on unknown environments, the online motion planning version. Moreover, these approaches can be classified as classic methods and heuristic-based methods [
7]. Most classic methods can be classified as planning by construction, while heuristic optimization algorithms are classified as planning by modification.
The classic roadmap methods model the space without obstacles onto a network of lines, transforming the offline path planning problem into a graph searching problem. Visibility graphs use the search space to trace lines between different parts of the obstacles. Including the starting and target points, it is possible to create a graph in which a graph search algorithm is used [
8]. In these graphs, all the lines must be feasible paths. Voronoi diagrams aim to create graphs of paths similar to visibility graphs, but, instead of using vertices, Voronoi diagrams create additional space around the obstacles [
9]. The specific goal considered by these approaches is to obtain the clearest paths.
In cell decomposition approaches, the search space is divided into cells, which are then used to search a path that begins at the starting point and ends at the target point connected by cells in sequence [
7]. In potential fields, the robot is treated as a point under the influence of an artificial potential field, while the target attracts the robot, and the obstacles push the robot away [
10].
Sampling-based techniques aim to find paths in different samples of the free workspace and then join them [
11]. This method obtains good results for robots with a high degree of freedom in static environments. The two main classifications of these methods are probabilistic roadmap methods and rapidly exploring random trees. Probabilistic roadmap methods locate a set of configurations in the free space, which are connected using a local planner. The starting and target positions are added to the path. Finally, a graph-search-based algorithm is used to find the shortest path [
12]. A large number of approaches can be found in the literature, all based on the rapidly exploring random trees [
13]. The basic method works by iteratively growing a tree from the starting point to the goal node by generating random points in the space that are linked to the closest point in the current tree. The basic algorithm and its variants can be classified as sampling methods. In [
14,
15], reviews of these RTT methods are presented. Most of the recent approaches consider the use of several trees [
16] that enable balancing the processes of the exploration and exploitation of the robot search space by generating random heuristic trees that are used to connect the rooted tree towards the goal area.
In [
17], the authors propose a path planning algorithm focused on the consideration of collision risk. The approach combines a trajectory estimation algorithm and A* to select the low-risk points to conform to the path. Also, ref. [
18] considers the inclusion of spatial-temporal specifications.
Heuristic algorithms have also been used to solve the offline motion planning problem. In [
19], a genetic algorithm is introduced. The goal is to find an adaptable path according to changes in the environment. Each individual of the population corresponds to a candidate solution using a binary representation. In [
20], the use of simulated annealing with potential fields is proposed. Once a local minima is found using potential fields, simulated annealing is used with a high initial temperature to enable changing to a worse solution to escape the local minima. In [
2], an ant colony optimization algorithm is implemented to solve the offline robot motion planning. In this work, the workspace is discretized in
connected nodes with binary values representing obstacles. The pheromone levels are initialized at each and change according to the ant’s paths.
In [
3], a hybrid approach is presented. Surrounding points set close to critical obstacles are used to find a path, and then particle swarm optimization is applied to improve such a path. When the robot approaches narrow spaces where a feasible path cannot be found, this approach considers more surrounding points, enlarging the grid resolution; if there exists a path, it will be found. The approach can obtain better results than probabilistic roadmap methods [
12] and an ant colony optimization-based algorithm [
21] solving instances of
. In [
22], a new hybrid planning algorithm is proposed. The approach, RRT*-CFS, combines sampling-based and optimization-based methods. Here, RRT* [
23] is used to construct an initial feasible path, and then the convex feasible set algorithm is applied to solve the non-convex motion planing.
More recently, learning-based approaches have been proposed for the problem. A residual convolutional neural network is proposed in [
24]. Inspired by the imitation learning idea, the neural network uses an offline-prepared dataset based on global environment information and the optimal paths for training. An approach for online robot motion planning using neural networks is introduced in [
25]. This approach can be used for mobile robots, manipulator robots, and multi-robot systems. Neural networks respond well to changes in the environment related to changes in obstacles or the dynamics of the environment [
26]. The instances used include two-dimensional featuress with
x and
y coordinates and size
using 2500 neurons to represent the workspace. The instances used include labyrinth workspaces and mobile targets and mobile obstacles.
Machine learning algorithms are coupled with case-based reasoning in [
4]. Here, the authors present two approaches that use sets of solutions to resemble new ones using case-based reasoning. Given a path planning problem, the first approach selects a set of similar problems with their respective solutions. These solutions are merged and, using case-based reasoning, a solution to the new problem is constructed. The second approach constructs paths by iteratively analyzing the current state and determining the next ones using case-based reasoning based on the nearest neighbor solutions.
In [
27], an evolutionary approach is presented. The approach uses a matrix representation of the workspace where each cell is the size of the robot and movements are restricted to four axes. It solves offline and online planning. For the experiments, small obstacles are included in the initial path. A re-planning of the path is used to avoid collisions. The instances used are from
to
, and the execution times vary from 1 s in the smallest instances to 96 s in the largest instances. The proposed algorithm obtains better results than the classic methods, especially for the largest problem instances. The same authors present in [
28] an evolutionary approach for the dynamic problem. In this case, the positions of the obstacles are unknown, and the robot’s sensors are limited. They use the same instances as in their previous work, but the robot only has information on the closest cells. The results vary from 2 s in the easy instances to 250 s in the more complex ones. These results are better in terms of time and the number of re-plannings needed to find a feasible path compared to [
29]; however, the length of the path must be improved.
In [
30], a convolutionally evaluated gradient first search approach is proposed. A two-dimensional grid is used to model unknown spaces. Convolutional evaluation is used to guide the path to the target location by iteratively evaluating the target orientation. This evaluation enables the algorithm to detect when to move back when continually deviating from the target to avoid spending too much time on failed paths.
More complex autonomous environments like the design of autonomous behavior for humanoid robots require specially designed approaches [
31]. Also, data-driven approaches have demonstrated success in these more challenging versions of the problem [
32].
Table 1 summarizes the main features of the static environment methods presented here. Historically, the most studied problem has been offline path planning. A large number of the listed studies propose sampling-based/roadmap approaches. As can be observed in
Table 1, most of these approaches solve the offline path planning problem. When the online version is studied, heuristic approaches are mostly used. In this work, we propose a simulated annealing algorithm coupled with a surrounding strategy to surpass obstacles. Unlike other online approaches, our method assumes a simple line segment between the start and end points that is made feasible (avoiding obstacles) using the surrounding strategy. When obstacles are detected, the surrounding strategy is applied. Our approach can be used in offline and online scenarios through the framework proposal presented in this research work. Some other strategies have been proposed to solve the offline and online path planning problems. Our proposal describes a simple framework that combines a simulated annealing approach that solves the offline problem with had-doc surrounding strategies to adapt the path to the presence of objects. The approach does not require training time and can be applied without much computational effort to large-size instances with not many objects.
4. AdaptiveRMP Framework
The framework proposed, AdaptiveRMP, includes different components that work as shown in
Figure 3. Depending on the environment detected, subsets of components are used. These components are explained as follows.
4.1. OfflineRMP
This planning scenario searches for a feasible path for the robot navigating from the start to the target position in a known environment. A simulated annealing algorithm proposed in [
35] is used to solve this problem. In this algorithm, solutions are presented as coordinates representing the sequence of positions the robot must follow from the start to the target. The simulated annealing approach works in the feasible space; hence, solutions that use obstacle positions are not considered.
Algorithm 1 shows the main structure of the OfflineRMP algorithm implemented.
Table 2 summarizes the main symbols used to describe the algorithm. It lists the SA-related symbols as initial and current temperature and differences in the quality of solutions. It also lists the solution points of the paths being constructed.
The SA algorithm implements two main steps: the pre-processing and the search. Before the initial solution is constructed, the empty cells that the robot cannot reach are marked as obstacles using the flood-fill algorithm [
36] (line 1).
Figure 4 shows the grid obtained after the flood-fill algorithm.
Algorithm 1 OfflineRMP |
- 1:
flood_fill(obstacles); - 2:
if then - 3:
route = direct_route(start, target); - 4:
initialize_temperature(); - 5:
; - 6:
while do - 7:
(, ) = select_random_coordinates(route); - 8:
new_route = generate_new_route(, ); - 9:
if len(new_route) < len(route) then - 10:
route = new_route; - 11:
else - 12:
acceptance_criterion(, t); - 13:
if new_route is accepted then - 14:
route = new_route; - 15:
end if - 16:
end if - 17:
decrease_temperature(t); - 18:
end while - 19:
post_processing(route); - 20:
return route; - 21:
else - 22:
return (“There is no possible route between start and end points”); - 23:
end if
|
A greedy-based approach generates initial solutions (line 3). The approach uses a myopic function, a pile of directions, a surrounding strategy, and a pruning strategy. The myopic function detects the possible movements to perform in the next step without violating the constraints defined in
Section 3.2. The next step is a selection based on the priorities defined by the pile of directions. The robot follows the direction without obstacles approaching the target cell. When an obstacle is detected, the algorithm selects to surround the obstacle by left or right. After this, the robot follows its path using the same strategy.
During the surrounding step, the pile of directions defines the priorities of selecting the next position, considering that, at any point, the robot has a suitable direction. If the robot is going around an obstacle, the suitable direction is the one that guides it closer to the obstacle. Otherwise, the suitable direction is the one that brings the robot closer to the destination cell. If an obstacle is being surrounded, the positions are searched starting from the suitable direction counterclockwise. Otherwise, the positions are searched from the suitable direction clockwise.
Figure 5 shows the surrounding strategy. In this case, it has been chosen to surround the left. Hence, positions are added to the pile, starting with the suitable direction until a feasible position is found. The pile is computed in each step according to the current suitable direction. A pruning strategy is included to discard unfeasible surrounding steps. Once an unfeasible step is detected, a new surrounding strategy is selected.
The local search phase (lines 6–18) uses a move to repair the candidate solution (line 8). The route between two randomly selected positions from the current path is replaced by a new sub-path that joins these two positions. If there are no obstacles between these positions, a linear sub-path that joins them is generated. If there are obstacles, for each obstacle, the algorithm decides either to surround it by the left or the right, generating an alternative candidate solution.
Figure 6 shows a situation without obstacles between the two selected positions marked with circles. Gray grids show the current path in (a) and the new path in (b). A direct sub-path is selected.
Figure 7 shows an example of obstacles between the two selected positions. As in
Figure 6, circles show the randomly selected positions, the current path is shown in gray in (a), and the new route is shown in gray in (b). In this example, it is randomly decided to go around the right of the first object. The second obstacle is surrounded on the left.
If the new route is shorter than the previous one, it is accepted and selected as the current solution for the next iteration. If the new route is longer than the current one, the probability computed by is used to determine if it should be accepted as the current solution or not.
A post-processing step is used to improve the quality of the paths obtained in terms of length in line 19. The post-processing algorithm follows the path from the beginning and tries to join its current cell to the farthest collision-free cell with a straight line. The final path is constructed by joining the vertices of the path.
Figure 8 shows a solution path before the post-processing (a) and after the post-processing step (b). Here, the start and target cells are in blue, the solution path is in red, and the obstacles are in gray.
This move generates diverse solutions without traversing the entire environment. This prevents it from going around obstacles that are not in its way. Surrounding obstacles when close to colliding makes it easiest to return to the original path and follow to the ending point. Direction choice is also important in going around an obstacle. Otherwise, it would be more difficult to return to the original path.
Post-processing increases the degree of freedom the robot has without being restricted by the discretization of space. In addition, the number of times the robot rotates is decreased.
4.2. OnlineRMP
The goal of OnlineRMP is to find a feasible route for the robot to navigate from a starting point to a target point in an environment where the location of obstacles is entirely unknown. Because planning is performed in real time, it is essential to perform it in low execution times.
The same representation of OfflineRMP is used for OnlineRMP.
Because information about the environment is not always known, path planning in real time is closer to reality, where the robot collects information from the environment on the fly and re-plans the path if necessary.
To solve the planning problem in an unknown environment, initially, the robot has the information in the map resolution and the starting and ending positions. Sensors enable the robot to discover two cells around it, as shown in
Figure 9. Here, the robot is shown in black, white cells show the space that can be observed by the robot, and, in gray, those that are out of sight are shown.
A greedy method is used to construct initial solutions. For the construction of a path between two positions, the surrounding strategy, the pruning of non-feasible sub-path, and the pile of directions used by the OfflineRMP are also adapted.
As the robot advances, information about the environment is updated with its available vision range. In this case, it can create an unfeasible solution given the unknown obstacles. If the robot is close to colliding, it will be necessary to re-plan the path. It is important to note that the re-planning of the path will occur only on the remaining path and up to the length of the obstacle that is within the current vision range. Because the planning is performed in real time, the path already traveled cannot be modified. Moreover, since the remainder of the way is unknown, re-planning the path to the end is forbidden. Once the re-planning is performed, the robot advances until it is close to colliding with a new obstacle, generating a new re-planning. This continues until the robot reaches the ending point or it cannot continue because an obstacle is found.
Given the limited vision of a robot, only a portion of the obstacle may be known when using the surrounding strategy. When new information is available, the robot will decide to go around to the object from the right or the left. This can cause the robot to become caught. For this, the normal vector for the movement direction and the obstacle is considered.
Figure 10 shows the surrounding change as the robot moves around an obstacle. The black square represents the robot, the obstacle is in gray, and the path is in red. The current direction and its normal vector are shown in green and blue, respectively.
At each step, the normal direction is verified pointing to the obstacle; if it is, then the previous orientation of the surrounding strategy is maintained. If the normal direction is not pointing to the obstacle at any moment, the robot is no longer circling the same obstacle. In this case, the next re-planning of a new surrounding orientation may be random.
Because the flood-fill algorithm described in
Section 4.1 is expensive for high-resolution instances, it is used only in the case when the robot tries to go around an obstacle and the two orientations (left or right) are unfeasible. In this case, the robot follows an unfeasible path, so flood-fill is used. It can also be used when the robot tries to go around an obstacle and reaches the same point from where it started; that is, an obstacle surrounds the target point.
As the robot moves around an obstacle, it maintains a local goal of target; once it is possible to reach the final goal via a direct path, that path is used, and the local target of the surrounding strategy is discarded. In
Figure 11, two cases of detour are shown. The obstacle is represented in gray, the generated path in red, the final objective in green, and the local objective in blue. On the right, the discarding is used, and, on the left, the surround is completed.
Algorithm 2 shows the general structure of the planning approach for an unknown environment using the constructive method. Before traveling the path in real time, an initial route is built considering the known partial obstacles. For this, the SA-based approach for OfflineRMP previously described is used. Since OnlineRMP uses a discretized environment and the path generated by OfflineRMP must be traversed cell by cell, OfflineRMP is used without applying post-processing. The procedure shown in Algorithm 2 starts by constructing an initial solution that defines a direct path between the start and end positions in line 1. Then, the robot starts moving and checking each position in the initial path. From the sensors, the robot can update the information about obstacles in the route in line 3. When a possible collision is detected, re-planning is performed (line 5). Details of the re-planning process are shown in Algorithm 3. Here, when it is possible to reach the final goal via a direct path, then that path is used, and the local goal of the surrounding is discarded. Once a re-planning process is performed, the algorithm resumes the initial path.
Algorithm 2 OnlineRMP |
- 1:
route = direct_route (start, target); - 2:
for position in route do - 3:
update_obstacles_map (); - 4:
if is close to collision then - 5:
route = Re-planning (position); - 6:
end if - 7:
end for
|
Algorithm 3 Re-planning |
- 1:
path_to_fix = obstacle_length(); - 2:
if direct_route (position, local_target) then - 3:
route = direct_route (position, local_target); - 4:
new_route = surround_obstacle (position, path_to_fix); - 5:
else - 6:
path_to_fix = obstacle_length (); - 7:
route = surround_obstacle (position, path_to_fix); - 8:
end if
|
Because OnlineRMP is an algorithm that works in real time, each position the robot moves will be part of the final route. Once the current sub-path is finished, the robot follows the original path. Re-planning is performed as many times as necessary until the target point is reached or when there is no feasible path.
4.3. AdaptiveRMP
AdaptiveRMP is a cooperation of the two previously described approaches.
Figure 12 shows a flow diagram of AdaptiveRMP. According to the features of the environment, OfflineRMP and OnlineRMP are used for completely known and completely unknown environments. In the case of partially known environments, the framework first performs OfflineRMP and then advances along the route until the target is reached or the robot is next to collide. Each time the robot is next to collide, the framework changes to OnlineRMP on the specific segment to re-plan and then continues the path. For this, the new information in the environment is considered, computing a route re-planning range. It is also necessary to return to the route previously generated by OfflineRMP once the route to the new obstacle is discovered. To achieve this, a selection of the section to be re-planned is used, from the current position of the robot to the furthest cell from the original path that can be reached with a direct path without collision; this is conducted without taking into account obstacles recently discovered. Once the section to be re-planned is obtained, the OnlineRMP algorithm is used again.
6. Conclusions
In this work, we presented a framework to tackle the two-dimensional robot motion planning problem with known, partially known, and unknown environments. In known environments, path planning is calculated before its execution. In partially known and unknown environments, the framework adapts itself to the changes in the environment during its execution. The main goal is to minimize the length, smoothness of the paths, and execution times.
The OfflineRMP approach has demonstrated its high-quality performance. When comparing the results with those of the literature, we notice that OfflineRMP demonstrates a very competitive performance, obtaining high-quality results compared to the techniques that are widely used, such as the probabilistic roadmap method. It also obtains better results than techniques like two-phase ACO in terms of execution time but with longer paths. Competitive results are still obtained when comparing OfflineRMP with recent techniques from the literature. OfflineRMP considers only the critical obstacles and offers various solutions surrounding them from different sides without exploring the whole search space, which enables reduced execution times. OfflineRMP works well in high-resolution instances, obtaining short and smooth routes within an execution time of fewer than 15 s, with a 100% success rate in the tested instances.
The results of OnlineRMP are consistent with a very good performance, especially concerning execution times, obtaining quality re-plans in less than one second. Also, it shows an adaptation capability for completely unknown environments in terms of finding a collision-free path. AdaptiveRMP provides a framework that covers a large number of environments, and it can adapt the best plan obtained by OfflineRMP using OnlineRMP for re-planning. The execution times are very low for very high-resolution instances.
Our contribution with this work is a simple and adaptive approach for solving robot motion planning that is able to obtain competitive solutions compared to the state-of-the-art approaches. We also propose a set of 40 new instances with different degrees of environmental knowledge, which are available for future research.
The future work includes different goals, such as increasing the smoothness or safety of the path. Combinations of different goals could result in a multi-objective problem.
Adding a third dimension to the environment to account for the shape of the terrain, such as hills or valleys, and the three-dimensional shape of each obstacle could be considered in further work.
Moreover, spatial features such as velocity and acceleration can easily be included using a real-time measurement procedure during the path construction.
Last, collaboration with the machine learning approaches already applied to these problems listed in the literature could greatly improve the ability of our algorithm to learn to make better decisions.