Real-Time Localization for an AMR Based on RTAB-MAP

Lin, Chih-Jer; Peng, Chao-Chung; Lu, Si-Ying

doi:10.3390/act14030117

Open AccessArticle

Real-Time Localization for an AMR Based on RTAB-MAP

by

Chih-Jer Lin

^1,*

,

Chao-Chung Peng

²

and

Si-Ying Lu

¹

Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei 10608, Taiwan

²

Department of Aeronautics and Astronautics, National Cheng-Kung University, Tainan 70101, Taiwan

^*

Author to whom correspondence should be addressed.

Actuators 2025, 14(3), 117; https://doi.org/10.3390/act14030117

Submission received: 20 January 2025 / Revised: 14 February 2025 / Accepted: 26 February 2025 / Published: 27 February 2025

(This article belongs to the Special Issue Actuators in Robotic Control—3rd Edition)

Download

Browse Figures

Versions Notes

Abstract

:

This study aimed to develop a real-time localization system for an AMR (autonomous mobile robot), which utilizes the Robot Operating System (ROS) Noetic version in the Ubuntu 20.04 operating system. RTAB-MAP (Real-Time Appearance-Based Mapping) is employed for localization, integrating with an RGB-D camera and a 2D LiDAR for real-time localization and mapping. The navigation was performed using the A* algorithm for global path planning, combined with the Dynamic Window Approach (DWA) for local path planning. It enables the AMR to receive velocity control commands and complete the navigation task. RTAB-MAP is a graph-based visual SLAM method that combines closed-loop detection and the graph optimization algorithm. The maps built using these three methods were evaluated with RTAB-MAP localization and AMCL (Adaptive Monte Carlo Localization) in a high-similarity long corridor environment. For RTAB-MAP and AMCL methods, three map optimization methods, i.e., TORO (Tree-based Network Optimizer), g2o (General Graph Optimization), and GTSAM (Georgia Tech Smoothing and Mapping), were used for the graph optimization of the RTAB-MAP and AMCL methods. Finally, the TORO, g2o, and GTSAM methods were compared to test the accuracy of localization for a long corridor according to the RGB-D camera and the 2D LiDAR.

Keywords:

position estimation; LiDAR SLAM; RTAB-MAP; RGB-D camera; AMCL

1. Introduction

Autonomous mobile robots (AMRs) depend on the effectiveness of mapping the surrounding environment to determine their movement. Simultaneous Localization and Mapping (SLAM) enables the AMR to understand its surroundings and constructs a map model of the surrounding environment simultaneously [1]. When the AMR navigates in an unknown environment, it acquires point clouds, images, and inertial data through mounted sensors. Then, SLAM restores the real-time position and attitude of the AMR through the spatial–temporal correlation of the observations to realize its localization and mapping. There are two main categories according to the sensor types in the SLAM methods: visual SLAM (VSLAM) [2,3] and LiDAR SLAM [4,5,6,7].

A robot equipped with a vision sensor to estimate the position and orientation of the robot to perform SLAM is called VSLAM. VSLAM relies on camera images to perform the localization and mapping tasks and it covers the variety of visual data detectors, including monocular, stereo, omnidirectional, and RGB-D cameras [8]. VSLAM has the advantages of cheaper hardware requirements and straightforward object detection and tracking, and it provides visual and semantic information [9]. A fundamental classification used in VSLAM is the SLAM system based on the type of data, which can be divided into direct and indirect (feature-based) methods [10]. The direct algorithm estimates camera motions from pixel-level data to build an optimization problem that minimizes photo-metric error. The direct approach has a higher-accuracy 3D reconstruction and obtains more information from images than the indirect method. The indirect method extracts the feature points obtained by processing the scene and by matching their descriptors in sequential frames. These methods are precise and robust against the photo-metric changes in frame intensities, but the performance of feature extraction and matching stages are expensive [11].

RGB-D SLAM methods use a combination of an RGB camera and a depth sensor to measure pixel depth directly to simplify the depth estimation process [12]. RGB-D SLAM methods are popular choices for indoor settings as they provide precise and accurate depth information. However, for outdoor applications, RGB-D cameras may provide unreliable mapping results due to potential interference from sunlight [13]. In dynamic environments, the efficiency of RGB-D SLAM depends on factors such as the speed of moving objects, the quality of sensor data, and the complexity of the scene. Therefore, the use of multiple sensors is the solution to integrate motion models and detect moving objects in dynamic environments. Therefore, the impact on RGB-D SLAM can be mitigated by employing strategies to enhance overall accuracy and robustness [14].

LiDAR is a perception device utilized to check exterior surfaces and measure object distance by emitting a laser toward the object to capture the traveling time [15]. Two-dimensional LiDAR sensors record X and Y parameters using the axes of beams [16]. Three-dimensional LiDAR sensors work in the same way as 2D LiDAR, but the additional measurement around the Z-axis is performed to collect true 3D data. The data from the Z-axis are collected using several lasers at longitudinal projections of various angles. Three-dimensional LiDAR sensors are ideal for visualization due to their higher precision and resolution, but they are more expensive than 2D LiDAR versions. The 2D LiDAR SLAM architecture is equipped with a LiDAR sensor to create a 2D map of its surroundings [17]. The LiDAR sensor illuminates the target with a laser pulse to calculate the distance from the object. Three-dimensional LiDAR SLAM is a precise solution for map generation in various landscapes [18]. For indoor and outdoor robot navigation, 2D LiDAR SLAM is unaffected by disruptions like temperature and light changes and is commonly used in a wide range of autonomous applications, including building inspections, autonomous driving, and smart fabrication [19,20].

RTAB-MAP (Real-Time Appearance-Based Mapping) is a mapping algorithm that combines SLAM features with appearance-based mapping [21]. It is designed for real-time 3D modeling in dynamic environments and has the ability to adapt to fast-changing environments [22]. RTAB-MAP can be used to implement either a visual SLAM approach or a LiDAR SLAM approach on a real robot. It can be used to create 2D or 3D maps by using an RGB-D camera or a LiDAR to estimate its position and orientation in the map simultaneously [23,24]. Therefore, multi-sensor fusion SLAM techniques can be utilized in the robotic system to collect LiDAR data and perform RGB-D reconstructions [25]. RTAB-MAP includes important features such as stereo image matching and depth image matching. It can be integrated with an ROS (Robot Operating System) on various robotic platforms. Many studies optimize the performance of RTAB-MAP in real-time 3D mapping, including the use of loop closure techniques [26]. Loop closure is an important process that detects whether an AMR has returned to a previously visited location and connects the separated parts of the map into a unified whole. It is very important because loop closure can produce accurate maps. Three-dimensional loop closure is the process of detecting the 3D map created by the system to identify previously visited locations [27].

A fast deep learning-based VSLAM system for dynamic environments was proposed by Su et al. [28]. The proposed real-time framework integrates ORB-SLAM 2.0 with the lightweight object detection network YOLOv5s to acquire semantic information in the scene [29]. An optimized homograph matrix module is introduced to create more accurate Optical Flow vectors. Their method performs better than existing semantic VSLAM algorithms in terms of accuracy and performance, and the TUM RGB-D dataset is employed to analyze the system’s runtime and accuracy. The backbone network of YOLOv5 needs to be optimized to make it more robust and practicable. Recently, a comprehensive review found that YOLO 11 for object detection was revolutionized by achieving an optimal balance between speed and accuracy [30]. The key architectural improvements, performance benchmarks, and applications were evaluated using the framework’s strengths and limitations in practical scenarios. Critical gaps in the literature were identified and the review outlines future directions to enhance YOLO’s adaptability, robustness, and integration into emerging technologies [31].

In recent years, SLAM technologies have been developed that can solve complex problems related to real-time 3D mapping. Many algorithms based on SLAM have been developed to enhance its work efficiency. In terms of visual SLAMs, three known algorithms are used: ORB-SLAM2, SPTAM, and RTAB-MAP [32]. ORB-SLAM2 generates the sparse reconstruction from RGB-D inputs, exhibiting better stereo SLAM solutions and achieving zero-drift localization in already-mapped areas for the KITTI visual odometry benchmark. In the evaluation results, ORB-SLAM2 exhibits higher speeds than RTAB-MAP and SPTAM. SPTAM has higher accuracy in terms of feature recognition in 3D maps [33]. Comparing ORB-SLAM2 and RTAB-MAP, the performance of ORB SLAM2 is better than that of RTAB-MAP in terms of accuracy, speed, and feature usage. Although ORB-SLAM2 provides better overall performance, RTAB-MAP remains the priority in cases that deal with difficult lighting and camera motion conditions. Therefore, RTAB-MAP has the advantage in that the object creates shadows and the camera motion is faster [2]. Moreover, RTAB-MAP is superior in terms of real-time data processing and can generate more accurate maps than ORB-SLAM2 and SPTAM in indoor environments [34]. When the AMR system detects a previously visited location, it adjusts the robot’s position and orientation to update the map accordingly. There are various techniques to optimize the 3D loop closure in RTAB-MAP. It can improve the accuracy and efficiency of loop closure detection through key frame selection, feature extraction and matching, and pose graph optimization [35].

In this study, an AMR is equipped with RGB-D and LiDAR sensors. RTAB-MAP is used for the mapping and localization system. Then, its ability is tested to determine the robot’s position in indoor environments. The RTAB-MAP memory management supports the data inputs from a variety of sensors, such as the 2D LiDAR and RGB-D depth cameras, making it particularly suitable for complex indoor environments. The rest of this paper is organized as follows: Section 2 describes the methodology and framework of the system. This section introduces the hardware architecture of the AMR system and the RTAB-MAP method. Section 3 describes the experiments, including loop closure detection using RTAB-MAP, and three graph optimization approaches—TORO, g2o, and GTSAM—are integrated into RTAB-MAP. Section 4 provides an analysis of RTAB-MAP and the AMCL method with the three graph optimization approaches. Finally, Section 5 summarizes the main conclusions of this research.

2. Methodology

In this study, the AMR system consists of four stages: the results display, RGB-D data acquisition, the TECO AMR system, and real-time 3D reconstruction using RTAB-MAP, as shown in Figure 1. Three map optimization methods, TORO (Tree-based Network Optimizer), g2o (General Graph Optimization), and GTSAM (Georgia Tech Smoothing and Mapping), are used for the graph optimization of RTAB-MAP. The maps built using the three methods are evaluated with RTAB-MAP localization in a high-similarity long corridor environment. The odometer, RGB-D D435i depth camera, and 2D LiDAR device are input to the AMR for the RTAB-MAP architecture. RTAB-MAP is the method used for the real-time location and construction of the real-time environment mapping and localization, and is particularly suitable for automated navigation robot applications. The RTAB-MAP mapping process begins with the systematic collection of data from the RGB-D camera and the 2D LiDAR, followed by pre-processing to filter and synchronize the data from different sensors.

The AMR is controlled using the computer (AD-LINK, Taoyuan, Taiwan), AmITX-SL-G-H110, whose CPU is an Intel^® Core™ i7-6700 LGA1151 with a 32 GB DDR, as shown in Figure 1. To reach a specified destination while avoiding obstacles, an Intel^® RealSense D435i RGB-D camera (Intel Cooperation, Santa Clara, CA, USA) and a SICK TIM551 2D-LiDAR (SICK AG, Waldkirch, Germany) were used to determine the geography of the environment. The operating system of the AMR is the ROS environment on the Ubuntu 20.04 (Focal) release; the AMR incorporates several sensing devices, including a laser rangefinder, a depth camera, an odometer via the servo drives, and an IMU. The odometer utilizes the encoder feedback from the servo drives and RTAB-MAP is used for the construction of a graph optimizer. The RGB-D depth camera and the 2D LiDAR device (SICK Tim 551) are the feedback inputs of the RTAB-MAP architecture, as shown in Figure 2a.

RTAB-MAP is the construction of SLAM and real-time environment mapping and localization. The method is suitable for navigation robot applications. The RTAB-MAP mapping process begins with the systematic collection of data from the RGB-D camera and the 2D LiDAR, followed by pre-processing to filter and synchronize the data from different sensors. The characteristics are then extracted from the data and matched, and the pose of the robot is estimated. If loops are detected, the map will be optimized to reduce the accumulated error. The optimized attitude and sensor data are used to generate high-precision 3D maps. After extracting and matching the features from the data, the attitude of the AMR is estimated and the navigation target is set for path planning and obstacle avoidance. Then, the positioning is updated in real time during navigation to improve the positioning accuracy. The ROS provides the precise synchronization and approximate synchronization. RTAB-MAP mainly uses the approximate synchronization method to synchronize the top publishing speed of multiple sensors with minimal latency error. We used rtabmap_ros/rgbd_sync to synchronize the RGB-D camera’s RGB and the depth image data into a single subject rtabmap_ros/RGBD image. Then, rgbd_sync approximated synchronization and rgbd_sync was used to synchronize data with different sensors to improve the overall performance and stability, as shown in Figure 2b.

3. Visual SLAM with RTAB-MAP

Short-term internal memory (STM) is used to observe the current location scene where the robot walks. The time change between the images is used to compare the similarity between the new node N_t and the previous node N_t₋₁. To update the weight of the acquired location, the location Lt is compared with the last one in the STM, and the similarity (sim) is evaluated using Equation (1) [36].

s i m (t, t - 1) = \{\begin{matrix} N_{p a i r} / N_{t}, i f N_{t} \geq N_{t - 1} \\ N_{p a i r} / N_{t - 1}, i f N_{t} < N_{t - 1} \end{matrix},

(1)

where N_pair is the number of matched word pairs between the compared location signatures, and

N_{t}

and

N_{t - 1}

are the total number of words of signature t and the compared signature t − 1, respectively.

The working memory (WM) detects the closed loop of the spatial position and sets the STM size according to the robot speed and position acquisition rate. The number of nodes stored in the STM is set by the parameter “Mem/STMSize”. When the STM reaches the upper limit of capacity, the oldest data will be transferred to the WM for closed-loop detection. After the closed-loop detection is completed, the data will be transferred to the long-term memory (LTM), making it easier for long-term storage and deeper information processing, which helps to maintain system efficiency and reduce data loss [36].

P (S_{t}| L^{t}) = η p (L_{t}| S_{t}) \sum_{i = - 1}^{t_{n}} p (S_{t}| S_{t - 1} = i) p (S_{t - 1} = i | L^{t - 1}),

(2)

where

p (s_{t}| s_{t - 1} = i)

is used to predict the distribution of

S_{t}

, given each state of distribution

S_{t - 1}

in accordance with the robot’s motion between t and t − 1. Between t − 1 and t, the full posterior is updated according to the time evolution model of the pdf

p (S_{t}| S_{t - 1} = i)

, which gives the probability of transition from one state i at time t − 1 to every possible state at time t. According to the robot combining the latter part of the filter between time t and t − 1,

p (S_{t - 1} = i | L^{t - 1})

, the next closed-loop detection is achieved [36].

Depending on the respective values of

S_{t}

and

S_{t - 1}

, this probability takes one of the following values:

(1): $p (S_{t} = - 1| S_{t - 1} = - 1)$ = 0.9; the probability that no loop closure event will occur at time t, given that none occurred at time t − 1.
(2): $p (S_{t} = i| S_{t - 1} = - 1)$ = $\frac{0.1}{N_{w m}}$ , $i \in ⌈0, t_{n}⌋$ ; N_WM refers to the number of positions in the WM of the current iteration.
(3): $p (S_{t} = - 1| S_{t - 1} = j)$ = 0.1, $j \in ⌈0, t_{n}⌋$ ; the probability of the event “no loop closure at time t” is low given that a loop closure occurred at time t − 1.
(4): $p (S_{t} = i| S_{t - 1} = j)$ with i, j ∈ [0; $t_{n}$ ] is a Gaussian on the distance in time between i and j whose sigma value is chosen so that it is nonzero for exactly four neighbors.

If there are closed-loop connections, a location in the graph may have more than two neighbors [36]. The closed-loop detection will be limited due to the lack of visual features when the robot faces a highly similar wall. The proximity detection uses laser scanning data to correct the drift of the odometer. It connects nodes with close locations to make up for the lack of image data and enhances the accuracy of location estimation.

3.1. Loop Closure Detection

In RTAB-MAP, the LTM affects large-scale and long-term loop closure detection. It is mainly responsible for storing information that is currently not within the direct perception range of the robot. For example, map data, past path records, and environmental features are used for navigation. Loop closure detection uses a Bayesian filter to track by estimating the probability of the current position, assuming that Lt matches the accessed position stored in the WM. It represents the sequence of states for all closed-loop hypotheses at a certain moment [36,37].

RTAB-MAP performs real-time positioning and provides visual positioning and map construction information. It displays loopback detection results in the middle, including the matching results between the current image and the historical image. When a loop closure or a proximity detection are detected or some nodes are retrieved or transferred because of memory management, a graph optimization approach is applied to minimize errors in the map. In RTAB-MAP, loop closure detection is performed by comparing current sensor data with previous sensor data to find similarities in the features. When the system has returned to a previously visited location, loop closure detection is performed. If the loop closure detection is successful, then an estimation of the transformation between the two locations is made.

When a loop closure hypothesis is accepted, a graph optimizer minimizes the errors in the map. First, the graph optimization approach TORO is integrated into RTAB-MAP [38]. The AMR is controlled from A, B, C, D, E, and F, and then to A to perform the loop closure to establish a localization map with a velocity of 0.2 m/s, as shown in Figure 3. This process involves matching features between the two locations, followed by optimization using the pose graph optimization method to improve the estimation of the transformation. In Figure 4, the right graph is the generated 3D map and the blue lines indicate the AMR’s trajectory and current anchor points. Once an accurate estimation of the transformation is obtained, loop closure correction is performed, which involves adjusting the camera position and adjusting the 3D model using the new estimation of the transformation. This way, the errors that arise from inaccurate camera or sensor movements can be corrected, resulting in a more accurate and consistent 3D model. The loop closure detection of the environment at the initial time (t = 10) is shown in Figure 4a and the AMR continues to move forward to the end of the corridor, as shown in Figure 4c. Then, the AMR turns back to move forward in the original position (as shown in Figure 4e). Following this, the loop closure detection and correction processes are performed continuously as the camera moves around the environment to ensure that the 3D model is always up to date and accurate. In this paper, three graph optimization approaches—TORO [38], g2o [39], and GTSAM [40]—are integrated in RTAB-MAP. After the loop closure process is finished, the localization graph can be obtained by RTAB-MAP with TORO, g2o, and GTSAM approaches, as shown in Figure 5, Figure 6 and Figure 7.

3.2. Three Graph Optimizations with RTAB-MAP

When a loop closure hypothesis is accepted, a graph optimizer minimizes the errors in the map. In this study, three graph optimization approaches, i.e., TORO, g2o, and GTSAM, are integrated in RTAB-MAP. All of these optimizations use node poses and link transformations as constraints. As a loop closure is detected, the errors introduced by the odometer can be propagated to all links, correcting the map.

(a): TORO uses a tree structure to organize and simplify problems effectively. In the tree structure, each node represents a pose of the robot, which is used to optimize the pose map. The global positioning error is reduced by arranging the pose into a tree structure.
(b): g2o is a highly flexible and modular optimization framework for solving large-scale nonlinear least squares problems, supporting Gaussian Newton and a variety of different local optimization algorithms, making it suitable for processing a variety of sensor data.
(c): GTSAM supports real-time updates and can quickly recalculate and update maps and pose estimates when new nodes enter. Smooth mapping provides accurate estimates even when the data are incomplete or noisy, making it suitable for dynamic environment mapping.

In the RTAB-MAP graph optimization results shown in Table 1, the results show that g2o and GTSAM converge faster than TORO. The results obtained using TORO are less sensitive to poorly estimated odometry covariance. For the single map, the quality of g2o and GTSAM optimization is better than that of TORO based on empirical data. In the experimental results, GTSAM is more robust than g2o and TORO. The experimental results show that the performances of g2o and GTSAM are better than that of TORO. However, in terms of the computational cost of the three methods, GTSAM requires the most CPU usage, as shown in Table 1.

4. Localization of RTAB-MAP and AMCL

After the mapping is completed, RTAB-MAP positioning and robot odometers are used to guide the AMR and the TF tree, as shown in Figure 8. The robot navigation framework, move_base, is divided into global path planning and local path planning (global_planner and local_planner). The global cost map, regional cost map, recovery action (recovery_behavior), and each block need to operate to allow the navigation to run successfully. The move_base node is shown in Figure 9. The simple/goal node is used to view the information regarding the target located on the map, and the positioning algorithm provides the current location information and destination information to generate a global path of the direction of movement and the route from the starting point to the end point. The speed command is calculated according to the global path and issued to the controller, and the local path planning controller receives the speed command and executes it. Therefore, the robot can cope with obstacles or real-time changes. If the robot is unable to continue moving after the set time or the path planning fails many times, it will perform a rotation on the map to clear obstacles within a set distance around it and rotate it once to re-detect and confirm the surrounding environment, as shown in Figure 10.

4.1. Experiment Results for the AMCL and RTAB-MAP for the TORO Method

In this paper, the A* global path planning feasible moving path and dynamic window approach (DWA) are used for regional path planning, and the navigation is completed by receiving the control speed command. Relevant research on large-scale 3D map navigation can be found in [43]. In this paper, adaptive Monte Carlo positioning (AMCL) and RTAB-MAP positioning were used to test the positioning accuracy for the long corridor obstacle avoidance scene, as shown in Figure 11. The AMCL method and RTAB-MAP employ different sensor modalities to estimate their respective locations. The experimental results reveal distinct characteristics of the two SLAM methods tested. The AMCL only uses odometry, IMU, and LiDAR data to estimate the AMR’s locations. However, RTAB-MAP utilizes odometry, IMU, RGB-D, and LiDAR information to estimate the locations. Figure 12 illustrates the localization results for the AMCL method based on the TORO method. The AMCL method adopts SLAM techniques along the navigated path and can continuously localize the system with different levels of success when applied to the data collected. RTAB-MAP based on the TORO method is shown in Figure 13. RTAB-MAP uses visual features for loop closure, while the AMCL is supported by LiDAR-based loop closure. Both approaches successfully identify loop closures along the path. It can be observed that both methods can localize the AMR system with different levels of success when applied to the data collected. Figure 14 shows the actual navigation results for the proposed RTAB-MAP with TORO. The results show that the approach successfully identifies the loop closures along the path and it can localize the AMR system with the data collected.

Moreover, we made use of the default graph optimization approach, TORO, with the RTAB-MAP and AMCL methods, which also supports g2o and GTSAM, to test the obstacle avoidance. When loop closure and proximity detection are detected, the graph optimization approaches are applied to minimize errors in the map. According to the case studies, AMCL and RTAB-MAP are integrated with the three graph optimization approaches: TORO, g2o, and GTSAM.

4.2. AMCL Methods for the TORO, g2o, and GTSAM Methods

To compare the navigation errors of the AMCL for the three graph optimization approaches, the three methods were used to test the obstacle avoidance localization. The experimental results for the AMCL method are shown in Table 2, Table 3 and Table 4. The results show that the navigation errors of angle are not the same. In the AMCL method, the navigation error of the x, y positions for the GTSAM is smaller than for the g2o and TORO methods. However, the navigation error of angle for the TORO is the best of the three methods. In the AMCL methods, the time spent by TORO is the shortest for the three methods.

4.3. RTAB-MAP Method for the TORO, g2o, and GTSAM Methods

To compare the navigation errors of RTAB-MAP for the three graph optimization approaches, the three methods were used to test the obstacle avoidance localization. The experimental results for the RTAB-MAP method are shown in Table 5, Table 6 and Table 7. The results show that the navigation errors of the x, y positions for GTSAM are smaller than for the g2o and TORO methods. Moreover, the navigation error of angle for GTSAM is the best of the three methods. Among the RTAB-MAP methods, the time spent by TORO is also the smallest for the three methods.

Table 8 shows the integrated navigation errors of AMCL and RTAB-MAP for the obstacle avoidance localization with the three graph optimizations. As can be seen from the results in Table 8, the navigation errors for AMCL are much bigger than those of RTAB-MAP. On the one hand, for the AMCL methods, the navigation error of the x, y positions for the GTSAM is smaller than for the g2o and TORO methods. However, the navigation error of angle for TORO is the best of the three methods. On the other hand, in the navigation errors of RTAB-MAP, the navigation error of the x, y positions for GTSAM is smaller than for g2o and TORO. Moreover, the navigation error of angle for GTSAM is the best of the three methods.

Therefore, the results show that RTAB-MAP with GTSAM is the best method for the real-time experiments. Figure 15 shows the experimental results for the obstacle avoidance test for TORO, g2o, and GTSAM with RTAB-MAP. As can be seen from the results in Figure 15, the GTSAM method has better navigation error rates than the TORO and g2o methods; the AMR is further away from the obstacle in the GTSAM experiment than for the other two methods. The final navigation error of GTSAM is the smallest of the three methods. However, according to the time taken by the three methods, TORO is the best. The g2o method takes the longest amount of time to perform the task in the AMCL and the RTAB-MAP methods.

5. Conclusions

In this study, the navigation of a corridor with high similarity features was carried out using an AMR based on the AMCL method and RTAB-MAP with three graph optimization approaches: TORO, g2o, and GTSAM. The experimental results reveal distinct characteristics of the two methods tested. For the AMCL method, the error of the x, y positions for GTSAM was smaller than for the g2o and TORO methods, with average errors of

e_{x} = 3.06 {a n d e}_{y} = 3.74

. However, the TORO had the smallest error of angle among the three methods. However, the processing time for GTSAM is longer than for the TORO. For the RTAB-MAP method, GTSAM was the best in terms of the error of x, y positions and the angle; the average errors were

e_{x} = 2.08, e_{y} = 1.44, a n d

e_{θ} = 2.7

. However, the TORO method had the shortest computation time. Due to the sequential relationship between the tree-like data, the ability of the algorithm to deal with complex intersections or multipath scenarios in the environment is limited. In long-term applications, GTSAM uses smoothing and mapping techniques to provide more accurate estimation and faster convergence speeds with differential function, allowing users to reduce calculation errors and make it easier to optimize the model.

As shown in the experiments, AMCL uses 2D LiDAR scan matching for positioning; the positioning accuracy is influenced by map similarity and feature points. However, RTAB-MAP can use a visual SLAM approach to create 2D or 3D maps by using an RGB-D camera. LiDAR is used to estimate its position and orientation in the map simultaneously. Therefore, RTAB-MAP can perform closed loop detection by using the two sensors to more accurately extract environmental features. Therefore, the positioning accuracies of RTAB-MAP in the three methods are better than those of AMCL. RTAB-MAP positioning relies on both visual and LiDAR data to achieve positioning accuracy, while AMCL only considers LiDAR information, leading to low accuracy. However, the depth camera used by RTAB-MAP is affected by light; thus, the detection results are affected by the accuracy of map construction and positioning as well as navigation error accuracy. The AMCL method uses 2D LiDAR for particle filtering positioning; the AMCL method does not depend heavily on light conditions and can be used even if the visual information is limited or disturbed. Therefore, in scenarios where the light conditions change greatly, the performance of AMCL will be better than that of RTAB-MAP, which requires visual positioning.

Author Contributions

Conceptualization, C.-J.L.; methodology, C.-J.L.; software, S.-Y.L.; validation, C.-J.L. and S.-Y.L.; formal analysis, C.-J.L. and S.-Y.L.; investigation, C.-J.L. and S.-Y.L.; writing—original draft preparation, C.-J.L.; writing—review and editing, C.-J.L.; visualization, S.-Y.L. and C.-J.L.; supervision, C.-J.L.; project administration, C.-C.P.; funding acquisition, C.-C.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science Council of the Republic of China, Contract Nos. MOST 110-2218-E-027-003, MOST 111-2218-E-027-002, NSTC 112-2218-E-027-007, and NSTC 113-2218-E-027-002. The APC was funded by Contract Nos. NSTC 112-2218-E-027-007 and NSTC 113-2218-E-027-002.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. Robot. 2016, 32, 1309–1332. [Google Scholar] [CrossRef]
Ragot, N.; Khemmar, R.; Pokala, A.; Rossi, R.; Ertaud, J.-Y. Benchmark of Visual SLAM Algorithms: ORB-SLAM2 vs RTAB Map. In Proceedings of the 2019 Eighth International Conference on Emerging Security Technologies (EST), Colchester, UK, 22–24 July 2019; pp. 1–6. [Google Scholar]
Chuang, C.S.; Peng, C.C. Development of an Uneven Terrain Decision-Aid Landing System for Fixed-Wing Aircraft Based on Computer Vision. Electronics 2024, 13, 1946. [Google Scholar] [CrossRef]
Pak, J.; Son, H.I. Semantic SLAM-based Autonomous Tributary Navigation System Using 3D LiDAR Point Cloud for UAV. In Proceedings of the 2022 22nd International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea, 27 November–1 December 2022; pp. 1380–1382. [Google Scholar]
Cao, K.; Liu, R.; Wang, Z.; Peng, K.; Zhang, J.; Zheng, J.; Teng, Z.; Yang, K.; Stiefelhagen, R. Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents. In Proceedings of the 2023 IEEE International Conference on Robotics and Biomimetics (ROBIO), Koh Samui, Thailand, 4–9 December 2023. [Google Scholar]
Wang, Y.T.; Peng, C.C.; Ravankar, A.A. A single LiDAR-based feature fusion indoor localization algorithm. Sensors 2018, 18, 1294. [Google Scholar] [CrossRef] [PubMed]
Chen, L.H.; Peng, C.C. A robust 2D-SLAM technology with environmental variation adaptability. IEEE Sens. J. 2019, 19, 11475–11491. [Google Scholar] [CrossRef]
Zaffar, M.; Ehsan, S.; Stolkin, R.; Maier, K.M. Sensors, slam and long-term autonomy: A review. In Proceedings of the 2018 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), Edinburgh, UK, 6–9 August 2018; pp. 285–290. [Google Scholar]
Filipenko, M.; Afanasyev, I. Comparison of various slam systems for mobile robot in an indoor environment. In Proceedings of the 2018 International Conference on Intelligent Systems (IS), Funchal, Portugal, 25–27 September 2018; pp. 400–407. [Google Scholar]
Duan, C.; Junginger, S.; Huang, J.; Jin, K.; Thurow, K. Deep Learning for Visual SLAM in Transportation Robotics: A Review. Transp. Saf. Environ. 2020, 1, 177–184. [Google Scholar] [CrossRef]
Tourani, A.; Bavle, H.; Sanchez-Lopez, S.L.; Voos, H. Visual SLAM: What are the Current Trends and What to Expect? Sensors 2022, 22, 92–97. [Google Scholar] [CrossRef] [PubMed]
Alenyà, G.; Foix, S.; Torras, C. Using ToF and RGBD cameras for 3D robot perception and manipulation in human environments. Intell. Serv. Robot. 2014, 7, 211–220. [Google Scholar] [CrossRef]
Jin, Q.; Liu, Y.; Man, Y.; Li, F. Visual SLAM with RGB-D cameras. In Proceedings of the 2019 Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 4072–4077. [Google Scholar]
Sahili, A.R.; Hassan, S.; Sakhrieh, S.M.; Mounsef, J.; Maalouf, N.; Arain, B.; Taha, T. A Survey of Visual SLAM Methods. IEEE Access 2023, 11, 139643–139677. [Google Scholar] [CrossRef]
Wang, H.; Liu, X.; Yuan, X.; Liang, D. Multi-perspective terrestrial LiDAR point cloud registration using planar primitives. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 6722–6725. [Google Scholar] [CrossRef]
Hess, W.; Kohler, D.; Rapp, H.; Andor, D. Real-time loop closure in 2D LIDAR SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 16–21 May 2016; pp. 1271–1278. [Google Scholar] [CrossRef]
Chan, S.H.; Wu, P.T.; Fu, L.C. Robust 2D Indoor Localization Through Laser SLAM and Visual SLAM Fusion. In Proceedings of the 2018 IEEE International Conference on Systems, Man, and Cybernetics, Miyazaki, Japan, 7–10 October 2018; pp. 1263–1268. [Google Scholar] [CrossRef]
Zhang, Y.J.; Shi, P.C.; Li, J.Y. 3D LiDAR SLAM: A survey. Photogramm. Rec. 2024, 39, 457–517. [Google Scholar] [CrossRef]
Chen, Q.M.; Dong, C.Y.; Mu, Y.Z.; Li, B.C.; Fan, Z.Q.; Wang, Q.L. An Improved Particle Filter SLAM Algorithm for AGVs. In Proceedings of the 2020 IEEE 6th International Conference on Control Science and Systems Engineering, Beijing, China, 17–19 July 2020; pp. 27–31. [Google Scholar] [CrossRef]
Holder, M.; Hellwig, S.; Winner, H. Real-time pose graph SLAM based on radar. In Proceedings of the IEEE Intelligent Vehicles Symposium, Proceedings, Paris, France, 9–12 June 2019; pp. 1145–1151. [Google Scholar] [CrossRef]
Labbé, M.; Michaud, F. RTAB-MAP as an open-source lidar and visual simultaneous localization and mapping library for large-scale and long-term online operation. J. Field Robot. 2018, 36, 416–446. [Google Scholar] [CrossRef]
Labbé, M.; Michaud, F. Appearance-based loop closure detection for online large-scale and long-term operation. IEEE Trans. Robot. 2013, 29, 734–745. [Google Scholar] [CrossRef]
Tsai, T.C.; Peng, C.C. Ground segmentation based point cloud feature extraction for 3D LiDAR SLAM enhancement. Measurement 2024, 236, 114890. [Google Scholar] [CrossRef]
Li, Y.; Li, S.; Tan, Y.; Hu, J. SLAM of Mobile Robot Based on the Joint Optimization of LiDAR and Camera. In Proceedings of the 2022 7th International Conference on Control and Robotics Engineering (ICCRE), Beijing, China, 15–17 April 2022; pp. 21–25. [Google Scholar]
Shang, Z.; Shen, Z. Dual-function depth camera array for inline 3D reconstruction of complex pipelines. Autom. Constr. 2023, 152, 104893. [Google Scholar] [CrossRef]
Labbe, M.; Michaud, F. Online global loop closure detection for large-scale multi-session graph-based SLAM. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 2661–2666. [Google Scholar]
Zhang, Y.; Liu, R.; Yu, H.; Zhou, B.; Qian, K. Visual Loop Closure Detection With Instance Segmentation and Image Inpainting in Dynamic Scenes Using Wearable Camera. IEEE Sens. J. 2022, 22, 16628–16637. [Google Scholar] [CrossRef]
Su, P.; Luo, S.; Huang, X. Real-Time Dynamic SLAM Algorithm Based on Deep Learning. IEEE Access 2022, 10, 87754–87766. [Google Scholar] [CrossRef]
Jocher, G.; Stoken, A.; Borovec, J.; Changyu, L.; Hogan, A.; Diaconu, L.; Ingham, F.; Poznanski, J.; Fang, J.; Yu, L. Ultralytics/yolov5: V3.1—Bug Fixes and Performance Improvements. 2020. Available online: https://zenodo.org/record/4154370#.Y4LNkHZBxPY (accessed on 15 October 2022).
Cherubin, S.; Kaczmarek, W.; Siwek, M. YOLO object detection and classification using low-cost mobile robot. Przegląd Elektrotech. 2024, 29–33. [Google Scholar] [CrossRef]
Ali, M.L.; Zhang, Z. The YOLO Framework: A comprehensive review of evolution, applications, and benchmarks in object detection. Computers 2024, 13, 336. [Google Scholar] [CrossRef]
de Jesus, K.J.; Kobs, H.J.; Cukla, A.R.; de Souza Leite Cuadros, M.A.; and Gamarra, D.F.T. Comparison of Visual SLAM Algorithms ORB-SLAM2, RTAB-MAP and SPTAM in Internal and External Environments with ROS. In Proceedings of the 2021 Latin American Robotics Symposium (LARS), 2021 Brazilian Symposium on Robotics (SBR), and 2021 Workshop on Robotics in Education (WRE), Natal, Brazil, 11–15 October 2021; pp. 216–221. [Google Scholar]
Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An open-source slam system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 2017, 33, 1255–1262. [Google Scholar] [CrossRef]
Muharom, S.; Sardjono, T.A.; Mardiyanto, R. Real-Time 3D Modeling and Visualization Based on RGB-D Camera using RTAB-MAP through Loop Closure. In Proceedings of the 2023 International Seminar on Intelligent Technology and Its Applications (ISITIA), Surabaya, Indonesia, 26–27 July 2023; pp. 228–233. [Google Scholar]
Zhou, Y.; Wang, Y.; Poiesi, F.; Qin, Q.; Wan, Y. Loop Closure Detection Using Local 3D Deep Descriptors. IEEE Robot. Autom. Lett. 2022, 7, 6335–6342. [Google Scholar] [CrossRef]
Angeli, A.; Filliat, D.; Doncieux, S.; Meyer, J.A. Fast and incremental method for loop-closure detection using bags of visual words. IEEE Trans. Robot. 2008, 24, 1027–1037. [Google Scholar] [CrossRef]
Labbé, M.; Michaud, F. Long-term online multi-session graph-based splam with memory management. Auton. Robot. 2018, 42, 1133–1150. [Google Scholar] [CrossRef]
Grisetti, G.; Kummerle, R.; Stachniss, C.; Burgard, W. A tutorial on graph-based SLAM. IEEE Intell. Transp. Syst. Mag. 2010, 2, 31–43. [Google Scholar] [CrossRef]
Kummerle, R.; Grisetti, G.; Strasdat, H.; Konolige, K.; Burgard, W. G²o: A general framework for graph optimization. In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3607–3613. [Google Scholar]
Dellaert, F. Factor graphs and GTSAM: A hands-on introduction. Tech. Rep. Ga. Inst. Technol. 2012, 2. [Google Scholar]
Move_Base. Available online: https://wiki.ros.org/move_base (accessed on 18 January 2024).
Aditya, P. Recovery Behaviours Used in Navigation Stack—ROS. Available online: https://medium.com/@patiladitya1309/recovery-behaviours-used-in-navigation-stack-ros-5df991f408c7 (accessed on 1 January 2024).
Peng, C.C.; Wang, C.Y. Design of constrained dynamic path planning algorithms in large-scale 3D point cloud maps for UAVs. J. Comput. Sci. 2023, 67, 101944. [Google Scholar] [CrossRef]

Figure 1. Block diagram of the AMR for this experiment.

Figure 2. (a) Architecture of RTAB-MAP. (b) Flowchart of the RTAB-MAP method.

Figure 3. (a) Experimental location (AMR moves from A,B,C,D,E, to F); (b) graph optimization setup for RTAB-MAP with TORO.

Figure 4. (a) Loop closure detection for time t = 00:10. (b) Loop closure detection for time t = 01:02. (c) Loop closure detection for initial time t = 02:07. (d) Loop closure detection for time t = 03:21. (e) Loop closure detection for time t = 04:24.

Figure 5. Localization graph for RTAB-MAP with TORO.

Figure 6. Localization graph for RTAB-MAP with g2o.

Figure 7. Localization graph for RTAB-MAP with GTSAM.

Figure 8. Proposed TF tree in ROS.

Figure 9. Move_base node [41].

Figure 10. Recovery behaviors of the move_base node [42].

Figure 11. (a) Obstacle avoidance and loop closure detection; (b) beginning of the task; (c) destination of the obstacle avoidance task.

Figure 12. Navigation results for AMCL with TORO.

Figure 13. Navigation results for RTAB-MAP with TORO.

Figure 14. Navigation photos for the proposed RTAB-MAP with TORO.

Figure 15. (a) Obstacle avoidance trajectories of TORO for RTAB-MAP. (b) Obstacle avoidance trajectories of g2o for RTAB-MAP. (c) Obstacle avoidance trajectories of GTSAM for RTAB-MAP.

Table 1. RTAB-MAP graph optimization.

Method	Processing Time	CPU Usage	Successful Rate
TORO	6′46″	5%	95%
g2o	4′01″	8%	96%
GTSAM	4′00″	10%	96%

Table 2. Navigation errors for TORO with AMCL.

AMCL	Navigation Error			Time (s)
AMCL	x (cm)	y (cm)	Angle (°)	Time (s)
1	6.2	4.5	3.1	39
2	6.7	4.1	2.8	41
3	5.4	3.8	1.6	37
4	4.9	3.1	3.2	42
5	4.2	4.2	1.8	39
Average	5.48	3.94	2.5	39.6

Table 3. Navigation errors for g2o with AMCL.

AMCL	Navigation Error			Time (s)
AMCL	x (cm)	y (cm)	Angle (°)	Time (s)
1	6.2	4.5	3.5	42
2	5.8	3.2	2.8	42
3	3.2	4.9	3.2	44
4	4.3	4.5	2.6	43
5	3.1	3.8	2.1	40
Average	4.52	4.18	2.84	42.2

Table 4. Navigation errors for GTSAM with AMCL.

AMCL	Navigation Error			Time (s)
AMCL	x (cm)	y (cm)	Angle (°)	Time (s)
1	2.1	1.5	3.2	41
2	2.6	3.8	2.5	39
3	3.2	3.5	4.8	42
4	4.1	6.1	2.1	40
5	3.3	3.8	1.4	40
Average	3.06	3.74	2.8	40.4

Table 5. Navigation errors for TORO with RTAB-MAP.

RTAB-MAP	Navigation Error			Time (s)
RTAB-MAP	x (cm)	y (cm)	Angle (°)	Time (s)
1	3.4	2.1	3.8	40
2	3.1	2.3	4.9	42
3	2.8	4.5	6.1	42
4	3.2	4.1	5.7	41
5	3.1	3.8	5.2	39
Average	3.12	3.36	5.14	40.8

Table 6. Navigation errors for g2o with RTAB-MAP.

RTAB-MAP	Navigation Error			Time (s)
RTAB-MAP	x (cm)	y (cm)	Angle (°)	Time (s)
1	2.5	2.4	4.2	45
2	3.2	3.6	3.9	46
3	3.9	2.7	3.7	48
4	2.2	2.6	5.3	45
5	3.1	1.8	2.8	46
Average	2.98	2.38	3.98	46

Table 7. Navigation errors for GTSAM with RTAB-MAP.

RTAB-MAP	Navigation Error			Time (s)
RTAB-MAP	x (cm)	y (cm)	Angle (°)	Time (s)
1	1.8	1.2	2.3	41
2	2.4	0.8	3.1	45
3	2.4	1.5	3.4	41
4	1.5	1.8	2.5	46
5	2.3	1.9	2.2	42
Average	2.08	1.44	2.7	43

Table 8. Navigation errors for AMCL and RTAB-MAP.

	Methods	Navigation Error			Time (s)
	Methods	x (cm)	y (cm)	Angle (°)	Time (s)
AMCL	TORO	5.48	3.94	2.5	39.6
	g2o	4.52	4.18	2.84	42.2
	GTSAM	3.06	3.74	2.8	40.4
RTAB-MAP	TORO	3.12	3.36	5.14	40.8
	g2o	2.98	2.38	3.98	46
	GTSAM	2.08	1.44	2.7	43

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Lin, C.-J.; Peng, C.-C.; Lu, S.-Y. Real-Time Localization for an AMR Based on RTAB-MAP. Actuators 2025, 14, 117. https://doi.org/10.3390/act14030117

AMA Style

Lin C-J, Peng C-C, Lu S-Y. Real-Time Localization for an AMR Based on RTAB-MAP. Actuators. 2025; 14(3):117. https://doi.org/10.3390/act14030117

Chicago/Turabian Style

Lin, Chih-Jer, Chao-Chung Peng, and Si-Ying Lu. 2025. "Real-Time Localization for an AMR Based on RTAB-MAP" Actuators 14, no. 3: 117. https://doi.org/10.3390/act14030117

APA Style

Lin, C.-J., Peng, C.-C., & Lu, S.-Y. (2025). Real-Time Localization for an AMR Based on RTAB-MAP. Actuators, 14(3), 117. https://doi.org/10.3390/act14030117

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Real-Time Localization for an AMR Based on RTAB-MAP

Abstract

1. Introduction

2. Methodology

3. Visual SLAM with RTAB-MAP

3.1. Loop Closure Detection

3.2. Three Graph Optimizations with RTAB-MAP

4. Localization of RTAB-MAP and AMCL

4.1. Experiment Results for the AMCL and RTAB-MAP for the TORO Method

4.2. AMCL Methods for the TORO, g2o, and GTSAM Methods

4.3. RTAB-MAP Method for the TORO, g2o, and GTSAM Methods

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI