Article

Fast Lidar Inertial Odometry and Mapping for Mobile Robot SE(2) Navigation

1 State Key Laboratory for Strength & Vibration of Mechanical Structures, School of Aerospace, Xi’an Jiaotong University, Xi’an 710049, China
2 Shaanxi Engineering Laboratory for Vibration Control of Aerospace Structures, Xi’an Jiaotong University, Xi’an 710049, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(17), 9597; https://doi.org/10.3390/app13179597
Submission received: 31 July 2023 / Revised: 19 August 2023 / Accepted: 22 August 2023 / Published: 24 August 2023

Abstract

This paper presents a fast Lidar inertial odometry and mapping (F-LIOM) method for mobile robot navigation on flat terrain with real-time pose estimation, map building, and place recognition. Existing works on Lidar inertial odometry have mostly parameterized the keyframe pose as SE(3) even when the robot moves on flat ground, which complicates the motion model and is not conducive to real-time non-linear optimization. In this paper, F-LIOM is shown to be cost-effective in terms of model complexity and computational efficiency for robot SE(2) navigation, as the motions in the remaining degrees of freedom in 3D, namely roll, pitch, and z, are treated as noise terms that corrupt the pose estimation. For front-end place recognition, the smoothness information of the feature point cloud is introduced to construct a novel global descriptor that integrates geometric and environmental texture characteristics. Experiments under challenging scenarios, on both self-collected and public datasets, were conducted to validate the proposed method. The results demonstrate that F-LIOM achieves accuracy and real-time performance competitive with state-of-the-art counterparts, giving it clear advantages and the potential to be deployed in resource-limited mobile robot systems.

1. Introduction

Simultaneous localization and mapping (SLAM) plays a crucial role in diverse navigation missions and is indispensable for autonomous mobile robots [1,2]. It provides the reliable state estimation and high-precision maps required for collision-free path planning and real-time obstacle avoidance. Over the past thirty years, numerous excellent visual SLAM techniques, Lidar SLAM techniques, and back-end optimization methods have been proposed to support 6-DOF state estimation for mobile robots. Visual SLAM methods, including monocular-vision-based [3,4], RGB-D-based [5,6], and visual-inertial-based [7,8,9,10] SLAM frameworks, usually employ a camera to obtain a semantic interpretation of the scene and have advantages in place recognition; however, they cannot work in poor lighting conditions. Lidar SLAM [11,12,13,14,15] is invariant to illumination changes, which is essential for obstacle detection and tracking in autonomous navigation. Therefore, this paper mainly focuses on mobile robot pose estimation and mapping based on 3D Lidar.
In 3D Lidar-based SLAM, one of the goals is to estimate Lidar odometry (LO), or Lidar inertial odometry (LIO) if an inertial measurement unit (IMU) is employed. LIO is usually represented in SE(3) [16], involving translation along the x, y, and z axes and rotation about the three axes (roll, pitch, and yaw). A robot’s changes in the z direction and the pitch and roll dimensions are insignificant for navigation on the ground, especially on indoor or outdoor flat terrain. In these cases, the traditional SLAM system can be almost equivalently converted to estimate only x, y, and the yaw angle, with the mobile robot moving in SE(2). Although these perturbations are small, they still represent part of the robot's state change and may corrupt the measurements, so it is unwise to ignore them outright during state estimation. For mobile robots, the state in its minimal representation therefore lies in SE(2) and can be conveniently optimized by incremental smoothing and mapping with a Bayes tree (iSAM2) [17], while the perturbations are treated as a noise term and integrated into the model to construct a new constraint in the factor graph. In addition, loop closure, as one of the indispensable components of SLAM, is used to correct the accumulated error. The smoothness information calculated in the feature extraction process can not only be used to distinguish edge and planar features [18] but can also serve the front-end place recognition algorithm in detecting loop closures. More importantly, it integrates the geometry and environmental texture information of the point cloud over a local region and requires less memory.
This study proposes a fast Lidar inertial odometry and mapping (F-LIOM) method for real-time local accuracy and the global drift-free state estimation of mobile robots in flat terrain. F-LIOM contains all the essential elements required for a SLAM system, including point-cloud de-skewing, pose estimation, and loop closure. We first redefine the data structure of the point cloud in the feature extraction stage for later place recognition and jointly apply the Gaussian process (GP) and IMU measurements to correct the point-cloud distortion in a non-linear form, thereby replacing the traditional assumption of uniform motion. Next, this paper assumes that the robots perform SE(2) motion and proposes a novel constraint model integrating the perturbations as noises to optimize the state estimation by associating the extended pose and the 3D points via the Lidar measurements. Last, for the front-end place recognition algorithm, we construct a new global descriptor, the smoothness scanning context, to detect loop closure and construct a globally consistent trajectory and map. The main contributions of our work can be summarized as follows:
  • An efficient, fast F-LIOM method is developed that integrates the perturbations as weights in noise terms. It enables the system to be computationally tractable.
  • We propose a novel global descriptor consisting of only the smoothness of the feature point cloud to aid the Lidar inertia odometry in detecting loop closures.
  • The results show that F-LIOM achieved competitive performance on self-collected real-world and public datasets, making it more suitable for deployment in limited-resource service robots.
The rest of the paper is organized as follows: Section 2 discusses the relevant works. Section 3 presents a systematic overview of the proposed approach. The experimental evaluation on the public and self-collected datasets is described in Section 4. Lastly, the conclusions are presented in Section 5.

2. Related Work

Iterative closest point (ICP) methods [19,20] were introduced to estimate the rigid transformation between two Lidar frames by finding correspondences at a point-wise or feature-wise level. However, due to the sensor motion and the rotation mechanism of 3D Lidar, the first step of Lidar SLAM is to eliminate point-cloud distortion. Lidar odometry and mapping (LOAM) [18], one of the representative 3D Lidar SLAM methods, was proposed by Zhang and Singh in 2014 for low-drift and real-time state estimation and mapping. Linear interpolation of IMU measurements was used to correct point-cloud distortion. LOAM extracted edge and planar features by calculating the smoothness of each point within its local Lidar scan and matched these features between scans to find correspondences. To meet real-time requirements, two threads were designed to return low-precision odometry at a high frequency and high-precision mapping at a low frequency, respectively. A lightweight and ground-optimized Lidar odometry and mapping (LeGO-LOAM) method was proposed in [12] for real-time six-degree-of-freedom mobile robot state estimation. On the one hand, LeGO-LOAM adopted the primary pipeline of [18] and designed a more advanced feature segmentation method at the front end and a two-step state optimization at the back end. On the other hand, a loop closure module was integrated into LeGO-LOAM to make it more robust in complex environments. A general and efficient framework for accurate Lidar SLAM was provided in [13], where local smoothness was taken into account for iterative pose optimization. In contrast to loosely coupled schemes, Lidar inertial odometry via smoothing and mapping (LIO-SAM), which adopts a tightly coupled framework, was proposed in [14]; it applied a factor graph to optimize the robot state using incremental smoothing and mapping. A versatile Lidar SLAM system based on efficient multi-metric linear least-squares ICP was presented in [15]. LINS was proposed in [21] as a lightweight Lidar inertial state estimator for the real-time navigation of ground vehicles; it is also based on a tightly coupled framework and introduced an error-state Kalman filter to recursively compensate the estimated motion state. IMU measurements have also been used to characterize the system motion at any time in order to correct point-cloud distortion [22,23,24]. Traditional Lidar SLAM methods are corrupted by noise, and the estimated trajectories drift along the z axis; ground parameters have been extracted to build ground factors that eliminate drift in the altitude direction [25].
The concept of continuous-time trajectory estimation was introduced in [26] for mobile robot trajectory estimation and state prediction, a non-parametric state estimation method based on the GP. Tong et al. [27] derived a mathematical model of the GP based on the weight-space and function-space views and proposed an efficient approach that can provide robot state predictions at any time of interest. In addition, the GP [28] can estimate the probability distribution (mean and variance) of the multi-dimensional navigation parameters of ground vehicles and has been widely applied in trajectory optimization [29], target tracking [30], sensor fusion [31], and terrain modeling [32].
Although the lack of texture information in point clouds makes it difficult to construct global descriptors, Lidar scanning is more robust owing to its illumination- and viewpoint-invariant properties. For place recognition, global descriptors were represented in the form of histograms in [33,34]. The scan context (SC) was proposed in [35] as a non-histogram-based global descriptor for 3D Lidar scans; it encodes the point cloud of a 3D scan into a matrix by selecting the maximum height of the points in each bin. A similarity score is calculated by matching a candidate SC against the current SC, and the frame corresponding to the SC with the highest score is considered a loop closure. In [36], the intensity information over the local Lidar scan was introduced to build the intensity scan context (ISC). SC-LiDAR-SLAM was proposed in [37] as a complete SLAM system that integrates front-end place recognition with Scan Context++ to eliminate accumulated errors. These methods verify that both geometric and intensity information can be used to construct descriptors for effective loop detection. A semantic-based place recognition algorithm was also proposed in [37], named Semantic Scan Context (SSC), which consists of a two-step global iterative closest point and a semantic-based descriptor.

3. Methodology

The system structure of the proposed F-LIOM is shown in Figure 1. The sensor data required by the system are mainly obtained from an IMU and a 3D Lidar. We first redefine the data structure of the point cloud in the feature extraction stage for subsequent loop closure. GP and IMU measurements are combined to correct point-cloud distortion in a non-linear manner. An efficient constraint model is presented to estimate the robot state during scan matching, and we introduce the smoothness scan context to detect loop closures. For back-end optimization, a factor graph using incremental smoothing and mapping with a Bayes tree (iSAM2) [17] is used to model the state estimation problem, as it is well suited to incremental inference.

3.1. Point-Cloud De-Skewing

Due to the mechanical rotation mechanism of a 3D Lidar, the sensor can move while scanning, which directly shifts the points of a frame away from their true positions. At present, the scanning frequency of Lidar is generally 10 Hz, and existing works [12,18] have mostly assumed that the robot moves at constant angular and linear velocity within one frame scanning period, which does not always hold.
In this paper, we jointly combine IMU preintegration and the GP to de-skew the point cloud by taking the measurements from IMU propagation as the training points in the GP. IMU preintegration can avoid repeated integrations once the linearization point at time  t i  changes, and it is more convenient to present the relative motion increments between two time steps. The formulation of IMU preintegration can be simplified as follows:
$$\Delta R_{ij} = R_i^{T} R_j, \qquad \Delta v_{ij} = R_i^{T}\left(v_j - v_i - g\,\Delta t_{ij}\right), \qquad \Delta p_{ij} = R_i^{T}\left(p_j - p_i - v_i\,\Delta t_{ij} - \tfrac{1}{2}\, g\,\Delta t_{ij}^{2}\right)$$
where v_i, p_i, and R_i denote the velocity, position, and rotation of the mobile robot at time i; g is the constant gravity vector in world coordinates; and Δv_ij, Δp_ij, and ΔR_ij denote the preintegrated measurements between times i and j. Detailed information on IMU preintegration can be found in [23] and is not repeated here. On the one hand, thanks to the development of SLAM back-end optimization, IMU preintegration factors and Lidar odometry factors can be directly added to the factor graph as different constraint types for joint optimization. On the other hand, we use the IMU preintegration as the training set of the GP to correct the distorted Lidar scan, given the point-wise timestamp of each Lidar frame.
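As an illustration of the preintegrated terms above, the following Python sketch (our own, not the authors' implementation; biases and measurement noise are ignored for brevity, and the function names are ours) accumulates ΔR_ij, Δv_ij, and Δp_ij from raw gyroscope and accelerometer samples following the on-manifold formulation of [23]:

```python
import numpy as np

def skew(v):
    """Map a 3-vector to its skew-symmetric matrix."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    A = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def preintegrate(gyro, acc, dt):
    """Accumulate Delta R, Delta v, Delta p over IMU samples (biases/noise ignored)."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, acc):
        dp = dp + dv * dt + 0.5 * (dR @ np.asarray(a)) * dt**2
        dv = dv + (dR @ np.asarray(a)) * dt
        dR = dR @ so3_exp(np.asarray(w) * dt)
    return dR, dv, dp
```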
The GP can be regarded as a generalization of a Gaussian random variable to the continuous-time domain, described by a mean function m(t) and a covariance function k(t, t') [28]. The process f(t) over the time dimension t can be defined by the mean function m(t) and the covariance function k(t, t') as
$$m(t) = \mathbb{E}[f(t)]$$
$$k(t, t') = \mathbb{E}\big[(f(t) - m(t))(f(t') - m(t'))\big],$$
with the GP written as
$$f(t) \sim \mathcal{GP}\big(m(t), k(t, t')\big)$$
without a loss of generality, where the index set of the random variables is the time dimension t. The covariance function uses the squared exponential as the kernel function, involving two time variables to account for cross-temporal relations.
Assuming that t_s and t_e are the measurement timestamps of the beginning and end of a frame scan, respectively, the relative transform ΔT_t ∈ SE(3) ⊂ R^{4×4}, t_i ∈ [t_s, t_e], between two sequential frames from k−1 to k can be accurately estimated using (2). Let T_k be the keyframe pose at time k; the transform within the period [t_s, t_e] can then be calculated as follows:
$$T_{k+t_i} = T_k \cdot \Delta T_{t_i}.$$
Since the timestamp of the IMU preintegration within the period  t s , t e  and the corresponding pose information are known, the GP can correct the distortion efficiently by referring to the timestamp information of the point cloud.
The detailed process of point-cloud de-skewing is shown in Algorithm 1. Let y = {T_{k+t_i} | t_i ∈ [t_s, t_e], i = 1, ..., N_n} be the set of IMU preintegration results within this period, and let t_{r,i} ∈ [t_s, t_e] be the timestamp of the ith point on the rth ring of the Lidar frame.
Algorithm 1 The algorithm for de-skewing the point cloud
   Input: X := {t_1, ..., t_{N_n}} (training inputs),  y (targets),  x_* := t ∈ [t_s, t_e] (test inputs).
   1: Determine the hyperparameters from the training data;
   2: for t_i = t_1 to t_{N_n} do
           for t_j = t_1 to t_{N_n} do
                K_{ij} = k(t_i, t_j);
           end for
      end for
   3: L := cholesky(K);
      α := L^T \ (L \ y);  k_* := k(X, x_*);   predictive mean (point cloud)
   4: f̄_* := k_*^T α;
      v := L \ k_*;  V[f_*] := k(x_*, x_*) − v^T v;   predictive variance
   5: Output: f̄_* (mean), V[f_*] (variance).
The mean  f ¯  returned from Algorithm 1 is the undistorted point cloud  P ˜ k .
$$\tilde{P}_k = \left\{\, p_k^{m,n} = \bar{f}\!\left(p_k^{m,n}\right) \;\middle|\; p_k^{m,n} \in P_k \,\right\}$$
More importantly, the variance  V f  returned from Algorithm 1 can be applied to evaluate the uncertainty of the predicted pose of each point in the point cloud. Correspondingly, the undistorted edge feature set and planar feature set are represented as  F ˜ k e  and  F ˜ k p , respectively.
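To make Algorithm 1 concrete, the following Python/NumPy sketch (our own illustration, not the authors' code) performs the Cholesky-based GP regression of [28] with a squared-exponential kernel over the IMU-preintegration timestamps; the hyperparameters `length_scale`, `sigma_f`, and `sigma_n` are placeholder values, and the targets `y` are treated here as per-axis pose components.

```python
import numpy as np

def sq_exp_kernel(t1, t2, length_scale=0.02, sigma_f=1.0):
    """Squared-exponential kernel over scalar timestamps."""
    d = np.asarray(t1)[:, None] - np.asarray(t2)[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / length_scale)**2)

def gp_deskew_predict(t_train, y_train, t_query, sigma_n=1e-3):
    """Algorithm 1: GP predictive mean and variance at the point timestamps."""
    K = sq_exp_kernel(t_train, t_train) + sigma_n**2 * np.eye(len(t_train))
    L = np.linalg.cholesky(K)                               # step 3: K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    Ks = sq_exp_kernel(t_train, t_query)                    # k_* for all query times
    mean = Ks.T @ alpha                                     # step 4: predictive mean
    v = np.linalg.solve(L, Ks)
    var = np.diag(sq_exp_kernel(t_query, t_query)) - np.sum(v * v, axis=0)
    return mean, var

# Usage sketch: t_imu are the preintegration timestamps within [t_s, t_e],
# pose_imu the corresponding pose components (e.g. x, y, yaw as an (N, 3) array),
# and t_pts the per-point timestamps of the raw scan.
# mean, var = gp_deskew_predict(t_imu, pose_imu, t_pts)
```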

3.2. Redefining the Data Structure of the Point Cloud

Feature point pair matching has been broadly used in Lidar SLAM and has proven very effective. In this paper, the feature extraction process follows the popular method of [18] and extracts edge and planar features by calculating the smoothness s_k^{i,j} of points over a local region. For place recognition, the smoothness can also be used as a new dimension combined with the geometry information [x, y, z]^T. Let P_k = {p_k^{m,n}} be the raw measurements at time k, where p_k^{i,j} = [x, y, z]^T is the jth point on the ith ring of the frame at time k. The raw point-cloud measurement is first converted into a range image, and then the smoothness s_k^{i,j} of each point is calculated as follows:
$$s_k^{i,j} = \frac{1}{|\mathcal{C}| \cdot \left\lVert p_k^{i,j} \right\rVert} \left\lVert \sum_{m \in \mathcal{C},\, m \neq j} \left( p_k^{i,m} - p_k^{i,j} \right) \right\rVert$$
where 𝒞 denotes the set of the nearest points adjacent to the left and right of the point p_k^{i,j} in the same row of the range image, |𝒞| is the size of the set, and ||p_k^{i,j}|| is the Euclidean distance from point p_k^{i,j} to the scan center. According to a given threshold, points whose smoothness exceeds the threshold are classified as edge/corner features, and those with a smoothness below the threshold are classified as planar points. For convenience, we denote the edge feature set and the planar feature set as F_k^e and F_k^p, respectively.
We redefine the data structure of the features as  p k i , j = [ x , y , z , s ] T , including the raw geometry  [ x , y , z ] T  and the smoothness s calculated above. The newly added dimension is significant for the front-end place recognition algorithm and will be explained in a later section.
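As a small worked example of the smoothness computation and the extended [x, y, z, s]^T data structure, the sketch below (ours; the neighborhood size `half_window` is an assumed value) evaluates s for every point of one ring of the range image:

```python
import numpy as np

def ring_smoothness(ring_xyz, half_window=5):
    """Append the smoothness s to every point of one Lidar ring.

    ring_xyz: (N, 3) points of the same ring, ordered by azimuth in the range image.
    Returns an (N, 4) array [x, y, z, s]; border points keep s = 0.
    """
    ring_xyz = np.asarray(ring_xyz, dtype=float)
    n = len(ring_xyz)
    s = np.zeros(n)
    for j in range(half_window, n - half_window):
        nbrs = np.vstack([ring_xyz[j - half_window:j],
                          ring_xyz[j + 1:j + half_window + 1]])
        diff_sum = np.sum(nbrs - ring_xyz[j], axis=0)      # sum of (p_m - p_j) over C
        range_j = np.linalg.norm(ring_xyz[j])              # distance to the scan center
        s[j] = np.linalg.norm(diff_sum) / (len(nbrs) * max(range_j, 1e-6))
    return np.hstack([ring_xyz, s[:, None]])
```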

3.3. Pose Estimation

For mobile robot SE(2) navigation, existing works have mostly assumed that the robot's changes in the z direction and the pitch and roll dimensions are insignificant for motion on the ground. Most of them only consider changes in the x, y, and yaw dimensions, and the keyframe pose is set as T_k^w ∈ SE(2), whose tangent vectors and Lie algebra elements take the form
$$\tau = \begin{bmatrix} \rho \\ \theta \end{bmatrix} \in \mathbb{R}^{3}, \qquad \tau^{\wedge} = \begin{bmatrix} [\theta]_{\times} & \rho \\ \mathbf{0} & 0 \end{bmatrix} \in \mathfrak{se}(2)$$
where the operator [·]_× produces a skew-symmetric matrix, and the superscript ∧ indicates that the variable is an element of the Lie algebra. Meanwhile, the pose is also extended to T_k^w ∈ SE(3) where needed to keep the dimensions consistent:
$$T_k^w = \begin{bmatrix} R_k & t_k \\ \mathbf{0} & 1 \end{bmatrix} \in SE(3) \subset \mathbb{R}^{4 \times 4},$$
where rotation and translation are redefined as
$$R_k = \mathrm{Exp}(\xi) \in SO(3), \quad \xi = \begin{bmatrix} 0 & 0 & \theta_k \end{bmatrix}^{T}, \qquad t_k = \begin{bmatrix} x & y & 0 \end{bmatrix}^{T} = V(\xi)\,\varepsilon \in \mathbb{R}^{3}, \quad \varepsilon = \begin{bmatrix} \rho^{T} & 0 \end{bmatrix}^{T}$$
with
$$V(\xi) = I + \frac{1 - \cos\lVert\xi\rVert}{\lVert\xi\rVert^{2}}\,[\xi]_{\times} + \frac{\lVert\xi\rVert - \sin\lVert\xi\rVert}{\lVert\xi\rVert^{3}}\,[\xi]_{\times}^{2}$$
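For illustration, the following sketch (ours, written under the conventions of the equations above) builds the extended 4 × 4 pose from the se(2) tangent parameters (ρ, θ), computing R_k = Exp(ξ) as a pure yaw rotation and t_k = V(ξ)ε:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def V(xi):
    """V(xi) from the equation above, with the small-angle limit handled explicitly."""
    th = np.linalg.norm(xi)
    S = skew(xi)
    if th < 1e-9:
        return np.eye(3) + 0.5 * S
    return np.eye(3) + (1 - np.cos(th)) / th**2 * S + (th - np.sin(th)) / th**3 * (S @ S)

def extended_pose(rho, theta):
    """Extended SE(3) pose T_k^w from the se(2) tangent (rho in R^2, theta)."""
    xi = np.array([0.0, 0.0, theta])
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])   # Exp(xi): yaw only
    eps = np.array([rho[0], rho[1], 0.0])
    t = V(xi) @ eps                                              # t_k = V(xi) * eps
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T
```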
When a Lidar scan arrives at time k, the current frame is first transformed from the b frame to the w frame by the pose  T k w :
$$\tilde{F}_k^w = \mu\!\left(\tilde{F}_k^b\right) = T_k^w\,\tilde{F}_k^b + n_l = R_k \tilde{F}_k^b + t_k + n_l.$$
Note that the transformation from homogeneous to non-homogeneous coordinates is implicit; for convenience, we still use the equal sign. Here, F̃_k^b = F̃_k^e ∪ F̃_k^p and F̃_k^w denote the sets of all features in the b frame and the w frame at time k, respectively, and the transform process is corrupted by zero-mean additive Gaussian noise n_l.
The pose estimation strategy is to match the edge features F̃_k^e and planar features F̃_k^p of the current frame against the global map m, which consists of an edge map and a planar map updated incrementally. Once the transformed feature points are associated with the map using a KD-tree, the point-to-edge and point-to-planar distances can be calculated and then used as the residuals for optimization. Thus, the error function is formulated as
$$\mathbf{e} = r\!\left(\tilde{F}_k^w, \mathbf{m}\right),$$
where  r ( · )  is the cost function, and  e  is the residual. The current optimal pose  T k w  can be solved by iteratively minimizing the distance until it converges.
Although the environment is generally 2D, the robot is sensitive to perturbations from slipping and complex terrain, and its motion can then deviate from the dimensions of the SE(2) model. These perturbations are part of the robot's state and matter for the accuracy of the pose. Therefore, we propose an efficient constraint model to integrate them. To balance the complexity and accuracy of the pose, the perturbations, including the rotation perturbation n_{β_{xy}} ∼ N(0_{2×1}, Σ_{β_{xy}}) and the vertical translation perturbation n_z ∼ N(0, σ_z²), are incorporated into the pose model as noise terms. The rotation matrix and the translation live in non-Euclidean and Euclidean space, respectively, so the parameterization of each perturbation must be consistent with the space to which it belongs:
$$R_k \leftarrow \mathrm{Exp}(n_{\beta})\,R_k, \quad n_{\beta} = \begin{bmatrix} n_{\beta_{xy}}^{T} & 0 \end{bmatrix}^{T} \in \mathbb{R}^{3}, \qquad t_k \leftarrow t_k + n_z, \quad n_z = \begin{bmatrix} 0 & 0 & n_z \end{bmatrix}^{T} \in \mathbb{R}^{3},$$
since the disturbance in the z direction is a Euclidean translation that can be directly used for robot poses according to the principle of standard vector addition. The perturbations are integrated into the transformation process of feature points via dimension extension. In this case, we rewrite (12) as
$$\tilde{F}_k^w = \mu\!\left(\tilde{F}_k^b\right) = \mathrm{Exp}(n_{\beta})\,R_k \tilde{F}_k^b + t_k + n_z + n_l \approx \mu\!\left(\tilde{F}_k^b\right) + J_{n_{\beta}}^{\mu} n_{\beta} + J_{n_z}^{\mu} n_z + n_l.$$
The perturbations are treated as noise terms in the model, and their effect can be approximated by a general mathematical tool, the first-order Taylor expansion. Exploiting the left perturbation model, the Jacobian matrices of μ with respect to n_β and n_z can be derived as follows:
$$J_n = \begin{bmatrix} J_{n_{\beta}}^{\mu} & J_{n_z}^{\mu} \end{bmatrix} = \begin{bmatrix} -\left(\mathrm{Exp}(n_{\beta})\,R_k \tilde{F}_k^b\right)^{\wedge} & I_3 \end{bmatrix},$$
where the hat operator (·)^∧ is redefined here to map a vector to a skew-symmetric matrix and no longer denotes an element of the Lie algebra.
The noise consists of the process noise, the rotation perturbation, and the vertical translation perturbation. We gather these noise sources together to highlight their contributions and reformulate them as a single zero-mean noise term δn_l. In the specific form of the noise, the third component of n_β is zero and the first two components of n_z are zero; because of these zero elements, only the part of the Jacobian matrix associated with the out-of-SE(2) perturbations contributes to the noise term. Accordingly, the covariance matrix is approximately represented as
$$\Sigma_{\delta n_l} = \left(J_{n_{\beta}}^{\mu}\,\phi_{\beta}\right) \Sigma_{\beta_{xy}} \left(J_{n_{\beta}}^{\mu}\,\phi_{\beta}\right)^{T} + \sigma_z^{2} \left(J_{n_z}^{\mu}\,\phi_z\right) \left(J_{n_z}^{\mu}\,\phi_z\right)^{T} + \sigma_l^{2} I,$$
where the coefficients  ϕ β  and  ϕ z  are appropriate for eliminating useless matrix components and have no other practical meaning.
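One plausible concrete reading of the covariance above is sketched below (ours; in particular, the selector matrices used to realize φ_β and φ_z are our assumption), assembling Σ_{δn_l} for a single transformed feature point:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def point_noise_cov(q_world, cov_beta_xy, sigma_z, sigma_l):
    """Approximate covariance of delta n_l for one transformed feature point.

    q_world     : (3,) the rotated point R_k @ p (before adding noise)
    cov_beta_xy : (2, 2) covariance of the rotation perturbation n_beta_xy
    sigma_z     : std. dev. of the vertical translation perturbation
    sigma_l     : std. dev. of the Lidar measurement noise
    """
    J_beta = -skew(q_world)        # Jacobian w.r.t. the rotation perturbation
    J_z = np.eye(3)                # Jacobian w.r.t. the translation perturbation
    phi_beta = np.eye(3)[:, :2]    # assumed selector: keep the first two components
    phi_z = np.eye(3)[:, 2:]       # assumed selector: keep only the z component
    A = J_beta @ phi_beta
    B = J_z @ phi_z
    return (A @ cov_beta_xy @ A.T
            + sigma_z**2 * (B @ B.T)
            + sigma_l**2 * np.eye(3))
```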
We now return to the previously defined error function (13), which links the features from P̃_k to the map m and uses the sum of the point-to-edge and point-to-planar distances as the error term. The optimal robot state can be obtained by applying the Gauss–Newton method to minimize this distance iteratively:
$$\min_{T_k} \left\{ \sum_{\tilde{F}_k^e \in \tilde{P}_k} d_k^e + \sum_{\tilde{F}_k^p \in \tilde{P}_k} d_k^p \right\},$$
where d_k^e and d_k^p are the point-to-edge and point-to-planar distances, respectively. The general form of the iteration is as follows:
$$J^{T} J \cdot \Delta T = -J^{T} \mathbf{e},$$
where  J  is the Jacobian matrix of the residual with respect to the optimization variables  T k , and  Δ T  is the increment. According to the formulation of the error function, the chain rule is used to infer the Jacobian  J :
$$J = \frac{\partial c_L}{\partial \tilde{F}_k^w} \frac{\partial \tilde{F}_k^w}{\partial T_k^w},$$
where ∂c_L/∂F̃_k^w is the Jacobian matrix of the edge and planar residuals. As the optimization variable is composed of rotation and translation, the Jacobian matrix can be divided into the following form:
$$J = \begin{bmatrix} \dfrac{\partial c_L}{\partial t_k} & \dfrac{\partial c_L}{\partial \theta_k} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial c_L}{\partial \tilde{F}_k^w}\, I_3\, \phi_{\beta} & -\dfrac{\partial c_L}{\partial \tilde{F}_k^w} \left(\mathrm{Exp}(n_{\beta})\,R_k \tilde{F}_k^b\right)^{\wedge} \phi_z \end{bmatrix}$$
Finally, the coefficients φ_β and φ_z keep the dimensions consistent, since the robot is assumed to perform SE(2) motion in this paper. The optimal pose T_k^w can be derived by non-linear optimization based on the above constraint. Following the tightly coupled Lidar inertial odometry structure, the estimated Lidar odometry factors can be utilized to correct the bias of the IMU.
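To illustrate the Gauss–Newton iteration over the SE(2) variables, the following simplified sketch (ours, not the paper's solver) restricts the residual to point-to-plane distances and optimizes only (x, y, yaw); the point-to-edge residuals and the noise weighting described above are omitted:

```python
import numpy as np

def gauss_newton_se2(points_b, normals_w, offsets_w, x, y, theta, iters=10):
    """Minimal Gauss-Newton scan matching over (x, y, yaw) with plane residuals.

    points_b  : (N, 3) de-skewed planar feature points in the body frame
    normals_w : (N, 3) unit normals of the associated map planes
    offsets_w : (N,)   plane offsets d, so each residual is n . (R p + t) + d
    """
    for _ in range(iters):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        dR = np.array([[-s, -c, 0.0], [c, -s, 0.0], [0.0, 0.0, 0.0]])  # dR/dtheta
        t = np.array([x, y, 0.0])
        p_w = points_b @ R.T + t
        r = np.einsum('ij,ij->i', normals_w, p_w) + offsets_w          # residuals
        J = np.column_stack([normals_w[:, 0],                          # dr/dx
                             normals_w[:, 1],                          # dr/dy
                             np.einsum('ij,ij->i', normals_w, points_b @ dR.T)])
        delta = np.linalg.solve(J.T @ J + 1e-9 * np.eye(3), -J.T @ r)  # J^T J dT = -J^T e
        x, y, theta = x + delta[0], y + delta[1], theta + delta[2]
        if np.linalg.norm(delta) < 1e-6:
            break
    return x, y, theta
```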

3.4. Smoothness Scan Context

Lidar SLAM may degenerate into simple odometry if there is no loop closure mechanism to eliminate the accumulated error. We recall the previously redefined data structure p_k^{i,j} = [x, y, z, s]^T of the features, which adds the smoothness s to the raw geometry [x, y, z]^T. Because it is computed for all features F̃_k^b of a frame, the smoothness reveals, to a certain extent, the texture of the environmental surface over a local region. We believe that smoothness is therefore an important attribute that can be further exploited for place recognition, similar to the intensity-based and geometry-based loop closure detection solutions [35,36]. However, a Lidar scan is composed of tens of thousands of points, which makes constructing a scan context over the raw scan computationally heavy.
Therefore, the smoothness s, available for only a few thousand features, is introduced to construct the smoothness scan context as a global descriptor for place recognition. Some basic properties of the smoothness follow from its definition: the value is always non-negative, and the variance corresponding to edge features is large. Planar features with a lower smoothness are less distinctive in local frames, while edge features are the opposite. To improve the place recognition accuracy, we first set a threshold ρ_s and discard features whose smoothness is below ρ_s, such as weakly distinctive planar features. For the remaining features, a logarithmic function is used to smooth the selected values:
$$s\!\left(p_k^{i,j}\right) \leftarrow \log s\!\left(p_k^{i,j}\right),$$
where the function  s ( · )  provides the smoothness value of a feature point  p k i , j .
We rewrite the bin encoding function to construct the smoothness scan context as follows:
$$\phi\!\left(P_{ij}\right) = \max_{p_k^{i,j} \in P_{ij}} s\!\left(p_k^{i,j}\right),$$
where P_ij is the set of points belonging to the bin of the ith ring and jth sector. Using the smoothness elements, the number of points needed for the scan context is reduced by two orders of magnitude: the proposed algorithm needs only a few thousand features to evaluate the bin encoding function, whereas, taking a 16-line Lidar as an example, the number of raw points exceeds 30,000 in a frame scan. In the subsequent scan-context processing, the search module is implemented with the two-phase search algorithm proposed in [36]. Building on this logic, the candidate loop closure frame P_c can be found. Once P_c is determined, the relative motion between the current frame P_k and the candidate frame P_c can be calculated using the ICP function h(·), as follows:
$$\Delta T_{k,c} = h\!\left(T_k, T_c\right), \qquad (k \gg c + 1)$$
where T_k and T_c are the poses corresponding to keyframes P_k and P_c, respectively. The relative transformation ΔT_{k,c} between P_k and P_c is the loop closure factor linking these two keyframes.
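A minimal sketch of the smoothness scan context construction is given below (ours; the parameter values follow Table 2, while the exact binning conventions and the handling of empty bins are our assumptions):

```python
import numpy as np

def smoothness_scan_context(features, l_max=70.0, n_rings=20, n_sectors=60, rho_s=0.1):
    """Build the smoothness scan context from feature points [x, y, z, s].

    Each bin stores the maximum log-smoothness of the features falling inside it;
    empty bins are encoded as 0 (an assumption of this sketch).
    """
    desc = np.full((n_rings, n_sectors), -np.inf)
    for x, y, z, s in features:
        if s < rho_s:                                  # drop weakly distinctive features
            continue
        rng = np.hypot(x, y)
        if rng >= l_max:
            continue
        ring = int(rng / l_max * n_rings)
        sector = int((np.arctan2(y, x) + np.pi) / (2.0 * np.pi) * n_sectors) % n_sectors
        desc[ring, sector] = max(desc[ring, sector], np.log(s))   # bin encoding
    desc[~np.isfinite(desc)] = 0.0
    return desc
```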

4. Experimental Evaluation

To qualitatively and quantitatively analyze the proposed method, we conducted a series of experiments to evaluate F-LIOM and compare it with the state-of-the-art FLOAM [13], LIO-SAM [14], and LeGO-LOAM [12] methods. The physical experimental platform of this paper is shown in Figure 2a. A Leishen C-16 Lidar and a MicroStrain 3DM-GX3-25 IMU were installed on the ground mobile robot TIABBOT TOM08Q2. TOM08Q2 is a four-wheel independently driven mobile robot with a maximum speed of 2 m/s and a drive wheel diameter of 152 mm. All the experiments were implemented in C++ and executed on an Intel NUC equipped with a Core i3 CPU and 4 GiB of memory, using the robot operating system (ROS) on Ubuntu. Different datasets were collected to validate the performance of F-LIOM in terms of rotation, translation, and loop closure.

4.1. Outdoor Environment

This test aimed to evaluate the performance of F-LIOM in outdoor scenes when the system encountered aggressive perturbations in translation or rotation. We first drove the robot at a non-uniform speed in different outdoor scenes to validate the performance of the mapping. The reason for emphasizing non-uniform speed was to prove that GP contributed to eliminating point-cloud distortion under natural working conditions. Note that this paper aimed not to eliminate these perturbations but to reduce their influence on pose estimation effectively. The corresponding maps generated by our method are shown in Figure 2. For convenience, we label the self-collected datasets corresponding to Figure 2b,c as  G a r d e n  and  B u m p y , respectively. The environment of  G a r d e n  was relatively open, mainly containing many trees and bushes, but the ground was relatively flat. The  B u m p y  dataset represented a bumpy and complex environment with multiple large and small pits in the ground. F-LIOM was capable of accomplishing mapping in all tested conditions. As supplementary material, the mapping and robot movement videos recorded by the camera can be found using the links in Appendix A.
The robot moved a long distance and then returned to the vicinity of the initial position or revisited the scenes that were passed by to verify the place recognition. Figure 3 provides the complete trajectory generated by FLOAM, LIO-SAM, LeGO-LOAM, and our method using the  G a r d e n B u m p y , and public  P a r k  datasets [14]. Because there was no ground truth, we first aimed to show that our method could achieve the same localization accuracy as the state-of-the-art methods FLOAM, LIO-SAM, and LeGO-LOAM, thus proving the effectiveness and versatility of the proposed approach.
It should be emphasized that although the perturbations are considered in this paper, the robot pose is still parameterized in SE(2), so the z component of the estimate carries no information. FLOAM, LIO-SAM, and LeGO-LOAM assume that the robot pose lies in SE(3), where the z axis drifts during robot movement even when the robot moves on flat ground in practice. To fairly evaluate the results of the different methods, the trajectory results were preprocessed before comparison: we replaced all z values with 0 for the comparison of the translation part. The SLAM accuracy assessment tool evo (evaluation of odometry) [38] was used to calculate the absolute trajectory error (ATE) of FLOAM, LIO-SAM, and LeGO-LOAM. Taking the trajectory from our method as the reference, the detailed ATE results are shown in Table 1.
As shown, the error between the trajectory generated by our method and those generated by LIO-SAM and LeGO-LOAM was close to zero, indicating that our method achieved the same accuracy as LIO-SAM and LeGO-LOAM. LIO-SAM_FixZ denotes a particular set of experiments in which the z value of LIO-SAM was held at 0 throughout the optimization process. The accuracy of LIO-SAM_FixZ was the same as that of LIO-SAM, but its running time was longer, which is explained later. The trajectory of FLOAM deviated significantly from the other methods.
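As a minimal illustration of the preprocessing and ATE computation described above (ours; the paper used the evo tool, and this sketch omits trajectory alignment and timestamp association), assuming the trajectories are given as time-aligned N × 3 position arrays:

```python
import numpy as np

def ate_with_fixed_z(traj_ref, traj_est):
    """ATE statistics on the translation part with all z values replaced by 0.

    traj_ref, traj_est: (N, 3) time-aligned positions of the reference trajectory
    (our method) and the evaluated method.
    """
    ref = np.asarray(traj_ref, dtype=float).copy()
    est = np.asarray(traj_est, dtype=float).copy()
    ref[:, 2] = 0.0
    est[:, 2] = 0.0                                  # compare only the x-y translation
    err = np.linalg.norm(ref - est, axis=1)          # per-pose absolute position error
    return {"max": err.max(), "mean": err.mean(),
            "rmse": float(np.sqrt(np.mean(err**2))), "std": err.std()}
```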
We ran the different datasets to verify the performance of F-LIOM in loop closure. A list of parameters pertaining to the smoothness scan context used in the experiment is shown in Table 2. The test datasets consisted of the self-collected  G a r d e n  and  B u m p y  and the public  P a r k  datasets. The scenarios recorded by these datasets were considered very challenging for loop closure detection because they contained many factors that are not conducive to place recognition, such as reverse and forward visits and similar geometric structures and objects. Figure 4 shows the loop closure based on the smoothness scan context proposed in this paper, where each of the loop closure points identified is marked with a red dot, and the detected loop pairs are connected by the black line.
Recall and precision are two fundamental indicators for evaluating loop detection. For low-power embedded systems, we quantitatively analyzed the conditions required to trigger loop closure. The results were compared with place recognition algorithms commonly applied in SLAM, namely SC [35] and ISC [36]. Table 3 lists the results. Compared to SC and ISC, our method required only a few hundred points instead of tens of thousands to achieve highly competitive precision across all three datasets. The number of loops indicates the strong place recognition performance of the method, even in challenging scenes such as forward or reverse visits. More importantly, the fewer points required, the less memory is needed, which can significantly relieve the computational and memory load of embedded devices in practice.

4.2. Real-Time Performance

In the above experiments, we demonstrated that our method achieves pose estimation and loop closure accuracy competitive with advanced methods. Next, we show that the proposed method also has a clear advantage in real-time performance. As one of the most time-consuming core components of the system, the scan-matching module constructs local map constraints and solves the keyframe poses via optimization. We ran the different datasets to compare the run-time of the scan-matching modules of FLOAM, LeGO-LOAM, LIO-SAM, LIO-SAM_FixZ, and our method. The run-times for the four different scenarios are shown in Figure 5. It can be seen that in the more complex Bumpy scenario, a single optimization of some methods took on the order of one second.
Table 4 quantitatively shows the average run-times of the scan-matching modules. Looking only at the numbers in Table 4, the optimization of FLOAM is undoubtedly the fastest. However, it should be pointed out that the optimization process of FLOAM was not synchronized with the dataset when it was played at normal speed; the playback speed had to be set to 0.3 to achieve synchronization. The total running time of FLOAM was therefore more than three times the duration of the dataset, and it did not achieve real-time performance. Throughout all tests, the scan-matching run-time of F-LIOM in the three scenarios was significantly shorter than that of LIO-SAM and LeGO-LOAM. In some highly challenging cases, such as the bumpy ground scene shown in Figure 2c, the average run-times of LIO-SAM and LeGO-LOAM reached 578.49 ms and 421.54 ms, respectively, which means that some Lidar frames were dropped when the Lidar rotation rate was 10 Hz. Lastly, consider LIO-SAM_FixZ, whose average run-time was the longest on the different datasets. The initial idea of fixing the z value was to eliminate possible drift along the z axis, so z was reset during each optimization step. Due to the lack of z-axis information, the model did not match the actual state, making the optimization difficult to converge; the experiments showed quantitatively that the optimization took longer, with an average time reaching 970 ms in extreme cases. As shown, F-LIOM still achieved notable real-time performance, even in complex environments, making it more promising for deployment in resource-limited embedded systems.

5. Conclusions

This paper developed F-LIOM, a robust smoothness-aided Lidar inertial odometry method for ground mobile robots equipped with 3D Lidar in complex environments with rough terrain. F-LIOM is cost-effective in terms of model complexity and computation, as it employs the proposed out-of-SE(2) constraint method, which assumes that the mobile robot still moves in SE(2) and integrates the out-of-SE(2) perturbations as weights in the noise term and covariance matrix to optimize the estimated state. The system follows a tightly coupled Lidar inertial odometry structure, including tracking, real-time state estimation, mapping, and loop closure. Inspired by recent work on Lidar intensity and geometry, a new global descriptor, the smoothness scan context, was presented that exploits geometry and smoothness properties and requires only a few thousand features, instead of the tens of thousands of raw points used by existing methods. Multiple sets of experiments were carried out to evaluate the performance of the proposed algorithm in various scenes, including indoor, road, building, and garden environments as well as a public dataset. The results showed that the proposed method achieves competitive real-time pose estimation performance, making it more suitable for deployment in low-power embedded systems.

Author Contributions

W.C. contributed the central idea, analyzed most of the data, and wrote the initial draft of the paper; J.S. reviewed the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The video related to this paper can be found online at https://youtu.be/zz8LpyLcuBs (accessed on 19 August 2023).

References

  1. Kuutti, S.; Fallah, S.; Katsaros, K.; Dianati, M.; Mccullough, F.; Mouzakitis, A. A Survey of the State-of-the-Art Localization Techniques and Their Potentials for Autonomous Vehicle Applications. IEEE Internet Things J. 2018, 5, 829–846. [Google Scholar] [CrossRef]
  2. Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J. Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age. IEEE Trans. Robot. 2016, 32, 1309–1332. [Google Scholar] [CrossRef]
  3. Davison, A.; Reid, I.; Molton, N.; Stasse, O. MonoSLAM: Real-Time Single Camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1052–1067. [Google Scholar] [CrossRef] [PubMed]
  4. Leutenegger, S.; Lynen, S.; Bosse, M.; Siegwart, R.; Furgale, P. Keyframe-based visual–inertial odometry using nonlinear optimization. Int. J. Robot. Res. 2015, 34, 314–334. [Google Scholar] [CrossRef]
  5. Guo, C.X.; Roumeliotis, S. IMU-RGBD camera 3D pose estimation and extrinsic calibration: Observability analysis and consistency improvement. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 6–10 May 2013; pp. 2935–2942. [Google Scholar]
  6. Laidlow, T.; Bloesch, M.; Li, W.; Leutenegger, S. Dense RGB-D-inertial SLAM with map deformations. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 6741–6748. [Google Scholar]
  7. Qin, T.; Li, P.; Shen, S. VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator. IEEE Trans. Robot. 2018, 34, 1004–1020. [Google Scholar] [CrossRef]
  8. Bu, Z.; Sun, C.; Wang, P. Semantic Lidar-Inertial SLAM for Dynamic Scenes. Appl. Sci. 2022, 12, 10497. [Google Scholar] [CrossRef]
  9. Cai, Y.; Qian, W.; Dong, J.; Zhao, J.; Wang, K.; Shen, T. A LiDAR–Inertial SLAM Method Based on Virtual Inertial Navigation System. Electronics 2023, 12, 2639. [Google Scholar] [CrossRef]
  10. Cao, M.; Zhang, J.; Chen, W. Visual-Inertial-Laser SLAM Based on ORB-SLAM3. In Unmanned Systems; World Scientific: Singapore, 2023; pp. 1–10. [Google Scholar]
  11. Xu, X.; Zhang, L.; Yang, J.; Cao, C.; Wang, W.; Ran, Y.; Tan, Z.; Luo, M. A review of multi-sensor fusion slam systems based on 3D LIDAR. Remote Sens. 2022, 14, 2835. [Google Scholar] [CrossRef]
  12. Shan, T.; Englot, B. LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4758–4765. [Google Scholar]
  13. Wang, H.; Wang, C.; Chen, C.; Xie, L. F-LOAM: Fast LiDAR Odometry and Mapping. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021. [Google Scholar]
  14. Shan, T.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 5135–5142. [Google Scholar]
  15. Pan, Y.; Xiao, P.; He, Y.; Shao, Z.; Li, Z. MULLS: Versatile LiDAR SLAM via Multi-metric Linear Least Square. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation, Xi’an, China, 30 May–5 June 2021. [Google Scholar]
  16. Barfoot, T.D. State Estimation for Robotics; Cambridge University Press: Cambridge, UK, 2017. [Google Scholar]
  17. Kaess, M.; Johannsson, H.; Roberts, R.; Ila, V.; Leonard, J.; Dellaert, F. iSAM2: Incremental smoothing and mapping using the Bayes tree. Int. J. Robot. Res. 2012, 31, 216–235. [Google Scholar] [CrossRef]
  18. Zhang, J.; Singh, S. LOAM: Lidar Odometry and Mapping in Real-time. In Proceedings of the Robotics: Science and Systems, Berkeley, CA, USA, 12–16 July 2014. [Google Scholar]
  19. Segal, A.V.; Hähnel, D.; Thrun, S. Generalized-ICP. In Proceedings of the Robotics: Science and Systems, Seattle, WA, USA, 28 June–1 July 2009. [Google Scholar]
  20. Censi, A. An ICP variant using a point-to-line metric. In Proceedings of the 2008 IEEE International Conference on Robotics and Automation, Pasadena, CA, USA, 19–23 May 2008; pp. 19–25. [Google Scholar]
  21. Qin, C.; Ye, H.; Pranata, C.E.; Han, J.; Zhang, S.; Liu, M. LINS: A Lidar-Inertial State Estimator for Robust and Efficient Navigation. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 8899–8906. [Google Scholar]
  22. Gentil, C.L.; Vidal-Calleja, T.; Huang, S. IN2LAAMA: Inertial Lidar Localization Autocalibration and Mapping. IEEE Trans. Robot. 2021, 37, 275–290. [Google Scholar] [CrossRef]
  23. Forster, C.; Carlone, L.; Dellaert, F.; Scaramuzza, D. IMU Preintegration on Manifold for Efficient Visual-Inertial Maximum-a-Posteriori Estimation. In Proceedings of the Robotics: Science and Systems, Rome, Italy, 13–17 July 2015. [Google Scholar]
  24. Forster, C.; Carlone, L.; Dellaert, F.; Scaramuzza, D. On-Manifold Preintegration for Real-Time Visual-Inertial Odometry. IEEE Trans. Robot. 2017, 33, 1–21. [Google Scholar]
  25. Li, D.; Sun, B.; Liu, R.; Xue, R. Tightly Coupled 3D Lidar Inertial SLAM for Ground Robot. Electronics 2023, 12, 1649. [Google Scholar] [CrossRef]
  26. Tong, C.; Furgale, P.; Barfoot, T. Gaussian Process Gauss-Newton: Non-Parametric State Estimation. In Proceedings of the 2012 Ninth Conference on Computer and Robot Vision, Washington, DC, USA, 28–30 May 2012; pp. 206–213. [Google Scholar]
  27. Tong, C.H.; Furgale, P.; Barfoot, T.D. Gaussian Process Gauss–Newton for non-parametric simultaneous localization and mapping. Int. J. Robot. Res. 2013, 32, 507–525. [Google Scholar] [CrossRef]
  28. Rasmussen, C.; Williams, C.K.I. Gaussian Processes for Machine Learning. In Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2009. [Google Scholar]
  29. Guizilini, V.; Ramos, F. Variational Hilbert regression for terrain modeling and trajectory optimization. Int. J. Robot. Res. 2019, 38, 1375–1387. [Google Scholar] [CrossRef]
  30. Urtasun, R.; Fleet, D.J.; Fua, P. 3D People Tracking with Gaussian Process Dynamical Models. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 1, pp. 238–245. [Google Scholar]
  31. Lv, J.; Lang, X.; Xu, J.; Wang, M.; Liu, Y.; Zuo, X. Continuous-Time Fixed-Lag Smoothing for LiDAR-Inertial-Camera SLAM. IEEE/ASME Trans. Mechatron. 2023, 28, 2259–2270. [Google Scholar] [CrossRef]
  32. Vasudevan, S.; Ramos, F.; Nettleton, E.; Durrant-Whyte, H.; Blair, A. Gaussian Process modeling of large scale terrain. In Proceedings of the ICRA, Kobe, Japan, 12–17 May 2009. [Google Scholar]
  33. Wohlkinger, W.; Vincze, M. Ensemble of shape functions for 3D object classification. In Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Karon Beach, Thailand, 7–11 December 2011; pp. 2987–2992. [Google Scholar]
  34. Muhammad, N.; Lacroix, S. Loop closure detection using small-sized signatures from 3D LIDAR data. In Proceedings of the 2011 IEEE International Symposium on Safety, Security, and Rescue Robotics, Kyoto, Japan, 1–5 November 2011; pp. 333–338. [Google Scholar]
  35. Kim, G.; Kim, A. Scan Context: Egocentric Spatial Descriptor for Place Recognition Within 3D Point Cloud Map. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4802–4809. [Google Scholar]
  36. Wang, H.; Wang, C.; Xie, L. Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 2095–2101. [Google Scholar]
  37. Kim, G.; Yun, S.; Kim, J.; Kim, A. Sc-lidar-slam: A front-end agnostic versatile lidar slam system. In Proceedings of the 2022 International Conference on Electronics, Information, and Communication (ICEIC), Jeju, Republic of Korea, 6–9 February 2022; pp. 1–6. [Google Scholar]
  38. Grupp, M. Evo: Python Package for the Evaluation of Odometry and SLAM. 2017. Available online: https://github.com/MichaelGrupp/evo (accessed on 18 February 2023).
Figure 1. The system structure.
Figure 2. (a) The primary sensors included IMU and 3D Lidar, and the camera was only used to record scenes. (b) An open area and multiple trees. (c) Bumpy ground and aircraft with distinctive features. The zoomed-in subfigures show the loop closure connected by the red line.
Figure 3. Trajectories built by different methods on the Garden, Bumpy, and public Park datasets. For LIO-SAM_FixZ, we set the z axis of LIO-SAM to a fixed value.
Figure 4. Loop closure based on our method. (a) Loop closure results for Garden. (b) Loop closure results for Bumpy. (c) Loop closure results for the public Park dataset. The red dots represent keyframe nodes, and the solid black line connecting two red dots represents the loop closure constraint detected by the system. We can see that the loop closure was successfully triggered when the robot moved a long distance and revisited the scenes that it had passed.
Figure 5. Run-times of the scan-matching modules. (a) Run-times for Garden. (b) Run-times for Bumpy. (c) Run-times for the public Park dataset. (d) Run-times for the small-scale dataset. We can see that the run-times of the scan-matching modules were different in different scenarios. The more complex the scene, the more time was consumed.
Table 1. Results of absolute trajectory error.

| Dataset | Method       | Max   | Mean  | RMSE  | Std   |
|---------|--------------|-------|-------|-------|-------|
| Garden  | FLOAM        | 4.00  | 1.96  | 2.20  | 0.99  |
|         | LeGO-LOAM    | 0.60  | 0.24  | 0.26  | 0.10  |
|         | LIO-SAM      | 0.66  | 0.29  | 0.31  | 0.10  |
|         | LIO-SAM_FixZ | 0.61  | 0.27  | 0.29  | 0.10  |
| Bumpy   | FLOAM        | 12.3  | 1.85  | 2.73  | 2.01  |
|         | LeGO-LOAM    | 0.62  | 0.25  | 0.28  | 0.13  |
|         | LIO-SAM      | 0.45  | 0.13  | 0.16  | 0.09  |
|         | LIO-SAM_FixZ | 0.27  | 0.13  | 0.15  | 0.08  |
| Park    | FLOAM        | 47.95 | 25.25 | 28.51 | 13.26 |
|         | LeGO-LOAM    | 1.25  | 0.25  | 0.30  | 0.17  |
|         | LIO-SAM      | 0.51  | 0.21  | 0.23  | 0.10  |
|         | LIO-SAM_FixZ | 0.59  | 0.26  | 0.29  | 0.12  |
Table 2. Parameters related to the smoothness scan context.

| Parameter | Description                   | Value |
|-----------|-------------------------------|-------|
| L_max     | Maximum sensing range (m)     | 70    |
| N_s       | Number of sectors             | 60    |
| N_r       | Number of rings               | 20    |
| ρ_τ       | Smoothness matching threshold | 0.3   |
| ρ_c       | Smoothness threshold          | 0.1   |
Table 3. The results of the smoothness scan context.

| Dataset | Method | Avg. No. of Points | No. of Loops | Precision |
|---------|--------|--------------------|--------------|-----------|
| Garden  | SC     | 29,721             | 22           | 81.81%    |
|         | ISC    | 29,721             | 18           | 94.44%    |
|         | Ours   | 338                | 23           | 95.65%    |
| Bumpy   | SC     | 26,100             | 13           | 64.54%    |
|         | ISC    | 26,100             | 12           | 75.00%    |
|         | Ours   | 940                | 12           | 75.00%    |
| Park    | SC     | 26,000             | 21           | 84.12%    |
|         | ISC    | 26,000             | 21           | 84.12%    |
|         | Ours   | 359                | 23           | 90.10%    |
Table 4. Average run-times of the scan-matching module (ms).

| Method       | Garden | Bumpy  | Park   | Synchronized with Dataset? |
|--------------|--------|--------|--------|----------------------------|
| FLOAM        | 146.98 | 192.03 | 154.95 | No                         |
| LeGO-LOAM    | 245.05 | 421.51 | 97.45  | Yes                        |
| LIO-SAM      | 521.32 | 578.49 | 238.84 | Yes                        |
| LIO-SAM_FixZ | 730.68 | 970.23 | 242.86 | Yes                        |
| Our method   | 265.10 | 258.05 | 69.15  | Yes                        |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Chen, W.; Sun, J. Fast Lidar Inertial Odometry and Mapping for Mobile Robot SE(2) Navigation. Appl. Sci. 2023, 13, 9597. https://doi.org/10.3390/app13179597
