*Article* **IMU-Aided High-Frequency Lidar Odometry for Autonomous Driving**

**Hanzhang Xue <sup>1</sup>, Hao Fu <sup>1</sup> and Bin Dai <sup>1,2,\*</sup>**


Received: 14 March 2019; Accepted: 6 April 2019; Published: 11 April 2019

**Abstract:** For autonomous driving, it is important to obtain precise and high-frequency localization information. This paper proposes a novel method in which the Inertial Measurement Unit (IMU), wheel encoder, and lidar odometry are utilized together to estimate the ego-motion of an unmanned ground vehicle. The IMU is fused with the wheel encoder to obtain the motion prior, and it is involved at three levels of the lidar odometry: firstly, the IMU information is used to rectify the intra-frame distortion of the lidar scan, which is caused by the vehicle's own movement; secondly, the IMU provides a better initial guess for the lidar odometry; and thirdly, the IMU is fused with the lidar odometry in an Extended Kalman Filter framework. In addition, an efficient method for hand–eye calibration between the IMU and the lidar is proposed. To evaluate the performance of our method, extensive experiments are performed, and our system is shown to output stable, accurate, and high-frequency localization results in diverse environments without any prior information.

**Keywords:** ego-motion estimation; hand-eye calibration; IMU; lidar odometry; sensor fusion

### **1. Introduction**

Precise and high-frequency localization is one of the key problems for autonomous vehicles. In recent years, the fusion of the Global Navigation Satellite System (GNSS) and the Inertial Measurement Unit (IMU) has been the most popular localization method. However, the GNSS/IMU system will fail in environments where GNSS signals suffer from satellite blockage or multipath propagation [1], such as when an autonomous vehicle is driving in a tunnel, in the mountains, or in an environment with electromagnetic interference. Thus, an alternative localization method is necessary when the GNSS signal is unavailable or is of poor quality.

Currently, most autonomous vehicles are equipped with a light detection and ranging (lidar) device, a promising sensor that can accurately measure the range to the surroundings. Some recent navigation approaches have begun to use lidar as a complementary sensor for the GNSS/IMU localization system. A typical lidar odometry algorithm can roughly be divided into three steps: a pre-processing step, which tries to compensate for the intra-frame distortion caused by the vehicle's own movement; an intermediate step, which is the main body of the lidar odometry algorithm; and an output step, which outputs the localization result.

As the lidar typically spins at 10 Hz, its output frequency is at most 10 Hz, which might not be fast enough to meet the needs of other modules, such as planning or control. To remedy this, some approaches try to combine the lidar with other sensors, such as an IMU. In these approaches, the IMU and the lidar odometry are usually combined in an Extended Kalman Filter (EKF) framework, or by using a factor graph representation [2–7] (as shown on the left of Figure 1). In this paper, we claim that the IMU information can not only be fused with the lidar odometry at the output level, but can also aid the lidar odometry at the pre-processing level and the intermediate level (as shown on the right of Figure 1).

**Figure 1.** A typical lidar odometry algorithm can be roughly divided into three steps: a pre-processing step, an intermediate step, and an output step. Most existing lidar/Inertial Measurement Unit (IMU) fusion approaches try to fuse the lidar odometry and the IMU at the output level [2–7]. In this paper, we try to utilize the IMU information in all three steps.

In the pre-processing step, as the vehicle itself is moving during the lidar spin, the generated point clouds will be distorted. This intra-frame distortion is usually ignored. In this paper, we show that the high-frequency IMU information can naturally help correct this distortion, as long as the IMU and the lidar are well calibrated. We also claim that **the problem of intra-frame correction is in fact equivalent to inter-frame registration**, and propose an interesting two-step correction procedure that does not rely on the IMU. In the intermediate step, the IMU can also provide a good initial guess for the lidar odometry algorithm. Past works usually use the registration result from the previous two scans as the initial guess; we conduct experiments to compare these two approaches. In the output step, we combine the lidar odometry with the IMU in the EKF framework.

In summary, we make the following contributions in this paper:

- We propose an IMU-aided lidar odometry framework in which the IMU information is utilized in all three steps of the pipeline: intra-frame correction, inter-frame registration, and output-level fusion.
- We show that the problem of intra-frame correction is in fact equivalent to inter-frame registration, and propose a two-step correction procedure that does not rely on the IMU.
- We propose an efficient hand–eye calibration method between the IMU and the lidar.
- We perform extensive experiments showing that the proposed system outputs stable, accurate, and high-frequency localization results in diverse environments without any prior information.

The rest of this paper is organized as follows. Section 2 introduces related works. Section 3.1 provides an overview of the proposed approach and introduces the coordinate systems and notations. The details of the proposed approach are presented in Sections 3.2–3.6. The experimental results and comparisons with state-of-the-art approaches are presented in Section 4. Finally, the conclusions are summarized in Section 5.

#### **2. Related Works**

Lidar odometry has been a hot topic in recent years. Lidar odometry is, in essence, equivalent to scan matching, which tries to estimate the relative transform between two frames of point clouds. Scan matching methods can be classified into three categories: point-based, mathematical property-based, and feature-based. The Iterative Closest Point (ICP) algorithm [8] and its variants [9] are the most popular point-based methods, which estimate the relative transform by finding the nearest points between the two scans. In mathematical property-based methods, such as the Normal Distribution Transform (NDT) algorithm and its variants (D2D-NDT [10] and P2D-NDT [11]), the observed points are modeled as a set of Gaussian probability distributions. In feature-based methods, various features have been applied, such as the line segment feature [12], corner feature [13], and grid feature [14]. Zhang et al. proposed a method called Lidar Odometry And Mapping (LOAM) [15,16], which extracts feature points located on sharp edges or planar surfaces, and matches these feature points using either the point-to-line distance or the point-to-plane distance.

All these lidar odometry algorithms can only output localization results at a frequency of less than 10 Hz, which might not be fast enough. Some recent approaches have started to fuse the lidar odometry with the IMU using either the Kalman Filter framework [2–5,17,18] or the factor graph representation [6,7].

In [2], the authors proposed a localization system which combines a 2D Simultaneous Localization and Mapping (SLAM) subsystem with a 3D navigation subsystem based on the IMU. This system can accurately estimate the state of an unmanned ground vehicle (UGV), but the use of 2D SLAM is not suitable for off-road environments. In [3], a hybrid lidar odometry algorithm was proposed which combines the feature-based scan matching method with the ICP-based scan matching method, and the lidar odometry result was then fused with the IMU in an EKF framework. In [6], a batch of IMU measurements was treated as a relative constraint using the pre-integration theory, and the results of the lidar odometry were fused with the IMU measurements in a factor graph framework. However, in all these previous works, the IMU information was only fused with the lidar odometry at the output level, whereas in this paper, we try to utilize the IMU information to aid the whole process of the lidar odometry algorithm.

#### **3. The Proposed Approach**

#### *3.1. Coordinate Systems and Notations*

The aim of this work is to accurately estimate the pose of the vehicle. Three coordinate systems are used in our approach, which are defined as follows:

- The world coordinate system {*W*}, which is fixed in the environment.
- The body coordinate system {*B*}, which moves with the vehicle and is centered at the IMU, mounted at the center of the two rear wheels.
- The lidar coordinate system {*L*}, which is centered at the lidar.

Figure 2 shows the coordinate systems {*W*} and {*B*} in the 2D plane and the relationship between them. The body coordinate system {*B*} changes with the movement of the vehicle. $T_k^{\mathrm{WB}}$ represents the transformation between the coordinate systems {*W*} and {*B*} at time $t_k$, and it can also denote the pose of the vehicle at time $t_k$. Conversely, $T_k^{\mathrm{BW}}$ transforms the coordinate system {*B*} to {*W*}, and $T_k^{\mathrm{BW}} = \left(T_k^{\mathrm{WB}}\right)^{-1}$. Moreover, $\triangle T_{k-1,k}^{\mathrm{B}}$, which can be obtained from the IMU, represents the relative transformation of the vehicle between time $t_{k-1}$ and $t_k$. The relationship between $\triangle T_{k-1,k}^{\mathrm{B}}$ and $T_k^{\mathrm{WB}}$ can be expressed as:

$$
\Delta T\_{\mathbf{k}-1,\mathbf{k}}^{\rm B} = \left(T\_{\mathbf{k}-1}^{\rm WB}\right)^{-1} T\_{\mathbf{k}}^{\rm WB}.\tag{1}
$$
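As a concrete illustration, the pose composition in Equation (1) is a single product of homogeneous transforms. The following minimal Python/numpy sketch (the function name and the hypothetical pure 1 m forward motion are our own, not from the paper) computes $\triangle T_{k-1,k}^{\mathrm{B}}$ from two consecutive poses:

```python
import numpy as np

def relative_transform(T_wb_prev, T_wb_curr):
    """Eq. (1): Delta T^B_{k-1,k} = (T^WB_{k-1})^{-1} T^WB_k."""
    return np.linalg.inv(T_wb_prev) @ T_wb_curr

# Hypothetical example: the vehicle moves 1 m forward between t_{k-1} and t_k.
T_prev = np.eye(4)
T_curr = np.eye(4)
T_curr[0, 3] = 1.0
dT = relative_transform(T_prev, T_curr)  # rotation stays identity, x shifts by 1 m
```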

In addition, the relationship between the coordinate system {*B*} and {*L*} is shown in Figure 3.

**Figure 2.** The relative relationship between the world coordinate system {*W*} and the body coordinate system {*B*} in the 2D plane, where the black axis indicates {*W*} and the red axis indicates {*B*}.

**Figure 3.** The relationship between the body coordinate system {*B*} and the lidar coordinate system {*L*}, where the yellow axis indicates {*L*} and the green axis indicates {*B*}.

We use $T_k^{\mathrm{WL}}$ to denote the pose of the lidar at time $t_k$, and $\triangle T_{k-1,k}^{\mathrm{L}}$ represents the relative transformation of the lidar between time $t_{k-1}$ and $t_k$. The transformation $T^{\mathrm{BL}}$, which relates the lidar coordinate system {*L*} to the body coordinate system {*B*}, can be obtained from the hand–eye calibration.

The transformation *T* can be expressed by:

$$T = \begin{bmatrix} \mathbf{R} & \mathbf{p} \\ \mathbf{0}^T & 1 \end{bmatrix},\tag{2}$$

where $R \in \mathbb{R}^{3\times 3}$ represents the rotation matrix and $p \in \mathbb{R}^3$ is the translation vector.

In terms of *Lie groups* [19], $T$ belongs to the *special Euclidean group* SE(3) and the rotation matrix $R$ belongs to the *special orthogonal group* SO(3). Correspondingly, $\mathfrak{se}(3)$ and $\mathfrak{so}(3)$ are the associated *Lie algebras*. $\boldsymbol{\theta} \in \mathfrak{so}(3)$ indicates the rotation angle, which can be expressed as $\boldsymbol{\theta} = \theta n$, where $\theta$ represents the magnitude of the rotation angle and $n$ denotes the rotation axis. The pose can thus be represented by $\xi = \begin{bmatrix} \boldsymbol{\theta}^T & p^T \end{bmatrix}^T \in \mathfrak{se}(3)$.

The exponential map and the logarithmic map can be used to convert between the *Lie group* and the *Lie algebra*. For example, $T$ can be represented by the exponential map of $\xi$, namely $T = \exp(\xi^{\wedge})$. Conversely, $\xi$ can be represented by the logarithmic map of $T$, namely $\xi = \log(T)^{\vee}$.

The exponential map between $\boldsymbol{\theta}$ and $R$ can be formulated by using the Rodrigues formula:

$$R = \cos\theta \cdot I + (1 - \cos\theta) \cdot n n^T + \sin\theta \cdot n^{\wedge}. \tag{3}$$

The operator $\wedge$ maps a vector to a skew-symmetric matrix, as in Equation (4), and the operator $\vee$ maps a skew-symmetric matrix back to a vector.

$$
\boldsymbol{\theta}^{\wedge} = \begin{bmatrix} \theta^x \\ \theta^y \\ \theta^z \end{bmatrix}^{\wedge} = \begin{bmatrix} 0 & -\theta^z & \theta^y \\ \theta^z & 0 & -\theta^x \\ -\theta^y & \theta^x & 0 \end{bmatrix}. \tag{4}
$$
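The two maps above translate directly into code. The following Python/numpy sketch implements the hat operator of Equation (4) and the Rodrigues formula of Equation (3); the function names are our own:

```python
import numpy as np

def hat(theta):
    """Eq. (4): map a 3-vector to its skew-symmetric matrix."""
    x, y, z = theta
    return np.array([[0.0, -z,   y],
                     [z,    0.0, -x],
                     [-y,   x,   0.0]])

def rodrigues(theta_vec):
    """Eq. (3): exponential map so(3) -> SO(3) via the Rodrigues formula."""
    angle = np.linalg.norm(theta_vec)
    if angle < 1e-12:          # small-angle limit: identity rotation
        return np.eye(3)
    n = theta_vec / angle      # unit rotation axis
    return (np.cos(angle) * np.eye(3)
            + (1.0 - np.cos(angle)) * np.outer(n, n)
            + np.sin(angle) * hat(n))
```

For instance, `rodrigues(np.array([0, 0, np.pi / 2]))` yields a 90° rotation about the z-axis.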

The notation of the data used in our system is specified as follows. Each type of data is composed of a timestamp and the data itself. The timestamp is defined as the time when the data is generated. $t^{\mathrm{B}}$ denotes the timestamp of an IMU measurement, and $t^{\mathrm{L}}$ represents the timestamp of the lidar point clouds. $f_k^{\mathrm{L}}$ indicates all points generated by the lidar during $[t_{k-1}^{\mathrm{L}}, t_k^{\mathrm{L}}]$. In addition, the $i$-th point generated by the $b$-th laser beam is denoted as $X_{(i,b,k)}^{\mathrm{L}}$ in the lidar coordinate system {*L*}. The number of points received per laser beam is denoted as $S_b$, and $i \in [1, S_b]$.

#### *3.2. Motion Prior from IMU and Wheel Encoder*

It is well-known that the accelerometer in the IMU is affected by gravity, and its output needs to be integrated twice to obtain the translational vector [20]. In contrast, the gyroscope in the IMU directly measures the angular velocity, which only needs to be integrated once to obtain the rotation angle. It is therefore reasonable to believe that the angular output of the IMU is much more accurate than its translational measurement.

Besides the IMU, a ground vehicle is usually also equipped with wheel encoders. The wheel encoder directly measures the rotation of the wheel, and thus provides a much more accurate translational measurement.

In this paper, we try to combine the angular output from the IMU and the translational output from the wheel encoder to obtain the motion prior.

The motion model of the vehicle can be simplified as in Figure 4. We mount the IMU at the center of the two rear wheels, and mount two wheel encoders on the left and right rear wheels, respectively. The black car denotes the position of the vehicle at the previous moment $t_{k-1}^{\mathrm{B}}$, and the blue car indicates its position at the current time $t_k^{\mathrm{B}}$. $s_l$ and $s_r$ represent the distances traveled by the left and right rear wheels during this short period of time.

Firstly, we compute the rotation matrix $R_k^{\mathrm{WB}}$ from the triaxial angular velocities of the IMU. We introduce the following kinematic model [21]:

$$\frac{dR_{\tau}^{\mathrm{WB}}}{dt} = R_{\tau}^{\mathrm{WB}} \begin{bmatrix} \omega_{\tau}^{x} \\ \omega_{\tau}^{y} \\ \omega_{\tau}^{z} \end{bmatrix}^{\wedge}. \tag{5}$$

In Equation (5), $R_{\tau}^{\mathrm{WB}}$ indicates the rotation matrix from the coordinate system {*B*} to {*W*}, and $\omega_{\tau}^{x}$, $\omega_{\tau}^{y}$, and $\omega_{\tau}^{z}$ represent the instantaneous angular velocities of the IMU.

**Figure 4.** Schematic diagram of the simplified vehicle motion model. The black car represents the previous position of the vehicle, and the blue car indicates its current position. During this time period, the left rear wheel travels a distance of $s_l$, and the right rear wheel travels $s_r$.

By assuming the angular velocity of the IMU to be constant during $[t_{k-1}^{\mathrm{B}}, t_k^{\mathrm{B}}]$, we integrate Equation (5) from $t_{k-1}^{\mathrm{B}}$ to $t_k^{\mathrm{B}}$ and solve the differential equation. The rotation matrix $R_k^{\mathrm{WB}}$ can be obtained by:

$$R_k^{\mathrm{WB}} = R_{k-1}^{\mathrm{WB}} \exp\left( (\omega_k \,\triangle t)^{\wedge} \right). \tag{6}$$

To compute the translational component $p_k^{\mathrm{WB}}$ of the pose $T_k^{\mathrm{WB}}$, we approximate the traveled distance $\triangle s$ of the vehicle as the average of the distances of the two rear wheels, i.e., $\triangle s = (s_l + s_r)/2$. $\triangle s$ indicates the relative displacement of the vehicle during $[t_{k-1}^{\mathrm{B}}, t_k^{\mathrm{B}}]$ in the body coordinate system {*B*}, and the instantaneous speed of the vehicle can be denoted as $v = \triangle s / (t_k^{\mathrm{B}} - t_{k-1}^{\mathrm{B}})$. We assume the velocity of the vehicle to be constant during $[t_{k-1}^{\mathrm{B}}, t_k^{\mathrm{B}}]$, and the wheel encoder always records the distance traveled along the heading direction of the vehicle. Thus, the instantaneous velocity vector can be denoted as $v_k^{\mathrm{B}} = [v\ 0\ 0]^T$. The relationship between the velocity vector $v_k^{\mathrm{W}}$ in the coordinate system {*W*} and the velocity vector $v_k^{\mathrm{B}}$ in the coordinate system {*B*} is:

$$
v_k^{\mathrm{W}} = R_k^{\mathrm{WB}} v_k^{\mathrm{B}}. \tag{7}
$$

Using $p_{k-1}^{\mathrm{WB}}$, $R_k^{\mathrm{WB}}$, and $v_k^{\mathrm{W}}$, the translational component $p_k^{\mathrm{WB}}$ can be computed as:

$$\begin{split} p_k^{\mathrm{WB}} &= p_{k-1}^{\mathrm{WB}} + v_k^{\mathrm{W}} \left(t_k^{\mathrm{B}} - t_{k-1}^{\mathrm{B}}\right) \\ &= p_{k-1}^{\mathrm{WB}} + R_k^{\mathrm{WB}} \begin{bmatrix} \triangle s & 0 & 0 \end{bmatrix}^{T}. \end{split} \tag{8}$$

Since both the IMU and the wheel encoder output data at a frequency of 100 Hz, we use two separate buffers to cache these two kinds of data. The two sources of information are then fused by Equations (6) and (8). This fusion process only involves a few matrix operations and usually takes less than 1 millisecond. Therefore, the fusion output, which we term the motion prior, is also generated at 100 Hz.
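The propagation of Equations (6) and (8) can be sketched as follows. This is a simplified Python/numpy illustration under the constant-velocity assumption stated above; the function and variable names are our own, and gyro bias and noise are ignored:

```python
import numpy as np

def hat(v):
    """Skew-symmetric matrix of a 3-vector (the ^ operator)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(theta_vec):
    """Rodrigues formula, Eq. (3)."""
    a = np.linalg.norm(theta_vec)
    if a < 1e-12:
        return np.eye(3)
    n = theta_vec / a
    return (np.cos(a) * np.eye(3) + (1 - np.cos(a)) * np.outer(n, n)
            + np.sin(a) * hat(n))

def propagate_prior(R_prev, p_prev, omega, s_left, s_right, dt):
    """One motion-prior step: gyro -> rotation (Eq. 6),
    wheel encoders -> translation along the heading axis (Eq. 8).
    omega: gyro angular velocity [rad/s]; s_left/s_right: wheel travel [m]."""
    R_curr = R_prev @ so3_exp(omega * dt)                 # Eq. (6)
    ds = 0.5 * (s_left + s_right)                         # average rear-wheel travel
    p_curr = p_prev + R_curr @ np.array([ds, 0.0, 0.0])   # Eq. (8)
    return R_curr, p_curr
```

For a straight 0.1 m displacement with zero angular velocity, the pose simply advances along the heading axis.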

#### *3.3. Pre-Processing Step 1: Lidar-IMU Hand–Eye Calibration*

As described before, the lidar point clouds are represented in the lidar coordinate system {*L*}. Taking the lidar point clouds of two frames $f_{k-1}^{\mathrm{L}}$ and $f_k^{\mathrm{L}}$ as inputs, we can calculate $\triangle T_{k-1,k}^{\mathrm{L}}$ by the lidar odometry algorithm. $\triangle T_{k-1,k}^{\mathrm{L}}$ indicates the relative transformation in the coordinate system {*L*}; however, the $\triangle T_{k-1,k}^{\mathrm{B}}$ obtained from the IMU is expressed in the body coordinate system {*B*}.

Therefore, in order to utilize the IMU information, we need to establish the relationship between {*L*} and {*B*} first.

The calculation of the relative transformation $T^{\mathrm{BL}}$ between {*L*} and {*B*} is a typical hand–eye calibration problem. Based on $T^{\mathrm{BL}}$, we can transform a point from the lidar coordinate system {*L*} to the body coordinate system {*B*}:

$$\mathbf{X}^{\rm B}\_{\rm (i,b,k)} = \mathbf{T}^{\rm BL} \mathbf{X}^{\rm L}\_{\rm (i,b,k)}.\tag{9}$$

According to [22], we can formulate the hand–eye calibration problem between the coordinate system {*L*} and {*B*} as:

$$
\triangle T\_{\rm k,k-1}^{\rm B} T^{\rm BL} = T^{\rm BL} \triangle T\_{\rm k,k-1}^{\rm L} \,. \tag{10}
$$

$\triangle T_{k-1,k}^{\mathrm{B}}$ and $\triangle T_{k-1,k}^{\mathrm{L}}$ are the inputs of Equation (10). Let the timestamps of frames $f_k^{\mathrm{L}}$ and $f_{k-1}^{\mathrm{L}}$ be $\tau_k^{\mathrm{L}}$ and $\tau_{k-1}^{\mathrm{L}}$, respectively. We first find the nearest timestamps $\tau_k^{\mathrm{B}}$ and $\tau_{k-1}^{\mathrm{B}}$ from the IMU, from which $T_k^{\mathrm{WB}}$ and $T_{k-1}^{\mathrm{WB}}$ can be obtained. The relative transformation $\triangle T_{k-1,k}^{\mathrm{B}}$ can then be computed by Equation (1). We then use $\triangle T_{k-1,k}^{\mathrm{B}}$ as the initial guess, and $\triangle T_{k-1,k}^{\mathrm{L}}$ can be obtained from the lidar odometry.

Equation (10) can be solved by calculating the rotational and translational component separately. The rotational component is:

$$
\triangle R_{k,k-1}^{\mathrm{B}} R^{\mathrm{BL}} = R^{\mathrm{BL}} \triangle R_{k,k-1}^{\mathrm{L}} \tag{11}
$$

and the translational component is:

$$
\triangle R_{k,k-1}^{\mathrm{B}}\, p^{\mathrm{BL}} + \triangle p_{k,k-1}^{\mathrm{B}} = R^{\mathrm{BL}} \triangle p_{k,k-1}^{\mathrm{L}} + p^{\mathrm{BL}}. \tag{12}
$$

There are many methods to solve Equations (11) and (12). According to [23], the method proposed by Tsai et al. [24] exhibits the best performance. We therefore chose to use this method.
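For illustration, the rotational part of Equation (11) can also be solved with a generic least-squares alignment: since $\triangle R^{\mathrm{B}} = R^{\mathrm{BL}} \triangle R^{\mathrm{L}} (R^{\mathrm{BL}})^{T}$, the paired axis-angle vectors of the motions satisfy $a^{\mathrm{B}} = R^{\mathrm{BL}} a^{\mathrm{L}}$. The Python sketch below aligns those axes with an SVD (Kabsch) solve; note that this is a simplified stand-in for, not a reproduction of, the closed form of Tsai et al. [24]:

```python
import numpy as np

def so3_exp(v):
    """Rodrigues formula, Eq. (3)."""
    a = np.linalg.norm(v)
    if a < 1e-12:
        return np.eye(3)
    n = v / a
    K = np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])
    return np.cos(a) * np.eye(3) + (1 - np.cos(a)) * np.outer(n, n) + np.sin(a) * K

def so3_log(R):
    """Axis-angle vector of a rotation matrix (angle in [0, pi))."""
    a = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if a < 1e-12:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return a * w / (2 * np.sin(a))

def solve_hand_eye_rotation(dR_B_list, dR_L_list):
    """Least-squares R^BL from paired motions: a_B = R^BL a_L (Kabsch/SVD)."""
    A = np.stack([so3_log(R) for R in dR_L_list])  # lidar motion axis-angles
    B = np.stack([so3_log(R) for R in dR_B_list])  # body motion axis-angles
    U, _, Vt = np.linalg.svd(A.T @ B)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
```

With at least two motion pairs whose rotation axes are not parallel, the solution is unique.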

In our approach, multiple pairs of transformations $\triangle T_{i,i-1}^{\mathrm{L}}$ and $\triangle T_{i,i-1}^{\mathrm{B}}$ are needed. To reduce the error of the hand–eye calibration [25,26], the following requirements need to be met:


After completing the hand–eye calibration, each point of the frame $f_k^{\mathrm{L}}$ can be transformed by Equation (9) and represented in the coordinate system {*B*}.

#### *3.4. Pre-Processing Step 2: Intra-Frame Correction*

As mentioned in Section 3.1, it usually takes 100 ms for the lidar to spin a full circle. During this time, the vehicle keeps moving. Some of the data are received at an earlier moment $t_k^{\mathrm{L}} - t_j$ ($t_j \in (0, 100)$ ms), and their coordinates are represented in the coordinate system corresponding to the moment $t_k^{\mathrm{L}} - t_j$. These points can be denoted as $X_{(i,b,k-j)}^{\mathrm{B}}$. If we want to represent these points in the coordinate system corresponding to time $t_k^{\mathrm{L}}$, we need to transform the point $X_{(i,b,k-j)}^{\mathrm{B}}$ to $X_{(i,b,k)}^{\mathrm{B}}$. This transformation is usually called "correction" or "rectification" of the lidar scan.

There are two common ways to rectify the distorted point clouds. One utilizes the information of the IMU [16,27]; we make some improvements to this method, as described in Section 3.4.1. The other uses the previous registration results [15], as presented in Section 3.4.2. Furthermore, we also propose an interesting two-step correction procedure that does not rely on the IMU in Section 3.4.3.

#### 3.4.1. Rectification with the Aid of IMU

In this method, we try to use the IMU to rectify the raw points. Let the poses at time $t_k^{\mathrm{L}}$ and $t_{k-1}^{\mathrm{L}}$ be $T_q^{\mathrm{WB}}$ and $T_p^{\mathrm{WB}}$, respectively. Since the IMU data is generated at a frequency of 100 Hz and the lidar point clouds are generated at a frequency of 10 Hz, there are always nine frames of IMU data between the poses $T_p^{\mathrm{WB}}$ and $T_q^{\mathrm{WB}}$. Therefore, ten relative transformations $\triangle T_{m-1,m}^{\mathrm{B}}$ ($m \in (p, q]$) can be obtained by Equation (1), and each $\triangle T_{m-1,m}^{\mathrm{B}}$ represents the relative transformation during $[t_{m-1}^{\mathrm{B}}, t_m^{\mathrm{B}}]$, where $t_m^{\mathrm{B}} - t_{m-1}^{\mathrm{B}}$ equals 10 ms.

Since the rotational velocity of the lidar is constant during $[t_{k-1}^{\mathrm{L}}, t_k^{\mathrm{L}}]$, the rotation angle of the lidar per 10 ms is also fixed. We can separate all points of a frame $f_k^{\mathrm{L}}$ into ten sets, and the IDs of the points in the $m$-th set satisfy $i \in \left[\frac{m-1-p}{q-p} \cdot S_b,\ \frac{m-p}{q-p} \cdot S_b\right]$. Let the total number of points in each set be $S_e$, which is equal to $S_b/10$. We further assume that the motion of the vehicle is constant during $[t_{m-1}^{\mathrm{B}}, t_m^{\mathrm{B}}]$. Each set of points now corresponds to a unique relative transformation $\triangle T_{m,m-1}^{\mathrm{B}}$, which can also be denoted as $\exp(\triangle\xi_{m-1,m}^{\wedge})$.

We linearly interpolate $\exp(\triangle\xi_{m-1,m}^{\wedge})$ to transform the point $X_{(i,b,k-j)}^{\mathrm{B}}$ to $X_{(i,b,m-1)}^{\mathrm{B}}$. Then, $X_{(i,b,m-1)}^{\mathrm{B}}$ is transformed to $X_{(i,b,k)}^{\mathrm{B}}$ by $\triangle T_{q,m-1}^{\mathrm{B}}$. The whole rectification process can be denoted as:

$$\hat{X}_{(i,b,k)}^{\mathrm{B}} = \triangle T_{q,m-1}^{\mathrm{B}} \cdot \exp\left(\triangle\xi_{m-1,m} \cdot \frac{i \,\%\, S_e}{S_e}\right)^{\wedge} \cdot X_{(i,b,k-j)}^{\mathrm{B}}, \tag{13}$$

where $A\,\%\,B$ computes the remainder when $A$ is divided by $B$, and $\hat{X}_{(i,b,k)}^{\mathrm{B}}$ represents the undistorted point after rectification.
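The interpolation in Equation (13) can be sketched as follows. The SE(3) exponential map is implemented in full (rotation via the Rodrigues formula, translation via the left Jacobian), and `deskew_point`, a name of our own, scales the segment motion by the point's fractional position before applying the segment-to-frame-end transform:

```python
import numpy as np

def hat(v):
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def se3_exp(xi):
    """Exponential map se(3) -> SE(3); xi = [theta (3), rho (3)],
    rotation first, matching the paper's pose parametrization."""
    theta, rho = xi[:3], xi[3:]
    a = np.linalg.norm(theta)
    T = np.eye(4)
    if a < 1e-12:                      # small-angle limit: pure translation
        T[:3, 3] = rho
        return T
    n = theta / a
    K = hat(n)
    R = np.cos(a) * np.eye(3) + (1 - np.cos(a)) * np.outer(n, n) + np.sin(a) * K
    J = ((np.sin(a) / a) * np.eye(3) + (1 - np.sin(a) / a) * np.outer(n, n)
         + ((1 - np.cos(a)) / a) * K)  # left Jacobian of SO(3)
    T[:3, :3] = R
    T[:3, 3] = J @ rho
    return T

def deskew_point(point, xi_seg, T_tail, i, S_e):
    """Eq. (13): interpolate the segment motion xi_seg by (i % S_e)/S_e,
    then apply the remaining segment-to-frame-end transform T_tail."""
    frac = (i % S_e) / S_e
    p_h = T_tail @ se3_exp(frac * xi_seg) @ np.append(point, 1.0)
    return p_h[:3]
```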

#### 3.4.2. Rectification by Previous Registration Results

This approach assumes that the motion of the vehicle is constant during $[t_{k-2}^{\mathrm{L}}, t_k^{\mathrm{L}}]$. Thus, it can be assumed that the relative transformation $\triangle T_{k-1,k-2}^{\mathrm{B}}$ between frames $f_{k-1}^{\mathrm{L}}$ and $f_{k-2}^{\mathrm{L}}$ is the same as the relative transformation $\triangle T_{k,k-1}^{\mathrm{B}}$ between frames $f_k^{\mathrm{L}}$ and $f_{k-1}^{\mathrm{L}}$. Therefore, we can linearly interpolate the relative transformation $\exp(\triangle\xi_{k-1,k-2}^{\wedge})$ to rectify the raw distorted points in frame $f_k^{\mathrm{L}}$ as:

$$X_{(i,b,k)}^{\mathrm{B}} = \exp\left(\triangle\xi_{k-1,k-2} \cdot \frac{i}{S_b}\right)^{\wedge} \cdot X_{(i,b,k-j)}^{\mathrm{B}}. \tag{14}$$

3.4.3. Rectification by Registering Two Distorted Point Clouds

In this approach, we note that intra-frame correction is in fact closely related to inter-frame registration. While the task of intra-frame correction is to estimate the lidar motion between 0 ms and 100 ms, inter-frame registration tries to estimate the motion between 100 ms and 200 ms. Both frame *a* (corresponding to the lidar data generated between 0 ms and 100 ms) and frame *b* (generated between 100 ms and 200 ms) are distorted point clouds. It is reasonable to assume that the degree of distortion of these two frames is quite similar. We can thus directly register these two frames without any intra-frame correction first, and then use the registration result to correct each frame.

Let the relative transformation obtained from registering the two distorted point clouds be $\triangle T_{k,k-1}^{\mathrm{B}}$. We assume that $\triangle T_{k,k-1}^{\mathrm{B}}$ can approximate the lidar motion between time $t_{k-1}^{\mathrm{L}}$ and $t_k^{\mathrm{L}}$, and linearly interpolate $\exp(\triangle\xi_{k,k-1}^{\wedge})$ to rectify the raw distorted points of $f_k^{\mathrm{L}}$, as in Equation (15):

$$X_{(i,b,k)}^{\mathrm{B}} = \exp\left(\triangle\xi_{k,k-1} \cdot \frac{i}{S_b}\right)^{\wedge} \cdot X_{(i,b,k-j)}^{\mathrm{B}}. \tag{15}$$

The undistorted frame after rectification is denoted as $\hat{f}_k^{\mathrm{L}}$. In Section 4, we conduct experiments to compare these three approaches.

#### *3.5. Intermediate Step: Inter-Frame Registration*

To register two frames of point clouds, we could use any off-the-shelf scan registration approach, such as ICP, NDT, or LOAM. To successfully apply these approaches, an accurate initial guess of the estimated transformation is always required. As mentioned previously, the problem of inter-frame registration is, in fact, closely related to intra-frame correction. Therefore, the approaches mentioned in Section 3.4 can also be applied here: we can either use the registration result of the previous two frames, or use the IMU data directly. We conduct experiments to compare these two approaches in Section 4.
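To make the role of the initial guess concrete, the following toy point-to-point ICP (plain numpy with brute-force nearest neighbours; a didactic stand-in, not LOAM or NDT) accepts an initial transformation `T_init`, which in our setting would come from the IMU motion prior or from the previous registration result:

```python
import numpy as np

def icp(source, target, T_init, iters=20):
    """Minimal point-to-point ICP seeded with an initial guess T_init (4x4).
    A good prior starts the correspondence search near the optimum."""
    T = T_init.copy()
    src_h = np.hstack([source, np.ones((len(source), 1))])
    for _ in range(iters):
        moved = (src_h @ T.T)[:, :3]
        # brute-force nearest-neighbour correspondences
        d = np.linalg.norm(moved[:, None, :] - target[None, :, :], axis=2)
        corr = target[np.argmin(d, axis=1)]
        # closed-form rigid alignment (Kabsch) of moved -> corr
        mu_s, mu_t = moved.mean(0), corr.mean(0)
        H = (moved - mu_s).T @ (corr - mu_t)
        U, _, Vt = np.linalg.svd(H)
        dsign = np.sign(np.linalg.det(Vt.T @ U.T))
        R = Vt.T @ np.diag([1.0, 1.0, dsign]) @ U.T
        t = mu_t - R @ mu_s
        dT = np.eye(4)
        dT[:3, :3], dT[:3, 3] = R, t
        T = dT @ T
    return T
```

With an identity initial guess and a small offset between the frames, the estimate converges to the true relative translation.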

#### *3.6. Output Step: Fusion in an EKF Framework*

#### 3.6.1. Basic Concepts

The Extended Kalman filter (EKF) [28] is one of the most commonly used data fusion algorithms. It is a recursive state estimator for the nonlinear system, and the state is assumed to satisfy a Gaussian distribution. A typical EKF algorithm consists of two steps: a prediction step, and an update step.

In the prediction step, the previous state $x_{k-1}$ is used to compute an a priori estimate of the current state:

$$\begin{aligned} \bar{x}_k &= f\left(\hat{x}_{k-1}\right), \\ \bar{P}_k &= F \hat{P}_{k-1} F^T + R, \end{aligned} \tag{16}$$

where $R$ is the covariance of the process noise, $\bar{x}_k$ and $\bar{P}_k$ are the mean and covariance matrix of the prior distribution of the state $x_k$, respectively, and $\hat{x}_{k-1}$ and $\hat{P}_{k-1}$ are the mean and covariance matrix of the posterior Gaussian distribution of the state $x_{k-1}$. By performing a first-order Taylor expansion of the function $f$ at the mean of the state $x_{k-1}$, the first-order partial derivative can be denoted as the Jacobian matrix $F$:

$$F = \left. \frac{\partial f}{\partial x_{k-1}} \right|_{\hat{x}_{k-1}}. \tag{17}$$

In the update step, the a priori estimate of the current state is combined with the current observation $z_k$ to refine the state estimate and obtain the a posteriori state estimate $\hat{x}_k$, which can be formulated as:

$$\begin{aligned} K_k &= \bar{P}_k H^T \left(H \bar{P}_k H^T + Q\right)^{-1}, \\ \hat{x}_k &= \bar{x}_k + K_k\left(z_k - h\left(\bar{x}_k\right)\right), \\ \hat{P}_k &= \left(I - K_k H\right)\bar{P}_k, \end{aligned} \tag{18}$$

where $Q$ represents the covariance of the measurement noise, $K_k$ is the Kalman gain, and the a posteriori state estimate satisfies $x_k \sim \mathcal{N}\left(\hat{x}_k, \hat{P}_k\right)$. Similarly, a first-order Taylor expansion of the function $h$ can be performed at the mean of the prior distribution of the state $x_k$. The first-order partial derivative is defined as the Jacobian matrix $H$:

$$H = \left. \frac{\partial h}{\partial x_k} \right|_{\bar{x}_k}. \tag{19}$$
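Equations (16)–(19) condense into a generic predict/update pair. The sketch below is a textbook EKF step in Python, with `f`, `h` and the Jacobians supplied by the caller (in our system, by the motion prior model and the lidar odometry measurement model); the function names are our own:

```python
import numpy as np

def ekf_predict(x_hat, P_hat, f, F, R_proc):
    """Eq. (16): propagate the mean through f, the covariance through F."""
    x_bar = f(x_hat)
    P_bar = F @ P_hat @ F.T + R_proc
    return x_bar, P_bar

def ekf_update(x_bar, P_bar, z, h, H, Q_meas):
    """Eq. (18): fuse the a priori estimate with the measurement z."""
    S = H @ P_bar @ H.T + Q_meas             # innovation covariance
    K = P_bar @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_hat = x_bar + K @ (z - h(x_bar))
    P_hat = (np.eye(len(x_bar)) - K @ H) @ P_bar
    return x_hat, P_hat
```

For a scalar state with equal prediction and measurement uncertainty, the update lands exactly halfway between the prediction and the observation, as expected.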

In this section, we use an EKF framework to fuse the IMU and the results obtained from the lidar odometry, as shown in Figure 5. Let the motion prior model presented in Section 3.2 be the system model, and the results from the lidar odometry (denoted as the red star in Figure 5) be the measurements. The system model and the measurement model are described as follows.

**Figure 5.** The diagram of the proposed fusion framework. The yellow rectangle represents the IMU data, and the blue triangle indicates the lidar point clouds. The red star denotes the localization result output from the lidar odometry.

#### 3.6.2. System Model

Since the main purpose of our system is to obtain the pose of the vehicle, we can use the *Lie algebra* element $\xi$ to define the state vector $x_k$ at time $t_k$ as:

$$x_k = \begin{bmatrix} \theta_k & p_k \end{bmatrix}^T \in \mathfrak{se}(3). \tag{20}$$

By utilizing the logarithmic map and the approximate Baker–Campbell–Hausdorff (BCH) formula [21], the $R_k^{\mathrm{WB}}$ obtained from Equation (6) can be converted to:

$$
\theta_k = J_l^{-1}\left(\theta_{k-1}\right) \cdot \left(\omega_k \triangle t\right) + \theta_{k-1}, \tag{21}
$$

where $J_l^{-1}$ denotes the inverse of the left BCH Jacobian of SO(3), given by:

$$J_l^{-1}\left(\theta_k\right) = \frac{\theta_k}{2}\cot\frac{\theta_k}{2}\, I + \left(1 - \frac{\theta_k}{2}\cot\frac{\theta_k}{2}\right) n_k n_k^T - \frac{\theta_k}{2}\, n_k^{\wedge}, \tag{22}$$

and the entries of $J_l^{-1}(\theta_k)$ can be denoted as:

$$J\_l^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) = \begin{bmatrix} J\_1^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_2^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_3^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) \\ J\_4^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_5^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_6^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) \\ J\_7^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_8^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) & J\_9^{-1} \left( \boldsymbol{\theta}\_{\mathbf{k}} \right) \end{bmatrix}. \tag{23}$$
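For reference, Equation (22) and the forward left Jacobian it inverts can be implemented and cross-checked numerically as follows (a standalone sketch with names of our own; the small-angle branches fall back to the identity):

```python
import numpy as np

def _hat(n):
    """Skew-symmetric matrix of a 3-vector (the ^ operator)."""
    return np.array([[0, -n[2], n[1]], [n[2], 0, -n[0]], [-n[1], n[0], 0]])

def left_jacobian(theta_vec):
    """Left Jacobian J_l of SO(3)."""
    a = np.linalg.norm(theta_vec)
    if a < 1e-8:
        return np.eye(3)
    n = theta_vec / a
    return ((np.sin(a) / a) * np.eye(3)
            + (1 - np.sin(a) / a) * np.outer(n, n)
            + ((1 - np.cos(a)) / a) * _hat(n))

def inv_left_jacobian(theta_vec):
    """Eq. (22): closed-form inverse left Jacobian of SO(3)."""
    a = np.linalg.norm(theta_vec)
    if a < 1e-8:
        return np.eye(3)
    n = theta_vec / a
    c = (a / 2) / np.tan(a / 2)   # (theta/2) * cot(theta/2)
    return c * np.eye(3) + (1 - c) * np.outer(n, n) - (a / 2) * _hat(n)
```

Multiplying the two matrices for any rotation vector recovers the identity, confirming that Equation (22) is the exact inverse.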

Combining Equations (8), (21), and (23), the state transition equation can be defined as:

$$\mathbf{x}\_{\mathbf{k}} = \mathbf{x}\_{\mathbf{k}-1} + \begin{bmatrix} J\_{l}^{-1}\left(\boldsymbol{\theta}\_{\mathbf{k}-1}\right) \cdot \boldsymbol{\omega}\_{\mathbf{k}} \triangle t \\ \mathbf{R}\_{\mathbf{k}}^{\text{WB}}(1:3,1) \cdot \triangle s \end{bmatrix}, \tag{24}$$

where *R*<sup>WB</sup><sub>k</sub> is computed from *θ*<sup>x</sup><sub>k</sub>, *θ*<sup>y</sup><sub>k</sub>, and *θ*<sup>z</sup><sub>k</sub> according to Equation (3), and *R*<sup>WB</sup><sub>k</sub>(1 : 3, 1) denotes the first column of *R*<sup>WB</sup><sub>k</sub>.
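Equation (24) amounts to a simple dead-reckoning step. The following sketch (illustrative only; `exp_so3` stands in for the Euler-angle construction of Equation (3), which is not reproduced in this section) propagates the state with one gyro sample *ω*<sub>k</sub> and one encoder increment △*s*:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def inv_left_jacobian(theta_vec):
    """Inverse left BCH Jacobian of SO(3) (Equation (22))."""
    theta = np.linalg.norm(theta_vec)
    if theta < 1e-8:
        return np.eye(3) - 0.5 * skew(theta_vec)
    a = theta_vec / theta
    c = (theta / 2.0) / np.tan(theta / 2.0)
    return c * np.eye(3) + (1 - c) * np.outer(a, a) - (theta / 2.0) * skew(a)

def exp_so3(theta_vec):
    """Rodrigues formula: rotation matrix from a rotation vector."""
    theta = np.linalg.norm(theta_vec)
    if theta < 1e-8:
        return np.eye(3) + skew(theta_vec)
    a = theta_vec / theta
    K = skew(a)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def predict_state(x_prev, omega, ds, dt):
    """State transition of Equation (24): x = [theta; p].

    omega: gyro angular rate (rad/s); ds: encoder arc-length increment (m).
    The vehicle is assumed to translate along the first body axis, hence
    the first column of R^WB scales the increment ds."""
    theta = x_prev[:3] + inv_left_jacobian(x_prev[:3]) @ (omega * dt)
    p = x_prev[3:] + exp_so3(theta)[:, 0] * ds
    return np.concatenate([theta, p])
```

For straight driving (*ω* = 0) from the origin, the update simply accumulates △*s* along the first body axis.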

According to Equations (17) and (24), *F* can be derived as:

$$F = \begin{bmatrix} F\_{\theta\theta} & \mathbf{0}\_{3 \times 3} \\ F\_{p\theta} & I\_{3 \times 3} \end{bmatrix} \in \mathbb{R}^{6 \times 6},\tag{25}$$

where *F*<sub>θθ</sub> ∈ ℝ<sup>3×3</sup> can be computed by:

$$F\_{\theta\theta} = I\_{3 \times 3} + \begin{bmatrix} \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}-1})}{\partial \theta\_{\mathbf{k}-1}^{x}} \cdot \boldsymbol{\omega}\_{\mathbf{k}} \triangle t\right)^{T} \\ \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}-1})}{\partial \theta\_{\mathbf{k}-1}^{y}} \cdot \boldsymbol{\omega}\_{\mathbf{k}} \triangle t\right)^{T} \\ \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}-1})}{\partial \theta\_{\mathbf{k}-1}^{z}} \cdot \boldsymbol{\omega}\_{\mathbf{k}} \triangle t\right)^{T} \end{bmatrix}^{T}, \tag{26}$$

and *F*<sub>pθ</sub> can be computed by:

$$F\_{p\theta} = \begin{bmatrix} \frac{\partial \mathbf{R}\_{\mathbf{k}}^{\text{WB}}(1:3,1)}{\partial \theta\_{\mathbf{k}-1}^{x}} & \frac{\partial \mathbf{R}\_{\mathbf{k}}^{\text{WB}}(1:3,1)}{\partial \theta\_{\mathbf{k}-1}^{y}} & \frac{\partial \mathbf{R}\_{\mathbf{k}}^{\text{WB}}(1:3,1)}{\partial \theta\_{\mathbf{k}-1}^{z}} \end{bmatrix} \cdot \triangle s. \tag{27}$$

The prior probability distribution of the state *x*<sub>k</sub> can then be obtained from Equation (16).
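Analytic Jacobians such as *F* in Equations (25)–(27) are easy to get wrong; a central-difference check is a cheap safeguard. This generic helper (an illustrative addition, not from the paper) can be compared entry by entry against the analytic *F* evaluated on the same state:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of a vector function f at x.

    Useful to validate an analytic Jacobian such as F of
    Equations (25)-(27): evaluate both on the same state and compare."""
    fx = np.asarray(f(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        J[:, i] = (np.asarray(f(x + d)) - np.asarray(f(x - d))) / (2 * eps)
    return J
```

In practice one would pass the state-transition function of Equation (24) as `f`; the example below uses a toy function whose Jacobian is known in closed form.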

#### 3.6.3. Measurement Model

The measurement vector *z*<sup>k</sup> is denoted as:

$$\mathbf{z}\_{\mathbf{k}} = \begin{bmatrix} \theta\_{z\_{\mathbf{k}}}^{x} & \theta\_{z\_{\mathbf{k}}}^{y} & \theta\_{z\_{\mathbf{k}}}^{z} & p\_{z\_{\mathbf{k}}}^{x} & p\_{z\_{\mathbf{k}}}^{y} & p\_{z\_{\mathbf{k}}}^{z} \end{bmatrix}^{T} \,. \tag{28}$$

According to Figure 5, the lidar odometry algorithm starts to run at time *t*<sup>L</sup><sub>m</sub>, but its result is obtained at time *t*<sub>k</sub>, which is several tens of milliseconds after *t*<sup>L</sup><sub>m</sub>. We assume that the poses of the vehicle at times *t*<sup>L</sup><sub>m</sub> and *t*<sub>k</sub> are approximately equal to *T*<sup>WB</sup><sub>i</sub> and *T*<sup>WB</sup><sub>l</sub>, respectively. Both *T*<sup>WB</sup><sub>i</sub> and *T*<sup>WB</sup><sub>l</sub> can be obtained from the IMU data.

The relationship between the measurement *z*<sub>k</sub> and the real state *x*<sub>k</sub> of the vehicle at time *t*<sub>k</sub> can be described by the relative transformation *T*<sup>B</sup><sub>i,l</sub>, which can also be denoted as *ξ*<sub>i,l</sub> = [△*θ*<sub>i,l</sub> △*p*<sub>i,l</sub>]<sup>T</sup>. The observation equation can then be represented as:

$$\mathbf{z}\_{\mathbf{k}} = \begin{bmatrix} \boldsymbol{\theta}\_{\mathbf{k}} \\ \triangle \boldsymbol{p}\_{i,l} \end{bmatrix} + \begin{bmatrix} J\_{l}^{-1}\left(\boldsymbol{\theta}\_{\mathbf{k}}\right) \cdot \triangle \boldsymbol{\theta}\_{i,l} \\ \triangle \mathbf{R}\_{i,l}^{\text{B}} \cdot \boldsymbol{p}\_{\mathbf{k}} \end{bmatrix}. \tag{29}$$

According to Equations (19) and (29), *H* can be derived as:

$$H = \begin{bmatrix} H\_{\theta\theta} & \mathbf{0}\_{3 \times 3} \\ \mathbf{0}\_{3 \times 3} & H\_{pp} \end{bmatrix}, \tag{30}$$

where *H*<sub>pp</sub> = △*R*<sup>B</sup><sub>i,l</sub> ∈ ℝ<sup>3×3</sup>, and *H*<sub>θθ</sub> ∈ ℝ<sup>3×3</sup> can be computed by:

$$H\_{\theta\theta} = I\_{3 \times 3} + \begin{bmatrix} \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}})}{\partial \theta\_{\mathbf{k}}^{x}} \cdot \triangle \boldsymbol{\theta}\_{i,l}\right)^{T} \\ \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}})}{\partial \theta\_{\mathbf{k}}^{y}} \cdot \triangle \boldsymbol{\theta}\_{i,l}\right)^{T} \\ \left(\frac{\partial J\_{l}^{-1}(\boldsymbol{\theta}\_{\mathbf{k}})}{\partial \theta\_{\mathbf{k}}^{z}} \cdot \triangle \boldsymbol{\theta}\_{i,l}\right)^{T} \end{bmatrix}^{T} \tag{31}$$

Based on the covariance matrix *P̄*<sub>k</sub> from Equation (16) and the Jacobian matrix *H* from Equation (30), the Kalman gain *K*<sub>k</sub> and the posterior state estimate can be computed by Equation (18).
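For reference, the measurement update of Equation (18) has the standard EKF form. The sketch below is a generic linearized update (illustrative only; in the paper, the predicted measurement is the nonlinear Equation (29) evaluated at the prior state):

```python
import numpy as np

def ekf_update(x_prior, P_prior, z, h_x, H, R):
    """Generic EKF measurement update.

    h_x: predicted measurement h(x_prior) (Equation (29) in the paper);
    H:   measurement Jacobian (Equation (30));
    R:   measurement noise covariance."""
    S = H @ P_prior @ H.T + R                 # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)      # Kalman gain K_k
    x_post = x_prior + K @ (z - h_x)          # posterior state estimate
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post
```

With a near-noiseless measurement (*R* → 0) and an identity measurement model, the posterior snaps to the measurement, which is a convenient unit test.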

This fusion framework is summarized in Algorithm 1.


### **4. Experimental Results**

The platform used in the following experiments is shown in Figure 6. The vehicle is equipped with a Velodyne HDL-64E lidar, an Xsens MTI300 IMU, two photoelectric encoders, a NovAtel SPAN-CPT GNSS/INS system, and a Nova 5100 industrial computer. The lidar is mounted on top of the car; its 64 beams generate over 1.3 million points per second with high range accuracy. The horizontal Field of View (FOV) is 360°, the vertical FOV is 26.8°, and the maximal detection distance is 120 m. The NovAtel GNSS/INS system provides localization results with centimeter-level accuracy, which serve as the ground truth.

**Figure 6.** Our experimental platform. It is equipped with a Velodyne HDL-64E Lidar, a Xsens MTI300 IMU, two photoelectric encoders, a NovAtel SPAN-CPT GNSS/INS system, and a Nova 5100 industrial computer.

#### *4.1. Experiments on Hand–Eye Calibration*

As described in Section 3.3, in order to perform the lidar-IMU hand–eye calibration, we need to utilize multiple pairs of relative transformations {*T*<sup>B</sup><sub>k−1,k</sub>, *T*<sup>L</sup><sub>k−1,k</sub>}.
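Each such pair constrains the extrinsics through the classic hand–eye relation *T*<sup>B</sup><sub>k−1,k</sub> · *T*<sub>BL</sub> = *T*<sub>BL</sub> · *T*<sup>L</sup><sub>k−1,k</sub> (the AX = XB problem). For the rotation part, the rotation axes of the two relative motions are related by *R*<sub>BL</sub>, so aligning the axes in a least-squares sense recovers it. The sketch below shows this common axis-alignment approach (an illustrative stand-in; the paper's actual solver from Section 3.3 may differ):

```python
import numpy as np

def rotation_axis(R):
    """Unit rotation axis of a rotation matrix (assumes 0 < angle < pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    return np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))

def handeye_rotation(R_B_list, R_L_list):
    """Estimate R_BL from relative-rotation pairs via axis alignment.

    From R^B R_BL = R_BL R^L, the motion axes satisfy a_B = R_BL a_L,
    so R_BL is the rotation best aligning the lidar axes onto the IMU
    axes, solved in closed form with an SVD (Kabsch step)."""
    A = np.stack([rotation_axis(R) for R in R_L_list])   # lidar motion axes
    B = np.stack([rotation_axis(R) for R in R_B_list])   # IMU motion axes
    U, _, Vt = np.linalg.svd(B.T @ A)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])       # guard against reflection
    return U @ D @ Vt
```

At least two pairs with non-parallel rotation axes are required, which is exactly why the frame pairs below are collected around turns.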

We choose the frame pairs {*f*<sup>L</sup><sub>i</sub>, *f*<sup>L</sup><sub>i−1</sub>} before and after the vehicle makes a turn. In addition, all the frames are collected when the vehicle is stationary. Figure 7 shows several transformation pairs we have chosen. In each sub-figure, the left part is the registration result before the hand–eye calibration, whilst the right half is the result after hand–eye calibration. It can be seen that *f*<sup>L</sup><sub>i</sub> and *f*<sup>L</sup><sub>i−1</sub> are much better-registered in the right half of Figure 7a–d than in the left half, which indicates the importance of hand–eye calibration.

**Figure 7.** Four typical lidar pairs chosen to perform the hand–eye calibration. In each sub-figure, the left part is the registration result before the hand–eye calibration, whilst the right half is the result after hand–eye calibration. The yellow dotted line indicates the movement trajectory of the vehicle, while the yellow arrow and the blue arrow denote the orientation of the vehicle.

Since the ground truth of *T*<sub>BL</sub> cannot be obtained, we could not verify the calibration result directly. However, we can use the IMU data to assemble different frames together, and the quality of the assembled scene can be used to evaluate the accuracy of the hand–eye calibration result. Figure 8 shows the assembled scene when the vehicle makes a turn at an intersection. Figure 8a, the assembled result before hand–eye calibration, is very blurry, while the result after hand–eye calibration in Figure 8b is much clearer. This suggests that our hand–eye calibration result is accurate.

To quantitatively test the effect of hand–eye calibration, we choose the LOAM algorithm as the lidar odometry algorithm, and compare the estimated trajectories with the ground-truth trajectory generated by the GNSS/INS system. Figure 9 illustrates the localization result before and after hand–eye calibration. The translation error and rotation error are shown in Figure 9b,c, where the horizontal axis represents the traveled distance and the vertical axis shows the errors. It is obvious that the translation error and the rotation error are significantly reduced after the hand–eye calibration.

**Figure 8.** The 3D reconstruction result when the vehicle turns around at the intersection. The left (**a**) is the result before hand–eye calibration, and the right (**b**) is the result after hand–eye calibration.

**Figure 9.** The localization results before and after hand–eye calibration, where (**a**) compares the trajectories before/after hand–eye calibration with the ground truth; (**b**) compares the translation error before/after calibration, and rotation error is shown in (**c**). In this experiment, the UGV drives at a speed around 25 km/h, and the overall distance is about 1.1 km.

#### *4.2. Experiments on Intra-Frame Correction*

Based on the hand–eye calibration results, we could also use the 3D reconstruction results to qualitatively test the three rectification methods introduced in Section 3.4. In order to test the result of the rectification, the vehicle should run fast to amplify the intra-frame distortion. Therefore, we carry out this experiment when the vehicle drives at around 60 km/h, and the results are shown in Figure 10.

Figure 10a–d represent different 3D reconstruction results when the vehicle is making a turn. Figure 10a is the result without any rectification, and the reconstructed scene is quite blurry. Figure 10b–d show the results when the distorted point clouds are rectified by the previous registration result, by registering two distorted point clouds, and by the IMU data, respectively. The results obtained by these three approaches are all much better than the one without rectification. Moreover, the average processing time for these three methods applied on 2000 frames of point clouds is displayed in Table 1. It can be seen that registering two distorted point clouds is the most computationally expensive approach.
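The IMU-based rectification can be sketched as follows: each point carries its sampling time within the sweep and is re-expressed in the end-of-sweep frame under a constant-twist motion model. This is an illustrative simplification (constant angular rate `omega` and velocity `vel` over the sweep, first-order translation), not the paper's exact implementation:

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def exp_so3(theta_vec):
    """Rodrigues formula: rotation matrix from a rotation vector."""
    theta = np.linalg.norm(theta_vec)
    if theta < 1e-8:
        return np.eye(3) + skew(theta_vec)
    a = theta_vec / theta
    K = skew(a)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def rectify_scan(points, timestamps, omega, vel, t_end):
    """Undistort one sweep: re-express every point in the sensor frame
    at t_end, assuming constant angular rate omega (rad/s) and linear
    velocity vel (m/s) during the sweep."""
    out = np.empty_like(points)
    for i, (p, t) in enumerate(zip(points, timestamps)):
        dt = t - t_end                        # negative for earlier points
        # first-order relative pose of frame(t) seen from frame(t_end)
        out[i] = exp_so3(omega * dt) @ p + np.asarray(vel) * dt
    return out
```

With this correction, a static landmark observed at different times within the sweep maps to a single location in the rectified cloud.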

To quantitatively test the effect of intra-frame correction, LOAM is also used as the lidar odometry algorithm, and the localization accuracy is compared with the ground-truth trajectory. Figure 11 compares the localization results with/without rectification. The trajectories generated by the three rectification methods and without rectification are compared with the ground truth in Figure 11a, where "Method 1" represents rectification by previous registration, "Method 2" is the approach of registering two distorted point clouds, and "Method 3" indicates rectification by IMU data. In addition, Figure 11b,c compare the translation error and rotation error of these approaches.

**Figure 10.** The 3D reconstruction results to test the three rectification methods. (**a**) represents the result without any rectification, and (**b**–**d**) show the results when the distorted point clouds are rectified by the previous registration result, by registering two distorted point clouds and by the IMU data, respectively.

**Figure 11.** The localization results with/without rectification, where (**a**) compares the trajectories generated by the three rectification methods (Methods 1–3 represent rectification by the previous registration result, by registering two distorted point clouds and by the IMU data, respectively), the trajectory generated without rectification (Origin) and the ground truth; and (**b**,**c**) compare the translation error and rotation error among them, respectively. In this experiment, the UGV drives at a speed of around 60 km/h.

It can be seen that both the translation error and the rotation error are significantly reduced after rectification, and "Method 3" exhibits the best performance, followed by "Method 1" and "Method 2". A detailed error comparison is presented in Table 2.





#### *4.3. Experiments on Inter-Frame Registration*

#### 4.3.1. Tests with Different Methods of Assigning Initial Guess

In this experiment, two methods of assigning the initial guess for the lidar odometry algorithm are compared: using the IMU data, or using the registration result from the previous two scans. In both approaches, the raw distorted point clouds are rectified by the IMU data, and LOAM is used as the lidar odometry algorithm.

The tests are conducted when the vehicle drives at different speeds: a relatively slow speed of around 25 km/h, and a faster speed of around 60 km/h. A detailed error comparison is shown in Table 3. It can be seen that using the IMU data as the initial guess always produces a better result than using the registration result from the previous two scans, especially when the vehicle drives at a faster speed.


**Table 3.** Error comparison of different methods at the intermediate step.

#### 4.3.2. Tests with Different Lidar Odometry Algorithms

Here, we compare three popular lidar odometry algorithms: ICP, NDT, and LOAM. During the experiments, the point clouds are rectified by the IMU data. The vehicle drives at a speed of around 25 km/h, and the overall distance is about 1.1 km. A detailed error comparison is shown in Table 4. It is obvious that LOAM performs much better than NDT and ICP.



#### 4.3.3. Tests in Both Urban and Off-Road Environments

Our algorithm is tested in both urban and off-road environments. We choose LOAM as the lidar odometry algorithm, and the point clouds are rectified by the IMU data. In the urban environment, the vehicle drives at a speed of around 25 km/h and the overall distance is about 1.1 km. In the off-road environment, the vehicle drives at a speed of around 15 km/h and the overall distance is about 1.0 km. The trajectories generated by our method are compared with the ground truth in Figure 12. A detailed error comparison is shown in Table 5. It can be seen that our approach is applicable to both urban and off-road environments, whilst the error in the off-road scenario is slightly higher than in the urban environment.


**Table 5.** Error comparison of diverse environments.

**Figure 12.** The results in the urban environment and off-road environment. (**a**) shows the trajectory generated in the urban environment when the UGV drives at a speed of around 25 km/h, and (**b**) shows the trajectory generated in the off-road environment when the UGV drives at a speed of around 15 km/h.

#### *4.4. Experiments in GNSS-Denied Environment*

Finally, we test our whole localization system in a GNSS-denied environment. Figure 13a shows the trajectory generated by our localization system when the UGV drives at a speed of around 20 km/h. By projecting the trajectory onto the satellite image, we find that the position drift is quite small. Figure 13b compares the trajectories generated by our localization system and by the disturbed GNSS/INS system. Figure 13c shows the output frequency of our localization system; the average output frequency is around 39.7 Hz. In summary, whether or not the GNSS signal is available, our system can output precise and high-frequency localization results.

**Figure 13.** The results in a GNSS-denied environment. (**a**) shows the trajectory generated by our system when the UGV drives at a speed of around 20 km/h; the overall distance is about 1.3 km. (**b**) compares the trajectories generated by our system and by the disturbed GNSS/INS system. (**c**) shows the output period of our localization system during the whole process; the average output frequency is about 39.7 Hz.

### **5. Conclusions**

In this paper, we present a novel approach for combining IMU data with lidar odometry. The proposed localization system generates stable, accurate, and high-frequency localization results. Since our system does not perform loop-closure detection, a natural extension of this work is to add that capability.

**Author Contributions:** Conceptualization, H.F.; Data curation, H.X.; Formal analysis, H.X.; Funding acquisition, B.D.; Investigation, H.X.; Methodology, H.F. and H.X.; Project administration, B.D.; Resources, H.F. and B.D.; Software, H.F. and H.X.; Supervision, B.D.; Validation, H.F.; Original draft, H.X.; Review and editing, H.F.

**Funding:** This work was supported by the National Natural Science Foundation of China under Grant No. 61790565.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

