#### *4.2. Spatial Unification of Multi-Sensor Poses*

Constructing the time and space correlation of each sensor is a fundamental task in multi-sensor fusion optimization. For this system, it is necessary to spatially unify the pose estimation results of LiDAR and IMU in the local map with the GNSS measurements in the global map. The spatial unification strategy for the multi-sensor poses in this paper is shown in Figure 5.

**Figure 5.** Schematic diagram of the spatial unification of multi-sensor poses. The spatial association of the poses of LiDAR and IMU in the local coordinate system is on the left, and the spatial association of the poses of IMU and GNSS receivers in the global coordinate system is on the right.

As shown in Figure 5, *W*<sub>GNSS</sub><sup>IMU</sup> is the extrinsic transformation matrix from the IMU to the GNSS receiver and *W*<sub>IMU</sub><sup>LiDAR</sup> is the extrinsic transformation matrix from the LiDAR to the IMU. Since the hardware is fixed to the mobile carrier, both are calibrated to constant values. The left figure shows the pose conversion between LiDAR and IMU in the local coordinate system, whereas the right figure shows the multi-sensor pose spatial unification from the local to the global coordinate system. Here, it is necessary to introduce the spatial transformation parameters *F*<sub>L</sub><sup>G</sup> (including translation *p*<sub>L</sub><sup>G</sup> and rotation *q*<sub>L</sub><sup>G</sup>) to correlate the two pose spaces. The spatial unification of the multi-sensor poses at moment *t* can be expressed as:

$$\min\_{q\_L^G, p\_L^G} \frac{1}{2} \sum\_{t=1}^{l} \left\| P\_t^G - \left( \left( q\_G^L \right)^T \cdot \left( q\_t^L \cdot \mathcal{W}\_{GNSS}^{IMU} + p\_t^L \right) + p\_L^G \right) \right\|\_2^2 \tag{18}$$

where the initial value of *F*<sub>L</sub><sup>G</sup> is set to the identity matrix. Each time a GNSS factor is added and the global optimum is solved, the value of *F*<sub>L</sub><sup>G</sup> for the next moment is updated, thus correcting the cumulative offset between the local and global coordinate systems.
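To make the role of *F*<sub>L</sub><sup>G</sup> concrete, the following sketch estimates a rigid local-to-global transform by aligning LIO positions to GNSS positions in closed form (the Kabsch/Umeyama solution). This is an illustrative stand-in: the paper refines *F*<sub>L</sub><sup>G</sup> iteratively inside the graph optimization of Equation (18), and the function name and interface here are our own.

```python
import numpy as np

def estimate_local_to_global(p_local, p_global):
    """Least-squares rigid transform (R, t) mapping local-frame LIO
    positions onto global (ENU) GNSS positions, i.e. an estimate of
    F_L^G. Closed-form Kabsch/Umeyama sketch, not the paper's solver."""
    p_local = np.asarray(p_local, dtype=float)    # (N, 3) local positions
    p_global = np.asarray(p_global, dtype=float)  # (N, 3) global positions
    mu_l, mu_g = p_local.mean(axis=0), p_global.mean(axis=0)
    H = (p_local - mu_l).T @ (p_global - mu_g)    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # flip the last axis if needed so R is a proper rotation (det = +1)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_g - R @ mu_l
    return R, t
```

In the paper's pipeline the equivalent parameters are re-estimated whenever a GNSS factor is added, so the local map is continuously re-anchored to the global frame.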

#### *4.3. Global Pose Map Structure*

Global pose map construction can be regarded as a nonlinear optimization problem, namely the nonlinear optimization of the state vector in the sliding window. Different from the factor graph method adopted in [27], this paper adopts the graph optimization method to construct the residual blocks directly in the original pose graph structure for nonlinear optimization, and only optimizes the key frames in the sliding window. In contrast, the factor graph based on GTSAM [29] must recast the optimization problem as a new graph corresponding to the original pose graph, with the optimization variables as vertices and the error terms as edges. The rich constraint relationships among the vertices benefit the optimization accuracy; however, once a new key frame is detected, all of its associated constraint nodes must be updated, which is complicated and too slow for engineering use. Therefore, in order to meet the lightweight and real-time requirements of the vehicle platform, we do not build a new constraint-related Bayesian network, but instead construct the residuals and perform nonlinear optimization in the original pose graph structure. The global pose optimization framework proposed in this paper is shown in Figure 6.

**Figure 6.** LiDAR-IMU-GNSS fusion framework based on graph optimization.

The global optimization function is constructed as follows:

$$\mathcal{X} = \operatorname\*{argmin}\_{\mathcal{X}} \sum\_{t=0}^{n} \left( \left\| z\_t^L - h\_t^L(\mathcal{X}) \right\|\_{\Sigma\_t^L}^2 + \rho \left\| z\_t^G - h\_t^G(\mathcal{X}) \right\|\_{\Sigma\_t^G}^2 \right) + r\_{\text{loop}} \left( T\_k^W \mathcal{P}\_k, \mathcal{S}\_k \right) \tag{19}$$

where *ρ* is the GNSS confidence level, expressed by the covariance of the error in the GNSS observations obtained by the pseudo-range single point positioning (SPP) solution. *T*<sub>k</sub><sup>W</sup> is the pose transformation matrix between the current global point cloud P<sub>k</sub> and the local point cloud S<sub>k</sub> derived from the inter-frame local matching. The specific meaning of each sensor cost function in the formula is as follows.

#### 4.3.1. LIO Factor

According to Section 4.1, the position *p*<sub>t</sub><sup>L</sup> and rotation *q*<sub>t</sub><sup>L</sup> of the carrier in the local coordinate system at moment *t* can be obtained. Therefore, the LIO local residual factor can be constructed as follows:

$$r\_D\left(\hat{z}\_t^{t-1}, \mathcal{X}\right) = \begin{bmatrix} \left( q\_{t-1}^L \right)^{-1} \left( p\_t^L - p\_{t-1}^L \right) \\ \left( q\_{t-1}^L \right)^{-1} q\_t^L \end{bmatrix} \ominus \begin{bmatrix} \left( q\_{t-1}^G \right)^{-1} \left( p\_t^G - p\_{t-1}^G \right) \\ \left( q\_{t-1}^G \right)^{-1} q\_t^G \end{bmatrix} \tag{20}$$

where the symbol ⊖ represents quaternion subtraction.
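The residual of Equation (20) compares the relative motion observed by LIO in the local frame with the relative motion implied by the current global estimate. A minimal sketch, with the quaternion subtraction ⊖ realized as the rotation-vector (log map) of the error rotation; the function name and use of SciPy `Rotation` objects are our own choices, not the paper's implementation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def lio_residual(p_prev_L, q_prev_L, p_t_L, q_t_L,
                 p_prev_G, q_prev_G, p_t_G, q_t_G):
    """6-vector LIO residual of Eq. (20). q_* are scipy Rotation objects,
    p_* are 3-vectors; L = local frame, G = global frame."""
    # relative translation and rotation between t-1 and t in the local frame
    dp_L = q_prev_L.inv().apply(p_t_L - p_prev_L)
    dq_L = q_prev_L.inv() * q_t_L
    # the same relative motion expressed in the global frame
    dp_G = q_prev_G.inv().apply(p_t_G - p_prev_G)
    dq_G = q_prev_G.inv() * q_t_G
    r_trans = dp_L - dp_G
    r_rot = (dq_G.inv() * dq_L).as_rotvec()  # "⊖": log of the error rotation
    return np.concatenate([r_trans, r_rot])
```

If the local and global trajectories differ only by a constant rigid transform, the relative motions agree and the residual vanishes, which is exactly the consistency the factor enforces.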

#### 4.3.2. GNSS Factor

Let the time interval between two frames of GNSS observations be Δ*t*, and realize the time alignment with the LIO pose estimation by interpolation: cubic spline interpolation is used for position and spherical linear interpolation for quaternions. Given the GNSS measurement *p*<sub>t</sub><sup>GNSS</sup> in the ENU coordinate system and the LIO positional observation *p*<sub>t</sub><sup>G</sup> in the global coordinate system, the GNSS residual factor is expressed as follows:

$$r\_G\left(\hat{z}\_t^{t-1}, \mathcal{X}\right) = p\_t^G - p\_t^{GNSS} \tag{21}$$

When the carrier moves into a GNSS signal confidence region, in order to utilize the GNSS observations fully and reliably, the GNSS factor is added with the GNSS confidence as its weight. The GNSS confidence is determined by the number of visible and effective GNSS satellites. After GNSS participates in the global pose estimation, it updates the pose conversion parameter *F*<sub>L</sub><sup>G</sup> between the local and global coordinate systems. This ensures that, even if the mobile carrier enters a GNSS-denied environment (e.g., indoor car parks and tunnels), our algorithm can provide a more accurate initial observation after GNSS correction.
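The time alignment described above can be sketched as follows: cubic splines interpolate the LIO positions at the GNSS timestamps, and slerp interpolates the orientations. The function name and interface are ours; the paper's implementation is not published here.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.spatial.transform import Rotation, Slerp

def align_lio_to_gnss(t_lio, p_lio, rots_lio, t_gnss):
    """Interpolate LIO poses at GNSS timestamps.
    t_lio: (N,) sorted timestamps; p_lio: (N, 3) positions;
    rots_lio: scipy Rotation containing N orientations;
    t_gnss: timestamps to evaluate (must lie within [t_lio[0], t_lio[-1]])."""
    pos_spline = CubicSpline(t_lio, p_lio, axis=0)  # cubic spline per axis
    slerp = Slerp(t_lio, rots_lio)                  # spherical linear interp
    return pos_spline(t_gnss), slerp(t_gnss)
```

Because the GNSS receiver runs at 1 Hz while LIO runs much faster, interpolation of the denser LIO stream at the sparser GNSS timestamps is the natural direction for constructing the residual in Equation (21).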

#### 4.3.3. Loop Factor

Considering that the driving areas of the moving vehicle may overlap, it is necessary to add a loop detection step to establish possible loop constraints between non-adjacent frames. According to Equation (5), the loop factor can be constructed as follows:

$$r\_L \left( T\_k^W \mathcal{P}\_k, \mathcal{S}\_k \right): \begin{cases} T\_k^W = \operatorname\*{argmin}\_{T\_k^W} \sum\_k \left( N\_k\, e\_k^T \Sigma\_k^{-1} e\_k \right) \\[4pt] e\_k = \dfrac{\sum \mathcal{P}\_k}{N\_k} - T\_k^W \mathcal{S}\_k \\[4pt] \Sigma\_k = \dfrac{\sum \Sigma\_k^P}{N\_k} + T\_k^W \Sigma\_k^S \left( T\_k^W \right)^T \end{cases} \tag{22}$$

Using the optimized point cloud registration method in Section 3.2, we optimize and correct the historical trajectory through the registration between the point cloud of the prior local map and the current global point cloud. This method ensures that the positional estimates converge to the global optimum result.
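For a candidate loop transform, the distribution-to-distribution cost of Equation (22) can be evaluated as below. This is a sketch under our own data layout (each voxel pairs a current-cloud mean/covariance with a prior-map mean/covariance); the full registration additionally iterates over candidate transforms to find the argmin.

```python
import numpy as np

def loop_cost(R, t, voxels):
    """Weighted distribution-to-distribution cost of Eq. (22) for a
    candidate transform T = (R, t).
    voxels: list of (N_k, mu_P, cov_P, mu_S, cov_S), where mu_P/cov_P are
    the mean/covariance of the N_k current points in the voxel and
    mu_S/cov_S those of the matched prior-map points."""
    cost = 0.0
    for N_k, mu_P, cov_P, mu_S, cov_S in voxels:
        e = mu_P - (R @ mu_S + t)              # residual between distributions
        C = cov_P / N_k + R @ cov_S @ R.T      # combined covariance of Eq. (22)
        cost += N_k * e @ np.linalg.solve(C, e)  # Mahalanobis-weighted error
    return cost
```

A loop candidate is accepted when minimizing this cost yields a transform with a sufficiently small residual, after which the historical trajectory is corrected.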

#### **5. Experimental Setup and Results**

#### *5.1. Point Cloud Registration Results*

To verify the superiority of the point cloud registration algorithm used in this paper, we compared the registration results of the ICP algorithm used in traditional LiDAR odometry with our algorithm. The comparison results are shown in Figures 7–9.

**Figure 7.** Point cloud registration results. (**a**) Original source and target point clouds. (**b**) Alignment results using ICP algorithm. (**c**) Alignment results using our algorithm.

**Figure 8.** Detailed view of the point cloud registration results. (**a**) Original source and target point clouds. (**b**) Alignment results using ICP algorithm. (**c**) Alignment results using our algorithm.

**Figure 9.** A circular expansion of the point cloud alignment results. (**a**) Original source and target point clouds. (**b**) Alignment results using ICP algorithm. (**c**) Alignment results using our algorithm.

Compared with typical indoor environments, for mobile carriers in complex urban environments, the rotations and translations between the source and target point clouds during consecutive frames and loop detection may be larger. As shown in Figure 7, when the initial position is unreasonable, the registration results of the ICP algorithm cannot fully approach the global optimal solution, which is detrimental to both the local pose estimation and the loop correction of the mobile carrier. However, the registration accuracy of our algorithm is not affected by large pose transformations of the vehicle platform. Compared with the traditional ICP algorithm, the registration result of our algorithm better suits the needs of the vehicle platform.

Furthermore, in order to rule out chance in the registration results, we conducted comparative experiments on source point clouds with different rotation angles and translation distances, and quantitatively compared the registration accuracy and time consumption of each algorithm. The test results are shown in Tables 2 and 3.


**Table 2.** Registration results of different algorithms for point clouds with different angles.

**Table 3.** Registration results of different algorithms for point clouds with different translation distances.


From the comparison between Tables 2 and 3, it can be seen that point cloud registration algorithms are more sensitive to rotation, which means that, if the vehicle rotates through a large angle in the city, the point cloud registration between consecutive frames may be less reliable. It may even lead to a failure of the pose estimation, as demonstrated in [5]. However, as the rotation angle increases, it can be seen from Table 2 that the registration accuracy of our algorithm decreases the least; at the extreme 90° rotation angle, the accuracy is still more than four times better than that of the other algorithms, with a root mean square error of approximately 2.67116 m. On the other hand, for large translations (5 m) between the source and target point clouds, our algorithm also shows excellent registration accuracy, with a root mean square error of approximately 0.03047 m. It is worth noting that, for single point cloud registration, the registration time of our algorithm is reduced by approximately one order of magnitude compared with the others. To sum up, this is sufficient to verify the superiority of the proposed registration algorithm in terms of compressing time and improving registration accuracy.

#### *5.2. Positioning Accuracy*

In this paper, the absolute trajectory error (ATE) was selected as the evaluation index of SLAM system positioning accuracy so as to directly reflect the difference between the global position estimation of the moving carrier and the ground truth. The absolute trajectory error is calculated as follows.

$$A\_i := g\_i^{-1} S p\_i \tag{23}$$

where *A<sub>i</sub>* is the absolute trajectory error of the SLAM system in the *i*th frame, *g<sub>i</sub>* and *p<sub>i</sub>* are the ground truth and the estimated pose, respectively, and *S* is the transformation matrix between the ground truth and the estimated pose. In this paper, the mean error (ATE\_ME) and root mean square error (ATE\_RMSE) of the absolute trajectory error were selected as the evaluation criteria.
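Equation (23) and the two summary statistics can be computed directly from homogeneous pose matrices. A minimal sketch, assuming 4×4 pose matrices and a precomputed alignment *S* (function name ours):

```python
import numpy as np

def ate_metrics(gt_poses, est_poses, S):
    """ATE per Eq. (23): A_i = g_i^{-1} S p_i, then the mean (ATE_ME) and
    root mean square (ATE_RMSE) of the translational error magnitudes.
    gt_poses, est_poses: lists of 4x4 homogeneous matrices; S: 4x4 alignment."""
    errs = []
    for g, p in zip(gt_poses, est_poses):
        A = np.linalg.inv(g) @ S @ p
        errs.append(np.linalg.norm(A[:3, 3]))  # translational part of the error
    errs = np.asarray(errs)
    return errs.mean(), np.sqrt((errs ** 2).mean())
```

Evaluation tools such as evo compute the same quantities; the sketch makes the relation between Equation (23) and ATE\_ME / ATE\_RMSE explicit.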

#### 5.2.1. Public Dataset

To verify the positioning accuracy of the fusion algorithm in different outdoor environments, the KITTI\_RAW dataset [30], which includes a variety of outdoor scenes, was used to evaluate the localization accuracy of the fusion algorithm and to compare it with other similar advanced algorithms. The experimental data acquisition platform is as follows: LiDAR point cloud data are acquired by a 64-line Velodyne HDL-64 LiDAR, with a horizontal field of view of 360°, a vertical field of view range of (−24.8°, +2°), a horizontal resolution range of (0.08°, 0.35°), a vertical angular resolution of 0.4° and a scanning frequency of 10 Hz, which meets the requirements of in-vehicle point cloud data acquisition. The GPS/IMU integrated system is an OXTS RT3003, with a GPS output frequency of 1 Hz and an IMU output frequency of 100 Hz. The ground truth is provided by a high-precision integrated navigation system.

Four different outdoor scenarios were used to validate the performance of the fusion algorithm, including urban environments, open areas, highways and forest roads. The voxel grid size of the fusion algorithm was set to 0.3 × 0.2 × 0.3, the maximum iteration threshold was set to 30 and the iteration termination tolerance was set to 1 × 10<sup>−8</sup>, so as to meet the real-time requirements and ensure a stable number of feature point clouds participating in the matching in sparse areas of outdoor environments. The comparison of the experimental results is shown in Figures 10 and 11 and Table 4.

**Figure 10.** Comparison of the estimated trajectories. (**a**) Global positioning trajectory. (**b**) Local details of the trajectory. (**c**) Local details of the trajectory.


**Table 4.** Comparison of ATE\_RMSE(m) of each algorithm in KITTI\_RAW dataset.

**Figure 11.** Comparison of the positioning error of each algorithm. (**a**) APE fitting curve. (**b**) The box diagram of APE.

Figure 10 shows the comparison of the positioning results of each algorithm in the 09\_30\_0018 dataset representing the urban environment. As shown in Figure 11b, both LiDAR odometry (LO), represented by A-LOAM, and LIO, represented by LeGO-LOAM, show significant degradation of the position estimation results in the first 50 s and the last 50 s. The reason is that LO and LIO systems rely mainly on LiDAR to extract spatial geometric feature information. Once the LiDAR feature constraints are sparse or fail, the carrier state estimation degrades in that feature direction, and additional constraints need to be added. The first 50 s and last 50 s are both flat, open roads with sparse point cloud features, which are susceptible to degradation of the LiDAR pose optimization results. However, the number of visible GNSS satellites on flat, open roads is sufficient, and using GNSS observations as global constraints can greatly improve the positioning accuracy and robustness in sparse point cloud areas. As can be seen from Figure 11b, the ATE\_RMSE of both LIO-SAM with GNSS global constraints and the present algorithm is stable between (0 m, 2 m), and the positioning accuracy remains stable in the sparse region of the point cloud features in the last 50 s without large data drift. In addition, from the box diagram shown in Figure 11b, it can be seen that the positional outliers estimated by LIO-SAM are reduced by approximately 80% compared with the LIO system. Furthermore, the positional estimation errors of our algorithm are concentrated between (0.68 m, 1.23 m) with very few outliers, which fully demonstrates the superior positioning accuracy of the proposed algorithm in urban environments.

#### 5.2.2. Urban Dataset

To further investigate the extent to which the improvements in the front-end and back-end components of the fusion algorithm improve its positioning accuracy, we conducted ablation experiments in an environment with complex GNSS signals. We constructed a system without GNSS global correction (-), a system without smoothed voxelized point cloud registration and loop correction (#) and the complete system (Proposed). The experimental environment is a complex GNSS reflection area in the urban environment. The experimental platform includes: the ground truth, provided by NovAtel SPAN-CPT positioning results; the LiDAR point cloud, acquired by a Velodyne HDL-32E LiDAR, where the horizontal field of view is 360°, the vertical field of view range is (−30°, +10°) and the scanning frequency is 10 Hz, which is suitable for in-vehicle point cloud data acquisition; the IMU, an Xsens MTi-10, with a pose update frequency of 100 Hz; and the GNSS receiver, a u-blox M8T, with an update frequency of 1 Hz.

Different from the KITTI\_RAW dataset, the GNSS confidence parameter in this experiment is not fixed. After solving the raw observation data collected by the u-blox receiver with the SPP algorithm, we obtained the GNSS confidence covariance as the GNSS factor weight parameter. This is more in line with the real urban environment, where reflected and refracted GNSS signals superimpose on the direct signal, causing the pseudorange and carrier phase observations to deviate from the true values of the direct signal. The experimental results are shown in Figures 12 and 13 and Table 5.
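One common way to feed such a covariance into a least-squares back-end is to whiten the GNSS residual of Equation (21) with the information square root, so that low-confidence (large-covariance) observations contribute less. A sketch under the assumption that the solver consumes whitened residuals *r′* with ‖*r′*‖² = *r*ᵀΣ⁻¹*r*; the function name is ours:

```python
import numpy as np

def weighted_gnss_residual(p_G, p_gnss, cov_gnss):
    """Whiten the GNSS residual r = p_G - p_gnss by the SPP covariance.
    Returns r' such that ||r'||^2 = r^T cov^{-1} r, i.e. the
    Mahalanobis-weighted error used in the graph optimization."""
    r = p_G - p_gnss
    info = np.linalg.inv(cov_gnss)     # information matrix
    L = np.linalg.cholesky(info)       # info = L L^T
    return L.T @ r                     # whitened residual
```

With an isotropic covariance σ²I this simply scales the residual by 1/σ, which matches the intuition that multipath-degraded fixes should be down-weighted rather than discarded outright.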

**Figure 12.** Comparison of the estimated trajectories. (**a**) Global positioning trajectory. (**b**) Local details of the trajectory. (**c**) Local details of the trajectory.



**Figure 13.** Comparison of the positioning error of each algorithm. (**a**) APE fitting curve. (**b**) The box diagram of APE.

As can be seen from Figure 13a, firstly, due to the accurate registration of the front-end point cloud, the absolute trajectory error of Proposed(-) decreases slightly compared to the unimproved system. In the initial parking section of the dataset, the traditional ICP algorithm suffers from over-iteration, and its result is not the global optimal solution. However, the data smoothing and the iteration termination threshold of our algorithm solve this problem well, providing a better initial value for the pose matching. The absolute trajectory error within the first 25 s drops by approximately 8 m compared to LIO-SAM. Secondly, compared to LIO-SAM, which uses GNSS observations directly as global constraints without filtering, we introduced the GNSS confidence into the optimization equation. This allows our algorithm to remain unaffected by poor-quality GNSS observations and to maintain better positioning accuracy in the last 50 s in areas with dense tall buildings and poor point cloud distribution. In contrast, due to poor-quality GNSS observations participating in the optimization (LIO-SAM) and the low accuracy of LiDAR loop detection as a global constraint (A-LOAM and LeGO-LOAM), all the other similar algorithms show a cumulative increase in absolute trajectory error with steeper slopes. This is extremely detrimental for vehicle-mounted platforms driving in realistic large outdoor environments. From the experimental results, it can be seen that the global optimization step in our complete algorithm suppresses the local cumulative drift well and moves the pose estimation result closer to the global optimal solution.

In summary, driven by the combined front-end and back-end improvements, our complete algorithm achieves higher positioning accuracy than other comparable algorithms in real urban environments. Furthermore, because local sensors are inherently immune to signal refraction interference, it compensates for positioning outliers arising from multipath effects in traditional GNSS positioning in urban environments, fully ensuring the integrity and reliability of the fusion system's positioning.

#### *5.3. Time-Consuming Performance*

In this paper, the KITTI 09\_30\_0033 sequence was randomly selected to verify the real-time performance of our algorithm against similar algorithms. The three similar advanced algorithms mentioned above were selected as control algorithms, so as to verify the real-time performance of our algorithm in three stages: downsampling, point cloud registration and optimization. The experimental results are shown in Figure 14.

**Figure 14.** Time-consuming comparison of three processes. (**a**) Point cloud downsampling process. (**b**) Point cloud registration process. (**c**) Position global optimization process.

The time consumption of the point cloud downsampling is shown in Figure 14a. The downsampling process of the three other algorithms uses RANSAC as the core algorithm, but its iterative approximation is slow, at approximately (0.5 ms, 1 ms), and the filtering and fitting quality of the depth information is poor. In contrast, our proposed algorithm uses a HashMap instead of random sampling, which improves the speed of filtering out similar points in voxels and reduces the time consumption by a factor of two to five compared with the traditional downsampling method. Although this process accounts for a relatively small share of the time in typical indoor environments or short-term positioning, for vehicles driving for long periods in large outdoor environments with complex point clouds, the accumulation of tiny time costs leads to a cumulative increase in positioning time. Therefore, the compression of the point cloud downsampling time in this paper is beneficial for ensuring real-time vehicle positioning.
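The HashMap-based voxel downsampling can be sketched in a few lines: each point is hashed by its integer voxel index, and one centroid is kept per voxel. This is an illustrative sketch of the idea (function name ours), not the paper's implementation:

```python
import numpy as np

def voxel_hash_downsample(points, voxel_size):
    """Downsample a point cloud by hashing each point's voxel index into a
    dict (a HashMap) and keeping one centroid per voxel. Avoids the random
    iteration of RANSAC-style sampling; average O(1) insertion per point."""
    voxels = {}
    keys = np.floor(points / voxel_size).astype(np.int64)  # (N, 3) voxel ids
    for key, pt in zip(map(tuple, keys), points):
        voxels.setdefault(key, []).append(pt)
    # one representative (centroid) per occupied voxel
    return np.array([np.mean(pts, axis=0) for pts in voxels.values()])
```

Because every point is visited exactly once, the runtime is deterministic, which is what keeps the downsampling step stable over long outdoor drives.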

The time consumption of the point cloud registration is shown in Figure 14b, which shows that the LIO-SAM and A-LOAM point cloud registration step takes in the range of (50 ms, 200 ms). By extracting and separating the ground point clouds, LeGO-LOAM can, to a certain extent, suppress the extra time caused by outlier registration due to interference between non-identical point cloud clusters; its point cloud registration step takes approximately (50 ms, 100 ms). There are two possible factors for the obvious fluctuation of the above algorithms. The first is fluctuation in the registration time due to the changing distribution of the ambient point cloud. The second is that, according to the experiments in Section 5.1, the rotation angle and displacement between the source and target point clouds also influence the time consumption, which is obvious in the urban driving environment, where the speed and driving direction change irregularly. However, the time consumption of our algorithm is stable between (20 ms, 30 ms). Smoothing the single-to-many distribution of the point cloud sequence greatly reduces the effect of point cloud sparsity on the registration time, ensuring that the point cloud registration step is both time-efficient and stable.

The time consumption of the pose optimization step is shown in Figure 14c. The time taken to match the local map to the global map for LIO-SAM and LeGO-LOAM is around (30 ms, 130 ms). Due to the addition of GNSS sensors and the interference of some low-confidence GNSS observations, the total time consumption of LIO-SAM is even higher than that of LeGO-LOAM. However, A-LOAM takes (20 ms, 25 ms). The main reason is that only the observations of a single sensor (LiDAR) need to be optimized, and the residual blocks are constructed directly in the original pose graph structure, which reduces the computational burden of a multidimensional factor graph. Our algorithm borrows the optimization method of A-LOAM. It can be seen that, although the GNSS sensor is added, the time consumption is still stable at approximately 30 ms. The reason is that the global constraint of GNSS provides the fusion system with a more accurate transformation matrix from the local map to the global map, which makes it easier for the map-matching objective function to converge to the optimal solution. Secondly, we use the Gauss–Newton method instead of the steepest gradient descent method used in [27] to minimize all of the cost functions, reducing the number of iterations needed to converge to the locally optimal estimate, which is one of the main reasons for the decrease in time consumption. The average total time of each algorithm for a single frame is shown in Table 6.
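A generic Gauss-Newton loop of the kind described above solves the normal equations (JᵀJ)Δx = −Jᵀr at each step instead of taking steepest-descent steps, and stops early via an iteration termination threshold. A self-contained sketch on a toy residual (the paper's solver operates on the full pose-graph residuals; names here are ours):

```python
import numpy as np

def gauss_newton(residual, jac, x0, iters=10, tol=1e-10):
    """Minimize ||residual(x)||^2 by Gauss-Newton.
    residual: x -> r (m-vector); jac: x -> J (m x n Jacobian)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r, J = residual(x), jac(x)
        dx = np.linalg.solve(J.T @ J, -J.T @ r)  # normal equations step
        x = x + dx
        if np.linalg.norm(dx) < tol:             # iteration termination threshold
            break
    return x
```

Compared with steepest descent, each step uses local curvature (JᵀJ), so far fewer iterations are needed near the optimum, which is the cost reduction cited above.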


**Table 6.** Comparison of average total time consumption of each algorithm in single frame.

In summary, thanks to the dual improvements at the front-end and back-end of our system, the time consumption of all three steps is compressed. Although the GNSS sensor adds new optimization variables, our algorithm still has better real-time performance than other similar algorithms.

#### *5.4. Mapping Results in the Real Urban Environment*

As shown in Figure 15, this section compares the mapping results of our algorithm with similar advanced algorithms. Figure 15a shows the ground truth in the real outdoor environment, obtained by NovAtel SPAN-CPT. The vehicle travels one loop from the starting point in the lower right corner and then returns, so the trajectory is almost closed. Figure 15b shows the mapping result of LIO-SAM; it is clear that the section near the end of the journey deviates significantly from the actual path travelled. The reason can be attributed to the fast displacement of the carrier, which increases the difference, including rotation and displacement, between the point cloud clusters captured in consecutive frames, making the point cloud registration in LIO-SAM's loop detection prone to failure and the loop constraint results unsatisfactory. In addition, the GNSS constraint strategy adopted by LIO-SAM has a poor global correction effect on the local sensors. Compared with LIO-SAM, our algorithm has obviously better loop detection accuracy and GNSS global constraint effectiveness.


**Figure 15.** Comparison of mapping effects of various algorithms. (**a**) The ground truth. (**b**) The mapping results of LIO-SAM. (**c**) The mapping results of the proposed algorithm.

As shown in Figure 15c, the mapping trajectory of the proposed algorithm basically fits the ground truth of the driving trajectory, and, in the loop road section in the lower right corner, the two trajectories passing the same landmark are basically coincident. The reasons are as follows: firstly, smoothing the point cloud clusters improves the fault tolerance of the point cloud registration between consecutive frames; secondly, the weighted GNSS global constraint eliminates GNSS measurements with large gross observation errors, thus achieving superior mapping results.

#### **6. Discussion**

SLAM-based multi-sensor fusion positioning technology expands the application field of traditional GNSS-based mapping techniques and makes continuous, reliable positioning a reality in complex urban environments with good, intermittent or denied GNSS signals. The superior sensing capability of LIO for natural features has been demonstrated in the literature [5,6] and elsewhere. However, the accuracy limitations of point cloud registration and the local pose drift of LIO limit its application in large outdoor environments to some extent. This paper focuses on these two technical bottlenecks of existing LiDAR SLAM and proposes simple but effective improvements.

First of all, the primary technical bottleneck is how to improve the accuracy of point cloud registration. Koide et al. [20] demonstrated that smoothing the point cloud clusters can improve the fault tolerance of point cloud registration, improving the accuracy from 1.20 m to 0.89 m. However, our practical tests show that, in long, large turning or translating sections of real environments, if the number of iterations is not limited, the registration may still fall into a local optimum. Therefore, on the basis of a registration equation based on smooth voxelization filtering, we further use an iteration termination condition as a secondary constraint, so as to reduce the over-fitting of local point clouds in typical environments. The registration experiments in Section 5.1 and the pose estimation accuracy experiments in Section 5.2 show that our registration method has higher accuracy than the conventional methods. As can be seen from Figure 11, accurate point cloud registration leads to an accurate initial pose estimate, which has a considerable positive effect on the pose estimation of a vehicle platform with a complex moving state.

Secondly, there is the challenge of how to effectively use GNSS measurements and the loop detection mechanism to converge the local positioning results of LIO to a globally optimal solution. The main step that loop detection depends on is point cloud registration, whose superiority in our method was analyzed in the previous paragraph. Thanks to the positive point cloud registration results, we can reason that our global constraint using loop detection is superior, as demonstrated in Section 5.2.2. Some related research on GNSS and LIO fusion positioning has been carried out, and optimization-based methods have been applied in [31,32] and elsewhere. However, taking [27] as an example, in the traditional graph optimization model, the covariance of a measurement is only used to decide whether to add a factor or not, which is a crude screening of the measurements, especially for GNSS, a measurement signal greatly influenced by environmental reflections, whose observation information is therefore not fully exploited. Therefore, in this paper, on the basis of a rough screening of GNSS observations, the covariance of the measurements is added as a weight to the graph optimization model. This achieves a more adequate and accurate global constraint on the LIO local poses using GNSS observations. From the ablation experiment results in Section 5.2.2, it is evident that the weighted GNSS residual added in this paper achieves a satisfactory positioning accuracy. From the above results, we can draw a reasonable inference: even when entering an indoor parking lot or tunnel, the accumulated error of LIO in this algorithm will start to accumulate from a lower initial drift. In contrast, LIO algorithms without GNSS constraints already have a large deviation between the local and global maps before entering the denied environment. Therefore, the cumulative error range of this algorithm is acceptable. Once the GNSS signal is restored, the local error is corrected within the time alignment interval of 0.1 s set in Section 4.3.2. This provides accurate and continuous pose estimation for the in-vehicle platform while travelling in complex urban environments.

#### **7. Conclusions**

In this paper, a LiDAR-IMU-GNSS fusion positioning algorithm with accurate local alignment and weak global drift is proposed for the high-precision continuous positioning of mobile carriers in complex urban environments.

Firstly, a voxelized point cloud downsampling method based on curvature segmentation is proposed. Rough classification is carried out by a curvature threshold, and the voxelized point cloud downsampling is performed using HashMap instead of RANSAC, so that the spatial feature distribution attributes of the source point cloud, including texture feature information, such as surfaces and curves, are retained to a greater extent.

Secondly, a point cloud registration model based on the nearest neighbors of the point and neighborhood point sets is constructed. Furthermore, an iterative termination threshold is set to reduce the probability of the local optimal solution. This greatly improves the real-time performance of the point cloud registration and can also play a large role in aligning the point cloud between the front and back frames of a fast-moving carrier with large displacement.

Finally, we propose a LIO-GNSS fusion positioning model based on graph optimization that uses confidence-weighted GNSS observations to globally correct local drift. In addition, a loop detection mechanism using the above point cloud registration algorithm is added to the fusion system, providing further global constraints in driving areas with prior maps. Experimental results show that our algorithm achieves more continuous and accurate pose estimation and map reconstruction in complex urban environments than similar state-of-the-art algorithms.

In future work, several issues still deserve further exploration. Firstly, we plan to build a deeper constraint relationship between LIO and GNSS, making use of the rich planar features perceived by LiDAR in urban environments to compensate for GNSS occlusion or multipath areas in the direction of the constraint; this will reduce the probability of unreliable SPP positioning results in urban environments. Secondly, we will consider environments with multi-sensor failures, such as tunnels with high environmental texture similarity, where GNSS rejection and point cloud degradation coexist. We plan a more accurate detection of the degradation direction using the degeneracy factor (DF) algorithm proposed in [5], followed by a non-linear optimization correction of the pose in that direction. Finally, the perception of 3D environmental features using only LiDAR, IMU and GNSS sensors is still relatively limited. We plan to use the observation residuals of other sensors, such as cameras and wheel odometers, to add multi-dimensional feature constraints to the fusion positioning algorithm, so as to make full use of environmental features and achieve accurate, real-time navigation and positioning with high environmental universality.

**Author Contributions:** Conceptualization, X.L.; methodology, X.H.; software, X.H.; validation, S.P.; formal analysis, X.H.; investigation, X.L.; resources, W.G.; data curation, X.H.; writing—original draft preparation, X.L.; writing—review and editing, X.H.; visualization, X.H.; supervision, S.P. and W.G.; project administration, W.G.; funding acquisition, S.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research study was funded by the National Key Research and Development Program of China (2021YFB3900804) and the Research Fund of the Ministry of Education of China and China Mobile (MCM20200J01).

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Appendix A**

#### **Mathematical Derivation of Equations (14) and (15)**

First, given the original accelerometer and gyroscope measurement values of IMU as follows:

$$\begin{aligned} \hat{a}_t &= a_t + b_{a_t} + R_{w}^{t} g_w + n_a \\ \hat{\omega}_t &= \omega_t + b_{\omega_t} + n_\omega \end{aligned} \tag{A1}$$

where $n_a \sim N\left(0, \sigma_a^2\right)$ and $n_\omega \sim N\left(0, \sigma_\omega^2\right)$ are the white Gaussian noises of the accelerometer and gyroscope, respectively. The following mathematical derivations are all completed in the IMU body coordinate system.

Therefore, the position, rotation and velocity between the *i*th IMU frame and the (*i* + 1)th IMU frame can be obtained:

$$\begin{aligned} p_{i+1} &= p_i + v_i \Delta t_i + \iint_{t \in [t_i, t_{i+1}]} \left( R_{b_t}^{w} \left( \hat{a}_t - b_{a_t} - n_a \right) - g_w \right) dt^2 \\ q_{i+1} &= q_i \otimes \operatorname{Exp}\left( \left( \hat{\omega}_t - b_{\omega_t} - n_\omega \right) \Delta t_i \right) \\ v_{i+1} &= v_i + \int_{t \in [t_i, t_{i+1}]} \left( R_{b_t}^{w} \left( \hat{a}_t - b_{a_t} - n_a \right) - g_w \right) dt \end{aligned} \tag{A2}$$

To avoid repeated calculation of IMU parameters during pose estimation, pre-integration is introduced to simplify calculation, namely:

$$\begin{cases} p_{i+1} = p_i + v_i \Delta t_i - \frac{1}{2} g_w \Delta t_i^2 + R_b^w \alpha_{i+1}^i \\ q_{i+1} = q_i \otimes \theta_{i+1}^i \\ v_{i+1} = v_i - g_w \Delta t_i + R_b^w \beta_{i+1}^i \end{cases} \tag{A3}$$

where $\left[ \alpha_{i+1}^i, \theta_{i+1}^i, \beta_{i+1}^i \right]^T$ is the IMU pre-integration value. It can be inferred from [28] that the IMU pre-integration value is related only to the IMU biases at different times. Since the change of the IMU bias is very small, we assume that the change of the pre-integration is linear in the IMU bias, and then $\left[ \alpha_{i+1}^i, \theta_{i+1}^i, \beta_{i+1}^i \right]^T$ after each pose estimation can be written as:

$$\begin{cases} \alpha_{i+1}^i = \hat{\alpha}_{i+1}^i + \frac{\partial \alpha_{i+1}^i}{\partial b_a} \delta b_a + \frac{\partial \alpha_{i+1}^i}{\partial b_\omega} \delta b_\omega \\ \theta_{i+1}^i = \hat{\theta}_{i+1}^i \otimes \begin{bmatrix} 1 \\ \frac{1}{2} \frac{\partial \theta_{i+1}^i}{\partial b_\omega} \delta b_\omega \end{bmatrix} \\ \beta_{i+1}^i = \hat{\beta}_{i+1}^i + \frac{\partial \beta_{i+1}^i}{\partial b_a} \delta b_a + \frac{\partial \beta_{i+1}^i}{\partial b_\omega} \delta b_\omega \end{cases} \tag{A4}$$

Equation (A4) is the pre-integration form in continuous time between two IMU frames, whereas the actual IMU pre-integration is incremental in discrete time. Therefore, mid-point integration is used for discretization, and the matrix form of the discrete IMU error-state transfer equation is obtained:

$$\begin{bmatrix} \delta \alpha_{i+1} \\ \delta \theta_{i+1} \\ \delta \beta_{i+1} \\ \delta b_{a\,i+1} \\ \delta b_{\omega\,i+1} \end{bmatrix} = F_i \begin{bmatrix} \delta \alpha_i \\ \delta \theta_i \\ \delta \beta_i \\ \delta b_{a\,i} \\ \delta b_{\omega\,i} \end{bmatrix} + V_i \begin{bmatrix} n_{a_i} \\ n_{\omega_i} \\ n_{a_{i+1}} \\ n_{\omega_{i+1}} \\ n_{b_a} \\ n_{b_\omega} \end{bmatrix} \tag{A5}$$

where *Fi* and *Vi* are matrix abbreviations, with specific values as follows:

$$F_i = \begin{bmatrix} I & f_{01} & I\delta t & f_{03} & f_{04} \\ 0 & f_{11} & 0 & 0 & -I\delta t \\ 0 & f_{21} & I & f_{23} & f_{24} \\ 0 & 0 & 0 & I & 0 \\ 0 & 0 & 0 & 0 & I \end{bmatrix} \tag{A6}$$

$$\begin{cases} f_{01} = \frac{\delta t}{2} f_{21} = -\frac{1}{4} q_i \left[ \hat{a}_i - b_{ai} \right]_\times \delta t^2 - \frac{1}{4} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \left( I - \left[ \frac{\hat{\omega}_i + \hat{\omega}_{i+1}}{2} - b_{\omega i} \right]_\times \delta t \right) \delta t^2 \\ f_{03} = -\frac{1}{4} \left( q_i + q_{i+1} \right) \delta t^2 \\ f_{04} = \frac{\delta t}{2} f_{24} = \frac{1}{4} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \delta t^3 \\ f_{11} = I - \left[ \frac{\hat{\omega}_i + \hat{\omega}_{i+1}}{2} - b_{\omega i} \right]_\times \delta t \\ f_{21} = -\frac{1}{2} q_i \left[ \hat{a}_i - b_{ai} \right]_\times \delta t - \frac{1}{2} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \left( I - \left[ \frac{\hat{\omega}_i + \hat{\omega}_{i+1}}{2} - b_{\omega i} \right]_\times \delta t \right) \delta t \\ f_{23} = -\frac{1}{2} \left( q_i + q_{i+1} \right) \delta t \\ f_{24} = \frac{1}{2} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \delta t^2 \end{cases} \tag{A7}$$

$$V_i = \begin{bmatrix} v_{00} & v_{01} & v_{02} & v_{03} & 0 & 0 \\ 0 & \frac{\delta t}{2} & 0 & \frac{\delta t}{2} & 0 & 0 \\ \frac{q_i \delta t}{2} & v_{21} & \frac{q_{i+1} \delta t}{2} & v_{23} & 0 & 0 \\ 0 & 0 & 0 & 0 & \delta t & 0 \\ 0 & 0 & 0 & 0 & 0 & \delta t \end{bmatrix} \tag{A8}$$

$$\begin{cases} v_{00} = -\frac{1}{4} q_i \delta t^2 \\ v_{01} = v_{03} = \frac{\delta t}{2} v_{21} = \frac{1}{4} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \delta t^2 \cdot \frac{\delta t}{2} \\ v_{02} = -\frac{1}{4} q_{i+1} \delta t^2 \\ v_{21} = v_{23} = \frac{1}{4} q_{i+1} \left[ \hat{a}_{i+1} - b_{ai} \right]_\times \delta t^2 \end{cases} \tag{A9}$$

Let $\delta z_i^{15 \times 1} = \left[ \delta \alpha_i, \delta \theta_i, \delta \beta_i, \delta b_{ai}, \delta b_{\omega i} \right]^T$ and $\delta z_{i+1}^{15 \times 1} = \left[ \delta \alpha_{i+1}, \delta \theta_{i+1}, \delta \beta_{i+1}, \delta b_{a\,i+1}, \delta b_{\omega\,i+1} \right]^T$ be the error state vectors of the $i$th and $(i+1)$th frames, respectively, and let $n^{18 \times 1} = \left[ n_{a_i}, n_{\omega_i}, n_{a_{i+1}}, n_{\omega_{i+1}}, n_{b_a}, n_{b_\omega} \right]^T$ be the noise vector; then, Equation (A5) can be written as:

$$
\delta z\_{i+1}^{15 \times 1} = F^{15 \times 15} \delta z\_i^{15 \times 1} + V^{15 \times 18} n^{18 \times 1} \tag{A10}
$$

where the initial Jacobian value is *Ji* = *I* and the Jacobian iteration formula in the process of nonlinear optimization is:

$$J\_{i+1}^{15 \times 15} = F^{15 \times 15} J\_i^{15 \times 15} \tag{A11}$$

The iterative formula of the covariance of pre-integration in the nonlinear optimization process is:

$$\Sigma_{i+1}^{15 \times 15} = F\, \Sigma_i^{15 \times 15}\, F^T + V \eta_i V^T \tag{A12}$$

After the above pre-integration derivation, Equation (14) gives the variation in position, rotation, velocity and IMU bias between two frames.
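The recursions in Equations (A11) and (A12) can be sketched as one discrete propagation step with plain list-based matrices. Toy dimensions are used below (the paper's $F$ is $15 \times 15$ and $V$ is $15 \times 18$), and the helper names are illustrative; the recursion itself is identical at full size:

```python
def matmul(A, B):
    """Plain-list matrix product (kept dependency-free for the sketch)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def propagate(F, V, Q, J, Sigma):
    """One step of Equations (A11) and (A12):
    J_{i+1} = F J_i,   Sigma_{i+1} = F Sigma_i F^T + V Q V^T
    where Q stands for the noise covariance of the stacked noise vector.
    """
    J_next = matmul(F, J)
    FSFt = matmul(matmul(F, Sigma), transpose(F))
    VQVt = matmul(matmul(V, Q), transpose(V))
    Sigma_next = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(FSFt, VQVt)]
    return J_next, Sigma_next
```

Starting from $J_i = I$ and $\Sigma_i = 0$, repeated calls accumulate the Jacobian used for the bias correction in Equation (A4) and the covariance used to weight the pre-integration factor.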
