*4.4. Lack of Quality Control*

The quantitative evaluation of SLAM algorithms is another important challenge. SLAM algorithms can be evaluated against several criteria, such as accuracy, scalability, availability, recovery (the ability to re-localize the vehicle within a large-scale map), and updatability. Quantitative analysis of the performance of SLAM algorithms is essential since it provides a numerical evaluation and a basis for comparing different SLAM algorithms.

Estimation accuracy is a widely used quality metric, but it can be difficult to assess in practice for autonomous driving. Most approaches evaluate the performance of SLAM algorithms by comparing the results against ground truth, for example an accurate map. However, a suitable ground truth map is seldom available. Sometimes the estimated map is evaluated by overlaying it onto a floor plan and searching for differences [238], which is harder for outdoor applications and requires human operator intervention [239]. Two popular accuracy metrics, the relative pose error (RPE) and the absolute trajectory error (ATE), were proposed by Sturm et al. [240]; they evaluate a visual SLAM system by comparing the estimated camera motion against the true trajectory, instead of performing a complex map comparison. The RPE measures the local accuracy of the trajectory over a fixed time interval, while the ATE compares the absolute distances between the estimated and the ground truth trajectories and thus measures global consistency. The two trajectories should first be aligned using the Horn method [240]. According to [240], the RPE considers both translational and rotational errors, while the ATE considers only the translational error. These metrics have been widely used by the SLAM community for evaluating and comparing different SLAM approaches. However, as with maps, the precise location of the vehicle trajectory on the actual road surface may not always be available. In [239], the authors proposed a framework for analyzing the accuracy of SLAM by measuring the error of the corrected trajectory. The Root Mean Square Error (RMSE) of the vehicle poses is normally used to indicate the accuracy of the SLAM trajectory estimation. Another widely used quality analysis method is the Chi-squared (χ²) test. According to [241], the χ² test is a statistical test that quantifies the quality of the provided covariance matrices for landmark measurements and odometry error.
When the minimum χ² error is approximately equal to the difference between the dimension of the measurement vector and the size of the state vector, the estimate is considered to be of good quality [241].
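As a concrete illustration, the ATE and RPE described above can be computed for a pair of trajectories in a few lines. The sketch below is a simplified, hypothetical implementation (not the evaluation code of [240]): it uses an SVD-based rigid alignment as a common stand-in for the Horn method, and considers only the translational error components.

```python
import numpy as np

def align_rigid(est, gt):
    """Rigid alignment (rotation + translation) of the estimated trajectory
    to the ground truth via SVD (Kabsch), a stand-in for the Horn method."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_g - R @ mu_e
    return est @ R.T + t

def ate_rmse(est, gt):
    """Absolute trajectory error: RMSE of point-wise distances after alignment."""
    aligned = align_rigid(est, gt)
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))

def rpe_rmse(est, gt, delta=1):
    """Relative pose error (translational part) over a fixed frame interval."""
    d_est = est[delta:] - est[:-delta]
    d_gt = gt[delta:] - gt[:-delta]
    return np.sqrt(np.mean(np.sum((d_est - d_gt) ** 2, axis=1)))
```

Note that the ATE of a trajectory that differs from the ground truth by a pure rigid transform is zero after alignment, which is why the ATE captures global drift rather than the choice of reference frame.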

Some researchers [242–244] have considered the consistency of their SLAM algorithms. According to [242], the major causes of SLAM inconsistency are the accumulated error caused by an incorrect odometry model and the inaccurate linearization of the nonlinear SLAM functions. When the estimation error exceeds the estimated uncertainty, the estimation results can be considered inconsistent. EKF-SLAM suffers from this inconsistency problem unless the Jacobians of the observation/odometry functions are evaluated at the true system state. In [30] and [245], the consistency of the FastSLAM and EKF-SLAM algorithms was quantitatively assessed using the normalized estimation error squared (NEES) indicator. In [246], the observability properties of the filter's error-state model were analyzed to investigate the fundamental causes of EKF-SLAM inconsistency. In [247], the consistency of an incremental graph SLAM was checked by applying a χ² test to the weighted sum of measurement residuals. Whether inconsistency can be tolerated ultimately depends on the application of the SLAM results [19].
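The NEES check mentioned above is straightforward to implement. The following is a minimal sketch (the function names are ours, not from the cited works): for a consistent estimator, the NEES averaged over many Monte Carlo runs should be close to the state dimension, against which a χ² bound can then be tested.

```python
import numpy as np

def nees(err, P):
    """Normalized estimation error squared for one epoch: e^T P^{-1} e,
    where e is the estimation error and P the filter's reported covariance."""
    return float(err @ np.linalg.solve(P, err))

def average_nees(errors, covariances):
    """Average NEES over a set of Monte Carlo runs/epochs. For a consistent
    estimator it should be close to the state dimension; values well above
    the dimension indicate an over-confident (inconsistent) filter."""
    return float(np.mean([nees(e, P) for e, P in zip(errors, covariances)]))
```

As a sanity check, errors drawn from a zero-mean Gaussian with exactly the reported covariance yield an average NEES near the state dimension (here 3).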

The reliability of the output of the localization, mapping, and navigation system should also be checked. However, few studies have addressed the quantitative analysis of the reliability of SLAM. Reliability studies for other localization systems (such as GNSS and GNSS/INS) can be used as a reference to guide the SLAM community. System reliability can be considered as having two components: internal reliability and external reliability. The former describes the ability of the system to detect faults, quantified by the Minimal Detectable Bias (MDB), the lower bound for detectable faults. The latter estimates the influence of undetected faults on the final solution [175,248–251]. The lower the MDB value, the more reliable the system. The reliability of the SLAM feature observation model and the vehicle motion model can also be evaluated with these approaches.
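To make the MDB concept concrete, a minimal sketch for a Gauss-Markov model with uncorrelated observations is given below. It assumes Baarda's formulation, MDB_i = σ_i √λ₀ / √r_i, where r_i is the local redundancy number and √λ₀ ≈ 4.13 corresponds to a significance level of 0.1% and a test power of 80%; the function names are illustrative only.

```python
import numpy as np

def redundancy_numbers(A, sigma):
    """Local redundancy numbers r_i of a Gauss-Markov model with design
    matrix A and (uncorrelated) observation standard deviations sigma."""
    W = np.diag(1.0 / sigma ** 2)                   # weight matrix
    N_inv = np.linalg.inv(A.T @ W @ A)
    R = np.eye(len(sigma)) - A @ N_inv @ A.T @ W    # redundancy matrix
    return np.diag(R)

def mdb(A, sigma, sqrt_lambda0=4.13):
    """Minimal Detectable Bias per observation (Baarda):
    MDB_i = sigma_i * sqrt(lambda0) / sqrt(r_i)."""
    r = redundancy_numbers(A, sigma)
    return sigma * sqrt_lambda0 / np.sqrt(r)
```

For example, four equally weighted direct observations of one unknown each have redundancy number 0.75 (the redundancy numbers sum to n − u = 3), giving an MDB of about 4.8σ per observation; adding observations lowers the MDB and thus improves internal reliability.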

Integrity is very important, as it indicates the "trustworthiness" of the information supplied by the localization system, and can provide timely warning of the risks caused by inaccuracy [252]. Integrity measures are used to quantify the requirements for localization safety. The concept was first established in aviation and is also applicable to land vehicle localization [253]. Due to the strict safety requirements of autonomous driving, integrity is receiving increasing attention from autonomous driving researchers. The localization and navigation of a self-driving car rely on multiple sensors; therefore, the traditional integrity analysis methods for GNSS need to be extended. Fault detection and isolation (FDI) is one of the most popular alert generation approaches for GNSS-based localization [229,254–256].

#### **5. Lidar/GNSS/INS Based Mapping and Localization: A Case Study**

The Lidar-based Simultaneous Localization and Mapping (SLAM) approach is widely studied and used in the robotics field because Lidar can generate a very dense 3D point cloud at a fast sensing rate and with high accuracy. Normally, a SLAM system experiences estimation error that grows with the travel distance, and thus needs "loop closure" to correct the errors. However, loop closure is hard to achieve in some large-scale outdoor applications of autonomous driving, such as driving on a highway or along a complex trajectory in an urban area. Furthermore, Lidar-only SLAM provides only relative localization information. Therefore, combining GNSS/INS with Lidar SLAM effectively reduces the dependence on loop closures and provides absolute positioning information.

Furthermore, a Lidar system can also support localization using existing HD maps when GNSS signals are not available. A SLAM procedure that combines Lidar, GNSS, and INS is tested here. The procedure contains two parts: an offline Lidar/GNSS/INS-based mapping part, and an online Lidar/HD map-based localization and mapping part.

#### *5.1. Experiment Setup*

Land vehicle tests were conducted in urban areas of Sydney, Australia, to test the proposed Lidar/GNSS/INS multi-sensor system. The vehicle was equipped with a VLP-16 Lidar sensor, a tactical-grade IMU, and two GNSS antennas from PolyExplore, Inc., San Jose, CA, USA (Figure 5). The second antenna provides a dual-antenna-aided heading update for the online localization system. The sampling rate of the Lidar was 10 Hz, that of the GNSS was 1 Hz, and that of the IMU was 100 Hz.

**Figure 5.** Experimental platform: (**a**) The multi-sensor system, (**b**) side view of the system installed within a vehicle.

The trajectory of the road test is shown in Figure 6a. The vehicle was driven from the campus of the University of New South Wales (UNSW) in Kensington to La Perouse (Section A), and then back to UNSW (Section B). In this study, the forward journey (from UNSW to La Perouse) was used to produce a high-precision 3D point cloud map of the road, and the backward journey (from La Perouse to UNSW) was used to test the performance of the Lidar/3D point cloud map-based localization method.

**Figure 6.** (**a**) The road test trajectory (in blue) on Google Maps; (**b**) GNSS/INS localization of the whole trajectory with RTK positioning status in a local coordinate system.

In order to conduct a quantitative analysis of the localization performance, three sections of the trajectory were selected (Figure 6b). For each of the selected sections on the driving trajectory, the GNSS-RTK status for the forward journey and backward journey was "integer-ambiguity fixed". Hence, the offline mapping results are expected to be accurate at about the 5 cm level. For the backward journey (from La Perouse to UNSW), the selected sections will have accurate GNSS/INS positioning results as a reference to evaluate the performance of the Lidar/3D point cloud map-based localization method.

#### *5.2. Lidar/GNSS/INS Mapping*

The acquired dataset of the forward journey (from UNSW to La Perouse) was used to generate a georeferenced point cloud map of the road environment. The georeferenced map was generated using Lidar odometry frame-to-frame matching and GNSS/INS positioning/attitude. Figure 7 shows an overview of the offline mapping system architecture.

**Figure 7.** Overview of the Lidar/GNSS/INS mapping system architecture.

The GNSS/INS system provides geodetic positioning and attitude information. Since the map generation was performed offline, an optimal GNSS/INS trajectory can be obtained. The GNSS/INS-derived position and attitude results were used as initial values for the frame-by-frame matching, to transfer each newly merged point cloud to the reference frame. In this way, the point cloud is georeferenced. When GNSS results are unavailable, the inertial navigation 6-DOF pose can be used to generate the initial transformation until the GNSS signals are reacquired. When conducting Lidar odometry, each current frame was matched to the previous frame with the Normal Distributions Transform (NDT) scan matching algorithm, with the initial transformation provided by GNSS/INS. The point clouds were first pre-processed to remove the ground plane points (Figure 8) before NDT matching, to improve the registration accuracy.
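To illustrate how the GNSS/INS-derived pose seeds the frame-to-frame matching, the sketch below uses a minimal 2D point-to-point ICP as a simple stand-in for NDT (the actual system uses NDT, which models the target cloud with local normal distributions rather than point correspondences). The argument `T_init` plays the role of the GNSS/INS-provided initial transformation; everything else is a hypothetical simplification.

```python
import numpy as np

def icp_2d(src, dst, T_init=np.eye(3), iters=20):
    """Minimal point-to-point ICP with 2D homogeneous transforms, as a
    stand-in for NDT scan matching. T_init is the initial transform,
    e.g. derived from the GNSS/INS pose prediction."""
    T = T_init.copy()
    for _ in range(iters):
        # apply the current transform to the source points
        src_h = np.column_stack([src, np.ones(len(src))]) @ T.T
        p = src_h[:, :2]
        # nearest-neighbour correspondences (brute force for clarity)
        d2 = ((p[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        q = dst[d2.argmin(axis=1)]
        # closed-form rigid update (Kabsch/Horn in 2D)
        mu_p, mu_q = p.mean(0), q.mean(0)
        H = (p - mu_p).T @ (q - mu_q)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:      # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_q - R @ mu_p
        dT = np.eye(3)
        dT[:2, :2], dT[:2, 2] = R, t
        T = dT @ T                    # accumulate the incremental update
    return T
```

A good `T_init` matters for the same reason as in NDT: it keeps the first set of correspondences (or voxel lookups) close to correct, so the iteration converges to the right local minimum.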

Figure 9 shows two scan views before matching. The two scan views show slight differences in their features. The matched point cloud generated from the two scan views is shown in Figure 10.

By conducting Lidar odometry sequentially with all the available Lidar scans, the newly matched point cloud can be merged with the previously generated point cloud maps, and the accumulated map of the whole trajectory can be obtained and georeferenced, as shown in Figure 11.

By enlarging Figure 11, details of the road map can be seen, and its corresponding real-world road view can be compared with Google Earth images (since this map is georeferenced). Figure 12 shows a comparison of one zoomed-in section of the generated map and the corresponding view in Google Earth.

**Figure 8.** Scan view of a Lidar scan frame (**a**) the original scan view; (**b**) the view after pre-processing.

**Figure 9.** Scan views of two sequenced Lidar scan frames ((**a**) previous scan frame; (**b**) current scan frame) for scan matching.

**Figure 10.** Generated map point cloud after matching two sequenced Lidar scan frames.

**Figure 11.** Global georeferenced road map from UNSW to La Perouse (frame: ECEF, unit: meter) from the 3D point cloud-based map.

**Figure 12.** A section of (**a**) generated map, and (**b**) the Google Earth view for the same location.

This generated map shows a good structure of the road environment, including the road edge, buildings, trees, and parked vehicles along the road.

Three control points with known coordinates were placed around the UNSW Scientia Lawn. These control points can be used to evaluate the accuracy of the generated point cloud. By comparing the coordinates of the control points identified within the map to their known positions, the differences along the X, Y, and Z axes were found to be around 2–8 cm. Therefore, the accuracy of the offline-generated map is considered to be about 5 cm.

#### *5.3. Localization by Matching Lidar Scans with the Georeferenced 3D Point Cloud Map*

The georeferenced 3D point cloud map produced from the data of Section A (the forward journey from UNSW to La Perouse) can then be used to support Lidar-based localization for Section B (the backward journey from La Perouse to UNSW) by matching the Lidar scans to the map. The procedure of the online Lidar/3D map matching-based localization method is shown in Figure 13. An INS is used to support the Lidar/3D map-based online localization. In order to compare the performance of different fusion levels, two fusion methods were investigated. The first method simply utilizes the IMU as an assisting sensor, directly using the INS solution as initial information for scan/map matching. The second is a tightly coupled method that not only uses the INS solution to support matching, but also contains an EKF-based error-state update step that enhances the inertial navigation performance.

**Figure 13.** Overview of the proposed Lidar/3D map matching based localization system architecture. (**a**) Method 1: fusing IMU as an assistant sensor; (**b**) Method 2: fusing IMU using the EKF-based tightly coupled method.

Method 2 consists of two parts: scan matching and EKF fusion. Firstly, if inertial navigation information is not available, frame-to-frame Lidar odometry can be used to support localization. After the error-state EKF has been initialized, the estimated pose from the inertial navigation provides a rough pose for the current Lidar scan frame, and the Lidar odometry can be switched off to reduce the computational load. With the rough position provided by the INS, a local map is searched and selected from the pre-generated global map to improve the matching efficiency. NDT-based scan matching between the current Lidar frame and the local map is undertaken with the inertial-based initial transformation matrix, and the Lidar-estimated vehicle pose is obtained. A new real-time road map can also be generated if needed.
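The local map selection step described above can be sketched as a simple radius crop around the INS-predicted position. This is a hypothetical minimal version; a production system would typically use a voxel grid or k-d tree index rather than a brute-force distance test:

```python
import numpy as np

def select_local_map(global_map, ins_position, radius=50.0):
    """Select the sub-cloud of the pre-generated global map that lies within
    `radius` metres (horizontal distance) of the INS-predicted position, so
    that scan/map matching only runs against nearby points."""
    d = np.linalg.norm(global_map[:, :2] - ins_position[:2], axis=1)
    return global_map[d <= radius]
```

Restricting the matching to this local sub-cloud bounds the per-epoch cost regardless of the total map size, which is what makes the online matching tractable.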

After obtaining the Lidar pose, the difference between the Lidar pose and the inertially propagated pose can be computed; the error in the inertial navigation information is estimated by the error-state EKF and then fed back to the inertial system to improve the pose results and the bias estimation. When GNSS information is available, such as RTK position results, it can also be used to correct the inertial navigation information to improve the accuracy and reliability of the localization system.
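A minimal sketch of this error-state update is given below, under simplifying assumptions that are ours, not the paper's: the error state is reduced to a position error plus an IMU bias error, and the Lidar/map match is treated as a direct position observation.

```python
import numpy as np

def error_state_update(x_ins, P, z_lidar, R_lidar):
    """One error-state EKF measurement update. The innovation is the
    difference between the Lidar/map position and the INS-propagated
    position; the estimated error is fed back to correct the INS solution.
    Error state (simplified): [position error (3), IMU bias error (3)]."""
    H = np.hstack([np.eye(3), np.zeros((3, 3))])   # observe position error only
    S = H @ P @ H.T + R_lidar                      # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    innov = z_lidar - x_ins[:3]                    # Lidar pose minus INS pose
    dx = K @ innov                                 # estimated error state
    P_new = (np.eye(6) - K @ H) @ P
    x_corr = x_ins.copy()
    x_corr[:3] += dx[:3]                           # feed position error back
    bias_update = dx[3:]                           # feed back to the bias estimate
    return x_corr, bias_update, P_new
```

The bias component of `dx` is what distinguishes the tightly coupled Method 2 from Method 1: in Method 1 the INS only seeds the matching, whereas here the Lidar innovation continually corrects the inertial error sources themselves.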

For data fusion of Lidar, INS, and GNSS, some recent work has proposed graph optimization-based methods to generate optimal localization and mapping solutions [179,257–259]. However, some of these methods are post-processed, are highly dependent on GNSS data to mitigate navigation drift, or even ignore the IMU bias. For our online Lidar/map matching-based localization method, a reliable inertial navigation solution is essential both to provide a good initialization for the scan/map matching process and to increase the efficiency and accuracy of local map searching and selection; hence, in-time IMU bias correction is critical, and it is more easily achieved with an EKF. Since our test was undertaken in an urban area where GNSS signals are frequently lost, the feedback to the IMU states should also depend on the Lidar data, especially during GNSS outages. Moreover, the estimation uncertainty, an important parameter for analyzing the system solution, is seldom estimated in graph-based methods but can be directly estimated with an EKF. Therefore, in our current Method 2, the EKF is used to fuse the Lidar/map localization, GNSS, and inertial navigation results. A comparison of Method 1 and Method 2 also highlights the difference between using an IMU as a separate aiding sensor and as a tightly coupled aiding sensor.

#### 5.3.1. Estimation Results of Lidar/3D Map-Based Localization System

Since there is no ground truth information for this urban road test, the Lidar/map matching-based solutions are compared with the GNSS/INS solution within the three selected trajectory sections (Figure 14), during which the RTK status is "ambiguity-fixed" (Figure 6b).

Table 3 shows the comparison between the Lidar/map matching-based localization and the reference GNSS/INS localization results. For Method 1, the differences fluctuate around zero, and their mean values are at the centimeter to decimeter level. The standard deviation for all three sections is around 0.1–0.2 m; therefore, coordinate differences larger than 0.6 m are treated as possible outliers. The epochs with outliers make up about 1.7% of the total test data, which means that outliers are rare. Possible reasons for the outliers are discussed in the next section. Method 2 achieves better accuracy: its standard deviation is around 0.05 m, much lower than that of Method 1. It can be seen from Figure 14 that Method 2 has a smaller difference to the reference during the periods in which Method 1 shows possible outliers, indicating that the tightly coupled method is more robust to outliers than simply using the INS solution for initialization.

#### 5.3.2. Quality Analysis of the Numerical Results

The details of the measurements during epochs with big jumps (such as those in the red and green boxes in Figure 14) were checked to investigate possible causes of the detected outliers. For Trajectory Section 1 in Figure 14 (red box), it was found that Method 1 produced some large outliers when driving around a roundabout. The trajectories of the Lidar/map system and the GNSS/INS solution around this roundabout, and their views in Google Maps, are shown in Figure 15. It can be seen that the GNSS/INS solution (Figure 15, yellow line) is smoother in this area since the GNSS integer ambiguities are "fixed", while the Lidar/map solution deviates from the reference trajectory when the IMU is used only as a simple assistant (Method 1, Figure 15, blue line).

**Figure 14.** Coordinate difference between the proposed Lidar/map matching-based localization method and the reference GNSS/INS localization method at three trajectory sections. (**a**) Method 1: fusing IMU as an assistant sensor; (**b**) Method 2: fusing IMU using the EKF-based tightly coupled method. The red and green boxes indicate epochs with big coordinate difference with different timestamps.


**Table 3.** Mean and standard deviation for the difference between Lidar/map matching-based localization and the reference GNSS/INS localization results, for Trajectory Sections 1, 2 and 3 in Figure 14. (a) Method 1: fusing IMU as an assistant sensor; (b) Method 2: fusing IMUs using the EKF-based tightly coupled method.

**Figure 15.** Trajectory of Lidar/map matching-based localization Method 1 (blue), Method 2 (Red) and GNSS/INS localization (Yellow): (**a**) view in Google Map; (**b**) view in local coordinate system around the roundabout.

Figure 16 shows the map view at this roundabout. It can be seen that the structure of the pre-generated map on the driving side of the road is not very clear, since it lacks features around the trajectory. The roundabout is located in the parking area of a tourist attraction. There are no buildings and very few trees around this area. Since the testing was undertaken in the evening, there were not many parked vehicles that could be used as features. Therefore, the quality of the matching step may be poorer, resulting in degraded localization accuracy.

Figure 17 shows the Lidar scan view at this point with a range threshold of 20 m; it can be seen that this Lidar scan does not have many usable features, especially after pre-processing. A similarly feature-poor road environment is found where another cluster of outliers appeared in Trajectory Section 1 (green box in Figure 14). In this situation, extending the range threshold may enhance the accuracy by including more features; however, it would increase the computational burden and expose the matching to more outlier sources. Incorporating the inertial motion model by fusing the IMU more tightly can make the localization system more robust to such featureless conditions (Figure 15, red line).

**Figure 16.** A section of (**a**) the generated map, and (**b**) the Google Earth view for the same location around the roundabout.

**Figure 17.** Scan frame at epoch 40,647 s with big outlier in Trajectory Section 1 around the roundabout ((**a**) the original scan view; (**b**) the view after pre-processing).

Another major source of outliers is other moving entities around the host vehicle. Figures 18–20 show the Lidar views at epochs with big outliers in the localization stage.

Whether another moving vehicle is on the same side of the road or the opposite side, it will influence the quality of the Lidar/map matching-based localization. When the moving vehicle is on the same side of the road as the host vehicle, it degrades the estimation when it is initially detected, when it travels at a different speed from the host vehicle, or when it turns onto another road and is no longer detectable (red box in Figure 14 Trajectory Section 2, and Figure 18).

**Figure 18.** Scan frame at epoch 40,854 s with big outlier in Trajectory Section 2: a following vehicle is driving to another road, (**a**) Lidar scan; (**b**) Google Earth view.

**Figure 19.** Scan frame at epoch 40,888 s with big outlier in Trajectory Section 2, an opposite driving vehicle detected, (**a**) Lidar scan; (**b**) Google Earth view.

**Figure 20.** Scan frame at epoch 41,103 s with big outlier in Trajectory Section 3, with one tall bus driving past (**a**) Lidar scan; (**b**) Google Earth view.

Once the host vehicle detects an oncoming vehicle, the localization estimation errors can reach 1–1.5 m (green box in Figure 14 Trajectory Section 2, and Figure 19).

The type of moving element also affects the presence of outliers. Most of the time, the vertical position estimate is less influenced by moving elements. However, for the red box in Trajectory Section 3 (Figure 14), the differences in the vertical direction are much higher than in the other sections. Inspection of the Lidar view shows that, in that section, a tall bus was driving past the host vehicle (Figure 20); the resulting vertical differences between the current Lidar scan and the pre-generated map might have caused some systematic vertical biases.

Moving objects within the road environment are a major source of measurement outliers because the system treats the pre-generated map as a fixed reference. Therefore, if moving objects cause the structure of the pre-generated map and the current scan frame to differ, outliers will occur. Moving objects, such as other vehicles, may exist both in the earlier road mapping stage and in the current road scans used for localization. In the offline HD map generation step, such moving objects should be carefully identified and removed from the static 3D point cloud maps. In the online step, moving objects can be identified and removed based on the cleaned pre-generated map, or directly via semantic segmentation of the sensor data. Researchers have developed moving object segmentation methods, such as LMNet [166], which can distinguish moving and static objects using a CNN. The detected moving objects can then be removed, or any detected road environment change can be used to update the global map and so enhance the accuracy of future driving along the same road path. These methods may be applied during the perception step. Numerical quality control methods may also contribute to this task during the localization and mapping steps, such as the FDI method, or outlier detection and identification methods, which can directly estimate and mitigate the influence of outliers from all kinds of sources, not only moving objects but also other sensor or model faults.
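As a toy illustration of the map-based filtering idea (not LMNet, and not the method used in this study), scan points that have no nearby counterpart in the static map can be flagged as potentially dynamic:

```python
import numpy as np

def remove_dynamic_points(scan, map_pts, dist_thresh=0.5):
    """Naive dynamic-point filter: drop scan points that have no map point
    within `dist_thresh` metres, assuming the pre-generated map contains
    only static structure. Brute-force distances are used for clarity;
    a real system would use a spatial index (k-d tree, voxel hash)."""
    d2 = ((scan[:, None, :] - map_pts[None, :, :]) ** 2).sum(-1)
    keep = d2.min(axis=1) <= dist_thresh ** 2
    return scan[keep]
```

This simple test would suppress points on a passing bus or oncoming car before scan/map matching, at the cost of also discarding genuinely new static structure, which is why map updating and learned segmentation are complementary.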

Some FDI methods and integrity monitoring methods [255,256] have already been successfully applied to GNSS/INS integration systems under an EKF framework. Since in this case study the EKF is used to fuse the Lidar/map localization results and the INS pose to generate high-frequency precise pose solutions, these quality control methods are also potentially applicable to the proposed localization system. This will be a topic for future research.

#### **6. Conclusions**

This paper gives a brief review of different SLAM approaches and their characteristics. SLAM has become a key approach for localization, mapping, planning, and control in automated driving. It shows promising progress in generating high-resolution maps for autonomous driving and for vehicle localization in road environments. The advantages and disadvantages of different SLAM techniques have been identified, and their applications in autonomous driving have been discussed.

The trustworthiness of localization and navigation algorithms is an important issue for autonomous driving. There are many challenges that limit the performance of the SLAM techniques, which affect the safety of the localization and navigation results. These challenging issues, and possible solutions, are mentioned in this review. Furthermore, in order to ensure safety, the performance of the algorithms should be quantitatively evaluated with respect to such measures as accuracy, consistency, precision, reliability, and integrity. The methods to evaluate these qualities are briefly discussed.

A real-world road test was conducted to demonstrate the application of SLAM for autonomous driving with multi-sensor integration. The numerical results show that a GNSS/INS-aided-Lidar system can generate a georeferenced high-density point cloud map. This pre-generated map can then be used to support online localization, which has achieved about centimeter-level accuracy. This Lidar/map matching-based localization method may also be useful to support an autonomous driving system during periods when GNSS signals are unavailable, which makes it suitable for urban area driving. A more tightly coupled fusion of IMU measurements will make the Lidar/map-based localization more accurate and robust to outliers than simply utilizing the inertial solution as assistant information.

Future studies should focus on how to detect moving entities and mitigate their impact on the 3D point cloud mapping and localization process. In addition, integrity monitoring procedures for such a Lidar/GNSS/INS-based vehicle localization and mapping system should be investigated.

**Author Contributions:** J.W. and S.Z. had the idea for the article. S.Z. performed the literature search, conducted the experiments and corresponding analysis, and wrote and revised the manuscript. J.W. reviewed and commented on the draft manuscript. W.D. supported the experiment setup. C.R. and A.E.-M. provided critical feedback. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is supported by the Australian Research Council (ARC) Project No. DP170103341.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the size of the data.

**Conflicts of Interest:** The authors declare no conflict of interest.
