1. Introduction
Autonomous vehicles [1,2] are widely believed to be the next revolutionary technology, changing people's lives in many ways. For fully autonomous vehicles, localization is a key component because accurate and robust positioning is the basis of the subsequent perception and path planning tasks. The most promising solution for globally referenced positioning is the integration of a global navigation satellite system (GNSS), light detection and ranging (LiDAR), an inertial measurement unit (IMU) and a high-definition map (HD map). Currently, this solution can provide satisfactory performance in suburban areas [3,4,5,6,7,8,9]. However, its performance can be severely challenged in deeply urbanized areas, such as Hong Kong, Tokyo, and New York. Firstly, the GNSS positioning error can grow to 50 m [10,11] due to blockage, reflection, and diffraction caused by buildings and moving objects. Moreover, the uncertainty of GNSS positioning, which is important for sensor fusion, is difficult to model. The IMU can drift over time in dense traffic congestion. The matching between the real-time point clouds from LiDAR and the offline point clouds from the HD map can also be challenged by excessive moving objects, changing city structures, and limited availability of environment features. Simultaneous localization and mapping (SLAM) [12,13] is an important method for providing a positioning service based on the matching of point clouds. The accuracy of SLAM-based positioning relies heavily on the matching between point clouds; in other words, the performance of SLAM is strongly related to the environment. Thus, this paper evaluates the performance of NDT-based graph SLAM in diverse urban scenarios to further study the relationship between the performance of SLAM and the environment conditions.
Numerous studies [13,14,15,16,17,18] have been conducted on LiDAR-based SLAM in the past decades. The main principle of LiDAR-based SLAM is to continuously track the transformation between successive frames of point clouds. The performance of SLAM therefore relies heavily on the accuracy of the matching-based transformation. LiDAR odometry and mapping (LOAM) [18] can obtain low-drift positioning when satisfactory environment features, such as planes and edges, are available. However, the performance of this algorithm can be severely degraded in dense urban areas due to excessive moving objects. On the one hand, the positioning accuracy cannot be guaranteed when too many dynamic plane features come from traffic. On the other hand, the LOAM algorithm does not provide an effective method to model the uncertainty of LiDAR-based positioning. The submap concept is proposed in [13], where real-time loop closure detection is achieved. However, the uncertainty is not well modeled, and the performance of this algorithm relies heavily on additional sensors, such as the IMU. The normal distribution transform (NDT) matching solution is proposed in [19]. This method employs the normal distribution transform to convert the matching process into probabilistic continuous functions. However, this method also cannot effectively model the positioning uncertainty caused by moving objects. According to previous research, the uncertainty caused by moving objects is not well modeled [20]. To reduce the drift of LiDAR-based positioning, the loop closure detection algorithm [14] is proposed to mitigate the global positioning error. The main idea of loop detection is to identify two similar poses that the vehicle has passed through. The overall correction of the poses is then obtained based on the loop closure to improve accuracy. However, loop closure is subject to the availability of a closed loop.
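The NDT idea described above, replacing discrete point matching with a continuous probabilistic score, can be illustrated with a minimal 2D sketch. It is not the PCL implementation or the authors' code: the function names `build_ndt_cells` and `ndt_score`, the cell size, and the covariance regularization term are assumptions made for illustration.

```python
import numpy as np
from collections import defaultdict

def build_ndt_cells(ref_points, cell_size):
    """Group reference points into grid cells and fit a Gaussian to each cell."""
    buckets = defaultdict(list)
    for p in ref_points:
        key = tuple(np.floor(p / cell_size).astype(int))
        buckets[key].append(p)
    cells = {}
    for key, pts in buckets.items():
        if len(pts) < 3:
            continue  # too few points for a meaningful covariance
        pts = np.array(pts)
        mean = pts.mean(axis=0)
        # Small regularization (assumed value) keeps the covariance invertible.
        cov = np.cov(pts.T) + 1e-3 * np.eye(pts.shape[1])
        cells[key] = (mean, np.linalg.inv(cov))
    return cells

def ndt_score(scan_points, cells, cell_size, rotation, translation):
    """Sum of Gaussian likelihood terms of the transformed scan points."""
    transformed = scan_points @ rotation.T + translation
    score = 0.0
    for p in transformed:
        key = tuple(np.floor(p / cell_size).astype(int))
        if key in cells:
            mean, inv_cov = cells[key]
            d = p - mean
            score += np.exp(-0.5 * d @ inv_cov @ d)
    return score
```

A registration routine would then search for the rotation and translation that maximize this score between two consecutive frames.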
Both LOAM and NDT can be used to calculate the transformation between consecutive frames of point clouds. The LiDAR odometry is obtained by tracking these continuous transformations. To integrate the LiDAR odometry with other sensors, a sensor fusion framework is needed. Sensor integration methods can be divided into two groups: filtering-based and smoothing-based integration. The representative filtering-based sensor integration method is the Bayes filter, including the Kalman filter [21,22], the information filter [16,23,24] and the particle filter [17,25,26]. Bayes filter-based sensor integration estimates the current state based only on the current observation and the previous state estimate, discarding all earlier states [27]. This follows from the first-order Markov assumption, which is one of the key assumptions of the Bayes filter. Conversely, the smoothing approaches [28,29,30,31] estimate the pose and map by considering the full set of measurements between the first epoch and the current epoch. The most well-known smoothing method is graph-based SLAM [12]. However, no matter which integration method is chosen, the accuracy of LiDAR-based positioning is important for the final sensor integration.
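Obtaining LiDAR odometry by tracking the continuous transformations, as described above, amounts to chaining frame-to-frame transformations. The following 2D sketch illustrates this; the helper names `se2` and `accumulate_odometry` are hypothetical, and the relative transformations are hand-written stand-ins for the output of a registration method such as NDT or LOAM.

```python
import numpy as np

def se2(dx, dy, dtheta):
    """Homogeneous 2D transform: translate by (dx, dy), then rotate by dtheta (rad)."""
    c, s = np.cos(dtheta), np.sin(dtheta)
    return np.array([[c, -s, dx],
                     [s,  c, dy],
                     [0.0, 0.0, 1.0]])

def accumulate_odometry(relative_transforms):
    """Chain frame-to-frame transforms into a trajectory of (x, y) positions."""
    pose = np.eye(3)                      # start at the origin, heading along +x
    trajectory = [pose[:2, 2].copy()]
    for t in relative_transforms:
        pose = pose @ t                   # express the new frame in the start frame
        trajectory.append(pose[:2, 2].copy())
    return np.array(trajectory)
```

Because each pose is the product of all previous relative transformations, any error in one registration is carried into every later pose, which is why standalone odometry drifts over time.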
Our team aims to develop autonomous driving vehicles to facilitate the next generation of the intelligent transportation system of Hong Kong, for which an accurate and robust localization service is the basis. This paper extensively analyzes the performance of NDT-based graph SLAM in diverse urban scenarios. It first employs the NDT to calculate the transformation between two consecutive frames. Then, graph optimization is used to optimize all the LiDAR odometry measurements from the first epoch to the current epoch. Moreover, this paper estimates the uncertainty of the LiDAR odometry in terms of the degree of matching, the number of iterations and the time used for NDT optimization. This covariance estimation solution is available in the point cloud library (PCL).
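The graph optimization step over all odometry measurements can be illustrated with a deliberately reduced example: a 1D pose graph solved by linear least squares. This is a sketch under stated assumptions, not the optimizer used in this paper; the function name `optimize_pose_graph_1d`, the equal weighting of all edges, and the example measurements are hypothetical.

```python
import numpy as np

def optimize_pose_graph_1d(odometry, loop_closures):
    """Least-squares estimate of 1D poses x_0..x_n with x_0 fixed at 0.

    odometry: measured displacements between consecutive poses.
    loop_closures: tuples (i, j, d) stating the measurement x_j - x_i = d.
    All edges are weighted equally (an assumption of this sketch).
    """
    n = len(odometry)                 # unknowns are x_1..x_n
    rows, rhs = [], []
    for k, d in enumerate(odometry):  # edge between x_k and x_{k+1}
        row = np.zeros(n)
        row[k] = 1.0                  # column k holds x_{k+1}
        if k > 0:
            row[k - 1] = -1.0         # column k-1 holds x_k
        rows.append(row)
        rhs.append(d)
    for i, j, d in loop_closures:
        row = np.zeros(n)
        if j > 0:
            row[j - 1] += 1.0
        if i > 0:
            row[i - 1] -= 1.0
        rows.append(row)
        rhs.append(d)
    x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.concatenate(([0.0], x))
```

With the biased measurements `[1.1, 1.1, -0.9, -0.9]` and one loop closure `(0, 4, 0.0)` stating that the last pose coincides with the first, dead reckoning ends 0.4 from the start, while the optimized trajectory ends 0.08 from it. Without such a loop closure, the solution simply reproduces the accumulated odometry.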
The main contributions of this paper are listed as follows:
- (1)
This paper proposes to generate the 3D building models of the tested area to define the degree of urbanization of the given scenario. The Skyplot is generated as an indicator of the degree of urbanization and the corresponding definition is presented. The classification criteria of different urban scenarios are proposed using Skyplot features.
- (2)
Existing multi-sensor integrated localization solutions [7,8] tend to be evaluated in sparse scenarios with favorable traffic. This paper evaluates the performance of NDT-based graph SLAM in diverse urban scenarios, with different traffic conditions and degrees of urbanization.
- (3)
This paper qualitatively analyzes the relationship between the performance of NDT-based graph SLAM and both the traffic conditions and the degree of urbanization. The evaluation results can serve as a useful basis for further mitigating the effects of traffic and urbanization on LiDAR-based positioning.
The rest of the paper is structured as follows. The transformation calculation based on LiDAR is presented in Section 2. The graph-based SLAM is introduced in Section 3 before the experimental evaluation is given in Section 4. Finally, the conclusion and future work are presented in Section 5.
4. Experimental Evaluation
To evaluate the performance of NDT-based graph SLAM in diverse urban scenarios, experiments are conducted in three different scenarios with different traffic conditions. The three scenes are as follows:
- (1)
Sparse area: (a) Sparse area with normal traffic. (b) Sparse area with dense traffic.
- (2)
Sub-urban area: (a) Sub-urban area with normal traffic. (b) Sub-urban area with dense traffic (both presented in Appendix A).
- (3)
Dense urban area: (a) Dense urban area with normal traffic. (b) Dense urban area with dense traffic.
Definition 1. The degree of urbanization: The level of urbanization increases from scene (1) to scene (3). There is almost no existing, effective way to model the degree of urbanization with regard to autonomous vehicles. The degree of urbanization is extensively discussed in the GNSS field, as buildings [36,37] and moving objects [20] can have significant effects on the accuracy of GNSS solutions. The Skyplot [38] is employed to represent satellite visibility (LOS: line-of-sight, NLOS: non-line-of-sight) by projecting both the 3D building models and the satellites into the Skyplot coordinate system. Inspired by this, we propose to use the mean mask elevation angle of a Skyplot as an indicator of urbanization. As the building model information is available in Google Maps for research purposes, we construct the 3D building models of the experiment scene shown in Figure 2. For a dense urban scenario, the 3D building models generated in a street are shown on the left side of Figure 3, assuming the vehicle is at the position shown in Figure 3 with the 3D LiDAR sensor installed on the top. By connecting a line from the vehicle to the building roof, we can denote the mask elevation angle and azimuth angle as $\theta$ and $\alpha$, respectively. Assuming that, at an azimuth angle $\alpha$, the distance between the vehicle and the building is W and the building height is H, we can calculate the corresponding elevation angle $\theta(\alpha)$ as:
$$\theta(\alpha) = \arctan\left(\frac{H}{W}\right)$$
By traversing all azimuth angles over the full circle at a fixed angular resolution, we can obtain the Skyplot [38] (shown on the right side of Figure 3) for the given vehicle position. The inner circles indicate different elevation angles. The “N” means north. The shaded area means blockage from buildings. Taller buildings cause more blockage, which indicates a higher degree of urbanization.
Therefore, we propose to calculate the mean mask elevation angle as a quantitative indicator of the degree of urbanization of the given scenario as follows:
$$\bar{\theta} = \frac{1}{N}\sum_{i=1}^{N}\theta(\alpha_i)$$
where N is the number of sampled azimuth angles. In this case, the degree of urbanization of a given scenario is represented by $\bar{\theta}$. We propose to define the degree of urbanization using the rules in Table 1. Under these rules, the $\bar{\theta}$ for the scenario in Figure 3 satisfies the dense urban area condition.
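The mean mask elevation angle can be computed directly from a per-azimuth list of building heights H and distances W. The sketch below is illustrative only: the function names are hypothetical, the building profile is made-up data rather than the Figure 3 scene, and the thresholds of Table 1 are not reproduced.

```python
import math

def mask_elevation_deg(height_m, distance_m):
    """Mask elevation angle (degrees) toward a roof of height H at horizontal distance W."""
    return math.degrees(math.atan2(height_m, distance_m))

def mean_mask_elevation_deg(profile):
    """Average mask elevation over all sampled azimuths.

    profile: list of (H, W) pairs, one per sampled azimuth angle;
    H = 0 represents open sky in that direction.
    """
    angles = [mask_elevation_deg(h, w) for h, w in profile]
    return sum(angles) / len(angles)

# Made-up profile: buildings (H = 10 m, W = 10 m) on half of the azimuths,
# open sky on the other half.
profile = [(10.0, 10.0)] * 180 + [(0.0, 10.0)] * 180
mean_angle = mean_mask_elevation_deg(profile)
```

In this made-up profile, each blocked azimuth has a mask elevation of 45 degrees and each open azimuth has 0 degrees, so the mean is 22.5 degrees. The heights are taken relative to the sensor, which is a simplifying assumption of the sketch.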
Definition 2. Traffic conditions: Normal traffic means the common traffic density in urban areas, with approximately 2–5 vehicles surrounding the ego-vehicle (shown on the left of Figure 4). Dense traffic indicates that there are numerous moving objects on the road, for example, during rush hours, with approximately 8–12 vehicles surrounding the ego-vehicle (shown on the right of Figure 4).
4.1. Experimental Setup
A 3D LiDAR sensor, Velodyne 32, is employed to provide real-time point clouds scanned from the surroundings. The 3D LiDAR is installed on the top of the vehicle during the experiment, as can be seen in Figure 5. The LiDAR coordinate system is shown in Figure 5, with the x-axis pointing to the back of the vehicle. An integrated navigation system (NovAtel SPAN-CPT, an RTK/INS integrated navigation system with fiber-optic gyroscopes) based on the local ENU [39] coordinate system is used to provide the ground truth. The coordinate systems of the LiDAR and the SPAN-CPT are calibrated at the beginning of the tests.
The evaluated items in the following experiments are the positioning error of NDT-based graph SLAM and the reliability estimation error. The total positioning error is calculated from the differences between the estimated position and the ground truth (from the GNSS RTK/INS integrated system) along the east, north and up axes of the ENU coordinate system. With the heading angle of the vehicle relative to north expressed in radians, the positioning error in the longitudinal direction is calculated by projecting the positioning error onto the heading direction. Similarly, the error in the lateral direction is calculated by projecting the positioning error onto the direction normal to the heading.
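The error decomposition described above can be sketched as follows. Several choices are assumptions of this sketch and are not stated in the source: the total error is taken as the Euclidean norm of the ENU differences, the heading is measured clockwise from north, and a positive lateral error points to the right of the vehicle. The function name `positioning_errors` is hypothetical.

```python
import math

def positioning_errors(est_enu, gt_enu, heading_rad):
    """Return (total, longitudinal, lateral, altitude) errors in metres.

    est_enu, gt_enu: (east, north, up) positions in the local ENU frame.
    heading_rad: vehicle heading, measured clockwise from north (assumed convention).
    """
    de = est_enu[0] - gt_enu[0]
    dn = est_enu[1] - gt_enu[1]
    du = est_enu[2] - gt_enu[2]
    total = math.sqrt(de ** 2 + dn ** 2 + du ** 2)   # Euclidean norm (assumed)
    # Unit heading vector in (east, north) is (sin h, cos h).
    longitudinal = de * math.sin(heading_rad) + dn * math.cos(heading_rad)
    # Unit vector to the right of the heading is (cos h, -sin h).
    lateral = de * math.cos(heading_rad) - dn * math.sin(heading_rad)
    return total, longitudinal, lateral, du
```

For a vehicle heading north, the longitudinal error equals the north error and the lateral error equals the east error; for a vehicle heading east, the longitudinal error equals the east error.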
The reliability is calculated as presented in Section 2. The ground truth for reliability estimation is the actual total positioning error. The objective of reliability estimation is to obtain the smallest circle that covers the actual positioning error, representing the uncertainty of a given positioning result.
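One simple way to compare an estimated reliability series against the actual positioning error is to count how often the estimate covers the error and how far apart the two are on average. This summary is an illustrative assumption, not a metric defined in this paper; the function name `reliability_summary` and the sample values are hypothetical.

```python
def reliability_summary(reliability_m, error_m):
    """Compare estimated reliability with actual positioning error, epoch by epoch.

    Returns (coverage_ratio, mean_margin):
      coverage_ratio: fraction of epochs where reliability >= actual error.
      mean_margin: mean of (reliability - error); positive means the
                   uncertainty is overestimated on average, negative means
                   it is underestimated.
    """
    margins = [r - e for r, e in zip(reliability_m, error_m)]
    covered = sum(1 for m in margins if m >= 0)
    return covered / len(margins), sum(margins) / len(margins)

# Made-up series: a constant 5 m reliability against a growing error.
coverage, margin = reliability_summary([5.0, 5.0, 5.0, 5.0], [1.0, 4.0, 6.0, 10.0])
```

In this made-up series, the estimate covers the error in half of the epochs and underestimates it by 0.25 m on average, the kind of behavior that appears when drift grows while the estimated reliability stays flat.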
4.2. Experiment in Sparse Area
4.2.1. Experiment 1: Performance Evaluation of NDT-based Graph SLAM in Sparse Area with Normal Traffic
In this experiment, the scenario is shown in the top panel of Figure 6. The overall drive of the vehicle lasts about 395 s in a sparse area with normal traffic. The surrounding buildings are about 5–10 m high and the streets are approximately 16 m wide. The mean mask elevation angle for the scenario shown in Figure 6 satisfies the sparse area condition.
As the bottom panel of Figure 6 shows, the positioning result of SLAM tracks the ground truth well at the beginning of the test. However, due to the error accumulated over time, the SLAM-based trajectory drifts away from the ground truth. The detailed positioning error during the experiment is shown in Figure 7. The top panel shows the positioning error in three directions: the longitudinal direction is parallel to the driving direction of the vehicle, and the lateral direction is normal to it. The bottom panel shows the reliability estimation, calculated as presented in Section 2. The 3D positioning error increases almost steadily over time, with the final positioning error reaching almost 10 m. The estimated reliability tracks the 3D positioning error only at the very beginning of the test and then tends to fluctuate around 7 m in this experimental scenario.
Table 2 shows the mean and standard deviation of the positioning error in three separate directions. The mean error in the lateral direction is 3.44 m with a standard deviation of 1.88 m. The mean positioning error in the longitudinal direction is 3.19 m. Moreover, the mean positioning error in the altitude direction is 3.05 m with a standard deviation of 1.02 m. The mean 2D positioning error (sum of the lateral and longitudinal directions) is 6.64 m, which is smaller than the mean 3D positioning error (9.69 m). As the positioning error accumulates over time when using standalone NDT-based graph SLAM, we propose to use the gradient of the accumulated error to evaluate the performance of SLAM. The 2D gradient indicates the rate of change of the 2D mean positioning error and is obtained by dividing the 2D mean error by the total number of epochs. Similarly, the 3D gradient indicates the rate of change of the 3D positioning error. The 2D gradient is 0.017, which means that the accumulated error of SLAM increased by 0.017 m per second. The 3D positioning error gradient is 0.024 m per second.
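The gradient defined above is a single division, which can be checked against the Experiment 1 values. The sketch assumes one epoch per second, so that the 395 s drive corresponds to 395 epochs; the function name `error_gradient` is hypothetical.

```python
def error_gradient(mean_error_m, duration_s):
    """Rate of error accumulation in metres per second.

    Assumes one epoch per second, so the number of epochs equals the
    drive duration in seconds.
    """
    return mean_error_m / duration_s

# Experiment 1 values from Table 2 and the 395 s drive.
gradient_2d = error_gradient(6.64, 395.0)
gradient_3d = error_gradient(9.69, 395.0)
```

Under this assumption, the computed values agree with the reported gradients of 0.017 and 0.024 m per second to within rounding.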
Regarding the reliability estimation result (blue dots in the bottom panel of Figure 7), the estimated reliability overestimates the 3D positioning error in the majority of situations. Moreover, the estimated reliability fluctuates dramatically during the test (from 0 to 15 m).
4.2.2. Experiment 2: Performance Evaluation of NDT-based Graph SLAM in Sparse Area with Dense Traffic
In this experiment, the scenario is shown in the top panel of Figure 8. The overall drive of the vehicle lasts about 400 s in a sparse area with dense traffic. The surrounding buildings are about 5–10 m high and the streets are approximately 16 m wide, which is similar to experiment 1.
As the bottom panel of Figure 8 shows, the positioning result of SLAM tracks the ground truth at the beginning of the test. However, due to the error accumulated over time, the SLAM-based trajectory drifts away from the ground truth. The positioning error during the experiment is shown in Figure 9. The 3D positioning error increases over time, with the final positioning error reaching about 19 m. The estimated reliability tracks the 3D positioning error at the very beginning of the test. However, the difference between the estimated reliability of SLAM and the 3D positioning error increases over time. The estimated reliability tends to fluctuate in the vicinity of 5 m during the experiment.
Table 3 shows the mean and standard deviation of the positioning error in three separate directions. The mean error in the lateral direction is 6.31 m with a standard deviation of 5.26 m. The mean positioning error in the longitudinal direction is 4.91 m. Interestingly, the mean positioning error in the altitude direction is just 0.77 m with a standard deviation of 0.84 m. The mean 2D positioning error (sum of the lateral and longitudinal directions) is 11.21 m, which is slightly smaller than the mean 3D positioning error (11.99 m). The 2D gradient is 0.028, which means that the accumulated error of SLAM increased by 0.028 m per second. The 3D positioning error gradient is 0.03 m per second.
Regarding the reliability estimation result (blue dots in the bottom panel of Figure 9), the estimated reliability is even worse than that in experiment 1. From epoch 100 to 400, the actual 3D positioning error is larger than the estimated reliability. The estimated mean reliability is 5.93 m, which is smaller than its ground truth (11.99 m).
The main reason for this is the increased traffic density compared with experiment 1. According to [40], the dynamic objects from traffic can cause increased uncertainty in LiDAR-based positioning. The evaluated reliability estimation method [34] cannot model the uncertainty caused by dynamic objects. These experimental results show that traffic has a negative effect on the performance of NDT-based graph SLAM.
4.3. Experiment in Dense Urban Area
4.3.1. Experiment 3: Performance Evaluation of NDT-based Graph SLAM in Dense Urban Area with Normal Traffic
In this experiment, the scenario is shown in the top panel of Figure 10. The overall drive of the vehicle lasts about 124 s in a dense urban area with normal traffic. The surrounding buildings are about 59–105 m high and the streets are approximately 16–20 m wide. The mean mask elevation angle for the scenario shown in Figure 10 satisfies the dense urban area condition.
As the bottom panel of Figure 11 shows, the positioning result of SLAM tracks the ground truth well at the beginning of the test. However, due to the error accumulated over time, the SLAM-based trajectory drifts away from the ground truth. The positioning error during the experiment is shown in Figure 11. The 3D positioning error increases over time, with the final positioning error reaching about 28 m. The estimated reliability tracks the 3D positioning error at the very beginning of the test. However, the difference between the estimated reliability of SLAM and the 3D positioning error increases over time. The estimated reliability tends to fluctuate between 10 and 60 m during the experiment.
Table 4 shows the mean and standard deviation of the positioning error in three separate directions. The mean error in the lateral direction is 6.73 m with a standard deviation of 4.39 m. The mean positioning error in the longitudinal direction is 4.81 m. Interestingly, the mean positioning error in the altitude direction reaches 11.90 m, which is significantly larger than in the other two directions. The mean 2D positioning error (sum of the lateral and longitudinal directions) is 11.54 m, with the 3D positioning error reaching 14.85 m. The 2D gradient is 0.094 and the 3D gradient is 0.121.
Regarding the reliability estimation result (blue dots in the bottom panel of Figure 11), from epoch 0 to 60, the actual 3D positioning error is smaller than the estimated reliability. The estimated mean reliability is 18.90 m, which is significantly larger than its ground truth (14.85 m).
Compared with the experiments conducted in the sparse area, the 3D gradient of the positioning error in the dense urban area (0.121 m/s) is significantly larger than that in the sparse area (0.024 m/s in normal traffic and 0.03 m/s in dense traffic). These results show that the degree of urbanization also has an impact on the performance of NDT-based graph SLAM.
4.3.2. Experiment 4: Performance Evaluation of NDT-based Graph SLAM in Dense Urban Area with Dense Traffic
In this experiment, the scenario is shown in the top panel of Figure 12. The overall drive of the vehicle lasts about 124 s in a dense urban area with dense traffic. The surrounding buildings are about 50–175 m high and the streets are approximately 16–20 m wide. The mean mask elevation angle for the scenario shown in Figure 12 satisfies the dense urban area condition.
As the bottom panel of Figure 13 shows, the positioning result of SLAM tracks the ground truth well at the beginning of the test. However, due to the error accumulated over time, the SLAM-based trajectory drifts away from the ground truth. The positioning error during the experiment is shown in Figure 13. The 3D positioning error increases over time, with the final positioning error reaching about 42 m. The estimated reliability tracks the 3D positioning error at the very beginning of the test. However, the difference between the estimated reliability of SLAM and the 3D positioning error increases over time. Interestingly, the altitude positioning error dominates the 3D positioning error in both experiment 3 and experiment 4.
Table 5 shows the mean and standard deviation of the positioning error in three separate directions. The mean error in the lateral direction is 11.25 m with a standard deviation of 5.09 m. The mean positioning error in the longitudinal direction is 2.77 m. Interestingly, the mean positioning error in the altitude direction reaches 19.07 m, which is significantly larger than in the other two directions. The mean 2D positioning error (sum of the lateral and longitudinal directions) is 14.02 m, with the 3D positioning error reaching 23.22 m. The 2D gradient is 0.114 and the 3D gradient is 0.189.
Regarding the reliability estimation result (blue dots in the bottom panel of Figure 13), from epoch 20 to 120, the actual 3D positioning error is larger than the estimated reliability. The estimated mean reliability is 14.38 m, which is significantly smaller than its ground truth (23.22 m). Interestingly, the reliability estimation tends to overestimate the uncertainty of SLAM in the normal traffic scenario and underestimate it in the dense traffic scenario. The main reason is that the reliability estimation method used cannot model the effects of traffic (dynamic objects). In other words, dense traffic scenes introduce larger uncertainty. This result again shows that traffic has a negative impact on the performance of NDT-based graph SLAM.
5. Discussion, Conclusion and Future Work
(1) The relationship between the traffic conditions and the performance of NDT-based graph SLAM positioning:
Traffic condition and accuracy of NDT-based graph SLAM: The detailed analysis of the relationship between the traffic conditions and the accuracy of LiDAR-based positioning is shown in Figure 14, which shows the results for two different traffic conditions. According to the six experiments presented (including the two experiments presented in Appendix A), the accuracy of SLAM degrades with increased traffic density. For example, the mean 3D positioning error increased from 1.58 m (experiment 5 with normal traffic) to 1.91 m (experiment 6 with dense traffic). The same phenomenon is observed in the sparse and dense urban areas. The main cause of this degradation in SLAM performance is the moving objects in traffic, such as double-decker buses, cars, and trucks. Our previous research [20] shows that the height of a double-decker bus can reach 4.5 m in Hong Kong and, thus, it takes up the majority of the field of view (FOV) of the 3D LiDAR. As the double-decker bus is a moving object on the road, the majority of the 3D point cloud is then scanned from moving objects. The points from moving objects can distort the matching between two consecutive frames of point clouds, thus impairing the accuracy of SLAM. We can also see that the positioning error gradient increases with increased traffic density. Overall, traffic conditions have negative effects on the accuracy of NDT-based SLAM. In other words, more dynamic environments with more moving objects introduce more degradation in the positioning accuracy of NDT-based graph SLAM. The evaluation results relating traffic to the accuracy of NDT-based SLAM can be a good benchmark for further mitigating the effects of traffic to improve the accuracy of LiDAR-based positioning.
Traffic condition and reliability estimation: As presented in Experiment 4, the reliability estimation tends to overestimate the uncertainty of SLAM in a normal traffic scenario and underestimate it in a dense traffic scenario. In other words, dense traffic scenes introduce larger uncertainty. This result again shows that traffic has a negative impact on the performance of NDT-based graph SLAM. Proper methods to cope with dynamic objects are needed to effectively estimate the uncertainty caused by dense traffic.
(2) The relationship between the degree of urbanization and the performance of LiDAR-based positioning:
The degree of urbanization and accuracy of NDT-based graph SLAM: The detailed analysis of the relationship between the degree of urbanization and the accuracy of NDT-based graph SLAM is shown in Figure 15. The six experiments cover three levels of areas classified by the degree of urbanization. As Figure 15 shows, the 3D gradient in the sub-urban area is similar to that in the sparse area. However, the 3D gradient in the dense urban area is significantly larger than that in both the sparse and sub-urban areas. The main reason for this result is the availability of environment features. In the sub-urban and sparse area experiments, the main features are buildings, moving objects, and some trees, which means that abundant features are available. In the dense urban area, the main features are tall buildings and moving objects, which means fewer features are available. Moreover, the positioning error in the altitude direction increases dramatically with an increased degree of urbanization, which can be seen by comparing Experiments 1 and 3. Overall, an increased degree of urbanization can degrade the accuracy of SLAM-based positioning, especially in the altitude direction. To effectively model the uncertainty of LiDAR-based positioning, the surrounding environment features, for example, the degree of urbanization, need to be considered. The 3D building model generated in Figure 2 is a potential resource for improving the accuracy of NDT-based graph SLAM. Inspired by this, we are going to employ the 3D building model to help estimate the effects of urbanization on the performance of LiDAR-based positioning.
The degree of urbanization and reliability estimation of NDT-based graph SLAM: As discussed in experiment 4, the reliability estimation of NDT-based SLAM is highly related to the traffic condition. However, there is no obvious relationship between the reliability estimation and the degree of urbanization according to the presented experiments. Overall, the sub-urban area has the smallest mean estimated uncertainty and the dense urban area has the largest.
Insufficient positioning accuracy and robustness is one of the main technical problems preventing the arrival of autonomous vehicles in super-urbanized areas, such as Hong Kong, Tokyo, and New York. According to our previous localization experiments conducted in Hong Kong using a Kalman filter-based multi-sensor fusion (GNSS/IMU/LiDAR/HD Map) method [34], the method can obtain satisfactory performance in the sparse area. However, the solution usually fails in a deep urban area with dense traffic. Therefore, we conduct experiments in diverse urban scenarios to find out the effects of traffic and the degree of urbanization on LiDAR-based positioning. This is also the main contribution of this paper. Coping with the effects of both traffic and urbanization on LiDAR-based positioning is important for the GNSS/INS/LiDAR/HD Map-based localization solution for autonomous vehicles. We believe that this paper can serve as a useful basis for improving LiDAR-based positioning in dense urban scenarios.
(3) Future work: We aim to employ moving object detection and 3D building models to improve the performance of NDT-based graph SLAM. Moreover, the uncertainty estimation of LiDAR-based SLAM will be conducted by considering both the traffic conditions and the 3D building models.