#### **1. Introduction**

A mobile robot is a complex system integrating computer technology, sensor technology, information processing, electronic engineering, automation, and artificial intelligence. Aided by artificial intelligence, versatile mobile robots are widely used in emergency rescue, industrial automation, and smart living. Precise positioning is one of the key technologies that enable mobile robots to complete tasks autonomously. As robot technology develops rapidly, a single sensor can no longer meet the increasingly rich functional requirements of robots, so multi-source sensor information fusion has gradually attracted attention.

Mobile robots are widely used in indoor environments, where 2D LIDAR has become the sensor of choice for navigation and positioning thanks to its high-precision ranging and small data volume. However, with growing demand for outdoor applications, robots are gradually moving into increasingly complex open scenes. Driven by the DARPA (Defense Advanced Research Projects Agency) Grand Challenge [1,2], multi-line 3D LIDAR gained recognition and began to be widely used outdoors. 3D LIDAR offers stronger environmental awareness, but at the cost of a high price, a large data volume, and processing difficulty. In recent years, with the popularization of 3D LIDAR and the growing computing power of embedded processors, positioning technology based on 3D LIDAR has developed rapidly. 3D LIDAR provides high-density
point clouds, with richer matching methods and better robustness for matching between frames. Furthermore, it can be fused with image and odometry information [3] to enhance positioning accuracy, making it the mainstream sensor in many fields such as autonomous driving, autonomous robot navigation, and drone flight control.

For a SLAM system, accurate position and orientation estimation are essential. Scholars have conducted many studies, including vision-based and LIDAR-based methods, to realize real-time, high-precision 6-DOF state estimation for mobile robots. However, a single-sensor system has limitations. On the one hand, the dependence of vision on initialization and its sensitivity to changes in illumination make such systems unstable. On the other hand, the sparse information provided by LIDAR rapidly degrades positioning in unstructured scenes. In addition, rapid motion and long-term error accumulation further invalidate the odometry. Therefore, many auxiliary sensors such as IMU, GPS, MEMS, and UWB are added to positioning systems to solve these problems. In recent years, many LIDAR SLAM reviews have been published. Most of them outline the development of 3D LIDAR SLAM as a whole, covering a large but unwieldy body of content.

Tee [4] presents a detailed analysis and comparison of several common open-source 2D SLAM solutions, demonstrating the advantages and disadvantages of each method through simulation and experiment; however, 3D SLAM is not addressed. Bresson [5] reviews LIDAR SLAM with respect to the large-scale problems faced by autonomous driving. Similarly, reference [6] is an earlier SLAM review that discusses in detail the basic issues of SLAM and many works in its development, including long-distance SLAM, theoretical performance analysis, semantic association, and future directions. Both works summarize the classic theories and works in the field of SLAM; however, multi-sensor fusion is not covered.

Debeunne [7] divides SLAM into three parts: image-based, LIDAR-based, and image-LIDAR fusion; however, the integration of SLAM works, the intricacies of integration methods, and the development of data fusion are not discussed. Taheri [8] traces the development of SLAM by reviewing important works, summarizing the field and its outlook from multiple directions and stages. However, that work mainly covers visual SLAM, so its reference value for LIDAR SLAM is limited. Zhou [9] summarizes 3D LIDAR-based SLAM algorithms in terms of optimization frameworks, key SLAM modules, and future research hotspots, and then compares the performance of various SLAM algorithms in detail, which gives it high reference value.

It can be seen that most relevant SLAM reviews are organized around key modules such as front-end matching, loop closure detection, back-end optimization, and mapping, focusing on the development history and latest works of SLAM. This paper instead summarizes multi-sensor fusion SLAM algorithms based on 3D LIDAR from different perspectives. The contributions are:


This paper provides a detailed overview of multi-sensor systems through five main sections. Section 1 details the necessity of multi-sensor fusion in localization systems. Section 2 presents the basic problems to be solved by SLAM and the classical framework. Section 3 reviews the related works on loosely coupled systems in detail, in two parts according to sensor type. Similarly, Section 4 reviews the related works on tightly coupled systems. A comparison table is given at the end of each section. Finally, a summary of the full text and an outlook on follow-up work are presented. Abbreviations used in this paper are summarized in Table 1.


#### **2. Simultaneous Localization and Mapping System**

Over the past few decades, SLAM techniques have come a long way. SLAM systems based on various sensors have been developed, such as LIDAR, cameras, millimeter-wave radar, and ultrasonic sensors. As early as 1990, the feature-based fusion SLAM framework [10], shown in Figure 1, was established, and it is still in use today. The SLAM problem has evolved from two independent modules, localization and mapping, into a complete system that integrates the two, and the two modules promote each other: the high-precision odometry built from multiple sensors provides real-time pose estimates for the robot and the basis for reconstructing and stitching the 3D scene, while high-precision 3D reconstruction in turn provides important data for feature-based odometry to estimate pose. Even a stand-alone odometry system depends on building or storing temporary local maps to assist pose estimation.

**Figure 1.** Feature-based fusion SLAM framework.
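
To make this mutual promotion concrete, the following minimal Python sketch (all function and variable names are hypothetical, not taken from [10] or any cited system) shows the loop in which each incoming scan is registered against the map built so far, and the resulting pose is then used to stitch that scan into the map:

```python
# Minimal sketch of the localization <-> mapping loop in a feature-based
# SLAM framework. Every name here is an illustrative placeholder.
import numpy as np

def register(scan, local_map, guess):
    """Estimate the 4x4 pose aligning `scan` to `local_map`.
    A real front end would run feature matching or ICP/NDT here;
    this stub returns the motion-model guess to keep the sketch runnable."""
    return guess

def transform(scan, pose):
    """Move scan points (N x 3) into the map frame using pose (4 x 4)."""
    homo = np.hstack([scan, np.ones((scan.shape[0], 1))])
    return (pose @ homo.T).T[:, :3]

pose = np.eye(4)              # current robot pose in the map frame
local_map = np.empty((0, 3))  # accumulated 3D map points

for scan in (np.random.rand(100, 3) for _ in range(5)):  # stand-in LIDAR frames
    pose = register(scan, local_map, pose)                    # the map supports localization
    local_map = np.vstack([local_map, transform(scan, pose)])  # poses build the map
```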

Most modern SLAM systems are divided into two parts: a front end and a back end (as shown in Figure 2). The front end estimates the pose of the current frame in real time and stores the corresponding map information, while the back end performs large-scale pose and scene optimization. Loop closure detection is one of the key issues in SLAM: it helps the robot recognize previously visited scenes and triggers global-scale drift correction. Large-scale global optimization is also the main difference between SLAM and modern odometry, although the two have much in common in pose estimation. Most modern multi-sensor fusion techniques act in the front end, achieving high precision and low drift in the odometry by means of information complementation, local pose fusion, and multi-source filtering.

**Figure 2.** The main components of a SLAM system.
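
A rough skeleton of this division of labor might look as follows; the class and method names are invented for illustration, and the heavy lifting is stubbed out:

```python
# Illustrative front-end / back-end structure of a SLAM system.
class SlamSystem:
    def __init__(self):
        self.poses = []      # trajectory estimated so far
        self.keyframes = []  # stored map information

    def front_end(self, frame):
        """Real time: estimate the current frame pose, store map data."""
        pose = self.match_to_last_keyframe(frame)
        self.poses.append(pose)
        self.keyframes.append(frame)
        return pose

    def back_end(self, loop_pair):
        """Large scale: correct accumulated drift across the trajectory."""
        self.optimize_pose_graph(loop_pair)

    def process(self, frame):
        pose = self.front_end(frame)
        loop_pair = self.detect_loop_closure(frame)  # revisited scene?
        if loop_pair is not None:
            self.back_end(loop_pair)                 # global drift correction
        return pose

    # --- stubs standing in for the real algorithms ---
    def match_to_last_keyframe(self, frame):
        return len(self.poses)   # real systems run scan matching here

    def detect_loop_closure(self, frame):
        return None              # real systems run place recognition here

    def optimize_pose_graph(self, loop_pair):
        pass                     # real systems run graph optimization here

slam = SlamSystem()
print(slam.process("scan_0"))    # -> 0, the first (stub) pose
```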

Single-sensor systems are relatively mature; among them, LIDAR, cameras, and IMUs are the most common sensors in SLAM systems. 3D LIDAR can provide the system with rich structural information about the environment, but the data are discrete and voluminous. The camera captures color and texture in the environment at high speed, but it cannot directly perceive depth and is easily perturbed by lighting. The IMU senses weak changes in the system's motion over very short periods, but long-term drift is inevitable. The characteristics of the three are distinct, and their advantages and disadvantages are obvious. Single-sensor SLAM systems are fragile and full of uncertainty; they cannot cope simultaneously with multiple complex environments such as high-speed scenes, confined spaces, and large open scenes.
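
The inevitability of IMU drift follows from simple arithmetic: a constant accelerometer bias b, integrated twice, yields a position error of 0.5 * b * t^2 that grows quadratically with time. A tiny numeric illustration (the bias value is made up):

```python
# Why pure IMU dead reckoning drifts: double-integrating a constant
# accelerometer bias b gives a position error of 0.5 * b * t**2.
bias = 0.01  # m/s^2 -- an illustrative, made-up accelerometer bias

for t in (1.0, 10.0, 60.0):       # seconds of uncorrected integration
    drift = 0.5 * bias * t ** 2   # accumulated position error in meters
    print(f"after {t:4.0f} s: {drift:7.3f} m of drift")

# after    1 s:   0.005 m  -> negligible between two frames
# after   10 s:   0.500 m
# after   60 s:  18.000 m  -> useless without external correction
```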

Therefore, multi-sensor fusion has become a new trend in the development of SLAM systems. Most multimodal sensor fusion SLAM and odometry systems use some combination of LIDAR, camera, and IMU, and can be categorized as loosely coupled or tightly coupled. A loosely coupled system processes the measurements of each sensor separately and fuses them in a filter that marginalizes the data of the current frame to obtain the latest state estimate. A tightly coupled system jointly optimizes the measurements of all sensors, combining each sensor's observation characteristics and physical model to obtain a more robust pose estimate. The loosely coupled approach has the advantages of low computational cost, a simple system structure, and easy implementation, but its positioning accuracy is usually limited. In contrast, a tightly coupled system is computationally intensive and harder to implement, but it achieves more accurate state estimation in complex and changeable environments.
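
The structural contrast between the two modes can be sketched as below; the functions are invented placeholders (a loosely coupled filter would really be, e.g., an EKF, and a tightly coupled solver a nonlinear least-squares optimizer), and the 1-D "pose" is purely illustrative:

```python
# Loose coupling: each sensor pipeline outputs its own finished pose
# estimate, and a filter merges the estimates after the fact.
def loosely_coupled(lidar_pose, imu_pose, w_lidar=0.7, w_imu=0.3):
    return w_lidar * lidar_pose + w_imu * imu_pose  # weighted-fusion stand-in

# Tight coupling: raw measurement residuals from all sensors enter one
# joint cost, optimized over the shared state x.
def tightly_coupled(lidar_residual, imu_residual, x0, steps=50, lr=0.1):
    x = x0
    for _ in range(steps):  # crude gradient descent on the sum-of-squares cost
        grad = 2 * lidar_residual(x) + 2 * imu_residual(x)
        x -= lr * grad
    return x

# Toy example: both sensors observe a scalar pose near 2.0.
print(loosely_coupled(2.1, 1.9))                                   # -> ~2.04
print(tightly_coupled(lambda x: x - 2.1, lambda x: x - 1.9, 0.0))  # -> ~2.0
```

The tightly coupled result reflects both residuals in a single optimum, which is one intuition for why such systems stay accurate when a single sensor's own pose estimate would degrade.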

Based on these three sensors, a number of multi-sensor fusion simultaneous localization and mapping works have emerged in recent years. In this paper, according to the coupling method of the system and the types of sensors fused, these works are divided into LIDAR-IMU loosely coupled systems, Visual-LIDAR-IMU loosely coupled systems, LIDAR-IMU tightly coupled systems, and Visual-LIDAR-IMU tightly coupled systems. The development of SLAM has been a transition from loose coupling to tight coupling. The classification of some of the SLAM systems mentioned in this paper, and the developmental relationships between them, are shown in Figure 3.

**Figure 3.** Classification of some of the discussed works and the relationships between them.
