A Review of Key Technologies for Environment Sensing in Driverless Vehicles

Huo, Yuansheng; Zhang, Chengwei

doi:10.3390/wevj15070290

Open AccessReview

A Review of Key Technologies for Environment Sensing in Driverless Vehicles

by

Yuansheng Huo

and

Chengwei Zhang

^*

School of Physics and Mechatronic Engineering, Guizhou Minzu University, Guiyang 550025, China

^*

Author to whom correspondence should be addressed.

World Electr. Veh. J. 2024, 15(7), 290; https://doi.org/10.3390/wevj15070290

Submission received: 12 June 2024 / Revised: 25 June 2024 / Accepted: 27 June 2024 / Published: 29 June 2024

Download

Browse Figures

Versions Notes

Abstract

:

Environment perception technology is the most important part of driverless technology, and driverless vehicles need to realize decision planning and control by virtue of perception feedback. This paper summarizes the most promising technology methods in the field of perception, namely visual perception technology, radar perception technology, state perception technology, and information fusion technology. Regarding the current development status in the field, the development of the main perception technology is mainly the innovation of information fusion technology and the optimization of algorithms. Multimodal perception and deep learning are becoming popular. The future of the field can be transformed by intelligent sensors, promote edge computing and cloud collaboration, improve system data processing capacity, and reduce the burden of data transmission. Regarding driverless vehicles as a future development trend, the corresponding technology will become a research hotspot.

Keywords:

driverless; visual perception; radar perception; state perception; information fusion

1. Introduction

Computer vision is one of the most widely used technologies for driverless vehicles, which, after combining digital image processing technology and machine learning with deep learning, can complete the deep understanding of images, thus realizing the recognition, tracking, and understanding of various targets on the road [1]. Moreover, the use of computer vision combined with artificial intelligence, the Internet, and other technologies can greatly avoid human safety problems that exist in intelligent transportation systems (ITSs) [2], thus perfecting advanced driver assistance systems (ADASs). For example, Tesla, an automobile brand, is committed to the development of driverless technology using pure vision to upgrade vehicles from semi-autonomous driving to fully autonomous driving (FSD) by optimizing hardware and other aspects [3,4,5,6].

In addition to computer vision, radar-based environmental sensing techniques are also widely used. Driverless vehicles use radar technology to measure key information such as distance, speed, direction, and shape of the target [7,8,9,10]. In China, companies represented by Huawei have conducted in-depth research in radar perception technology to form a unique smart driving system.

In addition to externally sensing the environment, driverless vehicles also need to internally sense the vehicle state. State sensing relies on a variety of sensor sensing fits, e.g., global positioning system (GPS) and inertial measurement unit (IMU) [11], to enable the collection of speed, position, orientation, and other relevant information to ensure safe and efficient operation.

A single source of information cannot provide a reliable decision control basis for driverless vehicles. Comprehensive decision-making data based on multi-source sensing fusion can effectively avoid the vehicle’s misjudgment of the actual environment and improve reliability and travel efficiency [12].

In this paper, we will introduce the application practice and theoretical innovation of the corresponding methods from the above four aspects, analyze the general research in this area in the current society, summarize the advantages and disadvantages of various techniques, and look forward to the future development trend.

2. Visual Perception Technology

Visual perception techniques mainly include target detection and recognition, target tracking, and scene understanding [13,14,15]. Target detection and recognition refers to finding the target of interest from the image, such as vehicles, pedestrians, road signs, etc. [16,17,18,19]. Based on the detection of the target, the motion trajectory of the target is tracked to realize target tracking in order to understand the motion state of the target. At the same time, the driving environment of the vehicle often has complexity, such as the scene of the target, road, traffic signs, and other intertwined information, which requires each part to be analyzed individually and integrate decision making in order to provide a basis for the decision making and control of the vehicle.

Computer vision-based perception technology realizes the functions of image acquisition and analysis, matching prediction, etc., to complete the detection of the desired target. As shown in Figure 1, the detection process can be divided into six parts.

2.1. Computer Vision

As computer vision continues to evolve, it is used in a variety of fields, with the field of driverless technology benefiting greatly from its development. In a nutshell, computer vision can be categorized into two types, one for digital image processing techniques and the other for learning-based machine vision.

2.1.1. Digital Imaging Technology

Digital image technique is the basic method of computer vision, and target detection for driverless vehicles also depends on this technique. This technique realizes image modification and feature extraction through the steps of image acquisition, preprocessing, feature extraction, image enhancement, image recovery, image compression, image segmentation, target recognition, and image display [20,21].

2.1.2. Machine Learning and Deep Learning

With the development of image processing technology, pure digital image processing is oriented towards updating and elimination, and more options are machine learning and deep learning [22,23]. These two methods are targeted for target detection, which improves the recognition accuracy and greatly reduces the recognition time.

Machine learning

As a branch of artificial intelligence, the main goal of machine learning is to allow computer systems to automatically learn and adapt based on input data for prediction, decision making, and task execution. It usually relies on the construction and selection of appropriate feature extractors and models to learn rules from input data, e.g., linear models, decision trees, and support vector machines [24].

2.: Deep learning

Deep learning focuses on the use of deep neural network models, which consist of multiple neural network layers that automatically learn and classify abstract features in data.

Deep learning is a new research direction in machine learning, which concerns learning the intrinsic laws and representation levels of sample data. As shown in Figure 2, the structure is a fully connected structure model in deep learning, each circle represents a neuron, and each line represents the output of the previous layer and the input of the next layer, which is divided into three layers, including the input layer, hidden layer, and output layer. Each input of each neuron has its own weight

w_{i}

, and the neuron needs to carry out the summation operation according to the weight of the input and gradually categorize the feature information until the condition is satisfied and output. In the figure, six inputs are divided into two categories as an example; the width of the hidden layer is seven layers, and the length is arbitrary. Generally, the width and length of the hidden layer need to be moderate, as too wide and too long will lead to overfitting, and vice versa will lead to underfitting.

The development of deep learning has promoted the progress of driverless technology, and convolutional neural network (CNN), deep neural network (DNN), recurrent neural network (RNN), etc., have become the main methods of video processing [25,26]. Compared with machine learning, deep learning has a stronger learning ability and lower requirements for data preprocessing, and its image processing ability is excellent after learning based on a large amount of data, which is very advantageous in target detection, collision warning, and lane keeping [27,28,29].

2.2. Detection Technology

Both of these types of computer vision are used in driverless technology, with machine vision applications being particularly widespread, with applications including lane detection, vehicle detection, pedestrian detection, and traffic sign detection.

2.2.1. Lane Detection

Lane detection for driverless vehicles refers to the use of sensors and algorithms to determine the location of the lane in which the vehicle is currently traveling. Based on the condition of accurately detecting the lane, the vehicle can make correct driving decisions to avoid deviating from the lane or colliding with other vehicles.

Lane detection

Validated new algorithms often have better detection results [30]. Ge et al. [31] proposed a lightweight lane detection method based on an improved multi-attention mechanism, which firstly uses a cubic polynomial to approximate the shape of lane lines. Secondly, the cubic polynomial is converted to a “bird’s eye view” by considering the ruggedness of the lane. Finally, the camera is tilted to take into account the bumps during real driving, so a variable tilt angle

θ

is introduced, and the final lane line approximation model is shown as follows:

\{\begin{matrix} x^{'} = \frac{α_{1}^{'}}{(y^{'} - f^{″})} + \frac{α_{2}^{'}}{(y^{'} - f^{″})} + α_{3}^{'} + b^{'} y^{'} - b^{″} \\ f^{″} = f \sin θ \\ α_{1}^{'} = α_{1} \cos^{2} θ \times \frac{H^{2}}{f_{x}} \times f_{y}^{2} \\ α_{2}^{'} = α_{2} \cos θ \times \frac{H}{f_{x}} \times f_{y} \\ α_{3}^{'} = \frac{α_{3}}{f_{x}} \\ b^{'} = b \times \cos θ \frac{f_{y}}{f_{x}} \times H \\ b^{″} = b \times \cos θ \frac{f \tan θ}{f_{x}} \times f_{y} \times H \end{matrix}

(1)

where

x

is the horizontal coordinate of the plane pixel point;

y

is the vertical coordinate of the plane pixel point;

f

is the focal length of the camera;

α_{1}

,

α_{2}

, and

α_{3}

are polynomial coefficients; and

H

is the height of the camera erection.

In order to calculate the match between the predicted lane line parameters and the real lane line parameters, the Hungarian fitting loss is used to perform a two-part match between the predicted lane line parameters and the real lane lines, and the matching results are used to optimize the regression loss for a specific lane. Finally, the regression loss function is defined as:

L = \sum_{i = 0}^{M} - μ_{1} \log g (c_{i}) + Z (c_{i} = 1) μ_{2} L_{m a e} (S_{i}, S_{\overset{\land}{l} (i)})

(2)

where

M

is the number of lanes on the road,

g (c_{i})

is the probability of category

c_{i}

;

S_{\hat{l} (i)}

is the sequence of fitted lane lines;

μ_{1}

and

μ_{2}

are the coefficients of the loss function;

L_{m a e}

is the mean absolute value error; and

Z ()

stands for the indicator function.

Combining the above lane detection methods with CNN and multi-head self-attention (MHSA) can effectively capture the contextual information of lane lines and obtain richer lane line detail information. The video frames collected by the camera contain many cluttering factors, and the image processing efficiency has a direct impact on lane line detection, so it becomes an important means of eliminating irrelevant information in the image [32,33,34,35].

Lane detection can be efficiently calibrated by incorporating a global positioning system (GPS), which was utilized by Xu et al. [36] to achieve calibration of the lane detection results, and the vehicle achieves freedom from the severe limitations of the initial position error, the number of lanes, and the road smoothness.

Since roads can be categorized into structured, semi-structured, and other roads and there are problems such as incomplete lane lines and lane line wear, solving the problem through the idea of feature extractions is one of the most effective methods. In addition, lane line wear can be used for robust lane extraction by enhancing the contrast of the lane region using contrast significance model [37].

From the above, it can be seen that the lane detection process is always continuous, the scene is constantly changing, and various environmental factors need to be considered in the realization. Strengthening the detection effect can start from the detection algorithm and multimodality in order to improve the lane level semantic understanding of the detection system, and then can be combined with other sensors and detection technology to realize fast and accurate detection in a wide range of scenes.

2.: Lane maintenance

The purpose of lane detection is lane keeping. Achieving lane keeping eliminates the need for real-time direction correction behavior by the driver. Vehicles need to predict the next moment’s lane situation in real time while maintaining the lane.

Regarding lane keeping as an important component of lateral control system, its efficient keeping ability and fault tolerance are the prerequisite guarantees for safe traveling. The lane keeping system based on the deep learning end-to-end algorithm has practical application prospects [38,39,40,41], which is designed with a complete development process including data acquisition and processing, model design, cloud training, vehicle deployment and testing, and ensures the lane keeping ability when the vehicle is traveling through systematic and large amount of data learning.

To improve the lane keeping effect, Yang et al. [42] proposed an end-to-end convolutional neural network controller training framework that collects samples from the virtual world and generalizes the training model to the real world. The framework solves the problems of long-time small data volume and low efficiency caused by manual collection of training data, and proposes to collect samples from the virtual world for training and to apply the training results to reality to predict the road change trend.

Although deep learning-based strategies are highly adaptive, it is difficult for a lane keeping system (LKS) to work properly in the case of discontinuous or missing sensory information. The system fails when the driver needs to take over the vehicle in a very short period of time, which may cause driving safety issues. Kang et al. [43] proposed fault-tolerant LKS for LKS in case of camera failure or data loss. The kinematic model-based lane prediction is the core of LKS, which does not need to parameterize the vehicle in detail but only needs to keep a handle on the body states such as inertia. This lateral control system can also predict lane information and synchronize it with the control sampling time, which effectively extends the fault tolerance time and driver reaction time.

Overall, lane keeping requires further detection and control based on lane detection, prediction of the lane direction at the next moment, and tracking in combination with a controller. Improving the algorithm and ensuring the quality of the data can both improve the corresponding sensing and control capabilities.

2.2.2. Vehicle Detection

The lane is the factor that determines the direction in which the vehicle is traveling, and the other vehicles in the lane are the factors that determine the next moment of the vehicle’s action. Effectively detecting other vehicles and planning to avoid them while the vehicle is moving is a prerequisite for driverless driving. To realize the accurate detection of vehicles, the deep learning method is one of the most effective means [44,45], as it has strong robustness and adaptability to cope with unknown lanes that are in change.

Gu et al. [46] proposed a new network method to address the problem of relatively low accuracy of 3D envelope detection based on monocular vehicle detection. First, an improved feature pyramid network (FPN) is proposed, which fuses the information of different layers, passes the deep high-semantic information to the next layer, and then passes the high-resolution information of the bottom and sub-bottom layers to the upper layer, with complementary advantages. Secondly, the improved residual network (ResNet) is proposed to better solve the gradient vanishing and gradient explosion problems in deep neural network training. The residual unit can be expressed as:

\{\begin{matrix} y_{l} = E L U (x_{l}) + F (x_{l}, w_{l}) \\ x_{l + 1} = y_{l} \end{matrix}

(3)

where

x_{l}

and

x_{l + 1}

are the input and output of the

l

th residual unit;

F ()

is the residual function; and the

E L U

expression is shown as follows:

f (x) = \{\begin{matrix} x, x > 0 \\ a (\exp (x) - 1), x < 0 \end{matrix}

(4)

where

a

is a coefficient, and changing the magnitude of its value can affect the bias offset of the activation unit of the next layer, which generally takes the value of 1. Finally, an improved fully connected layer is proposed, which leads to an effective improvement in the detection performance of the network.

The method optimizes the detection algorithm through a three-part improvement that improves accuracy without increasing network training time.

In practical application scenarios, vehicle detection results may originate from multiple camera angles. Nguyen et al. [47] proposed an efficient multi-target multi-camera tracking (MTMCT) application that builds features graphically and customizes the graphical similarity to match vehicle objects from different cameras, classifies the usage environment into two states online and offline, and selects different image matching sources based on the different environments in order to improve the overall detection and recognition capability.

In vehicle detection, both algorithm and hardware optimization are important, but the most critical is to extract vehicle features from sensory data, such as vehicle shape, size, motion pattern, and texture. By accurately extracting and analyzing these features, accurate vehicle detection and tracking can be achieved to support driverless driving.

2.2.3. Pedestrian Detection

Pedestrians alongside the lane are an unstable factor compared with other vehicles in the lane. During driverless driving, pedestrian detection can help vehicles recognize pedestrians around them and predict their future movements so that appropriate safety measures can be taken.

Pedestrian detection

Unlike object detection, from a personnel safety-centered perspective, information about the person being detected should have greater confidence in control decisions.

In order to improve the efficiency of vehicle recognition of pedestrians, Wang and Wang [48] proposed to use depth-separable convolution to replace the ordinary convolution algorithm of the YOLOv5 algorithm, which improves the detection efficiency of the model and adds channel attention and spatial attention to the feature fusion mechanism to make the network more focused on pedestrian features. Since the detection results are susceptible to the influence of the environment, the algorithm improvement should not only focus on pedestrian features but also address environmental factor interferences [49].

In addition to improvement of the algorithm, dimensionality reduction of the image data can be performed to reduce the computational complexity of the convolutional neural network algorithm in order to improve the quality of recognition [50,51].

2.: Pedestrian behavior prediction

In a realistic scenario, pedestrians, as the main body of the activity, have their behaviors affected by many factors, such as the motion state of the detected person, the direction of the target, gender, and age, etc. [52]. After effective pedestrian feature recognition, it is very important to determine the trajectory of the next moment, and the vehicle needs to make corresponding actions based on this state.

Most pedestrian behavior prediction schemes ignore the diversity of use scenarios, such as only targeting individual factors and a small number of detection targets for pedestrian-generated actions. Human actions are often directly linked to the environment, and in order to achieve an accurate recognition of pedestrian intentions, we can use data that incorporate scene conditioning factors as a dataset for deep learning [53], and achieve efficient prediction by learning the interactive behaviors of pedestrians and the environment.

The combined judgment of vehicle trajectory and pedestrian trajectory is a good approach, which will directly avoid accidents and greatly improve the safety of pedestrians [54,55]. This method often needs to solve the many-to-many problem, i.e., multiple vehicles versus multiple pedestrians. Zhou et al. [56] proposed a social interaction force (SIF) to identify and quantify social interaction behavior, which combines the pedestrian–vehicle interaction data, the human–human interaction data, and the road feature data, and recreates real scenarios from multiple perspectives and achieves efficient prediction.

Pedestrian behavior prediction enables vehicles to realize first judgment first operation, effectively avoiding road safety accidents.

2.2.4. Traffic Sign Detection

Accurate identification of traffic signs is one of the basic requirements for driverless cars. There are many types of traffic signs, and some of them have similarities, which poses a great challenge for traffic sign detection and classification. To improve the recognition efficiency, model optimization is an important means.

To solve the classification efficiency problem, Kim et al. [57] proposed a new traffic sign recognition model based on angular margin loss, which optimizes the necessary hyperparameters for angular margin loss through Bayesian optimization to maximize the effectiveness of the loss and achieve a high level of classification performance. To address the problem of diverse traffic sign sizes, Liu et al. [58] embedded a finite deformation convolution module into a CNN layer to learn the distorted information representation used for deformation processing, and applied a scale-aware multitasking region suggestion network module to achieve the detection of traffic signs of various scales.

Recognition results are often affected by environmental factors. Xiao et al. [59] proposed an improved model based on YOLOv4-minor to address this problem. First, an efficient layer aggregation lightweight block is constructed by depth-separable convolution to enhance the feature extraction capability of the backbone. Second, a feature fusion refinement module aiming to fully fuse multi-scale features is proposed. The module incorporates efficient coordination of attention for refining interference information during feature delivery. Finally, an improved feature extraction module is proposed to incorporate contextual feature information into the network to further enhance the accuracy of traffic sign detection. The method not only improves the attention to traffic signs but also refines the environmental interference factors, which improves the recognition efficiency from two aspects.

In general, the core of traffic sign recognition is the classification of a large number of signs, reducing recognition errors and shortening recognition time as the main development direction.

3. Radar Sensing Technology

The three main types of radar technology include millimeter-wave radar, laser radar, and ultrasonic radar. The three types of radar use millimeter waves, laser beams, and ultrasound as detection signals, respectively.

As shown in Figure 3, the three have similar realization steps in terms of principle. The relevant information of the detected target is calculated by the information of time difference and speed of the signal.

3.1. Millimeter-Wave Radar

In automotive radar, millimeter-wave radar is widely used because it has a long detection range and is subject to few environmental factors.

Millimeter-wave radar characteristics

The electromagnetic wave wavelength of millimeter-wave radar is at the millimeter level, and the transmitter transmits a signal whose frequency varies with time, which is reflected by the target and received by the receiver, and the frequency difference between the two signals is derived by mixing between the reflected signal and the received signal, and then the target distance and speed are calculated by the electromagnetic propagation formula and the Doppler effect formula.

Millimeter-wave radar has a longer detection range compared with ultrasonic radar, but it is also therefore subject to the effects of weather, environment, and obstacles, which may result in weakened reflected signals or loss of information. Based on the characteristics of millimeter-wave radar, it has a broad application prospect and advantages in unmanned driving conditions [60].

Millimeter-wave radar has the characteristics of electromagnetic wave beams, so it also has the limitations of the beam. Improving the efficiency of millimeter-wave radar can start from the working principle of speed, angle, range, and radar scattering cross sectional area, and from the bottom to understand the resolution of radar measurement, accuracy error, and the reason of false alarm [61].

2.: Millimeter-wave radar applications

Aiming at the vehicle detection problem, Hu and Zhao [62] proposed a vehicle detection method based on a multi-hypothesis tracking model. The method defines the correspondence between the millimeter-wave radar measurement set and the target set, extracts the valid targets in the measurement set using a generalized probabilistic data association algorithm, thus obtaining the valid target set, and maintains the detected target set by estimating the occurrence probability of the target using a probability tree model. Its probability at the moment t is as follows:

P (t) = \{\begin{matrix} \partial p \\ 1 \end{matrix}

(5)

If each element of the target set has a probability, the detection target is maintained by examining each probability. For a new target set, the initial probability as

p (0) = 1

and the probability at moment

t - 1

as

P (t - 1) = p

. Equation (5) shows the probability at moment

t

. When the target is detected, the probability is 1 or

\partial p

, where

\partial

is the coefficient. A threshold K can be set when judging based on the probability, which is greater than K to judge as the existence of the target, and less than K as the non-existence. This method can effectively solve the problem of detection stability due to radar omission or detection target not within detection range.

For multi-target detection problems, multiple algorithms can be combined to optimize the computational efficiency of the model and reduce the complexity [63].

Scene object blurring is unavoidable under vehicle motion conditions, which poses a difficult problem for high-precision imaging. This requires the reduction of motion error, which is compensated by improving vehicle stability or algorithm optimization [64].

The application of millimeter-wave radar should be tailored to the use scenario, taking into account the errors associated with its long-range measurements.

3.2. Lidar

Compared with millimeter-wave radar, lidar has the advantages of high detection accuracy, high resolution, and strong anti-interference ability, which plays a decisive role in vehicle environment detection and modeling.

Lidar characteristics

Similar to other radars, lidar consists of a transmitter and a receiver, but the signals it transmits and receives are laser signals. When lidar is activated, it emits a laser beam to scan objects within its field of view, and then calculates the distance based on the time difference between the emitted and received signals.

Compared with millimeter-wave radar and ultrasonic radar, lidar has higher resolution but only detects objects in the direct line of sight. With sufficient laser point clouds, it has excellent 3D information-gathering capabilities to meet the requirements of real-time vehicle mapping. As shown in Figure 4, the lidar is usually placed on the roof of autonomous vehicles to facilitate the collection of 3D environmental information.

With the information collected by the 3D laser point cloud, functions such as scene map construction, target detection, and target tracking can be effectively realized [65,66]. As shown in Table 1, lidar can be categorized into three types, which can be selected according to different needs.

2.: Lidar applications

To improve the localization capability of driverless vehicles, Jeong et al. [75] proposed a reliable lidar mapping and localization scheme using lidar for ranging and mapping and matching the environment of target objects. Since the scheme does not rely on GPS, the vehicle is not affected by the delay of GPS signals during traveling, which greatly improves travel efficiency.

In roadblock detection, lidar is more widely used. Hu J et al. [76] proposed a random sampling consistent-based road obstacle target position detection algorithm to detect the road obstacle target, construct the box model of the target obstacle, and output the position information of the box model.

During vehicle travel, the obstacles on the road are mainly vehicles and pedestrians, and the point cloud features detected by the lidar are mainly on the side rather than perpendicular to the ground, but they can be approximated as perpendicular to the ground. This allows the target point cloud to be projected towards the ground to identify the target object attitude. Determining the attitude direction requires fitting the point cloud data of the target object, selecting the long side of the point cloud distribution to fit, and characterizing the direction of the fitted straight line as the object direction.

As shown in Figure 5,

L

is the fitted straight line for detecting the laser point cloud, and the straight line formula is

A x + B y + C = 0

, reflecting the pose of the detected target.

Select the vehicle coordinate system in which the

X

direction is forward and the

Y

direction is to the right, the azimuth angle

θ

is the angle between the straight line

L

and the

X

direction, and the clockwise direction is positive, and the azimuth angle

θ

of the target object is calculated according to the straight line formula, i.e.,

θ

denotes the attitude angle of the target object in the vehicle coordinate system, and the formula is as follows:

θ = \arctan (- A / B)

(6)

After the point cloud is transformed into coordinates, the size of the target object can be calculated, and the center position can be calculated based on the size, thus realizing the position detection of the detection target.

Lidar has high environmental sensing ability, but it needs to ensure the real-time nature of its work. In order to improve the recognition efficiency, the data can be classified first and then clustered to learn, and the vehicle computational pressure can be reduced through the preprocessing of data [77].

In terms of target tracking, Xiong et al. [78] proposed a target association and adaptive survival cycle management strategy based on the border intersection over union (BIoU) metric, which solves the problem of insufficient target association and fixed survival cycle management. The problem of data loss in lidar acquisition can be effectively avoided using this method. The loss of data frames is often due to the weak correlation between the front and back data frames, and the analysis method of two consecutive frames can be used to correlate the information of the two frames to realize dynamic analysis and dynamic tracking [79].

Although lidar is suitable for high-precision environmental sensing and 3D mapping, it requires better computerized data analysis capabilities, and its use requires comprehensive consideration of needs.

3.3. Ultrasonic Radar

Ultrasonic radar is a good choice, considering that vehicles do not need either the long range of millimeter-wave radar or the high resolution of lidar in some inspection scenarios.

Ultrasonic radar characteristics

Ultrasonic radar also consists of a transmitter and a receiver. The transmitter sends out ultrasonic waves and the receiver receives the reflected waves. The distance is measured by the time difference between the transmitted and reflected waves.

Compared with millimeter wave, ultrasonic wave can be reflected on any material surface and has low energy consumption, which greatly expands the detection scene, but it also has the disadvantage of short detection distance. Based on the characteristics of ultrasonic radar, it is usually used in vehicle reversing warning or vehicle navigation.

Due to the physical properties of sound waves, the radar is highly susceptible to environmental influences when working. In order to improve the working range of ultrasonic radar, we can start by increasing the radar power and improving the sensitivity of the receiver module to improve the monitoring range, accuracy, and stability, and solve the problem of the narrow measuring range of vehicle reversing radar [80].

2.: Ultrasonic radar applications

Yang et al. [81] proposed the use of narrow beam angle ultrasonic judgment near the side of the body to detect obstacles and warnings in the study of automatic parking perception. This, combined with vision sensors, not only greatly improves the accuracy of obstacle detection, but also the control part of the control can be realized through the data to achieve efficient collision avoidance.

Based on the characteristics of ultrasonic radar, it can also assist in the realization of vehicle navigation, such as autonomous guided vehicles (AGVs) that have high accuracy requirements for navigation. For this reason, Zhang et al. [82] proposed the use of ultrasonic radar to realize the autonomous navigation of AGVs. In the simulation, multiplicative and additive noise are added as interference, and the size of the noise parameter is set by the calibration of the filter consistency through the consistency detection method, and the selected navigation noise parameter can satisfy the autonomous navigation movement under the AGV lidar navigation mode.

The application prospect of ultrasonic radar in driverless vehicles is broad, and, based on the specificity of the application scenario, the use of ultrasonic radar can realize low cost and high efficiency.

3.4. Multiple Radar Combinations

In driverless technology, object detection and obstacle avoidance are two functions that are in great demand. These functions need to be realized by relying on radar, and it is difficult to achieve the detection effect by using a single type of radar.

To solve the problem of lidar’s inaccurate detection of small objects, Wiseman [83] proposed the use of ultrasonic rangefinders to compensate for this problem. Ultrasonic rangefinders can detect objects up to 300 feet and are still effective at detecting small objects, compensating for the shortcomings of lidar. The combination of the two radars gives the vehicle more accurate detection and a longer detection range. Similarly, ultrasonic radar can assist lidar to realize obstacle avoidance and navigation, reducing the probability of body collision with surrounding objects [84].

In addition to the combination of ultrasonic radar and lidar, other radar combinations can also solve the corresponding problems; for example, combining the detection characteristics of millimeter-wave radar, lidar can realize the environmental detection under rainy and foggy weather. Therefore, to realize better environmental detection, multi-radar combination is a better strategy.

4. State Perception Technology

Condition-aware technology obtains various parameters and state information of the vehicle in real time through various sensors and monitoring systems, and provides accurate, reliable, and fast motion state feedback signals for the control system [85], which can help drivers better understand the operation of the vehicle, discover potential faults and problems in time, and improve the safety and reliability of driving.

As shown in Figure 6, the state-aware process can be divided into six parts, and the final output is a prerequisite for decision control of the driverless vehicle.

State perception is the overall perception of the driverless vehicle of the scene in which it is located. In addition to the visual perception technology and radar perception technology mentioned above, it is very important to rely on the collection of information from other sensing.

4.1. Structured Environment

Driverless vehicles traveling in structured environments can achieve efficient planning and control due to good roads with guide lines, traffic signs, and a priori road environment information. Reducing or eliminating the influence of other environmental factors will greatly guarantee driving safety under the premise of ensuring effective feedback information from the structured environment.

When vehicles are in, e.g., low light or limited vision, or obstacles are difficult to recognize, accurate state perception is crucial to ensure driving safety, which requires further improvement of the perception ability. By reading infrared images and vehicle moving speed, the direction of movement and movement intention of surrounding vehicles can be judged [86].

In structured roads, due to the variety of road information and the influence of unstructured factors, which puts pressure on vehicle state assessment, it is necessary to fuse multiple factors to determine driving lanes [87]. For road factor safety assessment, Cheng et al. [88] proposed a vehicle behavior safety assessment method. Firstly, accident-generating factors are derived based on a large number of accident cause analyses. Secondly, multi-sensors are utilized to perceive information such as vehicle state and road attributes. Finally, the vehicle behavior is judged based on the acquired information, and the safety assessment is performed.

The overall perception of the vehicle can also be achieved by integrating sensors that are matched to the environment for efficient control [89]. Driverless technology based on good infrastructure continues to progress, but there is still much room for development in various aspects, such as variable environments and safety.

4.2. Unstructured Environment

Contrary to the structured environment, unstructured environments mainly refer to vehicle driving environments without traffic signs, where the environmental information and vehicle state information collected while driving do not rely on cloud data or edge facilities, and where only the positioning system and vehicle sensors provide the basis for behavioral decisions. This means that vehicles cannot rely on pre-established maps or a priori knowledge for navigation and behavioral decisions when encountering such environments.

Facing the unstructured environment, vehicle sensing schemes still need to be based on multi-source sensing data. For example, the more applied inertial guidance/guard guidance combination navigation scheme, which integrates IMU, GNSS, etc., can realize the overall perception of vehicle position information, body state, environment, and navigation planning [90,91].

Simultaneous localization and map building (SLAM), as a typical perception method, which mainly relies on sensing data to localize the vehicle and simultaneously build the map, can be the main perception means for unmanned vehicles under the condition of higher perception accuracy. Aiming at the commonly used algorithms for simultaneous localization and map building for driverless vehicles, Demim et al. [92] proposed a new alternative scheme for the SLAM problem for driverless vehicles. The scheme utilizes odometer data and laser data to achieve the construction of an environment map and localization in the map. The problems of EKF-SLAM, which requires accurate observation models, and FAST-SLAM, which is insufficient in real time, are solved.

For unstructured environments, the focus is on the construction of unknown environments, and vehicles need to make judgments based on real-time constructed environment maps and collected environment information.

5. Information Fusion Technology

Information fusion technology is the key to vehicle environment perception, which integrates the data obtained from the above and other methods to provide a strong basis for the vehicle’s behavioral decision making. As shown in Figure 7, taking Tesla as an example, in order to realize accurate environment perception, the vehicle fuses camera and radar, which shows that multi-source heterogeneous information fusion (MSHIF) is the core of information fusion technology. The higher the information fusion capability, the better the driverless vehicle can understand the road environment, identify obstacles, predict behavior, and make accurate decisions. In particular, the data obtained from state sensing must be comprehensively analyzed to derive the current state of the vehicle, such as speed, acceleration, angular acceleration, etc., which are directly related to the safety of the occupants.

Information fusion technology is the core part of vehicle state perception, which is mainly manifested in data fusion and transmission. Advanced fusion technology can effectively solve the perception problems caused by the lack of sensor accuracy. As shown in Figure 8, information fusion technology can be realized in five steps.

5.1. Data Fusion

Data fusion methods include high-level fusion (HLF), low-level fusion (LLF), and mid-level fusion (MLF). The three methods are slightly different and are classified according to the degree of pre-processing of the raw sensor counts; no processing is low-level fusion, feature extraction is intermediate-level fusion, and processing and analyzing are high-level fusion [93].

Data fusion algorithms are statistical, probabilistic, knowledge-based, evidential reasoning, and interval analysis methods [94]. These algorithms enable the extraction of key information and reduce the fusion burden.

Aiming at the problem of longitudinal speed failure of vehicles, Zhang et al. [95] proposed a longitudinal speed estimation method integrating lidar and IMU, which reduces the model error and suppresses the filter dispersion in order to improve the robustness of the algorithm and the filtering accuracy. The information of the two sensing data is combined to provide a feedback basis for the longitudinal control of the vehicle, which ensures that the average error of the vehicle is 0.113

m / s

under medium- and high-speed conditions.

Enhancing the positioning accuracy often depends on a larger number of information sources [96]. Wang et al. [97] proposed an accurate GPS-IMU/dead reckoning (DR) data fusion method based on a set of predictive models and occupancy grid constraints. The method reduces the cumulative position error of DR by fusing data from multi-source sensors and reduces the computational complexity while greatly improving the positioning accuracy due to the optimized data fusion algorithm.

Vehicle-to-infrastructure (V2I) communication technology enables computational edging by storing data at edge facilities. As shown in Figure 9, driverless vehicles use V2I for obtaining real-time HD maps to support localization [98]. First, the map data are transmitted to the vehicle near the infrastructure. Second, the vehicle acquires the 3D map resource and then needs to combine the algorithm to match the real-time information from the lidar feedback. Finally, the accuracy of the radar data is judged based on the matching results, thus enabling localization without relying on cloud-based map data.

For the information fusion problem in ITS, researchers have proposed many information fusion schemes, such as discriminable unit-based fusion, complementary feature-based fusion, attribute-based fusion, and multi-source decision-based fusion [99].

5.2. Data Transmission

The fused data need to be transmitted efficiently, so data transmission is also a key factor in the operation of driverless vehicles.

The development of data transfer strategies for driverless technologies involves communication and data transfer between self-driving vehicles and between vehicles and surrounding infrastructure. Currently, solving the problem of inefficient data transmission in the cloud can be divided into two directions. One is to optimize the data and transmission network; the other is to marginalize the data and rely on the infrastructure around the vehicle to provide vehicle decision-making information. In the data transmission process, artificial intelligence and deep learning play an important role [100].

In the case of better road information, the efficiency of information transfer is the main factor that affects the safety of traveling. Sun et al. [101] proposed a state-aware H∞ event-triggered path-tracking control strategy to solve the path-tracking control problem of unmanned vehicles under communication constraints. The method is able to dynamically adjust the communication threshold according to the measured state of the control system, effectively realizing the adaptive co-design of communication and control for autonomous vehicles.

In order to enhance the data interaction capability of unmanned vehicles, Lv et al. [102] proposed a transmission load optimization method for video sensing data. First, the video frames captured by the roadside facilities are sent to the foreground and background separation module. Second, the separated foreground and background are transmitted to the environment construction module. Finally, the environment perception structure is formed and the perception results are given. The control module makes timely and correct control actions based on the efficiently transmitted data to ensure safe driving.

As an important part of ITS, V2I allows driverless vehicles to greatly enhance their perception by virtue of the data coming from the infrastructure. Noh et al. [103] proposed that, when a driverless vehicle passes near the infrastructure, it provides object perception information through broadcasting, and the vehicle expands its visual perception ability based on the feedback information. Through experimental verification, V2I technology not only effectively solves the data transmission problem, but also greatly improves the perception ability.

The development of data transmission technology has facilitated the advancement of data fusion, and the sensory data of the vehicle need to be supported by data transmission, whether it is fused first or later.

6. Conclusions

This paper outlines the current development status of key technologies for environment sensing in driverless vehicles. In general, these technologies have practical application scenario requirements but still have some limitations and need to be improved in the future.

In computer vision, digital image technology is more limited; for example, the application of this technology requires each frame collected to be analyzed individually, which requires vehicles to have stronger data processing capabilities. The realization of the program of machine vision requires a large amount of training work in the early stage, which includes data collection, data processing, and data training, and wrong and non-essential data sets will make the training results deviate from the expected. Radar technology often has limitations due to its own physical properties. During the use of millimeter-wave radar, problems such as target omission, target misdetection, and false targets usually occur, which can lead to misjudgment of the environmental state by driverless vehicles and cause driving hazards. Similarly, lidar has an obvious limitation in that it can only directly detect the surface of objects in the line of sight and does not have the ability to penetrate. Due to the large scattering angle of ultrasonic waves, low propagation speed, and large propagation energy loss, its detection distance is short, so it cannot play a role in the detection of the vehicle in middle and long distances. The state perception of the vehicle relies on the reliability of the sensing data, which requires the sensors to work stably, and in the event of problems with the sensors, the results of the state perception cannot be guaranteed. As a key factor affecting the environmental detection of the unmanned vehicle, the information fusion method faces two problems. One is that, due to the large amount of computational data, sufficient computational resources are needed to maintain it, and the other is that the data are heterogeneous and need to rely on algorithms to achieve standardization.

Overall, the above-mentioned environment sensing methods need to be optimized for their own shortcomings, so that the field of unmanned vehicles can be greatly developed. In the face of future development, multiple perspectives are needed to solve the existing problems. The following analysis summarizes the conclusions and prospects of six aspects.

Multi-source sensor fusion

The application of a single sensor can hardly meet the demand of unmanned vehicles for sensing the complex environment, and the comprehensive data based on multi-source sensors can provide a reliable judgment basis for the decision making and control part. In order to maximize the accuracy of multi-source sensor perception, it is necessary to combine information fusion technology to give all sensor data a reasonable degree of confidence, and finally output the optimal comprehensive data.

2.: Machine learning and deep learning

The application of machine learning and deep learning will greatly improve the perception ability of vehicles, especially in vision-based target detection technology and data processing advantages. After a large amount of learning, the output perception data and decision-making scheme can be optimized.

3.: High-precision maps

At present, domestic vehicles with ADAS (L2 level) need to realize corresponding functions by virtue of high-precision maps, and L3 level self-driving vehicles that can be legally put on the road also need to rely on high-precision maps, so it is obvious that high-precision maps are of great importance. However, the use of high-precision maps also brings data transmission problems, which requires optimization of the data transmission scheme to achieve efficient map downloads or vigorously develop marginalized infrastructure construction.

4.: Intelligent sensors

Ordinary sensors do not have data processing capabilities and communication capabilities compared with smart sensors, which increases the burden on vehicle computing centers. The development of smart sensors is especially critical because driverless vehicles with higher degrees of intelligence require more computational processing power. The universal application of smart sensors can greatly save vehicle computing resources and improve the overall sensing capability.

5.: Computational marginalization

Marginalization in this context means processing some data locally on the car first, then sending only the results to the main computer. The universal application of computing marginalization needs to accelerate the promotion of the development of intelligent sensors and infrastructure in order to achieve the pre-processing of sensing data and the rapid transmission of data in the cloud.

6.: Vehicle cooperative sensing

In order to improve the sensing range of driverless vehicles, the sensing efficiency will be greatly improved with the help of sensing information from surrounding vehicles. Through the data sharing of vehicles traveling on the road, the vehicle receiving the data can realize the overall control of the situation in the region and make more accurate judgment by combining its own sensing data.

Author Contributions

Y.H.: conceptualization, software, validation, formal analysis, resources, visualization, writing—original draft preparation. C.Z.: methodology, investigation, supervision, funding acquisition, project administration, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of P. R. China (Grant No. 61563010) and the Guizhou Minzu University Scientific Research Fund Sponsored Programs (Grant No. GZMUZK[2023]YB06).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

Dilek, E.; Dener, M. Computer vision applications in intelligent transportation systems: A survey. Sensors 2023, 23, 2938. [Google Scholar] [CrossRef] [PubMed]
Suciu, D.A.; Dulf, E.H.; Kovács, L. Low-cost autonomous trains and safety systems implementation, using computer Vision. Acta Polytech. Hung. 2024, 21, 29–43. [Google Scholar] [CrossRef]
Talpes, E.; Sarma, D.D.; Venkataramanan, G.; Bannon, P.; McGee, B.; Floering, B.; Jalote, A.; Hsiong, C.; Arora, S.; Gorti, A.; et al. Compute solution for tesla’s full self-driving computer. IEEE Micro 2020, 40, 25–35. [Google Scholar] [CrossRef]
Hao, C.; Sarwari, A.; Jin, Z.; Abu-Haimed, H.; Sew, D.; Li, Y.; Liu, X.; Wu, B.; Fu, D.; Gu, J.; et al. A hybrid GPU+ FPGA system design for autonomous driving cars. In Proceedings of the 2019 IEEE International Workshop on Signal Processing Systems (SiPS), Nanjing, China, 5 March 2020. [Google Scholar]
HajiRassouliha, A.; Taberner, A.J.; Nash, M.P.; Nielsen, P.M. Suitability of recent hardware accelerators (DSPs, FPGAs, and GPUs) for computer vision and image processing algorithms. Signal Process. Image Commun. 2018, 68, 101–119. [Google Scholar] [CrossRef]
Cheng, Z.M. Analysis of tesla autopilot software system. Repair Maint. 2022, 33–35. [Google Scholar]
Wang, S.F.; Dai, X.; Xu, N.; Zhang, P.F. Overview on environment perception technology for unmanned ground vehicle. J. Chang. Univ. Sci. Technol. Nat. Sci. Ed. 2017, 40, 1–6. [Google Scholar]
Tian, Y.; Yang, H.; Hu, C.; Tian, J.D. Moving foreign object detection and track for electric vehicle wire-less charging based on millimeter-wave radar. Trans. China Electrotech. Soc. 2023, 38, 297–308. [Google Scholar]
Patole, S.M.; Torlak, M.; Wang, D.; Ali, M. Automotive radars: A review of signal processing techniques. IEEE Signal Process. Mag. 2017, 34, 22–35. [Google Scholar] [CrossRef]
Qiao, D.Y.; Yuan, W.Z.; Ren, Y. Review of mems lidar. Microelectron. Comput. 2023, 40, 41–49. [Google Scholar]
Yang, P.; Duan, D.; Chen, C.; Cheng, X.; Yang, L. Multi-sensor multi-vehicle (MSMV) localization and mo-bility tracking for autonomous driving. IEEE Trans. Veh. Technol. 2020, 69, 14355–14364. [Google Scholar] [CrossRef]
Wang, J.; Wu, Z.; Liang, Y.; Tang, J.; Chen, H. Perception methods for adverse weather based on vehicle infrastructure cooperation system: A review. Sensors 2024, 24, 374. [Google Scholar] [CrossRef] [PubMed]
Yin, H.P.; Chen, B.; Chai, Y.; Liu, Z.D. Vision-based object detection and tracking: A review. Acta Autom. Sin. 2016, 42, 1466–1489. [Google Scholar]
Hou, Z.Q.; Han, C.Z. A survey of visual tracking. Acta Autom. Sin. 2006, 32, 603–617. [Google Scholar]
Abbass, M.Y.; Kwon, K.C.; Kim, N.; Abdelwahab, S.A.; El-Samie, F.E.A.; Khalaf, A.A. A survey on online learning for visual tracking. Vis. Comput. 2021, 37, 993–1014. [Google Scholar] [CrossRef]
Huang, Z.; Wang, Y.C.; Li, D.Y. A survey of 3D object detection algorithms. Chin. J. Intell. Sci. Technol. 2023, 5, 7–31. [Google Scholar]
Li, M.X.; Lin, Z.K.; Qu, Y. Survey of vehicle object detection algorithm in computer vision. Comput. Eng. Appl. 2019, 55, 20–28. [Google Scholar]
Chen, Z.H.; Li, A.J.; Qiu, X.Y.; Yuan, W.C.; Ge, Q.Y. Survey of environment visual perception for intelligent vehicle and its supporting key technologies. J. Hebei Univ. Sci. Technol. 2019, 40, 15–23. [Google Scholar]
Ranft, B.; Stiller, C. The role of machine vision for intelligent vehicles. IEEE Trans. Intell. Veh. 2016, 1, 8–19. [Google Scholar] [CrossRef]
Burger, W.; Burge, M.J. Digital Image Processing: An Algorithmic Introduction; Springer Nature: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
Solomon, C.; Breckon, T. Fundamentals of Digital Image Processing: A Practical Approach with Examples in Matlab; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
Geethika, C.; Sindhuja, A.; Prasanna, M.L. A survey-machine learning techniques in self-driving cars. Adv. Appl. Math. Sci. 2021, 20, 2787–2793. [Google Scholar]
Grigorescu, S.; Trasnea, B.; Cocias, T.; Macesanu, G. A survey of deep learning techniques for autonomous driving. J. Field Robot. 2020, 37, 362–386. [Google Scholar] [CrossRef]
Su, J.S.; Zhang, B.F.; Xu, X. Advances in machine learning based text categorization. J. Softw. 2006, 17, 1848–1859. [Google Scholar] [CrossRef]
Sharma, V.; Gupta, M.; Kumar, A.; Mishra, D. Video processing using deep learning techniques: A systematic literature review. IEEE Access 2021, 9, 139489–139507. [Google Scholar] [CrossRef]
Hoque, S.; Xu, S.; Maiti, A.; Wei, Y.; Arafat, M.Y. Deep learning for 6D pose estimation of objects—A case study for autonomous driving. Expert Syst. Appl. 2023, 223, 119838. [Google Scholar] [CrossRef]
Wang, X.; Zhang, W.; Wu, X.; Xiao, L.; Qian, Y.; Fang, Z. Real-time vehicle type classification with deep convolutional neural networks. J. Real-Time Image Process. 2019, 16, 5–14. [Google Scholar] [CrossRef]
Rill, R.A.; Faragó, K.B. Collision avoidance using deep learning-based monocular vision. SN Comput. Sci. 2021, 2, 375. [Google Scholar] [CrossRef]
Zhao, X.; Wang, L.; Zhang, Y.; Han, X.; Deveci, M.; Parmar, M. A review of convolutional neural networks in computer vision. Artif. Intell. Rev. 2024, 57, 99. [Google Scholar] [CrossRef]
Hu, X.X.; Li, X.M.; Bai, Y.Y.; Li, R. Research on driverless lane line detection. Electron. Des. Eng. 2020, 28, 118–121, 126. [Google Scholar]
Ge, Z.K.; Tao, F.Z.; Fu, Z.M.; Song, S.Z. Lane detection method based on improved multi-head self-attention. Comput. Eng. Appl. 2024, 60, 264–271. [Google Scholar]
Han, G.F.; Li, X.M.; Wu, X. Research of lane line detection in the vision navigation of unmanned vehicle. Fire Control Command Control 2015, 40, 152–154. [Google Scholar]
Wang, Y.; Dahnoun, N.; Achim, A. A novel system for robust lane detection and tracking. Signal Process. 2012, 92, 319–334. [Google Scholar] [CrossRef]
Piao, J.; Shin, H. Robust hypothesis generation method using binary blob analysis for multi-lane detection. IET Image Process. 2017, 11, 1210–1218. [Google Scholar] [CrossRef]
Bar Hillel, A.; Lerner, R.; Levi, D.; Raz, G. Recent progress in road and lane detection: A survey. Mach. Vis. Appl. 2014, 25, 727–745. [Google Scholar] [CrossRef]
Xu, Q.; Zhu, F.; Hu, J.; Liu, W.; Zhang, X. An enhanced positioning algorithm module for low-cost GNSS/MEMS integration based on matching straight lane lines in HD maps. GPS Solut. 2023, 27, 22. [Google Scholar] [CrossRef]
Jin, Z.; Ge, D.Y. Detection and recognition method of monocular vision traffic safety information for intelligent vehicles. J. Intell. Fuzzy Syst. 2020, 39, 5017–5026. [Google Scholar] [CrossRef]
Xiao, X.Z.Y.; Chu, P.Z.; Liang, X.N.; Xun, W.K.; Ren, T.X. Experimental design of lane keeping based on deep learning end-to-end algorithm. Res. Explor. Lab. 2022, 41, 27–33. [Google Scholar]
Yuan, W.; Yang, M.; Li, H.; Wang, C.; Wang, B. End-to-end learning for high-precision lane keeping via multi-state model. CAAI Trans. Intell. Technol. 2018, 3, 185–190. [Google Scholar] [CrossRef]
Liu, S.; Müller, S. Reliability of deep neural networks for an end-to-end imitation learning-based lane keeping. IEEE Trans. Intell. Transp. Syst. 2023, 24, 13768–13786. [Google Scholar] [CrossRef]
Lee, D.H.; Liu, J.L. End-to-end deep learning of lane detection and path prediction for real-time autonomous driving. Signal Image Video Process. 2023, 17, 199–205. [Google Scholar] [CrossRef]
Yang, S.; Wu, J.; Jiang, Y.D.; Wang, G.J.; Liu, H.Z. Deep-learning-based lane-keeping control framework: From virtuality to reality. J. S. China Univ. Technol. Nat. Sci. Ed. 2019, 47, 90–97. [Google Scholar]
Kang, C.M.; Lee, S.H.; Kee, S.C.; Chung, C.C. Kinematics-based fault-tolerant techniques: Lane prediction for an autonomous lane keeping system. Int. J. Control Autom. Syst. 2018, 16, 1293–1302. [Google Scholar] [CrossRef]
Gao, R.Z.; Li, S.N.; Li, X.H. Research on pedestrian and vehicle detection algorithms in robot vision. Mach. Des. Manuf. 2023, 277–280. [Google Scholar]
Yu, J.X.; Zhang, M.Q.; Su, Y.T. Three-dimensional vehicle detection algorithm based on binocular vision. Laser Optoelectron. Prog. 2021, 58, 0215004. [Google Scholar]
Gu, D.Y.; Zhang, S.; Meng, F.W. Vehicle 3D space detection method based on monocular vision. J. Northeast. Univ. Nat. Sci. 2022, 43, 328. [Google Scholar]
Nguyen, T.T.; Nguyen, H.H.; Sartipi, M.; Fisichella, M. Multi-vehicle multi-camera tracking with graph-based tracklet features. IEEE Trans. Multimed. 2023, 26, 972–983. [Google Scholar] [CrossRef]
Wang, Z.Y.; Wang, G.Z. Application of improved lightweight YOLOv5 algorithm in pedestrian detection. Front. Data Comput. 2023, 5, 161–172. [Google Scholar]
Wei, X.; Zhang, H.; Liu, S.; Lu, Y. Pedestrian detection in underground mines via parallel feature transfer network. Pattern Recognit. 2020, 103, 107195. [Google Scholar] [CrossRef]
Zhang, Z.J. CNN-based driverless pedestrian recognition. Telecom World 2019, 26, 246–247. [Google Scholar]
Tomè, D.; Monti, F.; Baroffio, L.; Bondi, L.; Tagliasacchi, M.; Tubaro, S. Deep convolutional neural networks for pedestrian detection. Signal Process. Image Commun. 2016, 47, 482–489. [Google Scholar] [CrossRef]
Zhang, C.; Berger, C. Pedestrian behavior prediction using deep learning methods for urban scenarios: A review. IEEE Trans. Intell. Transp. Syst. 2023, 24, 10279–10301. [Google Scholar] [CrossRef]
Yang, B.; Fan, F.C.; Yang, J.C.; Cai, Y.F.; Wang, H. Recognition of pedestrians’ street-crossing intentions based on action prediction and environment context. Automot. Eng. 2021, 43, 1066–1076. [Google Scholar]
Alghodhaifi, H.; Lakshmanan, S. Holistic spatio-temporal graph attention for trajectory prediction in vehicle–pedestrian interactions. Sensors 2023, 23, 7361. [Google Scholar] [CrossRef] [PubMed]
Yang, W.Y.; Zhang, X.; Chen, H.; Jin, W.Q. A model of pedestrian trajectory prediction for autonomous vehicles based on social force. J. Highw. Transp. Res. Dev. 2020, 37, 127–135. [Google Scholar]
Zhou, Z.; Liu, Y.; Liu, B.; Ouyang, M.; Tang, R. Pedestrian crossing intention prediction model considering social interaction between multi-pedestrians and multi-vehicles. Transp. Res. Rec. 2024, 2678, 80–101. [Google Scholar] [CrossRef]
Kim, T.; Park, S.; Lee, K. Traffic sign recognition based on bayesian angular margin loss for an autonomous vehicle. Electronics 2023, 12, 3073. [Google Scholar] [CrossRef]
Liu, Z.; Shen, C.; Fan, X.; Zeng, G.; Zhao, X. Scale-aware limited deformable convolutional neural networks for traffic sign detection and classification. IET Intell. Transp. Syst. 2020, 14, 1712–1722. [Google Scholar] [CrossRef]
Xiao, Y.; Yin, S.; Cui, G.; Zhang, W.; Yao, L.; Fang, Z. E-YOLOv4-tiny: A traffic sign detection algorithm for urban road scenarios. Front. Neurorobotics 2023, 17, 1220443. [Google Scholar] [CrossRef]
Hu, J.Q. Simulation and performance analysis of millimeter wave radar under unmanned driving conditions. Smart Rail Transit 2023, 60, 6–11. [Google Scholar]
Lu, G. Research on the Method of Environmental Perception and Scene Reconstruction Based on Millimeter Wave Radar. Ph.D. Thesis, Harbin Institute of Technology, Harbin, China, 2020. [Google Scholar]
Hu, B.; Zhao, C.X. Vehicle detection method based on mht model using millimeter-wave radar. J. Nanjing Univ. Sci. Technol. 2012, 36, 557–560. [Google Scholar]
Du, L.P.; Su, G.C. Multi-moving targets detection based on p_0 order CWD in MMW radar. Syst. Eng. Electron. 2005, 27, 1523–1527. [Google Scholar]
Huang, P.; Shan, W.; Xu, W.; Tan, W.; Dong, Y.; Zhang, Z. Motion compensation method of an imaging radar based on unmanned automobile. J. Eng. 2019, 2019, 6170–6174. [Google Scholar] [CrossRef]
Liu, B.; Zhang, J.; Lu, M.; Teng, S.H.; Ma, Y.X. Research progress of laser radar applications. Laser Infrared 2015, 45, 117–122. [Google Scholar]
Liang, X.L.; Zhang, J.X.; Li, H.T.; Yan, P. The characteristics of LIDAR data. Remote Sens. Inf. 2005, 27, 71–76. [Google Scholar]
Li, Y.; Ibanez-Guzman, J. Lidar for autonomous driving: The principles, challenges, and trends for automotive lidar and perception systems. IEEE Signal Process. Mag. 2020, 37, 50–61. [Google Scholar] [CrossRef]
Wen, C.; Tan, J.; Li, F.; Wu, C.; Lin, Y.; Wang, Z.; Wang, C. Cooperative indoor 3D mapping and modeling using LiDAR data. Inf. Sci. 2021, 574, 192–209. [Google Scholar] [CrossRef]
Altuntas, C. Review of scanning and pixel array-based lidar point-cloud measurement techniques to capture 3D shape or motion. Appl. Sci. 2023, 13, 6488. [Google Scholar] [CrossRef]
Ilci, V.; Toth, C. High definition 3D map creation using GNSS/IMU/LiDAR sensor integration to support autonomous vehicle navigation. Sensors 2020, 20, 899. [Google Scholar] [CrossRef] [PubMed]
Debeunne, C.; Vivet, D. A review of visual-LiDAR fusion based simultaneous localization and mapping. Sensors 2020, 20, 2068. [Google Scholar] [CrossRef] [PubMed]
Catapang, A.N.; Ramos, M. Obstacle detection using a 2D LIDAR system for an autonomous vehicle. In Proceedings of the 2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, 25–27 November 2016. [Google Scholar]
Li, J.; Zhang, Y.; Liu, X.; Zhang, X.; Bai, R. Obstacle detection and tracking algorithm based on multi-lidar fusion in urban environment. IET Intell. Transp. Syst. 2021, 15, 1372–1387. [Google Scholar] [CrossRef]
Kumar, G.A.; Lee, J.H.; Hwang, J.; Park, J.; Youn, S.H.; Kwon, S. Lidar and camera fusion approach for object distance estimation in self-driving vehicles. Symmetry 2020, 12, 324. [Google Scholar] [CrossRef]
Jeong, S.; Ko, M.; Kim, J. Lidar localization by removing moveable objects. Electronics 2023, 12, 4659. [Google Scholar] [CrossRef]
Hu, J.; Liu, H.; Xu, W.C.; Zhao, L. Position detection algorithm of road obstacles based on 3D lidar. Chin. J. Lasers 2021, 48, 2410001. [Google Scholar]
Lou, X.Y.; Wang, H.; Cai, Y.F.; Zheng, Z.Y.; Cheng, L. A research on an algorithm for real-time detection and classification of road obstacle by using 64-line lidar. Automot. Eng. 2019, 41, 779–784. [Google Scholar]
Xiong, Z.K.; Cheng, X.Q.; Wu, Y.D.; Zuo, Z.Q.; Liu, J.S. Lidar-based 3D multi-object tracking for un-manned vehicles. Acta Autom. Sin. 2023, 49, 2073–2083. [Google Scholar]
Zou, B.; Liu, K.; Wang, K.W. Dynamic obstacle detection and tracking method based on 3D lidar. Automob. Technol. 2017, 8, 19–25. [Google Scholar]
Qin, W.; Yan, W.J. Design of ultrasonic car reversing radar for parking based on CX20106A. Piezoelectrics Acoustooptics 2011, 33, 161–164. [Google Scholar]
Yang, Y.; Huang, J.; Sun, K.; Luo, H.; Ding, D. Research on automated parking perception based on a multi-sensor method. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2023, 237, 1021–1046. [Google Scholar] [CrossRef]
Zhang, W.; Sun, L.; Zeng, L.S.; Zhang, B. Selection of AGV navigation parameters based on ultrasonic wave radar sensor. Transducer Microsyst. Technol. 2014, 33, 34–37. [Google Scholar]
Wiseman, Y. Ancillary ultrasonic rangefinder for autonomous vehicles. Int. J. Secur. Its Appl. 2018, 12, 49–58. [Google Scholar] [CrossRef]
Premnath, S.; Mukund, S.; Sivasankaran, K.; Sidaarth, R.; Adarsh, S. Design of an autonomous mobile robot based on the sensor data fusion of LIDAR 360, ultrasonic sensor and wheel speed encoder. In Proceedings of the 2019 9th International Conference on Advances in Computing and Communication (ICACC), Kochi, India, 6–8 November 2019. [Google Scholar]
Xu, K.; Luo, Y.Y.; Yang, Y.; Xu, G.Q. Review on state perception and control for distributed drive electric vehicles. J. Mech. Eng. 2020, 55, 60–79. [Google Scholar]
Pei, J.X.; Sun, S.Y.; Wang, Y.L.; Li, D.W.; Huang, R. Nighttime environment perception of driverless vehicles based on improved YOLOv3 network. J. Appl. Opt. 2019, 40, 380–386. [Google Scholar]
Huang, A.S.; Moore, D.; Antone, M.; Olson, E.; Teller, S. Finding multiple lanes in urban road networks with vision and lidar. Auton. Robot. 2009, 26, 103–122. [Google Scholar] [CrossRef]
Cheng, X.; Zhou, J.; Zhao, X. Safety assessment of vehicle behaviour based on the improved D–S evidence theory. IET Intell. Transp. Syst. 2020, 14, 1396–1402. [Google Scholar] [CrossRef]
Ma, F.Y.; Wang, X.N. Overview on environment perception and navigation and location technology applied on unmaned ground vehicle. Auto Electr. Parts 2015, 2, 5. [Google Scholar]
Rosique, F.; Navarro, P.J.; Fernández, C.; Padilla, A. A systematic review of perception system and simulators for autonomous vehicles research. Sensors 2019, 19, 648. [Google Scholar] [CrossRef] [PubMed]
Yang, L.N.; Liu, Z.H.; Wen, N. Integrated navigation trajectory prediction method based on deep Gaussian process for multiple unknown environments. Syst. Eng. Electron. 2023, 45, 3632–3639. [Google Scholar]
Demim, F.; Nemra, A.; Louadj, K. Robust SVSF-SLAM for unmanned vehicle in unknown environment. IFAC-Pap. 2016, 49, 386–394. [Google Scholar] [CrossRef]
Yeong, D.J.; Velasco-Hernandez, G.; Barry, J.; Walsh, J. Sensor and sensor fusion technology in autonomous vehicles: A review. Sensors 2021, 21, 2140. [Google Scholar] [CrossRef]
Fayyad, J.; Jaradat, M.A.; Gruyer, D.; Najjaran, H. Deep learning sensor fusion for autonomous vehicle perception and localization: A review. Sensors 2020, 20, 4220. [Google Scholar] [CrossRef] [PubMed]
Zhang, C.; Guo, Z.; Dang, M. Longitudinal velocity estimation of driverless vehicle by fusing lidar and inertial measurement unit. World Electr. Veh. J. 2023, 14, 175. [Google Scholar] [CrossRef]
Shi, X.B.; Zhao, D.X.; Kong, Z.F.; Ni, T.; Zhao, X.L. Vehicle high-precision positioning technique based on multi-sensors information fusion. China Mech. Eng. 2022, 33, 2381. [Google Scholar]
Wang, S.; Deng, Z.; Yin, G. An accurate GPS-IMU/DR data fusion method for driverless car based on a set of predictive models and grid constraints. Sensors 2016, 16, 280. [Google Scholar] [CrossRef] [PubMed]
Kim, M.J.; Kwon, O.; Kim, J. Vehicle to infrastructure-based lidar localization method for autonomous vehicles. Electronics 2023, 12, 2684. [Google Scholar] [CrossRef]
AlZubi, A.A.; Alarifi, A.; AlMaitah, M.; Alheyasat, O. Multi-sensor information fusion for internet of things assisted automated guided vehicles in smart city. Sustain. Cities Soc. 2021, 64, 102539. [Google Scholar] [CrossRef]
Ma, Y.; Wang, Z.; Yang, H.; Yang, L. Artificial intelligence applications in the development of autonomous vehicles: A survey. IEEE/CAA J. Autom. Sin. 2020, 7, 315–329. [Google Scholar] [CrossRef]
Sun, H.T.; Zhang, P.F.; Peng, C.; Ding, F. State-sensitive based event-triggered H∞ control for path tracking of unmanned ground vehicle. J. Hunan Univ. Nat. Sci. 2022, 49, 34–42. [Google Scholar]
Lv, P.; Li, K.; Xu, J.; Li, T.S.; Chen, N.J. Cooperative sensing information transmission load optimization for automated vehicles. Chin. J. Comput. 2021, 44, 1984–1997. [Google Scholar]
Noh, J.; Jo, Y.; Kim, J.; Min, K. Enhancing transportation safety with infrastructure cooperative autonomous driving system. Int. J. Automot. Technol. 2024, 25, 61–69. [Google Scholar] [CrossRef]

Figure 1. Computer vision target detection process.

Figure 2. Fully connected structural model.

Figure 3. Radar detection workflow.

Figure 4. Vehicle-mounted lidar layout.

Figure 5. Attitude detection principle.

Figure 6. Driverless vehicle state perception process.

Figure 7. Vehicle-mounted sensor layout.

Figure 8. Information fusion realization process.

Figure 9. V2I-based lidar localization process.

Table 1. Lidar classification table.

	Autonomous Driving Lidar [67]	3D Map Lidar [68,69,70,71]	Basic Sensing Lidar [72,73,74]
Characteristic	Higher resolution; Wide detection range.	High resolution; Large field of view; Capture terrain and architectural features.	Low cost; Simple design.
Applicable scenarios	High-precision environment sensing; Environment understanding.	Create accurate 3D maps; Vehicle localization; Route planning.	Obstacle detection

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Huo, Y.; Zhang, C. A Review of Key Technologies for Environment Sensing in Driverless Vehicles. World Electr. Veh. J. 2024, 15, 290. https://doi.org/10.3390/wevj15070290

AMA Style

Huo Y, Zhang C. A Review of Key Technologies for Environment Sensing in Driverless Vehicles. World Electric Vehicle Journal. 2024; 15(7):290. https://doi.org/10.3390/wevj15070290

Chicago/Turabian Style

Huo, Yuansheng, and Chengwei Zhang. 2024. "A Review of Key Technologies for Environment Sensing in Driverless Vehicles" World Electric Vehicle Journal 15, no. 7: 290. https://doi.org/10.3390/wevj15070290

Article Menu

A Review of Key Technologies for Environment Sensing in Driverless Vehicles

Abstract

1. Introduction

2. Visual Perception Technology

2.1. Computer Vision

2.1.1. Digital Imaging Technology

2.1.2. Machine Learning and Deep Learning

2.2. Detection Technology

2.2.1. Lane Detection

2.2.2. Vehicle Detection

2.2.3. Pedestrian Detection

2.2.4. Traffic Sign Detection

3. Radar Sensing Technology

3.1. Millimeter-Wave Radar

3.2. Lidar

3.3. Ultrasonic Radar

3.4. Multiple Radar Combinations

4. State Perception Technology

4.1. Structured Environment

4.2. Unstructured Environment

5. Information Fusion Technology

5.1. Data Fusion

5.2. Data Transmission

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI