1. Introduction
Passenger vehicle sales have become a growing part of the worldwide economy, with an increasing trend over recent years. With that growth and the advancement of automation technology, consumers, governments and society are all demanding better safety and a reduction in the number of deaths and injuries on the roads. Car manufacturers have started implementing Driver Assistance Systems (DAS) in production models in response to such demands. Among those, we recall stability control systems, anti-collision systems, Antilock Braking System (ABS), traction control and Electronic Brakeforce Distribution (EBD), seat belts, airbags, shock-absorbing bumpers, anti-intrusion bars, and visual systems (VDAS) [
1,
2]. Most VDAS employ video cameras. They are often used as car parking systems, front or rear and lane vision systems [
3,
4,
5]. Video cameras need external light sources: the sensor does not work in low visibility, in adverse weather conditions (e.g., fog, rain) or in the presence of smoke. These limitations can be overcome by radar technology [
6]. Radar-based systems can detect targets hundreds of meters ahead, being minimally affected by fog or heavy rain,
i.e., conditions that greatly limit the driver’s field of vision [
7]. Radar systems adopt several sensing and processing methods for determining the position and speed of the vehicles ahead [
8,
9,
10]. Car manufacturers are usually very reluctant to alter the shape of their vehicles to accommodate sensors, so designers are forced to design systems small enough to be mounted inside the car's front grille. In order to combine small dimensions and versatility, small antennas are required, and consequently signals at high frequencies are adopted. In particular, several proposed systems work at 76–77 GHz, which is a good compromise between compactness and cost. In order to produce a high resolution radar imaging system for automotive applications, many aspects need to be taken into account: the choice of the most appropriate radar imaging system, the choice of the antenna, the development of a radar and system simulator that helps in evaluating system performance by generating a controllable synthetic environment and, once the image has been obtained, the post-processing stage, such as target detection and tracking.
At present, several technological solutions for automotive radar imaging systems have been developed. The systems synthesize, analogically or digitally [
6], a beam scanning the area of interest to identify targets. The analog synthesis and scanning of the beam can be obtained in several ways, such as phased arrays, travelling wave antennas, and lens antennas. The best performing architecture in terms of resolution and scanning range is a phased array. This solution is also the most expensive, so it is necessary to find a compromise between price and performance. One possibility is to adequately process the signals of several antennas to synthesize a larger array [
7]. Another possibility is to use a modular architecture by splitting the array into identical sub-arrays, the feed of each of which is individually controlled.
In the post-processing stage, the radars currently used in the automotive industry are based on the Ultra-wide Band structure [
6,
10]. Most of the algorithms developed to reduce noise in the data and classify targets are based on the statistical analysis of backscattered radar returns, followed by a statistical classification that identifies the category to which the observed target belongs. For both detection and classification, statistical models are fitted to the data in order to assess their suitability and either confirm or reject the membership of a target in the proposed class.
In this manuscript, we focus on the signal processing step. In particular, we propose a novel signal processing algorithm, based on Compressive Sensing (CS) theory [
11,
12], for the detection and 3D imaging of targets within an observed volume, even in the case of scatterers sharing the same line of sight (LoS). This goal is achieved by detecting the presence of targets in the observed scene, estimating their positions and inferring their reflectivities. It will be shown that, by exploiting the sparsity of the solution, CS techniques prove particularly effective in solving such detection and estimation problems. The remainder of the paper is divided into four sections: the acquisition model is described in
Section 2. The CS-based approach is presented in
Section 3. Results on simulated data are reported in
Section 4. Conclusions are drawn in the final section.
2. Methodology
Let us consider an antenna ideally located at the front of the car, lying in the (
x,
y) plane. Let us consider a planar array antenna of
KN ×
KM elements, where each one transmits and receives the signal. Let us denote with
xi,
i = [1, …,
KN] and
yj,
j = [1, …,
KM],
z = 0 the coordinates of antenna elements. The schematic view of the antenna is shown in
Figure 1.
We assume the transmission of a monochromatic signal
ST at frequency
f0. In the noise-free case and neglecting constants, the signal received by the antenna at the position
(xi,
yj) can be modeled as [
13]:
where
G is the antenna gain,
c is the speed of light and
R(
xi −
x,
yj −
y,
z) is the distance between each antenna element, with coordinates (
xi,
yj, 0), and a target placed in (
x,
y,
z) with reflectivity γ(
x,
y,
z). The signal
SR coherently collects all the echoes from the illuminated volume
V, with proper attenuation and phase. This model assumes that all targets are point scatterers, that there is no multipath effect and that the superposition principle holds. Because they offer a good trade-off between model accuracy and tractability, such assumptions are widely adopted.
In order to simplify the realization of the system, the planar antenna is synthesized by combining two linear arrays, one horizontal and one vertical, as shown in
Figure 2. In this case, the vertical array contains the transmitting elements (blue dots in
Figure 2), and the horizontal array contains the receiving elements (red dots in
Figure 2).
The vertical array is composed of KN transmitting elements while the horizontal one is composed of KM receiving elements. In this case KN + KM elements are considered instead of KN × KM. The spacing among elements is λ/2. Each transmitting element of the vertical array emits a signal at different time intervals, and each echo is received by all elements of the array of receiving antennas on separate channels. In this case, the distance R of Equation (1) can be decomposed as the sum of RT and RR, i.e., the distance between the transmitting element and the target and the distance between the target and the receiving element, respectively.
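The saving offered by the cross layout can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the element counts used later in the simulations (141 transmitting and 111 receiving elements) and taking 77 GHz as the carrier; the helper name `linear_array` is illustrative only.

```python
# Sketch: element count and geometry of a cross array with lambda/2 spacing.
C = 299_792_458.0   # speed of light, m/s
F0 = 77e9           # assumed carrier frequency, Hz
KN, KM = 141, 111   # Tx (vertical) and Rx (horizontal) elements, as in the simulations

wavelength = C / F0
spacing = wavelength / 2


def linear_array(n, d):
    """Return n element coordinates spaced d apart and centred on zero."""
    return [(k - (n - 1) / 2) * d for k in range(n)]


tx_y = linear_array(KN, spacing)   # vertical transmitting array (x = 0)
rx_x = linear_array(KM, spacing)   # horizontal receiving array (y = 0)

physical_elements = KN + KM        # elements that must actually be built
virtual_channels = KN * KM         # Tx/Rx pairs, one per element of the equivalent planar array

print(f"wavelength = {wavelength * 1e3:.3f} mm, spacing = {spacing * 1e3:.3f} mm")
print(f"{physical_elements} physical elements provide {virtual_channels} Tx/Rx channels")
print(f"aperture: {rx_x[-1] - rx_x[0]:.3f} m wide, {tx_y[-1] - tx_y[0]:.3f} m tall")
```

With these numbers, 252 physical elements stand in for a 15651-element planar array, at the cost of transmitting from one element at a time.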
Our aim is to estimate the 3D distribution of scatterers across the imaged volume. In order to speed up the process, only directions,
i.e., Lines of Sight (LoS), with the presence of at least one target are selected. To do this, a first processing step implementing a fast 2D focusing algorithm is performed. The goal is to focus the acquired signal on a vertical plane at a fixed distance
z0. Expanding in a Taylor series and truncating at the second term,
RT and
RR can be written as:
Substituting in Equation (1), we obtain:
where the substitution
=
has been done.
After applying deramping,
i.e., correcting the phase term in order to remove the linear component due to the distance, the term
is converted to
, obtaining the signal
Sdr:
where the intermediate derivations are reported in
Appendix.
Moving to a multi-frequency system, the dependency of the acquired signal with respect to frequency has to be made explicit, obtaining Sdr(xi, yj, f). By assuming a stepped system, a discrete number of frequencies is exploited, thus the vector f = [f0, f1, f2, …, fN] containing the N frequencies can be defined.
After discretization, Equation (4) becomes similar to the direct 2D Discrete Fourier Transform expression, with the two exponential functions being the transformation kernel. Thus, the Inverse Fast Fourier Transform (IFFT) algorithm is adopted in order to invert Equation (4) and estimate the term γ(
x,
y,
z0) from the acquired signal,
i.e.:
Note that the data at each considered frequency fi produces a 2D image of the reflectivity γ(x, y, z0). The reflectivity estimated from Equation (5) is exploited in order to detect the presence of targets within the imaged volume and their horizontal θ and vertical φ angles of view.
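To illustrate why a Fourier transform across the array elements focuses the signal in angle, the following sketch considers a deliberately simplified case: a single linear array with λ/2 spacing and one far-field target. It is not an implementation of Equations (4) and (5); the one-way plane-wave phase model, the sign convention of the transform and the 10° test angle are all assumptions made for the example.

```python
import cmath
import math


def dft(samples):
    """Plain O(n^2) discrete Fourier transform, sufficient for a small array."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * m / n) for m, s in enumerate(samples))
            for k in range(n)]


KM = 111                           # receiving elements, as in the simulations
theta_true = math.radians(10.0)    # hypothetical target direction

# Far field, lambda/2 spacing: the phase advances by pi*sin(theta) per element (assumed model).
signal = [cmath.exp(1j * math.pi * m * math.sin(theta_true)) for m in range(KM)]

spectrum = dft(signal)
k_peak = max(range(KM), key=lambda k: abs(spectrum[k]))

# Map the peak bin to a spatial frequency in [-0.5, 0.5) cycles per element, then to an angle.
u = k_peak / KM if k_peak < KM / 2 else k_peak / KM - 1
theta_est = math.asin(2 * u)

print(f"peak bin {k_peak}, estimated angle {math.degrees(theta_est):.2f} deg")
```

The estimate is quantized to the bin spacing of the transform (2/KM in sin θ), which is why the recovered angle is close to, but not exactly, the simulated one.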
After this first processing step, a second one is implemented in order to detect the presence of scatterers within the 3D imaged volume and estimate their coordinates. For each identified target (
i.e., for each (θ, φ) pair of interest), the antenna beam is steered toward its direction by applying a proper phase correction term to the acquired signals. In this step, we move to the spherical coordinate system (ε, θ, φ) from the Cartesian one (
x,
y,
z). In other words, while 2D focusing works on vertical (
x,
y) planes, the 3D focusing considers a volume that is a cone with the vertex positioned in the center of the antenna. Considering the multi-frequency approach, a single complex value is obtained for each of the
N working frequencies. Once focused on a LoS, our aim is to detect one or multiple targets and estimate their range distances, based on
N acquisitions. The acquisition model can be written as:
$$q = A\,h$$
where
q is the
N × 1 data vector collecting the focused signals at the different frequencies for the direction (θ, φ),
A is the transformation matrix and
h is a vector of the reflectivity at different distances. In particular, the vector
h contains the complex reflectivity values for different range distances ε
κ, uniformly sampled in the interval [ε
min, ε
max], as reported in
Figure 3. Few targets are expected to be detected for each line of sight, thus most of the
h elements are supposed to be equal to zero. In other words,
h can be assumed to be a sparse vector.
Concerning matrix
A, it is defined by discretizing the acquisition model of Equation (1). The generic element of matrix
A is:
where
fi is one of the frequencies within the bandwidth and ε
j is a discretized range distance.
Given the previously reported model, our aim is to estimate the number of non-zero elements of h, i.e., how many targets are present in the selected line of sight, their position within vector h, i.e., the range distances of the detected targets, and their values, i.e., the reflectivity of the targets. We can refer to the estimation of vector h as an “in depth” focusing.
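The linear model can be made concrete with a small numerical sketch. The element of A used below, a pure two-way phase term exp(−j4πf_i ε_j / c) with amplitude factors neglected, is an assumption adopted for illustration and may differ from the exact expression of the paper; the function names and the choice of 20 random frequencies are likewise illustrative.

```python
import cmath
import math
import random

C = 299_792_458.0  # speed of light, m/s


def transformation_matrix(freqs, ranges):
    """A[i][j] = exp(-j*4*pi*f_i*eps_j/c): assumed two-way phase, amplitude terms neglected."""
    return [[cmath.exp(-4j * math.pi * f * eps / C) for eps in ranges] for f in freqs]


def forward(A, h):
    """Noise-free data vector q = A h."""
    return [sum(a * x for a, x in zip(row, h)) for row in A]


rng = random.Random(0)
freqs = sorted(rng.uniform(77e9, 77.5e9) for _ in range(20))   # N = 20 random frequencies
ranges = [10.0 + 0.09 * k for k in range(1000)]                # 10 m to about 100 m, 9 cm steps

# Sparse reflectivity vector: two unit scatterers at the grid points nearest to 20 m and 30 m.
h = [0j] * len(ranges)
i20 = round((20.0 - 10.0) / 0.09)
i30 = round((30.0 - 10.0) / 0.09)
h[i20] = 1.0
h[i30] = 1.0

A = transformation_matrix(freqs, ranges)
q = forward(A, h)

print(f"A is {len(A)} x {len(A[0])}, q has {len(q)} entries")
print(f"non-zero entries of h: {sum(1 for x in h if x != 0)} out of {len(h)}")
```

Only two of the 1000 unknowns are non-zero, while just 20 measurements are available: this is the underdetermined, sparse setting that the following estimation step has to handle.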
In the realistic case, measurements are corrupted by noise, leading to:
$$q = A\,h + w$$
where
w is the thermal noise vector, whose elements are circular complex Gaussian distributed.
The problem of reconstructing a sparse vector from a low number of measurements is the typical problem addressed by the CS technique. The estimation algorithm can be formulated by solving the following minimization problem:
where the L
1-norm promotes the sparsity of the unknown
h vector, while the L
2-norm minimizes the difference between the model and the acquired data. ψ is a regularization factor whose ideal value depends mostly on the SNR and has to be tuned [
14,
15]. In order to compute the estimation of
h,
i.e., the solution of Equation (8), several algorithms can be adopted [
11,
12,
16].
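As an illustration of one such solver, the sketch below applies the iterative soft-thresholding algorithm (ISTA) to a small problem. It is not necessarily the algorithm used for the reported results. It assumes the common objective ‖q − A h‖₂² + ψ‖h‖₁, a pure two-way phase model for A with unit-norm columns, a coarse 2 m range grid, 40 random frequencies and an arbitrary noise level; all of these are choices made to keep the example short.

```python
import cmath
import math
import random

C = 299_792_458.0  # speed of light, m/s


def soft_threshold(z, t):
    """Complex soft-thresholding: shrink |z| by t and keep the phase."""
    mag = abs(z)
    return z * (1 - t / mag) if mag > t else 0j


def ista(A, q, psi, n_iter=400):
    """Approximately minimise ||q - A h||_2^2 + psi * ||h||_1 (assumed objective)."""
    n_rows, n_cols = len(A), len(A[0])
    # Squared Frobenius norm of A: a cheap upper bound on the largest eigenvalue of A^H A.
    lip = sum(abs(a) ** 2 for row in A for a in row)
    h = [0j] * n_cols
    for _ in range(n_iter):
        residual = [sum(a * x for a, x in zip(row, h)) - qi for row, qi in zip(A, q)]
        # Half of the gradient of the data term: A^H (A h - q).
        half_grad = [sum(A[i][j].conjugate() * residual[i] for i in range(n_rows))
                     for j in range(n_cols)]
        h = [soft_threshold(hj - gj / lip, psi / (2 * lip)) for hj, gj in zip(h, half_grad)]
    return h


rng = random.Random(1)
n_freq = 40
freqs = [rng.uniform(77e9, 77.5e9) for _ in range(n_freq)]
ranges = [10.0 + 2.0 * k for k in range(46)]   # 10, 12, ..., 100 m (coarse illustrative grid)

# Assumed two-way phase model, scaled so that every column has unit norm.
A = [[cmath.exp(-4j * math.pi * f * eps / C) / math.sqrt(n_freq) for eps in ranges]
     for f in freqs]

h_true = [0j] * len(ranges)
h_true[ranges.index(20.0)] = 1.0
h_true[ranges.index(30.0)] = 1.0

sigma = 0.005   # arbitrary noise level per real/imaginary component
q = [sum(a * x for a, x in zip(row, h_true)) + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
     for row in A]

h_est = ista(A, q, psi=0.1)
strongest = sorted(range(len(ranges)), key=lambda j: abs(h_est[j]), reverse=True)[:2]
detected = sorted(strongest)
print("strongest range bins (m):", [ranges[j] for j in detected])
```

The soft-thresholding step is what enforces sparsity: entries whose magnitude stays below the threshold are set exactly to zero, so the number of surviving entries gives the number of detected targets, their indices give the range distances and their values give the reflectivities.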
3. Results
In order to evaluate the performance of the proposed method, different simulated case studies have been implemented. We simulated, in the Matlab® environment, the received signal for different scenarios, corrupting the data with circular complex Gaussian distributed random noise. A cross antenna composed of two linear arrays of 111 (horizontal, Receiving-Rx) and 141 (vertical, Transmitting-Tx) elements has been considered. The system band, between 77 GHz and 77.5 GHz, has been sampled following a stepped approach. Complete system details are reported in
Table 1. For the reported simulations, a constant SNR of 30 dB for a target at a distance of 200 m from the antenna has been adopted. In this case, the regularization factor ψ has been empirically set equal to 0.1.
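Corrupting the data at a prescribed SNR can be sketched as follows. The example only shows how circular complex Gaussian noise is scaled for a given SNR relative to the mean power of a signal vector; how the 30 dB reference at 200 m translates to targets at other distances is not modeled here, and the function names are illustrative.

```python
import math
import random


def noise_sigma(signal_power, snr_db):
    """Standard deviation of each real/imaginary noise component for a target SNR."""
    noise_power = signal_power / 10 ** (snr_db / 10)
    return math.sqrt(noise_power / 2)   # power split equally between the two components


def add_noise(samples, snr_db, rng):
    """Add circular complex Gaussian noise at the given SNR (relative to mean signal power)."""
    power = sum(abs(s) ** 2 for s in samples) / len(samples)
    sigma = noise_sigma(power, snr_db)
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in samples]


rng = random.Random(0)
clean = [1 + 0j] * 20000                      # unit-power test signal
noisy = add_noise(clean, snr_db=30.0, rng=rng)

measured = sum(abs(n - c) ** 2 for n, c in zip(noisy, clean)) / len(clean)
print(f"target noise power 1.0e-03, measured {measured:.2e}")
```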
The first dataset is composed of two targets in front of the antenna,
i.e., (θ, φ) = (0, 0), with the same reflectivity and range distances of 20 and 30 m, respectively. In
Figure 4, the images obtained by the 2D FFT based focus approach,
i.e., the first step of the proposed method, considering different distances (
z0) are reported. In particular, 10, 20, 30 and 50 m have been considered. It can be seen that the targets are evident at all focusing distances, although no targets are present at 10 m and 50 m, suggesting that the approximations made in Equation (4) also hold when the assumed range distance is wrong. From
Figure 4, it can be stated that the choice of the focusing distance
z0 does not noticeably modify the focused image, thus the parameter
z0 can be
a priori fixed to any value.
Subsequently, we considered the line of sight corresponding to (θ, φ) = (0, 0) for computing the in depth focusing, i.e., estimating the number of targets, their range distances and their reflectivities in front of the antenna. Several test cases have been considered, with different samplings of the available bandwidth (the 77–77.5 GHz interval). In particular, instead of uniformly sampling the bandwidth, random frequencies have been chosen. We first investigated the number of frequencies that has to be considered in order to achieve effective results. For this simulation 500, 100, 20 and 10 frequencies, randomly sampled within the 77–77.5 GHz interval, have been considered. We recall that the lower the number of adopted frequencies, the shorter the global acquisition time of the system.
The unknown vector h has been assumed to cover range distances from 10 to 100 m with a spacing of 9 cm, providing 1000 positions. Note that the number of rows of the transformation matrix A is considerably lower than the number of columns. In particular, it has 1000 columns and 500, 100, 20 or 10 rows.
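The sizes involved can be verified with a short calculation. The sketch below also evaluates the classical range resolution c/(2B) for the 500 MHz band, a standard radar relation that is added here for comparison with the 9 cm grid and is not taken from the text above.

```python
C = 299_792_458.0                  # speed of light, m/s
bandwidth = 77.5e9 - 77e9          # 500 MHz
eps_min, eps_max, step = 10.0, 100.0, 0.09

n_cols = round((eps_max - eps_min) / step)   # number of range positions (unknowns)
classical_resolution = C / (2 * bandwidth)   # standard range resolution for bandwidth B

print(f"unknowns: {n_cols}, grid step: {step * 100:.0f} cm, "
      f"c/(2B): {classical_resolution * 100:.1f} cm")

ratios = []
for n_rows in (500, 100, 20, 10):
    ratios.append(n_cols / n_rows)
    print(f"A is {n_rows} x {n_cols}: {n_cols / n_rows:.0f} unknowns per measurement")
```

The range grid is therefore about three times finer than c/(2B), and in the most undersampled case there are 100 unknowns for every measurement.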
In order to provide a reference solution, the estimation via L
2-norm minimization technique has also been performed. In
Figure 5, results for the L
2-norm and the proposed CS techniques are reported in red and blue color, respectively, in case of 500 frequencies (
Figure 5a), 100 frequencies (
Figure 5c), 20 frequencies (
Figure 5e) and 10 frequencies (
Figure 5g). Each line represents the estimated reflectivity for each range distance between 10 and 100 m. In particular, its value is expected to be zero where no targets are present, while a peak is associated to each detected target within the considered line of sight. In order to better appreciate the results, enlargements in the 15–35 m range are presented for all the cases in the right column of
Figure 5.
In case of 500 frequencies (
Figure 5a,b), both techniques are able to detect the presence of different targets, their reflectivities and distances (two peaks at 20 and 30 m are evident), with the L
2-norm approach characterized by a coarser resolution with respect to the proposed CS-based methodology, as the peaks are wider. Moving to the 100 frequencies case (
Figure 5c,d), the proposed approach produces very similar results compared to the previous case, while the L
2-norm technique shows a much higher amount of estimation noise and fails in evaluating the reflectivity of the target at 30 m. In the 20 frequencies case (
Figure 5e,f), characterized by a deeply undersampled bandwidth, the L
2-norm fails to detect the second target, and shows several false alarms in the 10–30 m range, while the proposed approach still correctly retrieves the scatterers. In the last case,
i.e., 10 frequencies sampled within the 500 MHz interval (
Figure 5g,h), both techniques fail, as CS is also unable to detect the correct number of scatterers and their distance from the antenna.
The second simulated dataset is a more realistic scenario. Several scatterers have been placed within the volume of interest in order to simulate a road with cars and lampposts, providing the scenario illustrated in
Figure 6. In this case, 100 frequencies have been considered.
First, the 2D FFT based focus has been applied, providing the image reported in
Figure 7. It can be noted that the shapes have been well retrieved, with the lampposts visible on the left and right sides of the images and the cars visible in the central area.
The in depth focus via CS approach has been applied for each line of sight, providing the detection of targets within the considered 3D volume. In this case, the L
2-norm technique provided very unsatisfactory results, thus they have not been reported. In
Figure 8 the estimated scatterers (red dots) have been plotted overlapped with the reference scenario (blue dots).
From the reported results, different aspects can be highlighted. The proposed method is able to detect multiple targets sharing the same LoS. The estimated position of the targets is globally satisfactory,
i.e., red dots are correctly positioned in the 3D volume. The false alarm rate is very low, even at far range distances from the antenna. Concerning the detection rates, as expected, performances are better in the short range region. From
Figure 8 it is evident that the number of detected targets beyond 35 m is very low compared to the nearest region. We have to underline that the considered scenario is very challenging, since in many lines of sight more than two scatterers are present. However, at least a few scatterers have been found for each car, while, considering lampposts, only the most distant on the left side of the road has been completely missed.
An evaluation of computation time of the method has been made. At present, the detection of the targets for each line of sight requires about 10 s on a Core i7 workstation with 16 GB of RAM in the case of 20 frequencies and 1000 unknowns. In case of the simulated scenario reported in
Figure 8, 119 lines of sight have been detected, thus the simulation was completed in about 119 × 10 s, which is about 20 min in total. However, it has to be underlined that the whole process has been implemented in a Matlab environment and no optimization was done on the code. If we accept a coarser range resolution, e.g., moving from 9 to 90 cm (100 unknowns), which could still be acceptable considering the application, the computational time for each line of sight reduces to 0.7 s. Moreover, to improve the code performance, massive parallelization may be implemented (all lines of sight could be processed simultaneously), for example by employing a General Purpose Graphic Processor Unit (GP-GPU) with hundreds of cores. Together, code optimization and parallelization could lead to a speedup of two orders of magnitude, making the method suitable for most real-time applications.
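The timing figures quoted above can be reproduced with simple arithmetic. The sketch below uses only the reported numbers; the factor of 100 is the hoped-for two-orders-of-magnitude speedup, not a measured value.

```python
lines_of_sight = 119
t_fine = 10.0      # s per line of sight, 1000 unknowns (9 cm grid)
t_coarse = 0.7     # s per line of sight, 100 unknowns (90 cm grid)
speedup = 100      # projected gain from optimization and parallelization (not measured)

total_fine = lines_of_sight * t_fine
total_coarse = lines_of_sight * t_coarse

print(f"9 cm grid:  {total_fine:.0f} s = {total_fine / 60:.1f} min")
print(f"90 cm grid: {total_coarse:.1f} s")
print(f"with a {speedup}x speedup: {total_fine / speedup:.1f} s and {total_coarse / speedup:.2f} s")
```

Under these assumptions the full scene would take roughly 12 s on the fine grid and under 1 s on the coarse one.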