Article

A New Method for Traffic Participant Recognition Using Doppler Radar Signature and Convolutional Neural Networks

by Błażej Ślesicki 1,* and Anna Ślesicka 2

1 Department of Avionics and Control Systems, Faculty of Aviation Division, Polish Air Force University, 08-521 Dęblin, Poland
2 Institute of Navigation, Polish Air Force University, 08-521 Dęblin, Poland
* Author to whom correspondence should be addressed.
Sensors 2024, 24(12), 3832; https://doi.org/10.3390/s24123832
Submission received: 15 May 2024 / Revised: 7 June 2024 / Accepted: 12 June 2024 / Published: 13 June 2024
(This article belongs to the Section Electronic Sensors)

Abstract

The latest survey results show an increase in road accidents involving pedestrians and cyclists. The causes of such accidents are numerous, and the fault lies on both sides. Equipping vehicles, especially autonomous vehicles, with frequency-modulated continuous-wave (FMCW) radar, dedicated algorithms for analyzing signals in the time–frequency domain, and algorithms for recognizing objects in radar imagery with deep neural networks can have a positive effect on safety. This paper presents a method for recognizing and distinguishing a group of objects based on their radar signatures and a special convolutional neural network structure. The proposed approach is based on a database of radar signatures generated from pedestrian, cyclist, and car models in the Matlab environment. The obtained simulation results and positive tests provide a basis for applying the system in many sectors and areas of the economy. Innovative aspects of the work include the method of discriminating between multiple objects on a single radar signature, the dedicated architecture of the convolutional neural network, and the method of generating a custom input database.

1. Introduction

Recent years have seen tremendous growth of interest in the application of artificial intelligence in various areas of life. In particular, this concerns the development of deep learning algorithms [1]. Deep learning refers to multilayer neural networks that perform two functions simultaneously. The first is the generation of diagnostic features for the process under analysis. The second is that of a classifier or regression system. This makes deep learning a powerful tool for replacing humans, above all in the difficult task of describing a process with specialized descriptors, whose creation requires considerable expert skill. Consequently, this non-interventional approach to feature generation is much more effective than the descriptor generation methods traditionally used. In particular, it improves the accuracy of system performance, and for this reason, deep network technology has recently become one of the most popular areas of computer science [2].
Fukushima’s multilayer neocognitron, introduced in 1980 [3], can be considered the precursor of these networks. Their real development, however, is attributable to Professor LeCun [4], who defined the basic structure and learning algorithm of a specialized multilayer network called the convolutional neural network (CNN). A turning point in the development of deep neural networks came in 2012, when Alex Krizhevsky’s team won the ImageNet image recognition competition by a wide margin over the other teams [5].
Currently, the CNN is the basic structure used on a large scale in image processing. Many varieties of networks have since been developed, both as modifications of the basic CNN structure (e.g., UNN) and as networks that differ fundamentally from CNNs. Examples include the autoencoder (AE), a multilayer nonlinear generalization of the linear PCA network [2]; recurrent networks of the Long Short-Term Memory (LSTM) type [1], which provide an effective solution to the problem of backpropagation through time; and the Restricted (multilayer) Boltzmann Machine (RBM) used in the Deep Belief Network (DBN) [6].
Recently, there has been considerable interest in using radar data as input to deep neural networks [7]. Radar is an active sensor, and its use is independent of atmospheric conditions, since microwaves penetrate the atmosphere under virtually any conditions. Hence, it is possible to use radar, for example, SAR, to image terrain both day and night, as well as in the presence of clouds, fog, and smoke. In addition, radar has a much longer range than optoelectronic sensors such as cameras or lidar. Radar can also detect moving objects and is even capable of detecting objects masked by, for example, vegetation or leaves. Unfortunately, radar imagery is difficult to interpret and requires skilled analysts [8]; hence, the authors propose in this article to replace the difficult analysis of radar imagery with deep neural networks that distinguish the types of objects moving in front of the radar. From a military point of view, it should also be added that radar, being an active sensor, is easy for an adversary to detect [9].
The use of the Doppler phenomenon in radiolocation is well known; less widely described is the micro-Doppler effect [10]. The main difference between the two is that the classical Doppler effect measures the change in the frequency of a wave associated with the motion of an object as a whole, while the micro-Doppler effect additionally analyzes the frequency changes resulting from internal structural movements of the object [11]. In radar, this phenomenon occurs when the illuminated object of interest rotates or vibrates mechanically: such vibrations or movements about the object’s body axes induce additional frequency components in the echo signal. Examples of movements generating the micro-Doppler phenomenon are the rotors of helicopters (the propellers of a fixed-wing aircraft, the engine compressor and blade assemblies of a jet aircraft) or the limbs of a person while walking [12]. In summary, the movements of the various parts of an object in front of the radar produce micro-Doppler signatures that can be used to identify the object.
The novelty of this article lies in the method of identifying objects by using FMCW radar, processing the received signals with the short-time Fourier transform (STFT), and then classifying the resulting spectrograms with deep neural networks. The proposed solution eliminates the known limitations of existing camera-based techniques for counting (distinguishing) people, cyclists, or cars in vision systems, since a camera requires a clear line of sight and the quality of its imagery depends on the lighting conditions.
These things considered, the application of the presented solution to automotive radar—in addition to its essential ability to measure the distance and relative speed of objects in front of cars—can serve to improve the driver’s ability to perceive objects in poor visibility or objects in blind spots.
Taking the above into account, and in particular the recalled advantages of radar over other imaging sensors, collecting radar signatures in real time and using machine learning algorithms to detect pedestrians, cyclists, and cars makes it possible to develop a system that warns a driver or other road users of a possible collision with a pedestrian at night. Such a system could be applied in autonomous vehicles, where its output signal could activate the automatic braking system when necessary.
The proposed system can be integrated into the vehicle’s intelligent system, which provides real-time pedestrian detection and warning mechanisms by vibrating the steering wheel and displaying a message on the monitor/dashboard to warn the driver of a collision with a pedestrian.
The system can also be extended to detect and avoid cyclists and animals and can be implemented in autonomous vehicles to support automatic braking systems.
In addition, a system for the automatic detection of people, animals, or vehicles can be used in settings such as production halls, where accidental, uncontrolled interference by a person or another object can disrupt the normal operation of the enterprise or expose people, including employees, to loss of life or health.
Innovative solutions in this article also include the development of a method for the generation and flow of simulated radar data.
The rest of this paper is organized as follows. Section 2 describes the state of the art of the issues addressed. Section 3 describes the method of generating radar data as input for a special convolutional neural network structure; it also briefly characterizes the FMCW radar, the object models, and the short-time Fourier transform (STFT), and describes the construction and parameters of convolutional neural networks. Section 4 presents the results of the experiments. Finally, Section 5 presents conclusions and a discussion of the obtained results.

2. Related Works

Target detection and classification based on deep learning from micro-Doppler signatures is gaining popularity in the field of automatic recognition of moving objects on the ground, such as people, animals, or vehicles. Micro-Doppler signatures are characteristic frequency patterns of radar signals that result from an object’s motion. Through deep learning, models can be trained to analyze these signatures, making it possible to automatically recognize and classify different types of objects based on their motion [13].
In the paper [14], a MAFAT dataset containing echo signals from humans and animals collected by different Doppler radars in different locations, terrains, and with different SNRs was used to train a six-layer convolutional neural network. To achieve higher classification accuracy, the data were further augmented by randomly shifting frequency/time, adding noise, and inverting the image vertically/horizontally.
On the other hand, a paper [15] proposed a CNN consisting of five dense blocks (i.e., a 3 × 3 convolutional layer followed by a 1 × 1 convolutional layer) and five transition blocks (i.e., a 1 × 1 convolutional layer followed by a 2 × 2 pooling layer) for human motion classification based on micro-Doppler signatures. The performance of this network was tested on two datasets containing echoes associated with six human movements, including walking, running, crawling, forward jumping, and boxing, obtained by simulation and measurement, respectively. The main feature of the human-motion-recognition algorithm in [15] is that the proposed network is more robust to variations in the target aspect angle than most classical CNN models, such as VGGNet, ResNet, and DenseNet.
In articles [16,17,18], the authors focused on the problem of recognizing single individuals, groups, or vehicles based on micro-Doppler signatures using deep learning (DL). To improve training efficiency, pre-trained classical CNN models, such as VGG16, VGG19, and AlexNet, were used together with a transfer-learning technique. Human-target/vehicle tracking data collected from a Ku-band pulsed Doppler radar (RadEch) included various typical scenarios, such as a single person, a group walking, running, and truck movement. These data were used to evaluate the performance of the proposed network. In addition, data augmentation was applied: the training data were augmented by vertically flipping the images and circularly shifting them to compensate for the limited amount of training data available. Data augmentation is a technique used in machine learning that generates new training examples by modifying existing data to increase the variety and quantity of training data.
The works [19,20] used pre-trained classical CNN models (for example, GoogLeNet) for drone classification. In particular, in [19], micro-Doppler signatures and cadence–velocity diagrams obtained by a 14 GHz FMCW radar in indoor and outdoor experiments were combined into merged Doppler images, based on which drones with different numbers of engines were classified. In [20], both a pre-trained GoogLeNet and a 34-layer deep CNN were used to classify drones and birds at night. RGB and grayscale echo-signal datasets, collected by 24 GHz and 94 GHz FMCW radars, were used to train the two networks, respectively. One distinctive feature of the networks presented in [20] is that clutter and noise were treated as two separate subclasses.
In [21,22], the authors proposed a two-layer deep network with symmetric connections between the nodes (neurons) of these layers (a DBN, Deep Belief Network). This network is built by stacking a classical RBM and an RBM with a Gauss–Bernoulli distribution (GBRBM), similar to the solution proposed in [19], to solve the problem of micro-drone detection and classification. The classification was based on the Doppler signatures of the objects in question and their spectral correlation function (SCF) patterns (i.e., the Fourier transform of the autocorrelation function). The performance of the proposed DBN was tested on echo signals collected from three micro-drones using an S-band Doppler radar.
In conclusion, it is worth noting that a comprehensive overview of the application of algorithms based on deep learning (DL), such as convolutional neural networks (CNNs) for detecting and classifying different types of objects, was presented in [23]. This paper discusses the application areas of deep learning, i.e., radar waveform and antenna array design; radar waveform recognition of passive or low-probability-of-detection (LPI) radars; automatic target recognition (ATR) based on high-resolution range profiles (HRRPs); Doppler signatures and synthetic-aperture radar (SAR) images; and radar interference/noise recognition and suppression.
Taking the above literature review into account, the authors of this article propose an approach missing from the literature: one based on simulated data generated in the Matlab environment and a novel method of identifying objects using an FMCW radar operating at 77 GHz, processing the received signals with the STFT, and then classifying the resulting spectrograms with a specialized deep neural network structure. In addition, Table 1 below contains a list of acronyms to facilitate reading and analysis of the article.

3. Method

This section describes the radar, pedestrian, cyclist, and car models adopted in the article. A frequency-modulated continuous-wave (FMCW) radar operating at $f_C = 77$ GHz was adopted. Figure 1 shows the simulated situation: the radar illuminates a specific area in which pedestrians, cyclists, and cars may be moving.
Such a situation can occur on roads, in parking lots, airports, or other important public places, where it is necessary to take care of the safety or identification of emerging objects.

3.1. FMCW Radar

Frequency-modulated continuous-wave radar allows the detection of, and the estimation of distances to, objects within the radar beam. The principle of such a radar is to probe the space with a linearly frequency-modulated signal. The echo signal is then subjected to spectral analysis. The positions of the spectral peaks (those exceeding the assumed detection threshold) on the frequency axis are directly proportional to the distances between the radar antenna and the observed objects. The advantages of such a radar are a simple, uncomplicated design, a low-power probing signal, and thus, importantly from the military point of view, a low susceptibility to detection by the enemy [24].
The signal transmitted by the FMCW radar is called a chirp. It is assumed that the radar transmits a finite number K of chirps, each with a linearly increasing frequency as a function of time. Figure 2 shows a model of the chirp signal generated by the FMCW radar and labels its most important parameters.
The time dependence of the frequency of the emitted signal is given by the following relationship [24]:

$f(t') = \beta t'$ (1)

where $\beta = B/T$ is the frequency slope; $T$ denotes the chirp duration; $B$ denotes the swept frequency bandwidth; $t' \in [0, T]$ is the time within a single chirp; and $k = 0, \ldots, K-1$ indexes the chirps.
The signal transmitted by the FMCW radar can be modeled as follows [24]:

$s(t) = \cos\!\left(2\pi f_C t + \varphi(t')\right)$ (2)

where the time $t$ and the phase $\varphi(t')$ can be expressed as follows [24]:

$t = kT + t'$ (3)

$\varphi(t') = \pi k \beta T^2 + \pi \beta t'^2$ (4)
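As an illustration, the chirp train of Equations (1)–(4) can be sampled directly. The following Matlab sketch (ours, not code from the paper) generates the complex baseband form of the signal, i.e., with the carrier term removed, using the bandwidth, sweep time, and sample rate listed later in Table 2; the number of chirps is an assumption.

```matlab
% Sampling the FMCW chirp train of Equations (1)-(4) at complex baseband.
% Parameter values follow Table 2; this is an illustrative sketch only.
B    = 300e6;                 % swept bandwidth [Hz]
T    = 3.667e-6;              % chirp (sweep) duration [s]
fs   = 300e6;                 % sample rate [Hz]
K    = 64;                    % number of chirps (assumed for illustration)
beta = B/T;                   % frequency slope, Eq. (1)

tp = 0:1/fs:T-1/fs;           % fast time t' within one chirp (~1100 samples)
s  = zeros(1, K*numel(tp));
for k = 0:K-1
    phi = pi*k*beta*T^2 + pi*beta*tp.^2;        % phase, Eq. (4)
    s(k*numel(tp)+(1:numel(tp))) = exp(1j*phi); % baseband chirp, Eqs. (2)-(3)
end
```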
If the transmitted signal encounters an object in its path, part of the energy of the electromagnetic wave returns to the radar. In the receiving part of the FMCW radar, the received signal is multiplied by a replica of the transmitted signal and filtered, which ultimately yields a signal at the output of the receiving path of the following form [24]:

$y(t) \approx \varepsilon\, e^{j 2\pi f_B t}\, e^{j 2\pi f_D k T}$ (5)

where $\varepsilon$ denotes a complex coefficient depending on the antenna gain; $f_B$ denotes the beat frequency; and $f_D$ denotes the Doppler frequency. The described situation is shown in Figure 3.
Knowing the values of $f_D$ and $f_B$, one can determine the velocity of, and the distance to, the object, respectively [25,26,27]:

$V = \dfrac{f_D\, c}{2 f_C}$ (6)

$R = \dfrac{f_B\, c}{2 \beta}$ (7)

where $c$ denotes the speed of light.
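A short numerical check of Equations (6) and (7) (our sketch, not the authors’ code): with the radar parameters of Table 2, the maximum beat and Doppler frequencies map back to the stated 100 m and 50 km/h design limits.

```matlab
% Recovering velocity and range from the Doppler and beat frequencies,
% Equations (6) and (7), using the FMCW parameters of Table 2.
c    = 3e8;                  % speed of light [m/s]
fC   = 77e9;                 % carrier frequency [Hz]
B    = 300e6;                % swept bandwidth [Hz]
T    = 3.667e-6;             % sweep time [s]
beta = B/T;                  % frequency slope [Hz/s]

fD = 7.13e3;                 % maximum Doppler shift [Hz] (Table 2)
fB = 54.55e6;                % maximum beat frequency [Hz] (Table 2)

V = fD*c/(2*fC)              % Eq. (6): approx. 13.9 m/s, i.e. ~50 km/h
R = fB*c/(2*beta)            % Eq. (7): approx. 100 m
```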

3.2. Data Generation

The input data for the neural network were generated in the Matlab (R2023b) environment using built-in functions that simulate the movement of a pedestrian, a cyclist, and a car [27]. This section describes the properties and parameters set for each object model and then presents the scheme of radar data generation and processing, up to the final classification of the object recognized by the system. Figure 4 shows the three-dimensional models of the pedestrian and the cyclist used in this article.
The human model consists of 16 body segments: the head, the neck, the left and right shoulders, the left and right upper arms, the left and right lower arms, the left and right hips, the left and right upper legs, the left and right lower legs, and the left and right feet; the echo from the pedestrian includes components from each segment. The pedestrian gait parameters comprise the three-dimensional initial location, a random pedestrian height, a random constant walking speed, and the pedestrian’s turn and direction of movement.
The cyclist model in fact comprises both the rider and the bicycle. It consists of seven segments: the frame of the bicycle together with the rider, the pedals, the upper and lower legs of the cyclist, and the rear and front wheels of the bicycle. The parameters of the cyclist include the three-dimensional initial location, a random constant speed of movement, the turn and direction of movement, the number of spokes in the bicycle wheels, the ratio of wheel rotations to pedal rotations, and information about whether the cyclist is pedaling during the simulation.
The car model was described as an object constituting a specified platform in the Matlab environment. Hence, the car’s parameters comprise the three-dimensional initial location, a random constant speed, the turn and direction of movement, and a specified radar cross-section (RCS).
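For orientation, the following hedged Matlab sketch shows how such targets can be instantiated with the built-in models referred to above (backscatterPedestrian and backscatterBicyclist from the phased-array/radar toolboxes, plus a point platform for the car); the specific parameter values are examples drawn from the ranges in Table 3, not the authors’ configuration.

```matlab
% Illustrative instantiation of the three object models (assumed workflow).
fc = 77e9;                                % radar carrier frequency [Hz]

ped = backscatterPedestrian( ...
    'Height', 1.8, ...                    % drawn from [1.5, 2.0] m
    'WalkingSpeed', 0.5, ...              % drawn from [0.1, 1.0] m/s
    'InitialPosition', [10; 0; 0], ...    % x in [0, 50] m, y in [-10, 10] m
    'InitialHeading', 45, ...             % heading in [0, 360] degrees
    'OperatingFrequency', fc);

bike = backscatterBicyclist( ...
    'NumWheelSpokes', 18, ...
    'Speed', 5, ...                       % drawn from [1, 10] m/s
    'InitialPosition', [20; -5; 0], ...
    'InitialHeading', 0, ...
    'GearTransmissionRatio', 1.5, ...     % wheel-to-pedal rotation ratio
    'Coast', false, ...                   % cyclist is pedaling
    'OperatingFrequency', fc);

car = phased.Platform( ...                % car as a constant-velocity platform
    'InitialPosition', [30; 5; 0], ...
    'Velocity', [0.5; 0; 0]);
carTarget = phased.RadarTarget( ...       % car reflectivity as a point RCS
    'MeanRCS', 10, 'OperatingFrequency', fc);
```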
This article proposes a novel method of generating input data for a convolutional neural network. Its scheme is shown in Figure 5. In the first step, a signal is generated by the FMCW radar. The signal goes to the transmitter, which emits the signal into space. A specific area in front of the radar was assumed, in which the pedestrian, cyclist, and car models described earlier were placed. This article assumes 10 cases of different configurations of objects in the radar beam. Part of the signal emitted by the radar returns back to the radar receiver, where further processing of the received signal takes place.
In the case of a non-rigid pedestrian target, in addition to the pedestrian’s actual movement (as the center of gravity), there are usually micro-movements, such as waving of the limbs or turning of the head. The effect of Doppler frequency modulation on radar echoes caused by micro-movements is referred to as the micro-Doppler effect.
The micro-Doppler effect can be visualized in the form of a spectrogram using the short-time Fourier transform (STFT); hence, the received signal in the radar’s receiving path was subjected to the STFT. Even if the pedestrian’s center of gravity moves at a constant speed, different parts of the human body can have different speeds, as well as different acceleration and deceleration profiles. Therefore, the spectrogram of a radar echo from a walking pedestrian will contain different Doppler signatures produced by different parts of the body. The speed and periodicity information associated with walking can be used as unique radar signatures to detect and classify pedestrians.
The spectrograms obtained in this way for 10 scenarios form the input data for the special convolutional neural network structure that serves as the final classifier distinguishing the types of objects in front of the radar. The following 10 cases of objects appearing in front of the radar were assumed for the simulation:
1. Single pedestrian;
2. Two pedestrians;
3. Three pedestrians;
4. Single cyclist;
5. Single car;
6. Single pedestrian and single car;
7. Single pedestrian and single cyclist;
8. Single car and single cyclist;
9. Two pedestrians and a single car;
10. Two pedestrians and a single cyclist.
The image database contained 1000 images for each of the 10 classes, each image having dimensions of 875 × 656 pixels. Examples of images representing all 10 classes are shown in Figure 6. Looking at the representatives of all the classes, one can appreciate the difficulty of unambiguously assigning a given image to the correct class: the spectrograms of the individual classes strongly resemble one another, which makes the database difficult to process. This is another problem that the authors solve in this article.
This article uses radar imagery generated in the Matlab environment. Nevertheless, it is worth noting that there are many databases with radar imagery widely available on the Internet, freely available for scientific research purposes [7]. For the authors, the purpose of this article is to test the performance and correctness of the proposed convolutional neural network structure for generated radar signals reflected from selected available objects.

3.3. The Short-Time Fourier Transform

This section discusses the basics of the time-varying frequency analysis of non-stationary signals using the short-time Fourier transform (STFT). The STFT applies to all measurement situations in which it is important to extract the information contained in the time- or space-varying frequency spectrum of the signal under study. The continuous STFT can be interpreted as a Gabor transform that has not been discretized in time and frequency. Its definition in the time and frequency domains is as follows [28]:
$STFT_x^{T}(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, \gamma^{*}(\tau - t)\, e^{-j 2\pi f \tau}\, d\tau$ (8)

$STFT_x^{F}(t, f) = e^{-j 2\pi f t} \int_{-\infty}^{+\infty} X(\nu)\, \Gamma^{*}(\nu - f)\, e^{j 2\pi \nu t}\, d\nu$ (9)
where $\gamma(t)$ denotes the temporal observation window and $\Gamma(f)$ is its Fourier spectrum. Equations (8) and (9) are often called “the Moving Window Method (MWM)” in the time domain (8) and the frequency domain (9), respectively. In the time domain, the STFT amounts to performing an ordinary Fourier transform on successive signal fragments “clipped” by the moving window $\gamma(t)$. For the short-time Fourier transform, the so-called spectrogram is defined as the square of its modulus [28]:
$S_x^{SPECTROGRAM}(t, f) = \left| STFT_x(t, f) \right|^{2}$ (10)
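As a minimal sketch of how Equation (10) is evaluated in practice (our assumptions, not the authors’ code), the Signal Processing Toolbox spectrogram function can serve as the STFT. Here a synthetic Doppler tone with a sinusoidal micro-Doppler component stands in for the simulated radar echo, and the window and FFT lengths are illustrative.

```matlab
% Micro-Doppler spectrogram via the STFT, Equation (10). Illustrative only.
prf = 1/3.667e-6;                         % slow-time sample rate: one sample per chirp
ts  = (0:2^17-1)/prf;                     % slow-time axis (~0.5 s)
% Stand-in signal: 7.13 kHz Doppler tone with a 2 Hz micro-Doppler modulation.
x = exp(1j*2*pi*7.13e3*ts + 1j*50*sin(2*pi*2*ts));

win  = hamming(1024);                     % moving observation window gamma(t)
nov  = 768;                               % 75% window overlap
nfft = 2048;
[S, f, t] = spectrogram(x, win, nov, nfft, prf, 'centered');

imagesc(t, f/1e3, 10*log10(abs(S).^2 + eps));  % |STFT|^2 in dB, Eq. (10)
axis xy; xlabel('Time (s)'); ylabel('Doppler frequency (kHz)');
```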

3.4. Deep Learning Network

This article uses a CNN model, which is characterized by a multilayer (deep) structure. The model includes inter-area convolutional connections in its first layers, which allows efficient feature extraction from the input data. Batch normalization layers were used to speed up and stabilize the learning process [29]. The model also uses pooling layers, which reduce the dimensionality of the data and increase the invariance to translation [30].
These elements make up the comprehensive architecture of the CNN model, which enables efficient classification of input data. The architecture of the network used is shown in Figure 7.
The CNN network was created from scratch using a program in Matlab. The final network structure, defined after many preliminary experiments, contained four convolutional layers:
  • The first layer consists of 16 filters (neurons) with a 5 × 5 receptive field, a 2 × 2 stride, a batch normalization layer, ReLU, and max pooling;
  • The second layer consists of 32 filters with a 5 × 5 receptive field, a 2 × 2 stride, a batch normalization layer, ReLU, and max pooling;
  • The third layer consists of 64 filters with a 5 × 5 receptive field, a 2 × 2 stride, a batch normalization layer, ReLU, and max pooling;
  • The fourth layer consists of 128 filters with a 5 × 5 receptive field, a 2 × 2 stride, a batch normalization layer, ReLU, and max pooling.
In addition, there were a flatten layer and one fully connected layer containing 10 neurons representing the 10 recognized image classes; a sketch of this architecture in Deep Learning Toolbox notation follows.
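The following hedged Matlab sketch expresses the described structure with standard Deep Learning Toolbox layers; the input size, the padding, and the softmax/classification output are our assumptions (the flattening is handled implicitly by the fully connected layer).

```matlab
% Four convolutional blocks as described above; illustrative sketch only.
layers = [
    imageInputLayer([227 227 3])                % assumed resized RGB input

    convolution2dLayer(5, 16, 'Stride', 2, 'Padding', 'same')   % block 1
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)

    convolution2dLayer(5, 32, 'Stride', 2, 'Padding', 'same')   % block 2
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)

    convolution2dLayer(5, 64, 'Stride', 2, 'Padding', 'same')   % block 3
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)

    convolution2dLayer(5, 128, 'Stride', 2, 'Padding', 'same')  % block 4
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)

    fullyConnectedLayer(10)                     % 10 object classes
    softmaxLayer
    classificationLayer];
```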
The input layer of a convolutional neural network is the first layer of the network that accepts input data. The input layer does not perform any convolution or activation operations. Its task is only to accept the input data and pass it to subsequent layers of the network, preserving its original dimensions and properties. This is the starting point of the neural network’s data processing [31].
The convolutional layer is the key element of convolutional neural networks (CNNs). The convolution operation is a basic image processing technique that extracts features from the input data by moving a filter (mask) over the image and calculating the scalar product between the image pixel values and the filter weights [31]. In the case of images, the data array is represented by a two-dimensional matrix I whose elements represent pixel brightness levels I(m, n), and the kernel K is two-dimensional. The two-dimensional convolution operation is then written in the form [32]:
$Y(i, j) = (I * K)(i, j)$ (11)

$Y(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m,\, j - n) = \sum_{m} \sum_{n} I(i - m,\, j - n)\, K(m, n)$ (12)
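As a quick numerical illustration of Equation (12), the built-in conv2 function can be used (note that CNN layers in practice compute a cross-correlation with learned weights, which equals convolution with a flipped kernel):

```matlab
% Numerical check of the 2D convolution of Equation (12).
I = magic(4);               % a 4 x 4 "image"
K = [1 0; 0 -1];            % a 2 x 2 kernel
Y = conv2(I, K, 'valid')    % Y(i,j) = sum over m,n of I(i-m, j-n)*K(m,n)
```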
In CNNs, the input to the first hidden layer is an RGB representation of the images arranged in tensor form. The convolution operation in the first convolutional layer covers all three RGB channels of the image (the contributions of the individual channels are summed), each channel with its own kernel weights, the whole forming the linear filter that constitutes an analysis neuron. In subsequent layers, this operation covers multiple (user-determined) feature maps from the preceding layer.
In data processing, the convolution operation used in CNNs has important advantages over the ordinary matrix operations of classical networks. These include the locality of the connections, the shared (repeated) values of the weights of the filter shifted across the image, and the resulting invariance (equivariance) with respect to shifts [32].
Batch normalization is used to combat a phenomenon known as internal covariate shift, in which the distribution of the inputs to the individual layers changes during training, making the learning process more difficult. Batch normalization works by normalizing the input data for each training mini-batch. This involves calculating the mean and standard deviation of each feature in a given mini-batch and then transforming the input data so that they have zero mean and unit standard deviation. In this way, even when the model parameters change during training, the input data are kept within a stable range of values, which facilitates the network learning process [33].
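Written out per feature over a mini-batch $\mathcal{B} = \{x_1, \ldots, x_m\}$, the transformation of [33] is (here $\gamma$ and $\beta$ denote the learned scale and shift of the batch normalization layer, not the window and frequency slope used earlier):

$\mu_{\mathcal{B}} = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \sigma_{\mathcal{B}}^{2} = \frac{1}{m} \sum_{i=1}^{m} \left( x_i - \mu_{\mathcal{B}} \right)^{2}$

$\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$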

4. Experiments and Results

Based on the data in the tables below, simulations were carried out to train a deep neural network capable of recognizing 10 classes of objects. Table 2 contains the parameters of the FMCW radar used for signal generation. Table 3 contains the movement parameters of the pedestrian, the cyclist, and the car.
In order to carry out the research, the following equipment (hardware and software) was used:
  • Dell 13th Gen Intel(R) Core(TM) i5-1345U 1.60 GHz, 16 GB RAM, Windows 11 Pro;
  • Matlab Version: 23.2.0.2365128 (R2023b), manufacturer: MathWorks, Inc., Natick, MA, USA.
The research experiment was conducted on 10,000 images (10 object classes of 1000 images each). The images were randomly divided into three sets: training data (70% of the images) were used to estimate the weights of the artificial neural network, validation data (15%) were used to monitor the network during training, and test data (15%) were used to evaluate the network after training. A hedged sketch of this pipeline follows.
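The split and training loop can be written compactly as below; the folder layout and the option values other than the split, the epoch count, and the learning rate (cf. Table 4) are our assumptions rather than the authors’ exact configuration.

```matlab
% 70/15/15 split of the spectrogram images and network training (sketch).
imds = imageDatastore('spectrograms', ...       % assumed folder layout:
    'IncludeSubfolders', true, ...              % one subfolder per class
    'LabelSource', 'foldernames');
[imdsTrain, imdsVal, imdsTest] = ...
    splitEachLabel(imds, 0.7, 0.15, 0.15, 'randomized');

options = trainingOptions('adam', ...
    'InitialLearnRate', 1e-4, ...               % cf. Table 4, Simulation 3
    'MaxEpochs', 10, ...                        % cf. Table 4, Simulation 2
    'ValidationData', imdsVal, ...
    'Plots', 'training-progress');

net = trainNetwork(imdsTrain, layers, options); % 'layers' from Section 3.4
pred = classify(net, imdsTest);                 % held-out test evaluation
acc  = mean(pred == imdsTest.Labels)            % test accuracy
```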
Figure 8 shows the simulation results in the form of a learning curve for 10 epochs, where the accuracy reached 95.93%. In addition, a confusion matrix is presented in Figure 9 to show how the adopted model performs against target classes in the dataset.
The confusion matrix displays correctly classified examples (blue) against misclassified examples (orange). As can be seen, the adopted model made errors mainly when distinguishing between a single cyclist and the case in which two pedestrians and a car were in front of the radar.
The simulation results presented above are the best of the many results obtained. They are the result of time-consuming simulations that differed in many particular parameters such as the number of images for each class, the number of epochs, or the different values of the neural network hyperparameters. A summary of all simulations performed is presented collectively in Table 4.
In order to check the performance of the proposed method, a comparison was made with other available methods. For this purpose, the methods described in the articles [34,35] were implemented in the Matlab environment. The first paper formulates the multiuser automatic modulation classification (mAMC) of compound signals as a multi-label learning task, which aims to recognize the modulation type of each constituent signal in a compound signal [34]. The second article proposes a semantic-based learning network (SLN) that simultaneously learns modulation classification and parameter regression of frequency-modulated continuous-wave (FMCW) signals [35]. To use common input data, both of the above methods were modified so that they were capable of recognizing the 10 classes of objects. Figure 10 shows the performance of the three methods, each capable of recognizing pedestrians, cyclists, and cars, evaluated on the same 10,000 images and with the same 70/15/15 training/validation/test split described above.
The results obtained show the highest accuracy for the method proposed by the authors of this article. These results deserve some comment. The selection of a neural network structure is never random: the literature shows that choosing a sensible initial structure, well suited to the specifics of the task to be solved, can significantly reduce the learning time and improve the final results [36]. Hence, the authors conclude that one of the reasons for the best result of the newly developed method is the proper match between the network structure, the learning process, and the input data.

5. Discussion and Conclusions

Deep neural networks and related deep learning have created new prospects for the development of artificial intelligence, while the creative combination of this technology with radar allows their real application in the daily functioning of ordinary people.
Deep neural networks, which combine simultaneously two functions—that is, the generation of diagnostic features and the final classifier—are able to process original data without the need for preliminary expert analysis.
After a deep analysis of the literature, the authors of this article proposed an approach missing from it: one based on simulated data generated in the Matlab environment and a novel method of identifying objects using an FMCW radar operating at 77 GHz, processing the received signals with the STFT, and then classifying the resulting spectrograms with a specialized deep neural network structure.
Accordingly, this paper presents a method for recognizing and distinguishing a group of objects on the basis of radar signatures of objects and a special convolutional neural network structure. The proposed approach is based on a database of radar signatures generated on pedestrian, cyclist, and car models in the Matlab environment.
Innovative aspects of the work include a method for discriminating multiple objects on a single radar signature, a dedicated convolutional neural network architecture, and the use of a method for generating a custom input database.
Section 4, which presents research experiments and recognition results for 10 classes of objects, confirms the high efficiency of the developed method. In order to check the performance of the proposed method, a comparison was made with other available methods. For this purpose, the methods described in articles [34,35] were implemented in the Matlab environment. The results obtained suggest the highest accuracy for the method proposed by the authors of this article.
The obtained simulation results and positive tests provide a basis for applying this system in many sectors and areas of the economy. One possibility is an intelligent vehicle that provides real-time pedestrian detection and warning mechanisms, vibrating the steering wheel and displaying a message on the monitor/dashboard to warn the driver of a collision with a pedestrian. In addition, the proposed solution can serve as an automatic detection system for people, animals, or vehicles in settings such as production halls, where accidental, uncontrolled interference by a person or another object can disrupt the normal operation of the enterprise or expose people, including employees, to loss of life or health.
Further research directions and work of the authors of this article will be directed towards the practical implementation of the above-mentioned research, including the purchase of the necessary FMCW radar, the execution of practical tests, and the verification and improvement of the developed convolutional neural network structure.

Author Contributions

Conceptualization, B.Ś. and A.Ś.; methodology, B.Ś.; software, B.Ś.; validation, A.Ś. and B.Ś.; formal analysis, B.Ś.; investigation, B.Ś.; resources, B.Ś.; data curation, B.Ś.; writing—original draft preparation, B.Ś.; writing—review and editing, A.Ś.; visualization, B.Ś.; supervision, A.Ś.; project administration, A.Ś.; funding acquisition, A.Ś. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef] [PubMed]
  2. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
  3. Fukushima, K. Neocognitron—A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202. [Google Scholar] [CrossRef] [PubMed]
  4. LeCun, Y.; Bengio, Y. Convolutional Networks for Images, Speech, and Time-Series. The Handbook of Brain Theory and Neural Networks; MIT Press: Cambridge, MA, USA, 1995. [Google Scholar]
  5. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  6. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef] [PubMed]
  7. Geng, Z.; Yan, H.; Zhang, J.; Zhu, D. Deep-Learning for Radar: A Survey. IEEE Access 2021, 9, 141800–141818. [Google Scholar] [CrossRef]
  8. Rypulak, A. Sensory obrazowe bezzałogowych statków powietrznych; Polish Air Force University: Dęblin, Poland, 2023. [Google Scholar]
  9. Mason, E.; Yonel, B.; Yazici, B. Deep learning for radar. In Proceedings of the 2017 IEEE Radar Conference (RadarConf), Seattle, WA, USA, 8–12 May 2017; pp. 1703–1708. [Google Scholar]
  10. Chen, V. The Micro-Doppler Effect in Radar; Artech House: Norwood, MA, USA, 2019. [Google Scholar]
  11. Ying, L.; Chen, Y.; Zhu, Y.; Li, W.; Zhang, Q. Doppler effect and micro-Doppler effect of vortex-electromagnetic-wave-based radar. IET Radar Sonar Navig. 2020, 14, 2–9. [Google Scholar]
  12. Baczyk, M.K.; Samczynski, P.; Kulpa, K.; Misiurewicz, J. Micro-Doppler signatures of helicopters in multistatic passive radars. IET Radar Sonar Navig. 2015, 9, 1276–1283. [Google Scholar] [CrossRef]
  13. Belgiovane, D.; Chen, C. Micro-Doppler characteristics of pedestrians and bicycles for automotive radar sensors at 77 GHz. In Proceedings of the 2017 11th European Conference on Antennas and Propagation (EUCAP), Paris, France, 19–24 March 2017; pp. 2912–2916. [Google Scholar]
  14. Dadon, Y.; Yamin, S.; Feintuch, S.; Permuter, H.; Bilik, I.; Taberkian, J. Moving target classification based on micro-Doppler signatures via deep learning. In Proceedings of the IEEE Radar Conf. (RadarConf), Atlanta, GA, USA, 7–14 May 2021; pp. 1–6. [Google Scholar]
  15. Yang, Y.; Hou, C.; Lang, Y.; Sakamoto, T.; He, Y.; Xiang, W. Omnidirectional motion classification with monostatic radar system using microDoppler signatures. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3574–3587. [Google Scholar] [CrossRef]
  16. Hadhrami, E.A.; Mufti, M.A.; Taha, B.; Werghi, N. Ground Moving Radar Targets Classification Based on Spectrogram Images Using Convolutional Neural Networks. In Proceedings of the 19th International Radar Symposium (IRS), Bonn, Germany, 20–22 June 2018; pp. 1–9. [Google Scholar]
  17. Hadhrami, E.A.; Mufti, M.A.; Taha, B.; Werghi, N. Transfer learning with convolutional neural networks for moving target classification with micro-Doppler radar spectrograms. In Proceedings of the International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 26–28 May 2018; pp. 148–154. [Google Scholar]
  18. Hadhrami, E.A.; Mufti, M.A.; Taha, B.; Werghi, N. Learned Micro-Doppler Representations for Targets Classification Based on Spectrogram Images. IEEE Access 2019, 7, 139377–139387. [Google Scholar] [CrossRef]
  19. Kim, B.K.; Kang, H.-S.; Park, S.-O. Drone Classification Using Convolutional Neural Networks With Merged Doppler Images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 38–42. [Google Scholar] [CrossRef]
  20. Rahman, S.; Robertson, D. Classification of drones and birds using convolutional neural networks applied to radar micro-Doppler spectrogram images. IET Radar Sonar Navig. 2020, 14, 653–661. [Google Scholar] [CrossRef]
  21. Mendis, G.J.; Wei, J.; Madanayake, A. Deep learning cognitive radar for micro UAS detection and classification. In Proceedings of the Cognitive Communications for Aerospace Applications Workshop (CCAA), Cleveland, OH, USA, 27–28 June 2017; pp. 1–5. [Google Scholar]
  22. Mendis, G.J.; Randeny, T.; Wei, J.; Madanayake, A. Deep learning based doppler radar for micro UAS detection and classification. In Proceedings of the MILCOM 2016—2016 IEEE Military Communications Conference, Baltimore, MD, USA, 1–3 November 2016; pp. 924–929. [Google Scholar]
  23. Feng, B.; Chen, B.; Liu, H. Radar HRRP target recognition with deep networks. Pattern Recognit. 2017, 61, 379–393. [Google Scholar] [CrossRef]
  24. Jankiraman, M. FMCW Radar Design; Artech House: Norwood, MA, USA, 2018. [Google Scholar]
  25. Skolnik, M. Radar Handbook, 3rd ed.; McGraw-Hill: New York, NY, USA, 2008. [Google Scholar]
  26. Budge, M.; German, S. Basic Radar Analysis, 2nd ed.; Artech House: Norwood, MA, USA, 2020. [Google Scholar]
  27. Cuevas, E.; Luque, A.; Escobar, H. Computational Methods with MATLAB; Springer Nature: Cham, Switzerland, 2024. [Google Scholar]
  28. Grigoryan, A.M. Fourier transform representation by frequency-time wavelets. IEEE Trans. Signal Process. 2005, 53, 2489–2497. [Google Scholar] [CrossRef]
  29. Giusti, A.; Cireşan, D.C.; Masci, J.; Gambardella, L.M.; Schmidhuber, J. Fast image scanning with deep max-pooling convolutional neural networks. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, VIC, Australia, 15–18 September 2013; pp. 4034–4038. [Google Scholar]
  30. Agarwal, T.; Sugavanam, N.; Ertin, E. Sparse Signal Models for Data Augmentation in Deep Learning ATR. In Proceedings of the 2020 IEEE Radar Conference (RadarConf20), Florence, Italy, 21–25 September 2020; pp. 1–6. [Google Scholar]
  31. Matuszewski, J.; Pietrow, D. Specific Radar Recognition Based on Characteristics of Emitted Radio Waveforms Using Convolutional Neural Networks. Sensors 2021, 21, 8237. [Google Scholar] [CrossRef] [PubMed]
  32. Osowski, S. Sieci neuronowe do przetwarzania informacji; Oficyna Wydawnicza Politechniki Warszawskiej: Warsaw, Poland, 2020. [Google Scholar]
  33. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 448–456. [Google Scholar]
  34. Zhu, M.; Li, Y.; Pan, Z.; Yang, J. Automatic modulation recognition of compound signals using a deep multi-label classifier: A case study with radar jamming signals. Signal Process. 2020, 169, 107393. [Google Scholar]
  35. Chen, K.; Zhang, J.; Chen, S.; Zhang, S.; Zhao, H. Recognition and Estimation for Frequency-Modulated Continuous-Wave Radars in Unknown and Complex Spectrum Environments. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 6098–6111. [Google Scholar] [CrossRef]
  36. Tadeusiewicz, R.; Gąciarz, T.; Borowik, B.; Leper, B. Odkrywanie właściwości sieci neuronowych przy użyciu programów w języku C#; Polska Akademia Umiejętności: Kraków, Poland, 2007. [Google Scholar]
Figure 1. Area of interest of the radar.
Figure 2. Chirp signal generated by the FMCW radar.
Figure 3. Block diagram of the FMCW radar.
Figure 4. Simulated models of the pedestrian and cyclist.
Figure 5. Block diagram of the simulation.
Figure 6. Datasets: 1−single pedestrian; 2−two pedestrians; 3−three pedestrians; 4−single cyclist; 5−single car; 6−single pedestrian and single car; 7−single pedestrian and single cyclist; 8−single car and single cyclist; 9−two pedestrians and a single car; 10−two pedestrians and a single cyclist.
Figure 7. Data flow structure and CNN network structure.
Figure 8. Training progress.
Figure 9. Confusion matrix.
Figure 10. Comparison of the three object recognition methods.
Table 1. List of acronyms.

Acronym | Definition
ATR | Automatic Target Recognition
CNN | Convolutional Neural Network
DBN | Deep Belief Network
DL | Deep Learning
FMCW | Frequency-Modulated Continuous-Wave
HRRP | High-Resolution Range Profile
LSTM | Long Short-Term Memory
MWM | Moving Window Method
RBM | Restricted Boltzmann Machine
SCF | Spectral Correlation Function
STFT | Short-Time Fourier Transform
Table 2. Parameters of the FMCW radar.

Parameter | Value
Carrier frequency | 77 GHz
Range resolution | 0.5 m
Maximum detectable range | 100 m
Maximum detectable speed | 50 km/h
Sweep time | 3.667 µs
Sample rate | 300 MHz
Maximum beat frequency | 54.55 MHz
Maximum Doppler shift | 7.13 kHz
Table 3. Parameters of the pedestrian, cyclist, and car.

Pedestrian
Parameter | Value
Initial position | x ∈ [0, 50] m, y ∈ [−10, 10] m, z = 0 m
RCS | [0.5, 1] m²
Height | [1.5, 2.0] m
Speed | [0.1, 1.0] m/s
Heading | [0, 360] degrees

Cyclist
Parameter | Value
Initial position | x ∈ [0, 50] m, y ∈ [−10, 10] m, z = 0 m
RCS | [2, 4] m²
Number of spokes per wheel | 18
Speed | [1, 10] m/s
Heading | [0, 360] degrees
Ratio of wheel rotations to pedal rotations | [0.5, 6]
Pedaling status information | True or False

Car
Parameter | Value
Initial position | x ∈ [0, 50] m, y ∈ [−10, 10] m, z = 0 m
RCS | [10, 20] m²
Height | [1.5, 2.0] m
Motion model | Constant velocity
Speed | [0.1, 1.0] m/s
Radar cross-section | 10 m²
Table 4. Summary of simulation results.

Simulation 1
Number of images (per class) | Accuracy
50 | 33.23%
100 | 56.67%
400 | 71.08%
700 | 88.54%
1000 | 94.22%

Simulation 2 (10 classes of 1000 images each)
Number of epochs | Accuracy
3 | 88.54%
5 | 92.12%
7 | 94.04%
10 | 95.22%

Simulation 3 (10 classes of 1000 images each, 10 epochs)
Initial learning rate | Training time
0.00001 | 2560 min
0.0001 | 1140 min
0.001 | 450 min
