1. Introduction
With the increasing application of radar maneuvering target tracking in both civil and military domains, how to use the data measured by radar sensors to obtain more accurate target state estimates has become a prominent research focus. Single-station passive positioning, in which a radar sensor passively receives the signal radiated by a source in order to locate that source, is highly regarded for its strong concealment and broad applicability [1]. In the past, pure-azimuth passive positioning was commonly used for target localization and tracking [2]. However, relying solely on the nonlinear relationship between angle measurements and target states makes it challenging to obtain accurate target state estimates. Researchers subsequently found that introducing additional observables, such as the angle change rate and the Doppler frequency, can enhance localization accuracy [3,4]. For instance, combining azimuth and Doppler observables enables more precise target state estimation.
However, using the combined observations of azimuth and Doppler places certain requirements on the relative positions of the observation point and the target trajectory. In specific cases, the observation point experiences weak observation or non-observation during some periods of relative motion [5,6], as shown in Figure 1. When a maneuvering target moves with a uniform or uniformly variable speed along the line connecting it to the observation point, both the azimuth and Doppler measurements have a rate of change of 0. During these periods, the target becomes unobservable because there is no significant change in the measured information. Similarly, when the maneuvering target follows a uniform circular motion relative to the observation point, the Doppler measurement remains constant and only the azimuth measurement provides observable information. This observation is considered weak because of the lack of Doppler information, which limits the accuracy of target state estimation during such periods.
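The two degenerate geometries described above can be checked numerically. The following minimal sketch assumes an observer at the origin, noise-free measurements, an azimuth measured from the +Y axis (taken here as north), and a Doppler velocity defined as the range rate. The function names are illustrative and are not taken from the original work, and only the constant-velocity radial case and the circular case centered on the observer are covered.

```python
import math

def azimuth_doppler(x, y, vx, vy):
    """Noise-free azimuth (rad) and Doppler velocity (m/s) seen from the origin.

    Assumptions: azimuth is measured from the +Y axis (north) toward +X,
    and the Doppler velocity is the range rate (positive when receding).
    """
    azimuth = math.atan2(x, y)
    doppler = (x * vx + y * vy) / math.hypot(x, y)
    return azimuth, doppler

def radial_cv(x0, y0, speed, steps, dt=1.0):
    """Constant-velocity motion along the line of sight (moving away)."""
    r0 = math.hypot(x0, y0)
    ux, uy = x0 / r0, y0 / r0  # unit vector from observer to target
    vx, vy = speed * ux, speed * uy
    return [azimuth_doppler(x0 + vx * k * dt, y0 + vy * k * dt, vx, vy)
            for k in range(steps)]

def circular(radius, omega, steps, dt=1.0):
    """Uniform circular motion centered on the observer (omega in rad/s)."""
    out = []
    for k in range(steps):
        a = omega * k * dt
        x, y = radius * math.cos(a), radius * math.sin(a)
        vx, vy = -radius * omega * math.sin(a), radius * omega * math.cos(a)
        out.append(azimuth_doppler(x, y, vx, vy))
    return out

def spread(values):
    """Largest minus smallest value; zero means the measurement never changes."""
    return max(values) - min(values)

radial = radial_cv(3000.0, 4000.0, speed=50.0, steps=8)
circle = circular(5000.0, math.radians(5.0), steps=8)

# Radial case: neither measurement changes over time.
print(spread([a for a, _ in radial]), spread([d for _, d in radial]))
# Circular case: the Doppler velocity stays constant, only the azimuth moves.
print(spread([a for a, _ in circle]), spread([d for _, d in circle]))
```

In the radial case both spreads are zero up to rounding, which corresponds to the non-observation condition. In the circular case only the azimuth spread is nonzero, which corresponds to the weak observation condition.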
In short, the single-station passive positioning technique based on combined azimuth and Doppler measurements has limitations in the range over which a maneuvering target can be observed. Meanwhile, maneuvering target states are uncertain and complex in practice, and traditional single-model algorithms used for weak maneuvering target tracking, such as the extended Kalman filter (EKF) and the unscented Kalman filter (UKF), struggle to estimate such complex maneuvering states accurately [7]. To solve this kind of complex strong maneuvering target tracking problem, multiple-model (MM) algorithms and their variants have been proposed and widely used [8]. Taking the interacting multiple-model (IMM) algorithm as an example [9,10], it usually requires predefining a set of target maneuvering models, model probabilities, and model transition probabilities for the estimation of target states. However, in practical applications, this traditional target tracking algorithm suffers from delayed model estimation when the target maneuvering state changes abruptly [11]. In addition, like single-model target tracking algorithms, a multiple-model algorithm must set up its motion models in advance, so the preset models may be inaccurate in complex maneuvering situations. Together with the observation noise of the radar sensors themselves and various other adverse factors, this has a negative impact on the IMM algorithm [12]. Particle filter algorithms are also extensively employed in tracking applications. For instance, the paper [13] proposes a cost-referenced particle filter, which estimates the state of discrete dynamic stochastic systems by dynamically optimizing user-defined cost functions; it also presents a novel particle selection algorithm suitable for parallel computing, addressing the primary limitations of particle filter resampling. Additionally, a study [14] proposes a shadow filter that departs from the statistical basis of Kalman filtering and instead relies on deterministic dynamics to address tracking problems; it also analyzes how the parameters of the proposed filter influence its performance.
In recent years, with the advancement of deep learning, research on using neural network models to overcome the limitations of traditional target tracking algorithms has expanded significantly [15]. Leveraging the distinctive advantages of neural networks [16], it becomes feasible to dispense with predefined motion models and to perform end-to-end prediction from observations to maneuvering target states. An earlier study introduced a maneuvering trajectory prediction method based on a backpropagation neural network (BPNN), which uses the historical trajectory of the target to capture its motion patterns and generate predicted trajectories [17]. Furthermore, the articles [18,19,20] address the challenges that the inherent uncertainty in both maneuvering target states and measurement information poses for traditional target tracking algorithms, and they present methods that better model the long-term dependence among sequence data through a gating mechanism. Subsequently, the articles [21,22] address the limitation of the long short-term memory (LSTM) model in capturing the global nature of the target maneuvering state by using the transformer architecture, which captures both long-term and short-term dependence of the target state and further enhances the accuracy of target tracking algorithms.
To achieve accurate estimation of strong maneuvering target states from the combined observations of azimuth and Doppler, the first step is to use a time series of observations for state prediction, because the azimuth-Doppler information at a single moment carries insufficient temporal information. Time series prediction typically relies on three fundamental structures, namely recurrent neural networks (RNN), transformer-based networks (TBN), and temporal convolutional networks (TCN) [23,24,25]. To leverage the property that temporal relationships are largely preserved even after a time series is downsampled into two subsequences, we propose a recursive downsample convolution interactive learning neural network (RDCINN), based on the convolutional neural network (CNN) architecture [26], to address the challenge of motion state estimation. Our approach applies several key operations to extract motion features from the input observation time series. We first apply a fully connected layer and a positional encoding layer to perform temporal encoding of the input. We then proceed with recursive downsampling, temporal convolution, and interactive learning operations. In each layer, multiple convolutional filters extract motion features from the downsampled time series. By combining the rich features gathered at multiple resolutions, we can effectively address the weak observation or non-observation problem encountered in traditional maneuvering target tracking algorithms based on the combined observations of azimuth and Doppler. Finally, the binary tree structure of our model increases the depth of temporal feature extraction, which allows effective modeling of the nonlinear mapping between high-noise observation time series and complex maneuvering states.
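The recursive downsampling step can be illustrated in isolation. The sketch below assumes that each split separates a sequence into its even-indexed and odd-indexed samples and that the splits are arranged as a binary tree. The split rule and the function names are assumptions made for illustration, and the temporal convolution and interactive learning applied at each node of the actual network are omitted.

```python
def split_even_odd(seq):
    """Downsample a sequence into its even-indexed and odd-indexed samples."""
    return seq[0::2], seq[1::2]

def downsample_tree(seq, height):
    """Return the leaf subsequences of a binary tree of even/odd splits.

    height=0 returns the sequence unchanged; each extra level halves the
    temporal resolution of every branch.
    """
    if height == 0 or len(seq) < 2:
        return [seq]
    even, odd = split_even_odd(seq)
    return downsample_tree(even, height - 1) + downsample_tree(odd, height - 1)

# Eight time steps, tree height 2: four leaves of two samples each.
leaves = downsample_tree(list(range(8)), height=2)
print(leaves)
```

Each leaf keeps the original ordering of its samples, which is the sense in which temporal relationships survive the downsampling.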
2. Problem Formulation
As in previous studies on maneuvering target tracking using deep learning approaches [19,20,21], our simulation scenarios are set on a 2D plane. In this setup, the radar observation point is positioned at the origin and passively receives the azimuth and Doppler velocity. At the k-th moment, the maneuvering target is described by a state vector and an observation vector. The state vector contains the position of the target in the X-Y plane and the corresponding velocity. The observation vector contains the measured azimuth and Doppler velocity, which follow the measurement model of [27]; each measurement is corrupted by Gaussian noise, with separate standard deviations for the azimuth and the Doppler measurement.
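One common way to write a measurement model of this kind is given below as a sketch. All symbols are assumptions introduced here for illustration: the state ordering follows the initial states quoted in the experiments, the azimuth is taken from the +Y axis (north), and the Doppler velocity is taken as the range rate. The notation and sign conventions of [27] may differ.

```latex
\begin{aligned}
\mathbf{s}_k &= [x_k,\; y_k,\; \dot{x}_k,\; \dot{y}_k]^{\mathsf{T}}, \qquad
\mathbf{z}_k = [\theta_k,\; d_k]^{\mathsf{T}}, \\
\theta_k &= \operatorname{atan2}(x_k,\, y_k) + n_{\theta,k}, \qquad
n_{\theta,k} \sim \mathcal{N}(0,\, \sigma_\theta^2), \\
d_k &= \frac{x_k \dot{x}_k + y_k \dot{y}_k}{\sqrt{x_k^2 + y_k^2}} + n_{d,k}, \qquad
n_{d,k} \sim \mathcal{N}(0,\, \sigma_d^2).
\end{aligned}
```

Here $\sigma_\theta$ and $\sigma_d$ denote the standard deviations of the azimuth and Doppler measurement noise, respectively.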
To perform target tracking using a deep learning approach, it is essential to generate an extensive dataset of trajectories and observations for training the network model [28]. This dataset enables the network to model the nonlinear relationship between target states and observations, which ultimately supports accurate estimation of maneuvering target states. Typically, the dataset is generated from a state equation and an observation equation, which are driven by state transition noise and observation noise, respectively. The state transition noise depends on the radar sensor sampling interval and on the maneuvering acceleration noise, which follows a Gaussian distribution. In the state equation, we incorporate two maneuver models, namely constant velocity (CV) motion and constant turn (CT) motion.
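For reference, the textbook discrete-time forms of the CV and CT models can be sketched as follows. The state ordering (x, y, vx, vy), the sign convention of the turning rate (positive for a counter-clockwise turn), and the function names are assumptions of this sketch, and the process noise term is left out.

```python
import math

def cv_step(state, dt):
    """Constant-velocity transition for a state (x, y, vx, vy)."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)

def ct_step(state, dt, omega):
    """Constant-turn transition; omega is the turning rate in rad/s."""
    if abs(omega) < 1e-12:
        return cv_step(state, dt)  # the CT model reduces to CV as omega -> 0
    x, y, vx, vy = state
    s, c = math.sin(omega * dt), math.cos(omega * dt)
    return (
        x + (s / omega) * vx - ((1.0 - c) / omega) * vy,
        y + ((1.0 - c) / omega) * vx + (s / omega) * vy,
        c * vx - s * vy,
        s * vx + c * vy,
    )

# Example: 4 s of turning at -7 deg/s from the initial state of trajectory A.
state = (-4000.0, 4000.0, 50.0, -66.0)
for _ in range(4):
    state = ct_step(state, dt=1.0, omega=math.radians(-7.0))
print(state)
```

The CT transition rotates the velocity vector by omega * dt at every step, so the speed is preserved while the heading changes.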
In the network model, the input is the measurement time series and the output is the estimated state sequence of the maneuvering target; both span the same total number of time steps. The CT model is parameterized by the turning rate. To evaluate the performance of the model, we employ the mean absolute error (MAE), and we define the loss function as the MAE between the estimated states of the maneuvering target and its true states.
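A plain MAE between estimated and true state sequences can be sketched as follows. The averaging over all time steps and all four state components, the split into position and velocity errors, and the function names are assumptions of this sketch; the exact weighting used in the original loss is not recoverable from the text.

```python
def mean_absolute_error(estimated, true):
    """MAE over all time steps and all state components."""
    if len(estimated) != len(true):
        raise ValueError("sequences must have the same number of time steps")
    total, count = 0.0, 0
    for est_state, true_state in zip(estimated, true):
        for e, t in zip(est_state, true_state):
            total += abs(e - t)
            count += 1
    return total / count

def position_velocity_mae(estimated, true):
    """Separate MAE for the position (x, y) and velocity (vx, vy) components."""
    pos = mean_absolute_error([s[:2] for s in estimated], [s[:2] for s in true])
    vel = mean_absolute_error([s[2:] for s in estimated], [s[2:] for s in true])
    return pos, vel

true_states = [(0.0, 0.0, 10.0, 0.0), (10.0, 0.0, 10.0, 0.0)]
estimates = [(1.0, -1.0, 9.0, 0.0), (12.0, 0.0, 10.0, 2.0)]
print(mean_absolute_error(estimates, true_states))
print(position_velocity_mae(estimates, true_states))
```

Reporting position and velocity errors separately mirrors the way tracking results are usually tabulated, since the two quantities have different units.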
4. Simulation Experiments
In this section, we design several experimental scenarios to evaluate the performance of our algorithm in predicting the states of strong maneuvering radar targets from the combined observations of azimuth and Doppler. We also explain in detail the specific parameters used in each part of the experiment.
4.1. Parameter Setting Details
We use the LASTD, which consists of 450,000 trajectories with different motion models and their corresponding observations, for a comprehensive evaluation of the algorithms’ performance in maneuvering target tracking tasks. The dataset is structured as follows: 150,000 samples are 16 s long trajectories of either uniform linear motion or uniform circular motion. Another 150,000 samples are 16 s trajectories composed of two 8 s segments, each of which is either uniform linear motion or uniform circular motion. The remaining 150,000 samples are 16 s trajectories composed of four 4 s segments, each of which is either uniform linear motion or uniform circular motion. The parameters of the LASTD are listed in Table 1. We set the distance from the radar to the target between 926 m and 18,520 m, which covers the common detection range of airport surveillance radar [30]. Aircraft rarely exceed the speed of sound in real-world scenarios, so we set the velocity of the maneuvering target in the range of −340 m/s to 340 m/s [31]. Following [32], we set the turn rate to range from −10°/s to 10°/s, and the standard deviation of the acceleration noise is randomly sampled from the range [8 m/s², 13 m/s²]. The azimuth, defined as the angle between north and the direction from the radar to the target, ranges from −180° to 180°. The standard deviation of the azimuth noise is randomly sampled from the range [1°, 1.8°], and the standard deviation of the Doppler velocity noise is set to 1 m/s, according to the funding request. Finally, we set the sampling interval T to about 1 s.
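The parameter ranges above translate directly into a sampling routine. The sketch below assumes uniform sampling within each stated range, a velocity range applied to each Cartesian component, and +Y pointing north; none of these choices is stated in the text, and the dictionary keys are illustrative names.

```python
import math
import random

def sample_scenario(rng):
    """Draw one set of trajectory parameters (uniform sampling is assumed)."""
    distance = rng.uniform(926.0, 18520.0)              # m
    bearing = math.radians(rng.uniform(-180.0, 180.0))  # measured from north
    return {
        "x": distance * math.sin(bearing),              # m, +X taken as east
        "y": distance * math.cos(bearing),              # m, +Y taken as north
        "vx": rng.uniform(-340.0, 340.0),               # m/s
        "vy": rng.uniform(-340.0, 340.0),               # m/s
        "turn_rate_deg_s": rng.uniform(-10.0, 10.0),    # deg/s
        "sigma_accel": rng.uniform(8.0, 13.0),          # m/s^2
        "sigma_azimuth_deg": rng.uniform(1.0, 1.8),     # deg
        "sigma_doppler": 1.0,                           # m/s
        "dt": 1.0,                                      # s
    }

rng = random.Random(0)
scenario = sample_scenario(rng)
print(scenario)
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which is convenient when a training set and a test set must be regenerated independently.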
In the training process, we set the following hyperparameters for our model: the dimension E of the fully connected layer is 64; the binary tree height is 2; and the kernel size, dilation rate, and group length of the convolutional layer are 5, 2, and 1, respectively. The decoding layer consists of two one-dimensional convolutional layers with dimensions of 16 and 4, respectively. We use the Adam optimizer for training. The weight decay rate is set to 1 × 10⁻⁵. The learning rate is initially set to 7 × 10⁻⁴ and is multiplied by 0.95 after each epoch. We train for 300 epochs with a batch size of 256 on a single NVIDIA 3090 GPU.
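The learning-rate schedule can be written out explicitly. This sketch assumes that the decay is multiplicative (the rate is multiplied by 0.95 once per epoch) and that epochs are counted from zero; the function name is illustrative.

```python
def learning_rate(epoch, initial=7e-4, decay=0.95):
    """Learning rate in effect during `epoch` (0-based), decayed once per epoch."""
    return initial * decay ** epoch

schedule = [learning_rate(epoch) for epoch in range(300)]
print(schedule[0], schedule[1], schedule[-1])
```

Under this reading, the rate at the final epoch is several orders of magnitude below its initial value, so most of the parameter movement happens in the early epochs.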
In our experiments, we compare our proposed algorithm with three existing algorithms: the LSTM network [19], the TBN model [21], and the traditional maneuvering target tracking method IMM-EKF [11]. We keep the model parameters of the LSTM network and the TBN model unchanged, as specified in their respective papers, and train the deep learning models on the LASTD.
4.2. Experimental Results
We first create a test dataset of 1500 trajectories to evaluate the performance of each baseline neural network model as well as our model. The dataset is similar in structure to the training set and consists of three types of trajectories with different motion patterns. Specifically, there are 500 samples of a single 16 s uniform linear or uniform circular motion trajectory, 500 samples combining two 8 s uniform linear or uniform circular motion segments, and 500 samples combining four 4 s uniform linear or uniform circular motion segments. The trajectory tracking results are shown in Table 2.
Based on the results presented in Table 2, our network achieves lower position and velocity mean absolute errors than the other two baseline neural networks. This demonstrates that our model, applied to strong maneuvering target tracking based on the combined observations of azimuth and Doppler, outperforms the previous target tracking networks.
Next, we use Monte Carlo simulation to generate a 16 s strong maneuvering trajectory A, whose initial state is [−4000 m, 4000 m, 50 m/s, −66 m/s]. This trajectory consists of four segments, each lasting 4 s, with the motion model changing between segments to reflect the sudden changes in target motion states that occur in real-world scenarios. The first segment is uniform motion. The second segment is uniform circular motion with a turning rate of −7°/s. The third segment is also uniform circular motion, but with a turning rate of 7°/s. The last segment is again uniform motion. The azimuth observation noise is zero-mean white noise with a standard deviation of 1.8°, the standard deviation of the Doppler velocity observation noise is 1 m/s, and the standard deviation of the acceleration noise is 10 m/s². To assess the tracking performance on trajectory A, we apply our network model as well as the three baseline algorithms. Table 3 presents the evaluation results, while Figure 6, Figure 7 and Figure 8 provide visual representations of these results.
To verify the applicability of our network model to strong maneuvering trajectories of different durations, we generate trajectory B and trajectory C by Monte Carlo simulation. Trajectory B is a 32 s strong maneuvering trajectory with an initial state of [−8000 m, 5000 m, −30 m/s, 21 m/s]. It consists of four 8 s segments, whose motion models are, in order: uniform circular motion with a turning rate of 6°/s, uniform motion, uniform circular motion with a turning rate of −5°/s, and uniform motion. Trajectory C is a 64 s strong maneuvering trajectory with an initial state of [−5000 m, 5000 m, 30 m/s, −23 m/s]. It also consists of four segments, each lasting 16 s, whose motion models are, in order: uniform circular motion with a turning rate of −1°/s, uniform motion, uniform circular motion with a turning rate of 2°/s, and uniform motion. Keeping the same noise standard deviations as for trajectory A, we evaluate the tracking performance on trajectories B and C using our network model and the three baseline algorithms. The evaluation results are presented in Table 4 and Table 5, and Figure 9, Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14 provide visual representations of these results.
In our simulation experiments, we find that, within the highlighted time segments marked in the figures, the change in the noisy azimuth observations is extremely subtle. When the trajectory is in the CV motion state, the change in the associated Doppler velocity observations is also subtle, and the target is in the non-observation state. When the trajectory is in the CT motion state, the target is in the weak observation state, in which only the Doppler velocity observations contribute to tracking. The highlighted segment in Figure 6 corresponds to the unobservable state of the target, and the highlighted segments in Figure 9 and Figure 12 correspond to the weak observation state.
We then test different azimuth noise standard deviations on trajectory C to further examine the generalization ability of our model, as shown in Table 6. For values within the azimuth noise standard deviation range used in the LASTD, the position MAE and velocity MAE of the target remain small, indicating that the model still performs as intended. When the azimuth noise standard deviation is adjusted to 2.8°, the position MAE and velocity MAE of the target increase greatly, and when it is adjusted to 3.8°, the position MAE and velocity MAE of the target can achieve our expected effect, which shows that our model retains a certain generalization ability.
The experimental results demonstrate that our model achieves superior trajectory tracking performance compared to other algorithms. This is particularly noticeable when tracking strong maneuvering targets under the combined observations of azimuth and Doppler.