Article

A Novel Hypersonic Target Trajectory Estimation Method Based on Long Short-Term Memory and a Multi-Head Attention Mechanism

by Yue Xu 1,*, Quan Pan 1, Zengfu Wang 1 and Baoquan Hu 2,3

1 School of Automation, Northwestern Polytechnical University, Xi’an 710129, China
2 School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China
3 School of Engineering, Xi’an International University, Xi’an 710077, China
* Author to whom correspondence should be addressed.
Entropy 2024, 26(10), 823; https://doi.org/10.3390/e26100823
Submission received: 10 September 2024 / Revised: 22 September 2024 / Accepted: 25 September 2024 / Published: 26 September 2024
(This article belongs to the Section Multidisciplinary Applications)

Abstract
To address the complex maneuvering characteristics of hypersonic targets in near space, this paper proposes an LSTM trajectory estimation method combined with an attention mechanism and optimizes the model from an information-theoretic perspective. The method captures the target dynamics using the temporal processing capability of LSTM and, at the same time, improves the efficiency of information utilization through the attention mechanism to achieve accurate prediction. First, a target dynamics model is constructed to clarify the motion behavior parameters. Subsequently, an LSTM model incorporating the attention mechanism is designed, which enables the model to automatically focus on key information fragments in the historical trajectory. During model training, information redundancy is reduced and information validity is improved through feature selection and data preprocessing. As a result, the model achieves accurate prediction of hypersonic target trajectories with limited computational resources. The experimental results show that the method performs well in complex dynamic environments, with improved prediction accuracy and robustness, reflecting the potential of information-theoretic principles in optimizing trajectory prediction models.

1. Introduction

With the rapid development of aviation technology, hypersonic aircraft have shown enormous potential for applications in military reconnaissance, rapid strikes, and civilian transportation due to their unique speed and maneuverability [1,2]. However, the high speed and maneuverability of hypersonic aircraft also pose significant challenges for trajectory estimation. Traditional trajectory estimation methods, such as dynamic and kinematic models, often require a large amount of prior information and struggle to accurately capture the complex dynamic characteristics of hypersonic aircraft during flight [3]. Therefore, developing a new method that can accurately estimate the trajectory of hypersonic targets is of great significance for improving the efficiency of tasks such as target tracking, early warning, and interception [4].
At present, research on trajectory estimation of complex maneuvering targets mainly focuses on multi-model frameworks. To meet various estimation needs, researchers have continuously improved and optimized the Interacting Multiple Model (IMM) method in aspects such as the selection of maneuvering models, the design of filtering algorithms, and the determination of weighting strategies between models. For example, He et al. [5] combined virtual maneuvering noise with first-order Markov process models and proposed a hypersonic target tracking method with adaptive corrected unbiased minimum-variance estimation. Wei et al. [6] proposed a target tracking method based on multi-hypothesis fuzzy matching to address the issue of range ambiguity in detecting weak targets with high pulse repetition frequency radar. Li et al. [7] proposed a new target-tracking mode to address the difficulty of ground radar in effectively defending against hypersonic aircraft; by dividing the tracking task into several tracking intervals of equal duration and then dispatching multiple satellites for tracking within each interval, high tracking accuracy was achieved. Huang et al. [8] achieved adaptive state estimation through unscented Kalman filtering, attaining stronger robustness and higher estimation accuracy. Li et al. [9] proposed a target tracking method based on adaptive kernel learning Kalman filtering; this method introduces the maximum entropy criterion and can update the state transfer function in real time, achieving high tracking accuracy and fast convergence. Huang et al. [10] proposed a tracking method for hypersonic aircraft based on UKF filters, which solves the prediction error problem caused by aircraft tilt-reversal maneuvers. Tang et al. [11] proposed a composite control method based on adaptive dynamic programming for contour tracking of hypersonic aircraft under multiple constraints. Liu et al. [12] conducted a detailed numerical simulation study of different maneuvering modes of hypersonic targets and, based on the simulation results, designed an innovative finite-time disturbance observer. This observer can accurately predict the acceleration of hypersonic targets during flight, providing strong support for subsequent trajectory estimation and target tracking. Yang et al. [13] proposed a Multi-Granularity Scene Understanding (MGSU) framework integrating the Transformer structure, which significantly improves the accuracy of future trajectory prediction in complex scenarios through multi-granularity fusion and inverse-reinforcement-learning path generation. Although the Transformer performs well in handling sequential data, LSTM may be a more appropriate choice when the amount of data is small or computational resources are limited: owing to its lower model complexity and smaller number of parameters, LSTM is easier to train on small datasets and requires fewer computational resources than a Transformer. In addition, LSTM may be preferred in application scenarios with very stringent real-time requirements, such as hypersonic trajectory prediction, because its inference process is relatively simple and predictable, making real-time constraints easier to meet.
However, despite significant progress in the field of trajectory estimation using multi-model frameworks, there are still many challenges when dealing with complex maneuvering targets, especially those with extremely high speeds and complex maneuverability such as hypersonic aircraft [14]. This requires trajectory estimation methods to have higher adaptability and robustness to cope with the complex behavioral characteristics exhibited by targets in dynamic environments [15].
Recently, the rapid development of deep learning [16] technology has provided new ideas for trajectory estimation. Especially the Long Short-Term Memory (LSTM) network [17], as a special type of Recurrent Neural Network (RNN) [18], has shown excellent performance in processing time series data due to its unique memory mechanism and gating unit [19]. LSTM can effectively capture long-term dependencies in time series data and exhibits good generalization ability when dealing with high-dimensional and nonlinear data [20]. Therefore, trajectory estimation methods based on LSTM have gradually become a research hotspot [21,22,23].
At present, some scholars have applied LSTM to the field of trajectory estimation and have achieved certain results. For example, Bartusiak et al. [24] proposed a machine-learning method for predicting the trajectory of maneuvering targets. Specifically, this method predicts the transition mode of the trajectory through a stochastic grammar, thereby predicting the kinematic characteristics of the aircraft. The proposed method was validated on two datasets, and the results showed that it can accurately predict the flight trajectory of the aircraft even in noisy environments. Xu et al. [25] proposed a multi-agent trajectory collaborative prediction model based on a Social Long Short-Term Memory (S-LSTM) network, which captures the interactions between aircraft by establishing an LSTM network for each aircraft and integrating the hidden states of associated aircraft, thereby improving the accuracy of trajectory predictions and the efficiency of airspace management. Pang et al. [26] introduced a weather-related trajectory prediction model that combines RNNs with convolutional layers, significantly reducing flight prediction bias and variance while enhancing management efficiency and safety during convective weather. Schimpf et al. [27] explored methods for 4D flight trajectory prediction using deep learning techniques, including LSTM, GRU, and RNN, in conjunction with various data sources; by incorporating attention mechanisms, they improved prediction accuracy. Song et al. [28] proposed a dynamic tracking algorithm based on LSTM to solve the problem of real-time tracking of maneuvering targets in complex electromagnetic environments. First, LSTM is used to learn the motion characteristics of the target from noisy input data, and then the learned features are integrated into the proposed tracking algorithm, effectively capturing the motion trajectories of small, weak maneuvering targets. Li et al. [29] proposed a multi-target tracking method based on a bidirectional LSTM network: the network is first trained using offline multi-target data, and the trained LSTM network is then used for multi-target matching. Dai et al. [30] proposed a target trajectory tracking method based on an LSTM network, which learns and extracts the target’s motion pattern from the historical trajectory data of the target. Subsequently, to further improve the accuracy and dynamic response of state estimation, a Kalman filtering algorithm dynamically adjusts and optimizes the trajectory state predicted by the LSTM network in real time. This combination enables the trajectory tracking system to learn the complex motion laws of the target while remaining robust in the face of uncertainty and noise. Yang et al. [31] proposed an innovative pedestrian trajectory prediction method called GTPPO, which integrates LSTM-encoded temporal attention movement patterns, graph-attention-based social interaction capture, and uncertainty handling for multi-modal outputs. GTPPO demonstrated outstanding performance across multiple datasets, particularly excelling at handling sudden changes in movement, showcasing state-of-the-art prediction capability. However, these studies mainly focus on the trajectory estimation of low-speed or medium-speed targets, and there is relatively little research on the trajectory estimation of hypersonic targets.
Hypersonic aircraft have extremely high speeds and complex maneuverability during flight, and their trajectory data often exhibits high nonlinearity and uncertainty, which poses greater challenges to LSTM-based trajectory estimation methods.
To solve the above problems, this paper proposes a novel hypersonic target trajectory estimation method based on LSTM and multi-head attention mechanism. It aims to accurately capture the dynamic characteristics of hypersonic vehicles during flight and realize the accurate prediction of target trajectories. The main contributions of this paper are as follows:
(1)
In trajectory estimation, the motion trajectories of targets exhibit significant long-term dependencies, and these dependencies contain rich information. The LSTM network proposed in this paper can effectively capture and model this long-term dependency information through its unique structural advantages, thereby accurately predicting the target’s trajectory.
(2)
Traditional LSTM models usually assume that all input information is of equal importance, i.e., given the same weight, when dealing with time series data. However, in the complex scenario of hypersonic target trajectory estimation, data at different time points carry different amounts of information, which naturally have different impacts on the final estimation results. The LSTM model incorporating the attention mechanism proposed in this paper can dynamically assess the importance of the data based on its information entropy and automatically assign weights accordingly.
(3)
Aiming at the real-time demand of hypersonic target trajectory estimation, this paper carries out an in-depth study on model optimization. By introducing a regularization term based on information gain, we reduce the model’s dependence on redundant information and improve the efficiency of information utilization. Meanwhile, a dynamic learning rate adjustment mechanism is designed, which can intelligently adjust the learning rate according to the change of information entropy during the training process and accelerate the convergence process of the model.
The remainder of this paper is organized as follows: Section 2 briefly introduces the basic theory of LSTM and hypersonic vehicles; Section 3 describes the proposed trajectory estimation method; Section 4 validates the effectiveness of the proposed method; and, finally, conclusions are given in Section 5.

2. Theoretical Foundations

2.1. Long Short-Term Memory (LSTM) Network

LSTM mainly consists of a forget gate, an input gate, and an output gate, as shown in Figure 1. Its calculation formulas are as follows:

$$I_t = \mathrm{sigmoid}(X_t W_{xi} + H_{t-1} W_{hi} + b_i)$$
$$F_t = \mathrm{sigmoid}(X_t W_{xf} + H_{t-1} W_{hf} + b_f)$$
$$O_t = \mathrm{sigmoid}(X_t W_{xo} + H_{t-1} W_{ho} + b_o)$$

where $I_t \in \mathbb{R}^{n \times h}$ is the input gate, $F_t \in \mathbb{R}^{n \times h}$ is the forget gate, $O_t \in \mathbb{R}^{n \times h}$ is the output gate, $W_{xi}, W_{xf}, W_{xo} \in \mathbb{R}^{d \times h}$ and $W_{hi}, W_{hf}, W_{ho} \in \mathbb{R}^{h \times h}$ are the weight matrices, $b_i, b_f, b_o \in \mathbb{R}^{1 \times h}$ are the bias parameters, and sigmoid is the activation function, calculated as follows:

$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$

The formula for the candidate memory cell $\tilde{C}_t \in \mathbb{R}^{n \times h}$ at time step $t$ is as follows:

$$\tilde{C}_t = \tanh(X_t W_{xc} + H_{t-1} W_{hc} + b_c)$$

where $W_{xc} \in \mathbb{R}^{d \times h}$ and $W_{hc} \in \mathbb{R}^{h \times h}$ are the weight matrices and $b_c \in \mathbb{R}^{1 \times h}$ is the bias parameter.

In LSTM, the input gate $I_t$ controls how much new data from $\tilde{C}_t$ is employed, while the forget gate $F_t$ controls how much of the content of the past memory cell $C_{t-1} \in \mathbb{R}^{n \times h}$ is retained:

$$C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t$$

The hidden state $H_t \in \mathbb{R}^{n \times h}$ is a gated version of the tanh of the memory cell, ensuring that the value of $H_t$ lies in the interval $(-1, 1)$:

$$H_t = O_t \odot \tanh(C_t)$$

where $\odot$ denotes the element-wise product and tanh is the hyperbolic tangent function, calculated as follows:

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$
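To make the gate computations above concrete, the following minimal NumPy sketch performs a single LSTM time step following Equations (1)–(6); the toy dimensions and random weights are illustrative assumptions, not the parameters used later in the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(X_t, H_prev, C_prev, W, b):
    """One LSTM time step following Eqs. (1)-(6).
    X_t: (n, d) input batch; H_prev, C_prev: (n, h) previous hidden/cell states.
    W: weights W_x* of shape (d, h) and W_h* of shape (h, h); b: biases of shape (1, h)."""
    I_t = sigmoid(X_t @ W["xi"] + H_prev @ W["hi"] + b["i"])      # input gate
    F_t = sigmoid(X_t @ W["xf"] + H_prev @ W["hf"] + b["f"])      # forget gate
    O_t = sigmoid(X_t @ W["xo"] + H_prev @ W["ho"] + b["o"])      # output gate
    C_tilde = np.tanh(X_t @ W["xc"] + H_prev @ W["hc"] + b["c"])  # candidate memory cell
    C_t = F_t * C_prev + I_t * C_tilde                            # memory cell update, Eq. (4)
    H_t = O_t * np.tanh(C_t)                                      # hidden state, Eq. (5)
    return H_t, C_t

# Toy usage: batch n = 2, input dimension d = 3, hidden dimension h = 4
rng = np.random.default_rng(0)
d, h, n = 3, 4, 2
W = {k: rng.normal(scale=0.1, size=(d if k[0] == "x" else h, h))
     for k in ["xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc"]}
b = {k: np.zeros((1, h)) for k in ["i", "f", "o", "c"]}
H, C = lstm_step(rng.normal(size=(n, d)), np.zeros((n, h)), np.zeros((n, h)), W, b)
```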

2.2. Dynamic Model of Hypersonic Aircraft

The motion equations of a conventional hypersonic aircraft in a ballistic coordinate system [32] are:

$$\begin{aligned}
\dot{x} &= v \cos\gamma \sin\psi \\
\dot{y} &= v \cos\gamma \cos\psi \\
\dot{z} &= v \sin\gamma \\
\dot{v} &= -\frac{D}{m} - g \sin\gamma \\
\dot{\gamma} &= \frac{L \cos\sigma}{m v} - \frac{g \cos\gamma}{v} \\
\dot{\psi} &= \frac{L \sin\sigma}{m v \cos\gamma}
\end{aligned}$$

where $v$ and $m$ are the flight speed and mass of the hypersonic target, respectively; $\gamma$ is the ballistic inclination angle (the angle between the velocity vector and the horizontal plane); $\psi$ is the ballistic deflection angle (the angle between the projection of the velocity vector in the horizontal plane and the y-axis); $g$ is the acceleration of gravity, taken as $9.81\ \mathrm{m/s^2}$; $\sigma$ is the bank angle; $L$ is the aerodynamic lift; and $D$ is the aerodynamic drag. The expressions for $L$ and $D$ are as follows:

$$L = \frac{1}{2}\rho(z) v^2 C_L(\alpha, Ma) S, \qquad D = \frac{1}{2}\rho(z) v^2 C_D(\alpha, Ma) S$$

where $\rho(z)$ is the air density as a function of altitude; $Ma$ is the flight Mach number; $\alpha$ is the angle of attack; and $S$ is the vehicle characteristic area.
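To illustrate how Equations (7) and (8) can be propagated numerically to generate trajectory data (as done for the dataset in Section 4.1), the following sketch integrates the motion equations; the exponential atmosphere model, the fourth-order Runge–Kutta integrator, and the constant control inputs are our assumptions, not the paper's exact simulation setup:

```python
import numpy as np

G = 9.81  # acceleration of gravity, m/s^2

def air_density(z):
    # Simple exponential atmosphere (an assumption; any rho(z) curve can be substituted)
    return 1.225 * np.exp(-z / 7200.0)

def dynamics(state, m, S, sigma, CL, CD):
    """Right-hand side of Eq. (7). state = [x, y, z, v, gamma, psi]."""
    x, y, z, v, gamma, psi = state
    rho = air_density(z)
    L = 0.5 * rho * v**2 * CL * S  # aerodynamic lift, Eq. (8)
    D = 0.5 * rho * v**2 * CD * S  # aerodynamic drag, Eq. (8)
    return np.array([
        v * np.cos(gamma) * np.sin(psi),
        v * np.cos(gamma) * np.cos(psi),
        v * np.sin(gamma),
        -D / m - G * np.sin(gamma),
        L * np.cos(sigma) / (m * v) - G * np.cos(gamma) / v,
        L * np.sin(sigma) / (m * v * np.cos(gamma)),
    ])

def rk4_step(state, dt, **kw):
    # Classical fourth-order Runge-Kutta integration step
    k1 = dynamics(state, **kw)
    k2 = dynamics(state + 0.5 * dt * k1, **kw)
    k3 = dynamics(state + 0.5 * dt * k2, **kw)
    k4 = dynamics(state + dt * k3, **kw)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Example: one 0.1 s step of a CAV-H-like state (mass and area from Section 4.1;
# the bank angle and constant CL/CD values here are placeholders)
state = np.array([0.0, 0.0, 30000.0, 3000.0, 0.0, 0.0])  # x, y, z (m), v (m/s), gamma, psi (rad)
state = rk4_step(state, dt=0.1, m=907.0, S=0.4839, sigma=0.0, CL=0.4, CD=0.15)
```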

3. Proposed Methods

Traditional methods for predicting the trajectories of hypersonic targets often struggle to accurately capture dynamic changes in the target and overlook important feature information, leading to insufficient prediction accuracy and poor robustness. To overcome these challenges, we propose an LSTM trajectory estimation method integrated with a multi-head attention mechanism. This approach leverages the strengths of LSTM in sequence modeling and combines it with a multi-head attention mechanism, allowing for dynamic adjustment of attention to different historical trajectory information, effectively extracting key features for precise recognition and prediction of target motion behavior. This design not only improves prediction accuracy but also enhances the model’s adaptability to complex dynamic characteristics, providing more reliable technical support for monitoring and tracking hypersonic targets in near space. In this section, we will detail the proposed method, including the LSTM network integrated with the attention mechanism, the multi-head attention mechanism, model parameters, and the network training process.

3.1. LSTM Network Incorporating Attention Mechanism

For the overall structural design of the network, we integrate the algorithmic essence of the multi-head attention mechanism and LSTM. This design pursues a dual objective: first, to enhance recognition accuracy by strengthening the model’s ability to capture the intrinsic relevance and complexity of the data; and second, to exploit the complementary nature of the two mechanisms to optimize the learning process, thereby accelerating the network’s convergence.
Figure 2 shows a schematic diagram of the network structure. Considering the relatively low dimensionality of the input data, we first use M fully connected layers (Dense layers) to perform dimensionality enhancement operations, thereby enriching the feature representation of the input data. Through this dimensionality enhancement step, we can ensure that subsequent network layers can more effectively process and parse these enhanced features. The fully connected layer is represented as:
$$\delta = \mathrm{Dense}(\mu) = \tanh(w\mu + b)$$
When constructing the network structure, we particularly focused on the parameter configuration and regularization methods of the fully connected layer (Dense layer) to optimize the performance of the network. The weights w and biases b of the fully connected layer are key parameters for model learning, which are updated during the training process to minimize prediction errors.
To prevent overfitting and improve the generalization ability of the network, we have introduced two effective regularization techniques: batch normalization (BN) and dropout.
The BN layer is represented as:
$$\hat{x}_i = \frac{x_i - \bar{\mu}_B}{\sqrt{\hat{\sigma}_B^2 + \varepsilon}}, \qquad
\bar{\mu}_B = \frac{1}{|B|}\sum_{x \in B} x, \qquad
\hat{\sigma}_B^2 = \frac{1}{|B|}\sum_{x \in B} (x - \bar{\mu}_B)^2 + \varepsilon$$

where $x$ is the input data; $\bar{\mu}_B$ and $\hat{\sigma}_B^2$ are the mean and variance of the input data, respectively; $\varepsilon > 0$ is a constant; and $\hat{x}_i$ is the transformed data, which follows a standard normal distribution.
Through Formula (10), the input data $x$, which is originally arbitrarily distributed, can be transformed into data $\hat{x}_i$ following a standard normal distribution, thereby bringing the distributions of the data fed into the network closer together, which facilitates the iterative updating of the network.
The LSTM network structure with BN layer and attention mechanism is shown in Figure 2, and its mathematical expression is as follows:
$$\begin{aligned}
\delta_{k,1} &= \mathrm{Dense}_1(\mu_k) \\
\bar{\delta}_{k,1} &= \mathrm{BN}_1(\delta_{k,1}) \\
\delta_{k,1} &= \mathrm{Dropout}_1(\bar{\delta}_{k,1}) \\
\delta_{k,j} &= \mathrm{Dense}_j(\delta_{k,j-1}) \\
\bar{\delta}_{k,j} &= \mathrm{BN}_j(\delta_{k,j}) \\
\delta_{k,j} &= \mathrm{Dropout}_j(\bar{\delta}_{k,j}) \\
(\delta_{k,\mathrm{LSTM}_1}, C_{k,\mathrm{LSTM}_1}) &= \mathrm{LSTM}_1(\delta_{k-1,\mathrm{LSTM}_1}, C_{k-1,\mathrm{LSTM}_1}, \delta_{k,d}) \\
(\delta_{k,\mathrm{LSTM}_i}, C_{k,\mathrm{LSTM}_i}) &= \mathrm{LSTM}_i(\delta_{k-1,\mathrm{LSTM}_i}, C_{k-1,\mathrm{LSTM}_i}, \delta_{k,\mathrm{LSTM}_{i-1}}) \\
\delta_{1:N,\mathrm{LSTM}_l} &= [\delta_{1,\mathrm{LSTM}_l}, \delta_{2,\mathrm{LSTM}_l}, \ldots, \delta_{N,\mathrm{LSTM}_l}] \\
w_{\mathrm{att}} &= \mathrm{Softmax}(w_{\mathrm{att}1}\,\delta_{1:N,\mathrm{LSTM}_l} + b_{\mathrm{att}}) \\
\delta_{\mathrm{att}} &= w_{\mathrm{att}}^{T}\,\delta_{1:N,\mathrm{LSTM}_l} \\
P(\mathrm{Labels} \mid \mu_{1:N}) &= \mathrm{Softmax}(\delta_{\mathrm{att}})
\end{aligned}$$

where $i = 1, 2, \ldots, n$; $j = 2, 3, \ldots, m$; and $k = 1, 2, \ldots, N$.
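As an illustration, the following Keras sketch assembles this architecture with the layer sizes of Table 1. The sequence length, the input feature count, and the use of Keras’ built-in MultiHeadAttention layer (available in newer TensorFlow releases) in place of the attention block above are our assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, N_FEATURES = 5, 6  # illustrative shapes, not necessarily the paper's exact setup

inp = layers.Input(shape=(SEQ_LEN, N_FEATURES))

# Two Dense blocks for dimensionality enhancement, each followed by BN and Dropout
x = layers.Dense(256, activation="tanh")(inp)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(128, activation="tanh")(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.2)(x)

# Multi-head self-attention over the enhanced features (head count is an assumption)
x = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)

# Two stacked LSTM layers, per Table 1
x = layers.LSTM(128, return_sequences=True)(x)
x = layers.LSTM(256)(x)

out = layers.Dense(1, activation="sigmoid")(x)  # output layer matching the binary cross-entropy loss
model = models.Model(inp, out)
```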

3.2. Multi-Head Attention Mechanism

The self-attention mechanism is a sequence-processing method that enhances the representation capability and performance of the model. It generates query, key, and value vectors for each sequence element, calculates the relevance weights of each element with respect to the other elements in the sequence, and applies these weights to the corresponding value vectors in a weighted summation, thereby dynamically focusing on the parts of the sequence most relevant to the task at hand.
The self-attention mechanism is shown in Figure 3 and is calculated as follows:
$$q_i = w_q x_i, \qquad k_i = w_k x_i, \qquad v_i = w_v x_i$$

where $x_i$ is the input sequence data; $w_q$, $w_k$, and $w_v$ are the weight matrices; and $q$, $k$, and $v$ are the resulting feature representations.
Then, using the feature representations obtained in the previous step, the query, key, and value matrices can be calculated:

$$Q = W_q X, \qquad K = W_k X, \qquad V = W_v X$$

where $Q$, $K$, and $V$ denote the query, key, and value matrices, respectively.
Next, we calculate the attention scores:

$$\alpha_{i,j} = \mathrm{Softmax}(q_i \cdot k_j)$$

where $\cdot$ denotes the dot product of $q$ and $k$, and the Softmax function is used to calculate the normalized weight coefficients.
Finally, the attention scores are multiplied by $v$ to obtain the output sequence $y$:

$$y_j = \sum_{i=1}^{N} \alpha_{j,i} v_i$$
Multi-head attention, as shown in Figure 4, is an extended form of self-attention mechanism. The main advantage of multi-head attention is that it can process the input sequence with attention from different perspectives in parallel, thus improving the model’s ability to understand and capture complex dependencies.
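As a compact illustration of Equations (12)–(15) and their multi-head extension, the following NumPy sketch runs several attention heads in parallel and concatenates their outputs. The head count and dimensions are toy choices, and, matching Equation (14), no scaling factor is applied to the dot products (a $1/\sqrt{d_k}$ factor is common in practice):

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head self-attention following Eqs. (12)-(15). X: (N, d)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    alpha = softmax(Q @ K.T)  # attention scores, Eq. (14)
    return alpha @ V          # weighted sum of values, Eq. (15)

def multi_head_attention(X, heads):
    """Run each head's attention in parallel and concatenate the outputs."""
    return np.concatenate([self_attention(X, *w) for w in heads], axis=-1)

# Toy usage: a sequence of N = 5 elements with feature dimension d = 8, two heads of width 4
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 8))
heads = [tuple(rng.normal(size=(8, 4)) for _ in range(3)) for _ in range(2)]
Y = multi_head_attention(X, heads)  # shape (5, 8)
```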

3.3. Model Parameters

The structural parameters of each layer of the network are shown in Table 1. The network parameters used are m = 2 and n = 2, i.e., two fully connected layers and two LSTM layers, with neuron counts of 256, 128, 128, and 256, respectively. To improve the performance of the model, Dropout and BN layers were added after the fully connected layers. The Dropout layer randomly discards neurons, so the model has a different effective structure during each training pass, thereby increasing model diversity. The BN layer makes it easier for the network to learn the distribution of the data and improves the model’s generalization ability through its standardization operation.
The training parameters of the network are set as shown in Table 2. Specifically, the partition ratio between the training and testing sets is 8:2; that is, 80% of the input data is used as the training set and the remaining 20% as the testing set. The maximum number of iterations is 200. A binary cross-entropy loss function is chosen to calculate the model’s loss, the Adam optimizer is selected to update the network parameters in real time, and the proportion of randomly discarded neurons in the Dropout layer is set to 0.2.
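Continuing the Keras sketch from Section 3.1, the settings of Table 2 translate roughly as follows; here X and y stand for the preprocessed samples and labels, and the batch size is an assumption since the paper does not state it:

```python
# Training configuration per Table 2 (model is the network sketched in Section 3.1;
# X and y are preprocessed samples and labels -- placeholders here)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(
    X, y,
    validation_split=0.2,  # 8:2 partition (the paper holds out a test set; approximated here)
    epochs=200,            # maximum number of iterations
    batch_size=64,         # assumption: the batch size is not stated in the paper
)
```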

3.4. Network Training Process

Based on the LSTM model that integrates a multi-head attention mechanism, and in conjunction with the segmentation of trajectory data samples and the model testing phase, a complete trajectory prediction process has been designed, as shown in Figure 5. Below are more detailed descriptions of the main steps in this process:
(1)
Perform preprocessing operations like standardization and normalization on the data to ensure it is on an appropriate scale for model training.
(2)
Divide the preprocessed trajectory dataset into training and testing sets. The training set is used for model training, while the testing set is used to evaluate the model’s final performance.
(3)
Use a sliding window technique to segment the trajectory data into multiple samples. Each sample contains trajectory data points over a period and their corresponding target variable (e.g., future position); a minimal sketch of this segmentation is given after this list.
(4)
Design and construct the model architecture, including multi-head attention layers to capture intrinsic relationships in the data and LSTM layers to capture dynamic features of the time series. Insert batch normalization (BN) and Dropout layers appropriately to enhance training efficiency and prevent overfitting. Set up the input and output layers to ensure the input data dimensions match the model and that the output layer can predict the target variable.
(5)
Set initial weights and biases for each layer in the model, using either random initialization or pretrained weights.
(6)
Input samples from the training set into the model for training, calculating the model’s output values. Compute the loss function (Mean Squared Error, MSE) based on the model’s output and true values. Use the Adam optimizer to compute the gradient of the loss function with respect to model parameters and update the parameters.
(7)
When the model’s performance on the training set meets predefined stopping criteria (e.g., loss no longer significantly decreases, or a set number of iterations is reached), save the model’s weights and parameters.
(8)
Evaluate the trained model using the testing set, calculating performance metrics on the test data.
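As referenced in step (3), a minimal sliding-window segmentation sketch is given below; the window length, prediction horizon, and state dimension are illustrative choices:

```python
import numpy as np

def make_windows(trajectory, window, horizon=1):
    """Segment one trajectory of shape (T, d) into (sample, target) pairs.
    Each sample holds `window` consecutive points; the target is the position
    `horizon` steps after the window."""
    X, y = [], []
    for start in range(len(trajectory) - window - horizon + 1):
        X.append(trajectory[start:start + window])
        y.append(trajectory[start + window + horizon - 1, :3])  # future position (x, y, z)
    return np.stack(X), np.stack(y)

# Example: a trajectory of 1000 time steps with 6 state variables, windows of length 200
X, y = make_windows(np.zeros((1000, 6)), window=200)
```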

4. Experimental Comparative Analysis

4.1. Description of the Data Set

The publicly released CAV-H model from Lockheed Martin [33] was used, which has a total mass of 907 kg and an aerodynamic reference area of 0.4839 m². The aerodynamic coefficient wind tunnel test tables are shown in Table 3 and Table 4.

An equation fit to the aerodynamic data was used in place of direct interpolation of the aerodynamic data in Table 3 and Table 4 for Mach numbers greater than 5.
$C_L$ is fitted to the angle of attack $\alpha$ and flight Mach number $Ma$ as follows:

$$C_L(\alpha, Ma) = -0.0561 - 0.00443\,Ma + 0.05\,\alpha - 0.00083\,Ma\,\alpha + 0.00032\,Ma^2 + 0.00037\,\alpha^2$$

$C_D$ is fitted to the angle of attack $\alpha$ and flight Mach number $Ma$ as follows:

$$C_D(\alpha, Ma) = 0.12721 - 0.01542\,Ma + 0.00486\,\alpha - 0.0003\,Ma\,\alpha + 0.00057\,Ma^2 + 0.00067\,\alpha^2$$
By substituting the C L and C D obtained from Equations (16) and (17) into Formula (8) in Section 2.2, we can calculate the lift L and drag D . Then, substituting lift L and drag D into Formula (7) allows us to solve for various trajectory data. Using this method, we constructed a dataset containing 11,520 trajectory entries. This dataset not only includes basic flight patterns, such as quasi-equilibrium gliding and hopping gliding, but also incorporates complex target maneuver types, such as sharp turns, agile evasive actions, and periodic penetration strategies.
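The fitted coefficients translate directly into code; the sketch below evaluates Equations (16) and (17) (with the signs as reconstructed above, which reproduce the values in Table 3 and Table 4 to within fitting error) and can be plugged into the dynamics sketch of Section 2.2:

```python
def C_L(alpha, Ma):
    # Lift-coefficient fit, Eq. (16); alpha in degrees
    return (-0.0561 - 0.00443 * Ma + 0.05 * alpha - 0.00083 * Ma * alpha
            + 0.00032 * Ma**2 + 0.00037 * alpha**2)

def C_D(alpha, Ma):
    # Drag-coefficient fit, Eq. (17); alpha in degrees
    return (0.12721 - 0.01542 * Ma + 0.00486 * alpha - 0.0003 * Ma * alpha
            + 0.00057 * Ma**2 + 0.00067 * alpha**2)

# Example: alpha = 10 deg, Ma = 10 gives C_L ~ 0.386 and C_D ~ 0.116
cl, cd = C_L(10.0, 10.0), C_D(10.0, 10.0)
```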
To verify and evaluate the performance of the machine learning models trained on these trajectory data, we randomly split the dataset into a training set and a testing set. Specifically, 80% of the trajectories were selected as the training set for the model’s training and learning process, while the remaining 20% served as the testing set to assess the model’s generalization ability and accuracy on unseen data.

4.2. Experimental Results

In order to validate the effectiveness of the proposed method, four different network architectures were selected for comparative analysis: (1) Case 1: LSTM combining batch normalization and a multi-head attention mechanism (Att-LSTM+BN); (2) Case 2: LSTM enhanced by a multi-head attention mechanism (Att-LSTM); (3) Case 3: LSTM combining batch normalization (LSTM+BN); and (4) Case 4: LSTM.
The network was trained using a computer with an AMD Ryzen 7 4800H CPU, Nvidia GeForce RTX 2060 GPU, and 32 GB of RAM, based on the Python 3.7 + TensorFlow 2.3 + Keras 2.7 platform.
Since the input sequence length affects recognition accuracy when learning from time series data, we first analyze its effect on the recognition accuracy of the network. The network structure parameters used are m = 2 and n = 2, i.e., two fully connected layers and two LSTM layers, with neuron counts of 256, 128, 128, and 256. The input sequence lengths are 50, 100, 150, 200, 250, and 300, respectively, and the experimental results are shown in Figure 6 and Figure 7. Accuracy and MSE were used as the evaluation metrics of the model, with the following expressions:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

In the formula, $TP$ (True Positive) represents the number of positive samples correctly classified as positive, $TN$ (True Negative) indicates the number of negative samples correctly classified as negative, $FP$ (False Positive) refers to the number of negative samples incorrectly classified as positive, and $FN$ (False Negative) denotes the number of positive samples incorrectly classified as negative.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

where $n$ is the total number of samples, $y_i$ is the true value of the $i$-th sample, and $\hat{y}_i$ is the predicted value of the $i$-th sample.
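Both metrics are straightforward to compute; a minimal sketch assuming NumPy arrays of true and predicted values:

```python
import numpy as np

def accuracy(y_true, y_pred):
    # Fraction of correctly classified samples, Eq. (18)
    return np.mean(y_true == y_pred)

def mse(y_true, y_pred):
    # Mean squared error, Eq. (19)
    return np.mean((y_true - y_pred) ** 2)
```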
Figure 6 and Table 5 show the recognition accuracy of four methods under six different input sequence lengths. As the input sequence length increases, the recognition accuracy of each method also increases. Moreover, when the length of the input sequence is greater than or equal to 200, the recognition accuracy of various methods becomes more stable. Furthermore, it can be observed that when the input sequence length is greater than or equal to 200, the two methods incorporating multi-head attention have higher and more stable recognition accuracy. Overall, the proposed method achieved the highest and most robust recognition accuracy under six different input sequences.
Table 5. Average recognition accuracy (%) of various methods with different input sequences.

Case \ Length | 50   | 100  | 150  | 200  | 250  | 300
Att-LSTM+BN   | 98.2 | 99.6 | 99.3 | 99.8 | 99.7 | 99.8
Att-LSTM      | 90.2 | 89.6 | 89.5 | 96.7 | 95.8 | 98.7
LSTM+BN       | 87.5 | 86.4 | 83.4 | 91.6 | 92.4 | 96.8
LSTM          | 85.3 | 83.6 | 82.4 | 86.4 | 91.6 | 93.9
Figure 7 and Table 6 show the recognition loss of the four methods under six input sequence lengths. The loss value is usually used to measure the prediction error of the model during training, while the accuracy directly reflects the classification accuracy of the model on the test set; usually, the lower the loss value, the better the method performs. As shown in Figure 7, all four methods exhibit relatively low loss values across the input sequence lengths. Comparatively, the loss value of our method decreases significantly as the number of iterations increases and remains relatively stable at different sequence lengths. This once again demonstrates the effectiveness and robustness of the proposed method.
Table 6. Average loss of various methods with different input sequences.

Case \ Length | 50   | 100  | 150  | 200  | 250  | 300
Att-LSTM+BN   | 0.14 | 0.11 | 0.10 | 0.06 | 0.05 | 0.03
Att-LSTM      | 0.68 | 0.65 | 0.24 | 0.62 | 0.19 | 0.17
LSTM+BN       | 1.17 | 1.21 | 0.78 | 0.68 | 0.28 | 0.46
LSTM          | 1.39 | 0.97 | 0.76 | 1.17 | 0.67 | 0.78
When designing deep learning models based on LSTM, the number of LSTM layers and the configuration of fully connected layers have a crucial impact on network accuracy. The LSTM layer is responsible for capturing long-term dependencies in the input sequence, and different numbers of LSTM layers will affect the model’s ability to learn these dependencies.
Meanwhile, the role of fully connected layers in LSTM networks is to convert the output of LSTM layers into the final prediction results. The configuration of fully connected layers, including the number of layers and the number of neurons in each layer, also has a significant impact on the accuracy of the model. Too few fully connected layers or neurons may not fully utilize the features extracted by the LSTM layer, resulting in the model being unable to make accurate predictions. However, excessive fully connected layers or neurons may make the model too complex, increasing the risk of overfitting.
In summary, to achieve optimal network accuracy, it is necessary to carefully consider the configuration of the LSTM layers and the fully connected layers. By experimentally exploring combinations of different layer and neuron counts, combined with appropriate activation functions and regularization techniques, the most suitable model configuration for specific tasks and datasets can be found. Therefore, based on the network model designed in Figure 2, experiments were conducted with different values of m and n (i.e., different numbers of fully connected layers and LSTM layers), and the experimental results are shown in Figure 8 and Figure 9, as well as Table 7 and Table 8.
From Figure 8 and Table 7, as the number of network layers increases, the recognition accuracy of the various methods increases. Comparatively, the proposed method has higher and more robust recognition accuracy. From Figure 9 and Table 8, the proposed method has lower and more robust losses, and once both the fully connected layers and the LSTM layers number two or more, the attention mechanism integrated into the model demonstrates its effectiveness, which once again confirms the effectiveness of the proposed method.

4.3. Diagnostic Performance Analysis under Noise Interference

To further validate the robustness and effectiveness of the proposed method in noisy environments, we introduced Gaussian white noise into the original dataset and tested it under different signal-to-noise ratio (SNR) [34] conditions. Specifically, we chose four representative scenarios, SNR = −6, −2, 2, and 6, to simulate a data environment from very noisy to relatively clean.
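For reference, noise injection at a prescribed SNR can be sketched as follows; we assume the SNR values are in decibels and that the noise power is set relative to the mean signal power:

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Add Gaussian white noise to `signal` at the given SNR (assumed to be in dB)."""
    rng = rng or np.random.default_rng(0)
    p_signal = np.mean(signal ** 2)             # mean signal power
    p_noise = p_signal / (10 ** (snr_db / 10))  # noise power for the target SNR
    return signal + rng.normal(scale=np.sqrt(p_noise), size=signal.shape)

# Example: corrupt a trajectory channel at SNR = -6
noisy = add_awgn(np.sin(np.linspace(0, 10, 1000)), snr_db=-6)
```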
The experimental results are shown in Figure 10, and the Att-LSTM+BN model exhibits the best performance in all four noise environments. This is mainly attributed to its combination of attention mechanism and batch normalization advantages, which enables the model to better focus on important information while effectively reducing internal covariate shifts, thereby improving stability and accuracy in noisy environments.
In contrast, although the Att-LSTM model also utilizes attention mechanisms to capture key information, its noise resistance is slightly insufficient without batch normalization. Similarly, although the LSTM+BN model improves its stability to some extent through batch normalization, it lacks guidance from attention mechanisms, resulting in limited performance in noisy environments. The standard LSTM model performs the worst in situations with high noise, mainly because it lacks attention mechanisms to focus on key information and batch normalization to reduce the impact of internal covariate shifts.
In summary, through experimental analysis under different SNR conditions, we have once again demonstrated the excellent accuracy of the Att-LSTM+BN model in processing noisy data, which further verifies the effectiveness and robustness of the model in complex noisy environments.

5. Conclusions

In this paper, we propose a novel trajectory estimation method that combines the temporal processing capability of LSTM networks with the multi-head information entropy attention mechanism, specifically for trajectory estimation of hypersonic targets. The fast maneuverability and high uncertainty of hypersonic targets in complex and changing flight environments make trajectory estimation a difficult task in aerospace.
By introducing the LSTM network, we successfully capture the temporal dependence and long-term memory information in the target trajectory, which provides the model with the ability to deeply understand the target’s movement patterns. The recurrent structure of LSTM allows the model to learn and retain historical information, which is crucial for predicting future trajectories.
However, relying on LSTM alone may not be enough to adequately cope with the complexity and uncertainty of hypersonic targets. Therefore, we further introduce a multi-head information entropy attention mechanism. This mechanism not only enhances the model’s attention to critical information, but also quantifies the information content and uncertainty of the data at different time points by calculating the information entropy. In this way, the model can intelligently assign weights and give higher attention to data with more information and lower uncertainty, thus improving the accuracy and robustness of trajectory estimation.
The experimental results show that the trajectory estimation method combining LSTM with the multi-head information entropy attention mechanism exhibits significant advantages when dealing with hypersonic target trajectory data. Compared with traditional methods, it achieves a substantial improvement in prediction accuracy and a substantial reduction in error. Especially when the target undergoes rapid maneuvers or is subject to external interference, the proposed method can still maintain stable performance, which fully demonstrates its applicability and reliability in complex environments.
In addition, this method has good scalability and flexibility. By adjusting the network structure and parameter settings, we can further optimize the model performance to adapt to the needs of trajectory estimation in different scenarios. Meanwhile, information entropy, as an effective tool to quantify the importance of information, provides new ideas and methods for model optimization, which helps to promote the continuous progress of trajectory estimation technology.

Author Contributions

Methodology, Y.X.; software, B.H.; writing—original draft preparation, Y.X.; writing—review and editing, Z.W.; supervision, Z.W. and Q.P.; data curation, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was partially supported by the National Natural Science Foundation of China (Grant No. 61790552, 62203358, 62233014).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

ADS-B datasets can be downloaded from https://flightadsb.variflight.com/track-data, accessed on 16 January 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yu, S.; Ni, X.; Li, X.; Hu, T.; Chen, F. Real-Time Dynamic Optimized Band Detection Method for Hypersonic Glide Vehicle. Infrared Phys. Technol. 2022, 121, 104020. [Google Scholar] [CrossRef]
  2. Arroyo Cebeira, A.; Asensio Vicente, M. Adaptive IMM-UKF for Airborne Tracking. Aerospace 2023, 10, 698. [Google Scholar] [CrossRef]
  3. Chatterjee, A.; Tharmarasa, R. Effects of Measurement Uncertainties on Multitarget Tracking. IEEE Instrum. Meas. Mag. 2022, 25, 37–45. [Google Scholar] [CrossRef]
  4. Wei, L.; Chen, J.; Ding, Y.; Wang, F.; Zhou, J. Adaptive Tracking of High-Maneuvering Targets Based on Multi-Feature Fusion Trajectory Clustering: LPI’s Purpose. Sensors 2022, 22, 4713. [Google Scholar] [CrossRef]
  5. He, S.; Wu, P.; Li, X.; Bo, Y.; Yun, P. Adaptive Modified Unbiased Minimum-Variance Estimation for Highly Maneuvering Target Tracking with Model Mismatch. IEEE Trans. Instrum. Meas. 2023, 72, 8501216. [Google Scholar] [CrossRef]
  6. Wei, W.; Dandan, L.; Guohong, W. Detection of Hypersonic Weak Targets by High Pulse Repetition Frequency Radar Based on Multi-hypothesis Fuzzy-matching Radon Transform. IET Radar Sonar Navig. 2024, 18, 423–433. [Google Scholar] [CrossRef]
  7. Li, Z.; Wang, Y.; Zheng, W. Accurately Tracking Hypersonic Gliding Vehicles via an LEO Mega-Constellation in Relay Tracking Mode. J. Syst. Eng. Electron. 2024, 35, 211–221. [Google Scholar] [CrossRef]
  8. Huang, J.; Li, Z.; Liu, D.; Yang, Q.; Zhu, J. An Adaptive State Estimation for Tracking Hypersonic Glide Targets with Model Uncertainties. Aerosp. Sci. Technol. 2023, 136, 108235. [Google Scholar] [CrossRef]
  9. Li, Y.; Lou, J.; Tan, X.; Xu, Y.; Zhang, J.; Jing, Z. Adaptive Kernel Learning Kalman Filtering with Application to Model-Free Maneuvering Target Tracking. IEEE Access 2022, 10, 78088–78101. [Google Scholar] [CrossRef]
  10. Huang, J.; Zhang, H.; Tang, G.; Bao, W. Robust UKF-Based Filtering for Tracking a Maneuvering Hypersonic Glide Vehicle. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2022, 236, 2162–2178. [Google Scholar] [CrossRef]
  11. Tang, R.; Luo, B.; Liao, Y. Adaptive Dynamic Programming Based Composite Control for Profile Tracking with Multiple Constraints. Neurocomputing 2023, 557, 126711. [Google Scholar] [CrossRef]
  12. Liu, S.; Yan, B.; Zhang, T.; Dai, P.; Liu, R.; Yan, J. Three-Dimensional Cooperative Guidance Law for Intercepting Hypersonic Targets. Aerosp. Sci. Technol. 2022, 129, 107815. [Google Scholar] [CrossRef]
  13. Yang, B. Multi-Granularity Scenarios Understanding Network for Trajectory Prediction. Intell. Syst. 2023, 9, 851–864. [Google Scholar] [CrossRef]
  14. Hu, Y.; Yi, J.; Cheng, F.; Wan, X.; Hu, S. 3-D Target Tracking for Distributed Heterogeneous 2-D–3-D Passive Radar Network. IEEE Sens. J. 2023, 23, 29502–29512. [Google Scholar] [CrossRef]
  15. Liu, Z.; Wang, Z.; Yang, Y.; Lu, Y. A Data-Driven Maneuvering Target Tracking Method Aided with Partial Models. IEEE Trans. Veh. Technol. 2024, 73, 414–425. [Google Scholar] [CrossRef]
  16. Menghani, G. Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better. ACM Comput. Surv. 2023, 55, 1–37. [Google Scholar] [CrossRef]
  17. Kader, N.I.A.; Yusof, U.K.; Khalid, M.N.A.; Husain, N.R.N. A Review of Long Short-Term Memory Approach for Time Series Analysis and Forecasting. In Proceedings of the 2nd International Conference on Emerging Technologies and Intelligent Systems; Al-Sharafi, M.A., Al-Emran, M., Al-Kabi, M.N., Shaalan, K., Eds.; Lecture Notes in Networks and Systems; Springer International Publishing: Cham, Switzerland, 2023; Volume 573, pp. 12–21. ISBN 978-3-031-20428-9. [Google Scholar]
  18. Durstewitz, D.; Koppe, G.; Thurm, M.I. Reconstructing Computational System Dynamics from Neural Data with Recurrent Neural Networks. Nat. Rev. Neurosci. 2023, 24, 693–710. [Google Scholar] [CrossRef]
  19. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar] [CrossRef]
  20. Smagulova, K.; James, A.P. A Survey on LSTM Memristive Neural Network Architectures and Applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324. [Google Scholar] [CrossRef]
  21. Zhou, X.; Qin, T.; Ji, M.; Qiao, D. A LSTM Assisted Orbit Determination Algorithm for Spacecraft Executing Continuous Maneuver. Acta Astronaut. 2023, 204, 568–582. [Google Scholar] [CrossRef]
  22. Luo, W.; Zhao, Y.; Shao, Q.; Li, X.; Wang, D.; Zhang, T.; Liu, F.; Duan, L.; He, Y.; Wang, Y.; et al. Procapra Przewalskii Tracking Autonomous Unmanned Aerial Vehicle Based on Improved Long and Short-Term Memory Kalman Filters. Sensors 2023, 23, 3948. [Google Scholar] [CrossRef]
  23. Nelson, M.; Barzegar, V.; Laflamme, S.; Hu, C.; Downey, A.R.J.; Bakos, J.D.; Thelen, A.; Dodson, J. Multi-Step Ahead State Estimation with Hybrid Algorithm for High-Rate Dynamic Systems. Mech. Syst. Signal Process. 2023, 182, 109536. [Google Scholar] [CrossRef]
  24. Bartusiak, E.R.; Jacobs, M.A.; Chan, M.W.; Comer, M.L.; Delp, E.J. Predicting Hypersonic Glide Vehicle Behavior with Stochastic Grammars. IEEE Trans. Aerosp. Electron. Syst. 2024, 60, 1208–1223. [Google Scholar] [CrossRef]
  25. Xu, Z.; Zeng, W.; Chu, X.; Cao, P. Multi-Aircraft Trajectory Collaborative Prediction Based on Social Long Short-Term Memory Network. Aerospace 2021, 8, 115. [Google Scholar] [CrossRef]
  26. Pang, Y.; Yao, H.; Hu, J.; Liu, Y. A Recurrent Neural Network Approach for Aircraft Trajectory Prediction with Weather Features From Sherlock. In Proceedings of the AIAA Aviation 2019 Forum, Dallas, TX, USA, 17–21 June 2019. [Google Scholar]
  27. Schimpf, N.; Knoblock, E.J.; Wang, Z.; Apaza, R.D.; Li, H. Flight Trajectory Prediction Based on Hybrid—Recurrent Networks. In Proceedings of the 2021 IEEE Cognitive Communications for Aerospace Applications Workshop (CCAAW), Cleveland, OH, USA, 21–23 June 2021; pp. 1–6. [Google Scholar]
  28. Song, F.; Li, Y.; Cheng, W.; Dong, L. An Improved Dynamic Programming Tracking-before-Detection Algorithm Based on LSTM Network. EURASIP J. Adv. Signal Process. 2023, 2023, 57. [Google Scholar] [CrossRef]
  29. Li, W.; Yang, A.; Zhang, L. Improved Data Association Algorithm for Airborne Radar Multi-Target Tracking Via Deep Learning Network. In Proceedings of the IGARSS 2022—2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 7417–7420. [Google Scholar]
  30. Dai, T.; Wang, H.; Ruan, L.; Tong, H.; Wang, H. Research on Deep Learning Methods of UUV Maneuvering Target Tracking. In Proceedings of the Global Oceans 2020: Singapore–U.S. Gulf Coast, Biloxi, MS, USA, 5–30 October 2020; pp. 1–7. [Google Scholar]
  31. Yang, B.; Yan, G.; Wang, P.; Chan, C.; Song, X.; Chen, Y. A Novel Graph Based Trajectory Predictor with Pseudo Oracle. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 7064–7078. [Google Scholar] [CrossRef]
  32. Xie, Y.; Zhuang, X.; Xi, Z.; Chen, H. Dual-Channel and Bidirectional Neural Network for Hypersonic Glide Vehicle Trajectory Prediction. IEEE Access 2021, 9, 92913–92924. [Google Scholar] [CrossRef]
  33. Jorris, T.R.; Cobb, R.G. Three-Dimensional Trajectory Optimization Satisfying Waypoint and No-Fly Zone Constraints. J. Guid. Control Dyn. 2009, 32, 551–572. [Google Scholar] [CrossRef]
  34. Liu, D.; Zhao, Y.; Yuan, Z.; Li, J.; Chen, G. Target Tracking Methods Based on a Signal-to-Noise Ratio Model. Front. Inf. Technol. Electron. Eng. 2020, 21, 1804–1814. [Google Scholar] [CrossRef]
Figure 1. Basic structure of LSTM.
Figure 2. LSTM network incorporating attention mechanism.
Figure 3. Self-attention module.
Figure 4. Multi-head attention module.
Figure 5. Network training flow.
Figure 6. Recognition accuracy of various methods under different input sequences: (a) length = 50; (b) length = 100; (c) length = 150; (d) length = 200; (e) length = 250; (f) length = 300.
Figure 7. Recognition losses of various methods under different input sequences: (a) length = 50; (b) length = 100; (c) length = 150; (d) length = 200; (e) length = 250; (f) length = 300.
Figure 8. Recognition accuracy of various methods at different network layers: (a) m = 1, n = 1; (b) m = 1, n = 2; (c) m = 2, n = 1; (d) m = 2, n = 2; (e) m = 2, n = 3; (f) m = 3, n = 2.
Figure 9. Recognition losses of various methods at different network layers: (a) m = 1, n = 1; (b) m = 1, n = 2; (c) m = 2, n = 1; (d) m = 2, n = 2; (e) m = 2, n = 3; (f) m = 3, n = 2.
Figure 10. Performance of various methods in noisy environments: (a) SNR = −6; (b) SNR = −2; (c) SNR = 2; (d) SNR = 6.
Table 1. Parameters of each layer in the network.

Layer                  | Number of Neurons | Output Shape | Param #
Dense_1                | 256               | 5 × 256      | 1280
BN                     | -                 | 5 × 256      | 1024
Dropout                | -                 | 5 × 256      | 0
Dense_2                | 128               | 5 × 128      | 32,896
BN                     | -                 | 5 × 128      | 512
Dropout                | -                 | 5 × 128      | 0
Multi Attention Module | -                 | 5 × 128      | 0
LSTM_1                 | 128               | 5 × 128      | 131,584
LSTM_2                 | 256               | 5 × 256      | 394,240
Dense_3                | 1                 | 5 × 1        | 257
Table 2. Setting of training parameters.

Parameters                     | Value
Training and testing set ratio | 8:2
Maximum number of iterations   | 200
Loss function                  | binary_crossentropy
Optimizer                      | Adam
Dropping probability           | 0.2
Table 3. Lift coefficients ($C_L$).

α (°) | Ma 3.5 | Ma 5  | Ma 8  | Ma 10 | Ma 15 | Ma 20 | Ma 23
10    | 0.450  | 0.425 | 0.400 | 0.380 | 0.370 | 0.360 | 0.350
15    | 0.740  | 0.700 | 0.670 | 0.630 | 0.600 | 0.570 | 0.557
20    | 1.050  | 1.000 | 0.950 | 0.900 | 0.850 | 0.800 | 0.780
Table 4. Drag coefficients ($C_D$).

α (°) | Ma 3.5 | Ma 5  | Ma 8  | Ma 10 | Ma 15 | Ma 20 | Ma 23
10    | 0.205  | 0.170 | 0.129 | 0.109 | 0.109 | 0.109 | 0.109
15    | 0.296  | 0.263 | 0.224 | 0.197 | 0.195 | 0.192 | 0.192
20    | 0.477  | 0.423 | 0.354 | 0.310 | 0.305 | 0.300 | 0.300
Table 7. Average recognition accuracy (%) of the various methods with different network layers.

Case        | m = 1, n = 1 | m = 1, n = 2 | m = 2, n = 1 | m = 2, n = 2 | m = 2, n = 3 | m = 3, n = 2
Att-LSTM+BN | 97.9         | 96.8         | 95.8         | 96.7         | 93.6         | 97.8
Att-LSTM    | 89.6         | 79.6         | 80.6         | 80.6         | 82.6         | 85.2
LSTM+BN     | 86.8         | 86.3         | 88.7         | 86.4         | 87.9         | 86.7
LSTM        | 85.4         | 86.7         | 85.3         | 85.7         | 86.8         | 88.6
Table 8. Average loss of various methods with different network layers.

Case        | m = 1, n = 1 | m = 1, n = 2 | m = 2, n = 1 | m = 2, n = 2 | m = 2, n = 3 | m = 3, n = 2
Att-LSTM+BN | 0.16         | 0.12         | 0.11         | 0.09         | 0.06         | 0.02
Att-LSTM    | 1.32         | 0.58         | 0.19         | 0.12         | 0.11         | 0.10
LSTM+BN     | 0.86         | 0.71         | 0.77         | 1.25         | 0.87         | 1.05
LSTM        | 0.93         | 1.08         | 0.99         | 0.89         | 1.26         | 0.98
