4.3.2. Experiment 1: Verification of Consistency Between Virtual and Real Environments and Verification of Interface Interoperability
To verify the consistency between the virtual and real environments established in the UAV digital twin verification platform, and the availability of the communication interfaces between the real and virtual domains, this experiment mirrors a real flight in the virtual simulation. The designed scenario is a single UAV cruising along a quadrilateral, with waypoints (0,0,4), (0,5,4), (5,5,4), (5,0,4), and (0,0,0) in sequence. The flight of the virtual UAV in AirSim 1.5.0 [28] is driven by the control functions of the physical UAV, and waypoint information is sent to the virtual UAV via socket communication. The flight trajectory data of the real and virtual UAVs are then compared to evaluate how closely the virtual flight matches the real one. Figure 6a compares the three-dimensional flight trajectories of the two UAVs, and Figure 6b compares the two-dimensional trajectories.
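The waypoint hand-off described above can be pictured with a minimal sketch. The paper only states that waypoints are sent over a socket; the newline-delimited JSON message format, the field names (`seq`, `x`, `y`, `z`), and the helper functions below are assumptions made for illustration and are not the platform's actual protocol. A local socket pair stands in for the link between the physical and virtual sides.

```python
import json
import socket

# Waypoints of the quadrilateral cruise, as listed in the text (x, y, z in metres).
WAYPOINTS = [(0, 0, 4), (0, 5, 4), (5, 5, 4), (5, 0, 4), (0, 0, 0)]


def encode_waypoint(index, waypoint):
    """Serialise one waypoint as a newline-terminated JSON message (hypothetical format)."""
    x, y, z = waypoint
    message = {"seq": index, "x": x, "y": y, "z": z}
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_waypoints(raw):
    """Parse a byte stream of newline-delimited JSON messages back into (x, y, z) tuples."""
    waypoints = []
    for line in raw.decode("utf-8").splitlines():
        if line.strip():
            msg = json.loads(line)
            waypoints.append((msg["x"], msg["y"], msg["z"]))
    return waypoints


def send_waypoints(sock, waypoints):
    """Send every waypoint, in order, over an already connected socket."""
    for index, waypoint in enumerate(waypoints):
        sock.sendall(encode_waypoint(index, waypoint))


if __name__ == "__main__":
    # A local socket pair stands in for the physical-to-virtual link.
    sender, receiver = socket.socketpair()
    send_waypoints(sender, WAYPOINTS)
    sender.close()
    received = b""
    while True:
        chunk = receiver.recv(4096)
        if not chunk:
            break
        received += chunk
    receiver.close()
    print(decode_waypoints(received))
```

Carrying a sequence number in each message lets the receiving side detect lost or reordered waypoints, which is one simple way to check interface availability.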
Using the actual UAV’s flight trajectory as the reference, the average errors of the virtual UAV’s position data in the X, Y, and Z directions were calculated to be 0.24 m, 0.27 m, and 0.23 m, respectively. These mean errors are small relative to the 5 m side length of the flight pattern, and the virtual trajectory closely follows the preset one, which meets the simulation requirements of the virtual UAV on the UAV real–virtual integrated verification platform. With the consistency of the virtual–real mapping and the effectiveness of the interface communication confirmed, the subsequent tests were conducted.
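A per-axis average error of this kind can be computed as in the following sketch. It assumes that the real and virtual trajectories have already been resampled to the same time stamps and that the error is the mean absolute difference per axis; the paper does not state either detail, and the function name is illustrative.

```python
import numpy as np


def per_axis_mean_error(real_xyz, virtual_xyz):
    """Mean absolute position error along X, Y, Z, with the real trajectory as reference.

    Both inputs are (T, 3) arrays sampled at the same T time stamps (an assumption).
    """
    real = np.asarray(real_xyz, dtype=float)
    virtual = np.asarray(virtual_xyz, dtype=float)
    if real.shape != virtual.shape or real.ndim != 2 or real.shape[1] != 3:
        raise ValueError("trajectories must both have shape (T, 3)")
    return np.mean(np.abs(virtual - real), axis=0)


if __name__ == "__main__":
    # Tiny made-up example, not the measured data from the experiment.
    real = [[0.0, 0.0, 4.0], [0.0, 5.0, 4.0]]
    virtual = [[0.2, 0.1, 4.3], [0.4, 5.3, 3.9]]
    print(per_axis_mean_error(real, virtual))
```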
4.3.3. Experiment 2: Verification of the Accuracy of LSTM-DQN-APF Prediction
- (1) LSTM-DQN-APF
The parameter configuration of the LSTM-DQN-APF network has a significant impact on both prediction accuracy and speed. In this paper, the LSTM component has two hidden layers of 50 neurons each, and the DQN component has three fully connected hidden layers of 128 neurons each, which enhances the decision-making and prediction capabilities. During training, the learning rate is adjusted dynamically to ensure effective convergence at different stages, and a batch size of 40 is used for each training iteration. The model takes five main features as input: latitude, longitude, altitude, horizontal velocity, and vertical velocity, each provided as a time series. The input data for each batch are therefore a three-dimensional tensor of shape B × 40 × N, where B is the number of time steps, 40 is the batch size, and N = 5 is the number of features. To determine the optimal time step for the LSTM-DQN-APF network in quadrotor UAV trajectory prediction, we compare the prediction errors for time steps from 5 to 50. Mean absolute error (MAE) is used as the evaluation metric. MAE quantifies the closeness between the predicted UAV trajectory points and their actual values; a lower MAE indicates that the LSTM-DQN-APF-based quadrotor UAV trajectory prediction model has higher accuracy.
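The input layout and the metric can be made concrete with a short sketch. It reads the 40 in B × 40 × N as the batch size and B as the number of time steps, so that one batch is stored time-major; this reading, the sliding-window construction, and the function names are assumptions, since the paper does not describe how the windows are built.

```python
import numpy as np

N_FEATURES = 5   # latitude, longitude, altitude, horizontal velocity, vertical velocity
BATCH_SIZE = 40  # batch size stated in the text


def make_windows(series, time_steps):
    """Slice a (T, N) series into inputs (T - B, B, N) and next-point targets (T - B, N)."""
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - time_steps
    if n_windows <= 0:
        raise ValueError("series is too short for the requested time step")
    inputs = np.stack([series[i:i + time_steps] for i in range(n_windows)])
    targets = series[time_steps:]
    return inputs, targets


def time_major_batch(inputs, start, batch_size=BATCH_SIZE):
    """Return one batch in (B, batch, N) layout, i.e. time steps first."""
    batch = inputs[start:start + batch_size]   # (batch, B, N)
    return np.transpose(batch, (1, 0, 2))      # (B, batch, N)


def mean_absolute_error(predicted, actual):
    """MAE between predicted and actual trajectory values."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))


if __name__ == "__main__":
    # Synthetic series of 100 samples with 5 features, used only to show the shapes.
    series = np.arange(100 * N_FEATURES, dtype=float).reshape(100, N_FEATURES)
    inputs, targets = make_windows(series, time_steps=45)
    print(inputs.shape, targets.shape, time_major_batch(inputs, 0).shape)
```

Sweeping `time_steps` from 5 to 50 and recording the MAE for each feature would reproduce the kind of comparison the paragraph describes.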
As can be seen from Table 2, when the time step is set to 40, the prediction errors for both flight altitude and latitude are minimized, while the prediction error for longitude is smallest at a time step of 45. Therefore, the optimal time steps for predicting the altitude, latitude, and longitude of a quadrotor UAV are 40, 40, and 45, respectively.
Table 2 also indicates that prediction accuracy initially improves as the time step increases, suggesting that the LSTM-DQN-APF network benefits from longer time-series input. Beyond a certain time step, however, further increases reduce prediction accuracy.
As shown in Table 3, under the given network structure and time step, the multi-step prediction results for the quadrotor UAV trajectory show that one-step and two-step predictions fall within an acceptable error range, while the error of three-step predictions increases significantly.
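One common way to obtain multi-step predictions, and one that explains why the error grows with the horizon, is recursive forecasting: each predicted point is appended to the input window and used to predict the next one, so earlier errors feed into later steps. The paper does not state whether its multi-step results are recursive or direct, so the sketch below is only an illustration; `constant_velocity_predictor` is a hypothetical stand-in for the trained network.

```python
import numpy as np


def multi_step_forecast(window, predict_next, n_steps):
    """Recursively forecast n_steps points by feeding each prediction back into the window."""
    window = np.asarray(window, dtype=float).copy()
    forecasts = []
    for _ in range(n_steps):
        next_point = np.asarray(predict_next(window), dtype=float)
        forecasts.append(next_point)
        # Drop the oldest sample and append the prediction as the newest one.
        window = np.vstack([window[1:], next_point])
    return np.array(forecasts)


def constant_velocity_predictor(window):
    """Hypothetical stand-in for the trained model: extrapolate the last displacement."""
    return window[-1] + (window[-1] - window[-2])


if __name__ == "__main__":
    history = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
    print(multi_step_forecast(history, constant_velocity_predictor, n_steps=3))
```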
- (2) Comparison of the prediction errors of the LSTM-DQN-APF, LSTM, BP, and CNN models along the x, y, and z axes
As illustrated in Figure 7, the predictions of the four models for the three coordinate axes (x, y, z) generally follow the trend of the actual trajectory. However, the prediction curves of the BP (backpropagation neural network) and CNN (convolutional neural network) models deviate significantly from the actual trajectory, whereas the LSTM and LSTM-DQN-APF models show smaller prediction errors. For all models, the prediction error along the z-axis is slightly higher than along the x and y axes, and the BP and CNN models fluctuate considerably at various positions. Although the LSTM model’s prediction error for the z-axis height remains within an acceptable range, it is still greater than that of the LSTM-DQN-APF model. The figure indicates that the trajectory predicted by the LSTM-DQN-APF model most closely follows the actual trajectory, with the smallest error and the highest accuracy of the four models.
4.3.4. Experiment 3: Algorithm Validation Test for Urban Warfare Scenario
In this paper, we simulate a complex urban battlefield scenario that includes roads, trees, buildings, landscape trees, street lights, built-up areas, and walking pedestrians. Figure 8 shows a panoramic dynamic threat map of the urban scene.
The main obstacle information in the urban scenario is shown in Table 4.
To increase sample diversity, this paper conducts three different experiments: (1) with static obstacles; (2) with dynamic obstacles; and (3) with wind effects. These experiments provide richer training data for the agents, enhancing their adaptability under different environmental conditions. The experience pool in this paper holds 10,000 samples, and the three experiments diversify the samples stored in it, improving the performance of the model in various complex scenes. To further verify the performance of the method, the LSTM-DQN-APF path planning method is compared with APF and GE-APF [29,30,31] in the same test environment, where the attraction and repulsion of the GE-APF algorithm can be defined as:
and are the gain factors of the attractive force, where their summation produces the maximum attractive force. The gain factors of the repulsive force are denoted as and , where their summation gives the maximum repulsive force. , , , and are positive constants that contribute to identifying the required minimum velocity and minimum displacement that generate the maximum attractive and repulsive Forces.
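The experience pool of 10,000 samples mentioned at the start of this experiment can be pictured as a fixed-capacity buffer that discards its oldest transitions once full and is sampled at random during training. The sketch below is a generic replay buffer under that assumption; the class name, the transition layout, and the uniform sampling are illustrative and are not taken from the paper.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity pool of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=10_000, seed=None):
        # deque with maxlen drops the oldest transition automatically when full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store one transition."""
        self._buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a uniform random batch without replacement (an assumed sampling rule)."""
        if batch_size > len(self._buffer):
            raise ValueError("not enough stored transitions for this batch size")
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


if __name__ == "__main__":
    pool = ExperiencePool(capacity=10_000, seed=0)
    for step in range(100):
        pool.add(state=step, action=0, reward=0.0, next_state=step + 1, done=False)
    print(len(pool), len(pool.sample(40)))
```

Mixing transitions from the static, dynamic, and windy scenarios in one pool is what makes the sampled batches more diverse.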
- (1) Static obstacles
To ensure the comprehensiveness and accuracy of the experiment, two task points were set up in the scene, and each was tested independently to assess the navigation performance of the UAV in different situations. In each experiment, the UAV uses the traditional artificial potential field method (APF) and the improved artificial potential field methods (GE-APF and LSTM-DQN-APF), respectively. A flight mission is considered complete when the UAV starts from the starting position (0,0,0) and reaches the task point while avoiding all obstacles in the scene. To ensure the reliability of the results, 10 independent experiments were conducted for each task point. The three algorithms are evaluated by comparing their flight trajectories, flight times, and flight distances in accomplishing the tasks.
The details of the task points are shown in Table 5. The task points are set from near to far, covering different spatial locations and obstacle distributions, so as to comprehensively test the navigation performance and adaptability of the artificial potential field methods.
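For orientation, the traditional APF baseline can be sketched in its classical textbook form: an attractive force proportional to the distance to the goal, plus a repulsive force that acts only inside an influence radius around each obstacle. This is the standard formulation and not the GE-APF or LSTM-DQN-APF variant evaluated here; the gains `k_att` and `k_rep`, the influence radius `d0`, and the point-obstacle model are illustrative assumptions.

```python
import numpy as np


def attractive_force(position, goal, k_att=1.0):
    """Classical attractive force: pulls the UAV linearly towards the goal."""
    position = np.asarray(position, dtype=float)
    goal = np.asarray(goal, dtype=float)
    return k_att * (goal - position)


def repulsive_force(position, obstacle, k_rep=1.0, d0=2.0):
    """Classical repulsive force from a point obstacle, zero outside the radius d0."""
    position = np.asarray(position, dtype=float)
    obstacle = np.asarray(obstacle, dtype=float)
    offset = position - obstacle
    d = np.linalg.norm(offset)
    if d >= d0 or d == 0.0:
        # Outside the influence radius (or exactly on the obstacle, where the
        # direction is undefined) this sketch applies no repulsion.
        return np.zeros_like(position)
    magnitude = k_rep * (1.0 / d - 1.0 / d0) / d**2
    return magnitude * offset / d


def total_force(position, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Sum of the attractive force and the repulsive forces of all obstacles."""
    force = attractive_force(position, goal, k_att)
    for obstacle in obstacles:
        force = force + repulsive_force(position, obstacle, k_rep, d0)
    return force


if __name__ == "__main__":
    # Start position and Task 1 target from the text; the obstacle location is made up.
    print(total_force((0, 0, 0), (20, 20, 5), obstacles=[(1, 1, 0)]))
```

When the attractive and repulsive terms cancel away from the goal, the total force vanishes and the UAV stalls, which is the local-minimum behavior that the improved methods aim to avoid.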
Figure 9 presents the experimental trajectory data of the traditional and improved artificial potential field methods on the UAV virtual–real integration verification platform for Task 1. Blue represents LSTM-DQN-APF, green represents GE-APF, and red represents APF. Compared with APF and GE-APF, the LSTM-DQN-APF trajectory is smoother and does not exhibit severe fluctuations or abrupt changes.
From Figure 10, it can be seen that the flight time of LSTM-DQN-APF (blue) is shorter in all 10 experiments than that of APF (red) and GE-APF (green). The average flight time of the LSTM-DQN-APF algorithm is 12.87 s, while the average flight times of GE-APF and APF are 16.65 s and 19.16 s, respectively. This difference reflects a significant improvement in the flight efficiency of LSTM-DQN-APF. A shorter flight time not only means that the UAV can complete flight missions faster but also reduces energy consumption, thereby extending the UAV’s endurance.
Figure 10 also compares the three algorithms in terms of flight distance. The average flight distance of the LSTM-DQN-APF algorithm to the target point (20,20,5) is 32.89 m, while those of the GE-APF and APF algorithms are 36.73 m and 39.81 m, respectively. The average variance of the flight time is 15.424 s for APF, 6.72 s for GE-APF, and 1.336 s for LSTM-DQN-APF; the average variance of the flight distance is 67.942 m for APF, 9.41 m for GE-APF, and 9.896 m for LSTM-DQN-APF.
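The per-algorithm averages and variances over the 10 runs can be obtained as in the sketch below. The paper does not say whether the sample or the population variance is reported; the sketch assumes the sample variance (`ddof=1`), and the flight times in the example are made-up values, not the measurements behind Figure 10.

```python
import numpy as np


def summarize_runs(values):
    """Return the mean and sample variance of one metric over repeated runs."""
    values = np.asarray(values, dtype=float)
    if values.size < 2:
        raise ValueError("need at least two runs to estimate a variance")
    return {"mean": float(values.mean()), "variance": float(values.var(ddof=1))}


if __name__ == "__main__":
    # Hypothetical flight times (s) for ten runs; NOT the paper's measurements.
    example_times = [12.1, 13.4, 12.9, 11.8, 13.0, 12.6, 14.2, 12.3, 13.7, 12.7]
    print(summarize_runs(example_times))
```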
Figure 11 shows the trajectory data of the UAV flying to the target point (40,20,5) in Task 2. From the figure, it can be seen that the main obstacle is Building 1, located at (20,10,0) with a height of 10 m. Faced with this obstacle, LSTM-DQN-APF, GE-APF, and APF chose different avoidance directions. Compared with GE-APF and APF, the LSTM-DQN-APF trajectory was smoother, and the extra distance flown to avoid the obstacle was smaller.
In Figure 12, it can be seen that LSTM-DQN-APF outperforms GE-APF and APF in both flight time and flight distance. The average flight time of LSTM-DQN-APF is 21.15 s and its average flight distance is 51.23 m, compared with 26.98 s and 65.42 m for GE-APF and 30.33 s and 71.6 m for APF. For flight time, the average variance is 15.424 s for APF, 6.87 s for GE-APF, and 1.121 s for LSTM-DQN-APF; for flight distance, the average variance is 67.942 m for APF, 10 m for GE-APF, and 9.896 m for LSTM-DQN-APF.
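The reported averages for both tasks can be turned into relative reductions with a few lines of arithmetic. The sketch below uses only the average flight times and distances quoted in the text for Task 1 and Task 2; the dictionary layout and function name are illustrative.

```python
# Average flight time (s) and distance (m) per algorithm, as reported in the text.
REPORTED = {
    "Task 1": {
        "LSTM-DQN-APF": (12.87, 32.89),
        "GE-APF": (16.65, 36.73),
        "APF": (19.16, 39.81),
    },
    "Task 2": {
        "LSTM-DQN-APF": (21.15, 51.23),
        "GE-APF": (26.98, 65.42),
        "APF": (30.33, 71.6),
    },
}


def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    for task, results in REPORTED.items():
        ours_time, ours_dist = results["LSTM-DQN-APF"]
        for name in ("GE-APF", "APF"):
            base_time, base_dist = results[name]
            print(
                f"{task} vs {name}: "
                f"time -{percent_reduction(base_time, ours_time):.1f}%, "
                f"distance -{percent_reduction(base_dist, ours_dist):.1f}%"
            )
```

On these figures, LSTM-DQN-APF shortens the average flight time by roughly 22% to 33% and the average flight distance by roughly 10% to 28%, depending on the task and the baseline.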
- (2) Dynamic obstacles
The starting point of the UAV is (0,0,0), and the target point is (10,50,0). The motion parameters of the three spheres are shown in Table 6. As can be seen from Figure 13, the scene contains dynamic obstacles, including moving spheres and pedestrians, and the movement directions of the spheres are marked in the figure. Figure 13a shows the initial static state; Figure 13b shows the state after 15 s of motion.
As shown in Figure 14a, the task is segmented into three target waypoints: the first is located at (4,20,5), the second at (12,25,5), and the third at (10,50,0). When the UAV approaches the first dynamic obstacle, it begins to adjust its trajectory in response to the perceived relative velocity and the repulsive force exerted by the obstacle. Figure 14b depicts the second phase of trajectory planning in the dynamic environment, in which the UAV avoids both the dynamic sphere and the pedestrians as it continues toward the target waypoint. Upon entering the influence zone of the second dynamic obstacle, the flight path planning system adjusts the UAV’s course according to the obstacle’s influence. Figure 14c illustrates the final stage of trajectory planning in the dynamic environment, where the UAV reaches its designated target.
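The segmentation of the task into three waypoints implies some rule for switching from one waypoint to the next. A minimal sketch of such a rule is given below; the paper does not describe its switching criterion, so the arrival tolerance of 0.5 m and the function name are assumptions.

```python
import math

# The three target waypoints listed in the text.
WAYPOINTS = [(4, 20, 5), (12, 25, 5), (10, 50, 0)]


def next_waypoint_index(position, waypoints, current_index, tolerance=0.5):
    """Advance to the next waypoint once the UAV is within `tolerance` metres of the current one.

    Returns len(waypoints) when every waypoint has been reached.
    """
    if current_index >= len(waypoints):
        return current_index
    if math.dist(position, waypoints[current_index]) <= tolerance:
        return current_index + 1
    return current_index


if __name__ == "__main__":
    index = 0
    # Hypothetical positions along a flight, used only to exercise the switching rule.
    for position in [(0, 0, 0), (4, 20, 5.2), (12, 25, 5), (10, 49.8, 0)]:
        index = next_waypoint_index(position, WAYPOINTS, index)
        print(position, "-> active waypoint index", index)
```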
Figure 15 shows the reward value as a function of the number of iterations for DQN and LSTM-DQN-APF. DQN learns quickly in the initial phase of the experiment: its reward value rises rapidly, which implies that it adapts to the environment and finds a suitable path planning strategy relatively quickly. In the middle and late stages, however, the reward value of DQN shows a clear decreasing trend, which may indicate problems in the subsequent iterations, such as falling into a local optimum or a deteriorating ability to adapt to environmental changes.
In contrast, LSTM-DQN-APF rises quickly to a positive reward value at the beginning of the experiment, which shows that the algorithm learns faster and finds an effective obstacle-avoidance strategy sooner than the DQN algorithm. LSTM-DQN-APF not only performs well at the beginning of the learning process but is also more stable, with relatively small fluctuations in its reward value throughout the iterations, which further demonstrates its superior performance in handling this task.
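Observations such as "rises quickly" and "relatively small fluctuations" can be quantified from a logged reward curve. The sketch below shows two simple indicators, a moving average for the trend and the standard deviation over the last iterations for late-stage stability; both the indicators and the window sizes are illustrative choices, not the analysis used in the paper.

```python
import numpy as np


def moving_average(rewards, window):
    """Smooth a reward curve with a simple moving average of length `window`."""
    rewards = np.asarray(rewards, dtype=float)
    if window < 1 or window > rewards.size:
        raise ValueError("window must be between 1 and the number of rewards")
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")


def late_stage_std(rewards, last_n):
    """Standard deviation of the reward over the last `last_n` iterations."""
    rewards = np.asarray(rewards, dtype=float)
    if last_n < 1 or last_n > rewards.size:
        raise ValueError("last_n must be between 1 and the number of rewards")
    return float(np.std(rewards[-last_n:]))


if __name__ == "__main__":
    # Synthetic reward curve for illustration only.
    rng = np.random.default_rng(0)
    rewards = np.linspace(-10.0, 5.0, 200) + rng.normal(0.0, 1.0, 200)
    print(moving_average(rewards, 20)[-1], late_stage_std(rewards, 50))
```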
- (3) Wind effects
As shown in Figure 16, a horizontal westward wind field with a wind speed of 5 m/s is added to the simulation environment, and the mission target point is set to (10,0,5), towards which the virtual UAV flies at a speed of 1 m/s. The green line is the flight path of the virtual UAV without wind disturbance, and the purple line is the actual flight trajectory of the virtual UAV under the influence of wind.
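The gap between the green and purple paths can be expressed as a cross-track deviation, that is, the perpendicular distance of each point of the windy trajectory from the undisturbed path. The sketch below assumes that the undisturbed path is the straight segment from the start to the target, which the paper does not state; the start point and sample points in the example are made up.

```python
import numpy as np


def cross_track_deviation(points, start, target):
    """Perpendicular distance of each trajectory point from the line start -> target."""
    points = np.asarray(points, dtype=float)
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    direction = target - start
    length = np.linalg.norm(direction)
    if length == 0.0:
        raise ValueError("start and target must differ")
    unit = direction / length
    relative = points - start
    along = relative @ unit                       # progress along the reference line
    perpendicular = relative - np.outer(along, unit)
    return np.linalg.norm(perpendicular, axis=1)


if __name__ == "__main__":
    # Target from the text; the start point and trajectory samples are hypothetical.
    windy_path = [(2.0, 0.5, 5.0), (5.0, 2.0, 5.0), (8.0, 1.0, 5.0)]
    print(cross_track_deviation(windy_path, start=(0, 0, 5), target=(10, 0, 5)))
```

The maximum or mean of this deviation gives a single number for how strongly the wind displaces the flight from its intended path.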
As shown in Figure 17, the UAV starts from (0,0,0), the target point is (5,25,0), and the flight speed is 1 m/s. The red line represents LSTM-DQN-APF, which moves towards the target point, whereas GE-APF and APF initially keep circling under the influence of wind. In a real flight, such behavior is very dangerous. APF falls into a local minimum after 15 s.
Table 7 compares the average flight time, average path length, and task completion rate of APF, GE-APF, and LSTM-DQN-APF under windy and windless conditions. The LSTM-DQN-APF algorithm shows significant advantages in all three metrics under both conditions. In contrast, the APF algorithm has problems in both environments. Under windless conditions, the UAV using APF entered a target-unreachable state after flying for 20 s, which increased the experimental time and path length. Under windy conditions, the UAV using APF is pushed away from the target point by the wind field and becomes trapped in a local minimum after flying for 15 s. Under the influence of the wind field, the virtual UAV in the simulation environment also experiences multiple collisions and fails to find the correct path for an extended period. This not only increases the experimental time and flight path length but also introduces potential safety risks in practical applications. While the GE-APF algorithm outperforms APF in certain aspects, its overall performance is still inferior to that of LSTM-DQN-APF. These results indicate that the LSTM-DQN-APF algorithm offers better performance and adaptability in UAV path planning and flight control, especially in complex environments such as wind fields, and provides more reliable algorithmic support for UAV applications in real-world complex environments.