Article
Peer-Review Record

Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning

by Jin Tang 1,2, Yangang Liang 1,2 and Kebo Li 1,2,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 11 December 2023 / Revised: 4 February 2024 / Accepted: 5 February 2024 / Published: 9 February 2024
(This article belongs to the Special Issue Advances in Quadrotor Unmanned Aerial Vehicles)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper addresses a current and interesting topic. The main issues are the lack of clear objectives and sound contributions.

The references are poor: some cited journals are difficult to locate, and other entries do not reference the updated citation.

The paper also shows substantial overlap with the article "Towards Real-Time Path Planning through Deep Reinforcement Learning for a UAV in Dynamic Environments" (reference [19]).

The improvements of this paper over that article need to be clearly stated to justify the significance of your contribution.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Traditional UAV trajectory planning methods have focused on solving planning problems in static scenes, have struggled to balance optimality and real-time performance, and have been prone to local optima. In this paper, the authors propose an improved deep reinforcement learning approach for UAV trajectory planning in dynamic scenarios. First, they construct a problem scenario that includes an obstacle estimation model and formulate the UAV path planning problem as a Markov Decision Process (MDP). They map the MDP onto a reinforcement learning framework, design the state space, action space, and reward function, and incorporate heuristic rules into the action search strategy. Second, they approximate the Q-function with an extended D3QN equipped with a prioritized experience replay mechanism and design the algorithm's network structure on the TensorFlow framework. Through extensive training, they obtain reinforcement-learning-based path planning policies for static and dynamic scenes and use a visualized action field to analyze planning performance. Simulations show that the proposed algorithm can solve UAV path planning problems in dynamic scenes and outperforms classical methods such as A*, RRT, and DQN in terms of planning efficiency.
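To make the summarized architecture concrete, the following is a minimal sketch of the dueling Q-network head described above (D3QN combines the dueling architecture with Double DQN target evaluation), written against the TensorFlow/Keras API mentioned in the summary. The layer widths, the 8-dimensional state, and the 9 discrete actions are illustrative assumptions, not the authors' actual design.

```python
# Minimal dueling Q-network sketch (assumed sizes, not the paper's).
import tensorflow as tf
from tensorflow.keras import layers, Model


def build_dueling_q_network(state_dim: int, num_actions: int) -> Model:
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    state_in = layers.Input(shape=(state_dim,))
    x = layers.Dense(128, activation="relu")(state_in)
    x = layers.Dense(128, activation="relu")(x)

    # Value stream: a single scalar V(s).
    value = layers.Dense(1)(layers.Dense(64, activation="relu")(x))
    # Advantage stream: one A(s, a) per discrete action.
    advantage = layers.Dense(num_actions)(layers.Dense(64, activation="relu")(x))

    # Subtract the mean advantage so the V/A decomposition is identifiable.
    q_values = layers.Lambda(
        lambda va: va[0] + va[1] - tf.reduce_mean(va[1], axis=1, keepdims=True)
    )([value, advantage])
    return Model(inputs=state_in, outputs=q_values)


# Double DQN keeps two copies: the online network selects the greedy
# action; the periodically synchronized target network evaluates it.
online_net = build_dueling_q_network(state_dim=8, num_actions=9)
target_net = build_dueling_q_network(state_dim=8, num_actions=9)
target_net.set_weights(online_net.get_weights())
```

Prioritized experience replay, also mentioned in the summary, would live in the training loop rather than in the network: transitions are sampled with probability proportional to their TD error and weighted by importance-sampling corrections.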

My comments:

1) The authors should more clearly emphasize the main points of novelty that distinguish their work from closely related papers on the topic.

2) The punctuation looks sloppy. For example, in Lines 129, 137, and elsewhere, the word "where" should start with a lowercase letter.

3) In general, it is unclear how the emphasis on UAVs relates to the rather simple model the authors use. What specifics of UAVs are manifested in the model?

4) Line 184: what are the 4 conditions in question?

5) Formulas (8) and (9): what are $w$ and $w'$? They do not appear to be defined. Likewise, $\gamma$ is not defined. (See the notation note after these comments.)

6) Formula (12): what is $\epsilon_0$?
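For reference only, and not as a reconstruction of the paper's actual Equations (8), (9), and (12): in conventional Double DQN notation, $w$ denotes the online network weights, $w'$ the periodically updated target network weights, and $\gamma \in [0, 1)$ the discount factor, so the learning target is typically

$$ y_t = r_{t+1} + \gamma \, Q\big(s_{t+1},\ \arg\max_{a} Q(s_{t+1}, a;\ w);\ w'\big), $$

while $\epsilon_0$ conventionally denotes the initial exploration rate of an $\epsilon$-greedy policy, for example in a decay schedule such as $\epsilon_t = \max(\epsilon_{\min},\ \epsilon_0 \lambda^t)$.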

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

My comments have been addressed
