Article
Peer-Review Record

Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning

Aerospace 2024, 11(9), 753; https://doi.org/10.3390/aerospace11090753
by Sitong Liu 1,2, Shuyu Zhou 1,*, Jinggang Miao 1,2,*, Hai Shang 1, Yuxuan Cui 1 and Ying Lu 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 6 August 2024 / Revised: 2 September 2024 / Accepted: 11 September 2024 / Published: 13 September 2024
(This article belongs to the Section Astronautics & Space Science)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper presents an interesting application of DRL to the problem of stratospheric airship trajectory planning. However, to meet the standards of the Aerospace journal, significant improvements are needed in terms of methodological rigor, validation, and discussion of limitations. The paper would benefit from a more critical approach, both in evaluating the proposed method and in comparing it with existing solutions. Additionally, the paper should provide more detailed justifications for the choices made in the model design and training process. Please address these and the attached comments, and reflect them in your manuscript.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Please check the attached comments regarding the clarity of the language used.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The article develops a method for controlling an airship based on reinforcement learning techniques. I have the following remarks regarding the work:

1. Please add to the Introduction and comment there on the following works:

https://www.sciencedirect.com/science/article/abs/pii/S1270963820307823

https://arc.aiaa.org/doi/abs/10.2514/6.2024-3897

https://ieeexplore.ieee.org/abstract/document/7555054

https://link.springer.com/chapter/10.1007/978-981-15-8450-3_56

https://onlinelibrary.wiley.com/doi/full/10.1155/2019/7854173

https://ieeexplore.ieee.org/abstract/document/8993815

https://www.emerald.com/insight/content/doi/10.1108/IJICC-02-2017-0017/full/html

https://www.sciencedirect.com/science/article/abs/pii/S0273117722003672

Although not all of these works are related to planning, they nevertheless address reinforcement learning methods for airship control, consider disturbances (including those from wind), and deal with the problem of maintaining motion.

2. Line 150: If there's a reference for the data you're using, please provide it here.

3. In formula (9), there should be no minus sign before the exponent.

4. Lines 203-204: Please replace the sentence with a more accurate one: "After normalization, the observation is fed into the neural network". It's not the space that's input into the neural network, but rather the observation vector.

5. Table 2: The angular velocity ω is not entirely clear. Previously, ω was used to denote the rate of change of the angle θ due to wind action. Here, it appears to be a controllable parameter. Are these different angular velocities, or how should this be interpreted?

6. Do I understand correctly that the observation contains exact values for the airship's airspeed, heading, wind speed, wind direction, etc.? Does the airship have the ability to accurately determine wind speed and direction? Wouldn't it be more reasonable to add some noise to these values before feeding them into the neural network? In reinforcement learning, it's easy to account for noise in the observation vector, and the resulting model would be more robust.
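For illustration only (this is a minimal sketch rather than code from the manuscript; the environment class, component order, and noise magnitudes are hypothetical), such noise injection can be implemented as a thin observation wrapper in a Gym-style training pipeline:

import numpy as np
import gymnasium as gym

class NoisyObservationWrapper(gym.ObservationWrapper):
    # Adds zero-mean Gaussian noise to the observation vector (e.g., airspeed,
    # heading, wind speed, wind direction) so the policy is not trained on
    # perfectly known quantities.
    def __init__(self, env, noise_std):
        super().__init__(env)
        self.noise_std = np.asarray(noise_std, dtype=np.float32)  # per-component standard deviation

    def observation(self, obs):
        noise = np.random.normal(0.0, self.noise_std, size=obs.shape)
        return (obs + noise).astype(np.float32)

# Usage (hypothetical environment and noise levels):
# env = NoisyObservationWrapper(StationKeepingEnv(), noise_std=[0.5, 0.02, 1.0, 0.05])

The policy trained on such perturbed observations would be more robust to measurement errors in the onboard estimates of wind speed and direction.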

7. Figure 4 is not entirely correct. If there's no convergence (i.e., if the learning algorithm's stopping criteria are not met), there's no need to reinitialize the neural networks for the Actor and Value. The arrow should probably point to the Agent or PPO Continuous Model instead.

8. Lines 270-272: Does this mean that airships are not launched during this time? Or does it mean that the model cannot be adequately trained for this period? Or is this done intentionally to test the model later? Please clarify this in the text.

9. Please provide higher-resolution versions of Figures 5a and 5b.

10. Could you please explain in the text how the parameters in Table 3 were selected: were they manually adjusted to achieve adequate airship control, or were they optimized?

Overall, I believe the article requires minor revision.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Review of the revised manuscript "Autonomous Trajectory Planning Method for Stratospheric Airship Regional Station-Keeping Based on Deep Reinforcement Learning".

The revised manuscript is now well-written and ready for publication in its current form. The authors have addressed all the reviewers' comments and suggestions. 
