Article
Peer-Review Record

Multi-UAV Autonomous Path Planning in Reconnaissance Missions Considering Incomplete Information: A Reinforcement Learning Method

by Yu Chen 1, Qi Dong 1,2,*, Xiaozhou Shang 2, Zhenyu Wu 3 and Jinyu Wang 4
Submission received: 30 November 2022 / Revised: 14 December 2022 / Accepted: 17 December 2022 / Published: 23 December 2022
(This article belongs to the Special Issue Intelligent Coordination of UAV Swarm Systems)

Round 1

Reviewer 1 Report

[Comment 1] Novelty

[Subcomment 1a] I question the novelty of this study. How does this study differ from https://ieeexplore.ieee.org/document/9838808

[Subcomment 1b] There are many studies using a centralized learning-decentralized execution scheme. The authors must show how their proposed method differs from the ones in these studies (and many more):

- https://ieeexplore.ieee.org/abstract/document/8917217

- https://ieeexplore.ieee.org/abstract/document/9037042

- https://ieeexplore.ieee.org/document/9560748

- https://arc.aiaa.org/doi/abs/10.2514/6.2021-1952

[Subcomment 1c] When stating the novelty points, the authors must contrast their study with other studies, stating how they differ and citing the specific previous studies.

 

[Comment 2] Numerical experiments

[Subcomment 2a] The authors should show the parameters used in their proposed model.

[Subcomment 2b] The authors must present the numerical experiments comparing their proposed method with the state-of-the-art approach of the considered problem.

 

[Comment 3] Writing quality and clarity

[Subcomment 3a] Please correct inappropriate use of capital letters (e.g., We in abstract).

[Subcomment 3b] Many spaces were not used appropriately.

[Subcomment 3c] The authors must define each abbreviation in its first use.

[Subcomment 3d] Please revise the grammatical mistakes, e.g., in line 176.

Author Response

Please see the attachment.

Thank you for your comments and suggestions; we have taken all of them into account.

Point 1: Novelty

[Subcomment 1a] I question the novelty of this study. How does this study differ from https://ieeexplore.ieee.org/document/9838808

Response 1a: The above study adopted a multi-agent policy gradient algorithm based on the centralized training and decentralized execution (CTDE) framework to achieve dynamic routing of UAV swarms under constraints, and introduced a counterfactual baseline scheme to improve convergence speed. Our study also uses reinforcement learning for multi-agent path planning. However, because the two tasks differ, our study specifically seeks the optimal policy under incomplete information. We used the CTDE framework to obtain more information and introduced a recurrent neural network to retain historical information. We also explored in depth, through controlled experiments, how centralized training improves the model’s performance. Unlike the prior study, we used multi-objective reinforcement learning to obtain a set of solutions that approaches the Pareto front, instead of manually designing the reward function to find the optimal solution under constraints. We state these novelties in the revision.
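To make the Pareto-front idea in Response 1a concrete, the following is a minimal sketch rather than the authors' method. It assumes each candidate policy has already been evaluated and summarized as a tuple of objective values to be maximized; the two objectives and the numbers in `policy_returns` are hypothetical and are not taken from the manuscript.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical per-policy returns: (objective 1, objective 2).
policy_returns = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9), (0.1, 0.1)]
front = pareto_front(policy_returns)
print(front)  # [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9)]
```

No single point in `front` is best on both objectives, which is why a multi-objective method returns a set of trade-off solutions instead of one policy tuned to a hand-weighted reward.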

 

[Subcomment 1b] There are many studies using a centralized learning-decentralized execution scheme. The authors must show how their proposed method differs from the ones in these studies (and many more):

- https://ieeexplore.ieee.org/abstract/document/8917217

- https://ieeexplore.ieee.org/abstract/document/9037042

- https://ieeexplore.ieee.org/document/9560748

- https://arc.aiaa.org/doi/abs/10.2514/6.2021-1952

Response 1b: The above studies adopted the centralized learning-decentralized execution scheme to guide multiple agents to cooperate on different tasks, which shows that the CTDE framework is useful, but these works did not investigate why the CTDE scheme improves performance. Our study not only used the CTDE scheme for training but also set up controlled experiments to validate its effectiveness. We concluded that the effect of centralized training becomes more pronounced as the number of UAVs grows. We also confirmed the different roles of the critic network and the actor network. The results show that the model’s performance improves greatly after an RNN layer is added to the critic network, which verifies that the critic network plays the role of a central controller. By contrast, the model’s performance remains unchanged after an RNN layer is added to the actor network, because the actor network only outputs actions according to the current state. This shows that the decentralized execution scheme can reduce the dependence on information.
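The split of roles described in Response 1b can be illustrated with a small structural sketch. It is an assumption-laden toy, not the authors' architecture: the class names `Actor` and `RecurrentCritic`, the dimensions, the random weights, and the plain tanh recurrent cell are all hypothetical, and training (gradient updates) is omitted entirely. It only shows that the actor is a memoryless map from one UAV's local observation to an action, while the centralized critic sees the joint observation and carries a hidden state across time steps.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix `w` (list of rows) by vector `x`."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Actor:
    """Decentralized actor: uses only one UAV's current local observation."""

    def __init__(self, obs_dim: int, n_actions: int, rng: random.Random) -> None:
        self.w = random_matrix(n_actions, obs_dim, rng)

    def act(self, obs: Vector) -> int:
        scores = matvec(self.w, obs)
        # Greedy choice: index of the highest-scoring action.
        return max(range(len(scores)), key=scores.__getitem__)


class RecurrentCritic:
    """Centralized critic: sees all UAVs' observations and keeps a hidden
    state, so its value estimate can depend on past observations."""

    def __init__(self, joint_dim: int, hidden_dim: int, rng: random.Random) -> None:
        self.w_x = random_matrix(hidden_dim, joint_dim, rng)
        self.w_h = random_matrix(hidden_dim, hidden_dim, rng)
        self.w_v = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
        self.hidden = [0.0] * hidden_dim

    def value(self, joint_obs: Vector) -> float:
        # Simple recurrent update: h_t = tanh(W_x x_t + W_h h_{t-1}).
        pre = [a + b for a, b in zip(matvec(self.w_x, joint_obs),
                                     matvec(self.w_h, self.hidden))]
        self.hidden = [math.tanh(p) for p in pre]
        return sum(w * h for w, h in zip(self.w_v, self.hidden))


rng = random.Random(0)
n_uavs, obs_dim, n_actions = 3, 4, 5
actors = [Actor(obs_dim, n_actions, rng) for _ in range(n_uavs)]
critic = RecurrentCritic(joint_dim=n_uavs * obs_dim, hidden_dim=8, rng=rng)

local_obs = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_uavs)]
actions = [actor.act(obs) for actor, obs in zip(actors, local_obs)]
joint_obs = [x for obs in local_obs for x in obs]
print(actions, critic.value(joint_obs))
```

At execution time only the `Actor` objects are needed, each fed its own observation; the critic and its memory are used during training alone, which is the sense in which decentralized execution reduces each UAV's dependence on shared information.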

 

[Subcomment 1c] When stating the novelty points, the authors must contrast their study with other studies, stating how they differ and citing the specific previous studies.

Response 1c: In the latest submission, we have contrasted our study with several previous studies, summarized these works, and stated our improvements in the introduction. Unlike the previous studies, we emphasized that our study explored why adopting centralized training can coordinate all the UAVs.

 

Point 2: Numerical experiments

[Subcomment 2a] The authors should show the parameters used in their proposed model.

[Subcomment 2b] The authors must present the numerical experiments comparing their proposed method with the state-of-the-art approach of the considered problem.

Response 2: In the latest submission, we show the key parameters of our model in Table 1. In addition, we added numerical experiments that support our method by comparison with the state-of-the-art approach, and we added a table comparing the proposed approach with previous similar state-of-the-art approaches.

 

Point 3: Writing quality and clarity

[Subcomment 3a] Please correct inappropriate use of capital letters (e.g., We in abstract).

[Subcomment 3b] Many spaces were not used appropriately.

[Subcomment 3c] The authors must define each abbreviation in its first use.

[Subcomment 3d] Please revise the grammatical mistakes, e.g., in line 176.

Response 3: We revised the whole manuscript carefully to avoid language errors. In addition, we checked the English carefully. We have ensured that all errors are corrected, and the language is now acceptable for the review process.

Author Response File: Author Response.docx

Reviewer 2 Report

Summary: The authors propose a multi-UAV autonomous path planning algorithm based on multi-agent reinforcement learning, which coordinates all UAVs through centralized training and decentralized execution, and they introduce the hidden state of the recurrent neural network to take advantage of historical observation information. To tackle the multi-objective optimization problem, they create a combined reward function to help UAVs learn appropriate policies.

Comments and Suggestions:

- The paper is well written and well structured, and the contribution is significant.

- Line 4: "observable.It’s difficult for a UAV to" ===> add a space and avoid informal short forms ("It's"). Please apply this everywhere in the paper.

- A related works section is missing.

- A table comparing the proposed approach with previous similar approaches would add significant value.

- The authors are invited to consider security aspects related to UAV communications. For this purpose, they may include the following references in their study:

+ https://www.mdpi.com/2504-446X/6/9/222

+ https://ieeexplore.ieee.org/document/9842403

- The authors need to provide stronger justification for the use of the adopted AI techniques.

- Why are experiments limited to simulations? Why not consider real experimentation?

- The authors are invited to share the data on which they built their experiments.

- The authors also need to identify the main limitations of the proposed approach and identify some relevant future work directions.

Author Response

Please see the attachment.

Thank you for your comments and suggestions; we have taken all of them into account.

 

Point 1: Line 4: "observable.It’s difficult for a UAV to" ===> add a space and avoid informal short forms ("It's"). Please apply this everywhere in the paper.

Response 1: We revised the whole manuscript carefully to avoid these errors. In addition, we consulted a professional editing service to check the English. We have ensured that all errors are corrected, and the language is now acceptable for the review process.

 

Point 2: A related works section is missing; a table comparing the proposed approach with previous similar approaches would add significant value.

Response 2: The related works have been added to the introduction. We have compared the SOTA approaches with our method in the latest manuscript.

Point 3: The authors are invited to consider security aspects related to UAV communications. For this purpose, they may include the following references in their study:

+ https://www.mdpi.com/2504-446X/6/9/222

+ https://ieeexplore.ieee.org/document/9842403

Response 3: Thank you for your suggestions. We have cited these papers in our study.

 

Point 4: Why are experiments limited to simulations? Why not consider real experimentation?  

Response 4: Real experiments entail higher costs: because UAVs continually optimize their policies by interacting with the environment, they may be damaged in the process. Once the success rate is close to our expectations, we will consider real experimentation.

 

Point 5: The authors are invited to share the data on which they built their experiments. The authors also need to identify the main limitations of the proposed approach and identify some relevant future work directions.

Response 5: We shared the data from our experiments in Table 2. We state the limitations of our method in the conclusion.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thank you for your revisions.

Author Response

Thank you for your suggestions. We consulted a professional editing service to check the English, and checked it carefully.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors took my comments and suggestions into consideration. Please pay attention to English mistakes in the final version. Good luck.

Author Response

Thank you for your suggestions. We consulted a professional editing service to check the English, and checked it carefully.

Author Response File: Author Response.pdf
