Article
Peer-Review Record

Completing Explorer Games with a Deep Reinforcement Learning Framework Based on Behavior Angle Navigation

Electronics 2019, 8(5), 576; https://doi.org/10.3390/electronics8050576
by Shixun You, Ming Diao and Lipeng Gao *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 7 May 2019 / Revised: 20 May 2019 / Accepted: 20 May 2019 / Published: 25 May 2019
(This article belongs to the Section Systems & Control Engineering)

Round 1

Reviewer 1 Report

- In the keywords as well as in the article, the term "cognitive electronic warfare" (CEW) appears, but the article has almost nothing to do with CEW. The topic described could indeed be used in CEW applications, but it is about target searching. Even though it is stated that the targets could also be radars, the article does not address anything connected with radar emitters (no radar equations or signal propagation equations are used to help the searching algorithms). Please reconsider whether there is a need to connect the article's subject with the term "electronic warfare".

- Row 53: Section 1 is described as related works; however, the related works are in Section 2.

- Equation 7: should it not be hj instead of hi?

- The article describes searching a certain area for an "enemy observation station" and uses the term UCAV (combat UAV). However, the article says nothing about the process of target identification (friend or foe) or the destruction process. Therefore I strongly recommend dropping the abbreviation UCAV throughout the article and using only UAV (or reconnaissance UAV).

- In Chapter 4.2, first paragraph, it is stated that "if on-policy training is adopted, the algorithm will diverge because of the strong correlation of the data". From a military point of view (and the term UCAV implies military use), if something is strongly correlated then it should not diverge. Please reformulate the statement or explain it in a new paragraph.

- Page 11: the abbreviation for the cross-entropy method (CEW) is the same as that for cognitive electronic warfare (CEW), which is confusing. Use a different abbreviation.


Author Response

Point 1: In the keywords as well as in the article, the term "cognitive electronic warfare" (CEW) appears, but the article has almost nothing to do with CEW. The topic described could indeed be used in CEW applications, but it is about target searching. Even though it is stated that the targets could also be radars, the article does not address anything connected with radar emitters (no radar equations or signal propagation equations are used to help the searching algorithms). Please reconsider whether there is a need to connect the article's subject with the term "electronic warfare".


 

Response 1: A good suggestion. After careful consideration, we updated the manuscript to strengthen the link between target searching/tracking and cognitive electronic warfare in the Introduction section:

 

the sensors set by each member of the game have typical radar characteristics, and the interaction between the radar sensors and the game environments is detailed in Reference [1], and to simplify the complexity of the model, no soft/hard killing weapons (such as jammers and missiles) are introduced in this version of the game;

 

and the Section 3.1.4:

 

Explorer is based on the vCEW framework, which includes many unique designs about radar equations, such as using the line-of-sight (LOS) angle and the equivalent detection distance (related to the radar cross section) to determine whether the station is within the field of view of the UCAV, assessing the collision caused by objects, and analysing the stage progression of the radar based on a parametric data processing system (PDPS).
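As an illustration of the visibility test described in the passage above, here is a minimal sketch of a field-of-view check that combines a line-of-sight (LOS) angle with a detection distance. It is not the vCEW implementation: the function name, its parameters, the cone-shaped field of view, and the single scalar `detection_range` standing in for the equivalent detection distance are all assumptions made for this example.

```python
import math


def in_field_of_view(uav_pos, uav_heading, station_pos,
                     half_angle_deg, detection_range):
    """Return True if the station lies inside the UAV's sensor cone.

    All names are hypothetical. The field of view is assumed to be a cone
    around the heading vector, and detection_range is an assumed scalar
    standing in for the equivalent detection distance.
    """
    # Line-of-sight vector from the UAV to the station.
    los = [s - u for s, u in zip(station_pos, uav_pos)]
    dist = math.sqrt(sum(c * c for c in los))
    if dist == 0.0:
        return True  # Co-located: treat as visible.

    head_norm = math.sqrt(sum(c * c for c in uav_heading))
    if head_norm == 0.0:
        raise ValueError("uav_heading must be a non-zero vector")

    # LOS angle: angle between the heading and the line of sight.
    cos_angle = sum(l * h for l, h in zip(los, uav_heading)) / (dist * head_norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # Guard against rounding.
    los_angle_deg = math.degrees(math.acos(cos_angle))

    return dist <= detection_range and los_angle_deg <= half_angle_deg


# Station directly ahead and within range.
print(in_field_of_view((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                       (10.0, 0.0, 0.0), 30.0, 20.0))  # True
# Station abeam (90 degrees off the heading), outside a 30-degree half-angle.
print(in_field_of_view((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                       (0.0, 10.0, 0.0), 30.0, 20.0))  # False
```

In a radar model the detection distance would typically depend on the target's radar cross section; the sketch leaves that dependence out and takes the distance as given.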

 

Point 2: The Section 1 is stated as related works, however, related works are in Section 2

 

Response 2: We updated the manuscript by modifying this section label.

 

Point 3: Equation 7 - should not be hj instead of hi?

 

Response 3: We are very grateful to the reviewer for pointing out such a subtle mistake, which we have corrected in the revised manuscript.

 

Point 4: The article describes searching a certain area for an "enemy observation station" and uses the term UCAV (combat UAV). However, the article says nothing about the process of target identification (friend or foe) or the destruction process. Therefore I strongly recommend dropping the abbreviation UCAV throughout the article and using only UAV (or reconnaissance UAV).

 

Response 4: A good suggestion. The manuscript does not describe combat or competition strategy in much detail, and the method used to control the UCAV in this paper is also applicable to conventional UAVs, so the term UCAV could be changed to UAV. However, what distinguishes a UCAV from a UAV (or reconnaissance UAV) is its unique signal processing and excellent maneuverability, so we emphasize these characteristics of the UCAV in the Introduction section of the article:

 

Note that although Explorer is not a game full of confrontation, we still need its combat vehicles (as control agents) possess excellent maneuverability and certain radar signal processing capability. Therefore, in this paper, the term UAV is qualitatively described as UCAV for distinction. Actually, the method proposed in this paper for UCAV control is also applicable to conventional UAVs.

 

Point 5: In Chapter 4.2, first paragraph, it is stated that "if on-policy training is adopted, the algorithm will diverge because of the strong correlation of the data". From a military point of view (and the term UCAV implies military use), if something is strongly correlated then it should not diverge. Please reformulate the statement or explain it in a new paragraph.

 

Response 5: The statement in the original manuscript was imprecise, so we rewrote the passage in Chapter 4.2:

 

However, due to the continuity of the trajectory, the sampled data generated are highly correlated. Thus if the on-policy training is adopted, the DRL neural network will output an unstable action decision [17], and a divergent accumulative reward.
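The revised passage above attributes the instability to correlated consecutive samples. A common way of weakening that correlation in off-policy DRL is to store transitions in a replay buffer and train on random minibatches. The sketch below illustrates that general idea only; this record does not say which mechanism the authors use, and the class name, method names, and transition layout are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random.

    Hypothetical helper for illustration; not taken from the manuscript.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently evicts the oldest transition when full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self._buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement mixes transitions from
        # different points of the trajectory into one minibatch.
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)


# Usage: fill the buffer along one trajectory, then draw a shuffled minibatch.
buffer = ReplayBuffer(capacity=1000, seed=0)
for t in range(100):
    buffer.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
minibatch = buffer.sample(8)
```

Because the minibatch is drawn from the whole buffer rather than from consecutive time steps, neighbouring samples along a continuous trajectory rarely end up in the same update.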

 

Point 6: Page 11: the abbreviation for the cross-entropy method (CEW) is the same as that for cognitive electronic warfare (CEW), which is confusing. Use a different abbreviation.

 

Response 6: We updated the manuscript by changing the abbreviation of the cross-entropy method to CEM.


Reviewer 2 Report

In the summary add the research goal of the article
Improve document formatting line 242 For 1) and line 415
In the conclusions, add a few more tasks for further research on the subject matter covered (more details)

Author Response

Point 1: In the summary add the research goal of the article


 

Response 1: We updated the manuscript by adding the research goal of the article to the summary:

Thus, to solve this problem, we developed an autonomous reasoning search method that can generate efficient decision-making actions and guide the UCAV as early as possible to the target area.

 

Point 2: Improve document formatting line 242 For 1) and line 415

 

Response 2: We updated the manuscript by proofreading the formatting of each chapter.

 

Point 3: In the conclusions, add a few more tasks for further research on the subject matter covered (more details)

 

Response 3: We updated the manuscript by rewriting the Conclusion section:

 

Our work not only pioneered verifying the ability of popular DRL algorithms to perform target searching tasks in 3D continuous action space, but also achieved breakthrough results in optimizing the end-to-end action policies. However, in the future, there are still many aspects worthy of in-depth research, among which we are most concerned about solving navigation problems in the environments of multiagent cooperation and competition, such as real-time obstacle avoidance and moving target tracking. In addition, when introducing weapon models such as missiles and jammers into the vCEW framework, a visual physics engine software is needed to better evaluate the interaction between these models and various game environments. At the same time, with the increasing complexity of engineering models, the development of high-performance DRL algorithms is imminent.

