Article

Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO

Aeronautics Engineering College, Air Force Engineering University, Xi’an 710038, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(20), 10230; https://doi.org/10.3390/app122010230
Submission received: 27 September 2022 / Revised: 7 October 2022 / Accepted: 9 October 2022 / Published: 11 October 2022

Abstract

Maneuver decision-making is the core of autonomous air combat, and reinforcement learning is a promising approach to decision-making problems. However, when reinforcement learning is applied to maneuver decision-making for autonomous air combat, it often suffers from poor training efficiency and poor decision-making performance. In this paper, an air combat maneuver decision-making method based on final reward estimation and proximal policy optimization is proposed to address these problems. First, an air combat environment based on aircraft and missile models is constructed, and an intermediate reward and a final reward are designed. Second, final reward estimation is proposed to replace the original advantage estimation function in the surrogate objective of proximal policy optimization, in order to improve the training performance of reinforcement learning. Third, sampling according to the final reward estimation is proposed to improve training efficiency. Finally, the proposed method is used in a self-play framework to train agents for maneuver decision-making. Simulations show that final reward estimation and sampling according to final reward estimation are effective and efficient.
Keywords: autonomous air combat; maneuver decision-making; reinforcement learning; final reward estimation; proximal policy optimization
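
To make the second step of the abstract concrete, the following is a minimal sketch of the standard clipped surrogate objective of proximal policy optimization, the quantity in which the abstract says the advantage estimate is replaced. The function name `clipped_surrogate` and the parameter `weights` are hypothetical and used only for illustration. In standard PPO, `weights` holds advantage estimates; the abstract proposes substituting a final reward estimate, whose exact form is not given in this excerpt, so the sketch leaves it as a caller-supplied input rather than guessing at it.

```python
import math


def clipped_surrogate(new_logp, old_logp, weights, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy.
    weights: per-sample weights. Standard PPO uses advantage estimates here;
    a final reward estimate could be supplied instead (assumption: its
    definition is not part of this sketch).
    """
    total = 0.0
    for lp_new, lp_old, w in zip(new_logp, old_logp, weights):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * w, clipped * w)
    return total / len(weights)
```

When the two policies agree on every sample, the probability ratio is 1 and the objective reduces to the mean of the weights; the clipping only takes effect once the current policy drifts away from the one that collected the data.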

Share and Cite

MDPI and ACS Style

Zhang, H.; Wei, Y.; Zhou, H.; Huang, C. Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO. Appl. Sci. 2022, 12, 10230. https://doi.org/10.3390/app122010230

AMA Style

Zhang H, Wei Y, Zhou H, Huang C. Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO. Applied Sciences. 2022; 12(20):10230. https://doi.org/10.3390/app122010230

Chicago/Turabian Style

Zhang, Hongpeng, Yujie Wei, Huan Zhou, and Changqiang Huang. 2022. "Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO" Applied Sciences 12, no. 20: 10230. https://doi.org/10.3390/app122010230

APA Style

Zhang, H., Wei, Y., Zhou, H., & Huang, C. (2022). Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO. Applied Sciences, 12(20), 10230. https://doi.org/10.3390/app122010230
