Open Access Article
A Method for Multi-AUV Cooperative Area Search in Unknown Environment Based on Reinforcement Learning
by Yueming Li *, Mingquan Ma, Jian Cao, Guobin Luo, Depeng Wang and Weiqiang Chen
National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin Engineering University, Harbin 150001, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(7), 1194; https://doi.org/10.3390/jmse12071194 (registering DOI)
Submission received: 2 April 2024 / Revised: 12 July 2024 / Accepted: 12 July 2024 / Published: 16 July 2024
Abstract
As an emerging direction of multi-agent collaborative control technology, multiple autonomous underwater vehicle (multi-AUV) cooperative area search has played an important role in civilian fields such as marine resource exploration and development, marine rescue, and marine scientific expeditions, as well as in military fields such as mine countermeasures and underwater reconnaissance. As ocean exploration continues, the environment in which AUVs perform search tasks is mostly unknown and contains many uncertainties such as obstacles, which places high demands on the autonomous decision-making capabilities of AUVs. Moreover, because a single AUV has limited detection capability underwater while the area to be searched keeps expanding, it cannot obtain global state information in real time and can only make behavioral decisions based on local observations, which adversely affects both the coordination between AUVs and the search efficiency of the multi-AUV system. Therefore, to meet increasingly challenging search tasks, we adopt multi-agent reinforcement learning (MARL) to study multi-AUV cooperative area search from the perspective of improving autonomous decision-making and collaboration between AUVs. First, we model the search task as a decentralized partially observable Markov decision process (Dec-POMDP) and establish a search information map. Each AUV updates the information map using its own sonar detections and information fused from other AUVs, and makes real-time decisions based on the updated map, which helps to address the insufficient observation information caused by the weak perception ability of AUVs in underwater environments. Second, we establish a multi-AUV cooperative area search system (MACASS), which employs a search strategy based on multi-agent reinforcement learning.
The system combines various AUVs into a unified entity using a distributed control approach. During the execution of search tasks, each AUV can make action decisions based on sonar detection information and information exchange among AUVs in the system, utilizing the MARL-based search strategy. As a result, AUVs possess enhanced autonomy in decision-making, enabling them to better handle challenges such as limited detection capabilities and insufficient observational information.
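To make the idea of a shared search information map more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a rectangular grid, three hypothetical cell labels (`UNKNOWN`, `FREE`, `OBSTACLE`), a square sonar footprint, and a simple fusion rule in which a cell observed by either AUV becomes known in the merged map. The function names `new_map`, `sonar_update`, `fuse`, and `coverage` are illustrative and do not come from the paper.

```python
from typing import List, Set, Tuple

# Hypothetical cell labels; the paper's actual map encoding is not given in the abstract.
UNKNOWN, FREE, OBSTACLE = 0, 1, 2

Grid = List[List[int]]
Cell = Tuple[int, int]


def new_map(rows: int, cols: int) -> Grid:
    """Create an information map in which every cell is still unexplored."""
    return [[UNKNOWN] * cols for _ in range(rows)]


def sonar_update(info_map: Grid, pos: Cell, sonar_range: int, obstacles: Set[Cell]) -> None:
    """Label every cell inside an assumed square sonar footprint around pos."""
    rows, cols = len(info_map), len(info_map[0])
    r0, c0 = pos
    for r in range(max(0, r0 - sonar_range), min(rows, r0 + sonar_range + 1)):
        for c in range(max(0, c0 - sonar_range), min(cols, c0 + sonar_range + 1)):
            info_map[r][c] = OBSTACLE if (r, c) in obstacles else FREE


def fuse(a: Grid, b: Grid) -> Grid:
    """Merge two maps: keep a's label where a has observed the cell, else take b's."""
    return [[ca if ca != UNKNOWN else cb for ca, cb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def coverage(info_map: Grid) -> float:
    """Fraction of cells that are no longer UNKNOWN."""
    cells = [c for row in info_map for c in row]
    return sum(c != UNKNOWN for c in cells) / len(cells)


# Two AUVs observe different parts of a 5 x 5 area, then exchange maps.
obstacles = {(0, 0), (2, 2)}
map1, map2 = new_map(5, 5), new_map(5, 5)
sonar_update(map1, (1, 1), 1, obstacles)
sonar_update(map2, (3, 3), 1, obstacles)
fused = fuse(map1, map2)
print(coverage(map1), coverage(map2), coverage(fused))
```

In this toy setting each AUV alone sees 9 of the 25 cells, while the fused map covers 17, which illustrates why exchanging map information can partly compensate for limited local observation. A learned MARL policy would then take such a map as part of each AUV's observation; that step is outside the scope of this sketch.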