MDPI - Publisher of Open Access Journals

18 pages, 17230 KB

Open Access Article

SAREnv: An Open-Source Dataset and Benchmark Tool for Informed Wilderness Search and Rescue Using UAVs

by Kasper Andreas Rømer Grøntved, Alejandro Jarabo-Peñas, Sid Reid, Edouard George Alain Rolland, Matthew Watson, Arthur Richards, Steve Bullock and Anders Lyhne Christensen

Drones 2025, 9(9), 628; https://doi.org/10.3390/drones9090628 - 5 Sep 2025

Viewed by 827

Abstract

Unmanned aerial vehicles (UAVs) play an increasingly vital role in wilderness search and rescue (SAR) operations by enhancing situational awareness and extending the capabilities of human teams. Yet, a lack of standardized benchmarks has impeded the systematic evaluation of single- and multi-agent path-planning algorithms. This paper introduces an open-source dataset and evaluation framework to address this gap. The framework comprises 60 geospatial scenarios across four distinct European environments, featuring high-resolution probability maps. We present a lost person probabilistic model derived from statistical models of lost person behavior. We provide a suite of tools for evaluating search paths against four baseline methods: Concentric Circles, Pizza Zigzag, Greedy, and Random Exploration, using three quantitative metrics: Accumulated probability of detection, time-discounted probability of detection, and lost person discovery score. We provide an evaluation framework to facilitate the comparative analysis of single- and multi-agent path-planning algorithms, supporting both the baseline methods presented and custom user-defined path generators. By providing a structured and extensible framework, this work establishes a foundation for the rigorous and reproducible assessment of UAV search strategies in complex wilderness environments. Full article
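As a rough illustration of the first two metrics named in this abstract, the sketch below scores a path over a gridded probability map. It is a simplified stand-in, not the SAREnv implementation: the function names, the one-cell sensor footprint, perfect detection inside a visited cell, and the per-step discount factor `gamma` are all assumptions made here for illustration.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def accumulated_detection(prob_map: Dict[Cell, float], path: List[Cell]) -> float:
    """Sum the probability mass of every distinct cell the path visits."""
    seen = set()
    total = 0.0
    for cell in path:
        if cell not in seen:          # revisiting a cell earns nothing new
            seen.add(cell)
            total += prob_map.get(cell, 0.0)
    return total


def time_discounted_detection(
    prob_map: Dict[Cell, float], path: List[Cell], gamma: float = 0.9
) -> float:
    """Like accumulated_detection, but mass found at step t is weighted by gamma**t."""
    seen = set()
    total = 0.0
    for t, cell in enumerate(path):
        if cell not in seen:
            seen.add(cell)
            total += (gamma ** t) * prob_map.get(cell, 0.0)
    return total


# Hypothetical 3-cell map and a path that revisits its start cell.
demo_map = {(0, 0): 0.1, (0, 1): 0.4, (1, 1): 0.5}
demo_path = [(0, 0), (0, 1), (0, 0), (1, 1)]
```

Under these assumptions, a path that reaches high-probability cells early scores higher on the discounted metric than one that covers the same cells late, while both score the same on the accumulated metric.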

(This article belongs to the Special Issue Innovative Applications of UAVs in Search and Rescue: Improving Safety and Effectiveness)

26 pages, 14110 KB

Open Access Article

Gemini: A Cascaded Dual-Agent DRL Framework for Task Chain Planning in UAV-UGV Collaborative Disaster Rescue

by Mengxuan Wen, Yunxiao Guo, Changhao Qiu, Bangbang Ren, Mengmeng Zhang and Xueshan Luo

Drones 2025, 9(7), 492; https://doi.org/10.3390/drones9070492 - 11 Jul 2025

Viewed by 813

Abstract

In recent years, UAV (unmanned aerial vehicle)-UGV (unmanned ground vehicle) collaborative systems have played a crucial role in emergency disaster rescue. To improve rescue efficiency, heterogeneous network and task chain methods are introduced to cooperatively develop rescue sequences within a short time for collaborative systems. However, current methods also overlook resource overload for heterogeneous units and limit planning to a single task chain in cross-platform rescue scenarios, resulting in low robustness and limited flexibility. To this end, this paper proposes Gemini, a cascaded dual-agent deep reinforcement learning (DRL) framework based on the Heterogeneous Service Network (HSN) for multiple task chains planning in UAV-UGV collaboration. Specifically, this framework comprises a chain selection agent and a resource allocation agent: The chain selection agent plans paths for task chains, and the resource allocation agent distributes platform loads along generated paths. For each mission, a well-trained Gemini can not only allocate resources in load balancing but also plan multiple task chains simultaneously, which enhances the robustness in cross-platform rescue. Simulation results show that Gemini can increase rescue effectiveness by approximately 60% and improve load balancing by approximately 80%, compared to the baseline algorithm. Additionally, Gemini’s performance is stable and better than the baseline in various disaster scenarios, which verifies its generalization. Full article

(This article belongs to the Special Issue Distributed Control, Optimization, and Game of UAV Swarm Systems (2nd Edition))

31 pages, 555 KB

Open Access Review

Advances in Zeroing Neural Networks: Bio-Inspired Structures, Performance Enhancements, and Applications

by Yufei Wang, Cheng Hua and Ameer Hamza Khan

Biomimetics 2025, 10(5), 279; https://doi.org/10.3390/biomimetics10050279 - 29 Apr 2025

Viewed by 739

Abstract

Zeroing neural networks (ZNN), as a specialized class of bio-inspired neural networks, emulate the adaptive mechanisms of biological systems, allowing for continuous adjustments in response to external variations. Compared to traditional numerical methods and common neural networks (such as gradient-based and recurrent neural networks), this adaptive capability enables the ZNN to rapidly and accurately solve time-varying problems. By leveraging dynamic zeroing error functions, the ZNN exhibits distinct advantages in addressing complex time-varying challenges, including matrix inversion, nonlinear equation solving, and quadratic optimization. This paper provides a comprehensive review of the evolution of ZNN model formulations, with a particular focus on single-integral and double-integral structures. Additionally, we systematically examine existing nonlinear activation functions, which play a crucial role in determining the convergence speed and noise robustness of ZNN models. Finally, we explore the diverse applications of ZNN models across various domains, including robot path planning, motion control, multi-agent coordination, and chaotic system regulation. Full article
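The zeroing-error idea can be sketched on a scalar time-varying equation a(t)·x(t) = b(t). The example below assumes the basic ZNN design rule de/dt = -gamma·e with a linear activation and forward-Euler integration; the coefficient functions, the gain `gamma`, and the name `znn_solve` are illustrative choices, not taken from the review.

```python
import math


def znn_solve(gamma: float = 10.0, dt: float = 1e-3, t_end: float = 5.0) -> float:
    """Track x(t) solving a(t) * x(t) = b(t); return the final residual |e|."""
    a = lambda t: 2.0 + math.sin(t)   # time-varying coefficient, never zero
    b = lambda t: math.cos(t)         # time-varying right-hand side
    da = lambda t: math.cos(t)        # derivative of a(t)
    db = lambda t: -math.sin(t)       # derivative of b(t)

    x, t = 0.0, 0.0                   # deliberately wrong initial state
    for _ in range(int(round(t_end / dt))):
        e = a(t) * x - b(t)           # zeroing error function
        # Impose de/dt = -gamma * e and solve for dx/dt:
        #   a * dx/dt + da * x - db = -gamma * e
        dx = (-gamma * e - da(t) * x + db(t)) / a(t)
        x += dt * dx                  # forward-Euler step
        t += dt
    return abs(a(t) * x - b(t))


residual = znn_solve()
```

Because the design rule uses the time derivatives of a(t) and b(t), the state follows the moving solution rather than chasing it with a lag; swapping the linear term `gamma * e` for a nonlinear activation is where the activation functions discussed in the abstract would enter.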

(This article belongs to the Special Issue Bio-Inspired Data-Driven Methods and Their Applications in Engineering Control, Optimization and AI)

22 pages, 6496 KB

Open Access Article

A Fully Controllable UAV Using Curriculum Learning and Goal-Conditioned Reinforcement Learning: From Straight Forward to Round Trip Missions

by Hyeonmin Kim, Jongkwan Choi, Hyungrok Do and Gyeong Taek Lee

Drones 2025, 9(1), 26; https://doi.org/10.3390/drones9010026 - 31 Dec 2024

Cited by 1 | Viewed by 1406

Abstract

The focus of unmanned aerial vehicle (UAV) path planning includes challenging tasks such as obstacle avoidance and efficient target reaching in complex environments. Building upon these fundamental challenges, an additional need exists for agents that can handle diverse missions like round-trip navigation without requiring retraining for each specific task. In our study, we present a path planning method using reinforcement learning (RL) for a fully controllable UAV agent. We combine goal-conditioned RL and curriculum learning to enable agents to progressively master increasingly complex missions, from single-target reaching to round-trip navigation. Our experimental results demonstrate that the trained agent successfully completed 95% of simple target-reaching tasks and 70% of complex round-trip missions. The agent maintained stable performance even with multiple subgoals, achieving over 75% success rate in three-subgoal missions, indicating strong potential for practical applications in UAV path planning. Full article

19 pages, 5550 KB

Open Access Article

Trajectory Planning for Unmanned Vehicles on Airport Apron Under Aircraft–Vehicle–Airfield Collaboration

by Dezhou Yuan, Yingxue Zhong, Xinping Zhu, Ying Chen, Yue Jin, Xinze Du, Ke Tang and Tianyu Huang

Sensors 2025, 25(1), 71; https://doi.org/10.3390/s25010071 - 26 Dec 2024

Cited by 2 | Viewed by 1302

Abstract

To address the issue of safe, orderly, and efficient operation for unmanned vehicles within the apron area in the future, a hardware framework of aircraft–vehicle–airfield collaboration and a trajectory planning method for unmanned vehicles on the apron were proposed. As for the vehicle–airfield perspective, a collaboration mechanism between flight support tasks and unmanned vehicle departure movement was constructed. As for the latter, a control mechanism was established for the right-of-way control of the apron. With the goal of reducing waiting time downstream of the pre-selected path, a multi-agent reinforcement learning model with a collaborative graph was created to accomplish path selection among various origin–destination pairs. Then, we took Apron No. 2 at Ezhou Huahu Airport as an example for simulation verification. The results show that, compared with traditional methods, the proposed method improves the average vehicle speed by 11.60% and reduces the average vehicle queue time by 32.34%. The right-of-way signal-switching actions are associated with the path selection behavior of the corresponding agent, fitting the created aircraft–vehicle collaboration. After 10 episodes of training, the Q-values converge steadily, with the deviation rate decreasing from 40% to below 0.22%, striking a balance between sociality and competitiveness. A single trajectory can be planned in just 0.78 s, and for each second of training, 7.54 s of future movement of vehicles can be planned in the simulation world. Future research could focus on online rolling trajectory planning for UGSVs in the apron area, and realistic verification under multi-sensor networks can further advance the application of unmanned vehicles in apron operations. Full article

(This article belongs to the Special Issue Computer Vision Recognition and Communication Sensing System)

24 pages, 15090 KB

Open Access Article

Multi-Agent Collaborative Path Planning Algorithm with Multiple Meeting Points

by Jianlin Mao, Zhigang He, Dayan Li, Ruiqi Li, Shufan Zhang and Niya Wang

Electronics 2024, 13(16), 3347; https://doi.org/10.3390/electronics13163347 - 22 Aug 2024

Cited by 1 | Viewed by 3581

Abstract

Traditional multi-agent path planning algorithms often lead to path overlap and excessive energy consumption when dealing with cooperative tasks due to the single-agent-single-task configuration. For this reason, the “many-to-one” cooperative planning method has been proposed, which, although improved, still faces challenges in the vast search space for meeting points and unreasonable task handover locations. This paper proposes the Cooperative Dynamic Priority Safe Interval Path Planning with a multi-meeting-point and single-meeting-point solving mode switching (Co-$\mathrm{DPSIPP}_{ms}$) algorithm to achieve multi-agent path planning with task handovers at multiple or single meeting points. First, the initial priority is set based on the positional relationships among agents within the cooperative group, and the improved Fermat point method is used to locate multiple meeting points quickly. Second, considering that agents must pick up sub-tasks or conduct task handovers midway, a segmented path planning strategy is proposed to ensure that cooperative agents can efficiently and accurately complete task handovers. Finally, an automatic switching strategy between multi-meeting-point and single-meeting-point solving modes is designed to ensure the algorithm’s success rate. Tests show that Co-$\mathrm{DPSIPP}_{ms}$ outperforms existing algorithms in 1-to-1 and m-to-1 cooperative tasks, demonstrating its efficiency and practicality. Full article

(This article belongs to the Special Issue Path Planning for Mobile Robots, 2nd Edition)

21 pages, 8343 KB

Open Access Article

A Multi-Area Task Path-Planning Algorithm for Agricultural Drones Based on Improved Double Deep Q-Learning Net

by Jian Li, Weijian Zhang, Junfeng Ren, Weilin Yu, Guowei Wang, Peng Ding, Jiawei Wang and Xuen Zhang

Agriculture 2024, 14(8), 1294; https://doi.org/10.3390/agriculture14081294 - 5 Aug 2024

Cited by 14 | Viewed by 3163

Abstract

With global population growth and increasing food demand, the development of precision agriculture has become particularly critical. In precision agriculture, accurately identifying areas of nitrogen stress in crops and planning precise fertilization paths are crucial. However, traditional coverage path-planning (CPP) typically considers only single-area tasks and overlooks CPP for multi-area tasks. To address this problem, this study proposed a Regional Framework for Coverage Path-Planning for Precision Fertilization (RFCPPF) for crop protection UAVs in multi-area tasks. This framework includes three modules: nitrogen stress spatial distribution extraction, multi-area task environmental map construction, and coverage path-planning. Firstly, Sentinel-2 remote-sensing images are processed using the Google Earth Engine (GEE) platform, and the Green Normalized Difference Vegetation Index (GNDVI) is calculated to extract the spatial distribution of nitrogen stress. A multi-area task environmental map is constructed to guide multiple UAV agents. Subsequently, improvements based on the Double Deep Q Network (DDQN) are introduced, incorporating Long Short-Term Memory (LSTM) and dueling network structures. Additionally, a multi-objective reward function and a state and action selection strategy suitable for stress area plant protection operations are designed. Simulation experiments verify the superiority of the proposed method in reducing redundant paths and improving coverage efficiency. The proposed improved DDQN achieved an overall step count that is 60.71% of MLP-DDQN and 90.55% of Breadth-First Search–Boustrophedon Algorithm (BFS-BA). Additionally, the total repeated coverage rate was reduced by 7.06% compared to MLP-DDQN and by 8.82% compared to BFS-BA. Full article
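The GNDVI step mentioned in this abstract follows the standard definition GNDVI = (NIR - Green) / (NIR + Green). The sketch below applies it per pixel to plain Python lists; for Sentinel-2 the near-infrared and green reflectances would typically come from bands B8 and B3. The `stress_mask` helper, its threshold of 0.4, and the reading of low GNDVI as stress are hypothetical illustrations and not values from the paper.

```python
from typing import List


def gndvi(nir: float, green: float) -> float:
    """Green Normalized Difference Vegetation Index for one pixel."""
    denom = nir + green
    if denom == 0:
        return 0.0  # assumption: treat empty pixels as neutral
    return (nir - green) / denom


def stress_mask(
    nir_band: List[List[float]],
    green_band: List[List[float]],
    threshold: float = 0.4,  # hypothetical cut-off, chosen only for the demo
) -> List[List[bool]]:
    """Flag pixels whose GNDVI falls below the given threshold."""
    return [
        [gndvi(n, g) < threshold for n, g in zip(nir_row, green_row)]
        for nir_row, green_row in zip(nir_band, green_band)
    ]


# One-row toy raster: the first pixel has low GNDVI, the second high.
mask = stress_mask([[0.5, 0.9]], [[0.3, 0.1]])
```

A mask of this kind is one plausible way to turn the index into the target cells of a multi-area environmental map, which a coverage planner can then consume.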

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

46 pages, 4055 KB

Open Access Review

Path Planning Technique for Mobile Robots: A Review

by Liwei Yang, Ping Li, Song Qian, He Quan, Jinchao Miao, Mengqi Liu, Yanpei Hu and Erexidin Memetimin

Machines 2023, 11(10), 980; https://doi.org/10.3390/machines11100980 - 23 Oct 2023

Cited by 54 | Viewed by 20688

Abstract

Mobile robot path planning involves designing optimal routes from starting points to destinations within specific environmental conditions. Even though there are well-established autonomous navigation solutions, it is worth noting that comprehensive, systematically differentiated examinations of the critical technologies underpinning both single-robot and multi-robot path planning are notably scarce. These technologies encompass aspects such as environmental modeling, criteria for evaluating path quality, the techniques employed in path planning and so on. This paper presents a thorough exploration of techniques within the realm of mobile robot path planning. Initially, we provide an overview of eight diverse methods for mapping, each mirroring the varying levels of abstraction that robots employ to interpret their surroundings. Furthermore, we furnish open-source map datasets suited for both Single-Agent Path Planning (SAPF) and Multi-Agent Path Planning (MAPF) scenarios, accompanied by an analysis of prevalent evaluation metrics for path planning. Subsequently, focusing on the distinctive features of SAPF algorithms, we categorize them into three classes: classical algorithms, intelligent optimization algorithms, and artificial intelligence algorithms. Within the classical algorithms category, we introduce graph search algorithms, random sampling algorithms, and potential field algorithms. In the intelligent optimization algorithms domain, we introduce ant colony optimization, particle swarm optimization, and genetic algorithms. Within the domain of artificial intelligence algorithms, we discuss neural network algorithms and fuzzy logic algorithms. Following this, we delve into the different approaches to MAPF planning, examining centralized planning which emphasizes decoupling conflicts, and distributed planning which prioritizes task execution. 
Based on these categorizations, we comprehensively compare the characteristics and applicability of both SAPF and MAPF algorithms, while highlighting the challenges that this field is currently grappling with. Full article

(This article belongs to the Section Automation and Control Systems)

28 pages, 6177 KB

Open Access Article

Optimized-Weighted-Speedy Q-Learning Algorithm for Multi-UGV in Static Environment Path Planning under Anti-Collision Cooperation Mechanism

by Yuanying Cao and Xi Fang

Mathematics 2023, 11(11), 2476; https://doi.org/10.3390/math11112476 - 27 May 2023

Cited by 8 | Viewed by 2674

Abstract

With the accelerated development of smart cities, the concept of a “smart industrial park” in which unmanned ground vehicles (UGVs) have wide application has entered the industrial field of vision. When faced with multiple and heterogeneous tasks, a single UGV executes tasks inefficiently, so research on task planning under multi-UGV cooperation has become more urgent. In this paper, under the anti-collision cooperation mechanism for multi-UGV path planning, an improved algorithm with optimized-weighted-speedy Q-learning (OWS Q-learning) is proposed. The slow convergence speed of the Q-learning algorithm is overcome to a certain extent by changing the update mode of the Q function. By improving the selection mode of the learning rate and the action selection strategy, the relationship between exploration and exploitation is balanced, and the learning efficiency of multiple agents in complex environments is improved. The simulation experiments in a static environment show that the designed anti-collision coordination mechanism effectively solves the coordination problem of multiple UGVs in the same scenario. In the same experimental scenario, compared with the Q-learning algorithm and other reinforcement learning algorithms, only the OWS Q-learning algorithm achieves the convergence effect, and the OWS Q-learning algorithm has the shortest collision-free path for UGVs and the least time to complete the planning. Compared with the Q-learning algorithm, the calculation time of the OWS Q-learning algorithm in the three experimental scenarios is improved by 53.93%, 67.21%, and 53.53%, respectively. This effectively advances the intelligent development of UGVs in smart parks. Full article
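For reference, the baseline that this abstract compares against is the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)]. The sketch below shows only that baseline rule; it does not implement the OWS variant, whose modified update, learning-rate selection, and action strategy are not specified in the abstract. The grid states, action names, and parameter values are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One standard Q-learning step: Q <- Q + alpha * (target - Q)."""
    best_next = max(q[(next_state, a)] for a in actions)  # greedy bootstrap value
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)           # unseen (state, action) pairs start at 0
moves = ["up", "down", "left", "right"]

# Two identical transitions from cell (0, 0) to cell (0, 1) with reward 1.
first = q_update(q_table, (0, 0), "right", 1.0, (0, 1), moves)
second = q_update(q_table, (0, 0), "right", 1.0, (0, 1), moves)
```

Each repetition moves the estimate only a fraction α of the way toward the target, which is the slow convergence that modified update rules such as the one proposed in the paper aim to mitigate.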

(This article belongs to the Special Issue Advances in Machine Learning and Applications)

16 pages, 9414 KB

Open Access Article

AutoDRIVE: A Comprehensive, Flexible and Integrated Digital Twin Ecosystem for Autonomous Driving Research & Education

by Tanmay Samak, Chinmay Samak, Sivanathan Kandhasamy, Venkat Krovi and Ming Xie

Robotics 2023, 12(3), 77; https://doi.org/10.3390/robotics12030077 - 26 May 2023

Cited by 24 | Viewed by 8095

Abstract

Prototyping and validating hardware–software components, sub-systems and systems within the intelligent transportation system-of-systems framework requires a modular yet flexible and open-access ecosystem. This work presents our attempt to develop such a comprehensive research and education ecosystem, called AutoDRIVE, for synergistically prototyping, simulating and deploying cyber-physical solutions pertaining to autonomous driving as well as smart city management. AutoDRIVE features both software as well as hardware-in-the-loop testing interfaces with openly accessible scaled vehicle and infrastructure components. The ecosystem is compatible with a variety of development frameworks, and supports both single- and multi-agent paradigms through local as well as distributed computing. Most critically, AutoDRIVE is intended to be modularly expandable to explore emergent technologies, and this work highlights various complementary features and capabilities of the proposed ecosystem by demonstrating four such deployment use-cases: (i) autonomous parking using probabilistic robotics approach for mapping, localization, path-planning and control; (ii) behavioral cloning using computer vision and deep imitation learning; (iii) intersection traversal using vehicle-to-vehicle communication and deep reinforcement learning; and (iv) smart city management using vehicle-to-infrastructure communication and internet-of-things. Full article

(This article belongs to the Special Issue Mechatronics Systems and Robots)

21 pages, 6193 KB

Open Access Article

Improved Optimization Strategy Based on Region Division for Collaborative Multi-Agent Coverage Path Planning

by Yijie Qin, Lei Fu, Dingxin He and Zhiwei Liu

Sensors 2023, 23(7), 3596; https://doi.org/10.3390/s23073596 - 30 Mar 2023

Cited by 11 | Viewed by 4033

Abstract

In this paper, we investigate algorithms for traversal exploration and path coverage of target regions using multiple agents, enabling the efficient deployment of a set of agents to cover a complex region. First, the original multi-agent coverage path planning problem (mCPP) is transformed into several single-agent sub-problems by dividing the target region into multiple balanced sub-regions, which reduces the explosive combinatorial complexity; subsequently, closed-loop paths are planned in each sub-region by the rapidly exploring random trees (RRT) algorithm to ensure continuous exploration and repeated visits to each node of the target region. On this basis, we also propose two improvements: for the corner case of narrow regions, the use of geodesic distance is proposed to replace the Euclidean distance in Voronoi partitioning, and the iterations for balanced partitioning can be reduced by more than one order of magnitude; the Dijkstra algorithm is introduced to assign a smaller weight to the path cost when the geodesic direction changes, which makes the region division more “cohesive”, thus greatly reducing the number of turns in the path and making it more robust. The final optimization algorithm ensures the following characteristics: complete coverage of the target area, wide applicability of multiple area shapes, reasonable distribution of exploration tasks, minimum average waiting time, and sustainable exploration without any preparation phase. Full article
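The difference between straight-line and geodesic distance in region division can be illustrated on a small occupancy grid. The sketch below assigns each free cell to the agent that can reach it in the fewest 4-connected steps, using a multi-source breadth-first search. It is a simplified illustration only: it omits the balancing iterations and the turn-aware Dijkstra weighting described in the abstract, and the grid, seed positions, and function name are assumptions.

```python
from collections import deque


def geodesic_partition(grid, seeds):
    """Label each free cell with the index of the seed nearest by path length.

    grid: list of equal-length strings; '#' marks an obstacle, anything else is free.
    seeds: list of (row, col) start cells, one per agent.
    """
    rows, cols = len(grid), len(grid[0])
    owner = {seed: i for i, seed in enumerate(seeds)}
    queue = deque(seeds)
    while queue:                       # multi-source breadth-first search
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in owner:
                owner[(nr, nc)] = owner[(r, c)]
                queue.append((nr, nc))
    return owner


# A wall separates agent 0 (top-left) from the cells just to its right.
demo_grid = [
    ".#....",
    ".#....",
    "......",
]
cells = geodesic_partition(demo_grid, [(0, 0), (0, 5)])
```

In this toy map the cell (0, 2) is closer to agent 0 in a straight line but is assigned to agent 1, because agent 0 would have to travel around the wall to reach it. That is the narrow-region corner case in which a straight-line Voronoi split gives an agent cells it cannot reach efficiently.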

(This article belongs to the Special Issue Intelligent Sensing, Control and Optimization of Networks)

26 pages, 7722 KB

Open Access Article

Inverse Kinematic Solver Based on Bat Algorithm for Robotic Arm Path Planning

by Mohamed Slim, Nizar Rokbani, Bilel Neji, Mohamed Ali Terres and Taha Beyrouthy

Robotics 2023, 12(2), 38; https://doi.org/10.3390/robotics12020038 - 9 Mar 2023

Cited by 14 | Viewed by 4487

Abstract

The bat algorithm (BA) is a nature-inspired algorithm that mimics the bio-sensing characteristic of bats known as echolocation. This paper suggests a bat-based meta-heuristic for the inverse kinematics problem of a robotic arm. An intrinsically modified BA is proposed to find an inverse kinematics (IK) solution, respecting a minimum variation of the joints’ elongation from the initial configuration of the robot manipulator to the proposed new pause position. The proposed method, called IK-BA, is a specific bat algorithm dedicated to the inverse geometric solution of robotic arms, in which the elongation control mechanism is embedded in the update equations of the bat agents. Performance analysis and comparisons with related state-of-the-art meta-heuristic solvers showed the effectiveness of the proposed IK bat solver for single-point IK planning as well as for geometric path planning, which may have several industrial applications. IK-BA was also applied to a real robotic arm with a spherical wrist as a proof of concept and of the pertinence of the proposed approach. Full article

18 pages, 5335 KB

Open Access Article

Optimization of a Regional Marine Environment Mobile Observation Network Based on Deep Reinforcement Learning

by Yuxin Zhao, Yanlong Liu and Xiong Deng

J. Mar. Sci. Eng. 2023, 11(1), 208; https://doi.org/10.3390/jmse11010208 - 12 Jan 2023

Cited by 1 | Viewed by 2170

Abstract

The observation path planning of an ocean mobile observation network is an important part of the ocean mobile observation system. Solving the observation path of the mobile observation network with a traditional algorithm requires constructing a complex objective function; to avoid this, an improved deep reinforcement learning algorithm is proposed. The improved deep reinforcement learning algorithm does not need to establish the objective function. The agent samples the marine environment information by exploring and receiving feedback from the environment. Focusing on the real-time dynamic variability of the marine environment, our experiment shows that adding bidirectional recurrency to the Deep Q-network allows the Q-network to better estimate the underlying system state. Compared with the results of existing algorithms, the improved deep reinforcement learning algorithm can effectively improve the sampling efficiency of the observation platform. To improve the prediction accuracy of the marine environment numerical prediction system, we conduct sampling path experiments with one, two, and five platforms. The experimental results show that increasing the number of observation platforms can effectively improve the prediction accuracy of the numerical prediction system, but beyond two platforms, adding more does not improve the prediction accuracy, which instead declines to a certain degree. In addition, in the multi-platform experiment, the improved deep reinforcement learning algorithm is compared with the unimproved algorithm, and the results show that the proposed algorithm is better than the existing algorithm. Full article

(This article belongs to the Special Issue Earth System Modeling, Data Assimilation, Artificial Intelligence, Deep Learning and Ocean Information Engineering)

18 pages, 5193 KB

Open Access Article

Cooperative Following of Multiple Autonomous Robots Based on Consensus Estimation

by Guojie Kong, Jie Cai, Jianwei Gong, Zheming Tian, Lu Huang and Yuan Yang

Electronics 2022, 11(20), 3319; https://doi.org/10.3390/electronics11203319 - 14 Oct 2022

Cited by 2 | Viewed by 1863

Abstract

When performing a specific task, a Multi-Agent System (MAS) not only needs to coordinate the whole formation but also needs to coordinate the dynamic relationship among all the agents, which means judging and adjusting their positions in the formation according to their location, velocity, surrounding obstacles and other information to accomplish specific tasks. This paper devises an integral separation feedback method for a single-agent control with a developed robot motion model; then, an enhanced strategy incorporating the dynamic information of the leader robot is proposed for further improvement. On this basis, a method of combining second-order formation control with path planning is proposed for multiple-agents following control, which uses the system dynamic of one agent and the Laplacian matrix to generate the consensus protocol. Due to a second-order consensus, the agents exchange information according to a pre-specified communication digraph and keep in a certain following formation. Moreover, an improved path planning method using an artificial potential field is developed to guide the MAS to reach the destination and avoid collisions. The effectiveness of the proposed approach is verified with simulation results in different scenarios. Full article
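A consensus protocol built from a Laplacian matrix can be sketched with the textbook second-order form u_i = −Σ_j L_ij (x_j + γ v_j) for double-integrator agents. The example below is a generic illustration, not the paper's controller: it uses an undirected three-agent chain rather than the pre-specified digraph, one-dimensional positions, no leader or formation offsets, and forward-Euler integration; the gain `gamma`, the step size, and the function name are assumptions.

```python
def simulate_consensus(positions, laplacian, gamma=1.0, dt=0.01, steps=4000):
    """Integrate double-integrator agents under a Laplacian consensus law.

    Acceleration of agent i: u_i = -sum_j L[i][j] * (x_j + gamma * v_j).
    """
    n = len(positions)
    x = list(positions)
    v = [0.0] * n                      # all agents start at rest
    for _ in range(steps):
        u = [
            -sum(laplacian[i][j] * (x[j] + gamma * v[j]) for j in range(n))
            for i in range(n)
        ]
        x = [x[i] + dt * v[i] for i in range(n)]   # forward-Euler position step
        v = [v[i] + dt * u[i] for i in range(n)]   # forward-Euler velocity step
    return x


# Path graph 0 - 1 - 2: L = D - A for an undirected chain of three agents.
chain_laplacian = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
final_positions = simulate_consensus([0.0, 3.0, 9.0], chain_laplacian)
```

With a symmetric Laplacian and zero initial velocities, the agents settle at the average of their starting positions. Adding fixed offsets to each x_j in the control law is the usual way to turn agreement on a point into agreement on a formation.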

(This article belongs to the Special Issue Recent Advances in Unmanned System Navigation and Control)

14 pages, 6272 KB

Open Access Article

Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments

by Dugan Um, Prasad Nethala and Hocheol Shin

AI 2022, 3(3), 645-658; https://doi.org/10.3390/ai3030037 - 3 Aug 2022

Cited by 4 | Viewed by 3706

Abstract

In this paper, a hierarchical reinforcement learning (HRL) architecture, namely a “Hierarchical Deep Deterministic Policy Gradient (HDDPG)”, is proposed and studied. An HDDPG utilizes a manager and worker formation similar to other HRL structures. However, unlike others, the HDDPG enables sharing an identical environment and state among workers and managers, while a unique reward system is required for each Deep Deterministic Policy Gradient (DDPG) agent. Therefore, the HDDPG allows easy structural expansion with probabilistic action selection of a worker by the manager. Due to its innate structural advantage, the HDDPG has merit in building a general AI to deal with complex time-horizon tasks with various conflicting sub-goals. The experimental results demonstrated its usefulness with a manipulator motion planning problem in a dynamic environment, where path planning and collision avoidance conflict with each other. The proposed HDDPG is compared with an HAM and a single DDPG for performance evaluation. The result shows that the HDDPG demonstrated a reward gain of more than 40% and more than two times the reward improvement rate. Another important feature of the proposed HDDPG is the biased manager training capability. By adding a preference factor to each worker, the manager can be trained to prefer a certain worker to achieve a better success rate for a specific objective if needed. Full article

Search Results (18)
