Search Results (1,004)

Search Parameters:
Keywords = cooperative agents

30 pages, 22493 KB  
Article
H-CoRE: A Cooperative Framework for Heterogeneous Multi-Robot Exploration and Inspection
by Simone D’Angelo, Francesca Pagano, Riccardo Caccavale, Vincenzo Scognamiglio, Alessandro De Crescenzo, Pasquale Merone, Stefano Ciaravino, Alberto Finzi and Vincenzo Lippiello
Drones 2026, 10(4), 232; https://doi.org/10.3390/drones10040232 (registering DOI) - 25 Mar 2026
Abstract
This paper presents the H-CoRE (Heterogeneous Cooperative Multi-Robot Execution) framework designed to enable autonomous multi-robot operations in GNSS-denied environments. Built on an ROS 2-based architecture, H-CoRE enables collaborative, structured task execution through standardized software stacks. Each robot’s stack combines a high-level executive system with an agent-specific motion layer and leverages multi-sensor fusion for localization and mapping. The framework is inherently reconfigurable, allowing individual agents to operate autonomously or as part of a multi-robot team for collaborative missions. In the considered scenario, the system integrates aerial and ground vehicles, a fixed pan–tilt–zoom camera, and a human supervisory interface within a unified, modular infrastructure. The proposed system has been deployed in indoor, GNSS-denied environments, demonstrating autonomous navigation, cooperative area coverage, and real-time information sharing across multiple agents. Experimental results confirm the effectiveness of H-CoRE in maintaining general awareness and mission continuity, paving the way for future applications in search-and-rescue, inspection, and exploration tasks.

31 pages, 16969 KB  
Article
Research on Cooperative Vehicle–Infrastructure Perception Integrating Enhanced Point-Cloud Features and Spatial Attention
by Shiyang Yan, Yanfeng Wu, Zhennan Liu and Chengwei Xie
World Electr. Veh. J. 2026, 17(4), 164; https://doi.org/10.3390/wevj17040164 - 24 Mar 2026
Abstract
Vehicle–infrastructure cooperative perception (VICP) extends the sensing capability of single-vehicle systems by integrating multi-source information from onboard and roadside sensors, thereby alleviating limitations in sensing range and field-of-view coverage. However, in complex urban environments, the robustness of such systems—particularly in terms of blind-spot coverage and feature representation—is severely affected by both static and dynamic occlusions, as well as distance-induced sparsity in point cloud data. To address these challenges, a 3D object detection framework incorporating point cloud feature enhancement and spatially adaptive fusion is proposed. First, to mitigate feature degradation under sparse and occluded conditions, a Redefined Squeeze-and-Excitation Network (R-SENet) attention module is integrated into the feature encoding stage. This module employs a dual-dimensional squeeze-and-excitation mechanism operating across pillars and intra-pillar points, enabling adaptive recalibration of critical geometric features. In addition, a Feature Pyramid Backbone Network (FPB-Net) is designed to improve target representation across varying distances through multi-scale feature extraction and cross-layer aggregation. Second, to address feature heterogeneity and spatial misalignment between heterogeneous sensing agents, a Spatial Adaptive Feature Fusion (SAFF) module is introduced. By explicitly encoding the origin of features and leveraging spatial attention mechanisms, the SAFF module enables dynamic weighting and complementary fusion between fine-grained vehicle-side features and globally informative roadside semantics. Extensive experiments conducted on the DAIR-V2X benchmark and a custom dataset demonstrate that the proposed approach outperforms several state-of-the-art methods. 
Specifically, Average Precision (AP) scores of 0.762 and 0.694 are achieved at an IoU threshold of 0.5, while AP scores of 0.617 and 0.563 are obtained at an IoU threshold of 0.7 on the two datasets, respectively. Furthermore, the proposed framework maintains real-time inference performance, highlighting its effectiveness and practical potential for real-world deployment.
(This article belongs to the Section Automated and Connected Vehicles)
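The AP figures in the abstract above depend on an IoU (Intersection over Union) threshold: a detection counts as correct only if its overlap with a ground-truth box reaches that threshold. The sketch below illustrates the idea under a simplifying assumption, namely axis-aligned 2D boxes given as `(x_min, y_min, x_max, y_max)`; the article's 3D detection setting very likely uses rotated or volumetric boxes, and the function names here are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A stricter threshold (0.7 instead of 0.5) rejects more loosely localized detections, which is consistent with the lower AP values reported at 0.7.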

19 pages, 7352 KB  
Article
Track-to-Track Fusion for Cooperative Perception Using Collective Perception Messages
by Redge Melroy Castelino, Shrijal Pradhan and Axel Hahn
Sensors 2026, 26(6), 2003; https://doi.org/10.3390/s26062003 - 23 Mar 2026
Viewed by 10
Abstract
Vehicle-to-everything communication grants connected and automated road vehicles the opportunity to share their sensor information, such as detected road objects, for collective awareness. This paper compares various state fusion strategies within a high-level cooperative perception architecture, focusing on the fusion of object-level information provided in standard Collective Perception Messages. This work compares five track-to-track fusion methods, namely Covariance Intersection, Inverse Covariance Intersection, Adapted Extended Kalman Filter, Adapted Unscented Kalman Filter and Information Matrix Fusion, using a simulation framework built with CARLA and Autoware. The methods are analyzed in a case study to assess their performance under different vehicle maneuvers and varying input information accuracy. The case study highlights trade-offs between fusion strategies and illustrates their behavior in asynchronous multi-agent scenarios. While the analysis is conducted in simulation, the architecture is designed to be extensible, and directions for future development are outlined, including the integration of classification and object confidence fusion modules.
(This article belongs to the Special Issue Cooperative Perception and Control for Autonomous Vehicles)
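Of the five fusion methods named above, Covariance Intersection is the simplest to illustrate: it fuses two track estimates whose cross-correlation is unknown by taking a convex combination in information (inverse-covariance) form. The following is a minimal sketch for 2D states, not the article's implementation; the weight `omega` is fixed here as an assumption, whereas it is commonly chosen to minimize the trace or determinant of the fused covariance.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]


def covariance_intersection(x_a, P_a, x_b, P_b, omega=0.5):
    """Fuse two 2D estimates (mean, covariance) with a fixed weight omega in [0, 1].

    P^-1 = omega * P_a^-1 + (1 - omega) * P_b^-1
    x    = P * (omega * P_a^-1 x_a + (1 - omega) * P_b^-1 x_b)
    """
    info_a, info_b = inv2(P_a), inv2(P_b)
    info = [[omega * info_a[i][j] + (1 - omega) * info_b[i][j] for j in range(2)]
            for i in range(2)]
    P = inv2(info)
    y_a, y_b = mat_vec(info_a, x_a), mat_vec(info_b, x_b)
    y = [omega * y_a[i] + (1 - omega) * y_b[i] for i in range(2)]
    return mat_vec(P, y), P
```

For example, fusing two tracks that are each confident along a different axis yields a covariance that is never smaller than what the unknown correlation could justify, which is the property that makes the method attractive for messages received from other vehicles.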

21 pages, 459 KB  
Article
Formation-Constrained Cooperative Localization for UAV Swarms in GNSS-Denied Environments
by Qin Li, Peng Wang, Xiaochun Li, Jieyong Zhang, Ying Luo, Wangsheng Yu and Haiyan Cheng
Sensors 2026, 26(6), 1984; https://doi.org/10.3390/s26061984 - 22 Mar 2026
Viewed by 171
Abstract
Cooperative localization is critical for UAV swarm operations in GNSS-denied environments. The backbone-listener scheme, using a small subset of agents as active backbone nodes and others as passive listeners, offers notable advantages in reducing communication overhead and enhancing swarm scalability. Building on this scheme, we propose a formation-constrained cooperative localization method to improve accuracy by integrating known formation geometry into the localization process. First, backbone node selection uses a formation-constrained greedy node activation (GNA) strategy with weighted distance fusion, combining measured and ideal formation distances to enable near-optimal selection aligned with formation structure. Second, listener node localization incorporates formation constraints into Chan’s algorithm, paired with angle-of-arrival (AOA) refinement, to ensure estimated positions match expected inter-agent distances. Third, global optimization uses a gradient descent-based refinement to enforce formation constraints across all agent positions. Our theoretical derivations and simulations are limited to the two-dimensional (2D) case. Simulation results validate the proposed method’s improved success rate, reliability, and stability. Its effectiveness is demonstrated across various formation types, with robust adaptability to asymmetric geometries shown to be a valuable feature for practical deployment.
(This article belongs to the Section Navigation and Positioning)

35 pages, 6392 KB  
Article
EO-MADDPG: An Improved Reinforcement Learning Approach for Multi-UAV Pursuit–Evasion Games
by Xiao Wang, Mengyu Wang, Xueqian Bai, Zhe Ma, Kewu Sun and Jiake Li
Aerospace 2026, 13(3), 296; https://doi.org/10.3390/aerospace13030296 - 21 Mar 2026
Viewed by 95
Abstract
To advance research in multi-agent reinforcement learning (MARL) for pursuit–evasion scenarios, this paper introduces a novel algorithm called Expert Knowledge and Opponent Modeling Multi-UAV Deep Deterministic Policy Gradient (EO-MADDPG). EO-MADDPG consists of two key components: the integration of expert knowledge and real-time sampled data and the prediction of evader UAV actions. The expert knowledge includes a multi-UAV formation control algorithm and an encirclement strategy, which incorporates consensus algorithms and Apollonius circle guidance. Additionally, the network-training framework is optimized by integrating information about opponent actions under a fixed policy for improved prediction accuracy. The experiments focus on three vs. one and three vs. two scenarios, where pursuer UAVs utilize EO-MADDPG and evader UAVs follow fixed policies with Gaussian perturbations. Experimental results show that EO-MADDPG achieves success rates of 99.9 ± 0.3% and 97.5 ± 1.4% (mean ± std over five seeds) in three vs. one and three vs. two pursuit–evasion simulations, respectively, outperforming the baseline MADDPG (72.7 ± 6.0% and 64.4 ± 34.4%). Ablation studies and cooperative landmark tasks further demonstrate improved training stability and interpretability.
(This article belongs to the Section Aeronautics)
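The Apollonius circle mentioned in the abstract above is a standard construction in pursuit–evasion: for a slower evader, it is the set of points that pursuer and evader reach at the same time when both move in straight lines at constant speed. The sketch below computes that circle in 2D; how EO-MADDPG actually uses it for guidance is not described in the abstract, and the function and parameter names are hypothetical.

```python
import math


def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the Apollonius circle in 2D.

    speed_ratio = evader speed / pursuer speed and must lie in (0, 1).
    The circle is the locus of points X with |X - evader| / |X - pursuer| = speed_ratio.
    """
    if not 0.0 < speed_ratio < 1.0:
        raise ValueError("speed_ratio must be strictly between 0 and 1")
    k2 = speed_ratio ** 2
    cx = (evader[0] - k2 * pursuer[0]) / (1.0 - k2)
    cy = (evader[1] - k2 * pursuer[1]) / (1.0 - k2)
    radius = speed_ratio * math.dist(pursuer, evader) / (1.0 - k2)
    return (cx, cy), radius
```

With the pursuer at the origin, the evader at (3, 0), and a speed ratio of 0.5, the circle is centered at (4, 0) with radius 2. Points inside the circle are reached first by the evader, so an encirclement strategy would plausibly aim to shrink or cover that region.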

18 pages, 800 KB  
Article
Transient Dynamic Feature Adaptive Cooperative Control for Renewable Grids via Multi-Agent Deep Reinforcement Learning
by Mingyu Pang, Min Li, Xi Ye, Peng Shi, Zongsheng Zheng, Lai Yuan and Hongwen Tan
Electronics 2026, 15(6), 1285; https://doi.org/10.3390/electronics15061285 - 19 Mar 2026
Viewed by 117
Abstract
The increasing integration of inverter-based distributed energy resources (DERs) significantly reduces power system inertia, posing critical challenges to transient stability. Traditional fault ride-through strategies, relying on passive and localized rules, often fail to provide effective coordinated support in low-inertia grids. To address these limitations, this paper proposes a Transient Dynamic Features Adaptation Distributed Cooperative Control (TDA-DCC) framework. This approach integrates a dynamic context-aware policy network based on multi-head attention mechanisms to extract temporal features from local observations, allowing agents to anticipate transient dynamics rather than merely reacting to instantaneous states. A multi-agent deep deterministic policy gradient algorithm is employed to optimize a global multi-dimensional objective function encompassing frequency, voltage, and rotor angle stability. Furthermore, to ensure engineering reliability, a hybrid execution architecture is introduced, which embeds a deterministic safety monitor to switch between the intelligent policy and a robust backup controller during extreme anomalies. Case studies on a modified IEEE 39-bus system demonstrate that the proposed method significantly enhances transient stability margins and robustness against sensor failures compared to conventional baselines.

23 pages, 8149 KB  
Article
UGV Swarm Multi-View Fusion Under Occlusion: A Graph-Based Calibration-Free Framework
by Jiaqi Jing, Weilong Song, Hangcheng Zhang, Yong Liu, Fuyong Feng, Dezhi Zheng and Shangchun Fan
Drones 2026, 10(3), 214; https://doi.org/10.3390/drones10030214 - 18 Mar 2026
Viewed by 129
Abstract
In unmanned ground vehicle (UGV) swarm systems, comprehensive environmental awareness is critical for coordinated operations. Yet they are frequently deployed in occlusion-rich, constrained environments where multi-agent visual fusion is essential. However, existing methods are critically limited by offline-calibrated extrinsic parameters, hindering flexible deployment, and by a strong co-visibility assumption, which fails under severe occlusion. To overcome these constraints, we introduce an end-to-end, calibration-free framework for the joint registration of cameras and subjects. Our approach begins with a single-view module that estimates subjects’ poses and appearance features. Subsequently, a novel graph-based pose propagation module (GPPM) treats UGVs’ cameras as nodes in a graph, connecting them with edges when they share co-visible subjects identified via appearance matching. Breadth-first search (BFS) then finds the shortest registration path from any camera to a designated root camera, enabling pose propagation via local co-visibility links and global alignment of all subjects into a unified bird’s-eye-view (BEV) space. This strategy relaxes the stringent requirement of full co-visibility with the root node. A multi-task loss function is proposed to jointly optimize pose estimation and feature matching. Trained and evaluated on a synthetic dataset with occlusions (CSRD-O) collected by a UGV swarm system, our framework achieves mean camera pose errors of 1.57 m/8.70° and mean subject pose errors of 1.40 m/9.14°. Furthermore, we demonstrate a scene monitoring task using a UGV swarm system. Experiments show that the proposed method generates robust BEV estimates even under severe occlusion and low inter-view overlap. This work presents a purely visual, self-calibrating multi-view fusion perception scheme, demonstrating its potential to support cooperative perception, task-oriented monitoring, and collective situational awareness in UGV swarm systems. 
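The graph step described above (cameras as nodes, co-visibility links as edges, breadth-first search to a root camera) can be sketched as follows. This is a minimal illustration of BFS shortest paths on an unweighted graph, not the article's GPPM module; the camera labels and the `registration_paths` helper are hypothetical, and the actual pose composition along each path is omitted.

```python
from collections import deque


def registration_paths(edges, root):
    """Map each camera reachable from `root` to its shortest path back to `root`.

    `edges` is an iterable of (camera_a, camera_b) pairs that share co-visible subjects.
    Cameras with no chain of co-visibility links to the root are absent from the result.
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    parent = {root: None}
    queue = deque([root])
    while queue:
        cam = queue.popleft()
        for neighbor in sorted(graph.get(cam, ())):  # sorted for deterministic output
            if neighbor not in parent:
                parent[neighbor] = cam
                queue.append(neighbor)

    paths = {}
    for cam in parent:
        path, node = [], cam
        while node is not None:  # walk parent links until the root is reached
            path.append(node)
            node = parent[node]
        paths[cam] = path
    return paths
```

Because BFS explores the graph level by level, each returned path uses the fewest co-visibility hops, which limits how many relative-pose estimates have to be chained (and how much error accumulates) when a camera does not see the root camera's subjects directly.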

25 pages, 2296 KB  
Article
A Multi-Agent Advisory Board Reinforcement Learning Framework for Adaptive Cooperative Control
by Onur Osman, Tolga Kudret Karaca, Bahar Yalcin Kavus, Gokalp Tulum and Sajjad Nematzadeh
Algorithms 2026, 19(3), 230; https://doi.org/10.3390/a19030230 - 18 Mar 2026
Viewed by 102
Abstract
This study proposes Advisory Board Reinforcement Learning (AdvB-RL), a cooperative reinforcement-learning framework that integrates multiple advisory neural networks to guide policy optimization. Unlike conventional single-agent architectures, AdvB-RL maintains a set of independently trained advisory networks that contribute to action selection through a dynamic aggregation mechanism. This design preserves diverse experiential knowledge while improving learning stability and the exploration–exploitation balance. The framework is evaluated on three benchmark control tasks, namely LunarLander-v2, CartPole-v1, and MountainCar-v0, using advisory board sizes of 1, 5, and 10 members against a Double Deep Q-Network (DDQN) baseline. The best-performing configuration, 10 AdvB, achieved 270.02 ± 24.74 on LunarLander-v2 versus 227.92 ± 86.02 for DDQN, 497.79 ± 5.18 on CartPole-v1 versus 304.37 ± 144.04, and −103.16 ± 15.46 on MountainCar-v0 versus −130.71 ± 31.64, indicating higher returns together with markedly lower variability. Across the three environments, these results show that increasing the number of advisory members improves both reward consistency and overall robustness, with the 10-member setting providing the strongest performance. Within the tested configurations, the advisory board mechanism remains computationally feasible, while preliminary experiments beyond 10 advisors show diminishing returns relative to added complexity. Overall, AdvB-RL provides a robust and modular alternative to single-policy reinforcement learning for adaptive cooperative control.

35 pages, 2351 KB  
Article
A Bilevel Optimization Model Based on Agency Theory in Relief Supply Chain Considering Authorization
by Xiaoli Wu and Xiulan Wang
Symmetry 2026, 18(3), 524; https://doi.org/10.3390/sym18030524 - 18 Mar 2026
Viewed by 103
Abstract
As a proactive response, reserving a certain amount of relief materials in advance is crucial for responding to potential disasters. Different from public tendering and bidding, this study proposes the purchasing mode of authorization, under which a nonprofit organization (NPO), as a buyer, wholly authorizes the procurement of relief materials to a professional agent. The relief material procurement system under the purchasing mode of authorization is regarded as a bilevel relief supply chain consisting of one buyer, one agent, and two suppliers with private information about the quality levels of relief materials. For the disclosure of private information, the quality-related procurement strategy is designed in the form of a menu based on the suppliers’ private information. A bilevel optimization model is developed based on agency theory to derive the optimal strategic decisions, and the impacts of the main influencing factors on the optimal procurement strategy and the buyer’s minimum expected cost are discussed via numerical analysis. Then, the study is extended by exploring suppliers’ alternative cost functions and supply availability, as well as proposing future research directions. This paper presents an optimal quality-related procurement strategy, which provides rules for quickly responding to changes in influencing factors during the material procurement process, as well as the minimum expected cost for the buyer to purchase relief materials, which serves as a threshold for screening a reliable retail enterprise as the agent. Finally, three managerial implications with practical significance, drawn from our findings, are presented to facilitate cooperation between NPOs and large retail enterprises in order to achieve effective procurement of relief materials at the pre-disaster preparation stage.
(This article belongs to the Section Mathematics)

38 pages, 6270 KB  
Article
Cooperative Rapid Search for Evasive Targets Using Multiple UAVs Based on Graph Theory
by Wenying Dou, Peng Yang, Zhiwei Zhang, Guangpeng Hu and Sirun Xu
Drones 2026, 10(3), 196; https://doi.org/10.3390/drones10030196 - 11 Mar 2026
Viewed by 361
Abstract
Rapid search for evasive targets using multiple Unmanned Aerial Vehicles (UAVs) presents significant challenges, as it requires real-time target-motion prediction, multi-agent coordination, and adherence to kinematic constraints. Existing cooperative search methods often assume non-adversarial target behavior or model target motion independently of UAV actions, which reduces their effectiveness against targets that actively evade based on UAV positions. To address these limitations, this study introduces the Cooperative Rapid Search Algorithm for Evasive Targets (CRS-AET). The proposed framework utilizes graph-theoretic modeling to represent spatial-temporal relationships among UAVs, targets, and environmental grids. A directional gradient-based motion prediction (DG-Prediction) method first estimates probable movement areas of dynamic targets within the graph-structured environment. An improved multi-round auction algorithm with graph-based utility propagation (IMRAA) then optimizes UAV resource allocation. Finally, Dubins-Constrained Trajectory Optimization (DC-RTO) is integrated within a distributed model predictive control (DMPC) scheme to ensure kinematic feasibility. Simulation results across three representative scenarios indicate that CRS-AET enables faster target detection, enhanced area coverage, and more efficient coordination than baseline methods. Hardware-in-the-loop (HIL) experiments further confirm the robustness and practical applicability of the framework in realistic operational environments.
(This article belongs to the Section Artificial Intelligence in Drones (AID))

21 pages, 5133 KB  
Review
Synergistic Anticancer Effects of Vitamin D and Plant-Derived Compounds: Molecular Mechanisms, Therapeutic Potential, and Nanotechnology-Enabled Delivery Approaches
by Arik Dahan, Sapir Ifrah, Ludmila Yarmolinsky, Boris Khalfin, Sigal Fleisher-Berkovich and Shimon Ben-Shabat
Int. J. Mol. Sci. 2026, 27(5), 2507; https://doi.org/10.3390/ijms27052507 - 9 Mar 2026
Viewed by 322
Abstract
Vitamin D is widely recognized for its pivotal role in the prevention and treatment of various cancers. The active compounds derived from plants have garnered significant attention due to their multi-faceted anticancer properties. Given the complexity and heterogeneity of cancer, monotherapies often fall short in effectiveness. As a result, combinatorial pharmacological strategies, which utilize multiple drug agents, are increasingly being employed globally. Notably, emerging evidence highlights the potent synergistic anticancer effects of vitamin D in combination with certain phytochemicals against a variety of cancers. This review explores the cooperative mechanisms through which vitamin D and phytochemicals enhance cancer prevention and therapy. In addition to examining their synergistic effects, this review also discusses recent advancements in nanotechnology-based delivery systems for vitamin D, which hold promise for optimizing its therapeutic potential. Collectively, these findings underscore the potential of combining vitamin D with phytochemicals and innovative delivery methods as a promising strategy in the fight against cancer, paving the way for more effective, multi-targeted therapeutic approaches.
(This article belongs to the Section Bioactives and Nutraceuticals)

39 pages, 67440 KB  
Article
LLM-TOC: LLM-Driven Theory-of-Mind Adversarial Curriculum for Multi-Agent Generalization
by Chenxu Wang, Jiang Yuan, Tianqi Yu, Xinyue Jiang, Liuyu Xiang, Junge Zhang and Zhaofeng He
Mathematics 2026, 14(5), 915; https://doi.org/10.3390/math14050915 - 8 Mar 2026
Viewed by 322
Abstract
Zero-shot generalization to out-of-distribution (OOD) teammates and opponents in multi-agent systems (MASs) remains a fundamental challenge for general-purpose AI, especially in open-ended interaction scenarios. Existing multi-agent reinforcement learning (MARL) paradigms, such as self-play and population-based training, often collapse to a limited subset of Nash equilibria, leaving agents brittle when faced with semantically diverse, unseen behaviors. Recent approaches that invoke Large Language Models (LLMs) at run time can improve adaptability but introduce substantial latency and can become less reliable as task horizons grow; in contrast, LLM-assisted reward-shaping methods remain constrained by the inefficiency of the inner reinforcement-learning loop. To address these limitations, we propose LLM-TOC (LLM-Driven Theory-of-Mind Adversarial Curriculum), which casts generalization as a bi-level Stackelberg game: in the inner loop, a MARL agent (the follower) minimizes regret against a fixed population, while in the outer loop, an LLM serves as a semantic oracle that generates executable adversarial or cooperative strategies in a Turing-complete code space to maximize the agent’s regret. To cope with the absence of gradients in discrete code generation, we introduce Gradient Saliency Feedback, which transforms pixel-level value fluctuations into semantically meaningful causal cues to steer the LLM toward targeted strategy synthesis. We further provide motivating theoretical analysis via the PAC-Bayes framework, showing that LLM-TOC converges at rate O(1/K) and yields a tighter generalization error bound than parameter-space exploration under reasonable preconditions. 
Experiments on the Melting Pot benchmark demonstrate that, with expected cumulative collective return as the core zero-shot generalization metric, LLM-TOC consistently outperforms self-play baselines (IPPO and MAPPO) and the LLM-inference method Hypothetical Minds across all held-out test scenarios, reaching 75% to 85% of the upper-bound performance of Oracle PPO. Meanwhile, with the number of RL environment interaction steps to reach the target relative performance as the core efficiency metric, our framework reduces the total training computational cost by more than 60% compared with mainstream baselines.
(This article belongs to the Special Issue Applications of Intelligent Game and Reinforcement Learning)

29 pages, 2249 KB  
Article
Reinforcement Learning-Based Management in IoT-Enabled Renewable Energy Communities: An Approach to Optimization for Comfort, Economy, and Sustainable Performance
by Stefano Caputo, Eleonora Iacobelli, Maurizio De Lucia, Sara Jayousi and Lorenzo Mucchi
Sensors 2026, 26(5), 1682; https://doi.org/10.3390/s26051682 - 6 Mar 2026
Viewed by 277
Abstract
The increasing deployment of Internet of Things (IoT) sensing infrastructures and distributed renewable energy resources is enabling the emergence of Renewable Energy Communities (RECs), which require intelligent, adaptive, and decentralized energy management strategies. This study proposes a sensor-driven reinforcement learning (RL) framework for the coordinated management of residential RECs, aiming to jointly optimize thermal comfort, economic savings, and environmental sustainability. Each household is equipped with a network of IoT sensors monitoring indoor temperature, energy production and consumption, battery state of charge, and user presence, which collectively define a discretized state space for a tabular Q-learning agent controlling heating systems and programmable appliances. A stochastic simulation environment is developed to realistically reproduce weather variability, building thermal dynamics, user activity profiles, and photovoltaic generation. To address the instability typical of multi-agent learning, a two-stage training strategy is adopted: agents are first pre-trained at single-house level using synthetic sensor data and are subsequently deployed within the full community, where coordination is achieved through shared reward components without explicit inter-agent communication. Performance is evaluated on a heterogeneous Renewable Energy Community (REC) composed of eleven households, including both prosumers and consumers. The simulation results show that the proposed approach significantly outperforms rule-based control strategies, achieving lower energy consumption, improved thermal comfort stability, and higher global reward. Moreover, pre-trained agents maintain stable and cooperative behavior when operating concurrently at community level, with limited sensitivity to exploration. 
These findings demonstrate that sensor-driven, lightweight reinforcement learning represents a viable and scalable solution for decentralized energy management in IoT-enabled Renewable Energy Communities.
(This article belongs to the Section Intelligent Sensors)

49 pages, 1775 KB  
Systematic Review
Single-Agent Sedation for Behavioral Management in Pediatric Dentistry: An Umbrella Review of Agents, Routes of Administration, Providers, and Clinical Settings
by Federica Di Spirito, Francesco Giordano, Giuseppina De Benedetto, Maria Pia Di Palo, Francesco Traino, Colomba Pessolano, Alessia Bramanti, Antonino Fiorino and Carlo Rengo
Children 2026, 13(3), 373; https://doi.org/10.3390/children13030373 - 6 Mar 2026
Viewed by 337
Abstract
Background: Dental fear and anxiety are highly prevalent in children, resulting in avoidance or incomplete dental treatment; sedation emerges as a possible behavioral management strategy. This umbrella review aimed to provide a structured and critical synthesis of the available knowledge on sedative single-agent efficacy and routes of administration employed for achieving sedation (excluding deep sedation/general anesthesia) during dental procedures in children for behavior management, as well as to evaluate acceptability and satisfaction for child, caregiver, and provider, and to assess the influence of clinical setting and provider. Methods: In line with the PRISMA statement, the protocol was registered on PROSPERO (CRD420251043738), and 18 systematic reviews were included and synthesized qualitatively. Results: Single-agent sedation was safe and effective for managing behavior in children during dental procedures, with midazolam and nitrous oxide being the most studied agents. Different routes of administration showed distinct characteristics in onset, recovery time, adverse effects and cooperation, while agent selection appeared influenced by clinical setting and provider type. However, data on acceptability and satisfaction from children, caregivers, and providers remain limited. Conclusions: Evidence suggests potential effectiveness of selected agents and routes in appropriately monitored settings, but data heterogeneity precludes strong comparative recommendations. Further studies are therefore needed to address the existing gaps in pediatric dental sedation. Full article
(This article belongs to the Collection Advance in Pediatric Dentistry)

23 pages, 7676 KB  
Article
Co-DMPC Strategy for Coordinated Chassis Control of Distributed Drive Electric Vehicles
by Mengdong Zheng, Hongjie Wei, Wanli Liu, Zhaoxue Deng and Xingquan Li
World Electr. Veh. J. 2026, 17(3), 132; https://doi.org/10.3390/wevj17030132 - 5 Mar 2026
Viewed by 221
Abstract
To address the challenge that existing vehicle chassis coordinated control methods struggle to balance the nonlinear couplings and control conflicts among Four-Wheel Steering (4WS), Direct Yaw-moment Control (DYC), and Active Suspension Systems (ASS), this paper proposes a Cooperative Distributed Model Predictive Control (Co-DMPC) strategy. First, the 4WS, DYC, and ASS are modeled as three interacting agents that effectively mitigate inter-subsystem control conflicts through information exchange and coupling compensation. Second, a Gaussian Mixture Model (GMM) is utilized to extract features from vehicle state data to enable the real-time grading of instability risks, which dynamically adjusts the control weights of the 4WS, DYC, and ASS agents. Finally, a distributed iterative optimization algorithm is designed to ensure that all agents converge to a global Pareto-optimal solution through rapid negotiation, achieving a balance between control performance and computational burden. Simulation results demonstrate that compared with No-Control and CMPC, the proposed Co-DMPC strategy significantly enhances the comprehensive performance of the vehicle. In terms of path tracking accuracy, the maximum tracking errors under high- and low-adhesion road conditions are reduced by 32.73% and 17%, respectively. Regarding roll stability, the peak roll angles of the vehicle are 0.27 rad and 0.01 rad under the respective conditions. For lateral stability, the proposed method maintains a more compact sideslip angle-yaw rate phase plane envelope, effectively achieving the coordinated optimization of chassis subsystems. Hardware-in-the-Loop (HIL) experiments further validate the performance and effectiveness of the controller. Full article
(This article belongs to the Special Issue Vehicle System Dynamics and Intelligent Control for Electric Vehicles)