Search Results (451)

Search Parameters:
Keywords = multi-agent decision making

35 pages, 1234 KB  
Review
A Survey of Autonomous Driving Trajectory Prediction: Methodologies, Challenges, and Future Prospects
by Miao Xu, Zhi Liu, Bingyi Wang and Shengyan Li
Machines 2025, 13(9), 818; https://doi.org/10.3390/machines13090818 (registering DOI) - 6 Sep 2025
Abstract
Trajectory prediction is a critical component of autonomous driving decision-making systems, directly impacting driving safety and traffic efficiency. Despite advancements, existing reviews exhibit limitations in timeliness, classification frameworks, and challenge analysis. This paper systematically reviews multi-agent trajectory prediction technologies, focusing on generating future position sequences from historical trajectories, high-precision maps, and scene context. We propose a multi-dimensional classification framework integrating input representation, output forms, method paradigms, and interaction modeling. The review comprehensively compares conventional methods and deep learning architectures, including diffusion models and large language models. We further analyze five core challenges: complex interactions, rule and map dependence, long-term prediction errors, extreme-scene generalization, and real-time constraints. Finally, interdisciplinary solutions are prospectively explored.
(This article belongs to the Special Issue New Journeys in Vehicle System Dynamics and Control)
38 pages, 2474 KB  
Article
Generative and Adaptive AI for Sustainable Supply Chain Design
by Sabina-Cristiana Necula and Emanuel Rieder
J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 240; https://doi.org/10.3390/jtaer20030240 - 4 Sep 2025
Viewed by 92
Abstract
This study explores how the integration of generative artificial intelligence, multi-objective evolutionary optimization, and reinforcement learning can enable sustainable and cost-effective decision-making in supply chain strategy. Using real-world retail demand data enriched with synthetic sustainability attributes, we trained a Variational Autoencoder (VAE) to generate plausible future demand scenarios. These were used to seed a Non-Dominated Sorting Genetic Algorithm (NSGA-II) aimed at identifying Pareto-optimal sourcing strategies that balance delivery cost and CO2 emissions. The resulting Pareto frontier revealed favorable trade-offs, enabling up to 50% emission reductions for only a 10–15% cost increase. We further deployed a deep Q-learning (DQN) agent to dynamically manage weekly shipments under a selected balanced strategy. The reinforcement learning policy achieved an additional 10% emission reduction by adaptively switching between green and conventional transport modes in response to demand and carbon pricing. Importantly, the agent also demonstrated resilience during simulated supply disruptions by rerouting decisions in real time. This research contributes a novel AI-based decision architecture that combines generative modeling, evolutionary search, and adaptive control to support sustainability in complex and uncertain supply chains.
(This article belongs to the Special Issue Digitalization and Sustainable Supply Chain)
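The Pareto-optimal trade-off between delivery cost and emissions mentioned in the abstract above can be illustrated with a plain non-dominated filter. This is only a minimal sketch of the dominance idea that NSGA-II builds on, not the authors' implementation; the function names and the candidate values are made up for illustration.

```python
from typing import List, Tuple

# Hypothetical objective pair: (delivery cost, CO2 emissions), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by cost."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Made-up candidate sourcing strategies: (cost, emissions).
candidates = [(100.0, 80.0), (110.0, 40.0), (120.0, 60.0), (105.0, 90.0), (150.0, 35.0)]
front = pareto_front(candidates)
print(front)
```

In this toy data, moving from the cheapest strategy to the next point on the front halves emissions for a 10% cost increase, which is the kind of trade-off a Pareto frontier makes visible.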
23 pages, 3829 KB  
Article
Causal Correction and Compensation Network for Robotics: Applications and Validation in Continuous Control
by Xiaoqing Zhu, Lanyue Bi, Tong Wu, Chuan Zhang and Jiahao Wu
Appl. Sci. 2025, 15(17), 9628; https://doi.org/10.3390/app15179628 - 1 Sep 2025
Viewed by 155
Abstract
Deep Reinforcement Learning (DRL) has achieved remarkable success in robotic control, autonomous driving, and game-playing agents. However, its decision-making process often remains a black box, lacking both interpretability and verifiability. In robotic control tasks, developers cannot pinpoint decision errors or precisely adjust control strategies based solely on observed robot behaviors. To address this challenge, this work proposes an interpretable DRL framework based on a Causal Correction and Compensation Network (C2-Net), which systematically captures the causal relationships underlying decision-making and enhances policy robustness. C2-Net integrates a Graph Neural Network-based Neural Causal Model (GNN-NCM) to compute causal influence weights for each action. These weights are then dynamically applied to correct and compensate the raw policy outputs, thereby balancing performance optimization and transparency. This work validates the approach on OpenAI Gym’s Hopper, Walker2d, and Humanoid environments, as well as the multi-agent AzureLoong platform built on Isaac Gym. In terms of convergence speed, final return, and policy robustness, experimental results show that C2-Net outperforms both non-causal baselines and conventional attention-based models. Moreover, it provides rich causal explanations for its decisions. The framework represents a principled shift from correlation to causation and offers a practical solution for the safe and reliable deployment of multi-robot systems.
17 pages, 2179 KB  
Article
Federated Multi-Agent DRL for Task Offloading in Vehicular Edge Computing
by Hongwei Zhao, Yu Li, Zhixi Pang and Zihan Ma
Electronics 2025, 14(17), 3501; https://doi.org/10.3390/electronics14173501 - 1 Sep 2025
Viewed by 286
Abstract
With the expansion of vehicle-to-everything (V2X) networks and the rising demand for intelligent services, vehicle edge computing encounters heightened requirements for more efficient task offloading. This study proposes a task offloading technique that utilizes federated collaboration and multi-agent deep reinforcement learning to reduce system latency and energy consumption. The task offloading issue is formulated as a Markov decision process (MDP), and a framework utilizing the Multi-Agent Dueling Double Deep Q-Network (MAD3QN) is developed to help agents make optimal offloading decisions in complex environments. In addition, Federated Learning (FL) is implemented during the training phase, leveraging local training outcomes from many vehicles to enhance the global model, thus improving the learning efficiency of the agents. Experimental results indicate that, compared to conventional baseline algorithms, the proposed method decreases latency and energy consumption by at least 10% and 9%, respectively, while enhancing the average reward by at least 21%.
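The federated step described above, where local training outcomes from many vehicles improve a global model, is often realized as a sample-weighted average of local parameters (FedAvg). The sketch below assumes that scheme; the abstract does not state which aggregation rule the authors use, and the parameter names and values are hypothetical.

```python
from typing import Dict, List

# Hypothetical model representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params], num_samples: List[int]) -> Params:
    """Average local models, weighting each by its number of training samples."""
    total = sum(num_samples)
    global_params: Params = {}
    for name, first in local_params[0].items():
        global_params[name] = [
            sum(p[name][i] * n for p, n in zip(local_params, num_samples)) / total
            for i in range(len(first))
        ]
    return global_params


# Two made-up vehicles with tiny models; vehicle_b saw three times more data.
vehicle_a = {"w": [1.0, 2.0], "b": [0.0]}
vehicle_b = {"w": [3.0, 4.0], "b": [1.0]}
global_model = federated_average([vehicle_a, vehicle_b], [1, 3])
print(global_model)
```

Weighting by sample count lets vehicles with more local experience pull the global model further, while raw data never leaves the vehicle.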
27 pages, 520 KB  
Article
QiMARL: Quantum-Inspired Multi-Agent Reinforcement Learning Strategy for Efficient Resource Energy Distribution in Nodal Power Stations
by Sapthak Mohajon Turjya, Anjan Bandyopadhyay, M. Shamim Kaiser and Kanad Ray
AI 2025, 6(9), 209; https://doi.org/10.3390/ai6090209 - 1 Sep 2025
Viewed by 632
Abstract
The coupling of quantum computing with multi-agent reinforcement learning (MARL) provides an exciting direction to tackle intricate decision-making tasks in high-dimensional spaces. This work introduces a new quantum-inspired multi-agent reinforcement learning (QiMARL) model, utilizing quantum parallelism to improve learning efficiency and scalability. The QiMARL model is tested on an energy distribution task, which optimizes power distribution between generating and demanding nodal power stations. We compare the convergence time, reward performance, and scalability of QiMARL with traditional Multi-Armed Bandit (MAB) and Multi-Agent Reinforcement Learning methods, such as Greedy, Upper Confidence Bound (UCB), Thompson Sampling, MADDPG, QMIX, and PPO, together with a comprehensive ablation study. Our findings show that QiMARL yields better performance in high-dimensional systems, decreasing the number of training epochs needed for convergence while enhancing overall reward maximization. We also compare the algorithm’s computational complexity, indicating that QiMARL is more scalable to high-dimensional quantum environments. This research opens the door to future studies of quantum-enhanced reinforcement learning (RL) with potential applications to energy optimization, traffic management, and other multi-agent coordination problems.
(This article belongs to the Special Issue Advances in Quantum Computing and Quantum Machine Learning)
14 pages, 1246 KB  
Article
Multi-Agent-Based Service Composition Using Integrated Particle-Ant Algorithm in the Cloud
by Seongsoo Cho, Yeonwoo Lee and Hanyong Choi
Appl. Sci. 2025, 15(17), 9603; https://doi.org/10.3390/app15179603 - 31 Aug 2025
Viewed by 281
Abstract
The increasing complexity and scale of service-oriented architectures in cloud computing have heightened the demand for intelligent, decentralized, and adaptive service composition techniques. This study proposes an advanced framework that integrates a Multi-Agent System (MAS) with a novel hybrid metaheuristic optimization method, the Integrated Particle-Ant Algorithm (IPAA), to achieve efficient, scalable, and Quality of Service (QoS)-aware service composition. The IPAA dynamically combines the global search capabilities of Particle Swarm Optimization (PSO) with the local exploitation strength of Ant Colony Optimization (ACO), thereby enhancing convergence speed and solution quality. The proposed system is structured into three logical layers (agent, optimization, and infrastructure), facilitating autonomous decision-making, distributed coordination, and runtime adaptability. Extensive simulations using a synthetic cloud service dataset demonstrate that the proposed approach significantly outperforms traditional optimization methods, including standalone PSO, ACO, and random composition strategies, across key metrics such as utility score, execution time, and scalability. Moreover, the framework enables real-time monitoring and automatic re-optimization in response to QoS degradation or Service-Level Agreement (SLA) violations. Through decentralized negotiation and minimal communication overhead, agents exhibit high resilience and flexibility under dynamic service availability. These results collectively suggest that the proposed IPAA-based framework provides a robust, intelligent, and scalable solution for service composition in complex cloud computing environments.
(This article belongs to the Section Green Sustainable Science and Technology)
27 pages, 4949 KB  
Article
Resolving the Classic Resource Allocation Conflict in On-Ramp Merging: A Regionally Coordinated Nash-Advantage Decomposition Deep Q-Network Approach for Connected and Automated Vehicles
by Linning Li and Lili Lu
Sustainability 2025, 17(17), 7826; https://doi.org/10.3390/su17177826 - 30 Aug 2025
Viewed by 270
Abstract
To improve the traffic efficiency of connected and automated vehicles (CAVs) in on-ramp merging areas, this study proposes a novel region-level multi-agent reinforcement learning framework, Regionally Coordinated Nash-Advantage Decomposition Deep Q-Network with Conflict-Aware Q Fusion (RC-NashAD-DQN). Unlike existing vehicle-level control methods, which suffer from high computational overhead and poor scalability, our approach abstracts on-ramp and main road areas as region-level control agents, achieving coordinated yet independent decision-making while maintaining control precision and merging efficiency comparable to fine-grained vehicle-level approaches. Each agent adopts a value–advantage decomposition architecture to enhance policy stability and distinguish action values, while sharing state–action information to improve inter-agent awareness. A Nash equilibrium solver is applied to derive joint strategies, and a conflict-aware Q-fusion mechanism is introduced as a regularization term rather than a direct action-selection tool, enabling the system to resolve local conflicts, particularly at region boundaries, without compromising global coordination. This design reduces training complexity, accelerates convergence, and improves robustness against communication imperfections. The framework is evaluated using the SUMO simulator at the Taishan Road interchange on the S1 Yongtaiwen Expressway under heterogeneous traffic conditions involving both passenger cars and container trucks, and is compared with baseline models including C-DRL-VSL and MADDPG. Extensive simulations demonstrate that RC-NashAD-DQN significantly improves average traffic speed by 17.07% and reduces average delay by 12.68 s, outperforming all baselines in efficiency metrics while maintaining robust convergence performance. These improvements enhance cooperation and merging efficiency among vehicles, contributing to sustainable urban mobility and the advancement of intelligent transportation systems.
18 pages, 3066 KB  
Article
A Tree-Based Search Algorithm with Global Pheromone and Local Signal Guidance for Scientific Chart Reasoning
by Min Zhou, Zhiheng Qi, Tianlin Zhu, Jan Vijg and Xiaoshui Huang
Mathematics 2025, 13(17), 2739; https://doi.org/10.3390/math13172739 - 26 Aug 2025
Viewed by 394
Abstract
Chart reasoning, a critical task for automating data interpretation in domains such as scientific data analysis and medical diagnostics, leverages large-scale vision language models (VLMs) to interpret chart images and answer natural language questions, enabling semantic understanding that enhances knowledge accessibility and supports data-driven decision-making across diverse domains. In this work, we formalize chart reasoning as a sequential decision-making problem governed by a Markov Decision Process (MDP), thereby providing a mathematically grounded framework for analyzing visual question answering tasks. While recent advances such as multi-step reasoning with Monte Carlo tree search (MCTS) offer interpretable and stochastic planning capabilities, these methods often suffer from redundant path exploration and inefficient reward propagation. To address these challenges, we propose a novel algorithmic framework that integrates a pheromone-guided search strategy inspired by Ant Colony Optimization (ACO). In our approach, chart reasoning is cast as a combinatorial optimization problem over a dynamically evolving search tree, where path desirability is governed by pheromone concentration functions that capture global phenomena across search episodes and are reinforced through trajectory-level rewards. Transition probabilities are further modulated by local signals, which are evaluations derived from the immediate linguistic feedback of large language models. This enables fine-grained decision-making at each step while preserving long-term planning efficacy. Extensive experiments across four benchmark datasets, ChartQA, MathVista, GRAB, and ChartX, demonstrate the effectiveness of our approach, with multi-agent reasoning and pheromone guidance yielding success rate improvements of +18.4% and +7.6%, respectively.
(This article belongs to the Special Issue Multimodal Deep Learning and Its Application in Healthcare)
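The combination of global pheromone levels and local signals described above resembles the classic Ant Colony Optimization transition rule, in which a child node is chosen with probability proportional to pheromone^alpha times heuristic^beta. The sketch below assumes that textbook rule and a simple evaporate-then-deposit update; the paper's exact functions are not given in the abstract, and alpha, beta, rho, and all values are illustrative assumptions.

```python
from typing import List


def transition_probs(pheromone: List[float], local_signal: List[float],
                     alpha: float = 1.0, beta: float = 1.0) -> List[float]:
    """Textbook ACO rule: p_i is proportional to pheromone_i^alpha * signal_i^beta."""
    weights = [(t ** alpha) * (s ** beta) for t, s in zip(pheromone, local_signal)]
    total = sum(weights)
    return [w / total for w in weights]


def reinforce(pheromone: List[float], chosen: int, reward: float,
              rho: float = 0.1) -> List[float]:
    """Evaporate all pheromone by a factor (1 - rho), then deposit reward on the chosen child."""
    updated = [(1.0 - rho) * t for t in pheromone]
    updated[chosen] += reward
    return updated


# Three made-up child nodes of one search-tree node.
pheromone = [1.0, 1.0, 2.0]      # global signal accumulated over earlier episodes
local_signal = [0.5, 1.0, 0.5]   # hypothetical per-step score, e.g. from an LLM evaluator
probs = transition_probs(pheromone, local_signal)
pheromone = reinforce(pheromone, chosen=2, reward=1.0, rho=0.5)
print(probs, pheromone)
```

Evaporation keeps early, possibly misleading trajectories from dominating later episodes, while the local signal lets a single step override a weak global prior.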
23 pages, 2958 KB  
Article
Controlling Heterogeneous Multi-Agent Systems Under Uncertainty Using Fuzzy Inference and Evolutionary Search
by Yukinobu Hoshino, Keigo Yoshimi, Tuan Linh Dang and Namal Rathnayake
Information 2025, 16(9), 732; https://doi.org/10.3390/info16090732 - 25 Aug 2025
Viewed by 573
Abstract
Real-time coordination of heterogeneous multi-agent systems in dynamic and partially observable environments poses significant challenges. To address this, we propose a framework that integrates fuzzy inference systems with real-valued genetic algorithms to optimize decision-making under strict time constraints and sensory uncertainty. We evaluate the proposed method in the RoboCup Soccer Simulation 2D League, where 22 autonomous agents coordinate through a fuzzy-evaluated action sequence search. Spatial heuristics are encoded as fuzzy rules, and optimization based on genetic algorithms refines evaluation function parameters according to performance metrics such as number of shots, goal area entries, and scoring rates. The resulting control strategy remains interpretable; spatial heat maps reveal emergent behaviors such as coordinated positioning and ridgeline passing patterns near the penalty area. The experiments against established RoboCup teams, serving as benchmarks, demonstrate the competitive performance of our trained agents while enabling analyses of evolving decision structures and agent behaviors. Our method provides a transparent and adaptable framework for controlling heterogeneous agents in uncertain real-time environments, with broad applicability to robotics, autonomous systems, and distributed control systems.
(This article belongs to the Section Artificial Intelligence)
24 pages, 11782 KB  
Article
Research on Joint Game-Theoretic Modeling of Network Attack and Defense Under Incomplete Information
by Yifan Wang, Xiaojian Liu and Xuejun Yu
Entropy 2025, 27(9), 892; https://doi.org/10.3390/e27090892 - 23 Aug 2025
Viewed by 438
Abstract
In the face of increasingly severe cybersecurity threats, incomplete information and environmental dynamics have become central challenges in network attack–defense scenarios. In real-world network environments, defenders often find it difficult to fully perceive attack behaviors and network states, leading to a high degree of uncertainty in the system. Traditional approaches are inadequate in dealing with the diversification of attack strategies and the dynamic evolution of network structures, making it difficult to achieve highly adaptive defense strategies and efficient multi-agent coordination. To address these challenges, this paper proposes a multi-agent network defense approach based on joint game modeling, termed JG-Defense (Joint Game-based Defense), which aims to enhance the efficiency and robustness of defense decision-making in environments characterized by incomplete information. The method integrates Bayesian game theory, graph neural networks, and a proximal policy optimization framework, and it introduces two core mechanisms. First, a Dynamic Communication Graph Neural Network (DCGNN) is used to model the dynamic network structure, improving the perception of topological changes and attack evolution trends. A multi-agent communication mechanism is incorporated within the DCGNN to enable the sharing of local observations and strategy coordination, thereby enhancing global consistency. Second, a joint game loss function is constructed to embed the game equilibrium objective into the reinforcement learning process, optimizing both the rationality and long-term benefit of agent strategies. Experimental results demonstrate that JG-Defense outperforms the Cybermonic model by 15.83% in overall defense performance. Furthermore, under the traditional PPO loss function, the DCGNN model improves defense performance by 11.81% compared to the Cybermonic model. These results verify that the proposed integrated approach achieves superior global strategy coordination in dynamic attack–defense scenarios with incomplete information.
(This article belongs to the Section Multidisciplinary Applications)
22 pages, 2971 KB  
Article
Cooperative Schemes for Joint Latency and Energy Consumption Minimization in UAV-MEC Networks
by Ming Cheng, Saifei He, Yijin Pan, Min Lin and Wei-Ping Zhu
Sensors 2025, 25(17), 5234; https://doi.org/10.3390/s25175234 - 22 Aug 2025
Viewed by 633
Abstract
The Internet of Things (IoT) has promoted emerging applications that require massive device collaboration, heavy computation, and stringent latency. Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) systems can provide flexible services for user devices (UDs) with wide coverage. The optimization of both latency and energy consumption remains a critical yet challenging task due to the inherent trade-off between them. Joint association, offloading, and computing resource allocation are essential to achieving satisfactory system performance. However, these processes are difficult due to the highly dynamic environment and the exponentially increasing complexity of large-scale networks. To address these challenges, we introduce a carefully designed cost function to balance the latency and the energy consumption, formulate the joint problem as a partially observable Markov decision process, and propose two multi-agent deep-reinforcement-learning-based schemes to tackle the long-term problem. Specifically, the multi-agent proximal policy optimization (MAPPO)-based scheme uses centralized learning and decentralized execution, while the closed-form enhanced multi-armed bandit (CF-MAB)-based scheme decouples association from offloading and computing resource allocation. In both schemes, UDs act as independent agents that learn from environmental interactions and historical decisions, make decisions to maximize their individual reward functions, and achieve implicit collaboration through the reward mechanism. The numerical results validate the effectiveness and show the superiority of our proposed schemes. The MAPPO-based scheme enables collaborative agent decisions for high performance in complex dynamic environments, while the CF-MAB-based scheme supports independent rapid response decisions.
33 pages, 3689 KB  
Article
Research on a Multi-Agent Job Shop Scheduling Method Based on Improved Game Evolution
by Wei Xie, Bin Du, Jiachen Ma, Jun Chen and Xiangle Zheng
Symmetry 2025, 17(8), 1368; https://doi.org/10.3390/sym17081368 - 21 Aug 2025
Viewed by 338
Abstract
As the global manufacturing industry’s transformation accelerates toward being intelligent, “unmanned”, and low-carbon, manufacturing workshops face conflicts between production schedules and transportation tasks, leading to low efficiency and resource waste. This paper presents a multi-agent collaborative scheduling optimization method based on a hybrid game–genetic framework to address issues like high AGV (Automated Guided Vehicle) idle rates, excessive energy consumption, and uncoordinated equipment scheduling. The method establishes a trinity system integrating distributed decision-making, dynamic coordination, and environment awareness. In this system, the multi-agent decision-making and collaboration process exhibits significant symmetry characteristics. All agents (machine agents, mobile agents, etc.) follow unified optimization criteria and interaction rules, forming a dynamically balanced symmetric scheduling framework in resource competition and collaboration, which ensures fairness and consistency among different agents in task allocation, path planning, and other links. An improved best-response dynamic algorithm is employed in the decision-making layer to solve the multi-agent Nash equilibrium, while the genetic optimization layer enhances the global search capability by encoding scheduling schemes and adjusting crossover/mutation probabilities using dynamic competition factors. The coordination pivot layer updates constraints in real time based on environmental sensing, forming a closed-loop optimization mechanism. Experimental results show that, compared with the traditional genetic algorithm (TGA) and particle swarm optimization (PSO), the proposed method reduces the maximum completion time by 54.5% and 44.4% in simple scenarios and 57.1% in complex scenarios, the AGV idling rate by 68.3% in simple scenarios and 67.5%/77.6% in complex scenarios, and total energy consumption by 15.7%/10.9% in simple scenarios and 25%/18.2% in complex scenarios. This validates the method’s effectiveness in improving resource utilization and energy efficiency, providing a new technical path for intelligent scheduling in manufacturing workshops. Meanwhile, its symmetric multi-agent collaborative framework also offers a reference for the application of symmetry in complex manufacturing system optimization.
(This article belongs to the Section Computer)
16 pages, 1750 KB  
Article
An Intelligent Educational System: Analyzing Student Behavior and Academic Performance Using Multi-Source Data
by Haifang Li and Zhandong Liu
Electronics 2025, 14(16), 3328; https://doi.org/10.3390/electronics14163328 - 21 Aug 2025
Viewed by 494
Abstract
Student behavior analysis plays a critical role in enhancing educational quality and enabling personalized learning. While previous studies have utilized machine learning models to analyze campus card consumption data, few have integrated multi-source behavioral data with large language models (LLMs) to provide deeper insights. This study proposes an intelligent educational system that examines the relationship between student consumption behavior and academic performance. The system is built upon a dataset collected from students of three majors at Xinjiang Normal University, containing exam scores and campus card transaction records. We designed an artificial intelligence (AI) agent that incorporates LLMs, SageGNN-based graph embeddings, and time-series regularity analysis to generate individualized behavior reports. Experimental evaluations demonstrate that the system effectively captures both temporal consumption patterns and academic fluctuations, offering interpretable and accurate outputs. Compared to baseline LLMs, our model achieves lower perplexity while maintaining high report consistency. The system supports early identification of potential learning risks and enables data-driven decision-making for educational interventions. Furthermore, the constructed multi-source dataset serves as a valuable resource for advancing research in educational data mining, behavioral analytics, and intelligent tutoring systems.
35 pages, 3129 KB  
Article
Spatiotemporal Meta-Reinforcement Learning for Multi-USV Adversarial Games Using a Hybrid GAT-Transformer
by Yang Xiong, Shangwen Wang, Hongjun Tian, Guijie Liu, Zihao Shan, Yijie Yin, Jun Tao, Haonan Ye and Ying Tang
J. Mar. Sci. Eng. 2025, 13(8), 1593; https://doi.org/10.3390/jmse13081593 - 20 Aug 2025
Viewed by 371
Abstract
Coordinating Multi-Unmanned Surface Vehicle (USV) swarms in complex, adversarial maritime environments is a significant challenge, as existing multi-agent reinforcement learning (MARL) methods often fail to capture intricate spatiotemporal dependencies, leading to suboptimal policies. To address this, we propose Adv-TransAC, a novel Spatio-Temporal Meta-Reinforcement Learning framework. Its core innovation is a hybrid GAT-transformer architecture that decouples spatial and temporal reasoning: a Graph Attention Network (GAT) models instantaneous tactical formations, while a transformer analyzes their temporal evolution to infer intent. This is combined with an adversarial meta-learning mechanism to enable rapid adaptation to opponent tactics. In high-fidelity escort and defense simulations, Adv-TransAC significantly outperforms state-of-the-art MARL baselines in task success rate and policy robustness. The learned policies demonstrate the emergence of complex cooperative behaviors, such as intelligent risk-aware coordination and proactive interception maneuvers. The framework’s practicality is further validated by a communication-efficient federated optimization architecture. By effectively modeling spatiotemporal dynamics and enabling rapid adaptation, Adv-TransAC provides a powerful solution that moves beyond reactive decision-making, establishing a strong foundation for next-generation, intelligent maritime platforms.
(This article belongs to the Special Issue Advanced Control Strategies for Autonomous Maritime Systems)
16 pages, 1586 KB  
Article
A Multi-Agent Deep Reinforcement Learning Anti-Jamming Spectrum-Access Method in LEO Satellites
by Wenting Cao, Feihuang Chu, Luliang Jia, Hongyu Zhou and Yunfan Zhang
Electronics 2025, 14(16), 3307; https://doi.org/10.3390/electronics14163307 - 20 Aug 2025
Viewed by 562
Abstract
Low-Earth-orbit (LEO) satellite networks face significant vulnerabilities to malicious jamming and co-channel interference, compounded by dynamic topologies, resource constraints, and complex electromagnetic environments. Traditional anti-jamming approaches lack adaptability, centralized intelligent methods incur high overhead, and distributed intelligent methods fail to achieve global optimization. To address these limitations, this paper proposes a value decomposition network (VDN)-based multi-agent deep reinforcement learning (DRL) anti-jamming spectrum access approach with a centralized training and distributed execution architecture. Following offline centralized ground-based training, the model is deployed in a distributed manner on satellites for real-time spectrum-access decision-making. The simulation results demonstrate that the proposed method effectively balances training costs with anti-jamming performance. The method achieves near-optimal user satisfaction (approximately 97%) with minimal link overhead, confirming its effectiveness for resource-constrained LEO satellite networks.
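The value decomposition network (VDN) idea named above models the joint action value as a sum of per-agent values, so each agent can act greedily on its own values at execution time and still maximize the joint value. The sketch below illustrates only that additive property with made-up Q-values; it is not the authors' model, and the channel interpretation in the comments is an assumption.

```python
from itertools import product
from typing import List, Sequence


def greedy_actions(per_agent_q: List[List[float]]) -> List[int]:
    """Each agent independently picks the action with its highest local Q-value."""
    return [max(range(len(q)), key=q.__getitem__) for q in per_agent_q]


def joint_q(per_agent_q: List[List[float]], actions: Sequence[int]) -> float:
    """VDN assumption: Q_tot is the sum of the per-agent Q-values."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))


# Made-up local Q-values for two satellites, each choosing among three channels.
q_values = [[0.0, 1.0, 0.5], [0.25, 0.0, 2.0]]
actions = greedy_actions(q_values)
total = joint_q(q_values, actions)

# Brute-force check: the decentralized choice matches the best joint action.
best_joint = max(product(range(3), repeat=2), key=lambda a: joint_q(q_values, a))
print(actions, total, best_joint)
```

Because the sum is maximized term by term, no communication between agents is needed at execution time, which is what makes centralized training with distributed execution workable under this decomposition.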