Search Results (34)

Search Parameters:
Keywords = dual-layer reinforcement learning

31 pages, 3196 KB  
Article
Sustainable Grid-Compliant Rooftop PV Curtailment via LQR-Based Active Power Regulation and QPSO–RL MPPT in a Three-Switch Micro-Inverter
by Ganesh Moorthy Jagadeesan, Kanagaraj Nallaiyagounder, Vijayakumar Madhaiyan and Qutubuddin Mohammed
Sustainability 2026, 18(8), 3674; https://doi.org/10.3390/su18083674 - 8 Apr 2026
Abstract
The increasing penetration of rooftop photovoltaic (RTPV) systems in low-voltage (LV) distribution networks introduces challenges such as voltage rise, reverse power flow, and reduced hosting capacity, thereby necessitating effective active power regulation (APR) in module-level micro-inverters. This paper proposes a dual-layer control framework for a 250 watt-peak (Wp) three-switch rooftop PV micro-inverter, integrating quantum-behaved particle swarm optimization with reinforcement learning (QPSO-RL) for accurate maximum power point tracking (MPPT) and a linear quadratic regulator (LQR) for reserve-aware APR. The QPSO-RL algorithm improves available-power estimation under varying irradiance, temperature, and partial-shading conditions, while the LQR-based controller ensures fast, well-damped, and grid-compliant power regulation. The proposed framework was developed and validated using MATLAB/Simulink 2024 for simulation studies and LabVIEW with NI myRIO 2022 for real-time hardware implementation. Both simulation and experimental results confirm that the proposed method achieves 99.5% MPPT accuracy, convergence within 20 ms, grid-injected current total harmonic distortion (THD) below 3%, and a near-unity power factor. In addition, the reserve-based regulation strategy improves feeder compliance and reduces converter stress, thereby supporting reliable rooftop PV integration. These results demonstrate that the proposed QPSO-RL + LQR framework offers a practical and intelligent solution for high-performance, grid-supportive rooftop PV micro-inverter applications. Full article
(This article belongs to the Section Energy Sustainability)
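The LQR layer described in the abstract can be illustrated with a minimal sketch. The discrete-time plant below is a hypothetical two-state stand-in, not the paper's 250 Wp inverter model; the feedback gain is obtained by iterating the discrete Riccati recursion:

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain K via fixed-point iteration of the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Hypothetical 2-state model (output power deviation, filter current) -- illustrative only
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Q = np.diag([10.0, 1.0])   # penalize power-tracking error heavily
R = np.array([[1.0]])      # penalize control effort

K = dlqr_gain(A, B, Q, R)
# Closed-loop matrix must have all eigenvalues inside the unit circle
print(max(abs(np.linalg.eigvals(A - B @ K))) < 1.0)
```

Weighting Q against R is how "fast, well-damped" regulation is traded off against actuator effort; the check at the end verifies the closed loop is stable.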
13 pages, 2283 KB  
Article
Study on RF Parameter Extraction Method for Novel Heterogeneous Integrated GaN Schottky Rectifiers Based on Hierarchical Reinforcement Learning
by Yi Wei, Li Huang, Ce Wang, Xiong Yin and Ce Wang
Electronics 2026, 15(7), 1537; https://doi.org/10.3390/electronics15071537 - 7 Apr 2026
Abstract
This study presents a heterogeneous integration micro-assembly process and circuit board packaging solution for GaN Schottky Barrier Diode (SBD) rectifiers, and innovatively constructs a hierarchical reinforcement learning strategy for optimizing SBD RF parameters. By establishing an optimization framework with the goal of efficiency in the load-input power two-dimensional space, a dual-layer optimization mechanism is employed: the high-level strategy dynamically selects optimization regions and parameter combinations, while the low-level strategy implements specific parameter adjustments. This approach effectively addresses the challenges of device parameter modeling and circuit design. Experimental data shows that the efficiency error for the SBD1 rectifier remains stable within 2%, and the average error for SBD2 is reduced to 1.5%. This method enables efficient and accurate optimization of RF parameters, providing a reliable technical pathway for the engineering application of Wireless Power Transmission systems. Full article
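The dual-layer mechanism (a high-level strategy that selects a region of the load/input-power plane, and a low-level strategy that refines parameters inside it) can be illustrated with a toy search. The efficiency surface below is invented for illustration and is not the measured SBD data:

```python
import numpy as np

def efficiency(load, power):
    """Hypothetical smooth efficiency surface with a single peak, standing in
    for a measured rectifier efficiency map over the load-power plane."""
    return np.exp(-((load - 50.0) ** 2) / 500.0 - ((power - 10.0) ** 2) / 20.0)

def dual_layer_search(f, bounds, splits=4, rounds=6):
    """High level: score sub-regions by their centre value and pick the best.
    Low level: shrink the search box into that region. Repeat."""
    (lo1, hi1), (lo2, hi2) = bounds
    for _ in range(rounds):
        xs = np.linspace(lo1, hi1, splits + 1)
        ys = np.linspace(lo2, hi2, splits + 1)
        best, box = -np.inf, None
        for i in range(splits):
            for j in range(splits):
                cx, cy = (xs[i] + xs[i + 1]) / 2, (ys[j] + ys[j + 1]) / 2
                v = f(cx, cy)
                if v > best:
                    best, box = v, (xs[i], xs[i + 1], ys[j], ys[j + 1])
        lo1, hi1, lo2, hi2 = box   # low-level refinement: zoom into the winner
    return (lo1 + hi1) / 2, (lo2 + hi2) / 2

load, power = dual_layer_search(efficiency, ((0.0, 100.0), (0.0, 20.0)))
print(round(load), round(power))  # converges near the true optimum (50, 10)
```

The paper's RL version replaces the fixed centre-scoring rule with learned high- and low-level policies, but the division of labour is the same.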

19 pages, 1920 KB  
Article
Knowledge Distillation Meets Reinforcement Learning: A Cluster-Driven Approach to Image Processing
by Titinunt Kitrungrotsakul, Yingying Xu and Preeyanuch Srichola
Sensors 2026, 26(1), 209; https://doi.org/10.3390/s26010209 - 28 Dec 2025
Viewed by 971
Abstract
Knowledge distillation (KD) enables the training of lightweight yet effective models, particularly in the visual domain. Meanwhile, reinforcement learning (RL) facilitates adaptive learning through environment-driven interactions, addressing the limitations of KD in handling dynamic and complex tasks. We propose a novel two-stage framework integrating Knowledge Distillation with Reinforcement Learning (KDRL) to enhance model adaptability to complex data distributions, such as remote sensing and medical imaging. In the first stage, supervised fine-tuning guides the student model using logit and feature-based distillation. The second stage refines the model via RL, leveraging confidence-based and cluster alignment rewards while dynamically reducing reliance on task loss. By combining the strengths of supervised knowledge distillation and reinforcement learning, KDRL provides a comprehensive approach to address the dual challenges of model efficiency and domain heterogeneity. A key innovation is the introduction of auxiliary layers within the student encoder to evaluate and reward the alignment of the characteristics with the teacher’s cluster centers, promoting robust feature learning. Our framework demonstrates superior performance and computational efficiency across diverse tasks, establishing a scalable design for efficient model training. Across remote sensing benchmarks, KDRL boosts the lightweight CLIP/ViT-B-32 student to 69.51% zero-shot accuracy on AID and 80.08% on RESISC45; achieves state-of-the-art cross-modal retrieval on RSITMD with 67.44% (I→T) and 74.76% (T→I) at R@10; and improves DIOR-RSVG visual-grounding precision to 64.21% at Pr@0.9. These gains matter in real deployments by reducing missed targets and speeding analyst search on resource-constrained platforms. Full article
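The logit-distillation term used in the first stage is standard Hinton-style distillation: a KL divergence between temperature-softened teacher and student distributions, scaled by T². A minimal sketch (the logits are made up):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=4.0):
    """KL(teacher || student) on temperature-softened distributions,
    with the conventional T^2 scaling so gradients stay comparable."""
    p = softmax(teacher_logits, T)    # soft teacher targets
    q = softmax(student_logits, T)    # student predictions
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

t = [5.0, 1.0, -2.0]
print(distillation_loss(t, t))                      # 0.0 -- identical logits
print(distillation_loss(t, [1.0, 5.0, -2.0]) > 0)   # True -- mismatch is penalized
```

The KDRL framework's second-stage cluster-alignment reward is a separate, learned component; only the first-stage loss is sketched here.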

35 pages, 8987 KB  
Article
A Method for UAV Path Planning Based on G-MAPONet Reinforcement Learning
by Jian Deng, Honghai Zhang, Yuetan Zhang, Mingzhuang Hua and Yaru Sun
Drones 2025, 9(12), 871; https://doi.org/10.3390/drones9120871 - 17 Dec 2025
Cited by 2 | Viewed by 722
Abstract
To address the issues of efficiency and robustness in UAV trajectory planning under complex environments, this paper proposes a Graph Multi-Head Attention Policy Optimization Network (G-MAPONet) algorithm that integrates Graph Attention (GAT), Multi-Head Attention (MHA), and Group Relative Policy Optimization (GRPO). The algorithm adopts a three-layer architecture of “GAT layer for local feature perception–MHA for global semantic reasoning–GRPO for policy optimization”, comprehensively achieving the goals of dynamic graph convolution quantization and global adaptive parallel decoupled dynamic strategy adjustment. Comparative experiments in multi-dimensional spatial environments demonstrate that the combined GAT–MHA mechanism is significantly superior to single attention mechanisms, which verifies the efficient representation capability of the dual-layer hybrid attention mechanism in capturing environmental features. Additionally, ablation experiments integrating the GAT, MHA, and GRPO algorithms confirm that the dual-layer fusion of GAT and MHA yields the greater improvement. Finally, comparisons with traditional reinforcement learning algorithms across multiple performance metrics show that the G-MAPONet algorithm reduces the number of convergence episodes (NCE) by an average of more than 19.14%, increases the average reward (AR) by over 16.20%, and successfully completes all dynamic path planning (PPTC) tasks; meanwhile, the algorithm’s reward values and obstacle avoidance success rate are significantly higher than those of other algorithms. Compared with the baseline APF algorithm, its reward value is improved by 8.66%, and the obstacle avoidance repetition rate is also enhanced, which further verifies the effectiveness of the improved G-MAPONet algorithm.
In summary, through the dual-layer complementary mode of GAT and MHA, the G-MAPONet algorithm overcomes the bottlenecks of traditional dynamic environment modeling and multi-scale optimization, enhances the decision-making capability of UAVs in unstructured environments, and provides a new technical solution for trajectory planning in intelligent logistics and distribution. Full article
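Both the GAT local layer and the MHA global layer rest on the same scaled dot-product attention primitive. A minimal single-head sketch with made-up waypoint features (not the paper's network):

```python
import numpy as np

def scaled_dot_attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # pairwise compatibility
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)            # row-wise softmax
    return w @ V, w

# Three hypothetical waypoint feature vectors (self-attention: Q = K = V)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
out, w = scaled_dot_attention(X, X, X)
print(w.shape, np.allclose(w.sum(axis=1), 1.0))      # (3, 3) True
```

Multi-head attention runs several such maps in parallel on learned projections; GAT restricts the softmax to graph neighbours.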

22 pages, 3542 KB  
Article
Dual Resource Scheduling Method of Production Equipment and Rail-Guided Vehicles Based on Proximal Policy Optimization Algorithm
by Nengqi Zhang, Bo Liu and Jian Zhang
Technologies 2025, 13(12), 573; https://doi.org/10.3390/technologies13120573 - 5 Dec 2025
Cited by 3 | Viewed by 2003
Abstract
In the context of intelligent manufacturing, the integrated scheduling problem of dual rail-guided vehicles (RGVs) and multiple parallel processing equipment in flexible manufacturing systems has gained increasing importance. This problem exhibits spatiotemporal coupling and dynamic constraint characteristics, making traditional optimization methods ineffective at finding optimal solutions. At the problem formulation level, the dual resource scheduling task is modeled as a mixed-integer optimization problem. An intelligent scheduling framework based on action mask-constrained Proximal Policy Optimization (PPO) deep reinforcement learning is proposed to achieve integrated decision-making for production equipment allocation and RGV path planning. The approach models the scheduling problem as a Markov Decision Process, designing a high-dimensional state space, along with a multi-discrete action space that integrates machine selection and RGV motion control. The framework employs a shared feature extraction layer and dual-head Actor-Critic network architecture, combined with parallel experience collection and synchronous parameter update mechanisms. In computational experiments across different scales, the proposed method achieves an average makespan reduction of 15–20% compared with numerical methods, while exhibiting excellent robustness under uncertain conditions including processing time fluctuations. Full article
(This article belongs to the Section Manufacturing Technology)
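The action-mask constraint mentioned above is typically implemented by pushing infeasible logits to −∞ before the softmax, so the PPO policy assigns them exactly zero probability. A minimal sketch (the feasibility pattern is invented, e.g. an RGV being busy):

```python
import numpy as np

def masked_policy(logits, mask):
    """Masked softmax: infeasible actions get probability exactly 0."""
    logits = np.where(mask, logits, -np.inf)
    z = logits - logits[mask].max()          # stabilize before exponentiation
    e = np.where(mask, np.exp(z), 0.0)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5, -1.0])
mask = np.array([True, False, True, True])   # action 1 infeasible this step
p = masked_policy(logits, mask)
print(p[1] == 0.0, abs(p.sum() - 1.0) < 1e-12)   # True True
```

Because the masked probabilities are exactly zero, the PPO ratio and entropy terms never see infeasible actions, which is what makes the constraint hard rather than soft.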

22 pages, 2226 KB  
Article
A Structure-Aware and Attention-Enhanced Explainable Learning Resource Recommendation Approach for Smart Education Within Smart Cities
by Tianxue Bu, Hao Zheng and Fen Zhao
Electronics 2025, 14(23), 4561; https://doi.org/10.3390/electronics14234561 - 21 Nov 2025
Viewed by 557
Abstract
With the rapid advancement in smart city infrastructures, the demand for personalized and explainable educational services has become increasingly prominent. To address the challenges of information overload and the lack of interpretability in traditional learning resource recommendation, this paper proposes a Structure-aware and Attention-enhanced explainable learning resource Recommendation approach (StAR) for smart education. StAR constructs a reinforcement learning framework grounded in a knowledge graph to model learner–resource interactions. First, a multi-head attention mechanism encodes path states and extracts key semantic features, enhancing the model’s ability to represent complex learning contexts. Then, a dual-layer action pruning strategy compresses the action space and improves reasoning efficiency. Finally, a structure-aware reward function guides the generation of semantically coherent and interpretable recommendation paths. Experiments on two real-world educational datasets, COCO and MoocCube, demonstrate that StAR outperforms several baseline models, achieving improvements of 14.2% and 12.6% in NDCG and Recall on COCO, and 5.2% and 4.2% on MoocCube, respectively. The results validate the effectiveness of StAR in enhancing recommendation accuracy, reasoning efficiency, and interpretability, offering a promising AI-enhanced solution for personalized learning in smart cities. Full article
(This article belongs to the Special Issue Advances in AI-Augmented E-Learning for Smart Cities)
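A dual-layer action-pruning strategy of the kind described (first keep the most promising relations, then the best entities within each kept relation) can be sketched as follows; the knowledge-graph fragment and scores are invented:

```python
def dual_layer_prune(actions, keep_rel=2, keep_ent=2):
    """actions: {relation: {entity: score}}. Layer 1 keeps the best relations
    (ranked by their best entity); layer 2 keeps the top entities inside each.
    The surviving pairs form the compressed action space for the RL agent."""
    best = sorted(actions, key=lambda r: max(actions[r].values()), reverse=True)[:keep_rel]
    return {r: dict(sorted(actions[r].items(), key=lambda kv: kv[1], reverse=True)[:keep_ent])
            for r in best}

acts = {"teaches": {"c1": 0.9, "c2": 0.4, "c3": 0.1},
        "cites":   {"p1": 0.8, "p2": 0.7},
        "rare":    {"x1": 0.2}}
pruned = dual_layer_prune(acts)
print(sorted(pruned) == ["cites", "teaches"], len(pruned["teaches"]) == 2)  # True True
```

In StAR the scores would come from the attention-encoded path state rather than fixed numbers, but the two-stage shrinkage of the action space is the same idea.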

27 pages, 4763 KB  
Article
Lightweight Reinforcement Learning for Priority-Aware Spectrum Management in Vehicular IoT Networks
by Adeel Iqbal, Ali Nauman and Tahir Khurshaid
Sensors 2025, 25(21), 6777; https://doi.org/10.3390/s25216777 - 5 Nov 2025
Cited by 1 | Viewed by 876
Abstract
The Vehicular Internet of Things (V-IoT) has emerged as a cornerstone of next-generation intelligent transportation systems (ITSs), enabling applications ranging from safety-critical collision avoidance and cooperative awareness to infotainment and fleet management. These heterogeneous services impose stringent quality-of-service (QoS) demands for latency, reliability, and fairness while competing for limited and dynamically varying spectrum resources. Conventional schedulers, such as round-robin or static priority queues, lack adaptability, whereas deep reinforcement learning (DRL) solutions, though powerful, remain computationally intensive and unsuitable for real-time roadside unit (RSU) deployment. This paper proposes a lightweight and interpretable reinforcement learning (RL)-based spectrum management framework for Vehicular Internet of Things (V-IoT) networks. Two enhanced Q-Learning variants are introduced: a Value-Prioritized Action Double Q-Learning with Constraints (VPADQ-C) algorithm that enforces reliability and blocking constraints through a Constrained Markov Decision Process (CMDP) with online primal–dual optimization, and a contextual Q-Learning with Upper Confidence Bound (Q-UCB) method that integrates uncertainty-aware exploration and a Success-Rate Prior (SRP) to accelerate convergence. A Risk-Aware Heuristic baseline is also designed as a transparent, low-complexity benchmark to illustrate the interpretability–performance trade-off between rule-based and learning-driven approaches. A comprehensive simulation framework incorporating heterogeneous traffic classes, physical-layer fading, and energy-consumption dynamics is developed to evaluate throughput, delay, blocking probability, fairness, and energy efficiency. The results demonstrate that the proposed methods consistently outperform conventional Q-Learning and Double Q-Learning methods. 
VPADQ-C achieves the highest energy efficiency (≈8.425×10⁷ bits/J) and reduces interruption probability by over 60%, while Q-UCB achieves the fastest convergence (within ≈190 episodes), the lowest blocking probability (≈0.0135), and the lowest mean delay (≈0.351 ms). Both schemes maintain fairness near 0.364, preserve throughput around 28 Mbps, and exhibit sublinear training-time scaling with O(1) per-update complexity and O(N²) overall runtime growth. Scalability analysis confirms that the proposed frameworks sustain URLLC-grade latency (<0.2 ms) and reliability under dense vehicular loads, validating their suitability for real-time, large-scale V-IoT deployments. Full article
(This article belongs to the Section Internet of Things)
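The Q-UCB idea, a tabular Q-learning update paired with an upper-confidence exploration bonus, can be sketched on a toy two-channel task. The success probabilities and parameters below are invented; this is not the paper's simulator:

```python
import numpy as np

def ucb_action(q, counts, t, c=1.5):
    """Uncertainty-aware selection: prefer high value OR rarely tried actions."""
    bonus = c * np.sqrt(np.log(t + 1) / (counts + 1e-9))
    return int(np.argmax(q + bonus))

rng = np.random.default_rng(0)
q = np.zeros(2)                 # Q-value per channel-assignment action
counts = np.zeros(2)
alpha = 0.1                     # learning rate
success_prob = np.array([0.3, 0.8])   # channel 1 succeeds more often
for t in range(2000):
    a = ucb_action(q, counts, t)
    reward = float(rng.random() < success_prob[a])
    counts[a] += 1
    q[a] += alpha * (reward - q[a])   # single-state Q-learning update
print(counts[1] > counts[0])          # True: the better channel dominates
```

The bonus shrinks as an action's count grows, so exploration concentrates where the value estimate is still uncertain, which is what accelerates convergence relative to plain epsilon-greedy.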

25 pages, 5185 KB  
Article
Q-Learning-Based Multi-Strategy Topology Particle Swarm Optimization Algorithm
by Xiaoxi Hao, Shenwei Wang, Xiaotong Liu, Tianlei Wang, Guangfan Qiu and Zhiqiang Zeng
Algorithms 2025, 18(11), 672; https://doi.org/10.3390/a18110672 - 22 Oct 2025
Viewed by 827
Abstract
In response to the issues of premature convergence and insufficient parameter control in Particle Swarm Optimization (PSO) for high-dimensional complex optimization problems, this paper proposes a Multi-Strategy Topological Particle Swarm Optimization algorithm (MSTPSO). The method builds upon a reinforcement learning-driven topological switching framework, where Q-learning dynamically selects among fully informed topology, small-world topology, and exemplar-set topology to achieve an adaptive balance between global exploration and local exploitation. Furthermore, the algorithm integrates differential evolution perturbations and a global optimal restart strategy based on stagnation detection, together with a dual-layer experience replay mechanism to enhance population diversity at multiple levels and strengthen the ability to escape local optima. Experimental results on 29 CEC2017 benchmark functions, compared against various PSO variants and other advanced evolutionary algorithms, show that MSTPSO achieves superior fitness performance and exhibits stronger stability on high-dimensional and complex functions. Ablation studies further validate the critical contribution of the Q-learning-based multi-topology control and stagnation detection mechanisms to performance improvement. Overall, MSTPSO demonstrates significant advantages in convergence accuracy and global search capability. Full article
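The differential-evolution perturbation that MSTPSO couples with the swarm is the classic DE/rand/1 mutation, v_i = x_r1 + F·(x_r2 − x_r3). A minimal sketch on a made-up 2-D population (illustrative, not the paper's operator schedule):

```python
import numpy as np

def de_perturb(pop, F=0.5, rng=None):
    """DE/rand/1 mutation: each new vector is a base individual plus a scaled
    difference of two others, injecting diversity without a gradient."""
    rng = rng or np.random.default_rng(0)
    n = len(pop)
    out = np.empty_like(pop)
    for i in range(n):
        # pick three distinct individuals, none equal to i
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i], 3, replace=False)
        out[i] = pop[r1] + F * (pop[r2] - pop[r3])
    return out

pop = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [0.5, 2.0]])
print(de_perturb(pop).shape)  # (4, 2)
```

In the full algorithm this perturbation fires on stagnation detection, giving particles a way out of local optima that plain velocity updates lack.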

21 pages, 2522 KB  
Article
A Reinforcement Learning-Based Adaptive Grey Wolf Optimizer for Simultaneous Arrival in Manned/Unmanned Aerial Vehicle Dynamic Cooperative Trajectory Planning
by Wei Jia, Lei Lv, Ruizhi Duan, Tianye Sun and Wei Sun
Drones 2025, 9(10), 723; https://doi.org/10.3390/drones9100723 - 17 Oct 2025
Cited by 2 | Viewed by 1363
Abstract
Addressing the challenge of high-precision time-coordinated path planning for manned and unmanned aerial vehicle (UAV) clusters operating in complex dynamic environments during missions like high-level autonomous coordination, this paper proposes a reinforcement learning-based Adaptive Grey Wolf Optimizer (RL-GWO) method. We formulate a comprehensive multi-objective cost function integrating total flight distance, mission time, time synchronization error, and collision penalties. To solve this model, we design multiple improved GWO strategies and employ a Q-Learning framework for adaptive strategy selection. The RL-GWO algorithm is embedded within a dual-layer “global planning + dynamic replanning” framework. Simulation results demonstrate excellent convergence and robustness, achieving second-level time synchronization accuracy while satisfying complex constraints. In dynamic scenarios, the method rapidly generates safe evasion paths while maintaining cluster coordination, validating its practical value for heterogeneous UAV operations. Full article
(This article belongs to the Special Issue Path Planning, Trajectory Tracking and Guidance for UAVs: 3rd Edition)
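The core Grey Wolf Optimizer update that RL-GWO adapts moves each wolf toward the three current leaders (alpha, beta, delta) while a coefficient decays from 2 to 0. A baseline, non-adaptive sketch on the sphere function (the paper's cost function and improved strategies are not reproduced here):

```python
import numpy as np

def gwo_minimize(f, dim=2, wolves=20, iters=200, lb=-5.0, ub=5.0, seed=0):
    """Plain GWO: wolves average three leader-guided moves; 'a' decaying
    2 -> 0 shifts the swarm from exploration to exploitation."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (wolves, dim))
    for t in range(iters):
        fit = np.apply_along_axis(f, 1, X)
        leaders = X[np.argsort(fit)[:3]]          # alpha, beta, delta (copies)
        a = 2.0 * (1 - t / iters)
        for i in range(wolves):
            pos = np.zeros(dim)
            for L in leaders:
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                pos += L - A * np.abs(C * L - X[i])
            X[i] = np.clip(pos / 3.0, lb, ub)
    return X[np.argmin(np.apply_along_axis(f, 1, X))]

best = gwo_minimize(lambda x: np.sum(x ** 2))     # sphere function, optimum at 0
print(float(np.sum(best ** 2)) < 1e-2)
```

RL-GWO's contribution is letting Q-learning pick among several improved update strategies instead of always using this one rule.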

29 pages, 9032 KB  
Article
Multi-Agent Deep Reinforcement Learning for Joint Task Offloading and Resource Allocation in IIoT with Dynamic Priorities
by Yongze Ma, Yanqing Zhao, Yi Hu, Xingyu He and Sifang Feng
Sensors 2025, 25(19), 6160; https://doi.org/10.3390/s25196160 - 4 Oct 2025
Cited by 3 | Viewed by 3171
Abstract
The rapid growth of Industrial Internet of Things (IIoT) terminals has resulted in tasks exhibiting increased concurrency, heterogeneous resource demands, and dynamic priorities, significantly increasing the complexity of task scheduling in edge computing. Cloud–edge–end collaborative computing leverages cross-layer task offloading to alleviate edge node resource contention and improve task scheduling efficiency. However, existing methods generally neglect the joint optimization of task offloading, resource allocation, and priority adaptation, making it difficult to balance task execution and resource utilization under resource-constrained and competitive conditions. To address this, this paper proposes a two-stage dynamic-priority-aware joint task offloading and resource allocation method (DPTORA). In the first stage, an improved Multi-Agent Proximal Policy Optimization (MAPPO) algorithm integrated with a Priority-Gated Attention Module (PGAM) enhances the robustness and accuracy of offloading strategies under dynamic priorities; in the second stage, the resource allocation problem is formulated as a single-objective convex optimization task and solved globally using the Lagrangian dual method. Simulation results show that DPTORA significantly outperforms existing multi-agent reinforcement learning baselines in terms of task latency, energy consumption, and the task completion rate. Full article
(This article belongs to the Section Internet of Things)
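Solving a convex resource-allocation stage by the Lagrangian dual method reduces, for a sum-rate objective, to water-filling with a bisection on the dual price. A sketch under that assumed objective (the channel gains and power budget are invented; the paper's exact objective may differ):

```python
import numpy as np

def waterfill(gains, budget, iters=60):
    """Dual solution of max sum log(1 + g_i p_i) s.t. sum p_i <= P:
    p_i = max(0, 1/lam - 1/g_i), with the price lam found by bisection."""
    lo, hi = 1e-9, 1e9
    for _ in range(iters):
        lam = (lo + hi) / 2
        p = np.maximum(0.0, 1.0 / lam - 1.0 / gains)
        if p.sum() > budget:
            lo = lam        # overspending -> raise the price
        else:
            hi = lam
    return p

p = waterfill(np.array([4.0, 1.0, 0.25]), budget=3.0)
print(abs(p.sum() - 3.0) < 1e-6, p[0] > p[2])   # True True
```

The monotonicity of total power in the price is what guarantees the bisection converges to the global optimum of the convex problem.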

27 pages, 5220 KB  
Article
Ship Motion Control Methods in Confined and Curved Waterways Combining Good Seamanship
by Liwen Huang and Jiahao Chen
J. Mar. Sci. Eng. 2025, 13(9), 1800; https://doi.org/10.3390/jmse13091800 - 17 Sep 2025
Cited by 1 | Viewed by 1142
Abstract
For the motion control of ships in confined and curved waterways, from broad coastal channels to narrow river bends, conventional methods often struggle to ensure both tracking accuracy and navigational safety. A key deficiency is the inability of standard algorithms to incorporate the nuanced principles of good seamanship. To address this, a novel, hierarchical adaptive control framework is proposed. The core novelty of this framework lies in its versatile and adaptive guidance rules, which embed maritime practice into the control loop for different navigating scenarios. In general maritime channels with wind and current, these rules ensure robust, high-fidelity route tracking. For the most challenging inland river curved channels, they are further enhanced to generate a strategic, non-centerline trajectory that replicates the crucial inland navigational practice of “holding high and taking low”. This is complemented by a reinforcement learning-based strategy at the control layer, which performs real-time tuning of PID gains to adapt to the vessel’s dynamics. The framework’s dual capabilities were systematically validated. The core adaptive algorithms proved effective for robust control in curved channels under wind and current disturbances. Furthermore, the full framework, including the seamanship-informed strategy, demonstrated superior performance in the most complex inland river scenarios. Compared to a conventional controller, the proposed method reduced the peak cross-track error by over 40% and increased the minimum safety margin from the bank by more than 49% under a strong 3 m/s cross-current. An effective solution for motion control is thus provided, bridging the gap between modern control theory and the context-dependent expertise of practical pilotage. Full article
(This article belongs to the Section Ocean Engineering)
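The control layer's idea, PID gains retuned online by an RL agent, sits on top of an ordinary discrete PID loop. A minimal sketch with fixed, hand-picked gains on a toy one-dimensional cross-track plant (the vessel model and the RL tuner are not reproduced):

```python
class PID:
    """Discrete PID on cross-track error; in the paper's framework the three
    gains would be adjusted online by the RL agent as dynamics change."""
    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev = 0.0, 0.0

    def step(self, error):
        self.integral += error * self.dt
        deriv = (error - self.prev) / self.dt
        self.prev = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Toy plant: position moves by exactly the commanded correction each step
pid, pos = PID(kp=0.5, ki=0.05, kd=0.2), 5.0   # start 5 m off the track
for _ in range(60):
    pos += pid.step(0.0 - pos)                  # setpoint: zero cross-track error
print(abs(pos) < 0.1)
```

On a real hull the plant has inertia and disturbance terms, which is exactly why fixed gains degrade and online retuning helps.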

38 pages, 6012 KB  
Article
Adaptive Spectrum Management in Optical WSNs for Real-Time Data Transmission and Fault Tolerance
by Mohammed Alwakeel
Mathematics 2025, 13(17), 2715; https://doi.org/10.3390/math13172715 - 23 Aug 2025
Cited by 2 | Viewed by 1170
Abstract
Optical wireless sensor networks (OWSNs) offer promising capabilities for high-speed, energy-efficient communication, particularly in mission-critical environments such as industrial automation, healthcare monitoring, and smart buildings. However, dynamic spectrum management and fault tolerance remain key challenges in ensuring reliable and timely data transmission. This paper proposes an adaptive spectrum management framework (ASMF) that addresses these challenges through a mathematically grounded and implementation-driven approach. The ASMF formulates the spectrum allocation problem as a constrained Markov decision process and leverages a dual-layer optimization strategy combining Lyapunov drift-plus-penalty for queue stability with deep reinforcement learning for adaptive long-term decision making. Additionally, ASMF integrates a hybrid fault-tolerant mechanism using LSTM-based link failure prediction and lightweight recovery logic, achieving up to 83% prediction accuracy. Experimental evaluations using real-world datasets from industrial, healthcare, and smart infrastructure scenarios demonstrate that ASMF reduces critical traffic latency by 37%, improves reliability by 42% under fault conditions, and enhances energy efficiency by 22.6% compared with state-of-the-art methods. The system also maintains a 99.94% packet delivery ratio for critical traffic and achieves 69.7% faster recovery after link failures. These results confirm the effectiveness of ASMF as a robust and scalable solution for adaptive spectrum management in dynamic, fault-prone OWSN environments. Full article
(This article belongs to the Special Issue Advances in Mobile Network and Intelligent Communication)
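The Lyapunov drift-plus-penalty rule referenced above picks, each slot, the action that minimizes V·penalty − Q·service, so long queues override the penalty term and stay stable. A one-slot sketch with invented queue states (not the ASMF action set):

```python
import numpy as np

def drift_plus_penalty_schedule(queues, rates, costs, V=10.0):
    """One slot of drift-plus-penalty: score each candidate action by
    V * cost - Q * service and take the minimizer. Larger V favours low
    cost; larger backlogs Q favour service (queue stability)."""
    scores = V * costs - queues * rates
    return int(np.argmin(scores))

queues = np.array([50.0, 5.0])   # heavily backlogged queue vs light queue
rates = np.array([1.0, 1.0])
costs = np.array([1.0, 0.2])     # serving queue 0 costs more energy
print(drift_plus_penalty_schedule(queues, rates, costs))  # 0: backlog dominates
```

The parameter V is the knob trading average penalty against average queue length, which is the "queue stability vs. long-term objective" split the abstract describes.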

32 pages, 2102 KB  
Article
D* Lite and Transformer-Enhanced SAC: A Hybrid Reinforcement Learning Framework for COLREGs-Compliant Autonomous Navigation in Dynamic Maritime Environments
by Tianqing Chen, Yamei Lan, Yichen Li, Jiesen Zhang and Yijie Yin
J. Mar. Sci. Eng. 2025, 13(8), 1498; https://doi.org/10.3390/jmse13081498 - 4 Aug 2025
Viewed by 1747
Abstract
Autonomous navigation in dynamic, multi-vessel maritime environments presents a formidable challenge, demanding strict adherence to the International Regulations for Preventing Collisions at Sea (COLREGs). Conventional approaches often struggle with the dual imperatives of global path optimality and local reactive safety, and they frequently rely on simplistic state representations that fail to capture complex spatio-temporal interactions among vessels. We introduce a novel hybrid reinforcement learning framework, D* Lite + Transformer-Enhanced Soft Actor-Critic (TE-SAC), to overcome these limitations. This hierarchical framework synergizes the strengths of global and local planning. An enhanced D* Lite algorithm generates efficient, long-horizon reference paths at the global level. At the local level, the TE-SAC agent performs COLREGs-compliant tactical maneuvering. The core innovation resides in TE-SAC’s synergistic state encoder, which uniquely combines a Graph Neural Network (GNN) to model the instantaneous spatial topology of vessel encounters with a Transformer encoder to capture long-range temporal dependencies and infer vessel intent. Comprehensive simulations demonstrate the framework’s superior performance, validating the strengths of both planning layers. At the local level, our TE-SAC agent exhibits remarkable tactical intelligence, achieving an exceptional 98.7% COLREGs compliance rate and reducing energy consumption by 15–20% through smoother, more decisive maneuvers. This high-quality local control, guided by the efficient global paths from the enhanced D* Lite algorithm, culminates in a 10–32 percentage point improvement in overall task success rates compared to state-of-the-art baselines. This work presents a robust, verifiable, and efficient framework. 
By demonstrating superior performance and compliance with rules in high-fidelity simulations, it lays a crucial foundation for advancing the practical application of intelligent autonomous navigation systems. Full article
(This article belongs to the Special Issue Motion Control and Path Planning of Marine Vehicles—3rd Edition)

27 pages, 8957 KB  
Article
DFAN: Single Image Super-Resolution Using Stationary Wavelet-Based Dual Frequency Adaptation Network
by Gyu-Il Kim and Jaesung Lee
Symmetry 2025, 17(8), 1175; https://doi.org/10.3390/sym17081175 - 23 Jul 2025
Cited by 1 | Viewed by 2070
Abstract
Single image super-resolution is the inverse problem of reconstructing a high-resolution image from its low-resolution counterpart. Although recent Transformer-based architectures leverage global context integration to improve reconstruction quality, they often overlook frequency-specific characteristics, resulting in the loss of high-frequency information. To address this limitation, we propose the Dual Frequency Adaptive Network (DFAN). DFAN first decomposes the input into low- and high-frequency components via the Stationary Wavelet Transform. In the low-frequency branch, Swin Transformer layers restore global structures and color consistency. In contrast, the high-frequency branch features a dedicated module that combines Directional Convolution with Residual Dense Blocks, precisely reinforcing edges and textures. A frequency fusion module then adaptively merges these complementary features using depthwise and pointwise convolutions, achieving a balanced reconstruction. During training, we introduce a frequency-aware multi-term loss alongside the standard pixel-wise loss to explicitly encourage high-frequency preservation. Extensive experiments on the Set5, Set14, BSD100, Urban100, and Manga109 benchmarks show that DFAN achieves gains of up to +0.64 dB in peak signal-to-noise ratio (PSNR), +0.01 in structural similarity index measure (SSIM), and −0.01 in learned perceptual image patch similarity (LPIPS) over the strongest frequency-domain baselines, while also delivering visibly sharper textures and cleaner edges. By unifying spatial and frequency-domain advantages, DFAN effectively mitigates high-frequency degradation and enhances SISR performance. Full article
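DFAN's first step, splitting the input into low- and high-frequency components with the Stationary Wavelet Transform, can be illustrated in one dimension with a single level of an undecimated Haar transform. This is a minimal sketch under our own simplifications: the paper applies a full 2-D SWT to images, and the function names here are ours:

```python
import numpy as np

def haar_swt_level1(x):
    """One level of an undecimated (stationary) Haar wavelet transform on a
    1-D signal with circular boundary handling. Unlike the decimated DWT,
    both output bands keep the input's length."""
    shifted = np.roll(x, 1)
    low = (x + shifted) / 2.0    # approximation (low-frequency) band
    high = (x - shifted) / 2.0   # detail (high-frequency) band
    return low, high

def haar_iswt_level1(low, high):
    """Inverse transform: for this Haar pair the bands simply sum back
    to the original signal."""
    return low + high
```

The perfect-reconstruction property is what lets the two branches be processed independently (Transformer layers on the low band, convolutional blocks on the high band) and then fused without losing information to decimation.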
(This article belongs to the Section Computer)
35 pages, 2297 KB  
Article
Secure Cooperative Dual-RIS-Aided V2V Communication: An Evolutionary Transformer–GRU Framework for Secrecy Rate Maximization in Vehicular Networks
by Elnaz Bashir, Francisco Hernando-Gallego, Diego Martín and Farzaneh Shoushtari
World Electr. Veh. J. 2025, 16(7), 396; https://doi.org/10.3390/wevj16070396 - 14 Jul 2025
Cited by 1 | Viewed by 1099
Abstract
The growing demand for reliable and secure vehicle-to-vehicle (V2V) communication in next-generation intelligent transportation systems has accelerated the adoption of reconfigurable intelligent surfaces (RIS) as a means of enhancing link quality, spectral efficiency, and physical layer security. In this paper, we investigate the problem of secrecy rate maximization in a cooperative dual-RIS-aided V2V communication network, where two cascaded RISs are deployed to collaboratively assist with secure data transmission between mobile vehicular nodes in the presence of eavesdroppers. To address the inherent complexity of time-varying wireless channels, we propose a novel evolutionary transformer-gated recurrent unit (Evo-Transformer-GRU) framework that jointly learns temporal channel patterns and optimizes the RIS reflection coefficients, beam-forming vectors, and cooperative communication strategies. Our model integrates the sequence modeling strength of GRUs with the global attention mechanism of transformer encoders, enabling the efficient representation of time-series channel behavior and long-range dependencies. To further enhance convergence and secrecy performance, we incorporate an improved gray wolf optimizer (IGWO) to adaptively regulate the model's hyper-parameters and fine-tune the RIS phase shifts, resulting in a more stable and optimized learning process. Extensive simulations demonstrate the superiority of the proposed framework compared to existing baselines, such as transformer, bidirectional encoder representations from transformers (BERT), deep reinforcement learning (DRL), long short-term memory (LSTM), and GRU models. Specifically, our method achieves up to a 32.6% improvement in average secrecy rate and a 28.4% lower convergence time under varying channel conditions and eavesdropper locations.
In addition to secrecy rate improvements, the proposed model achieved a root mean square error (RMSE) of 0.05, a coefficient of determination (R²) score of 0.96, and a mean absolute percentage error (MAPE) of just 0.73%, outperforming all baseline methods in prediction accuracy and robustness. Furthermore, Evo-Transformer-GRU demonstrated rapid convergence within 100 epochs and the lowest variance across multiple runs. Full article
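The objective being maximized here, the secrecy rate, is the legitimate link's capacity minus the eavesdropper's, floored at zero. A toy sketch of that objective and of RIS phase selection for a single reflecting element follows; the paper jointly optimizes many elements across two cascaded RISs with the learned Evo-Transformer-GRU model, so the exhaustive one-element grid search and the function names below are purely illustrative assumptions:

```python
import numpy as np

def secrecy_rate(snr_legit, snr_eve):
    """Physical-layer secrecy rate in bits/s/Hz: the legitimate link's
    Shannon capacity minus the eavesdropper's, clipped at zero."""
    return max(0.0, np.log2(1.0 + snr_legit) - np.log2(1.0 + snr_eve))

def best_phase_grid(h_direct, h_ris, n_grid=360):
    """Toy single-element RIS: grid-search the reflection phase theta that
    maximizes the combined legitimate-channel gain |h_d + h_r * e^{j theta}|^2.
    The optimum co-phases the reflected path with the direct path."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    gains = np.abs(h_direct + h_ris * np.exp(1j * thetas)) ** 2
    return thetas[np.argmax(gains)]
```

With `h_direct = 1` and `h_ris = 1j`, the search correctly picks a phase near 3π/2 so that the reflected term rotates onto the direct path; scaling this alignment idea to hundreds of elements and time-varying channels is what motivates the learned approach.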