Search Results (5,663)

Search Parameters:
Keywords = adaptive feature enhancement

21 pages, 1699 KB  
Article
Three-Way Multimodal Learning with Severely Missing Modalities
by Hanrui Wang, Yu Fang, Xin Wang and Fan Min
Information 2026, 17(4), 384; https://doi.org/10.3390/info17040384 (registering DOI) - 19 Apr 2026
Abstract
Missing modalities remain a major obstacle to the real-world deployment of multimodal learning systems, as incomplete inputs can substantially degrade model performance. Existing methods often suffer from biased imputation under high missing rates and lack uncertainty-aware, differentiated processing. Inspired by three-way decision, a framework for handling uncertainty by adding a deferment option to acceptance and rejection, we propose three-way multimodal learning with severely missing modalities (3WML-SMM), a novel framework that introduces a three-way decision mechanism into both missing-modality imputation and feature regularization for the first time. Specifically, 3WML-SMM treats variance not merely as a descriptive measure of uncertainty, but as a decision signal for adaptive processing. Based on this idea, the framework incorporates (1) a variance-guided three-way imputation strategy with accept–delay–reject decisions to reduce unreliable reconstruction when only a limited number of complete samples are available and (2) a dimension-wise adaptive feature enhancement module that performs fine-grained regularization according to perturbation uncertainty. Experiments on the CMU Multimodal Opinion Sentiment Intensity (CMU-MOSI) and Multimodal Internet Movie Database (MM-IMDb) datasets show that 3WML-SMM consistently outperforms representative baselines, including reconstruction-based methods, complete-input multimodal methods, and missing-modality-specific methods under severe missing-modality settings, with statistically significant improvements over the multimodal learning with severely missing modality (SMIL) baseline (p < 0.05). These results demonstrate the effectiveness of the proposed framework, even in extreme settings where only 10% of the text modality is available. Full article
(This article belongs to the Section Artificial Intelligence)
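The accept–delay–reject rule this abstract describes can be sketched with a simple variance threshold. This is an illustrative toy, not the authors' implementation: the thresholds `alpha`/`beta` and the use of repeated imputations as the variance source are assumptions.

```python
# Hypothetical sketch of a variance-guided three-way (accept/delay/reject)
# decision for an imputed feature. Thresholds alpha/beta are illustrative.
from statistics import pvariance

def three_way_decision(imputations, alpha=0.05, beta=0.5):
    """Classify an imputed value by the variance of repeated imputations."""
    v = pvariance(imputations)
    if v <= alpha:   # low uncertainty: accept the imputation
        return "accept"
    if v >= beta:    # high uncertainty: reject, fall back to available modalities
        return "reject"
    return "delay"   # intermediate: defer, e.g. down-weight during training

print(three_way_decision([0.10, 0.11, 0.09]))  # tight samples -> accept
print(three_way_decision([0.1, 0.9, 0.5]))
```

The deferment branch is what distinguishes a three-way decision from a plain accept/reject threshold.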
16 pages, 3021 KB  
Article
Chasing the Pareto Frontier: Adaptive Economic–Environmental Microgrid Dispatch via a Lévy–Triangular Walk Dung Beetle Optimizer
by Haoda Yang, Wei Hong Lim and Jun-Jiat Tiang
Sustainability 2026, 18(8), 4041; https://doi.org/10.3390/su18084041 (registering DOI) - 18 Apr 2026
Abstract
With the rapid penetration of renewable energy, grid-connected microgrids have become a cornerstone of low-carbon power systems, while also posing major challenges for coordinated scheduling under coupled economic and environmental goals. The resulting dispatch problem is highly nonlinear and high-dimensional, featuring tight operational constraints and conflicting cost–emission trade-offs that often undermine the efficiency and reliability of conventional optimization methods, thereby limiting overall economic productivity. This paper presents an adaptive economic–environmental dispatch framework for grid-connected microgrids formulated as a multi-objective optimization problem that simultaneously minimizes operating cost and environmental protection cost. To navigate the rugged and constrained search landscape, we develop an enhanced metaheuristic termed the Lévy–Triangular Walk Dung Beetle Optimizer (LTWDBO). The LTWDBO integrates (i) chaotic population initialization to improve diversity and feasibility coverage, (ii) a geometry-inspired triangular walk operator to strengthen local exploitation, and (iii) an adaptive Lévy-flight strategy to boost global exploration, achieving a robust exploration–exploitation balance over the entire optimization process, representing a process innovation in metaheuristic-driven dispatch optimization. The proposed method is validated on a representative grid-connected microgrid comprising photovoltaic generation, wind turbines, micro gas turbines, and battery energy storage. Comparative experiments against representative baselines (DBO, WOA, TDBO, and NSGA-II) demonstrate that the LTWDBO achieves consistently better solution quality. Our LTWDBO attains the lowest optimal objective value of 255,718.34 Yuan, compared with 357,702.68 Yuan (DBO), 347,369.28 Yuan (TDBO), and 3,854,359.36 Yuan (WOA). The LTWDBO also yields the best average objective value of 673,842.24 Yuan, an improvement of more than 1,001,813.10 Yuan over the DBO. Full article
(This article belongs to the Section Energy Sustainability)
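The Lévy-flight exploration move mentioned in point (iii) is typically generated with Mantegna's algorithm. A minimal sketch follows; the stability index `beta` and the step scale are generic defaults, not the paper's settings.

```python
# Illustrative Levy-flight perturbation (Mantegna's algorithm) of the kind
# metaheuristics like the LTWDBO use for global exploration.
import math, random

def levy_step(beta=1.5):
    """Draw one heavy-tailed Levy step length via Mantegna's method."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma_u)
    v = random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def levy_move(position, scale=0.01):
    """Perturb a candidate solution with independent Levy steps per dimension."""
    return [x + scale * levy_step() for x in position]

random.seed(0)
print(levy_move([1.0, 2.0, 3.0]))
```

Occasional very long steps let the optimizer escape local optima while most steps stay small, which is the exploration property the abstract appeals to.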
25 pages, 11345 KB  
Article
Uncertainty-Aware Cross-Domain Few-Shot Scene Classification from Remote Sensing Imagery
by Zifan Ning, Can Li, He Chen, Guangyao Zhou, Shanghang Zhang, Lianlin Li and Yin Zhuang
Remote Sens. 2026, 18(8), 1233; https://doi.org/10.3390/rs18081233 (registering DOI) - 18 Apr 2026
Abstract
Cross-Domain Few-Shot Scene Classification (CDFSSC) aims to transfer knowledge from a source domain to a target domain for few-shot classification tasks, and is essential for remote sensing applications involving diverse platforms and dynamic environments. However, distribution discrepancies and category misalignment across domains often introduce high predictive uncertainty, significantly degrading model performance. To address these challenges, an uncertainty-aware cross-domain (UACD) framework is proposed to enhance model reliability by systematically mining uncertainty-related information. Specifically, in the cross-domain training process, a feature-decision consistency regularization (FDCR) structure is designed to stabilize cross-domain training by enforcing consistency at both feature and decision levels. Furthermore, an uncertainty-aware knowledge mining (UKM) policy is introduced to effectively exploit high-uncertainty target samples, mitigating the negative impact of unreliable pseudo-labels and improving representation learning. In the few-shot adaptation stage, an uncertainty-aware predictor is developed to enhance adaptability and decision-making in target tasks. Extensive experiments on 12 cross-domain scenarios demonstrate that the proposed UACD framework consistently achieves superior or competitive performance, with strong robustness and generalization capability across diverse CDFSSC tasks. Full article
(This article belongs to the Special Issue Advances in Remote Sensing Image Target Detection and Recognition)
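Predictive entropy is the usual proxy for the sample-level uncertainty the UKM policy exploits. The sketch below is a generic illustration (the threshold `tau` and the confident/uncertain split are assumptions, not the authors' policy).

```python
# Minimal sketch: partition target-domain samples by predictive entropy so
# high-uncertainty samples can be treated differently from pseudo-labelable ones.
import math

def entropy(probs):
    """Shannon entropy of a softmax output, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def split_by_uncertainty(batch_probs, tau=0.5):
    """Return indices of confident and uncertain samples."""
    confident, uncertain = [], []
    for i, probs in enumerate(batch_probs):
        (confident if entropy(probs) < tau else uncertain).append(i)
    return confident, uncertain

conf_idx, unc_idx = split_by_uncertainty([[0.95, 0.03, 0.02], [0.4, 0.35, 0.25]])
print(conf_idx, unc_idx)  # the peaked prediction is confident, the flat one is not
```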
29 pages, 2377 KB  
Article
Multi-Scale Spectral Recurrent Network Based on Random Fourier Features for Wind Speed Forecasting
by Eder Arley Leon-Gomez, Víctor Elvira, Jorge Iván Montes-Monsalve, Andrés Marino Álvarez-Meza, Alvaro Orozco-Gutierrez and German Castellanos-Dominguez
Technologies 2026, 14(4), 238; https://doi.org/10.3390/technologies14040238 (registering DOI) - 18 Apr 2026
Abstract
Accurate wind speed forecasting is critical for reliable wind-power integration, yet it remains challenging due to the strongly non-stationary and inherently multi-scale nature of atmospheric processes. While deep learning models—such as LSTM, GRU, and Transformer architectures—achieve competitive short- and medium-term performance, they frequently suffer from spectral bias, hyperparameter sensitivity, and reduced generalization under heterogeneous operating regimes. To address these limitations, we propose a multi-scale spectral–recurrent framework, termed RFF-RNN, which integrates multi-band Random Fourier Feature (RFF) encodings with parameterizable recurrent backbones. A key innovation of our approach is the deliberate relaxation of strict shift-invariance constraints; by jointly optimizing spectral frequencies, phase biases, and bandwidth scales alongside the neural weights, the framework dynamically shapes a fully data-driven spectral embedding. To ensure robust adaptation, we employ a two-stage optimization strategy combining gradient-based inner-loop learning with outer-loop Bayesian hyperparameter tuning. Our extensive evaluations on a controlled synthetic benchmark and six geographically diverse real-world wind datasets (spanning the USA, China, and the Netherlands) demonstrate the superiority of the proposed framework. Statistical validation via the Friedman test confirms that RFF-enhanced models—particularly RFF-GRU and RFF-LSTM—systematically outperform standard recurrent networks and state-of-the-art Transformer architectures (Autoformer and FEDformer). The proposed approach yields significantly lower error metrics (MAE and RMSE) and higher explained variance (R2), while exhibiting remarkable resilience against error accumulation at extended forecasting horizons. Full article
(This article belongs to the Special Issue AI for Smart Engineering Systems)
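The Random Fourier Feature encoding underlying RFF-RNN follows the standard construction z(x) = sqrt(2/D) cos(Wx + b). In the paper the frequencies, phases, and bandwidths are trained jointly with the network; in this sketch they are fixed random draws.

```python
# Standard random Fourier feature map: frequencies W ~ N(0, 1/bandwidth^2),
# phases b ~ U(0, 2*pi). Dimensions below are illustrative.
import math, random

def rff_encode(x, W, b):
    """Map an input vector x to D random Fourier features."""
    D = len(b)
    return [math.sqrt(2.0 / D) * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bi)
            for w, bi in zip(W, b)]

random.seed(1)
d, D, bandwidth = 3, 8, 1.0      # input dim, feature dim, kernel scale
W = [[random.gauss(0, 1.0 / bandwidth) for _ in range(d)] for _ in range(D)]
b = [random.uniform(0, 2 * math.pi) for _ in range(D)]
z = rff_encode([0.5, -1.0, 2.0], W, b)
print(len(z))  # 8
```

Making W, b, and the bandwidth trainable, as the abstract describes, relaxes the shift-invariance that fixed draws would impose.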
27 pages, 8200 KB  
Article
Few-Shot Bearing Fault Diagnosis Based on Multi-Layer Feature Fusion and Similarity Measurement
by Changyong Deng, Dawei Dong, Sipeng Wang, Hongsheng Zhang and Li Feng
Lubricants 2026, 14(4), 172; https://doi.org/10.3390/lubricants14040172 - 17 Apr 2026
Abstract
The running reliability of rolling bearings depends on the effective lubrication state, and poor lubrication will induce abnormal vibration. Therefore, vibration-based fault diagnosis is an important means to evaluate the health of bearings through vibration characteristics. However, the lack of fault samples in actual working conditions seriously restricts the generalization ability and accuracy of an intelligent diagnosis model. A novel few-shot diagnosis method integrating multi-layer feature fusion and adaptive similarity measurement is proposed. This method adopts a meta-learning framework to simulate sample scarcity through numerous N-way K-shot diagnostic tasks. An efficient feature extractor with a cross-task feature stitching mechanism is designed to fuse features from support and query sets. To overcome the limitation of fixed-distance metrics in existing meta-learners, a learnable similarity scheduler adaptively generates optimal pseudo-distance functions. In particular, a multi-layer feature fusion strategy is introduced to compute adaptive similarities at multiple network depths, which significantly enhances feature robustness against operational variations. Experimental results demonstrate that the method achieves stable diagnostic accuracy above 90% under extremely few-shot conditions and maintains over 90% accuracy when transferring from laboratory-simulated faults to natural operational faults, validating its strong potential for practical industrial applications where annotated fault data is scarce. Full article
(This article belongs to the Special Issue Advances in Wear Life Prediction of Bearings)
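For context, this is the fixed-metric few-shot baseline the learnable similarity scheduler improves on: classify a query by its distance to each class prototype (the mean of the K support embeddings). The vectors are toy examples, not features from the paper.

```python
# Prototype-based N-way K-shot classification with a fixed squared-Euclidean
# metric; the paper replaces this fixed metric with a learned similarity.
def prototype(support):
    """Mean of the support embeddings for one class."""
    n = len(support)
    return [sum(v[i] for v in support) / n for i in range(len(support[0]))]

def classify(query, prototypes):
    """Index of the nearest prototype under squared Euclidean distance."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(prototypes)), key=lambda c: dist2(query, prototypes[c]))

protos = [prototype([[0.0, 0.0], [0.2, 0.0]]),   # class 0 support set
          prototype([[1.0, 1.0], [1.2, 1.0]])]   # class 1 support set
print(classify([0.9, 0.9], protos))  # 1
```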
28 pages, 720 KB  
Article
Wavelet-Based and MAML-Driven Framework for Enhanced Few-Shot Malware Classification
by Abdullah Almuqrin, Ibrahim Mutambik and Majed Abusharhah
Appl. Sci. 2026, 16(8), 3921; https://doi.org/10.3390/app16083921 - 17 Apr 2026
Abstract
Traditional malware classification approaches primarily address fixed sets of well-studied malware types and therefore struggle to accommodate the continual emergence of novel or previously unseen malware strains. While visualization-based strategies have shown promise in few-shot malware classification, existing methods often produce representations with limited semantic richness. In parallel, few-shot learning models frequently converge to suboptimal solutions, limiting their ability to generalize effectively to new classes. To address these challenges, we propose MetaWave, a unified framework that jointly optimizes both data representation and model learning for few-shot malware classification. Rather than treating feature representation and learning strategy as largely independent stages, MetaWave is formulated as an explicit representation–adaptation integration framework that combines multi-view malware encoding with meta-learning-based optimization. At the data level, we propose a Wavelet Transform-based Malware Representation method that leverages multi-scale frequency analysis and complementary views to generate semantically enriched representations. At the model level, we adopt Model-Agnostic Meta-Learning (MAML) to optimize model initialization for rapid adaptation to unseen tasks under limited data conditions. Extensive experiments are conducted on two benchmark datasets, EMBER and Malicia, under a 5-way 5-shot protocol with disjoint class splits to ensure evaluation on previously unseen malware families. The proposed framework achieves superior performance, reaching 97.8% accuracy on EMBER and 96.2% on Malicia, consistently outperforming state-of-the-art methods. These results indicate that jointly enhancing representation quality and model adaptability can improve classification accuracy and unseen-family performance under the evaluated 5-way 5-shot protocol. Overall, MetaWave provides an effective framework for few-shot malware classification and offers a promising basis for detecting emerging malware under limited-data conditions, while robustness to adversarial perturbation, obfuscation, and polymorphism remains to be validated through dedicated future evaluation. Full article
(This article belongs to the Special Issue Approaches to Cyber Attacks and Malware Detection)
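The MAML optimization the abstract adopts can be illustrated on a toy problem. This first-order, scalar-parameter sketch (tasks are quadratics with different optima, learning rates are arbitrary) shows only the inner-adapt/outer-update structure, not the authors' full setup.

```python
# Toy first-order MAML: meta-learn a scalar init theta that adapts quickly.
# Each task t has loss_t(w) = (w - target_t)^2, so grad = 2 * (w - target).
def grad(w, target):
    return 2.0 * (w - target)

def maml(targets, theta=0.0, inner_lr=0.1, meta_lr=0.05, steps=200):
    for _ in range(steps):
        meta_grad = 0.0
        for t in targets:
            w = theta - inner_lr * grad(theta, t)   # inner adaptation step
            meta_grad += grad(w, t)                 # first-order outer gradient
        theta -= meta_lr * meta_grad / len(targets)
    return theta

theta = maml([1.0, 3.0])   # two tasks pulling toward 1 and 3
print(theta)               # converges near the task mean, 2.0
```

For these symmetric quadratic tasks the meta-initialization settles at the point from which one gradient step reaches every task optimum fastest on average.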
32 pages, 8881 KB  
Article
WS-R-IR Adapter: A Multimodal RGB–Infrared Remote Sensing Framework for Water Surface Object Detection
by Bin Xue, Qiang Yu, Kun Ding, Mengxin Jiang, Ying Wang, Shiming Xiang and Chunhong Pan
Remote Sens. 2026, 18(8), 1220; https://doi.org/10.3390/rs18081220 - 17 Apr 2026
Abstract
Water surface object detection in shipborne remote sensing is challenged by unstable wave-induced backgrounds, illumination variations, extreme scale changes with tiny objects, and limited annotations. Multimodal RGB–infrared (RGB–IR) sensing leverages complementary visible and infrared cues to enhance robustness. However, most existing RGB–IR methods rely on backbones pretrained on limited-scale data, which constrain their performance for complex water surface scenes. In this work, we propose the WS-R-IR Adapter, a parameter-efficient vision foundation model (VFM)-based framework for shipborne RGB–IR object detection. Instead of full fine-tuning, it adapts frozen VFM representations via lightweight task-specific designs. The WS-R-IR Adapter includes (1) a water scene domain-aware modal adapter that progressively guides frozen backbone features with evolving semantic cues, (2) a parallel multi-scale structural perception module for fine-grained, scale-sensitive modeling, (3) an adaptive RGB–IR feature modulation fusion strategy, and (4) a resolution-aligned context semantic and structural detail fusion module. Moreover, we introduce an object-guided global-to-local registration framework to address dynamic cross-modal misalignment, and construct modality-aligned PoLaRIS-DET and ASV-RI-DET datasets that cover diverse water surface scenes. On the two datasets, the proposed method achieves mAP@0.5:0.95 scores of 74.2% and 50.2%, respectively, significantly outperforming existing methods with only 11.9M additional parameters. These results demonstrate the effectiveness of parameter-efficient VFM adaptation for multimodal water surface remote sensing. Full article
(This article belongs to the Section Remote Sensing Image Processing)
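The parameter-efficient adaptation pattern behind frameworks like this one is the bottleneck adapter: down-project, non-linearity, up-project, residual add, with the backbone frozen. This is a sketch of the general technique, not the paper's modules; the dimensions and weight values are illustrative.

```python
# Bottleneck adapter on a feature vector: only W_down and W_up would be
# trained; the backbone that produced x stays frozen.
def adapter(x, W_down, W_up):
    def matvec(W, v):
        return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]
    h = [max(0.0, v) for v in matvec(W_down, x)]   # ReLU bottleneck
    up = matvec(W_up, h)
    return [xi + ui for xi, ui in zip(x, up)]      # residual connection

W_down = [[0.5, 0.0, 0.0, 0.5]]                    # 4 -> 1 bottleneck
W_up = [[1.0], [0.0], [0.0], [-1.0]]               # 1 -> 4
print(adapter([1.0, 2.0, 3.0, 1.0], W_down, W_up))  # [2.0, 2.0, 3.0, 0.0]
```

Because the bottleneck is tiny relative to the backbone width, the trainable parameter count stays small, which is how such frameworks keep the added-parameter budget low.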
34 pages, 8222 KB  
Article
DPF-DETR: Enhancing Drone Image Detection with Density Perception and Multi-Scale Feature Fusion
by Sidi Lai, Zhensong Li, Xiaotan Wei, Yutong Wang and Shiliang Zhu
Remote Sens. 2026, 18(8), 1221; https://doi.org/10.3390/rs18081221 - 17 Apr 2026
Abstract
The DPF-DETR model has been designed to address the challenges encountered in object detection within drone imagery, particularly in scenarios involving significant target scale variations, dense targets, and complex backgrounds. To overcome the limitations of traditional object detection methods, the Density Sensing Mechanism (DSM) and Adaptive Density Map Loss (AdaptiveDM Loss) have been incorporated into the model to provide fine-grained supervision signals. The DSM optimizes the query selection mechanism by utilizing density maps, enabling the number of queries to be adaptively adjusted based on the distribution density of targets, thus improving detection accuracy in dense regions. Furthermore, the precision of the model in detecting dense targets is enhanced by AdaptiveDM Loss, which dynamically adjusts the weights for object localization and classification. Multi-scale feature fusion capabilities are also improved by the Multi-Scale Feature Fusion Network (MSFFN) and the Selective Feature Integration Module (SFIM). The MSFFN refines the fusion of features, which improves the detection of targets across various scales, particularly in complex scenes. Additionally, SFIM enhances the detection accuracy for small targets and complex backgrounds by integrating low-level spatial features with high-level semantic information. The Context-Sensitive Feature Interaction Module (CSFIM) further optimizes multi-scale feature fusion through context-guided interactions, bridging the semantic gap between features of different scales, thus improving the robustness of the model in dense scenarios. Experimental results have shown that DPF-DETR outperforms traditional models and state-of-the-art detection methods across multiple datasets, demonstrating superior robustness and accuracy, especially in dense target detection and complex background scenarios. Full article
28 pages, 3181 KB  
Article
An Attention-Augmented CNN–LSTM Framework for Reconstructing Transient Temperature Fields of Turbine Blades from Sparse Measurements
by Yingtao Chen, Langlang Liu, Dan Sun, Haida Liu and Junjie Yang
Aerospace 2026, 13(4), 381; https://doi.org/10.3390/aerospace13040381 - 17 Apr 2026
Abstract
Accurately predicting the temperature field of turbine blades is of great significance for evaluating the thermal reliability and service life of high-temperature components in aero-engines. However, due to the high computational cost of numerical simulations and the limitations imposed by complex geometric structures and harsh operating environments, experimental measurements can usually only obtain sparse sensor data, making the acquisition of complete temperature distributions still challenging. Therefore, reconstructing the complete temperature field under sparse measurement conditions has become a key research issue in turbine thermal analysis. To address this problem, this paper proposes an attention-enhanced CNN–LSTM framework for reconstructing transient turbine blade temperature fields from sparse data. The model combines the spatial feature extraction capability of Convolutional Neural Networks (CNNs) with the time-series modeling capability of Long Short-Term Memory networks (LSTM). An SE channel attention module is introduced in the CNN feature extraction stage to achieve adaptive recalibration of channel features, and a temporal attention mechanism is incorporated after the LSTM layer to highlight key transient thermal features. A multi-condition temperature field dataset was constructed by conducting Computational Fluid Dynamics (CFD) simulations on low-pressure turbine guide vanes, and the model was experimentally validated through thermal shock tests. The results show that the proposed model can accurately reconstruct the spatial distribution and transient evolution of the turbine blade temperature field under sparse measurement conditions. Under different operating conditions, the predicted temperature fields are highly consistent with the CFD results, with the maximum reconstruction error remaining below 19 °C. Error distribution analysis indicates that the model has stable reconstruction performance and good generalization ability. Full article
(This article belongs to the Section Aeronautics)
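The SE channel attention module this framework adds to its CNN stage squeezes each channel to a scalar, gates it, and rescales the channel. The sketch below replaces the usual two-FC-layer excitation with a plain sigmoid for brevity; shapes and inputs are illustrative.

```python
# Simplified squeeze-and-excitation (SE) channel attention: per-channel
# global average pooling, a sigmoid gate, then channel rescaling.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_block(feature_maps):
    """feature_maps: list of channels, each a 2-D list of activations."""
    # Squeeze: per-channel global average pooling.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excite: a gate in (0, 1) per channel (a real SE block uses two FC layers).
    gates = [sigmoid(s) for s in squeezed]
    # Rescale: weight each channel by its gate.
    return [[[g * v for v in row] for row in ch]
            for g, ch in zip(gates, feature_maps)]

out = se_block([[[1.0, 1.0], [1.0, 1.0]],       # strong channel kept
                [[-4.0, -4.0], [-4.0, -4.0]]])  # weak channel suppressed
print(out[0][0][0], out[1][0][0])
```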
21 pages, 6052 KB  
Article
An Uncertainty-Aware Hybrid CNN–Transformer Network for Accurate Water Body Extraction from High-Resolution Remote Sensing Images in Complex Scenarios
by Qiao Xu, Huifan Wang, Pengcheng Zhong, Yao Xiao, Yuxin Jiang, Yan Meng, Qi Zhang, Cheng Zeng, Yangjie Sun and Yuxuan Liu
Remote Sens. 2026, 18(8), 1210; https://doi.org/10.3390/rs18081210 - 17 Apr 2026
Abstract
Timely and accurate monitoring of surface water dynamics via remote sensing is critical, given water resources’ importance. However, accurate water body delineation based on high-resolution remotely sensed imagery is still challenging due to the complexity of water bodies’ boundaries and the diversity of their shapes and sizes, which can lead to boundary ambiguity and varying degrees of confusion with near-water vegetation in water body maps. To address this challenge, we introduce an uncertainty-aware hybrid CNN–Transformer model for delineating water bodies using remotely sensed imagery. In our designed network, a multi-scale transformer (MST) module is first designed to effectively model and refine the multi-scale global semantic dependencies of water bodies. Subsequently, an uncertainty-guided multi-scale information fusion (MSIF) module is constructed to extract water body mapping information from the multi-scale features output by the MST module and fuse them adaptively. Across different scales, the extracted features differ in their ability to distinguish water bodies from non-water bodies and in their levels of uncertainty. Consequently, during the adaptive fusion of multi-scale water body information in the MSIF module, the mapping uncertainty is quantified and suppressed to minimize its impact, thus yielding enhanced precision in water body delineation. Ultimately, a comprehensive loss function is designed for model optimization to generate the final water body map. Furthermore, to promote the development of water body segmentation models, this study also presents the HBD_Water water body sample dataset, which contains 44 multispectral, 5000 × 5000-pixel images at 2 m spatial resolution, and will be released on the LuojiaSET platform soon. Finally, to verify the proposed model and its constituent MST and MSIF modules, extensive water mapping experiments were performed on three datasets. The experimental results substantiate their effectiveness. Furthermore, comparative experimental results demonstrate that the proposed model performs better at water body extraction than advanced networks including TransUNet, DeeplabV3+, and ADCNN. Full article
18 pages, 9280 KB  
Article
MSResBiMamba: A Deep Cascaded Architecture for EEG Signal Decoding
by Ruiwen Jiang, Yi Zhou and Jingxiang Zhang
Mathematics 2026, 14(8), 1348; https://doi.org/10.3390/math14081348 - 17 Apr 2026
Abstract
Electroencephalogram (EEG) signals serve as the core information carrier for brain–computer interfaces (BCIs); however, their highly non-stationary nature, extremely low signal-to-noise ratio, and significant inter-individual variability pose considerable challenges for signal decoding. Existing deep learning methods struggle to strike a balance between multi-scale, fine-grained feature extraction and efficient long-range temporal modeling. To overcome this limitation, this study proposes a novel deep cascaded architecture, MSResBiMamba, which deeply integrates multi-scale spatiotemporal feature learning with cutting-edge long-sequence modeling techniques. The model first utilizes an enhanced multi-scale spatiotemporal convolutional network (MS-CNN) combined with an SE-channel attention mechanism to adaptively extract local multi-band features and dynamically suppress redundant artefacts. Subsequently, it innovatively introduces an enhanced bidirectional Mamba (Bi-Mamba) module to efficiently capture non-causal long-range temporal dependencies with linear computational complexity, whilst cascading multi-head self-attention mechanisms to establish global higher-order feature interactions. Extensive experiments on the BCI Competition IV-2a dataset demonstrate that MSResBiMamba achieves outstanding classification performance in multi-class motor imagery tasks, significantly outperforming traditional methods and existing state-of-the-art neural networks. Ablation studies and t-SNE visualisations further confirm the model’s robustness in feature decoupling and cross-subject applications, providing a high-precision, high-efficiency decoding solution for BCI systems. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
21 pages, 12845 KB  
Article
VETA-CLIP: Lightweight Video Adaptation with Efficient Spatio-Temporal Attention and Variation Loss
by Jing Huang and Jiaxin Liao
Electronics 2026, 15(8), 1701; https://doi.org/10.3390/electronics15081701 - 17 Apr 2026
Abstract
Full fine-tuning of large-scale vision-language models for video action recognition incurs prohibitive computational cost and often degrades pre-trained spatial representations. To address this, we propose VETA-CLIP, a Video Efficient Temporal Adaptation framework that enhances temporal modeling while preserving cross-modal alignment. By incorporating lightweight adapters into a frozen backbone, VETA-CLIP introduces only 3.55M trainable parameters (a 98% reduction compared to full fine-tuning). Our approach features two key innovations: (1) an Efficient Spatio-Temporal Attention (ESTA) mechanism with a parameter-free boundary replication temporal shift (BRTS) module, which explicitly decouples spatial and temporal attention heads to capture inter-frame dynamics while minimizing disruption to the pre-trained spatial representations; and (2) a novel Variation Loss that maximizes both local inter-frame differences and global temporal variance, encouraging the model to focus on action-related changes rather than static backgrounds. Extensive experiments on HMDB-51, UCF-101, and Something-Something v2 demonstrate that VETA-CLIP achieves competitive performance across zero-shot, base-to-novel, and few-shot protocols, and remains competitive on the Kinetics-400 dataset. Notably, our eight-frame variant requires only 4.7 GB of peak GPU memory and 2.47 ms of inference per video, demonstrating exceptional computational efficiency alongside consistent accuracy gains. Full article
(This article belongs to the Section Artificial Intelligence)
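The boundary-replication temporal shift idea is parameter-free: one slice of channels is shifted forward in time, one backward, and clip boundaries replicate the first/last frame instead of zero-padding. This sketch uses illustrative shapes and a shift width of one channel per direction; it shows the mechanism, not the VETA-CLIP module itself.

```python
# Temporal shift with boundary replication: each frame's first `fold` channels
# come from frame t+1, the next `fold` from frame t-1, the rest stay in place.
def brts_shift(clip, fold=1):
    """clip: list of frames, each a list of channel values."""
    T = len(clip)
    out = []
    for t in range(T):
        prev = clip[max(t - 1, 0)]       # boundary replication, not zeros
        nxt = clip[min(t + 1, T - 1)]
        frame = (nxt[:fold]              # shifted backward in time
                 + prev[fold:2 * fold]   # shifted forward in time
                 + clip[t][2 * fold:])   # untouched channels
        out.append(frame)
    return out

clip = [[1, 10, 100], [2, 20, 200], [3, 30, 300]]
print(brts_shift(clip))  # [[2, 10, 100], [3, 10, 200], [3, 20, 300]]
```

The shift mixes information across neighbouring frames before attention is applied, giving temporal context at zero parameter cost.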
29 pages, 1924 KB  
Article
Incipient Fault Diagnosis in Power Cables Based on WOA-CEEMDAN and a TCN-BiLSTM Network with Multi-Head Attention
by Yuhua Xing and Yaolong Yin
Appl. Sci. 2026, 16(8), 3908; https://doi.org/10.3390/app16083908 - 17 Apr 2026
Abstract
Incipient faults in power cables are difficult to diagnose because their transient signatures are weak, non-stationary, and easily masked by background noise, while labeled real-world samples are often scarce. To address these challenges, this paper proposes an offline diagnosis framework that integrates Whale Optimization Algorithm (WOA)-guided CEEMDAN with a TCN-BiLSTM network with multi-head attention. The proposed method has three main features. First, WOA is explicitly mapped to the CEEMDAN parameter optimization problem and is used to adaptively optimize the noise amplitude and ensemble number, thereby improving decomposition quality and enhancing weak fault-related components. Second, the optimized intrinsic mode functions are reconstructed into a multi-channel representation that preserves complementary fault information across different frequency bands. Third, a hybrid deep architecture combining Temporal Convolutional Networks, Bidirectional Long Short-Term Memory, and multi-head attention is designed to jointly capture local transient characteristics, bidirectional temporal dependencies, and fault-sensitive feature interactions. Experimental results on both PSCAD/EMTDC simulation data and real-world measured data show that the optimized WOA-CEEMDAN achieves superior decomposition performance, with an RMSE of 0.097 and an SNR of 8.42 dB. On the real-world test dataset, the proposed framework achieves 96.00% accuracy, 97.25% precision, 96.84% recall, an F1-score of 0.970, and an AUC of 0.97, outperforming several representative baseline models. Additional ablation, noise-robustness, small-sample, confusion-matrix, and cross-cable validation results further demonstrate the effectiveness and robustness of the proposed framework for incipient cable fault diagnosis. Full article
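The WOA-guided parameter search can be sketched with a generic, minimal Whale Optimization Algorithm. This is not the paper's implementation: the CEEMDAN decomposition-quality objective (e.g., reconstruction RMSE/SNR) is replaced by a caller-supplied cost function, and the population size, iteration count, and bounds for the noise amplitude and ensemble number are illustrative placeholders.

```python
import numpy as np

def woa_optimize(fitness, bounds, n_whales=10, n_iter=30, rng=None):
    """Minimal WOA sketch: minimize `fitness` over box `bounds`.

    fitness: callable mapping a parameter vector (e.g., [noise_amplitude,
             ensemble_number]) to a cost, lower is better.
    bounds:  (low, high) sequences, one entry per parameter.
    """
    rng = np.random.default_rng(rng)
    low, high = (np.asarray(b, dtype=float) for b in bounds)
    dim = low.size
    X = rng.uniform(low, high, size=(n_whales, dim))
    costs = np.array([fitness(x) for x in X])
    best, best_cost = X[costs.argmin()].copy(), costs.min()
    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # linearly decreasing coefficient
        for i in range(n_whales):
            r1, r2 = rng.random(dim), rng.random(dim)
            A, C = 2 * a * r1 - a, 2 * r2
            if rng.random() < 0.5:
                if np.all(np.abs(A) < 1):  # encircle the best (exploitation)
                    X[i] = best - A * np.abs(C * best - X[i])
                else:                      # follow a random whale (exploration)
                    rand = X[rng.integers(n_whales)]
                    X[i] = rand - A * np.abs(C * rand - X[i])
            else:                          # spiral bubble-net update
                l = rng.uniform(-1, 1, dim)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], low, high)
            c = fitness(X[i])
            if c < best_cost:
                best, best_cost = X[i].copy(), c
    return best, best_cost
```

In the framework above, `fitness` would run CEEMDAN with the candidate noise amplitude and ensemble number and return a decomposition-quality score; here any cost function can stand in for testing.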
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
26 pages, 8974 KB  
Article
Deep-MiSR: Multi-Scale Convolution and Attention-Enhanced DeepLabV3+ for Brain Tumor Segmentation in MRI
by Md Parvej Mosharaf, Jie Su and Jing Zhang
Appl. Sci. 2026, 16(8), 3900; https://doi.org/10.3390/app16083900 (registering DOI) - 17 Apr 2026
Abstract
Accurate brain tumor segmentation in magnetic resonance imaging (MRI) is essential for diagnosis, treatment planning, and therapy monitoring. Conventional deep learning models often struggle with large variations in tumor shape, size, and contrast, as well as severe foreground–background imbalance. To address these challenges, this study presents Deep-MiSR, an enhanced encoder–decoder framework built upon DeepLabV3+ with a MobileNetV2 backbone, tailored for single-modality contrast-enhanced T1-weighted (T1CE) MRI segmentation. Three complementary components are integrated into the architecture: mixed depthwise convolution (MixConv) with heterogeneous kernels within the atrous spatial pyramid pooling module for multi-scale feature aggregation, a squeeze-and-excitation block for adaptive channel recalibration, and R-Drop regularization that enforces prediction consistency via symmetric Kullback–Leibler divergence. The model was evaluated on 3064 T1CE slices from 233 patients drawn from the publicly available Nanfang Hospital brain MRI dataset. Deep-MiSR achieved a Dice similarity coefficient of 0.9281, a mean intersection-over-union of 0.8738, a precision of 0.8839, and a 95th-percentile Hausdorff distance of 7.69 mm, demonstrating consistent improvements over both the DeepLabV3+ baseline and all prior methods evaluated on the same data. Ablation studies confirmed that each component contributes independently, with R-Drop providing the largest individual gain. These findings demonstrate that combining multi-scale convolution, channel attention, and consistency regularization constitutes an effective and computationally practical strategy for robust single-modality brain tumor segmentation. Full article
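The R-Drop consistency term used above has a standard form: the symmetric Kullback–Leibler divergence between the predictive distributions of two stochastic (dropout) forward passes over the same batch. A minimal NumPy sketch follows; the `(N, C)` logit layout and the 0.5 averaging of the two KL directions are conventional choices, not details confirmed by the abstract.

```python
import numpy as np

def r_drop_consistency(logits_a, logits_b):
    """Symmetric KL divergence between two dropout forward passes (R-Drop).

    logits_a, logits_b: (N, C) logits from two stochastic passes on the
    same inputs. Returns the batch-averaged symmetric KL.
    """
    def log_softmax(z):
        # Numerically stable log-softmax over the class axis.
        z = z - z.max(axis=1, keepdims=True)
        return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

    log_p, log_q = log_softmax(logits_a), log_softmax(logits_b)
    p, q = np.exp(log_p), np.exp(log_q)
    kl_pq = (p * (log_p - log_q)).sum(axis=1)
    kl_qp = (q * (log_q - log_p)).sum(axis=1)
    return 0.5 * (kl_pq + kl_qp).mean()
```

During training this term is added, with a weighting coefficient, to the usual segmentation loss, penalizing the model when the two dropout passes disagree.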
(This article belongs to the Special Issue Advances in Deep Learning-Based Medical Image Analysis: 2nd Edition)
24 pages, 2463 KB  
Article
Optimized Reconfigurable Intelligent Surfaces Configuration in Multiuser Wireless Networks via Fuzzy-Enhanced Pied Kingfisher Strategy
by Mona Gafar, Shahenda Sarhan, Abdullah M. Shaheen and Ahmed S. Alwakeel
Technologies 2026, 14(4), 237; https://doi.org/10.3390/technologies14040237 - 17 Apr 2026
Abstract
This paper proposes a new fuzzified multi-objective wireless communication optimization model that jointly optimizes the quantity and placement of Reconfigurable Intelligent Surfaces (RISs). To meet realistic deployment constraints such as non-overlap and acceptable placement locations, the model aims to decrease the number of deployed RISs while raising the achievable rate. The Modified Pied Kingfisher Optimization Algorithm (MPKOA) is proposed to solve this intricate optimization problem. MPKOA features several significant improvements over the traditional Pied Kingfisher Optimization Algorithm (PKOA), such as energy-based motion control, adaptive subgrouping, flock cooperation, and memory-driven re-perching. These techniques speed up convergence, improve solution precision, reduce computation time, and balance exploration and exploitation. Extensive comparisons show that MPKOA performs better than standard PKOA, the Enhanced PKOA (EPKO), Differential Evolution (DE), the Grey Wolf Optimizer (GWO), and other existing algorithms. According to simulation data, MPKOA achieves up to 20% higher optimization values and 30% faster convergence. In addition, the proposed MPKOA reduces computational complexity and runtime by about 50% compared to standard PKOA-based approaches, since it requires only a single fitness evaluation per iteration. This enables the deployment of fewer RISs while still achieving higher communication rates. In multiuser wireless systems, MPKOA offers a robust and effective approach to RIS placement optimization, helping to boost capacity and enable more energy-efficient 6G communication networks. Full article
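A common way to fuzzify a two-objective trade-off like this (maximize achievable rate, minimize deployed RIS count) is to map each objective to a [0, 1] membership value and aggregate with the min operator. The sketch below is purely illustrative: the linear membership shapes and the rate/RIS bounds are hypothetical assumptions, not the membership functions used in the paper.

```python
import numpy as np

def fuzzy_fitness(rate, n_ris, rate_min, rate_max, ris_min, ris_max):
    """Max-min fuzzy aggregation of two objectives (illustrative shapes).

    rate:  achievable sum rate (higher is better)  -> ascending membership
    n_ris: number of deployed RISs (fewer is better) -> descending membership
    Returns overall satisfaction in [0, 1]; an optimizer maximizes it.
    """
    mu_rate = np.clip((rate - rate_min) / (rate_max - rate_min), 0.0, 1.0)
    mu_ris = np.clip((ris_max - n_ris) / (ris_max - ris_min), 0.0, 1.0)
    # Overall satisfaction is the worst-satisfied objective (min operator),
    # so improving the weaker objective always raises the fitness.
    return min(mu_rate, mu_ris)
```

A metaheuristic such as MPKOA would then search over RIS counts and positions to maximize this scalar satisfaction, subject to the non-overlap and placement constraints.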