Search Results (12,811)

Search Parameters:
Keywords = convolutional neural networks (CNNs)

34 pages, 3672 KB  
Article
Explainable Text-Based Depression and Suicide Risk Prediction from Social Media Using Deep Learning and Graph Neural Networks
by Atiq Ur Rehman, Abid Iqbal, Ali Sayyed, Zaheer Aslam, Muhammad Ismail Mohmand and Ghassan Husnain
Healthcare 2026, 14(11), 1440; https://doi.org/10.3390/healthcare14111440 - 22 May 2026
Abstract
Objectives: The rise in the frequency of mental health concerns (depression and suicide) expressed on social media calls for reliable, explainable, and efficient computational methods for mental health surveillance. In this paper, we propose an interpretable framework for text-based detection of post- and community-level mental health risk on social media. Methods: The framework combines (i) Secretary Bird Optimization (SBO) for feature selection of informative linguistic and psychological features, (ii) a BERT (Bidirectional Encoder Representations from Transformers)—CNN (Convolutional Neural Network) model for post-level reasoning, and (iii) a Graph Neural Network (GraphSAGE) for community-level reasoning. The graph is estimated based on semantic similarity between posts and author relations, instead of social interactions (e.g., mentions, replies) between authors. We use SHAP and LIME for model interpretability, uncertainty, and calibration analysis to evaluate the trustworthiness of predictions. Results: The model delivers 93.1% accuracy, 0.91 F1-score, and 0.944 ROC-AUC on the eRisk and CLPsych datasets using a strict user-disjoint validation strategy. SBO lowers the number of features by about 38%, leading to better generalization. The graph-based model enables improved learning of post and user representations by capturing relational dependencies. Conclusions: Our approach offers an explainable and robust means of detecting mental health risk from text. Graph-based representations of semantic and authorship interactions enable community-level analyses, while interpretability and uncertainty estimation facilitate possible human-in-the-loop decision-making. This research does not explicitly consider a human-in-the-loop experiment. Full article
32 pages, 13846 KB  
Article
A Dual-Branch CNN with Depthwise Separable Fusion for Hyperspectral Image Classification
by Teng Li, Yunhua Cao, Xing Guo, Shikun Zhang and Lining Yan
Remote Sens. 2026, 18(11), 1685; https://doi.org/10.3390/rs18111685 - 22 May 2026
Abstract
Hyperspectral image classification remains challenging because robust recognition requires preserving spatial–spectral coupling, extracting complementary spectral and spatial cues, and fusing heterogeneous features without excessive redundancy. To address this issue, a dual-branch convolutional neural network (CNN) with depthwise separable fusion, termed DSFA-CNN, is developed. The network combines a 3D convolution branch for coupled spatial–spectral representation learning with a 1D+2D branch for efficient spectral and spatial modeling. A convolutional block attention module (CBAM) is introduced in the decomposed branch to emphasize informative spectral responses and salient spatial regions, and a depthwise separable fusion module is used to improve cross-branch integration while limiting fusion-stage redundancy and the risk of overfitting. Experiments on Indian Pines, University of Pavia, Salinas, and Houston2013 yield overall accuracies of 95.62 ± 0.13%, 99.25 ± 0.13%, 99.89 ± 0.11%, and 97.62 ± 0.23%, respectively. The gains are most evident on the more challenging Indian Pines and Houston2013 scenes. Ablation results show that the dual-branch design provides complementary information, whereas CBAM and the fusion module further improve representation selectivity and feature integration. Computational cost analysis further indicates that DSFA-CNN achieves a more favorable trade-off between classification accuracy and computational efficiency than several recent competitive baselines. These results demonstrate the effectiveness of parallel coupled–decomposed modeling with efficient feature fusion for robust hyperspectral image classification. Full article
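The abstract above credits the depthwise separable fusion module with limiting fusion-stage redundancy. A minimal sketch of why that kind of layer is lighter compares weight counts for a standard convolution and a depthwise separable one. The channel sizes and kernel size are hypothetical and are not taken from the DSFA-CNN paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical fusion layer: 128 concatenated channels in, 64 out, 3 x 3 kernel.
c_in, c_out, k = 128, 64, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(sep / std, 3))  # 73728 9344 0.127
```

The ratio equals 1/c_out + 1/k², so under these assumed sizes the separable layer needs roughly an eighth of the weights.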
38 pages, 5684 KB  
Review
Vision and Multimodal Perception for Autonomous Driving: Deep Learning Architectures, Tasks, and Sensor Fusion
by Savvas Nikolaidis and Paraskevas Koukaras
World Electr. Veh. J. 2026, 17(6), 277; https://doi.org/10.3390/wevj17060277 - 22 May 2026
Abstract
The rapid development of autonomous vehicles is based mainly on their ability to accurately perceive their environment, where artificial intelligence and computer vision act as the core of environmental perception. In this regard, deep learning-based perception architectures have revolutionized the field of autonomous driving. However, as the use of single sensors fails to ensure reliability in complex scenarios, multimodal sensor fusion has become an essential part of modern deep learning architectures. In this context, covering the literature from 2020 to 2025, we analyze the transition from traditional Convolutional Neural Networks (CNNs) to modern Vision Transformers (ViTs) and explore data fusion design methodologies at various processing levels. In addition, significant limitations related to adverse weather conditions and dynamic environments, computational resources and overall quality and management of data are identified. The conducted comparative analysis indicates that vision-transformer and multimodal fusion methodologies provide higher accuracy in perception tasks but at the cost of increased computational requirements and sensor synchronization challenges. Finally, it becomes clear that achieving full autonomy requires further research in subjects such as collaborative perception, unsupervised domain adaptation and the creation of lightweight models, thus offering a roadmap for future developments. Full article
(This article belongs to the Section Automated and Connected Vehicles)
16 pages, 2294 KB  
Article
A Quantitative Evaluation of Gradient-Based Visual Explainability Methods Across Convolutional and Transformer-Based Vision Models
by Angelos Tzirtis, Christos Troussas, Akrivi Krouska, Phivos Mylonas and Cleo Sgouropoulou
Electronics 2026, 15(11), 2241; https://doi.org/10.3390/electronics15112241 - 22 May 2026
Abstract
Explainable Artificial Intelligence (XAI) has become a critical requirement for the responsible deployment of deep learning systems in safety-critical and regulated domains, particularly in medical imaging. In computer vision, gradient-based explanation methods such as Saliency Maps and Gradient-weighted Class Activation Mapping (Grad-CAM) are widely used for interpreting convolutional neural networks (CNNs). However, the increasing adoption of Vision Transformers (ViTs) introduces structural differences in internal representations that challenge the direct transfer of convolutional explainability mechanisms. This study presents a systematic, quantitative, and statistically validated evaluation of gradient-based visual explainability across CNN architectures (VGG16 and ResNet50) and a Vision Transformer (ViT-B/16), using a domain-specific medical imaging dataset (brain MRI, tumor vs. non-tumor classification). Beyond qualitative heatmap inspection, we conduct deletion-based faithfulness analysis, sensitivity-to-noise evaluation, feature masking validation, and statistical hypothesis testing over 30 independent runs. All models achieve strong predictive performance on the domain dataset (mean accuracy ≈ 0.99), enabling a fair and meaningful comparison of explanation methods across architectures. Results demonstrate that explanation reliability is highly method- and architecture-dependent. Sensitivity differences are consistently statistically significant, whereas deletion-based faithfulness does not always yield equally strong separation under the adopted masking protocol. Masking-based analysis reveals substantial false-positive rates in certain configurations, indicating that visually plausible heatmaps do not necessarily isolate decision-necessary evidence. 
These findings underscore the importance of coupling visual explanations with behavioral validation metrics, particularly in high-risk domains governed by emerging regulatory frameworks such as the EU AI Act. Overall, the study advocates for empirically validated, architecture-aware, and statistically grounded approaches to medical XAI. Full article
21 pages, 3429 KB  
Article
Visible–Infrared Fusion Based on CNN and Deformable Transformer
by Xiaoyi Wang, Xiansong Gu, Bin Li, Mingqiang Zhang, Panpan Yang and Qiang Fu
J. Imaging 2026, 12(6), 219; https://doi.org/10.3390/jimaging12060219 - 22 May 2026
Abstract
To address the limitations of traditional methods in feature extraction and multi-modal information fusion, this paper proposes an infrared–visible image object detection architecture that integrates Convolutional Neural Networks (CNNs) and Deformable Transformers. This method leverages the advantages of CNN in local feature modeling and the capabilities of Transformer in capturing global contextual information, facilitating the fusion of semantic consistency and structural details across modalities. By introducing a detection-aware multi-task optimization mechanism, the model improves object detection in challenging scenarios such as low-light conditions, occlusion, and complex backgrounds. Experiments on multiple standard datasets, including M3FD and LLVIP, indicate that the proposed method achieves competitive or better performance than the compared methods in key metrics such as mAP. Specifically, our method obtains the best mAP50 among the evaluated methods with an mAP50 of 74.2% on the M3FD dataset and 98.6% on the LLVIP dataset, surpassing the second-best PIAFusion by 4.3% and 2.5% respectively. These quantitative results support the practicality and effectiveness of our approach in the evaluated complex environments. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
21 pages, 14302 KB  
Article
Audio-Based Device for Automated Surgical Counting, ToolSafe
by Michael R. Gardner, Latifa A. Aladdal, Lama Alshammari, Fatima Aldalgan, Maram A. Alomair, Shahad Alomair and Amani Alrashed
Appl. Sci. 2026, 16(11), 5181; https://doi.org/10.3390/app16115181 - 22 May 2026
Abstract
Manual counting of surgical tools, known as surgical counting, is a time-consuming and error-prone task that increases the risk of retained surgical instruments and extends operating room (OR) time. Presently, in hospitals around the world, surgical counting is often performed manually with paper or tablet checklists, often leading to delays, increased infection risk, and financial cost. RFID, barcode-based, and computer vision solutions exist but are expensive and have challenges with sterilization and signal interference. This paper presents ToolSafe, a low-cost, portable system that classifies surgical tools by their acoustic signatures when dropped into a detection box. A pilot dataset of 4004 audio samples from four tool types (n = 996, tissue forceps; n = 1005, iris scissors; n = 1006, scalpel handle; n = 997, testing needle) was collected using ToolSafe. A convolutional neural network (CNN) was evaluated using stratified five-fold cross-validation on the laboratory dataset, with a k-nearest neighbors (KNN) classifier implemented as a control model. In each fold, both models were trained on 80% of the data and tested on the remaining 20%, ensuring that all samples were used for both training and evaluation. The CNN achieved a mean (±standard deviation) classification accuracy of 99.55% (±0.19%) across the validation folds, outperforming the KNN model, which achieved a mean accuracy of 97.28% (±0.50%). The difference was statistically significant according to a paired t-test across folds (p = 0.0003), indicating CNN’s superior performance on the dataset. For a run of 100 additional samples using the Raspberry Pi-based system, spectrogram generation averaged 0.121 s (±0.025 s), CNN inference averaged 0.180 s (±0.033 s), and total end-to-end latency averaged 1.851 s (±0.253 s) per tool. This pilot study proposes a possible technological solution for surgical counting that reduces human error and enhances patient safety. 
ToolSafe may be subsequently improved by increasing the number of surgical tools used in the training dataset, testing under more robust OR-like environments, and comparing to other classification algorithms. Further refinement and incorporation of ToolSafe in operating room workflows have the potential to reduce patient risks from extended surgical times and retained surgical instruments. Full article
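The stratified five-fold protocol described in the ToolSafe abstract can be sketched as follows. The class sizes are the ones reported in the abstract; the round-robin assignment and the label names are illustrative assumptions, not the authors' code, and a real pipeline would normally shuffle within each class first.

```python
from collections import Counter

# Class sizes reported in the abstract (label names are illustrative).
CLASS_COUNTS = {
    "tissue_forceps": 996,
    "iris_scissors": 1005,
    "scalpel_handle": 1006,
    "testing_needle": 997,
}


def stratified_folds(class_counts, n_folds=5):
    """Assign each sample to one test fold, round-robin within its class."""
    folds = [[] for _ in range(n_folds)]
    for label, count in class_counts.items():
        for i in range(count):
            folds[i % n_folds].append((label, i))
    return folds


folds = stratified_folds(CLASS_COUNTS)
total = sum(len(f) for f in folds)
for test_fold in folds:
    per_class = Counter(label for label, _ in test_fold)
    train_size = total - len(test_fold)  # the other four folds form the training set
    print(len(test_fold), train_size, dict(per_class))
```

Each sample lands in exactly one test fold and in the training set of the other four, which is the sense in which all samples are used for both training and evaluation.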
29 pages, 4755 KB  
Article
DenseViT-OCT: A Hybrid CNN-Transformer Architecture with Multi-Scale Dense Feature Aggregation for Automated Epiretinal Membrane Severity Classification
by Elif Yusufoğlu, Salih Taha Alperen Özçelik, Orhan Atila, Numan Halit Guldemir and Abdulkadir Sengur
Tomography 2026, 12(6), 76; https://doi.org/10.3390/tomography12060076 - 22 May 2026
Abstract
Background/Objectives: Epiretinal membrane (ERM) is a common vitreoretinal disorder characterized by fibrocellular proliferation on the inner retinal surface, often leading to progressive visual impairment. Accurate grading of ERM severity using optical coherence tomography (OCT) is critical for treatment planning and surgical decision-making; however, manual grading is labor-intensive and subjective. This study aims to develop an automated and reliable deep learning-based method for ERM severity classification. Methods: We propose DenseViT-OCT, a hybrid deep learning model that integrates dense convolutional neural networks (CNN) and vision transformers (ViT). The model introduces three key modules: Multi-Scale Dense Feature Aggregation (MDFA) for capturing hierarchical features across multiple spatial scales, Adaptive Feature Calibration (AFC) for enhancing feature discrimination through channel and spatial attention, and Cross-Attention Feature Fusion (CAFF) for enabling bidirectional interaction between convolutional and transformer representations. The model was trained and evaluated on 2195 OCT B-scan images obtained from 397 patients. Results: DenseViT-OCT achieved an overall accuracy of 94.76% on the internal four-class test set, outperforming 19 benchmark models, including ConvNeXt, EfficientNet, ViT, and Swin Transformers. The model demonstrated balanced performance with a macro-averaged precision of 93.76%, recall of 93.22%, F1-score of 93.47%, Cohen’s kappa of 92.62%, and macro-Area Under the Curve (AUC) of 98.95%. Ablation experiments confirmed the contribution of the proposed MDFA, AFC, CAFF, and deep supervision components, with the full model consistently outperforming reduced variants and standalone DenseNet121 and ViT-B/16 backbones. In repeated experiments across five random seeds, DenseViT-OCT also achieved the best mean accuracy (0.9399 ± 0.0052). 
External validation on the public multicenter OCTDL dataset, performed as binary ERM-versus-normal classification because of label availability, yielded 90.76% accuracy and 97.61% AUC, indicating promising generalization beyond the development cohort. Conclusions: DenseViT-OCT provides a robust framework for automated ERM severity classification from OCT B-scans. The combination of local CNN features, global transformer context, and dedicated fusion modules improves classification performance and yields clinically meaningful error patterns. Although further stage-wise multicenter validation, volumetric OCT analysis, and prospective clinical assessment are required, the proposed method shows promise as a research-oriented decision-support framework for B-scan-level ERM assessment. Full article
(This article belongs to the Special Issue Medical Image Analysis in CT Imaging)
22 pages, 5019 KB  
Article
Hyperspectral Detection and Classification of Stain-Contaminated Waste Textiles
by Jiacheng Zou, Haonan He, Wei Tian, Chengyan Zhu, Fei Ye and Xiaoke Jin
Coatings 2026, 16(6), 629; https://doi.org/10.3390/coatings16060629 - 22 May 2026
Abstract
Surface stain contamination poses a critical barrier to the automated, high-precision fiber identification required for industrial-scale waste textile recycling. In this study, a dataset comprising 120 physical specimens (yielding 1200 regions of interest, ROIs) across 12 contamination categories was constructed by contaminating cotton, polyester, and poly-cotton blend textiles with carbon black, protein, and oil stains. The spectral interference effects of stains, including baseline drift and spectral overlapping induced by physical shielding and chemical absorption, were systematically analyzed. To identify the optimal classification pipeline, three mathematical preprocessing methods (First Derivative, FD; Standard Normal Variate, SNV; and Multiplicative Scatter Correction, MSC) were evaluated alongside Support Vector Machine (SVM) and One-Dimensional Convolutional Neural Network (1D-CNN) models. Results show that, among the SVM-based pipelines, the FD-SVM model effectively resolves overlapping absorption peaks and achieves an average accuracy of 98.17% ± 1.33%, but remains highly dependent on mathematical preprocessing. In contrast, the 1D-CNN model employing a progressive stacking architecture of multi-scale convolutional kernels attains a highly robust mean accuracy of 99.58% ± 0.56% under a strict specimen-level 10-fold cross-validation. It achieves this by directly utilizing radiometrically calibrated raw spectra, thereby effectively bypassing manual spectral feature engineering. These findings demonstrate that Hyperspectral Imaging coupled with end-to-end deep learning provides a feasible and industrially deployable solution for simultaneous stain detection and fiber identification in waste textile sorting. Full article
17 pages, 1020 KB  
Article
Research on a Portable Multispectral Imaging System for Starch Content Detection in Watermelon–Pumpkin Grafted Seedling Leaves
by Shengyong Xu, Honglei Yang, Yu Zeng, Shaodong Wang, Shuo Yang, Zhilong Bie and Yuan Huang
Agriculture 2026, 16(10), 1127; https://doi.org/10.3390/agriculture16101127 - 21 May 2026
Abstract
Plant leaf starch content is a critical indicator of metabolic status, yet traditional enzymatic methods are destructive, labor-intensive, and costly. This study proposes a novel non-destructive detection method using watermelon–pumpkin grafted seedlings. To optimize hardware design, 12 characteristic wavelengths were identified via competitive adaptive reweighted sampling (CARS). A portable multispectral imaging system was developed, featuring narrowband LEDs and integrated human–computer interaction software for real-time visualization. We constructed a multimodal deep learning architecture that integrates a convolutional neural network (CNN) for spatial feature extraction from RGB images, a fully connected neural network (FCNN) for spectral data, and a Transformer network for high-level feature fusion. Experimental results showed that the ShuffleNet v2-Transformer model achieved an R2 of 0.956 (RMSE = 0.036) for watermelon leaves, while the EfficientNet b1-Transformer model reached an R2 of 0.967 (RMSE = 0.052) for pumpkin leaves. This multimodal approach significantly outperformed conventional PLSR and single-modal CNN models, demonstrating superior ability in processing long-range dependencies within spectral–spatial data. The system enables accurate detection with a throughput of 120 samples per hour at a hardware cost approximately 90% lower than commercial multispectral cameras. This provides an efficient, low-cost solution for large-scale monitoring of plant physiological indicators in precision breeding. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
22 pages, 4581 KB  
Article
Climate-Driven Redistribution of Early-Spring Ephemeral Plant Communities in Cold Arid Deserts: Evidence from the Gurbantunggut Desert, China
by Yang Xue, Jiazheng Ma, Songmei Ma, Yuting Chen, Xu Sun, Mengyuan Ren and Liqiang Shen
Plants 2026, 15(10), 1586; https://doi.org/10.3390/plants15101586 - 21 May 2026
Abstract
Early-spring ephemeral plants act as pioneer species on stabilized dunes in cold arid deserts; they are capable of rapid growth under extreme drought and low-temperature conditions while sustaining dune ecosystem functions. These species are highly sensitive to climate change, yet their spatiotemporal dynamics and the mechanisms by which climatic factors regulate their growth remain poorly understood. In this study, we investigated the Gurbantunggut Desert, China, using long-term NDVI time series to extract phenological traits associated with their life cycle and developed a remote-sensing-based analytical framework to quantify the distribution patterns of early-spring ephemeral plants and their environmental drivers. We combined random forest (RF), structural equation modeling (SEM), and convolutional neural networks (CNN) to assess the relative importance and pathways of key climatic drivers and to predict future distribution changes. Our results indicate that: (1) the life cycle extraction method achieved a classification accuracy exceeding 80%, and from 2001 to 2022, the overall distribution of early-spring ephemeral plants exhibited an increasing trend; (2) snowend, snowday, and precipitation during the driest quarter were the primary drivers of ephemeral plant distribution, collectively explaining over 60% of the observed variation, and structural equation modeling further revealed that snow and precipitation had significant positive effects on their distribution; and (3) under future climate scenarios, Medium-NDVI areas are projected to expand northward and westward, with the potential emergence of new suitable habitats in northern localities by mid-century. Climate warming may facilitate the dispersal and latitudinal migration of early-spring ephemeral plants. 
Based on these findings, biodiversity conservation efforts should prioritize ecologically sensitive transitional zones and promote species migration and establishment under climate change through the construction of ecological corridors. Full article
(This article belongs to the Special Issue Plant Conservation Science and Practice)
25 pages, 1522 KB  
Article
A Robust Deep Learning Framework for Skill Level Discrimination in Tennis Strokes Using Bilateral IMU Measurements
by Enes Halit Aydin and Onder Aydemir
Sensors 2026, 26(10), 3273; https://doi.org/10.3390/s26103273 - 21 May 2026
Abstract
In tennis, where performance is governed by complex kinetic chain interactions, objective skill classification is vital for coaching and talent identification. This study presents a hierarchical deep learning framework leveraging synchronized bilateral Inertial Measurement Unit (IMU) data from 39 participants (11 elite, 28 amateur). The proposed system successfully distinguishes expertise levels across a total of 4594 strokes, including augmented samples. A hybrid Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) architecture was developed to autonomously extract spatiotemporal features from the raw kinematic signals of forehand, backhand, service, and volley strokes. The proposed model achieved an accuracy of 95.54%, significantly outperforming both traditional machine learning and state-of-the-art deep learning benchmarks. Qualitative t-distributed Stochastic Neighbor Embedding (t-SNE) analyses revealed that elite athletes form highly homogeneous clusters in the feature space. Furthermore, quantitative Asymmetry Index assessments confirmed that professionals exhibit superior bilateral coordination stability. These findings demonstrate that the proposed end-to-end system offers a robust, field-applicable solution for identifying technical excellence. It provides coaches with reliable digital biomarkers, thereby overcoming the limitations of subjective visual observation. Full article
(This article belongs to the Section Intelligent Sensors)
24 pages, 27236 KB  
Article
WFSCA-YOLO: Robust Object Detection for Terrestrial Optical Sensing Under Atmospheric Degradation via a Wavelet-Driven Frequency–Spatial Co-Awareness Framework
by Jiabao Yan, Qihang Xu, Zhian Zheng, Xian-Hua Han, Junjie Zhu and Yanhua Lin
Remote Sens. 2026, 18(10), 1667; https://doi.org/10.3390/rs18101667 - 21 May 2026
Abstract
Optical object detection under fog-induced atmospheric degradation remains a challenging problem for terrestrial sensing and monitoring systems. Atmospheric scattering reduces image contrast and attenuates high-frequency edge and texture features that are important for precise object localization, while standard downsampling in convolutional neural networks (CNNs) further amplifies this information loss during feature extraction. Existing spatial-domain methods largely improve pixel appearance or feature refinement without explicitly preserving fog-weakened high-frequency edge and texture features during feature extraction. To address this issue, we propose WFSCA-YOLO, a frequency-aware and feature-preserving detection framework with cross-domain fusion between frequency-domain details and spatial semantic responses. The framework introduces the Wavelet-driven Frequency–spatial Co-awareness Block (WFSCA-Block) into YOLOv8, where the Discrete Wavelet Transform (DWT) is used to decompose feature maps into multi-directional high-frequency subbands and preserve high-frequency edge and texture features degraded by atmospheric scattering. A Cross-Domain Feature Selector (CDFS) is further designed to adaptively recalibrate the fusion of frequency-domain details and spatial semantic responses under varying visibility conditions. Experiments on synthetic and real-world degraded optical benchmarks from near-ground scenes, namely Foggy Cityscapes and RTTS, show that WFSCA-YOLO consistently outperforms representative state-of-the-art methods, achieving 50.3% mAP@50 on Foggy Cityscapes (2.1 percentage points above the baseline) and a mean mAP@50 of 79.28% on RTTS over three independent runs. Under a unified FP32 batch-1 inference benchmark, WFSCA-YOLO runs at 134.76 FPS on an RTX 4090D, indicating real-time capability with only a slight latency increase relative to the YOLOv8-s baseline. 
These results indicate that preserving high-frequency edge and texture features is an effective strategy for robust perception under degraded visibility and offers practical potential for terrestrial sensing and monitoring platforms.
(This article belongs to the Section Engineering Remote Sensing)
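The abstract above describes using the Discrete Wavelet Transform to split feature maps into directional high-frequency subbands rather than discarding detail during downsampling. As a rough illustration of that general idea only (not the WFSCA-Block itself, whose internals are not given here), the following sketch applies a single-level orthonormal Haar DWT to one 2D array with NumPy. The function names `haar_dwt2` and `haar_idwt2` and the random `feature_map` are assumptions made for this example.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    low = (a + b + c + d) / 2.0          # local average (low-frequency content)
    horiz = (a + b - c - d) / 2.0        # top minus bottom: responds to horizontal edges
    vert = (a - b + c - d) / 2.0         # left minus right: responds to vertical edges
    diag = (a - b - c + d) / 2.0         # diagonal detail
    return low, horiz, vert, diag

def haar_idwt2(low, horiz, vert, diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution array exactly."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + horiz + vert + diag) / 2.0
    x[0::2, 1::2] = (low + horiz - vert - diag) / 2.0
    x[1::2, 0::2] = (low - horiz + vert - diag) / 2.0
    x[1::2, 1::2] = (low - horiz - vert + diag) / 2.0
    return x

# Hypothetical 8x8 "feature map" standing in for one channel of a CNN layer.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 8))
subbands = haar_dwt2(feature_map)
restored = haar_idwt2(*subbands)
```

Each subband has half the spatial resolution of the input, yet the four together allow exact reconstruction. Plain strided downsampling (`feature_map[0::2, 0::2]`), by contrast, keeps only one sample per 2x2 block.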
24 pages, 62426 KB  
Article
GDBNet: A Three-Branch Semantic Segmentation Network Integrating CNN and Transformer for Land Cover Classification in Ski Resorts
by Zhiwei Yi, Lingjia Gu, Ruifei Zhu, Junwei Tian and He Mi
Remote Sens. 2026, 18(10), 1666; https://doi.org/10.3390/rs18101666 - 21 May 2026
Abstract
As a critical component of ice-snow tourism, land cover classification for ski resorts is crucial to ice-snow resource management. However, datasets and methods capable of high-precision mapping for such fine-grained scenarios are currently scarce. Although Transformers with long-sequence interactions and convolutional neural networks (CNNs) have emerged as mainstream solutions, their performance remains limited on high-resolution remote sensing data characterized by small datasets and high heterogeneity. Targeting land cover classification in ski resort areas, this study proposes a triple-branch segmentation framework integrating CNNs and Transformers to extract global, detail, and boundary features (GDBNet), and constructs the first high-resolution ski resort land cover dataset (LULC_SKI), with a resolution of 0.75 m, using the JiLin-1 satellite constellation. The framework employs a backbone combining SegFormer with dual CNN branches: SegFormer captures global semantic context, while two ResNet-18 branches extract local semantics and edge details, respectively. The neck integrates two specialized feature interaction modules, the proposed Pixel-Guided Feature Attention (PG-AFM) and Boundary-Guided Feature Attention (BG-AFM), which fuse these heterogeneous feature representations for enhanced multi-scale modeling. For the segmentation head, a multi-task learning approach supervises both semantic and edge outputs. LULC_SKI covers seven representative ski resorts in Jilin Province, China, comprising 10,000 multi-seasonal images annotated with six land cover classes: roads, vegetation, built-up areas, ski runs, water bodies, and cropland. Experiments demonstrate that GDBNet achieves 85.44% mIoU and 91.84% mF1 on LULC_SKI, outperforming other advanced models, with particularly significant improvements for linear objects such as roads and ski runs. Additional comparisons show that GDBNet also performs consistently well on the iSAID and LoveDA datasets. Ablation studies validate the effectiveness of the triple-branch architecture, attention modules, and multi-task supervision. This work proposes a modular framework for land cover classification in complex ski resort scenarios.
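The abstract above reports results as mIoU and mF1. For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed from a per-class confusion matrix. It assumes the standard definitions (the abstract does not spell out its exact evaluation protocol), and the helper names `confusion_matrix` and `miou_and_mf1` and the tiny label arrays are illustrative only.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def miou_and_mf1(cm):
    """Mean IoU and mean F1 over classes that appear in truth or prediction."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp          # predicted as the class, but wrong
    fn = cm.sum(axis=1) - tp          # belongs to the class, but missed
    present = (tp + fp + fn) > 0      # skip classes absent from both maps
    iou = tp[present] / (tp + fp + fn)[present]
    f1 = 2 * tp[present] / (2 * tp + fp + fn)[present]
    return iou.mean(), f1.mean()

# Toy example with two classes and four pixels (hypothetical labels).
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])
cm = confusion_matrix(y_true, y_pred, num_classes=2)
miou, mf1 = miou_and_mf1(cm)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean IoU is 7/12.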
31 pages, 7581 KB  
Article
Adapting the IDS-ML Framework for Automated Attack Detection on Edge Devices
by Ryan V. Cooper and Arslan Munir
Algorithms 2026, 19(5), 417; https://doi.org/10.3390/a19050417 - 21 May 2026
Abstract
As modern networks expand, the volume and destructiveness of cyberattacks continue to escalate, necessitating effective defense mechanisms. Intrusion Detection Systems (IDSs) are critical for maintaining network security; however, traditional signature-based systems often fail to detect zero-day attacks. This study explores recent advancements in Deep Learning (DL) for cybersecurity by analyzing and replicating the “IDS-ML” framework, an open-source repository for IDS development. We evaluate the performance of five Convolutional Neural Network (CNN) architectures adapted for intrusion detection via transfer learning on the CICIDS2017 dataset. We then propose an enhancement that integrates Automated Machine Learning (AutoML) techniques and achieves a 94.7% reduction in model parameters while maintaining comparable accuracy, making the enhanced models suitable for deployment on edge devices. We further validate deployment feasibility by benchmarking both the baseline InceptionV3 and AutoML models on a Raspberry Pi 4, demonstrating an 18.7× inference speedup and a 3.5× reduction in CPU usage, with no change in predicted classes from model conversion. Our results confirm that lightweight AutoML architectures enable practical “zero-touch” edge-based intrusion detection on resource-constrained hardware.
24 pages, 1009 KB  
Article
An Improved Method for Anomalous Traffic Detection in SDN Based on Gated Feature Fusion
by Ruize Gu, Xiaoying Wang, Fangfang Cui, Guoqing Yang, Shuai Liu and Panpan Qi
Future Internet 2026, 18(5), 270; https://doi.org/10.3390/fi18050270 - 20 May 2026
Viewed by 169
Abstract
Existing anomalous traffic detection methods based on feature fusion in Software-Defined Networking (SDN) lack adaptability in their weight allocation mechanisms. Consequently, their detection accuracy and model generalization capabilities fail to meet practical security requirements. To address these limitations, this paper proposes a refined detection method based on hybrid feature selection and gated fusion. First, the framework employs XGBoost combined with the Recursive Feature Elimination (RFE) algorithm to identify shallow statistical features with high discriminative power. In parallel, the method uses a 1D Convolutional Neural Network (1D-CNN) integrated with a Squeeze-and-Excitation (SE) block to extract deep temporal semantic features. A tailored gated fusion mechanism, which incorporates linear projection layers for feature alignment, then adaptively integrates these two categories of features. The fused features are input into a Multilayer Perceptron (MLP) to perform anomalous traffic detection. On the InSDN dataset, the binary and multi-class accuracy rates reach 99.91% and 99.88%, respectively; on the NSL-KDD dataset, they are 99.78% and 99.76%. Finally, in a local simulation environment, the method attains an average precision exceeding 93% for anomalous traffic detection in simulated real scenarios.
(This article belongs to the Section Cybersecurity)
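The abstract above describes a gated fusion mechanism in which linear projections align two feature types before they are adaptively combined. The exact gate used in the paper is not given here, so the sketch below shows one common formulation as an assumption: both inputs are projected to a shared width, a sigmoid gate is computed from their concatenation, and the output is an element-wise convex combination. All names (`gated_fusion`, `init_params`, the parameter keys) and the dimensions 10, 32, and 16 are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(stat_dim, deep_dim, shared_dim, seed=0):
    """Random weights for the two projections and the gate (illustrative only)."""
    rng = np.random.default_rng(seed)
    return {
        "W_s": rng.standard_normal((stat_dim, shared_dim)) * 0.1,
        "b_s": np.zeros(shared_dim),
        "W_d": rng.standard_normal((deep_dim, shared_dim)) * 0.1,
        "b_d": np.zeros(shared_dim),
        "W_g": rng.standard_normal((2 * shared_dim, shared_dim)) * 0.1,
        "b_g": np.zeros(shared_dim),
    }

def gated_fusion(stat_feat, deep_feat, p):
    """Fuse statistical and deep features with a learned per-dimension gate."""
    s = stat_feat @ p["W_s"] + p["b_s"]   # align statistical features
    d = deep_feat @ p["W_d"] + p["b_d"]   # align deep features
    g = sigmoid(np.concatenate([s, d], axis=-1) @ p["W_g"] + p["b_g"])
    fused = g * s + (1.0 - g) * d         # gate near 1 favors s, near 0 favors d
    return fused, g

# Hypothetical batch: 4 flows, 10 selected statistical features, 32 deep features.
rng = np.random.default_rng(1)
stat_feat = rng.standard_normal((4, 10))
deep_feat = rng.standard_normal((4, 32))
params = init_params(stat_dim=10, deep_dim=32, shared_dim=16)
fused, gate = gated_fusion(stat_feat, deep_feat, params)
```

Because the gate is computed per sample and per dimension, the weighting between the two feature sources can vary with the input, unlike a fixed-weight sum or plain concatenation. In a trained model the weights would be learned jointly with the downstream classifier.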