Search Results (15,928)

Search Parameters:
Keywords = CNNs

26 pages, 2235 KB  
Article
AF-DETR: Transformer-Based Object Detection for Precise Atrial Fibrillation Beat Localization in ECG
by Peng Wang, Junxian Song, Pang Wu, Zhenfeng Li, Xianxiang Chen, Lidong Du and Zhen Fang
Bioengineering 2025, 12(10), 1104; https://doi.org/10.3390/bioengineering12101104 (registering DOI) - 14 Oct 2025
Abstract
Atrial fibrillation (AF) detection in electrocardiograms (ECG) remains challenging, particularly at the heartbeat level. Traditional deep learning methods typically classify ECG segments as a whole, limiting their ability to detect AF at the granularity of individual heartbeats. This paper presents AF-DETR, a novel transformer-based object detection model for precise AF heartbeat localization and classification. AF-DETR incorporates a CNN backbone and a transformer encoder–decoder architecture, where 2D bounding boxes are used to represent heartbeat positions. Through iterative refinement of these bounding boxes, the model improves both localization and classification accuracy. To further enhance performance, we introduce contrastive denoising training, which accelerates convergence and prevents redundant heartbeat predictions. We evaluate AF-DETR on five publicly available ECG datasets (CPSC2021, AFDB, LTAFDB, MITDB, NSRDB), achieving state-of-the-art performance with F1-scores of 96.77%, 96.20%, 90.55%, and 99.87% for heartbeat-level classification, and segment-level accuracies of 98.27%, 97.55%, 97.30%, and 99.99%, respectively. These results demonstrate the effectiveness of AF-DETR in accurately detecting AF heartbeats and its strong generalization capability across diverse ECG datasets. Full article
(This article belongs to the Section Biosignal Processing)

43 pages, 11215 KB  
Article
Real-Time Efficient Approximation of Nonlinear Fractional-Order PDE Systems via Selective Heterogeneous Ensemble Learning
by Biao Ma and Shimin Dong
Fractal Fract. 2025, 9(10), 660; https://doi.org/10.3390/fractalfract9100660 (registering DOI) - 13 Oct 2025
Abstract
Rod-pumping systems represent complex nonlinear systems. Traditional soft-sensing methods used for efficiency prediction in such systems typically rely on complicated fractional-order partial differential equations, severely limiting the real-time capability of efficiency estimation. To address this limitation, we propose an approximate efficiency prediction model for nonlinear fractional-order differential systems based on selective heterogeneous ensemble learning. This method integrates electrical power time-series data with fundamental operational parameters to enhance real-time predictive capability. Initially, we extract critical parameters influencing system efficiency using statistical principles. These primary influencing factors are identified through Pearson correlation coefficients and validated using p-value significance analysis. Subsequently, we introduce three foundational approximate system efficiency models: Convolutional Neural Network-Echo State Network-Bidirectional Long Short-Term Memory (CNN-ESN-BiLSTM), Bidirectional Long Short-Term Memory-Bidirectional Gated Recurrent Unit-Transformer (BiLSTM-BiGRU-Transformer), and Convolutional Neural Network-Echo State Network-Bidirectional Gated Recurrent Unit (CNN-ESN-BiGRU). Finally, to balance diversity among basic approximation models and predictive accuracy, we develop a selective heterogeneous ensemble-based approximate efficiency model for nonlinear fractional-order differential systems. Experimental validation utilizing actual oil-well parameters demonstrates that the proposed approach effectively and accurately predicts the efficiency of rod-pumping systems. Full article
22 pages, 3358 KB  
Article
MultiScaleSleepNet: A Hybrid CNN–BiLSTM–Transformer Architecture with Multi-Scale Feature Representation for Single-Channel EEG Sleep Stage Classification
by Cenyu Liu, Qinglin Guan, Wei Zhang, Liyang Sun, Mengyi Wang, Xue Dong and Shuogui Xu
Sensors 2025, 25(20), 6328; https://doi.org/10.3390/s25206328 (registering DOI) - 13 Oct 2025
Abstract
Accurate automatic sleep stage classification from single-channel EEG remains challenging due to the need for effective extraction of multiscale neurophysiological features and modeling of long-range temporal dependencies. This study aims to address these limitations by developing an efficient and compact deep learning architecture tailored for wearable and edge device applications. We propose MultiScaleSleepNet, a hybrid convolutional neural network–bidirectional long short-term memory–transformer architecture that extracts multiscale temporal and spectral features through parallel convolutional branches, followed by sequential modeling using a BiLSTM memory network and transformer-based attention mechanisms. The model obtained an accuracy, macro-averaged F1 score, and kappa coefficient of 88.6%, 0.833, and 0.84 on the Sleep-EDF dataset; 85.6%, 0.811, and 0.80 on the Sleep-EDF Expanded dataset; and 84.6%, 0.745, and 0.79 on the SHHS dataset. Ablation studies indicate that attention mechanisms and spectral fusion consistently improve performance, with the most notable gains observed for stages N1, N3, and rapid eye movement. MultiScaleSleepNet demonstrates competitive performance across multiple benchmark datasets while maintaining a compact size of 1.9 million parameters, suggesting robustness to variations in dataset size and class distribution. The study supports the feasibility of real-time, accurate sleep staging from single-channel EEG using parameter-efficient deep models suitable for portable systems. Full article
(This article belongs to the Special Issue AI on Biomedical Signal Sensing and Processing for Health Monitoring)
19 pages, 5009 KB  
Article
Research on Preventive Maintenance Technology for Highway Cracks Based on Digital Image Processing
by Zhi Chen, Zhuozhuo Bai, Xinqi Chen and Jiuzeng Wang
Electronics 2025, 14(20), 4017; https://doi.org/10.3390/electronics14204017 (registering DOI) - 13 Oct 2025
Abstract
Cracks are the initial manifestation of various forms of pavement distress on highways. Preventive maintenance of cracks can delay the degree of pavement damage and effectively extend the service life of highways. However, existing crack detection methods have poor performance in identifying small cracks and are unable to calculate crack width, leading to unsatisfactory preventive maintenance results. This article proposes an integrated method for crack detection, segmentation, and width calculation based on digital image processing technology. Firstly, based on a convolutional neural network, an optimized crack detection network called CFSSE is proposed by fusing the fast spatial pyramid pooling structure with the squeeze-and-excitation attention mechanism, with an average detection accuracy of 97.10%, average recall rate of 98.00%, and average detection precision at the 0.5 threshold of 98.90%; it outperforms the YOLOv5-mobileone network and YOLOv5-s network. Secondly, based on the U-Net network, an optimized crack segmentation network called CBU_Net is proposed by using the CNN-block structure in the encoder module and a bicubic interpolation algorithm in the decoder module, with an average segmentation accuracy of 99.10%, average intersection over union of 88.62%, and average pixel accuracy of 93.56%; it outperforms the U_Net network, DeepLab v3+ network, and optimized DeepLab v3 network. Finally, a laser spot center positioning method based on information entropy combination is proposed to provide an accurate benchmark for crack width calculation based on parallel lasers, with an average error in crack width calculation of less than 2.56%. Full article

26 pages, 2931 KB  
Review
Prospects of AI-Powered Bowel Sound Analytics for Diagnosis, Characterization, and Treatment Management of Inflammatory Bowel Disease
by Divyanshi Sood, Zenab Muhammad Riaz, Jahnavi Mikkilineni, Narendra Nath Ravi, Vineeta Chidipothu, Gayathri Yerrapragada, Poonguzhali Elangovan, Mohammed Naveed Shariff, Thangeswaran Natarajan, Jayarajasekaran Janarthanan, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Keerthy Gopalakrishnan and Shivaram P. Arunachalam
Med. Sci. 2025, 13(4), 230; https://doi.org/10.3390/medsci13040230 - 13 Oct 2025
Abstract
Background: This narrative review examines the role of artificial intelligence (AI) in bowel sound analysis for the diagnosis and management of inflammatory bowel disease (IBD). Inflammatory bowel disease (IBD), encompassing Crohn’s disease and ulcerative colitis, presents a significant clinical burden due to its unpredictable course, variable symptomatology, and reliance on invasive procedures for diagnosis and disease monitoring. Despite advances in imaging and biomarkers, tools such as colonoscopy and fecal calprotectin remain costly, uncomfortable, and impractical for frequent or real-time assessment. Meanwhile, bowel sounds—an overlooked physiologic signal—reflect underlying gastrointestinal motility and inflammation but have historically lacked objective quantification. With recent advances in artificial intelligence (AI) and acoustic signal processing, there is growing interest in leveraging bowel sound analysis as a novel, non-invasive biomarker for detecting IBD, monitoring disease activity, and predicting disease flares. This approach holds the promise of continuous, low-cost, and patient-friendly monitoring, which could transform IBD management. Objectives: This narrative review assesses the clinical utility, methodological rigor, and potential future integration of artificial intelligence (AI)-driven bowel sound analysis in inflammatory bowel disease (IBD), with a focus on its potential as a non-invasive biomarker for disease activity, flare prediction, and differential diagnosis. Methods: This manuscript reviews the potential of AI-powered bowel sound analysis as a non-invasive tool for diagnosing, monitoring, and managing inflammatory bowel disease (IBD), including Crohn’s disease and ulcerative colitis. Traditional diagnostic methods, such as colonoscopy and biomarkers, are often invasive, costly, and impractical for real-time monitoring. 
The manuscript explores bowel sounds, which reflect gastrointestinal motility and inflammation, as an alternative biomarker by utilizing AI techniques like convolutional neural networks (CNNs), transformers, and gradient boosting. We analyze data on acoustic signal acquisition (e.g., smart T-shirts, smartphones), signal processing methodologies (e.g., MFCCs, spectrograms, empirical mode decomposition), and validation metrics (e.g., accuracy, F1 scores, AUC). Studies were assessed for clinical relevance, methodological rigor, and translational potential. Results: Across studies enrolling 16–100 participants, AI models achieved diagnostic accuracies of 88–96%, with AUCs ≥ 0.83 and F1 scores ranging from 0.71 to 0.85 for differentiating IBD from healthy controls and IBS. Transformer-based approaches (e.g., HuBERT, Wav2Vec 2.0) consistently outperformed CNNs and tabular models, yielding F1 scores of 80–85%, while gradient boosting on wearable multi-microphone recordings demonstrated robustness to background noise. Distinct acoustic signatures were identified, including prolonged sound-to-sound intervals in Crohn’s disease (mean 1232 ms vs. 511 ms in IBS) and high-pitched tinkling in stricturing phenotypes. Despite promising performance, current models remain below established biomarkers such as fecal calprotectin (~90% sensitivity for active disease), and generalizability is limited by small, heterogeneous cohorts and the absence of prospective validation. Conclusions: AI-powered bowel sound analysis represents a promising, non-invasive tool for IBD monitoring. However, widespread clinical integration requires standardized data acquisition protocols, large multi-center datasets with clinical correlates, explainable AI frameworks, and ethical data governance. Future directions include wearable-enabled remote monitoring platforms and multi-modal decision support systems integrating bowel sounds with biomarker and symptom data. 
This manuscript emphasizes the need for large-scale, multi-center studies, the development of explainable AI frameworks, and the integration of these tools within clinical workflows. Future directions include remote monitoring using wearables and multi-modal systems that combine bowel sounds with biomarkers and patient symptoms, aiming to transform IBD care into a more personalized and proactive model. Full article

21 pages, 2783 KB  
Article
Deep Learning-Based Eye-Writing Recognition with Improved Preprocessing and Data Augmentation Techniques
by Kota Suzuki, Abu Saleh Musa Miah and Jungpil Shin
Sensors 2025, 25(20), 6325; https://doi.org/10.3390/s25206325 (registering DOI) - 13 Oct 2025
Abstract
Eye-tracking technology enables communication for individuals with muscle control difficulties, making it a valuable assistive tool. Traditional systems rely on electrooculography (EOG) or infrared devices, which are accurate but costly and invasive. While vision-based systems offer a more accessible alternative, they have not been extensively explored for eye-writing recognition. Additionally, the natural instability of eye movements and variations in writing styles result in inconsistent signal lengths, which reduces recognition accuracy and limits the practical use of eye-writing systems. To address these challenges, we propose a novel vision-based eye-writing recognition approach that utilizes a webcam-captured dataset. A key contribution of our approach is the introduction of a Discrete Fourier Transform (DFT)-based length normalization method that standardizes the length of each eye-writing sample while preserving essential spectral characteristics. This ensures uniformity in input lengths and improves both efficiency and robustness. Moreover, we integrate a hybrid deep learning model that combines 1D Convolutional Neural Networks (CNN) and Temporal Convolutional Networks (TCN) to jointly capture spatial and temporal features of eye-writing. To further improve model robustness, we incorporate data augmentation and initial-point normalization techniques. The proposed system was evaluated using our new webcam-captured Arabic numbers dataset and two existing benchmark datasets, with leave-one-subject-out (LOSO) cross-validation. The model achieved accuracies of 97.68% on the new dataset, 94.48% on the Japanese Katakana dataset, and 98.70% on the EOG-captured Arabic numbers dataset—outperforming existing systems. This work provides an efficient eye-writing recognition system, featuring robust preprocessing techniques, a hybrid deep learning model, and a new webcam-captured dataset. Full article
21 pages, 3081 KB  
Article
Lightweight CNN–Transformer Hybrid Network with Contrastive Learning for Few-Shot Noxious Weed Recognition
by Ruiheng Li, Boda Yu, Boming Zhang, Hongtao Ma, Yihan Qin, Xinyang Lv and Shuo Yan
Horticulturae 2025, 11(10), 1236; https://doi.org/10.3390/horticulturae11101236 - 13 Oct 2025
Abstract
In resource-constrained edge agricultural environments, the accurate recognition of toxic weeds poses dual challenges related to model lightweight design and the few-shot generalization capability. To address these challenges, a multi-strategy recognition framework is proposed, which integrates a lightweight backbone network, a pseudo-labeling guidance mechanism, and a contrastive boundary enhancement module. This approach is designed to improve deployment efficiency on low-power devices while ensuring high accuracy in identifying rare toxic weed categories. The proposed model achieves a real-time inference speed of 18.9 FPS on the Jetson Nano platform, with a compact model size of 18.6 MB and power consumption maintained below 5.1 W, demonstrating its efficiency for edge deployment. In standard classification tasks, the model attains 89.64%, 87.91%, 88.76%, and 88.43% in terms of precision, recall, F1-score, and accuracy, respectively, outperforming existing mainstream lightweight models such as ResNet18, MobileNetV2, and MobileViT across all evaluation metrics. In few-shot classification tasks targeting rare toxic weed species, the complete model achieves an accuracy of 80.32%, marking an average improvement of over 13 percentage points compared to ablation variants that exclude pseudo-labeling and self-supervised modules or adopt a CNN-only architecture. The experimental results indicate that the proposed model not only delivers strong overall classification performance but also exhibits superior adaptability for deployment and robustness in low-data regimes, offering an effective solution for the precise identification and ecological control of toxic weeds within intelligent agricultural perception systems. Full article

24 pages, 3657 KB  
Article
Construction and Comparative Analysis of a Water Quality Simulation and Prediction Model for Plain River Networks
by Yue Lan, Cundong Xu, Lianying Ding, Mingyan Wang, Zihao Ren and Zhihang Wang
Water 2025, 17(20), 2948; https://doi.org/10.3390/w17202948 (registering DOI) - 13 Oct 2025
Abstract
In plain river networks, a sluggish flow due to the flat terrain and hydraulic structures significantly reduces water’s capacity for self-purification, leading to persistent water pollution that threatens aquatic ecosystems and human health. Despite being critical, effective water quality prediction proves challenging in such regions, with current models lacking either physical interpretability or temporal accuracy. To address this gap, both a process-based model (MIKE 21) and a deep learning model (CNN-LSTM-Attention) were developed in this study to predict key water quality indicators—dissolved oxygen (DO), total nitrogen (TN), and total phosphorus (TP)—in a typical river network area in Jiaxing, China. This site was selected for its representative complexity and acute pollution challenges. The MIKE 21 model demonstrated strong performance, with R2 values above 0.88 for all indicators, offering high spatial resolution and mechanistic insight. The CNN-LSTM-Attention model excelled in capturing temporal dynamics, achieving an R2 of 0.9934 for DO. The results indicate the complementary nature of these two approaches: while MIKE 21 supports scenario-based planning, the deep learning model enables highly accurate real-time forecasting. The findings are transferable to similar river network systems, providing a robust reference for selecting modeling frameworks in the design of water pollution control strategies. Full article
19 pages, 20391 KB  
Article
Radar-Based Gesture Recognition Using Adaptive Top-K Selection and Multi-Stream CNNs
by Jiseop Park and Jaejin Jeong
Sensors 2025, 25(20), 6324; https://doi.org/10.3390/s25206324 (registering DOI) - 13 Oct 2025
Abstract
With the proliferation of the Internet of Things (IoT), gesture recognition has attracted attention as a core technology in human–computer interaction (HCI). In particular, mmWave frequency-modulated continuous-wave (FMCW) radar has emerged as an alternative to vision-based approaches due to its robustness to illumination changes and advantages in privacy. However, in real-world human–machine interface (HMI) environments, hand gestures are inevitably accompanied by torso- and arm-related reflections, which can also contain gesture-relevant variations. To effectively capture these variations without discarding them, we propose a preprocessing method called Adaptive Top-K Selection, which leverages vector entropy to summarize and preserve informative signals from both hand and body reflections. In addition, we present a Multi-Stream EfficientNetV2 architecture that jointly exploits temporal range and Doppler trajectories, together with radar-specific data augmentation and a training optimization strategy. In experiments on the publicly available FMCW gesture dataset released by the Karlsruhe Institute of Technology, the proposed method achieved an average accuracy of 99.5%. These results show that the proposed approach enables accurate and reliable gesture recognition even in realistic HMI environments with co-existing body reflections. Full article
(This article belongs to the Special Issue Sensor Technologies for Radar Detection)
30 pages, 23104 KB  
Article
MSAFNet: Multi-Modal Marine Aquaculture Segmentation via Spatial–Frequency Adaptive Fusion
by Guolong Wu and Yimin Lu
Remote Sens. 2025, 17(20), 3425; https://doi.org/10.3390/rs17203425 (registering DOI) - 13 Oct 2025
Abstract
Accurate mapping of marine aquaculture areas is critical for environmental management, marine ecosystem protection, and sustainable resource utilization. However, remote sensing imagery based on single-sensor modalities has inherent limitations when extracting aquaculture zones in complex marine environments. To address this challenge, we constructed a multi-modal dataset from five Chinese coastal regions using cloud detection methods and developed the Multi-modal Spatial–Frequency Adaptive Fusion Network (MSAFNet) for optical-radar data fusion. MSAFNet employs a dual-path architecture utilizing a Multi-scale Dual-path Feature Module (MDFM) that combines CNN and Transformer capabilities to extract multi-scale features. Additionally, it implements a Dynamic Frequency Domain Adaptive Fusion Module (DFAFM) to achieve deep integration of multi-modal features in both the spatial and frequency domains, effectively leveraging the complementary advantages of different sensor data. Results demonstrate that MSAFNet achieves 76.93% mean intersection over union (mIoU), 86.96% mean F1 score (mF1), and 93.26% mean Kappa coefficient (mKappa) in extracting floating raft aquaculture (FRA) and cage aquaculture (CA), significantly outperforming existing methods. Applied to China’s coastal waters, the model generated nearshore aquaculture distribution maps for 2020, demonstrating its generalization capability and practical value in complex marine environments. This approach provides reliable technical support for marine resource management and ecological monitoring. Full article

14 pages, 1932 KB  
Article
Development and Validation of Transformer- and Convolutional Neural Network-Based Deep Learning Models to Predict Curve Progression in Adolescent Idiopathic Scoliosis
by Shinji Takahashi, Shota Ichikawa, Kei Watanabe, Haruki Ueda, Hideyuki Arima, Yu Yamato, Takumi Takeuchi, Naobumi Hosogane, Masashi Okamoto, Manami Umezu, Hiroki Oba, Yohan Kondo and Shoji Seki
J. Clin. Med. 2025, 14(20), 7216; https://doi.org/10.3390/jcm14207216 (registering DOI) - 13 Oct 2025
Abstract
Background/Objectives: The clinical management of adolescent idiopathic scoliosis (AIS) is hindered by the inability to accurately predict curve progression. Although skeletal maturity and the initial Cobb angle are established predictors of progression, their combined predictive accuracy remains limited. This study aimed to develop a robust and interpretable artificial intelligence (AI) system using deep learning (DL) models to predict the progression of scoliosis using only standing frontal radiographs. Methods: We conducted a multicenter study involving 542 patients with AIS. After excluding 52 borderline progression cases (6–9° progression in the Cobb angle), 294 and 196 patients were assigned to progression (≥10° increase) and non-progression (≤5° increase) groups, respectively, over a 2-year follow-up. Frontal whole spinal radiographs were preprocessed using histogram equalization and divided into two regions of interest (ROIs) (ROI 1, skull base–femoral head; ROI 2, C7–iliac crest). Six pretrained DL models, including convolutional neural networks (CNNs) and transformer-based models, were trained on the radiograph images. Gradient-weighted class activation mapping (Grad-CAM) was further performed for model interpretation. Results: Ensemble models outperformed individual ones, with the average ensemble model achieving area under the curve (AUC) values of 0.769 for ROI 1 and 0.755 for ROI 2. Grad-CAM revealed that the CNNs tended to focus on the local curve apex, whereas the transformer-based models demonstrated global attention across the spine, ribs, and pelvis. Models trained on ROI 2 performed comparably to those trained on ROI 1, supporting the feasibility of image standardization without a loss of accuracy. Conclusions: This study establishes the clinical potential of transformer-based DL models for predicting the progression of scoliosis using only plain radiographs. Our multicenter approach, high AUC values, and interpretable architectures support the integration of AI into clinical decision-making for the early treatment of AIS. Full article
(This article belongs to the Special Issue Clinical New Insights into Management of Scoliosis)

30 pages, 7765 KB  
Article
Self-Controlled Autonomous Mobility System with Adaptive Spatial and Stair Recognition Using CNNs
by Hayato Mitsuhashi, Hiroyuki Kamata and Taku Itami
Appl. Sci. 2025, 15(20), 10978; https://doi.org/10.3390/app152010978 - 13 Oct 2025
Abstract
The aim of this study is to develop the next-generation fully autonomous electric wheelchair capable of operating in diverse environments. This study proposes a self-controlled autonomous mobility system that integrates a monocular camera and laser-based 3D spatial recognition, convolutional neural network-based obstacle recognition, shape measurement, and stair structure recognition technology. Obstacle recognition and shape measurement are performed by analyzing the surrounding space using convolutional neural networks and distance calculation methods based on laser measurements. The stair structure recognition technology utilizes the stair-step characteristics from the laser’s irradiation pattern, enabling detection of distance information not captured by the camera. A principal analysis and algorithm development were conducted using a small-scale autonomous mobility system, and its feasibility was determined by application to an omnidirectional self-controlled autonomous electric wheelchair. Using the autonomous robot, we successfully demonstrated an obstacle-avoidance program based on obstacle recognition and shape measurement that is independent of environmental illumination. Additionally, 3D analysis of the number of stair steps, height, and depth was achieved. This study enhances mobility in complex environments under varying lighting conditions and lays the groundwork for inclusive mobility solutions in a barrier-free society. When the proposed method was applied to an omnidirectional self-controlled electric wheelchair, it accurately detected the distance to obstacles, their shapes, as well as the height and depth of stairs, with a maximum error of 0.8 cm. Full article
(This article belongs to the Section Robotics and Automation)
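The stair-structure analysis described above (counting steps and recovering riser height from a laser scan) can be illustrated with a minimal NumPy sketch. This is a toy model, not the paper's method: the 1D height profile, the jump threshold, and the 0.18 m step height are all hypothetical values chosen for illustration.

```python
import numpy as np

def stair_geometry(profile, jump=0.05):
    """Estimate step count and mean riser height from a 1D height profile.

    profile: heights (m) sampled along the scan direction. A new step is
    counted wherever the height jumps by more than `jump` metres.
    Purely illustrative; the paper works from laser irradiation patterns.
    """
    d = np.diff(profile)
    risers = d[d > jump]
    return len(risers), float(risers.mean()) if len(risers) else 0.0

# Toy staircase: 3 risers of 0.18 m, 10 samples per tread.
profile = np.repeat([0.0, 0.18, 0.36, 0.54], 10)
steps, height = stair_geometry(profile)
```

In practice the profile would come from triangulating the laser spots seen by the camera, and the same差 differencing idea extends to tread depth along the horizontal axis.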

22 pages, 7434 KB  
Article
A Lightweight Image-Based Decision Support Model for Marine Cylinder Lubrication Based on CNN-ViT Fusion
by Qiuyu Li, Guichen Zhang and Enrui Zhao
J. Mar. Sci. Eng. 2025, 13(10), 1956; https://doi.org/10.3390/jmse13101956 - 13 Oct 2025
Abstract
Under the policy context of “Energy Conservation and Emission Reduction,” low-sulfur fuel has become widely adopted in maritime operations, posing significant challenges to cylinder lubrication systems. Traditional oil-injection strategies, which rely heavily on manual experience, suffer from instability and high costs. To address this, a lightweight image retrieval model for cylinder lubrication is proposed, leveraging deep learning and computer vision to support oiling decisions based on visual features. The model comprises three components: a backbone network, a feature enhancement module, and a similarity retrieval module. Specifically, EfficientNetB0 serves as the backbone for efficient feature extraction at low computational overhead. MobileViT Blocks are integrated to combine the local feature perception of Convolutional Neural Networks (CNNs) with the global modeling capacity of Transformers. To further enlarge the receptive field and improve multi-scale representation, Receptive Field Blocks (RFB) are introduced between these components. Additionally, the Convolutional Block Attention Module (CBAM) enhances focus on salient regions, improving feature discrimination. A high-quality image dataset was constructed using WINNING’s large bulk carriers under various sea conditions. The experimental results demonstrate that the EfficientNetB0 + RFB + MobileViT + CBAM model achieves excellent performance at minimal computational cost: 99.71% Precision, 99.69% Recall, and 99.70% F1-score, improvements of 11.81%, 15.36%, and 13.62%, respectively, over the baseline EfficientNetB0. With an increase of only 0.3 GFLOPs in computation and 8.3 MB in model size, the approach balances accuracy and inference efficiency. The model also demonstrates good robustness and stability in real-world ship testing, with potential for further adoption in intelligent ship maintenance.
(This article belongs to the Section Ocean Engineering)
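The CBAM channel-attention idea used in this abstract (globally pool a feature map, pass the pooled vectors through a shared MLP, and gate the channels with a sigmoid) can be sketched in a few lines of NumPy. The weights here are random placeholders purely for shape-checking; in the paper's model they would be learned, and a spatial-attention branch would follow.

```python
import numpy as np

def channel_attention(feat, reduction=4):
    """CBAM-style channel attention (NumPy sketch, random untrained weights).

    feat: (C, H, W) feature map. Average- and max-pooled channel vectors
    share a two-layer MLP; their sum is squashed to a per-channel gate.
    """
    c = feat.shape[0]
    rng = np.random.default_rng(0)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1  # squeeze
    w2 = rng.standard_normal((c, c // reduction)) * 0.1  # excite

    avg = feat.mean(axis=(1, 2))                   # global avg pool -> (C,)
    mx = feat.max(axis=(1, 2))                     # global max pool -> (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0)     # shared MLP with ReLU
    scale = 1 / (1 + np.exp(-(mlp(avg) + mlp(mx))))  # sigmoid gate, (C,)
    return feat * scale[:, None, None]             # reweight channels

feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = channel_attention(feat)
```

Because the gate lies strictly between 0 and 1, the module can only attenuate channels, which is what lets it emphasize salient regions without changing the feature map's shape.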

25 pages, 18664 KB  
Article
Study on Lower Limb Motion Intention Recognition Based on PO-SVMD-ResNet-GRU
by Wei Li, Mingsen Wang, Daxue Sun, Zhuoda Jia and Zhengwei Yue
Processes 2025, 13(10), 3252; https://doi.org/10.3390/pr13103252 (registering DOI) - 13 Oct 2025
Abstract
This study aims to enhance the accuracy of human lower limb motion intention recognition based on surface electromyography (sEMG) signals. It proposes a signal denoising method based on Sequential Variational Mode Decomposition (SVMD) optimized by the Parrot Optimization (PO) algorithm, together with a joint motion angle prediction model combining a Residual Network (ResNet) with a Gated Recurrent Unit (GRU), addressing signal processing and predictive modeling, respectively. First, for two motion conditions, level walking and stair climbing, sEMG signals from the rectus femoris, vastus lateralis, semitendinosus, and biceps femoris, together with the hip and knee joint angles, were simultaneously collected from five healthy subjects, yielding a total of 400 gait cycles. The sEMG signals were denoised by combining PO-SVMD with wavelet thresholding. Compared with denoising methods such as Empirical Mode Decomposition, Partial Ensemble Empirical Mode Decomposition, Independent Component Analysis, and wavelet thresholding alone, the proposed method achieved the highest signal-to-noise ratio (SNR), up to 23.42 dB. Then, the gait cycle data were divided into training and testing sets at a 4:1 ratio, and five models (ResNet-GRU, Transformer-LSTM, CNN-GRU, ResNet, and GRU) were trained and tested using the processed sEMG signals as input and the hip and knee joint angles as output. Finally, the root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) were used as evaluation metrics. The results show that, for both motion conditions, the ResNet-GRU model outperforms the other four models on all metrics. The best metrics for level walking are 2.512 ± 0.415°, 1.863 ± 0.265°, and 0.979 ± 0.007, respectively, while those for stair climbing are 2.475 ± 0.442°, 2.012 ± 0.336°, and 0.98 ± 0.009. The proposed method improves both signal processing and predictive modeling, providing a new approach for research on lower limb motion intention recognition.
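The three evaluation metrics reported above (RMSE, MAE, R²) are standard and can be computed with a short NumPy sketch. The sinusoidal "joint angle" trace and noise level below are toy data for illustration, not measurements from the study.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, MAE, and R^2 for a predicted joint-angle trace."""
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    ss_res = np.sum(err ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

# Toy hip-angle trace (degrees) and a slightly noisy "prediction".
t = np.linspace(0, 2 * np.pi, 200)
y_true = 30 * np.sin(t)
y_pred = y_true + np.random.default_rng(0).normal(0, 2, t.size)
rmse, mae, r2 = regression_metrics(y_true, y_pred)
```

Note that RMSE is never smaller than MAE for the same errors, so the paper's RMSE > MAE pattern (e.g. 2.512° vs. 1.863°) is expected.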

26 pages, 2445 KB  
Article
Image-Based Deep Learning Approach for Drilling Kick Risk Prediction
by Wei Liu, Yuansen Wei, Jiasheng Fu, Qihao Li, Yi Zou, Tao Pan and Zhaopeng Zhu
Processes 2025, 13(10), 3251; https://doi.org/10.3390/pr13103251 (registering DOI) - 13 Oct 2025
Abstract
As oil and gas exploration and development advance into deep and ultra-deep formations, kick accidents are becoming more frequent during drilling operations, posing a serious threat to construction safety. Traditional kick monitoring methods are limited in their ability to model multivariate coupling and rely too heavily on single-feature weights, making them prone to misjudgment. Therefore, this paper proposes a drilling kick risk prediction method based on the image modality. First, a sliding window mechanism slices key drilling parameters along the time axis to extract multivariate data over continuous time periods. Second, the data are processed to construct joint logging-curve image samples. Then, classical CNN models such as VGG16 and ResNet are trained to classify the image samples. Finally, the model’s performance is evaluated across multiple metrics and compared with CNN and time series neural network models of different structures. Experimental results show that the image-based VGG16 model outperforms typical convolutional neural network models such as AlexNet, ResNet, and EfficientNet in overall performance, and significantly outperforms the LSTM and GRU time series models in classification accuracy and overall discriminative power. Compared to LSTM, recall increased by 23.8% and precision by 5.8%, demonstrating that the convolutional structure is better at extracting local spatiotemporal features and recognizing patterns, enabling more accurate identification of kick risks. Furthermore, the pre-trained VGG16 model achieved an 8.69% improvement in accuracy over the custom VGG16 model, demonstrating the effectiveness and generalization advantages of transfer learning in small-sample engineering problems and supporting the feasibility of model deployment in engineering applications.
(This article belongs to the Section Energy Systems)
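The sliding-window slicing step described in this abstract can be sketched in NumPy: a multivariate drilling log of shape (time, parameters) is cut into fixed-length windows that would then be rendered as joint logging-curve images. The window length, stride, and parameter count below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sliding_windows(series, win, stride):
    """Slice a multivariate time series (T, F) into windows (N, win, F)."""
    T = series.shape[0]
    starts = range(0, T - win + 1, stride)
    return np.stack([series[s:s + win] for s in starts])

# Toy drilling log: 100 time steps, 4 parameters (e.g. flow in/out,
# standpipe pressure, pit volume -- hypothetical channel choices).
logs = np.random.default_rng(0).standard_normal((100, 4))
windows = sliding_windows(logs, win=32, stride=16)
```

Each window would subsequently be plotted as a set of curves and saved as one image sample; an overlapping stride (stride < win) increases the number of training samples from the same log.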