Search Results (125)

Search Parameters:
Keywords = gradient-weighted class activation mapping (Grad-CAM)

28 pages, 765 KB  
Systematic Review
Explainable AI in Clinical Decision Support Systems: A Meta-Analysis of Methods, Applications, and Usability Challenges
by Qaiser Abbas, Woonyoung Jeong and Seung Won Lee
Healthcare 2025, 13(17), 2154; https://doi.org/10.3390/healthcare13172154 - 29 Aug 2025
Viewed by 372
Abstract
Background: The integration of artificial intelligence (AI) into clinical decision support systems (CDSSs) has significantly enhanced diagnostic precision, risk stratification, and treatment planning. However, the opacity of AI models remains a barrier to clinical adoption, emphasizing the critical role of explainable AI (XAI). Methods: This systematic meta-analysis synthesizes findings from 62 peer-reviewed studies published between 2018 and 2025, examining the use of XAI methods within CDSSs across various clinical domains, including radiology, oncology, neurology, and critical care. Visualization techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) and attention mechanisms dominated in imaging and sequential-data tasks. Results: Gaps remain in usability evaluation, methodological transparency, and ethics, as seen in the scarcity of studies that assessed explanation fidelity, clinician trust, or usability in real-world settings. To enable responsible AI implementation in healthcare, our analysis emphasizes the necessity of longitudinal clinical validation, participatory system design, and uniform interpretability measures. Conclusions: This review offers a thorough analysis of the current state of XAI practices in CDSSs, identifies methodological and practical issues, and suggests a path forward for AI solutions that are transparent, ethical, and clinically relevant. Full article
(This article belongs to the Special Issue The Role of AI in Predictive and Prescriptive Healthcare)
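Grad-CAM, the keyword shared by every result below, reduces to a small computation: global-average-pool the gradients of the class score with respect to one convolutional layer's feature maps to get a weight per channel, then take a ReLU-ed weighted sum of those maps. A minimal pure-Python sketch of that computation (list-based for clarity; real implementations operate on framework tensors, and all names here are illustrative):

```python
def grad_cam(feature_maps, gradients):
    """Compute a Grad-CAM heatmap from one conv layer's activations.

    feature_maps: list of C maps, each an HxW list of floats (forward pass).
    gradients:    same shape, d(class score)/d(activation) from backprop.
    """
    C = len(feature_maps)
    H, W = len(feature_maps[0]), len(feature_maps[0][0])
    # 1) Global-average-pool the gradients -> one importance weight per channel.
    weights = [sum(sum(row) for row in g) / (H * W) for g in gradients]
    # 2) Weighted sum over channels, then ReLU: only features that *increase*
    #    the class score contribute to the heatmap.
    cam = [[max(0.0, sum(weights[c] * feature_maps[c][i][j] for c in range(C)))
            for j in range(W)] for i in range(H)]
    return cam
```

The final ReLU is what makes Grad-CAM heatmaps highlight only class-supporting evidence, which is why the papers below use them to check where a model is "looking".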

25 pages, 3053 KB  
Article
Enhanced YOLOv11 Framework for Accurate Multi-Fault Detection in UAV Photovoltaic Inspection
by Shufeng Meng, Yang Yue and Tianxu Xu
Sensors 2025, 25(17), 5311; https://doi.org/10.3390/s25175311 - 26 Aug 2025
Viewed by 514
Abstract
Stains, defects, and snow accumulation constitute three prevalent photovoltaic (PV) anomalies; each exhibits unique color and thermal signatures yet collectively curtail energy yield. Existing detectors typically sacrifice accuracy for speed, and none simultaneously classify all three fault types. To counter the identified limitations, an enhanced YOLOv11 framework is introduced. First, the hue-saturation-value (HSV) color model is employed to decouple hue and brightness, strengthening color feature extraction and cross-sensor generalization. Second, an outlook attention module integrated into the backbone precisely delineates micro-defect boundaries. Third, a mix structure block in the detection head encodes global context and fine-grained details to boost small object recognition. Additionally, the bounded sigmoid linear unit (B-SiLU) activation function optimizes gradient flow and feature discrimination through an improved nonlinear mapping, while the gradient-weighted class activation mapping (Grad-CAM) visualizations confirm selective attention to fault regions. Experimental results show that overall mean average precision (mAP) rises by 1.8%, with defect, stain, and snow accuracies improving by 2.2%, 3.3%, and 0.8%, respectively, offering a reliable solution for intelligent PV inspection and early fault detection. Full article
(This article belongs to the Special Issue Feature Papers in Communications Section 2025)
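The HSV step described in this abstract (decoupling hue from brightness) can be illustrated with Python's standard library; this is only a sketch of the color-space idea, not the authors' pipeline:

```python
import colorsys

def decouple_hue_value(pixel_rgb):
    """Split an RGB pixel (floats in [0, 1]) into hue, saturation, value,
    so color (hue) can be analyzed independently of brightness (value)."""
    r, g, b = pixel_rgb
    return colorsys.rgb_to_hsv(r, g, b)

# A bright and a dark red pixel share the same hue but differ in value;
# separating the two is what lets a color-based fault detector generalize
# across illumination and sensors.
bright_red = decouple_hue_value((1.0, 0.0, 0.0))
dark_red = decouple_hue_value((0.5, 0.0, 0.0))
```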

11 pages, 989 KB  
Article
Visual and Predictive Assessment of Pneumothorax Recurrence in Adolescents Using Machine Learning on Chest CT
by Kwanyong Hyun, Jae Jun Kim, Kyong Shil Im, Sang Chul Han and Jeong Hwan Ryu
J. Clin. Med. 2025, 14(17), 5956; https://doi.org/10.3390/jcm14175956 - 23 Aug 2025
Viewed by 341
Abstract
Background: Spontaneous pneumothorax (SP) in adolescents has a high recurrence risk, particularly without surgical treatment. This study aimed to predict recurrence using machine learning (ML) algorithms applied to chest computed tomography (CT) and to visualize CT features associated with recurrence. Methods: We retrospectively reviewed 299 adolescents with conservatively managed SP from January 2018 to December 2022. Clinical risk factors were statistically analyzed. Chest CT images were evaluated using ML models, with performance assessed by AUC, accuracy, precision, recall, and F1 score. Gradient-weighted Class Activation Mapping (Grad-CAM) was used for visual interpretation. Results: Among 164 right-sided and 135 left-sided SP cases, recurrence occurred in 54 and 43 cases, respectively. Mean recurrence intervals were 10.5 ± 9.9 months (right) and 12.7 ± 9.1 months (left). Presence of blebs or bullae was significantly associated with recurrence (p < 0.001). Neural networks achieved the best performance (AUC: 0.970 right, 0.958 left). Grad-CAM confirmed the role of blebs/bullae and highlighted apical lung regions in recurrence, even in their absence. Conclusions: ML algorithms applied to chest CT demonstrate high accuracy in predicting SP recurrence in adolescents. Visual analyses support the clinical relevance of blebs/bullae and suggest a key role of apical lung regions in recurrence, even when blebs/bullae are absent. Full article
(This article belongs to the Section Nuclear Medicine & Radiology)

17 pages, 2167 KB  
Article
Interpretable EEG Emotion Classification via CNN Model and Gradient-Weighted Class Activation Mapping
by Yuxuan Zhao, Linjing Cao, Yidao Ji, Bo Wang and Wei Wu
Brain Sci. 2025, 15(8), 886; https://doi.org/10.3390/brainsci15080886 - 20 Aug 2025
Viewed by 399
Abstract
Background/Objectives: Electroencephalography (EEG)-based emotion recognition plays an important role in affective computing and brain–computer interface applications. However, existing methods often face the challenge of achieving high classification accuracy while maintaining physiological interpretability. Methods: In this study, we propose a convolutional neural network (CNN) model with a simple architecture for EEG-based emotion classification. The model achieves classification accuracies of 95.21% for low/high arousal, 94.59% for low/high valence, and 93.01% for quaternary classification tasks on the DEAP dataset. To further improve model interpretability and support practical applications, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed to identify the EEG electrode regions that contribute most to the classification results. Results: The visualization reveals that electrodes located in the right prefrontal cortex and left parietal lobe are the most influential, which is consistent with findings from emotional lateralization theory. Conclusions: This provides a physiological basis for optimizing electrode placement in wearable EEG-based emotion recognition systems. The proposed method combines high classification performance with interpretability and provides guidance for the design of efficient and portable affective computing systems. Full article
(This article belongs to the Section Cognitive, Social and Affective Neuroscience)

26 pages, 3497 KB  
Article
A Multi-Branch Network for Integrating Spatial, Spectral, and Temporal Features in Motor Imagery EEG Classification
by Xiaoqin Lian, Chunquan Liu, Chao Gao, Ziqian Deng, Wenyang Guan and Yonggang Gong
Brain Sci. 2025, 15(8), 877; https://doi.org/10.3390/brainsci15080877 - 18 Aug 2025
Viewed by 438
Abstract
Background: Efficient decoding of motor imagery (MI) electroencephalogram (EEG) signals is essential for the precise control and practical deployment of brain-computer interface (BCI) systems. Owing to the complex nonlinear characteristics of EEG signals across spatial, spectral, and temporal dimensions, efficiently extracting multidimensional discriminative features remains a key challenge to improving MI-EEG decoding performance. Methods: To address the challenge of capturing complex spatial, spectral, and temporal features in MI-EEG signals, this study proposes a multi-branch deep neural network, which jointly models these dimensions to enhance classification performance. The network takes as inputs both a three-dimensional power spectral density tensor and two-dimensional time-domain EEG signals and incorporates four complementary feature extraction branches to capture spatial, spectral, spatial-spectral joint, and temporal dynamic features, thereby enabling unified multidimensional modeling. The model was comprehensively evaluated on two widely used public MI-EEG datasets: EEG Motor Movement/Imagery Database (EEGMMIDB) and BCI Competition IV Dataset 2a (BCIIV2A). To further assess interpretability, gradient-weighted class activation mapping (Grad-CAM) was employed to visualize the spatial and spectral features prioritized by the model. Results: On the EEGMMIDB dataset, it achieved an average classification accuracy of 86.34% and a kappa coefficient of 0.829 in the five-class task. On the BCIIV2A dataset, it reached an accuracy of 83.43% and a kappa coefficient of 0.779 in the four-class task. Conclusions: These results demonstrate that the network outperforms existing state-of-the-art methods in classification performance. Furthermore, Grad-CAM visualizations identified the key spatial channels and frequency bands attended to by the model, supporting its neurophysiological interpretability. Full article
(This article belongs to the Section Neurotechnology and Neuroimaging)

23 pages, 2640 KB  
Article
DenseNet-Based Classification of EEG Abnormalities Using Spectrograms
by Lan Wei and Catherine Mooney
Algorithms 2025, 18(8), 486; https://doi.org/10.3390/a18080486 - 5 Aug 2025
Viewed by 441
Abstract
Electroencephalogram (EEG) analysis is essential for diagnosing neurological disorders but typically requires expert interpretation and significant time. Purpose: This study aims to automate the classification of normal and abnormal EEG recordings to support clinical diagnosis and reduce manual workload. Automating the initial screening of EEGs can help clinicians quickly identify potential neurological abnormalities, enabling timely intervention and guiding further diagnostic and treatment strategies. Methodology: We utilized the Temple University Hospital EEG dataset to develop a DenseNet-based deep learning model. To enable a fair comparison of different EEG representations, we used three input types: signal images, spectrograms, and scalograms. To reduce dimensionality and simplify computation, we focused on two channels: T5 and O1. For interpretability, we applied Local Interpretable Model-agnostic Explanations (LIME) and Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the EEG regions influencing the model’s predictions. Key Findings: Among the input types, spectrogram-based representations achieved the highest classification accuracy, indicating that time-frequency features are especially effective for this task. The model demonstrated strong performance overall, and the integration of LIME and Grad-CAM provided transparent explanations of its decisions, enhancing interpretability. This approach offers a practical and interpretable solution for automated EEG screening, contributing to more efficient clinical workflows and better understanding of complex neurological conditions. Full article
(This article belongs to the Special Issue AI-Assisted Medical Diagnostics)
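The spectrogram input that performed best in this study is simply a grid of short-time spectral magnitudes: slice the signal into frames and take the magnitude of each frame's discrete Fourier transform. A naive pure-Python sketch of that structure (real pipelines use windowed FFTs, e.g. via SciPy; function and parameter names here are illustrative):

```python
import math

def spectrogram(signal, frame_len, hop):
    """Magnitude spectrogram via a naive per-frame DFT.
    Returns one row per time frame, one column per frequency bin."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spec = []
    for frame in frames:
        row = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            re = sum(x * math.cos(2 * math.pi * k * n / frame_len)
                     for n, x in enumerate(frame))
            im = -sum(x * math.sin(2 * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            row.append(math.hypot(re, im))
        spec.append(row)
    return spec
```

Fed to a CNN such as DenseNet, each row/column of this grid becomes a pixel row/column, which is how time-frequency structure ends up visible to an image classifier.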

20 pages, 4095 KB  
Article
Integrated Explainable Diagnosis of Gear Wear Faults Based on Dynamic Modeling and Data-Driven Representation
by Zemin Zhao, Tianci Zhang, Kang Xu, Jinyuan Tang and Yudian Yang
Sensors 2025, 25(15), 4805; https://doi.org/10.3390/s25154805 - 5 Aug 2025
Viewed by 424
Abstract
Gear wear degrades transmission performance, necessitating highly reliable fault diagnosis methods. To address the limitations of existing approaches—where dynamic models rely heavily on prior knowledge, while data-driven methods lack interpretability—this study proposes an integrated bidirectional verification framework combining dynamic modeling and deep learning for interpretable gear wear diagnosis. First, a dynamic gear wear model is established to quantitatively reveal wear-induced modulation effects on meshing stiffness and vibration responses. Then, a deep network incorporating Gradient-weighted Class Activation Mapping (Grad-CAM) enables visualized extraction of frequency-domain sensitive features. Bidirectional verification between the dynamic model and deep learning demonstrates enhanced meshing harmonics in wear faults, leading to a quantitative diagnostic index that achieves 0.9560 recognition accuracy for gear wear across four speed conditions, significantly outperforming comparative indicators. This research provides a novel approach for gear wear diagnosis that ensures both high accuracy and interpretability. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

19 pages, 1160 KB  
Article
Multi-User Satisfaction-Driven Bi-Level Optimization of Electric Vehicle Charging Strategies
by Boyin Chen, Jiangjiao Xu and Dongdong Li
Energies 2025, 18(15), 4097; https://doi.org/10.3390/en18154097 - 1 Aug 2025
Viewed by 360
Abstract
The accelerating integration of electric vehicles (EVs) into contemporary transportation infrastructure has underscored significant limitations in traditional charging paradigms, particularly in accommodating heterogeneous user requirements within dynamic operational environments. This study presents a differentiated optimization framework for EV charging strategies through the systematic classification of user types. A multidimensional decision-making environment is established for three representative user categories—residential, commercial, and industrial—by synthesizing time-variant electricity pricing models with dynamic carbon emission pricing mechanisms. A bi-level optimization architecture is subsequently formulated, leveraging deep reinforcement learning (DRL) to capture user-specific demand characteristics through customized reward functions and adaptive constraint structures. Validation is conducted within a high-fidelity simulation environment featuring 90 autonomous EV charging agents operating in a metropolitan parking facility. Empirical results indicate that the proposed typology-driven approach yields a 32.6% average cost reduction across user groups relative to baseline charging protocols, with statistically significant improvements in expenditure optimization (p < 0.01). Further interpretability analysis employing gradient-weighted class activation mapping (Grad-CAM) demonstrates that the model’s attention mechanisms are well aligned with theoretically anticipated demand prioritization patterns across the distinct user types, thereby confirming the decision-theoretic soundness of the framework. Full article
(This article belongs to the Section E: Electric Vehicles)

27 pages, 8594 KB  
Article
An Explainable Hybrid CNN–Transformer Architecture for Visual Malware Classification
by Mohammed Alshomrani, Aiiad Albeshri, Abdulaziz A. Alsulami and Badraddin Alturki
Sensors 2025, 25(15), 4581; https://doi.org/10.3390/s25154581 - 24 Jul 2025
Viewed by 1285
Abstract
Malware continues to develop, posing significant challenges for traditional signature-based detection systems. Visual malware classification, which transforms malware binaries into grayscale images, has emerged as a promising alternative for recognizing patterns in malicious code. This study presents a hybrid deep learning architecture that combines the local feature extraction capabilities of ConvNeXt-Tiny (a CNN-based model) with the global context modeling of the Swin Transformer. The proposed model is evaluated using three benchmark datasets—Malimg, MaleVis, VirusMNIST—encompassing 61 malware classes. Experimental results show that the hybrid model achieved a validation accuracy of 94.04%, outperforming both the ConvNeXt-Tiny-only model (92.45%) and the Swin Transformer-only model (90.44%). Additionally, we extended our validation dataset to two more datasets—Maldeb and Dumpware-10—to strengthen the empirical foundation of our work. The proposed hybrid model achieved competitive accuracy on both, with 98% on Maldeb and 97% on Dumpware-10. To enhance model interpretability, we employed Gradient-weighted Class Activation Mapping (Grad-CAM), which visualizes the learned representations and reveals the complementary nature of CNN and Transformer modules. The hybrid architecture, combined with explainable AI, offers an effective and interpretable approach for malware classification, facilitating better understanding and trust in automated detection systems. In addition, a real-time deployment scenario is demonstrated to validate the model’s practical applicability in dynamic environments. Full article
(This article belongs to the Special Issue Cyber Security and AI—2nd Edition)

15 pages, 1758 KB  
Article
Eye-Guided Multimodal Fusion: Toward an Adaptive Learning Framework Using Explainable Artificial Intelligence
by Sahar Moradizeyveh, Ambreen Hanif, Sidong Liu, Yuankai Qi, Amin Beheshti and Antonio Di Ieva
Sensors 2025, 25(15), 4575; https://doi.org/10.3390/s25154575 - 24 Jul 2025
Viewed by 458
Abstract
Interpreting diagnostic imaging and identifying clinically relevant features remain challenging tasks, particularly for novice radiologists who often lack structured guidance and expert feedback. To bridge this gap, we propose an Eye-Gaze Guided Multimodal Fusion framework that leverages expert eye-tracking data to enhance learning and decision-making in medical image interpretation. By integrating chest X-ray (CXR) images with expert fixation maps, our approach captures radiologists’ visual attention patterns and highlights regions of interest (ROIs) critical for accurate diagnosis. The fusion model utilizes a shared backbone architecture to jointly process image and gaze modalities, thereby minimizing the impact of noise in fixation data. We validate the system’s interpretability using Gradient-weighted Class Activation Mapping (Grad-CAM) and assess both classification performance and explanation alignment with expert annotations. Comprehensive evaluations, including robustness under gaze noise and expert clinical review, demonstrate the framework’s effectiveness in improving model reliability and interpretability. This work offers a promising pathway toward intelligent, human-centered AI systems that support both diagnostic accuracy and medical training. Full article
(This article belongs to the Section Sensing and Imaging)

17 pages, 3477 KB  
Article
Breaking Diagnostic Barriers: Vision Transformers Redefine Monkeypox Detection
by Gelan Ayana, Beshatu Debela Wako, So-yun Park, Jude Kong, Sahng Min Han, Soon-Do Yoon and Se-woon Choe
Diagnostics 2025, 15(13), 1698; https://doi.org/10.3390/diagnostics15131698 - 3 Jul 2025
Viewed by 523
Abstract
Background/Objective: The global spread of Monkeypox (Mpox) has highlighted the urgent need for rapid, accurate diagnostic tools. Traditional methods like polymerase chain reaction (PCR) are resource-intensive, while skin image-based detection offers a promising alternative. This study evaluates the effectiveness of vision transformers (ViTs) for automated Mpox detection. Methods: By fine-tuning a pre-trained ViT model on an Mpox lesion image dataset, a robust ViT-based transfer learning (TL) model was created. Performance was assessed relative to convolutional neural network (CNN)-based TL models and ViT models trained from scratch across key metrics: accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC). Furthermore, a transferability measure was utilized to assess the effectiveness of feature transfer to Mpox images. Results: The results show that the ViT model outperformed a CNN, achieving an AUC of 0.948 and an accuracy of 0.942 with a p-value of less than 0.05 across all metrics, highlighting its potential for accurate and scalable Mpox detection. Moreover, the ViT models yielded a better hypothesis margin-based transferability measure, highlighting its effectiveness in transferring useful learning weights to Mpox images. Gradient-weighted Class Activation Mapping (Grad-CAM) visualizations also confirmed that the ViT model attends to clinically relevant features, supporting its interpretability and reliability for diagnostic use. Conclusions: The results from this study suggest that ViT offers superior accuracy, making it a valuable tool for Mpox early detection in field settings, especially where conventional diagnostics are limited. This approach could support faster outbreak response and improved resource allocation in public health systems. Full article

21 pages, 3582 KB  
Article
A Cascade of Encoder–Decoder with Atrous Convolution and Ensemble Deep Convolutional Neural Networks for Tuberculosis Detection
by Noppadol Maneerat, Athasart Narkthewan and Kazuhiko Hamamoto
Appl. Sci. 2025, 15(13), 7300; https://doi.org/10.3390/app15137300 - 28 Jun 2025
Viewed by 390
Abstract
Tuberculosis (TB) is the most serious worldwide infectious disease and the leading cause of death among people with HIV. Early diagnosis and prompt treatment can curb the rising number of TB deaths, and analysis of chest X-rays is a cost-effective method. We describe a deep learning-based cascade algorithm for detecting TB in chest X-rays. Firstly, the lung regions were segregated from other anatomical structures by an encoder–decoder with an atrous separable convolution network—DeepLabv3+ with an XceptionNet backbone, DLabv3+X—and then cropped by a bounding box. Using the cropped lung images, we trained several pre-trained Deep Convolutional Neural Networks (DCNNs) on the images with hyperparameters optimized by a Bayesian algorithm. Different combinations of trained DCNNs were compared, and the combination with the maximum accuracy was retained as the winning combination. The ensemble classifier was designed to predict the presence of TB by fusing DCNNs from the winning combination via weighted averaging. Our lung segmentation was evaluated on three publicly available datasets: it provided better Intersection over Union (IoU) values: 95.1% for Montgomery County (MC), 92.8% for Shenzhen (SZ), and 96.1% for JSRT datasets. For TB prediction, our ensemble classifier produced a better accuracy of 92.7% for the MC dataset and obtained a comparable accuracy of 95.5% for the SZ dataset. Finally, occlusion sensitivity and gradient-weighted class activation maps (Grad-CAM) were generated to indicate the most influential regions for the prediction of TB and to localize TB manifestations. Full article
(This article belongs to the Special Issue Advances in Deep Learning and Intelligent Computing)
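The ensemble step described above (fusing DCNN class probabilities via weighted averaging) reduces to a few lines. A sketch with made-up probabilities and weights, not the paper's values:

```python
def ensemble_predict(prob_lists, weights):
    """Fuse several classifiers' class-probability vectors by weighted
    averaging, then pick the argmax class."""
    total = sum(weights)
    n_classes = len(prob_lists[0])
    fused = [sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
             for c in range(n_classes)]
    return fused, max(range(n_classes), key=fused.__getitem__)

# Two hypothetical DCNNs scoring [normal, TB]; the disagreeing votes are
# resolved by the weights, which in the paper would come from validation
# accuracy of the winning combination.
fused, label = ensemble_predict([[0.4, 0.6], [0.7, 0.3]], weights=[0.7, 0.3])
```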

26 pages, 8949 KB  
Article
Real-Time Detection of Hole-Type Defects on Industrial Components Using Raspberry Pi 5
by Mehmet Deniz, Ismail Bogrekci and Pinar Demircioglu
Appl. Syst. Innov. 2025, 8(4), 89; https://doi.org/10.3390/asi8040089 - 27 Jun 2025
Viewed by 1054
Abstract
In modern manufacturing, ensuring quality control for geometric features is critical, yet detecting anomalies in circular components remains underexplored. This study proposes a real-time defect detection framework for metal parts with holes, optimized for deployment on a Raspberry Pi 5 edge device. We fine-tuned and evaluated three deep learning models ResNet50, EfficientNet-B3, and MobileNetV3-Large on a grayscale image dataset (43,482 samples) containing various hole defects and imbalances. Through extensive data augmentation and class-weighting, the models achieved near-perfect binary classification of defective vs. non-defective parts. Notably, ResNet50 attained 99.98% accuracy (precision 0.9994, recall 1.0000), correctly identifying all defects with only one false alarm. MobileNetV3-Large and EfficientNet-B3 likewise exceeded 99.9% accuracy, with slightly more false positives, but offered advantages in model size or interpretability. Gradient-weighted Class Activation Mapping (Grad-CAM) visualizations confirmed that each network focuses on meaningful geometric features (misaligned or irregular holes) when predicting defects, enhancing explainability. These results demonstrate that lightweight CNNs can reliably detect geometric deviations (e.g., mispositioned or missing holes) in real time. The proposed system significantly improves inline quality assurance by enabling timely, accurate, and interpretable defect detection on low-cost hardware, paving the way for smarter manufacturing inspection. Full article
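The class weighting mentioned above, used to counter the defective/non-defective imbalance, is commonly implemented as inverse-frequency weights. A sketch with a hypothetical class split (the counts below are illustrative, not the paper's actual split):

```python
def balanced_class_weights(counts):
    """Inverse-frequency class weights (the common 'balanced' heuristic):
    weight_c = n_samples / (n_classes * count_c), so rare classes weigh
    more in the loss and the model cannot ignore them."""
    n = sum(counts)
    k = len(counts)
    return [n / (k * c) for c in counts]

# Hypothetical split: many non-defective parts, few defective ones.
weights = balanced_class_weights([40000, 3482])
```

Each class then contributes equally to the weighted loss in expectation, which is why the rare defective class gets the much larger weight.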

20 pages, 1771 KB  
Article
An Innovative Artificial Intelligence Classification Model for Non-Ischemic Cardiomyopathy Utilizing Cardiac Biomechanics Derived from Magnetic Resonance Imaging
by Liqiang Fu, Peifang Zhang, Liuquan Cheng, Peng Zhi, Jiayu Xu, Xiaolei Liu, Yang Zhang, Ziwen Xu and Kunlun He
Bioengineering 2025, 12(6), 670; https://doi.org/10.3390/bioengineering12060670 - 19 Jun 2025
Viewed by 762
Abstract
Significant challenges persist in diagnosing non-ischemic cardiomyopathies (NICMs) owing to early morphological overlap and subtle functional changes. While cardiac magnetic resonance (CMR) offers gold-standard structural assessment, current morphology-based AI models frequently overlook key biomechanical dysfunctions like diastolic/systolic abnormalities. To address this, we propose a dual-path hybrid deep learning framework based on CNN-LSTM and MLP, integrating anatomical features from cine CMR with biomechanical markers derived from intraventricular pressure gradients (IVPGs), significantly enhancing NICM subtype classification by capturing subtle biomechanical dysfunctions overlooked by traditional morphological models. Our dual-path architecture combines a CNN-LSTM encoder for cine CMR analysis and an MLP encoder for IVPG time-series data, followed by feature fusion and dense classification layers. Trained on a multicenter dataset of 1196 patients and externally validated on 137 patients from a distinct institution, the model achieved a superior performance (internal AUC: 0.974; external AUC: 0.962), outperforming ResNet50, VGG16, and radiomics-based SVM. Ablation studies confirmed IVPGs’ significant contribution, while gradient saliency and gradient-weighted class activation mapping (Grad-CAM) visualizations proved the model pays attention to physiologically relevant cardiac regions and phases. The framework maintained robust generalizability across imaging protocols and institutions with minimal performance degradation. By synergizing biomechanical insights with deep learning, our approach offers an interpretable, data-efficient solution for early NICM detection and subtype differentiation, holding strong translational potential for clinical practice. Full article
(This article belongs to the Special Issue Bioengineering in a Generative AI World)

14 pages, 3123 KB  
Article
Impact of Activation Functions on the Detection of Defects in Cast Steel Parts Using YOLOv8
by Yunxia Chen, Yangkai He and Yukun Chu
Materials 2025, 18(12), 2834; https://doi.org/10.3390/ma18122834 - 16 Jun 2025
Viewed by 407
Abstract
In this paper, to address the issue of the unknown influence of activation functions on casting defect detection using convolutional neural networks (CNNs), we designed five sets of experiments to investigate how different activation functions affect the performance of casting defect detection. Specifically, the study employs five activation functions—Rectified Linear Unit (ReLU), Exponential Linear Unit (ELU), Softplus, Sigmoid Linear Unit (SiLU), and Mish—each with distinct characteristics, based on the YOLOv8 algorithm. The results indicate that the Mish activation function yields the best performance in casting defect detection, achieving an mAP@0.5 value of 90.1%. In contrast, the Softplus activation function performs the worst, with an mAP@0.5 value of only 86.7%. The analysis of the feature maps shows that the Mish activation function enables the output of negative values, enhancing the model's ability to differentiate features and improving its overall expressive power, which helps it identify various types of casting defects. Finally, gradient-weighted class activation maps (Grad-CAM) are used to visualize the important pixel regions in the casting digital radiography (DR) images processed by the neural network. The results demonstrate that the Mish activation function improves the model's focus on grayscale-changing regions in the image, thereby enhancing detection accuracy. Full article
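The five activation functions compared in this paper have standard closed forms; a pure-Python sketch for reference:

```python
import math

def relu(x):            # hard zero for x < 0
    return max(0.0, x)

def elu(x, alpha=1.0):  # smooth, bounded below by -alpha
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def softplus(x):        # smooth approximation of ReLU, always positive
    return math.log1p(math.exp(x))

def silu(x):            # x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def mish(x):            # x * tanh(softplus(x))
    return x * math.tanh(softplus(x))
```

Unlike ReLU and Softplus, SiLU and Mish pass small negative values through (e.g., `mish(-1.0)` is negative), which is the property the abstract credits for Mish's better feature discrimination.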
