Search Results (379)

Search Parameters:
Keywords = Xception

29 pages, 38860 KB  
Article
Explainable Deep Ensemble Meta-Learning Framework for Brain Tumor Classification Using MRI Images
by Shawon Chakrabarty Kakon, Zawad Al Sazid, Ismat Ara Begum, Md Abdus Samad and A. S. M. Sanwar Hosen
Cancers 2025, 17(17), 2853; https://doi.org/10.3390/cancers17172853 - 30 Aug 2025
Viewed by 118
Abstract
Background: Brain tumors can severely impair neurological function, leading to symptoms such as headaches, memory loss, motor coordination deficits, and visual disturbances. In severe cases, they may cause permanent cognitive damage or become life-threatening without early detection. Methods: To address this, we propose an interpretable deep ensemble model for tumor detection in Magnetic Resonance Imaging (MRI) by integrating pre-trained Convolutional Neural Networks (EfficientNetB7, InceptionV3, and Xception) using a soft voting ensemble to improve classification accuracy. The framework is further enhanced with a Light Gradient Boosting Machine as a meta-learner to increase prediction accuracy and robustness within a stacking architecture. Hyperparameter tuning is conducted using Optuna, and overfitting is mitigated through batch normalization, L2 weight decay, dropout, early stopping, and extensive data augmentation. Results: These regularization strategies significantly enhance the model's generalization ability within the BR35H dataset. The framework achieves a classification accuracy of 99.83% on the MRI dataset of 3060 images. Conclusions: To improve interpretability and build clinical trust, Explainable Artificial Intelligence methods (Grad-CAM++, LIME, and SHAP) are employed to visualize the factors influencing model predictions, effectively highlighting tumor regions within MRI scans. This establishes a strong foundation for further advancements in radiology decision support systems.
(This article belongs to the Section Methods and Technologies Development)
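A minimal sketch of the soft-voting and stacking design described above, not the authors' code: the base-model probabilities, the toy data, and the LightGBM settings are assumptions. Soft voting averages the per-model class probabilities; stacking concatenates them as meta-features for a LightGBM meta-learner.

```python
import numpy as np
from lightgbm import LGBMClassifier  # meta-learner for the stacking stage

# Hypothetical per-model class probabilities on the same validation images
# (n_samples x n_classes); in practice these would come from the EfficientNetB7,
# InceptionV3 and Xception heads.
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(2), size=200) for _ in range(3)]
y_val = rng.integers(0, 2, size=200)

# Soft voting: average the predicted probabilities and take the argmax.
soft_vote = np.mean(probs, axis=0).argmax(axis=1)

# Stacking: concatenate base-model probabilities as meta-features and
# fit a LightGBM meta-learner on them.
meta_features = np.hstack(probs)
meta_learner = LGBMClassifier(n_estimators=200, learning_rate=0.05)
meta_learner.fit(meta_features, y_val)
stacked_pred = meta_learner.predict(meta_features)
print(soft_vote[:10], stacked_pred[:10])
```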

24 pages, 15799 KB  
Article
Performance Comparison of Embedded AI Solutions for Classification and Detection in Lung Disease Diagnosis
by Md Sabbir Ahmed, Stefano Giordano and Davide Adami
Appl. Sci. 2025, 15(17), 9345; https://doi.org/10.3390/app15179345 - 26 Aug 2025
Viewed by 371
Abstract
Lung disease diagnosis from chest X-ray images is a critical task in clinical care, especially in resource-constrained settings where access to radiology expertise and computational infrastructure is limited. Recent advances in deep learning have shown promise, yet most studies focus solely on either classification or detection in isolation, rarely exploring their combined potential in an embedded, real-world setting. To address this, we present a dual deep learning approach that combines five-class disease classification and multi-label thoracic abnormality detection, optimized for embedded edge deployment. Specifically, we evaluate six state-of-the-art CNN architectures (ResNet101, DenseNet201, MobileNetV3-Large, EfficientNetV2-B0, InceptionResNetV2, and Xception) on both base (2020 images) and augmented (9875 images) datasets. Validation accuracies ranged from 55.3% to 70.7% on the base dataset and improved to 58.4% to 72.0% with augmentation, with MobileNetV3-Large achieving the highest accuracy on both. In parallel, we trained a YOLOv8n model for multi-label detection of 14 thoracic diseases. While not deployed in this work, its lightweight architecture makes it suitable for future use on embedded platforms. All classification models were evaluated for end-to-end inference on a Raspberry Pi 4 using a high-resolution chest X-ray image (2566 × 2566, PNG). MobileNetV3-Large achieved the lowest latency at 429.6 ms, and all models completed inference in under 2.4 s. These results demonstrate the feasibility of combining classification for rapid triage and detection for spatial interpretability in real-time, embedded clinical environments, paving the way for practical, low-cost AI-based decision support systems for surgery rooms and mobile clinical environments.
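As a rough illustration of the end-to-end latency measurement reported above, the sketch below times preprocessing plus a single forward pass of a Keras MobileNetV3-Large on one high-resolution image. The synthetic image, the five-class head, and the 224 x 224 input size are assumptions, and this is not the authors' benchmarking harness.

```python
import time
import numpy as np
import tensorflow as tf

# Placeholder input: a synthetic 2566 x 2566 grayscale chest X-ray stacked to 3 channels.
xray = np.random.randint(0, 256, size=(2566, 2566), dtype=np.uint8)

# Untrained head here; in practice the model would be fine-tuned on chest X-rays.
model = tf.keras.applications.MobileNetV3Large(weights=None, classes=5)

start = time.perf_counter()
# Preprocessing: resize to the network input size and apply the model's scaling.
img = tf.image.resize(np.repeat(xray[..., None], 3, axis=-1), (224, 224))
img = tf.keras.applications.mobilenet_v3.preprocess_input(img)
pred = model.predict(img[None, ...], verbose=0)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"end-to-end inference: {elapsed_ms:.1f} ms, top class: {pred.argmax()}")
```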

15 pages, 5342 KB  
Article
Transfer Learning-Based Multi-Sensor Approach for Predicting Keyhole Depth in Laser Welding of 780DP Steel
by Byeong-Jin Kim, Young-Min Kim and Cheolhee Kim
Materials 2025, 18(17), 3961; https://doi.org/10.3390/ma18173961 - 24 Aug 2025
Viewed by 429
Abstract
Penetration depth is a critical factor determining joint strength in butt welding; however, it is difficult to monitor in keyhole-mode laser welding due to the dynamic nature of the keyhole. Recently, optical coherence tomography (OCT) has been introduced for real-time keyhole depth measurement, though accurate results require meticulous calibration. In this study, deep learning-based models were developed to estimate penetration depth in laser welding of 780 dual-phase (DP) steel. The models utilized coaxial weld pool images and spectrometer signals as inputs, with OCT signals serving as the output reference. Both uni-sensor models (based on coaxial pool images) and multi-sensor models (incorporating spectrometer data) were developed using transfer learning techniques based on pre-trained convolutional neural network (CNN) architectures including MobileNetV2, ResNet50V2, EfficientNetB3, and Xception. The coefficient of determination (R²) values of the uni-sensor CNN transfer learning models without fine-tuning ranged from 0.502 to 0.681, and the mean absolute errors (MAEs) ranged from 0.152 mm to 0.196 mm. In the fine-tuning models, R² decreased by more than 17%, and MAE increased by more than 11% compared to the previous models without fine-tuning. In addition, in the multi-sensor model, R² ranged from 0.900 to 0.956, and MAE ranged from 0.058 mm to 0.086 mm, showing better performance than the uni-sensor CNN transfer learning models. This study demonstrated the potential of using CNN transfer learning models for predicting penetration depth in laser welding of 780DP steel.
(This article belongs to the Special Issue Advances in Plasma and Laser Engineering (Second Edition))
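A minimal sketch of the multi-sensor idea, assuming an Xception backbone for the coaxial pool images and a small dense branch for spectrometer features regressed against the OCT keyhole depth; layer sizes, the 128-bin spectral input, and the frozen backbone are illustrative assumptions, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Image branch: pre-trained Xception backbone used for transfer learning (frozen here).
backbone = tf.keras.applications.Xception(include_top=False, weights="imagenet",
                                          input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False

img_in = layers.Input((224, 224, 3), name="pool_image")
spec_in = layers.Input((128,), name="spectrometer")  # assumed spectral feature length

x = backbone(img_in)
s = layers.Dense(64, activation="relu")(spec_in)
fused = layers.Concatenate()([x, s])                  # multi-sensor feature fusion
depth = layers.Dense(1, name="keyhole_depth")(fused)  # regression output (mm)

model = Model([img_in, spec_in], depth)
model.compile(optimizer="adam", loss="mae")           # MAE as reported in the abstract
model.summary()
```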

33 pages, 8494 KB  
Article
Enhanced Multi-Class Brain Tumor Classification in MRI Using Pre-Trained CNNs and Transformer Architectures
by Marco Antonio Gómez-Guzmán, Laura Jiménez-Beristain, Enrique Efren García-Guerrero, Oscar Adrian Aguirre-Castro, José Jaime Esqueda-Elizondo, Edgar Rene Ramos-Acosta, Gilberto Manuel Galindo-Aldana, Cynthia Torres-Gonzalez and Everardo Inzunza-Gonzalez
Technologies 2025, 13(9), 379; https://doi.org/10.3390/technologies13090379 - 22 Aug 2025
Viewed by 495
Abstract
Early and accurate identification of brain tumors is essential for determining effective treatment strategies and improving patient outcomes. Artificial intelligence (AI) and deep learning (DL) techniques have shown promise in automating diagnostic tasks based on magnetic resonance imaging (MRI). This study evaluates the performance of four pre-trained deep convolutional neural network (CNN) architectures for the automatic multi-class classification of brain tumors into four categories: Glioma, Meningioma, Pituitary, and No Tumor. The proposed approach utilizes the publicly accessible Brain Tumor MRI Msoud dataset, consisting of 7023 images, with 5712 provided for training and 1311 for testing. To assess the impact of data availability, subsets containing 25%, 50%, 75%, and 100% of the training data were used. A stratified five-fold cross-validation technique was applied. The CNN architectures evaluated include DeiT3_base_patch16_224, Xception41, Inception_v4, and Swin_Tiny_Patch4_Window7_224, all fine-tuned using transfer learning. The training pipeline incorporated advanced preprocessing and image data augmentation techniques to enhance robustness and mitigate overfitting. Among the models tested, Swin_Tiny_Patch4_Window7_224 achieved the highest classification accuracy of 99.24% on the test set using 75% of the training data. This model demonstrated superior generalization across all tumor classes and effectively addressed class imbalance issues. Furthermore, we deployed and benchmarked the best-performing DL model on embedded AI platforms (Jetson AGX Xavier and Orin Nano), demonstrating their capability for real-time inference and highlighting their feasibility for edge-based clinical deployment. The results highlight the strong potential of pre-trained deep CNN and transformer-based architectures in medical image analysis. The proposed approach provides a scalable and energy-efficient solution for automated brain tumor diagnosis, facilitating the integration of AI into clinical workflows.
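The model names in this abstract follow the timm naming convention, so the transfer-learning and stratified five-fold setup can be sketched as below. The placeholder labels, batch, and four-class head are assumptions; pretrained=True would be used in practice (disabled here only to avoid a weight download).

```python
import numpy as np
import timm
import torch
from sklearn.model_selection import StratifiedKFold

# Placeholder labels for the four classes (Glioma, Meningioma, Pituitary, No Tumor).
labels = np.random.randint(0, 4, size=1000)

# Stratified five-fold cross-validation, as used in the study.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(np.zeros(len(labels)), labels)):
    # Backbone fine-tuned via transfer learning; classification head replaced for 4 classes.
    model = timm.create_model("swin_tiny_patch4_window7_224",
                              pretrained=False, num_classes=4)
    dummy = torch.randn(2, 3, 224, 224)        # stand-in image batch
    logits = model(dummy)                      # shape: (2, 4)
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)}, "
          f"logits {tuple(logits.shape)}")
```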

35 pages, 13933 KB  
Article
EndoNet: A Multiscale Deep Learning Framework for Multiple Gastrointestinal Disease Classification via Endoscopic Images
by Omneya Attallah, Muhammet Fatih Aslan and Kadir Sabanci
Diagnostics 2025, 15(16), 2009; https://doi.org/10.3390/diagnostics15162009 - 11 Aug 2025
Viewed by 463
Abstract
Background: Gastrointestinal (GI) disorders present significant healthcare challenges, requiring rapid, accurate, and effective diagnostic methods to improve treatment outcomes and prevent complications. Wireless capsule endoscopy (WCE) is an effective tool for diagnosing GI abnormalities; however, precisely identifying diverse lesions with similar visual patterns remains difficult. Methods: Many existing computer-aided diagnostic (CAD) systems rely on manually crafted features or single deep learning (DL) models, which often fail to capture the complex and varied characteristics of GI diseases. In this study, we proposed "EndoNet," a multi-stage hybrid DL framework for eight-class GI disease classification using WCE images. Features were extracted from two different layers of three pre-trained convolutional neural networks (CNNs) (Inception, Xception, ResNet101), with both inter-layer and inter-model feature fusion performed. Dimensionality reduction was achieved using Non-Negative Matrix Factorization (NNMF), followed by selection of the most informative features via the Minimum Redundancy Maximum Relevance (mRMR) method. Results: Two datasets, Kvasir v2 and HyperKvasir, were used to evaluate the performance of EndoNet. Classification using seven different machine learning algorithms achieved maximum accuracies of 97.8% and 98.4% on the Kvasir v2 and HyperKvasir datasets, respectively. Conclusions: By integrating transfer learning with feature engineering, dimensionality reduction, and feature selection, EndoNet provides high accuracy, flexibility, and interpretability. This framework offers a powerful and generalizable artificial intelligence solution suitable for clinical decision support systems.
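A minimal sketch of the fusion-reduction-selection pipeline described above, under stated assumptions: the deep features are random stand-ins, and mRMR is approximated with a mutual-information SelectKBest (a readily available substitute, not the method used in the paper).

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Hypothetical deep features from two layers of three CNNs, already flattened.
features = [np.abs(rng.normal(size=(300, 256))) for _ in range(6)]
y = rng.integers(0, 8, size=300)               # eight GI disease classes

fused = np.hstack(features)                    # inter-layer and inter-model fusion
reduced = NMF(n_components=64, max_iter=500, init="nndsvda").fit_transform(fused)

# The paper uses mRMR; mutual-information SelectKBest stands in for supervised selection.
selected = SelectKBest(mutual_info_classif, k=32).fit_transform(reduced, y)

clf = SVC(kernel="rbf").fit(selected, y)       # one of several possible ML classifiers
print("training accuracy:", clf.score(selected, y))
```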

21 pages, 2896 KB  
Article
Explainable CNN–Radiomics Fusion and Ensemble Learning for Multimodal Lesion Classification in Dental Radiographs
by Zuhal Can and Emre Aydin
Diagnostics 2025, 15(16), 1997; https://doi.org/10.3390/diagnostics15161997 - 9 Aug 2025
Viewed by 509
Abstract
Background/Objectives: Clinicians routinely rely on periapical radiographs to identify root-end disease, but interpretation errors and inconsistent readings compromise diagnostic accuracy. We, therefore, developed an explainable, multimodal AI framework that (i) fuses two data modalities, deep CNN embeddings and radiomic texture descriptors that are extracted only from lesion-relevant pixels selected by Grad-CAM, and (ii) makes every prediction transparent through dual-layer explainability (pixel-level Grad-CAM heatmaps + feature-level SHAP values). Methods: A dataset of 2285 periapical radiographs was processed using six CNN architectures (EfficientNet-B1/B4/V2M/V2S, ResNet-50, Xception). For each image, a Grad-CAM heatmap generated from the penultimate layer of the CNN was thresholded to create a binary mask that delineated the region most responsible for the network's decision. Radiomic features (first-order, GLCM, GLRLM, GLDM, NGTDM, and shape2D) were then computed only within that mask, ensuring that handcrafted descriptors and learned embeddings referred to the same anatomic focus. The two feature streams were concatenated, optionally reduced by principal component analysis or SelectKBest, and fed to random forest or XGBoost classifiers; five-view test-time augmentation (TTA) was applied at inference. Pixel-level interpretability was provided by the original Grad-CAM, while SHAP quantified the contribution of each radiomic and deep feature to the final vote. Results: Raw CNNs achieved approximately 52% accuracy and AUC values near 0.60. The multimodal fusion raised performance dramatically; the Xception + radiomics + random forest model achieved 95.4% accuracy and an AUC of 0.9867, and adding TTA increased these to 96.3% and 0.9917, respectively. The top ensemble, Xception and EfficientNet-V2S fusion vectors classified with XGBoost under five-view TTA, reached 97.16% accuracy and an AUC of 0.9914, with false-positive and false-negative rates of 4.6% and 0.9%, respectively. Grad-CAM heatmaps consistently highlighted periapical regions, while SHAP plots revealed that radiomic texture heterogeneity and high-level CNN features jointly contributed to correct classifications. Conclusions: By tightly integrating CNN embeddings, mask-targeted radiomics, and a two-tiered explainability stack (Grad-CAM + SHAP), the proposed system delivers state-of-the-art lesion detection in a transparent manner, addressing both accuracy and trust.
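The mask-targeted idea can be illustrated with a short sketch: threshold a Grad-CAM heatmap to a binary mask and compute descriptors only inside it. The image, heatmap, and 0.6 threshold are placeholders, and simple numpy first-order statistics stand in for the full radiomics feature set used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
radiograph = rng.random((299, 299))            # placeholder periapical radiograph
heatmap = rng.random((299, 299))               # placeholder Grad-CAM heatmap in [0, 1]

# Threshold the Grad-CAM heatmap to a binary mask of lesion-relevant pixels.
mask = heatmap >= 0.6                          # threshold value is an assumption

# Compute simple first-order descriptors inside the mask only; the paper applies a
# full radiomics toolbox (GLCM, GLRLM, GLDM, NGTDM, shape2D) to the same region.
roi = radiograph[mask]
first_order = {
    "mean": roi.mean(),
    "std": roi.std(),
    "skewness": ((roi - roi.mean()) ** 3).mean() / (roi.std() ** 3 + 1e-8),
    "energy": (roi ** 2).sum(),
}
print(first_order)
```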

14 pages, 2224 KB  
Article
Evaluation of Transfer Learning Efficacy for Surgical Suture Quality Classification on Limited Datasets
by Roman Ishchenko, Maksim Solopov, Andrey Popandopulo, Elizaveta Chechekhina, Viktor Turchin, Fedor Popivnenko, Aleksandr Ermak, Konstantyn Ladyk, Anton Konyashin, Kirill Golubitskiy, Aleksei Burtsev and Dmitry Filimonov
J. Imaging 2025, 11(8), 266; https://doi.org/10.3390/jimaging11080266 - 8 Aug 2025
Viewed by 368
Abstract
This study evaluates the effectiveness of transfer learning with pre-trained convolutional neural networks (CNNs) for the automated binary classification of surgical suture quality (high-quality/low-quality) using photographs of three suture types: interrupted open vascular sutures (IOVS), continuous over-and-over open sutures (COOS), and interrupted laparoscopic sutures (ILS). To address the challenge of limited medical data, eight state-of-the-art CNN architectures (EfficientNetB0, ResNet50V2, MobileNetV3Large, VGG16, VGG19, InceptionV3, Xception, and DenseNet121) were trained and validated on small datasets (100–190 images per type) using 5-fold cross-validation. Performance was assessed using the F1-score, AUC-ROC, and a custom weighted stability-aware score (Score_adj). The results demonstrate that transfer learning achieves robust classification (F1 > 0.90 for IOVS/ILS, 0.79 for COOS) despite data scarcity. ResNet50V2, DenseNet121, and Xception showed the greatest stability according to Score_adj, with ResNet50V2 achieving the highest AUC-ROC (0.959 ± 0.008) for IOVS internal view classification. Grad-CAM visualizations confirmed model focus on clinically relevant features (e.g., stitch uniformity, tissue apposition). These findings validate transfer learning as a powerful approach for developing objective, automated surgical skill assessment tools, reducing reliance on subjective expert evaluations while maintaining accuracy in resource-constrained settings.
(This article belongs to the Special Issue Advances in Machine Learning for Medical Imaging Applications)
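The abstract does not give the formula for the weighted stability-aware score, so the sketch below uses a hypothetical composite (mean F1 across folds penalized by its standard deviation) purely to illustrate the idea of rewarding both accuracy and fold-to-fold stability; the actual Score_adj in the paper may differ.

```python
import numpy as np

def score_adj(fold_f1, stability_weight=0.5):
    """Hypothetical stability-aware score: reward mean F1, penalize variance across
    cross-validation folds. The paper's Score_adj definition may differ."""
    fold_f1 = np.asarray(fold_f1)
    return fold_f1.mean() - stability_weight * fold_f1.std()

# Example: 5-fold F1 scores for two hypothetical models.
print(score_adj([0.91, 0.93, 0.90, 0.92, 0.94]))   # accurate and stable
print(score_adj([0.97, 0.84, 0.95, 0.82, 0.96]))   # similar mean, less stable
```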

26 pages, 1790 KB  
Article
A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset
by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose
AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025
Viewed by 701
Abstract
In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments.
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
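A minimal sketch of how a hybrid head could sit on top of VGG16 features, assuming illustrative layer sizes: batch normalization and a dilated convolution over the VGG16 feature maps, then a GRU over the spatial positions. The Transformer module from the paper is omitted for brevity, and none of the sizes are the authors' settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                       input_shape=(224, 224, 3))
backbone.trainable = False

inp = layers.Input((224, 224, 3))
x = backbone(inp)                               # (7, 7, 512) feature maps
x = layers.BatchNormalization()(x)
x = layers.Conv2D(128, 3, padding="same", dilation_rate=2, activation="relu")(x)
x = layers.Reshape((49, 128))(x)                # treat spatial positions as a sequence
x = layers.GRU(64)(x)                           # recurrent aggregation of spatial context
out = layers.Dense(39, activation="softmax")(x) # 39 medicinal/aromatic plant classes

model = Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```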

26 pages, 4572 KB  
Article
Transfer Learning-Based Ensemble of CNNs and Vision Transformers for Accurate Melanoma Diagnosis and Image Retrieval
by Murat Sarıateş and Erdal Özbay
Diagnostics 2025, 15(15), 1928; https://doi.org/10.3390/diagnostics15151928 - 31 Jul 2025
Viewed by 518
Abstract
Background/Objectives: Melanoma is an aggressive type of skin cancer that poses serious health risks if not detected in its early stages. Although early diagnosis enables effective treatment, delays can result in life-threatening consequences. Traditional diagnostic processes predominantly rely on the subjective expertise of dermatologists, which can lead to variability and time inefficiencies. Consequently, there is an increasing demand for automated systems that can accurately classify melanoma lesions and retrieve visually similar cases to support clinical decision-making. Methods: This study proposes a transfer learning (TL)-based deep learning (DL) framework for the classification of melanoma images and the enhancement of content-based image retrieval (CBIR) systems. Pre-trained models including DenseNet121, InceptionV3, Vision Transformer (ViT), and Xception were employed to extract deep feature representations. These features were integrated using a weighted fusion strategy and classified through an Ensemble learning approach designed to capitalize on the complementary strengths of the individual models. The performance of the proposed system was evaluated using classification accuracy and mean Average Precision (mAP) metrics. Results: Experimental evaluations demonstrated that the proposed Ensemble model significantly outperformed each standalone model in both classification and retrieval tasks. The Ensemble approach achieved a classification accuracy of 95.25%. In the CBIR task, the system attained an mAP score of 0.9538, indicating high retrieval effectiveness. The performance gains were attributed to the synergistic integration of features from diverse model architectures through the ensemble and fusion strategies. Conclusions: The findings underscore the effectiveness of TL-based DL models in automating melanoma image classification and enhancing CBIR systems. The integration of deep features from multiple pre-trained models using an Ensemble approach not only improved accuracy but also demonstrated robustness in feature generalization. This approach holds promise for integration into clinical workflows, offering improved diagnostic accuracy and efficiency in the early detection of melanoma.
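A minimal sketch of weighted feature fusion followed by cosine-similarity retrieval, the two building blocks of the CBIR side described above. The feature dimensions, fusion weights, and gallery are random placeholders, not the authors' values.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical deep features from four backbones for 500 gallery images.
gallery = {name: rng.normal(size=(500, 128)) for name in
           ["densenet121", "inceptionv3", "vit", "xception"]}
weights = {"densenet121": 0.2, "inceptionv3": 0.2, "vit": 0.3, "xception": 0.3}

def fuse(feats):
    """Weighted concatenation of L2-normalized per-model features (weights assumed)."""
    parts = []
    for name, f in feats.items():
        f = f / (np.linalg.norm(f, axis=-1, keepdims=True) + 1e-8)
        parts.append(weights[name] * f)
    return np.concatenate(parts, axis=-1)

fused_gallery = fuse(gallery)
query = fuse({k: v[:1] for k, v in gallery.items()})   # reuse one image as the query

# Cosine-similarity retrieval: rank gallery images against the fused query vector.
sims = fused_gallery @ query.T / (
    np.linalg.norm(fused_gallery, axis=1, keepdims=True) * np.linalg.norm(query))
top5 = np.argsort(-sims.ravel())[:5]
print("top-5 retrieved indices:", top5)
```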

16 pages, 2557 KB  
Article
Explainable AI for Oral Cancer Diagnosis: Multiclass Classification of Histopathology Images and Grad-CAM Visualization
by Jelena Štifanić, Daniel Štifanić, Nikola Anđelić and Zlatan Car
Biology 2025, 14(8), 909; https://doi.org/10.3390/biology14080909 - 22 Jul 2025
Viewed by 706
Abstract
Oral cancer is typically diagnosed through histological examination; however, the primary issue with this type of procedure is tumor heterogeneity, where a subjective aspect of the examination may have a direct effect on the treatment plan for a patient. To reduce inter- and intra-observer variability, artificial intelligence algorithms are often used as computational aids in tumor classification and diagnosis. This research proposes a two-step approach for automatic multiclass grading using oral histopathology images (the first step) and Grad-CAM visualization (the second step) to assist clinicians in diagnosing oral squamous cell carcinoma. The Xception architecture achieved the highest classification values, with an AUC_macro of 0.929 (σ = 0.087) and an AUC_micro of 0.942 (σ = 0.074). Additionally, Grad-CAM provided visual explanations of the model's predictions by highlighting the precise areas of histopathology images that influenced the model's decision. These results emphasize the potential of integrated AI algorithms in medical diagnostics, offering a more precise, dependable, and effective method for disease analysis.
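The Grad-CAM step can be illustrated with a short sketch against a Keras Xception: gradients of the predicted-class score with respect to the last convolutional activations are pooled into channel weights and combined into a coarse heatmap. The random input tile stands in for a preprocessed histopathology image; this is the standard Grad-CAM recipe, not the authors' exact implementation.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.applications.Xception(weights="imagenet")
last_conv = model.get_layer("block14_sepconv2_act")        # last conv activation
grad_model = tf.keras.Model(model.inputs, [last_conv.output, model.output])

img = np.random.rand(1, 299, 299, 3).astype("float32")     # placeholder histology tile

with tf.GradientTape() as tape:
    conv_out, preds = grad_model(img)
    idx = int(tf.argmax(preds[0]))                          # predicted class index
    class_score = preds[:, idx]

grads = tape.gradient(class_score, conv_out)
channel_weights = tf.reduce_mean(grads, axis=(1, 2))        # global-average-pooled grads
cam = tf.nn.relu(tf.reduce_sum(conv_out * channel_weights[:, None, None, :], axis=-1))
cam = cam / (tf.reduce_max(cam) + 1e-8)                     # normalized coarse heatmap
print(cam.shape)                                            # (1, 10, 10) for 299x299 input
```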

18 pages, 4631 KB  
Article
Semantic Segmentation of Rice Fields in Sub-Meter Satellite Imagery Using an HRNet-CA-Enhanced DeepLabV3+ Framework
by Yifan Shao, Pan Pan, Hongxin Zhao, Jiale Li, Guoping Yu, Guomin Zhou and Jianhua Zhang
Remote Sens. 2025, 17(14), 2404; https://doi.org/10.3390/rs17142404 - 11 Jul 2025
Viewed by 594
Abstract
Accurate monitoring of rice-planting areas underpins food security and evidence-based farm management. Recent work has advanced along three complementary lines: multi-source data fusion (to mitigate cloud and spectral confusion), temporal feature extraction (to exploit phenology), and deep-network architecture optimization. However, even the best fusion- and time-series-based approaches still struggle to preserve fine spatial details in sub-meter scenes. Targeting this gap, we propose an HRNet-CA-enhanced DeepLabV3+ that retains the original model's strengths while resolving its two key weaknesses: (i) detail loss caused by repeated down-sampling and feature-pyramid compression and (ii) boundary blurring due to insufficient multi-scale information fusion. The Xception backbone is replaced with a High-Resolution Network (HRNet) to maintain full-resolution feature streams through multi-resolution parallel convolutions and cross-scale interactions. A coordinate attention (CA) block is embedded in the decoder to strengthen spatially explicit context and sharpen class boundaries. A rice dataset of 23,295 images (11,295 rice + 12,000 non-rice) was built through preprocessing and manual labeling, and the proposed model was benchmarked against classical segmentation networks. Our approach boosts boundary segmentation accuracy to 92.28% MIoU and raises texture-level discrimination to 95.93% F1, without extra inference latency. Although this study focuses on architecture optimization, the HRNet-CA backbone is readily compatible with future multi-source fusion and time-series modules, offering a unified path toward operational paddy mapping in fragmented sub-meter landscapes.
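A minimal sketch of a coordinate attention block of the kind embedded in the decoder: features are pooled along height and width separately, mixed through a shared 1x1 convolution, and turned into two directional attention maps that reweight the input. Channel count and reduction ratio are illustrative assumptions, and this compact version omits the batch normalization of the original CA design.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Compact coordinate attention block; sizes and reduction ratio are assumptions."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Encode spatial context along each axis separately.
        pooled_h = x.mean(dim=3, keepdim=True)                        # (n, c, h, 1)
        pooled_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # (n, c, w, 1)
        y = self.act(self.conv1(torch.cat([pooled_h, pooled_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        attn_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        attn_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * attn_h * attn_w                                    # coordinate-wise reweighting

feat = torch.randn(2, 64, 32, 32)                                     # decoder feature map
print(CoordinateAttention(64)(feat).shape)                            # torch.Size([2, 64, 32, 32])
```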

24 pages, 9593 KB  
Article
Deep Learning Approaches for Skin Lesion Detection
by Jonathan Vieira, Fábio Mendonça and Fernando Morgado-Dias
Electronics 2025, 14(14), 2785; https://doi.org/10.3390/electronics14142785 - 10 Jul 2025
Viewed by 759
Abstract
Recently, there has been a rise in skin cancer cases, for which early detection is highly relevant, as it increases the likelihood of a cure. In this context, this work presents a benchmarking study of standard Convolutional Neural Network (CNN) architectures for automated skin lesion classification. A total of 38 CNN architectures from ten families (ConvNeXt, DenseNet, EfficientNet, Inception, InceptionResNet, MobileNet, NASNet, ResNet, VGG, and Xception) were evaluated using transfer learning on the HAM10000 dataset for seven-class skin lesion classification, namely, actinic keratoses, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, melanocytic nevi, and vascular lesions. The comparative analysis used standardized training conditions, with all models utilizing frozen pre-trained weights. Cross-database validation was then conducted using the ISIC 2019 dataset to assess generalizability across different data distributions. The ConvNeXtXLarge architecture achieved the best performance, despite having one of the lowest performance-to-number-of-parameters ratios, with 87.62% overall accuracy and 76.15% F1 score on the test set, demonstrating competitive results within the established performance range of existing HAM10000-based studies. A proof-of-concept multiplatform mobile application was also implemented using a client–server architecture with encrypted image transmission, demonstrating the viability of integrating high-performing models into healthcare screening tools.
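The "frozen pre-trained weights" benchmarking setup can be sketched as below: each backbone keeps its ImageNet weights fixed and only a small softmax head is attached. The three architectures listed here are only a subset chosen for illustration, and the head and input size are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 7   # HAM10000: akiec, bcc, bkl, df, mel, nv, vasc

def build_frozen_classifier(backbone_fn, input_shape=(224, 224, 3)):
    """Attach a small classification head to a frozen, ImageNet-pre-trained backbone."""
    backbone = backbone_fn(include_top=False, weights="imagenet",
                           input_shape=input_shape, pooling="avg")
    backbone.trainable = False                       # frozen pre-trained weights
    inp = layers.Input(input_shape)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(backbone(inp))
    return Model(inp, out)

# A small illustrative subset of the families benchmarked in the paper.
candidates = {
    "DenseNet121": tf.keras.applications.DenseNet121,
    "MobileNetV2": tf.keras.applications.MobileNetV2,
    "Xception": tf.keras.applications.Xception,
}
for name, fn in candidates.items():
    model = build_frozen_classifier(fn)
    print(f"{name}: {model.count_params():,} parameters")
```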

30 pages, 4273 KB  
Article
Hybrid Attention-Enhanced Xception and Dynamic Chaotic Whale Optimization for Brain Tumor Diagnosis
by Aliyu Tetengi Ibrahim, Ibrahim Hayatu Hassan, Mohammed Abdullahi, Armand Florentin Donfack Kana, Amina Hassan Abubakar, Mohammed Tukur Mohammed, Lubna A. Gabralla, Mohamad Khoiru Rusydi and Haruna Chiroma
Bioengineering 2025, 12(7), 747; https://doi.org/10.3390/bioengineering12070747 - 9 Jul 2025
Viewed by 581
Abstract
In medical diagnostics, brain tumor classification remains essential, as accurate and efficient models aid medical professionals in early detection and treatment planning. Deep learning methodologies for brain tumor classification have gained popularity due to their potential to deliver prompt and precise diagnostic results. This article proposes a novel classification technique that integrates the Xception model with a hybrid attention mechanism and progressive image resizing to enhance performance. The methodology is built on a combination of preprocessing techniques, transfer learning architecture reconstruction, and dynamic fine-tuning strategies. To optimize key hyperparameters, this study employed the Dynamic Chaotic Whale Optimization Algorithm. Additionally, we developed a novel learning rate scheduler that dynamically adjusts the learning rate based on image size at each training phase, improving training efficiency and model adaptability. Batch sizes and layer freezing methods were also adjusted according to image size. We constructed an ensemble approach by preserving models trained on different image sizes and merging their results using weighted averaging, bagging, boosting, stacking, blending, and voting techniques. Our proposed method was evaluated on three benchmark datasets, achieving accuracies of 99.67%, 99.09%, and 99.67%, which compare favorably with classical algorithms.
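A minimal sketch of progressive resizing with an image-size-dependent learning rate, batch size, and layer-freezing depth. Every number in the phase table is a placeholder (in the paper these choices are tuned by the Dynamic Chaotic Whale Optimization Algorithm), and ImageNet weights would normally be loaded instead of `weights=None`.

```python
import tensorflow as tf

# Hypothetical per-phase settings: larger images get a smaller learning rate,
# a smaller batch, and fewer frozen layers.
PHASES = [
    {"image_size": 128, "learning_rate": 1e-3, "batch_size": 64, "freeze_until": 100},
    {"image_size": 224, "learning_rate": 3e-4, "batch_size": 32, "freeze_until": 60},
    {"image_size": 299, "learning_rate": 1e-4, "batch_size": 16, "freeze_until": 30},
]

for phase in PHASES:
    size = phase["image_size"]
    model = tf.keras.applications.Xception(weights=None, include_top=False,
                                           input_shape=(size, size, 3), pooling="avg")
    # Freeze the earliest layers; unfreeze more of the network as images grow.
    for layer in model.layers[:phase["freeze_until"]]:
        layer.trainable = False
    optimizer = tf.keras.optimizers.Adam(learning_rate=phase["learning_rate"])
    print(f"phase {size}px: lr={phase['learning_rate']}, batch={phase['batch_size']}, "
          f"trainable layers={sum(l.trainable for l in model.layers)}")
```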

19 pages, 3729 KB  
Article
The Application of Migration Learning Network in FMI Lithology Identification: Taking Glutenite Reservoir of an Oilfield in Xinjiang as an Example
by Yangshuo Dou, Xinghua Qi, Weiping Cui, Xinlong Ma and Zhuwen Wang
Processes 2025, 13(7), 2095; https://doi.org/10.3390/pr13072095 - 2 Jul 2025
Viewed by 371
Abstract
Formation Microresistivity Scanner Imaging (FMI) plays a crucial role in identifying lithology, sedimentary structures, fractures, and reservoir evaluation. However, during the lithology identification process of FMI images relying on transfer learning networks, the limited dataset size of existing models and their relatively primitive architecture substantially compromise the accuracy of well-log interpretation results and practical production efficiency. This study employs the VGG-19 transfer learning model as its core framework to conduct preprocessing, feature extraction, and analysis of FMI well-log images from glutenite formations in an oilfield in Xinjiang, with the objective of achieving rapid and accurate intelligent identification and classification of formation lithology. Simultaneously, this paper emphasizes a systematic comparative analysis of the recognition performance between the VGG-19 model and existing models, such as GoogLeNet and Xception, to screen for the model exhibiting the strongest region-specific applicability. The study finds that lithology can be classified into five types based on physical structures and diagnostic criteria: gray glutenite, brown glutenite, fine sandstone, conglomerate, and mudstone. The research results demonstrate that the VGG-19 model exhibits superior accuracy in identifying FMI images compared to the other two models; the VGG-19 model achieves a training accuracy of 99.64%, a loss value of 0.034, and a validation accuracy of 95.6%; the GoogLeNet model achieves a training accuracy of 96.1%, a loss value of 0.05615, and a validation accuracy of 90.38%; and the Xception model achieves a training accuracy of 91.3%, a loss value of 0.0713, and a validation accuracy of 87.15%. These findings are anticipated to provide a significant reference for the in-depth application of VGG-19 transfer learning in FMI well-log interpretation.

21 pages, 3582 KB  
Article
A Cascade of Encoder–Decoder with Atrous Convolution and Ensemble Deep Convolutional Neural Networks for Tuberculosis Detection
by Noppadol Maneerat, Athasart Narkthewan and Kazuhiko Hamamoto
Appl. Sci. 2025, 15(13), 7300; https://doi.org/10.3390/app15137300 - 28 Jun 2025
Viewed by 390
Abstract
Tuberculosis (TB) is the most serious worldwide infectious disease and the leading cause of death among people with HIV. Early diagnosis and prompt treatment can curb the rising number of TB deaths, and analysis of chest X-rays is a cost-effective method. We describe a deep learning-based cascade algorithm for detecting TB in chest X-rays. Firstly, the lung regions were separated from other anatomical structures by an encoder–decoder with atrous separable convolution (DeepLabv3+ with an XceptionNet backbone, DLabv3+X) and then cropped by a bounding box. Using the cropped lung images, we trained several pre-trained Deep Convolutional Neural Networks (DCNNs) on the images with hyperparameters optimized by a Bayesian algorithm. Different combinations of trained DCNNs were compared, and the combination with the maximum accuracy was retained as the winning combination. The ensemble classifier was designed to predict the presence of TB by fusing DCNNs from the winning combination via weighted averaging. Our lung segmentation was evaluated on three publicly available datasets: it provided better Intersection over Union (IoU) values: 95.1% for Montgomery County (MC), 92.8% for Shenzhen (SZ), and 96.1% for the JSRT dataset. For TB prediction, our ensemble classifier produced a better accuracy of 92.7% for the MC dataset and obtained a comparable accuracy of 95.5% for the SZ dataset. Finally, occlusion sensitivity and gradient-weighted class activation maps (Grad-CAM) were generated to indicate the most influential regions for the prediction of TB and to localize TB manifestations.
(This article belongs to the Special Issue Advances in Deep Learning and Intelligent Computing)
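Two steps of the cascade can be illustrated with a short sketch: cropping the X-ray to the bounding box of the predicted lung mask, and fusing per-DCNN TB probabilities by weighted averaging. The mask, probabilities, weights, and margin are placeholders, not values from the paper.

```python
import numpy as np

def crop_to_mask(image, lung_mask, margin=10):
    """Crop the chest X-ray to the bounding box of the predicted lung mask."""
    ys, xs = np.where(lung_mask > 0)
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin, image.shape[0])
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin, image.shape[1])
    return image[y0:y1, x0:x1]

# Placeholder X-ray and segmentation output from the DeepLabv3+/Xception stage.
xray = np.random.rand(512, 512)
mask = np.zeros((512, 512))
mask[100:400, 80:430] = 1
lungs = crop_to_mask(xray, mask)
print("cropped lung region:", lungs.shape)

# Weighted averaging of per-DCNN TB probabilities (weights are assumptions).
probs = {"densenet": 0.82, "xception": 0.74, "resnet": 0.68}
weights = {"densenet": 0.4, "xception": 0.35, "resnet": 0.25}
tb_probability = sum(weights[k] * probs[k] for k in probs)
print(f"ensemble TB probability: {tb_probability:.3f}")
```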
