Search Results (185)

Search Parameters:
Keywords = ISIC-2020 dataset

21 pages, 5808 KB  
Article
Segmentation of Skin Lesions Using Deep YOLO-Family Networks: A Comparison of the Performance of Selected Models on a New Dataset
by Zbigniew Omiotek, Natalia Krukar, Aleksandra Olejarz, Piotr Lichograj, Miłosz Komada and Magda Konieczna
Electronics 2026, 15(8), 1545; https://doi.org/10.3390/electronics15081545 - 8 Apr 2026
Viewed by 219
Abstract
The aim of this study was to develop an effective and fast tool to support the automatic segmentation of skin lesions, with particular emphasis on the precise differentiation between malignant and benign lesions. In response to the problem of high false positive rates in existing CAD systems, modern neural network architectures from the YOLO family (YOLOv8, YOLOv9, YOLOv11, YOLOv12, and YOLOv26) were used in this research. The models were trained and evaluated on a new, balanced dataset (7000 images) based on the ISIC archive, where the key innovation was the introduction of a dedicated background class representing healthy skin. Through a multi-stage, rigorous optimization process, it was demonstrated that the yolov11s-seg model is highly effective for this task. It achieved a strong balance between effectiveness and processing speed, obtaining an mAP@50 score of 0.840 and an overall precision of 0.852. From a clinical perspective, the model’s high sensitivity (85.9%) in detecting the most aggressive lesion, invasive melanoma (MI), is particularly noteworthy. Thanks to its extremely short inference time (only 4.8 ms), the proposed yolov11s-seg variant overcomes the limitations of heavy hybrid architectures, providing a stable and highly efficient solution with significant potential for deployment in real-time medical mobile applications.
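As a concrete illustration of the kind of training loop such a study implies, here is a minimal sketch using the Ultralytics API for a YOLO11 segmentation model; the dataset YAML, class layout, and hyperparameters are hypothetical assumptions, not the authors' configuration.

```python
# Minimal sketch of fine-tuning a YOLO11 segmentation model with the
# Ultralytics API. "isic_lesions.yaml" is a hypothetical dataset config
# listing train/val image paths and class names (e.g., malignant, benign,
# and a healthy-skin background class, as the abstract describes).
from ultralytics import YOLO

model = YOLO("yolo11s-seg.pt")  # pretrained segmentation weights

# Illustrative hyperparameters, not the paper's optimized settings.
model.train(data="isic_lesions.yaml", epochs=100, imgsz=640, batch=16)

# Single-image inference; results[0].masks holds the predicted lesion masks.
results = model("lesion.jpg")
results[0].show()
```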

32 pages, 6103 KB  
Article
An Optimal Deep Hybrid Framework with Selective Kernel U-Net for Skin Lesion Detection and Classification
by Guzal Gulmirzaeva, Robert Hudec, Baxtiyorjon Akbaraliev and Batirbek Samandarov
Bioengineering 2026, 13(4), 427; https://doi.org/10.3390/bioengineering13040427 - 6 Apr 2026
Viewed by 286
Abstract
Early and accurate detection of skin cancer is critical for reducing mortality rates, particularly for malignant melanoma. Automated analysis of dermoscopic images has gained significant attention due to its potential to support clinical diagnosis and overcome the limitations of manual inspection. Motivated by challenges such as image noise, low contrast, lesion variability, and redundant feature representation, this study proposes an optimal deep hybrid framework for skin lesion detection and classification. The objective of this work is to design a robust and efficient system that integrates advanced preprocessing, precise segmentation, optimal feature selection, and accurate classification. Initially, contrast enhancement using Contrast Limited Adaptive Histogram Equalization (CLAHE) and noise reduction using Wiener filtering are applied to improve image quality. Lesion regions are then segmented using a Selective Kernel U-Net (SK-UNet), which adaptively captures multi-scale spatial information. Subsequently, discriminative color, texture, and shape features are extracted and optimized using the Fossa Optimization Algorithm (FOA) to eliminate redundancy. A hybrid one-dimensional Convolutional Neural Network–Gated Recurrent Unit (1D-CNN–GRU) classifier is employed for final classification, learning both spatial and sequential feature patterns. Experimental evaluation on the ISIC and DermMNIST datasets demonstrates that the proposed framework achieves classification accuracies of 97.6% and 95.6%, respectively, outperforming several existing methods. The results confirm that the proposed hybrid framework provides reliable, accurate, and scalable skin cancer diagnosis, highlighting its potential for assisting clinical decision-making and early detection.
(This article belongs to the Special Issue Deep Learning for Medical Applications: Challenges and Opportunities)
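The preprocessing stage described in the abstract (CLAHE followed by Wiener filtering) can be sketched with standard OpenCV and SciPy calls; the parameter values below are illustrative defaults, not the paper's settings.

```python
# Sketch of CLAHE contrast enhancement followed by Wiener denoising.
import cv2
import numpy as np
from scipy.signal import wiener

img = cv2.imread("lesion.jpg")

# Apply CLAHE on the L channel in LAB space to boost local contrast
# without distorting color. clipLimit/tileGridSize are illustrative.
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
lab = cv2.merge((clahe.apply(l), a, b))
enhanced = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

# Wiener-filter each channel to suppress noise (5x5 neighborhood).
denoised = np.stack(
    [wiener(enhanced[..., c].astype(float), (5, 5)) for c in range(3)], axis=-1
)
denoised = np.clip(denoised, 0, 255).astype(np.uint8)
```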

22 pages, 1280 KB  
Article
Enhancing Early Skin Cancer Detection: A Deep Learning Approach with Multi-Scale Feature Refinement and Fusion
by Siyuan Wu, Pengfei Zhao, Huafu Xu and Zimin Wang
Symmetry 2026, 18(4), 612; https://doi.org/10.3390/sym18040612 - 5 Apr 2026
Viewed by 216
Abstract
The global incidence of skin cancer is rising, making it an increasingly critical public health issue. Malignant skin tumors such as melanoma originate from pathological alterations in skin cells, and their accurate early-stage segmentation is crucial for quantitative analysis, early diagnosis, and effective treatment. However, achieving precise and efficient segmentation remains a major challenge, as existing methods often struggle to capture complex lesion characteristics. To address this challenge, we propose a novel deep learning framework that integrates the PVT v2 backbone with two key modules: the Spatial-Aware Feature Enhancement (SAFE) module and the Multiscale Dual Cross-attention Fusion (MDCF) module. The SAFE module enhances multi-scale encoder features through a dual-branch architecture, which adaptively extracts offset information to integrate fine-grained shallow details with deep semantic information, thereby bridging the feature gap across network depths. The MDCF module establishes bidirectional cross-attention between decoder and encoder features, followed by multi-scale deformable convolutions that capture lesion boundaries and small fragments across heterogeneous receptive fields, thereby enriching semantic details while suppressing background interference. The proposed model was evaluated on two public benchmark datasets (ISIC 2016 and ISIC 2018), achieving Intersection over Union (IoU) scores of 87.33% and 83.67%, respectively. These results demonstrate superior performance compared to current state-of-the-art methods and indicate that our framework significantly enhances skin lesion image analysis, offering a promising tool for improving early detection of skin cancer.
(This article belongs to the Special Issue Symmetric/Asymmetric Study in Medical Imaging)
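A generic sketch of bidirectional cross-attention between decoder and encoder token sequences, in the spirit of the MDCF module described above; the dimensions, residual fusion rule, and class name are illustrative assumptions rather than the authors' implementation.

```python
# Bidirectional cross-attention between two feature sequences in PyTorch.
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.dec_to_enc = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.enc_to_dec = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, dec: torch.Tensor, enc: torch.Tensor):
        # Decoder tokens query encoder tokens, and vice versa.
        d, _ = self.dec_to_enc(query=dec, key=enc, value=enc)
        e, _ = self.enc_to_dec(query=enc, key=dec, value=dec)
        return dec + d, enc + e  # residual fusion

x_dec = torch.randn(2, 196, 256)   # (batch, tokens, dim)
x_enc = torch.randn(2, 196, 256)
fused_dec, fused_enc = BidirectionalCrossAttention()(x_dec, x_enc)
```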

20 pages, 893 KB  
Article
Step-Wise Dual Dynamic DPSGD: Enhancing Performance on Imbalanced Medical Datasets with Differential Privacy
by Xiaobo Huang and Fang Xie
Entropy 2026, 28(4), 409; https://doi.org/10.3390/e28040409 - 4 Apr 2026
Viewed by 168
Abstract
The application of differential privacy in deep learning often leads to significant performance degradation on class-imbalanced medical datasets. Methods such as adding noise to gradients for differential privacy are effective on large datasets, like MNIST and CIFAR-100, but perform poorly on small, imbalanced medical datasets, like HAM10000 and ISIC2019. This is because the imbalanced distribution causes the gradients from the few-shot classes to be clipped, resulting in the loss of crucial information, while the majority classes dominate the learning process; the model consequently falls into suboptimal solutions early. To address this issue, we propose SDD-DPSGD, which uses a step-wise dynamic exponential scheduling mechanism for noise and clipping thresholds to preserve gradient information. By allocating more privacy budget and employing higher clipping thresholds during the initial training phases, the model can avoid suboptimal solutions and improve its performance. Experiments show that SDD-DPSGD outperforms comparable algorithms on both the HAM10000 and ISIC2019 datasets.
(This article belongs to the Special Issue Information-Theoretic Security and Privacy)
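The step-wise dynamic scheduling idea, looser clipping and more privacy budget early with stricter values later, can be sketched as a simple exponential schedule; the constants and decay rate below are assumptions for illustration, not the paper's SDD-DPSGD parameters.

```python
# Illustrative exponential schedule for the DP-SGD clipping threshold and
# noise multiplier: permissive early in training, stricter later.
import math

def dp_schedule(step: int, total_steps: int,
                clip_start: float = 4.0, clip_end: float = 1.0,
                noise_start: float = 0.5, noise_end: float = 1.5):
    """Return (clipping threshold, noise multiplier) for a training step."""
    frac = step / max(total_steps - 1, 1)
    clip = clip_end + (clip_start - clip_end) * math.exp(-5.0 * frac)
    noise = noise_end - (noise_end - noise_start) * math.exp(-5.0 * frac)
    return clip, noise

for step in (0, 500, 999):
    c, s = dp_schedule(step, 1000)
    print(f"step {step}: clip={c:.2f}, noise={s:.2f}")
```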

12 pages, 1115 KB  
Article
From ABCD to AI: Assessing the Diagnostic Reliability of MLLMs in Cutaneous Melanoma Screening—A Head-to-Head Comparison
by Răzvan Ioan Andrei, Aniela Roxana Nodiți-Cuc, Silviu Cristian Voinea, Cristian Ioan Bordea and Alexandru Blidaru
Diagnostics 2026, 16(7), 1077; https://doi.org/10.3390/diagnostics16071077 - 2 Apr 2026
Viewed by 284
Abstract
Background: Melanoma remains a leading cause of cancer-related mortality, with early detection being the primary determinant of survival. The emergence of multimodal large language models (MLLMs) offers a potential paradigm shift in accessible screening. However, the diagnostic reliability and safety of these general-purpose models in oncology remain insufficiently characterized. Methods: This study performed a head-to-head comparison of GPT-5, Gemini 3, and Grok 4 to evaluate their efficacy as first-level screening tools for cutaneous melanoma. A retrospective analysis was conducted using a balanced dataset of 100 clinical images (50 histopathologically confirmed benign, 50 malignant) from the ISIC archive. Results: Gemini 3 achieved the highest overall accuracy (71%) and specificity (94%), while Grok 4 demonstrated the highest sensitivity (52%). All models exhibited a critical deficit in sensitivity, missing approximately half of the malignant lesions. Statistical testing revealed no significant performance differences between the models (p > 0.05). Notably, Gemini 3 exhibited severe overconfidence, maintaining a high CI (84.62%) even during false-negative predictions, whereas GPT-5 and Grok 4 showed better calibration with a significant drop in confidence upon incorrect diagnosis. Conclusions: While current MLLMs possess a foundational capacity for dermatological analysis, their unacceptably low sensitivity and potential for overconfident misdiagnosis render them unsafe as standalone screening tools. At present, MLLMs should only be utilized as complementary tools under strict clinical supervision.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
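The screening metrics reported in this comparison (sensitivity, specificity, accuracy) follow directly from a binary confusion matrix; a minimal sketch with scikit-learn, using toy label vectors rather than the study's data:

```python
# Compute screening metrics from binary predictions.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0]  # 1 = malignant, 0 = benign (toy data)
y_pred = [1, 0, 0, 0, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # fraction of malignant lesions caught
specificity = tn / (tn + fp)   # fraction of benign lesions cleared
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(sensitivity, specificity, accuracy)
```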

15 pages, 2052 KB  
Article
A Dual-Branch Multi-Scale Network for Skin Lesion Classification
by Ying Liu, Xinyu Feng, Yuchai Wan, Huifu Li, Xun Zhang and Abdureyim Raxidin
Electronics 2026, 15(5), 1118; https://doi.org/10.3390/electronics15051118 - 8 Mar 2026
Viewed by 287
Abstract
Dermoscopic images are widely used for diagnosing skin diseases, and automatic classification of lesion types using deep learning can significantly enhance diagnostic efficiency. However, challenges such as variations in imaging conditions, subtle differences between classes, high variability within classes, and severe class imbalance complicate skin lesion analysis. This paper introduces a dual-branch deep learning model where two branches independently process high-frequency and low-frequency image features to generate multi-scale fused representations. To address class imbalance, the model employs cosine similarity to strengthen inter-class discrimination and incorporates a bias term to improve recognition of minority lesion classes. Experiments conducted on the ISIC 2017 and ISIC 2018 datasets demonstrate that the proposed method surpasses state-of-the-art approaches, achieving accuracies of 97.0% and 91.9%, respectively, with sensitivity and specificity both exceeding 90% on the two datasets.
(This article belongs to the Special Issue Deep Learning for Computer Vision Application: Second Edition)
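The cosine-similarity classification head with a bias term for minority classes can be sketched generically in PyTorch; the scale factor and layer sizes are illustrative assumptions, not the paper's exact design.

```python
# Cosine-similarity classifier head with a learnable per-class bias.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    def __init__(self, in_dim: int, num_classes: int, scale: float = 16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, in_dim))
        self.bias = nn.Parameter(torch.zeros(num_classes))  # aids minority classes
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between normalized features and class weights,
        # scaled and shifted by the per-class bias.
        cos = F.linear(F.normalize(x, dim=-1), F.normalize(self.weight, dim=-1))
        return self.scale * cos + self.bias

logits = CosineClassifier(512, 7)(torch.randn(4, 512))  # e.g., 7 lesion classes
```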

20 pages, 4095 KB  
Article
Federated Deep Learning and Real-Time Inference on Edge Computing Device for Skin Cancer Classification
by Vincent and Nico Surantha
Appl. Sci. 2026, 16(5), 2289; https://doi.org/10.3390/app16052289 - 27 Feb 2026
Viewed by 431
Abstract
Skin cancer is one of the most common and dangerous cancers, with global mortality rates continuing to increase each year. Alongside rapid advancements in Artificial Intelligence (AI) within the medical field, significant challenges have emerged, particularly related to patient data privacy and security. In response to these challenges, this research aims to develop a skin cancer classification system that not only ensures the security of patient data but also maintains model efficiency on devices with limited computing power, through Federated Learning and real-time inference on edge computing platforms. The proposed approach combines deep learning through an Xception-based Convolutional Neural Network (CNN) architecture with Federated Learning. Federated Learning enables decentralized model training by utilizing a global server and multiple local servers, where sensitive data remain on local nodes and only model updates are shared for aggregation. The experiments were conducted using two benchmark datasets, HAM10000 (10,000 images) and ISIC 2019 (25,331 images). The resulting global federated model achieved an accuracy of 98.8%. In addition to training evaluation, the proposed model was further assessed during the inference stage on edge devices to evaluate its real-world performance under limited computational resources. Performance benchmarking was conducted on NVIDIA Jetson Orin and Raspberry Pi platforms, where the Raspberry Pi 5 demonstrated the fastest inference time of 0.16 s.
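The aggregation step at the heart of such a setup is Federated Averaging: the global server combines client models weighted by their local sample counts. A minimal sketch follows; the weighting is the standard FedAvg rule, not necessarily the authors' exact variant.

```python
# Weighted averaging of client model parameters (FedAvg). Raw patient
# images never leave the local nodes; only state dicts are shared.
import torch

def fedavg(state_dicts, num_samples):
    """Average model state dicts, weighted by each client's sample count."""
    total = sum(num_samples)
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(
            sd[key].float() * (n / total) for sd, n in zip(state_dicts, num_samples)
        )
    return avg

# Usage sketch with two hypothetical clients:
# global_model.load_state_dict(
#     fedavg([client1.state_dict(), client2.state_dict()], [1200, 800]))
```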

18 pages, 12952 KB  
Article
Synthetic Melanoma Image Generation and Evaluation Using Generative Adversarial Networks
by Pei-Yu Lin, Yidan Shen, Neville Mathew, Renjie Hu, Siyu Huang, Courtney M. Queen, Cameron E. West, Ana Ciurea and George Zouridakis
Bioengineering 2026, 13(2), 245; https://doi.org/10.3390/bioengineering13020245 - 20 Feb 2026
Viewed by 721
Abstract
Melanoma is the most lethal form of skin cancer, and early detection is critical for improving patient outcomes. Although dermoscopy combined with deep learning has advanced automated skin-lesion analysis, progress is hindered by limited access to large, well-annotated datasets and by severe class imbalance, where melanoma images are substantially underrepresented. To address these challenges, we present the first systematic benchmarking study comparing four GAN architectures—DCGAN, StyleGAN2, and two StyleGAN3 variants (T and R)—for high-resolution (512×512) melanoma-specific synthesis. We train and optimize all models on two expert-annotated benchmarks (ISIC 2018 and ISIC 2020) under unified preprocessing and hyperparameter exploration, with particular attention to R1 regularization tuning. Image quality is assessed through a multi-faceted protocol combining distribution-level metrics (FID), sample-level representativeness (FMD), qualitative dermoscopic inspection, downstream classification with a frozen EfficientNet-based melanoma detector, and independent evaluation by two board-certified dermatologists. StyleGAN2 achieves the best balance of quantitative performance and perceptual quality, attaining FID scores of 24.8 (ISIC 2018) and 7.96 (ISIC 2020) at γ=0.8. The frozen classifier recognizes 83% of StyleGAN2-generated images as melanoma, while dermatologists distinguish synthetic from real images at only 66.5% accuracy (chance = 50%), with low inter-rater agreement (κ=0.17). In a controlled augmentation experiment, adding synthetic melanoma images to address class imbalance improved melanoma detection AUC from 0.925 to 0.945 on a held-out real-image test set. These findings demonstrate that StyleGAN2-generated melanoma images preserve diagnostically relevant features and can provide a measurable benefit for mitigating class imbalance in melanoma-focused machine learning pipelines.
(This article belongs to the Special Issue AI and Data Science in Bioengineering: Innovations and Applications)
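The distribution-level FID evaluation described here can be reproduced in outline with torchmetrics; the random tensors below stand in for the real and generated dermoscopic images.

```python
# Fréchet Inception Distance between real and generated image batches.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# Placeholder uint8 image batches (N, 3, H, W); real data would be loaded
# from the ISIC archives and the GAN's sampler instead.
real = torch.randint(0, 256, (16, 3, 512, 512), dtype=torch.uint8)
fake = torch.randint(0, 256, (16, 3, 512, 512), dtype=torch.uint8)

fid.update(real, real=True)
fid.update(fake, real=False)
print(fid.compute())  # lower is better
```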

31 pages, 1633 KB  
Article
Foundation-Model-Driven Skin Lesion Segmentation and Classification Using SAM-Adapters and Vision Transformers
by Faisal Binzagr and Majed Hariri
Diagnostics 2026, 16(3), 468; https://doi.org/10.3390/diagnostics16030468 - 3 Feb 2026
Viewed by 703
Abstract
Background: The precise segmentation and classification of dermoscopic images remain prominent obstacles in automated skin cancer evaluation due, in part, to variability in lesions, low-contrast borders, and additional artifacts in the background. There have been recent developments in foundation models, with a particular emphasis on the Segment Anything Model (SAM); these models exhibit strong generalization potential but require domain-specific adaptation to function effectively in medical imaging. The advent of new architectures, particularly Vision Transformers (ViTs), expands the means of implementing robust lesion identification; however, their strengths are limited without spatial priors. Methods: The proposed study lays out an integrated foundation-model-based framework that fine-tunes SAM-Adapters for lesion segmentation and uses a ViT-based classifier incorporating lesion-specific cropping derived from segmentation and cross-attention fusion. The SAM encoder is kept frozen while only the lightweight adapters are fine-tuned, introducing skin-specific capacity. Segmentation priors are incorporated during the classification stage through fusion with patch embeddings from the images, creating lesion-centric reasoning. The entire pipeline is trained using a joint multi-task approach on the ISIC 2018, HAM10000, and PH2 datasets. Results: In extensive experimentation, the proposed method outperforms state-of-the-art segmentation and classification approaches across the datasets. On the ISIC 2018 dataset, it achieves a Dice score of 94.27% for segmentation and an accuracy of 95.88% for classification. On PH2, a Dice score of 95.62% is achieved, and on HAM10000, an accuracy of 96.37%. Several ablation analyses confirm that the SAM-Adapters, lesion-specific cropping, and cross-attention fusion all contribute substantially to performance. Paired t-tests confirm the statistical significance of the improvements over strong baselines, with p<0.01 for most comparisons and large effect sizes. Conclusions: The results indicate that combining segmentation priors from foundation models with transformer-based classification consistently and reliably improves the quality of lesion boundaries and diagnostic accuracy. The proposed SAM-ViT framework thus demonstrates robust, generalizable, lesion-centric automated dermoscopic analysis and represents a promising initial step towards a clinically deployable skin cancer decision-support system. Next steps will include model compression, improved pseudo-mask refinement, and evaluation on real-world multi-center clinical cohorts.
(This article belongs to the Special Issue Medical Image Analysis and Machine Learning)
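The freeze-the-encoder, train-only-the-adapters pattern can be sketched generically; the encoder and bottleneck adapter below are stand-ins, not the authors' SAM-Adapter implementation.

```python
# Adapter fine-tuning pattern: frozen backbone, trainable adapters only.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Lightweight bottleneck adapter: down-project, nonlinearity, up-project."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, hidden)
        self.act = nn.GELU()
        self.up = nn.Linear(hidden, dim)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))  # residual adapter

# Stand-in for a foundation-model image encoder.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(256, 8, batch_first=True), 2)
adapter = Adapter(256)

for p in encoder.parameters():
    p.requires_grad = False  # frozen encoder

# Only the adapter parameters receive gradients.
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)
tokens = adapter(encoder(torch.randn(2, 196, 256)))
```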

18 pages, 6105 KB  
Article
Improving Skin Lesion Detection with Transformer-Based Architectures
by Andrés Villamarín-Olmos and Diego Renza
Information 2026, 17(2), 130; https://doi.org/10.3390/info17020130 - 1 Feb 2026
Viewed by 421
Abstract
This article describes a methodology for fine-tuning and comparing eleven Transformer architecture variants for image-based skin lesion classification: five variants of Google’s Vision Transformer (ViT) and six variants of Microsoft’s Swin Transformer. The methodology includes meticulous hyperparameter tuning and a robust data augmentation strategy to address the class imbalance problem. This approach allowed us to surpass the state of the art on the DermaMNIST dataset with respect to CNN-based models, and to achieve very competitive results on the ISIC Challenge 2019 dataset with respect to Transformer-based models. In addition, we employed the CheferCAM method to provide visual explanations that identify the most influential image regions in the models’ predictions.
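A sketch of what such ViT fine-tuning typically looks like with timm and torchvision; the model variant, augmentation values, and the 8-class head (matching the ISIC 2019 label set) are illustrative assumptions, not the paper's tuned configuration.

```python
# Fine-tuning a pretrained ViT with augmentation to counter class imbalance.
import timm
import torch
from torchvision import transforms

model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=8)

# Illustrative augmentation pipeline for dermoscopic images.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5, weight_decay=0.01)
logits = model(torch.randn(2, 3, 224, 224))  # (batch, 8) class scores
```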

31 pages, 4397 KB  
Article
Transformer-Based Foundation Learning for Robust and Data-Efficient Skin Disease Imaging
by Inzamam Mashood Nasir, Hend Alshaya, Sara Tehsin and Wided Bouchelligua
Diagnostics 2026, 16(3), 440; https://doi.org/10.3390/diagnostics16030440 - 1 Feb 2026
Viewed by 517
Abstract
Background/Objectives: Accurate and reliable automated dermoscopic lesion classification remains challenging due to pronounced dataset bias, limited expert-annotated data, and the poor cross-dataset generalization of conventional supervised deep learning models. In clinical dermatology, these limitations restrict the deployment of data-driven diagnostic systems across diverse acquisition settings and patient populations. Methods: Motivated by these challenges, this study proposes a transformer-based, dermatology-specific foundation model. The model learns transferable visual representations from large collections of unlabeled dermoscopic images via self-supervised pretraining, integrating large-scale dermatology-oriented self-supervised learning with a hierarchical vision transformer backbone. This enables effective capture of both fine-grained lesion textures and global morphological patterns. The evaluation covers three publicly available dermoscopic datasets (ISIC 2018, HAM10000, and PH2) and assesses in-dataset, cross-dataset, limited-label, ablation, and computational-efficiency settings. Results: The proposed approach achieves in-dataset classification accuracies of 94.87%, 97.32%, and 98.17% on ISIC 2018, HAM10000, and PH2, respectively, outperforming strong transformer and hybrid baselines. Cross-dataset transfer experiments show consistent performance gains of 3.5–5.8% over supervised counterparts, indicating improved robustness to domain shift. Furthermore, when fine-tuned with only 10% of the labeled training data, the model achieves performance comparable to fully supervised baselines, highlighting strong data efficiency. Conclusions: These results demonstrate that dermatology-specific foundation learning offers a principled and practical solution for robust dermoscopic lesion classification under realistic clinical constraints.
(This article belongs to the Special Issue Advanced Imaging in the Diagnosis and Management of Skin Diseases)
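The limited-label protocol (fine-tuning on 10% of the labels) amounts to drawing a stratified subset of the training data; a minimal sketch with scikit-learn, using placeholder labels:

```python
# Stratified 10% labeled subset, preserving lesion-class proportions.
import numpy as np
from sklearn.model_selection import train_test_split

labels = np.random.randint(0, 7, size=10000)  # placeholder lesion labels
indices = np.arange(len(labels))

subset_idx, _ = train_test_split(
    indices, train_size=0.10, stratify=labels, random_state=0
)
print(len(subset_idx))  # 1000 labeled examples used for fine-tuning
```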

27 pages, 10518 KB  
Article
DL-PCMNet: Distributed Learning Enabled Parallel Convolutional Memory Network for Skin Cancer Classification with Dermatoscopic Images
by Afnan M. Alhassan and Nouf I. Altmami
Diagnostics 2026, 16(2), 359; https://doi.org/10.3390/diagnostics16020359 - 22 Jan 2026
Viewed by 287
Abstract
Background/Objectives: Globally, one of the most dreadful and rapidly spreading illnesses is skin cancer, which is acknowledged as a lethal form of cancer due to the abnormal growth of skin cells. Classifying and diagnosing the types of skin lesions is complex, and recognizing tumors from dermoscopic images remains challenging. Existing methods have limitations such as insufficient datasets, computational complexity, class imbalance issues, and poor classification performance. Methods: This research presents a method named the Distributed Learning enabled Parallel Convolutional Memory Network (DL-PCMNet) to classify skin cancer effectively by overcoming these limitations. The proposed DL-PCMNet model utilizes a distributed learning framework to provide greater flexibility during the learning process and to increase the reliability of the model. Moreover, the model integrates a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network in a parallel arrangement, which enhances robustness and accuracy by capturing long-term dependencies. Furthermore, the use of advanced preprocessing and feature extraction techniques increases classification accuracy. Results: On the ISIC 2019 skin lesion dataset, the model achieves an accuracy of 97.28%, precision of 97.30%, sensitivity of 97.17%, and specificity of 97.72% at a 90% training split. Conclusions: The proposed DL-PCMNet model achieved efficient and accurate skin cancer classification compared with other existing models.
(This article belongs to the Special Issue Artificial Intelligence in Dermatology)
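A generic sketch of a parallel CNN + LSTM classifier in the spirit of the DL-PCMNet description, with a convolutional branch for spatial features and an LSTM branch reading row-wise pixel sequences; all layer sizes are illustrative assumptions, not the authors' architecture.

```python
# Parallel CNN and LSTM branches fused before the classification head.
import torch
import torch.nn as nn

class ParallelCNNLSTM(nn.Module):
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # -> (B, 32)
        )
        self.lstm = nn.LSTM(input_size=3 * 64, hidden_size=64, batch_first=True)
        self.head = nn.Linear(32 + 64, num_classes)

    def forward(self, x):                                   # x: (B, 3, 64, 64)
        spatial = self.cnn(x)
        rows = x.permute(0, 2, 1, 3).flatten(2)             # (B, 64 rows, 3*64)
        _, (h, _) = self.lstm(rows)                         # last hidden state
        return self.head(torch.cat([spatial, h[-1]], dim=1))

logits = ParallelCNNLSTM()(torch.randn(2, 3, 64, 64))
```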

23 pages, 1308 KB  
Article
MFA-Net: Multiscale Feature Attention Network for Medical Image Segmentation
by Jia Zhao, Han Tao, Song Liu, Meilin Li and Huilong Jin
Electronics 2026, 15(2), 330; https://doi.org/10.3390/electronics15020330 - 12 Jan 2026
Cited by 1 | Viewed by 707
Abstract
Medical image segmentation acts as a foundational element of medical image analysis. Yet its accuracy is frequently limited by the scale fluctuations of anatomical targets and the intricate contextual traits inherent in medical images—including vaguely defined structural boundaries and irregular shape distributions. To tackle these constraints, we design a multi-scale feature attention network (MFA-Net), customized specifically for thyroid nodule, skin lesion, and breast lesion segmentation tasks. This network framework integrates three core components: a Bidirectional Feature Pyramid Network (Bi-FPN), a Slim-neck structure, and the Convolutional Block Attention Module (CBAM). CBAM steers the model to prioritize boundary regions while filtering out irrelevant information, which in turn enhances segmentation precision. Bi-FPN facilitates more robust fusion of multi-scale features via iterative integration of top-down and bottom-up feature maps, supported by lateral and vertical connection pathways. The Slim-neck design is constructed to simplify the network’s architecture while effectively merging multi-scale representations of both target and background areas, thus enhancing the model’s overall performance. Validation across four public datasets covering thyroid ultrasound (TNUI-2021, TN-SCUI 2020), dermoscopy (ISIC 2016), and breast ultrasound (BUSI) shows that our method outperforms state-of-the-art segmentation approaches, achieving Dice similarity coefficients of 0.955, 0.971, 0.976, and 0.846, respectively. Additionally, the model maintains a compact parameter count of just 3.05 million and delivers an extremely fast inference latency of 1.9 milliseconds—metrics that significantly outperform those of current leading segmentation techniques. In summary, the proposed framework demonstrates strong performance in thyroid, skin, and breast lesion segmentation, delivering an optimal trade-off between high accuracy and computational efficiency.
(This article belongs to the Special Issue Deep Learning for Computer Vision Application: Second Edition)
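The Dice similarity coefficient used as the headline metric here is straightforward to compute for binary masks; a minimal sketch with a smoothing term to handle empty masks:

```python
# Dice similarity coefficient for binary segmentation masks.
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """Dice = 2|P ∩ T| / (|P| + |T|) over binary {0, 1} masks."""
    pred, target = pred.float().flatten(), target.float().flatten()
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy masks standing in for model output and ground truth.
mask_pred = (torch.rand(1, 256, 256) > 0.5).int()
mask_true = (torch.rand(1, 256, 256) > 0.5).int()
print(dice_coefficient(mask_pred, mask_true))
```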

26 pages, 13386 KB  
Article
QU-Net: Quantum-Enhanced U-Net for Self Supervised Embedding and Classification of Skin Cancer Images
by Khidhr Halab, Nabil Marzoug, Othmane El Meslouhi, Zouhair Elamrani Abou Elassad and Moulay A. Akhloufi
Big Data Cogn. Comput. 2026, 10(1), 12; https://doi.org/10.3390/bdcc10010012 - 30 Dec 2025
Viewed by 1200
Abstract
Background: Quantum Machine Learning (QML) has attracted significant attention in recent years. With quantum computing achievements in computationally costly domains, discovering its potential in improving the performance and efficiency of deep learning models in medical imaging has become a promising field of research. Methods: We investigate QML in healthcare by developing a novel quantum-enhanced U-Net (QU-Net). We experiment with six configurations of parameterized quantum circuits, varying the encoding technique (amplitude vs. angle), depth and entanglement. Using the ISIC-2017 skin cancer dataset, we compare QU-Net with classical U-Net on self-supervised image reconstruction and binary classification of benign and malignant skin cancer, where we combine bottleneck embeddings with patient metadata. Results: Our findings show that amplitude encoding stabilizes training, whereas angle encoding introduces fluctuations. The best performance is obtained with amplitude encoding and one layer. For reconstruction, QU-Net with entanglement converges faster (25 epochs vs. 44) with a lower Mean Squared Error per image (0.00015 vs. 0.00017) on unseen data. For classification, QU-Net with no entanglement embeddings reaches 79.03% F1-score compared with 74.14% for U-Net, despite compressing images to a smaller latent space (7 vs. 128). Conclusions: These results demonstrate that the quantum layer enhances U-Net’s expressive power with efficient data embedding.
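An amplitude-encoded parameterized quantum circuit of the kind compared in this study can be sketched with PennyLane; the qubit count, single entangling layer, and template choice are illustrative, not the paper's QU-Net bottleneck circuit.

```python
# Amplitude-encoded variational circuit with one entangling layer.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4  # amplitude encoding packs 2**4 = 16 features into 4 qubits
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def circuit(features, weights):
    qml.AmplitudeEmbedding(features, wires=range(n_qubits), normalize=True)
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

shape = qml.StronglyEntanglingLayers.shape(n_layers=1, n_wires=n_qubits)
weights = np.random.random(shape)
print(circuit(np.random.random(2**n_qubits), weights))
```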

32 pages, 7593 KB  
Review
Advancing Medical Decision-Making with AI: A Comprehensive Exploration of the Evolution from Convolutional Neural Networks to Capsule Networks
by Ichrak Khoulqi and Zakariae El Ouazzani
J. Imaging 2026, 12(1), 17; https://doi.org/10.3390/jimaging12010017 - 30 Dec 2025
Viewed by 819
Abstract
In this paper, we present a literature review of two deep learning architectures, Convolutional Neural Networks (CNNs) and Capsule Networks (CapsNets), applied to medical images, analyzing their role in medical decision support. CNNs have demonstrated their capacity in the medical diagnostic field; however, their reliability decreases under slight spatial variability, which can affect diagnosis, especially since the anatomical structure of the human body can differ from one patient to another. In contrast, CapsNets encode not only feature activations but also spatial relationships, thereby improving the reliability and stability of model generalization. The paper provides a structured comparison by reviewing studies published from 2018 to 2025 across major databases, including IEEE Xplore, ScienceDirect, SpringerLink, and MDPI. The applications in the reviewed papers are based on the benchmark datasets BraTS, INbreast, ISIC, and COVIDx. The review compares the core architectural principles, performance, and interpretability of both architectures. To conclude, we underline the complementary roles of these two architectures in medical decision-making and propose future directions toward hybrid, explainable, and computationally efficient deep learning systems for real clinical environments, thereby helping detect diseases at an early stage and improve survival rates.