
Search Results (154)

Search Parameters:
Keywords = multiclass segmentation

16 pages, 5835 KiB  
Article
Chronic Ulcers Healing Prediction through Machine Learning Approaches: Preliminary Results on Diabetic Foot Ulcers Case Study
by Elisabetta Spinazzola, Guillaume Picaud, Sara Becchi, Monica Pittarello, Elia Ricci, Marc Chaumont, Gérard Subsol, Fabio Pareschi, Luc Teot and Jacopo Secco
J. Clin. Med. 2025, 14(9), 2943; https://doi.org/10.3390/jcm14092943 - 24 Apr 2025
Viewed by 276
Abstract
Background: Chronic diabetic foot ulcers are a global health challenge, affecting approximately 18.6 million individuals each year. The timely and accurate prediction of wound healing paths is crucial for improving treatment outcomes and reducing complications. Methods: In this study, we apply predictive modeling to the case study of diabetic foot ulcers, analyzing and comparing multiple models based on Deep Neural Networks (DNNs) and Machine Learning (ML) algorithms to enhance wound prognosis and clinical decision making. Our approach leverages a dataset of 1766 diabetic foot wounds, each monitored for at least three visits, incorporating key clinical wound features such as WBP scores, wound area, depth, and tissue status. Results: Among the 12 models evaluated, the highest accuracy (80%) was achieved using a three-layer LSTM recurrent DNN trained on wound instances with four visits. The model performance was assessed through AUC (0.85), recall (0.80), precision (0.79), and F1-score (0.80). Our findings indicate that the wound depth and area at the first visit followed by the wound area and granulated tissue percentage at the second visit are the most influential factors in predicting the wound status. Conclusions: As future developments, we started building a weakly supervised semantic segmentation model that classifies wound tissues into necrosis, slough, and granulation, using tissue color proportions to further improve model performance. This research underscores the potential of predictive modeling in chronic wound management, specifically in the case of diabetic foot ulcers, offering a tool that can be seamlessly integrated into routine clinical practice. Full article
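As a quick sanity check on the reported metrics: the F1-score is the harmonic mean of precision and recall, and the reported precision (0.79) and recall (0.80) reproduce the reported F1 of 0.80 up to input rounding. A minimal sketch:

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported metrics for the best model (three-layer LSTM, four visits)
f1 = f1_score(0.79, 0.80)
print(round(f1, 3))  # ≈ 0.795, consistent with the reported 0.80 given rounded inputs
```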

21 pages, 3383 KiB  
Article
Artificial Intelligence for Multiclass Rhythm Analysis for Out-of-Hospital Cardiac Arrest During Mechanical Cardiopulmonary Resuscitation
by Iraia Isasi, Xabier Jaureguibeitia, Erik Alonso, Andoni Elola, Elisabete Aramendi and Lars Wik
Mathematics 2025, 13(8), 1251; https://doi.org/10.3390/math13081251 - 10 Apr 2025
Viewed by 258
Abstract
Load distributing band (LDB) mechanical chest compression (CC) devices are used to treat out-of-hospital cardiac arrest (OHCA) patients. Mechanical CCs induce artifacts in the electrocardiogram (ECG) recorded by defibrillators, potentially leading to inaccurate cardiac rhythm analysis. A reliable analysis of the cardiac rhythm is essential for guiding resuscitation treatment and understanding, retrospectively, the patients’ response to treatment. The aim of this study was to design a deep learning (DL)-based framework for automatic multiclass cardiac rhythm classification in the presence of CC artifacts during OHCA. Concretely, an automatic multiclass cardiac rhythm classification was addressed to distinguish the following types of rhythms: shockable (Sh), asystole (AS), and organized (OR) rhythms. A total of 15,479 segments (2406 Sh, 5481 AS, and 7592 OR) were extracted from 2058 patients during LDB CCs, of which 9666 were used to train the algorithms and 5813 to assess the performance. The proposed architecture consists of an adaptive filter for CC artifact suppression and a multiclass rhythm classifier. Two DL alternatives were considered for the multiclass classifier: convolutional neural networks (CNNs) and residual networks (ResNets). A traditional machine learning-based classifier, which incorporates the research conducted over the past two decades in ECG rhythm analysis using more than 90 state-of-the-art features, was used as a point of comparison. The unweighted mean of sensitivities, the unweighted mean of F1-Scores, and the accuracy of the best method (ResNets) were 88.3%, 88.3%, and 88.2%, respectively. These results highlight the potential of DL-based methods to provide accurate cardiac rhythm diagnoses without interrupting mechanical CC therapy. Full article
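The headline metric here, the unweighted mean of sensitivities, is the per-class recall averaged without class weighting, which matters given the Sh/AS/OR imbalance. A minimal sketch with a hypothetical confusion matrix (the paper's actual counts are not given in this abstract):

```python
def per_class_sensitivities(cm):
    """cm[i][j] = number of class-i segments predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]

def unweighted_mean_sensitivity(cm):
    """Average of per-class recalls, ignoring class frequencies."""
    s = per_class_sensitivities(cm)
    return sum(s) / len(s)

# Hypothetical confusion matrix over the three rhythm classes (Sh, AS, OR)
cm = [[90, 5, 5],
      [4, 92, 4],
      [6, 6, 88]]
print(round(unweighted_mean_sensitivity(cm), 3))  # → 0.9
```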

23 pages, 7556 KiB  
Article
AI Diffusion Model-Based Technology for Automating the Multi-Class Labeling of Electron Microscopy Datasets of Brain Cell Organelles for Their Augmentation and Synthetic Generation
by Nikolay Sokolov, Alexandra Getmanskaya and Vadim Turlapov
Technologies 2025, 13(4), 127; https://doi.org/10.3390/technologies13040127 - 25 Mar 2025
Viewed by 213
Abstract
A technology for the automatic multi-class labeling of brain electron microscopy (EM) objects needed to create large synthetic datasets, which could be used for brain cell segmentation tasks, is proposed. The main research tools were a generative diffusion AI model and a U-Net-like segmentation model. The technology was studied on the segmentation task of up to six brain organelles. The initial dataset used was the popular EPFL dataset labeled for the mitochondria class, whose training and test parts have 165 layers each. Our markup for the EPFL dataset was named EPFL6 and contained six classes. The technology was implemented and studied in a two-step experiment: (1) dataset synthesis using a diffusion model trained on EPFL6; (2) evaluation of the labeling accuracy of a multi-class synthetic dataset by the segmentation accuracy on the test part of EPFL6. It was found that (1) the segmentation accuracy of the mitochondria class for the diffusion synthetic datasets corresponded to the accuracy of the original ones; (2) augmentation via geometric synthetics provided a better accuracy for underrepresented classes; (3) the naturalization of geometric synthetics by the diffusion model yielded a positive effect; (4) due to the augmentation of the 165 layers of the original EPFL dataset with diffusion synthetics, it was possible to achieve and surpass the record accuracy of Dice = 0.948, which was achieved using 3D estimation in Hive-net (2021). Full article
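The Dice coefficient cited above (0.948) is a per-class overlap score between predicted and reference masks. A minimal sketch on toy 1-D label maps (the class values are illustrative, not the EPFL6 labels):

```python
def dice(pred, truth, cls):
    """Dice coefficient for one class between two flat label lists."""
    p = [v == cls for v in pred]
    t = [v == cls for v in truth]
    inter = sum(a and b for a, b in zip(p, t))
    return 2 * inter / (sum(p) + sum(t))

# Toy label maps with three foreground classes (0 = background)
truth = [0, 1, 1, 2, 2, 0, 3, 3]
pred  = [0, 1, 1, 2, 0, 0, 3, 0]
print([round(dice(pred, truth, c), 2) for c in (1, 2, 3)])  # → [1.0, 0.67, 0.67]
```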

23 pages, 3644 KiB  
Article
Federated and Centralized Machine Learning for Cell Segmentation: A Comparative Analysis
by Sara Bruschi, Marco Esposito, Sara Raggiunto, Alberto Belli and Paola Pierleoni
Electronics 2025, 14(7), 1254; https://doi.org/10.3390/electronics14071254 - 22 Mar 2025
Viewed by 347
Abstract
The automatic segmentation of cell images plays a critical role in medicine and biology, as it enables faster and more accurate analysis and diagnosis. Traditional machine learning faces challenges since it requires transferring sensitive data from laboratories to the cloud, with possible risks and limitations due to patients’ privacy, data-sharing regulations, or laboratory privacy guidelines. Federated learning addresses data-sharing issues by introducing a decentralized approach that removes the need for laboratories’ data sharing. The learning task is divided among the participating clients, with each training a global model situated on the cloud with its local dataset. This guarantees privacy by only transmitting updated model weights to the cloud. In this study, the centralized learning approach for cell segmentation is compared with the federated one, demonstrating that they achieve similar performances. Stemming from a benchmarking of available cell segmentation models, Cellpose, having shown better recall and precision (F1 = 0.84) than U-Net (F1 = 0.50) and StarDist (F1 = 0.12), was used as the baseline for a federated learning testbench implementation. The results show that both binary segmentation and multi-class segmentation metrics remain high when employing both the centralized solution (F1 = 0.86) and the federated solution (F1 = 0.86 with 2 clients). These results were also stable across an increasing number of clients and a reduced number of local data samples (F1 = 0.87 with 4 clients; F1 = 0.86 with 16 clients), proving the effectiveness of central aggregation on the cloud of locally trained models. Full article
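The abstract describes clients transmitting updated weights to the cloud for central aggregation. A common aggregation rule, shown here as an assumption rather than the paper's exact scheme, is FedAvg: a data-size-weighted average of client weights. A minimal sketch over flat weight lists:

```python
def fedavg(client_weights, client_sizes):
    """Size-weighted average of per-client model weights (flat lists)."""
    total = sum(client_sizes)
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(len(client_weights[0]))
    ]

# Two hypothetical clients with locally trained weights and dataset sizes
w_a, w_b = [0.2, 0.4], [0.6, 0.8]
print(fedavg([w_a, w_b], [100, 300]))
```

The client with three times as much data pulls the global weights three times as hard toward its local solution.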

25 pages, 115458 KiB  
Article
RSAM-Seg: A SAM-Based Model with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation
by Jie Zhang, Yunxin Li, Xubing Yang, Rui Jiang and Li Zhang
Remote Sens. 2025, 17(4), 590; https://doi.org/10.3390/rs17040590 - 8 Feb 2025
Cited by 7 | Viewed by 1443
Abstract
High-resolution remote sensing satellites have revolutionized remote sensing research, yet accurately segmenting specific targets from complex satellite imagery remains challenging. While the Segment Anything Model (SAM) has emerged as a promising universal segmentation model, its direct application to remote sensing imagery yields suboptimal results. To address these limitations, we propose RSAM-Seg, a novel deep learning model adapted from SAM specifically designed for remote sensing applications. Our model incorporates two key components: Adapter-Scale and Adapter-Feature modules. The Adapter-Scale modules, integrated within Vision Transformer (ViT) blocks, enhance model adaptability through learnable transformations, while the Adapter-Feature modules, positioned between ViT blocks, generate image-informed prompts by incorporating task-specific information. Extensive experiments across four binary and two multi-class segmentation scenarios demonstrate the superior performance of RSAM-Seg, achieving an F1 score of 0.815 in cloud detection, 0.834 in building segmentation, and 0.755 in road extraction, consistently outperforming established architectures like U-Net, DeepLabV3+, and Segformer. Moreover, RSAM-Seg shows significant improvements of up to 56.5% in F1 score compared to the original SAM. In addition, RSAM-Seg maintains robust performance in few-shot learning scenarios, achieving an F1 score of 0.656 with only 1% of the training data and increasing to 0.815 with full data availability. Furthermore, RSAM-Seg can detect areas missing from the ground truth of certain datasets, highlighting its potential for ground-truth completion. Full article
(This article belongs to the Special Issue Advanced AI Technology for Remote Sensing Analysis)

28 pages, 13922 KiB  
Article
Multi-Class Guided GAN for Remote-Sensing Image Synthesis Based on Semantic Labels
by Zhenye Niu, Yuxia Li, Yushu Gong, Bowei Zhang, Yuan He, Jinglin Zhang, Mengyu Tian and Lei He
Remote Sens. 2025, 17(2), 344; https://doi.org/10.3390/rs17020344 - 20 Jan 2025
Viewed by 1081
Abstract
In the scenario of limited labeled remote-sensing datasets, the model’s performance is constrained by the insufficient availability of data. Generative model-based data augmentation has emerged as a promising solution to this limitation. While existing generative models perform well in natural scene domains (e.g., faces and street scenes), their performance in remote sensing is hindered by severe data imbalance and the semantic similarity among land-cover classes. To tackle these challenges, we propose the Multi-Class Guided GAN (MCGGAN), a novel network for generating remote-sensing images from semantic labels. Our model features a dual-branch architecture with a global generator that captures the overall image structure and a multi-class generator that improves the quality and differentiation of land-cover types. To integrate these generators, we design a shared-parameter encoder for consistent feature encoding across two branches, and a spatial decoder that synthesizes outputs from the class generators, preventing overlap and confusion. Additionally, we employ perceptual loss (LVGG) to assess perceptual similarity between generated and real images, and texture matching loss (LT) to capture fine texture details. To evaluate the quality of image generation, we tested multiple models on two custom datasets (one from Chongzhou, Sichuan Province, and another from Wuzhen, Zhejiang Province, China) and a public dataset LoveDA. The results show that MCGGAN achieves improvements of 52.86 in FID, 0.0821 in SSIM, and 0.0297 in LPIPS compared to the Pix2Pix baseline. We also conducted comparative experiments to assess the semantic segmentation accuracy of the U-Net before and after incorporating the generated images. The results show that data augmentation with the generated images leads to an improvement of 4.47% in FWIoU and 3.23% in OA across the Chongzhou and Wuzhen datasets. Experiments show that MCGGAN can be effectively used as a data augmentation approach to improve the performance of downstream remote-sensing image segmentation tasks. Full article
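FWIoU, one of the metrics improved by the augmentation, weights each class's IoU by that class's pixel frequency, so common land-cover classes dominate the score. A minimal sketch from a confusion matrix (the 2-class matrix below is illustrative, not from the paper):

```python
def fwiou(cm):
    """Frequency-weighted IoU from a confusion matrix cm[i][j]
    (row = ground-truth class, column = predicted class)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    score = 0.0
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp
        fp = sum(cm[j][i] for j in range(n)) - tp
        iou = tp / (tp + fp + fn)
        score += (sum(cm[i]) / total) * iou   # weight IoU by class frequency
    return score

cm = [[50, 5],
      [10, 35]]
print(round(fwiou(cm), 4))  # → 0.7381
```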

17 pages, 6455 KiB  
Article
A Novel Mesoscale Eddy Identification Method Using Enhanced Interpolation and A Posteriori Guidance
by Lei Zhang, Xiaodong Ma, Weishuai Xu, Xiang Wan and Qiyun Chen
Sensors 2025, 25(2), 457; https://doi.org/10.3390/s25020457 - 14 Jan 2025
Viewed by 887
Abstract
Mesoscale eddies are pivotal oceanographic phenomena affecting marine environments. Accurate and stable identification of these eddies is essential for advancing research on their dynamics and effects. Current methods primarily focus on identifying Cyclonic and Anticyclonic eddies (CE, AE), with anomalous eddy identification often requiring secondary analyses of sea surface height anomalies and eddy center properties, leading to segmented data interpretations. This study introduces a deep learning model integrating multi-source fusion data with a Squeeze-and-Excitation (SE) attention mechanism to enhance the identification accuracy for both normal and anomalous eddies. Comparative ablation experiments validate the model’s effectiveness, demonstrating its potential for more nuanced, multi-source, and multi-class mesoscale eddy identification. This approach offers a promising framework for advancing mesoscale eddy identification through deep learning. Full article
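The Squeeze-and-Excitation (SE) attention mechanism mentioned above recalibrates channel responses in three steps: squeeze (global average pooling per channel), excite (a small bottleneck network ending in sigmoid gates), and rescale. A dependency-free sketch with hypothetical weights, not the paper's trained parameters:

```python
import math

def squeeze_excite(channels, w1, w2):
    """SE block on a list of channels (each a flat list of activations):
    pool each channel ('squeeze'), pass the pooled vector through a
    two-layer bottleneck ('excite'), then rescale each channel."""
    squeezed = [sum(c) / len(c) for c in channels]                  # squeeze
    hidden = [max(0.0, sum(s * w for s, w in zip(squeezed, row)))   # ReLU layer
              for row in w1]
    scales = [1.0 / (1.0 + math.exp(-sum(h * w for h, w in zip(hidden, row))))
              for row in w2]                                        # sigmoid gates
    return [[v * s for v in c] for c, s in zip(channels, scales)]

# Two channels, bottleneck of size one, hypothetical weights
out = squeeze_excite([[1.0, 1.0], [3.0, 3.0]],
                     w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
print(out)
```

Here the gates learn to amplify informative channels and suppress the rest; in the paper this sits inside a deep identification network rather than operating on raw lists.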

14 pages, 4833 KiB  
Article
Automatic Road Extraction from Historical Maps Using Transformer-Based SegFormers
by Elif Sertel, Can Michael Hucko and Mustafa Erdem Kabadayı
ISPRS Int. J. Geo-Inf. 2024, 13(12), 464; https://doi.org/10.3390/ijgi13120464 - 21 Dec 2024
Viewed by 1703
Abstract
Historical maps are valuable sources of geospatial data for various geography-related applications, providing insightful information about historical land use, transportation infrastructure, and settlements. While transformer-based segmentation methods have been widely applied to image segmentation tasks, they have mostly focused on satellite images. There is a growing need to explore transformer-based approaches for geospatial object extraction from historical maps, given their superior performance over traditional convolutional neural network (CNN)-based architectures. In this research, we aim to automatically extract five different road types from historical maps, using a road dataset digitized from the scanned Deutsche Heereskarte 1:200,000 Türkei (DHK 200 Turkey) maps. We applied the variants of the transformer-based SegFormer model and evaluated the effects of different encoders, batch sizes, loss functions, optimizers, and augmentation techniques on road extraction performance. Our best results, with an intersection over union (IoU) of 0.5411 and an F1 score of 0.7017, were achieved using the SegFormer-B2 model, the Adam optimizer, and the focal loss function. All SegFormer-based experiments outperformed previously reported CNN-based segmentation models on the same dataset. In general, increasing the batch size and using larger SegFormer variants (from B0 to B2) resulted in improved accuracy metrics. Additionally, the choice of augmentation techniques significantly influenced the outcomes. Our results demonstrate that SegFormer models substantially enhance true positive predictions and result in higher precision values. These findings suggest that the output weights could be directly applied to transfer learning for similar historical maps and the inference of additional DHK maps, while offering a promising architecture for future road extraction studies. Full article
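The focal loss used in the best configuration down-weights well-classified pixels so training concentrates on hard ones. A minimal binary sketch (γ = 2 and α = 0.25 are the commonly used defaults, assumed here; the paper's exact settings are not stated in this abstract):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one predicted probability p and label y.
    The (1 - pt)**gamma factor shrinks the loss of easy examples."""
    pt = p if y == 1 else 1.0 - p
    a = alpha if y == 1 else 1.0 - alpha
    return -a * (1.0 - pt) ** gamma * math.log(pt)

# An easy, well-classified pixel vs. a hard, misclassified one
easy = focal_loss(0.95, 1)
hard = focal_loss(0.10, 1)
print(easy < hard)  # → True: the easy example contributes almost nothing
```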

25 pages, 1441 KiB  
Article
Unlocking Security for Comprehensive Electroencephalogram-Based User Authentication Systems
by Adnan Elahi Khan Khalil, Jesus Arturo Perez-Diaz, Jose Antonio Cantoral-Ceballos and Javier M. Antelis
Sensors 2024, 24(24), 7919; https://doi.org/10.3390/s24247919 - 11 Dec 2024
Cited by 1 | Viewed by 975
Abstract
With recent significant advancements in artificial intelligence, the necessity for more reliable recognition systems has rapidly increased to safeguard individual assets. The use of brain signals for authentication has gained substantial interest within the scientific community over the past decade. Most previous efforts have focused on identifying distinctive information within electroencephalogram (EEG) recordings. In this study, an EEG-based user authentication scheme is presented, employing a multi-layer perceptron feedforward neural network (MLP FFNN). The scheme utilizes P300 potentials derived from EEG signals, focusing on the user’s intent to select specific characters. This approach involves two phases: user identification and user authentication. Both phases utilize EEG recordings of brain signals, data preprocessing, a database to store and manage these recordings for efficient retrieval and organization, and feature extraction using mutual information (MI) from selected EEG data segments, specifically targeting power spectral density (PSD) across five frequency bands. The user identification phase employs multi-class classifiers to predict the identity of a user from a set of enrolled users. The user authentication phase associates the predicted user identities with user labels using probability assessments, verifying the claimed identity as either genuine or an impostor. This scheme combines EEG data segments with user mapping, confidence calculations, and claimed user verification for robust authentication. It also accommodates new users by transforming EEG data into feature vectors without the need for retraining. The model extracts selected features to identify users and to classify the input based on these features to authenticate the user. The experiments show that the proposed scheme can achieve 97% accuracy in EEG-based user identification and authentication. Full article
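The PSD features described above are aggregated per frequency band. The abstract does not name the five bands, so the canonical EEG bands below are an assumption. A minimal sketch over a toy PSD:

```python
# Hypothetical band edges in Hz (canonical EEG bands; not stated in the paper)
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(psd, freqs):
    """Sum PSD bins falling inside each band; psd and freqs are parallel lists."""
    return {
        name: sum(p for p, f in zip(psd, freqs) if lo <= f < hi)
        for name, (lo, hi) in BANDS.items()
    }

freqs = [1, 5, 10, 20, 40]           # one bin per band, for illustration
psd   = [4.0, 3.0, 5.0, 2.0, 1.0]
print(band_powers(psd, freqs))
```

The resulting five numbers per channel segment would then feed the mutual-information feature selection the abstract describes.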
(This article belongs to the Special Issue Advances in Brain–Computer Interfaces and Sensors)

20 pages, 9751 KiB  
Article
6D Pose Estimation of Industrial Parts Based on Point Cloud Geometric Information Prediction for Robotic Grasping
by Qinglei Zhang, Cuige Xue, Jiyun Qin, Jianguo Duan and Ying Zhou
Entropy 2024, 26(12), 1022; https://doi.org/10.3390/e26121022 - 26 Nov 2024
Viewed by 1481
Abstract
In industrial robotic arm gripping operations within disordered environments, the loss of physical information on the object’s surface is often caused by changes such as varying lighting conditions, weak surface textures, and sensor noise. This leads to inaccurate object detection and pose estimation information. A method for industrial object pose estimation using point cloud data is proposed to improve pose estimation accuracy. During the feature extraction process, both global and local information are captured by integrating the appearance features of RGB images with the geometric features of point clouds. Integrating semantic information with instance features effectively distinguishes instances of similar objects. The fusion of depth information and RGB color channels enriches spatial context and structure. A cross-entropy loss function is employed for multi-class target classification, and a discriminative loss function enables instance segmentation. A novel point cloud registration method is also introduced to address re-projection errors when mapping 3D keypoints to 2D planes. This method utilizes 3D geometric information, extracting edge features using point cloud curvature and normal vectors, and registers them with models to obtain accurate pose information. Experimental results demonstrate that the proposed method is effective and superior on the LineMod and YCB-Video datasets. Finally, objects are grasped by deploying a robotic arm on the grasping platform. Full article
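The multi-class classification branch uses a cross-entropy loss, i.e., the negative log-probability assigned to the true class after a softmax. A minimal sketch:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Multi-class cross-entropy: -log p(target)."""
    return -math.log(softmax(logits)[target])

# A confident correct prediction costs far less than an uncertain one
print(cross_entropy([5.0, 0.0, 0.0], 0) < cross_entropy([1.0, 1.0, 1.0], 0))  # → True
```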
(This article belongs to the Section Multidisciplinary Applications)

11 pages, 2309 KiB  
Article
Radiomics Feature Stability in True and Virtual Non-Contrast Reconstructions from Cardiac Photon-Counting Detector CT Datasets
by Luca Canalini, Elif G. Becker, Franka Risch, Stefanie Bette, Simon Hellbrueck, Judith Becker, Katharina Rippel, Christian Scheurig-Muenkler, Thomas Kroencke and Josua A. Decker
Diagnostics 2024, 14(22), 2483; https://doi.org/10.3390/diagnostics14222483 - 7 Nov 2024
Viewed by 849
Abstract
Objectives: Virtual non-contrast (VNC) series reconstructed from contrast-enhanced cardiac scans acquired with photon counting detector CT (PCD-CT) systems have the potential to replace true non-contrast (TNC) series. However, a quantitative comparison of the image characteristics of TNC and VNC data is necessary to determine to what extent they are interchangeable. This work quantitatively evaluates the image similarity between VNC and TNC reconstructions by measuring the stability of multi-class radiomics features extracted in intra-patient TNC and VNC reconstructions. Methods: TNC and VNC series of 84 patients were retrospectively collected. For each patient, the myocardium and epicardial adipose tissue (EAT) were semi-automatically segmented in both VNC and TNC reconstructions, and 105 radiomics features were extracted in each mask. Intra-feature correlation scores were computed using the intraclass correlation coefficient (ICC). Stable features were defined with an ICC higher than 0.75. Results: In the myocardium, 41 stable features were identified, and the three with the highest ICC were glrlm_GrayLevelVariance with ICC3 of 0.98 [0.97, 0.99], ngtdm_Strength with ICC3 of 0.97 [0.95, 0.98], and firstorder_Variance with ICC3 of 0.96 [0.94, 0.98]. For the epicardial fat, 40 stable features were found, and the three highest ranked were firstorder_Median with ICC3 of 0.96 [0.93, 0.97], firstorder_RootMeanSquared with ICC3 of 0.95 [0.92, 0.97], and firstorder_Mean with ICC3 of 0.95 [0.92, 0.97]. A total of 24 features (22.8%; 24/105) showed stability in both anatomical structures. Conclusions: The significant differences in the correlation of radiomics features in VNC and TNC volumes of the myocardium and epicardial fat suggested that the two reconstructions may differ more than initially assumed. This indicates that they may not be interchangeable, and such differences could have clinical implications. Therefore, care should be given when selecting VNC as a substitute for TNC in radiomics research to ensure accurate and reliable analysis. Moreover, the observed variations may impact clinical workflows, where precise tissue characterization is critical for diagnosis and treatment planning. Full article
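Feature stability above is defined by ICC(3,1) > 0.75, i.e., a two-way mixed-effects, consistency-type intraclass correlation between the two reconstructions. A minimal sketch of that formula on hypothetical per-patient feature values (not the paper's data):

```python
def icc3_1(ratings):
    """ICC(3,1), two-way mixed effects, consistency.
    ratings[i][j] = feature value for subject i under method j."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between methods
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

# Hypothetical feature values for five patients on TNC vs. VNC reconstructions
tnc_vnc = [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 5.0]]
print(icc3_1(tnc_vnc) > 0.75)  # → True: stable by the paper's threshold
```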
(This article belongs to the Special Issue Recent Developments and Future Trends in Thoracic Imaging)

17 pages, 2991 KiB  
Article
Feature Extraction and Identification of Rheumatoid Nodules Using Advanced Image Processing Techniques
by Azmath Mubeen and Uma N. Dulhare
Rheumato 2024, 4(4), 176-192; https://doi.org/10.3390/rheumato4040014 - 24 Oct 2024
Viewed by 862
Abstract
Background/Objectives: Accurate detection and classification of nodules in medical images, particularly rheumatoid nodules, are critical due to the varying nature of these nodules, where their specific type is often unknown before analysis. This study addresses the challenges of multi-class prediction in nodule detection, with a specific focus on rheumatoid nodules, by employing a comprehensive approach to feature extraction and classification. We utilized a diverse dataset of nodules, including rheumatoid nodules sourced from the DermNet dataset and local rheumatologists. Method: This study integrates 62 features, combining traditional image characteristics with advanced graph-based features derived from a superpixel graph constructed through Delaunay triangulation. The key steps include image preprocessing with anisotropic diffusion and Retinex enhancement, superpixel segmentation using SLIC, and graph-based feature extraction. Texture analysis was performed using Gray-Level Co-occurrence Matrix (GLCM) metrics, while shape analysis was conducted with Fourier descriptors. Vascular pattern recognition, crucial for identifying rheumatoid nodules, was enhanced using the Frangi filter. A Hybrid CNN–Transformer model was employed for feature fusion, and feature selection and hyperparameter tuning were optimized using Gray Wolf Optimization (GWO) and Particle Swarm Optimization (PSO). Feature importance was assessed using SHAP values. Results: The proposed methodology achieved an accuracy of 85%, with a precision of 0.85, a recall of 0.89, and an F1 measure of 0.87, demonstrating the effectiveness of the approach in detecting and classifying rheumatoid nodules in both binary and multi-class classification scenarios. Conclusions: This study presents a robust tool for the detection and classification of nodules, particularly rheumatoid nodules, in medical imaging, offering significant potential for improving diagnostic accuracy and aiding in the early identification of rheumatoid conditions. Full article
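GLCM texture metrics such as contrast are computed from co-occurrence counts of gray levels at a fixed pixel offset. A minimal sketch (horizontal offset of one pixel, four gray levels; the parameters are illustrative, not the paper's configuration):

```python
def glcm(img, dx=1, dy=0, levels=4):
    """Gray-level co-occurrence counts for pixel offset (dx, dy)."""
    g = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                g[img[y][x]][img[ny][nx]] += 1
    return g

def contrast(g):
    """GLCM contrast: squared level differences weighted by pair frequency."""
    total = sum(map(sum, g))
    return sum((i - j) ** 2 * g[i][j] / total
               for i in range(len(g)) for j in range(len(g)))

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
print(round(contrast(glcm(img)), 3))  # → 0.333: mostly uniform neighbors
```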

24 pages, 9284 KiB  
Article
Application of Direct and Indirect Methodologies for Beach Litter Detection in Coastal Environments
by Angelo Sozio, Vincenzo Mariano Scarrica, Angela Rizzo, Pietro Patrizio Ciro Aucelli, Giovanni Barracane, Luca Antonio Dimuccio, Rui Ferreira, Marco La Salandra, Antonino Staiano, Maria Pia Tarantino and Giovanni Scicchitano
Remote Sens. 2024, 16(19), 3617; https://doi.org/10.3390/rs16193617 - 28 Sep 2024
Cited by 2 | Viewed by 1842
Abstract
In this study, different approaches for detecting beach litter (BL) items in coastal environments are applied: the direct in situ survey, an indirect image analysis based on the manual visual screening approach, and two different automatic segmentation and classification tools. One is a Mask-RCNN-based algorithm, already used in a previous work, but specifically improved in this study for multi-class analysis. Test cases were carried out at the Torre Guaceto Marine Protected Area (Apulia Region, southern Italy), using a novel dataset from images acquired in different coastal environments by tailored photogrammetric Unmanned Aerial Vehicle (UAV) surveys. The analysis of the overall methodologies used in this study highlights the potential exhibited by the two machine learning (ML) techniques (Mask-RCNN-based and SVM algorithms), but they still show some limitations compared with direct methodologies. The results of the analysis show that the Mask-RCNN-based algorithm requires further improvements and a consistent increase in the number of training elements, while the SVM algorithm shows limitations related to pixel-based classification. Furthermore, the outcomes of this research highlight the high suitability of ML tools for assessing BL pollution and contributing to coastal conservation efforts. Full article
(This article belongs to the Section Environmental Remote Sensing)
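The pixel-based SVM classification mentioned in the abstract can be sketched with scikit-learn on per-pixel RGB features. This is a minimal sketch under assumed synthetic data (the colour means, class labels, and "image" are illustrative, not the study's UAV imagery); it also makes visible the limitation the authors note, namely that each pixel is labelled from its colour alone, without spatial context:

```python
import numpy as np
from sklearn.svm import SVC  # assumes scikit-learn is available

rng = np.random.default_rng(0)

# Illustrative synthetic training pixels: "sand" in warm tones and
# "litter" in saturated blues; real work would use annotated UAV imagery.
sand = rng.normal(loc=[200, 180, 140], scale=10, size=(500, 3))
litter = rng.normal(loc=[60, 120, 220], scale=10, size=(500, 3))
X = np.vstack([sand, litter]) / 255.0
y = np.repeat([0, 1], 500)  # 0 = sand, 1 = litter

clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# Classify every pixel of a new 8x8 "image" independently: no spatial
# context is used, which is the pixel-based limitation noted above.
image = rng.normal(loc=[200, 180, 140], scale=10, size=(8, 8, 3)) / 255.0
labels = clf.predict(image.reshape(-1, 3)).reshape(8, 8)
```

An object-aware method such as Mask-RCNN, by contrast, predicts whole instance masks rather than independent pixel labels.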
19 pages, 6915 KiB  
Article
Automated Crack Detection in Monolithic Zirconia Crowns Using Acoustic Emission and Deep Learning Techniques
by Kuson Tuntiwong, Supan Tungjitkusolmun and Pattarapong Phasukkit
Sensors 2024, 24(17), 5682; https://doi.org/10.3390/s24175682 - 31 Aug 2024
Viewed by 1695
Abstract
Monolithic zirconia (MZ) crowns are widely utilized in dental restorations, particularly for substantial tooth structure loss. Inspection, tactile, and radiographic examinations can be time-consuming and error-prone, which may delay diagnosis. Consequently, an objective, automatic, and reliable process is required for identifying dental crown [...] Read more.
Monolithic zirconia (MZ) crowns are widely utilized in dental restorations, particularly for substantial tooth structure loss. Inspection, tactile, and radiographic examinations can be time-consuming and error-prone, which may delay diagnosis. Consequently, an objective, automatic, and reliable process is required for identifying dental crown defects. This study aimed to explore the potential of transforming acoustic emission (AE) signals into continuous wavelet transform (CWT) representations, combined with a Convolutional Neural Network (CNN), to assist in crack detection. A new CNN image segmentation model, based on multi-class semantic segmentation using Inception-ResNet-v2, was developed. Real-time detection of AE signals under loads that induce cracking provided significant insights into crack formation in MZ crowns. Pencil lead breaking (PLB) was used to simulate crack propagation. The CWT and CNN models were used to automate the crack classification process. The Inception-ResNet-v2 architecture with transfer learning categorized the cracks in MZ crowns into five groups: labial, palatal, incisal, left, and right. After 2000 epochs with a learning rate of 0.0001, the model achieved an accuracy of 99.4667%, demonstrating that deep learning significantly improved the localization of cracks in MZ crowns. This development can potentially aid dentists in clinical decision-making by facilitating the early detection and prevention of crack failures. Full article
(This article belongs to the Special Issue Intelligent Sensing Technologies in Structural Health Monitoring)
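The AE-to-CWT step described above (turning a 1-D acoustic emission signal into a time-frequency scalogram that a CNN can classify as an image) can be sketched with a naive real-valued Morlet transform in NumPy. This is a minimal sketch under assumed parameters: the wavelet centre frequency, the scale range, and the synthetic 10 Hz burst are illustrative, not the paper's settings:

```python
import numpy as np

def morlet(t, w=5.0):
    # Real-valued Morlet wavelet: a cosine modulated by a Gaussian envelope.
    return np.cos(w * t) * np.exp(-t ** 2 / 2.0)

def cwt_scalogram(signal, scales):
    """Naive CWT: correlate the signal with the wavelet at each scale and
    return the magnitude matrix (scales x time), i.e. the scalogram."""
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        half = min(int(8 * s), (n - 1) // 2)   # truncated wavelet support
        t = np.arange(-half, half + 1) / s
        w = morlet(t) / np.sqrt(s)             # per-scale normalisation
        out[i] = np.abs(np.convolve(signal, w, mode="same"))
    return out

# A 10 Hz sinusoid sampled at 100 Hz stands in for an AE event.
fs = 100.0
time = np.arange(200) / fs
sig = np.sin(2.0 * np.pi * 10.0 * time)
scalogram = cwt_scalogram(sig, np.arange(1, 16))
```

Each row of `scalogram` becomes one horizontal band of the time-frequency image; energy concentrates at the scale matching the signal's frequency, which is the structure the CNN learns from.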
24 pages, 7302 KiB  
Article
CTDUNet: A Multimodal CNN–Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments
by Ruitian Guo, Ruopeng Zhang, Hao Zhou, Tunjun Xie, Yuting Peng, Xili Chen, Guo Yu, Fangying Wan, Lin Li, Yongzhong Zhang and Ruifeng Liu
Plants 2024, 13(16), 2274; https://doi.org/10.3390/plants13162274 - 15 Aug 2024
Cited by 1 | Viewed by 1284
Abstract
Camellia oleifera is a crop of high economic value, yet it is particularly susceptible to various diseases and pests that significantly reduce its yield and quality. Consequently, the precise segmentation and classification of diseased Camellia leaves are vital for managing pests and diseases [...] Read more.
Camellia oleifera is a crop of high economic value, yet it is particularly susceptible to various diseases and pests that significantly reduce its yield and quality. Consequently, the precise segmentation and classification of diseased Camellia leaves are vital for managing pests and diseases effectively. Deep learning exhibits significant advantages in the segmentation of plant diseases and pests, particularly in complex image processing and automated feature extraction. However, when employing single-modal models to segment Camellia oleifera diseases, three critical challenges arise: (A) lesions may closely resemble the colors of the complex background; (B) small sections of diseased leaves overlap; (C) multiple diseases may be present on a single leaf. These factors considerably hinder segmentation accuracy. A novel multimodal model, the CNN–Transformer Dual U-shaped Network (CTDUNet), based on a CNN–Transformer architecture, has been proposed to integrate image and text information. This model first utilizes text data to address the shortcomings of single-modal image features, enhancing its ability to distinguish lesions from environmental characteristics, even when they closely resemble one another. Additionally, we introduce Coordinate Space Attention (CSA), which focuses on the positional relationships between targets, thereby improving the segmentation of overlapping leaf edges. Furthermore, cross-attention (CA) is employed to align image and text features effectively, preserving local information and enhancing the perception and differentiation of various diseases. The CTDUNet model was evaluated on a self-made multimodal dataset and compared against several models, including DeeplabV3+, UNet, PSPNet, Segformer, HrNet, and Language meets Vision Transformer (LViT). The experimental results demonstrate that CTDUNet achieved a mean Intersection over Union (mIoU) of 86.14%, surpassing the best competing multimodal and single-modal models by 3.91% and 5.84%, respectively. Additionally, CTDUNet exhibits well-balanced performance in the multi-class segmentation of Camellia oleifera diseases and pests. These results indicate the successful application of fused image and text multimodal information in the segmentation of Camellia diseases, achieving outstanding performance. Full article
(This article belongs to the Special Issue Sustainable Strategies for Tea Crops Protection)
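Since the comparison above is reported in mean Intersection over Union, a minimal NumPy sketch of the metric for multi-class segmentation masks may be useful (the toy 2x3 label arrays and three-class layout are illustrative, not the paper's data):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, averaged over the classes that
    occur in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks: leave it out
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy 2x3 masks with three classes (0 = background).
pred = np.array([[0, 1, 1],
                 [2, 2, 0]])
truth = np.array([[0, 1, 1],
                  [2, 0, 0]])
score = mean_iou(pred, truth, num_classes=3)
```

Averaging per-class IoU rather than per-pixel accuracy is what makes the metric sensitive to the class balance the abstract highlights: a rare lesion class weighs as much as the dominant background.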