Search Results (1,246)

Search Parameters:
Keywords = VggNet

26 pages, 3350 KiB  
Article
Optimizing Backbone Networks Through Hybrid–Modal Fusion: A New Strategy for Waste Classification
by Houkui Zhou, Qifeng Ding, Chang Chen, Qinqin Liao, Qun Wang, Huimin Yu, Haoji Hu, Guangqun Zhang, Junguo Hu and Tao He
Sensors 2025, 25(10), 3241; https://doi.org/10.3390/s25103241 (registering DOI) - 21 May 2025
Abstract
With rapid urbanization, effective waste classification is a critical challenge. Traditional manual methods are time-consuming, labor-intensive, costly, and error-prone, resulting in reduced accuracy. Deep learning has revolutionized this field. Convolutional neural networks such as VGG and ResNet have dramatically improved automated sorting efficiency, and Transformer architectures like the Swin Transformer have further enhanced performance and adaptability in complex sorting scenarios. However, these approaches still struggle in complex environments and with diverse waste types, often suffering from limited recognition accuracy, poor generalization, or prohibitive computational demands. To overcome these challenges, we propose an efficient hybrid-modal fusion method, the Hybrid-modal Fusion Waste Classification Network (HFWC-Net), for precise waste image classification. HFWC-Net leverages a Transformer-based hierarchical architecture that integrates CNNs and Transformers, enhancing feature capture and fusion across varied image types for superior scalability and flexibility. By incorporating advanced techniques such as the Agent Attention mechanism and the LionBatch optimization strategy, HFWC-Net not only improves classification accuracy but also significantly reduces classification time. Comparative experimental results on the public datasets Garbage Classification, TrashNet, and our self-built MixTrash dataset demonstrate that HFWC-Net achieves Top-1 accuracy rates of 98.89%, 96.88%, and 94.35%, respectively. These findings indicate that HFWC-Net attains the highest accuracy among current methods, offering significant advantages in accelerating classification efficiency and supporting automated waste management applications. Full article
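As an illustrative aside (not taken from the article), the following minimal sketch shows the general idea of a hybrid CNN + Transformer classifier: a convolutional backbone produces a spatial feature map whose flattened tokens are refined by a Transformer encoder before classification. It does not reproduce HFWC-Net, Agent Attention, or LionBatch; the backbone, layer sizes, and class count are assumptions.

```python
# Illustrative hybrid CNN + Transformer classifier (generic sketch, not HFWC-Net).
import torch
import torch.nn as nn
from torchvision import models

class HybridCnnTransformer(nn.Module):
    def __init__(self, num_classes: int = 4, d_model: int = 512):
        super().__init__()
        # CNN backbone extracts a spatial feature map (ResNet-18 stem used here).
        backbone = models.resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # -> (B, 512, H', W')
        # Transformer encoder refines the flattened patch tokens.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):
        feats = self.cnn(x)                        # (B, 512, H', W')
        tokens = feats.flatten(2).transpose(1, 2)  # (B, H'*W', 512)
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))       # average over tokens, then classify

print(HybridCnnTransformer()(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 4])
```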

15 pages, 4025 KiB  
Article
Enhancing Dermatological Diagnosis Through Medical Image Analysis: How Effective Is YOLO11 Compared to Leading CNN Models?
by Rakib Ahammed Diptho and Sarnali Basak
NDT 2025, 3(2), 11; https://doi.org/10.3390/ndt3020011 - 21 May 2025
Abstract
Skin diseases represent a major worldwide health hazard affecting millions of people yearly and substantially compromising healthcare systems. Particularly in areas where dermatologists are scarce, standard diagnostic techniques, which mostly rely on visual inspection and clinical experience, are frequently subjective, time-consuming, and prone to error. This investigation undertakes a comparative analysis of four state-of-the-art deep learning architectures, YOLO11, YOLOv8, VGG16, and ResNet50, in the context of skin disease identification. This study evaluates the performance of these models using pivotal metrics, building upon the foundation of the YOLO paradigm, which revolutionized spatial attention and multi-scale representation. A properly selected collection of 900 high-quality dermatological images with nine disease categories was used for investigation. Robustness and generalizability were guaranteed by using data augmentation and hyperparameter adjustment. Outperforming the benchmark models in balancing precision and recall while limiting false positives and false negatives, YOLO11 obtained a test accuracy of 80.72%, precision of 88.7%, recall of 86.7%, and an F1 score of 87.0%. This strong performance of YOLO11 signifies a promising trajectory in the development of highly accurate skin disease detection models. Our analysis not only highlights the strengths and weaknesses of the model but also underscores the rapid development of deep learning techniques in medical imaging. Full article
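For readers unfamiliar with the metrics quoted above, a short sketch of how precision, recall, and F1 are derived from confusion counts follows; the counts used in the example are placeholders, not the article's data.

```python
# Precision, recall, and F1 from true-positive, false-positive, and false-negative counts.
def precision_recall_f1(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical counts for illustration only.
print(precision_recall_f1(tp=78, fp=10, fn=12))  # (~0.886, ~0.867, ~0.876)
```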

28 pages, 8822 KiB  
Article
Multiclassification of Colorectal Polyps from Colonoscopy Images Using AI for Early Diagnosis
by Jothiraj Selvaraj, Kishwar Sadaf, Shabnam Mohamed Aslam and Snekhalatha Umapathy
Diagnostics 2025, 15(10), 1285; https://doi.org/10.3390/diagnostics15101285 (registering DOI) - 20 May 2025
Abstract
Background/Objectives: Colorectal cancer (CRC) remains one of the leading causes of cancer-related mortality worldwide, emphasizing the critical need for the accurate classification of precancerous polyps. This research presents an extensive analysis of the multiclassification framework leveraging various deep learning (DL) architectures for the automated classification of colorectal polyps from colonoscopy images. Methods: The proposed methodology integrates real-time data for training and utilizes a publicly available dataset for testing, ensuring generalizability. The real-time images were cautiously annotated and verified by a panel of experts, including post-graduate medical doctors and gastroenterology specialists. The DL models were designed to categorize the preprocessed colonoscopy images into four clinically significant classes: hyperplastic, serrated, adenoma, and normal. A suite of state-of-the-art models, including VGG16, VGG19, ResNet50, DenseNet121, EfficientNetV2, InceptionNetV3, Vision Transformer (ViT), and the custom-developed CRP-ViT, were trained and rigorously evaluated for this task. Results: Notably, the CRP-ViT model exhibited superior capability in capturing intricate features, achieving an impressive accuracy of 97.28% during training and 96.02% during validation with real-time images. Furthermore, the model demonstrated remarkable performance during testing on the public dataset, attaining an accuracy of 95.69%. To facilitate real-time interaction and clinical applicability, a user-friendly interface was developed using Gradio, allowing healthcare professionals to upload colonoscopy images and receive instant classification results. Conclusions: The CRP-ViT model effectively predicts and categorizes colonoscopy images into clinically relevant classes, aiding gastroenterologists in decision-making. This study highlights the potential of integrating AI-driven models into routine clinical practice to improve colorectal cancer screening outcomes and reduce diagnostic variability. Full article
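A hypothetical sketch of the kind of Gradio upload-and-classify interface described above follows; the model file name, preprocessing, and class labels are placeholders, not the authors' implementation.

```python
# Minimal Gradio interface: upload a colonoscopy image, return class probabilities.
import gradio as gr
import tensorflow as tf

CLASSES = ["hyperplastic", "serrated", "adenoma", "normal"]
model = tf.keras.models.load_model("crp_vit.h5")  # placeholder path to a trained model

def classify(image):
    x = tf.image.resize(image, (224, 224)) / 255.0        # resize and normalize
    probs = model.predict(tf.expand_dims(x, 0))[0]         # single-image batch
    return {c: float(p) for c, p in zip(CLASSES, probs)}   # label -> confidence

demo = gr.Interface(fn=classify, inputs=gr.Image(), outputs=gr.Label(num_top_classes=4))
demo.launch()
```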

20 pages, 3013 KiB  
Article
Data-Driven Prediction of Grape Leaf Chlorophyll Content Using Hyperspectral Imaging and Convolutional Neural Networks
by Minglu Zeng, Xinghui Zhu, Ling Wan, Jian Xu and Luming Shen
Appl. Sci. 2025, 15(10), 5696; https://doi.org/10.3390/app15105696 - 20 May 2025
Abstract
Grapes, highly nutritious and flavorful fruits, require adequate chlorophyll to ensure normal growth and development. Consequently, the rapid, accurate, and efficient detection of chlorophyll content is essential. This study develops a data-driven integrated framework that combines hyperspectral imaging (HSI) and convolutional neural networks (CNNs) to predict the chlorophyll content in grape leaves, employing hyperspectral images and chlorophyll a + b content data. Initially, the VGG16-U-Net model was employed to segment the hyperspectral images of grape leaves for leaf area extraction. Subsequently, the study discussed 15 different spectral preprocessing methods, selecting fast Fourier transform (FFT) as the optimal approach. Twelve one-dimensional CNN models were subsequently developed. Experimental results revealed that the VGG16-U-Net-FFT-CNN1-1 framework developed in this study exhibited outstanding performance, achieving an R2 of 0.925 and an RMSE of 2.172, surpassing those of traditional regression models. The t-test and F-test results further confirm the statistical robustness of the VGG16-U-Net-FFT-CNN1-1 framework. This provides a basis for estimating chlorophyll content in grape leaves using HSI technology. Full article
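As an illustrative aside, the sketch below shows FFT-based spectral preprocessing in its generic form: a 1-D mean reflectance spectrum is transformed with a fast Fourier transform and its magnitude is used as input to a 1-D CNN. The band count and shapes are assumptions, not the article's setup.

```python
# Generic FFT preprocessing of a 1-D reflectance spectrum (illustrative shapes).
import numpy as np

n_bands = 204                               # hypothetical number of HSI bands
spectrum = np.random.rand(n_bands)          # stand-in for a leaf's mean spectrum
fft_coeffs = np.fft.rfft(spectrum)          # one-sided FFT of the spectrum
features = np.abs(fft_coeffs)               # magnitude spectrum as CNN input
print(features.shape)                       # (n_bands // 2 + 1,) -> (103,)
```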

15 pages, 1201 KiB  
Article
Perspective Transformation and Viewpoint Attention Enhancement for Generative Adversarial Networks in Endoscopic Image Augmentation
by Laimonas Janutėnas and Dmitrij Šešok
Appl. Sci. 2025, 15(10), 5655; https://doi.org/10.3390/app15105655 - 19 May 2025
Viewed by 103
Abstract
This study presents an enhanced version of the StarGAN model, with a focus on medical applications, particularly endoscopic image augmentation. Our model incorporates novel Perspective Transformation and Viewpoint Attention Modules for StarGAN that improve image classification accuracy in a multiclass classification task. The Perspective Transformation Module enables the generation of more diverse viewing angles, while the Viewpoint Attention Module helps focus on diagnostically significant regions. We evaluate the performance of our enhanced architecture using the Kvasir v2 dataset, which contains 8000 images across eight gastrointestinal disease classes, comparing it against baseline models including VGG-16, ResNet-50, DenseNet-121, InceptionNet-V3, and EfficientNet-B7. Experimental results demonstrate that our approach achieves better performance in all models for this eight-class classification problem, increasing accuracy on average by 0.7% on VGG-16 and 0.63% on EfficientNet-B7 models. The addition of perspective transformation capabilities enables more diverse examples to augment the database and provide more samples of specific illnesses. Our approach offers a promising solution for medical image generation, enabling effective training with fewer data samples, which is particularly valuable in medical model development where data are often scarce due to challenges in acquisition. These improvements demonstrate significant potential for advancing machine learning disease classification systems in gastroenterology and medical image augmentation as a whole. Full article
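To make the perspective idea concrete, here is a small OpenCV-based sketch of a random perspective transformation used as augmentation; it is an illustration in the spirit of the module described above, not the authors' implementation, and the jitter range is an assumption.

```python
# Random perspective warp for data augmentation (illustrative only).
import cv2
import numpy as np

def random_perspective(img: np.ndarray, max_shift: float = 0.1) -> np.ndarray:
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = (np.random.uniform(-max_shift, max_shift, (4, 2)) * [w, h]).astype(np.float32)
    dst = src + jitter                                  # randomly displaced corners
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(img, M, (w, h), borderMode=cv2.BORDER_REFLECT)

augmented = random_perspective(np.zeros((256, 256, 3), dtype=np.uint8))
```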
(This article belongs to the Special Issue Deep Learning in Medical Image Processing and Analysis)

21 pages, 26641 KiB  
Article
A CNN-Based Method for Quantitative Assessment of Steel Microstructures in Welded Zones
by Cássio Danelon de Almeida, Thales Tozatto Filgueiras, Moisés Luiz Lagares, Bruno da Silva Macêdo, Camila Martins Saporetti, Matteo Bodini and Leonardo Goliatt
Fibers 2025, 13(5), 66; https://doi.org/10.3390/fib13050066 - 15 May 2025
Viewed by 216
Abstract
The mechanical performance of metallic components is intrinsically linked to their microstructural features. However, the manual quantification of microconstituents in metallographic images remains a time-consuming and subjective task, often requiring over 15 min per image by a trained expert. To address this limitation, this study proposes an automated approach for quantifying the microstructural constituents from low-carbon steel welded zone images using convolutional neural networks (CNNs). A dataset of 210 micrographs was expanded to 720 samples through data augmentation to improve model generalization. Two architectures (AlexNet and VGG16) were trained from scratch, while three pre-trained models (VGG19, InceptionV3, and Xception) were fine-tuned. Among these, VGG19 optimized with stochastic gradient descent (SGD) achieved the best predictive performance, with an R2 of 0.838, MAE of 5.01%, and RMSE of 6.88%. The results confirm the effectiveness of CNNs for reliable and efficient microstructure quantification, offering a significant contribution to computational metallography. Full article
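A hedged sketch of fine-tuning an ImageNet-pretrained VGG19 with SGD for a regression-style output (for example, a microconstituent percentage) follows; the head architecture, learning rate, and training call are illustrative assumptions, not the study's exact settings.

```python
# Fine-tuning VGG19 with SGD for a scalar regression target (illustrative settings).
import tensorflow as tf

base = tf.keras.applications.VGG19(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = True  # fine-tune the pretrained weights rather than train from scratch

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1),  # predicted constituent fraction
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              loss="mse", metrics=["mae"])
# model.fit(train_images, train_fractions, validation_split=0.2, epochs=50)
```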

16 pages, 6927 KiB  
Article
Estimation of Missing DICOM Windowing Parameters in High-Dynamic-Range Radiographs Using Deep Learning
by Mateja Napravnik, Natali Bakotić, Franko Hržić, Damir Miletić and Ivan Štajduhar
Mathematics 2025, 13(10), 1596; https://doi.org/10.3390/math13101596 - 13 May 2025
Viewed by 149
Abstract
Digital Imaging and Communication in Medicine (DICOM) is a standard format for storing medical images, which are typically represented in higher bit depths (10–16 bits), enabling detailed representation but exceeding the display capabilities of standard displays and human visual perception. To address this, DICOM images are often accompanied by windowing parameters, analogous to tone mapping in High-Dynamic-Range image processing, which compress the intensity range to enhance diagnostically relevant regions. This study evaluates traditional histogram-based methods and explores the potential of deep learning for predicting window parameters in radiographs where such information is missing. A range of architectures, including MobileNetV3Small, VGG16, ResNet50, and ViT-B/16, were trained on high-bit-depth computed radiography images using various combinations of loss functions, including structural similarity (SSIM), perceptual loss (LPIPS), and an edge preservation loss. Models were evaluated based on multiple criteria, including pixel entropy preservation, Hellinger distance of pixel value distributions, and peak-signal-to-noise ratio after 8-bit conversion. The tested approaches were further validated on the publicly available GRAZPEDWRI-DX dataset. Although histogram-based methods showed satisfactory performance, especially scaling through identifying the peaks in the pixel value histogram, deep learning-based methods were better at selectively preserving clinically relevant image areas while removing background noise. Full article
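For context, the sketch below shows a simplified linear window mapping: pixel values are clipped to [center - width/2, center + width/2] and rescaled to 8 bits. This is generic DICOM-style windowing to illustrate what the missing parameters control, not the authors' prediction models.

```python
# Simplified linear windowing of a high-bit-depth radiograph to an 8-bit display range.
import numpy as np

def apply_window(pixels: np.ndarray, center: float, width: float) -> np.ndarray:
    low, high = center - width / 2.0, center + width / 2.0
    windowed = np.clip(pixels, low, high)                 # clip to the window
    return ((windowed - low) / (high - low) * 255.0).astype(np.uint8)

raw = np.random.randint(0, 2**12, (512, 512))             # e.g., a 12-bit radiograph
display = apply_window(raw, center=2048, width=3000)      # placeholder window values
```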

19 pages, 5919 KiB  
Article
Evaluation of the Effectiveness of the UNet Model with Different Backbones in the Semantic Segmentation of Tomato Leaves and Fruits
by Juan Pablo Guerra Ibarra, Francisco Javier Cuevas de la Rosa and Julieta Raquel Hernandez Vidales
Horticulturae 2025, 11(5), 514; https://doi.org/10.3390/horticulturae11050514 - 9 May 2025
Viewed by 241
Abstract
Timely identification of crop conditions is relevant for informed decision-making in precision agriculture. The initial step in determining the conditions that crops require involves isolating the components that constitute them, including the leaves and fruits of the plants. An alternative method for conducting this separation is to utilize intelligent digital image processing, wherein plant elements are labeled for subsequent analysis. The application of Deep Learning algorithms offers an alternative approach for conducting segmentation tasks on images obtained from complex environments with intricate patterns that pose challenges for separation. One such application is semantic segmentation, which involves assigning a label to each pixel in the processed image. This task is accomplished through training various models of Convolutional Neural Networks. This paper presents a comparative analysis of semantic segmentation performance using a convolutional neural network model with different backbone architectures. The task focuses on pixel-wise classification into three categories: leaves, fruits, and background, based on images of semi-hydroponic tomato crops captured in greenhouse settings. The main contribution lies in identifying the most efficient backbone-UNet combination for segmenting tomato plant leaves and fruits under uncontrolled conditions of lighting and background during image acquisition. The Convolutional Neural Network model UNet is implemented with different backbones, using transfer learning to take advantage of the knowledge acquired by models such as MobileNet, VanillaNet, MVanillaNet, ResNet, and VGGNet trained on the ImageNet dataset, in order to segment the leaves and fruits of tomato plants. The highest performance across five metrics for tomato fruit and leaf segmentation was achieved by the MVanillaNet-UNet and VGGNet-UNet combinations, with scores of 0.88089 and 0.89078, respectively. A comparison of the best results of semantic segmentation versus those obtained with a color-dominant segmentation method optimized with a greedy algorithm is presented. Full article
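A minimal sketch of pairing U-Net with interchangeable ImageNet-pretrained backbones follows, assuming the segmentation_models_pytorch package (an assumption; the paper does not state its tooling). Only encoders known to exist in that package are listed, so the paper's VanillaNet/MVanillaNet variants are omitted.

```python
# U-Net with swappable pretrained encoders for 3-class segmentation
# (leaves, fruits, background). Illustrative only.
import segmentation_models_pytorch as smp

def build_unet(backbone: str):
    return smp.Unet(encoder_name=backbone, encoder_weights="imagenet",
                    in_channels=3, classes=3)

unets = {name: build_unet(name) for name in ["vgg16", "resnet34", "mobilenet_v2"]}
```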
(This article belongs to the Section Vegetable Production Systems)

24 pages, 7075 KiB  
Article
Visual Geometry Group-SwishNet-Based Asymmetric Facial Emotion Recognition for Multi-Face Engagement Detection in Online Learning Environments
by Qiaohong Yao, Mengmeng Wang and Yubin Li
Symmetry 2025, 17(5), 711; https://doi.org/10.3390/sym17050711 - 7 May 2025
Viewed by 113
Abstract
In the contemporary global educational environment, the automatic assessment of students’ online engagement has garnered widespread attention. A substantial number of studies have demonstrated that facial expressions are a crucial indicator for measuring engagement. However, due to the asymmetry inherent in facial expressions and the varying degrees of deviation of students’ faces from a camera, significant challenges have been posed to accurate emotion recognition in the online learning environment. To address these challenges, this work proposes a novel VGG-SwishNet model, which is based on the VGG-16 model and aims to enhance the recognition ability of asymmetric facial expressions, thereby improving the reliability of student engagement assessment in online education. The Swish activation function is introduced into the model due to its smoothness and self-gating mechanism. Its smoothness aids in stabilizing gradient updates during backpropagation and facilitates better handling of minor variations in input data. This enables the model to more effectively capture subtle differences and asymmetric variations in facial expressions. Additionally, the self-gating mechanism allows the function to automatically adjust its degree of nonlinearity. This helps the model to learn more effective asymmetric feature representations and mitigates the vanishing gradient problem to some extent. Subsequently, this model was applied to the assessment of engagement and provided a visualization of the results. In terms of performance, the proposed method achieved high recognition accuracy on the JAFFE, KDEF, and CK+ datasets. Specifically, under 80–20% and 10-fold cross-validation (CV) scenarios, the recognition accuracy exceeded 95%. According to the obtained results, the proposed approach demonstrates higher accuracy and robust stability. Full article
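The Swish activation referenced above has the standard form swish(x) = x * sigmoid(x) (equivalent to torch.nn.SiLU); the short snippet below is included only to make the smooth, self-gating behaviour concrete.

```python
# Swish activation: smooth and self-gated; small negative inputs are damped, not zeroed.
import torch

def swish(x: torch.Tensor) -> torch.Tensor:
    return x * torch.sigmoid(x)

x = torch.tensor([-3.0, -1.0, 0.0, 1.0, 3.0])
print(swish(x))            # hand-written form
print(torch.nn.SiLU()(x))  # built-in equivalent
```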
(This article belongs to the Section Computer)

15 pages, 5253 KiB  
Article
Detection of Tagosodes orizicolus in Aerial Images of Rice Crops Using Machine Learning
by Angig Rivera-Cartagena, Heber I. Mejia-Cabrera and Juan Arcila-Diaz
AgriEngineering 2025, 7(5), 147; https://doi.org/10.3390/agriengineering7050147 - 7 May 2025
Viewed by 171
Abstract
This study employs RGB imagery and machine learning techniques to detect Tagosodes orizicolus infestations in “Tinajones” rice crops during the flowering stage, a critical challenge for agriculture in northern Peru. High-resolution images were acquired using an unmanned aerial vehicle (UAV) and preprocessed by extracting 256 × 256-pixel segments, focusing on three classes: infested zones, non-cultivated areas, and healthy rice crops. A dataset of 1500 images was constructed and utilized to train deep learning models based on VGG16 and ResNet50. Both models exhibited highly comparable performance, with VGG16 attaining a precision of 98.274% and ResNet50 achieving a precision of 98.245%, demonstrating their effectiveness in identifying infestation patterns with high reliability. To automate the analysis of complete UAV-acquired images, a web-based application was developed. This system receives an image, segments it into grids, and preprocesses each section using resizing, normalization, and dimensional adjustments. The pretrained VGG16 model subsequently classifies each segment into one of three categories: infested zone, non-cultivated area, or healthy crop, overlaying the classification results onto the original image to generate an annotated visualization of detected areas. This research contributes to precision agriculture by providing an efficient and scalable computational tool for early infestation detection, thereby supporting timely intervention strategies to mitigate potential crop losses. Full article
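The grid-based pipeline described above can be sketched as follows: split a UAV image into 256 × 256 tiles, preprocess each tile, and classify it with a pretrained VGG16 model. The model file name, normalization, and class names are placeholders, not the authors' code.

```python
# Tile a UAV image into 256x256 patches and classify each patch (illustrative sketch).
import numpy as np
import tensorflow as tf

CLASSES = ["infested", "non_cultivated", "healthy"]
model = tf.keras.models.load_model("vgg16_tagosodes.h5")  # placeholder path

def classify_tiles(image: np.ndarray, tile: int = 256):
    h, w = image.shape[:2]
    results = {}
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patch = image[y:y + tile, x:x + tile] / 255.0            # normalize
            pred = model.predict(patch[np.newaxis, ...], verbose=0)  # single-tile batch
            results[(y, x)] = CLASSES[int(np.argmax(pred))]
    return results   # tile origin -> predicted class, used to annotate the full image
```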

27 pages, 411 KiB  
Systematic Review
Artificial Neural Networks for Image Processing in Precision Agriculture: A Systematic Literature Review on Mango, Apple, Lemon, and Coffee Crops
by Christian Unigarro, Jorge Hernandez and Hector Florez
Informatics 2025, 12(2), 46; https://doi.org/10.3390/informatics12020046 - 6 May 2025
Viewed by 268
Abstract
Precision agriculture is an approach that uses information technologies to improve and optimize agricultural production. It is based on the collection and analysis of agricultural data to support decision making in agricultural processes. In recent years, Artificial Neural Networks (ANNs) have demonstrated significant benefits in addressing precision agriculture needs, such as pest detection, disease classification, crop state assessment, and soil quality evaluation. This article aims to perform a systematic literature review on how ANNs with an emphasis on image processing can assess if fruits such as mango, apple, lemon, and coffee are ready for harvest. These specific crops were selected due to their diversity in color and size, providing a representative sample for analyzing the most commonly employed ANN methods in agriculture, especially for fruit ripening, damage, pest detection, and harvest prediction. This review identifies Convolutional Neural Networks (CNNs), including commonly employed architectures such as VGG16 and ResNet50, as highly effective, achieving accuracies ranging between 83% and 99%. Additionally, it discusses the integration of hardware and software, image preprocessing methods, and evaluation metrics commonly employed. The results reveal the notable underuse of vegetation indices and infrared imaging techniques for detailed fruit quality assessment, indicating valuable opportunities for future research. Full article

34 pages, 15537 KiB  
Article
Explainable Artificial Intelligence for Diagnosis and Staging of Liver Cirrhosis Using Stacked Ensemble and Multi-Task Learning
by Serkan Savaş
Diagnostics 2025, 15(9), 1177; https://doi.org/10.3390/diagnostics15091177 - 6 May 2025
Viewed by 605
Abstract
Background/Objectives: Liver cirrhosis is a critical chronic condition with increasing global mortality and morbidity rates, emphasizing the necessity for early and accurate diagnosis. This study proposes a comprehensive deep-learning framework for the automatic diagnosis and staging of liver cirrhosis using T2-weighted MRI images. Methods: The methodology integrates stacked ensemble learning, multi-task learning (MTL), and transfer learning within an explainable artificial intelligence (XAI) context to improve diagnostic accuracy, reliability, and transparency. A hybrid model combining multiple pre-trained convolutional neural networks (VGG16, MobileNet, and DenseNet121) with XGBoost as a meta-classifier demonstrated robust performance in binary classification between healthy and cirrhotic cases. Results: The model achieved a mean accuracy of 96.92%, precision of 95.12%, recall of 98.93%, and F1-score of 96.98% across 10-fold cross-validation. For staging (mild, moderate, and severe), the MTL framework reached a main task accuracy of 96.71% and an average AUC of 99.81%, with a powerful performance in identifying severe cases. Grad-CAM visualizations reveal class-specific activation regions, enhancing the transparency and trust in the model’s decision-making. The proposed system was validated using the CirrMRI600+ dataset with a 10-fold cross-validation strategy, achieving high accuracy (AUC: 99.7%) and consistent results across folds. Conclusions: This research not only advances State-of-the-Art diagnostic methods but also addresses the black-box nature of deep learning in clinical applications. The framework offers potential as a decision-support system for radiologists, contributing to early detection, effective staging, and better-informed, personalized treatment planning for liver cirrhosis. Full article
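An illustrative sketch of the stacking idea follows: pretrained CNN backbones act as frozen feature extractors, their pooled features are concatenated, and XGBoost serves as the meta-classifier. Input shapes, estimator counts, and the training call are assumptions, not the study's configuration.

```python
# Stacked ensemble sketch: frozen CNN feature extractors + XGBoost meta-classifier.
import numpy as np
import tensorflow as tf
from xgboost import XGBClassifier

backbones = [
    tf.keras.applications.VGG16(include_top=False, pooling="avg", weights="imagenet"),
    tf.keras.applications.MobileNet(include_top=False, pooling="avg", weights="imagenet"),
    tf.keras.applications.DenseNet121(include_top=False, pooling="avg", weights="imagenet"),
]

def stacked_features(images: np.ndarray) -> np.ndarray:
    # Concatenate globally pooled features from each frozen backbone.
    return np.concatenate([b.predict(images, verbose=0) for b in backbones], axis=1)

# X_train: (N, 224, 224, 3) MRI slices, y_train: 0 = healthy, 1 = cirrhotic (placeholders)
# meta = XGBClassifier(n_estimators=300).fit(stacked_features(X_train), y_train)
```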
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

18 pages, 4206 KiB  
Article
Disaster Recognition and Classification Based on Improved ResNet-50 Neural Network
by Lei Wen, Zikai Xiao, Xiaoting Xu and Bin Liu
Appl. Sci. 2025, 15(9), 5143; https://doi.org/10.3390/app15095143 - 6 May 2025
Viewed by 184
Abstract
Accurate and timely disaster classification is critical for effective disaster management and emergency response. This study proposes an improved ResNet-50-based deep learning model to classify seven types of natural disasters, including earthquake, fire, flood, mudslide, avalanche, landslide, and land subsidence. The dataset was compiled from publicly available sources and partitioned into training and validation sets using an 8:2 split. Experimental results demonstrate that the proposed model achieves a classification accuracy of 87% on the validation set and outperforms the traditional VGG16 model in most evaluation metrics, including precision, recall, F1-score, AUC, specificity, and log loss. Furthermore, the model effectively mitigates the gradient vanishing problem, ensuring stable convergence and robust training performance. These findings provide a practical technical reference for multi-disaster classification tasks and contribute to enhancing the efficiency of disaster response and societal resilience. Full article
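For orientation, a plain transfer-learning baseline is sketched below: ResNet-50 with its final layer replaced for the seven disaster classes and an 8:2 train/validation split. This is a generic setup for illustration, not the article's improved architecture, and the dataset path is a placeholder.

```python
# Generic ResNet-50 transfer-learning baseline with an 8:2 split (illustrative).
import torch.nn as nn
from torch.utils.data import random_split
from torchvision import datasets, models, transforms

tfms = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = datasets.ImageFolder("disaster_images/", transform=tfms)  # placeholder path
n_train = int(0.8 * len(dataset))
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 7)   # earthquake, fire, flood, mudslide, ...
```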

23 pages, 5249 KiB  
Article
Multilabel Classification of Radiology Image Concepts Using Deep Learning
by Vito Santamato and Agostino Marengo
Appl. Sci. 2025, 15(9), 5140; https://doi.org/10.3390/app15095140 - 6 May 2025
Viewed by 218
Abstract
Understanding and interpreting medical images, particularly radiology images, is a time-consuming task that requires specialized expertise. In this study, we developed a deep learning-based system capable of automatically assigning multiple standardized medical concepts to radiology images, leveraging deep learning models. These concepts are based on Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs) and describe the radiology images in detail. Each image is associated with multiple concepts, making it a multilabel classification problem. We implemented several deep learning models, including DenseNet121, ResNet101, and VGG19, and evaluated them on the ImageCLEF 2020 Medical Concept Detection dataset. This dataset consists of radiology images with multiple CUIs associated with each image and is organized into seven categories based on their modality information. In this study, transfer learning techniques were applied, with the models initially pre-trained on the ImageNet dataset and subsequently fine-tuned on the ImageCLEF dataset. We present the evaluation results based on the F1-score metric, demonstrating the effectiveness of our approach. Our best-performing model, DenseNet121, achieved an F1-score of 0.89 on the classification of the twenty most frequent medical concepts, indicating a significant improvement over baseline methods. Full article
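The multilabel setup described above can be sketched as follows: because each image may carry several UMLS concepts, the classification head uses independent sigmoid outputs trained with binary cross-entropy rather than a softmax. The concept count, backbone choice, and metrics are assumptions.

```python
# Multilabel concept classification head: sigmoid outputs + binary cross-entropy.
import tensorflow as tf

NUM_CONCEPTS = 20  # e.g., the twenty most frequent CUIs (assumption)

base = tf.keras.applications.DenseNet121(include_top=False, pooling="avg",
                                         weights="imagenet", input_shape=(224, 224, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(NUM_CONCEPTS, activation="sigmoid"),  # one probability per concept
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(multi_label=True)])
# Labels are multi-hot vectors, e.g. [1, 0, 1, ..., 0] for an image with two concepts.
```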

20 pages, 9263 KiB  
Article
A Two-Stage YOLOv5s–U-Net Framework for Defect Localization and Segmentation in Overhead Transmission Lines
by Aohua Li, Dacheng Li and Anjing Wang
Sensors 2025, 25(9), 2903; https://doi.org/10.3390/s25092903 - 4 May 2025
Viewed by 277
Abstract
Transmission-line defect detection is crucial for grid operation. Existing methods struggle to balance defect localization and fine segmentation. Therefore, this study proposes a novel cascaded two-stage framework that first utilizes YOLOv5s for the global localization of defective regions, and then uses U-Net for the fine segmentation of candidate regions. To improve the segmentation performance, U-Net adopts a transfer learning strategy based on the VGG16 pretrained model to alleviate the impact of limited dataset size on the training effect. Meanwhile, a hybrid loss function that combines Dice Loss and Focal Loss is designed to solve the small-target and class imbalance problems. This method integrates target detection and fine segmentation, enhancing detection precision and improving the extraction of detailed damage features. Experiments on the self-constructed dataset show that the method achieves 87% mAP on YOLOv5s, 88% U-Net damage recognition precision, a mean Dice coefficient of 93.66%, and 89% mIoU, demonstrating its effectiveness in accurately detecting transmission-line defects and efficiently segmenting the damage region, providing assistance for the intelligent operation and maintenance of transmission lines. Full article
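A hedged sketch of a combined Dice + Focal loss for binary damage masks follows, in the spirit of the hybrid loss described above; the weighting and focal parameters are illustrative, not the paper's values.

```python
# Hybrid Dice + Focal loss for binary segmentation masks (illustrative parameters).
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, targets, alpha=0.5, gamma=2.0, eps=1e-6):
    probs = torch.sigmoid(logits)
    # Dice term: penalizes poor overlap, robust for small foreground regions.
    inter = (probs * targets).sum()
    dice = 1 - (2 * inter + eps) / (probs.sum() + targets.sum() + eps)
    # Focal term: down-weights easy pixels to counter class imbalance.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    pt = torch.exp(-bce)
    focal = ((1 - pt) ** gamma * bce).mean()
    return alpha * dice + (1 - alpha) * focal

loss = dice_focal_loss(torch.randn(2, 1, 64, 64),
                       torch.randint(0, 2, (2, 1, 64, 64)).float())
```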
(This article belongs to the Special Issue Computer Vision and Pattern Recognition Based on Remote Sensing)