MDPI - Publisher of Open Access Journals

32 pages, 13967 KB

Open Access Article

MCH-YOLOv12: Research on Surface Defect Detection Algorithm for Aluminum Profiles Based on Improved YOLOv12

by Yuyu Sun, Heqi Yan, Zongkai Shang and Mingxiao Yang

Sensors 2025, 25(17), 5389; https://doi.org/10.3390/s25175389 - 1 Sep 2025

Viewed by 277

Surface defect detection in aluminum profiles is critical for maintaining product quality and ensuring efficient industrial production. However, existing detection algorithms often struggle with imbalanced defect categories, low detection accuracy for small-scale defects, and irregular flaw geometries. These limitations compromise both detection accuracy and algorithmic robustness. Accordingly, we proposed MCH-YOLOv12, an improved YOLOv12-based algorithm for precise defect detection. Firstly, we enhanced the original Ghost convolution by incorporating multi-scale feature extraction and named the improved version MultiScaleGhost, which replaced the standard convolutions in the Backbone of YOLOv12. This improvement mitigated the limitations of single-scale convolution, enhancing feature representation and the detection of irregularly shaped defects. Secondly, we addressed the directional and edge-specific nature of defects by enhancing the traditional Channel-wise Gated Linear Unit (CGLU). We proposed the Spatial-Channel Collaborative Gated Linear Unit (SCCGLU), which was embedded after the C3k2 module in the Neck of YOLOv12 to better capture fine-grained features. Finally, we designed a Hybrid Head combining anchor-based and anchor-free detection to improve adaptability to defects of various sizes and shapes. Experimental results on an aluminum profile defect dataset demonstrated improved accuracy, a reduced impact of category imbalance, and fewer parameters and floating-point operations (FLOPs), making the algorithm suitable for real-time industrial inspection. Full article

(This article belongs to the Section Fault Diagnosis & Sensors)

23 pages, 1121 KB

Open Access Review

Ecosystem Services in Northeast China’s Cold Region: A Comprehensive Review of Patterns, Drivers, and Policy Responses

by Xiaomeng Guo, Chuang Yang, Zilong Wang and Li Wang

Sustainability 2025, 17(16), 7352; https://doi.org/10.3390/su17167352 - 14 Aug 2025

Viewed by 453

Abstract

As a typical cold region, Northeast China is characterized by its unique climate, hydrological conditions, and land systems, which collectively shape the diversity and complexity of regional ecosystem services (ESs). This review systematically examines research on ESs in Northeast China from 1997 to 2025, with particular emphasis on recent advances in service classification and spatiotemporal patterns, trade-offs and synergies among ESs, the identification of driving mechanisms, regulatory pathways, and policy effectiveness. The findings reveal marked spatial heterogeneity and distinct stage-wise patterns of change in ESs across the region, with particularly pronounced trade-offs between food production and regulating services. The driving factors studied so far are concentrated in the natural and human-activity dimensions, whereas region-specific variables and policy-related drivers remain underexplored. Current research predominantly employs methods such as correlation analysis and geographically weighted regression; however, the capacity to uncover causal mechanisms and nonlinear interactions remains limited. Future research should strengthen the simulation of ecological processes in cold regions, better balance ES supply and demand, refine policy scenario assessments, and develop dynamic feedback mechanisms. Compared with previous studies focusing on single services or regions, this review provides a multidimensional perspective by synthesizing multiple ES categories, integrating spatiotemporal comparative analysis, and incorporating modeling strategies specific to cold-region dynamics. These efforts will help shift ES research beyond static description toward more systematic regulation and management, providing both theoretical support and practical guidance for sustainable development and ecological governance in Northeast China. Full article

18 pages, 2639 KB

Open Access Article

CA-NodeNet: A Category-Aware Graph Neural Network for Semi-Supervised Node Classification

by Zichang Lu, Meiyu Zhong, Qiguo Sun and Kai Ma

Electronics 2025, 14(16), 3215; https://doi.org/10.3390/electronics14163215 - 13 Aug 2025

Viewed by 219

Abstract

Graph convolutional networks (GCNs) have demonstrated remarkable effectiveness in processing graph-structured data and have been widely adopted across various domains. Existing methods mitigate over-smoothing through selective aggregation strategies such as attention mechanisms, edge dropout, and neighbor sampling. While some approaches incorporate global structural context, they often underexplore category-aware representations and inter-category differences, which are crucial for enhancing node discriminability. To address these limitations, a novel framework, CA-NodeNet, is proposed for semi-supervised node classification. CA-NodeNet comprises three key components: (1) coarse-grained node feature learning, (2) category-decoupled multi-branch attention, and (3) inter-category difference feature learning. Initially, a GCN-based encoder is employed to aggregate neighborhood information and learn coarse-grained representations. Subsequently, the category-decoupled multi-branch attention module employs a hierarchical multi-branch architecture, in which each branch incorporates category-specific attention mechanisms to project coarse-grained features into disentangled semantic subspaces. Furthermore, a layer-wise intermediate supervision strategy is adopted to facilitate the learning of discriminative category-specific features within each branch. To further enhance node feature discriminability, we introduce an inter-category difference feature learning module. This module first encodes pairwise differences between the category-specific features obtained from the previous stage and then integrates complementary information across multiple feature pairs to refine node representations. Finally, we design a dual-component optimization function that synergistically combines intermediate supervision loss with the final classification objective, encouraging the network to learn robust and fine-grained node representations. 
Extensive experiments on multiple real-world benchmark datasets demonstrate the superior performance of CA-NodeNet over existing state-of-the-art methods. Ablation studies further validate the effectiveness of each module in contributing to overall performance gains. Full article
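The abstract's first stage, a GCN-based encoder that aggregates neighborhood information, can be illustrated with a single graph-convolution layer. The sketch below assumes the widely used symmetric-normalized propagation rule with self-loops and a ReLU activation; the abstract does not state which variant CA-NodeNet uses, and the names `gcn_layer` and `matmul` are hypothetical helpers, not taken from the paper.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, enough for a toy graph."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W) (assumed variant)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Two connected nodes with one-hot features and an identity weight matrix.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(gcn_layer(adjacency, identity, identity))
```

On this toy graph each node ends up with the average of its own and its neighbor's features, which is the coarse-grained smoothing that the later category-specific branches are described as refining.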

(This article belongs to the Special Issue How Graph Convolutional Networks Work: Mechanisms and Models)

29 pages, 15488 KB

Open Access Article

GOFENet: A Hybrid Transformer–CNN Network Integrating GEOBIA-Based Object Priors for Semantic Segmentation of Remote Sensing Images

by Tao He, Jianyu Chen and Delu Pan

Remote Sens. 2025, 17(15), 2652; https://doi.org/10.3390/rs17152652 - 31 Jul 2025

Viewed by 673

Abstract

Geographic object-based image analysis (GEOBIA) has demonstrated substantial utility in remote sensing tasks. However, its integration with deep learning remains largely confined to image-level classification. This is primarily due to the irregular shapes and fragmented boundaries of segmented objects, which limit its applicability in semantic segmentation. While convolutional neural networks (CNNs) excel at local feature extraction, they inherently struggle to capture long-range dependencies. In contrast, Transformer-based models are well suited for global context modeling but often lack fine-grained local detail. To overcome these limitations, we propose GOFENet (Geo-Object Feature Enhanced Network)—a hybrid semantic segmentation architecture that effectively fuses object-level priors into deep feature representations. GOFENet employs a dual-encoder design combining CNN and Swin Transformer architectures, enabling multi-scale feature fusion through skip connections to preserve both local and global semantics. An auxiliary branch incorporating cascaded atrous convolutions is introduced to inject information of segmented objects into the learning process. Furthermore, we develop a cross-channel selection module (CSM) for refined channel-wise attention, a feature enhancement module (FEM) to merge global and local representations, and a shallow–deep feature fusion module (SDFM) to integrate pixel- and object-level cues across scales. Experimental results on the GID and LoveDA datasets demonstrate that GOFENet achieves superior segmentation performance, with 66.02% mIoU and 51.92% mIoU, respectively. The model exhibits strong capability in delineating large-scale land cover features, producing sharper object boundaries and reducing classification noise, while preserving the integrity and discriminability of land cover categories. Full article
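The mIoU figures quoted above are class-averaged intersection-over-union scores. As a minimal sketch of how such a number is usually derived, assuming a confusion matrix with rows as ground truth and columns as predictions (the function name `mean_iou` and the toy matrix are illustrative, not from the paper):

```python
from typing import List


def mean_iou(confusion: List[List[int]]) -> float:
    """Mean IoU from a square confusion matrix (rows: truth, cols: prediction)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # truth c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, truth other
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical two-class pixel counts, for illustration only.
toy_confusion = [[8, 2],
                 [1, 9]]
print(f"mIoU = {mean_iou(toy_confusion):.4f}")
```

Because every class contributes equally to the average, a rare land cover class with poor IoU lowers mIoU as much as a common one, which is one reason mIoU values on datasets such as LoveDA tend to sit well below pixel accuracy.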

12 pages, 854 KB

Open Access Article

TOSQ: Transparent Object Segmentation via Query-Based Dictionary Lookup with Transformers

by Bin Ma, Ming Ma, Ruiguang Li, Jiawei Zheng and Deping Li

Sensors 2025, 25(15), 4700; https://doi.org/10.3390/s25154700 - 30 Jul 2025

Viewed by 519

Abstract

Sensing transparent objects has many applications in human daily life, including robot navigation and grasping. However, this task presents significant challenges due to the unpredictable nature of scenes that extend beyond/behind transparent objects, particularly the lack of fixed visual patterns and strong background interference. This paper aims to solve the transparent object segmentation problem by leveraging the intrinsic global modeling capabilities of transformer architectures. We design a Query Parsing Module (QPM) that formulates segmentation as a dictionary lookup problem: a set of learnable class prototypes serves as the query inputs, and classes are resolved via, e.g., attention-based prototype matching, which differs fundamentally from conventional pixel-wise mechanisms. Based on the QPM, we propose a high-performance, transformer-based, end-to-end segmentation model, Transparent Object Segmentation through Query (TOSQ). TOSQ’s encoder is based on the SegFormer backbone, and its decoder consists of a series of QPMs that progressively refine the segmentation masks. TOSQ achieves state-of-the-art performance on the Trans10K-V2 dataset (76.63% mIoU, 95.34% Acc), with particularly significant gains in challenging categories like windows (+23.59%) and glass doors (+11.22%), demonstrating its superior capability in transparent object segmentation. Full article
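To make the dictionary-lookup idea concrete, the sketch below scores each pixel feature against a set of class prototypes with scaled dot-product attention and picks the best-matching class. It is only an illustration under stated assumptions: the abstract does not give the QPM's equations, so the scaling, the softmax, and the name `match_prototypes` are all hypothetical, and the prototypes here are fixed numbers rather than learned parameters.

```python
import math
from typing import List, Tuple


def match_prototypes(pixel_feats: List[List[float]],
                     prototypes: List[List[float]]) -> Tuple[List[int], List[List[float]]]:
    """Assign each pixel feature to a class via softmax attention over prototypes."""
    scale = math.sqrt(len(prototypes[0]))
    labels, weights = [], []
    for feat in pixel_feats:
        # Scaled dot product between the pixel feature and every class prototype.
        scores = [sum(f * p for f, p in zip(feat, proto)) / scale
                  for proto in prototypes]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        attn = [e / total for e in exps]
        weights.append(attn)
        labels.append(attn.index(max(attn)))
    return labels, weights


# Two toy pixels and two toy class prototypes (hypothetical values).
pixels = [[1.0, 0.0], [0.0, 1.0]]
class_prototypes = [[2.0, 0.0], [0.0, 2.0]]
labels, attention = match_prototypes(pixels, class_prototypes)
print(labels)
```

Stacking several such matching steps, each refining the features fed to the next, would loosely mirror the decoder described above as a series of QPMs.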

(This article belongs to the Section Sensing and Imaging)

23 pages, 20932 KB

Open Access Article

Robust Small-Object Detection in Aerial Surveillance via Integrated Multi-Scale Probabilistic Framework

by Youyou Li, Yuxiang Fang, Shixiong Zhou, Yicheng Zhang and Nuno Antunes Ribeiro

Mathematics 2025, 13(14), 2303; https://doi.org/10.3390/math13142303 - 18 Jul 2025

Viewed by 464

Abstract

Accurate and efficient object detection is essential for aerial airport surveillance, playing a critical role in aviation safety and the advancement of autonomous operations. Although recent deep learning approaches have achieved notable progress, significant challenges persist, including severe object occlusion, extreme scale variation, dense panoramic clutter, and the detection of very small targets. In this study, we introduce a novel and unified detection framework designed to address these issues comprehensively. Our method integrates a Normalized Gaussian Wasserstein Distance loss for precise probabilistic bounding box regression, Dilation-wise Residual modules for improved multi-scale feature extraction, a Hierarchical Screening Feature Pyramid Network for effective hierarchical feature fusion, and DualConv modules for lightweight yet robust feature representation. Extensive experiments conducted on two public airport surveillance datasets, ASS1 and ASS2, demonstrate that our approach yields substantial improvements in detection accuracy. Specifically, the proposed method achieves an improvement of up to 14.6 percentage points in mean Average Precision (mAP@0.5) compared to state-of-the-art YOLO variants, with particularly notable gains in challenging small-object categories such as personnel detection. These results highlight the effectiveness and practical value of the proposed framework in advancing aviation safety and operational autonomy in airport environments. Full article
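The Normalized Gaussian Wasserstein Distance mentioned above is commonly defined by modeling each box (cx, cy, w, h) as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), then mapping the squared 2-Wasserstein distance through an exponential. The sketch below follows that common formulation; whether this paper uses exactly this form is an assumption, and the normalizing constant `c` is a dataset-dependent hyperparameter whose default here is a placeholder.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def nwd(box_a: Box, box_b: Box, c: float = 12.8) -> float:
    """Normalized Gaussian Wasserstein Distance between two boxes, in (0, 1]."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians.
    w2_squared = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
                  + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred: Box, target: Box, c: float = 12.8) -> float:
    """Regression loss that is 0 for a perfect match and approaches 1 as boxes diverge."""
    return 1.0 - nwd(pred, target, c)


# Two tiny boxes that do not overlap: IoU would be 0, NWD still varies smoothly.
print(nwd_loss((10.0, 10.0, 4.0, 4.0), (15.0, 10.0, 4.0, 4.0)))
```

The appeal for very small targets is that the loss stays informative even when predicted and ground-truth boxes do not overlap at all, a case where IoU-based losses are flat.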

(This article belongs to the Special Issue Artificial Intelligence and Optimization in Aircraft Design and Unmanned Aerial Vehicles, 2nd Edition)

16 pages, 10129 KB

Open Access Article

PestOOD: An AI-Enabled Solution for Advancing Grain Security via Out-of-Distribution Pest Detection

by Jida Tian, Chuanyang Ma, Jiangtao Li and Huiling Zhou

Electronics 2025, 14(14), 2868; https://doi.org/10.3390/electronics14142868 - 18 Jul 2025

Viewed by 264

Abstract

Detecting stored-grain pests on the surface of the grain pile plays an important role in integrated pest management (IPM), which is crucial for grain security. Recently, numerous deep learning-based pest detection methods have been proposed. However, a critical limitation of existing methods is their inability to detect out-of-distribution (OOD) categories that are unseen during training. When encountering such objects, these methods often misclassify them as in-distribution (ID) categories. To address this challenge, we propose a one-stage framework named PestOOD for out-of-distribution stored-grain pest detection via flow-based feature reconstruction. Specifically, we propose a novel Flow-Based OOD Feature Generation (FOFG) module that generates OOD features for detector training via feature reconstruction. This helps the detector learn to recognize OOD objects more effectively. Additionally, to prevent network overfitting that may lead to an excessive focus on ID feature extraction, we propose a Noisy DropBlock (NDB) module and integrate it into the backbone network. Finally, to ensure effective network convergence, a Stage-Wise Training Strategy (STS) is proposed. We conducted extensive experiments on our previously established multi-class stored-grain pest dataset. The results show that our proposed PestOOD demonstrates superior performance over state-of-the-art methods, providing an effective AI-enabled solution to ensure grain security. Full article

(This article belongs to the Special Issue Security Challenges and Opportunities of Artificial Intelligence/Big Data Scenarios)

31 pages, 4412 KB

Open Access Article

Detection of Trees and Objects in Apple Orchard from LiDAR Point Cloud Data Using a YOLOv5 Framework

by Md Rejaul Karim, Md Nasim Reza, Shahriar Ahmed, Kyu-Ho Lee, Joonjea Sung and Sun-Ok Chung

Electronics 2025, 14(13), 2545; https://doi.org/10.3390/electronics14132545 - 24 Jun 2025

Cited by 1 | Viewed by 846

Abstract

Object detection is crucial for smart apple orchard management, allowing agricultural machinery to avoid obstacles. The objective of this study was to detect apple trees and other objects in an apple orchard using LiDAR and the YOLOv5 algorithm. A commercial LiDAR was mounted on a tripod to collect apple tree trunk data, which were then pre-processed and converted into PNG images. A pre-processed set of 1500 images was manually annotated with bounding boxes and class labels (trees, water tanks, and others) to train and validate the YOLOv5 object detection algorithm. The model, trained for 100 epochs, achieved 90% precision, 87% recall, mAP@0.5 of 0.89, and mAP@0.5:0.95 of 0.48. The accuracy reached 89% with a low classification loss of 0.001. Class-wise accuracy was high for water tanks (96%) and trees (95%), while the “others” category had lower accuracy (82%) due to inter-class similarity. Accurate object detection is challenging since the apple orchard environment is complex and unstructured. Background misclassifications highlight the need for improved dataset balance, better feature discrimination, and refinement in detecting ambiguous objects. Full article
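The gap between mAP@0.5 (0.89) and mAP@0.5:0.95 (0.48) comes down to how strictly a predicted box must overlap the annotation before it counts as correct. A minimal sketch of that overlap test, assuming corner-format boxes (x1, y1, x2, y2); the function names and the example boxes are illustrative and not taken from the study:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2, y1 < y2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A prediction counts as a hit when its IoU with the label meets the threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical trunk annotation and a slightly shifted prediction.
label = (0.0, 0.0, 10.0, 20.0)
prediction = (2.0, 0.0, 12.0, 20.0)
print(is_true_positive(prediction, label, 0.5), is_true_positive(prediction, label, 0.75))
```

A detector whose boxes are roughly right but loosely localized passes the 0.5 threshold and fails the stricter ones, which is consistent with the large difference between the two mAP values reported above.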

(This article belongs to the Special Issue Innovations in Intelligent Agriculture: Advanced AI and Robotics for Modern Farming)

18 pages, 839 KB

Open Access Article

From Narratives to Diagnosis: A Machine Learning Framework for Classifying Sleep Disorders in Aging Populations: The sleepCare Platform

by Christos A. Frantzidis

Brain Sci. 2025, 15(7), 667; https://doi.org/10.3390/brainsci15070667 - 20 Jun 2025

Viewed by 1112

Abstract

Background/Objectives: Sleep disorders are prevalent among aging populations and are often linked to cognitive decline, chronic conditions, and reduced quality of life. Traditional diagnostic methods, such as polysomnography, are resource-intensive and limited in accessibility. Meanwhile, individuals frequently describe their sleep experiences through unstructured narratives in clinical notes, online forums, and telehealth platforms. This study proposes a machine learning pipeline (sleepCare) that classifies sleep-related narratives into clinically meaningful categories, including stress-related, neurodegenerative, and breathing-related disorders. The proposed framework employs natural language processing (NLP) and machine learning techniques to support remote applications and real-time patient monitoring, offering a scalable solution for the early identification of sleep disturbances. Methods: The sleepCare framework consists of a three-tiered classification pipeline for analyzing narrative sleep reports. First, a baseline model used a Multinomial Naïve Bayes classifier with n-gram features from a Bag-of-Words representation. Next, a Support Vector Machine (SVM) was trained on GloVe-based word embeddings to capture semantic context. Finally, a transformer-based model (BERT) was fine-tuned to extract contextual embeddings, using the [CLS] token as input for SVM classification. Each model was evaluated using stratified train-test splits and 10-fold cross-validation. Hyperparameter tuning via GridSearchCV optimized performance. The dataset contained 475 labeled sleep narratives, classified into five etiological categories relevant for clinical interpretation. Results: The transformer-based model utilizing BERT embeddings and an optimized Support Vector Machine classifier achieved an overall accuracy of 81% on the test set. Class-wise F1-scores ranged from 0.72 to 0.91, with the highest performance observed in classifying normal or improved sleep (F1 = 0.91). 
The macro average F1-score was 0.78, indicating balanced performance across all categories. GridSearchCV identified the optimal SVM parameters (C = 4, kernel = ‘rbf’, gamma = 0.01, degree = 2, class_weight = ‘balanced’). The confusion matrix revealed robust classification with limited misclassifications, particularly between overlapping symptom categories such as stress-related and neurodegenerative sleep disturbances. Conclusions: Unlike generic large language model applications, our approach emphasizes the personalized identification of sleep symptomatology through targeted classification of the narrative input. By integrating structured learning with contextual embeddings, the framework offers a clinically meaningful, scalable solution for early detection and differentiation of sleep disorders in diverse, real-world, and remote settings. Full article
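The macro-average F1-score reported above is the unweighted mean of the per-class F1-scores, so each of the five categories counts equally regardless of how many narratives it contains. A minimal sketch of that calculation, using hypothetical per-class counts rather than the study's confusion matrix (the names `f1_score_from_counts` and `macro_f1` are illustrative helpers):

```python
from typing import List, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1_score_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN), defined as 0 when the class never appears."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class: List[Counts]) -> float:
    """Unweighted mean of per-class F1-scores."""
    return sum(f1_score_from_counts(*c) for c in per_class) / len(per_class)


# Hypothetical counts for two classes, for illustration only.
example = [(9, 1, 1), (5, 5, 0)]
print(f"macro F1 = {macro_f1(example):.3f}")
```

With a small dataset such as the 475 narratives described here, a single weak class can pull the macro average noticeably below the overall accuracy, which fits the reported 0.78 macro F1 against 81% accuracy.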

(This article belongs to the Special Issue Perspectives of Artificial Intelligence (AI) in Aging Neuroscience)

20 pages, 395 KB

Open Access Review

Protecting Repositories of Indigenous Traditional Ecological Knowledges: A Health-Focused Scoping Review

by Danya Carroll, Mélina Maureen Houndolo, Alia Big George and Nicole Redvers

Int. J. Environ. Res. Public Health 2025, 22(6), 886; https://doi.org/10.3390/ijerph22060886 - 31 May 2025

Viewed by 1136

Abstract

Indigenous Peoples have stewarded Indigenous traditional ecological knowledges (TEK) for millennia. Health-related TEK represents vital knowledge that promotes Indigenous health and wellbeing. Yet, the intergenerational protection of TEK continues to be threatened by various factors, including climate change, which underscores the importance of strengthening and supporting Indigenous-managed TEK repositories. Using a scoping review methodology, we aimed to identify documents for setting up health-related TEK repositories within Indigenous communities. A systematic search was completed in multiple databases—Medline, PubMed, CABI abstracts, Canadian Public Policy Collection, and JSTOR—with manual searches carried out on relevant Indigenous repositories and Google. Content analysis was then carried out with the nine documents meeting our inclusion criteria. We characterized six overarching categories and twelve sub-categories from the included documents. These categories covered impacts on Indigenous TEK repositories resulting from colonial processes, with TEK being seen as diverse, living knowledge protected by longstanding cultural protocols. Concerns surrounding TEK repository management included the need for platforming Indigenous data sovereignty and Indigenous Peoples’ access and ownership. Wise practices of Indigenous-led repository development demonstrated clear examples of data governance processes in action. Indigenous communities were seen to be vital in contributing to key policies and protocols that protect health-related TEK. Full article

(This article belongs to the Section Environmental Health)

32 pages, 6964 KB

Open Access Article

MDFT-GAN: A Multi-Domain Feature Transformer GAN for Bearing Fault Diagnosis Under Limited and Imbalanced Data Conditions

by Chenxi Guo, Vyacheslav V. Potekhin, Peng Li, Elena A. Kovalchuk and Jing Lian

Appl. Sci. 2025, 15(11), 6225; https://doi.org/10.3390/app15116225 - 31 May 2025

Viewed by 811

Abstract

In industrial scenarios, bearing fault diagnosis often suffers from data scarcity and class imbalance, which significantly hinders the generalization performance of data-driven models. While generative adversarial networks (GANs) have shown promise in data augmentation, their efficacy deteriorates in the presence of multi-category and structurally complex fault distributions. To address these challenges, this paper proposes a novel fault diagnosis framework based on a Multi-Domain Feature Transformer GAN (MDFT-GAN). Specifically, raw vibration signals are transformed into 2D RGB representations via joint time-domain, frequency-domain, and time–frequency-domain mappings, effectively encoding multi-perspective fault signatures. A Transformer-based feature extractor, integrated with Efficient Channel Attention (ECA), is embedded into both the generator and discriminator to capture global dependencies and channel-wise interactions, thereby enhancing the representation quality of synthetic samples. Furthermore, a gradient penalty (GP) term is introduced to stabilize adversarial training and suppress mode collapse. To improve classification performance, an Enhanced Hybrid Visual Transformer (EH-ViT) is constructed by coupling a lightweight convolutional stem with a ViT encoder, enabling robust and discriminative fault identification. Beyond performance metrics, this work also incorporates a Grad-CAM-based interpretability scheme to visualize hierarchical feature activation patterns within the discriminator, providing transparent insight into the model’s decision-making rationale across different fault types. Extensive experiments on the CWRU and Jiangnan University (JNU) bearing datasets validate that the proposed method achieves superior diagnostic accuracy, robustness under limited and imbalanced conditions, and enhanced interpretability compared to existing state-of-the-art approaches. Full article
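The abstract does not spell out the gradient penalty term. Assuming it follows the widely used WGAN-GP formulation, the discriminator objective would take the form below, where $D$ is the discriminator, $P_r$ and $P_g$ are the real and generated distributions, $\lambda$ is the penalty weight, and $\hat{x}$ is sampled on straight lines between real and generated samples. All symbols are introduced here for illustration and may differ from the paper's notation.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
\hat{x} &= \epsilon \, x + (1 - \epsilon)\, \tilde{x},
\qquad \epsilon \sim \mathcal{U}[0, 1].
\end{aligned}
```

The penalty pushes the discriminator's gradient norm toward 1 along these interpolated points, which is the usual explanation for why it stabilizes adversarial training.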

(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)

17 pages, 35033 KB

Open Access Article

A Multi-Branch Attention Fusion Method for Semantic Segmentation of Remote Sensing Images

by Kaibo Li, Zhenping Qiang, Hong Lin and Xiaorui Wang

Remote Sens. 2025, 17(11), 1898; https://doi.org/10.3390/rs17111898 - 30 May 2025

Viewed by 786

Abstract

In recent years, advancements in remote sensing image observation technology have significantly enriched the surface feature information captured in remote sensing images, posing greater challenges for semantic information extraction from remote sensing imagery. While convolutional neural networks (CNNs) excel at understanding relationships between adjacent image regions, processing multidimensional data requires reliance on attention mechanisms. However, due to the inherent complexity of remote sensing images, most attention mechanisms designed for natural images underperform when applied to remote sensing data. To address these challenges in remote sensing image semantic segmentation, we propose a highly generalizable multi-branch attention fusion method based on shallow and deep features. This approach applies pixel-wise, spatial, and channel attention mechanisms to feature maps fused with shallow and deep features, thereby enhancing the network’s semantic information extraction capability. Through evaluations on the Cityscapes, LoveDA, and WHDLD datasets, we validate the performance of our method in processing remote sensing data. The results demonstrate consistent improvements in segmentation accuracy across most categories, highlighting its strong generalization capability. Specifically, compared to baseline methods, our approach achieves average mIoU improvements of 0.42% and 0.54% on the WHDLD and LoveDA datasets, respectively, significantly enhancing network performance in complex remote sensing scenarios. Full article

(This article belongs to the Special Issue Ocean Remote Sensing Based on Radar, Sonar and Optical Techniques (Second Edition))

16 pages, 1174 KB

Open Access Article

Natural Language Processing for Aviation Safety: Predicting Injury Levels from Incident Reports in Australia

by Aziida Nanyonga, Keith Joiner, Ugur Turhan and Graham Wild

Modelling 2025, 6(2), 40; https://doi.org/10.3390/modelling6020040 - 28 May 2025

Viewed by 1398

Abstract

This study investigates the application of advanced deep learning models for the classification of aviation safety incidents, focusing on four models: Simple Recurrent Neural Network (sRNN), Gated Recurrent Unit (GRU), Bidirectional Long Short-Term Memory (BLSTM), and DistilBERT. The models were evaluated based on key performance metrics, including accuracy, precision, recall, and F1-score. DistilBERT achieved a score of 1.00 on every metric, while BLSTM performed best among the recurrent models, with an accuracy of 0.9896, followed by GRU (0.9893) and sRNN (0.9887). Class-wise evaluations revealed that DistilBERT excelled across all injury categories and that BLSTM outperformed the other recurrent models, particularly in detecting fatal injuries, where it achieved a precision of 0.8684 and an F1-score of 0.7952. The study also addressed the challenges of class imbalance by applying class weighting, although the use of more sophisticated techniques, such as focal loss, is recommended for future work. This research highlights the potential of transformer-based models for aviation safety classification and provides a foundation for future research to improve model interpretability and generalizability across diverse datasets. These findings contribute to the growing body of research on applying deep learning techniques to aviation safety and underscore opportunities for further exploration. Full article
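The abstract gives BLSTM's precision (0.8684) and F1-score (0.7952) for fatal injuries but not the recall. Since F1 is the harmonic mean of precision and recall, the missing value can be recovered by rearranging F1 = 2PR / (P + R) into R = F1 · P / (2P − F1). The short check below does that arithmetic; the function names are illustrative, and the result is a derived estimate subject to rounding in the published figures.

```python
def recall_from_precision_and_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R."""
    return f1 * precision / (2 * precision - f1)


def f1_from_precision_and_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision, f1 = 0.8684, 0.7952
recall = recall_from_precision_and_f1(precision, f1)
print(f"implied recall = {recall:.4f}")
# Round-trip check: the recovered recall reproduces the reported F1-score.
print(f"F1 check = {f1_from_precision_and_recall(precision, recall):.4f}")
```

The implied recall is roughly 0.73, so on the fatal-injury class the recurrent model misses about a quarter of true cases even though its overall accuracy is close to 0.99, which illustrates why the class-wise figures matter under class imbalance.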

28 pages, 10772 KB

Open AccessArticle

PBC-Transformer: Interpreting Poultry Behavior Classification Using Image Caption Generation Techniques

by Jun Li, Bing Yang, Jiaxin Liu, Felix Kwame Amevor, Yating Guo, Yuheng Zhou, Qinwen Deng and Xiaoling Zhao

Animals 2025, 15(11), 1546; https://doi.org/10.3390/ani15111546 - 25 May 2025

Cited by 1 | Viewed by 567

Accurate classification of poultry behavior is critical for assessing welfare and health, yet most existing methods predict behavior categories without providing explanations for the image content. This study introduces PBC-Transformer, a novel model that integrates image captioning techniques to enhance poultry behavior classification, mimicking expert assessment processes. The model employs a multi-head concentrated attention mechanism, Head Spatial Position Coding (HSPC), to enhance spatial information; a learnable sparse mechanism (LSM) and RNorm function to reduce noise and strengthen feature correlation; and a depth-wise separable convolutional network for improved local feature extraction. Furthermore, a multi-level attention differentiator dynamically selects image regions for precise behavior descriptions. To balance caption generation with classification, we introduce the ICL-Loss function, which adaptively adjusts loss weights. Extensive experiments on the PBC-CapLabels dataset demonstrate that PBC-Transformer outperforms 13 commonly used classification models, improving accuracy by 15% and achieving the highest scores across image captioning metrics: BLEU-4 (0.498), ROUGE-L (0.794), METEOR (0.393), and SPICE (0.613).
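As a rough illustration of why depth-wise separable convolutions are attractive for local feature extraction, the sketch below compares weight counts with a standard convolution. The layer sizes are hypothetical and are not taken from PBC-Transformer; biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv_params(64, 128, 3)         # 73728
separable = depthwise_separable_params(64, 128, 3)  # 8768
ratio = separable / standard                        # equals 1/c_out + 1/k**2
```

For this assumed layer the separable version needs roughly 12% of the weights of the standard convolution.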

(This article belongs to the Special Issue Animal–Computer Interaction: New Horizons in Animal Welfare)

19 pages, 5919 KB

Open AccessArticle

Evaluation of the Effectiveness of the UNet Model with Different Backbones in the Semantic Segmentation of Tomato Leaves and Fruits

by Juan Pablo Guerra Ibarra, Francisco Javier Cuevas de la Rosa and Julieta Raquel Hernandez Vidales

Horticulturae 2025, 11(5), 514; https://doi.org/10.3390/horticulturae11050514 - 9 May 2025

Viewed by 755

Timely identification of crop conditions is relevant for informed decision-making in precision agriculture. The initial step in determining the conditions that crops require involves isolating the components that constitute them, including the leaves and fruits of the plants. One method for conducting this separation is intelligent digital image processing, wherein plant elements are labeled for subsequent analysis. Deep Learning algorithms offer an alternative approach for segmenting images obtained from complex environments whose intricate patterns make separation difficult. One such application is semantic segmentation, which involves assigning a label to each pixel in the processed image. This task is accomplished by training various Convolutional Neural Network models. This paper presents a comparative analysis of semantic segmentation performance using a convolutional neural network model with different backbone architectures. The task focuses on pixel-wise classification into three categories: leaves, fruits, and background, based on images of semi-hydroponic tomato crops captured in greenhouse settings. The main contribution lies in identifying the most efficient backbone-UNet combination for segmenting tomato plant leaves and fruits under uncontrolled conditions of lighting and background during image acquisition. The UNet Convolutional Neural Network model is implemented with different backbones, using transfer learning to take advantage of the knowledge acquired by other models trained on the ImageNet dataset, such as MobileNet, VanillaNet, MVanillaNet, ResNet, and VGGNet, in order to segment the leaves and fruits of tomato plants. The highest performance across five metrics for tomato fruit and leaf segmentation was obtained by the MVanillaNet-UNet and VGGNet-UNet combinations, with 0.88089 and 0.89078, respectively. A comparison of the best results of semantic segmentation versus those obtained with a color-dominant segmentation method optimized with a greedy algorithm is presented.
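Pixel-wise classification into three categories can be scored per class. The abstract does not name its five metrics, so the sketch below uses intersection over union (IoU) as an assumed example; the label ids and the tiny masks are hypothetical.

```python
CLASSES = ("background", "leaf", "fruit")  # hypothetical label ids 0, 1, 2


def per_class_iou(pred, target, n_classes=3):
    """Return IoU for each class from two same-sized 2D label masks."""
    inter = [0] * n_classes
    union = [0] * n_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    return [i / u if u else float("nan") for i, u in zip(inter, union)]


# Made-up 3 x 4 ground-truth and predicted masks.
target = [[0, 0, 1, 1],
          [0, 2, 2, 1],
          [0, 2, 2, 1]]
pred = [[0, 0, 1, 1],
        [0, 2, 1, 1],
        [0, 0, 2, 1]]

ious = per_class_iou(pred, target)
mean_iou = sum(ious) / len(ious)
```

Here two fruit pixels are misclassified, so the fruit IoU drops to 0.5 while background and leaf each score 0.8.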

(This article belongs to the Section Vegetable Production Systems)
