MDPI - Publisher of Open Access Journals

23 pages, 6924 KiB

Open AccessArticle

A Dynamic Multi-Scale Feature Fusion Network for Enhanced SAR Ship Detection

by Rui Cao and Jianghua Sui

Sensors 2025, 25(16), 5194; https://doi.org/10.3390/s25165194 - 21 Aug 2025

Viewed by 239

This study aims to develop an enhanced YOLO algorithm that improves ship detection in synthetic aperture radar (SAR) imagery under complex marine conditions. Current SAR ship detection methods face several challenges in complex sea conditions, including environmental interference, false detections, and multi-scale variation of targets. To address these issues, this study combines multi-level feature fusion with a dynamic detection mechanism. First, a cross-stage partial dynamic channel transformer module (CSP_DTB) is designed, which combines the transformer architecture with a convolutional neural network and replaces the last two C3k2 layers in the YOLOv11n backbone, thereby enhancing the model’s feature extraction capability. Second, a general dynamic feature pyramid network (RepGFPN) is introduced to reconstruct the neck network, enabling more efficient multi-scale feature fusion and information propagation. Additionally, a lightweight dynamic decoupled dual-alignment head (DYDDH) is constructed to improve the cooperation between localization and classification through task-specific feature decoupling. On the HRSID dataset, the proposed DRGD-YOLO algorithm achieves a mean average precision of 93.1% at an IoU threshold of 0.50 (mAP50) and 69.2% averaged over the IoU threshold range of 0.50–0.95 (mAP50–95). Compared to the baseline YOLOv11n algorithm, the proposed method improves mAP50 and mAP50–95 by 3.3% and 4.6%, respectively. The proposed DRGD-YOLO algorithm improves the accuracy and robustness of SAR ship detection and shows application potential in fields such as maritime surveillance, fisheries management, and maritime safety monitoring, providing technical support for the development of intelligent marine monitoring technology.

(This article belongs to the Section Navigation and Positioning)
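The mAP50 figure in the abstract above depends on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The following is a minimal sketch of that test, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form; the function names are illustrative and are not taken from the article.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.50):
    """A prediction counts as a match at this threshold if IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

With a threshold of 0.50 this corresponds to the matching rule behind mAP50; stricter thresholds up to 0.95 are used for mAP50–95.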

20 pages, 1331 KiB

Open AccessArticle

Distribution Network Situational Awareness Prediction Based on Spatio-Temporal Attention Dynamic Graph Neural Network

by Xixi Qiu, Yuteng Huang, Guojin Liu, Jiaxiang Yan and Shan Chen

Energies 2025, 18(16), 4402; https://doi.org/10.3390/en18164402 - 18 Aug 2025

Viewed by 262

Abstract

Distribution network situational awareness prediction is a key technology for ensuring the safe and stable operation of distribution networks. However, most existing methods struggle to handle dynamic spatio-temporal correlations and dynamic topology, resulting in unsatisfactory performance. To address these issues, we propose a distribution network situational awareness prediction method based on a spatio-temporal attention dynamic graph neural network. The model decouples the spatio-temporal features of distribution network data by alternately stacking a multi-head self-attention mechanism with temporal dynamic perception and a spatial dynamic graph convolution module. Furthermore, a dynamic correlation matrix is introduced to adaptively adjust node interaction weights so that dynamic topology information is handled effectively. Extensive experiments show that the proposed method outperforms eight baseline models.

(This article belongs to the Section A1: Smart Grids and Microgrids)
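The abstract above does not say how the dynamic correlation matrix is computed. As one possible reading, the sketch below derives row-normalised interaction weights from the current node features (dot-product similarity followed by a softmax) and uses them to aggregate neighbour information. The similarity measure, the softmax normalisation, and all names are assumptions for illustration, not the authors' formulation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_correlation(features):
    """Assumed form: softmax over pairwise dot products of node features."""
    scores = [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
              for fi in features]
    return [softmax(row) for row in scores]


def aggregate(features, weights):
    """Each node's new feature is a weighted sum of all node features."""
    dim = len(features[0])
    return [[sum(w * f[d] for w, f in zip(row, features)) for d in range(dim)]
            for row in weights]


# Three hypothetical nodes; the first two have identical measurements.
nodes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = dynamic_correlation(nodes)
updated = aggregate(nodes, weights)
```

Because the weights are recomputed from the features at every step, they change as the measurements change, which is the sense in which such a matrix is "dynamic" in this sketch.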

23 pages, 1657 KiB

Open AccessArticle

High-Precision Pest Management Based on Multimodal Fusion and Attention-Guided Lightweight Networks

by Ziye Liu, Siqi Li, Yingqiu Yang, Xinlu Jiang, Mingtian Wang, Dongjiao Chen, Tianming Jiang and Min Dong

Insects 2025, 16(8), 850; https://doi.org/10.3390/insects16080850 - 16 Aug 2025

Viewed by 587

Abstract

In the context of global food security and sustainable agricultural development, the efficient recognition and precise management of agricultural insect pests and their predators have become critical challenges in the domain of smart agriculture. To address the limitations of traditional models that overly rely on single-modal inputs and suffer from poor recognition stability under complex field conditions, a multimodal recognition framework is proposed. This framework integrates RGB imagery, thermal infrared imaging, and environmental sensor data. A cross-modal attention mechanism, an environment-guided modality weighting strategy, and decoupled recognition heads are incorporated to enhance the model’s robustness against small targets, intermodal variations, and environmental disturbances. Evaluated on a high-complexity multimodal field dataset, the proposed model outperforms mainstream methods across four key metrics, achieving 91.5% precision, 89.2% recall, 90.3% F1-score, and 88.0% mAP@50. These results represent an improvement of over 6% compared to representative models such as YOLOv8 and DETR. Additional ablation studies confirm the contributions of key modules, particularly under challenging scenarios such as low light, strong reflections, and sensor data noise. Moreover, deployment tests conducted on the Jetson Xavier edge device demonstrate the feasibility of real-world application, with the model achieving a 25.7 FPS inference speed and a compact size of 48.3 MB, thus balancing accuracy and lightweight design. This study provides an efficient, intelligent, and scalable AI solution for pest surveillance and biological control, contributing to precision pest management in agricultural ecosystems.

(This article belongs to the Special Issue Artificial Intelligence (AI) and Insect Pests Management: Securing Food Security, Human Health, and Natural Resources)
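The F1-score is the harmonic mean of precision and recall, so the three figures reported in the abstract above can be checked against each other. A short sketch (the function name is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.915, 0.892)
print(f"{f1:.3f}")  # 0.903
```

The result agrees with the reported 90.3% F1-score.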

18 pages, 22685 KiB

Open AccessArticle

ASC-YOLO: Multi-Scale Feature Fusion and Adaptive Decoupled Head for Fracture Detection in Medical Imaging

by Shenghong Du and Yan Wei

Appl. Sci. 2025, 15(16), 9031; https://doi.org/10.3390/app15169031 - 15 Aug 2025

Viewed by 277

Abstract

Fractures occur frequently in daily life, and a surgeon's treatment plan depends on the radiologist’s diagnosis from the X-ray. Despite progress in deep learning-based fracture detection, existing methods (e.g., two-stage detectors) face challenges such as missed small targets and sensitivity to background interference in complex medical images. To address these issues, this paper proposes the ASC-YOLO model, which employs the Scale-Sensitive Feature Fusion (SSFF) module to enhance multi-scale information extraction through cross-layer feature interaction. In addition, an Adaptive Decoupled Detection Head (AsDDet) is introduced to decouple the classification and regression tasks of the detection head, thereby improving the localization accuracy of small fracture regions and suppressing background noise. Experiments on GRAZPEDWRI-DX, a large fracture radiograph dataset, demonstrate that ASC-YOLO achieves 61% mAP@50, an improvement of 8 percentage points over the baseline YOLO model (53% mAP@50). It attains 95% mAP@50 for the fracture category and 97% mAP@50 for the metal category. Furthermore, the model was evaluated on a tumor dataset to verify its generalization capability. The proposed framework provides technical support for accurate fracture screening, which is expected to reduce missed diagnoses and optimize treatment.

28 pages, 2107 KiB

Open AccessArticle

A Scale-Adaptive and Frequency-Aware Attention Network for Precise Detection of Strawberry Diseases

by Kaijie Zhang, Yuchen Ye, Kaihao Chen, Zao Li and Hongxing Peng

Agronomy 2025, 15(8), 1969; https://doi.org/10.3390/agronomy15081969 - 15 Aug 2025

Viewed by 316

Abstract

Accurate and automated detection of diseases is crucial for sustainable strawberry production. However, the small size, mutual occlusion, and high intra-class variance of symptoms in complex agricultural environments make this difficult, and mainstream deep learning detectors often do not perform well under these conditions. To address this gap, we propose a detection framework designed for accuracy and robustness. Our framework introduces four key innovations. First, we propose an attention-driven detection head featuring our Parallel Pyramid Attention (PPA) module. Inspired by pyramid attention principles, the module’s parallel multi-branch architecture is designed to overcome the limitations of serial processing. It simultaneously integrates global, local, and serial features to generate a fine-grained attention map, improving the model’s focus on targets of varying scales. Second, we enhance the core feature fusion blocks by integrating Monte Carlo Attention (MCAttn), which helps the model recognize targets across diverse scales. Third, to improve the feature representation capacity of the backbone without increasing the parametric overhead, we replace standard convolutions with Frequency-Dynamic Convolutions (FDConv), which construct highly diverse kernels in the frequency domain. Finally, we employ the Scale-Decoupled Loss function to optimize training dynamics. By adaptively re-weighting the localization and scale losses based on target size, we stabilize the training process and improve the precision of bounding box regression for small objects. Extensive experiments on a challenging strawberry disease dataset demonstrate that our proposed model achieves a mean Average Precision (mAP) of 81.1%. This represents an improvement of 2.1% over the strong YOLOv12-n baseline, highlighting its practical value as an effective tool for intelligent disease protection.

(This article belongs to the Special Issue Modern Control of Biotic Stress in Crops: Intelligent Detection and Precision Pesticide Application)

17 pages, 3121 KiB

Open AccessArticle

Development of Resonant De Ice Device Based on Visual Detection of Line Ice Coverage

by Yuan Ma, Xingping He, Peng Wu, Lei Chen, Yikai Wang, Ke Wang, Chengmeng Liu and Jing Fang

Electronics 2025, 14(16), 3246; https://doi.org/10.3390/electronics14163246 - 15 Aug 2025

Viewed by 215

Abstract

To address the ice coating problem on medium-voltage overhead lines, this paper proposes a resonance de-icing device based on an improved YOLOv7 algorithm to achieve efficient and intelligent ice detection and removal. First, the SimAM three-dimensional attention mechanism is introduced to optimize feature extraction, the MPDIoU loss function is used to enhance bounding box regression accuracy, and a task-specific context decoupling head is designed to separate the classification and regression tasks. Together, these changes improve ice detection accuracy and real-time performance. Second, an integrated ice observation and de-icing device is developed, which incorporates the improved YOLOv7 visual detection algorithm and a resonance vibration module. Through a dynamic frequency optimization strategy, the excitation frequency is precisely matched to the natural frequency of the conductor. Engineering experiments demonstrate that the de-icing device designed in this study can effectively identify the condition of ice-covered conductors and de-ice them. This research presents a technical solution for the intelligent de-icing of overhead lines that is of value for engineering applications.

(This article belongs to the Special Issue Pattern Recognition and Sensor Fusion Solutions in Intelligent Sensor Systems, 2nd Edition)
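The abstract above does not describe the conductor model behind its frequency matching. As a rough illustration only, the sketch below treats the conductor span as an ideal taut string, whose natural frequencies are f_n = (n / 2L) * sqrt(T / mu), and finds the mode closest to a given excitation frequency. The taut-string model, the parameter values, and the function names are all assumptions; a real iced conductor has sag, bending stiffness, and an ice load that changes its mass per unit length.

```python
import math


def natural_frequency(mode, span_m, tension_n, mass_per_m):
    """Natural frequency (Hz) of mode n for an ideal taut string."""
    return (mode / (2.0 * span_m)) * math.sqrt(tension_n / mass_per_m)


def closest_mode(excitation_hz, span_m, tension_n, mass_per_m, max_mode=10):
    """Return (mode, frequency) of the mode nearest the excitation frequency."""
    modes = [(n, natural_frequency(n, span_m, tension_n, mass_per_m))
             for n in range(1, max_mode + 1)]
    return min(modes, key=lambda item: abs(item[1] - excitation_hz))


# Hypothetical span: 50 m long, 10 kN tension, 0.5 kg/m including ice.
mode, freq = closest_mode(2.9, span_m=50.0, tension_n=10_000.0, mass_per_m=0.5)
```

Since added ice raises the mass per unit length and lowers every natural frequency, a matching strategy would need to re-estimate the target frequency as the ice load changes.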

21 pages, 15647 KiB

Open AccessArticle

Research on Oriented Object Detection in Aerial Images Based on Architecture Search with Decoupled Detection Heads

by Yuzhe Kang, Bohao Zheng and Wei Shen

Appl. Sci. 2025, 15(15), 8370; https://doi.org/10.3390/app15158370 - 28 Jul 2025

Viewed by 443

Abstract

Object detection in aerial images can provide great support in traffic planning, national defense reconnaissance, hydrographic surveys, infrastructure construction, and other fields. Objects in aerial images are characterized by small pixel–area ratios, dense arrangements, and arbitrary inclination angles. In response to these characteristics, we improved the feature extraction network Inception-ResNet using the Fast Architecture Search (FAS) module and proposed a one-stage anchor-free rotated object detector. The detector has a simple structure that consists only of convolution layers, which reduces the number of model parameters. In addition, the label sampling strategy used during training is optimized to resolve the problem of insufficient sampling. Finally, a decoupled object detection head is used to separate the bounding box regression task from the object classification task. The experimental results show that the proposed method achieves mean average precision (mAP) of 82.6%, 79.5%, and 89.1% on the DOTA1.0, DOTA1.5, and HRSC2016 datasets, respectively, and the detection speed reaches 24.4 FPS, which can meet the needs of real-time detection.

(This article belongs to the Special Issue Innovative Applications of Artificial Intelligence in Engineering)
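A decoupled head, as mentioned in the abstract above, gives classification and box regression their own parameters instead of one shared output layer. The sketch below shows only that structural idea with two independent linear branches on a shared feature vector. Practical detection heads use stacks of convolutions per branch, and the class name, layer sizes, and random initialisation here are assumptions for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Plain fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class DecoupledHead:
    """Two branches with separate weights: class scores and box regression."""

    def __init__(self, in_dim, num_classes, seed=0):
        rng = random.Random(seed)

        def init(rows):
            return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                    for _ in range(rows)]

        # Classification branch parameters.
        self.cls_w, self.cls_b = init(num_classes), [0.0] * num_classes
        # Regression branch parameters (4 box values), not shared with cls.
        self.reg_w, self.reg_b = init(4), [0.0] * 4

    def forward(self, feature):
        logits = linear(self.cls_w, self.cls_b, feature)
        scores = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        box = linear(self.reg_w, self.reg_b, feature)
        return scores, box


head = DecoupledHead(in_dim=8, num_classes=3)
scores, box = head.forward([0.5] * 8)
```

Because the two branches share no weights, gradients from the classification loss do not directly alter the regression parameters, and vice versa.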

17 pages, 3612 KiB

Open AccessArticle

MPVT: An Efficient Multi-Modal Prompt Vision Tracker for Visual Target Tracking

by Jianyu Xie, Yan Fu, Junlin Zhou, Tianxiang He, Xiaopeng Wang, Yuke Fang and Duanbing Chen

Appl. Sci. 2025, 15(14), 7967; https://doi.org/10.3390/app15147967 - 17 Jul 2025

Viewed by 373

Abstract

Visual target tracking is a fundamental task in computer vision. Combining multi-modal information with tracking leverages complementary information, which improves the precision and robustness of trackers. Traditional multi-modal tracking methods typically employ a full fine-tuning scheme, i.e., fine-tuning pre-trained single-modal models on multi-modal tasks. However, this approach suffers from low transfer learning efficiency, catastrophic forgetting, and high cross-task deployment costs. To address these issues, we propose a model named multi-modal prompt vision tracker (MPVT) based on an efficient prompt-tuning paradigm. The model has three key components: a decoupled input enhancement module, a dynamic adaptive prompt fusion module, and a fully connected head network module. The decoupled input enhancement module enhances input representations via positional and type embedding. The dynamic adaptive prompt fusion module achieves efficient prompt tuning and multi-modal interaction using scaled convolution and low-rank cross-modal attention mechanisms. The fully connected head network module addresses shortcomings of traditional convolutional head networks, such as inductive biases. Experimental results from RGB-T, RGB-D, and RGB-E scenarios show that MPVT outperforms state-of-the-art methods. Moreover, MPVT reduces GPU memory usage by 43.8% and training time by 62.9% compared with a full-parameter fine-tuning model.

(This article belongs to the Special Issue Advanced Technologies Applied for Object Detection and Tracking)

24 pages, 40762 KiB

Open AccessArticle

Multiscale Task-Decoupled Oriented SAR Ship Detection Network Based on Size-Aware Balanced Strategy

by Shun He, Ruirui Yuan, Zhiwei Yang and Jiaxue Liu

Remote Sens. 2025, 17(13), 2257; https://doi.org/10.3390/rs17132257 - 30 Jun 2025

Viewed by 386

Abstract

Current synthetic aperture radar (SAR) ship datasets exhibit a notable disparity in the distribution of large, medium, and small ship targets. This imbalance makes it difficult for the relatively small number of large and medium-sized ships to be learned effectively, resulting in many false alarms. Therefore, to address scale diversity, intra-class imbalance in ship data, and the feature conflict associated with traditional coupled detection heads, we propose an SAR image multiscale task-decoupled oriented ship target detector based on a size-aware balanced strategy. First, multiscale target features are extracted using the multikernel heterogeneous perception module (MKHP). Meanwhile, a triple-attention module is introduced to establish long-range channel dependencies and alleviate the annihilation of small-target features, which enhances the feature representation ability of the model. Second, given that the detection and classification tasks require different feature information, a channel attention-based task decoupling dual-head (CAT2D) detection head is introduced to address the inherent conflict between classification and localization tasks. Finally, a new size-aware balanced (SAB) loss strategy is proposed to guide the network to focus on scarce targets during training, alleviating the intra-class imbalance problem. Ablation experiments on SSDD+ show the contribution of each component, and comparison experiments on the RSDD-SAR and HRSID datasets show that the proposed method outperforms other state-of-the-art detection models. Furthermore, our approach exhibits superior detection coverage in both offshore and inshore ship detection scenarios.

(This article belongs to the Section Remote Sensing Image Processing)

17 pages, 3477 KiB

Open AccessArticle

Rapid Identification of Mangrove Leaves Based on Improved YOLOv10 Model

by Haitao Sang, Ziming Li, Xiaoxue Shen, Shuwen Wang and Ying Zhang

Forests 2025, 16(7), 1068; https://doi.org/10.3390/f16071068 - 26 Jun 2025

Viewed by 303

Abstract

To address the issue of low recognition accuracy caused by the high morphological similarity of mangrove plant leaves, this study proposes a rapid identification method for mangrove leaves based on the YOLOv10 model, with corresponding improvements made to the baseline model. First, the open-source tool LabelImg was employed to annotate leaf images and construct a mangrove leaf species dataset. Subsequently, a PSA attention mechanism was introduced to enhance the extraction of leaf detail features, while the SCDown downsampling method was adopted to preserve key characteristics. Furthermore, a BiFPN architecture incorporating SE modules was implemented to dynamically adjust channel weights for multi-scale feature fusion. Finally, the classification and regression tasks were decoupled by separating the detection head, and the final model was named YOLOv10-MSDet. Experimental results demonstrate that the improved model achieves rapid and accurate identification of various mangrove leaf species, with an average recognition accuracy of 92.4%, a 2.84 percentage point improvement over the baseline model.

(This article belongs to the Special Issue Mangrove Ecosystems in the Face of Climate Change: Resilience, Adaptation, and Conservation Strategies)
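The SE (squeeze-and-excitation) modules mentioned in the abstract above re-weight feature channels using a gate computed from each channel's global average. The sketch below follows the usual squeeze, excite, and scale steps on nested lists; bias terms are omitted, and the weight matrices and toy input are placeholders rather than values from the article.

```python
import math


def se_scale(feature_map, w_reduce, w_expand):
    """Squeeze-and-excitation on a feature map given as C channels of H x W."""
    # Squeeze: global average of every channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excite: reduce -> ReLU -> expand -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


# Toy input: two 2x2 channels, one hidden unit in the bottleneck.
fmap = [[[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]]]
scaled, gates = se_scale(fmap, w_reduce=[[1.0, 1.0]], w_expand=[[0.0], [1.0]])
```

In a trained network the two weight matrices are learned, so the gates emphasise channels that are informative for the current input.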

20 pages, 4244 KiB

Open AccessArticle

Edge-Optimized Lightweight YOLO for Real-Time SAR Object Detection

by Caiguang Zhang, Ruofeng Yu, Shuwen Wang, Fatong Zhang, Shaojia Ge, Shuangshuang Li and Xuezhou Zhao

Remote Sens. 2025, 17(13), 2168; https://doi.org/10.3390/rs17132168 - 24 Jun 2025

Viewed by 1032

Abstract

Synthetic Aperture Radar (SAR) image object detection holds significant application value in both military and civilian domains. However, existing deep learning-based methods suffer from excessive model parameters and high computational costs, making them impractical for real-time deployment on edge computing platforms. To address these challenges, this paper proposes a lightweight SAR object detection method optimized for edge devices. First, we design an efficient backbone network based on inverted residual blocks and the information bottleneck principle, balancing feature extraction capability against computational resource consumption. Then, a Fast Feature Pyramid Network is constructed to enable efficient multi-scale feature fusion. Finally, we propose a decoupled network-in-network Head, which reduces the computational overhead while maintaining detection accuracy. Experimental results demonstrate that the proposed method achieves detection performance comparable to state-of-the-art YOLO variants while drastically reducing computational complexity (4.4 GFLOPs) and parameter count (1.9 M). On edge platforms (Jetson TX2 and Huawei Atlas DK 310), the model achieves real-time inference speeds of 34.2 FPS and 30.7 FPS, respectively, showing its suitability for resource-constrained, real-time SAR object detection scenarios.

(This article belongs to the Special Issue Synthetic Aperture Radar (SAR) Image Object Detection and Information Extraction: Methods and Applications (Second Edition))
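Frame rates such as those in the abstract above are easier to compare with a processing budget when expressed as time per frame. A small conversion sketch follows (the helper name is illustrative; the FPS values are the ones reported in the abstract):

```python
def frame_latency_ms(fps):
    """Average time spent per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


reported_fps = {"Jetson TX2": 34.2, "Huawei Atlas DK 310": 30.7}
for platform, fps in reported_fps.items():
    print(f"{platform}: {frame_latency_ms(fps):.1f} ms per frame")
```

This gives roughly 29.2 ms and 32.6 ms per frame for the two platforms.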

22 pages, 4582 KiB

Open AccessArticle

Enhanced Object Detection in Thangka Images Using Gabor, Wavelet, and Color Feature Fusion

by Yukai Xian, Yurui Lee, Te Shen, Ping Lan, Qijun Zhao and Liang Yan

Sensors 2025, 25(11), 3565; https://doi.org/10.3390/s25113565 - 5 Jun 2025

Cited by 1 | Viewed by 549

Abstract

Thangka image detection poses unique challenges due to complex iconography, densely packed small-scale elements, and stylized color–texture compositions. Existing detectors often struggle to capture both global structures and local details and rarely leverage domain-specific visual priors. To address this, we propose a frequency- and prior-enhanced detection framework based on YOLOv11, specifically tailored for Thangka images. We introduce a Learnable Lifting Wavelet Block (LLWB) to decompose features into low- and high-frequency components, while LLWB_Down and LLWB_Up enable frequency-guided multi-scale fusion. To incorporate chromatic and directional cues, we design a Color-Gabor Block (CGBlock), a dual-branch attention module based on HSV histograms and Gabor responses, and embed it via the Color-Gabor Cross Gate (C2CG) residual fusion module. Furthermore, we redesign all detection heads with decoupled branches and introduce center-ness prediction, alongside an additional shallow detection head to improve recall for ultra-small targets. Extensive experiments on a curated Thangka dataset demonstrate that our model achieves 89.5% mAP@0.5, 59.4% mAP@[0.5:0.95], and 84.7% recall, surpassing all baseline detectors while maintaining a compact size of 20.9 M parameters. Ablation studies validate the individual and synergistic contributions of each proposed component. Our method provides a robust and interpretable solution for fine-grained object detection in complex heritage images.

(This article belongs to the Section Sensing and Imaging)
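The mAP@[0.5:0.95] figure in the abstract above follows the COCO-style convention of averaging the mean average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the averaging step; the per-threshold values are made up for illustration and are not results from the article.

```python
def iou_thresholds():
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def map_50_95(map_at_threshold):
    """Average of per-threshold mAP values over the ten thresholds."""
    thresholds = iou_thresholds()
    return sum(map_at_threshold[t] for t in thresholds) / len(thresholds)


# Hypothetical per-threshold values: accuracy drops as the IoU test tightens.
example = {t: 0.90 - 0.8 * (t - 0.50) for t in iou_thresholds()}
summary = map_50_95(example)
```

Since the strict thresholds pull the average down, mAP@[0.5:0.95] is normally well below mAP@0.5 for the same model, as in the figures reported above.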

25 pages, 6066 KiB

Open AccessArticle

FD²-YOLO: A Frequency-Domain Dual-Stream Network Based on YOLO for Crack Detection

by Junwen Zhu, Jinbao Sheng and Qian Cai

Sensors 2025, 25(11), 3427; https://doi.org/10.3390/s25113427 - 29 May 2025

Viewed by 785

Abstract

Crack detection in cement infrastructure is imperative to ensure its structural integrity and public safety. However, most existing methods apply multi-scale and attention mechanisms to a single backbone, and a single backbone network is often ineffective at detecting slender or variable cracks in complex scenarios. We propose a novel network, FD²-YOLO, based on frequency-domain dual-stream YOLO, for accurate and efficient detection of cement cracks. First, the model employs a dual backbone architecture, integrating edge and texture features in the frequency domain with semantic features in the spatial domain, to enhance the extraction of crack-related features. Furthermore, the Dynamic Inter-Domain Feature Fusion module (DIFF) is introduced, which uses large-kernel deep convolution and the Hadamard product to adaptively fuse features from different domains, thus addressing the difficulty of fusing features across domains. Finally, the DIA-Head module is proposed, which introduces the Deformable Interactive Attention Module (DIA Module) into the decoupled head so that the head dynamically focuses on the texture and geometric deformation features of cracks. Extensive experiments on the RDD2022 dataset demonstrate that FD²-YOLO achieves state-of-the-art performance. Compared with existing YOLO-based models, it improves mAP50 by 1.3%, mAP50-95 by 1.1%, recall by 1.8%, and precision by 0.5%, validating its effectiveness in real-world object detection scenarios. In addition, evaluation on the UAV-PDD2023 dataset further confirms the robustness and generalization of our approach, where FD²-YOLO achieves a mAP50 of 67.9%, mAP50-95 of 35.9%, recall of 61.2%, and precision of 75.9%, consistently outperforming existing lightweight and Transformer-based detectors under more complex aerial imaging conditions.

(This article belongs to the Section Physical Sensors)
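The abstract above mentions Hadamard-based fusion of features from two domains but gives no formula. The Hadamard product itself is the element-wise product of two equally shaped arrays; one common way to use it for fusion is to turn one branch into a gate between 0 and 1 and multiply it into the other. The gating arrangement and function names below are assumptions for illustration, not the DIFF module's actual design.

```python
import math


def hadamard(a, b):
    """Element-wise product of two equally shaped 2D lists."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def gated_fusion(spatial, frequency):
    """Assumed scheme: sigmoid of the frequency branch gates the spatial one."""
    gate = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in frequency]
    return hadamard(spatial, gate)


product = hadamard([[1.0, 2.0], [3.0, 4.0]], [[2.0, 0.0], [1.0, 0.5]])
fused = gated_fusion([[2.0, 4.0]], [[0.0, 0.0]])
```

Because the gate varies per position, the fused output can pass spatial features through where the frequency branch responds strongly and suppress them elsewhere.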

30 pages, 11131 KiB

Open AccessArticle

TCN–Transformer Spatio-Temporal Feature Decoupling and Dynamic Kernel Density Estimation for Gas Concentration Fluctuation Warning

by Yanping Wang, Longcheng Zhang, Zhenguo Yan, Jun Deng, Yuxin Huang, Zhixin Qin, Yuqi Cao and Yiyang Wang

Fire 2025, 8(5), 175; https://doi.org/10.3390/fire8050175 - 30 Apr 2025

Viewed by 543

Abstract

This study addresses the problems of multi-source data redundancy, insufficient capture of temporal features, and delayed risk warning in the prediction of gas concentration in fully mechanized coal-mining operations by constructing a three-pronged technical approach that integrates feature dimensionality reduction, hybrid modeling, and intelligent early warning. First, sparse kernel principal component analysis (SKPCA) is used to decouple the features of multi-source monitoring data, and its dimensionality reduction performance is verified using long short-term memory (LSTM) networks. Second, a TCN–Transformer hybrid architecture is proposed. The transient fluctuation characteristics of gas concentration are captured using dilated causal convolution, while a multi-head self-attention mechanism is used to analyze the cross-scale correlation of geological mining parameters. A flood optimization algorithm (FLA) is used to establish a hyperparameter collaborative optimization framework. Compared to TCN-LSTM, CNN-GRU, and other hybrid models, the hybrid model proposed in this study exhibits superior point prediction performance, with a maximum R² of 0.980988. Finally, a dynamic confidence interval is established using the locally weighted kernel density estimation (LWD-KDE) method with an optimized bandwidth, and an unsupervised early warning mechanism for the risk of gas concentration fluctuations in coal mines is constructed. The results provide a comprehensive approach to preventing and controlling gas disasters in fully mechanized mining operations and support the upgrading of coal-mine-safety-monitoring systems toward an active defense paradigm.
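The interval-building step described above can be illustrated with ordinary kernel density estimation. The sketch below uses a plain Gaussian KDE with a fixed bandwidth, not the locally weighted LWD-KDE variant or the optimized bandwidth from the article, and reads a central interval off the numerically integrated density. The sample values, bandwidth, and function names are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on the samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return pdf


def kde_interval(samples, bandwidth, coverage=0.95, grid_points=2001):
    """Central interval holding roughly `coverage` of the estimated density."""
    lo = min(samples) - 4.0 * bandwidth
    hi = max(samples) + 4.0 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    pdf = gaussian_kde(samples, bandwidth)
    tail = (1.0 - coverage) / 2.0

    cumulative, lower, upper = 0.0, lo, hi
    found_lower = False
    for i in range(grid_points):
        x = lo + i * step
        cumulative += pdf(x) * step  # Riemann-sum approximation of the CDF
        if not found_lower and cumulative >= tail:
            lower, found_lower = x, True
        if cumulative >= 1.0 - tail:
            upper = x
            break
    return lower, upper


# Hypothetical prediction errors (observed minus predicted concentration).
errors = [-0.2, -0.1, 0.0, 0.1, 0.2]
lower, upper = kde_interval(errors, bandwidth=0.1)
```

Adding the interval bounds to a point prediction gives a band, and an observation falling outside the band could serve as a warning trigger; how the article's method sets its thresholds is not stated in the abstract.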

17 pages, 43013 KiB

Open AccessArticle

Ship-Yolo: A Deep Learning Approach for Ship Detection in Remote Sensing Images

by Wuan Shi, Wen Zheng and Zhijing Xu

J. Mar. Sci. Eng. 2025, 13(4), 737; https://doi.org/10.3390/jmse13040737 - 7 Apr 2025

Viewed by 888

Abstract

This study introduces Ship-Yolo, a novel algorithm designed to tackle the challenges of detecting small targets against complex backgrounds in remote sensing imagery. Firstly, the proposed method integrates an efficient local attention mechanism into the C3 module of the neck network, forming the EDC module. This enhancement significantly improves the model’s capability to capture critical features, enabling robust performance in scenarios involving intricate backgrounds and multi-scale targets. Secondly, a Lightweight Asymmetric Decoupled Head (LADH-Head) is proposed to separate classification and regression tasks, reducing task conflicts, improving detection performance, and maintaining the model’s lightweight characteristics. Additionally, the LiteConv module is designed to replace the C3 module in the backbone network, leveraging partial convolution to ignore invalid information in occluded regions and avoid misjudgments. Finally, the Content-Aware Reassembly Upsampling Module (CARAFE) is employed to replace the original upsampling module, expanding the receptive field to better capture global information while preserving the lightweight nature of the model. Experiments on the ShipRSImageNet and DOTA datasets demonstrate that Ship-Yolo outperforms other YOLO variants and existing methods in terms of precision, recall, and average precision, exhibiting strong generalization capabilities. Ablation studies further validate the stable performance improvements contributed by the EDC, LADH-Head, LiteConv, and CARAFE modules.

(This article belongs to the Section Ocean Engineering)

Search Results (141)
