Search Results (370)

Search Parameters:
Keywords = large convolution kernels

20 pages, 3455 KB  
Article
FocusMamba: A Local–Global Mamba Framework Inspired by Visual Observation for Brain Tumor Segmentation
by Qiang Li, Tao Ni, Xueyan Wang and Hengxin Liu
Appl. Sci. 2026, 16(7), 3571; https://doi.org/10.3390/app16073571 - 6 Apr 2026
Abstract
Accurate brain tumor segmentation from magnetic resonance imaging (MRI) is crucial for brain tumor diagnosis, clinical treatment decisions, and advancing research. CNNs and Transformers have dominated this area, but CNNs struggle with long-range modeling, whereas Transformers are limited by the high computational costs of self-attention. Recently, Mamba has garnered significant attention due to its remarkable performance in long sequence modeling. However, the original Mamba architecture, designed primarily for 1D sequence modeling, fails to effectively capture the spatial and structural relationships essential for brain tumor segmentation. In this paper, we propose FocusMamba, a Mamba-based model inspired by human visual observation patterns, which jointly enhances local detail modeling and global contextual understanding. FocusMamba consists of three components: (i) a novel hierarchical and tri-directional Mamba unit that elevates attention from the global to the window level, reinforcing local semantic feature extraction, while simultaneously achieving window-level interactions to maintain broader global awareness, (ii) a large kernel convolution unit that captures long-range dependencies within whole-volume features, overcoming the limitations of Mamba’s single-scale context modeling, and (iii) a fusion unit that enhances the overall feature representation by fusing information from different levels. Extensive experiments on the BraTS 2023 and BraTS 2020 datasets demonstrate that FocusMamba achieves superior segmentation performance compared with several advanced methods. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
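The window-level scanning described in this abstract is not spelled out here; as a rough illustration of the kind of window partitioning such hierarchical designs rest on, a minimal NumPy sketch (all names hypothetical, not FocusMamba's actual code):

```python
import numpy as np

def window_partition(x: np.ndarray, ws: int) -> np.ndarray:
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows,
    returning (num_windows, ws*ws, C) token sequences for per-window scanning."""
    h, w, c = x.shape
    assert h % ws == 0 and w % ws == 0, "feature map must tile evenly"
    x = x.reshape(h // ws, ws, w // ws, ws, c)
    x = x.transpose(0, 2, 1, 3, 4)          # group each window's rows together
    return x.reshape(-1, ws * ws, c)        # flatten each window row-major

# A 4x4 single-channel map yields four 2x2 windows of 4 tokens each.
feat = np.arange(16, dtype=float).reshape(4, 4, 1)
windows = window_partition(feat, 2)
print(windows.shape)        # (4, 4, 1)
print(windows[0].ravel())   # [0. 1. 4. 5.]
```

Each window's token sequence can then be scanned independently, which is what elevates attention "from the global to the window level".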

21 pages, 15830 KB  
Article
A Deep Learning-Enhanced Adaptive Kalman Filter with Multi-Scale Temporal Attention for Airborne Gravity Denoising
by Lili Li, Junxiang Liu, Guoqing Ma and Zhexin Jiang
Sensors 2026, 26(7), 2216; https://doi.org/10.3390/s26072216 - 3 Apr 2026
Abstract
Airborne gravity surveying serves as a rapid remote sensing technique for mapping subsurface mineral targets and geological structures over large areas. Raw gravity data contain significant noise induced by airflow and the flight platform's attitude. The Kalman Filter (KF) is an effective method for airborne gravity data denoising, but its accuracy depends heavily on empirically chosen parameters. The multi-scale CNN-LSTM-attention adaptive Kalman Filter (MSC-LA-AKF) method is proposed to obtain high-precision gravity data; it combines a multi-scale CNN (MSC), bidirectional long short-term memory (Bi-LSTM), and an attention mechanism to adaptively estimate the parameters of the KF. The multi-scale CNN uses convolution kernels of varying sizes to extract signal features at different scales. The Bi-LSTM combines two LSTM layers in opposite directions to extract features from both temporal directions and can effectively identify time-varying noise signals. A multi-head attention mechanism with four attention heads (H = 4) is incorporated into the output feature layer of the Bi-LSTM to adaptively weight different features and optimize the parameters of the KF. Simulated data tests demonstrate that the MSC-LA-AKF achieves notably higher denoising accuracy than both the finite impulse response (FIR) and wavelet filters, with detailed quantitative comparisons provided in the experimental section. The proposed method is applied to real airborne gravity data, where it effectively removes noise and enhances the geological interpretation of gravity maps. Full article
(This article belongs to the Section Intelligent Sensors)
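The learned parameter estimation in MSC-LA-AKF is not reproduced here; the sketch below only illustrates the underlying idea of an adaptive scalar Kalman filter whose measurement-noise variance R is re-estimated at each step, with a simple local-variance function standing in for the CNN-LSTM-attention estimator (all names hypothetical):

```python
import numpy as np

def adaptive_kalman_1d(z, q, r_of_step):
    """Scalar random-walk Kalman filter; r_of_step(z, k) supplies the
    measurement-noise variance R at step k (the learned estimator's role)."""
    x, p = z[0], 1.0
    out = [x]
    for k in range(1, len(z)):
        p = p + q                      # predict: state variance grows by Q
        r = r_of_step(z, k)            # adaptive R in place of a fixed constant
        g = p / (p + r)                # Kalman gain
        x = x + g * (z[k] - x)         # update toward the measurement
        p = (1.0 - g) * p
        out.append(x)
    return np.asarray(out)

def local_var(z, k, w=25):
    """Crude stand-in estimator: variance of a trailing window."""
    seg = z[max(0, k - w):k + 1]
    return max(np.var(seg), 1e-6)

rng = np.random.default_rng(0)
z = 5.0 + rng.normal(0.0, 0.5, 500)          # noisy constant "gravity" track
xhat = adaptive_kalman_1d(z, 1e-5, local_var)
print(np.std(z[100:]), np.std(xhat[100:]))   # filtered track is much smoother
```

Replacing `local_var` with a learned network that maps signal features to Q/R is what turns this fixed-parameter filter into an adaptive one.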

31 pages, 28128 KB  
Article
HMF-DEIM: High-Fidelity Multi-Domain Fusion Transformer for UAV Small Object Detection
by Lan Ma, Yun Luo and Jiajun Xu
Sensors 2026, 26(7), 2187; https://doi.org/10.3390/s26072187 - 1 Apr 2026
Abstract
Unmanned aerial vehicle (UAV) small object detection faces critical challenges including irreversible geometric detail loss during multi-level downsampling, cross-scale feature distortion from interpolation blur and aliasing, and limited long-range dependency modeling due to constrained receptive fields. To address these limitations, we propose HMF-DEIM (High-Fidelity Multi-Domain Fusion Transformer for UAV Small Object Detection), an end-to-end architecture tailored for UAV small object detection. First, we design a lightweight hierarchical differentiation backbone that removes redundant deepest-layer features (P5) to prevent tiny object information loss, employing Multi-Domain Feature Blending (MDFB) in shallow layers for geometric detail preservation and a Hierarchical Attention-guided Feature Modulation Block (HAFMB) in deep layers for global semantic modeling. Second, we develop a full-chain high-fidelity feature transformation framework comprising Channel-Adaptive Shift Upsampling (CASU) for interpolation-free resolution recovery, Multi-scale Context Alignment Fusion (MCAF) for bridging deep–shallow semantic gaps via bidirectional gating, and Diversified Residual Frequency-aware Downsampling (DRFD) for aliasing suppression through a three-branch parallel architecture. Finally, we devise the FocusFeature module that aligns multi-scale features to a unified scale and employs parallel multi-scale large-kernel depthwise convolutions to capture cross-scale long-range dependencies, generating dual-scale (P3/P4) features balancing details and semantics. Experiments demonstrate that HMF-DEIM outperforms DEIM on VisDrone2019 test by 0.405 mAP50 (+2.1%) and 0.235 mAP50–95 (+1.6%), with a remarkable 21.3% relative improvement in APs for tiny objects, while maintaining real-time inference (465 FPS with TensorRT FP16) on an NVIDIA A100 GPU with only 11.87M parameters and 34.1 GFLOPs. Further validation on AI-TOD v2 and DOTA v1.5 datasets confirms robust generalization across diverse aerial scenarios, making it a practical solution for resource-constrained UAV applications. Full article
(This article belongs to the Special Issue Communications and Networking Based on Artificial Intelligence)
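As a checkable toy version of the parallel multi-scale large-kernel depthwise convolutions mentioned for the FocusFeature module, the sketch below runs three depthwise branches at kernel sizes 3, 5, and 7 and sums them; delta (identity) kernels are used purely so the result can be verified, whereas a real module would learn the weights:

```python
import numpy as np

def depthwise_conv2d(x, k):
    """'Same' zero-padded depthwise conv: x is (C, H, W), k is (C, kh, kw)."""
    c, h, w = x.shape
    kh, kw = k.shape[1:]
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(w):
            out[:, i, j] = (xp[:, i:i + kh, j:j + kw] * k).sum(axis=(1, 2))
    return out

def multi_scale_lk(x, sizes=(3, 5, 7)):
    """Sum of parallel depthwise branches, one per kernel size."""
    c = x.shape[0]
    out = np.zeros_like(x)
    for s in sizes:
        k = np.zeros((c, s, s))
        k[:, s // 2, s // 2] = 1.0        # delta kernel = identity branch
        out += depthwise_conv2d(x, k)
    return out

x = np.random.default_rng(1).normal(size=(2, 8, 8))
y = multi_scale_lk(x)
print(np.allclose(y, 3 * x))   # three identity branches sum to 3x -> True
```

Because each branch is depthwise, the parameter cost grows with kernel area but not with channel pairs, which is what makes large kernels affordable here.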

16 pages, 10364 KB  
Article
A Method for Filling Blank Stripes in Electrical Imaging Based on the Fusion of Arbitrary Kernel Convolution and Generative Adversarial Networks
by Ruhan A, Die Liu, Ge Cao, Kun Meng, Taiping Zhao, Lili Tian, Bin Zhao, Guilan Lin and Sinan Fang
Appl. Sci. 2026, 16(7), 3267; https://doi.org/10.3390/app16073267 - 27 Mar 2026
Abstract
Electrical imaging logging images play a crucial role in petroleum exploration; however, in practical applications, blank strips frequently appear due to instrument malfunctions or data transmission failures, severely compromising geological interpretation and hydrocarbon evaluation. Existing image inpainting methods have limited adaptability to blank strips at different depth scales and exhibit blurred high-resolution geological textures. To address these issues, this paper proposes a blank strip filling method that integrates Arbitrary Kernel Convolution (AKConv) with the Aggregated Contextual-Transformations Generative Adversarial Network (AOT-GAN). Specifically, the adaptive sampling mechanism of AKConv is incorporated into the generator network of AOT-GAN, enabling the model to effectively capture long-range contextual information and adaptively handle blank strips of varying scales and shapes through multi-scale feature fusion. Experimental results on real oilfield datasets demonstrate that the proposed method achieves significant improvements in PSNR, SSIM, and MAE, exhibiting superior structural preservation and texture sharpness, especially in restoring deep and large-scale blank strips. Furthermore, visual comparisons confirm the method's superior performance in recovering key geological features, such as bedding continuity and fracture structures, thus providing an effective approach for electrical imaging logging image restoration. Full article
(This article belongs to the Special Issue Applied Geophysical Imaging and Data Processing, 2nd Edition)

25 pages, 6659 KB  
Article
MDS3-Net: A Multiscale Spectral–Spatial Sequence Hybrid CNN–Transformer Model for Hyperspectral Image Classification
by Taonian Bian, Bin Yang, Yuanjiang Chen, Xuan Zhou, Li Yue and Shunshi Hu
Remote Sens. 2026, 18(7), 977; https://doi.org/10.3390/rs18070977 - 25 Mar 2026
Abstract
Hyperspectral image (HSI) classification faces significant challenges due to the spatial–spectral heterogeneity of land covers and the geometric rigidity of standard convolutions. Although Transformers offer powerful global modeling capabilities, their quadratic computational complexity limits practical efficiency. To address these limitations, this paper proposes a novel hierarchical framework named MDS3-Net (Multiscale Deformable Spectral–Spatial Sequence Network). Specifically, we design a Multiscale Spectral-Deformable Convolution (MSDC) module that adopts a cascaded strategy to sequentially extract discriminative spectral features and adaptively align spatial receptive fields with irregular object boundaries. To capture long-range dependencies efficiently, a Spectral–Spatial Sequence (S3) Encoder is introduced based on a gated large-kernel convolution mechanism, achieving global context modeling with linear complexity. Furthermore, a Dual-Path Feature Extraction (DPFE) module is proposed to perform semantics-preserving dimension reduction via spectral reorganization and spatial attention. Experimental results on four public datasets demonstrate that the proposed MDS3-Net achieves state-of-the-art classification performance and exhibits superior robustness under limited training samples compared to existing methods. Full article
(This article belongs to the Section Remote Sensing Image Processing)
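A gated large-kernel convolution of the kind the S3 Encoder is said to build on can be sketched in one dimension (matching spectral sequences): a large-kernel feature branch modulated elementwise by a sigmoid gate. This is a generic illustration, not the paper's implementation:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gated_large_kernel_1d(x, k_feat, k_gate):
    """Gated conv over a 1-D spectral sequence: feature branch * sigmoid gate.
    The gate decides, per position, how much of the context to let through."""
    feat = np.convolve(x, k_feat, mode="same")
    gate = sigmoid(np.convolve(x, k_gate, mode="same"))
    return feat * gate

x = np.random.default_rng(2).normal(size=64)
k_feat = np.ones(15) / 15.0          # large smoothing kernel, 15 taps
k_gate = np.zeros(15)                # zero gate weights -> gate = 0.5 everywhere
y = gated_large_kernel_1d(x, k_feat, k_gate)
print(np.allclose(y, 0.5 * np.convolve(x, k_feat, mode="same")))  # True
```

Because both branches are convolutions, the whole block stays linear in sequence length, which is the complexity advantage the abstract claims over self-attention.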

26 pages, 44003 KB  
Article
GLKC-Net: Group Large Kernel Convolution for Short-Range Precipitation Forecasting
by Jie Tan, Min Chen, Li Gao, Shaohan Li and Hao Yang
Atmosphere 2026, 17(3), 287; https://doi.org/10.3390/atmos17030287 - 12 Mar 2026
Abstract
Accurate short-range precipitation prediction plays a crucial role in daily life and disaster mitigation. However, the existing methods often suffer from inefficient large-scale feature extraction, severe redundant information interference, and insufficient attention to the problem of imbalanced data distributions, leading to unsatisfactory performance. To address these issues, in this paper, we first propose a novel spatiotemporal module called Group Large Kernel Convolution (GLKC) and develop a short-range precipitation forecasting model based on it, GLKC-Net, using multiple meteorological variables. Specifically, we use decomposed large-kernel convolution to enhance the ability to understand large-scale atmospheric processes. Meanwhile, we introduce the group convolution and channel shuffle operator to control the fusion of channel-wise information, enabling efficient information exchange and reducing redundancy in the channel dimension with multiple variables. Furthermore, we analyze the causes of poor model performance on extreme precipitation events from an imbalanced-data-distribution perspective and design a Multi-threshold Adaptive Loss function (MTA Loss). This function strengthens the model's focus on high-threshold precipitation events that are inherently difficult to forecast, aiming to improve model performance for extreme events. Finally, forecasting experiments for validation were conducted over southwestern China using ERA5-Land and CMPAS datasets. The results demonstrate that our proposed method outperforms several existing approaches in terms of forecasting accuracy. Full article
(This article belongs to the Special Issue Atmospheric Modeling with Artificial Intelligence Technologies)
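The channel shuffle operator paired with group convolution in GLKC is a standard, well-documented operation (popularized by ShuffleNet); a minimal NumPy version:

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups (NCHW layout) so the next group
    convolution sees information from every previous group."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must divide evenly into groups"
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)     # swap the group and per-group axes
    return x.reshape(n, c, h, w)

# Six channels, two groups: [0 1 2 | 3 4 5] -> [0 3 1 4 2 5]
x = np.arange(6, dtype=float).reshape(1, 6, 1, 1)
print(channel_shuffle(x, 2).ravel())   # [0. 3. 1. 4. 2. 5.]
```

Without the shuffle, stacked group convolutions would keep each channel group isolated; the reshape-transpose-reshape restores cross-group information flow at zero parameter cost.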

21 pages, 4501 KB  
Article
YOLOv8n-ALC: An Efficient Network for Bolt-Nut Fastener Detection in Complex Substation Environments
by Dazhang You, Fangke Li, Sicheng Wang and Yepeng Zhang
Appl. Sci. 2026, 16(6), 2716; https://doi.org/10.3390/app16062716 - 12 Mar 2026
Abstract
Bolt-nut fasteners are critical components of substation equipment, and their integrity directly affects the operational reliability of power systems. In practical inspection scenarios, however, the small physical scale of bolt-nut fasteners, together with complex background structures, often obscures their discriminative visual features, making accurate automated detection particularly challenging. Reliable detection is a prerequisite for downstream tasks such as loosening identification and defect diagnosis. To address these challenges, this paper proposes YOLOv8n-ALC, an enhanced detection network built upon the lightweight YOLOv8n framework. The backbone is redesigned by integrating the AdditiveBlock from CAS-ViT and a Convolutional Gated Linear Unit (CGLU) to strengthen fine-grained feature extraction and suppress background interference without increasing computational burden. In addition, an improved Large Separable Kernel Attention (LSKA) module is introduced to expand the effective receptive field while maintaining efficiency, enabling more robust multi-scale feature representation. To further alleviate feature degradation of small bolt-nut fasteners in deep layers, a Context-Guided Reconstruction Feature Pyramid Network (CGRFPN) is employed in the neck to optimize cross-layer feature fusion and enhance localization accuracy. Experimental results demonstrate that YOLOv8n-ALC achieves an mAP@0.5 of 92.1%, with precision and recall of 93.5% and 87.1%, respectively, outperforming the baseline by clear margins. These results confirm the effectiveness and robustness of the proposed method for intelligent substation inspection and bolt-nut fastener condition monitoring. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
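Large Separable Kernel Attention builds on the fact that a rank-1 k x k kernel factors exactly into a (k, 1) pass followed by a (1, k) pass, cutting weights from k^2 to 2k. The sketch below verifies that factorization numerically; it is a generic illustration, not the LSKA module itself:

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2-D correlation for a single-channel map."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
    return out

rng = np.random.default_rng(3)
u, v = rng.normal(size=7), rng.normal(size=7)
x = rng.normal(size=(12, 12))
full = conv2d_valid(x, np.outer(u, v))                       # one 7x7 pass
sep = conv2d_valid(conv2d_valid(x, u[:, None]), v[None, :])  # (7,1) then (1,7)
print(np.allclose(full, sep))   # True
```

Learned kernels are not exactly rank-1, but separable designs accept that approximation in exchange for a much larger affordable receptive field.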

19 pages, 33281 KB  
Article
FLF-RCNN: A Fine-Tuned Lightweight Faster RCNN for Precise and Efficient Industrial Quality Inspection
by Ningli An, Zhichao Yang, Liangliang Wan, Jianan Li and Yiming Wang
Sensors 2026, 26(6), 1768; https://doi.org/10.3390/s26061768 - 11 Mar 2026
Abstract
Industrial Quality Inspection (IQI) is a pivotal part of intelligent manufacturing, critical to ensuring product quality. Deep learning-based methods have attracted growing attention for their excellent feature extraction ability, outperforming traditional detection approaches. However, existing methods still face issues of insufficient efficiency and poor transferability. To address these issues, this paper proposes a Fine-tuned Lightweight Faster RCNN (FLF-RCNN) framework designed to address key challenges in IQI, including the trade-off between accuracy and computational efficiency, and the insufficient adaptability of preset anchor box ratios. FLF-RCNN introduces a lightweight backbone network, LSNet, which enhances the receptive field through architectural optimization. Specifically, it uses a collaborative mechanism that combines large kernel convolutions for extracting contextual information and small kernel convolutions for capturing fine-grained details. This mechanism enables the model to efficiently and precisely represent defects. To enhance generalization in data-scarce industrial scenarios, the framework leverages transfer learning with pretrained weights. Furthermore, an Adaptive Anchor Box-Adjustment Module (AAB-AM) based on K-means clustering is introduced to improve detection across varied defect scales. Extensive experiments conducted on the Tianchi dataset show that FLF-RCNN achieves a mAP50 of 43.6%, outperforming detectors using MobileNet and EfficientNet backbones and surpassing the baseline Faster R-CNN by 7.9% in mAP50. Meanwhile, the proposed method reduces computational complexity by approximately 40%, reaching 98.65 GFLOPs, and decreases parameter count by around 30% to 28.2M. These results demonstrate that FLF-RCNN offers a feasible and practical solution for IQI, achieving a superior accuracy-efficiency balance within the two-stage detection paradigm. Full article
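K-means anchor clustering with a 1 - IoU distance, as used by the AAB-AM, is a well-established recipe (it goes back to YOLOv2); a minimal sketch, with a naive first-k initialization in place of anything smarter:

```python
import numpy as np

def iou_wh(wh, centroids):
    """IoU between (w, h) pairs treated as co-centred boxes."""
    inter = np.minimum(wh[:, None, :], centroids[None, :, :]).prod(axis=2)
    union = wh.prod(axis=1)[:, None] + centroids.prod(axis=1)[None, :] - inter
    return inter / union

def kmeans_anchors(wh, k, iters=50):
    """Cluster box shapes with a 1 - IoU distance to replace hand-set ratios."""
    cent = wh[:k].astype(float).copy()        # naive init: first k boxes
    for _ in range(iters):
        assign = iou_wh(wh, cent).argmax(axis=1)   # nearest = highest IoU
        new = np.array([wh[assign == i].mean(axis=0) if np.any(assign == i)
                        else cent[i] for i in range(k)])
        if np.allclose(new, cent):
            break
        cent = new
    return cent[np.argsort(cent.prod(axis=1))]     # sort anchors by area

wh = np.array([[10, 10], [12, 9], [9, 11],
               [100, 95], [105, 100], [98, 102]], dtype=float)
print(kmeans_anchors(wh, 2))   # one small anchor near (10, 10), one near (101, 99)
```

The IoU distance is what matters: plain Euclidean distance would let large boxes dominate the objective, while 1 - IoU is scale-relative.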

22 pages, 11365 KB  
Article
Addressing Dense Small-Object Detection in Remote Sensing: An Open-Vocabulary Object Detection Framework
by Menghan Ju, Yingchao Feng, Wenhui Diao and Chunbo Liu
Remote Sens. 2026, 18(6), 851; https://doi.org/10.3390/rs18060851 - 10 Mar 2026
Abstract
Remote sensing open-vocabulary object detection focuses on identifying and localizing unseen categories within remote sensing imagery. However, constrained by characteristics such as dense target distribution, complex background interference, and drastic scale variations inherent to remote sensing scenarios, existing methods are prone to background noise interference when extracting features from dense, small target regions. This leads to weakened semantic representation and reduced localization accuracy. Therefore, we propose RS-DINO to address these challenges. Specifically: Firstly, to address the issue of small features being obscured by the background, the feature extraction module incorporates a multi-scale large-kernel attention mechanism. This expands the receptive field while enhancing local detail modelling, significantly improving the feature representation of minute targets. Secondly, a cross-modal feature fusion module employing bidirectional cross-attention achieves deep alignment between image and textual features. Subsequently, a language-guided query selection mechanism enhances detection accuracy through hybrid query strategies. Finally, to enhance the spatial sensitivity and channel adaptability of fusion features, the multimodal decoder integrates a convolutional gated feedforward network, significantly boosting the model’s robustness in dense, multi-scale scenes. Experiments on DIOR, DOTA v2.0, and NWPU-VHR10 demonstrate substantial gains, with fine-tuned RS-DINO surpassing existing methods by 3.5%, 3.7%, and 4.0% in accuracy, respectively. Full article

23 pages, 14232 KB  
Article
A Dual-Branch Perception Network for High-Precision Oriented Object Detection in Remote Sensing
by Qi Wang and Wei Sun
Remote Sens. 2026, 18(5), 839; https://doi.org/10.3390/rs18050839 - 9 Mar 2026
Abstract
With the rapid evolution of remote sensing earth observation technology, high-resolution object detection is crucial in military and civilian domains but faces challenges from expansive views and complex backgrounds. Small objects are particularly challenging due to their low pixel coverage, poor textures, and susceptibility to drastic illumination changes and background clutter. To address these problems, this paper proposes MDCA-YOLO for oriented object detection. A Dual-Branch Perception Module (DBPM) is designed utilizing a synergistic mechanism of large-kernel and strip convolutions to establish long-range dependencies, accurately capturing geometric features of tiny objects even in the absence of local details; Multi-Adaptive Selection Fusion (MASF) is proposed to address cross-scale feature loss by adaptively enhancing feature response while suppressing background noise; furthermore, a reconstructed decoupled detection head, CoordAttOBB, significantly improves angle regression accuracy while reducing complexity. Experimental results on the DIOR-R dataset show MDCA-YOLO surpasses YOLO11s, improving mAP50 and mAP50:95 by 2.5% and 2.7%, respectively, effectively proving the algorithm’s superiority in remote sensing tasks. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 3rd Edition)

17 pages, 4021 KB  
Article
Dangerous Goods Detection in X-Ray Security Inspection Images Based on Improved YOLOv8-seg
by Ting Wang, Pengfei Yuan and Aili Wang
Electronics 2026, 15(5), 1112; https://doi.org/10.3390/electronics15051112 - 7 Mar 2026
Abstract
In X-ray security inspection imagery, hazardous object detection is challenged by severe object overlap/occlusion, ambiguous boundaries of small objects, and complex texture representations caused by material diversity. Although YOLOv8-seg provides real-time instance segmentation capability, it still has clear limitations in this application scenario. Specifically, the original SPPF module has limited ability to model long-range spatial dependencies, making it difficult to accurately separate boundaries of densely overlapped objects, while the C2f module is insufficient for multi-scale feature parsing of hazardous items with diverse sizes and materials and introduces feature redundancy, which degrades segmentation accuracy in occluded scenes. To address these issues, this paper proposes an improved YOLOv8-seg framework for X-ray hazardous object detection, termed LM-YOLOv8. For feature enhancement, an SPPF-LSKA module is constructed by integrating large-kernel separable attention with dynamic receptive-field adjustment, thereby improving global contextual modeling and alleviating boundary ambiguity. For multi-scale feature fusion, a C2f-MSC module is designed by combining multi-branch dilated convolutions with the C2f structure to enhance complex contour parsing and cross-scale feature interaction. Experiments on the PIDray dataset show that the proposed method achieves 84.8% mAP50 in instance segmentation, representing an improvement of approximately 4.0 percentage points over the baseline YOLOv8-seg. In addition, the method demonstrates stronger robustness on challenging hard/hidden subsets, validating its effectiveness for X-ray security inspection hazardous object detection. Full article
(This article belongs to the Special Issue Image Processing, Target Tracking and Recognition System Design)

30 pages, 3424 KB  
Article
Fault Diagnosis of Rolling Bearings Based on an Ascending-Dimension Convolutional Neural Network
by Xu Bai, Xin Zhong, Yaofeng Liu, Ke Zhang, Weiying Meng, Junzhou Li and Xiaochen Zhang
Machines 2026, 14(3), 302; https://doi.org/10.3390/machines14030302 - 6 Mar 2026
Abstract
Rolling bearings are critical and vulnerable components in mechanical equipment and are prone to various types of damage during operation. Consequently, rolling bearing fault diagnosis is of significant engineering importance. In recent years, deep learning-based approaches have achieved considerable progress in intelligent bearing fault diagnosis. However, existing models still suffer from several limitations, including insufficient feature extraction under noisy conditions, limited diagnostic accuracy, high computational cost, and low operational efficiency. To address these challenges, an intelligent rolling bearing fault diagnosis method based on an ascending-dimensional convolutional neural network (ADCNN) is proposed. Compared with conventional neural networks, the proposed ADCNN features a more compact model size, improved noise robustness, and higher diagnostic accuracy. A large convolutional kernel is introduced in the first layer to enhance noise immunity, while an ascending-dimensional module is employed to reduce the number of network parameters and improve feature extraction capability. In addition, a reduced linear transformation layer (RLTL) is incorporated to further achieve a lightweight architecture. Experimental results on the Case Western Reserve University (CWRU) dataset and a self-designed test dataset demonstrate that the proposed ADCNN achieves superior fault diagnosis performance under different noise environments while maintaining computational efficiency and model compactness. Full article
(This article belongs to the Section Machines Testing and Maintenance)
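The noise-immunity role of the wide first-layer kernel can be illustrated with a fixed averaging kernel standing in for the learned one: on a synthetic tone plus Gaussian noise, a 15-tap low-pass stage removes far more noise energy than signal energy. This is an illustration of the principle, not the ADCNN layer:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.arange(2000)
clean = np.sin(2 * np.pi * t / 100.0)            # stand-in vibration tone
noisy = clean + rng.normal(0.0, 0.5, t.size)     # heavy additive noise

wide = np.ones(15) / 15.0                        # wide averaging "first layer"
filtered = np.convolve(noisy, wide, mode="same")

# Compare reconstruction error before and after (trim boundary effects).
mse_raw = np.mean((noisy[50:-50] - clean[50:-50]) ** 2)
mse_filt = np.mean((filtered[50:-50] - clean[50:-50]) ** 2)
print(mse_raw, mse_filt)   # filtered error is roughly an order of magnitude lower
```

A learned wide kernel behaves like a tuned band-pass rather than a simple average, but the mechanism is the same: wide support averages out broadband noise before the deeper layers see it.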

27 pages, 7867 KB  
Article
A Multi-Scale Object Detection Network with Integrated Spatial-Channel Collaborative Attention for Remote Sensing Images
by Lijun Ma, Chengjun Xu, Kun Jiao, Wenming Pei, Hongfei Zhang, Lanfeng Liu, Bin Deng and Juan Wu
Sensors 2026, 26(4), 1370; https://doi.org/10.3390/s26041370 - 21 Feb 2026
Abstract
In remote sensing object detection, current models typically employ feature extraction modules and attention mechanisms to tackle issues such as significant scale variations among targets, cluttered backgrounds, and the subtle characteristics of small objects. Nevertheless, existing feature extraction approaches often depend on convolution kernels with fixed sizes, which can blur the contours of large objects and provide inadequate feature representation for small objects. Moreover, many attention mechanisms simply combine spatial and channel attention, without fully considering the deep integration between spatial and channel features, consequently leading to high-dimensional features and considerable computational overhead. To overcome these shortcomings, this paper introduces a multi-scale object detection network with integrated spatial-channel collaborative attention for remote sensing images. This approach enhances feature perception and representation for multi-scale targets, particularly small targets, through the design of the cross-channel multi-scale feature extraction module (CC-MSFE). Furthermore, a new channel-spatial cross-attention mechanism (CSCA) is introduced, comprising the channel attention mechanism (CA), the spatial attention mechanism (SA), and the cross-attention fusion module (CAFM). This design fosters dynamic interaction and joint optimization across channel and spatial dimensions, thereby improving detection accuracy while effectively reducing computational cost. The efficacy of the proposed model is evaluated on three publicly available remote sensing datasets. Experimental results show that the model achieves a mAP of 78.1% on the DIOR dataset and 90.6% on the HRRSD dataset, outperforming YOLOv11 by 0.7% and 1.4%, respectively. On the RSOD dataset, it attains a mAP of 96.5%, surpassing YOLOv8 by 2.1%. In addition, the proposed method maintains a notably lower parameter count and computational complexity compared to existing approaches, achieving an effective balance between detection accuracy and computational efficiency. Full article
(This article belongs to the Section Remote Sensors)

31 pages, 7853 KB  
Article
Condition-Adaptive CNN with Spatiotemporal Fusion for Enhanced Motor Fault Diagnosis
by Jin Lv, Lixin Wei and Yu Feng
Sensors 2026, 26(4), 1314; https://doi.org/10.3390/s26041314 - 18 Feb 2026
Abstract
Electric motors are widely used in industrial production systems, and various fault modes may occur during long-term operation under complex and noisy conditions. Accurate fault diagnosis remains challenging, especially when signal characteristics vary depending on the operating state. To address this issue, this paper presents a fault diagnosis framework based on a convolutional neural network (CNN), which features adaptive parameter optimization and enhanced feature representation. The method integrates the bee colony algorithm (BCA) into CNN training, adaptively adjusts the model parameters based on signal conditions, and shortens the convergence time compared to traditional gradient-based optimization. To improve the extraction of high-frequency and transient fault features, a spatiotemporal fusion architecture is designed, which combines large-kernel convolution, a bottleneck layer, and an improved self-attention (ISA) mechanism. In addition, an engineering-oriented data augmentation strategy based on multi-scale window offset and noise superposition is applied to one-dimensional vibration signals to improve the robustness of the model. The proposed CNN-BCA-ISA framework is evaluated on a mixed dataset consisting of on-site data collected from a steel plant and a public dataset from Case Western Reserve University (CWRU). The experimental results show a diagnostic accuracy of 96.4%, with stable performance under different noise levels, indicating good generalization ability across operating conditions. In addition, a real-time fault diagnosis system based on the proposed framework has been implemented and validated in industrial environments, confirming its feasibility in practical condition monitoring applications. Full article
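The abstract does not give layer configurations, so the NumPy sketch below only illustrates the two named signal-level ideas with hypothetical parameters: a large-kernel 1-D convolution over a vibration signal (a wide receptive field for low-frequency fault signatures, here a simple averaging kernel standing in for learned weights), and augmentation by window offset plus noise superposition.

```python
import numpy as np

def large_kernel_conv1d(signal, kernel_size=64):
    # wide 1-D kernel = large receptive field; an averaging kernel
    # stands in for learned convolution weights
    kernel = np.ones(kernel_size) / kernel_size
    return np.convolve(signal, kernel, mode="valid")

def augment(signal, window=1024, offsets=(0, 256, 512),
            noise_std=0.05, seed=0):
    # window offset + noise superposition: shifted crops of the same
    # recording, each with Gaussian noise added (parameters illustrative)
    rng = np.random.default_rng(seed)
    crops = [signal[off:off + window]
             + rng.normal(0.0, noise_std, size=window)
             for off in offsets]
    return np.stack(crops)

t = np.linspace(0.0, 1.0, 4096)
vib = np.sin(2 * np.pi * 50 * t)      # toy 50 Hz vibration signal
batch = augment(vib)                  # (3, 1024) augmented windows
feats = large_kernel_conv1d(batch[0]) # (961,) smoothed response
```

With `mode="valid"`, the output length is `window - kernel_size + 1`; a real network would pad to preserve length and learn the kernel jointly with the rest of the model.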
(This article belongs to the Section Fault Diagnosis & Sensors)

39 pages, 9763 KB  
Article
SAR-DRBNet: Adaptive Feature Weaving and Algebraically Equivalent Aggregation for High-Precision Rotated SAR Detection
by Lanfang Lei, Sheng Chang, Zhongzhen Sun, Xinli Zheng, Changyu Liao, Wenjun Wei, Long Ma and Ping Zhong
Remote Sens. 2026, 18(4), 619; https://doi.org/10.3390/rs18040619 - 16 Feb 2026
Abstract
Synthetic aperture radar (SAR) imagery is widely used for target detection in complex backgrounds and adverse weather conditions. However, high-precision detection of rotated small targets remains challenging due to severe speckle noise, significant scale variations, and the need for robust rotation-aware representations. To address these issues, we propose SAR-DRBNet, a high-precision rotated small-target detection framework built upon YOLOv13. First, we introduce a Detail-Enhanced Oriented Bounding Box detection head (DEOBB), which leverages multi-branch enhanced convolutions to strengthen fine-grained feature extraction and improve oriented bounding box regression, thereby enhancing rotation sensitivity and localization accuracy for small targets. Second, we design a Ck-MultiDilated Reparameterization Block (CkDRB) that captures multi-scale contextual cues and suppresses speckle interference via multi-branch dilated convolutions and an efficient reparameterization strategy. Third, we propose a Dynamic Feature Weaving module (DynWeave) that integrates global–local dual attention with dynamic large-kernel convolutions to adaptively fuse features across scales and orientations, improving robustness in cluttered SAR scenes. Extensive experiments on three widely used SAR rotated object detection benchmarks (HRSID, RSDD-SAR, and DSSDD) demonstrate that SAR-DRBNet achieves a strong balance between detection accuracy and computational efficiency compared with state-of-the-art oriented bounding box detectors, while exhibiting superior cross-dataset generalization. These results indicate that SAR-DRBNet provides an effective and reliable solution for rotated small-target detection in SAR imagery. Full article
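The "algebraically equivalent aggregation" behind multi-branch reparameterization can be shown in 1-D: because convolution is linear, parallel dilated branches summed at the output equal a single convolution with one merged kernel. The sketch below is a generic demonstration of that identity, not the paper's CkDRB; kernel sizes and dilation rates are illustrative.

```python
import numpy as np

def dilate_kernel(kernel, dilation):
    # insert (dilation - 1) zeros between taps: a small kernel gains a
    # large receptive field without extra parameters
    out = np.zeros((len(kernel) - 1) * dilation + 1)
    out[::dilation] = kernel
    return out

def conv_same(x, k):
    # 1-D 'same'-padded convolution (odd-length kernels)
    pad = len(k) // 2
    return np.convolve(np.pad(x, pad), k, mode="valid")[:len(x)]

def multi_branch(x, kernels, dilations):
    # training-time form: parallel dilated branches, outputs summed
    return sum(conv_same(x, dilate_kernel(k, d))
               for k, d in zip(kernels, dilations))

def merge_branches(kernels, dilations):
    # inference-time reparameterization: center-align the dilated
    # kernels and sum them into one algebraically equivalent kernel
    dilated = [dilate_kernel(k, d) for k, d in zip(kernels, dilations)]
    size = max(len(k) for k in dilated)
    merged = np.zeros(size)
    for k in dilated:
        off = (size - len(k)) // 2
        merged[off:off + len(k)] += k
    return merged

rng = np.random.default_rng(1)
x = rng.standard_normal(128)
kernels = [rng.standard_normal(3) for _ in range(3)]
y_branches = multi_branch(x, kernels, [1, 2, 3])
y_merged = conv_same(x, merge_branches(kernels, [1, 2, 3]))
```

The payoff of this identity is that the multi-branch structure is paid for only at training time; at inference, one fused kernel gives the same outputs at lower cost.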
