MDPI - Publisher of Open Access Journals

19 pages, 6822 KB

Open Access Article

Intelligent Fault Diagnosis Based on Dual-Graph Transformation and P2D-Sk-ResNet-XGBoost

by Zhining Jia, Hongtao Yu, Lei Qiao, Guanqun Wang, You Cui, Zhimin Xu, Yang Yang and Fengjun Zhang

Processes 2025, 13(10), 3342; https://doi.org/10.3390/pr13103342 - 18 Oct 2025

Viewed by 259

Abstract

To address the limitations of one-dimensional vibration signals in convolutional neural networks and the insufficient feature extraction capability of traditional single data processing methods under complex operating conditions, this paper proposes a novel fault diagnosis method that integrates dual-graph transformation and an improved residual network. First, the one-dimensional vibration signals are converted into time–frequency representations using the short-time Fourier transform (STFT) and the synchrosqueezed wavelet transform (SWT). These two representations are then fed in parallel into a customized parallel two-dimensional residual network (P2D-Sk-ResNet), which incorporates the selective kernel network (SKNet) mechanism into a ResNet architecture. This design enables adaptive multi-scale feature extraction. Finally, the features from the fully connected layer are classified using the extreme gradient boosting (XGBoost) algorithm to complete the fault diagnosis task. Comparative experiments demonstrate that the proposed STFT-SWT-P2D-Sk-ResNet-XGBoost achieves a diagnostic accuracy of 98.51% under constant load conditions, significantly outperforming several baseline models. Furthermore, the model exhibits superior generalization capability under varying load conditions and strong robustness in noisy environments. The proposed method provides a valuable and practical reference for intelligent fault diagnosis in industrial applications.

(This article belongs to the Section Process Control and Monitoring)
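
The selective kernel (SKNet) mechanism named in this abstract, and shared by every result in this listing, can be illustrated with a minimal sketch. It is not the authors' P2D-Sk-ResNet: it shows only the generic fuse-and-select step of a selective kernel unit on plain Python lists, it omits batch normalization, and the names `sk_fuse`, `w_reduce`, and `w_select` are hypothetical stand-ins for the unit's learned fully connected layers.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sk_fuse(branches, w_reduce, w_select):
    """Fuse branch outputs with channel-wise selection weights.

    branches[b][c] is the flat list of activations for channel c of branch b
    (each branch would come from a convolution with a different kernel size).
    w_reduce is a d x C matrix; w_select holds one C x d matrix per branch.
    Both are hypothetical placeholders for learned weights.
    """
    num_branches = len(branches)
    num_channels = len(branches[0])

    # Fuse: sum the branches element-wise, then average each channel globally.
    squeezed = []
    for c in range(num_channels):
        count = len(branches[0][c])
        total = sum(sum(branches[b][c]) for b in range(num_branches))
        squeezed.append(total / count)

    # Compact descriptor: linear reduction followed by ReLU.
    z = [max(0.0, v) for v in matvec(w_reduce, squeezed)]

    # Select: per-channel softmax across branches.
    logits = [matvec(w_select[b], z) for b in range(num_branches)]
    weights = [
        softmax([logits[b][c] for b in range(num_branches)])
        for c in range(num_channels)
    ]

    # Weighted sum of the branch activations, channel by channel.
    fused = [
        [
            sum(weights[c][b] * branches[b][c][i] for b in range(num_branches))
            for i in range(len(branches[0][c]))
        ]
        for c in range(num_channels)
    ]
    return fused, weights
```

Because the softmax runs across branches, the weights for each channel sum to one, so every channel ends up as a convex mix of its small-kernel and large-kernel responses. With all-zero selection weights the unit reduces to a plain average of the branches.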

19 pages, 9284 KB

Open Access Article

UAV-YOLO12: A Multi-Scale Road Segmentation Model for UAV Remote Sensing Imagery

by Bingyan Cui, Zhen Liu and Qifeng Yang

Drones 2025, 9(8), 533; https://doi.org/10.3390/drones9080533 - 29 Jul 2025

Viewed by 1801

Abstract

Unmanned aerial vehicles (UAVs) are increasingly used for road infrastructure inspection and monitoring. However, challenges such as scale variation, complex background interference, and the scarcity of annotated UAV datasets limit the performance of traditional segmentation models. To address these challenges, this study proposes UAV-YOLOv12, a multi-scale segmentation model specifically designed for UAV-based road imagery analysis. The proposed model builds on the YOLOv12 architecture by adding two key modules: a Selective Kernel Network (SKNet) that adjusts receptive fields dynamically, and a Partial Convolution (PConv) module that improves spatial focus and robustness in occluded regions. These enhancements help the model better detect small and irregular road features in complex aerial scenes. Experimental results on a custom UAV dataset collected from national highways in Wuxi, China, show that UAV-YOLOv12 achieves F1-scores of 0.902 for highways (road-H) and 0.825 for paths (road-P), outperforming the original YOLOv12 by 5% and 3.2%, respectively. Inference speed is maintained at 11.1 ms per image, supporting near real-time performance. Moreover, comparative evaluations show that UAV-YOLOv12 improves on U-Net by 7.1% and 9.5%. The model also exhibits strong generalization ability, achieving F1-scores above 0.87 on public datasets such as VHR-10 and the Drone Vehicle dataset. These results demonstrate that the proposed UAV-YOLOv12 can achieve high accuracy and robustness in diverse road environments and object scales.

(This article belongs to the Special Issue Advances in Civil Applications of Unmanned Aircraft Systems: 2nd Edition)

16 pages, 3143 KB

Open Access Article

DGA Domain Detection Based on Transformer and Rapid Selective Kernel Network

by Jisheng Tang, Yiling Guan, Shenghui Zhao, Huibin Wang and Yinong Chen

Electronics 2024, 13(24), 4982; https://doi.org/10.3390/electronics13244982 - 18 Dec 2024

Viewed by 1495

Abstract

Botnets pose a significant challenge in network security by leveraging Domain Generation Algorithms (DGA) to evade traditional security measures. Extracting DGA domain samples is inherently complex, and current DGA detection models often struggle to capture domain features effectively when training data are limited. This limitation results in suboptimal detection performance and an imbalance between model accuracy and complexity. To address these challenges, this paper introduces a novel multi-scale feature fusion model that integrates the Transformer architecture with the Rapid Selective Kernel Network (R-SKNet). The proposed model employs the Transformer’s encoder to couple the single-domain character elements with the multiple types of relationships within the global domain block. This paper proposes integrating R-SKNet into DGA detection and developing an efficient channel attention (ECA) module. By enhancing the branch information guidance in the SKNet architecture, the approach achieves adaptive receptive field selection, multi-scale feature capture, and lightweight yet efficient multi-scale convolution. Moreover, an improved Feature Pyramid Network (FPN) architecture, termed EFAM, is used to adjust channel weights for outputs at different stages of the backbone network, thereby achieving multi-scale feature fusion. Experimental results demonstrate that, in tasks with limited training samples, the proposed method achieves lower computational complexity and higher detection accuracy than mainstream detection models.

(This article belongs to the Special Issue Advances in Intelligent Data Analysis and Its Applications, 2nd Edition)

18 pages, 4533 KB

Open Access Article

A Bearing Fault Diagnosis Method in Scenarios of Imbalanced Samples and Insufficient Labeled Samples

by Xiaohan Cheng, Yuxin Lu, Zhihao Liang, Lei Zhao, Yuandong Gong and Meng Wang

Appl. Sci. 2024, 14(19), 8582; https://doi.org/10.3390/app14198582 - 24 Sep 2024

Viewed by 1580

Abstract

In practical working environments, rolling bearings are among the components most prone to failure. Their vibration signal samples pose two main challenges: an imbalance between normal and fault samples and an insufficient number of labeled samples. This study proposes a sample-expansion method based on generative adversarial networks (GANs) and a fault diagnosis method based on a transformer to address these issues. First, selective kernel networks (SKNets) and a genetic algorithm (GA) were introduced to construct a conditional variational autoencoder–evolutionary generative adversarial network with a selective kernel (CVAE-SKEGAN) that balances the proportion of normal and faulty samples. Then, a semi-supervised learning–variational convolutional Swin transformer (SSL-VCST) network was built for fault classification, introducing variational attention and semi-supervised mechanisms to reduce the overfitting risk of the model and to compensate for the shortage of labeled samples. Three typical operating conditions were designed for multi-case applicability verification. The results show that the proposed method performed well in addressing both sample imbalance and labeled-sample deficiency and improved the accuracy of fault diagnosis in these scenarios.

19 pages, 2503 KB

Open Access Article

ResSKNet-SSDP: Effective and Light End-To-End Architecture for Speaker Recognition

by Fei Deng, Lihong Deng, Peifan Jiang, Gexiang Zhang and Qiang Yang

Sensors 2023, 23(3), 1203; https://doi.org/10.3390/s23031203 - 20 Jan 2023

Cited by 12 | Viewed by 2943

Abstract

In speaker recognition tasks, convolutional neural network (CNN)-based approaches have shown significant success. Modeling the long-term contexts and efficiently aggregating the information are two challenges in speaker recognition, and they have a critical impact on system performance. Previous research has addressed these issues by introducing deeper, wider, and more complex network architectures and aggregation methods. However, it is difficult to significantly improve the performance with these approaches because they also have trouble fully utilizing global information, channel information, and time-frequency information. To address the above issues, we propose a lighter and more efficient CNN-based end-to-end speaker recognition architecture, ResSKNet-SSDP. ResSKNet-SSDP consists of a residual selective kernel network (ResSKNet) and self-attentive standard deviation pooling (SSDP). ResSKNet can capture long-term contexts, neighboring information, and global information, thus extracting more informative frame-level features. SSDP can capture short- and long-term changes in frame-level features, aggregating the variable-length frame-level features into fixed-length, more distinctive utterance-level features. Extensive comparison experiments against current state-of-the-art speaker recognition systems were performed on two popular public speaker recognition datasets, VoxCeleb and CN-Celeb, and achieved the lowest EER/DCF of 2.33%/0.2298, 2.44%/0.2559, 4.10%/0.3502, and 12.28%/0.5051. Compared with the lightest x-vector, our designed ResSKNet-SSDP has 3.1 M fewer parameters and 31.6 ms less inference time, but 35.1% better performance. The results show that ResSKNet-SSDP significantly outperforms the current state-of-the-art speaker recognition architectures on all test sets and is an end-to-end architecture with fewer parameters and higher efficiency for applications in realistic situations. The ablation experiments further show that our proposed approaches also provide significant improvements over previous methods.

(This article belongs to the Special Issue Acoustic Sensors and Their Applications)
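
The pooling step described in this abstract, which turns a variable number of frame-level features into one fixed-length utterance-level vector, can be sketched as an attention-weighted standard deviation. This is a generic illustration and not the authors' exact SSDP layer: the attention `scores` are passed in as a hypothetical input (in a real model a small learned network would produce them), and returning the weighted mean next to the standard deviation is an assumption made for clarity.

```python
import math


def attentive_std_pooling(frames, scores):
    """Pool T frame-level vectors into fixed-length statistics.

    frames: list of T feature vectors, each of length D.
    scores: list of T unnormalized attention scores (hypothetical input).
    Returns (mean, std), two lists of length D regardless of T.
    """
    # Normalize the scores into attention weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    dim = len(frames[0])
    # Attention-weighted mean of each feature dimension.
    mean = [sum(a * f[d] for a, f in zip(alphas, frames)) for d in range(dim)]
    # Attention-weighted variance around that mean, then its square root.
    var = [
        sum(a * (f[d] - mean[d]) ** 2 for a, f in zip(alphas, frames))
        for d in range(dim)
    ]
    std = [math.sqrt(v) for v in var]
    return mean, std
```

The output length depends only on the feature dimension D, so utterances of different durations map to vectors of the same size. Frames with higher scores contribute more to both statistics.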

20 pages, 4617 KB

Open Access Article

Mathematical Formula Image Screening Based on Feature Correlation Enhancement

by Hongyuan Liu, Fang Yang, Xue Wang and Jianhui Si

Electronics 2022, 11(5), 799; https://doi.org/10.3390/electronics11050799 - 3 Mar 2022

Cited by 2 | Viewed by 3925

Abstract

Scientific and technical documents and web pages contain mathematical formula images alongside other images, and mathematical formula images either contain only mathematical formulas or contain formulas interspersed with other elements, such as text and coordinate diagrams. To screen and collect images containing mathematical formulas for study or further research, a model for screening images of mathematical formulas based on feature correlation enhancement is proposed. First, the Feature Correlation Enhancement (FCE) module was designed to improve the correlation degree of mathematical formula features and weaken other features. Then, the strip multi-scale pooling (SMP) module was designed to solve the problem of non-uniform image size, while enhancing the focus on horizontal formula features. Finally, the loss function was improved to balance the dataset. The model achieved an accuracy of 89.50% in the experiments, outperforming the existing model. Using the model, users can pick out images containing mathematical formulas, which helps to speed up the creation of a database of mathematical formula images.

(This article belongs to the Collection Computer Vision and Pattern Recognition Techniques)

17 pages, 7998 KB

Open Access Article

A Two-Stream CNN Model with Adaptive Adjustment of Receptive Field Dedicated to Flame Region Detection

by Peng Lu, Yaqin Zhao and Yuan Xu

Symmetry 2021, 13(3), 397; https://doi.org/10.3390/sym13030397 - 28 Feb 2021

Cited by 12 | Viewed by 3561

Abstract

Convolutional neural networks (CNNs) have yielded state-of-the-art performance in image segmentation. Their application in video surveillance systems can provide very useful information for extinguishing fires in time. Current studies have mostly focused on CNN-based flame image classification and have achieved good accuracy. However, research on CNN-based flame region detection is extremely scarce due to the bulky network structures and high hardware requirements of state-of-the-art CNN models. Therefore, this paper presents a two-stream convolutional neural network for flame region detection (TSCNNFlame). TSCNNFlame is a lightweight CNN architecture that includes a spatial stream and a temporal stream for detecting flame pixels in video sequences captured by fixed cameras. The static features from the spatial stream and the dynamic features from the temporal stream are fused by three convolutional layers to reduce false positives. We replace the convolutional layer of the CNN with the selective kernel (SK)-Shuffle block, constructed by integrating the SK convolution into the deep convolutional layer of ShuffleNet V2. The SKNet blocks can adaptively adjust the size of the receptive field according to the proportion of the region of interest (ROI) within it. The grouped convolution used in ShuffleNet solves the problem in which the multi-branch structure of SKNet causes the network parameters to multiply with the number of branches. Therefore, the CNN dedicated to flame region detection balances efficiency and accuracy through its lightweight architecture, temporal–spatial feature fusion, and the advantages of the SK-Shuffle block. The experimental results, evaluated with multiple metrics and analyzed from several angles, show that this method achieves good performance while reducing the running time.

(This article belongs to the Section Computer)
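
The parameter argument in this abstract, that multi-branch SK convolution multiplies the weight count and grouped convolution offsets it, can be checked with simple arithmetic. The sketch below counts convolution weights only (no bias, no batch normalization). The channel count of 64, the kernel sizes 3 and 5, and the 32 groups are illustrative assumptions, not values taken from the paper.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Number of weights in a 2D convolution without bias.

    Each output channel sees only c_in / groups input channels,
    so grouping divides the weight count by the number of groups.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel * kernel


def multi_branch_params(channels, kernels, groups=1):
    """Total weights of parallel branches, one convolution per kernel size."""
    return sum(conv_params(channels, channels, k, groups) for k in kernels)


single = conv_params(64, 64, 3)                            # 36864
two_branch = multi_branch_params(64, [3, 5])               # 139264
two_branch_grouped = multi_branch_params(64, [3, 5], 32)   # 4352
print(single, two_branch, two_branch_grouped)
```

Under these assumed sizes, adding a second branch nearly quadruples the weights of a single 3 x 3 convolution, while grouping both branches brings the total well below the single ungrouped layer.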

Search Results (7)
