MDPI - Publisher of Open Access Journals

31 pages, 5221 KB

Open Access Article

Dynamic–Attentive Pooling Networks: A Hybrid Lightweight Deep Model for Lung Cancer Classification

by Williams Ayivi, Xiaoling Zhang, Wisdom Xornam Ativi, Francis Sam and Franck A. P. Kouassi

J. Imaging 2025, 11(8), 283; https://doi.org/10.3390/jimaging11080283 - 21 Aug 2025

Viewed by 412

Lung cancer is one of the leading causes of cancer-related mortality worldwide. Diagnosis remains challenging because early-stage symptoms and imaging findings are subtle and ambiguous. Deep learning approaches, specifically Convolutional Neural Networks (CNNs), have significantly advanced medical image analysis. However, conventional architectures such as ResNet50 that rely on first-order pooling often fall short. This study aims to overcome the limitations of CNNs in lung cancer classification by proposing a novel and dynamic model named LungSE-SOP. The model integrates Second-Order Pooling (SOP) and Squeeze-and-Excitation Networks (SENet) within a ResNet50 backbone to improve feature representation and class separation. A novel Dynamic Feature Enhancement (DFE) module is also introduced, which dynamically adjusts the flow of information through SOP and SENet blocks based on learned importance scores. The model was trained on the publicly available IQ-OTH/NCCD lung cancer dataset. Performance was assessed using accuracy, precision, recall, F1-score, ROC curves, and confidence intervals. For multiclass tumor classification, the model achieved 98.6% accuracy for benign, 98.7% for malignant, and 99.9% for normal cases. Corresponding F1-scores were 99.2%, 99.8%, and 99.9%, respectively, reflecting the model’s high precision and recall across all classes and its strong potential for clinical deployment.
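To make the contrast between first-order and second-order pooling concrete, the following is a minimal NumPy sketch, not the LungSE-SOP implementation. It assumes a single C x H x W feature map and uses plain channel covariance as the second-order statistic; the function names and the random input are hypothetical.

```python
import numpy as np

def first_order_pool(features: np.ndarray) -> np.ndarray:
    """Global average pooling: one mean per channel (shape C)."""
    return features.mean(axis=(1, 2))

def second_order_pool(features: np.ndarray) -> np.ndarray:
    """Covariance pooling: pairwise channel statistics (shape C x C)."""
    c = features.shape[0]
    x = features.reshape(c, -1)               # C x N, with N = H * W positions
    x = x - x.mean(axis=1, keepdims=True)     # centre each channel
    return x @ x.T / x.shape[1]               # channel covariance matrix

# Hypothetical feature map: 8 channels on a 4 x 4 spatial grid.
rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 4, 4))
mean_vec = first_order_pool(fmap)
cov = second_order_pool(fmap)
```

The covariance matrix keeps correlations between channels that the per-channel mean discards, which is the extra information second-order pooling is meant to expose.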

(This article belongs to the Section Medical Imaging)

22 pages, 6051 KB

Open Access Article

Research on GNSS Spoofing Detection and Autonomous Positioning Technology for Drones

by Jiawen Zhou, Mei Hu, Chao Zhou, Zongmin Liu and Chao Ma

Electronics 2025, 14(15), 3147; https://doi.org/10.3390/electronics14153147 - 7 Aug 2025

Viewed by 653

Abstract

With the rapid development of the low-altitude economy, the application of drones in both military and civilian fields has become increasingly widespread. The safety and accuracy of their positioning and navigation have become critical factors in ensuring the successful execution of missions. Currently, GNSS spoofing attack techniques are becoming increasingly sophisticated, posing a serious threat to the reliability of drone positioning. This paper proposes a GNSS spoofing detection and autonomous positioning method for drones operating in mission mode, which is based on visual sensors and does not rely on additional hardware devices. First, during the spoofing detection phase, a ResNet50-SE twin network extracts features from real-time aerial images captured by the drone’s camera and matches them against features of satellite images retrieved for the GNSS-reported position, thereby identifying positioning anomalies. Second, once spoofing is detected, during the positioning recovery phase, the system uses the SuperGlue network to match real-time aerial images with satellite image features within a specific area, enabling absolute positioning of the drone. Finally, experimental validation using open-source datasets demonstrates that the method achieves a GNSS spoofing detection accuracy of 89.5%, with 89.7% of absolute positioning errors falling within 13.9 m. This study provides a comprehensive solution for the safe operation and stable mission execution of drones in complex electromagnetic environments.
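The detection step described above compares an embedding of the live aerial image with an embedding of the satellite image at the GNSS-reported position. A minimal sketch of that decision rule follows, assuming cosine similarity as the matching score and a hand-picked threshold of 0.8; both choices, and the toy embedding vectors, are assumptions for illustration rather than details from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_spoofed(aerial_emb, satellite_emb, threshold=0.8):
    """Flag a positioning anomaly when the two views disagree.

    threshold is a hypothetical value; in practice it would be tuned
    on validation data.
    """
    return cosine_similarity(aerial_emb, satellite_emb) < threshold

# Toy embeddings standing in for twin-network outputs.
aerial = [0.9, 0.1, 0.4]
satellite_same_place = [0.8, 0.2, 0.5]
satellite_other_place = [-0.1, 0.9, -0.3]
```

With these toy vectors, the first satellite embedding is accepted as a match and the second is flagged as a mismatch.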

24 pages, 3087 KB

Open Access Article

Photoplethysmogram (PPG)-Based Biometric Identification Using 2D Signal Transformation and Multi-Scale Feature Fusion

by Yuanyuan Xu, Zhi Wang and Xiaochang Liu

Sensors 2025, 25(15), 4849; https://doi.org/10.3390/s25154849 - 7 Aug 2025

Viewed by 462

Abstract

Using Photoplethysmogram (PPG) signals for identity recognition has been proven effective in biometric authentication. However, in real-world applications, PPG signals are prone to interference from noise, physical activity, diseases, and other factors, making it challenging to ensure accurate user recognition and verification in complex environments. To address these issues, this paper proposes an improved MSF-SE ResNet50 (Multi-Scale Feature Squeeze-and-Excitation ResNet50) model based on 2D PPG signals. Unlike most existing methods that directly process one-dimensional PPG signals, this paper adopts a novel approach based on two-dimensional PPG signal processing. By applying Continuous Wavelet Transform (CWT), the preprocessed one-dimensional PPG signal is transformed into a two-dimensional time-frequency map, which not only preserves the time-frequency characteristics of the signal but also provides richer spatial information. During the feature extraction process, the SENet module is first introduced to enhance the ability to extract distinctive features. Next, a novel Lightweight Multi-Scale Feature Fusion (LMSFF) module is proposed, which addresses the limitation of single-scale feature extraction in existing methods by employing parallel multi-scale convolutional operations. Finally, cross-stage feature fusion is implemented, overcoming the limitations of traditional feature fusion methods. These techniques work synergistically to improve the model’s performance. On the BIDMC dataset, the MSF-SE ResNet50 model achieved accuracy, precision, recall, and F1 scores of 98.41%, 98.19%, 98.27%, and 98.23%, respectively. Compared to existing state-of-the-art methods, the proposed model demonstrates significant improvements across all evaluation metrics.
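The 1D-to-2D transformation can be illustrated with a small Continuous Wavelet Transform written directly in NumPy. This is a sketch under stated assumptions: a complex Morlet wavelet with centre parameter w0 = 6, amplitude (1/scale) normalisation, and a synthetic 2 Hz sinusoid standing in for a PPG segment. None of these choices come from the paper.

```python
import numpy as np

def morlet(t, scale, w0=6.0):
    """Complex Morlet wavelet sampled at times t for one scale."""
    x = t / scale
    return np.exp(1j * w0 * x - 0.5 * x**2) / scale

def cwt_scalogram(signal, freqs, fs, w0=6.0):
    """Return a (len(freqs) x len(signal)) time-frequency magnitude map."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / fs           # time axis centred near zero
    rows = []
    for f in freqs:
        scale = w0 / (2 * np.pi * f)           # scale whose centre frequency is f
        response = np.convolve(signal, morlet(t, scale, w0), mode="same")
        rows.append(np.abs(response))
    return np.array(rows)

# Synthetic stand-in for a PPG segment: a 2 Hz sinusoid sampled at 50 Hz.
fs = 50.0
t_axis = np.arange(200) / fs
ppg_like = np.sin(2 * np.pi * 2.0 * t_axis)
freqs = np.arange(1.0, 4.5, 0.5)
scalogram = cwt_scalogram(ppg_like, freqs, fs)
```

Each row of `scalogram` is the signal's response at one analysis frequency, so the array can be treated as an image and passed to a 2D CNN.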

(This article belongs to the Section Biomedical Sensors)

25 pages, 5142 KB

Open Access Article

Wheat Powdery Mildew Severity Classification Based on an Improved ResNet34 Model

by Meilin Li, Yufeng Guo, Wei Guo, Hongbo Qiao, Lei Shi, Yang Liu, Guang Zheng, Hui Zhang and Qiang Wang

Agriculture 2025, 15(15), 1580; https://doi.org/10.3390/agriculture15151580 - 23 Jul 2025

Viewed by 407

Abstract

Crop disease identification is a pivotal research area in smart agriculture, forming the foundation for disease mapping and targeted prevention strategies. Among the most prevalent global wheat diseases, powdery mildew—caused by fungal infection—poses a significant threat to crop yield and quality, making early and accurate detection crucial for effective management. In this study, we present QY-SE-MResNet34, a deep learning-based classification model that builds upon ResNet34 to perform multi-class classification of wheat leaf images and assess powdery mildew severity at the single-leaf level. The proposed methodology begins with dataset construction following the GBT 17980.22-2000 national standard for powdery mildew severity grading, resulting in a curated collection of 4248 wheat leaf images at the grain-filling stage across six severity levels. To enhance model performance, we integrated transfer learning with ResNet34, leveraging pretrained weights to improve feature extraction and accelerate convergence. Further refinements included embedding a Squeeze-and-Excitation (SE) block to strengthen feature representation while maintaining computational efficiency. The model architecture was also optimized by modifying the first convolutional layer (conv1)—replacing the original 7 × 7 kernel with a 3 × 3 kernel, adjusting the stride to 1, and setting padding to 1—to better capture fine-grained leaf textures and edge features. Subsequently, the optimal training strategy was determined through hyperparameter tuning experiments, and GrabCut-based background processing along with data augmentation were introduced to enhance model robustness. In addition, interpretability techniques such as channel masking and Grad-CAM were employed to visualize the model’s decision-making process. 
Experimental validation demonstrated that QY-SE-MResNet34 achieved a classification accuracy of 89%, outperforming established models such as ResNet50, VGG16, and MobileNetV2 and surpassing the original ResNet34 by 11%. This study delivers a high-performance solution for single-leaf wheat powdery mildew severity assessment, offering practical value for intelligent disease monitoring and early warning systems in precision agriculture.
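The effect of the conv1 change on spatial resolution can be checked with the standard convolution output-size formula. The sketch below assumes a 224 x 224 input and the usual ResNet conv1 settings (7 x 7 kernel, stride 2, padding 3) for the original layer; the abstract states only the kernel size of the original, so the stride, padding, and input size are assumptions.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed original conv1: 7 x 7 kernel, stride 2, padding 3.
original = conv_output_size(224, kernel=7, stride=2, padding=3)

# Modified conv1 from the abstract: 3 x 3 kernel, stride 1, padding 1.
modified = conv_output_size(224, kernel=3, stride=1, padding=1)
```

Under these assumptions the original layer halves the resolution (224 to 112), while the modified layer preserves it (224 to 224), which is consistent with the stated goal of retaining fine-grained leaf textures.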

(This article belongs to the Special Issue How Optical Sensors and Deep Learning Enhance the Production Management in Smart Agriculture)

21 pages, 3406 KB

Open Access Article

ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification

by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu

Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025

Viewed by 728

Abstract

Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolution Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. 
Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as the parametric YOLO series models, in defect detection, achieving higher accuracy and lower miss rates and highlighting its potential for high-reliability industrial deployment.
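The two ingredients named in the abstract, triplet loss for embedding learning and a KNN classifier over the learned embeddings, can be sketched as follows. The 2D gallery points, the margin of 1.0, and k = 3 are hypothetical values chosen for illustration; in the paper the embeddings come from the ResNet-SE-CBAM Siamese network.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pulling the positive closer than the negative by a margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def knn_predict(query, embeddings, labels, k=3):
    """Majority vote among the k nearest gallery embeddings."""
    dists = np.linalg.norm(embeddings - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy gallery of embeddings; in practice these come from the trained network.
gallery = np.array([[0.0, 0.0], [0.1, 0.1], [0.0, 0.2],
                    [2.0, 2.0], [2.1, 1.9], [1.9, 2.2]])
gallery_labels = ["good"] * 3 + ["defect"] * 3
prediction = knn_predict(np.array([1.8, 2.0]), gallery, gallery_labels)
```

Because classification is a nearest-neighbour lookup, adding a new defect class only requires adding its embeddings to the gallery, which matches the abstract's point about adding or removing classes without retraining.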

(This article belongs to the Special Issue Artificial Intelligence and Sensor-Enhanced Fault Diagnosis for Industrial Application)

33 pages, 3352 KB

Open Access Article

Optimization Strategy for Underwater Target Recognition Based on Multi-Domain Feature Fusion and Deep Learning

by Yanyang Lu, Lichao Ding, Ming Chen, Danping Shi, Guohao Xie, Yuxin Zhang, Hongyan Jiang and Zhe Chen

J. Mar. Sci. Eng. 2025, 13(7), 1311; https://doi.org/10.3390/jmse13071311 - 7 Jul 2025

Viewed by 503

Abstract

Underwater sonar target recognition is crucial in fields such as national defense, navigation, and environmental monitoring. However, it faces issues such as the complex characteristics of ship-radiated noise, imbalanced data distribution, non-stationarity, and bottlenecks of existing technologies. This paper proposes the MultiFuseNet-AID network, aiming to address these challenges. The network includes the TriFusion block module, the novel lightweight attention residual network (NLARN), the long- and short-term attention (LSTA) module, and the Mamba module. Through the TriFusion block module, the original, differential, and cumulative signals are processed in parallel, and features such as MFCC, CQT, and Fbank are fused to achieve deep multi-domain feature fusion, thereby enhancing the signal representation ability. The NLARN was optimized based on the ResNet architecture, with the SE attention mechanism embedded. Combined with the long- and short-term attention (LSTA) and the Mamba module, it could capture long-sequence dependencies with an O(N) complexity, completing the optimization of lightweight long sequence modeling. At the same time, with the help of feature fusion, and layer normalization and residual connections of the Mamba module, the adaptability of the model in complex scenarios with imbalanced data and strong noise was enhanced. On the DeepShip and ShipsEar datasets, the recognition rates of this model reached 98.39% and 99.77%, respectively. The number of parameters and the number of floating point operations were significantly lower than those of classical models, and it showed good stability and generalization ability under different sample label ratios. The research shows that the MultiFuseNet-AID network effectively broke through the bottlenecks of existing technologies. However, there is still room for improvement in terms of adaptability to extreme underwater environments, training efficiency, and adaptability to ultra-small devices. 
This work provides a new direction for the development of underwater sonar target recognition technology.
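The three parallel inputs mentioned in the abstract (original, differential, and cumulative signals) can be derived from a single waveform as sketched below. This shows only the construction of the three views; the MFCC, CQT, and Fbank feature extraction and the TriFusion block itself are not reproduced, and the function name is hypothetical.

```python
import numpy as np

def tri_views(signal):
    """Stack the original, first-difference, and cumulative-sum views."""
    original = np.asarray(signal, dtype=float)
    # prepend keeps the differential view the same length as the input
    differential = np.diff(original, prepend=original[0])
    cumulative = np.cumsum(original)
    return np.stack([original, differential, cumulative])

views = tri_views([1.0, 3.0, 6.0, 10.0])
```

The differential view emphasises rapid changes while the cumulative view emphasises slow trends, so the three rows give complementary descriptions of the same recording.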

(This article belongs to the Section Ocean Engineering)

20 pages, 1935 KB

Open Access Article

Residual Attention Network with Atrous Spatial Pyramid Pooling for Soil Element Estimation in LUCAS Hyperspectral Data

by Yun Deng, Yuchen Cao, Shouxue Chen and Xiaohui Cheng

Appl. Sci. 2025, 15(13), 7457; https://doi.org/10.3390/app15137457 - 3 Jul 2025

Viewed by 442

Abstract

Visible and near-infrared (Vis–NIR) spectroscopy enables the rapid prediction of soil properties but faces three limitations with conventional machine learning: information loss and overfitting from high-dimensional spectral features; inadequate modeling of nonlinear soil–spectra relationships; and failure to integrate multi-scale spatial features. To address these challenges, we propose ReSE-AP Net, a multi-scale attention residual network with spatial pyramid pooling. Built on convolutional residual blocks, the model incorporates a squeeze-and-excitation channel attention mechanism to recalibrate feature weights and an atrous spatial pyramid pooling (ASPP) module to extract multi-resolution spectral features. This architecture jointly represents weak absorption peaks (400–1000 nm) and broad spectral bands (1000–2500 nm), overcoming single-scale modeling limitations. Validation on the LUCAS2009 dataset demonstrated that ReSE-AP Net outperformed conventional machine learning, improving R² by 2.8–36.5% and reducing RMSE by 14.2–69.2%. Compared with existing deep learning methods, it increased R² by 0.4–25.5% for clay, silt, sand, organic carbon, calcium carbonate, and phosphorus predictions, and decreased RMSE by 0.7–39.0%. Our contributions include statistical analysis of LUCAS2009 spectra, identification of conventional method limitations, development of the ReSE-AP Net model, ablation studies, and comprehensive comparisons with alternative approaches.
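The idea behind atrous spatial pyramid pooling on a spectrum is to apply the same small kernel at several dilation rates in parallel, so that narrow peaks and broad bands are both covered. A minimal 1D sketch follows; the averaging kernel, the rates (1, 2, 4), and the ramp-shaped toy spectrum are assumptions, and a trained ASPP module would use learned kernels and fuse the branches afterwards.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """Zero-padded 1D convolution with an odd-length kernel and a dilation rate."""
    k = len(kernel)
    pad = rate * (k - 1) // 2
    xp = np.pad(x, pad)
    return np.array([
        sum(kernel[j] * xp[i + j * rate] for j in range(k))
        for i in range(len(x))
    ])

def aspp_1d(x, kernel, rates=(1, 2, 4)):
    """Run the same kernel at several dilation rates and stack the results."""
    return np.stack([dilated_conv1d(x, kernel, r) for r in rates])

# Toy spectrum: a linear ramp, filtered with a 3-tap averaging kernel.
spectrum = np.arange(10.0)
features = aspp_1d(spectrum, kernel=[1 / 3, 1 / 3, 1 / 3])
```

Each row of `features` sees the spectrum through a different receptive field (3, 5, and 9 samples wide here) while using the same three weights.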

(This article belongs to the Special Issue Advanced Agricultural Technologies: Monitoring, Modeling, and Machine Learning Techniques)

20 pages, 2132 KB

Open Access Article

Deep Learning with Dual-Channel Feature Fusion for Epileptic EEG Signal Classification

by Bingbing Yu, Mingliang Zuo and Li Sui

Eng 2025, 6(7), 150; https://doi.org/10.3390/eng6070150 - 2 Jul 2025

Viewed by 579

Abstract

Background: Electroencephalography (EEG) signals play a crucial role in diagnosing epilepsy by reflecting distinct patterns associated with normal brain activity, ictal (seizure) states, and interictal (between-seizure) periods. However, the manual classification of these patterns is labor-intensive, time-consuming, and depends heavily on specialized expertise. While deep learning methods have shown promise, many current models suffer from limitations such as excessive complexity, high computational demands, and insufficient generalizability. Developing lightweight and accurate models for real-time epilepsy detection remains a key challenge. Methods: This study proposes a novel dual-channel deep learning model to classify epileptic EEG signals into three categories: normal, ictal, and interictal states. Channel 1 integrates a bidirectional long short-term memory (BiLSTM) network with a Squeeze-and-Excitation (SE) ResNet attention module to dynamically emphasize critical feature channels. Channel 2 employs a dual-branch convolutional neural network (CNN) to extract deeper and distinct features. The model’s performance was evaluated on the publicly available Bonn EEG dataset. Results: The proposed model achieved an accuracy of 98.57%. The dual-channel structure improved specificity to 99.43%, while the dual-branch CNN boosted sensitivity by 5.12%. The SE-ResNet attention modules contributed 4.29% to the accuracy improvement, and BiLSTM further enhanced specificity by 1.62%. Ablation studies validated the significance of each module. Conclusions: By leveraging a lightweight design and attention-based mechanisms, the dual-channel model offers high diagnostic precision while maintaining computational efficiency. Its applicability to real-time automated diagnosis positions it as a promising tool for clinical deployment across diverse patient populations.
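The Squeeze-and-Excitation mechanism that emphasises critical feature channels can be written in a few lines. The sketch below is a generic SE forward pass in NumPy on a C x L feature array, with random weights, no bias terms, and a reduction ratio of 4; these are simplifying assumptions and it is not the paper's SE-ResNet module.

```python
import numpy as np

def se_block(features, w1, w2):
    """Squeeze-and-Excitation forward pass.

    features: C x L array; w1: (C // r) x C; w2: C x (C // r).
    Returns the recalibrated features and the per-channel gates.
    """
    squeezed = features.mean(axis=1)                  # squeeze: global average pool
    hidden = np.maximum(0.0, w1 @ squeezed)           # excitation: FC + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # FC + sigmoid, values in (0, 1)
    return features * gates[:, None], gates

# Hypothetical sizes: 8 channels, 16 time steps, reduction ratio r = 4.
rng = np.random.default_rng(0)
eeg_features = rng.normal(size=(8, 16))
w1 = rng.normal(size=(2, 8))
w2 = rng.normal(size=(8, 2))
recalibrated, gates = se_block(eeg_features, w1, w2)
```

Each channel is multiplied by a single learned gate, so informative channels can be passed through almost unchanged while others are attenuated.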

(This article belongs to the Special Issue Advanced Artificial Intelligence Techniques for Disease Prediction, Diagnosis and Management)

20 pages, 760 KB

Open Access Article

Detecting AI-Generated Images Using a Hybrid ResNet-SE Attention Model

by Abhilash Reddy Gunukula, Himel Das Gupta and Victor S. Sheng

Appl. Sci. 2025, 15(13), 7421; https://doi.org/10.3390/app15137421 - 2 Jul 2025

Viewed by 974

Abstract

The rapid advancements in generative artificial intelligence (AI), particularly through models like Generative Adversarial Networks (GANs) and diffusion-based architectures, have made it increasingly difficult to distinguish between real and synthetically generated images. While these technologies offer benefits in creative domains, they also pose serious risks in terms of misinformation, digital forgery, and identity manipulation. This paper presents a novel hybrid deep learning model for detecting AI-generated images that integrates Squeeze-and-Excitation (SE) attention blocks into the ResNet-50 backbone. The proposed SE-ResNet50 model enhances channel-wise feature recalibration and interpretability, enabling dynamic emphasis on subtle generative artifacts such as unnatural textures and semantic inconsistencies, thereby improving classification fidelity. Experimental evaluation on the CIFAKE dataset demonstrates the model’s effectiveness, achieving a test accuracy of 96.12%, precision of 97.04%, recall of 88.94%, F1-score of 92.82%, and an AUC score of 0.9862. The model shows strong generalization, minimal overfitting, and superior performance compared with transformer-based models and standard architectures like ResNet-50, VGGNet, and DenseNet. These results confirm the hybrid model’s suitability for real-time and resource-constrained applications in media forensics, content authentication, and ethical AI governance.
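The reported F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short check below uses only the figures quoted in the abstract; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(97.04, 88.94)
```

The result is roughly 92.8%, which agrees with the reported 92.82% to within the rounding of the published precision and recall.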

(This article belongs to the Special Issue Advanced Signal and Image Processing for Applied Engineering)

23 pages, 5745 KB

Open Access Article

BDSER-InceptionNet: A Novel Method for Near-Infrared Spectroscopy Model Transfer Based on Deep Learning and Balanced Distribution Adaptation

by Jianghai Chen, Jie Ling, Nana Lei and Lingqiao Li

Sensors 2025, 25(13), 4008; https://doi.org/10.3390/s25134008 - 27 Jun 2025

Viewed by 486

Abstract

Near-Infrared Spectroscopy (NIRS) analysis technology faces numerous challenges in industrial applications. The generalization capability of models is significantly affected by instrumental heterogeneity, environmental interference, and sample diversity. Traditional modeling methods exhibit certain limitations in handling these factors, making it difficult to achieve effective adaptation across different scenarios. Specifically, data distribution shifts and mismatches in multi-scale features hinder the transferability of models across different crop varieties or instruments from different manufacturers. As a result, the large amount of previously accumulated NIRS and reference data cannot be effectively utilized in modeling for new instruments or new varieties, thereby limiting improvements in modeling efficiency and prediction accuracy. To address these limitations, this study proposes a novel transfer learning framework integrating multi-scale network architecture with Balanced Distribution Adaptation (BDA) to enhance cross-instrument compatibility. The key contributions include: (1) RX-Inception multi-scale structure: Combines Xception’s depthwise separable convolution with ResNet’s residual connections to strengthen global–local feature coupling. (2) Squeeze-and-Excitation (SE) attention: Dynamically recalibrates spectral band weights to enhance discriminative feature representation. (3) Systematic evaluation of six transfer strategies: Comparative analysis of their impacts on model adaptation performance. Experimental results on open corn and pharmaceutical datasets demonstrate that BDSER-InceptionNet achieves state-of-the-art performance on primary instruments. Notably, the proposed Method 6 successfully enables NIRS model sharing from primary to secondary instruments, effectively mitigating spectral discrepancies and significantly improving transfer efficacy.
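The appeal of depthwise separable convolution, which contribution (1) borrows from Xception, is easiest to see by counting weights. The sketch below assumes 1D convolutions over spectra, no bias terms, and hypothetical layer sizes (64 input channels, 128 output channels, kernel length 3); the abstract does not give the actual layer dimensions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard 1D convolution with kernel length k."""
    return k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise (k per channel) plus pointwise (1 x 1) pair."""
    return k * c_in + c_in * c_out

standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
```

For these sizes the separable pair needs 8,384 weights against 24,576 for the standard layer, roughly a threefold reduction.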

(This article belongs to the Section Physical Sensors)

26 pages, 3424 KB

Open Access Article

MFF: A Multimodal Feature Fusion Approach for Encrypted Traffic Classification

by Hong Huang, Yinghang Zhou, Feng Jiang, Xiaolin Zhou and Qingping Jiang

Electronics 2025, 14(13), 2584; https://doi.org/10.3390/electronics14132584 - 26 Jun 2025

Viewed by 494

Abstract

With the widespread adoption of encryption technologies, encrypted traffic classification has become essential for maintaining network security awareness and optimizing service quality. However, existing deep learning-based methods often rely on fixed-length truncation during preprocessing, which can lead to the loss of critical information and degraded classification performance. To address this issue, we propose a Multi-Feature Fusion (MFF) model that learns robust representations of encrypted traffic through a dual-path feature extraction architecture. The temporal modeling branch incorporates a Squeeze-and-Excitation (SE) attention mechanism into ResNet18 to dynamically emphasize salient temporal patterns. Meanwhile, the global statistical feature branch uses an autoencoder for nonlinear dimensionality reduction and semantic reconstruction of 52-dimensional statistical features, effectively preserving high-level semantic information of traffic interactions. MFF integrates both feature types to achieve feature enhancement and construct a more robust representation, thereby improving classification accuracy and generalization. In addition, SHAP-based interpretability analysis further validates the model’s decision-making process and reliability. Experimental results show that MFF achieves classification accuracies of 99.61% and 99.99% on the ISCX VPN-nonVPN and USTC-TFC datasets, respectively, outperforming mainstream baselines.

(This article belongs to the Section Networks)

23 pages, 4555 KB

Open Access Article

Prediction of Medium-Thick Plates Weld Penetration States in Cold Metal Transfer Plus Pulse Welding Based on Deep Learning Model

by Yanli Song, Kang Song, Yipeng Peng, Lin Hua, Jue Lu and Xuanguo Wang

Metals 2025, 15(6), 637; https://doi.org/10.3390/met15060637 - 5 Jun 2025

Viewed by 556

Abstract

During the cold metal transfer plus pulse (CMT+P) welding process of medium-thick plates, problems such as incomplete penetration (IP) and burn-through (BT) are prone to occur, and weld pool morphology is important information reflecting the penetration states. In order to acquire high-quality weld pool images under complex welding conditions, such as smoke and arc light, a welding monitoring system was designed. To predict weld penetration states, an improved Inception-ResNet prediction model was proposed. A Squeeze-and-Excitation (SE) block was added after each Inception-ResNet block to further extract key feature information from weld pool images, increasing the weight of key features beneficial for predicting the penetration states. The model was trained, validated, and tested. The results demonstrate that the improved model achieves an accuracy of over 96% in predicting the penetration states of aluminum alloy medium-thick plates, an improvement over the original model. The model was applied in welding experiments and achieved an accurate prediction.

(This article belongs to the Special Issue Advanced Lightweight Materials: Processing, Characterization and Applications)

15 pages, 2363 KB

Open Access Article

A Two-Stage Deep Learning Method for Auxiliary Diagnosis of Upper Limb Fractures Based on ResNet-50 and Enhanced YOLO

by Hongxiao Wang, Zhe Li and Dingsen Zhang

Mathematics 2025, 13(11), 1858; https://doi.org/10.3390/math13111858 - 2 Jun 2025

Viewed by 563

Abstract

Existing auxiliary diagnosis methods for fractures are mostly limited to specific body parts and lack generality and robustness when applied to multi-part diagnoses. To address this problem, this study proposes a two-stage upper limb fracture auxiliary diagnosis method based on deep learning and develops a corresponding auxiliary diagnosis system. In the first stage, this study employs an improved ResNet-50 model combined with transfer learning and a Squeeze-and-Excitation (SE) attention mechanism for fracture image localization. In the second stage, an improved You Only Look Once (YOLO) model based on Scale Sequence Feature Fusion (SSFF) and Triple Feature Encoder (TFE) modules is used for fracture diagnoses in different body parts. In contrast to traditional methods that are tailored to specific body parts, the integrated design approach presented in this paper is better suited to meeting the diagnostic needs of multiple body parts, demonstrating better generality and clinical application potential.

(This article belongs to the Special Issue AI-Driven Innovations in Healthcare: Advances in Machine Learning and Computer Vision)

21 pages, 8812 KB

Open AccessArticle

A Three-Channel Improved SE Attention Mechanism Network Based on SVD for High-Order Signal Modulation Recognition

by Xujia Zhou, Gangyi Tu, Xicheng Zhu, Di Zhao and Luyan Zhang

Electronics 2025, 14(11), 2233; https://doi.org/10.3390/electronics14112233 - 30 May 2025

Viewed by 517

Abstract

To address the issues of poor differentiation capability for high-order signals and low average recognition rates in existing communication modulation recognition techniques, this paper first performs denoising using an entropy-based dynamic Singular Value Decomposition (SVD) method and proposes a three-channel convolutional gated recurrent unit (GRU) model combined with an improved SE attention mechanism for automatic modulation recognition. The model denoises in-phase/quadrature (I/Q) signals using the SVD method to enhance signal quality. By combining one-dimensional (1D) and two-dimensional (2D) convolutions, it employs a three-channel approach to extract spatial features and capture local correlations. A GRU is utilized to capture temporal sequence features so as to enhance the perception of dynamic changes. Additionally, an improved SE block is introduced to optimize feature representation, adaptively adjust channel weights, and improve classification performance. Experiments on the RadioML2016.10a dataset show that the model has a maximum classification recognition rate of 92.54%. Compared with traditional CNN, ResNet, CLDNN, GRU2, DAE, and LSTM2, the average recognition accuracy is improved by 5.41% to 8.93%. At the same time, the model significantly enhances the differentiation capability between 16QAM and 64QAM, reducing the average confusion probability by 27.70% to 39.40%. Full article
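The general idea of SVD-based denoising can be sketched as a low-rank approximation of a signal's trajectory (Hankel) matrix. The sketch below is an assumption-laden illustration, not the paper's method: it keeps a fixed number of singular values (`rank`) instead of the entropy-based dynamic selection described in the abstract, and the function name `svd_denoise`, the window length, and the test sinusoid are all hypothetical.

```python
import numpy as np

def svd_denoise(signal, window, rank):
    """Denoise a 1D signal by truncating the SVD of its trajectory matrix."""
    n = len(signal)
    cols = n - window + 1
    # Trajectory matrix: column j holds signal[j : j + window].
    H = np.array([signal[j:j + window] for j in range(cols)]).T
    U, S, Vt = np.linalg.svd(H, full_matrices=False)
    # Assumption: a fixed rank replaces the paper's entropy-based choice.
    S[rank:] = 0.0
    H_low = (U * S) @ Vt
    # Average overlapping entries (anti-diagonals) to return to a 1D signal.
    out = np.zeros(n)
    counts = np.zeros(n)
    for j in range(cols):
        out[j:j + window] += H_low[:, j]
        counts[j:j + window] += 1
    return out / counts

# Hypothetical test signal: a sinusoid with additive Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(256)
clean = np.sin(2 * np.pi * t / 32)
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = svd_denoise(noisy, window=64, rank=2)
```

A single real sinusoid has a rank-2 trajectory matrix, which is why `rank=2` suits this toy signal; for I/Q data the same routine would be applied to each component, with the rank chosen by whatever criterion the method prescribes.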

18 pages, 3721 KB

Open AccessArticle

Haptic–Vision Fusion for Accurate Position Identification in Robotic Multiple Peg-in-Hole Assembly

by Jinlong Chen, Deming Luo, Zhigang Xiao, Minghao Yang, Xingguo Qin and Yongsong Zhan

Electronics 2025, 14(11), 2163; https://doi.org/10.3390/electronics14112163 - 26 May 2025

Viewed by 751

Abstract

Multi-peg-hole assembly is a fundamental process in robotic manufacturing, particularly for circular aviation electrical connectors (CAECs) that require precise axial alignment. However, CAEC assembly poses significant challenges due to small apertures, posture disturbances, and the need for high error tolerance. This paper proposes a dual-stream Siamese network (DSSN) framework that fuses visual and tactile modalities to achieve accurate position identification in six-degree-of-freedom robotic connector assembly tasks. The DSSN employs ConvNeXt for visual feature extraction and SE-ResNet-50 with integrated attention mechanisms for tactile feature extraction, while a gated attention module adaptively fuses multimodal features. A bidirectional long short-term memory (Bi-LSTM) recurrent neural network is introduced to jointly model spatiotemporal deviations in position and orientation. Compared with state-of-the-art methods, the proposed DSSN achieves improvements of approximately 7.4%, 5.7%, and 5.4% in assembly success rates after 1, 5, and 10 buckling iterations, respectively. Experimental results validate that the integration of multimodal adaptive fusion and sequential spatiotemporal learning enables robust and precise robotic connector assembly under high-tolerance conditions. Full article
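One common way to fuse two feature vectors adaptively is a sigmoid gate that mixes them element by element. The abstract does not specify the form of the DSSN's gated attention module, so the sketch below is only a generic illustration under that assumption; `gated_fusion`, `W`, and `b` are hypothetical names, and the zero weights are placeholders for learned parameters.

```python
import numpy as np

def gated_fusion(visual, tactile, W, b):
    """Mix two same-length feature vectors with an element-wise sigmoid gate."""
    joint = np.concatenate([visual, tactile])
    # The gate is computed from both modalities and lies in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-(W @ joint + b)))
    # Convex combination: gate near 1 favors vision, near 0 favors touch.
    return gate * visual + (1.0 - gate) * tactile, gate

# Hypothetical features; zero weights make the gate exactly 0.5 everywhere.
d = 4
visual = np.array([1.0, 2.0, 3.0, 4.0])
tactile = np.array([4.0, 3.0, 2.0, 1.0])
W = np.zeros((d, 2 * d))
b = np.zeros(d)
fused, gate = gated_fusion(visual, tactile, W, b)
```

Because the output is a convex combination, each fused feature stays between the corresponding visual and tactile values, which keeps the fusion stable when one modality is unreliable.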

Search Results (141)
