Search Results (1,009)

Search Parameters:
Keywords = convolution block attention module

24 pages, 5127 KB  
Article
Research on Geographical Origin Traceability of Salvia miltiorrhiza by Combining Two-Trace Two-Dimensional (2T2D) Correlation Spectroscopy and Improved DeiT Model
by Jinpo Yang, Kai Chen, Yimin Zhou, Jian Zheng, Linhao Sun, Yun Zhang and Zhu Zhou
Plants 2025, 14(21), 3365; https://doi.org/10.3390/plants14213365 (registering DOI) - 3 Nov 2025
Abstract
Salvia miltiorrhiza Bunge (Danshen) is widely used in modern medicine, but the market faces challenges from counterfeit and mislabeled geographical indication products. To address this, we propose a novel framework combining Two-trace Two-dimensional (2T2D) correlation spectroscopy, hyperspectral imaging (HSI), transfer learning, and an enhanced deep learning model (DeiT-CBAM) to identify both authenticity and origin precisely. Hyperspectral data (873–1720 nm) were collected from six genuine and three adulterated regions and converted into synchronous 2T2D correlation spectroscopy images. We systematically evaluated five preprocessing strategies, three wavelength selection methods, three classical models, and four deep learning models. Models based on 2T2D correlation spectroscopy images consistently outperformed traditional one-dimensional spectral models. Notably, the DeiT-CBAM model, integrated with the successive projections algorithm (SPA), achieved optimal performance using only 79 wavelengths, with 100% accuracy on the training and validation sets and 99.62% on the test set, without the need for additional preprocessing. Model interpretability was further validated through layer-wise class activation mapping (layer-wise CAM). This study demonstrates that the integration of synchronous 2T2D correlation spectroscopy images with the DeiT-CBAM model offers robust discriminative performance, providing a reliable technical solution for geographical origin traceability of food, medicinal herbs, and other species. Full article
(This article belongs to the Section Horticultural Science and Ornamental Plants)
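The abstract above converts spectra into synchronous 2T2D correlation images but does not give the formula. A minimal sketch, assuming the commonly cited two-trace definitions (synchronous: half the sum of the sample and reference outer products; asynchronous: half the antisymmetric cross product). The function name, the choice of reference spectrum, and the plain-list representation are hypothetical, not taken from the article.

```python
def two_trace_2d(sample, reference):
    """Return (synchronous, asynchronous) 2T2D maps as nested lists.

    sample, reference: equal-length lists of intensities, one value per
    wavelength. Which spectrum serves as the reference is an assumption
    left to the caller; the abstract does not say.
    """
    if len(sample) != len(reference):
        raise ValueError("spectra must share the same wavelength grid")
    n = len(sample)
    # Synchronous map: symmetric, 0.5 * (s_i * s_j + r_i * r_j)
    sync = [[0.5 * (sample[i] * sample[j] + reference[i] * reference[j])
             for j in range(n)] for i in range(n)]
    # Asynchronous map: antisymmetric, 0.5 * (s_i * r_j - r_i * s_j)
    asyn = [[0.5 * (sample[i] * reference[j] - reference[i] * sample[j])
             for j in range(n)] for i in range(n)]
    return sync, asyn
```

Under this sketch, a spectrum reduced to 79 selected wavelengths would give a 79 x 79 map, which is the kind of image a vision model can then consume.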

34 pages, 16941 KB  
Article
Explainable AI Based Multi Class Skin Cancer Detection Enhanced by Meta Learning with Generative DDPM Data Augmentation
by Muhammad Danish Ali, Muhammad Ali Iqbal, Sejong Lee, Xiaoyun Duan and Soo Kyun Kim
Appl. Sci. 2025, 15(21), 11689; https://doi.org/10.3390/app152111689 - 31 Oct 2025
Viewed by 88
Abstract
Despite the widespread success of convolutional deep learning frameworks in computer vision, significant limitations persist in medical image analysis. These include low image quality caused by noise and artifacts, limited data availability compromising robustness on unseen data, class imbalance leading to biased predictions, and insufficient feature representation, as conventional CNNs often fail to capture subtle patterns and complex dependencies. To address these challenges, we propose DAME (Diffusion-Augmented Meta-Learning Ensemble), a unified architecture that integrates hybrid modeling with generative learning using the Denoising Diffusion Probabilistic Model (DDPM). The DDPM component improves resolution, augments scarce data, and mitigates class imbalance. A hybrid backbone combining CNN, Vision Transformer (ViT), and CBAM captures both local dependencies and long-range spatial relationships, while CBAM further enhances feature representation by adaptively emphasizing informative regions. Predictions from multiple hybrids are aggregated, and a logistic regression meta classifier learns from these outputs to produce robust decisions. The framework is evaluated on the HAM10000 dataset, a benchmark for multi-class skin cancer classification. Explainable AI is incorporated through Grad CAM, providing visual insights into the decision-making process. This synergy mitigates CNN limitations and demonstrates superior generalizability, achieving 98.6% accuracy, 0.986 precision, 0.986 recall, and a 0.986 F1-score, significantly outperforming existing approaches. Overall, the proposed framework enables accurate, interpretable, and reliable medical image diagnosis through the joint optimization of contextual modeling, feature discrimination, and data generation. Full article

20 pages, 3102 KB  
Article
A Study on Digital Soil Mapping Based on Multi-Attention Convolutional Neural Networks: A Case Study in Heilongjiang Province
by Yaxue Liu, Hengkai Li, Yuchun Pan, Yunbing Gao and Yanbing Zhou
Agriculture 2025, 15(21), 2273; https://doi.org/10.3390/agriculture15212273 - 31 Oct 2025
Viewed by 97
Abstract
Machine learning-based digital soil mapping often struggles with spatial heterogeneity and long-range dependencies. To address these limitations, this study proposes Multi-Attention Convolutional Neural Networks (MACNN). This deep learning algorithm integrates multiple attention mechanisms to improve mapping accuracy. First, environmental covariates are determined from the soil-landscape model. These are then fed as structured input to the Convolutional Neural Network. Next, by incorporating Transformer self-attention and multi-head attention mechanisms, this study effectively models the long-range dependencies between soil types and features. Concurrently, the Convolutional Block Attention Module (CBAM) is introduced. CBAM features both channel and spatial dual attention, enabling adaptive weighting of crucial feature channels and spatial locations. This significantly enhances the algorithm’s sensitivity to discriminative information. To validate its effectiveness, the proposed MACNN algorithm was used for soil type mapping in Heilongjiang Province. Compared to Random Forest, Decision Tree, and One-Dimensional Convolutional Neural Network algorithms, MACNN demonstrated superior classification performance. It achieved an overall classification accuracy of 81.27%. An ablation study was conducted to investigate the importance of individual modules within the proposed algorithm. The findings indicate that progressively integrating Transformer and CBAM modules into the 1D-CNN baseline significantly enhances algorithm performance through synergistic gains. Therefore, this integrated algorithm offers a feasible solution to improve digital soil mapping accuracy, providing significant reference value for future research and applications. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
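The channel-then-spatial weighting that the abstract above attributes to CBAM can be sketched in plain Python. This is an illustrative sketch only: the channel branch follows the usual CBAM form (a shared MLP applied to average-pooled and max-pooled channel descriptors, summed and passed through a sigmoid), while the spatial branch is deliberately simplified to a per-pixel weighted mix of the channel-wise mean and max instead of the 7x7 convolution normally used. All function names and weight arguments are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, w1, w2):
    """fmap: C x H x W nested lists. w1: hidden x C, w2: C x hidden (shared MLP)."""
    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]  # global avg pool
    mx = [max(map(max, ch)) for ch in fmap]                            # global max pool
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(fmap, w_avg, w_max, bias=0.0):
    """Simplified spatial branch: 1x1 mix of channel-wise mean and max maps.
    (The standard module applies a 7x7 convolution here; this is a stand-in.)"""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            vals = [fmap[k][i][j] for k in range(c)]
            row.append(sigmoid(w_avg * sum(vals) / c + w_max * max(vals) + bias))
        out.append(row)
    return out

def cbam(fmap, w1, w2, w_avg=1.0, w_max=1.0):
    """Apply channel attention first, then spatial attention, as in CBAM."""
    ca = channel_attention(fmap, w1, w2)
    scaled = [[[v * ca[k] for v in row] for row in ch] for k, ch in enumerate(fmap)]
    sa = spatial_attention(scaled, w_avg, w_max)
    return [[[v * sa[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in scaled]
```

Because both attention maps lie in (0, 1), the module can only rescale features, never amplify them beyond their input magnitude; which features survive depends entirely on the learned weights.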

18 pages, 16806 KB  
Article
Refined Extraction of Sugarcane Planting Areas in Guangxi Using an Improved U-Net Model
by Tao Yue, Zijun Ling, Yuebiao Tang, Jingjin Huang, Hongteng Fang, Siyuan Ma, Jie Tang, Yun Chen and Hong Huang
Drones 2025, 9(11), 754; https://doi.org/10.3390/drones9110754 - 30 Oct 2025
Viewed by 127
Abstract
Sugarcane, a vital economic crop and renewable energy source, requires precise monitoring of the area in which it has been planted to ensure sugar industry security, optimize agricultural resource allocation, and allow the assessment of ecological benefits. Guangxi Zhuang Autonomous Region, leveraging its subtropical climate and abundant solar thermal resources, accounts for over 63% of China’s total sugarcane cultivation area. In this study, we constructed an enhanced RCAU-net model and developed a refined extraction framework that considers different growth stages to enable rapid identification of sugarcane planting areas. This study addresses key challenges in remote-sensing-based sugarcane extraction, namely, the difficulty of distinguishing spectrally similar objects, significant background interference, and insufficient multi-scale feature fusion. To significantly enhance the accuracy and robustness of sugarcane identification, an improved RCAU-net model based on the U-net architecture was designed. The model incorporates three key improvements: it replaces the original encoder with ResNet50 residual modules to enhance discrimination of similar crops; it integrates a Convolutional Block Attention Module (CBAM) to focus on critical features and effectively suppress background interference; and it employs an Atrous Spatial Pyramid Pooling (ASPP) module to bridge the encoder and decoder, thereby optimizing the extraction of multi-scale contextual information. A refined extraction framework that accounts for different growth stages was ultimately constructed to achieve rapid identification of sugarcane planting areas in Guangxi. The experimental results demonstrate that the RCAU-net model performed excellently, achieving an Overall Accuracy (OA) of 97.19%, a Mean Intersection over Union (mIoU) of 94.47%, a Precision of 97.31%, and an F1 Score of 97.16%. 
These results represent significant improvements of 7.20, 10.02, 6.82, and 7.28 percentage points in OA, mIoU, Precision, and F1 Score, respectively, relative to the original U-net. The model also achieved a Kappa coefficient of 0.9419 and a Recall rate of 96.99%. The incorporation of residual structures significantly reduced the misclassification of similar crops, while the CBAM and ASPP modules minimized holes within large continuous patches and false extractions of small patches, resulting in smoother boundaries for the extracted areas. This work provides reliable data support for the accurate calculation of sugarcane planting area and greatly enhances the decision-making value of remote sensing monitoring in modern agricultural management of sugarcane. Full article
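The abstract above reports RCAU-net scores and the percentage-point gains over the original U-net, so the implied U-net baselines follow by subtraction. A small arithmetic sketch; the baseline values are derived here, not quoted from the article, and the dictionary names are hypothetical.

```python
# Scores reported for RCAU-net (percent) and gains over U-net (percentage points).
rcau = {"OA": 97.19, "mIoU": 94.47, "Precision": 97.31, "F1": 97.16}
gain_pp = {"OA": 7.20, "mIoU": 10.02, "Precision": 6.82, "F1": 7.28}

# Implied U-net baseline = RCAU-net score minus the reported gain.
unet_baseline = {k: round(rcau[k] - gain_pp[k], 2) for k in rcau}

# Relative improvement, to contrast with the absolute percentage-point gain.
relative_gain = {k: round(100.0 * gain_pp[k] / unet_baseline[k], 2) for k in rcau}
```

This gives implied baselines of 89.99 (OA), 84.45 (mIoU), 90.49 (Precision), and 89.88 (F1), which makes clear that the largest gain, in mIoU, also starts from the lowest baseline.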

23 pages, 3198 KB  
Article
Mulch-YOLO: Improved YOLOv11 for Real-Time Detection of Mulch in Seed Cotton
by Zhiwei Su, Wei Wei, Zhen Huang and Ronglin Yan
Appl. Sci. 2025, 15(21), 11604; https://doi.org/10.3390/app152111604 - 30 Oct 2025
Viewed by 126
Abstract
Machine harvesting of cotton in Xinjiang has significantly improved harvesting efficiency; however, it has also resulted in a considerable increase in residual mulch content within the cotton, which has severely affected the quality and market value of cotton textiles. Existing mulch detection algorithms based on machine vision generally suffer from complex parameterization and insufficient real-time performance. To overcome these limitations, this study proposes a novel mulch detection algorithm, Mulch-YOLO, developed on the YOLOv11 framework. Specifically, an improved CBAM (Convolutional Block Attention Module) is incorporated into the BiFPN (Bidirectional Feature Pyramid Network) to achieve more effective fusion of multi-scale mulch features. To enhance the semantic representation of mulch features, a modified Content-Aware ReAssembly of Features module, CARAFE-Mulch, is designed to reorganize feature maps, resulting in stronger feature expressiveness compared with the original representations. Furthermore, the MobileOne module is optimized by integrating the Dilated Efficient Channel Attention (DECA) module, thereby reducing both the parameter count and computational load while improving real-time detection efficiency. To verify the effectiveness of the proposed approach, experiments were conducted on a real-world dataset containing 20,134 images of low-visual-saliency plastic mulch. The results indicate that Mulch-YOLO achieves a lightweight architecture and high detection accuracy. Compared with YOLOv11n, the proposed method improves mAP@0.5 by 4.7% and mAP@0.5:0.95 by 3.3%, with a 24% reduction in model parameters.
(This article belongs to the Section Agricultural Science and Technology)

31 pages, 7049 KB  
Article
Objective Emotion Assessment Using a Triple Attention Network for an EEG-Based Brain–Computer Interface
by Lihua Zhang, Xin Zhang, Xiu Zhang, Changyi Yu and Xuguang Liu
Brain Sci. 2025, 15(11), 1167; https://doi.org/10.3390/brainsci15111167 - 29 Oct 2025
Viewed by 334
Abstract
Background: The assessment of emotion recognition holds growing significance in research on the brain–computer interface and human–computer interaction. Among diverse physiological signals, electroencephalography (EEG) occupies a pivotal position in affective computing due to its exceptional temporal resolution and non-invasive acquisition. However, EEG signals are inherently complex, characterized by substantial noise contamination and high variability, posing considerable challenges to accurate assessment. Methods: To tackle these challenges, we propose a Triple Attention Network (TANet), a triple-attention EEG emotion recognition framework that integrates Conformer, Convolutional Block Attention Module (CBAM), and Mutual Cross-Modal Attention (MCA). The Conformer component captures temporal feature dependencies, CBAM refines spatial channel representations, and MCA performs cross-modal fusion of differential entropy and power spectral density features. Results: We evaluated TANet on two benchmark EEG emotion datasets, DEAP and SEED. On SEED, using a subject-specific cross-validation protocol, the model reached an average accuracy of 98.51 ± 1.40%. On DEAP, we deliberately adopted a segment-level splitting paradigm—in line with influential state-of-the-art methods—to ensure a direct and fair comparison of model architecture under an identical evaluation protocol. This approach, designed specifically to assess fine-grained within-trial pattern discrimination rather than cross-subject generalization, yielded accuracies of 99.69 ± 0.15% and 99.67 ± 0.13% for the valence and arousal dimensions, respectively. Compared with existing benchmark approaches under similar evaluation protocols, TANet delivers substantially better results, underscoring the strong complementary effects of its attention mechanisms in improving EEG-based emotion recognition performance. 
Conclusions: This work provides both theoretical insights into multi-dimensional attention for physiological signal processing and practical guidance for developing high-performance, robust EEG emotion assessment systems. Full article
(This article belongs to the Section Neurotechnology and Neuroimaging)
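The abstract above fuses differential entropy (DE) and power spectral density features but does not define them. A minimal sketch of DE, assuming the closed form widely used in EEG work for a signal segment modeled as Gaussian, DE = 0.5 * ln(2 * pi * e * variance). Band-pass filtering into frequency bands, which normally precedes this step, is not shown, and the function name is hypothetical.

```python
import math
from statistics import pvariance

def differential_entropy(segment):
    """Differential entropy (in nats) of a segment under a Gaussian assumption.

    segment: sequence of samples from one channel, assumed to be already
    band-pass filtered to the frequency band of interest.
    """
    var = pvariance(segment)
    if var <= 0:
        raise ValueError("segment must have non-zero variance")
    return 0.5 * math.log(2 * math.pi * math.e * var)
```

Under this assumption DE depends only on the variance of the filtered segment, so it grows with the logarithm of band power.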

29 pages, 8732 KB  
Article
MFF-ClassificationNet: CNN-Transformer Hybrid with Multi-Feature Fusion for Breast Cancer Histopathology Classification
by Xiaoli Wang, Guowei Wang, Luhan Li, Hua Zou and Junpeng Cui
Biosensors 2025, 15(11), 718; https://doi.org/10.3390/bios15110718 - 29 Oct 2025
Viewed by 249
Abstract
Breast cancer is one of the most prevalent malignant tumors among women worldwide, underscoring the urgent need for early and accurate diagnosis to reduce mortality. To address this, a Multi-Feature Fusion Classification Network (MFF-ClassificationNet) is proposed for breast histopathological image classification. The network adopts a two-branch parallel architecture, where a convolutional neural network captures local details and a Transformer models global dependencies. Their features are deeply integrated through a Multi-Feature Fusion module, which incorporates a Convolutional Block Attention Module-Squeeze and Excitation (CBAM-SE) fusion block combining convolutional block attention, squeeze-and-excitation mechanisms, and a residual inverted multilayer perceptron to enhance fine-grained feature representation and category-specific lesion characterization. Experimental evaluations on the BreakHis dataset achieved accuracies of 98.30%, 97.62%, 98.81%, and 96.07% at magnifications of 40×, 100×, 200×, and 400×, respectively, while an accuracy of 97.50% was obtained on the BACH dataset. These results confirm that integrating local and global features significantly strengthens the model’s ability to capture multi-scale and context-aware information, leading to superior classification performance. Overall, MFF-ClassificationNet surpasses conventional single-path approaches and provides a robust, generalizable framework for advancing computer-aided diagnosis of breast cancer.
(This article belongs to the Special Issue AI-Based Biosensors and Biomedical Imaging)

19 pages, 1994 KB  
Article
IVCLNet: A Hybrid Deep Learning Framework Integrating Signal Decomposition and Attention-Enhanced CNN-LSTM for Lithium-Ion Battery SOH Prediction and RUL Estimation
by Yulong Pei, Hua Huo, Yinpeng Guo, Shilu Kang and Jiaxin Xu
Energies 2025, 18(21), 5677; https://doi.org/10.3390/en18215677 - 29 Oct 2025
Viewed by 339
Abstract
Accurate prediction of the degradation trajectory and estimation of the remaining useful life (RUL) of lithium-ion batteries are crucial for ensuring the reliability and safety of modern energy storage systems. However, many existing approaches rely on deep or highly complex models to achieve high accuracy, often at the cost of computational efficiency and practical applicability. To tackle this challenge, we propose a novel hybrid deep-learning framework, IVCLNet, which predicts the battery’s state-of-health (SOH) evolution and estimates RUL by identifying the end-of-life threshold (SOH = 80%). The framework integrates Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN), Variational Mode Decomposition (VMD), and an attention-enhanced Long Short-Term Memory (LSTM) network. IVCLNet leverages a cascade decomposition strategy to capture multi-scale degradation patterns and employs multiple indirect health indicators (HIs) to enrich feature representation. A lightweight Convolutional Block Attention Module (CBAM) is embedded to strengthen the model’s perception of critical features, guiding the one-dimensional convolutional layers to focus on informative components. Combined with LSTM-based temporal modeling, the framework ensures both accuracy and interpretability. Extensive experiments conducted on two publicly available lithium-ion battery datasets demonstrated that IVCLNet significantly outperforms existing methods in terms of prediction accuracy, robustness, and computational efficiency. The findings indicate that the proposed framework is promising for practical applications in battery health management systems. Full article
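The abstract above estimates RUL by locating the end-of-life threshold (SOH = 80%) on the predicted SOH trajectory. A minimal sketch of that last step, assuming the forecast is a list of SOH fractions for consecutive future cycles and that end of life is the first cycle at or below the threshold. The function name and the "at or below" convention are assumptions, not details from the article.

```python
def remaining_useful_life(soh_forecast, eol_threshold=0.80):
    """Return the number of cycles until predicted SOH first reaches the threshold.

    soh_forecast: predicted state-of-health values (fractions of nominal
    capacity) for the next cycles, in order. Returns None if the forecast
    never reaches the threshold within its horizon.
    """
    for cycles_ahead, soh in enumerate(soh_forecast, start=1):
        if soh <= eol_threshold:
            return cycles_ahead
    return None
```

For example, `remaining_useful_life([0.85, 0.83, 0.81, 0.80, 0.78])` finds the threshold at the fourth forecast cycle. Returning `None` for a forecast that never crosses the threshold avoids reporting a misleading horizon-length RUL.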

22 pages, 3835 KB  
Article
Phenology-Guided Wheat and Corn Identification in Xinjiang: An Improved U-Net Semantic Segmentation Model Using PCA and CBAM-ASPP
by Yang Wei, Xian Guo, Yiling Lu, Hongjiang Hu, Fei Wang, Rongrong Li and Xiaojing Li
Remote Sens. 2025, 17(21), 3563; https://doi.org/10.3390/rs17213563 - 28 Oct 2025
Viewed by 202
Abstract
Wheat and corn are two major food crops in Xinjiang. However, the spectral similarity between these crop types and the complexity of their spatial distribution have posed significant challenges to accurate crop identification. To this end, the study aimed to improve the accuracy of crop distribution identification in complex environments in three ways. First, by analysing the kNDVI and EVI time series, the optimal identification window was determined to be days 156–176 of the year, when wheat is in the grain-filling to milk-ripening phase, maize is in the jointing to tillering phase, and the spectral differences between the two crops are strongest. Second, principal component analysis (PCA) was applied to Sentinel-2 data. The top three principal components were extracted to construct the input dataset, effectively integrating visible and near-infrared band information. This approach suppressed redundancy and noise while replacing traditional RGB datasets. Finally, the Convolutional Block Attention Module (CBAM) was integrated into the U-Net model to sharpen the focus on key crop areas. An improved Atrous Spatial Pyramid Pooling (ASPP) module based on depthwise separable convolutions was adopted to reduce the computational load while boosting multi-scale context awareness. The experimental results showed the following: (1) Wheat and corn exhibit obvious phenological differences between the 156th and 176th days of the year, which can be used as the optimal time window for identifying their spatial distributions. (2) The proposed method performed best, with mIoU, mPA, F1-score, and overall accuracy (OA) reaching 83.03%, 91.34%, 90.73%, and 90.91%, respectively. Compared to DeeplabV3+, PSPnet, HRnet, Segformer, and U-Net, the OA improved by 5.97%, 4.55%, 2.03%, 8.99%, and 1.5%, respectively. The recognition accuracy of the PCA dataset improved by approximately 2% compared to the RGB dataset.
(3) This strategy still had high accuracy when predicting wheat and corn yields in Qitai County, Xinjiang, and had a certain degree of generalisability. In summary, the improved strategy proposed in this study holds considerable application potential for identifying the spatial distribution of wheat and corn in arid regions.
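The abstract above selects its time window from kNDVI and EVI time series without stating the index formulas. A minimal sketch using the standard definitions: EVI with its usual coefficients, and kNDVI in its common simplified form, tanh(NDVI^2). Inputs are assumed to be surface reflectances on a 0 to 1 scale; whether the authors used exactly these variants is an assumption.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def kndvi(nir, red):
    """Kernel NDVI in its simplified form: tanh(NDVI squared)."""
    return math.tanh(ndvi(nir, red) ** 2)

def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients
    (gain 2.5, C1 = 6, C2 = 7.5, L = 1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
```

Computing both indices per pixel for each acquisition date gives the time series from which a window of maximum crop separability, such as days 156–176, could be read off.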

23 pages, 11034 KB  
Article
UEBNet: A Novel and Compact Instance Segmentation Network for Post-Earthquake Building Assessment Using UAV Imagery
by Ziying Gu, Shumin Wang, Kangsan Yu, Yuanhao Wang and Xuehua Zhang
Remote Sens. 2025, 17(21), 3530; https://doi.org/10.3390/rs17213530 - 24 Oct 2025
Viewed by 305
Abstract
Unmanned aerial vehicle (UAV) remote sensing is critical in assessing post-earthquake building damage. However, intelligent disaster assessment via remote sensing faces formidable challenges from complex backgrounds, substantial scale variations in targets, and diverse spatial disaster dynamics. To address these issues, we propose UEBNet, a high-precision post-earthquake building instance segmentation model that systematically enhances damage recognition by integrating three key modules. Firstly, the Depthwise Separable Convolutional Block Attention Module suppresses background noise that visually resembles damaged structures. This is achieved by expanding the receptive field using multi-scale pooling and dilated convolutions. Secondly, the Multi-feature Fusion Module generates scale-robust feature representations for damaged buildings with significant size differences by processing feature streams from different receptive fields in parallel. Finally, the Adaptive Multi-Scale Interaction Module accurately reconstructs the irregular contours of damaged buildings through an advanced feature alignment mechanism. Extensive experiments were conducted using UAV imagery collected after the Ms 6.8 earthquake in Tingri County, Tibet Autonomous Region, China, on 7 January 2025, and the Ms 6.2 earthquake in Jishishan County, Gansu Province, China, on 18 December 2023. Results indicate that UEBNet enhances segmentation mean Average Precision (mAPseg) and bounding box mean Average Precision (mAPbox) by 3.09% and 2.20%, respectively, with equivalent improvements of 2.65% in F1-score and 1.54% in overall accuracy, outperforming state-of-the-art instance segmentation models. These results demonstrate the effectiveness and reliability of UEBNet in accurately segmenting earthquake-damaged buildings in complex post-disaster scenarios, offering valuable support for emergency response and disaster relief. Full article

25 pages, 4755 KB  
Article
DA-GSGTNet: Dynamic Aggregation Gated Stratified Graph Transformer for Multispectral LiDAR Point Cloud Segmentation
by Qiong Ding, Runyuan Zhang, Alex Hay-Man Ng, Long Tang, Bohua Ling, Dan Wang and Yuelin Hou
Remote Sens. 2025, 17(21), 3515; https://doi.org/10.3390/rs17213515 - 23 Oct 2025
Viewed by 369
Abstract
Multispectral LiDAR point clouds, which integrate both geometric and spectral information, offer rich semantic content for scene understanding. However, due to data scarcity and distributional discrepancies, existing methods often struggle to balance accuracy and efficiency in complex urban environments. To address these challenges, we propose DA-GSGTNet, a novel segmentation framework that integrates Gated Stratified Graph Transformer Blocks (GSGT-Block) with Dynamic Aggregation Transition Down (DATD). The GSGT-Block employs graph convolutions to enhance the local continuity of windowed attention in sparse neighborhoods and adaptively fuses these features via a gating mechanism. The DATD module dynamically adjusts k-NN strides based on point density, while jointly aggregating coordinates and feature vectors to preserve structural integrity during downsampling. Additionally, we introduce a relative position encoding scheme using quantized lookup tables with a Euclidean distance bias to improve recognition of elongated and underrepresented classes. Experimental results on a benchmark multispectral point cloud dataset demonstrate that DA-GSGTNet achieves 86.43% mIoU, 93.74% mAcc, and 90.78% OA, outperforming current state-of-the-art methods. Moreover, by fine-tuning from source-domain pretrained weights and using only ~30% of the training samples (4 regions) and 30% of the training epochs (30 epochs), we achieve over 90% of the full-training segmentation accuracy (100 epochs). These results validate the effectiveness of transfer learning for rapid convergence and efficient adaptation in data-scarce scenarios, offering practical guidance for future multispectral LiDAR applications with limited annotation. Full article

15 pages, 6914 KB  
Article
Deep Learning-Based Inverse Design of Stochastic-Topology Metamaterials for Radar Cross Section Reduction
by Chao Zhang, Chunrong Zou, Shaojun Guo, Yanwen Zhao and Tongsheng Shen
Materials 2025, 18(21), 4841; https://doi.org/10.3390/ma18214841 - 23 Oct 2025
Viewed by 311
Abstract
Electromagnetic (EM) metamaterials have a wide range of applications due to their unique properties, but their design is often based on specific topological structures, which come with certain limitations. Designing with stochastic topologies can provide more diverse EM properties. However, this requires experienced designers to search and optimise in a vast design space, which is time-consuming and requires substantial computational resources. In this paper, we employ a deep learning surrogate model to replace time-consuming full-wave simulations and quickly establish the mapping relationship between the metamaterial structure and its electromagnetic response. The proposed framework integrates a Convolutional Block Attention Module-enhanced Variational Autoencoder (CBAM-VAE) with a Transformer-based predictor. Incorporating CBAM into the VAE architecture significantly enhances the model’s capacity to extract and reconstruct critical structural features of metamaterials. The Transformer predictor utilises an encoder-only configuration that leverages the sequential data characteristics, enabling accurate prediction of electromagnetic responses from latent variables while significantly enhancing computational efficiency. The dataset is randomly generated based on the filling rate of unit cells, requiring only a small fraction of samples compared to the full design space for training. We employ the trained model for the inverse design of metamaterials, enabling the rapid generation of two cells for 1-bit coding metamaterials. Compared to a similarly sized metallic plate, the designed coding metamaterial reduces the radar cross-section (RCS) by over 10 dB from 6 to 18 GHz. Simulation and experimental measurement results validate the reliability of this design approach, providing a novel perspective for the design of EM metamaterials.
(This article belongs to the Section Materials Simulation and Design)

19 pages, 4569 KB  
Article
NeuroNet-AD: A Multimodal Deep Learning Framework for Multiclass Alzheimer’s Disease Diagnosis
by Saeka Rahman, Md Motiur Rahman, Smriti Bhatt, Raji Sundararajan and Miad Faezipour
Bioengineering 2025, 12(10), 1107; https://doi.org/10.3390/bioengineering12101107 - 15 Oct 2025
Viewed by 635
Abstract
Alzheimer’s disease (AD) is the most prevalent form of dementia. This disease significantly impacts cognitive functions and daily activities. Early and accurate diagnosis of AD, including the preliminary stage of mild cognitive impairment (MCI), is critical for effective patient care and treatment development. Although advancements in deep learning (DL) and machine learning (ML) models improve diagnostic precision, the lack of large datasets limits further enhancements, necessitating the use of complementary data. Existing convolutional neural networks (CNNs) process visual features effectively but struggle to fuse multimodal data for AD diagnosis. To address these challenges, we propose NeuroNet-AD, a novel multimodal CNN framework designed to enhance AD classification accuracy. NeuroNet-AD integrates Magnetic Resonance Imaging (MRI) images with clinical text-based metadata, including psychological test scores, demographic information, and genetic biomarkers. In NeuroNet-AD, we incorporate Convolutional Block Attention Modules (CBAMs) within the ResNet-18 backbone, enabling the model to focus on the most informative spatial and channel-wise features. We introduce an attention computation and multimodal fusion module, named Meta Guided Cross Attention (MGCA), which facilitates effective cross-modal alignment between images and meta-features through a multi-head attention mechanism. Additionally, we employ an ensemble-based feature selection strategy to identify the most discriminative features from the textual data, improving model generalization and performance. We evaluate NeuroNet-AD on the Alzheimer’s Disease Neuroimaging Initiative (ADNI1) dataset using subject-level 5-fold cross-validation and a held-out test set to ensure robustness. NeuroNet-AD achieved 98.68% accuracy in multiclass classification of normal control (NC), MCI, and AD and 99.13% accuracy in the binary setting (NC vs. AD) on the ADNI dataset, outperforming state-of-the-art models. External validation on the OASIS-3 dataset further confirmed the model’s generalization ability, achieving 94.10% accuracy in the multiclass setting and 98.67% accuracy in the binary setting, despite variations in demographics and acquisition protocols. Further extensive evaluation studies demonstrate the effectiveness of each component of NeuroNet-AD in improving performance.
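For readers unfamiliar with the attention block named in this abstract, the following is a minimal NumPy sketch of a generic CBAM forward pass: channel attention from a shared two-layer MLP applied to average- and max-pooled descriptors, followed by spatial attention from a 7×7 convolution over channel-wise average and max maps. It is not the NeuroNet-AD implementation. The function names, tensor sizes, reduction ratio `r`, and random weights are all illustrative assumptions, and bias terms are omitted for brevity.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """Reweight channels of x (C, H, W) using a shared MLP (w1: C//r x C, w2: C x C//r)."""
    avg = x.mean(axis=(1, 2))  # (C,) global average pooling
    mx = x.max(axis=(1, 2))    # (C,) global max pooling

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear -> ReLU -> linear

    scale = sigmoid(mlp(avg) + mlp(mx))      # (C,) weights in (0, 1)
    return x * scale[:, None, None]


def spatial_attention(x, kernel):
    """Reweight spatial positions of x (C, H, W) with a (2, k, k) kernel, k odd."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    height, width = x.shape[1:]
    logits = np.empty((height, width))
    for i in range(height):
        for j in range(width):
            # "Same" cross-correlation, as in deep learning frameworks.
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(logits)[None, :, :]


def cbam(x, w1, w2, kernel):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)


# Hypothetical sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
out = cbam(x, w1, w2, kernel)
print(out.shape)
```

Because both attention maps pass through a sigmoid, the block only rescales features by factors between 0 and 1, so the output keeps the input shape and can be dropped between existing convolutional stages.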

19 pages, 5009 KB  
Article
Research on Preventive Maintenance Technology for Highway Cracks Based on Digital Image Processing
by Zhi Chen, Zhuozhuo Bai, Xinqi Chen and Jiuzeng Wang
Electronics 2025, 14(20), 4017; https://doi.org/10.3390/electronics14204017 - 13 Oct 2025
Viewed by 252
Abstract
Cracks are the initial manifestation of various forms of highway pavement distress. Preventive maintenance of cracks can slow the progression of pavement damage and effectively extend the service life of highways. However, existing crack detection methods perform poorly in identifying small cracks and are unable to calculate crack width, leading to unsatisfactory preventive maintenance results. This article proposes an integrated method for crack detection, segmentation, and width calculation based on digital image processing technology. Firstly, based on a convolutional neural network, an optimized crack detection network called CFSSE is proposed by fusing the fast spatial pyramid pooling structure with the squeeze-and-excitation attention mechanism, with an average detection accuracy of 97.10%, average recall rate of 98.00%, and average detection precision at 0.5 threshold of 98.90%; it outperforms the YOLOv5-mobileone network and YOLOv5-s network. Secondly, based on the U-Net network, an optimized crack segmentation network called CBU_Net is proposed by using the CNN-block structure in the encoder module and a bicubic interpolation algorithm in the decoder module, with an average segmentation accuracy of 99.10%, average intersection over union of 88.62%, and average pixel accuracy of 93.56%; it outperforms the U_Net network, DeepLab v3+ network, and optimized DeepLab v3 network. Finally, a laser spot center positioning method based on information entropy combination is proposed to provide an accurate benchmark for crack width calculation based on parallel lasers, with an average error in crack width calculation of less than 2.56%.
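The segmentation figures in this abstract mix accuracy-style and overlap-style metrics, which can differ widely on thin structures such as cracks. The sketch below computes intersection over union and overall pixel accuracy for a binary mask using their common textbook definitions. The abstract does not state the paper's exact definitions, so this is an assumption, and the two 4×4 masks are made-up toy data.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives for two binary masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def iou(pred, truth):
    """Intersection over union of the foreground (crack) class."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0


def pixel_accuracy(pred, truth):
    """Fraction of all pixels labelled correctly, background included."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return (tp + tn) / (tp + fp + fn + tn)


# Toy masks (hypothetical): 1 marks a crack pixel, 0 marks background.
truth = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(iou(pred, truth))             # 0.6
print(pixel_accuracy(pred, truth))  # 0.875
```

With only two misplaced pixels, accuracy stays at 0.875 while IoU falls to 0.6, because the many correctly labelled background pixels count toward accuracy but not toward IoU. This is one plausible reason an abstract can report accuracy near 99% alongside a noticeably lower IoU.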

22 pages, 7434 KB  
Article
A Lightweight Image-Based Decision Support Model for Marine Cylinder Lubrication Based on CNN-ViT Fusion
by Qiuyu Li, Guichen Zhang and Enrui Zhao
J. Mar. Sci. Eng. 2025, 13(10), 1956; https://doi.org/10.3390/jmse13101956 - 13 Oct 2025
Viewed by 282
Abstract
In the context of “Energy Conservation and Emission Reduction,” low-sulfur fuel has become widely adopted in maritime operations, posing significant challenges to cylinder lubrication systems. Traditional oil injection strategies, heavily reliant on manual experience, suffer from instability and high costs. To address this, a lightweight image retrieval model for cylinder lubrication is proposed, leveraging deep learning and computer vision to support oiling decisions based on visual features. The model comprises three components: a backbone network, a feature enhancement module, and a similarity retrieval module. Specifically, EfficientNetB0 serves as the backbone for efficient feature extraction under low computational overhead. MobileViT Blocks are integrated to combine the local feature perception of Convolutional Neural Networks (CNNs) with the global modeling capacity of Transformers. To further improve the receptive field and multi-scale representation, Receptive Field Blocks (RFB) are introduced between the components. Additionally, the Convolutional Block Attention Module (CBAM) enhances focus on salient regions, improving feature discrimination. A high-quality image dataset was constructed using WINNING’s large bulk carriers under various sea conditions. The experimental results demonstrate that the EfficientNetB0 + RFB + MobileViT + CBAM model achieves excellent performance with minimal computational cost: 99.71% Precision, 99.69% Recall, and 99.70% F1-score, improvements of 11.81%, 15.36%, and 13.62%, respectively, over the baseline EfficientNetB0. With an increase of only 0.3 GFLOPs in computation and 8.3 MB in model size, the approach balances accuracy and inference efficiency. The model also demonstrates good robustness and application stability in real-world ship testing, with potential for further adoption in the field of intelligent ship maintenance.
(This article belongs to the Section Ocean Engineering)
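As a quick consistency check on the metrics quoted in this abstract, the sketch below recomputes the F1-score as the harmonic mean of the reported precision and recall. It assumes the reported F1 is derived from those two aggregate values. If the paper averages per-class F1 scores instead, the numbers need not match exactly. The function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, in percent.
precision, recall = 99.71, 99.69
f1 = f1_score(precision, recall)
print(round(f1, 2))  # 99.7
```

The result rounds to 99.70%, which agrees with the F1-score stated in the abstract.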