Search Results (537)

Search Parameters:
Keywords = global attention block

17 pages, 2779 KB  
Article
Image Restoration Based on Semantic Prior Aware Hierarchical Network and Multi-Scale Fusion Generator
by Yapei Feng, Yuxiang Tang and Hua Zhong
Technologies 2025, 13(11), 521; https://doi.org/10.3390/technologies13110521 - 13 Nov 2025
Abstract
As a fundamental low-level vision task, image restoration plays a pivotal role in reconstructing authentic visual information from corrupted inputs, directly impacting the performance of downstream high-level vision systems. Current approaches frequently exhibit two critical limitations: (1) progressive texture degradation and blurring during iterative refinement, particularly for irregular damage patterns, and (2) structural incoherence when handling cross-domain artifacts. To address these challenges, we present a semantic-aware hierarchical network (SAHN) that synergistically integrates multi-scale semantic guidance with structural consistency constraints. Firstly, we construct a Dual-Stream Feature Extractor: based on a modified U-Net backbone with dilated residual blocks, this skip-connected encoder–decoder module simultaneously captures hierarchical semantic contexts and fine-grained texture details. Secondly, we propose a semantic prior mapper that establishes spatial–semantic correspondences between damaged areas and multi-scale features via predefined semantic prototypes and adaptive attention pooling. Additionally, we construct a multi-scale fusion generator by employing cascaded association blocks with structural similarity constraints. This unit progressively aggregates features from different semantic levels using deformable convolution kernels, effectively bridging the gap between global structure and local texture reconstruction. Compared to existing methods, our algorithm attains the highest overall PSNR of 34.99 with the best visual authenticity (lowest FID of 11.56). Comprehensive evaluations on three datasets demonstrate its leading performance in restoring visual realism. Full article
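
The encoder described above relies on dilated residual blocks inside a modified U-Net backbone. As a rough illustration of that building block only (channel count, dilation rate, and layer layout are assumptions, not the authors' exact design), a minimal PyTorch sketch:

import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Residual block with dilated 3x3 convolutions; padding equals the
    dilation rate so spatial size is preserved for the skip connection."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))

x = torch.randn(1, 64, 128, 128)          # dummy feature map
print(DilatedResidualBlock(64)(x).shape)   # torch.Size([1, 64, 128, 128])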

18 pages, 24463 KB  
Article
Multi-Scale Adaptive Modulation Network for Efficient Image Super-Resolution
by Zepeng Liu, Guodong Zhang, Jiya Tian and Ruimin Qi
Electronics 2025, 14(22), 4404; https://doi.org/10.3390/electronics14224404 - 12 Nov 2025
Abstract
As convolutional neural networks (CNNs) become gradually larger and deeper, their applicability in real-time and resource-constrained environments is significantly limited. Furthermore, while self-attention (SA) mechanisms excel at capturing global dependencies, they often emphasize low-frequency information and struggle to represent fine local details. To overcome these limitations, we propose a multi-scale adaptive modulation network (MAMN) for image super-resolution. The MAMN mainly consists of a series of multi-scale adaptive modulation blocks (MAMBs), each of which incorporates a multi-scale adaptive modulation layer (MAML), a local detail extraction layer (LDEL), and two Swin Transformer Layers (STLs). The MAML is designed to capture multi-scale non-local representations, while the LDEL complements this by extracting high-frequency local features. Additionally, the STLs enhance long-range dependency modeling, effectively expanding the receptive field and integrating global contextual information. Extensive experiments demonstrate that the proposed method achieves an optimal trade-off between computational efficiency and reconstruction performance across five benchmark datasets. Full article
(This article belongs to the Special Issue Intelligent Signal Processing and Its Applications)
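
For readers unfamiliar with modulation-style layers, the toy PyTorch sketch below shows the general idea behind multi-scale modulation: pooled multi-scale branches are fused into a gate that rescales the input features. The scales, fusion, and gating below are illustrative assumptions, not the MAML as published.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleModulation(nn.Module):
    """Toy multi-scale modulation: pool the input at several scales,
    project each branch, upsample back, and use the fused map to
    modulate (element-wise rescale) the original features."""
    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.proj = nn.ModuleList([nn.Conv2d(channels, channels, 1) for _ in scales])
        self.fuse = nn.Conv2d(channels * len(scales), channels, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        branches = []
        for s, proj in zip(self.scales, self.proj):
            y = F.avg_pool2d(x, kernel_size=s) if s > 1 else x
            y = proj(y)
            branches.append(F.interpolate(y, size=(h, w), mode="nearest"))
        gate = torch.sigmoid(self.fuse(torch.cat(branches, dim=1)))
        return x * gate   # modulated features

print(MultiScaleModulation(32)(torch.randn(1, 32, 48, 48)).shape)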

19 pages, 3290 KB  
Article
Multi-Granularity Content-Aware Network with Semantic Integration for Unsupervised Anomaly Detection
by Xinyu Guo, Shihui Zhao, Jianbin Xue, Dongdong Liu, Xinyang Han, Shuai Zhang and Yufeng Zhang
Appl. Sci. 2025, 15(21), 11842; https://doi.org/10.3390/app152111842 - 6 Nov 2025
Viewed by 313
Abstract
Unsupervised anomaly detection has been widely applied to industrial scenarios. Recently, transformer-based methods have also been developed and have achieved good performance. Although the global dependencies in anomaly images are considered, the typical patch partition strategy in the vanilla self-attention mechanism ignores the content consistencies in anomaly defects or normal regions. To sufficiently exploit the content consistency in images, we propose the multi-granularity content-aware network with semantic integration (MGCA-Net), in which superpixel segmentation is introduced into feature space to divide images according to their spatial structures. Specifically, we adopt a pre-trained ResNet as the encoder to extract features. Then, we design content-aware attention blocks (CAABs) to capture the global information in features at different granularities. In these blocks, we impose superpixel segmentation on the features from the encoder and employ the superpixels as tokens for learning global relationships. Because the superpixels are divided according to their content consistencies, the spatial structures of objects in anomalous or normal regions are preserved. Meanwhile, the multi-granularity semantic integration block is devised to further integrate the global information of all granularities. Next, we use semantic-guided fusion blocks (SGFBs) to progressively upsample the features with the help of CAABs. Finally, the differences between the outputs of CAABs and SGFBs are calculated and merged to predict the anomaly defects. Thanks to this preservation of object content consistency, the proposed MGCA-Net achieves superior anomaly detection performance: experimental results on two benchmark datasets demonstrate that it outperforms state-of-the-art methods. Full article
(This article belongs to the Topic Intelligent Image Processing Technology)
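
The core trick of the CAAB, using superpixels rather than square patches as attention tokens, can be sketched in PyTorch as follows; the pooling, token attention, and random toy label map are assumptions standing in for the paper's actual superpixel segmentation and block design.

import torch
import torch.nn as nn

def superpixel_tokens(feat, labels, num_sp):
    """Average-pool features over each superpixel to obtain one token per
    superpixel. feat: (C, H, W), labels: (H, W) with values in [0, num_sp)."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)                       # (C, H*W)
    idx = labels.reshape(-1)                         # (H*W,)
    sums = torch.zeros(num_sp, c).index_add_(0, idx, flat.t())
    counts = torch.zeros(num_sp).index_add_(0, idx, torch.ones_like(idx, dtype=torch.float))
    return sums / counts.clamp(min=1).unsqueeze(1)   # (num_sp, C)

# Toy content-aware attention over superpixel tokens (not the paper's exact CAAB).
feat = torch.randn(16, 32, 32)
labels = torch.randint(0, 50, (32, 32))              # stand-in for a superpixel map
tokens = superpixel_tokens(feat, labels, 50).unsqueeze(0)    # (1, 50, 16)
attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)
out, _ = attn(tokens, tokens, tokens)                # global relations among superpixels
print(out.shape)                                     # torch.Size([1, 50, 16])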

22 pages, 9577 KB  
Article
YOLOv11-4ConvNeXtV2: Enhancing Persimmon Ripeness Detection Under Visual Challenges
by Bohan Zhang, Zhaoyuan Zhang and Xiaodong Zhang
AI 2025, 6(11), 284; https://doi.org/10.3390/ai6110284 - 1 Nov 2025
Viewed by 508
Abstract
Reliable and efficient detection of persimmons provides the foundation for precise maturity evaluation. Persimmon ripeness detection remains challenging due to small target sizes, frequent occlusion by foliage, and motion- or focus-induced blur that degrades edge information. This study proposes YOLOv11-4ConvNeXtV2, an enhanced detection framework that integrates a ConvNeXtV2 backbone with Fully Convolutional Masked Auto-Encoder (FCMAE) pretraining, Global Response Normalization (GRN), and Single-Head Self-Attention (SHSA) mechanisms. We present a comprehensive persimmon dataset featuring sub-block segmentation that preserves local structural integrity while expanding dataset diversity. The model was trained on 4921 annotated images (original 703 + 6 × 703 augmented) collected under diverse orchard conditions and optimized for 300 epochs using the Adam optimizer with early stopping. Comprehensive experiments demonstrate that YOLOv11-4ConvNeXtV2 achieves 95.9% precision and 83.7% recall, with mAP@0.5 of 88.4% and mAP@0.5:0.95 of 74.8%, outperforming state-of-the-art YOLO variants (YOLOv5n, YOLOv8n, YOLOv9t, YOLOv10n, YOLOv11n, YOLOv12n) by 3.8–6.3 percentage points in mAP@0.5:0.95. The model demonstrates superior robustness to blur, occlusion, and varying illumination conditions, making it suitable for deployment in challenging maturity detection environments. Full article
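
Global Response Normalization, one of the ingredients of the ConvNeXtV2 backbone used here, recalibrates channels from a global per-channel response statistic. A minimal channels-last sketch following the ConvNeXtV2 formulation as commonly described (zero initialization and the epsilon are assumptions):

import torch
import torch.nn as nn

class GRN(nn.Module):
    """Global Response Normalization: aggregate a global per-channel response,
    divisively normalize it across channels, and use it to recalibrate the
    features. Expects channels-last tensors of shape (N, H, W, C)."""
    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1, 1, 1, dim))
        self.beta = nn.Parameter(torch.zeros(1, 1, 1, dim))

    def forward(self, x):
        gx = torch.norm(x, p=2, dim=(1, 2), keepdim=True)        # (N, 1, 1, C) global response
        nx = gx / (gx.mean(dim=-1, keepdim=True) + 1e-6)          # divisive normalization
        return self.gamma * (x * nx) + self.beta + x              # residual recalibration

print(GRN(64)(torch.randn(2, 14, 14, 64)).shape)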

22 pages, 361 KB  
Article
Blocking Migration: The Underside of European Politics
by Peter O’Brien
Populations 2025, 1(4), 23; https://doi.org/10.3390/populations1040023 - 30 Oct 2025
Viewed by 603
Abstract
This article examines European policies blocking migration. It outlines a theory of borders and bordering that conceptualizes both as being far more complex and consequential than the mere regulation of conventional national frontiers. Although due attention is paid to efforts at the formal frontiers of Europe, the bulk of the analysis focuses on the effective externalization of Europe’s borders into African and Asian states that European governments pay (in kind or cash) to stop migrants from ever reaching Europe’s shores. The essay goes on to introduce the notion of Anglo-European hegemony to explain why postcolonial states, despite having achieved formal independence from colonial rule, continue to contribute to and even emulate patterns of blocking migration that originate in the Global North. Blocked migration casts doubt on Europe’s democratic credentials—so much so that efforts to reduce, end or evade blocked migration should be reinterpreted as necessary steps in the ongoing decolonization and democratization of European politics. Full article
29 pages, 8732 KB  
Article
MFF-ClassificationNet: CNN-Transformer Hybrid with Multi-Feature Fusion for Breast Cancer Histopathology Classification
by Xiaoli Wang, Guowei Wang, Luhan Li, Hua Zou and Junpeng Cui
Biosensors 2025, 15(11), 718; https://doi.org/10.3390/bios15110718 - 29 Oct 2025
Viewed by 411
Abstract
Breast cancer is one of the most prevalent malignant tumors among women worldwide, underscoring the urgent need for early and accurate diagnosis to reduce mortality. To address this, a Multi-Feature Fusion Classification Network (MFF-ClassificationNet) is proposed for breast histopathological image classification. The network adopts a two-branch parallel architecture, where a convolutional neural network captures local details and a Transformer models global dependencies. Their features are deeply integrated through a Multi-Feature Fusion module, which incorporates a Convolutional Block Attention Module–Squeeze and Excitation (CBAM-SE) fusion block combining convolutional block attention, squeeze-and-excitation mechanisms, and a residual inverted multilayer perceptron to enhance fine-grained feature representation and category-specific lesion characterization. Experimental evaluations on the BreakHis dataset achieved accuracies of 98.30%, 97.62%, 98.81%, and 96.07% at magnifications of 40×, 100×, 200×, and 400×, respectively, while an accuracy of 97.50% was obtained on the BACH dataset. These results confirm that integrating local and global features significantly strengthens the model’s ability to capture multi-scale and context-aware information, leading to superior classification performance. Overall, MFF-ClassificationNet surpasses conventional single-path approaches and provides a robust, generalizable framework for advancing computer-aided diagnosis of breast cancer. Full article
(This article belongs to the Special Issue AI-Based Biosensors and Biomedical Imaging)
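
Since the fusion block builds on convolutional block attention and squeeze-and-excitation, a compact sketch of CBAM-style channel and spatial gating may help; the reduction ratio, kernel size, and sequential ordering below are illustrative assumptions rather than the exact CBAM-SE module.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CBAM-style channel attention: squeeze with both average and max
    pooling, share an MLP, and gate the channels."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        return x * torch.sigmoid(avg + mx).view(n, c, 1, 1)

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: pool across channels and produce a
    single-channel gate map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

x = torch.randn(2, 32, 28, 28)
print(SpatialAttention()(ChannelAttention(32)(x)).shape)   # channel then spatial gating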

15 pages, 2225 KB  
Article
An Automatic Pixel-Level Segmentation Method for Coal-Crack CT Images Based on U2-Net
by Yimin Zhang, Chengyi Wu, Jinxia Yu, Guoqiang Wang and Yingying Li
Electronics 2025, 14(21), 4179; https://doi.org/10.3390/electronics14214179 - 26 Oct 2025
Viewed by 325
Abstract
Automatically segmenting coal cracks in CT images is crucial for 3D reconstruction and for analyzing the physical properties of mines. This paper proposes an automatic pixel-level deep learning method called Attention Double U2-Net to enhance the segmentation accuracy of coal cracks in CT images. Due to the lack of public datasets of coal CT images, a pixel-level labeled coal crack dataset is first established through industrial CT scanning experiments and post-processing. Then, the proposed method utilizes a Double Residual U-Block structure (DRSU) based on U2-Net to improve feature extraction and fusion capabilities. Moreover, an attention mechanism module called the Atrous Asymmetric Fusion Non-Local Block (AAFNB) is proposed. The AAFNB module is based on the idea of Asymmetric Non-Local attention, which enables the collection of global information to enhance the segmentation results. Compared with previous state-of-the-art models, the proposed Attention Double U2-Net method exhibits better performance on the coal crack CT image dataset across various evaluation metrics such as PA, mPA, MIoU, IoU, Precision, Recall, and Dice scores. The crack segmentation results obtained with this method are more accurate and efficient, providing experimental data and theoretical support for CBM exploration and the study of coal damage. Full article
(This article belongs to the Section Artificial Intelligence)
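
The AAFNB follows the asymmetric non-local idea of pooling keys and values so the global affinity matrix stays affordable. A rough PyTorch sketch of that pattern (pooled size, projections, and scaling are assumptions, not the module as implemented in the paper):

import torch
import torch.nn as nn
import torch.nn.functional as F

class AsymmetricNonLocal(nn.Module):
    """Non-local (global attention) block in the asymmetric style: queries keep
    full resolution while keys/values are pooled, cutting the cost of the
    affinity matrix from (HW)^2 to HW * S^2, with S the pooled size."""
    def __init__(self, channels: int, pooled: int = 8):
        super().__init__()
        inner = channels // 2
        self.q = nn.Conv2d(channels, inner, 1)
        self.k = nn.Conv2d(channels, inner, 1)
        self.v = nn.Conv2d(channels, inner, 1)
        self.out = nn.Conv2d(inner, channels, 1)
        self.pooled = pooled

    def forward(self, x):
        n, _, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)                      # (N, HW, C')
        k = F.adaptive_avg_pool2d(self.k(x), self.pooled).flatten(2)  # (N, C', S*S)
        v = F.adaptive_avg_pool2d(self.v(x), self.pooled).flatten(2).transpose(1, 2)
        attn = torch.softmax(q @ k / k.shape[1] ** 0.5, dim=-1)       # (N, HW, S*S)
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                                        # residual fusion

print(AsymmetricNonLocal(64)(torch.randn(1, 64, 32, 32)).shape)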

22 pages, 1087 KB  
Article
Modeling the Internal and Contextual Attention for Self-Supervised Skeleton-Based Action Recognition
by Wentian Xin, Yue Teng, Jikang Zhang, Yi Liu, Ruyi Liu, Yuzhi Hu and Qiguang Miao
Sensors 2025, 25(21), 6532; https://doi.org/10.3390/s25216532 - 23 Oct 2025
Viewed by 430
Abstract
Multimodal contrastive learning has achieved significant performance advantages in self-supervised skeleton-based action recognition. Previous methods are limited by modality imbalance, which reduces alignment accuracy and makes it difficult to combine important spatial–temporal frequency patterns, leading to confusion between modalities and weaker feature representations. To overcome these problems, we explore intra-modality feature-wise self-similarity and inter-modality instance-wise cross-consistency, and discover two inherent correlations that benefit recognition: (i) Global Perspective expresses how action semantics carry a broad and high-level understanding, which supports the use of globally discriminative feature representations. (ii) Focus Adaptation refers to the role of the frequency spectrum in guiding attention toward key joints by emphasizing compact and salient signal patterns. Building upon these insights, we propose a novel language–skeleton contrastive learning framework comprising two key components: (a) Feature Modulation, which constructs a skeleton–language action conceptual domain to minimize the expected information gain between vision and language modalities. (b) Frequency Feature Learning, which introduces a Frequency-domain Spatial–Temporal block (FreST) that focuses on sparse key human joints in the frequency domain with compact signal energy. Extensive experiments demonstrate that our method achieves remarkable action recognition performance on widely used benchmark datasets, including NTU RGB+D 60 and NTU RGB+D 120. Especially on the challenging PKU-MMD dataset, MICA has achieved at least a 4.6% improvement over classical methods such as CrosSCLR and AimCLR, effectively demonstrating its ability to capture internal and contextual attention information. Full article
(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)
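
A frequency-domain temporal block of the kind described for FreST can be approximated by an FFT over the time axis followed by a learnable spectral filter; the sketch below is a toy version with an assumed tensor layout and a simple complex-valued gate, not the authors' design.

import torch
import torch.nn as nn

class FrequencyTemporalBlock(nn.Module):
    """Toy frequency-domain temporal block: move a joint sequence to the
    frequency domain with an FFT over time, apply a learnable complex-valued
    filter (so compact, salient frequency components can be emphasized),
    and transform back. Shapes: (N, T, J, C) = batch, frames, joints, channels."""
    def __init__(self, num_frames: int):
        super().__init__()
        bins = num_frames // 2 + 1                      # rfft output length
        self.filt = nn.Parameter(torch.ones(bins, 1, 1, dtype=torch.cfloat))

    def forward(self, x):
        spec = torch.fft.rfft(x, dim=1)                 # (N, bins, J, C), complex
        spec = spec * self.filt                         # learnable spectral gating
        return torch.fft.irfft(spec, n=x.shape[1], dim=1)

x = torch.randn(2, 64, 25, 3)                           # 64 frames, 25 joints, xyz
print(FrequencyTemporalBlock(64)(x).shape)              # torch.Size([2, 64, 25, 3])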

22 pages, 2027 KB  
Article
Agri-DSSA: A Dual Self-Supervised Attention Framework for Multisource Crop Health Analysis Using Hyperspectral and Image-Based Benchmarks
by Fatema A. Albalooshi
AgriEngineering 2025, 7(10), 350; https://doi.org/10.3390/agriengineering7100350 - 17 Oct 2025
Viewed by 410
Abstract
Recent advances in hyperspectral imaging (HSI) and multimodal deep learning have opened new opportunities for crop health analysis; however, most existing models remain limited by dataset scope, lack of interpretability, and weak cross-domain generalization. To overcome these limitations, this study introduces Agri-DSSA, a novel Dual Self-Supervised Attention (DSSA) framework that simultaneously models spectral and spatial dependencies through two complementary self-attention branches. The proposed architecture enables robust and interpretable feature learning across heterogeneous data sources, facilitating the estimation of spectral proxies of chlorophyll content, plant vigor, and disease stress indicators rather than direct physiological measurements. Experiments were performed on seven publicly available benchmark datasets encompassing diverse spectral and visual domains: three hyperspectral datasets (Indian Pines with 16 classes and 10,366 labeled samples; Pavia University with 9 classes and 42,776 samples; and Kennedy Space Center with 13 classes and 5211 samples), two plant disease datasets (PlantVillage with 54,000 labeled leaf images covering 38 diseases across 14 crop species, and the New Plant Diseases dataset with over 30,000 field images captured under natural conditions), and two chlorophyll content datasets (the Global Leaf Chlorophyll Content Dataset (GLCC), derived from MERIS and OLCI satellite data between 2003 and 2020, and the Leaf Chlorophyll Content Dataset for Crops, which includes paired spectrophotometric and multispectral measurements collected from multiple crop species). To ensure statistical rigor and spatial independence, a block-based spatial cross-validation scheme was employed across five independent runs with fixed random seeds. Model performance was evaluated using R², RMSE, F1-score, AUC-ROC, and AUC-PR, each reported as mean ± standard deviation with 95% confidence intervals. Results show that Agri-DSSA consistently outperforms baseline models (PLSR, RF, 3D-CNN, and HybridSN), achieving up to R² = 0.86 for chlorophyll content estimation and F1-scores above 0.95 for plant disease detection. The attention distributions highlight physiologically meaningful spectral regions (550–710 nm) associated with chlorophyll absorption, confirming the interpretability of the model’s learned representations. This study serves as a methodological foundation for UAV-based and field-deployable crop monitoring systems. By unifying hyperspectral, chlorophyll, and visual disease datasets, Agri-DSSA provides an interpretable and generalizable framework for proxy-based vegetation stress estimation. Future work will extend the model to real UAV campaigns and in-field spectrophotometric validation to achieve full agronomic reliability. Full article
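
The block-based spatial cross-validation mentioned above can be made concrete with a small utility that tiles a scene and assigns whole tiles to folds, so that train and test pixels are never spatial neighbours. The tile size and fold assignment below are illustrative (the 145 × 145 grid matches Indian Pines, but the authors' exact blocking scheme is not specified here).

import numpy as np

def block_spatial_folds(height, width, block=32, n_folds=5, seed=0):
    """Assign every pixel of an H x W scene to a spatial fold by tiling the
    image into block x block tiles and distributing whole tiles across folds."""
    rows = (np.arange(height) // block)[:, None]
    cols = (np.arange(width) // block)[None, :]
    tile_id = rows * ((width + block - 1) // block) + cols       # (H, W) tile index
    rng = np.random.default_rng(seed)
    fold_of_tile = rng.integers(0, n_folds, tile_id.max() + 1)
    return fold_of_tile[tile_id]                                  # (H, W) fold index per pixel

folds = block_spatial_folds(145, 145, block=29, n_folds=5)        # e.g. an Indian Pines-sized scene
test_mask = folds == 0                                            # pixels held out in fold 0
print(test_mask.mean())                                           # share of held-out pixels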

14 pages, 1149 KB  
Article
Modality Information Aggregation Graph Attention Network with Adversarial Training for Multi-Modal Knowledge Graph Completion
by Hankiz Yilahun, Elyar Aili, Seyyare Imam and Askar Hamdulla
Information 2025, 16(10), 907; https://doi.org/10.3390/info16100907 - 16 Oct 2025
Viewed by 330
Abstract
Multi-modal knowledge graph completion (MMKGC) aims to complete knowledge graphs by integrating structural information with multi-modal (e.g., visual, textual, and numerical) features and leveraging cross-modal reasoning within a unified semantic space to infer and supplement missing factual knowledge. Current MMKGC methods have advanced in terms of integrating multi-modal information but have overlooked the imbalance in modality importance for target entities. Treating all modalities equally dilutes critical semantics and amplifies irrelevant information, which in turn limits the semantic understanding and predictive performance of the model. To address these limitations, we propose a modality information aggregation graph attention network with adversarial training for multi-modal knowledge graph completion (MIAGAT-AT). MIAGAT-AT focuses on hierarchically modeling complex cross-modal interactions. By combining the multi-head attention mechanism with modality-specific projection methods, it precisely captures global semantic dependencies and dynamically adjusts the weight of modality embeddings according to the importance of each modality, thereby optimizing cross-modal information fusion capabilities. Moreover, through the use of random noise and multi-layer residual blocks, the adversarial training generates high-quality multi-modal feature representations, thereby effectively enhancing information from imbalanced modalities. Experimental results demonstrate that our approach significantly outperforms 18 existing baselines and establishes a strong performance baseline across three distinct datasets. Full article
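
The idea of weighting modalities by importance before fusion, rather than treating them equally, can be sketched as a small attention head over modality embeddings; the dimensions, modality names, and scoring function below are assumptions, not the MIAGAT-AT architecture.

import torch
import torch.nn as nn

class ModalityAggregation(nn.Module):
    """Toy modality aggregation: project each modality embedding into a shared
    space, score its importance with a small attention head, and fuse with the
    resulting softmax weights so uninformative modalities are down-weighted."""
    def __init__(self, dims: dict, hidden: int = 128):
        super().__init__()
        self.proj = nn.ModuleDict({m: nn.Linear(d, hidden) for m, d in dims.items()})
        self.score = nn.Linear(hidden, 1)

    def forward(self, feats: dict):
        names = list(feats)
        z = torch.stack([self.proj[m](feats[m]) for m in names], dim=1)   # (N, M, H)
        w = torch.softmax(self.score(torch.tanh(z)), dim=1)               # (N, M, 1)
        return (w * z).sum(dim=1), {m: w[:, i, 0] for i, m in enumerate(names)}

feats = {"structural": torch.randn(4, 200), "visual": torch.randn(4, 512), "textual": torch.randn(4, 768)}
fused, weights = ModalityAggregation({m: f.shape[1] for m, f in feats.items()})(feats)
print(fused.shape, {m: w.shape for m, w in weights.items()})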

22 pages, 7434 KB  
Article
A Lightweight Image-Based Decision Support Model for Marine Cylinder Lubrication Based on CNN-ViT Fusion
by Qiuyu Li, Guichen Zhang and Enrui Zhao
J. Mar. Sci. Eng. 2025, 13(10), 1956; https://doi.org/10.3390/jmse13101956 - 13 Oct 2025
Viewed by 321
Abstract
Under the context of “Energy Conservation and Emission Reduction,” low-sulfur fuel has become widely adopted in maritime operations, posing significant challenges to cylinder lubrication systems. Traditional oil injection strategies, heavily reliant on manual experience, suffer from instability and high costs. To address this, a lightweight image retrieval model for cylinder lubrication is proposed, leveraging deep learning and computer vision to support oiling decisions based on visual features. The model comprises three components: a backbone network, a feature enhancement module, and a similarity retrieval module. Specifically, EfficientNetB0 serves as the backbone for efficient feature extraction under low computational overhead. MobileViT Blocks are integrated to combine the local feature perception of Convolutional Neural Networks (CNNs) with the global modeling capacity of Transformers. To further improve the receptive field and multi-scale representation, Receptive Field Blocks (RFB) are introduced between these components. Additionally, the Convolutional Block Attention Module (CBAM) attention mechanism enhances focus on salient regions, improving feature discrimination. A high-quality image dataset was constructed using WINNING’s large bulk carriers under various sea conditions. The experimental results demonstrate that the EfficientNetB0 + RFB + MobileViT + CBAM model achieves excellent performance with minimal computational cost: 99.71% Precision, 99.69% Recall, and 99.70% F1-score, improvements of 11.81%, 15.36%, and 13.62%, respectively, over the baseline EfficientNetB0. With an increase of only 0.3 GFLOPs in computation and 8.3 MB in model size, the approach balances accuracy and inference efficiency. The model also demonstrates good robustness and application stability in real-world ship testing, with potential for further adoption in the field of intelligent ship maintenance. Full article
(This article belongs to the Section Ocean Engineering)
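
The similarity retrieval module boils down to nearest-neighbour search over embedding vectors. A minimal cosine-similarity sketch (the 1280-d size matches EfficientNet-B0's pooled features; the gallery and query tensors are stand-ins):

import torch
import torch.nn.functional as F

def retrieve(query_emb: torch.Tensor, gallery_emb: torch.Tensor, k: int = 5):
    """Cosine-similarity retrieval: L2-normalize embeddings so a matrix
    product gives cosine scores, then return the top-k gallery entries
    for each query (the retrieved images' known oiling settings would
    then back the injection decision)."""
    q = F.normalize(query_emb, dim=1)
    g = F.normalize(gallery_emb, dim=1)
    scores = q @ g.t()                          # (num_queries, num_gallery)
    return scores.topk(k, dim=1)

# Stand-in embeddings, e.g. pooled features from the backbone described above.
gallery = torch.randn(1000, 1280)               # 1280-d, as from EfficientNet-B0 pooling
query = torch.randn(2, 1280)
values, indices = retrieve(query, gallery, k=3)
print(indices)                                   # indices of the most similar images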

19 pages, 8850 KB  
Article
Intelligent Defect Recognition of Glazed Components in Ancient Buildings Based on Binocular Vision
by Youshan Zhao, Xiaolan Zhang, Ming Guo, Haoyu Han, Jiayi Wang, Yaofeng Wang, Xiaoxu Li and Ming Huang
Buildings 2025, 15(20), 3641; https://doi.org/10.3390/buildings15203641 - 10 Oct 2025
Viewed by 245
Abstract
Glazed components in ancient Chinese architecture hold profound historical and cultural value. However, over time, environmental erosion, physical impacts, and human disturbances gradually lead to various forms of damage, severely impacting the durability and stability of the buildings. Therefore, preventive protection of glazed components is crucial. The key to preventive protection lies in the early detection and repair of damage, thereby extending the component’s service life and preventing significant structural damage. To address this challenge, this study proposes a Restoration-Scale Identification (RSI) method that integrates depth information. By combining RGB-D images acquired from a depth camera with intrinsic camera parameters, and embedding a Convolutional Block Attention Module (CBAM) into the backbone network, the method dynamically enhances critical feature regions. It then employs a scale restoration strategy to accurately identify damage areas and recover the physical dimensions of glazed components from a global perspective. In addition, we constructed a dedicated semantic segmentation dataset for glazed tile damage, focusing on cracks and spalling. Both qualitative and quantitative evaluation results demonstrate that, compared with various high-performance semantic segmentation methods, our approach significantly improves the accuracy and robustness of damage detection in glazed components. The achieved accuracy deviates by only ±10 mm from high-precision laser scanning, a level of precision that is essential for reliably identifying and assessing subtle damage in complex glazed architectural elements. By integrating depth information, real-scale information can be obtained during the intelligent recognition process, so that the type and size of damage to glazed components are identified efficiently and accurately and two-dimensional (2D) pixel coordinates are converted to local three-dimensional (3D) coordinates. This provides a scientific basis for the protection and restoration of ancient buildings and helps ensure the long-term stability of cultural heritage and the transmission of its historical value. Full article
(This article belongs to the Section Building Materials, and Repair & Renovation)
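
The conversion from 2D pixel coordinates to local 3D coordinates that the RSI method relies on is the standard pinhole back-projection using depth and camera intrinsics; the sketch below uses made-up intrinsics and pixel values purely for illustration.

import numpy as np

def pixel_to_camera_xyz(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with metric depth into local 3D camera
    coordinates using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    The intrinsics are illustrative values, not those of the camera in the study."""
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Example: a crack pixel at (812, 430) observed 2.4 m away.
point = pixel_to_camera_xyz(812, 430, 2.4, fx=910.0, fy=910.0, cx=640.0, cy=360.0)
print(point)                     # local 3D coordinates in metres

# The physical extent of a damage region follows from two back-projected corners.
p1 = pixel_to_camera_xyz(812, 430, 2.4, 910.0, 910.0, 640.0, 360.0)
p2 = pixel_to_camera_xyz(890, 430, 2.4, 910.0, 910.0, 640.0, 360.0)
print(np.linalg.norm(p2 - p1))   # approximate crack width in metres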

19 pages, 4133 KB  
Article
FLOW-GLIDE: Global–Local Interleaved Dynamics Estimator for Flow Field Prediction
by Jinghan Su, Li Xiao and Jingyu Wang
Appl. Sci. 2025, 15(19), 10834; https://doi.org/10.3390/app151910834 - 9 Oct 2025
Viewed by 277
Abstract
Accurate prediction of the flow field is crucial to evaluating the aerodynamic performance of an aircraft. While traditional computational fluid dynamics (CFD) methods solve the governing equations to capture both global flow structures and localized gradients, they are computationally intensive. Deep learning-based surrogate models offer a promising alternative, yet often struggle to simultaneously model long-range dependencies and near-wall flow gradients with sufficient fidelity. To address this challenge, this paper introduces the Message-passing And Global-attention block (MAG-BLOCK), a graph neural network module that combines local message passing with global self-attention mechanisms to jointly learn fine-scale features and large-scale flow patterns. Building on MAG-BLOCK, we propose FLOW-GLIDE, a cross-architecture deep learning framework that learns a mapping from initial conditions to steady-state flow fields in a latent space. Evaluated on the AirfRANS dataset, FLOW-GLIDE outperforms existing models on key performance metrics. Specifically, it reduces the error in the volumetric flow field by 62% and surface pressure prediction by 82% compared to the state-of-the-art. Full article
(This article belongs to the Section Fluid Science and Technology)
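
A block that interleaves local message passing with global self-attention, in the spirit of MAG-BLOCK, might look like the following toy PyTorch module; the aggregation rule, normalization, and random mesh connectivity are assumptions, not the published design.

import torch
import torch.nn as nn

class MessagePassingGlobalAttention(nn.Module):
    """Toy interleaving of local message passing and global self-attention over
    mesh nodes: neighbours exchange information first, then every node attends
    to every other node to capture long-range flow structure."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, edges):
        # x: (N, D) node features; edges: (E, 2) directed pairs (src, dst)
        src, dst = edges[:, 0], edges[:, 1]
        messages = self.msg(torch.cat([x[src], x[dst]], dim=-1))     # per-edge messages
        agg = torch.zeros_like(x).index_add_(0, dst, messages)       # sum onto receivers
        x = self.norm(x + agg)                                       # local update
        glob, _ = self.attn(x[None], x[None], x[None])               # global attention
        return self.norm(x + glob[0])

x = torch.randn(500, 64)                      # 500 mesh nodes
edges = torch.randint(0, 500, (3000, 2))      # random connectivity stand-in
print(MessagePassingGlobalAttention(64)(x, edges).shape)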

22 pages, 5361 KB  
Article
LMVMamba: A Hybrid U-Shape Mamba for Remote Sensing Segmentation with Adaptation Fine-Tuning
by Fan Li, Xiao Wang, Haochen Wang, Hamed Karimian, Juan Shi and Guozhen Zha
Remote Sens. 2025, 17(19), 3367; https://doi.org/10.3390/rs17193367 - 5 Oct 2025
Viewed by 901
Abstract
High-precision semantic segmentation of remote sensing imagery is crucial in geospatial analysis. It plays an immeasurable role in fields such as urban governance, environmental monitoring, and natural resource management. However, when confronted with complex objects (such as winding roads and dispersed buildings), existing semantic segmentation methods still suffer from inadequate target recognition capabilities and multi-scale representation issues. This paper proposes a neural network model, LMVMamba (LoRA Multi-scale Vision Mamba), for semantic segmentation of remote sensing images. This model integrates the advantages of convolutional neural networks (CNNs), Transformers, and state-space models (Mamba) with a multi-scale feature fusion strategy. It simultaneously captures global contextual information and fine-grained local features. Specifically, in the encoder stage, the ResT Transformer serves as the backbone network, employing a LoRA fine-tuning strategy to effectively enhance model accuracy by training only the introduced low-rank matrix pairs. The extracted features are then passed to the decoder, where a U-shaped Mamba decoder is designed. In this stage, a Multi-Scale Post-processing Block (MPB) is introduced, consisting of depthwise separable convolutions and residual concatenation. This block effectively extracts multi-scale features and enhances local detail extraction after the VSS block. Additionally, a Local Enhancement and Fusion Attention Module (LAS) is added at the end of each decoder block. LAS integrates the SimAM attention mechanism, further enhancing the model’s multi-scale feature fusion capability and local detail segmentation capability. Through extensive comparative experiments, it was found that LMVMamba achieves superior performance on the OpenEarthMap dataset (mIoU 52.3%, OA 69.8%, mF1: 68.0%) and LoveDA (mIoU 67.9%, OA 80.3%, mF1: 80.5%) datasets. Ablation experiments validated the effectiveness of each module. The final results indicate that this model is highly suitable for high-precision land-cover classification tasks in remote sensing imagery. LMVMamba provides an effective solution for precise semantic segmentation of high-resolution remote sensing imagery. Full article
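
The LoRA fine-tuning strategy, training only an introduced low-rank matrix pair while the pretrained weights stay frozen, can be illustrated on a single linear layer; the rank, scaling, and initialization below are typical defaults, not the settings used in LMVMamba.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA-style adapter around a frozen linear layer: the pretrained weight
    stays fixed and only the low-rank pair A (d_in x r) and B (r x d_out) is
    trained, adding scale * x @ A @ B to the frozen output."""
    def __init__(self, base: nn.Linear, r: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                       # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(base.in_features, r) * 0.01)
        self.B = nn.Parameter(torch.zeros(r, base.out_features))
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A @ self.B)

layer = LoRALinear(nn.Linear(256, 256))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # 2 * 256 * 8 = 4096 trainable parameters vs 65,792 frozen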

27 pages, 3948 KB  
Article
Fully Automated Segmentation of Cervical Spinal Cord in Sagittal MR Images Using Swin-Unet Architectures
by Rukiye Polattimur, Emre Dandıl, Mehmet Süleyman Yıldırım and Utku Şenol
J. Clin. Med. 2025, 14(19), 6994; https://doi.org/10.3390/jcm14196994 - 2 Oct 2025
Cited by 1 | Viewed by 693
Abstract
Background/Objectives: The spinal cord is a critical component of the central nervous system that transmits neural signals between the brain and the body’s peripheral regions through its nerve roots. Despite being partially protected by the vertebral column, the spinal cord remains highly vulnerable to trauma, tumors, infections, and degenerative or inflammatory disorders. These conditions can disrupt neural conduction, resulting in severe functional impairments, such as paralysis, motor deficits, and sensory loss. Therefore, accurate and comprehensive spinal cord segmentation is essential for characterizing its structural features and evaluating neural integrity. Methods: In this study, we propose a fully automated method for segmentation of the cervical spinal cord in sagittal magnetic resonance (MR) images. This method facilitates rapid clinical evaluation and supports early diagnosis. Our approach uses a Swin-Unet architecture, which integrates vision transformer blocks into the U-Net framework. This enables the model to capture both local anatomical details and global contextual information. This design improves the delineation of the thin, curved, low-contrast cervical cord, resulting in more precise and robust segmentation. Results: In experimental studies, the proposed Swin-Unet model (SWU1), which uses transformer blocks in the encoder layer, achieved Dice Similarity Coefficient (DSC) and Hausdorff Distance 95 (HD95) scores of 0.9526 and 1.0707 mm, respectively, for cervical spinal cord segmentation. These results confirm that the model can consistently deliver precise, pixel-level delineations that are structurally accurate, which supports its reliability for clinical assessment. Conclusions: The attention-enhanced Swin-Unet architecture demonstrated high accuracy in segmenting thin and complex anatomical structures, such as the cervical spinal cord. Its ability to generalize with limited data highlights its potential for integration into clinical workflows to support diagnosis, monitoring, and treatment planning. Full article
(This article belongs to the Special Issue Artificial Intelligence and Deep Learning in Medical Imaging)
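
The Dice Similarity Coefficient reported in the results is simply the normalized overlap between the predicted and reference masks; a short sketch with toy masks:

import torch

def dice_coefficient(pred_mask: torch.Tensor, gt_mask: torch.Tensor, eps: float = 1e-7):
    """Dice Similarity Coefficient between two binary masks:
    DSC = 2|P ∩ G| / (|P| + |G|). A value of 0.9526, as reported above,
    means the predicted and reference cord masks overlap almost completely."""
    pred = pred_mask.bool()
    gt = gt_mask.bool()
    intersection = (pred & gt).sum().float()
    return (2 * intersection + eps) / (pred.sum() + gt.sum() + eps)

pred = torch.zeros(256, 256, dtype=torch.bool)
gt = torch.zeros(256, 256, dtype=torch.bool)
pred[100:140, 120:200] = True                    # toy predicted cord region
gt[102:140, 118:198] = True                      # toy ground-truth cord region
print(float(dice_coefficient(pred, gt)))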
