Search Results (613)

Search Parameters:
Keywords = geometric deep learning

19 pages, 5024 KB  
Article
A Study on Geometrical Consistency of Surfaces Using Partition-Based PCA and Wavelet Transform in Classification
by Vignesh Devaraj, Thangavel Palanisamy and Kanagasabapathi Somasundaram
AppliedMath 2025, 5(4), 134; https://doi.org/10.3390/appliedmath5040134 - 3 Oct 2025
Abstract
The proposed study explores the consistency of the geometrical character of surfaces under scaling, rotation and translation. Beyond its mathematical significance, the approach also offers advantages in image processing and economic applications. The authors use partition-based principal component analysis, similar to two-dimensional Sub-Image Principal Component Analysis (SIMPCA), together with a suitably modified atypical wavelet transform to classify 2D images. The framework is further extended to three-dimensional objects using machine learning classifiers. To strengthen fairness, both Random Forest (RF) and Support Vector Machine (SVM) classifiers are benchmarked with nested cross-validation, showing consistent gains when TIFV is included. In addition, a robustness analysis that introduces Gaussian noise into the intensity channel confirms that TIFV degrades far more gracefully than traditional descriptors. Experimental results demonstrate improved performance over traditional hand-crafted descriptors such as measured values and histograms of oriented gradients. A further benefit is that the algorithm can establish consistency locally, which is impossible without partitioning, while also reducing computational complexity by a reasonable amount. Comparisons with deep learning baselines are beyond the scope of this study; the contribution is positioned within the domain of interpretable, affine-invariant descriptors that enhance classical machine learning pipelines. Full article
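The stability claim at the heart of this abstract, that PCA-derived quantities are unchanged by rotation and translation, is easy to verify on a toy point set. The sketch below is not the authors' SIMPCA/TIFV pipeline, just plain covariance eigenvalues; the eigenvalue ratio additionally absorbs uniform scaling:

```python
import numpy as np

def pca_spectrum(points):
    # eigenvalues of the covariance matrix, sorted descending: unchanged by
    # rotation and translation; uniform scaling by s multiplies them by s**2,
    # so the eigenvalue *ratio* is also scale-invariant
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    return np.sort(np.linalg.eigvalsh(cov))[::-1]

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2)) @ np.diag([3.0, 1.0])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = pts @ R.T + np.array([5.0, -2.0])     # rotate + translate
assert np.allclose(pca_spectrum(pts), pca_spectrum(moved))
s1, s2 = pca_spectrum(pts)
t1, t2 = pca_spectrum(2.0 * pts)
assert np.isclose(s1 / s2, t1 / t2)           # ratio survives scaling
```

Partitioning, as the abstract notes, repeats this computation per sub-region so that consistency can be established locally rather than only for the whole surface.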

29 pages, 13908 KB  
Article
SS3L: Self-Supervised Spectral–Spatial Subspace Learning for Hyperspectral Image Denoising
by Yinhu Wu, Dongyang Liu and Junping Zhang
Remote Sens. 2025, 17(19), 3348; https://doi.org/10.3390/rs17193348 - 1 Oct 2025
Abstract
Hyperspectral imaging (HSI) systems often suffer from complex noise degradation during the imaging process, significantly impacting downstream applications. Deep learning-based methods, though effective, rely on impractical paired training data, while traditional model-based methods require manually tuned hyperparameters and lack generalization. To address these issues, we propose SS3L (Self-Supervised Spectral–Spatial Subspace Learning), a novel HSI denoising framework that requires neither paired data nor manual tuning. Specifically, we introduce a self-supervised spectral–spatial paradigm that learns noise characteristics directly from the noisy data itself, rather than from paired training data, based on spatial geometric symmetry and spectral local consistency constraints. To avoid manual hyperparameter tuning, we propose an adaptive-rank subspace representation and a loss function built on the collaborative integration of spectral and spatial losses via noise-aware spectral–spatial weighting, guided by the estimated noise intensity. These components jointly enable a dynamic trade-off between detail preservation and noise reduction under varying noise levels. The proposed SS3L embeds noise-adaptive subspace representations into the dynamic spectral–spatial hybrid loss-constrained network, enabling cross-sensor denoising through prior-informed self-supervision. Experimental results demonstrate that SS3L effectively removes noise while preserving both structural fidelity and spectral accuracy under diverse noise conditions. Full article

33 pages, 4190 KB  
Article
Preserving Songket Heritage Through Intelligent Image Retrieval: A PCA and QGD-Rotational-Based Model
by Nadiah Yusof, Nazatul Aini Abd. Majid, Amirah Ismail and Nor Hidayah Hussain
Computers 2025, 14(10), 416; https://doi.org/10.3390/computers14100416 - 1 Oct 2025
Abstract
Malay songket motifs are a vital component of Malaysia’s intangible cultural heritage, characterized by intricate visual designs and deep cultural symbolism. However, the practical digital preservation and retrieval of these motifs present challenges, particularly due to the rotational variations typical in textile imagery. This study introduces a novel Content-Based Image Retrieval (CBIR) model that integrates Principal Component Analysis (PCA) for feature extraction and Quadratic Geometric Distance (QGD) for measuring similarity. To evaluate the model’s performance, a curated dataset comprising 413 original images and 4956 synthetically rotated songket motif images was utilized. The retrieval system featured metadata-driven preprocessing, dimensionality reduction, and multi-angle similarity assessment to address the issue of rotational invariance comprehensively. Quantitative evaluations using precision, recall, and F-measure metrics demonstrated that the proposed PCAQGD + Rotation technique achieved a mean F-measure of 59.72%, surpassing four benchmark retrieval methods. These findings confirm the model’s capability to accurately retrieve relevant motifs across varying orientations, thus supporting cultural heritage preservation efforts. The integration of PCA and QGD techniques effectively narrows the semantic gap between machine perception and human interpretation of motif designs. Future research should focus on expanding motif datasets and incorporating deep learning approaches to enhance retrieval precision, scalability, and applicability within larger national heritage repositories. Full article
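The multi-angle similarity assessment described here can be sketched as follows. The QGD measure and the paper's exact PCA pipeline are not specified in the abstract, so this toy uses an SVD projection and a Euclidean distance minimized over four 90-degree rotations of the query; a rotated copy of a motif then retrieves its original:

```python
import numpy as np

def descriptor(img, k=4):
    # toy stand-in for PCA feature extraction: project the image rows onto
    # their top-k principal directions
    centered = img - img.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[:k].T).ravel()

def multi_angle_distance(query, target):
    # compare the target against all four 90-degree rotations of the query
    # (Euclidean stand-in for the paper's QGD measure)
    return min(np.linalg.norm(descriptor(np.rot90(query, r)) - descriptor(target))
               for r in range(4))

rng = np.random.default_rng(2)
motifs = [rng.random((16, 16)) for _ in range(3)]
query = np.rot90(motifs[0], 3)                 # a rotated copy of motif 0
dists = [multi_angle_distance(query, m) for m in motifs]
assert int(np.argmin(dists)) == 0
assert np.isclose(dists[0], 0.0)
```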

18 pages, 3163 KB  
Article
A Multi-Stage Deep Learning Framework for Antenna Array Synthesis in Satellite IoT Networks
by Valliammai Arunachalam, Luke Rosen, Mojisola Rachel Akinsiku, Shuvashis Dey, Rahul Gomes and Dipankar Mitra
AI 2025, 6(10), 248; https://doi.org/10.3390/ai6100248 - 1 Oct 2025
Abstract
This paper presents an innovative end-to-end framework for conformal antenna array design and beam steering in Low Earth Orbit (LEO) satellite-based IoT communication systems. We propose a multi-stage learning architecture that integrates machine learning (ML) for antenna parameter prediction with reinforcement learning (RL) for adaptive beam steering. The ML module predicts optimal geometric and material parameters for conformal antenna arrays based on mission-specific performance requirements such as frequency, gain, coverage angle, and satellite constraints, achieving 99% prediction accuracy. These predictions are then passed to a Deep Q-Network (DQN)-based offline RL model, which learns beamforming strategies to maximize gain toward dynamic ground terminals, without requiring real-time interaction. To enable this, a synthetic dataset grounded in statistical principles and a static dataset are generated using CST Studio Suite and COMSOL Multiphysics simulations, capturing the electromagnetic behavior of various conformal geometries. The results from both the machine learning and reinforcement learning models show that the predicted antenna designs and beam steering angles closely align with simulation benchmarks. Our approach demonstrates the potential of combining data-driven ensemble models with offline reinforcement learning for scalable, efficient, and autonomous antenna synthesis in resource-constrained space environments. Full article

17 pages, 10195 KB  
Article
Feature-Driven Joint Source–Channel Coding for Robust 3D Image Transmission
by Yinuo Liu, Hao Xu, Adrian Bowman and Weichao Chen
Electronics 2025, 14(19), 3907; https://doi.org/10.3390/electronics14193907 - 30 Sep 2025
Abstract
Emerging applications like augmented reality (AR) demand efficient wireless transmission of high-resolution three-dimensional (3D) images, yet conventional systems struggle with the high data volume and vulnerability to noise. This paper proposes a novel feature-driven framework that integrates semantic source coding with deep learning-based Joint Source–Channel Coding (JSCC) for robust and efficient transmission. Instead of processing dense meshes, the method first extracts a compact set of geometric features—specifically, the ridge and valley curves that define the object’s fundamental structure. This feature representation, extracted from the anatomical curves, is then processed by an end-to-end trained JSCC encoder, mapping the semantic information directly to channel symbols. This synergistic approach drastically reduces bandwidth requirements while leveraging the inherent resilience of JSCC for graceful degradation in noisy channels. The framework demonstrates superior reconstruction fidelity and robustness compared to traditional schemes, especially in low signal-to-noise ratio (SNR) regimes, enabling practical and efficient 3D semantic communications. Full article
(This article belongs to the Special Issue AI-Empowered Communications: Towards a Wireless Metaverse)
25 pages, 7878 KB  
Article
JOTGLNet: A Guided Learning Network with Joint Offset Tracking for Multiscale Deformation Monitoring
by Jun Ni, Siyuan Bao, Xichao Liu, Sen Du, Dapeng Tao and Yibing Zhan
Remote Sens. 2025, 17(19), 3340; https://doi.org/10.3390/rs17193340 - 30 Sep 2025
Abstract
Ground deformation monitoring in mining areas is essential for hazard prevention and environmental protection. Although interferometric synthetic aperture radar (InSAR) provides detailed phase information for accurate deformation measurement, its performance is often compromised in regions experiencing rapid subsidence and strong noise, where phase aliasing and coherence loss lead to significant inaccuracies. To overcome these limitations, this paper proposes JOTGLNet, a guided learning network with joint offset tracking, for multiscale deformation monitoring. This method integrates pixel offset tracking (OT), which robustly captures large-gradient displacements, with interferometric phase data that offers high sensitivity in coherent regions. A dual-path deep learning architecture was designed where the interferometric phase serves as the primary branch and OT features act as complementary information, enhancing the network’s ability to handle varying deformation rates and coherence conditions. Additionally, a novel shape perception loss combining morphological similarity measurement and error learning was introduced to improve geometric fidelity and reduce unbalanced errors across deformation regions. The model was trained on 4000 simulated samples reflecting diverse real-world scenarios and validated on 1100 test samples with a maximum deformation of up to 12.6 m, achieving an average prediction error of less than 0.15 m—outperforming state-of-the-art methods whose errors exceeded 0.19 m. Experiments on five real monitoring datasets further confirmed the superiority and consistency of the proposed approach. Full article

24 pages, 14166 KB  
Article
Robust and Transferable Elevation-Aware Multi-Resolution Network for Semantic Segmentation of LiDAR Point Clouds in Powerline Corridors
by Yifan Wang, Shenhong Li, Guofang Wang, Wanshou Jiang, Yijun Yan and Jianwen Sun
Remote Sens. 2025, 17(19), 3318; https://doi.org/10.3390/rs17193318 - 27 Sep 2025
Abstract
Semantic segmentation of LiDAR point clouds in powerline corridor environments is crucial for the intelligent inspection and maintenance of power infrastructure. However, existing deep learning methods often underperform in such scenarios due to severe class imbalance, sparse and long-range structures, and complex elevation variations. We propose EMPower-Net, an Elevation-Aware Multi-Resolution Network, which integrates an Elevation Distribution (ED) module to enhance vertical geometric awareness and a Multi-Resolution (MR) module to improve segmentation accuracy for corridor structures with varying object scales. Experiments on real-world datasets from Yunnan and Guangdong show that EMPower-Net outperforms state-of-the-art baselines, especially in recognizing power lines and towers with high structural fidelity under occlusion and dense vegetation. Ablation studies confirm the complementary effects of the MR and ED modules, while transfer learning results reveal strong generalization with minimal performance degradation across different powerline regions. Additional tests on urban datasets indicate that the proposed elevation features are also effective for vertical structure recognition beyond powerline scenarios. Full article
(This article belongs to the Special Issue Urban Land Use Mapping Using Deep Learning)

32 pages, 16554 KB  
Article
A Multi-Task Fusion Model Combining Mixture-of-Experts and Mamba for Facial Beauty Prediction
by Junying Gan, Zhenxin Zhuang, Hantian Chen, Wenchao Xu, Zhen Chen and Huicong Li
Symmetry 2025, 17(10), 1600; https://doi.org/10.3390/sym17101600 - 26 Sep 2025
Abstract
Facial beauty prediction (FBP) is a cutting-edge task in deep learning that aims to equip machines with the ability to assess facial attractiveness in a human-like manner. In human perception, facial beauty is strongly associated with facial symmetry, where balanced structures often reflect aesthetic appeal. Leveraging symmetry provides an interpretable prior for FBP and offers geometric constraints that enhance feature learning. However, existing multi-task FBP models still face challenges such as limited annotated data, insufficient frequency–temporal modeling, and feature conflicts from task heterogeneity. The Mamba model excels in feature extraction and long-range dependency modeling but encounters difficulties in parameter sharing and computational efficiency in multi-task settings. In contrast, mixture-of-experts (MoE) enables adaptive expert selection, reducing redundancy while enhancing task specialization. This paper proposes MoMamba, a multi-task decoder combining Mamba’s state-space modeling with MoE’s dynamic routing to improve multi-scale feature fusion and adaptability. A detail enhancement module fuses high- and low-frequency components from discrete cosine transform with temporal features from Mamba, and a state-aware MoE module incorporates low-rank expert modeling and task-specific decoding. Experiments on SCUT-FBP and SCUT-FBP5500 demonstrate superior performance in both classification and regression, particularly in symmetry-related perception modeling. Full article

27 pages, 6430 KB  
Article
Bayesian–Geometric Fusion: A Probabilistic Framework for Robust Line Feature Matching
by Chenyang Zhang, Yufan Ge and Shuo Gu
Electronics 2025, 14(19), 3783; https://doi.org/10.3390/electronics14193783 - 24 Sep 2025
Abstract
Line feature matching is a fundamental and extensively studied subject in the fields of photogrammetry and computer vision. Traditional methods, which rely on handcrafted descriptors and distance-based outlier filtering, frequently encounter challenges related to robustness and a high incidence of outliers. While some approaches leverage point features to assist line feature matching by establishing invariant geometric constraints between points and lines, this typically results in a considerable computational load. In order to overcome these limitations, we introduce a novel Bayesian posterior probability framework for line matching that incorporates three geometric constraints: the distance between line feature endpoints, midpoint distance, and angular consistency. Our approach initially characterizes inter-image geometric relationships using Fourier representation. Subsequently, we formulate the posterior probability distributions for the distance constraint and the uniform distribution based on the constraint of angular consistency. By calculating the joint probability distribution under three geometric constraints, robust line feature matches are iteratively optimized through the Expectation–Maximization (EM) algorithm. Comprehensive experiments confirm the effectiveness of our approach: (i) it outperforms state-of-the-art (including deep learning-based) algorithms in match count and accuracy across common scenarios; (ii) it exhibits superior robustness to rotation, illumination variation, and motion blur compared to descriptor-based methods; and (iii) it notably reduces computational overhead in comparison to algorithms that involve point-assisted line matching. Full article
(This article belongs to the Section Circuit and Signal Processing)
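The three geometric constraints listed in the abstract can be combined into a single match score, sketched below. This is a hand-rolled Gaussian-style likelihood with made-up scale parameters, not the paper's Bayesian posterior or its EM iterations:

```python
import numpy as np

def line_match_score(l1, l2, sigma_d=5.0, sigma_a=0.2):
    # combine three geometric constraints (endpoint distance, midpoint
    # distance, angular consistency) into one score; a Gaussian-like
    # stand-in for the paper's posterior, with illustrative sigmas
    (a1, b1), (a2, b2) = l1, l2
    endpoint = np.linalg.norm(a1 - a2) + np.linalg.norm(b1 - b2)
    midpoint = np.linalg.norm((a1 + b1) / 2 - (a2 + b2) / 2)
    t1 = np.arctan2(*(b1 - a1)[::-1])          # arctan2(dy, dx)
    t2 = np.arctan2(*(b2 - a2)[::-1])
    angle = abs(np.arctan2(np.sin(t1 - t2), np.cos(t1 - t2)))
    return float(np.exp(-(endpoint + midpoint) / sigma_d)
                 * np.exp(-angle / sigma_a))

l1 = (np.array([0.0, 0.0]), np.array([10.0, 0.0]))
l2 = (np.array([0.0, 2.0]), np.array([10.0, 2.0]))   # same line, shifted
assert line_match_score(l1, l1) == 1.0
assert 0.0 < line_match_score(l1, l2) < 1.0
```

In the paper these per-pair scores feed a joint probability that EM refines iteratively; here they only rank candidate pairs.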

21 pages, 3747 KB  
Article
Open-Vocabulary Crack Object Detection Through Attribute-Guided Similarity Probing
by Hyemin Yoon and Sangjin Kim
Appl. Sci. 2025, 15(19), 10350; https://doi.org/10.3390/app151910350 - 24 Sep 2025
Abstract
Timely detection of road surface defects such as cracks and potholes is critical for ensuring traffic safety and reducing infrastructure maintenance costs. While recent advances in image-based deep learning techniques have shown promise for automated road defect detection, existing models remain limited to closed-set detection settings, making it difficult to recognize newly emerging or fine-grained defect types. To address this limitation, we propose an attribute-aware open-vocabulary crack detection (AOVCD) framework, which leverages the alignment capability of pretrained vision–language models to generalize beyond fixed class labels. In this framework, crack types are represented as combinations of visual attributes, enabling semantic grounding between image regions and natural language descriptions. To support this, we extend the existing PPDD dataset with attribute-level annotations and incorporate a multi-label attribute recognition task as an auxiliary objective. Experimental results demonstrate that the proposed AOVCD model outperforms existing baselines. In particular, compared to CLIP-based zero-shot inference, the proposed model achieves approximately a 10-fold improvement in average precision (AP) for novel crack categories. Attribute classification performance—covering geometric, spatial, and textural features—also increases by 40% in balanced accuracy (BACC) and 23% in AP. These results indicate that integrating structured attribute information enhances generalization to previously unseen defect types, especially those involving subtle visual cues. Our study suggests that incorporating attribute-level alignment within a vision–language framework can lead to more adaptive and semantically grounded defect recognition systems. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
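The attribute-guided similarity idea, scoring an image region against attribute descriptions in a shared embedding space, can be sketched with cosine similarity over placeholder vectors (random stand-ins for CLIP-style text and region embeddings; the attribute names are illustrative, not the PPDD annotation schema):

```python
import numpy as np

def cosine(a, b):
    # cosine similarity between two embedding vectors
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(3)
# placeholder "text" embeddings for hypothetical crack attributes
attributes = {name: rng.normal(size=64)
              for name in ["longitudinal", "transverse", "alligator", "pothole"]}

# a region embedding constructed to lie near the "alligator" attribute
region = attributes["alligator"] + 0.1 * rng.normal(size=64)

scores = {name: cosine(region, vec) for name, vec in attributes.items()}
best = max(scores, key=scores.get)
assert best == "alligator"
```

Open-vocabulary detection then amounts to ranking such similarities for labels never seen during training, which is why the alignment quality of the pretrained encoder matters so much.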

40 pages, 9065 KB  
Article
Empirical Evaluation of Invariances in Deep Vision Models
by Konstantinos Keremis, Eleni Vrochidou and George A. Papakostas
J. Imaging 2025, 11(9), 322; https://doi.org/10.3390/jimaging11090322 - 19 Sep 2025
Abstract
The ability of deep learning models to maintain consistent performance under image transformations, termed invariances, is critical for reliable deployment across diverse computer vision applications. This study presents a comprehensive empirical evaluation of modern convolutional neural networks (CNNs) and vision transformers (ViTs) concerning four fundamental types of image invariances: blur, noise, rotation, and scale. We analyze a curated selection of thirty models across three common vision tasks, object localization, recognition, and semantic segmentation, using benchmark datasets including COCO, ImageNet, and a custom segmentation dataset. Our experimental protocol introduces controlled perturbations to test model robustness and employs task-specific metrics such as mean Intersection over Union (mIoU) and classification accuracy (Acc) to quantify performance degradation. Results indicate that while ViTs generally outperform CNNs under blur and noise corruption in recognition tasks, both model families exhibit significant vulnerabilities to rotation and extreme scale transformations. Notably, segmentation models demonstrate higher resilience to geometric variations, with SegFormer and Mask2Former emerging as the most robust architectures. These findings challenge prevailing assumptions regarding model robustness and provide actionable insights for designing vision systems capable of withstanding real-world input variability. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Computer Vision Applications)
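The evaluation protocol here, apply a controlled perturbation and re-measure the task metric, can be sketched with a deliberately position-dependent toy classifier, for which a 180-degree rotation collapses accuracy from 1.0 to 0.0 (the real study runs this loop over thirty trained models and four perturbation families):

```python
import numpy as np

def accuracy(model, data, perturb=lambda x: x):
    # task metric re-measured under an arbitrary input perturbation
    return sum(model(perturb(x)) == y for x, y in data) / len(data)

def corner_model(img):
    # toy classifier keyed to absolute position -> not rotation-invariant
    return 0 if img[:8, :8].mean() > img[8:, 8:].mean() else 1

def make_sample(label):
    img = np.zeros((16, 16))
    if label == 0:
        img[:8, :8] = 1.0      # class 0: bright top-left quadrant
    else:
        img[8:, 8:] = 1.0      # class 1: bright bottom-right quadrant
    return img, label

data = [make_sample(i % 2) for i in range(20)]
clean = accuracy(corner_model, data)
rotated = accuracy(corner_model, data, lambda x: np.rot90(x, 2))
assert clean == 1.0 and rotated == 0.0
```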

28 pages, 6410 KB  
Article
Two-Step Forward Modeling for GPR Data of Metal Pipes Based on Image Translation and Style Transfer
by Zhishun Guo, Yesheng Gao, Zicheng Huang, Mengyang Shi and Xingzhao Liu
Remote Sens. 2025, 17(18), 3215; https://doi.org/10.3390/rs17183215 - 17 Sep 2025
Abstract
Ground-penetrating radar (GPR) is an important geophysical technique in subsurface detection. However, traditional numerical simulation methods such as finite-difference time-domain (FDTD) face challenges in accurately simulating complex heterogeneous media in real-world scenarios due to the difficulty of obtaining precise medium distribution information and high computational costs. Meanwhile, deep learning methods require excessive prior information, which limits their application. To address these issues, this paper proposes a novel two-step forward modeling strategy for GPR data of metal pipes. The first step employs the proposed Polarization Self-Attention Image Translation network (PSA-ITnet) for image translation, inspired by the process in which a neural network model “understands” image content and “rewrites” it according to specified rules. It converts scene layout images (cross-sectional schematics depicting geometric details such as the size and spatial distribution of underground buried metal pipes and their surrounding medium) into simulated clutter-free GPR B-scan images. By integrating the polarized self-attention (PSA) mechanism into the Unet generator, PSA-ITnet can capture long-range dependencies, enhancing its understanding of the longitudinal time-delay property in GPR B-scan images, which is crucial for accurately generating hyperbolic signatures of metal pipes in simulated data. The second step uses the Polarization Self-Attention Style Transfer network (PSA-STnet) for style transfer, which transforms the simulated clutter-free images into data matching the distribution and characteristics of a real-world underground heterogeneous medium under unsupervised conditions while retaining target information. This step bridges the gap between ideal simulations and actual GPR data. Simulation experiments confirm that PSA-ITnet outperforms traditional methods in image translation, and PSA-STnet shows superiority in style transfer. Real-world experiments in a complex bridge support structure scenario further verify the method’s practicability and robustness. Compared to FDTD, the proposed strategy is capable of generating GPR data matching real-world subsurface heterogeneous medium distributions from scene layout models, significantly reducing time costs and providing an efficient solution for GPR data simulation and analysis. Full article

21 pages, 4379 KB  
Article
Deep Learning-Based Super-Resolution Reconstruction of a 1/9 Arc-Second Offshore Digital Elevation Model for U.S. Coastal Regions
by Chenhao Wu, Bo Zhang, Meng Zhang and Chaofan Yang
Remote Sens. 2025, 17(18), 3205; https://doi.org/10.3390/rs17183205 - 17 Sep 2025
Abstract
High-resolution offshore digital elevation models (DEMs) are essential for coastal geomorphology, marine resource management, and disaster prevention. While deep learning-based super-resolution (SR) techniques have become a mainstream solution for enhancing DEMs, they often fail to maintain a balance between large-scale geomorphological structure and fine-scale topographic detail due to limitations in modeling spatial dependency. To overcome this challenge, we propose DEM-Asymmetric multi-scale super-resolution network (DEM-AMSSRN), a novel asymmetric multi-scale super-resolution network tailored for offshore DEM reconstruction. Our method incorporates region-level non-local (RL-NL) modules to capture long-range spatial dependencies and residual multi-scale blocks (RMSBs) to extract hierarchical terrain features. Additionally, a hybrid loss function combining pixel-wise, perceptual, and adversarial losses is introduced to ensure both geometric fidelity and visual realism. Experimental evaluations on U.S. offshore DEM datasets demonstrate that DEM-AMSSRN significantly outperforms existing GAN-based models, reducing RMSE by up to 72.47% (vs. SRGAN) and achieving 53.30 dB PSNR and 0.995056 SSIM. These results highlight its effectiveness in preserving both continental shelf-scale bathymetric patterns and detailed terrain textures. Using this model, we also constructed the USA_OD_2025, a 1/9 arc-second high-resolution offshore DEM for U.S. coastal zones, providing a valuable geospatial foundation for future marine research and engineering. Full article
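The RMSE and PSNR figures quoted follow the standard definitions; a minimal sketch, assuming elevations normalized to a peak value of 1.0:

```python
import numpy as np

def rmse(ref, est):
    # root-mean-square error between reference and estimated grids
    return float(np.sqrt(np.mean((ref - est) ** 2)))

def psnr(ref, est, peak=1.0):
    # peak signal-to-noise ratio in dB for a given dynamic range
    return float(10 * np.log10(peak ** 2 / np.mean((ref - est) ** 2)))

ref = np.zeros((8, 8))
est = np.full((8, 8), 0.1)      # uniform 0.1 error everywhere
assert abs(rmse(ref, est) - 0.1) < 1e-12
assert abs(psnr(ref, est) - 20.0) < 1e-9
```

A 53.30 dB PSNR, as reported, corresponds to a mean squared error of about 10**(-5.33) of the squared dynamic range, which is why it pairs with an SSIM so close to 1.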

26 pages, 4906 KB  
Article
Real-Time Sequential Adaptive Bin Packing Based on Second-Order Dual Pointer Adversarial Network: A Symmetry-Driven Approach for Balanced Container Loading
by Zibao Zhou, Enliang Wang and Xuejian Zhao
Symmetry 2025, 17(9), 1554; https://doi.org/10.3390/sym17091554 - 17 Sep 2025
Abstract
Modern logistics operations require real-time adaptive solutions for three-dimensional bin packing that maintain spatial symmetry and load balance. This paper introduces a time-series-based online 3D packing problem with dual unknown sequences, where containers and items arrive dynamically. The challenge lies in achieving symmetric distribution for stability and optimal space utilization. We propose the Second-Order Dual Pointer Adversarial Network (So-DPAN), a deep reinforcement learning architecture that leverages symmetry principles to decompose spatiotemporal optimization into sequence matching and spatial arrangement sub-problems. The dual pointer mechanism enables efficient item-container pairing, while the second-order structure captures temporal dependencies by maintaining symmetric packing patterns. Our approach considers geometric symmetry for spatial arrangement and temporal symmetry for sequence matching. The Actor-Critic framework uses symmetry-based reward functions to guide learning toward balanced configurations. Experiments demonstrate that So-DPAN outperforms DQN, DDPG, and traditional heuristics in solution quality and efficiency while maintaining superior symmetry metrics in center-of-gravity positioning and load distribution. The algorithm exploits inherent symmetries in packing structure, advancing theoretical understanding through symmetry-aware optimization while providing a deployable framework for Industry 4.0 smart logistics. Full article
(This article belongs to the Section Mathematics)
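The traditional heuristics used as baselines are typified by first-fit decreasing; a one-dimensional sketch (the paper's setting is 3D with dual unknown arrival sequences, which this offline toy does not capture):

```python
def first_fit_decreasing(items, capacity):
    # classic baseline: place each item (largest first) into the first bin
    # with room, opening a new bin when none fits
    bins = []
    for item in sorted(items, reverse=True):
        for b in bins:
            if sum(b) + item <= capacity:
                b.append(item)
                break
        else:
            bins.append([item])
    return bins

bins = first_fit_decreasing([5, 4, 3, 3, 2, 2, 1], capacity=10)
assert len(bins) == 2 and all(sum(b) <= 10 for b in bins)
```

Online variants such as So-DPAN cannot sort the full sequence in advance, which is precisely the temporal difficulty the dual pointer mechanism addresses.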

30 pages, 5137 KB  
Article
High-Resolution Remote Sensing Imagery Water Body Extraction Using a U-Net with Cross-Layer Multi-Scale Attention Fusion
by Chunyan Huang, Mingyang Wang, Zichao Zhu and Yanling Li
Sensors 2025, 25(18), 5655; https://doi.org/10.3390/s25185655 - 10 Sep 2025
Abstract
The accurate extraction of water bodies from remote sensing imagery is crucial for water resource monitoring and flood disaster warning. However, this task faces significant challenges due to complex land cover, large variations in water body morphology and spatial scales, and spectral similarities between water and non-water features, leading to misclassification and low accuracy. While deep learning-based methods have become a research hotspot, traditional convolutional neural networks (CNNs) struggle to represent multi-scale features and capture global water body information effectively. To enhance water feature recognition and precisely delineate water boundaries, we propose the AMU-Net model. Initially, an improved residual connection module was embedded into the U-Net backbone to enhance complex feature learning. Subsequently, a multi-scale attention mechanism was introduced, combining grouped channel attention with multi-scale convolutional strategies for lightweight yet precise segmentation. Thereafter, a dual-attention gated modulation module dynamically fusing channel and spatial attention was employed to strengthen boundary localization. Furthermore, a cross-layer geometric attention fusion module, incorporating grouped projection convolution and a triple-level geometric attention mechanism, optimizes segmentation accuracy and boundary quality. Finally, a triple-constraint loss framework synergistically optimized global classification, regional overlap, and background specificity to boost segmentation performance. Evaluated on the GID and WHDLD datasets, AMU-Net achieved remarkable IoU scores of 93.6% and 95.02%, respectively, providing an effective new solution for remote sensing water body extraction. Full article
(This article belongs to the Section Remote Sensors)
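The IoU scores reported follow the standard intersection-over-union definition; a minimal sketch for binary water masks:

```python
import numpy as np

def iou(pred, target):
    # intersection-over-union for boolean masks
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter) / union if union else 1.0

pred   = np.array([1, 1, 1, 1, 0, 0], dtype=bool)
target = np.array([0, 0, 1, 1, 1, 1], dtype=bool)
assert abs(iou(pred, target) - 1 / 3) < 1e-12
```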
