Search Results (1,333)

Search Parameters:
Journal = Remote Sensing
Section = AI Remote Sensing

35 pages, 17848 KB  
Article
Satellite-Based Multi-Decadal Shoreline Change Detection by Integrating Deep Learning with DSAS: Eastern and Southern Coastal Regions of Peninsular Malaysia
by Saima Khurram, Amin Beiranvand Pour, Milad Bagheri, Effi Helmy Ariffin, Mohd Fadzil Akhir and Saiful Bahri Hamzah
Remote Sens. 2025, 17(19), 3334; https://doi.org/10.3390/rs17193334 - 29 Sep 2025
Abstract
Coasts are critical ecological, economic and social interfaces between terrestrial and marine systems. The current upsurge in the acquisition and availability of remote sensing datasets, such as Landsat remote sensing data series, provides new opportunities for analyzing multi-decadal coastal changes and other components of coastal risk. The emergence of machine learning-based techniques represents a new trend that can support large-scale coastal monitoring and modeling using remote sensing big data. This study presents a comprehensive multi-decadal analysis of coastal changes for the period from 1990 to 2024 using Landsat remote sensing data series along the eastern and southern coasts of Peninsular Malaysia. These coastal regions include the states of Kelantan, Terengganu, Pahang, and Johor. An innovative approach combining deep learning-based shoreline extraction with the Digital Shoreline Analysis System (DSAS) was applied to the Landsat datasets. Two semantic segmentation models, U-Net and DeepLabV3+, were evaluated for automated shoreline delineation from the Landsat imagery, with U-Net demonstrating superior boundary precision and generalizability. The DSAS framework quantified shoreline change metrics—including Net Shoreline Movement (NSM), Shoreline Change Envelope (SCE), and Linear Regression Rate (LRR)—across the states of Kelantan, Terengganu, Pahang, and Johor. The results reveal distinct spatial–temporal patterns: Kelantan exhibited the highest rates of shoreline change with erosion of −64.9 m/year and accretion of up to +47.6 m/year; Terengganu showed a moderated change partly due to recent coastal protection structures; Pahang displayed both significant erosion, particularly south of the Pahang River with rates of over −50 m/year, and accretion near river mouths; Johor’s coastline predominantly exhibited accretion, with NSM values of over +1900 m, linked to extensive land reclamation activities and natural sediment deposition, although local erosion was observed along the west coast. This research highlights emerging erosion hotspots and, in some regions, the impact of engineered coastal interventions, providing critical insights for sustainable coastal zone management in Malaysia’s monsoon-influenced tropical coastal environment. The integrated deep learning and DSAS approach applied to Landsat remote sensing data series provides a scalable and reproducible framework for long-term coastal monitoring and climate adaptation planning around the world.
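The three transect statistics quoted above reduce to simple arithmetic on per-transect shoreline positions. A minimal sketch with hypothetical survey data (not values from the paper), where positions are metres from a fixed baseline:

```python
def shoreline_metrics(years, positions):
    """NSM, SCE and LRR for one DSAS transect.

    years     -- survey years, e.g. [1990, 2000, 2010, 2020]
    positions -- shoreline distance from the baseline (m) at each survey
    """
    nsm = positions[-1] - positions[0]        # Net Shoreline Movement (m)
    sce = max(positions) - min(positions)     # Shoreline Change Envelope (m)
    # Linear Regression Rate: least-squares slope of position vs. time (m/yr)
    n = len(years)
    t_mean = sum(years) / n
    p_mean = sum(positions) / n
    lrr = (sum((t - t_mean) * (p - p_mean) for t, p in zip(years, positions))
           / sum((t - t_mean) ** 2 for t in years))
    return nsm, sce, lrr

# A hypothetical transect retreating roughly 1 m/yr:
nsm, sce, lrr = shoreline_metrics([1990, 2000, 2010, 2020],
                                  [0.0, -10.0, -20.0, -30.0])
```

For a steadily eroding transect, NSM and LRR agree in sign, while SCE is always non-negative because it is an envelope of the observed positions.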

16 pages, 6871 KB  
Article
Investigation of Thermal Effects of Lakes on Their Adjacent Lands Across Tibetan Plateau Using Satellite Observation During 2000 to 2022
by Linan Guo, Wenbin Sun, Yanhong Wu, Junfeng Xiong and Jianing Jiang
Remote Sens. 2025, 17(19), 3314; https://doi.org/10.3390/rs17193314 - 27 Sep 2025
Abstract
Understanding the regulatory effects of lakes on land surface temperature is critical for assessing regional climatological and ecological dynamics on the Tibetan Plateau (TP). This study investigates the spatiotemporal variability in the thermal effect of lakes across the TP from 2000 to 2022 using the MODIS land surface temperature product and a model-based lake surface water temperature product. Our results show that the lake–land temperature difference (LLTD) within 10 km buffer zones surrounding lakes ranges from −2.8 °C to 3.4 °C. A declining trend in 79.2% of the lakes is detected during 2000–2022, with summer contributing most significantly to this decrease at a rate of −0.56 °C per decade. Assessments of the spatial extent of lake thermal effects show that the “warm island” effect in autumn (5.5 km) influences a larger area compared to the “cold island” effect in summer (1.3 km). Furthermore, southwestern lakes exhibit stronger warming intensities, while northwestern lakes show more pronounced cooling intensities. Correlation analyses indicate that lake thermal effects are significantly related to lake depth, freeze-up start date, and salinity. These findings highlight the importance of lake thermal regulation in heat balance changes and provide a foundation for further research into its climatic and ecological implications on the Tibetan Plateau.
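The lake–land temperature difference (LLTD) is the lake surface water temperature minus the mean land surface temperature inside the surrounding buffer zone. A sketch with made-up temperatures (the paper's exact aggregation may differ):

```python
def lake_land_temperature_difference(lake_swt, buffer_lst):
    """LLTD (deg C): lake surface water temperature minus the mean
    land surface temperature inside the surrounding buffer zone."""
    return lake_swt - sum(buffer_lst) / len(buffer_lst)

# Hypothetical autumn scene: a lake warmer than its 10 km buffer,
# i.e. a "warm island" effect (positive LLTD)
lltd = lake_land_temperature_difference(4.0, [1.0, 2.0, 0.0, 1.0])
```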

21 pages, 9052 KB  
Article
SAM–Attention Synergistic Enhancement: SAR Image Object Detection Method Based on Visual Large Model
by Yirong Yuan, Jie Yang, Lei Shi and Lingli Zhao
Remote Sens. 2025, 17(19), 3311; https://doi.org/10.3390/rs17193311 - 26 Sep 2025
Abstract
The object detection model for synthetic aperture radar (SAR) images needs to have strong generalization ability and more stable detection performance due to the complex scattering mechanism, high sensitivity of the orientation angle, and susceptibility to speckle noise. Visual large models possess strong generalization capabilities for natural image processing, but their application to SAR imagery remains relatively rare. This paper attempts to introduce a visual large model into the SAR object detection task, aiming to alleviate the problems of weak cross-domain generalization and poor adaptability to few-shot samples caused by the characteristics of SAR images in existing models. The proposed model comprises an image encoder, an attention module, and a detection decoder. The image encoder leverages the pre-trained Segment Anything Model (SAM) for effective feature extraction from SAR images. An Adaptive Channel Interactive Attention (ACIA) module is introduced to suppress SAR speckle noise. Further, a Dynamic Tandem Attention (DTA) mechanism is proposed in the decoder to integrate scale perception, spatial focusing, and task adaptation, while decoupling classification from detection for improved accuracy. Leveraging the strong representational and few-shot adaptation capabilities of large pre-trained models, this study evaluates their cross-domain and few-shot detection performance on SAR imagery. For cross-domain detection, the model was trained on AIR-SARShip-1.0 and tested on SSDD, achieving an mAP50 of 0.54. For few-shot detection on SAR-AIRcraft-1.0, using only 10% of the training samples, the model reached an mAP50 of 0.503.
(This article belongs to the Special Issue Big Data Era: AI Technology for SAR and PolSAR Image)

21 pages, 7001 KB  
Article
CGNet: Remote Sensing Instance Segmentation Method Using Contrastive Language–Image Pretraining and Gated Recurrent Units
by Hui Zhang, Zhao Tian, Zhong Chen, Tianhang Liu, Xueru Xu, Junsong Leng and Xinyuan Qi
Remote Sens. 2025, 17(19), 3305; https://doi.org/10.3390/rs17193305 - 26 Sep 2025
Abstract
Instance segmentation in remote sensing imagery is a significant application area within computer vision, holding considerable value in fields such as land planning and aerospace. The target scales of remote sensing images are often small, the contours of different categories of targets can be remarkably similar, and the background information is complex, containing more noise interference. Therefore, it is essential for the network model to utilize the background and internal instance information more effectively. Considering all the above, to fully adapt to the characteristics of remote sensing images, a network named CGNet, which combines an enhanced backbone with a contour–mask branch, is proposed. This network employs gated recurrent units for the iteration of contour and mask branches and adopts the attention head for branch fusion. Additionally, to address the common issues of missed detections and false detections in target detection, a supervised backbone network using contrastive pretraining for feature supplementation is introduced. The proposed method has been experimentally validated on the NWPU VHR-10 and SSDD datasets, achieving average precision metrics of 68.1% and 67.4%, respectively, which are 0.9% and 3.2% higher than those of the next-best methods.

25 pages, 20535 KB  
Article
DWTF-DETR: A DETR-Based Model for Inshore Ship Detection in SAR Imagery via Dynamically Weighted Joint Time–Frequency Feature Fusion
by Tiancheng Dong, Taoyang Wang, Yuqi Han, Deren Li, Guo Zhang and Yuan Peng
Remote Sens. 2025, 17(19), 3301; https://doi.org/10.3390/rs17193301 - 25 Sep 2025
Abstract
Inshore ship detection in synthetic aperture radar (SAR) imagery poses significant challenges due to the high density and diversity of ships. Low inter-object backscatter contrast and blurred boundaries of docked ships often result in performance degradation for traditional object detection methods, especially under complex backgrounds and low signal-to-noise ratio (SNR) conditions. To address these issues, this paper proposes a novel detection framework, the Dynamic Weighted Joint Time–Frequency Feature Fusion DEtection TRansformer (DETR) Model (DWTF-DETR), specifically designed for SAR-based ship detection in inshore areas. The proposed model integrates a Dual-Domain Feature Fusion Module (DDFM) to extract and fuse features from both SAR images and their frequency-domain representations, enhancing sensitivity to both high- and low-frequency target features. Subsequently, a Dual-Path Attention Fusion Module (DPAFM) is introduced to dynamically weight and fuse shallow detail features with deep semantic representations. By leveraging an attention mechanism, the module adaptively adjusts the importance of different feature paths, thereby enhancing the model’s ability to perceive targets with ambiguous structural characteristics. Experiments conducted on a self-constructed inshore SAR ship detection dataset and the public HRSID dataset demonstrate that DWTF-DETR achieves superior performance compared to the baseline RT-DETR. Specifically, the proposed method improves mAP@50 by 1.60% and 0.72%, and F1-score by 0.58% and 1.40%, respectively. Moreover, comparative experiments show that the proposed approach outperforms several state-of-the-art SAR ship detection methods. The results confirm that DWTF-DETR is capable of achieving accurate and robust detection in diverse and complex maritime environments.
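The paper's DDFM and DPAFM internals are not reproduced here; as a generic illustration of dynamically weighted fusion of two feature paths, a sigmoid gate can blend a spatial-domain and a frequency-domain feature vector (all names, shapes, and the scalar gate are assumptions for this sketch):

```python
import math

def dynamic_weighted_fusion(spatial_feat, freq_feat, score):
    """Hypothetical dynamic weighting: a learned scalar score is squashed
    to a gate w in (0, 1), and the two feature vectors are blended
    element-wise as w * spatial + (1 - w) * frequency."""
    w = 1.0 / (1.0 + math.exp(-score))  # sigmoid gate
    return [w * s + (1.0 - w) * f for s, f in zip(spatial_feat, freq_feat)]

# With a neutral score the two paths contribute equally:
fused = dynamic_weighted_fusion([1.0, 0.0], [0.0, 1.0], 0.0)
```

In the actual model the weights are produced by an attention mechanism per feature path rather than a single scalar; the sketch only shows the gating idea.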

29 pages, 23948 KB  
Article
CAGMC-Defence: A Cross-Attention-Guided Multimodal Collaborative Defence Method for Multimodal Remote Sensing Image Target Recognition
by Jiahao Cui, Hang Cao, Lingquan Meng, Wang Guo, Keyi Zhang, Qi Wang, Cheng Chang and Haifeng Li
Remote Sens. 2025, 17(19), 3300; https://doi.org/10.3390/rs17193300 - 25 Sep 2025
Abstract
With the increasing diversity of remote sensing modalities, multimodal image fusion improves target recognition accuracy but also introduces new security risks. Adversaries can inject small, imperceptible perturbations into a single modality to mislead model predictions, which undermines system reliability. Most existing defences are designed for single-modal inputs and face two key challenges in multimodal settings: 1. vulnerability to perturbation propagation due to static fusion strategies, and 2. the lack of collaborative mechanisms that limit overall robustness according to the weakest modality. To address these issues, we propose CAGMC-Defence, a cross-attention-guided multimodal collaborative defence framework for multimodal remote sensing. It contains two main modules. The Multimodal Feature Enhancement and Fusion (MFEF) module adopts a pseudo-Siamese network and cross-attention to decouple features, capture intermodal dependencies, and suppress perturbation propagation through weighted regulation and consistency alignment. The Multimodal Adversarial Training (MAT) module jointly generates optical and SAR adversarial examples and optimizes network parameters under consistency loss, enhancing robustness and generalization. Experiments on the WHU-OPT-SAR dataset show that CAGMC-Defence maintains stable performance under various typical adversarial attacks, such as FGSM, PGD, and MIM, retaining 85.74% overall accuracy even under the strongest white-box MIM attack (ϵ=0.05), significantly outperforming existing multimodal defence baselines.
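FGSM, the first attack listed, perturbs each input component by ϵ in the direction of the sign of the loss gradient. A minimal sketch with a hypothetical gradient vector (in practice the gradient comes from backpropagation through the model):

```python
def fgsm_perturb(x, grad, eps):
    """Fast Gradient Sign Method: x_adv = x + eps * sign(dLoss/dx)."""
    sign = lambda g: (g > 0) - (g < 0)   # -1, 0, or +1
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical pixel values and loss gradient, eps = 0.05:
x_adv = fgsm_perturb([0.5, 0.2, 0.9], [0.3, -1.2, 0.0], 0.05)
```

PGD and MIM iterate this step (MIM with gradient momentum), which is why they are the stronger attacks in the evaluation above.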

32 pages, 33744 KB  
Article
Attention-Based Enhancement of Airborne LiDAR Across Vegetated Landscapes Using SAR and Optical Imagery Fusion
by Michael Marks, Daniel Sousa and Janet Franklin
Remote Sens. 2025, 17(19), 3278; https://doi.org/10.3390/rs17193278 - 24 Sep 2025
Abstract
Accurate and timely 3D vegetation structure information is essential for ecological modeling and land management. However, these needs often cannot be met with existing airborne LiDAR surveys, whose broad-area coverage comes with trade-offs in point density and update frequency. To address these limitations, this study introduces a deep learning framework built on attention mechanisms, the fundamental building block of modern large language models. The framework upsamples sparse (<22 pt/m²) airborne LiDAR point clouds by fusing them with stacks of multi-temporal optical (NAIP) and L-band quad-polarized Synthetic Aperture Radar (UAVSAR) imagery. Utilizing a novel Local–Global Point Attention Block (LG-PAB), our model directly enhances 3D point-cloud density and accuracy in vegetated landscapes by learning structure directly from the point cloud itself. Results in fire-prone Southern California foothill and montane ecosystems demonstrate that fusing both optical and radar imagery reduces reconstruction error (measured by Chamfer distance) compared to using LiDAR alone or with a single image modality. Notably, the fused model substantially mitigates errors arising from vegetation changes over time, particularly in areas of canopy loss, thereby increasing the utility of historical LiDAR archives. This research presents a novel approach for direct 3D point-cloud enhancement, moving beyond traditional raster-based methods and offering a pathway to more accurate and up-to-date vegetation structure assessments.
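Chamfer distance, the reconstruction error used above, is a standard point-cloud metric: the mean squared nearest-neighbour distance between two clouds, taken in both directions. A brute-force sketch (real implementations use spatial indexing for speed):

```python
def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two 3D point clouds:
    mean squared distance to the nearest neighbour, both directions."""
    def sq(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    a_to_b = sum(min(sq(p, q) for q in cloud_b) for p in cloud_a) / len(cloud_a)
    b_to_a = sum(min(sq(q, p) for p in cloud_a) for q in cloud_b) / len(cloud_b)
    return a_to_b + b_to_a

# Hypothetical clouds: identical clouds have zero Chamfer distance
dense = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
sparse = [(0.5, 0.0, 0.0)]
```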

34 pages, 2027 KB  
Article
A Multi-Model Framework Based on Remote Sensing to Assess Land Degradation in Rural Areas
by Federica D’Acunto, Olena Dubovyk, Nikhil Raghuvanshi, Francesco Marinello, Filippo Iodice and Andrea Pezzuolo
Remote Sens. 2025, 17(19), 3276; https://doi.org/10.3390/rs17193276 - 24 Sep 2025
Abstract
Land degradation is a complex and context-specific phenomenon with significant implications for rural areas, where agricultural and livestock activities intersect with natural ecosystem processes. Despite growing efforts to monitor land degradation, the absence of standardized methodologies limits the comparability of results and the implementation of coherent mitigation strategies. This study introduces RURALIS, a multi-model framework, based on remote sensing, specifically designed to assess land degradation in the rural areas of Italy. Drawing on the structure and outputs of three existing models, RURALIS adopts a model-learning approach. A Random Forest classifier is then employed to compare outputs from all models and identify areas of severe degradation across all models. The analysis reveals that approximately 2.34 million hectares (13.6%) of Italy’s rural lands are severely degraded, with hotspots in northern Puglia, Sicilia, and parts of northern Italy. The model demonstrates strong classification performance and provides a flexible, high-resolution tool that leverages the shared foundation of remote sensing to deliver spatially detailed, decision-ready outputs for rural land management.
(This article belongs to the Special Issue Multimodal Remote Sensing Data Fusion, Analysis and Application)

23 pages, 5234 KB  
Article
Instance Segmentation of LiDAR Point Clouds with Local Perception and Channel Similarity
by Xinmiao Du and Xihong Wu
Remote Sens. 2025, 17(18), 3239; https://doi.org/10.3390/rs17183239 - 19 Sep 2025
Abstract
LiDAR point clouds are crucial for autonomous driving, but their sparsity and scale variations pose challenges for instance segmentation. In this paper, we propose LCPSNet, a Light Detection and Ranging (LiDAR) channel-aware point segmentation network designed to handle distance-dependent sparsity and scale variation in point clouds. A top-down FPN is adopted, where high-level features are progressively upsampled and fused with shallow layers. The fused features at 1/16, 1/8, and 1/4 are further aligned to a common BEV/polar grid and processed by the Local Perception Module (LPM), which applies cross-scale, position-dependent weighting to enhance intra-object coherence and suppress interference. The Inter-Channel Correlation Module (ICCM) employs ball queries to model spatial and channel correlations, computing an inter-channel similarity matrix to reduce redundancy and highlight valid features. Experiments on SemanticKITTI and Waymo show that LPM and ICCM effectively improve local feature refinement and global semantic consistency. LCPSNet achieves 70.9 PQ and 77.1 mIoU on SemanticKITTI, surpassing mainstream methods and reaching state-of-the-art performance.
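An inter-channel similarity matrix of the kind the ICCM computes can be illustrated with plain cosine similarity between flattened feature channels (the module's actual ball-query construction is not reproduced; near-1 off-diagonal entries flag redundant channels):

```python
import math

def channel_similarity(channels):
    """Cosine-similarity matrix between flattened feature channels."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)
    return [[cos(u, v) for v in channels] for u in channels]

# Hypothetical 2-element channels: channels 0 and 2 are collinear (redundant)
sim = channel_similarity([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
```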
(This article belongs to the Section AI Remote Sensing)

24 pages, 12464 KB  
Article
Hierarchical Frequency-Guided Knowledge Reconstruction for SAR Incremental Target Detection
by Yu Tian, Zongyong Cui, Zheng Zhou and Zongjie Cao
Remote Sens. 2025, 17(18), 3214; https://doi.org/10.3390/rs17183214 - 17 Sep 2025
Abstract
Synthetic Aperture Radar (SAR) incremental target detection faces challenges from the limits of incremental learning frameworks and distinctive properties of SAR imagery. The limited spatial representation of targets, combined with strong background interference and fluctuating scattering characteristics, leads to unstable feature learning when new classes are introduced. These factors exacerbate representation mismatches between existing and incremental tasks, resulting in significant degradation in detection performance. To address these challenges, we propose a novel incremental learning framework featuring Hierarchical Frequency-Knowledge Reconstruction (HFKR). HFKR leverages wavelet-based frequency decomposition and cross-domain feature reconstruction to enhance consistency between global and detailed features throughout the incremental learning process. Specifically, we analyze the manifestation of representation mismatch in feature space and its impact on detection accuracy, while investigating the correlation between hierarchical semantic features and frequency-domain components. Based on these insights, HFKR is embedded within the feature transfer phase, where frequency-guided decomposition and reconstruction facilitate seamless integration of new and old task features, thereby maintaining model stability across updates. Extensive experiments on two benchmark SAR datasets, MSAR and SARAIRcraft, demonstrate that our method delivers superior performance compared to existing incremental detection approaches. Furthermore, its robustness in multi-step incremental scenarios highlights the potential of HFKR for broader applications in SAR image analysis.
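Wavelet-based frequency decomposition, as used by HFKR, splits features into low- and high-frequency bands. A one-level Haar transform on a 1D signal is the simplest illustration (the paper's exact wavelet and dimensionality are not specified here):

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: split a signal into a
    low-frequency approximation and a high-frequency detail band.
    Assumes an even-length input."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_step([4.0, 2.0, 5.0, 5.0])
```

Reconstruction is exact: each input pair is `(a + d, a - d)`, which is what lets a method decompose features, manipulate the bands separately, and reassemble them.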

25 pages, 4796 KB  
Article
Vision-Language Guided Semantic Diffusion Sampling for Small Object Detection in Remote Sensing Imagery
by Jian Ma, Mingming Bian, Fan Fan, Hui Kuang, Lei Liu, Zhibing Wang, Ting Li and Running Zhang
Remote Sens. 2025, 17(18), 3203; https://doi.org/10.3390/rs17183203 - 17 Sep 2025
Abstract
Synthetic aperture radar (SAR), with its all-weather and all-day active imaging capability, has become indispensable for geoscientific analysis and socio-economic applications. Despite advances in deep learning–based object detection, the rapid and accurate detection of small objects in SAR imagery remains a major challenge due to their extremely limited pixel representation, blurred boundaries in dense distributions, and the imbalance of positive–negative samples during training. Recently, vision–language models such as Contrastive Language-Image Pre-Training (CLIP) have attracted widespread research interest for their powerful cross-modal semantic modeling capabilities. Nevertheless, their potential to guide precise localization and detection of small objects in SAR imagery has not yet been fully exploited. To overcome these limitations, we propose the CLIP-Driven Adaptive Tiny Object Detection Diffusion Network (CDATOD-Diff). This framework introduces a CLIP image–text encoding-guided dynamic sampling strategy that leverages cross-modal semantic priors to alleviate the scarcity of effective positive samples. Furthermore, a generative diffusion-based module reformulates the sampling process through iterative denoising, enhancing contextual awareness. To address regression instability, we design a Balanced Corner–IoU (BC-IoU) loss, which decouples corner localization from scale variation and reduces sensitivity to minor positional errors, thereby stabilizing bounding box predictions. Extensive experiments conducted on multiple SAR and optical remote sensing datasets demonstrate that CDATOD-Diff achieves state-of-the-art performance, delivering significant improvements in detection robustness and localization accuracy under challenging small-object scenarios with complex backgrounds and dense distributions.
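The BC-IoU loss builds on the standard IoU term; its exact corner-decoupling form is defined in the paper, but the IoU component it modifies is the usual axis-aligned overlap ratio:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # overlap area
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter)

# Two 2x2 boxes overlapping in a 1x2 strip: IoU = 2 / (4 + 4 - 2) = 1/3
v = iou((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))
```

For tiny objects a one-pixel shift can swing plain IoU drastically, which is the instability the corner-based reweighting targets.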

28 pages, 13374 KB  
Article
Low-Light Remote Sensing Image Enhancement via Priors Guided End-to-End Latent Residual Diffusion
by Bing Ding, Bei Sun and Xiaoyong Sun
Remote Sens. 2025, 17(18), 3193; https://doi.org/10.3390/rs17183193 - 15 Sep 2025
Abstract
Low-light image enhancement, especially for remote sensing images, remains a challenging task due to issues like low brightness, high noise, color distortion, and the unique complexities of remote sensing scenes, such as uneven illumination and large coverage. Existing methods often struggle to balance efficiency, accuracy, and robustness. Diffusion models have shown potential in image restoration, but they often rely on multi-step noise estimation, leading to inefficiency. To address these issues, this study proposes an enhancement framework based on a lightweight encoder–decoder and a physical-prior-guided end-to-end single-step residual diffusion model. The lightweight encoder–decoder, tailored for low-light scenarios, reduces computational redundancy while preserving key features, ensuring efficient mapping between pixel and latent spaces. Guided by physical priors, the end-to-end trained single-step residual diffusion model simplifies the process by eliminating multi-step noise estimation through end-to-end training, accelerating inference without sacrificing quality. Illumination-invariant priors guide the inference process, alleviating blurriness from missing details and ensuring structural consistency. Experimental results show that the proposed framework not only surpasses mainstream methods in quantitative metrics and visual quality but also achieves a 20× speedup compared with an advanced diffusion-based method.

20 pages, 6489 KB  
Article
Post-Disaster High-Frequency Ground-Based InSAR Monitoring and 3D Deformation Reconstruction of Large Landslides Using MIMO Radar
by Xianlin Shi, Ziwei Zhao, Yingchao Dai, Keren Dai and Anhua Ju
Remote Sens. 2025, 17(18), 3183; https://doi.org/10.3390/rs17183183 - 14 Sep 2025
Abstract
Landslide InSAR monitoring is crucial for understanding the evolutionary mechanisms of geological disasters and enhancing risk prevention and control capabilities. However, for complex terrains and large-scale landslides, satellite-based SAR monitoring faces challenges such as a low observation frequency and limited spatial deformation interpretation capabilities. Additionally, two-dimensional monitoring struggles to comprehensively capture multi-directional movements. Taking the post-disaster monitoring of the landslide in Yunchuan, Sichuan Province, as an example, this study proposes a method for three-dimensional deformation dynamic monitoring by integrating dual-view MIMO ground-based synthetic aperture radar (GB-InSAR) data with high-resolution digital elevation model (DEM) data, successfully reconstructing the three-dimensional displacement fields in the east–west, north–south, and vertical directions. The results show that deformation in the landslide area evolved from slow accumulation to rapid failure, particularly concentrated in the middle and lower regions of the landslide. The average three-dimensional deformation of the main slip zone was approximately 60% greater than that of the original slope, with a maximum deformation of −100 mm. These deformation characteristics are highly consistent with the topographic structure and sliding direction. Field investigations further validated the radar data, with observed surface cracks and accumulation zones consistent with the high-deformation regions identified by the monitoring system. This system provides a solid foundation for geological disaster early warning systems, mechanism research, and risk prevention and control.
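Reconstructing a 3D displacement from line-of-sight (LOS) measurements is a linear inversion: each radar view measures the dot product of the displacement with its LOS unit vector, and a third constraint (here an assumed DEM-derived slope direction; the paper's actual constraint may differ) closes the 3x3 system. A sketch with entirely hypothetical geometry:

```python
def solve3(m, b):
    """Solve a 3x3 linear system with Cramer's rule (illustration only)."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
              - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
              + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    out = []
    for j in range(3):
        mj = [[b[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]
        out.append(det(mj) / d)
    return out

# Hypothetical unit vectors in (east, north, up) coordinates:
rows = [[0.8, 0.0, 0.6],    # view 1 LOS
        [0.0, 0.8, 0.6],    # view 2 LOS
        [0.6, 0.6, -0.52]]  # assumed DEM/slope constraint
true_disp = [10.0, -5.0, -20.0]              # mm, synthetic ground truth
los = [sum(r[i] * true_disp[i] for i in range(3)) for r in rows]
d3 = solve3(rows, los)                        # recovers (east, north, up)
```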
(This article belongs to the Special Issue Deep Learning Techniques and Applications of MIMO Radar Theory)

32 pages, 6397 KB  
Article
Enhancing YOLO-Based SAR Ship Detection with Attention Mechanisms
by Ranyeri do Lago Rocha and Felipe A. P. de Figueiredo
Remote Sens. 2025, 17(18), 3170; https://doi.org/10.3390/rs17183170 - 12 Sep 2025
Abstract
This study enhances Synthetic Aperture Radar (SAR) ship detection by integrating three attention mechanisms, Bi-Level Routing Attention (BRA), the Swin Transformer, and a Convolutional Block Attention Module (CBAM), into state-of-the-art YOLO architectures (YOLOv11 and YOLOv12). Addressing challenges like small ship sizes and complex maritime backgrounds in SAR imagery, we systematically evaluate the impact of adding and replacing attention layers at strategic positions within the models. Experiments reveal that replacing the original attention layer at position 4 (C3k2 module) with the CBAM in YOLOv12 achieves optimal performance, attaining an mAP@0.5 of 98.0% on the SAR Ship Dataset (SSD), surpassing baseline YOLOv12 (97.8%) and prior works. The optimized CBAM-enhanced YOLOv12 also reduces computational costs (5.9 GFLOPS vs. 6.5 GFLOPS in the baseline). Cross-dataset validation on the SAR Ship Detection Dataset (SSDD) confirms consistent improvements, underscoring the efficacy of targeted attention-layer replacement for SAR-specific challenges. Additionally, tests on the SADD and MSAR datasets demonstrate that this optimization generalizes beyond ship detection, yielding gains in aircraft detection and multi-class SAR object recognition. This work establishes a robust framework for efficient, high-precision maritime surveillance using deep learning.
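CBAM's channel-attention half can be sketched in a few lines: average- and max-pool each channel, score the pooled pair, and rescale the channel by a sigmoid gate. The shared MLP of the published module is replaced by a plain sum here, so this is an illustration of the gating pattern, not the exact CBAM:

```python
import math

def channel_attention(channels):
    """CBAM-style channel gating (sketch): per-channel avg- and max-pool,
    a stand-in scorer (sum of the two pools), and a sigmoid gate that
    rescales the channel."""
    gated = []
    for ch in channels:
        avg = sum(ch) / len(ch)
        mx = max(ch)
        g = 1.0 / (1.0 + math.exp(-(avg + mx)))   # sigmoid gate in (0, 1)
        gated.append([g * x for x in ch])
    return gated

# A strongly activated channel keeps most of its signal; a dead one stays dead
out = channel_attention([[1.0, 1.0], [0.0, 0.0]])
```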

20 pages, 3823 KB  
Article
SA-Encoder: A Learnt Spatial Autocorrelation Representation to Inform 3D Geospatial Object Detection
by Tianyang Chen, Wenwu Tang, Shen-En Chen and Craig Allan
Remote Sens. 2025, 17(17), 3124; https://doi.org/10.3390/rs17173124 - 8 Sep 2025
Abstract
Contextual features play a critical role in geospatial object detection by characterizing the surrounding environment of objects. In existing deep learning-based studies of 3D point cloud classification and segmentation, these features have been represented through geometric descriptors, semantic context (i.e., modeled by an attention-based mechanism), global-level context (i.e., through global aggregation), and textural representation (e.g., RGB, intensity, and other attributes). Even though contextual features have been widely explored, spatial contextual features that explicitly capture spatial autocorrelation and neighborhood dependency have received limited attention in object detection tasks. This gap is particularly relevant in the context of GeoAI, which calls for mutual benefits between artificial intelligence and geographic information science. To bridge this gap, this study presents a spatial autocorrelation encoder, namely SA-Encoder, designed to inform 3D geospatial object detection by capturing spatial autocorrelation representation as a type of spatial contextual feature. The study investigated the effectiveness of such spatial contextual features by estimating the performance of a model trained on them alone. The results suggested that the derived spatial autocorrelation information can help adequately identify some large objects in an urban-rural scene, such as buildings, terrain, and large trees. We further investigated how the spatial autocorrelation encoder can inform model performance in a geospatial object detection task. The results demonstrated significant improvements in detection accuracy across varied urban and rural environments when the results were compared, in an ablation experiment, to models that do not consider spatial autocorrelation. Moreover, the approach also outperformed models trained by explicitly feeding traditional spatial autocorrelation measures (i.e., Matheron’s semivariance). This study showcases the advantage of the adaptiveness of the neural network-based encoder in deriving a spatial autocorrelation representation. This advancement bridges the gap between theoretical geospatial concepts and practical AI applications. Consequently, this study demonstrates the potential of integrating geographic theories with deep learning technologies to address challenges in 3D object detection, paving the way for further innovations in this field.
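Matheron's semivariance, the traditional measure the learned encoder is compared against, is γ(h) = (1 / 2N(h)) Σ (z_i − z_j)² over the N(h) point pairs separated by lag h. A minimal empirical estimator with hypothetical points:

```python
import math

def semivariance(points, values, h, tol=0.5):
    """Matheron's empirical semivariance at lag h:
    gamma(h) = 1 / (2 N(h)) * sum over pairs ~h apart of (z_i - z_j)^2."""
    sq_diffs = []
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(math.dist(points[i], points[j]) - h) <= tol:
                sq_diffs.append((values[i] - values[j]) ** 2)
    return sum(sq_diffs) / (2 * len(sq_diffs))

# Three collinear points with a linear trend; two pairs at lag 1:
g = semivariance([(0, 0), (1, 0), (2, 0)], [1.0, 3.0, 5.0], 1.0)
```

Low semivariance at short lags indicates strong spatial autocorrelation, which is the neighborhood-dependency signal the SA-Encoder learns adaptively instead of computing by this fixed formula.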
(This article belongs to the Section AI Remote Sensing)