Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing II

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: closed (30 September 2023) | Viewed by 35547

Special Issue Editor


Dr. Hossein M. Rizeei
Guest Editor
1. Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW, Australia
2. McGregor Coxall Australia Pty Ltd., Sydney, NSW, Australia
Interests: machine learning; geospatial 3D analysis; geospatial database querying; web GIS; airborne/spaceborne image processing; feature extraction; time-series analysis in forecasting modelling and domain adaptation in various environmental applications

Special Issue Information

Dear Colleagues,

Following the success of the previous Special Issue "Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing", a second volume has been launched.

Artificial intelligence (AI) and machine learning (ML) techniques have been a principal element of image processing and spatial analysis in numerous applications for a decade. AI enables us to determine the real function of imagery data and process it with a well-fitted algorithm to model a structural framework in terms of classification, regression, and clustering, and to model spatial correlation. Deep neural networks, usually known as deep learning, are a robust class of ML methods that stack numerous layers of data-driven algorithms to perform a wide range of tasks, including pattern recognition, feature detection, trend prediction, instance segmentation, semantic segmentation, and image classification.

Conventional structured remotely sensed data need to be labelled manually when it comes to training a model, which is a subjective, user-centric, untransferable, and tedious approach. Therefore, it is important to eliminate these uncertainties by establishing a reproducible and reliable approach, which can be referred to as “machine vision” (MV). MV attempts to leverage current AI technology in a novel way to provide an automatic inspection workflow, from image acquisition at the sensor through digital image pre-processing, training and testing, and validation to knowledge extraction. It covers software products and hardware architectures such as CPUs, GPU/FPGA combinations, parallel implementations, and computer vision to minimize computation time while maximizing reproducible accuracy.

In this Special Issue, we welcome the submission of scientific manuscripts proposing a framework to leverage MV with optimized AI techniques and geospatial information systems to automate the processing of remotely sensed imagery from, for example, lidar, radar, SAR, and multispectral sensors with higher precision for multiple spatial applications including but not limited to urbanism, land-use modelling, environment, weather and climate, energy sector, natural resources, landscape, geo-hazards, etc.

Dr. Hossein M. Rizeei
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence (AI)
  • machine vision (MV)
  • machine learning (ML)
  • geospatial information systems (GIS)
  • optimization
  • spatial framework
  • deep learning (DL)

Published Papers (17 papers)

Research

26 pages, 7207 KiB  
Article
MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices
by Yehui Liu, Yuliang Zhao, Xinyue Zhang, Xiaoai Wang, Chao Lian, Jian Li, Peng Shan, Changzeng Fu, Xiaoyong Lyu, Lianjiang Li, Qiang Fu and Wen Jung Li
Remote Sens. 2023, 15(24), 5665; https://doi.org/10.3390/rs15245665 - 7 Dec 2023
Cited by 2 | Viewed by 1428
Abstract
Tracking and segmenting small targets in remote sensing videos on edge devices carries significant engineering implications. However, many semi-supervised video object segmentation (S-VOS) methods rely heavily on extensive video random-access memory (VRAM) resources, making deployment on edge devices challenging. Our goal is to develop an edge-deployable S-VOS method that can achieve high-precision tracking and segmentation by selecting a bounding box for the target object. First, a tracker is introduced to pinpoint the position of the tracked object in different frames, thereby eliminating the need to save previous segmentation results as other S-VOS methods do and avoiding an increase in VRAM usage. Second, we use two key lightweight components, correlation filters (CFs) and the Mobile Segment Anything Model (MobileSAM), to ensure the inference speed of our model. Third, a mask diffusion module is proposed that improves the accuracy and robustness of segmentation without increasing VRAM usage. We use our self-built dataset containing airplanes and vehicles to evaluate our method. The results show that on the GTX 1080 Ti, our model achieves a J&F score of 66.4% under the condition that the VRAM usage is less than 500 MB, while maintaining a processing speed of 12 frames per second (FPS). The model we propose exhibits good performance in tracking and segmenting small targets on edge devices, providing a solution for fields such as aircraft monitoring and vehicle tracking that require executing S-VOS tasks on edge devices.

18 pages, 3670 KiB  
Article
ERF-RTMDet: An Improved Small Object Detection Method in Remote Sensing Images
by Shuo Liu, Huanxin Zou, Yazhe Huang, Xu Cao, Shitian He, Meilin Li and Yuqing Zhang
Remote Sens. 2023, 15(23), 5575; https://doi.org/10.3390/rs15235575 - 30 Nov 2023
Cited by 2 | Viewed by 1489
Abstract
Small objects pose a significant challenge for object detection in complex remote sensing (RS) datasets. Existing detection methods achieve much lower accuracy on small objects than on medium and large ones. These methods suffer from limited feature information, susceptibility to complex background interference, and insufficient contextual information. To address these issues, a small object detection method with an enhanced receptive field, ERF-RTMDet, is proposed to achieve a more robust detection capability on small objects in RS images. Specifically, three modules are employed to enhance the receptive field of small objects’ features. First, the Dilated Spatial Pyramid Pooling Fast Module is proposed to gather more contextual information on small objects and suppress the interference of background information. Second, the Content-Aware Reassembly of Features Module is employed for more efficient feature fusion instead of the nearest-neighbor upsampling operator. Finally, the Hybrid Dilated Attention Module is proposed to expand the receptive field of object features after the feature fusion network. Extensive experiments are conducted on the MAR20 and NWPU VHR-10 datasets. The experimental results show that our ERF-RTMDet attains higher detection precision on small objects while maintaining or slightly enhancing the detection precision on mid-scale and large-scale objects.

21 pages, 6111 KiB  
Article
A Novel Adaptive Edge Aggregation and Multiscale Feature Interaction Detector for Object Detection in Remote Sensing Images
by Wei Huang, Yuhao Zhao, Le Sun, Lu Gao and Yuwen Chen
Remote Sens. 2023, 15(21), 5200; https://doi.org/10.3390/rs15215200 - 1 Nov 2023
Viewed by 1223
Abstract
Object detection (OD) in remote sensing (RS) images is an important task in the field of computer vision. OD techniques have achieved impressive advances in recent years. However, complex background interference, large-scale variations, and dense instances pose significant challenges for OD. These challenges may lead to misalignment between features extracted by OD models and the features of real objects. To address these challenges, we explore a novel single-stage detection framework for the adaptive fusion of multiscale features and propose a novel adaptive edge aggregation and multiscale feature interaction detector (AEAMFI-Det) for OD in RS images. AEAMFI-Det consists of an adaptive edge aggregation (AEA) module, a feature enhancement module (FEM) embedded in a context-aware cross-attention feature pyramid network (2CA-FPN), and a pyramid squeeze attention (PSA) module. The AEA module employs an edge enhancement mechanism to guide the network to learn spatial multiscale nonlocal dependencies and solve the problem of feature misalignment between the network’s focus and the real object. The 2CA-FPN employs level-by-level feature fusion to enhance multiscale feature interactions and effectively mitigate the misalignment between the scales of the extracted features and the scales of real objects. The FEM is designed to capture the local and nonlocal contexts as auxiliary information to enhance the feature representation of information interaction between multiscale features in a cross-attention manner. We introduce the PSA module to establish long-term dependencies between multiscale spaces and channels for better interdependency refinement. Experimental results obtained using the NWPU VHR-10 and DIOR datasets demonstrate the superior performance of AEAMFI-Det in object classification and localization.

16 pages, 3981 KiB  
Article
Transformer in UAV Image-Based Weed Mapping
by Jiangsan Zhao, Therese With Berge and Jakob Geipel
Remote Sens. 2023, 15(21), 5165; https://doi.org/10.3390/rs15215165 - 29 Oct 2023
Cited by 1 | Viewed by 1238
Abstract
Weeds affect crop yield and quality due to competition for resources. In order to reduce the risk of yield losses due to weeds, herbicides or non-chemical measures are applied. Weeds, especially creeping perennial species, are generally distributed in patches within arable fields. Hence, instead of applying control measures uniformly, precision weeding or site-specific weed management (SSWM) is highly recommended. Unmanned aerial vehicle (UAV) imaging is known for wide area coverage and flexible operation frequency, making it a potential solution to generate weed maps at a reasonable cost. Efficient weed mapping algorithms need to be developed together with UAV imagery to facilitate SSWM. Different machine learning (ML) approaches have been developed for image-based weed mapping, either classical ML models or the more up-to-date deep learning (DL) models taking full advantage of parallel computation on a GPU (graphics processing unit). Attention-based transformer DL models, which have seen a recent boom, are expected to overtake classical convolutional neural network (CNN) DL models. This inspired us to develop a transformer DL model for segmenting weeds, cereal crops, and ‘other’ in low-resolution RGB UAV imagery (about 33 mm ground sampling distance, g.s.d.) captured after the cereal crop had turned yellow. Images were acquired over three years in 15 fields with three cereal species (Triticum aestivum, Hordeum vulgare, and Avena sativa) and various weed flora dominated by creeping perennials (mainly Cirsium arvense and Elymus repens). The performance of our transformer model, 1Dtransformer, was evaluated through comparison with a classical DL model, 1DCNN, and two classical ML methods, i.e., random forest (RF) and k-nearest neighbor (KNN). The transformer model showed the best performance with an overall accuracy of 98.694% on pixels set aside for validation. It also showed the best, and relatively good, agreement with ground reference data on total weed coverage (R² = 0.598). In this study, we showed the outstanding performance and robustness of a 1Dtransformer model for weed mapping based on UAV imagery for the first time. The model can be used to obtain weed maps in cereal fields known to be infested by perennial weeds. These maps can be used as a basis for the generation of prescription maps for SSWM, either pre-harvest, post-harvest, or in the next crop, by applying herbicides or non-chemical measures.

25 pages, 10723 KiB  
Article
CDAU-Net: A Novel CoordConv-Integrated Deep Dual Cross Attention Mechanism for Enhanced Road Extraction in Remote Sensing Imagery
by Anchao Yin, Chao Ren, Weiting Yue, Hongjuan Shao and Xiaoqin Xue
Remote Sens. 2023, 15(20), 4914; https://doi.org/10.3390/rs15204914 - 11 Oct 2023
Viewed by 1060
Abstract
In the realm of remote sensing image analysis, the task of road extraction poses significant complexities, especially in the context of intricate scenes and diminutive targets. In response to these challenges, we have developed a novel deep learning network, christened CDAU-Net, designed to discern and delineate these features with enhanced precision. This network takes its structural inspiration from the fundamental architecture of U-Net while introducing innovative enhancements: we have integrated CoordConv convolutions into both the initial layer of the U-Net encoder and the terminal layer of the decoder, thereby facilitating more effective processing of the spatial information inherent in remote sensing images. Moreover, we have devised a unique mechanism termed the Deep Dual Cross Attention (DDCA), designed to capture long-range dependencies within images, a critical factor in remote sensing image analysis. Our network replaces the skip-connection component of the U-Net with this newly designed mechanism, which processes the feature maps of the first four scales in the encoder and generates four corresponding outputs. These outputs are subsequently linked with the decoder stage to further capture the long-range dependencies present within the remote sensing imagery. We have subjected CDAU-Net to extensive empirical validation, including testing on the Massachusetts Road Dataset and DeepGlobe Road Dataset. Both datasets encompass a diverse range of complex road scenes, making them ideal for evaluating the performance of road extraction algorithms. The experimental results show that CDAU-Net outperforms existing state-of-the-art methods in road extraction in terms of accuracy, recall, and Intersection over Union (IoU). These findings substantiate the effectiveness and superiority of our approach in handling complex scenes and small targets, as well as in capturing long-range dependencies in remote sensing imagery. In sum, the design of CDAU-Net not only enhances the accuracy of road extraction but also presents new perspectives and possibilities for deep learning analysis of remote sensing imagery.
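
For readers less familiar with the recall and Intersection over Union (IoU) metrics named in this abstract, the following is a minimal sketch of how they are commonly computed for a binary road mask. It is not taken from the paper: the function name `recall_and_iou` and the toy masks are hypothetical, and the authors' exact evaluation protocol is not given here.

```python
def recall_and_iou(pred, truth):
    """Return (recall, IoU) for two equally sized binary masks.

    Both masks are nested lists of 0/1, where 1 marks a road pixel.
    This is a generic illustration, not the paper's evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # road pixel correctly predicted
            elif p and not t:
                fp += 1      # background predicted as road
            elif t and not p:
                fn += 1      # road pixel missed
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return recall, iou


# Toy 3x4 masks (made-up values for illustration only).
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 1]]

recall, iou = recall_and_iou(pred, truth)
print(f"recall={recall:.3f}, IoU={iou:.3f}")
```

In this toy case five of the six road pixels are found and one background pixel is mislabelled, so recall is 5/6 and IoU is 5/7.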

18 pages, 2869 KiB  
Article
An Enhanced Target Detection Algorithm for Maritime Search and Rescue Based on Aerial Images
by Yijian Zhang, Yong Yin and Zeyuan Shao
Remote Sens. 2023, 15(19), 4818; https://doi.org/10.3390/rs15194818 - 3 Oct 2023
Cited by 1 | Viewed by 2037
Abstract
Unmanned aerial vehicles (UAVs), renowned for their rapid deployment, extensive data collection, and high spatial resolution, are crucial in locating distressed individuals during search and rescue (SAR) operations. Challenges in maritime search and rescue include missed detections caused by factors such as sunlight reflection. In this study, we propose an enhanced ABT-YOLOv7 algorithm for underwater person detection. This algorithm integrates an asymptotic feature pyramid network (AFPN) to preserve the target feature information. The BiFormer module enhances the model’s perception of small-scale targets, whereas the task-specific context decoupling (TSCODE) mechanism effectively resolves conflicts between localization and classification. In quantitative experiments on a curated dataset, our model outperformed methods such as YOLOv3, YOLOv4, YOLOv5, YOLOv8, Faster R-CNN, Cascade R-CNN, and FCOS. Compared with YOLOv7, our approach improves the mean average precision (mAP) from 87.1% to 91.6%. Our approach also reduces the sensitivity of the detection model to low-lighting conditions and sunlight reflection, thus demonstrating enhanced robustness. These innovations advance UAV technology within the maritime search and rescue domain.

22 pages, 3244 KiB  
Article
Leveraging High-Resolution Long-Wave Infrared Hyperspectral Laboratory Imaging Data for Mineral Identification Using Machine Learning Methods
by Alireza Hamedianfar, Kati Laakso, Maarit Middleton, Tuomo Törmänen, Juha Köykkä and Johanna Torppa
Remote Sens. 2023, 15(19), 4806; https://doi.org/10.3390/rs15194806 - 3 Oct 2023
Cited by 1 | Viewed by 1687
Abstract
Laboratory-based hyperspectral imaging (HSI) is an optical non-destructive technology used to extract mineralogical information from bedrock drill cores. In the present study, drill core scanning in the long-wave infrared (LWIR; 8000–12,000 nm) wavelength region was used to map the dominant minerals in HSI pixels. Machine learning classification algorithms, including random forest (RF) and support vector machine, have previously been applied to the mineral characterization of drill core hyperspectral data. The objectives of this study are to expand semi-automated mineral mapping by investigating the mapping accuracy, generalization potential, and classification ability of cutting-edge methods, such as various ensemble machine learning algorithms and deep learning semantic segmentation. In the present study, the mapping of quartz, talc, chlorite, and mixtures thereof in HSI data was performed using the ENVINet5 algorithm, which is based on the U-net deep learning network, and several decision tree ensemble algorithms, including RF, gradient-boosting decision tree (GBDT), light gradient-boosting machine (LightGBM), AdaBoost, and bagging. Prior to training the classification models, endmember selection was employed using the Sequential Maximum Angle Convex Cone endmember extraction method to prepare the samples used in the model training and evaluation of the classification results. The results show that the GBDT and LightGBM classifiers outperformed the other classification models with overall accuracies of 89.43% and 89.22%, respectively. The results of the other classifiers showed overall accuracies of 87.32%, 87.33%, 82.74%, and 78.32% for RF, bagging, ENVINet5, and AdaBoost, respectively. Therefore, the findings of this study confirm that the ensemble machine learning algorithms are efficient tools to analyze drill core HSI data and map dominant minerals. Moreover, the implementation of deep learning methods for mineral mapping from HSI drill core data should be further explored and adjusted.
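
The overall accuracy figures quoted in this abstract are typically derived from a confusion matrix as the share of correctly classified samples. The sketch below illustrates that calculation only; the function name `overall_accuracy` and the three-class counts are hypothetical and are not the study's data.

```python
def overall_accuracy(confusion):
    """Overall accuracy = correctly classified samples / all samples.

    `confusion[i][j]` is the number of samples of true class i that were
    predicted as class j, so correct predictions lie on the diagonal.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Made-up counts for three classes (e.g. quartz, talc, chlorite);
# these numbers are illustrative and do not come from the paper.
confusion = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]

oa = overall_accuracy(confusion)
print(f"overall accuracy = {oa:.2%}")
```

With these counts, 129 of 150 samples fall on the diagonal, giving an overall accuracy of 86%.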

21 pages, 5500 KiB  
Article
Unified Transformer with Cross-Modal Mixture Experts for Remote-Sensing Visual Question Answering
by Gang Liu, Jinlong He, Pengfei Li, Shenjun Zhong, Hongyang Li and Genrong He
Remote Sens. 2023, 15(19), 4682; https://doi.org/10.3390/rs15194682 - 24 Sep 2023
Cited by 1 | Viewed by 1402
Abstract
Remote-sensing visual question answering (RSVQA) aims to provide accurate answers to questions about remote sensing images by leveraging both visual and textual information during the inference process. However, most existing methods overlook the interaction between visual and language features: they typically adopt simple feature fusion strategies, fail to adequately model cross-modal attention, and struggle to capture the complex semantic relationships between questions and images. In this study, we introduce a unified transformer with cross-modal mixture expert (TCMME) model to address the RSVQA problem. Specifically, we utilize the vision transformer (ViT) and BERT to extract visual and language features, respectively. Furthermore, we incorporate cross-modal mixture experts (CMMEs) to facilitate cross-modal representation learning. By leveraging the shared self-attention and cross-modal attention within CMMEs, as well as the modality experts, we effectively capture the intricate interactions between visual and language features and better focus on their complex semantic relationships. Finally, we conduct qualitative and quantitative experiments on two benchmark datasets: RSVQA-LR and RSVQA-HR. The results demonstrate that our proposed method surpasses the current state-of-the-art (SOTA) techniques. Additionally, we perform an extensive analysis to validate the effectiveness of different components in our framework.

23 pages, 37749 KiB  
Article
MHLDet: A Multi-Scale and High-Precision Lightweight Object Detector Based on Large Receptive Field and Attention Mechanism for Remote Sensing Images
by Liming Zhou, Hang Zhao, Zhehao Liu, Kun Cai, Yang Liu and Xianyu Zuo
Remote Sens. 2023, 15(18), 4625; https://doi.org/10.3390/rs15184625 - 20 Sep 2023
Viewed by 1079
Abstract
Object detection in remote sensing images (RSIs) has become crucial in recent years. However, researchers often prioritize detecting small objects, neglecting medium- to large-sized ones. Moreover, detecting objects hidden in shadows is challenging. Additionally, most detectors have extensive parameters, leading to higher hardware costs. To address these issues, this paper proposes a multi-scale and high-precision lightweight object detector named MHLDet. Firstly, we integrated the SimAM attention mechanism into the backbone and constructed a new feature-extraction module called validity-neat feature extract (VNFE). This module captures more feature information while simultaneously reducing the number of parameters. Secondly, we propose an improved spatial pyramid pooling model, named SPPE, to better integrate multi-scale feature information, enhancing the model's ability to detect multi-scale objects. Finally, this paper introduces the convolution aggregation crosslayer (CACL) into the network. This module can reduce the size of the feature map and enhance the ability to fuse context information, thereby obtaining a feature map with more semantic information. We performed evaluation experiments on both the SIMD dataset and the UCAS-AOD dataset. Compared to other methods, our approach achieved the highest detection accuracy. Furthermore, it reduced the number of parameters by 12.7% compared to YOLOv7-Tiny. The experimental results illustrated that our proposed method is more lightweight and exhibits superior detection accuracy compared to other lightweight models.

24 pages, 11641 KiB  
Article
ARE-Net: An Improved Interactive Model for Accurate Building Extraction in High-Resolution Remote Sensing Imagery
by Qian Weng, Qin Wang, Yifeng Lin and Jiawen Lin
Remote Sens. 2023, 15(18), 4457; https://doi.org/10.3390/rs15184457 - 10 Sep 2023
Cited by 1 | Viewed by 959
Abstract
Accurate building extraction for high-resolution remote sensing images is critical for topographic mapping, urban planning, and many other applications. Its main task is to label each pixel as building or non-building. Although deep-learning-based algorithms have significantly enhanced the accuracy of building extraction, fully automated methods for building extraction are limited by the requirement for a large number of annotated samples, resulting in limited generalization ability, frequent misclassification in complex remote sensing images, and higher annotation costs. To address these challenges, this paper proposes an improved interactive building extraction model, ARE-Net, which adopts a deep interactive segmentation approach. In this paper, we present several key contributions. Firstly, an adaptive-radius encoding (ARE) module was designed to optimize the interaction features of clicks based on the varying shapes and distributions of buildings to provide maximum a priori information for building extraction. Secondly, a two-stage training strategy was proposed to enhance the convergence speed and efficiency of the segmentation process. Finally, comprehensive experiments using two models of different sizes (HRNet18s+OCR and HRNet32+OCR) were conducted on the Inria and WHU building datasets. The results showed significant improvements over the current state-of-the-art method in terms of NoC90. The proposed method achieved performance enhancements of 7.98% and 13.03% with HRNet18s+OCR and 7.34% and 15.49% with HRNet32+OCR on the WHU and Inria datasets, respectively. Furthermore, the experiments demonstrated that the proposed ARE-Net method significantly reduced the annotation costs while improving the convergence speed and generalization performance.
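
NoC90 is not defined in this abstract. In the interactive segmentation literature it usually denotes the average number of user clicks needed for a prediction to reach an IoU of 0.90, so a lower value is better. The sketch below illustrates that common definition only. The function names, the sample IoU sequences, and the cap of 20 clicks are assumptions for illustration and may differ from the protocol used in the paper.

```python
def number_of_clicks(iou_per_click, threshold=0.90, max_clicks=20):
    """Clicks needed for one object to first reach `threshold` IoU.

    `iou_per_click[k]` is the IoU after click k+1. If the threshold is
    never reached, the count is capped at `max_clicks` (an assumed cap).
    """
    for clicks, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= threshold:
            return clicks
    return max_clicks


def mean_noc(all_sequences, threshold=0.90, max_clicks=20):
    """Average number of clicks over all evaluated objects."""
    counts = [number_of_clicks(seq, threshold, max_clicks)
              for seq in all_sequences]
    return sum(counts) / len(counts)


# Made-up IoU trajectories for three buildings (illustrative only).
sequences = [
    [0.62, 0.85, 0.91, 0.94],  # reaches 0.90 on the third click
    [0.93],                    # reaches 0.90 on the first click
    [0.40, 0.55, 0.70],        # never reaches 0.90, counted as 20
]

noc90 = mean_noc(sequences)
print(f"NoC90 = {noc90}")
```

Here the per-object counts are 3, 1, and 20, so the mean is 8.0.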

21 pages, 13149 KiB  
Article
A Split-Frequency Filter Network for Hyperspectral Image Classification
by Jinfu Gong, Fanming Li, Jian Wang, Zhengye Yang and Xuezhuan Ding
Remote Sens. 2023, 15(15), 3900; https://doi.org/10.3390/rs15153900 - 7 Aug 2023
Cited by 1 | Viewed by 1161
Abstract
The intricate structure of hyperspectral images comprising hundreds of successive spectral bands makes it challenging for conventional approaches to quickly and precisely classify this information. The classification performance of hyperspectral images has substantially improved in the past decade with the emergence of deep-learning-based techniques. Owing to their excellent feature extraction and modeling capabilities, convolutional neural networks (CNNs) have become a robust backbone network for hyperspectral image classification. However, CNNs fail to adequately capture the dependency and contextual information of the sequence of spectral properties due to the restrictions inherent in their fundamental network characteristics. We analyzed hyperspectral image classification from a frequency-domain angle to tackle this issue and proposed a split-frequency filter network. It is a simple and effective network architecture that improves the performance of hyperspectral image classification through three critical operations: a split-frequency filter network, a detail-enhancement layer, and a nonlinear unit. Firstly, a split-frequency filtering network captures the interactions between neighboring spectral bands in the frequency domain. The classification performance is then enhanced using a detail-enhancement layer with a frequency-domain attention technique. Finally, a nonlinear unit is incorporated into the frequency-domain output layer to expedite training and boost performance. Experiments on various hyperspectral datasets demonstrate that the method outperforms other state-of-the-art approaches (an overall accuracy (OA) improvement of at least 2%), particularly when the training sample is insufficient.

21 pages, 9733 KiB  
Article
Impact of Horizontal Resolution on the Robustness of Radiation Emulators in a Numerical Weather Prediction Model
by Hwan-Jin Song and Soonyoung Roh
Remote Sens. 2023, 15(10), 2637; https://doi.org/10.3390/rs15102637 - 18 May 2023
Cited by 1 | Viewed by 1201
Abstract
Developing a machine-learning-based radiative transfer emulator in a weather forecasting model is valuable because it can significantly improve the computational speed of forecasting severe weather events. To replace the radiative transfer parameterization in the weather forecasting model, the universal applicability of the radiation emulator is essential, as it marks the transition from the research to the operational level. This study investigates how the forecast accuracy of a radiation emulator for the Korean Peninsula degrades when it is tested at different horizontal resolutions (100–0.25 km), relative to the accuracy attained at the training resolution (5 km). In real-case simulations (100–5 km), the forecast errors of radiative fluxes and precipitation were reduced at coarse resolutions. Ideal-case simulations (5–0.25 km) showed larger errors in heating rates and fluxes at fine resolutions, implying the difficulty in predicting heating rates and fluxes at cloud-resolving scales. However, all simulations maintained an appropriate accuracy range compared with observations in real-case simulations or with the infrequent use of radiative transfer parameterization in ideal-case simulations. These findings demonstrate the feasibility of a universal radiation emulator associated with different resolutions/models and emphasize the importance of emulating high-resolution modeling in the future.

19 pages, 1145 KiB  
Article
CRSNet: Cloud and Cloud Shadow Refinement Segmentation Networks for Remote Sensing Imagery
by Chao Zhang, Liguo Weng, Li Ding, Min Xia and Haifeng Lin
Remote Sens. 2023, 15(6), 1664; https://doi.org/10.3390/rs15061664 - 20 Mar 2023
Cited by 21 | Viewed by 2100
Abstract
Cloud detection is a critical task in remote sensing image processing. Due to the influence of ground objects and other noise, traditional detection methods are prone to missed or false detections and to rough edge segmentation. To avoid these defects, this paper proposes the Cloud and Cloud Shadow Refinement Segmentation Networks (CRSNet). The network can correctly and efficiently detect smaller clouds and obtain finer edges. The model takes ResNet-18 as the backbone to extract features at different levels, and the Multi-scale Global Attention Module is used to strengthen the channel and spatial information to improve detection accuracy. The Strip Pyramid Channel Attention Module is used to learn spatial information at multiple scales so that small clouds are detected better. Finally, the high-dimensional and low-dimensional features are fused by the Hierarchical Feature Aggregation Module, and the final segmentation result is obtained by up-sampling layer by layer. The proposed model attains excellent results compared with both classic segmentation methods and methods designed specifically for cloud segmentation on the Cloud and Cloud Shadow Dataset and the public CSWV dataset.

25 pages, 8188 KiB  
Article
Construction Site Multi-Category Target Detection System Based on UAV Low-Altitude Remote Sensing
by Han Liang, Jongyoung Cho and Suyoung Seo
Remote Sens. 2023, 15(6), 1560; https://doi.org/10.3390/rs15061560 - 13 Mar 2023
Cited by 3 | Viewed by 1946
Abstract
On-site management of construction sites has always been a significant problem faced by the construction industry. With the development of UAVs, their use to monitor construction safety and progress will make construction more intelligent. This paper proposes a multi-category target detection system based on UAV low-altitude remote sensing, aiming to solve two problems that arise when mainstream target detection algorithms are applied to construction supervision: reliance on fixed-position cameras and restriction to a single category of established detection targets. The experimental results show that the proposed method can accurately and efficiently detect 15 types of construction site targets. In terms of performance, the proposed method achieves the highest accuracy in each category compared to other networks, with a mean average precision (mAP) of 82.48%. Additionally, application to an actual construction site confirms that the proposed system has comprehensive detection capability and robustness.

17 pages, 8037 KiB  
Article
YOLO-HR: Improved YOLOv5 for Object Detection in High-Resolution Optical Remote Sensing Images
by Dahang Wan, Rongsheng Lu, Sailei Wang, Siyuan Shen, Ting Xu and Xianli Lang
Remote Sens. 2023, 15(3), 614; https://doi.org/10.3390/rs15030614 - 20 Jan 2023
Cited by 21 | Viewed by 8175
Abstract
Object detection is essential to the interpretation of optical remote sensing images and can serve as a foundation for research into additional visual tasks that utilize remote sensing. However, the object detection networks currently employed for optical remote sensing images underutilize the output of the feature pyramid, leaving room for improved detection. At present, a suitable balance between detection efficiency and detection effect is difficult to attain. This paper proposes an enhanced YOLOv5 algorithm for object detection in high-resolution optical remote sensing images, utilizing multiple layers of the feature pyramid, a multi-detection-head strategy, and a hybrid attention module to improve the effect of object-detection networks on optical remote sensing images. On the SIMD dataset, the mAP of the proposed method was 2.2% better than YOLOv5 and 8.48% better than YOLOX, achieving an improved balance between detection effect and speed.

17 pages, 2769 KiB  
Article
Mapping Dwellings in IDP/Refugee Settlements Using Deep Learning
by Omid Ghorbanzadeh, Alessandro Crivellari, Dirk Tiede, Pedram Ghamisi and Stefan Lang
Remote Sens. 2022, 14(24), 6382; https://doi.org/10.3390/rs14246382 - 16 Dec 2022
Viewed by 2880
Abstract
Improvements in computer vision, sensor quality, and remote sensing data availability make satellite imagery increasingly useful for studying human settlements. Several challenges remain to be overcome for some types of settlements, particularly for internally displaced populations (IDPs) and refugee camps. Refugee-dwelling footprints and detailed information derived from satellite imagery are critical for a variety of applications, including humanitarian aid during disasters or conflicts. Nevertheless, extracting dwellings remains difficult because they vary in size, shape, and location. In this study, we use U-Net and residual U-Net to deal with dwelling classification in a refugee camp in northern Cameroon, Africa. Specifically, the two semantic segmentation networks are adapted and applied. A limited number of randomly divided sample patches from a single WorldView-3 satellite image is used to train and test the networks. Accuracy was assessed for four different dwelling categories using metrics such as Precision, Recall, F1, and the Kappa coefficient. F1 ranges from 81% to over 99% for the U-Net and from approximately 88.1% to 99.5% for the residual U-Net.
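The accuracy metrics named in this abstract follow standard definitions and can be computed from a confusion matrix. The sketch below assumes rows are reference classes and columns are predicted classes; the function name `classification_metrics` and the two-class counts are made up for illustration and are not data from the paper.

```python
import numpy as np

def classification_metrics(confusion):
    """Per-class precision, recall, F1, and overall Cohen's kappa.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    precision = tp / confusion.sum(axis=0)   # column sums = predicted totals
    recall = tp / confusion.sum(axis=1)      # row sums = reference totals
    f1 = 2 * precision * recall / (precision + recall)
    total = confusion.sum()
    observed = tp.sum() / total              # overall agreement
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total**2
    kappa = (observed - expected) / (1 - expected)
    return precision, recall, f1, kappa

# Hypothetical two-class confusion matrix (illustrative counts only).
cm = [[40, 10],
      [5, 45]]
precision, recall, f1, kappa = classification_metrics(cm)
```

A class that is never predicted or never present would cause a division by zero here, so a production version would need to guard those cases.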

24 pages, 19513 KiB  
Article
Fusion of Remote Sensing, Magnetometric, and Geological Data to Identify Polymetallic Mineral Potential Zones in Chakchak Region, Yazd, Iran
by Ali Akbar Aali, Adel Shirazy, Aref Shirazi, Amin Beiranvand Pour, Ardeshir Hezarkhani, Abbas Maghsoudi, Mazlan Hashim and Shayan Khakmardan
Remote Sens. 2022, 14(23), 6018; https://doi.org/10.3390/rs14236018 - 28 Nov 2022
Cited by 16 | Viewed by 2309
Abstract
Exploration geologists are urged to develop new, robust, and low-cost approaches to identify high-potential zones related to underground/unexplored mineral deposits because of the increased depletion of ore deposits and high consumption by basic metal production industries. Fusing remote sensing, geophysical, and geological data has great capability to provide a complete range of prerequisite data to accomplish this purpose. This investigation fuses remote sensing data, such as Sentinel-2 and Landsat 7, aerial magnetic geophysical data, and geological data to identify polymetallic mineralization potential zones in the Chakchak region, Yazd province, Iran. Hydrothermal alteration mineral zones were detected using remote sensing data; surface and deep intrusive masses, hidden faults, and lineaments using aerial magnetic data; and lithological units using geological data. The exploratory/information layers were fused using fuzzy logic modeling and the multi-class index overlap method. Subsequently, mineral potential maps were generated for the study area. Some high-potential zones of polymetallic mineralization were identified and verified through a detailed field campaign and drilling programs in the Chakchak region. In conclusion, the fusion of remote sensing, geophysical, and geological data using fuzzy logic modeling and the multi-class index overlap method is a robust, reliable, and low-cost approach for mining companies to explore the frontier areas with identical geologic conditions that are alleged to indicate polymetallic mineralization potential.
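The abstract does not give the membership functions, weights, or operators used, so the following is only a sketch of the textbook forms of the two fusion ideas it names: the fuzzy gamma operator and a weighted index overlay. The function names, the three 1×2 evidence layers, and the weights are hypothetical.

```python
import numpy as np

def fuzzy_gamma(memberships, gamma):
    """Combine fuzzy membership layers (values in [0, 1]) with the gamma operator.

    memberships has shape (n_layers, rows, cols).
    gamma = 0 gives the fuzzy algebraic product, gamma = 1 the fuzzy algebraic sum.
    """
    m = np.asarray(memberships, dtype=float)
    product = np.prod(m, axis=0)
    algebraic_sum = 1.0 - np.prod(1.0 - m, axis=0)
    return algebraic_sum**gamma * product**(1.0 - gamma)

def index_overlay(class_scores, weights):
    """Weighted average of per-layer class scores (index overlay)."""
    s = np.asarray(class_scores, dtype=float)
    w = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    return (w * s).sum(axis=0) / w.sum()

# Hypothetical evidence layers for two map cells (illustrative values only).
layers = np.array([
    [[0.9, 0.2]],   # alteration evidence
    [[0.8, 0.1]],   # fault/lineament evidence
    [[0.7, 0.3]],   # lithology evidence
])
potential_fuzzy = fuzzy_gamma(layers, gamma=0.5)
potential_index = index_overlay(layers, weights=[3, 2, 1])
```

With these values, the first cell scores higher than the second under both methods, which is the kind of ranking a mineral potential map is meant to convey.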